Category Archives: Machine Learning

Hello, TensorFlow!

I am proud to announce that I have started recording my first online course, TensorFlow 101: Introduction to TensorFlow.

Actually, I have been planning to create online course material for a long time. That's why I am so excited right now. On the other hand, I know that completing a long-term course is no easy task.

I read a book one day and my whole life was changed. That is the opening sentence of Orhan Pamuk's The New Life. With small modifications, this sentence can be adapted to my own life.

Previously, I attended Prof. Andrew Ng's Machine Learning course on Coursera. That course was my first encounter with MOOCs. It took almost 5 months, and I could only focus on the course after working hours. But I was motivated and feeling ambitious, and I finally completed it. That course was a touchstone in my career path. After that, my title changed from Software Developer to Data Scientist. BTW, I think a Data Scientist who has a software developer background has the upper hand over one who does not. To sum up, I attended a course one day and my whole life was changed. As a matter of fact, this inspiration triggered me to create this course.

The course content will always be free and accessible on my YouTube channel. Moreover, I will share the source code on my GitHub profile as I record the course. Also, I will keep recording and adding new videos to the course.


Keep calm and be a YouTuber

I hope the course content will be beneficial and contribute to your machine learning adventure. Herewith, I am a YouTuber now!

Thanks a lot in advance for your interest.

Becoming the MacGyver of Machine Learning with Neural Networks

You most probably remember MacGyver if you are a member of Generation Y. He is famous for improvising with the materials around him to solve the unusual problems he faced. A Swiss army knife and duct tape would most probably appear in his practical solutions. Likewise, neural networks would be your Swiss army knife in machine learning studies.


Richard Dean Anderson appears in the series as MacGyver

The nature of previous observations determines whether a machine learning study is handled as supervised or unsupervised.

Segmentation is a type of unsupervised learning. In this field, we look for the group that an instance belongs to. For example, a gym can group its customers as fat and thin. However, the segmentation can be based on customer weight, body mass index, or muscle-to-fat ratio. In other words, there is no single correct solution. A customer can fall into different segments in different studies.

In contrast, labels for instances are exact in supervised learning. Suppose that you are working on dead loans. It is already known which of the granted loans went outstanding.
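The gym example can be sketched as a tiny clustering study. Below is a toy 1-D k-means with two segments; the customer weights and the weight-only feature are made-up assumptions for illustration, not values from an actual study.

```python
# Toy segmentation: group gym customers into two segments by weight
# with a tiny 1-D k-means. All weights below are made up.
weights_kg = [55, 58, 60, 95, 100, 105]

# naive initialization: lightest and heaviest customers as centroids
centroids = [min(weights_kg), max(weights_kg)]

for _ in range(10):  # a handful of iterations converges here
    clusters = [[], []]
    for w in weights_kg:
        # assign each customer to the nearest centroid
        nearest = min(range(2), key=lambda i: abs(w - centroids[i]))
        clusters[nearest].append(w)
    # move each centroid to the mean of its cluster
    # (with this initialization, neither cluster can become empty)
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # a "thin" segment mean and a "fat" segment mean
```

Basing the same study on body mass index instead of raw weight could place a customer in a different segment, which is exactly the point: there is no single correct segmentation.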

Continue reading

Homer Simpson Guide to Backpropagation


Homer Simpson has a low IQ of 55

The backpropagation algorithm is based on complex mathematical calculations. That's why it is hard to understand, and that is one reason why people keep their distance from neural networks. Adapting the concept to the real world makes it easier for even Homer Simpson to figure out. In this post, we'll cover how to explain backpropagation to beginners.

What if an approved loan application turns into an outstanding loan (or dead loan)? The bank loses money. So, how can this financial institution learn a lesson from this mistake?

A loan application is a process. In other words, an application has to be examined by multiple authorized employees in turn. For instance, a customer applies to a bank branch agent, then the agent delivers the application to the branch supervisor or branch manager. After that, head office employees examine the application once the branch manager has approved it. To sum up, a loan application follows a path and passes through the hands of the employees in charge. Should these employees be held responsible for the loss? According to backpropagation, the answer is yes.

The backpropagation algorithm proposes to reflect the lost amount of money along the same path, but backwards. That's why it is named back-propagation. Fine the head office employees first, then punish the branch manager, the supervisor, and the agent, respectively. Moreover, how much of the total loss should be reflected to a branch agent? The total loss should be divided among the employees in charge based on their contributions to it. (Actually, that is the derivative of the total loss with respect to the employee, e.g. ∂TotalLossAmount / ∂BranchAgent.) In this way, these employees will be more careful next time. That is the principle of the backpropagation algorithm. Thus, the examination process improves over time.

As the phrase goes, backpropagation advises slapping everyone on the tracked path, backwards and in proportion to their contribution to the total error. I would like to thank Dr. Alper Ozpinar for this metaphor.
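The metaphor can be sketched in a few lines of code. This is a toy illustration, not the full algorithm: the approval chain is modeled as a simple product of weights, and every name and number below is made up.

```python
# Toy backpropagation over a "loan approval chain": the application flows
# agent -> supervisor -> manager, and the loss is reflected backwards to
# each employee in proportion to their contribution.
x = 1.0  # the loan application (input signal)
weights = {"agent": 0.8, "supervisor": 0.9, "manager": 0.7}

# forward pass: each employee scales the approval signal
output = x
for w in weights.values():
    output *= w

target = 0.0  # the loan went bad, so the approval should have been 0
loss = 0.5 * (output - target) ** 2

# backward pass (chain rule): for a product, d(output)/d(w_i) = output / w_i,
# i.e. each employee's blame is proportional to their share of the path
d_loss_d_output = output - target
grads = {name: d_loss_d_output * output / w for name, w in weights.items()}

lr = 0.1  # learning rate: how hard each employee is "slapped"
for name in weights:
    weights[name] -= lr * grads[name]
```

After the update, every weight shrinks, so the same application is approved less eagerly next time; that is the "be more careful" effect from the metaphor.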


Batman backpropagates Robin

Step Function as a Neural Network Activation Function

Activation functions are the decision-making units of neural networks. They calculate the net output of a neural node. Herein, the Heaviside step function is one of the most common activation functions in neural networks. The function produces binary output. That is the reason why it is also called the binary step function. The function produces 1 (or true) when the input passes a threshold, whereas it produces 0 (or false) when the input does not pass the threshold. That's why it is very useful for binary classification studies.


Heaviside Step Function Dance Move

Human reflexes act based on the same principle. A person withdraws his hand when he touches a hot surface, because his sensory neuron detects the high temperature and fires. Passing the threshold triggers the response, and the withdrawal reflex action is taken. You can think of the true output as causing the fire action.
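A minimal sketch of the step function and a single neuron using it; the threshold of 0 and the sample weights are assumptions for illustration, not values from the post.

```python
def step(x, threshold=0.0):
    """Heaviside step: 1 when the input passes the threshold, else 0."""
    return 1 if x > threshold else 0

def neuron(inputs, weights, bias=0.0):
    """A single neural node: weighted sum of inputs, then step activation."""
    net = sum(i * w for i, w in zip(inputs, weights)) + bias
    return step(net)

print(step(0.7))   # 1: the input passes the threshold, the neuron fires
print(step(-0.2))  # 0: below the threshold, no fire
print(neuron([1.0, 1.0], [0.6, 0.6]))  # 1: net input 1.2 passes 0
```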

Continue reading

Homer Sometimes Nods: Error Metrics in Machine Learning

Even the worthy Homer sometimes nods. The idiom means that even the most gifted person occasionally makes mistakes. We can adapt this sentence to the machine learning lifecycle. Even the best ML models should make mistakes (otherwise, there is an overfitting problem). The important thing is knowing how to measure errors. There are lots of metrics for measuring forecasts. In this post, we will cover evaluation metrics meaningful for ML studies.


Homer Simpson uses his catchphrase D'oh! when he has done something wrong

The sign of the difference between actual and predicted values should not be considered when calculating the total error of a system. Otherwise, the total error of a series including equally large underestimations and overestimations might measure very low. In fact, only forecasts with small underestimations and overestimations should measure a low total error. Discarding the signs gets rid of this negative effect, and squaring the differences is what discards the signs. This metric is called Mean Squared Error, or mostly MSE.
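The cancellation effect is easy to demonstrate with made-up numbers: the series below contains equally large under- and overestimations, so the raw differences sum to zero while MSE reports the real error.

```python
# Made-up forecasts with equally large under- and overestimations.
actual    = [10.0, 20.0, 30.0, 40.0]
predicted = [15.0, 15.0, 35.0, 35.0]   # errors: +5, -5, +5, -5

diffs = [p - a for a, p in zip(actual, predicted)]
raw_total = sum(diffs)                         # 0.0: misleadingly "perfect"
mse = sum(d ** 2 for d in diffs) / len(diffs)  # 25.0: the real picture

print(raw_total, mse)
```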

Continue reading

AI: a one-day wonder or an everlasting challenge?

Contests between humans and computers started with the Mechanical Turk, a historical automaton chess player constructed in the 18th century. However, it was a fake. The mechanism allowed a chess player to hide inside the machine. Thus, the Turk operated while concealing the master playing chess. (Yes, just like Anthony Daniels and Kenny Baker hid inside C-3PO and R2-D2 in Star Wars.) So, there is no intelligence in this ancient example. Still, this fake machine shows the expectation of 18th century people for an intelligent system to be involved in daily life.

IBM Deep Blue was the first chess-playing computer to win against a world champion. Garry Kasparov was defeated by Deep Blue in 1997. Interestingly, development of Deep Blue began in 1985 at Carnegie Mellon University (remember this university). In other words, success came after 12 years of work.

Continue reading

Adaptive Learning in Neural Networks

Gradient descent is one of the most powerful optimization methods. However, learning time is a challenge, too. The standard version of gradient descent learns slowly. That's why some modifications are included in gradient descent in real-world applications. These approaches are applied to converge faster. In a previous post, incorporating momentum was already mentioned as serving the same purpose. Now, we will focus on applying adaptive learning to learn faster.


Learning Should Adapt To The Environment Like Chameleons (Rango, 2011)

As you might remember, weights are updated by the following formula in backpropagation.

wi = wi – α . (∂Error / ∂wi)

Alpha refers to the learning rate in the formula. Applying an adaptive learning rate means increasing or decreasing alpha based on changes in the cost. The following code block would realize this process.
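One simple way to realize this (sometimes called the "bold driver" heuristic) is sketched below. The growth and shrink factors and the per-epoch cost values are made-up assumptions for illustration, not the post's actual code.

```python
def adapt_learning_rate(alpha, cost, previous_cost,
                        increase=1.05, decrease=0.5):
    """Grow alpha while cost keeps decreasing; shrink it when cost rises."""
    if cost < previous_cost:
        return alpha * increase   # still improving: speed up a little
    return alpha * decrease      # overshot: slow down sharply

alpha = 0.1
prev_cost = float("inf")
for cost in [4.0, 3.0, 2.5, 2.7, 2.0]:   # made-up cost values per epoch
    alpha = adapt_learning_rate(alpha, cost, prev_cost)
    prev_cost = cost

print(alpha)  # alpha grew for 3 epochs, halved once, then grew again
```

The asymmetry (grow gently, shrink sharply) is a common design choice: a cost increase usually means the step size has overshot a minimum, so it is safer to back off aggressively.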

Continue reading