Cholera is a killer epidemic disease and it can cause passing away within hours after first symptons of disease appear. Ten thousands of British people died because of cholera between 1831 – 1854. In those days, cholera was thought to be spread by bad air until John Snow proves the disease is spread by the dirty water. However, convincing people about changing this accepted opinion is not an easy task for him although he is a British doctor. Snow believed that sewage contamination into the water was cause of the disease outbreak.
In 1854 Aug, London was again hit by a outbreak of cholera. Outbreak makes sick more than 500 people who live in London in 10 days. Snow negotiated with City Hall and closed street-pumps off in Broad Street. That night nobody died anymore in Broad Street.
Debates between humans and computers start with mechanical turk. That’s an historical autonomous chess player costructed in 18th century. However, that’s a fake one. The mechanism allows to hide a chess player inside the machine. Thus, the turk operates while hiding master playing chess. (Yes, just like Athony Deniels and Kenny Baker hid inside of 3PO and R2D2 in Star Wars). So, there is no intelligence for this ancient example. Still, this fake machine shows expectations of 18th century people for an intelligent system to involve in daily life.
IBM Deep Blue is first chess playing computer won against a world champion. Garry Kasparov were defeated by Deep Blue in 1997. Interestingly, development of Deep Blue has began in 1985 at Carnegie Mellon University (remember this university). In other words, with 12 years study comes success.
Newton’s cradle is the most popular example of momentumconservation. A lifted and released sphere strikes the stationary spheres and force is transmitted through the stationary spheres. This action pushes the last sphere upward. This shows that the last ball receives the momentum of the first ball. We would apply similar principle in neural networks to improve learning speed. The idea including momentum into neural networks learning is incorporating previous update in the current change.
Newton’s Cradle Demonstrates Conservation of Momentum
Gradient descent guarantees to reach the local minimum when iteration approaches to infinity. However, that is not applicable in reality. Gradient descent iterations have to be terminated by a reasonable value. Moreover, gradient descent converges slowly. Herein, momentum improves the performance of the gradient descent considerably. Thus, cost might converge faster with less iterations if momentum is involved in the weight update formula.
Applying neural networks could be divided into two phases as learning and forecasting. Learning phase has high cost whereas forecasting phase produces results very quickly. Epoch value (aka training time), network structure and historical data size specify the cost of learning phase. Normally, the larger epoch produces the better results. However, increment of epoch value will cause to be taken longer time. That’s why, picking up very large epoch value would not be applicable for online transaction if learning is implemented instantly.
However, we can apply learning and forecasting steps asynchronously. We would perform neural network learning as batch application (e.g. periodic day-end or month-end calculation). Thus, epoch would be picked up as very large value. Besides, weights of neural networks will be calculated on low system load (most probably late night hours). In this way, no matter how long neural networks learning lasts. Thus, we can even make forecasts for online transactions in milliseconds. You might imagine this approach like that human nervous system updates its own weights while sleeping.
Building neural networks models and implementing learning consist of lots of math this might be boring. Herein, some tools help researchers to build network easily. Thus, a researcher who knows the basic concept of neural networks can build a model without applying any math formula.
So, Weka is one of the most common machine learning tool for machine learning studies. It is a java-based API developed by Waikato University, New Zealand. Weka is an acronym for Waikato Environment for Knowledge Analysis. Actually, name of the tool is a funny word play because weka is a bird species endemic to New Zealand. Thus, researchers can introduce an endemic bird to world wide.
We’ve focused on the math behind neural networks learning and proof of the backpropagation algorithm. Let’s face it, mathematical background of the algorihm is complex. Implementation might make the discipline easier to be figured out.
Now, it’s implementation time. We would transform extracted formulas into the code. I would prefer to impelement the core algorithm in Java. This post would also be a tutorial of the neural network project that I’ve already shared on my GitHub profile. You might play around the code before reading this post.
Non-linear sinus wave is chosen as dataset. The same dataset is used in the time-series post. Thus, we’ll be able to compare the prospective forecasts for both neural network and time series approaches. Basically, a random point in the wave would be predicted based on previous known points.
Neural Networks inspired from human central nervous system. They are based on making mistakes and learning lessons from past errors. They produces results very fast like human reflexes in contrast to learning which lasts continous.
For instance, a kid who never touch a hot surface would decide to touch it. However, he would not touch hot surfaces anymore if his hand is burnt once. Adapting this example to the neural networks discipline makes the concept easier to figure out.