
The ReLU function produces 0 when x is less than or equal to 0, whereas it equals x when x is greater than 0. We can generalize the function output as max(0, x).
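As a minimal sketch of that definition, ReLU can be written in a couple of lines of NumPy (the function name here is just illustrative):

```python
import numpy as np

def relu(x):
    # ReLU returns 0 for x <= 0 and x itself for x > 0, i.e. max(0, x)
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))
# expected: [0.  0.  0.  1.5 3. ]
```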


Previously, we’ve mentioned the softplus function. The secret is that the ReLU function is very similar to softplus except near 0. Moreover, smoothing ReLU yields the softplus function, as illustrated below.
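A quick numerical sketch of this similarity, assuming the usual softplus definition ln(1 + eˣ):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def softplus(x):
    # softplus(x) = ln(1 + e^x), a smooth approximation of ReLU
    return np.log1p(np.exp(x))

for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(x, float(relu(x)), round(float(softplus(x)), 4))
# the two curves differ most around 0 (softplus(0) = ln 2 ≈ 0.6931)
# and almost coincide for large |x|
```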

Pros
The sigmoid function produces outputs in the range [0, +1]. Similarly, the tanh function produces results in the range [-1, +1]. Both functions saturate: they produce nearly the same output once the input becomes dramatically large in either direction. This means the gradient of these functions becomes almost identical for different large positive or negative values; in other words, the gradient vanishes as x increases or decreases. ReLU, however, avoids the vanishing gradient problem, because its derivative is 1 when x is greater than 0 and 0 when x is less than or equal to 0. In other words, its derivative is either 0 or 1: the derivative of ReLU is a step function.
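The contrast is easy to see numerically. The following sketch (illustrative NumPy, function names are mine) compares the sigmoid gradient with the ReLU gradient at a few points:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # shrinks toward 0 for large |x|

def relu_grad(x):
    return (x > 0).astype(float)  # step function: 1 for x > 0, else 0

x = np.array([-10.0, -1.0, 0.5, 10.0])
print(sigmoid_grad(x))  # ~[4.5e-05, 0.197, 0.235, 4.5e-05] -> saturates at the tails
print(relu_grad(x))     # [0. 0. 1. 1.] -> stays 1 for every positive input
```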
What’s more, the dataset must typically be normalized if the output of the activation function has upper and lower limits. We can skip this task for ReLU-based systems, because the function produces outputs in the range [0, +∞).
Finally, calculating the function's output and its gradient is easy because neither involves exponential calculations. Thus, we can process both the feed-forward and back-propagation steps quickly. That is why experiments show ReLU-based networks being about six times faster than ones built on other well-known activation functions, and why ReLU is commonly used in convolutional neural networks.
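As a rough illustration of the cost difference (absolute timings depend on hardware and library versions, so this is only a sketch), sigmoid pays for an exponential on every element while ReLU needs only a comparison:

```python
import timeit
import numpy as np

x = np.random.randn(1_000_000)

relu_time = timeit.timeit(lambda: np.maximum(0, x), number=100)
sigmoid_time = timeit.timeit(lambda: 1.0 / (1.0 + np.exp(-x)), number=100)

print(f"ReLU:    {relu_time:.3f} s")
print(f"Sigmoid: {sigmoid_time:.3f} s")
# sigmoid evaluates an exponential per element; ReLU is just a comparison
```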
Let’s dance
These are the dance moves of the most common activation functions in deep learning. Make sure to turn the volume up 🙂
Support this blog if you like it!
Comment: In the sentence "In other words, its derivative is either 0 or 1", what do you want to convey?
Comment: The ReLU function is not differentiable at 0, so there is a discontinuity in its derivative there. We can use a Leaky ReLU instead to overcome this problem.
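A minimal Leaky ReLU sketch (the 0.01 slope for negative inputs is a common default, chosen here purely for illustration):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: x for x > 0, alpha * x otherwise, so the gradient for
    # negative inputs is alpha instead of exactly 0
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-2.0, 0.0, 3.0])))
# [-0.02  0.    3.  ]
```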