Hyperbolic Tangent as Neural Network Activation Function

In neural networks, the hyperbolic tangent function can be used as an activation function, as an alternative to the sigmoid function. When you backpropagate, the derivative of the activation function is involved in calculating the effect of the error on the weights. The derivative of the hyperbolic tangent function has a simple form, just like that of the sigmoid function. This explains why the hyperbolic tangent is common in neural networks.
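To make the role of the derivative concrete, here is a minimal sketch of a single neuron with a tanh activation and a squared-error loss. The neuron, the loss, and all numeric values are assumptions chosen for illustration; they do not come from any particular library. The sketch uses the result derived later in this post, d(tanh(x))/dx = 1 - (tanh(x))^2, and compares the chain-rule gradient with a finite-difference estimate.

```python
import math

def forward(w, b, x):
    """Single neuron with tanh activation: y = tanh(w*x + b)."""
    return math.tanh(w * x + b)

def loss(w, b, x, target):
    """Squared error for one training example (an illustrative choice)."""
    y = forward(w, b, x)
    return 0.5 * (y - target) ** 2

def grad_w(w, b, x, target):
    """dL/dw by the chain rule: (y - target) * (1 - y^2) * x."""
    y = forward(w, b, x)
    return (y - target) * (1.0 - y * y) * x

# Hypothetical values, picked only to have something to evaluate.
w, b, x, target = 0.5, 0.1, 2.0, 0.3
analytic = grad_w(w, b, x, target)

# Central finite difference as an independent check of the gradient.
eps = 1e-6
numeric = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
print(analytic, numeric)
```

The two printed numbers should agree to several decimal places. Note that the gradient only needs the neuron output y, which was already computed in the forward pass.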

Figure: Tanh dance move (inspired by Imaginary)

Hyperbolic Tangent Function: $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$



Figure: Hyperbolic tangent function (aka tanh)

The function produces outputs in the open interval (-1, +1). Moreover, it is a continuous function that is defined for every real x, so it produces an output for every input value.
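As a quick illustration, the definition can be coded directly and compared with the standard library. This is only a sketch: the function name tanh_from_definition is made up for this example, and the sample inputs are arbitrary.

```python
import math

def tanh_from_definition(x):
    """tanh(x) = (e^x - e^-x) / (e^x + e^-x), straight from the formula."""
    ex, enx = math.exp(x), math.exp(-x)
    return (ex - enx) / (ex + enx)

for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    y = tanh_from_definition(x)
    # Agrees with math.tanh and stays strictly inside (-1, 1) for these inputs.
    assert abs(y - math.tanh(x)) < 1e-12
    assert -1.0 < y < 1.0
    print(x, y)
```

In practice you would call math.tanh (or your framework's tanh) instead, since the direct formula overflows in math.exp for very large |x|.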

Derivative of Hyperbolic Tangent Function

Before we begin, let’s recall the quotient rule.

Suppose that function h is the quotient of function f and function g. If the derivatives of both f and g exist, and g(x) is nonzero, then the derivative of h is given by the following formula.

$$h(x) = \frac{f(x)}{g(x)}$$

$$\frac{d}{dx}h(x) = \frac{f'(x)\,g(x) - g'(x)\,f(x)}{g(x)^{2}}$$

or, equivalently,

$$\frac{d}{dx}h(x) = \frac{\frac{df(x)}{dx}\,g(x) - \frac{dg(x)}{dx}\,f(x)}{g(x)^{2}}$$
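The quotient rule is easy to sanity-check numerically. The sketch below assumes an arbitrary example pair, f(x) = x^2 and g(x) = x + 3, which is not part of the tanh derivation, and compares the rule with a central finite difference. The helper names are hypothetical.

```python
def quotient_rule(f, df, g, dg, x):
    """Derivative of f/g at x: (f'(x) g(x) - g'(x) f(x)) / g(x)^2."""
    return (df(x) * g(x) - dg(x) * f(x)) / g(x) ** 2

def central_difference(h, x, eps=1e-6):
    """Numerical estimate of h'(x)."""
    return (h(x + eps) - h(x - eps)) / (2 * eps)

# Illustrative choice of functions (an assumption for this example).
f = lambda x: x ** 2
df = lambda x: 2 * x
g = lambda x: x + 3
dg = lambda x: 1.0

x0 = 1.5
analytic = quotient_rule(f, df, g, dg, x0)
numeric = central_difference(lambda x: f(x) / g(x), x0)
print(analytic, numeric)
```

Both values should be close to 11.25 / 20.25, which is what the formula gives by hand at x = 1.5.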

So, we can apply this rule to the hyperbolic tangent function, because the hyperbolic tangent is the quotient of the hyperbolic sine and hyperbolic cosine functions, just as the tangent is the quotient of sine and cosine.





$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)}$$

$$\frac{d}{dx}\tanh(x) = \frac{\frac{d\sinh(x)}{dx}\,\cosh(x) - \frac{d\cosh(x)}{dx}\,\sinh(x)}{\cosh^{2}(x)}$$

Let's calculate the derivatives of sinh(x) and cosh(x).

$$\sinh(x) = \frac{e^{x} - e^{-x}}{2}$$

$$\cosh(x) = \frac{e^{x} + e^{-x}}{2}$$

$$\frac{d}{dx}\sinh(x) = \frac{d}{dx}\left(\frac{e^{x} - e^{-x}}{2}\right) = \frac{1}{2}\frac{d}{dx}e^{x} - \frac{1}{2}\frac{d}{dx}e^{-x} = \frac{1}{2}e^{x} - \frac{1}{2}e^{-x}\cdot(-1) = \frac{1}{2}e^{x} + \frac{1}{2}e^{-x} = \frac{e^{x} + e^{-x}}{2} = \cosh(x)$$

$$\frac{d}{dx}\cosh(x) = \frac{d}{dx}\left(\frac{e^{x} + e^{-x}}{2}\right) = \frac{1}{2}\frac{d}{dx}e^{x} + \frac{1}{2}\frac{d}{dx}e^{-x} = \frac{1}{2}e^{x} + \frac{1}{2}e^{-x}\cdot(-1) = \frac{1}{2}e^{x} - \frac{1}{2}e^{-x} = \frac{e^{x} - e^{-x}}{2} = \sinh(x)$$
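These two results can be checked numerically. The following sketch uses a hypothetical central_difference helper and a handful of arbitrary sample points; it relies only on math.sinh and math.cosh from the standard library.

```python
import math

def central_difference(func, x, eps=1e-6):
    """Numerical estimate of func'(x)."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    d_sinh = central_difference(math.sinh, x)
    d_cosh = central_difference(math.cosh, x)
    # The derivative of sinh should match cosh, and the derivative of cosh should match sinh.
    assert abs(d_sinh - math.cosh(x)) < 1e-7
    assert abs(d_cosh - math.sinh(x)) < 1e-7
    print(x, d_sinh, d_cosh)
```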

Let's get back to the derivative of the tanh function.

$$\frac{d}{dx}\tanh(x) = \frac{\frac{d\sinh(x)}{dx}\,\cosh(x) - \frac{d\cosh(x)}{dx}\,\sinh(x)}{\cosh^{2}(x)}$$

$$\frac{d}{dx}\tanh(x) = \frac{\cosh(x)\cosh(x) - \sinh(x)\sinh(x)}{\cosh^{2}(x)}$$

$$\frac{d}{dx}\tanh(x) = \frac{\cosh^{2}(x) - \sinh^{2}(x)}{\cosh^{2}(x)}$$

$$\frac{d}{dx}\tanh(x) = 1 - \frac{\sinh^{2}(x)}{\cosh^{2}(x)} = 1 - \left(\frac{\sinh(x)}{\cosh(x)}\right)^{2}$$

$$\frac{d}{dx}\tanh(x) = 1 - \tanh^{2}(x)$$

To sum up, the hyperbolic tangent function and its derivative are given by the following formulas:

$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

$$\frac{d}{dx}f(x) = 1 - f(x)^{2}$$
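A short numerical check of the final formula is sketched below. The helper names tanh_derivative and central_difference are made up for this example, and the sample points are arbitrary.

```python
import math

def tanh_derivative(x):
    """Derivative of tanh, written in terms of tanh itself: 1 - tanh(x)^2."""
    y = math.tanh(x)
    return 1.0 - y * y

def central_difference(func, x, eps=1e-6):
    """Numerical estimate of func'(x)."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

for x in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    analytic = tanh_derivative(x)
    numeric = central_difference(math.tanh, x)
    # The closed form and the finite-difference estimate should agree closely.
    assert abs(analytic - numeric) < 1e-7
    print(x, analytic, numeric)
```

The derivative peaks at 1 when x = 0 and shrinks toward 0 as |x| grows, which matches the flattening of the tanh curve at both ends.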

Proof of concept

If the formulas confused you, you might want to watch the step-by-step derivative calculation video.

Let’s dance

These are the dance moves of the most common activation functions in deep learning. Be sure to turn the volume up 🙂

