
Hyperbolic Tangent Function: $\tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$


The function produces outputs in the open interval (-1, +1): it approaches -1 and +1 as x goes to negative and positive infinity, but never reaches them. Moreover, it is a continuous function, and it is defined for every real x value.
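To make the output range concrete, here is a minimal sketch that evaluates the function straight from its exponential definition. The helper name `tanh_from_exp` and the sample points are assumptions picked for illustration, and the naive form is only meant for moderate inputs.

```python
import math

def tanh_from_exp(x):
    """Hyperbolic tangent computed directly from its exponential definition."""
    # Naive form: fine for moderate |x|, but math.exp overflows for very large x.
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

# Hypothetical sample points chosen for illustration.
samples = [-5.0, -1.0, 0.0, 1.0, 5.0]
values = [tanh_from_exp(x) for x in samples]

for x, y in zip(samples, values):
    print(f"tanh({x:+.1f}) = {y:+.6f}")
```

Every printed value stays strictly between -1 and +1. For much larger inputs, floating point rounding can return exactly 1.0 or -1.0 even though the mathematical value never reaches the bounds.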
Derivative of Hyperbolic Tangent Function
Before we begin, let’s recall the quotient rule.
Suppose that function h is the quotient of function f and function g. If the derivatives of both f and g exist, and g(x) is not zero, then the derivative of h is given by the following formula.
$$h(x) = \frac{f(x)}{g(x)}$$
$$\frac{d\,h(x)}{dx} = \frac{f'(x)\,g(x) - g'(x)\,f(x)}{g(x)^{2}}$$
or, written with d/dx notation,
$$\frac{d\,h(x)}{dx} = \frac{\dfrac{d\,f(x)}{dx}\,g(x) - \dfrac{d\,g(x)}{dx}\,f(x)}{g(x)^{2}}$$
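The quotient rule can be sanity-checked numerically. The sketch below is only an illustration: the functions `f` and `g`, the point `x0`, and the helper `central_difference` are assumptions made up for this example, not part of the derivation.

```python
def central_difference(func, x, step=1e-6):
    """Approximate the derivative of func at x with a symmetric difference."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Hypothetical example functions; g(x) is never zero.
def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

def g(x):
    return x ** 2 + 1

def g_prime(x):
    return 2 * x

def h(x):
    return f(x) / g(x)

def h_prime_quotient_rule(x):
    # (f'(x).g(x) - g'(x).f(x)) / g(x)^2
    return (f_prime(x) * g(x) - g_prime(x) * f(x)) / g(x) ** 2

x0 = 1.5
analytic = h_prime_quotient_rule(x0)
numeric = central_difference(h, x0)
print(f"quotient rule: {analytic:.6f}, finite difference: {numeric:.6f}")
```

The two numbers should agree to several decimal places, which is what we expect if the rule is applied correctly.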
So, we can apply this rule to the hyperbolic tangent function, because the hyperbolic tangent is the quotient of the hyperbolic sine and hyperbolic cosine functions.
$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)}$$
$$\frac{d\,\tanh(x)}{dx} = \frac{\dfrac{d\,\sinh(x)}{dx}\,\cosh(x) - \dfrac{d\,\cosh(x)}{dx}\,\sinh(x)}{\cosh(x)^{2}}$$
Let’s calculate the derivatives of sinh(x) and cosh(x).
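$$\sinh(x) = \frac{e^{x} - e^{-x}}{2}$$
$$\cosh(x) = \frac{e^{x} + e^{-x}}{2}$$
$$\begin{aligned}
\frac{d\,\sinh(x)}{dx} &= \frac{d}{dx}\left(\frac{e^{x} - e^{-x}}{2}\right) = \frac{d}{dx}\left(\frac{e^{x}}{2}\right) - \frac{d}{dx}\left(\frac{e^{-x}}{2}\right) \\
&= \frac{1}{2}\,\frac{d\,e^{x}}{dx} - \frac{1}{2}\,\frac{d\,e^{-x}}{dx} = \frac{1}{2}\,e^{x} - \frac{1}{2}\,e^{-x}\cdot(-1) \\
&= \frac{1}{2}\,e^{x} + \frac{1}{2}\,e^{-x} = \frac{e^{x} + e^{-x}}{2} = \cosh(x)
\end{aligned}$$
$$\begin{aligned}
\frac{d\,\cosh(x)}{dx} &= \frac{d}{dx}\left(\frac{e^{x} + e^{-x}}{2}\right) = \frac{d}{dx}\left(\frac{e^{x}}{2}\right) + \frac{d}{dx}\left(\frac{e^{-x}}{2}\right) \\
&= \frac{1}{2}\,\frac{d\,e^{x}}{dx} + \frac{1}{2}\,\frac{d\,e^{-x}}{dx} = \frac{1}{2}\,e^{x} + \frac{1}{2}\,e^{-x}\cdot(-1) \\
&= \frac{1}{2}\,e^{x} - \frac{1}{2}\,e^{-x} = \frac{e^{x} - e^{-x}}{2} = \sinh(x)
\end{aligned}$$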
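If you would like to double-check these two results, a small numerical sketch is enough. It compares a finite-difference estimate against Python's built-in `math.sinh` and `math.cosh`; the helper `central_difference` and the sample points are assumptions chosen for illustration.

```python
import math

def central_difference(func, x, step=1e-6):
    """Approximate the derivative of func at x with a symmetric difference."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Hypothetical sample points chosen for illustration.
points = [-2.0, -0.5, 0.0, 0.5, 2.0]

# Largest gap between the numerical derivative of sinh and cosh itself.
sinh_error = max(abs(central_difference(math.sinh, x) - math.cosh(x)) for x in points)
# Largest gap between the numerical derivative of cosh and sinh itself.
cosh_error = max(abs(central_difference(math.cosh, x) - math.sinh(x)) for x in points)

print(f"max |d(sinh)/dx - cosh| = {sinh_error:.2e}")
print(f"max |d(cosh)/dx - sinh| = {cosh_error:.2e}")
```

Both gaps should be tiny, limited only by the finite-difference approximation and floating point rounding.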
Let’s go back to the derivative of the tanh function and substitute these results.
$$\begin{aligned}
\frac{d\,\tanh(x)}{dx} &= \frac{\dfrac{d\,\sinh(x)}{dx}\,\cosh(x) - \dfrac{d\,\cosh(x)}{dx}\,\sinh(x)}{\cosh(x)^{2}} \\
&= \frac{\cosh(x)\,\cosh(x) - \sinh(x)\,\sinh(x)}{\cosh(x)^{2}} \\
&= \frac{\cosh(x)^{2} - \sinh(x)^{2}}{\cosh(x)^{2}} \\
&= 1 - \frac{\sinh(x)^{2}}{\cosh(x)^{2}} = 1 - \left(\frac{\sinh(x)}{\cosh(x)}\right)^{2} \\
&= 1 - \tanh(x)^{2}
\end{aligned}$$
To sum up, the hyperbolic tangent function and its derivative are given by the following formulas:
$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
$$\frac{d\,f(x)}{dx} = 1 - f(x)^{2}$$
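The final formula is easy to test in code. The following is a minimal sketch, not a reference implementation: `tanh_derivative`, `central_difference`, and the sample points are hypothetical names and values introduced here, and `math.tanh` is Python's standard library function.

```python
import math

def tanh_derivative(x):
    """Derivative of tanh written in terms of tanh itself: 1 - tanh(x)^2."""
    t = math.tanh(x)
    return 1.0 - t * t

def central_difference(func, x, step=1e-6):
    """Approximate the derivative of func at x with a symmetric difference."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Hypothetical sample points chosen for illustration.
points = [-3.0, -1.0, 0.0, 1.0, 3.0]

for x in points:
    print(f"x = {x:+.1f}  formula = {tanh_derivative(x):.6f}  "
          f"finite difference = {central_difference(math.tanh, x):.6f}")

# Largest disagreement between the closed form and the numerical estimate.
max_error = max(
    abs(tanh_derivative(x) - central_difference(math.tanh, x)) for x in points
)
```

Because the derivative only needs the value of tanh(x), an output that has already been computed can be reused to get the slope without evaluating any further exponentials.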
Proof of concept
If the formulas are confusing, you might want to watch the step-by-step video of the derivative calculation.
Let’s dance
These are the dance moves of the most common activation functions in deep learning. Be sure to turn the volume up 🙂