Hyperbolic Tangent as Neural Network Activation Function

In neural networks, the hyperbolic tangent function can be used as an activation function as an alternative to the sigmoid function. During backpropagation, the derivative of the activation function is involved in calculating how the error affects the weights. The derivative of the hyperbolic tangent function has a simple form, just like that of the sigmoid function. This helps explain why the hyperbolic tangent is common in neural networks.

Hyperbolic Tangent Function: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

Figure: Hyperbolic Tangent Function (aka tanh)

The function produces outputs in the range (-1, +1): it approaches -1 and +1 but never reaches them. Moreover, it is a continuous function that is defined for every real x, so it produces an output for every x value.
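As a quick illustration, the definition can be written directly in Python. This is only a minimal sketch: the name tanh_from_definition is my own choice, and math.tanh from the standard library is used purely for comparison.

```python
import math

def tanh_from_definition(x):
    """Hyperbolic tangent computed straight from the exponential definition."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    # Each output lies strictly between -1 and +1 and matches math.tanh closely
    print(x, tanh_from_definition(x), math.tanh(x))
```

Note that math.exp overflows for very large inputs, so in practice a library routine such as math.tanh is the safer choice.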

Derivative of Hyperbolic Tangent Function

Before we begin, let’s recall the quotient rule.

Suppose that function h is the quotient of function f and function g. If derivatives exist for both function f and function g, and g(x) is not zero, then the derivative of function h is given by the following formula.

h(x) = f(x) / g(x)

d(h(x)) / dx = ( f'(x).g(x) - g'(x).f(x) ) / (g(x))^2

or d(h(x)) / dx = ( (df(x)/dx).g(x) - (dg(x)/dx).f(x) ) / (g(x))^2
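The quotient rule can be sanity-checked numerically. The sketch below is not part of the derivation; the example functions f(x) = sin(x) and g(x) = x^2 + 1 and the helper names are assumptions chosen only for illustration.

```python
import math

def quotient_rule(f, df, g, dg, x):
    """Derivative of f(x)/g(x) via the quotient rule; assumes g(x) != 0."""
    return (df(x) * g(x) - dg(x) * f(x)) / g(x) ** 2

def central_difference(h, x, eps=1e-6):
    """Numerical estimate of h'(x), used here only as a sanity check."""
    return (h(x + eps) - h(x - eps)) / (2 * eps)

# Hypothetical example functions: f(x) = sin(x), g(x) = x^2 + 1
f, df = math.sin, math.cos
g = lambda x: x * x + 1
dg = lambda x: 2 * x

x = 0.7
analytic = quotient_rule(f, df, g, dg, x)
numeric = central_difference(lambda t: f(t) / g(t), x)
print(analytic, numeric)  # the two values should agree closely
```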

So, we can apply this rule to the hyperbolic tangent function, because we know that the hyperbolic tangent is the quotient of the hyperbolic sine and hyperbolic cosine functions.

tanh(x) = sinh(x) / cosh(x)

d(tanh(x))/dx = ( (d(sinh(x))/dx).cosh(x) - (d(cosh(x))/dx).sinh(x) ) / (cosh(x))^2

Let’s calculate the derivatives of sinh(x) and cosh(x).

sinh(x) = (e^x - e^(-x)) / 2

cosh(x) = (e^x + e^(-x)) / 2

d(sinh(x))/dx = d( (e^x - e^(-x)) / 2 )/dx
= d(e^x/2)/dx - d(e^(-x)/2)/dx
= (1/2).(d(e^x)/dx) - (1/2).(d(e^(-x))/dx)
= (1/2).e^x - (1/2).e^(-x).(-1)
= (1/2).e^x + (1/2).e^(-x)
= (e^x + e^(-x)) / 2
= cosh(x)

d(cosh(x))/dx = d( (e^x + e^(-x)) / 2 )/dx
= d(e^x/2)/dx + d(e^(-x)/2)/dx
= (1/2).(d(e^x)/dx) + (1/2).(d(e^(-x))/dx)
= (1/2).e^x + (1/2).e^(-x).(-1)
= (1/2).e^x - (1/2).e^(-x)
= (e^x - e^(-x)) / 2
= sinh(x)
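These two results, that the derivative of sinh(x) is cosh(x) and the derivative of cosh(x) is sinh(x), can be checked with a finite-difference estimate. This is a rough sketch; the helper central_difference and the sample points are my own assumptions.

```python
import math

def central_difference(h, x, eps=1e-6):
    """Numerical estimate of h'(x) using a symmetric difference."""
    return (h(x + eps) - h(x - eps)) / (2 * eps)

for x in [-2.0, 0.0, 0.5, 3.0]:
    d_sinh = central_difference(math.sinh, x)
    d_cosh = central_difference(math.cosh, x)
    # Both differences should be very close to zero
    print(x, d_sinh - math.cosh(x), d_cosh - math.sinh(x))
```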

Let’s go back to the derivative of the tanh function.

d(tanh(x))/dx = ( (d(sinh(x))/dx).cosh(x) - (d(cosh(x))/dx).sinh(x) ) / (cosh(x))^2

d(tanh(x))/dx = ( cosh(x).cosh(x) - sinh(x).sinh(x) ) / (cosh(x))^2

d(tanh(x))/dx = ( (cosh(x))^2 - (sinh(x))^2 ) / (cosh(x))^2

d(tanh(x))/dx = 1 - (sinh(x))^2/(cosh(x))^2 = 1 - ( sinh(x)/cosh(x) )^2

d(tanh(x))/dx = 1 - (tanh(x))^2

To sum up, the hyperbolic tangent function and its derivative are given by the following formulas:

f(x) = (e^x - e^(-x)) / (e^x + e^(-x))

d(f(x))/dx = 1 - (f(x))^2
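To show how the derivative enters backpropagation, here is a minimal sketch for a single neuron. The neuron, its weight and bias values, and the squared-error loss are all hypothetical choices made for illustration; only the formula 1 - (tanh(x))^2 comes from the derivation above.

```python
import math

def tanh_activation(x):
    return math.tanh(x)

def tanh_derivative(x):
    """d(tanh(x))/dx = 1 - (tanh(x))^2, computed from the forward output."""
    y = math.tanh(x)
    return 1.0 - y * y

# Hypothetical single neuron: output = tanh(w * inp + b)
# Assumed loss: 0.5 * (output - target)^2
w, b, inp, target = 0.5, 0.1, 2.0, 1.0
z = w * inp + b
output = tanh_activation(z)

# Chain rule: dLoss/dw = (output - target) * tanh'(z) * inp
grad_w = (output - target) * tanh_derivative(z) * inp
print(output, grad_w)
```

Because the derivative depends only on the activation's own output, the value computed in the forward pass can be reused in the backward pass.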
