In neural networks, the hyperbolic tangent function can be used as an activation function as an alternative to the sigmoid function. During backpropagation, the derivative of the activation function is involved in calculating how the error affects the weights. The derivative of the hyperbolic tangent function has a simple form, just like that of the sigmoid function. This is one reason why the hyperbolic tangent is common in neural networks.

Hyperbolic Tangent Function: tanh(x) = (e^{x} – e^{-x}) / (e^{x} + e^{-x})

The function produces outputs in the range (-1, +1): it approaches -1 and +1 as x goes to negative and positive infinity, but never reaches them. Moreover, it is a continuous function that is defined for every x value.
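As a quick illustration, the definition above can be coded directly and compared with the standard library. This is only a minimal sketch: the helper name `tanh_from_definition` is made up for this example, and the naive formula overflows for very large |x|, so real code should prefer `math.tanh`.

```python
import math


def tanh_from_definition(x: float) -> float:
    """Hyperbolic tangent computed directly from its exponential definition.

    Illustrative only: math.exp overflows for very large |x|,
    so production code should call math.tanh instead.
    """
    e_pos = math.exp(x)
    e_neg = math.exp(-x)
    return (e_pos - e_neg) / (e_pos + e_neg)


if __name__ == "__main__":
    # Compare the hand-written version with the built-in one.
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"x = {x:+.1f}  definition = {tanh_from_definition(x):+.6f}  "
              f"math.tanh = {math.tanh(x):+.6f}")
```

For moderate inputs the two columns should agree, and every printed value should stay strictly between -1 and +1.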

### Derivative of Hyperbolic Tangent Function

Before we begin, let’s recall the quotient rule.

Suppose that function h is the quotient of function f and function g. If derivatives exist for both function f and function g, and g(x) is not zero, then the derivative of function h is given by the following formula.

h(x) = f(x) / g(x)

d(h(x)) / dx = ( f'(x).g(x) – g'(x).f(x) ) / g(x)^{2}

or d(h(x)) / dx = ( (df(x)/dx).g(x) – (dg(x)/dx).f(x) ) / g(x)^{2}
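The quotient rule can be sanity-checked numerically. The sketch below uses an arbitrary example chosen for illustration, f(x) = x^2 and g(x) = x + 3 (these functions are not from the derivation itself), and compares the rule with a central finite difference.

```python
def f(x):
    return x * x        # f(x) = x^2, chosen only for illustration


def f_prime(x):
    return 2.0 * x      # f'(x) = 2x


def g(x):
    return x + 3.0      # g(x) = x + 3, non-zero for x != -3


def g_prime(x):
    return 1.0          # g'(x) = 1


def h(x):
    return f(x) / g(x)


def h_prime_quotient_rule(x):
    # ( f'(x).g(x) - g'(x).f(x) ) / g(x)^2
    return (f_prime(x) * g(x) - g_prime(x) * f(x)) / (g(x) ** 2)


def central_difference(func, x, eps=1e-6):
    # Numerical approximation of the derivative of func at x.
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    x = 2.0
    print("quotient rule     :", h_prime_quotient_rule(x))
    print("finite difference :", central_difference(h, x))
```

At x = 2 the rule gives (4·5 – 4) / 25 = 0.64, and the finite difference should agree to several decimal places.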

We can apply this rule to the hyperbolic tangent function, because the hyperbolic tangent is the quotient of the hyperbolic sine and hyperbolic cosine functions.

tanh(x) = sinh(x) / cosh(x)

d(tanh(x))/dx = ( (d(sinh(x))/dx).cosh(x) – (d(cosh(x))/dx).sinh(x) ) / (cosh(x))^{2}

Let’s calculate the derivatives of sinh(x) and cosh(x).

sinh(x) = (e^{x} – e^{-x}) / 2

cosh(x) = (e^{x} + e^{-x}) / 2

d(sinh(x))/dx = d( (e^{x} – e^{-x}) / 2 ) / dx = d( (e^{x}/2) – (e^{-x}/2) ) / dx = d(e^{x}/2)/dx – d(e^{-x}/2)/dx = (1/2).(d(e^{x})/dx) – (1/2).(d(e^{-x})/dx) = (1/2).e^{x} – (1/2).e^{-x}.(-1) = (1/2).e^{x} + (1/2).e^{-x} = (e^{x} + e^{-x})/2 = cosh(x)

d(cosh(x))/dx = d( (e^{x} + e^{-x}) / 2 ) / dx = d( (e^{x}/2) + (e^{-x}/2) ) / dx = d(e^{x}/2)/dx + d(e^{-x}/2)/dx = (1/2).(d(e^{x})/dx) + (1/2).(d(e^{-x})/dx) = (1/2).e^{x} + (1/2).e^{-x}.(-1) = (1/2).e^{x} – (1/2).e^{-x} = (e^{x} – e^{-x})/2 = sinh(x)
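These two results, d(sinh(x))/dx = cosh(x) and d(cosh(x))/dx = sinh(x), can be checked numerically. The following is a rough sketch using a central finite difference; the helper `central_difference` and the step size are assumptions made for this example.

```python
import math


def central_difference(func, x, eps=1e-6):
    # Numerical approximation of the derivative of func at x.
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.5, 3.0):
        d_sinh = central_difference(math.sinh, x)
        d_cosh = central_difference(math.cosh, x)
        # The numerical slope of sinh should match cosh, and vice versa.
        print(f"x = {x:+.1f}  d(sinh)/dx = {d_sinh:+.6f}  cosh(x) = {math.cosh(x):+.6f}  "
              f"d(cosh)/dx = {d_cosh:+.6f}  sinh(x) = {math.sinh(x):+.6f}")
```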

Let’s go back to the calculation of the derivative of tanh(x), substituting the derivatives of sinh(x) and cosh(x).

d(tanh(x))/dx = ( (d(sinh(x))/dx).cosh(x) – (d(cosh(x))/dx).sinh(x) ) / (cosh(x))^{2}

d(tanh(x))/dx = ( cosh(x).cosh(x) – sinh(x).sinh(x) ) / (cosh(x))^{2}

d(tanh(x))/dx = ( (cosh(x))^{2} – (sinh(x))^{2} ) / (cosh(x))^{2}

d(tanh(x))/dx = 1 – (sinh(x))^{2}/(cosh(x))^{2} = 1 – ( sinh(x)/cosh(x) )^{2}

d(tanh(x))/dx = 1 – (tanh(x))^{2}

To sum up, the hyperbolic tangent function and its derivative are given by the following formulas:

f(x) = (e^{x}– e^{-x}) / (e^{x}+ e^{-x})

d(f(x))/dx = 1 – (f(x))^{2}
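The practical benefit of this form is that the derivative can be computed from the activation output alone, which is already available from the forward pass during backpropagation. Here is a minimal sketch, assuming a scalar input; the function names are hypothetical and chosen for this example.

```python
import math


def tanh_derivative_from_output(y: float) -> float:
    """Derivative of tanh expressed through its own output y = tanh(x)."""
    return 1.0 - y * y


def central_difference(func, x, eps=1e-6):
    # Numerical approximation of the derivative of func at x.
    return (func(x + eps) - func(x - eps)) / (2.0 * eps)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.7, 2.0):
        y = math.tanh(x)                              # forward pass
        analytic = tanh_derivative_from_output(y)     # 1 - tanh(x)^2
        numeric = central_difference(math.tanh, x)    # independent check
        print(f"x = {x:+.1f}  1 - tanh^2 = {analytic:.6f}  numeric = {numeric:.6f}")
```

The derivative peaks at 1 when x = 0 and shrinks toward 0 as |x| grows.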
