Softsign as a Neural Networks Activation Function

Activation functions play a pivotal role in neural networks. Softsign is an activation function that serves as an alternative to the hyperbolic tangent. Although tanh and softsign are closely related, tanh converges exponentially toward its asymptotes whereas softsign converges only polynomially. Even though softsign appears in the literature, it has not been adopted in practice as widely as tanh.

[Figure: Softsign function dance move (imaginary)]

Softsign function: y = x / (1 + |x|)
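The definition above translates directly to code. Here is a minimal NumPy sketch of the softsign function (the function name is our own choice for illustration):

```python
import numpy as np

def softsign(x):
    # y = x / (1 + |x|); works elementwise on NumPy arrays
    return x / (1 + np.abs(x))

# A few sample activations
print(softsign(np.array([-10.0, -1.0, 0.0, 1.0, 10.0])))
```

Note that softsign(1) = 0.5 and softsign(10) ≈ 0.909: the output creeps toward ±1 but never reaches it.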

[Figure: Softsign function]

Both tanh and softsign produce outputs in the range [-1, +1].
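The difference in convergence speed mentioned earlier is easy to see numerically. The short sketch below compares how far each function still is from its +1 asymptote; tanh's gap shrinks exponentially while softsign's gap 1/(1 + x) shrinks only polynomially:

```python
import numpy as np

# Distance from the +1 asymptote for tanh vs. softsign
for x in [1.0, 2.0, 5.0, 10.0]:
    gap_tanh = 1 - np.tanh(x)        # decays like 2*exp(-2x)
    gap_soft = 1 - x / (1 + x)       # equals 1/(1+x), decays polynomially
    print(f"x={x:>4}: 1-tanh={gap_tanh:.2e}, 1-softsign={gap_soft:.2e}")
```

At x = 5, tanh is already within about 1e-4 of its asymptote, while softsign is still about 0.17 away.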

Derivative

We need the function’s derivative to run backpropagation.

Recall the quotient rule. Applying it to the softsign function:

dy/dx = [x′·(1 + |x|) − x·(1 + |x|)′] / (1 + |x|)²

First we have to find the derivative of |x|.

[Figure: graph of |x|]

We already know that when x > 0, then |x|′ = 1,

when x < 0, then |x|′ = −1,

and when x = 0, |x|′ is undefined because the function has no unique slope at that point.

So |x|′ is ±1 for x ≠ 0. We can derive this result differently by writing |x| as a power:

|x| = (x²)^(1/2)

d|x|/dx = (1/2)·(x²)^(1/2 − 1)·2x = (1/2)·(x²)^(−1/2)·2x = (1/2)·[1/(x²)^(1/2)]·2x

We already know that the square root of x squared is equal to |x|.

Let’s replace the term (x²)^(1/2) with |x| in the equation above:

d|x|/dx = (1/2)·[1/|x|]·2x = x/|x|

So the derivative of |x| is equal to x over |x|:

|x|′ = x / |x|
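This result, x/|x|, is exactly the sign function (for x ≠ 0). A quick sketch to verify, with a hypothetical helper name of our own:

```python
import numpy as np

def abs_derivative(x):
    # |x|' = x / |x| = sign(x); undefined at x = 0 (would divide by zero)
    return x / np.abs(x)

print(abs_derivative(3.0), abs_derivative(-3.0))  # 1.0 -1.0
```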

Now put this derivative term back into the main equation:

dy/dx = [1·(1 + |x|) − x·(0 + |x|′)] / (1 + |x|)²

dy/dx = [1·(1 + |x|) − x·(0 + x/|x|)] / (1 + |x|)²

dy/dx = [(1 + |x|) − (x²/|x|)] / (1 + |x|)²

dy/dx = [1 + |x| − (x²/|x|)] / (1 + |x|)²

Let’s substitute x = +3 and x = −3 into the expression |x| − (x²/|x|):

for x = +3 → |3| − 3²/|3| = 3 − 9/3 = 0

for x = −3 → |−3| − [(−3)·(−3)/|−3|] = 3 − 9/3 = 0

The expression |x| − (x²/|x|) equals 0 for both positive and negative values, since x² = |x|². Hence:

dy/dx = [1 + |x| − (x²/|x|)] / (1 + |x|)²

dy/dx = [1 + 0] / (1 + |x|)²

dy/dx = 1 / (1 + |x|)²
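The derivation can be sanity-checked in code. This sketch implements the derivative we just obtained and compares it against a central-difference numerical gradient (function names are our own):

```python
import numpy as np

def softsign(x):
    return x / (1 + np.abs(x))

def softsign_derivative(x):
    # dy/dx = 1 / (1 + |x|)^2, as derived above
    return 1 / (1 + np.abs(x)) ** 2

# Numerical gradient check at a few points (avoiding x = 0 kinks is not
# needed here since softsign itself is differentiable everywhere)
h = 1e-6
for x in [-2.0, -0.5, 1.0, 3.0]:
    numeric = (softsign(x + h) - softsign(x - h)) / (2 * h)
    assert abs(numeric - softsign_derivative(x)) < 1e-6
```

Note that the derivative peaks at 1 when x = 0 and decays toward 0 as |x| grows, so saturated units learn slowly, just as with tanh.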

[Figure: Softsign and its derivative]

So, softsign is one of dozens of activation functions. It may never be widely adopted by practitioners, and that keeps it uncommon. But do not forget that choosing an activation function is still more art than science; softsign might turn out to be the most convenient transfer function for your problem.
