Why do we need a non-linear activation function?

Sweta
1 min read · Aug 19, 2020

If we use a linear activation function in the hidden layers, our neural network just outputs a linear function of the input, no matter how many layers it has, because a composition of linear functions is itself linear. This makes the neural network no better than logistic regression.
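To make this concrete, here is a tiny NumPy sketch (not from the original post, weights are just random placeholders) showing that two stacked linear layers can always be rewritten as a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "hidden layers" with a linear (identity) activation.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)

# Forward pass through both linear layers.
deep_output = W2 @ (W1 @ x + b1) + b2

# The equivalent single linear layer: W = W2 W1, b = W2 b1 + b2.
W, b = W2 @ W1, W2 @ b1 + b2
single_output = W @ x + b

print(np.allclose(deep_output, single_output))  # True
```

No matter how many such layers we stack, the whole network stays one linear map.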

The key takeaway for us should be that linear activation functions in hidden layers are more or less useless, except in a few very special cases.

One case where we could use it is if you are working on a regression problem where y is a real number, like predicting house prices. But only at the output layer; the hidden layers should still use non-linear functions. A sketch of such a network is shown below.
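Here is a minimal Keras sketch of that setup (the layer sizes and the 8 input features are made-up placeholders, not from the original post): ReLU in the hidden layers, a linear activation only at the single output unit.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical house-price regressor.
model = keras.Sequential([
    keras.Input(shape=(8,)),               # 8 input features (assumed)
    layers.Dense(64, activation="relu"),   # non-linear hidden layer
    layers.Dense(32, activation="relu"),   # non-linear hidden layer
    layers.Dense(1, activation="linear"),  # linear output for a real-valued price
])
model.compile(optimizer="adam", loss="mse")
```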

Nevertheless, even then we could use a ReLU instead of a linear function at the output layer and get the same result, since house prices are never negative. This is one of the reasons why the sigmoid function is rarely used nowadays.
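The reason is simply that ReLU is the identity for non-negative values, so on targets that can never be negative it behaves exactly like a linear output. A quick check (with made-up prices):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

prices = np.array([120_000.0, 250_000.0, 89_500.0])  # non-negative targets
print(np.array_equal(relu(prices), prices))  # True: ReLU(x) == x for x >= 0
```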

Reference:

https://www.experfy.com/blog/activation-functions-within-neural-networks/
