Backward Propagation in a Neural Network, with a Simple Example

Sweta
6 min read · Aug 15, 2020

Backward propagation is the process of updating the network’s weights in order to reduce the error in prediction.

That was the definition. Now let us understand how it works with an example.

Also, if you want to learn from scratch, i.e. from the basics of neural networks, click on these links: Basics of Neural network and

Complete guidelines of activation function

By the end of this explanation, I am quite sure you will be amazed to find that this is nothing as tough as you were worrying it would be.

Note:

In this example, I have assigned random values just to make the idea easy to follow. Kindly don’t re-check the arithmetic; since I took random values, there could be mistakes in the calculations. My purpose is just to make you understand the method, so focus on understanding it.

Let’s start

Here is a neural network diagram with an input layer, a hidden layer, and an output layer giving two outputs.

Here, the variables mean the following.
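Going by the names used later in this post, and assuming the standard setup for this kind of worked example (the input names i1, i2 and the biases b1, b2 are my own labels): i1 and i2 are the inputs, w1 to w8 are the weights, b1 and b2 are the biases, H1 and H2 are the hidden neurons, y1 and y2 are the output neurons, and T1 and T2 are the target values.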

Now we need to calculate the output of H1 at the hidden layer. For this, we do the calculation given below.
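A sketch of that calculation, assuming H1 is the weighted sum of the inputs i1 and i2 through the weights w1 and w2, plus the bias b1:

$$H_1 = w_1 \cdot i_1 + w_2 \cdot i_2 + b_1$$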

Now we need to apply an activation function to H1; the result will be our output of H1.

Let us use the sigmoid activation function here.
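In symbols:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$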

We call this the sigmoid function.

We get the output of H1 as follows.
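Using the naming above, the weighted sum H1 passes through the sigmoid:

$$out_{H1} = \sigma(H_1) = \frac{1}{1 + e^{-H_1}}$$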

Note that I have named the output we get after applying the function “out H1”.

Now, let’s understand this further by putting random values into these variables.

Now here comes the first step.

Forward propagation:

Firstly, we calculate H1 the same way I explained above, but this time we put the values into it.

Now, after applying the sigmoid activation function to it, we get the output value of H1, which I have named “out H1”.

In the same way, we calculate the output value of H2.

Now, in the same way, we will calculate “y1” and “out y1”, as shown below.
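A sketch of this step, assuming y1 is the weighted sum of the hidden outputs through the weights w5 and w6, plus the bias b2:

$$y_1 = w_5 \cdot out_{H1} + w_6 \cdot out_{H2} + b_2, \qquad out_{y1} = \sigma(y_1)$$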

Similarly, we calculate “out y2”.

“out y1” and “out y2” are nothing but the predicted values, which we compare with the target values T1 and T2.

If they don’t match, then we calculate the total error to know how much error we are getting. This is called error estimation.

Calculation of Total Error:
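The usual choice for this kind of walkthrough (and the one the derivatives below assume) is the squared error, summed over both outputs:

$$E_{total} = E_1 + E_2 = \tfrac{1}{2}(T_1 - out_{y1})^2 + \tfrac{1}{2}(T_2 - out_{y2})^2$$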

where T1 and T2 are the target values, and out y1 and out y2 are the predicted values.

Forward propagation done, error estimation done.

Now comes the entry of the real game changer; we all know it’s…

“Backward propagation”

Backward pass:

Here, we move through the network backwards, layer by layer, i.e. from layer N to layer N−1 and so on.

Now, in our neural network, layer 2 is the last layer, so we start calculating the weight updates from here.

So, firstly, we will calculate the error gradient at weight w5, for which we do partial differentiation (take a derivative).

Consider w5.

E total, which we just calculated, doesn’t contain w5 in it directly.

So, we split the differentiation into multiple terms using the chain rule, so that we can calculate their values.

Here, we split it into 3 terms, named 1, 2 and 3.
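A sketch of that split, under the naming used above (y1 is the weighted sum feeding the first output):

$$\frac{\partial E_{total}}{\partial w_5} = \underbrace{\frac{\partial E_{total}}{\partial out_{y1}}}_{1} \cdot \underbrace{\frac{\partial out_{y1}}{\partial y_1}}_{2} \cdot \underbrace{\frac{\partial y_1}{\partial w_5}}_{3}$$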

Let’s take the 1st term,
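With the squared error assumed above, the 1st term works out to:

$$\frac{\partial E_{total}}{\partial out_{y1}} = -(T_1 - out_{y1}) = out_{y1} - T_1$$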

In the same way, we calculate the remaining terms.

2nd term,
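The 2nd term is the derivative of the sigmoid, which can be written in terms of its own output:

$$\frac{\partial out_{y1}}{\partial y_1} = out_{y1}\,(1 - out_{y1})$$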

3rd term,
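For the 3rd term, w5 multiplies out H1 in the weighted sum for y1 (under the layout assumed above), so:

$$\frac{\partial y_1}{\partial w_5} = out_{H1}$$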

Now, we put all the values of the terms together.
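Multiplying terms 1, 2 and 3 gives:

$$\frac{\partial E_{total}}{\partial w_5} = (out_{y1} - T_1) \cdot out_{y1}(1 - out_{y1}) \cdot out_{H1}$$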

We get the change in weight, and we put it into the update rule to get the new weight.

So, we will update the value of w5 with this change using the update rule.

We can take any value of learning rate between 0 and 1; I have taken 0.5.
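The update rule, with learning rate η (here η = 0.5):

$$w_5^{new} = w_5 - \eta \cdot \frac{\partial E_{total}}{\partial w_5}$$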

In the same way, we get all the other new weight values of the current layer.

Now, it’s time to update the weights w1, w2, w3 and w4 of the hidden layer.

Again, while calculating the new value of w1, we take the derivative.

Again, there is no w1 term directly in E total, so we split the differentiation and name the parts 1, 2 and 3.
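The split, as a sketch under the same naming as before (H1 is the weighted sum at the first hidden neuron):

$$\frac{\partial E_{total}}{\partial w_1} = \underbrace{\frac{\partial E_{total}}{\partial out_{H1}}}_{1} \cdot \underbrace{\frac{\partial out_{H1}}{\partial H_1}}_{2} \cdot \underbrace{\frac{\partial H_1}{\partial w_1}}_{3}$$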

While calculating term 1, the differentiation splits even further.

Don’t worry, it isn’t that complex; you just need to understand it. This is an important concept: if you concentrate and understand it today, it will benefit you for a lifetime. So just follow along and understand it.

Term 1,
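Because out H1 feeds both outputs, term 1 splits into a sum, and each branch splits again by the chain rule (a sketch, assuming w5 connects out H1 to y1):

$$\frac{\partial E_{total}}{\partial out_{H1}} = \frac{\partial E_1}{\partial out_{H1}} + \frac{\partial E_2}{\partial out_{H1}}, \qquad \frac{\partial E_1}{\partial out_{H1}} = \frac{\partial E_1}{\partial out_{y1}} \cdot \frac{\partial out_{y1}}{\partial y_1} \cdot w_5$$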

I hope you get it, this splitting of the derivatives.

Similarly,

We put all the values together in term 1:
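Under the further assumption that w7 is the weight connecting out H1 to y2, this works out to:

$$\frac{\partial E_{total}}{\partial out_{H1}} = (out_{y1} - T_1)\,out_{y1}(1 - out_{y1})\,w_5 + (out_{y2} - T_2)\,out_{y2}(1 - out_{y2})\,w_7$$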

Now, 2nd term,
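This is again the sigmoid derivative, this time at the hidden neuron:

$$\frac{\partial out_{H1}}{\partial H_1} = out_{H1}\,(1 - out_{H1})$$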

3rd term,
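And since w1 multiplies the input i1 in the weighted sum for H1 (under the assumed naming):

$$\frac{\partial H_1}{\partial w_1} = i_1$$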

Putting all the values together, we get:
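Multiplying the three terms again:

$$\frac{\partial E_{total}}{\partial w_1} = \frac{\partial E_{total}}{\partial out_{H1}} \cdot out_{H1}(1 - out_{H1}) \cdot i_1$$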

Now, our updated weight w1,
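Using the same update rule as before:

$$w_1^{new} = w_1 - \eta \cdot \frac{\partial E_{total}}{\partial w_1}$$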

In the same way, we calculate the other weight values.

From layer 2, we went to layer 1, updating all the weights from w5 to w8; then from layer 1, we went to layer 0, updating the weights from w1 to w4. In this way, we came back to the starting point, meaning we back propagated through the whole network.

Now, with the updated weights, we again do forward propagation, calculate the output values, and check whether they are near the target values or not.

And again we calculate the error, back propagate it, and update the weights, repeating this cycle until we get outputs near the target values, i.e. until the error is minimised.
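To tie the whole cycle together, here is a minimal, runnable Python sketch of the network described above (2 inputs, 2 hidden neurons, 2 outputs, sigmoid activations, squared error, learning rate 0.5). The layout of w1 to w8, the biases b1 and b2, and all the starting numbers are my own example values, not the ones from the figures:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Arbitrary example values, chosen only for illustration.
i1, i2 = 0.05, 0.10                        # inputs
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30    # input -> hidden weights
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55    # hidden -> output weights
b1, b2 = 0.35, 0.60                        # biases (assumed; not updated, as the post only walks through the weights)
T1, T2 = 0.01, 0.99                        # target values
eta = 0.5                                  # learning rate

for step in range(10000):
    # Forward propagation
    H1 = w1 * i1 + w2 * i2 + b1
    H2 = w3 * i1 + w4 * i2 + b1
    out_H1, out_H2 = sigmoid(H1), sigmoid(H2)
    y1 = w5 * out_H1 + w6 * out_H2 + b2
    y2 = w7 * out_H1 + w8 * out_H2 + b2
    out_y1, out_y2 = sigmoid(y1), sigmoid(y2)

    # Error estimation: E_total = 1/2 (T1 - out_y1)^2 + 1/2 (T2 - out_y2)^2
    E_total = 0.5 * (T1 - out_y1) ** 2 + 0.5 * (T2 - out_y2) ** 2

    # Backward pass, output layer: terms 1 and 2 of the chain rule
    d_y1 = (out_y1 - T1) * out_y1 * (1 - out_y1)
    d_y2 = (out_y2 - T2) * out_y2 * (1 - out_y2)

    # Backward pass, hidden layer: sum the error flowing back from both outputs
    d_H1 = (d_y1 * w5 + d_y2 * w7) * out_H1 * (1 - out_H1)
    d_H2 = (d_y1 * w6 + d_y2 * w8) * out_H2 * (1 - out_H2)

    # Weight updates: w_new = w - eta * dE_total/dw
    w5 -= eta * d_y1 * out_H1
    w6 -= eta * d_y1 * out_H2
    w7 -= eta * d_y2 * out_H1
    w8 -= eta * d_y2 * out_H2
    w1 -= eta * d_H1 * i1
    w2 -= eta * d_H1 * i2
    w3 -= eta * d_H2 * i1
    w4 -= eta * d_H2 * i2

print(f"E_total = {E_total:.6f}")
print(f"out_y1 = {out_y1:.4f} (target {T1}), out_y2 = {out_y2:.4f} (target {T2})")
```

Run it and you should see the error shrink and the outputs approach the targets; that is exactly the repeat-until-minimised loop described above.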

If this post helped you understand backward propagation, then kindly give it a clap; it will motivate me to write more.

Thank you for reading.

Keep learning, keep reading ☺

Reference:

  1. https://youtu.be/0e0z28wAWfg

