Backward Propagation

Backward propagation is a fundamental algorithm used to train neural networks. It adjusts the weights to minimize the error between the predicted output and the actual output.

Forward Propagation

Before backward propagation, the input passes forward through the hidden layers: at each layer a weighted sum of the inputs is calculated and an activation function is applied, until the final layer produces an output.
The generated output is then compared to the actual (target) value, and a loss is calculated that measures the difference between the predicted value and the actual value.
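
As a minimal sketch of this forward pass (assuming a single hidden layer, sigmoid activations, and a squared-error loss, none of which the text specifies), it might look like the following:

    import numpy as np

    def sigmoid(z):
        # Activation function applied to each weighted sum
        return 1.0 / (1.0 + np.exp(-z))

    def forward(x, W1, b1, W2, b2):
        # Hidden layer: weighted sum of the inputs, then activation
        z1 = W1 @ x + b1
        a1 = sigmoid(z1)
        # Output layer: weighted sum of the hidden activations, then activation
        z2 = W2 @ a1 + b2
        a2 = sigmoid(z2)
        return a2

    def loss(y_pred, y_true):
        # Loss: squared difference between predicted and actual values
        return 0.5 * np.sum((y_pred - y_true) ** 2)

    # Example with random weights: 3 inputs, 4 hidden units, 1 output
    rng = np.random.default_rng(0)
    x = rng.normal(size=3)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
    W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
    y_pred = forward(x, W1, b1, W2, b2)
    print(loss(y_pred, np.array([1.0])))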

Backward Propagation

The main aim of backward propagation is to minimize the loss by updating the weights in the network. The process involves:

Calculating Gradient

The algorithm uses the chain rule of calculus to compute the gradient of the loss function with respect to each weight.
Starting from the output layer, it calculates how much the loss would change if each weight were slightly adjusted.
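
One way to see what this gradient means (a numerical sketch only; backpropagation computes it analytically with the chain rule rather than by nudging weights) is to perturb a single weight and measure the change in the loss. The single-neuron setup below is a hypothetical example, not part of the original text:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def loss_for_weight(w, x, y_true):
        # Forward pass of one sigmoid neuron followed by a squared-error loss
        a = sigmoid(w * x)
        return 0.5 * (a - y_true) ** 2

    def numerical_gradient(w, x, y_true, eps=1e-6):
        # Central difference: how much the loss changes when w is nudged slightly
        return (loss_for_weight(w + eps, x, y_true)
                - loss_for_weight(w - eps, x, y_true)) / (2 * eps)

    # A negative value here means that increasing w would decrease the loss
    print(numerical_gradient(w=0.5, x=1.0, y_true=1.0))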

Updating Weights

The gradients are propagated backward through the network, from the output layer to the input layer. Each weight is updated using its gradient, with the step size controlled by a factor called the learning rate. The weights are updated in the direction that reduces the loss.

The global minimum is the point at which the error is lowest. The update rule for each weight, and the chain-rule expansion of the gradient for an output-layer weight, are:

    \[w_{\text{new}} = w_{\text{old}} - \eta \cdot \frac{\partial L}{\partial w}\]

    \[\frac{\partial L}{\partial w_o} = \frac{\partial L}{\partial a_o} \cdot \frac{\partial a_o}{\partial z_o} \cdot \frac{\partial z_o}{\partial w_o}\]

    \[\frac{\partial L}{\partial w_o} \text{ is the gradient of the loss function with respect to the weight } w_o. \]

    \[\frac{\partial L}{\partial a_o} \text{ is the gradient of the loss function with respect to the output activation } a_o. \]

    \[\frac{\partial a_o}{\partial z_o} \text{ is the gradient of the output activation } a_o \text{ with respect to the weighted sum } z_o. \]

    \[\frac{\partial z_o}{\partial w_o} \text{ is the gradient of the weighted sum } z_o \text{ with respect to the weight } w_o.\]
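
Putting the two formulas together, the sketch below computes each chain-rule factor for a single output neuron and then applies the update rule. The sigmoid activation, squared-error loss, and the particular numbers are assumptions chosen for illustration:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x, y_true = 1.5, 0.0   # input to the output neuron and target value
    w_o, eta = 0.8, 0.1    # output-layer weight and learning rate

    for step in range(3):
        # Forward pass: weighted sum and activation
        z_o = w_o * x
        a_o = sigmoid(z_o)
        L = 0.5 * (a_o - y_true) ** 2

        # Chain-rule factors from the equation above
        dL_da = a_o - y_true           # dL/da_o for the squared-error loss
        da_dz = a_o * (1.0 - a_o)      # da_o/dz_o for the sigmoid
        dz_dw = x                      # dz_o/dw_o, since z_o = w_o * x
        dL_dw = dL_da * da_dz * dz_dw  # dL/dw_o

        # Update rule: w_new = w_old - eta * dL/dw_o
        w_o = w_o - eta * dL_dw
        print(f"step {step}: loss={L:.4f}, dL/dw_o={dL_dw:.4f}, w_o={w_o:.4f}")

Repeating this update lowers the loss step by step, with the learning rate eta controlling how large each step toward the minimum is.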
