Vanishing Gradient: Meaning, Applications & Example
A problem where gradients become too small during training.
What is Vanishing Gradient?
Vanishing Gradient refers to a problem that occurs during the training of deep neural networks, where the gradients (used for updating the model weights) become exceedingly small, making it difficult for the model to learn. This problem is especially prominent in networks with many layers, where the gradients diminish as they are backpropagated through the network.
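As a rough illustration, here is a minimal NumPy sketch of backpropagation through a stack of sigmoid layers. The layer count, width, and random initialization are arbitrary choices, but the printed gradient norms shrink steadily as the backward pass approaches the first layer:

```python
import numpy as np

# Minimal sketch: backpropagate through a deep stack of sigmoid layers
# and watch the gradient norm shrink toward the early layers.
# Layer count, width, and initialization are arbitrary illustrative choices.
rng = np.random.default_rng(0)
n_layers, width = 20, 64
weights = [rng.normal(0.0, 1.0, (width, width)) / np.sqrt(width)
           for _ in range(n_layers)]

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Forward pass, keeping each activation for the backward pass.
a = rng.normal(size=width)
activations = [a]
for W in weights:
    a = sigmoid(W @ a)
    activations.append(a)

# Backward pass, starting from an arbitrary upstream gradient of ones.
grad = np.ones(width)
for i in reversed(range(n_layers)):
    a = activations[i + 1]
    grad = weights[i].T @ (grad * a * (1 - a))  # sigmoid'(z) = a * (1 - a)
    print(f"layer {i:2d}: gradient norm = {np.linalg.norm(grad):.3e}")
```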
Causes of Vanishing Gradient
- Activation Functions: Saturating activation functions such as sigmoid and tanh squash their input into a narrow output range (sigmoid into (0, 1), tanh into (-1, 1)), so their derivatives are small (at most 0.25 for sigmoid), which leads to small gradients during backpropagation.
- Deep Networks: In networks with many layers, the chain rule multiplies the gradient by one of these small derivatives at each layer, so it shrinks exponentially by the time it reaches the initial layers; the sketch after this list puts a number on that shrinkage.
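To see why the shrinkage is exponential, note that the sigmoid's derivative, s(z)(1 − s(z)), peaks at 0.25 when z = 0. So n sigmoid layers can scale a gradient by at most roughly 0.25^n, a simplified bound that ignores the weight magnitudes:

```python
# Upper bound on how fast gradients can shrink through sigmoid layers:
# sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) peaks at 0.25 (at z = 0),
# so each layer can scale the gradient by at most ~0.25 (ignoring weights).
max_sigmoid_grad = 0.25
for n_layers in (5, 10, 20):
    print(f"{n_layers} layers: gradient scale <= {max_sigmoid_grad ** n_layers:.2e}")
# 5 layers: ~9.8e-04, 10 layers: ~9.5e-07, 20 layers: ~9.1e-13
```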
Impact of Vanishing Gradient
- Slow or Stagnant Learning: Because the gradients become too small to update the weights effectively, the network struggles to learn, especially in the earlier layers.
- Poor Performance: If the gradient vanishes completely, the model may fail to improve or converge to a good solution.
Solutions to Vanishing Gradient
- ReLU Activation Function: Using ReLU or variants such as Leaky ReLU helps mitigate the vanishing gradient problem, because the ReLU gradient is exactly 1 for positive inputs, so repeated multiplication through the layers does not shrink the gradient.
- Batch Normalization: This technique normalizes the inputs to each layer, which helps maintain gradients at a manageable scale.
- Residual Networks (ResNets): These networks use shortcut connections that let gradients bypass certain layers, improving learning in very deep networks; a sketch combining these remedies follows this list.
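The following is a minimal PyTorch sketch, not a canonical implementation, showing how the three remedies can fit together in one block; the width of 64 and the two-layer structure are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Minimal sketch of a residual block combining the three remedies:
    ReLU activations, batch normalization, and a shortcut (skip) connection.
    The width of 64 is an arbitrary choice for illustration."""

    def __init__(self, width: int = 64):
        super().__init__()
        self.fc1 = nn.Linear(width, width)
        self.bn1 = nn.BatchNorm1d(width)
        self.fc2 = nn.Linear(width, width)
        self.bn2 = nn.BatchNorm1d(width)
        self.relu = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The shortcut lets gradients flow directly past the two layers,
        # so they do not have to pass through every weight matrix.
        out = self.relu(self.bn1(self.fc1(x)))
        out = self.bn2(self.fc2(out))
        return self.relu(out + x)  # add the skip connection, then activate

block = ResidualBlock()
x = torch.randn(32, 64)  # batch of 32 arbitrary inputs
print(block(x).shape)    # torch.Size([32, 64])
```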
Example of Vanishing Gradient
In a neural network with sigmoid activations, if the input to a neuron is very large or very small, the sigmoid saturates and its gradient becomes very close to zero, leading to extremely slow or stalled learning.
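A few plugged-in numbers make the saturation concrete; the input values here are arbitrary:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# The sigmoid's gradient, s * (1 - s), collapses toward zero once the
# input leaves the small region around 0 where the function is steep.
for x in (0.0, 2.0, 5.0, 10.0):
    s = sigmoid(x)
    print(f"x = {x:5.1f}: sigmoid = {s:.6f}, gradient = {s * (1 - s):.2e}")
# At x = 10 the gradient is ~4.5e-05, so weight updates are nearly zero.
```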