Weight Decay: Meaning, Applications & Example
A regularization technique to prevent overfitting.
What is Weight Decay?
Weight Decay is a regularization technique used in machine learning and neural networks to prevent overfitting by penalizing large weights during training. It adds a term to the loss function that is proportional to the squared magnitude of the weights, encouraging the model to learn smaller weights and thus improving generalization.
How Weight Decay Works
- Penalty Term: The loss function is modified by adding a penalty term that is proportional to the square of the model’s weights (L2 norm).
- Prevents Overfitting: By discouraging overly large weights, weight decay reduces the risk of the model fitting noise or irrelevant patterns in the training data, as the sketch after this list illustrates.
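A minimal sketch of the mechanism in NumPy (the function name, learning rate, and λ value are illustrative choices, not part of any library): because the penalty λ * ||W||² contributes 2λW to the gradient, every update step also shrinks the weights toward zero, which is where the name "decay" comes from.

```python
import numpy as np

def sgd_step_with_weight_decay(w, grad_loss, lr=0.1, lam=0.01):
    """One gradient step on the penalized loss L + λ * ||W||².

    The penalty contributes 2 * λ * w to the gradient, so large
    weights are pulled toward zero on every update (the "decay").
    """
    return w - lr * (grad_loss + 2 * lam * w)

# Pretend the data gradient is zero to isolate the decay effect.
w = np.array([4.0, -3.0, 0.5])
print(sgd_step_with_weight_decay(w, np.zeros_like(w)))
# -> [ 3.992 -2.994  0.499]  (every weight shrinks slightly)
```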
Example of Weight Decay
In a neural network, if the original loss function is L, the loss function with weight decay becomes:

L' = L + λ * ||W||²

Where:
- λ is the weight decay coefficient (a hyperparameter),
- W are the model weights,
- ||W||² is the squared L2 norm of the weights.
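A quick numeric check of the formula (the loss value, λ, and weights below are made-up illustrative numbers):

```python
import numpy as np

L = 0.75                         # original loss
lam = 0.01                       # weight decay coefficient λ
W = np.array([0.5, -1.2, 3.0])   # model weights

penalty = lam * np.sum(W ** 2)   # λ * ||W||² = 0.01 * 10.69
L_prime = L + penalty            # L' = L + λ * ||W||²
print(L_prime)                   # ≈ 0.8569
```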
This adjustment helps the model focus on simpler, more generalizable patterns.
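In practice, most deep learning frameworks expose weight decay as an optimizer hyperparameter rather than requiring you to modify the loss by hand. For example, PyTorch's SGD optimizer accepts a weight_decay argument (the toy model and value below are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # a toy model for illustration

# The weight_decay argument applies the L2 penalty inside the
# update step; 1e-4 is a common starting value, not a rule.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)
```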