Weight Decay
What is Weight Decay?
Weight Decay is a regularization technique used in machine learning and neural networks to prevent overfitting by penalizing large weights during training. It adds a penalty term to the loss function that is proportional to the squared magnitude of the weights, encouraging the model to learn smaller weights and thereby generalize better.
How Weight Decay Works
- Penalty Term: The loss function is modified by adding a penalty term proportional to the squared L2 norm of the model's weights.
- Prevents Overfitting: By discouraging overly large weights, weight decay reduces the risk of the model fitting noise or irrelevant patterns in the training data.
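In practice, most deep learning frameworks expose weight decay as an optimizer hyperparameter rather than requiring you to modify the loss by hand. Below is a minimal, illustrative sketch using PyTorch's SGD optimizer, whose weight_decay argument applies an L2-style penalty to each update; the model, data, and hyperparameter values are placeholders, not prescriptions.

```python
import torch
import torch.nn as nn

# Toy regression setup (illustrative values, not from the source).
model = nn.Linear(10, 1)
inputs = torch.randn(32, 10)
targets = torch.randn(32, 1)

# weight_decay adds a decay term proportional to each weight to its gradient,
# which is equivalent (up to a constant factor) to adding an L2 penalty to the loss.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
loss_fn = nn.MSELoss()

optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)
loss.backward()
optimizer.step()  # the update shrinks the weights slightly toward zero
```

For plain SGD, passing weight_decay this way matches the λ‖W‖² loss penalty up to a constant factor absorbed into λ.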
Example of Weight Decay
In a neural network, if the original loss function is L, the loss function with weight decay becomes:

L' = L + λ * ||W||²

Where:
- λ is the weight decay coefficient (a hyperparameter),
- W are the model weights,
- ||W||² is the squared L2 norm of the weights.
This adjustment helps the model focus on simpler, more generalizable patterns.
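To make the formula concrete, here is a minimal sketch of the penalized loss L' in PyTorch; the function name penalized_loss and the λ value are illustrative assumptions, not a standard API.

```python
import torch

def penalized_loss(loss, weights, lam=1e-4):
    """Return L' = L + lam * ||W||^2 (name and lam value are illustrative)."""
    l2_sq = sum((w ** 2).sum() for w in weights)  # squared L2 norm over all weight tensors
    return loss + lam * l2_sq

w = torch.randn(5, requires_grad=True)
base_loss = ((w * 2.0) ** 2).mean()     # stand-in for the task loss L
total = penalized_loss(base_loss, [w])  # L' from the formula above
total.backward()                        # gradient now includes a 2*lam*w term
```

Backpropagating through the penalty adds a 2λW term to each weight's gradient, which is exactly what pushes the weights toward zero during training.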
Did you like the Weight Decay gist?
Learn about 250+ need-to-know artificial intelligence terms in the AI Dictionary.