Layer Normalization
2024 | AI Dictionary
Technique to normalize the inputs across features.
What is Layer Normalization?
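Layer Normalization is a technique that normalizes the activations of a neural network layer across the feature dimension, separately for each training example. Because its statistics do not depend on the other examples in the batch, it behaves the same way during training and inference and works with small or variable batch sizes. It helps stabilize and accelerate training by keeping the scale of each layer’s inputs consistent, an effect often described as reducing internal covariate shift.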
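As a small illustration of the per-example behavior described above, the sketch below normalizes one sample on its own and again as part of a larger batch, then compares the two results. The tensor sizes, the scale factor, and the variable names are arbitrary choices made for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer_norm = nn.LayerNorm(4)

sample = torch.randn(1, 4)       # one example with 4 features
others = torch.randn(3, 4) * 10  # unrelated examples on a very different scale
batch = torch.cat([sample, others], dim=0)

alone = layer_norm(sample)        # the sample normalized by itself
in_batch = layer_norm(batch)[:1]  # the same sample normalized inside a batch

# The other rows should not influence the result for the first row.
same_result = torch.allclose(alone, in_batch, atol=1e-6)
print(same_result)
```

If the two outputs agree, the statistics were computed per example rather than across the batch, which is the property that distinguishes this technique from batch-based normalization.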
How Layer Normalization Works
- Compute Mean and Variance: For each training example, calculate the mean and variance of the activations across all features in the layer.
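- Normalize: Subtract the mean and divide by the standard deviation (with a small epsilon added to the variance for numerical stability), so that each example’s activations have approximately zero mean and unit variance.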
- Scale and Shift: Apply learned scale (gamma) and shift (beta) parameters, one per feature, so that normalization does not limit what the layer can represent.
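The three steps above can be sketched by hand and compared against PyTorch’s built-in layer. This is a minimal illustration, not the library’s actual implementation: the function name manual_layer_norm is hypothetical, and the default eps of 1e-5 is assumed to match nn.LayerNorm.

```python
import torch
import torch.nn as nn

def manual_layer_norm(x, gamma, beta, eps=1e-5):
    # Step 1: per-example mean and (biased) variance over the feature dimension
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, keepdim=True, unbiased=False)
    # Step 2: normalize; eps keeps the division numerically stable
    x_hat = (x - mean) / torch.sqrt(var + eps)
    # Step 3: learned per-feature scale and shift
    return gamma * x_hat + beta

torch.manual_seed(0)
x = torch.randn(2, 10)        # 2 samples, 10 features each
reference = nn.LayerNorm(10)  # weight starts at 1, bias at 0

manual = manual_layer_norm(x, reference.weight, reference.bias, eps=reference.eps)
matches = torch.allclose(manual, reference(x), atol=1e-5)
print(matches)
```

Reusing the reference layer’s weight, bias, and eps keeps the comparison fair, so any difference between the two outputs would come from the normalization arithmetic itself.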
Applications of Layer Normalization
- Recurrent Neural Networks (RNNs): It is often used in RNNs to improve training stability and performance.
- Transformers: Layer Normalization is widely used in Transformer models, such as BERT and GPT, where it is applied around the attention and feed-forward sublayers to help deep stacks of layers converge.
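For sequence models such as these, the input usually has a sequence dimension as well as a batch dimension. The sketch below assumes a tensor shaped (batch, sequence length, features) with made-up sizes, and checks that each position is normalized over its own feature vector.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, seq_len, d_model = 2, 5, 16  # hypothetical sizes for illustration

# Activations with a non-zero mean and non-unit scale
hidden = torch.randn(batch_size, seq_len, d_model) * 3 + 2

layer_norm = nn.LayerNorm(d_model)  # normalizes over the last dimension only
normed = layer_norm(hidden)

# One mean and one variance per (sample, position) pair
token_means = normed.mean(dim=-1)
token_vars = normed.var(dim=-1, unbiased=False)
print(token_means.shape, token_vars.shape)
```

With the scale and shift parameters at their initial values, every entry of token_means should be close to 0 and every entry of token_vars close to 1, regardless of the batch size or sequence length.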
Example of Layer Normalization
In PyTorch, Layer Normalization is available as nn.LayerNorm. The example below applies it to a small batch:
import torch
import torch.nn as nn
# Example of Layer Normalization in PyTorch
layer_norm = nn.LayerNorm(10) # Normalize over the last dimension
input_tensor = torch.randn(2, 10) # Batch of 2 samples, each with 10 features
output_tensor = layer_norm(input_tensor)
print(output_tensor)