What is Layer Normalization? Meaning, Applications & Example
A technique to normalize a layer's inputs across the feature dimension, independently for each example.
What is Layer Normalization?
Layer Normalization is a technique used to normalize the activations of a neural network layer for each training example independently. Unlike Batch Normalization, it computes its statistics across the features of a single example rather than across the batch, so it does not depend on the batch size. It helps stabilize and accelerate training by reducing internal covariate shift, ensuring that each layer's inputs have a consistent distribution.
How Layer Normalization Works
- Compute Mean and Variance: For each training example, calculate the mean and variance of the activations across all features in the layer.
- Normalize: Subtract the mean and divide by the standard deviation (with a small epsilon added for numerical stability), normalizing the activations.
- Scale and Shift: Apply learned scale (gamma) and shift (beta) parameters to restore the model's capacity to represent any distribution (see the sketch after this list).
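The following is a minimal sketch of these three steps in PyTorch. The function name layer_norm_manual, the tensor shapes, and the eps value are illustrative assumptions, chosen to mirror the library example further below:
import torch

def layer_norm_manual(x, gamma, beta, eps=1e-5):
    # Step 1: mean and variance across the feature dimension, per example
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, unbiased=False, keepdim=True)  # biased variance, matching nn.LayerNorm
    # Step 2: normalize; eps guards against division by zero
    x_hat = (x - mean) / torch.sqrt(var + eps)
    # Step 3: apply learned scale (gamma) and shift (beta)
    return gamma * x_hat + beta

# Illustrative usage: batch of 2 examples, 10 features each
x = torch.randn(2, 10)
gamma = torch.ones(10)   # scale, initialized to 1 and learned during training
beta = torch.zeros(10)   # shift, initialized to 0 and learned during training
print(layer_norm_manual(x, gamma, beta))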
Applications of Layer Normalization
- Recurrent Neural Networks (RNNs): It is often used in RNNs to improve training stability and performance.
- Transformers: Layer Normalization is widely used in Transformer models such as BERT and GPT, where it is applied around the attention and feed-forward sublayers to improve the convergence of deep models (a minimal sketch of this pattern follows below).
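As an illustration, here is a minimal sketch of the post-norm "Add & Norm" pattern from the original Transformer, where a sublayer's output is added to its input and then layer-normalized. The shapes, the d_model value, and the sublayer stand-in are assumptions for this example:
import torch
import torch.nn as nn

d_model = 16                          # illustrative model dimension
add_and_norm = nn.LayerNorm(d_model)

# Stand-ins for a Transformer sublayer: x is the residual input,
# sublayer_output is what an attention or feed-forward sublayer would return.
x = torch.randn(2, 5, d_model)        # (batch, sequence length, features)
sublayer_output = torch.randn(2, 5, d_model)

# Post-norm residual connection: LayerNorm(x + Sublayer(x))
out = add_and_norm(x + sublayer_output)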
Example of Layer Normalization
In practice, Layer Normalization helps stabilize training in models such as Transformers and RNNs:
import torch
import torch.nn as nn

# Example of Layer Normalization in PyTorch
layer_norm = nn.LayerNorm(10)  # normalize over the last dimension (10 features); creates learnable gamma and beta of shape (10,)
input_tensor = torch.randn(2, 10)  # batch of 2 samples, each with 10 features
output_tensor = layer_norm(input_tensor)  # each sample is normalized independently
print(output_tensor)
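As a quick check (continuing the example above), each row of output_tensor should have approximately zero mean and unit variance, since the learnable gamma and beta parameters are initialized to 1 and 0 respectively:
print(output_tensor.mean(dim=-1))                 # roughly 0 for every sample
print(output_tensor.var(dim=-1, unbiased=False))  # roughly 1 for every sample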