What is Quantization? Meaning, Applications & Example
Technique to reduce model size by using fewer bits per weight.
What is Quantization?
Quantization in machine learning refers to the process of reducing the numerical precision of a model's weights and/or activations in order to shrink the model size and improve inference speed, often for deployment on resource-constrained devices. By approximating continuous floating-point values with a small set of discrete values, quantization can significantly decrease both the memory footprint and the computation required.
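To make this concrete, here is a minimal sketch (not a production implementation) of affine quantization: a float32 array is mapped to int8 codes using a scale and zero point, then mapped back to show the approximation error. The helper names are illustrative assumptions.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Affine (asymmetric) quantization of a float32 array to int8."""
    qmin, qmax = -128, 127
    scale = (x.max() - x.min()) / (qmax - qmin)      # step size between quantization levels
    zero_point = round(qmin - x.min() / scale)       # integer code that represents 0.0
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Map int8 codes back to approximate float32 values."""
    return (q.astype(np.float32) - zero_point) * scale

weights = np.random.randn(4, 4).astype(np.float32)  # stand-in for a layer's weights
q, scale, zp = quantize_int8(weights)
approx = dequantize(q, scale, zp)
print("max abs error:", np.abs(weights - approx).max())
```

Each weight now occupies one byte instead of four, and only the per-tensor scale and zero point need to be stored in addition to the int8 codes.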
Types of Quantization
- Weight Quantization: Reducing the precision of the model's weights (e.g., from 32-bit floating point to 8-bit integers).
- Activation Quantization: Reducing the precision of the activation values during the forward pass.
- Post-training Quantization: Applied after the model has been trained, converting the weights to lower precision without requiring retraining (see the sketch after this list).
- Quantization-Aware Training (QAT): A technique where quantization is incorporated during training, allowing the model to adapt to the lower precision during the learning process.
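As one way to illustrate post-training quantization, the sketch below uses PyTorch's dynamic quantization API to convert the linear layers of an already-trained model to int8 weights. The toy model is an assumption made for the example, and exact behavior may vary with the PyTorch version.

```python
import torch
import torch.nn as nn

# A small model stands in for any trained network (assumed for illustration).
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()  # post-training quantization operates on a trained, frozen model

# Dynamic post-training quantization: weights are stored as int8,
# activations are quantized on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 784)
print(quantized_model(x).shape)  # inference runs as before, now with int8 weights
```

Quantization-aware training, by contrast, inserts simulated quantization operations into the training graph so the model learns weights that remain accurate after the precision is reduced.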
Applications of Quantization
- Mobile and Edge Devices: Deploying machine learning models on devices with limited computational resources, such as smartphones or IoT devices.
- Inference Optimization: Reducing the time and memory needed for model inference, making it faster and more efficient.
- Cloud Deployment: Improving model serving speed and reducing bandwidth consumption in cloud environments by transferring smaller model artifacts.
Example of Quantization
In a computer vision application, a deep neural network for image classification can be quantized from 32-bit floating point weights to 8-bit integer weights. This can drastically reduce the model size, allowing it to run faster on mobile devices without a significant loss in accuracy.
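To put a rough number on the size reduction, the back-of-the-envelope calculation below (the parameter count is assumed purely for illustration) compares the weight storage of a 25-million-parameter image classifier in float32 versus int8.

```python
num_params = 25_000_000          # assumed parameter count for an image classifier
bytes_fp32 = num_params * 4      # 32-bit floats: 4 bytes per weight
bytes_int8 = num_params * 1      # 8-bit integers: 1 byte per weight

print(f"float32 weights: {bytes_fp32 / 1e6:.0f} MB")   # ~100 MB
print(f"int8 weights:    {bytes_int8 / 1e6:.0f} MB")   # ~25 MB
print(f"reduction:       {bytes_fp32 / bytes_int8:.0f}x")
```

An 8-bit model therefore needs roughly a quarter of the weight storage of its 32-bit counterpart, which is what makes on-device deployment practical.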