Quantization#
Created On: Oct 09, 2019 | Last Updated On: Jul 10, 2025
We are centralizing all quantization-related development in torchao. Please check out our new doc page: https://docs.pytorch.org/ao/stable/index.html
Plan for the existing quantization flows:
1. Eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic): please migrate to the torchao eager mode quantize_ API instead.
2. FX graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, torch.ao.quantization.quantize_fx.convert_fx): please migrate to the torchao pt2e quantization API instead (torchao.quantization.pt2e.quantize_pt2e.prepare_pt2e, torchao.quantization.pt2e.quantize_pt2e.convert_pt2e).
3. pt2e quantization has been migrated to torchao (pytorch/ao); see pytorch/ao#2259 for more details.
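The eager mode migration in item 1 might look roughly like the sketch below. It is an illustration under assumptions, not an official migration recipe: `TinyModel`, `quantize_legacy`, and `quantize_torchao` are hypothetical names, and the sketch assumes a torchao release that exports `Int8WeightOnlyConfig` from `torchao.quantization`. Int8 weight-only quantization is also a different scheme from the legacy dynamic quantization, so check the torchao docs for the config that matches your use case.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical two-layer model used only for illustration."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(64, 32)
        self.fc2 = nn.Linear(32, 8)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))


def quantize_legacy(model):
    # Legacy eager mode flow (scheduled for removal): returns a model whose
    # nn.Linear layers are replaced by dynamically quantized versions.
    from torch.ao.quantization import quantize_dynamic

    return quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)


def quantize_torchao(model):
    # torchao eager mode flow: quantize_ modifies the model in place.
    # Assumption: the installed torchao version provides Int8WeightOnlyConfig.
    # The import is done lazily so this file loads without torchao installed.
    from torchao.quantization import quantize_, Int8WeightOnlyConfig

    quantize_(model, Int8WeightOnlyConfig())
    return model
```

Note the difference in calling convention: the legacy function returns a quantized model, while `quantize_` mutates the model it is given, so code that relied on keeping the original float model around should pass in a copy.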
We plan to delete torch.ao.quantization in 2.10 if there are no blockers; otherwise, it will be deleted in the earliest PyTorch version after all blockers are cleared.
Quantization API Reference (Kept since APIs are still public)#
The Quantization API Reference contains documentation of quantization APIs, such as quantization passes, quantized tensor operations, and supported quantized modules and functions.