quantize_per_tensor gets different results w/o setting OMP_NUM_THREADS=1 #80501
Comments
With OMP_NUM_THREADS=1 set:
And without OMP_NUM_THREADS set:
Can you confirm?
Thanks for the reply, @jerryzh168.
@jerryzh168, there is a patch fixing this in pytorch/FBGEMM#1261.
Summary: There is a data overflow for the int type when the input length > 2,147,483,647. The PyTorch side reported an issue when using FBGEMM Quantize (see pytorch/pytorch#80501); the user example's input length is 5,117,410,688, but the FBGEMM side uses **int** to represent the input length, so a wrong number is produced in the single-thread case. This PR fixes the two APIs used on the PyTorch side, **Quantize** and **FindMinMax**; there may be other functions that also need to be updated to use a higher-precision dtype. Pull Request resolved: #1261 Reviewed By: jianyuh Differential Revision: D39089686 Pulled By: jspark1105 fbshipit-source-id: 9623bbb20bdba0f98040a1c8143e4bc552d2a6cb
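The overflow the PR describes can be sketched in a few lines: truncating the 64-bit element count reported in this issue to a signed 32-bit int yields a much smaller (wrong) count, which is what the single-thread path in the old FBGEMM signatures effectively saw.

```python
import ctypes

INT32_MAX = 2_147_483_647
numel = 5_117_410_688  # the input length reported in this issue

# Simulate storing the 64-bit length in a signed 32-bit int, as the
# pre-fix FBGEMM Quantize/FindMinMax signatures effectively did.
truncated = ctypes.c_int32(numel).value

print(numel > INT32_MAX)  # True: the length does not fit in int32
print(truncated)          # 822443392, i.e. numel mod 2**32
```

With the truncated count, the single-thread case quantizes only a fraction of the buffer, which explains why results differ from the multi-threaded run that splits the work into chunks small enough to fit in an int.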
Is this fixed? @zhuhaozhe @XiaobingSuper
@jerryzh168, I checked the FBGEMM version used by PyTorch; pytorch/FBGEMM#1261 has been applied, so I think this issue is fixed.
🐛 Describe the bug
For the Python file below (we call it test_quant.py), we get different i8_args depending on whether OMP_NUM_THREADS=1 is set.
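The original test_quant.py was not captured in this page. The following is a hypothetical reconstruction based on the comments: the reported input had 5,117,410,688 elements (above INT32_MAX), and the quantized integer representation (`i8_args`) differed with and without OMP_NUM_THREADS=1. The `quantize` helper name and the scale/zero-point values here are illustrative, not from the original script.

```python
import torch

def quantize(numel: int) -> torch.Tensor:
    # Quantize a random float tensor and return its int8 representation.
    x = torch.randn(numel, dtype=torch.float32)
    q = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)
    return q.int_repr()

# NOTE: the mismatch reported in this issue only appears when numel
# exceeds 2,147,483,647 (INT32_MAX), e.g. numel = 5_117_410_688, which
# needs ~20 GB of RAM. A small tensor is used here so the sketch runs.
i8_args = quantize(1024)
print(i8_args.dtype, i8_args.numel())
```

To reproduce the reported divergence, one would run a script like this twice on a >INT32_MAX-element tensor, once with `OMP_NUM_THREADS=1` and once without, and compare the resulting `i8_args`.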
Versions
Collecting environment information...
PyTorch version: 1.13.0a0+git0922cc0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: CentOS Stream 8 (x86_64)
GCC version: (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10)
Clang version: 13.0.0 (Red Hat 13.0.0-3.module_el8.6.0+1074+380cef3f)
CMake version: version 3.19.6
Libc version: glibc-2.10
Python version: 3.7.7 (default, Mar 26 2020, 15:48:22) [GCC 7.3.0] (64-bit runtime)
Python platform: Linux-4.18.0-365.el8.x86_64-x86_64-with-centos-8
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Versions of relevant libraries:
[pip3] intel-extension-for-pytorch==1.12.0+cpu
[pip3] numpy==1.21.2
[pip3] torch==1.13.0a0+git0922cc0
[conda] blas 1.0 mkl
[conda] intel-extension-for-pytorch 1.12.0+cpu pypi_0 pypi
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-include 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py37h7f8727e_0
[conda] mkl_fft 1.3.1 py37hd3c417c_0
[conda] mkl_random 1.2.2 py37h51133e4_0
[conda] numpy 1.21.2 py37h20f2e39_0
[conda] numpy-base 1.21.2 py37h79a1101_0
[conda] torch 1.13.0a0+git0922cc0 dev_0
cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo