[quant][graphmode][fx] Produce torch.cat instead of torch.ops.quantized.cat #54924
Conversation
[quant][graphmode][fx] Produce torch.cat instead of torch.ops.quantized.cat

Summary: Previously we produced torch.ops.quantized.cat, which takes the inputs, dequantizes them, and requantizes them with new qparams. This PR changes that to produce torch.cat directly. torch.cat assumes all inputs share the same qparams and produces a quantized Tensor with those same qparams (the previous PR ensures that all inputs and the output of cat share the same observer/fake-quant instance). Using torch.cat is expected to be more efficient since it does not introduce an extra quant/dequant step.

Test Plan: python test/test_quantization.py TestQuantizeFx.test_cat

[ghstack-poisoned]
Makes sense. Does this negatively impact accuracy? What happens if there is something like this, and conv1 and conv2 have qparams which are not close to each other?
Good question; I think probably not. Even in the current implementation we add an observer for the output of cat, which is the concatenation of the outputs of both convs, so they are observed with the same observer anyway.
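A pure-Python sketch (hypothetical helper names, not PyTorch APIs) of why the shared observer makes requantization unnecessary: once all cat inputs use one scale/zero_point, concatenating their integer representations gives exactly the same result as the dequantize → cat → requantize path that torch.ops.quantized.cat would take with those same output qparams.

```python
def quantize(xs, scale, zero_point):
    # Affine quantization: q = round(x / scale) + zero_point, clamped to uint8.
    return [max(0, min(255, round(x / scale) + zero_point)) for x in xs]

def dequantize(qs, scale, zero_point):
    return [(q - zero_point) * scale for q in qs]

# Shared qparams for both cat inputs (as guaranteed by the shared observer).
scale, zp = 0.05, 128
qa = quantize([0.1, -0.2, 0.3], scale, zp)   # simulated conv1 output
qb = quantize([0.4, -0.5], scale, zp)        # simulated conv2 output

# torch.cat on quantized tensors with shared qparams: concatenate the ints.
cat_direct = qa + qb

# quantized.cat with the same output qparams: dequant -> cat -> requant.
cat_requant = quantize(dequantize(qa + qb, scale, zp), scale, zp)

assert cat_direct == cat_requant  # identical values, no extra quant/dequant
```

If the inputs did not share qparams, the two paths could disagree, which is why the observer-sharing PR is a prerequisite for this change.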
@@ -67,6 +66,7 @@
    torch.avg_pool1d,
    torch._C._nn.avg_pool2d,
    torch._C._nn.avg_pool3d,
+   torch.cat,
Looks like we also need to update the TestFXNumericSuiteCoreAPIs.test_op_io_dtype_coverage test for cat to point to this list of functions, instead of the old one.
This pull request has been merged in 096089a.
[quant][graphmode][fx] Produce torch.cat instead of torch.ops.quantized.cat (pytorch#54924)

Summary: Pull Request resolved: pytorch#54924

Previously we produced torch.ops.quantized.cat, which takes the inputs, dequantizes them, and requantizes them with new qparams. This PR changes that to produce torch.cat directly. torch.cat assumes all inputs share the same qparams and produces a quantized Tensor with those same qparams (the previous PR ensures that all inputs and the output of cat share the same observer/fake-quant instance). Using torch.cat is expected to be more efficient since it does not introduce an extra quant/dequant step.

Test Plan: python test/test_quantization.py TestQuantizeFx.test_cat

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27416528

fbshipit-source-id: 896c280abec2903c29d597c655729666583ff0dd
Stack from ghstack:
Summary:
Previously we produced torch.ops.quantized.cat, which takes the inputs, dequantizes them, and requantizes them with new qparams. This PR changes that to produce torch.cat directly. torch.cat assumes all inputs share the same qparams and produces a quantized Tensor with those same qparams, because the previous PR ensures that all inputs and the output of cat share the same observer/fake-quant instance.

Using torch.cat is expected to be more efficient since it does not introduce an extra quant/dequant step.
Test Plan:
python test/test_quantization.py TestQuantizeFx.test_cat
Differential Revision: D27416528
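To illustrate the observer-sharing prerequisite, here is a minimal pure-Python sketch (hypothetical class and method names; PyTorch's real observers and FX passes are more involved). Because one observer instance records the ranges of every cat input and the cat output, all of them necessarily derive the same (scale, zero_point), which is what lets convert emit a plain torch.cat.

```python
class MinMaxObserver:
    """Toy observer: tracks the running min/max of everything it sees."""
    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, xs):
        self.min_val = min(self.min_val, *xs)
        self.max_val = max(self.max_val, *xs)

    def qparams(self, qmin=0, qmax=255):
        # Standard affine qparams from the observed range (zero included).
        lo, hi = min(self.min_val, 0.0), max(self.max_val, 0.0)
        scale = (hi - lo) / (qmax - qmin)
        zero_point = qmin - round(lo / scale)
        return scale, zero_point

# One shared instance observes both cat inputs and the cat output,
# so all three end up with identical qparams by construction.
shared = MinMaxObserver()
conv1_out = [0.5, -1.0, 2.0]
conv2_out = [3.0, -0.25]
shared.observe(conv1_out)
shared.observe(conv2_out)
shared.observe(conv1_out + conv2_out)  # cat output: same values, same range

scale, zero_point = shared.qparams()
```

With separate observers per input, conv1 and conv2 would generally get different scales, and a plain integer concatenation would be incorrect; the shared instance is what makes the torch.cat lowering sound.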