[quant][graphmode][fx] Produce torch.cat instead of torch.ops.quantized.cat #54924


Closed
wants to merge 26 commits

Conversation

@jerryzh168 (Contributor) commented Mar 29, 2021

Stack from ghstack:

Summary:
Previously we produced torch.ops.quantized.cat, which takes quantized inputs, dequantizes them,
and requantizes them with new qparams. This PR changes the lowering to produce torch.cat directly:
torch.cat assumes all of its inputs share the same qparams, and it produces a quantized Tensor with
those same qparams (the previous PR in this stack ensures that all inputs and the output of cat share
the same observer/fake-quant instance).

Using torch.cat is expected to be more efficient since it does not introduce an extra quant/dequant pair.

Test Plan:
python test/test_quantization.py TestQuantizeFx.test_cat


Differential Revision: D27416528
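The difference between the two lowerings can be sketched in plain Python. This is an illustration of per-tensor affine quantization, not PyTorch's actual kernels; the `quantize`/`dequantize` helpers below are hypothetical stand-ins:

```python
# Illustration only: per-tensor affine quantization on Python lists,
# q = round(x / scale) + zero_point. Not PyTorch's real kernels.

def quantize(xs, scale, zero_point):
    return [round(x / scale) + zero_point for x in xs]

def dequantize(qs, scale, zero_point):
    return [(q - zero_point) * scale for q in qs]

# Two inputs observed by the same observer, so they share qparams.
scale, zp = 0.1, 0
qa = quantize([0.1, 0.2], scale, zp)
qb = quantize([0.3, 0.4], scale, zp)

# Old lowering (quantized.cat): dequantize the inputs, concatenate,
# then requantize the result.
old = quantize(dequantize(qa, scale, zp) + dequantize(qb, scale, zp),
               scale, zp)

# New lowering (torch.cat on quantized tensors): just concatenate the
# integer data and carry the shared qparams through -- no quant/dequant.
new = qa + qb

assert old == new  # exact match when all inputs share qparams
```

If the inputs did not share qparams, this shortcut would not be valid, which is why the previous PR in the stack forces cat's inputs and output onto a single observer/fake-quant instance.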

@facebook-github-bot commented Mar 29, 2021

💊 CI failures summary and remediations

As of commit 1117ed2 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 1/2 non-scanned failure(s)

1 failure not recognized by patterns:

Job Step Action
CircleCI pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1 Report results 🔁 rerun


@vkuzo commented Mar 31, 2021

makes sense. Does this negatively impact accuracy? What happens if there is something like this

conv1 ---\
          cat ---- some_other_nodes
conv2 ---/

and conv1 and conv2 have qparams which are not close to each other?

@jerryzh168 (Author) commented Apr 1, 2021

makes sense. Does this negatively impact accuracy? What happens if there is something like this

Good question. I think probably not: even in the current implementation we add an observer for the output of cat, which is the concatenation of both convs' outputs, so both are observed by the same observer anyway.
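A sketch of why the shared observer handles this case. This is a toy min/max observer for illustration, not PyTorch's actual observer API; the conv output ranges are hypothetical:

```python
# Toy min/max observer: because one instance observes every input to
# cat, the computed qparams cover the union of both branches' ranges.

class SharedMinMaxObserver:
    def __init__(self):
        self.min_val = float("inf")
        self.max_val = float("-inf")

    def observe(self, xs):
        self.min_val = min(self.min_val, *xs)
        self.max_val = max(self.max_val, *xs)

    def qparams(self, qmin=0, qmax=255):
        scale = (self.max_val - self.min_val) / (qmax - qmin)
        zero_point = qmin - round(self.min_val / scale)
        return scale, zero_point

obs = SharedMinMaxObserver()
obs.observe([-1.0, 0.5])  # hypothetical conv1 output range
obs.observe([2.0, 7.0])   # hypothetical conv2 output range, far away

scale, zp = obs.qparams()
# The qparams span [-1.0, 7.0]: neither branch's range is clipped,
# at the cost of coarser resolution than per-branch qparams would give.
```

The accuracy question then reduces to whether one shared range is good enough for both branches, which was already the case before this PR since the output of cat was observed as a whole.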

@jerryzh168 jerryzh168 requested a review from vkuzo April 16, 2021 23:18
@@ -67,6 +66,7 @@
torch.avg_pool1d,
torch._C._nn.avg_pool2d,
torch._C._nn.avg_pool3d,
torch.cat,
Looks like we also need to update the TestFXNumericSuiteCoreAPIs.test_op_io_dtype_coverage test so that cat points to this list of functions instead of the old one.

@facebook-github-bot

This pull request has been merged in 096089a.

@facebook-github-bot facebook-github-bot deleted the gh/jerryzh168/578/head branch April 25, 2021 14:16
krshrimali pushed a commit to krshrimali/pytorch that referenced this pull request May 19, 2021
…ed.cat (pytorch#54924)

Summary:
Pull Request resolved: pytorch#54924

Imported from OSS

Reviewed By: vkuzo

Differential Revision: D27416528

fbshipit-source-id: 896c280abec2903c29d597c655729666583ff0dd
3 participants