[ONNX] Improve verify_onnx_program to use VerificationInterpreter #148706

Closed
wants to merge 3 commits into from

Conversation

justinchuby
Collaborator

@justinchuby justinchuby commented Mar 6, 2025

Stack from ghstack (oldest at bottom):

I realized we can just extend verify_onnx_program to return intermediate values. There is no need for us to expose the VerificationInterpreter to users.

I added a compare_intermediates option to verify_onnx_program.
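As a rough illustration of what an option like compare_intermediates changes, here is a minimal, self-contained sketch. It does not call torch.onnx: `run_reference`, `run_candidate`, `verify`, and `Info` are hypothetical stand-ins, and plain floats take the place of tensors. The real signature and return type of verify_onnx_program are not shown in this thread, so treat this only as a picture of the idea.

```python
from dataclasses import dataclass


@dataclass
class Info:
    """Comparison result for one named value (hypothetical, not a torch class)."""
    name: str
    max_abs_diff: float


def run_reference(x):
    # Reference pipeline: records every intermediate value by name.
    values = {}
    values["mul"] = x * 2.0
    values["add"] = values["mul"] + 1.0
    values["output"] = values["add"] ** 2
    return values


def run_candidate(x):
    # Candidate pipeline with a deliberate small error in the "add" step.
    values = {}
    values["mul"] = x * 2.0
    values["add"] = values["mul"] + 1.0 + 1e-3
    values["output"] = values["add"] ** 2
    return values


def verify(x, compare_intermediates=False):
    """Compare the final output, and optionally every intermediate value."""
    ref, cand = run_reference(x), run_candidate(x)
    names = list(ref) if compare_intermediates else ["output"]
    return [Info(name, abs(ref[name] - cand[name])) for name in names]


outputs_only = verify(1.0)
with_intermediates = verify(1.0, compare_intermediates=True)
```

With the flag off, only the final output is compared; with it on, the per-step results show that the mismatch first appears at the "add" step rather than at the output.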

pytorch-bot bot added the "release notes: onnx" label Mar 6, 2025

pytorch-bot bot commented Mar 6, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148706

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 59 Pending

As of commit 291183d with merge base 96176e3:

NEW FAILURE - The following job has failed:

  • pull / win-vs2022-cpu-py3 / build (gh)
    C:\actions-runner\_work\pytorch\pytorch\aten\src\ATen/native/quantized/cpu/OnednnUtils.h(445): error C2065: 'vnni_available': undeclared identifier

UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

onnx_outputs = onnx_program(*args, **kwargs)

# Flatten args for ONNX program and the VerificationInterpreter
flat_args, _ = exported_program._get_flat_args_with_check(args, kwargs)
Collaborator

Does this PR catch a bug?

Collaborator Author

@justinchuby justinchuby Mar 6, 2025

No, but I figured this is more robust when the kwargs are not in order.
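The ordering concern can be sketched with the standard library alone. This is an analogy, not the internals of `_get_flat_args_with_check`, which are not shown in this thread; `forward`, `flatten_naive`, and `flatten_by_signature` are hypothetical names. The sketch contrasts flattening kwargs in call order with flattening them in the order the callee declares.

```python
import inspect


def forward(x, scale=1.0, bias=0.0):
    # Stand-in for a model's forward; the parameter names are illustrative.
    return x * scale + bias


def flatten_naive(args, kwargs):
    # Fragile: relies on the caller passing kwargs in declaration order.
    return list(args) + list(kwargs.values())


def flatten_by_signature(fn, args, kwargs):
    # Robust: bind to the signature, so the order of kwargs does not matter.
    bound = inspect.signature(fn).bind(*args, **kwargs)
    bound.apply_defaults()
    return list(bound.arguments.values())


args, kwargs = (3.0,), {"bias": 10.0, "scale": 2.0}  # kwargs out of order
naive = flatten_naive(args, kwargs)
robust = flatten_by_signature(forward, args, kwargs)
```

Here `naive` puts `bias` before `scale`, so feeding it positionally to another runtime would silently swap the two inputs, while `robust` follows the declared parameter order.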

justinchuby added a commit to justinchuby/pytorch that referenced this pull request Mar 7, 2025
ghstack-source-id: b273d2d
Pull Request resolved: pytorch#148706

Signed-off-by: Justin Chu <[email protected]>
@justinchuby
Collaborator Author

@pytorchbot merge

pytorch-bot bot added the "ciflow/trunk" label Mar 7, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@justinchuby
Collaborator Author

@pytorchbot merge -i

@justinchuby
Collaborator Author

@pytorchbot merge -f "ONNX tests passed"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 1 checks: pull / win-vs2022-cpu-py3 / build

@pytorchmergebot
Collaborator

The merge job was canceled or timed out. This most often happens if two merge requests were issued for the same PR, or if the merge job was waiting for more than 6 hours for tests to finish. In the latter case, please do not hesitate to reissue the merge command.
For more information see pytorch-bot wiki.

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

pytorchmergebot pushed a commit that referenced this pull request Mar 7, 2025
Use a simple try/except to handle ONNX Runtime errors in the verification interpreter when they occur. For example, ORT will sometimes produce a list of None for some nodes. I am not sure how that happens yet.

Pull Request resolved: #148730
Approved by: https://github.com/titaiwangms
ghstack dependencies: #148706
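The error handling described in that commit message can be pictured with a small, self-contained sketch. It is not the code from #148730: `compare_node` and `verify_nodes` are hypothetical names, lists of floats stand in for tensors, and a `[None, None]` result mimics the runtime returning a list of None for a node.

```python
def compare_node(name, expected, actual):
    # Hypothetical comparison; raises TypeError if `actual` contains None.
    return max(abs(e - a) for e, a in zip(expected, actual))


def verify_nodes(nodes):
    """Compare each node, recording failures instead of aborting the run."""
    diffs, errors = {}, {}
    for name, expected, actual in nodes:
        try:
            diffs[name] = compare_node(name, expected, actual)
        except Exception as exc:  # broad on purpose: keep going past bad nodes
            errors[name] = f"{type(exc).__name__}: {exc}"
    return diffs, errors


nodes = [
    ("relu", [1.0, 2.0], [1.0, 2.5]),
    ("split", [1.0, 2.0], [None, None]),  # mimics a list of None from the runtime
    ("add", [3.0], [3.0]),
]
diffs, errors = verify_nodes(nodes)
```

Catching the error per node means one unusable result is reported alongside the others rather than ending verification early, at the cost of a broad `except` that could also hide unrelated bugs.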
@github-actions github-actions bot deleted the gh/justinchuby/113/head branch April 11, 2025 02:33
Labels
  • ciflow/trunk (Trigger trunk jobs on your pull request)
  • Merged
  • module: onnx (Related to torch.onnx)
  • open source
  • release notes: onnx (torch.onnx related changes that should show up in the release notes)
  • topic: not user facing (topic category)

4 participants