Add shim.h C API to call dispatcher on our own aten ops #148832

Closed
janeyx99 wants to merge 13 commits

Contributor

@janeyx99 janeyx99 commented Mar 9, 2025

This PR still needs to be tested through a C++ extension.

Stack from ghstack (oldest at bottom):

pytorch-bot bot commented Mar 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/148832

Note: Links to docs will display an error until the docs builds have been completed.

⏳ 9 Pending, 2 Unrelated Failures

As of commit 9f133f4 with merge base c983e11 (image):

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@janeyx99 janeyx99 added the release notes: cpp release notes category label Mar 9, 2025
Contributor

github-actions bot commented Mar 9, 2025

Attention! One of PyTorch's C-stable API files was changed

You MUST NOT change existing function declarations in this file, as this header defines a stable C ABI. If you need to change the signature of a function, introduce a new v2 version of the function and modify code generation to target the new version.
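
As a rough illustration of the "introduce a v2" rule above, the sketch below keeps an already-shipped declaration frozen and adds the new signature under a new name. The names `demo_scale` and `demo_scale_v2` are hypothetical and are not part of shim.h; this only shows the general pattern, not how PyTorch's code generation is wired up.

```cpp
#include <cassert>
#include <cstdint>

extern "C" {
// v1 (hypothetical name): already shipped, so its declaration stays frozen.
int32_t demo_scale(int32_t x);
// v2 (hypothetical name): new entry point that carries the extra parameter.
int32_t demo_scale_v2(int32_t x, int32_t factor);
}

// The new behavior lives in v2.
extern "C" int32_t demo_scale_v2(int32_t x, int32_t factor) {
  return x * factor;
}

// v1 keeps its old behavior by forwarding with the factor it always used.
extern "C" int32_t demo_scale(int32_t x) {
  return demo_scale_v2(x, 2);
}

// Usage sketch: callers built against v1 observe unchanged results.
inline void check_compat() {
  assert(demo_scale(3) == demo_scale_v2(3, 2));
}
```

Binaries compiled against the old symbol keep working because the v1 symbol and signature never change; only newly generated code targets `demo_scale_v2`.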


Caused by:

@janeyx99 janeyx99 changed the title Add shim.h C API to call dispatcher on any op Add shim.h C API to call dispatcher on aten op Mar 9, 2025
@janeyx99 janeyx99 changed the title Add shim.h C API to call dispatcher on aten op Add shim.h C API to call dispatcher on our own aten ops Mar 9, 2025
janeyx99 added a commit that referenced this pull request Mar 9, 2025
ghstack-source-id: ce093fe
Pull Request resolved: #148832
janeyx99 added a commit that referenced this pull request Mar 10, 2025
ghstack-source-id: e670878
Pull Request resolved: #148832
self.assertEqual(cpu_t, torch.ones_like(t, device="cpu"))

def _make_cuda_tensors(prior_mem):
    cuda_t = libtorch_agnostic.ops.my_ones_like(t, "cuda")
Contributor Author

Note: passing in the device object here does not work.

Suggested change
- cuda_t = libtorch_agnostic.ops.my_ones_like(t, "cuda")
+ cuda_t = libtorch_agnostic.ops.my_ones_like(t, device)

Collaborator

Why not? Is Device not a supported type for the stable dispatcher registration?

janeyx99 added a commit that referenced this pull request Mar 11, 2025
ghstack-source-id: 8af5038
Pull Request resolved: #148832
@janeyx99 janeyx99 requested review from albanD and zou3519 March 11, 2025 02:40
@janeyx99 janeyx99 requested a review from desertfire March 11, 2025 02:40

auto inner_type = arg_type->castRaw<at::OptionalType>()->getElementType();

// our contract is that IValue None = StableIValue nullptr
if (to<std::nullptr_t>(stable_ivalue) == nullptr) {
Collaborator

I don't think that's OK: you are definitely going to confuse a value of 0 with an empty optional here.
I think the simplest fix would be to do some tagging here: have a special value that means "nullopt" and forbid the content of the optional from being that value (with a runtime check).
My guess is that you want something that is unlikely to be used as an int64, a pointer, etc.
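
The tagging idea in this comment could look roughly like the sketch below. It is not the PR's code: `Slot`, `kNulloptTag`, `encode_optional`, and `decode_optional` are hypothetical names, the 64-bit slot is only a stand-in for a stable value, and the sentinel constant is an arbitrary choice made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for a 64-bit stable value slot (not PyTorch's type).
using Slot = uint64_t;

// Hypothetical sentinel meaning "nullopt"; the bit pattern is arbitrary.
constexpr Slot kNulloptTag = 0xFFFFDEADBEEFFFFFULL;

// Encode an optional int64. The runtime check forbids a payload whose bits
// equal the sentinel, so the two cases can never be confused.
inline Slot encode_optional(std::optional<int64_t> value) {
  if (!value.has_value()) {
    return kNulloptTag;
  }
  const Slot raw = static_cast<Slot>(*value);
  if (raw == kNulloptTag) {
    throw std::invalid_argument("payload collides with the nullopt tag");
  }
  return raw;
}

// Decode: only the sentinel maps to an empty optional, so 0 stays a value.
inline std::optional<int64_t> decode_optional(Slot slot) {
  if (slot == kNulloptTag) {
    return std::nullopt;
  }
  return static_cast<int64_t>(slot);
}

// Usage sketch: 0 remains distinguishable from an empty optional.
inline void self_check() {
  assert(decode_optional(encode_optional(int64_t{0})).has_value());
  assert(!decode_optional(encode_optional(std::nullopt)).has_value());
}
```

The trade-off is that one 64-bit pattern becomes unrepresentable as a payload, which is why the runtime check rejects it instead of silently decoding it as empty.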

Contributor Author

To respect the branch cut, I dropped support for optional for now.

janeyx99 added a commit that referenced this pull request Mar 11, 2025
ghstack-source-id: 53a598a
Pull Request resolved: #148832
@janeyx99 janeyx99 added the ciflow/trunk Trigger trunk jobs on your pull request label Mar 11, 2025
Collaborator

@albanD albanD left a comment

SGTM

// calls the op overload defined by a given opName, overloadName, and a
// stack of StableIValues. This call will populate any return values of the
// op into the stack in their StableIValue form, with ret0 at index 0, ret1
// at index 1, and so on.
Collaborator

todo after this: we should document our StableIValue stack definition and expose to/from helpers for it.
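
Until that documentation exists, the stack convention described in the code comment above can be sketched with a mock. Everything here is hypothetical: `StableSlot` is a plain 64-bit stand-in rather than the real StableIValue, `mock_call` is not the shim.h entry point, and `demo::divmod` is an invented op. The sketch assumes the caller sizes the stack to hold whichever is larger, the arguments or the returns, since returns are written over the same buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 64-bit stand-in for a stable value slot.
using StableSlot = uint64_t;

// Mock of a call for the invented op "demo::divmod(int a, int b) -> (int, int)".
// Arguments are read from stack[0] and stack[1]; the returns are then written
// back into the same stack, ret0 at index 0 and ret1 at index 1.
// Returns 0 on success and 1 on failure (an assumed error convention).
inline int mock_call(const char* op_name,
                     const char* overload_name,
                     StableSlot* stack) {
  assert(stack != nullptr);
  if (std::strcmp(op_name, "demo::divmod") != 0 ||
      std::strcmp(overload_name, "") != 0) {
    return 1;  // unknown op or overload
  }
  const int64_t a = static_cast<int64_t>(stack[0]);
  const int64_t b = static_cast<int64_t>(stack[1]);
  if (b == 0) {
    return 1;  // refuse to divide by zero
  }
  stack[0] = static_cast<StableSlot>(a / b);  // ret0 at index 0
  stack[1] = static_cast<StableSlot>(a % b);  // ret1 at index 1
  return 0;
}
```

A caller would fill the stack with arguments, invoke the call, and then read the results from the front of the same array, which is why the inputs are no longer available afterwards.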

@janeyx99
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

@pytorchmergebot
Collaborator

Merge failed

Reason: 1 jobs have failed, first few of them are: inductor / cuda12.6-py3.10-gcc9-sm86 / test (inductor_timm, 1, 2, linux.g5.4xlarge.nvidia.gpu)

Details for Dev Infra team Raised by workflow job

@janeyx99
Contributor Author

@pytorchbot merge -f "Failures are preexisting on trunk"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

@github-actions github-actions bot deleted the gh/janeyx99/227/head branch April 12, 2025 02:13
Labels
ciflow/inductor, ciflow/trunk (Trigger trunk jobs on your pull request), Merged, release notes: cpp (release notes category)
4 participants