Skip to content

Optimize nth and nth_back for BoundListIterator #4810

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

Owen-CH-Leung
Copy link
Contributor

@Owen-CH-Leung Owen-CH-Leung commented Dec 21, 2024

See #4787

This PR optimizes nth and nth_back for BoundListIterator, and added unittest & benchmarks for these 2 APIs. Here are the benchmark of the optimized nth and nth_back:

On Stable toolchain

With Optimization

list_nth                time:   [351.21 ns 351.71 ns 352.28 ns]
list_nth_back           time:   [355.34 ns 356.09 ns 356.95 ns]

The default nth and nth_back implementation

list_nth                time:   [2.8422 µs 2.8502 µs 2.8592 µs]
list_nth_back           time:   [3.6509 µs 3.6557 µs 3.6611 µs]

On nightly toolchain

With Optimization

list_nth                time:   [293.11 ns 293.59 ns 294.15 ns]
list_nth_back           time:   [301.09 ns 301.51 ns 302.00 ns]

The default nth and nth_back implementation

list_nth                time:   [2.8488 µs 2.8525 µs 2.8568 µs]
list_nth_back           time:   [3.6524 µs 3.6600 µs 3.6683 µs]

@davidhewitt
Copy link
Member

Awesome, thanks! Will seek to review soon. I guess we can do the same thing for tuple iteration?

@Owen-CH-Leung Owen-CH-Leung marked this pull request as ready for review December 21, 2024 15:47
@Owen-CH-Leung
Copy link
Contributor Author

Owen-CH-Leung commented Dec 21, 2024

Awesome, thanks! Will seek to review soon. I guess we can do the same thing for tuple iteration?

Yup - I'll file a PR to optimise tuple iterator after the holiday =)

let length = self.length.min(self.list.len());
let target_index = self.index + n;
if self.index + n < length {
let item = unsafe { self.get_item(target_index) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, is there a time-of-check to time-of-use bug here on the length? Not one for this PR, but a follow up I will try not to forget...

Copy link
Contributor Author

@Owen-CH-Leung Owen-CH-Leung Dec 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup I think that's a potential TOCTOU bug here. This particular implementation assumes the user ensures proper synchronization if they intend to use the iterator in a multi-threaded or mutable environment.

Copy link
Contributor Author

@Owen-CH-Leung Owen-CH-Leung Dec 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current implementation of next also has a potential TOCTOU bug:

https://github.com/PyO3/pyo3/blob/main/src/types/list.rs#L495-L498

@ngoldbaum
Copy link
Contributor

Thanks for working on this! Hopefully my PR should be merged in the next few days and then you can rebase this.

@ngoldbaum
Copy link
Contributor

It might also make sense to implement advance_by on nightly. nth is implemented in terms of advance_by, so this improvement would fall out "for free" on nightly if you implement that function.

Copy link
Contributor

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments, I think there are some issues in how this is set up for the limited API. I also think we should define nth and nth_back for all non-nightly builds. I also don't think it's necessary to call get_item in advance_by, per my reading of the advance_by docs.

@Owen-CH-Leung Owen-CH-Leung marked this pull request as draft January 10, 2025 08:22
@Owen-CH-Leung Owen-CH-Leung marked this pull request as ready for review January 10, 2025 13:35
@Owen-CH-Leung
Copy link
Contributor Author

Owen-CH-Leung commented Jan 10, 2025

@ngoldbaum Can I seek your review again ? I've adopted your suggested changes above

the CI clippy/x86_64-unknown-linux-gnu/beta is failing as main btw

@ngoldbaum
Copy link
Contributor

the CI clippy/x86_64-unknown-linux-gnu/beta is failing as main btw

GitHub's interface doesn't help by putting a big red x next to it, but this is fine and we can merge PRs as long as the "conclusion" job finishes. Right now it's broken because of an upstream bug in clippy, so it will probably stay that way until the beta channel gets the fix.

I'll try to give this PR another once-over today but failing that will get to it next week.

@@ -750,6 +849,32 @@ impl<'py> Iterator for BoundListIterator<'py> {
None
})
}

#[inline]
#[cfg(all(not(Py_LIMITED_API), feature = "nightly"))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[cfg(all(not(Py_LIMITED_API), feature = "nightly"))]
#[cfg(feature = "nightly")]

Unless I'm missing something...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we need the flag not(Py_LIMITED_API. Otherwise we will compile the code without access to with_critical_section

https://github.com/PyO3/pyo3/actions/runs/12769998628/job/35593851381?pr=4810

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can delete the #[cfg(not(Py_LIMITED_API))] on BoundListIterator::with_critical_section, it was only there to avoid a clippy lint about unused code. pyo3::sync::with_critical_section is unconditionally exposed in the PyO3 API, it's just a no-op on GIL-enabled builds. We wanted to allow people to write code that avoids conditional compilation, where possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I deleted #[cfg(not(Py_LIMITED_API))] but was hitting the unused code lint warning, so I put in a allow lint dead_code. Lemme know your thoughts.

I also pushed a few commits and implemented advance_back_by.

@@ -547,6 +547,46 @@ impl<'py> BoundListIterator<'py> {
}
}

#[inline]
#[cfg(all(not(Py_LIMITED_API), feature = "nightly"))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - shouldn't this only be defined on not(feature = "nightly")? Because on nightly this is defined in terms of advance_by in the standard library implementation, so we don't need to override nth. It'll just do the right thing on nightly by calling advance_by and then next(). On not nightly you need this override because there's no way to override advance_by.

Ditto for all the other BoundListIterator methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed - I spent hours reading the stdlib again to figure out that on stable toolchain, we should override nth, and override advance_by on nightly. Thanks for spotting this. I've made the changes to compile it on stable


#[inline]
#[cfg(all(Py_LIMITED_API, feature = "nightly"))]
#[deny(unsafe_op_in_unsafe_fn)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[deny(unsafe_op_in_unsafe_fn)]

there's no unsafe code in this function so this is unnecessary

) -> Option<Bound<'py, PyAny>> {
let length = length.0.min(list.len());
let target_index = index.0 + n;
if index.0 + n < length {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if index.0 + n < length {
if target_index < length {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks & revised

) -> Option<Bound<'py, PyAny>> {
let length = length.0.min(list.len());
let target_index = index.0 + n;
if index.0 + n < length {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if index.0 + n < length {
if target_index < length {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks & revised

@@ -625,6 +704,26 @@ impl<'py> Iterator for BoundListIterator<'py> {
}
}

#[inline]
#[cfg(feature = "nightly")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[cfg(feature = "nightly")]
#[cfg(not(feature = "nightly"))]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks & revised

Comment on lines 710 to 724
#[cfg(not(Py_LIMITED_API))]
{
self.with_critical_section(|index, length, list| unsafe {
Self::nth_unchecked(index, length, list, n)
})
}
#[cfg(Py_LIMITED_API)]
{
let Self {
index,
length,
list,
} = self;
Self::nth(index, length, list, n)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[cfg(not(Py_LIMITED_API))]
{
self.with_critical_section(|index, length, list| unsafe {
Self::nth_unchecked(index, length, list, n)
})
}
#[cfg(Py_LIMITED_API)]
{
let Self {
index,
length,
list,
} = self;
Self::nth(index, length, list, n)
}
self.with_critical_section(|index, length, list| unsafe {
Self::nth(index, length, list, n)
})

If you implement my suggestion to only have BoundListIterator::nth then you can simplify this a lot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and you can do a similar refactor for the nth_back implementation for the impl DoubleEndedIterator for BoundListIterator block below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've adopted this idea for nth and nth_back

) -> Option<Bound<'py, PyAny>> {
let length_size = length.0.min(list.len());
if index.0 + n < length_size {
let target_index = length_size - n - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally this logic feels a little backwards to me, and I would probably prefer to do this using signed integers and testing if the index is less than zero. That said, this is totally equivalent and if it makes sense to you as-is then no need to change it.

@Owen-CH-Leung
Copy link
Contributor Author

Owen-CH-Leung commented Jan 14, 2025

Just realize that I forgot to implement advance_back_by. I'll implement it in coming days

Copy link
Contributor

@ngoldbaum ngoldbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, much simpler now!

I'm not going to merge immediately because I'd like someone with a little bit more experience maintaining PyO3 to give this a once-over.

@@ -0,0 +1 @@
Optimizes `nth` and `nth_back` for `BoundListIterator`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe also mention that advance_by and advance_by_back are implemented on nightly

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks great to me! Sorry for the slow review cycles 😬

@davidhewitt davidhewitt added this pull request to the merge queue Feb 7, 2025
Merged via the queue into PyO3:main with commit 903afcd Feb 7, 2025
45 of 46 checks passed
clin1234 pushed a commit to clin1234/pyo3 that referenced this pull request Feb 10, 2025
* Optimize nth and nth_back for BoundListIterator. Add unit test and benchmarks

* Fix fmt and newsfragment CI

* Fix clippy and changelog CI

* Revise Impl of nth and nth_back. Impl advance_by

* Fix failing fmt

* Fix failing ruff test

* branch out nth, nth_unchecked, nth_back, nth_back_unchecked.

* Fix fmt

* Revise advance_by impl. add advance_by unittest.

* Fix fmt

* Fix clippy unused function warning

* Set appropriate Py_LIMITED_API flag

* Rewrite nth & nth_back using conditional compilation. Rearrange flags for proper compilation

* fix fmt

* fix failing CI

* Impl advance_back_by. Remove cfg flag for with_critical_section

* refactor advance_by and advance_back_by. Add back cfg for with_critical_section

* Put allow deadcode for with_critical_section

* Remove use of get_item. Revise changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants