CLN: Use cython algo for groupby var with ddof != 1 #48152
From a quick look, I didn't see a test in pandas/tests/groupby with ddof != 1. Could you confirm and add that test if there isn't one?
We've got the numba tests that check consistency between numba and cython, so I think we are good. They test ddof=0.
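For readers who want to see what such a consistency check could look like, here is a minimal sketch. It is not the test from the pandas suite: the helper name `compare_var_paths` and the sample frame are hypothetical, and it simply compares `groupby(...).var(ddof=...)` against a per-group `Series.var(ddof=...)` fallback.

```python
import numpy as np
import pandas as pd


def compare_var_paths(frame, ddof):
    """Return (fast, slow) grouped variances for the given ddof.

    Hypothetical helper for illustration, not part of pandas.
    """
    grouped = frame.groupby("key")["val"]
    # Built-in grouped reduction.
    fast = grouped.var(ddof=ddof)
    # Per-group fallback: call Series.var on each group separately.
    slow = grouped.apply(lambda s: s.var(ddof=ddof))
    return fast, slow


# Deterministic sample data: 5 groups with 40 rows each.
rng = np.random.default_rng(0)
df = pd.DataFrame({"key": np.arange(200) % 5, "val": rng.normal(size=200)})

for ddof in (0, 1, 2):
    fast, slow = compare_var_paths(df, ddof)
    # Both paths should agree up to floating-point tolerance.
    pd.testing.assert_series_equal(fast, slow, check_names=False)
```

Looping over several `ddof` values, rather than only the default of 1, is what covers the case raised in the review comment above.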
Ah yes, forgot there were cython comparison tests for numba_supported_reductions.
Nice! From what I can tell I added it to cython but ran into issues with var, reworked the implementation and forgot to add it back in.
Could use a line in the performance section of the whatsnew
pandas/core/groupby/groupby.py
alt=lambda x: Series(x).var(ddof=ddof),
numeric_only=numeric_only,
ignore_failures=numeric_only is lib.no_default,
**{"ddof": ddof},
Just do ddof=ddof?
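As a small illustration of why the suggestion is safe: in Python, unpacking a one-item dict with `**` and passing the same name as a plain keyword argument result in identical calls. The function `agg_general_stub` below is a hypothetical stand-in that only records its arguments; it is not the pandas internal being reviewed.

```python
def agg_general_stub(how, *, numeric_only=False, **kwargs):
    """Hypothetical stand-in that returns the keyword arguments it received."""
    return {"how": how, "numeric_only": numeric_only, **kwargs}


ddof = 0

# Dict-unpacking form, as in the snippet under review.
unpacked = agg_general_stub("var", numeric_only=True, **{"ddof": ddof})

# Plain keyword form, as suggested in the review.
plain = agg_general_stub("var", numeric_only=True, ddof=ddof)

# The callee cannot tell the two call styles apart.
assert unpacked == plain
```

The unpacking form is only needed when the keyword name is not a valid identifier or is computed at runtime, which is not the case for `ddof`.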
Good point, also added whatsnew
lgtm
Thanks @phofl
* CLN: Use cython algo for groupby var with ddof != 1
* Address review
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
cc @rhshadrach
I think you've implemented ddof on the cython level a while back. Any reason why var is taking a different path? Results look sensible and there are no failing tests.
Also, this gives a nice speedup.
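To make the role of `ddof` concrete, the following pure-Python sketch computes a per-group variance in a single pass, dividing the sum of squared deviations by `n - ddof`. It is an illustration only: `group_var` is a hypothetical name, the Welford-style update is a technique chosen for this example, and none of it is taken from the pandas Cython kernel. Returning NaN for groups with `n <= ddof` is likewise an assumption of the sketch.

```python
import math
import statistics
from collections import defaultdict


def group_var(labels, values, ddof=1):
    """Per-group variance with divisor (n - ddof), computed in one pass."""
    count = defaultdict(int)
    mean = defaultdict(float)
    m2 = defaultdict(float)  # running sum of squared deviations from the mean
    for lab, x in zip(labels, values):
        count[lab] += 1
        delta = x - mean[lab]
        mean[lab] += delta / count[lab]
        m2[lab] += delta * (x - mean[lab])
    # Assumption for this sketch: NaN when a group has too few observations.
    return {
        lab: m2[lab] / (count[lab] - ddof) if count[lab] > ddof else math.nan
        for lab in count
    }


labels = ["a", "a", "b", "b", "b"]
values = [1.0, 3.0, 2.0, 4.0, 6.0]

pop = group_var(labels, values, ddof=0)     # population variance
sample = group_var(labels, values, ddof=1)  # sample variance

# Cross-check against the standard library.
assert math.isclose(pop["a"], statistics.pvariance([1.0, 3.0]))
assert math.isclose(sample["b"], statistics.variance([2.0, 4.0, 6.0]))
```

Because `ddof` only changes the final divisor, the accumulation loop is identical for every value of `ddof`, so a compiled grouped kernel does not need a separate slow path for `ddof != 1`.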