BUG: Make timsort deal with zero length elements. #12944

Merged: 1 commit, Feb 7, 2019

charris (Member) commented Feb 7, 2019

The elements of arrays of unicode, string, and void type may have zero
length and such types do not need sorting. This fixes a segfault in
timsort due to integer division by zero by checking the element size and
returning immediately when it is zero.

This is the first of three PRs dealing with timsort. The next will take care of the broken API compatibility, and the last will be a big coding style cleanup.
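
The zero-size guard described above can be sketched in isolation. This is a minimal illustration, not the code from the PR: `sort_fixed_width` is a hypothetical function, insertion sort stands in for timsort, and `memcmp` stands in for the dtype's compare function. The point is only the early return that keeps a later division by the element size from being reached.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch (not NumPy's API): sort `num` fixed-width byte
 * elements of `elsize` bytes each, in place. Returns 0 on success and
 * -1 if an element does not fit in the scratch buffer.
 */
int
sort_fixed_width(void *start, size_t num, size_t elsize)
{
    char *base = start;
    char tmp[64];       /* scratch space for one element */
    size_t capacity;    /* how many elements fit in tmp */
    size_t i, j;

    /*
     * Zero-length elements all compare equal, so there is nothing to
     * sort. Returning here also means the division below never sees
     * elsize == 0.
     */
    if (elsize == 0) {
        return 0;
    }

    /* Without the guard above, this division would fault. */
    capacity = sizeof(tmp) / elsize;
    if (capacity < 1) {
        return -1;  /* the sketch only handles small elements */
    }

    /* Plain insertion sort as a stand-in for the real algorithm. */
    for (i = 1; i < num; i++) {
        memcpy(tmp, base + i * elsize, elsize);
        j = i;
        while (j > 0 && memcmp(base + (j - 1) * elsize, tmp, elsize) > 0) {
            memcpy(base + j * elsize, base + (j - 1) * elsize, elsize);
            j--;
        }
        memcpy(base + j * elsize, tmp, elsize);
    }
    return 0;
}
```

Calling `sort_fixed_width(NULL, 3, 0)` in this sketch returns 0 without touching the buffer, which mirrors the "return immediately when it is zero" behavior described in the PR.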

seberg (Member) left a comment

LGTM, nitpick comment which probably should move to the later PRs you were planning in any case.

size_t len = PyArray_ITEMSIZE(varr);
PyArray_CompareFunc *cmp = PyArray_DESCR(varr)->f->compare;
PyArrayObject *arr = varr;
size_t len = PyArray_ITEMSIZE(arr);
seberg (Member):

Maybe name it elsize so that the definitions match with the above one? The check can maybe be elsize for both for similarity.

But you did not change it, so I guess that is probably already part of your "style cleanup".

charris (Member Author):

Yeah, I will probably change it later, but there are so many style nits that I want to clean up that they would just get in the way of a review, so I decided to make that a separate, lower priority, task.

charris (Member Author):

I may actually do two style cleanups, one just for code style, and another doing a refactor. An instance of the latter would be using npy_intp rather than size_t.

seberg (Member) commented Feb 7, 2019

Ok, just double checked to see that the other sorts handle this the same way. Style fixups are nice, but it isn't like this is very bad in any case.

Thanks Chuck!

@seberg seberg merged commit de1ca61 into numpy:master Feb 7, 2019
@charris charris deleted the fix-timsort-zero-length-elements branch February 7, 2019 18:39