BUG: Fix segfault in random.permutation(x) when x is a string. #14241
Fixes a segfault in numpy.random.permutation(x) where isinstance(x, str). Reference issue [#14238](https://github.com/numpy/numpy/issues/14238)
To be honest, I do not think I like to special-case strings like that. This also does not fix the underlying segmentation fault, since the same error will still happen, for example, for floating point input. The underlying issue is the cython
or the other way around (could probably also mark the whole function). Maybe the best option here specifically is to just raise a new
The old behavior is to raise an
Hmm, right, at least for the legacy API it's probably better to err on the safe side. Another option, which works for both, is to use
Thanks for the feedback. The point about floating point numbers is true. Would you consider a PR for an np.AxisError, or would you prefer an explicit cython bounds check? @seberg
Raising an error explicitly is good. I do not mind much either way, but I feel AxisError may be nicer than IndexError.
I do have some time. Will look at the bad Axis input too.
My claim would be that AxisError is only correct if you can trigger it via a sane call to normalize_axis_index
A check like `.normalize_axis_index(0, arr.ndim)` will trigger an AxisError! And I think it's sane enough since, intuitively, we expect all permutable inputs to be either ints or have at least 1 axis.
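The check discussed above can be sketched as follows. This is only an illustration, not the code from the PR: `first_axis_or_error` is a hypothetical helper, and the two import locations for `normalize_axis_index` are an assumption about which NumPy release is installed.

```python
import numpy as np

try:
    from numpy.lib.array_utils import normalize_axis_index  # NumPy >= 2.0
except ImportError:
    from numpy.core.multiarray import normalize_axis_index  # older NumPy


def first_axis_or_error(x):
    """Hypothetical helper: validate that x has an axis 0 to shuffle along."""
    arr = np.asarray(x)
    # For a 0-dimensional input (e.g. a str or a float) this raises
    # AxisError, which derives from both ValueError and IndexError.
    return normalize_axis_index(0, arr.ndim)


print(first_axis_or_error([1, 2, 3]))
```

A string or a float becomes a 0-dimensional array, so there is no axis 0 to normalize and the call raises instead of reaching the shuffle.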
@@ -3921,7 +3921,7 @@ cdef class Generator:
 Randomly permute a sequence, or return a permuted range.

 If `x` is a multi-dimensional array, it is only shuffled along its
-first index.
+first index. 
Please revert the whitespace additions
The spurious space is still there.
Still there. @maxwell-aladago, the space added at the end of this line should not be there.
b44d8a5 to 0627f48
0627f48 to e4091ef
fwiw I think this should raise an error: this would be consistent with np.random.choice("abcd"), which currently raises an error. In fact, this would even be consistent with np.asarray("abcd") returning a scalar array of dtype <U4 rather than a shape-(4,) array.
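A short sketch of the two behaviours this comment refers to, assuming a reasonably recent NumPy. The variable names are illustrative only.

```python
import numpy as np

arr = np.asarray("abcd")
# A Python str becomes a 0-dimensional array holding one 4-character
# unicode scalar, not an array of 4 single characters.
print(arr.ndim, arr.shape, arr.dtype.kind)

try:
    np.random.choice("abcd")
    choice_raised = False
except ValueError:
    # choice rejects a 0-dimensional input that is not an integer.
    choice_raised = True
print(choice_raised)
```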
Wouldn't a
Changed it to
The appropriate comparison is not
I think for the last case we could just stick with raising an error, but either will work. While we do generally allow 0D arrays instead of scalars, the function is overloaded in a way that I think it would be fair to assume it is unintentional when it happens. Plus, 0D arrays are exceedingly rare in any case.
My patch actually fixed the case for
I don't think we should use
So, you want me to roll it back to ValueError, @eric-wieser? ValueError looks right to me too, because it really isn't an indexing problem.
Ah, per the PR title, I thought we were talking about But it seems like this PR does not fix
Fixes both
@rkern wrote
What is the rationale for raising an
>>> rng.permutation("abc")
Traceback (most recent call last):
    ...
numpy.AxisError: x must be an integer or at least 1-dimensional
Not sure an error example helps too much here, but maybe if someone thinks they can shuffle strings or similar objects.
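For readers who want to see the error case in practice, here is a minimal sketch, assuming a NumPy release that includes this fix. `permutation_rejects` is a hypothetical helper. It catches both `IndexError` and `ValueError` because the exact exception type was still under discussion in this thread and differs between the two APIs.

```python
import numpy as np


def permutation_rejects(permute, x):
    """Hypothetical helper: True if permute(x) raises instead of permuting."""
    try:
        permute(x)
    except (IndexError, ValueError):
        # AxisError derives from both ValueError and IndexError, so this
        # also covers an AxisError raised by the Generator API.
        return True
    return False


rng = np.random.default_rng(0)     # new Generator API
legacy = np.random.RandomState(0)  # legacy API

for bad in ("abc", 3.5):
    print(repr(bad),
          permutation_rejects(rng.permutation, bad),
          permutation_rejects(legacy.permutation, bad))

# Integers and inputs with at least one axis are still accepted.
print(sorted(rng.permutation(4).tolist()))
print(sorted(rng.permutation([3, 1, 2]).tolist()))
```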
Broadly speaking, we are expecting to be indexing along the first of >=1 axes, in the usual case. The integer semantic branch complicates that, and the message should incorporate information about that, but I think that
The refguide check is failing.
Everyone good with this?
@maxwell-aladago Could you make another PR against 1.16.x that only makes the fixes/tests for RandomState?
…py#14241)
* fixing segfault error in np.random.permutation(x) where x is str
* removed whitespace
* changing error type to ValueError
* changing error type to ValueError
* changing error type to ValueError
* tests
* changed error to IndexError for backward compatibility with numpy 1.16
* fixes numpy.randomstate.permutation segfault too
* Rolled back to ValueError for Generator.permutation() for all 0-dimensional
* fixes refguide error and rolls back to AxisError
I don't think there's a segfault in 1.16. It was introduced in 1.17.
Ah, just checked and you are right.
Did anyone check for other similar errors, or should I go through it?
Reference #14238
I made the assumption that anyone who passes a string x to the function np.random.permutation(x) most likely intends to shuffle the string in question. If this isn't the desired behaviour, I can raise an error instead of the segfault.
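If shuffling the characters of a string is what the caller wants, it can be done explicitly by passing a sequence of characters. The following is a minimal sketch under that assumption. `shuffle_string` is a hypothetical helper and not part of NumPy.

```python
import numpy as np


def shuffle_string(s, rng):
    """Hypothetical helper: return the characters of s in random order."""
    # list(s) is a 1-D sequence of single characters, so permutation
    # has a first axis to shuffle along.
    return "".join(rng.permutation(list(s)))


rng = np.random.default_rng(0)
shuffled = shuffle_string("abcd", rng)
print(shuffled)
```

Making the conversion explicit keeps the intent visible at the call site, whichever error type the library raises for a bare string.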