BUG: Preserve identity of dtypes in make_mask_descr #8667
53ed1c8 to 9defd18
Looks good, with some nitpicks that really are a matter of taste, so feel free to ignore.
numpy/ma/core.py
Outdated
@@ -1296,14 +1296,23 @@ def _recursive_make_descr(datatype, newtype=bool_):
# Prepend the title to the name
name = (field[-1], name)
descr.append((name, _recursive_make_descr(field[0], newtype)))
return descr
newtype = np.dtype(descr)
Maybe instead of calling this `newtype` (which is confusing as it is also a passed-in argument), just keep calling it `descr`?
numpy/ma/core.py
Outdated
# Is this some kind of composite a la (np.float,2)
elif datatype.subdtype:
mdescr = list(datatype.subdtype)
mdescr[0] = _recursive_make_descr(datatype.subdtype[0], newtype)
return tuple(mdescr)
newtype = np.dtype(tuple(mdescr))
idem
numpy/ma/core.py
Outdated
else:
return newtype
I think for this private function, we should be able to count on `newtype` being a `dtype` already, so I would leave the `return newtype` here. (I realise that means updating the two occurrences where `"O"` is passed in.)
Every top-level call-site in the file fails to pass in a `dtype` right now (either `MaskType`, `'O'`, or `bool`, I think), so this is absolutely required. Sure, we could change all the callsites, I guess...
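For reference, none of the call-site values mentioned here is an `np.dtype` instance until it is normalized. A minimal sketch, assuming `MaskType` is `np.bool_` (as a later comment in this thread states); the variable names are illustrative only:

```python
import numpy as np

# Values the call sites reportedly pass in; MaskType is assumed to be np.bool_.
call_site_values = [np.bool_, "O", bool]

# np.dtype() accepts scalar types, type strings, and Python types alike,
# and returns its argument unchanged when it is already a dtype.
normalized = [np.dtype(value) for value in call_site_values]
```

Normalizing once at the public entry point is what lets a private recursive helper assume it always receives a real dtype.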
I think that would be better, but it is a very mild preference at best, so also fine to leave it. For easy following of the flow of logic, I would still think it is best to not reuse the `newtype` name, though (it really threw me off, to convince me that the recursive calls could not pick that up).
If we do that, then I'd be tempted to change `MaskType` from `np.bool_` to `np.dtype(np.bool_)`
My feeling would be to have this for 1.13 rather than 1.12.1. What do you think?
numpy/ma/core.py
Outdated
newtype = np.dtype(newtype)

# preserve identity of dtypes
if newtype == datatype:
With the above, the `np.dtype` calls can be here, i.e.,
    descr = np.dtype(descr)
    if descr == datatype:
        return datatype
    else:
        return descr
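To illustrate why the equality check matters: rebuilding a structured dtype from its fields yields an object that compares equal to the original but is not the same object. A small sketch; `keep_identity` is a hypothetical helper for illustration, not a NumPy function:

```python
import numpy as np

original = np.dtype([("a", bool), ("b", bool)])

# Rebuild the dtype field by field, much as the recursive function does.
rebuilt = np.dtype([(name, original.fields[name][0]) for name in original.names])


def keep_identity(new, old):
    """Return `old` itself when `new` compares equal to it (hypothetical helper)."""
    return old if new == old else new


result = keep_identity(rebuilt, original)
```

Here `rebuilt == original` holds while `rebuilt is original` does not, and `keep_identity` hands back `original` itself.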
Yeah, I agree - no pressure for 12.1 here
ef5a1a3 to c6a2fe3
@mhvk: Made the changes sort of like you requested
@@ -1285,25 +1285,48 @@ def __str__(self):
###############################################################################


def _recursive_make_descr(datatype, newtype=bool_):
"Private function allowing recursion in make_descr."
def _replace_dtype_fields_recursive(dtype, primitive_dtype):
`_make_descr` was a pretty bad name
"""
dtype = np.dtype(dtype)
primitive_dtype = np.dtype(primitive_dtype)
return _replace_dtype_fields_recursive(dtype, primitive_dtype)
We now do the dtype conversions just once, so we don't need to do them again
return new_dtype


def _replace_dtype_fields(dtype, primitive_dtype):
Possible scope for public visibility? I can see "convert all fields to object/bool" being marginally desirable elsewhere
@@ -3834,7 +3854,7 @@ def __str__(self):
res = data.astype("O")
res.view(ndarray)[mask] = f
else:
rdtype = _recursive_make_descr(self.dtype, "O")
rdtype = _replace_dtype_fields(self.dtype, "O")
Implementation details of `make_mask_descr` shouldn't be used elsewhere!
"""
dtype = np.dtype(dtype)
primitive_dtype = np.dtype(primitive_dtype)
return _replace_dtype_fields_recursive(dtype, primitive_dtype)


def make_mask_descr(ndtype):
Ideally I'd rename this to `make_mask_dtype` or `mask_dtype_for`, but that's for another PR.
I like this very much! My only comment is that with this change it is as much
Arguably
numpy/ma/core.py
Outdated
if len(field) == 3:
# Prepend the title to the name
name = (field[-1], name)
descr.append((name, _recursive_make_descr(field[0], newtype)))
return descr
descr.append((name, _replace_dtype_fields_recursive(field[0], primitive_dtype)))
line is too long (more below too)
So if I understand, the main bug fix is the test for
@ahaldane: Yep, that's correct. I'll rewrite this into two commits to emphasize that
8115a12 to e13cf08
Partially addresses numpy#8666
Cleans up make_mask_descr, and adds a private _replace_dtype_fields, which we need elsewhere to replace all dtypes with `object` instead of `bool`. This new version also removes repeated calls to `np.dtype`, doing the conversion only once.
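The approach described in this commit message can be sketched roughly as follows. This is an illustration pieced together from the diff fragments in this thread, not the merged code; the names `replace_fields_sketch` and `_recurse` are hypothetical:

```python
import numpy as np


def replace_fields_sketch(dtype, primitive_dtype):
    """Replace every primitive field of `dtype` with `primitive_dtype`."""
    # Convert once at the entry point; the recursion then only sees dtypes.
    return _recurse(np.dtype(dtype), np.dtype(primitive_dtype))


def _recurse(dtype, primitive_dtype):
    if dtype.names is not None:
        # Structured dtype: rebuild it field by field.
        descr = []
        for name in dtype.names:
            field = dtype.fields[name]
            if len(field) == 3:
                # Prepend the title to the name
                name = (field[-1], name)
            descr.append((name, _recurse(field[0], primitive_dtype)))
        new_dtype = np.dtype(descr)
    elif dtype.subdtype:
        # Subarray dtype such as (float, 2): replace the base, keep the shape.
        base, shape = dtype.subdtype
        new_dtype = np.dtype((_recurse(base, primitive_dtype), shape))
    else:
        # Primitive type: direct replacement.
        new_dtype = primitive_dtype

    # Preserve identity: hand back the input object when nothing changed.
    return dtype if new_dtype == dtype else new_dtype
```

For example, `replace_fields_sketch(np.dtype([("a", bool)]), bool)` returns the very object it was given, while a dtype with float fields yields a new dtype with the same field names and `bool` (or `"O"`) fields.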
e13cf08 to b1a7057
@ahaldane: Rebased, and long lines shortened. First commit is enough to make the tests pass
@@ -1334,13 +1359,10 @@ def make_mask_descr(ndtype):
>>> ma.make_mask_descr(dtype)
dtype([('foo', '|b1'), ('bar', '|b1')])
>>> ma.make_mask_descr(np.float32)
<type 'numpy.bool_'>
dtype('bool')
This was already the behaviour before the PR - the docs were just wrong
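A quick way to check the documented behaviour is sketched below. The field names mirror the docstring example; the formats `np.float32` and `np.int64` are an assumption made for illustration:

```python
import numpy as np
import numpy.ma as ma

# Structured dtype with two fields (formats assumed for illustration).
dtype = np.dtype({"names": ["foo", "bar"], "formats": [np.float32, np.int64]})

# Every field of the mask dtype becomes boolean.
mask_dtype = ma.make_mask_descr(dtype)

# For a plain scalar type, the result is a dtype instance, not the scalar type.
scalar_mask = ma.make_mask_descr(np.float32)
```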
I'm now even happier with this, so @ahaldane, if you agree as well, I think this can just be merged.
Looks good, I'll go ahead with the merge then. Thanks @eric-wieser!
Partially addresses #8666