BUG: Preserve identity of dtypes in make_mask_descr #8667

Merged: 2 commits, Mar 6, 2017

eric-wieser
Member

Partially addresses #8666

@eric-wieser force-pushed the reuse-ma-dtype branch 3 times, most recently from 53ed1c8 to 9defd18 (February 22, 2017 11:20)
Contributor

@mhvk mhvk left a comment

Looks good, with some nitpicks that really are a matter of taste, so feel free to ignore.

numpy/ma/core.py Outdated
@@ -1296,14 +1296,23 @@ def _recursive_make_descr(datatype, newtype=bool_):
                 # Prepend the title to the name
                 name = (field[-1], name)
             descr.append((name, _recursive_make_descr(field[0], newtype)))
-        return descr
+        newtype = np.dtype(descr)
Contributor

Maybe instead of calling this newtype (which is confusing as it is also a passed-in argument), just keep calling it descr?

numpy/ma/core.py Outdated
     # Is this some kind of composite a la (np.float,2)
     elif datatype.subdtype:
         mdescr = list(datatype.subdtype)
         mdescr[0] = _recursive_make_descr(datatype.subdtype[0], newtype)
-        return tuple(mdescr)
+        newtype = np.dtype(tuple(mdescr))
Contributor

idem

numpy/ma/core.py Outdated
else:
return newtype
Contributor

I think for this private function, we should be able to count on newtype being a dtype already, so I would leave the return newtype here. (I realise that means updating the two occurrences where "O" is passed in.)

Member Author

Every top-level call-site in the file fails to pass in a dtype right now (either MaskType, 'O', or bool, I think), so this is absolutely required. Sure, we could change all the callsites, I guess...
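
For readers following along, here is a minimal sketch (not code from this PR) of why the conversion matters: each kind of argument named above (a scalar type such as `MaskType`, the string `'O'`, and the builtin `bool`) is accepted by `np.dtype`, which returns a dtype instance. The variable names are illustrative assumptions.

```python
import numpy as np

# The three kinds of argument mentioned in the comment above.
# None of them is a dtype instance yet.
candidates = [np.ma.MaskType, 'O', bool]

# np.dtype() accepts each of them and returns a dtype object.
normalized = [np.dtype(c) for c in candidates]

for raw, dt in zip(candidates, normalized):
    print(repr(raw), '->', repr(dt), isinstance(dt, np.dtype))
```

Doing this conversion once at the top-level entry point means the recursive helper can assume it always receives dtype instances.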

Contributor

I think that would be better, but it is a very mild preference at best, so also fine to leave it. To make the flow of logic easy to follow, I would still think it is best not to reuse the newtype name, though (it really threw me off; I had to convince myself that the recursive calls could not pick that up).

Member Author

If we do that, then I'd be tempted to change MaskType from np.bool_ to np.dtype(np.bool_)

@mhvk
Contributor

mhvk commented Feb 22, 2017

My feeling would be to have this for 1.13 rather than 1.12.1. What do you think?

numpy/ma/core.py Outdated
newtype = np.dtype(newtype)

# preserve identity of dtypes
if newtype == datatype:
Contributor

With the above, the np.dtype calls can be here, i.e.,

descr = np.dtype(descr)
if descr == datatype:
    return datatype
else:
    return descr

@eric-wieser
Member Author

Yeah, I agree - no pressure for 1.12.1 here

@mhvk mhvk added this to the 1.13.0 release milestone Feb 22, 2017
@eric-wieser force-pushed the reuse-ma-dtype branch 2 times, most recently from ef5a1a3 to c6a2fe3 (February 22, 2017 22:44)
@eric-wieser
Member Author

@mhvk : Made the changes sort of like you requested

@@ -1285,25 +1285,48 @@ def __str__(self):
 ###############################################################################


-def _recursive_make_descr(datatype, newtype=bool_):
-    "Private function allowing recursion in make_descr."
+def _replace_dtype_fields_recursive(dtype, primitive_dtype):
Member Author

_make_descr was a pretty bad name

"""
dtype = np.dtype(dtype)
primitive_dtype = np.dtype(primitive_dtype)
return _replace_dtype_fields_recursive(dtype, primitive_dtype)
Member Author

Now do the dtype conversions just once, so we don't need to again

return new_dtype


def _replace_dtype_fields(dtype, primitive_dtype):
Member Author

Possible scope for public visibility? I can see "convert all fields to object/bool" being marginally desirable elsewhere

@@ -3834,7 +3854,7 @@ def __str__(self):
                 res = data.astype("O")
                 res.view(ndarray)[mask] = f
             else:
-                rdtype = _recursive_make_descr(self.dtype, "O")
+                rdtype = _replace_dtype_fields(self.dtype, "O")
Member Author

Implementation details of make_mask_descr shouldn't be used elsewhere!

"""
dtype = np.dtype(dtype)
primitive_dtype = np.dtype(primitive_dtype)
return _replace_dtype_fields_recursive(dtype, primitive_dtype)


def make_mask_descr(ndtype):
Member Author

Ideally I'd rename this to make_mask_dtype or mask_dtype_for, but that's for another PR.

@mhvk
Contributor

mhvk commented Feb 23, 2017

I like this very much! My only comment is that with this change it is as much MAINT as it is BUG...

@eric-wieser
Member Author

Arguably BUG should be ENH here anyway, since identity comparison of dtypes is not really promised anywhere

numpy/ma/core.py Outdated
             if len(field) == 3:
                 # Prepend the title to the name
                 name = (field[-1], name)
-            descr.append((name, _recursive_make_descr(field[0], newtype)))
-        return descr
+            descr.append((name, _replace_dtype_fields_recursive(field[0], primitive_dtype)))
Member

@ahaldane ahaldane Mar 3, 2017

line is too long (more below too)

@ahaldane
Member

ahaldane commented Mar 3, 2017

So if I understand, the main bug fix is the test for if new_dtype == dtype: new_dtype = dtype and the other changes are for style? That's fine, I just want to understand the motivation for the other changes.
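
For context, the identity-preserving step discussed here can be sketched as follows. This is an illustrative reconstruction assembled from the fragments quoted in this thread, not the merged code verbatim; the function name `replace_fields_sketch` is hypothetical, and both arguments are assumed to already be dtype instances.

```python
import numpy as np


def replace_fields_sketch(dtype, primitive_dtype):
    """Replace every leaf type in `dtype` with `primitive_dtype`.

    Hypothetical helper for illustration only. If the rebuilt dtype
    compares equal to the input, the input object itself is returned.
    """
    if dtype.names is not None:
        # Structured dtype: rebuild it field by field.
        descr = []
        for name in dtype.names:
            field = dtype.fields[name]
            if len(field) == 3:
                # Prepend the title to the name
                name = (field[-1], name)
            descr.append(
                (name, replace_fields_sketch(field[0], primitive_dtype)))
        new_dtype = np.dtype(descr)
    elif dtype.subdtype:
        # Subarray dtype such as (np.float32, 2): replace the base type.
        descr = list(dtype.subdtype)
        descr[0] = replace_fields_sketch(dtype.subdtype[0], primitive_dtype)
        new_dtype = np.dtype(tuple(descr))
    else:
        # Primitive type: direct replacement.
        new_dtype = primitive_dtype

    # Preserve identity of dtypes: reuse the input object when equal.
    if new_dtype == dtype:
        new_dtype = dtype
    return new_dtype


bool_dt = np.dtype(bool)
already_bool = np.dtype([('foo', bool), ('bar', bool)])
mixed = np.dtype([('foo', np.float32), ('bar', np.int64)])

print(replace_fields_sketch(already_bool, bool_dt) is already_bool)
print(replace_fields_sketch(mixed, bool_dt))
```

With the final equality check in place, a dtype that already consists only of the target primitive comes back as the same object rather than an equal copy.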

@eric-wieser
Member Author

@ahaldane: Yep, that's correct. I'll rewrite this into two commits to emphasize that

@eric-wieser force-pushed the reuse-ma-dtype branch 2 times, most recently from 8115a12 to e13cf08 (March 3, 2017 12:45)
Cleans up make_mask_descr, and adds a private _replace_dtype_fields, which
we need elsewhere to replace all dtypes with `object` instead of `bool`.

This new version also removes repeated calls to `np.dtype`, doing the conversion
only once.
@eric-wieser
Member Author

@ahaldane: Rebased, and long lines shortened. First commit is enough to make the tests pass

@@ -1334,13 +1359,10 @@ def make_mask_descr(ndtype):
     >>> ma.make_mask_descr(dtype)
     dtype([('foo', '|b1'), ('bar', '|b1')])
     >>> ma.make_mask_descr(np.float32)
-    <type 'numpy.bool_'>
+    dtype('bool')
Member Author

This was already the behaviour before the PR - the docs were just wrong
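
A quick way to check the documented behaviour, assuming a NumPy installation where `numpy.ma.make_mask_descr` is available (the variable names are illustrative):

```python
import numpy as np
import numpy.ma as ma

# A primitive input yields a boolean dtype instance, not the scalar type.
simple = ma.make_mask_descr(np.float32)
print(repr(simple), isinstance(simple, np.dtype))

# A structured input yields a structured dtype with boolean fields.
dtype = np.dtype({'names': ['foo', 'bar'],
                  'formats': [np.float32, np.int64]})
structured = ma.make_mask_descr(dtype)
print(repr(structured))
```

The exact `repr` text can differ between NumPy versions, so comparing against `np.dtype(bool)` is more robust than matching the printed string.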

@mhvk
Contributor

mhvk commented Mar 3, 2017

I'm now even happier with this, so @ahaldane, if you agree as well, I think this can just be merged.

@ahaldane
Member

ahaldane commented Mar 6, 2017

Looks good, I'll go ahead with the merge then. Thanks @eric-wieser !

@ahaldane ahaldane merged commit 66f1b8a into numpy:master Mar 6, 2017