BUG: Remove error-prone borrowed reference handling #13039
@mattip - nice! I slightly wonder about also initializing `new` and `metadata` to `NULL` and `XDECREF`ing them in the failure path, but it may be more work than it is worth. Indeed, probably best to just go with this self-contained commit.
Hmm, @eric-wieser clearly looked much more carefully at the reference counting chain. Sorry. (Wish this was caught in the tests!)
-    names = Borrowed_PyMapping_GetItemString(obj, "names");
-    descrs = Borrowed_PyMapping_GetItemString(obj, "formats");
+    names = PyMapping_GetItemString(obj, "names");
+    descrs = PyMapping_GetItemString(obj, "formats");
Pretty sure it's not safe to do both of these before checking whether either errored. Consider a pathological type like:

    class BadDict:
        def __getitem__(self, item):
            raise TypeError("Exception in flight here causes python to be unhappy")

    class Foo:
        __array_interface__ = BadDict()

Note that we need a type overloading `__getitem__` like this to make your patch worth it in the first place.
Fixing, even though in this case I am pretty sure the code will work correctly: both `names` and `descrs` will be `NULL`, so the following `if` will kick in.
The patch is to ensure we do not use an object with refcount 0, since it could be collected at any time.
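The refcount concern can be modelled from Python. This is a minimal sketch that assumes CPython's immediate reference counting. `Payload`, `FreshMapping`, `StoredMapping`, and `survives_after_release` are hypothetical names for illustration, not part of NumPy. A plain `dict` keeps its values alive, so a borrowed reference to one stays valid, whereas an object whose `__getitem__` builds a fresh value leaves the caller holding the only reference:

```python
import weakref


class Payload:
    """Plain object that supports weak references."""


class FreshMapping:
    """Hypothetical non-dict mapping that builds a new value on every lookup."""

    def __getitem__(self, key):
        return Payload()  # nothing else holds a reference to this object


class StoredMapping:
    """Hypothetical mapping that, like a dict, owns the values it returns."""

    def __init__(self):
        self._data = {"names": Payload()}

    def __getitem__(self, key):
        return self._data[key]


def survives_after_release(mapping):
    """Report whether the looked-up value is still alive once the caller lets go."""
    value = mapping["names"]
    probe = weakref.ref(value)
    del value  # the C-level analogue is dropping the reference right after the lookup
    return probe() is not None
```

On CPython, `survives_after_release(StoredMapping())` is true and `survives_after_release(FreshMapping())` is false. The second case is the one where treating the result as a borrowed reference would leave a pointer to a freed object.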
Looking more carefully, the code clears any exceptions before calling `_use_fields_dict`, so is there a problem with the test you propose?
I will add a pathological test that only errors on the `names` lookup.
For reference, the error code path is hit when a `dict` has no `names` or `formats` key, for instance when converting this valid spec:

    dt = np.dtype({'f0': ('i4', 0), 'f1': ('u1', 4)}, align=True)
I've seen problems before where `__getitem__("formats")` would raise a `SystemError` because an exception was set by `__getitem__("names")`. I think in this case you silence both anyway, but in principle you could hit an assert or worse.
If that is something we should be checking for, we should only silence a `KeyError`. Would you like that to be part of this PR?
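As a rough Python-level illustration of "only silence a `KeyError`": a missing key is treated as an absent field, while any other exception is allowed to propagate. The helper `lookup_or_none` and this `BadDict` variant are hypothetical, and the sketch does not show what the C code in this PR does.

```python
def lookup_or_none(mapping, key):
    """Return mapping[key], or None if the key is simply absent.

    Only KeyError is treated as "missing"; any other exception propagates.
    """
    try:
        return mapping[key]
    except KeyError:
        return None


class BadDict:
    """Hypothetical mapping whose lookups fail for a reason other than a missing key."""

    def __getitem__(self, item):
        raise TypeError("lookup failed for an unrelated reason")
```

With this shape, `lookup_or_none({}, "names")` returns `None`, while `lookup_or_none(BadDict(), "names")` raises the original `TypeError` instead of hiding it.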
Worrying about preserving exceptions is a fair point, but I think it's fine to leave for another PR.
Calling python code while `PyErr_Occurred() != NULL` is pretty dangerous though - grepping through the cpython source code for `assert(!PyErr_Occurred())` finds a lot of matches.
So to clarify - I'd like to see this changed to call one, check for null, then call the other. That's in line with the purpose of this PR, which is to make this work for non-`dict` objects with `__getitem__` (the only case when the refcount could drop to 0).
Catching `KeyError` specifically is a nice idea, but not one that really makes sense as part of this PR.
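The requested ordering (call one, check, then call the other) can be sketched in Python. Python cannot reproduce a pending C-level exception, so this only models the control flow: the second lookup never runs once the first has failed. `RecordingMapping` and `get_names_and_formats` are hypothetical names, and returning `None` stands in for the C-level `NULL` check.

```python
class RecordingMapping:
    """Hypothetical mapping that records each requested key and fails on 'names'."""

    def __init__(self):
        self.requested = []

    def __getitem__(self, key):
        self.requested.append(key)
        if key == "names":
            raise TypeError("names lookup failed")
        return ["i4"]


def get_names_and_formats(obj):
    """Look up 'names' first and bail out before touching 'formats' on failure."""
    try:
        names = obj["names"]
    except Exception:
        return None  # stands in for "result is NULL, stop here"
    try:
        formats = obj["formats"]
    except Exception:
        return None
    return names, formats
```

After `get_names_and_formats(m)` on a `RecordingMapping` instance `m`, `m.requested` contains only `"names"`, which shows that no user code ran after the first failure.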
got it, thanks for persevering. Fixed.
numpy/core/tests/test_dtype.py
Outdated
@@ -321,6 +320,11 @@ def test_fields_by_index(self):

         assert_equal(dt[1], dt[np.int8(1)])

+    def test_partial_dict(self):
+        # 'name' is missing
nit: `names` with an s
heh, fixing
Looks good now, thanks
OK, merging!
Fixes #9851. Previously we could have held on to a `PyObject*` with refcount 0.