Update 7175, BUG: Invalid read of size 4 in PyArray_FromFile #7757

Merged
charris merged 2 commits into numpy:master from update-7175 on Jun 18, 2016

Conversation

charris
Member

@charris charris commented Jun 18, 2016

Update of #7175, closes #7756.

When the input dtype has a subarray, PyArray_NewFromDescr DECREFs the dtype
before dtype->elsize is accessed.

If no one else holds a reference to the dtype object, the dtype object is destroyed at that point,
and dtype->elsize must not be accessed afterwards.
Valgrind reports this as an invalid read, and it occasionally crashes innocent-looking code,
e.g. numpy.fromfile('filename', dtype=('f8', 3))

A workaround would be
dtype=numpy.dtype(('f8', 3)); numpy.fromfile('filename', dtype=dtype)
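
The lifetime problem, and the reason the workaround helps, can be illustrated with a toy reference-counting model in plain C. This is only a sketch: `ToyDescr`, `ToyArray`, `toy_new_from_descr` and `toy_descr_survives` are hypothetical stand-ins, not NumPy's types or API. The model assumes only what is described above: the constructor consumes the caller's reference and, for a subarray dtype, drops the original descriptor.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a dtype descriptor (not NumPy's PyArray_Descr). */
typedef struct ToyDescr {
    int refcnt;
    int elsize;
    int has_subarray;
} ToyDescr;

/* Hypothetical stand-in for an array that owns one descriptor reference. */
typedef struct ToyArray {
    ToyDescr *descr;
} ToyArray;

static int toy_live_descrs = 0; /* descriptors currently allocated */

static ToyDescr *toy_descr_new(int elsize, int has_subarray) {
    ToyDescr *d = malloc(sizeof *d);
    if (d == NULL) {
        abort(); /* keep the sketch simple: no recovery from OOM */
    }
    d->refcnt = 1;
    d->elsize = elsize;
    d->has_subarray = has_subarray;
    toy_live_descrs++;
    return d;
}

static void toy_decref(ToyDescr *d) {
    if (--d->refcnt == 0) {
        toy_live_descrs--;
        free(d);
    }
}

/* Models a constructor that steals the caller's reference to d. For a
 * subarray descriptor it builds a replacement and drops the original. */
static ToyArray *toy_new_from_descr(ToyDescr *d) {
    ToyArray *a = malloc(sizeof *a);
    if (a == NULL) {
        abort();
    }
    if (d->has_subarray) {
        a->descr = toy_descr_new(8, 0); /* toy base element, e.g. 'f8' */
        toy_decref(d); /* d is freed here if nobody else holds it */
    } else {
        a->descr = d; /* the array now owns the stolen reference */
    }
    return a;
}

/* Returns 1 if the original descriptor is still allocated after the call,
 * 0 if it was freed. extra_holder models the workaround, where a Python
 * variable keeps its own reference to the dtype. */
static int toy_descr_survives(int extra_holder) {
    ToyDescr *d = toy_descr_new(24, 1); /* like ('f8', 3): 24 bytes */
    if (extra_holder) {
        d->refcnt++; /* second owner: the caller's variable */
    }
    ToyArray *a = toy_new_from_descr(d);
    /* One live descriptor belongs to the array; any other is the original. */
    int survives = (toy_live_descrs == 2);
    if (extra_holder) {
        toy_decref(d); /* the caller's variable goes out of scope */
    }
    toy_decref(a->descr);
    free(a);
    return survives;
}
```

In this model `toy_descr_survives(0)` corresponds to passing the tuple directly (nothing else owns the dtype, so it is freed inside the constructor), while `toy_descr_survives(1)` corresponds to the workaround, where the extra owner keeps the descriptor valid for a later read of `elsize`.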

This affects versions as early as 1.9.2 (where I found this bug) and seems to be still relevant today.
I hope someone can prove me wrong.
This PR is just a demonstration of the idea. I didn't try to compile it.

Valgrind log:

==17479== Invalid read of size 4
==17479== at 0x1CD0BB88: UnknownInlinedFun (stdio2.h:295)
==17479== by 0x1CD0BB88: array_fromfile_binary (ctors.c:3177)
==17479== by 0x1CD0BB88: PyArray_FromFile (ctors.c:3304)
==17479== by 0x1CD886B5: array_fromfile (multiarraymodule.c:2040)
==17479== by 0x37AB6E2571: do_call (ceval.c:4327)
==17479== by 0x37AB6E2571: call_function (ceval.c:4135)
==17479== by 0x37AB6E2571: PyEval_EvalFrameEx (ceval.c:2755)
==17479== by 0x37AB6E2665: fast_function (ceval.c:4198)
==17479== by 0x37AB6E2665: call_function (ceval.c:4133)
==17479== by 0x37AB6E2665: PyEval_EvalFrameEx (ceval.c:2755)
==17479== by 0x37AB6E2665: fast_function (ceval.c:4198)
==17479== by 0x37AB6E2665: call_function (ceval.c:4133)
==17479== by 0x37AB6E2665: PyEval_EvalFrameEx (ceval.c:2755)
==17479== by 0x37AB664DCB: gen_send_ex.isra.0 (genobject.c:85)
==17479== by 0x37AB6DE419: PyEval_EvalFrameEx (ceval.c:2586)
==17479== by 0x37AB664DCB: gen_send_ex.isra.0 (genobject.c:85)
==17479== by 0x37AB6DE419: PyEval_EvalFrameEx (ceval.c:2586)
==17479== by 0x37AB6E2665: fast_function (ceval.c:4198)
==17479== by 0x37AB6E2665: call_function (ceval.c:4133)
==17479== by 0x37AB6E2665: PyEval_EvalFrameEx (ceval.c:2755)
==17479== by 0x37AB6E36B3: PyEval_EvalCodeEx (ceval.c:3344)
==17479== by 0x37AB6E25C5: fast_function (ceval.c:4208)
==17479== by 0x37AB6E25C5: call_function (ceval.c:4133)
==17479== by 0x37AB6E25C5: PyEval_EvalFrameEx (ceval.c:2755)
==17479== Address 0x113bef20 is 32 bytes inside a block of size 88 free'd
==17479== at 0x4A07D6A: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17479== by 0x1CD07C15: _update_descr_and_dimensions (ctors.c:273)
==17479== by 0x1CD07C15: PyArray_NewFromDescr_int (ctors.c:900)
==17479== by 0x1CD0BB76: PyArray_NewFromDescr (ctors.c:1121)
==17479== by 0x1CD0BB76: array_fromfile_binary (ctors.c:3168)
==17479== by 0x1CD0BB76: PyArray_FromFile (ctors.c:3304)
==17479== by 0x1CD886B5: array_fromfile (multiarraymodule.c:2040)
==17479== by 0x37AB6E2571: do_call (ceval.c:4327)
==17479== by 0x37AB6E2571: call_function (ceval.c:4135)
==17479== by 0x37AB6E2571: PyEval_EvalFrameEx (ceval.c:2755)
==17479== by 0x37AB6E2665: fast_function (ceval.c:4198)
==17479== by 0x37AB6E2665: call_function (ceval.c:4133)
==17479== by 0x37AB6E2665: PyEval_EvalFrameEx (ceval.c:2755)
==17479== by 0x37AB6E2665: fast_function (ceval.c:4198)
==17479== by 0x37AB6E2665: call_function (ceval.c:4133)
==17479== by 0x37AB6E2665: PyEval_EvalFrameEx (ceval.c:2755)
==17479== by 0x37AB664DCB: gen_send_ex.isra.0 (genobject.c:85)
==17479== by 0x37AB6DE419: PyEval_EvalFrameEx (ceval.c:2586)
==17479== by 0x37AB664DCB: gen_send_ex.isra.0 (genobject.c:85)
==17479== by 0x37AB6DE419: PyEval_EvalFrameEx (ceval.c:2586)
==17479== by 0x37AB6E2665: fast_function (ceval.c:4198)
==17479== by 0x37AB6E2665: call_function (ceval.c:4133)
==17479== by 0x37AB6E2665: PyEval_EvalFrameEx (ceval.c:2755)

When the input dtype has a subarray, the dtype is DECREFed by
PyArray_NewFromDescr, before dtype->elsize is accessed.

If no one else holds a reference to the dtype object, then the dtype
object will be destroyed, and dtype->elsize must not be accessed.  This
raises an error in Valgrind, and occasionally crashes innocent-looking
code, e.g. `numpy.fromfile('filename', dtype=('f8', 3))`

This affects versions as early as 1.9.2 (where I found this bug) and
seems to be still relevant today.

Closes numpy#7756.
- Do not use fail exit in array_fromfile_binary.
- Add comment explaining why we Py_INCREF dtype before call to
  PyArray_NewFromDescr in array_fromfile_binary and array_from_text.
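
The pattern described in these notes (take an extra reference before the call, read `elsize`, then release) might look roughly like the following in a toy model. This is a sketch under stated assumptions, not the merged code: `ToyDescr`, `toy_consume_descr` and `toy_read_elsize_fixed` are hypothetical names, and releasing the reference at each return instead of through a shared exit label is only one possible reading of the first bullet.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a dtype descriptor (not NumPy's PyArray_Descr). */
typedef struct ToyDescr {
    int refcnt;
    int elsize;
} ToyDescr;

static int toy_live_descrs = 0; /* descriptors currently allocated */

static ToyDescr *toy_descr_new(int elsize) {
    ToyDescr *d = malloc(sizeof *d);
    if (d == NULL) {
        abort(); /* keep the sketch simple: no recovery from OOM */
    }
    d->refcnt = 1;
    d->elsize = elsize;
    toy_live_descrs++;
    return d;
}

static void toy_incref(ToyDescr *d) {
    d->refcnt++;
}

static void toy_decref(ToyDescr *d) {
    if (--d->refcnt == 0) {
        toy_live_descrs--;
        free(d);
    }
}

/* Models a call that consumes the caller's reference to d even on success,
 * as described for the subarray case. Returns 0 on success. */
static int toy_consume_descr(ToyDescr *d) {
    toy_decref(d);
    return 0;
}

/* Fixed pattern: own an extra reference across the consuming call so the
 * descriptor is still valid when elsize is read. Returns elsize, or -1. */
static int toy_read_elsize_fixed(void) {
    ToyDescr *d = toy_descr_new(24); /* the only reference at this point */
    toy_incref(d);                   /* keep d alive across the call */
    if (toy_consume_descr(d) != 0) {
        toy_decref(d);               /* error path: release our reference */
        return -1;
    }
    int elsize = d->elsize;          /* safe: we still own one reference */
    toy_decref(d);                   /* success path: release it once */
    return elsize;
}
```

Without the `toy_incref` call, the read of `d->elsize` would touch freed memory, which is the kind of access the Valgrind log above reports.
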
@charris charris merged commit 5500019 into numpy:master Jun 18, 2016
@charris charris deleted the update-7175 branch June 18, 2016 13:33
@rainwoodman
Contributor

Thanks for closing this!
