MAINT: io: migration to use sparray in IO #21905
Thanks @dschult! Before looking at the details, my main comment here is about the migration strategy, which doesn't look ideal. The current approach shows every user of these functions a deprecation warning, and then to avoid it they have a choice between:
1. Explicitly add `sparray=False`, or
2. Write an `if scipy_version>=1.15: sparray=True else: leave-out-sparray-kw`
(2) is quite awkward and in addition it doesn't get you the same behavior for your code for older and newer SciPy. Hence most users probably need to go with (1).
It seems much cleaner to just add an `sparray=False` keyword, which lets users opt into returning arrays and otherwise it gets you the same outcome as (1) but without all users needing to change their code. And then when the matrix classes themselves get deprecated, those users will see a deprecation warning. That's a few releases away, so the if-else dance is then no longer needed.
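To make the difference between the two strategies concrete, here is a minimal sketch. Both loaders are hypothetical stand-ins (they return a string naming the container type) and are not the real `scipy.io` functions; the warning text is also made up for illustration.

```python
import warnings


def load_with_deprecated_default(sparray=None):
    """Hypothetical loader: warns unless the caller states a preference."""
    if sparray is None:
        warnings.warn(
            "the default return type will change; pass sparray explicitly",
            DeprecationWarning,
            stacklevel=2,
        )
        sparray = False
    return "sparray" if sparray else "spmatrix"


def load_with_opt_in(sparray=False):
    """Hypothetical loader: silent default, arrays only on request."""
    return "sparray" if sparray else "spmatrix"


def call_and_count_warnings(func, **kwargs):
    """Call func and return (result, number of warnings it emitted)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = func(**kwargs)
    return result, len(caught)


# Existing code that passes no keyword: same result, but only one variant warns.
old_style = call_and_count_warnings(load_with_deprecated_default)
new_style = call_and_count_warnings(load_with_opt_in)
```

With the opt-in keyword, unchanged user code keeps getting the matrix container without a warning, while `sparray=True` selects arrays in both variants.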
Done. I've set the new keyword to be |
Perhaps a better name for the keyword is |
Sure, that sounds fine to me. I suspect it comes down to the same thing - one has to start using the keyword at some point to avoid a deprecation warning.
I've switched the kwarg name from The lint errors are UP031 due to the new ruff version (format strings). And the changes are subtle in the |
Yeah it'd be nice to turn those off ....
e235a49 to 962506d
I think this is ready to merge. @rgommers did you want to look at this one more time?
My comments were addressed, changes look good, and CI is happy - let's give this a go. Thanks @dschult!
This PR migrates `scipy.io` to use sparse arrays internally and to read files into either sparray or spmatrix.
The primary functions that can (depending on the file contents) return sparse containers are: `scipy.io.loadmat`, `scipy.io.hb_read`, `scipy.io.mmread`.
These can also be accessed via `scipy.io.matlab.loadmat`, `scipy.io._harwell_boeing.hb_read`, and `scipy.io._fast_matrix_market.mmread`; an alternate version of `mmread` is available at `scipy.io._mmio.mmread`.
Each of those functions currently returns a sparse matrix. This PR adds a kwarg `sparray=None` indicating (bool) whether this should return a sparray or an spmatrix. The default is `None`, indicating a preference has not been provided.
The default value is set to `None`, but is deprecated in release 1.15 with a change of default to sparray coming in 1.17.
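As a rough illustration of what "returns a sparse container" means for `mmread`, the sketch below reads a tiny Matrix Market file from memory and converts the result to a sparse array explicitly. It deliberately passes no return-type keyword, since the keyword name and default discussed in this PR may differ between SciPy versions; the sample file contents are made up.

```python
from io import StringIO

from scipy.io import mmread
from scipy.sparse import coo_array, issparse

# A small Matrix Market file in coordinate (sparse) format: 3x3, two entries.
text = """%%MatrixMarket matrix coordinate real general
3 3 2
1 1 1.5
3 2 4.0
"""

# The container type (spmatrix or sparray) depends on the SciPy version
# and on any return-type keyword; here we rely on the default.
loaded = mmread(StringIO(text))
is_sparse = issparse(loaded)

# Explicit conversion yields a sparse array whichever container came back.
as_array = coo_array(loaded)
dense = as_array.toarray()
```

Converting explicitly like this is one way for calling code to behave the same on older and newer SciPy, at the cost of an extra copy.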
The deprecation in the docstrings is:
And the `DeprecationWarning` message is shown when a sparse container is returned and `sparray is None`. The warning message is:
I hope I have the stack levels correct. It seems to be OK because the tests picked up the warnings before I updated them to avoid it.
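For readers unfamiliar with stack levels, the following sketch shows how `stacklevel` decides which line a warning is attributed to. The function names and the message are hypothetical and do not reflect the actual call chain inside `scipy.io`; the only real API used is the standard-library `warnings` module.

```python
import warnings


def _read_sparse(sparray, stacklevel):
    """Hypothetical private helper shared by the public readers."""
    if sparray is None:
        # stacklevel=1 would blame this line; each extra level moves
        # the reported location one frame up the call stack.
        warnings.warn("sparray not specified", DeprecationWarning,
                      stacklevel=stacklevel)
    return "sparray" if sparray else "spmatrix"


def public_reader(sparray=None):
    """Hypothetical public function: user -> public_reader -> _read_sparse."""
    return _read_sparse(sparray, stacklevel=3)


def user_code():
    return public_reader()  # the warning should be reported at this line


def warning_locations(func):
    """Run func and return (category, line number) for each warning raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return [(w.category, w.lineno) for w in caught]


locations = warning_locations(user_code)
```

Because the helper sits two frames below the user's call, `stacklevel=3` is what makes the warning point at `user_code` rather than at library internals.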
Outside of those functions, the helper functions and classes/methods and the tests now work with sparse arrays.
After the deprecation period for the default return value, the default will shift to returning sparse arrays, though folks who choose to quiet the deprecation warning by setting `sparray=False` will still get `spmatrix`.
The `dev.py smoke-docs` doesn't reach all doc_tests in this subpackage. But `dev.py smoke-docs -t scipy/io/_mmio.py` showed a number of errors that I corrected here (hopefully those fixes will work with the CI doctests too).
One test in `sparse.csgraph` uses `mmread` and had to be updated.