ENH: argmax and argmin methods for sparse matrices #6761

nmayorov · 2016-11-04T18:41:45Z

I think my approach is reasonable and efficient in terms of algorithmic complexity, but maybe it can be done with fewer Python loops.

For now I have put the tests in a separate file (instead of test_base.py); it is just easier to test and demonstrate what's going on that way. At the end we can move them to test_base.py.

@perimosocordiae please look when you can.

Closes scipy#5883

perimosocordiae

Looks reasonable, overall. It might be nice to follow the pattern used by _min_or_max, rather than handling the axis=None and row/column-wise cases all together.

perimosocordiae · 2016-11-04T20:28:20Z

scipy/sparse/data.py

+            mat.sum_duplicates()
+
+            line_size = mat.shape[axis]
+            ret = np.empty(mat.shape[1 - axis], dtype=int)


ret_size, line_size = mat._swap(mat.shape)
ret = np.zeros(ret_size, dtype=int)

perimosocordiae · 2016-11-04T20:33:57Z

scipy/sparse/data.py

+            line_size = mat.shape[axis]
+            ret = np.empty(mat.shape[1 - axis], dtype=int)
+
+            for i in range(ret.shape[0]):


I don't think we can avoid the loop entirely, but we can at least vectorize the first condition:

nz_lines, = np.nonzero(np.diff(mat.indptr) > 0)
for i in nz_lines:
    p, q = mat.indptr[i:i+2]
    data = mat.data[p:q]
    # etc...
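As an illustration of this suggestion, here is a minimal sketch that works directly on raw CSR arrays (`data`, `indices`, `indptr`). The helper name `stored_argmax_per_row` and the sample arrays are hypothetical and this is not the code from the PR. It only shows the non-empty rows being selected in one vectorized step while the per-row work stays in a Python loop. It deliberately ignores implicit zeros, so it reports the column of the largest stored value and uses -1 as a placeholder for empty rows.

```python
import numpy as np

def stored_argmax_per_row(data, indices, indptr):
    """Column of the largest *stored* value in each CSR row; -1 for empty rows.

    Hypothetical helper for illustration only; implicit zeros are ignored.
    """
    n_rows = len(indptr) - 1
    ret = np.full(n_rows, -1, dtype=int)
    # Vectorized check: rows whose slice of `data` is non-empty.
    nz_rows, = np.nonzero(np.diff(indptr) > 0)
    for i in nz_rows:
        p, q = indptr[i:i + 2]
        ret[i] = indices[p + np.argmax(data[p:q])]
    return ret

# 3x4 matrix [[0, 2, 0, 7], [0, 0, 0, 0], [5, 0, 1, 0]] in CSR form.
data = np.array([2, 7, 5, 1])
indices = np.array([1, 3, 0, 2])
indptr = np.array([0, 2, 2, 4])
result = stored_argmax_per_row(data, indices, indptr)
```

The loop only visits rows that actually hold stored entries, which is the saving the vectorized condition is meant to provide.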

perimosocordiae · 2016-11-04T20:59:11Z

scipy/sparse/tests/test_argmax.py

+
+    D2 = D1.transpose()
+
+    classes = [bsr_matrix, coo_matrix, csr_matrix, csc_matrix]


When you integrate this with test_base.py, you can add the argmin/argmax tests to the _TestMinMax class and follow the existing tests as a template.

perimosocordiae · 2016-11-04T21:01:46Z

scipy/sparse/tests/test_argmax.py

+        for axis in [None, 0, 1]:
+            mat = spmatrix(D)
+            assert_raises(ValueError, mat.argmax, axis=axis)
+            assert_raises(ValueError, mat.argmin, axis=axis)


Some of these cases actually don't raise an error when mat is a numpy array:

In [1]: x = np.ones((5,0))
In [2]: x.argmax(axis=0)
Out[2]: array([], dtype=int64)
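For reference, a small sketch of how a dense NumPy array of shape (5, 0) behaves for each axis. The helper `raises_value_error` is made up for this example. Only the reduction over the zero-length axis and the flattened reduction fail, while reducing over axis 0 returns an empty result.

```python
import numpy as np

x = np.ones((5, 0))

# Reducing over axis 0 gives one result per column; with zero columns
# the result is simply an empty index array.
cols = x.argmax(axis=0)

def raises_value_error(func, **kwargs):
    """Return True if calling func(**kwargs) raises ValueError."""
    try:
        func(**kwargs)
    except ValueError:
        return True
    return False

# Reducing over the zero-length axis, or over the flattened (empty) array, fails.
axis1_raises = raises_value_error(x.argmax, axis=1)
flat_raises = raises_value_error(x.argmax)
```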

nmayorov · 2016-11-05T00:18:56Z

@perimosocordiae great advice, I think I have addressed all of it now.

If you are fine with the updated state, I will move the tests to test_base.py.

One more thing I forgot to mention: I decided that returning an ndarray is more convenient than any form of sparse matrix, because I expect that in the majority of situations people will want to work with an ndarray eventually. Is that the right decision?

perimosocordiae · 2016-11-05T18:38:49Z

Looks good to me. I agree that dense results are reasonable, considering that an argmin/argmax of zero doesn't necessarily indicate missing data.

If we want to follow the numpy matrix convention (which spmatrix mimics), the result should be a matrix (row matrix for axis=0, column matrix for axis=1). On the other hand, argmax/argmin are typically then used for indexing, where a flat ndarray is typically the most useful. I'm leaning toward the matrix return type for now, but I could be convinced otherwise.
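To make the trade-off concrete, here is a short sketch using dense NumPy objects only. The sample data `D` is made up. It shows the shapes that the numpy matrix convention produces, and the extra flattening step needed before such a result can be used like a flat index array.

```python
import numpy as np

D = np.array([[1, 0, 3],
              [0, 5, 0]])
M = np.asmatrix(D)

arr_result = D.argmax(axis=0)   # ndarray, shape (3,)
mat_result = M.argmax(axis=0)   # np.matrix, row matrix of shape (1, 3)
col_result = M.argmax(axis=1)   # np.matrix, column matrix of shape (2, 1)

# A matrix result has to be flattened before it behaves like a 1-D index array.
flat = np.asarray(mat_result).ravel()
```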

nmayorov · 2016-11-05T19:18:54Z

@perimosocordiae

Maybe I'm wrong about that, but it seems to me that people usually avoid using numpy matrices. Leaving consistency aside, I think having an ndarray right away is more practical; at least for me that's very true. I leave it to you to decide.


nmayorov · 2016-11-09T21:44:28Z

I made sure that the minimum possible index is always returned and moved tests to test_base.py.

Could you please make the final decision about whether to return an array or a matrix?
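To illustrate why returning the minimum possible index needs care in the sparse case, here is a sketch for a single line (one row or column). The helper `line_argmax` is hypothetical and is not the PR's implementation. It assumes the stored indices are sorted and free of duplicates and that the line has at least one entry. The subtle part is that an implicit zero can be the maximum, or can tie with an explicit zero.

```python
import numpy as np

def line_argmax(indices, values, line_size):
    """First index of the maximum in one sparse line (hypothetical helper).

    Assumes `indices` is sorted and duplicate-free and `line_size > 0`.
    """
    indices = np.asarray(indices)
    values = np.asarray(values)
    if len(values) == 0:
        return 0  # every entry is an implicit zero
    k = int(np.argmax(values))  # first stored maximum
    if values[k] > 0 or len(values) == line_size:
        # Implicit zeros cannot win, or there are none at all.
        return int(indices[k])
    # The maximum is zero. The first implicit zero sits at the first
    # position p where indices[p] != p, or right after the stored run.
    mismatch = np.nonzero(indices != np.arange(len(indices)))[0]
    first_implicit = int(mismatch[0]) if len(mismatch) else len(indices)
    if values[k] == 0:
        # An explicit zero may come before the first implicit one.
        return min(first_implicit, int(indices[k]))
    return first_implicit
```

For example, a line with dense form `[-1, 0, 0]`, where only the first two entries are stored (the second as an explicit zero), yields index 1, which matches `np.argmax` on the dense data.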

perimosocordiae · 2016-11-10T16:42:41Z

scipy/sparse/tests/test_base.py

+
+            mat = self.spmatrix(D)
+
+            assert_equal(mat.argmax(), argmax)


Nitpick: I think it's clearer to have tests of the form:

assert_equal(mat.argmax(), np.argmax(D))

Rather than computing all the expected results first.
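A minimal sketch of a test written in this style, assuming a SciPy version that includes the `argmax` method added by this PR. The sample data `D` is made up and has a unique maximum in every row and column, so the result does not depend on tie-breaking. Results along an axis are flattened first so the comparison does not depend on whether a matrix or an array is returned.

```python
import numpy as np
from numpy.testing import assert_equal
from scipy.sparse import csr_matrix

D = np.array([[1, 0, 3],
              [0, 5, 0],
              [2, 0, 0]])
mat = csr_matrix(D)

# Expected values come straight from NumPy applied to the dense data.
assert_equal(mat.argmax(), np.argmax(D))
for axis in (0, 1):
    got = np.asarray(mat.argmax(axis=axis)).ravel()
    assert_equal(got, np.argmax(D, axis=axis))
```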

perimosocordiae · 2016-11-10T16:48:36Z

Looks good to me. I'm +1 to merge.

I'll defer the final choice about array vs matrix return types to another reviewer. @pv, @rgommers, others: what do you think?

rgommers · 2016-12-06T08:23:33Z

> I'll defer the final choice about array vs matrix return types to another reviewer. @pv, @rgommers, others: what do you think?

Not a strong preference, but despite my dislike of matrix I think we should go for consistency and return a matrix here.

nmayorov · 2016-12-06T20:39:30Z

> Not a strong preference, but despite my dislike of matrix I think we should go for consistency and return a matrix here.

OK, maybe later we can switch to arrays everywhere (e.g. for the 1.0 release).

I changed to matrix for now.

perimosocordiae · 2017-01-16T20:18:13Z

I want this in version 0.19, so merging now. Thanks, @nmayorov!


ENH: argmax and argmin methods for sparse matrices

5b8948f

Closes scipy#5883

nmayorov added the enhancement and scipy.sparse labels Nov 4, 2016

perimosocordiae reviewed Nov 4, 2016


MAINT: Refactor sparse argmax and argmin

e3ee125

nmayorov added 2 commits November 10, 2016 02:26

MAINT: Make sure the minimum index is returned in sparse argmax and argmin

e2ea816

MAINT: Move tests for sparse argmin and argmax to test_base.py

a74ebd1

perimosocordiae reviewed Nov 10, 2016


MAINT: Small refactor of sparse argmin and argmax tests

b607e5b

API: Sparse argmin and argmax return np.matrix

3c85bb1

nmayorov force-pushed the sparse_argmax branch from 8273e40 to 3c85bb1 December 6, 2016 20:37

rgommers added this to the 0.19.0 milestone Dec 21, 2016

perimosocordiae merged commit 56f2045 into scipy:master Jan 16, 2017

perimosocordiae added a commit that referenced this pull request Jan 16, 2017

DOC: note addition of argmin/argmax

b5310bf

The methods were added in gh-6761.

perimosocordiae added a commit to perimosocordiae/scipy that referenced this pull request Jan 16, 2017

DOC: note argmin/argmax addition

05eaf17

Methods were added in scipygh-6761.

perimosocordiae mentioned this pull request Jan 16, 2017

DOC: note argmin/argmax addition #6964

Merged

nmayorov deleted the sparse_argmax branch March 15, 2017 00:05
