BUG: Allow list-like in DatetimeIndex.searchsorted #32764

Merged (28 commits) on Mar 26, 2020

Conversation

dsaxton
Member

@dsaxton dsaxton commented Mar 16, 2020

result = dates.searchsorted(klass(dates))
expected = np.array([0, 1], dtype=result.dtype)

tm.assert_numpy_array_equal(result, expected)
Member

Do we have sufficient tests for non-matching dtypes, e.g. integers?

What about different timezones, or tz-awareness compat? (Or a PeriodArray freq mismatch?)

Member Author

Added some tests for type mismatches
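
As a rough illustration of what such type-mismatch tests could check (not the tests added in this PR), the sketch below passes the three mismatched arguments quoted later in the thread to a tz-naive `DatetimeIndex`. The helper `raises_type_error` is hypothetical, and only the exception type is checked because the message text may vary between pandas versions.

```python
import pandas as pd

# A sorted, tz-naive DatetimeIndex to search in.
dates = pd.date_range("2020-01-01", periods=2)


def raises_type_error(index, arg):
    """Hypothetical helper: True if index.searchsorted(arg) raises TypeError."""
    try:
        index.searchsorted(arg)
    except TypeError:
        return True
    return False


mismatched = [
    [1, 2],                                                 # integers
    ["a", "b"],                                             # strings that are not dates
    [pd.Timestamp("2020-01-01", tz="Europe/London")] * 2,   # tz-aware vs tz-naive
]

outcomes = [raises_type_error(dates, arg) for arg in mismatched]
print(outcomes)
```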

@jreback jreback added Bug Datetime Datetime data dtype labels Mar 19, 2020
@jreback jreback added this to the 1.1 milestone Mar 19, 2020
Contributor

@jreback jreback left a comment

ideally as a followup we test things like IntervalIndex as well for this behavior (pls create an issue)



@pytest.mark.parametrize(
    "arg", [[1, 2], ["a", "b"], [pd.Timestamp("2020-01-01", tz="Europe/London")] * 2]
Contributor

Can you also test for a TDI and a PeriodIndex?



@pytest.mark.parametrize("klass", [list, np.array, pd.array, pd.Series])
def test_searchsorted_datetimelike_with_listlike(klass):
Contributor

Can you parametrize over TDI and PeriodIndex? They should work as well.
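
A minimal sketch of what covering all three datetime-like index types might look like, written as plain loops rather than `pytest.mark.parametrize` so it runs standalone. The function name `insertion_points` and the sample ranges are illustrative assumptions, not code from this PR.

```python
import numpy as np
import pandas as pd

# One sorted index of each datetime-like flavour (illustrative values).
indexes = [
    pd.date_range("2020-01-01", periods=3),              # DatetimeIndex
    pd.timedelta_range("1 day", periods=3),              # TimedeltaIndex (TDI)
    pd.period_range("2020-01-01", periods=3, freq="D"),  # PeriodIndex
]

# The list-like containers the test in this PR is parametrized over.
containers = [list, np.array, pd.array, pd.Series]


def insertion_points(index, klass):
    """Search the index for its own values, wrapped in the given container."""
    return index.searchsorted(klass(index))


# Searching a sorted, unique index for its own values with side="left"
# should give positions 0..n-1 regardless of the container used.
all_match = all(
    np.array_equal(insertion_points(index, klass), np.arange(len(index)))
    for index in indexes
    for klass in containers
)
print(all_match)
```

In a test suite, the two loops would presumably become two stacked `parametrize` decorators.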

@dsaxton
Member Author

dsaxton commented Mar 19, 2020

> ideally as a followup we test things like IntervalIndex as well for this behavior (pls create an issue)

Created #32845

Contributor

@jreback jreback left a comment

this seems to overlap with #32874 can you clarify?

@@ -804,6 +804,9 @@ def searchsorted(self, value, side="left", sorter=None):
        indices : array of ints
            Array of insertion points with the same shape as `value`.
        """
        if is_list_like(value):
            value = array(value)
Contributor

this should be an elif
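
One plausible reading of this suggestion is that the list-like check belongs in a single chain of mutually exclusive branches, followed by one final type guard. The toy below is a standard-library stand-in for that control flow, not pandas' implementation; `normalize` and `toy_searchsorted` are hypothetical names.

```python
from bisect import bisect_left
from datetime import datetime


def normalize(value):
    """Toy dispatch: exactly one branch converts each kind of input."""
    if isinstance(value, str):
        # Strings are parsed into a scalar; unparseable ones become a TypeError.
        try:
            value = datetime.fromisoformat(value)
        except ValueError as err:
            raise TypeError("requires compatible dtype or scalar") from err
    elif isinstance(value, (list, tuple)):
        # List-likes are normalized element by element with the same rules.
        value = [normalize(item) for item in value]

    # Final guard: anything no branch converted must already be valid.
    if not isinstance(value, (datetime, list)):
        raise TypeError(
            f"requires compatible dtype or scalar, not {type(value).__name__}"
        )
    return value


def toy_searchsorted(sorted_values, value):
    """Left insertion point(s) of value in an already sorted list."""
    value = normalize(value)
    if isinstance(value, list):
        return [bisect_left(sorted_values, item) for item in value]
    return bisect_left(sorted_values, value)


dates = [datetime(2020, 1, 1), datetime(2020, 1, 2), datetime(2020, 1, 3)]
print(toy_searchsorted(dates, "2020-01-02"))
print(toy_searchsorted(dates, [datetime(2020, 1, 1), "2020-01-03"]))
```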

@dsaxton
Member Author

dsaxton commented Mar 22, 2020

> this seems to overlap with #32874 can you clarify?

Yeah, that's me confusing myself working on duplicative branches. I could try merging that one into here, hope I don't break things, and then close the other (it was originally only intended to add testing code)?

@jreback
Contributor

jreback commented Mar 22, 2020

> > this seems to overlap with #32874 can you clarify?
>
> Yeah, that's me confusing myself working on duplicative branches. I could try merging that one into here, hope I don't break things, and then close the other (it was originally only intended to add testing code)?

yep one branch is fine (tests & changes)

@dsaxton dsaxton mentioned this pull request Mar 22, 2020
3 tasks
if not type(self)._is_recognized_dtype(value):
    raise TypeError(
        "searchsorted requires compatible dtype or scalar, "
        f"not {type(value).__name__}"
    )
value = type(self)(value)
self._check_compatible_with(value)

if not (isinstance(value, (self._scalar_type, type(self))) or (value is NaT)):
Member Author

I get a lot of test failures if I remove this so I think it's needed
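
For context, a small sketch of the user-facing behavior a final scalar guard like this would protect: a matching scalar is accepted, while scalars of other types are rejected with `TypeError`. Whether this exact line is what rejects them in a given pandas version is an assumption; the values below are illustrative.

```python
import pandas as pd

dates = pd.date_range("2020-01-01", periods=3)

# A scalar of the index's own type is accepted and yields one insertion point.
position = int(dates.searchsorted(pd.Timestamp("2020-01-02")))

# Scalars of unrelated types should be rejected rather than silently coerced.
rejected = []
for bad in (1, 1.0, pd.Timedelta(days=2)):
    try:
        dates.searchsorted(bad)
    except TypeError:
        rejected.append(type(bad).__name__)

print(position, rejected)
```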

@@ -64,6 +66,19 @@ def test_searchsorted(self, freq):
        with pytest.raises(IncompatibleFrequency, match=msg):
            pidx.searchsorted(Period("2014-01-01", freq="5D"))

    @pytest.mark.parametrize("klass", [list, np.array, array, Series])
    def test_searchsorted_different_argument_classes(self, klass):
Contributor

@jbrockmendel prob should move the searchsorted tests out of here

@@ -111,6 +112,26 @@ def test_sort_values(self):

        tm.assert_numpy_array_equal(dexer, np.array([0, 2, 1]), check_dtype=False)

    @pytest.mark.parametrize("klass", [list, np.array, array, Series])
Contributor

Same here; we ought to collect these tests.

@jreback jreback merged commit ee017c1 into pandas-dev:master Mar 26, 2020
@jreback
Contributor

jreback commented Mar 26, 2020

thanks @dsaxton

I think we need to collect the index searchsorted tests and separate them out, but that's a separate exercise.

@dsaxton dsaxton deleted the searchsorted-date branch March 26, 2020 01:25
Labels
Bug Datetime Datetime data dtype
Development

Successfully merging this pull request may close these issues.

TypeError: searchsorted requires compatible dtype or scalar, not Series
4 participants