Skip to content

Conversation

jschendel
Copy link
Member

I've seen a few complaints about this on StackOverflow, so wanted to get this into 0.24.0 before I try implementing larger changes like the new behavior specs and interval accessor.

@jschendel jschendel added Bug Datetime Datetime data dtype Indexing Related to indexing on series/frames, not to indexes themselves Interval Interval data type labels Oct 4, 2018
@jschendel jschendel added this to the 0.24.0 milestone Oct 4, 2018
@pep8speaks
Copy link

Hello @jschendel! Thanks for submitting the PR.

@mroeschke
Copy link
Member

Could you also add tests for Periods as well? I think they also need i8 conversion (though possibly less commonly used).

@jschendel
Copy link
Member Author

We currently do not allow an IntervalIndex to have Period endpoints, so I don't think tests are necessary:

In [2]: pd.IntervalIndex.from_breaks(pd.period_range('2018Q1', freq='Q', periods=3))
---------------------------------------------------------------------------
ValueError: Period dtypes are not supported, use a PeriodIndex instead

Interestingly enough, it looks like an Interval can have Period endpoints, which seems like a bug:

In [3]: pd.Interval(pd.Period('2018Q1', freq='Q'), pd.Period('2018Q3', freq='Q'))
Out[3]: Interval(Period('2018Q1', 'Q-DEC'), Period('2018Q3', 'Q-DEC'), closed='right')

@mroeschke
Copy link
Member

Gotcha. Are there plans for IntervalIndex to support Period? That might get tricky since Periods themselves are conceptually 1 label for an interval of time.

@codecov
Copy link

codecov bot commented Oct 4, 2018

Codecov Report

Merging #22988 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22988      +/-   ##
==========================================
+ Coverage   92.18%    92.2%   +0.01%     
==========================================
  Files         169      169              
  Lines       50823    50865      +42     
==========================================
+ Hits        46853    46898      +45     
+ Misses       3970     3967       -3
Flag Coverage Δ
#multiple 90.62% <100%> (+0.01%) ⬆️
#single 42.33% <13.88%> (-0.04%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/interval.py 94.63% <100%> (+0.3%) ⬆️
pandas/core/dtypes/dtypes.py 95.56% <0%> (-0.3%) ⬇️
pandas/core/reshape/merge.py 93.89% <0%> (-0.26%) ⬇️
pandas/core/strings.py 98.63% <0%> (ø) ⬆️
pandas/core/dtypes/base.py 100% <0%> (ø) ⬆️
pandas/core/dtypes/cast.py 88.58% <0%> (ø) ⬆️
pandas/core/series.py 93.75% <0%> (ø) ⬆️
pandas/core/frame.py 97.2% <0%> (ø) ⬆️
pandas/core/dtypes/common.py 95.03% <0%> (+0.01%) ⬆️
pandas/core/arrays/datetimelike.py 95.56% <0%> (+0.02%) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c282e31...f338f3d. Read the comment docs.

@jschendel
Copy link
Member Author

Are there plans for IntervalIndex to support Period?

I'm not aware of any such plans. I don't think the same type algorithms would apply; a Period is basically a special case of an Interval, so you'd essentially have intervals with interval endpoints (e.g. Period('2018Q1') is conceptually Interval('2018-01-01', '2018-04-01', closed='left') with some additional special properties.). Haven't fully thought it through though.

There is an open issue to have Period/PeriodIndex inherit from Interval/IntervalIndex though, allowing one to use any of the interval methods on periods. Not quite as straight forward as it might initially seem though, as we allow a Period to be outside the bounds of Timestamp, so just using Timestamp endpoints for the Interval representation wouldn't cut it.

@@ -514,6 +519,41 @@ def _maybe_cast_indexed(self, key):

return key

def _maybe_convert_i8(self, key):
if isinstance(key, Interval):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you document this a bit, and can you simplify at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Look better?

@jreback
Copy link
Contributor

jreback commented Oct 7, 2018

@mroeschke as indicated above, a Period doesn't make sense as an element of an IntervalIndex, though I don't think we actually preclude it, maybe we should. However implemented Period as a sub-class of Intevral and PeriodIndex as a sub-class of IntervalIndex, does make sense (though may not be straightforward to actually do this).

@jreback jreback merged commit 2a1ef8c into pandas-dev:master Oct 7, 2018
@jreback
Copy link
Contributor

jreback commented Oct 7, 2018

thanks @jschendel

@jschendel jschendel deleted the ivtree-i8 branch October 8, 2018 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Datetime Datetime data dtype Indexing Related to indexing on series/frames, not to indexes themselves Interval Interval data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: indexing in datetime IntervalIndex with duplicate values fails
4 participants