Unexpected result with union, intersection and diff on Index objects #1708

@kdebrab


Pandas 0.8.1:

import pandas as pd
import datetime as dt
index_1 = pd.DatetimeIndex([dt.datetime(2012,1,1,0), dt.datetime(2012,1,1,12),
    dt.datetime(2012,1,2,0), dt.datetime(2012,1,2,12)])
index_2 = index_1 + pd.DateOffset(hours=1)
index_1 & index_2

correctly returns:

<class 'pandas.tseries.index.DatetimeIndex'>
Length: 0, Freq: None, Timezone: None

But when building the same Index objects by specifying their frequency:

index_1 = pd.date_range('1/1/2012', periods=4, freq='12H')
index_2 = index_1 + pd.DateOffset(hours=1)
index_1 & index_2

unexpectedly results in:

<class 'pandas.tseries.index.DatetimeIndex'>
[2012-01-01 12:00:00, ..., 2012-01-02 12:00:00]
Length: 3, Freq: 12H, Timezone: None

The same issue occurs when calling the intersection() method directly, and similarly unexpected results occur for the union (|, +) and diff (-) operators.
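For reference, the expected results of the three set operations can be worked out without pandas at all. The sketch below uses only the standard library; the names times_1, times_2 and the expected_* variables are illustrative and not part of the original report.

```python
import datetime as dt

# Same timestamps as index_1 in the report: four points, 12 hours apart.
start = dt.datetime(2012, 1, 1)
step = dt.timedelta(hours=12)
times_1 = [start + i * step for i in range(4)]

# Same timestamps as index_2: every point shifted by one hour.
times_2 = [t + dt.timedelta(hours=1) for t in times_1]

# Plain set semantics give the results an Index set operation should match.
expected_intersection = sorted(set(times_1) & set(times_2))
expected_union = sorted(set(times_1) | set(times_2))
expected_diff = sorted(set(times_1) - set(times_2))

print(len(expected_intersection), len(expected_union), len(expected_diff))
# prints: 0 8 4
```

Since no timestamp appears in both lists, the intersection should be empty, the union should hold all 8 values, and the diff should return the 4 values of index_1 unchanged.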

For information, combining the append() and order() methods as an alternative to the union operators does give the correct result, regardless of how index_1 and index_2 are built:

index_1 = pd.date_range('1/1/2012', periods=4, freq='12H')
index_2 = index_1 + pd.DateOffset(hours=1)
index_1.append(index_2).order()

correctly results in:

[2012-01-01 00:00:00, ..., 2012-01-02 13:00:00]
Length: 8, Freq: None, Timezone: None
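For readers on a recent pandas release, a rough equivalent of this check might look like the sketch below. It rests on a few assumptions that are not in the original report: order() is replaced by sort_values(), the set operators are replaced by the union(), intersection() and difference() methods, and the frequency and the one-hour shift are expressed with pd.Timedelta instead of the '12H' alias and pd.DateOffset. It was not run against pandas 0.8.1.

```python
import pandas as pd

# Assumption: pd.Timedelta stands in for the '12H' alias and for
# pd.DateOffset(hours=1) used in the original report.
index_1 = pd.date_range("2012-01-01", periods=4, freq=pd.Timedelta(hours=12))
index_2 = index_1 + pd.Timedelta(hours=1)

# Workaround from the report, with sort_values() in place of order().
combined = index_1.append(index_2).sort_values()

# Method forms of the set operations discussed in the report.
union = index_1.union(index_2)
intersection = index_1.intersection(index_2)
difference = index_1.difference(index_2)

print(len(combined), len(union), len(intersection), len(difference))
```

If the set operations behave correctly, combined and union hold the same 8 timestamps, intersection is empty, and difference equals index_1.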


Labels: Bug, Datetime (Datetime data dtype), Indexing (Related to indexing on series/frames, not to indexes themselves)
