Description
This builds on some issues seen in #13475 and the subsequent PR. Essentially, the resulting index name when combining existing named indices with union_indexes is not consistent across index types. This primarily affects concat and the DataFrame constructor, which call union_indexes.
When combining named indices, there are three main name resolution rules I can think of:
ignore: assign no name to the output
unanimous: assign a name if all names agree
consensus: assign a name only if there is exactly one unique non-null name
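To make the three rules concrete, here is a minimal sketch in plain Python that works on a list of names only. The function names (resolve_ignore, resolve_unanimous, resolve_consensus) are hypothetical and are not part of pandas; they only illustrate my reading of the rules above.

```python
from typing import Hashable, Optional, Sequence

Name = Optional[Hashable]


def resolve_ignore(names: Sequence[Name]) -> Name:
    """ignore: never carry a name over to the output."""
    return None


def resolve_unanimous(names: Sequence[Name]) -> Name:
    """unanimous: keep the name only if every input has the same name."""
    names = list(names)
    if not names:
        return None
    first = names[0]
    return first if all(n == first for n in names) else None


def resolve_consensus(names: Sequence[Name]) -> Name:
    """consensus: keep the name if exactly one unique non-null name exists."""
    unique = {n for n in names if n is not None}
    return unique.pop() if len(unique) == 1 else None
```

The rules only differ when some inputs are unnamed: for the names ["idx", None], unanimous yields no name while consensus yields "idx".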
With some testing, below is my best understanding of what resolution rule the various index types use. Note that the behavior may differ depending on whether the indices are numerically equal or not, as with RangeIndex. For the not numerically equal case:
Index: consensus
RangeIndex: ignore
Int64Index: unanimous
Float64Index: unanimous
DatetimeIndex: consensus
TimedeltaIndex: consensus
PeriodIndex: unanimous
CategoricalIndex: unanimous
MultiIndex: unanimous (over all levels)
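A classification like the one above can be derived from two probes: one where both inputs share a name, and one where only one input is named. The sketch below is a hypothetical helper (classify_rule is not a pandas function) that takes any callable mapping a pair of input names to the resulting name, so it makes no assumption about how a given index type is constructed.

```python
from typing import Callable, Hashable, Optional, Sequence

Name = Optional[Hashable]


def classify_rule(result_name: Callable[[Sequence[Name]], Name]) -> str:
    """Infer the name resolution rule from two probes.

    result_name takes the names of two input indices and returns the
    name of their union (None if the result is unnamed).
    """
    same = result_name(["idx", "idx"])    # both inputs share a name
    partial = result_name(["idx", None])  # only one input is named

    if same is None and partial is None:
        return "ignore"
    if same == "idx" and partial is None:
        return "unanimous"
    if same == "idx" and partial == "idx":
        return "consensus"
    return "other"  # behavior that matches none of the three rules
```

For a given index type, result_name would wrap union_indexes, for example by building two RangeIndex objects with the given names and returning the .name of the result. The outcome may depend on the pandas version and on whether the indices are numerically equal.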
I'm not really taking a stand on the correct name resolution rule, but I think it should at least be consistent across index types! And of course MultiIndex is a bit more complicated. It seems possible that this could be implemented in common for non-MultiIndex types in the higher-level union function? I'm not really sure, but I'm happy to put some work into it.
The test is slightly different for each index type, but for the RangeIndex case, here's an MWE:
import pandas as pd

# Two overlapping but not numerically equal ranges with the same name
idx1 = pd.RangeIndex(0, 5, name='idx')
idx2 = pd.RangeIndex(2, 7, name='idx')
pd.core.indexes.api.union_indexes([idx1, idx2])
and the result will have no name (checked on master).