inconsistent results building a DataFrame from a dict of Series with MultiIndexes #1727

@grsr

I am trying to build a DataFrame from a dict of Series objects whose MultiIndex indices do not necessarily match exactly. Sometimes I don't get any results for a particular data point, so I create an empty Series object and add that to the dict, because I need the column to be present even without any data (later I convert all NAs to 0). This works sometimes, but other times I get an error message that implies all the Series need hierarchical indices. It seems to depend on the order in which the Series are added to the dict. See the example session below: the first time I create the DataFrame it behaves just as I'd like, giving me a df index that is the union of all the indices in the populated Series and supplying NaNs wherever there is no data, but the second time it blows up. Perhaps I shouldn't be relying on this behaviour, but the results should at least be consistent.

Any tips on how to solve this in a cleaner way are also very welcome. Thanks.

In [292]: pandas.__version__
Out[292]: '0.8.1'

In [293]: s1 = Series([1,2,3,4], index=MultiIndex.from_tuples([(1,2),(1,3),(2,2),(2,4)]))

In [294]: s2 = Series([1,2,3,4], index=MultiIndex.from_tuples([(1,2),(1,3),(3,2),(3,4)]))

In [295]: s3 = Series()

In [296]: df = DataFrame.from_dict({'foo':s1, 'bar':s2, 'baz':s3})

In [297]: df
Out[297]: 
     bar  baz  foo
1 2    1  NaN    1
  3    2  NaN    2
2 2  NaN  NaN    3
  4  NaN  NaN    4
3 2    3  NaN  NaN
  4    4  NaN  NaN

In [298]: df = DataFrame.from_dict({'foo':s1, 'baz':s3, 'bar':s2})

... stacktrace

TypeError: can only call with other hierarchical index objects
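
One possible cleaner approach, offered as a sketch rather than a confirmed fix: leave the empty Series out when constructing the DataFrame, then add the missing columns back by reindexing. This assumes a recent pandas imported as `pd` (not the 0.8.1 used in the session above), and the names `data` and `non_empty` are illustrative only.

```python
import pandas as pd

# Same data as in the session above.
s1 = pd.Series([1, 2, 3, 4],
               index=pd.MultiIndex.from_tuples([(1, 2), (1, 3), (2, 2), (2, 4)]))
s2 = pd.Series([1, 2, 3, 4],
               index=pd.MultiIndex.from_tuples([(1, 2), (1, 3), (3, 2), (3, 4)]))
s3 = pd.Series(dtype="float64")  # empty placeholder with a plain (non-hierarchical) index

# Insertion order that triggered the error in the report.
data = {"foo": s1, "baz": s3, "bar": s2}

# Build the frame only from Series that actually hold data, so every
# index that takes part in the union is a MultiIndex.
non_empty = {name: s for name, s in data.items() if len(s) > 0}
df = pd.DataFrame(non_empty)

# Re-add the columns that had no data (they come back as all-NaN),
# then convert NaN to 0 as described in the report.
df = df.reindex(columns=list(data)).fillna(0)

print(df)
```

Because the empty Series never takes part in building the index, the outcome should not depend on where it sits in the dict. Whether the original order dependence still occurs in later pandas versions is not something this sketch checks.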
