DataFrame.apply not working with datetimes #6125

@dbew

Using apply on a DataFrame that contains datetimes gives an unexpected result. With a DataFrame that holds only integers and strings, a row-wise apply returns the market names, as expected:

import pandas as pd

positions = pd.DataFrame([[1, 'ABC', 50], [1, 'YUM', 20],
                          [1, 'DEF', 20], [2, 'ABC', 50],
                          [2, 'YUM', 20], [2, 'DEF', 20]],
                         columns=['a', 'market', 'position'])
positions.apply(lambda r: r['market'], axis=1)
Out[210]: 
0    ABC
1    YUM
2    DEF
3    ABC
4    YUM
5    DEF
dtype: object

If we replace the data in column 'a' with datetimes, the result is wrong. The first value in the 'market' column is repeated for every row:

import datetime

positions = pd.DataFrame([[datetime.datetime(2013, 1, 1), 'ABC', 50], 
                           [datetime.datetime(2013, 1, 1), 'YUM', 20],
                           [datetime.datetime(2013, 1, 1), 'DEF', 20],
                           [datetime.datetime(2013, 1, 2), 'ABC', 50],
                           [datetime.datetime(2013, 1, 2), 'YUM', 20], 
                           [datetime.datetime(2013, 1, 2), 'DEF', 20]],
                          columns=['a', 'market', 'position'])
positions.apply(lambda r: r['market'], axis=1)
Out[213]: 
0    ABC
1    ABC
2    ABC
3    ABC
4    ABC
5    ABC
dtype: object
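The expected behaviour can be stated as a small check: a row-wise apply that returns `r['market']` should reproduce the 'market' column exactly. The following is a minimal sketch, assuming a pandas build that does not have this problem. The names `result`, `expected` and `matches` are illustrative and not part of the original report.

```python
import datetime

import pandas as pd

# Same data as in the report: a datetime column next to string and int columns.
positions = pd.DataFrame(
    [[datetime.datetime(2013, 1, 1), 'ABC', 50],
     [datetime.datetime(2013, 1, 1), 'YUM', 20],
     [datetime.datetime(2013, 1, 1), 'DEF', 20],
     [datetime.datetime(2013, 1, 2), 'ABC', 50],
     [datetime.datetime(2013, 1, 2), 'YUM', 20],
     [datetime.datetime(2013, 1, 2), 'DEF', 20]],
    columns=['a', 'market', 'position'])

# Row-wise apply that simply returns each row's 'market' value.
result = positions.apply(lambda r: r['market'], axis=1)

# Reading the column directly does not go through apply, so it serves
# as the reference value.
expected = positions['market']

# True when apply hands every row to the function correctly;
# on an affected build this comparison would be False.
matches = result.tolist() == expected.tolist()
print(matches)
```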

If you replace the lambda with a function that prints the object passed in, you can see that every call receives the values of the first row of the dataframe. Only the Name of the Series advances from 0 to 5:

def print_input(r):
    print(r)
    return 1

positions.apply(print_input, axis=1)
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 0, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 1, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 2, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 3, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 4, dtype: object
a           2013-01-01 00:00:00
market                      ABC
position                     50
Name: 5, dtype: object
Out[215]: 
0    1
1    1
2    1
3    1
4    1
5    1
dtype: int64
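The same diagnosis can be made without reading printed output, by recording what each call actually receives. This is a hedged sketch rather than part of the original report: `seen` and `record_input` are hypothetical names. A dict keyed by row label is used so that the result does not depend on how many times apply calls the function per row.

```python
import datetime

import pandas as pd

positions = pd.DataFrame(
    [[datetime.datetime(2013, 1, 1), 'ABC', 50],
     [datetime.datetime(2013, 1, 1), 'YUM', 20],
     [datetime.datetime(2013, 1, 1), 'DEF', 20],
     [datetime.datetime(2013, 1, 2), 'ABC', 50],
     [datetime.datetime(2013, 1, 2), 'YUM', 20],
     [datetime.datetime(2013, 1, 2), 'DEF', 20]],
    columns=['a', 'market', 'position'])

# Maps each row label (the Series name) to the 'market' value the
# function saw for that row.
seen = {}


def record_input(r):
    seen[r.name] = r['market']
    return 1


positions.apply(record_input, axis=1)

# Reference mapping built straight from the column, bypassing apply.
reference = dict(zip(positions.index, positions['market']))

# On a correct build the two mappings agree; on an affected build every
# value in `seen` would be 'ABC'.
print(seen == reference)
```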

This is new in master; I did not see it in pandas 0.11.0 or 0.13.0.
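Because the behaviour depends on the build, it helps to record the exact version when reproducing. A minimal sketch follows; the variable name `version` is illustrative.

```python
import pandas as pd

# Version string of the installed pandas build. Development builds
# typically carry a suffix beyond the plain release number.
version = pd.__version__
print(version)
```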

Labels: Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)