
BUG: df.apply disobeys raw=True #32423

Closed
@kernc

Description


Code Sample, a copy-pastable example if possible

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(dict(ints=np.arange(3),
...                        floats=np.arange(3, dtype=float)))

# df.apply passes Series to the callback even though raw=True was requested
>>> df.apply(type, raw=True)
ints      <class 'pandas.core.series.Series'>
floats    <class 'pandas.core.series.Series'>
dtype: object

# Works for single-dtype DataFrames
>>> df.astype(int).apply(type, raw=True)
ints      <class 'numpy.ndarray'>
floats    <class 'numpy.ndarray'>
dtype: object

Problem description

When df.apply(..., raw=True) is used, the callback should always be passed a NumPy ndarray, as documented, both for performance and for convenience (ndarrays index quite differently from Series).

This is not a recent regression; 0.25.3 exhibits the same behavior.
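A possible interim workaround (just a sketch on my part, not something the docs prescribe) is to bypass apply and hand the NumPy representation to np.apply_along_axis directly, so the callback is guaranteed to receive plain ndarrays:

import numpy as np
import pandas as pd

df = pd.DataFrame(dict(ints=np.arange(3),
                       floats=np.arange(3, dtype=float)))

# to_numpy() yields the common-dtype 2-D array (float64 for this frame),
# so each row slice handed to the callback is a plain ndarray.
result = np.apply_along_axis(type, 1, df.to_numpy())
print(result)  # each element is <class 'numpy.ndarray'>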

Expected Output

>>> df.apply(type, raw=True, axis=1)
0    <class 'numpy.ndarray'>
1    <class 'numpy.ndarray'>
2    <class 'numpy.ndarray'>
dtype: object
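For completeness, a minimal assertion along these lines (the check_raw helper is purely my own illustration) trips on the affected versions, because the callback still receives Series objects for a mixed-dtype frame:

import numpy as np
import pandas as pd

df = pd.DataFrame(dict(ints=np.arange(3),
                       floats=np.arange(3, dtype=float)))

def check_raw(row):
    # Per the docs, raw=True means the function receives ndarray objects.
    assert isinstance(row, np.ndarray), type(row)
    return row.sum()

df.apply(check_raw, raw=True, axis=1)  # AssertionError on the versions below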

Output of pd.show_versions()

pandas 1.1.0.dev0+679.gd33b0025d
pandas 1.0.1
pandas 0.25.3

Labels

Bug, Reshaping
