Description
Code Sample, a copy-pastable example if possible
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(dict(ints=np.arange(3),
...                        floats=np.arange(3, dtype=float)))
# df.apply passes a Series to the callback even though raw=True was requested
>>> df.apply(type, raw=True)
ints <class 'pandas.core.series.Series'>
floats <class 'pandas.core.series.Series'>
dtype: object
# Works as documented for single-dtype DataFrames
>>> df.astype(int).apply(type, raw=True)
ints <class 'numpy.ndarray'>
floats <class 'numpy.ndarray'>
dtype: object
Problem description
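When df.apply(..., raw=True) is called, the callback should always be passed a NumPy array (as documented), for reasons of performance and convenience (arrays are indexed quite differently from Series). With a mixed-dtype DataFrame, as shown above, the callback receives a Series instead.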
This is not a recent regression; 0.25.3 exhibits the same behavior.
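Until this is fixed, one possible workaround is to coerce the argument inside the callback. The sketch below is only an illustration, not part of pandas: `ensure_ndarray` and `is_ndarray` are hypothetical helper names. It assumes that `np.asarray` is acceptable for the data involved, which for mixed dtypes may mean an upcast to a common dtype.

```python
import numpy as np
import pandas as pd


def ensure_ndarray(func):
    """Hypothetical helper: wrap func so it always sees a NumPy array."""
    def wrapper(values, *args, **kwargs):
        # np.asarray returns an ndarray unchanged and converts a Series
        # to its underlying array.
        return func(np.asarray(values), *args, **kwargs)
    return wrapper


def is_ndarray(values):
    """Hypothetical callback: report whether it was given an ndarray."""
    return isinstance(values, np.ndarray)


df = pd.DataFrame(dict(ints=np.arange(3),
                       floats=np.arange(3, dtype=float)))

# Column-wise and row-wise application on the mixed-dtype frame.
by_column = df.apply(ensure_ndarray(is_ndarray), raw=True)
by_row = df.apply(ensure_ndarray(is_ndarray), raw=True, axis=1)
```

With the wrapper in place, the callback should see an array whether or not pandas honors raw=True for the frame, at the cost of one extra conversion per call on affected versions.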
Expected Output
>>> df.apply(type, raw=True, axis=1)
0 <class 'numpy.ndarray'>
1 <class 'numpy.ndarray'>
2 <class 'numpy.ndarray'>
dtype: object
Output of pd.show_versions()
pandas 1.1.0.dev0+679.gd33b0025d
pandas 1.0.1
pandas 0.25.3