unexpected result with groupby apply on categorical data #9603

Closed
tmoroz opened this issue Mar 6, 2015 · 2 comments
Labels: Bug, Categorical, Groupby
Milestone: 0.16.1

Comments

tmoroz commented Mar 6, 2015

import pandas

df = pandas.DataFrame({'a': [1, 0, 0, 0]})
df.groupby(pandas.cut(df.a, [0, 1, 2, 3, 4])).apply(lambda x: len(x))

Out[3]:
a
(0, 1]    1
(1, 2]    0
(2, 3]    0
(3, 4]    3
dtype: int64

The length of the final group, (3, 4], should be 0, not 3.
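
For readers following along, here is a minimal sketch of why 0 is the expected length. It assumes the default `pandas.cut` behaviour of right-closed intervals (`right=True`, `include_lowest=False`), so the three zeros fall outside every bin and are assigned no category at all. The names `binned` and `outside` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 0, 0, 0]})

# Bins are (0, 1], (1, 2], (2, 3], (3, 4]; the left edge 0 is excluded.
binned = pd.cut(df["a"], [0, 1, 2, 3, 4])

# Rows whose value falls outside every bin become NaN and belong to no group.
outside = binned.isna()

print(binned)
print("rows outside all bins:", int(outside.sum()))
```

Only the value 1 lands in a bin, so no group other than (0, 1] should contain any rows.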

jreback added the Bug, Groupby, and Categorical labels Mar 6, 2015
jreback added this to the 0.16.1 milestone Mar 6, 2015

jreback (Contributor) commented Mar 6, 2015

I recall a similar issue, but can't find it at the moment.

You can use these instead (they are much faster anyhow):

In [6]: df.groupby(pandas.cut(df.a, [0, 1, 2, 3, 4])).size()
Out[6]: 
a
(0, 1]     1
(1, 2]   NaN
(2, 3]   NaN
(3, 4]   NaN
dtype: float64

In [7]: df.groupby(pandas.cut(df.a, [0, 1, 2, 3, 4])).count()
Out[7]: 
        a
a        
(0, 1]  1
(1, 2]  0
(2, 3]  0
(3, 4]  0
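
As a hedged sketch of the same idea on more recent pandas releases (an assumption; the thread itself predates them), per-bin counts can be obtained without `apply`. The `observed=False` argument exists only in later versions of `groupby` and asks it to keep categories that have no rows. The names `binned`, `counts`, and `sizes` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 0, 0, 0]})
binned = pd.cut(df["a"], [0, 1, 2, 3, 4])

# value_counts on a categorical Series reports every category,
# including those with no rows.
counts = binned.value_counts(sort=False)

# observed=False keeps unobserved categories as (empty) groups.
sizes = df.groupby(binned, observed=False).size()

print(counts)
print(sizes)
```

Both results have one entry per bin, and only a single row is counted in total.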

jreback (Contributor) commented Apr 29, 2015

Closed by #10014.

jreback closed this as completed Apr 29, 2015