qcut user-reported failure #1978

@wesm

From the pystatsmodels mailing list:

qcut in pandas 0.8.1 is failing for some quantile lists but not others (see below). Sorry if I'm missing something...

type(F.g)

Out[3]:
pandas.core.series.TimeSeries

chg = (F.g[20:]-F.g[20:].shift(1))
fac = qcut(chg, [0, .5, 1])
fac

Out[4]:
Categorical: g
array([nan, [-1005094.81, 0], [-1005094.81, 0], ..., (0, 1478547.3],
       (0, 1478547.3], (0, 1478547.3]], dtype=object)
Levels (2): Index([[-1005094.81, 0], (0, 1478547.3]], dtype=object)

fac = qcut(chg, [0, .5, .75, 1])
fac

Out[6]:
Categorical: g
array([nan, [-1005094.81, 0], [-1005094.81, 0], ..., (0, 1478547.3],
       (0, 1478547.3], (0, 1478547.3]], dtype=object)
Levels (3): Index([[-1005094.81, 0], (0, 0], (0, 1478547.3]], dtype=object)
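The `(0, 0]` level above suggests that two neighbouring quantile edges coincide. A minimal sketch, using made-up data (the values below are hypothetical, not taken from `F.g`) and a hand-rolled linear-interpolation quantile rather than pandas' own routine, shows how that can happen when many of the changes are exactly zero:

```python
def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list (illustrative helper)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * frac

# Hypothetical stand-in for chg: most period-to-period changes are zero.
sample = sorted([-5.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 7.0])

# The median and the 75% quantile both land on 0, so two edges are equal.
edges = [quantile(sample, q) for q in [0, 0.5, 0.75, 1]]
print(edges)  # [-5.0, 0.0, 0.0, 7.0]
```

With edges like these, the middle bin is the empty interval `(0, 0]`, which matches the three levels printed in Out[6].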

fac = qcut(chg, [0, .25, .5, .75, 1])
fac

Out[7]:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/home/birone/<ipython-input-9-3896feb36dcd> in <module>()
----> 1 fac = qcut(chg, [0, .25, .5, .75, 1])
      2 fac

/usr/lib/pymodules/python2.7/pandas/tools/tile.pyc in qcut(x, q, labels, retbins, precision)
    139     bins = algos.quantile(x, quantiles)
    140     return _bins_to_cuts(x, bins, labels=labels, retbins=retbins,
--> 141                          precision=precision, include_lowest=True)
    142 
    143 

/usr/lib/pymodules/python2.7/pandas/tools/tile.pyc in _bins_to_cuts(x, bins, right, labels, retbins, precision, name, include_lowest)
    177         levels = np.asarray(levels, dtype=object)
    178         np.putmask(ids, na_mask, 0)
--> 179         fac = Categorical(ids - 1, levels, name=name)
    180     else:
    181         fac = ids - 1

/usr/lib/pymodules/python2.7/pandas/core/categorical.pyc in __init__(self, labels, levels, name)
     43     def __init__(self, labels, levels, name=None):
     44         self.labels = labels
---> 45         self.levels = levels
     46         self.name = name
     47 

/usr/lib/pymodules/python2.7/pandas/core/categorical.pyc in _set_levels(self, levels)
     62         levels = _ensure_index(levels)
     63         if not levels.is_unique:
---> 64             raise ValueError('Categorical levels must be unique')
     65         self._levels = levels
     66 

ValueError: Categorical levels must be unique

fac = qcut(chg, n) fails with the same error for n > 3.
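A plausible reading of the traceback, assuming the 25%, 50% and 75% quantiles of `chg` are all 0, is that the bin labels built from the edges collide. The sketch below uses a hypothetical `interval_labels` helper and made-up edges (it is not the code in `tile.py`) to show the collision and one possible workaround, which is to drop repeated edges before cutting:

```python
def interval_labels(edges):
    """Build '[a, b]' / '(a, b]' labels from consecutive bin edges (illustrative helper)."""
    labels = []
    for i in range(len(edges) - 1):
        left = "[" if i == 0 else "("  # the lowest bin is closed on the left
        labels.append("%s%g, %g]" % (left, edges[i], edges[i + 1]))
    return labels

# Hypothetical quartile edges where the 25%, 50% and 75% quantiles are all 0.
edges = [-5.0, 0.0, 0.0, 0.0, 7.0]

labels = interval_labels(edges)
has_duplicates = len(set(labels)) != len(labels)
print(labels)          # ['[-5, 0]', '(0, 0]', '(0, 0]', '(0, 7]']
print(has_duplicates)  # True

# Possible workaround: keep each edge once, then build the bins.
unique_edges = sorted(set(edges))
print(interval_labels(unique_edges))  # ['[-5, 0]', '(0, 7]']
```

Note that deduplicating the edges yields fewer bins than requested (two instead of four here), so it changes the result rather than just silencing the error. Later pandas releases expose this choice through a `duplicates` argument to `qcut`.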
