Closed
Labels: Error Reporting (incorrect or improved errors from pandas), IO Stata (read_stata, to_stata)
Description
I am trying to open some Stata files generated by IPUMS International, but I am getting the error "ValueError: Categorical categories must be unique". I opened the file in Stata and could not find a repeated category in the column I am trying to import. I had similar issues with other datasets from the same source, which seemed to be caused by missing values, but that does not seem to be the case here. Here's the link to the file I am trying to read.
Code Sample, a copy-pastable example if possible
import pandas as pd
df = pd.read_stata('ipumsi_00014.dta', columns=['ethnicsn'])
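One way to narrow this down, as a minimal sketch rather than a confirmed diagnosis: read the column without converting it to a Categorical, then inspect the file's value labels for label text that is attached to more than one code. This assumes the error comes from such repeated label text, which has not been verified for this file. The helper names find_duplicate_labels and load_codes_and_labels, the label-set name ethnicsn_lbl, and the example labels are all hypothetical; read_stata's convert_categoricals and iterator options and StataReader.value_labels() are existing pandas calls.

```python
def find_duplicate_labels(value_labels):
    """Return {label_set: {label_text: [codes]}} for label text used by more than one code."""
    duplicates = {}
    for label_set, mapping in value_labels.items():
        codes_by_text = {}
        for code, text in mapping.items():
            codes_by_text.setdefault(text, []).append(code)
        repeated = {
            text: sorted(codes)
            for text, codes in codes_by_text.items()
            if len(codes) > 1
        }
        if repeated:
            duplicates[label_set] = repeated
    return duplicates


def load_codes_and_labels(path, column):
    """Read raw codes plus the file's value labels (needs pandas and the .dta file)."""
    import pandas as pd  # imported lazily so the helper above runs without pandas

    # convert_categoricals=False skips building a Categorical, so the raw codes are kept
    df = pd.read_stata(path, columns=[column], convert_categoricals=False)
    with pd.read_stata(path, iterator=True) as reader:
        labels = reader.value_labels()  # {label_set_name: {code: label_text}}
    return df, labels


# Made-up labels for illustration only; not taken from the IPUMS file
example_labels = {
    "ethnicsn_lbl": {1: "Group A", 2: "Group B", 98: "Unknown", 99: "Unknown"}
}
print(find_duplicate_labels(example_labels))
# {'ethnicsn_lbl': {'Unknown': [98, 99]}}
```

With the real file, something like `df, labels = load_codes_and_labels('ipumsi_00014.dta', 'ethnicsn')` followed by `find_duplicate_labels(labels)` would show whether any label text is shared between codes. An empty result would suggest the cause lies elsewhere.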
Expected Output
df.shape == (1694761, 1)
output of pd.show_versions()
commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Darwin
OS-release: 13.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 18.2
Cython: 0.22.1
numpy: 1.11.1
scipy: 0.15.1
statsmodels: 0.6.1
xarray: None
IPython: 3.2.1
sphinx: None
patsy: 0.2.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 0.8.0
tables: None
numexpr: 2.4
matplotlib: 1.4.3
openpyxl: 2.1.3
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: 0.6.4
lxml: 3.3.5
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 0.9.8
pymysql: None
psycopg2: 2.5.3 (dt dec pq3 ext)
jinja2: 2.7.3
boto: 2.34.0
pandas_datareader: None