read_csv skips rows with value 0 if having initial space #9710

Closed
aarimond opened this issue Mar 23, 2015 · 4 comments
@aarimond

Hi,

I have something like the following csv file:

MyColumn
   0
   1
   0
   1

Note the initial space in each row.
After upgrading from 0.14.1 to 0.16, I noticed that read_csv started dropping the rows containing 0:

In [28]: import pandas
In [29]: from StringIO import StringIO
In [30]: data = 'MyColumn\n   0\n   1\n   0\n   1'
In [31]: pandas.read_csv(StringIO(data))
Out[31]:
   MyColumn
0         1
1         1

skipinitialspace=True did not help:

In [32]: pandas.read_csv(StringIO(data), skipinitialspace=True)
Out[32]:
   MyColumn
0         1
1         1

However, skip_blank_lines=False does help:

In [34]: pandas.read_csv(StringIO(data), skip_blank_lines=False)
Out[34]:
   MyColumn
0         0
1         1
2         0
3         1

Not sure if this is working as intended.

Cheers,
Alex

PS:
Having a second column works as expected:

In [40]: data = 'MyColumn,SecondColumn\n   0, 2\n   1, 3\n   0, 0\n   1, 4'
In [41]: pandas.read_csv(StringIO(data))
Out[41]:
   MyColumn  SecondColumn
0         0             2
1         1             3
2         0             0
3         1             4

UPDATE:
Made the code more reproducible.
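
For the single-column case above, one possible workaround besides skip_blank_lines=False is to strip the leading whitespace from each line before the text reaches read_csv. This is only a sketch: strip_leading_whitespace is a hypothetical helper, not part of pandas, and it assumes the file has no quoted fields spanning several lines.

```python
def strip_leading_whitespace(text):
    """Hypothetical helper (not a pandas API): drop spaces/tabs at the start of each line."""
    return "\n".join(line.lstrip(" \t") for line in text.splitlines())


data = 'MyColumn\n   0\n   1\n   0\n   1'
cleaned = strip_leading_whitespace(data)
print(repr(cleaned))

# The cleaned text could then be handed to the parser, for example:
#   pandas.read_csv(StringIO(cleaned))
```

Only whitespace at the start of a line is removed; whitespace after a delimiter is left for skipinitialspace to deal with.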

@jreback
Contributor

jreback commented Mar 23, 2015

pls read the section / warning here: http://pandas.pydata.org/pandas-docs/stable/io.html#ignoring-line-comments-and-empty-lines

When you upgrade thru multiple major versions, it IS necessary to read the api changes sections.

@jreback closed this as completed Mar 23, 2015
@jreback added the "IO CSV" (read_csv, to_csv) and "Usage Question" labels Mar 23, 2015
@aarimond
Author

Thanks for your comment.

I did notice the API change regarding ignoring empty lines and comment lines.
However, the csv input above contains neither comments nor empty/blank lines (as far as I can tell).
I'm not using the parameters comment, header or skiprows, which could confuse things.
I think the problem is the leading whitespace.

Another example with leading whitespace:

MyColumn
  1
  2
  3
  4
  5
  6
  7
  8

read_csv will skip every 2nd line/row.

In [124]: data = 'MyColumn\n 1\n 2\n 3\n 4\n 5\n 6\n 7\n 8'

In [125]: pandas.read_csv(StringIO(data))
Out[125]:
   MyColumn
0         2
1         4
2         6
3         8

I admit that a single column with leading whitespace is a rather special use case.
With multiple columns in the csv, it works.
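
A quick way to notice dropped rows like the ones above is to count the non-blank data lines in the raw text and compare that number with the length of the parsed frame. A minimal sketch, assuming a single header line and one record per line; count_data_rows is a hypothetical helper, not a pandas function.

```python
def count_data_rows(text, header_lines=1):
    """Hypothetical helper: count lines after the header that are not empty or whitespace-only."""
    lines = text.splitlines()[header_lines:]
    return sum(1 for line in lines if line.strip())


data = 'MyColumn\n 1\n 2\n 3\n 4\n 5\n 6\n 7\n 8'
expected_rows = count_data_rows(data)
print(expected_rows)  # 8

# Comparing against the parser's result would reveal the mismatch, for example:
#   len(pandas.read_csv(StringIO(data))) == expected_rows
```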

@jreback
Contributor

jreback commented Mar 23, 2015

@aarimond can you update the top section with the examples that don't work (and make them self-reproducing), like your last example here [124]? That will make it clearer exactly what doesn't work.

@jreback
Contributor

jreback commented Apr 28, 2015

closed by #9837

@jreback closed this as completed Apr 28, 2015