twoertwein
Member

@twoertwein twoertwein commented Dec 2, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Allows (and encourages) the following use:

import pandas as pd

filename = "pandas/tests/io/data/csv/iris.csv"
chunksize = 2
with pd.read_csv(filename, chunksize=chunksize) as reader:
    for chunk in reader:
        # risky code that might raise
        print(chunk.shape)

The same can be done for read_json and read_sas (I think these are all the readers that support chunksize). If this PR should make it into 1.2, I can quickly add the changes for json/sas as well.
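A minimal sketch of the same pattern with read_json, assuming a pandas version that already includes the json/sas follow-up (1.2 or later). The in-memory `json_lines` buffer and the `rows_seen` counter are made up for illustration; `lines=True` is needed because read_json only accepts chunksize for line-delimited JSON.

```python
import io

import pandas as pd

# Hypothetical in-memory data standing in for a JSON Lines file.
json_lines = io.StringIO(
    '{"a": 1, "b": 2}\n'
    '{"a": 3, "b": 4}\n'
    '{"a": 5, "b": 6}\n'
)

rows_seen = 0
# The reader is closed when the with-block exits, even if the loop body raises.
with pd.read_json(json_lines, lines=True, chunksize=2) as reader:
    for chunk in reader:
        rows_seen += len(chunk)
```

With three records and chunksize=2, the loop sees one chunk of two rows and one chunk of one row.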

Are there more places to promote this new context manager?

@jreback jreback added the IO CSV read_csv, to_csv label Dec 2, 2020
@jreback
Contributor

jreback commented Dec 2, 2020

this looks pretty neat. can you add a small note section in io.rst which shows this off? i think this is ok for 1.2; json/sas can be added as a follow-up.

@twoertwein twoertwein changed the title ENH: context-manager for TextFileReader ENH: context-manager for TextFile/JSON/SASReader Dec 2, 2020
@twoertwein twoertwein marked this pull request as ready for review December 2, 2020 07:37
@twoertwein
Member Author

twoertwein commented Dec 2, 2020

I don't understand why the documentation example is failing:

Exception in /home/runner/work/pandas/pandas/doc/source/user_guide/io.rst at block ending on line 1586
ParserError: Error tokenizing data. C error: out of memory

edit: the blank line was causing the issue; it was interpreted as the end of the with-block
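For illustration, the standard-library interactive console follows the same rule: a blank line terminates a compound statement. This is only an analogy. It assumes the ipython blocks in the docs treat blank lines the way an interactive session does, and it does not reproduce the out-of-memory error itself.

```python
import code

console = code.InteractiveConsole()

# push() returns True while the console is still waiting for more input.
states = [
    console.push("import io"),
    console.push("with io.StringIO('a,b') as handle:"),
    console.push("    header = handle.read()"),
    console.push(""),  # the blank line ends the with-block and runs it
]

# The with-block has already finished, so the handle is closed by now.
block_closed = console.locals["handle"].closed
```

Any indented line pushed after the blank line would no longer belong to the with-block, which matches the failure mode described above.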

@twoertwein twoertwein changed the title ENH: context-manager for TextFile/JSON/SASReader ENH: context-manager for chunksize/iterator-reader Dec 3, 2020
Contributor

@jreback jreback left a comment

great. just some minor doc comments. ping on green.

@jreback jreback added this to the 1.2 milestone Dec 4, 2020
@twoertwein
Member Author

@jreback green. I hope the whatsnew entry is good now

@jreback jreback merged commit 5011a37 into pandas-dev:master Dec 4, 2020
@jreback
Contributor

jreback commented Dec 4, 2020

thanks @twoertwein very nice

Labels
IO CSV read_csv, to_csv

2 participants