Closed
Labels
API - Consistency, Enhancement, IO Parquet, Needs Discussion
Description
Is your feature request related to a problem?
I find it useful to write a DataFrame as Parquet to a bytes object for some unit tests. The code that I currently use to do this is quite verbose.
To provide some background, df.to_csv() (w/o args) just works. It returns a str object, as expected. In the same vein, df.to_parquet() (w/o args) should return a bytes object.
More precisely, the current behavior is:
>>> df = pd.DataFrame()
>>> type(df.to_csv()) # This works
<class 'str'>
>>> df.to_parquet() # This should be made to work
Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: to_parquet() missing 1 required positional argument: 'path'
Describe the solution you'd like
The requested behavior is:
>>> df = pd.DataFrame()
>>> type(df.to_parquet())
<class 'bytes'>
Other uses of df.to_parquet should obviously remain unaffected.
API breaking implications
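It won't break the documented API: path is currently a required argument, so every call that works today already passes it and would behave as before.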
Describe alternatives you've considered
I currently use this verbose code to get what I want:
import io
import pandas as pd
df = pd.DataFrame()
pq_file = io.BytesIO()
df.to_parquet(pq_file)
pq_bytes = pq_file.getvalue()
This workaround takes too much effort for such a simple task.
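In the meantime, the buffer handling can be wrapped once in a small test helper. This is only a minimal sketch and not part of pandas: write_to_bytes and write_magic are hypothetical names, and the stand-in writer is there only so the snippet runs without a Parquet engine installed.

```python
import io
from typing import Any, Callable


def write_to_bytes(writer: Callable[..., Any], *args: Any, **kwargs: Any) -> bytes:
    """Call writer(buffer, *args, **kwargs) and return the bytes it wrote.

    Hypothetical helper: it assumes the writer accepts a binary file-like
    object as its first positional argument.
    """
    buffer = io.BytesIO()
    writer(buffer, *args, **kwargs)
    return buffer.getvalue()


# Stand-in writer (hypothetical) so the sketch runs without pandas.
def write_magic(target: io.BytesIO) -> None:
    target.write(b"PAR1")


payload = write_to_bytes(write_magic)
```

In a unit test this would be called as write_to_bytes(df.to_parquet), assuming a Parquet engine such as pyarrow is installed and that the engine leaves the buffer open after writing. It shortens the call site, but still leaves to_parquet inconsistent with to_csv.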