Unstack is not careful about potential memory use problems #2278

@wesm

Cartesian product problem: unstack sizes its work by every hypothetical combination of the index levels instead of the combinations actually observed in the data. I'm claiming this unless someone else wants to look inside the reshape code.

import pandas as pd
import numpy as np

# Generate Long File & Test Pivot
NUM_ROWS = 1000000

df = pd.DataFrame({
    'A': np.random.randint(100, size=NUM_ROWS),
    'B': np.random.randint(300, size=NUM_ROWS),
    'C': np.random.randint(-7, 7, size=NUM_ROWS),
    'D': np.random.randint(-19, 19, size=NUM_ROWS),
    'E': np.random.randint(3000, size=NUM_ROWS),
    'F': np.random.randn(NUM_ROWS),
})

# Pivot E into columns; older pandas spelled these keywords rows= and cols=.
df_pivoted = df.pivot_table(index=['A', 'B', 'C'], columns='E', values='F')
df_pivoted
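
To put rough numbers on the gap between hypothetical and observed combinations, here is a minimal sketch. The helper `unstack_cell_counts` is a hypothetical name made up for illustration. It is not part of pandas and does not reproduce what the reshape code does internally. It only counts cells under three different assumptions about what gets materialized.

```python
import math

import pandas as pd


def unstack_cell_counts(df, index_cols, column_col):
    """Hypothetical helper: count result cells under three assumptions."""
    n_columns = int(df[column_col].nunique())
    # Every hypothetical index combination: product of per-level cardinalities.
    hypothetical_rows = math.prod(int(df[c].nunique()) for c in index_cols)
    # Only the index combinations that actually occur in the data.
    observed_rows = len(df[index_cols].drop_duplicates())
    # Distinct (index..., column) combinations, i.e. cells that hold a value.
    filled_cells = len(df[index_cols + [column_col]].drop_duplicates())
    return {
        "hypothetical_cells": hypothetical_rows * n_columns,
        "observed_row_cells": observed_rows * n_columns,
        "filled_cells": filled_cells,
    }


# Tiny deterministic frame: three rows, each on its own (A, B, E) combination.
small = pd.DataFrame({
    "A": [0, 1, 2],
    "B": [0, 1, 2],
    "E": [0, 1, 2],
    "F": [1.0, 2.0, 3.0],
})
counts = unstack_cell_counts(small, ["A", "B"], "E")
print(counts)
```

On the three-row frame this gives 27 hypothetical cells, 9 cells if only observed (A, B) rows are kept, and 3 cells that actually hold a value. For the frame in this report, assuming every value in each range shows up, the hypothetical count is 100 × 300 × 14 × 3000 = 1.26 billion cells, roughly 10 GB as float64, while at most 1,000,000 cells can hold a value.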

Metadata

Labels

Bug; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
