0% found this document useful (0 votes)

15 views4 pages

Cheat Sheet Pandas

This document is a comprehensive cheat sheet for the Pandas library, covering various functionalities such as importing modules, loading and saving CSV files, converting datatypes, selecting and reshaping data, and performing operations on DataFrames. It includes code snippets and explanations for tasks like selecting rows and columns, adding new columns, renaming columns, and merging DataFrames. The cheat sheet serves as a quick reference for users to efficiently utilize Pandas for data manipulation and analysis.

Uploaded by

brhanegebregn

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

15 views4 pages

Cheat Sheet Pandas

Uploaded by

brhanegebregn

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 4

Pandas Cheat Sheet

by Justin1209 (Justin1209) via cheatography.com/101982/cs/21202/

Import the Pandas Module Loading and Saving CSVs (cont) Converting Datatypes

import pandas as pd > # Get the first DataFrame chunk: # Convert argument to numeric type
df_urb_pop pandas.to_numeric(arg, errors="ra‐
Create a DataFrame df_urb_pop = next(urb_pop_reader) ise")

# Method 1 errors:
Inspect a DataFrame "raise" -> raise an exception
df1 = pd.DataFrame({
df.head(5) First 5 rows "coerce" -> invalid parsing will be set as NaN
'name':
['John Smith',
'Jane Doe'], df.info() Statistics of columns (row
DataFrame for Select Columns / Rows
'address':
['13 Main St.', count, null values, datatype)
'46 Maple Ave.'], df = pd.DataFrame([

'age':
[34, 28] Reshape (for Scikit) ['January', 100, 100, 23,

}) 100],
nums = np.array(range(1, 11))
# Method 2 ['February', 51, 45, 145,
-> [ 1 2 3 4 5 6 7 8 9 10]
df2 = pd.DataFrame([ 45],
nums = nums.reshape(-1, 1)
['John Smith', '123 Main ['March', 81, 96, 65, 96],
-> [ [1],
St.', 34], ['April', 80, 80, 54, 180],
[2],
['Jane Doe', '456 Maple ['May', 51, 54, 54, 154],
[3],
Ave.', 28], ['June', 112, 109, 79,
[4],
['Joe Schmo', '9 129]],
[5],
Broadway', 51] columns=['month',
[6],
], 'east', 'north', 'south',
[7],
columns=['name',
'address', 'west']
[8],
'age']) )
[9],
[10]]
Loading and Saving CSVs Select Columns
You can think of reshape() as rotating this
# Load a CSV File in to a # Select one Column
array. Rather than one big row of numbers,
clinic_north = df.north
DataFrame nums is now a big column of numbers -
df = pd.read_csv('my-csv- there’s one number in each row. --> Reshape values for Scikit

file.csv') learn: clinic_north.values.re‐

# Saving DataFrame to a CSV File shape(-1, 1)

df.to_csv('new-csv-fi‐ # Select multiple Columns

le.csv') clinic_north_south
= df[['n‐

# Load DataFrame in Chunks (For orth', 'south']]

large Datasets) Make sure that you have a double set of

# Initialize reader object: brackets [[ ]], or this command won’t work!
urb_pop_reader
urb_pop_reader = pd.read_c‐
sv('ind_pop_data.csv',
chunksize=1000)

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

cheatography.com/justin1209/ Last updated 31st January, 2020. Measure your website readability!
Page 1 of 4. https://readable.com
Pandas Cheat Sheet
by Justin1209 (Justin1209) via cheatography.com/101982/cs/21202/

Select Rows Adding a Column Performing Column Operation (cont)

# Select one Row df = pd.DataFrame([ > -> lower, upper

march = df.iloc[2] [1, '3 inch screw', 0.5, # Perform a lambda Operation on a Column
# Select multiple Rows 0.75], get_last_name = lambda x: x.split(" ")[-1]
jan_feb_march = df.iloc[:3] [2, '2 inch nail', 0.10, df['last_name'] = df.Name.apply(get_last_‐
feb_march_april = df.ilo‐ 0.25], name)

c[1:4] [3, 'hammer', 3.00, 5.50],

Performing a Operation on Multiple
may_june = df.iloc[-2:] [4, 'screwdriver', 2.50,
Columns
# Select Rows with Logic 3.00]
january = df[df.month == ], df = pd.DataFrame([
'January'] columns=['Product ID', ["Apple", 1.00, "No"],
-> <, >, <=, >=, !=, == 'Description', 'Cost to ["Milk", 4.20, "No"],
march_april = df[(df.month == Manufacture', 'Price'] ["Paper Towels", 5.00, "‐
'March') | (df.month == ) Yes"],
'April')] # Add a Column with specified ["Light Bulbs", 3.75, "‐
-> &, | row-values Yes"],
january_february_march = df['Sold in Bulk?'] = ['Yes', ],
df[df.month.isin(['January', 'Yes', 'No', 'No'] columns=["Item", "‐
'February', 'March'])] # Add a Column with same value Price", "Is taxed?"])
-> column_name.isin([" ", " in every row # Lambda Function
"]) df['Is taxed?'] = 'Yes' df['Price with Tax'] = df.app‐
# Add a Column with calculation ly(lambda row:
Selecting a Subset of a Dataframe often
df['Revenue'] = df['Price'] - row['Price'] * 1.075
results in non-consecutive indices.
df['Cost to Manufacture'] if row['Is taxed?'] ==

Using .reset_index() will create a new 'Yes'

DataFrame move the old indices into a new Performing Column Operation else row['Price'],
colum called index. df = pd.DataFrame([
axis=1

)
['JOHN SMITH', 'john.smi‐
Use .reset_index(drop=True) if you dont th@gmail.com'], We apply a lambda to rows, as opposed to
need the index column. columns, when we want to perform functi‐
['Jane Doe', 'jdoe@yah‐
Use .reset_index(inplace=True) to prevent a onality that needs to access more than one
oo.com'],
new DataFrame from brein created. column at a time.
['joe schmo', 'joeschmo‐
@hotmail.com']
],
columns=['Name', 'Email'])
# Changing a column with an
Operation
df['Name'] = df.Name. apply(‐
lower)

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

cheatography.com/justin1209/ Last updated 31st January, 2020. Measure your website readability!
Page 2 of 4. https://readable.com
Pandas Cheat Sheet
by Justin1209 (Justin1209) via cheatography.com/101982/cs/21202/

Rename Columns Column Statistics Pivot Tables

# Method 1 Mean = Average df.column.mean() orders =

df.columns = ['NewName_1', Median df.column.median() pd.read_csv('orders.csv')
'NewName_2, 'NewName_3', shoe_counts = orders.
Minimal Value df.column.min()
'...'] groupby(['shoe_type',
Maximum Value df.column.max()
# Method 2 'shoe_color']).
Number of Values df.column.count()
df.rename(columns={ id.count().reset_index()
'OldName_1': 'NewNa‐ Unique Values df.column.nunique() shoe_counts_pivot = shoe_c‐
me_1', Standard Deviation df.column.std() ounts.pivot(
'OldName_2': 'NewNa‐ List of Unique Values df.column.unique() index = 'shoe_type',
me_2' columns = 'shoe_color',
Dont't forget reset_index() at the end of a
}, inplace=True
) values = 'id').reset_index()
groupby operation
Using inplace=True lets us edit the original We have to build a temporary table where
DataFrame. Calculating Aggregate Functions we group by the columns we want to

# Group By include in the pivot table

Series vs. Dataframes
grouped = df. groupby(['col1',
Merge (Same Column Name)
# Dataframe and Series 'col2']).col3
print(type(clinic_north)): .measurement()
. reset_index() sales = pd.read_csv('sales.csv')
# <class 'pandas.core.series.Series'> # -> group by column1 and targets = pd.read_csv('ta‐
print(type(df)): column2, calculate values of rgets.csv')
# <class 'pandas.core.frame.DataFrame'> column3 men_women = pd.read_csv('me‐
print(type(clinic_north_south)) n_women_sales.csv')
# Percentile
# <class 'pandas.core.frame.DataFrame'> # Method 1
high_earners = df.groupby('‐
In Pandas category').wage sales_targets = pd.merge(sales,
- a series is a one-dimensional object that apply(lambda
. x: np.per‐ targets, how=" ")
contains any type of data. centile(x, 75)) # how: "inner"(default), "‐
reset_index()
. outer", "left", "right"
- a dataframe is a two-dimensional object #Method 2 (Method Chaining)
# np.percentile can calculate
that can hold multiple columns of different all_data =
any percentile over an array of
types of data.
values sales.merge(targets).merge(men_w
omen)
A single column of a dataframe is a series, Don't forget reset.index()
and a dataframe is a container of two or
more series objects.

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

cheatography.com/justin1209/ Last updated 31st January, 2020. Measure your website readability!
Page 3 of 4. https://readable.com
Pandas Cheat Sheet
by Justin1209 (Justin1209) via cheatography.com/101982/cs/21202/

Inner Merge (Different Column Name) Melt

orders = pandas.melt(DataFrame, id_vars, value_vars, var_name, value_name='‐

pd.read_csv('orders.csv') value')
products = pd.read_csv('pr‐ id_vars: Column(s) to use as identifier variables.
oducts.csv') value_vars: Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
# Method 1: Rename Columns var_name: Name to use for the ‘variable’ column.
orders_products = value_name: Name to use for the ‘value’ column.
pd.merge(orders, Unpivot a DataFrame from wide to long
products.rename(columns={'i‐ format, optionally leaving identifiers set.
d':'product_id'}), how=" ")
.reset_index() Assert Statements
# how: "inner"(default), "‐ # Test if country is of type
outer", "left", "right" object
# Method 2: assert gapminder.country.d‐
orders_products = types == np.object
pd.merge(orders, products, # Test if year is of type int64
left_on="pr‐
assert gapminder.year.dtypes
oduct_id", == np.int64
right_on="id",
# Test if life_expectancy is
suffixes=["_‐
of type float64
orders","_products"]) assert gapminder.life_exp‐
Method 2: ectancy.dtypes == np.float64
If we use this syntax, we’ll end up with two # Assert that country does not
columns called id. contain any missing values
Pandas won’t let you have two columns assert pd.notnull(gapmind‐
with the same name, so it will change them er.country).all()
to id_x and id_y. # Assert that year does not
We can help make them more useful by contain any missing values
using the keyword suffixes. assert pd.notnull(gapmind‐
er.year).all()
Concatenate

bakery =
pd.read_csv('bakery.csv')
ice_cream = pd.read_csv('ic‐
e_cream.csv')
menu = pd.concat([bakery,
ice_cream])

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

cheatography.com/justin1209/ Last updated 31st January, 2020. Measure your website readability!
Page 4 of 4. https://readable.com

Deploying ML Production (Flask - API)
No ratings yet
Deploying ML Production (Flask - API)
27 pages
Online Loan Application and Verification 2016
No ratings yet
Online Loan Application and Verification 2016
34 pages
Sap Workflow For Beginners Step by Step
86% (7)
Sap Workflow For Beginners Step by Step
5 pages
EC8661 VLSI Design Lab Manual
100% (3)
EC8661 VLSI Design Lab Manual
76 pages
Troubleshooting MW Ceragon
No ratings yet
Troubleshooting MW Ceragon
19 pages
Pandas Cheat Sheet CN
No ratings yet
Pandas Cheat Sheet CN
4 pages
History of Computers
No ratings yet
History of Computers
43 pages
ServiceNow Overview Brochure
No ratings yet
ServiceNow Overview Brochure
6 pages
Man User DPS232
No ratings yet
Man User DPS232
110 pages
Cheat Sheet
No ratings yet
Cheat Sheet
10 pages
Cryptography and Network Security: Fifth Edition by William Stallings
No ratings yet
Cryptography and Network Security: Fifth Edition by William Stallings
22 pages
Cheat Sheet: The Pandas Dataframe Object: Preliminaries Get Your Data Into A Dataframe
100% (1)
Cheat Sheet: The Pandas Dataframe Object: Preliminaries Get Your Data Into A Dataframe
10 pages
Pandas DataFrame Notes
67% (3)
Pandas DataFrame Notes
13 pages
S CSP As PRVD Cert As Fwork Faq
No ratings yet
S CSP As PRVD Cert As Fwork Faq
11 pages
Enabled Secure Boot Issues With Schneider Electric PLC USB Driver & Unitelway Driver
No ratings yet
Enabled Secure Boot Issues With Schneider Electric PLC USB Driver & Unitelway Driver
7 pages
Buku Teks BM Tahun 4 KSSR Semakan
No ratings yet
Buku Teks BM Tahun 4 KSSR Semakan
102 pages
The First Step To Linux Part 1 - The Basic Commands
No ratings yet
The First Step To Linux Part 1 - The Basic Commands
43 pages
Python Libraries Cheat Sheets
No ratings yet
Python Libraries Cheat Sheets
6 pages
Pandas Cheat Sheet
100% (1)
Pandas Cheat Sheet
2 pages
Python Interviews
No ratings yet
Python Interviews
154 pages
Pandas Cheat Sheet
100% (2)
Pandas Cheat Sheet
6 pages
Logrhythm Na Alphabet Soup Making Sense of XDR Edr NDR and Siem e Book
No ratings yet
Logrhythm Na Alphabet Soup Making Sense of XDR Edr NDR and Siem e Book
12 pages
GEN - Instructions For Remotely Proctored Testing - EN
No ratings yet
GEN - Instructions For Remotely Proctored Testing - EN
15 pages
Pandas DataFrame Notes
100% (1)
Pandas DataFrame Notes
10 pages
Pandas Cheat Sheet
100% (4)
Pandas Cheat Sheet
2 pages
Introduction To Data Science - Unit-1
No ratings yet
Introduction To Data Science - Unit-1
9 pages
Microsoft Powerbi Connection Adw
No ratings yet
Microsoft Powerbi Connection Adw
11 pages
Pandas - Cheatsheet
No ratings yet
Pandas - Cheatsheet
4 pages
Python Pandas Demo PDF
100% (2)
Python Pandas Demo PDF
23 pages
Cloud Computing - Everything Is A Service
No ratings yet
Cloud Computing - Everything Is A Service
11 pages
Cpe Reviewer
No ratings yet
Cpe Reviewer
5 pages
Blueprint For Supply Chain Visibility
No ratings yet
Blueprint For Supply Chain Visibility
16 pages
Cheat Sheet: The Pandas Dataframe Object: Preliminaries Get Your Data Into A Dataframe
100% (1)
Cheat Sheet: The Pandas Dataframe Object: Preliminaries Get Your Data Into A Dataframe
12 pages
Users Perception of Cloud Based Accounting Software
No ratings yet
Users Perception of Cloud Based Accounting Software
19 pages
Damson University College of Engineering Computer Engineering Department
No ratings yet
Damson University College of Engineering Computer Engineering Department
5 pages
Rev-Trac One: Accelerate SAP Change
No ratings yet
Rev-Trac One: Accelerate SAP Change
3 pages
Januarius T. Manipol - Profile - PDF - 03152024
No ratings yet
Januarius T. Manipol - Profile - PDF - 03152024
4 pages
Your Personalised Guide To Taking RSL Graded EXAMS Via Video Submission
No ratings yet
Your Personalised Guide To Taking RSL Graded EXAMS Via Video Submission
9 pages
Chapter 2 Python Pandas - II
No ratings yet
Chapter 2 Python Pandas - II
19 pages
Meet Raval Resume
No ratings yet
Meet Raval Resume
2 pages
OSI Model Notes
No ratings yet
OSI Model Notes
3 pages
Measurable Usability Goals Template
No ratings yet
Measurable Usability Goals Template
3 pages
Pandas Cheat Sheet
85% (13)
Pandas Cheat Sheet
2 pages
! Diet Problem Given in The Note : Model Title
No ratings yet
! Diet Problem Given in The Note : Model Title
2 pages
Panda Cheatsheet
No ratings yet
Panda Cheatsheet
17 pages
Pandas Cheat Sheet
No ratings yet
Pandas Cheat Sheet
19 pages
Lvpei BD Call 090321
No ratings yet
Lvpei BD Call 090321
1 page
12 Pandas
100% (1)
12 Pandas
21 pages
GR12 Record Programs 6TH Onwards
No ratings yet
GR12 Record Programs 6TH Onwards
18 pages
Lab 1 ML Lab
No ratings yet
Lab 1 ML Lab
15 pages
Pandas DataFrame Notes
No ratings yet
Pandas DataFrame Notes
13 pages
Pandas Cheat Sheet Free Resources At: Dataquest - Io/guide
No ratings yet
Pandas Cheat Sheet Free Resources At: Dataquest - Io/guide
7 pages
Data Science Cheat Sheet: KEY Imports
100% (1)
Data Science Cheat Sheet: KEY Imports
1 page
Pandas in Python
No ratings yet
Pandas in Python
59 pages
DAP 3 Module
No ratings yet
DAP 3 Module
62 pages
Unit 1 Python Pandas
No ratings yet
Unit 1 Python Pandas
20 pages
Pandas Introduction: What Is Python Pandas Used For?
No ratings yet
Pandas Introduction: What Is Python Pandas Used For?
28 pages
3Y3Z2Xzqn7 U Y%K : 2. How To Create A Data Frame Using A Dictionary of Pre-Existing Columns or Numpy 2D Arrays?
No ratings yet
3Y3Z2Xzqn7 U Y%K : 2. How To Create A Data Frame Using A Dictionary of Pre-Existing Columns or Numpy 2D Arrays?
8 pages
Loc Iloc at Dataframe
No ratings yet
Loc Iloc at Dataframe
9 pages
Pandas Cheat Sheet - Python For Data Science
No ratings yet
Pandas Cheat Sheet - Python For Data Science
5 pages
Data Analysis With Python
No ratings yet
Data Analysis With Python
60 pages
Pandas
No ratings yet
Pandas
26 pages
PANDAS Cheatsheet
No ratings yet
PANDAS Cheatsheet
4 pages
Exp 3
No ratings yet
Exp 3
10 pages
Pandas Cheat Sheet........
No ratings yet
Pandas Cheat Sheet........
11 pages
Pandas 1705297450
No ratings yet
Pandas 1705297450
21 pages
Pandas DataFrame Notes
No ratings yet
Pandas DataFrame Notes
10 pages
12 Pandas
No ratings yet
12 Pandas
9 pages
Pandas Cheat Sheet
No ratings yet
Pandas Cheat Sheet
5 pages
PYTHON PANDAS Cheat Sheet
No ratings yet
PYTHON PANDAS Cheat Sheet
2 pages
Pandas
No ratings yet
Pandas
27 pages
Pandas Handbook
No ratings yet
Pandas Handbook
33 pages
Cheat Sheet - Pandas
No ratings yet
Cheat Sheet - Pandas
12 pages
Content Pandas Cheat Sheet
No ratings yet
Content Pandas Cheat Sheet
9 pages
Pandas DataFrameObject
No ratings yet
Pandas DataFrameObject
4 pages
Pandas Merged
No ratings yet
Pandas Merged
2 pages
Cheat Sheet: The Pandas Dataframe Object I: Preliminaries Get Your Data Into A Dataframe
No ratings yet
Cheat Sheet: The Pandas Dataframe Object I: Preliminaries Get Your Data Into A Dataframe
12 pages
Important Pandas Operations 1697910759
No ratings yet
Important Pandas Operations 1697910759
6 pages
Pandas Cheat Sheet
No ratings yet
Pandas Cheat Sheet
2 pages
Pandas Cheat Sheet
No ratings yet
Pandas Cheat Sheet
2 pages
Pandas Cheat Sheet
No ratings yet
Pandas Cheat Sheet
17 pages
Learn Data Analysis With Pandas - Introduction
No ratings yet
Learn Data Analysis With Pandas - Introduction
2 pages
Pandas
No ratings yet
Pandas
5 pages
Pandas Cheat Sheet Final
No ratings yet
Pandas Cheat Sheet Final
1 page
Advanced C Concepts and Programming: First Edition
From Everand
Advanced C Concepts and Programming: First Edition
Gayatri
3/5 (1)
Apache Cassandra Developer Associate - Exam Practice Tests
From Everand
Apache Cassandra Developer Associate - Exam Practice Tests
Cristian Scutaru
No ratings yet
The Essential R Reference
From Everand
The Essential R Reference
Mark Gardener
No ratings yet
Computer Engineering Laboratory Solution Primer
From Everand
Computer Engineering Laboratory Solution Primer
Karan Bhandari
No ratings yet
TensorFlow深度学习项目实战: Chinese Edition
From Everand
TensorFlow深度学习项目实战: Chinese Edition
Posts & Telecom Press
No ratings yet
Profound Python Data Science
From Everand
Profound Python Data Science
Onder Teker
No ratings yet

Cheat Sheet Pandas

Uploaded by

Cheat Sheet Pandas

Uploaded by

Pandas Cheat Sheet

by Justin1209 (Justin1209) via cheatography.com/101982/cs/21202/

f​ile.csv') learn: clinic​_north.values.re​‐

# Saving DataFrame to a CSV File sha​pe(-1, 1)

df.to_​csv​('n​ew-​csv​-fi​‐ # Select multiple Columns

# Load DataFrame in Chunks (For orth', 'south']]

large Datasets) Make sure that you have a double set of

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

Select Rows Adding a Column Performing Column Operation (cont)

# Select one Row df = pd.DataFrame([ > -> lower, upper

c[1:4] ​ [3, 'hammer', 3.00, 5.50],

Using .reset​_in​dex() will create a new 'Yes'

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

Rename Columns Column Statistics Pivot Tables

# Method 1 Mean = Average df.column.mean() orders =

# Group By include in the pivot table

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

Inner Merge (Different Column Name) Melt

orders = pandas.me​lt(​Dat​aFrame, id_vars, value_​vars, var_name, value_​nam​e='​‐

By Justin1209 (Justin1209) Published 23rd November, 2019. Sponsored by Readable.com

You might also like

file.csv') learn: clinic_north.values.re‐

# Saving DataFrame to a CSV File shape(-1, 1)

df.to_csv('new-csv-fi‐ # Select multiple Columns

c[1:4] [3, 'hammer', 3.00, 5.50],

Using .reset_index() will create a new 'Yes'

orders = pandas.melt(DataFrame, id_vars, value_vars, var_name, value_name='‐