pandas Cheat Sheet
Free resources at: dataquest.io/guide

This pandas cheat sheet serves as a quick reference for essential commands related to data manipulation and analysis, including importing, cleaning, and exporting data. It covers operations like filtering, sorting, grouping, and calculating statistics, with practical examples using Fortune 500 Companies data. The guide is designed for efficient application of pandas functionalities in data workflows.

pandas Cheat Sheet Table of Contents

This cheat sheet offers a handy reference for essential pandas commands, focused on efficient data manipulation and analysis. Using examples from the Fortune 500 Companies dataset, it covers key pandas operations such as reading and writing data, selecting and filtering DataFrame values, and performing common transformations.

You'll find easy-to-follow examples for grouping, sorting, and aggregating data, as well as calculating statistics like mean, correlation, and summary statistics. Whether you're cleaning datasets, analyzing trends, or visualizing data, this cheat sheet provides concise instructions to help you navigate pandas' powerful functionality.

Designed to be practical and actionable, this guide ensures you can quickly apply pandas' versatile data manipulation tools in your workflow.

Table of Contents:

- Importing Data: IMPORT, read_csv, read_table, read_excel, read_sql, read_json, read_html, clipboard, DataFrame
- Exporting Data: to_csv, to_excel, to_sql, to_json, to_html, to_clipboard
- Create Test Objects: DataFrame, Series, index
- Working with DataFrames: DataFrame Basics, Selecting DataFrame Values, loc, iloc, Boolean Masks, Boolean Operators, Data Exploration, Assigning Values, Boolean Indexing
- View & Inspect Data: Frequency Table, Histogram, Vertical Bar Plot, Horizontal Bar Plot, Line Plot, Scatter Plot, head, tail, shape, info, describe, value_counts, apply
- Data Cleaning: columns, isnull, notnull, dropna, fillna, astype, replace, rename, set_index, Finding Correlation, Converting a Column to Datetime
- Filter, Sort, & Group By: columns, sort_values, groupby, pivot_table, apply
- Join & Combine: append, concat, join
- Statistics: describe, mean, corr, count, max, min, median, std

Importing Data

| Syntax | How to use | Explained |
|---|---|---|
| IMPORT | import pandas as pd | Import the library using its standard alias |
| read_csv | pd.read_csv(filename) | Reads from a CSV file |
| read_table | pd.read_table(filename) | Reads from a delimited text file (like TSV) |
| read_excel | pd.read_excel(filename) | Reads from an Excel file |
| read_sql | pd.read_sql(query, connection_object) | Reads from a SQL table/database |
| read_json | pd.read_json(json_string) | Reads from a JSON-formatted string, URL, or file |
| read_html | pd.read_html(url) | Parses an HTML URL, string, or file and extracts tables to a list of dataframes |
| clipboard | pd.read_clipboard() | Reads the contents of your clipboard |
| DataFrame | pd.DataFrame(dict) | Reads from a dict; keys for column names, values for data as lists |

Exporting Data

| Syntax | How to use | Explained |
|---|---|---|
| to_csv | df.to_csv(filename) | Writes to a CSV file |
| to_excel | df.to_excel(filename) | Writes to an Excel file |
| to_sql | df.to_sql(table_name, connection_object) | Writes to a SQL table |
| to_json | df.to_json(filename) | Writes to a file in JSON format |
| to_html | df.to_html(filename) | Writes to an HTML table |
| to_clipboard | df.to_clipboard() | Writes to the clipboard |

Create Test Objects

| Syntax | How to use | Explained |
|---|---|---|
| DataFrame | pd.DataFrame(np.random.rand(20, 5)) | 5 columns and 20 rows of random floats |
| Series | pd.Series(my_list) | Creates a Series from an existing list object |
| index | df.index = pd.date_range('1900/1/30', periods=df.shape[0]) | Adds a date index |
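The import and export functions mirror each other, so a common pattern is to read a file, transform it, and write the result back out. Below is a minimal round-trip sketch; the file names are hypothetical, following the Fortune 500 examples used throughout this sheet.

```python
import pandas as pd

# Read the Fortune 500 dataset, treating the first column
# (the company name) as the index (assumes an f500.csv file exists)
f500 = pd.read_csv('f500.csv', index_col=0)

# ... clean or transform the data here ...

# Write the result back out in a few of the supported formats
f500.to_csv('f500_clean.csv')      # CSV file
f500.to_json('f500_clean.json')    # JSON file
f500.to_excel('f500_clean.xlsx')   # Excel file (needs openpyxl installed)
```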

Working with DataFrames

| Syntax | How to use | Explained |
|---|---|---|
| DataFrame Basics | f500 = pd.read_csv('f500.csv', index_col=0) | Read a CSV file into a DataFrame |
| | col_types = f500.dtypes | Return the data type of each column in a DataFrame |
| | dims = f500.shape | Return the dimensions of a DataFrame |
| Selecting DataFrame Values | f500["rank"] | Select the rank column from f500 |
| | f500[["country", "rank"]] | Select the country and rank columns from f500 |
| | first_five = f500.head(5) | Select the first five rows from f500 |
| loc | big_movers = f500.loc[["Aviva", "HP", "JD.com", "BHP Billiton"], ["rank", "previous_rank"]] | Use .loc[] to select rows and columns from f500 by label; rows are specified first, followed by columns. You can select individual rows/columns or multiple by passing a list, and label-based slicing includes both the start and end labels. |
| | bottom_companies = f500.loc["National Grid":"AutoNation", ["rank", "sector", "country"]] | |
| | revenue_giants = f500.loc[["Apple", "Industrial & Commercial Bank of China", "China Construction Bank", "Agricultural Bank of China"], "revenues":"profit_change"] | |
| iloc | third_row_first_col = f500.iloc[2, 0] | Select the third row, first column by integer location |
| | second_row = f500.iloc[1] | Select the second row by integer location |
| Boolean Masks | rev_is_null = f500["revenue_change"].isnull() | Check for null values in the revenue_change column |
| | rev_change_null = f500[rev_is_null] | Filter using a Boolean array |
| | f500[f500["previous_rank"].notnull()] | Filter rows where previous_rank is not null |
| Boolean Operators | filter_big_rev_neg_profit = (f500["revenues"] > 100000) & (f500["profits"] < 0) | Create a Boolean filter for companies with revenues greater than 100,000 and profits less than 0 |
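To see how label-based and position-based selection combine with Boolean masks, here is a small self-contained sketch; the DataFrame contents are invented for illustration and only the column names echo the f500 examples above.

```python
import pandas as pd

# A tiny stand-in for the Fortune 500 data (values are invented)
f500 = pd.DataFrame(
    {"rank": [1, 2, 3],
     "revenues": [485873, 315199, 267518],
     "profits": [13643, -1000, 4200]},
    index=["Walmart", "State Grid", "Sinopec Group"],
)

# Label-based selection: rows first, then columns
print(f500.loc["Walmart", ["rank", "revenues"]])

# Position-based selection: third row, first column
print(f500.iloc[2, 0])

# Boolean mask: big revenues but negative profits
mask = (f500["revenues"] > 100000) & (f500["profits"] < 0)
print(f500[mask])
```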

Working with DataFrames (continued)

| Syntax | How to use | Explained |
|---|---|---|
| Data Exploration | revs = f500["revenues"]; summary_stats = revs.describe() | Generate summary statistics for the revenues column in f500 |
| | country_freqs = f500["country"].value_counts() | Count the occurrences of each country in f500 |
| Assigning Values | top5_rank_revenue["year_founded"] = 0 | Set the year_founded column to 0 |
| | f500.loc["Dow Chemical", "ceo"] = "Jim Fitterling" | Update the CEO of Dow Chemical to Jim Fitterling |
| Boolean Indexing | kr_bool = f500["country"] == "South Korea"; top_5_kr = f500[kr_bool].head() | Filter rows for South Korea and display the top 5 |
| | f500.loc[f500["previous_rank"] == 0, "previous_rank"] = np.nan; prev_rank_after = f500["previous_rank"].value_counts(dropna=False).head() | Replace 0 with NaN in the previous_rank column and show the top 5 most common values |

View & Inspect Data

| Syntax | How to use | Explained |
|---|---|---|
| Frequency Table | Series.value_counts() | Generate a frequency table from a Series object |
| | Series.value_counts().sort_index() | Generate a sorted frequency table from a Series object |
| Histogram | Series.plot.hist(); plt.show() | Generate a histogram from a Series object |
| Vertical Bar Plot | Series.plot.bar(); plt.show() | Generate a vertical bar plot from a Series object |
| Horizontal Bar Plot | Series.plot.barh(); plt.show() | Generate a horizontal bar plot from a Series object |
| Line Plot | DataFrame.plot.line(x='col_1', y='col_2'); plt.show() | Generate a line plot from a DataFrame object |
| Scatter Plot | DataFrame.plot.scatter(x='col_1', y='col_2'); plt.show() | Generate a scatter plot from a DataFrame object |
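The plotting rows above all go through pandas' matplotlib-based .plot accessor. A minimal sketch, assuming matplotlib is installed and using an invented country column:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented sample data for illustration
countries = pd.Series(["USA", "China", "USA", "Japan", "China", "USA"])

# Frequency table, then a horizontal bar plot of the counts
country_freqs = countries.value_counts()
country_freqs.plot.barh()
plt.xlabel("Number of companies")
plt.show()
```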

View & Inspect Data (continued)

| Syntax | How to use | Explained |
|---|---|---|
| head | df.head(n) | First n rows of the DataFrame |
| tail | df.tail(n) | Last n rows of the DataFrame |
| shape | df.shape | Number of rows and columns |
| info | df.info() | Index, datatype, and memory information |
| describe | df.describe() | Summary statistics for numerical columns |
| value_counts | s.value_counts(dropna=False) | Views unique values and counts |
| apply | df.apply(pd.Series.value_counts) | Unique values and counts for all columns |

Data Cleaning

| Syntax | How to use | Explained |
|---|---|---|
| columns | df.columns = ['a', 'b', 'c'] | Renames columns |
| isnull | pd.isnull(df) | Checks for null values; returns a Boolean array |
| notnull | pd.notnull(df) | Opposite of pd.isnull() |
| dropna | df.dropna() | Drops all rows that contain null values |
| | df.dropna(axis=1) | Drops all columns that contain null values |
| | df.dropna(thresh=n) | Drops all rows that have fewer than n non-null values |
| fillna | df.fillna(x) | Replaces all null values with x |
| | s.fillna(s.mean()) | Replaces all null values with the mean (mean can be replaced with almost any function from the statistics section) |
| astype | s.astype(float) | Converts the datatype of the Series to float |
| replace | s.replace(1, 'one') | Replaces all values equal to 1 with 'one' |
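The dropna, fillna, and astype rows above compose naturally into a small cleaning pass. A minimal sketch with invented data:

```python
import numpy as np
import pandas as pd

# Invented data with missing values in both columns
df = pd.DataFrame({
    "revenue": [100.0, np.nan, 250.0],
    "sector": ["Tech", "Energy", None],
})

df = df.dropna(subset=["sector"])                           # drop rows missing a sector
df["revenue"] = df["revenue"].fillna(df["revenue"].mean())  # fill gaps with the column mean
df["revenue"] = df["revenue"].astype(float)                 # ensure a float dtype
print(df)
```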

Data Cleaning (continued)

| Syntax | How to use | Explained |
|---|---|---|
| replace | s.replace([1, 3], ['one', 'three']) | Replaces all 1 with 'one' and 3 with 'three' |
| rename | df.rename(columns=lambda x: x + 1) | Mass renaming of columns |
| | df.rename(columns={'old_name': 'new_name'}) | Selective renaming of columns |
| | df.rename(index=lambda x: x + 1) | Mass renaming of index |
| set_index | df.set_index('column_one') | Sets the index to column_one |
| Finding Correlation | f500['revenues'].corr(f500['profits']) | Calculate Pearson's r correlation between revenues and profits |
| | f500.corr() | Calculate the Pearson's r correlation matrix between all columns of f500 |
| | f500.corr()[['revenues', 'profits', 'assets']] | Calculate the correlation matrix for f500 and select the correlations for the revenues, profits, and assets columns |
| Converting a Column to Datetime | f500['founding_date'] = pd.to_datetime(f500['founding_date']) | Convert the founding_date column in f500 to datetime format |

Filter, Sort, & Group By

| Syntax | How to use | Explained |
|---|---|---|
| columns | df[df[col] > 0.5] | Rows where the col column is greater than 0.5 |
| | df[(df[col] > 0.5) & (df[col] < 0.7)] | Rows where 0.7 > col > 0.5 |
| sort_values | df.sort_values(col1) | Sorts values by col1 in ascending order |
| | df.sort_values(col2, ascending=False) | Sorts values by col2 in descending order |
| | df.sort_values([col1, col2], ascending=[True, False]) | Sorts values by col1 in ascending order, then col2 in descending order |
| groupby | df.groupby(col) | Returns a groupby object for values from one column |
| | df.groupby([col1, col2]) | Returns a groupby object for values from multiple columns |
| | df.groupby(col1)[col2].mean() | Returns the mean of the values in col2, grouped by the values in col1 (mean can be replaced with almost any function from the statistics section) |
| pivot_table | df.pivot_table(index=col1, values=[col2, col3], aggfunc='mean') | Creates a pivot table that groups by col1 and calculates the mean of col2 and col3 |
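Grouping and pivoting are easiest to see on concrete data. A minimal sketch with invented sector and revenue values:

```python
import pandas as pd

# Invented data for illustration
df = pd.DataFrame({
    "sector": ["Tech", "Tech", "Energy", "Energy"],
    "revenues": [300, 200, 150, 250],
    "profits": [30, 20, -5, 25],
})

# Mean revenues per sector
print(df.groupby("sector")["revenues"].mean())

# Pivot table: mean of both numeric columns, grouped by sector
print(df.pivot_table(index="sector", values=["revenues", "profits"], aggfunc="mean"))
```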

Filter, Sort, & Group By (continued)

| Syntax | How to use | Explained |
|---|---|---|
| groupby | df.groupby(col1).agg(np.mean) | Finds the average across all columns for every unique col1 group |
| apply | df.apply(np.mean) | Applies a function across each column |
| | df.apply(np.max, axis=1) | Applies a function across each row |

Join & Combine

| Syntax | How to use | Explained |
|---|---|---|
| append | df1.append(df2) | Adds the rows in df2 to the end of df1 (number of columns should be identical; removed in pandas 2.0, use pd.concat([df1, df2]) instead) |
| concat | pd.concat([df1, df2], axis=1) | Adds the columns in df2 to the end of df1 (number of rows should be identical) |
| join | df1.join(df2, on=col1, how='inner') | SQL-style join of the columns in df1 with the columns in df2 where the rows for col1 have identical values; how can be one of 'left', 'right', 'outer', 'inner' |

Statistics

| Syntax | How to use | Explained |
|---|---|---|
| describe | df.describe() | Summary statistics for numerical columns |
| mean | df.mean() | Returns the mean of all columns |
| corr | df.corr() | Returns the correlation between columns in a DataFrame |
| count | df.count() | Returns the number of non-null values in each DataFrame column |
| max | df.max() | Returns the highest value in each column |
| min | df.min() | Returns the lowest value in each column |
| median | df.median() | Returns the median of each column |
| std | df.std() | Returns the standard deviation of each column |