This document is a cheat sheet for common Pandas operations: 1. How to install Pandas with pip and import it under the conventional alias pd. 2. Conventions for reading and writing dataframes and for exploring data through masking and filtering. 3. Common operations such as sorting, grouping, pivoting, and melting data between long and wide formats.


PANDAS cheat-sheet

1 Installing and Importing

Installing
pip install pandas

Importing (convention)
import pandas as pd
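A quick sanity check after installing: the import below follows the convention above, and printing the version confirms the install worked (the exact version string will vary).

```python
# Conventional import; pd is the community-standard alias
import pandas as pd

print(pd.__version__)  # prints the installed version string
```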

2 Reading and Writing data

Reading data
df = pd.read_csv('filename.csv')
# Can extend for JSON, Excel types too, using pd.read_json / pd.read_excel, etc.

Writing data
df.to_csv('filename.csv')
# Can extend for JSON, Excel, and such too, using df.to_json / df.to_excel, etc.


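A minimal round-trip sketch of the read/write calls above; the filename and data are invented for illustration.

```python
import pandas as pd

# Hypothetical small dataframe written to CSV and read back
df = pd.DataFrame({"name": ["a", "b"], "id": [1, 2]})
df.to_csv("example.csv", index=False)  # index=False skips writing the row index

df2 = pd.read_csv("example.csv")
print(df2.shape)  # (2, 2)
```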
3 Series and Dataframes

Creating a series
pd.Series(['a', 'b', 'c'])

Creating a dataframe
Row oriented:
pd.DataFrame([['a', 1], ['b', 2]], columns=['name', 'id'])
Column oriented:
pd.DataFrame({'name': ['a', 'b'], 'id': [1, 2]})
Both produce:
  name  id
0    a   1
1    b   2
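The two construction styles above can be checked against each other; this small sketch uses the same toy values as the section.

```python
import pandas as pd

# Row-oriented and column-oriented construction give the same dataframe
row_df = pd.DataFrame([["a", 1], ["b", 2]], columns=["name", "id"])
col_df = pd.DataFrame({"name": ["a", "b"], "id": [1, 2]})

print(row_df.equals(col_df))  # True
```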

4 Info extraction

Shape
df.shape
(Returns a tuple representing the dimensionality of the DataFrame, e.g. (2, 3) for 2 rows and 3 columns.)

Head
df.head(n)  (first n rows, default 5)

Tail
df.tail(n)  (last n rows, default 5)

Info
df.info()  (returns info of all columns)

Describe
df.describe()  (gives statistical information of the data)

Built-in ops
Built-in ops such as mean, min, max, etc.
E.g., df['col1'].min(), df['col1'].count(), etc.

5 Accessing

Direct accessing
df['col']
Accessing a row:
df.loc[ei]   # ei here is the explicit index
df.iloc[ii]  # ii here is the implicit index
Accessing a column:
df['column_name']      # for a single column
df[['col1', 'col2']]   # for multiple columns

Slicing
Row:
df.loc[1:3]   (1 and 3 are the explicit indices here)
or df.iloc[2:4]  (2 and 4 are the implicit indices here)
Column:
df.loc[:, 'a':'b']
Both row and column:
df.loc[1:3, 'a':'b']  (1 and 3 are explicit indices here)

Feature exploration (masking, filtering)
Masking
df['col'] > value
Creates a mask based on the required condition. E.g. df['age'] > 30

Filtering
Filters data based on conditions.
df.loc[(df['col1'] == val1) & (df['col2'] == val2)]
E.g.
df.loc[(df['month'] == 'January') & (df['year'] == '2022')]
# filters out data for January 2022
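A small runnable sketch of explicit vs. implicit indexing and masking; the ages, cities, and index labels are made up for illustration.

```python
import pandas as pd

# Hypothetical data to demonstrate loc/iloc and boolean masking
df = pd.DataFrame(
    {"age": [25, 34, 41], "city": ["Pune", "Delhi", "Chennai"]},
    index=[10, 20, 30],  # explicit (label) index differs from implicit positions 0..2
)

print(df.loc[20, "age"])   # 34 -> label-based access
print(df.iloc[1]["age"])   # 34 -> position-based access

mask = df["age"] > 30      # boolean Series
print(df.loc[mask, "city"].tolist())  # ['Delhi', 'Chennai']
```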
6 Dataframe Manipulation

Adding a new row/column
Row:
df.loc[explicit_row_num] = ['a', 1]
df.loc[len(df.index)] = ['a', 1]  # this will add a row at the end of the dataframe
Column:
df['new_col'] = data

Deleting a row/column
df.drop(labels=None, axis=0)
Row:
df.drop(3, axis=0)  # here 3 is the explicit index; axis=0 is for rows
Column:
df.drop('col_name', axis=1)

Renaming
Row:
df.index = new_indices
Column:
df.rename({'old_name': 'new_name'}, axis=1)

7 Operations

Sorting
df.sort_values(['col1'], ascending=[True])

Group based data filtering
df.groupby('group_col_name').filter(boolean function based on condition)
E.g.
df.groupby('director_name').filter(lambda x: x["budget"].max() >= 100)
# This filters all rows of those directors whose maximum budget is
# greater than 100 (million)

Apply
df.groupby('group_col_name').apply(function)
df['col'].apply(function)
Applies a function along one of the axes of the dataframe.
E.g.
data[['revenue', 'budget']].apply(np.sum, axis=1)
# sums values of revenue and budget across each row

Group based apply
def func(x):
    x["risk"] = x["budget"] - x["revenue"].mean() >= 0
    return x
data_risk = data.groupby("director_name").apply(func)
# Finds movies whose budget is higher than its director's average revenue

Pivot
df.pivot(index=['list of columns'], columns='col_name', values='col_name')
Opposite of melt: converts a dataframe from long to wide format.
Outputs a multi-index dataframe.
E.g.
data_melt.pivot(index=['Date', 'Drug_Name', 'Parameter'],
                columns='time', values='reading')

Cut
df['new_cat_column'] = pd.cut(df['continuous_col'], bins=bin_values, labels=label_values)
Bins continuous data into categorical groups.
E.g.
data_tidy['temp_cat'] = pd.cut(data_tidy['Temperature'],
                               bins=temp_points, labels=temp_labels)

Shift
df['col'].shift(periods=n, axis=0)
Shifts the values of rows/columns.
E.g.
df["Marks"].shift(periods=1, axis=0)
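The manipulation and operations calls above can be combined in one short sketch; the director names, budgets, and bin edges below are invented for illustration.

```python
import pandas as pd

# Toy movie data (director/budget values are made up)
data = pd.DataFrame({
    "director": ["X", "X", "Y"],
    "budget":   [120, 80, 50],
})

# Dataframe manipulation: append a row at the end, then add a categorical column
data.loc[len(data.index)] = ["Y", 200]
data["size"] = pd.cut(data["budget"], bins=[0, 100, 250], labels=["small", "large"])
print(data["size"].tolist())  # ['large', 'small', 'small', 'large']

# Group-based filtering: keep only directors whose maximum budget is >= 100
big = data.groupby("director").filter(lambda g: g["budget"].max() >= 100)
print(sorted(big["director"].unique()))  # ['X', 'Y']

# Shift budgets down by one row (first value becomes NaN)
print(data["budget"].shift(periods=1).tolist())  # [nan, 120.0, 80.0, 50.0]
```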
8 Joins

Concat
pd.concat([df1, df2], axis=0)
# for concatenating horizontally, change to axis=1

Merge
df1.merge(df2, on='foreign_key', how='type_of_join')
Optional -> left_on and right_on
E.g. df1.merge(df2, on='id', how='inner')

9 Groupby

Grouping based on a single aggregate
df.groupby('group_col_name')['col(s)'].aggregate_function()
E.g.
df.groupby('director_name')['title'].count()
# Finds number of titles per director

Grouping based on multiple aggregates
df.groupby(['group_col_name'])['col'].aggregate(['func1', 'func2'])
E.g.
df.groupby(['director_name'])["year"].aggregate(['min', 'max'])
# Finds first and most recent year of movies made by all directors
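A runnable sketch of merge and groupby aggregation; the tables and the 'id' key are hypothetical.

```python
import pandas as pd

# Hypothetical tables joined on a shared 'id' key
left = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
right = pd.DataFrame({"id": [1, 2, 3], "marks": [90, 80, 70]})

merged = left.merge(right, on="id", how="inner")
print(len(merged))  # 2 -> inner join keeps only matching ids

# Groupby with one aggregate, then with multiple aggregates
movies = pd.DataFrame({"director": ["X", "X", "Y"], "year": [1999, 2005, 2010]})
print(movies.groupby("director")["year"].count().tolist())  # [2, 1]
print(movies.groupby("director")["year"].aggregate(["min", "max"]))
```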

10 Cleaning our data

None and NaN
"NaN" is for columns with numbers as their values; "None" is for columns
with non-number entries (e.g. String, object type, etc.).
Can check for null values using isna():
df.isna()  # returns the dataframe with True/False for null values in the respective element's position
df.isna().sum()  # returns number of null values per column. Can modify with df.isna().sum(axis=1) for each row's null count
df.isna().sum().sum()  # returns total number of null values

Filling null values
df.fillna(n)  # fills null values with value 'n'

Dropping null values
df.dropna(axis=0)
# Default axis=0, use 1 for columns
# Drops rows/columns with even a single missing value

Duplicates and dropping duplicates
Find duplicate rows
df.duplicated(subset=None, keep='first')
Returns a boolean series with each duplicate row marked as True.
# subset can be used to specify certain column(s) for identifying the duplicates
# keep determines which duplicates to mark:
#   'first': marks all duplicates as True except for the first occurrence
#   'last': marks all duplicates as True except for the last occurrence
#   False: marks all duplicates as True

Drop duplicate values
df.drop_duplicates(subset=None, keep='first')
# Parameters have the same meaning as in df.duplicated,
# except here it will drop the rows marked as duplicates

11 Data Restructuring

Melt
pd.melt(df, id_vars=['list of columns'])
Converts a dataframe from wide to long format.
E.g.
pd.melt(data, id_vars=['Date', 'Parameter', 'Drug_Name'])

12 Misc Topics

Datetime
Convert to Datetime object: pd.to_datetime(df['col'])
Extracting information:
df['col'][0].year  # extracts the year for the 0th index value (here 0 is the implicit index). Use .month and .day for the respective data
df['col'].dt.year  # extracts the year for the whole column (all the datetime values)
df['col'][0].strftime('%m%Y')  # formats the selected datetime value into the required format (month and year in this case)

String functions
We can use .str to apply string functions to any column:
df['col'].str.function()
E.g.
i. data_tidy['Date'].str.split('-')
# This will split the "Date" column into elements separated by "-"
ii. data_tidy.loc[data_tidy['Drug_Name'].str.contains('hydrochloride')]
# Will filter out rows containing the string "hydrochloride"
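A combined sketch of the cleaning, datetime, and .str calls above; the dates and readings are invented for illustration.

```python
import numpy as np
import pandas as pd

# Invented readings with one missing value and one duplicate row
df = pd.DataFrame({
    "Date": ["01-2022", "01-2022", "02-2022"],
    "reading": [1.0, 1.0, np.nan],
})

print(df.isna().sum().sum())     # 1 -> one null value in total
print(df.duplicated().tolist())  # [False, True, False]

clean = df.drop_duplicates().fillna(0)  # drop the repeat, fill the null
print(clean["reading"].tolist())        # [1.0, 0.0]

# .str and datetime accessors
print(clean["Date"].str.split("-").tolist())  # [['01', '2022'], ['02', '2022']]
dates = pd.to_datetime(clean["Date"], format="%m-%Y")
print(dates.dt.year.tolist())                 # [2022, 2022]
```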
