Pandas DataFrame manipulation cheatsheet
stratascratch.com
data exploring
head()
shows the first ‘n’ rows of the DataFrame
tail()
shows the last ‘n’ rows of the DataFrame
sample()
returns a random sample of ‘n’ rows of your DataFrame
shape
shows the dimensions of the DataFrame as (rows, columns); it is an attribute, so no parentheses
info()
gives you the column names, data types and non-null counts of the DataFrame
describe()
gives descriptive statistics of your DataFrame: count, mean, std, min, quartiles, max
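A minimal sketch of the exploring calls above, assuming a small made-up DataFrame (the column names and values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dee", "Eli"],
    "age": [23, 35, 41, 29, 52],
    "city": ["Oslo", "Rome", "Oslo", "Lima", "Rome"],
})

first_two = df.head(2)                          # first 2 rows
last_two = df.tail(2)                           # last 2 rows
random_three = df.sample(n=3, random_state=0)   # 3 random rows, seeded for repeatability
dims = df.shape                                 # attribute, not a method: (rows, columns)

df.info()                                       # prints column names, dtypes, non-null counts
stats = df.describe()                           # summary statistics for the numeric columns
print(stats)
```

Note that describe() reports the 25%, 50% and 75% quartiles rather than the IQR itself; the IQR would be the difference between the 75% and 25% rows.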
data retrieving
groupby()
groups the DataFrame according to the given arguments
iloc[]
gets values at specific integer positions (used with square brackets)
loc[]
helps you select rows and columns by names and labels (used with square brackets)
sort_values()
sorts the DataFrame in ascending or descending order by a given column
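The retrieving calls might be used as in the sketch below. The data and the names `team` and `score` are assumptions for illustration. Note that pandas uses square brackets for the `iloc` and `loc` indexers.

```python
import pandas as pd

# Hypothetical data with string row labels so loc and iloc differ visibly.
df = pd.DataFrame(
    {"team": ["a", "b", "a", "b"], "score": [10, 40, 30, 20]},
    index=["w", "x", "y", "z"],
)

totals = df.groupby("team")["score"].sum()         # one total per team
by_position = df.iloc[0, 1]                        # row position 0, column position 1
by_label = df.loc["x", "score"]                    # row label "x", column label "score"
subset = df.loc[["w", "y"], ["score"]]             # several rows and columns by label
ranked = df.sort_values("score", ascending=False)  # highest score first
```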
data operations
agg()
passes one or multiple functions to columns/rows and labels the index of the resulting DataFrame with the function names; can produce aggregated results
min()
calculates the min
max()
calculates the max
drop()
removes rows or columns by label (index or column names)
transform()
allows you to broadcast functions to columns/rows; cannot produce aggregated results; works with a function or a set of functions written as a list, dictionary or string
apply()
performs operations on columns, rows/DataFrames; only allowed to work with functions and can produce aggregated results
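One way to see the difference between agg(), transform() and apply() is the sketch below, which assumes a made-up three-row DataFrame (all names are hypothetical):

```python
import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({"group": ["a", "a", "b"], "x": [1, 2, 3], "y": [10, 20, 30]})
numbers = df[["x", "y"]]

summary = numbers.agg(["min", "max"])   # result index is labeled "min" and "max"
lowest = df["x"].min()
highest = df["x"].max()

no_y = df.drop(columns=["y"])           # remove a column by label
no_first_row = df.drop(index=[0])       # remove a row by index label

# transform returns a result with the same length as the input (no aggregation).
centered = df.groupby("group")["x"].transform(lambda s: s - s.mean())

# apply may keep the shape (row-wise sum) or reduce it (one value per column).
row_sums = numbers.apply(lambda row: row["x"] + row["y"], axis=1)
ranges = numbers.apply(lambda col: col.max() - col.min())
```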
data operations
drop_duplicates()
outputs the DataFrame without duplicate rows
notnull()
detects non-missing values and returns booleans to show that (alias of notna())
mean()
calculates the mean
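A short sketch of these three calls, assuming made-up data. pandas spells the non-missing check `notna()` (with `notnull()` as an alias).

```python
import pandas as pd

# Hypothetical data with one duplicated row and one missing value.
df = pd.DataFrame({"id": [1, 1, 2, 3], "value": [5.0, 5.0, None, 7.0]})

unique_rows = df.drop_duplicates()   # the repeated (1, 5.0) row is kept once
present = df["value"].notna()        # True where a value exists
average = df["value"].mean()         # missing values are skipped by default
```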
missing values & NA’s
isna()
returns booleans (True, False) which indicate whether the value is NA or not
dropna()
drops missing values from rows or columns
fillna()
assigns specific values to NA’s
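The missing-value helpers could be combined as in this sketch. The data and the fill values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data: each column has exactly one missing entry.
df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})

mask = df.isna()                      # True where a value is missing
complete_rows = df.dropna()           # keep only rows with no missing values
complete_cols = df.dropna(axis=1)     # keep only columns with no missing values
filled = df.fillna({"a": 0.0, "b": "missing"})  # a different fill value per column
```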
merging
merge()
combines two DataFrames on columns or indices
concat()
bonds two DataFrames across the rows or columns
join()
joins two DataFrames on a key column or index
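A minimal sketch contrasting the three ways of combining, assuming two small made-up tables that share a hypothetical `key` column:

```python
import pandas as pd

# Hypothetical tables sharing a "key" column.
left = pd.DataFrame({"key": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [80, 90, 70]})

merged = left.merge(right, on="key", how="inner")             # only keys present in both
stacked = pd.concat([left, left], axis=0, ignore_index=True)  # append rows
side_by_side = pd.concat([left, right], axis=1)               # place columns next to each other

# join aligns on the index, so the key column is moved into the index first.
joined = left.set_index("key").join(right.set_index("key"), how="left")
```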
plotting
plot.bar()
to plot a bar graph
plot.line()
to plot a line graph
plot.scatter()
to draw a scatter plot
plot.pie()
to plot a pie chart
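The plotting accessors might be called as sketched below. This assumes matplotlib is installed (pandas plotting relies on it), and the data is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical monthly figures, made up for illustration.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar"],
    "sales": [10, 15, 12],
    "visits": [100, 140, 130],
})

bar_ax = df.plot.bar(x="month", y="sales")            # one bar per month
line_ax = df.plot.line(x="month", y="sales")          # sales over the months
scatter_ax = df.plot.scatter(x="visits", y="sales")   # one point per row
pie_ax = df.set_index("month").plot.pie(y="sales")    # slices labeled by the index
```

Each call returns a matplotlib Axes object, so titles and labels can be adjusted afterwards with the usual matplotlib methods.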
saving from and reading to DataFrame
read_sql()
to read from SQL into a DataFrame
to_sql()
to save the DataFrame into a SQL database
read_excel()
to read from Excel files into a DataFrame
to_excel()
to save the DataFrame into an Excel file
read_csv()
to read from CSV files into a DataFrame
to_csv()
to save the DataFrame into a CSV file
read_json()
to read from JSON files into a DataFrame
to_json()
to save the DataFrame into a JSON file
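A round-trip sketch for the CSV, JSON and SQL pairs. It assumes in-memory buffers and an in-memory SQLite database instead of real files, and the table name `people` is hypothetical. The Excel pair follows the same pattern but needs an Excel engine such as openpyxl installed, so it is left out here.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical data, made up for illustration.
df = pd.DataFrame({"id": [1, 2], "name": ["Ann", "Bob"]})

# CSV: with no path given, to_csv returns the text instead of writing a file.
csv_text = df.to_csv(index=False)
from_csv = pd.read_csv(io.StringIO(csv_text))

# JSON: same idea, one record per row.
json_text = df.to_json(orient="records")
from_json = pd.read_json(io.StringIO(json_text))

# SQL: an in-memory SQLite database stands in for a real server.
conn = sqlite3.connect(":memory:")
df.to_sql("people", conn, index=False)
from_sql = pd.read_sql("SELECT * FROM people", conn)
conn.close()
```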
date time
dt.month
returns the month number (1 to 12) of the date; it is an attribute, so no parentheses
dt.quarter
returns the quarter (1 to 4) of the date; 2 means April to June
to_datetime()
converts a DataFrame/Series to pandas datetime objects
dt.year
returns the year of the date
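A small sketch of the date time helpers, assuming a made-up column of date strings. In pandas, `dt.month`, `dt.quarter` and `dt.year` are attributes and are used without parentheses.

```python
import pandas as pd

# Hypothetical date strings, made up for illustration.
df = pd.DataFrame({"date": ["2023-01-15", "2023-05-02", "2024-11-30"]})

df["date"] = pd.to_datetime(df["date"])   # strings -> datetime64 values
df["month"] = df["date"].dt.month         # 1, 5, 11
df["quarter"] = df["date"].dt.quarter     # 1, 2, 4
df["year"] = df["date"].dt.year           # 2023, 2023, 2024
```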