5CS037 WS02 PandasForDataAnalysis
Workshop-2
Pandas for Data Analysis.
Siman Giri
Method                   Syntax                       Definition
head(n) / tail(n)        dataset.head(2)              n rows from the top or bottom
sample()                 dataset.sample(n)            n random samples from the dataset
max() / min()            dataset["column"].max()      maximum or minimum of a numeric column
mean()/median()/std()    dataset["column"].mean()     mean, median, or std of a numeric column
describe()               dataset.describe()           summary statistics of the numeric columns in the dataset
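The methods in the table above can be tried on a small DataFrame. The sketch below is only an illustration: the `dataset` contents and the column names `height` and `city` are made up and are not part of the workshop data.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
dataset = pd.DataFrame({
    "height": [150.0, 160.0, 170.0, 180.0, 190.0],
    "city": ["A", "B", "A", "C", "B"],
})

first_two = dataset.head(2)   # first 2 rows
last_two = dataset.tail(2)    # last 2 rows
# 3 random rows; random_state is fixed only so the result is repeatable
random_three = dataset.sample(n=3, random_state=0)

col_max = dataset["height"].max()
col_min = dataset["height"].min()
col_mean = dataset["height"].mean()
col_median = dataset["height"].median()
col_std = dataset["height"].std()  # sample standard deviation (ddof=1)

# count, mean, std, min, quartiles, and max for the numeric columns only
summary = dataset.describe()

print(first_two)
print(summary)
```

Note that `describe()` skips the non-numeric `city` column by default.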
Method      Syntax                        Definition
unique()    dataset.column.unique()       unique values of a column
map(arg)    dataset["column"].map(arg)    maps the distinct values of a column to another set of corresponding values; arg is a function, dict, or column
apply()     dataset["col"].apply(func)    takes a function and applies it to all values of a column
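As a rough illustration of the three methods above, the sketch below runs them on a toy DataFrame. The data, the column names `size` and `price`, and the mapping values are all hypothetical.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
dataset = pd.DataFrame({
    "size": ["S", "M", "L", "M", "S"],
    "price": [10.0, 20.0, 30.0, 20.0, 10.0],
})

# Distinct values, in order of first appearance
sizes = dataset["size"].unique()

# map with a dict: each value is replaced by its dict entry;
# values that are missing from the dict would become NaN
size_cm = dataset["size"].map({"S": 36, "M": 40, "L": 44})

# map with a function: called once per value
size_lower = dataset["size"].map(str.lower)

# apply: runs the function on every value of the column
price_doubled = dataset["price"].apply(lambda p: p * 2)

print(list(sizes))
print(size_cm.tolist())
print(price_doubled.tolist())
```

For a single column, `map` and `apply` with a function behave similarly; `map` additionally accepts a dict or another column as the lookup.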
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()  # Load the Iris dataset
iris_df = pd.DataFrame(data=iris['data'], columns=iris['feature_names'])

# Standard Scaling
iris_standard_scaled = (iris_df - iris_df.mean()) / iris_df.std()

print("Original Iris DataFrame:")
print(iris_df.head())
print("\nStandard Scaled Iris DataFrame:")
print(iris_standard_scaled.head())  # Display scaled data
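A quick way to check standard scaling is to confirm that each scaled column has mean 0 and standard deviation 1. The sketch below does this on a small made-up DataFrame (the columns `a` and `b` are hypothetical) so that it runs without scikit-learn. It also shows one detail worth knowing: pandas `std()` uses the sample standard deviation (`ddof=1`) by default, whereas scikit-learn's `StandardScaler` divides by the population standard deviation (`ddof=0`), so the two give slightly different numbers.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [10.0, 20.0, 30.0, 40.0],
})

# Same formula as the Iris example: subtract the mean, divide by the std.
scaled = (df - df.mean()) / df.std()  # sample std (ddof=1)

# Variant using the population std (ddof=0)
scaled_pop = (df - df.mean()) / df.std(ddof=0)

print(scaled.mean())  # each column is approximately 0
print(scaled.std())   # each column is approximately 1
```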
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris()  # Load the Iris dataset
iris_df = pd.DataFrame(data=iris['data'], columns=iris['feature_names'])

# Min-Max Scaling using Pandas
iris_minmax_scaled = (iris_df - iris_df.min()) / (iris_df.max() - iris_df.min())

print("Original Iris DataFrame:")
print(iris_df.head())
print("\nMin-Max Scaled Iris DataFrame:")
print(iris_minmax_scaled.head())  # Display scaled data
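Min-max scaling maps the smallest value of each column to 0 and the largest to 1. The following sketch applies the same formula to a small made-up DataFrame (the columns `a` and `b` are hypothetical) where the result is easy to verify by hand.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "a": [2.0, 4.0, 6.0],
    "b": [-1.0, 0.0, 3.0],
})

col_range = df.max() - df.min()       # per-column range
scaled = (df - df.min()) / col_range  # same formula as the Iris example

print(scaled)
```

A column whose values are all identical has a range of 0, so this formula would divide by zero for it; such columns need separate handling.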
Ordinal Encoding:
▶ Ordinal encoding is used for categorical data with a
meaningful order or ranking.
▶ Each category is assigned a numerical value based on its order.
▶ Example: Low, Medium, High can be encoded as 1, 2, 3.
import pandas as pd

# Sample DataFrame with ordinal categories
df = pd.DataFrame({'Category': ['Low', 'Medium', 'High', 'Low', 'High']})

# Ordinal encoding using map
ordinal_mapping = {'Low': 1, 'Medium': 2, 'High': 3}
df['Category_Ordinal'] = df['Category'].map(ordinal_mapping)

print(df)
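An alternative to a hand-written dict is an ordered categorical, where the order is given once as a list. This is a sketch of a different technique from the `map` approach above, not a replacement for it; the variable names `levels` and `ordered` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Category": ["Low", "Medium", "High", "Low", "High"]})

# The list order defines the ranking: Low < Medium < High
levels = ["Low", "Medium", "High"]
ordered = pd.Categorical(df["Category"], categories=levels, ordered=True)

# codes are 0-based positions in `levels`; add 1 to match the 1, 2, 3 encoding
df["Category_Ordinal"] = ordered.codes + 1

print(df)
```

With this approach a value that is not listed in `levels` gets the code -1 (so 0 after adding 1), whereas `map` with a dict would produce NaN for it.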