5CS037 WS02 PandasForDataAnalysis

The document discusses Pandas, an open-source library for data analysis in Python. It introduces Pandas data structures like Series and DataFrames, and describes how to create, read, write and manipulate tabular data with Pandas methods and attributes.

Uploaded by

Pankaj Mahato


5CS037 Concepts and Technologies of AI

Workshop-2
Pandas for Data Analysis.

Siman Giri

November 28, 2023

Siman Giri Pandas for Data Analysis. 1/29

1. Pandas: Introduction.


1.1 What is pandas?
▶ Pandas is an open-source add-on module for Python that
provides high-performance, easy-to-use data structures and
data analysis tools.

"[pandas] is derived from the term "panel data", an
econometrics term for data sets that include observations
over multiple time periods for the same individuals."
-Wikipedia

▶ The pandas library contains many methods and functions for
cleaning, manipulating and analyzing data.
▶ Pandas is built on top of the NumPy package. NumPy is best
suited to homogeneous numerical array data, whereas Pandas is
designed for tabular or heterogeneous data.
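The contrast between homogeneous and heterogeneous data can be seen in a minimal sketch. The names and values below are made up for illustration: a NumPy array holds one dtype for all of its elements, while each DataFrame column keeps its own.

```python
import numpy as np
import pandas as pd

# A NumPy array stores a single dtype for every element.
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
print(arr.dtype)  # float64

# A DataFrame can mix types: each column has its own dtype.
df = pd.DataFrame({
    "name": ["Asha", "Bikash"],   # text
    "age": [21, 23],              # integers
    "score": [88.5, 91.0],        # floats
})
print(df.dtypes)
```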


1.2 Uses of Pandas

Typically, the pandas library is used for:

1. Data manipulation tasks such as handling missing values,
filtering rows/columns, aggregating and mutating data.
2. Computing summary (descriptive) statistics.
3. Computing correlations and distributions among columns in
the data.
4. Visualizing the data with help from the Matplotlib library.
5. Writing the cleaned and transformed data to a CSV file or
other file and database formats.

Importing pandas:

    import pandas as pd


1.3 Pandas Data Structures: Series and DataFrame
▶ There are two core components of the pandas library:
Series and DataFrame.
▶ A DataFrame is a two-dimensional object:
  ▶ it comprises tabular data organized in rows and columns;
  ▶ individual columns can hold different value types (numeric /
  string / Boolean etc.);
  ▶ row indices refer to individual rows (called the index;
  integers if not defined otherwise);
  ▶ column indices refer to the name (header) of each column
  (also integers if not defined otherwise).
▶ Each column in a DataFrame is a Series.
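The relationship between the two structures can be sketched as follows. The data and the row labels "a", "b", "c" are made up for illustration; the point is that selecting one column of a DataFrame gives back a Series that shares the row index.

```python
import pandas as pd

# A DataFrame with explicit row labels (index) and two named columns.
df = pd.DataFrame(
    {"age": [21, 23, 22], "city": ["Pokhara", "Kathmandu", "Lalitpur"]},
    index=["a", "b", "c"],
)

column = df["age"]             # selecting one column returns a Series
print(type(column).__name__)   # Series
print(df.index.tolist())       # ['a', 'b', 'c']
print(df.columns.tolist())     # ['age', 'city']
print(column.index.tolist())   # ['a', 'b', 'c'], same labels as the DataFrame
```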

2. Creating, Reading and Writing Data with Pandas.

2.1 How do I Create, Read and Write Tabular Data?
▶ Creating a DataFrame/Series: A Pandas DataFrame can be
created by converting built-in Python data structures such
as lists and dictionaries. Example:

    # Transforming built-in data structures into a DataFrame
    import pandas as pd

    # Style 1: dict of lists; pandas assigns a default integer index
    pd.DataFrame({'Bob': ['I liked it.', 'It was awful'],
                  'Sue': ['Pretty good.', 'Bland.']})

    # Style 2: the same kind of data with explicit row labels via `index`
    pd.DataFrame({'Bob': ['I liked it.', 'It was awful.'],
                  'Sue': ['Pretty good.', 'Bland.']},
                 index=['Product A', 'Product B'])

Figure: Output of Style 1 and output of Style 2.
2.1 How do I Create, Read and Write Tabular Data?
▶ Importing data from files: In the real world, a pandas
DataFrame will typically be created by loading a dataset
from a CSV file, an Excel file, etc.
▶ Pandas provides the read_csv() function to read data stored
as a CSV file into a pandas DataFrame.
▶ Pandas supports many different file formats or data sources
out of the box (csv, excel, sql, json, parquet, ...), each of
them with the prefix read_*.
▶ The head/tail/info methods and the dtypes attribute are
convenient for a first check.

    # Importing data from a file
    import pandas as pd

    # Pass the path to your dataset to the read_csv("your path") function.
    dataset = pd.read_csv("/data/Week02/bank.csv")

    print(dataset.head())  # first five rows
    print(dataset.tail())  # last five rows
    dataset.info()         # column names, non-null counts and dtypes
    # Run the above code and observe the output.

2.1 How do I Create, Read and Write Tabular Data?
▶ Writing Data: Whereas the read_* functions read data into
pandas, the to_* methods store data.
▶ The to_csv("path + file name", index=False) method stores
the data as a CSV file.
  ▶ path: where you want to store the created file.
  ▶ file name: the name under which you want to store the file.
  ▶ index (boolean): whether to write the row index to the file.
▶ Pandas supports many different file formats or data sources
out of the box (csv, excel, sql, json, parquet, ...), each of
them with the prefix to_*.

    # Writing data to a file
    import pandas as pd

    data = {'Name': ['Alice', 'Bob', 'Charlie'],
            'City': ['New York', 'San Francisco', 'Los Angeles']}
    df = pd.DataFrame(data)  # creating a DataFrame

    # Write the DataFrame to CSV without the row index.
    df.to_csv('output.csv', index=False)
    # Run the above code and observe the output.
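The effect of the index argument is easiest to see without touching the disk. The sketch below is an illustration, not part of the workshop code: it relies on to_csv returning the CSV text as a string when no path is given, and reads that text back through io.StringIO.

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "City": ["New York", "San Francisco"]})

# With no path given, to_csv returns the CSV text instead of writing a file.
with_index = df.to_csv()                # row index written as an unnamed first column
without_index = df.to_csv(index=False)  # only the data columns are written

print(with_index)
print(without_index)

# Reading the index-free text back gives the original columns.
restored = pd.read_csv(io.StringIO(without_index))
print(restored.columns.tolist())  # ['Name', 'City']
```

Reading back a file that was written with the index included would produce an extra unnamed column, which is why index=False is a common choice when the index carries no information.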
3. Do something with a DataFrame or Series.

3.1 Attributes of Pandas DataFrame.
Some of the attributes of the Pandas DataFrame class are the
following:

Attribute  Definition                                Syntax
---------  ----------------------------------------  ---------------
dtypes     data types of the columns                 dataset.dtypes
columns    names of the columns                      dataset.columns
axes       list of [row index, column index]         dataset.axes
ndim       number of dimensions (2 for a DataFrame,  dataset.ndim
           1 for a Series)
size       number of elements in the DataFrame       dataset.size
shape      tuple (rows, cols)                        dataset.shape
values     NumPy representation of the data          dataset.values

Table: Attributes of a Pandas DataFrame
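A quick way to get a feel for these attributes is to print them for a tiny DataFrame. The DataFrame below is made up for illustration; note that attributes are accessed without parentheses.

```python
import pandas as pd

dataset = pd.DataFrame({"age": [30, 45, 27], "job": ["admin", "tech", "admin"]})

print(dataset.dtypes)    # dtype of each column
print(dataset.columns)   # column labels
print(dataset.axes)      # [row index, column index]
print(dataset.ndim)      # 2
print(dataset.size)      # 6  (3 rows x 2 columns)
print(dataset.shape)     # (3, 2)
print(dataset.values)    # underlying data as a NumPy array
```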


3.2 Methods of Pandas DataFrame.
Some of the popular methods of the Pandas DataFrame class are
the following:

Method                 Syntax                    Definition
---------------------  ------------------------  ------------------------------
head(n) / tail(n)      dataset.head(2)           n rows from the top or bottom
sample(n)              dataset.sample(n)         n random rows from the dataset
max() / min()          dataset["column"].max()   maximum or minimum of a
                                                 numeric column
mean()/median()/std()  dataset["column"].mean()  mean, median or standard
                                                 deviation of a numeric column
describe()             dataset.describe()        summary statistics of the
                                                 numeric columns in the dataset

Table: (Some) methods of a Pandas DataFrame
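The methods in the table can be tried on a small made-up DataFrame. The column names and values are assumptions for illustration, and random_state is passed to sample only so that the random selection is repeatable.

```python
import pandas as pd

dataset = pd.DataFrame({"age": [30, 45, 27, 52],
                        "balance": [100.0, 250.0, 0.0, 50.0]})

print(dataset.head(2))                      # first 2 rows
print(dataset.tail(2))                      # last 2 rows
print(dataset.sample(2, random_state=0))    # 2 random rows (seeded)
print(dataset["age"].max())                 # 52
print(dataset["age"].min())                 # 27
print(dataset["balance"].mean())            # 100.0
print(dataset["balance"].median())          # 75.0
print(dataset.describe())                   # count, mean, std, min, quartiles, max
```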


3.2 Methods of Pandas DataFrame.

Method    Syntax                      Definition
--------  --------------------------  ---------------------------------
unique()  dataset.column.unique()     unique values of a column
map(arg)  dataset["column"].map(arg)  maps the distinct values of a
                                      column to another set of
                                      corresponding values; arg can be
                                      a function, a dict or a column
                                      (Series)
apply()   dataset["col"].apply(func)  takes a function and applies it
                                      to all values of a column

Table: (Some) methods of a Pandas DataFrame

For associated examples and Python implementations of all the
attributes and methods, also check the provided code file.
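A minimal sketch of unique, map and apply on a made-up DataFrame (the column names, the mapping dict and the lambda are assumptions for illustration):

```python
import pandas as pd

dataset = pd.DataFrame({"job": ["admin", "tech", "admin"], "age": [30, 45, 27]})

# unique: the distinct values of a column, in order of first appearance
print(dataset["job"].unique())

# map with a dict: each value is looked up in the dict;
# values that are missing from the dict become NaN
job_code = dataset["job"].map({"admin": 0, "tech": 1})
print(job_code.tolist())        # [0, 1, 0]

# apply with a function: the function is called once per value
age_next_year = dataset["age"].apply(lambda a: a + 1)
print(age_next_year.tolist())   # [31, 46, 28]
```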


3.2.1 Methods of Pandas DataFrame: drop()
drop(): Probably the most important method used in data
manipulation.
Removing Rows from DataFrame
dataset.dropna(): Removes all rows with at least one missing
value (seldom used).

    import pandas as pd

    # Build a small example DataFrame
    data = {'Name': ['Alice', 'Bob', 'Charlie'],
            'City': ['New York', 'San Francisco', 'Los Angeles']}
    df = pd.DataFrame(data)

    # Drop a specific row by its index label
    df = df.drop(1)
    # Drop rows based on a condition: keep only the rows where
    # City is "New York" (all other rows are dropped)
    df = df[df['City'] == "New York"]
    # Reset the index after dropping rows
    df = df.reset_index(drop=True)
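The row-wise dropna() mentioned above is not exercised by the listing, so here is a minimal sketch on made-up data in which one City value is missing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"],
                   "City": ["New York", np.nan, "Los Angeles"]})

# dropna() returns a new DataFrame without the rows that contain a missing value;
# here the 'Bob' row is removed because its City is missing.
cleaned = df.dropna()
print(cleaned)
print(len(df), len(cleaned))   # 3 2
```

The original df is left unchanged; assign the result back (df = df.dropna()) to keep the cleaned version.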


3.2.2 Methods of Pandas DataFrame: drop()
Removing Columns from DataFrame
dataset.dropna(axis=1): Removes all columns with at least one
missing value (seldom used).

    import pandas as pd

    # Build a small example DataFrame
    data = {'Name': ['Alice', 'Bob', 'Charlie'],
            'City': ['New York', 'San Francisco', 'Los Angeles']}
    df = pd.DataFrame(data)

    # Each result is stored under a new name so that df keeps both
    # columns and every example below can run on its own.
    # Drop a specific column by name
    df_without_city = df.drop('City', axis=1)
    # Drop multiple columns by name
    df_without_both = df.drop(['Name', 'City'], axis=1)
    # Drop a column by position (here the second column, 'City')
    df_without_second = df.drop(df.columns[1], axis=1)
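For the column-wise dropna(axis=1) mentioned above, a minimal sketch on made-up data in which only the City column has a missing value:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob", "Charlie"],
                   "City": ["New York", np.nan, "Los Angeles"]})

# axis=1 makes dropna work on columns: every column that contains
# at least one missing value is removed, all rows are kept.
complete_columns = df.dropna(axis=1)
print(complete_columns.columns.tolist())   # ['Name']
print(complete_columns.shape)              # (3, 1)
```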

4. Data Cleaning and Preparation.

4.1 Missing Values
Missing values in a dataset can occur for several reasons.
Types of Missing Values:
1. Missing Completely at Random (MCAR): The probability of
being missing is the same for all cases.
  ▶ Missingness occurs without any systematic pattern or
  dependence on other variables.
  ▶ Example: randomly deleting survey responses without
  considering their content.
2. Missing at Random (MAR): The probability of being missing
is the same only within groups defined by the observed data.
  ▶ Missingness is related to other observed variables in the
  dataset.
  ▶ Example: when placed on a soft surface, a weighing scale
  may produce more missing values than when placed on a hard
  surface. Such data are thus not MCAR.
3. Missing not at Random (MNAR): The probability of being
missing varies for reasons that are unknown to us, that is, it
depends on information that is not observed (possibly the
missing value itself).
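The difference between MCAR and MAR can be made concrete with a small simulation of the weighing-scale example. Everything in this sketch is an assumption for illustration: the column names, the 10% / 30% / 5% missingness probabilities and the normally distributed weights.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "surface": rng.choice(["hard", "soft"], size=n),   # always observed
    "weight": rng.normal(70, 10, size=n),              # reading that may go missing
})

# MCAR: every reading has the same 10% chance of being missing.
mcar = df["weight"].mask(rng.random(n) < 0.10)

# MAR: the chance of being missing depends on the observed 'surface' column.
p_missing = np.where(df["surface"] == "soft", 0.30, 0.05)
mar = df["weight"].mask(rng.random(n) < p_missing)

# Fraction of missing readings per surface type.
mcar_rates = mcar.isna().groupby(df["surface"]).mean()
mar_rates = mar.isna().groupby(df["surface"]).mean()
print(mcar_rates)   # roughly equal for both surfaces
print(mar_rates)    # clearly higher for the soft surface
```

Under MCAR the missing-value rate is about the same in both groups, whereas under MAR it differs between groups but is constant within each group. MNAR cannot be checked this way, because the cause of the missingness is not in the data.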
4.2 Handling the Missing Values: Identifying Missing Values
▶ Missing values in a Pandas DataFrame can be identified with
the dataset.isnull() method.
▶ The total number of missing values in each column can be
found with dataset.isnull().sum().
▶ The easiest fix for missing data might be the
dataset.dropna() method, which drops every observation
that has even a single missing value.

    import numpy as np
    import pandas as pd
    from sklearn.datasets import load_iris

    iris = load_iris()  # Load the Iris dataset
    iris_df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                           columns=iris['feature_names'] + ['target'])

    np.random.seed(42)  # Introduce missing values randomly (reproducibly)
    mask = np.random.rand(*iris_df.shape) < 0.1  # about 10% of all cells
    iris_df[mask] = np.nan

    print("Missing Values in Iris Dataset:")
    print(iris_df.isnull().sum())
4.2 Handling the Missing Values: Data Imputations
The easiest fix for missing data might be the dataset.dropna()
method, which drops every observation that has even a single
missing value. Imputation fills in the missing values instead of
discarding the observation.
Data Imputation Techniques: The best way to impute the data
depends on the problem and on the assumptions made. Below we
present a few techniques:
1. Naive Method: Filling the missing value of a column by
copying the value of the previous non-missing observation.
  ▶ Syntax: dataset.ffill()
  (dataset.fillna(method="ffill") in older pandas versions; this
  form is deprecated in recent versions.)
2. Imputing with the mean/median/constant: Missing values in
the column can be imputed (filled) using the
mean/median/constant of the non-missing values in the
column. (The constant can be any value, such as 0.)
  ▶ Syntax:
  dataset.column.fillna(dataset.column.mean())
(Please check the pandas documentation for more such imputation
techniques.)
4.3 Data Imputations: Code Example
We will fill the missing values introduced in the code of
Section 4.2 (Identifying Missing Values). The dataset is iris_df.
Run the following code and observe the output:

    # Continued from the code in Section 4.2 (Identifying Missing Values).
    # Fill missing values with forward fill (ffill), mean, median and 0.
    iris_df_ffill = iris_df.ffill()
    iris_df_mean = iris_df.fillna(iris_df.mean())
    iris_df_median = iris_df.fillna(iris_df.median())
    iris_df_zero = iris_df.fillna(0)

    # Expand iris_df with the filled columns, side by side
    iris_df_expanded = pd.concat(
        [iris_df,
         iris_df_ffill.add_suffix('_ffill'),
         iris_df_mean.add_suffix('_mean'),
         iris_df_median.add_suffix('_median'),
         iris_df_zero.add_suffix('_zero')],
        axis=1)

    # Display the head of the expanded DataFrame
    print("\nDataset after Filling Missing Values:")
    print(iris_df_expanded.head())
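Because the Iris output is large, the four fills are easier to verify by hand on a tiny made-up Series (the values below are assumptions chosen so the results can be checked mentally):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan])

# Forward fill: copy the previous non-missing value.
print(s.ffill().tolist())              # [1.0, 1.0, 3.0, 3.0]
# Mean of the non-missing values is (1 + 3) / 2 = 2.
print(s.fillna(s.mean()).tolist())     # [1.0, 2.0, 3.0, 2.0]
# Median of the non-missing values is also 2.
print(s.fillna(s.median()).tolist())   # [1.0, 2.0, 3.0, 2.0]
# Constant fill with 0.
print(s.fillna(0).tolist())            # [1.0, 0.0, 3.0, 0.0]
```

Note that forward fill has nothing to copy when the very first value is missing, so a leading missing value stays missing.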

5. Data Transformation

5.1 Data Transformations: Data Scaling
Standard Scaling: Standard scaling (also known as z-score
normalization) is a technique used to standardize the range of
features by transforming them to have a mean of 0 and a standard
deviation of 1.

$$z = \frac{x - \bar{x}}{SD}$$

    import pandas as pd
    from sklearn.datasets import load_iris

    iris = load_iris()  # Load the Iris dataset
    iris_df = pd.DataFrame(data=iris['data'], columns=iris['feature_names'])

    # Standard scaling, column by column
    # (pandas .std() is the sample standard deviation, ddof=1)
    iris_standard_scaled = (iris_df - iris_df.mean()) / iris_df.std()

    print("Original Iris DataFrame:")
    print(iris_df.head())
    print("\nStandard Scaled Iris DataFrame:")
    print(iris_standard_scaled.head())  # Display scaled data
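That the transformation really yields mean 0 and standard deviation 1 can be checked on a small made-up DataFrame. The sketch also shows one detail that is an addition to the slide: pandas computes the sample standard deviation (ddof=1) by default, while passing ddof=0 gives the population standard deviation, which is what scikit-learn's StandardScaler divides by.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"height": [150.0, 160.0, 170.0, 180.0],
                   "weight": [50.0, 60.0, 80.0, 90.0]})

# z-score with the sample standard deviation (pandas default, ddof=1)
z = (df - df.mean()) / df.std()
# z-score with the population standard deviation (ddof=0)
z_pop = (df - df.mean()) / df.std(ddof=0)

print(z.mean())           # approximately 0 for every column
print(z.std())            # 1 for every column (sample SD)
print(z_pop.std(ddof=0))  # 1 for every column (population SD)
```

The two variants differ only by a constant factor per column, so the choice rarely matters for large datasets, but it explains small numerical differences between the pandas result and other libraries.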


5.1 Data Transformations: Data Scaling
Min-Max Scaling: Min-Max scaling (also known as feature scaling
or min-max normalization) is a technique used to rescale the
values of a feature to a specific range, usually between 0 and 1.

$$X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

    import pandas as pd
    from sklearn.datasets import load_iris

    iris = load_iris()  # Load the Iris dataset
    iris_df = pd.DataFrame(data=iris['data'], columns=iris['feature_names'])

    # Min-Max scaling using Pandas, column by column
    iris_minmax_scaled = (iris_df - iris_df.min()) / (iris_df.max() - iris_df.min())

    print("Original Iris DataFrame:")
    print(iris_df.head())
    print("\nMin-Max Scaled Iris DataFrame:")
    print(iris_minmax_scaled.head())  # Display scaled data
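The formula above targets the range [0, 1]. As a sketch of how a different target range could be obtained, the hypothetical helper min_max_scale below (not a pandas function; its name and the parameters low and high are assumptions) first scales to [0, 1] and then stretches and shifts the result.

```python
import pandas as pd

def min_max_scale(df, low=0.0, high=1.0):
    """Rescale each column of df linearly to [low, high] (hypothetical helper)."""
    unit = (df - df.min()) / (df.max() - df.min())   # values in [0, 1]
    return low + unit * (high - low)                 # stretch and shift

df = pd.DataFrame({"height": [150.0, 160.0, 170.0, 180.0],
                   "weight": [50.0, 60.0, 80.0, 90.0]})

scaled_01 = min_max_scale(df)                        # default range [0, 1]
scaled_pm1 = min_max_scale(df, low=-1.0, high=1.0)   # range [-1, 1]
print(scaled_01)
print(scaled_pm1)
```

A column whose values are all equal has max − min = 0, so this sketch produces NaN for it; such columns need separate handling.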


5.2 Data Transformations: Encoding

Ordinal Encoding:
▶ Ordinal encoding is used for categorical data with a
meaningful order or ranking.
▶ Each category is assigned a numerical value based on its order.
▶ Example: Low, Medium, High can be encoded as 1, 2, 3.

    import pandas as pd

    # Sample DataFrame with ordinal categories
    df = pd.DataFrame({'Category': ['Low', 'Medium', 'High', 'Low', 'High']})

    # Ordinal encoding using map
    ordinal_mapping = {'Low': 1, 'Medium': 2, 'High': 3}
    df['Category_Ordinal'] = df['Category'].map(ordinal_mapping)
    print(df)
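An alternative to a hand-written mapping, shown here only as a sketch and not as the workshop's method, is to declare the order once with an ordered categorical dtype and read off the integer codes. The codes start at 0, so 1 is added to match the encoding 1, 2, 3 used above.

```python
import pandas as pd

df = pd.DataFrame({"Category": ["Low", "Medium", "High", "Low", "High"]})

# Declare the categories and their order explicitly.
level_type = pd.CategoricalDtype(categories=["Low", "Medium", "High"], ordered=True)
as_cat = df["Category"].astype(level_type)

# .cat.codes gives the position of each value in the category list (0-based).
df["Category_Ordinal"] = as_cat.cat.codes + 1
print(df)

# Because the dtype is ordered, comparisons such as max() respect the ranking.
print(as_cat.max())   # High
```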


5.2 Data Transformations: Encoding
One-Hot Encoding:
▶ In one-hot encoding, each category is represented as a binary
vector (entries 0 or 1) in which all elements are zero except
for the index that corresponds to the category.
▶ If there are n categories, each vector has length n, so the
encoding adds one new column per category.

    import pandas as pd

    df_municipalities = pd.DataFrame(
        {'Municipality': ['Kathmandu', 'Bhaktapur', 'Lalitpur',
                          'Madhyapur Thimi', 'Kirtipur']})

    # One indicator column per municipality, named Municipality_<name>
    one_hot_encoding = pd.get_dummies(df_municipalities['Municipality'],
                                      prefix='Municipality')
    # Place the indicator columns next to the original column
    df_encoded = pd.concat([df_municipalities, one_hot_encoding], axis=1)
    print(df_encoded)  # Display the result
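One practical detail worth knowing: in recent pandas versions get_dummies returns boolean columns (True/False) by default rather than 0/1. If integer indicators are wanted, as in the description above, the dtype argument can be set. A minimal sketch on a shortened, made-up list of municipalities:

```python
import pandas as pd

df = pd.DataFrame({"Municipality": ["Kathmandu", "Bhaktapur", "Lalitpur"]})

# dtype=int asks for 0/1 integers instead of the default indicator dtype.
one_hot = pd.get_dummies(df["Municipality"], prefix="Municipality", dtype=int)
print(one_hot)

# Every row contains exactly one 1: the column of its own category.
print(one_hot.sum(axis=1).tolist())   # [1, 1, 1]
```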

6. Exercises.

Task Set-I: DataFrame Reading and Writing.
Answer the following:
Dataset: "bank.csv".
▶ Load the provided dataset into a pandas DataFrame.
▶ Check the info of the DataFrame and identify the
following:
1. the columns with dtype object,
2. the unique values of those columns,
3. the total number of null values in each column.
▶ Drop all the columns with dtype object and store the
result in a new DataFrame. Also write this DataFrame to a
CSV file named "banknumericdata.csv".
▶ Read "banknumericdata.csv" and find the summary
statistics.


Task Set-II: Data Imputations

Answer the following:

Dataset: "medical_Student.csv".
▶ Load the provided dataset into a pandas DataFrame.
▶ Check the info of the DataFrame and identify the columns with
missing (null) values.
▶ For each column with missing values, fill the values using
the various techniques discussed above. Try to explain why
you selected a particular method for a particular column.
▶ Check for any duplicate rows present in the dataset and do
what is necessary to manage the duplicate items.
{Hint: dataset.duplicated().sum()}
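The hint can be tried on a small made-up DataFrame before applying it to the exercise dataset. Note that duplicated is a method and needs parentheses; the data and the use of drop_duplicates below are assumptions for illustration, and how duplicates should be managed in the exercise is left to you.

```python
import pandas as pd

dataset = pd.DataFrame({"id": [1, 2, 2, 3], "score": [90, 85, 85, 70]})

# duplicated() marks every row that is identical to an earlier row.
n_duplicates = dataset.duplicated().sum()
print(n_duplicates)   # 1

# drop_duplicates() keeps the first occurrence of each row.
deduplicated = dataset.drop_duplicates().reset_index(drop=True)
print(deduplicated)
```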


Task Set-III: Data Transformations
Transform variables according to the following instructions:
Dataset: "performance.csv".
▶ "School", "internet", "activities" into binary: 0 or 1 (create
new columns without overwriting the existing ones).
▶ "Medu", "reason", "guardian", "studytime" and "health" into
ordinal numbers based on the number of cases in the data set
(create new columns without overwriting the existing ones).
▶ Convert the column "age" to an interval datatype, i.e. create
a new column named category age whose values are based on
the frequency in the column "age". You can create the
categorical data with the following intervals:

interval1: [15-17]; interval2: [18-20]; interval3: [21-all]

▶ Create a new column named passed (yes or no) whose values
are based on the values present in the G3 column
(>= 8: yes, < 8: no).