MLS 2 - NumPy and Pandas
NumPy and Pandas
Agenda
Pandas and Pandas Functions
NumPy stands for Numerical Python - one of the fundamental packages for
mathematical, logical, and statistical operations with Python
Provides a large set of functions for creating, manipulating, and transforming ndarrays
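As a rough illustration of creating, manipulating, and transforming ndarrays, here is a minimal sketch. The variable names and values are made up for this example and do not come from the slides.

```python
import numpy as np

# Create: a 1-D ndarray holding the integers 0 through 5
arr = np.arange(6)

# Manipulate: arrange the same six values as a 2x3 grid
grid = arr.reshape(2, 3)

# Transform: element-wise arithmetic, no explicit loop needed
doubled = grid * 2

# Statistical and logical operations on the grid
col_means = grid.mean(axis=0)  # mean of each column
is_even = grid % 2 == 0        # boolean array, True where the value is even

print(doubled)
print(col_means)
print(is_even)
```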
Function: np.random.randint()
Syntax: np.random.randint(low, high, size)
Example: np.random.randint(1, 10, size=(2, 3))
Description: To create an array of specified shape filled with random integers from low (inclusive) to high (exclusive)
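The example call from the table can be run as is. The sketch below also checks the inclusive/exclusive bounds; the variable name `rand_ints` is only illustrative.

```python
import numpy as np

# 2 rows x 3 columns of random integers drawn from 1 (inclusive) to 10 (exclusive)
rand_ints = np.random.randint(1, 10, size=(2, 3))

print(rand_ints)
print(rand_ints.shape)  # the requested shape, as a tuple

# Every value is at least 1 and strictly less than 10
print(((rand_ints >= 1) & (rand_ints < 10)).all())
```

The values differ on every run because no random seed is set.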
A cust_data.iloc[:100, 2:3]
B cust_data.iloc[:100, 2:4]
A pandas dataframe can be thought of as an Excel spreadsheet - data stored in rows and
columns.
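To make the rows-and-columns picture concrete, here is a small sketch. The dataframe name `cust_data` is borrowed from the quiz options, but the column names and values are invented for illustration.

```python
import pandas as pd

# Hypothetical customer data: each key becomes a column, each list entry a row
cust_data = pd.DataFrame({
    "Cust_ID": [101, 102, 103, 104],
    "Age": [34, 28, 45, 39],
    "City": ["Pune", "Delhi", "Mumbai", "Chennai"],
    "Spend": [250.0, 120.5, 310.0, 95.0],
})

print(cust_data.shape)  # (number of rows, number of columns)

# iloc selects by integer position; the end of each slice is excluded,
# so 2:4 picks the columns at positions 2 and 3
subset = cust_data.iloc[:2, 2:4]
print(subset)
```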
Which of the following can be used to drop the Job Category column and
ensure that the modification is made directly to the dataframe?
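One common way to drop a column so that the dataframe itself is changed is `drop` with `inplace=True`. The sketch below assumes a small made-up dataframe; only the column name Job Category comes from the question.

```python
import pandas as pd

# Hypothetical data containing a "Job Category" column
df = pd.DataFrame({
    "Name": ["Asha", "Ravi"],
    "Job Category": ["Engineer", "Analyst"],
    "Salary": [70000, 55000],
})

# axis=1 targets columns; inplace=True modifies df itself and returns None
df.drop("Job Category", axis=1, inplace=True)

print(df.columns.tolist())
```

Without `inplace=True`, `drop` returns a new dataframe and leaves `df` unchanged unless the result is assigned back.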
This file is meant for personal use by [email protected] only.
Sharing or publishing the contents in part or full is liable for legal action.
Proprietary content. © Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
NumPy and Pandas Quiz
Consider a dataframe df having the columns Gender and Height.
Which of the following can be used to get the average height by different
categories of gender?
A df.groupby(['Height'])['Gender'].mean()
B df.groupby(['Gender']).Height.mean()
C df.groupby(['Gender'])['Height'].mean
D df.groupby(['Gender'])['Height'].mean()
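A minimal sketch of a group-wise average, assuming a made-up dataframe with the Gender and Height columns named in the question:

```python
import pandas as pd

# Hypothetical data with the two columns from the question
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "Height": [160.0, 175.0, 170.0, 185.0],
})

# Group rows by Gender, pick the Height column, then call mean() on each group.
# The parentheses matter: .mean without them refers to the method, not its result.
avg_height = df.groupby(["Gender"])["Height"].mean()

print(avg_height)
```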
Pandas Functions
Function Syntax Example Description
Consider two dataframes df1 and df2 containing a common column Cust_ID.
Which of the following code snippets will merge these two dataframes?
merge - more versatile; allows specifying the columns (besides the index) to join on
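As a sketch of joining on a named column, the example below merges two made-up dataframes on Cust_ID. The names `df1`, `df2`, and `Cust_ID` follow the quiz question; the other columns and values are assumptions.

```python
import pandas as pd

# Hypothetical dataframes sharing the Cust_ID column
df1 = pd.DataFrame({"Cust_ID": [1, 2, 3], "Name": ["Asha", "Ravi", "Meena"]})
df2 = pd.DataFrame({"Cust_ID": [2, 3, 4], "Spend": [120.5, 310.0, 95.0]})

# on= names the column to join on; the default how="inner" keeps only
# the Cust_ID values that appear in both dataframes
merged = pd.merge(df1, df2, on="Cust_ID")

print(merged)
```

Passing `how="left"`, `how="right"`, or `how="outer"` changes which unmatched rows are kept.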
A pd.read_csv(Customer_Data.csv)
B pd.read_csv("Customer_Data.csv")
C pd.read_csv('Customer_Data.csv')
D pd.read_csv(Customer_Data)
Loading Datasets in Pandas
read_csv - pandas function used to load datasets in CSV format into a pandas dataframe
Syntax: df = pd.read_csv("file_name.csv")
The file name has to be enclosed in quotation marks (single or double)
Above syntax works when the file (dataset) is in the same working directory as the Python
notebook
When the file (dataset) and the Python notebook are not in the same working directory,
the path to the file has to be specified
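The sketch below shows loading a CSV by its path. So that it runs anywhere, it first writes a small made-up file to a temporary folder; the folder, file contents, and variable names are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

# Set-up for the example: write a tiny CSV into a temporary folder
data_dir = tempfile.mkdtemp()
file_path = os.path.join(data_dir, "Customer_Data.csv")
pd.DataFrame({"Cust_ID": [1, 2], "Spend": [120.5, 310.0]}).to_csv(file_path, index=False)

# The file is not in the working directory, so the path to it is passed
# as a string instead of the bare file name
df = pd.read_csv(file_path)

print(df.head())
```

If the file sat next to the notebook, `pd.read_csv("Customer_Data.csv")` would be enough.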