
MLS 2 - NumPy and Pandas

Uploaded by

for.nagesh

NumPy and Pandas

This file is meant for personal use by [email protected] only.


Sharing or publishing the contents in part or full is liable for legal action.
Proprietary content. © Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Agenda

● NumPy and Pandas Quiz

● NumPy and NumPy Functions

● Pandas and Pandas Functions

● Merge and Join in Pandas

● Loading Datasets in Pandas

Let’s begin the discussion by answering a few questions on NumPy and Pandas

NumPy and Pandas Quiz

What does the following code snippet do?


np.arange(1, 10, 2)

A Return an array of integers from 1 to 10 (included) with step size 2.



B Return an array of integers from 1 to 9 (included) with step size 2.

C Return an array of integers from 1 to 9 (excluded) with step size 2.

D Return an array of integers from 2 to 10 (included) with step size 1.


NumPy

NumPy stands for Numerical Python - one of the fundamental packages for
mathematical, logical, and statistical operations with Python

Provides a powerful n-dimensional array object, called ndarray

Provides a large set of functions for creating, manipulating, and transforming ndarrays

| Function | Syntax | Example | Description |
| --- | --- | --- | --- |
| np.array() | np.array(object, dtype=None) | np.array([1, 2, 3]) | To create an array |
| np.arange() | np.arange(start, stop, step) | np.arange(0, 10, 2) | To create an array of evenly spaced values within a given interval |
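
As a small illustration of the two functions in the table, the sketch below builds an array from a list and two ranges with a step. The variable names are made up for this example.

```python
import numpy as np

# Build an ndarray from a Python list; dtype is inferred when not given
from_list = np.array([1, 2, 3])

# Same values stored as floats by passing dtype explicitly
as_float = np.array([1, 2, 3], dtype=float)

# Evenly spaced values: start is included, stop is excluded
evens = np.arange(0, 10, 2)
odds = np.arange(1, 10, 2)

print(from_list, as_float, evens, odds)
```

Because stop is excluded, np.arange(1, 10, 2) ends at 9, the last value below 10 that the step reaches.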

NumPy and Pandas Quiz

What does the following code snippet do?


np.random.randint(10, 20, 1000)

A Return an array of 1000 integers between 10 and 20 (excluded)



B Return an array of 10 integers between 20 and 1000 (included)

C Return an array of 10 integers between 20 and 1000 (excluded)

D Return an array of 1000 integers between 10 and 20 (included)


NumPy Functions
| Function | Syntax | Example | Description |
| --- | --- | --- | --- |
| np.random.rand() | np.random.rand(d0, d1, ..., dn), where d0, d1, ..., dn are the dimensions of the returned array | np.random.rand(3, 2) | To create an array of specified shape filled with random values from the uniform distribution |
| np.random.randint() | np.random.randint(low, high, size) | np.random.randint(1, 10, size=(2, 3)) | To create an array of specified shape filled with random integers from low (inclusive) to high (exclusive) |
| np.random.randn() | np.random.randn(d0, d1, ..., dn), where d0, d1, ..., dn are the dimensions of the returned array | np.random.randn(2, 3) | To create an array of specified shape filled with random values from the standard normal distribution |
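
The three random functions can be tried side by side, as in the sketch below. The seed value and variable names are arbitrary choices for this example; the checks at the end only confirm the shapes and the inclusive/exclusive bounds described in the table.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so repeated runs give the same numbers

uniform = np.random.rand(3, 2)              # shape (3, 2), values in [0, 1)
integers = np.random.randint(10, 20, 1000)  # 1000 integers with 10 <= x < 20
normal = np.random.randn(2, 3)              # shape (2, 3), standard normal draws

print(uniform.shape, normal.shape, integers.shape)
print(integers.min() >= 10, integers.max() < 20)
```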

NumPy and Pandas Quiz

Consider a dataframe cust_data having information on the following attributes of 200 customers (in the same order) - ID, Name, Age, Annual Income, Job Category.
Which of the following can be used to fetch the Age and Annual Income of the first 100 customers?

A cust_data.iloc[:100, 2:3]

B cust_data.iloc[:100, 2:4]

C cust_data.loc[:100, 'Age':'Annual Income']

D cust_data.loc[:100, 'Age':'Job Category']


Pandas
Pandas is primarily used for analysis and manipulation of tabular data

Offers two major data structures - Series and DataFrame

One can think of a pandas dataframe as an Excel spreadsheet - data stored in rows and columns.

| Function | Syntax | Example | Description |
| --- | --- | --- | --- |
| df.loc[] | df.loc[row_label_start:row_label_end, column_label_start:column_label_end] | df.loc[10:100, 'Age':'Annual Income'] | Access elements via label-based indexing (includes the end label) |
| df.iloc[] | df.iloc[row_index_start:row_index_end, column_index_start:column_index_end] | df.iloc[10:20, 2:4] | Access elements via integer-based indexing (excludes the end index) |
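
The difference between the two end-point rules is easiest to see on a small example. The sketch below builds a hypothetical stand-in for cust_data (random values, default 0-based integer index, same column order as in the quiz) and compares the shapes returned by iloc and loc.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for cust_data: 200 customers, default index 0..199
n = 200
cust_data = pd.DataFrame({
    "ID": np.arange(1, n + 1),
    "Name": [f"Customer {i}" for i in range(1, n + 1)],
    "Age": np.random.randint(18, 70, n),
    "Annual Income": np.random.randint(20000, 150000, n),
    "Job Category": np.random.choice(["A", "B", "C"], n),
})

# Positions: rows 0..99, columns at positions 2 and 3 (end positions excluded)
by_position = cust_data.iloc[:100, 2:4]

# Labels: rows labelled 0..99, both end labels included
by_label = cust_data.loc[:99, "Age":"Annual Income"]

# Label 100 is included as well, so this slice has one more row
one_extra = cust_data.loc[:100, "Age":"Annual Income"]

print(by_position.shape, by_label.shape, one_extra.shape)
```

With a default 0-based index, loc[:100] therefore returns 101 rows, while iloc[:100] returns exactly 100.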
NumPy and Pandas Quiz

Which of the following can be used to drop the Job Category column and
ensure the modification is made directly to the dataframe?

A cust_data.drop('Job Category', axis=0, inplace=True)



B cust_data.drop('Job Category', axis=1, inplace=False)

C cust_data.drop('Job Category', axis=0, inplace=False)

D cust_data.drop('Job Category', axis=1, inplace=True)


Pandas Functions
| Function | Syntax | Example | Description |
| --- | --- | --- | --- |
| df.drop() | df.drop(labels, axis, inplace) | df.drop('Job Category', axis=1, inplace=True) | Drop specified labels from rows or columns |

inplace=True: Modifies the dataframe directly instead of returning a modified copy

axis=0: Drops the labels from the rows (index)

axis=1: Drops the labels from the columns
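
A short sketch of how axis and inplace interact, using a made-up three-row dataframe (the data and variable names are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical data for illustration
cust_data = pd.DataFrame({
    "ID": [1, 2, 3],
    "Age": [25, 40, 31],
    "Job Category": ["A", "B", "A"],
})

# inplace=False (the default): the original is untouched, a new dataframe is returned
without_job = cust_data.drop("Job Category", axis=1, inplace=False)

# axis=0 looks for the label in the row index, so this drops the row labelled 0
without_first_row = cust_data.drop(0, axis=0)

# inplace=True: cust_data itself changes and the call returns None
result = cust_data.drop("Job Category", axis=1, inplace=True)

print(list(cust_data.columns), result)
```

Since the in-place call returns None, assigning its result back to cust_data would lose the dataframe.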

NumPy and Pandas Quiz
Consider a dataframe df having the columns Gender and Height.
Which of the following can be used to get the average height by different
categories of gender?

A df.groupby(['Height'])['Gender'].mean()

B df.groupby(['Gender']).Height.mean()

C df.groupby(['Gender'])['Height'].mean

D df.groupby(['Gender'])['Height'].mean()
Pandas Functions
| Function | Syntax | Example | Description |
| --- | --- | --- | --- |
| df.groupby() | df.groupby(['column_name'])[aggregate_column].agg_func() | df.groupby(['Gender'])['Height'].mean() | To split, apply and combine the data structures to get aggregated values with respect to attribute(s) |
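
The split-apply-combine pattern can be checked on a tiny dataframe. The data below is invented for this sketch; it also shows that attribute-style column access gives the same result, and that leaving off the parentheses returns the method itself rather than the averages.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "Height": [160.0, 175.0, 170.0, 185.0],
})

# Split by Gender, take the Height column, apply mean to each group
avg_height = df.groupby(["Gender"])["Height"].mean()

# Attribute access selects the same column
same_result = df.groupby(["Gender"]).Height.mean()

# Without parentheses nothing is computed; this is just the bound method
not_called = df.groupby(["Gender"])["Height"].mean

print(avg_height)
```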

NumPy and Pandas Quiz

Consider two dataframes df1 and df2 containing a common column Cust_ID.
Which of the following code snippets will merge these two dataframes?

A pd.merge(df1, df2, on='Cust_ID', how='inner')

B pd.merge(df1, on='Cust_ID', how='inner')

C df1.merge(df2, on='Cust_ID', how='inner')

D pd.merge(df1, df2, on='Cust_ID')


Pandas - Merge and Join
join - works best when joining dataframes on their indices (though you can specify another
column to join on)

merge - more versatile and allows you to specify columns (besides the index) to join on

| how='inner' | how='outer' | how='left' | how='right' |
| --- | --- | --- | --- |
| Retains only the rows that are common between the dataframes | Retains all the rows from both the dataframes | Retains all the rows from the first dataframe and only the matching ones from the second | Retains all the rows from the second dataframe and only the matching ones from the first |
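
To make the four options concrete, the sketch below merges two small, made-up dataframes that share a Cust_ID column and prints which customer IDs survive under each value of how. It also shows an index-based join for comparison. All data and names here are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: customers 2 and 3 appear in both dataframes
df1 = pd.DataFrame({"Cust_ID": [1, 2, 3], "Name": ["Ann", "Bob", "Cy"]})
df2 = pd.DataFrame({"Cust_ID": [2, 3, 4], "Spend": [250, 120, 90]})

# One merge per join type, keyed by the value passed to how
merged = {
    how: pd.merge(df1, df2, on="Cust_ID", how=how)
    for how in ["inner", "outer", "left", "right"]
}

for how, frame in merged.items():
    print(how, sorted(frame["Cust_ID"].tolist()))

# join works on the index, so the key column is moved into the index first;
# by default it keeps all rows of the calling (left) dataframe
joined = df1.set_index("Cust_ID").join(df2.set_index("Cust_ID"))
print(joined)
```

Rows without a match on the other side are kept with missing values in the columns that come from the other dataframe.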

Pandas - Merge Example

[Figure: merge example]
NumPy and Pandas Quiz

Consider a file Customer_Data.csv that contains multiple attributes of 100 customers. Which of the following can be used to load the file into a pandas dataframe?

A pd.read_csv(Customer_Data.csv)

B pd.read_csv("Customer_Data.csv")

C pd.read_csv('Customer_Data.csv')

D pd.read_csv(Customer_Data)
Loading Datasets in Pandas
read_csv - pandas function used to load datasets in CSV format into a pandas dataframe

Syntax: df = pd.read_csv("file_name.csv")

Pandas has to be imported with the alias pd - import pandas as pd

The file name has to be enclosed in quotation marks (single or double)

The above syntax works when the file (dataset) is in the same working directory as the Python
notebook

When the file (dataset) and the Python notebook are not in the same working directory,
the path to the file has to be specified
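
A self-contained sketch of loading a CSV by path is shown below. So that it runs anywhere, it first writes a tiny file to a temporary folder; the file contents and the folder are invented for this example, and in practice the path would point at the real dataset.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Write a tiny CSV to a temporary folder so the example is self-contained
folder = Path(tempfile.mkdtemp())
csv_path = folder / "Customer_Data.csv"
csv_path.write_text("ID,Name,Age\n1,Ann,34\n2,Bob,45\n")

# The file is not in the working directory, so the path to it is passed
df = pd.read_csv(csv_path)

print(df.shape)
print(df.head())
```

When the file sits next to the notebook, passing just the quoted file name, as in the syntax above, is enough.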

Happy Learning !