Packages in Python

The document outlines experiments conducted using Python Pandas and Matplotlib for data manipulation, analysis and visualization. It includes steps to load datasets, perform operations like grouping, merging, EDA using aggregates and null handling, and generate different plot types like line, bar, histogram and scatter plots.

Uploaded by

Bharath M

PACKAGES IN PYTHON
LAB EXERCISES

TABLE OF CONTENT

S. No   Experiment                                    Date   Remark
1       Data Manipulation – Loading and Filtration
2       Grouping and Merging data using Pandas
3       Exploratory Data Analysis using Pandas
4       Plotting Using Matplotlib -1
5       Plotting Using Matplotlib -2

EXP NO DATE
1 Data Manipulation – Loading and Filtration

AIM:
To write a Python program that uses the Pandas module to perform basic data
manipulation tasks in Jupyter Notebook

PROCEDURE:
Step – 1 : Install the required packages from the Command Prompt using pip
Step – 2 : Launch Jupyter Notebook from the Command Prompt
Step – 3 : Import the necessary libraries at the beginning of the notebook
Step – 4 : Upload the ‘titanic’ dataset and load it using pandas functions
Step – 5 : Perform the necessary programming

PROGRAM :
import pandas as pd
df=pd.read_csv('titanic.csv')
df
#DISPLAY THE DATA WITH 'Name' AS THE INDEX
#(set_index returns a new DataFrame; df itself is unchanged)
df.set_index('Name')

#SELECT SINGLE COL AND PRINT DATA
df['Pclass']

#SELECT MULTIPLE COLUMNS
df[['Name', 'Age', 'Sex']]

#SELECT SINGLE COLUMN AND PRINT LAST 5 ELEMENTS
df['Name'].tail(5)

# SELECT MULTIPLE ROWS AND PRINT FIRST 5 ELEMENTS


df.iloc[784:789].head()

#SELECT MULTIPLE ROWS & COL FROM DATASET AND PRINT IT


df.iloc[ 0:5 , 0:5 ]

# SELECT ALL ROWS AND SOME COL(MORE THAN 2) AND PRINT IT


df.iloc[ : , 0:5 ]
# Deleting a Column
del df['Sex']
df

#RENAME THE 'Ticket' AND 'Name' COLUMNS AND PRINT THE RESULT
c=df.rename(columns={ 'Ticket':'Ticket_No', 'Name':'Passenger_name'})
c
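
The selection steps above can be tried without the dataset. The following is a minimal sketch, assuming a small hand-made table in place of titanic.csv (the rows and the `PassengerId` column are illustrative, not taken from the lab data). It also adds a boolean filter, which the experiment title mentions but the program does not show.

```python
import pandas as pd

# Hypothetical stand-in for titanic.csv; the values are illustrative only.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Name": ["Ann", "Bob", "Cara", "Dev"],
    "Pclass": [3, 1, 3, 2],
    "Age": [22.0, 38.0, 26.0, 35.0],
})

# set_index returns a new DataFrame; assign it to keep the result.
by_name = df.set_index("Name")

# Position-based selection: rows 0-1 and columns 0-2 (end positions excluded).
by_position = df.iloc[0:2, 0:3]

# Label-based selection on the new index: the end label is included.
by_label = by_name.loc["Ann":"Bob", ["Pclass", "Age"]]

# Boolean filtering: keep only third-class passengers.
third_class = df[df["Pclass"] == 3]

# rename also returns a new DataFrame and leaves df untouched.
renamed = df.rename(columns={"Name": "Passenger_name"})
```

Note that `iloc` slices exclude the end position while `loc` slices include the end label, which is a common source of off-by-one mistakes.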

OUTPUT:
RESULT:
Thus, the Python program to perform basic data manipulation tasks using
Jupyter Notebook was executed and the output was verified successfully.

EXP NO DATE
2 Grouping and Merging data using Pandas

AIM:
To write a Python program that uses the Pandas module to perform different
grouping and merging operations in Jupyter Notebook

PROCEDURE:
Step – 1 : Install the required packages from the Command Prompt using pip
Step – 2 : Launch Jupyter Notebook from the Command Prompt
Step – 3 : Import the necessary libraries at the beginning of the notebook
Step – 4 : Upload the ‘Online_Attendance’, ‘covid_vaccine_statewise’ and
‘StatewiseTestingDetails’ datasets and load them using pandas functions
Step – 5 : Perform the necessary programming

PROGRAM:
#(i) Grouping
import pandas as pd
df=pd.read_csv("Online_Attendance.csv")
df

#Grouping based on 'Category' Column


g=df.groupby("Category")
new=g.get_group("IBM")
new

#Exporting to a New CSV


new.to_csv("New_data_set.csv")

# With Corona Dataset


df1=pd.read_csv("StatewiseTestingDetails.csv")
df1

#Grouping By Date Column


g1=df1.groupby("Date")
g1

#Grouping for a Particular Date


new1=g1.get_group("14-02-2021")
new1

#Exporting New Dataset


new1.to_csv("New_data1_set.csv")

# Grouping for TamilNadu


df2=pd.read_csv("covid_vaccine_statewise.csv")
g2=df2.groupby("State")
new2=g2.get_group("Tamil Nadu")
new2

# Joining two different Datasets


#Combining Df1 with Df2 using Left Join
join_data=pd.merge(df1,df2,on="State",how="left")
join_data

#Combining Df1 with Df2 using Right Join


join_data=pd.merge(df1,df2,on="State",how="right")
join_data

# Combining Df1 with Df2 using inner Join


join_data=pd.merge(df1,df2,on="State",how="inner")
join_data

# Combining Df1 with Df2 using Outer Join


join_data=pd.merge(df1,df2,on="State",how="outer")
join_data
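
To see how the four join types differ, it helps to count the rows each one returns. The following is a minimal sketch, assuming two tiny hand-made tables in place of the COVID datasets (the state names, column names other than `State`, and numbers are illustrative only).

```python
import pandas as pd

# Hypothetical miniature tables; the values are illustrative only.
tests = pd.DataFrame({
    "State": ["Kerala", "Kerala", "Tamil Nadu", "Goa"],
    "TotalSamples": [100, 150, 200, 50],
})
vaccines = pd.DataFrame({
    "State": ["Kerala", "Tamil Nadu", "Punjab"],
    "Doses": [10, 20, 30],
})

# get_group returns the rows whose key equals the given value.
kerala = tests.groupby("State").get_group("Kerala")

# An aggregate collapses each group to a single value per state.
samples_per_state = tests.groupby("State")["TotalSamples"].sum()

# Row counts for each join type on the shared "State" column.
rows = {
    how: len(pd.merge(tests, vaccines, on="State", how=how))
    for how in ["left", "right", "inner", "outer"]
}
print(rows)  # {'left': 4, 'right': 4, 'inner': 3, 'outer': 5}
```

Here "Goa" appears only in the first table and "Punjab" only in the second, so the inner join drops both, the outer join keeps both, and the left and right joins each keep one of them with missing values in the unmatched columns.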
OUTPUT:
RESULT:
Thus, a Python program using the Pandas module was created to perform different
grouping and merging operations in Jupyter Notebook, and the output was
verified successfully.

EXP NO DATE
3 Exploratory Data Analysis using Pandas

AIM:
To write a Python program that uses the Pandas module to perform an
exploratory data analysis in Jupyter Notebook

PROCEDURE:
Step – 1 : Install the required packages from the Command Prompt using pip
Step – 2 : Launch Jupyter Notebook from the Command Prompt
Step – 3 : Import the necessary libraries at the beginning of the notebook
Step – 4 : Upload the ‘Loan_Data’ dataset and load it using pandas functions
Step – 5 : Perform the necessary programming

Program :
#Exploratory Data Analysis
import pandas as pd
df=pd.read_csv("Loan_Data.csv")
df

#Display Number of rows and Columns


df.shape
#Checking Number of Null Values in Each Column
df.isnull().sum()

#Displaying Data types of Individual Columns


df.dtypes

#Display Last 5 Rows


df.tail()

#Replacing Gender Column's Null Value with mode()


#Assigning the result back avoids calling fillna(inplace=True) on a
#selected column, which recent pandas versions warn about
df['Gender'] = df['Gender'].fillna(df['Gender'].mode()[0])
df
df.isnull().sum()

#Gender - Null Value is now Zero


#Replacing every column Null Value with mode() and mean()
df['Married'] = df['Married'].fillna(df['Married'].mode()[0])
df['Dependents'] = df['Dependents'].fillna(df['Dependents'].mode()[0])
df['Self_Employed'] = df['Self_Employed'].fillna(df['Self_Employed'].mode()[0])
df['Loan_Amount_Term'] = df['Loan_Amount_Term'].fillna(df['Loan_Amount_Term'].mean())
df['LoanAmount'] = df['LoanAmount'].fillna(df['LoanAmount'].mean())
df['Credit_History'] = df['Credit_History'].fillna(df['Credit_History'].mode()[0])

#Checking Null Values Again


df.isnull().sum()

#Exporting to an External CSV File


df.to_csv("FINAL.CSV")

#Converting Yes to 1 and No to 0


df["Self_Employed"] = df["Self_Employed"].replace({"Yes": 1, "No": 0})
df
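
The null-handling steps above can be checked on a small example. The following is a minimal sketch, assuming a hand-made three-column table in place of Loan_Data.csv (the rows are illustrative only). It uses `map` for the Yes/No encoding as one possible alternative to `replace`.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature loan table; the values are illustrative only.
df = pd.DataFrame({
    "Gender": ["Male", None, "Male", "Female"],
    "LoanAmount": [100.0, np.nan, 200.0, 300.0],
    "Self_Employed": ["No", "Yes", None, "No"],
})

# Categorical columns: fill gaps with the most frequent value (mode).
for col in ["Gender", "Self_Employed"]:
    df[col] = df[col].fillna(df[col].mode()[0])

# Numeric column: fill gaps with the mean of the observed values.
df["LoanAmount"] = df["LoanAmount"].fillna(df["LoanAmount"].mean())

# map applies the Yes/No -> 1/0 encoding in a single pass.
df["Self_Employed"] = df["Self_Employed"].map({"Yes": 1, "No": 0})

print(df.isnull().sum())
```

The mode suits categorical columns because a mean is undefined for text values, while the mean is a simple default for numeric columns.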

OUTPUT:
RESULT :
Thus, a Python program using the Pandas module to perform an exploratory data
analysis in Jupyter Notebook was created and the output was verified successfully.

EXP NO DATE
4 Plotting Using Matplotlib -1

AIM:
To write a Python program that uses Pandas and Matplotlib to generate basic
graphs in Jupyter Notebook

PROCEDURE:
Step – 1 : Install the required packages from the Command Prompt using pip
Step – 2 : Launch Jupyter Notebook from the Command Prompt
Step – 3 : Import the necessary libraries at the beginning of the notebook
Step – 4 : Upload the ‘C19INDIA’ and ‘Vaccine’ datasets and load them using
pandas functions
Step – 5 : Perform the necessary programming
PROGRAM :
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#Loading the Dataset
df=pd.read_csv("C19INDIA.CSV")
df
df.head()
df.describe()

#Naming the index 'Sno' and dropping the now redundant 'Sno' column


df.index.name="Sno"
df.drop ("Sno", axis=1, inplace=True)
df

#Dropping the first row (index label 0)


new_df = df.drop(0)
new_df
new_df.describe()

#Plotting State vs Confirmed


plt.figure(figsize=(10,10) )
plt.bar(new_df['State/UnionTerritory'],new_df['Confirmed'])
plt.xticks(rotation=90)
plt.show()
#Using the Second Dataset
df1 = pd.read_csv("Vaccine.csv")
df1

#Plotting
plt.figure(figsize=(20,25))
plt.plot(df1['CoviShield (Doses Administered)'], color='Blue',
         linestyle='--', marker='o')
plt.xlabel("Days")
plt.ylabel("Vaccine Number")
plt.show()
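
The bar chart above can also be built with Matplotlib's object-oriented interface, which returns the bar objects for inspection. The following is a minimal sketch, assuming a tiny hand-made table in place of C19INDIA.CSV (the counts are illustrative only). The `Agg` backend line is an assumption for running outside a notebook; in Jupyter it can be removed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical case counts; the values are illustrative only.
df = pd.DataFrame({
    "State/UnionTerritory": ["Kerala", "Goa", "Punjab"],
    "Confirmed": [300, 50, 120],
})

# Sorting first makes the bar heights easier to compare.
ordered = df.sort_values("Confirmed", ascending=False)

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(ordered["State/UnionTerritory"], ordered["Confirmed"])
ax.set_xlabel("State/UnionTerritory")
ax.set_ylabel("Confirmed")
ax.tick_params(axis="x", labelrotation=90)  # same effect as plt.xticks(rotation=90)
fig.tight_layout()

# Each bar's height equals the value it represents.
heights = [int(bar.get_height()) for bar in bars]
print(heights)  # [300, 120, 50]
```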

OUTPUT:
RESULT:
Thus, a Python program using Pandas and Matplotlib to generate basic graphs in
Jupyter Notebook was executed and the output was verified successfully.
EXP NO DATE
5 Plotting Using Matplotlib -2

AIM:
To write a Python program that uses Matplotlib to generate basic graphs in
Jupyter Notebook

PROCEDURE:
Step – 1 : Install the required packages from the Command Prompt using pip
Step – 2 : Launch Jupyter Notebook from the Command Prompt
Step – 3 : Import the necessary libraries at the beginning of the notebook
Step – 4 : Perform the necessary programming

PROGRAM:
#Importing
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.style

#Line Graph
x=[5,6,8,10,15]
y=[20,30,40,50,55]
plt.plot(x,y)
plt.title('STUDENT DATA-LINE GRAPH')
plt.ylabel('Present ')
plt.xlabel('Roll.no')
plt.show()

# LINE GRAPH WITH STYLE:


x=[5,6,8,10,15]
y=[20,30,40,50,55]
x2=[2,13,16,20,18]
y2=[25,35,16,23.5,40]
plt.plot(x,y,'c',label='A',linewidth=6)
plt.plot(x2,y2,'purple',label='B',linewidth=6)
plt.title('STUDENT DATA-LINE GRAPH WITH STYLE')
plt.ylabel('Present %')
plt.xlabel('Roll.no')
plt.legend()
plt.show()

# BAR GRAPH:
studentnames = ['Jack','Daniel','Bira','Antiquity','Heineken']
marks = [850,1350,220,900,190]
plt.bar(studentnames,marks,color='purple')
plt.title('STUDENT DATA-BAR GRAPH VERTICAL')
plt.xlabel('NAMES')
plt.ylabel('MARKS')
plt.show()

# Horizontal Bar Graph


studentnames = ['Jack','Daniel','Bira','Antiquity','Heineken']
marks = [850,1350,220,900,190]
plt.barh(studentnames,marks,color='orange')
plt.title('STUDENT DATA-BAR GRAPH HORIZONTAL')
#For barh the marks run along the x-axis and the names along the y-axis
plt.xlabel('MARKS')
plt.ylabel('NAMES')
plt.show()

# Histogram
student_marks=[45,12,13,26,15,55,100,98,95,54,58,56,52,24,71,66,
               66.5,12,23,55,78,10,9,5,10,22,35,65,45]
bins=[0,10,20,30,40,50,60,70,80,90,100]
plt.hist(student_marks,bins,rwidth=0.8,color='purple')
plt.xlabel('MARKS')
plt.ylabel('NUMBER OF STUDENT')
plt.title('STUDENT DATA-HISTOGRAM')
plt.show()

# SCATTER PLOT:
x=[5,6,8,10,15]
y=[20,30,40,50,55]
x2=[2,13,16,20,18]
y2=[25,35,16,23.5,40]
plt.scatter(x,y,color='red')
plt.scatter(x2,y2,color='black')
plt.title('STUDENT DATA-SCATTER PLOT')
plt.ylabel('Present %')
plt.xlabel('Roll.no')
plt.show()
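
The bar heights drawn by the histogram step can be checked numerically. The following is a minimal sketch, assuming a short hand-made list of marks and wider bins than the program uses (both illustrative only). It relies on `np.histogram`, which `plt.hist` uses for its binning.

```python
import numpy as np

# Hypothetical marks and bin edges; the values are illustrative only.
marks = [5, 12, 45, 55, 58, 100]
bins = [0, 20, 40, 60, 80, 100]

# Each bin is [low, high) except the last, which also includes its upper edge.
counts, edges = np.histogram(marks, bins=bins)
print(counts.tolist())  # [2, 0, 3, 0, 1]
```

The mark of 100 lands in the last bin only because that bin is closed on the right; a value equal to any other edge, such as 20, would fall into the bin that starts there.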

OUTPUT:

RESULT:
Thus, a Python program using Matplotlib to generate basic graphs in
Jupyter Notebook was executed and the output was verified successfully.
