168 python

The document contains a series of Python programming exercises using the Pandas library for data manipulation and Matplotlib for data visualization. It includes tasks such as creating dataframes, handling missing data, grouping data, and implementing various types of plots. Each exercise is accompanied by sample code and expected output.

Uploaded by

shine prints

EX.NO : 7A SERIES AND DATAFRAME USING PANDAS
DATE:

QUESTION:

Write a Python program to create a dataframe using pandas. Perform the following operations on the
dataframe:
i. Data Selection
ii. Data Indexing
iii. Handling missing data in nominal attributes
iv. Handling missing data in numeric attributes
v. Grouping

AIM :

ALGORITHM :

Reg No : 2127240501139 CS22201 Python for Data Science Laboratory Page No :


PROGRAM :
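import numpy as np
import pandas as pd

# Sample records; np.nan marks the missing values.
data = {
    'Name': ['John', 'Jane', 'Bob', 'Alice'],
    'Age': [25, 30, np.nan, 35],
    'Gender': ['Male', 'Female', np.nan, 'Female'],
    'Salary': [50000, 60000, 45000, np.nan]
}
df = pd.DataFrame(data)

# Data selection: pick a subset of columns.
print("Data Selection:")
print(df[['Name', 'Age']])

# Data indexing: label-based slice (both end labels are included).
print("\nData Indexing:")
print(df.loc[1:3])

# Nominal attribute: replace missing values with a placeholder category.
# Assigning the result back avoids the chained inplace fillna, which newer pandas versions warn about.
print("\nHandling missing data in nominal attributes:")
df['Gender'] = df['Gender'].fillna('Not Specified')
print(df)

# Numeric attribute: replace missing values with the column mean.
print("\nHandling missing data in numeric attributes:")
df['Salary'] = df['Salary'].fillna(df['Salary'].mean())
print(df)

# Grouping: mean salary per gender.
print("\nGrouping operations:")
grouped_data = df.groupby('Gender')['Salary'].mean()
print(grouped_data)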

SAMPLE OUTPUT :
Data Selection:
Name Age
0 John 25.0
1 Jane 30.0
2 Bob NaN
3 Alice 35.0

Data Indexing:
Name Age Gender Salary
1 Jane 30.0 Female 60000.0
2 Bob NaN NaN 45000.0
3 Alice 35.0 Female NaN

Handling missing data in nominal attributes:


Name Age Gender Salary
0 John 25.0 Male 50000.0
1 Jane 30.0 Female 60000.0
2 Bob NaN Not Specified 45000.0
3 Alice 35.0 Female NaN

Handling missing data in numeric attributes:


Name Age Gender Salary
0 John 25.0 Male 50000.000000
1 Jane 30.0 Female 60000.000000
2 Bob NaN Not Specified 45000.000000
3 Alice 35.0 Female 51666.666667
Grouping operations:
Gender
Female 55833.333333
Male 50000.000000
Not Specified 45000.000000
Name: Salary, dtype: float64
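
The program above fills the nominal column with a placeholder and the numeric column with its mean. As a hedged alternative, the sketch below fills the nominal column with its most frequent value (the mode) and the numeric columns with their median, then summarises each group with named aggregations. The variable names `gender_mode` and `summary` and the output column names `mean_salary` and `headcount` are illustrative choices, not part of the original exercise.

```python
import numpy as np
import pandas as pd

# Same toy records as the exercise above.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Bob", "Alice"],
    "Age": [25, 30, np.nan, 35],
    "Gender": ["Male", "Female", np.nan, "Female"],
    "Salary": [50000, 60000, 45000, np.nan],
})

# Nominal attribute: fill with the most frequent category (mode ignores NaN).
gender_mode = df["Gender"].mode().iloc[0]
df["Gender"] = df["Gender"].fillna(gender_mode)

# Numeric attributes: fill with the median, which is less sensitive to outliers than the mean.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Salary"] = df["Salary"].fillna(df["Salary"].median())

# Grouping with named aggregations: one output column per (column, function) pair.
summary = df.groupby("Gender").agg(
    mean_salary=("Salary", "mean"),
    headcount=("Name", "count"),
)

print(df)
print(summary)
```

Whether the mode and median are better choices than a placeholder and the mean depends on the data; with only four rows the difference is purely illustrative.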

RESULT :



EX.NO : 7B SERIES AND DATAFRAME USING PANDAS
DATE:

QUESTION:

Write a Python program to create a pandas Series from a list that consists of two separate lists,
circuit departments and non-circuit departments, and implement explicit and implicit indexing.

AIM :

ALGORITHM :

PROGRAM :
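import pandas as pd

# Two separate lists of departments held in one outer list.
b = [["EEE", "Mech", "Civil", "Auto"], ["CSE", "IT", "ECE", "AI&DS"]]

# Implicit indexing: pandas assigns the default integer index 0, 1.
s = pd.Series(b)
print("Implicit Indexing")
print(s)

# Explicit indexing: the labels are supplied through the index argument.
s = pd.Series(b, index=["Non-Circuit", "Circuit"])
print("\nExplicit Indexing")
print(s)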



SAMPLE OUTPUT :
Implicit Indexing
0 [EEE, Mech, Civil, Auto]
1 [CSE, IT, ECE, AI&DS]
dtype: object

Explicit Indexing
Non-Circuit [EEE, Mech, Civil, Auto]
Circuit [CSE, IT, ECE, AI&DS]
dtype: object
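
Once the Series carries explicit labels, each element can still be reached either by position or by label. The following is a minimal sketch of both access styles, reusing the exercise's data; the names `first_by_position`, `circuit_by_label`, and `flat` are illustrative. It also shows `Series.explode`, which turns each inner list into one row per department while repeating the label.

```python
import pandas as pd

departments = [["EEE", "Mech", "Civil", "Auto"], ["CSE", "IT", "ECE", "AI&DS"]]
s = pd.Series(departments, index=["Non-Circuit", "Circuit"])

# Implicit (positional) access: iloc always uses 0-based positions.
first_by_position = s.iloc[0]

# Explicit (label) access: loc always uses the index labels.
circuit_by_label = s.loc["Circuit"]

# One row per department; the index label is repeated for every list element.
flat = s.explode()

print(first_by_position)
print(circuit_by_label)
print(flat)
```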

RESULT :



EX.NO : 7C SERIES AND DATAFRAME USING PANDAS
DATE:

QUESTION:

Write a Python program to create a pandas Series and a DataFrame from a dictionary whose keys are
names and whose values are departments, and implement hybrid indexing.

AIM :

ALGORITHM :

PROGRAM :

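import pandas as pd

# Keys are names, values are departments.
data = {'bala': 'CSE', 'hari': 'ECE', 'dhoni': 'IT'}

# A Series built from a dictionary uses the keys as its index.
print("SERIES:")
print(pd.Series(data))

# A DataFrame built from the (name, department) pairs gets the default integer index.
df = pd.DataFrame(list(data.items()), columns=['NAME', 'DEPT'])
print("\nDATAFRAME:")
print(df)

# loc selects by index label; here the labels are the default integers 0, 1, 2.
print("\nRow selection using label-based indexing:")
print(df.loc[1])

# iloc selects by 0-based position.
print("\nRow selection using integer-based indexing:")
print(df.iloc[2])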
print("\nColumn selection using label-based indexing:")
print(df['DEPT'])
print("\nColumn selection using integer-based indexing:")
print(df.iloc[:, 1])

SAMPLE OUTPUT :
SERIES:
bala CSE
hari ECE
dhoni IT
dtype: object

DATAFRAME:
NAME DEPT
0 bala CSE
1 hari ECE
2 dhoni IT

Row selection using label-based indexing:


NAME hari
DEPT ECE
Name: 1, dtype: object

Row selection using integer-based indexing:


NAME dhoni
DEPT IT
Name: 2, dtype: object

Column selection using label-based indexing:


0 CSE
1 ECE
2 IT
Name: DEPT, dtype: object

Column selection using integer-based indexing:


0 CSE
1 ECE
2 IT
Name: DEPT, dtype: object
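
With the default integer index, `loc` and `iloc` look interchangeable, but they differ on slices and on non-integer labels. The sketch below illustrates this under the assumption that the names are moved into the index with `set_index`, a step the original program does not perform; the variable names are illustrative.

```python
import pandas as pd

data = {"bala": "CSE", "hari": "ECE", "dhoni": "IT"}
df = pd.DataFrame(list(data.items()), columns=["NAME", "DEPT"])

# Label-based slice: both end labels are included, so this returns rows 0 and 1.
by_label = df.loc[0:1]

# Position-based slice: the stop position is excluded, so this returns row 0 only.
by_position = df.iloc[0:1]

# Assumption for illustration: use the names as the index.
named = df.set_index("NAME")

# Hybrid use: the same frame answers label queries (loc) and position queries (iloc).
hari_dept = named.loc["hari", "DEPT"]
last_dept = named.iloc[-1, 0]

print(by_label)
print(by_position)
print(hari_dept, last_dept)
```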

RESULT :



EX.NO : 8 DATA REPRESENTATION USING MATPLOTLIB
DATE:

QUESTION:

Write a Python program to implement the following plots using Matplotlib:
i. Line plot
ii. Scatter plot
iii. Density plot
iv. Box plot
v. Histogram

AIM :

ALGORITHM :

PROGRAM :

import matplotlib.pyplot as p
import numpy as np
np.random.seed(0)
x = np.linspace(0, 10, 100)
y = np.sin(x)
p.plot(x,y)
p.title("Line Plot")
p.xlabel("X")
p.ylabel("Y")
p.show()
p.scatter(x,y)
p.title("Scatter Plot")
p.xlabel("X")
p.ylabel("Y")
p.show()
# density=True normalises the bar heights so that the total bar area is 1.
p.hist(y, density=True)
p.title("Density Plot")
p.xlabel("Y")
p.ylabel("Density")
p.show()
d = np.random.randn(100,4)
p.boxplot(d)
p.title("Box Plot")
p.xlabel("Variables")
p.ylabel("Value")
p.show()
p.hist(d,label=['A', 'B', 'C', 'D'])
p.title("Histogram")
p.xlabel("Value")
p.ylabel("Frequency")
p.legend()
p.show()
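
A density plot differs from a frequency histogram only in its normalisation: the bar heights are scaled so that the total bar area equals 1, which is what the `density=True` argument of `hist` requests. The sketch below checks this numerically with `numpy.histogram` on the same `y` data, without drawing anything; the variable names `counts`, `density`, `widths`, and `area` are illustrative.

```python
import numpy as np

x = np.linspace(0, 10, 100)
y = np.sin(x)

# Frequency histogram: each bin holds a raw count of samples.
counts, edges = np.histogram(y, bins=10)

# Density histogram: counts divided by (total samples * bin width).
density, _ = np.histogram(y, bins=10, density=True)

# Summing height * width over all bins gives the total area under the bars.
widths = np.diff(edges)
area = float(np.sum(density * widths))

print(counts.sum())
print(area)
```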

SAMPLE OUTPUT :

RESULT :
