QUESTION:
Write a Python program to create a DataFrame using pandas. Perform the following operations on the
DataFrame:
i. Data Selection
ii. Data Indexing
iii. Handling missing data in nominal attributes
iv. Handling missing data in numeric attributes
v. Grouping
AIM :
ALGORITHM :
SAMPLE OUTPUT :
Data Selection:
Name Age
0 John 25.0
1 Jane 30.0
2 Bob NaN
3 Alice 35.0
Data Indexing:
Name Age Gender Salary
1 Jane 30.0 Female 60000.0
2 Bob NaN NaN 45000.0
3 Alice 35.0 Female NaN
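The five operations listed in the question can be sketched as follows. This is only one possible approach, not the record's own program. The column values are taken from the sample output above, except row 0's Gender and Salary, which the output does not show: "Male" and 50000.0 are assumptions. Filling nominal gaps with the mode and numeric gaps with the mean is also an assumed choice.

```python
import numpy as np
import pandas as pd

# Row 0's Gender ("Male") and Salary (50000.0) are assumed values;
# they do not appear in the sample output.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Bob", "Alice"],
    "Age": [25.0, 30.0, np.nan, 35.0],
    "Gender": ["Male", "Female", np.nan, "Female"],
    "Salary": [50000.0, 60000.0, 45000.0, np.nan],
})

# i. Data selection: pick columns by name
selection = df[["Name", "Age"]]
print("Data Selection:")
print(selection)

# ii. Data indexing: rows labelled 1 through 3 (loc slices include the end label)
indexed = df.loc[1:3]
print("Data Indexing:")
print(indexed)

# iii. Missing data in a nominal attribute: fill with the most frequent value
filled = df.copy()
filled["Gender"] = filled["Gender"].fillna(filled["Gender"].mode()[0])

# iv. Missing data in numeric attributes: fill with the column mean
for column in ["Age", "Salary"]:
    filled[column] = filled[column].fillna(filled[column].mean())

# v. Grouping: mean salary per gender
grouped = filled.groupby("Gender")["Salary"].mean()
print("Grouping:")
print(grouped)
```

The median or a constant such as "Unknown" would be equally valid fill values, depending on the data.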
RESULT :
QUESTION:
Write a Python program to create a pandas Series from a list that consists of two separate lists,
circuit departments and non-circuit departments, and implement explicit and implicit indexing.
AIM :
ALGORITHM :
PROGRAM :
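import pandas as pd
# Two separate lists of departments, stored as the two elements of the Series
b = [["EEE", "Mech", "Civil", "Auto"], ["CSE", "IT", "ECE", "AI&DS"]]
# Implicit indexing: pandas assigns the default integer index 0, 1
s = pd.Series(b)
print("Implicit Indexing")
print(s)
# Explicit indexing: the index labels are supplied by the caller
s = pd.Series(b, index=["Non-Circuit", "Circuit"])
print("\nExplicit Indexing")
print(s)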
SAMPLE OUTPUT :
Explicit Indexing
Non-Circuit [EEE, Mech, Civil, Auto]
Circuit [CSE, IT, ECE, AI&DS]
dtype: object
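To show how the two kinds of index are used for element access, here is a minimal sketch. It assumes the same department lists and labels as the program above. The variable names `non_circuit`, `circuit`, `by_label` and `by_position` are introduced only for illustration.

```python
import pandas as pd

non_circuit = ["EEE", "Mech", "Civil", "Auto"]
circuit = ["CSE", "IT", "ECE", "AI&DS"]
s = pd.Series([non_circuit, circuit], index=["Non-Circuit", "Circuit"])

# Explicit index: look an element up by its label
by_label = s.loc["Circuit"]

# Implicit index: look an element up by its integer position
by_position = s.iloc[0]

print(by_label)
print(by_position)
```

Using `loc` for labels and `iloc` for positions keeps the two schemes apart even when the labels are themselves integers.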
RESULT :
QUESTION:
Write a Python program to create a pandas Series and DataFrame from a dictionary in which the
key is the name and the value is the department, and implement hybrid indexing.
AIM :
ALGORITHM :
PROGRAM :
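import pandas as pd
# Keys are names, values are departments
data = {'bala': 'CSE', 'hari': 'ECE', 'dhoni': 'IT'}
print("SERIES:")
print(pd.Series(data))
# Each (name, department) pair becomes one row of the DataFrame
df = pd.DataFrame(list(data.items()), columns=['NAME', 'DEPT'])
print("\nDATAFRAME:")
print(df)
# loc selects by index label; iloc selects by integer position
print("\nRow selection using label-based indexing:")
print(df.loc[1])
print("\nRow selection using integer-based indexing:")
print(df.iloc[2])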
Reg No : 2127240501139 CS22201 Python for Data Science Laboratory Page No :
print("\nColumn selection using label-based indexing:")
print(df['DEPT'])
print("\nColumn selection using integer-based indexing:")
print(df.iloc[:, 1])
SAMPLE OUTPUT :
SERIES:
bala CSE
hari ECE
dhoni IT
dtype: object
DATAFRAME:
NAME DEPT
0 bala CSE
1 hari ECE
2 dhoni IT
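With the default integer index, label-based and position-based selection look almost the same. The difference is easier to see when the names become the index. The following is a minimal sketch under that assumption. The call to `set_index` and the variable names are additions for illustration and are not part of the program above.

```python
import pandas as pd

data = {'bala': 'CSE', 'hari': 'ECE', 'dhoni': 'IT'}
df = pd.DataFrame(list(data.items()), columns=['NAME', 'DEPT'])

# Use the names as row labels so labels and positions differ
by_name = df.set_index('NAME')

# Label-based: row 'hari', column 'DEPT'
label_value = by_name.loc['hari', 'DEPT']

# Position-based: second row, first column
position_value = by_name.iloc[1, 0]

# Label slices include the end label; position slices exclude the end position
label_slice = by_name.loc['bala':'hari']
position_slice = by_name.iloc[0:1]

print(label_value, position_value)
print(label_slice)
print(position_slice)
```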
RESULT :
QUESTION:
AIM :
ALGORITHM :
PROGRAM :
import matplotlib.pyplot as p
import numpy as np
np.random.seed(0)
x = np.linspace(0, 10, 100)
y = np.sin(x)
p.plot(x,y)
p.title("Line Plot")
p.xlabel("X")
p.ylabel("Y")
p.show()
p.scatter(x,y)
p.title("Scatter Plot")
p.xlabel("X")
p.ylabel("Y")
p.show()
p.hist(y, density=True)  # density=True scales the bars so the y-axis shows density, not counts
p.title("Density Plot")
p.xlabel("Y")
p.ylabel("Density")
p.show()
d = np.random.randn(100,4)
p.boxplot(d)
p.title("Box Plot")
p.xlabel("Variables")
p.ylabel("Value")
p.show()
p.hist(d,label=['A', 'B', 'C', 'D'])
p.title("Histogram")
p.xlabel("Value")
p.ylabel("Frequency")
p.legend()
p.show()
SAMPLE OUTPUT :
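A density histogram differs from a frequency histogram in that the bar heights are scaled so the total bar area equals 1. The following sketch checks this numerically with NumPy alone, assuming the same `x` and `y` as in the program. The choice of 10 bins is an assumption.

```python
import numpy as np

x = np.linspace(0, 10, 100)
y = np.sin(x)

# Raw counts per bin (what a frequency histogram shows)
counts, edges = np.histogram(y, bins=10)

# Density per bin: counts divided by (number of samples * bin width)
density, _ = np.histogram(y, bins=10, density=True)

# The area under the density histogram sums to 1
area = np.sum(density * np.diff(edges))
print(counts.sum(), area)
```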