
Practical No.9

Name: Vicky V. Patil

Class: MCA 2nd Year, Semester 3

Subject: Data Analytics Lab

Title of Practical: Read data that gives a proper distribution curve using pandas. Apply preprocessing to the data and plot a histogram for the same. Properly label the plot. Analyze the plot.

# Import pandas and the scikit-learn utilities used for preprocessing and modelling
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LinearRegression

# Load the Iris dataset into a DataFrame
df = pd.read_csv('/content/Iris.csv')
df

      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
0      1            5.1           3.5            1.4           0.2     Iris-setosa
1      2            4.9           3.0            1.4           0.2     Iris-setosa
2      3            4.7           3.2            1.3           0.2     Iris-setosa
3      4            4.6           3.1            1.5           0.2     Iris-setosa
4      5            5.0           3.6            1.4           0.2     Iris-setosa
..   ...            ...           ...            ...           ...             ...
145  146            6.7           3.0            5.2           2.3  Iris-virginica
146  147            6.3           2.5            5.0           1.9  Iris-virginica
147  148            6.5           3.0            5.2           2.0  Iris-virginica
148  149            6.2           3.4            5.4           2.3  Iris-virginica
149  150            5.9           3.0            5.1           1.8  Iris-virginica

150 rows × 6 columns

# Mount Google Drive (optional here, since the CSV is read from /content)
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

# Check each column for missing values
print(df.isnull().sum())

Id 0
SepalLengthCm 0
SepalWidthCm 0
PetalLengthCm 0
PetalWidthCm 0
Species 0
dtype: int64
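
No column contains missing values, so the SimpleImputer imported above is never actually needed here. A minimal sketch of how it could be used, assuming hypothetical NaNs in the numeric feature columns and a mean-fill strategy (num_cols is just a helper name):

# Hypothetical: fill any NaNs in the numeric feature columns with the column mean
num_cols = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']
imputer = SimpleImputer(strategy='mean')
df[num_cols] = imputer.fit_transform(df[num_cols])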

# Encode the Species strings as integers (Iris-setosa=0, Iris-versicolor=1, Iris-virginica=2)
le = LabelEncoder()
df['Species'] = le.fit_transform(df['Species'])
df


      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm  Species
0      1            5.1           3.5            1.4           0.2        0
1      2            4.9           3.0            1.4           0.2        0
2      3            4.7           3.2            1.3           0.2        0
3      4            4.6           3.1            1.5           0.2        0
4      5            5.0           3.6            1.4           0.2        0
..   ...            ...           ...            ...           ...      ...
145  146            6.7           3.0            5.2           2.3        2
146  147            6.3           2.5            5.0           1.9        2
147  148            6.5           3.0            5.2           2.0        2
148  149            6.2           3.4            5.4           2.3        2
149  150            5.9           3.0            5.1           1.8        2

150 rows × 6 columns
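
As a quick check of which integer was assigned to each species, the fitted encoder's classes_ attribute can be inspected (index i in le.classes_ is the species encoded as i):

# Inspect the mapping learned by the LabelEncoder
for code, name in enumerate(le.classes_):
    print(code, '->', name)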

# Features: the first three columns (Id, SepalLengthCm, SepalWidthCm)
x = df[df.columns[0:3]]
# Target: the fifth column (PetalWidthCm)
y = df[df.columns[4]]

df.shape

(150, 6)

x.head()

   Id  SepalLengthCm  SepalWidthCm
0   1            5.1           3.5
1   2            4.9           3.0
2   3            4.7           3.2
3   4            4.6           3.1
4   5            5.0           3.6

y.head()

PetalWidthCm
0    0.2
1    0.2
2    0.2
3    0.2
4    0.2
dtype: float64

# Split the data into 70% training and 30% test sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=10)

x_train.head()

      Id  SepalLengthCm  SepalWidthCm
32    33            5.2           4.1
52    53            6.9           3.1
70    71            5.9           3.2
121  122            5.6           2.8
144  145            6.7           3.3

y_train.head()


PetalWidthCm
32     0.1
52     1.5
70     1.8
121    2.0
144    2.5
dtype: float64
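
StandardScaler is imported at the top but never used below. A minimal sketch of how the features could be standardised after the split, fitting only on the training data to avoid leaking test information (the *_scaled variable names are assumptions):

# Fit the scaler on the training features only, then apply the same transformation to the test features
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)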

# prompt: how to build a regression model

from sklearn.linear_model import LinearRegression

# Initialize the model
model = LinearRegression()

# Train the model
model.fit(x_train, y_train)

# Make predictions on the test set
y_pred = model.predict(x_test)

# Evaluate the model (example: R-squared)
from sklearn.metrics import r2_score
r2 = r2_score(y_test, y_pred)
print(f"R-squared: {r2}")

R-squared: 0.8627149401415144

# Fit a second LinearRegression model on the same split
md = LinearRegression()
md.fit(x_train, y_train)

LinearRegression()

# Predict PetalWidthCm for the test set and show the predictions
y_pred = md.predict(x_test)
y_pred

array([ 1.56788019,  1.81167007,  0.26224485,  1.45759149,  0.59318562,
        0.86904724,  1.38235127,  1.13323559,  0.50776235,  0.95786364,
        1.01767325,  1.9396602 ,  0.92852498,  0.07921787,  0.24422778,
        1.92294443,  1.31036751,  0.30594397,  0.27785444,  0.45477503,
        1.91808717,  2.02534696,  1.89238898,  0.2438114 ,  1.06657763,
       -0.00271782,  1.36371277,  1.28999511,  1.56010462,  2.18772813,
        1.28599669,  1.56844946,  2.24262204,  1.62647301,  2.17802355,
        0.44691494,  1.98138436,  2.01894106,  2.34394446,  2.25686195,
        0.33376007,  0.50442765,  0.84188492,  0.45727696,  1.11333801])

from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

r2 = r2_score(y_test, y_pred)
print(f"R-squared: {r2}")

mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

R-squared: 0.8627149401415144
Mean Squared Error: 0.07482273045106973
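
The model explains about 86% of the variance in PetalWidthCm. Since the MSE is in squared centimetres, taking its square root gives an error in the original units, which is easier to read; a small optional check:

import numpy as np

# Root Mean Squared Error, in the same units (cm) as PetalWidthCm
rmse = np.sqrt(mse)
print(f"Root Mean Squared Error: {rmse}")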

import seaborn as sns
import matplotlib.pyplot as plt

# Histogram of the encoded Species column (0 = Iris-setosa, 1 = Iris-versicolor, 2 = Iris-virginica)
plt.figure(figsize=(9,6))
sns.histplot(df['Species'])
plt.title('Distribution of Species')
plt.xlabel('Species')
plt.ylabel('Count')
plt.show()
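
Because each of the three species has exactly 50 samples, the Species histogram shows three equal bars rather than a curve. To see an actual distribution curve, as the practical title asks, a continuous column can be plotted with a KDE overlay; a minimal sketch, assuming SepalLengthCm as the column (any of the four measurements would work):

# Histogram of a continuous feature with a kernel density estimate overlaid;
# SepalLengthCm is roughly bell-shaped, illustrating the distribution curve
plt.figure(figsize=(9,6))
sns.histplot(df['SepalLengthCm'], kde=True)
plt.title('Distribution of Sepal Length (cm)')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Count')
plt.show()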

