0% found this document useful (0 votes)

6 views81 pages

Machine Learning

The document is a practical file for a Machine Learning course, detailing various implementations of machine learning techniques using Python. It includes practical exercises on vector and matrix algebra, data preprocessing, and multiple regression models, along with code snippets and outputs for each implementation. The file is submitted by a student to their instructor at Akal University, Talwandi Sabo, in May 2025.

Uploaded by

diljeetpc

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

6 views81 pages

Machine Learning

Uploaded by

diljeetpc

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 81

Practical File

Of
Machine Learning

Submitted to: Submitted by:

Er.Zubair Fayaz Name: Diljeet Singh

Dept. of Computer Science & Engineering Class: B-Tech (Sem-4) AI-ML
AUID: 237106007

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

AKAL UNIVERSITY, TALWANDI SABO

May 2025

TABLE OF CONTENT

1
S. No. Name Of Practical Page No.
1. Implementation of vector algebra in 3-5
machine learning
2. Implementation of matrix algebra in 6-9
machine learning
3. Implementation of various data 10-25
preprocessing steps in Python
4. Implementation of Simple linear 26-36
regression in Python.
5. Implementation of Multiple linear 37-42
regression in Python.
6. Implementation of Support Vector 43-45
Machine using python.
7. Implementation of Decision Tree 46-49
Regression using python
8. Implementation of Random forest 50-53
classification using pythons

9. Implementation of Random forest 54-55

regression using python

10. Implementation of Logistic 56-59

Regression using python
11. Implementation of KNN regression 60-61
in python
12. Implementation of Clustering with 62-64
k-means in python

2
13. Implementation of agglomerative 65-67
hierarchal clustering in python *
14. Implementation of Naïve Bayes 68-70
using Pyton
15. Implementation of Hierarchical 71-72
clustering using Pyton
16. Implementation of Ridge and Lasso 73-74
Regression using Pyton
17. Implementation of DBSCAN using 75-76
Pyton
18. Implementation of K-Mean 77-80
Clustering using Pyton
2

Practical: 1
AIM: Implementation of vector algebra using python.

#import necessary libraries

importnumpy as np 1.

#create a vector

a=np.array([2,3,7])

b=np.array([1,2,3])

Output :

3
2. #vector addition

print(a+b) Output :

3. #vector subtraction

print(a-b) Output :

4. #vector

multiplication

print(a*b) Output :

5. #vector division

print(a/b) Output :

4
6. #vector scaler

multiplication

print(5*a) Output :

7. #vector exponential
print(a**b) Output

8. #vector dot product

print(np.dot(a,b))

Output :

9. #cross product

print(np.cross(a,b))

Output :

5
10. #vector norm

print(np.linalg.norm(

a)) Output :

Practical: 2
AIM: Implementation of basic matrix algebra using python.

1. # create a matrix Import

numpy as np

a=np.array([[1,3,5], [2,4,7],

[4,9,2]]) b=np.array([[1,2,3],

[4,5,6], [7,8,9]]) print (a) print

(b)

Output:

6
2. #addition

print(np.add(a,b)) Output:

3. #subtraction

print(np.subtract(a,b)) Output:

4. #matrix scaler

multiplication

7
print(np.multiply(5,a))

Output:

5. #vactor multiplication

v=np.array([[1],[3],[3]])

print(np.dot(a,v)) Output:

6. #matrix multiplication

print(np.matmul(a,b))

print(np.dot(a,b)) result=a@b

print(result) Output:

8
7.#determinant
print(np.linalg.det(a)) Output:

8. #transpose

print(np.transpose(a))

Output:

9
9. #inverse

print(np.linalg.inv(a))

Output:

Practical -3
Aim: Implementation of various data preprocessing steps in Python.

Description:- 1)Importing the libraries.

2)Importing the dataset

3)Taking care of missing data.

10
4)Encoding the Categorical data(description)

5)Feature Scaling(Normalization and Standardization)

#importing the dataset import

pandas as pd

data = pd.read_csv("employees.csv") data

#checking missing data data.isnull()

#check number of null values in each column

data.isnull().sum()

11
#total number of null values in the dataset data.isnull().sum().sum()

#total number of not null values in the dataset

data.notnull().sum().sum()

#drop all the missing values data.dropna()

#drop null values from a particular

column data["Gender"].dropna() #dropping

a row data.drop("Gender",axis=1)

#dropping a column

data.drop(0, axis=0)

#using dropna(how=?)

data.dropna().any()

data.dropna().all()

12
#dropping the row on basis of particular keyword

data[data["Team"].str.contains("Marketing") == False]

#using fillna import

numpy as np

data.fillna(50)

#data.fillna(method= 'pad')

#data['Team'].fillna(method= 'bfill', inplace= True)

data.replace(to_replace=np.NaN, value=50) data.head(5)

13
dict = {'FirstScore':[100, 90, np.nan, 95],

'SecondScore': [30, 45, 56, np.nan],

'ThirdScore':[np.nan, 40, 80, 98]}

# creating a dataframe from list

df = pd.DataFrame(dict) m=

df['FirstScore'].mean()

df['FirstScore'].fillna(m)

14
df.interpolate(method ='linear', limit_direction ='forward')

Encoding the Categorical data.

Description: - Encoding categorical data is a common task in machine learning and
data analysis, especially when working with algorithms that require numerical
input.

• One-hot encoding(1,0 form)

Description: - This method creates binary columns for each category and assigns a
1 or 0 to indicate the presence or absence of a category. For example, if you have
the same "red," "green," and "blue" categories, one-hot encoding would create
three columns: "red," "green," and "blue,"

#encoding categorical data import

category_encoders as ce

dict = {'City':['Delhi','Chennai','bangalore','Hyderabad','Jammu']}

df= pd.DataFrame(dict) df

encoder = ce.OneHotEncoder(cols="City", handle_unknown= True) encoded_data

= encoder.fit_transform(df)

encoded_data

OneHot= pd.get_dummies(df["City"])

OneHot

15
Merge= pd.concat([df, OneHot], axis=1)

#Merge

Merge.drop(["City"], axis=1)

#Merge

• Dummy encoding

Description: - Similar to one-hot encoding but drops one of the columns to avoid
multicollinearity. This is often used when building linear models to avoid
redundancy in the encoded variables.

#dummy encoding

16
dummy = pd.get_dummies(df["City"]) dummy

= dummy.drop("Delhi",axis=1) dummy

#effect encoding or deviation or sum encoding

effect = ce.sum_coding.SumEncoder(df["City"])

encoded_data=effect.fit_transform(df) encoded_data

• Label encoding

Description:- This involves assigning a unique integer to each category. For

example, if you have categories like "red," "green," and "blue," you could encode
them as 0, 1, and 2, respectively.

#label encoding

from sklearn.preprocessing import LabelEncoder le

= LabelEncoder()

dict = {'City':['Delhi','Chennai','bangalore','Chennai','Hyderabad','Jammu']}

df1= pd.DataFrame(dict) df1

df1["City_Label"]= le.fit_transform(df1["City"]) df1

17
• Ordinal encoding

Description:- This is suitable when there's an inherent order or hierarchy among the
categories. For instance, if you have categories like "low," "medium," and "high,"
you can assign them ordinal values like 1, 2, and 3, respectively.

#ordinal encoding
from sklearn.preprocessing import OrdinalEncoder encoder

= OrdinalEncoder()

dict = {'T-Shirt size':['Small','Medium','Large']}

df2= pd.DataFrame(dict) df2

encoded_data= encoder.fit_transform(df2) encoded_data

• Binary encoding

18
Description:- This method converts categories into binary digits and then splits
those digits into separate columns. It reduces the number of columns compared to
one-hot encoding while still preserving the information.

#binary encoding from category_encoders

import BinaryEncoder BE =

BinaryEncoder()

encoded_data = BE.fit_transform(df2) encoded_data

• Count encoding

Description:- Count encoding is a method used to transform categorical variables

into numerical representations based on the frequency of each category in the
dataset. It replaces each category with the number of times it appears in the dataset.
This encoding technique is particularly useful for high-cardinality categorical
variables, where one-hot encoding might lead to a high-dimensional sparse matrix.

#count encoding from category_encoders

import CountEncoder CE =

CountEncoder()

df3 = pd.DataFrame({'fruits':['Apple','banana','Cherry', 'Apple', 'Cherry']}) df3

a = df3['fruits'].value_counts() df3['fruits'].map(a)

encoded_data = CE.fit_transform(df3) encoded_data

19
• BaseN encoding

Description:- BaseN encoding is a method used to represent numerical data in a

different base system, such as binary, octal, or hexadecimal. In this encoding, each
digit in the original number is represented by a character in the chosen base system.

#baseN encoding import category_encoders as ce

dict = {'City':['Delhi','Chennai','bangalore','Chennai','Hyderabad','Jammu']}

df1= pd.DataFrame(dict) df1

encoder = ce.BaseNEncoder(cols=['City'], base=5, return_df=True)

encoded_data = encoder.fit_transform(df1) encoded_data

• Target encoding

Description:- Target encoding, also known as mean encoding or likelihood

encoding, is a method used to encode categorical variables into numerical

20
representations based on the target variable. In target encoding, each category of
the categorical variable is replaced with the mean (or another statistic) of the target
variable for that category. #target encoding import pandas as pd import
category_encoders as ce

car1={ "cars":"bmw", "price":20 } car2={ "cars":"audi",

"price":30 } list=[] for i in range(10000):

list.append(car1) for i

in range(10000):

list.append(car2)

21
df=pd.DataFrame(list)

ce.TargetEncoder().fit_transform(df["cars"],df['price'])

➢ Feature scaling
Feature scaling is a preprocessing technique used in machine learning to
standardize the range of independent variables or features of the dataset. It ensures
that all features have the same scale, which can be crucial for certain algorithms to
perform effectively, particularly those based on distance calculations or gradient
descent optimization.

Min-Max Scaling (Normalization)

Description:- Min-max scaling, also known as normalization, is a technique used in

data preprocessing to scale numeric features to a specific range. This process
involves transforming the data such that it falls within a pre-defined interval,
typically [0, 1] or [-1, 1].

#min max scaling

from sklearn.preprocessing import MinMaxScaler scaler

= MinMaxScaler()

dict= {'weight in grams': [500, 400, 300, 700, 800], 'price in dollars': [10, 8, 5, 12,
15]}

df5 = pd.DataFrame(dict) df5

22
import pandas as pd

data=pd.read_csv("pima-indians-diabetes.data.csv")

df6 = pd.DataFrame(data) scaled_data =

scaler.fit_transform(df6)

#labels = ('a','b','c','d','e','f','g','h','i')

frame = pd.DataFrame(scaled_data, columns=df6.columns) frame

Standardization

Description:- Standardization, also known as z-score normalization, is another data

preprocessing technique used to scale numeric features. Unlike min-max scaling,
standardization does not bound the data to a specific range like [0, 1] or [-1, 1].
Instead, it centers the data around the mean and scales it based on the standard
deviation. This results in transformed data with a mean of 0 and a standard
deviation of 1.

from sklearn .preprocessing import StandardScaler rescaled_data

=StandardScaler().fit_transform(df6) print(rescaled_data)

23
24
PRACTICAL-4
AIM: Implementation of Simple linear regression in Python.

Simple linear regression: Aims to find a linear relationship to

describe the correlation between an independent and possibly
dependent variable.
import pandas as pd
import pandas as pd
import matplotlib.pyplot as plt
d=pd.read_csv("/content/placement and cgpa.csv") print(d)

plt. scatter(d['cgpa'],
d['package']) plt. xlabel('cgpa')
plt. ylabel('package')

25
x=d.iloc[:, 0:1]
print(x)

y=d.iloc[:, 1:2]

26
print(y)

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test =train_test_split(x, y, test_size=0.2)
print(x_train) print(x_test)

print(y_train)

MAKING OUR MODEL USING ALWAYS TRAINING DATASET AND THIS IS WHERE
ACTUALLY OUR MACHINE IS LEARNING!

from sklearn.linear_model import

LinearRegression lr=LinearRegression ()
lr.fit(x_train, y_train)

27
PRIDICTIONS ON TRAINING DATASET
import pandas as pd
predictions = lr.predict(x_train)
pre_d=pd.DataFrame (predictions, columns=['predictions'])
print(pre_d)

print(x_train)

28
print(y_train)

VISUALIZATION ON TRAINING DATASET

plt.scatter(d['cgpa'], d['package'])
plt.plot(x_train, lr.predict (x_train), color=
'red')
plt.xlabel('cgpa')
plt. ylabel('package(in lpa)) '

29
PREDICTIONS ON TEST DATASET
lr.predict(X_test.iloc[0].values.reshape(1,1))

# Visualization on Test Dataset

plt.scatter(df['cgpa'],df['package']) plt.plot(X_test,lr.predict(X_test),color='green') #It's
predicting the target variable values #for each input feature in X_train.
plt.xlabel('CGPA') plt.ylabel('Package(in
lpa)')

30
# DOING RANDOM PREDICTIONS FOR TESTING OUR MODEL

m=lr.coef_
print(m)

b=lr.intercept_
print(b)

y=m*3.58+b
print(y)

31
EVALUATION METRics

from sklearn.metrics import mean_absolute_error,mean_squared_error,r2_score

y_pred=lr.predict(x_test) print(y_pred)

y_test.values

32
print("MAE",mean_absolute_error(y_test,y_pred))

print("MSE",mean_squared_error(y_test,y_pred))

multiple regression models the linear relationship between single dependent

variable and ore than one independent variable import pandas as pd
d=pd.read_csv("/content/50_Startups.csv")
print(d)

print("RMSE",np.sqrt(mean_squared_error(y_test,y_pred)))

33
print("R2 Score"
,r2_score
(y_test,y_pred
))

# Assuming y_test and y_pred are your actual and predicted values respectively

# Calculate Mean Squared Error (MSE)

mse = mean_squared_error(y_test, y_pred)

# Calculate the variance of the actual target values variance_y_test

= np.var(y_test)

# Calculate Relative MSE

relative_mse = mse / variance_y_test

print("Relative MSE:", relative_mse)

rmse = np.sqrt(mean_squared_error(y_test,y_pred))
print(rmse)

mean_y_test = np.mean(y_test)
mean_y_test

CV= rmse/ mean_y_test

34
PRACTICAL-5

Aim: Implementation of Multiple linear Regression in python

import pandas as pd import
matplotlib.pyplot as plt df =
pd.read_csv('50_Startups.csv')
df.head()

Multiple Linear Regression is one of the important regression algorithms which

models the linear relationship between a single dependent continuous variable and
more than one independent variable. Example: Prediction of CO2 emission based
on engine size and number of cylinders in a car.
x=d.iloc[:, 0:4]
print(x)

y=d.iloc[:, 4:5]
print(y)

35
import pandas as pd
import matplotlib.pyplot as plt plt.
scatter(d['R&D Spend'], d['Profit'])
plt. xlabel('R&D Spend')
plt. ylabel('Profit')

plt. scatter(d['Administration'],
d['Profit']) plt. xlabel('Administration')
plt. ylabel('Profit')

36
plt. scatter(d['Marketing Spend'],
d['Profit']) plt. xlabel('Marketing Spend')
plt. ylabel('Profit')

plt. xlabel('State') plt.

37
scatter(d['State'], d['Profit'])
plt. ylabel('Profit')

Handling Categorical Variables

import category_encoders as ce import
pandas as pd
dp=pd.get_dummies(data=d,drop_first=True)
print(dp)

38
from sklearn.model_selection import train_test_split
# splitting the data
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42) from
sklearn.linear_model import LinearRegression
# creating an object of LinearRegression class
LR = LinearRegression()
# fitting the training data
LR.fit(x_train,y_train)

y_prediction = LR.predict(x_test) y_prediction

coefficients = LR.coef_ intercept

= LR.intercept_
print(coefficients)
print(intercept)

import numpy as np
from sklearn.metrics import r2_score
from sklearn.metrics import mean_squared_error

# predicting the accuracy score

39
score=r2_score(y_test,y_prediction)

print('r2 socre is ',score)

print('mean_sqrd_error is==',mean_squared_error(y_test,y_prediction))

print('root_mean_squared error of is==',np.sqrt(mean_squared_error(y_test,y_prediction)))

40
Practical-6
Aim-Implementation of Support vector Regression Using Python

41
42
43
Practical -7
Aim-Implementation of Decision Tree Regression using python

44
45
46
Practical: 8
Aim: Implementation of random forest classification using pythons
tep:1 Import the necessary libraries.

import numpy as np import pandas as pd Step:

2 Load dataset from sklearn.datasets import

load_breast_cance r data =

load_breast_cancer() data.data

data.feature_names

data.target

47
data.target_names

df = pd.DataFrame(np.c_[data.data, data.target],

columns=[list(data.feature_names)+['target']]) df.head()

df.tail() df.shape

Step: 4 Split Data X

= df.iloc[:, 0:-1] y =

df.iloc[:, -1]

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print('Shape of X_train = ', X_train.shape) print('Shape of y_train = ', y_train.shape)

print('Shape of X_test = ', X_test.shape)

print('Shape of y_test = ',

y_test.shape)

48
Step: 5 Train Random Forest Classification Model from

sklearn.ensemble import RandomForestClassifier classifier =

RandomForestClassifier(n_estimators=100, criterion='gini')

classifier.fit(X_train, y_train) classifier.score(X_test, y_test)

Step: 6 Predict Cancer

patient1 = [17.99, 10.38, 122.8, 1001.0, 0.1184, 0.2776, 0.3001, 0.1471, 0.2419, 0.07871, 1.095,
0.9053, 8.589, 153.4, 0.006399, 0.04904, 0.05373, 0.01587, 0.03003, 0.006193, 25.38, 17.33,
184.6, 2019.0, 0.1622, 0.6656, 0.7119, 0.2654, 0.4601, 0.1189] patient1 = np.array([patient1])

patient1

classifier.predict(patient1)

pred = classifier.predict(patient1)

if pred[0] == 0:

print('Patient has Cancer (malignant tumor)')

else: print('Patient has no Cancer (benign

tumor)')

49
Practical –9
Aim: Implement Random Forest Regression using Python.

50
Step 1: Import necessary libraries.

51
Step 2: Load the Height-Age dataset.

52
Step 3: Separate the dataset into independent and dependent variables.

Step 4: Split the dataset into training and testing sets.

Step 5: Import the Random Forest Regressor.

Step 6: Create a Random Forest Regressor object.

Step 7: Train the model with the training data.

Step 8: Make predictions on the test dataset.

Step 9: Evaluate the model using R-Square.

53
Step 10: Visualize the Random Forest Regression.

Practical-10
Aim: Implementation of Logistic Regression using python

54
importnumpy
#X represents the size oftumor
a in centimeters.
X = numpy.array([
3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1,1)

#Note: X has to be reshaped into a column from a row for the LogisticRegression() function to work.
#y represents whether or notthe tumor is cancerous (0 for "No", 1 for "Yes").
y = numpy.array([
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
fromsklearnimportlinear_model
logr = linear_model.LogisticRegression()
logr.fit(X,y)

#predict if tumor is cancerous where the size is 3.46mm:

predicted = logr.predict(numpy.array([
3.46]).reshape(-1,1))

print(predicted)

Example2:
importpandasas pd

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LogisticRegression

from sklearn import preprocessing

from sklearn.metrics import accuracy_score

from matplotlib import pyplot as plt

import seaborn as sns df =

pd.read_csv('creditcard.csv')
df.info()

55
df.head()

sum(df.duplicated())

df.drop_duplicates(inplace=True)

df.drop('Time', axis=1, inplace=True) X

= df.iloc[:,df.columns != 'Class']

56
y = df.Class
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.20, random_state=5, stratify=y)

from sklearn.preprocessing import StandardScaler

# Fit the scaler to the training data and transform both the training and test data scaler
= StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression()

model.fit(X_train_scaled, y_train) #training the model

# Make predictions using the trained model

y_pred = model.predict(X_test_scaled)

train_acc = model.score(X_train_scaled, y_train)

print("The Accuracy for Training Set is {}".format(train_acc*100))

test_acc = accuracy_score(y_test, y_pred)

print("The Accuracy for Test Set is {}".format(test_acc*100))

57
Practical-11
Aim: Implementation of KNN Regression using python
#importing neccessary libraries
import pandas as pd import
numpy as np

gymdata=pd.DataFrame(data=DataValues,columns=ColumnNames) gymdata.head()

TargetVariable='Weight'
Predictors=['Hours','Calories']
x=gymdata[Predictors].values
y=gymdata[TargetVariable].values print(x)

print(y)

from sklearn.model_selection import train_test_split

58
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
from sklearn.neighbors import KNeighborsRegressor RegModel =
KNeighborsRegressor(n_neighbors=2) print(RegModel)

KNN=RegModel.fit(X_train,y_train)
prediction=KNN.predict(X_test) print(X_test)

from sklearn import metrics

print('R2 Value:',metrics.r2_score(y_train, KNN.predict(X_train)))

print('Accuracy',100- (np.mean(np.abs((y_test - prediction) / y_test)) * 100))

TestingDataResults=pd.DataFrame(data=X_test, columns=Predictors)
TestingDataResults[TargetVariable]=y_test
TestingDataResults[('Predicted'+TargetVariable)]=prediction TestingDataResults.head()

Practical-12
Aim: Implementation of Clustering with K means using python.

59
from sklearn.cluster import KMeans import
pandas as pd from sklearn.preprocessing import
MinMaxScaler from matplotlib import pyplot as
plt df = pd.read_csv("income.csv") df.head()
plt.scatter(df.Age,df['Income($)'])
plt.xlabel('Age') plt.ylabel('Income($)')

# Assuming df is your DataFrame containing the 'Age' and 'Income($)' columns

km = KMeans(n_clusters=3) y_predicted =
km.fit_predict(df[['Age','Income($)']])

# Now y_predicted should contain the cluster labels

print(y_predicted)

df['cluster']=y_predicted df.head()

60
km.cluster_centers_

df1 = df[df.cluster==0] df2 = df[df.cluster==1] df3 = df[df.cluster==2]

plt.scatter(df1.Age,df1['Income($)'],color='green')
plt.scatter(df2.Age,df2['Income($)'],color='red')
plt.scatter(df3.Age,df3['Income($)'],color='black')
plt.scatter(km.cluster_centers_[:,0],km.cluster_centers_[:,1],color='purple',marker=
'*',label='centroid')
plt.xlabel('Age')
plt.ylabel('Income ($)')
plt.legend()

61
Preprocessing using min max scale scaler
= MinMaxScaler()

scaler.fit(df[['Income($)']]) df['Income($)'] =
scaler.transform(df[['Income($)']])

scaler.fit(df[['Age']]) df['Age'] =
scaler.transform(df[['Age']]) df.head()

plt.scatter(df.Age,df['Income($)'])

62
km = KMeans(n_clusters=3) y_predicted =
km.fit_predict(df[['Age','Income($)']]) y_predicted

df['cluster']=y_predicted df.head()

km.cluster_centers_

63
df1 = df[df.cluster==0] df2 = df[df.cluster==1] df3 = df[df.cluster==2]
plt.scatter(df1.Age,df1['Income($)'],color='green')
plt.scatter(df2.Age,df2['Income($)'],color='red')
plt.scatter(df3.Age,df3['Income($)'],color='black')
plt.scatter(km.cluster_centers_[:,0],km.cluster_centers_[:,1],color='purple',marker=
'*',label='centroid')
plt.legend()

Elbow Plot
sse = [] k_rng =
range(1,10) for k in
k_rng:
km = KMeans(n_clusters=k)
km.fit(df[['Age','Income($)']])
sse.append(km.inertia_)
plt.xlabel('K') plt.ylabel('Sum of
squared error')
plt.plot(k_rng,sse)

64
Practical-13
Aim-Implementation of Agglomerative Hierarchical
Clustering in python.

# Importing the libraries

import numpy as nm import
matplotlib.pyplot as mtp
import pandas as pd import
warnings
# Define a function that triggers a specific warning def
trigger_warning():
warnings.warn("This is a warning message", Warning)

# Ignore the specific warning using a context manager

with warnings.catch_warnings():
warnings.simplefilter("ignore") trigger_warning()

# After the context manager, warnings are not ignored anymore

trigger_warning()

# Importing the dataset dataset =

pd.read_csv('Mall_Customers.csv')
dataset.head()

65
x = dataset.iloc[:, [3, 4]].values print(x)

#Finding the optimal number of clusters using the dendrogram import

scipy.cluster.hierarchy as shc dendro = shc.dendrogram(shc.linkage(x,
method="ward"))#ward is technique mtp.title("Dendrogram Plot")
mtp.ylabel("Euclidean Distances") mtp.xlabel("Customers") mtp.show()

66
#training the hierarchical model on dataset from sklearn.cluster import
AgglomerativeClustering hc= AgglomerativeClustering(n_clusters=5,
metric='euclidean', linkage='ward') y_pred= hc.fit_predict(x)

y_pred

67
#visulaizing the clusters mtp.scatter(x[y_pred == 0, 0], x[y_pred == 0, 1], s = 50,
c = 'blue', label = 'Cluster
1')
mtp.scatter(x[y_pred == 1, 0], x[y_pred == 1, 1], s = 100, c = 'green', label =
'Cluster 2') mtp.scatter(x[y_pred== 2, 0], x[y_pred == 2, 1], s = 100, c = 'red',
label = 'Cluster
3')
mtp.scatter(x[y_pred == 3, 0], x[y_pred == 3, 1], s = 100, c = 'cyan', label =
'Cluster 4') mtp.scatter(x[y_pred == 4, 0], x[y_pred == 4, 1], s = 100, c =
'magenta', label =
'Cluster 5') mtp.title('Clusters of
customers') mtp.xlabel('Annual
Income (k$)') mtp.ylabel('Spending
Score (1-100)') mtp.legend()
mtp.show()

68
Practical -14
Aim: Implementation of Naïve Bayes in Python.

69
70
71
Practical -15
Aim: Implementation of Hierarchical Clustering in Python.

72
73
Practical -16
Aim: Implementation of Ridge and Lasso Regression in Python.

74
75
76
Practical -15
Aim: Implementation of Hierarchical in Python.

77
78
Practical -18
Aim: Implementation of K-Mean Clustering in Python.

79
80
81

C4.5 Decision Tree Algorithm
No ratings yet
C4.5 Decision Tree Algorithm
11 pages
9. UG_B.Tech & BCA_Date Sheet_May- 2025
No ratings yet
9. UG_B.Tech & BCA_Date Sheet_May- 2025
1 page
Peace Corps Response Interview Assessment 2015 May
100% (2)
Peace Corps Response Interview Assessment 2015 May
3 pages
L and T Projects - Colabs
No ratings yet
L and T Projects - Colabs
7 pages
Machine Learning Lab (1)
No ratings yet
Machine Learning Lab (1)
33 pages
Bug Busters
No ratings yet
Bug Busters
1 page
L_AND_T_project_Naveen 24cs002895
No ratings yet
L_AND_T_project_Naveen 24cs002895
7 pages
Uhv Mid-2 Unit-4 Long Questions With Answer
No ratings yet
Uhv Mid-2 Unit-4 Long Questions With Answer
7 pages
Machine Learning Record VR19
No ratings yet
Machine Learning Record VR19
46 pages
cs229_python_friday
No ratings yet
cs229_python_friday
40 pages
ashfatmaterial
No ratings yet
ashfatmaterial
4 pages
Ahsham.ml file
No ratings yet
Ahsham.ml file
34 pages
AIML Practical exam codes 1
No ratings yet
AIML Practical exam codes 1
7 pages
100 Days of Machine Learning
No ratings yet
100 Days of Machine Learning
14 pages
WHLP - G9 Arts Q1 Week 3&4
No ratings yet
WHLP - G9 Arts Q1 Week 3&4
3 pages
MLCyberLab
No ratings yet
MLCyberLab
9 pages
Yoon and The Jade Bracelet?
No ratings yet
Yoon and The Jade Bracelet?
4 pages
01-134192-066-9559671601-28052022-103753pm.docx
No ratings yet
01-134192-066-9559671601-28052022-103753pm.docx
1 page
1
No ratings yet
1
3 pages
data analytics lab manual
No ratings yet
data analytics lab manual
26 pages
MAPEH 4 SUMMATIVE 04
No ratings yet
MAPEH 4 SUMMATIVE 04
2 pages
Apa Style - Itc
No ratings yet
Apa Style - Itc
9 pages
ML-CONTENTHALF
No ratings yet
ML-CONTENTHALF
35 pages
ML - Practical File
No ratings yet
ML - Practical File
15 pages
Practical 3 - Categorical Feature Engineering
No ratings yet
Practical 3 - Categorical Feature Engineering
6 pages
Ml Lab Manual
No ratings yet
Ml Lab Manual
59 pages
Unit 4_Working With Graphs _python
No ratings yet
Unit 4_Working With Graphs _python
49 pages
L1_Data Pre-processing & Steps of Building a Model (1)
No ratings yet
L1_Data Pre-processing & Steps of Building a Model (1)
30 pages
ml lab
No ratings yet
ml lab
23 pages
Association For The Advancement of Artificial Intelligence
No ratings yet
Association For The Advancement of Artificial Intelligence
24 pages
(Feature Engineering) (Extended-Cheatsheet)
No ratings yet
(Feature Engineering) (Extended-Cheatsheet)
9 pages
MACHINE LEARNING manual
No ratings yet
MACHINE LEARNING manual
36 pages
Lecture 5 Encoding
No ratings yet
Lecture 5 Encoding
35 pages
IML Lab Manual
No ratings yet
IML Lab Manual
31 pages
Sample Essays
100% (1)
Sample Essays
9 pages
ANDRIANY Athalia 20063272 - Assessment Task 2 - SITHKOP002
No ratings yet
ANDRIANY Athalia 20063272 - Assessment Task 2 - SITHKOP002
4 pages
50inference
No ratings yet
50inference
31 pages
Institutional Training Report Done at
No ratings yet
Institutional Training Report Done at
5 pages
Assignment1_LATEX
No ratings yet
Assignment1_LATEX
11 pages
Data Pre Processing
No ratings yet
Data Pre Processing
2 pages
ML2 Write-Ups Prac 1-5
No ratings yet
ML2 Write-Ups Prac 1-5
11 pages
ML-Lab05-Data Preprocessing Techniques in Python
No ratings yet
ML-Lab05-Data Preprocessing Techniques in Python
7 pages
machinelearning
No ratings yet
machinelearning
26 pages
Machine
No ratings yet
Machine
33 pages
ML LAB
No ratings yet
ML LAB
23 pages
06 - Research Methodology and Work Plan PDF
No ratings yet
06 - Research Methodology and Work Plan PDF
4 pages
Ml Cyber Lab
No ratings yet
Ml Cyber Lab
16 pages
4 Data Preprocessing
No ratings yet
4 Data Preprocessing
27 pages
Practical (Data Science)
No ratings yet
Practical (Data Science)
13 pages
Almost Normal
No ratings yet
Almost Normal
4 pages
Lecture Material 3
No ratings yet
Lecture Material 3
7 pages
003-FIN7790 (Part2)
No ratings yet
003-FIN7790 (Part2)
162 pages
2nd QTR MOD. 1 DLL
No ratings yet
2nd QTR MOD. 1 DLL
52 pages
Japanese
No ratings yet
Japanese
6 pages
Final ML File
No ratings yet
Final ML File
34 pages
2024 TECHNOLYMPICS Invitation Card Making Guidelines
100% (2)
2024 TECHNOLYMPICS Invitation Card Making Guidelines
2 pages
Data Preparation.2
No ratings yet
Data Preparation.2
18 pages
Parcels For Rooftops: Order The Pictures. Read The Story Again and Check
No ratings yet
Parcels For Rooftops: Order The Pictures. Read The Story Again and Check
7 pages
ML MANUAL
No ratings yet
ML MANUAL
21 pages
ML RECORD - Merged
No ratings yet
ML RECORD - Merged
33 pages
Role of Values and Emotional Intelligence in Education
No ratings yet
Role of Values and Emotional Intelligence in Education
51 pages
Postgraduate Programmes Fee Schedule: (International Only)
No ratings yet
Postgraduate Programmes Fee Schedule: (International Only)
5 pages
learning-through-play-in-early-childhood-a-systematic-review
No ratings yet
learning-through-play-in-early-childhood-a-systematic-review
45 pages
Hypothesis Space Search in Decision Trees
No ratings yet
Hypothesis Space Search in Decision Trees
15 pages
Trung Tâm Anh NG Nhung PH M 27N7A KĐT Trung Hòa Nhân Chính - 0944 225 191
No ratings yet
Trung Tâm Anh NG Nhung PH M 27N7A KĐT Trung Hòa Nhân Chính - 0944 225 191
5 pages
EDA - Exploratory Data Analysis
No ratings yet
EDA - Exploratory Data Analysis
16 pages
LAB MANUAL 5 SOLVED 40 (1)
No ratings yet
LAB MANUAL 5 SOLVED 40 (1)
13 pages
Module V: Qualitative Analysis: Language and Literature Assessment
No ratings yet
Module V: Qualitative Analysis: Language and Literature Assessment
5 pages
EE2211 CheatSheet
No ratings yet
EE2211 CheatSheet
15 pages
DA_Programs
No ratings yet
DA_Programs
44 pages
Dwdm-Lab Manual
No ratings yet
Dwdm-Lab Manual
39 pages
24CSPC212-PIC Lab Manual
No ratings yet
24CSPC212-PIC Lab Manual
45 pages
A2 Lets-revise-Present-Simple-and-Present-Continuous
No ratings yet
A2 Lets-revise-Present-Simple-and-Present-Continuous
4 pages
Ilovepdf Merged
No ratings yet
Ilovepdf Merged
47 pages
FDS RECORD-1-4
No ratings yet
FDS RECORD-1-4
18 pages
PDPR - L1 Knowing Oneself
No ratings yet
PDPR - L1 Knowing Oneself
23 pages
Lesson Plan Communication Process
No ratings yet
Lesson Plan Communication Process
5 pages
Smep Report
No ratings yet
Smep Report
1 page
Syllabus AIML
No ratings yet
Syllabus AIML
14 pages
Week 10
No ratings yet
Week 10
50 pages
Lab 08 - Data Preprocessing
No ratings yet
Lab 08 - Data Preprocessing
9 pages
ml file syllabus
No ratings yet
ml file syllabus
43 pages
Project - Machine Learning-Business Report: By: K Ravi Kumar PGP-Data Science and Business Analytics (PGPDSBA.O.MAR23.A)
No ratings yet
Project - Machine Learning-Business Report: By: K Ravi Kumar PGP-Data Science and Business Analytics (PGPDSBA.O.MAR23.A)
38 pages
Puzzle Related To Math
No ratings yet
Puzzle Related To Math
2 pages
National Public School: Name-Karan Choudhary Class-XII Subject - Informatics Practices (065) Board Roll No.
No ratings yet
National Public School: Name-Karan Choudhary Class-XII Subject - Informatics Practices (065) Board Roll No.
24 pages
C.A. Final Result - Sum12
No ratings yet
C.A. Final Result - Sum12
26 pages
Manual
No ratings yet
Manual
48 pages
Building Good Training Sets UNIT 1 PART2
No ratings yet
Building Good Training Sets UNIT 1 PART2
46 pages
Datascience
No ratings yet
Datascience
8 pages
Region I Division of Pangasinan II Binalonan
No ratings yet
Region I Division of Pangasinan II Binalonan
4 pages
1003-Guide 11 - PET TEST Juan Raba
No ratings yet
1003-Guide 11 - PET TEST Juan Raba
8 pages
7 Transformation of Leadership
100% (1)
7 Transformation of Leadership
14 pages
Chandigarh Group of Colleges College of Engineering Landran, Mohali
No ratings yet
Chandigarh Group of Colleges College of Engineering Landran, Mohali
47 pages
Oracle Certified Professional Java Programmer OCPJP 1Z0 809
From Everand
Oracle Certified Professional Java Programmer OCPJP 1Z0 809
Manish Soni
No ratings yet