
ML Lab Manual Sem-7

The document provides a lab manual for a Machine Learning course, outlining 10 programming assignments to implement various machine learning algorithms including linear regression, KNN, decision trees, and clustering. It includes sample code and output for each assignment, with the goal of giving students hands-on experience applying machine learning techniques to different datasets. The manual was prepared by Darsha Chauhan for a 7th semester Machine Learning course at Mahavir Swami College of Engineering and Technology.

Uploaded by

Prabhdeep Gill

Machine learning 3170724

LAB MANUAL
MACHINE LEARNING

Subject Code: 3170724

Prepared By:

Mahavir Swami College of Engineering and Technology, Surat

Practical List (Academic year : 2022-23)

Subject: Machine Learning (3170724)
Sem: 7th
Department: C.S.E
Faculty Name: Darsha Chauhan

Guide: Darsha Chauhan



MAHAVIR SWAMI COLLEGE OF ENGINEERING & TECHNOLOGY, SURAT

CERTIFICATE

This is to certify that Mr. / Ms. ____________________________________
of class Computer Science & Engineering, 7th semester, Enrollment No.
__________________, has satisfactorily submitted his / her term work in the
subject Machine Learning (Sub. Code: 3170724) for the term ending in
___________.
Date:

Sign of teacher Sign of Head of department



Index

Sr. No.  Date        Practical                                                                              Sign
1        17/06/2021  Write a program to implement mean, median and mode
2        01/07/2021  Write a program to implement a data distribution histogram
3        08/07/2021  Write a program to implement a scatter plot using the given dataset
4        15/07/2021  Write a program to implement linear regression from the given dataset
5        29/07/2021  Write a program to implement Scale
6        12/08/2021  Write a program for training and testing from the given dataset
7        02/09/2021  Write a program to implement a decision tree from the given dataset
8        16/09/2021  Write a program to implement the K-Nearest Neighbors algorithm from the given dataset
9        23/09/2021  Write a program to implement K-Means clustering from the given dataset
10       07/10/2021  Write a program to implement hierarchical clustering from the dataset

Practical 1: Write a program to implement mean, median and mode
Code:
import numpy as np

# Mean of the integers 1..32
v1 = np.arange(1, 33)
print(v1)
print('----------------------------------')
v2 = np.mean(v1)
print(v2)
print('----------------------------------')

# Median of the integers 1..10
v4 = np.arange(1, 11)
v3 = np.median(v4)
print(v3)
print('----------------------------------')

# Mean of 1..10 again, this time with an explicit running total
v5 = np.arange(1, 11)
total = 0  # named total so the built-in sum() is not shadowed
for i in v5:
    total = total + i
print(total)
mean = total / len(v5)  # ten values, so divide by 10 and keep the result
print(mean)
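The listing above covers the mean and the median but never computes the mode. A minimal sketch of that missing step, using Python's standard statistics module; the list marks is made-up sample data and is not part of the original practical.

```python
from collections import Counter
from statistics import multimode

# Made-up sample data (not from the lab manual).
marks = [7, 8, 8, 9, 10, 8, 7, 9, 9, 8]

# Count how often each value occurs.
counts = Counter(marks)

# multimode returns every value that shares the highest count,
# so it also copes with data that has more than one mode.
modes = multimode(marks)

print(counts.most_common(1))  # most frequent value together with its count
print(modes)
```

For an array such as np.arange(1, 33), where every value occurs exactly once, every value ties for the highest count, so the mode says little about the data.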

Gill Prabhdeep Singh(191110107013) Page 2



Output:


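Practical 2: Write a program to implement a data distribution histogram from the given dataset
Code:
import numpy
import matplotlib.pyplot as plt

# Draw 2,000,000 samples from a normal distribution with mean 5.0 and standard deviation 1.0
x = numpy.random.normal(5.0, 1.0, 2000000)

# Plot the samples as a histogram with 100 bins
plt.hist(x, 100)
plt.show()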
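plt.hist(x, 100) splits the range of x into 100 equal-width bins and counts the samples that fall into each one. A rough sketch of that counting step in plain Python, assuming a much smaller sample and only 10 bins so the counts stay readable; bin_counts is a hypothetical helper name, not a matplotlib function.

```python
import random

def bin_counts(values, bins):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would index one past the end; clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts

random.seed(0)  # fixed seed so repeated runs give the same sample
sample = [random.gauss(5.0, 1.0) for _ in range(10_000)]
counts = bin_counts(sample, 10)
for i, c in enumerate(counts):
    print(i, c)
```

The counts rise towards the middle bins and fall off on both sides, which is the bell shape the plotted histogram shows.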


Output:


Practical 3: Write a program to implement a scatter plot using the given dataset
Code:
import numpy
import matplotlib.pyplot as plt

# Two independent sets of 200 normally distributed values
x = numpy.random.normal(6.0, 1.0, 200)
y = numpy.random.normal(10.0, 2.0, 200)

# Plot each (x, y) pair as one point
plt.scatter(x, y)
plt.show()


Output:


Practical 4: Write a program to implement linear regression from the given dataset
Code:
from tkinter import *

def select():
    # Show the slider's current value in the label
    sel = "Value = " + str(v.get())
    label.config(text = sel)

top = Tk()
top.geometry("200x100")
v = DoubleVar()

# Horizontal slider from 1 to 100, bound to the variable v
scale = Scale(top, variable = v, from_ = 1, to = 100, orient = HORIZONTAL)
scale.pack(anchor=CENTER)

btn = Button(top, text="Value", command=select)
btn.pack(anchor=CENTER)

label = Label(top)
label.pack()

top.mainloop()
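The listing above builds a Tkinter slider and does not fit a regression line. As a minimal sketch of the linear regression named in the title, the least-squares slope and intercept can be computed directly from their closed-form formulas. The age and speed figures are invented sample data, and fit_line is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented sample data: car age in years against speed.
age = [1, 2, 3, 4, 5]
speed = [99, 96, 93, 90, 87]

slope, intercept = fit_line(age, speed)
predicted = slope * 6 + intercept  # prediction for a 6-year-old car
print(slope, intercept, predicted)
```

Because the invented points lie exactly on a line, the fit recovers it: each extra year lowers the predicted speed by 3. Real data would scatter around the fitted line.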


Output:


Practical 5: Write a program to implement Scale from the given dataset
Code:
import pandas
from sklearn.preprocessing import StandardScaler

scale = StandardScaler()

df = pandas.read_csv("cars2.csv")
X = df[['Weight', 'Volume']]

# Standardize each column to zero mean and unit variance
scaledX = scale.fit_transform(X)
print(scaledX)
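StandardScaler replaces each value x in a column with z = (x - mean) / std, where std is the population standard deviation of that column. A small sketch of the same calculation in plain Python; the weights are made-up numbers rather than rows from cars2.csv, and standardize is a hypothetical helper name.

```python
from statistics import mean, pstdev

def standardize(column):
    """Scale a list of numbers to zero mean and unit (population) variance."""
    mu = mean(column)
    sigma = pstdev(column)  # population standard deviation (divides by n)
    return [(x - mu) / sigma for x in column]

weights = [790, 1160, 929, 865, 1140]  # made-up values in kg
scaled = standardize(weights)
print(scaled)
```

After scaling, columns measured in different units (kilograms and litres, say) become directly comparable, which matters for distance-based methods such as K-Nearest Neighbors.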


Output:


Practical 6: Write a program for training and testing from the given dataset
Code:
1. Training set: 80% of the original dataset, by random selection
   Testing set: 20% of the original dataset, by random selection

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# First 80 points for training, remaining 20 for testing (no overlap)
train_x = x[:80]
train_y = y[:80]
test_x = x[80:]
test_y = y[80:]

# Fit a degree-4 polynomial to the training data
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))
myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()
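The listing takes the first 80 values of x for training, in their original order. That works here because x and y are already random draws; with a real dataset the rows would normally be shuffled before splitting. A minimal sketch of such a shuffled split in plain Python, where shuffle_split is a hypothetical helper name and the integers stand in for real observations.

```python
import random

def shuffle_split(rows, test_fraction, seed=None):
    """Shuffle a copy of `rows` and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    # The first n_test shuffled rows become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(100))  # stand-in for 100 observations
train, test = shuffle_split(data, test_fraction=0.2, seed=2)
print(len(train), len(test))
```

scikit-learn's train_test_split, used in Practicals 7 and 8, performs the same kind of shuffled split.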


Output:


2. Training set: 20% of the original dataset, by random selection
   Testing set: 80% of the original dataset, by random selection

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

x = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / x

# First 20 points for training, remaining 80 for testing
train_x = x[:20]
train_y = y[:20]
test_x = x[20:]
test_y = y[20:]

# Fit a degree-4 polynomial to the training data
mymodel = numpy.poly1d(numpy.polyfit(train_x, train_y, 4))
myline = numpy.linspace(0, 6, 100)

plt.scatter(train_x, train_y)
plt.plot(myline, mymodel(myline))
plt.show()
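Both listings create test_x and test_y but never use them. The usual next step is to score the fitted model on the held-out data, for example with the R² value, where 1.0 is a perfect fit and 0 is no better than always predicting the mean. A sketch in plain Python, using made-up numbers in place of mymodel(test_x); r_squared is a hypothetical helper name, and scikit-learn offers the same measure as sklearn.metrics.r2_score.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up held-out targets and two sets of predictions for them.
held_out_y = [3.0, 5.0, 7.0, 9.0]
good_predictions = [2.8, 5.1, 7.2, 8.9]
poor_predictions = [6.0, 6.0, 6.0, 6.0]  # always predicts the mean

print(r_squared(held_out_y, good_predictions))
print(r_squared(held_out_y, poor_predictions))
```

A model that scores well on the training set but much worse on the test set is overfitting, which is the risk a high-degree polynomial runs when it is trained on only a few points.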


Output:


Practical 7: Write a program to implement a decision tree from the given dataset
Code:
# DecisionTree is a separate helper module (not reproduced in this manual) that is
# expected to provide build_tree, getLeafNodes, getInnerNodes, computeAccuracy,
# prune_tree and print_tree.
from DecisionTree import *
import pandas as pd
from sklearn import model_selection

df = pd.read_csv('data_set/Social_Network_Ads.csv')
header = list(df.columns)
lst = df.values.tolist()

# Hold out 20% of the rows for testing
trainDF, testDF = model_selection.train_test_split(lst, test_size=0.2)

t = build_tree(trainDF, header)

print("\nLeaf nodes ****************")
leaves = getLeafNodes(t)
for leaf in leaves:
    print("id = " + str(leaf.id) + " depth =" + str(leaf.depth))

print("\nNon-leaf nodes ****************")
innerNodes = getInnerNodes(t)
for inner in innerNodes:
    print("id = " + str(inner.id) + " depth =" + str(inner.depth))

maxAccuracy = computeAccuracy(testDF, t)
print("\nTree before pruning with accuracy: " + str(maxAccuracy*100) + "\n")
print_tree(t)

# Try pruning each inner node (except the root) and keep the best one
nodeIdToPrune = -1
for node in innerNodes:
    if node.id != 0:
        prune_tree(t, [node.id])
        currentAccuracy = computeAccuracy(testDF, t)
        print("Pruned node_id: " + str(node.id) + " to achieve accuracy: " + str(currentAccuracy*100) + "%")
        if currentAccuracy > maxAccuracy:
            maxAccuracy = currentAccuracy
            nodeIdToPrune = node.id
        # Rebuild the unpruned tree before trying the next node
        t = build_tree(trainDF, header)
        if maxAccuracy == 1:
            break

if nodeIdToPrune != -1:
    t = build_tree(trainDF, header)
    prune_tree(t, [nodeIdToPrune])
    print("\nFinal node Id to prune (for max accuracy): " + str(nodeIdToPrune))
else:
    t = build_tree(trainDF, header)
    print("\nPruning strategy did not increase accuracy")

print("\n********************************************************************")
print("*********** Final Tree with accuracy: " + str(maxAccuracy*100) + "% ************")
print("********************************************************************\n")
print_tree(t)
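The DecisionTree module imported at the top is not reproduced in this manual, so how build_tree chooses its splits is not shown. Decision-tree builders commonly pick the split that most reduces Gini impurity; the sketch below illustrates that measure on invented labels and should not be read as the module's actual code. The names gini and split_gain are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gain(labels, left, right):
    """Impurity reduction achieved by splitting `labels` into `left` and `right`."""
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Invented example: purchased (1) or not (0) for eight users, split on an age threshold.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
younger = [0, 0, 0, 1]
older = [0, 1, 1, 1]

print(gini(labels))
print(split_gain(labels, younger, older))
```

A gain of 0 means the split separates nothing; the larger the gain, the purer the two child nodes are compared with their parent.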


Output:


Practical 8: K-Nearest Neighbors Algorithm


Code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report, confusion_matrix

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"

# Assign column names to the dataset
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']

# Read dataset to pandas dataframe
dataset = pd.read_csv(url, names=names)
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 4].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

# Fit the scaler on the training data only, then apply it to both sets
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

classifier = KNeighborsClassifier(n_neighbors=5)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))

error = []

# Calculating error for K values from 1 to 39
for i in range(1, 40):
    knn = KNeighborsClassifier(n_neighbors=i)
    knn.fit(X_train, y_train)
    pred_i = knn.predict(X_test)
    error.append(np.mean(pred_i != y_test))

plt.figure(figsize=(12, 6))
plt.plot(range(1, 40), error, color='red', linestyle='dashed', marker='o',
         markerfacecolor='blue', markersize=10)
plt.title('Error Rate K Value')
plt.xlabel('K Value')
plt.ylabel('Mean Error')
plt.show()
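KNeighborsClassifier hides the prediction step itself. As an illustration of the idea behind it, and not of scikit-learn's internals, a query point can be classified by finding the k closest training points and taking a majority vote over their labels. The points and labels below are invented, and knn_predict is a hypothetical helper name.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Invented 2-D points in two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (1.1, 1.0), k=3))
print(knn_predict(points, labels, (4.9, 5.0), k=3))
```

Because the vote depends on raw distances, features on larger scales would dominate it, which is why the listing standardizes the features before fitting.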


Output:


Practical 9: Write a program to implement K-Means clustering from the given dataset
Code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.cluster import KMeans

dataset = pd.read_csv('Mall_Customers.csv')
# Columns 3 and 4: annual income (k$) and spending score (1-100)
X = dataset.iloc[:, [3, 4]].values

# Elbow method: record the within-cluster sum of squares (WCSS) for k = 1..10
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters = i, init = 'k-means++', random_state = 42)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)
plt.plot(range(1, 11), wcss)
plt.title('The Elbow Method')
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.show()

# Fit the final model with 5 clusters and plot each cluster in its own colour
kmeans = KMeans(n_clusters = 5, init = 'k-means++', random_state = 42)
y_kmeans = kmeans.fit_predict(X)
plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 100, c = 'red', label = 'Cluster 1')
plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 100, c = 'blue', label = 'Cluster 2')
plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 100, c = 'green', label = 'Cluster 3')
plt.scatter(X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')
plt.scatter(X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s = 300, c = 'yellow',
            label = 'Centroids')
plt.title('Clusters of customers')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.legend()
plt.show()
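KMeans alternates two steps until the assignments stop changing: assign every point to its nearest centroid, then move each centroid to the mean of its points. A toy sketch of that loop in plain Python on invented points; it starts from two hand-picked centroids rather than the k-means++ initialisation used in the listing, and the helper names assign, update and wcss are hypothetical.

```python
from math import dist

def assign(points, centroids):
    """Return, for every point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]

def update(points, labels, k):
    """Return the mean of the points assigned to each of the k clusters."""
    centroids = []
    for c in range(k):
        members = [p for p, label in zip(points, labels) if label == c]
        # Assumes no cluster ends up empty, which holds for the data below.
        centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return centroids

def wcss(points, labels, centroids):
    """Within-cluster sum of squared distances, the quantity stored in inertia_."""
    return sum(dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))

# Invented 2-D points forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centroids = [points[0], points[2]]  # naive start, not k-means++

labels = None
while True:
    new_labels = assign(points, centroids)
    if new_labels == labels:  # assignments stopped changing: converged
        break
    labels = new_labels
    centroids = update(points, labels, k=2)

print(labels)
print(centroids)
print(wcss(points, labels, centroids))
```

Running the same loop for several values of k and recording the final WCSS each time gives the curve that the elbow plot draws.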


Output:


Practical 10: Write a program to implement hierarchical clustering from the given dataset
Code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy.cluster.hierarchy as sch
from sklearn.cluster import AgglomerativeClustering

dataset = pd.read_csv('Mall_Customers.csv')
# Columns 3 and 4: annual income (k$) and spending score (1-100)
X = dataset.iloc[:, [3, 4]].values

# Dendrogram of the Ward linkage, used to choose the number of clusters
dendrogram = sch.dendrogram(sch.linkage(X, method = 'ward'))
plt.title('Dendrogram')
plt.xlabel('Customers')
plt.ylabel('Euclidean distances')
plt.show()

# Ward linkage always uses Euclidean distance, so no distance argument is passed;
# the affinity parameter no longer exists in recent scikit-learn releases.
hc = AgglomerativeClustering(n_clusters = 5, linkage = 'ward')
y_hc = hc.fit_predict(X)

plt.scatter(X[y_hc == 0, 0], X[y_hc == 0, 1], s = 100, c = 'red', label = 'Cluster 1')
plt.scatter(X[y_hc == 1, 0], X[y_hc == 1, 1], s = 100, c = 'blue', label = 'Cluster 2')
plt.scatter(X[y_hc == 2, 0], X[y_hc == 2, 1], s = 100, c = 'green', label = 'Cluster 3')
plt.scatter(X[y_hc == 3, 0], X[y_hc == 3, 1], s = 100, c = 'cyan', label = 'Cluster 4')
plt.scatter(X[y_hc == 4, 0], X[y_hc == 4, 1], s = 100, c = 'magenta', label = 'Cluster 5')
plt.title('Clusters of customers')
plt.xlabel('Annual Income (k$)')
plt.ylabel('Spending Score (1-100)')
plt.legend()
plt.show()
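Ward linkage starts with every point as its own cluster and repeatedly merges the two clusters whose union adds the least within-cluster sum of squares. A toy sketch of that merge rule on invented points, stopping at two clusters; it recomputes every pair on each pass, so it only illustrates the criterion and is far slower than the library routines. The helper names are hypothetical.

```python
def centroid(cluster):
    """Mean point of a cluster given as a list of coordinate tuples."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    squared_gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_gap

def cheapest_merge(clusters):
    """Pair of cluster indices whose merge has the lowest Ward cost."""
    pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
    return min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))

# Invented points; every point starts as its own cluster.
clusters = [[(1, 1)], [(1, 2)], [(5, 5)], [(6, 5)], [(9, 1)]]
while len(clusters) > 2:
    i, j = cheapest_merge(clusters)
    merged = clusters[i] + clusters[j]
    clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]

print(clusters)
```

The order of merges, together with the cost of each one, is the information a dendrogram displays: low merges join close points, and the tall final merges join well-separated groups.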


Output:
