Machine Learning - Lab Manual
EX. NO: 1
DATE :
IMPLEMENTATION OF CANDIDATE-ELIMINATION ALGORITHM
AIM:
To implement and demonstrate the Candidate-Elimination algorithm, for a given set of
training data examples stored in a .CSV file, to output a description of the set of all hypotheses
consistent with the training examples.
ALGORITHM:
PROGRAM:
dataset.csv
import numpy as np
import pandas as pd

# Load data from a CSV file (a raw string avoids backslash escapes in the Windows path)
data = pd.read_csv(r'E:\BALA\AI\Lab programs\pgms\dataset.csv')
print(data)
def learn(concepts, target):
    '''
    learn() implements the learning phase of the Candidate-Elimination algorithm.
    Arguments:
        concepts - a data frame with all the features
        target - a data frame with the corresponding output values
    '''
    # Initialise S0 with the first instance from concepts
    # .copy() creates a new array instead of a reference to the same memory
    specific_h = concepts[0].copy()
    print("\nInitialization of specific_h and general_h")
    print(specific_h)
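The listing above stops after initialising specific_h. A minimal completion of learn(), following the standard Candidate-Elimination update rules, might look like the sketch below; it assumes the last CSV column holds the target and that positive examples are labelled 'yes' (adjust to the actual labels in dataset.csv):

def learn(concepts, target):
    specific_h = concepts[0].copy()
    # G starts as the most general hypothesis: one all-'?' row per attribute
    general_h = [['?' for _ in range(len(specific_h))] for _ in range(len(specific_h))]
    for i, h in enumerate(concepts):
        if target[i].lower() == "yes":            # positive example: generalise S
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        else:                                     # negative example: specialise G
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
    # Drop the rows of G that were never specialised
    general_h = [h for h in general_h if h != ['?'] * len(specific_h)]
    return specific_h, general_h

concepts = np.array(data.iloc[:, :-1])
target = np.array(data.iloc[:, -1])
s_final, g_final = learn(concepts, target)
print("Final Specific_h:", s_final)
print("Final General_h:", g_final)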
EX. NO: 2
DATE :
CONSTRUCTION OF A DECISION TREE USING THE ID3 ALGORITHM
AIM:
To build a decision tree using the ID3 algorithm to classify a new sample using Python.
ALGORITHM:
PROGRAM:
import numpy as np
import math
import csv
def read_data(filename):
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        headers = next(datareader)
        metadata = []
        traindata = []
        for name in headers:
            metadata.append(name)
        for row in datareader:
            traindata.append(row)
        return (metadata, traindata)
class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""
def entropy(S):
    items = np.unique(S)
    if items.size == 1:
        return 0
    counts = np.zeros((items.shape[0], 1))
    sums = 0
    for x in range(items.shape[0]):
        counts[x] = sum(S == items[x]) / (S.size * 1.0)
    for count in counts:
        sums += -1 * count * math.log(count, 2)
    return sums
def gain_ratio(data, col):
    items, subsets = subtables(data, col, delete=False)
    total_size = data.shape[0]
    entropies = np.zeros((items.shape[0], 1))
    intrinsic = np.zeros((items.shape[0], 1))
    for x in range(items.shape[0]):
        ratio = subsets[items[x]].shape[0] / (total_size * 1.0)
        entropies[x] = ratio * entropy(subsets[items[x]][:, -1])
        intrinsic[x] = ratio * math.log(ratio, 2)
    total_entropy = entropy(data[:, -1])
    iv = -1 * sum(intrinsic)
    for x in range(entropies.shape[0]):
        total_entropy -= entropies[x]
    return total_entropy / iv
def create_node(data, metadata):
    # If every remaining row carries the same class label, return a leaf node
    if (np.unique(data[:, -1])).shape[0] == 1:
        node = Node("")
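Note that gain_ratio() calls a subtables() helper that does not appear in the listing. A minimal sketch consistent with that call (returning the unique values of the attribute and a dictionary mapping each value to its matching rows) might be:

def subtables(data, col, delete):
    # Unique values taken by the attribute in the given column
    items = np.unique(data[:, col])
    subsets = {}
    for item in items:
        rows = data[data[:, col] == item]
        if delete:
            # Drop the attribute column once it has been used for a split
            rows = np.delete(rows, col, axis=1)
        subsets[item] = rows
    return items, subsets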
OUTPUT:
RESULT:
Thus the program to implement the decision tree based ID3 algorithm using Python was executed and verified successfully.
EX. NO: 3
DATE :
IMPLEMENTATION OF BACK PROPAGATION ALGORITHM TO BUILD AN
ARTIFICIAL NEURAL NETWORK
AIM:
To implement the Back Propagation algorithm to build an Artificial Neural Network.
ALGORITHM:
PROGRAM:
from math import exp
from random import seed, random

# Initialize a network with one hidden layer and one output layer
def initialize_network(n_inputs, n_hidden, n_outputs):
    network = list()
    hidden_layer = [{'weights': [random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights': [random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network
# Calculate neuron activation for an input
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += weights[i] * inputs[i]
    return activation
# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))
# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'], inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs
# Calculate the derivative of a neuron output
def transfer_derivative(output):
    return output * (1.0 - output)
# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network)-1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(neuron['output'] - expected[j])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])
# Update network weights with error
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] -= l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] -= l_rate * neuron['delta']
# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i] - outputs[i]) ** 2 for i in range(len(expected))])
            backward_propagate_error(network, expected)
            update_weights(network, row, l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))
# Test training backprop algorithm
seed(1)
dataset = [[2.7810836,2.550537003,0],
[1.465489372,2.362125076,0],
[3.396561688,4.400293529,0],
[1.38807019,1.850220317,0],
[3.06407232,3.005305973,0],
[7.627531214,2.759262235,1],
[5.332441248,2.088626775,1],
[6.922596716,1.77106367,1],
[8.675418651,-0.242068655,1],
[7.673756466,3.508563011,1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs, 2, n_outputs)
train_network(network, dataset, 0.5, 20, n_outputs)
for layer in network:
    print(layer)
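The trained network can also be used for prediction. The helper below is not part of the original listing; it simply takes the class with the largest output activation:

# Make a prediction with the trained network
def predict(network, row):
    outputs = forward_propagate(network, row)
    return outputs.index(max(outputs))

for row in dataset:
    print('Expected=%d, Got=%d' % (row[-1], predict(network, row)))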
OUTPUT:
>epoch=0, lrate=0.500, error=6.350
>epoch=1, lrate=0.500, error=5.531
>epoch=2, lrate=0.500, error=5.221
>epoch=3, lrate=0.500, error=4.951
>epoch=4, lrate=0.500, error=4.519
>epoch=5, lrate=0.500, error=4.173
>epoch=6, lrate=0.500, error=3.835
>epoch=7, lrate=0.500, error=3.506
>epoch=8, lrate=0.500, error=3.192
>epoch=9, lrate=0.500, error=2.898
>epoch=10, lrate=0.500, error=2.626
>epoch=11, lrate=0.500, error=2.377
>epoch=12, lrate=0.500, error=2.153
>epoch=13, lrate=0.500, error=1.953
>epoch=14, lrate=0.500, error=1.774
>epoch=15, lrate=0.500, error=1.614
>epoch=16, lrate=0.500, error=1.472
>epoch=17, lrate=0.500, error=1.346
>epoch=18, lrate=0.500, error=1.233
>epoch=19, lrate=0.500, error=1.132
[{'weights': [-1.4688375095432327, 1.850887325439514, 1.0858178629550297], 'output': 0.029980305604426185, 'delta': 0.0059546604162323625}, {'weights': [0.37711098142462157, -0.0625909894552989, 0.2765123702642716], 'output': 0.9456229000211323, 'delta': -0.0026279652850863837}]
[{'weights': [2.515394649397849, -0.3391927502445985, -0.9671565426390275], 'output': 0.23648794202357587, 'delta': 0.04270059278364587}, {'weights': [-2.5584149848484263, 1.0036422106209202, 0.42383086467582715], 'output': 0.7790535202438367, 'delta': -0.03803132596437354}]
RESULT:
Thus the Back Propagation algorithm to build an Artificial Neural Network was implemented successfully.
EX.NO: 4
DATE:
IMPLEMENTATION OF NAÏVE BAYESIAN CLASSIFIER FOR A SAMPLE DATA SET
AIM:
To implement the Naïve Bayesian classifier for the Tennis data set and to compute its accuracy on test data.
ALGORITHM:
Here we have P(Sunny|Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36 and P(Yes) = 9/14 = 0.64.
Now, P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny) = 0.33 * 0.64 / 0.36 = 0.60, which is the higher probability; the snippet below verifies this arithmetic.
4. Exit.
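The Bayes computation above can be checked in one line of Python:

# Bayes rule check: P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny)
print((3 / 9) * (9 / 14) / (5 / 14))  # prints 0.6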
PROGRAM:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB

# A raw string avoids backslash escapes in the Windows path
data = pd.read_csv(r"E:\BALA\AI\Lab programs\pgms\Tennis.csv")
print("The first 5 values of the data are:\n", data.head())
# Obtain the training features and the training output
X = data.iloc[:, :-1]
print("\nThe first 5 values of the train data are:\n", X.head())
y = data.iloc[:, -1]
print("\nThe first 5 values of the train output are:\n", y.head())
print("\nNow the Train data is :\n",X.head())
le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the Train output is\n",y)
classifier = GaussianNB()
classifier.fit(X_train,y_train)
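A single new day can then be classified as sketched below. The encoded feature values are assumptions; the actual integer codes depend on how LabelEncoder numbered the values in each column:

# Hypothetical encoded sample, e.g. (Outlook, Temperature, Humidity, Windy)
sample = [[2, 1, 0, 0]]
pred = classifier.predict(sample)
print("Predicted class:", le_PlayTennis.inverse_transform(pred))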
OUTPUT:
RESULT:
Thus the program to implement the Naïve Bayesian classifier and to compute its accuracy using Python was executed and verified successfully.
EX. NO: 5
DATE :
CLASSIFICATION OF DOCUMENTS USING THE NAÏVE BAYESIAN CLASSIFIER MODEL
AIM:
To classify a set of documents using the Naïve Bayesian classifier and to measure the accuracy and precision.
ALGORITHM:
PROGRAM:
import numpy as np
from sklearn.datasets import fetch_20newsgroups

# categories is not defined in the original listing; this subset of the
# 20 newsgroups category names is an assumption (any valid subset works)
categories = ['alt.atheism', 'soc.religion.christian', 'comp.graphics', 'sci.med']
twenty_train = fetch_20newsgroups(subset='train', categories=categories, shuffle=True)
twenty_test = fetch_20newsgroups(subset='test', categories=categories, shuffle=True)
print(len(twenty_train.data))
print(len(twenty_test.data))
print(twenty_train.target_names)
print("\n".join(twenty_train.data[0].split("\n")))
print(twenty_train.target[0])
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

count_vect = CountVectorizer()
X_train_tf = count_vect.fit_transform(twenty_train.data)
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_tf)
print(X_train_tfidf.shape)
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report

mod = MultinomialNB()
mod.fit(X_train_tfidf, twenty_train.target)
X_test_tf = count_vect.transform(twenty_test.data)
X_test_tfidf = tfidf_transformer.transform(X_test_tf)
predicted = mod.predict(X_test_tfidf)
print(classification_report(twenty_test.target, predicted, target_names=twenty_test.target_names))
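As a design note, the vectoriser, TF-IDF transformer and classifier can also be chained into a single scikit-learn Pipeline, which keeps the train and test transformations consistent automatically:

from sklearn.pipeline import Pipeline

text_clf = Pipeline([
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
    ('clf', MultinomialNB()),
])
text_clf.fit(twenty_train.data, twenty_train.target)
print("Pipeline accuracy:", text_clf.score(twenty_test.data, twenty_test.target))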
RESULT:
Thus the accuracy and precision were measured using the Naïve Bayesian classifier model.
EX. NO: 6
DATE :
CONSTRUCTION OF A BAYESIAN NETWORK TO DIAGNOSE CORONA
INFECTION USING STANDARD WHO DATA SET
AIM:
To construct a Bayesian network to diagnose corona infection using WHO data set.
ALGORITHM:
PROGRAM:
import pandas as pd
covid_19_data=pd.read_csv("/content/corona.csv")
covid_19_data
import warnings
warnings.filterwarnings("ignore",category=FutureWarning)
covid_19_data=pd.read_csv("/content/corona.csv")
print()
print()
import numpy as np
covid_19_data.replace(to_replace='?',value=np.NaN,inplace=True)
print(covid_19_data.describe(include='all'))
print()
print(covid_19_data["Loss of Taste/Smell"].value_counts())
print(covid_19_data.isnull().sum())
import seaborn as sns
sns.countplot(x="Loss of Taste/Smell",data=covid_19_data,linewidth=3)
covid_19_data['Patient ID'].fillna(covid_19_data['Patient ID'].mode()[0],inplace=True)
X=covid_19_data.drop(['Travel History'],axis=1)
y=covid_19_data.Cough
X=X[['Patient ID','Age']]
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2)
from sklearn.naive_bayes import GaussianNB
NB_classifier=GaussianNB()
NB_classifier.fit(X_train,y_train)
y_predict=NB_classifier.predict(X_test)
from sklearn.metrics import confusion_matrix
cm=confusion_matrix(y_test,y_predict)
sns.heatmap(cm,annot=True,cmap='Blues')
from sklearn.metrics import classification_report
print(classification_report(y_test,y_predict))
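The listing above approximates the diagnosis with a Gaussian Naïve Bayes classifier, which is a special case of a Bayesian network. An explicit network over a couple of symptom columns could be sketched with the pgmpy library as below; the edges and the evidence values are assumptions, and the named columns must hold discrete values in the actual data set:

# Sketch only: assumes pgmpy is installed and the columns are discrete
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

model = BayesianNetwork([('Sore Throat', 'Cough'), ('Loss of Taste/Smell', 'Cough')])
model.fit(covid_19_data[['Sore Throat', 'Loss of Taste/Smell', 'Cough']],
          estimator=MaximumLikelihoodEstimator)
infer = VariableElimination(model)
print(infer.query(variables=['Cough'],
                  evidence={'Sore Throat': 1, 'Loss of Taste/Smell': 1}))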
OUTPUT:
RESULT:
Thus the program to diagnose corona infection using a Bayesian network was implemented successfully using Python.
EX. NO: 7
DATE :
COMPARISON OF CLUSTERING IN EM ALGORITHM AND K-MEANS
ALGORITHM USING THE SAME DATA SETS
AIM:
To compare the clustering in EM algorithm and K-means algorithm using the same data
sets.
ALGORITHM:
1. Expectation step (E - step): It involves the estimation (guess) of all missing values in the
dataset so that after completing this step, there should not be any missing value.
2. Maximization step (M - step): This step involves the use of estimated data in the E-step
and updating the parameters.
3. Repeat E-step and M-step until the convergence of the values occurs.
PROGRAM:
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.mixture import GaussianMixture
from sklearn.datasets import load_iris
import sklearn.metrics as sm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dataset=load_iris()
# print(dataset)
X=pd.DataFrame(dataset.data)
X.columns=['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
y=pd.DataFrame(dataset.target)
y.columns=['Targets']
# print(X)
plt.figure(figsize=(14,7))
colormap=np.array(['red','lime','black'])
# REAL PLOT
plt.subplot(1,3,1)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y.Targets],s=40)
plt.title('Real')
# K-PLOT
plt.subplot(1,3,2)
model=KMeans(n_clusters=3)
model.fit(X)
predY=np.choose(model.labels_,[0,1,2]).astype(np.int64)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[predY],s=40)
plt.title('KMeans')
# GMM PLOT
scaler=preprocessing.StandardScaler()
scaler.fit(X)
xsa=scaler.transform(X)
xs=pd.DataFrame(xsa,columns=X.columns)
gmm=GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm=gmm.predict(xs)
plt.subplot(1,3,3)
plt.scatter(X.Petal_Length,X.Petal_Width,c=colormap[y_cluster_gmm],s=40)
plt.title('GMM Classification')
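Since cluster numbers are arbitrary, a label-permutation-invariant score such as the Adjusted Rand Index (available through sklearn.metrics, already imported above as sm) gives a fair numeric comparison of the two algorithms:

# Adjusted Rand Index: 1.0 means perfect agreement with the true labels
print("KMeans ARI:", sm.adjusted_rand_score(y.Targets, predY))
print("GMM ARI   :", sm.adjusted_rand_score(y.Targets, y_cluster_gmm))
plt.show()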
OUTPUT:
RESULT:
Thus the program to compare clustering by the EM algorithm and the K-Means algorithm on the same data set was executed successfully.
EX. NO: 8
DATE :
IMPLEMENTATION OF K-NEAREST NEIGHBOUR ALGORITHM TO CLASSIFY
THE IRIS DATA SET
AIM:
To implement K-Nearest Neighbour algorithm to classify iris data set.
ALGORITHM:
1. Load the iris data set and split it into training and test sets.
2. Choose the number of neighbours K (here K = 1).
3. For each test sample, compute its distance to every training sample.
4. Select the K nearest training samples and assign the majority class among them as the prediction.
5. Compare the predictions with the actual labels and report the accuracy, as in the program below.
PROGRAM:
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import numpy as np
dataset=load_iris()
X_train,X_test,y_train,y_test=train_test_split(dataset["data"],dataset["target"],random_state=0)
kn=KNeighborsClassifier(n_neighbors=1)
kn.fit(X_train,y_train)
for i in range(len(X_test)):
    x = X_test[i]
    x_new = np.array([x])
    prediction = kn.predict(x_new)
    print("TARGET=", y_test[i], dataset["target_names"][y_test[i]],
          "PREDICTED=", prediction, dataset["target_names"][prediction])
print(kn.score(X_test, y_test))
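The fitted classifier can also label a single new measurement; the four values below are made-up sepal/petal lengths and widths in cm:

# Classify one hypothetical flower: [sepal length, sepal width, petal length, petal width]
sample = np.array([[5.0, 3.4, 1.5, 0.2]])
print("Predicted:", dataset["target_names"][kn.predict(sample)[0]])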
OUTPUT:
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [2] ['virginica']
0.9736842105263158
RESULT:
Thus the program for the K-Nearest Neighbour algorithm was implemented successfully using the iris data set.
EX. NO: 9
DATE :
IMPLEMENTATION OF THE NON-PARAMETRIC LOCALLY WEIGHTED REGRESSION ALGORITHM
AIM:
To implement the non-parametric Locally Weighted Regression algorithm in order to fit
data points.
ALGORITHM:
1. Read the given data sample to X and the curve (linear or non-linear) to Y.
2. Set the value of the smoothening (free) parameter f, the fraction of points used for each local fit.
3. For each point of interest x0, weight every point with the tricube kernel w = (1 - |d|^3)^3 applied to the scaled distances d.
4. Determine the local model parameter β(x0) = (XᵀWX)⁻¹XᵀWy by weighted least squares.
5. Repeat the fit, down-weighting points with large residuals, until the estimate is robust.
6. Prediction = x0*β
PROGRAM:
from math import ceil
import numpy as np
from scipy import linalg

def lowess(x, y, f, iterations):
    n = len(x)
    r = int(ceil(f * n))
    h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
    w = np.clip(np.abs((x[:, None] - x[None, :]) / h), 0.0, 1.0)
    w = (1 - w ** 3) ** 3
    yest = np.zeros(n)
    delta = np.ones(n)
    for iteration in range(iterations):
        for i in range(n):
            weights = delta * w[:, i]
            b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
            A = np.array([[np.sum(weights), np.sum(weights * x)],
                          [np.sum(weights * x), np.sum(weights * x * x)]])
            beta = linalg.solve(A, b)
            yest[i] = beta[0] + beta[1] * x[i]
        residuals = y - yest
        s = np.median(np.abs(residuals))
        delta = np.clip(residuals / (6.0 * s), -1, 1)
        delta = (1 - delta ** 2) ** 2
    return yest
import math
n = 100
x = np.linspace(0, 2 * math.pi, n)
y = np.sin(x) + 0.3 * np.random.randn(n)
f = 0.25
iterations = 3
yest = lowess(x, y, f, iterations)
import matplotlib.pyplot as plt
plt.plot(x, y, "r.")
plt.plot(x, yest, "b-")
plt.show()
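As a cross-check, statsmodels ships a reference LOWESS implementation; assuming statsmodels is installed, the sketch below plots its fit for comparison (frac plays the role of f above):

import statsmodels.api as smapi

# Returns an (n, 2) array of (sorted x, smoothed y)
smoothed = smapi.nonparametric.lowess(y, x, frac=0.25, it=3)
plt.figure()
plt.plot(x, y, "r.")
plt.plot(smoothed[:, 0], smoothed[:, 1], "g-")
plt.title("statsmodels lowess")
plt.show()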
OUTPUT:
RESULT:
Thus the non-parametric Locally Weighted Regression algorithm to fit data points was
implemented successfully.