Ad3461 ML Lab Manual

The document is a laboratory manual for a Machine Learning course at Anna University, detailing various experiments including the implementation of the Candidate-Elimination algorithm, Decision Tree using ID3, and Back Propagation for building an Artificial Neural Network. Each experiment outlines the aim, algorithm, and provides Python code for implementation, along with expected results. The manual serves as a practical guide for students to understand and apply machine learning concepts through hands-on coding exercises.

Uploaded by

Madhu

AD3461 ML LAB Manual

Machine learning laboratory (Anna University)


EX. NO: 1
DATE :
IMPLEMENTATION OF CANDIDATE-ELIMINATION ALGORITHM
AIM:
To implement and demonstrate the Candidate-Elimination algorithm, for a given set of
training data examples stored in a .CSV file, to output a description of the set of all hypotheses
consistent with the training examples.
ALGORITHM:

1. Load the data set.
2. Initialize the general hypothesis and the specific hypothesis.
3. For each training example:
4. If the example is positive, then for each attribute:
       if attribute_value == hypothesis_value:
           do nothing
       else:
           replace the attribute value in the specific hypothesis with '?' (generalizing it)
5. If the example is negative:
       make the general hypothesis more specific.
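The generalization in step 4 can be sketched in isolation. This is a minimal illustration, not the manual's program: the helper name `generalize` and the sample attribute values are hypothetical.

```python
def generalize(specific_h, example):
    """Return specific_h minimally generalized so that it covers a positive example."""
    # Keep an attribute value where it matches the example, otherwise replace it with '?'
    return [s if s == e else '?' for s, e in zip(specific_h, example)]

# Hypothetical attribute values, used only for illustration
s = ['Sunny', 'Warm', 'Normal', 'Strong']
s = generalize(s, ['Sunny', 'Warm', 'High', 'Strong'])
print(s)  # ['Sunny', 'Warm', '?', 'Strong']
```

Only the third attribute differs between the hypothesis and the positive example, so only that position becomes '?'.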

AL3461– MACHINE LEARNING LABORATORY II YEAR IV SEM


PROGRAM:

dataset.csv


import numpy as np
import pandas as pd
# Loading Data from a CSV File
# A raw string (r'...') keeps the backslashes in the Windows path from being read as escape sequences
data = pd.read_csv(r'E:\BALA\AI\Lab programs\pgms\dataset.csv')
print(data)

# Separating concept features from Target

concepts = np.array(data.iloc[:,0:-1])
print(concepts)

# Isolating target into a separate DataFrame

# copying last column to target array
target = np.array(data.iloc[:,-1])
print(target)


def learn(concepts, target):
    '''
    learn() implements the learning method of the Candidate-Elimination algorithm.
    Arguments:
        concepts - a NumPy array with all the feature values
        target - a NumPy array with the corresponding output values
    '''
    # Initialise S0 with the first instance from concepts
    # .copy() makes sure a new array is created instead of just pointing to the same memory location
    specific_h = concepts[0].copy()
    n = len(specific_h)
    print("\nInitialization of specific_h and general_h")
    print(specific_h)

    general_h = [["?" for _ in range(n)] for _ in range(n)]
    print(general_h)

    # The learning iterations
    for i, h in enumerate(concepts):
        # Checking if the example has a positive target
        if target[i] == "Yes":
            for x in range(n):
                # Change values in S & G only if values change
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        # Checking if the example has a negative target
        if target[i] == "No":
            for x in range(n):
                # For a negative example change values only in G
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("\nSteps of Candidate Elimination Algorithm", i+1)
        print(specific_h)
        print(general_h)

    # Remove the rows of general_h that are still all '?', meaning those that are unchanged
    unchanged = ['?'] * n
    general_h = [row for row in general_h if row != unchanged]
    # Return final values
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("\nFinal Specific_h:", s_final, sep="\n")
print("\nFinal General_h:", g_final, sep="\n")


OUTPUT:


RESULT:
Thus the Candidate-Elimination algorithm, which finds all hypotheses consistent with the training
examples, was implemented in Python, executed and verified successfully.


EX. NO: 2

DATE :
IMPLEMENTATION OF DECISION TREE IN ID3 ALGORITHM

AIM:
To build a decision tree using the ID3 algorithm and classify a new sample using Python.
ALGORITHM:

1. Observe the dataset. Import the necessary basic Python libraries.
2. Read the dataset.
3. Calculate the entropy of the whole dataset.
4. Calculate the entropy of the filtered dataset.
5. Calculate the information gain for a feature (for example, Outlook).
6. Find the most informative feature (the feature with the highest information gain).
7. Add a node to the tree.
8. Perform the ID3 algorithm and generate a tree.
9. Find the unique classes of the label.
10. Predict from the tree.
11. Evaluate the test dataset.
12. Check the test dataset.
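Steps 3 to 5 can be sketched with the standard library alone. This is an illustrative sketch under assumptions: the function names `entropy` and `information_gain` and the four sample rows are hypothetical and are not taken from the manual's data file.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, col):
    """Entropy of the labels minus the weighted entropy after splitting on column col.

    rows is a list of lists with the class label in the last position.
    """
    labels = [r[-1] for r in rows]
    subsets = {}
    for r in rows:
        subsets.setdefault(r[col], []).append(r[-1])
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

# Hypothetical rows: one feature column followed by the class label
rows = [['Sunny', 'No'], ['Sunny', 'No'], ['Overcast', 'Yes'], ['Rain', 'Yes']]
print(information_gain(rows, 0))
```

In this toy data the feature separates the classes perfectly, so the gain equals the full entropy of the labels (1 bit).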


PROGRAM:
import numpy as np
import math
import csv

def read_data(filename):
    # Read a CSV file; the first row holds the attribute names
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        headers = next(datareader)
        metadata = []
        traindata = []
        for name in headers:
            metadata.append(name)
        for row in datareader:
            traindata.append(row)
    return (metadata, traindata)

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

    def __str__(self):
        return self.attribute

def subtables(data, col, delete):
    # Split data into one sub-table per distinct value found in column col
    tables = {}
    items = np.unique(data[:, col])
    count = np.zeros(items.shape[0], dtype=np.int32)
    for x in range(items.shape[0]):
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                count[x] += 1
    for x in range(items.shape[0]):
        # Use the dtype of data so the values stay text strings rather than bytes
        tables[items[x]] = np.empty((int(count[x]), data.shape[1]), dtype=data.dtype)
        pos = 0
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                tables[items[x]][pos] = data[y]
                pos += 1
        if delete:
            tables[items[x]] = np.delete(tables[items[x]], col, 1)
    return items, tables

def entropy(S):
    items = np.unique(S)
    if items.size == 1:
        return 0
    counts = np.zeros(items.shape[0])
    sums = 0
    for x in range(items.shape[0]):
        counts[x] = sum(S == items[x]) / (S.size * 1.0)
    for count in counts:
        sums += -1 * count * math.log(count, 2)
    return sums

def gain_ratio(data, col):
    # Information gain divided by the intrinsic value of the split.
    # Note: plain ID3 selects on information gain alone; gain ratio is the C4.5 criterion.
    items, tables = subtables(data, col, delete=False)
    total_size = data.shape[0]
    entropies = np.zeros(items.shape[0])
    intrinsic = np.zeros(items.shape[0])
    for x in range(items.shape[0]):
        ratio = tables[items[x]].shape[0] / (total_size * 1.0)
        entropies[x] = ratio * entropy(tables[items[x]][:, -1])
        intrinsic[x] = ratio * math.log(ratio, 2)
    total_entropy = entropy(data[:, -1])
    iv = -1 * sum(intrinsic)
    for x in range(entropies.shape[0]):
        total_entropy -= entropies[x]
    if iv == 0:
        # The column has a single value, so splitting on it gives no information
        return 0
    return total_entropy / iv

def create_node(data, metadata):
    # All rows belong to one class: create a leaf node
    if (np.unique(data[:, -1])).shape[0] == 1:
        node = Node("")
        node.answer = np.unique(data[:, -1])[0]
        return node
    gains = np.zeros(data.shape[1] - 1)
    for col in range(data.shape[1] - 1):
        gains[col] = gain_ratio(data, col)
    split = np.argmax(gains)
    node = Node(metadata[split])
    metadata = np.delete(metadata, split, 0)
    items, tables = subtables(data, split, delete=True)
    for x in range(items.shape[0]):
        child = create_node(tables[items[x]], metadata)
        node.children.append((items[x], child))
    return node

def empty(size):
    # Indentation string used when printing the tree
    s = ""
    for x in range(size):
        s += " "
    return s

def print_tree(node, level):
    if node.answer != "":
        print(empty(level), node.answer)
        return
    print(empty(level), node.attribute)
    for value, n in node.children:
        print(empty(level + 1), value)
        print_tree(n, level + 2)

# A raw string keeps the backslashes in the Windows path from being read as escape sequences
metadata, traindata = read_data(r"E:\BALA\AI\Lab programs\pgms\Tennisdata.csv")
data = np.array(traindata)
node = create_node(data, metadata)
print_tree(node, 0)


OUTPUT:

RESULT:
Thus the program to implement a decision tree based on the ID3 algorithm using Python was
executed and verified successfully.


EX. NO: 3
DATE :
IMPLEMENTATION OF BACK PROPAGATION ALGORITHM TO BUILD AN
ARTIFICIAL NEURAL NETWORK
AIM:
To implement the Back Propagation algorithm to build an Artificial Neural Network.
ALGORITHM:

1. Inputs X arrive through the preconnected path.
2. The input is modeled using real weights W. The weights are usually selected randomly.
3. Calculate the output of every neuron from the input layer, through the hidden layers, to the
output layer.
4. Calculate the error in the outputs.
5. Travel back from the output layer to the hidden layer to adjust the weights so that the
error decreases. Keep repeating the process until the desired output is achieved.
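The forward pass, error calculation and weight adjustment can be seen on a single sigmoid neuron before reading the full network program. This is a minimal sketch assuming a squared-error loss; the function `train_step`, the starting weights and the input values are hypothetical.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_step(weights, bias, inputs, target, l_rate):
    """One forward pass and one gradient-descent update for a single sigmoid neuron."""
    # Forward pass: weighted sum of the inputs plus the bias, then the sigmoid
    out = sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))
    # Error signal: (output - target) times the sigmoid derivative out * (1 - out)
    delta = (out - target) * out * (1.0 - out)
    # Move each weight against its gradient
    new_weights = [w - l_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - l_rate * delta
    return new_weights, new_bias, out

# Hypothetical starting point and training example
w, b = [0.5, -0.5], 0.0
outs = []
for _ in range(50):
    w, b, out = train_step(w, b, [1.0, 2.0], 1.0, 0.5)
    outs.append(out)
print(outs[0], outs[-1])
```

The output starts below 0.5 and moves toward the target of 1.0 as the updates are repeated, which is the behaviour step 5 describes.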


PROGRAM:

from math import exp
from random import seed
from random import random

# Initialize a network
def initialize_network(n_inputs, n_hidden, n_outputs):
    network = list()
    hidden_layer = [{'weights': [random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights': [random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network

# Calculate neuron activation for an input
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights)-1):
        activation += weights[i] * inputs[i]
    return activation

# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'], inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# Calculate the derivative of a neuron output
def transfer_derivative(output):
    return output * (1.0 - output)

# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network)-1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(neuron['output'] - expected[j])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

# Update network weights with error
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] -= l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] -= l_rate * neuron['delta']

# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i]-outputs[i])**2 for i in range(len(expected))])
            backward_propagate_error(network, expected)
            update_weights(network, row, l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))

# Test training backprop algorithm
seed(1)
dataset = [[2.7810836,2.550537003,0],
    [1.465489372,2.362125076,0],
    [3.396561688,4.400293529,0],
    [1.38807019,1.850220317,0],
    [3.06407232,3.005305973,0],
    [7.627531214,2.759262235,1],
    [5.332441248,2.088626775,1],
    [6.922596716,1.77106367,1],
    [8.675418651,-0.242068655,1],
    [7.673756466,3.508563011,1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs, 2, n_outputs)
train_network(network, dataset, 0.5, 20, n_outputs)
for layer in network:
    print(layer)


OUTPUT:
>epoch=0, lrate=0.500, error=6.350
>epoch=1, lrate=0.500, error=5.531
>epoch=2, lrate=0.500, error=5.221
>epoch=3, lrate=0.500, error=4.951
>epoch=4, lrate=0.500, error=4.519
>epoch=5, lrate=0.500, error=4.173
>epoch=6, lrate=0.500, error=3.835
>epoch=7, lrate=0.500, error=3.506
>epoch=8, lrate=0.500, error=3.192
>epoch=9, lrate=0.500, error=2.898
>epoch=10, lrate=0.500, error=2.626
>epoch=11, lrate=0.500, error=2.377
>epoch=12, lrate=0.500, error=2.153
>epoch=13, lrate=0.500, error=1.953
>epoch=14, lrate=0.500, error=1.774
>epoch=15, lrate=0.500, error=1.614
>epoch=16, lrate=0.500, error=1.472
>epoch=17, lrate=0.500, error=1.346
>epoch=18, lrate=0.500, error=1.233
>epoch=19, lrate=0.500, error=1.132
[{'weights': [-1.4688375095432327, 1.850887325439514, 1.0858178629550297], 'output': 0.029980305604426185, 'delta': 0.0059546604162323625}, {'weights': [0.37711098142462157, -0.0625909894552989, 0.2765123702642716], 'output': 0.9456229000211323, 'delta': -0.0026279652850863837}]
[{'weights': [2.515394649397849, -0.3391927502445985, -0.9671565426390275], 'output': 0.23648794202357587, 'delta': 0.04270059278364587}, {'weights': [-2.5584149848484263, 1.0036422106209202, 0.42383086467582715], 'output': 0.7790535202438367, 'delta': -0.03803132596437354}]

RESULT:
Thus the Back Propagation algorithm to build an Artificial Neural Network was
implemented successfully.


EX. NO: 4
DATE :
IMPLEMENTATION OF NAÏVE BAYESIAN CLASSIFIER FOR A SAMPLE
TRAINING DATASET AND TO COMPUTE ACCURACY

AIM:

To implement Naïve Bayesian classifier for Tennis data set and to compute the accuracy
with few datasets.

ALGORITHM:

1. Convert the data set into a frequency table.
2. Create a likelihood table by finding the probabilities, for example the probability of Overcast = 0.29 and
the probability of playing = 0.64.
3. Use the Naive Bayes equation to calculate the posterior probability for each class. The
class with the highest posterior probability is the outcome of the prediction.

Problem: Players will play if the weather is sunny. Is this statement correct?

We can solve it using the posterior probability method discussed above.

P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)

Here we have P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, P(Yes) = 9/14 = 0.64

Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60, which is higher than P(No | Sunny) = 0.40, so the prediction is Yes.
4. Exit.
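The arithmetic of the worked problem can be checked with exact fractions. The three probabilities are the ones quoted in the problem; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Probabilities quoted in the worked problem
p_sunny_given_yes = Fraction(3, 9)
p_yes = Fraction(9, 14)
p_sunny = Fraction(5, 14)

# Bayes' rule: P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(p_yes_given_sunny, float(p_yes_given_sunny))  # 3/5 0.6
```

Working with fractions avoids the rounding in 0.33 * 0.64 / 0.36 and gives exactly 3/5.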


PROGRAM:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# A raw string keeps the backslashes in the Windows path from being read as escape sequences
data = pd.read_csv(r"E:\BALA\AI\Lab programs\pgms\Tennis.csv")
print("The first 5 values of data is :\n", data.head())

# Obtain train data and train output
# .copy() avoids modifying a view of data when the columns are encoded below
X = data.iloc[:, :-1].copy()
print("\nThe first 5 values of train data is\n", X.head())

y = data.iloc[:, -1]
print("\nThe first 5 values of train output is\n", y.head())

# Convert them to numbers
le_outlook = LabelEncoder()
X.Outlook = le_outlook.fit_transform(X.Outlook)
le_Temperature = LabelEncoder()
X.Temperature = le_Temperature.fit_transform(X.Temperature)
le_Humidity = LabelEncoder()
X.Humidity = le_Humidity.fit_transform(X.Humidity)
le_Windy = LabelEncoder()
X.Windy = le_Windy.fit_transform(X.Windy)
print("\nNow the train data is :\n", X.head())

le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the train output is\n", y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

classifier = GaussianNB()
classifier.fit(X_train, y_train)

# accuracy_score takes the true labels first, then the predictions
print("Accuracy is:", accuracy_score(y_test, classifier.predict(X_test)))


OUTPUT:

Accuracy is: 0.6666666666666666

RESULT:
Thus the program to implement Naïve Bayesian classifier to compute the accuracy with
few datasets using python was executed and verified successfully.


EX. NO: 5

DATE :

IMPLEMENTATION OF NAÏVE BAYESIAN CLASSIFIER MODEL TO CLASSIFY A SET OF
DOCUMENTS AND TO MEASURE THE ACCURACY, PRECISION, AND RECALL

AIM:

To classify a set of documents using the Naïve Bayesian classifier and to measure the accuracy,
precision, and recall.

ALGORITHM:

1. Import basic libraries.

2. Importing the dataset.
3. Data preprocessing.
4. Training the model.
5. Testing and evaluation of the model.
6. Visualizing the model.


PROGRAM:

from sklearn.datasets import fetch_20newsgroups
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
import numpy as np

categories = ['alt.atheism', 'soc.religion.christian', 'comp.graphics', 'sci.med']
twenty_train = fetch_20newsgroups(subset='train', categories=categories, shuffle=True)
twenty_test = fetch_20newsgroups(subset='test', categories=categories, shuffle=True)

# Number of training and test documents, and the category names
print(len(twenty_train.data))
print(len(twenty_test.data))
print(twenty_train.target_names)

# First training document and its category index
print(twenty_train.data[0])
print(twenty_train.target[0])


OUTPUT:


from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
from sklearn import metrics

# Convert the training documents to word counts, then to TF-IDF features
count_vect = CountVectorizer()
X_train_tf = count_vect.fit_transform(twenty_train.data)
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_tf)
print(X_train_tfidf.shape)

# Train the classifier
mod = MultinomialNB()
mod.fit(X_train_tfidf, twenty_train.target)

# Transform the test documents with the vocabulary and weights learned from the training set
X_test_tf = count_vect.transform(twenty_test.data)
X_test_tfidf = tfidf_transformer.transform(X_test_tf)
predicted = mod.predict(X_test_tfidf)

print("Accuracy:", accuracy_score(twenty_test.target, predicted))
print(classification_report(twenty_test.target, predicted, target_names=twenty_test.target_names))
print("confusion matrix is \n", metrics.confusion_matrix(twenty_test.target, predicted))


OUTPUT:

RESULT:

Thus a set of documents was classified using the Naïve Bayesian classifier model, and its accuracy, precision, and recall were measured.


EX. NO: 6
DATE :
CONSTRUCTION OF A BAYESIAN NETWORK TO DIAGNOSE CORONA
INFECTION USING STANDARD WHO DATA SET
AIM:
To construct a Bayesian network to diagnose corona infection using WHO data set.

ALGORITHM:

The Naive Bayes procedure used here is broken down into 5 parts:

1: Separate by Class.
2: Summarize Dataset.
3: Summarize Data by Class.
4: Gaussian Probability Density Function.
5: Class Probabilities.
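Part 4, the Gaussian probability density function, is the piece that turns a feature value into a likelihood for each class. A minimal sketch, with a function name and test values chosen here for illustration:

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mean, stdev):
    """Density of a normal distribution with the given mean and standard deviation at x."""
    exponent = exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return exponent / (sqrt(2 * pi) * stdev)

# The density is highest at the mean and falls off on either side
print(gaussian_pdf(1.0, 1.0, 1.0))
print(gaussian_pdf(0.0, 1.0, 1.0))
```

For part 5, a class score is typically formed by multiplying the class prior by this density for every feature, using the per-class mean and standard deviation from part 3.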


PROGRAM:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
plt.style.use('ggplot')

# A raw string keeps the backslashes in the Windows path from being read as escape sequences
covid_19_data = pd.read_csv(r"E:\BALA\AI\Lab programs\pgms\covid_19_data.csv")
print(f'The shape of the dataframe is {covid_19_data.shape}')
print()

print(covid_19_data.info())
print()

# Treat '?' entries as missing values (np.nan; the alias np.NaN was removed in NumPy 2.0)
covid_19_data.replace(to_replace='?', value=np.nan, inplace=True)

print(covid_19_data.describe(include='all'))
print()

print(covid_19_data['Country/Region'].value_counts())
print(covid_19_data.isnull().sum())

sns.countplot(x='Country/Region', data=covid_19_data, linewidth=3)
plt.show()
covid_19_data[['ObservationDate', 'Province/State', 'Country/Region', 'Last Update', 'Confirmed',
               'Deaths', 'Recovered']].hist(bins=50, figsize=(15, 8))
plt.show()

# Fill missing values with the most frequent value of each column
covid_19_data['Country/Region'] = covid_19_data['Country/Region'].fillna(
    covid_19_data['Country/Region'].mode()[0])
covid_19_data['Confirmed'] = covid_19_data['Confirmed'].fillna(covid_19_data['Confirmed'].mode()[0])

X = covid_19_data.drop(['Deaths'], axis=1)
y = covid_19_data.Recovered
# Column names are case-sensitive: the column is 'Confirmed', not 'confirmed'
# Note: 'Recovered' is used both as a feature and as the target, as in the original program
X = X[['Confirmed', 'Recovered']]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
NB_classifier = GaussianNB()
NB_classifier.fit(X_train, y_train)
y_predict = NB_classifier.predict(X_test)

cm = confusion_matrix(y_test, y_predict)
sns.heatmap(cm, annot=True, cmap='Blues')
plt.show()
print(classification_report(y_test, y_predict))


OUTPUT:

RESULT:

Thus the program to diagnose corona infection using Bayesian network was successfully
implemented using python.


EX. NO: 7
DATE :
COMPARISON OF CLUSTERING IN EM ALGORITHM AND K-MEANS
ALGORITHM USING THE SAME DATA SETS

AIM:

To compare the clustering in EM algorithm and K-means algorithm using the same data sets.

ALGORITHM:

The K-means implementation is as follows:

1. Choose the number of clusters k.
2. Select k random points from the data as centroids.
3. Assign all the points to the closest cluster centroid.
4. Recompute the centroids of newly formed clusters.
5. Repeat steps 3 and 4 until the centroids no longer change or a maximum number of iterations is reached.
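The K-means steps can be sketched directly in NumPy. This is an illustrative sketch, not the scikit-learn implementation used in the program below: the function `kmeans`, its parameters `n_iter` and `seed`, and the four sample points are assumptions, and it runs a fixed number of iterations instead of testing for convergence.

```python
import numpy as np

def kmeans(points, k, n_iter=10, seed=0):
    """Plain K-means: returns the cluster label of each point and the centroids."""
    rng = np.random.default_rng(seed)
    # Step 2: pick k distinct data points as the initial centroids
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 3: assign every point to its closest centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: recompute each centroid; keep the old one if its cluster is empty
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

# Hypothetical data: two well-separated groups of two points each
points = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.1, 4.9]])
labels, centroids = kmeans(points, k=2)
print(labels)
```

Which group receives label 0 depends on the random initial centroids, so only the grouping itself is meaningful, not the label numbers.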

The EM implementation is as follows:

1. Expectation step (E-step): Estimate (guess) all missing values in the dataset using the current
parameters, so that after this step no value is missing. In clustering, the missing values are the
unknown cluster memberships of the points.
2. Maximization step (M-step): Use the data completed in the E-step to update the parameters.
3. Repeat the E-step and M-step until the values converge.
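For clustering, the E-step and M-step can be sketched on a one-dimensional mixture of two Gaussians. This is a simplified illustration under assumptions: the function `em_gmm_1d`, its initialization at the minimum and maximum of the data, the fixed iteration count and the sample data are all hypothetical, and the program in this experiment uses scikit-learn's GaussianMixture instead.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture; returns means, std devs and weights."""
    mu = np.array([x.min(), x.max()], dtype=float)   # initial means
    sigma = np.array([x.std(), x.std()])             # initial standard deviations
    weight = np.array([0.5, 0.5])                    # initial mixing weights
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        dens = weight * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate the parameters from the responsibilities
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        weight = nk / len(x)
    return mu, sigma, weight

# Hypothetical data: one group of points around 0 and one around 10
x = np.concatenate([np.linspace(-0.5, 0.5, 20), np.linspace(9.5, 10.5, 20)])
mu, sigma, weight = em_gmm_1d(x)
print(mu, weight)
```

Unlike K-means, each point is assigned to every cluster with a probability (the responsibility) rather than to a single nearest centroid.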


PROGRAM:
from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.mixture import GaussianMixture
from sklearn.datasets import load_iris
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

dataset = load_iris()
# print(dataset)

X = pd.DataFrame(dataset.data)
X.columns = ['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']
y = pd.DataFrame(dataset.target)
y.columns = ['Targets']
# print(X)

plt.figure(figsize=(14, 7))
colormap = np.array(['red', 'lime', 'black'])

# REAL PLOT
plt.subplot(1, 3, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real')

# K-PLOT
plt.subplot(1, 3, 2)
model = KMeans(n_clusters=3)
model.fit(X)
predY = np.choose(model.labels_, [0, 1, 2]).astype(np.int64)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[predY], s=40)
plt.title('KMeans')

# GMM PLOT (the features are standardized before fitting the mixture model)
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns=X.columns)
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm = gmm.predict(xs)
plt.subplot(1, 3, 3)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm], s=40)
plt.title('GMM Classification')

# Display the three plots when the program is run as a script
plt.show()


OUTPUT:

RESULT:
Thus the program to compare clustering by the EM and K-means algorithms on the same data set
was performed successfully.


EX. NO: 8
DATE :
IMPLEMENTATION OF K-NEAREST NEIGHBOUR ALGORITHM TO CLASSIFY
THE IRIS DATA SET
AIM:
To implement K-Nearest Neighbour algorithm to classify iris data set.

ALGORITHM:

1. Load the data set and split it into training and test sets.

2. Choose the number of neighbours k.

3. For each test sample, calculate its distance to every training sample.

4. Select the k training samples with the smallest distances.

5. Assign the test sample the class that occurs most often among these k neighbours.

6. Compare the predicted classes with the true classes to compute the accuracy.
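K-nearest-neighbour classification itself, as opposed to the scikit-learn call used in the program, can be sketched in a few lines. This is a minimal illustration assuming Euclidean distance and majority voting; the function `knn_predict` and the six training points are hypothetical.

```python
from collections import Counter
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict the class of x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Most common class among those neighbours
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]

# Hypothetical training data: two well-separated classes
X_train = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
                    [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 1.0])))  # 0
```

The program in this experiment uses n_neighbors=1, which corresponds to k=1 here: the prediction is simply the class of the single closest training point.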


PROGRAM:
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import numpy as np

dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(dataset["data"], dataset["target"], random_state=0)
kn = KNeighborsClassifier(n_neighbors=1)
kn.fit(X_train, y_train)
for i in range(len(X_test)):
    x = X_test[i]
    x_new = np.array([x])
    prediction = kn.predict(x_new)
    print("TARGET=", y_test[i], dataset["target_names"][y_test[i]],
          "PREDICTED=", prediction, dataset["target_names"][prediction])
print(kn.score(X_test, y_test))


OUTPUT:
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']

TARGET= 0 setosa PREDICTED= [0] ['setosa']

TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [2] ['virginica']
0.9736842105263158

RESULT:
Thus the program for K-Nearest Neighbour algorithm was implemented successfully using
an iris data set.


EX. NO: 9
DATE :
IMPLEMENTATION OF THE NON-PARAMETRIC LOCALLY WEIGHTED
REGRESSION ALGORITHM IN ORDER TO FIT DATA POINTS
AIM:
To implement the non-parametric Locally Weighted Regression algorithm in order to fit
data points.
ALGORITHM:

1. Read the given data sample into X and the curve (linear or non-linear) into Y.

2. Set the value of the smoothing parameter (free parameter) τ.

3. Set the bias / point of interest x0, which is a subset of X.

4. Determine the weight matrix using :

5. Determine the value of model term parameter β using:

6. Prediction = x0*β
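The formulas for the weight matrix and for β are not reproduced in this copy of the manual. A common choice, assumed here and not confirmed by the source, is a Gaussian kernel weight w_i = exp(-(x_i - x0)^2 / (2τ^2)) and the weighted least-squares solution β = (XᵀWX)⁻¹XᵀWy. Under that assumption, the steps can be sketched as follows; the function `lwr_predict` and the sample data are hypothetical.

```python
import numpy as np

def lwr_predict(x0, X, y, tau):
    """Locally weighted linear regression prediction at x0 (Gaussian kernel assumed)."""
    # Design matrix with a column of ones for the intercept
    A = np.column_stack([np.ones_like(X), X])
    a0 = np.array([1.0, x0])
    # Assumed weights: training points close to x0 count more; tau sets the width
    w = np.exp(-((X - x0) ** 2) / (2 * tau ** 2))
    W = np.diag(w)
    # Weighted normal equations: (A^T W A) beta = A^T W y
    beta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
    # Prediction = x0 * beta
    return a0 @ beta

# Hypothetical data lying exactly on the line y = 2x + 1
X = np.linspace(0, 1, 11)
y = 2 * X + 1
print(lwr_predict(0.5, X, y, tau=0.2))
```

On exactly linear data the local fit recovers the line, so the prediction at 0.5 is close to 2.0. The program in this experiment uses a different weighting scheme (a tricube kernel with robustness iterations).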


PROGRAM:
from math import ceil, pi

import matplotlib.pyplot as plt
import numpy as np
from scipy import linalg


def lowess(x, y, f, iterations):
    """Robust locally weighted linear regression (LOWESS)."""
    n = len(x)
    # Number of neighbours that define each local bandwidth.
    r = int(ceil(f * n))
    # Bandwidth h[i]: distance from x[i] to its r-th nearest neighbour.
    h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
    # Tricube kernel weights; column i holds the weights for the fit at x[i].
    w = np.clip(np.abs((x[:, None] - x[None, :]) / h), 0.0, 1.0)
    w = (1 - w ** 3) ** 3
    yest = np.zeros(n)
    delta = np.ones(n)  # robustness weights, updated after each pass
    for iteration in range(iterations):
        for i in range(n):
            weights = delta * w[:, i]
            # Weighted normal equations for a local straight line.
            b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
            A = np.array([[np.sum(weights), np.sum(weights * x)],
                          [np.sum(weights * x), np.sum(weights * x * x)]])
            beta = linalg.solve(A, b)
            yest[i] = beta[0] + beta[1] * x[i]
        # Down-weight points with large residuals (bisquare function).
        residuals = y - yest
        s = np.median(np.abs(residuals))
        delta = np.clip(residuals / (6.0 * s), -1, 1)
        delta = (1 - delta ** 2) ** 2
    return yest


# Noisy sine curve as sample data.
n = 100
x = np.linspace(0, 2 * pi, n)
y = np.sin(x) + 0.3 * np.random.randn(n)

f = 0.25        # fraction of the data used for each local fit
iterations = 3  # number of robustifying passes
yest = lowess(x, y, f, iterations)

plt.plot(x, y, "r.")     # noisy samples
plt.plot(x, yest, "b-")  # LOWESS fit
plt.show()


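OUTPUT:
[Plot: the noisy sample points shown as red dots, with the fitted LOWESS curve drawn as a blue line.]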

RESULT:
Thus the non-parametric Locally Weighted Regression algorithm to fit data points was
implemented successfully.
