
ARUNAI ENGINEERING COLLEGE

(Affiliated to Anna University)


Velu Nagar, Tiruvannamalai – 606 603
Phone: 04175-237419/236799/237739
www.arunai.org

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


(CYBER SECURITY)

2023- 2024

FOURTH SEMESTER

CS3491- ARTIFICIAL INTELLIGENCE AND


MACHINE LEARNING LABORATORY

S.NO   DATE   NAME OF THE EXPERIMENT                                          PAGE NO   REMARKS

 1.           Implementation of Uninformed search algorithms (BFS, DFS)
 2.           Implementation of Informed search algorithms (A*, memory-bounded A*)
 3.           Implement naïve Bayes models
 4.           Implement Bayesian Networks
 5.           Build Regression models
 6.           Build decision trees and random forests
 7.           Build SVM models
 8.           Implement ensembling techniques
 9.           Implement clustering algorithms
10.           Implement EM for Bayesian networks
11.           Build simple NN models
12.           Build deep learning NN models

EX NO:1
IMPLEMENTATION OF UNINFORMED SEARCH ALGORITHMS (BFS, DFS)
DATE:

AIM:

To implement the uninformed search algorithms (BFS, DFS) using Python.

ALGORITHM:

BFS ALGORITHM:

Step 1: Push the root node into the queue.
Step 2: Loop until the queue is empty.
Step 3: Remove the node at the front of the queue.
Step 4: If the removed node has unvisited child nodes, mark them as visited and insert them into the queue.

DFS ALGORITHM:

Step 1: Push the root node onto the stack.
Step 2: Loop until the stack is empty.
Step 3: Peek at the node on top of the stack.
Step 4: If the node has unvisited child nodes, get an unvisited child node, mark it as traversed and push it onto the stack.
Step 5: If the node does not have any unvisited child nodes, pop the node from the stack.
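
An iterative sketch that follows these stack-based steps literally is shown here for reference; the program later in this experiment uses the equivalent recursive form. The example graph is the same one used by the programs below:

# Iterative DFS following the peek/push/pop steps above (illustrative sketch)
def dfs_iterative(graph, root):
    visited = [root]
    stack = [root]                       # Step 1: push the root node
    while stack:                         # Step 2: loop until the stack is empty
        node = stack[-1]                 # Step 3: peek at the node on top of the stack
        unvisited = [n for n in graph[node] if n not in visited]
        if unvisited:                    # Step 4: push an unvisited child
            child = unvisited[0]
            visited.append(child)
            stack.append(child)
        else:                            # Step 5: no unvisited children, so pop
            stack.pop()
    return visited

print(dfs_iterative({'5': ['3', '7'], '3': ['2', '4'], '7': ['8'],
                     '2': [], '4': ['8'], '8': []}, '5'))
# visits the nodes in the same order as the recursive DFS program below: 5 3 2 4 8 7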

PROGRAM:

BFS PROGRAM:

graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}

visited = []   # List for visited nodes.
queue = []     # Initialize a queue

def bfs(visited, graph, node):   # function for BFS
    visited.append(node)
    queue.append(node)
    while queue:
        m = queue.pop(0)
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver Code
print("Following is the Breadth-First Search")
bfs(visited, graph, '5')   # function calling

DFS PROGRAM:

graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}

visited = set()   # Set to keep track of visited nodes of the graph.

def dfs(visited, graph, node):   # function for DFS
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# Driver Code
print("Following is the Depth-First Search")
dfs(visited, graph, '5')

OUTPUT:

BFS OUTPUT
Following is the Breadth-First Search
5 3 7 2 4 8

DFS OUTPUT
Following is the Depth-First Search
5
3
2
4
8
7

RESULT:

Thus the program for BFS and DFS has been implemented successfully.

EX NO:2
IMPLEMENTATION OF INFORMED SEARCH ALGORITHMS (A*, MEMORY-BOUNDED A*)
DATE:

AIM:

To implement the informed search algorithms (A*, memory-bounded A*) using Python.

A* ALGORITHM:

1. Add the starting square (or node) to the open list.

2. Repeat the following:
   A) Look for the lowest F cost square on the open list. We refer to this as the current square.
   B) Switch it to the closed list.
   C) For each of the 8 squares adjacent to this current square:
      ● If it is not walkable or if it is on the closed list, ignore it. Otherwise do the following.
      ● If it isn't on the open list, add it to the open list. Make the current square the parent of this square. Record the F, G, and H costs of the square (F = G + H, where G is the cost from the start and H is the heuristic estimate to the target).
      ● If it is on the open list already, check to see if this path to that square is better, using G cost as the measure. A lower G cost means that this is a better path. If so, change the parent of the square to the current square, and recalculate the G and F scores of the square. If you are keeping your open list sorted by F score, you may need to re-sort the list to account for the change.
   D) Stop when you:
      ● Add the target square to the closed list, in which case the path has been found, or
      ● Fail to find the target square, and the open list is empty. In this case, there is no path.

3. Save the path. Working backwards from the target square, go from each square to its parent square until you reach the starting square. That is your path.

PROGRAM:

def aStarAlgo(start_node, stop_node):
    open_set = set(start_node)
    closed_set = set()
    g = {}        # store distance from starting node
    parents = {}  # parents contains an adjacency map of all nodes

    # distance of starting node from itself is zero
    g[start_node] = 0
    # start_node is the root node i.e. it has no parent nodes
    # so start_node is set to its own parent node
    parents[start_node] = start_node

    while len(open_set) > 0:
        n = None
        # node with lowest f() is found
        for v in open_set:
            if n == None or g[v] + heuristic(v) < g[n] + heuristic(n):
                n = v

        if n == stop_node or Graph_nodes[n] == None:
            pass
        else:
            for (m, weight) in get_neighbors(n):
                # nodes 'm' not in the open and closed sets are added to open
                # n is set as its parent
                if m not in open_set and m not in closed_set:
                    open_set.add(m)
                    parents[m] = n
                    g[m] = g[n] + weight
                else:
                    # for each node m, compare its distance from start i.e. g(m)
                    # to the distance from start through node n
                    if g[m] > g[n] + weight:
                        # update g(m)
                        g[m] = g[n] + weight
                        # change parent of m to n
                        parents[m] = n
                        # if m is in the closed set, remove it and add it to open
                        if m in closed_set:
                            closed_set.remove(m)
                            open_set.add(m)

        if n == None:
            print('Path does not exist!')
            return None

        # if the current node is the stop_node
        # then we begin reconstructing the path from it to the start_node
        if n == stop_node:
            path = []
            while parents[n] != n:
                path.append(n)
                n = parents[n]
            path.append(start_node)
            path.reverse()
            print('Path found: {}'.format(path))
            return path

        # remove n from the open list, and add it to the closed list
        # because all of its neighbors were inspected
        open_set.remove(n)
        closed_set.add(n)

    print('Path does not exist!')
    return None


# define function to return neighbors and their distances
# from the passed node
def get_neighbors(v):
    if v in Graph_nodes:
        return Graph_nodes[v]
    else:
        return None
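
The listing above refers to a heuristic() function and a Graph_nodes adjacency map that are not reproduced in the record. One possible completion, with an example graph and heuristic values chosen to be consistent with the output shown below (these values are assumptions, not part of the original listing):

# Example graph: each node maps to a list of (neighbour, edge cost) pairs (assumed values)
Graph_nodes = {
    'A': [('B', 6), ('F', 3)],
    'B': [('C', 3), ('D', 2)],
    'C': [('D', 1), ('E', 5)],
    'D': [('C', 1), ('E', 8)],
    'E': [('I', 5), ('J', 5)],
    'F': [('G', 1), ('H', 7)],
    'G': [('I', 3)],
    'H': [('I', 2)],
    'I': [('E', 5), ('J', 3)],
}

# Heuristic estimates of the remaining distance to the goal 'J' (assumed values)
def heuristic(n):
    H_dist = {'A': 10, 'B': 8, 'C': 5, 'D': 7, 'E': 3,
              'F': 6, 'G': 5, 'H': 3, 'I': 1, 'J': 0}
    return H_dist[n]

aStarAlgo('A', 'J')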

OUTPUT :

Path found: ['A', 'F', 'G', 'I', 'J']

RESULT :

Thus the A* algorithm was implemented successfully using Python.



EX NO:3
IMPLEMENT NAIVE BAYES MODELS
DATE:

AIM:
To implement the Naive Bayes model using Python.

ALGORITHM:
Naive Bayes is one of the simplest and most powerful classification algorithms. It is based on Bayes' theorem, with an assumption of independence among the predictors.
The Naive Bayes classifier assumes that the presence of a feature in a class is not related to any other feature.
Naive Bayes is a classification algorithm for binary and multi-class classification problems.

BAYES THEOREM:
Based on prior knowledge of conditions that may be related to an event, Bayes' theorem describes the probability of the event.
• Conditional probability can be found this way.
• Assume we have a hypothesis (H) and evidence (E). According to Bayes' theorem, the relationship between the probability of the hypothesis before getting the evidence, P(H), and the probability of the hypothesis after getting the evidence, P(H|E), is:
P(H|E) = P(E|H) * P(H) / P(E)
• Prior probability = P(H) is the probability before getting the evidence. Posterior probability = P(H|E) is the probability after getting the evidence.
• In general,
P(class|data) = (P(data|class) * P(class)) / P(data)

Assume we have to find the probability that a randomly picked card is a King given that it is a face card. There are 4 Kings in a deck of cards, which implies that P(King) = 4/52. All the Kings are face cards, so P(Face|King) = 1. There are 3 face cards in a suit of 13 cards and there are 4 suits in total, so P(Face) = 12/52.
Therefore,
P(King|Face) = P(Face|King) * P(King) / P(Face) = (1 * 4/52) / (12/52) = 1/3

PROGRAM:

# Importing libraries
import math
import random
import csv


# the categorical class names are changed to numeric data
# eg: yes and no encoded to 1 and 0
def encode_class(mydata):
    classes = []
    for i in range(len(mydata)):
        if mydata[i][-1] not in classes:
            classes.append(mydata[i][-1])
    for i in range(len(classes)):
        for j in range(len(mydata)):
            if mydata[j][-1] == classes[i]:
                mydata[j][-1] = i
    return mydata


# Splitting the data
def splitting(mydata, ratio):
    train_num = int(len(mydata) * ratio)
    train = []
    # initially the test set holds the entire dataset
    test = list(mydata)
    while len(train) < train_num:
        # index generated randomly from range 0 to length of testset
        index = random.randrange(len(test))
        # from testset, pop data rows and put them in train
        train.append(test.pop(index))
    return train, test


# Group the data rows under each class yes or no in a dictionary
# eg: dict[yes] and dict[no]
def groupUnderClass(mydata):
    dict = {}
    for i in range(len(mydata)):
        if (mydata[i][-1] not in dict):
            dict[mydata[i][-1]] = []
        dict[mydata[i][-1]].append(mydata[i])
    return dict


# Calculating Mean
def mean(numbers):
    return sum(numbers) / float(len(numbers))


# Calculating Standard Deviation
def std_dev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x - avg, 2) for x in numbers]) / float(len(numbers) - 1)
    return math.sqrt(variance)


def MeanAndStdDev(mydata):
    info = [(mean(attribute), std_dev(attribute)) for attribute in zip(*mydata)]
    # eg: list = [[a, b, c], [m, n, o], [x, y, z]]
    # here mean of 1st attribute = (a + m + x)/3, mean of 2nd attribute = (b + n + y)/3
    # delete summaries of last class
    del info[-1]
    return info


# find Mean and Standard Deviation under each class
def MeanAndStdDevForClass(mydata):
    info = {}
    dict = groupUnderClass(mydata)
    for classValue, instances in dict.items():
        info[classValue] = MeanAndStdDev(instances)
    return info


# Calculate Gaussian Probability Density Function
def calculateGaussianProbability(x, mean, stdev):
    expo = math.exp(-(math.pow(x - mean, 2) / (2 * math.pow(stdev, 2))))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * expo


# Calculate Class Probabilities
def calculateClassProbabilities(info, test):
    probabilities = {}
    for classValue, classSummaries in info.items():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, std_dev = classSummaries[i]
            x = test[i]
            probabilities[classValue] *= calculateGaussianProbability(x, mean, std_dev)
    return probabilities


# Make prediction - the class with the highest probability is the prediction
def predict(info, test):
    probabilities = calculateClassProbabilities(info, test)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel


# returns predictions for a set of examples
def getPredictions(info, test):
    predictions = []
    for i in range(len(test)):
        result = predict(info, test[i])
        predictions.append(result)
    return predictions


# Accuracy score
def accuracy_rate(test, predictions):
    correct = 0
    for i in range(len(test)):
        if test[i][-1] == predictions[i]:
            correct += 1
    return (correct / float(len(test))) * 100.0


# driver code
# add the data path in your system
filename = r'E:\user\MACHINE LEARNING\machine learning algos\Naive bayes\filedata.csv'

# load the file and store it in the mydata list
mydata = csv.reader(open(filename, "rt"))
mydata = list(mydata)
mydata = encode_class(mydata)
for i in range(len(mydata)):
    mydata[i] = [float(x) for x in mydata[i]]

# split ratio = 0.7
# 70% of the data is training data and 30% is test data used for testing
ratio = 0.7
train_data, test_data = splitting(mydata, ratio)
print('Total number of examples are: ', len(mydata))
print('Out of these, training examples are: ', len(train_data))
print("Test examples are: ", len(test_data))

# prepare model
info = MeanAndStdDevForClass(train_data)

# test model
predictions = getPredictions(info, test_data)
accuracy = accuracy_rate(test_data, predictions)
print("Accuracy of your model is: ", accuracy)

INPUT:

DATASET DOWNLOAD

https://gist.github.com/ktisha/c21e73a1bd1700294ef790c56c8aec1f

OUTPUT:

Total number of examples are: 200


Out of these, training examples are: 140
Test examples are: 60
Accuracy of your model is: 71.2376788

RESULT :

Thus the Naive Bayes model was implemented successfully using Python.



EX NO:4
IMPLEMENT BAYESIAN NETWORKS
DATE:

AIM:

To implement Bayesian networks and perform inference using Python.

PROCEDURE:

A Bayesian network is a directed acyclic graph in which each edge corresponds to a conditional dependency, and each node corresponds to a unique random variable. A Bayesian network consists of two major parts: a directed acyclic graph and a set of conditional probability distributions.
• The directed acyclic graph is a set of random variables represented by nodes.
• The conditional probability distribution of a node (random variable) is defined for every possible outcome of the preceding causal node(s).

For illustration, consider the following example. Suppose we attempt to turn on our computer, but the computer does not start (observation/evidence). We would like to know which of the possible causes of computer failure is more likely. In this simplified illustration, we assume only two possible causes of this misfortune: electricity failure and computer malfunction. The corresponding directed acyclic graph has an edge from each of the two cause nodes into the observed failure node.

The goal is to calculate the posterior conditional probability distribution of each of the possible unobserved causes given the observed evidence, i.e. P(Cause | Evidence).
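
This procedure can be written directly with pgmpy, the same library used in the program below. A minimal sketch of the computer-failure example, with made-up probability values used purely for illustration:

from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

# Two possible causes, one observed effect (structure from the example above)
model = BayesianNetwork([('ElectricityFailure', 'ComputerFails'),
                         ('ComputerMalfunction', 'ComputerFails')])

# Hypothetical prior and conditional probabilities (illustrative values only)
cpd_elec = TabularCPD('ElectricityFailure', 2, [[0.9], [0.1]])    # state 0 = no, 1 = yes
cpd_malf = TabularCPD('ComputerMalfunction', 2, [[0.8], [0.2]])
cpd_fail = TabularCPD('ComputerFails', 2,
                      # columns: (E=0,M=0), (E=0,M=1), (E=1,M=0), (E=1,M=1)
                      [[0.99, 0.10, 0.05, 0.01],   # computer starts
                       [0.01, 0.90, 0.95, 0.99]],  # computer does not start
                      evidence=['ElectricityFailure', 'ComputerMalfunction'],
                      evidence_card=[2, 2])

model.add_cpds(cpd_elec, cpd_malf, cpd_fail)
model.check_model()

# P(Cause | Evidence): which cause is more likely given that the computer failed?
infer = VariableElimination(model)
print(infer.query(['ElectricityFailure'], evidence={'ComputerFails': 1}))
print(infer.query(['ComputerMalfunction'], evidence={'ComputerFails': 1}))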

PROGRAM:
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
import networkx as nx
import pylab as plt

# Defining the Bayesian network structure
model = BayesianNetwork([('Guest', 'Host'), ('Price', 'Host')])

# Defining the CPDs:
cpd_guest = TabularCPD('Guest', 3, [[0.33], [0.33], [0.33]])
cpd_price = TabularCPD('Price', 3, [[0.33], [0.33], [0.33]])
cpd_host = TabularCPD('Host', 3,
                      [[0, 0, 0, 0, 0.5, 1, 0, 1, 0.5],
                       [0.5, 0, 1, 0, 0, 0, 1, 0, 0.5],
                       [0.5, 1, 0, 1, 0.5, 0, 0, 0, 0]],
                      evidence=['Guest', 'Price'],
                      evidence_card=[3, 3])

# Associating the CPDs with the network structure
model.add_cpds(cpd_guest, cpd_price, cpd_host)
model.check_model()

# Inferring the posterior probability
from pgmpy.inference import VariableElimination

infer = VariableElimination(model)
posterior_p = infer.query(['Host'], evidence={'Guest': 2, 'Price': 2})
print(posterior_p)

nx.draw(model, with_labels=True)
plt.savefig('model.png')
plt.close()

INPUT:
DATASET DOWNLOAD
https://drive.google.com/file/d/17vwRLAY8uR-6vWusM5prn08it-BEGp-f/view

OUTPUT:
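
print(posterior_p) displays the posterior distribution over Host. With the CPDs defined above and the evidence Guest = 2, Price = 2, the host is equally likely to open door 0 or door 1 (probability 0.5 each) and never opens door 2; the network diagram is saved to model.png. The exact table formatting depends on the pgmpy version.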

RESULT:
Thus the Bayesian network was implemented and inference was performed successfully using Python.

EX NO:5
BUILD REGRESSION MODELS
DATE:

AIM:

To build regression models using python.

ALGORITHM:

1. Import the necessary libraries and dataset


2. Split the dataset into training and testing sets
3. Create a linear regression model and fit it to the training data
4. Make predictions on the testing data and evaluate the model's performance.

PROGRAM:

Step 1: Import the necessary libraries and dataset

We will start by importing the required libraries and loading a dataset. For
this example, we will be using the Boston Housing dataset which is available
in scikit-learn.

import pandas as pd
import numpy as np
from sklearn.datasets import load_boston
boston = load_boston()
df = pd.DataFrame(boston.data, columns=boston.feature_names)
df['MEDV'] = boston.target
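
Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so the import above fails on recent versions. A minimal substitute, assuming the California Housing data is an acceptable stand-in for this exercise (the target is stored under the same 'MEDV' column name so the later steps run unchanged):

import pandas as pd
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
df = pd.DataFrame(housing.data, columns=housing.feature_names)
df['MEDV'] = housing.target   # median house value, reused under the same column name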

Step 2: Split the dataset into training and testing sets

Next, we need to split the dataset into training and testing sets. We will use 80%
of the data for training and 20% for testing.
from sklearn.model_selection import train_test_split

X = df.drop('MEDV', axis=1)
y = df['MEDV']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,


random_state=42)

Step 3: Create a linear regression model and fit it to the training data

Now we can create a linear regression model and fit it to the training data.

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)

Step 4: Make predictions on the testing data and evaluate the model's
performance

Finally, we can use the trained model to make predictions on the testing data
and evaluate its performance.

from sklearn.metrics import mean_squared_error, r2_score

y_pred = model.predict(X_test)

print('Mean squared error: %.2f' % mean_squared_error(y_test, y_pred))

print('Coefficient of determination: %.2f' % r2_score(y_test, y_pred))

This will output the mean squared error and coefficient of determination (R-
squared) for our model. The mean squared error gives us an idea of how well
the model is able to predict the target variable, while the R-squared value tells
us how much of the variation in the target variable is explained by the model.

And that's it! We have built a simple linear regression model using Python. Of
course, there are many other regression models you can build and various ways
to fine-tune the model's performance, but this should give you a good starting
point.

OUTPUT:

Mean squared error: 24.29


Coefficient of determination: 0.67

RESULT:

Thus the regression models were built successfully using Python.

EX NO:6
BUILD DECISION TREES AND RANDOM FORESTS
DATE:

AIM:

To build decision trees and random forests using Python.

PROCEDURE:

DECISION TREES
A decision tree is a tree-like model of decisions and their possible consequences.
It consists of nodes, which represent the decisions, and edges, which represent
the possible outcomes. The tree starts at the root node and follows a path down to
the leaves, which represent the final decisions.

To build a decision tree, you need to follow these steps:

1.Select the attribute that best separates the data. You want to select the
attribute that best separates the data into distinct groups that have
different outcomes. You can use different measures to determine which
attribute is the best, such as the Gini index or the information gain.

2.Split the data based on the selected attribute. You want to split the
data into smaller groups based on the selected attribute. Each group
should have similar outcomes.

3.Repeat the process for each group. For each group, you want to
repeat the process by selecting the next best attribute and splitting
the data.

4.Stop when you reach a stopping criterion. You want to stop building
the tree when you reach a stopping criterion, such as a minimum number
of instances in a node or a maximum depth of the tree.

Once you have built the decision tree, you can use it to make predictions for
new instances by following the path from the root to the leaf node that
corresponds to the instance.
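
Step 1 of the procedure above mentions the Gini index as a splitting measure. A small worked sketch, with made-up class counts used purely for illustration:

# Gini impurity of a group with the given class counts: 1 - sum(p_i^2)
def gini(counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Hypothetical split of 10 samples into two groups (class counts per group)
left, right = [4, 1], [0, 5]
n = sum(left) + sum(right)
weighted = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
print(round(gini(left), 3), round(gini(right), 3), round(weighted, 3))
# -> 0.32 0.0 0.16 ; a lower weighted Gini means a better split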

RANDOM FORESTS

A random forest is an ensemble learning method that combines multiple


decision trees to improve the accuracy and reduce the overfitting. It works by
creating a set of decision trees on random subsets of the data and random subsets
of the attributes.

To build a random forest, you need to follow these steps:

1. Select a random subset of the data. You want to select a random subset
of the data to train each decision tree.
2. Select a random subset of the attributes. For each split in each decision
tree, you want to select a random subset of the attributes to consider.
3. Build a decision tree for each subset. You want to build a decision tree for
each random subset of the data and attributes.
4. Combine the decision trees to make predictions. To make predictions for
new instances, you want to combine the predictions of all the decision
trees in the random forest.

Random forests are typically more accurate than single decision trees and are less prone to overfitting. They are widely used in machine learning for classification and regression tasks.

PROGRAM:

Here's some sample code that you can use to get started:

First, you need to install scikit-learn using pip:

pip install scikit-learn

Now, let's import the necessary modules and load the dataset:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load the iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Next, let's build a decision tree:

# Build the decision tree model
dtc = DecisionTreeClassifier()
dtc.fit(X_train, y_train)

# Make predictions on the test set and calculate the accuracy
y_pred = dtc.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# Print the accuracy
print("Decision Tree Accuracy:", accuracy)

Finally, let's build a random forest:

# Build the random forest model
rfc = RandomForestClassifier(n_estimators=100, random_state=42)
rfc.fit(X_train, y_train)

# Make predictions on the test set and calculate the accuracy
y_pred = rfc.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

# Print the accuracy
print("Random Forest Accuracy:", accuracy)

In this example, we're using the iris dataset and splitting it into
training and testing sets. Then, we build a decision tree and a
random forest using scikit-learn's DecisionTreeClassifier and
RandomForestClassifier classes, respectively. Finally, we make
predictions on the test set and calculate the accuracy of each model
using scikit-learn's accuracy_score function.

OUTPUT:

Decision Tree Accuracy: 1.0


Random Forest Accuracy: 1.0

RESULT:

Thus the decision tree and random forest models were built successfully using Python.

EX NO:7
BUILD SVM MODELS
DATE:

AIM:

To build an SVM model using Python.

PROCEDURE:

SVM is a popular machine learning algorithm used for classification and


regression tasks. It works by finding the optimal hyperplane that separates the
data into different classes or predicts the target variable based on the input
features.
Here are the general steps to build an SVM model:
1. Prepare your data: SVM works best with numerical data, so make sure your data is in numerical format. You may need to perform data cleaning, scaling, and feature selection as necessary.
2. Split your data: Divide your data into training and testing sets. Typically, you use 70-80% of the data for training and 20-30% for testing.
3. Choose your kernel: SVM models use different types of kernel functions to transform the input data into higher dimensions where it is easier to separate the classes. Some common kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid.
4. Train your model: Use the training data to train your SVM model. The algorithm will find the optimal hyperplane that separates the classes or predicts the target variable.
5. Evaluate your model: Use the testing data to evaluate your SVM model. Calculate the accuracy, precision, recall, and F1 score to measure the performance of your model.
6. Tune your model: If your model does not perform well, you may need to adjust the hyperparameters, such as the regularization parameter, kernel coefficient, or degree of the polynomial function (a tuning sketch is given after this list).
7. Predict with your model: Once you are satisfied with your model, you can use it to predict new data.
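
Step 6 mentions tuning hyperparameters such as the regularization parameter and kernel coefficient. A minimal sketch using scikit-learn's GridSearchCV on the iris data; the parameter grid below is only an illustrative guess, not a recommended setting:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Illustrative grid over the regularization parameter C and the kernel coefficient gamma
param_grid = {'C': [0.1, 1, 10], 'gamma': ['scale', 0.1, 1], 'kernel': ['rbf']}
grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X_train, y_train)

print("Best parameters:", grid.best_params_)
print("Test accuracy:", grid.score(X_test, y_test))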

PROGRAM:
Here is some sample code in Python using the scikit-learn library to build
an SVM model with an RBF kernel:

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create an SVM model with an RBF kernel
model = SVC(kernel='rbf')

# Train the model
model.fit(X_train, y_train)

# Make predictions on the testing data
y_pred = model.predict(X_test)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)

print('Accuracy:', accuracy)

OUTPUT:

Accuracy: <accuracy_value>

Note: The exact value depends on how the data is split by train_test_split. With the fixed random_state above the result is reproducible, and an RBF-kernel SVM typically classifies the iris test set almost perfectly (accuracy well above 90%).

RESULT:
Thus the SVM model was built successfully using Python.

EX NO:8
IMPLEMENT ENSEMBLING TECHNIQUES
DATE:

AIM:

To implement ensembling techniques using Python.

PROCEDURES:

Ensembling is a technique that combines the predictions of multiple


machine learning models to improve the overall performance of the system.
There are several ways to implement ensembling techniques, some of which are:

1. Bagging: This technique involves training multiple models independently


on different subsets of the training data, with replacement. The predictions of
these models are then combined by taking the average or majority vote of the
predictions.
2. Boosting: This technique involves sequentially training multiple models,
with each model trying to correct the errors of the previous model. The final
predictions are then obtained by combining the predictions of all the models.
3. Stacking: This technique involves training multiple models, and then
using their predictions as input features for a final model that is trained to make
the final predictions.
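
The program in this experiment demonstrates bagging and boosting. For the third technique, stacking, a minimal sketch using scikit-learn's StackingClassifier is shown here; the iris data is used purely for illustration, since the record's own dataset loader is not specified:

from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Base models whose predictions become input features for the final estimator
base_models = [('lr', LogisticRegression(max_iter=1000)), ('nb', GaussianNB())]
stack = StackingClassifier(estimators=base_models, final_estimator=LogisticRegression(max_iter=1000))

stack.fit(X_train, y_train)
print("Stacking accuracy:", accuracy_score(y_test, stack.predict(X_test)))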

PROGRAM:

Here's an example implementation of ensembling techniques using Python


and scikit-learn library:

from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the dataset (load_data() is a placeholder for your own dataset loader)
X, y = load_data()

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize the base models
model1 = LogisticRegression()
model2 = GaussianNB()

# Initialize the ensemble models
# (note: on scikit-learn >= 1.4 the argument is named 'estimator' instead of 'base_estimator')
bagging = BaggingClassifier(base_estimator=model1, n_estimators=10, random_state=42)
boosting = AdaBoostClassifier(base_estimator=model2, n_estimators=10, random_state=42)

# Fit the ensemble models on the training data
bagging.fit(X_train, y_train)
boosting.fit(X_train, y_train)

# Make predictions on the testing data
bagging_preds = bagging.predict(X_test)
boosting_preds = boosting.predict(X_test)

# Combine the predictions by averaging the two models' votes
# (assumes binary 0/1 labels; a tie, one vote each, is resolved in favour of class 1)
ensemble_preds = ((bagging_preds + boosting_preds) / 2 >= 0.5).astype(int)

# Calculate the accuracy of the ensemble model
ensemble_acc = accuracy_score(y_test, ensemble_preds)
print("Accuracy:", ensemble_acc)

In this example, we have used bagging and boosting techniques to create


two ensemble models. We have then combined the predictions of these models
using majority voting, and calculated the accuracy of the ensemble model.

OUTPUT:
Accuracy: 0.875

Note: this value will vary, since it depends on the specific dataset loaded and on the randomness in the train/test split and the ensemble methods.

RESULT:
Thus the ensembling techniques were implemented successfully using Python.

EX NO:9
IMPLEMENT CLUSTERING ALGORITHMS
DATE:

AIM:

To implement clustering algorithms using Python and the scikit-learn library.

PROCEDURE:

Clustering is a type of unsupervised learning technique that involves grouping similar data points together. There are several clustering algorithms available, including K-Means, DBSCAN, Hierarchical Clustering, and more. Below, example implementations of the K-Means and DBSCAN clustering algorithms are given using Python and the scikit-learn library.

1. K-Means Clustering
K-means clustering is one of the most popular clustering algorithms. The
algorithm partitions a set of data points into k clusters, where k is a
predefined number. The algorithm works by iteratively assigning data
points to the nearest cluster centroid, and then updating the centroid based
on the new assignments. This process continues until convergence.

PROGRAM:

Here's some sample Python code for K-Means Clustering using the scikit-
learn library:

from sklearn.cluster import KMeans


import numpy as np

X = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
print(kmeans.labels_)
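
The procedure above also mentions DBSCAN. A minimal sketch on a small toy dataset; the eps and min_samples values are illustrative choices, not tuned settings:

from sklearn.cluster import DBSCAN
import numpy as np

X = np.array([[1, 2], [2, 2], [2, 3], [8, 7], [8, 8], [25, 80]])
# eps is the neighbourhood radius, min_samples the density threshold (illustrative values)
clustering = DBSCAN(eps=3, min_samples=2).fit(X)
print(clustering.labels_)   # a label of -1 marks points treated as noise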

These are just a few examples of clustering algorithms and their


implementations. There are many more clustering algorithms and variations of
these algorithms that can be used depending on the problem at hand.

OUTPUT:

[1 1 1 0 0 0]

RESULT:
Thus the clustering algorithms were implemented successfully using Python and the scikit-learn library.

EX NO:10
IMPLEMENT EM FOR BAYESIAN NETWORKS
DATE:

AIM:

To implement EM for a Bayesian network using Python.

PROCEDURE:

Expectation-Maximization (EM) is a popular algorithm for learning the


parameters of a Bayesian network from data. The goal of EM is to estimate the
maximum likelihood or maximum a posteriori (MAP) parameters of the model
given the observed data. The algorithm alternates between an Expectation step
(E-step), which computes the expected sufficient statistics of the latent variables,
and a Maximization step (M-step), which maximizes the likelihood or posterior
with respect to the parameters.

The EM algorithm for Bayesian networks has the following steps:

1. Define the Bayesian network structure and parameters:


● Specify the variables, their domains, and their dependencies in the
network structure.
● Define the conditional probability distributions (CPDs) for each
variable based on the network structure and any prior knowledge or
assumptions.
● Initialize the parameters of the CPDs randomly or based on some
prior distribution.
2. E-step:

● Given the current parameter estimates, compute the posterior


distribution of the latent variables for each data point using Bayes'
rule and the network structure and CPDs.
● Compute the expected sufficient statistics of the latent variables,
such as the expected counts of each state or the expected pairwise
correlations, for each variable and each data point.
3. M-step:

● Use the expected sufficient statistics computed in the E-step to


update the parameter estimates of the CPDs.
● This can be done using maximum likelihood or maximum a
posteriori estimation, depending on whether you want to incorporate
prior knowledge or not.

4. Repeat the E-step and M-step until convergence:

● Evaluate the log-likelihood or log-posterior of the data given the


current parameter estimates at each iteration.
● Stop when the log-likelihood or log-posterior stops increasing or
when some convergence criterion is met.
● The E-Step and M-Step can be further broken down into specific
calculations depending on the type of Bayesian network being
learned. For example, if the network is a naive Bayes model, the E-
Step involves calculating the posterior probabilities of each feature
given the class variable, and the M-Step involves updating the
conditional probabilities of the features given the class variable.
● Overall, the EM algorithm is a useful tool for learning the
parameters of Bayesian networks, especially when there are hidden

variables in the model. It is important to note that the algorithm can


converge to a local optimum, so multiple runs with different
initializations may be necessary to find the global optimum.

5. Use the learned parameters: Once the EM algorithm has converged, the
learned parameters can be used to make predictions, perform inference, and
generate samples from the Bayesian network.
Note that the E-step and M-step for a Bayesian network can be more complex
than those for a simple mixture model or a linear regression model. In particular,
the E-step involves computing the posterior distribution over the latent variables,
which can be computationally expensive for large networks. The M-step involves
maximizing the log-likelihood of the data with respect to the parameters, which
can involve solving a non-convex optimization problem. Therefore, efficient
algorithms and approximations are often used to implement the EM algorithm for
Bayesian networks.

Here's a more detailed algorithm for implementing the EM algorithm for a


Bayesian network:

INPUT:

A Bayesian network structure G

● Incomplete or partially observed data X


● Maximum number of iterations N
● A tolerance level ε

OUTPUT:

● The estimated parameter values θ

1.Initialize the parameters of the Bayesian network:

● For each node i in G, initialize its parameters θ_i

2.Repeat until convergence or until N iterations:

a. E-step:

● For each instance x in X, compute its posterior distribution over the hidden variables: P(Z|x,θ) = P(x,Z|θ) / P(x|θ)

● Compute the expected sufficient statistics: E[n_i(x)] = ∑_Z n_i(x,Z) P(Z|x,θ) for each node i in G

b. M-step:

● For each node i in G, update its parameters to maximize the expected complete-data log-likelihood:

θ_i = argmax_θ E[log L(X,Z|θ)] = argmax_θ ∑_x ∑_Z P(Z|x,θ_old) log P(x,Z|θ)

● Check for convergence:

If the change in the log-likelihood is less than ε, stop iterating and


return θ

● If not converged, go to step 2a.

3.Return θ

Note that the specific form of the expected sufficient statistics and
the update equations for each node's parameters will depend on the type of
distribution used to model the node's conditional probabilities. For example, if
the node's distribution is Gaussian, the expected sufficient statistics would be the
mean and variance, and the update equation would involve setting the mean and
variance to the sample mean and variance of the data weighted by the posterior
probabilities.
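
The record gives no program for this experiment. As a minimal, self-contained illustration of the E-step/M-step loop described above, the sketch below runs EM for the simplest such network — a hidden binary node Z with naive-Bayes edges to three binary features — on synthetic data; the structure, data, and parameter values are assumptions made purely for illustration:

import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: hidden Z selects one of two feature-probability profiles
true_pi = np.array([0.6, 0.4])                    # P(Z)
true_theta = np.array([[0.8, 0.7, 0.1],           # P(X_i = 1 | Z = 0)
                       [0.2, 0.3, 0.9]])          # P(X_i = 1 | Z = 1)
z = rng.choice(2, size=500, p=true_pi)
X = (rng.random((500, 3)) < true_theta[z]).astype(float)

# Random initialization of the parameters
pi = np.array([0.5, 0.5])
theta = rng.uniform(0.25, 0.75, size=(2, 3))

for iteration in range(100):
    # E-step: responsibilities P(Z = k | x, current parameters)
    log_lik = np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
    log_lik -= log_lik.max(axis=1, keepdims=True)         # numerical stability
    resp = np.exp(log_lik)
    resp /= resp.sum(axis=1, keepdims=True)

    # M-step: re-estimate P(Z) and P(X_i | Z) from the expected counts
    Nk = resp.sum(axis=0)
    pi = Nk / len(X)
    theta = (resp.T @ X) / Nk[:, None]
    theta = np.clip(theta, 1e-6, 1 - 1e-6)                # avoid log(0)

print("Estimated P(Z):", np.round(pi, 2))
print("Estimated P(X_i = 1 | Z):")
print(np.round(theta, 2))

As noted above, EM can converge to a local optimum and the recovered hidden states may be label-swapped relative to the true ones, so the estimates should be read up to a relabelling of Z.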

RESULT:

Thus EM for a Bayesian network was implemented successfully using Python.

EX NO:11
BUILD SIMPLE NN MODEL
DATE:

AIM:

To build a simple neural network model by using a simple neural network with
one input layer, one hidden layer, and one output layer.

PROCEDURE:
First, we need to import the necessary libraries:
import numpy as np
import tensorflow as tf

Next, let's define the model architecture. For this example, we will create a
neural network with 2 input nodes, 4 hidden nodes, and 1 output node.
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(4, input_shape=(2,), activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

In the code above, we create a sequential model, which is a linear stack of layers.
The first layer is a dense layer with 4 nodes and the activation function ‘relu’. The
input shape of this layer is (2,), which means we expect 2 input values for each
sample. The second layer is also a dense layer with 1 node and the activation
function ‘sigmoid’

Now, let's compile the model with a loss function, optimizer, and metrics.

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

We use the binary_crossentropy loss function because this is a binary classification problem. For the optimizer, we use adam, which is a popular choice for gradient descent. We also specify that we want to track the accuracy metric during training.
Finally, let's train the model on some data.

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

y = np.array([0, 1, 1, 0])

model.fit(X, y, epochs=1000)

In this example, we use the XOR logical function as our dataset. X contains
the input values, and y contains the output values. We train the model for 1000
epochs using the fit() function.
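
After training, it is worth checking what the network has learned. A small sketch; results will vary from run to run, and 1000 epochs may or may not be enough for the network to fit XOR exactly:

# Evaluate on the training data and inspect the predictions
loss, acc = model.evaluate(X, y, verbose=0)
print("Loss:", loss, "Accuracy:", acc)
print(model.predict(X).round())   # ideally approaches [[0], [1], [1], [0]]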

That's it! You've created a simple neural network model. Of course, you can
modify the number of layers, nodes, activation functions, loss functions, and
optimizers to suit your needs.

RESULT:

Thus a simple neural network model with one input layer, one hidden layer, and one output layer was built successfully.

EX NO:12
BUILD DEEP LEARNING NN MODELS
DATE:

AIM:

To write a python program to implement deep learning of NN models.

ALGORITHM:

1. Import the necessary libraries, such as numpy and keras.


2. Load or generate your dataset. This can be done using numpy or any other
data manipulation library.
3. Preprocess your data by performing any necessary normalization, scaling,
or other transformations.
4. Define your neural network architecture using the Keras Sequential API.
Add layers to the model using the add() method, specifying the number
of units, activation function, and input dimensions for each layer.
5. Compile your model using the compile() method. Specify the loss
function, optimizer, and any evaluation metrics you want to use.
6. Train your model using the fit() method. Specify the training data,
validation data, batch size, and number of epochs.
7. Evaluate your model using the evaluate() method. This will give you the
loss and accuracy metrics on the test set.
8. Use your trained model to make predictions on new data using the
predict() method.

PROGRAM:

# Import necessary libraries

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

# Define the neural network

model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=10, activation='softmax'))

# Compile the model

model.compile(loss='categorical_crossentropy',
optimizer='sgd',
metrics=['accuracy'])

# Generate some random data for training and testing


data = np.random.random((1000, 100))

labels = np.random.randint(10, size=(1000, 1))


one_hot_labels = to_categorical(labels, num_classes=10)

# Train the model on the data

model.fit(data, one_hot_labels, epochs=10, batch_size=32)

# Evaluate the model on a test set

test_data = np.random.random((100, 100))


test_labels = np.random.randint(10, size=(100, 1))
test_one_hot_labels = to_categorical(test_labels, num_classes=10)

loss_and_metrics = model.evaluate(test_data, test_one_hot_labels, batch_size=32)


print("Test loss:", loss_and_metrics[0])
print("Test accuracy:", loss_and_metrics[1])

OUTPUT:

Epoch 1/10
32/32 [==============================] - 0s 2ms/step - loss: 2.4028 -
accuracy: 0.0900
Epoch 2/10
32/32 [==============================] - 0s 1ms/step - loss: 2.3642 -
accuracy: 0.0980
...
Epoch 10/10
32/32 [==============================] - 0s 1ms/step - loss: 2.3169 -
accuracy: 0.1120
Test loss: 2.3998091220855713
Test accuracy: 0.07999999821186066

RESULT:
Thus the Python program to implement deep learning of NN Models
was developed successfully.
