Atul MLT Exp 4-11

The document outlines various experiments involving machine learning algorithms, including the implementation of an Artificial Neural Network using Backpropagation, a Naïve Bayesian classifier, and a Bayesian network for medical diagnosis. It also covers clustering techniques using the EM algorithm and K-Means, as well as the k-Nearest Neighbour algorithm for classifying the Iris dataset. Each experiment includes code snippets and objectives, demonstrating practical applications of machine learning concepts.

EXPERIMENT 4

Object: Build an Artificial Neural Network by implementing the Backpropagation algorithm and test
the same using appropriate data sets.

import numpy as np

# Sigmoid activation function and its derivative
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    return x * (1 - x)

# XOR inputs and outputs
inputs = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])
outputs = np.array([
    [0],
    [1],
    [1],
    [0]
])

# Seed for reproducibility
np.random.seed(1)

# Network architecture
input_layer_neurons = 2
hidden_layer_neurons = 2
output_neurons = 1

# Initialize weights randomly with mean 0, plus biases
weights_input_hidden = 2 * np.random.rand(input_layer_neurons, hidden_layer_neurons) - 1
weights_hidden_output = 2 * np.random.rand(hidden_layer_neurons, output_neurons) - 1
bias_hidden = np.random.rand(1, hidden_layer_neurons)
bias_output = np.random.rand(1, output_neurons)

# Training parameters
epochs = 10000
learning_rate = 0.1

# Training loop
for epoch in range(epochs):
    # Forward pass
    hidden_input = np.dot(inputs, weights_input_hidden) + bias_hidden
    hidden_output = sigmoid(hidden_input)
    final_input = np.dot(hidden_output, weights_hidden_output) + bias_output
    predicted_output = sigmoid(final_input)

    # Error
    error = outputs - predicted_output

    # Backpropagation
    d_predicted_output = error * sigmoid_derivative(predicted_output)
    error_hidden_layer = d_predicted_output.dot(weights_hidden_output.T)
    d_hidden_layer = error_hidden_layer * sigmoid_derivative(hidden_output)

    # Update weights and biases
    weights_hidden_output += hidden_output.T.dot(d_predicted_output) * learning_rate
    weights_input_hidden += inputs.T.dot(d_hidden_layer) * learning_rate
    bias_output += np.sum(d_predicted_output, axis=0, keepdims=True) * learning_rate
    bias_hidden += np.sum(d_hidden_layer, axis=0, keepdims=True) * learning_rate

    if epoch % 1000 == 0:
        loss = np.mean(np.square(error))
        print(f"Epoch {epoch}, Loss: {loss:.4f}")

# Final output after training
print("\nFinal Output after Training:")
print(predicted_output)
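
Because the network ends in a sigmoid, the final outputs are values between 0 and 1 rather than hard class labels. As an optional follow-up (a minimal sketch, not part of the original listing), the predictions can be thresholded at 0.5 to recover the XOR truth table:

# Optional: threshold the continuous outputs at 0.5 to obtain binary XOR predictions
print("\nRounded predictions:")
print((predicted_output > 0.5).astype(int))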

OUTPUT



EXPERIMENT 5

Object: Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a CSV file. Compute the accuracy of the classifier, considering a few test data sets.
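
The program expects a small categorical data set such as the classic Play Tennis table. A minimal sketch of the assumed play_tennis.csv layout (the actual file is not included in this document, so the column names and rows below are only illustrative):

Outlook,Temperature,Humidity,Wind,PlayTennis
Sunny,Hot,High,Weak,No
Overcast,Hot,High,Weak,Yes
Rain,Mild,High,Strong,No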

import csv
from collections import defaultdict

def load_csv(filename):
    with open(filename, 'r') as f:
        reader = csv.reader(f)
        data = list(reader)
    header = data[0]
    return header, data[1:]

def separate_by_class(data, class_index):
    separated = defaultdict(list)
    for row in data:
        label = row[class_index]
        separated[label].append(row)
    return separated

def summarize_by_class(data, header):
    class_index = -1  # last column holds the class label
    separated = separate_by_class(data, class_index)
    summaries = {}
    for class_value, rows in separated.items():
        total = len(rows)
        feature_summary = {}
        for i in range(len(header) - 1):  # ignore class label
            counts = defaultdict(int)
            for row in rows:
                counts[row[i]] += 1
            feature_summary[i] = {val: count / total for val, count in counts.items()}
        summaries[class_value] = {
            'prior': total / len(data),
            'features': feature_summary
        }
    return summaries

def predict(row, summaries):
    probabilities = {}
    for class_value, class_summary in summaries.items():
        prob = class_summary['prior']
        for i in range(len(row) - 1):  # exclude the label column
            value = row[i]
            prob *= class_summary['features'][i].get(value, 1e-6)  # small floor probability for unseen values
        probabilities[class_value] = prob
    return max(probabilities, key=probabilities.get)

def calculate_accuracy(test_data, summaries):
    correct = 0
    for row in test_data:
        predicted = predict(row, summaries)
        if predicted == row[-1]:
            correct += 1
    return correct / len(test_data)

if __name__ == "__main__":
    # Load dataset
    header, data = load_csv("play_tennis.csv")

    # Split: first 10 rows for training, the rest for testing
    train_data = data[:10]
    test_data = data[10:]

    # Train the classifier
    model = summarize_by_class(train_data, header)

    # Test and report accuracy
    accuracy = calculate_accuracy(test_data, model)
    print(f"Accuracy on test data: {accuracy * 100:.2f}%")

    # Predictions
    print("\nPredictions:")
    for row in test_data:
        prediction = predict(row, model)
        print(f"Expected: {row[-1]}, Predicted: {prediction}")

OUTPUT :



EXPERIMENT 6

Object: Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and recall for your data set.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Step 1: Sample dataset (documents and labels)
documents = [
    "Buy cheap meds online now",        # spam
    "Limited time offer, buy now",      # spam
    "Meet me at the cafe at noon",      # ham
    "Project deadline is approaching",  # ham
    "Win big prizes easily",            # spam
    "Let's catch up tomorrow",          # ham
    "Exclusive deal just for you",      # spam
    "Don't forget the team meeting",    # ham
]
labels = ["spam", "spam", "ham", "ham", "spam", "ham", "spam", "ham"]

# Step 2: Split dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    documents, labels, test_size=0.25, random_state=42
)

# Step 3: Convert text to numerical features using CountVectorizer (Bag-of-Words)
vectorizer = CountVectorizer()
X_train_counts = vectorizer.fit_transform(X_train)
X_test_counts = vectorizer.transform(X_test)

# Step 4: Train the Naïve Bayes classifier
model = MultinomialNB()
model.fit(X_train_counts, y_train)

# Step 5: Make predictions
y_pred = model.predict(X_test_counts)

# Step 6: Evaluate classifier performance
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, pos_label="spam")
recall = recall_score(y_test, y_pred, pos_label="spam")

# Step 7: Display results
print("\n--- Predictions ---")
for text, pred, actual in zip(X_test, y_pred, y_test):
    print(f"Document: '{text}'\nPredicted: {pred} | Actual: {actual}\n")

print(f"Accuracy : {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall   : {recall:.2f}")
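
With only two documents in the test split, the scores can swing sharply from run to run. As an optional cross-check (a small sketch assuming the y_test and y_pred computed above), sklearn's classification_report prints per-class precision and recall in one call:

from sklearn.metrics import classification_report  # combined precision/recall/F1 summary
print(classification_report(y_test, y_pred, zero_division=0))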

OUTPUT



EXPERIMENT 7

Object: Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients using the standard Heart Disease Data Set. You can use Java/Python ML library classes/API.

import pandas as pd
import numpy as np
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination
from sklearn.model_selection import train_test_split

# Step 1: Load the Heart Disease dataset (Cleveland subset; '?' marks missing values)
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/processed.cleveland.data"
column_names = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach',
                'exang', 'oldpeak', 'slope', 'ca', 'thal', 'num']
data = pd.read_csv(url, header=None, names=column_names, na_values='?').dropna()

# Step 2: Discretize the variables used in the network so the CPDs stay small
data['num'] = (data['num'] > 0).astype(int)  # 0 = healthy, 1 = heart disease
data['age'] = pd.cut(data['age'], bins=[0, 45, 60, 120], labels=[0, 1, 2]).astype(int)
data = data[['age', 'sex', 'cp', 'exang', 'num']].astype(int)

# Step 3: Split the dataset into training and test sets
train_data, test_data = train_test_split(data, test_size=0.3, random_state=42)

# Step 4: Construct the Bayesian network structure
# (a simplified structure: a few key risk factors as parents of the diagnosis node 'num')
model = BayesianNetwork([
    ('age', 'num'),
    ('sex', 'num'),
    ('cp', 'num'),
    ('exang', 'num')
])

# Step 5: Learn the Conditional Probability Distributions (CPDs) from the training data
# (estimated by maximum likelihood rather than specified by hand)
model.fit(train_data, estimator=MaximumLikelihoodEstimator)

# Step 6: Perform inference using Variable Elimination
inference = VariableElimination(model)

# Step 7: Diagnose a new patient (evidence must use the discretized states from Step 2)
patient_data = {'age': 1, 'sex': 1, 'cp': 2, 'exang': 0}
query = inference.query(variables=['num'], evidence=patient_data)

print("Heart Disease Prediction (P(num | evidence)):")
print(query)

OUTPUT



EXPERIMENT 8

Object: Apply the EM algorithm to cluster a set of data stored in a CSV file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Python ML library classes/API in the program.

import pandas as pd
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, adjusted_rand_score
import matplotlib.pyplot as plt

# Step 1: Load the dataset (replace with the path to your own CSV file)
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"  # Example: Iris dataset
column_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
data = pd.read_csv(url, header=None, names=column_names)

# Step 2: Prepare the data (features only, remove the target column 'species')

X = data.drop('species', axis=1)

# Step 3: Apply the EM algorithm (Gaussian Mixture Model)
gmm = GaussianMixture(n_components=3, random_state=42)  # assuming 3 clusters (Iris has 3 species)
gmm_labels = gmm.fit_predict(X)

# Step 4: Apply K-Means clustering
kmeans = KMeans(n_clusters=3, random_state=42)
kmeans_labels = kmeans.fit_predict(X)

# Step 5: Evaluate clustering quality
# Using the silhouette score (higher is better) and the Adjusted Rand Index (ARI)
species_codes = data['species'].map({'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2})
gmm_silhouette = silhouette_score(X, gmm_labels)
kmeans_silhouette = silhouette_score(X, kmeans_labels)
gmm_ari = adjusted_rand_score(species_codes, gmm_labels)
kmeans_ari = adjusted_rand_score(species_codes, kmeans_labels)

# Step 6: Print evaluation results
print(f"GMM Silhouette Score: {gmm_silhouette:.4f}")
print(f"K-Means Silhouette Score: {kmeans_silhouette:.4f}")
print(f"GMM Adjusted Rand Index (ARI): {gmm_ari:.4f}")
print(f"K-Means Adjusted Rand Index (ARI): {kmeans_ari:.4f}")

# Step 7: Visualize the clusters (completion of the truncated listing, plotting the first two features)
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[0].scatter(X['sepal_length'], X['sepal_width'], c=gmm_labels)
axes[0].set_title('EM (Gaussian Mixture) Clusters')
axes[1].scatter(X['sepal_length'], X['sepal_width'], c=kmeans_labels)
axes[1].set_title('K-Means Clusters')
for ax in axes:
    ax.set_xlabel('sepal_length')
    ax.set_ylabel('sepal_width')
plt.tight_layout()
plt.show()
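
To comment on clustering quality beyond the scores, it helps to see how each cluster lines up with the true species. A small optional sketch (assuming the variables defined above):

# Optional: cross-tabulate true species against the cluster assignments
print("\nGMM clusters vs. species:\n", pd.crosstab(data['species'], gmm_labels))
print("\nK-Means clusters vs. species:\n", pd.crosstab(data['species'], kmeans_labels))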

OUTPUT



EXPERIMENT 9

Object: Write a program to implement the k-Nearest Neighbour algorithm to classify the Iris data set. Print both correct and wrong predictions. Python ML library classes can be used for this problem.

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Step 1: Load the Iris dataset
iris = load_iris()
X = iris.data    # Features
y = iris.target  # Labels

# Step 2: Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 3: Initialize and train the k-NN classifier
k = 3  # You can change the value of k to see how it affects accuracy
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)

# Step 4: Make predictions on the test set
y_pred = knn.predict(X_test)

# Step 5: Calculate the accuracy of the classifier
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy * 100:.2f}%")

# Step 6: Collect correct and incorrect predictions
correct_predictions = []
incorrect_predictions = []

for true_label, predicted_label, sample in zip(y_test, y_pred, X_test):
    if true_label == predicted_label:
        correct_predictions.append((sample, true_label, predicted_label))
    else:
        incorrect_predictions.append((sample, true_label, predicted_label))

# Print correct predictions
print("\nCorrect Predictions:")
for sample, true_label, predicted_label in correct_predictions:
    print(f"True Label: {iris.target_names[true_label]} - Predicted Label: {iris.target_names[predicted_label]}")

# Print incorrect predictions
print("\nIncorrect Predictions:")
for sample, true_label, predicted_label in incorrect_predictions:
    print(f"True Label: {iris.target_names[true_label]} - Predicted Label: {iris.target_names[predicted_label]}")
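
A confusion matrix gives a compact view of which species get confused with each other. A small optional addition (assuming the y_test and y_pred from above):

from sklearn.metrics import confusion_matrix
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))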

OUTPUT



EXPERIMENT 10

Object: Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select an appropriate data set for your experiment and draw graphs.

import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic non-linear data
def generate_data(n=100):
    X = np.linspace(-3, 3, n)
    y = np.sin(X) + np.random.normal(0, 0.1, n)  # non-linear signal with some noise
    return X, y

# Gaussian kernel: weight of training point xi for query point x
def kernel(xi, x, tau):
    return np.exp(-np.square(x - xi) / (2 * tau**2))

# Locally Weighted Linear Regression for a single query point
def locally_weighted_regression(X, y, tau, x_query):
    m = len(X)
    W = np.eye(m)
    for i in range(m):
        W[i, i] = kernel(x_query, X[i], tau)
    X_mat = np.c_[np.ones(m), X]          # add intercept column
    x_query_mat = np.array([1, x_query])  # add intercept to the query point
    theta = np.linalg.pinv(X_mat.T @ W @ X_mat) @ (X_mat.T @ W @ y)
    prediction = x_query_mat @ theta
    return prediction

# Predict for multiple query points
def predict_curve(X, y, tau, X_test):
    return np.array([locally_weighted_regression(X, y, tau, xi) for xi in X_test])

# Main routine
X, y = generate_data(100)
X_test = np.linspace(-3, 3, 300)
tau = 0.3  # bandwidth parameter
y_pred = predict_curve(X, y, tau, X_test)

# Plotting
plt.figure(figsize=(10, 6))
plt.scatter(X, y, color='red', label='Training Data', alpha=0.6)
plt.plot(X_test, y_pred, color='blue', label='LWR Prediction')
plt.title('Locally Weighted Regression (tau = {:.2f})'.format(tau))
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.grid(True)
plt.show()

Output:
• A smooth curve that locally fits the data, with the amount of smoothing controlled by tau.

• Lower tau → more local fitting (can overfit).

• Higher tau → smoother, more global trend (a sketch comparing several values follows below).
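
To make the effect of the bandwidth visible, a small optional sketch (reusing generate_data and predict_curve from the listing above) that overlays fits for a few tau values on the same data:

# Optional: compare several bandwidths on the same training data
plt.figure(figsize=(10, 6))
plt.scatter(X, y, color='gray', alpha=0.4, label='Training Data')
for t in [0.1, 0.3, 1.0]:
    plt.plot(X_test, predict_curve(X, y, t, X_test), label=f'tau = {t}')
plt.legend()
plt.title('Effect of tau on Locally Weighted Regression')
plt.show()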

