ML Lab Manual-IT

The document is a Machine Learning Lab Manual for III B.Tech II Semester students at Marri Laxman Reddy Institute of Technology and Management, detailing course objectives, outcomes, and a series of experiments to be conducted using Python. It includes guidelines for lab conduct, institutional vision and mission statements, and a list of experiments covering various machine learning algorithms. The manual is prepared by Mrs. Karimunnisa Shaik, Assistant Professor in the Department of Information Technology.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

MACHINE LEARNING
LAB MANUAL

(Subject Code: 2070518)

B.Tech, III Year, II Semester
Regulation: R22
A.Y: 2024-25

Prepared by:
MRS. KARIMUNNISA SHAIK,
ASSISTANT PROFESSOR

Department of INFORMATION TECHNOLOGY


MARRI LAXMAN REDDY INSTITUTE OF TECHNOLOGY AND MANAGEMENT
Hyderabad-500043
CERTIFICATE

This is to certify that this manual is a bonafide record of practical work in
“MACHINE LEARNING” in the III B.Tech II Sem (IT) program during the academic year
2024-25. This manual is prepared by Mrs. Karimunnisa Shaik, Assistant Professor, IT.
PREFACE

This Lab Manual entitled “Machine Learning Lab” is intended for the use of III B.Tech II Semester
Information Technology students of Marri Laxman Reddy Institute of Technology and Management,
Dundigal, Hyderabad. The main objective of the Machine Learning Lab manual is to introduce the
concepts of machine learning, the design principles of machine learning algorithms, and their
implementation in Python.

By
Karimunnisa Shaik
Department of IT
ACKNOWLEDGEMENT

It was really a good experience working on the “MACHINE LEARNING LAB”. First,
we would like to thank Dr. M. Nagalakshmi, Professor and HOD, Department of Information
Technology, Marri Laxman Reddy Institute of Technology & Management, for her
concern and for giving technical support in preparing the document. We are deeply indebted
and gratefully acknowledge the constant support and valuable patronage of Dr. P. SRIDHAR,
Director, Marri Laxman Reddy Institute of Technology & Management, for giving us this
wonderful opportunity to prepare the “MACHINE LEARNING LAB” laboratory manual.
We express our hearty thanks to Dr. R. Murali Prasad, Principal, Marri Laxman Reddy
Institute of Technology & Management, for timely corrections and scholarly guidance. Last
but not least, we would like to thank the entire IT faculty, who inspired and helped us
to achieve our goal.

By
KARIMUNNISA SHAIK
Assistant Professor
GENERAL INSTRUCTIONS

1. Students are instructed to come to Machine Learning laboratory on time. Late comers are not
entertained in the lab.

2. Students should be punctual to the lab. If not, the conducted experiments will not be repeated.

3. Students are expected to come prepared, having studied at home the experiments that are going to be
performed.

4. Students are instructed to display their identity cards before entering the lab.

5. Students are instructed not to bring mobile phones to the lab.

6. Any damage or loss of system parts such as the keyboard or mouse during the lab session is the
student’s responsibility, and a penalty or fine will be collected from the student.

7. Students should update their records and lab observation books session-wise. Before leaving the lab,
the student should get his/her lab observation book signed by the faculty.

8. Students should submit the lab records by the next lab to the concerned faculty members in the staff
room for their correction and return.

9. Students should not move around the lab during the lab session.

10. If any emergency arises, the student should take permission from the faculty member
concerned in written format.

11. The faculty members may suspend any student from the lab session on disciplinary grounds.

12. Never copy the output from other students. Write down your own outputs.
INSTITUTION VISION AND MISSION

VISION
To be an ideal academic institution by graduating talented engineers who are ethically strong and
competent, with quality research and technologies.

MISSION
To fulfill the promised vision through the following strategic characteristics and aspirations:
 Utilize rigorous educational experiences to produce talented engineers.
 Create an atmosphere that facilitates the success of students.
 Offer programs that integrate global awareness, communication skills and
leadership qualities.
 Build education and research partnerships with institutions and industries to
prepare the students for interdisciplinary research.
DEPARTMENT VISION AND MISSION

VISION
To empower the students to be technologically adept, innovative, self-motivated and responsible
global citizens possessing human values, and to contribute significantly towards high-quality
technical education in an ever-changing world.
MISSION

 To offer high-quality education in the computing fields by providing an environment
where knowledge is gained and applied to participate in research, for both students
and faculty.
 To develop problem-solving skills in the students so they are ready to deal with the
cutting-edge technologies of the industry.
 To make the students and faculty excel in their professional fields by inculcating
communication skills, leadership skills and team-building skills through the organization
of various co-curricular and extra-curricular programs.
 To provide the students with theoretical and applied knowledge, and adopt an
education approach that promotes lifelong learning and ethical growth.
PROGRAMME EDUCATIONAL OBJECTIVES

 Learn and Integrate: Graduates shall apply knowledge to solve computer science
and allied engineering problems with continuous learning.
 Think and Create: Graduates are inculcated with a passion towards higher education
and research with social responsibility.
 Communicate and Organize: Graduates shall pursue careers in industry, empowered
with professional and interpersonal skills.

PROGRAM SPECIFIC OUTCOMES

PSO1: Applications of Computing: Ability to use knowledge in various domains to provide
solutions to new ideas and innovations.
PSO2: Programming Skills: Identify required data structures, design suitable algorithms,
and develop and maintain software for real-world problems.
PROGRAMME OUTCOMES

The Program Outcomes (POs) of the department are defined so that the Graduate
Attributes are included. The Program Outcomes (POs) of the department are as stated below:

a: An ability to apply knowledge of Science, Mathematics, Engineering & Computing
fundamentals for the solutions of complex engineering problems.
b: An ability to identify, formulate, research literature and analyze complex engineering
problems using first principles of mathematics and engineering sciences.
c: An ability to design solutions to a complex process or program to meet desired needs.
d: Ability to use research-based knowledge and research methods, including design of
experiments, to provide valid conclusions.
e: An ability to use appropriate techniques, skills and tools necessary for computing practice.
f: Ability to apply reasoning informed by contextual knowledge to assess social issues,
consequences & responsibilities relevant to professional engineering practice.
g: Ability to understand the impact of engineering solutions in a global, economic,
environmental, and societal context with sustainability.
h: An understanding of professional, ethical and social issues and responsibilities.
i: An ability to function as an individual, and as a member or leader in diverse teams and
in multidisciplinary settings.
j: An ability to communicate effectively on complex engineering activities within the
engineering community.
k: Ability to demonstrate an understanding of the engineering and management principles
as a member and leader in a team.
l: Ability to engage in independent and lifelong learning in the context of technological
change.
MACHINE LEARNING LAB

COURSE STRUCTURE, OBJECTIVES & OUTCOMES

COURSE STRUCTURE:

Laboratory subjects – internal and external evaluation – details of marks: The “MACHINE
LEARNING” lab will have continuous evaluation during the semester for 40 sessional marks
and 60 end semester examination marks. Out of the 40 marks for internal evaluation, day-to-day
work in the laboratory shall be evaluated for 20 marks, and an internal practical examination,
conducted by the laboratory teacher concerned, shall be evaluated for 20 marks. The end semester
examination shall be evaluated for a maximum of 60 marks and shall be conducted with an external
examiner and an internal examiner. The external examiner shall be appointed by the
Principal / Chief Controller of Examinations.

COURSE OBJECTIVES:

 To get an overview of the various Machine Learning techniques and be able to
demonstrate them using Python.

COURSE OUTCOMES:

 Understand the complexity of Machine Learning algorithms and their limitations;
 Understand modern notions in data-analysis-oriented computing;
 Confidently apply common Machine Learning algorithms in practice and
implement their own;
 Conduct experiments in Machine Learning using real-world data.

2070584: MACHINE LEARNING LAB


Course Objectives:
 To get an overview of the various Machine Learning techniques and be able to demonstrate
them using Python.
Course Outcomes:
The students will be able to:
 Understand the complexity of Machine Learning algorithms and their limitations;
 Understand modern notions in data-analysis-oriented computing;
 Confidently apply common Machine Learning algorithms in practice and implement their own;
 Conduct experiments in Machine Learning using real-world data.
List of Experiments
1. The probability that it is Friday and that a student is absent is 3%. Since there are 5 school days in a
week, the probability that it is Friday is 20%. What is the probability that a student is absent given
that today is Friday? Apply Bayes' rule in Python to get the result. (Ans: 15%)
2. Extract the data from a database using Python.
3. Implement the Find-S algorithm using Python.
4. Implement the Candidate-Elimination algorithm using Python.
5. Implement the Decision-Tree learning algorithm using Python.
6. Implement k-nearest neighbors classification using Python.
7. Given the following data, which specify classifications for nine combinations of VAR1 and
VAR2, predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the result of
k-means clustering with 3 means (i.e., 3 centroids).
VAR1 VAR2 CLASS
1.713 1.586 0
0.180 1.786 1
0.353 1.240 1
0.940 1.566 0
1.486 0.759 1
1.266 1.106 0
1.540 0.419 1
0.459 1.799 1
0.773 0.186 1
8. The following training examples map descriptions of individuals onto high, medium and
low creditworthiness.
medium skiing design single twenties no -> highRisk
high golf trading married forties yes -> lowRisk
low speedway transport married thirties yes -> medRisk
medium football banking single thirties yes -> lowRisk
high flying media married fifties yes -> highRisk
low football security single twenties no -> medRisk
medium golf media single thirties yes -> medRisk
medium golf transport married forties yes -> lowRisk
high skiing banking single thirties yes -> highRisk
low golf unemployed married forties yes -> highRisk
Input attributes are (from left to right) income, recreation, job, status, age-group, home-owner. Find the
unconditional probability of `golf' and the conditional probability of `single' given `medRisk' in the
dataset.
9. Implement linear regression using Python.
10. Implement the Naïve Bayes theorem to classify English text.
11. Implement an algorithm to demonstrate the significance of the genetic algorithm.
12. Implement a finite words classification system using the back-propagation algorithm.
TEXT BOOKS:
1. Machine Learning – Tom M. Mitchell, - MGH
REFERENCES:
1. Machine Learning: An Algorithmic Perspective, Stephen Marshland, Taylor & Francis

EXPERIMENT -1

AIM: The probability that it is Friday and that a student is absent is 3%. Since there are 5
school days in a week, the probability that it is Friday is 20%. What is the probability that a
student is absent given that today is Friday? Apply Bayes' rule in Python to get the result.
(Ans: 15%)

ALGORITHM:

Step 1: Read the probability that it is Friday and that a student is absent, P(F ∩ A).
Step 2: Read the probability that it is Friday, P(F).
Step 3: Apply the conditional-probability form of Bayes' rule: P(A | F) = P(F ∩ A) / P(F).
Step 4: Print the result and end.

SOURCE CODE:

PFIA = float(input("Enter probability that it is Friday and that a student is absent = "))
PF = float(input("Enter probability that it is Friday = "))
PABF = PFIA / PF
print("Probability that a student is absent given that today is Friday, using conditional probability =", PABF)

OUTPUT:

Enter probability that it is Friday and that a student is absent = 0.03
Enter probability that it is Friday = 0.2
Probability that a student is absent given that today is Friday, using conditional probability = 0.15
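The same computation can be packaged as a small function; the sketch below uses exact fractions to avoid floating-point rounding (the function name is our own, not part of the manual's listing):

```python
from fractions import Fraction

def p_absent_given_friday(p_friday_and_absent, p_friday):
    """Conditional probability: P(Absent | Friday) = P(Friday and Absent) / P(Friday)."""
    return p_friday_and_absent / p_friday

# P(Friday and Absent) = 3% = 3/100, P(Friday) = 20% = 1/5
result = p_absent_given_friday(Fraction(3, 100), Fraction(1, 5))
print(result)         # 3/20, i.e. 15%
print(float(result))  # 0.15
```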

VIVA QUESTIONS

1. How is machine learning different from general programming?

2. Is Python a compiled language or an interpreted language?

3. What is the difference between a Set and Dictionary?

4. What are *args and **kwargs?

5. Define prior probability and posterior probability.

6. Derive Bayes' theorem.

7. What is Bayes' rule?

8. What is Bayes classifier?

9. What are Bayesian Networks (BN)?

10. Given the following statistics, what is the probability that a woman has cancer if she has a
positive result?

EXPERIMENT – 2

AIM: EXTRACT THE DATA FROM DATABASE USING PYTHON.


EXPLANATION:
1. First, you need to create a table (students) in a MySQL database (SampleDB).
2. Open a command prompt and execute the following command to enter the MySQL prompt:
3. mysql -u root -p
4. Then execute the following commands at the MySQL prompt to create the table in the
database:
5. CREATE DATABASE SampleDB;
6. USE SampleDB;
7. CREATE TABLE students (sid VARCHAR(10), sname VARCHAR(10), age INT);
INSERT INTO students VALUES('s521','John Bob',23);
INSERT INTO students VALUES('s522','John Dilly',22);
INSERT INTO students VALUES('s523','Kenny',23);
INSERT INTO students VALUES('s524','Henry',23);
8. Next, open a command prompt and execute the following command to install the MySQL
connector package, which connects Python to a MySQL database:
pip install mysql-connector-python

SOURCE CODE:
import mysql.connector

myconn = mysql.connector.connect(host="localhost", user="root", passwd="", database="SampleDB")
cur = myconn.cursor()
cur.execute("SELECT * FROM students")
result = cur.fetchall()
print("Student details are:")
for x in result:
    print(x)
myconn.close()
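If a MySQL server is not available, the same extraction flow can be sketched with Python's built-in sqlite3 module; the in-memory database below mirrors the SampleDB example and is an illustrative alternative, not the manual's MySQL listing:

```python
import sqlite3

# Create an in-memory database and a students table mirroring SampleDB
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (sid TEXT, sname TEXT, age INTEGER)")
cur.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("s521", "John Bob", 23), ("s522", "John Dilly", 22),
     ("s523", "Kenny", 23), ("s524", "Henry", 23)],
)
conn.commit()

# Extract the data, exactly as with mysql.connector
cur.execute("SELECT * FROM students")
rows = cur.fetchall()
print("Student details are:")
for row in rows:
    print(row)
conn.close()
```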

OUTPUT:

VIVA QUESTIONS
1. What is MySQL Connector/Python?
2. What are the five major steps for connecting MySQL and Python?
3. How do we create a connection object?
4. How do we create a cursor object?
5. How do we execute SQL query through Python?
6. What is the difference between the fetchall() and fetchone() methods?
7. What is the purpose of rowcount parameter?
8. Which method is used to establish a connection to a database using sqlite3 in Python?
9. Which method is used to execute an SQL query and fetch all the results in Python’s database
interaction?
10. Which method is used to close the cursor in Python’s database interaction?

EXPERIMENT – 3

AIM : IMPLEMENT FIND-S ALGORITHM USING PYTHON.


TRAINING DATABASE

ALGORITHM
1. Initialize h to the most specific hypothesis in H.
2. For each positive training instance x:
   For each attribute constraint a_i in h:
      If the constraint a_i is satisfied by x, then do nothing;
      Else replace a_i in h by the next more general constraint that is satisfied by x.
3. Output hypothesis h.

SOURCE CODE:
import csv

# Initialize an empty list to hold the data
a = []

# Open and read the CSV file
with open('FIND-S.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    # Read all rows into the list `a`
    for row in reader:
        a.append(row)

# Print the number of training instances
print("\nThe total number of training instances are:", len(a))

# Get the number of attributes (excluding the target attribute)
num_attribute = len(a[0]) - 1

# Initialize the hypothesis with '0' (the most specific hypothesis)
hypothesis = ['0'] * num_attribute
print("\nThe initial hypothesis is:")
print(hypothesis)

# Process each training instance
for i in range(len(a)):
    # Check if the target attribute is 'yes'
    if a[i][num_attribute] == 'yes':
        # Update the hypothesis based on the current instance
        for j in range(num_attribute):
            if hypothesis[j] == '0' or hypothesis[j] == a[i][j]:
                hypothesis[j] = a[i][j]
            else:
                hypothesis[j] = '?'

    # Print the hypothesis for the current training instance
    print("\nThe hypothesis for the training instance {} is:".format(i + 1))
    print(hypothesis)

# Print the final maximally specific hypothesis
print("\nThe maximally specific hypothesis is:")
print(hypothesis)
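To try the algorithm without a FIND-S.csv file, the same update rule can be run on an inline copy of the classic EnjoySport training data (the data values below are the standard textbook example, assumed here rather than taken from the manual's CSV):

```python
# Inline EnjoySport training data: attribute values + target ('yes'/'no')
data = [
    ['sunny', 'warm', 'normal', 'strong', 'warm', 'same', 'yes'],
    ['sunny', 'warm', 'high', 'strong', 'warm', 'same', 'yes'],
    ['rainy', 'cold', 'high', 'strong', 'warm', 'change', 'no'],
    ['sunny', 'warm', 'high', 'strong', 'cool', 'change', 'yes'],
]

num_attribute = len(data[0]) - 1
hypothesis = ['0'] * num_attribute       # most specific hypothesis

for row in data:
    if row[num_attribute] == 'yes':      # Find-S uses only positive examples
        for j in range(num_attribute):
            if hypothesis[j] in ('0', row[j]):
                hypothesis[j] = row[j]   # keep / adopt the attribute value
            else:
                hypothesis[j] = '?'      # generalize on disagreement

print(hypothesis)  # ['sunny', 'warm', '?', 'strong', '?', '?']
```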

OUTPUT:

VIVA QUESTIONS

1. What is present in the version space of the Find-S algorithm in the beginning?

2. When does the hypothesis change in the Find-S algorithm, while iteration?

3. What is one of the advantages of the Find-S algorithm?

4. How does the hypothesis change gradually?

5. When do we use CSV file?

6. What is the csv.reader() function?

7. What is one of the advantages of the Find-S algorithm?

8. How does the hypothesis change gradually?

9. What is one of the drawbacks of the Find-S algorithm?

10. Does the Find-S algorithm accommodate all the maximally specific hypotheses?



EXPERIMENT: 4

AIM: IMPLEMENT CANDIDATE-ELIMINATION ALGORITHM USING PYTHON.

TRAINING DATABASE

ALGORITHM
1. Initialize S to the first positive training example (the most specific hypothesis) and G to the
most general hypothesis.
2. For each positive example: generalize S minimally so that it covers the example, and remove
from G any hypothesis inconsistent with the example.
3. For each negative example: specialize G minimally so that it excludes the example, and remove
from S any hypothesis that covers the example.
4. Output the final specific (S) and general (G) boundary sets.
SOURCE CODE:

import csv

# Open and read the CSV file
with open("enjoysport.csv") as f:
    csv_file = csv.reader(f)
    data = list(csv_file)

# Print the data from the CSV file
print(data)
print(" ")

# Extract the first positive instance (excluding the last element)
s = data[1][:-1]

# Initialize the general hypothesis as the most general
g = [['?' for i in range(len(s))] for j in range(len(s))]

# Print initial specific and general hypotheses
print("Initial specific hypothesis:", s)
print("Initial general hypothesis:", g)
print(" ")

# Candidate-Elimination algorithm
for i in data:
    if i[-1] == "TRUE":
        # For each positive training record
        for j in range(len(s)):
            if i[j] != s[j]:
                s[j] = '?'
                g[j][j] = '?'
    elif i[-1] == "FALSE":
        # For each negative training record
        for j in range(len(s)):
            if i[j] != s[j]:
                g[j][j] = s[j]
            else:
                g[j][j] = '?'

    # Print the hypothesis after processing each instance
    print("\nSteps of Candidate Elimination Algorithm", data.index(i) + 1)
    print("Specific hypothesis:", s)
    print("General hypothesis:", g)

# Extract the final general hypothesis
gh = []
for i in g:
    for j in i:
        if j != '?':
            gh.append(i)
            break

# Print the final hypotheses
print("\nFinal specific hypothesis:\n", s)
print("\nFinal general hypothesis:\n", gh)
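The elimination steps can be checked on an inline copy of the standard EnjoySport data (the inline values are the textbook example, assumed here rather than read from enjoysport.csv):

```python
# Inline EnjoySport data: attribute values + target ('TRUE'/'FALSE')
data = [
    ["sunny", "warm", "normal", "strong", "warm", "same", "TRUE"],
    ["sunny", "warm", "high", "strong", "warm", "same", "TRUE"],
    ["rainy", "cold", "high", "strong", "warm", "change", "FALSE"],
    ["sunny", "warm", "high", "strong", "cool", "change", "TRUE"],
]

n = len(data[0]) - 1
s = list(data[0][:-1])             # specific boundary, seeded with the first positive example
g = [['?'] * n for _ in range(n)]  # general boundary (one row per attribute)

for row in data:
    if row[-1] == "TRUE":          # positive example: generalize S, prune G
        for j in range(n):
            if row[j] != s[j]:
                s[j] = '?'
                g[j][j] = '?'
    else:                          # negative example: specialize G
        for j in range(n):
            if row[j] != s[j]:
                g[j][j] = s[j]
            else:
                g[j][j] = '?'

gh = [h for h in g if any(v != '?' for v in h)]
print("S =", s)   # ['sunny', 'warm', '?', 'strong', '?', '?']
print("G =", gh)  # [['sunny', '?', '?', '?', '?', '?'], ['?', 'warm', '?', '?', '?', '?']]
```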

OUTPUT:

VIVA QUESTIONS:

1. The algorithm is trying to find a suitable day for swimming. What is the most general hypothesis?

2. Candidate-Elimination algorithm can be described by?

3. How is the version space represented?

4. Is it possible that, in the output, set S contains only φ (the empty set)?

5. Python supports the creation of anonymous functions at runtime, using a construct called?

6. What is the order of precedence in python?

7. Which of the following is true for variable names in Python?

8. Let G be the set of maximally general hypotheses. While iterating through the dataset, when is it
changed for the first time?

9. What are the two main types of functions in Python?



EXPERIMENT: 5

AIM: IMPLEMENT DECISION-TREE LEARNING ALGORITHM USING PYTHON.


ALGORITHM:

Consider a training dataset D (a real-time dataset that predicts cancer: cancer.xlsx).

SOURCE CODE:

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.tree import DecisionTreeClassifier, plot_tree

from sklearn.metrics import accuracy_score, classification_report

from sklearn.preprocessing import LabelEncoder

import numpy as np

import math

import matplotlib.pyplot as plt

file_path = "C:\\ASMA\\cancer.xlsx" # Use double backslashes in the file path

df = pd.read_excel(file_path)

label_encoders = {}

for column in df.columns:

if df[column].dtype == 'object': # Check if column is non-numeric

le = LabelEncoder()

df[column] = le.fit_transform(df[column])

label_encoders[column] = le

X = df.drop('Level', axis=1) # Drop the 'Level' column to use the rest as features

y = df['Level'] # Target variable

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = DecisionTreeClassifier(criterion='entropy', random_state=42)

clf.fit(X_train, y_train)

python

Copy code

y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)

def calculate_entropy(y):
MACHINE LEARNING LAB

class_counts = np.bincount(y)

probabilities = class_counts /

len(y)

return -np.sum([p * math.log2(p) for p in probabilities if p >

0]) def calculate_information_gain(X_column, y, threshold):

parent_entropy =

calculate_entropy(y) left_split =

y[X_column <= threshold] right_split

= y[X_column > threshold] n = len(y)

n_left, n_right = len(left_split),

len(right_split) if n_left == 0 or n_right == 0:

return 0

weighted_avg_entropy = (n_left / n) * calculate_entropy(left_split) + (n_right / n) *

calculate_entropy(right_split)

return parent_entropy - weighted_avg_entropy

first_feature_index = clf.tree_.feature[0]

first_threshold = clf.tree_.threshold[0]

first_feature_name = X.columns[first_feature_index]

X_column = X_train.iloc[:, first_feature_index]

first_split_ig = calculate_information_gain(X_column, y_train, first_threshold)

first_split_entropy = calculate_entropy(y_train)

print(f"Model Accuracy: {accuracy}")

print(f"Entropy of the first node: {first_split_entropy}")

print(f"Information Gain of the first split on {first_feature_name} at threshold {first_threshold}:

{first_split_ig}")

python

Copy code
MACHINE LEARNING LAB

plt.figure(figsize=(15, 10))
MACHINE LEARNING LAB

plot_tree(clf, filled=True, feature_names=X.columns, class_names=np.unique(y).astype(str),

rounded=True)

plt.title("Decision Tree Visualization")

plt.show()

print("\nClassification Report:\n", classification_report(y_test, y_pred))
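The entropy computed above can be checked by hand: for a perfectly balanced binary split it is exactly 1 bit, and for a pure node it is 0. A stdlib-only version of the same formula (the function name is ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H = -sum(p * log2(p)) over the class distribution."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(entropy([0, 0, 1, 1]))  # 1.0 (balanced -> maximum entropy for 2 classes)
print(entropy([0, 0, 0, 0]))  # -0.0 (pure node)
```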

OUTPUT:

VIVA QUESTIONS:

1. Practical decision tree learning algorithms are based on heuristics?

2. Given the entropy for a split, e_split = 0.39, and the entropy before the split, e_before, what is the
information gain for the split?

3. Information gain and gini index are the same.?

4. What is a decision tree algorithm used for?

5. Which algorithm is commonly used to construct decision trees?

6. Which attribute selection measure is used in the id3 algorithm?

7. What is the goal of a decision tree algorithm during training?

8. Which algorithm can handle missing values in decision trees?

9. What is pruning?

10. What is the formula for the Gini index?



EXPERIMENT – 6

AIM: IMPLEMENT K-NEAREST NEIGHBORS CLASSIFICATION USING PYTHON


Step 1: Load the data.
Step 2: Initialize the value of k.
Step 3: To get the predicted class, iterate from 1 to the total number of training data points:
 Calculate the distance between the test data and each row of training data. Here we use
Euclidean distance as our distance metric, since it is the most popular method. Other metrics
that can be used are Chebyshev, cosine, etc.
 Sort the calculated distances in ascending order based on distance values.
 Get the top k rows from the sorted array.
 Get the labels of the selected k entries.
 If regression, return the mean of the k labels.
 If classification, return the mode of the k labels (the most frequent class).
Step 4: End.

SOURCE CODE

import numpy as np
from sklearn import datasets
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

# Load the iris dataset
iris = datasets.load_iris()
data = iris.data
labels = iris.target

# Print selected samples
for i in [0, 79, 99, 101]:
    print(f"index: {i:3}, features: {data[i]}, label: {labels[i]}")

# Shuffle the data
np.random.seed(42)
indices = np.random.permutation(len(data))

n_training_samples = 12

# Create training and test sets
learn_data = data[indices[:-n_training_samples]]
learn_labels = labels[indices[:-n_training_samples]]

test_data = data[indices[-n_training_samples:]]
test_labels = labels[indices[-n_training_samples:]]

# Print learn and test sets
print("The first samples of our learn set:")
print(f"{'index':7s}{'data':50s}{'label':3s}")
for i in range(5):
    print(f"{i:4d} {learn_data[i]} {learn_labels[i]:3}")

print("The first samples of our test set:")
print(f"{'index':7s}{'data':50s}{'label':3s}")
for i in range(5):
    print(f"{i:4d} {test_data[i]} {test_labels[i]:3}")

# Visualizing the data of our learn set
colours = ("r", "g", "y")
X = []

for iclass in range(3):
    X.append([[], [], []])
    for i in range(len(learn_data)):
        if learn_labels[i] == iclass:
            X[iclass][0].append(learn_data[i][0])
            X[iclass][1].append(learn_data[i][1])
            X[iclass][2].append(sum(learn_data[i][2:]))

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

for iclass in range(3):
    ax.scatter(X[iclass][0], X[iclass][1], X[iclass][2], c=colours[iclass])

plt.show()

# Euclidean distance function
def distance(instance1, instance2):
    """Calculates the Euclidean distance between two instances."""
    return np.linalg.norm(np.subtract(instance1, instance2))

# Function to find the k nearest neighbors
def get_neighbors(training_set, labels, test_instance, k, distance):
    """
    Calculates a list of the k nearest neighbors of 'test_instance'.
    Returns a list of k 3-tuples, each consisting of (instance, dist, label).
    """
    distances = []
    for index in range(len(training_set)):
        dist = distance(test_instance, training_set[index])
        distances.append((training_set[index], dist, labels[index]))
    distances.sort(key=lambda x: x[1])
    neighbors = distances[:k]
    return neighbors

# Testing the neighbors function on the test set
for i in range(5):
    neighbors = get_neighbors(learn_data, learn_labels, test_data[i], 3, distance=distance)
    print("Index: ", i, '\n',
          "Testset Data: ", test_data[i], '\n',
          "Testset Label: ", test_labels[i], '\n',
          "Neighbors: ", neighbors, '\n')
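Note that get_neighbors returns the neighbors but does not vote. A sketch of the missing classification step, using a majority vote over the neighbor labels (the vote helper and the tiny synthetic dataset are ours, used so the snippet runs without the iris split above):

```python
import numpy as np
from collections import Counter

def distance(a, b):
    # Euclidean distance between two instances
    return np.linalg.norm(np.subtract(a, b))

def get_neighbors(training_set, labels, test_instance, k):
    # Sort (dist, label) pairs by distance and keep the k closest
    dists = sorted(
        ((distance(test_instance, training_set[i]), labels[i])
         for i in range(len(training_set))),
        key=lambda t: t[0],
    )
    return dists[:k]

def vote(neighbors):
    """Majority vote over (dist, label) neighbor tuples."""
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

train = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [1.1, 0.9]]
labels = [0, 0, 1, 1]
neighbors = get_neighbors(train, labels, [0.05, 0.0], k=3)
print(vote(neighbors))  # 0 (two of the three nearest neighbors have label 0)
```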

OUTPUT:

VIVA QUESTIONS
1. How does KNN deal with missing data?

2. How does KNN handle imbalanced datasets?

3. What are some common ways to improve KNN performance?

4. How do you implement KNN in Python using sklearn?

5. What are the computational challenges in KNN?

6. What are the pros and cons of KNN?

7. How does distance metric affect KNN?

8. What is the role of the parameter ‘K’ in KNN?

9. How does KNN handle classification and regression tasks?

10. What is KNN?



EXPERIMENT -7

AIM: GIVEN THE FOLLOWING DATA, WHICH SPECIFY CLASSIFICATIONS FOR NINE
COMBINATIONS OF VAR1 AND VAR2, PREDICT A CLASSIFICATION FOR A CASE WHERE
VAR1=0.906 AND VAR2=0.606, USING THE RESULT OF K-MEANS CLUSTERING WITH 3
MEANS (I.E., 3 CENTROIDS).
VAR1 VAR2 CLASS
1.713 1.586 0
0.180 1.786 1
0.353 1.240 1
0.940 1.566 0
1.486 0.759 1
1.266 1.106 0
1.540 0.419 1
0.459 1.799 1
0.773 0.186 1
SOURCE CODE:
import numpy as np
from sklearn.cluster import KMeans

# Data with VAR1, VAR2, and CLASS


data = np.array([
[1.713, 1.586, 0],
[0.180, 1.786, 1],
[0.353, 1.240, 1],
[0.940, 1.566, 0],
[1.486, 0.759, 1],
[1.266, 1.106, 0],
[1.540, 0.419, 1],
[0.459, 1.799, 1],
[0.773, 0.186, 1]
])

# Separate features (VAR1, VAR2) and labels (CLASS)


X = data[:, :2]
y = data[:, 2]

# Define KMeans model with 3 clusters


kmeans = KMeans(n_clusters=3, random_state=42)

# Fit the model to the data


kmeans.fit(X)

# New data point to classify


new_point = np.array([[0.906, 0.606]])

# Predict the cluster for the new data point


predicted_cluster = kmeans.predict(new_point)

print(f"The new data point belongs to cluster: {predicted_cluster[0]}")

# Map cluster to the majority class within each cluster


# For simplicity, we'll assign the most frequent class in each cluster
from collections import Counter

# Get the labels (clusters) for each point


cluster_labels = kmeans.labels_

# Create a mapping from cluster to majority class


cluster_to_class = {}

for cluster in np.unique(cluster_labels):


# Get the indices of points in the current cluster
indices = np.where(cluster_labels == cluster)
# Find the most common class for this cluster
common_class = Counter(y[indices]).most_common(1)[0][0]
cluster_to_class[cluster] = common_class

# Predict the class based on the cluster


predicted_class = cluster_to_class[predicted_cluster[0]]
print(f"The predicted class for VAR1=0.906 and VAR2=0.606 is: {predicted_class}")
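Once k-means has produced centroids, assigning a new point is just a nearest-centroid lookup. A stdlib sketch of that assignment step (the centroid coordinates below are hypothetical placeholders for illustration, not the values KMeans actually computes on this data):

```python
import math

# Hypothetical centroids, keyed by cluster id (illustrative values only)
centroids = {0: (1.5, 1.0), 1: (0.4, 1.6), 2: (0.9, 0.4)}

def nearest_cluster(point, centroids):
    """Return the id of the centroid closest to the point (Euclidean distance)."""
    return min(centroids, key=lambda c: math.dist(point, centroids[c]))

print(nearest_cluster((0.906, 0.606), centroids))  # 2 (closest of the three placeholders)
```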

OUTPUT:

VIVA QUESTIONS

1. List the classification algorithms.

2. Classification is which type of learning?
3. Which of the following statements is false about ensemble learning?
4. Which of the following statements is true about stochastic gradient descent?
5. Does a decision tree use the inductive learning approach?
6. Which of the following statements is not true about boosting?
7. K-nearest neighbors (KNN) is classified as what type of machine learning algorithm?
8. What is machine learning?
9. What is the key benefit of using deep learning for tasks like recognizing images?
10. What is classification?

EXPERIMENT - 8

AIM: THE FOLLOWING TRAINING EXAMPLES MAP DESCRIPTIONS OF INDIVIDUALS
ONTO HIGH, MEDIUM AND LOW CREDITWORTHINESS.
medium skiing design single twenties no -> highRisk
high golf trading married forties yes -> lowRisk
low speedway transport married thirties yes -> medRisk
medium football banking single thirties yes -> lowRisk
high flying media married fifties yes -> highRisk
low football security single twenties no -> medRisk
medium golf media single thirties yes -> medRisk
medium golf transport married forties yes -> lowRisk
high skiing banking single thirties yes -> highRisk
low golf unemployed married forties yes -> highRisk
Input attributes are (from left to right): income, recreation, job, status, age-group, home-owner. Find the
unconditional probability of 'golf' and the conditional probability of 'single' given 'medRisk' in this
dataset.

SOURCE CODE:
import numpy as np

# Define the dataset
data = np.array([
['medium', 'skiing', 'design', 'single', 'twenties', 'no', 'highRisk'],
['high', 'golf', 'trading', 'married', 'forties', 'yes', 'lowRisk'],
['low', 'speedway', 'transport', 'married', 'thirties', 'yes', 'medRisk'],
['medium', 'football', 'banking', 'single', 'thirties', 'yes', 'lowRisk'],
['high', 'flying', 'media', 'married', 'fifties', 'yes', 'highRisk'],
['low', 'football', 'security', 'single', 'twenties', 'no', 'medRisk'],
['medium', 'golf', 'media', 'single', 'thirties', 'yes', 'medRisk'],
['medium', 'golf', 'transport', 'married', 'forties', 'yes', 'lowRisk'],
['high', 'skiing', 'banking', 'single', 'thirties', 'yes', 'highRisk'],
['low', 'golf', 'unemployed', 'married', 'forties', 'yes', 'highRisk']
])

# Extract columns for recreation, status, and label



recreation_column = data[:, 1]
status_column = data[:, 3]
label_column = data[:, 6]

# Calculate the unconditional probability of 'golf'


total_samples = len(data)
golf_count = np.sum(recreation_column == 'golf')
unconditional_prob_golf = golf_count / total_samples

# Calculate the conditional probability of 'single' given 'medRisk'


medrisk_indices = np.where(label_column == 'medRisk')
single_given_medrisk_count = np.sum(status_column[medrisk_indices] == 'single')
medrisk_count = len(medrisk_indices[0])
conditional_prob_single_given_medrisk = single_given_medrisk_count / medrisk_count

# Output the results


print(f"Unconditional probability of 'golf': {unconditional_prob_golf:.2f}")
print(f"Conditional probability of 'single' given 'medRisk': {conditional_prob_single_given_medrisk:.2f}")
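The same counts can be cross-checked with pandas, which makes it easy to extend the query to other attributes (a sketch; the column names are chosen here for illustration, not given in the manual):

```python
import pandas as pd

rows = [
    ("medium", "skiing", "design", "single", "twenties", "no", "highRisk"),
    ("high", "golf", "trading", "married", "forties", "yes", "lowRisk"),
    ("low", "speedway", "transport", "married", "thirties", "yes", "medRisk"),
    ("medium", "football", "banking", "single", "thirties", "yes", "lowRisk"),
    ("high", "flying", "media", "married", "fifties", "yes", "highRisk"),
    ("low", "football", "security", "single", "twenties", "no", "medRisk"),
    ("medium", "golf", "media", "single", "thirties", "yes", "medRisk"),
    ("medium", "golf", "transport", "married", "forties", "yes", "lowRisk"),
    ("high", "skiing", "banking", "single", "thirties", "yes", "highRisk"),
    ("low", "golf", "unemployed", "married", "forties", "yes", "highRisk"),
]
cols = ["income", "recreation", "job", "status", "age_group", "home_owner", "risk"]
df = pd.DataFrame(rows, columns=cols)

# P(golf): fraction of all rows whose recreation is golf
p_golf = (df["recreation"] == "golf").mean()

# P(single | medRisk): fraction of medRisk rows whose status is single
med = df[df["risk"] == "medRisk"]
p_single_given_med = (med["status"] == "single").mean()

print(p_golf, p_single_given_med)  # 0.4 and 2/3
```

Golf appears in 4 of the 10 rows, and 2 of the 3 medRisk rows have status 'single', so the NumPy version above should print the same values.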

OUTPUT:

VIVA QUESTIONS

1. What is Linear Regression?


2. In a simple linear regression, how many independent variables are there?
3. What is the primary goal of linear regression?
4. What is the equation of a simple linear regression line?
5. What is the difference between simple linear regression and multiple linear regression?
6. Which type of Programming does Python support?
7. Who developed Python Programming Language?
8. Is Python code compiled or interpreted?
9. All keywords in Python are in ________?
10. What will be the value of the following Python expression?

EXPERIMENT - 9

AIM: IMPLEMENT LINEAR REGRESSION USING PYTHON.


SOURCE CODE:
import numpy as np
import matplotlib.pyplot as plt

# Sample data
X = np.array([1, 2, 4, 3, 5])
y = np.array([1, 3, 3, 2, 5])

# Mean of X and y
X_mean = np.mean(X)
y_mean = np.mean(y)

# Calculate coefficients (slope m and intercept b) for the line y = mx + b


numerator = np.sum((X - X_mean) * (y - y_mean))
denominator = np.sum((X - X_mean) ** 2)

m = numerator / denominator  # slope
b = y_mean - m * X_mean      # intercept

# Regression line
y_pred = m * X + b

# Print coefficients
print(f"Slope (m): {m}")
print(f"Intercept (b): {b}")

# Plot the data points and the regression line
plt.scatter(X, y, color='blue', label='Data points')
plt.plot(X, y_pred, color='red', label='Regression line')
plt.xlabel('X')
plt.ylabel('y')

plt.legend()
plt.show()
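The closed-form slope and intercept can be cross-checked against scikit-learn's `LinearRegression` on the same five points (a sketch):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([1, 2, 4, 3, 5]).reshape(-1, 1)  # sklearn expects a 2-D feature matrix
y = np.array([1, 3, 3, 2, 5])

model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)  # should match m and b
```

For these points the least-squares fit is m = 0.8 and b = 0.4, so the printed coefficients should agree with the manual calculation.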
OUTPUT:

VIVA QUESTIONS
1. What Is Regression?

2. What are the different types of Logistic Regression?

3. How do we handle categorical variables in Logistic Regression?

4. What are the assumptions made in Logistic Regression?

5. Can we solve multiclass classification problems using Logistic Regression? If yes, then how?

6. The correlation for the values of two variables moving in the same direction is

7. Who introduced the term ‘regression’?

8. The slope of the regression line of Y on X is also referred to as the:

9. What is the other term used for dependent variables?

10. What is the significance of hypothesis testing?



EXPERIMENT - 10

AIM: IMPLEMENT NAÏVE BAYES THEOREM TO CLASSIFY THE ENGLISH TEXT


SOURCE CODE
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report

# Sample text data (text, label)


data = [
    ("I love this sandwich", "positive"),
    ("This is an amazing place", "positive"),
    ("I feel very good about these beers", "positive"),
    ("This is my best work", "positive"),
    ("I do not like this restaurant", "negative"),
    ("I am tired of this stuff", "negative"),
    ("I can't deal with this", "negative"),
    ("He is my sworn enemy", "negative"),
    ("My boss is horrible", "negative")
]

# Separate the text and the labels


texts, labels = zip(*data)

# Convert the text data to a bag-of-words (BOW) model using CountVectorizer


vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Split the data into training and testing sets


X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3, random_state=42)

# Create and train the Naive Bayes model
model = MultinomialNB()

model.fit(X_train, y_train)

# Predict the labels for the test data


y_pred = model.predict(X_test)

# Evaluate the model


print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print("Classification Report:")
print(classification_report(y_test, y_pred))

# Test on new text data


new_texts = ["I enjoy this new movie", "I hate this weather"]
new_X = vectorizer.transform(new_texts)
predictions = model.predict(new_X)

for text, prediction in zip(new_texts, predictions):


print(f"Text: {text} => Predicted Label: {prediction}")
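The vectorizer and classifier can also be bundled into a single scikit-learn `Pipeline`, so new raw text can be classified without calling the vectorizer separately (a sketch on a smaller hypothetical subset of the data):

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = [
    "I love this sandwich", "This is an amazing place",
    "I do not like this restaurant", "My boss is horrible",
]
labels = ["positive", "positive", "negative", "negative"]

# The pipeline applies CountVectorizer, then MultinomialNB, as one object
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(texts, labels)

print(clf.predict(["I love this place"])[0])
```

Keeping vectorization inside the pipeline avoids the common mistake of fitting the vectorizer on test data.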

OUTPUT:

VIVA QUESTIONS
1. Are Naïve Bayes classifier algorithms mainly used in text classification?

2. What is the formula for Bayes’ theorem, where (A & B) and (H & E) are events and P(B), P(H) & P(E) ≠ 0?

3. What are the assumptions of the Naïve Bayesian classifier?

4. Is the assumption of the Naïve Bayes algorithm a limitation to use it?

5. There are two boxes. The first box contains 3 white and 2 red balls, whereas the second contains 5 white and 4 red balls. A ball is drawn at random from one of the two boxes and is found to be white. Find the probability that the ball was drawn from the second box.

6. The main objective of a classification algorithm in supervised learning is to?

7. The term "supervised" in supervised learning refers to:

8. What is the maximum possible length of an identifier?

9. In which year was the Python language developed?

10. What do we use to define a block of code in Python language?



EXPERIMENT – 11

AIM: IMPLEMENT AN ALGORITHM TO DEMONSTRATE THE SIGNIFICANCE OF GENETIC ALGORITHMS

SOURCE CODE:
import random
import numpy as np

# Parameters
population_size = 10
chromosome_length = 6 # Binary representation with 6 bits
max_generations = 50
mutation_rate = 0.01

# Fitness function: maximize f(x) = x^2
def fitness_function(individual):
    # Convert the binary string to its decimal value
    x = int("".join(map(str, individual)), 2)
    return x ** 2

# Create an individual (random binary string)
def create_individual():
    return [random.randint(0, 1) for _ in range(chromosome_length)]

# Create an initial population
def create_population():
    return [create_individual() for _ in range(population_size)]

# Selection function (roulette wheel selection)


def selection(population, fitness_scores):
total_fitness = sum(fitness_scores)
probabilities = [score / total_fitness for score in fitness_scores]
selected_index = np.random.choice(len(population), p=probabilities)
return population[selected_index]

# Crossover (single-point crossover)


def crossover(parent1, parent2):
crossover_point = random.randint(1, chromosome_length - 1)
child1 = parent1[:crossover_point] + parent2[crossover_point:]
child2 = parent2[:crossover_point] + parent1[crossover_point:]
return child1, child2

# Mutation
def mutate(individual):
for i in range(len(individual)):
if random.random() < mutation_rate:
individual[i] = 1 - individual[i] # Flip bit
return individual

# Main Genetic Algorithm loop


def genetic_algorithm():
# Step 1: Initialize population
population = create_population()

for generation in range(max_generations):


# Step 2: Evaluate fitness for the population
fitness_scores = [fitness_function(individual) for individual in population]

# Display best solution in current generation


best_fitness = max(fitness_scores)
best_individual = population[fitness_scores.index(best_fitness)]
print(f"Generation {generation + 1}: Best Fitness = {best_fitness}, Best Individual =
{best_individual}")

# Step 3: Selection, Crossover, and Mutation to create new population


new_population = []
for _ in range(population_size // 2): # Create pairs of offspring
# Select two parents based on fitness
parent1 = selection(population, fitness_scores)
parent2 = selection(population, fitness_scores)

# Apply crossover
child1, child2 = crossover(parent1, parent2)

# Apply mutation
child1 = mutate(child1)
child2 = mutate(child2)

# Add children to the new population


new_population.append(child1)
new_population.append(child2)

# Step 4: Replace the old population with the new population


population = new_population

# Step 5: Return the best solution after all generations


fitness_scores = [fitness_function(individual) for individual in population]
best_fitness = max(fitness_scores)
best_individual = population[fitness_scores.index(best_fitness)]
return best_individual, best_fitness

# Run the genetic algorithm


best_individual, best_fitness = genetic_algorithm()

# Display the final result


x = int("".join(map(str, best_individual)), 2)
print(f"\nBest individual after {max_generations} generations: {best_individual}")
print(f"Best solution x = {x}, Fitness (x^2) = {best_fitness}")
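Because the chromosome is only 6 bits, the search space is x ∈ [0, 63], so the global optimum the GA should approach is known exactly. A brute-force check (a sketch) confirms it:

```python
# Enumerate all 6-bit values and pick the one maximizing f(x) = x^2
best_x = max(range(2 ** 6), key=lambda x: x ** 2)
print(best_x, best_x ** 2)  # 63 3969
```

Comparing the GA's final fitness against 3969 shows how close the evolved population came to the true optimum.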

OUTPUT:

VIVA QUESTIONS

1. What is a Genetic Algorithm?

2. The algorithm operates by iteratively updating a pool of hypotheses, called the

3. When would the genetic algorithm terminate?

4. GA techniques are inspired by biology.

5. is any predicate (or its negation) applied to any set of terms.

6. What are the requirements for the Learn-One-Rule method?

7. Which type of feedback is used by RL?

8. 0*10 represents the set of bit strings that includes exactly

9. Search through the hypothesis space cannot be characterized. Why?

10. What does ILP stand for?



EXPERIMENT -12

AIM: IMPLEMENT THE FINITE WORDS CLASSIFICATION SYSTEM USING THE BACK-PROPAGATION ALGORITHM

SOURCE CODE:
import numpy as np

# Sigmoid activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Derivative of the sigmoid (takes the sigmoid output, used for backpropagation)
def sigmoid_derivative(x):
    return x * (1 - x)

# Word dataset with feature vectors and labels (manually created embeddings)
# Features: [length of word, number of vowels, count of the letter 'e']
word_data = np.array([
[5, 2, 1], # apple (fruit)
[6, 3, 1], # orange (fruit)
[3, 1, 0], # dog (animal)
[5, 2, 0], # tiger (animal)
[6, 2, 1], # table (object)
[4, 1, 0], # book (object)
])

# Labels (fruit = [1, 0, 0], animal = [0, 1, 0], object = [0, 0, 1])
labels = np.array([
[1, 0, 0], # fruit
[1, 0, 0], # fruit
[0, 1, 0], # animal
[0, 1, 0], # animal
[0, 0, 1], # object
[0, 0, 1], # object

])

# Initialize parameters
input_size = word_data.shape[1] # Number of input features
hidden_size = 4 # Number of neurons in the hidden layer
output_size = labels.shape[1] # Number of output categories (3 in this case)
learning_rate = 0.5
epochs = 10000 # Number of iterations for training

# Initialize weights and biases


np.random.seed(42)
weights_input_hidden = np.random.uniform(-1, 1, (input_size, hidden_size))
weights_hidden_output = np.random.uniform(-1, 1, (hidden_size, output_size))

bias_hidden = np.random.uniform(-1, 1, (1, hidden_size))


bias_output = np.random.uniform(-1, 1, (1, output_size))

# Training the neural network using backpropagation


for epoch in range(epochs):
# Feedforward phase
hidden_input = np.dot(word_data, weights_input_hidden) + bias_hidden
hidden_output = sigmoid(hidden_input)

final_input = np.dot(hidden_output, weights_hidden_output) + bias_output


final_output = sigmoid(final_input)

# Calculate error
error = labels - final_output

# Backpropagation phase
d_output = error * sigmoid_derivative(final_output)

error_hidden_layer = d_output.dot(weights_hidden_output.T)
d_hidden_layer = error_hidden_layer * sigmoid_derivative(hidden_output)

# Update weights and biases



weights_hidden_output += hidden_output.T.dot(d_output) * learning_rate


bias_output += np.sum(d_output, axis=0, keepdims=True) * learning_rate

weights_input_hidden += word_data.T.dot(d_hidden_layer) * learning_rate


bias_hidden += np.sum(d_hidden_layer, axis=0, keepdims=True) * learning_rate

# Print error at intervals


if epoch % 1000 == 0:
loss = np.mean(np.abs(error))
print(f'Epoch {epoch}, Loss: {loss}')

# Testing the trained neural network


def classify(word_vector):
hidden_layer_activation = np.dot(word_vector, weights_input_hidden) + bias_hidden
hidden_layer_output = sigmoid(hidden_layer_activation)

final_layer_activation = np.dot(hidden_layer_output, weights_hidden_output) + bias_output


final_output = sigmoid(final_layer_activation)

return final_output

# Test on new word data


new_word = np.array([4, 1, 0]) # book-like input
classification = classify(new_word)
print("\nClassification (Output Probabilities):", classification)

# Convert probabilities to class


category = np.argmax(classification)
categories = ["fruit", "animal", "object"]
print(f"Classified as: {categories[category]}")
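A subtle point in the code above is that `sigmoid_derivative` expects the sigmoid *output* (the layer activation), not the raw input. A quick finite-difference check (a sketch) confirms the identity σ'(x) = σ(x)(1 − σ(x)):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = 0.7
eps = 1e-6

# Numerical derivative via central differences
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

# Analytic derivative: apply sigmoid_derivative to the sigmoid OUTPUT
s = sigmoid(x)
analytic = s * (1 - s)

print(abs(numeric - analytic))  # should be very close to zero
```

This is why the training loop passes `final_output` and `hidden_output` (activations), not the pre-activation sums, into `sigmoid_derivative`.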

OUTPUT:

VIVA QUESTIONS

1. What is backpropagation?

2. How does backpropagation work?

3. What is the difference between a Perceptron and Logistic Regression?

4. Can we have the same bias for all neurons of a hidden layer?

5. What if we do not use any activation function(s) in a neural network?

6. In a neural network, what if all the weights are initialized with the same value?

7. What is the role of weights and bias in a neural network?

8. How can learning process be stopped in backpropagation rule?

9. Does backpropagation learning is based on gradient descent along error surface?

10. What is meant by “generalized” in the statement “backpropagation is a generalized delta rule”?
