
LAB MANUAL

MACHINE LEARNING

B Tech – III Year II Semester CSE R18

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Approved by AICTE, New Delhi | Affiliated to JNTUH, Hyderabad | Accredited by NAAC "A" Grade |
Departments: CSE, ECE & Mech are Accredited by NBA | Hyderabad | PIN: 500068

III-B Tech – II Semester [Branch: CSE]


MACHINE LEARNING LAB MANUAL

SREYAS INSTITUTE OF ENGINEERING & TECHNOLOGY


Beside Indu Aranya, Nagole, Hyderabad.

SREYAS INSTITUTE OF ENGINEERING AND TECHNOLOGY


(Approved by AICTE, Affiliated to JNTUH)
G.S.I. Bandlaguda, Nagole, Hyderabad - 500068

CERTIFICATE

LAB NAME : MACHINE LEARNING LAB MANUAL

BRANCH : CSE

YEAR & SEM : III – II

REGULATION : R18

Lab Coordinator HoD

VISION & MISSION OF INSTITUTION


VISION
To be a centre of excellence in technical education, empowering young talent through quality education and innovative engineering for the well-being of society.
MISSION
1. Provide quality education with innovative methodology and intellectual human capital.
2. Provide a conducive environment for research and development activities.
3. Inculcate a holistic approach towards nature, society and human ethics with a lifelong learning attitude.


VISION & MISSION OF DEPARTMENT

Vision

To excel in computer science engineering education with best learning practices, research and professional ethics.

Mission

1. To offer technical education with innovative teaching, good infrastructure and qualified human resources.
2. To accomplish a process to advance knowledge in the subject and promote an academic and research environment.
3. To impart moral and ethical values and interpersonal skills to the students.

Program Educational Objectives

Computer Science & Engineering (CSE) is one of the most prominent technical
fields in Engineering. The curriculum offers courses with various areas of
emphasis on theory, design and experimental work. Subject matter ranges from
basics of Computers & Programming Languages to Compiler Design and Cloud
Computing. It maintains strong tie-ups with industry and is dedicated to preparing
students for a career in Web Technologies, Object Oriented Analysis and Design,
Networking & Security, Databases, Data Mining & Data Warehousing and
Software Testing.

PROGRAM OUTCOMES (POs)


Engineering Graduates will be able to:


1. Engineering Knowledge: Apply the knowledge of mathematics, science, engineering


fundamentals, and an engineering specialization to the solution of complex engineering
problems.
2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of
mathematics, natural sciences, and engineering sciences.
3. Design/development of solutions: Design solutions for complex engineering problems and
design system components or processes that meet the specified needs with appropriate
consideration for the public health and safety, and the cultural, societal, and environmental
considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research
methods including design of experiments, analysis and interpretation of data, and synthesis
of the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities
with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant
to the professional engineering practice.
7. Environment and sustainability: Understand the impact of the professional engineering
solutions in societal and environmental contexts, and demonstrate the knowledge of, and
need for sustainable development.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and
norms of the engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader
in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and
write effective reports and design documentation, make effective presentations, and give and
receive clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage
in independent and life-long learning in the broadest context of technological change.


PROGRAM SPECIFIC OUTCOMES (PSOs)

13. Proficiency in contemporary skills for the development of innovative apps and firmware products.
14. Capability to participate in the construction of software systems of varying complexity.

COURSE OUTCOMES (COs)

CO1. Understand the mathematical and statistical perspectives of machine learning algorithms through Python programming.
CO2. Understand the concepts of conditional probability and Bayes' theorem and implement the Naive Bayes classifier.
CO3. Acquire the basic concepts of the Back Propagation algorithm and implement the same.
CO4. Design and evaluate supervised and unsupervised models through Python built-in functions.
CO5. Evaluate machine learning models pre-processed through various feature engineering algorithms using Python programming.
CO6. Apply common Machine Learning algorithms on real-world data.

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


III Year B.Tech CSE- II Sem

MACHINE LEARNING LAB INSTRUCTIONS TO THE STUDENTS


Things to Do:

1) Students should come in formal dress.
2) Students must wear their ID cards.
3) They have to be in the lab 10 minutes before the session starts.
4) They should bring their observation book and record.
5) The observation book should be corrected by the concerned faculty.
6) The programs corrected by the faculty have to be copied into the record.
7) They should maintain silence in the lab.

Things not to do:

1) Students should not bring any electronic gadgets into the lab.
2) They should not come late.
3) They should not create any disturbance to others.

HOD Lab Incharge

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD

R18 B.Tech. CSE III JNTU Hyderabad

MACHINE LEARNING LAB


B.Tech. III Year II Sem. LTPC
0 0 3 1.5
Course Objective: The objective of this lab is to get an overview of the various machine learning techniques and to be able to demonstrate them using Python.
Machine Learning Lab Manual

Course Outcomes: After the completion of the course the student will be able to:
understand the complexity of Machine Learning algorithms and their limitations;
understand modern notions in data analysis-oriented computing;
be capable of confidently applying common Machine Learning algorithms in practice and implementing their own;
be capable of performing experiments in Machine Learning using real-world data.

List of Experiments
1. The probability that it is Friday and that a student is absent is 3%. Since there are 5 school days in a week, the probability that it is Friday is 20%. What is the probability that a student is absent given that today is Friday? Apply Bayes' rule in Python to get the result. (Ans: 15%)

2. Extract the data from database using python

3. Implement k-nearest neighbors classification using python

4. Given the following data, which specify classifications for nine combinations of VAR1 and VAR2, predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3 centroids):
VAR1 VAR2 CLASS
1.713 1.586 0
0.180 1.786 1
0.353 1.240 1
0.940 1.566 0
1.486 0.759 1
1.266 1.106 0
1.540 0.419 1
0.459 1.799 1
0.773 0.186 1
5. The following training examples map descriptions of individuals onto high, medium and low
credit-worthiness.
medium skiing design single twenties no ->highRisk
high golf trading married forties yes ->lowRisk
low speedway transport married thirties yes ->medRisk
medium football banking single thirties yes ->lowRisk
high flying media married fifties yes ->highRisk
low football security single twenties no ->medRisk
medium golf media single thirties yes ->medRisk
high skiing banking single thirties yes ->highRisk
low golf unemployed married forties yes ->highRisk
Input attributes are (from left to right) income, recreation, job, status, age-group, home-owner. Find the unconditional probability of 'golf' and the conditional probability of 'single' given 'medRisk' in the dataset.

6. Implement linear regression using python.


7. Implement the Naïve Bayes theorem to classify English text.
8. Implement an algorithm to demonstrate the significance of the genetic algorithm.
9. Implement a finite words classification system using the Back-propagation algorithm.
CS604PC Course Objectives
A To introduce students to the basic concepts and techniques of Machine Learning.
B To improve their skills in using Python programming libraries like scikit-learn and NumPy.
C To demonstrate various machine learning techniques.
D To provide hands-on experience on extraction of databases.
E To become familiar with regression methods, classification methods and clustering methods.


F To develop skills in using recent machine learning software for solving practical problems.

CO Course Outcomes
CO1 Compare Machine Learning algorithms based on their advantages and limitations and use the best one according to the situation.
CO2 Interpret and understand modern notions in data analysis-oriented computing.
CO3 Apply conditional probability using Bayes' theorem.
CO4 Evaluate decision tree algorithms using real-world data.
CO5 Apply common Machine Learning algorithms in practice and implement them on their own confidently.
CO6 Experiment with real-world data using Machine Learning algorithms.

COs PROGRAM OUTCOMES (POs) PSOs

1 2 3 4 5 6 7 8 9 10 11 12 i ii
CO1 3 3 2 3 3 2 1 2 2 0 1 3 3 3
CO2 2 3 2 3 3 2 1 2 2 0 1 3 2 2
CO3 3 2 3 3 2 1 2 2 0 1 3 2 2 3
CO4 3 2 3 3 2 1 2 2 0 1 3 2 2 2
CO5 3 2 3 3 2 1 2 2 0 1 3 2 3 2
CO6 3 2 3 3 2 1 2 2 0 1 3 2 2 3
Avg 2.83 2.3 2.6 3 2.3 1.3 1.6 2 0.6 0.6 2.3 2.3 2.3 2.5


MACHINE LEARNING:
Introduction:
Machine learning is a subfield of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. Machine learning algorithms detect patterns in data and learn from them in order to make their own predictions. In traditional programming, a computer engineer writes a series of directions that instruct a computer how to transform input data into a desired output. Machine learning, on the other hand, is an automated process that enables machines to solve problems with little or no human input, and to take actions based on past observations.
Machine learning can be put to work on massive amounts of data and can often perform such tasks faster and more accurately than humans. It helps us to save time and money on tasks and analyses, like solving customer pain points to improve customer satisfaction, support ticket automation, and data mining from internal sources and all over the internet.
The four most common types of machine learning are:
I. Supervised Learning:


Supervised learning algorithms make predictions based on labeled training data. Each training sample includes an input and a desired output. A supervised learning algorithm analyzes this sample data and makes an inference.

Data is labeled to tell the machine what patterns (similar words and images, data categories, etc.) it should look for and what connections it should recognize.

Fig: Working of Supervised Learning with Example

Here we have a dataset of different types of shapes, which includes square, rectangle, triangle and polygon. The first step is to train the model on each shape:

o If the given shape has four sides, and all the sides are equal, then it will be labelled as a square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides, then it will be labelled as a hexagon.

Now, after training, we test our model using the test set, and the task of the model is to identify the shape. The machine is already trained on all types of shapes, and when it finds a new shape, it classifies the shape on the basis of its number of sides and predicts the output.
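As a small illustration of this train-then-predict workflow, a minimal scikit-learn sketch on a made-up numeric encoding of the shape example (number of sides, whether all sides are equal) might look like the following; the data and feature encoding here are assumptions for illustration only.

from sklearn.neighbors import KNeighborsClassifier

# labeled training data: [number_of_sides, all_sides_equal (1/0)] -> shape label
X_train = [[4, 1], [4, 0], [3, 0], [6, 1]]
y_train = ["square", "rectangle", "triangle", "hexagon"]

model = KNeighborsClassifier(n_neighbors=1)
model.fit(X_train, y_train)            # learn from the labeled examples

# a new, unseen shape description is classified from what was learned
print(model.predict([[4, 1]]))         # expected: ['square']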

Classification in supervised machine learning

Supervised machine learning algorithms can be broadly classified into regression and classification algorithms.


1. Regression:

Regression algorithms are used when there is a relationship between the input variable and the output variable. They are used for the prediction of continuous variables, such as weather forecasting, market trends, etc. Below are some popular regression algorithms that come under supervised learning:

o Linear Regression

o Regression Trees

o Non-Linear Regression

o Bayesian Linear Regression

o Polynomial Regression

2. Classification:

Classification algorithms are used when the output variable is categorical, which means the output falls into discrete classes such as Yes-No, Male-Female, True-False, etc. Popular classification algorithms include:

o Random Forest

o Decision Trees

o Logistic Regression

o Support vector Machines

II. Unsupervised Learning:


Unsupervised learning is a type of machine learning in which models are trained using an unlabeled dataset and are allowed to act on that data without any supervision. Unsupervised learning cannot be directly applied to a regression or classification problem because, unlike supervised learning, we have the input data but no corresponding output data. The goal of unsupervised learning is to find the underlying structure of the dataset, group the data according to similarities, and represent the dataset in a compressed format.

Fig: Working of Unsupervised Learning with Example

Here, unlabeled input data is considered, which means it is not categorized and corresponding outputs are also not given. This unlabeled input data is fed to the machine learning model in order to train it. First, the model interprets the raw data to find the hidden patterns in the data, and then a suitable algorithm such as k-means clustering is applied.

Once the suitable algorithm is applied, it divides the data objects into groups according to the similarities and differences between the objects.

Classification in Unsupervised machine learning:

Unsupervised machine learning algorithms can be broadly classified into clustering and association algorithms.

1. Clustering: Clustering is a method of grouping objects into clusters such that objects with the most similarities remain in a group and have few or no similarities with the objects of another group. Cluster analysis finds the commonalities between the data objects and categorizes them as per the presence and absence of those commonalities.


2. Association: An association rule is an unsupervised learning method which is used for finding relationships between variables in a large database. It determines the sets of items that occur together in the dataset. Association rules make marketing strategy more effective: for example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical application of association rules is Market Basket Analysis.
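As a small illustration, the basic quantities behind such rules (support of an itemset and confidence of a rule) can be computed directly; the transactions below are made up for this sketch.

# hypothetical shopping baskets for illustrating support and confidence
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
]

n = len(transactions)
support_bread = sum("bread" in t for t in transactions) / n
support_both = sum({"bread", "butter"} <= t for t in transactions) / n
# confidence of the rule bread -> butter
confidence = support_both / support_bread
print("support(bread & butter) =", support_both, " confidence(bread -> butter) =", round(confidence, 2))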

Below is the list of some popular unsupervised learning algorithms:

o K-means clustering

o KNN (k-nearest neighbors)

o Hierarchical clustering

o Anomaly detection

o Neural Networks

o Principal Component Analysis

o Independent Component Analysis

o Apriori algorithm

o Singular value decomposition

III. Semi-supervised Learning:


Semi-supervised learning is a type of machine learning that falls in between supervised
and unsupervised learning. It is a method that uses a small amount of labeled data and a large
amount of unlabeled data to train a model. The goal of semi-supervised learning is to learn a
function that can accurately predict the output variable based on the input variables, similar to
supervised learning. However, unlike supervised learning, the algorithm is trained on a dataset
that contains both labeled and unlabeled data.
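A minimal sketch of this idea using scikit-learn's SelfTrainingClassifier follows; the tiny one-feature dataset is made up for illustration, and unlabeled samples are marked with -1, as that class expects.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# a handful of labeled points plus several unlabeled ones (label -1)
X = np.array([[0.0], [0.2], [0.4], [2.0], [2.2], [2.4], [0.3], [2.1]])
y = np.array([0, 0, -1, 1, 1, -1, -1, -1])

model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y)                        # labeled points bootstrap pseudo-labels for the rest
print(model.predict([[0.1], [2.3]]))   # expected: [0 1]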

Semi-supervised learning is particularly useful when there is a large amount of unlabeled data available, but it's too expensive or difficult to label all of it. Some examples of semi-supervised learning applications include:

Text classification: In text classification, the goal is to classify a given text into one or more
predefined categories. Semi-supervised learning can be used to train a text classification model
using a small amount of labeled data and a large amount of unlabeled text data.


Image classification: In image classification, the goal is to classify a given image into one or
more predefined categories. Semi-supervised learning can be used to train an image
classification model using a small amount of labeled data and a large amount of unlabeled
image data.

Anomaly detection: In anomaly detection, the goal is to detect patterns or observations that
are unusual or different from the norm.

IV. Reinforcement Learning:

Reinforcement Learning (RL) is the science of decision making. It is about learning the
optimal behavior in an environment to obtain maximum reward. In RL, the data is accumulated
from machine learning systems that use a trial-and-error method. Data is not part of the input
that we would find in supervised or unsupervised machine learning.

Reinforcement learning uses algorithms that learn from outcomes and decide which
action to take next. After each action, the algorithm receives feedback that helps it determine
whether the choice it made was correct, neutral or incorrect. It is a good technique to use for
automated systems that have to make a lot of small decisions without human guidance.

Reinforcement learning is an autonomous, self-teaching system that essentially learns by trial and error. It performs actions with the aim of maximizing rewards; in other words, it learns by doing in order to achieve the best outcomes.
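A small illustrative sketch of this trial-and-error idea is tabular Q-learning on a made-up five-state corridor, where reaching the rightmost state earns a reward; the environment and parameters below are assumptions for illustration, not part of the lab programs.

import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = move left, 1 = move right
Q = np.zeros((n_states, n_actions))   # table of action values, learned from outcomes
alpha, gamma, epsilon = 0.5, 0.9, 0.2
rng = np.random.default_rng(0)

for episode in range(200):
    s = 0
    for _ in range(20):
        # epsilon-greedy choice: mostly exploit the current table, sometimes explore
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: adjust the estimate using the observed reward (trial and error)
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next
        if r == 1.0:
            break

print(Q.argmax(axis=1))   # learned policy: expected to prefer action 1 (right) in every state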

PROGRAM 1

1.a. The probability that it is Friday and that a student is absent is 3%. The probability that it is Friday is 20%. What is the probability that a student is absent given that today is Friday? Apply Bayes' rule in Python to get the result.

AIM: To find the probability that a student is absent given that today is Friday

THEORY:
Bayes' theorem gives the formula for determining conditional probability. Conditional probability is the likelihood of an outcome occurring, based on a previous outcome having occurred in similar circumstances. Bayes' theorem provides a way to revise existing predictions or theories, i.e., to update the probabilities given new or additional evidence.
Bayes' theorem relies on incorporating prior probability distributions in order to generate posterior probabilities. Prior probability, in Bayesian statistical inference, is the probability of an event occurring before new data is collected. Posterior probability is the revised probability of an event occurring after taking into consideration the new information.
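For reference, the standard statement of Bayes' rule for the quantities listed below is:

P(A|B) = P(A ∩ B) / P(B) = P(B|A) · P(A) / P(B)

In this experiment, A is the event that a student is absent and B is the event that today is Friday.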


where, P(A) = the probability of A occurring
P(B) = the probability of B occurring
P(A|B) = the probability of A given B
P(B|A) = the probability of B given A
P(A ∩ B) = the probability of both A and B occurring
FLOW CHART:

SOURCE CODE:
# pf is P(F), the probability that it is Friday
pf = 0.20
# paif is P(A ∩ F), the probability that it is Friday and a student is absent
paif = 0.03
# paf is P(A|F) = P(A ∩ F) / P(F), by the conditional probability form of Bayes' rule
paf = paif / pf
# convert to percentage
r = paf * 100
# print the result
print("The probability that a student is absent given that today is Friday =", int(r), "%")

Output: The probability that a student is absent given that today is Friday = 15 %


Result: Thus the program to find the probability that a student is absent given that today is Friday is executed and the output is verified.

1.b. 10% of the patients entering a clinic have liver disease. 5% of the patients are alcoholic. The probability that a patient is alcoholic given that they have liver disease is 7%. Find the probability that a patient has liver disease given that they are alcoholic.

AIM: To write a program to find the probability that a patient has liver disease given that they are alcoholic.

SOURCE CODE:

# pl is P(L), the probability of liver disease
pl = 0.1
# pa is P(A), the probability of being alcoholic
pa = 0.05
# pal is P(A|L), the probability of being alcoholic given liver disease
pal = 0.07
# pla is P(L|A) = P(A|L) * P(L) / P(A), by Bayes' rule
pla = (pal * pl) / pa
# convert to percentage
r = pla * 100
# print the result
print("The probability of liver disease given that the patient is alcoholic =", int(r), "%")

Output: The probability of liver disease given that the patient is alcoholic = 14 %

Result: Thus the program to find the probability that a patient has liver disease given that they are alcoholic is executed and the output is verified.


PROGRAM 2

Extract the data from database using python

2.a. AIM: To write a python program to fetch and display records from the product table.

Flow Chart:

Required Installations: pip install mysql-connector-python

SOURCE CODE:

BACKEND:


$ sudo mysql
mysql> create user 'USERNAME'@'localhost' identified by 'PASSWORD';
mysql> grant all on *.* to 'USERNAME'@'localhost';
mysql> flush privileges;
mysql> exit
$ mysql -u USERNAME -p
Enter password: PASSWORD
mysql> create database DATABASENAME;
mysql> use DATABASENAME;
mysql> create table product(pcode varchar(20), pname varchar(30));
mysql> insert into product values("p101","A");
mysql> insert into product values("p102","B");
mysql> insert into product values("p103","C");
mysql> exit

FRONT END:

import mysql.connector

d = mysql.connector.connect(host="localhost", user="USERNAME", password="PASSWORD", database="DATABASENAME")
print(d)
k = d.cursor()
k.execute("select * from product")
r = k.fetchall()
print("the records from the product table are")
for i in r:
    print(i)

Output:
The records from the product table are
('p101', 'A')
('p102', 'B')
('p103', 'C')

Result: Thus the program to fetch and display records from the product table is executed and verified.

2.b. AIM: To write a Python program to fetch and display records from the customer table in descending order by name.

SOURCE CODE:

BackEnd:

$ mysql -u USERNAME -p
Enter password: PASSWORD
mysql> use DATABASENAME;
mysql> create table customer(cname char(30), caddress char(100), cmobile_no real);
mysql> insert into customer values("H1","Hyderabad",7123456789);
mysql> insert into customer values("A1","Delhi",0876543210);
mysql> insert into customer values("Z1","Mumbai",965432198);
mysql> exit

FrontEnd:

import mysql.connector

d = mysql.connector.connect(host="localhost", user="USERNAME", password="PASSWORD", database="DATABASENAME")
print(d)
k = d.cursor()
k.execute("select * from customer ORDER BY cname DESC")
r = k.fetchall()
print("the records from the customer table are")
for i in r:
    print(i)

Output:
The records from customer table are
(Z1,Mumbai,965432198)
(H1,Hyderabad,7123456789)
(A1,Delhi,0876543210)

Result: Thus the program to fetch and display records from the customer table is executed and verified.

2.c. AIM: To write a program to design a GUI using tkinter to read data and store it into a database.

Required Installations: sudo apt-get install python3-tk

SOURCE CODE:

BackEnd:

$ mysql -u USERNAME -p
Enter password: PASSWORD
mysql> use DATABASENAME;
mysql> create table cplayer(cname char(100), cruns int);
mysql> exit

FrontEnd:


import tkinter as t
import mysql.connector

def f1():
    x = v1.get()
    y = v2.get()
    d = mysql.connector.connect(host='localhost', user='USERNAME', password='PASSWORD', database='DATABASENAME')
    c = d.cursor()
    s = "insert into cplayer(cname, cruns) values(%s, %s)"
    a = (x, int(y))
    c.execute(s, a)
    d.commit()
    d.close()

w = t.Tk()
l1 = t.Label(text="player name")
l1.place(x=100, y=50)
v1 = t.StringVar()
t1 = t.Entry(textvariable=v1)
t1.place(x=200, y=50)
l2 = t.Label(text="player runs")
l2.place(x=100, y=100)
v2 = t.StringVar()
t2 = t.Entry(textvariable=v2)
t2.place(x=200, y=100)
b = t.Button(text="submit", command=f1)
b.place(x=300, y=200)
w.mainloop()

OUTPUT:


$ mysql -u USERNAME -p
Enter password: PASSWORD
mysql> use DATABASENAME;
mysql> select * from cplayer;

RESULT: Thus the program to design a GUI, read values from it and store them into the database is executed and the output is verified.

2.d. AIM: To write a program to design a student registration form GUI using tkinter to read data and store it into a database.

SOURCECODE:

BackEnd:
$ mysql -u USERNAME -p
Enter password: PASSWORD
mysql> use DATABASENAME;
mysql> create table student(sname char(100), rollno varchar(20), gender char(20), year int, branch char(20));
mysql> exit

FrontEnd:

import tkinter as t
import mysql.connector

def f1():
    x = v1.get()
    y = v2.get()
    p = v3.get()
    q = v4.get()
    r = v5.get()
    d = mysql.connector.connect(host="localhost", user="USERNAME", password="PASSWORD", database="DATABASENAME")
    c = d.cursor()
    z = "insert into student(sname, rollno, gender, year, branch) values(%s, %s, %s, %s, %s)"
    a = (x, y, p, int(q), r)
    c.execute(z, a)
    d.commit()
    d.close()

w = t.Tk()
w.title("Student Registration Form")
l1 = t.Label(text="Name")
l1.place(x=100, y=50)
v1 = t.StringVar()
t1 = t.Entry(textvariable=v1)
t1.place(x=200, y=50)
l2 = t.Label(text="Roll No.")
l2.place(x=100, y=100)
v2 = t.StringVar()
t2 = t.Entry(textvariable=v2)
t2.place(x=200, y=100)
v3 = t.StringVar()
l3 = t.Label(text="Gender")
l3.place(x=100, y=150)
r1 = t.Radiobutton(w, text="male", variable=v3, value="male")
r1.place(x=150, y=150)
r2 = t.Radiobutton(w, text="female", variable=v3, value="female")
r2.place(x=200, y=150)
l4 = t.Label(text="Year")
l4.place(x=100, y=200)
v4 = t.StringVar()
v4.set("Select your year")
drop1 = t.OptionMenu(w, v4, "1", "2", "3", "4")
drop1.place(x=200, y=250)
l5 = t.Label(w, text="Branch")
l5.place(x=100, y=300)
v5 = t.StringVar()
v5.set("Select your Branch")
drop2 = t.OptionMenu(w, v5, "CSE", "AIML", "DS", "ECE")
drop2.place(x=200, y=300)
b = t.Button(text="Submit", command=f1)
b.place(x=300, y=400)
w.mainloop()

OUTPUT:


$ mysql -u USERNAME -p
Enter password: PASSWORD
mysql> use DATABASENAME;
mysql> select * from student;

RESULT: Thus the program to design a GUI, read values from it and store them into the database is executed and the output is verified.


5.a. KNN (K-Nearest Neighbours Algorithm) as Classifier:

AIM: To write a program to classify a person as overweight, underweight or normal weight using Python.

THEORY:
o The K-NN algorithm assumes similarity between the new case/data and the available cases and puts the new case into the category that is most similar to the available categories.
o The K-NN algorithm can be used for regression as well as for classification, but it is mostly used for classification problems.

The working of K-NN can be explained as follows:

1. Select the number K of neighbors.
2. Calculate the Euclidean distance from the new point to the training points.
3. Take the K nearest neighbors as per the calculated Euclidean distance.
4. Among these K neighbors, count the number of data points in each category.
5. Assign the new data point to the category for which the number of neighbors is maximum.
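As an illustration of these five steps, a minimal from-scratch sketch in plain NumPy might look like the following; it uses a few rows from the weight.csv data shown later and is only a sketch alongside the library-based program under SOURCE CODE.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # steps 2-3: Euclidean distances, then the k nearest neighbours
    distances = np.linalg.norm(X_train - x_new, axis=1)
    nearest = np.argsort(distances)[:k]
    # steps 4-5: majority vote among the k neighbours
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[137, 35], [137, 48], [137, 80], [140, 50], [140, 100]])
y_train = np.array([0, 1, 2, 1, 2])       # 0 = underweight, 1 = normal, 2 = overweight
print(knn_predict(X_train, y_train, np.array([141, 45])))   # expected: 1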

FLOWCHART:

SOURCE CODE:

Dataset:
$ gedit weight.csv
height,weight,target
137,35,0
137,48,1
137,80,2
140,20,0
140,50,1
140,100,2
141,15,0
141,52,1
141,200,2
143,56,1
143,30,0
143,99,2
145,59,1
145,110,2
145,35,0
146,62,1
146,47,0
146,47,0
146,88,2
148,65,1
148,100,2
148,25,0
150,68,1
150,46,0
150,86,2
155,72,1
155,56,0
155,88,2
161,78,1
161,565,2
161,100,2
170,85,1
170,50,0
170,100,2
174,88,1
174,66,0
174,120,2
182,93,1
182,60,0
182,110,2
187,96,1
187,70,0
187,125,2
193,98,1
193,65,0
193,100,2

Python Code:


import pandas as pd
import matplotlib.pyplot as pt
import seaborn as sb
from sklearn import model_selection
from sklearn.metrics import confusion_matrix
from sklearn.neighbors import KNeighborsClassifier

# Data visualization
df = pd.read_csv("weight.csv")
print(df)
pt.xlabel("height")
pt.ylabel("weight")
df1 = df[df.target == 0]
df2 = df[df.target == 1]
df3 = df[df.target == 2]
# Scatter diagram
pt.scatter(df1["height"], df1["weight"], color="red", marker="+")
pt.scatter(df2["height"], df2["weight"], color="green", marker="*")
pt.scatter(df3["height"], df3["weight"], color="black", marker=".")
pt.show()
# Experience
x = df.drop(["target"], axis="columns")
y = df["target"]
xtrain, xtest, ytrain, ytest = model_selection.train_test_split(x, y, test_size=0.2, random_state=1)
print(xtrain)
print(ytrain)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(xtrain, ytrain)
# Task
ypredict = knn.predict(xtest)
cm = confusion_matrix(ytest, ypredict)
print("confusion matrix =", cm)
pt.figure(figsize=(10, 5))
sb.heatmap(cm, annot=True)
pt.xlabel("Predicted Value")
pt.ylabel("Actual value from Dataset")
pt.show()
# End user input
print("Enter Height and Weight")
h = int(input())
w = int(input())
data = {"height": [h], "weight": [w]}
k = pd.DataFrame(data)
pred = knn.predict(k[["height", "weight"]])
print("predicted target =", pred)
# Performance
acc = knn.score(xtest, ytest)
acc = int(round(acc, 2) * 100)
print("accuracy =", acc, "%")

Output:


weight.csv
43 records
height weight
33 174 66
43 193 100
32 174 88
23 155 72
17 146 88
31 170 100
29 170 85
36 182 60
40 187 125
4 140 50
24 155 56
14 145 35
10 143 30
39 187 70
26 161 78
27 161 565
38 187 96
20 148 25
18 148 65
25 155 88
6 141 15
28 161 100
13 145 110
7 141 52
42 193 65
1 137 48
16 146 47
0 140 35
15 143 62
5 140 100
11 143 99
9 143 56
8 141 200
12 145 59
37 182 110
33 0
43 2
32 1
23 1
17 2
31 2
29 1
36 0
40 2
4 1
24 0
14 0
10 0
39 0


26 1
27 2
38 1
20 0
18 1
25 2
6 0
28 2
13 2
7 1
42 0
1 1
16 0
0 0
15 1
5 2
11 2
9 1
8 2
12 1
37 2

Name : target, dtype:int64


Confusion matrix = [[3 0 0]
[0 2 1]
[0 0 3]]
Enter height weight
155
90
Predicted target = [2]
Accuracy = 89%


Result: Thus the program to classify a person as overweight, underweight or normal weight has been executed and the output is verified.

5.b) KNN as Regressor:

Aim: To write a program to find the price for a given area of a house using KNN as a regressor.

Source Code:

$ gedit house.csv
area,price
500,1000000
525,1210000
540,1400000
600,1700000
635,1870000
670,1900000
800,2900000
900,3200000
1000,3600000
1050,3800000
1100,nan
1200,4500000
1250,4600000
1300,5000000
1325,5100000
1350,5200000
1400,5300000
1500,5450000
1550,5600000
1600,5700000
1700,6000000
1725,nan
1800,6500000
1825,6700000
1877,7200000
1900,7000000
1960,7700000
2000,8500000
2100,8600000
2200,8500000
2500,9000000
2600,8700000
2700,8300000

Python Code:

# import modules
import pandas as pd
import matplotlib.pyplot as pt
from sklearn import model_selection
from sklearn.neighbors import KNeighborsRegressor

# Data visualization
df = pd.read_csv("house.csv")
print(df)
pt.xlabel("area")
pt.ylabel("price")
pt.plot(df.area, df.price)
pt.show()

# Data preprocessing
print("Missing value information")
print(df.isna().sum())
df["price"].fillna(df["price"].mean(), limit=1, inplace=True)
df["price"].fillna(method="ffill", inplace=True)
print(df)

# Data visualization after preprocessing
pt.xlabel("area")
pt.ylabel("price")
pt.plot(df.area, df.price)
pt.show()

# Experience
x = df.drop(["price"], axis="columns")
y = df.drop(["area"], axis="columns")
xtrain, xtest, ytrain, ytest = model_selection.train_test_split(x, y, test_size=0.25, random_state=1)
print("The training data")
train = pd.DataFrame({"area": xtrain.squeeze(), "price": ytrain.squeeze()})
print(train)
knn = KNeighborsRegressor(n_neighbors=7)
knn.fit(xtrain, ytrain)

# Task
print("The test data is")
test = pd.DataFrame({"area": xtest.squeeze(), "price": ytest.squeeze()})
print(test)
ypredict = knn.predict(xtest)
print("comparison")
yp = pd.DataFrame({"ypredict": ypredict.squeeze(), "ytest": ytest.squeeze()})
print(yp)
print("enter area of the house")
a = int(input())
data = {"area": [a]}
k = pd.DataFrame(data)
pp = knn.predict(k[["area"]])
print("predicted price =", pp)

# Performance
acc = knn.score(xtest, ytest)
acc = int(round(acc, 2) * 100)
print("accuracy =", acc, "%")

Output:
The missing value information
area 0
price 2
dtype : int64

The training data


area price
30 2500 9.000000e+06
17 1500 5.450000e+06
22 1800 6.50000e+06
4 635 1.870000e+06
2 540 1.40000e+06
21 1725 6.00000e+06
23 1825 6.700000e+06

The test data


area price
14 1325 5100000.0
19 1600 5700000.0
3 600 1700000.0
27 2000 8500000.0
31 1600 7700000.0
26 1960 8000000.0
20 1700 70000000.0

Enter the area


1600
Predicted price = [[6250000.0]]
Accuracy = 77%

Result: Thus the program to find the price for a given area of a house using KNN as a regressor has been executed and the output is verified.

PROGRAM 6
6.a) K-Means Clustering

Aim: To implement the K-Means clustering algorithm using Python.


Theory:
K-Means clustering is an unsupervised learning algorithm. It is an iterative algorithm that divides the unlabeled dataset into k different clusters in such a way that each data point belongs to only one group, and points within a group have similar properties.

The working of the K-Means algorithm is explained in the steps below:

 Select the number K to decide the number of clusters.
 Select K random points as centroids (they can be points other than those in the input dataset).
 Assign each data point to its closest centroid, which will form the predefined K clusters.
 Calculate the variance and place a new centroid for each cluster.
 Repeat the third step, i.e., reassign each data point to the new closest centroid of its cluster.
 If any reassignment occurred, go to step 4; otherwise FINISH.
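As an illustration of these steps, a minimal from-scratch sketch in plain NumPy might look like the following; the 2-D points are made up, and the manual's scikit-learn program follows under Source Code.

import numpy as np

def kmeans(X, k=3, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]     # step 2: random centroids
    for _ in range(n_iter):
        # step 3: assign each point to its closest centroid
        labels = np.argmin(np.linalg.norm(X[:, None] - centroids[None, :], axis=2), axis=1)
        # step 4: recompute each centroid as the mean of its cluster
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):                # step 6: stop if nothing moved
            break
        centroids = new_centroids
    return labels, centroids

X = np.array([[1.0, 1.1], [1.2, 0.9], [5.0, 5.2], [5.1, 4.9], [9.0, 0.2], [8.8, 0.1]])
labels, centroids = kmeans(X, k=3)
print(labels, centroids, sep="\n")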

Source Code:

Data Set:
$ gedit customer1.csv
id,gender,age,income,spendingscore
1,male,19,1245000,39
2,male,21,1245000,81
3,female,20,1328000,6
4,female,23,1328000,77
5,female,31,1411000,40
…….
…….
…….
196,female,35,9960000,79
197,female,45,1e+07,28
198,male,32, 1e+07,74
199,male,32,1e+07,18
200,male,30,1e+07,83

Python Code:

# import modules
import matplotlib.pyplot as pt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# data preprocessing
df = pd.read_csv("customer1.csv")
print("data set before scaling")
print(df)
s = StandardScaler()
t = df.iloc[:, [3, 4]]
x = s.fit_transform(t)
df["income"] = x[:, 0]
df["spendingscore"] = x[:, 1]
print("data set after scaling")
print(df)

# data visualization
inc = df.iloc[:, 3]
ss = df.iloc[:, 4]
pt.title("income vs spending score scatter diagram")
pt.xlabel("income")
pt.ylabel("spending score")
pt.scatter(inc, ss)
pt.show()

# to find the number of clusters using the elbow method
wcss = []
k = []
y = df.iloc[:, [3, 4]]
for i in range(2, 10):
    km = KMeans(n_clusters=i)
    km.fit(y)
    wcss.append(km.inertia_)
    k.append(i)
pt.title("number of clusters vs wcss line graph")
pt.xlabel("k")
pt.ylabel("wcss")
pt.plot(k, wcss)
pt.show()

# fit the data using the k-means clustering algorithm
km = KMeans(n_clusters=5)
km.fit(y)

# to predict clusters
dc = km.predict(y)
print(dc)
df["cluster"] = dc
print("dataset after cluster assignment for each example")
print(df)
print("centroids of the five clusters")
cen = km.cluster_centers_
print(cen)
c1 = df[df.cluster == 0]
c2 = df[df.cluster == 1]
c3 = df[df.cluster == 2]
c4 = df[df.cluster == 3]
c5 = df[df.cluster == 4]
pt.xlabel("income")
pt.ylabel("spending score")
pt.scatter(c1.income, c1.spendingscore, color="red", label="cluster1")
pt.scatter(c2.income, c2.spendingscore, color="blue", label="cluster2")
pt.scatter(c3.income, c3.spendingscore, color="green", label="cluster3")
pt.scatter(c4.income, c4.spendingscore, color="yellow", label="cluster4")
pt.scatter(c5.income, c5.spendingscore, color="orange", label="cluster5")
pt.scatter(cen[:, 0], cen[:, 1], color="black", label="centroid")
pt.legend()
pt.show()

# end user input
print("enter income and spending score of a person")
inc = int(input())
ss = int(input())
data = {"income": [inc], "spendingscore": [ss]}
k = pd.DataFrame(data)
pc = km.predict(k[["income", "spendingscore"]])
print("the cluster for given input =", pc)

Output:
Dataset before scaling:

Id gender age income spendingscore


0 1 male 19 1245000 39
1 2 male 21 1245000 81
2 3 female 20 1328000 6
3 4 female 23 1328000 77
4 5 female 31 1411000 40
……..
195 196 female 35 9960000 79
196 197 female 45 10458000 28
197 198 male 32 10458000 74
198 199 male 32 11371000 18
199 200 male 30 11371000 83
[200 rows x 5 columns]

Dataset after scaling:

Id gender age income spendingscore


0 1 male 19 -1.738999 -0.434801
1 2 male 21 -1.738999 1.195704
2 3 female 20 -1.700830 -1.715913
3 4 female 23 -1.700830 1.040418
4 5 female 31 -1.662660 -.0395980
……..
195 196 female 35 2.2688791 1.118061
196 197 female 45 2.497807 -0.861839
197 198 male 32 2.497807 0.923953
198 199 male 32 2.917671 -1.250054
199 200 male 30 2.917671 1.273347
[200 rows x 5 columns]

Result: Thus the program to implement the K-Means clustering algorithm using Python has been executed and the output is verified.


6.b) Given the following data, which specify classifications for nine combinations of VAR1 and VAR2, predict a classification for a case where VAR1=0.906 and VAR2=0.606, using the result of k-means clustering with 3 means (i.e., 3 centroids).

Aim: To implement the K-Means clustering algorithm using Python.

SOURCE CODE:

import matplotlib.pyplot as pt
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# data preprocessing
df = pd.read_csv("var12.csv")
print(df)
s = StandardScaler()
t = df.iloc[:, [0, 1]]
x = s.fit_transform(t)
df["VAR1"] = x[:, 0]
df["VAR2"] = x[:, 1]
print("dataset after scaling")
print(df)

# data visualization
inc = df.iloc[:, 0]
ss = df.iloc[:, 1]
pt.title("VAR1 vs VAR2 scatter diagram")
pt.xlabel("VAR1")
pt.ylabel("VAR2")
pt.scatter(inc, ss)
pt.show()

# find the number of clusters using the elbow method
wcss = []
k = []
y = df.iloc[:, [0, 1]]
for i in range(2, 10):
    km = KMeans(n_clusters=i)
    km.fit(y)
    wcss.append(km.inertia_)
    k.append(i)
pt.title("number of clusters vs wcss line graph")
pt.xlabel("k")
pt.ylabel("wcss")
pt.plot(k, wcss)
pt.show()

# fit the data using the k-means algorithm
km = KMeans(n_clusters=3)
km.fit(y)

# to predict the clusters
dc = km.predict(y)
print(dc)
df["cluster"] = dc
print("the dataset after the cluster assignment for each example")
print(df)
print("the centroids of the three clusters")
cen = km.cluster_centers_
print(cen)
c1 = df[df.cluster == 0]
c2 = df[df.cluster == 1]
c3 = df[df.cluster == 2]
pt.xlabel("VAR1")
pt.ylabel("VAR2")
pt.scatter(c1.VAR1, c1.VAR2, color="red", label="cluster1")
pt.scatter(c2.VAR1, c2.VAR2, color="blue", label="cluster2")
pt.scatter(c3.VAR1, c3.VAR2, color="green", label="cluster3")
pt.scatter(cen[:, 0], cen[:, 1], color="black", label="centroid")
pt.legend()
pt.show()
print("enter the VAR1 and VAR2")
inc = float(input())
ss = float(input())
data = {"VAR1": [inc], "VAR2": [ss]}
k = pd.DataFrame(data)
pc = km.predict(k[["VAR1", "VAR2"]])
print("The cluster for given input =", pc)

OUTPUT:

Enter the VAR1 and VAR2
0.906
0.606
The cluster for given input = [1]


Result: Thus the program to implement the K-Means clustering algorithm using Python has been executed and the output is verified.

