Ai Lab
OBJECTIVE:
What is Prolog?
• Prolog stands for "programming in logic".
• Prolog is a declarative language: a program consists of facts and rules (logical
relationships) that describe a problem, rather than instructions for computing a
solution.
• A logical relationship describes a relationship that holds in the given
application.
• To obtain a solution, the user asks a question rather than running a program.
To answer the question, the run-time system searches through the database of
facts and rules.
• Being declarative, Prolog lets us specify what problem we want to solve rather
than how to solve it.
• Prolog is used in areas such as databases, natural language processing, and
artificial intelligence, but it is poorly suited to areas such as numerical
algorithms or graphics.
• In artificial intelligence, Prolog is used for automated reasoning systems,
natural language interfaces, and expert systems. An expert system consists of an
inference engine and a database of facts. Prolog's run-time system provides the
service of an inference engine.
Applications of Prolog
• Specification Language
• Robot Planning
• Natural language understanding
• Machine Learning
• Problem Solving
• Intelligent Database retrieval
• Expert System
• Automated Reasoning
Starting Prolog
• Starting a Prolog system is straightforward.
• On startup, Prolog prints a number of heading lines, followed by a line that
contains just
• ?-
• This symbol is the system prompt. It shows that the Prolog system is ready for
the user to enter a sequence of one or more goals.
• A sequence of goals is terminated with a full stop.
• For example:
Subject Code: KCS-751A Subject Name: Artificial Intelligence Lab
Prolog Programs
• To use Prolog, the user first writes a program in the Prolog language, loads
that program, and then enters a sequence of one or more goals at the prompt.
• The simplest way to create a Prolog program is to type it into a text editor and
save it as a text file, such as prolog1.pl.
• The following example shows a simple Prolog program. The program contains
three components, which are known as clauses. Each clause is terminated with a
full stop.
dog(rottweiler).
cat(munchkin).
animal(A) :- cat(A).
• The above program can be loaded into the Prolog system using the built-in
predicate 'consult'.
• ?- consult('prolog1.pl').
• If the file prolog1.pl exists and the program is syntactically correct, that is,
it has valid clauses, the goal succeeds, and Prolog prints one or more lines of
output to confirm that the program has been read correctly, e.g.,
• ?-
# 0.00 seconds to consult prolog1.pl
?-
• An alternative to 'consult' is 'Load', which is available as a menu option if the
Prolog system has a graphical user interface.
• When the program is loaded, its clauses are placed in a storage area known as
the Prolog database. Entering a sequence of goals at the system prompt causes
Prolog to search for and use the clauses necessary to evaluate the goals.
Terminology
In the following program, three lines show the clauses.
dog(rottweiler).
cat(munchkin).
animal(A) :- cat(A).
Each clause is terminated with a full stop. A Prolog program is a sequence of
clauses, and each clause describes either a fact or a rule.
Examples of facts are dog(rottweiler) and cat(munchkin). They mean that
'rottweiler is a dog' and 'munchkin is a cat'.
Here dog is called a predicate. It has one argument, the word 'rottweiler'
enclosed in brackets ( ). rottweiler is called an atom.
The final line of the program is an example of a rule.
animal(A) :- cat(A).
The :- symbol is read as 'if'. Here A is a variable, and it can stand for any
value. The rule can be read as "A is an animal if A is a cat".
Together with the fact cat(munchkin), this clause shows that munchkin is an animal.
Prolog can make this deduction:
?- animal(munchkin).
yes
There is no evidence in the program that rottweiler is an animal, because no rule
connects dog to animal.
?- animal(rottweiler).
no
Experiment-1 (B)
OBJECTIVE: Write simple fact for the statements using PROLOG.
Facts
A fact is like a predicate expression. It is used to make a declarative statement about
the problem. When a variable occurs in a Prolog fact, it is assumed to be
universally quantified. A fact consists of a head only, known as the clause head, and it
takes the same form as a goal entered at the prompt by the user. Names that start with a
lowercase letter are atoms; names that start with an uppercase letter are variables.
cat(bengal). /* bengal is a cat */
dog(rottweiler). /* rottweiler is a dog */
likes(jolie, kevin). /* Jolie likes Kevin */
likes(A, kevin). /* Everyone likes Kevin */
likes(jolie, B). /* Jolie likes everybody */
The following are compound goals rather than facts. They can be entered at the prompt or
used in the body of a rule:
likes(B, jolie), likes(jolie, B). /* Everybody likes Jolie and Jolie likes everybody */
likes(jolie, kevin); likes(jolie, ray). /* Jolie likes Kevin or Jolie likes Ray */
not(likes(jolie, pasta)). /* Jolie does not like pasta */
Queries
In Prolog, a query is the action of asking the program about the information that is
available in its database. When a Prolog program is loaded, we get the query
prompt,
?-
After this, we can ask the run-time system for information. For example, assuming the
database contains the fact 'It is sunny'., we can ask the program a question like
?- 'It is sunny'.
and it will give the answer
yes
?-
The system responds to the query with yes if the query can be proved true from the
information in the database. The answer no indicates that the query cannot be deduced
from the available information.
The system answers no to the query if the database does not have sufficient information.
?- 'It is cold'.
no
?-
a. Ram likes mango.
b. Seema is a girl.
c. Bill likes Cindy.
d. Rose is red.
e. John owns gold.
Program:
likes(ram, mango).
girl(seema).
red(rose).
likes(bill, cindy).
owns(john, gold).
Output:
Goal queries
?- likes(ram, What).
What = mango
?- likes(Who, cindy).
Who = bill
?- red(What).
What = rose
?- owns(Who, What).
Who = john
What = gold
Example 2.
Statements:
1. The Cakes are delicious.
2. The Pickles are delicious.
3. The Pickles are spicy.
4. Priya relishes coffee.
5. Priya likes food if it is delicious.
6. Prakash likes food if it is spicy and delicious.
Statements in Prolog:
1. delicious(cakes).
2. delicious(pickles).
3. spicy( pickles).
4. relishes(priya, coffee).
5. likes(priya, Food) if delicious(Food). %Here Food is a variable
6. likes(prakash,Food) if spicy(Food) and delicious(Food).
Program Clauses:
delicious(cakes).
delicious(pickles).
spicy(pickles).
relishes(priya, coffee).
% here Food is a variable
likes(priya, Food):-delicious(Food).
likes(prakash, Food):- spicy(Food), delicious(Food).
Goal 1- Which food items are delicious.
Output-
Experiment-2 (A)
OBJECTIVE: Write two predicates: one converts centigrade temperatures to
Fahrenheit, the other checks if a temperature is below freezing.
Formula for Centigrade (C) temperatures to Fahrenheit (F) -
F = C * 9 / 5 + 32
Rule
Centigrade to Fahrenheit (c_to_f): F is C * 9 / 5 + 32
Program-
c_to_f(C, F) :- F is C * 9 / 5 + 32.
% a temperature is freezing if it is at or below 32 degrees Fahrenheit
freezing(F) :- F =< 32.
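For comparison, the same two rules can be sketched in Python. This is only an illustration of the formula above, not part of the Prolog solution; the function names simply mirror the predicate names, and unlike Prolog predicates (which succeed or fail) the functions return values.

```python
def c_to_f(c):
    """Convert a centigrade temperature to Fahrenheit: F = C * 9 / 5 + 32."""
    return c * 9 / 5 + 32


def freezing(f):
    """True when a Fahrenheit temperature is at or below 32 degrees."""
    return f <= 32


print(c_to_f(100))           # boiling point of water in Fahrenheit
print(freezing(c_to_f(-5)))  # is -5 degrees centigrade freezing?
```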
Goal to find Fahrenheit temperature and freezing point
Output-
Experiment-2 (B)
OBJECTIVE: Write a program to solve the Monkey Banana problem.
Monkey wants the bananas but he can’t reach them.
What shall he do? The monkey is in the room.
Suspended from the roof, just out of his reach, is a bunch of bananas.
In the corner of the room is a box. The monkey desperately wants to grasp bananas.
After several unsuccessful attempts to reach the bananas:
1. The monkey walks to the box.
2. Pushes it under the bananas.
3. Climbs on the box.
4. Picks the bananas and eats them.
Program:
on(floor,monkey).
on(floor,box).
in(room,monkey).
in(room,box).
at(ceiling,banana).
strong(monkey).
grasp(monkey).
climb(monkey,box).
push(monkey,box):-
strong(monkey).
under(banana,box):-
push(monkey,box).
canreach(banana,monkey):-
at(floor,banana);
at(ceiling,banana);
under(banana,box).
canget(banana,monkey):-
canreach(banana,monkey),grasp(monkey).
OUTPUT:
Experiment-3
OBJECTIVE: Write a program to design medical diagnosis expert system in SWI
Prolog. The system must work for diagnosis of following diseases:-
a) measles
b) germanmeasles
c) flu
d) commoncold
e) mumps
f) chickenpox
Program Code:
symptom(charlie,fever).
symptom(charlie,headache).
symptom(charlie,runnynose).
symptom(charlie,rash).
symptom(amit,headache).
symptom(amit,runnynose).
symptom(amit,snuzing).
symptom(amit,chills).
symptom(amit,sorethrought).
symptom(ajay,runnynose).
symptom(ajay,snuzing).
symptom(ajay,cough).
symptom(rajesh,fever).
symptom(rajesh,rash).
symptom(rajesh,bodyache).
symptom(deepak,fever).
symptom(deepak,headache).
symptom(deepak,bodyache).
symptom(deepak,chills).
symptom(deepak,sorethrought).
symptom(deepak,cough).
symptom(deepak,conjunctive).
symptom(deepak,runnynose).
hypothesis(Patient,measles) :-
symptom(Patient,fever), symptom(Patient,cough), symptom(Patient,conjunctive),
symptom(Patient,runnynose), symptom(Patient,rash).
hypothesis(Patient,germanmeasles) :-
symptom(Patient,fever), symptom(Patient,headache), symptom(Patient,runnynose),
symptom(Patient,rash).
hypothesis(Patient,flu) :-
symptom(Patient,fever), symptom(Patient,headache), symptom(Patient,bodyache),
symptom(Patient,chills), symptom(Patient,sorethrought), symptom(Patient,cough),
symptom(Patient,conjunctive), symptom(Patient,runnynose).
hypothesis(Patient,commoncold) :-
symptom(Patient,headache), symptom(Patient,runnynose), symptom(Patient,snuzing),
symptom(Patient,chills), symptom(Patient,sorethrought).
hypothesis(Patient,mumps) :-
symptom(Patient,fever), symptom(Patient,swallenglands).
hypothesis(Patient,chickenpox) :-
symptom(Patient,fever), symptom(Patient,rash), symptom(Patient,bodyache).
hypothesis(Patient,whooping_cough) :-
symptom(Patient,runnynose), symptom(Patient,snuzing), symptom(Patient,cough).
OUTPUT:
?- hypothesis(charlie,P).
P = germanmeasles
Experiment-4 (A)
OBJECTIVE: Write a program to compute the factorial and the Fibonacci number of a given number.
Program-
factorial(0,1).
factorial(N,F):-
N>0,
N1 is N-1,
factorial(N1,F1),
F is N * F1.
Goal- To find Factorial of the number.
Output-
The Fibonacci sequence is a sequence where the next term is the sum of the previous two
terms. The first two terms of the Fibonacci sequence are 0 followed by 1.
fib(0, 0).
fib(1, 1).
fib(N, F) :-
N > 1,
N1 is N-1,
N2 is N-2,
fib(N1, F1),
fib(N2, F2),
F is F1+F2.
Goal- To find Fibonacci number.
Output-
Experiment-4 (B)
OBJECTIVE: Write a program to solve 4-Queen problem.
N queens problem is one of the most common examples of backtracking. Our goal is to
arrange N queens on an NxN chessboard such that no queen can strike down any other
queen. A queen can attack horizontally, vertically, or diagonally.
So, we start by placing the first queen anywhere arbitrarily and then place the next
queen in any of the safe places. We continue this process until the number of
unplaced queens becomes zero (a solution is found) or no safe place is left. If no
safe place is left, then we change the position of the previously placed queen.
The above picture shows a 4x4 chessboard and we have to place 4 queens on it.
So, we will start by placing the first queen in the first row.
Now, the second step is to place the second queen in a safe position. Also, we can't
place the queen in the first row, so we will try putting the queen in the second row
this time.
Let's place the third queen in a safe position, somewhere in the third row.
Now, we can see that there is no safe place where we can put the last queen.
So, we will just change the position of the previous queen i.e., backtrack and
change the previous decision.
Also, there is no other position where we can place the third queen, so we will go
back one more step and change the position of the second queen.
And now we will place the third queen again in a safe position other than the
previously placed position in the third row.
We will continue this process and finally, we will get the solution as shown below.
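The place, check, and backtrack procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Prolog program of this experiment: the name solve_queens and the representation (one column index per row) are choices made only for this sketch.

```python
def solve_queens(n):
    """Return one solution as a list cols, where cols[r] is the column of the
    queen in row r, or None if no solution exists."""
    cols = []

    def safe(row, col):
        # A square is safe if no earlier queen shares its column or a diagonal.
        for r, c in enumerate(cols):
            if c == col or abs(c - col) == abs(r - row):
                return False
        return True

    def place(row):
        if row == n:
            return True              # every queen has been placed
        for col in range(n):
            if safe(row, col):
                cols.append(col)
                if place(row + 1):
                    return True
                cols.pop()           # backtrack: undo and try the next column
        return False                 # no safe square left in this row

    return cols if place(0) else None


print(solve_queens(4))
```

For the 4x4 board the first placement of the first queen fails, so the search backtracks all the way to the first row before it finds a solution, just as in the walkthrough.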
Program:
% render solutions nicely.
:- use_rendering(chess).
queens(N, Queens) :-
length(Queens, N),
board(Queens, Board, 0, N, _, _),
queens(Board, 0, Queens).
constraints(0, _, _, _) :- !.
constraints(N, Row, [R|Rs], [C|Cs]) :-
arg(N, Row, R-C),
M is N-1,
constraints(M, Row, Rs, Cs).
queens([], _, []).
queens([C|Cs], Row0, [Col|Solution]) :-
Row is Row0+1,
/** <examples>
?- queens(8, Queens).
*/
OUTPUT:
Experiment-5
OBJECTIVE: Write a program to solve traveling salesman problem.
Given a set of cities and distances between every pair of cities, the problem is to find the
shortest possible route that visits every city exactly once and returns to the starting point.
Note the difference between Hamiltonian Cycle and TSP. The Hamiltonian cycle
problem is to find if there exists a tour that visits every city exactly once. Here we know
that Hamiltonian Tour exists (because the graph is complete) and in fact, many such tours
exist, the problem is to find a minimum weight Hamiltonian Cycle.
For example, consider the graph shown in the figure on the right side. A TSP tour in the
graph is 1-2-4-3-1. The cost of the tour is 10+25+30+15 which is 80.
Program:
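# Python3 program to implement traveling salesman
# problem using naive approach.
from sys import maxsize
from itertools import permutations
V = 4
def travelling_salesman(graph, s):
    vertices = [i for i in range(V) if i != s]
    min_path = maxsize
    for perm in permutations(vertices):
        current_pathweight, k = 0, s
        for j in perm:
            current_pathweight += graph[k][j]
            k = j
        current_pathweight += graph[k][s]
        # update minimum
        min_path = min(min_path, current_pathweight)
    return min_path
# Driver Code
if __name__ == "__main__":
    # 10, 25, 30, 15 come from the example tour; 20, 35 and the function name are assumed
    graph = [[0, 10, 15, 20], [10, 0, 35, 25], [15, 35, 0, 30], [20, 25, 30, 0]]
    print(travelling_salesman(graph, 0))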
Output
Experiment-6
OBJECTIVE: Write a program to solve water jug problem.
Problem: You are given two jugs, a 4-gallon one and a 3-gallon one. Neither has any
measuring mark on it. There is a pump that can be used to fill the jugs with water. How
can you get exactly 2 gallons of water into the 4-gallon jug?
Solution:
• The state space for this problem can be described as the set of ordered pairs of
integers (X, Y)
• Where,
• X represents the quantity of water in the 4-gallon jug, X = 0, 1, 2, 3, 4
• Y represents the quantity of water in the 3-gallon jug, Y = 0, 1, 2, 3
• Start State: (0,0)
• Goal State: (2,0)
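One possible way to search this state space is a breadth-first search, sketched below in Python. The choice of breadth-first search, the move set (fill, empty, pour) and all names are assumptions made for this illustration; the capacities, start state and goal state are taken from the description above.

```python
from collections import deque

CAP_X, CAP_Y = 4, 3            # jug capacities from the problem statement
START, GOAL = (0, 0), (2, 0)


def successors(state):
    """All states reachable in one move: fill a jug, empty a jug, or pour."""
    x, y = state
    to_y = min(x, CAP_Y - y)   # amount that can be poured from X into Y
    to_x = min(y, CAP_X - x)   # amount that can be poured from Y into X
    return {
        (CAP_X, y), (x, CAP_Y),    # fill a jug from the pump
        (0, y), (x, 0),            # empty a jug
        (x - to_y, y + to_y),      # pour X into Y
        (x + to_x, y - to_x),      # pour Y into X
    } - {state}


def solve(start=START, goal=GOAL):
    """Breadth-first search; returns a shortest list of states from start to goal."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


print(solve())
```

Because breadth-first search explores states in order of the number of moves, the path it returns uses the fewest moves: six for this start and goal.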
OUTPUT:
Experiment-7
OBJECTIVE: Write a Python program to implement the Support Vector Machine algorithm.
SVM is one of the most popular Supervised Learning algorithms, used for Classification
as well as Regression problems. However, primarily, it is used for Classification
problems in Machine Learning.
The goal of the SVM algorithm is to create the best line or decision boundary that can
segregate n-dimensional space into classes so that we can easily put the new data point in
the correct category in the future. This best decision boundary is called a hyperplane.
SVM chooses the extreme points/vectors that help in creating the hyperplane. These
extreme cases are called support vectors, and hence the algorithm is termed the Support
Vector Machine. Consider the below diagram in which there are two different categories
that are classified using a decision boundary or hyperplane:
Types of SVM
SVM can be of two types:
o Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be
classified into two classes by a single straight line, the data is termed
linearly separable, and the classifier used is called a Linear SVM classifier.
o Non-linear SVM: Non-linear SVM is used for data that is not linearly separable.
If a dataset cannot be classified by a straight line, the data is termed
non-linear, and the classifier used is called a Non-linear SVM classifier.
Hyperplane and Support Vectors in the SVM algorithm:
Since this is a 2-D space, we can separate these two classes with just a straight line.
But there can be multiple lines that separate these classes. Consider the
below image:
Hence, the SVM algorithm helps to find the best line or decision boundary; this best
boundary is called a hyperplane. The SVM algorithm finds the points from both classes
that lie closest to the boundary. These points are called support vectors. The distance
between the support vectors and the hyperplane is called the margin, and the goal of SVM
is to maximize this margin. The hyperplane with maximum margin is called the optimal
hyperplane.
Non-Linear SVM:
If data is linearly arranged, then we can separate it by using a straight line, but for non-
linear data, we cannot draw a single straight line. Consider the below image:
So to separate these data points, we need to add one more dimension. For linear data, we
have used two dimensions x and y, so for non-linear data, we will add a third dimension
z. It can be calculated as:
z = x^2 + y^2
By adding the third dimension, the sample space will become as below image:
So now, SVM will divide the datasets into classes in the following way. Consider the
below image:
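The effect of the added dimension can be checked with a few lines of Python. The points below are hypothetical and chosen only for illustration: one class lies near the origin and the other on a ring around it, so no straight line separates them in the x-y plane, but a threshold on z does.

```python
# Hypothetical 2-D points: class 0 lies near the origin, class 1 on a ring around it.
inner = [(0.5, 0.0), (0.0, -0.5), (-0.3, 0.4), (0.2, 0.2)]
outer = [(2.0, 0.0), (0.0, -2.0), (-1.5, 1.5), (1.4, 1.6)]


def lift(point):
    """Map (x, y) to (x, y, z) with z = x^2 + y^2."""
    x, y = point
    return (x, y, x * x + y * y)


inner_z = [lift(p)[2] for p in inner]
outer_z = [lift(p)[2] for p in outer]

# In the lifted space the plane z = 1 separates the two classes.
print(max(inner_z) < 1 < min(outer_z))
```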
Program:
In the model-building part, you can use the breast cancer dataset, a well-known
binary classification problem. Its features are computed from a digitized image of a
fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell
nuclei present in the image.
This data has two types of cancer classes: malignant (harmful) and benign (not
harmful).
Here, you can build a model to classify the type of cancer. The dataset is available in the
scikit-learn library, or you can download it from the UCI Machine Learning Repository.
Loading Data
Let's first load the required dataset you will use.
Exploring Data
After you have loaded the dataset, you might want to know a little bit more about it. You
can check feature and target names.
You can also check the shape of the dataset using shape.
Splitting Data
To understand model performance, dividing the dataset into a training set and a test set is
a good strategy.
Split the dataset by using the function train_test_split(). You need to pass 3 parameters:
features, target, and test set size. Additionally, you can pass random_state to make the
random split reproducible.
Generating Model
Let's build a support vector machine model. First, import the SVM module and create a
support vector classifier object by passing the argument kernel='linear' to the SVC()
function.
Then, fit your model on the train set using fit() and perform prediction on the test set using
predict().
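The four steps above (loading, exploring, splitting, and generating the model) can be sketched with scikit-learn as follows. This is a minimal sketch, not a definitive listing: the 70/30 split and the random_state value are arbitrary choices made here, and the accuracy you obtain depends on them.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

# Loading data
cancer = datasets.load_breast_cancer()

# Exploring data
print("Features:", cancer.feature_names)
print("Labels:", cancer.target_names)
print("Shape:", cancer.data.shape)

# Splitting data: 70% training and 30% test (both values are assumptions)
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109)

# Generating the model with a linear kernel
clf = svm.SVC(kernel="linear")
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = metrics.accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```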
Experiment-8
OBJECTIVE: Write a python program for Decision tree algorithm.
Introduction: Decision Tree is a Supervised learning technique that can be used for
both classification and Regression problems, but mostly it is preferred for solving
Classification problems. It is a tree-structured classifier, where internal nodes represent
the features of a dataset, branches represent the decision rules and each leaf node
represents the outcome.
Below are the two reasons for using the Decision tree:
1. Decision Trees usually mimic human thinking ability while making a decision, so
it is easy to understand.
2. The logic behind the decision tree can be easily understood because it shows a
tree-like structure.
Decision Tree Terminologies
Root Node: Root node is from where the decision tree starts. It represents the entire
dataset, which further gets divided into two or more homogeneous sets.
Leaf Node: Leaf nodes are the final output node, and the tree cannot be segregated
further after getting a leaf node.
Splitting: Splitting is the process of dividing the decision node/root node into sub-
nodes according to the given conditions.
Branch/Sub Tree: A tree formed by splitting the tree.
Pruning: Pruning is the process of removing the unwanted branches from the tree.
Parent/Child node: A node that is divided into sub-nodes is called the parent node of
those sub-nodes, and the sub-nodes are called its child nodes.
How does the Decision Tree algorithm Work?
In a decision tree, for predicting the class of the given dataset, the algorithm starts from
the root node of the tree. This algorithm compares the values of root attribute with the
record (real dataset) attribute and, based on the comparison, follows the branch and
jumps to the next node.
For the next node, the algorithm again compares the attribute value with the other
sub-nodes and moves further. It continues the process until it reaches a leaf node of the tree.
The complete process can be better understood using the below algorithm:
o Step-1: Begin the tree with the root node, say S, which contains the complete
dataset.
o Step-2: Find the best attribute in the dataset using an Attribute Selection Measure
(ASM).
o Step-3: Divide S into subsets that contain the possible values of the best
attribute.
o Step-4: Generate the decision tree node, which contains the best attribute.
o Step-5: Recursively make new decision trees using the subsets of the dataset
created in Step-3. Continue this process until a stage is reached where the nodes
cannot be classified further; such a final node is called a leaf node.
Attribute Selection Measures
While implementing a decision tree, the main issue is how to select the best
attribute for the root node and for the sub-nodes. To solve this problem, there is a
technique called the Attribute Selection Measure (ASM). With this
measure, we can easily select the best attribute for the nodes of the tree. There are
two popular techniques for ASM, which are:
o Information Gain
o Gini Index
1. Information Gain:
o Information gain is the measurement of changes in entropy after the segmentation
of a dataset based on an attribute.
o It calculates how much information a feature provides us about a class.
o According to the value of information gain, we split the node and build the
decision tree.
o A decision tree algorithm always tries to maximize the value of information gain,
and a node/attribute having the highest information gain is split first. It can be
calculated using the below formula:
Information Gain = Entropy(S) - [(Weighted Avg) * Entropy(each feature)]
Entropy: Entropy is a metric to measure the impurity in a given attribute. It specifies
randomness in data. Entropy can be calculated as:
Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)
Where,
o S= Total number of samples
o P(yes)= probability of yes
o P(no)= probability of no
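The two formulas above can be tried out with a short Python sketch. The sample labels and the split into two subsets are hypothetical, and the function names are chosen only for this illustration.

```python
from math import log2


def entropy(labels):
    """Entropy of a list of class labels, e.g. ['yes', 'no', ...]."""
    total = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / total
        result -= p * log2(p)
    return result


def information_gain(labels, groups):
    """Entropy(S) minus the weighted average entropy of the subsets in groups."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted


# Hypothetical sample: 5 'yes' and 5 'no', split by some attribute into two subsets.
S = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(entropy(S))
print(information_gain(S, [left, right]))
```

A perfectly mixed sample has entropy 1, a pure sample has entropy 0, and the attribute whose split gives the largest information gain is chosen first.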
2. Gini Index:
o Gini index is a measure of impurity or purity used while creating a decision tree
in the CART(Classification and Regression Tree) algorithm.
o An attribute with the low Gini index should be preferred as compared to the high
Gini index.
o It only creates binary splits, and the CART algorithm uses the Gini index to create
binary splits.
o Gini index can be calculated using the below formula:
Gini Index = 1 - \sum_j P_j^2
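As a quick check of the formula, the Gini index of a list of class labels can be computed as in the following Python sketch (the function name and the sample labels are assumptions made for illustration).

```python
def gini_index(labels):
    """Gini index 1 - sum_j P_j^2 for a list of class labels."""
    total = len(labels)
    return 1.0 - sum((labels.count(v) / total) ** 2 for v in set(labels))


print(gini_index(["yes"] * 5 + ["no"] * 5))  # evenly mixed node
print(gini_index(["yes"] * 10))              # pure node
```

An evenly mixed two-class node has the highest Gini index (0.5), while a pure node has a Gini index of 0, which is why attributes with a low Gini index are preferred.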
PROGRAM:
1 1 85 66 29 0 33.6 0.627 50 1
3 1 89 66 23 94 28.1 0.167 21 0
Now, split the dataset into features and target variable as follows −
feature_cols = ['pregnant', 'insulin', 'bmi', 'age','glucose','bp','pedigree']
X = pima[feature_cols] # Features
y = pima.label # Target variable
Next, we will divide the data into train and test split. The following code will split the
dataset into 70% training data and 30% of testing data −
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 1)
Next, train the model with the help of DecisionTreeClassifier class of sklearn as follows
−
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier()
clf = clf.fit(X_train,y_train)
At last we need to make prediction. It can be done with the help of following script −
y_pred = clf.predict(X_test)
Next, we can get the accuracy score, confusion matrix and classification report as
follows −
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
result = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(result)
result1 = classification_report(y_test, y_pred)
print("Classification Report:")
print(result1)
result2 = accuracy_score(y_test,y_pred)
print("Accuracy:",result2)
Output
Confusion Matrix:
[[116 30]
[ 46 39]]
Classification Report:
              precision    recall  f1-score   support
           0       0.72      0.79      0.75       146
           1       0.57      0.46      0.51        85
   micro avg       0.67      0.67      0.67       231
   macro avg       0.64      0.63      0.63       231
weighted avg       0.66      0.67      0.66       231
Accuracy: 0.670995670995671
Experiment-9
OBJECTIVE: Write a python program for Gaussian Naïve Bayes Classifier.
Gaussian Naive Bayes is a variant of Naive Bayes that assumes a Gaussian (normal)
distribution and supports continuous data.
Naive Bayes classifiers are a group of supervised machine learning classification
algorithms based on Bayes' theorem. They are simple classification techniques but highly
effective, and they find use when the dimensionality of the inputs is high.
Bayes Theorem
Bayes Theorem can be used to calculate conditional probability. Being a powerful tool in
the study of probability, it is also applied in Machine Learning.
Naive Bayes classifiers are based on Bayes' theorem. These classifiers assume that
the value of a particular feature is independent of the value of any other feature,
given the class. In a supervised learning situation, Naive Bayes classifiers can be
trained very efficiently. They need only a small amount of training data to estimate
the parameters needed for classification.
• or both (i.e., σ)
Gaussian Naive Bayes supports continuous valued features and models each as
conforming to a Gaussian (normal) distribution.
The above illustration indicates how a Gaussian Naive Bayes (GNB) classifier works. At
every data point, the z-score distance between that point and each class-mean is
calculated, namely the distance from the class mean divided by the standard deviation of
that class.
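The z-score idea described above can be illustrated for a single feature with the following Python sketch. It is a simplified illustration under stated assumptions, not the scikit-learn implementation: the training values are hypothetical, and the names fit and predict are chosen only for this sketch.

```python
from math import log, pi, sqrt


def fit(samples):
    """samples: {class_label: [values]} for one feature.
    Returns {label: (mean, std, prior)}."""
    n_total = sum(len(v) for v in samples.values())
    params = {}
    for label, values in samples.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        params[label] = (mean, sqrt(var), len(values) / n_total)
    return params


def predict(params, x):
    """Pick the class with the highest log prior plus Gaussian log-likelihood."""
    def score(label):
        mean, std, prior = params[label]
        z = (x - mean) / std   # z-score distance of x from the class mean
        return log(prior) - log(std * sqrt(2 * pi)) - 0.5 * z * z
    return max(params, key=score)


# Hypothetical one-feature training data for two classes.
params = fit({"A": [1.0, 1.2, 0.8, 1.1], "B": [3.0, 3.3, 2.9, 3.2]})
print(predict(params, 1.3), predict(params, 2.8))
```

A point is assigned to the class whose mean it is closest to in units of that class's standard deviation, adjusted by the class prior and spread.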
Program:
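# Gaussian Naive Bayes
from sklearn import datasets
from sklearn import metrics
from sklearn.naive_bayes import GaussianNB
# load the dataset (assumed to be iris, which matches the 3x3 confusion matrix in the output)
dataset = datasets.load_iris()
# fit a Naive Bayes model to the data
model = GaussianNB()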
model.fit(dataset.data, dataset.target)
print(model)
# make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# summarize the fit of the model
print(metrics.classification_report(expected, predicted))
print(metrics.confusion_matrix(expected, predicted))
OUTPUT:
GaussianNB()
precision recall f1-score support
[[50 0 0]
[ 0 47 3]
[ 0 3 47]]
Experiment-10
OBJECTIVE: Write a Python program to predict cardiovascular disease using the K
Nearest Neighbors algorithm.
The K Nearest Neighbors algorithm, also known as KNN or k-NN, is a non-parametric,
supervised learning classifier that uses proximity to make classifications or predictions
about the grouping of an individual data point. While it can be used for either regression
or classification problems, it is typically used as a classification algorithm, working off
the assumption that similar points can be found near one another.
The K-NN working can be explained on the basis of the below algorithm:
Suppose we have a new data point and we need to put it in the required category.
Consider the below image:
o Firstly, we choose the number of neighbors; here we choose k=5.
o Next, we calculate the Euclidean distance between the new data point and the
existing data points. The Euclidean distance is the straight-line distance between
two points (x1, y1) and (x2, y2), as studied in geometry. It can be calculated as:
d = sqrt((x2 - x1)^2 + (y2 - y1)^2)
o Of the 5 nearest neighbors, 3 are from category A, so by majority vote the new
data point is assigned to category A.
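The steps above can be sketched in plain Python before turning to scikit-learn. The training points are hypothetical and were chosen so that, with k=5, three of the nearest neighbors of the query point fall in category A and two in category B; the function name knn_predict is an assumption of this sketch.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=5):
    """train: list of ((x, y), label) pairs.
    Returns the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training points in two categories.
train = [((2, 2), "A"), ((2, 3), "A"), ((3, 2), "A"), ((0, 0), "A"),
         ((4, 4), "B"), ((4, 3.5), "B"), ((6, 6), "B"), ((7, 5), "B")]

print(knn_predict(train, (3, 3), k=5))
```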
Program:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
df = pd.read_csv('heart.csv')
df.head()
sns.countplot(x=df['target'])
x= df.iloc[:,0:13].values
y= df['target'].values
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25, random_state=0)
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)
error = []
classifier= KNeighborsClassifier(n_neighbors=7)
classifier.fit(x_train, y_train)
y_pred= classifier.predict(x_test)
from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)
# Output => array([[26, 7],
#                  [ 3, 40]], dtype=int64)
accuracy_score(y_test, y_pred)