Decision Tree and KNN Assignment Two
College of Informatics
Department of Computer Science (Postgraduate program)
Machine Learning
Course code
Decision Tree and KNN Assignment Two
July 12/2022
Gondar, Ethiopia
Given the above training examples, predict the class for the following instances using a decision tree and KNN where k = 3.
X = (age=Senior, income=medium, student = yes, credit rating=excellent)
Z = (age=middle-aged, income=low, student = no, credit rating=fair)
R = (age=Youth, income=medium, student = no, credit rating=excellent).
1. Using Decision Tree
Given the training data (Buy Computer data), build a decision tree and predict the class of the
following new example (age=Senior, income=medium, student = yes, credit rating=excellent).
First we determine which attribute should be the root node by selecting the one with the highest information gain and splitting the training set on that attribute. We need to calculate the expected information required to classify the whole set and the entropy of the partition produced by each attribute; the information gain of an attribute is the total expected information minus that entropy.
The information of the total dataset with the two classes which are class-Yes and class-
No is calculated as
I (Yes, No) = -(yes/total data set) * log2 (yes/total data set) - (No/total data set) * log2 (No/total data set)
The given training set contains 14 examples with two classes, 9 of class yes and 5 of class no. From this the expected information is
I (Yes, No) = I (9, 5) = -9/14 log2 (9/14) – 5/14 log2 (5/14) =0.94.
Then we find the entropy of each attribute (age, income, student and credit rating), starting with age:
Age v=youth (Yes) = 2 age v=youth (No) = 3
Age v = middle-aged (Yes) = 4 age v =middle-aged (No) =0
Age v= Senior (Yes) =3 age v=senior (No) =2
We calculate the entropy of age (the residual information) and then the information gain:
Entropy (age) = 5/14 (-2/5 log2 (2/5)-3/5log2 (3/5)) + 4/14 (0) + 5/14 (-3/5log2 (3/5)-2/5log2 (2/5))
= 5/14(0.9709) + 0 + 5/14(0.9709)
= 0.6935
Gain (age) = 0.94 – 0.6935 = 0.2465
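The same calculation can be scripted. The minimal Python sketch below (the function and variable names are my own, not part of the original assignment) computes the expected information of the class labels and the information gain of each attribute over the 14 training examples, reproducing Gain (age) ≈ 0.247 and showing that the other attributes gain less.

from collections import Counter
from math import log2

# Training data: (age, income, student, credit_rating, class)
data = [
    ("youth",       "high",   "no",  "fair",      "no"),
    ("youth",       "high",   "no",  "excellent", "no"),
    ("middle-aged", "high",   "no",  "fair",      "yes"),
    ("senior",      "medium", "no",  "fair",      "yes"),
    ("senior",      "low",    "yes", "fair",      "yes"),
    ("senior",      "low",    "yes", "excellent", "no"),
    ("middle-aged", "low",    "yes", "excellent", "yes"),
    ("youth",       "medium", "no",  "fair",      "no"),
    ("youth",       "low",    "yes", "fair",      "yes"),
    ("senior",      "medium", "yes", "fair",      "yes"),
    ("youth",       "medium", "yes", "excellent", "yes"),
    ("middle-aged", "medium", "no",  "excellent", "yes"),
    ("middle-aged", "high",   "yes", "fair",      "yes"),
    ("senior",      "medium", "no",  "excellent", "no"),
]
ATTRIBUTES = ["age", "income", "student", "credit_rating"]

def entropy(labels):
    """Expected information I(p, n) of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def info_gain(rows, attr_index):
    """Gain(A) = I(data set) - weighted entropy after splitting on attribute A."""
    total = len(rows)
    value_counts = Counter(row[attr_index] for row in rows)
    remainder = sum(
        (count / total) * entropy([row[-1] for row in rows if row[attr_index] == value])
        for value, count in value_counts.items()
    )
    return entropy([row[-1] for row in rows]) - remainder

for index, name in enumerate(ATTRIBUTES):
    print(name, round(info_gain(data, index), 3))
# age 0.247, income 0.029, student 0.152, credit_rating 0.048 -> age becomes the root

The same two helper functions can be reused on any partition of the data, for example on the five youth examples later on when choosing the attribute for that branch.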
For Income we have three values
Income v=high (Yes) = 2 income v=high (No) = 2
Income v =medium (Yes) = 4 income v =medium (No) =2
Income v= low (Yes) =3 income v=low (No) =1
Entropy (income) = 4/14 (-2/4 log2 (2/4) - 2/4 log2 (2/4)) + 6/14 (-4/6 log2 (4/6) - 2/6 log2 (2/6)) + 4/14 (-3/4 log2 (3/4) - 1/4 log2 (1/4)) = 0.911
Gain (income) = 0.94 - 0.911 = 0.029
Computing the remaining attributes in the same way gives Gain (student) = 0.152 and Gain (credit rating) = 0.048. Age has the highest information gain, so it is chosen as the root node with three branches: youth, middle-aged and senior. Since all records under the branch age = middle-aged are of class Yes, we can replace that branch with a leaf labeled Class = Yes.
Step 2: Decide which attribute to use at the next node for the left branch (age = youth).
The same splitting process has to happen for the two remaining branches. For the branch age = youth we still have the attributes income, student and Credit_Rating. Which one should be used to split this partition?
income student Credit-rating class
High No Fair No
High No Excellent No
Medium No Fair No
Low Yes Fair Yes
Medium Yes Excellent Yes
The information of the total dataset with the two classes which are class-Yes and class-No is calculated
as I (Yes, No) = -(yes/total data set) * log2 (yes/total data set) - (No/total data set) * log2 (No/total data set).
This partition contains 5 examples, 2 of class yes and 3 of class no, so its expected information is
I (yes, no) = I (2, 3) = -2/5 log2 (2/5) – 3/5 log2 (3/5) =0.97
For this partition Entropy (student) = 0, because every student in the youth group is of class Yes and every non-student is of class No, so Gain (student) = 0.97 - 0 = 0.97, the highest of the remaining attributes. Student is therefore selected to split the youth branch. Since the two new branches (student = yes and student = no) each contain a single class, we make them leaf nodes with their respective class as label.
Again, the same process is applied to the right branch of age, which is senior.
The information of the total dataset with the two classes which are class-Yes and class-
No is calculated as
I (Yes, No) = -(yes/total data set) * log2 (yes/total data set) - (No/total data set) * log2 (No/total data set)
This partition contains 5 examples, 3 of class yes and 2 of class no, so its expected information is
I (yes, no) = I (3, 2) = -3/5 log2 (3/5) - 2/5 log2 (2/5) = 0.97
For Student we have two values
Student v=yes (Yes) = 2 Student v=yes (No) = 1
Student v=no (Yes) = 1 Student v=no (No) = 1
Entropy (student) = 3/5 (-2/3 log2 (2/3) - 1/3 log2 (1/3)) + 2/5 (-1/2 log2 (1/2) - 1/2 log2 (1/2))
= 3/5 (0.918) + 2/5 (1)
= 0.95
Gain (student) = 0.97 - 0.95 = 0.02
For Credit_Rating we have two values
Credit_Rating v=fair (Yes) = 3 Credit_Rating v=fair (No) = 0
Credit_Rating v=excellent (Yes) = 0 Credit_Rating v=excellent (No) = 2
Entropy (Credit_Rating) = 0
Gain (Credit_Rating) = 0.97 - 0 = 0.97, the highest gain, so Credit_Rating is selected to split the senior branch: credit rating = fair becomes a leaf labeled Class = Yes and credit rating = excellent becomes a leaf labeled Class = No.
Based on this decision tree we can predict the class of new examples as follows.
X = (age=Senior, income=medium, student = yes, credit rating=excellent)
Following the branch age = senior and then credit rating = excellent, we predict Class = No, so this new instance does not buy a computer.
Z = (age=middle-aged, income=low, student = no, credit rating=fair)
Following the branch age = middle-aged, we predict Class = Yes, so this new instance buys a computer.
R = (age=Youth, income=medium, student = no, credit rating=excellent).
Following the branch age = youth and then student = no, we predict Class = No, so this new instance does not buy a computer.
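As a quick check, the finished tree can also be written directly as a set of rules. The short Python sketch below (my own formulation, not part of the original assignment) encodes the tree learned above, with age at the root, student under youth, credit rating under senior, and income never tested, and it reproduces the three predictions.

def predict_buys_computer(age, income, student, credit_rating):
    """Classify an instance with the decision tree learned above."""
    if age == "middle-aged":
        return "yes"                                   # middle-aged branch is a pure Yes leaf
    if age == "youth":
        return "yes" if student == "yes" else "no"     # youth branch splits on student
    return "yes" if credit_rating == "fair" else "no"  # senior branch splits on credit rating

# The three new instances from the assignment
print(predict_buys_computer("senior", "medium", "yes", "excellent"))   # X -> no
print(predict_buys_computer("middle-aged", "low", "no", "fair"))       # Z -> yes
print(predict_buys_computer("youth", "medium", "no", "excellent"))     # R -> no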
2. Using KNN where k = 3
Let us convert all the categorical attributes into numeric codes: age (youth = 0, middle-aged = 1, senior = 2), income (low = 0, medium = 1, high = 2), student (no = 0, yes = 1) and credit rating (fair = 0, excellent = 1).
RID age income student credit-rating class: buys computer
1 0 2 0 0 No
2 0 2 0 1 No
3 1 2 0 0 Yes
4 2 1 0 0 Yes
5 2 0 1 0 Yes
6 2 0 1 1 No
7 1 0 1 1 Yes
8 0 1 0 0 No
9 0 0 1 0 Yes
10 2 1 1 0 Yes
11 0 1 1 1 Yes
12 1 1 0 1 Yes
13 1 2 1 0 Yes
14 2 1 0 1 no
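The mapping used in this table can be written as small Python dictionaries; the sketch below (names are my own) converts a categorical record into the numeric vector used for the distance calculations, e.g. the first record (youth, high, no, fair) becomes (0, 2, 0, 0) and the query X becomes (2, 1, 1, 1).

AGE     = {"youth": 0, "middle-aged": 1, "senior": 2}
INCOME  = {"low": 0, "medium": 1, "high": 2}
STUDENT = {"no": 0, "yes": 1}
CREDIT  = {"fair": 0, "excellent": 1}

def encode(age, income, student, credit_rating):
    """Return the numeric feature vector for one record."""
    return [AGE[age], INCOME[income], STUDENT[student], CREDIT[credit_rating]]

print(encode("youth", "high", "no", "fair"))           # [0, 2, 0, 0] (record 1)
print(encode("senior", "medium", "yes", "excellent"))  # [2, 1, 1, 1] (query X)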
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 2
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 1
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 1
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1
class no no yes yes yes no yes no yes yes yes yes yes no ?
Calculate the Euclidean distance, d(a, b) = sqrt(sum over i of (ai - bi)^2), between the query instance X = (2, 1, 1, 1) and all the training examples. For the first training example (0, 2, 0, 0):
d = sqrt((0-2)^2 + (2-1)^2 + (0-1)^2 + (0-1)^2) = sqrt (7) = 2.65
This is the distance between the first training example and the new instance. Using this formula we find the distance to each training example.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 2
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 1
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 1
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1
class no no yes yes yes no yes no yes yes yes yes yes no ?
distance 2.65 2.45 2 1.41 1.41 1 1.41 2.45 2.45 1 2 1.41 1.7 1
From this we select the three neighbors closest to the new example, which are instances 6, 10 and 14 (each at distance 1), and predict the new class by the majority among these three. Their classes are no, yes and no, so the majority is class no and the predicted class for X is no.
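The whole distance-and-vote procedure fits in a few lines of Python. This sketch (helper names are my own) computes the Euclidean distances from a query to the 14 encoded training examples and takes the majority class among the k = 3 closest; ties at equal distance are broken by the order of the training records, which matches the hand calculation for X and Z but is only one reasonable choice for R, where four records tie at distance 1.

from collections import Counter
from math import sqrt

# Encoded training set: feature vector (age, income, student, credit_rating) and class
train = [
    ([0, 2, 0, 0], "no"),  ([0, 2, 0, 1], "no"),  ([1, 2, 0, 0], "yes"),
    ([2, 1, 0, 0], "yes"), ([2, 0, 1, 0], "yes"), ([2, 0, 1, 1], "no"),
    ([1, 0, 1, 1], "yes"), ([0, 1, 0, 0], "no"),  ([0, 0, 1, 0], "yes"),
    ([2, 1, 1, 0], "yes"), ([0, 1, 1, 1], "yes"), ([1, 1, 0, 1], "yes"),
    ([1, 2, 1, 0], "yes"), ([2, 1, 0, 1], "no"),
]

def knn_predict(query, k=3):
    """Majority class among the k training examples nearest to the query."""
    distance = lambda a, b: sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(train, key=lambda example: distance(query, example[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(knn_predict([2, 1, 1, 1]))   # X -> no  (nearest records 6, 10, 14)
print(knn_predict([1, 0, 0, 0]))   # Z -> yes (several records tie at distance 1.41)
print(knn_predict([0, 1, 0, 1]))   # R -> no  (records 2, 8, 11, 12 tie at distance 1)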
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 2
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 1
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 1
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1
class no no yes yes yes no yes no yes yes yes yes yes no no
distance 2.65 2.45 2 1.41 1.41 1 1.41 2.45 2.45 1 2 1.41 1.7 1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 16(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 1
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 0
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 0
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 0
class no no yes yes yes no yes no yes yes yes yes yes no ?
Calculate the Euclidean distance between the query instance Z = (1, 0, 0, 0) and all the training examples. For the first training example (0, 2, 0, 0):
d = sqrt((0-1)^2 + (2-0)^2 + (0-0)^2 + (0-0)^2) = sqrt (5) = 2.23
This is the distance between the first training example and the new instance. Using this formula we find the distance to each training example.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 16(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 1
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 0
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 0
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 0
class no no yes yes yes no yes no yes yes yes yes yes no ?
distance 2.23 2.45 2 1.41 1.41 1.7 1.41 1.41 1.41 1.7 2 1.41 2.23 1.7
From this we select the three neighbors closest to the new example. The smallest distance is 1.41, shared by instances 4, 5, 7, 8, 9 and 12; whichever three of these tied neighbors are chosen, at most one of them (instance 8) is of class no, so the majority is class yes and the predicted class for Z is yes.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 16(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 1
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 0
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 0
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 0
class no no yes yes yes no yes no yes yes yes yes yes no yes
distance 2.23 2.45 2 1.41 1.41 1.7 1.41 1.41 1.41 1.7 2 1.41 2.23 1.7
1 2 3 4 5 6 7 8 9 10 11 12 13 14 17(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 0
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 1
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 0
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1
class no no yes yes yes no yes no yes yes yes yes yes no ?
Calculate the Euclidean distance between the query instance R = (0, 1, 0, 1) and all the training examples. For the first training example (0, 2, 0, 0):
d = sqrt((0-0)^2 + (2-1)^2 + (0-0)^2 + (0-1)^2) = sqrt (2) = 1.41
This is the distance between the first training example and the new instance. Using this formula we find the distance to each training example.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 17(new)
age 0 0 1 2 2 2 1 0 0 2 0 1 1 2 0
income 2 2 2 1 0 0 0 1 0 1 1 1 2 1 1
student 0 0 0 0 1 1 1 0 1 1 1 0 1 0 0
Credit-rating 0 1 0 0 0 1 1 0 0 0 1 1 0 1 1
class no no yes yes yes no yes no yes yes yes yes yes no no
distance 1.41 1 1.7 2.23 2.65 2.45 1.7 1 1.7 2.45 1 1 2 2
From this we select the three neighbors closest to the new example. The smallest distance is 1, shared by instances 2, 8, 11 and 12, whose classes are no, no, yes and yes. Taking the three lowest-numbered of these tied neighbors (2, 8 and 11), the majority is class no, therefore the predicted class for R is no.
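If scikit-learn is available, the same three predictions can be cross-checked with its built-in classifier. This is only a sanity check of the manual work above; note that with several training examples tied at the same distance, the library's tie-breaking need not pick the same three neighbors chosen by hand.

from sklearn.neighbors import KNeighborsClassifier

X_train = [
    [0, 2, 0, 0], [0, 2, 0, 1], [1, 2, 0, 0], [2, 1, 0, 0], [2, 0, 1, 0],
    [2, 0, 1, 1], [1, 0, 1, 1], [0, 1, 0, 0], [0, 0, 1, 0], [2, 1, 1, 0],
    [0, 1, 1, 1], [1, 1, 0, 1], [1, 2, 1, 0], [2, 1, 0, 1],
]
y_train = ["no", "no", "yes", "yes", "yes", "no", "yes",
           "no", "yes", "yes", "yes", "yes", "yes", "no"]

knn = KNeighborsClassifier(n_neighbors=3)   # Euclidean distance by default
knn.fit(X_train, y_train)
print(knn.predict([[2, 1, 1, 1],    # query X
                   [1, 0, 0, 0],    # query Z
                   [0, 1, 0, 1]]))  # query R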
RID Age Income Student Credit-rating Class: buys-computer
1 Youth High No Fair No
2 Youth High No Excellent No
3 Middle-aged High No Fair Yes
4 Senior Medium No Fair Yes
5 Senior Low Yes Fair Yes
6 Senior Low Yes Excellent No
7 Middle-aged Low Yes Excellent Yes
8 Youth Medium No Fair No
9 Youth Low Yes Fair Yes
10 Senior Medium Yes Fair Yes
11 Youth Medium Yes Excellent Yes
12 Middle-aged Medium No Excellent Yes
13 Middle-aged High Yes Fair Yes
14 Senior Medium No Excellent No
15 senior Medium yes excellent No
16 Middle-aged low no Fair yes
17 youth medium no excellent No