DataMining_Chapter3
In this chapter we present decision trees, a model widely used in Data Mining.
3.1 INTRODUCTION
For certain fields of application, it is essential to produce classification procedures that the
user can understand. This is particularly the case for medical diagnosis, where the doctor
needs to be able to interpret the reasons for the diagnosis. Decision trees meet this
requirement because they graphically represent a set of rules and are easy to interpret.
3.2.1 Aim
A decision tree models a hierarchy of tests on the values of a set of variables called attributes.
At the end of these tests, the model (the decision tree) produces a numerical value or selects
an element from a discrete set of conclusions. The former is known as regression and the
latter as classification. For example, the following decision tree (figure 3.1) models a problem
where we wish to classify individuals into two classes {sick, healthy} according to the values
taken by two descriptors: "temperature" and "sore throat".
Fig 3.2. Example of a regression decision tree.
The internal nodes of a decision tree are called decision nodes. These nodes are labelled with
a test that can be applied to any description of an individual in the population. In general, each
test examines the value of a single attribute in the description space. The possible answers to
the test correspond to the labels of the arcs originating from this node. In the case of binary
decision nodes, the labels of the arcs are omitted and, by convention, the left arc corresponds
to a positive response to the test. Leaves are labelled by a class.
A decision tree is the graphical representation of a classification procedure. Each complete
description is associated with a single leaf of the decision tree. This association is defined by
starting at the root of the tree and moving down the tree according to the responses to the tests
that label the internal nodes. The associated class is then the default class associated with the
leaf that corresponds to the description. The classification procedure obtained is immediately
translated into decision rules. The rule systems obtained are special in that the order in which
the attributes are examined is fixed and the decision rules are mutually exclusive.
If (Temperature <= 37)
  then if (Sore throat)
    then Class = "sick"
    else Class = "healthy"
  else Class = "sick"
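Read as executable code, these rules translate directly into a small classification function. The following Python sketch is one way to write them; the function name and the representation of a description by a numeric temperature and a boolean sore_throat flag are choices made for this illustration.

def classify(temperature, sore_throat):
    # Decision rules of figure 3.1: temperature is tested first, then sore throat.
    if temperature <= 37:
        if sore_throat:
            return "sick"
        return "healthy"
    return "sick"

print(classify(36.5, True))    # "sick"
print(classify(36.5, False))   # "healthy"
print(classify(39.0, False))   # "sick"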
Position of a node: The nodes of a tree are identified by positions, numbers obtained by concatenating the level of the node with its rank from the left at that level (figure 3.4). The root is denoted Ø. For example, position 11 refers to the node at level 1 that is first from the left.
Given a sample S, a set of classes {1, ..., c} and a decision tree T, each position pos of T corresponds to a subset of the sample: the set of examples that satisfy the tests from the root down to that position. Consequently, for any position pos of T, we can define the following quantities:
N(pos) is the number of examples associated with pos,
N(k/pos) is the number of examples associated with pos that belong to class k,
P(k/pos) = N(k/pos)/N(pos) is the proportion of examples of class k at position pos.
Example: Consider the decision tree from the previous example, together with a sample of 200 patients: 100 are sick and 100 are healthy. The distribution between the two classes S (Sick) and H (Healthy) is given by:

                      Sore throat              No sore throat
Temperature <= 37     (0 Healthy, 38 Sick)     (100 Healthy, 0 Sick)
We return to the tree, adding the associated examples to each node (figure 3.5).
Fig 3.5 Decision tree with examples associated with each node
Here is the calculation of these quantities at the root.
We then have: N(Ø)=200; N(H/Ø)=100; N(S/Ø)=100; P(H/Ø)=100/200 and
P(S/Ø)=100/200.
The disorder at a position pos is measured by its entropy:

Entropy(pos) = − Σk P(k/pos) log2 P(k/pos), where the sum runs over the classes k = 1, ..., c        (equation 3.1)

At the root this gives:

Entropy(Ø) = − (100/200) log2(100/200) − (100/200) log2(100/200) = 1.00
In general, entropy decreases as we go down the tree, until it reaches zero at the leaf level.
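As an illustration, here is a small Python sketch of the entropy of equation 3.1, computed from the class counts at a position; the function name and the choice of raw counts as input are assumptions made for this example, not notation from the chapter.

import math

def entropy(class_counts):
    # Entropy of a position, given the number N(k/pos) of examples in each class k.
    total = sum(class_counts)
    result = 0.0
    for n in class_counts:
        if n > 0:                  # the term for an empty class is taken to be 0
            p = n / total          # P(k/pos) = N(k/pos) / N(pos)
            result -= p * math.log2(p)
    return result

print(entropy([100, 100]))   # 1.0 at the root: maximum disorder for two classes
print(entropy([0, 38]))      # 0.0 at a pure leaf, e.g. (0 Healthy, 38 Sick)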
All decision tree construction methods rely on the following three operators:
1. Decide whether a node is terminal, i.e. whether it should be labelled as a leaf. For example: all the examples at the node belong to the same class, or fewer than a given number of them are misclassified, ...
2. Select a test to associate with a node. For example: at random, using statistical criteria, etc.
3. Assign a class to a leaf. The majority class is assigned, except where cost or risk functions are used.
The methods differ in the choices made for these operators, i.e. the choice of the test (for example, the use of entropy and the gain function) and the stopping criterion (when to stop growing the tree, i.e. when to decide that a node is terminal). The general outline of the algorithms is as follows:
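One possible rendering of this general outline is the illustrative Python sketch below, built only from the three operators above; the representation of an example as a (description, label) pair and the helper parameters is_terminal, select_test and assign_class are assumptions made for this sketch.

def build_tree(examples, is_terminal, select_test, assign_class):
    # Generic greedy construction of a decision tree from the three operators:
    #   is_terminal(examples)  -> True if the node must become a leaf (operator 1)
    #   select_test(examples)  -> a test, i.e. a function description -> answer (operator 2)
    #   assign_class(examples) -> the class given to a leaf, usually the majority class (operator 3)
    if is_terminal(examples):
        return {"leaf": assign_class(examples)}
    test = select_test(examples)
    # Partition the examples according to their answers to the test
    partition = {}
    for description, label in examples:
        partition.setdefault(test(description), []).append((description, label))
    children = {answer: build_tree(subset, is_terminal, select_test, assign_class)
                for answer, subset in partition.items()}
    return {"test": test, "children": children}

The recursion stops when is_terminal declares a node to be a leaf, which is exactly the role of the stopping criterion discussed above.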
With such an algorithm, it is possible to calculate a decision tree with little or no apparent
error. A perfect decision tree is one in which all the examples in the training set are correctly
classified. Such a tree does not always exist (for example, when two examples have identical descriptions but belong to different classes). The aim is to build a tree with the smallest
possible classification error.
The following table represents the PlayTennis dataset presented by Quinlan himself to introduce the ID3 algorithm. Note that all the variables (corresponding to the columns) take discrete values.
The ID3 algorithm starts with a table whose data has already been classified (labelled). From
this table, the algorithm constructs a decision tree which can predict the class of each of the
data items in the table, and even the class of new data (which does not appear in the dataset).
Sky, Temperature, Humidity and Wind are the four attributes that describe the data. We can
see that the dataset contains just 14 rows corresponding to situations in which tennis players
accept or refuse to play depending on the values taken by the attributes describing the weather
conditions. In reality, however, there are 36 possible distinct examples if we let each attribute range over all the values it can take:
|{Sunny, Overcast, Rainy}| × |{Warm, Medium, Cool}| × |{High, Normal}| × |{Weak, Strong}| = 3 × 3 × 2 × 2 = 36
The ID3 algorithm is based on the machine-learning notions of attributes and classes (discrete classification). At each node, it looks for the most relevant attribute to test, so that the resulting tree is as short as possible.
To find the attribute to test, we use the entropy defined in the previous section.
Initially, the algorithm considers the whole dataset S = {J1, J2, J3, ..., J14}. Since 9 of the 14 examples have the decision (class) Yes and 5 of the 14 have the decision No, we can calculate the following proportions and the initial entropy:
P(Yes/S) = 9/14 and P(No/S) = 5/14

Entropy(S) = − (9/14) log2(9/14) − (5/14) log2(5/14) ≈ 0.94        (equation 3.2)
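Using the entropy function sketched after equation 3.1, this value can be checked directly from the class counts (9 Yes, 5 No):

print(round(entropy([9, 5]), 2))   # 0.94, the initial entropy of the PlayTennis dataset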
Now that we know that the initial entropy of the dataset is 0.94, we need to know which
attribute to test first, then second, and so on.
To find out which attribute to test, we use the notion of entropy gain. The gain is defined for a set of examples and an attribute: it measures how much testing this attribute reduces the disorder of the set. The higher the gain of an attribute, the more useful it is to test it, because the test splits the set into smaller subsets with lower entropy.
Here is the formula that calculates the entropy gain for a set S and an attribute A.
Gain(S, A) = Entropy(S) − Σ v ∈ Values(A) (|Sv| / |S|) × Entropy(Sv)        (equation 3.3)

where Values(A) is the set of values that the attribute A can take and Sv is the subset of the examples of S for which A takes the value v.
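As an illustration, equation 3.3 can be written in Python by reusing the entropy function sketched earlier; the representation of an example as a (description, label) pair, where description is a dictionary of attribute values, is an assumption made for this sketch.

def gain(examples, attribute):
    # Entropy gain of splitting the set `examples` on `attribute` (equation 3.3).
    def class_counts(subset):
        counts = {}
        for _, label in subset:
            counts[label] = counts.get(label, 0) + 1
        return list(counts.values())

    total = len(examples)
    # Partition S into the subsets S_v, one per value v taken by the attribute
    subsets = {}
    for description, label in examples:
        subsets.setdefault(description[attribute], []).append((description, label))
    remainder = sum(len(s) / total * entropy(class_counts(s)) for s in subsets.values())
    return entropy(class_counts(examples)) - remainder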
The attribute tested at this node of the tree is the one that reduces entropy the most, i.e. the one with the highest gain.
Taking the example again, and considering S as the initial set, to determine which attribute to
test, we need to calculate the gain of all the attributes.
Here is a summary of the calculations made:
The calculations show that: Gain(S, Temperature) < Gain(S, Wind) < Gain(S, Humidity) <
Gain(S, Sky). The greatest gain is for Sky. Sky is therefore the first attribute tested in the tree.
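As a numerical check, Gain(S, Sky) can be computed from the class counts of each branch. The per-branch counts used below (Sunny: 2 Yes / 3 No, Overcast: 4 Yes / 0 No, Rainy: 3 Yes / 2 No) come from Quinlan's usual version of the dataset and are assumed here to match the chapter's table.

branches = {"Sunny": [2, 3], "Overcast": [4, 0], "Rainy": [3, 2]}   # (Yes, No) counts
total = sum(sum(counts) for counts in branches.values())            # 14 examples
remainder = sum(sum(counts) / total * entropy(counts) for counts in branches.values())
print(round(entropy([9, 5]) - remainder, 3))   # ≈ 0.247, the gain of Sky, the largest of the four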
If we look at each child node, we see that for the Overcast node all the examples are positive. There is therefore no attribute left to test there: we can label this node Yes directly. The following figure shows the decision tree obtained after this first iteration.
Fig 3.6 Tree after the first iteration of its creation with ID3
We now need to add test nodes below Sunny and Rainy, because the examples there still contain a mixture of classes. Let us first determine, for Sunny, which attribute is best to test, again using the entropy gain. It is no longer useful to compute the gain of Sky, as it has just been used. The results of the calculation are given directly:
We can see that Gain(Ssunny, Wind) < Gain(Ssunny, Temperature) < Gain(Ssunny, Humidity). The largest gain is for Humidity. Note that this gain is equal to the entropy of Ssunny, which means that every child of the Humidity node will be pure and will directly give a class (label). Here is the tree after the second iteration of ID3.
Fig 3.7 Tree after the second iteration of its creation with ID3
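The claim that the gain of Humidity equals the entropy of Ssunny can also be checked numerically, assuming, as in Quinlan's standard data, that the Sunny branch contains 2 Yes / 3 No examples and that Humidity splits it into a pure Normal subset (2 Yes) and a pure High subset (3 No):

s_sunny = [2, 3]                        # (Yes, No) counts on the Sunny branch
pure_subsets = [[2, 0], [0, 3]]         # Humidity = Normal, Humidity = High
remainder = sum(sum(c) / sum(s_sunny) * entropy(c) for c in pure_subsets)
print(round(entropy(s_sunny) - remainder, 3))   # 0.971: the gain equals the entropy of Ssunny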
We still have to continue the tree on the Rainy edge. Here are the gains for the different
attributes:
We can see that Gain(Srainy, Temperature) ≤ Gain(Srainy, Humidity) < Gain(Srainy, Wind). The largest gain is 0.971 and is obtained for Wind. We therefore test Wind and, since this gain is equal to the entropy of Srainy, each of Wind's child nodes will be pure and can be labelled directly. Here is the final tree.
We can check that this tree gives the correct prediction for each of the 14 cases in the dataset
used to construct it. For example, for case number 1 (Sky="Sunny", Temperature="Warm",
Humidity="High", Wind="Weak"), the tree gives the class "No", which is consistent with
what exists in the training dataset.
But the tree also allows predictions to be made about new cases that do not exist in the
dataset. For example, for a new case (Sky="Sunny", Temperature="Cool", Humidity="High", Wind="Weak"), the tree gives the class "No".
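To summarise, the final tree can be written as a small prediction function. The sketch below follows the structure described above (Sky at the root, Humidity below Sunny, Wind below Rainy); the labels of the Sunny/Normal, Rainy/Weak and Rainy/Strong leaves follow the usual PlayTennis result and are assumed here to match the chapter's final figure.

def play_tennis(sky, temperature, humidity, wind):
    # Prediction with the final ID3 tree; note that Temperature is never tested.
    if sky == "Overcast":
        return "Yes"
    if sky == "Sunny":
        return "No" if humidity == "High" else "Yes"     # assumed: Normal -> Yes
    return "Yes" if wind == "Weak" else "No"             # sky == "Rainy"; assumed leaf labels

print(play_tennis("Sunny", "Warm", "High", "Weak"))   # "No", case number 1 of the dataset
print(play_tennis("Sunny", "Cool", "High", "Weak"))   # "No", the new case above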
EXERCISES
Exercise 3.1: Recall the general objective of the Decision Tree (DT) model.
Exercise 3.2: What is the difference between a classification DT and a regression DT?
Exercise 3.3: Consider the following classification Decision Tree. The classes are {class1, class2}.
1/ Translate the tree into a set of rules.
2/ Transform the tree into a binary Decision Tree.
Exercise 3.4: Consider the DT presented in this chapter (medical diagnosis). We have a
sample of 200 patients. In this sample, 100 are healthy and 100 are sick. The distribution
between the two classes H (Healthy) and S (Sick) is given in the following table: