2025 Lecture07 P1 ID3
Supervised learning: Training
• Consider a labeled training set of 𝑁 examples.
(𝑥1, 𝑦1), (𝑥2, 𝑦2), … , (𝑥𝑁 , 𝑦𝑁 )
• where each 𝑦𝑗 was generated by an unknown function 𝑦 = 𝑓(𝑥).
• The output 𝑦𝑗 is called the ground truth, i.e., the true answer that the model must predict.
Supervised learning: Hypothesis space
• ℎ is drawn from a hypothesis space 𝐻 of possible functions.
• E.g., 𝐻 might be the set of polynomials of degree 3, or the set of 3-SAT Boolean logic formulas.
• Choose 𝐻 using prior knowledge about the process that generated the data, or by exploratory data analysis (EDA).
• EDA examines the data with statistical tests and visualizations to get some insight into what hypothesis space might be appropriate.
• Alternatively, try multiple hypothesis spaces and evaluate which one works best.
Supervised learning: Hypothesis
• The hypothesis ℎ is consistent if it agrees with the true function 𝑓 on all training observations, i.e., ∀𝑥𝑖: ℎ(𝑥𝑖) = 𝑦𝑖.
• For continuous data, we instead look for a best-fit function for which each ℎ(𝑥𝑖) is close to 𝑦𝑖.
• Ockham’s razor: Select the simplest consistent hypothesis.
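A minimal sketch of the consistency check (illustrative Python, not from the slides; the hypothesis h and the training pairs are made up for the example):

# A hypothesis is consistent if it reproduces every training label exactly.
train = [(1, 1), (2, 4), (3, 9)]           # hypothetical (x, y) pairs with y = x**2
h = lambda x: x * x                        # a candidate hypothesis drawn from H
consistent = all(h(x) == y for x, y in train)
print(consistent)                          # True: h agrees with f on all observations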
Supervised learning: Hypothesis
Finding hypotheses to fit data. Top row: four plots of best-fit functions from
four different hypothesis spaces trained on data set 1. Bottom row: the same
four functions, but trained on a slightly different data set (sampled from the
same 𝑓(𝑥) function).
Supervised learning: Testing
• The quality of the hypothesis ℎ depends on how accurately it
predicts the observations in the test set → generalization.
• The test set must come from the same distribution over the example space as the training set.
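A small sketch of how generalization is measured (illustrative Python, not from the slides; the split and the data are assumptions):

# Hold out part of the labeled data and measure accuracy on the unseen part.
data = [(x, x * x) for x in range(20)]     # hypothetical labeled examples
train, test = data[:15], data[15:]         # simple train/test split
h = lambda x: x * x                        # hypothesis fit on the training set
accuracy = sum(h(x) == y for x, y in test) / len(test)
print(accuracy)                            # 1.0 here; in general, lower than training accuracy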
ID3
Decision Tree
What is a decision tree?
• A decision tree is a supervised learning (SL) algorithm that predicts the output by learning decision rules inferred from the features in the data.
[Diagram: Data → Learning algorithm → Decision tree]
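For intuition, a learned tree is just a nested sequence of attribute tests. A minimal Python sketch for the restaurant example used below (illustrative only; the exact tests depend on the data, and Patrons and Hungry are taken from the splits discussed on later slides):

def will_wait(example):
    # Each internal node tests one attribute; each leaf returns a prediction.
    if example["Patrons"] == "None":
        return "No"
    if example["Patrons"] == "Some":
        return "Yes"
    # Patrons == "Full": fall back on a second test
    return "Yes" if example["Hungry"] == "Yes" else "No"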
Example problem: Restaurant waiting
Learning decision trees
• Divide and conquer: split the data into smaller and smaller subsets.
• Splits are usually on a single variable.
[Diagram: a root test x1 > ? with yes/no branches, each leading to a further test x2 > ?]
Learning decision trees
Splitting the examples by testing on attributes. At each node we show the positive
(light boxes) and negative (dark boxes) examples remaining. (a) Splitting on Type
brings us no nearer to distinguishing between positive and negative examples. (b)
Splitting on Patrons does a good job of separating positive and negative examples.
After splitting on Patrons, Hungry is a fairly good second test.
ID3 Decision tree: Pseudo-code
The decision tree learning algorithm. The function PLURALITY-VALUE selects the
most common output value among a set of examples, breaking ties randomly.
ID3 Decision tree: Pseudo-code
function LEARN-DECISION-TREE(examples, attributes, parent_examples) returns a tree
  …
  else   (case 1: there are still attributes left to split the examples)
    A ← argmax_{a ∈ attributes} IMPORTANCE(a, examples)
    tree ← a new decision tree with root test A
    for each value v of A do
      exs ← {e : e ∈ examples and e.A = v}
      subtree ← LEARN-DECISION-TREE(exs, attributes − A, examples)
      add a branch to tree with label (A = v) and subtree subtree
    return tree
The decision tree learning algorithm. The function IMPORTANCE evaluates the usefulness of attributes.
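A runnable Python sketch of this pseudocode (not from the slides; names such as learn_decision_tree and the "class" key are illustrative assumptions, and information gain is used as the IMPORTANCE measure):

from collections import Counter
from math import log2

def plurality_value(examples):
    # Most common output value among a set of examples (ties broken arbitrarily here).
    return Counter(e["class"] for e in examples).most_common(1)[0][0]

def entropy(examples):
    counts = Counter(e["class"] for e in examples)
    return -sum(c / len(examples) * log2(c / len(examples)) for c in counts.values())

def importance(attr, examples):
    # Information gain: parent entropy minus expected entropy after splitting on attr.
    remainder = 0.0
    for v in {e[attr] for e in examples}:
        subset = [e for e in examples if e[attr] == v]
        remainder += len(subset) / len(examples) * entropy(subset)
    return entropy(examples) - remainder

def learn_decision_tree(examples, attributes, parent_examples):
    if not examples:                                  # case 3: no examples left at this branch
        return plurality_value(parent_examples)
    if len({e["class"] for e in examples}) == 1:      # case 2: all examples have the same class
        return examples[0]["class"]
    if not attributes:                                # no attributes left: fall back to plurality
        return plurality_value(examples)
    A = max(attributes, key=lambda a: importance(a, examples))   # case 1: best attribute
    tree = {A: {}}
    for v in {e[A] for e in examples}:
        exs = [e for e in examples if e[A] == v]
        subtree = learn_decision_tree(exs, [a for a in attributes if a != A], examples)
        tree[A][v] = subtree                          # branch labelled (A = v)
    return tree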
ID3 Decision tree algorithm
1. There are some positive and some negative examples → choose the best attribute to split them.
2. The remaining examples are all positive (or all negative) → DONE, it is possible to answer Yes or No.
3. No examples are left at a branch → return a default value.
   • No example has been observed for this combination of attribute values.
   • The default value is calculated from the plurality classification of all the examples that were used in constructing the node’s parent.
Decision tree: Inductive learning
• Simplest: Construct a decision tree with one leaf for every example
  → memory-based learning
  → worse generalization
A purity measure with entropy
• The entropy of a random variable 𝑉 with values 𝑣𝑘, each having probability 𝑃(𝑣𝑘), measures the uncertainty of 𝑉 and is defined as
  H(V) = Σ_k P(v_k) log2(1 / P(v_k)) = - Σ_k P(v_k) log2 P(v_k)
• It is a fundamental quantity in information theory.
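As a quick numerical check (an illustrative Python sketch, not from the slides):

from math import log2

def entropy(probs):
    # H(V) = -sum_k P(v_k) log2 P(v_k); terms with zero probability contribute 0.
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # 1.0  bit: maximal uncertainty for a Boolean variable
print(entropy([0.99, 0.01]))  # ~0.08 bit: almost certain, little uncertainty
print(entropy([1.0, 0.0]))    # 0.0  bit: no uncertainty at all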
A purity measure with entropy
• Entropy is maximal when all possibilities are equally likely.
[Diagram: a split whose Yes and No branches each still contain 3 Y, 3 N, so the uncertainty is not reduced]
Example problem: Restaurant waiting
Sat/Fri? split: Yes → 2 Y, 3 N; No → 4 Y, 3 N
Example problem: Restaurant waiting
Hungry? split: Yes → 5 Y, 2 N; No → 1 Y, 4 N
Example problem: Restaurant waiting
Raining? split: Yes → 3 Y, 2 N; No → 3 Y, 4 N
Example problem: Restaurant waiting
Reservation? split: Yes → 3 Y, 2 N; No → 3 Y, 4 N
Example problem: Restaurant waiting
Type? split: French → 1 Y, 1 N; Italian → 1 Y, 1 N; Thai → 2 Y, 2 N; Burger → 2 Y, 2 N
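As a worked check (computed from the counts above; these numbers are not on the slides): splitting on Type leaves every branch with equal numbers of Y and N, so the average entropy stays at 1 bit and the information gain is 0. Splitting on Hungry gives
H_yes = - 5/7 log2(5/7) - 2/7 log2(2/7) ≈ 0.863
H_no = - 1/5 log2(1/5) - 4/5 log2(4/5) ≈ 0.722
AE = 7/12 · 0.863 + 5/12 · 0.722 ≈ 0.804
i.e., a gain of about 1 - 0.804 ≈ 0.196 bits over the parent entropy of 1 bit, so Hungry is a much better test than Type.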
Another numerical example
Example data set: Weather data
outlook temperature humidity windy play
sunny hot high false no
sunny hot high true no
overcast hot high false yes
rainy mild high false yes
rainy cool normal false yes
rainy cool normal true no
overcast cool normal true yes
sunny mild high false no
sunny cool normal false yes
rainy mild normal false yes
sunny mild normal true yes
overcast mild high true yes
overcast hot normal false yes
rainy mild high true no
Numerical example: Choose the root
Splitting on outlook: sunny → 2+/3-, overcast → 4+/0-, rainy → 3+/2-
H_sunny = - 2/5 log2(2/5) - 3/5 log2(3/5) = 0.971
H_overcast = - 4/4 log2(4/4) - 0/4 log2(0/4) = 0
H_rainy = - 3/5 log2(3/5) - 2/5 log2(2/5) = 0.971
AE = 5/14 · 0.971 + 4/14 · 0 + 5/14 · 0.971 = 0.694
Numerical example: Choose the root
Splitting on humidity: high → 3+/4-, normal → 6+/1-
Splitting on windy: true → 3+/3-, false → 6+/2-
H_true = - 3/6 log2(3/6) - 3/6 log2(3/6) = 1
H_false = - 6/8 log2(6/8) - 2/8 log2(2/8) = 0.811
AE = 6/14 · 1 + 8/14 · 0.811 = 0.892
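Outlook has the lowest average entropy (0.694 vs. 0.892 for windy), so it is chosen as the root. A small Python sketch (illustrative names, not from the slides) that reproduces these numbers from the weather table:

from collections import Counter
from math import log2

# Weather data set from the slides: (outlook, temperature, humidity, windy, play)
rows = [
    ("sunny", "hot", "high", "false", "no"), ("sunny", "hot", "high", "true", "no"),
    ("overcast", "hot", "high", "false", "yes"), ("rainy", "mild", "high", "false", "yes"),
    ("rainy", "cool", "normal", "false", "yes"), ("rainy", "cool", "normal", "true", "no"),
    ("overcast", "cool", "normal", "true", "yes"), ("sunny", "mild", "high", "false", "no"),
    ("sunny", "cool", "normal", "false", "yes"), ("rainy", "mild", "normal", "false", "yes"),
    ("sunny", "mild", "normal", "true", "yes"), ("overcast", "mild", "high", "true", "yes"),
    ("overcast", "hot", "normal", "false", "yes"), ("rainy", "mild", "high", "true", "no"),
]
attrs = {"outlook": 0, "temperature": 1, "humidity": 2, "windy": 3}

def entropy(labels):
    counts = Counter(labels)
    return -sum(c / len(labels) * log2(c / len(labels)) for c in counts.values())

def average_entropy(attr):
    # Weighted entropy of the play label after splitting on attr (the AE on the slides).
    i = attrs[attr]
    total = 0.0
    for v in {r[i] for r in rows}:
        subset = [r[-1] for r in rows if r[i] == v]
        total += len(subset) / len(rows) * entropy(subset)
    return total

for a in attrs:
    print(a, round(average_entropy(a), 3))   # outlook 0.694, windy 0.892, ...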
Numerical example: The partial tree
[Partial tree: outlook at the root; sunny branch → 2+/3- (still mixed), overcast branch → yes, rainy branch → 3+/2- (still mixed)]
Numerical example: The second level
• Choose an attribute for the branch outlook = sunny.
outlook temperature humidity windy play
sunny hot high false no
sunny hot high true no
sunny mild high false no
sunny cool normal false yes
sunny mild normal true yes
[Resulting tree: outlook at the root; sunny → humidity (high: no, normal: yes); overcast → yes; rainy → windy (true: no, false: yes)]
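A quick check on the sunny subset above (not shown on the slide): splitting on humidity gives high → 0 Y, 3 N and normal → 2 Y, 0 N, so both branches are pure and the average entropy is 0; splitting on windy gives true → 1 Y, 1 N and false → 1 Y, 2 N with AE = 2/5 · 1 + 3/5 · 0.918 ≈ 0.951. Humidity is therefore chosen for the sunny branch; the rainy branch is resolved the same way, where windy gives pure subsets.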
Quiz 01: ID3 decision tree
• The data represent files on a computer system. Possible values of the class variable are “infected”, meaning the file has a virus infection, or “clean” if it doesn't.
• Derive a decision tree for virus identification.