DECISION TREE
LEARNING
MODULE - 2
INTRODUCTION
• Decision tree learning is one of the most widely used methods for inductive
inference
• It is used to approximate discrete-valued functions, and is robust to noisy
data
• It is capable of learning disjunctive expressions
• The learned function is represented by a decision tree.
• Disjunctive Expressions – (A ∧ B ∧ C) ∨ (D ∧ E ∧ F)
CONSIDER THE DATASET
Day Outlook Temperature Humidity Wind PlayTennis
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
DECISION TREE REPRESENTATION
• Each internal node tests an
attribute
• Each branch corresponds to an
attribute value
• Each leaf node assigns a
classification
PlayTennis: This decision tree classifies Saturday mornings
according to whether or not they are suitable for playing tennis
REPRESENTATION
internal node = attribute test
branch = attribute value
leaf node = classification
DECISION TREE REPRESENTATION -
CLASSIFICATION
• An example is classified by
sorting it through the tree from
the root to the leaf node
• Example – (Outlook = Sunny,
Humidity = High) =>
(PlayTennis = No)
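The sorting procedure above can be sketched in code. The nested-dict encoding of the PlayTennis tree below is our own choice, not something given on the slides:

```python
# PlayTennis tree from the figure: internal node = (attribute, {value: subtree}),
# leaf = class label. The encoding is illustrative, not from the slides.
TREE = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain":     ("Wind", {"Strong": "No", "Weak": "Yes"}),
})

def classify(tree, example):
    """Sort the example down the tree, following the branch that matches
    its value for each tested attribute, until a leaf is reached."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

print(classify(TREE, {"Outlook": "Sunny", "Humidity": "High"}))  # No
```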
DECISION TREE REPRESENTATION
• In general, decision trees represent a disjunction of conjunctions
of constraints on the attribute values of instances
• Example – the PlayTennis tree corresponds to:
(Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)
APPROPRIATE PROBLEMS FOR DECISION
TREE LEARNING
• Instances are represented by attribute-value pairs
• Target function has discrete output values
• Disjunctive hypothesis may be required
• Possibly noisy data
• Training data may contain errors
• Training data may contain missing attribute values
• Examples – Classification Problems
1. Equipment or medical diagnosis
2. Credit risk analysis
DECISION TREE ALGORITHMS
• Decision tree algorithms employ a top-down greedy
search through the space of possible decision trees
• ID3 (Iterative Dichotomizer 3) by Quinlan (1986) and
C4.5 by Quinlan (1993) use this approach
• Most algorithms that have been developed for
learning decision trees are variations of this core
algorithm
BASIC ID3 LEARNING ALGORITHM APPROACH
• Top-down construction of the tree, beginning with the question "which
attribute should be tested at the root of the tree?"
• Each instance attribute is evaluated using a statistical test to determine
how well it alone classifies the training examples.
• The best attribute is selected and used as the test at the root node of the
tree.
• A descendant of the root node is then created for each possible value of
this attribute.
• The training examples are sorted to the appropriate descendant node
• The entire process is then repeated at the descendant node using the
training examples associated with each descendant node
• GREEDY Approach
• No Backtracking - So we may get a suboptimal solution.
ID3 ALGORITHM
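The slide's figure shows Mitchell's ID3 pseudocode; the following is a runnable Python sketch of the same procedure. The nested-tuple tree encoding and helper names are our own assumptions, not the textbook's:

```python
# Runnable sketch of ID3: greedy, top-down, no backtracking.
# Internal node = (attribute, {value: subtree}), leaf = class label.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    labels = [r[target] for r in rows]
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

def id3(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                 # all one class: leaf
        return labels[0]
    if not attrs:                             # no attributes left: majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    return (best, {value: id3([r for r in rows if r[best] == value],
                              [a for a in attrs if a != best], target)
                   for value in {r[best] for r in rows}})

RAW = """Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No"""
COLS = ["Outlook", "Temperature", "Humidity", "Wind", "PlayTennis"]
rows = [dict(zip(COLS, line.split())) for line in RAW.splitlines()]
tree = id3(rows, COLS[:-1], "PlayTennis")
print(tree[0])  # Outlook
```

Run on the PlayTennis table, this reproduces the tree derived later in the module: Outlook at the root, Humidity under Sunny, Wind under Rain.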
TOP-DOWN INDUCTION OF DECISION TREES
1. Find A = the best decision attribute for next node
2. Assign A as decision attribute for node
3. For each value of A create new descendants of node
4. Sort the training examples to the leaf nodes.
5. If training examples classified perfectly, STOP else
iterate over the new leaf nodes.
WHICH ATTRIBUTE IS THE BEST CLASSIFIER?
• Information Gain – A statistical property that measures
how well a given attribute separates the training
examples according to their target classification.
• This measure is used to select among the candidate
attributes at each step while growing the tree.
ENTROPY
• Entropy (E) is the minimum number of bits needed in order to classify an arbitrary
example as yes or no
• Entropy is commonly used in information theory. It characterizes the (im)purity of an
arbitrary collection of examples.
• S is a sample of training examples
• p⊕ is the proportion of positive examples in S
• p⊖ is the proportion of negative examples in S
• Then the entropy measures the impurity of S:
Entropy(S) = -p⊕ log2(p⊕) - p⊖ log2(p⊖)
• If the target attribute can take c different values:
Entropy(S) = Σ(i=1..c) -pi log2(pi)
ENTROPY - EXAMPLE
• Entropy([29+, 35-]) = - (29/64) log2(29/64) - (35/64) log2(35/64)
= 0.994
• Note: 0 log 0 is defined to be 0
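The two-class entropy formula can be transcribed directly; this sketch reproduces the [29+, 35-] example above (function name is ours):

```python
# Entropy of a two-class sample; 0*log2(0) is treated as 0, as noted above.
from math import log2

def entropy(pos, neg):
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        p = count / total
        if p > 0:              # skip the 0*log2(0) = 0 term
            result -= p * log2(p)
    return result

print(round(entropy(29, 35), 3))  # 0.994
print(entropy(14, 0))             # 0.0: all members in one class
print(entropy(7, 7))              # 1.0: evenly mixed sample
```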
ENTROPY
• The entropy varies between 0 and 1.
• If all the members belong to the same class, the entropy is 0. In this
case p⊕ = 1 and p⊖ = 0:
Entropy(S) = -1·log2(1) - 0·log2(0) = -1·0 - 0 = 0
• If there are equal numbers of positive and negative examples,
the entropy is 1.
INFORMATION GAIN
• Information gain measures the expected reduction in entropy
• It measures the effectiveness of an attribute in classifying the
training data
• Gain(S,A) = expected reduction in entropy by
partitioning the examples according to attribute A:
Gain(S,A) = Entropy(S) - Σ(v ∈ Values(A)) (|Sv|/|S|) Entropy(Sv)
• Here Values(A) is the set of all possible values for attribute A, and Sv
is the subset of S for which attribute A has value v
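The gain formula can be sketched over an attribute given as (value, label) pairs; the `pairs` encoding and helper names are ours. The example recomputes the Wind attribute of the PlayTennis table, whose Weak/Strong split is [6+, 2-] / [3+, 3-]:

```python
# Gain(S, A): Entropy(S) minus the weighted entropy of each value's subset Sv.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gain(pairs):
    """Expected reduction in entropy from partitioning on the attribute."""
    labels = [label for _, label in pairs]
    subsets = {}
    for value, label in pairs:
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(sv) / len(pairs) * entropy(sv)
                    for sv in subsets.values())
    return entropy(labels) - remainder

# Wind in the PlayTennis table: Weak = [6+, 2-], Strong = [3+, 3-]
wind = [("Weak", "Yes")] * 6 + [("Weak", "No")] * 2 \
     + [("Strong", "Yes")] * 3 + [("Strong", "No")] * 3
print(round(gain(wind), 3))  # 0.048
```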
INFORMATION GAIN - EXAMPLE
• The figure showed a sample S with E = 0.994 split by two candidate
attributes: A1 yields subsets of 26 and 38 examples with entropies 0.706
and 0.742; A2 yields subsets of 51 and 13 examples with entropies 0.937
and 0.619.
• Gain(S,A1) = 0.994 – (26/64)*0.706 – (38/64)*0.742 = 0.266
• Gain(S,A2) = 0.994 – (51/64)*0.937 – (13/64)*0.619 = 0.121
• Information gained by partitioning along attribute A1 (0.266) is greater
than along A2 (0.121), so A1 is the better classifier.
An Illustrative Example
DECISION TREE LEARNING
■ Let’s Try an Example!
■ Let
– E([X+,Y-]) represent a sample with X positive training elements
and Y negative elements.
■ Therefore the entropy for the training data, E(S), can be
represented as E([9+,5-]), because of the 14 training examples 9 of
them are yes and 5 of them are no.
DECISION TREE LEARNING:
A SIMPLE EXAMPLE
■ Let’s start off by calculating the Entropy of the Training Set.
■ E(S) = E([9+,5-]) = (-9/14 log2 9/14) + (-5/14 log2 5/14)
■ = 0.94
■ Next we will need to calculate the information gain G(S,A) for
each attribute A where A is taken from the set {Outlook,
Temperature, Humidity, Wind}.
DECISION TREE LEARNING:
A SIMPLE EXAMPLE
■ The information gain for Outlook is:
– Gain(S,Outlook) = E(S) – [5/14 * E(Outlook=sunny) + 4/14 *
E(Outlook=overcast) + 5/14 * E(Outlook=rain)]
– Gain(S,Outlook) = E([9+,5-]) – [5/14*E([2+,3-]) + 4/14*E([4+,0-]) +
5/14*E([3+,2-])]
– E([2+,3-]) = (-2/5 log2(2/5)) + (-3/5 log2(3/5)) = 0.971
– Gain(S,Outlook) = 0.94 – [5/14*0.971 + 4/14*0.0 + 5/14*0.971]
– Gain(S,Outlook) = 0.246
DECISION TREE LEARNING:
A SIMPLE EXAMPLE
■ Gain(S,Temperature) = 0.94 – [4/14*E(Temperature=hot) +
6/14*E(Temperature=mild) + 4/14*E(Temperature=cool)]
■ Gain(S,Temperature) = 0.94 – [4/14*E([2+,2-]) + 6/14*E([4+,2-]) +
4/14*E([3+,1-])]
■ Gain(S,Temperature) = 0.94 – [4/14*1.0 + 6/14*0.918 + 4/14*0.811]
■ Gain(S,Temperature) = 0.029
DECISION TREE LEARNING:
A SIMPLE EXAMPLE
■ Gain(S,Humidity) = 0.94 – [7/14*E(Humidity=high) +
7/14*E(Humidity=normal)]
■ Gain(S,Humidity) = 0.94 – [7/14*E([3+,4-]) + 7/14*E([6+,1-])]
■ Gain(S,Humidity) = 0.94 – [7/14*0.985 + 7/14*0.592]
■ Gain(S,Humidity) = 0.151
DECISION TREE LEARNING:
A SIMPLE EXAMPLE
■ Gain(S,Wind) = 0.94 – [8/14*E([6+,2-]) + 6/14*E([3+,3-])]
■ Gain(S,Wind) = 0.94 – [8/14*0.811 + 6/14*1.00]
■ Gain(S,Wind) = 0.048
AN ILLUSTRATIVE EXAMPLE
• Gain(S, Outlook) = 0.246
• Gain(S, Humidity) = 0.151
• Gain(S, Wind) = 0.048
• Gain(S, Temperature) = 0.029
• Since Outlook attribute provides the
best prediction of the target attribute,
PlayTennis, it is selected as the
decision attribute for the root node, and
branches are created with its possible
values (i.e., Sunny, Overcast, and
Rain).
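The four gains above can be recomputed directly from the PlayTennis table; the results agree with the slide's values up to rounding in the last digit. Names in this sketch are our own:

```python
# Recomputing the root-level information gains for the PlayTennis table.
from collections import Counter
from math import log2

RAW = """Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No"""
ATTRS = ["Outlook", "Temperature", "Humidity", "Wind"]
rows = [dict(zip(ATTRS + ["PlayTennis"], line.split()))
        for line in RAW.splitlines()]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gain(rows, attr):
    labels = [r["PlayTennis"] for r in rows]
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r["PlayTennis"] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

for attr in ATTRS:
    print(attr, round(gain(rows, attr), 3))   # Outlook scores highest
```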
ROOT NODE
AN ILLUSTRATIVE EXAMPLE
For Overcast – all examples are Yes, so the decision class can be obtained directly (leaf = Yes)
Day Outlook Temp. Humidity Wind Decision
3 Overcast Hot High Weak Yes
7 Overcast Cool Normal Strong Yes
12 Overcast Mild High Strong Yes
13 Overcast Hot Normal Weak Yes
AN ILLUSTRATIVE EXAMPLE
For Sunny – decision class cannot be obtained (labels are mixed):
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
8 Sunny Mild High Weak No
9 Sunny Cool Normal Weak Yes
11 Sunny Mild Normal Strong Yes
For Rain – decision class cannot be obtained (labels are mixed):
Day Outlook Temp. Humidity Wind Decision
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
6 Rain Cool Normal Strong No
10 Rain Mild Normal Weak Yes
14 Rain Mild High Strong No
AN ILLUSTRATIVE EXAMPLE
INTERMEDIATE NODE COMPUTATION
Ssunny = {D1, D2, D8, D9, D11}
Entropy(Ssunny) = (-2/5 log2(2/5)) + (-3/5 log2(3/5)) = 0.970
• Gain(Ssunny, Humidity)
■ = 0.970 – [(3/5)*E([0+,3-]) + (2/5)*E([2+,0-])]
■ = 0.970 – (3/5)(0.0) – (2/5)(0.0) = 0.970
• Gain(Ssunny, Temperature)
■ = 0.970 – (2/5)(0.0) – (2/5)(1.0) – (1/5)(0.0)
■ = 0.570
• Gain(Ssunny, Wind)
■ = 0.970 – (2/5)(1.0) – (3/5)(0.918)
■ = 0.019
INTERMEDIATE NODE COMPUTATION
For the rightmost branch: Rain
• Gain (SRain, Temperature) = 0.019
• Gain (SRain, Humidity) = 0.019
• Gain (SRain, Wind)= 0.97
FINAL DECISION TREE :
DECISION FOR LEFT BRANCH (HUMIDITY)
Day Outlook Temp. Humidity Wind Decision
1 Sunny Hot High Weak No
2 Sunny Hot High Strong No
8 Sunny Mild High Weak No
Day Outlook Temp. Humidity Wind Decision
9 Sunny Cool Normal Weak Yes
11 Sunny Mild Normal Strong Yes
DECISION FOR RIGHT BRANCH (WIND)
Day Outlook Temp. Humidity Wind Decision
4 Rain Mild High Weak Yes
5 Rain Cool Normal Weak Yes
10 Rain Mild Normal Weak Yes
Day Outlook Temp. Humidity Wind Decision
6 Rain Cool Normal Strong No
14 Rain Mild High Strong No
FINAL DECISION TREE :
Test Instance: (Outlook = Rain, Temperature = Hot, Humidity = High, Wind =
Strong) => PlayTennis = No (the Rain branch tests Wind, and Wind = Strong
leads to a No leaf)
EXAMPLE 2:
DECISION TREE LEARNING
[Figures: step-by-step construction of the decision tree for Example 2]
HYPOTHESIS SPACE SEARCH IN DECISION
TREE LEARNING
ID3 – Capabilities and Limitations
• ID3’s hypothesis space of all decision trees is a complete space
of finite discrete-valued functions
• ID3 maintains a single current hypothesis as it searches
through the space of decision trees
• ID3 in its pure form performs no backtracking in its
search
• ID3 uses all training examples at each step in the search
to make statistically based decisions
INDUCTIVE BIAS IN ID3
• Inductive bias is the set of assumptions that along with
the training data justify the classifications assigned by the
learner to future instances.
• Given a collection of training examples, there are several
decision trees consistent with these examples
• Inductive bias of ID3:
• Which of these decision trees does ID3 choose?
INDUCTIVE BIAS IN ID3
Approximate inductive bias of ID3
• ID3 prefers short trees over larger trees
• Trees that place high-information-gain attributes closer to
the root are preferred over those that do not.
INDUCTIVE BIAS IN ID3
Restriction Biases and Preference Biases
• There is an interesting difference between the types of inductive
bias exhibited by ID3 and the Candidate-Elimination
algorithm:
• ID3 – searches a complete hypothesis space
incompletely
• CE – searches an incomplete hypothesis space
completely
INDUCTIVE BIAS IN ID3
• ID3 – Preference Biases
• CE – Restriction Biases
Which type of inductive bias is preferable?
• Combination is also possible
OCCAM’S RAZOR – WHY PREFER SHORTER
HYPOTHESIS
• William of Occam was a 14th-century English logician
• The principle states that, “All other things being equal, the
simplest solution is the best”
• When multiple competing theories are equal in other
respects, the principle recommends selecting the theory
that introduces the fewest assumptions.
OCCAM’S RAZOR
• Occam’s Razor: Prefer the simplest hypothesis that fits the
data
• Argument in favor –
• A short hypothesis that fits the data is unlikely to be a coincidence
• A long hypothesis that fits the data might be a coincidence
• Argument opposed –
• There are many ways to define small sets of hypotheses
• Two learners that perceive the same training examples in terms of
different internal representations may produce two different
hypotheses
ISSUES IN DECISION TREE LEARNING
• Overfitting
• Incorporating Continuous-valued attributes
• Alternative measures for selecting attributes
• Handling attributes with costs
• Handling examples with missing attribute values
OVERFITTING
• Consider a hypothesis h over
• Training data: errortrain(h)
• Entire distribution D of data: errorD(h)
• The hypothesis h ∈ H overfits the training data if there is an
alternative hypothesis h’ ∈ H such that
errortrain(h) < errortrain(h’)  AND  errorD(h) > errorD(h’)
OVERFITTING
Definition:
Given a hypothesis space H, a hypothesis h belonging to H
is said to overfit the training data if there exists some
alternative hypothesis h’ belonging to H, such that h has a
smaller error than h’ over the training examples, but h’ has a
smaller error than h over the entire distribution of instances.
OVERFITTING IN DECISION TREE LEARNING
AVOIDING OVERFITTING
REDUCED-ERROR PRUNING
• Split data into training and validation sets
• Do until further pruning is harmful
1. Evaluate impact of pruning each possible node on
validation set
2. Greedily remove the one that most improves the validation
set accuracy
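The two-step loop above can be sketched as follows, assuming the nested-tuple tree encoding (attribute, {value: subtree}) with string leaves. Note one simplification: pruning a node here replaces it with the majority label of its subtree's leaves, whereas Mitchell's version uses the majority class of the training examples at that node.

```python
# Minimal sketch of reduced-error pruning over a nested-tuple decision tree.
from collections import Counter

def classify(tree, example):
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree

def accuracy(tree, validation):
    return sum(classify(tree, ex) == ex["label"]
               for ex in validation) / len(validation)

def leaves(tree):
    if isinstance(tree, str):
        return [tree]
    return [leaf for sub in tree[1].values() for leaf in leaves(sub)]

def internal_paths(tree, path=()):
    """Yield the branch-value path to every internal node (pruning candidates)."""
    if isinstance(tree, str):
        return
    yield path
    for value, sub in tree[1].items():
        yield from internal_paths(sub, path + (value,))

def collapsed(tree, path):
    """Copy of `tree` with the node at `path` replaced by a majority leaf."""
    if not path:
        return Counter(leaves(tree)).most_common(1)[0][0]
    attribute, branches = tree
    new_branches = dict(branches)
    new_branches[path[0]] = collapsed(branches[path[0]], path[1:])
    return (attribute, new_branches)

def reduced_error_prune(tree, validation):
    while True:              # stop when no single pruning improves accuracy
        best, best_acc = None, accuracy(tree, validation)
        for path in internal_paths(tree):
            candidate = collapsed(tree, path)
            if accuracy(candidate, validation) > best_acc:
                best, best_acc = candidate, accuracy(candidate, validation)
        if best is None:
            return tree
        tree = best

# A noisy split on Wind that the validation set does not support:
tree = ("Outlook", {"Sunny": ("Wind", {"Weak": "No", "Strong": "Yes"}),
                    "Rain": "Yes"})
validation = [{"Outlook": "Sunny", "Wind": "Weak", "label": "Yes"},
              {"Outlook": "Sunny", "Wind": "Strong", "label": "Yes"},
              {"Outlook": "Rain", "Wind": "Weak", "label": "Yes"}]
pruned = reduced_error_prune(tree, validation)
print(pruned)  # "Yes": the whole tree collapses to a single leaf
```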
EFFECT OF REDUCED-ERROR PRUNING
RULE POST-PRUNING
• The major drawback of Reduced-Error Pruning is that when
data is limited, withholding a validation set further reduces
the number of examples available for training.
Hence Rule Post-Pruning
RULE POST-PRUNING
CONTINUOUS VALUED-ATTRIBUTES
• Create a discrete-valued attribute to test the continuous one
• Dynamically define new discrete-valued attributes
that partition the continuous-valued attribute into a
discrete set of intervals
• For an attribute A and candidate threshold c, create a new
boolean attribute Ac that is true if A < c and false otherwise
• Example (Temperature): sort the examples by the continuous value and
place candidate thresholds midway between adjacent examples whose
classifications differ:
Temperature: 40 48 60 72 80 90
PlayTennis: No No Yes Yes Yes No
• (48+60)/2 = 54 – threshold
• (80+90)/2 = 85 – threshold
• So if Temperature = 75 (between the two thresholds)
• We can infer that PlayTennis = Yes
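The threshold-selection step can be sketched as below; the Temperature/PlayTennis values here match the two thresholds quoted above ((48+60)/2 = 54 and (80+90)/2 = 85), and the function names are ours:

```python
# Candidate thresholds for a continuous attribute: midpoints between
# adjacent sorted examples whose class labels differ.
temps  = [40, 48, 60, 72, 80, 90]
labels = ["No", "No", "Yes", "Yes", "Yes", "No"]

pairs = sorted(zip(temps, labels))
thresholds = [(t1 + t2) / 2
              for (t1, l1), (t2, l2) in zip(pairs, pairs[1:])
              if l1 != l2]                    # class boundary between t1, t2
print(thresholds)  # [54.0, 85.0]

def make_attribute(c):
    """The new boolean attribute A_c: true if A < c."""
    return lambda value: value < c

temp_below_54 = make_attribute(54)
print(temp_below_54(75))  # False
```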
ALTERNATIVE MEASURES FOR SELECTING
ATTRIBUTES
• Problem:
• If an attribute has many values, Gain will select it even when it
generalizes poorly
• Example – using a Date attribute, which has a distinct value for
every example
• One approach – Gain Ratio:
GainRatio(S,A) = Gain(S,A) / SplitInformation(S,A)
SplitInformation(S,A) = -Σ(i=1..c) (|Si|/|S|) log2(|Si|/|S|)
• where Si is the subset of S for which A has value vi
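Gain ratio (Gain divided by SplitInformation, where SplitInformation is the entropy of the attribute's value partition) can be sketched as below; the `pairs` encoding and names are ours. The example shows the penalty on a Date-like attribute with a unique value per example: its raw gain is the full sample entropy (0.940), but SplitInformation is log2(14), shrinking the ratio:

```python
# Gain ratio: penalizes many-valued attributes such as Date.
from collections import Counter
from math import log2

def entropy(items):
    n = len(items)
    return -sum(c / n * log2(c / n) for c in Counter(items).values())

def gain_ratio(pairs):
    labels = [label for _, label in pairs]
    values = [value for value, _ in pairs]
    subsets = {}
    for value, label in pairs:
        subsets.setdefault(value, []).append(label)
    g = entropy(labels) - sum(len(sv) / len(pairs) * entropy(sv)
                              for sv in subsets.values())
    return g / entropy(values)  # SplitInformation = entropy over the values

# Date-like attribute on a [9+, 5-] sample: one distinct value per example.
date = [(i, label) for i, label in enumerate(["Yes"] * 9 + ["No"] * 5)]
print(round(gain_ratio(date), 3))  # 0.247, down from a raw gain of 0.940
```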
EXAMPLES WITH MISSING ATTRIBUTE
VALUES
• What if some examples are missing values of attribute A?
• Use the training examples anyway and sort them through the tree
• If node n tests A, assign the example the most common value of A
among the examples at node n
• Or assign a probability pi to each possible value vi of A, and
assign a fraction pi of the example to each descendant in the tree
ATTRIBUTES WITH COSTS
• Problem:
• Medical diagnosis – BloodTest has cost $150
• Robotics – Width_from_1ft has cost 23 sec.
• One approach – replace Gain with a cost-sensitive measure:
• Tan and Schlimmer (1990): Gain²(S,A) / Cost(A)
• Nunez (1988): (2^Gain(S,A) − 1) / (Cost(A) + 1)^w
• where w ∈ [0, 1] is a constant that determines the relative importance of cost versus information
gain.
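The two commonly cited cost-sensitive measures, Tan and Schlimmer's Gain²/Cost and Nunez's (2^Gain − 1)/(Cost + 1)^w, can be written as small functions; the gain and cost values below are placeholder numbers, not computed from data:

```python
# Cost-sensitive attribute-selection measures (placeholder inputs).
def tan_schlimmer(gain, cost):
    """Gain^2(S,A) / Cost(A) (Tan and Schlimmer, 1990)."""
    return gain ** 2 / cost

def nunez(gain, cost, w=0.5):
    """(2^Gain(S,A) - 1) / (Cost(A) + 1)^w (Nunez, 1988), w in [0, 1]."""
    return (2 ** gain - 1) / (cost + 1) ** w

# With equal gain, the cheaper test scores higher; with w = 0, cost is ignored.
print(nunez(0.5, 1) > nunez(0.5, 150))             # True
print(nunez(0.5, 1, w=0) == nunez(0.5, 150, w=0))  # True
```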