M2 Decision Trees
02/03/2025
Representing Data
• Think about a large table, N attributes, and assume you want to know something about the people
represented as entries in this table.
• E.g. reads a lot of books or not;
• Simplest way: Histogram on the first attribute – reads
• Then, histogram on 1st and 2nd: (reads (0/1) & gender (0/1): 00, 01, 10, 11)
• But, what if the # of attributes is larger: N=16
• How large are the 1-d histograms (contingency tables)? 16 numbers
• How large are the 2-d histograms? 16-choose-2 (all pairs) = 120 numbers
• How many 3-d tables? 560 numbers
• With 100 attributes, the 3-d tables need 161,700 numbers
• We need a better way to represent the data;
– In part, this depends on identifying the important attributes, since we want to look at these first.
– Information theory has something to say about it – we will use it to better represent the data.
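The counts above are just binomial coefficients: with N attributes there are C(N, d) distinct d-dimensional tables (counting tables, not the cells inside each table). A minimal Python check of the numbers quoted above:

    from math import comb

    # Number of d-dimensional contingency tables over N attributes:
    # one table per subset of d attributes, i.e. C(N, d).
    for n, d in [(16, 1), (16, 2), (16, 3), (100, 3)]:
        print(f"N={n}, d={d}: {comb(n, d)} tables")
    # N=16, d=1: 16
    # N=16, d=2: 120
    # N=16, d=3: 560
    # N=100, d=3: 161700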
Decision Trees
• A hierarchical data structure that represents data by
implementing a divide and conquer strategy
– Can be used as a non-parametric classification or regression method (in regression, a real
number is associated with each example rather than a categorical label)
• Process:
– Given a collection of examples, learn a decision tree that represents it.
– Use this representation to classify new examples
[Figure: a collection of shapes; given this collection, which shapes are type A, B, and C?]
The Representation
• Decision Trees are classifiers for instances represented as feature vectors
  – color = {red, blue, green}; shape = {circle, triangle, rectangle}; label = {A, B, C}
  – An example: <(color = green; shape = rectangle), label = B>
• Nodes are tests for feature values
• There is one branch for each value of the feature
• Leaves specify the categories (labels)
• Can categorize instances into multiple disjoint categories
• Evaluation of a Decision Tree: check the color feature; if it is blue, then check the shape feature; if it is …, then …
• (Learning a Decision Tree from data is discussed below.)
[Figure: a tree with a Color test at the root, Shape tests below it, and leaves labeled A, B, and C]
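A minimal sketch of this representation in Python: internal nodes test a feature, branches are keyed by feature values, and leaves hold labels. The exact tree drawn on the slide is not fully recoverable, so the tree below is only illustrative (chosen so it classifies the slide's example <color=green, shape=rectangle> as B); the names tree and classify are mine.

    # Illustrative decision tree over the color/shape features, as nested dicts.
    tree = {"feature": "color",
            "branches": {"blue": {"feature": "shape",
                                  "branches": {"circle": "B", "triangle": "A", "rectangle": "C"}},
                         "red": "B",
                         "green": {"feature": "shape",
                                   "branches": {"circle": "A", "triangle": "C", "rectangle": "B"}}}}

    def classify(node, example):
        """Evaluate the tree: follow the branch matching the example's feature value."""
        while isinstance(node, dict):
            node = node["branches"][example[node["feature"]]]
        return node

    print(classify(tree, {"color": "green", "shape": "rectangle"}))  # -> B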
Expressivity of Decision Trees
• As Boolean functions, they can represent any Boolean function.
• Can be rewritten as rules in Disjunctive Normal Form (DNF)
  – Green ∧ Square → positive
  – Blue ∧ Circle → positive
  – Blue ∧ Square → positive
• The disjunction of these rules is equivalent to the Decision Tree
• What did we show? What is the hypothesis space here?
  – 2 dimensions: color and shape
  – 3 values each: color ∈ {red, blue, green}, shape ∈ {triangle, square, circle}
  – |X| = 9: (red, triangle), (red, circle), (blue, square), …
  – |Y| = 2: + and -
  – |H| = 2^9 = 512
• And all these functions can be represented as decision trees.
[Figure: a tree with a Color test at the root, Shape tests below it, and leaves labeled + and -]
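A small sketch of how the DNF is read off a tree: collect every root-to-leaf path that ends in a positive leaf. The tree below is chosen (as an assumption) so that its positive paths are exactly the three rules above; the encoding is the same nested-dict form used in the earlier sketch.

    # Read a DNF for the positive class off a tree by collecting +-labeled paths.
    tree = {"feature": "color",
            "branches": {"green": {"feature": "shape",
                                   "branches": {"square": "+", "circle": "-", "triangle": "-"}},
                         "blue": {"feature": "shape",
                                  "branches": {"square": "+", "circle": "+", "triangle": "-"}},
                         "red": "-"}}

    def positive_paths(node, path=()):
        """Return the (feature, value) conjunctions leading to a + leaf."""
        if not isinstance(node, dict):                 # leaf
            return [path] if node == "+" else []
        rules = []
        for value, child in node["branches"].items():
            rules += positive_paths(child, path + ((node["feature"], value),))
        return rules

    for rule in positive_paths(tree):
        print(" ∧ ".join(f"{feat}={val}" for feat, val in rule), "→ positive")
    # color=green ∧ shape=square → positive
    # color=blue ∧ shape=square → positive
    # color=blue ∧ shape=circle → positive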
Decision Boundaries
• Usually, instances are represented as attribute-value pairs (color=blue,
shape = square, +)
• Numerical values can be used either by discretizing or by using thresholds
for splitting nodes
• In this case, the tree divides the feature space into axis-parallel
rectangles, each labeled with one of the labels
[Figure: a decision tree with threshold tests (X < 3, Y > 7, Y < 5, X < 1) and the corresponding partition of the (X, Y) plane into axis-parallel rectangles labeled + and -]
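A hedged sketch of the same idea in code: numeric features handled with threshold tests. The tests (X<3, Y>7, Y<5, X<1) come from the figure, but the exact wiring of the slide's tree is not recoverable, so the structure below is only illustrative.

    # Each root-to-leaf path corresponds to an axis-parallel rectangle in the plane.
    def classify(x, y):
        if x < 3:
            if y > 7:
                return "+"
            return "+" if x < 1 else "-"
        else:
            return "+" if y < 5 else "-"

    for point in [(0.5, 2.0), (2.0, 9.0), (5.0, 2.0), (5.0, 8.0)]:
        print(point, "->", classify(*point))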
Decision Trees
• Can represent any Boolean Function
• Can be viewed as a way to compactly represent a lot of data.
• Natural representation (think of the game of 20 questions)
• The evaluation of a Decision Tree classifier is easy; learning a good decision tree representation from data is the challenge.
[Figure: a decision tree with Outlook at the root; Sunny → Humidity, Overcast → Yes, Rain → Wind]

Will I play tennis today?
• Labels
  – Binary classification task: Y = {+, -}
      O  T  H  W  Play?
  1   S  H  H  W   -
  2   S  H  H  S   -
  3   O  H  H  W   +
  4   R  M  H  W   +
  5   R  C  N  W   +
  6   R  C  N  S   -
  7   O  C  N  S   +
  8   S  M  H  W   -
  9   S  C  N  W   +
 10   R  M  N  W   +
 11   S  M  N  S   +
 12   O  M  H  S   +
 13   O  H  N  W   +
 14   R  M  H  S   -

Outlook: S(unny), O(vercast), R(ainy)
Temperature: H(ot), M(edium), C(ool)
Humidity: H(igh), N(ormal), L(ow)
Wind: S(trong), W(eak)
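For the sketches further below, the 14 examples can be encoded directly as (Outlook, Temperature, Humidity, Wind, Label) tuples using the one-letter codes above; the name DATA is mine.

    from collections import Counter

    DATA = [("S","H","H","W","-"), ("S","H","H","S","-"), ("O","H","H","W","+"),
            ("R","M","H","W","+"), ("R","C","N","W","+"), ("R","C","N","S","-"),
            ("O","C","N","S","+"), ("S","M","H","W","-"), ("S","C","N","W","+"),
            ("R","M","N","W","+"), ("S","M","N","S","+"), ("O","M","H","S","+"),
            ("O","H","N","W","+"), ("R","M","H","S","-")]

    print(Counter(label for *_, label in DATA))   # Counter({'+': 9, '-': 5})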
Basic Decision Trees Learning Algorithm
• Data is processed in batch (i.e., all the data is available).
• Recursively build a decision tree top down.
[Figure: the resulting tree: Outlook at the root; Sunny → Humidity (High: No, Normal: Yes), Overcast → Yes, Rain → Wind (Strong: No, Weak: Yes)]
Basic Decision Tree Algorithm
• Let S be the set of examples
  – Label is the target attribute (the prediction)
  – Attributes is the set of measured attributes
• ID3(S, Attributes, Label)
    If all examples have the same label, return a single-node tree with that Label
    Otherwise Begin
        A = attribute in Attributes that best classifies S (create a Root node for the tree that tests A)
        For each possible value v of A:
            Add a new tree branch corresponding to A = v
            Let Sv be the subset of examples in S with A = v
            If Sv is empty: add a leaf node with the most common value of Label in S
                (why? so that, at evaluation time, an example reaching this branch still gets a prediction)
            Else: below this branch add the subtree ID3(Sv, Attributes - {A}, Label)
    End
    Return Root
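A minimal Python sketch of this pseudocode. "Best classifies" is implemented here with information gain, which is how the notes define it later on; the names (id3, entropy, domains, ...) are mine, not from the slides.

    import math
    from collections import Counter

    def entropy(labels):
        """Entropy, in bits, of a multiset of labels."""
        counts = Counter(labels)
        return -sum(c / len(labels) * math.log2(c / len(labels)) for c in counts.values())

    def id3(examples, attributes, domains, label):
        """examples: list of dicts; attributes: list of attribute names;
        domains: dict mapping each attribute to its possible values;
        label: name of the target attribute. Returns a nested-dict tree or a label."""
        labels = [ex[label] for ex in examples]
        majority = Counter(labels).most_common(1)[0][0]
        if len(set(labels)) == 1 or not attributes:
            return majority                           # pure node (or no tests left)

        def gain(a):                                  # information gain of attribute a
            remainder = sum(
                sum(1 for ex in examples if ex[a] == v) / len(examples)
                * entropy([ex[label] for ex in examples if ex[a] == v])
                for v in domains[a])
            return entropy(labels) - remainder

        best = max(attributes, key=gain)              # attribute that best classifies S
        tree = {"feature": best, "branches": {}}
        rest = [a for a in attributes if a != best]
        for v in domains[best]:
            subset = [ex for ex in examples if ex[best] == v]
            if not subset:
                # Empty branch: predict the most common label of the parent set,
                # so evaluation still works for value combinations unseen in S.
                tree["branches"][v] = majority
            else:
                tree["branches"][v] = id3(subset, rest, domains, label)
        return tree

    # Tiny usage example (hypothetical toy data; the target is A AND B):
    toy = [{"A": 1, "B": 1, "y": "+"}, {"A": 1, "B": 0, "y": "-"},
           {"A": 0, "B": 1, "y": "-"}, {"A": 0, "B": 0, "y": "-"}]
    print(id3(toy, ["A", "B"], {"A": [0, 1], "B": [0, 1]}, "y"))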
Picking the Root Attribute
• The goal is to have the resulting decision tree as small as
possible (Occam’s Razor)
– But, finding the minimal decision tree consistent with the data is NP-hard
• The recursive algorithm is a greedy heuristic search for a
simple tree, but cannot guarantee optimality.
• The main decision in the algorithm is the selection of the next
attribute to condition on.
Picking the Root Attribute
• Consider data with two Boolean attributes (A, B):
  < (A=0, B=0), - >: 50 examples
  < (A=0, B=1), - >: 50 examples
  < (A=1, B=0), - >: 0 examples
  < (A=1, B=1), + >: 100 examples
• What should be the first attribute we select?
  – Splitting on A: we get purely labeled nodes (A=1 is all +, A=0 is all -).
  – Splitting on B: we don't get purely labeled nodes (B=1 still mixes + and -, so we must test A below it).
  – What if we have < (A=1, B=0), - >: 3 examples instead?
• (One way to think about it: the number of queries required to label a random data point.)
[Figure: splitting on A gives leaves + (A=1) and - (A=0); splitting on B gives a leaf - (B=0) and a further test on A under B=1.]
Picking the Root Attribute
• Consider the same data, but now with < (A=1, B=0), - >: 3 examples:
  < (A=0, B=0), - >: 50 examples
  < (A=0, B=1), - >: 50 examples
  < (A=1, B=0), - >: 3 examples
  < (A=1, B=1), + >: 100 examples
• What should be the first attribute we select?
• The two trees look structurally similar; which attribute should we choose?
  – Advantage to A, but we need a way to quantify it.
• One way to think about it: the number of queries required to label a random data point.
• If we choose A, we have less uncertainty about the labels.
[Figure: splitting on A: A=1 has 100 + and 3 -, A=0 has 100 -. Splitting on B: B=1 has 100 + and 50 -, B=0 has 53 -.]
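One hedged way to make "number of queries" concrete for this modified example: test one attribute first, and only test the second when the first does not already determine the label. A small sketch (names and framing are mine):

    # Each data point is (A, B, label, count).
    data = [(0, 0, "-", 50), (0, 1, "-", 50), (1, 0, "-", 3), (1, 1, "+", 100)]
    total = sum(count for *_, count in data)

    def expected_queries(first):
        """Expected number of attribute queries when attribute `first` (0 for A,
        1 for B) is tested first, and the other attribute is queried only if the
        first one does not already determine the label."""
        cost = 0
        for v in (0, 1):
            group = [row for row in data if row[first] == v]
            size = sum(count for *_, count in group)
            labels = {label for _, _, label, _ in group}
            cost += size * (1 if len(labels) == 1 else 2)
        return cost / total

    print("query A first:", round(expected_queries(0), 3))  # ~1.507
    print("query B first:", round(expected_queries(1), 3))  # ~1.739

Querying A first is cheaper on average, matching the intuition that A leaves less uncertainty.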
Picking the Root Attribute
• The goal is to have the resulting decision tree as small as
possible (Occam’s Razor)
– The main decision in the algorithm is the selection of the next
attribute to condition on.
• We want attributes that split the examples to sets that are
relatively pure in one label; this way we are closer to a leaf
node.
– The most popular heuristic is based on information gain, originating
with the ID3 system of Quinlan.
Entropy
• Entropy (impurity, disorder) of a set of examples S, relative to a binary
classification, is:
    Entropy(S) = - p+ log2(p+) - p- log2(p-)
  where p+ and p- are the proportions of positive and negative examples in S.
• Entropy can be viewed as the number of bits required, on average, to encode the
class labels. If the probability of + is 0.5, a single bit is required for each
example; if it is 0.8, we can use less than 1 bit.
[Figure: example sets with different mixes of + and - labels and their corresponding entropies]
• Convince yourself that the maximum value of the entropy is 1 (for a binary classification, using log base 2).
• Also note that the base of the log only introduces a constant factor; therefore, we will think in terms of base 2.
• High Entropy – high level of uncertainty. Low Entropy – no uncertainty.
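A quick numeric check of the statements above (binary entropy in bits):

    import math

    def entropy(p_pos):
        """Binary entropy, in bits, of a set with positive-class proportion p_pos."""
        if p_pos in (0.0, 1.0):
            return 0.0
        p_neg = 1.0 - p_pos
        return -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)

    print(entropy(0.5))      # 1.0    -> one full bit per example
    print(entropy(0.8))      # ~0.722 -> less than one bit
    print(entropy(9 / 14))   # ~0.940 -> the full "play tennis" sample (9+, 5-)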
Information Gain
• The information gain of an attribute a is the expected reduction in entropy caused by partitioning on that attribute:
    Gain(S, a) = Entropy(S) - Σ_{v ∈ values(a)} (|Sv| / |S|) × Entropy(Sv)
• Where:
  – Sv is the subset of S for which attribute a has value v, and
  – the entropy of partitioning the data is calculated by weighing the entropy of each partition by its size relative to the original set.
• Example, splitting the 14 examples on Humidity (7 High, 7 Normal):
    Expected entropy = (7/14) × 0.985 + (7/14) × 0.592 = 0.789
    Information gain = 0.940 - 0.789 = 0.151
• Splitting on Outlook instead: Information gain = 0.940 - 0.694 = 0.246
[Figure: the Outlook split, with branches Sunny, Overcast, Rain]
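A sketch that recomputes these gains from the 14-example table (the tiny discrepancies with the slide's values come from the slide rounding the intermediate entropies to three digits):

    import math
    from collections import Counter

    # (Outlook, Temperature, Humidity, Wind, Label), one-letter codes as in the table.
    DATA = [("S","H","H","W","-"), ("S","H","H","S","-"), ("O","H","H","W","+"),
            ("R","M","H","W","+"), ("R","C","N","W","+"), ("R","C","N","S","-"),
            ("O","C","N","S","+"), ("S","M","H","W","-"), ("S","C","N","W","+"),
            ("R","M","N","W","+"), ("S","M","N","S","+"), ("O","M","H","S","+"),
            ("O","H","N","W","+"), ("R","M","H","S","-")]
    ATTRS = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Wind": 3}

    def entropy(labels):
        counts = Counter(labels)
        return -sum(c / len(labels) * math.log2(c / len(labels)) for c in counts.values())

    def gain(data, i):
        labels = [row[-1] for row in data]
        expected = sum(
            len(subset) / len(data) * entropy([row[-1] for row in subset])
            for subset in ([row for row in data if row[i] == v] for v in {row[i] for row in data}))
        return entropy(labels) - expected

    for name, i in ATTRS.items():
        print(f"Gain(S, {name}) = {gain(DATA, i):.3f}")
    # Gain(S, Outlook) = 0.247      (0.246 on the slide)
    # Gain(S, Temperature) = 0.029
    # Gain(S, Humidity) = 0.152     (0.151 on the slide)
    # Gain(S, Wind) = 0.048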
Which feature to split on?
Information gain for each attribute:
  Outlook: 0.246
  Humidity: 0.151
  Wind: 0.048
  Temperature: 0.029
→ Split on Outlook
An Illustrative Example (III)
Splitting on Outlook:
  Sunny: examples 1, 2, 8, 9, 11 (2+, 3-) → ?
  Overcast: examples 3, 7, 12, 13 (4+, 0-) → Yes
  Rain: examples 4, 5, 6, 10, 14 (3+, 2-) → ?
Continue until:
• every attribute is included in the path, or
• all examples in the leaf have the same label.
An Illustrative Example (IV)
Continuing under Outlook = Sunny (examples 1, 2, 8, 9, 11; 2+, 3-):
  Gain(S_sunny, Humidity) = 0.97 - (3/5)×0 - (2/5)×0 = 0.97
  Gain(S_sunny, Temp) = 0.97 - 0 - (2/5)×1 = 0.57
  Gain(S_sunny, Wind) = 0.97 - (2/5)×1 - (3/5)×0.92 = 0.02
→ Split on Humidity
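A quick check of these numbers on the five Sunny rows (1, 2, 8, 9, 11), using the same gain computation as in the earlier sketch:

    import math
    from collections import Counter

    # (Temperature, Humidity, Wind, Label) for the Outlook = Sunny examples.
    SUNNY = [("H","H","W","-"), ("H","H","S","-"), ("M","H","W","-"),
             ("C","N","W","+"), ("M","N","S","+")]

    def entropy(labels):
        counts = Counter(labels)
        return -sum(c / len(labels) * math.log2(c / len(labels)) for c in counts.values())

    def gain(data, i):
        labels = [row[-1] for row in data]
        expected = sum(
            len(sub) / len(data) * entropy([r[-1] for r in sub])
            for sub in ([r for r in data if r[i] == v] for v in {r[i] for r in data}))
        return entropy(labels) - expected

    for name, i in [("Temp", 0), ("Humidity", 1), ("Wind", 2)]:
        print(f"Gain(S_sunny, {name}) = {gain(SUNNY, i):.2f}")
    # Gain(S_sunny, Temp) = 0.57
    # Gain(S_sunny, Humidity) = 0.97
    # Gain(S_sunny, Wind) = 0.02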
An Illustrative Example (V)
[Figure: the tree so far: Outlook at the root, with Sunny → ?, Overcast → Yes, Rain → ?]
After splitting the Sunny branch on Humidity:
[Figure: Outlook at the root, with Sunny → Humidity (High → No, Normal → Yes), Overcast → Yes, Rain → ?]
induceDecisionTree(S)
• 1. Does S uniquely define a class?
       if all s ∈ S have the same label y: return S;
• 2. Find the feature with the most information gain:
       i = argmax_i Gain(S, X_i)
• 3. Add children to S:
       for k in Values(X_i):
           S_k = {s ∈ S | x_i = k}
           addChild(S, S_k)
           induceDecisionTree(S_k)
       return S;
An Illustrative Example (VI)
[Figure: the tree under construction, with Outlook at the root]