Decision Trees

Lecture 9 & 10
Introduction to Decision Trees
Decision Trees
• A hierarchical data structure that represents data by implementing a divide-and-conquer strategy
• Can be used as a non-parametric classification and regression method
• Given a collection of examples, learn a decision tree that represents it
• Use this representation to classify new examples

[Figure: a small decision tree whose leaves are labeled C, B, A]

4
The Representation
• Decision Trees are classifiers for instances represented as feature vectors
  • e.g., color = {red, blue, green}; shape = {circle, triangle, rectangle}; label = {A, B, C}
• Nodes are tests for feature values
• There is one branch for each value of the feature
• Leaves specify the category (label)
• Can categorize instances into multiple disjoint categories

[Figure: a decision tree with root Color, internal Shape nodes, and leaves labeled A, B, C; side annotations contrast "Learning a Decision Tree" with "Evaluation of a Decision Tree"]

5
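To make this representation concrete, here is a minimal Python sketch (not from the slides) of a tree over the color/shape features above, together with the evaluation loop that walks an instance from the root to a leaf. The particular branch-to-label assignments are illustrative, since only the feature and label sets are fixed by the slide.

# Internal nodes test one feature and have one child per feature value;
# leaves carry a label. The branch-to-label assignments are illustrative.
TREE = {
    "feature": "color",
    "children": {
        "red":   {"label": "C"},
        "blue":  {"feature": "shape",
                  "children": {"circle":    {"label": "B"},
                               "triangle":  {"label": "A"},
                               "rectangle": {"label": "B"}}},
        "green": {"feature": "shape",
                  "children": {"circle":    {"label": "A"},
                               "triangle":  {"label": "C"},
                               "rectangle": {"label": "B"}}},
    },
}

def evaluate(node, instance):
    """Follow the branch matching the instance's value for each tested
    feature until a leaf is reached, then return its label."""
    while "label" not in node:
        node = node["children"][instance[node["feature"]]]
    return node["label"]

print(evaluate(TREE, {"color": "blue", "shape": "circle"}))   # -> B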
Expressivity of Decision Trees
• As Boolean functions, decision trees can represent any Boolean function.
• They can be rewritten as rules in Disjunctive Normal Form (DNF):
  • Green ∧ Square → positive
  • Blue ∧ Circle → positive
  • Blue ∧ Square → positive
• The disjunction of these rules is equivalent to the decision tree.
• What did we show? What is the hypothesis space here?
  • 2 dimensions: color and shape
  • 3 values each: color (red, blue, green), shape (triangle, square, circle)
  • |X| = 9: (red, triangle), (red, circle), (blue, square), …
  • |Y| = 2: + and -
  • |H| = 2^9

[Figure: the Color/Shape decision tree from the previous slide]

6
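Spelling out the count in the last bullet: each of the |X| = 9 possible instances can be labeled either + or -, so

    |H| = |Y|^|X| = 2^9 = 512

distinct labelings (Boolean functions) exist over this instance space, and by the first bullet a decision tree over these two features can represent every one of them.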
Decision Trees
• Output is a discrete category. Real-valued outputs are possible (regression trees).
• There are efficient algorithms for processing large amounts of data (but not too many features).
• There are methods for handling noisy data (classification noise and attribute noise) and for handling missing attribute values.

[Figure: the Color/Shape decision tree again]

7
Decision Boundaries
• Usually, instances are represented as attribute-value pairs (color = blue, shape = square, +)
• Numerical values can be used either by discretizing or by using thresholds for splitting nodes
• In this case, the tree divides the feature space into axis-parallel rectangles, each labeled with one of the labels

[Figure: a 2-D feature space over X and Y partitioned into axis-parallel rectangles of + and - points, shown next to the corresponding tree of threshold tests (X < 3, Y > 7, Y < 5, X < 1) with yes/no branches]

8
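A short Python sketch of how threshold tests produce axis-parallel decision regions. The tests (X < 3, Y > 7, Y < 5, X < 1) are taken from the figure; how they nest and which label sits at each leaf did not survive extraction, so the structure and labels below are illustrative.

def classify(x, y):
    """Each root-to-leaf path corresponds to one axis-parallel rectangle
    in the (X, Y) plane; thresholds from the figure, labels illustrative."""
    if x < 3:
        if y > 7:
            return "+"                 # rectangle: x in (-inf, 3), y in (7, inf)
        return "+" if x < 1 else "-"   # the strip x < 3, y <= 7, split at x = 1
    return "+" if y < 5 else "-"       # right half-plane (x >= 3), split at y = 5

print(classify(2, 8), classify(0.5, 4), classify(2, 4), classify(4, 2))   # -> + + - +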
Today's key concepts
• Learning decision trees (the ID3 algorithm)
  • Greedy heuristic (based on information gain)
  • Originally developed for discrete features
• Overfitting
  • What is it? How do we deal with it?
• Some extensions of DTs
• Principles of experimental ML

9
Administration
• There is no waiting list anymore; everyone who wanted to be in is in.
• Recitations
• Quizzes
• Questions?
  • Please ask/comment during class.

10
Learning Decision Trees (the ID3 Algorithm)

Decision Trees
• Can represent any Boolean function
• Can be viewed as a way to compactly represent a lot of data
• Natural representation (think of the game of 20 questions)
• Evaluation of the decision tree classifier is easy
• Clearly, given data, there are many ways to represent it as a decision tree
• Learning a good representation from the data is the challenge

[Figure: the "play tennis" tree: Outlook at the root; Sunny → Humidity (High: No, Normal: Yes); Overcast → Yes; Rain → Wind (Strong: No, Weak: Yes)]

12
Will I play tennis today?
• Features
• Outlook: {Sun, Overcast, Rain}
• Temperature: {Hot, Mild, Cool}
• Humidity: {High, Normal, Low}
• Wind: {Strong, Weak}

• Labels
• Binary classification task: Y = {+, -}

13
Will I play tennis today?

    O  T  H  W  Play?
 1  S  H  H  W   -
 2  S  H  H  S   -
 3  O  H  H  W   +
 4  R  M  H  W   +
 5  R  C  N  W   +
 6  R  C  N  S   -
 7  O  C  N  S   +
 8  S  M  H  W   -
 9  S  C  N  W   +
10  R  M  N  W   +
11  S  M  N  S   +
12  O  M  H  S   +
13  O  H  N  W   +
14  R  M  H  S   -

Outlook: S(unny), O(vercast), R(ainy)
Temperature: H(ot), M(edium), C(ool)
Humidity: H(igh), N(ormal), L(ow)
Wind: S(trong), W(eak)

14
Basic Decision Tree Learning Algorithm
• Data is processed in batch (i.e., all the data is available). Algorithm?
• Recursively build a decision tree top-down.

[Slide graphic: the 14-example table alongside the resulting tree: Outlook at the root; Sunny → Humidity (High: No, Normal: Yes); Overcast → Yes; Rain → Wind (Strong: No, Weak: Yes)]
Basic Decision Tree Algorithm
• Let S be the set of examples
• Label is the target attribute (the prediction)
• Attributes is the set of measured attributes

ID3(S, Attributes, Label)   (Iterative Dichotomiser)
  If all examples are labeled the same, return a single-node tree with that Label.
  Otherwise begin:
    A = the attribute in Attributes that best classifies S; create a Root node for the tree that tests A.
    For each possible value v of A:
      Add a new tree branch corresponding to A = v.
      Let Sv be the subset of examples in S with A = v.
      If Sv is empty: add a leaf node with the most common value of Label in S
        (why? so the tree can still answer at evaluation time, when an instance with this value reaches the branch).
      Else: below this branch add the subtree ID3(Sv, Attributes - {A}, Label).
  End
  Return Root

16
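Below is a compact Python sketch of this recursion, assuming discrete-valued features. The best_attribute argument is the selection heuristic, deliberately left as a parameter here because the information-gain criterion is only developed in the next slides; the function and parameter names are my own.

from collections import Counter

def id3(examples, attributes, domains, best_attribute):
    """examples:   list of (feature_dict, label) pairs (the set S)
    attributes: feature names still available for testing
    domains:    dict mapping each feature to its set of possible values
    best_attribute(examples, attributes) -> feature to test next"""
    labels = [y for _, y in examples]

    # All examples are labeled the same: return a single-node tree.
    if len(set(labels)) == 1:
        return {"label": labels[0]}
    # No attributes left to test: fall back to the majority label.
    if not attributes:
        return {"label": Counter(labels).most_common(1)[0][0]}

    a = best_attribute(examples, attributes)
    root = {"feature": a, "children": {}}
    for v in domains[a]:                                   # one branch per value of A
        s_v = [(x, y) for x, y in examples if x[a] == v]
        if not s_v:
            # Empty subset: leaf with the most common label in S,
            # so the tree can still answer at evaluation time.
            root["children"][v] = {"label": Counter(labels).most_common(1)[0][0]}
        else:
            root["children"][v] = id3(s_v,
                                      [b for b in attributes if b != a],
                                      domains, best_attribute)
    return root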
Picking the Root Attribute
• The goal is to have the resulting decision tree as small as
possible (Occam’s Razor)
• But, finding the minimal decision tree consistent with the data is NP-hard
• The recursive algorithm is a greedy heuristic search for a simple
tree, but cannot guarantee optimality.
• The main decision in the algorithm is the selection of the next
attribute to condition on.

17
Picking the Root Attribute
• Consider data with two Boolean attributes (A, B). Instances and labels:
  (A=0, B=0), - : 50 examples
  (A=0, B=1), - : 50 examples
  (A=1, B=0), - : 0 examples
  (A=1, B=1), + : 100 examples
• What should be the first attribute we select?
• Splitting on A: we get purely labeled nodes.
• Splitting on B: we don't get purely labeled nodes.
• What if we have (A=1, B=0), - : 3 examples?
• (One way to think about it: the number of queries required to label a random data point.)

[Figure: splitting on A (A=1 → +, A=0 → -) vs. splitting on B (B=1 → an A test → + / -, B=0 → -)]

18
Picking the Root Attribute
• Consider data with two Boolean attributes (A, B).
  <(A=0, B=0), ->: 50 examples
  <(A=0, B=1), ->: 50 examples
  <(A=1, B=0), ->: 3 examples (was 0 on the previous slide)
  <(A=1, B=1), +>: 100 examples
• What should be the first attribute we select?
• The trees look structurally similar; which attribute should we choose?
• Advantage: A. But we need a way to quantify things.
• One way to think about it: the number of queries required to label a random data point.
• If we choose A we have less uncertainty about the labels.

[Figure: splitting on A first (A=1 → a B test with 100 + and 3 -; A=0 → 100 -) vs. splitting on B first (B=1 → an A test with 100 + and 50 -; B=0 → 53 -)]

19
Picking the Root Attribute
• The goal is to have the resulting decision tree as small as possible (Occam's Razor)
• The main decision in the algorithm is the selection of the next attribute to condition on.
• We want attributes that split the examples into sets that are relatively pure in one label; this way we are closer to a leaf node.
• The most popular heuristic is based on information gain, which originated with the ID3 system of Quinlan.

20
Entropy
• The entropy (impurity, disorder) of a set of examples S, relative to a binary classification, is

      Entropy(S) = -p_+ log2(p_+) - p_- log2(p_-)

  where p_+ is the proportion of positive examples in S and p_- is the proportion of negative examples in S.
• If all the examples belong to the same category: Entropy = 0
• If the examples are equally mixed (0.5, 0.5): Entropy = 1
• Entropy = level of uncertainty.
• In general, when p_i is the fraction of examples labeled i:

      Entropy(S) = - Σ_i p_i log2(p_i)

• Entropy can be viewed as the number of bits required, on average, to encode the class labels. If the probability of + is 0.5, a single bit is required per example; if it is 0.8, we can use less than 1 bit.

21
Information Gain
High entropy → high level of uncertainty
Low entropy → no uncertainty

• The information gain of an attribute a is the expected reduction in entropy caused by partitioning on this attribute:

      Gain(S, a) = Entropy(S) - Σ_{v ∈ values(a)} (|Sv| / |S|) · Entropy(Sv)

  where:
  • Sv is the subset of S for which attribute a has value v, and
  • the entropy of partitioning the data is calculated by weighting the entropy of each partition by its size relative to the original set.
• Partitions of low entropy (imbalanced splits) lead to high gain.
• Go back to check which of the A, B splits is better.

[Figure: an Outlook node with branches Sunny, Overcast, Rain]

24
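As a worked answer to the last bullet, here is a sketch that plugs the counts from page 19 (100 positives and 103 negatives overall) into the gain formula; the rounded numbers in the final comment are approximate.

import math

def entropy(pos, neg):
    """Binary entropy, in bits, of a set with `pos` positives and `neg` negatives."""
    h, total = 0.0, pos + neg
    for c in (pos, neg):
        if c:
            h -= (c / total) * math.log2(c / total)
    return h

h_s = entropy(100, 103)   # entropy of all 203 examples

# Splitting on A: A=1 holds 100+ / 3-, A=0 holds 0+ / 100-.
gain_a = h_s - (103 / 203) * entropy(100, 3) - (100 / 203) * entropy(0, 100)

# Splitting on B: B=1 holds 100+ / 50-, B=0 holds 0+ / 53-.
gain_b = h_s - (150 / 203) * entropy(100, 50) - (53 / 203) * entropy(0, 53)

print(round(gain_a, 2), round(gain_b, 2))   # roughly 0.90 vs 0.32: A wins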
Will I play tennis today?

[The 14-example table repeated from page 14.]

• Calculate the current entropy:
  p_+ = 9/14, p_- = 5/14
  Entropy(Play) = -(9/14) log2(9/14) - (5/14) log2(5/14) ≈ 0.94

25
Information Gain of Every Attribute
1. Calculate the entropy of the whole data set.
2. Calculate the entropy of the subset of examples for each attribute value; the attribute's gain is then the entropy from step 1 minus the size-weighted average of these subset entropies.
Entropy of the Entire Data Set
• Entropy of each Outlook attribute value (Sunny, Overcast, Rain), i.e., the entropy of the subset of the data with Outlook = Sunny, and so on.
• Entropy of each Temperature attribute value (Hot, Mild, Cool); combining these gives Gain(S, Temperature) ≈ 0.0289.
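Putting the pieces together, here is a sketch that transcribes the 14-example table and computes the information gain of every attribute. The gains in the closing comment are approximate; Gain(S, Temperature) comes out near the 0.0289 quoted above (0.0289 is what you get if the data-set entropy is first rounded to 0.94).

import math
from collections import Counter

# The 14 examples from the table: (Outlook, Temperature, Humidity, Wind, Play?)
DATA = [
    ("S", "H", "H", "W", "-"), ("S", "H", "H", "S", "-"), ("O", "H", "H", "W", "+"),
    ("R", "M", "H", "W", "+"), ("R", "C", "N", "W", "+"), ("R", "C", "N", "S", "-"),
    ("O", "C", "N", "S", "+"), ("S", "M", "H", "W", "-"), ("S", "C", "N", "W", "+"),
    ("R", "M", "N", "W", "+"), ("S", "M", "N", "S", "+"), ("O", "M", "H", "S", "+"),
    ("O", "H", "N", "W", "+"), ("R", "M", "H", "S", "-"),
]
ATTRS = {"Outlook": 0, "Temperature": 1, "Humidity": 2, "Wind": 3}

def entropy(rows):
    """Entropy, in bits, of the Play? labels of a list of rows."""
    counts = Counter(row[-1] for row in rows)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain(rows, idx):
    """Whole-set entropy minus the size-weighted entropy of the subsets
    induced by the values of the attribute in column `idx`."""
    total = len(rows)
    remainder = 0.0
    for v in {row[idx] for row in rows}:
        subset = [row for row in rows if row[idx] == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder

print(round(entropy(DATA), 2))                    # 0.94, as on page 25
for name, idx in ATTRS.items():
    print(name, round(gain(DATA, idx), 3))
# Approximately: Outlook 0.247, Temperature 0.029, Humidity 0.152,
# Wind 0.048 -> Outlook gives the largest gain and becomes the root.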
