Decision Tree
Representation of Concepts
Concept learning: conjunction of attributes
(Sunny AND Hot AND Humid AND Windy) +
Rectangle learning
◦ Conjunctions correspond to a single rectangle in the instance space
◦ Disjunctions of conjunctions correspond to a union of rectangles
[Figure: axis-parallel rectangles enclosing the + instances in a 2-D instance space]
Decision Tree:
A decision tree is a powerful and popular tool for classification and
prediction. It is a flowchart-like tree structure in which each internal
node denotes a test on an attribute, each branch represents an outcome of
the test, and each leaf (terminal) node holds a class label.
Fig. 1: A decision tree for the concept PlayTennis.
Training Examples
Day Outlook Temp Humidity Wind Tennis?
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
Decision Trees
◦ A decision tree represents a learned target function
◦ Each internal node tests an attribute
◦ Each branch corresponds to an attribute value
◦ Each leaf node assigns a classification
Representation in decision trees
Example of representing a rule in decision trees:
if (outlook = sunny AND humidity = normal)
OR (outlook = overcast)
OR (outlook = rain AND wind = weak)
then PlayTennis = Yes
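As a small illustration (not part of the original slides), the disjunction above can be written directly as a Boolean predicate; the attribute values are taken from the PlayTennis table.

```python
def play_tennis(outlook: str, humidity: str, wind: str) -> bool:
    """Predict PlayTennis = Yes using the rule read off the decision tree:
    (sunny AND normal humidity) OR overcast OR (rain AND weak wind)."""
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))

print(play_tennis("Sunny", "Normal", "Weak"))   # True  (e.g. example D9)
print(play_tennis("Sunny", "High", "Weak"))     # False (e.g. example D1)
```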
Applications of Decision Trees
Decision trees are well suited to problems where:
– Instances are describable by a fixed set of attributes and their values
– The target function is discrete valued (2-valued or N-valued;
  continuous functions can be approximated)
– The hypothesis space is disjunctive
– The training data may be noisy (errors, missing values, …)
Examples:
– Equipment or medical diagnosis
– Credit risk analysis
– Calendar scheduling preferences
Decision Trees
Given a distribution of training instances over two attributes, draw
axis-parallel lines to separate the instances of each class.
[Figure: 2-D scatter of + and - training instances over Attribute 1 and Attribute 2]
Decision Tree Structure
◦ Decision node = a condition (a split such as Attribute 1 at 20 or 40, or
  Attribute 2 at 30)
◦ Decision leaf = a box, i.e. the collection of examples satisfying the
  conditions on the path to it
◦ Alternate splits are possible
[Figure: the same 2-D scatter partitioned into boxes by axis-parallel splits]
The strengths of decision tree methods are:
– Decision trees are able to handle both continuous and categorical variables.
– Decision trees provide a clear indication of which fields are most important.
The weaknesses of decision tree methods are:
– Decision trees are less appropriate for estimation tasks where the goal is to
  predict the value of a continuous attribute.
– Decision trees are prone to errors in classification problems with many
  classes and a relatively small number of training examples.
Top-Down Construction
Start with an empty tree.
Main loop:
1. Choose the “best” decision attribute A for the next node
2. Assign A as the decision attribute for the node
3. For each value of A, create a new descendant of the node
4. Sort the training examples to the leaf nodes
5. If the training examples are perfectly classified, STOP;
   else iterate over the new leaf nodes
Grow the tree just deep enough for perfect classification
– if possible (or approximate at a chosen depth)
Which attribute is best?
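A minimal sketch of this top-down loop in Python (not from the slides); the choice of the "best" attribute is passed in as a function and is filled in later by the information-gain measure.

```python
from collections import Counter

def build_tree(examples, attributes, best_attribute):
    """Recursively grow a decision tree top-down.

    examples   : list of (features_dict, label) pairs
    attributes : attributes still available for splitting
    best_attribute(examples, attributes) -> attribute name (e.g. by info gain)
    """
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:            # pure node: stop
        return labels[0]
    if not attributes:                   # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]

    a = best_attribute(examples, attributes)                 # steps 1-2
    tree = {a: {}}
    for value in {feats[a] for feats, _ in examples}:        # step 3
        subset = [(f, l) for f, l in examples if f[a] == value]   # step 4
        remaining = [x for x in attributes if x != a]
        tree[a][value] = build_tree(subset, remaining, best_attribute)  # step 5
    return tree
```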
Principle of Decision Tree Construction
◦ Ultimately we want to form pure leaves
◦ i.e. leaves giving the correct classification
◦ Greedy approach to reach correct classification:
1. Initially treat the entire data set as a single box
2. For each box, choose the split that reduces its impurity (in
   terms of class labels) by the maximum amount
3. Split the box having the highest reduction in impurity
4. Continue from Step 2
5. Stop when all boxes are pure
Best attribute to split?
[Figure: the same 2-D scatter, shown on successive slides with different
candidate axis-parallel splits on Attribute 1 and Attribute 2]
Which split to make next?
After a split, a pure box/node contains examples of only one class; it is
already a pure leaf and there is no further need to split it. Only mixed
boxes/nodes are split further (e.g. with the condition A2 > 30?).
[Figure: the partitioned scatter with pure and mixed boxes marked]
In a decision tree, the major challenge is the identification of the attribute
for the root node at each level. This process is known as attribute selection.
We have two popular attribute selection measures:
1. Information Gain
2. Gini Index
1. Information Gain
When we use a node in a decision tree to partition the training instances into
smaller subsets, the entropy changes. Information gain is a measure of this
change in entropy.
Entropy:
Entropy is a measure of the uncertainty of a random variable; it characterizes
the impurity of an arbitrary collection of examples. The higher the entropy,
the higher the information content.
Example:
Choosing Best Attribute?
◦ Consider 64 examples, 29+ and 35-, and two candidate attributes A1 and A2
◦ Which one is better?
Entropy
◦ A measure of
  ◦ uncertainty
  ◦ purity
  ◦ information content
◦ Information theory: the optimal length code assigns (-log2 p) bits to a
  message having probability p
◦ S is a sample of training examples
◦ p+ is the proportion of positive examples in S
◦ p- is the proportion of negative examples in S
◦ Entropy of S: the average optimal number of bits to encode information
  about the certainty/uncertainty of S
Entropy(S) = p+ (-log2 p+) + p- (-log2 p-) = -p+ log2 p+ - p- log2 p-
◦ Can be generalized to more than two values
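A short sketch of the binary entropy formula above (Python, not part of the slides), checked against the sample sizes used in these slides.

```python
import math

def entropy(pos: int, neg: int) -> float:
    """Binary entropy of a sample with pos positive and neg negative examples:
    Entropy(S) = -p+ log2 p+ - p- log2 p-   (0 log2 0 is taken as 0)."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        p = count / total
        if p > 0:
            result -= p * math.log2(p)
    return result

print(entropy(29, 35))   # the 64-example sample: about 0.993
print(entropy(9, 5))     # the PlayTennis sample: about 0.940
print(entropy(32, 32))   # maximum uncertainty: 1.0
print(entropy(64, 0))    # pure sample: 0.0
```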
Entropy
Entropy can also be viewed as measuring
– the purity of S,
– the uncertainty in S,
– the information in S, …
E.g.: the entropy values for p+ = 1, p+ = 0, and p+ = 0.5 are 0, 0, and 1
Easy generalization beyond binary values:
– Entropy = sum over i of pi (-log2 pi), for i = 1, …, n
  where i is + or - in the binary case, and i varies from 1 to n in general
Choosing Best Attribute?
◦ Consider 64 examples (29+, 35-), with E(S) = 0.993, and compute the
  entropies of the subsets produced by each candidate split.
First pair of candidate attributes:
– A1: t branch (25+, 5-), E = 0.650; f branch (4+, 30-), E = 0.522
– A2: t branch (15+, 19-), E = 0.989; f branch (14+, 16-), E = 0.997
◦ Which one is better?
Second pair of candidate attributes:
– A1: t branch (21+, 5-), E = 0.708; f branch (8+, 30-), E = 0.742
– A2: t branch (18+, 33-), E = 0.937; f branch (11+, 2-), E = 0.619
◦ Which is better?
Information Gain
◦ Gain(S, A): the reduction in entropy after splitting on attribute A

  Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|Sv| / |S|) Entropy(Sv)

For the 64-example sample (29+, 35-), E(S) = 0.993:
– A1 (first pair): t (25+, 5-) E = 0.650, f (4+, 30-) E = 0.522, Gain = 0.395
– A2 (first pair): t (15+, 19-) E = 0.989, f (14+, 16-) E = 0.997, Gain = 0.000
– A1 (second pair): t (21+, 5-) E = 0.708, f (8+, 30-) E = 0.742, Gain = 0.265
– A2 (second pair): t (18+, 33-) E = 0.937, f (11+, 2-) E = 0.619, Gain = 0.121
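A self-contained sketch (not from the slides) of the Gain(S, A) formula, applied to the second pair of candidate attributes above.

```python
import math

def entropy(pos, neg):
    total = pos + neg
    return -sum(p * math.log2(p) for p in (pos / total, neg / total) if p > 0)

def gain(parent, children):
    """Gain(S, A) = Entropy(S) - sum_v |Sv|/|S| * Entropy(Sv).
    parent   : (pos, neg) counts of S
    children : list of (pos, neg) counts, one per value of A."""
    n = sum(parent)
    weighted = sum((p + q) / n * entropy(p, q) for p, q in children)
    return entropy(*parent) - weighted

# Second pair of candidates on the (29+, 35-) sample:
print(round(gain((29, 35), [(21, 5), (8, 30)]), 2))   # A1: about 0.27
print(round(gain((29, 35), [(18, 33), (11, 2)]), 2))  # A2: about 0.12
```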
Gain function
Gain measures how much a split can
– reduce uncertainty
Its value lies between 0 and 1
What is the significance of
– a gain of 0?
  e.g. a 50/50 split of +/- both before and after discriminating on the
  attribute's values
– a gain of 1?
  e.g. going from "perfect uncertainty" to perfect certainty after splitting
  on a perfectly predictive attribute
Splitting on high-gain attributes finds "patterns" in the training examples
relating to attribute values and moves toward a locally minimal
representation of the training examples
Training Examples (see the table above)
Determine the Root Attribute
[Figure: candidate root splits of S (9+, 5-), E = 0.940, on Humidity and Wind]
Sort the Training Examples
With Outlook chosen as the root, S = {D1, …, D14} (9+, 5-) is split into the
Sunny, Overcast, and Rain branches; the Overcast branch is already pure (Yes),
while the other two branches still need an attribute:
Ssunny = {D1, D2, D8, D9, D11}
Gain(Ssunny, Humidity) = 0.970
Gain(Ssunny, Temp) = 0.570
Gain(Ssunny, Wind) = 0.019
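For concreteness (not part of the slides), a sketch that recomputes these gains directly from the training table; the data layout and helper names are my own.

```python
import math
from collections import Counter

# The PlayTennis training examples from the table above (Outlook, Temp, Humidity, Wind, label).
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),        ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),     ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),  ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),  ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = {"Outlook": 0, "Temp": 1, "Humidity": 2, "Wind": 3}

def entropy(rows):
    counts = Counter(r[-1] for r in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def gain(rows, attr):
    col = ATTRS[attr]
    n = len(rows)
    remainder = sum(len(sub) / n * entropy(sub)
                    for v in {r[col] for r in rows}
                    for sub in [[r for r in rows if r[col] == v]])
    return entropy(rows) - remainder

for a in ATTRS:                              # Outlook has the highest gain
    print(a, round(gain(DATA, a), 3))

sunny = [r for r in DATA if r[0] == "Sunny"]
print(round(gain(sunny, "Humidity"), 2))     # about 0.97, matching the slide
```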
Final Decision Tree for Example
Outlook
  Sunny    -> Humidity
               High   -> No
               Normal -> Yes
  Overcast -> Yes
  Rain     -> Wind
               Strong -> No
               Weak   -> Yes
GINI Index:
The Gini index (or Gini impurity) measures the degree or probability of a
particular variable being wrongly classified when it is chosen at random. But
what is actually meant by 'impurity'? If all the elements belong to a single
class, the node can be called pure. The Gini index varies between 0 and 1,
where 0 denotes that all elements belong to a single class (or that only one
class exists), and 1 denotes that the elements are randomly distributed across
many classes. A Gini index of 0.5 denotes elements equally distributed over
two classes.
Formula for Gini Index:
Gini(S) = 1 - Σ_i p_i^2
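A tiny sketch of this formula (Python, not part of the slides):

```python
def gini(counts):
    """Gini index of a node: 1 - sum_i p_i^2, where p_i is the fraction of
    examples in class i (counts is a list of per-class example counts)."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

print(gini([10, 0]))    # pure node: 0.0
print(gini([5, 5]))     # two equally likely classes: 0.5
print(gini([9, 5]))     # the PlayTennis sample: about 0.459
```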
Ease of decision making: a decision tree handles decision making
automatically, whereas other approaches require a decision threshold to be set.
[Figure: accuracy on training data and on testing data plotted against the
complexity of the tree]
Overfitting
When to stop splitting further?
A very deep tree is required to fit just one odd training example.
[Figure: the 2-D scatter with one odd training example inside the
opposite-class region, forcing extra splits]
Avoiding Overfitting
Reduced-Error Pruning
• A post-pruning, cross-validation approach
  - Partition the training data into a “grow” set and a “validation” set.
  - Build a complete tree from the “grow” data.
  - Until accuracy on the validation set decreases, do:
      For each non-leaf node in the tree:
        Temporarily prune the tree below it, replacing it by a majority vote.
        Test the accuracy of the resulting hypothesis on the validation set.
      Permanently prune the node whose removal gives the greatest increase
      in accuracy on the validation set.
• Problem: uses less data to construct the tree
• Sometimes done at the rules level
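A compact sketch of this procedure (not from the slides), assuming the nested-dict tree representation from the earlier build_tree sketch and data given as (attribute-dict, label) pairs; the helper names are my own.

```python
import copy
from collections import Counter

# A tree is either a class label (leaf) or {attribute: {value: subtree, ...}}.

def predict(tree, example, default="No"):
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr].get(example.get(attr), default)
    return tree

def accuracy(tree, data):
    return sum(predict(tree, ex) == y for ex, y in data) / len(data)

def internal_paths(tree, path=()):
    """Yield the path (sequence of (attribute, value) steps) to every internal node."""
    if isinstance(tree, dict):
        yield path
        attr = next(iter(tree))
        for value, sub in tree[attr].items():
            yield from internal_paths(sub, path + ((attr, value),))

def pruned_copy(tree, path, grow):
    """Copy of tree with the node at `path` replaced by a majority-vote leaf,
    where the majority is taken over the grow-set examples reaching that node."""
    new_tree = copy.deepcopy(tree)
    reaching = [(ex, y) for ex, y in grow
                if all(ex.get(a) == v for a, v in path)]
    majority = Counter(y for _, y in (reaching or grow)).most_common(1)[0][0]
    if not path:                              # pruning the root itself
        return majority
    node = new_tree
    for attr, value in path[:-1]:
        node = node[attr][value]
    attr, value = path[-1]
    node[attr][value] = majority
    return new_tree

def reduced_error_prune(tree, grow, validation):
    """Greedily prune while some replacement does not hurt validation accuracy."""
    while isinstance(tree, dict):
        best, best_acc = None, accuracy(tree, validation)
        for path in internal_paths(tree):
            candidate = pruned_copy(tree, path, grow)
            if accuracy(candidate, validation) >= best_acc:
                best, best_acc = candidate, accuracy(candidate, validation)
        if best is None:
            break
        tree = best
    return tree
```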
Rule post-pruning
Thanks to the audience.
Gini Index
◦ Another sensible measure of impurity (i and j are classes):
  Gini = Σ_{i ≠ j} p(i) p(j) = 1 - Σ_i p(i)^2
Gini Index for Color
[Figure: a split on Color? with branches red, green, and yellow, and the Gini
index of each resulting subset]
Gain of Gini Index
Three Impurity Measures
[Table: for each attribute A, the values of Gain(A), GainRatio(A), and GiniGain(A)]
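The slides do not spell out the GainRatio and GiniGain formulas; the sketch below (not from the slides) uses the standard definitions, GainRatio(A) = Gain(A) / SplitInformation(A) and GiniGain(A) = Gini(S) minus the size-weighted Gini of the subsets, applied to the Outlook split of the PlayTennis data.

```python
import math

def _proportions(counts):
    n = sum(counts)
    return [c / n for c in counts if c > 0]

def entropy(counts):
    return -sum(p * math.log2(p) for p in _proportions(counts))

def gini(counts):
    return 1.0 - sum(p ** 2 for p in _proportions(counts))

def impurity_reduction(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = sum(parent)
    return impurity(parent) - sum(sum(c) / n * impurity(c) for c in children)

def gain(parent, children):
    return impurity_reduction(parent, children, entropy)

def gain_ratio(parent, children):
    """Gain divided by the split information (entropy of the branch sizes)."""
    split_info = entropy([sum(c) for c in children])
    return gain(parent, children) / split_info if split_info else 0.0

def gini_gain(parent, children):
    return impurity_reduction(parent, children, gini)

# Outlook on the PlayTennis data: S = (9+, 5-), branches Sunny (2+, 3-),
# Overcast (4+, 0-), Rain (3+, 2-).
branches = [(2, 3), (4, 0), (3, 2)]
print(round(gain((9, 5), branches), 3))        # about 0.247
print(round(gain_ratio((9, 5), branches), 3))  # about 0.156
print(round(gini_gain((9, 5), branches), 3))   # about 0.116
```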