
Data Mining Algorithms
Decision Trees

Graham Williams
Principal Data Miner, ATO
Adjunct Associate Professor, ANU

Copyright © 2006, Graham J. Williams, http://togaware.com

Overview

Introduction
Reference Material
Decision Trees
  Basics
  Example
  Algorithm
Decision Trees in R
  Examples

Reference Book

Data Mining: Concepts and Techniques
Jiawei Han and Micheline Kamber
Morgan Kaufmann Publishers, 2006
ISBN: 1558609016
Section 6.3

See also: http://datamining.togaware.com/survivor/Decision_Trees.html

Predictive Modelling: Classification

The goal of classification is to build models (sentences) in a knowledge representation (language) from examples of past decisions.
The model is to be used on unseen cases to make decisions.
Often referred to as supervised learning.
Common approaches: decision trees; neural networks; logistic regression; support vector machines.

Language: Decision Trees

Knowledge representation: a flow-chart-like tree structure.
Internal nodes denote a test on an attribute.
Branches represent the outcomes of the test.
Leaf nodes represent class labels or class distributions.

[Figure: example tree. Gender: Female leads to Y; Male leads to a test on Age, with < 43 giving Y and > 43 giving N.]
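
Such a tree is just a nested sequence of tests. As a minimal sketch (not from the original slides), the Gender/Age example tree above can be written directly in R as a classifier function:

  # The Gender/Age example tree as explicit R code (illustrative only).
  classify <- function(gender, age) {
    if (gender == "Female") return("Y")   # Female leaf: Y
    if (age < 43) "Y" else "N"            # Male branch: test on Age
  }
  classify("Male", 35)   # "Y"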
Tree Construction: Divide and Conquer

Decision tree induction is an example of a recursive partitioning algorithm: divide and conquer.
At the start, all the training examples are at the root.
Examples are then partitioned recursively based on selected attributes.

[Figure: a cloud of + and − examples is split first on Gender into Females and Males, and the Males are then split on Age at 42, giving increasingly pure regions.]

Training Dataset: Buys Computer?

What rule would you "learn" to identify who buys a computer?

Age        Income   Student   Credit      Buys
≤ 30       High     No        Fair        No
≤ 30       High     No        Excellent   No
31...40    High     No        Fair        Yes
> 40       Medium   No        Fair        Yes
> 40       Low      Yes       Fair        Yes
> 40       Low      Yes       Excellent   No
31...40    Low      Yes       Excellent   Yes
≤ 30       Medium   No        Fair        No
≤ 30       Low      Yes       Fair        Yes
> 40       Medium   Yes       Fair        Yes
≤ 30       Medium   Yes       Excellent   Yes
31...40    Medium   No        Excellent   Yes
31...40    High     Yes       Fair        Yes
> 40       Medium   No        Excellent   No
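
This dataset (from Han and Kamber, Section 6.3) is small enough to enter directly in R. The following sketch is not part of the original slides; the relaxed minsplit and cp settings are assumptions needed only because the dataset has just 14 rows:

  library(rpart)

  buys <- data.frame(
    Age     = c("<=30", "<=30", "31...40", ">40", ">40", ">40", "31...40",
                "<=30", "<=30", ">40", "<=30", "31...40", "31...40", ">40"),
    Income  = c("High", "High", "High", "Medium", "Low", "Low", "Low",
                "Medium", "Low", "Medium", "Medium", "Medium", "High", "Medium"),
    Student = c("No", "No", "No", "No", "Yes", "Yes", "Yes",
                "No", "Yes", "Yes", "Yes", "No", "Yes", "No"),
    Credit  = c("Fair", "Excellent", "Fair", "Fair", "Fair", "Excellent",
                "Excellent", "Fair", "Fair", "Fair", "Excellent", "Excellent",
                "Fair", "Excellent"),
    Buys    = c("No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"),
    stringsAsFactors = TRUE)

  # rpart's defaults (minsplit = 20) would refuse to split 14 rows,
  # so relax the control parameters for this toy example.
  fit <- rpart(Buys ~ ., data = buys, method = "class",
               control = rpart.control(minsplit = 2, cp = 0))
  print(fit)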

Output: Decision Tree for Buys Computer

From the training data above, the learned tree is:

Age?
  ≤ 30: Student?
    No:  No
    Yes: Yes
  31...40: Yes
  > 40: Credit Rating?
    Excellent: No
    Fair:      Yes

Algorithm for Decision Tree Induction

A greedy algorithm: takes the best immediate (local) decision while building the overall model.
The tree is constructed top-down: recursive, divide-and-conquer.
Begin with all training examples at the root.
Data is partitioned recursively based on selected attributes.
Attributes are selected on the basis of a measure.
Stop partitioning when:
  all samples for a given node belong to the same class;
  there are no remaining attributes for further partitioning (majority voting is employed for classifying the leaf); or
  there are no samples left.
A sketch of this loop in R follows.
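
As a minimal sketch (not rpart's implementation), the induction loop above can be written as a short recursive R function. The gain() measure it calls is sketched after the Information Gain slide below; everything else follows the stopping rules just listed:

  # Top-down, recursive, divide-and-conquer induction (illustrative only).
  # 'gain' is the attribute selection measure defined later in these notes.
  grow <- function(data, attrs, target, default) {
    if (nrow(data) == 0) return(default)            # no samples left
    majority <- names(which.max(table(data[[target]])))
    if (length(unique(data[[target]])) == 1 ||      # node is pure, or
        length(attrs) == 0)                         # no attributes remain:
      return(majority)                              # majority vote at the leaf
    best <- attrs[which.max(sapply(attrs, gain, data = data, target = target))]
    branches <- lapply(split(data, data[[best]]), grow,
                       attrs = setdiff(attrs, best),
                       target = target, default = majority)
    c(list(split = best), branches)
  }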

Basic Motivation: Information Content

A data set contains a certain amount of information.
Work toward increasing the amount of information exhibited by the data.
A random data set has high entropy.
Work toward reducing the amount of entropy in the data.

Attribute Selection Measure

Information gain (ID3/C4.5): select the attribute with the highest information gain.
Assume there are two classes, P and N.
Let the data S contain p elements of class P and n elements of class N.
The amount of information needed to decide if an arbitrary example in S belongs to P or N is defined as:

I(p, n) = - p/(p+n) * log2( p/(p+n) ) - n/(p+n) * log2( n/(p+n) )
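
As a small sketch (not from the slides), I(p, n) translates directly into R; the guard for zero counts reflects the usual convention that 0 * log2(0) = 0:

  # I(p, n): information (in bits) needed to classify an example of S.
  info <- function(p, n) {
    f <- c(p, n) / (p + n)
    f <- f[f > 0]            # convention: 0 * log2(0) = 0
    -sum(f * log2(f))
  }

  info(9, 5)   # the Buys Computer data: 9 Yes, 5 No, about 0.940 bits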
Information Required to Classify Entities

[Figure: plot of -p * log(p) - (1 - p) * log(1 - p) against p, for p from 0 to 1; the curve is 0 at p = 0 and p = 1 and maximal at p = 0.5.]

Information Gain

Now use attribute A to partition S into v cells {S1, S2, ..., Sv}.
If Si contains pi examples of P and ni examples of N, the information now needed to classify objects in all subtrees Si is:

E(A) = sum_{i=1..v} (pi + ni)/(p + n) * I(pi, ni)

So the information gained by branching on A is:

Gain(A) = I(p, n) - E(A)

So choose the attribute A which results in the greatest gain in information.
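
Putting the pieces together (a sketch under the same two-class assumption, reusing info() and the buys data frame from above):

  # E(A) and Gain(A) for an attribute of the Buys Computer data.
  gain <- function(data, attr, target = "Buys") {
    n <- nrow(data)
    e <- sum(sapply(split(data, data[[attr]]), function(s) {
      counts <- table(factor(s[[target]], levels = c("Yes", "No")))
      (nrow(s) / n) * info(counts["Yes"], counts["No"])
    }))
    info(sum(data[[target]] == "Yes"), sum(data[[target]] == "No")) - e
  }

  sapply(c("Age", "Income", "Student", "Credit"), gain, data = buys)
  # Age approx. 0.246, Income 0.029, Student 0.151, Credit 0.048,
  # so Age is chosen for the root split, as in the tree above.

With gain() defined, the earlier grow() sketch can also be run: grow(buys, c("Age", "Income", "Student", "Credit"), target = "Buys", default = "Yes") should reproduce the Age/Student/Credit tree above.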

Simple Train/Test Paradigm

> library(rpart)
> sub <- sample(1:150, 75)   # random sample for training
> fit <- rpart(Species ~ ., data=iris, subset=sub)
> fit
n= 75

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 75 47 virginica (0.2800000 0.3466667 0.3733333)
  2) Petal.Length< 2.5 21 0 setosa (1.0000000 0.0000000 0.0000000) *
  3) Petal.Length>=2.5 54 26 virginica (0.0000000 0.4814815 0.5185185)
    6) Petal.Length< 5.05 29 3 versicolor (0.0000000 0.8965517 0.1034483) *
    7) Petal.Length>=5.05 25 0 virginica (0.0000000 0.0000000 1.0000000) *

> table(predict(fit, iris[-sub,], type="class"), iris[-sub, "Species"])

             setosa versicolor virginica
  setosa         29          0         0
  versicolor      0         23         6
  virginica       0          1        16
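
Because sub is drawn with sample(), the fitted tree and confusion matrix will differ from run to run; a set.seed() call before sampling (an addition, not in the original slides) makes the session reproducible:

> set.seed(42)   # any fixed seed gives a repeatable train/test split
> sub <- sample(1:150, 75)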

Example DTree Plot using Rattle

drawTreeNodes(fit)

[Figure: "Sample Iris Decision Tree" drawn by Rattle (2006-08-21, gjw): the root splits on Petal.Width at 1.65; one branch splits on Petal.Length at 2.6 into node 4 (setosa, 24 cases) and node 5 (versicolor, 25 cases); node 3 is virginica (26 cases).]

Summary

Decision tree induction is one of the most widely deployed machine learning technologies.
Simplicity of the idea, and yet a powerful tool.
Available in R through the rpart package.
