DECISION TREE
TUTORIAL
Revoledu.com
Copyright © 2008-2012 by Kardi Teknomo
Published by Revoledu.com
Online edition is available at Revoledu.com
Last Update: October 2012
Notice of rights
All rights reserved. No part of this text and its companion files may be reproduced or
transmitted in any form by any means, electronic, mechanical, photocopying, recording,
or otherwise, without the prior written permission of the publisher. For information on
getting permission for reprint and excerpts, contact [email protected]
Notice of liability
The information in this text and its companion files is distributed on an “As is” basis,
without warranty. While every precaution has been taken in the preparation of the text,
neither the author(s) nor Revoledu shall have any liability to any person or entity with
respect to any loss or damage caused or alleged to be caused directly or indirectly by
the instructions contained in this text or by the computer software and hardware
products described in it.
All product names and services identified throughout this text are used in editorial
fashion only with no intention of infringement of any trademark. No such use, or the
use of any trade name, is intended to convey endorsement or other affiliation with this
text.
Table of Contents
Decision Tree Tutorial
    What is Decision Tree?
    How to use a decision tree?
    How to generate a decision tree?
        How to measure impurity?
            Entropy
            Gini Index
            Classification error
        Decision Tree Algorithm
        Information gain
        Second Iteration
        Third iteration
A decision tree is a popular classifier that does not require any domain knowledge or
parameter setting. The approach is supervised learning: given training data, we can induce
a decision tree, and from a decision tree we can easily create rules about the data. Using
a decision tree, we can easily predict the classification of unseen records. In this decision
tree tutorial, you will learn how to use and how to build a decision tree, with a very simple
explanation.
Let us start with an example. Throughout this tutorial, we will use the following 10 training
records. The training data are part of a transportation study regarding mode choice, that is,
selecting Bus, Car or Train among commuters along a major route in a city, gathered
through a questionnaire study. The data have 4 attributes, which I selected for the sake of
clarity. The attribute gender is of binary type; car ownership is a quantitative integer (and
thus behaves like a nominal attribute). Travel cost/km is quantitative of ratio type, but here
I treat it as ordinal (in a later section of this tutorial, I will discuss how to split quantitative
data into qualitative categories), and income level is also of ordinal type.
Attributes: Gender, Car ownership, Travel Cost ($)/km, Income Level. Class: Transportation mode.

Gender   Car ownership   Travel Cost ($)/km   Income Level   Transportation mode
Male     0               Cheap                Low            Bus
Male     1               Cheap                Medium         Bus
Female   1               Cheap                Medium         Train
Female   0               Cheap                Low            Bus
Male     1               Cheap                Medium         Bus
Male     0               Standard             Medium         Train
Female   1               Standard             Medium         Train
Female   1               Expensive            High           Car
Male     2               Expensive            Medium         Car
Female   2               Expensive            High           Car
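For readers who want to follow the computations in code, the table can be held in a simple
structure like the following (a minimal Python sketch; the field names are my own choice and
are not part of the original tutorial):

```python
# The 10 training records from the table above, one dictionary per record.
TRAINING_DATA = [
    {"gender": "Male",   "cars": 0, "cost": "Cheap",     "income": "Low",    "mode": "Bus"},
    {"gender": "Male",   "cars": 1, "cost": "Cheap",     "income": "Medium", "mode": "Bus"},
    {"gender": "Female", "cars": 1, "cost": "Cheap",     "income": "Medium", "mode": "Train"},
    {"gender": "Female", "cars": 0, "cost": "Cheap",     "income": "Low",    "mode": "Bus"},
    {"gender": "Male",   "cars": 1, "cost": "Cheap",     "income": "Medium", "mode": "Bus"},
    {"gender": "Male",   "cars": 0, "cost": "Standard",  "income": "Medium", "mode": "Train"},
    {"gender": "Female", "cars": 1, "cost": "Standard",  "income": "Medium", "mode": "Train"},
    {"gender": "Female", "cars": 1, "cost": "Expensive", "income": "High",   "mode": "Car"},
    {"gender": "Male",   "cars": 2, "cost": "Expensive", "income": "Medium", "mode": "Car"},
    {"gender": "Female", "cars": 2, "cost": "Expensive", "income": "High",   "mode": "Car"},
]
```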
Based on the above training data, we can induce a decision tree, as shown in the figure below.
How to use a decision tree?
The question is: what transportation mode would Alex, Buddy and Cherry use? Using the
decision tree that we have generated in the previous section, we will use a deductive
approach to classify whether a person will use car, train or bus as his or her mode along a
major route in that city, based on the given attributes.
We can start from the root node, which contains the attribute Travel cost per km. If the
travel cost per km is expensive, the person uses car. If the travel cost per km is standard
price, the person uses train. If the travel cost is cheap, the decision tree needs to ask the
next question about the gender of the person. If the person is a male, then he uses bus. If
the gender is female, the decision tree needs to ask again how many cars she owns in her
household. If she has no car, she uses bus; otherwise she uses train.
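These rules can also be written as a short function. The sketch below is in Python and uses
the hypothetical field names of the data sketch shown earlier; it is only an illustration of the
induced rules, not the author's original code.

```python
def classify(record):
    """Apply the rules of the decision tree induced above to one record."""
    if record["cost"] == "Expensive":
        return "Car"
    if record["cost"] == "Standard":
        return "Train"
    # Cheap travel cost: the tree next asks about gender.
    if record["gender"] == "Male":
        return "Bus"
    # Female with cheap travel cost: the tree asks about car ownership.
    return "Bus" if record["cars"] == 0 else "Train"

# Alex from the example below: standard travel cost, male, 1 car.
print(classify({"cost": "Standard", "gender": "Male", "cars": 1}))  # Train
```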
Based on the rules or decision tree above, the classification is very straightforward. Alex
is willing to pay a standard travel cost per km, thus regardless of his other attributes, his
transportation mode must be train. Buddy is only willing to pay a cheap travel cost per km,
and his gender is male, thus his selection of transportation mode should be bus. Cherry is
also willing to pay a cheap travel cost per km, her gender is female and she actually
owns a car, thus her transportation mode choice to work is train (probably she uses the car
only during weekends to shop). The variable income level is never utilized to classify the
transportation mode in this case.
Person name   Travel Cost ($)/km   Gender   Car ownership   Transportation Mode
Alex          Standard             Male     1               Train
Buddy         Cheap                Male     0               Bus
Cherry        Cheap                Female   1               Train
Though the decision tree is a very powerful method, at this point I shall give several notes to
the readers on decision tree utilization. First, it must be noted that with the limited
number of training data (only 10) that induced the decision tree, we cannot generalize the
rules of the decision tree above to be applicable to other cases in your city.
The sequence of rules generated by the decision tree is based on the priority of the attributes.
For example, there is no rule for people who own more than 1 car because, based on the
data, this case is already covered by the attribute travel cost/km: for those who own 2 cars,
the travel cost/km is always expensive, thus the mode is car.
Due to the limitations of the decision tree algorithm (most decision tree algorithms employ a
greedy strategy with no backtracking, so the search is not exhaustive), this sequence of
priorities is in general not optimal. We cannot say that the rules generated by the decision
tree are the best rules.
In the next section, you will learn more detail on how to generate a decision tree.
How to generate a decision tree?
Before I discuss the decision tree algorithm, it would be better if you familiarize yourself
with several measures of impurity. Therefore, the topics in this section are how to measure
impurity (entropy, Gini index and classification error), the decision tree algorithm, and
information gain.

How to measure impurity?
Given a data table that contains attributes and the class of the attributes, we can measure the
homogeneity (or heterogeneity) of the table based on the classes. We say a table is pure or
homogeneous if it contains only a single class. If a data table contains several classes, then we say
that the table is impure or heterogeneous. There are several indices to measure the degree of
impurity quantitatively. The most well known indices to measure the degree of impurity are entropy,
Gini index, and classification error. The formulas are given in the subsections below.
In our example of 10 records with 4 buses, 3 cars and 3 trains, the class probabilities are:
Prob(Bus) = 4/10 = 0.4
Prob(Car) = 3/10 = 0.3
Prob(Train) = 3/10 = 0.3
Entropy
One way to measure impurity degree is using entropy.
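For a table whose classes occur with probabilities p1, ..., pn, entropy is commonly defined
(with the base-2 logarithm, which the worked example below uses) as

$$\text{Entropy} = -\sum_{j=1}^{n} p_j \log_2 p_j$$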
Example: Given that Prob (Bus) = 0.4, Prob (Car) = 0.3 and Prob (Train) = 0.3, we can
now compute entropy as
Entropy = – 0.4 log2(0.4) – 0.3 log2(0.3) – 0.3 log2(0.3) = 1.571
The entropy of a pure table (consisting of a single class) is zero, because the probability is 1
and log2(1) = 0. Entropy reaches its maximum value when all classes in the table have equal
probability. The figure below plots the values of maximum entropy for different numbers of
classes n, where each probability is equal to p = 1/n. In this case, the maximum entropy is
equal to −n · p · log2(p) = log2(n). Notice that the value of entropy is larger than 1 if the
number of classes is more than 2.
Gini Index
Another way to measure impurity degree is using Gini index.
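For class probabilities p1, ..., pn, the Gini index is defined as

$$\text{Gini index} = 1 - \sum_{j=1}^{n} p_j^2$$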
Example: Given that Prob (Bus) = 0.4, Prob (Car) = 0.3 and Prob (Train) = 0.3, we can
now compute Gini index as
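$$\text{Gini index} = 1 - (0.4^2 + 0.3^2 + 0.3^2) = 1 - 0.34 = 0.66$$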
The Gini index of a pure table (consisting of a single class) is zero because the probability is 1
and 1 − (1)^2 = 0. Similar to entropy, the Gini index also reaches its maximum value when all
classes in the table have equal probability. The figure below plots the values of the maximum
Gini index for different numbers of classes n, where each probability is equal to p = 1/n. Notice
that the value of the Gini index is always between 0 and 1 regardless of the number of classes.
Classification error
Still another way to measure impurity degree is using the classification error index.
Example: Given that Prob(Bus) = 0.4, Prob(Car) = 0.3 and Prob(Train) = 0.3, the
classification error index is computed as
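$$\text{Classification error} = 1 - \max\{p_j\} = 1 - \max(0.4,\ 0.3,\ 0.3) = 1 - 0.4 = 0.6$$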
Similar to entropy and the Gini index, the classification error index of a pure table (consisting
of a single class) is zero, because the probability is 1 and 1 − max(1) = 0. The value of the
classification error index is always between 0 and 1. In fact, the maximum Gini index for a
given number of classes is always equal to the maximum classification error index, because
for n classes we set each probability equal to p = 1/n, and the maximum Gini index happens
at 1 − n·(1/n)^2 = 1 − 1/n, while the maximum classification error index also happens at
1 − max{1/n} = 1 − 1/n.
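To make the three measures concrete, here is a minimal sketch in Python (the function names
are my own) that reproduces the example values above and the maximum-value behaviour just
described:

```python
from math import log2

def entropy(probs):
    """Entropy: -sum p * log2(p), skipping zero probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

def gini(probs):
    """Gini index: 1 - sum p^2."""
    return 1 - sum(p * p for p in probs)

def classification_error(probs):
    """Classification error: 1 - max p."""
    return 1 - max(probs)

probs = [0.4, 0.3, 0.3]   # Prob(Bus), Prob(Car), Prob(Train)
print(round(entropy(probs), 3))               # 1.571
print(round(gini(probs), 2))                  # 0.66
print(round(classification_error(probs), 1))  # 0.6

# With n equally probable classes (p = 1/n), the Gini index and the
# classification error both reach their maximum of 1 - 1/n:
n = 3
print(round(gini([1 / n] * n), 3), round(classification_error([1 / n] * n), 3))  # 0.667 0.667
```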
Knowing how to compute the degree of impurity, we are now ready to proceed with the decision
tree algorithm, which I will explain in the next section.
Decision Tree Algorithm
Here is an explanation of how a decision tree algorithm works. We have a data record which
contains attributes and the associated classes. Let us call this data table D. From table D,
we take out each attribute and its associated classes. If we have p attributes, then we will
take out p subsets of D. Let us call these subsets Si. Table D is the parent of each table Si.
From table D and for each associated subset Si, we compute the degree of impurity. We have
discussed how to compute these indices in the previous section. To compute the degree of
impurity, we must distinguish whether it comes from the parent table D or from a subset
table Si with attribute i.
If the table is the parent table D, we simply count the number of records of each class.
For example, in the parent table below, we can compute the degree of impurity based on the
transportation mode. In this case we have 4 buses, 3 cars and 3 trains (in short 4B, 3C,
3T):
If the table is a subset table Si, we compute the degree of impurity separately for each value
of attribute i. For example, the attribute Travel cost per km has three values: Cheap, Standard
and Expensive. We sort the table Si = [Travel cost/km, Transportation mode] based on the
values of Travel cost per km. Then we separate each value of the travel cost and compute
the degree of impurity (using either entropy, Gini index or classification error).
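As an illustration, the per-value impurity of the subset table Si = [Travel cost/km,
Transportation mode] can be computed from the class counts read off the training table
(a minimal Python sketch; the variable names are my own):

```python
from math import log2

def entropy_from_counts(counts):
    """Entropy computed from raw class counts, e.g. {'Bus': 4, 'Train': 1}."""
    total = sum(counts.values())
    return sum(-(c / total) * log2(c / total) for c in counts.values() if c > 0)

# Class counts for each value of Travel cost/km, read off the training table:
subsets = {
    "Cheap":     {"Bus": 4, "Train": 1},
    "Standard":  {"Train": 2},
    "Expensive": {"Car": 3},
}
for value, counts in subsets.items():
    print(value, round(entropy_from_counts(counts), 3))
# Cheap 0.722, Standard 0.0, Expensive 0.0
```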
Information gain
Information gain is computed as the impurity degree of the parent table minus the weighted
summation of the impurity degrees of the subset tables. The weight is based on the number of
records for each attribute value. Suppose we use entropy as the measurement of impurity
degree; then we have

$$\text{Gain}(i) = \text{Entropy}(D) - \sum_{k} \frac{n_k}{n}\,\text{Entropy}(S_{i,k})$$

where $n_k$ is the number of records with the $k$-th value of attribute $i$ and $n$ is the
total number of records in table D.
For example, our data table D has classes 4B, 3C, 3T, which produce an entropy of 1.571.
Now we try the attribute Travel cost per km, which we split into three values: Cheap, which
has classes 4B, 1T (thus an entropy of 0.722); Standard, which has classes 2T (thus an
entropy of 0); and Expensive, which has classes 3C (thus an entropy of 0).
The information gain of attribute Travel cost per km is computed as 1.571 – (5/10 *
0.722 + 2/10 * 0 + 3/10 * 0) = 1.210.
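This computation can be written as a small function (a Python sketch with illustrative names;
the class counts are read off the training table). The same function, applied to the other
attributes, gives the information gains compared in the next paragraphs.

```python
from math import log2

def entropy_from_counts(counts):
    """Entropy from raw class counts (same helper as in the previous sketch)."""
    total = sum(counts.values())
    return sum(-(c / total) * log2(c / total) for c in counts.values() if c > 0)

def information_gain(parent_counts, counts_by_value):
    """Parent entropy minus the weighted entropy of each attribute value."""
    n = sum(parent_counts.values())
    weighted = sum(sum(counts.values()) / n * entropy_from_counts(counts)
                   for counts in counts_by_value.values())
    return entropy_from_counts(parent_counts) - weighted

parent = {"Bus": 4, "Car": 3, "Train": 3}   # 4B, 3C, 3T
travel_cost = {
    "Cheap":     {"Bus": 4, "Train": 1},
    "Standard":  {"Train": 2},
    "Expensive": {"Car": 3},
}
print(round(information_gain(parent, travel_cost), 3))  # 1.21, i.e. the 1.210 in the text
```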
You can also compute the information gain based on the Gini index or the classification error
using the same method. The results are given below.
For each attribute in our data, we compute the information gain. The illustration below shows
the computation of information gain for the first iteration (based on the data table) for the
other three attributes: Gender, Car ownership and Income level.
Once you get the information gain for all attributes, we find the optimum attribute, the one
that produces the maximum information gain (i* = argmax {information gain of attribute i}). In
our case, Travel cost per km produces the maximum information gain. We put this optimum
attribute into a node of our decision tree. As it is the first node, it is the root node of
the decision tree. Our decision tree now consists of a single root node.
After the split of the data, we can see clearly that the value Expensive travel cost/km is
associated only with the pure class Car, while Standard travel cost/km is related only to the
pure class Train. A pure class is always assigned to a leaf node of the decision tree. We can
use this information to update our decision tree in our first iteration into the following.
For Cheap travel cost/km, the classes are not pure, thus we need to split further in the next
iteration.
Second Iteration
In the second iteration, we need to update our data table. Since Expensive and Standard
travel cost/km have been associated with pure classes, we do not need these data any
longer. For the second iteration, our data table D comes only from the records with Cheap
travel cost/km. We remove the attribute travel cost/km from the data because its values are
now all equal and thus redundant. Then, we repeat the procedure of computing the degree of
impurity and the information gain for the three remaining attributes. The results of the
computation are exhibited below.
Using this information, we can now update our decision tree. We can add the node Gender,
which has two values, male and female. A pure class becomes a leaf node, thus the Male
branch has a leaf node of Bus. For the Female branch, we need to split the attributes further
in the next iteration.
Third iteration
If you observe the data table of the third iteration, it consists of only two rows, and each
row has distinct values. If we use the attribute car ownership, we will get a pure class for
each of its values. Similarly, the attribute income level will also give a pure class for each
value. Therefore, we can use either one of the two attributes. Suppose we select the attribute
car ownership; we can then update our decision tree into its final version.
Now we have grown the full decision tree based on the data.
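For readers who want to see the whole greedy procedure in one place, here is a compact
illustrative sketch in Python (not the author's original code). It follows the steps described
above: compute the information gain of each attribute, pick the argmax attribute, split the
table, and recurse until each node is pure. On the 10 training records it reproduces the same
tree that we derived by hand; the field names follow the earlier hypothetical data sketch.

```python
from math import log2
from collections import Counter

# The 10 training records (same layout as the earlier TRAINING_DATA sketch).
TRAINING_DATA = [
    {"gender": "Male",   "cars": 0, "cost": "Cheap",     "income": "Low",    "mode": "Bus"},
    {"gender": "Male",   "cars": 1, "cost": "Cheap",     "income": "Medium", "mode": "Bus"},
    {"gender": "Female", "cars": 1, "cost": "Cheap",     "income": "Medium", "mode": "Train"},
    {"gender": "Female", "cars": 0, "cost": "Cheap",     "income": "Low",    "mode": "Bus"},
    {"gender": "Male",   "cars": 1, "cost": "Cheap",     "income": "Medium", "mode": "Bus"},
    {"gender": "Male",   "cars": 0, "cost": "Standard",  "income": "Medium", "mode": "Train"},
    {"gender": "Female", "cars": 1, "cost": "Standard",  "income": "Medium", "mode": "Train"},
    {"gender": "Female", "cars": 1, "cost": "Expensive", "income": "High",   "mode": "Car"},
    {"gender": "Male",   "cars": 2, "cost": "Expensive", "income": "Medium", "mode": "Car"},
    {"gender": "Female", "cars": 2, "cost": "Expensive", "income": "High",   "mode": "Car"},
]
ATTRIBUTES = ["cost", "gender", "cars", "income"]  # candidate splitting attributes
CLASS = "mode"                                     # the class to predict

def entropy(rows):
    """Entropy of the class distribution of a set of records."""
    counts = Counter(row[CLASS] for row in rows)
    total = len(rows)
    return sum(-(c / total) * log2(c / total) for c in counts.values())

def information_gain(rows, attribute):
    """Parent entropy minus the weighted entropy of each value of the attribute."""
    total = len(rows)
    weighted = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        weighted += len(subset) / total * entropy(subset)
    return entropy(rows) - weighted

def build_tree(rows, attributes):
    """Greedy induction: pick the attribute with maximum gain, split, and recurse."""
    classes = {row[CLASS] for row in rows}
    if len(classes) == 1:                      # pure table -> leaf node
        return classes.pop()
    if not attributes:                         # nothing left to split on -> majority class
        return Counter(row[CLASS] for row in rows).most_common(1)[0][0]
    # Ties (e.g. cars vs. income in the third iteration) are broken by list order,
    # mirroring the arbitrary choice made in the text.
    best = max(attributes, key=lambda a: information_gain(rows, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining)
    return (best, branches)

print(build_tree(TRAINING_DATA, ATTRIBUTES))
# The printed tree matches the one derived by hand (dictionary order may vary):
#   cost: Expensive -> Car, Standard -> Train,
#         Cheap -> gender: Male -> Bus,
#                          Female -> cars: 0 -> Bus, 1 -> Train
```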