
Lesson 02 – Decision Trees and their Applications

2.1 Introduction
Lesson 01 provided a comprehensive introduction to the field of machine learning in
general. From this week onwards, you will study various machine learning techniques and
their applications. Recall that machine learning is about learning from examples, without
relying on any model, theory or algorithm that can describe the example data. As such,
machine learning techniques can be used to model real world systems that do not fit into
any mathematical or scientific model. This module covers three major machine learning
techniques, namely, Decision Trees, Artificial Neural Networks and Genetic Algorithms. This
week discusses the fundamentals of decision trees and their applications.

2.2 Basics of decision trees


As stated, machine learning techniques accept some input data, or examples, and create a
model that can be used to make predictions about new data. The model created by a machine
learning technique is a kind of black box: it accepts an input and determines whether the
input can be recognized in terms of the data used to construct it. The term black box is
used here because how and why the model works cannot really be explained, yet it works.
In fact, this applies to our brain as well, because no one knows exactly how the neurons
in the brain carry out tasks such as calculation and thinking.

The decision tree is the simplest kind of machine learning technique. As the name implies,
the decision tree technique draws a tree to represent a given example (data) set. It is
quite natural to use a tree as a structure to represent many things in the world. For
instance, an organizational structure can be represented as a tree, and the sample space
of a probabilistic experiment can also be represented by a tree.

For all machine learning techniques, we start with a data set called the examples. Each
example is characterized by a set of attributes, one of which is designated as the goal
attribute. For example, Figure 2.1 shows four examples X1, X2, X3 and X4 from the domain
of weather forecasting.

Example   Pressure   Temperature   Humidity   Wind speed   Rain
X1        high       medium        high       50           Yes
X2        medium     medium        low        35           No
X3        low        low           high       50           Yes
X4        high       high          low        35           No

Figure 2.1 – Some data from the weather forecasting domain

In Figure 2.1, pressure, temperature, humidity, wind speed and rain are the attributes of
the examples. Among these, rain is taken as the goal attribute.
2.2.1 Goal attribute
In decision trees, the goal attribute usually takes only two values, such as
positive/negative, yes/no or good/bad. However, more than two values are also possible.
For instance, according to Figure 2.1, the attribute pressure has three values, high, low
and medium, whereas the attribute humidity has only two values: high and low.

2.2.2 Positive and negative examples


Based on the value of the goal attribute, we can classify each example into one of two
classes: positive examples and negative examples. For instance, X1 and X3 are positive
examples, as the value of their goal attribute is Yes. In contrast, X2 and X4 are labelled
as negative examples.
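To make this classification concrete, the following short Python sketch (an illustrative aid, not part of the original lesson material; the attribute and value names are simply taken from Figure 2.1) represents the examples as a small data set and separates them into positive and negative examples according to the goal attribute rain.

# Figure 2.1 examples; the goal attribute is "rain".
EXAMPLES = [
    {"name": "X1", "pressure": "high",   "temperature": "medium", "humidity": "high", "wind_speed": 50, "rain": "Yes"},
    {"name": "X2", "pressure": "medium", "temperature": "medium", "humidity": "low",  "wind_speed": 35, "rain": "No"},
    {"name": "X3", "pressure": "low",    "temperature": "low",    "humidity": "high", "wind_speed": 50, "rain": "Yes"},
    {"name": "X4", "pressure": "high",   "temperature": "high",   "humidity": "low",  "wind_speed": 35, "rain": "No"},
]

GOAL = "rain"
POSITIVE_VALUE = "Yes"

# An example is positive if its goal attribute equals the positive value.
positive = [e["name"] for e in EXAMPLES if e[GOAL] == POSITIVE_VALUE]
negative = [e["name"] for e in EXAMPLES if e[GOAL] != POSITIVE_VALUE]

print("Positive examples:", positive)   # ['X1', 'X3']
print("Negative examples:", negative)   # ['X2', 'X4']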

2.2.3 Basic idea of decision tree


It is surprisingly simple to understand the basic idea behind a decision tree. To construct
a decision tree, we begin with an attribute and identify the values available for that
attribute. These values become the branches extending from the selected attribute. We then
consider the goal attribute and count how many positive and negative examples fall along
each branch.

For instance, if we select the attribute pressure, there will be three branches labelled
high, low and medium. For the branch high, the goal attribute can be either Yes (by X1) or
No (by X4). Thus the branch high returns both positive and negative answers. This means
that if the pressure is high, we cannot reach a definite conclusion as to whether Rain is
Yes or No. On the other hand, if we consider the branch low, the goal attribute is
classified exactly as Yes. That means, if the pressure is low, we can conclude that Rain
will definitely be Yes.

What can we do if a branch returns both positive and negative examples? We consider another
attribute and classify the examples along that branch according to its values. This is
repeated until all branches return all-positive or all-negative classifications. This is
how we construct decision trees.

The fundamental challenge in the construction of a decision tree is the choice of the most
appropriate attribute for classifying positive and negative examples at a given point.
Theory has been developed for this purpose.
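As a hedged illustration of this branch-by-branch counting (again using the hypothetical data of Figure 2.1), the sketch below groups the examples by the values of a chosen attribute and counts the positive and negative examples along each branch. A branch containing only one class yields a conclusion immediately; a mixed branch must be split further, and the theory mentioned above concerns which attribute to split on.

from collections import defaultdict

# Figure 2.1 examples, reduced to the attributes needed here; "rain" is the goal.
EXAMPLES = [
    {"name": "X1", "pressure": "high",   "rain": "Yes"},
    {"name": "X2", "pressure": "medium", "rain": "No"},
    {"name": "X3", "pressure": "low",    "rain": "Yes"},
    {"name": "X4", "pressure": "high",   "rain": "No"},
]

def branch_counts(examples, attribute, goal="rain", positive_value="Yes"):
    """Collect positive (+) and negative (-) examples along each branch of `attribute`."""
    branches = defaultdict(lambda: {"+": [], "-": []})
    for e in examples:
        label = "+" if e[goal] == positive_value else "-"
        branches[e[attribute]][label].append(e["name"])
    return dict(branches)

print(branch_counts(EXAMPLES, "pressure"))
# {'high':   {'+': ['X1'], '-': ['X4']},   mixed  -> no conclusion yet
#  'medium': {'+': [],     '-': ['X2']},   all -  -> Rain is No
#  'low':    {'+': ['X3'], '-': []}}       all +  -> Rain is Yes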

2.3 Intuitive approach to constructing decision trees


Before discussing a theory for the construction of decision trees, let us consider how we
can develop a decision tree for a simple data set (Figure 2.2). Note that these data are
hypothetical values and do not refer to actual situations.
Example   Weight   Exercise   Meal     Health
X1        High     No         Normal   Good
X2        High     Yes        Fatty    Good
X3        Low      No         Sugary   Bad
X4        Medium   Yes        Fatty    Good
X5        Low      No         Fatty    Bad
X6        Medium   Yes        Sugary   Bad
X7        Medium   No         Fatty    Bad

Figure 2.2 – Training data for a simple application

When we consider the attribute weight, it has three values: high, low and medium.
Therefore, we have three branches from the weight attribute. Now the example X1 goes along
the branch high and classifies the goal attribute as good (+). Similarly, X2 goes along the
branch high and classifies the goal attribute as +. We denote this by +[X1, X2]. The other
two values, medium and low, can be treated in the same way to classify the data. This
analysis of the data gives the partial decision tree shown in Figure 2.3.

Weight
  high:    +[X1, X2]
  low:     -[X3, X5]
  medium:  +[X4]  -[X6, X7]

Figure 2.3 – Decision tree after expanding on the attribute weight

Now we notice that along the branch high we get the examples X1 and X2. Both are positive,
so we can conclude that if weight is high then health is good. Similarly, for the branch
low, we can conclude that "if weight is low then health is bad". In this manner, whenever a
branch returns all positive or all negative examples, we can end with a conclusion.

Let us consider the branch labelled medium. Here there are three examples, of which X4 is
positive, while X6 and X7 are negative. No conclusion can be made. Therefore, we should use
another attribute, say exercise, to classify the examples X4, X6 and X7. Since the
attribute exercise has two values, yes and no, we obtain the extended decision tree shown
in Figure 2.4.
Weight
  high:    +[X1, X2]
  low:     -[X3, X5]
  medium:  Exercise
             yes:  +[X4]  -[X6]
             no:   -[X7]

Figure 2.4 – After expanding on the attribute Exercise

Now we notice that in Figure 2.4, along the branch yes, we have both positive and negative
examples. Therefore, we need to consider another attribute (now, meal) to further classify
the examples X4 and X6. The final decision tree is shown in Figure 2.5. Note that every
leaf node of the tree now has all positive or all negative examples, so the construction of
the decision tree is complete.

Weight
  high:    +[X1, X2]
  low:     -[X3, X5]
  medium:  Exercise
             yes:  Meal
                     normal:  (no examples)
                     fatty:   +[X4]
                     sugary:  -[X6]
             no:   -[X7]

Figure 2.5 – Complete decision tree
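The intuitive procedure described above can also be written as a short recursive program. The following Python sketch is only an illustration of the idea, with the attribute order fixed by hand as weight, exercise, meal so that it reproduces Figure 2.5; it is not the attribute-selection theory mentioned earlier. It builds the tree as a nested dictionary from the training data of Figure 2.2.

# Training data of Figure 2.2; "health" is the goal attribute.
DATA = [
    {"weight": "high",   "exercise": "no",  "meal": "normal", "health": "Good"},  # X1
    {"weight": "high",   "exercise": "yes", "meal": "fatty",  "health": "Good"},  # X2
    {"weight": "low",    "exercise": "no",  "meal": "sugary", "health": "Bad"},   # X3
    {"weight": "medium", "exercise": "yes", "meal": "fatty",  "health": "Good"},  # X4
    {"weight": "low",    "exercise": "no",  "meal": "fatty",  "health": "Bad"},   # X5
    {"weight": "medium", "exercise": "yes", "meal": "sugary", "health": "Bad"},   # X6
    {"weight": "medium", "exercise": "no",  "meal": "fatty",  "health": "Bad"},   # X7
]

GOAL = "health"

def build_tree(examples, attributes):
    """Split on the given attributes, in order, until every leaf is all positive or all negative."""
    labels = {e[GOAL] for e in examples}
    if len(labels) == 1:            # pure leaf: a conclusion can be made
        return labels.pop()
    if not attributes:              # no attributes left but still mixed: noisy data
        return sorted(labels)
    attribute, remaining = attributes[0], attributes[1:]
    node = {attribute: {}}
    for value in sorted({e[attribute] for e in examples}):
        subset = [e for e in examples if e[attribute] == value]
        node[attribute][value] = build_tree(subset, remaining)
    return node

tree = build_tree(DATA, ["weight", "exercise", "meal"])
print(tree)
# {'weight': {'high': 'Good',
#             'low': 'Bad',
#             'medium': {'exercise': {'no': 'Bad',
#                                     'yes': {'meal': {'fatty': 'Good',
#                                                      'sugary': 'Bad'}}}}}}
# Note that no 'normal' branch appears under meal, because no example reaches it.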

2.3.1 Some conclusions


This decision tree can be used to draw very interesting conclusions. Note that along the
branches of the tree we can draw various conclusions (good or bad) about the goal
attribute, health.

For example, by considering the leftmost branch, we can say that “if weight is high then
health is good”.
Similarly, by going along the rightmost branch, we can say that “if weight is medium and
exercise is no then health is bad”.

Exercise 2.1
Write TWO more conclusions that use all three attributes.

More conclusions
Let us consider other possible conclusions. It appears that the value (option) normal of
the attribute meal does not collect any examples along its branch. What does this mean? It
means that the value normal makes no contribution to classifying the data. As such, the
value normal is unnecessary when collecting data.

Note also that there are instances where the conclusions made along two different branches
may contradict each other. In such a situation, the collected data may include some noise
or errors.

In this case, we used all three attributes, weight, exercise and meal, to construct the
decision tree. However, it is possible to end up with all-positive or all-negative
classifications without using all of the given attributes. In that case we may conclude
that some attributes are not relevant for the data set.

Another very important point is that we can draw more than one decision tree to represent a
given set of training data. For example, if we begin with meal as the attribute at the
root, and then use weight and exercise, the decision tree will be different from the one
obtained in Figure 2.5, and the conclusions may also change. Therefore, it is an
interesting question to ask which decision tree is the most appropriate.

2.3.2 Rules from a decision tree


From a machine learning viewpoint, the decision tree has a very useful feature that is not
available in other machine learning techniques such as Artificial Neural Networks (ANN) and
Genetic Algorithms (GA). This feature is the ability to generate rules from the decision
tree.

For example, conclusions such as "if weight is medium and exercise is no then health is
bad" are in fact rules. They appear in the standard rule format of

IF <condition> THEN <conclusion>

Recall that machine learning techniques are used to model non-algorithmic real world
systems that cannot be fitted into any formal model. Even in such situations, decision
trees are able to create rules that describe these non-algorithmic systems. This is a
remarkable feature of decision trees as compared with machine learning techniques such as
GA and ANN.
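Because each root-to-leaf path of the tree corresponds to one conclusion, the IF–THEN rules can be read off mechanically. The sketch below is a hypothetical helper that assumes the nested-dictionary tree produced by the build_tree sketch in Section 2.3; it walks every path and prints one rule per leaf.

def extract_rules(tree, conditions=()):
    """Turn every root-to-leaf path of a nested-dict decision tree into an IF-THEN rule."""
    if not isinstance(tree, dict):                       # reached a leaf: emit one rule
        condition = " and ".join(f"{a} is {v}" for a, v in conditions)
        return [f"IF {condition} THEN health is {tree}"]
    rules = []
    for attribute, branches in tree.items():
        for value, subtree in branches.items():
            rules.extend(extract_rules(subtree, conditions + ((attribute, value),)))
    return rules

# The tree of Figure 2.5 as a nested dictionary (as built by the earlier sketch).
tree = {"weight": {"high": "Good",
                   "low": "Bad",
                   "medium": {"exercise": {"no": "Bad",
                                           "yes": {"meal": {"fatty": "Good",
                                                            "sugary": "Bad"}}}}}}

for rule in extract_rules(tree):
    print(rule)
# IF weight is high THEN health is Good
# IF weight is low THEN health is Bad
# IF weight is medium and exercise is no THEN health is Bad
# IF weight is medium and exercise is yes and meal is fatty THEN health is Good
# IF weight is medium and exercise is yes and meal is sugary THEN health is Bad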

2.4 Applications of decision trees


It is evident from the example that a decision tree provides a simple structure to
represent data and to draw conclusions. More importantly, we can derive rules from a
decision tree, and such rules can be used directly to draw conclusions about the question
at hand.

It should also be noted that for a given training data set we can produce more than one
distinct decision tree. Among these trees, some may be very large, while others may be
relatively small. Moreover, some decision trees may not even include all the attributes
stated in the training data, perhaps because some attributes are irrelevant for that
particular set of examples.

Since we can produce several decision trees for a given training data set, it is necessary
to identify the most appropriate decision tree for that data set.

All machine learning techniques are expected to produce a model that is as generic as
possible. That means a decision tree should not simply memorize the training data and
reproduce its conclusions; if that were the case, the decision tree would be able to
recognize only the inputs used to construct it. This is why decision trees should go beyond
memorization and achieve generalization over the training data provided.
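The following small sketch illustrates what generalization means in practice: using the Figure 2.5 tree (again as a nested dictionary, a representation assumed only for illustration), it classifies an example that never appeared in the training data. The tree still gives an answer because it tests only the attributes needed along one path, rather than looking the example up in memory.

def classify(tree, example):
    """Follow the branches of a nested-dict decision tree until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[example[attribute]]   # raises KeyError for an unseen attribute value
    return tree

tree = {"weight": {"high": "Good",
                   "low": "Bad",
                   "medium": {"exercise": {"no": "Bad",
                                           "yes": {"meal": {"fatty": "Good",
                                                            "sugary": "Bad"}}}}}}

# This combination of values does not occur anywhere in Figure 2.2:
new_example = {"weight": "high", "exercise": "no", "meal": "sugary"}
print(classify(tree, new_example))   # Good -- only the weight attribute needed to be tested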

Discovery of rules
With the help of a decision tree, we can identify rules that describe a certain data set.
This is a key feature of decision trees, since machine learning techniques are generally
used to model data that do not fit into any known rules or algorithms. Note that other
machine learning techniques such as Artificial Neural Networks and Genetic Algorithms
cannot devise a rule set to describe a collection of data. Therefore, if we are interested
in some form of rule support for understanding a data set, the decision tree is unique
among machine learning techniques.

Extending expert systems


As stated already, the most popular kind of expert system uses a rule-based representation
in its knowledge base. As such, it is quite natural to merge ideas from expert systems
technology and decision trees. Recall that expert systems are filled with thousands of
rules, as a result of which reasoning can take a long time. This is why expert systems
technology has been criticized as unsuitable for time-critical systems. Decision trees,
however, are able to produce a more generalized set of rules, removing redundancies and the
unnecessary complexity caused by the size of the rule base.

2.5 Summary
This lesson discussed the fundamental concepts of decision trees. We also learned how to
create decision trees through an intuitive approach. It was noted that we can construct
more than one decision tree for a given real world problem; among those trees, the one that
provides the highest generalization would be the most suitable for modelling the problem.
We also pointed out the importance of decision trees over other machine learning
techniques: decision trees are able to provide the information needed to construct rules.
The most generalized decision tree for a real world system would be capable of showing
improved performance over an expert system.
