
Islamic University Of Gaza

Computer Engineering Department


Artificial Intelligence
ECOM 5038

LAB (1)
Decision Tree

Eng. Mohammed W. Awwad
Eng. Lina Y. Al-Aloul

March 2020

Objectives
1- To be familiar with classification decision trees.
2- To be able to build a classification decision tree using Python.

Introduction
A decision tree is a map of the possible outcomes of a series of related choices. It allows an individual or organization to weigh possible actions against one another based on their costs, probabilities, and benefits. Decision trees can also be used to map out an algorithm that mathematically predicts the best choice.
A decision tree typically starts with a single node, which branches into possible outcomes.
Each of those outcomes leads to additional nodes, which branch off into other possibilities.

Types Of Trees
Classification and Regression Trees (CART) is a term introduced by Leo Breiman for decision tree algorithms that can be learned for classification or regression predictive modelling problems.
Classification predictive modelling is the task of approximating a mapping function (f) from input variables (X) to discrete output variables (y). The output variables are often called labels or categories. The mapping function predicts the class or category for a given observation.
Regression predictive modelling is the task of approximating a mapping function (f) from input variables (X) to a continuous output variable (y). A continuous output variable is a real value, such as an integer or floating-point quantity.

Representation of Classification Trees


Classification trees are essentially a series of questions designed to assign a classification. The image below is a classification tree trained on the IRIS dataset (flower species). Root (brown) and decision (blue) nodes contain questions which split into sub-nodes. The root node is just the topmost decision node; in other words, it is where you start traversing the classification tree. The leaf nodes (green), also called terminal nodes, are nodes that don't split into more nodes. Leaf nodes are where classes are assigned by majority vote.

Figure 1: Representation of Classification Trees



Growing Classification Trees


A classification tree learns a sequence of if-then questions, with each question involving one feature and one split point. Look at the partial tree below (Figure 2(a)): the question “petal length (cm) ≤ 2.45” splits the data into two branches based on some value (2.45 in this case). The value between the nodes is called a split point. A good value for a split point is one that does a good job of separating one class from the others. Classification tree learning is a greedy algorithm, which means that by default it will continue to split until it reaches a pure node.

Figure 2(a): Partial tree that splits the data into two branches
Figure 2(b): Vertical line A as the splitter at 2.45

Figure 3(a): Partial tree that splits the data into two branches based on 4.95
Figure 3(b): Vertical line B as the splitter at 4.95

In the image in Figure 3(a), the tree has a maximum depth of 2. Tree depth is a measure of how many splits a tree can make before coming to a prediction. This process could be continued with further splitting until the tree is as pure as possible. The problem with many repetitions of this process is that it can lead to a very deep classification tree with many nodes. Luckily, most classification tree implementations allow you to control the maximum depth of a tree, which reduces overfitting. In other words, you can set a maximum depth to stop the growth of the decision tree past a certain depth. For a visual understanding of maximum depth, see Figure 4.

Figure 4: Classification trees of different depths fit on the IRIS dataset.

Selection Criterion
The decision tree algorithm uses information gain to choose the split at each node. Gini impurity or entropy is the criterion used to calculate the information gain:

IG = Impurity before splitting (parent) − weighted Impurity after splitting (children)

Entropy in statistics is analogous to entropy in thermodynamics, where it signifies disorder. If there are multiple classes in a node, there is disorder in that node.

Gini impurity is a measure of how often a randomly chosen element from the set would be
incorrectly labeled if it was randomly labeled according to the distribution of labels in the
subset.
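
As an illustrative sketch (not part of the original lab), both impurity measures can be computed directly from class counts. The example below uses the class counts that appear later in Figure 5 (setosa = 0, versicolor = 38, virginica = 3):

import numpy as np

def gini(counts):
    """Gini impurity: chance a randomly drawn element is mislabeled."""
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    return 1.0 - np.sum(p ** 2)

def entropy(counts):
    """Shannon entropy in bits; 0 for a pure node."""
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    p = p[p > 0]  # treat 0 * log2(0) as 0
    return -np.sum(p * np.log2(p))

print(gini([0, 38, 3]))     # ~0.136 -> the node is nearly pure
print(entropy([0, 38, 3]))  # ~0.378 bits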

Classification Tree Prediction


To use a classification tree, start at the root node (brown) and traverse the tree until you reach a leaf (terminal) node. Using the classification tree in the image below, imagine you had a flower with a petal length of 4.5 cm and you wanted to classify it. Starting at the root node, you would first ask “Is the petal length (cm) ≤ 2.45?” The length is greater than 2.45, so that question is False. Proceed to the next decision node and ask, “Is the petal length (cm) ≤ 4.95?” This is True, so you could predict the flower species as versicolor.

Figure 5: What class (species) is a flower with petal length (cm) = 4.5? The species counts at the predicted leaf are setosa = 0, versicolor = 38, virginica = 3, so the prediction is versicolor, the majority class.

Tree Parameters
One of the benefits of decision tree training is that you can stop training based on several thresholds.
The minbucket option gives the smallest number of observations allowed in a terminal node. If a split would produce a node with fewer observations than minbucket, the split is rejected.
The minsplit parameter is the smallest number of observations a parent node must contain in order to be split further. The default is 20: if a parent node has fewer than 20 records, it is labeled as a terminal node.
Finally, the maxdepth parameter prevents the tree from growing past a certain depth/height. The default is 30. You can use the maxdepth option to create single-rule trees.
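
The parameter names above (minbucket, minsplit, maxdepth) come from R's rpart package; scikit-learn's DecisionTreeClassifier, used later in this lab, exposes the same thresholds under different names. A sketch of the correspondence, with illustrative values:

from sklearn.tree import DecisionTreeClassifier

# Rough rpart -> scikit-learn correspondence (values here are illustrative):
#   minbucket -> min_samples_leaf   (smallest allowed terminal node)
#   minsplit  -> min_samples_split  (smallest node that may still be split)
#   maxdepth  -> max_depth          (deepest allowed tree)
clf = DecisionTreeClassifier(
    criterion="gini",        # or "entropy" for information gain
    min_samples_leaf=5,
    min_samples_split=20,    # matches rpart's minsplit default of 20
    max_depth=3,
    random_state=0)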

Advantages and Disadvantages of Decision Tree


Advantages:
1. Compared to other algorithms, decision trees require less effort for data preparation during pre-processing.
2. A decision tree does not require normalization of data.
3. A decision tree does not require scaling of data either.
4. Missing values in the data do not affect the process of building a decision tree to any considerable extent.
5. A decision tree model is very intuitive and easy to explain to technical teams as well as stakeholders.

Disadvantages:
1. A small change in the data can cause a large change in the structure of the decision tree, causing instability.
2. For a decision tree, calculations can sometimes become far more complex than for other algorithms.
3. Decision trees often take longer to train than other models.
4. Decision tree training is relatively expensive, as it demands more complexity and time.
5. The decision tree algorithm is inadequate for regression, i.e. for predicting continuous values.
6. Decision trees are prone to overfitting.

Classification Tree Using Python


1- Import Packages
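
The import cell appears only as a screenshot in the original handout. The following is a plausible reconstruction of the packages the later steps rely on; the exact imports in the lab may differ:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.metrics import accuracy_score
import graphviz  # requires the python-graphviz package (see step 6)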

2- Overview of the problem set


Problem Statement: You are given the iris dataset, which consists of 3 different types of irises: Setosa, Versicolour, and Virginica. The petal and sepal length and width are stored in a 150x4 numpy array; the dataset thus contains 150 iris samples, where each sample has four features (the petal and sepal length and width). We will build a decision tree classifier that can correctly classify irises as Setosa, Versicolour, or Virginica. Let's get more familiar with the dataset.

Note: You should reshape the iris target from shape (150,) to (150,1) to treat it as a column vector.
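
A sketch of the loading step described above, including the suggested reshape of the target into a column vector (the variable names X and y are illustrative):

# Assumes the imports from step 1
iris = load_iris()
X = iris.data                    # shape (150, 4): sepal/petal length and width
y = iris.target.reshape(-1, 1)   # reshape (150,) -> (150, 1) column vector

print(X.shape)            # (150, 4)
print(y.shape)            # (150, 1)
print(iris.target_names)  # ['setosa' 'versicolor' 'virginica']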

3- Splitting Data into Training and Test Sets
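
The split itself is shown as an image in the handout; a typical version, with an assumed test fraction and random seed, looks like this:

# Hold out a quarter of the samples for testing (fraction and seed are assumptions)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

print(X_train.shape, X_test.shape)  # (112, 4) (38, 4)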

4- Building Model
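
A minimal fitting sketch; the lab's actual cell may pass extra parameters such as criterion or max_depth:

# With default settings the tree grows until every leaf is pure
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train.ravel())  # ravel() flattens the (n, 1) column vector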

5- Measuring Model Performance
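
A hedged sketch of the evaluation step, reporting accuracy on the held-out test set:

y_pred = clf.predict(X_test)
print("Test accuracy:", accuracy_score(y_test.ravel(), y_pred))
# Equivalently: clf.score(X_test, y_test.ravel())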



6- Displaying Decision Tree

To draw the tree, we need to install the graphviz library by entering the following command in an Anaconda prompt:
conda install python-graphviz
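
The drawing code is a screenshot in the original; a common pattern with export_graphviz (assumed here, not copied from the lab) is:

# Export the trained tree to DOT format and render it with graphviz
dot_data = export_graphviz(
    clf, out_file=None,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    filled=True, rounded=True)
graph = graphviz.Source(dot_data)
graph.render("iris_tree")  # writes iris_tree.pdf; in Jupyter, `graph` displays inline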

7- Features importance

As we note, petal length and petal width have the highest feature importance weights. Keep in mind that if a feature has a low feature importance value, it doesn't necessarily mean that the feature isn't important for prediction; it may just mean that the feature wasn't chosen at a particularly early level of the tree. It could also be that the feature is identical to, or highly correlated with, another informative feature.
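
A sketch of how the importance weights discussed above could be printed; the exact numbers depend on the trained tree:

# feature_importances_ sums to 1 across all features
for name, score in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {score:.3f}")
# Typically petal length/width dominate and the sepal features get near-zero weight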

8- Tuning Model hyperparameters

One way to improve the performance of our model is to find the optimal value for the max_depth hyperparameter. The code below outputs the accuracy of decision trees with different values of max_depth.
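
One plausible shape for that code (the original cell is an image); it reports training and test accuracy for a range of assumed candidate depths:

for depth in range(1, 7):
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train.ravel())
    print(f"max_depth={depth}: "
          f"train={model.score(X_train, y_train.ravel()):.3f}, "
          f"test={model.score(X_test, y_test.ravel()):.3f}")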

9- Model Decision Boundary
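
The boundary plot appears only as an image in the handout. A hedged reconstruction, restricted to the two petal features so the regions can be drawn in 2-D (the feature choice and max_depth=3 are assumptions):

# Refit on petal length and petal width only, so the boundary is 2-D
X2 = iris.data[:, 2:4]
clf2 = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X2, iris.target)

# Evaluate the model on a dense grid and colour each region by predicted class
xx, yy = np.meshgrid(
    np.linspace(X2[:, 0].min() - 0.5, X2[:, 0].max() + 0.5, 300),
    np.linspace(X2[:, 1].min() - 0.5, X2[:, 1].max() + 0.5, 300))
Z = clf2.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X2[:, 0], X2[:, 1], c=iris.target, edgecolor="k")
plt.xlabel("petal length (cm)")
plt.ylabel("petal width (cm)")
plt.title("Decision tree decision boundary (max_depth = 3)")
plt.show()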



10- Suggested Models

Model A

Model B

Model C

Model D

Model E

Good Luck :)
