
ID3 Decision Tree Classifier from scratch in Python
Coding the ID3 algorithm to build a Decision Tree Classifier from scratch.

Bernardo Garcia del Rio · Towards Data Science · Dec 13, 2020 · 7 min read

https://towardsdatascience.com/id3-decision-tree-classifier-from-scratch-in-python-b38ef145fd90

Photo by Fabrice Villard on Unsplash



In my last article, I showed how you could have some fun growing a Random Forest
using Sklearn’s DecisionTreeClassifier. You can check it out below:

Growing a Random Forest using Sklearn’s DecisionTreeClassifier


Understanding Decision Trees and Random Forests with a hands-on
example
bernardogarciadelrio.medium.com

If you are new to Machine Learning (ML) or are in the process of becoming a Data
Scientist or ML practitioner like me, you will probably get something out of it. In my
case, preparing the material for the article helped me gain a good understanding of how
Random Forests and Decision Trees work under the hood without getting into complex
code.

In that article, I mentioned that there are many algorithms that can be used to build a
Decision Tree. One of them is ID3 (Iterative Dichotomiser 3) and we are going to see
how to code it from scratch using ONLY Python to build a Decision Tree Classifier.

All the code can be found in a public repository that I have attached below:

bergr7/ID3_From_Scratch
github.com

Modelling the nodes of the tree


A Decision Tree is formed by nodes: root node, internal nodes and leaf nodes. We can
create a Python class that will contain all the information of all the nodes of the Decision
Tree.

class Node:
    """Contains the information of the node and another nodes of the Decision Tree."""

    def __init__(self):
        self.value = None
        self.next = None
        self.childs = None

The class Node will contain the following information:

value: Feature to make the split and branches.

next: Next node

childs: Branches coming off the decision nodes

Decision Tree Classifier Class


We now create our main class, called DecisionTreeClassifier, and use the __init__
constructor to initialise the attributes of the class and some important variables that
will be needed.

Note that I have provided many annotations in the code snippets that help understand
the code.

class DecisionTreeClassifier:
    """Decision Tree Classifier using ID3 algorithm."""

    def __init__(self, X, feature_names, labels):
        self.X = X  # features or predictors
        self.feature_names = feature_names  # name of the features
        self.labels = labels  # categories
        self.labelCategories = list(set(labels))  # unique categories
        # number of instances of each category
        self.labelCategoriesCount = [list(labels).count(x) for x in self.labelCategories]
        self.node = None  # nodes
        # calculate the initial entropy of the system
        self.entropy = self._get_entropy([x for x in range(len(self.labels))])


Note that, in order to calculate the entropy, we use the private function _get_entropy( ).
The code for this function is provided below:

# note: this method relies on `import math` at the top of the module

def _get_entropy(self, x_ids):
    """ Calculates the entropy.
    Parameters
    __________
    :param x_ids: list, List containing the instances ID's
    __________
    :return: entropy: float, Entropy.
    """
    # sorted labels by instance id
    labels = [self.labels[i] for i in x_ids]
    # count number of instances of each category
    label_count = [labels.count(x) for x in self.labelCategories]
    # calculate the entropy for each category and sum them
    entropy = sum([-count / len(x_ids) * math.log(count / len(x_ids), 2)
                   if count else 0
                   for count in label_count])

    return entropy

We pass the instance ids, or indexes, to this function. To do this, we need to generate
a unique number for each instance. Python's list comprehensions come in very handy
for this task, as you can see.

We are going to code an ID3 algorithm that uses the information gain to find the
feature that maximises it and makes a split based on that feature. The information gain is
based on entropy.

As this article is about coding the ID3 algorithm, I am not going to go into the details, but
entropy is a measure of uncertainty about a random variable. If we minimise the
entropy, then we increase the certainty about the variable. In other words, if the random
variable can take only one value, the entropy reaches its minimum, whereas if all the
values are equiprobable, the entropy is at its maximum.
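As a quick numerical check of that claim, here is a small sketch (not from the article) that computes the entropy of a few probability distributions directly:

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return sum(-p * math.log(p, 2) for p in probs if p > 0)

low = entropy([1.0])        # one possible value only: minimum entropy (0 bits)
high = entropy([0.5, 0.5])  # two equiprobable values: 1 bit, the binary maximum
mid = entropy([0.9, 0.1])   # a skewed distribution falls somewhere in between
```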
The algorithm will try to minimise the entropy or, equivalently, maximise the information
gain.

We can compute the entropy with the following formula:

Entropy formula (Caption by Author):

E = - Σ_{i=1}^{N} p_i log2(p_i)

where p_i is the proportion of each category, i = 1, …, N.

We said that we would compute the information gain to choose the feature that
maximises it and then make the split based on that feature. The information gain is a
concept based on entropy. It is defined as the total entropy minus the entropy we would
have after splitting on a particular feature j:

Information gain (Caption by Author):

IG(j) = E - E(feature j)

So we create another private function that computes the information gain:


_get_information_gain( ) takes the instance ids and the feature id of the feature to be
evaluated. Then it calculates the total entropy, the entropy if we selected the feature
specified in feature_id and, finally, the information gain.

The ID3 algorithm chooses the feature that maximises the information gain at each split.
That is, if the amount of information in the feature is large, the entropy after splitting on
it will be small, so the second term of the information gain formula will also be small,
increasing the information gain.

_get_information_gain( ) only calculates the information gain for a given set and feature
but, at each split, we need to figure out which feature maximises the information gain.
So we create another function to find the feature that maximises the information gain.


Ok... so we have a private function (_get_feature_max_information_gain( )) that finds
the feature that maximises the information gain, which uses another private function
(_get_information_gain( )) that calculates the information gain, which in turn uses another
private function (_get_entropy( )) that computes the entropy. We are all set! We can
start coding the ID3 algorithm that will create our ID3 Decision Tree for classification
problems.
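The screenshots of _get_information_gain( ) and _get_feature_max_information_gain( ) did not survive in this copy of the article. Below is a rough standalone sketch of that chain of functions, written as plain functions rather than the class's private methods; the bodies are reconstructed from the description above, so details may differ from the author's exact code:

```python
import math

def get_entropy(x_ids, labels):
    """Entropy of the categories of the instances listed in x_ids."""
    subset = [labels[i] for i in x_ids]
    n = len(subset)
    return sum(-c / n * math.log(c / n, 2)
               for c in (subset.count(v) for v in set(subset)))

def get_information_gain(x_ids, feature_id, X, labels):
    """Total entropy minus the weighted entropy after splitting on feature_id."""
    total_entropy = get_entropy(x_ids, labels)
    # group the instance ids by the value they take for this feature
    groups = {}
    for i in x_ids:
        groups.setdefault(X[i][feature_id], []).append(i)
    # weighted entropy of the partition induced by the feature
    feature_entropy = sum(len(ids) / len(x_ids) * get_entropy(ids, labels)
                          for ids in groups.values())
    return total_entropy - feature_entropy

def get_feature_max_information_gain(x_ids, feature_ids, X, labels, feature_names):
    """Name and id of the feature that maximises the information gain."""
    gains = [get_information_gain(x_ids, fid, X, labels) for fid in feature_ids]
    best = feature_ids[gains.index(max(gains))]
    return feature_names[best], best
```

For example, on a four-row toy set where the first feature perfectly separates the two classes, the gain for that feature is 1 bit while the other feature's gain is 0, so the argmax picks the first.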

We create a function that initialises the algorithm and then uses a private function to call
the algorithm recursively to build our tree.


Note how we call the _id3_recv( ) function in self.node. This function returns an instance of
the class Node with information about all the nodes (decisions) in the Decision Tree.

_id3_recv( ) is the trickiest function to code, so let's spend some time understanding
what it does.

Let’s generate some data


I have generated a small toy dataset so we can work on a small example and actually
visualise the resulting tree. This will help us understand the algorithm as well.

Let’s imagine that we love surfing waves and would like to have a model to predict if
there will be good waves in our favourite spot.

Our features are:


Features (Caption by Author)

And our target is simply a binary categorical variable that tells us if there were/will be
good waves based on the information provided in our features:

Target (Caption by Author)

Let’s generate some synthetic data using random sampling:
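The data-generation snippet is missing from this copy. Here is a sketch of how such a toy set could be sampled; the feature names (Wind, Swell, Tide), their categories, and the labelling rule are my guesses, inferred from the discussion of the resulting tree later in the article, not the author's exact code:

```python
import random

random.seed(42)  # make the sample reproducible

feature_names = ['Wind', 'Swell', 'Tide']  # hypothetical feature names
wind_values = ['North', 'South', 'East', 'West']
swell_values = ['Small', 'Medium', 'Large']
tide_values = ['Low', 'Mid', 'High']

n = 30  # number of synthetic instances
X = [[random.choice(wind_values),
      random.choice(swell_values),
      random.choice(tide_values)] for _ in range(n)]

def good_waves(wind, swell, tide):
    """Assumed labelling rule, consistent with the tree described later:
    the spot works with a South wind and either a large swell, or a
    medium swell during low tide."""
    return wind == 'South' and (swell == 'Large'
                                or (swell == 'Medium' and tide == 'Low'))

labels = ['Yes' if good_waves(*row) else 'No' for row in X]
```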


First 5 rows of our toy dataset (Caption by Author)

_id3_recv( ) function
Now that we have some data to work with, it will be easier to understand the _id3_recv( )
function. Below is the code with some annotations:


The function takes two lists containing instances ids and feature ids and an instance of
the class Node. Firstly, it checks if the node attribute is None and if it is, it initialises an
instance of the class Node.

Then, at each split the algorithm checks if some of the following conditions are met:

Are all the instances of the same category? If so, it assigns the category to the
node value and returns the variable node. The following chunk of code does this
task in the function:
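The screenshot of that chunk is missing here. A minimal standalone sketch of the check (the function name is illustrative, not the author's exact code):

```python
def pure_category(x_ids, labels):
    """Return the shared category if every instance in x_ids carries the
    same label, otherwise None (meaning the split must continue)."""
    labels_in_node = [labels[i] for i in x_ids]
    if len(set(labels_in_node)) == 1:
        return labels_in_node[0]
    return None
```

In the real function, a non-None result would be stored in node.value before returning node.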


Are there more features to compute the information gain? If there are not, it
assigns the most probable category to the node value (it computes the mode) and
returns the variable node. The code doing this is below:
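That screenshot is also missing; computing the mode can be sketched like this (illustrative name, not the author's exact code):

```python
def most_probable_category(x_ids, labels):
    """Mode of the labels of the remaining instances in x_ids."""
    labels_in_node = [labels[i] for i in x_ids]
    # the category with the highest count among the remaining instances
    return max(set(labels_in_node), key=labels_in_node.count)
```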


If none of the above conditions are met, it chooses the feature that maximises the
information gain and stores it in the node value. In this case, the node value is the
feature with which it is making the split.

The ID3 algorithm creates a branch for each value of the selected feature and finds the
instances in the training set that take that branch. Note each branch is represented with
a new instance of the class Node that also contains the next node.

For the next node, it computes the information gain for the remaining features and
instances and chooses the one that maximises it to make another split, until all the
instances have the same class or there are no more features to compute, in which case it
returns the most probable category. All of this is done in the last part of the function:
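Since that final screenshot is also missing, here is a compact end-to-end sketch of the recursion just described. For brevity it returns nested dicts instead of the article's Node instances, and the bodies are my reconstruction from the text, not the author's code:

```python
import math

def _entropy(labels_in_node):
    """Entropy of a list of category labels."""
    n = len(labels_in_node)
    return sum(-c / n * math.log(c / n, 2)
               for c in (labels_in_node.count(v) for v in set(labels_in_node)))

def _gain(x_ids, fid, X, labels):
    """Information gain of splitting the instances in x_ids on feature fid."""
    groups = {}
    for i in x_ids:
        groups.setdefault(X[i][fid], []).append(i)
    conditional = sum(len(ids) / len(x_ids) * _entropy([labels[i] for i in ids])
                      for ids in groups.values())
    return _entropy([labels[i] for i in x_ids]) - conditional

def id3(x_ids, feature_ids, X, labels, feature_names):
    labels_in_node = [labels[i] for i in x_ids]
    # base case 1: every instance has the same category
    if len(set(labels_in_node)) == 1:
        return labels_in_node[0]
    # base case 2: no features left, return the most probable category
    if not feature_ids:
        return max(set(labels_in_node), key=labels_in_node.count)
    # split on the feature that maximises the information gain
    gains = [_gain(x_ids, fid, X, labels) for fid in feature_ids]
    best = feature_ids[gains.index(max(gains))]
    remaining = [fid for fid in feature_ids if fid != best]
    tree = {feature_names[best]: {}}
    # one branch per observed value of the chosen feature, built recursively
    for value in sorted({X[i][best] for i in x_ids}):
        branch_ids = [i for i in x_ids if X[i][best] == value]
        tree[feature_names[best]][value] = id3(branch_ids, remaining,
                                               X, labels, feature_names)
    return tree
```

On a toy set where the wind direction alone separates the classes, the sketch returns a one-level tree splitting on Wind.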


Resulting Decision Tree


And here we have the Decision Tree Classifier that the ID3 algorithm has built using our
synthetic data:

Resulting Decision Tree (Caption by Author)

It seems that our “synthetic” beach only works well when the wind blows from the
South. If there is a large swell, it is likely that there will be good waves regardless of the
tide but if the swell is medium, we will usually find better surfing conditions during low
tide.

I would like to give credit to the Professors of the MSc in Machine Learning, Deep Learning
and AI by Big Data International Campus for providing a version of the source code that
inspired me to write this article.

References

1. Text notes of the MSc in Machine Learning, Deep Learning and AI by Big Data
International Campus and UCAM
