ID3 Decision Tree Classifier from scratch in Python
by Bernardo Garcia del Rio | Towards Data Science
In my last article, I showed how you could have some fun growing a Random Forest
using Sklearn’s DecisionTreeClassifier. You can check it out below:
If you are new to Machine Learning (ML) or are in the process of becoming a Data
Scientist or ML practitioner like me, you will probably get something out of it. In my
case, preparing the material for the article helped me gain a good understanding of how
Random Forests and Decision Trees work under the hood without getting into complex
code.
In that article, I mentioned that there are many algorithms that can be used to build a
Decision Tree. One of them is ID3 (Iterative Dichotomiser 3) and we are going to see
how to code it from scratch using ONLY Python to build a Decision Tree Classifier.
All the code can be found in a public repository that I have attached below:
bergr7/ID3_From_Scratch
class Node:
    """Contains the information of a node and links to other nodes of the Decision Tree."""

    def __init__(self):
        self.value = None   # feature to split on, or the predicted category at a leaf
        self.next = None    # next node
        self.childs = None  # list of child nodes, one per value of the selected feature
nodes.py
Note that I have provided many annotations in the code snippets to help you understand the code.
class DecisionTreeClassifier:
    """Decision Tree Classifier using ID3 algorithm."""

    def __init__(self, X, feature_names, labels):
        self.X = X  # features or predictors
        self.feature_names = feature_names  # name of the features
        self.labels = labels  # categories
        self.labelCategories = list(set(labels))  # unique categories
        # number of instances of each category
        self.labelCategoriesCount = [list(labels).count(x) for x in self.labelCategories]
        self.node = None  # nodes
        # calculate the initial entropy of the system
        self.entropy = self._get_entropy([x for x in range(len(self.labels))])
decisiontree.py
entropy.py
We pass the instance ids or indexes to this function. To do this, we need to generate a unique number for each instance. Python's list comprehensions come in very handy for this task, as you can see.
We are going to code an ID3 algorithm that uses the information gain to find the feature that maximises it and makes a split based on that feature. The information gain is based on entropy.
As this article is about coding the ID3 algorithm, I am not going to go into the details, but entropy is a measure of uncertainty about a random variable. If we minimise the entropy, we increase the certainty about the variable. In other words, if the random variable can take only one value, the entropy reaches its minimum, whereas if all the values are equiprobable, the entropy is maximum.
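For reference, the entropy of a set of instances S is H(S) = -sum_i p_i * log2(p_i), where p_i is the proportion of instances in category i. The entropy.py gist above is only a caption here, so below is a minimal sketch of what a _get_entropy() method could look like, given the attributes defined in __init__; the exact implementation in the linked repository may differ.

import math

# Sketch of a _get_entropy() method of DecisionTreeClassifier (assumed implementation).
def _get_entropy(self, x_ids):
    """Calculates the entropy of the labels of the instances with the given ids."""
    # labels of the selected instances
    labels = [self.labels[i] for i in x_ids]
    # number of instances of each category among the selected instances
    label_count = [labels.count(c) for c in self.labelCategories]
    # entropy: -sum(p_i * log2(p_i)), skipping empty categories to avoid log2(0)
    entropy = sum(
        -count / len(x_ids) * math.log(count / len(x_ids), 2)
        for count in label_count
        if count
    )
    return entropy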
The algorithm will try to minimise the entropy or, equivalently, maximise the information gain.
We said that we would compute the information gain to choose the feature that maximises it and then make the split based on that feature. The information gain is a concept based on entropy. It is defined as the total entropy minus the entropy that remains if we split on a particular feature j.
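In the standard notation, with S the set of instances at the node and S_v the subset of S where feature j takes the value v, this reads:

IG(S, j) = H(S) - \sum_{v \in \mathrm{values}(j)} \frac{|S_v|}{|S|} \, H(S_v)

where H(S) = -\sum_i p_i \log_2 p_i is the entropy defined above. The second term is the weighted average of the entropies of the subsets created by splitting on feature j.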
_get_information_gain( ) takes the instance ids and the id of the feature to be evaluated. It then calculates the total entropy, the entropy we would get if we split on the feature specified in feature_id and, finally, the information gain.
The ID3 algorithm chooses the feature that maximises the information gain at each split. That is, if a feature carries a lot of information about the label, the entropy remaining after splitting on it will be small, so the second term of the information gain formula will also be small and the information gain will be large.
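As a minimal sketch of what such a method could look like, following the description above (the exact implementation in the linked repository may differ):

# Sketch of a _get_information_gain() method of DecisionTreeClassifier (assumed implementation).
def _get_information_gain(self, x_ids, feature_id):
    """Information gain obtained by splitting the given instances on one feature."""
    # total entropy of the selected instances
    info_gain = self._get_entropy(x_ids)
    # values that the feature takes in the selected instances
    x_features = [self.X[i][feature_id] for i in x_ids]
    feature_values = list(set(x_features))
    # subtract the weighted entropy of each subset created by the feature values
    for value in feature_values:
        value_ids = [i for i in x_ids if self.X[i][feature_id] == value]
        info_gain -= len(value_ids) / len(x_ids) * self._get_entropy(value_ids)
    return info_gain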
_get_information_gain( ) only calculates the information gain for a given set of instances and a single feature but, at each split, we need to figure out which feature maximises the information gain. So we need another function that finds the feature with the maximum information gain.
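A minimal sketch of such a helper, here called _get_feature_max_information_gain() purely for illustration (the actual method name and code in the repository may differ):

# Sketch of a helper that finds the feature with the highest information gain (assumed name).
def _get_feature_max_information_gain(self, x_ids, feature_ids):
    """Returns the name and id of the feature that maximises the information gain."""
    # information gain of each candidate feature
    features_info_gain = [self._get_information_gain(x_ids, feature_id)
                          for feature_id in feature_ids]
    # id of the feature with the maximum information gain
    max_id = feature_ids[features_info_gain.index(max(features_info_gain))]
    return self.feature_names[max_id], max_id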
We create a function that initialises the algorithm and then uses a private function to call
the algorithm recursively to build our tree.
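A minimal sketch of that entry point, here called id3() purely for illustration (the actual method name in the repository may differ):

# Sketch of the public method that kicks off the recursive construction (assumed name).
def id3(self):
    """Initialises the instance and feature ids and builds the tree recursively."""
    # ids of all the training instances and of all the features
    x_ids = [x for x in range(len(self.X))]
    feature_ids = [x for x in range(len(self.feature_names))]
    # build the tree and store the root node
    self.node = self._id3_recv(x_ids, feature_ids, self.node)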
Note how we call the _id3_recv( ) function and store its result in self.node. This function returns an instance of the class Node with the information of all the nodes (decisions) in the Decision Tree.
_id3_recv( ) is the trickiest function to code, so let's spend some time understanding what it does.
Let’s imagine that we love surfing waves and would like to have a model to predict if
there will be good waves in our favourite spot.
And our target is simply a binary categorical variable that tells us whether there were/will be good waves, based on the information provided in our features.
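Purely as a hypothetical stand-in for the tables shown in the original article (this is NOT the article's actual data), a toy dataset along those lines, with wind direction, swell size and tide as features and a yes/no label for good waves, could be set up and fed to the classifier like this:

# Hypothetical toy data for illustration only; the article's dataset may differ.
X = [
    ['South', 'Large',  'Low'],
    ['South', 'Medium', 'Low'],
    ['North', 'Large',  'High'],
    ['South', 'Medium', 'High'],
    ['West',  'Small',  'Low'],
]
feature_names = ['Wind direction', 'Swell size', 'Tide']
labels = ['Yes', 'Yes', 'No', 'No', 'No']

tree_clf = DecisionTreeClassifier(X=X, feature_names=feature_names, labels=labels)
tree_clf.id3()  # entry-point name assumed from the sketch above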
_id3_recv( ) function
Now that we have some data to work with, it will be easier to understand the _id3_recv( ) function. Let's walk through the code, with some annotations, piece by piece:
The function takes two lists containing the instance ids and the feature ids, and an instance of the class Node. Firstly, it checks whether node is None and, if it is, it initialises an instance of the class Node.
Then, at each split, the algorithm checks whether any of the following conditions is met:
Are all the instances of the same category? If so, it assigns that category to the node value and returns the node. The following chunk of code does this task in the function:
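A minimal sketch of what that chunk could look like inside _id3_recv( ), given the attributes defined earlier (the exact code in the repository may differ):

# Inside _id3_recv(): sketch of the "all instances share one category" check.
labels_in_features = [self.labels[i] for i in x_ids]
if len(set(labels_in_features)) == 1:
    # all the selected instances belong to the same category: make a leaf
    node.value = self.labels[x_ids[0]]
    return node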
Are there any more features left to compute the information gain on? If there are not, it assigns the most probable category to the node value (it computes the mode) and returns the node. The code doing this is below:
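A minimal sketch of that check, reusing the labels_in_features list from the previous chunk (again, the exact code in the repository may differ):

# Inside _id3_recv(): sketch of the "no features left" check.
if len(feature_ids) == 0:
    # assign the most probable (most frequent) category, i.e. the mode of the labels
    node.value = max(set(labels_in_features), key=labels_in_features.count)
    return node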
If none of the above conditions are met, it chooses the feature that maximises the
information gain and stores it in the node value. In this case, the node value is the
feature with which it is making the split.
The ID3 algorithm creates a branch for each value of the selected feature and finds the instances in the training set that take that branch. Note that each branch is represented with a new instance of the class Node that also contains the next node.
For the next node, it computes the information gain for the remaining features and instances and chooses the one that maximises it to make another split, until all the instances have the same class or there are no more features to compute, in which case it returns the most probable category. All of this is done in the last part of the function:
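A minimal sketch of that last part, using the Node class from nodes.py and the _get_feature_max_information_gain() helper sketched earlier (the exact code in the repository may differ):

# Inside _id3_recv(): sketch of the split on the best feature and the recursive calls.
best_feature_name, best_feature_id = self._get_feature_max_information_gain(x_ids, feature_ids)
node.value = best_feature_name
node.childs = []
# one branch per value taken by the chosen feature in the current instances
feature_values = list(set([self.X[i][best_feature_id] for i in x_ids]))
for value in feature_values:
    child = Node()
    child.value = value  # the branch is labelled with the feature value
    node.childs.append(child)
    # instances of the training set that take this branch
    child_x_ids = [i for i in x_ids if self.X[i][best_feature_id] == value]
    # drop the chosen feature and keep splitting on the remaining ones
    remaining_feature_ids = [f for f in feature_ids if f != best_feature_id]
    child.next = self._id3_recv(child_x_ids, remaining_feature_ids, child.next)
return node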
It seems that our “synthetic” beach only works well when the wind blows from the South. If there is a large swell, it is likely that there will be good waves regardless of the tide, but if the swell is medium, we will usually find better surfing conditions during low tide.
I would like to give credit to the Professors of the MSc in Machine Learning, Deep Learning and AI by Big Data International Campus for providing a version of the source code that inspired me to write this article.
References
1. Text notes of MSc in Machine Learning, Deep Learning and AI by Big Data