
Decision Tree

What is a Decision Tree?


• A decision tree is a supervised learning algorithm that can be used for
both classification and regression tasks.
• It works by recursively partitioning the data into subsets based on
feature values, making decisions at each node to maximize a specific
criterion (e.g., information gain or Gini index).
• Key Components:
• Root Node: The top node in the tree that represents the best
feature to split the data.
• Internal Nodes: Represent the features used for splitting the data
based on specific decision rules.
• Leaf Nodes: Terminal nodes that represent the predicted outcome
(class label or numerical value).
• Branches: Connections between nodes representing the possible
values of the features.
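
A minimal sketch of these components as Python data structures (the class names Leaf and Node are illustrative, not from any library):

from dataclasses import dataclass, field

@dataclass
class Leaf:
    prediction: str                     # leaf node: class label (or a numeric value for regression)

@dataclass
class Node:
    feature: str                        # root/internal node: the feature used for splitting
    branches: dict = field(default_factory=dict)   # branch: feature value -> child Node or Leaf

# Example: a root node on "Outlook" with one pure branch
root = Node(feature="Outlook", branches={"Overcast": Leaf(prediction="Yes")})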
Decision Tree
Create Decision Tree
• The process of creating a decision tree involves:
1. Selecting the Best Attribute: Using a metric like Gini impurity,
entropy, or information gain, the best attribute to split the data is
selected.
2. Splitting the Dataset: The dataset is split into subsets based on the
selected attribute.
3. Repeating the Process: The process is repeated recursively for each
subset, creating a new internal node or leaf node until a stopping
criterion is met (e.g., all instances in a node belong to the same
class or a predefined depth is reached).
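
The three numbered steps above can be sketched as a short recursive function (a minimal, illustrative implementation using pandas and Gini impurity; the helper names are not from any specific library):

import pandas as pd

def gini(labels: pd.Series) -> float:
    """Gini impurity of a column of class labels."""
    p = labels.value_counts(normalize=True)
    return 1.0 - (p ** 2).sum()

def best_attribute(df: pd.DataFrame, features, target: str) -> str:
    """Step 1: select the attribute whose split has the lowest weighted Gini impurity."""
    def weighted_gini(feature):
        return sum(len(g) / len(df) * gini(g[target]) for _, g in df.groupby(feature))
    return min(features, key=weighted_gini)

def build_tree(df: pd.DataFrame, features, target: str, depth=0, max_depth=5):
    labels = df[target]
    # Stopping criteria: pure node, no features left, or maximum depth reached.
    if labels.nunique() == 1 or not features or depth >= max_depth:
        return labels.mode()[0]                        # leaf node: majority class
    attr = best_attribute(df, features, target)        # step 1: best attribute
    remaining = [f for f in features if f != attr]
    return {attr: {                                    # internal node as a nested dict
        value: build_tree(subset, remaining, target, depth + 1, max_depth)   # step 3: recurse
        for value, subset in df.groupby(attr)          # step 2: split on the selected attribute
    }}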
Dataset where Play Tennis is the target output and
all the remaining columns are input features
Dataset Information
• Total observations: 14
• Number of observations for Yes class: 9
• Number of observations for No class: 5
• Dataset columns:
• Outlook: Sunny, Overcast, Rain
• Temperature: Hot, Mild, Cool
• Humidity: High, Normal
• Wind: Weak, Strong
• Play Tennis: Yes, No (Target)
Decision tree example where the root node
chosen is Outlook
Leaf Node
• Leaf nodes represent the final output or prediction of the decision tree.
• Once a data point reaches a leaf node, a decision or prediction is made
based on the majority class (for classification) or the average value (for
regression) of the data points that reach that leaf.
• To check mathematically whether a split is pure, we use entropy or Gini
impurity.
• Information Gain helps us determine which feature should be selected for
a split.
Entropy
• Entropy is a measure of uncertainty or impurity. A low entropy
indicates a more ordered or homogeneous set, while a high entropy
signifies greater uncertainty or diversity.
• In the context of a decision tree, the goal is to reduce entropy by
selecting features and split points that result in more ordered subsets.
• For binary classification, entropy values range from 0 to 1.
• The minimum entropy (0) occurs when all instances belong to a single
class, making the set perfectly ordered.
• The maximum entropy (1 for two classes) occurs when observations are
evenly distributed across all classes, creating a state of maximum
randomness.
Gini impurity
• Gini impurity is a measure of the impurity or randomness in a set of
elements, commonly used in decision tree algorithms, especially for
classification tasks.
• A lower Gini impurity suggests a more homogeneous set of elements
within the node, making it an attractive split in a decision tree.
• Decision tree algorithms aim to minimize the Gini impurity at each
node, selecting the feature and split point that results in the lowest
impurity.
• Entropy and Gini impurity formulas:
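For a node with classes i = 1, …, k and class proportions p_i, the standard definitions (consistent with the values used later in these slides) are:

$$H(X) = -\sum_{i=1}^{k} p_i \log_2 p_i \qquad\qquad G(X) = 1 - \sum_{i=1}^{k} p_i^{2}$$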
Entropy and Gini index values
➢Entropy:
• The minimum value of entropy is 0.
• The maximum entropy depends on the number of classes:
• For 2 classes (binary classification): maximum entropy is 1.
• For 3 classes: maximum entropy is log2(3) ≈ 1.585.
• For 4 classes: maximum entropy is log2(4) = 2, and so on.
➢Gini index:
• G = 0 indicates a perfectly pure node (all elements belong to the same
class).
• G = 0.5 indicates maximum impurity for two classes (elements are evenly
distributed between the classes); in general, the maximum is 1 − 1/k for
k classes.
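
A minimal plain-Python sketch that reproduces these boundary values:

import math

def entropy(proportions):
    """Entropy in bits of a list of class proportions."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)

def gini(proportions):
    """Gini impurity of a list of class proportions."""
    return 1 - sum(p * p for p in proportions)

print(entropy([1.0]))            # pure node -> 0.0
print(entropy([0.5, 0.5]))       # balanced binary -> 1.0
print(entropy([1/3, 1/3, 1/3]))  # balanced 3-class -> ~1.585
print(gini([1.0]))               # pure node -> 0.0
print(gini([0.5, 0.5]))          # balanced binary -> 0.5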
Example

If our dataset is huge, Gini impurity is usually preferred, because its
calculation is much simpler (no logarithms) compared to entropy.
Information Gain
Feature selection for splitting

Since the information gain of f2 is greater than that of f1, the split will
be made on f2.
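In general, for a split of a dataset X on a feature f into subsets X_v (one per value v of f), the information gain is the reduction in entropy:

$$IG(X, f) = H(X) - \sum_{v} \frac{|X_v|}{|X|}\, H(X_v)$$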
Mini version of Play Tennis dataset
Information Gain for the attribute “Outlook”
• Step 1: Calculate the Entropy of the Dataset.
• Step 2: Calculate the Entropy After the Split:
• Sunny: 2 “No”, 1 “Yes”
• Overcast: all “Yes” (no uncertainty here)
• Rain: 2 “Yes”, 1 “No”
• Step 3: Calculate the Information Gain.

Interpretation
• The Information Gain for the attribute “Outlook” is 0.45. This means
that splitting the data based on “Outlook” reduces the entropy
(uncertainty) of the dataset by 0.45 bits.
• It’s the most informative attribute in this case, and that’s why a
decision tree would likely choose it as the first split.
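A minimal pandas sketch of these three steps (column names follow the Play Tennis dataset used in these slides; the size of the Overcast subset below is assumed, since the slide only says that branch is all “Yes”):

import math
import pandas as pd

def entropy(labels: pd.Series) -> float:
    """Step 1 helper: entropy (in bits) of a column of class labels."""
    probs = labels.value_counts(normalize=True)
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_gain(df: pd.DataFrame, feature: str, target: str) -> float:
    """Step 1: parent entropy; Step 2: weighted entropy after the split; Step 3: their difference."""
    parent = entropy(df[target])
    after_split = sum(
        len(subset) / len(df) * entropy(subset[target])
        for _, subset in df.groupby(feature)
    )
    return parent - after_split

# A small illustrative slice of the Play Tennis data; the two Overcast rows
# are an assumption, so the printed value need not match the slide's 0.45.
mini = pd.DataFrame({
    "Outlook": ["Sunny", "Sunny", "Sunny", "Overcast", "Overcast", "Rain", "Rain", "Rain"],
    "Play Tennis": ["No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})
print(information_gain(mini, "Outlook", "Play Tennis"))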
Steps to Construct Decision Tree using Gini Index

1. Calculate the Gini index for the dataset.
2. Split the data by each attribute and compute the Gini index for each split.
3. Choose the best attribute for splitting the dataset by selecting the
one with the lowest (weighted) Gini index.
4. Recursively repeat steps 1–3 for each subset until all data points
are classified or another stopping criterion is met (e.g., a maximum
tree depth is reached); a numeric sketch of steps 1–3 follows below.
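
A numeric sketch of steps 1–3 for the root split, using the class counts from the full 14-row Play Tennis table:

def gini_from_counts(counts):
    """Gini index from raw class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Step 1: Gini index of the whole dataset (9 Yes, 5 No)
g_dataset = gini_from_counts([9, 5])                 # ~0.459 (rounded to 0.45 on the slides)

# Step 2: Gini index after splitting on Outlook (class counts per branch, from the table)
g_sunny    = gini_from_counts([2, 3])                # Sunny: 2 Yes, 3 No
g_overcast = gini_from_counts([4, 0])                # Overcast: 4 Yes, 0 No
g_rain     = gini_from_counts([3, 2])                # Rain: 3 Yes, 2 No
g_outlook  = 5/14 * g_sunny + 4/14 * g_overcast + 5/14 * g_rain   # ~0.343

# Step 3: the Gini reduction for Outlook; the other attributes are evaluated the same way
print(g_dataset - g_outlook)   # ~0.116; the slides round intermediate values (0.45 - 0.34 = 0.11)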
Decision Tree on Playing Tennis using Gini Impurity
1. Calculate the Gini index for the entire dataset:

$$G(X) = 1 - \sum_{i} p(x_i)^2$$

where $p(x_i)$ is the proportion of instances belonging to class $i$. With 9 “Yes” and 5 “No” observations:

$$G(X) = 1 - \left[\left(\tfrac{9}{14}\right)^2 + \left(\tfrac{5}{14}\right)^2\right] = \tfrac{9}{14}\left(1 - \tfrac{9}{14}\right) + \tfrac{5}{14}\left(1 - \tfrac{5}{14}\right) \approx 0.45$$

• Numerically, Gini impurity and entropy turn out to be very similar. Gini
is often preferred because it avoids computing logarithms, which speeds
up the implementation.
• Evaluate Splits for each feature: Outlook, Temperature, Humidity, and
Wind.
• IG(Outlook) = 0.45 − 0.34 = 0.11
• IG(Temperature) = 0.04
• IG(Humidity) = 0.09
• IG(Wind) = 0.03
• IG (the reduction in Gini impurity) is maximized for Outlook, so Outlook
is selected as the root node at this level.
Information Gain on Sunny outlook factor
Now focusing on the left branch Outlook = Sunny
$$G(X \mid O=\text{Sunny}) = \tfrac{2}{5}\left(1 - \tfrac{2}{5}\right) + \tfrac{3}{5}\left(1 - \tfrac{3}{5}\right) = 1 - \left(p_1^2 + p_2^2\right) = 1 - \left[\left(\tfrac{2}{5}\right)^2 + \left(\tfrac{3}{5}\right)^2\right] = 0.48$$
• Choosing Humidity (H) as our next candidate node, the Gini impurity of
Play Tennis given Humidity and Outlook being Sunny is:
• Subset 1: High (3 Samples: 0 Yes, 3 No)
• Subset 2: Normal (2 Samples: 2 Yes, 0 No)
• $G(X \mid H, O=\text{Sunny}) = \tfrac{3}{5}\left[1 - \left(\tfrac{3}{3}\right)^2\right] + \tfrac{2}{5}\left[1 - \left(\tfrac{2}{2}\right)^2\right] = 0$
• IG(X | H, O = Sunny) = 0.48 − 0 = 0.48
• Similarly we can compute,
• IG(X | O = Sunny, Humidity) = 0.48
• IG(X | O = Sunny, Temperature) = 0.28
• IG(X | O = Sunny, Wind) = 0.013
• IG is maximized for Humidity so we will select Humidity as the node at
this level.
• Our tree will look like:
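At this point, in text form, the partial tree is:

Outlook
├─ Sunny    ──► Humidity
│               ├─ High   ──► No
│               └─ Normal ──► Yes
├─ Overcast ──► Yes
└─ Rain     ──► (to be expanded next)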
Information Gain on Rain outlook factor
Day   Wind    Decision
4     Weak    Yes
5     Weak    Yes
10    Weak    Yes

Day   Wind    Decision
6     Strong  No
14    Strong  No
Now focusing on the right branch Outlook = Rain
$$G(X \mid O=\text{Rain}) = \tfrac{3}{5}\left(1 - \tfrac{3}{5}\right) + \tfrac{2}{5}\left(1 - \tfrac{2}{5}\right) = 1 - \left[\left(\tfrac{3}{5}\right)^2 + \left(\tfrac{2}{5}\right)^2\right] = 0.48$$

• Choosing Wind (W) as our next candidate node, the Gini impurity of Play
Tennis given Wind and Outlook being Rain is:
• Subset 1: Weak (3 samples: 3 Yes, 0 No)
• Subset 2: Strong (2 samples: 0 Yes, 2 No)
• $G(X \mid W, O=\text{Rain}) = \tfrac{3}{5}\left[1 - \left(\tfrac{3}{3}\right)^2\right] + \tfrac{2}{5}\left[1 - \left(\tfrac{2}{2}\right)^2\right] = 0$
• IG(X | W, O = Rain) = 0.48 − 0 = 0.48
Information Gain for Humidity and Temperature with Outlook being Rain
• Similarly we can compute,
• IG(X | O = Rain, Humidity) = 0.48 − 0.46 = 0.02
• IG(X | O = Rain, Temperature) = 0.48 − 0.46 = 0.02
• IG(X | O = Rain, Wind) = 0.48 − 0 = 0.48
• IG is maximized for Wind so we will select Wind as the node at this
level.
Final Decision Tree
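In text form, the tree derived above is:

Outlook
├─ Sunny    ──► Humidity
│               ├─ High   ──► No
│               └─ Normal ──► Yes
├─ Overcast ──► Yes
└─ Rain     ──► Wind
                ├─ Weak   ──► Yes
                └─ Strong ──► No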
Test Instance
• For a test instance, we simply start from the root node and traverse
down to a leaf node.
• For example, consider a sample with: Outlook = Sunny; Humidity =
Normal; Wind = Strong; Temperature = Mild.
• We start at Outlook; since it is Sunny, we take the left branch. We then
check Humidity; since it is Normal, we take the right branch and reach a
leaf node whose output is Yes!
• Final decision: we will play tennis.
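
A tiny sketch of this traversal, hard-coding the tree derived in these slides (the function name is illustrative):

def predict_play_tennis(outlook, humidity, wind):
    """Traverse the final decision tree for a single test instance."""
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    if outlook == "Overcast":
        return "Yes"
    # Outlook == "Rain": this branch splits on Wind
    return "Yes" if wind == "Weak" else "No"

# Outlook = Sunny, Humidity = Normal, Wind = Strong (Temperature is not used by the tree)
print(predict_play_tennis("Sunny", "Normal", "Strong"))   # -> Yes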
Overfitting of Decision Tree
• Decision trees can easily overfit: if we have too few examples near the
bottom of the tree, or the tree is too deep, we can end up overfitting to
our training data.
• In such situations we can use stopping criteria such as a maximum depth
(see the sketch after these bullets).
• We can also use pruning, where we remove a subtree and use majority
voting at that node instead.
• Intuitively, decision trees are easier to understand than many other
classification algorithms.
• That is why they are widely used in industry.
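
A hedged scikit-learn sketch of these controls (the parameter values are illustrative, not tuned):

from sklearn.tree import DecisionTreeClassifier

# Stopping criteria: cap the depth and require a minimum number of samples per leaf.
shallow_tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=2, random_state=42)

# Cost-complexity (post-)pruning: larger ccp_alpha prunes subtrees more aggressively.
pruned_tree = DecisionTreeClassifier(ccp_alpha=0.01, random_state=42)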
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the dataset
data = {
    "Outlook": ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
    "Temperature": ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool", "Mild",
                    "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
    "Humidity": ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                 "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Wind": ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong", "Weak",
             "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
    "Play Tennis": ["No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No", "Yes",
                    "Yes", "Yes", "Yes", "Yes", "No"],
}
df = pd.DataFrame(data)

# Encode categorical variables (one-hot encode the input features)
df_encoded = pd.get_dummies(df, columns=["Outlook", "Temperature", "Humidity", "Wind"],
                            drop_first=True)
X = df_encoded.drop("Play Tennis", axis=1)
y = df["Play Tennis"]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Decision Tree Classifier
clf = DecisionTreeClassifier(criterion="gini", random_state=42)
clf.fit(X_train, y_train)

# Make predictions on the test set
y_pred = clf.predict(X_test)

# Compute accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy of the Decision Tree: {:.2f}%".format(accuracy * 100))
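
As a follow-up, the fitted tree can be inspected visually (a sketch that reuses clf and X from the code above; matplotlib is assumed to be installed):

import matplotlib.pyplot as plt
from sklearn.tree import plot_tree

# Draw the fitted tree with feature names and class labels at each node.
plt.figure(figsize=(10, 6))
plot_tree(clf, feature_names=list(X.columns), class_names=list(clf.classes_), filled=True)
plt.show()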
