Complete ID3 Decision Tree

Decision Trees are supervised learning algorithms used for classification and regression, represented in a tree-like structure with nodes and branches. The ID3 Algorithm, developed by Ross Quinlan, builds decision trees using entropy and information gain to classify data effectively. While Decision Trees are easy to interpret and handle various data types, they can be prone to overfitting and may require pruning for optimal performance.


Decision Tree and ID3 Algorithm in Machine Learning
Understanding Decision Trees and How the ID3 Algorithm Works
Introduction to Decision Trees
• A supervised learning algorithm used for classification and regression
• Represents decisions in a tree-like structure of nodes and branches
• Each internal node tests a feature; branches represent the possible outcomes of that test
• Uses a recursive, top-down approach to classify data
Why Use Decision Trees?
• Easy to interpret and visualize
• Handles both categorical and numerical data
• Requires little data preprocessing
• Works well on small to medium-sized datasets
• Can handle missing values and irrelevant features
Components of a Decision Tree
• **Root Node:** Represents the entire dataset
• **Decision Nodes:** Intermediate nodes that split based on attributes
• **Leaf Nodes:** Terminal nodes representing final classifications
• **Branches:** Paths connecting nodes based on attribute values
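These components map naturally onto a small data structure. Here is a minimal Python sketch (the class and field names are illustrative, not from the slides): a node is either a leaf holding a class label, or a decision node holding the attribute it tests plus one child branch per attribute value.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    """One node of a decision tree (illustrative representation)."""
    attribute: Optional[str] = None   # attribute tested at a decision node
    label: Optional[str] = None       # class label stored at a leaf node
    branches: dict = field(default_factory=dict)  # attribute value -> child Node

    def is_leaf(self) -> bool:
        return self.label is not None
```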
ID3 Algorithm - Introduction
• **Iterative Dichotomiser 3 (ID3)** is a decision tree algorithm by Ross Quinlan
• Used for classification tasks
• Builds a tree by selecting attributes using **Information Gain**
ID3 Algorithm Steps
1. Calculate **Entropy** for the dataset
2. Compute **Information Gain** for each attribute
3. Choose the attribute with the **highest Information Gain** as the root
4. Recursively apply the process to split the dataset
5. Stop when all data is classified or no further gain is possible
Entropy - Measuring Uncertainty
• Entropy (H) measures impurity or randomness in data
• Formula: H(S) = - Σ p_i log₂(p_i), where p_i is the proportion of examples in S belonging to class i
• **Low Entropy:** Data is pure (one class dominates)
• **High Entropy:** Data is mixed (multiple classes present)
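As a quick illustration, this formula is a few lines of Python (the function name and example values below are mine, not from the slides):

```python
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum(p_i * log2(p_i)), with p_i the proportion of class i in S."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

print(entropy(["Yes", "No", "Yes", "No"]))          # 1.0  (maximally mixed)
print(round(entropy(["Yes"] * 9 + ["No"] * 5), 3))  # 0.94 (one class dominates)
```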
Information Gain (IG)
• Measures how much an attribute reduces entropy
• Formula: IG(S, A) = H(S) - Σ_v (|S_v| / |S|) * H(S_v), where S_v is the subset of S in which attribute A takes value v
• The attribute with the **highest Information Gain** is selected for splitting
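Continuing the sketch, information gain can be computed by partitioning the labels on an attribute's values and reusing the entropy() helper above. Representing each example as a tuple of categorical attribute values is my assumption, not something the slides specify:

```python
def information_gain(rows, labels, attr_index):
    """IG(S, A) = H(S) - sum_v (|S_v| / |S|) * H(S_v)."""
    total = len(labels)
    # Group the class labels by the value this attribute takes in each row.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    weighted = sum(len(subset) / total * entropy(subset)
                   for subset in partitions.values())
    return entropy(labels) - weighted
```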
Example Problem: Employee Promotion
• Given attributes: **Experience, Education, Performance**
• Goal: Predict if an employee gets promoted (Yes/No)
• Apply the **ID3 algorithm** to construct a decision tree
Step-by-Step Calculation
1. Compute **Entropy** for the dataset
2. Calculate **Information Gain** for Experience, Education, and Performance
3. Choose the best attribute as the root node
4. Recursively split based on the best attributes (a runnable sketch of these steps follows below)
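Putting the steps together, here is a minimal recursive ID3 sketch that reuses the entropy() and information_gain() helpers from the previous slides (those definitions and the Counter import are assumed to be in scope). The employee rows are invented purely for illustration; the slides do not include the actual dataset. For simplicity, internal nodes are plain dicts keyed by attribute value and leaves are class labels.

```python
def id3(rows, labels, attributes):
    """Build an ID3 tree; `attributes` maps attribute names to column indexes."""
    if len(set(labels)) == 1:           # all examples agree: make a leaf
        return labels[0]
    if not attributes:                  # no attributes left: majority-vote leaf
        return Counter(labels).most_common(1)[0][0]
    # Steps 2-3: pick the attribute with the highest information gain.
    best = max(attributes,
               key=lambda a: information_gain(rows, labels, attributes[a]))
    idx = attributes[best]
    remaining = {a: i for a, i in attributes.items() if a != best}
    tree = {best: {}}
    # Step 4: recursively split on each observed value of the chosen attribute.
    for value in {row[idx] for row in rows}:
        sub = [(r, l) for r, l in zip(rows, labels) if r[idx] == value]
        tree[best][value] = id3([r for r, _ in sub], [l for _, l in sub], remaining)
    return tree

# Hypothetical data: (Experience, Education, Performance) -> Promoted?
rows = [
    ("High", "Masters",   "Good"), ("High", "Bachelors", "Good"),
    ("Low",  "Masters",   "Poor"), ("Low",  "Bachelors", "Good"),
    ("High", "Masters",   "Poor"), ("Low",  "Bachelors", "Poor"),
]
labels = ["Yes", "Yes", "No", "No", "Yes", "No"]
print(id3(rows, labels, {"Experience": 0, "Education": 1, "Performance": 2}))
# -> {'Experience': {'High': 'Yes', 'Low': 'No'}}
```

On this toy data, Experience alone has the highest information gain and perfectly separates the classes, so the tree stops after a single split.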
Final Decision Tree
Advantages and Limitations of ID3
**Advantages:**
• Simple and easy to understand
• Works well with categorical data
• Produces compact decision trees

**Limitations:**
• Prone to **overfitting** if not pruned
• **Biased towards attributes** with many values
Applications of Decision Trees
• **Medical diagnosis** (predicting diseases)
• **Fraud detection** (credit card fraud)
• **Customer segmentation** (marketing analytics)
• **Spam filtering** (email classification)
• **Risk assessment** (loan approval)
Conclusion
• **Decision Trees** are effective for classification problems
• The **ID3 Algorithm** builds trees using entropy and information gain
• Effective for small datasets, but pruning is needed to avoid overfitting
• Variants like **C4.5 and CART** improve upon ID3
References
• Quinlan, J. R. (1986). "Induction of Decision Trees." Machine Learning, 1(1), 81-106.
• Machine learning textbooks and research papers
• Online courses and tutorials on Decision Trees and ID3
