
Lecture 03 – Classification
Recall – Classification
• Definition: Classification is a supervised learning task where
the goal is to predict the category or label of a given input
based on learned patterns from a dataset
• Spam Detection (Spam vs. Not Spam)
• Fault Detection (Fault vs. Not Fault)

• Output: The model assigns a class label to the input (e.g., class "A" or "B").
Classification algorithms
• Some popular classification algorithms include:
• Decision Trees
• Logistic Regression
• Support Vector Machines (SVM)
• k-Nearest Neighbors (k-NN)
• Neural Networks
• In this lecture we will focus on Decision Trees
Decision Tree
• A decision tree is a supervised machine
learning algorithm used for classification.
• It models decisions and their possible
consequences in a tree-like structure.
• A decision tree is a model composed of a
collection of "questions" organized
hierarchically in the shape of a tree

https://developers.google.com/machine-learning/decision-forests/decision-trees
Decision Tree
• Root Node: The topmost node
representing the entire dataset
• Internal Nodes: Nodes that
perform tests on features
• Branches: Edges connecting
nodes, representing the outcome
of a test
• Leaf Nodes: Terminal nodes that
provide the final decision or
prediction
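As a concrete illustration of these parts, here is a minimal Python sketch (not from the slides) of how a binary decision-tree node could be represented; the field names (prediction, feature, threshold, left, right) are illustrative assumptions.

    from dataclasses import dataclass
    from typing import Any, Optional

    @dataclass
    class Node:
        # Leaf node: holds the final class label and has no children
        prediction: Optional[Any] = None
        # Internal node: the feature index and threshold it tests
        feature: Optional[int] = None
        threshold: Optional[float] = None
        # Branches: left = test is true (<= threshold), right = test is false
        left: Optional["Node"] = None
        right: Optional["Node"] = None

        def is_leaf(self) -> bool:
            return self.prediction is not None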
Types of Decision Trees
• In practice, decision trees are generally binary trees (each internal node has exactly two branches), and we will focus on those in this lecture
Building a Decision Tree
• We split the data based on questions like:
• "Is the pet a cat or dog?"
• "Is the temperature high or low?"
• Goal: We want each split to make the groups as pure as
possible (like grouping similar items together)

What criterion should we use to split the data?


Understanding Entropy
1. What is Entropy in Thermodynamics?
• In physics, entropy measures disorder in a system (how mixed or random
things are)
• For example, a room with scattered objects has high entropy (high
disorder)
2. Entropy in Decision Trees:
• Entropy helps us measure how mixed or uncertain our data is.
• Lower entropy means the data is more organized (purer), just like a clean room has
lower disorder
Goal: We want to reduce entropy at each step (make the groups less
mixed)
Information Gain (How We Use Entropy)
1. Information Gain: When we split the data, we calculate how
much entropy (disorder) we reduced. This is called
information gain (IG).
2. Why Use Entropy?
• By choosing the splits that reduce entropy the most, we’re making
the data more organized and easier to classify.

Final Goal: We keep splitting the data until each group is as pure as possible, resulting in a well-organized tree.
Entropy Calculation
• If all data points in a group belong to the same category (very organized), the entropy is low (close to 0).
• If the data points are evenly split between two categories (very mixed), the entropy is high (close to 1).
• For a two-class group with proportions p and 1 − p, the entropy is H = −(p · log2(p) + (1 − p) · log2(1 − p)).
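A minimal Python sketch of this calculation (the helper name entropy and the example counts are just for illustration):

    import math

    def entropy(counts):
        """Entropy (in bits) of a group, given its class counts, e.g. [6, 4]."""
        total = sum(counts)
        h = 0.0
        for c in counts:
            if c > 0:
                p = c / total          # proportion of this class in the group
                h -= p * math.log2(p)
        return h

    print(entropy([6, 4]))    # ~0.971: mixed group, high entropy
    print(entropy([10, 0]))   # 0.0: pure group, low entropy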
Information Gain
• Information Gain tells us how much entropy is reduced after
splitting the data
• The goal is to make the data more organized after each split

Information Gain (IG) = Entropy (before split) − Entropy (after split)

• This is a measure of “goodness” of our split.


• We will use this metric to build our decision tree.
Information Gain Example
• Initial Group:
• We have 10 animals: 6 dogs and 4 cats.
• The initial entropy (before split):

H = −(0.6 · log2(0.6) + 0.4 · log2(0.4)) ≈ 0.971


• Split by tail length:
• Short Tail Group: 4 dogs, 1 cat
• Long Tail Group: 2 dogs, 3 cats

Let's see if we gain any information by splitting on the tail length feature.
Information Gain Example
• Short tail group (4 dogs, 1 cat): H = −(0.8 · log2(0.8) + 0.2 · log2(0.2)) ≈ 0.722
• Long tail group (2 dogs, 3 cats): H = −(0.4 · log2(0.4) + 0.6 · log2(0.6)) ≈ 0.971
Weighted Average Entropy (after split):

H(after split) = (5/10) × 0.722 + (5/10) × 0.971 ≈ 0.846

Information gain: 0.971 – 0.846 = 0.125 (so a useful split!)
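Reusing the entropy helper from the earlier sketch, the example's numbers can be reproduced like this (the counts come straight from the example):

    h_before = entropy([6, 4])                     # 6 dogs, 4 cats -> ~0.971
    h_short  = entropy([4, 1])                     # short-tail group -> ~0.722
    h_long   = entropy([2, 3])                     # long-tail group -> ~0.971
    h_after  = (5/10) * h_short + (5/10) * h_long  # weighted average -> ~0.846
    info_gain = h_before - h_after                 # ~0.125
    print(round(info_gain, 3))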


Decision Tree Algorithm (pseudo code)
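The pseudocode figure from the original slide is not reproduced here. Below is a minimal Python sketch of the standard greedy algorithm it refers to (pick the split with the highest information gain, recurse until a group is pure or a depth limit is reached). It reuses the Node and entropy helpers sketched earlier, and the names best_split, build_tree and predict are illustrative, not from the slides.

    from collections import Counter

    def best_split(X, y):
        """Find the (feature, threshold) pair with the highest information gain."""
        best_feature, best_threshold, best_gain = None, None, 0.0
        h_parent = entropy(list(Counter(y).values()))
        for f in range(len(X[0])):                   # try every feature...
            for t in sorted({row[f] for row in X}):  # ...and every observed value as a threshold
                left  = [label for row, label in zip(X, y) if row[f] <= t]
                right = [label for row, label in zip(X, y) if row[f] > t]
                if not left or not right:
                    continue
                h_after = ((len(left) / len(y)) * entropy(list(Counter(left).values()))
                           + (len(right) / len(y)) * entropy(list(Counter(right).values())))
                gain = h_parent - h_after            # information gain of this split
                if gain > best_gain:
                    best_feature, best_threshold, best_gain = f, t, gain
        return best_feature, best_threshold

    def build_tree(X, y, depth=0, max_depth=3):
        """Grow the tree greedily until a group is pure or max_depth is reached."""
        if len(set(y)) == 1 or depth == max_depth:
            return Node(prediction=Counter(y).most_common(1)[0][0])  # leaf: majority class
        feature, threshold = best_split(X, y)
        if feature is None:                                          # no split reduces entropy
            return Node(prediction=Counter(y).most_common(1)[0][0])
        left  = [(row, label) for row, label in zip(X, y) if row[feature] <= threshold]
        right = [(row, label) for row, label in zip(X, y) if row[feature] > threshold]
        return Node(feature=feature, threshold=threshold,
                    left=build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
                    right=build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth))

    def predict(node, row):
        """Walk from the root to a leaf to classify a single input row."""
        while not node.is_leaf():
            node = node.left if row[node.feature] <= node.threshold else node.right
        return node.prediction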
Types of Data
• Categorical data
• Data that represents distinct categories or labels
• Material Type: "Steel", "Aluminum", "Plastic"
• Machine Status: "Running", "Stopped", "Maintenance"
• How to Handle in Decision Trees
• Split data based on categories:
• Example:
• "Is the material type Steel?" or "Is the machine in Maintenance?"
Types of Data
• Numerical Data
• Data that represents numerical values, which can be continuous or discrete
• Temperature: 50°C, 100°C, 150°C
• Pressure: 5 bar, 10 bar, 20 bar

• How to Handle in Decision Trees


• Splitting Strategy
• Use comparison operators (e.g., <= or >) to divide the data into two groups (e.g., based on the
mean, median, etc.)
• Example: "Is the pressure less than or equal to 10 bar?"
• Binning Strategy
• Group values into ranges if appropriate
• Example: "Low Pressure (0-10 bar)", "Medium Pressure (11-20 bar)", "High Pressure (21-30
bar)".
Overfitting

[Figure: two example decision trees, Tree 1 and Tree 2, with splits on Weight (< 1350 vs. > 1350) and on Color]
Overfitting
• Overfitting occurs when a model learns the details and noise in
the training data to the extent that it negatively impacts
performance on new, unseen data.
• The model becomes too complex, capturing even irrelevant patterns,
which reduces its ability to generalize to unseen data.
Depth of the tree
• Lower depth (e.g., 2 to 7):
• Easier to interpret
• Ideal for scenarios where model transparency is essential
• Larger depth (e.g., > 7):
• Can capture complex relationships
• More prone to overfitting and does not generalize well
• May learn noise in the data
Train-Test Split
• To prevent overfitting, we split the dataset into two parts:
• Training set: Used to train the model (e.g., 70% of the data)
• Test set: Used to evaluate the model's performance on unseen data
(e.g., 30% of the data)
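A minimal sketch of this workflow with scikit-learn, using a synthetic dataset purely for illustration (the 70/30 split and the max_depth value are illustrative choices, not prescribed by the slides):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    # Synthetic stand-in for a real dataset
    X, y = make_classification(n_samples=200, n_features=5, random_state=0)

    # 70% of the data for training, 30% held out for testing
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Limiting the depth helps keep the tree from fitting noise (see the next slide)
    model = DecisionTreeClassifier(max_depth=4, random_state=42)
    model.fit(X_train, y_train)

    print("train accuracy:", accuracy_score(y_train, model.predict(X_train)))
    print("test accuracy: ", accuracy_score(y_test, model.predict(X_test)))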
Overfitting in Decision Trees
• The tree is grown too deep, creating many branches that
capture noise and irrelevant patterns in the training data
• Solution: Limit depth of the tree and rely on testing accuracy

Source: https://machinelearningmastery.com/overfitting-machine-learning-models/
Pros and cons
Pros
• Easy to Understand and Interpret
• Handles Both Numerical and Categorical Data
• Works well for small datasets
Cons
• Prone to overfitting
• Computationally expensive for large trees
