Classification
Deduction (Testing): Apply the model to unseen data to predict the class labels,
then assess the model's performance and use the results to improve the training process.
What if all attribute values are identical, but the class labels differ?
a. When: Noise or inconsistencies in the data.
b. Solution: Declare it a leaf node and assign it the most common class label among
the training instances associated with this node.
How to determine the best attribute test?
a. Attribute Test Conditions:
i. Binary Attributes→Binary Split
ii. Nominal Attributes→{Multiway Split, Binary Split: by grouping attribute
values}
iii. Ordinal Attributes→Binary or multiway splits
iv. Grouping should not violate the order.
v. Continuous Attributes→{Multiway Split: discretization into non-overlapping
intervals, Binary Split: determine a threshold}.
a. Entropy(t) = -Σ_i p(i|t) log2 p(i|t)
b. Gini(t) = 1 - Σ_i p(i|t)^2
c. Classification Error: ME(t) = 1 - max_i p(i|t)
d. Gini is faster than entropy (it doesn’t compute log) and it often produces
simpler trees.
e. Ranges (for binary classification): Entropy→[0,1], Gini→[0,0.5], ME→[0,0.5].
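A minimal Python sketch of these three impurity measures, computed from a node's class counts (the function names are ours, not from the notes):

```python
import numpy as np

def entropy(counts):
    """Entropy(t) = -sum(p_i * log2(p_i)); max 1.0 for a balanced binary node."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()          # drop empty classes to avoid log(0)
    return -np.sum(p * np.log2(p))

def gini(counts):
    """Gini(t) = 1 - sum(p_i^2); max 0.5 for a balanced binary node."""
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    return 1.0 - np.sum(p ** 2)

def misclassification_error(counts):
    """ME(t) = 1 - max(p_i); max 0.5 for a balanced binary node."""
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    return 1.0 - p.max()

print(entropy([5, 5]), gini([5, 5]), misclassification_error([5, 5]))
# -> 1.0 0.5 0.5 (a maximally impure binary node hits each measure's upper bound)
```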
Gain Ratio:
a. Used to select the optimal attribute for splitting data, like Information Gain (IG).
b. Addresses a limitation of IG by reducing its bias toward attributes with many
values (we use it when an attribute has a very large number of distinct values, such as an ID).
c. Gain Ratio = Information Gain / Split Info, where Split Info = -Σ_i (n_i/n) log2(n_i/n) over the k children of the split.
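A small sketch of this penalty at work, assuming the information gain has already been computed; it shows how an ID-like split with many children gets discounted:

```python
import numpy as np

def split_info(child_sizes):
    """Split Info = -sum((n_i/n) * log2(n_i/n)) over the k child nodes."""
    w = np.asarray(child_sizes, dtype=float)
    w = w / w.sum()
    return -np.sum(w * np.log2(w))

def gain_ratio(info_gain, child_sizes):
    """Gain Ratio = Information Gain / Split Info."""
    return info_gain / split_info(child_sizes)

# An ID-like attribute that splits 8 records into 8 singleton children has
# a large Split Info (log2(8) = 3), which shrinks its gain ratio:
print(gain_ratio(1.0, [1] * 8))   # 0.333...
print(gain_ratio(1.0, [4, 4]))    # 1.0 for a balanced binary split
```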
b. Expressiveness:
i. Universal Representation: Can encode any function of discrete-valued
attributes.
ii. Efficient Encoding: A discrete-valued function can be represented as an
assignment table, and decision trees can often represent it efficiently. A DT
can group combinations of attribute values under shared leaf nodes (compact
representation). But not all decision trees for discrete-valued attributes
can be simplified (e.g., the parity function).
iii. Rectilinear Splits:
1. The test conditions described so far involve using only a single attribute
at a time. As a consequence, the tree-growing procedure can be viewed as the
process of partitioning the attribute space into disjoint regions until each
region contains records of the same class. The border between two neighboring
regions of different classes is known as a decision boundary.
2. Since the test condition involves only a single attribute, the decision
boundaries are rectilinear; i.e., parallel to the coordinate axes.
3. Effective in handling both categorical and continuous variables.
4. Disadvantages of Rectilinear Splits:
1. Struggle with Non-linear Boundaries
2. Limited Flexibility: Restricts decision boundaries to orthogonal lines,
limiting flexibility.
3. Oversimplification Risks: Can lead to oversimplified models that fail
to capture the true nature of the data.
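To see the rectilinear behavior concretely, here is a brief sketch (the dataset and tree depth are illustrative assumptions, not from the notes) that prints the axis-parallel tests of a fitted tree:

```python
# Every test in the fitted tree thresholds a single axis, so the induced
# decision boundaries are axis-parallel (rectilinear).
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["x0", "x1"]))
# Each printed test has the form "x0 <= 0.42" or "x1 <= -0.17": the
# non-linear moon boundary is approximated by axis-parallel rectangles.
```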
Model Evaluation:
a. After training, we estimate the performance on new, unseen data.
1. Defining Evaluation Metrics:
a. Classification Metrics: Confusion matrix, Accuracy, Precision, F1 score ...
2. Choosing a Data Splitting Strategy:
a. Holdout: A single division of the data, reserving a portion for testing.
b. Cross-Validation: Repeated splits for a robust performance estimate.
c. Stratified Sampling: Ensures class balance in each split, especially for
imbalanced data.
Confusion Matrix:
a. Compares the predicted labels against the true labels.
b. In BINARY CLASSIFICATION.
c. Entries: TP (true positives), FN (false negatives), FP (false positives), TN (true negatives).
d. Accuracy = (TP + TN) / (TP + TN + FP + FN).
e. Precision = TP / (TP + FP), Recall = TP / (TP + FN), F1 = 2 · Precision · Recall / (Precision + Recall).
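A compact sketch computing these counts and metrics for binary labels (the helper name is ours):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and the derived metrics for 0/1 labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy  = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

print(binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
# -> TP=2, TN=2, FP=1, FN=1; accuracy 0.667, precision/recall/F1 0.667
```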
c. Factors:
i. Limited Training Size:
1. A finite number of instances can provide only a limited representation of
the overall data, so the patterns learned from a training set may not fully
represent the true patterns in the overall data.
2. Increasing the size of the training set → better pattern learning → closer
resemblance to the true patterns in the overall data.
Model Selection:
a. There are many possible classification models with different levels of
complexity; we want to select the model with the lowest generalization error rate.
b. The training error rate cannot be reliably used as the sole criterion for model
selection.
c. Generic approaches:
i. Using a Validation Set:
1. The idea is to use out-of-sample estimates by evaluating the model on a
separate validation set that is not used for training the model.
2. The validation error rate (the error rate on the validation set) is a
better indicator of generalization performance than the training error
rate, since the validation data is unseen during training.
3. The process is the following (a sketch follows this list):
1. Partition D.train into D.tr and D.val.
2. For any model m trained on D.tr, estimate its validation error rate
error.val(m) on D.val.
3. Select the model with the lowest value of error.val(m).
4. Drawbacks: sensitivity to the sizes of D.tr and D.val.
1. If D.tr is small, the learned model will be less representative.
2. If D.val is small, the validation error rate might not be reliable
for selecting models.
Model Evaluation:
a. The estimates of generalization performance used to guide the selection of
the classification model are biased indicators of the performance on unseen
instances.
b. We need to evaluate the performance on unseen data D.test by computing the
error.test rate.
c. Data partitioning :
i. Holdout Method:
1. D.train and D.test.
2. Choosing the right fraction for training data is not trivial.
3. Small size of D.train → bad pattern learning/bad generalization.
4. Small size of D.test → Error.test less reliable.
5. Moreover, error.test can have a high variance as we change the random
partitioning of D into D.train and D.test.
6. Random subsampling = repeated holdout ⇾ repeat the random split to obtain a
distribution of error.test values and understand its variance (see the sketch
at the end of this section).
ii. Cross-Validation:
1. Aims to make effective use of all labeled instances in D for both training
and testing, avoiding the split bias of the holdout method.
2. The k-fold cross-validation method segments the labeled data D of size N
into K equal-sized folds.
3. Each fold is used exactly once for error calculation.(error.test(i))
4. The overall estimate aggregates the fold errors: error.test = (1/k) Σ_i error.test(i) (for equal-sized folds, the total number of errors across all folds divided by N).
iii. Stratified Sampling:
1. Ensures equal representation of classes in each partition (see the sketch below).
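A short sketch contrasting repeated holdout with stratified k-fold cross-validation (the dataset, model, and split settings are illustrative assumptions):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# An imbalanced binary dataset, where stratification matters.
X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)

# Random subsampling = repeated holdout: a distribution of error.test values.
errs = []
for seed in range(10):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=seed, stratify=y)
    m = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    errs.append(1.0 - m.score(X_te, y_te))
print("holdout error: mean %.3f, std %.3f" % (np.mean(errs), np.std(errs)))

# Stratified k-fold CV: each instance is tested exactly once, and each fold
# preserves the class proportions of the full dataset.
fold_errs = []
for tr_idx, te_idx in StratifiedKFold(n_splits=5, shuffle=True,
                                      random_state=0).split(X, y):
    m = DecisionTreeClassifier(random_state=0).fit(X[tr_idx], y[tr_idx])
    fold_errs.append(1.0 - m.score(X[te_idx], y[te_idx]))
print("5-fold CV error:", np.mean(fold_errs))
```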