INT354 - Unit 2

The document provides an overview of supervised learning, differentiating between regression and classification with real-life examples. It discusses how to choose classification algorithms based on dataset size, feature types, interpretability, and other factors. Additionally, it covers specific algorithms like Logistic Regression, Perceptron, and Decision Trees, including their mechanisms and comparisons.


Machine Learning Classifier-I

Overview of Supervised Learning


Regression Vs Classification

Real-life examples of Supervised Learning Techniques

Classification Examples
• Classifying emails as spam or not spam
• Categorizing patients into different disease categories based on symptoms and test results
• Identifying objects in images
• Determining sentiments of text data
• Predicting the grade of a student in ML
• Predicting whether India is going to win the match or not
• Predicting whether a student is going to get a placement or not

Regression Examples
• Estimating the price of a house based on location and area
• Predicting future sales of a product based on historical data
• Forecasting the future price of a stock based on financial indicators
• Predicting temperature, rainfall and other weather conditions
• Predicting the creditworthiness of an individual or company based on financial history
• Predicting product quality based on manufacturing process variables
• Predicting student performance based on attendance, grades and test scores



Choosing a Classification Algorithm
• a) Size of Dataset
• Small dataset (<10,000 samples) → Classical ML models (Logistic Regression, SVM, Decision Tree)
• Large dataset (>10,000 samples) → Deep learning models (CNN, LSTM, Transformer)

• b) Number of Features (Dimensionality)


• Low-dimensional data (<20 features) → Logistic Regression, Naïve Bayes,
Decision Tree
• High-dimensional data (>1000 features) → SVM (with kernel trick), Deep
Learning, Feature Selection needed
Choosing a Classification Algorithm
• c) Type of Features
• Numerical → Logistic Regression, Random Forest, XGBoost, Neural Networks
• Categorical → Decision Trees, Random Forest, XGBoost
• Mixed (Numerical + Categorical) → CatBoost, XGBoost, Decision Trees

• d) Data Imbalance
• Balanced → Any model works well
• Imbalanced → Consider:
• Resampling (Oversampling, SMOTE, Undersampling)
• Weighted loss function (XGBoost, CatBoost, Deep Learning)
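
The two remedies listed above can be tried in a few lines; the sketch below is a minimal illustration that assumes scikit-learn and the imbalanced-learn package are installed, with a synthetic dataset standing in for real data.

```python
# A minimal sketch of the two imbalance-handling options listed above,
# assuming scikit-learn and the imbalanced-learn package are installed.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from imblearn.over_sampling import SMOTE

# Synthetic imbalanced dataset: roughly 95% negatives, 5% positives.
X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)

# Option 1: weighted loss -- penalize mistakes on the minority class more.
weighted_model = LogisticRegression(class_weight="balanced", max_iter=1000)
weighted_model.fit(X, y)

# Option 2: resampling -- synthesize minority samples with SMOTE, then train.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
resampled_model = LogisticRegression(max_iter=1000)
resampled_model.fit(X_res, y_res)
```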
Choosing a Classification Algorithm
• a) Interpretability
• High Interpretability Required → Logistic Regression, Decision Trees, Naïve Bayes
• Low Interpretability (but High Accuracy) → Random Forest, XGBoost, Deep
Learning

• b) Computational Complexity
• Low Complexity (Fast Training & Inference) → Logistic Regression, Naïve Bayes,
SVM (linear)
• Medium Complexity → Decision Trees, Random Forest, XGBoost
• High Complexity (Slow Training, High GPU Requirements) → Deep Learning (CNN,
LSTM, Transformers)
Choosing a Classification Algorithm
• a) Accuracy vs Speed Trade-off
• High Accuracy Required → XGBoost, Random Forest, Deep Learning
• Fast Computation Needed → Logistic Regression, Naïve Bayes

• b) Sensitivity to Noise
• Robust to Noise → Random Forest, XGBoost, Deep Learning
• Sensitive to Noise → SVM, Decision Trees (prone to overfitting)

• c) Overfitting Risk
• Low Risk → Random Forest, XGBoost (with regularization), Ridge/Lasso Regression
• High Risk → Decision Trees (without pruning), Deep Learning (without dropout)
Choosing a Classification Algorithm
• a) Sequential or Time-Series Data
• Yes → LSTM, GRU, Transformer, 1D-CNN
• No → Use traditional ML models

• b) Multiclass vs Binary
• Binary Classification → Logistic Regression, SVM, Decision Trees, Deep Learning
• Multiclass (≥3 classes) → Random Forest, XGBoost, Neural Networks (softmax activation)

• c) Streaming / Real-time Processing


• Yes → Online Learning Models (Incremental Learning with SGD, Adaptive Boosting, Streaming
Decision Trees)
• No → Batch Training Methods (Deep Learning, XGBoost)
Choosing a Classification Algorithm
Use Case → Recommended Algorithm
Text Classification → Naïve Bayes, Transformer (BERT), LSTM
Image Classification → CNN, ResNet, EfficientNet, Vision Transformer
Tabular Data Classification → XGBoost, Random Forest, Logistic Regression
Anomaly Detection → Isolation Forest, Autoencoders, One-Class SVM
Medical Diagnosis → XGBoost, Random Forest, CNN for Images
Fraud Detection → Random Forest, XGBoost, Autoencoders

No Free Lunch Theorem
• No single algorithm is best for all problems.
• Performance depends on the problem context.
• Data often matters more than the algorithm.
• Trade-offs exist: Accuracy vs. Efficiency.
• Evaluate models using cross-validation methods.
• Experiment with hyperparameters for optimization.
• Choose algorithm based on specific requirements.
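
As a concrete way to follow the last three bullets, the sketch below compares a few candidate classifiers with 5-fold cross-validation instead of trusting any single default choice; it assumes scikit-learn and uses a synthetic dataset purely for illustration.

```python
# Minimal sketch: compare candidate classifiers by cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# No single model is best for every problem, so estimate each one's
# accuracy with 5-fold cross-validation and compare.
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} ± {scores.std():.3f}")
```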
Logistic Regression
• Logistic regression predicts binary class probabilities.
• Uses sigmoid function to map outputs.
• Optimized using cross-entropy loss function.
• Works well for linearly separable data.
• Extension: Multinomial logistic for multiple classes.
• Widely used for classification problems.
Logistic Regression
Logistic Regression Algorithm
Step 1: Compute Linear Combination of Features
Logistic Regression starts with a weighted sum of input features:
Z=w1X1+w2X2+...+wnXn+b

Step 2: Apply the Sigmoid Activation Function


Instead of predicting any real number, we transform Z using the sigmoid
function:
σ(Z) = 1 / (1 + e^(−Z))
where:
• σ(Z) squashes the output between 0 and 1, making it a probability.
• If σ(Z) >0.5, classify as 1 (Positive Class).
• If σ(Z) ≤0.5, classify as 0 (Negative Class).
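
A minimal NumPy sketch of Steps 1 and 2, with made-up weights and a single made-up sample:

```python
# Steps 1-2 of logistic regression: linear combination, then sigmoid.
import numpy as np

def sigmoid(z):
    # Squashes any real number into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.8, -0.4])        # hypothetical weights w1, w2
b = 0.1                          # hypothetical bias
x = np.array([1.5, 2.0])         # one sample with features X1, X2

z = np.dot(w, x) + b             # Step 1: linear combination
prob = sigmoid(z)                # Step 2: probability of the positive class
label = 1 if prob > 0.5 else 0   # classify using the 0.5 threshold
print(prob, label)
```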
Logistic Regression Algorithm
Step 3: Define the Cost Function (Log Loss)
Instead of Mean Squared Error (MSE), we use Log Loss (Binary Cross-Entropy):
J(w) = −(1/m) Σ [ yi log(ŷi) + (1 − yi) log(1 − ŷi) ], summed over the m training samples (i = 1 … m)
• Minimizing Log Loss ensures that the model maximizes the probability of
correct predictions.

Step 4: Optimize Weights Using Gradient Descent


Logistic Regression updates weights using Gradient Descent:
wj = wj − α (∂J/∂wj)
• Gradient Descent ensures the cost function is minimized.
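
Putting Steps 1–4 together, here is a from-scratch sketch in NumPy; the synthetic data, learning rate, and epoch count are illustrative choices rather than part of the original slides.

```python
# Steps 3-4: log loss plus gradient-descent weight updates.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                  # 100 samples, 2 features
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # linearly separable labels

w = np.zeros(2)
b = 0.0
alpha = 0.1                                    # learning rate

for epoch in range(200):
    y_hat = sigmoid(X @ w + b)                 # predicted probabilities
    y_hat = np.clip(y_hat, 1e-12, 1 - 1e-12)   # guard against log(0)
    loss = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    # Gradients of the log loss with respect to w and b.
    grad_w = X.T @ (y_hat - y) / len(y)
    grad_b = np.mean(y_hat - y)
    w -= alpha * grad_w                        # Step 4: gradient descent update
    b -= alpha * grad_b

print("final loss:", round(loss, 4))
```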
Perceptron
Perceptron Algorithm
• Step 1: Initialize Weights & Bias
§ Set all weights (wi) and bias (b) to small random values or
zeros.
• Step 2: Compute the Output
• For each training sample (X,y):
§ Compute the weighted sum: Z=∑wiXi+b
§ Apply the step activation function:
§ ypred = 1 if Z ≥ 0, else 0
Perceptron Algorithm
• Step 3: Update Weights Using the Perceptron Rule
• If the predicted output ypred matches the actual label y, do nothing.
• If incorrect, update the weights using the rule:
• wi=wi+η (y−ypred)Xi
• b = b + η (y - ypred)

• Step 4: Repeat Until Convergence


• Iterate over the dataset multiple times (epochs) until:
oAll samples are classified correctly.
oA stopping criterion (max iterations or minimal error) is met.
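
A from-scratch sketch of the full perceptron loop above, assuming NumPy; the AND-gate dataset, learning rate, and epoch limit are illustrative choices.

```python
# Perceptron training loop following Steps 1-4 above.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # inputs
y = np.array([0, 0, 0, 1])                       # AND-gate labels

w = np.zeros(2)      # Step 1: initialize weights and bias
b = 0.0
eta = 0.1            # learning rate

for epoch in range(20):                          # Step 4: repeat over epochs
    errors = 0
    for xi, target in zip(X, y):
        z = np.dot(w, xi) + b                    # Step 2: weighted sum
        y_pred = 1 if z >= 0 else 0              # step activation
        if y_pred != target:                     # Step 3: update on mistakes
            w += eta * (target - y_pred) * xi
            b += eta * (target - y_pred)
            errors += 1
    if errors == 0:                              # stop once all samples are correct
        break

print(w, b)
```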
Decision Tree Classifier
• Models decisions as tree-like structures.
• Splits nodes using feature thresholds recursively.
• Handles both categorical and numerical data.
• Overfitting mitigated using pruning techniques.
• Easy to interpret, but sometimes generalizes poorly.
• Algorithms include ID3, CART, and C4.5.
Decision Tree Classifier
• A Decision Tree might split the data as follows:
• First Split on Credit Score:
• If Credit Score > 700 → No Default
• If Credit Score ≤ 700 → Check Age.
• Second Split on Age:
• If Age ≤ 30 → Default
• If Age > 30 → No Default

• This hierarchical decision-making forms a tree-like structure.
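
A small scikit-learn sketch in the spirit of this example; the credit/age samples are made up, so the thresholds the tree learns only roughly match the 700 and 30 cut-offs on the slide.

```python
# Fit a shallow decision tree to a tiny, made-up default dataset.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical samples: [credit_score, age]; label 1 = Default, 0 = No Default.
X = [[690, 25], [660, 28], [640, 22], [655, 24],   # low credit, young -> default
     [680, 45],                                    # low credit, older -> no default
     [750, 25], [760, 28], [720, 23],              # high credit, young -> no default
     [710, 60], [730, 50]]                         # high credit, older -> no default
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Print the learned splits; with this data they resemble the slide's
# "Credit Score first, then Age" hierarchy.
print(export_text(tree, feature_names=["credit_score", "age"]))
```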


Decision Tree Classifier
• Root Node: The first decision point (e.g., Credit Score).
• Internal Nodes: Decision points based on feature splits.
• Leaves (Terminal Nodes): Final classification or prediction outcome.

• The algorithm determines the best feature to split on using:


• Gini Impurity → Used in CART (Classification and Regression Trees) Algorithm,
measures how “impure” a node is.
• Entropy (Information Gain) → Used in ID3 Algorithm, calculates reduction in
uncertainty.

• Pruning: Removes unnecessary branches to avoid overfitting.


• Pre-Pruning (Stopping early): Limits tree depth.
• Post-Pruning (Trimming later): Removes nodes after training.
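
For reference, both impurity measures are easy to compute directly; the sketch below assumes NumPy and uses arbitrary example class counts.

```python
# Gini impurity (CART) and entropy (ID3) from class counts at a node.
import numpy as np

def gini(counts):
    # Gini impurity: 1 minus the sum of squared class proportions.
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    return 1.0 - np.sum(p ** 2)

def entropy(counts):
    # Entropy in bits, used to compute information gain.
    p = np.asarray(counts, dtype=float) / np.sum(counts)
    p = p[p > 0]                       # ignore empty classes to avoid log(0)
    return -np.sum(p * np.log2(p))

print(gini([5, 5]), entropy([5, 5]))     # maximally impure node: 0.5, 1.0
print(gini([10, 0]), entropy([10, 0]))   # pure node: 0.0, 0.0
```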
ID3 Algorithm
• ID3 builds trees using information gain metric.
• Splits data based on entropy reduction.
• Suitable for categorical feature data splitting.
• Prone to overfitting with noisy data.
• Iteratively chooses best attribute for split.
• Simpler than CART and C4.5 algorithms.
ID3 Algorithm (Flowchart)
• Start with the input dataset containing N features.
• While there are more features to split on:
  § Compute the entropy and information gain for each feature.
  § Select the best feature (minimum entropy / maximum information gain).
  § Split the dataset based on that feature.
  § Make a decision tree node containing that feature.
  § Make child nodes of the decision tree from the data subsets created.
• When no features remain to split on, stop.
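
One iteration of the loop above boils down to computing information gain per feature; below is a hedged sketch using pandas and a tiny made-up dataset.

```python
# One ID3 step: information gain of a single categorical feature.
import numpy as np
import pandas as pd

def entropy(labels):
    p = labels.value_counts(normalize=True).to_numpy()
    return -np.sum(p * np.log2(p))

# Hypothetical data: does a customer default, given their credit band?
df = pd.DataFrame({
    "credit_band": ["low", "low", "low", "high", "high", "high"],
    "default":     ["yes", "yes", "no",  "no",   "no",   "no"],
})

parent = entropy(df["default"])

# Weighted entropy of the child nodes after splitting on credit_band.
children = sum(
    len(group) / len(df) * entropy(group["default"])
    for _, group in df.groupby("credit_band")
)

info_gain = parent - children   # ID3 picks the feature with the largest gain
print(round(info_gain, 3))
```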
C4.5 Algorithm
Comparison of ID3 and C4.5

Splitting Criterion
• ID3 (Iterative Dichotomiser 3): Uses Entropy & Information Gain to choose the best feature for splitting.
• C4.5 (Successor of ID3): Uses Entropy & Gain Ratio, which normalizes Information Gain to prevent bias towards multi-valued attributes.

Handling Continuous Features
• ID3: Cannot handle numerical (continuous) attributes; requires discretization (e.g., "High Credit Score" vs. "Low Credit Score").
• C4.5: Can handle continuous (numerical) attributes by creating threshold splits (e.g., "Credit Score ≤ 680").

Handling Missing Values
• ID3: Cannot handle missing values.
• C4.5: Can handle missing values by assigning probabilities to different attribute values.

Tree Pruning
• ID3: No pruning; leads to overfitting.
• C4.5: Uses post-pruning to remove branches that do not improve accuracy, reducing overfitting.

Bias Towards Multi-Valued Attributes
• ID3: Yes, because Information Gain favors attributes with more unique values.
• C4.5: No, because Gain Ratio corrects this bias.

Output
• ID3: Produces a large tree with possible overfitting.
• C4.5: Produces a simpler tree with better generalization.

Efficiency
• ID3: Faster but less accurate due to lack of pruning.
• C4.5: Slower than ID3 but more accurate due to pruning and handling continuous values.
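
C4.5's gain ratio divides information gain by the split information of the attribute; below is a sketch reusing the made-up dataset from the ID3 example (pandas and NumPy assumed).

```python
# Gain ratio (C4.5) for a single categorical feature.
import numpy as np
import pandas as pd

def entropy(labels):
    p = labels.value_counts(normalize=True).to_numpy()
    return -np.sum(p * np.log2(p))

df = pd.DataFrame({
    "credit_band": ["low", "low", "low", "high", "high", "high"],
    "default":     ["yes", "yes", "no",  "no",   "no",   "no"],
})

parent = entropy(df["default"])
weights = df["credit_band"].value_counts(normalize=True)
children = sum(w * entropy(df.loc[df["credit_band"] == v, "default"])
               for v, w in weights.items())

info_gain = parent - children
# Split information: entropy of the attribute's own value distribution.
split_info = -np.sum(weights.to_numpy() * np.log2(weights.to_numpy()))
gain_ratio = info_gain / split_info   # penalizes attributes with many values
print(round(gain_ratio, 3))
```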
