Week 8. Supervised Learning. Classification
Classification
• Types of Classification
• Logistic Regression
• Naïve Bayes
| Type | Number of Classes | Labels per Instance | Example |
|------|-------------------|---------------------|---------|
| Binary Classification | 2 | 1 | Spam Detection (Spam/Not Spam) |
| Multi-Class Classification | 3 or more | 1 | Dog/Cat/Rabbit Classification |
| Multi-Label Classification | 3 or more | Multiple | Movie Genre Prediction (Action + Comedy + Sci-Fi) |
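The three task types in the table differ mainly in how the labels are encoded. A minimal sketch (the specific label values are made up for illustration):

```python
# Binary classification: one label per instance, drawn from two classes
# (0 = not spam, 1 = spam)
y_binary = [0, 1, 1, 0]

# Multi-class classification: one label per instance, three or more classes
# (0 = dog, 1 = cat, 2 = rabbit)
y_multiclass = [0, 2, 1, 1]

# Multi-label classification: several labels per instance, encoded as a
# binary indicator row (columns: action, comedy, sci-fi)
y_multilabel = [
    [1, 1, 1],  # Action + Comedy + Sci-Fi
    [0, 1, 0],  # Comedy only
]

print(set(y_binary))         # the two possible classes
print(len(y_multilabel[0]))  # number of candidate labels per instance
```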
Logistic Regression
• Logistic Regression is a widely used supervised learning
algorithm for classification tasks. Despite its name, it is used for
classification, not regression: it predicts categorical outcomes. It is
especially useful for binary classification problems, where the
target variable has two possible classes, such as "Yes/No,"
"Spam/Not Spam," or "Disease/No Disease."
• Applications:
• Spam detection (Spam/Not Spam)
• Disease prediction (Diabetic/Non-Diabetic)
• Customer churn prediction (Will Leave/Will Stay)
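A churn-style application like the one above can be sketched with scikit-learn's `LogisticRegression` (the feature and label values here are made up for illustration):

```python
from sklearn.linear_model import LogisticRegression

# Toy churn data: feature = months since last purchase,
# label = 1 (will leave) or 0 (will stay)
X = [[1], [2], [3], [10], [11], [12]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression()
model.fit(X, y)

# On this cleanly separable data the model recovers the pattern:
print(model.predict([[2], [11]]))         # expected: class 0, then class 1
print(model.predict_proba([[11]])[0][1])  # estimated probability of leaving
```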
Logistic Regression
• Logistic regression is used for binary classification. It applies the
sigmoid function, which takes the independent variables as input and
produces a probability value between 0 and 1.
• For example, with two classes, Class 0 and Class 1: if the value of the
logistic function for an input is greater than 0.5 (the threshold
value), the input belongs to Class 1; otherwise it belongs to Class 0.
• It is called "regression" because it extends linear regression, but it
is mainly used for classification problems.
Sigmoid Function
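The sigmoid function is σ(z) = 1 / (1 + e⁻ᶻ). A minimal sketch of the sigmoid and the 0.5-threshold rule described above:

```python
import math

def sigmoid(z):
    """Sigmoid (logistic) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(z, threshold=0.5):
    """Class 1 if the sigmoid output exceeds the threshold, else Class 0."""
    return 1 if sigmoid(z) > threshold else 0

print(sigmoid(0))      # 0.5 — exactly at the decision boundary
print(classify(2.0))   # 1 (sigmoid(2) ≈ 0.88 > 0.5)
print(classify(-2.0))  # 0 (sigmoid(-2) ≈ 0.12 < 0.5)
```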
Terminologies in Support Vector Machine
• Margin:
• The margin is the distance between the hyperplane and the nearest
support vectors.
• A larger margin means better generalization (less risk of
overfitting).
• SVM aims to find the maximum margin hyperplane (MMH) for better
separation of classes.
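The margin can be computed from a fitted linear SVM: for a hyperplane w·x + b = 0, the margin width is 2/‖w‖. A sketch with scikit-learn's `SVC`, on made-up, well-separated points:

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated clusters in 2-D (illustrative data)
X = np.array([[1.0, 1.0], [2.0, 2.0], [8.0, 8.0], [9.0, 9.0]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # a large C approximates a hard margin
clf.fit(X, y)

w = clf.coef_[0]
margin = 2.0 / np.linalg.norm(w)  # width between the two margin boundaries

print(clf.support_vectors_)  # the nearest points defining the margin
print(round(margin, 2))
```

Here the support vectors are the innermost points of each cluster; all other points play no role in where the hyperplane sits.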
Terminologies in Support Vector Machine
• Kernel Trick:
• Some datasets are not linearly separable in their original form.
• The Kernel Trick transforms data into a higher-dimensional space, where it
becomes separable.
• Common kernel functions:
• Linear Kernel: Used when data is linearly separable.
• Polynomial Kernel: Useful for curved boundaries.
• Radial Basis Function (RBF) Kernel: Widely used for complex,
non-linear decision boundaries.
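The classic XOR pattern illustrates why the kernel trick matters: no straight line separates it, but the RBF kernel does. A sketch with scikit-learn's `SVC` (the `gamma` value is an illustrative choice):

```python
import numpy as np
from sklearn.svm import SVC

# XOR-style data: not linearly separable in the original 2-D space
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf", gamma=2.0).fit(X, y)

print(linear.score(X, y))  # a linear boundary cannot get all four right
print(rbf.score(X, y))     # the RBF kernel separates XOR
```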
Decision Tree
What is a Decision Tree?
Accuracy
• Works well when the dataset has balanced classes (e.g., 50%
positive, 50% negative).
• Not reliable for imbalanced datasets, where one class dominates the
other.
Precision
Precision focuses on the quality of the model’s positive predictions. It tells
us how many of the instances predicted as positive are actually positive.
Precision is important in situations where false positives need to be
minimized, such as detecting spam emails or fraud.
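Precision is computed as TP / (TP + FP), where TP is true positives and FP is false positives. A minimal sketch with made-up spam-filter counts:

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP): the fraction of instances predicted
    positive that are actually positive."""
    return tp / (tp + fp)

# Illustrative spam filter: 40 spam emails correctly flagged (TP),
# 10 legitimate emails wrongly flagged as spam (FP)
print(precision(tp=40, fp=10))  # 0.8 — 80% of flagged emails were really spam
```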