
Week 4: Supervised Learning - Classification
What is Classification?

1. Predicting Categories: Classification is a type of supervised learning where the goal is to predict the category or class of a given data point.

2. Categorical Output: The output of a classification model is a discrete category, such as "spam" or "not spam," "cat" or "dog," or "positive" or "negative."

3. Training Data: Classification models are trained on labeled data where each data point has a known category.
Logistic Regression

Sigmoid Function: Logistic regression uses a sigmoid function to map the input features to a probability between 0 and 1, representing the likelihood of belonging to a specific class.

Binary Classification: The simplest form of logistic regression involves two classes, often referred to as "positive" and "negative."

Linear Decision Boundary: At its core, logistic regression seeks to find a linear decision boundary that best separates the data points into their respective classes.
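The sigmoid mapping described above can be sketched directly in a few lines; this is a minimal illustration using only the standard library, not tied to any particular framework:

```python
import math

def sigmoid(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A large positive score maps near 1, a large negative score near 0,
# and a score of exactly 0 maps to 0.5.
print(sigmoid(4.0))   # close to 1
print(sigmoid(-4.0))  # close to 0
print(sigmoid(0.0))   # 0.5
```

Note that sigmoid(z) + sigmoid(-z) = 1, which is what makes the output interpretable as the probability of the positive class versus the negative class.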
Logistic Regression Example
Imagine a scenario where we're trying to predict whether a customer will click on an advertisement based on their age and income. Logistic
regression can be used to create a model that learns the relationship between these features and the probability of a click. For example, the
model might learn that younger customers with higher incomes are more likely to click on ads.

Logistic regression works by applying a sigmoid function to a weighted combination of the input features. The sigmoid function outputs a probability between 0 and 1, which represents the likelihood that the data point belongs to the positive class. For example, if the sigmoid function outputs a probability of 0.8, the model predicts an 80% chance that the customer will click on the ad.
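The ad-click scenario can be sketched as follows. The weights and bias here are illustrative assumptions chosen to match the slide's story (younger customers with higher incomes are more likely to click); in practice they would be learned from training data:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def click_probability(age, income, w_age=-0.05, w_income=0.00002, bias=0.5):
    # Hypothetical learned weights: the negative age weight and positive
    # income weight encode "younger, higher-income customers click more."
    z = w_age * age + w_income * income + bias
    return sigmoid(z)

p_young_rich = click_probability(age=25, income=90_000)
p_old_poor = click_probability(age=60, income=30_000)
print(p_young_rich, p_old_poor)  # the first probability is the larger one
```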
Decision Trees

1. Tree Structure: Decision trees represent a hierarchical structure where each node represents a feature and each branch represents a decision rule based on the feature's value.

2. Recursive Partitioning: The tree is built recursively by partitioning the data based on the features that best separate the classes.

3. Prediction: Predictions are made by traversing the tree from the root to a leaf node, where the leaf node represents the predicted class.
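The partitioning step can be illustrated with a small sketch that scores candidate splits on one numeric feature. Gini impurity is a common split criterion, though the slides do not name one, so treat this as one possible choice:

```python
def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(xs, ys):
    """Find the threshold on a single numeric feature that minimizes the
    weighted Gini impurity of the two resulting partitions."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# The labels separate cleanly at x <= 2, so that threshold is chosen
# and the weighted impurity drops to 0.
xs = [1, 2, 3, 4]
ys = ["A", "A", "B", "B"]
print(best_split(xs, ys))  # (2, 0.0)
```

A full tree builder would apply this search recursively to each partition until the leaves are pure or another stopping rule fires.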
Decision Tree Example
Here is an example of a decision tree used for a dataset about playing tennis.
The tree begins with the root node, which asks whether the weather is sunny. If
the weather is sunny, the tree branches to the next node, which asks if humidity
is high. If humidity is high, the tree predicts that the player should not play
tennis.
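The traversal described above can be written as nested conditionals. Only the sunny/high-humidity path is specified on the slide; the other branches here are assumed defaults for illustration:

```python
def play_tennis(weather, humidity):
    # Root node: is the weather sunny?
    if weather == "sunny":
        # Internal node: is humidity high?
        if humidity == "high":
            return "don't play"  # leaf node from the slide's example
        return "play"            # leaf node (assumed branch)
    # Assumed default for non-sunny weather.
    return "play"

print(play_tennis("sunny", "high"))    # don't play
print(play_tennis("sunny", "normal"))  # play
```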
Decision Tree:
Root Node
The tree starts with the root node, which is the feature that best splits the data into different classes.

Internal Nodes
Each internal node represents a decision based on a feature, and branches lead to different child nodes.

Leaf Nodes
Leaf nodes represent the final prediction for a specific class.
Advantages of Decision Trees

Interpretability: Decision trees are easy to understand and interpret, as the rules are transparent and readily visualized.

Handling Categorical Data: Decision trees can handle both numerical and categorical features without requiring feature scaling or transformation.

Non-Linear Relationships: Decision trees can capture non-linear relationships between features and the target variable.
Disadvantages of Decision Trees

Overfitting: Decision trees can easily overfit the training data, resulting in poor generalization performance on unseen data.

Sensitivity to Small Changes: Slight changes in the training data can significantly alter the structure of the tree, leading to instability in predictions.
Evaluation Metrics - Confusion Matrix

Definition: A table that summarizes the performance of a classification model by showing the number of true positives, true negatives, false positives, and false negatives.

Visual Representation: Provides a clear overview of the model's ability to correctly classify instances into different classes.

Use Cases: To gain a deeper understanding of model performance and identify potential biases or weaknesses.
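The four counts can be tallied directly from paired actual and predicted labels. A minimal sketch for the binary case, using the spam example from earlier in the deck:

```python
def confusion_counts(actual, predicted, positive="spam"):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

actual    = ["spam", "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham",  "ham", "spam", "spam"]
print(confusion_counts(actual, predicted))
# {'TP': 2, 'FP': 1, 'TN': 1, 'FN': 1}
```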
Evaluation Metrics - Accuracy

Definition: The ratio of correctly classified instances to the total number of instances.

Formula: (True Positives + True Negatives) / (Total Instances)

Use Cases: When all classes are equally important and balanced.
Evaluation Metrics - Precision

Definition: The ratio of correctly classified positive instances to the total number of instances predicted as positive.

Formula: True Positives / (True Positives + False Positives)

Use Cases: When minimizing false positive errors is critical.
Evaluation Metrics - Recall

Definition: The ratio of correctly classified positive instances to the total number of actual positive instances.

Formula: True Positives / (True Positives + False Negatives)

Use Cases: When minimizing false negative errors is crucial.


Evaluation Metrics - F1 Score

Definition: The harmonic mean of precision and recall, providing a balanced measure of both.

Formula: 2 * (Precision * Recall) / (Precision + Recall)

Use Cases: When a trade-off between precision and recall is needed.
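All four metric formulas from these slides can be computed from the confusion-matrix counts. The counts below (TP=8, TN=5, FP=2, FN=1) are a made-up example for illustration:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * (precision * recall) / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical confusion matrix: TP=8, TN=5, FP=2, FN=1.
acc, prec, rec, f1 = classification_metrics(tp=8, tn=5, fp=2, fn=1)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Note that precision and F1 divide by quantities that can be zero when the model never predicts the positive class; a production implementation would guard against that edge case.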
