
Chapter3 Classification Summary Final

Classification is a supervised learning technique that categorizes data into predefined classes or labels. There are two types of classification problems: binary classification, which has two classes, and multi-class classification, which has more than two classes. Logistic regression is commonly used for classification problems by using a logistic function called the sigmoid function to output a value between 0 and 1 for each sample. The performance of a logistic regression classifier can be evaluated using a confusion matrix and metrics like accuracy, precision, recall, and F1 score. Support vector machines (SVMs) find boundaries called hyper-planes in multi-dimensional space to separate different classes of data samples.

Uploaded by

Zubair Najim
Copyright © All Rights Reserved

A supervised learning technique:

classification

What is classification?
• Classification is the process of categorizing a given set
of data into classes. The pre-defined classes act as our
labels, or ground truth.
• The model uses the features of an object to predict its
label, e.g., filtering spam from non-spam emails or
classifying types of fruit based on their color, weight
and size.

What types of problems does classification
solve?

There are two types of classification problems:

• Binary: the output is restricted to two classes
• Multi-class: the output has more than two classes

To solve classification problems:
logistic regression

What is logistic regression?


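Logistic regression adapts linear regression to
classification problems. Unlike linear regression, it
doesn’t need a linear relationship between input and
output variables: a linear combination of the inputs is
passed through a logistic function, so the predicted
output is not a linear function of the inputs.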

Logistic regression uses a logistic function:
sigmoid function

The sigmoid function


takes any real input, and
outputs a value between
zero and one.
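
The behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not a trained model: the weights, the bias, and the 0.5 decision threshold are assumptions chosen for the example, and `predict` is a hypothetical helper name.

```python
import math

def sigmoid(z):
    """Map any real number z to a value between 0 and 1."""
    # Two branches avoid overflow in math.exp for large |z|.
    # For extreme inputs, floating point rounds the result to 0.0 or 1.0.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict(features, weights, bias, threshold=0.5):
    """Return (probability, label) for one sample.

    The linear combination of the features is passed through the sigmoid.
    The 0.5 threshold is an assumed, commonly used default.
    """
    z = sum(w * x for w, x in zip(weights, features)) + bias
    p = sigmoid(z)
    return p, int(p >= threshold)

# Hypothetical weights and bias, for illustration only.
probability, label = predict([2.0, 1.0], weights=[1.0, -1.0], bias=-0.5)
print(probability, label)
```

An input of zero maps to exactly 0.5, large positive inputs approach one, and large negative inputs approach zero.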

How can we measure the performance of a
logistic regression classifier?

• Once we have the predicted results from our classification
model (classifier), the results are compared with the
actual labels (ground truth).
• The performance of the model is then evaluated using
the confusion matrix.

Applying the confusion matrix to measure
the model performance

• True positives (TP) - instances predicted as positive
whose ground truth was also positive.
• False positives (FP) - instances predicted as positive
that were actually negative.
• True negatives (TN) - instances predicted as negative
whose ground truth was also negative.
• False negatives (FN) - instances predicted as negative
whose ground truth was positive.

                         Predicted Class
                         Negative   Positive
Actual Class  Negative   TN         FP
              Positive   FN         TP
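
The four cells of the confusion matrix can be counted directly from paired labels. The following is a minimal sketch for the binary case; the function name `confusion_counts` and the sample labels are invented for illustration, with 1 assumed to mean "positive".

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, TN and FN by comparing predictions with ground truth."""
    tp = fp = tn = fn = 0
    for truth, guess in zip(actual, predicted):
        if guess == positive and truth == positive:
            tp += 1          # predicted positive, actually positive
        elif guess == positive:
            fp += 1          # predicted positive, actually negative
        elif truth == positive:
            fn += 1          # predicted negative, actually positive
        else:
            tn += 1          # predicted negative, actually negative
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

# Made-up labels, for illustration only.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(actual, predicted)
print(counts)
```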

What each evaluation metric indicates

• Accuracy: indicates how many of the total samples
were predicted correctly.
• Precision: indicates how many of the instances
predicted as positive are actually positive.
• Recall (Sensitivity): indicates how many of the actual
positive samples the classifier correctly predicted as
positive.
• F1 score (F measure): indicates the balance between
the precision and the recall.

The aim is to maximize true positives & true negatives, and to minimize
false positives & false negatives.

The evaluation metrics
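
In terms of the confusion-matrix counts, accuracy, precision, recall, and F1 score have standard definitions. The sketch below computes them under those definitions; the function name `evaluation_metrics`, the example counts, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
def evaluation_metrics(tp, fp, tn, fn):
    """Compute the four metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    # Accuracy: share of all samples that were predicted correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are actually positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were predicted as positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0  # assumed convention when both are zero
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts, for illustration only.
metrics = evaluation_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which is why it is read as the balance between the two.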

Support vector machine (SVM)

What is a support vector machine (SVM)?

• A support vector machine (SVM) is a supervised ML
technique that can be used to solve classification and
regression problems. It is, however, mostly used for
classification.
• In this algorithm, each data point is plotted in a space
that has one dimension per feature. Then, the SVM model
finds boundaries that separate the data samples into
specific classes.

A practical example: finding a 2D plane that
differentiates two classes

Let’s say we have a dataset of
different animals of two classes:
birds & fish
• There are only three features:
body weight, body length, and
daily food consumption
• We draw a 3D grid and plot all
these points

An SVM model will try to find a
2D plane that differentiates
the 2 classes
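
To make the idea of a separating plane concrete, the sketch below checks which side of a given plane each point falls on. It does not train an SVM: the plane coefficients and the animal measurements are invented numbers, not real data, and `plane_score` and `classify` are hypothetical helper names.

```python
def plane_score(point, normal, offset):
    """Signed score w·x + b: positive on one side of the plane, negative on the other."""
    return sum(w * x for w, x in zip(normal, point)) + offset

def classify(point, normal, offset):
    """Assign a class according to the side of the plane the point lies on."""
    return "bird" if plane_score(point, normal, offset) >= 0 else "fish"

# Hypothetical plane in the 3D feature space
# (body weight, body length, daily food consumption).
normal = (-1.0, -1.0, 2.0)
offset = 1.5

# Invented, unitless sample points, for illustration only.
samples = {
    (0.5, 0.3, 0.10): "bird",
    (1.0, 0.4, 0.20): "bird",
    (2.0, 0.6, 0.05): "fish",
    (3.0, 0.9, 0.10): "fish",
}

results = {point: classify(point, normal, offset) for point in samples}
for point, predicted in results.items():
    print(point, "->", predicted)
```

A trained SVM would choose the plane's coefficients from the data; here they are fixed by hand only to show how a single plane splits the 3D points into two groups.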

If there are more than three features,
we would have a hyper-space

A hyper-space is a space with more than 3 dimensions, like
4D, 5D etc., and a separating boundary in a space with more
than 3 dimensions is called a hyper-plane.
• If the separating boundary is linear, the SVM is called
Linear Kernel SVM
• For nonlinear boundaries, a Polynomial Kernel
or other advanced SVMs are used
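
The difference between the two kernels can be illustrated with their usual formulas: the linear kernel is the dot product x·y, and the polynomial kernel is (x·y + c)^d. The sketch below is an assumption-labeled illustration, not part of an SVM implementation; the parameter names `degree` and `coef0` and the helper `degree2_feature_map` are chosen for this example.

```python
import math

def linear_kernel(x, y):
    """Dot product of two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel (x·y + coef0) ** degree."""
    return (linear_kernel(x, y) + coef0) ** degree

def degree2_feature_map(x):
    """Explicit features whose dot product equals (x·y)**2 for 2D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, y = (1.0, 2.0), (3.0, -1.0)

# The kernel computes, in the original 2D space, ...
implicit = polynomial_kernel(x, y, degree=2, coef0=0.0)
# ... the same value as a plain dot product in a 3D feature space.
explicit = linear_kernel(degree2_feature_map(x), degree2_feature_map(y))
print(implicit, explicit)
```

This is why a polynomial kernel can produce a nonlinear boundary in the original features: a linear separation in the larger feature space corresponds to a curved one in the original space.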
