
Classification

Prediction of categories

Stefano Puglia
MIA-DIGITAL UNIVERSITY, Course of Machine Learning, CIS005, Spring 2021
Outline
● A different loss/cost function
– Logistic regression
● Additional performance metrics
– Confusion matrix
– ROC curve
– AUC
● Other algorithms (nonparametric)
– Decision trees
– Random forests
● Hands-on Lab
Machine learning: key concepts
● Model
– Built according to the machine learning type (i.e. the algorithm)
– E.g. Regression: y = wx+b
● Parameters
– Numbers characterising the model (i.e. numbers to learn)
– The values that best fit the model to the given data
– E.g. Regression: w (slope) and b (intercept)
● Iterative learning process
– Gradual calculation of the best parameters (i.e. step-by-step computation)
– Minimisation of a “loss/cost function” (i.e. error estimation)
– E.g. Regression: GD*-based calculation, MSE (Mean Squared Error);
Classification: GD*-based calculation, Cross-entropy
● Hyperparameters
– Numbers initialising/fine-tuning the model (i.e. numbers to set)
– Not present in all machine learning algorithms
– E.g. Regression: learning rate η; k-Means: parameter k

* Gradient Descent
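To make the parameters/hyperparameters distinction concrete, here is a minimal scikit-learn sketch (an illustrative addition with synthetic data, not taken from the slides): the learning rate η (eta0) is a hyperparameter we set before training, while the slope w and intercept b are parameters the GD-based process learns.

```python
# Minimal sketch: parameters vs. hyperparameters (synthetic data).
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))              # feature x
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 1, 100)    # y = wx + b + noise

# Hyperparameters: numbers we set (e.g. the learning rate eta0).
model = SGDRegressor(loss="squared_error", learning_rate="constant",
                     eta0=0.01, max_iter=1000, random_state=0)
model.fit(X, y)

# Parameters: numbers the iterative, GD-based process has learned.
print("w (slope):    ", model.coef_[0])
print("b (intercept):", model.intercept_[0])
```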
Simple linear regression

Source: https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html
Logistic regression

Source: https://en.wikipedia.org/wiki/File:Exam_pass_logistic_curve.jpeg
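The figure plots the logistic (sigmoid) curve of exam-pass probability against hours of study. As a minimal sketch of what the curve computes (w and b below are made-up values, not the figure's fitted parameters), the model squashes a linear score into a probability in (0, 1):

```python
# Sketch of the logistic (sigmoid) function behind the exam-pass curve.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = 1.5, -6.0                    # hypothetical parameters, for illustration
hours = np.array([1, 2, 3, 4, 5, 6])
p_pass = sigmoid(w * hours + b)     # probability of passing, in (0, 1)
for h, p in zip(hours, p_pass):
    print(f"{h} h of study -> P(pass) = {p:.2f}")
```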
Linear vs. Logistic regression

Mean Squared Error Loss Function (linear regression):
MSE = (1/n) Σᵢ (yᵢ − ŷᵢ)²

Cross Entropy Loss Function (logistic regression):
LCE = −(1/n) Σᵢ [ yᵢ ln(ŷᵢ) + (1 − yᵢ) ln(1 − ŷᵢ) ]
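A short numpy sketch of both losses side by side (the labels and predicted probabilities below are toy values invented for illustration):

```python
# Computing MSE and binary cross entropy on toy values.
import numpy as np

y_true = np.array([0, 0, 1, 1])            # true labels y
y_prob = np.array([0.1, 0.4, 0.35, 0.8])   # predicted probabilities y-hat

# MSE = (1/n) * sum((y - y_hat)^2)
mse = np.mean((y_true - y_prob) ** 2)

# LCE = -(1/n) * sum(y*ln(y_hat) + (1-y)*ln(1-y_hat))
ce = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

print(f"MSE:           {mse:.4f}")
print(f"Cross entropy: {ce:.4f}")
```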
Why not MSE for Logistic regression?

Source: https://towardsdatascience.com
● Non-convex MSE
– With a sigmoid model, MSE is non-convex: gradient descent may get stuck in one of many local minima
● Convex LCE (cross entropy loss)
– Gradient descent converges to the single global minimum
Minimising loss/cost function
Source: https://blog.clairvoyantsoft.com/the-ascent-of-gradient-descent-23356390836f

● Learning rate (carefully set)
● Learning: progressive weight adjustment
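As a sketch of the update rule the figure illustrates (w ← w − η·∂L/∂w), here is plain gradient descent for logistic regression with the cross entropy loss; the data and learning rate below are illustrative, not from the slides:

```python
# Gradient descent for logistic regression (illustrative sketch).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hours of study (toy data)
y = np.array([0, 0, 0, 1, 1, 1])              # fail / pass labels

w, b = 0.0, 0.0
eta = 0.1                       # learning rate (carefully set)
for _ in range(5000):
    p = sigmoid(w * X + b)      # current predicted probabilities
    # Gradients of the cross entropy loss w.r.t. w and b:
    grad_w = np.mean((p - y) * X)
    grad_b = np.mean(p - y)
    w -= eta * grad_w           # progressive weight adjustment
    b -= eta * grad_b

print(f"learned w = {w:.2f}, b = {b:.2f}")
```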


Generalised Linear Models (GLMs)

Source: https://en.wikipedia.org/wiki/Generalized_linear_model
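In GLM terms, logistic regression is the GLM with a Bernoulli-distributed response and the logit link function (the table in the linked figure did not survive extraction; this is the standard relationship):

logit(p) = ln(p / (1 − p)) = wx + b, i.e. p = 1 / (1 + e^−(wx+b))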
Classification evaluation
● Cross Entropy/Log Loss Function
● Confusion Matrix
● ROC Curve – AUC
Confusion matrix

Source: https://en.wikipedia.org/wiki/Confusion_matrix
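The matrix image did not survive extraction; the standard 2×2 layout and the rates used on the ROC slides that follow are:

                    Predicted positive     Predicted negative
Actual positive     TP (true positive)     FN (false negative)
Actual negative     FP (false positive)    TN (true negative)

TPR (recall/sensitivity) = TP / (TP + FN)
FPR (fall-out) = FP / (FP + TN)
Accuracy = (TP + TN) / (TP + TN + FP + FN)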
ROC Curve

Source: https://upload.wikimedia.org/wikipedia/commons/3/36/ROC_space-2.png

[Figure: example classifiers plotted as points in ROC space (x-axis: false positive rate, y-axis: true positive rate)]
ROC Curve – Moving threshold

Source: By Sharpr - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=44059691
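A minimal scikit-learn sketch of how sweeping the decision threshold traces out the ROC curve (the labels and scores below are toy values):

```python
# Sketch: ROC points and AUC with scikit-learn (toy scores).
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5])

# roc_curve sweeps the threshold and returns one (FPR, TPR) point
# per threshold; AUC is the area under those points.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold >= {th:.2f} -> FPR = {f:.2f}, TPR = {t:.2f}")
print("AUC =", roc_auc_score(y_true, y_score))
```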


Area Under the Curve (AUC)

Source: https://upload.wikimedia.org/wikipedia/commons/3/36/Roc-draft-xkcd-style.svg
Area Under the Curve (AUC)

Source: https://upload.wikimedia.org/wikipedia/commons/6/6b/Roccurves.png
Classification workflow with KNIME
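The KNIME workflow screenshots did not survive extraction. As a rough, hypothetical equivalent in scikit-learn (the file name "data.csv" and column "target" are invented for illustration), the workflow reads data, partitions it, trains a logistic regression learner, predicts, and scores:

```python
# Hypothetical scikit-learn equivalent of the KNIME classification workflow:
# read data -> partition -> train logistic regression -> predict -> score.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score

df = pd.read_csv("data.csv")             # hypothetical input file
X = df.drop(columns=["target"])          # hypothetical label column
y = df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)    # "Partitioning" node

clf = LogisticRegression(max_iter=1000)      # "Logistic Regression Learner"
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)                 # "Predictor" node
y_prob = clf.predict_proba(X_test)[:, 1]

print(confusion_matrix(y_test, y_pred))      # "Scorer" node
print("AUC =", roc_auc_score(y_test, y_prob))
```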
