4.ClassificationLR Slides
Prediction of categories
Stefano Puglia
MIA-DIGITAL UNIVERSITY, Course of Machine Learning, CIS005, Spring 2021
Outline
● A different loss/cost function
– Logistic regression
● Additional performance metrics
– Confusion matrix
– ROC curve
– AUC
● Other algorithms (nonparametric)
– Decision trees
– Random forests
● Hands-on Lab
Machine learning: key concepts
● Model
– Built according to the chosen type of machine learning algorithm
– E.g. Regression: y = wx+b
● Parameters
– Numbers characterising the model (i.e. numbers to learn)
– Best numbers to fit the model with given data
– E.g. Regression: w (slope) and b (intercept)
● Iterative learning process
– Gradual calculation of best parameters (i.e. step-by-step computation)
– Minimisation of a “loss/cost function” (i.e. error estimation)
– E.g. Regression: GD*-based calculation, MSE (Mean Squared Error);
Classification: GD*-based calculation, Cross-entropy
● Hyperparameters
– Numbers initialising/fine tuning the model (i.e. numbers to set)
– Not present in all machine learning algorithms
– E.g. Regression: learning rate η; k-Means: parameter k
* Gradient Descent
Simple linear regression
Source: https://scikit-learn.org/stable/auto_examples/linear_model/plot_ols.html
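A minimal sketch of the slide's idea in code (the data below is illustrative, not the scikit-learn example's): fit a simple linear regression and read back the learned parameters w and b.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data drawn from y = 2x + 1 plus small Gaussian noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 1))
y = 2.0 * X.ravel() + 1.0 + rng.normal(0.0, 0.1, size=50)

model = LinearRegression()
model.fit(X, y)

w = model.coef_[0]    # learned slope
b = model.intercept_  # learned intercept
print(round(w, 2), round(b, 2))  # close to the true 2.0 and 1.0
```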
Logistic regression
Source: https://en.wikipedia.org/wiki/File:Exam_pass_logistic_curve.jpeg
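The exam-pass curve in the figure can be reproduced in spirit with hypothetical hours-studied data (the numbers below are invented for illustration, not the figure's actual data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical hours-studied vs. pass/fail data, loosely mimicking the figure
hours = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]).reshape(-1, 1)
passed = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(hours, passed)

# predict_proba applies the sigmoid to the learned linear function wx + b
p_pass = clf.predict_proba([[4.0]])[0, 1]
print(p_pass > 0.5)  # True: 4 hours of study predicts a pass
```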
Linear vs. Logistic regression
Mean Squared Error Loss Function
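The MSE loss named above, written out (reconstructed from the standard definition, since the slide's formula did not survive extraction):

```latex
\mathrm{MSE}(w, b) = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - (w x_i + b) \bigr)^2
```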
Linear vs. Logistic regression
Cross Entropy Loss Function
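The binary cross-entropy loss is $L = -\frac{1}{n}\sum_i \bigl[ y_i \log \hat{p}_i + (1-y_i)\log(1-\hat{p}_i) \bigr]$. A small numpy sketch (toy probabilities, chosen for illustration) shows that confident correct predictions incur lower loss than hesitant ones:

```python
import numpy as np

def cross_entropy(y_true, y_pred):
    """Mean binary cross-entropy; predictions clipped to avoid log(0)."""
    y_pred = np.clip(y_pred, 1e-12, 1 - 1e-12)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

y_true = np.array([1, 0, 1, 1])
confident = np.array([0.9, 0.1, 0.8, 0.95])  # close to the true labels
hesitant  = np.array([0.6, 0.4, 0.5, 0.60])  # near 0.5, uncommitted
print(cross_entropy(y_true, confident) < cross_entropy(y_true, hesitant))  # True
```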
Why not MSE for Logistic regression?
Source: https://towardsdatascience.com
● MSE is non-convex for logistic regression
– Gradient descent can get stuck in one of many local minima
● Cross-entropy loss (LCE) is convex
– Gradient descent converges to the single global minimum
Minimising loss/cost function
Source: https://blog.clairvoyantsoft.com/the-ascent-of-gradient-descent-23356390836f
Learning rate (carefully set)
* Gradient Descent
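A bare-bones gradient-descent loop for logistic regression (one feature, synthetic labels, fixed learning rate η; all values below are illustrative):

```python
import numpy as np

# Synthetic one-feature data: label is 1 when x exceeds 0.5
rng = np.random.default_rng(1)
x = rng.uniform(-3.0, 3.0, 200)
y = (x > 0.5).astype(float)

w, b, eta = 0.0, 0.0, 0.1  # parameters and learning rate
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # sigmoid predictions
    grad_w = np.mean((p - y) * x)           # d(cross-entropy)/dw
    grad_b = np.mean(p - y)                 # d(cross-entropy)/db
    w -= eta * grad_w                       # step downhill
    b -= eta * grad_b

pred = (1.0 / (1.0 + np.exp(-(w * x + b)))) > 0.5
accuracy = np.mean(pred == y)
print(accuracy)
```

A learning rate that is too large makes the loss oscillate or diverge; one that is too small makes convergence very slow — hence "carefully set".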
Generalised Linear Models (GLMs)
Source: https://en.wikipedia.org/wiki/Generalized_linear_model
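Both models above fit the GLM template, differing only in the link function $g$ (standard GLM facts, consistent with the cited article):

```latex
g(\mathbb{E}[y]) = wx + b
\qquad
\begin{aligned}
&\text{identity link } g(\mu) = \mu &&\Rightarrow \text{ linear regression}\\
&\text{logit link } g(\mu) = \ln\tfrac{\mu}{1-\mu} &&\Rightarrow \text{ logistic regression}
\end{aligned}
```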
Classification evaluation
Cross Entropy/Log Loss Function
Confusion Matrix
Source: https://en.wikipedia.org/wiki/Confusion_matrix
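A toy example of the four cells (the counts are invented for illustration), using scikit-learn's convention of rows = actual class, columns = predicted class:

```python
from sklearn.metrics import confusion_matrix

# Toy labels and predictions chosen so all four cells are populated
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

cm = confusion_matrix(y_true, y_pred)  # rows: actual, columns: predicted
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)  # 4 2 1 3
```

From these cells the usual metrics follow, e.g. precision = tp/(tp+fp) = 3/5 and recall = tp/(tp+fn) = 3/4.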
ROC Curve
Source: https://upload.wikimedia.org/wikipedia/commons/3/36/ROC_space-2.png
ROC Curve – Moving threshold
Source: https://upload.wikimedia.org/wikipedia/commons/3/36/Roc-draft-xkcd-style.svg
Area Under the Curve (AUC)
Source: https://upload.wikimedia.org/wikipedia/commons/6/6b/Roccurves.png
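Computing an ROC curve and its AUC on toy scores (the values are invented: a classifier that ranks most positives above most negatives):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Toy ground truth and classifier scores
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = np.array([0.1, 0.3, 0.35, 0.8, 0.4, 0.6, 0.7, 0.9])

# Each ROC point is the (FPR, TPR) obtained at one decision threshold
fpr, tpr, thresholds = roc_curve(y_true, scores)
auc = roc_auc_score(y_true, scores)
print(round(auc, 4))  # 0.8125: 13 of the 16 positive/negative pairs are ranked correctly
```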
Classification workflow with KNIME
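For comparison with the KNIME workflow, the same load → split → train → evaluate pipeline can be sketched in scikit-learn (the dataset and split ratio below are arbitrary choices for the sketch):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# 1. Load data (built-in binary-classification dataset)
X, y = load_breast_cancer(return_X_y=True)

# 2. Partition into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# 3. Learn (logistic regression; max_iter raised so the solver converges)
clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)

# 4. Score on held-out data
acc = accuracy_score(y_test, clf.predict(X_test))
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(acc > 0.9, auc > 0.9)
```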