11 - Model Eval and Tuning
Cross-industry Standard Process for Data Mining (CRISP-DM)
R-Squared
◦ Sometimes called “coefficient of determination”
◦ Interpreted as the percent of variance explained by the model, compared to the default (mean-only) model
Adjusted R-Squared
◦ Same interpretation as R²
◦ Includes a penalty for adding predictors:
Adj R² = 1 − [ SSE_reg / (n − K) ] / [ SSE_simple / (n − 1) ]
where n is the number of observations, K is the number of model parameters, and SSE_simple is the SSE of the default (mean-only) model
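A minimal sketch of both computations, assuming NumPy arrays y (observed) and y_pred (predicted) and a parameter count K; all names here are placeholders:

```python
import numpy as np

def r_squared(y, y_pred):
    sse_reg = np.sum((y - y_pred) ** 2)         # SSE of the fitted model
    sse_simple = np.sum((y - np.mean(y)) ** 2)  # SSE of the default (mean-only) model
    return 1 - sse_reg / sse_simple

def adj_r_squared(y, y_pred, K):
    n = len(y)
    sse_reg = np.sum((y - y_pred) ** 2)
    sse_simple = np.sum((y - np.mean(y)) ** 2)
    # Penalty for adding predictors: each SSE is divided by its degrees of freedom
    return 1 - (sse_reg / (n - K)) / (sse_simple / (n - 1))
```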
Evaluating Residuals
Normal distribution of residuals
◦ Jarque-Bera Test
◦ Evaluates skewness and kurtosis
◦ Shapiro-Wilk Test
◦ Evaluates whether the sample comes from a normally distributed population
◦ Graphical Tests
◦ Quantile-Quantile (QQ) plot
◦ Prediction versus Observed Plots
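A short sketch of these checks with SciPy; the residuals here are generated synthetically as a stand-in for a fitted model's residuals:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

residuals = np.random.default_rng(0).normal(size=200)  # stand-in for model residuals

jb_stat, jb_p = stats.jarque_bera(residuals)  # jointly tests skewness and kurtosis
sw_stat, sw_p = stats.shapiro(residuals)      # tests against a normal population
print(f"Jarque-Bera p={jb_p:.3f}, Shapiro-Wilk p={sw_p:.3f}")

# Graphical test: QQ plot of residuals against the normal distribution
stats.probplot(residuals, dist="norm", plot=plt)
plt.show()
```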
Homoscedasticity
◦ Bartlett’s Test
◦ Rule of thumb: if the ratio of the largest variance to the smallest variance is 1.5 or below, the data can be treated as homoscedastic
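A brief sketch of Bartlett's test alongside the variance-ratio rule of thumb, assuming SciPy; the groups are made up for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
groups = [rng.normal(0, s, 100) for s in (1.0, 1.1, 1.2)]  # illustrative groups

stat, p = stats.bartlett(*groups)        # H0: all group variances are equal
variances = [np.var(g, ddof=1) for g in groups]
ratio = max(variances) / min(variances)  # rule of thumb: <= 1.5 is acceptable
print(f"Bartlett p={p:.3f}, variance ratio={ratio:.2f}")
```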
Classifier Evaluation Measures
Confusion matrix
                Classified or Predicted
                   a       b
Actual     a      aa      ab
           b      ba      bb

D = aa + bb + ab + ba
Actual a = (aa + ab), also Actual non-b
Actual b = (ba + bb), also Actual non-a
Classified a = (aa + ba), Classified b = (ab + bb)
Evaluation Metrics
Accuracy is the overall correctness of the model and is calculated as the sum of correct
classifications divided by the total number of classifications.
Accuracy = (aa+bb) / D
True Positive Rate (a) = aa / Actual a
True Positive Rate (b) = bb / Actual b
False Positive Rate (a) = ba / Actual b
False Positive Rate (b) = ab / Actual a
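A minimal sketch computing these rates directly from the confusion-matrix cells; the counts below are made up for illustration:

```python
# Cell notation from the confusion matrix above (rows = actual, columns = classified)
aa, ab, ba, bb = 40, 10, 5, 45  # illustrative counts

D = aa + ab + ba + bb
accuracy = (aa + bb) / D

tpr_a = aa / (aa + ab)  # True Positive Rate (a)  = aa / Actual a
tpr_b = bb / (ba + bb)  # True Positive Rate (b)  = bb / Actual b
fpr_a = ba / (ba + bb)  # False Positive Rate (a) = ba / Actual b
fpr_b = ab / (aa + ab)  # False Positive Rate (b) = ab / Actual a
print(accuracy, tpr_a, tpr_b, fpr_a, fpr_b)
```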
Precision, Recall, and F-Measure
Precision
◦ Measure of accuracy for a specific class.
◦ Precision (a) = aa / Classified a
◦ Precision (b) = bb / Classified b
Recall is a measure of the ability of a classification model to select instances of a certain class from a data set. It is commonly also called sensitivity.
◦ Equivalent to TP rate.
◦ Recall (a) = aa / Actual a
◦ Recall (b) = bb / Actual b
F-Measure
◦ The F-measure is the harmonic mean of precision and recall.
◦ It can be used as a single measure of performance of the test.
◦ F = ( 2 x Precision x Recall ) / ( Precision + Recall )
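A brief sketch of the same measures via scikit-learn, assuming a binary problem; the y_true and y_pred labels are made up:

```python
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = ["a", "a", "b", "b", "a", "b"]
y_pred = ["a", "b", "b", "b", "a", "a"]

# pos_label selects which class counts as "positive"
prec_a = precision_score(y_true, y_pred, pos_label="a")  # aa / Classified a
rec_a = recall_score(y_true, y_pred, pos_label="a")      # aa / Actual a
f_a = f1_score(y_true, y_pred, pos_label="a")            # harmonic mean of the two
print(prec_a, rec_a, f_a)
```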
ROC Curves
Used for binary classifiers
Originated with radar analysis of signal vs noise
Plot of True Positive Rate (Recall) against False Positive Rate
The larger the Area Under the Curve (AUC), the better the classifier's discrimination
Can monitor the relationship through the full ranges of positives and negatives
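A short sketch of an ROC/AUC computation, assuming scikit-learn, a synthetic binary dataset, and a classifier that exposes predict_proba; all names and parameters are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Predicted probabilities for the positive class drive the curve
probs = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, thresholds = roc_curve(y_te, probs)  # TPR vs FPR across thresholds
print("AUC:", roc_auc_score(y_te, probs))      # larger AUC = better discrimination
```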
K-Fold Cross Validation
Useful for smaller data sets
Alternative to train/test splitting
◦ Every piece of data is used for both training and testing
◦ If k = number of folds, then:
◦ Each data point will be used for training k − 1 times
◦ Each data point will be used for testing 1 time
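A minimal sketch of k-fold cross validation with scikit-learn; the dataset is synthetic and the scoring choice is illustrative:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=0)

# With k=5 folds, each point is used for training 4 times and testing once
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print(scores.mean(), scores.std())
```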
Optimization – Hyperparameter Tuning
A hyperparameter is defined as
◦ A parameter that impacts model performance but …
◦ Is NOT learned from the data
Examples include:
◦ Number of neighbors in a KNN model (MAE, RMSE)
◦ Number of clusters in a clustering model (Silhouette Score)
◦ Number of levels in a decision tree (Precision / Recall)
◦ Number of epochs in a neural network model (Train / Validation performance by epoch)
Tuning procedure
◦ Decide on the metric you will use to evaluate model performance
◦ Examples include MAE, RMSE, silhouette score, and precision/recall (as listed above)
◦ Repeat the training/testing process for many hyperparameter values
Example – KNN Tuning with MAE
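A minimal sketch of this tuning loop, assuming scikit-learn, a synthetic regression dataset, and a simple grid of k values; all names and values are placeholders:

```python
from sklearn.datasets import make_regression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

X, y = make_regression(n_samples=300, n_features=4, noise=15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Repeat the train/test process across candidate n_neighbors values,
# tracking the chosen evaluation metric (MAE) for each
results = {}
for k in range(1, 21):
    model = KNeighborsRegressor(n_neighbors=k).fit(X_tr, y_tr)
    results[k] = mean_absolute_error(y_te, model.predict(X_te))

best_k = min(results, key=results.get)  # k with the lowest MAE
print(best_k, results[best_k])
```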
Techniques for Improving Performance
Transforming target variables
◦ When targets are highly skewed it can cause issues with model performance
◦ Can make the model difficult to interpret
◦ Example: Regression predicting diamond prices
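A minimal sketch of the transform-train-invert pattern, assuming a log transform and synthetic, price-like data (not the actual diamond dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.random.default_rng(0).normal(size=(200, 3))
price = np.exp(X @ np.array([0.5, 0.3, 0.2]) + 8)  # skewed, price-like target

model = LinearRegression().fit(X, np.log(price))  # train on the log of the target
pred_price = np.exp(model.predict(X))             # invert back to original units
print(pred_price[:3])
```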
The group with the highest presentation score (Items 2 and 3 above) will be given the choice to waive the final exam, which means that you can take the grade you have without taking the final.
Final Exam
I will present you with two short scenarios with a data set for each scenario.
Each scenario will require data analysis to address the scenario requirements. Therefore you must:
◦ Select the appropriate analytical technique based on
◦ Scenario objective
◦ Structure of the data set, especially the target variable.
Submission will be a Word Document or PDF illustrating your decisions, analysis, processes, etc.
The final will be released after the presentations on Dec 2 and will be due at 10 PM on Dec 9.
This is a take-home exam, so be aware that the following conditions will be strictly enforced:
◦ Work alone. Anyone reported or observed to be working with others will have their scores adjusted downward. This will be strictly enforced for the final. No exceptions.
◦ Please do not ask other students in the class for help. That is unfair to them because they will have to say no or could have their own grade impacted.
◦ No late exams will be accepted. If your exam is not submitted on time, your score will be 0.
The team that wins the presentation judging will have the option to waive the exam. This does not mean that you will get 100% on the exam; it means that you can accept your grade before the exam as your final grade.