Risk Minimization

The document discusses the concepts of risk and error in machine learning, specifically distinguishing between true risk and empirical risk. True risk represents the average loss over the entire population, while empirical risk is calculated from a sample dataset. It also highlights the importance of empirical risk minimization and introduces L2 regularization as a technique to prevent overfitting in models.


Risk

• In machine learning, 'risk' is synonymous with 'error'.

• Error - difference between the actual value and the predicted value.

• In supervised learning, we solve a problem using a dataset that is representative of all the classes for that problem.

• Consider an example of a supervised learning problem - cancer diagnosis. Each input has a label 0 (cancer not diagnosed) or 1 (cancer diagnosed).

• Since we cannot have data on all the people in the world, we sample data from some people to feed into our model.

• We use a loss function L(h, x) to find the difference between the actual diagnosis
and the predicted diagnosis. This is essentially a measurement of error.
https://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/lecturenote10.html
True Risk
• True risk is the average loss/error over all possibilities (here, the population of the whole world). Its formula is as follows:

R(h) = E_{x~D}[ L(h(x), hT(x)) ]

• For calculating the true risk, we need the entire distribution D that generates our dataset and also a true labeling function hT.
Empirical Risk
• Since we use a small sample as the data for our model, we talk about empirical risk here. Over a sample (x1, y1), ..., (xn, yn) it is calculated as follows:

R_emp(h) = (1/n) Σ_{i=1..n} L(h(xi), yi)
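The empirical risk described above can be sketched in a few lines of Python; the function names here are illustrative, not from any particular library:

```python
def zero_one_loss(y_pred, y_true):
    # 0-1 loss: charge 1 when the prediction is wrong, 0 otherwise
    return 0 if y_pred == y_true else 1

def empirical_risk(h, samples):
    # average the per-sample losses over the sampled dataset
    return sum(zero_one_loss(h(x), y) for x, y in samples) / len(samples)

# a model that always predicts 1, evaluated on three labeled samples
risk = empirical_risk(lambda x: 1, [(0.2, 1), (0.6, 1), (0.9, 0)])  # 1/3
```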
Example
• Consider the dataset which shows whether each composition of a new drug developed by a pharmaceutical company yielded a positive result or not.
• Based on the above dataset, we develop a model h that takes in the percentages of both components,
x = (Component 1 (%), Component 2 (%)), and predicts y = 0
if Component 2 (%) > 0.5 (and y = 1 otherwise).

Let us now compare our model's predictions with the actual ones.
We use the 0-1 loss function, which simply assigns a loss of 1 when the model is wrong and 0 otherwise. Then we compute the empirical risk as follows:

(1/5)(0 + 0 + 1 + 1 + 0) = 2/5
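As a sketch, the computation above can be reproduced in Python. The five (c1, c2, y) rows below are hypothetical stand-ins, since the original table is not reproduced here; they are chosen only so that the per-sample losses come out as 0, 0, 1, 1, 0:

```python
# Hypothetical rows (Component 1 %, Component 2 %, actual y) standing in
# for the dataset from the slides, which is not reproduced here.
data = [
    (0.2, 0.8, 0),
    (0.5, 0.9, 0),
    (0.7, 0.3, 0),
    (0.1, 0.6, 1),
    (0.4, 0.2, 1),
]

def h(c1, c2):
    # the model from the slides: predict 0 if Component 2 (%) > 0.5, else 1
    return 0 if c2 > 0.5 else 1

# 0-1 loss per sample, then the empirical risk as their average
losses = [0 if h(c1, c2) == y else 1 for c1, c2, y in data]
risk = sum(losses) / len(losses)  # (0 + 0 + 1 + 1 + 0) / 5 = 2/5
```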


True Risk
• Let us assume that the amounts of components 1 and 2 in D are selected from the interval [0, 1] in a random, uniform and independent manner.
• Let the true labeling function hT label a sample as hT = 0 (did not yield a result) if its (c1, c2) combination is within distance 1/2 of the point (1, 1).
• The plot looks as follows:

Here, the gray area is the region where the model's prediction was 0 and the red circle represents the actual region where the sample did not yield results.

Here, the difference between the area of the gray rectangle and the area of the quadrant gives the true risk.

True risk = 0.5 × 1 − 0.25 × π × (0.5)² ≈ 0.30
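The ≈ 0.30 figure can be sanity-checked with a small Monte Carlo simulation (a sketch, not part of the original slides): draw many uniform (c1, c2) points, compare the model's prediction with the true label hT, and average the disagreement.

```python
import random

random.seed(42)

n = 200_000
errors = 0
for _ in range(n):
    c1, c2 = random.random(), random.random()
    pred = 0 if c2 > 0.5 else 1  # the model h
    # true label hT: 0 if (c1, c2) is within distance 1/2 of (1, 1)
    true = 0 if (c1 - 1) ** 2 + (c2 - 1) ** 2 <= 0.25 else 1
    errors += pred != true

rate = errors / n
# analytically: 0.5 - pi/16 ≈ 0.304; the estimate should land close to that
```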
Empirical Risk Minimization
• While building our machine learning model, we choose a function that reduces the difference between the actual and the predicted output, i.e. the empirical risk.

• We aim to minimize the empirical risk as a proxy for minimizing the true risk, in the hope that the empirical risk is close to the true risk.
• Empirical risk minimization depends on four factors:
– The size of the dataset - the more data we get, the more the empirical risk approaches the true risk.
– The complexity of the true distribution - if the underlying distribution is too complex, we might need more data to get a good approximation of it.
– The class of functions we consider - if the function class is too large, the gap between empirical and true risk (the estimation error) can be very high, and the model may overfit.
– The loss function - a loss function that gives very high loss in certain conditions can let a few samples dominate the empirical risk.
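As an illustration of choosing a function that minimizes empirical risk, the sketch below searches a small one-parameter class {h_t : h_t predicts 0 when c2 > t} for the threshold with the lowest 0-1 empirical risk. The data and the function class are made up for this example:

```python
# Hypothetical labeled sample: (Component 1 %, Component 2 %, actual y)
data = [(0.2, 0.8, 0), (0.5, 0.9, 0), (0.7, 0.3, 1), (0.1, 0.6, 0), (0.4, 0.2, 1)]

def emp_risk(t):
    # empirical 0-1 risk of h_t, which predicts 0 when c2 > t and 1 otherwise
    preds = [0 if c2 > t else 1 for _, c2, _ in data]
    return sum(p != y for p, (_, _, y) in zip(preds, data)) / len(data)

# ERM over the finite class t in {0.0, 0.1, ..., 1.0}:
# pick the threshold whose empirical risk is smallest
best_t = min((t / 10 for t in range(11)), key=emp_risk)
```

With a larger class (e.g. arbitrary regions of the unit square) the sample could always be fit perfectly, which is exactly the overfitting risk the third factor above describes.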
L2 Regularization
• To handle the problem of overfitting, we use regularization techniques.
• When applied to linear regression, L2 regularization is known as ridge regression.
• In ridge regression, predictors that are insignificant are penalized by shrinking their coefficients.
• This shrinkage of the coefficients also helps deal with independent variables that are highly correlated.
• It adds the "squared magnitude" of the coefficients - the sum of squares of the weights of all features - as the penalty term to the loss function:

Loss = Σ_i (y_i − ŷ_i)² + λ Σ_j w_j²

– Here, λ is the regularization parameter.
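A minimal sketch of the penalty in action, using NumPy and the closed-form ridge solution w = (XᵀX + λI)⁻¹Xᵀy. The data here is synthetic, generated only to show the shrinkage effect:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))             # 50 samples, 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=50)

lam = 0.1  # the regularization parameter λ
# ordinary least squares vs. ridge: the λI term shrinks the weights
w_ols = np.linalg.solve(X.T @ X, X.T @ y)
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
```

Larger λ shrinks the weights further toward zero; λ = 0 recovers ordinary least squares.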
