Risk Minimization

The document discusses the concepts of risk and error in machine learning, specifically distinguishing between true risk and empirical risk. True risk represents the average loss over the entire population, while empirical risk is calculated from a sample dataset. It also highlights the importance of empirical risk minimization and introduces L2 regularization as a technique to prevent overfitting in models.


Risk

• In machine learning, 'risk' is synonymous with 'error'.

• Error - difference between the actual value and the predicted value.

• In supervised learning, we solve a problem using a dataset that is representative of all the classes for that problem.

• Consider an example of a supervised learning problem - cancer diagnosis. Each input has a label 0 (cancer not diagnosed) or 1 (cancer diagnosed).

• Since we cannot have data on all the people in the world, we sample data from some people to feed into our model.

• We use a loss function L(h, x) to find the difference between the actual diagnosis
and the predicted diagnosis. This is essentially a measurement of error.
https://www.cs.cornell.edu/courses/cs4780/2018fa/lectures/lecturenote10.html
True Risk
• True risk is the average loss/error over all possibilities (here, the population of the whole world). Its formula is as follows:

R(h) = E_{x~D}[ L(h(x), hT(x)) ]

• For calculating the true risk, we need the entire distribution D that generates our dataset and also a true labeling function hT.
Empirical Risk
• Since we use a small sample as the data for our model, we talk about empirical risk here. Over a sample (x1, y1), ..., (xn, yn) it is calculated as follows:

R_emp(h) = (1/n) Σ_{i=1..n} L(h(xi), yi)
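The empirical risk described above can be sketched in a few lines of Python; the function names here are illustrative, not from any particular library:

```python
def zero_one_loss(y_pred, y_true):
    # 0-1 loss: charge 1 when the prediction is wrong, 0 otherwise
    return 0 if y_pred == y_true else 1

def empirical_risk(h, samples):
    # average the per-sample losses over the sampled dataset
    return sum(zero_one_loss(h(x), y) for x, y in samples) / len(samples)

# a model that always predicts 1, evaluated on three labeled samples
risk = empirical_risk(lambda x: 1, [(0.2, 1), (0.6, 1), (0.9, 0)])  # 1/3
```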
Example
• Consider the dataset which shows whether each composition of a new drug developed by a pharmaceutical company yielded a positive result or not.
• Based on the above dataset, we develop a model h that takes in the percentages of both components,
x = (Component 1 (%), Component 2 (%)), and predicts y = 0
if Component 2 (%) > 0.5 (and y = 1 otherwise).

Let us now compare our model's predictions with the actual ones.
We use the 0-1 loss function, which simply assigns a loss of 1 when the model is wrong and 0 otherwise. Then we compute the empirical risk as follows:

(1/5)(0 + 0 + 1 + 1 + 0) = 2/5
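As a sketch, the computation above can be reproduced in Python. The five (c1, c2, y) rows below are hypothetical stand-ins, since the original table is not reproduced here; they are chosen only so that the per-sample losses come out as 0, 0, 1, 1, 0:

```python
# Hypothetical rows (Component 1 %, Component 2 %, actual y) standing in
# for the dataset from the slides, which is not reproduced here.
data = [
    (0.2, 0.8, 0),
    (0.5, 0.9, 0),
    (0.7, 0.3, 0),
    (0.1, 0.6, 1),
    (0.4, 0.2, 1),
]

def h(c1, c2):
    # the model from the slides: predict 0 if Component 2 (%) > 0.5, else 1
    return 0 if c2 > 0.5 else 1

# 0-1 loss per sample, then the empirical risk as their average
losses = [0 if h(c1, c2) == y else 1 for c1, c2, y in data]
risk = sum(losses) / len(losses)  # (0 + 0 + 1 + 1 + 0) / 5 = 2/5
```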


True Risk
• Let us assume that the amounts of components 1 and 2 in D are selected from the interval [0, 1] in a random, uniform and independent manner.
• Let the true labeling function hT label a sample as hT = 0 (did not yield a result) if its (c1, c2) combination is within distance 1/2 of the point (1, 1).
• The plot looks as follows:

Here, the gray area is the region where the model's prediction was 0 and the red circle represents the actual region where the sample did not yield results.

Here, the difference between the area of the gray rectangle and the area of the quadrant gives the true risk.

True risk = 0.5 × 1 − 0.25 × π × (0.5)² ≈ 0.30
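The ≈ 0.30 figure can be sanity-checked with a small Monte Carlo simulation (a sketch, not part of the original slides): draw many uniform (c1, c2) points, compare the model's prediction with the true label hT, and average the disagreement.

```python
import random

random.seed(42)

n = 200_000
errors = 0
for _ in range(n):
    c1, c2 = random.random(), random.random()
    pred = 0 if c2 > 0.5 else 1  # the model h
    # true label hT: 0 if (c1, c2) is within distance 1/2 of (1, 1)
    true = 0 if (c1 - 1) ** 2 + (c2 - 1) ** 2 <= 0.25 else 1
    errors += pred != true

rate = errors / n
# analytically: 0.5 - pi/16 ≈ 0.304; the estimate should land close to that
```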
Empirical Risk Minimization
• While building our machine learning model, we choose a function that reduces the difference between the actual and the predicted output, i.e. the empirical risk.

• We aim to minimize the empirical risk as a proxy for minimizing the true risk, in the hope that the empirical risk is close to the true risk.
• Empirical risk minimization depends on four factors:
– The size of the dataset - the more data we get, the more the empirical risk approaches the true risk.
– The complexity of the true distribution - if the underlying distribution is too complex, we might need more data to get a good approximation of it.
– The class of functions we consider - if the function class is too large, the gap between empirical and true risk (the estimation error) can be very high, and the model may overfit.
– The loss function - a loss function that gives very high loss in certain conditions can let a few samples dominate the empirical risk.
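As an illustration of choosing a function that minimizes empirical risk, the sketch below searches a small one-parameter class {h_t : h_t predicts 0 when c2 > t} for the threshold with the lowest 0-1 empirical risk. The data and the function class are made up for this example:

```python
# Hypothetical labeled sample: (Component 1 %, Component 2 %, actual y)
data = [(0.2, 0.8, 0), (0.5, 0.9, 0), (0.7, 0.3, 1), (0.1, 0.6, 0), (0.4, 0.2, 1)]

def emp_risk(t):
    # empirical 0-1 risk of h_t, which predicts 0 when c2 > t and 1 otherwise
    preds = [0 if c2 > t else 1 for _, c2, _ in data]
    return sum(p != y for p, (_, _, y) in zip(preds, data)) / len(data)

# ERM over the finite class t in {0.0, 0.1, ..., 1.0}:
# pick the threshold whose empirical risk is smallest
best_t = min((t / 10 for t in range(11)), key=emp_risk)
```

With a larger class (e.g. arbitrary regions of the unit square) the sample could always be fit perfectly, which is exactly the overfitting risk the third factor above describes.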
L2 Regularization
• To handle the problem of overfitting, we use regularization techniques.
• When applied to linear regression, L2 regularization is known as ridge regression.
• In ridge regression, predictors that are insignificant are penalized by shrinking their coefficients.
• This shrinkage of the coefficients also helps deal with independent variables that are highly correlated.
• It adds the "squared magnitude" of the coefficients - the sum of squares of the weights of all features - as the penalty term to the loss function:

Loss = Σ_i (y_i − ŷ_i)² + λ Σ_j w_j²

– Here, λ is the regularization parameter.
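A minimal sketch of the penalty in action, using NumPy and the closed-form ridge solution w = (XᵀX + λI)⁻¹Xᵀy. The data here is synthetic, generated only to show the shrinkage effect:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))             # 50 samples, 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=50)

lam = 0.1  # the regularization parameter λ
# ordinary least squares vs. ridge: the λI term shrinks the weights
w_ols = np.linalg.solve(X.T @ X, X.T @ y)
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
```

Larger λ shrinks the weights further toward zero; λ = 0 recovers ordinary least squares.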
