
CS 4780/5780 Homework 5

Due: Wednesday 04/10/2024 11:59pm on Gradescope

Note: You can work in a group of up to 3. Please include your teammates’ NetIDs
and names on the front page and form a group on Gradescope.

Problem 1: Back AK-NN [16 points]


K-NN is back again! We will first learn, visually, the relation between $k$ in the K-NN algorithm and the bias and variance of the error it produces.
Assume we have a large dataset of patient records, each tagged with three properties: whether atrial fibrillation (AFib) was detected, the age of the patient, and how many minutes of exercise they put in every week on average. This data is plotted in figure 1. The red circles represent patients with AFib (class 1), while the blue circles represent those without (class 0). We want to predict the probability of onset of atrial fibrillation in new patients.

1. (1) Independent of how K-NN treats classification, say you are asked to draw a single line to demarcate the boundary between a high chance of AFib (red area) and a low chance of AFib (blue area). Where would you draw the line? Make a copy of figure 1 in your submission and overlay your line. Keep your line smooth.

While we haven't discussed the mathematical form of the bias, variance, and noise terms for classification models, we can safely assume that the variance will depend on the number of times $h_D(x)$ disagrees with $\bar{h}(x)$, while the bias will depend on how often $\bar{h}(x)$ differs from $\bar{y}(x)$.
Here you can estimate $\bar{y}(x)$ (the expected class for $x$) as the class predicted by the line you drew in part 1, $h_D(x)$ is the class predicted by the K-NN classifier trained on dataset $D$, and $\bar{h}(x)$ can be estimated by taking a majority vote of the classes predicted by the five classifiers $h_D(x)$, one per sampled dataset $D$.
For the sampled datasets $D$ given in figure 2, we will now visualize what happens when $h_D(x)$ is obtained from $A_k(D)$ (the K-NN algorithm with the desired $k$). We will repeat this process of training 5 classifiers for $k = 1, 10, 30$. Each dataset has 30 points: 10 from the red class, 20 from the blue.
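
Note (optional illustration): the disagreement counts described above can also be tallied in code. The following is a minimal Python sketch on synthetic data; the dataset generator, the linear rule standing in for $\bar{y}(x)$, and all constants are invented for illustration and are not the datasets of figure 2. It trains a 1-NN classifier on each of five sampled datasets, estimates $\bar{h}(x)$ by majority vote, and counts how often $h_D(x)$ disagrees with $\bar{h}(x)$ versus how often $\bar{h}(x)$ disagrees with $\bar{y}(x)$.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def sample_dataset(n_red=10, n_blue=20):
    """Hypothetical stand-in for one sampled dataset D: 10 red, 20 blue points (exercise, age)."""
    red = np.column_stack([rng.normal(60, 15, n_red), rng.normal(70, 8, n_red)])
    blue = np.column_stack([rng.normal(150, 30, n_blue), rng.normal(50, 10, n_blue)])
    X = np.vstack([red, blue])
    y = np.array([1] * n_red + [0] * n_blue)
    return X, y

def y_bar(X):
    """Stand-in for the hand-drawn line from part 1 (an arbitrary linear rule)."""
    exercise, age = X[:, 0], X[:, 1]
    return (age - 0.3 * exercise > 40).astype(int)

datasets = [sample_dataset() for _ in range(5)]
classifiers = [KNeighborsClassifier(n_neighbors=1).fit(X, y) for X, y in datasets]

lhs = rhs = 0                                     # LHS ~ variance proxy, RHS ~ bias proxy
for i, (X, _) in enumerate(datasets):
    preds = np.array([clf.predict(X) for clf in classifiers])  # all five h_D evaluated on D_i
    h_bar = (preds.sum(axis=0) >= 3).astype(int)               # majority vote = estimate of h_bar(x)
    lhs += np.sum(preds[i] != h_bar)              # how often h_{D_i} disagrees with h_bar on D_i
    rhs += np.sum(h_bar != y_bar(X))              # how often h_bar disagrees with y_bar on D_i

print("disagreements h_D vs h_bar (variance-like):", lhs)
print("disagreements h_bar vs y_bar (bias-like):  ", rhs)
```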

2. (2) Let's start with $k = 1$. For your reference, we have given the Voronoi cell boundaries for all 5 datasets in figure 3. Refer to them in your submission to state and justify whether each statement below is true or false.

(a) For most datasets, there are blue and red regions both above and below the line you
drew in part 1.
(b) The 1-NN will correctly predict the class for all train points.
Takeaway: This shows whether the classifier is overfitting or underfitting.

Figure 1: Patients with the heart condition (red) and with a healthy heart (blue), plotted against Age (y-axis) and Minutes of Exercise Weekly (x-axis)

Figure 2: Datasets used to train the K-NN classifiers

Figure 3: 1-NN boundaries for given datasets

3. (3) Show, for $k = 1$ and the datasets given, that the variance is higher than the bias. Specifically, show that
\[
\sum_{D} \sum_{(x,y)\in D} \mathbb{I}\big[h_D(x) \neq \bar{h}(x)\big] \;>\; \sum_{D} \sum_{(x,y)\in D} \mathbb{I}\big[\bar{h}(x) \neq \bar{y}(x)\big], \qquad \text{where } h_D = A_1(D)
\]
and $\mathbb{I}$ is the indicator function (in words: you need to show that $h_D(x)$ disagrees with $\bar{h}(x)$ more often than $\bar{h}(x)$ disagrees with $\bar{y}(x)$). You will not need to sum over all points; instead, pick a few points per dataset to show that the RHS is low while the LHS is high.
Hint: focus on the outliers in each dataset, because all of $h_D(x)$, $\bar{h}(x)$, $\bar{y}(x)$ will agree for the other points.
4. (2) Now let's try $k = 30$. Notice that $k = |D|$. First, for each dataset in figure 2, draw the 30-NN classification boundary by shading the area of the plot red and blue based on its prediction. If you are not submitting a colored submission, shade the red area and leave the blue area unshaded.
5. (2) By looking at your answers in part 4, state and justify whether the following statements
are true or false.

(a) For most datasets, there are blue and red regions both above and below the line you
drew in part 1.
(b) The 30-NN will correctly predict the class for all train points.

6. (3) Show, for $k = 30$ and the datasets given, that the bias is higher than the variance. Specifically, show that
\[
\sum_{D} \sum_{(x,y)\in D} \mathbb{I}\big[h_D(x) \neq \bar{h}(x)\big] \;<\; \sum_{D} \sum_{(x,y)\in D} \mathbb{I}\big[\bar{h}(x) \neq \bar{y}(x)\big], \qquad \text{where } h_D = A_{30}(D)
\]
and $\mathbb{I}$ is the indicator function (in words: you need to show that $h_D(x)$ disagrees with $\bar{h}(x)$ less often than $\bar{h}(x)$ disagrees with $\bar{y}(x)$). Again, you will not need to manually sum over all points in this part to show that the LHS is low and the RHS is high for each dataset.

7. Finally, let's look at $k = 5$. For each dataset in figure 2, draw the best approximation of the classification boundaries that a 5-NN would make. You don't have to be exact; a hand-drawn shading will do. (For one way to visualize such boundaries programmatically, see the plotting sketch after this problem.)

8. (3) Simply by eyeballing, what conclusions can you make about the 5-NN classifiers? Is their variance lower than that of 1-NN? Is their bias lower than that of 30-NN? No need to justify. Refer to your answer in part 7.
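
Note (optional illustration): one way to sanity-check hand-drawn boundaries like those in parts 4 and 7 is to evaluate a K-NN classifier on a dense grid and shade its predictions, as in the Python sketch below. The 30-point dataset here is synthetic and only stands in for a dataset from figure 2; the axis ranges and colors are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 30-point dataset (10 red / 20 blue), standing in for one dataset from figure 2.
rng = np.random.default_rng(0)
red = np.column_stack([rng.normal(60, 15, 10), rng.normal(70, 8, 10)])     # exercise, age
blue = np.column_stack([rng.normal(150, 30, 20), rng.normal(50, 10, 20)])
X = np.vstack([red, blue])
y = np.array([1] * 10 + [0] * 20)

# Evaluate each classifier on a dense grid and shade its predicted regions.
xx, yy = np.meshgrid(np.linspace(0, 250, 300), np.linspace(20, 100, 300))
grid = np.column_stack([xx.ravel(), yy.ravel()])

fig, axes = plt.subplots(1, 3, figsize=(15, 4), sharex=True, sharey=True)
for ax, k in zip(axes, [1, 5, 30]):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    zz = clf.predict(grid).reshape(xx.shape)
    ax.contourf(xx, yy, zz, levels=[-0.5, 0.5, 1.5], colors=["lightblue", "lightcoral"])
    ax.scatter(X[:, 0], X[:, 1], c=np.where(y == 1, "red", "blue"), edgecolor="k")
    ax.set_title(f"{k}-NN decision regions")
    ax.set_xlabel("Minutes of Exercise Weekly")
axes[0].set_ylabel("Age")
plt.tight_layout()
plt.show()
```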

Problem 2: Regressi-knn [15 points]
Enough eye-balling. We will now understand the relation between $k$ in the K-NN algorithm and its error terms mathematically.
We will be using K-NN for a regression task, since it is easiest to do the derivations for regression (and the mean squared error loss).
Suppose we have data generated by a model $y_i = f(x_i) + \varepsilon_i$, where the $\varepsilon_i$ are i.i.d. random variables with $\mathbb{E}[\varepsilon_i] = 0$ and $\mathrm{Var}[\varepsilon_i] = \sigma^2$. Denote $D$ as the training set. The expected prediction error at a single $x$ is
\[
\mathrm{EPE}_k(x) = \mathbb{E}_{D,(x,y)}\big[(y - h_k(x))^2\big],
\]
where $y = f(x) + \varepsilon$. (Here, $\varepsilon$ is also i.i.d. and from the same distribution as the $\varepsilon_i$.) For simplicity, we assume that the values of $x_i$ and $x$ in the training sample are fixed in advance (nonrandom), while the values of $y_i$ and $y$ are random variables as defined. In the specific K-NN regression model,
\[
h_k(x) = \frac{1}{k} \sum_{l=1}^{k} y_{(l)} = \frac{1}{k} \sum_{l=1}^{k} \big(f(x_{(l)}) + \varepsilon_{(l)}\big),
\]

where $x_{(l)}$ is the $l$-th closest point to $x$ in $D$.


Decompose $\mathrm{EPE}_k(x)$ into three components: variance, noise, and bias. Each term should be represented in terms of $x_{(1)}, \cdots, x_{(l)}$, $x$, $\sigma$, and $f$. Using your expression, argue that the variance will drop as $k$ is increased.
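
Note (optional illustration): before doing the algebra, it can help to see the decomposition numerically. The Python sketch below fixes the training inputs $x_i$, repeatedly redraws the noise $\varepsilon_i$ to simulate new datasets $D$, refits the K-NN regressor at a single test point, and estimates the variance and squared-bias terms for several $k$. The choices of $f$, $\sigma$, and the $x_i$ are arbitrary assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed design: the x_i are nonrandom; only the noise (and hence the y_i) is redrawn each trial.
n, sigma = 100, 0.5
xs = np.sort(rng.uniform(0, 1, n))        # fixed training inputs x_1, ..., x_n
x0 = 0.5                                  # the single test point x

def f(x):
    return np.sin(2 * np.pi * x)          # arbitrary "true" function, chosen only for illustration

def knn_predict(x, xs, ys, k):
    """h_k(x): average the y-values of the k training points closest to x."""
    idx = np.argsort(np.abs(xs - x))[:k]
    return ys[idx].mean()

for k in [1, 5, 25, 75]:
    preds = []
    for _ in range(2000):                 # each draw of the noise plays the role of a new dataset D
        ys = f(xs) + rng.normal(0, sigma, n)
        preds.append(knn_predict(x0, xs, ys, k))
    preds = np.array(preds)
    h_bar = preds.mean()                  # Monte Carlo estimate of the expected predictor at x0
    variance = preds.var()
    bias_sq = (h_bar - f(x0)) ** 2        # here y_bar(x0) = f(x0), since E[eps] = 0
    print(f"k={k:3d}  variance={variance:.4f}  bias^2={bias_sq:.4f}  noise=sigma^2={sigma**2:.4f}")
```

The printed variance should shrink as $k$ grows while the squared bias grows, which is the tradeoff the parts below ask you to reason about.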

1. (2) Let's start off by finding $\bar{y}(x) = \mathbb{E}_{y|x}[y(x)]$.

2. (Bonus) Let $\bar{h}_k(x)$ be the expected classifier. What can we say about $\mathbb{E}[\bar{h}_k(x) - y(x)]$?

3. (5) Prove that
\[
\mathrm{EPE}_k(x) = \mathbb{E}_{D,(x,y)}\big[(y - h_k(x))^2\big]
\]
can be reduced to
\[
\mathrm{EPE}_k(x) = \mathbb{E}_{D,(x,y)}\big[(h_k(x) - \bar{h}_k(x))^2\big] + \mathbb{E}_{D,(x,y)}\big[(\bar{h}_k(x) - \bar{y}(x))^2\big] + \mathbb{E}_{D,(x,y)}\big[(\bar{y}(x) - y(x))^2\big].
\]
Identify which of these terms corresponds to the bias, which to the noise, and which to the variance.

4. (8) Can you simplify the terms further by representing them in terms of $x_{(1)}, \ldots, x_{(l)}$, $x$, $\sigma$, and $f$?

Problem 3: Overfitting/Underfitting [6 points]
Which of the following strategies can be used when overfitting / underfitting happens?

strategy                      | overfitting | underfitting
------------------------------|-------------|-------------
increase the regularization   |             |
decrease the regularization   |             |
use fewer features            |             |
use more features             |             |
use a more complex model      |             |
use a less complex model      |             |

Problem 4: Regularization Mitigates Overfitting [15 points]


In this question, we are going to investigate how adding $\ell_2$ regularization can help mitigate the effect of overfitting for ordinary least squares regression. First, recall that in our notes for lecture 11, we mention that we can rewrite the objective function of $\ell_2$-regularized least squares regression (or ridge regression)
\[
\min_{\vec{w}} \; \sum_{i=1}^{n} (\vec{w}^{\,T} \vec{x}_i - y_i)^2 + \lambda \|\vec{w}\|_2^2
\]
as
\[
\min_{\vec{w}} \; \sum_{i=1}^{n} (\vec{w}^{\,T} \vec{x}_i - y_i)^2 \quad \text{subject to} \quad \|\vec{w}\|_2^2 \le B.
\]
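
Note (optional illustration): the link between the penalized and constrained forms can be seen numerically: as $\lambda$ grows, the norm of the penalized solution shrinks, so each $\lambda$ implicitly selects some budget $B$. The Python sketch below uses the closed-form ridge solution on arbitrary random data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))                      # arbitrary design matrix (rows are x_i^T)
w_true = rng.normal(size=d)
y = X @ w_true + rng.normal(0, 0.1, n)

for lam in [0.0, 0.1, 1.0, 10.0, 100.0]:
    # closed-form minimizer of sum_i (w^T x_i - y_i)^2 + lam * ||w||_2^2
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    print(f"lambda = {lam:6.1f}   ||w||_2^2 = {w @ w:.4f}")
```

Each printed norm could serve as the budget $B$ under which the same $\vec{w}$ also solves the constrained form; that correspondence is what the rewriting above asserts.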
To simplify our analysis, we are going to focus on the second expression. In addition, we are going to assume the following:

(i) Each data point $(\vec{x}_i, y_i)$ is drawn identically and independently from the distribution $P$; namely, the dataset $D \sim P^n$.

(ii) For any $(\vec{x}, y)$ sampled from $P$, we have $\|\vec{x}\|_2^2 = 1$.

With the above assumptions, we are going to do the following:

1. Notice that $\vec{w}(D)$ (the minimizer of the constrained problem above) is a function of $D$, and since $D$ is random, so is $\vec{w}(D)$. Define $\bar{w} = \mathbb{E}_D[\vec{w}(D)]$. Show that
\[
\|\vec{w}(D) - \bar{w}\|_2^2 \le 4B^2
\]
using the triangle inequality
\[
\|a - b\|_2 \le \|a\|_2 + \|b\|_2.
\]

2. Define the model $h_D(\vec{x}) = \vec{w}(D)^T \vec{x}$ and $\bar{h}(\vec{x}) = \mathbb{E}_D[h_D(\vec{x})]$. Show that the variance of the model satisfies
\[
\mathbb{E}_{\vec{x},D}\big[(h_D(\vec{x}) - \bar{h}(\vec{x}))^2\big] \le 4B^2
\]
by first showing that
\[
h_D(\vec{x}) - \bar{h}(\vec{x}) = (\vec{w}(D) - \bar{w})^T \vec{x}
\]

and then using the Cauchy-Schwarz inequality
\[
(a^T b)^2 \le (a^T a)(b^T b)
\]
to conclude the result.

Takeaway: By adding regularization, we essentially bound the variance of the model, which reduces overfitting.
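
Note (optional illustration): as a rough empirical companion to this takeaway, the Python sketch below refits ridge regression on many independently sampled datasets and estimates the model variance $\mathbb{E}_{\vec{x},D}[(h_D(\vec{x}) - \bar{h}(\vec{x}))^2]$ for a weak and a strong regularizer; the stronger penalty gives the smaller estimated variance. All data are synthetic and arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test, trials = 5, 20, 200, 500
w_true = rng.normal(size=d)

def unit_rows(m):
    """Sample m points with ||x||_2 = 1, matching assumption (ii)."""
    X = rng.normal(size=(m, d))
    return X / np.linalg.norm(X, axis=1, keepdims=True)

X_test = unit_rows(n_test)                       # fixed test points used to approximate E over x

for lam in [0.01, 10.0]:
    preds = np.empty((trials, n_test))
    for t in range(trials):                      # each trial draws a fresh dataset D ~ P^n
        X = unit_rows(n_train)
        y = X @ w_true + rng.normal(0, 0.5, n_train)
        # closed-form ridge solution for the penalized objective with weight lam
        w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
        preds[t] = X_test @ w                    # h_D(x) evaluated on the test points
    h_bar = preds.mean(axis=0)                   # estimate of h_bar(x)
    variance = ((preds - h_bar) ** 2).mean()     # estimate of E_{x,D}[(h_D(x) - h_bar(x))^2]
    print(f"lambda = {lam:5.2f}   estimated model variance = {variance:.4f}")
```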
