
CS 771A: Introduction to Machine Learning Midsem Exam (15 Sep 2019)

Name: SAMPLE SOLUTIONS    (80 marks)
Roll No:    Dept.:

Instructions:
1. This question paper contains 3 pages (6 sides of paper). Please verify.
2. Write your name, roll number, department in block letters neatly with ink on each page of this question paper.
3. If you don’t write your name and roll number on all pages, pages may get lost when we unstaple the paper for scanning.
4. Write your final answers neatly with a blue/black pen. Pencil marks may get smudged.
5. Don’t overwrite/scratch answers, especially in MCQ and T/F questions. We will entertain no requests for leniency.

Q1. Write T or F for True/False (write only in the box on the right-hand side). (10x2=20 marks)

1. When using kNN to do classification, using a large value of k always gives better performance since more training points are used to decide the label of the test point. (F)
2. Cross validation means taking a small subset of the test data and using it to get an estimate of how well our algorithm will perform on the entire test dataset. (F)
3. The EM algo does not require a careful initialization of model parameters since it anyway considers all possible assignments of latent variables with different weights. (F)
4. If 𝑋 and 𝑌 are two real-valued random variables such that Cov(𝑋, 𝑌) < 0, then at least one of 𝑋 or 𝑌 must have negative variance, i.e. either 𝕍𝑋 < 0 or 𝕍𝑌 < 0. (F)
5. If 𝐚 ∈ ℝ^2 is a constant vector and 𝑓: ℝ^2 → ℝ is such that 𝑔(𝐱) = 𝑓(𝐱) + 𝐚^⊤𝐱 is a non-convex function, then ℎ(𝐱) = 𝑓(𝐱) − 𝐚^⊤𝐱 must be a non-convex function too. (T)
6. The SVM is so named because the decision boundary of the SVM classifier passes through the data points which are marked as being support vectors. (F)
7. Suppose 𝑋 is a real-valued random variable with variance 𝕍𝑋 = 9. Then the random variable 𝑌 defined as 𝑌 = 𝑋 − 2 will always satisfy 𝕍𝑌 = 𝕍𝑋 − 2^2 = 5. (F)
8. The LwP algorithm for binary classification always gives a linear decision boundary if we use one prototype per class and Euclidean distance to measure distances. (T)
9. If 𝑓, 𝑔: ℝ^2 → ℝ are two non-convex functions, then the function ℎ: ℝ^2 → ℝ defined as ℎ(𝐱) = 𝑓(𝐱) + 𝑔(𝐱) must always be non-convex too. (F)
10. If we learn models {𝐰^𝑐} for 𝑐 = 1, …, 𝐶 for multiclassification using the Crammer-Singer loss function, these models can be used to assign a PMF over the class labels [𝐶]. (T)
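As an illustration of statement 7, here is a quick NumPy check that shifting a random variable by a constant leaves its variance unchanged; the sample size and scaling are arbitrary choices for this sketch.

```python
import numpy as np

# Numeric check of statement 7: variance is unaffected by an additive shift.
rng = np.random.default_rng(0)
x = 3.0 * rng.standard_normal(1_000_000)   # a sample whose variance is close to 9
y = x - 2.0                                # the shifted variable Y = X - 2
print(x.var(), y.var())                    # both are approximately 9, not 5
```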

Q2. Phase retrieval is used in X-ray crystallography. Let 𝐱^𝑖 ∈ ℝ^𝑑, 𝑖 ∈ [𝑛] be features and 𝑦^𝑖 ∈ ℝ be labels. All data points are independent. However, we only get to see the absolute value of the labels, i.e. the train data is {(𝐱^𝑖, 𝑢^𝑖)} for 𝑖 ∈ [𝑛], where 𝑢^𝑖 = |𝑦^𝑖|. Let 𝑧^𝑖 ∈ {−1, 1} be latent variables for the missing label signs (aka phases). Use the data likelihood function ℙ[𝑢^𝑖 | 𝑧^𝑖, 𝐱^𝑖, 𝐰] = 𝒩(𝑢^𝑖 𝑧^𝑖; 𝐰^⊤𝐱^𝑖, 1). Note that this is a discriminative setting (i.e. the 𝐱^𝑖 are constants). Expressions in your answers may contain unspecified normalization constants. Give only brief derivations. (8+6+6=20 marks)

2.1 Assuming ℙ[𝑧^𝑖 = 𝑐 | 𝐱^𝑖, 𝐰] = ℙ[𝑧^𝑖 = 𝑐] = 0.5 for 𝑐 ∈ {−1, 1} (i.e. a uniform prior on 𝑧^𝑖 that does not depend on features or model), derive an expression for ℙ[𝑧^𝑖 = 1 | 𝑢^𝑖, 𝐱^𝑖, 𝐰]. Using this, derive an expression for the MAP estimate

$$\underset{c \in \{-1,+1\}}{\arg\max}\ \mathbb{P}[z^i = c \mid u^i, \mathbf{x}^i, \mathbf{w}]$$
Applying Bayes' rule and ℙ[𝑧^𝑖 = 1 | 𝐱^𝑖, 𝐰] = 0.5, we have (omitting normalization constants)

$$\mathbb{P}[z^i = 1 \mid u^i, \mathbf{x}^i, \mathbf{w}] = \frac{\mathbb{P}[u^i \mid z^i = 1, \mathbf{x}^i, \mathbf{w}] \cdot \mathbb{P}[z^i = 1 \mid \mathbf{x}^i, \mathbf{w}]}{\mathbb{P}[u^i \mid \mathbf{x}^i, \mathbf{w}]} \propto \exp\left(-\frac{(u^i - \mathbf{w}^\top \mathbf{x}^i)^2}{2}\right)$$

Similarly, we have

$$\mathbb{P}[z^i = -1 \mid u^i, \mathbf{x}^i, \mathbf{w}] \propto \exp\left(-\frac{(-u^i - \mathbf{w}^\top \mathbf{x}^i)^2}{2}\right)$$

This tells us that we should set 𝑧^𝑖 to whichever value leads to the smaller residual error. A nice way of saying this is

$$\underset{c \in \{-1,+1\}}{\arg\max}\ \mathbb{P}[z^i = c \mid u^i, \mathbf{x}^i, \mathbf{w}] = \operatorname{sign}\left(\left|-u^i - \mathbf{w}^\top \mathbf{x}^i\right| - \left|u^i - \mathbf{w}^\top \mathbf{x}^i\right|\right)$$

where we break ties (when both terms on the RHS are equal) arbitrarily, say in favour of 1. We may choose to break ties any way we wish since sign(0) is not cleanly defined and does not matter in these calculations.
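As an aside, when 𝑢^𝑖 > 0 this rule simplifies further, since

$$\left|-u^i - \mathbf{w}^\top\mathbf{x}^i\right| - \left|u^i - \mathbf{w}^\top\mathbf{x}^i\right| > 0 \iff \left(u^i + \mathbf{w}^\top\mathbf{x}^i\right)^2 > \left(u^i - \mathbf{w}^\top\mathbf{x}^i\right)^2 \iff 4\,u^i\,\mathbf{w}^\top\mathbf{x}^i > 0,$$

so the MAP estimate of the sign is simply sign(𝐰^⊤𝐱^𝑖) whenever 𝑢^𝑖 > 0.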

2.2 Derive an expression for ℙ[𝐰 | 𝐮, 𝐳, 𝑋] using a standard Gaussian prior ℙ[𝐰] = 𝒩(𝟎, 𝐼_𝑑). Then derive an expression for the MAP estimate for 𝐰, i.e. arg max_{𝐰∈ℝ^𝑑} ℙ[𝐰 | 𝐮, 𝐳, 𝑋] (here we are using the shorthand notation 𝑋 = [𝐱^1, …, 𝐱^𝑛]^⊤ ∈ ℝ^(𝑛×𝑑), 𝐮 = [𝑢^1, …, 𝑢^𝑛] ∈ ℝ^𝑛, 𝐳 = [𝑧^1, …, 𝑧^𝑛] ∈ ℝ^𝑛).

Using independence, Bayes' rule, and ignoring proportionality constants as before gives us

$$\mathbb{P}[\mathbf{w} \mid \mathbf{u}, \mathbf{z}, X] \propto \mathbb{P}[\mathbf{u} \mid \mathbf{w}, \mathbf{z}, X] \cdot \mathbb{P}[\mathbf{w}] \propto \exp\left(-\frac{1}{2}\|\mathbf{w}\|_2^2\right) \cdot \prod_{i=1}^{n} \exp\left(-\frac{(u^i z^i - \mathbf{w}^\top \mathbf{x}^i)^2}{2}\right)$$

Note that the expression for ℙ[𝑢^𝑖 | 𝑧^𝑖, 𝐱^𝑖, 𝐰] is available to us from the question text itself. Taking logarithms as usual gives us

$$\hat{\mathbf{w}}_{\text{MAP}} = \underset{\mathbf{w} \in \mathbb{R}^d}{\arg\min}\ \|\mathbf{w}\|_2^2 + \sum_{i=1}^{n} \left(u^i z^i - \mathbf{w}^\top \mathbf{x}^i\right)^2$$

Applying first-order optimality and using the shorthand 𝑣^𝑖 = 𝑢^𝑖 𝑧^𝑖 and 𝐯 = [𝑣^1, …, 𝑣^𝑛] ∈ ℝ^𝑛,

$$\hat{\mathbf{w}}_{\text{MAP}} = (X^\top X + I_d)^{-1} X^\top \mathbf{v}$$
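To spell out that last step: differentiating the objective with respect to 𝐰 and setting the gradient to zero gives

$$2\mathbf{w} + \sum_{i=1}^{n} 2\left(\mathbf{w}^\top\mathbf{x}^i - v^i\right)\mathbf{x}^i = \mathbf{0} \;\Longleftrightarrow\; (X^\top X + I_d)\,\mathbf{w} = X^\top\mathbf{v},$$

which is solved by the closed form above.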

2.3 Using the above derivations, give the pseudocode (as we write in lecture slides, i.e. not necessarily Python or C code, but with sufficient details of the algorithm updates) for an alternating optimization algorithm for estimating the model 𝐰 in the presence of the latent variables. Give precise update expressions in your pseudocode and not just vague statements.

AltOpt for Phase Retrieval

1. Initialize the model 𝐰
2. For each 𝑖 ∈ [𝑛], update 𝑧^𝑖 using the current 𝐰:
   a. Let 𝑧^𝑖 = sign(|−𝑢^𝑖 − 𝐰^⊤𝐱^𝑖| − |𝑢^𝑖 − 𝐰^⊤𝐱^𝑖|)
   b. Break ties arbitrarily
3. Update 𝐰 using the current {𝑧^𝑖}:
   a. Let 𝑣^𝑖 = 𝑢^𝑖 𝑧^𝑖 and 𝐯 = [𝑣^1, …, 𝑣^𝑛]
   b. Let 𝐰 = (𝑋^⊤𝑋 + 𝐼_𝑑)^{−1} 𝑋^⊤𝐯
4. Repeat steps 2–3 until convergence
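A minimal NumPy sketch of this alternating scheme is given below; the function name, random initialization, and fixed iteration count are illustrative choices rather than part of the official solution.

```python
import numpy as np

def alt_opt_phase_retrieval(X, u, n_iters=50, seed=0):
    """Alternating optimization for the phase retrieval model of Q2.
    X is the (n, d) feature matrix and u the (n,) vector of absolute labels u^i = |y^i|."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(d)                            # step 1: initialize the model
    for _ in range(n_iters):
        # step 2: update the latent signs z^i using the current w (MAP rule from 2.1)
        s = X @ w
        z = np.sign(np.abs(-u - s) - np.abs(u - s))
        z[z == 0] = 1                                     # break ties in favour of +1
        # step 3: update w using the current signs (closed form from 2.2)
        v = u * z
        w = np.linalg.solve(X.T @ X + np.eye(d), X.T @ v)
    return w
```

On synthetic data one would expect the recovered 𝐰 to match a ground-truth model only up to a global sign flip, since flipping 𝐰 together with all the signs 𝑧^𝑖 leaves the likelihood unchanged.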

Q3. We have seen that algorithms such as EM require weighted optimization problems to be solved, where different data points may have different weights. Consider the following problem of L2-regularized squared hinge loss minimization, but with different weights per data point. The data points are 𝐱^𝑖 ∈ ℝ^𝑑 and the labels are 𝑦^𝑖 ∈ {−1, 1}. The weights 𝑞_𝑖 are all known (i.e. are constants) and are all strictly positive, i.e. 𝑞_𝑖 > 0 for all 𝑖 = 1, …, 𝑛. (3+2+5=10 marks)

$$\underset{\mathbf{w} \in \mathbb{R}^d}{\arg\min}\ \frac{1}{2}\|\mathbf{w}\|_2^2 + \sum_{i=1}^{n} q_i \cdot \left(\left[1 - y^i \cdot \mathbf{w}^\top \mathbf{x}^i\right]_+\right)^2$$

3.1 As we did in assignment 1, rewrite the above problem as an equivalent problem that has inequality constraints in it (the above problem does not have any constraints).

$$\underset{\mathbf{w} \in \mathbb{R}^d,\ \boldsymbol{\xi} \in \mathbb{R}^n}{\arg\min}\ \frac{1}{2}\|\mathbf{w}\|_2^2 + \sum_{i=1}^{n} q_i \cdot \xi_i^2 \qquad \text{s.t.}\ \ y^i \cdot \mathbf{w}^\top\mathbf{x}^i \geq 1 - \xi_i\ \text{ for all } i \in [n]$$

Similar to what we observed in assignment 1, even in this case, including or omitting the constraints 𝜉_𝑖 ≥ 0 does not affect the solution.
3.2 Then introduce dual variables as appropriate and write down the expression for the dual problem as a max-min problem (no need to write the Lagrangian expression separately).

The Lagrangian is

$$\mathcal{L}(\mathbf{w}, \boldsymbol{\xi}, \boldsymbol{\alpha}) = \frac{1}{2}\|\mathbf{w}\|_2^2 + \sum_{i=1}^{n} q_i \cdot \xi_i^2 + \sum_{i=1}^{n} \alpha_i\left(1 - \xi_i - y^i \cdot \mathbf{w}^\top\mathbf{x}^i\right)$$

Thus, the dual problem is

$$\max_{\boldsymbol{\alpha} \geq \mathbf{0}}\ \min_{\mathbf{w}, \boldsymbol{\xi}}\ \left\{ \frac{1}{2}\|\mathbf{w}\|_2^2 + \sum_{i=1}^{n} q_i \cdot \xi_i^2 + \sum_{i=1}^{n} \alpha_i\left(1 - \xi_i - y^i \cdot \mathbf{w}^\top\mathbf{x}^i\right) \right\}$$

3.3 Simplify the dual by eliminating the primal variables and write down the expression for the simplified dual. Show only brief derivations.

Applying first-order optimality to the inner unconstrained optimization problem gives us:

$$\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = \mathbf{0} \;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{n} \alpha_i y^i \cdot \mathbf{x}^i \qquad \text{and} \qquad \frac{\partial \mathcal{L}}{\partial \xi_i} = 0 \;\Rightarrow\; \xi_i = \frac{\alpha_i}{2 q_i}$$

Substituting these back into the dual expression gives us the following simplified dual problem

$$\max_{\boldsymbol{\alpha} \geq \mathbf{0}}\ \left\{ \boldsymbol{\alpha}^\top \mathbf{1} - \frac{1}{2}\,\boldsymbol{\alpha}^\top (Q + D)\,\boldsymbol{\alpha} \right\}$$

where 𝑄 is an 𝑛 × 𝑛 matrix with 𝑄_𝑖𝑗 = 𝑦^𝑖 𝑦^𝑗 ⟨𝐱^𝑖, 𝐱^𝑗⟩ (the factors 𝛼_𝑖 𝛼_𝑗 are supplied by the quadratic form 𝛂^⊤𝑄𝛂) and 𝐷 is an 𝑛 × 𝑛 diagonal matrix with 𝐷_𝑖𝑖 = 1/(2𝑞_𝑖) and 𝐷_𝑖𝑗 = 0 if 𝑖 ≠ 𝑗.
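The exam only asks for the dual expression, but as a minimal sketch, the simplified dual can be maximized by projected gradient ascent; the function name, step size, and iteration count below are arbitrary illustrative choices.

```python
import numpy as np

def solve_weighted_dual(X, y, q, step=1e-3, n_iters=2000):
    """Projected gradient ascent on the simplified dual derived above.
    X: (n, d) features, y: (n,) labels in {-1, +1}, q: (n,) strictly positive weights."""
    n = X.shape[0]
    Yx = y[:, None] * X                               # rows are y^i * x^i
    Q = Yx @ Yx.T                                     # Q_ij = y^i y^j <x^i, x^j>
    D = np.diag(1.0 / (2.0 * q))                      # D_ii = 1 / (2 q_i)
    alpha = np.zeros(n)
    for _ in range(n_iters):
        grad = np.ones(n) - (Q + D) @ alpha           # gradient of alpha^T 1 - 0.5 alpha^T (Q + D) alpha
        alpha = np.maximum(0.0, alpha + step * grad)  # ascent step, then project onto alpha >= 0
    w = X.T @ (alpha * y)                             # recover the primal model w = sum_i alpha_i y^i x^i
    return w, alpha
```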

Q4. Recall the uniform distribution over an interval [𝑎, 𝑏] ⊂ ℝ where 𝑎 < 𝑏. Just two parameters, namely 𝑎, 𝑏, are required to define this distribution (there are no restrictions on 𝑎, 𝑏 being positive/non-zero etc., just that we must have 𝑎 < 𝑏; note this implies 𝑎 ≠ 𝑏). The PDF of this distribution is

$$\mathbb{P}[x \mid a, b] = \mathcal{U}(x; a, b) \triangleq \begin{cases} 0 & x < a \\ 1/(b - a) & x \in [a, b] \\ 0 & x > b \end{cases}$$

Given 𝑛 independent samples 𝑥^1, …, 𝑥^𝑛 ∈ ℝ (assume w.l.o.g. that not all samples are the same number), we wish to learn a uniform distribution as a generative distribution from these samples using the MLE technique, i.e. we wish to find

$$\underset{a < b,\ a \neq b}{\arg\max}\ \mathbb{P}[x^1, \ldots, x^n \mid a, b]$$

Give a brief derivation for, and the final values of, 𝑎̂_MLE and 𝑏̂_MLE. (5+5=10 marks)

Using independence, we have

$$\underset{a < b,\ a \neq b}{\arg\max}\ \mathbb{P}[x^1, \ldots, x^n \mid a, b] = \underset{a < b,\ a \neq b}{\arg\max}\ \prod_{i=1}^{n} \mathcal{U}(x^i; a, b)$$

Now, suppose we have a pair (𝑎, 𝑏) such that for some 𝑖 ∈ [𝑛] we have 𝑥^𝑖 ∉ [𝑎, 𝑏]. Then 𝒰(𝑥^𝑖; 𝑎, 𝑏) = 0 and as a result ℙ[𝑥^1, …, 𝑥^𝑛 | 𝑎, 𝑏] = 0 too! This means that if we denote 𝑚 ≜ min_𝑖 𝑥^𝑖 and 𝑀 ≜ max_𝑖 𝑥^𝑖, then we must have 𝑎 ≤ 𝑚 and 𝑏 ≥ 𝑀 to get a non-zero value of the likelihood function, i.e. we need to solve

$$\underset{a \leq m,\ b \geq M}{\arg\max}\ \prod_{i=1}^{n} \mathcal{U}(x^i; a, b) = \underset{a \leq m,\ b \geq M}{\arg\max}\ \left(\frac{1}{b - a}\right)^{n}$$

The above is maximized for the smallest value of 𝑏 − 𝑎 which, subject to the constraints, is achieved exactly at 𝑎 = 𝑚, 𝑏 = 𝑀. Thus, we have

$$\hat{a}_{\text{MLE}} = \min_i x^i \quad \text{and} \quad \hat{b}_{\text{MLE}} = \max_i x^i$$
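A tiny numeric illustration of this result; the interval endpoints and sample size below are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true = -2.0, 5.0                 # hypothetical ground-truth interval
x = rng.uniform(a_true, b_true, size=1000)

# MLE for the uniform distribution: the tightest interval that still contains every sample
a_mle, b_mle = x.min(), x.max()
print(a_mle, b_mle)                        # close to (and strictly inside) [-2, 5]
```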

Q5. Fill the circle (don’t tick) next to all the correct options (many may be correct). (2x3=6 marks)
5.1 The use of the Laplace (aka Laplacian) prior and Laplace (aka Laplacian) likelihood results in a
MAP problem that requires us to solve an optimization problem whose objective function is

A Always convex and always differentiable
B Always convex but possibly non-differentiable
C Possibly non-convex but always differentiable
D Always non-convex and always non-differentiable
5.2 In probabilistic multiclassification with 𝐶 classes, if for a test data point, the ML algorithm
predicts a PMF over the classes with an extremely small variance, then it means that
A The mode of that PMF should have a probability value much larger than 0
B The mode of that PMF should have a probability value very close to 0
C The ML algorithm is very confident about its prediction on that data point
D The ML algorithm is very unsure about its prediction on that data point
Q6. Nadal and Federer have played a total of 80 matches, of which Nadal won 50 and Federer won 30. They have played on three types of courts: clay, grass, and hard. Among the matches Nadal won, 70% were played on clay courts, 4% on grass courts and the rest on hard courts. Federer has won a 15/120 fraction of the matches played on clay courts, a 96/120 fraction of the matches played on grass courts, and a 68/120 fraction of the matches played on hard courts. What is the number of matches that the two players have played on each of the three types of courts? (3x2=6 marks)

Clay ( 40 ) Grass ( 10 ) Hard ( 30 )
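One way to arrive at these counts: Nadal's 50 wins split as 35 on clay (70%), 2 on grass (4%) and 13 on hard courts (the rest). If 𝐶, 𝐺, 𝐻 denote the total number of matches on clay, grass and hard courts respectively, then Federer's wins account for the remaining matches on each surface, so

$$35 + \tfrac{15}{120}C = C \;\Rightarrow\; C = 40, \qquad 2 + \tfrac{96}{120}G = G \;\Rightarrow\; G = 10, \qquad 13 + \tfrac{68}{120}H = H \;\Rightarrow\; H = 30.$$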


Q7 Let 𝑋 be a discrete random variable with support {−1,0,1}. Find a PMF for 𝑋 for which 𝑋 has
the highest possible variance. What value of variance do you get in this case? Repeat the analysis
(i.e. give the highest variance PMF as well as the variance value) when 𝑋 is a Rademacher random
variable i.e. has support only over {−1,1}. Justify all your answers briefly. (3+1+3+1=8 marks)
Suppose the PMF assigns probability values 𝑝−1, 𝑝0, 𝑝1 to the support elements. Then we have 𝔼𝑋 = 𝑝1 − 𝑝−1 and 𝕍𝑋 = 𝔼[𝑋^2] − (𝔼𝑋)^2 = (𝑝1 + 𝑝−1) − (𝑝1 − 𝑝−1)^2. Now, whereas we could go all Lagrangian on this problem and solve it by brute force, a more careful look at the problem gives results more readily.

The largest (perhaps unachievably so) value of the last expression is achieved when 𝑝1 + 𝑝−1 takes on its largest value (which is 1, since 𝑝−1 + 𝑝0 + 𝑝1 = 1 and 𝑝0 ≥ 0) and (𝑝1 − 𝑝−1)^2 takes on its smallest value (which is 0, since the square of a real number can never be negative). Thus, we must not expect a result better than 𝕍𝑋 = 1.

However, the above can actually be achieved: (𝑝1 − 𝑝−1)^2 = 0 when 𝑝1 = 𝑝−1, and we can simultaneously ensure 𝑝1 + 𝑝−1 = 1 by setting 𝑝1 = 𝑝−1 = 0.5. Thus, at the PMF {𝑝−1 = 0.5, 𝑝0 = 0, 𝑝1 = 0.5}, the random variable attains the highest possible variance of 𝕍𝑋 = 1.

For the Rademacher case, the solution is readily seen to be {𝑝−1 = 0.5, 𝑝1 = 0.5} with 𝕍𝑋 = 1, since the solution obtained in the first case is exactly a Rademacher random variable once we restrict attention to the support elements that are assigned non-zero probability.
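A short numeric sanity check of this claim; the comparison PMFs below are arbitrary choices.

```python
import numpy as np

support = np.array([-1.0, 0.0, 1.0])
best = np.array([0.5, 0.0, 0.5])                    # the PMF claimed to maximize variance
others = [np.array([0.4, 0.2, 0.4]),                # two arbitrary comparison PMFs
          np.array([0.7, 0.0, 0.3])]

def variance(p):
    mean = np.dot(p, support)
    return np.dot(p, support**2) - mean**2

print(variance(best))                               # 1.0
print([variance(p) for p in others])                # 0.8 and 0.84, both below 1
```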

------------------------------------ END OF EXAM ------------------------------------