553.740 Project 2; Optimization.

Fall 2020
Due on Wednesday October 21.

Please read the directions below carefully.


• You must type your answers (using, e.g., LaTeX or MS Word). A 5% penalty will be applied
otherwise.

• The solution must be written with sufficient explanations and your results must be commented,
as if you were writing a report or a paper. Returning a series of numerical results and figures is not
enough. A solution in the form of raw program output, or of a Jupyter notebook output, is not
acceptable either.
• You must provide your program sources. They must be added as an appendix, separate from your
answers to the questions. They will not be graded (so no direct credit for them), but they will be useful in
order to understand why results are not correct (and decide whether partial credit can be given) and to
ensure that your work is original. You may use any programming language, although Python is strongly
recommended.
• Data files. The “.csv” files associated with this project are available on Blackboard. The columns are
formatted as (Index, X1, . . . , Xd, Y) and each row corresponds to a sample. Here, X1, . . . , Xd are the
d coordinates of the input variable (in this homework, d = 1 or 2), and Y is the output variable (Index
is just the line number).

Question 1.
Let X be a real-valued random variable modeled with a normal distribution N(m, σ²). Fixing a positive
number ρ, we propose to estimate m and σ², based on a sample (x1, . . . , xN), by maximizing the penalized
log-likelihood
\[
\sum_{k=1}^{N} \log \varphi_{m,\sigma^2}(x_k) \;-\; N\rho\,\frac{|m|}{\sigma}
\]
where ϕm,σ² is the p.d.f. of N(m, σ²).
We will assume in the following that the observed samples are not all equal to the same constant value.

(1.1) Let α = m/σ and β = 1/σ. Let also µ1 and µ2 denote the first- and second-order empirical moments:
\[
\mu_1 = \frac{1}{N}(x_1 + \cdots + x_N), \qquad
\mu_2 = \frac{1}{N}(x_1^2 + \cdots + x_N^2).
\]
Prove that this maximization problem is equivalent to minimizing (introducing an auxiliary variable ξ)
\[
F(\alpha, \beta, \xi) = \frac{1}{2}\alpha^2 - \mu_1\alpha\beta + \frac{1}{2}\mu_2\beta^2 - \log\beta + \rho\xi
\]
subject to the constraints ξ − α ≥ 0 and ξ + α ≥ 0.
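For concreteness, here is a minimal Python sketch (names illustrative) that computes the empirical moments and evaluates F; it only illustrates the objective, and the constraints ξ − α ≥ 0 and ξ + α ≥ 0 are not enforced:

    import numpy as np

    def objective_F(alpha, beta, xi, x, rho):
        # Evaluate F(alpha, beta, xi) from (1.1) for a 1-D sample x.
        # Assumes beta > 0; the constraints on xi are not checked here.
        x = np.asarray(x, dtype=float)
        mu1 = x.mean()
        mu2 = np.mean(x ** 2)
        return (0.5 * alpha ** 2 - mu1 * alpha * beta
                + 0.5 * mu2 * beta ** 2 - np.log(beta) + rho * xi)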

(1.2) Prove that F is a convex function over R × (0, +∞) × R and that any feasible point for this
constrained minimization problem is regular.

(1.3) Prove that the KKT conditions for this problem are:
\[
\begin{cases}
\alpha - \mu_1\beta = \lambda_2 - \lambda_1 \\[2pt]
-\mu_1\alpha + \mu_2\beta - \dfrac{1}{\beta} = 0 \\[4pt]
\rho - \lambda_1 - \lambda_2 = 0 \\[2pt]
\lambda_1, \lambda_2 \geq 0 \\[2pt]
\lambda_1(\alpha - \xi) = 0 \\[2pt]
\lambda_2(\alpha + \xi) = 0
\end{cases}
\]
where λ1, λ2 are Lagrange multipliers.

(1.4) Find a necessary and sufficient condition on µ1 , µ2 and ρ for a solution to this system to satisfy α = 0,
and provide the optimal β in that case.

(1.5) Assume that the previous condition is not satisfied. (a) Prove that in that case, α has the same sign
as µ1 . (b) Completely describe the solution of the system, separating the cases µ1 > 0 and µ1 < 0.

(1.6) Summarize this discussion and prove one of the two following statements (with extra credit if you
prove both, i.e., prove that they are equivalent).
(a) The optimal parameters are
\[
\hat m = \mathrm{sign}(\mu_1)\,\max\bigl(0,\ |\mu_1| - \rho\hat\sigma\bigr),
\qquad
\hat\sigma = \min\!\left(\sqrt{\mu_2},\ \frac{2s^2}{\sqrt{\rho^2\mu_1^2 + 4s^2} - \rho|\mu_1|}\right)
\]
with s² = µ2 − µ1².
(b) The optimal parameters can be computed as follows.

• If µ1² ≤ ρ²µ2: m̂ = 0, σ̂ = √µ2.
• If µ1² ≥ ρ²µ2:
\[
\hat m = \mu_1 - \mathrm{sign}(\mu_1)\,\rho\hat\sigma,
\qquad
\hat\sigma = \frac{2s^2}{\sqrt{\rho^2\mu_1^2 + 4s^2} - \rho|\mu_1|}.
\]
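For reference, a minimal Python sketch of the closed form stated in (b); the name penalized_normal_mle is illustrative, and the code assumes the samples are not all equal (so s² > 0):

    import numpy as np

    def penalized_normal_mle(x, rho):
        # Closed-form estimates (m_hat, sigma_hat) following statement (b).
        x = np.asarray(x, dtype=float)
        mu1 = x.mean()
        mu2 = np.mean(x ** 2)
        s2 = mu2 - mu1 ** 2
        if mu1 ** 2 <= rho ** 2 * mu2:
            # The penalty is strong enough to shrink the mean to zero.
            return 0.0, np.sqrt(mu2)
        sigma_hat = 2 * s2 / (np.sqrt(rho ** 2 * mu1 ** 2 + 4 * s2) - rho * abs(mu1))
        m_hat = mu1 - np.sign(mu1) * rho * sigma_hat
        return m_hat, sigma_hat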

(1.7) What happens if ρ ≥ 1?

Question 2.
We now consider a second random variable, Y ∈ {0, 1}, and assume that, conditionally on Y = y,
X ∼ N(my, σ²) for some m0, m1 and σ². Based on samples (x1, . . . , xN) and (y1, . . . , yN), we want to
maximize the penalized log-likelihood
\[
\sum_{k=1}^{N} \log \varphi_{m_{y_k},\sigma^2}(x_k) \;-\; N\rho\,\frac{|m_1 - m_0|}{\sigma}
\]
with respect to m0, m1 and σ².


Let N0 and N1 denote the numbers of observations with Y = 0 and Y = 1, respectively, and define
\[
\mu_{1,0} = \frac{1}{N_0}\sum_{k=1}^{N} x_k \mathbf{1}_{y_k=0}, \qquad
\mu_{1,1} = \frac{1}{N_1}\sum_{k=1}^{N} x_k \mathbf{1}_{y_k=1},
\]
\[
\mu_1 = \frac{1}{N}\bigl(N_0\,\mu_{1,0} + N_1\,\mu_{1,1}\bigr), \qquad
\mu_2 = \frac{1}{N}\sum_{k=1}^{N} x_k^2, \qquad
s^2 = \mu_2 - \mu_1^2.
\]

(2.1) Let m = (N0m0 + N1m1)/N, α = (m1 − m0)/σ and β = 1/σ. Express the problem in terms of these
new parameters and prove, denoting the optimal solution by m̂, α̂, β̂, that
1. m̂ = µ1;
2. α̂ and β̂ minimize
\[
\frac{s^2\beta^2}{2} - \frac{N_0N_1}{N^2}(\mu_{1,1} - \mu_{1,0})\alpha\beta + \frac{N_0N_1}{2N^2}\alpha^2 - \log\beta + \rho|\alpha|
\]
with respect to α and β.

(2.2) Adapt the solution found in question 1 to show one of the following statements (with extra credit if
you show both).
(a) The optimal solution of the penalized likelihood problem is
\[
\hat\sigma = \min\!\left(s,\
\frac{2\bigl(s^2 - N_0N_1(\mu_{1,1} - \mu_{1,0})^2/N^2\bigr)}
{\sqrt{\rho^2(\mu_{1,1} - \mu_{1,0})^2 + 4\bigl(s^2 - N_0N_1(\mu_{1,1} - \mu_{1,0})^2/N^2\bigr)} - \rho|\mu_{1,1} - \mu_{1,0}|}\right)
\]
\[
\hat m_0 = \mu_1 - \frac{N_1}{N}\,\mathrm{sign}(\mu_{1,1} - \mu_{1,0})\,
\max\!\left(0,\ |\mu_{1,1} - \mu_{1,0}| - \frac{N^2}{N_0N_1}\,\rho\hat\sigma\right)
\]
\[
\hat m_1 = \mu_1 + \frac{N_0}{N}\,\mathrm{sign}(\mu_{1,1} - \mu_{1,0})\,
\max\!\left(0,\ |\mu_{1,1} - \mu_{1,0}| - \frac{N^2}{N_0N_1}\,\rho\hat\sigma\right)
\]

(b) The optimal solution of the penalized likelihood problem is computed as follows.

• If N0N1|µ1,1 − µ1,0| ≤ N²ρs: m̂0 = m̂1 = µ1, σ̂ = s.
• If N0N1|µ1,1 − µ1,0| ≥ N²ρs:
\[
\hat\sigma = \frac{2\bigl(s^2 - N_0N_1(\mu_{1,1} - \mu_{1,0})^2/N^2\bigr)}
{\sqrt{\rho^2(\mu_{1,1} - \mu_{1,0})^2 + 4\bigl(s^2 - N_0N_1(\mu_{1,1} - \mu_{1,0})^2/N^2\bigr)} - \rho|\mu_{1,1} - \mu_{1,0}|}
\]
\[
\hat m_0 = \mu_{1,0} + \frac{N}{N_0}\,\mathrm{sign}(\mu_{1,1} - \mu_{1,0})\,\rho\hat\sigma,
\qquad
\hat m_1 = \mu_{1,1} - \frac{N}{N_1}\,\mathrm{sign}(\mu_{1,1} - \mu_{1,0})\,\rho\hat\sigma.
\]
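A corresponding Python sketch of the closed form in statement (b), with illustrative names; it assumes both classes are present in the sample:

    import numpy as np

    def penalized_two_class_mle(x, y, rho):
        # Closed-form estimates (m0_hat, m1_hat, sigma_hat) following statement (b),
        # for a 1-D sample x with binary labels y in {0, 1}.
        x = np.asarray(x, dtype=float)
        y = np.asarray(y)
        N = len(x)
        N0, N1 = np.sum(y == 0), np.sum(y == 1)
        mu10, mu11 = x[y == 0].mean(), x[y == 1].mean()
        mu1 = (N0 * mu10 + N1 * mu11) / N
        mu2 = np.mean(x ** 2)
        s2 = mu2 - mu1 ** 2
        s = np.sqrt(s2)
        delta = mu11 - mu10
        if N0 * N1 * abs(delta) <= N ** 2 * rho * s:
            # The penalty is strong enough to merge the two class means.
            return mu1, mu1, s
        w2 = s2 - N0 * N1 * delta ** 2 / N ** 2   # within-class variance
        sigma_hat = 2 * w2 / (np.sqrt(rho ** 2 * delta ** 2 + 4 * w2) - rho * abs(delta))
        m0_hat = mu10 + (N / N0) * np.sign(delta) * rho * sigma_hat
        m1_hat = mu11 - (N / N1) * np.sign(delta) * rho * sigma_hat
        return m0_hat, m1_hat, sigma_hat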

Question 3.
We now assume that the random variable X is multi-dimensional and that, conditionally on Y, it follows
a normal distribution N(my, Σ) with m0, m1 ∈ Rd and Σ a diagonal matrix. We will denote by my(i) the
ith coefficient of my (y = 0, 1) and by σ²(i) the ith diagonal entry of Σ.

(3.1) Use the previous question to describe the maximizers of the penalized likelihood
\[
\sum_{k=1}^{N} \log \Phi_{m_{y_k},\Sigma}(x_k) \;-\; N\rho \sum_{i=1}^{d} \frac{|m_0(i) - m_1(i)|}{\sigma(i)}
\]
where Φm,Σ is the p.d.f. of the multivariate Gaussian N(m, Σ).

(3.2) Write a program for training that takes as input an N by d array X, an N -dimensional binary vector
Y and a value of ρ, and that returns m0 , m1 and σ as vectors.
(a) Test this program on the dataset “OPT FALL2020 train1.csv” with ρ = 0.1 and return the number of
coordinates such that m0(i) ≠ m1(i).
(b) Same question, with ρ = 0.15.
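Since Σ is diagonal, the penalized likelihood of (3.1) separates across coordinates, so one possible training routine simply applies the one-dimensional closed form of question (2.2) to each column. The sketch below reuses the penalized_two_class_mle helper from above; the name train_model is illustrative:

    import numpy as np

    def train_model(X, Y, rho):
        # Apply the one-dimensional closed form of (2.2) to each coordinate.
        # Returns (m0, m1, sigma) as d-dimensional vectors.
        X = np.asarray(X, dtype=float)
        d = X.shape[1]
        m0, m1, sigma = np.zeros(d), np.zeros(d), np.zeros(d)
        for i in range(d):
            m0[i], m1[i], sigma[i] = penalized_two_class_mle(X[:, i], Y, rho)
        return m0, m1, sigma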

(3.3) Write a program for prediction that takes as input vectors m0 , m1 and σ (such as those returned by
the previous training program) and an M by d array X of test data and that returns an M -dimensional
vector Y such that, for k = 1, . . . , M : Y(k) = 0 if
\[
\sum_{j=1}^{d} \frac{(X(k,j) - m_0(j))^2}{\sigma^2(j)} \;<\; \sum_{j=1}^{d} \frac{(X(k,j) - m_1(j))^2}{\sigma^2(j)}
\]

and Y(k) = 1 otherwise.
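A direct implementation of this decision rule could look as follows (the name predict is illustrative):

    import numpy as np

    def predict(m0, m1, sigma, X):
        # Assign class 0 when the variance-weighted squared distance to m0
        # is smaller than the distance to m1, and class 1 otherwise.
        X = np.asarray(X, dtype=float)
        d0 = np.sum((X - m0) ** 2 / sigma ** 2, axis=1)
        d1 = np.sum((X - m1) ** 2 / sigma ** 2, axis=1)
        return np.where(d0 < d1, 0, 1)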


For ρ = 0.01, 0.02, 0.03, . . . , 0.34, 0.35, execute the following operations:
(1) Run the training program of question (3.2) to estimate m0, m1, σ. Compute ν(ρ), the number of
coordinates such that m0(i) ≠ m1(i).
(2) Using the values of m0, m1 and σ obtained in step (1), apply the classification program to the
dataset provided in the file “OPT FALL2020 test1.csv.”
Compute the balanced error rate E(ρ), defined as the average between the fraction of wrong answers
among test data with Y = 0 and that among test data with Y = 1, formally defined by
\[
E = \frac{1}{2}\sum_{y\in\{0,1\}} \frac{\#\{k :\ \mathbf{Y}(k) = y,\ \mathbf{Y}'(k) \neq y\}}{\#\{k :\ \mathbf{Y}(k) = y\}}
\]
where Y′ contains the predicted classes and Y the true ones.
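The balanced error rate can be computed directly from this definition, for example as in the following sketch:

    import numpy as np

    def balanced_error_rate(y_true, y_pred):
        # Average of the two per-class error fractions.
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        return 0.5 * sum(np.mean(y_pred[y_true == y] != y) for y in (0, 1))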

Provide, in your answer,
• The numerical values of E(0.1) and E(0.15).
• The values of ρ for which E is minimal.
• Two plots: of ν(ρ) vs. ρ and of E(ρ) vs. ρ.
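Putting the pieces together, the sweep over ρ described in steps (1)–(2) might be organized as in the sketch below. It reuses the train_model, predict and balanced_error_rate helpers defined above, and assumes the CSV files have a header row and the (Index, X1, . . . , Xd, Y) column layout described in the directions; the exact file paths may need to be adjusted:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    # Load training and test data (column layout: Index, X1, ..., Xd, Y).
    train_df = pd.read_csv("OPT FALL2020 train1.csv")
    test_df = pd.read_csv("OPT FALL2020 test1.csv")
    Xtr, Ytr = train_df.iloc[:, 1:-1].to_numpy(), train_df.iloc[:, -1].to_numpy()
    Xte, Yte = test_df.iloc[:, 1:-1].to_numpy(), test_df.iloc[:, -1].to_numpy()

    rhos = np.round(np.arange(0.01, 0.36, 0.01), 2)
    nus, errs = [], []
    for rho in rhos:
        m0, m1, sigma = train_model(Xtr, Ytr, rho)
        nus.append(int(np.sum(m0 != m1)))
        errs.append(balanced_error_rate(Yte, predict(m0, m1, sigma, Xte)))

    plt.figure(); plt.plot(rhos, nus); plt.xlabel("rho"); plt.ylabel("nu(rho)")
    plt.figure(); plt.plot(rhos, errs); plt.xlabel("rho"); plt.ylabel("E(rho)")
    plt.show()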

(3.4) Instead of using a closed form for the solution of the problem considered in this question, we consider
here an iterative approach using proximal gradient descent. Let α0 = (m0(i)/σ(i), i = 1, . . . , d),
α1 = (m1(i)/σ(i), i = 1, . . . , d) and β = (1/σ(i), i = 1, . . . , d) be d-dimensional vectors.
(a) Rewrite the penalized likelihood maximization problem considered in this question as an equivalent
minimization problem for a function taking the form
\[
F(\alpha_0, \alpha_1, \beta) = f(\alpha_0, \alpha_1, \beta) + \rho\, g(\alpha_0, \alpha_1, \beta)
\qquad\text{with}\qquad
g(\alpha_0, \alpha_1, \beta) = \sum_{i=1}^{d} |\alpha_1(i) - \alpha_0(i)|.
\]
(b) Provide an explicit expression of the proximal operator
\[
\mathrm{prox}_{\gamma\rho g}(\alpha_0, \alpha_1, \beta) =
\operatorname*{argmin}_{\alpha_0', \alpha_1', \beta'}
\left\{ \rho\, g(\alpha_0', \alpha_1', \beta') +
\frac{1}{2\gamma}\Bigl(|\alpha_0 - \alpha_0'|^2 + |\alpha_1 - \alpha_1'|^2 + |\beta - \beta'|^2\Bigr) \right\}
\]
(c) Give the expression of ∇f (α0 , α1 , β) for the function f obtained in (a).
(d) Write a new version of the training program that takes the same input X, Y and ρ and returns the
optimal m0 , m1 , σ obtained after stabilization of a proximal gradient algorithm

\[
(\alpha_{0,n+1},\ \alpha_{1,n+1},\ \beta_{n+1}) =
\mathrm{prox}_{\gamma\rho g}\bigl((\alpha_{0,n}, \alpha_{1,n}, \beta_n) - \gamma\,\nabla f(\alpha_{0,n}, \alpha_{1,n}, \beta_n)\bigr),
\]

where stabilization is defined as the maximal change in the optimized variables during one iteration
being less than τ = 10⁻⁶.
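A generic sketch of such a proximal gradient loop is given below; grad_f and prox_g stand for the answers to parts (c) and (b) and are passed in as placeholder functions, whose exact signatures are an assumption of this sketch:

    import numpy as np

    def proximal_gradient(alpha0, alpha1, beta, grad_f, prox_g, gamma, rho, tau=1e-6):
        # Iterate the proximal gradient update until the largest change in any
        # optimized variable during one iteration falls below tau.
        while True:
            g0, g1, gb = grad_f(alpha0, alpha1, beta)
            new0, new1, newb = prox_g(alpha0 - gamma * g0,
                                      alpha1 - gamma * g1,
                                      beta - gamma * gb,
                                      gamma * rho)
            change = max(np.max(np.abs(new0 - alpha0)),
                         np.max(np.abs(new1 - alpha1)),
                         np.max(np.abs(newb - beta)))
            alpha0, alpha1, beta = new0, new1, newb
            if change < tau:
                return alpha0, alpha1, beta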
Run this program on the dataset “OPT FALL2020 train1.csv” using ρ = 0.1 and γ = 0.1. Provide in
your answer the maximum value (over all estimated variables) of the difference between the result obtained
after convergence and the exact formula obtained in question (3.1).
