Final 12


EE364a: Convex Optimization I, March 16-17 or 17-18, 2012

Prof. S. Boyd

Final exam
This is a 24 hour take-home final. Please turn it in at Bytes Cafe in the Packard building, 24 hours after you pick it up. You may use any books, notes, or computer programs (e.g., Matlab, CVX), but you may not discuss the exam with anyone until March 19, after everyone has taken the exam. The only exception is that you can ask us for clarification, via the course staff email address. We've tried pretty hard to make the exam unambiguous and clear, so we're unlikely to say much.

Please make a copy of your exam before handing it in. Please attach the cover page to the front of your exam. Assemble your solutions in order (problem 1, problem 2, problem 3, ...), starting a new page for each problem. Put everything associated with each problem (e.g., text, code, plots) together; do not attach code or plots at the end of the final.

We will deduct points from long, needlessly complex solutions, even if they are correct. Our solutions are not long, so if you find that your solution to a problem goes on and on for many pages, you should try to figure out a simpler one. We expect neat, legible exams from everyone, including those enrolled Cr/N.

When a problem involves computation you must give all of the following: a clear discussion and justification of exactly what you did, the Matlab source code that produces the result, and the final numerical results or plots.

You'll find Matlab files containing problem data in http://www.stanford.edu/~boyd/cvxbook/cvxbook_additional_exercises/

All problems have equal weight. Be sure you are using the most recent version of CVX, which is Version 1.22 (build 829). You can check this using the command cvx_version. Be sure to check your email often during the exam, just in case we need to send out an important announcement.

1. Optimal political positioning. A political constituency is a group of voters with similar views on a set of political issues. The electorate (i.e., the set of voters in some election) is partitioned (by a political analyst) into $K$ constituencies, with (nonnegative) populations $P_1, \ldots, P_K$. A candidate in the election has an initial or prior position on each of $n$ issues, but is willing to consider (presumably small) deviations from her prior positions in order to maximize the total number of votes she will receive. We let $x_i \in \mathbf{R}$ denote the change in her position on issue $i$, measured on some appropriate scale. (You can think of $x_i < 0$ as a move to the left and $x_i > 0$ as a move to the right on the issue, if you like.) The vector $x \in \mathbf{R}^n$ characterizes the changes in her position on all issues; $x = 0$ represents the prior positions. On each issue she has a limit on how far in each direction she is willing to move, which we express as $l \preceq x \preceq u$, where $l \prec 0$ and $u \succ 0$ are given.

The candidate's position change $x$ affects the fraction of voters in each constituency that will vote for her. This fraction is modeled as a logistic function,

$$f_k = g(w_k^T x + v_k), \quad k = 1, \ldots, K.$$

Here $g(z) = 1/(1 + \exp(-z))$ is the standard logistic function, and $w_k \in \mathbf{R}^n$ and $v_k \in \mathbf{R}$ are given data that characterize the views of constituency $k$ on the issues. Thus the total number of votes the candidate will receive is

$$V = P_1 f_1 + \cdots + P_K f_K.$$

The problem is to choose $x$ (subject to the given limits) so as to maximize $V$. The problem data are $l$, $u$, and $P_k$, $w_k$, and $v_k$ for $k = 1, \ldots, K$.

(a) The general political positioning problem. Show that the objective function $V$ need not be quasiconcave. (This means that the general optimal political positioning problem is not a quasiconvex problem, and therefore also not a convex problem.) In other words, choose problem data for which $V$ is not a quasiconcave function of $x$.

(b) The partisan political positioning problem. Now suppose the candidate focuses only on her core constituencies, i.e., those for which a significant fraction will vote for her. In this case we interpret the $K$ constituencies as her core constituencies; we assume that $v_k \ge 0$, which means that with her prior position $x = 0$, at least half of each of her core constituencies will vote for her. We add the constraint $w_k^T x + v_k \ge 0$ for each $k$, which means that she will not take positions that alienate a majority of voters from any of her core constituencies. Show that the partisan political positioning problem (i.e., maximizing $V$ with the additional assumptions and constraints) is convex.

(c) Numerical example. Find the optimal positions for the partisan political positioning problem with data given in opt_pol_pos_data.m. Report the number of votes from each constituency under the politician's prior positions ($x = 0$) and optimal positions, as well as the total number of votes $V$ in each case. You may use the function

$$g_{\mathrm{approx}}(z) = \min\{1,\; g(i) + g'(i)(z - i) \text{ for } i = 0, 1, 2, 3, 4\}$$

as an approximation of $g$ for $z \ge 0$. (The function $g_{\mathrm{approx}}$ is also an upper bound on $g$ for $z \ge 0$.) For your convenience, we have included function definitions for $g$ and $g_{\mathrm{approx}}$ (g and gapx, respectively) in the data file. You should report the results (votes from each constituency and total) using $g$, but be sure to check that these numbers are close to the results using $g_{\mathrm{approx}}$ (say, within one percent or so).
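The logistic function and its piecewise-linear approximation from part (c) can be sketched as follows. This is a minimal illustration in plain Python rather than the Matlab definitions g and gapx shipped in the data file; the function names `g_prime` and `gapprox` and the test grid are assumptions made here. It relies on the identity $g'(z) = g(z)(1 - g(z))$.

```python
import math


def g(z):
    """Standard logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def g_prime(z):
    """Derivative of the logistic function: g'(z) = g(z) * (1 - g(z))."""
    gz = g(z)
    return gz * (1.0 - gz)


def gapprox(z):
    """min{1, g(i) + g'(i) (z - i) for i = 0..4}: tangent lines of g, capped at 1."""
    return min([1.0] + [g(i) + g_prime(i) * (z - i) for i in range(5)])


if __name__ == "__main__":
    # Compare the two functions at a few nonnegative points (illustrative values).
    for z in [0.0, 0.5, 1.5, 3.0, 6.0]:
        print(f"z = {z:4.1f}   g = {g(z):.4f}   gapprox = {gapprox(z):.4f}")
```

Because $g$ is concave for $z \ge 0$, each tangent line lies above it there, which is why the minimum of the tangents (and the constant 1) is an upper bound.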

2. Portfolio optimization with qualitative return forecasts. We consider the risk-return portfolio optimization problem described on pages 155 and 185 of the book, with one twist: We don't precisely know the mean return vector $\bar p$. Instead, we have a range of possible values for each asset, i.e., we have $l, u \in \mathbf{R}^n$ with $l \preceq \bar p \preceq u$. We use $l$ and $u$ to encode various qualitative forecasts we have about the mean return vector $\bar p$. For example, $l_7 = -0.02$ and $u_7 = 0.20$ means that we believe the mean return for asset 7 is between $-2\%$ and $20\%$.

Define the worst-case mean return $R^{\mathrm{wc}}$, as a function of portfolio vector $x$, as the worst (minimum) value of $\bar p^T x$, over all $\bar p$ consistent with the given bounds $l$ and $u$.

(a) Explain how to find a portfolio $x$ that maximizes $R^{\mathrm{wc}}$, subject to a budget constraint and risk limit,

$$\mathbf{1}^T x = 1, \qquad x^T \Sigma x \le \sigma_{\max}^2,$$

where $\Sigma \in \mathbf{S}^n_{++}$ and $\sigma_{\max} \in \mathbf{R}_{++}$ are given.

(b) Solve the problem instance given in port_qual_forecasts_data.m. Give the optimal worst-case mean return achieved by the optimal portfolio $x^\star$. In addition, construct a portfolio $x^{\mathrm{mid}}$ that maximizes $c^T x$ subject to the budget constraint and risk limit, where $c = (1/2)(l + u)$. This is the optimal portfolio assuming that the mean return has the midpoint value of the forecasts. Compare the midpoint mean returns $c^T x^{\mathrm{mid}}$ and $c^T x^\star$, and the worst-case mean returns of $x^{\mathrm{mid}}$ and $x^\star$. Briefly comment on the results.
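To make the definition of the worst-case mean return concrete, the sketch below evaluates it for a fixed portfolio in two ways: coordinate by coordinate, and by brute-force enumeration of the corners of the box of admissible mean returns. It is an illustration in plain Python, not the exam's Matlab/CVX setting; the function names and the three-asset numbers are hypothetical, and it only evaluates the definition for a given portfolio rather than optimizing over portfolios.

```python
from itertools import product


def worst_case_return(x, l, u):
    """Minimum of p^T x over l <= p <= u, taken one coordinate at a time."""
    # For a long position (x_i >= 0) the worst mean return is l_i;
    # for a short position (x_i < 0) it is u_i.
    return sum(li * xi if xi >= 0 else ui * xi for xi, li, ui in zip(x, l, u))


def worst_case_return_bruteforce(x, l, u):
    """Same minimum, found by enumerating every vertex of the box [l, u]."""
    return min(
        sum(pi * xi for pi, xi in zip(p, x)) for p in product(*zip(l, u))
    )


if __name__ == "__main__":
    # Hypothetical three-asset example (not the exam data).
    x = [0.5, 0.7, -0.2]
    l = [-0.02, 0.01, 0.03]
    u = [0.20, 0.05, 0.10]
    print(worst_case_return(x, l, u))
    print(worst_case_return_bruteforce(x, l, u))
```

The brute-force check works because a linear function over a box attains its minimum at a vertex; it is only practical for small $n$.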

3. Learning a quadratic pseudo-metric from distance measurements. We are given a set of $N$ pairs of points in $\mathbf{R}^n$, $x_1, \ldots, x_N$, and $y_1, \ldots, y_N$, together with a set of distances $d_1, \ldots, d_N > 0$. The goal is to find (or estimate or learn) a quadratic pseudo-metric $d$,

$$d(x, y) = \left( (x - y)^T P (x - y) \right)^{1/2},$$

with $P \in \mathbf{S}^n_+$, which approximates the given distances, i.e., $d(x_i, y_i) \approx d_i$. (The pseudo-metric $d$ is a metric only when $P \succ 0$; when $P \succeq 0$ is singular, it is a pseudo-metric.) To do this, we will choose $P \in \mathbf{S}^n_+$ that minimizes the mean squared error objective

$$\frac{1}{N} \sum_{i=1}^N \left( d_i - d(x_i, y_i) \right)^2.$$

(a) Explain how to find $P$ using convex or quasiconvex optimization. If you cannot find an exact formulation (i.e., one that is guaranteed to minimize the total squared error objective), give a formulation that approximately minimizes the given objective, subject to the constraints.

(b) Carry out the method of part (a) with the data given in quad_metric_data.m. The columns of the matrices X and Y are the points $x_i$ and $y_i$; the row vector d gives the distances $d_i$. Give the optimal mean squared distance error. We also provide a test set, with data X_test, Y_test, and d_test. Report the mean squared distance error on the test set (using the metric found using the data set above).
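For a given matrix $P$, the pseudo-metric and the mean squared error objective can be evaluated as in the sketch below. It is a plain-Python illustration with hypothetical names and toy data; unlike the Matlab data file, points are stored here as a list of vectors rather than as matrix columns. It only evaluates the objective and does not choose $P$.

```python
import math


def pseudo_metric(P, x, y):
    """d(x, y) = sqrt((x - y)^T P (x - y)) for a symmetric PSD matrix P."""
    z = [xi - yi for xi, yi in zip(x, y)]
    n = len(z)
    quad = sum(z[i] * P[i][j] * z[j] for i in range(n) for j in range(n))
    return math.sqrt(max(quad, 0.0))  # clip tiny negative rounding errors


def mean_squared_error(P, X, Y, d):
    """(1/N) * sum_i (d_i - d(x_i, y_i))^2 over the N given pairs."""
    errors = [(di - pseudo_metric(P, xi, yi)) ** 2 for xi, yi, di in zip(X, Y, d)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Singular P: distinct points can be at distance zero (a pseudo-metric).
    P = [[1.0, 0.0], [0.0, 0.0]]
    X = [[3.0, 1.0], [0.0, 0.0]]
    Y = [[0.0, 7.0], [0.0, 5.0]]
    d = [3.0, 1.0]
    print(pseudo_metric(P, X[1], Y[1]))   # distinct points, zero distance
    print(mean_squared_error(P, X, Y, d))
```

With this singular $P$, only differences in the first coordinate count, which illustrates why $d$ fails to be a metric when $P$ is not positive definite.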

4. Optimal parimutuel betting. In parimutuel betting, participants bet nonnegative amounts on each of $n$ outcomes, exactly one of which will actually occur. (For example, the outcome can be which of $n$ horses wins a race.) The total amount bet by all participants on all outcomes is called the pool or tote. The house takes a commission from the pool (typically around 20%), and the remaining pool is divided among those who bet on the outcome that occurs, in proportion to their bets on the outcome. This problem concerns the choice of the amount to bet on each outcome.

Let $x_i \ge 0$ denote the amount we bet on outcome $i$, so the total amount we bet on all outcomes is $\mathbf{1}^T x$. Let $a_i > 0$ denote the amount bet by all other participants on outcome $i$, so after the house commission, the remaining pool is $P = (1 - c)(\mathbf{1}^T a + \mathbf{1}^T x)$, where $c \in (0, 1)$ is the house commission rate. Our payoff if outcome $i$ occurs is then

$$p_i = \frac{x_i}{x_i + a_i} P.$$

The goal is to choose $x$, subject to $\mathbf{1}^T x = B$ (where $B$ is the total amount to be bet, which is given), so as to maximize the expected utility

$$\sum_{i=1}^n \pi_i U(p_i),$$

where $\pi_i$ is the probability that outcome $i$ occurs, and $U$ is a concave increasing utility function, with $U(0) = 0$. You can assume that $a_i$, $\pi_i$, $c$, $B$, and the function $U$ are known.

Explain how to find an optimal $x$ using convex or quasiconvex optimization. If you use a change of variables, be sure to explain how your variables are related to $x$.

Remarks. To carry out this betting strategy, you'd need to know $a_i$, and then be the last participant to place your bets (so that $a_i$ don't subsequently change). You'd also need to know the probabilities $\pi_i$. These could be estimated using sophisticated machine learning techniques or insider information. The formulation above assumes that the total amount to bet (i.e., $B$) is known. If it is not known, you could solve the problem above for a range of values of $B$ and use the value of $B$ that yields the largest optimal expected utility.
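The payoff and expected-utility formulas can be evaluated for a given bet vector as sketched below. This is a plain-Python illustration; the function names, the numbers, and the choice of a square-root utility (concave, increasing, zero at zero) are assumptions made for the example, not part of the exam data.

```python
import math


def payoffs(x, a, c):
    """p_i = x_i / (x_i + a_i) * P, with P = (1 - c) * (sum(a) + sum(x))."""
    pool = (1.0 - c) * (sum(a) + sum(x))
    return [xi / (xi + ai) * pool for xi, ai in zip(x, a)]


def expected_utility(x, a, c, pi, U):
    """sum_i pi_i * U(p_i) for outcome probabilities pi and utility U."""
    return sum(prob_i * U(p_i) for prob_i, p_i in zip(pi, payoffs(x, a, c)))


if __name__ == "__main__":
    a = [60.0, 30.0, 10.0]   # amounts bet by the other participants
    x = [0.0, 5.0, 5.0]      # our bets, total B = 10
    c = 0.2                  # house commission rate
    pi = [0.5, 0.3, 0.2]     # outcome probabilities
    print(payoffs(x, a, c))
    print(expected_utility(x, a, c, pi, math.sqrt))
```

Note that our own bets enter the pool $P$, so each payoff $p_i$ depends on the whole vector $x$ and not only on $x_i$.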

5. Polyhedral cone questions. You are given matrices $A \in \mathbf{R}^{n \times k}$ and $B \in \mathbf{R}^{n \times p}$. Explain how to solve the following two problems using convex optimization. Your solution can involve solving multiple convex problems, as long as the number of such problems is no more than linear in the dimensions $n$, $k$, $p$.

(a) How would you determine whether $A \mathbf{R}^k_+ \subseteq B \mathbf{R}^p_+$? This means that every nonnegative linear combination of the columns of $A$ can be expressed as a nonnegative linear combination of the columns of $B$.

(b) How would you determine whether $A \mathbf{R}^k_+ = \mathbf{R}^n$? This means that every vector in $\mathbf{R}^n$ can be expressed as a nonnegative linear combination of the columns of $A$.

6. Resource allocation in stream processing. A large data center is used to handle a stream of $J$ types of jobs. The traffic (number of instances per second) of each job type is denoted $t \in \mathbf{R}^J_+$. Each instance of each job type (serially) invokes or calls a set of processes. There are $P$ types of processes, and we describe the job-process relation by the $P \times J$ matrix

$$R_{pj} = \begin{cases} 1 & \text{job } j \text{ invokes process } p \\ 0 & \text{otherwise.} \end{cases}$$

The process loads (number of instances per second) are given by $\lambda = Rt \in \mathbf{R}^P$, i.e., $\lambda_p$ is the sum of the traffic from the jobs that invoke process $p$.

The latency of a process or job type is the average time that it takes one instance to complete. These are denoted $l^{\mathrm{proc}} \in \mathbf{R}^P$ and $l^{\mathrm{job}} \in \mathbf{R}^J$, respectively, and are related by $l^{\mathrm{job}} = R^T l^{\mathrm{proc}}$, i.e., $l^{\mathrm{job}}_j$ is the sum of the latencies of the processes called by $j$. Job latency is important to users, since $l^{\mathrm{job}}_j$ is the average time the data center takes to handle an instance of job type $j$. We are given a maximum allowed job latency: $l^{\mathrm{job}} \preceq l^{\max}$.

The process latencies depend on the process load and also how much of $n$ different resources are made available to them. These resources might include, for example, number of cores, disk storage, and network bandwidth. Here, we represent amounts of these resources as (nonnegative) real numbers, so $x_p \in \mathbf{R}^n_+$ represents the resources allocated to process $p$. The process latencies are given by

$$l^{\mathrm{proc}}_p = \psi_p(x_p, \lambda_p), \quad p = 1, \ldots, P,$$

where $\psi_p : \mathbf{R}^n \times \mathbf{R} \to \mathbf{R} \cup \{\infty\}$ is a known (extended-valued) convex function. These functions are nonincreasing in their first (vector) arguments, and nondecreasing in their second arguments (i.e., more resources or less load cannot increase latency). We interpret $\psi_p(x_p, \lambda_p) = \infty$ to mean that the resources given by $x_p$ are not sufficient to handle the load $\lambda_p$.

We wish to allocate a total resource amount $x^{\mathrm{tot}} \in \mathbf{R}^n_{++}$ among the $P$ processes, so we have $\sum_{p=1}^P x_p \preceq x^{\mathrm{tot}}$. The goal is to minimize the objective function

$$\sum_{j=1}^J w_j \left( t^{\mathrm{tar}}_j - t_j \right)_+,$$

where $t^{\mathrm{tar}}_j$ is the target traffic level for job type $j$, $w_j > 0$ give the priorities, and $(u)_+$ is the nonnegative part of a vector, i.e., $((u)_+)_i = \max\{u_i, 0\}$. (Thus the objective is a weighted penalty for missing the target job traffic.) The variables are $t \in \mathbf{R}^J_+$ and $x_p \in \mathbf{R}^n_+$, $p = 1, \ldots, P$. The problem data are the matrix $R$, the vectors $l^{\max}$, $x^{\mathrm{tot}}$, $t^{\mathrm{tar}}$, and $w$, and the functions $\psi_p$, $p = 1, \ldots, P$.

(a) Explain why this is a convex optimization problem.

(b) Solve the problem instance with data given in res_alloc_stream_data.m, with latency functions

$$\psi_p(x_p, \lambda_p) = \begin{cases} 1/(a_p^T x_p - \lambda_p) & a_p^T x_p > \lambda_p, \;\; x_p \succeq x_p^{\min} \\ \infty & \text{otherwise,} \end{cases}$$

where $a_p \in \mathbf{R}^n_{++}$ and $x_p^{\min} \in \mathbf{R}^n_{++}$ are given data. The vectors $a_p$ and $x_p^{\min}$ are stored as the columns of the matrices A and x_min, respectively. Give the optimal objective value and job traffic. Compare the optimal job traffic with the target job traffic.
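The quantities defined in this problem (process loads, process and job latencies, and the penalty objective) can be evaluated for a given traffic vector and allocation as sketched below. This is a plain-Python illustration with hypothetical function names and a toy instance with two jobs, two processes, and two resources; it uses the latency function of part (b), stores each $a_p$, $x_p$, and $x_p^{\min}$ as a row rather than a matrix column, and does not perform any optimization.

```python
import math


def process_loads(R, t):
    """Loads R t: traffic summed over the jobs that invoke each process."""
    return [sum(R[p][j] * t[j] for j in range(len(t))) for p in range(len(R))]


def latency(a_p, x_p, x_min_p, load_p):
    """1 / (a_p^T x_p - load_p) when the allocation is sufficient, else +inf."""
    capacity = sum(ai * xi for ai, xi in zip(a_p, x_p))
    if capacity > load_p and all(xi >= mi for xi, mi in zip(x_p, x_min_p)):
        return 1.0 / (capacity - load_p)
    return math.inf


def job_latencies(R, l_proc):
    """R^T l_proc: sum of the latencies of the processes each job invokes."""
    n_jobs = len(R[0])
    return [
        sum(l_proc[p] for p in range(len(R)) if R[p][j]) for j in range(n_jobs)
    ]


def objective(w, t_tar, t):
    """Weighted penalty sum_j w_j * max(t_tar_j - t_j, 0)."""
    return sum(wj * max(tt - tj, 0.0) for wj, tt, tj in zip(w, t_tar, t))


if __name__ == "__main__":
    R = [[1, 0], [1, 1]]               # process 0: job 0; process 1: jobs 0 and 1
    t = [2.0, 3.0]
    a = [[1.0, 1.0], [2.0, 1.0]]
    x = [[2.0, 2.0], [3.0, 1.0]]
    x_min = [[1.0, 1.0], [1.0, 1.0]]
    loads = process_loads(R, t)
    l_proc = [latency(a[p], x[p], x_min[p], loads[p]) for p in range(2)]
    print(loads, l_proc, job_latencies(R, l_proc))
    print(objective([1.0, 2.0], [3.0, 3.0], t))
```

The job-latency sum skips processes a job does not invoke, so an infinite latency on an unused process does not contaminate that job's latency.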

7. Probability bounds. Consider random variables $X_1, X_2, X_3, X_4$ that take values in $\{0, 1\}$. We are given the following marginal and conditional probabilities:

$$\begin{aligned}
\mathbf{Prob}(X_1 = 1) &= 0.9, \\
\mathbf{Prob}(X_2 = 1) &= 0.9, \\
\mathbf{Prob}(X_3 = 1) &= 0.1, \\
\mathbf{Prob}(X_1 = 1,\, X_4 = 0 \mid X_3 = 1) &= 0.7, \\
\mathbf{Prob}(X_4 = 1 \mid X_2 = 1,\, X_3 = 0) &= 0.6.
\end{aligned}$$

Explain how to find the minimum and maximum possible values of $\mathbf{Prob}(X_4 = 1)$, over all (joint) probability distributions consistent with the given data. Find these values and report them.

Hints. (You should feel free to ignore these hints.) CVX supports multidimensional arrays; for example, variable p(2,2,2,2) declares a 4-dimensional array of variables, with each of the four indices taking the values 1 or 2. The function sum(p,i) sums a multidimensional array p along the ith index. The expression sum(a(:)) gives the sum of all entries of a multidimensional array a. You might want to use the function definition sum_all = @(A) sum( A(:));, so sum_all(a) gives the sum of all entries in the multidimensional array a.
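As a rough illustration of how marginal and conditional probabilities are read off a joint distribution over $\{0,1\}^4$, the sketch below stores the joint distribution as a dictionary keyed by outcomes. It is plain Python, not CVX; the helper `prob` and the joint distribution (four independent bits) are hypothetical, and this distribution is not claimed to satisfy all of the given data.

```python
from itertools import product


def prob(p, event):
    """Total mass of the outcomes (x1, x2, x3, x4) for which event(...) holds."""
    return sum(mass for outcome, mass in p.items() if event(*outcome))


# Hypothetical joint distribution: four independent bits with Prob(X_i = 1) = q[i].
q = [0.9, 0.9, 0.1, 0.5]
p = {}
for outcome in product((0, 1), repeat=4):
    mass = 1.0
    for qi, xi in zip(q, outcome):
        mass *= qi if xi == 1 else 1.0 - qi
    p[outcome] = mass

if __name__ == "__main__":
    # Marginal: sum the joint over all other variables.
    print(prob(p, lambda x1, x2, x3, x4: x1 == 1))
    # Conditional: ratio of two sums of entries of the joint.
    joint = prob(p, lambda x1, x2, x3, x4: x1 == 1 and x4 == 0 and x3 == 1)
    given = prob(p, lambda x1, x2, x3, x4: x3 == 1)
    print(joint / given)
```

Every marginal is a sum of entries of the joint array, and every conditional probability is a ratio of two such sums.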


8. Perturbing a Hamiltonian to maximize an energy gap. A finite dimensional approximation of a quantum mechanical system is described by its Hamiltonian matrix $H \in \mathbf{S}^n$. We label the eigenvalues of $H$ as $\lambda_1 \le \cdots \le \lambda_n$, with corresponding orthonormal eigenvectors $v_1, \ldots, v_n$. In this context the eigenvalues are called the energy levels of the system, and the eigenvectors are called the eigenstates. The eigenstate $v_1$ is called the ground state, and $\lambda_1$ is the ground energy. The energy gap (between the ground and next state) is $\eta = \lambda_2 - \lambda_1$.

By changing the environment (say, applying external fields), we can perturb a nominal Hamiltonian matrix to obtain the perturbed Hamiltonian, which has the form

$$H = H^{\mathrm{nom}} + \sum_{i=1}^k x_i H_i.$$

Here $H^{\mathrm{nom}} \in \mathbf{S}^n$ is the nominal (unperturbed) Hamiltonian, $x \in \mathbf{R}^k$ gives the strength or value of the perturbations, and $H_1, \ldots, H_k \in \mathbf{S}^n$ characterize the perturbations. We have limits for each perturbation, which we express as $|x_i| \le 1$, $i = 1, \ldots, k$. The problem is to choose $x$ to maximize the gap $\eta$ of the perturbed Hamiltonian, subject to the constraint that the perturbed Hamiltonian $H$ has the same ground state (up to scaling, of course) as the unperturbed Hamiltonian $H^{\mathrm{nom}}$. The problem data are the nominal Hamiltonian matrix $H^{\mathrm{nom}}$ and the perturbation matrices $H_1, \ldots, H_k$.

(a) Explain how to formulate this as a convex or quasiconvex optimization problem. If you change variables, explain the change of variables clearly.

(b) Carry out the method of part (a) for the problem instance with data given in hamiltonian_gap_data.m. Give the optimal perturbations, and the energy gap for the nominal and perturbed systems. The data $H_i$ are given as a cell array; H{i} gives $H_i$.
