
Convex Optimization Boyd & Vandenberghe

7. Statistical estimation

- maximum likelihood estimation
- optimal detector design
- experiment design

7-1
Parametric distribution estimation

distribution estimation problem: estimate probability density p(y) of a
random variable from observed values

parametric distribution estimation: choose from a family of densities
p_x(y), indexed by a parameter x

maximum likelihood estimation

    maximize (over x)   log p_x(y)

- y is observed value
- l(x) = log p_x(y) is called log-likelihood function
- can add constraints x ∈ C explicitly, or define p_x(y) = 0 for x ∉ C
- a convex optimization problem if log p_x(y) is concave in x for fixed y

Statistical estimation 7-2
Linear measurements with IID noise

linear measurement model

    y_i = a_i^T x + v_i,   i = 1, ..., m

- x ∈ R^n is vector of unknown parameters
- v_i is IID measurement noise, with density p(z)
- y_i is measurement: y ∈ R^m has density p_x(y) = ∏_{i=1}^m p(y_i − a_i^T x)

maximum likelihood estimate: any solution x of

    maximize   l(x) = Σ_{i=1}^m log p(y_i − a_i^T x)

(y is observed value)

Statistical estimation 7-3
examples

- Gaussian noise N(0, σ²): p(z) = (2πσ²)^{−1/2} e^{−z²/(2σ²)},

      l(x) = −(m/2) log(2πσ²) − (1/(2σ²)) Σ_{i=1}^m (a_i^T x − y_i)²

  ML estimate is LS solution

- Laplacian noise: p(z) = (1/(2a)) e^{−|z|/a},

      l(x) = −m log(2a) − (1/a) Σ_{i=1}^m |a_i^T x − y_i|

  ML estimate is ℓ1-norm solution

- uniform noise on [−a, a]:

      l(x) = −m log(2a)   if |a_i^T x − y_i| ≤ a, i = 1, ..., m
             −∞           otherwise

  ML estimate is any x with |a_i^T x − y_i| ≤ a, i = 1, ..., m

Statistical estimation 7-4
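The Gaussian case can be checked numerically: maximizing l(x) is exactly least squares, so the ML estimate solves the normal equations. A minimal sketch (the sizes, σ = 0.1, and x_true are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 3
A = rng.normal(size=(m, n))                 # rows are the a_i^T
x_true = np.array([1.0, -2.0, 0.5])
y = A @ x_true + 0.1 * rng.normal(size=m)   # Gaussian noise, sigma = 0.1

# ML estimate under Gaussian noise = least-squares solution
x_ls, *_ = np.linalg.lstsq(A, y, rcond=None)

# same point from the normal equations (gradient of l(x) set to zero)
x_ne = np.linalg.solve(A.T @ A, A.T @ y)
```

Under Laplacian noise the same data would instead call for minimizing the ℓ1 residual norm, which no longer has a closed form.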
Logistic regression

random variable y ∈ {0, 1} with distribution

    p = prob(y = 1) = exp(a^T u + b) / (1 + exp(a^T u + b))

- a, b are parameters; u ∈ R^n are (observable) explanatory variables
- estimation problem: estimate a, b from m observations (u_i, y_i)
- log-likelihood function (for y_1 = ··· = y_k = 1, y_{k+1} = ··· = y_m = 0):

      l(a, b) = log ( ∏_{i=1}^k exp(a^T u_i + b)/(1 + exp(a^T u_i + b))
                      · ∏_{i=k+1}^m 1/(1 + exp(a^T u_i + b)) )

              = Σ_{i=1}^k (a^T u_i + b) − Σ_{i=1}^m log(1 + exp(a^T u_i + b))

  concave in a, b

Statistical estimation 7-5
example (n = 1, m = 50 measurements)

[figure: prob(y = 1) versus u, for u ranging over [0, 10]]

- circles show 50 points (u_i, y_i)
- solid curve is ML estimate of p = exp(au + b)/(1 + exp(au + b))

Statistical estimation 7-6
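Because l(a, b) is concave and smooth, the ML estimate can be computed by plain gradient ascent on the log-likelihood. A self-contained sketch mirroring the n = 1, m = 50 setup above (the data-generating values a_true, b_true, the seed, and the step size are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 50
u = rng.uniform(0, 10, size=m)                    # explanatory variable (n = 1)
a_true, b_true = 1.0, -5.0
p_true = 1 / (1 + np.exp(-(a_true * u + b_true)))
y = (rng.uniform(size=m) < p_true).astype(float)  # Bernoulli observations

def loglik(a, b):
    z = a * u + b
    return np.sum(y * z - np.log1p(np.exp(z)))

a, b = 0.0, 0.0
step = 1e-3            # below the inverse gradient-Lipschitz bound for this data
for _ in range(20000):
    pz = 1 / (1 + np.exp(-(a * u + b)))           # current model probabilities
    a += step * np.sum((y - pz) * u)              # dl/da
    b += step * np.sum(y - pz)                    # dl/db
```

Any concave maximizer (Newton, BFGS, or a modeling tool like CVXPY) would do; gradient ascent is just the smallest dependency-free sketch.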
(Binary) hypothesis testing

detection (hypothesis testing) problem

given observation of a random variable X ∈ {1, ..., n}, choose between:

- hypothesis 1: X was generated by distribution p = (p_1, ..., p_n)
- hypothesis 2: X was generated by distribution q = (q_1, ..., q_n)

randomized detector

- a nonnegative matrix T ∈ R^{2×n}, with 1^T T = 1^T
- if we observe X = k, we choose hypothesis 1 with probability t_{1k},
  hypothesis 2 with probability t_{2k}
- if all elements of T are 0 or 1, it is called a deterministic detector

Statistical estimation 7-7
detection probability matrix:

    D = [ Tp  Tq ] = [ 1 − P_fp     P_fn   ]
                     [   P_fp     1 − P_fn ]

- P_fp is probability of selecting hypothesis 2 if X is generated by
  distribution 1 (false positive)
- P_fn is probability of selecting hypothesis 1 if X is generated by
  distribution 2 (false negative)

multicriterion formulation of detector design

    minimize (w.r.t. R²_+)   (P_fp, P_fn) = ((Tp)_2, (Tq)_1)
    subject to               t_{1k} + t_{2k} = 1,  k = 1, ..., n
                             t_{ik} ≥ 0,  i = 1, 2,  k = 1, ..., n

variable T ∈ R^{2×n}

Statistical estimation 7-8
scalarization (with weight λ > 0)

    minimize    (Tp)_2 + λ (Tq)_1
    subject to  t_{1k} + t_{2k} = 1,  t_{ik} ≥ 0,  i = 1, 2,  k = 1, ..., n

an LP with a simple analytical solution

    (t_{1k}, t_{2k}) = (1, 0)   if p_k ≥ λ q_k
                       (0, 1)   if p_k < λ q_k

- a deterministic detector, given by a likelihood ratio test
- if p_k = λ q_k for some k, any value 0 ≤ t_{1k} ≤ 1, t_{1k} = 1 − t_{2k} is
  optimal (i.e., Pareto-optimal detectors include non-deterministic detectors)

minimax detector

    minimize    max{P_fp, P_fn} = max{(Tp)_2, (Tq)_1}
    subject to  t_{1k} + t_{2k} = 1,  t_{ik} ≥ 0,  i = 1, 2,  k = 1, ..., n

an LP; solution is usually not deterministic

Statistical estimation 7-9
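The scalarized solution is easy to evaluate directly. A sketch using the two columns of the matrix P from the example that follows as p and q (λ = 1 is an arbitrary choice of weight):

```python
import numpy as np

p = np.array([0.70, 0.20, 0.05, 0.05])   # hypothesis 1 distribution
q = np.array([0.10, 0.10, 0.70, 0.10])   # hypothesis 2 distribution
lam = 1.0                                # scalarization weight

# likelihood ratio test: choose hypothesis 1 where p_k >= lam * q_k
t1 = (p >= lam * q).astype(float)
T = np.vstack([t1, 1 - t1])              # 2 x n detector matrix, columns sum to 1

Pfp = (T @ p)[1]                         # prob. of choosing hyp. 2 under p
Pfn = (T @ q)[0]                         # prob. of choosing hyp. 1 under q
```

For these distributions the test selects outcomes 1 and 2 for hypothesis 1, giving (P_fp, P_fn) = (0.10, 0.20); sweeping λ traces out the deterministic points on the trade-off curve.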
example

        0.70  0.10
        0.20  0.10
    P = 0.05  0.70
        0.05  0.10

[figure: trade-off curve of P_fn versus P_fp, with labeled detectors 1, 2, 3
on the curve and the minimax detector 4]

solutions 1, 2, 3 (and endpoints) are deterministic; 4 is minimax detector

Statistical estimation 7-10
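The minimax detector itself is a small LP. A sketch for this same example, assuming SciPy is available; the variables are the entries of T plus an epigraph variable s bounding both error probabilities:

```python
import numpy as np
from scipy.optimize import linprog

p = np.array([0.70, 0.20, 0.05, 0.05])
q = np.array([0.10, 0.10, 0.70, 0.10])
n = p.size

# variables: [t_11 .. t_1n, t_21 .. t_2n, s]; minimize s
c = np.zeros(2 * n + 1)
c[-1] = 1.0
A_ub = np.zeros((2, 2 * n + 1))
A_ub[0, n:2 * n] = p                     # (Tp)_2 <= s
A_ub[1, :n] = q                          # (Tq)_1 <= s
A_ub[:, -1] = -1.0
b_ub = np.zeros(2)
A_eq = np.hstack([np.eye(n), np.eye(n), np.zeros((n, 1))])   # t_1k + t_2k = 1
b_eq = np.ones(n)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq)  # default bounds: >= 0
```

For these distributions the optimal value is 1/6 ≈ 0.167 with P_fp = P_fn, attained by randomizing on outcome k = 2, consistent with the non-deterministic minimax detector (point 4) above.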


Experiment design

m linear measurements y_i = a_i^T x + w_i, i = 1, ..., m of unknown x ∈ R^n

- measurement errors w_i are IID N(0, 1)
- ML (least-squares) estimate is

      x̂ = ( Σ_{i=1}^m a_i a_i^T )^{−1} Σ_{i=1}^m y_i a_i

- error e = x̂ − x has zero mean and covariance

      E = E e e^T = ( Σ_{i=1}^m a_i a_i^T )^{−1}

  confidence ellipsoids are given by {x | (x − x̂)^T E^{−1} (x − x̂) ≤ β}

- experiment design: choose a_i ∈ {v_1, ..., v_p} (a set of possible test
  vectors) to make E small

Statistical estimation 7-11
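The covariance formula can be sanity-checked by simulation: over many repeated experiments with the same measurement vectors, the empirical covariance of the estimation error should match (Σ a_i a_i^T)^{−1}. A sketch (the sizes, seed, and trial count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, trials = 2, 20, 40000
A = rng.normal(size=(m, n))            # rows are the measurement vectors a_i
x = np.array([1.0, -1.0])              # the "unknown" parameter
E = np.linalg.inv(A.T @ A)             # predicted error covariance

W = rng.normal(size=(trials, m))       # w_i IID N(0, 1), one row per experiment
Y = x @ A.T + W                        # y_i = a_i^T x + w_i, all trials at once
Xhat = np.linalg.solve(A.T @ A, A.T @ Y.T).T   # LS estimate for each experiment
err = Xhat - x                         # e = xhat - x
emp = err.T @ err / trials             # empirical covariance of e
```

The empirical mean of e is near zero and emp agrees with E up to Monte Carlo noise.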


vector optimization formulation

    minimize (w.r.t. S^n_+)   E = ( Σ_{k=1}^p m_k v_k v_k^T )^{−1}
    subject to                m_k ≥ 0,  m_1 + ··· + m_p = m
                              m_k ∈ Z

- variables are m_k (# vectors a_i equal to v_k)
- difficult in general, due to integer constraint

relaxed experiment design

assume m ≫ p, use λ_k = m_k/m as (continuous) real variable

    minimize (w.r.t. S^n_+)   E = (1/m) ( Σ_{k=1}^p λ_k v_k v_k^T )^{−1}
    subject to                λ ⪰ 0,  1^T λ = 1

- common scalarizations: minimize log det E, tr E, λ_max(E), ...
- can add other convex constraints, e.g., bound experiment cost c^T λ ≤ B

Statistical estimation 7-12


D-optimal design

    minimize    log det ( Σ_{k=1}^p λ_k v_k v_k^T )^{−1}
    subject to  λ ⪰ 0,  1^T λ = 1

interpretation: minimizes volume of confidence ellipsoids

dual problem

    maximize    log det W + n log n
    subject to  v_k^T W v_k ≤ 1,  k = 1, ..., p

interpretation: {x | x^T W x ≤ 1} is minimum volume ellipsoid centered at
origin, that includes all test vectors v_k

complementary slackness: for λ, W primal and dual optimal,

    λ_k (1 − v_k^T W v_k) = 0,  k = 1, ..., p

optimal experiment uses vectors v_k on boundary of ellipsoid defined by W

Statistical estimation 7-13
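The relaxed D-optimal problem can be handed to any convex solver; one classical specialized method (not described in these slides, an assumed addition) is the multiplicative update λ_k ← λ_k (v_k^T X(λ)^{−1} v_k)/n with X(λ) = Σ_k λ_k v_k v_k^T, which preserves 1^T λ = 1 and increases det X monotonically. A sketch with random test vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 2, 20
V = rng.normal(size=(p, n))                       # rows are test vectors v_k

lam = np.full(p, 1.0 / p)                         # start from the uniform design
for _ in range(20000):
    X = V.T @ (lam[:, None] * V)                  # X = sum_k lam_k v_k v_k^T
    d = np.einsum('ij,jk,ik->i', V, np.linalg.inv(X), V)  # d_k = v_k^T X^-1 v_k
    lam *= d / n                                  # multiplicative update

X = V.T @ (lam[:, None] * V)
wk = np.einsum('ij,jk,ik->i', V, np.linalg.inv(X), V)
W = np.linalg.inv(X) / n                          # {x | x^T W x <= 1} covers all v_k
```

At convergence, complementary slackness holds approximately: wk ≤ n for every k, with equality on the support of λ, so the vectors that receive weight sit on the boundary of the ellipsoid defined by W.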


example (p = 20)

[figure: 20 test vectors in the plane and the ellipse defined by the optimal W;
the two vectors with weights λ_1 = 0.5 and λ_2 = 0.5 lie on its boundary]

design uses two vectors, on boundary of ellipse defined by optimal W

Statistical estimation 7-14


derivation of dual of page 7-13

first reformulate primal problem with new variable X:

    minimize    log det X^{−1}
    subject to  X = Σ_{k=1}^p λ_k v_k v_k^T,  λ ⪰ 0,  1^T λ = 1

    L(X, λ, Z, z, ν) = log det X^{−1}
                       + tr( Z ( X − Σ_{k=1}^p λ_k v_k v_k^T ) )
                       − z^T λ + ν (1^T λ − 1)

- minimize over X by setting gradient to zero: −X^{−1} + Z = 0, i.e.,
  X = Z^{−1}; substituting back gives log det Z + n
- minimum over λ_k is −∞ unless −v_k^T Z v_k − z_k + ν = 0

dual problem

    maximize    n − ν + log det Z
    subject to  v_k^T Z v_k ≤ ν,  k = 1, ..., p

change variable W = Z/ν, and optimize over ν to get dual of page 7-13

Statistical estimation 7-15
