
Silvey

This document summarizes a paper that discusses using a Lagrangian multiplier test to determine if a parameter belongs to a subset of possible parameters. It begins by introducing the problem and noting that restricted maximum likelihood estimation often involves solving equations containing a Lagrangian multiplier. It then establishes notation for describing the statistical problem and assumptions made. Finally, it discusses how the restricted maximum likelihood estimator emerges as a solution to equations involving the log-likelihood function, Lagrangian multiplier, and restriction conditions.


The Lagrangian Multiplier Test

Author(s): S. D. Silvey
Source: The Annals of Mathematical Statistics, Vol. 30, No. 2 (Jun., 1959), pp. 389-407
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/2237089
THE LAGRANGIAN MULTIPLIER TEST
BY S. D. SILVEY
University of Glasgow
1. Introduction. One of the problems which occurs most frequently in practical statistics is that of deciding, on the basis of a number of independent observations on a random variable, whether a finite dimensional parameter involved in the distribution function of the random variable belongs to a proper subset $\omega$ of the set $\Omega$ of possible parameters. Naturally this problem has received considerable attention and the main method which is currently applied in dealing with it is the well-known Neyman-Pearson likelihood ratio test. Direct application of this test involves finding the supremum of the likelihood function in the set $\omega$ and this in turn often involves the solution of restricted likelihood equations containing a Lagrangian multiplier. And the same set of equations has to be solved if, irrespective of the likelihood ratio test, it is desired to obtain a maximum likelihood estimate in the set $\omega$ of the unknown parameter. Rather surprisingly, since the problem is of such frequent occurrence, little seems to have appeared in statistical literature on such restricted maximum likelihood estimates, the main results in this field being contained in a recent paper by Aitchison and Silvey [1].
In this paper the authors introduced, on an intuitive basis, a method of testing whether the true parameter does belong to $\omega$, this method being based on the distribution of a random Lagrangian multiplier appearing in the restricted likelihood equations. It is the object of this present paper to discuss this Lagrangian multiplier test. In order to do so, it is necessary to consider how the results of the previous paper must be modified when the true parameter does not belong to the set $\omega$, because only in this way can we obtain any notion of the power of the test. Discussion of this point forms the initial part of the present paper. We will then show the connection between the Lagrangian multiplier test and the likelihood ratio test. Finally, since often in practice situations arise where the information matrix is singular, we will consider how the Lagrangian multiplier test must be adapted to meet this contingency.
The approach adopted by Aitchison and Silvey [1] in the discussion of restricted estimates is essentially Cramér's approach [4] to maximum likelihood estimates, i.e., attention is concentrated on solutions of the likelihood equations rather than on genuine maximum likelihood estimates. Such an approach is really unsuitable in the present instance where we do not necessarily assume that the true parameter does belong to the subset $\omega$. And we will use instead the method used by Wald [7] in his discussion of the consistency of maximum likelihood estimators. As has been pointed out by Kraft and Le Cam [5], Wald's approach to unrestricted maximum likelihood estimation is much more illuminating than that of Cramér and, not surprisingly, this is still true of restricted estimation. Unfortunately the change in viewpoint necessitates certain changes in the notation used by Aitchison and Silvey, and these we will now introduce in describing mathematically the situation to be discussed.

Received March 14, 1958.
2. Notation. The basic situation in which we shall be interested is described mathematically as follows.
Corresponding to each point $\theta = (\theta_1, \theta_2, \cdots, \theta_s)$ in some subset $\Omega$ of $s$-dimensional Euclidean space, denoted by $R^s$, is a distribution function $F(\cdot, \theta)$ defined on $R^a$, where $a$ is some given integer. A random variable $X$, taking values in $R^a$, has distribution function $F(\cdot, \theta_0)$ where $\theta_0$ is known to belong to $\Omega$ but is otherwise unknown; though it is suspected that $\theta_0$ belongs to a subset $\omega = \Omega \cap \{\theta : h(\theta) = 0\}$ of $\Omega$, where $h = (h_1, h_2, \cdots, h_r)$ is a well-behaved function from $R^s$ into $R^r$, $r < s$.
We will assume, as is usual, that for all $\theta \in \Omega$, $F(\cdot, \theta)$ is either discrete or absolutely continuous, and admits an elementary probability law $f(\cdot, \theta)$. Then for a given sequence $x = (x_1, x_2, \cdots, x_n, \cdots)$ of independent observations on $X$, the log-likelihood function $\log L_n(x, \cdot)$ is defined on $\Omega$ by $\log L_n(x, \theta) = \sum_{i=1}^{n} \log f(x_i, \theta)$. By a maximum likelihood estimate of $\theta_0$ in any subset $\omega^*$ of $\Omega$, we mean an element $\hat\theta_n(x, \omega^*)$ of $\omega^*$ which is such that
$$\log L_n(x, \hat\theta_n(x, \omega^*)) = \sup_{\theta \in \omega^*} \log L_n(x, \theta).$$

If a single-valued function $\hat\theta_n(\cdot, \omega^*)$ is thus defined for almost all $x$, then $\hat\theta_n(\cdot, \omega^*)$ is a random variable called a maximum likelihood estimator of $\theta_0$ in $\omega^*$. When we refer to "almost all $x$" we mean almost all with respect to the probability measure defined on the sequence space of points $x$ by the consideration that the components of a sequence $x$ are regarded as independent observations on a random variable $X$ with distribution function $F(\cdot, \theta_0)$. Similarly "almost all $t \in R^a$" means almost all with respect to the probability measure defined on $R^a$ by $F(\cdot, \theta_0)$.
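The matrix whose $(i, j)$th element is
$$\int_{R^a} \frac{\partial \log f(t, \theta)}{\partial \theta_i}\, \frac{\partial \log f(t, \theta)}{\partial \theta_j}\, dF(t, \theta),$$
we will denote by $B_\theta$. Further, $H_\theta$ will denote the $s \times r$ matrix $(\partial h_j(\theta)/\partial \theta_i)$. For any real function $\psi$ defined on $R^s$, $D\psi(\theta)$ will denote the column vector whose $i$th component is $\partial \psi(\theta)/\partial \theta_i$, while $D^2\psi(\theta)$ will denote the $s \times s$ matrix whose $(i, j)$th component is $\partial^2 \psi(\theta)/\partial \theta_i \partial \theta_j$. Generally column vectors corresponding to points in Euclidean space will be printed in the corresponding boldface type so that, for example, the column vector $\boldsymbol{\theta}$ corresponds to the point $\theta$.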
We will be interested initially in the emergence of $\hat\theta_n(x, \omega)$ as a solution of the equations
$$n^{-1} D \log L_n(x, \theta) + H_\theta \lambda = 0,$$
$$h(\theta) = 0,$$
where $\lambda$ is a Lagrangian multiplier in $R^r$, and generally in the restricted maximum likelihood estimator $\hat\theta_n(\cdot, \omega)$.
3. $\hat\theta_n(x, \omega)$ and the likelihood equations. Naturally the discussion on which we have embarked will involve the introduction of various assumptions concerning $F$ and $h$. The assumptions that we will introduce are not designed to achieve complete mathematical generality but are, we hope, of such a nature that they will not obscure the over-all mathematical picture and will be satisfied in many practical problems. The first of these assumptions is as follows.
Assumption 1. For every $\theta \in \Omega$, $z(\theta) = \int_{R^a} \log f(t, \theta)\, dF(t, \theta_0)$ exists.
The whole problem of maximum likelihood estimation, restricted and unrestricted, is closely bound up with the behaviour of the function $z$, because the Law of Large Numbers ensures that, for each $\theta$, the sequence $(n^{-1} \log L_n(x, \theta))$ converges, for almost all $x$, to $z(\theta)$. If, further, this convergence is uniform with respect to $\theta$, then for large $n$ and most $x$, $n^{-1} \log L_n(x, \cdot)$ will be uniformly near $z$ and under suitable conditions will attain its supremum in $\omega$ near the point (if such exists) where $z$ attains its supremum in $\omega$. The assumptions which we will now introduce are designed to achieve this desirable situation.
Assumption 2. $\Omega$ is a convex compact subset of $R^s$.
Assumption 3. For almost all $t \in R^a$, $\log f(t, \cdot)$ is continuous on $\Omega$.
Assumption 4. For almost all $t \in R^a$, and for every $\theta \in \Omega$, $\partial \log f(t, \theta)/\partial \theta_i$ $(i = 1, 2, \cdots, s)$ exists and $|\partial \log f(t, \theta)/\partial \theta_i| < g(t)$ $(i = 1, 2, \cdots, s)$, where $\int_{R^a} g(t)\, dF(t, \theta_0)$ is finite.
Assumption 5. The function $h$ is continuous on $\Omega$.
Assumption 6. There exists a point $\theta^* \in \omega$ such that $z(\theta^*) > z(\theta)$ when $\theta \in \omega$ and $\theta \neq \theta^*$.
Assumptions 2-4 ensure that for almost all $x$ the sequence $(n^{-1} \log L_n(x, \theta))$ converges to $z(\theta)$ uniformly with respect to $\theta$ in the set $\Omega$. Assumptions 2 and 5 ensure that $\omega$ is a compact subset of $R^s$ and consequently that any continuous function on $\omega$ attains its supremum at some point of $\omega$. In particular the function $\log L_n(x, \cdot)$, for almost all $x$, attains its supremum in $\omega$ at some point $\hat\theta_n(x, \omega)$ of $\omega$. Assumption 6 then ensures that for almost all $x$ the sequence $(\hat\theta_n(x, \omega))$ converges to $\theta^*$. The proofs of these results are fairly straightforward and we omit them.
It is of some interest to note that if $\theta_0 \in \omega$ then usually $\theta_0$ will satisfy the condition demanded of $\theta^*$. This has been proved by Wald [7]. In fact, when interest is concentrated on the case where $\theta_0 \in \omega$, Assumption 6 may be replaced by the following
Assumption 6A. $\theta_0 \in \omega$ and if $\theta \neq \theta_0$ then for at least one $t \in R^a$, $F(t, \theta) \neq F(t, \theta_0)$. This is sufficient to ensure that $z(\theta_0) > z(\theta)$ if $\theta \neq \theta_0$.
As stated above, Assumptions 1-6 ensure the existence of a maximum likelihood estimator in $\omega$ of $\theta_0$ which converges with probability one to $\theta^*$. If in addition we make the following Assumption 7 then for large $n$ and most $x$, $\hat\theta_n(x, \omega)$ will be an interior point of $\omega$ and consequently will emerge as a solution of the restricted likelihood equations, when the function $h$ is differentiable.
Assumption 7. $\theta^*$ is an interior point of $\omega$. Now making assumptions 1-7, we will use these likelihood equations in discussing the asymptotic distribution of $\hat\theta_n(\cdot, \omega)$.
4. The asymptotic distribution of $\hat\theta_n(\cdot, \omega)$. The method by which the asymptotic distribution of maximum likelihood estimators is usually derived, for example by Cramér [4], involves expanding the likelihood function by Taylor's Theorem. In order that we may adopt this method in the present instance we now introduce the following assumptions, similar to those of Cramér.
Assumption 8. The functions $h_i$ possess first and second order partial derivatives which are continuous (and so bounded) on $\Omega$.
Assumption 9. For almost all $t \in R^a$ the function $\log f(t, \cdot)$ possesses continuous second order partial derivatives in a neighborhood of $\theta^*$. Also, if $\theta$ belongs to this neighborhood, then $|\partial^2 \log f(t, \theta)/\partial \theta_i \partial \theta_j| < G_1(t)$ $(i, j = 1, 2, \cdots, s)$ where $\int_{R^a} G_1(t)\, dF(t, \theta_0)$ is finite.
Assumption 10. For almost all $t \in R^a$ the function $\log f(t, \cdot)$ possesses third order partial derivatives in a neighborhood of $\theta^*$ and, if $\theta$ is in this neighborhood, then
$$|\partial^3 \log f(t, \theta)/\partial \theta_i \partial \theta_j \partial \theta_k| < G_2(t) \quad (i, j, k = 1, 2, \cdots, s),$$
where $\int_{R^a} G_2(t)\, dF(t, \theta_0)$ is finite.


(4.1) Important implications for our purposes of Assumptions 4, 9 and 10 are as follows.
(4.1.1) The vector $Dz(\theta)$ exists for every $\theta \in \Omega$ and the sequence $(D n^{-1} \log L_n(x, \theta))$ of vectors converges for almost all $x$ to $Dz(\theta)$ (Assumption 4).
(4.1.2) The matrix $D^2 z(\theta^*)$ exists and the sequence $(D^2 n^{-1} \log L_n(x, \theta^*))$ of matrices converges for almost all $x$ to $D^2 z(\theta^*)$ (Assumption 9).
(4.1.3) For almost all $x$ and $i, j, k = 1, 2, \cdots, s$ the sequence $(n^{-1} \partial^3 \log L_n(x, \theta)/\partial \theta_i \partial \theta_j \partial \theta_k)$ is bounded uniformly with respect to $\theta$ in a neighborhood of $\theta^*$ (Assumption 10).
Each of these three statements is almost a direct consequence of the Strong Law of Large Numbers.
We are now in a position to obtain the asymptotic distribution of $\hat\theta_n(\cdot, \omega)$. For brevity we will now write $\hat\theta$ instead of $\hat\theta_n(x, \omega)$. Since $\hat\theta \to \theta^*$ for almost all $x$, we find by applying Taylor's Theorem and using (4.1.2) and (4.1.3) that
$$D n^{-1} \log L_n(x, \hat\theta) = D n^{-1} \log L_n(x, \theta^*) + [D^2 z(\theta^*) + o(1)][\hat\theta - \theta^*] \tag{4.1.4}$$
for almost all $x$.
Also because of the continuity of the first partial derivatives of the functions $h_i$, for almost all $x$,
$$H_{\hat\theta} = H_{\theta^*} + o(1) \tag{4.1.5}$$
and
$$h(\hat\theta) = [H'_{\theta^*} + o(1)][\hat\theta - \theta^*]. \tag{4.1.6}$$
For almost any $x$, if $n$ is sufficiently large, $\hat\theta$ will, with a certain Lagrangian multiplier $\hat\lambda_n(x)$, satisfy the restricted likelihood equations. So we have, writing $\hat\lambda$ in place of $\hat\lambda_n(x)$ for brevity,
$$D n^{-1} \log L_n(x, \theta^*) + [D^2 z(\theta^*) + o(1)][\hat\theta - \theta^*] + H_{\hat\theta} \hat\lambda = 0, \tag{4.1.7}$$
$$[H'_{\theta^*} + o(1)][\hat\theta - \theta^*] = 0. \tag{4.1.8}$$
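Since $z(\theta^*)$ is a maximum in the set $\omega$ of the function $z$, there exists a Lagrangian multiplier $\lambda^* = (\lambda_1^*, \lambda_2^*, \cdots, \lambda_r^*)$ such that
$$Dz(\theta^*) + H_{\theta^*} \lambda^* = 0, \tag{4.1.9}$$
and on subtracting (4.1.9) from (4.1.7), and using (4.1.5) we obtain
$$[D n^{-1} \log L_n(x, \theta^*) - Dz(\theta^*)] + [D^2 z(\theta^*) + o(1)][\hat\theta - \theta^*] + [H_{\theta^*} + o(1)][\hat\lambda - \lambda^*] + [H_{\hat\theta} - H_{\theta^*}]\lambda^* = 0. \tag{4.1.10}$$
Now on expanding the elements of the matrix $H_{\hat\theta}$ by Taylor's Theorem, we find that, because of the continuity of the second order partial derivatives of the functions $h_i$, for almost all $x$,
$$[H_{\hat\theta} - H_{\theta^*}]\lambda^* = \Big[\sum_{i=1}^{r} \lambda_i^* D^2 h_i(\theta^*) + o(1)\Big][\hat\theta - \theta^*]. \tag{4.1.11}$$
We will denote by $-B^*_{\theta^*}$ the matrix $D^2 z(\theta^*) + \sum_{i=1}^{r} \lambda_i^* D^2 h_i(\theta^*)$. Then on substituting in (4.1.10) the expression for $[H_{\hat\theta} - H_{\theta^*}]\lambda^*$ contained in (4.1.11) we have
$$[B^*_{\theta^*} + o(1)][\hat\theta - \theta^*] - [H_{\theta^*} + o(1)][\hat\lambda - \lambda^*] = D n^{-1} \log L_n(x, \theta^*) - Dz(\theta^*), \tag{4.1.12}$$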
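and combining (4.1.12) and (4.1.8) we may write
$$\begin{bmatrix} B^*_{\theta^*} + o(1) & -H_{\theta^*} + o(1) \\ -H'_{\theta^*} + o(1) & 0 \end{bmatrix} \begin{bmatrix} \hat\theta - \theta^* \\ \hat\lambda - \lambda^* \end{bmatrix} = \begin{bmatrix} D n^{-1} \log L_n(x, \theta^*) - Dz(\theta^*) \\ 0 \end{bmatrix}. \tag{4.1.13}$$
We will now make the final assumptions which enable us to derive the asymptotic distribution of $\hat\theta_n(\cdot, \omega)$ and $\hat\lambda_n(\cdot)$.
Assumption 11. The matrix
$$\begin{bmatrix} B^*_{\theta^*} & -H_{\theta^*} \\ -H'_{\theta^*} & 0 \end{bmatrix}$$
is non-singular.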

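Assumption 12. For $i, j = 1, 2, \cdots, s$,
$$\beta_{ij}(\theta^*) = \int_{R^a} \frac{\partial \log f(t, \theta^*)}{\partial \theta_i}\, \frac{\partial \log f(t, \theta^*)}{\partial \theta_j}\, dF(t, \theta_0)$$
exists.
We now define
$$\begin{bmatrix} P^*_{\theta^*} & Q^*_{\theta^*} \\ Q^{*\prime}_{\theta^*} & R^*_{\theta^*} \end{bmatrix} = \begin{bmatrix} B^*_{\theta^*} & -H_{\theta^*} \\ -H'_{\theta^*} & 0 \end{bmatrix}^{-1}$$
and $V_{\theta^*} = (\beta_{ij}(\theta^*)) - [Dz(\theta^*)][Dz(\theta^*)]'$. By the multivariate form of a Central Limit Theorem (Cramér [3]) it follows from the existence of the matrix $V_{\theta^*}$ that the distribution of $\sqrt{n}\,[D n^{-1} \log L_n(\cdot, \theta^*) - Dz(\theta^*)]$ is asymptotically normal with mean 0 and variance matrix $V_{\theta^*}$. Then from (4.1.13), by the multivariate extension of a theorem of Cramér [4] we have the results stated in the following lemma.
LEMMA 1. Under Assumptions 1-12 the random vector
$$\sqrt{n} \begin{bmatrix} \hat\theta_n(\cdot, \omega) - \theta^* \\ \hat\lambda_n(\cdot) - \lambda^* \end{bmatrix}$$
is asymptotically normally distributed with mean 0 and variance matrix
$$\begin{bmatrix} P^*_{\theta^*} V_{\theta^*} P^*_{\theta^*} & P^*_{\theta^*} V_{\theta^*} Q^*_{\theta^*} \\ Q^{*\prime}_{\theta^*} V_{\theta^*} P^*_{\theta^*} & Q^{*\prime}_{\theta^*} V_{\theta^*} Q^*_{\theta^*} \end{bmatrix}.$$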
We have now obtained a formal result regarding the behavior for large $n$ of the restricted maximum likelihood estimator, a result which might be used in most practical situations to determine the large sample power function of the test of the hypothesis that $\theta_0 \in \omega$, proposed by Aitchison and Silvey. (This might involve a considerable amount of computation). The extent to which the method of solving the likelihood equations which is proposed in the same paper can be used when $\theta_0 \notin \omega$ remains obscure, as does any general picture of the power of the test. However some light is shed on these questions by considering how the results here obtained particularize in the case when $\theta_0 \in \omega$.
(4.2) Accordingly we consider what happens when we replace Assumption 6 by Assumption 6A. Then $\theta_0$ replaces $\theta^*$ and $z(\theta_0)$, the maximum of $z$ in the set $\omega$, is also the maximum of $z$ in the set $\Omega$. Hence $Dz(\theta^*) = 0$ and $\lambda^* = 0$. The matrix $V_{\theta^*}$ becomes the matrix $B_{\theta_0}$ and, with the mild additional assumption
Assumption 13. $\int_{R^a} \partial^2 f(t, \theta_0)/\partial \theta_i \partial \theta_j\, dt = 0$ $(i, j = 1, 2, \cdots, s)$,
the matrix $B^*_{\theta^*}$ also becomes $B_{\theta_0}$. Consequently we have exactly the result of the previous paper [1] concerning the asymptotic distribution of the restricted estimator and the corresponding Lagrangian multiplier. The assumptions made here in deriving this distribution are, so far as comparison is possible, stronger than the assumptions of the previous paper, but we have now obtained a result concerning the genuine maximum likelihood estimator rather than merely a solution of the likelihood equations. (A greater degree of similarity between the two sets of assumptions is apparent if we note that in the case where $\theta_0 \in \omega$ we might replace Assumption 11 by the following
Assumption 11A. The matrix $B_{\theta_0}$ is positive definite and the matrix $H_{\theta_0}$ is of rank $r$).
(4.3) It is now possible to obtain a picture of the typical practical situation when $n$ is large and $\theta_0$, while not belonging to the set $\omega$, is very near this set. Usually then $z(\theta_0)$ will be $\sup_{\theta \in \Omega} z(\theta)$ and $\theta^*$ will be near $\theta_0$ so that $Dz(\theta^*)$ will be near $Dz(\theta_0) = 0$. Then $\lambda^*$ also will be near 0, though, since $n$ is large, $\sqrt{n}\,\lambda^*$ may be appreciably different from 0. Also the elements of $D^2 z(\theta^*)$ will be near those of $D^2 z(\theta_0) = -B_{\theta_0}$. If in addition $\beta_{ij}(\theta^*)$ is near the corresponding element of $B_{\theta_0}$, as will usually be the case, then we can say that approximately
$$\sqrt{n} \begin{bmatrix} \hat\theta - \theta^* \\ \hat\lambda - \lambda^* \end{bmatrix}$$
will have a multivariate normal distribution with mean 0 and variance matrix
$$\begin{bmatrix} P_{\theta_0} & 0 \\ 0 & -R_{\theta_0} \end{bmatrix},$$
this matrix being as defined in [1]. (It would be possible to give a rigorous mathematical derivation of this result by imagining the true parameter $\theta_0$ to vary with $n$ in such a way that the distance of $\theta_0$ from the set $\omega$ tended to 0 as $n \to \infty$, and by imposing suitable restrictions on the functions $f$ and $h$ to ensure that what is here said to happen usually would in fact happen. But this does not seem particularly profitable).
(4.4) Finally in this connection, because of the remarks made in the previous paragraph and of the flexibility of Newton's method of solving equations, we might expect that, in the case where $\theta_0$ is near the set $\omega$ and $n$ is large, the iterative method of solving the restricted likelihood equations suggested in [1] will still apply.
5. Three tests of the hypothesis that $\theta_0 \in \omega$. We will now compare three intuitively reasonable tests of the hypothesis that $\theta_0 \in \omega$. These are as follows.
(i) The likelihood ratio test. We accept the hypothesis if $\mu(x) = \sup_{\theta \in \omega} L_n(x, \theta) / \sup_{\theta \in \Omega} L_n(x, \theta)$ is "sufficiently near" 1.
(ii) The Wald test. Assuming the existence of $\hat\theta_n(x, \Omega)$, we accept the hypothesis if $h(\hat\theta_n(x, \Omega))$ is "sufficiently near" 0. (Wald [8]).
(iii) The Lagrangian multiplier test. Assuming the existence of $\hat\theta_n(x, \omega)$ and $\hat\lambda_n(x)$ we accept the hypothesis if $\hat\lambda_n(x)$ is "sufficiently near" 0. (Aitchison and Silvey [1]).
For typographical brevity we will now write $\tilde\theta$ for the unrestricted maximum likelihood estimator $\hat\theta_n(\cdot, \Omega)$, $\hat\theta$ for the restricted maximum likelihood estimator $\hat\theta_n(\cdot, \omega)$ and $\hat\lambda$ for the random variable $\hat\lambda_n(\cdot)$.
The measure of the distance from 0 of $h(\tilde\theta)$ used by Wald is, in our notation, $-n[h(\tilde\theta)]' R_{\tilde\theta} [h(\tilde\theta)]$; his test is based on this random variable and he has shown that under general conditions the asymptotic distributions of $-2 \log \mu$ and $-n[h(\tilde\theta)]' R_{\tilde\theta} [h(\tilde\theta)]$ are the same. The measure of the distance from 0 of $\hat\lambda$ in the test proposed by Aitchison and Silvey is $-n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda$. We will now show that subject to the following assumptions A we have
$$\operatorname{plim}\, 2 \log \mu = \operatorname{plim}\, n[h(\tilde\theta)]' R_{\tilde\theta} [h(\tilde\theta)] = \operatorname{plim}\, n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda.$$
Assumptions A. By assumptions A we mean the following set of assumptions: Assumptions 1-5, 6A, 7-10, 11A, 13 and
Assumption 12A. The matrix $B_\theta$ exists in a neighborhood of $\theta_0$, and its elements are continuous functions of $\theta$ there. Of course when assumption 6A is made, $\theta^*$ is replaced by $\theta_0$ in subsequent assumptions.
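We have already seen that these assumptions imply that $\hat\theta$ exists and almost certainly converges to $\theta_0$, and that for large $n$ and most $x$, $\hat\lambda_n(x)$ exists. It is not difficult to use the particular form to which (4.1.13) reduces when assumption 6A replaces assumption 6 to obtain the results
$$\sqrt{n}(\hat\theta - \theta_0) = n^{-1/2} P_{\theta_0} D \log L_n(\cdot, \theta_0) + o_p(1), \tag{5.1}$$
$$\sqrt{n}\,\hat\lambda = n^{-1/2} Q'_{\theta_0} D \log L_n(\cdot, \theta_0) + o_p(1). \tag{5.2}$$
Here $o_p$ is used in the sense of Mann and Wald [6] and $P_{\theta_0}$, $Q_{\theta_0}$ are defined by
$$\begin{bmatrix} B_{\theta_0} & -H_{\theta_0} \\ -H'_{\theta_0} & 0 \end{bmatrix}^{-1} = \begin{bmatrix} P_{\theta_0} & Q_{\theta_0} \\ Q'_{\theta_0} & R_{\theta_0} \end{bmatrix}. \tag{5.3}$$
Also it is easy to show by the same kind of argument as has been applied above that the assumptions A imply that $\tilde\theta$ exists and almost certainly converges to $\theta_0$ and that
(5.4) for almost any $x$ and sufficiently large $n$, $D \log L_n[x, \tilde\theta(x)] = 0$,
$$\sqrt{n}(\tilde\theta - \theta_0) = n^{-1/2} B_{\theta_0}^{-1} D \log L_n(\cdot, \theta_0) + o_p(1). \tag{5.5}$$
We will now use these results to prove the following lemmas.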
LEMMA 2. Subject to assumptions A,
$$-2 \log \mu = n(\hat\theta - \tilde\theta)' B_{\theta_0} (\hat\theta - \tilde\theta) + o_p(1).$$
PROOF. Clearly from (5.1) and (5.5), $\|\hat\theta - \tilde\theta\| = O_p(n^{-1/2})$. Hence on expanding $\log L_n(\cdot, \hat\theta)$ by Taylor's Theorem, we have, in virtue of (4.1.3) and (5.4),
$$\log L_n(\cdot, \hat\theta) = \log L_n(\cdot, \tilde\theta) + \tfrac{1}{2}(\hat\theta - \tilde\theta)' [D^2 \log L_n(\cdot, \tilde\theta)] [\hat\theta - \tilde\theta] + o_p(1).$$
Again from Taylor's Theorem we have $n^{-1} D^2 \log L_n(\cdot, \tilde\theta) = n^{-1} D^2 \log L_n(\cdot, \theta_0) + o_p(1)$, and from (4.1.2) and assumptions 9 and 13 (which imply $D^2 z(\theta_0) = -B_{\theta_0}$)
$$n^{-1} D^2 \log L_n(\cdot, \theta_0) = -B_{\theta_0} + o_p(1).$$
Hence
$$\log \mu = \log L_n(\cdot, \hat\theta) - \log L_n(\cdot, \tilde\theta) = -\tfrac{1}{2} n (\hat\theta - \tilde\theta)' [B_{\theta_0} + o_p(1)] (\hat\theta - \tilde\theta) + o_p(1),$$
and the result follows because $\|\hat\theta - \tilde\theta\| = O_p(n^{-1/2})$.
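LEMMA 3. Subject to assumptions A, $2 \log \mu = n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda + o_p(1)$.
PROOF. We have
$$\sqrt{n}(\hat\theta - \tilde\theta) = n^{-1/2} (P_{\theta_0} - B_{\theta_0}^{-1}) D \log L_n(\cdot, \theta_0) + o_p(1).$$
Now
$$[P_{\theta_0} - B_{\theta_0}^{-1}]\, B_{\theta_0}\, [P_{\theta_0} - B_{\theta_0}^{-1}] = B_{\theta_0}^{-1} - P_{\theta_0} = -Q_{\theta_0} R_{\theta_0}^{-1} Q'_{\theta_0},$$
these matrix relationships following easily from the definition of $P_{\theta_0}$, $Q_{\theta_0}$ and $R_{\theta_0}$ in (5.3). Hence
$$n(\hat\theta - \tilde\theta)' B_{\theta_0} (\hat\theta - \tilde\theta) = -n^{-1} [D \log L_n(\cdot, \theta_0)]' Q_{\theta_0} R_{\theta_0}^{-1} Q'_{\theta_0} [D \log L_n(\cdot, \theta_0)] + o_p(1) = -n \hat\lambda' R_{\theta_0}^{-1} \hat\lambda + o_p(1), \text{ by (5.2)}.$$
Since, according to assumption 12A, the elements of the matrix $B_\theta$ are continuous functions in a neighborhood of $\theta_0$, and by 11A $B_{\theta_0}$ is positive definite, $B_\theta$ will also be positive definite in a neighborhood of $\theta_0$. Similarly $H_\theta$ is of rank $r$ in a neighborhood of $\theta_0$ and so the matrix $R_\theta$ exists and its elements are continuous functions of $\theta$ in a neighborhood of $\theta_0$. It follows from the strong convergence of $\hat\theta$ to $\theta_0$ that $R_{\hat\theta}^{-1} = R_{\theta_0}^{-1} + o_p(1)$, and this completes the proof.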
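The matrix relationships quoted in the proof of Lemma 3 can be checked on a small numerical case. The sketch below is only an illustration under stated assumptions: the $2 \times 2$ matrix `B` and the $2 \times 1$ matrix `H` are hypothetical values chosen for the check (they are not taken from the paper), and the helper functions are written here for the purpose. Exact rational arithmetic is used so that the identities can be compared with `==`.

```python
from fractions import Fraction as F

def matmul(x, y):
    # Plain matrix product for lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def sub(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def neg(x):
    return [[-a for a in row] for row in x]

def inverse(m):
    """Gauss-Jordan inverse of a non-singular matrix, in exact rationals."""
    n = len(m)
    a = [[F(v) for v in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(m)]
    for c in range(n):
        piv = next(r for r in range(c, n) if a[r][c] != 0)
        a[c], a[piv] = a[piv], a[c]
        pv = a[c][c]
        a[c] = [v / pv for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [row[n:] for row in a]

# Hypothetical inputs: B positive definite (s = 2), H of rank r = 1.
B = [[F(2), F(1)], [F(1), F(3)]]
H = [[F(1)], [F(-1)]]
s, r = 2, 1
Ht = transpose(H)

# Bordered matrix [[B, -H], [-H', 0]] and its inverse [[P, Q], [Q', R]], as in (5.3).
M = ([B[i] + [-H[i][j] for j in range(r)] for i in range(s)]
     + [[-Ht[j][i] for i in range(s)] + [F(0)] * r for j in range(r)])
Minv = inverse(M)
P = [row[:s] for row in Minv[:s]]
Q = [row[s:] for row in Minv[:s]]
R = [row[s:] for row in Minv[s:]]

Binv = inverse(B)
Rinv = inverse(R)
D = sub(P, Binv)
lhs = matmul(matmul(D, B), D)                      # [P - B^-1] B [P - B^-1]
mid = sub(Binv, P)                                 # B^-1 - P
rhs = neg(matmul(matmul(Q, Rinv), transpose(Q)))   # -Q R^-1 Q'
```

For these inputs the three matrices `lhs`, `mid` and `rhs` coincide, and `R` is the negative of the inverse of $H' B^{-1} H$, which is the form used in the later lemmas.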
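LEMMA 4. Subject to assumptions A, $2 \log \mu = n[h(\tilde\theta)]' R_{\tilde\theta} [h(\tilde\theta)] + o_p(1)$.
PROOF. Since the second derivatives of the functions $h_i$ are bounded on $\Omega$ (assumption 8) and since $\|\tilde\theta - \theta_0\|$ is $O_p(n^{-1/2})$, we have
$$h(\tilde\theta) = h(\theta_0) + H'_{\theta_0}(\tilde\theta - \theta_0) + O_p(n^{-1}) = H'_{\theta_0}(\tilde\theta - \theta_0) + O_p(n^{-1}),$$
since by 6A, $\theta_0 \in \omega$. Hence $\sqrt{n}\, h(\tilde\theta) = n^{-1/2} H'_{\theta_0} B_{\theta_0}^{-1} D \log L_n(\cdot, \theta_0) + o_p(1)$ and
$$n[h(\tilde\theta)]' R_{\theta_0} [h(\tilde\theta)] = n^{-1} [D \log L_n(\cdot, \theta_0)]' B_{\theta_0}^{-1} H_{\theta_0} R_{\theta_0} H'_{\theta_0} B_{\theta_0}^{-1} [D \log L_n(\cdot, \theta_0)] + o_p(1).$$
It is easy to show that $B_{\theta_0}^{-1} H_{\theta_0} R_{\theta_0} H'_{\theta_0} B_{\theta_0}^{-1} = Q_{\theta_0} R_{\theta_0}^{-1} Q'_{\theta_0}$, and it follows that $n[h(\tilde\theta)]' R_{\theta_0} [h(\tilde\theta)] = n \hat\lambda' R_{\theta_0}^{-1} \hat\lambda + o_p(1)$. The proof is then completed by the remark that, as in Lemma 3, $R_{\tilde\theta} = R_{\theta_0} + o_p(1)$.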
LEMMA 5. Subject to assumptions A, each of the random variables $-2 \log \mu$, $-n[h(\tilde\theta)]' R_{\tilde\theta} [h(\tilde\theta)]$ and $-n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda$ is asymptotically distributed as $\chi^2$ with $r$ degrees of freedom.
This follows from lemmas 3 and 4 and from the fact that $\sqrt{n}\,\hat\lambda$ is asymptotically normally distributed with mean 0 and variance matrix $-R_{\theta_0}$.
In consequence of lemma 5, when $n$ is large the natural choices of critical regions of size $\alpha$ for testing the hypothesis that $\theta_0 \in \omega$ on the bases (i), (ii) and (iii) are $C_1$, $C_2$ and $C_3$ respectively where
$C_1$ is the set of $x$ on which $-2 \log \mu > k_\alpha$,
$C_2$ is the set of $x$ on which $-n[h(\tilde\theta)]' R_{\tilde\theta} [h(\tilde\theta)] > k_\alpha$, and
$C_3$ is the set of $x$ on which $-n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda > k_\alpha$.
Here $k_\alpha$ is determined by $\Pr\{\chi_r^2 > k_\alpha\} = \alpha$.
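As an illustration of how the three statistics compare in a concrete case, the following sketch evaluates them for a hypothetical model that is not taken from the paper: each observation is a pair of independent Bernoulli variables with success probabilities $\theta_1$ and $\theta_2$, and the restriction is $h(\theta) = \theta_1 - \theta_2 = 0$, so that $s = 2$ and $r = 1$. Under these assumptions $B_\theta$ is diagonal with entries $1/\{\theta_i(1 - \theta_i)\}$, $H_\theta = (1, -1)'$ and $R_\theta = -1/\{\theta_1(1 - \theta_1) + \theta_2(1 - \theta_2)\}$. The function name and the sample figures are invented for the example.

```python
import math

def three_tests(n, xbar1, xbar2):
    """Return (LR, Wald, LM) statistics for the hypothesis theta_1 = theta_2.

    Hypothetical model: n independent pairs of Bernoulli variables, with
    observed proportions xbar1 and xbar2 strictly between 0 and 1.
    """
    # Unrestricted estimate: the two sample proportions.
    t1, t2 = xbar1, xbar2
    # Restricted estimate: the pooled proportion in both coordinates.
    p = (xbar1 + xbar2) / 2

    def loglik_per_obs(a, b):
        # n^{-1} log L_n evaluated at theta = (a, b).
        return (xbar1 * math.log(a) + (1 - xbar1) * math.log(1 - a)
                + xbar2 * math.log(b) + (1 - xbar2) * math.log(1 - b))

    # (i) -2 log mu
    lr = 2 * n * (loglik_per_obs(t1, t2) - loglik_per_obs(p, p))

    # R = -(H' B^{-1} H)^{-1} at the unrestricted and restricted estimates.
    r_unres = -1 / (t1 * (1 - t1) + t2 * (1 - t2))
    r_res = -1 / (2 * p * (1 - p))

    # (ii) -n h' R h at the unrestricted estimate.
    wald = -n * (t1 - t2) ** 2 * r_unres

    # (iii) multiplier from the first coordinate of n^{-1} D log L_n + H lambda = 0,
    # then -n lambda' R^{-1} lambda at the restricted estimate.
    lam = -(xbar1 - p) / (p * (1 - p))
    lm = -n * lam ** 2 / r_res
    return lr, wald, lm

lr, wald, lm = three_tests(n=200, xbar1=0.5, xbar2=0.4)
```

With these invented figures the three statistics come out at roughly 4.05, 4.08 and 4.04 respectively, so each exceeds the usual tabulated 5% point of $\chi^2$ with one degree of freedom (about 3.84) and the three critical regions agree.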
Wald [8] has shown that usually the tests based on the critical regions $C_1$ and $C_2$ have asymptotically the same power. His argument shows essentially that if $n$ is large and $\theta_0$ is not near $\omega$, the power of each test is near 1, while if $\theta_0$ is near $\omega$ each of the random variables $-2 \log \mu$ and $-n[h(\tilde\theta)]' R_{\tilde\theta} [h(\tilde\theta)]$ has approximately a non-central $\chi^2$-distribution with the same parameters. We now inquire, without going into rigorous mathematical detail, whether this type of argument will usually hold when we compare the tests based on the critical regions $C_1$ and $C_3$.
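We consider first what happens when $n$ is large and $\theta_0$ is near $\omega$. Then as we have seen, $\theta^*$ will usually be near $\theta_0$ and we suppose that $\theta_0$ is near enough $\omega$ to ensure that $\theta^* - \theta_0$ is near 0, though $\sqrt{n}(\theta^* - \theta_0)$ may be appreciably different from 0. In virtue of the remarks made in (4.3) we will then have, in most practical situations,
$$\sqrt{n}(\hat\theta - \theta^*) \approx n^{-1/2} P_{\theta_0} D \log L_n(\cdot, \theta_0), \tag{5.6}$$
$$\sqrt{n}(\hat\lambda - \lambda^*) \approx n^{-1/2} Q'_{\theta_0} D \log L_n(\cdot, \theta_0), \tag{5.7}$$
where $\approx$ denotes approximate equality with probability near 1, for large $n$.
Also since $Dz(\theta^*) + H_{\theta^*} \lambda^* = 0$ and since usually $Dz(\theta_0) = 0$ and $D^2 z(\theta_0) = -B_{\theta_0}$, we will have
$$-B_{\theta_0}(\theta^* - \theta_0) + H_{\theta_0} \lambda^* = 0, \tag{5.8}$$
approximately. Since the distribution of $\tilde\theta$ does not depend on whether $\theta_0$ is in $\omega$ or not, it will remain true (see (5.5)) that
$$\sqrt{n}(\tilde\theta - \theta_0) \approx n^{-1/2} B_{\theta_0}^{-1} D \log L_n(\cdot, \theta_0). \tag{5.9}$$
Also examination of the details of the proof of lemma 2 shows that the result there obtained, namely
$$-2 \log \mu \approx n(\hat\theta - \tilde\theta)' B_{\theta_0} (\hat\theta - \tilde\theta), \tag{5.10}$$
still holds.
Now from (5.6) and (5.9) we have
$$\sqrt{n}(\hat\theta - \tilde\theta) \approx \sqrt{n}(\theta^* - \theta_0) + n^{-1/2} (P_{\theta_0} - B_{\theta_0}^{-1}) D \log L_n(\cdot, \theta_0)$$
$$= \sqrt{n}(\theta^* - \theta_0) + n^{-1/2} Q_{\theta_0} R_{\theta_0}^{-1} Q'_{\theta_0} D \log L_n(\cdot, \theta_0)$$
$$\approx \sqrt{n}\, B_{\theta_0}^{-1} H_{\theta_0} \lambda^* + \sqrt{n}\, Q_{\theta_0} R_{\theta_0}^{-1} (\hat\lambda - \lambda^*)$$
by (5.8) and (5.7). It is not difficult to show that $Q_{\theta_0} R_{\theta_0}^{-1} = B_{\theta_0}^{-1} H_{\theta_0}$, and so $\sqrt{n}(\hat\theta - \tilde\theta) \approx \sqrt{n}\, B_{\theta_0}^{-1} H_{\theta_0} \hat\lambda$. Hence, in the usual practical situation, when $n$ is large and $\theta_0$ is near enough $\omega$ to ensure that $\theta^* - \theta_0$ is near 0, we will have
$$-2 \log \mu \approx n(\hat\theta - \tilde\theta)' B_{\theta_0} (\hat\theta - \tilde\theta) \approx n \hat\lambda' H'_{\theta_0} B_{\theta_0}^{-1} H_{\theta_0} \hat\lambda = -n \hat\lambda' R_{\theta_0}^{-1} \hat\lambda \approx -n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda,$$
and consequently the tests based on the critical regions $C_1$ and $C_3$ will have approximately the same power in these circumstances. Moreover it is easy to see that each of the random variables $-2 \log \mu$ and $-n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda$ will then have approximately a non-central $\chi^2$-distribution with $r$ degrees of freedom and parameter $-n \lambda^{*\prime} R_{\theta_0}^{-1} \lambda^*$. (Again this argument could clearly be made rigorous by imagining $\theta_0$ to vary with $n$ in such a way that $\|\theta^* - \theta_0\| = O(n^{-1/2})$ and by imposing suitable conditions on the functions $f$ and $h$).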
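The quantities $\theta^*$ and $\lambda^*$ that enter this near-hypothesis approximation can be computed directly in a simple case. The sketch below again uses a hypothetical model that is not in the paper: pairs of independent Bernoulli variables with true success probabilities $\theta_0 = (0.5, 0.4)$ and restriction $h(\theta) = \theta_1 - \theta_2 = 0$. It locates the maximum of $z$ in $\omega$ by a grid search, reads $\lambda^*$ off the first coordinate of (4.1.9), and evaluates the parameter $-n \lambda^{*\prime} R^{-1} \lambda^*$, with $R$ taken at $\theta^*$ as a stand-in for $R_{\theta_0}$. The grid size and the value of $n$ are arbitrary choices for the illustration.

```python
import math

theta0 = (0.5, 0.4)  # hypothetical true parameter, lying outside omega

def z(theta):
    """z(theta): expectation under theta0 of log f(X, theta) for the Bernoulli pair."""
    return sum(p0 * math.log(p) + (1 - p0) * math.log(1 - p)
               for p0, p in zip(theta0, theta))

# omega is the diagonal theta_1 = theta_2; search it on a fine grid.
grid = [k / 10000 for k in range(1, 10000)]
p_star = max(grid, key=lambda p: z((p, p)))

# (4.1.9), first coordinate: dz/dtheta_1 + lambda* = 0 at theta* = (p_star, p_star).
lam_star = -(theta0[0] - p_star) / (p_star * (1 - p_star))
# Second coordinate, dz/dtheta_2 - lambda* = 0, should hold as well.
resid = (theta0[1] - p_star) / (p_star * (1 - p_star)) - lam_star

# Parameter -n lambda*' R^{-1} lambda*, with R = -1 / (2 p*(1 - p*)) at theta*.
n = 200
noncentrality = n * lam_star ** 2 * 2 * p_star * (1 - p_star)
```

In this example the grid search returns the pooled value 0.45 for both coordinates of $\theta^*$, the second coordinate of (4.1.9) is satisfied to rounding error, and the parameter is about 4.04 for $n = 200$.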
We now consider the power of the Lagrangian multiplier test when $n$ is large and $\theta_0$ is not near $\omega$. Then the asymptotic distribution of $\hat\lambda_n$ will usually be as given in Lemma 1. Now, if $\lambda^*$ is not near 0, then with a high probability $\hat\lambda$ will be far from 0 and since normally the matrix $-R_{\hat\theta}$ will be positive definite, the power of the test based on $C_3$ will be near 1. However there is a possibility that $\theta_0$ might be such that the function $z$ has a stationary value at $\theta^*$, in which case $\lambda^* = 0$. Then $-n \hat\lambda' R_{\hat\theta}^{-1} \hat\lambda$ would not necessarily be large with a high probability and consequently the power of the test based on $C_3$ would not be near 1 for such a $\theta_0$. But this is a contingency which does not seem likely to arise often (the author has been unable to find an example of it) and we may conclude that in most practical situations the Lagrangian multiplier test is equivalent, for large samples, to the likelihood ratio test.
6. Singular information matrices. As we have said previously the whole problem of maximum likelihood estimation is closely bound up with the behavior of the function $z$. In particular, for unrestricted estimation it is important that $z$ should have a maximum turning value in $\Omega$ at $\theta_0$, for this condition plays an important part in ensuring consistency of $\hat\theta_n(\cdot, \Omega)$. Now the demands that $z(\theta_0)$ should be a maximum turning value of $z$ in $\Omega$ and that $B_{\theta_0}$ should be positive definite are not unrelated. For it is usually true that $z$ has a stationary value at $\theta_0$, i.e., that $Dz(\theta_0) = 0$ and also that $D^2 z(\theta_0) = -B_{\theta_0}$: these results depend only on $f$ being such that we can "differentiate under the integral sign." So that if $\theta$ is near $\theta_0$ we will usually have
$$z(\theta) - z(\theta_0) = -\tfrac{1}{2}(\theta - \theta_0)' B_{\theta_0} (\theta - \theta_0) + o(\|\theta - \theta_0\|^2). \tag{6.1}$$
Hence if $B_{\theta_0}$ is not positive definite it may very well happen that $z(\theta_0)$ is not a maximum turning value of $z$ in $\Omega$ and much of unrestricted estimation theory would then break down.
However, even if $B_{\theta_0}$ is not positive definite and $z(\theta_0)$ is not a maximum turning value of $z$ in $\Omega$, it may still be the case that if $\theta_0$ belongs to the subset $\omega$ of $\Omega$, $z(\theta_0)$ is a maximum turning value of $z$ in $\omega$ so that restricted estimation theory may not need drastic revision. And it is of some theoretical interest to consider just what revision is necessary in this case. Moreover this problem is of practical interest because it often happens that it is natural, either for reasons of symmetry or for some other reason, to describe the distribution of a random variable in terms of a parameter $\theta$ in such a way that neither is $B_{\theta_0}$ positive definite nor is $z(\theta_0)$ a maximum of $z$ in $\Omega$. For instance if $X$ has a multinomial distribution and describes an experiment in which an individual can fall into any one of $s$ classes, it is natural for reasons of symmetry to denote the probabilities associated with the different classes by $\theta_i / \sum_{j=1}^{s} \theta_j$ $(i = 1, 2, \cdots, s)$. The set $\Omega$ of possible parameters is $\{\theta \in R^s : \theta_i > 0\ (i = 1, 2, \cdots, s)\}$, and it is easy to verify that neither is $B_\theta$ positive definite for any $\theta$ in $\Omega$ nor is $z(\theta_0)$ a maximum turning value of $z$ in $\Omega$. (In this case it is clear that this is so because we have set in $s$-dimensional space a parameter that is really $(s - 1)$-dimensional). However it is obvious that there is no difficulty about restricted estimation in the subset of $\Omega$ in which $\sum_{i=1}^{s} \theta_i = 1$.
We will now consider what revision is necessary of that part of the foregoing theory based on the assumptions A, if we drop the demand that $B_{\theta_0}$ be positive definite (assumption 11A) and replace assumption 6A by the following assumption 6B, while maintaining the remainder of the assumptions A.
Assumption 6B. $\theta_0 \in \omega$ and for any other point $\theta$ of $\omega$, $F(t, \theta) \neq F(t, \theta_0)$ for at least one $t$. Roughly speaking, we may explain the introduction of assumption 6B as follows. If assumption 6A is not satisfied, the parameter is not identifiable in the set $\Omega$, i.e., there are different $\theta$'s in $\Omega$ which give the same distribution of $X$. However we wish $\theta_0$ to be identifiable in the subset $\omega$, in order that restricted estimation may still be possible. Hence we make assumption 6B.
It is easy to verify that these assumptions imply the existence of a consistent estimator $\hat\theta_n(\cdot, \omega)$ of $\theta_0$, that for almost any $x$ and sufficiently large $n$, $\hat\theta_n(x, \omega)$ with a Lagrangian multiplier $\hat\lambda_n(x)$ satisfies the restricted likelihood equations and that
$$\begin{bmatrix} B_{\theta_0} + o(1) & -H_{\theta_0} + o(1) \\ -H'_{\theta_0} + o(1) & 0 \end{bmatrix} \begin{bmatrix} \hat\theta_n(x, \omega) - \theta_0 \\ \hat\lambda_n(x) \end{bmatrix} = \begin{bmatrix} D n^{-1} \log L_n(x, \theta_0) \\ 0 \end{bmatrix} \tag{6.2}$$
for almost any $x$. Now however, since we have dropped the requirement that $B_{\theta_0}$ be positive definite and since subsequent theory concerning the asymptotic distributions of $\hat\theta_n(\cdot, \omega)$, $\hat\lambda_n$ and associated random variables makes considerable use of the inverse of $B_{\theta_0}$, this theory no longer applies. To enable us to replace this theory we will now introduce assumption 11B which is associated with assumption 6B in the same manner as 11A was shown at the beginning of this section to be associated with 6A. This assumption will provide a natural connection between properties of the matrix $B_{\theta_0}$, the subset $\omega$ and the facts that $\theta_0$ is identifiable when it is known to belong to $\omega$ (assumption 6B), but unidentifiable in $\Omega$.
Assumption 11B. The matrix $H_{\theta_0}$ is of rank $r$. The matrix $B_{\theta_0}$ is of rank $s - t$ where $t \le r$. There exists an $s \times t$ sub-matrix $H_1$ of $H_{\theta_0}$ such that $B_{\theta_0} + H_1 H_1'$ is positive definite. (Without any loss of generality we may assume that $H_1$ is the matrix composed of the first $t$ columns of $H_{\theta_0}$, and we may write $H_{\theta_0} = [H_1 \; H_2]$.)
We will now define the set of assumptions B.
Assumptions B. By assumptions B we will mean the set of assumptions A with 6B and 11B replacing 6A and 11A respectively.
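Assumption 11B can be illustrated on the multinomial example with the single restriction $h(\theta) = \sum_i \theta_i - 1$, for which $r = t = 1$ and $H_1$ is a column of ones. The Python sketch below is an added illustration only. The helper `pivots`, the point `theta0` and the closed form of $B_\theta$ (the same assumed form $\delta_{ij}/(\theta_i S) - 1/S^2$, with $S = 1$ on $\omega$) are hypothetical choices, not taken from the paper. It checks that $B_{\theta_0}$ has rank $s - 1$ while $B_{\theta_0} + H_1 H_1'$ has strictly positive pivots and is therefore positive definite.

```python
from fractions import Fraction as F

def pivots(A):
    """Pivots of Gaussian elimination without row exchanges on a symmetric
    positive semi-definite matrix. A zero pivot then means the remaining
    entries in that column are already zero, so the step is skipped."""
    M = [[F(v) for v in row] for row in A]
    n = len(M)
    out = []
    for c in range(n):
        p = M[c][c]
        out.append(p)
        if p == 0:
            continue
        for k in range(c + 1, n):
            f = M[k][c] / p
            M[k] = [vk - f * vc for vk, vc in zip(M[k], M[c])]
    return out

theta0 = [F(1, 2), F(1, 3), F(1, 6)]       # hypothetical point with sum equal to 1
s = len(theta0)

# On omega the assumed information matrix reduces to diag(1/theta_i) - 1 1'.
B = [[(1 / theta0[i] if i == j else 0) - 1 for j in range(s)] for i in range(s)]
H1 = [[1] for _ in range(s)]                # gradient of h(theta) = sum(theta) - 1
M = [[B[i][j] + H1[i][0] * H1[j][0] for j in range(s)] for i in range(s)]

rank_B = sum(1 for p in pivots(B) if p != 0)
positive_definite = all(p > 0 for p in pivots(M))
print(rank_B, positive_definite)
```

For a symmetric matrix, strictly positive elimination pivots are equivalent to positive definiteness, which is why the check is phrased in terms of pivots.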
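Now subject to assumptions B, if $y$ denotes an $s$-dimensional random vector normally distributed with mean 0 and variance matrix $B_{\theta_0}$, and if we write $\hat\theta$ in place of $\hat\theta_n(\cdot, \omega)$ and $\hat\lambda$ in place of $\hat\lambda_n$, then from (6.2) we have, as before,

\[
\sqrt{n}
\begin{bmatrix} B_{\theta_0} & -H_{\theta_0} \\ -H'_{\theta_0} & 0 \end{bmatrix}
\begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix}
\sim
\begin{bmatrix} y \\ 0 \end{bmatrix}
\tag{6.3}
\]

and since $\sqrt{n}\, H'_{\theta_0} (\hat\theta - \theta_0) \sim 0$ it follows that

\[
\sqrt{n}
\begin{bmatrix} B_{\theta_0} + H_1 H_1' & -H_{\theta_0} \\ -H'_{\theta_0} & 0 \end{bmatrix}
\begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix}
\sim
\begin{bmatrix} y \\ 0 \end{bmatrix}.
\tag{6.4}
\]

Since $B_{\theta_0} + H_1 H_1'$ is positive definite and $H_{\theta_0}$ is of rank $r$, the matrix

\[
\begin{bmatrix} B_{\theta_0} + H_1 H_1' & -H_{\theta_0} \\ -H'_{\theta_0} & 0 \end{bmatrix}
\]

is non-singular and we define $P^*_{\theta_0}$, $Q^*_{\theta_0}$ and $R^*_{\theta_0}$ by

\[
\begin{bmatrix} P^*_{\theta_0} & Q^*_{\theta_0} \\ Q^{*\prime}_{\theta_0} & R^*_{\theta_0} \end{bmatrix}
=
\begin{bmatrix} B_{\theta_0} + H_1 H_1' & -H_{\theta_0} \\ -H'_{\theta_0} & 0 \end{bmatrix}^{-1}.
\tag{6.5}
\]

We will also define $S^*_{\theta_0}$ by

\[
S^*_{\theta_0} = -R^*_{\theta_0} - \begin{bmatrix} I_t & 0 \\ 0 & 0 \end{bmatrix},
\tag{6.6}
\]

where $I_t$ denotes the unit $t \times t$ matrix.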


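We will now prove two lemmas concerning the distributions of statistics in which we are interested.
LEMMA 6. Subject to assumptions B, the vector

\[
\sqrt{n} \begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix}
\]

is asymptotically normally distributed with mean 0 and variance matrix

\[
\begin{bmatrix} P^*_{\theta_0} & 0 \\ 0 & S^*_{\theta_0} \end{bmatrix}.
\]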
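PROOF. From (6.4) we have, as previously, the result that

\[
\sqrt{n} \begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix}
\]

is asymptotically normal with mean 0 and variance matrix

\[
\begin{bmatrix}
P^*_{\theta_0} B_{\theta_0} P^*_{\theta_0} & P^*_{\theta_0} B_{\theta_0} Q^*_{\theta_0} \\
Q^{*\prime}_{\theta_0} B_{\theta_0} P^*_{\theta_0} & Q^{*\prime}_{\theta_0} B_{\theta_0} Q^*_{\theta_0}
\end{bmatrix}.
\]

Now $P^*_{\theta_0} B_{\theta_0} P^*_{\theta_0} = P^*_{\theta_0} (B_{\theta_0} + H_1 H_1') P^*_{\theta_0} - P^*_{\theta_0} H_1 H_1' P^*_{\theta_0}$ and, as previously, the first term on the right hand side of this equation is $P^*_{\theta_0}$. Also from (6.5)

\[
P^*_{\theta_0} H_{\theta_0} = 0
\]

and in particular $P^*_{\theta_0} H_1 = 0$. It follows that $P^*_{\theta_0} B_{\theta_0} P^*_{\theta_0} = P^*_{\theta_0}$; and in a similar manner it may be shown that $P^*_{\theta_0} B_{\theta_0} Q^*_{\theta_0} = 0$. We also have

\[
Q^{*\prime}_{\theta_0} B_{\theta_0} Q^*_{\theta_0} = -R^*_{\theta_0} - Q^{*\prime}_{\theta_0} H_1 H_1' Q^*_{\theta_0},
\]

and from (6.5) $Q^{*\prime}_{\theta_0} H_{\theta_0} = -I$, so that, in particular, $Q^{*\prime}_{\theta_0} H_1 = -[I_t \;\; 0]'$. It follows that

\[
Q^{*\prime}_{\theta_0} B_{\theta_0} Q^*_{\theta_0} = -R^*_{\theta_0} - \begin{bmatrix} I_t & 0 \\ 0 & 0 \end{bmatrix} = S^*_{\theta_0},
\]

and this completes the proof.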
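The matrix identities used in this proof can be confirmed in exact arithmetic. The Python sketch below is an added illustration, not part of the original proof. It borrows the restrictions of the $2 \times 2$ homogeneity problem treated at the end of the paper ($s = 4$, $r = 3$, $t = 2$), evaluates $B_\theta$ at a hypothetical point of $\omega$, forms the inverse in (6.5), and checks the three identities. The helper names `matmul`, `transpose` and `inverse`, and the point `theta`, are assumptions made for the illustration.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(A):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(A)
    M = [[F(v) for v in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(k for k in range(c, n) if M[k][c] != 0)   # first usable pivot row
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for k in range(n):
            if k != c and M[k][c] != 0:
                f = M[k][c]
                M[k] = [vk - f * vc for vk, vc in zip(M[k], M[c])]
    return [row[n:] for row in M]

s, r, t = 4, 3, 2
theta = [F(2, 5), F(3, 5), F(2, 5), F(3, 5)]        # hypothetical point of omega
J = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
# Information matrix on omega, where theta_1 + theta_2 = theta_3 + theta_4 = 1.
B = [[(1 / theta[i] if i == j else 0) - J[i][j] for j in range(s)] for i in range(s)]
H = [[1, 0, 1], [1, 0, 0], [0, 1, -1], [0, 1, 0]]    # columns are gradients of h
H1 = [row[:t] for row in H]
HH = matmul(H1, transpose(H1))
M = [[B[i][j] + HH[i][j] for j in range(s)] for i in range(s)]

# Bordered matrix of (6.5) and its inverse, split into P*, Q*, R*.
K = [M[i] + [-h for h in H[i]] for i in range(s)] + \
    [[-H[i][j] for i in range(s)] + [0] * r for j in range(r)]
Kinv = inverse(K)
P = [row[:s] for row in Kinv[:s]]
Q = [row[s:] for row in Kinv[:s]]
R = [row[s:] for row in Kinv[s:]]
S = [[-R[i][j] - (1 if i == j < t else 0) for j in range(r)] for i in range(r)]

ok_P = matmul(matmul(P, B), P) == P
ok_PQ = matmul(matmul(P, B), Q) == [[0] * r for _ in range(s)]
ok_S = matmul(matmul(transpose(Q), B), Q) == S
print(ok_P, ok_PQ, ok_S)
```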


LEMMA 7. Subject to assumptions B, $-n \hat\lambda' R^{*-1}_{\hat\theta} \hat\lambda$ is asymptotically distributed as $\chi^2$ with $r - t$ degrees of freedom.
PROOF. Since $B_{\theta_0} + H_1 H_1'$ is positive definite and $B_{\theta_0}$ is of rank $s - t$, there exists a non-singular matrix $W$ such that

\[
W' (B_{\theta_0} + H_1 H_1') W = I_s
\]

and

\[
W' B_{\theta_0} W = \begin{bmatrix} \Lambda_{s-t} & 0 \\ 0 & 0 \end{bmatrix},
\]

where $\Lambda_{s-t}$ is a diagonal $(s - t) \times (s - t)$ matrix. Then

\[
W' H_1 H_1' W = I_s - \begin{bmatrix} \Lambda_{s-t} & 0 \\ 0 & 0 \end{bmatrix}
\]

and since $H_1 H_1'$ is of rank $t$, it follows that $\Lambda_{s-t} = I_{s-t}$ and that

\[
W' H_1 H_1' W = \begin{bmatrix} 0 & 0 \\ 0 & I_t \end{bmatrix}.
\]

We now define an $s$-dimensional random variable $m = (m_1, m_2, \ldots, m_s)$ by $m = W' y$. Then $m$ is normally distributed with mean 0 and variance matrix

\[
W' B_{\theta_0} W = \begin{bmatrix} I_{s-t} & 0 \\ 0 & 0 \end{bmatrix}.
\]

It follows that $m_1, m_2, \ldots, m_{s-t}$ are independent $N(0, 1)$ random variables, while $m_{s-t+1} = m_{s-t+2} = \cdots = m_s = 0$.

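Now from (6.4) we have

\[
\sqrt{n}
\begin{bmatrix} (W W')^{-1} & -H_{\theta_0} \\ -H'_{\theta_0} & 0 \end{bmatrix}
\begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix}
\sim
\begin{bmatrix} (W')^{-1} m \\ 0 \end{bmatrix}
\]

and so

\[
\sqrt{n}
\begin{bmatrix} W^{-1} & -W' H_{\theta_0} \end{bmatrix}
\begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix}
\sim m.
\tag{6.7}
\]

Hence

\[
m' m \sim n
\begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix}'
\begin{bmatrix} (W W')^{-1} & -H_{\theta_0} \\ -H'_{\theta_0} & H'_{\theta_0} W W' H_{\theta_0} \end{bmatrix}
\begin{bmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{bmatrix},
\]

i.e., since $\sqrt{n}\, H'_{\theta_0} (\hat\theta - \theta_0) \sim 0$,

\[
m' m \sim n [\hat\theta - \theta_0]' B_{\theta_0} [\hat\theta - \theta_0] + n \hat\lambda' H'_{\theta_0} W W' H_{\theta_0} \hat\lambda.
\tag{6.8}
\]

Now from (6.4) $\sqrt{n} (\hat\theta - \theta_0) \sim P^*_{\theta_0} (W')^{-1} m$ and, as previously, $P^*_{\theta_0}$ is of rank $s - r$. Hence asymptotically, when $n [\hat\theta - \theta_0]' B_{\theta_0} [\hat\theta - \theta_0]$ is expressed as a quadratic form in $m_1, m_2, \ldots, m_{s-t}$, its rank is at most $s - r$. We will now show that when $n \hat\lambda' H'_{\theta_0} W W' H_{\theta_0} \hat\lambda$ is expressed as a quadratic form in $m_1, m_2, \ldots, m_{s-t}$, its rank is at most $r - t$.
From (6.7) we have, again since $\sqrt{n}\, H'_{\theta_0} (\hat\theta - \theta_0) \sim 0$,

\[
-H'_{\theta_0} W m \sim \sqrt{n}\, H'_{\theta_0} W W' H_{\theta_0} \hat\lambda.
\]

Now

\[
H'_{\theta_0} W m = \begin{bmatrix} H_1' W m \\ H_2' W m \end{bmatrix}
\]

and, since

\[
m' W' H_1 H_1' W m = m' \begin{bmatrix} 0 & 0 \\ 0 & I_t \end{bmatrix} m = 0,
\]

we have $H_1' W m = 0$. Hence

\[
-\sqrt{n}\, H'_{\theta_0} W W' H_{\theta_0} \hat\lambda \sim \begin{bmatrix} 0 \\ H_2' \end{bmatrix} W m.
\]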

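Since the rank of $H_2$ is at most $r - t$, it follows that, asymptotically, when $n \hat\lambda' H'_{\theta_0} W W' H_{\theta_0} \hat\lambda$ is expressed as a quadratic form in $m_1, m_2, \ldots, m_{s-t}$, its rank is at most $r - t$. Now from (6.8) by applying Cochran's Theorem (Cramér [4]) we have the result that asymptotically $n [\hat\theta - \theta_0]' B_{\theta_0} [\hat\theta - \theta_0]$ and $n \hat\lambda' H'_{\theta_0} W W' H_{\theta_0} \hat\lambda$ are independently distributed as $\chi^2$ with $s - r$ and $r - t$ degrees of freedom respectively.
The proof of Lemma 7 is completed by the remarks that

\[
H'_{\theta_0} W W' H_{\theta_0} = H'_{\theta_0} (B_{\theta_0} + H_1 H_1')^{-1} H_{\theta_0} = -R^{*-1}_{\theta_0}
\]

and that $R^{*-1}_{\hat\theta} \sim R^{*-1}_{\theta_0}$.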
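Two facts used in this proof lend themselves to a direct check. First, the simultaneous reduction by $W$ with $\Lambda_{s-t} = I_{s-t}$ is equivalent to saying that $(B_{\theta_0} + H_1 H_1')^{-1} B_{\theta_0}$ is idempotent with trace $s - t$, since that matrix equals $W \,\mathrm{diag}(I_{s-t}, 0)\, W^{-1}$. Second, $H'_{\theta_0} (B_{\theta_0} + H_1 H_1')^{-1} H_{\theta_0} = -R^{*-1}_{\theta_0}$. The Python sketch below is an added illustration only; it again uses the $2 \times 2$ homogeneity restrictions with a hypothetical point `theta` of $\omega$, and all helper names are assumptions.

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse(A):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(A)
    M = [[F(v) for v in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(k for k in range(c, n) if M[k][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for k in range(n):
            if k != c and M[k][c] != 0:
                f = M[k][c]
                M[k] = [vk - f * vc for vk, vc in zip(M[k], M[c])]
    return [row[n:] for row in M]

s, r, t = 4, 3, 2
theta = [F(1, 4), F(3, 4), F(1, 4), F(3, 4)]        # hypothetical point of omega
J = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
B = [[(1 / theta[i] if i == j else 0) - J[i][j] for j in range(s)] for i in range(s)]
H = [[1, 0, 1], [1, 0, 0], [0, 1, -1], [0, 1, 0]]
H1 = [row[:t] for row in H]
HH = matmul(H1, transpose(H1))
M = [[B[i][j] + HH[i][j] for j in range(s)] for i in range(s)]
Minv = inverse(M)

# (B + H1 H1')^{-1} B should be a projection of rank s - t.
A = matmul(Minv, B)
idempotent = matmul(A, A) == A
trace = sum(A[i][i] for i in range(s))

# R* from the bordered inverse (6.5), compared with H'(B + H1 H1')^{-1} H.
K = [M[i] + [-h for h in H[i]] for i in range(s)] + \
    [[-H[i][j] for i in range(s)] + [0] * r for j in range(r)]
R = [row[s:] for row in inverse(K)[s:]]
G = matmul(matmul(transpose(H), Minv), H)
minus_R_inverse = [[-v for v in row] for row in inverse(R)]
print(idempotent, trace, G == minus_R_inverse)
```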


The results proved in this section, and the methods of proof, make it clear how the technique suggested by Aitchison and Silvey [1] for solving the restricted likelihood equations can usually be adapted, and how the Lagrangian multiplier test can usually be applied when the matrix $B_{\theta_0}$ is singular and the function $h$ is suitable. We will not amplify this point.
7. Different numbers of observations on several random variables. Experimental material being what it is, and experimenters being as they are, it is not often that the statistician is faced with an estimation problem in the ideal circumstances of being given a number of observations on a vector valued random variable. The more usual situation confronting him is that he is given $n_1$ observations on a random variable $X_1$ whose probability density function depends on $s_1$ parameters $\theta_1, \theta_2, \ldots, \theta_{s_1}$, $n_2$ observations on a random variable $X_2$ whose probability density function depends on $s_2$ parameters $\theta_{s_1+1}, \theta_{s_1+2}, \ldots, \theta_{s_1+s_2}$, $\ldots$, and $n_k$ observations on a random variable $X_k$, whose probability density function depends on $s_k$ parameters $\theta_{s_1+s_2+\cdots+s_{k-1}+1}, \ldots, \theta_s$, where $s = s_1 + s_2 + \cdots + s_k$. And he is presented with the problem of deciding whether the true parameter $\theta_0 = (\theta_1^0, \theta_2^0, \ldots, \theta_s^0)$ belongs to a set

\[
\omega = \{\theta \in \Omega : h(\theta) = 0\},
\]

$\Omega$ and $h$ being as before. If $n_1 = n_2 = \cdots = n_k$ then we may interpret the observations as observations on a vector valued random variable and the foregoing theory applies. But if the $n$'s are not all equal we cannot do this, and in order to enlarge the sphere of the Lagrangian multiplier test we have to consider this situation separately. In discussing it we will avoid all mathematical detail and will be content to indicate very briefly the modifications necessary in the test.
We will denote by $x^*$ a given set of $n_1 + n_2 + \cdots + n_k$ observations on the random variables $X_1, X_2, \ldots, X_k$, and $\log L(x^*, \theta)$ will denote the value of the log-likelihood function at the point $\theta$. Now if $\theta_0 \in \omega$ then the same kind of argument as we have used before may be used to show that it will usually be the case that $\hat\theta(x^*, \omega)$ exists, is near $\theta_0$ when $n_1, n_2, \ldots, n_{k-1}$, and $n_k$ are all large, and with a Lagrangian multiplier $\hat\lambda(x^*)$ satisfies the likelihood equations

\[
D \log L(x^*, \theta) + H_\theta \lambda = 0, \qquad h(\theta) = 0.
\]
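We now introduce a matrix $N$ defined by

\[
N = \begin{bmatrix}
n_1 I_{s_1} & 0 & \cdots & 0 \\
0 & n_2 I_{s_2} & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & n_k I_{s_k}
\end{bmatrix}.
\]

The information matrix $B_\theta$ is defined in this case by

\[
B_\theta = -N^{-1} [E_\theta D^2 \log L(\cdot, \theta)],
\]

where $E_\theta$ denotes expected value when $\theta$ is taken as the true parameter. Then again by the type of argument used previously we may show that for most $x^*$, when $n_1, n_2, \ldots, n_k$ are large,

\[
\begin{bmatrix} N B_{\theta_0} & -H_{\theta_0} \\ -H'_{\theta_0} & 0 \end{bmatrix}
\begin{bmatrix} \hat\theta(x^*, \omega) - \theta_0 \\ \hat\lambda(x^*) \end{bmatrix}
\approx
\begin{bmatrix} D \log L(x^*, \theta_0) \\ 0 \end{bmatrix}.
\tag{7.1}
\]

Also it will usually be true that $D \log L(x^*, \theta_0)$ can be regarded as an observation on a random variable which is approximately normal with mean 0 and variance matrix $N B_{\theta_0}$.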
Now in the case where $B_{\theta_0}$ is positive definite we may use (7.1) in the same way as before to show that when $\theta_0 \in \omega$ and $n_1, n_2, \ldots, n_k$ are large,

\[
\hat\lambda' H'_{\hat\theta} [N B_{\hat\theta}]^{-1} H_{\hat\theta} \hat\lambda
\]

will usually be distributed approximately as $\chi^2$ with $r$ degrees of freedom, and it is this statistic which we use in the modified form of the Lagrangian multiplier test. Alternatively, when $B_{\theta_0}$ is of rank $s - t$, when each of the functions $h_1, h_2, \ldots, h_t$ is a function of only the parameters involved in the distribution of one of the $X$'s, and $B_{\theta_0} + H_1 H_1'$ is positive definite, the statistic on which the test is based is $\hat\lambda' H'_{\hat\theta} [N (B_{\hat\theta} + H_1 H_1')]^{-1} H_{\hat\theta} \hat\lambda$, which will usually be distributed as $\chi^2$ with $r - t$ degrees of freedom when $n_1, n_2, \ldots, n_k$ are large.
We conclude by applying the Lagrangian multiplier test in a familiar situation.
Homogeneity in the $2 \times 2$ contingency table. One of the three situations (Cochran [2]) in which the $2 \times 2$ contingency table arises is as follows. We are given $n_1$ observations on a random variable $X_1$ whose distribution is defined by

\[
\Pr\{X_1 = (1, 0)\} = \theta_1^0 / (\theta_1^0 + \theta_2^0), \qquad
\Pr\{X_1 = (0, 1)\} = \theta_2^0 / (\theta_1^0 + \theta_2^0),
\]

and $n_2$ observations on an independent random variable $X_2$ whose distribution is defined similarly in terms of $\theta_3^0$ and $\theta_4^0$. These observations can be summarised in a $2 \times 2$ contingency table as follows.

Number of occurrences of different values of $X_1$ and $X_2$.

            (1, 0)      (0, 1)      Total
$X_1$       $n_{11}$    $n_{12}$    $n_1$
$X_2$       $n_{21}$    $n_{22}$    $n_2$
Total       $m_1$       $m_2$       $n$
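We suppose that the point $\theta_0 = (\theta_1^0, \theta_2^0, \theta_3^0, \theta_4^0)$ is known to belong to the set $\Omega = \{\theta \in R^4 : \varepsilon \le \theta_i \le 1/\varepsilon\ (i = 1, 2, 3, 4)\}$ where $\varepsilon$ is a small positive number. In this case we also have

\[
\log L(x^*, \theta) = \text{constant} + n_{11} \log \theta_1 + n_{12} \log \theta_2 - n_1 \log(\theta_1 + \theta_2) + n_{21} \log \theta_3 + n_{22} \log \theta_4 - n_2 \log(\theta_3 + \theta_4).
\]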

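The matrix

\[
B_\theta = \begin{bmatrix}
\theta_1^{-1} - (\theta_1 + \theta_2)^{-1} & -(\theta_1 + \theta_2)^{-1} & 0 & 0 \\
-(\theta_1 + \theta_2)^{-1} & \theta_2^{-1} - (\theta_1 + \theta_2)^{-1} & 0 & 0 \\
0 & 0 & \theta_3^{-1} - (\theta_3 + \theta_4)^{-1} & -(\theta_3 + \theta_4)^{-1} \\
0 & 0 & -(\theta_3 + \theta_4)^{-1} & \theta_4^{-1} - (\theta_3 + \theta_4)^{-1}
\end{bmatrix}
\]

has rank 2. Homogeneity of $X_1$ and $X_2$ means that $\theta_1 / (\theta_1 + \theta_2) = \theta_3 / (\theta_3 + \theta_4)$ and we consider estimating $\theta_0$ subject to the restrictions

\[
h(\theta) = \begin{bmatrix} \theta_1 + \theta_2 - 1 \\ \theta_3 + \theta_4 - 1 \\ \theta_1 - \theta_3 \end{bmatrix} = 0,
\]

so that

\[
H_\theta = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 1 & 0 \end{bmatrix}.
\]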
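If $H_1$ is the leading $4 \times 2$ sub-matrix of $H_\theta$, then for any $\theta \in \omega$,

\[
B_\theta + H_1 H_1' = \begin{bmatrix}
\theta_1^{-1} & 0 & 0 & 0 \\
0 & \theta_2^{-1} & 0 & 0 \\
0 & 0 & \theta_3^{-1} & 0 \\
0 & 0 & 0 & \theta_4^{-1}
\end{bmatrix},
\]

which is positive definite.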

The likelihood equations are easily solved in this case and we find that

\[
\hat\theta_1(x^*, \omega) = \hat\theta_3(x^*, \omega) = m_1 / n
\]

while $\hat\theta_2(x^*, \omega) = \hat\theta_4(x^*, \omega) = m_2 / n$. It is not difficult to verify that the statistic $\hat\lambda' H'_{\hat\theta} [N (B_{\hat\theta} + H_1 H_1')]^{-1} H_{\hat\theta} \hat\lambda$ is the usual statistic used in the $\chi^2$-test of homogeneity in a $2 \times 2$ table, so that this test is a particular case of the Lagrangian multiplier test. And it illustrates most aspects of the preceding theory. The computational procedure for applying the Lagrangian multiplier test in less familiar and more complicated situations will be set out in a subsequent paper.
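The verification left to the reader can be carried out in exact arithmetic. The Python sketch below is an added illustration, not the computational procedure promised for the subsequent paper. The function names `solve`, `lm_statistic` and `pearson_statistic` and the table of counts used in the usage lines are hypothetical. The multiplier is obtained from the likelihood equations $D \log L(x^*, \hat\theta) + H_{\hat\theta} \hat\lambda = 0$, and $B_\theta$ is taken in the form displayed above, which we use only at the restricted estimate, where $\theta_1 + \theta_2 = \theta_3 + \theta_4 = 1$. (Direct differentiation suggests that away from $\omega$ each block would carry a further factor $(\theta_1 + \theta_2)^{-1}$ or $(\theta_3 + \theta_4)^{-1}$; this remark is our own assumption and does not affect the computation.)

```python
from fractions import Fraction as F

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [[F(v) for v in row] + [F(b[i])] for i, row in enumerate(A)]
    for c in range(n):
        p = next(k for k in range(c, n) if M[k][c] != 0)
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for k in range(n):
            if k != c and M[k][c] != 0:
                f = M[k][c]
                M[k] = [vk - f * vc for vk, vc in zip(M[k], M[c])]
    return [row[n] for row in M]

def lm_statistic(n11, n12, n21, n22):
    """lambda' H' [N (B + H1 H1')]^{-1} H lambda at the restricted estimate."""
    n1, n2 = n11 + n12, n21 + n22
    m1, m2 = n11 + n21, n12 + n22
    n = n1 + n2
    th = [F(m1, n), F(m2, n), F(m1, n), F(m2, n)]       # restricted estimate
    S = [th[0] + th[1]] * 2 + [th[2] + th[3]] * 2       # block sums (equal to 1)
    counts, sizes = [n11, n12, n21, n22], [n1, n1, n2, n2]
    # Score vector D log L(x*, theta) at the restricted estimate.
    g = [counts[i] / th[i] - sizes[i] / S[i] for i in range(4)]
    # Multiplier solving g + H lambda = 0 (rows 2 and 4 give lambda_1, lambda_2).
    lam = [-g[1], -g[3], g[1] - g[0]]
    H = [[1, 0, 1], [1, 0, 0], [0, 1, -1], [0, 1, 0]]
    v = [sum(h * l for h, l in zip(row, lam)) for row in H]   # H lambda
    J = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]  # equals H1 H1'
    B = [[(1 / th[i] if i == j else 0) - J[i][j] / S[i] for j in range(4)]
         for i in range(4)]
    A = [[sizes[i] * (B[i][j] + J[i][j]) for j in range(4)] for i in range(4)]
    return sum(vi * xi for vi, xi in zip(v, solve(A, v)))

def pearson_statistic(n11, n12, n21, n22):
    """Usual chi-square statistic for homogeneity in a 2 x 2 table."""
    n1, n2 = n11 + n12, n21 + n22
    m1, m2 = n11 + n21, n12 + n22
    n = n1 + n2
    return F(n * (n11 * n22 - n12 * n21) ** 2, n1 * n2 * m1 * m2)

# Hypothetical table of counts.
print(lm_statistic(10, 20, 30, 40) == pearson_statistic(10, 20, 30, 40))
```

In this example $r = 3$ and $t = 2$, so the statistic is referred to $\chi^2$ with $r - t = 1$ degree of freedom, in agreement with the usual test.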
REFERENCES
[1] J. AITCHISON AND S. D. SILVEY, "Maximum likelihood estimation of parameters subject to restraints," Ann. Math. Stat., Vol. 29 (1958), pp. 813-828.
[2] W. G. COCHRAN, "The $\chi^2$-test of goodness of fit," Ann. Math. Stat., Vol. 23 (1952), pp. 315-345.
[3] H. CRAMÉR, Random Variables and Probability Distributions, Cambridge University Press, 1937.
[4] H. CRAMÉR, Mathematical Methods of Statistics, Princeton University Press, 1946.
[5] C. KRAFT AND L. LECAM, "A remark on the roots of the maximum likelihood equation," Ann. Math. Stat., Vol. 27 (1956), pp. 1174-1177.
[6] H. B. MANN AND A. WALD, "On stochastic limit and order relationships," Ann. Math. Stat., Vol. 14 (1943), pp. 217-226.
[7] A. WALD, "Note on the consistency of the maximum likelihood estimate," Ann. Math. Stat., Vol. 20 (1949), pp. 595-601.
[8] A. WALD, "Tests of statistical hypotheses concerning several parameters when the number of observations is large," Trans. Am. Math. Soc., Vol. 54 (1943), pp. 426-482.
