Matrix Calculus 2
Matrix Calculus
Notation
j is the square root of -1
X_R and X_I are the real and imaginary parts of X = X_R + jX_I
X^C is the complex conjugate of X
X: denotes the long column vector formed by concatenating the columns of X (see vectorization).
A ¤ B = KRON(A,B), the Kronecker product
A • B is the Hadamard or elementwise product
matrices and vectors A, B, C do not depend on X
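The notation above can be illustrated with a short NumPy sketch. The helper name `vec` and the random test matrices are assumptions made for this example only; the identity (AXB): = (B^T ¤ A) X: is a standard property of the Kronecker product and is what makes the vectorized forms on this page work.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (the X: operator)."""
    return np.asarray(M).reshape(-1, order="F")

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
X = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))

# Columns are concatenated, so [[1, 2], [3, 4]] becomes [1, 3, 2, 4].
small = vec(np.array([[1, 2], [3, 4]]))

# Kronecker product: (AXB): = (B^T ¤ A) X:
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)

# Hadamard product: elementwise multiplication of equally sized matrices.
hadamard = X * X
```

Note that NumPy stores arrays row by row, so `order="F"` is needed to obtain the column-wise stacking used here.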
Derivatives
In the main part of this page we express results in terms of differentials rather than derivatives for two
reasons: they avoid notational disagreements and they cope easily with the complex case. In most cases,
however, the differentials have been written in the form dY: = dY/dX dX: so that the corresponding
derivative may be easily extracted.
If X is p#q and Y is m#n, then dY: = dY/dX dX: where the derivative dY/dX is a large mn#pq matrix. If
X and/or Y are column vectors or scalars, then the vectorization operator : has no effect and may be
omitted. dY/dX is also called the Jacobian Matrix of Y: with respect to X: and det(dY/dX) is the
corresponding Jacobian. The Jacobian occurs when changing variables in an integration:
Integral(f(Y)dY:)=Integral(f(Y(X)) det(dY/dX) dX:).
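As a rough illustration of the relation dY: = dY/dX dX:, the sketch below estimates the mn#pq matrix dY/dX by central differences and compares it with the closed form for the linear map Y = AXB. The function name `numerical_jacobian`, the step size and the test matrices are assumptions made for this example.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (the X: operator)."""
    return np.asarray(M).reshape(-1, order="F")

def numerical_jacobian(func, X, eps=1e-6):
    """Central-difference estimate of dY/dX as an (m*n) x (p*q) matrix."""
    p, q = X.shape
    cols = []
    for k in range(p * q):
        # Perturb the k-th entry of X: (column-wise ordering).
        step = np.zeros(p * q)
        step[k] = eps
        dX = step.reshape((p, q), order="F")
        cols.append(vec(func(X + dX) - func(X - dX)) / (2 * eps))
    return np.column_stack(cols)

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((4, 2))
X = rng.standard_normal((3, 4))   # p = 3, q = 4, so Y = AXB is 2#2

J_numeric = numerical_jacobian(lambda Z: A @ Z @ B, X)
J_exact = np.kron(B.T, A)         # because (AXB): = (B^T ¤ A) X:
```

Here X is 3#4 and Y is 2#2, so dY/dX is a 4#12 matrix, matching the mn#pq rule.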
Although they do not generalise so well, other authors use alternative notations for the cases when X and
Y are both vectors or when one is a scalar. In particular:
http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html Page 1 of 6
Matrix Reference Manual: Matrix Calculus 11/30/11 9:03 AM
If X is complex, then dY: = dY/dX dX: holds if and only if Y(X) is an analytic function, which normally
implies that Y(X) does not depend on X^C or X^H.
Even for non-analytic functions we can write uniquely dY: = dY/dX dX: + dY/dX^C dX^C: provided that
Y is analytic with respect to X and X^C individually (or equivalently with respect to X_R and X_I
individually). dY/dX is the Generalized Complex Derivative and dY/dX^C is the Complex Conjugate
Derivative [R.4, R.9].
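A scalar example may help here. The sketch below takes the non-analytic function y = x x^C (the squared magnitude of x) and checks the two-term differential numerically. It assumes the usual convention dy/dx = (dy/dx_R - j dy/dx_I)/2 and dy/dx^C = (dy/dx_R + j dy/dx_I)/2; the function name `wirtinger` and the test values are hypothetical.

```python
import numpy as np

def wirtinger(func, x, eps=1e-6):
    """Estimate (dy/dx, dy/dxC) for a scalar complex x by central differences.

    Assumed convention: dy/dx  = (dy/dxR - j dy/dxI) / 2
                        dy/dxC = (dy/dxR + j dy/dxI) / 2
    """
    d_re = (func(x + eps) - func(x - eps)) / (2 * eps)
    d_im = (func(x + 1j * eps) - func(x - 1j * eps)) / (2 * eps)
    return 0.5 * (d_re - 1j * d_im), 0.5 * (d_re + 1j * d_im)

def f(z):
    # y = x * conj(x) depends on the conjugate of x, so it is not analytic.
    return z * np.conj(z)

x = 1.5 - 0.5j
dx = 1e-4 * (0.3 + 0.7j)

dy_dx, dy_dxc = wirtinger(f, x)

# First-order change predicted by dy = dy/dx dx + dy/dxC dxC
predicted = dy_dx * dx + dy_dxc * np.conj(dx)
actual = f(x + dx) - f(x)
```

For this function the two derivatives come out as dy/dx = x^C and dy/dx^C = x, and the predicted change agrees with the actual one up to a second-order term in dx.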
We define the generalized derivatives in terms of partial derivatives with respect to X_R and X_I:
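Under the usual convention, which is assumed here because it agrees with the complex gradient expression grad(f(x)) = 2 (df/dx)H = (df/dxR + j df/dxI)T given later in this section, the definitions read:

```latex
\frac{dY}{dX} = \frac{1}{2}\left(\frac{dY}{dX_R} - j\,\frac{dY}{dX_I}\right),
\qquad
\frac{dY}{dX^C} = \frac{1}{2}\left(\frac{dY}{dX_R} + j\,\frac{dY}{dX_I}\right)
```

For a real function f the partial derivatives with respect to the real and imaginary parts are real, so the second expression is the complex conjugate of the first.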
We have the following relationships for both analytic and non-analytic functions Y(X):
If f(x) is a real function of a complex vector then df/dx^C = (df/dx)^C and we can define grad(f(x)) = 2
(df/dx)^H = (df/dx_R + j df/dx_I)^T as the Complex Gradient Vector [R.9] with the following properties:
Basic Properties
Differentials of Inverses
d(X^-1) = -X^-1 dX X^-1 [2.1]
d(X^-1): = -(X^-T ¤ X^-1) dX:
d(a^T X^-1 b) = -(X^-T a b^T X^-T):^T dX: = -(ab^T):^T (X^-T ¤ X^-1) dX: [2.6]
d(tr(A^T X^-1 B)) = d(tr(B^T X^-T A)) = -(X^-T A B^T X^-T):^T dX: = -(AB^T):^T (X^-T ¤ X^-1) dX:
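The inverse differentials can be checked numerically by comparing the actual change in the inverse under a small perturbation dX with the first-order prediction. This is a minimal sketch; the matrix sizes, the random seed and the diagonal shift used to keep X safely non-singular are assumptions made for this example.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (the X: operator)."""
    return np.asarray(M).reshape(-1, order="F")

rng = np.random.default_rng(2)
n = 3
# Diagonal shift so that X stays comfortably away from singularity.
X = 3.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
dX = 1e-6 * rng.standard_normal((n, n))
a = rng.standard_normal(n)
b = rng.standard_normal(n)

Xinv = np.linalg.inv(X)
Xinv_new = np.linalg.inv(X + dX)

# d(inv(X)): = -(inv(X)^T ¤ inv(X)) dX:
actual = vec(Xinv_new - Xinv)
predicted = -np.kron(Xinv.T, Xinv) @ vec(dX)

# d(a^T inv(X) b) = -(inv(X)^T a b^T inv(X)^T):^T dX:
actual_scalar = a @ Xinv_new @ b - a @ Xinv @ b
predicted_scalar = -vec(Xinv.T @ np.outer(a, b) @ Xinv.T) @ vec(dX)
```

The two sides differ only by terms of second order in dX, so they agree to roughly the square of the perturbation size.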
Differentials of Trace
Note: matrix dimensions must result in an n#n argument for tr().
d(tr(Y)) = tr(dY)
d(tr(X)) = d(tr(X^T)) = I:^T dX: [2.4]
d(tr(X^k)) = k(X^(k-1))^T:^T dX:
d(tr(AX^k)) = (SUM_{r=0:k-1} (X^r A X^(k-r-1))^T):^T dX:
d(tr(AX^-1 B)) = -(X^-1 B A X^-1)^T:^T dX: = -(X^-T A^T B^T X^-T):^T dX: [2.5]
d(tr(AX^-1)) = d(tr(X^-1 A)) = -(X^-T A^T X^-T):^T dX:
d(tr(A^T X B^T)) = d(tr(B X^T A)) = (AB):^T dX: [2.4]
d(tr(XA^T)) = d(tr(A^T X)) = d(tr(X^T A)) = d(tr(AX^T)) = A:^T dX:
d(tr(A^T X^-1 B^T)) = d(tr(B X^-T A)) = -(X^-T A B X^-T):^T dX: = -(AB):^T (X^-T ¤ X^-1) dX:
d(tr(AXBX^T C)) = (A^T C^T X B^T + CAXB):^T dX:
d(tr(XAX^T)) = d(tr(AX^T X)) = d(tr(X^T XA)) = (X(A+A^T)):^T dX:
d(tr(X^T AX)) = d(tr(AXX^T)) = d(tr(XX^T A)) = ((A+A^T)X):^T dX:
d(tr(AXBX)) = (A^T X^T B^T + B^T X^T A^T):^T dX:
d(tr((AXb+c)(AXb+c)^T)) = 2(A^T (AXb+c) b^T):^T dX:
[C: symmetric] d(tr((X^T CX)^-1 A)) = d(tr(A (X^T CX)^-1)) = -((CX(X^T CX)^-1)(A+A^T)(X^T CX)^-1):^T dX:
[B, C: symmetric] d(tr((X^T CX)^-1 (X^T BX))) = d(tr((X^T BX)(X^T CX)^-1)) = 2(BX(X^T CX)^-1 - (CX(X^T CX)^-1) X^T BX (X^T CX)^-1):^T dX:
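Several of the trace differentials can be spot-checked in the same way. The sketch below compares a central-difference estimate of d(tr(...)) along a small random direction dX with the closed form on the right-hand side; the helper name `first_order_change`, the matrix size and the three formulas selected are choices made for this example.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (the X: operator)."""
    return np.asarray(M).reshape(-1, order="F")

def first_order_change(f, X, dX):
    """Central-difference estimate of the differential of f at X along dX."""
    return (f(X + dX) - f(X - dX)) / 2

rng = np.random.default_rng(3)
n = 3
A, B, C = (rng.standard_normal((n, n)) for _ in range(3))
X = rng.standard_normal((n, n))
dX = 1e-5 * rng.standard_normal((n, n))

# d(tr(A X B X^T C)) = (A^T C^T X B^T + C A X B):^T dX:
lhs1 = first_order_change(lambda Z: np.trace(A @ Z @ B @ Z.T @ C), X, dX)
rhs1 = vec(A.T @ C.T @ X @ B.T + C @ A @ X @ B) @ vec(dX)

# d(tr(A X B X)) = (A^T X^T B^T + B^T X^T A^T):^T dX:
lhs2 = first_order_change(lambda Z: np.trace(A @ Z @ B @ Z), X, dX)
rhs2 = vec(A.T @ X.T @ B.T + B.T @ X.T @ A.T) @ vec(dX)

# d(tr(X^k)) = k (X^(k-1))^T:^T dX:, here with k = 3
k = 3
lhs3 = first_order_change(lambda Z: np.trace(np.linalg.matrix_power(Z, k)), X, dX)
rhs3 = vec(k * np.linalg.matrix_power(X, k - 1).T) @ vec(dX)
```

In each case the row vector multiplying dX: is the transposed vectorization of the derivative of the trace with respect to X.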
Differentials of Determinant
Note: matrix dimensions must result in an n#n argument for det(). Some of the expressions below involve
inverses: these forms apply only if the quantity being inverted is square and non-singular; alternative
forms involving the adjoint, ADJ(), do not have the non-singular requirement.
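To illustrate the difference between the inverse-based and adjoint-based forms mentioned in the note, the sketch below checks Jacobi's formula, d(det(X)) = det(X) (X^-T):^T dX: = (ADJ(X)^T):^T dX:, for a non-singular matrix and then applies the adjoint form to a singular one. The formula is a standard identity quoted here as an assumption; the helper `adjugate` and the test matrices are hypothetical.

```python
import numpy as np

def vec(M):
    """Stack the columns of M into one long vector (the X: operator)."""
    return np.asarray(M).reshape(-1, order="F")

def adjugate(X):
    """Adjoint (transposed cofactor matrix); defined even when X is singular."""
    n = X.shape[0]
    cof = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(X, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T

def det_change(X, dX):
    """Central-difference estimate of d(det(X)) along dX."""
    return (np.linalg.det(X + dX) - np.linalg.det(X - dX)) / 2

rng = np.random.default_rng(4)
X = 3.0 * np.eye(3) + 0.3 * rng.standard_normal((3, 3))   # non-singular
dX = 1e-6 * rng.standard_normal((3, 3))

actual = det_change(X, dX)
via_inverse = np.linalg.det(X) * (vec(np.linalg.inv(X).T) @ vec(dX))
via_adjugate = vec(adjugate(X).T) @ vec(dX)

# A rank-2 matrix: the inverse form is unavailable, the adjoint form still works.
S = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [0.0, 1.0, 1.0]])
actual_singular = det_change(S, dX)
via_adjugate_singular = vec(adjugate(S).T) @ vec(dX)
```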
Jacobian
dY/dX is called the Jacobian Matrix of Y: with respect to X: and J_X(Y) = det(dY/dX) is the corresponding
Jacobian. The Jacobian occurs when changing variables in an integration:
Integral(f(Y)dY:)=Integral(f(Y(X)) det(dY/dX) dX:).
J_X(X[n#n]^-1) = (-1)^n det(X)^(-2n)
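The Jacobian of the matrix inverse follows from the Jacobian matrix -(X^-T ¤ X^-1) of the inverse map: its determinant is (-1)^(n^2) det(X)^(-2n), and (-1)^(n^2) = (-1)^n. The sketch below confirms this numerically for n = 2 and n = 3; the test matrices are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(5)
checks = []
for n in (2, 3):
    # Diagonal shift so that X stays comfortably away from singularity.
    X = n * np.eye(n) + 0.3 * rng.standard_normal((n, n))
    Xinv = np.linalg.inv(X)

    # Jacobian matrix of the inverse map, an n^2 # n^2 matrix.
    jacobian_matrix = -np.kron(Xinv.T, Xinv)

    lhs = np.linalg.det(jacobian_matrix)
    rhs = (-1) ** n * np.linalg.det(X) ** (-2 * n)
    checks.append((n, lhs, rhs))
```

The sign alternates with n because negating an n^2 # n^2 matrix multiplies its determinant by (-1)^(n^2).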
Hessian matrix
If f is a real function of x then the Hermitian matrix H_x f = (d/dx (df/dx)^H)^T is the Hessian matrix of f(x).
A value of x for which grad f(x) = 0 corresponds to a minimum, maximum or saddle point according to
whether Hx f is positive definite, negative definite or indefinite.
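The classification rule can be sketched by inspecting the eigenvalues of the Hessian. In the example below f(x) = x^T A x / 2 with real x and symmetric A, so grad f vanishes at x = 0 and the Hessian is A itself. The function name `classify_stationary_point`, the tolerance and the "inconclusive" branch for semidefinite Hessians are assumptions added for this example.

```python
import numpy as np

def classify_stationary_point(H, tol=1e-12):
    """Classify a stationary point from the eigenvalues of a Hermitian Hessian H."""
    eig = np.linalg.eigvalsh(H)   # real eigenvalues in ascending order
    if np.all(eig > tol):
        return "minimum"          # positive definite
    if np.all(eig < -tol):
        return "maximum"          # negative definite
    if eig[0] < -tol and eig[-1] > tol:
        return "saddle point"     # indefinite
    return "inconclusive"         # semidefinite: the rule does not decide

results = {
    "bowl": classify_stationary_point(np.diag([2.0, 1.0])),
    "dome": classify_stationary_point(np.diag([-1.0, -3.0])),
    "saddle": classify_stationary_point(np.diag([1.0, -1.0])),
    "flat direction": classify_stationary_point(np.diag([1.0, 0.0])),
}
```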
This page is part of The Matrix Reference Manual. Copyright © 1998-2005 Mike Brookes, Imperial
College, London, UK. See the file gfl.html for copying instructions. Please send any comments or
suggestions to "mike.brookes" at "imperial.ac.uk".
Updated: $Id: calculus.html,v 1.30 2011/01/14 16:28:04 dmb Exp $