Multivariate Characteristic and Correlation Functions


De Gruyter Studies in Mathematics 50

Editors
Carsten Carstensen, Berlin, Germany
Nicola Fusco, Napoli, Italy
Fritz Gesztesy, Columbia, Missouri, USA
Niels Jacob, Swansea, United Kingdom
Karl-Hermann Neeb, Erlangen, Germany
Zoltán Sasvári

Multivariate Characteristic
and Correlation Functions

De Gruyter
Mathematical Subject Classification 2010: 42B10, 43A35, 60F05, 60G10, 60G50, 60G51,
60E07, 60E10.

ISBN 978-3-11-022398-9
e-ISBN 978-3-11-022399-6
Set-ISBN 978-3-11-174044-7
ISSN 0179-0986

Library of Congress Cataloging-in-Publication Data


A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek


The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available on the Internet at http://dnb.dnb.de.

© 2013 Walter de Gruyter GmbH, Berlin/Boston

Typesetting: PTP-Berlin Protago-TEX-Production GmbH, www.ptp-berlin.de


Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen
Printed on acid-free paper

Printed in Germany

www.degruyter.com
Preface

Our goal in writing this book is to make the basic concepts and results on multivariate
characteristic and correlation functions easily accessible to both students and researchers
in a comprehensive manner. Especially in the interest of students, every
attempt has been made to make the book as self-contained as possible. This led to
a relatively large appendix containing topics like infinite products, functional equations,
special functions or compact operators. We hope that this appendix will make
the usage of the book much easier, for example in seminars. Chapters 1, 2 and parts of
Chapter 3 have been ‘tested’ in lectures and seminars given at the Technical University
Dresden, Wroclaw University of Technology, and Swansea University.
In a certain sense, characteristic functions and correlation functions are the same:
the common underlying concept is positive definiteness. Many results in probability
theory, mathematical statistics and stochastic processes can be derived by using these
functions. While there are books on characteristic functions of one variable, books devoting
some sections to the multivariate case, and books treating the general case of
locally compact groups, interestingly there is no book devoted entirely to the multidimensional
case, which is extremely important for applications. This book is intended to
fill this gap at least partially. Due to the abundance of results and of their applications,
a single book cannot treat all of them.
Most of the presented results are fairly well known. For this reason only a few references
are included; instead, we cite papers and monographs where such references can be
found. A brief outline of the book is as follows. The first chapter presents basic results
and should be read carefully since it is essential for the understanding of the subsequent
chapters. The second chapter is devoted to correlation functions, their applications to
stationary processes and some connections to harmonic analysis. In Chapter 3 we deal
with several special properties, Chapter 4 is devoted to the extension problem, while
Chapter 5 contains a few applications. Many results in the Appendix are presented
with proofs. The reader might find some sections of the Appendix interesting on their
own, e.g., the section Solutions of certain functional equations or the section Linear
independence of exponential functions.
The numbering of chapters, sections, theorems, etc. is traditional. Equations are
numbered consecutively within each paragraph, theorem, definition, and so forth.
Equation (i) stands for the i-th equation within the current paragraph while
equation (l.k.j.i) stands for the i-th equation within Paragraph (l.k.j).
My warmest thanks go to Georg Berschneider, who read the whole manuscript, gave
valuable remarks, helped to find literature on special topics and solved several TeX
problems. Björn Böttcher read substantial parts of the manuscript; I am grateful for his
comments. I thank my colleagues at the Institut für Stochastik, Technische Universität
Dresden, for many helpful discussions, in particular Christiane Weber for her help in
preparing the figures, the index, and the bibliography, and Willi Schenk for the proof
idea of Lemma B.1.2. I record my gratitude to the publishers, especially to the series
editor Niels Jacob for his interest and encouragement.
Last, but not least, I would like to thank my wife, Evelyne Sasvári, for her support
and understanding.

Dresden, November 2012 Zoltán Sasvári


Contents

Preface
1 Characteristic functions
1.1 Basic properties
1.2 Differentiability
1.3 Inversion theorems
1.4 Basic properties of positive definite functions
1.5 Further properties of positive definite functions on R^d
1.6 Lévy’s continuity theorem
1.7 The theorems of Bochner and Herglotz
1.8 Fourier transformation on R^d
1.9 Fourier transformation on discrete commutative groups
1.10 Basic properties of Gaussian distributions
1.11 Some inequalities
2 Correlation functions
2.1 Random fields
2.2 Correlation functions of second order random fields
2.3 Continuity and differentiability
2.4 Integration with respect to complex measures
2.5 The Karhunen–Loève decomposition
2.6 Integration with respect to orthogonal random measures
2.7 The theorem of Karhunen
2.8 Stationary fields
2.9 Spectral representation of stationary fields
2.10 Unitary representations
2.11 Unitary representations and positive definite functions
3 Special properties
3.1 Strict positive definiteness
3.2 Infinitely differentiable and rapidly decreasing functions
3.3 Analytic characteristic functions of one variable
3.4 Holomorphic L^2 Fourier transforms
3.5 Further properties of Gaussian distributions
3.6 Fourier transformation of radial measures and functions
3.7 Radial characteristic functions
3.8 Schoenberg’s theorems on radial characteristic functions
3.9 Convex and completely monotone functions
3.10 Convolution roots with compact support
3.11 Infinitely divisible characteristic functions
3.12 Conditionally positive definite functions
4 The extension problem
4.1 General results
4.2 The cases R^d and Z^d
4.3 Decomposition of locally defined positive definite functions
4.4 Extension of radial positive definite functions
5 Selected applications
5.1 Limit theorems
5.2 Sums of independent random vectors and the Jessen–Wintner purity law
5.3 Ergodic theorems for stationary fields
5.4 Filtration of discrete stationary fields
Appendix
A Basic notation
A.1 Standard notation
A.2 Multidimensional notation
B Basic analysis
B.1 Miscellaneous results from classical analysis
B.2 Uniform convergence of continuous functions
B.3 Infinite products
B.4 Convex functions
B.5 The Riemann–Stieltjes integral
B.6 Multivariate calculus
B.7 The Lebesgue integral on R^d
C Advanced analysis
C.1 Functions of a complex variable
C.2 Almost periodic functions
C.3 Fourier series
C.4 The Gamma function and the formulae of Stirling and Binet
C.5 Bessel functions
C.6 The Mellin transform
C.7 The Laplace transform
C.8 Existence of continuous logarithms
C.9 Solutions of certain functional equations
C.10 Linear independence of exponential functions
D Functional analysis
D.1 Inner product spaces
D.2 Matrices and kernels
D.3 Hilbert spaces and linear operators
D.4 Convex sets and the theorem of Kreĭn and Milman
D.5 Weak topologies
E Measure theory
E.1 Borel measures, weak and vague convergence
E.2 Convolution of measures and functions
F Probability
F.1 Basic notions
F.2 Convergence of random vectors
F.3 Products of probability spaces
F.4 Conditional expectation
Bibliography
Index
Chapter 1

Characteristic functions

In this first chapter we prove several classical results due to, among others,
Herglotz, Bochner, Lévy, and Plancherel. We start with basic properties of characteristic
functions and continue with the investigation of positive definite functions.
As a byproduct, we obtain basic results on the Fourier transformation which will be
needed in subsequent chapters.
Before starting, we suggest taking a brief look at Appendix A, which contains the basic
notation used throughout the book.

1.1 Basic properties


This is probably the easiest section of the book. We first define characteristic functions
and then show several simple properties. All the proofs are short and not difficult. In
spite of their simplicity, these properties are extremely useful and frequently applied.

Definition 1.1.1. Let $X$ be a $d$-dimensional random vector and denote by $\mu$ the
distribution of $X$. The characteristic function $f = f_X$ of $X$ is defined by
$$f(t) := E\bigl(e^{i(t,X)}\bigr) = \int_{\mathbb{R}^d} e^{i(t,x)}\,d\mu(x), \qquad t \in \mathbb{R}^d.$$

The existence of the expectation above is guaranteed by the fact that the complex-valued
random variable $e^{i(t,X)}$ is bounded for each $t \in \mathbb{R}^d$. Sometimes we call $f$ the
characteristic function of the distribution $\mu$, in which case we also write $f_\mu$. As we
will see in Section 1.3, every distribution is uniquely determined by its characteristic
function.
If $X$ has a density $p$, then
$$f(t) = \int_{\mathbb{R}^d} e^{i(t,x)}\,p(x)\,dx, \qquad t \in \mathbb{R}^d.$$

If $X$ is discrete with values $x_1, x_2, \ldots$ and corresponding probabilities $p_1, p_2, \ldots$, then
$$f(t) = \sum_j p_j \cdot e^{i(t,x_j)}, \qquad t \in \mathbb{R}^d.$$

In particular, the functions
$$t \mapsto e^{i(t,x)} \quad\text{and}\quad t \mapsto \cos((t,x)) = \tfrac{1}{2}\,e^{i(t,x)} + \tfrac{1}{2}\,e^{-i(t,x)}$$
are characteristic functions for every $x \in \mathbb{R}^d$. As a further example, assume that $X$ is


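uniformly distributed on the parallelepiped
$$K = [a_1, b_1] \times \cdots \times [a_d, b_d] \subset \mathbb{R}^d,$$
where $a_j < b_j$. We then have
$$\begin{aligned}
f_X(t) &= \frac{1}{\lambda(K)} \int_K e^{i(t,x)}\,dx = \frac{1}{\lambda(K)} \int_K \prod_{j=1}^{d} e^{i t_j x_j}\,dx \\
&= \frac{1}{\lambda(K)} \cdot \prod_{j=1}^{d} \int_{a_j}^{b_j} e^{i t_j x_j}\,dx_j = \prod_{j=1}^{d} \frac{e^{i b_j t_j} - e^{i a_j t_j}}{i\,(b_j - a_j)\,t_j},
\end{aligned}$$
where the factors of the last product are equal to 1 (by definition) if $t_j = 0$.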
In the last example, the characteristic function $f_X$ is real-valued whenever $a_j = -b_j$
for all $j$. We will show later (cf. Theorem 1.3.13) that $f_X$ is real-valued if and
only if $X$ and $-X$ have the same distribution, i.e., if $\mu$ is symmetric: $\mu = \check{\mu}$, where
$\check{\mu}$ denotes the distribution of $-X$. The proof of the “if part” is very simple, but for
the “only if part” we need the fact that
every distribution is uniquely determined by its characteristic function.
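The product formula for the uniform distribution on a parallelepiped can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names `cf_uniform_box` and `cf_uniform_box_quadrature` and the sample values of `t`, `a`, `b` are illustrative assumptions, not notation from the text. It compares the closed form with a midpoint-rule approximation of the one-dimensional integrals and evaluates a symmetric box with $a_j = -b_j$.

```python
import cmath

def cf_uniform_box(t, a, b):
    """Closed-form characteristic function of the uniform distribution on
    [a_1, b_1] x ... x [a_d, b_d], evaluated at the point t."""
    value = 1.0 + 0.0j
    for tj, aj, bj in zip(t, a, b):
        if tj == 0.0:
            continue  # the factor equals 1 by definition when t_j = 0
        value *= (cmath.exp(1j * bj * tj) - cmath.exp(1j * aj * tj)) / (1j * (bj - aj) * tj)
    return value

def cf_uniform_box_quadrature(t, a, b, n=2000):
    """Midpoint-rule approximation of the same function, one factor at a time."""
    value = 1.0 + 0.0j
    for tj, aj, bj in zip(t, a, b):
        h = (bj - aj) / n
        total = sum(cmath.exp(1j * tj * (aj + (k + 0.5) * h)) for k in range(n))
        value *= total / n  # (1 / (b_j - a_j)) * h * sum == sum / n
    return value

# Hypothetical sample data: a box in R^3 and an evaluation point t.
t = (0.7, -1.3, 0.0)
a = (-1.0, 0.5, 2.0)
b = (2.0, 3.0, 5.0)
exact = cf_uniform_box(t, a, b)
approx = cf_uniform_box_quadrature(t, a, b)

# Symmetric box a_j = -b_j: each factor reduces to sin(b_j t_j) / (b_j t_j).
symmetric = cf_uniform_box((0.7, -1.3), (-2.0, -3.0), (2.0, 3.0))

print(abs(exact - approx), symmetric)
```

In the symmetric case the imaginary part vanishes up to rounding, in line with the remark that $f_X$ is real-valued when $a_j = -b_j$ for all $j$.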

Theorem 1.1.2. Every characteristic function $f$ on $\mathbb{R}^d$ has the following properties:

(i) $f(0) = 1$;

(ii) $|f(t)| \le 1$, $t \in \mathbb{R}^d$;

(iii) $f(-t) = \overline{f(t)}$, $t \in \mathbb{R}^d$;

(iv) $f$ is uniformly continuous on $\mathbb{R}^d$;

(v) the inequality
$$\sum_{i,j=1}^{n} f(t_i - t_j)\,c_i\,\overline{c_j} \ge 0$$
holds for every positive integer $n$, for all $t_1, \ldots, t_n \in \mathbb{R}^d$,
and for all $c_1, \ldots, c_n \in \mathbb{C}$.

Proof. The first three properties follow immediately from Definition 1.1.1. To prove
(iv) assume that $f$ is the characteristic function of a random vector $X$. Then
$$|f(t+h) - f(t)| = \bigl|E\bigl[e^{i(t,X)} \cdot (e^{i(h,X)} - 1)\bigr]\bigr| \le E\bigl(|e^{i(h,X)} - 1|\bigr)$$
where the upper bound is independent of $t$ and, by the dominated convergence theorem,
tends to zero as $h \to 0$. This shows that $f$ is uniformly continuous on $\mathbb{R}^d$.

The last statement follows from
$$\begin{aligned}
\sum_{i,j=1}^{n} f(t_i - t_j)\,c_i\,\overline{c_j} &= \sum_{i,j=1}^{n} E\bigl(e^{i(t_i - t_j, X)}\bigr)\,c_i\,\overline{c_j} = E\Bigl(\sum_{i,j=1}^{n} c_i\,e^{i(t_i,X)}\,\overline{c_j\,e^{i(t_j,X)}}\Bigr) \\
&= E\Bigl(\Bigl|\sum_{i=1}^{n} c_i\,e^{i(t_i,X)}\Bigr|^2\Bigr) \ge 0.
\end{aligned}$$
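The identity used in the proof of property (v) can be made concrete for a discrete distribution on the real line. The sketch below (Python, standard library only) is an illustration under stated assumptions: the atoms, probabilities, points and coefficients are arbitrary sample values, and the helper names `discrete_cf`, `quadratic_form` and `expected_square` are hypothetical. It evaluates both sides of the chain of equalities and compares them.

```python
import cmath
import random

def discrete_cf(t, atoms, probs):
    """Characteristic function f(t) = sum_k p_k * exp(i t x_k) of a discrete distribution."""
    return sum(p * cmath.exp(1j * t * x) for x, p in zip(atoms, probs))

def quadratic_form(points, coeffs, atoms, probs):
    """Left-hand side: sum_{i,j} f(t_i - t_j) c_i conj(c_j)."""
    total = 0.0 + 0.0j
    for ti, ci in zip(points, coeffs):
        for tj, cj in zip(points, coeffs):
            total += discrete_cf(ti - tj, atoms, probs) * ci * cj.conjugate()
    return total

def expected_square(points, coeffs, atoms, probs):
    """Right-hand side: E(|sum_i c_i exp(i t_i X)|^2), a sum of nonnegative terms."""
    return sum(
        p * abs(sum(c * cmath.exp(1j * t * x) for t, c in zip(points, coeffs))) ** 2
        for x, p in zip(atoms, probs)
    )

rng = random.Random(0)                       # fixed seed for reproducibility
atoms = [-1.0, 0.5, 2.0]                     # sample values of X (assumed)
probs = [0.2, 0.5, 0.3]
points = [rng.uniform(-5.0, 5.0) for _ in range(6)]
coeffs = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(6)]

lhs = quadratic_form(points, coeffs, atoms, probs)
rhs = expected_square(points, coeffs, atoms, probs)
print(lhs, rhs)
```

The left-hand side comes out real and nonnegative up to rounding because it coincides with the expectation of a squared modulus.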

We will see in Section 1.7 that inequality (1.1.2.v) is a characteristic property of
characteristic functions.
The next theorem is very useful in the study of sums of independent random vectors.

Theorem 1.1.3. If $X$ and $Y$ are independent $d$-dimensional random vectors, then the
characteristic function of their sum is
$$f_{X+Y} = f_X \cdot f_Y.$$

Proof. Let $t \in \mathbb{R}^d$ be arbitrary. Since $X$ and $Y$ are independent, the complex-valued
random variables $e^{i(t,X)}$ and $e^{i(t,Y)}$ are independent as well. Thus,
$$f_{X+Y}(t) = E\bigl(e^{i(t,X+Y)}\bigr) = E\bigl(e^{i(t,X)} \cdot e^{i(t,Y)}\bigr) = E\bigl(e^{i(t,X)}\bigr) \cdot E\bigl(e^{i(t,Y)}\bigr) = f_X(t) \cdot f_Y(t).$$
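A small one-dimensional illustration of Theorem 1.1.3: for two independent discrete random variables, the distribution of the sum is obtained by convolving the probability tables, and its characteristic function should equal the product of the two factors. The sketch below uses Python's standard library; the distributions `die` and `coin` and the helper names are assumptions chosen for the example.

```python
import cmath
from collections import defaultdict

def cf(dist, t):
    """Characteristic function of a finite discrete distribution {value: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in dist.items())

def sum_distribution(dist_x, dist_y):
    """Distribution of X + Y for independent X and Y (discrete convolution)."""
    out = defaultdict(float)
    for x, px in dist_x.items():
        for y, py in dist_y.items():
            out[x + y] += px * py
    return dict(out)

die = {k: 1.0 / 6.0 for k in range(1, 7)}   # fair die (assumed example)
coin = {0: 0.3, 1: 0.7}                     # biased coin (assumed example)
total = sum_distribution(die, coin)

t = 0.9
lhs = cf(total, t)                 # f_{X+Y}(t), computed from the distribution of the sum
rhs = cf(die, t) * cf(coin, t)     # f_X(t) * f_Y(t)
print(abs(lhs - rhs))
```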

For arbitrary distributions $\mu$ and $\nu$ on $\mathbb{R}^d$ there exist independent $d$-dimensional
random vectors $X$ and $Y$ such that $\mu$ and $\nu$ are the distributions of $X$ and $Y$,
respectively.¹ From this and Theorem 1.1.3 we obtain:

Corollary 1.1.4. The product of two characteristic functions of $d$ variables is again
a characteristic function.

Noting that $\overline{f}$ is the characteristic function of $-X$, we obtain the following corollary.

Corollary 1.1.5. If $X$ and $Y$ are independent identically distributed random vectors
with characteristic function $f$, then $|f|^2$ is the characteristic function of $X - Y$.

Next we present further simple methods to construct characteristic functions from
given ones.

1 Note that the distribution of $X + Y$ is $\mu * \nu$ so that the equation in Theorem 1.1.3 can also be written
as $f_{\mu * \nu} = f_\mu \cdot f_\nu$.

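Theorem 1.1.6. Let $R$ be a random variable and $X$ be a $d$-dimensional random vector.
If $R$ and $X$ are independent, then²
$$f_{RX}(t) = \int_{-\infty}^{\infty} f_X(st)\,d\mu_R(s), \qquad t \in \mathbb{R}^d.$$

Proof. Applying equation (F.1.3) we obtain
$$\begin{aligned}
f_{RX}(t) &= E\bigl(e^{i(t,RX)}\bigr) = E\bigl(e^{iR(t,X)}\bigr) \\
&= \int_{-\infty}^{\infty} e^{is}\,d\mu_{R(t,X)}(s) \\
&= \int_{-\infty}^{\infty} e^{is}\,d\bigl(\mu_R \star \mu_{(t,X)}\bigr)(s) \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{isu}\,d\mu_{(t,X)}(u)\,d\mu_R(s) \\
&= \int_{-\infty}^{\infty}\int_{\mathbb{R}^d} e^{is(t,x)}\,d\mu_X(x)\,d\mu_R(s) \\
&= \int_{-\infty}^{\infty} f_X(st)\,d\mu_R(s).
\end{aligned}$$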

Theorem 1.1.7. Let $X$ be a $d$-dimensional random vector. Then the equation
$$f_{AX+b}(t) = e^{i(t,b)} \cdot f_X(A^T t), \qquad t \in \mathbb{R}^n$$
holds for every linear mapping $A : \mathbb{R}^d \to \mathbb{R}^n$ and $b \in \mathbb{R}^n$.
In particular, if $X = (X_1, X_2)$ where $X_1$ and $X_2$ are random variables, then
$$f_{a_1 X_1 + a_2 X_2}(t) = f_X(a_1 t, a_2 t), \qquad a_1, a_2, t \in \mathbb{R}.$$

Proof. We have
$$f_{AX+b}(t) = E\bigl[e^{i(t,AX+b)}\bigr] = E\bigl[e^{i(t,b)} \cdot e^{i(A^T t, X)}\bigr] = e^{i(t,b)} \cdot f_X(A^T t).$$
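The transformation rule of Theorem 1.1.7 can be verified directly for a random vector with finitely many values. The following sketch (Python, standard library only) assumes a hypothetical three-point distribution in $\mathbb{R}^2$ and an arbitrary affine map into $\mathbb{R}^3$; the helper names are illustrative. It computes $f_{AX+b}(t)$ from the transformed atoms and compares it with $e^{i(t,b)} f_X(A^T t)$.

```python
import cmath

def inner(u, v):
    """Euclidean inner product (u, v)."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cf_discrete(t, atoms, probs):
    """Characteristic function of a discrete random vector with the given atoms."""
    return sum(p * cmath.exp(1j * inner(t, x)) for x, p in zip(atoms, probs))

def apply_affine(A, b, x):
    """Return A x + b for a matrix A given as a list of rows."""
    return tuple(inner(row, x) + bi for row, bi in zip(A, b))

def transpose_apply(A, t):
    """Return A^T t for a matrix A given as a list of rows."""
    return tuple(sum(A[r][c] * t[r] for r in range(len(A))) for c in range(len(A[0])))

atoms = [(0.0, 1.0), (1.0, -1.0), (2.0, 0.5)]   # values of X in R^2 (assumed)
probs = [0.2, 0.5, 0.3]
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]       # linear map R^2 -> R^3 (assumed)
b = (0.5, -1.0, 2.0)
t = (0.3, -0.7, 1.1)

lhs = cf_discrete(t, [apply_affine(A, b, x) for x in atoms], probs)
rhs = cmath.exp(1j * inner(t, b)) * cf_discrete(transpose_apply(A, t), atoms, probs)
print(abs(lhs - rhs))
```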

Theorem 1.1.8. Let $X_1$ and $X_2$ be independent $d_1$- and $d_2$-dimensional random vectors,
respectively, and write $d = d_1 + d_2$. Then the characteristic function of the
$d$-dimensional random vector $X = (X_1, X_2)$ is given by
$$f_X(t) = f_{X_1}(t_1) \cdot f_{X_2}(t_2)$$
where $t_1 \in \mathbb{R}^{d_1}$, $t_2 \in \mathbb{R}^{d_2}$, and $t = (t_1, t_2) \in \mathbb{R}^d$.³

Proof. The theorem follows from
$$f_X(t) = E\bigl(e^{i(t,X)}\bigr) = E\bigl(e^{i(t_1,X_1)} \cdot e^{i(t_2,X_2)}\bigr)$$
using the independence of $X_1$ and $X_2$.

2 See also Lemma 1.5.7.
3 See also Theorem 1.3.10.

Applying the same argument as in the proof of Corollary 1.1.4, we obtain the following
corollary.

Corollary 1.1.9. Let $f_i$ be a characteristic function of $d_i$ variables $(i = 1, 2)$. Then
the function $f$ defined by
$$f(t) = f_1(t_1) \cdot f_2(t_2)$$
where $t_1 \in \mathbb{R}^{d_1}$, $t_2 \in \mathbb{R}^{d_2}$, and $t = (t_1, t_2) \in \mathbb{R}^d$, is a characteristic function of
$d = d_1 + d_2$ variables.

Remark 1.1.10. It is a useful fact that the characteristic function $f_X$ of a random
vector $X = (X_1, \ldots, X_d)$ is determined by the characteristic functions $f_{L_y}$ of all
linear combinations
$$L_y = y_1 X_1 + \cdots + y_d X_d = (y, X), \qquad y \in \mathbb{R}^d$$
and vice versa. Indeed,
$$f_{L_y}(s) = E\bigl(e^{isL_y}\bigr) = E\bigl(e^{is(y,X)}\bigr) = f_X(s \cdot y), \qquad y \in \mathbb{R}^d,\ s \in \mathbb{R}. \tag{1}$$
In particular, $f_X(y) = f_{L_y}(1)$. From (1) we also see that $s \mapsto f_X(s \cdot y)$ is a
characteristic function of one variable for all $y \in \mathbb{R}^d$.

Remark 1.1.11. Let $X = (X_1, \ldots, X_d)$ be a random vector and $i_1, \ldots, i_k$ be integers
such that $1 \le i_1 < \cdots < i_k \le d$. It follows immediately from Definition 1.1.1 that the
characteristic function of the random vector $(X_{i_1}, \ldots, X_{i_k})$ can be obtained by setting
$t_i = 0$ for $i \notin \{i_1, \ldots, i_k\}$ in $f_X(t_1, \ldots, t_d)$. The distribution of $(X_{i_1}, \ldots, X_{i_k})$ is
usually called a marginal distribution.

Lemma 1.1.12. Let $\mu$ be a distribution on $\mathbb{R}^d$. The real part
$$\operatorname{Re} f(t) = \int_{\mathbb{R}^d} \cos((t,x))\,d\mu(x), \qquad t \in \mathbb{R}^d$$
of its characteristic function $f$ is the characteristic function of the distribution
$(\mu + \check{\mu})/2$.

Proof. The statement follows immediately from the fact that
$$\int h(t)\,d\check{\mu}(t) = \int h(-t)\,d\mu(t)$$
for every bounded continuous function $h$ on $\mathbb{R}^d$.

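In the rest of this section we present some simple examples of characteristic functions
of one variable. These functions are shown in Figures 1.1–1.5.

Example 1.1.13.
(a) The exponential distribution with parameter $\alpha > 0$ has the density function
$x \mapsto \alpha\,1_{[0,\infty)}(x) \cdot e^{-\alpha x}$, $x \in \mathbb{R}$, and its characteristic function is given by
$$\int_0^{\infty} \alpha e^{-\alpha x} \cdot e^{itx}\,dx = \frac{\alpha \cdot e^{(it-\alpha)x}}{it - \alpha}\bigg|_{x=0}^{x=\infty} = \frac{\alpha}{\alpha - it}$$
(see Figure 1.1).
(b) The Laplace distribution with parameter $\beta > 0$ has the density function
$x \mapsto \frac{1}{2\beta} \cdot e^{-|x|/\beta}$, $x \in \mathbb{R}$. Its characteristic function is given by
$$\begin{aligned}
\int_{-\infty}^{\infty} \frac{1}{2\beta} \cdot e^{-|x|/\beta} \cdot e^{itx}\,dx &= \int_0^{\infty} \frac{1}{2\beta} \cdot e^{(-\frac{1}{\beta} + it)x}\,dx + \int_{-\infty}^{0} \frac{1}{2\beta} \cdot e^{(\frac{1}{\beta} + it)x}\,dx \\
&= \frac{1}{2\beta} \cdot \Bigl(-\frac{1}{it - \beta^{-1}} + \frac{1}{it + \beta^{-1}}\Bigr) = \frac{1}{1 + \beta^2 t^2}
\end{aligned}$$
(see Figure 1.2). Using Corollary 1.1.5 and the last two examples we see that
the difference of two independent identically distributed exponential random
variables is Laplace distributed.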
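The last remark rests on the identity $|\alpha/(\alpha - it)|^2 = 1/(1 + t^2/\alpha^2)$, i.e. the squared modulus of the exponential characteristic function is the Laplace characteristic function with $\beta = 1/\alpha$ (identifying the distribution from its characteristic function additionally uses the uniqueness result of Section 1.3). A minimal numerical sketch in Python, with an arbitrarily chosen $\alpha$ and hypothetical function names:

```python
def exponential_cf(t, alpha):
    """Characteristic function alpha / (alpha - i t) of the exponential distribution."""
    return alpha / (alpha - 1j * t)

def laplace_cf(t, beta):
    """Characteristic function 1 / (1 + beta^2 t^2) of the Laplace distribution."""
    return 1.0 / (1.0 + beta ** 2 * t ** 2)

alpha = 2.5                                    # assumed sample parameter
grid = [k / 10.0 for k in range(-50, 51)]      # evaluation points in [-5, 5]

# Corollary 1.1.5: |f|^2 is the characteristic function of X - Y.
max_gap = max(
    abs(abs(exponential_cf(t, alpha)) ** 2 - laplace_cf(t, 1.0 / alpha))
    for t in grid
)
print(max_gap)
```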

Figure 1.1. The function $t \mapsto \frac{1}{1 - it}$, $t \in [-10, 10]$, from Example 1.1.13(a).

Figure 1.2. The function $t \mapsto \frac{1}{1 + t^2}$ from Example 1.1.13(b).

(c) To compute the characteristic function of the binomial distribution
$$\mu = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k}\,\delta_k$$
with parameters $n \ge 1$ and $p \in [0, 1]$, let $X_1, \ldots, X_n$ be independent random
variables such that $P(X_j = 0) = 1 - p$, $P(X_j = 1) = p$ for all $j$. Then $\mu$ is
the distribution of $X = X_1 + \cdots + X_n$. Applying Theorem 1.1.3 we obtain
$$f_\mu(t) = f_X(t) = [f_{X_1}(t)]^n = [1 - p + p \cdot e^{it}]^n$$
(see Figure 1.3 and Figure 1.4).

(d) The characteristic function of the Poisson distribution
$$e^{-\lambda} \cdot \sum_{k=0}^{\infty} \frac{\lambda^k}{k!}\,\delta_k$$
with parameter $\lambda \ge 0$ is
$$e^{-\lambda} \cdot \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} \cdot e^{itk} = e^{-\lambda} \cdot \sum_{k=0}^{\infty} \frac{(\lambda e^{it})^k}{k!} = e^{\lambda(e^{it} - 1)}$$
(see Figure 1.5).
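The closed forms in parts (c) and (d) can be compared with the defining sums. The sketch below is a hedged illustration in Python (standard library only); the parameter values and the truncation length `terms` for the Poisson series are assumptions made for the example, and the function names are hypothetical.

```python
import cmath
import math

def binomial_cf_sum(t, n, p):
    """Sum over the atoms k = 0, ..., n of the binomial distribution."""
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k) * cmath.exp(1j * t * k)
        for k in range(n + 1)
    )

def binomial_cf_closed(t, n, p):
    """Closed form [1 - p + p e^{it}]^n."""
    return (1 - p + p * cmath.exp(1j * t)) ** n

def poisson_cf_sum(t, lam, terms=80):
    """Truncated series e^{-lam} * sum_k lam^k / k! * e^{itk}."""
    total = 0.0 + 0.0j
    weight = math.exp(-lam)          # e^{-lam} lam^k / k! for k = 0
    for k in range(terms):
        total += weight * cmath.exp(1j * t * k)
        weight *= lam / (k + 1)      # update the weight to index k + 1
    return total

def poisson_cf_closed(t, lam):
    """Closed form exp(lam (e^{it} - 1))."""
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

t = 1.3
binomial_gap = abs(binomial_cf_sum(t, 4, 0.7) - binomial_cf_closed(t, 4, 0.7))
poisson_gap = abs(poisson_cf_sum(t, 3.0) - poisson_cf_closed(t, 3.0))
print(binomial_gap, poisson_gap)
```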


Figure 1.3. The function $t \mapsto [1 - p + p \cdot e^{it}]^n$ from Example 1.1.13(c) with $p = 1/2$ and
$n = 4$.

Figure 1.4. The function $t \mapsto [1 - p + p \cdot e^{it}]^n$ from Example 1.1.13(c) with $p = 0.7$ and
$n = 4$.

Figure 1.5. The function $t \mapsto \exp(e^{it} - 1)$ from Example 1.1.13(d).

1.2 Differentiability
The characteristic functions in Example 1.1.13 are infinitely differentiable (even
analytic) on $\mathbb{R}$. We will show in Example 1.3.7 that the characteristic function of the
Cauchy distribution with parameter $\alpha = 1$ is given by $f(t) = e^{-|t|}$. This function
is not differentiable at 0. Example B.1.19 shows that characteristic functions may be
nowhere differentiable.
Throughout this section, $\mu$ denotes a probability distribution on $\mathbb{R}^d$ with characteristic
function $f$. We will see that differentiability of $f$ is closely related to the
existence of moments of $\mu$.

Theorem 1.2.1. Let $\alpha \in \mathbb{N}_0^d$ be such that the moment $M_\alpha$ of $\mu$ exists. Then the partial
derivative $D^\alpha f$ exists and we have

(i) $D^\alpha f(t) = i^{|\alpha|} \int_{\mathbb{R}^d} x^\alpha \cdot e^{i(t,x)}\,d\mu(x)$, $t \in \mathbb{R}^d$;
(ii) $D^\alpha f(0) = i^{|\alpha|} \cdot M_\alpha$;
(iii) $D^\alpha f$ is uniformly continuous and $|D^\alpha f|$ is bounded by the absolute moment
$A_\alpha$ of $\mu$.

Proof. Suppose that $\alpha_j > 0$ for some $j$ and let $e_1, \ldots, e_d$ be the standard basis of
$\mathbb{R}^d$. For $h \in \mathbb{R} \setminus \{0\}$ we have
$$\frac{f(t + h e_j) - f(t)}{h} = \int_{\mathbb{R}^d} x_j \cdot \frac{e^{ihx_j} - 1}{h x_j} \cdot e^{i(t,x)}\,d\mu(x).$$
Theorem B.1.5 with $n = 0$ shows that the integrand is dominated by $|x_j|$. Taking the
limit $h \to 0$ and applying Lebesgue’s theorem on dominated convergence we obtain
$$\frac{\partial}{\partial t_j} f(t) = i \int_{\mathbb{R}^d} x_j \cdot e^{i(t,x)}\,d\mu(x), \qquad t \in \mathbb{R}^d.$$
Repeating this argument we obtain (i). Equation (ii) is obtained from (i) by setting
$t = 0$.
From (i) we infer that
$$|D^\alpha f(t)| \le \int_{\mathbb{R}^d} |x^\alpha|\,d\mu(x) = A_\alpha.$$
The uniform continuity of $D^\alpha f$ follows from
$$|D^\alpha f(t+h) - D^\alpha f(t)| = \Bigl|\int_{\mathbb{R}^d} x^\alpha \cdot (e^{i(h,x)} - 1) \cdot e^{i(t,x)}\,d\mu(x)\Bigr| \le \int_{\mathbb{R}^d} |x^\alpha| \cdot |e^{i(h,x)} - 1|\,d\mu(x)$$
by dominated convergence.

The previous theorem has the following immediate consequences.

Corollary 1.2.2. Let $X = (X_1, \ldots, X_d)$ be a $d$-dimensional random vector with
characteristic function $f$. If $X_j$ is square integrable for all $j$, then the covariance
matrix $\operatorname{cov}(X) = (c_{jk})_{j,k=1}^{d}$ is given by
$$c_{jk} = -\frac{\partial^2}{\partial t_j\,\partial t_k} f(0) + \frac{\partial}{\partial t_j} f(0) \cdot \frac{\partial}{\partial t_k} f(0).$$

Corollary 1.2.3. Assume that $\mu \ne \delta_0$ and that for some $\alpha \in \mathbb{N}_0^d$ the moment $M_{2\alpha}$ of
$\mu$ exists. Then the function $g$ defined by
$$g(t) = \frac{(-1)^{|\alpha|}}{M_{2\alpha}} \cdot D^{2\alpha} f(t), \qquad t \in \mathbb{R}^d$$
is a characteristic function. The corresponding distribution $\nu$ is given by
$$d\nu(x) = \frac{1}{M_{2\alpha}} \cdot x^{2\alpha}\,d\mu(x).$$
Example 1.2.4. As an application of Theorem 1.2.1 we compute the mean and the
variance of the distributions from Example 1.1.13. Let $f$ denote the characteristic
function of a random variable $X$ with finite variance. Then, in view of (1.2.1.ii),
$$E(X) = -i f'(0), \qquad E(X^2) = -f''(0)$$
and hence
$$\operatorname{Var}(X) = E(X^2) - (E X)^2 = -f''(0) + f'(0)^2.$$
(a) If $X$ is exponentially distributed with parameter $\alpha > 0$ then we have
$$f'(t) = \frac{d}{dt}\,\frac{\alpha}{\alpha - it} = \frac{\alpha i}{(\alpha - it)^2}, \qquad f''(t) = -\frac{2\alpha}{(\alpha - it)^3}$$
and hence $E(X) = \frac{1}{\alpha}$, $E(X^2) = \frac{2}{\alpha^2}$ and $\operatorname{Var}(X) = \frac{1}{\alpha^2}$.

(b) In the case of the Laplace distribution with parameter $\beta > 0$ we have
$$f'(t) = \frac{d}{dt}\,\frac{1}{1 + \beta^2 t^2} = -\frac{2\beta^2 t}{(1 + \beta^2 t^2)^2}, \qquad f''(t) = \frac{2\beta^2 (3\beta^2 t^2 - 1)}{(1 + \beta^2 t^2)^3}.$$
Thus, $E(X) = 0$ and $E(X^2) = \operatorname{Var}(X) = 2\beta^2$.

(c) Assume that $X$ has the binomial distribution with parameters $n$ and $p$. Writing
$q := 1 - p$ we have
$$f'(t) = \frac{d}{dt}\,(q + p e^{it})^n = inp \cdot (q + p e^{it})^{n-1} \cdot e^{it},$$
$$f''(t) = -np \cdot (q + p e^{it})^{n-2} \cdot (q + pn e^{it}) \cdot e^{it}.$$
Consequently, $E(X) = np$, $E(X^2) = np(np + q)$ and $\operatorname{Var}(X) = npq$.

(d) If $X$ has the Poisson distribution with parameter $\lambda > 0$, then
$$f'(t) = \frac{d}{dt}\,e^{\lambda(e^{it} - 1)} = i\lambda \cdot e^{it} \cdot e^{\lambda(e^{it} - 1)}, \qquad f''(t) = -\lambda \cdot e^{it} \cdot (1 + \lambda e^{it}) \cdot e^{\lambda(e^{it} - 1)}$$
and therefore $E(X) = \lambda$, $E(X^2) = \lambda + \lambda^2$ and $\operatorname{Var}(X) = \lambda$.
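The relations $E(X) = -i f'(0)$ and $\operatorname{Var}(X) = -f''(0) + f'(0)^2$ also lend themselves to a numerical check when derivatives are replaced by central differences. The following Python sketch is only an approximation scheme chosen for illustration: the step size `h`, the parameter values and the function names are assumptions, and finite differences carry a small truncation and rounding error.

```python
import cmath

def mean_and_variance(cf, h=1e-4):
    """Approximate E(X) = -i f'(0) and Var(X) = -f''(0) + f'(0)^2 by central differences."""
    d1 = (cf(h) - cf(-h)) / (2 * h)                 # approximates f'(0)
    d2 = (cf(h) - 2 * cf(0.0) + cf(-h)) / (h * h)   # approximates f''(0)
    mean = (-1j * d1).real
    variance = (-d2 + d1 * d1).real
    return mean, variance

def exponential_cf(t, alpha=2.0):
    return alpha / (alpha - 1j * t)

def binomial_cf(t, n=4, p=0.7):
    return (1 - p + p * cmath.exp(1j * t)) ** n

def poisson_cf(t, lam=3.0):
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

exp_mean, exp_var = mean_and_variance(exponential_cf)   # compare with 1/alpha and 1/alpha^2
bin_mean, bin_var = mean_and_variance(binomial_cf)      # compare with np and npq
poi_mean, poi_var = mean_and_variance(poisson_cf)       # compare with lambda and lambda

print(exp_mean, exp_var, bin_mean, bin_var, poi_mean, poi_var)
```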

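The next theorem follows immediately from Theorem 1.2.1 and Theorem B.6.1 with
$a = 0$.

Theorem 1.2.5. If for some positive integer $n$ all moments $M_\alpha$, $\alpha \in \mathbb{N}_0^d$, $|\alpha| \le n$, of
$\mu$ exist, then
$$f(t) = \sum_{|\alpha| \le n} \frac{M_\alpha}{\alpha!} \cdot (it)^\alpha + \sum_{|\alpha| = n} R_\alpha(t) \cdot t^\alpha, \qquad t \in \mathbb{R}^d$$
where
$$R_\alpha(t) = \frac{n}{\alpha!} \cdot \int_0^1 \bigl(D^\alpha f(xt) - D^\alpha f(0)\bigr)(1 - x)^{n-1}\,dx.$$
Moreover, $|R_\alpha(t)| \le \frac{2A_\alpha}{\alpha!}$ and $\lim_{t \to 0} R_\alpha(t) = 0$.

Sometimes it is more convenient to use the following expansion which is obtained
by using Theorem B.6.2.

Theorem 1.2.6. Under the conditions of the previous theorem there exist $\theta, \vartheta \in (0, 1)$
depending on $t$ such that
$$f(t) = \sum_{|\alpha| \le n-1} \frac{M_\alpha}{\alpha!} \cdot (it)^\alpha + \sum_{|\alpha| = n} \frac{\operatorname{Re} D^\alpha f(\theta t) + i \cdot \operatorname{Im} D^\alpha f(\vartheta t)}{\alpha!} \cdot t^\alpha.$$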
The next special case is frequently used.

Corollary 1.2.7. Let $X = (X_1, \ldots, X_d)$ be a random vector such that $E(X_j^2) < \infty$
for all $j$. For the characteristic function $f$ of $X$ we then have
$$f(t) = 1 + i \cdot \sum_{j=1}^{d} E(X_j)\,t_j - \frac{1}{2} \sum_{i,j=1}^{d} \bigl[E(X_i X_j) + R_{i,j}(t)\bigr]\,t_i t_j, \qquad t \in \mathbb{R}^d$$
where $|R_{i,j}(t)| \le 2E(|X_i X_j|)$ and $\lim_{t \to 0} R_{i,j}(t) = 0$. Moreover, there exist
$\theta, \vartheta \in (0, 1)$ depending on $t$ such that
$$f(t) = 1 + i \cdot \sum_{j=1}^{d} E(X_j)\,t_j + \frac{1}{2} \sum_{i,j=1}^{d} \bigl[\operatorname{Re} f_{t_i,t_j}(\theta t) + i \cdot \operatorname{Im} f_{t_i,t_j}(\vartheta t)\bigr]\,t_i t_j.$$

Lemma 1.2.8. Let $t \in \mathbb{R}^d$ be such that the inequality
$$1 - \operatorname{Re} f(s \cdot t) \le C_t \cdot s^2, \qquad s \in \mathbb{R} \tag{1}$$
holds with some $C_t \ge 0$. Then,
$$\int_{\mathbb{R}^d} (t, x)^2\,d\mu(x) \le 2 \cdot C_t. \tag{2}$$
In particular, if (1) holds for all $t \in \mathbb{R}^d$, then all moments of second order of $\mu$ are
finite and hence all partial derivatives of order two of $f$ exist.⁴

Proof. Let $s \in \mathbb{R} \setminus \{0\}$, $t \in \mathbb{R}^d$ and $T > 0$ be arbitrary. We then have
$$C_t \ge \frac{1 - \operatorname{Re} f(s \cdot t)}{s^2} = \int_{\mathbb{R}^d} \frac{1 - \cos(s(t,x))}{s^2}\,d\mu(x) \ge \int_{-T}^{T} \cdots \int_{-T}^{T} \frac{1 - \cos(s(t,x))}{s^2}\,d\mu(x).$$
Using the inequality (B.1.6.2) and Lebesgue’s theorem on dominated convergence we
see that the right-hand side tends to
$$\frac{1}{2} \int_{-T}^{T} \cdots \int_{-T}^{T} (t, x)^2\,d\mu(x)$$
as $s \to 0$. Since $T > 0$ is arbitrary, we find that (2) holds.
The second statement is obtained by choosing special $t$’s with $t_j = 1$ and $t_i = 0$,
$i \ne j$, in (2).

4 Compare this result with Theorem 1.11.4.
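As a quick sanity check of Lemma 1.2.8 (a worked illustration under the assumption $d = 1$, not a statement from the text), take the Laplace distribution with parameter $\beta > 0$ from Example 1.1.13(b), whose characteristic function $f(t) = 1/(1 + \beta^2 t^2)$ is real-valued. For $t, s \in \mathbb{R}$,
$$1 - \operatorname{Re} f(s \cdot t) = \frac{\beta^2 s^2 t^2}{1 + \beta^2 s^2 t^2} \le \beta^2 t^2 \cdot s^2,$$
so inequality (1) holds with $C_t = \beta^2 t^2$, and (2) reads
$$\int_{\mathbb{R}} (t x)^2\,d\mu(x) = t^2 \cdot E(X^2) \le 2 \beta^2 t^2.$$
Since $E(X^2) = 2\beta^2$ by Example 1.2.4(b), the bound of the lemma is attained with equality in this case.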

The existence of $D^\alpha f$ does not imply the existence of the moment $M_\alpha$ in general.⁵
However, the following is true.

Theorem 1.2.9. If for some $\alpha \in \mathbb{N}_0^d \setminus \{0\}$ and for $g = \operatorname{Re} f$ all partial derivatives
$D^\beta g$, $\beta < 2\alpha$, exist in an open neighborhood of zero and if $D^{2\alpha} g$ exists at zero,
then the moment $M_{2\alpha}$ of $\mu$ exists. Moreover, the partial derivative $D^{2\alpha} f$ exists on all
of $\mathbb{R}^d$.

Proof. The second statement follows from the first one using Theorem 1.2.1. To prove
the first statement assume first that $|\alpha| = 1$. By our assumption the function
$$h(s) = h_\alpha(s) = \operatorname{Re} f(s \cdot \alpha), \qquad s \in \mathbb{R}$$
is differentiable in a neighborhood of 0 and $h''(0)$ exists. Since $h$ is even, $h'(0) = 0$.
Applying (1.1.2.ii) and the mean value theorem we see that
$$0 \le \frac{1 - h(s)}{s^2} = -\frac{h'(\theta s)}{s} \le -\frac{h'(\theta s)}{\theta s}, \qquad s \ne 0$$
where $\theta = \theta(s) \in (0, 1)$. Since $h'(0) = 0$, the right-hand side tends to $-h''(0)$ as $s \to 0$.
Consequently, there exists a constant $C = C_\alpha$ such that
$$\frac{1 - h(s)}{s^2} \le C, \qquad s \ne 0$$
(note that $|h(s)| \le 1$ for all $s$). Thus, by Lemma 1.2.8, the moment $M_{2\alpha}$ exists.
The general case $|\alpha| > 1$ follows from Corollary 1.2.3 using induction on $|\alpha|$.

We continue with a simple application to conditional expectations.

Definition 1.2.10. Let $X_0$ be a random variable with finite expectation and let
$X = (X_1, \ldots, X_d)$ be a random vector. The random variable $X_0$ is said to have polynomial
regression of order $k$ on $X$ if there exists a polynomial
$$Q(t) = \sum_{\alpha \in \mathbb{N}_0^d,\ |\alpha| \le k} b_\alpha \cdot t^\alpha, \qquad t \in \mathbb{R}^d \tag{1}$$
of order $k$ such that
$$E(X_0 \mid X) = Q(X). \tag{2}$$
If $k = 0$, 1 or 2 we speak of constant, linear or quadratic regression, respectively.

Whether $X_0$ has polynomial regression on $X$ can be answered in terms of the
characteristic function of the random vector $(X_0, X_1, \ldots, X_d)$.

5 See Remark 1.10.6 in [6] for an example.

Theorem 1.2.11. Let $k \in \mathbb{N}_0$ and let $X_0, X_1, \ldots, X_d$ be random variables such that
all moments of order at most $k$ of the random vector $Y = (X_0, X_1, \ldots, X_d)$ exist.
Denote by $g$ the characteristic function of $Y$. Then the equation (1.2.10.2) holds if
and only if⁶
$$-i\,D^{(1,0,\ldots,0)} g(0, t) = \sum_{\alpha \in \mathbb{N}_0^d,\ |\alpha| \le k} (-i)^{|\alpha|}\,b_\alpha\,D^{(0,\alpha_1,\ldots,\alpha_d)} g(0, t), \qquad t \in \mathbb{R}^d.$$

Proof. Write $X = (X_1, \ldots, X_d)$. By Lemma F.4.2, equation (1.2.10.2) holds if and
only if
$$E\bigl(X_0 \cdot e^{i(t,X)}\bigr) = \sum_{\alpha} b_\alpha \cdot E\bigl(X^\alpha \cdot e^{i(t,X)}\bigr), \qquad t \in \mathbb{R}^d.$$
Hence, the statement follows immediately from Theorem 1.2.1.

Next we prove an interesting result: If the product of two characteristic functions is
$2n$-times differentiable, then so are the factors.⁷

Theorem 1.2.12. Suppose that $f = f_1 f_2$ where $f_1, f_2$ are characteristic functions
on $\mathbb{R}$ and let $\mu_1, \mu_2$ be the corresponding distributions. If $f$ is $2n$-times differentiable,
then so is $f_1$ and
$$|f_1^{(k)}(0)| \le 2\bigl(A_k^{1/k} + |m_2|\bigr)^k, \qquad 1 \le k \le 2n \tag{1}$$
where $m_2$ is an arbitrary median⁸ of $\mu_2$ and $A_k$ is the absolute moment of order $k$ of
$\mu := \mu_1 * \mu_2$. Analogous statements hold for $f_2$.

Proof. Note first that $f$ is the characteristic function of $\mu$. Since $f$ is $2n$-times differentiable,
the moments $A_k$, $k = 1, \ldots, 2n$, exist by Theorem 1.2.9. For every $a \in \mathbb{R}$
we have
$$\infty > A_k(a) := \int_{\mathbb{R}} |x - a|^k\,d(\mu_1 * \mu_2)(x) = \int_{\mathbb{R}} \int_{\mathbb{R}} |t + s - a|^k\,d\mu_1(t)\,d\mu_2(s).$$
If $t$ and $s - a$ have the same sign, then $|t + s - a|^k \ge |t|^k$. Using this inequality and
the fact that the integrand above is zero if $s = a$ and $t = 0$, we obtain
$$\begin{aligned}
A_k(a) &\ge \int_{[a,\infty)} \int_{[0,\infty)} |t|^k\,d\mu_1(t)\,d\mu_2(s) + \int_{(-\infty,a]} \int_{(-\infty,0)} |t|^k\,d\mu_1(t)\,d\mu_2(s) \\
&= \mu_2\bigl([a, \infty)\bigr) \int_{[0,\infty)} |t|^k\,d\mu_1(t) + \mu_2\bigl((-\infty, a]\bigr) \int_{(-\infty,0)} |t|^k\,d\mu_1(t).
\end{aligned}$$
These inequalities show that if $a = m_2$ is a median of $\mu_2$, then
$$\int_{-\infty}^{\infty} |t|^k\,d\mu_1(t) \le 2A_k(m_2), \qquad k = 1, \ldots, 2n. \tag{2}$$
By virtue of Theorem 1.2.1, the function $f_1$ is $2n$-times differentiable. Moreover,
by (2) and (1.2.1.iii), we have
$$|f_1^{(k)}(0)| \le 2A_k(m_2). \tag{3}$$
On the other hand,
$$A_k(m_2) = \int_{-\infty}^{\infty} |x - m_2|^k\,d\mu(x) \le \int_{-\infty}^{\infty} \sum_{j=0}^{k} \binom{k}{j} |x|^j\,|m_2|^{k-j}\,d\mu(x) = \sum_{j=0}^{k} \binom{k}{j} A_j \cdot |m_2|^{k-j}.$$
The last sum is, in view of inequality (F.1.5), majorized by
$$\sum_{j=0}^{k} \binom{k}{j} \bigl(A_k^{1/k}\bigr)^j\,|m_2|^{k-j} = \bigl(A_k^{1/k} + |m_2|\bigr)^k$$
which, together with (3), proves the inequality (1).
Interchanging the roles of $f_1$ and $f_2$, we obtain the last statement of the theorem.

The next theorem generalizes the differentiability statement of Theorem 1.2.12.⁹

6 Further results of this type can be found in [33].
7 See also Theorem 3.3.9.
8 This means that $\mu_2((-\infty, m_2]) \ge \frac{1}{2}$ and $\mu_2([m_2, \infty)) \ge \frac{1}{2}$.

Theorem 1.2.13. Let $\delta > 0$ and $f$ be a $2n$-times differentiable function on the interval
$(-\delta, \delta)$ such that $f(-t) = \overline{f(t)}$, $|t| < \delta$. Further, let $f_1, \dots, f_r$ be characteristic
functions on $\mathbb{R}$ and $\alpha_1, \dots, \alpha_r$ be positive real numbers such that
\[
  \prod_{j=1}^{r} |f_j(t)|^{\alpha_j} = |f(t)|, \qquad -\delta < t < \delta.
\]
Then the $f_j$'s are $2n$-times differentiable on $\mathbb{R}$.


Proof. Let
\[
  g_j(t) = f_j(t) f_j(-t) = |f_j(t)|^2, \qquad t \in \mathbb{R}
\]
and $g(t) = f(t) f(-t) = |f(t)|^2$, $|t| < \delta$. The $g_j$'s are nonnegative characteristic
functions; the corresponding distributions will be denoted by $\mu_j$. The function $g$ is
even, nonnegative and $2n$-times differentiable. Moreover,
\[
  \prod_{j=1}^{r} g_j^{\alpha_j}(t) = g(t), \qquad -\delta < t < \delta. \tag{1}
\]

9 See also Theorem 3.3.14.



We prove by induction on $k$ that $g_j$ is $2k$-times differentiable for all $k = 1, \dots, n$.
The statement for $f_j$ follows then from Theorem 1.2.12.

Assume first that $k = 1$. Since $g_j(t) \le 1$ we have $g(t) \le g_j^{\alpha_j}(t)$. Using this and
the mean value theorem for the function $x \mapsto x^{\alpha_j}$ we see that
\[
  \frac{1 - g(t)}{t^2} \ge \frac{1 - g_j^{\alpha_j}(t)}{t^2} = \frac{1 - g_j(t)}{t^2} \cdot \alpha_j \cdot s(t)^{\alpha_j - 1}
\]
where $g_j(t) \le s(t) \le 1$. As $g$ is even, $g'(0) = 0$ and hence the left side tends to
$-\tfrac12 g''(0)$ if $t$ tends to zero. On the other hand, $s(t)$ tends to 1 and therefore
\[
  \frac{1 - g_j(t)}{t^2} \le C, \qquad t \ne 0
\]
for some constant $C$. Lemma 1.2.8 shows that $g_j$ is twice differentiable.
Suppose now that $g_j$ is $2k$-times differentiable for some $k < n$. Then
\[
  g_j^{(2l-1)}(0) = 0, \qquad l = 1, \dots, k;\ j = 1, \dots, r.
\]
We choose $\delta_0 \in (0, \delta]$ such that $g(t) \ne 0$ if $|t| < \delta_0$. Then we have
\[
  \sum_{j=1}^{r} \alpha_j \log \frac{1}{g_j(t)} = -h(t), \qquad -\delta_0 < t < \delta_0 \tag{2}
\]
where $h = \log g$. This equation and Faà di Bruno's formula (B.1.14) with $m = 2k$,
$g = -\log$ and $f = g_j$ show that for $t \in (-\delta_0, \delta_0)$ we have
\[
  -h^{(2k)}(t) = \sum_{j=1}^{r} \alpha_j \sum_{b} c(b) \left( \frac{g_j'(t)}{g_j(t)} \right)^{b_1} \left( \frac{g_j''(t)}{g_j(t)} \right)^{b_2} \cdots \left( \frac{g_j^{(2k)}(t)}{g_j(t)} \right)^{b_{2k}} \tag{3}
\]
where the last sum is over all $b \in \mathbb{N}_0^{2k}$ satisfying $b_1 + 2 b_2 + \cdots + 2k\, b_{2k} = 2k$ and
\[
  c(b) = \frac{(2k)!\, (-1)^{|b|} \cdot (|b| - 1)!}{b_1!\, b_2! \cdots b_{2k}!\, (1!)^{b_1} (2!)^{b_2} \cdots ((2k)!)^{b_{2k}}}.
\]
Setting $t = 0$ in (3), subtracting the resulting expression from (3) and using that
$c(b) = -1$ if $b = (0, 0, \dots, 1)$, we obtain
\[
  \sum_{j=1}^{r} \alpha_j \left( \frac{g_j^{(2k)}(t)}{g_j(t)} - g_j^{(2k)}(0) \right)
  = h^{(2k)}(t) - h^{(2k)}(0) + \sum_{j=1}^{r} \alpha_j \sum_{b} c(b) \cdot \bigl[ S_{j,b}(t) - S_{j,b}(0) \bigr] \tag{4}
\]
where the last sum is over all $b \in \mathbb{N}_0^{2k}$ satisfying $b_{2k} = 0$,
$b_1 + 2 b_2 + \cdots + (2k-1)\, b_{2k-1} = 2k$ and
\[
  S_{j,b}(t) = \left( \frac{g_j'(t)}{g_j(t)} \right)^{b_1} \left( \frac{g_j''(t)}{g_j(t)} \right)^{b_2} \cdots \left( \frac{g_j^{(2k-1)}(t)}{g_j(t)} \right)^{b_{2k-1}}.
\]
Using that $g_j$ is $2k$-times continuously differentiable and $g_j(0) = 1$, $g_j^{(2l-1)}(0) = 0$,
$1 \le l \le k$, we obtain
\[
  \frac{1}{g_j(t)} = 1 + O(t^2), \qquad t \to 0 \tag{5}
\]
and
\[
  g_j^{(2l-1)}(t) = O(t), \qquad g_j^{(2l-2)}(t) = g_j^{(2l-2)}(0) + O(t^2), \qquad t \to 0.
\]
These relations show that $S_{j,b}(t) - S_{j,b}(0) = O(t^2)$ if $b \ne (0, 0, \dots, 1)$. Since $h$ is
$2k$-times differentiable we conclude that the left-hand side of (4) is $O(t^2)$. Applying
this and (5) we get
\[
  \sum_{j=1}^{r} \alpha_j \bigl( g_j^{(2k)}(t) - g_j^{(2k)}(0) \bigr) = O(t^2).
\]
Since $g_j^{(2k)} / g_j^{(2k)}(0)$ is a characteristic function (cf. Corollary 1.2.3), inequality
(1.1.2.ii) shows that the terms in this sum have the same sign and hence they are $O(t^2)$.
By Lemma 1.2.8 the function $g_j^{(2k)}$ is twice differentiable, completing the induction
on $k$.

Remark 1.2.14. If all the numbers $\alpha_k$ are integers or rational numbers, then
Theorem 1.2.13 follows easily from Theorem 1.2.12 using the fact that $f_j^{\alpha}$ is a
characteristic function for all $\alpha \in \mathbb{N}$. If $g$ is a nonnegative characteristic function and
$\alpha$ is not an integer, then $g^{\alpha}$ is not necessarily a characteristic function. To see a
simple example, let $g(t) = \cos^2 t$, $t \in \mathbb{R}$. Then $g^{1/2}(t) = |\cos t|$ and $g^{1/2}$ is not a
characteristic function since it is infinitely differentiable on $(-\pi/2, \pi/2)$ but it is not
differentiable at $\pi/2$ (cf. Theorem 1.2.9).

Remark 1.2.15. The set of all points where a given characteristic function is
differentiable can be quite complicated. To see examples, let $b \ge 2$ be an integer and
$\{\varepsilon_n\}_1^{\infty}$ be a sequence of nonnegative numbers tending to zero. Denote by $D$ the set of
all $t \in \mathbb{R}$ where the characteristic function $g$ defined by
\[
  g(t) = C \sum_{n=1}^{\infty} \varepsilon_n b^{-n} \cos b^n t, \qquad t \in \mathbb{R}
\]
is differentiable (the constant $C$ is determined by the condition $g(0) = 1$). Then $g$ is
differentiable at 0. Moreover:

(i) For every open interval $I \subset \mathbb{R}$ the set $D \cap I$ has the same cardinality as $\mathbb{R}$.
(ii) If $\sum_1^{\infty} \varepsilon_n < \infty$ then $D = \mathbb{R}$ and $g'$ is continuous.
(iii) If $\sum_1^{\infty} \varepsilon_n^2 = \infty$ then $\lambda(D) = 0$.

This result is due to A. Zygmund; a proof can be found in Section 1.10 of [6].

1.3 Inversion theorems


The main result of this section is Theorem 1.3.3, which states that distributions are
uniquely determined by their characteristic functions. This result allows us to work
with functions instead of measures, which is very useful for example in the study of
limit theorems. At the end of this section we present some simple applications of the
uniqueness theorem.

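Theorem 1.3.1. Let $f$ be the characteristic function of a $d$-dimensional distribution
$\mu$ and let $a, b \in \mathbb{R}^d$ be such that $a_j < b_j$, $j = 1, \dots, d$. Then^10
\[
  \lim_{T \to \infty} \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{-i a_j t_j} - e^{-i b_j t_j}}{i t_j}\, f(t)\, dt = \int_{\mathbb{R}^d} \prod_{j=1}^{d} \chi_j(x)\, d\mu(x)
\]
where
\[
  \chi_j(x) = \begin{cases} 1 & \text{if } x_j \in (a_j, b_j) \\ \tfrac12 & \text{if } x_j = a_j \text{ or } x_j = b_j \\ 0 & \text{if } x_j \notin [a_j, b_j]. \end{cases}
\]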
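Proof. For $T > 0$ write
\[
  I(T) := \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{-i a_j t_j} - e^{-i b_j t_j}}{i t_j}\, f(t)\, dt
\]
\[
  = \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{-i a_j t_j} - e^{-i b_j t_j}}{i t_j} \int_{\mathbb{R}^d} e^{i(x,t)}\, d\mu(x)\, dt.
\]
In view of Theorem B.1.5,
\[
  \left| \frac{e^{-i a_j t_j} - e^{-i b_j t_j}}{i t_j}\, e^{i(x,t)} \right| = \left| \frac{1 - e^{-i (b_j - a_j) t_j}}{t_j} \right| \le b_j - a_j,
\]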

10 The value of $(e^{-i a_j t_j} - e^{-i b_j t_j}) / (i t_j)$ at $t_j = 0$ is defined by continuity. It is equal to $b_j - a_j$.



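and hence we may apply Fubini's theorem:
\[
  I(T) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{i (x_j - a_j) t_j} - e^{i (x_j - b_j) t_j}}{i t_j}\, dt\, d\mu(x)
\]
\[
  = \int_{\mathbb{R}^d} \prod_{j=1}^{d} \frac{1}{2\pi} \int_{-T}^{T} \left[ \frac{\sin (x_j - a_j) t_j}{t_j} - \frac{\sin (x_j - b_j) t_j}{t_j} \right] dt_j\, d\mu(x)
  =: \int_{\mathbb{R}^d} \prod_{j=1}^{d} K_j(T, x)\, d\mu(x).
\]
It follows from Corollary B.1.8 that $\lim_{T \to \infty} K_j(T, x) = \chi_j(x)$. By Theorem B.1.9,
we have $|K_j(T, x)| < 2$. Hence, the statement of the theorem follows by dominated
convergence.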
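
The limit in Theorem 1.3.1 can be checked numerically in a simple case. The following sketch is only an illustration under stated assumptions: it takes $d = 1$, the standard Gaussian law with characteristic function $e^{-t^2/2}$, a fixed cutoff $T = 40$ in place of the limit, and a midpoint rule; the function names are hypothetical and not part of the text.

```python
import cmath
import math


def inversion_integral(f, a, b, T, n=20_000):
    """Midpoint approximation of (1/(2*pi)) * int_{-T}^{T} K(t) f(t) dt,
    where K(t) = (exp(-i a t) - exp(-i b t)) / (i t)."""
    h = 2.0 * T / n  # n is even, so no midpoint equals the removable point t = 0
    total = 0j
    for k in range(n):
        t = -T + (k + 0.5) * h
        kernel = (cmath.exp(-1j * a * t) - cmath.exp(-1j * b * t)) / (1j * t)
        total += kernel * f(t)
    return (total * h / (2.0 * math.pi)).real


def gauss_cf(t):
    """Characteristic function exp(-t^2/2) of the standard Gaussian law."""
    return math.exp(-0.5 * t * t)


def std_normal_cdf(x):
    """Distribution function of the standard Gaussian law via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


a, b = -1.0, 0.5
approx = inversion_integral(gauss_cf, a, b, T=40.0)   # truncated inversion integral
exact = std_normal_cdf(b) - std_normal_cdf(a)         # mu((a, b)); endpoints carry no mass
print(approx, exact)
```

Since the Gaussian law has no atoms, the factor $\tfrac12$ at the endpoints plays no role here and the integral should agree with $\mu((a, b))$ up to discretization error.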

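Corollary 1.3.2. If the $\mu$-measure of the boundary of the set
\[
  J = (a_1, b_1) \times \cdots \times (a_d, b_d)
\]
is equal to zero, then
\[
  \lim_{T \to \infty} \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{-i a_j t_j} - e^{-i b_j t_j}}{i t_j}\, f(t)\, dt = \mu(J).
\]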

From this corollary we immediately obtain the following theorem.

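Theorem 1.3.3 (Uniqueness Theorem). Every distribution $\mu$ is uniquely determined
by its characteristic function $f$.

Proof. It suffices to show that the $\mu$-measure of sets of the form
\[
  J = (a_1, b_1) \times \cdots \times (a_d, b_d)
\]
is uniquely determined by $f$. If the boundary $\partial J$ of $J$ has zero $\mu$-measure, then we
can apply the previous corollary. Otherwise we consider the sets
\[
  J_\varepsilon = (a_1 + \varepsilon, b_1 - \varepsilon) \times \cdots \times (a_d + \varepsilon, b_d - \varepsilon), \qquad \varepsilon > 0.
\]
If $\varepsilon_1 \ne \varepsilon_2$ then $\partial J_{\varepsilon_1} \cap \partial J_{\varepsilon_2} = \emptyset$. Since $\mu$ is a finite measure the set
\[
  \{ \varepsilon > 0 : \mu(\partial J_\varepsilon) \ne 0 \}
\]
is denumerable. Consequently, there exists a sequence $\{\varepsilon_n\}_1^{\infty}$ such that
$\mu(\partial J_{\varepsilon_n}) = 0$ and
\[
  J = \bigcup_{n=1}^{\infty} J_{\varepsilon_n},
\]
from which the theorem follows.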

The next lemma deals with a special case of Gaussian distributions which will be
investigated in more detail in Sections 1.10 and 3.5.

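Lemma 1.3.4. The function $\varphi_\sigma$, $\sigma > 0$, given by
\[
  \varphi_\sigma(x) = \frac{1}{(\sqrt{2\pi}\, \sigma)^d} \cdot e^{-\frac{\|x\|^2}{2\sigma^2}}, \qquad x \in \mathbb{R}^d
\]
is a probability density^11 with characteristic function
\[
  g_\sigma(t) = e^{-\frac{\sigma^2 \|t\|^2}{2}}, \qquad t \in \mathbb{R}^d.
\]
Moreover,
\[
  \varphi_\sigma = \frac{1}{(\sqrt{2\pi}\, \sigma)^d} \cdot g_{1/\sigma}. \tag{1}
\]
Proof. We consider first the case $d = 1$. By Corollary B.1.11, the function $\varphi_\sigma$ is a
probability density. The corresponding characteristic function is
\[
  g_\sigma(t) = \frac{1}{\sqrt{2\pi}\, \sigma} \int_{-\infty}^{\infty} e^{-\frac{y^2}{2\sigma^2}} \cdot e^{ity}\, dy.
\]
Differentiating (cf. Theorem 1.2.1) and integrating by parts we get
\[
  g_\sigma'(t) = \frac{i}{\sqrt{2\pi}\, \sigma} \int_{-\infty}^{\infty} y\, e^{-\frac{y^2}{2\sigma^2}} \cdot e^{ity}\, dy
  = -\frac{\sigma^2 t}{\sqrt{2\pi}\, \sigma} \int_{-\infty}^{\infty} e^{-\frac{y^2}{2\sigma^2}} \cdot e^{ity}\, dy = -\sigma^2 t\, g_\sigma(t).
\]
Every solution of this differential equation has the form $t \mapsto C e^{-\frac{\sigma^2 t^2}{2}}$. Since
$g_\sigma(0) = 1$ we must have $C = 1$.

The case $d > 1$ is obtained now from the equation
\[
  \varphi_\sigma(x) = \prod_{j=1}^{d} \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-\frac{x_j^2}{2\sigma^2}}, \qquad x \in \mathbb{R}^d.
\]
The last statement is trivial.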
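
As a quick sanity check of Lemma 1.3.4, the sketch below compares a numerically computed characteristic function with the closed form in dimension $d = 1$ and also evaluates relation (1) at one point. The parameter values $\sigma = 1.5$, $t = 0.7$, $x = 0.4$, the integration window and the helper names are arbitrary choices made for illustration only.

```python
import math


def phi(x, sigma):
    """Gaussian density phi_sigma in dimension d = 1."""
    return math.exp(-x * x / (2.0 * sigma**2)) / (math.sqrt(2.0 * math.pi) * sigma)


def g(t, sigma):
    """Closed form g_sigma(t) = exp(-sigma^2 t^2 / 2) from the lemma."""
    return math.exp(-(sigma**2) * t * t / 2.0)


def cf_numeric(t, sigma, n=20_000):
    """Midpoint rule for int phi_sigma(y) cos(t y) dy over [-12 sigma, 12 sigma].
    The sine part of exp(i t y) integrates to zero because phi_sigma is even."""
    half = 12.0 * sigma
    h = 2.0 * half / n
    total = 0.0
    for k in range(n):
        y = -half + (k + 0.5) * h
        total += phi(y, sigma) * math.cos(t * y)
    return total * h


sigma, t, x = 1.5, 0.7, 0.4
lhs = cf_numeric(t, sigma)   # numerical characteristic function
rhs = g(t, sigma)            # closed form
# Relation (1) with d = 1: phi_sigma = g_{1/sigma} / (sqrt(2 pi) sigma)
relation_gap = abs(phi(x, sigma) - g(x, 1.0 / sigma) / (math.sqrt(2.0 * math.pi) * sigma))
print(lhs, rhs, relation_gap)
```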
The next lemma is a special form of Parseval’s identity (see also equation (i) in
Plancherel’s Theorem 1.8.9).

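Lemma 1.3.5. Let $\mu$ and $\nu$ be distributions on $\mathbb{R}^d$ with characteristic functions $f_\mu$
and $f_\nu$, respectively. Then
\[
  \int_{\mathbb{R}^d} e^{i(t,y)} f_\mu(t)\, d\nu(t) = \int_{\mathbb{R}^d} f_\nu(x + y)\, d\mu(x)
\]
holds for all $y \in \mathbb{R}^d$.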


11 The distribution corresponding to the case $\sigma = 1$ is called standard Gaussian (see also Figure B.2).

Proof. By definition of $f_\mu$,
\[
  e^{i(t,y)} f_\mu(t) = \int_{\mathbb{R}^d} e^{i(t, x + y)}\, d\mu(x).
\]
Integrating both sides with respect to $\nu$ and using Fubini's theorem we obtain the
desired equation.

Throughout the rest of this section $f$ denotes the characteristic function of a
probability distribution $\mu$ on $\mathbb{R}^d$.
The most frequently applied inversion formula is presented in the next theorem.

Theorem 1.3.6. If $f \in L^1(\mathbb{R}^d)$ then $\mu$ is absolutely continuous and the corresponding
density $p$ is given by
\[
  p(y) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} f(t) \cdot e^{-i(t,y)}\, dt, \qquad y \in \mathbb{R}^d.
\]
Moreover, $p$ is bounded and uniformly continuous.

Proof. Since $f$ is integrable it can be written as a (complex) linear combination of
four nonnegative integrable functions. Thus, the left-hand side of the equation above
represents a linear combination of four characteristic functions. As characteristic
functions are bounded and uniformly continuous, it remains to prove the first part of the
theorem.

Let $X$ and $Y$ be independent $d$-dimensional random vectors such that $\mu$ is the
distribution of $X$ and $Y$ is standard Gaussian (cf. Lemma 1.3.4). For any $\sigma > 0$ the
density and the characteristic function of $\sigma Y$ are given by the functions $\varphi_\sigma$ and $g_\sigma$ of
Lemma 1.3.4, respectively. Applying equation (F.1.2) and Lemma 1.3.4 we see that
the random vector $S_n = X + n^{-1} Y$, $n \in \mathbb{N}$, has a density $p_n$ given by
\[
  p_n(y) = \int_{\mathbb{R}^d} \varphi_{1/n}(y - x)\, d\mu(x) = \left( \frac{n}{\sqrt{2\pi}} \right)^{d} \int_{\mathbb{R}^d} g_n(y - x)\, d\mu(x), \qquad y \in \mathbb{R}^d.
\]
Lemma 1.3.5 shows that the integral on the right-hand side is equal to
\[
  \int_{\mathbb{R}^d} e^{-i(x,y)} f(x)\, \varphi_n(x)\, dx
\]
and therefore
\[
  p_n(y) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} e^{-i(t,y)} f(t)\, e^{-\frac{\|t\|^2}{2 n^2}}\, dt. \tag{1}
\]
The sequence $\{S_n\}_1^{\infty}$ converges pointwise to $X$. Therefore, the corresponding
sequence of distributions converges weakly to $\mu$. By Theorem E.1.7,
\[
  \lim_{n \to \infty} \int_{\mathbb{R}^d} h(y)\, p_n(y)\, dy = \int_{\mathbb{R}^d} h(y)\, d\mu(y) \tag{2}
\]
for every bounded continuous function $h$ on $\mathbb{R}^d$. Since the modulus of the integrand
on the right-hand side of (1) is bounded by $|f|$, Lebesgue's dominated convergence
theorem can be applied to show that $p_n(y)$ tends to
\[
  p(y) := \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} f(t) \cdot e^{-i(t,y)}\, dt, \qquad y \in \mathbb{R}^d
\]
for all $y \in \mathbb{R}^d$. From (1) we also see that the sequence $\{p_n\}_{n=1}^{\infty}$ is uniformly bounded.
Applying Lebesgue's theorem once more and using (2) we obtain
\[
  \int_{\mathbb{R}^d} h(y)\, p(y)\, dy = \int_{\mathbb{R}^d} \lim_{n \to \infty} h(y)\, p_n(y)\, dy
  = \lim_{n \to \infty} \int_{\mathbb{R}^d} h(y)\, p_n(y)\, dy = \int_{\mathbb{R}^d} h(y)\, d\mu(y).
\]
From this we conclude that $\mu$ is absolutely continuous with density $p$.

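Example 1.3.7. Let
\[
  f(t) = \frac{1}{1 + \alpha^2 t^2}, \qquad t \in \mathbb{R}
\]
be the characteristic function of the Laplace distribution with density
\[
  p(y) = \frac{1}{2\alpha} \cdot e^{-\frac{|y|}{\alpha}}, \qquad y \in \mathbb{R}
\]
(see (1.1.13.b)). By the previous theorem,
\[
  \frac{1}{2\alpha} \cdot e^{-\frac{|y|}{\alpha}} = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{-ity}}{1 + \alpha^2 t^2}\, dt.
\]
Replacing $y$ by $-y$, this equation shows that $y \mapsto e^{-\frac{|y|}{\alpha}}$ is the characteristic function
of the Cauchy distribution with density $t \mapsto \frac{1}{\pi} \cdot \frac{\alpha}{1 + \alpha^2 t^2}$.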
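
The inversion formula of Theorem 1.3.6 applied to the Laplace example can be imitated numerically. The sketch below is a rough illustration, not a rigorous computation: it truncates the integral at an assumed cutoff $T = 2000$ (the integrand decays only like $t^{-2}$, so the truncation error is of order $1/(\pi \alpha^2 T)$), uses a midpoint rule, and picks $\alpha = 1.5$ arbitrarily. All names are hypothetical.

```python
import math


def laplace_cf(t, alpha):
    """Characteristic function 1 / (1 + alpha^2 t^2) from Example 1.3.7."""
    return 1.0 / (1.0 + alpha**2 * t**2)


def density_by_inversion(y, alpha, T=2000.0, n=200_000):
    """Truncated inversion integral (1/(2 pi)) int_{-T}^{T} f(t) exp(-i t y) dt.
    Because f is even and real, this equals (1/pi) int_0^T f(t) cos(t y) dt."""
    h = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += laplace_cf(t, alpha) * math.cos(t * y)
    return total * h / math.pi


def laplace_density(y, alpha):
    """Closed-form density exp(-|y|/alpha) / (2 alpha)."""
    return math.exp(-abs(y) / alpha) / (2.0 * alpha)


alpha = 1.5
points = (0.0, 1.0, -2.5)
gaps = [abs(density_by_inversion(y, alpha) - laplace_density(y, alpha)) for y in points]
print(gaps)
```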

Theorem 1.3.8. For every $a \in \mathbb{R}^d$ we have
\[
  \lim_{T \to \infty} \frac{1}{(2T)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} e^{-i(a,t)} f(t)\, dt = \mu(\{a\}).
\]

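Proof. Applying Fubini's theorem we obtain
\[
  \frac{1}{(2T)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} e^{-i(a,t)} f(t)\, dt
  = \frac{1}{(2T)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} e^{-i(a,t)} \int_{\mathbb{R}^d} e^{i(t,x)}\, d\mu(x)\, dt
\]
\[
  = \int_{\mathbb{R}^d} \frac{1}{(2T)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} e^{i(t, x - a)}\, dt\, d\mu(x)
  = \int_{\mathbb{R}^d} \prod_{j=1}^{d} \left( \frac{1}{2T} \int_{-T}^{T} e^{i t_j (x_j - a_j)}\, dt_j \right) d\mu(x)
\]
\[
  =: \int_{\mathbb{R}^d} \left( \prod_{j=1}^{d} K_j(x_j, T) \right) d\mu(x)
\]
where $K_j(a_j, T) = 1$ and
\[
  \lim_{T \to \infty} K_j(x_j, T) = \lim_{T \to \infty} \frac{\sin T (x_j - a_j)}{T (x_j - a_j)} = 0, \qquad x_j \ne a_j.
\]
Hence, the assertion of the theorem follows by dominated convergence.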

Theorem 1.3.9. Assume that $\mu = \mu_d + \mu_c$ where $\mu_c$ is a continuous nonnegative
measure and
\[
  \mu_d = \sum_{n} p_n \delta_{x_n}
\]
is a nonnegative discrete measure. Then
\[
  \lim_{T \to \infty} \frac{1}{(2T)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} |f(t)|^2\, dt = \sum_{n} p_n^2.
\]
Consequently, $\mu$ is continuous if and only if the limit on the left-hand side is equal
to zero.

Proof. Let $X_1$ and $X_2$ be independent random vectors with characteristic function $f$.
Then $|f|^2$ is the characteristic function of $X = X_1 - X_2$ and
\[
  P(X = 0) = \sum_{n} P\bigl( [X_1 = x_n] \cap [X_2 = x_n] \bigr) = \sum_{n} p_n^2.
\]
Applying Theorem 1.3.8 with $a = 0$ and with $f$ replaced by $|f|^2$ we obtain the
assertion of the theorem.
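
Theorems 1.3.8 and 1.3.9 say that time averages of $e^{-i(a,t)} f(t)$ and of $|f(t)|^2$ recover the atoms of $\mu$. The following sketch illustrates this for $d = 1$ with a hypothetical two-point law (mass 0.3 at 0 and mass 0.7 at 1) and an assumed finite horizon $T = 500$; for finite $T$ the averages differ from the limits by terms of order $1/T$.

```python
import cmath

# Hypothetical discrete law used only for illustration: x_n -> p_n.
ATOMS = {0.0: 0.3, 1.0: 0.7}


def f(t):
    """Characteristic function sum_n p_n exp(i t x_n) of the discrete law."""
    return sum(p * cmath.exp(1j * t * x) for x, p in ATOMS.items())


def time_average(func, T, n=50_000):
    """Midpoint approximation of (1/(2T)) * int_{-T}^{T} func(t) dt."""
    h = 2.0 * T / n
    return sum(func(-T + (k + 0.5) * h) for k in range(n)) / n


T = 500.0
# Theorem 1.3.8 with a = 1 (an atom) and a = 0.5 (not an atom).
mass_at_1 = time_average(lambda t: cmath.exp(-1j * 1.0 * t) * f(t), T).real
mass_at_half = time_average(lambda t: cmath.exp(-1j * 0.5 * t) * f(t), T).real
# Theorem 1.3.9: the average of |f|^2 approaches sum p_n^2 = 0.09 + 0.49.
sum_of_squares = time_average(lambda t: abs(f(t)) ** 2, T)
print(mass_at_1, mass_at_half, sum_of_squares)
```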

We close this section with some simple applications of the uniqueness Theo-
rem 1.3.3.

Theorem 1.3.10. The random variables $X_1, \dots, X_d$ with characteristic functions
$f_1, \dots, f_d$ are independent if and only if
\[
  f_X(t) = \prod_{j=1}^{d} f_j(t_j), \qquad t \in \mathbb{R}^d \tag{1}
\]
where $f_X$ denotes the characteristic function of the random vector
$X = (X_1, \dots, X_d)$.

Proof. Denoting by $\mu_j$ the distribution of $X_j$, the random variables $X_j$ are
independent if and only if the distribution of $X$ is equal to the product measure
$\mu_1 \otimes \cdots \otimes \mu_d$. The statement of the theorem follows from the uniqueness
Theorem 1.3.3 and from the fact that the right-hand side of equation (1) is equal to the
characteristic function of the product measure $\mu_1 \otimes \cdots \otimes \mu_d$.

Theorem 1.3.11. A probability measure $\mu$ on $\mathbb{R}^d$ is uniquely determined by its values
assigned to the half spaces
\[
  H_{y,r} := \{ x \in \mathbb{R}^d : (y, x) \le r \}, \qquad y \in \mathbb{R}^d,\ r \in \mathbb{R}.
\]
Proof. Let $X$ be the identity mapping on $\mathbb{R}^d$ and consider $X$ as a random vector on
the probability space $(\Omega, \mathcal{A}, P) := (\mathbb{R}^d, \mathcal{B}_d, \mu)$. Then $\mu$ is the distribution of $X$ and
\[
  H_{y,r} = \{ \omega \in \Omega : (y, X(\omega)) \le r \}.
\]
Thus, the values $\mu(H_{y,r})$, $r \in \mathbb{R}$, determine the distribution (and hence the
characteristic function) of the random variable $\omega \mapsto (y, X(\omega))$ for all $y \in \mathbb{R}^d$. On the
other hand, the characteristic functions of these random variables determine the
characteristic function of $X$ (cf. Remark 1.1.10). This fact together with Theorem 1.3.3
proves the result.

The next result can be proved in the same way as the previous one. We omit the
proof.

Theorem 1.3.12. The distribution of a random vector X is uniquely determined by


the distributions of all orthogonal projections of X on lines going through the origin.

Recall that a probability distribution $\mu$ on $\mathbb{R}^d$ is called symmetric if $\mu = \check{\mu}$, i.e.,
$\mu(B) = \mu(-B)$ for all Borel subsets $B$ of $\mathbb{R}^d$. The distribution of a random vector $X$ is
symmetric if and only if $X$ and $-X$ have the same distribution.

Theorem 1.3.13. For a random vector $X$ the following conditions are equivalent.
(i) The distribution $\mu$ of $X$ is symmetric.
(ii) The characteristic function $f$ of $X$ is real-valued.
(iii) $f(t) = \int_{\mathbb{R}^d} \cos((t, x))\, d\mu(x)$, $t \in \mathbb{R}^d$.

Proof. If $\mu$ is symmetric then $X$ and $-X$ have the same distribution and hence
\[
  f(t) = E\bigl( e^{i(t,X)} \bigr) = E\bigl( e^{-i(t,X)} \bigr) = f(-t) = \overline{f(t)}, \qquad t \in \mathbb{R}^d.
\]
Thus, $f$ is real-valued.

Suppose that $f$ is real-valued. Then $f(t) = \overline{f(t)} = f(-t)$ for all $t \in \mathbb{R}^d$,
hence $\mu$ and $\check{\mu}$ have the same characteristic function. Therefore $\mu = \check{\mu}$, by
Theorem 1.3.3.

The equivalence of (ii) and (iii) follows from
\[
  f(t) = E\,[\cos (t, X)] + i\, E\,[\sin (t, X)].
\]

1.4 Basic properties of positive definite functions


An essential property of characteristic functions and covariance functions is that they
are positive definite. In the present section we investigate positive definiteness on its
own. In the main part of the book we deal with positive definite functions on the group
Rd , but occasionally we need also the group Zd . For this reason it is practical to treat
the basic properties of these functions in a more general setting. Throughout this sec-
tion the symbol G denotes an arbitrary commutative group. We use additive notation
for the group operation.12
We have already shown in Theorem 1.1.2 that every characteristic function is posi-
tive definite in the following sense.

Definition 1.4.1. A complex-valued function $f$ on $G$ is said to be positive definite if
the inequality
\[
  \sum_{i,j=1}^{n} f(t_i - t_j)\, c_i \overline{c_j} \ge 0
\]
holds for every positive integer $n$, for all $t_1, \dots, t_n \in G$, and for all $c_1, \dots, c_n \in \mathbb{C}$.

We denote by $P(G)$ the set of all positive definite functions on $G$ while $P^c(\mathbb{R}^d)$ and
$P^m(\mathbb{R}^d)$ denote the set of all continuous and Lebesgue measurable positive definite
functions on $\mathbb{R}^d$, respectively.

Remark 1.4.2. By the definition above, $f$ is positive definite if and only if the matrix
\[
  A = \bigl( f(t_i - t_j) \bigr)_{i,j=1}^{n}
\]
is positive semidefinite^13 for an arbitrary choice of $n \in \mathbb{N}$ and $t_1, \dots, t_n \in G$.
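
The quadratic form of Definition 1.4.1 is easy to evaluate directly. The sketch below is an illustration under stated assumptions: it tests the form for the Gaussian characteristic function $e^{-t^2/2}$ at randomly chosen points and coefficients (finding no negative value is of course no proof), and it exhibits one explicit choice of points and coefficients for which $|\cos t|$, the function from Remark 1.2.14, gives a negative value. The point sets and helper names are hypothetical.

```python
import math
import random


def quad_form(f, ts, cs):
    """sum_{i,j} f(t_i - t_j) c_i conj(c_j), the sum in Definition 1.4.1."""
    return sum(
        f(ti - tj) * ci * cj.conjugate()
        for ti, ci in zip(ts, cs)
        for tj, cj in zip(ts, cs)
    )


def gauss(t):
    """A characteristic function, hence positive definite."""
    return math.exp(-0.5 * t * t)


def abs_cos(t):
    """|cos t|, which is not a characteristic function (Remark 1.2.14)."""
    return abs(math.cos(t))


random.seed(0)
ts = [random.uniform(-3.0, 3.0) for _ in range(12)]
worst = min(
    quad_form(gauss, ts, [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in ts]).real
    for _ in range(200)
)

# Points k*pi/4 with alternating coefficients +1, -1 give 8 * (2 - 2*sqrt(2)) < 0.
ts8 = [k * math.pi / 4.0 for k in range(8)]
cs8 = [complex((-1) ** k) for k in range(8)]
q_bad = quad_form(abs_cos, ts8, cs8).real
print(worst, q_bad)
```

The negative value shows that the matrix $A$ built from $|\cos t|$ at these eight points is not positive semidefinite.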


The inequality in Definition 1.4.1 can be formulated in terms of finitely supported
complex measures as follows. Let $t_j, c_j$ be as in Definition 1.4.1 and define the
measure $\mu$ by
\[
  \mu = \sum_{j=1}^{n} c_j \delta_{t_j}.
\]
Then $\tilde{\mu} = \sum_{j=1}^{n} \overline{c_j}\, \delta_{-t_j}$ and
\[
  \mu * \tilde{\mu} * f(t) = \int_{G} f(t - x)\, d(\mu * \tilde{\mu})(x) = \sum_{i,j=1}^{n} f(t - t_i + t_j)\, c_i \overline{c_j}, \qquad t \in G.
\]
Setting here $t = 0$ and replacing $t_i$ by $-t_i$ we obtain the sum in Definition 1.4.1. We
have thus proved the following lemma.

12 Note that most of the results on positive definite functions presented in this book have their analogues
on locally compact groups (see for example [49]). The restriction of considering only commutative
groups without topology makes the treatment simpler. On the other hand, the results obtained are
sufficient for many applications in probability theory.
13 Section D.2 contains some basic results on positive semidefinite matrices.

Lemma 1.4.3. A complex-valued function $f$ on $G$ is positive definite if and only if
the inequality
\[
  \mu * \tilde{\mu} * f(0) \ge 0
\]
holds for all finitely supported complex measures $\mu$ on $G$.

Next we present two examples where we show the positive definiteness of a given
function by proving that the matrix $A$ in Remark 1.4.2 is positive semidefinite.

Example 1.4.4. A simple example of a positive definite function on $G$ is given by
the function $1_{\{0\}}$. In this case the matrix $A$ after Definition 1.4.1 is the $n \times n$ identity
matrix whenever the $t_i$'s are mutually different.

More generally, if $G_0$ is a subgroup of $G$, then $f = 1_{G_0}$ is positive definite. To see this
let $F$ be a non-void finite subset of $G$. The set $F$ can be written as $F = \bigcup_{k=1}^{n} F_k$
where the sets $F_k$ are contained in mutually different cosets of $G_0$. Let $n_k$ denote the
number of elements of $F_k$ and choose a numbering $t_1, t_2, \dots, t_m$ of the elements of $F$
such that $t_i \in F_k$ if and only if
\[
  \sum_{j=1}^{k-1} n_j < i \le \sum_{j=1}^{k} n_j
\]
where the sum on the left-hand side is interpreted as 0 if $k = 1$. Then
\[
  \bigl( f(t_i - t_j) \bigr)_{i,j=1}^{m} =
  \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_n \end{pmatrix}
\]
where all entries of $A_k$ are equal to 1. It is easy to check that this matrix is positive
semidefinite.
We will see in Section 1.7 that the ‘building stones’ of positive definite functions
are the characters defined below.

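Definition 1.4.5. A complex-valued function $\gamma \ne 0$ on $G$ is called a character of $G$
if $\gamma(x + y) = \gamma(x) \gamma(y)$ and $\gamma(-x) = \overline{\gamma(x)}$ for all $x, y \in G$.

It is easy to check that $\gamma(0) = 1$ and $|\gamma(x)| = 1$ for all $x \in G$ if $\gamma$ is a character
of $G$.

Example 1.4.6. Every character $\gamma$ of $G$ is positive definite. Indeed,
\[
  \sum_{i,j=1}^{n} \gamma(t_i - t_j)\, c_i \overline{c_j} = \sum_{i,j=1}^{n} \gamma(t_i) \overline{\gamma(t_j)}\, c_i \overline{c_j} = \Bigl| \sum_{i=1}^{n} c_i \gamma(t_i) \Bigr|^2 \ge 0.
\]

Lemma 1.4.7. Every continuous character $\gamma$ of $\mathbb{R}^d$ has the form
\[
  \gamma(t) = e^{i(x,t)}, \qquad t \in \mathbb{R}^d
\]
with some $x \in \mathbb{R}^d$. If $\gamma$ is a character of $\mathbb{Z}^d$ then there exists $z \in \mathbb{T}^d$ such that
\[
  \gamma(n) = z^n, \qquad n \in \mathbb{Z}^d.
\]

Proof. The first statement follows immediately from Theorem C.9.2 using the fact that
$|\gamma| = 1$.

To prove the second one let $e_1, \dots, e_d$ be the standard basis of $\mathbb{Z}^d$. If $\gamma$ is a character
of $\mathbb{Z}^d$ and $n = (n_1, \dots, n_d) \in \mathbb{Z}^d$ then
\[
  \gamma(n) = \gamma(n_1 e_1 + \cdots + n_d e_d) = \gamma(e_1)^{n_1} \cdots \gamma(e_d)^{n_d} = z^n
\]
where $z_i = \gamma(e_i) \in \mathbb{T}$.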

Next we prove several simple properties of positive definite functions. From Theo-
rem D.2.3 we obtain the following lemma:

Lemma 1.4.8. Let $f \in P(G)$. Then $f$ is Hermitian, i.e., $f(-x) = \overline{f(x)}$ holds for
all $x \in G$.

Theorem 1.4.9. Let $f_1, f_2 \in P(G)$. Then the functions $\overline{f_1}$, $\operatorname{Re} f_1$, $|f_1|^2$, and $f_1 \cdot f_2$
are positive definite. Moreover, $p_1 f_1 + p_2 f_2$ is positive definite for all $p_1, p_2 \ge 0$.

Proof. The positive definiteness of $\overline{f_1}$ and $p_1 f_1 + p_2 f_2$ follows at once from
Definition 1.4.1. We therefore have $\operatorname{Re} f_1 = \tfrac12 (f_1 + \overline{f_1}) \in P(G)$. The fact that $f_1 f_2$
is positive definite is an immediate consequence of Schur's theorem stating that the
pointwise product of nonnegative definite matrices is nonnegative definite (cf.
Theorem D.2.12). It follows that $|f_1|^2 = f_1 \overline{f_1} \in P(G)$.

The next theorem follows immediately from Definition 1.4.1.



Theorem 1.4.10. If a sequence (or net) of positive definite functions converges point-
wise to a function f then f is positive definite.
As a consequence of Theorems 1.4.9 and 1.4.10 we obtain the following corollary:

Corollary 1.4.11. The set P .G/ is a convex cone closed in the topology of pointwise
convergence.
Next we prove some frequently used inequalities.14

Theorem 1.4.12. Let $f \in P(G)$. Then
(i) $|f(x)| \le f(0)$ for all $x \in G$;
(ii) $|f(x) - f(y)|^2 \le 2 f(0) [f(0) - \operatorname{Re} f(x - y)]$ for all $x, y \in G$;
(iii) $|f(0) f(x + y) - f(x) f(y)|^2 \le [f(0)^2 - |f(x)|^2]\,[f(0)^2 - |f(y)|^2]$ for all
$x, y \in G$;
(iv) $\bigl| \sum_{i=1}^{n} a_i f(x_i) \bigr|^2 \le f(0) \sum_{i,j=1}^{n} f(x_i - x_j)\, a_i \overline{a_j}$ for an arbitrary choice of
$n \in \mathbb{N}$, $x_1, \dots, x_n \in G$, and $a_1, \dots, a_n \in \mathbb{C}$;
(v) $\bigl| \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i - y_j)\, a_i \overline{b_j} \bigr|^2 \le \bigl( \sum_{i,j=1}^{n} f(x_i - x_j)\, a_i \overline{a_j} \bigr) \bigl( \sum_{i,j=1}^{m} f(y_i - y_j)\, b_i \overline{b_j} \bigr)$
for all $n, m \in \mathbb{N}$, $x_1, \dots, x_n, y_1, \dots, y_m \in G$, and for all complex numbers
$a_1, \dots, a_n, b_1, \dots, b_m$.

Proof. We prove (v) first. Denote by $L$ the complex linear space of all functions of
the form $g = \sum_{1}^{n} a_i 1_{\{x_i\}}$, $a_i \in \mathbb{C}$, $x_i \in G$. For such $g$ and $h = \sum_{1}^{m} b_i 1_{\{y_i\}} \in L$ we
define the inner product $(g, h)$ by
\[
  (g, h) := \sum_{i=1}^{n} \sum_{j=1}^{m} f(x_i - y_j)\, a_i \overline{b_j}.
\]
It is easy to check that this definition is correct, i.e., $(g, h)$ does not depend on the
particular representations of $g$ and $h$. As $f \in P(G)$, this inner product is nonnegative
definite. Thus, the Cauchy–Schwarz inequality $|(g, h)|^2 \le (g, g) \cdot (h, h)$ holds, and
this is the same as (v).

Setting $m = 1$, $y_1 = 0$, and $b_1 = 1$ in (v), we get (iv).

The inequality (ii) follows readily from (iv) by setting $n = 2$, $x_1 = x$, $x_2 = y$,
$a_1 = 1$, $a_2 = -1$. Putting $n = 1$, $x_1 = x$, and $a_1 = 1$ in (iv), we get $|f(x)|^2 \le f(0)^2$
and, since $f(0) = (1_{\{0\}}, 1_{\{0\}}) \ge 0$, this gives $|f(x)| \le f(0)$.

14 See also Theorem 1.11.6.



To prove (iii) we first remark that if a matrix of the form
\[
  \begin{pmatrix} 1 & a & b \\ \overline{a} & 1 & c \\ \overline{b} & \overline{c} & 1 \end{pmatrix}
\]
is positive semidefinite, then its determinant is nonnegative (cf. Theorem D.2.9).
Computing this determinant we obtain
\[
  1 + a \overline{b} c + \overline{a} b \overline{c} \ge |a|^2 + |b|^2 + |c|^2
\]
or equivalently,
\[
  |c - \overline{a} b|^2 \le (1 - |a|^2)(1 - |b|^2). \tag{1}
\]
Assume first that $f(0) = 1$. Applying (1) to the (Hermitian) matrix
\[
  \bigl( f(x_i - x_j) \bigr)_{i,j=1}^{3} =
  \begin{pmatrix} 1 & f(-x) & f(y) \\ f(x) & 1 & f(x + y) \\ f(-y) & f(-x - y) & 1 \end{pmatrix}
\]
where $x_1 = 0$, $x_2 = x$, and $x_3 = -y$, we obtain (iii). If $f(0) = 0$, then $f = 0$ by
(i) and so (iii) is satisfied. The case $f(0) \ne 0$ can be reduced to the case $f(0) = 1$ by
considering the positive definite function $f / f(0)$.
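
Inequalities (i) to (iii) of Theorem 1.4.12 can be spot-checked numerically. The sketch below does this for the characteristic function of a hypothetical three-point law, which is positive definite; the atoms, the sampling range and the helper names are illustrative assumptions, and a random spot-check is no substitute for the proof above.

```python
import cmath
import random

# Hypothetical three-point law x_n -> p_n; its characteristic function is positive definite.
ATOMS = {0.0: 0.2, 1.0: 0.5, 2.5: 0.3}


def f(t):
    """Characteristic function of the discrete law above."""
    return sum(p * cmath.exp(1j * t * x) for x, p in ATOMS.items())


def slack_ii(x, y):
    """Right-hand side minus left-hand side of inequality (ii)."""
    f0 = f(0.0).real
    return 2.0 * f0 * (f0 - f(x - y).real) - abs(f(x) - f(y)) ** 2


def slack_iii(x, y):
    """Right-hand side minus left-hand side of inequality (iii)."""
    f0 = f(0.0).real
    rhs = (f0**2 - abs(f(x)) ** 2) * (f0**2 - abs(f(y)) ** 2)
    return rhs - abs(f0 * f(x + y) - f(x) * f(y)) ** 2


random.seed(1)
pairs = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(2000)]
min_slack_i = min(f(0.0).real - abs(f(x)) for x, _ in pairs)
min_slack_ii = min(slack_ii(x, y) for x, y in pairs)
min_slack_iii = min(slack_iii(x, y) for x, y in pairs)
print(min_slack_i, min_slack_ii, min_slack_iii)
```

All three minima should be nonnegative up to rounding error.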

Corollary 1.4.13. Let $f \in P(G)$ be such that $f(0) = 1$ and write
\[
  G_0 := \{ x \in G : |f(x)| = 1 \}.
\]
Then $G_0$ is a subgroup of $G$ and
\[
  f(x + y) = f(x) f(y)
\]
for all $x \in G_0$ and $y \in G$.

Proof. This follows immediately from inequality (1.4.12.iii) and Lemma 1.4.8.

Corollary 1.4.14. Let $f \in P(\mathbb{R})$ be such that $f(0) = 1$. If there exist $n \in \mathbb{N}$ and
$h > 0$ such that $f(h)^n = 1$ then $f$ is periodic with period $nh$.

Proof. The case $n = 1$ follows readily from Corollary 1.4.13. Using this, the general
case is obtained by noting that $|f(h)| = 1$ and therefore, by Corollary 1.4.13,
$f(nh) = f(h)^n = 1$.

The functions
\[
  t \mapsto e^{-\|t\|^2}, \qquad t \mapsto e^{i(t,x)}, \qquad t \mapsto \cos((t, x)), \qquad t, x \in \mathbb{R}^d
\]
provide examples for the subgroup $G_0$ in Corollary 1.4.13. Example 1.4.4
shows that for an arbitrary subgroup $G_0$ of $G$ there exists $f \in P(G)$ such that
$G_0 = \{ x \in G : |f(x)| = 1 \}$.

A simple corollary of inequality (1.4.12.iii) is the following:

Corollary 1.4.15. Let $f_n \in P(G)$ be such that $f_n(0) = 1$, $n \in \mathbb{N}$. Then the set of all
$x \in G$ with $f_n(x) \to 1$, $n \to \infty$, is a subgroup of $G$. In particular, if $G = \mathbb{R}^d$ and
$f_n \to 1$ on a neighborhood of 0 then $f_n \to 1$ on $\mathbb{R}^d$.

Next we list some simple results that can be used to construct positive definite func-
tions.

Lemma 1.4.16. Let $f_i \in P(G_i)$ where $G_i$, $i = 1, 2$, is a commutative group. Then
the function
\[
  (x_1, x_2) \mapsto f_1(x_1) f_2(x_2)
\]
is positive definite on the product group $G_1 \times G_2$.

Proof. The lemma follows immediately from the fact that the entrywise product of
positive semidefinite matrices is positive semidefinite (see Schur's Theorem D.2.12).

Lemma 1.4.17. Let $g$ be a given positive definite function on $G$ and let $a_n$, $n \in \mathbb{N}_0$,
be nonnegative numbers such that the series
\[
  \varphi(z) := \sum_{n=0}^{\infty} a_n z^n, \qquad z \in \mathbb{C}
\]
is convergent whenever $|z| \le g(0)$. Then the function $f(t) := \varphi(g(t))$, $t \in G$, is
positive definite.

Proof. By (1.4.12.i), we have $|g(t)| \le g(0)$, and hence the definition of $f$ is correct.
The positive definiteness of $f$ follows from Theorems 1.4.9 and 1.4.10.

For example, the functions $e^{g}$ and $1 / (2 g(0) - g)$ are positive definite if $g \in P(G)$
and $g(0) \ne 0$.

Lemma 1.4.18. Let $f \in P(G)$ and let
\[
  \mu = \sum_{j=1}^{n} c_j \delta_{t_j}, \qquad n \in \mathbb{N},\ c_j \in \mathbb{C},\ t_j \in G
\]
be an arbitrary complex measure with finite support. Then the function
\[
  g(t) = \mu * \tilde{\mu} * f(t) = \sum_{i,j=1}^{n} f(t - t_i + t_j)\, c_i \overline{c_j}, \qquad t \in G
\]
is positive definite. In particular, the function $f_c^{y}$ defined by
\[
  f_c^{y}(t) = (1 + |c|^2) f(t) + c f(t + y) + \overline{c} f(t - y), \qquad t \in G
\]
is positive definite for all $y \in G$ and $c \in \mathbb{C}$.

Proof. Let $\nu$ be an arbitrary finitely supported complex measure on $G$. The first
statement of the lemma follows from the equation
\[
  \nu * \tilde{\nu} * g(0) = (\nu * \mu) * (\nu * \mu)^{\sim} * f(0),
\]
using Lemma 1.4.3 and the positive definiteness of $f$. Setting $\mu = \delta_0 + c\, \delta_{-y}$, we
obtain the second statement.

It follows from Corollary 1.4.11 that the set
\[
  P_0(G) := \{ f \in P(G) : f(0) = 1 \}
\]
is convex. Using the previous lemma we are now able to identify the extreme points
(cf. D.4.1) of this set. First we prove a simple but useful statement.

Lemma 1.4.19. If $f$ is an extreme point of $P_0(G)$ and $f = f_1 + f_2$ where
$f_i \in P(G)$, then $f_i = f_i(0) \cdot f$, $i = 1, 2$.

Proof. If $f_1(0) = 0$, then $f_1 = 0$ (cf. (1.4.12.i)) and hence $f_2 = f$. The case
$f_2(0) = 0$ can be treated in the same way. If $f_1(0)$ and $f_2(0)$ are different from zero then
$f_i / f_i(0) \in P_0(G)$ and
\[
  f = f_1(0) \cdot \frac{f_1}{f_1(0)} + f_2(0) \cdot \frac{f_2}{f_2(0)}.
\]
Since $f$ is extremal and $1 = f(0) = f_1(0) + f_2(0)$ we conclude that $f_i / f_i(0)$ is
equal to $f$.

Theorem 1.4.20. A complex-valued function on $G$ is an extreme point of $P_0(G)$ if
and only if it is a character of $G$.

Proof. Let $\gamma \in P_0(G)$ be extremal. For every $y \in G$ and $c \in \mathbb{C}$, the function
\[
  \gamma_c^{y}(x) = (1 + |c|^2)\, \gamma(x) + c\, \gamma(x + y) + \overline{c}\, \gamma(x - y)
\]
is positive definite by Lemma 1.4.18. Moreover,
\[
  \gamma = \frac{1}{2 (1 + |c|^2)} \cdot \bigl( \gamma_c^{y} + \gamma_{-c}^{y} \bigr).
\]
Since $\gamma$ is extremal the previous lemma shows that $\gamma_c^{y}(x) = g(c, y)\, \gamma(x)$ with a certain
function $g$. Thus,
\[
  \gamma_c^{y}(x) - \gamma_{-c}^{y}(x) = 2 \bigl[ c\, \gamma(x + y) + \overline{c}\, \gamma(x - y) \bigr] = g_1(c, y)\, \gamma(x), \tag{1}
\]
where $g_1(c, y) = g(c, y) - g(-c, y)$. Setting $c = 1$ and $c = i$ in (1), we get
\[
  \gamma(x + y) = l(y)\, \gamma(x) \tag{2}
\]
with
\[
  l(y) = \frac{1}{4} \cdot \bigl[ g_1(1, y) - i\, g_1(i, y) \bigr].
\]
Setting $x = 0$ in (2) we obtain $\gamma(y) = l(y)$, and therefore
\[
  \gamma(x + y) = \gamma(x) \gamma(y), \qquad x, y \in G.
\]
Noting that positive definite functions are Hermitian we see that $\gamma$ is a character of $G$.

We now prove that every character $\gamma$ is an extreme point of $P_0(G)$. We have already
seen in Example 1.4.6 that $\gamma$ is positive definite. Assume that $\gamma = p \gamma_1 + (1 - p) \gamma_2$,
where $\gamma_1, \gamma_2 \in P_0(G)$ and $0 < p < 1$. It follows immediately from the relations
$|\gamma| = 1$, $|\gamma_1| \le 1$ and $|\gamma_2| \le 1$ that $\gamma = \gamma_1 = \gamma_2$, i.e., $\gamma$ is extremal.

1.5 Further properties of positive definite functions on Rd


We continue the investigation of positive definite functions, but now we make use of
the special structure of Rd .

Lemma 1.5.1. Let $f \in P^c(\mathbb{R}^d)$ be such that $|f(t)| = 1$ holds for all $t$ from a
neighborhood of $0 \in \mathbb{R}^d$. Then
\[
  f(t) = e^{i(x,t)}, \qquad t \in \mathbb{R}^d
\]
for some $x \in \mathbb{R}^d$.

Proof. If a subgroup of $\mathbb{R}^d$ contains a neighborhood of zero, then it is equal to $\mathbb{R}^d$.
The statement follows therefore from Corollary 1.4.13 and Theorem C.9.2.

Corollary 1.5.2. Let $f \in P(\mathbb{R}^d)$. If $\operatorname{Re} f$ is continuous at 0, then $f$ is uniformly
continuous on $\mathbb{R}^d$.

Proof. This is a consequence of (1.4.12.ii).



The next two results can be applied to construct positive definite functions.

Lemma 1.5.3. Let $f \in P^c(\mathbb{R}^d)$ and let $\mu$ be an arbitrary complex measure on $\mathcal{B}_d$.
Then the function
\[
  \mu * \tilde{\mu} * f(t) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(t - x - y)\, d\mu(x)\, d\tilde{\mu}(y), \qquad t \in \mathbb{R}^d
\]
is positive definite. In particular, the function
\[
  t \mapsto \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(t + x - y)\, h(x) \overline{h(y)}\, dx\, dy, \qquad t \in \mathbb{R}^d
\]
is positive definite for all $h \in L^1(\mathbb{R}^d)$.

Proof. By Lemma 1.4.18, the statement is true if $\mu$ has finite support. Let $\mu$ now be
arbitrary, and let $\{\mu_\alpha\}$ be a net of finitely supported complex measures converging
weakly to $\mu$ (cf. Theorem E.1.8). By Theorem E.2.5, the net $\{\mu_\alpha * \tilde{\mu}_\alpha\}$ converges
weakly to $\mu * \tilde{\mu}$, and hence $\{\mu_\alpha * \tilde{\mu}_\alpha * f\}$ converges pointwise to $\mu * \tilde{\mu} * f$. Thus,
the first assertion follows from Theorem 1.4.10. The second assertion follows from
the first one by setting $d\mu(x) = h(x)\, dx$.

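Lemma 1.5.4. Let $g \in L^2(\mathbb{R}^d)$. Then the function
\[
  f(t) := g * \tilde{g}(t) = \int_{\mathbb{R}^d} g(t + y) \overline{g(y)}\, dy, \qquad t \in \mathbb{R}^d
\]
belongs to $P^c(\mathbb{R}^d)$ and it vanishes at infinity.

Proof. For all $t_1, \dots, t_n \in \mathbb{R}^d$, and for all $c_1, \dots, c_n \in \mathbb{C}$ we have
\[
  \sum_{i,j=1}^{n} f(t_i - t_j)\, c_i \overline{c_j} = \int_{\mathbb{R}^d} \sum_{i,j=1}^{n} g(t_i - t_j + y) \overline{g(y)}\, c_i \overline{c_j}\, dy
\]
\[
  = \int_{\mathbb{R}^d} \sum_{i,j=1}^{n} g(t_i + y) \overline{g(y + t_j)}\, c_i \overline{c_j}\, dy
  = \int_{\mathbb{R}^d} \Bigl| \sum_{i=1}^{n} g(t_i + y)\, c_i \Bigr|^2 dy \ge 0,
\]
i.e., $f$ is positive definite. The remaining statements follow from Theorem E.2.4.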

Figure 1.6. The function $p_a$ from Remark 1.5.5 with $a = 2$.

Remark 1.5.5. Setting $d = 1$ and $g = \frac{1}{\sqrt{a}} \cdot 1_{[-a/2,\, a/2]}$, $a > 0$, in the previous
lemma we see that the function
\[
  \Delta_a(t) := \begin{cases} 1 - \frac{|t|}{a} & \text{if } |t| \le a \\ 0 & \text{if } |t| > a \end{cases}, \qquad t \in \mathbb{R}
\]
is positive definite. We show that the function
\[
  p_a(t) = \frac{a}{2\pi} \cdot \left( \frac{\sin(ta/2)}{ta/2} \right)^2, \qquad t \in \mathbb{R}
\]
is a density and the corresponding characteristic function is $\Delta_a$.

Indeed, let $X$ and $Y$ be independent random variables having $\frac{1}{a} \cdot 1_{[-a/2,\, a/2]}$ as their
density. The characteristic functions of $X$ and $Y$ are given by
\[
  f_X(t) = f_Y(t) = \frac{\sin(ta/2)}{ta/2}.
\]
In view of Theorem 1.1.3, the characteristic function of their sum is
\[
  f_{X+Y}(t) = f_X(t) \cdot f_Y(t) = \left( \frac{\sin(ta/2)}{ta/2} \right)^2 = \frac{2\pi}{a} \cdot p_a(t).
\]
Applying (F.1.2), a short computation shows that the density of $X + Y$ is equal to
$\frac{1}{a} \cdot \Delta_a$. From Theorem 1.3.6 with $f$ replaced by $f_{X+Y}$ we conclude that $p_a$ is a density and
the corresponding characteristic function is $\Delta_a$.^15
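
The claim of Remark 1.5.5 can be illustrated numerically by integrating $p_a(t) \cos(tx)$ and comparing with the triangle function. The sketch assumes $a = 2$, a truncation at $T = 2000$ (the tail of $p_a$ decays like $t^{-2}$, so the truncation error is roughly $2/(a \pi T)$) and a midpoint rule; the helper names are hypothetical.

```python
import math


def p(t, a):
    """Density p_a(t) = (a / (2 pi)) * (sin(t a / 2) / (t a / 2))^2, continuous at t = 0."""
    u = t * a / 2.0
    s = 1.0 if u == 0.0 else math.sin(u) / u
    return a / (2.0 * math.pi) * s * s


def triangle(x, a):
    """Triangle function: 1 - |x|/a for |x| <= a and 0 otherwise."""
    return max(0.0, 1.0 - abs(x) / a)


def cf_numeric(x, a, T=2000.0, n=200_000):
    """Truncated int p_a(t) exp(i t x) dt = 2 * int_0^T p_a(t) cos(t x) dt (p_a is even)."""
    h = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += p(t, a) * math.cos(t * x)
    return 2.0 * total * h


a = 2.0
errors = [abs(cf_numeric(x, a) - triangle(x, a)) for x in (0.0, 1.0, 3.0)]
print(errors)
```

The value at $x = 0$ also confirms, up to the truncation error, that $p_a$ integrates to 1.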
15 The function $\Delta_a$ is an example of the so-called Pólya-type characteristic functions, which we will
treat in more detail in Section 3.9.

Lemma 1.5.6. The following statements hold.$^{16}$

(i) There exists a sequence $\{g_n\}_1^\infty$ of functions $g_n \in P^c(\mathbb{R}^d)$ of the form $g_n = h_n * \tilde h_n$ where $h_n \in C_{00}(\mathbb{R}^d)$ such that $g_n(0) = 1$ and $\{g_n\}_1^\infty$ tends to 1 uniformly on compact sets.
(ii) There exists a sequence $\{g_n\}_1^\infty$ of functions $g_n \in P(\mathbb{Z}^d)$ of the form $g_n = h_n * \tilde h_n$ where $h_n$ is a complex-valued function on $\mathbb{Z}^d$ with finite support such that $g_n(0) = 1$ and $\{g_n\}_1^\infty$ tends pointwise to 1.

Proof. (i) For $n \in \mathbb{N}$ let $S_n := \{ t \in \mathbb{R}^d : 0 \le t_j \le n,\ j = 1, \dots, d \}$ and
\[
h_n := \frac{1}{\sqrt{\lambda(S_n)}} \cdot 1_{S_n}.
\]
For all $r \in [0, n]$ we have
\[
\lambda(S_n \cap (S_n + x)) \ge (n - r)^d, \qquad x \in \mathbb{R}^d,\ |x_i| \le r.
\]
Using this inequality we see that the sequence $\{h_n * \tilde h_n\}_1^\infty$ has the desired properties.
(ii) We define $S_n := \{ t \in \mathbb{Z}^d : 0 \le t_j \le n,\ j = 1, \dots, d \}$ and
\[
h_n := \frac{1}{\sqrt{|S_n|}} \cdot 1_{S_n}
\]
where $|\cdot|$ denotes the counting measure. For all $r \in [0, n]$ we then have
\[
|S_n \cap (S_n + x)| \ge (n + 1 - r)^d, \qquad x \in \mathbb{Z}^d,\ |x_i| \le r
\]
and hence the sequence $\{h_n * \tilde h_n\}_1^\infty$ has the desired properties.
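For part (ii) with $d = 1$ the construction can be written out directly. The sketch below is a minimal illustration, with the hypothetical helper `g`: it computes $h_n * \tilde h_n(x)$ by counting the overlap of $S_n = \{0, \dots, n\}$ with its translate, and shows the values at a fixed point approaching 1.

```python
def g(n, x):
    # h_n = |S_n|^{-1/2} * indicator of S_n = {0, 1, ..., n}  (case d = 1).
    # (h_n * h_n~)(x) = sum_y h_n(x + y) * h_n(y) = |S_n ∩ (S_n - x)| / |S_n|.
    size = n + 1
    overlap = sum(1 for y in range(n + 1) if 0 <= x + y <= n)
    return overlap / size

# values at the fixed point x = 5 for growing n
values_at_5 = [g(n, 5) for n in (10, 100, 1000, 10000)]
```

In this one-dimensional case the overlap count equals $n + 1 - |x|$ for $|x| \le n$, so $g_n(x) = 1 - |x|/(n+1)$, which makes the pointwise convergence to 1 explicit.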

The next lemma follows immediately from the definition of positive definiteness; we omit the proof.

Lemma 1.5.7. Let $f \in P(\mathbb{R}^d)$ be Borel measurable and let $\mu$ be a finite, nonnegative measure on $\mathcal{B}_1$. Then the function $g$ defined by
\[
g(t) = \int_{\mathbb{R}} f(xt)\, d\mu(x), \qquad t \in \mathbb{R}^d
\]
is positive definite. In particular, the function $t \mapsto f(xt)$, $t \in \mathbb{R}^d$, is positive definite for all $x \in \mathbb{R}$.$^{17}$

16 See also Lemma 3.2.3 and Lemma 3.2.4.


17 See also Theorem 1.1.6.

Lemma 1.5.8. A continuous function $f$ on $\mathbb{R}^d$ is positive definite if and only if the inequality
\[
\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(x - y)\, h(x)\, \overline{h(y)}\, dx\, dy \ge 0
\]
holds for arbitrary functions $h \in C_{00}(\mathbb{R}^d)$.

Proof. The 'only if' part follows from Lemma 1.5.3 with $t = 0$ and $d\mu(x) = h(x)\, dx$, using the fact that positive definite functions are nonnegative at 0.
The assumption of the 'if' part means that $h * \tilde h * f(0) \ge 0$ for all $h \in C_{00}(\mathbb{R}^d)$. Let $\mu$ be an arbitrary finitely supported complex measure. It follows from Theorems E.1.9 and E.2.5 that there exists a sequence $\{h_n\}_1^\infty$ of functions $h_n \in C_{00}(\mathbb{R}^d)$ such that $\mu_n * \tilde\mu_n$ converges weakly to $\mu * \tilde\mu$, where $d\mu_n(x) = h_n(x)\, dx$. From this we conclude that $\mu * \tilde\mu * f(0) \ge 0$. Thus, the positive definiteness of $f$ follows from Lemma 1.4.3.

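Theorem 1.5.9. If $f \in P^c(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$, then$^{18}$
\[
\int_{\mathbb{R}^d} f(x)\, e^{-i(t,x)}\, dx \ge 0, \qquad t \in \mathbb{R}^d.
\]
If $f \in P(\mathbb{Z}^d) \cap L^1(\mathbb{Z}^d)$, then
\[
\sum_{n \in \mathbb{Z}^d} f(n)\, e^{-i(t,n)} \ge 0, \qquad t \in \mathbb{R}^d.
\]

Proof. Since the integrand is positive definite for all fixed $t$, it suffices to prove that
\[
\int_{\mathbb{R}^d} f(x)\, dx \ge 0
\]
for all $f \in P^c(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$. To show this inequality we choose a sequence $\{g_n\}$ as in Lemma 1.5.6. Then $|f g_n| \le |f|$ and Lebesgue's theorem on dominated convergence shows that
\[
\int_{\mathbb{R}^d} f(x)\, dx = \lim_{n \to \infty} \int_{\mathbb{R}^d} f(x)\, g_n(x)\, dx \ge 0.
\]
Applying Fubini's theorem and Lemma 1.5.8 we obtain
\[
\begin{aligned}
\int_{\mathbb{R}^d} f(x)\, g_n(x)\, dx
&= \int_{\mathbb{R}^d} f(x) \int_{\mathbb{R}^d} h_n(x + y)\, \overline{h_n(y)}\, dy\, dx \\
&= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(x)\, h_n(x + y)\, \overline{h_n(y)}\, dx\, dy \\
&= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(x - y)\, h_n(x)\, \overline{h_n(y)}\, dx\, dy \ge 0.
\end{aligned}
\]
The second statement can be proved in the same way by replacing the integrals by sums.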
18 See Theorems 1.8.13 and 1.9.6 for closely related results.

The next lemma will be used in the proof of Theorem 1.5.11.

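Lemma 1.5.10. Let $K \subset \mathbb{R}^d$ be compact and let $\{g_n\}_1^\infty$ be a uniformly bounded sequence of Lebesgue measurable functions on $\mathbb{R}^d$ converging $\lambda$-almost everywhere to a function $g$. Then the sequence $\{\varphi_n\}_1^\infty$ with
\[
\varphi_n(t) = \int_K g_n(t + x)\, dx, \qquad t \in \mathbb{R}^d
\]
converges uniformly on every compact set to the function $\varphi$ defined by
\[
\varphi(t) = \int_K g(t + x)\, dx, \qquad t \in \mathbb{R}^d.
\]

Proof. We may assume that $\lambda(K) > 0$, $g = 0$ and $|g_n| \le 1$. Suppose, on the contrary, that for some compact set $C \subset \mathbb{R}^d$ the sequence $\{\varphi_n\}$ does not converge uniformly to 0 on $C$. Then there exist a positive number $\delta$, a sequence $\{t_n\}_1^\infty \subset C$, and a subsequence $\{\varphi_{k_n}\}_1^\infty$ such that $|\varphi_{k_n}(t_n)| \ge \delta$. We claim that the Lebesgue measure of the set
\[
S_n = \Bigl\{ x \in K : |g_{k_n}(t_n + x)| \ge \frac{\delta}{2\lambda(K)} \Bigr\}
\]
is at least $\frac{\delta}{2}$. Indeed, the inequality $\lambda(S_n) < \frac{\delta}{2}$ would imply that
\[
\begin{aligned}
|\varphi_{k_n}(t_n)| &\le \int_{S_n} |g_{k_n}(t_n + x)|\, dx + \int_{K \setminus S_n} |g_{k_n}(t_n + x)|\, dx \\
&\le \lambda(S_n) + \lambda(K) \cdot \frac{\delta}{2\lambda(K)} < \delta
\end{aligned}
\]
contradicting our choice of $t_n$ and $k_n$. Thus, $|g_{k_n}(y)| \ge \frac{\delta}{2\lambda(K)} > 0$ for all $y \in \tilde S_n := S_n + t_n \subset K + C$ and $\lambda(\tilde S_n) = \lambda(S_n) \ge \frac{\delta}{2}$. Consequently,
\[
\lambda(\limsup \tilde S_n) = \lambda\Bigl( \bigcap_{n=1}^{\infty} \bigcup_{m=n}^{\infty} \tilde S_m \Bigr) \ge \frac{\delta}{2}
\]
which is a contradiction since the sequence $\{g_n\}_1^\infty$ does not converge to 0 on the set $\limsup \tilde S_n$ which is contained in the compact set $K + C$.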

We close this section with an important result concerning the convergence of posi-
tive definite functions.

Theorem 1.5.11. If a sequence $\{f_n\}_1^\infty$ of functions $f_n \in P^c(\mathbb{R}^d)$ converges pointwise to a continuous function $f$, then the convergence is uniform on every compact subset of $\mathbb{R}^d$.

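Proof. Let $C \subset \mathbb{R}^d$ be an arbitrary compact subset. For a bounded complex-valued function $h$ on $C$ write $\|h\|_C := \sup\{ |h(t)| : t \in C \}$. Using this notation we have to show that $\|f_n - f\|_C \to 0$.
We may assume that $f_n(0) = f(0) = 1$. Since $f$ is continuous at 0, for every $\varepsilon > 0$ there exists a compact set $K$ with positive Lebesgue measure satisfying
\[
\frac{2}{\lambda(K)} \int_K [1 - \operatorname{Re} f(x)]\, dx < \frac{\varepsilon^2}{9}. \tag{1}
\]
Since $f_n(x) \to f(x)$, this inequality also holds for $f_n$ if $n$ is sufficiently large, say $n \ge N_0$. Define $\varphi$ and $\varphi_n$ as in Lemma 1.5.10 with $g$ and $g_n$ replaced by $f$ and $f_n$, respectively. We show that
\[
\Bigl\| \frac{1}{\lambda(K)} \varphi - f \Bigr\|_C < \frac{\varepsilon}{3} \quad \text{and} \quad \Bigl\| \frac{1}{\lambda(K)} \varphi_n - f_n \Bigr\|_C < \frac{\varepsilon}{3}. \tag{2}
\]
Indeed, using the Cauchy–Schwarz inequality, (1.4.12.ii), and (1) we obtain
\[
\begin{aligned}
\Bigl| f(t) - \frac{1}{\lambda(K)} \varphi(t) \Bigr|^2
&= \Bigl| \frac{1}{\lambda(K)} \int_K f(t)\, dx - \frac{1}{\lambda(K)} \int_K f(t + x)\, dx \Bigr|^2 \\
&= \frac{1}{\lambda(K)^2} \Bigl| \int_K [f(t) - f(t + x)]\, dx \Bigr|^2 \\
&\le \frac{1}{\lambda(K)^2} \int_K 1\, dx \cdot \int_K |f(t) - f(t + x)|^2\, dx \\
&\le \frac{2}{\lambda(K)} \int_K [1 - \operatorname{Re} f(x)]\, dx < \frac{\varepsilon^2}{9}.
\end{aligned}
\]
The second inequality in (2) can be proved in the same way.
By Lemma 1.5.10 (note that $|f_n(t)| \le f_n(0) = 1$), there exists $N \ge N_0$ such that
\[
\Bigl\| \frac{1}{\lambda(K)} \varphi_n - \frac{1}{\lambda(K)} \varphi \Bigr\|_C < \frac{\varepsilon}{3} \quad \text{if } n \ge N. \tag{3}
\]
From (2) and (3) we infer that $\|f - f_n\|_C < \varepsilon$, $n \ge N$.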

1.6 Lévy’s continuity theorem


We now give necessary and sufficient conditions for the weak convergence of prob-
ability distributions in terms of their characteristic functions.

Theorem 1.6.1. If a sequence $\{\mu_n\}_1^\infty$ of probability distributions on $\mathbb{R}^d$ converges weakly to a probability distribution $\mu$, then the sequence $\{f_n\}_1^\infty$ of the corresponding characteristic functions converges uniformly on every compact set to the characteristic function of $\mu$.

Proof. By weak convergence (cf. Theorem E.1.7),
\[
\lim_{n \to \infty} f_n(t) = \lim_{n \to \infty} \int_{\mathbb{R}^d} e^{i(t,x)}\, d\mu_n(x) = \int_{\mathbb{R}^d} e^{i(t,x)}\, d\mu(x), \qquad t \in \mathbb{R}^d.
\]
That the convergence is uniform on compact sets follows from Theorem 1.5.11.

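Lemma 1.6.2. Let $f$ be the characteristic function of a distribution $\mu$ on $\mathbb{R}^d$. For $a > 0$ we then have
\[
\mu(K_a) \ge 1 - 7 \cdot \sup_{t \in K_{1/a}} [1 - \operatorname{Re} f(t)]
\]
where $K_a = [-a, a]^d$.

Proof. Using Fubini's theorem and the example after Definition 1.1.1 we obtain
\[
\begin{aligned}
\Bigl( \frac{2}{a} \Bigr)^d \cdot \sup_{t \in K_{1/a}} [1 - \operatorname{Re} f(t)]
&\ge \int_{K_{1/a}} [1 - \operatorname{Re} f(t)]\, dt \\
&= \int_{K_{1/a}} \int_{\mathbb{R}^d} [1 - \cos((t, y))]\, d\mu(y)\, dt \\
&= \int_{\mathbb{R}^d} \int_{K_{1/a}} [1 - \cos((t, y))]\, dt\, d\mu(y) \\
&= \Bigl( \frac{2}{a} \Bigr)^d \int_{\mathbb{R}^d} \Bigl[ 1 - \prod_{j=1}^{d} \frac{\sin(y_j/a)}{y_j/a} \Bigr] d\mu(y) \\
&\ge \Bigl( \frac{2}{a} \Bigr)^d \int_{\mathbb{R}^d \setminus K_a} \Bigl[ 1 - \prod_{j=1}^{d} \frac{\sin(y_j/a)}{y_j/a} \Bigr] d\mu(y).
\end{aligned}
\]
If $y \in \mathbb{R}^d \setminus K_a$, then $|y_j/a| \ge 1$ for at least one $j$. Since
\[
\frac{\sin x}{x} < \frac{101}{120} < \frac{6}{7} \ \text{ if } |x| \ge 1 \quad \text{and} \quad \frac{\sin x}{x} < 1 \ \text{ if } x \in \mathbb{R} \setminus \{0\}
\]
(cf. (B.1.6.5) and (B.1.6.1)), the last integrand is not less than $\frac{1}{7}$ on the set $\mathbb{R}^d \setminus K_a$. We thus obtain
\[
\sup_{t \in K_{1/a}} [1 - \operatorname{Re} f(t)] \ge \frac{1}{7} \cdot [1 - \mu(K_a)],
\]
from which the lemma follows.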
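As a quick plausibility check of the lemma, one can evaluate both sides for a distribution where everything is available in closed form. The sketch below assumes $d = 1$ and the standard normal distribution (our choice of example), for which $f(t) = e^{-t^2/2}$ and $\mu([-a, a]) = \operatorname{erf}(a/\sqrt{2})$. The function names are hypothetical.

```python
import math

def mass_of_interval(a):
    # mu([-a, a]) for the standard normal distribution
    return math.erf(a / math.sqrt(2.0))

def sup_one_minus_re_f(a):
    # f(t) = exp(-t^2 / 2); 1 - f(t) increases with |t|, so the supremum
    # over K_{1/a} = [-1/a, 1/a] is attained at t = 1/a
    return 1.0 - math.exp(-0.5 / (a * a))

checks = []
for a in (0.5, 1.0, 2.0, 3.0, 5.0, 10.0):
    lhs = mass_of_interval(a)
    rhs = 1.0 - 7.0 * sup_one_minus_re_f(a)
    checks.append((a, lhs, rhs, lhs >= rhs))
```

For small $a$ the right-hand side is negative and the bound is vacuous; it becomes informative as $a$ grows, which is the regime used in the tightness argument of Lévy's continuity theorem.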

The next theorem is frequently applied in probability theory.

Theorem 1.6.3 (Lévy's continuity theorem). Let $\{\mu_n\}_1^\infty$ be a sequence of probability distributions on $\mathbb{R}^d$ such that the sequence $\{f_n\}_1^\infty$ of the corresponding characteristic functions converges pointwise to a continuous function $f$. Then $f$ is the characteristic function of a probability distribution $\mu$ and the sequence $\{\mu_n\}_1^\infty$ converges weakly to $\mu$.

Proof. In view of Theorem E.1.13, the sequence $\{\mu_n\}_{n=1}^\infty$ contains a subsequence $\{\mu_{n_k}\}_{k=1}^\infty$ converging vaguely to some finite nonnegative measure $\mu$ satisfying $\mu(\mathbb{R}^d) \le 1$. We show that $\mu(\mathbb{R}^d) = 1$.
By Lemma 1.6.2, the inequality
\[
\mu_{n_k}(K_m) \ge 1 - 7 \cdot \sup_{t \in K_{1/m}} [1 - \operatorname{Re} f_{n_k}(t)]
\]
holds for all $k$ and $m \in \mathbb{N}$. Taking $\limsup$ with respect to $k$ and using Theorem E.1.11 we obtain
\[
\mu(K_m) \ge \limsup_{k} \mu_{n_k}(K_m) \ge 1 - 7 \cdot \sup_{t \in K_{1/m}} [1 - \operatorname{Re} f(t)].
\]
Now letting $m \to \infty$ and noting that $f$ is continuous and $f(0) = 1$, we see that $\mu(\mathbb{R}^d) \ge 1$. Thus, $\mu(\mathbb{R}^d) = 1$ and hence $\{\mu_{n_k}\}_{k=1}^\infty$ converges weakly to $\mu$ (cf. Theorem E.1.12).
From Theorem 1.6.1 we conclude that $f$ is the characteristic function of $\mu$. Since $\mu$ is uniquely determined by $f$, we see that every weakly convergent subsequence of $\{\mu_n\}_1^\infty$ has the same limit $\mu$. This shows that the whole sequence converges weakly to $\mu$.
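A classical instance of the theorem is the central limit theorem for symmetric $\pm 1$ variables. The sketch below is an illustration only, and the example is ours: the normalized sum of $n$ independent such variables has characteristic function $\cos(t/\sqrt{n})^n$, and the code tabulates its maximal distance from $e^{-t^2/2}$ on a grid in $[-5, 5]$.

```python
import math

def f_n(t, n):
    # characteristic function of (X_1 + ... + X_n) / sqrt(n), X_j = +-1 with prob. 1/2
    return math.cos(t / math.sqrt(n)) ** n

def f_limit(t):
    # characteristic function of the standard normal distribution
    return math.exp(-t * t / 2.0)

grid = [-5.0 + 0.01 * k for k in range(1001)]
sup_errors = [
    max(abs(f_n(t, n) - f_limit(t)) for t in grid)
    for n in (10, 100, 1000, 10000)
]
```

The shrinking grid maxima are consistent with pointwise convergence to a continuous limit, and with the uniform convergence on compact sets asserted in Theorem 1.5.11.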

Remark 1.6.4. In view of Theorem 1.4.10 and Corollary 1.5.2, it suffices to assume continuity of $\operatorname{Re} f$ at 0 in the previous theorem. This assumption cannot be dropped. Indeed, with the notation introduced in Remark 1.5.5, $\lim_{n \to \infty} \Delta_{1/n} = 1_{\{0\}}$. The limit is not a characteristic function though it is continuous at every point different from zero.

The next theorem is usually called the Cramér–Wold device.

Theorem 1.6.5. A sequence $\{X_n\}$ of $d$-dimensional random vectors converges in distribution to a random vector $X$ if and only if for all $a \in \mathbb{R}^d$ the sequence $\{(a, X_n)\}$ of random variables converges in distribution to $(a, X)$.

Proof. Suppose that $X_n \to X$ in distribution. For any $a \in \mathbb{R}^d$ the function $x \mapsto (a, x)$ is continuous on $\mathbb{R}^d$. Applying Corollary E.1.16 we see that $(a, X_n) \to (a, X)$ in distribution.
Conversely, suppose that $(a, X_n) \to (a, X)$ in distribution for all $a \in \mathbb{R}^d$. Denoting by $f_{a,n}$ and $f_a$ the corresponding characteristic functions, Theorem 1.6.1 shows that
\[
\lim_{n} f_{a,n}(s) = f_a(s), \qquad s \in \mathbb{R}.
\]
Using Remark 1.1.10 we obtain
\[
f_{X_n}(a) = f_{a,n}(1) \to f_a(1) = f_X(a), \qquad a \in \mathbb{R}^d.
\]
Hence, by Lévy's continuity theorem, $X_n \to X$ in distribution.

The next result will be used in the proof of Schoenberg’s Theorem 3.8.5.

Proposition 1.6.6. Let $p, p_n$, $n \in \mathbb{N}$, be probability densities on $\mathbb{R}^d$ with corresponding characteristic functions $f$ and $f_n$, respectively. If $p_n \to p$ pointwise, then $f_n \to f$ uniformly on $\mathbb{R}^d$.

Proof. We have
\[
\begin{aligned}
|f_n(rt) - f(rt)| &= \Bigl| \int_{\mathbb{R}^d} e^{i(rt,x)} p_n(x)\, dx - \int_{\mathbb{R}^d} e^{i(rt,x)} p(x)\, dx \Bigr| \\
&\le \int_{\mathbb{R}^d} |p_n(x) - p(x)|\, dx.
\end{aligned}
\]
By Scheffé's Lemma E.1.19 the integral above tends to zero as $n \to \infty$. This completes the proof.

1.7 The theorems of Bochner and Herglotz


We already know that functions of the form
\[
f(t) = \sum_{j=1}^{n} p_j \gamma_j(t), \qquad t \in G
\]
where $G$ is a commutative group, $p_j \ge 0$, and $\gamma_j$ is a character of $G$, are positive definite. We will show that every positive definite function on $G$ can be obtained if we replace the weighted sum above by an integral with respect to a nonnegative measure. First we consider the case $G = \mathbb{R}^d$.

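Lemma 1.7.1. Let $\mu$ be a probability measure on $\mathbb{R}^d$ and for $n \in \mathbb{N}$ define the probability measure $\mu_n$ by $\mu_n(B) = \mu(nB)$, $B \in \mathcal{B}_d$. Then
\[
\lim_{n \to \infty} \mu_n * g(t) = g(t), \qquad t \in \mathbb{R}^d
\]
holds for any bounded, continuous function $g$ on $\mathbb{R}^d$.

Proof. Let $\varepsilon > 0$ and $t \in \mathbb{R}^d$ be arbitrary and choose a bounded, open neighborhood $U$ of 0 such that
\[
|g(t) - g(t - x)| < \varepsilon, \qquad x \in U.
\]
We have
\[
\begin{aligned}
|g(t) - \mu_n * g(t)| &= \Bigl| \int_{\mathbb{R}^d} [g(t) - g(t - x)]\, d\mu_n(x) \Bigr| \\
&\le \varepsilon \cdot \mu_n(U) + 2\|g\|_\infty \cdot \mu_n(\mathbb{R}^d \setminus U) \\
&\le \varepsilon + 2\|g\|_\infty \cdot \mu(\mathbb{R}^d \setminus nU).
\end{aligned}
\]
The lemma follows now from the fact that $\lim_{n \to \infty} \mu(\mathbb{R}^d \setminus nU) = 0$.
Applying the previous lemma to an absolutely continuous measure $\mu$ we immediately obtain the following corollary:

Corollary 1.7.2. Let $p$ be a probability density on $\mathbb{R}^d$ and for $n \in \mathbb{N}$ put
\[
p_n(t) = n^d \cdot p(nt), \qquad t \in \mathbb{R}^d.
\]
Then
\[
\lim_{n \to \infty} p_n * g(t) = g(t), \qquad t \in \mathbb{R}^d
\]
holds for any bounded, continuous function $g$ on $\mathbb{R}^d$.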


We are now able to characterize continuous positive definite functions on $\mathbb{R}^d$. We formulate the result in terms of characteristic functions.

Theorem 1.7.3 (Bochner). A continuous complex-valued function $f$ on $\mathbb{R}^d$ is a characteristic function if and only if $f(0) = 1$ and $f$ is positive definite.
Proof. We have already seen in Theorem 1.1.2 that characteristic functions are positive definite. Conversely, let $f$ be a continuous positive definite function with $f(0) = 1$ and assume first that $f$ is integrable. Further, let $g_\sigma$ and $\varphi_\sigma$ be as in Lemma 1.3.4. Applying Fubini's theorem we obtain
\[
\begin{aligned}
f * g_\sigma(t) &= \int_{\mathbb{R}^d} f(x)\, g_\sigma(t - x)\, dx \\
&= \int_{\mathbb{R}^d} f(x) \cdot \int_{\mathbb{R}^d} e^{i(t - x, y)} \varphi_\sigma(y)\, dy\, dx \\
&= \int_{\mathbb{R}^d} \varphi_\sigma(y) \cdot \int_{\mathbb{R}^d} f(x)\, e^{-i(y,x)}\, dx \cdot e^{i(t,y)}\, dy \\
&=: \int_{\mathbb{R}^d} \varphi_\sigma(y) \cdot q(y) \cdot e^{i(t,y)}\, dy.
\end{aligned}
\]
Theorem 1.5.9 shows that $q(y) \ge 0$. Since $q(y) \le \int |f(x)|\, dx$, the function $y \mapsto \varphi_\sigma(y) q(y)$ is integrable. From this, using (1.3.4.1), we conclude that
\[
\frac{f * g_\sigma}{f * g_\sigma(0)} = \frac{f * \varphi_{1/\sigma}}{f * \varphi_{1/\sigma}(0)} =: h_{1/\sigma}
\]
is a characteristic function. By Corollary 1.7.2, the sequence $\{h_n\}_1^\infty$ converges pointwise to $f$. Application of Lévy's continuity Theorem 1.6.3 shows that $f$ is a characteristic function.
In the general case where $f$ is not supposed to be integrable, we consider the functions $f \cdot g_{1/n}$ which are positive definite and integrable. Moreover, $f(t) = \lim_n f \cdot g_{1/n}(t)$, $t \in \mathbb{R}^d$. A second application of Lévy's continuity theorem completes the proof.
An alternative formulation of Bochner's theorem is the following.

Theorem 1.7.4. A continuous complex-valued function $f$ on $\mathbb{R}^d$ is positive definite if and only if it can be represented in the form
\[
f(t) = \int_{\mathbb{R}^d} e^{i(t,x)}\, d\mu(x), \qquad t \in \mathbb{R}^d,
\]
with some nonnegative finite Borel measure $\mu$ on $\mathbb{R}^d$. The measure $\mu$ is uniquely determined by $f$.
In the case $d = 1$ the next theorem is due to G. Herglotz. We will deduce it from the more general Theorem 1.7.8.$^{19}$

Theorem 1.7.5. A complex-valued function $f$ on $\mathbb{Z}^d$ is positive definite if and only if it can be represented in the form
\[
f(n) = \int_{[0, 2\pi)^d} e^{i(n,t)}\, d\mu(t), \qquad n \in \mathbb{Z}^d
\]
with some nonnegative finite Borel measure $\mu$ on $[0, 2\pi)^d$. The measure $\mu$ is uniquely determined by $f$.

Remark 1.7.6. Let $G$ be an arbitrary commutative group and denote by $L = L(G)$ the complex linear space of all bounded complex-valued functions on $G$. Equipped with the topology of pointwise convergence, $L$ becomes a locally convex linear space (cf. Section D.4). Notice that for fixed $x \in G$ the function $h \mapsto h(x)$ is a continuous linear functional on $L$. The convex set
\[
P_0(G) = \{ f \in P(G) : f(0) = 1 \}
\]
is a closed subset of $L$. Since $|f(x)| \le 1$, $x \in G$, this set is compact (cf. Theorem B.2.8). We have already seen (cf. Theorem 1.4.20) that the set of all extreme points of $P_0(G)$ is equal to the set of all characters of $G$ which we will denote by $\hat G$. The set $\hat G$ is a closed subset of $P_0(G)$. Hence it is compact.
19 The formulations of Theorems 1.7.4, 1.7.5, and 1.7.8 can be unified within the framework of Fourier
transformation on locally compact commutative groups, see for example [49].

Remark 1.7.7. If $G = \mathbb{Z}^d$ then we can identify $\hat G$ with the set $\mathbb{T}^d$ (see Lemma 1.4.7). Moreover, the topology of pointwise convergence on $\hat G$ is the same as the metric topology of $\mathbb{T}^d$. Since every $z \in \mathbb{T}$ can uniquely be written as $z = e^{it}$ with some $t \in [0, 2\pi)$, we can identify $\hat G$ also with $[0, 2\pi)^d$. This is sometimes more convenient. However, in this case the topology of pointwise convergence is not the same as the metric topology (note that $[0, 2\pi)^d$ is not compact). Nevertheless, both topologies generate the same Borel sets.

Theorem 1.7.8. A complex-valued function $f$ on a commutative group $G$ is positive definite if and only if it can be represented in the form
\[
f(x) = \int_{\hat G} \gamma(x)\, d\mu(\gamma), \qquad x \in G
\]
with some nonnegative finite Radon measure $\mu$ on $\hat G$. The measure $\mu$ is uniquely determined by $f$.$^{20}$

Proof. If $f$ has the form above, then
\[
\sum_{i,j=1}^{n} f(x_i - x_j)\, c_i \overline{c_j} = \int_{\hat G} \Bigl| \sum_{i=1}^{n} c_i \gamma(x_i) \Bigr|^2 d\mu(\gamma) \ge 0,
\]
showing that $f$ is positive definite. To prove the other direction we may assume that $f \in P_0(G)$. By Theorem D.4.6 (see also Remark 1.7.6), there exists a probability measure $\mu = \mu_f$ on $\hat G$ such that
\[
l(f) = \int_{\hat G} l(\gamma)\, d\mu(\gamma)
\]
holds for every continuous linear functional $l$ on $L(G)$. Taking here the linear functionals $l_x$ defined by $l_x(h) = h(x)$, $h \in L(G)$, we obtain the desired representation of $f$.
To prove uniqueness, assume that $\nu$ is a finite Radon measure on $\hat G$ such that
\[
\int_{\hat G} \gamma(x)\, d\mu(\gamma) = \int_{\hat G} \gamma(x)\, d\nu(\gamma), \qquad x \in G.
\]
Denote by $A$ the linear space of (continuous) functions on $\hat G$ spanned by all functions of the form $\gamma \mapsto \gamma(x)$, $\gamma \in \hat G$, where $x \in G$. By the equation above, $\int g\, d\mu = \int g\, d\nu$ for all $g \in A$. It is easy to check that $A$ satisfies the first two conditions of Theorem B.2.4 (Stone–Weierstraß theorem). To show that the last condition is satisfied, let $x, y \in G$ be such that $x \ne y$. Applying the first part of the proof to the positive

20 The measure  is sometimes called a representing measure or spectral measure of f .



definite function $1_{\{0\}}$ we see that there exists a probability measure $\omega$ on $\hat G$ such that
\[
0 = 1_{\{0\}}(x - y) = \int_{\hat G} \gamma(x - y)\, d\omega(\gamma).
\]
This implies the existence of $\gamma \in \hat G$ such that $\gamma(x - y) \ne 1$ from which $\gamma(x) \ne \gamma(y)$ follows.
Thus, $A$ satisfies the conditions of the Stone–Weierstraß theorem (cf. Theorem B.2.4) and therefore an arbitrary function $g \in C(\hat G)$ can be uniformly approximated by functions from $A$. This shows that $\int g\, d\mu = \int g\, d\nu$ for all $g \in C(\hat G)$, and hence $\mu = \nu$.

Applying Theorem E.1.8$^{21}$ we obtain the following corollary:

Corollary 1.7.9. Every positive definite function $f$ on $G$ is the pointwise limit of a net $\{f_\alpha\}$ of positive definite functions of the form
\[
f_\alpha = \sum_{i=1}^{n_\alpha} p_i^\alpha \gamma_i^\alpha
\]
where $n_\alpha \in \mathbb{N}$, $p_i^\alpha \ge 0$, $\sum_{i=1}^{n_\alpha} p_i^\alpha = f(0)$ and $\gamma_i^\alpha \in \hat G$.

The next statement has been shown in the proof of Theorem 1.7.8. We state it explicitly because of its importance.

Corollary 1.7.10. If $x$ and $y$ are distinct elements of $G$, then there exists a character $\gamma$ of $G$ such that $\gamma(x) \ne \gamma(y)$.
Another nice corollary of Theorem 1.7.8 is the existence and uniqueness of a translation invariant probability measure on $\hat G$, which is called the normalized Haar measure.

Corollary 1.7.11. There exists a unique Radon probability measure $\omega$ on $\hat G$ such that
\[
\int_{\hat G} f(\gamma)\, d\omega(\gamma) = \int_{\hat G} f(\gamma\chi)\, d\omega(\gamma), \qquad f \in C(\hat G),\ \chi \in \hat G.
\]

Proof. Let $\omega$ and $A$ be as in the proof of Theorem 1.7.8. It follows from the definition of $\omega$ that the equation above holds for all $f \in A$, while the density property of $A$ implies that it holds for all $f \in C(\hat G)$.
Assume that $\nu$ is a Radon probability measure satisfying
\[
\int_{\hat G} f(\gamma)\, d\nu(\gamma) = \int_{\hat G} f(\gamma\chi)\, d\nu(\gamma), \qquad f \in C(\hat G),\ \chi \in \hat G.
\]

21 The corollary also follows from Kreı̆n–Milman’s Theorem D.4.5.



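Setting $f(\gamma) := \gamma(x)$ where $x \in G$ we obtain
\[
\chi(x) \int_{\hat G} \gamma(x)\, d\nu(\gamma) = \int_{\hat G} (\gamma\chi)(x)\, d\nu(\gamma) = \int_{\hat G} \gamma(x)\, d\nu(\gamma), \qquad x \in G,\ \chi \in \hat G.
\]
We have shown in the proof of Theorem 1.7.8 that for all $x \ne 0$ there exists $\chi \in \hat G$ such that $\chi(x) \ne \chi(0) = 1$. Using this and the relation above we conclude that
\[
\int_{\hat G} \gamma(x)\, d\nu(\gamma) = 1_{\{0\}}(x), \qquad x \in G
\]
and hence $\nu = \omega$.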
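The orthogonality relation at the end of the proof is easy to see in a finite example. The sketch below assumes $G = \mathbb{Z}_m$ (the cyclic group of order $m$, our choice of example), whose characters are $x \mapsto e^{2\pi i k x / m}$, $k = 0, \dots, m - 1$, with the normalized counting measure on the character group playing the role of the normalized Haar measure. The helper names are hypothetical.

```python
import cmath
import math

def character(k, x, m):
    # the k-th character of the cyclic group Z_m evaluated at x
    return cmath.exp(2j * math.pi * k * x / m)

def haar_average(x, m):
    # integral of gamma -> gamma(x) over the character group with respect to
    # the normalized counting measure (each character gets weight 1/m)
    return sum(character(k, x, m) for k in range(m)) / m

m = 12
averages = [haar_average(x, m) for x in range(m)]
```

The averages reproduce the indicator of the identity element: the value is 1 at $x = 0$ and vanishes, up to rounding, at every other $x$.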

Example 1.7.12. Identifying the character group of $\mathbb{Z}^d$ with $[0, 2\pi)^d$, the restriction of the Lebesgue measure to $[0, 2\pi)^d$ is translation invariant in the sense of the previous corollary. Thus, dividing this restriction by $(2\pi)^d$ we obtain the normalized Haar measure.

Corollary 1.7.13. If $\mu \in M_b(G)$ is such that
\[
\int_G \gamma(x)\, d\mu(x) = 0, \qquad \gamma \in \hat G
\]
then $\mu = 0$.

Proof. For all $y \in G$ we have
\[
\begin{aligned}
\mu(\{y\}) &= \int_G 1_{\{0\}}(x - y)\, d\mu(x) = \int_G \int_{\hat G} \gamma(x - y)\, d\omega(\gamma)\, d\mu(x) \\
&= \int_{\hat G} \overline{\gamma(y)} \int_G \gamma(x)\, d\mu(x)\, d\omega(\gamma) = 0.
\end{aligned}
\]

Proof of Theorem 1.7.5. We identify the character group of $\mathbb{Z}^d$ with $[0, 2\pi)^d$ (see Remark 1.7.7). Since the topology of pointwise convergence generates the same Borel $\sigma$-algebra as the metric topology of $[0, 2\pi)^d$, Theorem 1.7.5 follows immediately from Theorem 1.7.8.

Alternative proof of Theorem 1.7.5 in the case $d = 1$. We show here only the existence of $\mu$. Definition 1.4.1 with $c_j = e^{-ijx}$, $x \in \mathbb{R}$, and $t_j = j$ gives
\[
0 \le P_N(x) := \frac{1}{N} \sum_{j=1}^{N} \sum_{k=1}^{N} f(j - k)\, e^{-i(j - k)x} = \sum_{m=-N}^{N} \Bigl( 1 - \frac{|m|}{N} \Bigr) f(m)\, e^{-imx}
\]
and therefore
\[
\begin{aligned}
\frac{1}{2\pi} \int_0^{2\pi} e^{inx} P_N(x)\, dx &= \frac{1}{2\pi} \sum_{m=-N}^{N} \Bigl( 1 - \frac{|m|}{N} \Bigr) f(m) \int_0^{2\pi} e^{i(n - m)x}\, dx \\
&= \Bigl( 1 - \frac{|n|}{N} \Bigr)_{+} f(n)
\end{aligned}
\]
for all $n \in \mathbb{Z}$. Consequently, the equation
\[
d\mu_N(x) := \frac{1}{2\pi} P_N(x)\, dx, \qquad x \in [0, 2\pi)
\]
defines a nonnegative measure on $[0, 2\pi)$, such that
\[
\int_{[0, 2\pi)} e^{inx}\, d\mu_N(x) = \Bigl( 1 - \frac{|n|}{N} \Bigr)_{+} f(n). \tag{1}
\]
In particular, $\mu_N([0, 2\pi)) = f(0)$. By Theorem E.1.13, the sequence $\{\mu_N\}$ contains a subsequence converging weakly to some nonnegative measure $\mu$.$^{22}$ From (1) we conclude that
\[
\int_{[0, 2\pi)} e^{inx}\, d\mu(x) = f(n).
\]
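The two expressions for $P_N$ and the coefficient formula (1) can be verified numerically for a concrete positive definite sequence. The sketch below assumes $f(m) = r^{|m|}$ with $r = 0.6$ (our example; it is positive definite on $\mathbb{Z}$) and the sign convention $c_j = e^{-ijx}$. All function names are hypothetical.

```python
import cmath
import math

def f(m, r=0.6):
    # example of a positive definite sequence on Z (an assumption for this sketch)
    return r ** abs(m)

def P_double(N, x):
    # (1/N) * sum_{j,k=1}^{N} f(j - k) * exp(-i (j - k) x)
    total = 0j
    for j in range(1, N + 1):
        for k in range(1, N + 1):
            total += f(j - k) * cmath.exp(-1j * (j - k) * x)
    return (total / N).real

def P_single(N, x):
    # sum_{m=-N}^{N} (1 - |m|/N) * f(m) * exp(-i m x)
    total = 0j
    for m in range(-N, N + 1):
        total += (1 - abs(m) / N) * f(m) * cmath.exp(-1j * m * x)
    return total.real

def coefficient(N, n, M=64):
    # (1/2pi) * int_0^{2pi} e^{inx} P_N(x) dx by the rectangle rule on M
    # equispaced nodes; exact for trigonometric polynomials of degree < M
    total = 0j
    for k in range(M):
        x = 2 * math.pi * k / M
        total += cmath.exp(1j * n * x) * P_single(N, x)
    return total / M

N = 8
xs = [2 * math.pi * k / 50 for k in range(50)]
min_P = min(P_single(N, x) for x in xs)
max_gap = max(abs(P_double(N, x) - P_single(N, x)) for x in xs)
coeff_error = max(
    abs(coefficient(N, n) - max(0.0, 1 - abs(n) / N) * f(n))
    for n in range(-10, 11)
)
```

The quantities `max_gap` and `coeff_error` measure the agreement of the two sums and of the Fourier coefficients with $(1 - |n|/N)_+ f(n)$, while `min_P` illustrates the nonnegativity of $P_N$ on the sampled grid.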

1.8 Fourier transformation on Rd


The aim of the present section is to prove some basic and frequently used results on
the Fourier transform on Rd. Our previous results on inversion theorems, on positive
definite functions and Bochner’s theorem enable us to give short proofs.

Definition 1.8.1. For $g \in L^1(\mathbb{R}^d)$ or $g \in \mathcal{L}^1(\mathbb{R}^d)$ let
\[
\hat g(t) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(x)\, e^{-i(t,x)}\, dx, \qquad t \in \mathbb{R}^d
\]
and
\[
\check g(t) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(x)\, e^{i(t,x)}\, dx, \qquad t \in \mathbb{R}^d.
\]
The function $\hat g$ is called the Fourier transform of $g$ while $\check g$ is the inverse Fourier transform of $g$. In some cases it is convenient to consider the Fourier transformation on the Banach space $L^1(\mathbb{R}^d)$, in other cases it is more natural to deal with $\mathcal{L}^1(\mathbb{R}^d)$, the elements of which are functions and not equivalence classes of functions.

22 This follows also from Helly’s selection theorem.



For a finite complex measure $\mu$ on $\mathbb{R}^d$ we define the Fourier–Stieltjes transform of $\mu$ by
\[
\hat\mu(t) = \int_{\mathbb{R}^d} e^{-i(t,x)}\, d\mu(x), \qquad t \in \mathbb{R}^d
\]
while
\[
\check\mu(t) = \int_{\mathbb{R}^d} e^{i(t,x)}\, d\mu(x), \qquad t \in \mathbb{R}^d
\]
is the inverse Fourier–Stieltjes transform of $\mu$.

Note that if $\varphi$ is a density, then the corresponding characteristic function is equal to $(2\pi)^{d/2} \check\varphi$. On the other hand, if $\mu$ is the distribution of a random vector, then the corresponding characteristic function is $\check\mu$.
To see an example of the Fourier transform, let $\varphi_\sigma$ be the Gaussian density as defined in Lemma 1.3.4. Then, by the same lemma,
\[
\hat\varphi_\sigma = \check\varphi_\sigma = \frac{1}{\sigma^d} \cdot \varphi_{1/\sigma}.
\]

Theorem 1.8.2. Every function $g \in L^1(\mathbb{R}^d)$ is uniquely determined by its Fourier transform $\hat g$ (inverse Fourier transform $\check g$) which is a bounded, uniformly continuous function on $\mathbb{R}^d$. Moreover, for all $g, h \in L^1(\mathbb{R}^d)$ we have
(i) $(g + h)^\wedge = \hat g + \hat h$;
(ii) $(cg)^\wedge = c\, \hat g$ for all $c \in \mathbb{C}$;
(iii) $(g * h)^\wedge = (2\pi)^{d/2}\, \hat g \cdot \hat h$;
(iv) $\overline{\hat g} = (\overline{g})^\vee = (\tilde g)^\wedge$;
(v) $\hat g$ is real-valued if and only if $g = \tilde g$;
(vi) $\sup\{ |\hat g(t)| : t \in \mathbb{R}^d \} \le \dfrac{1}{(2\pi)^{d/2}} \displaystyle\int_{\mathbb{R}^d} |g(x)|\, dx$;
(vii) $(g_y)^\wedge(t) = e^{i(t,y)}\, \hat g(t)$ and $(g_y)^\vee(t) = e^{-i(t,y)}\, \check g(t)$, where $g_y(x) = g(x + y)$.
The analogues of (i)–(iii), (v) and (vi) are also valid for the inverse Fourier transform $\check g$.
Proof. The properties (i)–(iv) and (vi) are simple consequences of Definition 1.8.1; we omit the details. Since $g$ is integrable it can be written as a linear combination of four density functions. Using this, uniform continuity of $\hat g$ (and of $\check g$) follows from (1.1.2.iv). As to uniqueness, the proof of Theorem 1.3.3 can also be applied to $\mu = g\lambda$. Property (v) follows from (iv) and from the uniqueness.
The next theorem can be proved in the same way as Theorem 1.8.2. We omit the proof.

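Theorem 1.8.3. Every measure $\mu \in M_b(\mathbb{R}^d)$ is uniquely determined by its Fourier–Stieltjes transform $\hat\mu$ which is a bounded, uniformly continuous function on $\mathbb{R}^d$. For all $\mu, \nu \in M_b(\mathbb{R}^d)$ we have
(i) $(\mu + \nu)^\wedge = \hat\mu + \hat\nu$;
(ii) $(c\mu)^\wedge = c\, \hat\mu$ for all $c \in \mathbb{C}$;
(iii) $(\mu * \nu)^\wedge = \hat\mu \cdot \hat\nu$;
(iv) $(\mu * f)^\wedge = \hat\mu \cdot \hat f$ for all $f \in L^1(\mathbb{R}^d)$;
(v) $(\tilde\mu)^\wedge = \overline{\hat\mu}$;
(vi) $\hat\mu$ is real-valued if and only if $\mu = \tilde\mu$.
All these properties are also valid for $\check\mu$.
Next we investigate the behavior of the Fourier transform at infinity.

Theorem 1.8.4 (Riemann–Lebesgue lemma). If $g \in L^1(\mathbb{R}^d)$ then $\hat g \in C_0(\mathbb{R}^d)$.

Proof. Let $g$ first be the indicator function of a parallelepiped
\[
[a_1, b_1] \times \cdots \times [a_d, b_d] \subset \mathbb{R}^d.
\]
Then, by Definition 1.8.1,
\[
\hat g(t) = \frac{1}{(2\pi)^{d/2}} \prod_{j=1}^{d} \frac{e^{-i b_j t_j} - e^{-i a_j t_j}}{-i t_j}, \qquad t \in \mathbb{R}^d,
\]
and this tends to zero as $\|t\| \to \infty$, since $|t_j| \to \infty$ for at least one $j$.
For each function $g \in L^1(\mathbb{R}^d)$ and $\varepsilon > 0$ there exists a function $h$ such that $h$ is a finite linear combination of indicator functions of parallelepipeds and $\|g - h\|_1 < \varepsilon/2$. From the first step we know that there exists a positive number $K$ such that $|\hat h(t)| < \varepsilon/2$ whenever $|t| > K$. For such $t$ we have
\[
|\hat g(t)| = |(g - h)^\wedge(t) + \hat h(t)| \le \|g - h\|_1 + |\hat h(t)| < \varepsilon.
\]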
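The first step of the proof can be made concrete in one dimension. The sketch below assumes $d = 1$ and the interval $[a, b] = [-1, 2]$ (an arbitrary choice), compares the closed form of $\hat g$ with a midpoint-rule quadrature of the defining integral, and checks the decay bound $|\hat g(t)| \le 2 / (\sqrt{2\pi}\, |t|)$ that follows from the closed form. The function names are hypothetical.

```python
import cmath
import math

def g_hat(t, a=-1.0, b=2.0):
    # closed form of (2pi)^{-1/2} * int_a^b exp(-i t x) dx
    if t == 0.0:
        return complex((b - a) / math.sqrt(2 * math.pi))
    num = cmath.exp(-1j * b * t) - cmath.exp(-1j * a * t)
    return num / (-1j * t) / math.sqrt(2 * math.pi)

def g_hat_quadrature(t, a=-1.0, b=2.0, steps=20000):
    # midpoint rule for the same integral
    h = (b - a) / steps
    total = sum(cmath.exp(-1j * t * (a + (k + 0.5) * h)) for k in range(steps))
    return total * h / math.sqrt(2 * math.pi)

ts = (0.0, 0.3, 1.0, 5.0, 20.0)
max_gap = max(abs(g_hat(t) - g_hat_quadrature(t)) for t in ts)
decay_ok = all(
    abs(g_hat(t)) <= 2.0 / (math.sqrt(2 * math.pi) * t) + 1e-15
    for t in (1.0, 10.0, 100.0, 1000.0)
)
```

The $1/|t|$ decay seen here is specific to indicator functions; for a general integrable $g$ the lemma gives convergence to zero without any rate.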

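Theorem 1.8.5. If $\mu$ is a finite discrete measure on $\mathbb{R}^d$, then for all $t \in \mathbb{R}^d$ and $s \in \mathbb{R}$ we have
\[
\liminf_{r \to \infty} |\hat\mu(st) - \hat\mu(rt)| = 0.
\]

Proof. The function $f(s) := \hat\mu(st)$, $s \in \mathbb{R}$, has the form
\[
f(s) = \sum_{n=1}^{\infty} c_n e^{i x_n s}, \qquad s \in \mathbb{R}
\]
where $x_n \in \mathbb{R}$, $c_n \in \mathbb{C}$ and $\sum_{n=1}^{\infty} |c_n| < \infty$. Therefore, the theorem follows from Corollary C.2.5 and Theorem C.2.7.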

Remark 1.8.6. Applying the previous theorem with $s = 0$ and $t = 1$ to a discrete probability measure $\mu$ on $\mathbb{R}$ we see that
\[
\limsup_{r \to \infty} |\hat\mu(r)| = 1.
\]
It can be shown (see, e.g., [6], Corollary 1.11.8) that for every $p \in [0, 1]$ there exists a continuous, singular probability measure $\mu$ on $\mathbb{R}$ such that
\[
\limsup_{r \to \infty} |\hat\mu(r)| = p.
\]
To see another interesting example for the behavior at infinity of a Fourier transform, let $q \in (0, 1)$ and let $X_n$, $n \in \mathbb{N}_0$, be independent random variables such that
\[
P(X_n = -1) = P(X_n = +1) = \frac{1}{2}.
\]
The characteristic function of the random variable
\[
X = \sum_{n=0}^{\infty} q^n X_n
\]
does not tend to zero at infinity if and only if $\theta := \frac{1}{q}$ is a Pisot number$^{23}$ different from 2. This is the so-called Zygmund–Salem theorem; we refer to [6, Sect. 1.11] for more details.
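The excluded case $\theta = 2$ can be examined directly. The sketch below assumes $q = 1/2$, so that the characteristic function of $X$ is the infinite product $\prod_{n \ge 0} \cos(t / 2^n)$, and compares a partial product with $\sin(2t) / (2t)$, the characteristic function of the uniform distribution on $[-2, 2]$. The comparison rests on the classical identity $\sin(2t) = 2^{N+1} \sin(t/2^N) \prod_{n=0}^{N} \cos(t/2^n)$. The names are hypothetical.

```python
import math

def partial_product(t, terms=60):
    # prod_{n=0}^{terms-1} cos(t / 2^n): characteristic function of the
    # partial sum of X = sum 2^{-n} X_n with independent symmetric +-1 signs
    prod = 1.0
    for n in range(terms):
        prod *= math.cos(t / 2.0 ** n)
    return prod

def closed_form(t):
    # characteristic function of the uniform distribution on [-2, 2]
    return math.sin(2.0 * t) / (2.0 * t)

ts = [0.5, 1.0, 3.0, 10.0, 50.0, 200.0]
max_gap = max(abs(partial_product(t) - closed_form(t)) for t in ts)
```

Since $\sin(2t)/(2t) \to 0$ as $|t| \to \infty$, this is consistent with the statement that $\theta = 2$ has to be excluded even though 2 is a Pisot number.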
The Fourier transform of an integrable function may not be integrable (cf. Exam-
ple 1.8.14). However, we have the following result:24

Theorem 1.8.7. Let $f$ be a continuous and integrable positive definite function on $\mathbb{R}^d$. Then both $\hat f$ and $\check f$ are integrable, nonnegative and
\[
f = (\hat f)^\vee = (\check f)^\wedge.
\]

Proof. We may assume that $f(0) = 1$. Then $f$ is a characteristic function in view of Bochner's theorem. By Theorem 1.3.6, $f$ corresponds to some density $p$, i.e., $f = (2\pi)^{d/2} \check p$. On the other hand, Theorem 1.3.6 shows that $p = (1/2\pi)^{d/2} \hat f$. Combining these facts we see that $\hat f \ge 0$ and $f = (\hat f)^\vee$. The relations $\check f \ge 0$ and $f = (\check f)^\wedge$ follow from (1.8.2.iv).
We will extend the Fourier transformation to square integrable functions. For this
we need the following result:

Lemma 1.8.8. The set
\[
L := \{ \hat g : g \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d) \}
\]
23 A real number $\theta > 1$ is said to be a Pisot number if it satisfies an equation
\[
\theta^n + b_1 \theta^{n-1} + \cdots + b_n = 0
\]
where $b_1, \dots, b_n$ are integers, and all other roots of this equation have moduli less than 1.
24 See also Theorem 1.8.13.

is a dense linear subspace of $L^2(\mathbb{R}^d)$ and
\[
\|g\|_2 = \|\hat g\|_2, \qquad g \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d).
\]

Proof. If $g \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, then the function $f = g * \tilde g$ belongs to $P^c(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$ (cf. Lemma 1.5.4 and Theorem E.2.4) and $\hat f = (2\pi)^{d/2} |\hat g|^2$ in view of (1.8.2.iii) and (1.8.2.iv). By Theorem 1.8.7, the function $\hat f$ is integrable and therefore $\hat g \in L^2(\mathbb{R}^d)$, i.e., $L \subset L^2(\mathbb{R}^d)$. It is obvious that $L$ is a linear subspace.
Applying the first equation in Theorem 1.8.7, we get
\[
\int_{\mathbb{R}^d} |g(x)|^2\, dx = g * \tilde g(0) = f(0) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \hat f(x)\, dx = \int_{\mathbb{R}^d} |\hat g(x)|^2\, dx,
\]
that is, $\|g\|_2 = \|\hat g\|_2$.
Suppose now that $\psi \in L^2(\mathbb{R}^d)$ and $\psi$ is orthogonal to $L$. Since $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ is translation-invariant, (1.8.2.vii) shows that $L$ is invariant under multiplication by the function $t \mapsto e^{i(t,x)}$ for all $x \in \mathbb{R}^d$. Define the function $\rho$ by $\rho(t) = e^{-(t,t)}$. Since $\rho$ and $\psi$ are square integrable their product is in $L^1(\mathbb{R}^d)$. By Lemma 1.3.4, the function $\rho$ is in $L$, thus the identity
\[
\int_{\mathbb{R}^d} \rho(t)\, \overline{\psi(t)}\, e^{-i(t,x)}\, dt = 0
\]
holds for all $x \in \mathbb{R}^d$, i.e., $(\rho\overline{\psi})^\wedge = 0$. The first part of Theorem 1.8.2 implies that $\rho\overline{\psi} = 0$ and hence also $\psi = 0$. Thus, 0 is the only element in $L^2(\mathbb{R}^d)$ that is orthogonal to $L$; therefore $L$ is dense in $L^2(\mathbb{R}^d)$.

Theorem 1.8.9 (Plancherel). We consider the Fourier transformation, introduced in Definition 1.8.1, on the set $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$. This transformation can be extended in a unique way to a linear isometry from $L^2(\mathbb{R}^d)$ onto $L^2(\mathbb{R}^d)$. Denoting the extension by the same symbol, the equations$^{25}$
(i) $\displaystyle\int_{\mathbb{R}^d} f(x)\, \overline{g(x)}\, dx = \int_{\mathbb{R}^d} \hat f(t)\, \overline{\hat g(t)}\, dt$ and
(ii) $(2\pi)^{d/2} (f \cdot g)^\wedge = \hat f * \hat g$
hold for all $f, g \in L^2(\mathbb{R}^d)$.
Proof. Write $H = L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$. By Lemma 1.8.8, the Fourier transformation is a linear isometry from $H$ into $L^2(\mathbb{R}^d)$. The first statement follows immediately from the fact that $H$ is a dense subspace of $L^2(\mathbb{R}^d)$. Equation (i) is a consequence of the identity
\[
4 f \overline{g} = |f + g|^2 - |f - g|^2 + i|f + ig|^2 - i|f - ig|^2.
\]
Replacing $g(x)$ by $\overline{g(x)}\, e^{i(y,x)}$, $y \in \mathbb{R}^d$, in (i) we obtain (ii).
25 The first equation is usually referred to as Parseval’s identity.

Remark 1.8.10. If $g \in L^2(\mathbb{R}^d)$ then we call $\hat g$ the $L^2$ Fourier transform of $g$. Note that $\hat g$ is not a function but an equivalence class of functions and relations such as $\hat g = \hat h$ or $\hat g \ge 0$ are relations between equivalence classes. However, if $g \in \mathcal{L}^1(\mathbb{R}^d) \cap \mathcal{L}^2(\mathbb{R}^d)$ then we can regard $\hat g$ as a function given by
\[
\hat g(t) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} g(x)\, e^{-i(t,x)}\, dx.
\]
It will be clear from the context in every case whether we mean the fixed function $\hat g$ or the equivalence class in $L^2(\mathbb{R}^d)$ containing $\hat g$.
The inverse Fourier transformation can be extended to square integrable functions in the same way.

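Theorem 1.8.11. We consider the inverse Fourier transformation, introduced in Definition 1.8.1, on the set $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$. This transformation can be extended in a unique way to a linear isometry from $L^2(\mathbb{R}^d)$ onto $L^2(\mathbb{R}^d)$. Denoting the extension by the same symbol, the equations
(i) $\varphi = (\check\varphi)^\wedge = (\hat\varphi)^\vee$
(ii) $(2\pi)^{d/2} (\varphi \cdot \psi)^\vee = \check\varphi * \check\psi$
(iii) $(\varphi * \psi)^\vee = (2\pi)^{d/2} \cdot \check\varphi \cdot \check\psi$
hold for all $\varphi, \psi \in L^2(\mathbb{R}^d)$.
Proof. The inverse Fourier transform of a function in $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$ is square integrable. This follows from (1.8.2.iv) and the square integrability of the Fourier transform. If $\varphi, g \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, then
\[
\begin{aligned}
\int_{\mathbb{R}^d} \check\varphi(x)\, \overline{g(x)}\, dx
&= \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \varphi(t)\, e^{i(t,x)}\, dt\, \overline{g(x)}\, dx \\
&= \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \varphi(t)\, \overline{\int_{\mathbb{R}^d} g(x)\, e^{-i(t,x)}\, dx}\, dt \\
&= \int_{\mathbb{R}^d} \varphi(t)\, \overline{\hat g(t)}\, dt.
\end{aligned}
\]
Using (1.8.9.i) and the equations above we see that
\[
\int_{\mathbb{R}^d} (\check\varphi)^\wedge(t)\, \overline{\hat g(t)}\, dt = \int_{\mathbb{R}^d} \check\varphi(x)\, \overline{g(x)}\, dx = \int_{\mathbb{R}^d} \varphi(t)\, \overline{\hat g(t)}\, dt.
\]
Since the set $L$ in Lemma 1.8.8 is dense in $L^2(\mathbb{R}^d)$, we conclude that
\[
\varphi = (\check\varphi)^\wedge, \qquad \varphi \in L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d). \tag{1}
\]
It follows from (1) and (1.8.9.i) that $\|\varphi\|_2 = \|\check\varphi\|_2$. Thus, the mapping $\varphi \mapsto \check\varphi$ maps $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$, which is dense in $L^2(\mathbb{R}^d)$, isometrically into $L^2(\mathbb{R}^d)$, and therefore it extends to a unique isometric mapping from $L^2(\mathbb{R}^d)$ onto $L^2(\mathbb{R}^d)$. From (1) we conclude that (i) holds while (ii) is obtained in the same way as (1.8.9.ii). Taking the inverse Fourier transform of both sides of (1.8.9.ii) we obtain (iii).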

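Remark 1.8.12. We call $\check\varphi$, $\varphi \in L^2(\mathbb{R}^d)$, the inverse $L^2$ Fourier transform of $\varphi$. Remark 1.8.10 holds for this transformation, too.

Theorem 1.8.13. Suppose that $f \in L^1(\mathbb{R}^d) \cap L^\infty(\mathbb{R}^d)$ and that $\hat f$ is real-valued and nonnegative. Then $\hat f$ is in $L^1(\mathbb{R}^d)$.

Proof. First we note that $f \in L^2(\mathbb{R}^d)$. Let $\varphi_\sigma$ be the Gaussian density as defined in Lemma 1.3.4. Applying (1.8.9.i) with $g = \varphi_{1/n}$ and using that $\hat\varphi_{1/n} = n^d \varphi_n$ we obtain
\[
\begin{aligned}
\frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \hat f(t)\, e^{-\frac{\|t\|^2}{2n^2}}\, dt &= \int_{\mathbb{R}^d} \hat f(t)\, \hat\varphi_{1/n}(t)\, dt \\
&= \int_{\mathbb{R}^d} f(x)\, \varphi_{1/n}(x)\, dx \le \|f\|_\infty.
\end{aligned}
\]
The integrand on the left is nonnegative and monotone increasing as a function of $n$. Taking the limit $n \to \infty$ and applying Beppo Levi's theorem on monotone convergence we obtain
\[
\int_{\mathbb{R}^d} \hat f(t)\, dt \le (2\pi)^{d/2} \|f\|_\infty < \infty.
\]
Thus, $\hat f$ is integrable.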

Next we give two simple examples where the Fourier transform is square integrable
but not integrable.

Example 1.8.14. By Example 1.1.13(a), the Fourier transform of the function $p(x) = 1_{[0,\infty)}(x)\, e^{-x}$, $x \in \mathbb{R}$, is equal to $q(t) = \frac{1}{\sqrt{2\pi}} \cdot \frac{1}{1 + it}$. This function is square integrable but not integrable. By (1.8.11.i), the inverse $L^2$ Fourier transform of $q$ is $p$.
A similar example is given by the indicator function $p$ of the set $[-1, 1]^d$. The computations at the end of Definition 1.1.1 show that
\[
q(t) := \hat p(t) = \check p(t) = \Bigl( \frac{2}{\pi} \Bigr)^{d/2} \prod_{j=1}^{d} \frac{\sin t_j}{t_j}, \qquad t \in \mathbb{R}^d.
\]
The function $q$ belongs to $L^2(\mathbb{R}^d) \setminus L^1(\mathbb{R}^d)$ and $\hat q = \check q = p$.

The representing measure of the positive definite function $\mu * \tilde\mu * f$ from Lemma 1.5.3 can be easily determined.
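The first formula in the example can be checked by direct quadrature. The sketch below assumes the normalization of Definition 1.8.1 with the kernel $e^{-itx}$ and approximates $(2\pi)^{-1/2} \int_0^\infty e^{-x} e^{-itx}\, dx$ by a midpoint rule on a truncated interval. The truncation point and step count are arbitrary choices, and the function names are hypothetical.

```python
import cmath
import math

def q_closed(t):
    # claimed Fourier transform of p(x) = exp(-x) on [0, infinity)
    return 1.0 / (math.sqrt(2 * math.pi) * (1 + 1j * t))

def q_quadrature(t, upper=40.0, steps=200000):
    # midpoint rule for (2pi)^{-1/2} * int_0^upper exp(-x) exp(-i t x) dx;
    # the neglected tail beyond `upper` is smaller than exp(-40)
    h = upper / steps
    total = 0j
    for k in range(steps):
        x = (k + 0.5) * h
        total += cmath.exp(-(1 + 1j * t) * x)
    return total * h / math.sqrt(2 * math.pi)

max_gap = max(abs(q_closed(t) - q_quadrature(t)) for t in (0.0, 1.0, 10.0))
```

Since $|q(t)| = (2\pi)^{-1/2} (1 + t^2)^{-1/2}$ behaves like $1/|t|$ at infinity, its square is integrable while $|q|$ itself is not, which is the point of the example.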

Theorem 1.8.15. Assume that f 2 P c .Rd / admits the representing measure and
let  2 Mb .Rd / be arbitrary. Then the function   Q  f is positive definite and its
representing measure  is given by d  .s/ D j.s/j
O 2 d .s/.

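Proof. Using
\[
f(t) = \int e^{i(t,s)}\,d\sigma(s)
\]
(integration over $\mathbb{R}^d$) we obtain
\[
\begin{aligned}
\mu * \tilde\mu * f(t) &= \int f(t-x)\,d(\mu * \tilde\mu)(x) \\
&= \iint f(t - x + y)\,d\mu(x)\,d\bar\mu(y) \\
&= \iiint e^{i(t-x+y,\,s)}\,d\sigma(s)\,d\mu(x)\,d\bar\mu(y) \\
&= \int e^{i(t,s)} \int e^{-i(x,s)}\,d\mu(x) \int e^{i(y,s)}\,d\bar\mu(y)\,d\sigma(s) \\
&= \int e^{i(t,s)}\,|\hat\mu(s)|^2\,d\sigma(s).
\end{aligned}
\]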

Using $L^2$ Fourier transformation we can characterize characteristic functions corresponding to absolutely continuous distributions.

Theorem 1.8.16. Let $f$ be the characteristic function of a distribution $\sigma$ on $\mathbb{R}^d$. Then $\sigma$ is absolutely continuous if and only if there exists $g \in L^2(\mathbb{R}^d)$ such that
\[
f(t) = g * \tilde g(t) = \int_{\mathbb{R}^d} g(t+x)\,\overline{g(x)}\,dx, \quad t \in \mathbb{R}^d .
\]
For this function $g$ we have $d\sigma(x) = |\hat g(x)|^2\,dx$.

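Proof. Assume first that $d\sigma(t) = p(t)\,dt$ where $p \in L^1(\mathbb{R}^d)$, $p \ge 0$, so that
\[
f = (2\pi)^{d/2} \cdot \check p .
\]
Choose an arbitrary Lebesgue measurable function $q$ with $p = |q|^2$ (we can take $q = \sqrt{p}$). Then $q \in L^2(\mathbb{R}^d)$. Setting $g := \check q$ we have $g \in L^2(\mathbb{R}^d)$, $\hat g = q$ and
\[
(g * \tilde g)^{\wedge} = (2\pi)^{d/2} \cdot |\hat g|^2 = (2\pi)^{d/2} \cdot |q|^2 = (2\pi)^{d/2} \cdot p = \hat f .
\]
Thus, $f = g * \tilde g$.
Assume now that $f = g * \tilde g$ with some function $g \in L^2(\mathbb{R}^d)$. Parseval's identity (1.8.9.i) shows that
\[
f(t) = \int_{\mathbb{R}^d} g(t+x)\,\overline{g(x)}\,dx = \int_{\mathbb{R}^d} e^{i(t,x)}\,|\hat g(x)|^2\,dx ,
\]
i.e., $d\sigma(x) = |\hat g(x)|^2\,dx$.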

1.9 Fourier transformation on discrete commutative groups

In this section we prove some basic facts on the Fourier transformation on a commutative group $G$ and its character group $\hat G$, which we will need in the sequel.

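Definition 1.9.1. The Fourier–Stieltjes transform $\hat\mu$ of a complex measure $\mu \in M^b(G)$ is defined by
\[
\hat\mu(\gamma) = \int_G \overline{\gamma(x)}\,d\mu(x), \quad \gamma \in \hat G .
\]
If $d\mu(x) = f(x)\,dx$ with $f \in L^1(G)$ where $dx$ denotes integration with respect to the counting measure on $G$, then we write $\hat f$ for $\hat\mu$ and call $\hat f$ the Fourier transform of $f$.
If $G = \mathbb{Z}^d$ and we identify $\hat G$ with $[0,2\pi)^d$ (see Remark 1.7.7) then we have
\[
\hat\mu(t) = \int_{\mathbb{Z}^d} e^{-i(t,n)}\,d\mu(n) = \sum_{n \in \mathbb{Z}^d} \mu(\{n\}) \cdot e^{-i(t,n)}, \quad t \in [0,2\pi)^d ,
\]
for all $\mu \in M^b(\mathbb{Z}^d)$.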

Theorem 1.9.2. Every complex measure $\mu \in M^b(G)$ is uniquely determined by its Fourier–Stieltjes transform $\hat\mu$, which is a bounded continuous function. For all $\mu, \nu \in M^b(G)$ we have
(i) $(\mu + \nu)^{\wedge} = \hat\mu + \hat\nu$;
(ii) $(c\mu)^{\wedge} = c\,\hat\mu$ for all $c \in \mathbb{C}$;
(iii) $(\mu * \nu)^{\wedge} = \hat\mu \cdot \hat\nu$;
(iv) $(\tilde\mu)^{\wedge} = \overline{\hat\mu}$.

Proof. The uniqueness follows from Corollary 1.7.13; the remaining properties are simple consequences of the definitions.
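Properties (iii) and (iv) are easy to check numerically for $G = \mathbb{Z}$. The sketch below is an illustration under stated assumptions rather than the book's construction: a finitely supported complex measure is stored as a dictionary `n -> mu({n})`, the transform is taken with the conjugated character $e^{-itn}$, and $\tilde\mu(\{n\}) = \overline{\mu(\{-n\})}$. The helper names are hypothetical.

```python
import cmath

# Finitely supported complex measure on Z, stored as {n: mu({n})}.
Measure = dict[int, complex]


def fourier_stieltjes(mu: Measure, t: float) -> complex:
    """Sum of mu({n}) * exp(-i*t*n) over the support of mu."""
    return sum(w * cmath.exp(-1j * t * n) for n, w in mu.items())


def convolve(mu: Measure, nu: Measure) -> Measure:
    """Convolution: (mu * nu)({k}) is the sum of mu({m}) * nu({n}) over m + n = k."""
    out: Measure = {}
    for m, a in mu.items():
        for n, b in nu.items():
            out[m + n] = out.get(m + n, 0) + a * b
    return out


def tilde(mu: Measure) -> Measure:
    """Reflected conjugate measure: tilde(mu)({n}) = conj(mu({-n}))."""
    return {-n: complex(w).conjugate() for n, w in mu.items()}
```

For any two such dictionaries `mu` and `nu` and any real `t`, `fourier_stieltjes(convolve(mu, nu), t)` agrees with the product of the two transforms up to rounding, and `fourier_stieltjes(tilde(mu), t)` agrees with the complex conjugate of `fourier_stieltjes(mu, t)`.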

Definition 1.9.3. The inverse Fourier–Stieltjes transform $\check\mu$ of a complex measure $\mu \in M^b(\hat G)$ is defined by
\[
\check\mu(x) = \int_{\hat G} \gamma(x)\,d\mu(\gamma), \quad x \in G .
\]
Note that the function $\gamma \mapsto \gamma(x)$ is bounded and continuous on $\hat G$ for all $x$ in $G$, so the integral exists. If $d\mu(\gamma) = f(\gamma)\,d\lambda(\gamma)$ with $f \in L^1(\hat G)$ where $\lambda$ denotes the normalized Haar measure on $\hat G$, then we write $\check f$ for $\check\mu$ and call $\check f$ the inverse Fourier transform of $f$.
If $G = \mathbb{Z}^d$ and we identify $\hat G$ with $[0,2\pi)^d$ then we have
\[
\check\mu(n) = \int_{[0,2\pi)^d} e^{i(t,n)}\,d\mu(t), \quad n \in \mathbb{Z}^d ,
\]
for all $\mu \in M^b([0,2\pi)^d)$.

Theorem 1.9.4. Every complex measure $\mu \in M^b(\hat G)$ is uniquely determined by its inverse Fourier–Stieltjes transform $\check\mu$, which is a bounded function. For all $\mu, \nu \in M^b(\hat G)$ we have
(i) $(\mu + \nu)^{\vee} = \check\mu + \check\nu$;
(ii) $(c\mu)^{\vee} = c\,\check\mu$ for all $c \in \mathbb{C}$;
(iii) $(\tilde\mu)^{\vee} = \overline{\check\mu}$.

Proof. The uniqueness can be proved by the same arguments used in the proof of Theorem 1.7.8. The remaining properties are simple consequences of the definitions.

The next theorem is the analogue of Theorem 1.8.15 and can be proved in the same
way. We omit the proof.

Theorem 1.9.5. Assume that $f \in P(G)$ admits the representing measure $\sigma \in M^b_+(\hat G)$ and let $\mu \in M^b(G)$ be arbitrary. Then the function $\mu * \tilde\mu * f$ is positive definite and its representing measure $\tau$ is given by $d\tau(s) = |\hat\mu(s)|^2\,d\sigma(s)$.

The analogue of Theorem 1.8.7 is contained in the next result.

Theorem 1.9.6.
(i) The relation $f = (\hat f)^{\vee}$ holds for all $f \in L^1(G)$.
(ii) If $f \in P(G) \cap L^1(G)$, then $\hat f \ge 0$.
(iii) If $f \in L^1(G)$ and $\hat f \ge 0$, then $f \in P(G)$.

Proof. (i) Applying Fubini's theorem we obtain
\[
\begin{aligned}
(\hat f)^{\vee}(t) &= \int_{\hat G} \int_G f(x)\,\overline{\gamma(x)}\,dx\;\gamma(t)\,d\lambda(\gamma) \\
&= \int_G \int_{\hat G} \gamma(t - x)\,d\lambda(\gamma)\, f(x)\,dx \\
&= \int_G f(x)\,1_{\{t\}}(x)\,dx = f(t), \quad t \in G .
\end{aligned}
\]
(ii) By Theorem 1.7.8 there exists a unique nonnegative measure $\sigma \in M^b_+(\hat G)$ such that $f = \check\sigma$. By (i) we also have $f = (\hat f)^{\vee}$. Since $\sigma$ is unique we must have $d\sigma = \hat f\,d\lambda$. The nonnegativity of $\sigma$ implies that $\hat f \ge 0$.
(iii) This statement follows immediately from (i).

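The extended Fourier transform 1.9.7. If $\mu \in M^f(\mathbb{Z}^d)$ then we can extend the Fourier–Stieltjes transform of $\mu$ from $\mathbb{T}^d$ to $\mathbb{C}^d \setminus \{0\}$ by setting
\[
\hat\mu(z) := \int_{\mathbb{Z}^d} z^{-n}\,d\mu(n), \quad z \in \mathbb{C}^d \setminus \{0\} .
\]
In the same way, if $f$ is a complex-valued function on $\mathbb{Z}^d$ having finite support, then we define the Fourier transform of $f$ on $\mathbb{C}^d \setminus \{0\}$ by
\[
\hat f(z) := \sum_{n \in \mathbb{Z}^d} z^{-n} f(n), \quad z \in \mathbb{C}^d \setminus \{0\} .
\]
It is easy to see that the relations
\[
(\tilde f)^{\wedge}(z) = \overline{\hat f(1/\bar z)} \tag{1}
\]
\[
(f * g)^{\wedge}(z) = \hat f(z) \cdot \hat g(z) \tag{2}
\]
hold for all $z \in \mathbb{C}^d \setminus \{0\}$. In particular,
\[
(f * \tilde f)^{\wedge}(z) = \hat f(z) \cdot \overline{\hat f(1/\bar z)}, \quad z \in \mathbb{C}^d \setminus \{0\} .
\]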

1.10 Basic properties of Gaussian distributions

1.10.1. In Lemma 1.3.4 we already introduced the standard Gaussian density
\[
\varphi(x) = \frac{1}{(2\pi)^{d/2}} \cdot e^{-\frac12 (x,x)}, \quad x \in \mathbb{R}^d ,
\]
and showed that its characteristic function is
\[
g(t) = e^{-\frac12 (t,t)}, \quad t \in \mathbb{R}^d .
\]
Suppose that a random vector $X$ has characteristic function $g$ and consider the random vector $Y = m + BX$ where $m \in \mathbb{R}^d$ and $B$ is a $d \times d$ real matrix. The covariance matrix of $Y$ is $C = BB^T$ (see page 349). By Theorem 1.1.7, the characteristic function $f_Y$ of $Y$ is given by
\[
f_Y(t) = e^{i(m,t)} \cdot g(B^T t) = e^{i(m,t) - \frac12 (B^T t,\, B^T t)} = e^{i(m,t) - \frac12 (Ct,\,t)} .
\]
Since $C$ is a covariance matrix, it is positive semidefinite (cf. Section F.1).
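The identity $(B^T t, B^T t) = (BB^T t, t)$ behind the last formula can be checked on a small example. The following sketch is illustrative only; the matrix helpers and the function names `cf_via_transform` and `cf_via_covariance` are hypothetical, and matrices are plain nested lists.

```python
import cmath


def dot(u, v):
    """Euclidean inner product (u, v)."""
    return sum(a * b for a, b in zip(u, v))


def transpose(A):
    return [list(col) for col in zip(*A)]


def mat_vec(A, v):
    return [dot(row, v) for row in A]


def mat_mul(A, B):
    """Matrix product A B: rows of A against columns of B."""
    return [[dot(row, col) for col in zip(*B)] for row in A]


def cf_via_transform(m, B, t):
    """exp(i(m,t)) * g(B^T t) with g the standard Gaussian characteristic function."""
    s = mat_vec(transpose(B), t)
    return cmath.exp(1j * dot(m, t)) * cmath.exp(-0.5 * dot(s, s))


def cf_via_covariance(m, C, t):
    """exp(i(m,t) - (Ct,t)/2) for a covariance matrix C."""
    return cmath.exp(1j * dot(m, t) - 0.5 * dot(mat_vec(C, t), t))
```

With `B = [[2, 0], [1, 3]]` the product `mat_mul(B, transpose(B))` is `[[4, 2], [2, 10]]`, and both functions return the same value for every `m` and `t` up to rounding.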


Now let $C$ be an arbitrary positive semidefinite real matrix. By Theorem D.2.11 there exists a symmetric real matrix $B$ such that $B^2 = C$. The above considerations show that $f_Y(t) = e^{i(m,t) - \frac12 (Ct,t)}$ is the characteristic function of $Y = m + BX$ where $X$ is standard Gaussian.
The exponent $i(m,t) - \frac12 (Ct,t)$ is a polynomial of $t$ of degree at most two. We will show in Section 3.5 that a function of the form $e^P$ where $P$ is a polynomial of degree greater than two cannot be a characteristic function (cf. Theorem 3.5.1).

Definition 1.10.2. A $d$-dimensional random vector $Y$ is said to be Gaussian (or normal) if its characteristic function $f$ has the form
\[
f(t) = e^{i(m,t) - \frac12 (Ct,t)}, \quad t \in \mathbb{R}^d ,
\]
with some $m \in \mathbb{R}^d$ and a $d \times d$ positive semidefinite real matrix $C$.

For such $Y$ we will write $Y \sim N(m, C)$. It follows from Theorem 1.2.1 that $m = E(Y)$ and $C = \operatorname{cov}(Y)$.
The next lemma follows immediately from the definition of Gaussian random vectors; we omit the proof.

Lemma 1.10.3. Let $X$ be a $d$-dimensional Gaussian vector with $X \sim N(m, C)$. For an arbitrary $n \times d$ matrix $B$, the $n$-dimensional random vector $Y = BX$ is Gaussian and $Y \sim N(Bm, BCB^T)$.
In particular, all marginal distributions of $X$ are Gaussian. Moreover, for each $a \in \mathbb{R}^d$ the random variable $(a, X)$ has a univariate Gaussian distribution with mean $(a, m)$ and variance $(a, Ca)$.

The next theorem gives a very useful characterization of Gaussian random vectors.

Theorem 1.10.4. A $d$-dimensional random vector $X$ is Gaussian if and only if for each $a \in \mathbb{R}^d$ the random variable $(a, X)$ is Gaussian.

Proof. The necessity of the condition has already been established in Lemma 1.10.3. To prove the sufficiency suppose that $X_a = (a, X)$ is Gaussian for any $a \in \mathbb{R}^d$. Setting $m = E(X)$ and $C = \operatorname{cov}(X)$ we have $E(X_a) = (m, a)$ and $\operatorname{cov}(X_a) = (Ca, a)$. Thus, the characteristic function $f_a$ of $X_a$ is given by
\[
f_a(t) = E\bigl(e^{it(a,X)}\bigr) = e^{it(m,a) - \frac12 t^2 (Ca,a)}, \quad t \in \mathbb{R} .
\]
Setting here $t = 1$ we obtain the required characteristic function of $X$, namely
\[
E\bigl(e^{i(a,X)}\bigr) = e^{i(m,a) - \frac12 (Ca,a)}, \quad a \in \mathbb{R}^d .
\]

Theorem 1.10.5. Let $Y$ be a $d$-dimensional Gaussian vector with covariance matrix $C$ having rank $k > 0$. Then there exist a $k$-dimensional standard Gaussian vector $X$ and a $d \times k$ real matrix $B$ of rank $k$ such that $C = BB^T$ and
\[
Y = E(Y) + BX .
\]
Consequently, the distribution of $Y$ is supported by a $k$-dimensional linear manifold.

Proof. We may assume that $E(Y) = 0$. Let $Y = (Y_1, \dots, Y_d)$ and denote by $L$ the linear subspace of $L^2_r(\Omega, \mathcal{A}, P)$ spanned by the random variables $Y_j$. The elements of $L$ are Gaussian random variables with zero mean (cf. Theorem 1.10.4). By Theorem D.2.16, the dimension of $L$ is equal to the rank of $C$. Let $X_1, \dots, X_k$ be an arbitrary orthonormal basis of $L$. Then the random vector $X = (X_1, \dots, X_k)$ is standard Gaussian. The existence of $B$ follows from the fact that each $Y_j$ is a linear combination of the $X_i$'s. Moreover, $\operatorname{rank}(B) = \dim L = k$.

Using the previous theorem we can now express the density of absolutely continu-
ous Gaussian vectors by means of the standard Gaussian density.

Theorem 1.10.6. Let $Y$ be a $d$-dimensional Gaussian vector with $Y \sim N(m, C)$. If $\det C > 0$ then $Y$ has the density
\[
p(x) = \frac{1}{(2\pi)^{d/2} \cdot \sqrt{\det C}} \cdot e^{-\frac12 (C^{-1}(x-m),\, x-m)}, \quad x \in \mathbb{R}^d .
\]
Proof. We already know that the statement is true for a standard Gaussian vector (cf. 1.10.1). On the other hand, by Theorem 1.10.5, $Y$ can be written as
\[
Y = m + BX
\]
where $X$ is $d$-dimensional, standard Gaussian and $B$ is a $d \times d$ matrix. The covariance of $m + BX$ is $BB^T$ and hence $\det C = (\det B)^2$. The theorem follows now immediately from the transformation formula in Corollary B.7.3.
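For $d = 2$ the density of Theorem 1.10.6 can be written out with the explicit inverse of a $2 \times 2$ matrix. The sketch below is an illustration, not the book's code; the names `gaussian_density_2d` and `total_mass` are hypothetical, and the mass is approximated by a midpoint Riemann sum on a square of half-width 10 around the mean (an assumption that is adequate only when the variances are small compared with 100).

```python
import math


def gaussian_density_2d(x, m, C):
    """Density of N(m, C) at x for a 2 x 2 covariance matrix C with det C > 0."""
    (a, b), (c, d) = C
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("the density exists only if det C > 0")
    u, v = x[0] - m[0], x[1] - m[1]
    # (C^{-1} w, w) with C^{-1} = [[d, -b], [-c, a]] / det and w = (u, v)
    quad = (d * u * u - (b + c) * u * v + a * v * v) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def total_mass(m, C, half_width=10.0, step=0.1):
    """Midpoint Riemann sum of the density over a square centred at m."""
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n):
        x = m[0] - half_width + (i + 0.5) * step
        for j in range(n):
            y = m[1] - half_width + (j + 0.5) * step
            total += gaussian_density_2d((x, y), m, C)
    return total * step * step
```

For example, `total_mass((0.5, -1.0), ((2.0, 0.6), (0.6, 1.0)))` is numerically indistinguishable from 1, and for the identity matrix the density at the mean equals $1/(2\pi)$.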

Theorem 1.10.7. Let $X_n$ be a $d$-dimensional Gaussian random vector with mean $m_n$ and covariance matrix $C_n = (c^n_{i,j})$, $n \in \mathbb{N}$. The sequence $\{X_n\}_1^\infty$ converges in law to some random vector $X$ if and only if the sequences $\{m_n\}_1^\infty$ and $\{c^n_{i,j}\}_{n=1}^\infty$ converge for all $i$ and $j$. In case of convergence, $X$ is Gaussian with mean $m = \lim_n m_n$ and covariance matrix $C = (\lim_n c^n_{i,j})$.

Proof. The characteristic function of $X_n$ is given by
\[
f_n(t) = e^{i(m_n,t) - \frac12 (C_n t,\,t)}, \quad t \in \mathbb{R}^d .
\]
If the sequences $\{m_n\}_1^\infty$ and $\{c^n_{i,j}\}_{n=1}^\infty$ converge, then
\[
\lim_{n \to \infty} f_n(t) = e^{i(m,t) - \frac12 (Ct,\,t)}, \quad t \in \mathbb{R}^d ,
\]
and convergence in law of the sequence $\{X_n\}_1^\infty$ follows from Theorem 1.6.3.
Assume now that $\{X_n\}_1^\infty$ converges in law. By Theorem 1.6.1, the sequence $\{f_n\}_1^\infty$ converges uniformly on every compact set. The same holds for the sequence $\{|f_n|\}_1^\infty$, hence the limit
\[
\lim_{n \to \infty} (C_n t, t), \quad t \in \mathbb{R}^d ,
\]
exists. By Lemma D.2.17, the sequence $\{C_n\}_1^\infty$ converges entrywise to some positive semidefinite matrix $C$. Considering now the sequence $\{f_n/|f_n|\}_1^\infty$ we see that $e^{i(m_n,t)}$ converges to some continuous positive definite function. The modulus of this function is 1 and therefore it has the form $e^{i(m,t)}$ with some $m$ (cf. Lemma 1.5.1). Thus, the one point measures $\delta_{m_n}$ converge weakly to $\delta_m$, from which $m_n \to m$ follows.

Theorem 1.10.8. If $X = (X_1, \dots, X_d)$ is Gaussian and the random variables $X_j$ are pairwise uncorrelated, then $X_1, \dots, X_d$ are (completely) independent.

Proof. Since the $X_j$'s are uncorrelated the covariance matrix $C$ of $X$ is diagonal: $C = \operatorname{diag}(\sigma_1^2, \dots, \sigma_d^2)$. Thus,
\[
f_X(t) = e^{i(EX,t) - \frac12 (Ct,\,t)} = \prod_{j=1}^{d} e^{i t_j E X_j - \frac12 \sigma_j^2 t_j^2}, \quad t \in \mathbb{R}^d ,
\]
and hence $X_1, \dots, X_d$ are independent (cf. Theorem 1.3.10).

Remark 1.10.9. There exist uncorrelated Gaussian random variables which are not independent. To see an example let
\[
h(x) = \sqrt{2} \cdot e^{-\frac12 x^2} - e^{-x^2}, \quad x \in \mathbb{R} ,
\]
and
\[
p(x,y) = \frac{1}{2\pi} \Bigl[ h(x) \cdot e^{-y^2} + h(y) \cdot e^{-x^2} \Bigr], \quad (x,y) \in \mathbb{R}^2 .
\]
It is easy to check that $p$ is a density and
\[
\int_{-\infty}^{\infty} p(x,y)\,dy = \frac{1}{\sqrt{2\pi}} \cdot e^{-\frac12 x^2}, \qquad
\int_{-\infty}^{\infty} p(x,y)\,dx = \frac{1}{\sqrt{2\pi}} \cdot e^{-\frac12 y^2} .
\]
Since $p$ is an even function in $x$ and in $y$ we have
\[
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\,p(x,y)\,dx\,dy = 0 .
\]
Now let $Z = (X, Y)$ be a two-dimensional random vector with density $p$. Note that $Z$ is not Gaussian. By what we have proved, $X$ and $Y$ are standard Gaussian and uncorrelated. Since $p$ is not the product of the marginal densities, they are not independent.
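The claims of this remark can be verified numerically. The sketch below assumes the readings $h(x) = \sqrt{2}\,e^{-x^2/2} - e^{-x^2}$ and $p(x,y) = \frac{1}{2\pi}[h(x)e^{-y^2} + h(y)e^{-x^2}]$ of the two formulas; these are the choices that make both marginals standard Gaussian. The helper `marginal_x` (a hypothetical name) integrates out $y$ with a midpoint rule on $[-10, 10]$.

```python
import math


def h(x: float) -> float:
    """h(x) = sqrt(2) * exp(-x^2 / 2) - exp(-x^2); nonnegative on the real line."""
    return math.sqrt(2.0) * math.exp(-0.5 * x * x) - math.exp(-x * x)


def p(x: float, y: float) -> float:
    """Joint density with standard Gaussian marginals that is not a product."""
    return (h(x) * math.exp(-y * y) + h(y) * math.exp(-x * x)) / (2.0 * math.pi)


def phi(x: float) -> float:
    """Standard Gaussian density on the real line."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def marginal_x(x: float, half_width: float = 10.0, steps: int = 4000) -> float:
    """Midpoint approximation of the integral of p(x, y) over y."""
    step = 2.0 * half_width / steps
    return step * sum(p(x, -half_width + (k + 0.5) * step) for k in range(steps))
```

Here `marginal_x(x)` reproduces `phi(x)` to high accuracy, whereas `p(0, 0)` equals $(\sqrt{2}-1)/\pi$, which differs from `phi(0) ** 2` $= 1/(2\pi)$; so the joint density is not the product of its marginals.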

Theorem 1.10.10. Let $X = (X_1, \dots, X_d)$ where $X_1, \dots, X_d$ are independent standard Gaussian random variables. Further, let $a, b \in \mathbb{R}^d \setminus \{0\}$ and $O$ be a $d \times d$ real matrix.$^{26}$
(i) The random variables
\[
Y_a = (a, X) = a_1 X_1 + \dots + a_d X_d
\]
and
\[
Y_b = (b, X) = b_1 X_1 + \dots + b_d X_d
\]
are independent if and only if $(a, b) = 0$.
(ii) The coordinates of the random vector $OX$ are independent if and only if $O$ is an orthogonal matrix.
Proof. The second statement follows immediately from the first one. To prove (i) we consider the 2-dimensional random vector $Y = (Y_a, Y_b)$. It follows from Theorem 1.10.4 that $Y$ is Gaussian. Applying Theorem 1.10.8 we obtain that $Y_a$ and $Y_b$ are independent if and only if
\[
0 = E(Y_a Y_b) = \sum_{j,k=1}^{d} a_j b_k E(X_j X_k) = (a, b) .
\]

1.11 Some inequalities

In the present section we mainly consider inequalities which are used in this book.$^{27}$

Theorem 1.11.1. Let $X$ be a $d$-dimensional random vector such that $\|X\| \le R$ with some $R > 0$ and $E(X) = 0$. Then the inequality
\[
|f_X(t)| \le 1 - \frac18 \cdot (t, \operatorname{cov}(X)\,t) \tag{1}
\]
holds for all $t \in \mathbb{R}^d$ with $\|t\| < \frac{3}{4R}$.

Proof. Assume first that $d = 1$ and denote by $\mu$ the distribution of $X$. Then
\[
f(t) = f_X(t) = \int_{-R}^{R} e^{itx}\,d\mu(x), \quad t \in \mathbb{R} .
\]

26 See also Theorem 3.5.7.


27 Further inequalities can be found in [57].

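Using (B.1.6.1), (B.1.6.2), and (B.1.6.3) it is easy to check that for $|s| < \frac34$
\[
0 \le \cos s \le 1 - \frac{s^2}{4} \quad \text{and} \quad \sin s = s + h(s) \ \text{ where } \ |h(s)| \le \frac{s^2}{8} .
\]
Since $E(X) = 0$, the inequalities above imply that
\[
0 \le \operatorname{Re} f(t) \le 1 - \frac14 t^2 E(X^2) \quad \text{and} \quad |\operatorname{Im} f(t)| \le \frac18 t^2 E(X^2), \quad |t| < \frac{3}{4R} ,
\]
and therefore, since $|f(t)| \le |\operatorname{Re} f(t)| + |\operatorname{Im} f(t)|$,
\[
|f(t)| \le 1 - \frac18 t^2 E(X^2) . \tag{2}
\]
Now let $d \in \mathbb{N}$ and $t \in \mathbb{R}^d \setminus \{0\}$ be arbitrary. Write $t_0 = t/\|t\|$ and denote by $g$ the characteristic function of the random variable $Y = (t_0, X)$. Then $|Y| \le \|X\| \le R$, $E(Y) = 0$,
\[
f(t) = E\bigl(e^{i(t,X)}\bigr) = E\bigl(e^{i\|t\|Y}\bigr) = g(\|t\|) ,
\]
and $E(Y^2) = (t_0, \operatorname{cov}(X)\,t_0) = (t, \operatorname{cov}(X)\,t)/\|t\|^2$. Application of inequality (2) to $Y$ completes the proof.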
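Inequality (1.11.1.1) can be tested on two simple one-dimensional distributions with $E(X) = 0$ and $|X| \le R$. This is a sketch under these assumptions (the function names are hypothetical): the symmetric two-point distribution on $\{-R, R\}$ has characteristic function $\cos(Rt)$ and variance $R^2$, and the uniform distribution on $[-R, R]$ has characteristic function $\sin(Rt)/(Rt)$ and variance $R^2/3$.

```python
import math


def cf_two_point(t: float, R: float) -> float:
    """Characteristic function of the uniform distribution on {-R, R}."""
    return math.cos(R * t)


def cf_uniform(t: float, R: float) -> float:
    """Characteristic function of the uniform distribution on [-R, R]."""
    return 1.0 if t == 0.0 else math.sin(R * t) / (R * t)


def bound(t: float, variance: float) -> float:
    """Right-hand side 1 - t^2 * Var(X) / 8 of the inequality for d = 1."""
    return 1.0 - t * t * variance / 8.0


def holds_on_grid(cf, variance: float, R: float, points: int = 1000) -> bool:
    """Check |cf(t)| <= bound(t) on an equidistant grid of 0 <= t < 3/(4R)."""
    t_max = 3.0 / (4.0 * R)
    grid = (t_max * k / points for k in range(points))
    return all(abs(cf(t, R)) <= bound(t, variance) + 1e-15 for t in grid)
```

Both `holds_on_grid(cf_two_point, R * R, R)` and `holds_on_grid(cf_uniform, R * R / 3, R)` return `True` for any positive `R`; by symmetry of the characteristic functions the same holds for negative `t`.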

Replacing $X$ by $X - E(X)$ changes neither the modulus of the characteristic function nor the covariance matrix, and $\|X - E(X)\| \le 2R$. Thus we have the following corollary.

Corollary 1.11.2. If $\|X\| \le R$ then the inequality (1.11.1.1) holds for all $t \in \mathbb{R}^d$ with $\|t\| < \frac{3}{8R}$.

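Lemma 1.11.3. Let $X$ be a $d$-dimensional random vector and define the random vector $X_R$ $(R > 0)$ by
\[
X_R(\omega) =
\begin{cases}
X(\omega) & \text{if } \|X(\omega)\| \le R \\
0 & \text{if } \|X(\omega)\| > R
\end{cases}
\qquad \omega \in \Omega .
\]
Then the corresponding characteristic functions $f$ and $f_R$ satisfy the inequality
\[
|\operatorname{Re} f(t)| \le \operatorname{Re} f_R(t)
\]
for all $t \in \mathbb{R}^d$ with $\|t\| \le \frac{\pi}{2R}$.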

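Proof. Denote by $\mu$ and $\mu_R$ the corresponding distributions and write
\[
B = \{x \in \mathbb{R}^d : \|x\| \le R\} .
\]
Note that
\[
\mu_R = \mu_B + (1 - \mu(B)) \cdot \delta_0
\]
where $\mu_B$ is given by $\mu_B(A) = \mu(B \cap A)$, $A \in \mathcal{B}^d$. If $\|t\| \le \frac{\pi}{2R}$ and $\|x\| \le R$, then $\cos((t,x)) \ge 0$. Using this we obtain:
\[
\begin{aligned}
|\operatorname{Re} f(t)| &= \Bigl| \int_{\mathbb{R}^d} \cos((t,x))\,d\mu(x) \Bigr| \\
&\le \Bigl| \int_{B} \cos((t,x))\,d\mu(x) \Bigr| + \Bigl| \int_{\mathbb{R}^d \setminus B} \cos((t,x))\,d\mu(x) \Bigr| \\
&= \int_{B} \cos((t,x))\,d\mu(x) + \Bigl| \int_{\mathbb{R}^d \setminus B} \cos((t,x))\,d\mu(x) \Bigr| \\
&\le \int_{B} \cos((t,x))\,d\mu(x) + \mu(\mathbb{R}^d \setminus B) \\
&= \int_{\mathbb{R}^d} \cos((t,x))\,d\mu_R(x) = \operatorname{Re} f_R(t) .
\end{aligned}
\]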

Theorem 1.11.4. Let $X$ be a $d$-dimensional random vector such that the support of its distribution is not contained in some hyperplane of $\mathbb{R}^d$. Then there exist positive numbers $\delta$ and $\varepsilon$ such that the inequality
\[
|f_X(t)| \le 1 - \varepsilon \|t\|^2 \tag{1}
\]
holds for all $t \in \mathbb{R}^d$ with $\|t\| < \delta$.$^{28}$

Proof. We may suppose that $f := f_X$ is real-valued. Indeed, if the statement is true for real-valued characteristic functions, then for an arbitrary characteristic function $f$ we have
\[
|f(t)|^2 \le 1 - \varepsilon \|t\|^2, \quad \|t\| \le \delta ,
\]
for some positive $\delta$ and $\varepsilon$, which yields
\[
|f(t)| \le 1 - \frac{\varepsilon}{2} \|t\|^2, \quad \|t\| \le \delta .
\]
We choose $R > 0$ such that the random variable $X_R$ from Lemma 1.11.3 is not concentrated on a hyperplane.$^{29}$ Then the covariance matrix of $X_R$ is strictly positive definite and hence
\[
(t, \operatorname{cov}(X_R)\,t) \ge \lambda \cdot \|t\|^2, \quad t \in \mathbb{R}^d ,
\]
where $\lambda$ $(> 0)$ is the smallest eigenvalue of $\operatorname{cov}(X_R)$. The theorem now follows from Corollary 1.11.2 and Lemma 1.11.3, using the fact that $f$ and $f_R$ are real valued.

28 See also Lemma 1.2.8.


29 For this it is sufficient to choose $R$ such that $\{x \in \operatorname{supp} \mu : \|x\| \le R\}$ has a positive Lebesgue measure.

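Theorem 1.11.5. Let $n \in \mathbb{N}$ and $f$ be a real-valued $2n$-times differentiable characteristic function on $\mathbb{R}$. Then we have
\[
\sum_{k=0}^{2\lfloor \frac{n-1}{2} \rfloor + 1} (-1)^k \frac{M_{2k}}{(2k)!}\, t^{2k} \;\le\; f(t) \;\le\; \sum_{k=0}^{2\lfloor \frac{n}{2} \rfloor} (-1)^k \frac{M_{2k}}{(2k)!}\, t^{2k}, \quad t \in \mathbb{R} ,
\]
where $M_{2k}$ is the moment of order $2k$ of the corresponding distribution.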

Proof. From Corollary 1.2.3 and inequality (1.1.2.ii) we see that
\[
-M_{2k} \le f^{(2k)}(t) \le M_{2k}, \quad t \in \mathbb{R}, \ k = 0, 1, \dots, n .
\]
Since $f$ is an even function, all derivatives of odd order are zero at 0. Applying the upper estimate to $f^{(4\lfloor n/2 \rfloor)}(\theta t)$ in the Taylor expansion
\[
f(t) = \sum_{k=0}^{2\lfloor \frac{n}{2} \rfloor - 1} (-1)^k \frac{M_{2k}}{(2k)!}\, t^{2k} + \frac{t^{4\lfloor n/2 \rfloor}}{(4\lfloor \frac{n}{2} \rfloor)!} \cdot f^{(4\lfloor n/2 \rfloor)}(\theta t), \quad t \in \mathbb{R}, \ \theta \in [0,1] ,
\]
we obtain the upper estimate for $f(t)$. The lower estimate can be obtained in the same way.

As an application of Corollary 1.7.9 we prove two useful inequalities for positive definite functions.

Theorem 1.11.6. Let $f$ be a positive definite function on a commutative group $G$ such that $f(0) = 1$. Then for all $x \in G$ and $n \in \mathbb{N}$ we have
(i) $1 - \operatorname{Re} f(nx) \le n\,[1 - (\operatorname{Re} f(x))^n]$;
(ii) $1 - |f(nx)| \le n\,[1 - |f(x)|^n]$.

Proof. For each fixed $x \in G$ the function $n \mapsto f(nx)$ is positive definite on $\mathbb{Z}$. We may therefore assume that $G = \mathbb{Z}$ and $x = 1$. Thus, we have to show that
\[
1 - \operatorname{Re} f(n) \le n\,[1 - (\operatorname{Re} f(1))^n] \tag{1}
\]
\[
1 - |f(n)| \le n\,[1 - |f(1)|^n] . \tag{2}
\]
To prove the first inequality we define the set $K_n \subset \mathbb{R}^2$ by
\[
K_n := \{(x,y) : |x| \le 1, \ |y| \le 1, \ n x^n + 1 - n \le y\} .
\]
Inequality (1) is then equivalent to the relation $(\operatorname{Re} f(1), \operatorname{Re} f(n)) \in K_n$. The set $K_n$ is convex for all $n \in \mathbb{N}$. Using this and Corollary 1.7.9 we see that it is sufficient to show (1) for the positive definite functions $n \mapsto z^n$, $z \in \mathbb{T}$. Thus, it remains to prove that
\[
1 - \operatorname{Re}(z^n) \le n\,[1 - (\operatorname{Re} z)^n], \quad z \in \mathbb{T} ,
\]
or equivalently,
\[
h_n(t) := h(t) := n \cos^n t - \cos nt \le n - 1, \quad t \in \mathbb{R} \tag{3}
\]
(see Figure 1.7). The function $h$ is periodic and hence it attains its maximum at some $y \in \mathbb{R}$. From
\[
h'(y) = -n^2 \cos^{n-1} y \,\sin y + n \sin ny = 0
\]
we see that either $\sin y = 0$ or
\[
\sin y \ne 0 \quad \text{and} \quad n \cos^{n-1} y = \frac{\sin ny}{\sin y} . \tag{4}
\]
If $\sin y = 0$, then (3) holds. If $y$ satisfies (4), then
\[
h(y) = \frac{\cos y \,\sin ny}{\sin y} - \cos ny = \frac{\cos y \,\sin ny - \sin y \,\cos ny}{\sin y} = \frac{\sin (n-1)y}{\sin y} .
\]
Using induction on $n$ we see that $\sin (n-1)y / \sin y \le n - 1$, $n \in \mathbb{N}$, and therefore $h(y) \le n - 1$.
To prove inequality (2), let $z \in \mathbb{T}$ be such that $\operatorname{Re}(z f(1)) = |f(1)|$. Since the function $n \mapsto z^n f(n)$ is positive definite on $\mathbb{Z}$, inequality (1) shows that
\[
n |f(1)|^n + 1 - n = n (\operatorname{Re} z f(1))^n + 1 - n \le \operatorname{Re}(z^n f(n)) \le |f(n)| ,
\]
from which (2) follows.

Figure 1.7. The function $h_n$ from the proof of Theorem 1.11.6 with $n = 5$.

Using the inequality
\[
1 - r^n \le n(1 - r), \quad -1 \le r \le 1 ,
\]
we obtain the following corollary:

Corollary 1.11.7. With the notation of the previous theorem we have
(i) $1 - \operatorname{Re} f(nx) \le n^2\,[1 - \operatorname{Re} f(x)]$;
(ii) $1 - |f(nx)| \le n^2\,[1 - |f(x)|]$.
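Both sets of inequalities can be probed numerically for concrete positive definite functions on $\mathbb{R}$ with $f(0) = 1$. The sketch below is only an illustration with hypothetical helper names; it assumes a real-valued $f$, so that $\operatorname{Re} f = f$, and reports the smallest slack found on a grid. A nonnegative slack (up to rounding) is consistent with Theorem 1.11.6(i) and Corollary 1.11.7(i).

```python
import math


def theorem_gap(re_fx: float, re_fnx: float, n: int) -> float:
    """Slack n[1 - (Re f(x))^n] - [1 - Re f(nx)] in Theorem 1.11.6(i)."""
    return n * (1.0 - re_fx ** n) - (1.0 - re_fnx)


def corollary_gap(re_fx: float, re_fnx: float, n: int) -> float:
    """Slack n^2 [1 - Re f(x)] - [1 - Re f(nx)] in Corollary 1.11.7(i)."""
    return n * n * (1.0 - re_fx) - (1.0 - re_fnx)


def smallest_gaps(f, xs, ns):
    """Smallest slack of both inequalities over all grid points x and all n."""
    thm = min(theorem_gap(f(x), f(n * x), n) for x in xs for n in ns)
    cor = min(corollary_gap(f(x), f(n * x), n) for x in xs for n in ns)
    return thm, cor
```

Typical choices are `f = math.cos` (the characteristic function of the symmetric two-point distribution on $\{-1, 1\}$) and `f = lambda x: math.exp(-0.5 * x * x)` (the standard Gaussian characteristic function). For `math.cos` the theorem's inequality is exactly inequality (3) of the proof.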
Chapter 2

Correlation functions

In the first section of this chapter we present a few basic results on processes with
independent increments which can be obtained easily by using characteristic functions.
Then we concentrate on those properties of second order fields which are related to
the fact that their correlation functions represent positive semidefinite kernels. In the
second part of the chapter we investigate the special case of stationary fields. At the
end of the chapter we give an introduction to the harmonic analysis of stationary fields
using unitary representations built from the correlation function.
Throughout this chapter the symbol $T$ denotes a nonempty subset of $\mathbb{R}^d$ with some $d \in \mathbb{N}$ and $(\Omega, \mathcal{A}, P)$ is a probability space.

2.1 Random fields

Basic notions 2.1.1. A real (complex) random field $Z$ is a mapping from $T$ into the set of all real (complex, respectively) random variables on $(\Omega, \mathcal{A}, P)$. The random variable $Z(t)$, $t \in T$, is also written as $Z_t$. If $Z(t)$ is square integrable for all $t \in T$, then $Z$ is called a second order random field. In the special case $T = \mathbb{R}$ or $T = [0, \infty)$ we also use the terminology random process or simply process instead of random field. If $T = \mathbb{Z}$ or $T = \mathbb{N}_0$, then $Z$ is usually called a time series.
For any finite subset $\{t_1, \dots, t_n\}$ of $T$ the distribution of the random vector $X = (Z(t_1), \dots, Z(t_n))$ is called a finite-dimensional distribution of $Z$. If $Z$ is real and all finite-dimensional distributions are Gaussian (see Definition 1.10.2), then $Z$ is said to be a Gaussian field. Note that $Z$ is Gaussian if and only if the characteristic function $f_X$ of $X$ is given by
\[
f_X(s) = e^{i(s, E(X)) - \frac12 (Cs,\,s)}, \quad s = (s_1, \dots, s_n) \in \mathbb{R}^n ,
\]
where $C$ is the covariance matrix of $X$. If $Z$ is complex and all random vectors
\[
(\operatorname{Re} Z(t_1), \operatorname{Im} Z(t_1), \dots, \operatorname{Re} Z(t_n), \operatorname{Im} Z(t_n))
\]
are Gaussian, then $Z$ is called a complex Gaussian field.
A process $Z$ on $T = \mathbb{R}$ or $T = [0, \infty)$ is said to have independent increments if the random variables
\[
Z(t_2) - Z(t_1), \ Z(t_3) - Z(t_2), \ \dots, \ Z(t_n) - Z(t_{n-1}) ,
\]
the so-called increments of $Z$, are independent for every $n \ge 3$ and every $t \in \mathbb{R}^n$ such that $t_1 < \dots < t_n$ and $t_j \in T$.
Next we prove the existence of processes having special properties. The proof uses
Kolmogorov’s existence theorem formulated in terms of characteristic functions (cf.
Remark F.3.4). We need the following simple lemma.

Lemma 2.1.2. A real process $Z$ on $T = [0, \infty)$ with $Z(0) = 0$ has independent increments if and only if for every $d \ge 1$ and for every $t \in \mathbb{R}^d$ such that $0 \le t_1 < \dots < t_d$ the characteristic function $f_t$ of the random vector $(Z(t_1), \dots, Z(t_d))$ satisfies the equation
\[
f_t(x) = g_1(x_1 + \dots + x_d) \cdot g_2(x_2 + \dots + x_d) \cdot \ldots \cdot g_d(x_d), \quad x \in \mathbb{R}^d , \tag{1}
\]
where $g_j$, $j \ge 1$, denotes the characteristic function of the random variable $\xi_j = Z(t_j) - Z(t_{j-1})$, and $t_0 = 0$.

Proof. Denote by $g$ the characteristic function of the random vector $(\xi_1, \dots, \xi_d)$. We have
\[
\begin{aligned}
f_t(x) &= E\bigl(\exp[i x_1 Z(t_1) + i x_2 Z(t_2) + \dots + i x_d Z(t_d)]\bigr) \\
&= E\bigl(\exp[i x_1 \xi_1 + i x_2 (\xi_1 + \xi_2) + \dots + i x_d (\xi_1 + \dots + \xi_d)]\bigr) \\
&= E\bigl(\exp[i (x_1 + \dots + x_d)\xi_1 + i (x_2 + \dots + x_d)\xi_2 + \dots + i x_d \xi_d]\bigr) \\
&= g(x_1 + \dots + x_d,\ x_2 + \dots + x_d,\ \dots,\ x_d) .
\end{aligned}
\]
These equations show that (1) holds whenever the random variables $\xi_j$ are independent. Assume now that (1) holds. The equations above show that
\[
g(y) = g_1(y_1) \cdot \ldots \cdot g_d(y_d)
\]
where $y_j = x_j + \dots + x_d$, $1 \le j \le d$. Since any $y \in \mathbb{R}^d$ can be represented in this way with some $x \in \mathbb{R}^d$, the independence of the $\xi_j$'s follows from Theorem 1.3.10.

Theorem 2.1.3. Let $h$ be a complex-valued function on $\mathbb{R}$ such that $e^{ph}$ is a characteristic function for all $p \in [0, \infty)$.$^{1}$ There exists a process $Z$ on $[0, \infty)$ satisfying the conditions
(i) $Z(0) = 0$;
(ii) $Z$ has independent increments;
(iii) the characteristic function of $Z(t) - Z(s)$ is $e^{(t-s)h}$ for all $t \ge s$.
1 We will investigate functions with this property in Section 3.11.



Proof. Let $d \ge 1$ and $t_0 = 0$. By assumption, the function
\[
f_t(x) = \exp\Bigl[\, \sum_{j=1}^{d} (t_j - t_{j-1})\, h(x_j + \dots + x_d) \Bigr], \quad x \in \mathbb{R}^d ,
\]
is a characteristic function for all $t \in [0, \infty)^d$ such that $t_1 \le \dots \le t_d$. For arbitrary $t \in [0, \infty)^d$ we choose a permutation $\pi$ such that $t_{\pi(1)} \le \dots \le t_{\pi(d)}$ and define $f_t$ by the consistency condition (F.3.4.i). Then condition (F.3.4.ii) is satisfied as well. Consequently, there exists a process $Z$ on $[0, \infty)$ such that the finite-dimensional distributions of $Z$ have the $f_t$'s as characteristic functions. Obviously, $Z(0) = 0$. The equation above with $d = 2$ and $x_2 = -x_1$ shows that the characteristic function of $Z(t) - Z(s)$ is $e^{(t-s)h}$ if $t \ge s$ (cf. Theorem 1.1.7). By Lemma 2.1.2, the process $Z$ has independent increments.

As a corollary we establish the existence of two important processes, the Wiener process and the Poisson process.$^{2}$ The next two statements follow immediately from Theorem 2.1.3 by noting that the characteristic function of a Gaussian random variable with mean zero and variance $\sigma^2$ is $t \mapsto e^{-\frac12 \sigma^2 t^2}$ while the characteristic function of a Poisson distributed random variable with mean $\lambda \ge 0$ is $t \mapsto e^{\lambda(e^{it} - 1)}$ (see Examples 1.1.13 and 1.2.4).

Corollary 2.1.4 (Wiener process). There exists a process $W$ on $[0, \infty)$ satisfying the conditions
(i) $W(0) = 0$;
(ii) $W$ has independent increments;
(iii) $W(t) - W(s)$ has a Gaussian distribution with mean zero and variance $t - s$ for all $t \ge s$.

Corollary 2.1.5 (Poisson process). For each $\lambda \ge 0$ there exists a process $N$ on $[0, \infty)$ satisfying the conditions
(i) $N(0) = 0$;
(ii) $N$ has independent increments;
(iii) $N(t) - N(s)$ has a Poisson distribution with mean $\lambda(t - s)$ for all $t \ge s$.
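For the Wiener process the construction of Theorem 2.1.3 uses $h(u) = -\frac12 u^2$. The sketch below, an illustration with hypothetical function names, evaluates the resulting finite-dimensional characteristic function in two ways: from the increment formula $\exp\bigl[\sum_j (t_j - t_{j-1})\,h(x_j + \dots + x_d)\bigr]$ and from the Gaussian formula with covariance matrix $(\min(t_i, t_j))$. The two agree for ordered time points $0 \le t_1 \le \dots \le t_d$, which is assumed here.

```python
import math


def cf_from_increments(ts, xs):
    """exp(sum_j (t_j - t_{j-1}) * h(x_j + ... + x_d)) with h(u) = -u^2/2 and t_0 = 0.

    Assumes ts is sorted in ascending order and nonnegative.
    """
    total, previous = 0.0, 0.0
    for j, t in enumerate(ts):
        tail = sum(xs[j:])
        total += (t - previous) * (-0.5 * tail * tail)
        previous = t
    return math.exp(total)


def cf_from_covariance(ts, xs):
    """exp(-(Cx, x)/2) with covariance matrix C = (min(t_i, t_j))."""
    quad = sum(min(s, t) * x * y for s, x in zip(ts, xs) for t, y in zip(ts, xs))
    return math.exp(-0.5 * quad)
```

The agreement reflects the identity $\sum_j (t_j - t_{j-1}) (x_j + \dots + x_d)^2 = \sum_{k,l} \min(t_k, t_l)\, x_k x_l$, obtained by expanding the squares and summing the telescoping coefficients.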

2 Usually one also imposes conditions on the trajectories $t \mapsto Z_t(\omega)$ of these processes but we do not consider these questions in this book.

2.2 Correlation functions of second order random fields


To give a complete description of a random field it is necessary to specify all of its
finite-dimensional distributions. Since the theoretical determination of these distribu-
tions is possible only in a few cases, it is natural to restrict oneself to investigating
properties of the field which are determined by simple characteristics of the finite-
dimensional distributions. The next definition uses only the first and second moments
of the field.

Definition 2.2.1. Let $Z$ be a second order field on $T$. The function $C : T \times T \to \mathbb{C}$ defined by
\[
C(s,t) = E\bigl(Z(s) \cdot \overline{Z(t)}\bigr)
\]
is called the correlation function of $Z$. The covariance function is defined by
\[
\sigma(s,t) = E\bigl([Z(s) - M(s)] \cdot \overline{[Z(t) - M(t)]}\bigr)
\]
where
\[
M(t) = E(Z(t))
\]
is the mean of the field.

An easy computation shows that
\[
\sigma(s,t) = C(s,t) - M(s) \cdot \overline{M(t)} .
\]

If the mean is zero, the correlation function and the covariance function are equal.
Note that all finite-dimensional distributions of a Gaussian field are completely determined if we know its mean and correlation function.

Example 2.2.2. (a) Let $X \in L^2(\Omega, \mathcal{A}, P)$ and $f : T \to \mathbb{C}$ be arbitrary. Then
\[
Z(t) = f(t) \cdot X, \quad t \in T ,
\]
is a second order field with mean $M(t) = f(t) \cdot E(X)$ and correlation function
\[
C(s,t) = f(s)\,\overline{f(t)} \cdot E(|X|^2) .
\]
(b) If $Z_0 = 0$ and $Z$ has independent increments, then for $0 \le s \le t$ we have
\[
\begin{aligned}
C(s,t) = E(Z_s Z_t) &= E(Z_s (Z_t - Z_s)) + E(Z_s^2) \\
&= E(Z_s) \cdot E(Z_t - Z_s) + E(Z_s^2) \\
&= E(Z_s) \cdot E(Z_t) + \operatorname{Var}(Z_s)
\end{aligned}
\]
and hence
\[
\sigma(s,t) = \operatorname{Var}(Z_s), \quad s \le t .
\]
In particular, the covariance function of the Wiener process and of the Poisson process with $\lambda = 1$ is given by$^{3}$
\[
\sigma(s,t) = \min(s,t), \quad t, s \in [0, \infty) .
\]

The next theorem gives a characterization of correlation functions.

Theorem 2.2.3. A complex-valued (real-valued) function $K$ defined on $T \times T$ is the covariance or correlation function of a second order complex (real, respectively) field if and only if for any finite collection $t_1, \dots, t_n$ of elements of $T$ the matrix
\[
\bigl(K(t_i, t_j)\bigr)_{i,j=1}^{n}
\]
is positive semidefinite (and symmetric, respectively).

Proof. Assume that $K$ is the covariance function of a complex field $Z$. The case where $K$ is the correlation function can be treated in the same way. We have
\[
\begin{aligned}
\sum_{i,j=1}^{n} K(t_i, t_j)\, c_i \overline{c_j}
&= \sum_{i,j=1}^{n} E\Bigl[ (Z(t_i) - M(t_i)) \cdot \overline{(Z(t_j) - M(t_j))} \Bigr] c_i \overline{c_j} \\
&= E\,\Bigl| \sum_{j=1}^{n} c_j (Z(t_j) - M(t_j)) \Bigr|^2 \ge 0
\end{aligned}
\]
for all $c_1, \dots, c_n \in \mathbb{C}$, i.e., $K$ is positive semidefinite. If the field is real, then $K$ is symmetric and the relations above hold for all $c_1, \dots, c_n \in \mathbb{R}$. Theorem D.2.4 shows that $K$ is positive semidefinite.
The remaining statements follow from Theorem 2.2.4.

Theorem 2.2.4. Let $M : T \to \mathbb{C}$ be arbitrary and let $K : T \times T \to \mathbb{C}$ be such that for any finite collection $t_1, \dots, t_n$ of elements of $T$ the matrix
\[
\bigl(K(t_i, t_j)\bigr)_{i,j=1}^{n} \tag{1}
\]
is positive semidefinite. Then there exists a Gaussian field $Z$ on $T$ such that
\[
E(Z(t)) = M(t) \tag{2}
\]
\[
E\bigl(Z(s) \cdot \overline{Z(t)}\bigr) - M(s) \cdot \overline{M(t)} = K(s,t) \tag{3}
\]
\[
E\bigl(Z(s) \cdot Z(t)\bigr) = M(s) \cdot M(t) . \tag{4}
\]
If $M$ and $K$ are real-valued and $K$ is symmetric, then there exists a real Gaussian field $Z$ such that (2) and (3) hold.$^{4}$

3 Combining this with Theorem 2.2.3 we obtain an alternative, though not elementary, proof of the positive definiteness of the kernel $(s,t) \mapsto \min(s,t)$ (cf. Theorem D.2.10).

Proof. Assume first that $M$ and $K$ are real-valued and let $\{t_1, \dots, t_n\} \subset T$ be arbitrary. Since $K$ is positive semidefinite there exists a unique Gaussian distribution on $\mathbb{R}^n$ with mean vector $(M(t_1), \dots, M(t_n))$ and covariance matrix (1). The collection of all these distributions satisfies the conditions of Kolmogorov's consistency Theorem F.3.3. Hence there exists a real Gaussian field satisfying (2) and (3).
Now assume that $M$ and $K$ are complex-valued and let $\tilde T := T \times \{1, 2\}$. Further, let $\tilde M(t,1) = \operatorname{Re} M(t)$, $\tilde M(t,2) = \operatorname{Im} M(t)$ and define the real kernel $\tilde K$ on $\tilde T \times \tilde T$ by setting
\[
\tilde K((s,1),(t,1)) = \tilde K((s,2),(t,2)) = \frac12 \operatorname{Re} K(s,t)
\]
and
\[
\tilde K((s,1),(t,2)) = -\tilde K((s,2),(t,1)) = -\frac12 \operatorname{Im} K(s,t), \quad s, t \in T .
\]
It follows from Lemma D.2.14 that $\tilde K$ is positive semidefinite. By the first part of the proof there exists a real field $\tilde Z$ on $\tilde T$ such that the analogues of (2) and (3) are satisfied for $\tilde M$, $\tilde K$ and $\tilde Z$. Setting now
\[
Z(t) := \tilde Z(t,1) + i \tilde Z(t,2), \quad t \in T ,
\]
an easy computation shows that $Z$ satisfies the equations (2)–(4).

Example 2.2.5. By Lemma D.2.10 and Theorem 2.2.4, there exists a real Gaussian process $X$ on $T = [0, \infty)$ such that $E(X(t)) = 0$ and
\[
E(X(t) X(s)) = \min(t,s), \quad t, s \in [0, \infty) . \tag{1}
\]
Let $t_1 \le t_2 \le s_1 \le s_2$. Using (1) we obtain
\[
\begin{aligned}
E\bigl([X(t_2) - X(t_1)] \cdot [X(s_2) - X(s_1)]\bigr) &= C(t_2, s_2) - C(t_2, s_1) - C(t_1, s_2) + C(t_1, s_1) \\
&= t_2 - t_2 - t_1 + t_1 \\
&= 0 ,
\end{aligned}
\]
i.e., the increments $X(t_2) - X(t_1)$ and $X(s_2) - X(s_1)$ are uncorrelated. Since the increments of a Gaussian process are Gaussian, we infer that they are independent (cf. Theorem 1.10.8). Thus, we obtain the Wiener process constructed in Corollary 2.1.4.
Now let $h \in [0, \infty)$ be arbitrary and define the process $X^h$ by
\[
X^h(t) = X(t + h) - X(t), \quad t \in [0, \infty) .
\]

4 If a real field satisfies (2) and (4), then $E([Z(t) - M(t)]^2) = 0$ and hence $Z(t) = M(t)$.

Denote by $C^h$ the correlation function of this process. The mean of $X^h$ is zero and for $s \le t$ we have
\[
\begin{aligned}
C^h(s,t) &= E\bigl[(X(s+h) - X(s)) \cdot (X(t+h) - X(t))\bigr] \\
&= C(s+h, t+h) - C(s+h, t) - C(s, t+h) + C(s,t) \\
&= s + h - \min(s+h, t) - s + s \\
&= \max(0, h - t + s) .
\end{aligned}
\]
In the same way we see that $C^h(s,t) = \max(0, h + t - s)$, $s \ge t$, and hence
\[
C^h(s,t) = \max(0, h - |s - t|) .
\]
Therefore $C^h(s,t)$ depends only on $s - t$, a property that we will call stationarity (cf. Definition 2.8.1).
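The closed form for $C^h$ can be cross-checked against the four-term expression built from $C(s,t) = \min(s,t)$. The sketch below is illustrative and uses hypothetical function names.

```python
def wiener_cov(s: float, t: float) -> float:
    """Correlation function C(s, t) = min(s, t) of the process X."""
    return min(s, t)


def increment_corr(s: float, t: float, h: float) -> float:
    """Correlation function of X^h(t) = X(t + h) - X(t), expanded by bilinearity."""
    C = wiener_cov
    return C(s + h, t + h) - C(s + h, t) - C(s, t + h) + C(s, t)


def closed_form(s: float, t: float, h: float) -> float:
    """max(0, h - |s - t|), the stationary form derived in the example."""
    return max(0.0, h - abs(s - t))
```

For instance, with `s = 1.0`, `t = 1.3` and `h = 0.5` both functions give 0.2 up to rounding, and for `abs(s - t) >= h` both vanish, which reflects the independence of increments over disjoint intervals.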

Notation 2.2.6. Recall that the set $\mathcal{L}^2(P) := \mathcal{L}^2(\Omega, \mathcal{A}, P)$ of all square integrable complex-valued random variables is a linear space with the usual notion of multiplication by a scalar and addition of random variables. This linear space endowed with the inner product
\[
(X, Y) := \int_\Omega X \cdot \overline{Y}\,dP = E(X \cdot \overline{Y})
\]
is a positive semidefinite inner product space. The inner product space $\mathcal{L}^2_r(P)$ of all square integrable real-valued random variables on $(\Omega, \mathcal{A}, P)$ is introduced in the same way. We denote the corresponding Hilbert spaces by $L^2(P)$ and $L^2_r(P)$, the elements of which are equivalence classes of random variables. In all these cases we will use the notation $\|\cdot\|$ for the norm:
\[
\|X\| = \sqrt{(X, X)} .
\]
We have by the Cauchy–Schwarz inequality
\[
|E(X \cdot \overline{Y})| \le \|X\| \cdot \|Y\| .
\]
Note that the expectation can be expressed using the inner product:
\[
E(X) = (X, 1_\Omega) .
\]
Two square integrable random variables with zero mean are orthogonal if and only if they are uncorrelated.
For a field $Z$ of second order on $T$ we define the mapping $\mathbf{Z} : T \to L^2(P)$ by $\mathbf{Z}_t = Z_t + N$ where $N$ is the class of random variables which are zero $P$-almost everywhere. We call $\mathbf{Z}$ the $L^2$-valued random field corresponding to $Z$. The notions mean, covariance function, correlation function and Gaussian field are defined for $\mathbf{Z}$ in the same way as for $Z$. By $H(Z)$ and $H(\mathbf{Z})$ we denote the closed linear subspaces generated by all $Z_t$ and $\mathbf{Z}_t$, $t \in T$, respectively.
Let $\{e_\alpha\}_{\alpha \in I}$ be an orthonormal basis of $H(\mathbf{Z})$. The index set $I$ may be finite, countably infinite or uncountable.$^{5}$ In any case
\[
\mathbf{Z}_t = \sum_{\alpha \in I} f_\alpha(t) \cdot e_\alpha, \quad t \in T , \tag{1}
\]
where $f_\alpha(t) = (\mathbf{Z}_t, e_\alpha)$ (for each $t$ only countably many summands are different from zero). Moreover,
\[
\|Z_t\|^2 = \|\mathbf{Z}_t\|^2 = \sum_{\alpha \in I} |f_\alpha(t)|^2 . \tag{2}
\]

Theorem 2.2.7. A complex-valued (real-valued) function $K$ on $T \times T$ is the correlation function or covariance function of a complex (real, respectively) field of second order if and only if it can be represented in the form

$$K(s, t) = \sum_{\alpha \in I} f_\alpha(s) \overline{f_\alpha(t)}, \quad s, t \in T \tag{1}$$

where $I$ is some nonempty set and $f_\alpha$, $\alpha \in I$, are complex-valued (real-valued, respectively) functions on $T$ such that

$$\sum_{\alpha \in I} |f_\alpha(t)|^2 < \infty, \quad t \in T. \tag{2}$$

Proof. We consider only correlation functions and the complex case; the rest is similar. Assume that (1) and (2) hold. Then $K$ is positive semidefinite:

$$\sum_{j,k=1}^{n} K(t_j, t_k) c_j \overline{c_k} = \sum_{\alpha \in I} \Big| \sum_{j=1}^{n} f_\alpha(t_j) c_j \Big|^2 \ge 0.$$

Theorem 2.2.3 shows that $K$ is a correlation function.

Now let $K$ be the correlation function of a field $Z : T \to L^2(\Omega, \mathcal{A}, P)$. The relations (1) and (2) follow immediately from (2.2.6.1) and (2.2.6.2) by noting that $(Z_s, Z_t) = (\mathbf{Z}_s, \mathbf{Z}_t)$.
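The "if" direction of Theorem 2.2.7 can be checked numerically. The following is a minimal sketch, assuming three arbitrarily chosen feature functions (the list `features`, the sample points and the coefficients are illustrative and not taken from the text): it builds $K(s,t) = \sum_\alpha f_\alpha(s)\overline{f_\alpha(t)}$ and compares the quadratic form with the sum of squared moduli used in the proof.

```python
import cmath

# Hypothetical feature functions f_alpha on T = R (illustrative choices only).
features = [
    lambda t: cmath.exp(1j * t),
    lambda t: complex(t, 0.0),
    lambda t: complex(1.0, t * t),
]


def kernel(s, t):
    """K(s, t) = sum over alpha of f_alpha(s) * conj(f_alpha(t))."""
    return sum(f(s) * f(t).conjugate() for f in features)


def quadratic_form(points, coeffs):
    """sum over j, k of K(t_j, t_k) * c_j * conj(c_k)."""
    return sum(
        kernel(tj, tk) * cj * ck.conjugate()
        for tj, cj in zip(points, coeffs)
        for tk, ck in zip(points, coeffs)
    )


def sum_of_squares(points, coeffs):
    """sum over alpha of |sum over j of f_alpha(t_j) * c_j|^2."""
    return sum(
        abs(sum(f(tj) * cj for tj, cj in zip(points, coeffs))) ** 2
        for f in features
    )


points = [-1.0, 0.25, 0.5, 2.0]
coeffs = [1 + 2j, -0.5j, 3.0 + 0j, -1 + 1j]
lhs = quadratic_form(points, coeffs)
rhs = sum_of_squares(points, coeffs)
```

Up to rounding, `lhs` and `rhs` agree, and `lhs` is real and nonnegative for any choice of points and coefficients, which is exactly positive semidefiniteness of the kernel.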

5 To give an example where $I$ is uncountable, we take independent random variables $Z_t$, $t \in \mathbb{R}$, with mean 0 and variance 1. Then $\{\mathbf{Z}_t\}_{t \in \mathbb{R}}$ is an orthonormal basis of $H(\mathbf{Z})$. The fact that such random variables exist follows from Corollary F.3.2.

2.3 Continuity and differentiability


As we will see, continuity and differentiability of a second order field can be com-
pletely described in terms of its correlation function.

Definition 2.3.1. A second order field $Z$ is called strongly continuous at $t_0 \in T$ if

$$\lim_{t \to t_0} E\big(|Z(t) - Z(t_0)|^2\big) = 0.$$

The field $Z$ is called strongly continuous if it is strongly continuous at each $t_0 \in T$.

Theorem 2.3.2. A second order field $Z$ on $T$ is strongly continuous at $t \in T$ if and only if its correlation function $C$ is continuous at $(t, t)$. Moreover, if for some $t, s \in T$ the function $C$ is continuous at $(t, t)$ and $(s, s)$, then it is continuous at $(t, s)$ and $(s, t)$ as well.

Proof. For any $h \in T$ such that $t + h \in T$ we have

$$\|Z(t + h) - Z(t)\|^2 = C(t + h, t + h) - C(t + h, t) - C(t, t + h) + C(t, t).$$

Thus, if $C$ is continuous at $(t, t)$, then $Z$ is strongly continuous at $t$.

For $t$ and $h$ as above and any $\varepsilon$ with $t + \varepsilon \in T$ we have

$$C(t + h, t + \varepsilon) - C(t, t) = E\Big( [Z(t + h) - Z(t)] \cdot \overline{[Z(t + \varepsilon) - Z(t)]} + [Z(t + h) - Z(t)] \cdot \overline{Z(t)} + \overline{[Z(t + \varepsilon) - Z(t)]} \cdot Z(t) \Big).$$

Applying the Cauchy–Schwarz inequality we obtain

$$|C(t + h, t + \varepsilon) - C(t, t)| \le \|Z(t + h) - Z(t)\| \cdot \|Z(t + \varepsilon) - Z(t)\| + \|Z(t + h) - Z(t)\| \cdot \|Z(t)\| + \|Z(t + \varepsilon) - Z(t)\| \cdot \|Z(t)\|.$$

Thus, if $Z$ is strongly continuous at $t$, then $C$ is continuous at $(t, t)$.

To prove the second statement assume that $C$ is continuous at $(t, t)$ and $(s, s)$. As we have seen, this implies that $Z$ is strongly continuous at $t$ and $s$. The proof is completed by noting that

$$C(t, s) = (Z(t), Z(s)).$$
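The first identity in the proof expresses the squared norm of an increment purely through the correlation function. A minimal sketch, assuming the kernel $C(s,t) = \min(s,t)$ of the Wiener process as a test case (the function names are illustrative):

```python
def wiener_corr(s, t):
    """Correlation function C(s, t) = min(s, t) of the Wiener process."""
    return min(s, t)


def increment_norm_sq(corr, t, h):
    """||Z(t+h) - Z(t)||^2 written in terms of the correlation function."""
    return corr(t + h, t + h) - corr(t + h, t) - corr(t, t + h) + corr(t, t)


# For C = min the squared norm of the increment equals |h|,
# which tends to zero as h -> 0: strong continuity at t.
values = [increment_norm_sq(wiener_corr, 1.0, h) for h in (0.5, 0.125, -0.25)]
```

For this kernel the increment norm depends only on $|h|$, so continuity of $C$ on the diagonal is visible directly.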

Definition 2.3.3. A process $Z$ of second order on $(a, b)$, $-\infty \le a < b \le \infty$, is called strongly differentiable, if for all $t \in (a, b)$ the strong limit⁶

$$\lim_{h \to 0} \frac{Z(t + h) - Z(t)}{h}$$

exists. The limit is denoted by $Z'(t)$.

In the same way one can define partial derivatives of a field defined on an open set $T \subset \mathbb{R}^d$.
Next we show that strong differentiability of a process is closely related to the differentiability of its correlation function. For a function $f$ of two variables we write

$$\Delta^2 f(t, s, \varepsilon, h) = f(t + \varepsilon, s + h) - f(t + \varepsilon, s) - f(t, s + h) + f(t, s).$$

Theorem 2.3.4. Let $Z$ be a process of second order on $(a, b)$ with correlation function $C$. The process $Z$ is strongly differentiable if and only if the limit

$$\lim_{\varepsilon, h \to 0} \frac{\Delta^2 C(t, t, \varepsilon, h)}{\varepsilon h}$$

exists for all $t \in (a, b)$.
If $Z$ is strongly differentiable, then the limit

$$D(t, s) := \lim_{\varepsilon, h \to 0} \frac{\Delta^2 C(t, s, \varepsilon, h)}{\varepsilon h}$$

exists for all $t, s \in (a, b)$ and $D$ is the correlation function of $Z'$. Moreover,

$$\frac{d}{dt} (Z(t), v) = (Z'(t), v) \tag{1}$$

for all $v \in L^2(\Omega, \mathcal{A}, P)$. In particular, with $v = 1_\Omega$,

$$\frac{d}{dt} E(Z(t)) = E(Z'(t)). \tag{2}$$
dt
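Proof. By Cauchy's criterion (cf. Lemma D.3.2), the process $Z$ is strongly differentiable at $t$ if and only if the limit

$$\lim_{\varepsilon, h \to 0} \Big( \frac{Z(t + \varepsilon) - Z(t)}{\varepsilon}, \frac{Z(t + h) - Z(t)}{h} \Big) = \lim_{\varepsilon, h \to 0} \frac{\Delta^2 C(t, t, \varepsilon, h)}{\varepsilon h}$$

exists. This shows the first statement.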

6 That is, limit with respect to the norm of $L^2(P)$.

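If $Z$ is strongly differentiable, then

$$(Z'(t), Z'(s)) = \Big( \lim_{\varepsilon \to 0} \frac{Z(t + \varepsilon) - Z(t)}{\varepsilon}, \lim_{h \to 0} \frac{Z(s + h) - Z(s)}{h} \Big) = \lim_{\varepsilon, h \to 0} \Big( \frac{Z(t + \varepsilon) - Z(t)}{\varepsilon}, \frac{Z(s + h) - Z(s)}{h} \Big) = \lim_{\varepsilon, h \to 0} \frac{\Delta^2 C(t, s, \varepsilon, h)}{\varepsilon h} = D(t, s).$$

Equation (1) follows from the fact that strong convergence implies weak convergence:

$$\Big( \lim_{h \to 0} \frac{Z(t + h) - Z(t)}{h}, v \Big) = \lim_{h \to 0} \Big( \frac{Z(t + h) - Z(t)}{h}, v \Big).$$

Applying Lemma B.6.3 we obtain the following corollary: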

Corollary 2.3.5. If $C$ has continuous partial derivatives of second order on $(a, b) \times (a, b)$, then $Z$ is strongly differentiable and the correlation function $D$ of $Z'$ is given by

$$D(s, t) = \frac{\partial^2 C}{\partial s \, \partial t}(s, t), \quad s, t \in (a, b).$$
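The relation between the second difference quotient and the mixed partial derivative can be illustrated numerically. The sketch below assumes the smooth kernel $C(t,s) = e^{-(t-s)^2/2}$ as a test case (this kernel and all function names are illustrative choices, not taken from the text); its mixed derivative is $(1 - (t-s)^2)\,e^{-(t-s)^2/2}$.

```python
import math


def corr(t, s):
    """Assumed test kernel C(t, s) = exp(-(t - s)^2 / 2)."""
    return math.exp(-0.5 * (t - s) ** 2)


def second_difference(f, t, s, eps, h):
    """f(t+eps, s+h) - f(t+eps, s) - f(t, s+h) + f(t, s)."""
    return f(t + eps, s + h) - f(t + eps, s) - f(t, s + h) + f(t, s)


def difference_quotient(t, s, eps=1e-4, h=1e-4):
    """Second difference of C divided by eps * h."""
    return second_difference(corr, t, s, eps, h) / (eps * h)


def mixed_derivative(t, s):
    """Closed form of the mixed second partial derivative of the test kernel."""
    u = t - s
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


pairs = [(0.0, 0.0), (0.3, 1.2), (2.0, -1.0)]
errors = [abs(difference_quotient(t, s) - mixed_derivative(t, s)) for t, s in pairs]
```

With small step sizes the difference quotient is close to the closed-form derivative at every test point, in line with Theorem 2.3.4 and Corollary 2.3.5.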

Lemma 2.3.6. If $Z$ is a strongly differentiable Gaussian process on $(a, b)$, then $Z'$ is Gaussian as well.

Proof. The random vectors

$$n \cdot \big( Z(t_1 + 1/n) - Z(t_1), \ldots, Z(t_k + 1/n) - Z(t_k) \big)$$

where $n \in \mathbb{N}$ and $t_j, t_j + 1/n \in (a, b)$ are Gaussian and they converge in law to

$$(Z'(t_1), \ldots, Z'(t_k))$$

as $n \to \infty$ (see Section F.2). Thus, the statement of the lemma follows from Theorem 1.10.7.

2.4 Integration with respect to complex measures

Throughout this section $T \subset \mathbb{R}^d$ denotes a nonempty Borel set and $Z$ is a strongly continuous random field of second order on $T$. As in Notation 2.2.6 we denote by $\mathbf{Z}$ the corresponding $L^2$-valued random field. First we define integrals of the form

$$\int_T Z(t) \, d\mu(t)$$

where $\mu \in M_b(T)$ is a complex Radon measure on $T$. Then we prove basic properties of this integral paying special attention to properties which can be expressed in terms of the correlation function

$$C(s, t) = E\big( Z(s) \cdot \overline{Z(t)} \big) = E\big( \mathbf{Z}(s) \cdot \overline{\mathbf{Z}(t)} \big), \quad s, t \in T.$$

Definition 2.4.1. Let $\mu \in M_b(T)$. The field $Z$ is called $\mu$-integrable if the function $t \mapsto (Z(t), h)$, $t \in T$, is $\mu$-integrable for all $h \in L^2(P)$ and there exists $I(Z) \in L^2(P)$ such that

$$\int_T (Z(t), h) \, d\mu(t) = (I(Z), h), \quad h \in L^2(P). \tag{1}$$

Note that $I(Z)$ is uniquely determined by equation (2.4.1.1). We write

$$\int Z \, d\mu := \int_T Z(t) \, d\mu(t) := I(Z).$$

For a Lebesgue integrable function $f$ on $T$ let

$$\int_T Z(t) f(t) \, dt := \int_T Z(t) \, d\mu(t)$$

where $d\mu(t) = f(t) \, dt$. If $S \subset T$ is a Borel set, then we write

$$\int_S Z(t) \, d\mu(t) = \int_T Z(t) 1_S(t) \, d\mu(t).$$

In the special case where $S = [a, b] \subset \mathbb{R}$ we also use the notation $\int_a^b Z \, d\mu$ instead of $\int_{[a,b]} Z \, d\mu$.

The next lemma follows immediately from the definition of the integral. We omit
the proof.

Lemma 2.4.2. Assuming that all the integrals below exist, we have:
(i) $\int_T (Z(t), h) \, d\mu(t) = \big( \int_T Z(t) \, d\mu(t), h \big)$, $h \in L^2(P)$
(ii) $\int_T [a_1 Z_1(t) + a_2 Z_2(t)] \, d\mu(t) = a_1 \int_T Z_1(t) \, d\mu(t) + a_2 \int_T Z_2(t) \, d\mu(t)$, $a_1, a_2 \in \mathbb{C}$
(iii) $\int_T Z(t) \, d(\mu_1 + \mu_2)(t) = \int_T Z(t) \, d\mu_1(t) + \int_T Z(t) \, d\mu_2(t)$
(iv) $\int_{S_1 \cup S_2} Z(t) \, d\mu(t) = \int_{S_1} Z(t) \, d\mu(t) + \int_{S_2} Z(t) \, d\mu(t)$, where $S_1, S_2 \subset T$ are disjoint Borel sets
(v) If $Z(t) = h$ for some $h \in L^2(P)$ and all $t \in T$ then $\int_T Z(t) \, d\mu(t) = \mu(T) \cdot h$.

Example 2.4.3. (a) Let

$$\mu = \sum_{j=1}^{n} c_j \delta_{t_j}, \quad c_j \in \mathbb{C}, \; t_j \in T$$

be a measure with finite support. Then

$$\int_T (Z(t), h) \, d\mu(t) = \sum_{j=1}^{n} c_j (Z(t_j), h) = \Big( \sum_{j=1}^{n} c_j Z(t_j), h \Big), \quad h \in L^2(P).$$

Thus, by Definition 2.4.1,

$$\int Z \, d\mu = \sum_{j=1}^{n} c_j Z(t_j).$$

Moreover, we have

$$\Big\| \int Z \, d\mu \Big\|^2 = \Big( \sum_{j=1}^{n} c_j Z(t_j), \sum_{k=1}^{n} c_k Z(t_k) \Big) = \sum_{j,k=1}^{n} c_j \overline{c_k} \, C(t_j, t_k). \tag{1}$$

(b) Assume that $Z(t) = f(t) X$, $t \in T = [a, b]$, where $f$ is a continuous function on $T$ and $X \in L^2(P)$. For all $h \in L^2(P)$ we have

$$\int_a^b (Z(t), h) \, d\mu(t) = \int_a^b (f(t) X, h) \, d\mu(t) = (X, h) \int_a^b f(t) \, d\mu(t) = \Big( \int_a^b f(t) \, d\mu(t) \cdot X, h \Big)$$

and hence

$$\int_a^b Z(t) \, d\mu(t) = \int_a^b f(t) \, d\mu(t) \cdot X.$$

Using this and (2.4.2.ii) we obtain the more general relation

$$\int_a^b \sum_{j=1}^{n} f_j(t) X_j \, d\mu(t) = \sum_{j=1}^{n} \int_a^b f_j(t) \, d\mu(t) \cdot X_j$$

where $X_j \in L^2(P)$ and $f_j$ is a continuous function on $[a, b]$.
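Part (a) of the example can be reproduced on a finite probability space, where random variables are just vectors. The following is a minimal sketch under stated assumptions: the four-point sample space, the probabilities `PROBS`, the field `field` and the chosen support points and weights are all hypothetical and serve only to illustrate relation (1) of part (a).

```python
import cmath

# Assumed toy model: a 4-point sample space with these probabilities.
PROBS = [0.1, 0.2, 0.3, 0.4]


def inner(x, y):
    """Inner product (X, Y) = E(X * conj(Y)) on the finite sample space."""
    return sum(p * a * b.conjugate() for p, a, b in zip(PROBS, x, y))


def field(t):
    """Hypothetical field Z(t)(w) = exp(i*t*w) + t*w for w = 0, 1, 2, 3."""
    return [cmath.exp(1j * t * w) + t * w for w in range(len(PROBS))]


def corr(s, t):
    """Correlation function C(s, t) = (Z(s), Z(t))."""
    return inner(field(s), field(t))


def integrate(points, coeffs):
    """Integral of Z with respect to sum_j c_j * delta_{t_j}: sum_j c_j Z(t_j)."""
    total = [0j] * len(PROBS)
    for t, c in zip(points, coeffs):
        total = [acc + c * z for acc, z in zip(total, field(t))]
    return total


points = [0.0, 0.7, 1.5]
coeffs = [2 - 1j, 0.5j, -1 + 0j]
integral = integrate(points, coeffs)
norm_sq = inner(integral, integral).real
double_sum = sum(
    cj * ck.conjugate() * corr(tj, tk)
    for tj, cj in zip(points, coeffs)
    for tk, ck in zip(points, coeffs)
)
```

Here `norm_sq` is the squared norm of the integral computed directly, and `double_sum` is the double sum over the correlation function; the two agree up to rounding.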



Theorem 2.4.4. Assume that $t \mapsto (Z(t), h)$ is $\mu$-integrable for all $h \in L^2(P)$. Then $Z$ is $\mu$-integrable if and only if there exists a constant $K \in [0, \infty)$ with

$$\Big| \int_T (Z(t), h) \, d\mu(t) \Big| \le K \cdot \|h\|, \quad h \in L^2(P). \tag{1}$$

Proof. If $Z$ is $\mu$-integrable, then the inequality (1) holds with $K = \| \int Z \, d\mu \|$. This follows from (2.4.1.1) and from the Cauchy–Schwarz inequality

$$|(I(Z), h)| \le \|I(Z)\| \cdot \|h\|.$$

Assume now that (1) holds with some constant $K$. Then

$$h \mapsto \int_T (h, Z(t)) \, d\overline{\mu}(t)$$

is a bounded linear functional⁷ on $L^2(P)$. By a theorem of Riesz (cf. Theorem D.3.4) there exists $I \in L^2(P)$ such that

$$\int_T (h, Z(t)) \, d\overline{\mu}(t) = (h, I), \quad h \in L^2(P)$$

i.e.,

$$\int_T (Z(t), h) \, d\mu(t) = (I, h), \quad h \in L^2(P).$$

Thus, the field $Z$ is $\mu$-integrable.

Corollary 2.4.5. If

$$\int_T \|Z(t)\| \, d|\mu|(t) < \infty \tag{1}$$

then $Z$ is $\mu$-integrable and

$$\Big\| \int_T Z(t) \, d\mu(t) \Big\| \le \int_T \|Z(t)\| \, d|\mu|(t). \tag{2}$$

Proof. The Cauchy–Schwarz inequality $|(Z(t), h)| \le \|Z(t)\| \cdot \|h\|$ and (1) imply that $t \mapsto (Z(t), h)$ is $|\mu|$-integrable and hence $\mu$-integrable as well. Moreover,

$$\Big| \int_T (Z(t), h) \, d\mu(t) \Big| \le \int_T |(Z(t), h)| \, d|\mu|(t) \le \int_T \|Z(t)\| \, d|\mu|(t) \cdot \|h\|.$$

By Theorem 2.4.4, the field $Z$ is $\mu$-integrable. The left-hand side of the inequality above is equal to $\big| \big( \int Z \, d\mu, h \big) \big|$. Setting $h = \int Z \, d\mu$ and dividing by $\|h\|$ if $\|h\| \ne 0$ we obtain (2).

7 To obtain a linear functional we use the conjugate of the integral in (1).

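Theorem 2.4.6. If $Z$ is integrable with respect to $\mu$ and $\nu$, then the correlation function $C$ is $\mu \times \overline{\nu}$-integrable and

$$\Big( \int_T Z \, d\mu, \int_T Z \, d\nu \Big) = \int_{T \times T} C(t, s) \, d(\mu \times \overline{\nu})(t, s). \tag{1}$$

In particular,

$$\Big\| \int_T Z \, d\mu \Big\|^2 = \int_{T \times T} C(t, s) \, d(\mu \times \overline{\mu})(t, s). \tag{2}$$

Proof. Equation (2.4.1.1) with $h = \int Z \, d\nu$ yields

$$\Big( \int Z(t) \, d\mu(t), \int Z(s) \, d\nu(s) \Big) = \int \Big( Z(t), \int Z(s) \, d\nu(s) \Big) \, d\mu(t). \tag{3}$$

Setting now $h = Z(t)$ in equation (2.4.1.1), replacing the integration variable $t$ by $s$, and replacing the measure $\mu$ by $\nu$ we obtain

$$\Big( Z(t), \int Z(s) \, d\nu(s) \Big) = \overline{\int (Z(s), Z(t)) \, d\nu(s)} = \int C(t, s) \, d\overline{\nu}(s).$$

This relation together with equation (3) shows that

$$\Big( \int Z(t) \, d\mu(t), \int Z(s) \, d\nu(s) \Big) = \int \int C(t, s) \, d\overline{\nu}(s) \, d\mu(t).$$

By Fubini's theorem $C$ is $\mu \times \overline{\nu}$-integrable and (1) holds.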

Next we characterize integrability of a random field in terms of integrability of its


correlation function.

Theorem 2.4.7. The field $Z$ is $\mu$-integrable if and only if the correlation function $C$ is $\mu \times \overline{\mu}$-integrable.

Proof. In view of Theorem 2.4.6 it remains to prove that $Z$ is $\mu$-integrable whenever $C$ is $\mu \times \overline{\mu}$-integrable. To show this, let $h \in L^2(P)$, $h \ne 0$, be arbitrary and write

$$f_h(t) := (Z(t), h), \quad Z_h(t) := Z(t) - \frac{f_h(t)}{\|h\|^2} \cdot h, \quad t \in T.$$

Then $(Z_h(t), h) = 0$ and

$$C(t, s) = (Z(t), Z(s)) = (Z_h(t), Z_h(s)) + \frac{1}{\|h\|^2} f_h(t) \overline{f_h(s)}.$$

Let $S \ne \emptyset$ be an arbitrary compact subset of $T$. By continuity, the functions $t \mapsto \|Z(t)\|$ and $t \mapsto \|Z_h(t)\|$ are bounded on $S$. Consequently, the restrictions of $Z$ and $Z_h$ to $S$ are integrable with respect to any complex measure (see Corollary 2.4.5). Applying Theorem 2.4.6 we obtain

$$\int_S \int_S C(t, s) \, d\mu(t) \, d\overline{\mu}(s) = \int_S \int_S (Z_h(t), Z_h(s)) \, d\mu(t) \, d\overline{\mu}(s) + \frac{1}{\|h\|^2} \int_S \int_S f_h(t) \overline{f_h(s)} \, d\mu(t) \, d\overline{\mu}(s) \ge \frac{1}{\|h\|^2} \Big| \int_S f_h(t) \, d\mu(t) \Big|^2$$

(note that the first integral after the equation sign is nonnegative by (2.4.6.2)). Consequently,

$$\Big| \int_S f_h(t) \, d\mu(t) \Big| \le \Big( \int_{T \times T} |C| \, d|\mu \times \overline{\mu}| \Big)^{1/2} \cdot \|h\|.$$

Using Lemma E.1.18 we conclude⁸ that $f_h$ is $\mu$-integrable and the inequality above remains valid if we replace $S$ by $T$. By Theorem 2.4.4 the field $Z$ is $\mu$-integrable.

We already know that a correlation function $K$ is positive semidefinite, i.e.,

$$\sum_{i,j=1}^{n} K(t_i, t_j) c_i \overline{c_j} \ge 0$$

for all $n \in \mathbb{N}$, $t_j \in T$ and $c_j \in \mathbb{C}$. The next corollary contains a more general inequality.

Corollary 2.4.8. Let $K$ be a continuous positive semidefinite kernel on $T$. Then the inequality

$$\int_{T \times T} K(t, s) \, d(\mu \times \overline{\mu})(t, s) \ge 0$$

holds for all $\mu \in M_b(T)$ such that $K$ is $\mu \times \overline{\mu}$-integrable. In particular, the inequality

$$\int_T \int_T K(t, s) h(t) \overline{h(s)} \, dt \, ds \ge 0$$

holds for every continuous function $h : T \to \mathbb{C}$ with compact support.

Proof. By Theorems 2.2.3 and 2.3.2 the kernel $K$ is the correlation function of a strongly continuous random field. This field is $\mu$-integrable in view of Theorem 2.4.7. The corollary follows now from (2.4.6.2).

8 To apply this lemma we extend $\mu$ to a measure on $\mathbb{R}^d$ by setting $\mu(B) := \mu(B \cap T)$, $B \in \mathcal{B}(\mathbb{R}^d)$. The function $f_h$ is extended to $\mathbb{R}^d$ by setting it equal to zero on $\mathbb{R}^d \setminus T$.

Theorem 2.4.9. Let $\{\mu_n\}$ be a sequence of complex measures on $T$ converging weakly to some complex measure $\mu$. If $Z$ is integrable with respect to $\mu$ and $\mu_n$ then

$$\int_T Z \, d\mu = \lim_{n \to \infty} \int_T Z \, d\mu_n. \tag{1}$$

Proof. We may suppose that $\mu$ and $\mu_n$ are nonnegative. If $T$ is compact, then, by continuity, the function $C$ is bounded and hence integrable with respect to any complex measure on $T \times T$. Using (2.4.6.2) we obtain

$$\Big\| \int_T Z \, d\mu - \int_T Z \, d\mu_n \Big\|^2 = \Big\| \int_T Z \, d(\mu - \mu_n) \Big\|^2 = \int_{T \times T} C \, d(\mu - \mu_n) \times (\mu - \mu_n).$$

In view of Theorem E.1.14, the sequence $\{(\mu - \mu_n) \times (\mu - \mu_n)\}$ converges weakly to 0. Since $C$ is bounded, the right-hand side of the relation above tends to zero from which (1) follows.
Now let $T$ be arbitrary. For each $\varepsilon > 0$ there exist disjoint Borel sets $T_b$ and $T_u$ such that $T_b$ is compact, $T = T_b \cup T_u$ and

$$\int_{T_u \times T_u} C \, d(\mu \times \mu) < \varepsilon$$

(this integral is nonnegative in view of (2.4.6.2)). Therefore,

$$\Big\| \int_T Z \, d\mu - \int_{T_b} Z \, d\mu \Big\|^2 = \Big\| \int_{T_u} Z \, d\mu \Big\|^2 = \int_{T_u \times T_u} C \, d(\mu \times \mu) < \varepsilon.$$

By the first part of the proof, the statement of the theorem is true for the compact set $T_b$. The inequality above shows that it is true for $T$ as well.
Using Theorem E.1.8 we obtain the following corollary:

Corollary 2.4.10. For every complex measure $\mu$ there exists a sequence $\{\mu_n\}$ of complex measures with finite support converging weakly to $\mu$ and satisfying the relation (2.4.9.1).

Corollary 2.4.11. If $Z$ is a $\mu_j$-integrable Gaussian field $(j = 1, \ldots, d)$, then

$$\Big( \int_T Z \, d\mu_1, \ldots, \int_T Z \, d\mu_d \Big)$$

is a Gaussian random vector.

Proof. The statement is obvious if the $\mu_j$'s have finite support. The general case follows from Corollary 2.4.10 and from the fact that the limit of weakly convergent Gaussian random vectors is Gaussian (cf. Theorem 1.10.7).

Theorem 2.4.12 (Newton–Leibniz formula for fields). Let $Z$ be a strongly continuous field on $(a, b)$, $-\infty \le a < b \le \infty$, and let $t_0 \in (a, b)$ be arbitrary.
(i) The field $t \mapsto \int_{t_0}^{t} Z(s) \, ds$ is strongly differentiable and

$$Z(t) = \frac{d}{dt} \int_{t_0}^{t} Z(s) \, ds, \quad t \in (a, b).$$

(ii) If $C$ is twice continuously differentiable, then

$$Z(t) = Z(t_0) + \int_{t_0}^{t} Z'(s) \, ds, \quad t \in (a, b).$$

Proof. (i) We have to show that

$$\frac{1}{h} \Big[ \int_{t_0}^{t+h} Z(s) \, ds - \int_{t_0}^{t} Z(s) \, ds \Big] = \frac{1}{h} \int_{t}^{t+h} Z(s) \, ds$$

converges to $Z(t)$ as $h \to 0$. This follows from

$$\Big\| \frac{1}{h} \int_{t}^{t+h} Z(s) \, ds - Z(t) \Big\| = \frac{1}{|h|} \Big\| \int_{t}^{t+h} [Z(s) - Z(t)] \, ds \Big\| \le \frac{1}{|h|} \int_{[t, t+h]} \|Z(s) - Z(t)\| \, ds \le \sup_{s \in [t, t+h]} \|Z(s) - Z(t)\| \to 0.$$

(ii) Corollary 2.3.5 shows that $Z$ is strongly differentiable and the correlation function of $Z'$ is continuous. Consequently, $Z'$ is strongly continuous and hence integrable with respect to the Lebesgue measure on $[t_0, t]$. By Theorem 2.3.4, the function $t \mapsto (Z(t), h)$ is continuously differentiable for all $h \in L^2(P)$. Applying the classical Newton–Leibniz formula to this function we obtain

$$(Z(t), h) = (Z(t_0), h) + \int_{t_0}^{t} \frac{d}{ds} (Z(s), h) \, ds.$$

Equation (2.3.4.1) and the definition of the integral show that

$$(Z(t), h) = (Z(t_0), h) + \Big( \int_{t_0}^{t} Z'(s) \, ds, h \Big), \quad h \in L^2(P),$$

from which the statement follows.

Lemma 2.4.13 (dominated convergence). Let $Z, Z_n$, $n \in \mathbb{N}$, be strongly continuous fields on $T$ such that

$$\lim_{n \to \infty} Z_n(t) = Z(t), \quad t \in T. \tag{1}$$

Assume further that there exists a $|\mu|$-integrable function $g$ on $T$ such that $\|Z_n(t)\| \le g(t)$ for all $n$ and $t$. Then $Z$ and $Z_n$ are $\mu$-integrable and

$$\lim_{n \to \infty} \int_T Z_n(t) \, d\mu(t) = \int_T Z(t) \, d\mu(t).$$

Proof. Relation (1) implies that $\lim_{n \to \infty} \|Z_n(t)\| = \|Z(t)\|$ and hence $\|Z(t)\| \le g(t)$. By Corollary 2.4.5, the fields $Z$ and $Z_n$ are $\mu$-integrable. Using the inequality $\|Z_n(t) - Z(t)\| \le 2g(t)$, the lemma follows from

$$\Big\| \int_T Z_n(t) \, d\mu(t) - \int_T Z(t) \, d\mu(t) \Big\| = \Big\| \int_T [Z_n(t) - Z(t)] \, d\mu(t) \Big\| \le \int_T \|Z_n(t) - Z(t)\| \, d|\mu|(t)$$

and from Lebesgue's theorem on dominated convergence.

Definition 2.4.14. Assume that $T = \mathbb{R}^d$ or $T = \mathbb{Z}^d$ and let $\mu$ be a complex measure on $T$. Further, let $Z$ be a continuous field of second order on $T$ such that the field $x \mapsto Z(t - x)$ is $\mu$-integrable for all $t \in T$. We define the field $\mu * Z$, the convolution of $\mu$ and $Z$, by

$$\mu * Z(t) = \int_T Z(t - x) \, d\mu(x), \quad t \in T.$$

If $f \in L^1(\lambda)$ or $f \in L^1(\mathbb{Z}^d)$, then we write

$$f * Z(t) := \int_{\mathbb{R}^d} Z(t - x) f(x) \, d\lambda(x), \quad t \in T$$

and

$$f * Z(t) := \sum_{n \in \mathbb{Z}^d} Z(t - n) f(n), \quad t \in T$$

respectively.

In the next theorem we collect basic properties of the convolution.


Theorem 2.4.15. Assume that all convolutions below exist. Then we have

$$(\mu + \nu) * Z = \mu * Z + \nu * Z \tag{1}$$
$$(c\mu) * Z = c \cdot (\mu * Z), \quad c \in \mathbb{C} \tag{2}$$
$$\mu * Z(t) = \int Z \, d(\delta_t * \check{\mu}) \tag{3}$$
$$(\mu * \nu) * Z = \mu * (\nu * Z) \tag{4}$$

for all $t \in T$. If a sequence $\{\mu_n\}$ converges weakly to $\mu$, then

$$\lim_{n \to \infty} \mu_n * Z(t) = \mu * Z(t), \quad t \in T. \tag{5}$$

Proof. The first three equations follow immediately from the definition of the convolution, while (5) is a consequence of Theorem 2.4.9.
It is easy to check that (4) holds for measures with finite support. In the general case we approximate $\mu$ and $\nu$ by measures with finite support (see Corollary 2.4.10) and apply Theorem E.2.5.

Remark 2.4.16. Let $Z$ be a not necessarily continuous, second order field on $\mathbb{R}^d$. If the complex measure $\mu$ has finite support, then we define the convolution of $\mu$ and $Z$, as in Definition 2.4.14, by

$$\mu * Z(t) = \int_{\mathbb{R}^d} Z(t - x) \, d\mu(x), \quad t \in \mathbb{R}^d.$$

Then equations (1)–(4) in the previous theorem still remain valid.

2.5 The Karhunen–Loève decomposition


In this section we use the notation

$$I := [a, b] := [a_1, b_1] \times \cdots \times [a_d, b_d]$$

where $a, b \in \mathbb{R}^d$ and $-\infty < a_j < b_j < \infty$. We will prove a decomposition theorem for continuous positive semidefinite kernels on $I \times I$. Using this result we derive a decomposition theorem for strongly continuous fields on $I$.

We denote by $\mathcal{L}^2(a, b)$ the positive semidefinite inner product space of all complex-valued functions on $[a, b]$ which are square integrable with respect to the Lebesgue measure. The corresponding Hilbert space is denoted by $L^2(a, b)$. We write $\int_a^b$ instead of $\int_I$. Finally, $C : I \times I \to \mathbb{C}$ will be a continuous kernel.

Definition 2.5.1. A function $\varphi \in \mathcal{L}^2(a, b)$ is called an eigenfunction of $C$ with eigenvalue $\lambda \in \mathbb{C}$ if

$$\lambda \varphi(t) = \int_a^b C(t, s) \varphi(s) \, ds, \quad t \in [a, b].$$

Lemma 2.5.2. Let $h \in \mathcal{L}^2(a, b)$. The function $g$ defined by

$$g(t) := \int_a^b C(t, s) h(s) \, ds, \quad t \in [a, b]$$

is continuous. In particular, if $\varphi$ is an eigenfunction of $C$ with eigenvalue $\lambda \ne 0$, then $\varphi$ is continuous.

Proof. For $t, t_0 \in I$ the Cauchy–Schwarz inequality shows that

$$|g(t) - g(t_0)|^2 = \Big| \int_a^b (C(t, s) - C(t_0, s)) h(s) \, ds \Big|^2 \le \int_a^b |C(t, s) - C(t_0, s)|^2 \, ds \cdot \int_a^b |h(s)|^2 \, ds.$$

The kernel $C$ is continuous and hence uniformly continuous on $I \times I$. Using this we see that the right-hand side of the relation above tends to zero as $t \to t_0$.

Lemma 2.5.3. The mapping $K$ defined by⁹

$$Kh(t) := \int_a^b C(t, s) h(s) \, ds, \quad h \in L^2(a, b), \; t \in [a, b]$$

is a compact linear operator in $L^2(a, b)$.

Proof. Using the fact that $C$ is bounded on $I \times I$ it is easy to see that $K$ is a bounded linear operator in $L^2(a, b)$. We have to show that for every bounded sequence $\{h_n\}$, the sequence $\{Kh_n\}$ has a convergent subsequence. Write $g_n = Kh_n$. To simplify the notation, we also consider $g_n$ as a function defined pointwise by the integral above. Applying the Cauchy–Schwarz inequality as in Lemma 2.5.2, we see that the sequence $\{g_n\}$ is equicontinuous and uniformly bounded. In view of the theorem of Arzelà and Ascoli (cf. Theorem B.2.9), the sequence $\{g_n\}$ has a subsequence converging uniformly on $[a, b]$ to a continuous function. It follows that the operator $K$ is indeed compact.

Theorem 2.5.4 (Mercer). An arbitrary continuous positive semidefinite kernel $C : I \times I \to \mathbb{C}$ admits the decomposition

$$C(t, s) = \sum_{n=1}^{\infty} \lambda_n \varphi_n(t) \overline{\varphi_n(s)}, \quad t, s \in I \tag{1}$$

where
(i) each $\varphi_n$ is an eigenfunction of $C$ with eigenvalue $\lambda_n \ge 0$;
(ii) the series above is absolutely and uniformly convergent;
(iii) $\{\varphi_n + N\}_{n=1}^{\infty}$ is an orthonormal basis of $L^2(a, b)$, where $N = \{\varphi \in \mathcal{L}^2(a, b) : (\varphi, \varphi) = 0\}$.

Proof. To simplify the notation we identify continuous functions on $[a, b]$ with the corresponding elements of $L^2(a, b)$. We consider the compact linear operator $K$ defined in Lemma 2.5.3. Using that $C(t, s) = \overline{C(s, t)}$ it is easy to check that $(Kh, g) = (h, Kg)$, i.e., $K$ is self-adjoint. It follows from Corollary 2.4.8 that $K$ is nonnegative and hence all eigenvalues of $K$ are nonnegative. By Remark D.3.13, there exists an orthonormal basis $\{\varphi_n\}_1^\infty$ of $L^2(a, b)$ consisting of eigenvectors of $K$ with eigenvalues $\{\lambda_n\}_1^\infty$. Every $h \in L^2(a, b)$ admits the decomposition

$$h = \sum_{n=1}^{\infty} (h, \varphi_n) \varphi_n.$$

For fixed $t$ we apply the above decomposition to the function $h : s \mapsto C(s, t)$:

$$\overline{C(t, \cdot)} = C(\cdot, t) = \sum_{n=1}^{\infty} \overline{(\varphi_n, C(\cdot, t))} \, \varphi_n, \quad t \in [a, b].$$

Since $\varphi_n$ is an eigenvector of $K$ with eigenvalue $\lambda_n$ we have

$$(\varphi_n, C(\cdot, t)) = \int_a^b C(t, s) \varphi_n(s) \, ds = K\varphi_n(t) = \lambda_n \varphi_n(t). \tag{2}$$

Thus,

$$C(t, \cdot) = \sum_{n=1}^{\infty} \lambda_n \varphi_n(t) \overline{\varphi_n}.$$

From here, again using (2) and the orthonormality of the vectors $\varphi_n$, we obtain

$$0 = \lim_{n \to \infty} \Big\| C(t, \cdot) - \sum_{k=1}^{n} \lambda_k \varphi_k(t) \overline{\varphi_k} \Big\|^2 = \lim_{n \to \infty} \Big[ \int_a^b |C(t, s)|^2 \, ds - \sum_{k=1}^{n} \lambda_k^2 |\varphi_k(t)|^2 \Big], \quad t \in [a, b].$$

If $\lambda_k \ne 0$, then the function $\varphi_k$ is continuous (cf. Lemma 2.5.2). It is easy to see that the function

$$t \mapsto \int_a^b |C(t, s)|^2 \, ds$$

is continuous as well. By Dini's Theorem B.2.5, the series $\sum_{k=1}^{\infty} \lambda_k^2 |\varphi_k(t)|^2$ converges uniformly on $[a, b]$. Using this and the Cauchy–Schwarz inequality

$$\sum_{k=p}^{q} \big| \lambda_k \varphi_k(t) \overline{\varphi_k(s)} \big| \le \Big( \sum_{k=p}^{q} \lambda_k |\varphi_k(t)|^2 \Big)^{1/2} \cdot \Big( \sum_{k=p}^{q} \lambda_k |\varphi_k(s)|^2 \Big)^{1/2}$$

we conclude that (ii) holds. To show equation (1) we consider for fixed $s$ the continuous nonnegative function

$$t \mapsto \Big| C(t, s) - \sum_{n=1}^{\infty} \lambda_n \varphi_n(t) \overline{\varphi_n(s)} \Big|^2.$$

Using (2) we see that its integral over $I$ is equal to zero. Thus, the function above is equal to zero everywhere.

9 To be precise, $Kh$ is the equivalence class represented by the integral on the right-hand side.

Theorem 2.5.5 (Karhunen–Loève). Let $Z : I \to L^2(\Omega, \mathcal{A}, P)$ be a strongly continuous field with correlation function $C$. Then $Z$ admits the expansion

$$Z(t) = \sum_{n \in S} \sqrt{\lambda_n} \, \varphi_n(t) X_n, \quad t \in I \tag{1}$$

where
(i) $\varphi_n$ and $\lambda_n$ are as in Mercer's theorem and $S = \{n \in \mathbb{N} : \lambda_n > 0\}$;¹⁰
(ii) $X_n = \dfrac{1}{\sqrt{\lambda_n}} \displaystyle\int_a^b Z(s) \cdot \overline{\varphi_n(s)} \, ds$;
(iii) the $X_n$'s form an orthonormal system in $L^2(\Omega, \mathcal{A}, P)$;
(iv) the series (1) converges uniformly on $I$.

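Proof. For $n \in S$ we define $X_n$ by equation (ii) and write $X_n = 0$ if $n \notin S$. Note that the function $s \mapsto \|Z(s) \cdot \overline{\varphi_n(s)}\|$ is bounded by continuity, hence the integral in (ii) exists (cf. Corollary 2.4.5). Applying Theorem 2.4.6 we see that

$$(X_n, X_m) = \frac{1}{\sqrt{\lambda_n \lambda_m}} \Big( \int_a^b Z(t) \cdot \overline{\varphi_n(t)} \, dt, \int_a^b Z(s) \cdot \overline{\varphi_m(s)} \, ds \Big) = \frac{1}{\sqrt{\lambda_n \lambda_m}} \int_a^b \int_a^b C(t, s) \overline{\varphi_n(t)} \varphi_m(s) \, ds \, dt = \sqrt{\frac{\lambda_m}{\lambda_n}} \int_a^b \overline{\varphi_n(t)} \varphi_m(t) \, dt = \delta_{n,m}, \quad n, m \in S$$

i.e., (iii) holds. Using property (2.4.2.i) of the integral and the definition of $X_n$, we obtain

$$(Z(t), X_n) = \frac{1}{\sqrt{\lambda_n}} \Big( Z(t), \int_a^b Z(s) \overline{\varphi_n(s)} \, ds \Big) = \frac{1}{\sqrt{\lambda_n}} \int_a^b C(t, s) \varphi_n(s) \, ds = \sqrt{\lambda_n} \, \varphi_n(t), \quad n \in \mathbb{N}.$$

Applying this we see that

$$\Big\| Z(t) - \sum_{n=1}^{k} \sqrt{\lambda_n} \, \varphi_n(t) X_n \Big\|^2 = C(t, t) - 2 \operatorname{Re} \sum_{n=1}^{k} \sqrt{\lambda_n} \cdot \overline{\varphi_n(t)} (Z(t), X_n) + \sum_{n=1}^{k} \lambda_n |\varphi_n(t)|^2 = C(t, t) - \sum_{n=1}^{k} \lambda_n |\varphi_n(t)|^2, \quad k \in \mathbb{N}.$$

In view of Theorem 2.5.4, the right-hand side converges to zero uniformly on $I$ as $k \to \infty$. This completes the proof.

10 The correlation function $C$ is positive semidefinite, hence we may apply Mercer's Theorem 2.5.4 to it. If $S = \emptyset$, then the sum (1) is zero by definition.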

Remark 2.5.6. If $Z$ is Gaussian, then in view of Corollary 2.4.11 the random variables $X_n$ in Theorem 2.5.5 are Gaussian. If the mean of $Z$ is zero, then $E(X_n) = 0$. This follows from (2.4.2.i) and from $E(X_n) = (X_n, 1)$. By orthogonality, the random variables $X_n$ are then uncorrelated and hence independent (cf. Theorem 1.10.8).

The next theorem is the converse of the previous one and it is easier to prove.

Theorem 2.5.7. Let $\emptyset \ne S \subset \mathbb{N}$, for each $n \in S$ let $\lambda_n$ be a positive number and $\{X_n\}_{n \in S}$ be an orthonormal system in $L^2(\Omega, \mathcal{A}, P)$. Further, let $\{\varphi_n\}_{n \in S}$ be an orthonormal system of continuous functions in $L^2(a, b)$ such that the series

$$Z(t) = \sum_{n \in S} \sqrt{\lambda_n} \, \varphi_n(t) X_n, \quad t \in I$$

is uniformly convergent on $I$. Then this series defines a strongly continuous second order field $Z$ and the function $\varphi_n$ is an eigenfunction with eigenvalue $\lambda_n$ of the correlation function of $Z$.

Proof. To simplify the notation we assume that $S = \mathbb{N}$; the general case can be treated in the same way. It is easy to check that $Z$ is a field of second order with correlation function

$$C(t, s) = (Z(t), Z(s)) = \sum_{n=1}^{\infty} \lambda_n \varphi_n(t) \overline{\varphi_n(s)}, \quad t, s \in I,$$

where the series is uniformly convergent on $I$. From this we conclude that $C$ and hence $Z$ is continuous. Since $\{\varphi_n\}_{n \in \mathbb{N}}$ is an orthonormal system, we have

$$\int_a^b C(t, s) \varphi_k(s) \, ds = \int_a^b \sum_{n=1}^{\infty} \lambda_n \varphi_n(t) \overline{\varphi_n(s)} \varphi_k(s) \, ds = \sum_{n=1}^{\infty} \lambda_n \varphi_n(t) \int_a^b \overline{\varphi_n(s)} \varphi_k(s) \, ds = \lambda_k \varphi_k(t), \quad t \in I, \; k \in \mathbb{N}.$$
Example 2.5.8. We consider the Wiener process $W$ from Corollary 2.1.4 on a finite interval $I = [0, T]$, $T > 0$. The correlation function of $W$ is given by $C(t, s) = \min(t, s)$. First we compute the eigenfunctions of $C$ with positive eigenvalues. The equation

$$\int_0^T C(t, s) \varphi(s) \, ds = \lambda \varphi(t), \quad 0 \le t \le T, \; \lambda > 0$$

can be rewritten as

$$\int_0^t s \varphi(s) \, ds + t \int_t^T \varphi(s) \, ds = \lambda \varphi(t).$$

From here we see that $\varphi(0) = 0$. The left-hand side, hence also the right-hand side, is differentiable with respect to $t$.¹¹ We have

$$\int_t^T \varphi(s) \, ds = \lambda \varphi'(t), \quad t \in [0, T].$$

This equation implies that $\varphi'(T) = 0$. Differentiating again we obtain

$$-\varphi(t) = \lambda \varphi''(t).$$

The solutions of this equation satisfying the condition $\varphi(0) = 0$ are given by

$$\varphi(t) = A \sin \frac{t}{\sqrt{\lambda}}, \quad t \in [0, T].$$

Taking into account the condition $\varphi'(T) = 0$ it is easy to check that

$$\lambda_n = \frac{T^2}{(n + \frac{1}{2})^2 \pi^2}, \quad n \in \mathbb{N}_0$$

are the only possible values of $\lambda$. We denote by $\varphi_n$ the corresponding function and set $A := \sqrt{2/T}$, i.e.,

$$\varphi_n(t) = \sqrt{\frac{2}{T}} \cdot \sin\Big( \Big(n + \frac{1}{2}\Big) \frac{\pi t}{T} \Big).$$

The $L^2$-norm of $\varphi_n$ is equal to 1. From Theorem 2.5.5 we obtain the representation

$$W(t) = \sqrt{2T} \sum_{n=0}^{\infty} \frac{\sin\big( (n + \frac{1}{2}) \frac{\pi t}{T} \big)}{(n + \frac{1}{2}) \pi} \cdot X_n^T, \quad 0 \le t \le T.$$

Remark 2.5.6 shows that the random variables $X_n^T$ are independent and standard Gaussian.

11 At $t = 0$ and $t = T$ we have only one-sided derivatives.
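The eigenpairs found in this example can be plugged into Mercer's decomposition (Theorem 2.5.4) and compared with $\min(t, s)$. The following is a minimal numerical sketch; the function names and the truncation level are illustrative choices, and the index runs over $n = 0, 1, 2, \ldots$ as in the eigenvalue formula.

```python
import math


def eigenvalue(n, T):
    """lambda_n = T^2 / ((n + 1/2)^2 * pi^2), n = 0, 1, 2, ..."""
    return T * T / ((n + 0.5) ** 2 * math.pi ** 2)


def eigenfunction(n, t, T):
    """phi_n(t) = sqrt(2/T) * sin((n + 1/2) * pi * t / T)."""
    return math.sqrt(2.0 / T) * math.sin((n + 0.5) * math.pi * t / T)


def mercer_partial_sum(t, s, T, terms):
    """Partial sum of lambda_n * phi_n(t) * phi_n(s) over n = 0 .. terms-1."""
    return sum(
        eigenvalue(n, T) * eigenfunction(n, t, T) * eigenfunction(n, s, T)
        for n in range(terms)
    )


T = 2.0
approx_a = mercer_partial_sum(0.5, 1.5, T, 5000)
approx_b = mercer_partial_sum(1.0, 1.0, T, 5000)
approx_c = mercer_partial_sum(2.0, 0.3, T, 5000)
```

Since the terms decay like $1/n^2$, a few thousand terms already reproduce $\min(t, s)$ to about three decimal places at the sample points.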

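Example 2.5.9. Let $Y_1, \ldots, Y_n \in L^2(\Omega, \mathcal{A}, P)$ be uncorrelated with mean zero and standard deviation $\sigma_j > 0$. Further, let $k_1, \ldots, k_n \in \mathbb{N}$. The correlation function of the process

$$Z(t) = \sum_{j=1}^{n} e^{i t k_j} Y_j, \quad t \in [0, 2\pi]$$

is given by

$$C(t, s) = \sum_{j=1}^{n} \sigma_j^2 e^{i t k_j} \cdot \overline{e^{i s k_j}} = \sum_{j=1}^{n} \sigma_j^2 e^{i (t - s) k_j}, \quad t, s \in [0, 2\pi].$$

Provided the $k_j$ are mutually distinct, the functions $\varphi_j(t) = \frac{1}{\sqrt{2\pi}} \cdot e^{i t k_j}$, $t \in [0, 2\pi]$, form an orthonormal system in $L^2(0, 2\pi)$. Write $X_j := Y_j / \sigma_j$. Then $X_1, \ldots, X_n$ is an orthonormal system in $L^2(\Omega, \mathcal{A}, P)$,

$$Z(t) = \sqrt{2\pi} \sum_{j=1}^{n} \sigma_j \varphi_j(t) X_j, \quad t \in [0, 2\pi]$$

and $\varphi_j$ is an eigenfunction of $C$ with eigenvalue $2\pi \cdot \sigma_j^2$ (cf. Theorem 2.5.7).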
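The eigenvalue claim of this example can be checked by applying the integral operator with kernel $C$ to $\varphi_j$. The sketch below assumes concrete, mutually distinct frequencies and standard deviations (`FREQS` and `SIGMAS` are hypothetical values) and replaces the integral over $[0, 2\pi]$ by an equidistant Riemann sum, which is exact up to rounding for trigonometric polynomials of this low degree.

```python
import cmath
import math

# Hypothetical parameters: distinct frequencies k_j and standard deviations sigma_j.
FREQS = [1, 3, 4]
SIGMAS = [0.5, 2.0, 1.5]


def corr(t, s):
    """C(t, s) = sum_j sigma_j^2 * exp(i * (t - s) * k_j)."""
    return sum(sig ** 2 * cmath.exp(1j * (t - s) * k) for k, sig in zip(FREQS, SIGMAS))


def phi(j, t):
    """phi_j(t) = exp(i * t * k_j) / sqrt(2 * pi)."""
    return cmath.exp(1j * t * FREQS[j]) / math.sqrt(2 * math.pi)


def apply_kernel(j, t, grid=64):
    """Riemann sum for the integral of C(t, s) * phi_j(s) over [0, 2*pi)."""
    step = 2 * math.pi / grid
    return step * sum(corr(t, m * step) * phi(j, m * step) for m in range(grid))


residuals = [
    abs(apply_kernel(j, t) - 2 * math.pi * SIGMAS[j] ** 2 * phi(j, t))
    for j in range(len(FREQS))
    for t in (0.0, 1.3, 5.0)
]
```

All residuals are at rounding level, so each $\varphi_j$ behaves as an eigenfunction with eigenvalue $2\pi\sigma_j^2$ in this toy setting.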

2.6 Integration with respect to orthogonal random measures

The results and methods presented in this section will be used to prove integral representations for random fields. Throughout the present section $(W, \mathcal{B})$ denotes a measurable space.

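Definition 2.6.1. A mapping $\zeta : \mathcal{B} \to L^2(\Omega, \mathcal{A}, P)$ is called a random orthogonal measure on $\mathcal{B}$ (or on $(W, \mathcal{B})$), if
(i) $(\zeta(B), \zeta(D)) = 0$ for all $B, D \in \mathcal{B}$ with $B \cap D = \emptyset$;
(ii) $\zeta\big( \bigcup_{j=1}^{\infty} B_j \big) = \sum_{j=1}^{\infty} \zeta(B_j)$ for all mutually disjoint sets $B_j \in \mathcal{B}$.
If $\zeta$ is a random orthogonal measure, then the mapping $\sigma : \mathcal{B} \to [0, \infty)$, defined by

$$\sigma(B) := (\zeta(B), \zeta(B)) = \|\zeta(B)\|^2, \quad B \in \mathcal{B}$$

is called the structure measure of $\zeta$.¹²

By $H(\zeta)$ we denote the smallest subspace of $L^2(\Omega, \mathcal{A}, P)$ containing the set $\{\zeta(B) : B \in \mathcal{B}\}$.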

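Lemma 2.6.2. The structure measure of a random orthogonal measure $\zeta$ is a finite nonnegative measure on $\mathcal{B}$.

Proof. Equation (2.6.1.i) with $B = D = \emptyset$ shows that $\sigma(\emptyset) = 0$. If the sets $B_j \in \mathcal{B}$ are mutually disjoint, then by (2.6.1.i), the $\zeta(B_j)$'s are mutually orthogonal. From here and from (2.6.1.ii) we obtain

$$\sigma\Big( \bigcup_{j=1}^{\infty} B_j \Big) = \Big\| \zeta\Big( \bigcup_{j=1}^{\infty} B_j \Big) \Big\|^2 = \Big\| \sum_{j=1}^{\infty} \zeta(B_j) \Big\|^2 = \sum_{j=1}^{\infty} \|\zeta(B_j)\|^2 = \sum_{j=1}^{\infty} \sigma(B_j).$$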

Theorem 2.6.3. Let $\zeta$ be a random orthogonal measure on $\mathcal{B}$ with structure measure $\sigma$. Then there exists an isometric linear operator $I$ from $L^2(\sigma)$ onto $H(\zeta)$ such that

$$I\Big( \sum_{j=1}^{n} c_j 1_{B_j} \Big) = \sum_{j=1}^{n} c_j \zeta(B_j)$$

holds for all $n \in \mathbb{N}$ and all $c_j \in \mathbb{C}$, $B_j \in \mathcal{B}$.

Proof. Let $L$ denote the linear manifold of all functions $f$ of the form $f = \sum_{j=1}^{n} c_j 1_{B_j}$. This manifold is dense in $L^2(\sigma)$. For $f \in L$ we define $I(f)$ by the equation above. It is easy to check that this definition is correct, i.e., the right-hand side does not depend on the special representation of $f$. It is clear that $I$ is a linear operator from $L$ into $H(\zeta)$. By the definition of $H(\zeta)$, the range of $I$ is dense in $H(\zeta)$.
An arbitrary step function $f$ can be written in the form above with mutually disjoint sets $B_j$. By orthogonality, we obtain

$$(I(f), I(f)) = \sum_{j=1}^{n} |c_j|^2 \sigma(B_j) = (f, f)$$

(we denote the inner product in both spaces by $(\cdot, \cdot)$). Thus, the mapping $I$ is isometric. By Theorem D.3.6, this mapping can be uniquely extended to an isometric linear operator from $L^2(\sigma)$ onto $H(\zeta)$.

12 It is possible to define random orthogonal measures where the structure measure is not finite (see for example [19]). However, we do not need this more general notion in the present book.

In the sequel we will use the notation

$$\int_W f(t) \, d\zeta(t) := \int f(t) \, d\zeta(t) := \int f \, d\zeta := I(f), \quad f \in L^2(\sigma).$$

For a set $A \in \mathcal{B}$ we write

$$\int_A f(t) \, d\zeta(t) := \int 1_A(t) f(t) \, d\zeta(t), \quad f \in L^2(\sigma).$$

In the next theorem we list the basic properties of this integral which follow immediately from Theorem 2.6.3.

Theorem 2.6.4. Let $\zeta$ be a random orthogonal measure. For all $f, g, f_n \in L^2(\sigma)$ and $a, b \in \mathbb{C}$ we have
(i) $\int [a f(t) + b g(t)] \, d\zeta(t) = a \int f(t) \, d\zeta(t) + b \int g(t) \, d\zeta(t)$
(ii) $\big( \int f(t) \, d\zeta(t), \int g(t) \, d\zeta(t) \big) = \int f(t) \overline{g(t)} \, d\sigma(t)$
(iii) $\lim_{n \to \infty} \int f_n(t) \, d\zeta(t) = \int \lim_{n \to \infty} f_n(t) \, d\zeta(t)$ whenever the sequence $\{f_n\}$ is convergent.

Example 2.6.5. Let $t_j \in W$ and $X_j \in L^2(\Omega, \mathcal{A}, P)$, $j \in \mathbb{N}$, where the $t_j$'s are mutually distinct and the $X_j$'s are mutually orthogonal such that

$$\sum_{j=1}^{\infty} \|X_j\|^2 < \infty.$$

For an arbitrary subset $B \subset W$ we define

$$\zeta(B) := \sum_{j :\, t_j \in B} X_j.$$

Thus,

$$\zeta = \sum_{j=1}^{\infty} X_j \delta_{t_j}. \tag{1}$$

Then $\zeta$ is a random orthogonal measure with a discrete structure measure

$$\sigma = \sum_{j=1}^{\infty} \|X_j\|^2 \delta_{t_j}.$$

Choosing $B_j = \{t_j\}$ in Theorem 2.6.3 and taking the limit $n \to \infty$ we obtain

$$\int f(t) \, d\zeta(t) = \sum_{j=1}^{\infty} f(t_j) X_j$$

where $f$ is an arbitrary function on $W$.¹³

It is easy to see that the following converse statement is true as well: If $\sigma$ is discrete, then $\zeta$ can be represented in the form (1).
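A finite version of this example makes the isometry of Theorem 2.6.3 concrete. The following is a minimal sketch under stated assumptions: a uniform four-point sample space, three mutually orthogonal random variables `VARIABLES` attached to three atoms `ATOMS`, and sets represented by predicates. All of these names and values are hypothetical.

```python
# Assumed toy model: uniform probability on a 4-point sample space.
N_OUTCOMES = 4
ATOMS = [0.0, 1.0, 2.5]
# Mutually orthogonal random variables X_j, one per atom t_j.
VARIABLES = [
    [2.0, 2.0, 2.0, 2.0],
    [0.5, -0.5, 0.5, -0.5],
    [3.0, 3.0, -3.0, -3.0],
]


def inner(x, y):
    """Inner product (X, Y) = E(X * conj(Y)) under the uniform distribution."""
    return sum(a * b.conjugate() for a, b in zip(x, y)) / N_OUTCOMES


def random_measure(in_set):
    """Value of the random measure on B: sum of X_j over atoms t_j in B."""
    total = [0.0] * N_OUTCOMES
    for t, x in zip(ATOMS, VARIABLES):
        if in_set(t):
            total = [acc + v for acc, v in zip(total, x)]
    return total


def stochastic_integral(f):
    """Integral of f against the random measure: sum_j f(t_j) * X_j."""
    total = [0j] * N_OUTCOMES
    for t, x in zip(ATOMS, VARIABLES):
        total = [acc + f(t) * v for acc, v in zip(total, x)]
    return total


def structure_integral(g):
    """Integral of g against the structure measure: sum_j g(t_j) * ||X_j||^2."""
    return sum(g(t) * inner(x, x) for t, x in zip(ATOMS, VARIABLES))


f = lambda t: complex(t, 1.0 - t)
integral = stochastic_integral(f)
lhs = inner(integral, integral).real
rhs = structure_integral(lambda t: abs(f(t)) ** 2)
cross = inner(random_measure(lambda t: t < 0.5), random_measure(lambda t: t >= 0.5))
```

Here `lhs` and `rhs` coincide, which is the isometry between the stochastic integral and the integral against the structure measure, and `cross` vanishes because the two sets are disjoint.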

Definition 2.6.6. A second order process $Z$ on $T = [0, \infty)$ is called a process with orthogonal increments, if

$$(Z(t_2) - Z(t_1), Z(t_4) - Z(t_3)) = 0, \quad 0 \le t_1 \le t_2 \le t_3 \le t_4, \; t_j \in T.$$

The Wiener process and the Poisson process on $[0, \infty)$ have orthogonal increments (cf. Theorem 2.1.4 and Theorem 2.1.5). The next theorem shows that for each process with orthogonal increments there is an associated random orthogonal measure.

Theorem 2.6.7. Let $Z$ be a bounded, continuous¹⁴ process with orthogonal increments on $T = [0, \infty)$. Then there exists a unique orthogonal random measure $\zeta$ on $\mathcal{B}(T)$ such that $H(\zeta) \subset H(Z)$ and

$$\zeta([t, s)) = Z(s) - Z(t), \quad 0 \le t < s < \infty. \tag{1}$$

Proof. We define the function $F$ by

$$F(t) = \|Z(t) - Z(0)\|^2, \quad t \in [0, \infty).$$

This function is continuous and bounded. Since the increments are orthogonal we have

$$F(t) = \|Z(t) - Z(s) + Z(s) - Z(0)\|^2 = \|Z(t) - Z(s)\|^2 + F(s)$$

where $0 \le s \le t$. Thus, the function $F$ is increasing and

$$F(t) - F(s) = \|Z(t) - Z(s)\|^2, \quad 0 \le s \le t.$$

By basic results of measure theory, there exists a finite nonnegative measure $\sigma$ on $\mathcal{B}(T)$ such that

$$\sigma([s, t)) = F(t) - F(s), \quad s \le t.$$

For a set $B \subset T$ of the form $B = [t, s)$ we define $\zeta(B)$ by (1). In the same way as in the proof of Theorem 2.6.3 we define $I : L^2(\sigma) \to H(Z)$ for step functions from $L^2(\sigma)$; we use however only Borel sets of this special form. These functions span a dense linear manifold and hence $I$ can be extended to an isometric operator from $L^2(\sigma)$ into $H(Z)$ (see Theorem D.3.6). Now, let $B \subset T$ be an arbitrary Borel set and write $\zeta(B) := I(1_B)$. The fact that $I$ is isometric implies that $\zeta$ is an orthogonal random measure. The uniqueness follows from the density of the linear manifold used above.

13 If $\sigma(\{t_j\}) > 0$ it is meaningful to write $f(t_j)$ for $f \in L^2(\sigma)$. Otherwise $X_j = 0$ and hence it is not necessary to evaluate $f$ at $t_j$.
14 Actually, it is sufficient to assume continuity from the left.

Example 2.6.8. Let $Z$ be the Wiener process or the Poisson process on $[0,\infty)$. Denote by $\zeta$ and $\sigma$ the corresponding random orthogonal measure and the structure measure, in the sense of the previous theorem, respectively. For $B = [0,t) \subset [0,\infty)$ we have
\[ \sigma(B) = (\zeta(B), \zeta(B)) = (Z(t), Z(t)) = t. \]
From this we conclude that $\sigma = \lambda|_{[0,\infty)}$, where $\lambda$ denotes the Lebesgue measure.

Definition 2.6.9. Let $Z$ and $\zeta$ be as in Theorem 2.6.7. We define the integral $\int_a^b f(t)\,dZ(t)$, $0 \le a \le b$, by
\[ \int_a^b f(t)\,dZ(t) := \int_a^b f(t)\,d\zeta(t), \qquad f \in L^2(\sigma), \]
where $\sigma$ is the structure measure of $\zeta$.

Theorem 2.6.10 (integration by parts). Let $f$ be an absolutely continuous function on $[0,t]$, $t > 0$, and let $Z$ be a continuous process on $[0,\infty)$ with orthogonal increments. Then we have
\[ \int_0^t f(s)\,dZ(s) = f(t)Z(t) - f(0)Z(0) - \int_0^t Z(s)\,f'(s)\,ds. \]

Proof. Denote by $\sigma$ the structure measure of the random orthogonal measure corresponding to $Z$ (cf. Theorem 2.6.7). We write
\[ t_{i,n} := \frac{t\,i}{n}, \qquad 0 \le i \le n, \]
and define the function $f_n$ on $[0,t]$ by $f_n(s) = f(t_{i,n})$ if $s \in [t_{i,n}, t_{i+1,n})$ and $f_n(t) = f(t)$. The sequence $\{f_n\}$ converges uniformly, hence also in $L^2(\sigma)$, to $f$. Using Theorem 2.6.4 we obtain
\[ \int_0^t f(s)\,dZ(s) = \lim_{n\to\infty} \int_0^t f_n(s)\,dZ(s) = \lim_{n\to\infty} \sum_{i=0}^{n-1} f_n(t_{i,n})\,[Z(t_{i+1,n}) - Z(t_{i,n})]. \]
Using summation by parts$^{15}$ the last sum can be written as
\[ f(t)Z(t) - f(0)Z(0) - \sum_{i=0}^{n-1} Z(t_{i+1,n})\,[f(t_{i+1,n}) - f(t_{i,n})]. \]
The sum above is equal to
\[ \int_0^t Z(g_n(s))\,f'(s)\,ds \tag{1} \]
where $g_n(s) = t_{i+1,n}$ whenever $s \in [t_{i,n}, t_{i+1,n})$. Denote by $C$ the correlation function of $Z$. Using the inequality $|g_n(s) - s| \le t/n$, the equation
\[ \|Z(s) - Z(g_n(s))\|^2 = C(s,s) - C(s,g_n(s)) - C(g_n(s),s) + C(g_n(s),g_n(s)) \]
and the uniform continuity of $C$ on $[0,t] \times [0,t]$ we see that for any $\varepsilon > 0$ the inequality
\[ \|Z(s) - Z(g_n(s))\|^2 < \varepsilon \]
holds if $n$ is sufficiently large. Applying this and inequality (2.4.5.2) we conclude that the integral (1) converges to
\[ \int_0^t Z(s)\,f'(s)\,ds \]
as $n \to \infty$. The proof is complete.
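
The summation by parts step used in this proof can be checked numerically. The following is only a sketch for real sequences; the function names are illustrative and not part of the text. It verifies the identity $\sum_{i=0}^{n-1} a_i (b_{i+1} - b_i) = a_n b_n - a_0 b_0 - \sum_{i=0}^{n-1} b_{i+1} (a_{i+1} - a_i)$ on random data.

```python
import random


def abel_lhs(a, b):
    """Left-hand side: sum_{i=0}^{n-1} a[i] * (b[i+1] - b[i])."""
    n = len(a) - 1
    return sum(a[i] * (b[i + 1] - b[i]) for i in range(n))


def abel_rhs(a, b):
    """Boundary term minus the 'transposed' sum with b[i+1] as weight."""
    n = len(a) - 1
    boundary = a[n] * b[n] - a[0] * b[0]
    return boundary - sum(b[i + 1] * (a[i + 1] - a[i]) for i in range(n))


random.seed(0)
a = [random.uniform(-1, 1) for _ in range(9)]  # plays the role of f(t_{i,n})
b = [random.uniform(-1, 1) for _ in range(9)]  # plays the role of Z(t_{i,n})
gap = abs(abel_lhs(a, b) - abel_rhs(a, b))
print(gap)
```

Note that the weight in the second sum is $b_{i+1}$, not $b_i$; with $b_i$ the identity fails already for a single term.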

We end this section with a lemma which will be useful in Section 5.4.

Lemma 2.6.11. Let $\zeta$ be a random orthogonal measure on $(W,\mathcal{B})$ with structure measure $\sigma$ and let $g \in L^2(\sigma)$. Then
\[ \zeta_g(A) := \int_W 1_A\,g\,d\zeta = \int_A g\,d\zeta, \qquad A \in \mathcal{B}, \]

15 $\displaystyle \sum_{k=m}^{n} a_k (b_{k+1} - b_k) = [a_{n+1} b_{n+1} - a_m b_m] - \sum_{k=m}^{n} b_{k+1} (a_{k+1} - a_k)$

defines a random orthogonal measure $\zeta_g$ on $(W,\mathcal{B})$ with structure measure $\sigma_g$ given by $d\sigma_g = |g|^2\,d\sigma$. Moreover, we have

(i) $\displaystyle \int f\,d\zeta_g = \int f g\,d\zeta, \quad f \in L^2(\sigma_g)$;

(ii) the $\sigma_g$-measure of the set of zeros of $g$ is equal to 0, the function $1/g$ is in $L^2(\sigma_g)$ and
\[ \zeta(A) = \int_A \frac{1}{g}\,d\zeta_g, \qquad A \in \mathcal{B}. \]
Proof. By equation (2.6.4.ii),
\[ (\zeta_g(A), \zeta_g(B)) = \int_W 1_A 1_B\,|g|^2\,d\sigma = \int_{A \cap B} |g|^2\,d\sigma \]
and hence (2.6.1.i) holds. Let $B_j \in \mathcal{B}$, $j \in \mathbb{N}$, be mutually disjoint. Lebesgue's dominated convergence theorem shows that
\[ \lim_{N\to\infty} 1_{\bigcup_1^N B_j} = 1_{\bigcup_1^\infty B_j} \quad\text{and}\quad \lim_{N\to\infty} g\,1_{\bigcup_1^N B_j} = g\,1_{\bigcup_1^\infty B_j}, \]
the convergence being in $L^2(\sigma)$. From (2.6.4.iii) we see that (2.6.1.ii) holds, i.e., $\zeta_g$ is a random orthogonal measure.
The relation (i) is true for step functions by the definition of $\zeta_g$. Let $\{f_n\}$ be a sequence of step functions converging in $L^2(\sigma_g)$ to $f$. Since $d\sigma_g = |g|^2\,d\sigma$, the sequence $\{f_n g\}$ converges in $L^2(\sigma)$ to $fg$, from which (i) follows.
The first two statements in (ii) follow from $d\sigma_g = |g|^2\,d\sigma$, while the third one follows from (i) replacing $f$ by $1_A \cdot g^{-1}$.

2.7 The theorem of Karhunen


Let $\zeta$ be a random orthogonal measure on the measurable space $(W,\mathcal{B})$ with structure measure $\sigma$. Further, let $g : T \times W \to \mathbb{C}$ be such that $g(t,\cdot) \in L^2(\sigma)$ for all $t \in T$. We define the second order field $Z$ by
\[ Z(t) = \int_W g(t,x)\,d\zeta(x), \qquad t \in T. \]
By virtue of (2.6.4.ii) the correlation function $C$ of $Z$ is given by
\[ C(t,s) = \int_W g(t,x)\,\overline{g(s,x)}\,d\sigma(x), \qquad t,s \in T. \]
W

In the next theorem we consider the converse direction. We assume that the correlation
function admits an integral representation of this kind and deduce from it an integral
representation for the field.
Section 2.7 The theorem of Karhunen 99

Theorem 2.7.1 (Karhunen). Let $Z : T \to L^2(\Omega,\mathcal{A},P)$ be a field with correlation function $C$. Assume that $C$ admits the representation
\[ C(t,s) = \int_W g(t,x)\,\overline{g(s,x)}\,d\sigma(x), \qquad t,s \in T \tag{1} \]
where
(i) $\sigma$ is a finite nonnegative measure on a measurable space $(W,\mathcal{B})$;
(ii) $g$ is a complex-valued function on $T \times W$ with $g(t,\cdot) \in L^2(\sigma)$ for all $t \in T$;
(iii) the linear manifold $L_g := \operatorname{span}\{g(t,\cdot) : t \in T\}$ is dense in $L^2(\sigma)$.
Then there exists a uniquely determined random orthogonal measure $\zeta$ on $(W,\mathcal{B})$ with values in $L^2(\Omega,\mathcal{A},P)$ and with structure measure $\sigma$ such that
\[ Z(t) = \int_W g(t,x)\,d\zeta(x), \qquad t \in T \tag{2} \]
and $H(Z) = H(\zeta)$.


Proof. Let
\[ f(x) = \sum_{k=1}^{n} c_k\,g(t_k,x), \qquad t_k \in T,\ x \in W,\ c_k \in \mathbb{C} \]
be an arbitrary function in $L_g$. We define the linear mapping $I : L_g \to H(Z)$ by
\[ I(f) = \sum_{k=1}^{n} c_k\,Z(t_k). \]
We have to show that this definition is correct, i.e., the sum above depends only on $f$ but not on the special choice of the $c_k$'s and $t_k$'s. Using equation (1) we obtain
\[ \Bigl(\sum_{k=1}^{n} c_k Z(t_k),\ Z(s)\Bigr) = \sum_{k=1}^{n} c_k\,C(t_k,s) = \sum_{k=1}^{n} c_k \int_W g(t_k,x)\,\overline{g(s,x)}\,d\sigma(x) = \int_W f(x)\,\overline{g(s,x)}\,d\sigma(x), \qquad s \in T. \]
Using this and the fact that the linear span of the set $\{Z(s) : s \in T\}$ is dense in $H(Z)$ we conclude that $\sum_{k=1}^{n} c_k Z(t_k)$ depends only on $f$.
It is clear that $I$ is linear. A similar computation as above shows that
\[ (I(f), I(f)) = \sum_{j,k=1}^{n} c_j\,\overline{c_k}\,C(t_j,t_k) = \int_W f(x)\,\overline{f(x)}\,d\sigma(x) = (f,f), \]
i.e., $I$ is isometric. Consequently, $I$ can be extended to an isometric linear operator from $\overline{L_g} = L^2(\sigma)$ into $H(Z)$ (cf. Theorem D.3.6). We denote also the extension by $I$ and define $\zeta$ by
\[ \zeta(B) = I(1_B), \qquad B \in \mathcal{B}. \tag{3} \]
Using the fact that $I$ is isometric we see that $\zeta$ is a random orthogonal measure with structure measure $\sigma$. Write
\[ X(t) := \int g(t,x)\,d\zeta(x), \qquad t \in T. \]
Next we show that $Z = X$. We have
\[ (Z(t), \zeta(B)) = (I(g(t,\cdot)), I(1_B)) = \int g(t,x)\,1_B(x)\,d\sigma(x), \qquad B \in \mathcal{B}. \tag{4} \]
For fixed $t$, let $g_n = \sum_k c_k^n 1_{B_k^n}$, $n \in \mathbb{N}$, be step functions converging to $g(t,\cdot)$ in $L^2(\sigma)$ as $n \to \infty$. Then, by (2.6.4.iii),
\[ \lim_{n\to\infty} \sum_k c_k^n\,\zeta(B_k^n) = \lim_{n\to\infty} \int g_n(x)\,d\zeta(x) = X(t). \]
Combining this with (4) we conclude that
\[ (Z(t), X(s)) = \int g(t,x)\,\overline{g(s,x)}\,d\sigma(x) \]
and hence
\[ (Z(t), X(s)) = (Z(t), Z(s)), \qquad t,s \in T. \]
Since $\operatorname{span}\{Z(t) : t \in T\}$ is dense in $H(Z)$ we must have $X(s) = Z(s)$.
The representation (2) shows that $H(Z) \subseteq H(\zeta)$. By the definition of $\zeta$ we also have $H(\zeta) \subseteq H(Z)$, i.e., $H(\zeta) = H(Z)$.
Finally, let $\eta$ be a random orthogonal measure on $(W,\mathcal{B})$ with structure measure $\sigma$ such that
\[ \int_W g(t,x)\,d\zeta(x) = \int_W g(t,x)\,d\eta(x), \qquad t \in T. \]
The assumption (iii) implies that
\[ \int_W h\,d\zeta = \int_W h\,d\eta, \qquad h \in L^2(\sigma). \]
Setting $h = 1_B$, $B \in \mathcal{B}$, we obtain $\zeta = \eta$.

It is possible to prove a similar result (see [34]) without using the density condition (2.7.1.iii). In this case, besides losing the equality of the spaces $H(Z)$ and $H(\zeta)$, the random orthogonal measure has values in a possibly larger space. This is the motivation for the next theorem where we use the same notation as in Theorem 2.7.1.

Theorem 2.7.2. Assume that $C$ admits the integral representation (2.7.1.1) satisfying the conditions (2.7.1.i) and (2.7.1.ii). Then the integral representation (2.7.1.2) with a random orthogonal measure $\zeta : \mathcal{B} \to L^2(\Omega,\mathcal{A},P)$ having structure measure $\sigma$ is possible if and only if $\dim L_g^{\perp} \le \dim H(Z)^{\perp}$.

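Proof. Assume first that $\dim L_g^{\perp} \le \dim H(Z)^{\perp}$ and consider the orthogonal decomposition
\[ L^2(W,\mathcal{B},\sigma) = \overline{L_g} \oplus L_g^{\perp}. \]
Choose an orthonormal basis $\{e_\iota : \iota \in I\}$ of $L_g^{\perp}$, where $I$ is a nonempty index set such that $T \cap I = \emptyset$. By assumption there exists an orthonormal system $\{Y_\iota : \iota \in I\}$ in $H(Z)^{\perp}$. Now we consider the random field
\[ \widetilde{X}(t) := \begin{cases} Z(t), & \text{if } t \in T \\ Y_t, & \text{if } t \in I \end{cases} \]
for all $t \in T \cup I$. We write further for $t \in T \cup I$
\[ \widetilde{g}(t,\cdot) := \begin{cases} g(t,\cdot), & \text{if } t \in T \\ e_t, & \text{if } t \in I. \end{cases} \]
Then $\widetilde{X}(t) \in L^2(\Omega,\mathcal{A},P)$ for all $t \in T \cup I$ and
\[ \widetilde{C}(s,t) = (\widetilde{X}(s), \widetilde{X}(t)) = \int_W \widetilde{g}(s,x)\,\overline{\widetilde{g}(t,x)}\,d\sigma(x). \]
Since $L_{\widetilde{g}}$ is dense in $L^2(W,\mathcal{B},\sigma)$, Theorem 2.7.1 yields the existence of a uniquely determined random orthogonal measure $\zeta$ with structure measure $\sigma$ and values in $L^2(\Omega,\mathcal{A},P)$ such that
\[ \widetilde{X}(t) = \int_W \widetilde{g}(t,x)\,d\zeta(x). \]
In particular,
\[ Z(t) = \int_W g(t,x)\,d\zeta(x) \]
for all $t \in T$.
For the converse direction assume that
\[ \dim H(Z)^{\perp} < \dim L_g^{\perp}. \]
As before, choose an orthonormal basis $\{e_\iota : \iota \in I\}$ of $L_g^{\perp}$, where $I$ is nonempty and $T \cap I = \emptyset$. Assume $\zeta : \mathcal{B} \to L^2(\Omega,\mathcal{A},P)$ to be a random orthogonal measure satisfying (2.7.1.2) and having structure measure $\sigma$. Set for $\iota \in I$
\[ Y_\iota := \int_W e_\iota(x)\,d\zeta(x) \in L^2(\Omega,\mathcal{A},P). \]
We then have
\[ (Z(s), Y_\iota) = \Bigl(\int_W g(s,x)\,d\zeta(x),\ \int_W e_\iota(x)\,d\zeta(x)\Bigr) = \int_W g(s,x)\,\overline{e_\iota(x)}\,d\sigma(x) = 0, \qquad s \in T,\ \iota \in I \]
and
\[ (Y_\iota, Y_\kappa) = \Bigl(\int_W e_\iota(x)\,d\zeta(x),\ \int_W e_\kappa(x)\,d\zeta(x)\Bigr) = \int_W e_\iota(x)\,\overline{e_\kappa(x)}\,d\sigma(x) = \delta_{\iota,\kappa}, \qquad \iota,\kappa \in I. \]
Thus, $\{Y_\iota : \iota \in I\}$ is an orthonormal system in $H(Z)^{\perp}$. Hence, this system can be extended to an orthonormal basis of $H(Z)^{\perp}$. Since $\dim L_g^{\perp}$ is equal to the cardinality of $I$, we obtain
\[ \dim L_g^{\perp} \le \dim H(Z)^{\perp}, \]
a contradiction to the assumption. The proof is complete.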

Remark 2.7.3. For fixed functions $C$ and $g$ the representing measure $\sigma$ in (2.7.1.1) is in general not uniquely determined. To see classical examples consider the case of $W = \mathbb{R}$ equipped with the usual Borel $\sigma$-algebra $\mathcal{B}$.
Let first
\[ g(t,x) = x^t, \qquad t \in T := \mathbb{N}_0,\ x \in \mathbb{R}. \]
A bounded positive measure $\sigma$ on the real line is in general not uniquely determined by the values of the integrals
\[ \int_{-\infty}^{\infty} g(t,x)\,\overline{g(s,x)}\,d\sigma(x) = \int_{-\infty}^{\infty} x^{t+s}\,d\sigma(x), \qquad t,s \in \mathbb{N}_0 \]
(see also Remark 3.3.7). The description of all measures with given moments is the power moment problem. For more information on the power moment problem we refer the reader to the monographs [1] and [55].
Another example is given by the trigonometric moment problem. In this case $T = [-a,a]$, $a > 0$, and
\[ g(t,x) = e^{itx}, \qquad t \in [-a,a],\ x \in \mathbb{R}. \]
We will see in Chapter 4 that $\sigma$ is in general not uniquely determined by the values of the integrals
\[ \int_{-\infty}^{\infty} g(t,x)\,\overline{g(s,x)}\,d\sigma(x) = \int_{-\infty}^{\infty} e^{i(t-s)x}\,d\sigma(x), \qquad t,s \in [-a,a]. \]
The paper [5] contains more details on the connection between Karhunen's theorem, moment problems and quadrature formulae.
Section 2.8 Stationary fields 103

2.8 Stationary fields

Recall that throughout this chapter $T$ denotes a nonempty subset of $\mathbb{R}^d$.

Definition 2.8.1. A second order field $Z$ on $T$ is said to be stationary,$^{16}$ if the expectation $M := E(Z(t))$ does not depend on $t$ and $E\bigl(Z(t) \cdot \overline{Z(s)}\bigr)$ is a function of $t - s$:
\[ E\bigl(Z(t) \cdot \overline{Z(s)}\bigr) = C(t - s), \qquad t,s \in T, \]
where $C$ is a complex-valued function defined on the set
\[ T - T = \{t - s : t,s \in T\}. \]

With a little abuse of terminology we call the function $C$, like in Definition 2.1.1, the correlation function of $Z$. If $Z$ is stationary, then the covariance
\[ E\bigl((Z(t) - M) \cdot \overline{(Z(s) - M)}\bigr) \]
is a function of $t - s$ as well. This function is called the covariance function of $Z$. Denoting it by $K$ we have
\[ K(h) = C(h) - |M|^2, \qquad h \in T - T. \]

The notions above are defined for Z (cf. Notation 2.2.6) in the same way as for Z.
Definition 2.8.2. A field $Z$ on $T$ is said to be strongly stationary or stationary in the strong sense when all its finite-dimensional distributions remain the same when shifted. That is, for all $n \in \mathbb{N}$ and all $t_1,\dots,t_n, h \in \mathbb{R}^d$ the random vectors
\[ (Z(t_1),\dots,Z(t_n)) \quad\text{and}\quad (Z(t_1 + h),\dots,Z(t_n + h)) \]
have the same distribution whenever $t_1 + h,\dots,t_n + h \in T$.

If $Z$ is a strongly stationary field of second order then it is stationary. Indeed, the random variables $Z(t)$ and $Z(t + h)$ have the same distribution and hence the same expectation. Moreover, the random vectors
\[ (Z(t), Z(s)) \quad\text{and}\quad (Z(t + h), Z(s + h)) \]
have the same distribution and hence the same correlation.$^{17}$

Next we show that for Gaussian fields the notions of stationarity and strong stationarity coincide.
16 Stationary fields are also called stationary in the wide sense, widely stationary or second order stationary. These are the main fields studied in this book, hence we call them simply stationary for brevity.
17 Actually, we used only the cases $n = 1$ and $n = 2$ of the definition of strong stationarity.

Theorem 2.8.3. Every stationary Gaussian field $Z$ is strongly stationary.

Proof. Since $Z$ is Gaussian the random vectors
\[ (Z(t_1),\dots,Z(t_n)) \quad\text{and}\quad (Z(t_1 + h),\dots,Z(t_n + h)), \]
$t_j, t_j + h \in T$, $h \in \mathbb{R}^d$, are Gaussian. Since $Z$ is stationary, they have the same expectation and the same covariance. Consequently, they have the same distribution.

Lemma 2.8.4. Let $Z$ be stationary on $T = \mathbb{R}^d$ or $T = \mathbb{Z}^d$ with a continuous correlation function $C$. Then we have
\[ (\mu * Z(t),\ \nu * Z(s)) = (\mu * \widetilde{\nu} * Z(t),\ Z(s)) = \mu * \widetilde{\nu} * C(t - s) \tag{1} \]
for all complex measures $\mu, \nu$ on $T$ and for all $s,t \in T$.

Proof. It is easy to check that the lemma holds for measures with finite support. In the general case we approximate $\mu$ and $\nu$ by measures with finite support (see Corollary 2.4.10) and apply Theorem E.2.5.

Definition 2.8.5. A second order field $Z$ is called white noise with expectation $\mu$ and variance $\sigma^2$, $\sigma \ge 0$, if the random variables $Z(t)$, $t \in T$, are uncorrelated and $E(Z(t)) = \mu$ and $\operatorname{Var}(Z(t)) = \sigma^2$ for all $t \in T$.

We will use the notation $Z \sim WN(\mu,\sigma^2)$ to indicate that $Z$ is white noise with expectation $\mu$ and variance $\sigma^2$. White noise is stationary with covariance function
\[ K(h) = \begin{cases} \sigma^2, & h = 0 \\ 0, & h \ne 0. \end{cases} \]
If the random variables $Z(t)$, $t \in T$, are not identically distributed, then $Z$ is not strongly stationary.
Example 2.8.6.
(1) Let $X \in L^2(\Omega,\mathcal{A},P)$, $T = \mathbb{R}^d$ and $x \in \mathbb{R}^d$. The field
\[ Z(t) = e^{i(t,x)} X, \qquad t \in \mathbb{R}^d \tag{1} \]
is stationary since
\[ (Z(t + h), Z(t)) = e^{i(t+h,x)} \cdot \overline{e^{i(t,x)}}\,(X,X) = e^{i(h,x)}\,(X,X). \]
The question arises: under which conditions on a complex-valued function $f$ on $\mathbb{R}^d$ is the field $Z : t \mapsto f(t)X$ stationary? Without loss of generality assume that $(X,X) = 1$ and $f(0) = 1$. The field $Z$ is stationary if and only if the relation
\[ (f(t + h)X,\ f(t)X) = f(t + h)\,\overline{f(t)} = g(h), \qquad t,h \in \mathbb{R}^d \]
holds with some function $g$. Setting $t = 0$ we see that $f = g$ while setting $h = 0$ we obtain $|f| = 1$. Thus, $f$ is a character of the group $\mathbb{R}^d$ (cf. Definition 1.4.5). If $f$ is continuous then, by Theorem C.9.2, the function $f$ has the form $f(h) = e^{i(h,x)}$ with some $x \in \mathbb{R}^d$.

(2) The example given by equation (1) can be generalized as follows. Let $X_1,\dots,X_n \in L^2(\Omega,\mathcal{A},P)$ be an orthonormal system such that $E(X_j) = m$ and let $x_1,\dots,x_n \in \mathbb{R}^d$. Then the field $Z$ defined by
\[ Z(t) = \sum_{j=1}^{n} e^{i(t,x_j)} X_j, \qquad t \in \mathbb{R}^d \]
is stationary with correlation function
\[ C(h) = \sum_{j=1}^{n} (X_j,X_j) \cdot e^{i(h,x_j)}, \qquad h \in \mathbb{R}^d. \]

(3) To give an example of a real stationary field let $A, B \in L^2_r(\Omega,\mathcal{A},P)$ be such that $(A,A) = (B,B) = 1$ and $(A,B) = (A,1) = (B,1) = 0$. Further, let $\lambda \in \mathbb{R}^d$ and put
\[ Z(t) = A \cos \lambda t + B \sin \lambda t, \qquad t \in \mathbb{R}^d \]
where we write $\lambda t$ instead of $(\lambda,t)$, for brevity. Then $E(Z(t)) = 0$. Using trigonometric identities we obtain
\begin{align*}
(Z(t + h), Z(t)) &= (A \cos \lambda(t + h) + B \sin \lambda(t + h),\ A \cos \lambda t + B \sin \lambda t) \\
&= \cos \lambda(t + h) \cdot \cos \lambda t + \sin \lambda(t + h) \cdot \sin \lambda t \\
&= \cos \lambda h.
\end{align*}
Thus, $Z$ is stationary with correlation function $h \mapsto \cos((\lambda,h))$.

Example 2.8.7. (1) In this example we generalize the field defined in equation (2.8.6.1) by making the exponent random. Let $X$ be a real random variable and $Y$ be a $d$-dimensional real random vector. Assume that $X$ and $Y$ are independent, $E(X) = 0$ and $\operatorname{Var}(X) = \sigma^2 < \infty$. We define the field $Z$ by
\[ Z(t) = e^{i(t,Y)} \cdot X, \qquad t \in \mathbb{R}^d. \]
This field is stationary since
\[ E(Z(t)) = E(X) \cdot E\bigl(e^{i(t,Y)}\bigr) = 0 \]
and
\begin{align*}
(Z(t + h), Z(t)) &= \bigl(e^{i(t+h,Y)} \cdot X,\ e^{i(t,Y)} \cdot X\bigr) \\
&= E\bigl(e^{i(h,Y)} \cdot X^2\bigr) \\
&= \sigma^2 \cdot f_Y(h), \qquad h \in \mathbb{R}^d
\end{align*}
where $f_Y$ denotes the characteristic function of $Y$.

(2) Now we construct a real stationary field in a similar way. Let $X$ and $Y$ be as above and choose a real random variable $\Psi$ such that $X, Y, \Psi$ are independent and$^{18}$
\[ E(\cos \Psi) = E(\sin \Psi) = 0 \quad\text{and}\quad E(\cos^2 \Psi) = E(\sin^2 \Psi) = \frac{1}{2}. \tag{1} \]
We define the field $Z$ by
\[ Z(t) = X \cdot \cos((t,Y) + \Psi), \qquad t \in \mathbb{R}^d. \]
Then $E(Z(t)) = 0$ and
\[ (Z(t + h), Z(t)) = \sigma^2 \cdot E\bigl(\cos((t + h,Y) + \Psi) \cdot \cos((t,Y) + \Psi)\bigr). \]
Applying trigonometric identities and using (1) we obtain
\begin{align*}
(Z(t + h), Z(t)) &= \frac{\sigma^2}{2} \cdot E\bigl(\cos((t + h)Y) \cos(tY) + \sin((t + h)Y) \sin(tY)\bigr) \\
&= \frac{\sigma^2}{2} \cdot E \cos((h,Y)) \\
&= \frac{\sigma^2}{2} \cdot \operatorname{Re} f_Y(h)
\end{align*}
where $f_Y$ is the characteristic function of $Y$.

Example 2.8.8. (1) Let $Z_j$, $j \in \mathbb{Z}$, be independent and identically distributed such that $E(Z_j) = 0$ and $\operatorname{Var}(Z_j) = \sigma^2$. For $\theta \in \mathbb{R}$ we define the time series $X$ by$^{19}$
\[ X_j = Z_j + \theta Z_{j-1}, \qquad j \in \mathbb{Z}. \]

18 These equations hold, e.g., if $\Psi$ is uniformly distributed in $[0,2\pi]$.
19 A time series of this form is called a moving average sequence of order 1.

This time series is stationary:
\[ (X_{j+h}, X_j) = (Z_{j+h} + \theta Z_{j-1+h},\ Z_j + \theta Z_{j-1}) = \begin{cases} (1 + \theta^2)\,\sigma^2, & h = 0, \\ \theta \sigma^2, & h = \pm 1, \\ 0, & |h| > 1. \end{cases} \]
It is even strongly stationary. To prove this we show that the characteristic function $f_Y$ of the random vector
\[ Y = (X_{1+h}, X_{2+h}, \dots, X_{n+h}), \qquad h \in \mathbb{Z},\ n \in \mathbb{N} \]
does not depend on $h$. Denote by $f$ the characteristic function of $Z_j$. For $t = (t_1,\dots,t_n) \in \mathbb{R}^n$ we have
\[ (Y,t) = \sum_{j=1}^{n} (Z_{j+h} + \theta Z_{j-1+h})\,t_j = \theta t_1 Z_h + t_n Z_{n+h} + \sum_{j=1}^{n-1} (t_j + \theta t_{j+1})\,Z_{j+h} \]
and therefore
\[ f_Y(t) = E\bigl(e^{i(Y,t)}\bigr) = f(\theta t_1) \cdot f(t_n) \cdot \prod_{j=1}^{n-1} f(t_j + \theta t_{j+1}). \]
Thus, $f_Y$ does not depend on $h$.
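
The three cases of the moving average correlation can be recovered mechanically from the coefficients of the noise terms. The sketch below assumes uncorrelated noise with variance `sigma2`; the helper name and the sample values are illustrative only.

```python
def ma1_autocovariance(theta, sigma2, h):
    """(X_{j+h}, X_j) for X_j = Z_j + theta * Z_{j-1} with uncorrelated noise.

    psi[k] is the coefficient of Z_{j-k} in X_j; only matching noise terms
    contribute, so the result is sigma2 * sum_k psi[k] * psi[k + |h|].
    """
    psi = {0: 1.0, 1: theta}
    return sigma2 * sum(c * psi.get(k + abs(h), 0.0) for k, c in psi.items())


theta, sigma2 = 0.5, 2.0
values = [ma1_autocovariance(theta, sigma2, h) for h in range(-3, 4)]
print(values)
```

For these sample values the lag-0 entry is $(1 + \theta^2)\sigma^2 = 2.5$, the lag-$\pm 1$ entries are $\theta\sigma^2 = 1$, and all entries with $|h| > 1$ vanish.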


(2) Let $T = \mathbb{Z}$, $Z \sim WN(0,\sigma^2)$ and $a \in \mathbb{C}$ such that $|a| < 1$. We are looking for a stationary time series $X$ satisfying the equation$^{20}$
\[ X_n = Z_n + a X_{n-1}, \qquad n \in \mathbb{Z}. \tag{1} \]
Applying this equation iteratively we obtain
\begin{align*}
X_n &= Z_n + a Z_{n-1} + a^2 X_{n-2} \\
&= \cdots \\
&= Z_n + a Z_{n-1} + \cdots + a^k Z_{n-k} + a^{k+1} X_{n-k-1}.
\end{align*}
If $X$ is stationary, then $\|X_n\|$ is constant, so
\[ \Bigl\| X_n - \sum_{j=0}^{k} a^j Z_{n-j} \Bigr\|^2 = |a|^{2k+2}\,\|X_{n-k-1}\|^2 \to 0, \qquad k \to \infty. \]
Thus,
\[ X_n = \sum_{j=0}^{\infty} a^j Z_{n-j}, \qquad n \in \mathbb{Z}. \tag{2} \]

20 A time series satisfying this equation is called an autoregressive sequence of order 1.

By Theorem F.2.3 this series converges with probability one. Now we show that the time series $X = \{X_n\}_{n \in \mathbb{Z}}$ is stationary. Indeed,
\[ (X_n, 1) = \sum_{j=0}^{\infty} a^j (Z_{n-j}, 1) = 0 \]
for all $n$ and for $h \ge 0$ we have
\begin{align*}
(X_{n+h}, X_n) &= \Bigl( \sum_{j=0}^{\infty} a^j Z_{n+h-j},\ \sum_{k=0}^{\infty} a^k Z_{n-k} \Bigr) \\
&= \Bigl( \sum_{j=h}^{\infty} a^j Z_{n+h-j},\ \sum_{k=0}^{\infty} a^k Z_{n-k} \Bigr) \\
&= \Bigl( \sum_{j=0}^{\infty} a^{j+h} Z_{n-j},\ \sum_{k=0}^{\infty} a^k Z_{n-k} \Bigr) \\
&= \sigma^2 a^h \sum_{j=0}^{\infty} |a|^{2j} \\
&= \frac{\sigma^2}{1 - |a|^2} \cdot a^h.
\end{align*}
We find that $X$ is stationary and the correlation function $C$ is given by $C(h) = \sigma^2 a^h / (1 - |a|^2)$ for $h \ge 0$ and by
\[ C(h) = \frac{\sigma^2}{1 - |a|^2} \cdot (\overline{a})^{|h|}, \qquad h < 0. \]
We have thus shown that equation (1) has exactly one stationary solution which is given by (2).
If $a \ne 0$ then equation (1) has a unique solution for each fixed $X_0$ which can be determined iteratively: $X_1 = Z_1 + a X_0$, $X_{-1} = (X_0 - Z_0)/a$, and so on. If $X_0$ is not given by (2), i.e.,
\[ X_0 \ne \sum_{j=0}^{\infty} a^j Z_{-j}, \]
then the solution is not stationary.
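
The closed form of the correlation function for $h \ge 0$ can be compared with a truncation of the defining series. This is a minimal sketch, assuming uncorrelated noise and a truncation after 200 terms; the function names and the sample value of $a$ are hypothetical.

```python
def ar1_correlation_truncated(a, sigma2, h, terms=200):
    """sigma2 * sum_{j < terms} a^(j+h) * conj(a^j), valid for h >= 0."""
    total = 0j
    power = 1 + 0j  # holds a**j, updated in each step
    for _ in range(terms):
        total += (power * a ** h) * power.conjugate()
        power *= a
    return sigma2 * total


def ar1_correlation_closed(a, sigma2, h):
    """Closed form sigma2 * a^h / (1 - |a|^2) for h >= 0."""
    return sigma2 * a ** h / (1 - abs(a) ** 2)


a, sigma2 = 0.6 + 0.3j, 1.5
max_gap = max(
    abs(ar1_correlation_truncated(a, sigma2, h) - ar1_correlation_closed(a, sigma2, h))
    for h in range(6)
)
print(max_gap)
```

Since $|a|^2 = 0.45$ here, the neglected tail of the geometric series is far below floating point precision, so the two values agree up to rounding.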

Example 2.8.9. Let $Z$ be the Wiener process or the Poisson process constructed in Corollary 2.1.4 and Corollary 2.1.5, respectively. Further, let $h \in [0,\infty)$ be arbitrary and define the process $X^h$ by
\[ X^h(t) = Z(t + h) - Z(t), \qquad t \in [0,\infty). \]
The computations in Example 2.2.5 show that $X^h$ is stationary and its correlation function $C^h$ is given by
\[ C^h(t) = \max(0,\ h - t), \qquad t \in [0,\infty). \]
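
The formula for $C^h$ can be checked by expanding the inner product of two increments by bilinearity. The sketch below assumes the correlation function $\min(s,t)$ of the standard Wiener process, which is not restated in this section; the function names are illustrative.

```python
def wiener_corr(s, t):
    """Assumed correlation function of the standard Wiener process."""
    return min(s, t)


def increment_corr(t1, t2, h, corr=wiener_corr):
    """(Z(t1+h) - Z(t1), Z(t2+h) - Z(t2)) expanded by bilinearity (real case)."""
    return (corr(t1 + h, t2 + h) - corr(t1 + h, t2)
            - corr(t1, t2 + h) + corr(t1, t2))


h = 1.5
grid = [0.25 * k for k in range(17)]  # points 0, 0.25, ..., 4
# The result should depend only on the lag t, not on the base point s.
worst = max(
    abs(increment_corr(s + t, s, h) - max(0.0, h - t))
    for s in grid for t in grid
)
print(worst)
```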


Section 2.9 Spectral representation of stationary fields 109

2.9 Spectral representation of stationary fields


First we generalize the notion of positive definiteness introduced in Definition 1.4.1.$^{21}$

Definition 2.9.1. A function $f : T - T \to \mathbb{C}$ is said to be $T$-positive definite if the inequality
\[ \sum_{i,j=1}^{n} f(t_i - t_j)\,c_i\,\overline{c_j} \ge 0 \]
holds for every positive integer $n$, for all $t_1,\dots,t_n \in T$ and for all $c_1,\dots,c_n \in \mathbb{C}$.

Now we reformulate some of our previous results in terms of stationary fields. The next theorem follows from Theorem 2.2.3.

Theorem 2.9.2. The correlation function of a real or complex stationary field on $T$ is $T$-positive definite. Conversely, let $C$ be a real- or complex-valued function defined on $T - T$. If $C$ is $T$-positive definite, then there exists a real or complex, respectively, stationary Gaussian field on $T$ with correlation function $C$. Analogous statements hold for the covariance function.
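
Positive definiteness can be illustrated numerically for a correlation function of the form $C(h) = \sum_j p_j e^{ihx_j}$ with $p_j \ge 0$ on $T = \mathbb{R}$, for which the quadratic form equals $\sum_j p_j \bigl|\sum_i c_i e^{it_ix_j}\bigr|^2$. The following sketch uses arbitrarily chosen atoms and random test points; all names and values are assumptions made for the illustration.

```python
import cmath
import random


def corr(h, atoms):
    """C(h) = sum_j p_j * exp(i*h*x_j) for atoms given as (x_j, p_j) pairs."""
    return sum(p * cmath.exp(1j * h * x) for x, p in atoms)


def quadratic_form(f, ts, cs):
    """sum_{i,j} f(t_i - t_j) * c_i * conj(c_j)."""
    return sum(
        f(ti - tj) * ci * cj.conjugate()
        for ti, ci in zip(ts, cs)
        for tj, cj in zip(ts, cs)
    )


random.seed(1)
atoms = [(-1.3, 0.5), (0.4, 2.0), (2.2, 0.25)]
forms = []
for _ in range(50):
    ts = [random.uniform(-5, 5) for _ in range(6)]
    cs = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(6)]
    forms.append(quadratic_form(lambda h: corr(h, atoms), ts, cs))
print(min(q.real for q in forms))
```

Up to rounding, every value in `forms` is real and nonnegative, in accordance with Definition 2.9.1.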
In the remaining part of this section we consider only the correlation function. Combining the previous theorem with Bochner's Theorem 1.7.3 and Herglotz's Theorem 1.7.5 we obtain the following characterizations of correlation functions of stationary fields on $\mathbb{R}^d$ and $\mathbb{Z}^d$.

Theorem 2.9.3. A continuous complex-valued function $C$ on $\mathbb{R}^d$ is the correlation function of a continuous stationary field on $\mathbb{R}^d$ if and only if it can be represented in the form
\[ C(t) = \int_{\mathbb{R}^d} e^{i(t,x)}\,d\sigma(x), \qquad t \in \mathbb{R}^d \]
with some nonnegative finite Borel measure $\sigma$ on $\mathbb{R}^d$.

Theorem 2.9.4. A complex-valued function $C$ on $\mathbb{Z}^d$ is the correlation function of a stationary field on $\mathbb{Z}^d$ if and only if it can be represented in the form
\[ C(n) = \int_{[0,2\pi)^d} e^{i(n,t)}\,d\sigma(t), \qquad n \in \mathbb{Z}^d \]
with some nonnegative finite Borel measure $\sigma$ on $[0,2\pi)^d$.

The measure $\sigma$ in the previous theorems is called the spectral measure of the field. If $\sigma$ is absolutely continuous with respect to the Lebesgue measure, then the corresponding density is called the spectral density.
21 See also Definition 4.1.1.

Example 2.9.5. Let $Z \sim WN(0,\sigma^2)$ be a white noise on $\mathbb{Z}^d$. Since
\[ \int_{[0,2\pi)^d} e^{i(n,x)}\,d\lambda_d(x) = 0, \qquad n \in \mathbb{Z}^d \setminus \{0\}, \]
the correlation function $C$ can be represented as
\[ C(n) = \frac{\sigma^2}{(2\pi)^d} \int_{[0,2\pi)^d} e^{i(n,x)}\,d\lambda_d(x), \qquad n \in \mathbb{Z}^d. \]
Thus, the spectral density is constant. Conversely, if a field $Z$ on $\mathbb{Z}^d$ has a constant spectral density, then $Z$ is a white noise.
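
For $d = 1$ the statement of this example can be reproduced with an equally spaced Riemann sum, which integrates the exponentials $e^{inx}$ exactly as long as $|n|$ is smaller than the number of nodes. The sketch below is only an illustration; the function name, the number of nodes and the value of `sigma2` are arbitrary choices.

```python
import cmath
import math


def corr_from_constant_density(n, sigma2, points=64):
    """Riemann sum of exp(i*n*x) * sigma2/(2*pi) over [0, 2*pi)."""
    step = 2 * math.pi / points
    density = sigma2 / (2 * math.pi)
    return sum(
        cmath.exp(1j * n * k * step) * density * step for k in range(points)
    )


sigma2 = 3.0
values = [corr_from_constant_density(n, sigma2) for n in range(-4, 5)]
print([round(abs(v), 12) for v in values])
```

The entry for $n = 0$ equals $\sigma^2$ and all other entries vanish up to rounding, which is the correlation function of a white noise.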
Using Theorem 1.7.8 we can characterize the correlation functions of not necessarily continuous fields on $\mathbb{R}^d$. By $\Gamma_d$ we denote the set of all characters of the group $\mathbb{R}^d$. Recall that $\Gamma_d$ consists of all complex-valued functions $\gamma \ne 0$ on $\mathbb{R}^d$ such that $\gamma(x + y) = \gamma(x)\gamma(y)$ and $\gamma(-x) = \overline{\gamma(x)}$ for all $x$ and $y$. Endowing $\Gamma_d$ with the topology of pointwise convergence, $\Gamma_d$ becomes a compact Hausdorff space (cf. Remark 1.7.6).

Theorem 2.9.6. A complex-valued function $C$ on $\mathbb{R}^d$ is the correlation function of a stationary field on $\mathbb{R}^d$ if and only if it can be represented in the form
\[ C(t) = \int_{\Gamma_d} \gamma(t)\,d\sigma(\gamma), \qquad t \in \mathbb{R}^d \]
with some nonnegative finite Borel measure $\sigma$ on $\Gamma_d$.

Example 2.9.7. Let $Z \sim WN(0,1)$ be a white noise on $\mathbb{R}^d$. Then $C = 1_{\{0\}}$ and the measure $\sigma$ in the previous theorem is equal to the normalized Haar measure on $\Gamma_d$ (see Corollary 1.7.11 and its proof).
We are now able to prove integral representations for stationary fields using
Karhunen’s Theorem 2.7.1.

Theorem 2.9.8. Let $Z$ be a continuous stationary field on $\mathbb{R}^d$ with correlation function $C$ and spectral measure $\sigma$. Then there exists a uniquely determined random orthogonal measure $\zeta$ on $(\mathbb{R}^d, \mathcal{B}_d)$ with structure measure $\sigma$ such that
\[ Z(t) = \int_{\mathbb{R}^d} e^{i(t,x)}\,d\zeta(x), \qquad t \in \mathbb{R}^d \]
and $H(Z) = H(\zeta)$.
Conversely, let $\zeta$ be a random orthogonal measure on $(\mathbb{R}^d, \mathcal{B}_d)$ with structure measure $\sigma$. Then the equation above defines a continuous stationary field $Z$ with spectral measure $\sigma$.

Proof. We have
\[ C(t - s) = \int_{\mathbb{R}^d} e^{i(t-s,x)}\,d\sigma(x) = \int_{\mathbb{R}^d} e^{i(t,x)}\,\overline{e^{i(s,x)}}\,d\sigma(x), \qquad t,s \in \mathbb{R}^d. \]
The linear span $L$ of the functions $e^{i(t,\cdot)}$, $t \in \mathbb{R}^d$, is dense in $L^2(\sigma)$. To see this assume that $g \in L^{\perp}$. Then
\[ \int_{\mathbb{R}^d} e^{i(t,x)}\,\overline{g(x)}\,d\sigma(x) = 0, \qquad t \in \mathbb{R}^d. \]
Since $g \in L^2(\sigma) \subseteq L^1(\sigma)$, the measure $\mu := \overline{g}\,\sigma$ is finite. The equation above shows that $\check{\mu} = 0$ and hence $g = 0$. Thus, $L$ is dense in $L^2(\sigma)$. The first statement of the theorem follows immediately from Karhunen's Theorem 2.7.1, while the second one is a consequence of basic properties of the integral with respect to $\zeta$ (cf. Theorem 2.6.4).

Example 2.9.9. Let $Z$ be a continuous stationary field on $\mathbb{R}^d$. If the spectral measure $\sigma$ of $Z$ is discrete, i.e.,
\[ \sigma = \sum_{j=1}^{\infty} p_j\,\delta_{x_j}, \qquad x_j \in \mathbb{R}^d,\ p_j \ge 0,\ \sum_{j=1}^{\infty} p_j < \infty, \]
then
\[ C(t) = \sum_{j=1}^{\infty} p_j\,e^{i(t,x_j)}, \qquad t \in \mathbb{R}^d \]
and
\[ Z(t) = \sum_{j=1}^{\infty} e^{i(t,x_j)} \cdot X_j, \qquad t \in \mathbb{R}^d \]
where the $X_j$'s are orthogonal and $(X_j,X_j) = p_j$ (see Example 2.6.5). If $Z$ is a Gaussian field then the $X_j$'s are even independent.
The next two theorems can be proved in the same way as Theorem 2.9.8. We omit the proofs.

Theorem 2.9.10. Let $Z$ be a stationary field on $\mathbb{Z}^d$ with correlation function $C$ and spectral measure $\sigma$. Then there exists a uniquely determined random orthogonal measure $\zeta$ on $\bigl([0,2\pi)^d, \mathcal{B}([0,2\pi)^d)\bigr)$ with structure measure $\sigma$ such that
\[ Z(n) = \int_{[0,2\pi)^d} e^{i(n,x)}\,d\zeta(x), \qquad n \in \mathbb{Z}^d \]
and $H(Z) = H(\zeta)$.
Conversely, let $\zeta$ be a random orthogonal measure on $\bigl([0,2\pi)^d, \mathcal{B}([0,2\pi)^d)\bigr)$ with structure measure $\sigma$. Then the equation above defines a stationary field $Z$ with spectral measure $\sigma$.

Theorem 2.9.11. Let $Z$ be a stationary field on $\mathbb{R}^d$ with correlation function $C$ and spectral measure $\sigma$. Then there exists a uniquely determined random orthogonal measure $\zeta$ on $(\Gamma_d, \mathcal{B}(\Gamma_d))$ with structure measure $\sigma$ such that
\[ Z(t) = \int_{\Gamma_d} \gamma(t)\,d\zeta(\gamma), \qquad t \in \mathbb{R}^d \]
and $H(Z) = H(\zeta)$. Conversely, if $\zeta$ is a random orthogonal measure on $(\Gamma_d, \mathcal{B}(\Gamma_d))$, then the equation above defines a stationary field $Z$.
We will call the random orthogonal measure $\zeta$ occurring in Theorems 2.9.8, 2.9.10 and 2.9.11 the representing measure of the field $Z$.

Remark 2.9.12. Let $Z$ be a continuous stationary field on $\mathbb{R}^d$ (analogous considerations can be carried out for fields on $\mathbb{Z}^d$ or for not necessarily continuous fields on $\mathbb{R}^d$). We have seen in the proof of Karhunen's Theorem 2.7.1 that the mapping $e^{i(t,\cdot)} \mapsto Z(t)$, $t \in \mathbb{R}^d$, can be uniquely extended to an isometric linear operator $I_Z$ from $L^2(\sigma)$ onto $H(Z) = H(\zeta)$. Using the inverse of $I_Z$ we can reformulate problems that are formulated in the space $H(Z)$ as problems formulated in $L^2(\sigma)$.
To see examples, let $X$ be a stationary time series on $\mathbb{Z}$ with spectral measure $\sigma$. By isometry we have
\[ \Bigl\| X_{n+1} - \sum_{k=0}^{n} c_k X_k \Bigr\|^2 = \int_0^{2\pi} \Bigl| e^{i(n+1)x} - \sum_{k=0}^{n} c_k e^{ikx} \Bigr|^2 d\sigma(x). \]
This equation, for example, is important for the prediction of stationary processes.$^{22}$ In this case the problem is to find those $c_k$'s which minimize the left-hand side.
Another problem is to determine $\zeta([a,b))$, $a < b$, where $\zeta$ is the random orthogonal measure in the spectral representation of $X$. For this we choose a sequence
\[ p_n(x) = \sum_j c_{j,n}\,e^{i k_{j,n} x}, \qquad n \in \mathbb{N},\ k_{j,n} \in \mathbb{Z},\ x \in [0,2\pi) \]
of trigonometric polynomials converging in $L^2(\sigma)$ to $1_{[a,b)}$. Since $I_Z$ is isometric we have $\lim_{n\to\infty} I_Z(p_n) = I_Z(1_{[a,b)})$. Using the definition of $\zeta$ (see equation (2.7.1.3)) we obtain
\[ \zeta([a,b)) = \lim_{n\to\infty} \sum_j c_{j,n}\,X_{k_{j,n}}. \]

22 We will generalize the equation above in Theorem 2.9.13.

We will use this method in Theorems 2.9.16 and 2.9.17 to prove inversion formulae. In Section 2.10 we will investigate the interconnection between stationary fields and unitary representations which will provide another point of view for the operator $I_Z$.

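Theorem 2.9.13. Let $Z$ be a continuous stationary field on $\mathbb{R}^d$ with spectral measure $\sigma$, correlation function $C$, and let $\mu$ and $\nu$ be complex measures on $\mathbb{R}^d$. Then
\begin{align*}
\Bigl( \int_{\mathbb{R}^d} Z\,d\mu,\ \int_{\mathbb{R}^d} Z\,d\nu \Bigr) &= \int_{\mathbb{R}^d} \check{\mu} \cdot \overline{\check{\nu}}\,d\sigma \\
&= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} C(t - s)\,d\mu(t)\,d\overline{\nu}(s) \\
&= \int_{\mathbb{R}^d} C\,d(\mu * \widetilde{\nu})
\end{align*}
and
\[ \int_{\mathbb{R}^d} Z\,d\mu = I_Z(\check{\mu}) \tag{1} \]
where $I_Z$ is the isometric operator from $L^2(\sigma)$ onto $H(Z)$ introduced in Remark 2.9.12.

Proof. The integrability of $Z$ is a consequence of Theorem 2.4.4, since $\|Z(t)\|$ is constant. The equations
\[ \Bigl( \int_{\mathbb{R}^d} Z\,d\mu,\ \int_{\mathbb{R}^d} Z\,d\nu \Bigr) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} C(t - s)\,d\mu(t)\,d\overline{\nu}(s) = \int_{\mathbb{R}^d} C\,d(\mu * \widetilde{\nu}) \]
follow from Theorem 2.4.6 and from the definition of the convolution (cf. Section E.2).
To prove the remaining two equations it suffices to consider nonnegative (finite) measures. Using the definition of $I_Z$ and the fact that it is isometric we see that the equations hold for measures with finite support. By Corollary 2.4.10 there exist sequences $\{\mu_n\}$ and $\{\nu_n\}$ of nonnegative measures with finite support converging weakly to $\mu$ and $\nu$, respectively, such that
\[ \int_{\mathbb{R}^d} Z\,d\mu = \lim_{n\to\infty} \int_{\mathbb{R}^d} Z\,d\mu_n \quad\text{and}\quad \int_{\mathbb{R}^d} Z\,d\nu = \lim_{n\to\infty} \int_{\mathbb{R}^d} Z\,d\nu_n. \]
Theorem 1.6.1 shows that the sequences $\{\check{\mu}_n\}$ and $\{\check{\nu}_n\}$ converge uniformly on compact sets (and hence also in $L^2(\sigma)$) to $\check{\mu}$ and $\check{\nu}$, respectively. Now the equations in question follow by taking the limit $n \to \infty$.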

The next theorem can be proved in the same way as the previous one. We omit the
proof.

Theorem 2.9.14. If $Z$ is a stationary field on $\mathbb{Z}^d$ with spectral measure $\sigma$ and $\mu$ and $\nu$ are complex measures on $\mathbb{Z}^d$, then
\[ \Bigl( \int_{\mathbb{Z}^d} Z\,d\mu,\ \int_{\mathbb{Z}^d} Z\,d\nu \Bigr) = \int_{[0,2\pi)^d} \check{\mu} \cdot \overline{\check{\nu}}\,d\sigma \]
and
\[ \int_{\mathbb{Z}^d} Z\,d\mu = I_Z(\check{\mu}). \tag{1} \]

As an application of the operator $I_Z$ we present the so-called sampling theorem attributed to E. T. Whittaker, H. Nyquist, C. Shannon, V. Kotelnikov, and É. Borel. It implies, in terms of signal processing, that a bandlimited signal can be recovered from uniformly spaced discrete samples as long as the sampling rate is not smaller than twice the bandwidth.

Theorem 2.9.15 (sampling theorem). Let $X$ be a continuous stationary process on $\mathbb{R}$ such that the support of its spectral measure $\sigma$ is contained in a finite interval $[-a,a]$, $a > 0$, and $\sigma(\{-a\}) = \sigma(\{a\}) = 0$. Then we have
\[ X(t) = \sum_{n=-\infty}^{\infty} \frac{\sin(at - \pi n)}{at - \pi n}\,X\Bigl(\frac{\pi n}{a}\Bigr), \qquad t \in \mathbb{R}.^{23} \]

Proof. The correlation function $C$ of $X$ admits the representation
\[ C(t) = \int_{-a}^{a} e^{itx}\,d\sigma(x), \qquad t \in \mathbb{R}. \]
Let $t \in \mathbb{R}$ be fixed and consider the function $f_t : x \mapsto e^{itx}$ on the interval $[-a,a]$. Its Fourier series is given by
\[ \sum_{n=-\infty}^{\infty} \frac{\sin(at - \pi n)}{at - \pi n}\,e^{i x \pi n / a}, \qquad x \in [-a,a]. \]
Since $f_t$ is continuously differentiable, the series converges in $L^2(\sigma)$ to $f_t$ (see Corollary C.3.2). Applying the isometry $I_X$ (cf. Remark 2.9.12) we obtain the statement of the theorem.
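
The Fourier expansion used in the proof can be observed numerically at an interior point of the interval. The sketch below evaluates a symmetric partial sum of the series $\sum_n \frac{\sin(at - \pi n)}{at - \pi n}\,e^{ix\pi n/a}$ and compares it with $e^{itx}$; the sample values of $a$, $t$, $x$ and the truncation level are arbitrary, and convergence is slow (roughly of order $1/N$ for $N$ terms on each side).

```python
import cmath
import math


def sinc_weight(t, n, a):
    """sin(a*t - pi*n) / (a*t - pi*n), with the convention sin(0)/0 = 1."""
    u = a * t - math.pi * n
    return 1.0 if u == 0 else math.sin(u) / u


def partial_sum(t, x, a, terms):
    """Symmetric partial sum over n = -terms, ..., terms of the series."""
    return sum(
        sinc_weight(t, n, a) * cmath.exp(1j * x * math.pi * n / a)
        for n in range(-terms, terms + 1)
    )


a, t, x = 2.0, 0.7, 0.9  # x lies strictly inside (-a, a)
error = abs(partial_sum(t, x, a, 4000) - cmath.exp(1j * t * x))
print(error)
```

At the sampling points $t = \pi m / a$ all weights except the one with $n = m$ vanish, so the series reduces to a single term there.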

Next we present two inversion theorems.

Theorem 2.9.16. Let $Z$ be a continuous stationary field on $\mathbb{R}^d$ with representing measure $\zeta$ and spectral measure $\sigma$. If the $\sigma$-measure of the boundary of the set
\[ J = (a_1,b_1) \times \cdots \times (a_d,b_d) \subset \mathbb{R}^d, \qquad a_j < b_j \]
is equal to 0, then$^{24}$
\[ \zeta(J) = \lim_{T\to\infty} \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{-i a_j t_j} - e^{-i b_j t_j}}{i t_j} \cdot Z(t)\,dt. \]

23 We use the convention $\frac{\sin 0}{0} := 1$.

Proof. We have
\begin{align*}
K_T(x) &:= \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{-i a_j t_j} - e^{-i b_j t_j}}{i t_j}\,e^{i(x,t)}\,dt \\
&= \frac{1}{(2\pi)^d} \int_{-T}^{T} \cdots \int_{-T}^{T} \prod_{j=1}^{d} \frac{e^{i(x_j - a_j) t_j} - e^{i(x_j - b_j) t_j}}{i t_j}\,dt \\
&= \prod_{j=1}^{d} \frac{1}{2\pi} \int_{-T}^{T} \Bigl( \frac{\sin (x_j - a_j) t_j}{t_j} - \frac{\sin (x_j - b_j) t_j}{t_j} \Bigr) dt_j \\
&=: \prod_{j=1}^{d} K_j(T,x).
\end{align*}
By Corollary B.1.8,
\[ \lim_{T\to\infty} K_j(T,x) = \begin{cases} 1, & x_j \in (a_j,b_j) \\ \frac{1}{2}, & x_j = a_j \text{ or } x_j = b_j \\ 0, & x_j \notin [a_j,b_j] \end{cases} \]
and hence
\[ \lim_{T\to\infty} K_T = 1_J, \qquad \sigma\text{-almost everywhere}. \]
Theorem B.1.9 shows that $|K_j(T,x)| < 2$ and therefore we also have convergence in $L^2(\sigma)$. The isometric operator $I_Z$ being continuous we conclude that $\lim_{T\to\infty} I_Z(K_T) = \zeta(J)$. Now the theorem follows immediately from equation (2.9.13.1).
The next theorem can be proved in the same way by using Corollary C.3.2.

Theorem 2.9.17. Let $Z$ be a stationary field on $\mathbb{Z}^d$ with representing measure $\zeta$ and spectral measure $\sigma$. If the $\sigma$-measure of the boundary of the set
\[ J = (a_1,b_1) \times \cdots \times (a_d,b_d), \qquad 0 \le a_j < b_j < 2\pi \]
is equal to 0, then
\[ \zeta(J) = \lim_{n\to\infty} \frac{1}{(2\pi)^d} \sum_{|k| \le n} \prod_{j=1}^{d} \frac{e^{-i a_j k_j} - e^{-i b_j k_j}}{i k_j} \cdot Z(k). \]

24 In case of $t_j = 0$ we use the same convention as in Theorem 1.3.1.

Theorem 2.9.18. Let Z be a continuous stationary field on Rd or Zd with represent-


ing measure  and spectral measure . The following properties are equivalent:
(i) H.Z/ is finite-dimensional;
(ii) the support of is finite;
P
(iii) Z.t / D njD1 ei.t;xj / Xj
with some n 2 N, mutually orthogonal Xj 2 H.Z/, and xj 2 Rd or xj 2
Œ0; 2/d , respectively.

Proof. Noting that $L^2(\sigma)$ is finite-dimensional if and only if $\operatorname{supp} \sigma$ is finite, the equivalence of (i) and (ii) follows from the fact that $H(Z)$ and $L^2(\sigma)$ are isomorphic.
Assume that $\sigma = \sum_{j=1}^{n} p_j \delta_{x_j}$, where the $x_j$'s are mutually distinct. Then the functions $1_{\{x_j\}}$ form an orthogonal basis of $L^2(\sigma)$. For all functions $f$ we have
\[ f = \sum_{j=1}^{n} f(x_j) 1_{\{x_j\}}, \]
equality being in $L^2(\sigma)$, and hence
\[ \int f(x) \, d\zeta(x) = \sum_{j=1}^{n} f(x_j) X_j, \]
where the random variables $X_j := \int 1_{\{x_j\}} \, d\zeta$ are orthogonal. Relation (iii) follows from the equation above by setting $f(x) := f_t(x) := e^{i(t,x)}$ and from the spectral representation of $Z$.
If (iii) holds, then $H(Z) = \operatorname{span}\{X_1, \ldots, X_n\}$, from which (i) follows.
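Property (iii) can be illustrated with a finite-dimensional model. This is a sketch only: the support points `freqs`, the masses `weights`, and the realisation of the orthogonal variables as $X_j = \sqrt{p_j}\, e_j$ in $\mathbb{C}^3$ are hypothetical choices, not part of the text. The check is that $(Z(t), Z(s))$ depends on $t - s$ only.

```python
import cmath
import math

freqs = [0.3, 1.1, 2.5]    # hypothetical support points x_j of the spectral measure
weights = [0.5, 0.3, 0.2]  # hypothetical masses p_j > 0

def inner(u, v):
    """Inner product, linear in the first and conjugate-linear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def Z(t):
    """Z(t) = sum_j exp(i t x_j) X_j with X_j = sqrt(p_j) e_j (coordinates in C^3)."""
    return [cmath.exp(1j * t * x) * math.sqrt(p) for x, p in zip(freqs, weights)]

def C(t):
    """Correlation function C(t) = sum_j p_j exp(i t x_j)."""
    return sum(p * cmath.exp(1j * t * x) for x, p in zip(freqs, weights))

# Stationarity: both pairs have the same lag t - s = 1.5.
print(inner(Z(2.0), Z(0.5)), inner(Z(4.0), Z(2.5)), C(1.5))
```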

Theorem 2.9.19. Let $Z$ be a continuous stationary field on $T = \mathbb{R}^d$ or on $T = \mathbb{Z}^d$ with correlation function $C$ and spectral measure $\sigma$. Then $\mu * Z$ is stationary with correlation function $\mu * \tilde{\mu} * C$ and spectral measure $\sigma_\mu$ given by
\[ d\sigma_\mu(x) = |\hat{\mu}(x)|^2 \, d\sigma(x). \tag{1} \]
Moreover, $\mu * Z$ admits the representation
\[ \mu * Z(t) = \int e^{i(t,x)} \hat{\mu}(x) \, d\zeta(x), \qquad t \in T, \]
where $\zeta$ is the representing measure of $Z$ and integration is over $\mathbb{R}^d$ or $[0, 2\pi)^d$, respectively.

Proof. The statement on stationarity and on the correlation function follows from Lemma 2.8.4 with $\nu = \mu$ while equation (1) follows from Theorem 1.8.15 and Theorem 1.9.5, respectively.
Section 2.10 Unitary representations 117

The statement on the representation of $Z$ can be easily checked when the support of $\mu$ is finite. In the general case we approximate $\mu$ by measures with finite support using (2.4.15.5)²⁵ and (2.6.4.iii).

2.10 Unitary representations


This section contains an introduction to unitary representations and their connection
to stationary fields. This connection is useful for example in the study of ergodic prop-
erties of stationary fields (see Section 5.3).
Throughout the present section $H$ is a complex Hilbert space and $Z$ denotes a stationary field on $T$, where $T = \mathbb{R}^d$ or $T = \mathbb{Z}^d$.

Unitary Operators 2.10.1. Recall that a linear operator $U$ from $H$ onto $H$ is called unitary if $(Uh, Uh) = (h, h)$ for all $h \in H$. If $U$ is unitary then
\[ (Uh, Ug) = (h, g) \quad \text{and} \quad (Uh, g) = (h, U^{-1} g), \qquad h, g \in H. \]
To see examples, let $H = L^2(\lambda_d)$. Since the Lebesgue measure is invariant under translations, the equation²⁶
\[ U_t h(x) := h(x - t), \qquad h \in H, \; x \in \mathbb{R}^d, \]
defines a unitary operator $U_t$ in $H$ for all $t \in \mathbb{R}^d$. Another example is given by
\[ U h(x) := g(x) h(x), \qquad h \in H, \; x \in \mathbb{R}^d, \]
where $g$ is a Lebesgue measurable function on $\mathbb{R}^d$ such that $|g| = 1$.

Definition 2.10.2. A unitary representation $(U_t)$ of $T$ in $H$ is a mapping $t \mapsto U_t$ from $T$ into the set of all unitary operators in $H$ such that
\[ U_{t+s} = U_t U_s, \qquad t, s \in T. \]
The representation $(U_t)$ is called continuous if the mapping $t \mapsto U_t h$ is continuous for all $h \in H$ with respect to the norm topology on $H$.
If $T = \mathbb{R}^d$ and the function $t \mapsto (g, U_t h)$ is Lebesgue measurable for all $g, h \in H$, then $(U_t)$ is said to be weakly measurable.

25 See also Corollary 2.4.10.


26 Strictly speaking this definition is not correct since $h$ is an equivalence class and not a function. It can easily be made correct by using representatives of $h$.

Lemma 2.10.3. Let $(U_t)$ be a unitary representation of $T$ in $H$. Then $U_0 = I$ (the identity operator), $U_{-t} = U_t^{-1}$ and
\[ (U_t v, w) = (v, U_{-t} w), \qquad t \in T, \; v, w \in H. \]
Moreover, $(U_t)$ is continuous if and only if the complex-valued functions
\[ t \mapsto (U_t v, w) \]
are continuous for all $v, w \in L$, where $L$ is a dense subset of $H$.

Proof. Since $U_0 = U_{0+0} = U_0 U_0$ we see that $U_0 = I$. From $I = U_{t-t} = U_t U_{-t}$ it follows that $U_{-t} = U_t^{-1}$ and hence
\[ (U_t v, w) = (v, U_t^{*} w) = (v, U_t^{-1} w) = (v, U_{-t} w). \]
If $(U_t)$ is continuous, then the functions $t \mapsto (U_t v, w)$ are obviously continuous for all $v, w \in H$. Assume that these functions are continuous for all $v, w \in L$. Let $h \in H$ and $w \in L$ be arbitrary and choose a sequence $\{v_n\}$ in $L$ converging to $h$. Using the Cauchy–Schwarz inequality we get
\[ |(U_t v_n, w) - (U_t h, w)| = |(U_t (v_n - h), w)| \le \|v_n - h\| \cdot \|w\|. \]
Thus, the function $t \mapsto (U_t h, w)$ is continuous since it is the uniform limit of the sequence $t \mapsto (U_t v_n, w)$ of continuous functions. Applying the same argument once more, we see that $t \mapsto (U_t h, g)$ is continuous for all $h, g \in H$. The continuity of $(U_t)$ follows from the identity
\[ \|U_t h - U_s h\|^2 = 2\|h\|^2 - (U_t h, U_s h) - (U_s h, U_t h). \]

Example 2.10.4. Continuous unitary representations of $\mathbb{R}^d$ in $H = L^2(\lambda_d)$ are given by
\[ U_t h(x) := h(x - t), \qquad h \in H, \; x, t \in \mathbb{R}^d, \]
and
\[ U_t h(x) := e^{i(t,x)} h(x), \qquad h \in H, \; t, x \in \mathbb{R}^d. \]
The continuity of the first example follows from Lemma 2.10.3 and from the fact that the function
\[ t \mapsto \int h(x - t) \overline{g(x)} \, d\lambda_d(x), \qquad t \in \mathbb{R}^d, \]
is continuous for all $h, g \in L^2(\mathbb{R}^d)$ (cf. Theorem E.2.4).
Replacing $\mathbb{R}^d$ by $\mathbb{Z}^d$ and $\lambda_d$ by the counting measure on $\mathbb{Z}^d$ we obtain unitary representations of $\mathbb{Z}^d$.
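A finite analogue of the two examples can be checked directly. The sketch below replaces $L^2$ by $\mathbb{C}^N$ and uses cyclic translation and modulation as representations of $\mathbb{Z}$; the size `N`, the helper names and the test vectors are assumptions made for this illustration only.

```python
import cmath

N = 8  # size of the finite model (an assumption of this sketch)

def inner(u, v):
    """Inner product, linear in the first and conjugate-linear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def translate(t, h):
    """(U_t h)(x) = h(x - t), indices taken modulo N."""
    return [h[(x - t) % N] for x in range(N)]

def modulate(t, h):
    """(U_t h)(x) = exp(i t x) h(x)."""
    return [cmath.exp(1j * t * x) * h[x] for x in range(N)]

def close(u, v, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(u, v))

h = [complex(k + 1, -k) for k in range(N)]
g = [complex((-1) ** k, k * k) for k in range(N)]

for U in (translate, modulate):
    for t in range(-3, 4):
        for s in range(-3, 4):
            # group law U_{t+s} = U_t U_s
            assert close(U(t + s, h), U(t, U(s, h)))
        # unitarity: (U_t h, U_t g) = (h, g)
        assert abs(inner(U(t, h), U(t, g)) - inner(h, g)) < 1e-9
```

Both families satisfy the group law and preserve inner products, which is all that Definition 2.10.2 requires in this finite setting.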

Definition 2.10.5. Let $(U_t)$ be a unitary representation of $T$ in $H$. A subspace $L$ of $H$ is said to be $(U_t)$-invariant if
\[ U_t L \subset L, \qquad t \in T. \]
A vector $h \in H$, $h \neq 0$, is called an eigenvector of $(U_t)$ if there exists a complex-valued function $\gamma$ on $T$ such that
\[ U_t h = \gamma(t) h, \qquad t \in T. \]
Let $\gamma$ be as above. It is easy to check that $\gamma(0) = 1$ and
\[ \gamma(t + s) = \gamma(t) \gamma(s), \qquad t, s \in T. \]
Thus, $\gamma$ is a character of $T$. If the representation is continuous, then $\gamma$ is continuous as well.

Definition 2.10.6. Let $(U_t)$ be a unitary representation of $T$ in $H$. A vector $v \in H$ is called a cyclic vector if the linear space
\[ \operatorname{span}\{U_t v : t \in T\} \]
is dense in $H$. Unitary representations having a cyclic vector are called cyclic.

Example 2.10.7. The constant function $1 \in H = L^2([0, 2\pi))$ is a cyclic vector for the unitary representation
\[ U_t h(x) := e^{itx} h(x), \qquad h \in H, \; x \in [0, 2\pi), \; t \in \mathbb{Z}, \]
of $\mathbb{Z}$ since the trigonometric polynomials form a dense linear space in $H$. This follows immediately from the Stone–Weierstraß Theorem B.2.4.

Example 2.10.8. Let $H$ be a subspace of $L^2(\Omega, \mathcal{A}, P)$ and let $(U_t)$ be a unitary representation of $T$ in $H$. We choose an arbitrary random variable $X \in H$ and define the second order field $Z$ by
\[ Z(t) := U_t X, \qquad t \in T. \]
Then $Z$ is stationary. Indeed,
\[ (Z(t), Z(s)) = (U_t X, U_s X) = (U_{-s} U_t X, X) = (U_{t-s} X, X), \qquad t, s \in T. \]
It is easy to check that $H(Z)$ is $(U_t)$-invariant. Moreover, the restriction of $(U_t)$ to $H(Z)$ is a unitary representation with cyclic vector $X$.

Lemma 2.10.9. Let $(U_t)$ be a unitary representation of $T$ in $H$. The function
\[ f(t) = (h, U_t h), \qquad t \in T, \]
is positive definite for all $h \in H$.

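Proof. Let $t_j \in T$ and $c_j \in \mathbb{C}$, $j = 1, \ldots, n$, be arbitrary. Using that $U_{t_j - t_i} = U_{t_j} U_{t_i}^{-1}$ we obtain
\[
\begin{aligned}
\sum_{i,j=1}^{n} f(t_j - t_i) c_i \overline{c_j} &= \sum_{i,j=1}^{n} (c_i h, c_j U_{t_j - t_i} h) \\
&= \sum_{i,j=1}^{n} (c_i U_{t_i} h, c_j U_{t_j} h) \\
&= \Bigl\| \sum_{i=1}^{n} c_i U_{t_i} h \Bigr\|^2 \ge 0,
\end{aligned}
\]
i.e., $f$ is positive definite.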
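The identity used in this proof can be verified numerically in a small model. As a sketch under stated assumptions, the code below uses the cyclic shift on $\mathbb{C}^N$ as a unitary representation of $\mathbb{Z}$; the vector `h`, the points `ts` and the coefficients `cs` are arbitrary illustrative choices.

```python
N = 7  # dimension of the finite model (assumption)

def inner(u, v):
    """Inner product, linear in the first and conjugate-linear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def translate(t, h):
    """Cyclic shift (U_t h)(x) = h(x - t mod N), a unitary representation of Z."""
    return [h[(x - t) % N] for x in range(N)]

h = [complex(1, 0), complex(0, 2), complex(-1, 1), complex(3, 0),
     complex(0, 0), complex(2, -2), complex(1, 1)]

def f(t):
    """f(t) = (h, U_t h) as in Lemma 2.10.9."""
    return inner(h, translate(t, h))

ts = [0, 1, 3, -2]
cs = [complex(1, 2), complex(0, -1), complex(0.5, 0), complex(2, -1)]

# Quadratic form sum_{i,j} f(t_j - t_i) c_i conj(c_j)
quad = sum(f(tj - ti) * ci * cj.conjugate()
           for ti, ci in zip(ts, cs) for tj, cj in zip(ts, cs))

# Squared norm of sum_i c_i U_{t_i} h
combo = [sum(ci * translate(ti, h)[x] for ti, ci in zip(ts, cs)) for x in range(N)]
norm_sq = inner(combo, combo)

print(quad, norm_sq)
```

The two printed numbers agree up to rounding, and the common value is real and nonnegative.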

Definition 2.10.10. Two unitary representations $(U_t)$ and $(V_t)$ of $T$ in $H_1$ and $H_2$, respectively, are called equivalent if there exists an isometric linear operator $M$ from $H_1$ onto $H_2$ such that
\[ M U_t = V_t M, \qquad t \in T. \]

Lemma 2.10.11. Let $(U_t)$ and $(V_t)$ be equivalent unitary representations of $\mathbb{R}^d$ in $H_1$ and $H_2$, respectively. Then $(U_t)$ is continuous if and only if $(V_t)$ is continuous.

Proof. Let $M$ be as in Definition 2.10.10. For all $v \in H_1$ and $w \in H_2$ we have
\[ (M U_t v, w) = (U_t v, M^{-1} w) = (V_t M v, w), \qquad t \in \mathbb{R}^d. \]
Since $M$ maps $H_1$ onto $H_2$, the statement follows from Lemma 2.10.3.

Lemma 2.10.12. Let $(U_t)$ and $(V_t)$ be cyclic unitary representations of $T$ in $H_1$ and $H_2$ with cyclic vectors $h_1$ and $h_2$, respectively. If
\[ (h_1, U_t h_1) = (h_2, V_t h_2), \qquad t \in T, \]
then these representations are equivalent.²⁷

27 We use the same notation for the inner products in $H_1$ and $H_2$.



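Proof. We define the mapping $M$ from $L := \operatorname{span}\{U_t h_1 : t \in T\}$ into $H_2$ by
\[ M \sum_{i=1}^{n} c_i U_{t_i} h_1 := \sum_{i=1}^{n} c_i V_{t_i} h_2. \]
That this definition is correct, i.e., the right-hand side does not depend on the particular choice of $c_i$ and $t_i$ representing a given vector, follows from the relation
\[
\begin{aligned}
\Bigl( \sum_{i=1}^{n} c_i U_{t_i} h_1, \sum_{j=1}^{m} d_j U_{s_j} h_1 \Bigr) &= \sum_{i=1}^{n} \sum_{j=1}^{m} c_i \overline{d_j} \bigl( h_1, U_{s_j - t_i} h_1 \bigr) \\
&= \Bigl( \sum_{i=1}^{n} c_i V_{t_i} h_2, \sum_{j=1}^{m} d_j V_{s_j} h_2 \Bigr)
\end{aligned}
\]
and from the fact that $h_1$ and $h_2$ are cyclic vectors. This mapping is linear and
\[
\begin{aligned}
M h_1 &= h_2, \\
(g, h) &= (M g, M h), \\
M U_t g &= V_t M g, \qquad t \in T, \; g, h \in L.
\end{aligned}
\]
By Theorem D.3.6, $M$ can be extended to an isometric linear operator $\tilde{M}$ from $H_1$ onto $H_2$. It is clear that $\tilde{M} U_t = V_t \tilde{M}$ for all $t$, i.e., $(U_t)$ and $(V_t)$ are equivalent.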

Lemma 2.10.13. Let $\mu$ and $\nu$ be complex measures with finite support on $T$. If the equation $\mu * Z(s) = \nu * Z(s)$ holds for some $s \in T$, then
\[ \mu * Z(t) = \nu * Z(t) \]
for all $t \in T$.

Proof. Indeed, applying Lemma 2.8.4 we see that
\[ \|(\mu - \nu) * Z(t)\|^2 = (\mu - \nu) * (\mu - \nu)^{\sim} * C(0). \]
Thus, $\|(\mu - \nu) * Z(t)\|$ does not depend on $t$. Since it is equal to zero if $t = s$, it is zero for all $t$.

The next theorem establishes a close relation between unitary representations and
stationary fields.

Theorem 2.10.14. For every (continuous) stationary field $Z$ on $T$ there exists a (continuous) cyclic unitary representation $(U_t)$ of $T$ in the Hilbert space $H(Z)$ such that
\[ U_t Z(s) = Z(s - t), \qquad s, t \in T. \tag{1} \]



We call this representation the canonical unitary representation of $Z$ in $H(Z)$.
Conversely, if $(V_t)$ is a (continuous) cyclic unitary representation of $T$ in a Hilbert space $H$, then there exists a (continuous) stationary field $Z$ on $T$ such that the canonical unitary representation of $Z$ in $H(Z)$ is equivalent to $(V_t)$.

Proof. Denote by $L(Z)$ the linear space of all vectors $v \in H(Z)$ of the form
\[ v = \mu * Z(0) = \sum_j c_j Z(-x_j), \]
where $\mu = \sum_j c_j \delta_{x_j}$ is a complex measure with finite support. For $v$ as above we define $U_t v$ by
\[ U_t v := \delta_t * \mu * Z(0) = \mu * Z(-t), \qquad t \in T. \]
It follows from Lemma 2.10.13 that this definition is correct, i.e., $U_t v$ does not depend on the special representation of $v$. It is clear that $U_t$ is a linear operator from $L(Z)$ onto $L(Z)$. Moreover,
\[ U_t U_s v = U_{t+s} v, \qquad t, s \in T, \]
in particular, $U_t U_{-t} v = v$. It follows from Lemma 2.8.4 that $U_t$ is isometric:
\[ (U_t v, U_t v) = (v, v), \qquad v \in L(Z). \]
We conclude that $U_t$ can be extended to a unitary operator on $H(Z) = \overline{L(Z)}$; we denote this operator also by $U_t$. It is easy to check that $(U_t)$ is a unitary representation of $T$ with cyclic vector $Z(0)$. If $Z$ is continuous then, by the definition of $U_t$, the function $t \mapsto U_t v$ is continuous for all $v \in L(Z)$. In view of Lemma 2.10.3, the representation $(U_t)$ is continuous.
To prove the second part of the theorem, assume that $(V_t)$ is a cyclic unitary representation of $T$ in some Hilbert space $H$ and let $v_0$ be a cyclic vector. The function
\[ f(t) = (v_0, V_t v_0), \qquad t \in T, \]
is positive definite in view of Lemma 2.10.9. By Theorem 1.7.8 there exists a unique measure $\mu \in M_+(\Gamma)$ such that
\[ f(t) = \int \gamma(t) \, d\mu(\gamma), \qquad t \in T, \]
where $\Gamma$ denotes the character group of $T$. For each $t \in T$ the function
\[ g_t(\gamma) = \gamma(t), \qquad \gamma \in \Gamma, \]
is bounded and continuous on $\Gamma$, hence $g_t \in L^2(\mu)$. We define the second order random field $Z : T \to L^2(\mu)$ by $Z(t) = g_t$. Since
\[ (Z(t), Z(s)) = \int \gamma(t) \overline{\gamma(s)} \, d\mu(\gamma) = \int \gamma(t - s) \, d\mu(\gamma) = f(t - s) \]

this field is stationary. Denoting by $(U_t)$ the canonical unitary representation in $H(Z)$ we have
\[ U_t Z(s) = Z(s - t) = g_{s-t} = \overline{g_t} \cdot g_s = \overline{g_t} \cdot Z(s). \]
From this we see that the operator $U_t$ acts as multiplication with the function $\overline{g_t}$. By the uniqueness of $\mu$, the constant function 1 is a cyclic vector for $(U_t)$. Moreover,
\[ (1, U_t 1) = \int \gamma(t) \, d\mu(\gamma) = f(t) = (v_0, V_t v_0). \]
Lemma 2.10.12 shows that the representations $(U_t)$ and $(V_t)$ are equivalent. If $(V_t)$ is continuous, then $(U_t)$ is continuous as well (cf. Lemma 2.10.11). The continuity of $Z$ follows from $U_t Z(0) = Z(-t)$.

Remark 2.10.15. Let $Z$ be a continuous stationary field on $\mathbb{R}^d$. The proof of the second part of the previous theorem can be carried out by using Bochner's Theorem 1.7.3 instead of Theorem 1.7.8, as follows.
Let $\sigma$ be the spectral measure of $Z$ and consider the isometric mapping $I_Z : L^2(\sigma) \to H(Z)$ with $I_Z\bigl(e^{i(t,\cdot)}\bigr) = Z(t)$ (cf. Remark 2.9.12). We define the unitary representation $(V_t)$ by
\[ V_t g(x) := e^{-i(t,x)} g(x), \qquad g \in L^2(\sigma). \]
The function 1 is a cyclic vector of this representation.²⁸ Moreover, $(V_t)$ is equivalent with the unitary representation $(U_t)$ of $Z$ in $H(Z)$. This follows from the relation
\[ (1, V_t 1) = \bigl(1, e^{-i(t,\cdot)}\bigr) = \bigl(I_Z(1), I_Z\bigl(e^{-i(t,\cdot)}\bigr)\bigr) = (Z(0), Z(-t)) = (Z(0), U_t Z(0)) \]
and from Lemma 2.10.12.²⁹ We call $(V_t)$ the canonical unitary representation of $Z$ in $L^2(\sigma)$. In Section 2.11 we will encounter another unitary representation for $Z$ which is constructed using the correlation function of $Z$.

Lemma 2.10.16. Let $(U_t)$ be a unitary representation of $T$ in $H$, let $H_0$ be a subspace of $H$ and denote by $P_0$ the orthogonal projection from $H$ onto $H_0$. The following statements are equivalent:
(i) $H_0$ is $(U_t)$-invariant;
(ii) $U_t P_0 = P_0 U_t$, $t \in T$;
(iii) $H_0^{\perp}$ is $(U_t)$-invariant.

28 This follows from the fact that $\sigma$ is uniquely determined by its inverse Fourier–Stieltjes transform.
29 Note that $(Z(0), Z(-t)) = (Z(s), Z(s - t))$ by stationarity, so that $Z(0)$ can be replaced by $Z(s)$ where $s \in \mathbb{R}^d$ is arbitrary.

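Proof. The equivalence of (i) and (iii) follows from the relations
\[
\begin{aligned}
H_0 \text{ is } (U_t)\text{-invariant} &\iff U_t h \in H_0, \quad t \in T, \; h \in H_0 \\
&\iff U_t h \perp g, \quad t \in T, \; h \in H_0, \; g \in H_0^{\perp} \\
&\iff h \perp U_{-t} g, \quad t \in T, \; h \in H_0, \; g \in H_0^{\perp} \\
&\iff U_{-t} g \in H_0^{\perp}, \quad t \in T, \; g \in H_0^{\perp} \\
&\iff H_0^{\perp} \text{ is } (U_t)\text{-invariant}.
\end{aligned}
\]
Assume that (i) holds. An arbitrary element $h \in H$ can be written as
\[ h = h_0 + h_0^{\perp}, \qquad h_0 \in H_0, \; h_0^{\perp} \in H_0^{\perp}. \]
From this we see that $U_t P_0 h = U_t h_0$. Since (iii) holds we obtain
\[ P_0 U_t h = P_0 U_t h_0 + P_0 U_t h_0^{\perp} = U_t h_0. \]
Thus the relation (i) implies (ii).
Now assume that (ii) holds. For all $h_0 \in H_0$ we have $P_0 U_t h_0 = U_t P_0 h_0 = U_t h_0$ and hence $U_t h_0 \in H_0$. Thus, $H_0$ is $(U_t)$-invariant, completing the proof.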

Lemma 2.10.17. Let $Z$ be a stationary field on $T$ with canonical unitary representation $(U_t)$ in $H(Z)$. If $H(Z)$ admits the decomposition
\[ H(Z) = H_1 \oplus H_2 \]
where $H_1$ and $H_2$ are $(U_t)$-invariant subspaces, then $Z$ can be decomposed as
\[ Z = Z_1 + Z_2 \]
where $Z_1$ and $Z_2$ are orthogonal stationary fields with $H(Z_1) = H_1$ and $H(Z_2) = H_2$.

Proof. Denote by $P_i$ the orthogonal projection onto $H_i$. Setting
\[ Z_i(t) := P_i Z(t) = P_i U_{-t} Z(0), \qquad t \in T, \; i = 1, 2, \]
we obviously have $Z = Z_1 + Z_2$ and $Z_1 \perp Z_2$. By Lemma 2.10.16 we have $Z_i(t) = U_{-t} P_i Z(0)$ and hence $Z_i$ is stationary (cf. Example 2.10.8) and $H(Z_i) \subset H_i$. From the relation
\[ H_1 \oplus H_2 = H(Z) = H(Z_1 + Z_2) \subset H(Z_1) \oplus H(Z_2) \subset H_1 \oplus H_2 \]
we conclude that $H(Z_1) \oplus H(Z_2) = H_1 \oplus H_2$ and therefore $H(Z_i) = H_i$.


Section 2.11 Unitary representations and positive definite functions 125

Corollary 2.10.18. If $H(Z)$ is $n$-dimensional where $n \in \mathbb{N}$, then there exist one-dimensional $(U_t)$-invariant subspaces $H_j$, $j = 1, \ldots, n$, such that³⁰
\[ H(Z) = H_1 \oplus \cdots \oplus H_n. \]
Moreover,
\[ Z(t) = \gamma_1(t) X_1 + \cdots + \gamma_n(t) X_n, \qquad t \in T, \]
where $X_j \in H_j$ and $\gamma_j$ is a character of $T$.

Proof. Since the unitary operators $U_t$ commute, they have a common eigenvector. Denoting by $H_1$ the subspace spanned by this eigenvector $X_1$ we have
\[ H(Z) = H_1 \oplus H_1^{\perp}. \]
By Lemma 2.10.16, the subspace $H_1^{\perp}$ is $(U_t)$-invariant. Using this, the corollary follows easily by induction on $n$ (see also the discussion in the first part of Example 2.8.6).
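The decomposition into characters can be made concrete for the field $Z(t) := U_t X$ of Example 2.10.8 in a finite model. The sketch below assumes the cyclic shift on $\mathbb{C}^N$ as the representation of $\mathbb{Z}$ and the discrete Fourier basis as common eigenvectors; `N`, `X` and all function names are illustrative choices, not notation from the text.

```python
import cmath
import math

N = 6  # dimension of the finite model (assumption)

def inner(u, v):
    """Inner product, linear in the first and conjugate-linear in the second slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def shift(t, h):
    """Cyclic shift (U_t h)(x) = h(x - t mod N)."""
    return [h[(x - t) % N] for x in range(N)]

def e(k):
    """Normalised Fourier vector; it satisfies U_t e_k = gamma_k(t) e_k."""
    return [cmath.exp(2j * math.pi * k * x / N) / math.sqrt(N) for x in range(N)]

def gamma(k, t):
    """Character of Z attached to the eigenvector e_k."""
    return cmath.exp(-2j * math.pi * k * t / N)

X = [complex(v) for v in (1, 2, 0, -1, 0.5, 3)]

# Orthogonal components X_k = (X, e_k) e_k, one in each invariant line.
components = [[inner(X, e(k)) * v for v in e(k)] for k in range(N)]

def Z(t):
    return shift(t, X)

def Z_from_characters(t):
    return [sum(gamma(k, t) * components[k][x] for k in range(N)) for x in range(N)]

max_error = max(abs(a - b)
                for t in range(-3, 10)
                for a, b in zip(Z(t), Z_from_characters(t)))
print(max_error)
```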

2.11 Unitary representations and positive definite functions


In Theorem 2.10.14 we constructed a unitary representation for a given stationary field
by using the integral representation of the correlation function. In the present section
we follow another method where we construct a unitary representation without using
the integral representation. We also present some applications of this method.
Throughout this section we assume that $T = \mathbb{R}^d$ or $T = \mathbb{Z}^d$. In the linear space of all complex-valued functions $g$ on $T$ we introduce the translation operator $E_t$ by
\[ E_t g(x) := g(x - t), \qquad t, x \in T. \]
We have $E_t g = \delta_t * g$ and $E_{s+t} = E_s E_t$. For a complex-valued function $g$ on $T$, we denote by $T(g)$ the linear space spanned by all translates of $g$. Note that
\[ T(g) = \{\mu * g : \mu \in M_f(T)\}. \]

Theorem 2.11.1. To every positive definite function $f$ on $T$ there corresponds a Hilbert space $H(f)$ with inner product $(\cdot, \cdot) = (\cdot, \cdot)_f$ such that
(i) elements of $H(f)$ are bounded complex-valued functions on $T$ and $H(f)$ is invariant under translations;
(ii) $T(f) \subset H(f)$ and $T(f)$ is dense in $H(f)$;
(iii) setting $U_t g := E_t g$, $g \in H(f)$, $t \in T$, we obtain a unitary representation $(U_t)$ of $T$ in $H(f)$ with cyclic vector $f$;
30 See also Theorem 2.9.18.

(iv) $g(t) = (g, U_t f)$ for all $g \in H(f)$ and $t \in T$; in particular, $f(t) = (f, U_t f)$;
(v) convergence of a sequence $\{g_n\}$ in $H(f)$ implies uniform convergence on $T$;
(vi) weak convergence of a net $\{g_\alpha\}$ in $H(f)$ implies pointwise convergence;
(vii) if $f \in P^c(\mathbb{R}^d)$, then the representation $(U_t)$ and each $g \in H(f)$ are continuous;
(viii) if $f \in P^m(\mathbb{R}^d)$, then each $g \in H(f)$ is measurable and the representation $(U_t)$ is weakly measurable.
Proof. Let $g, h \in T(f)$ and choose finitely supported measures $\mu = \sum_{i=1}^{n} a_i \delta_{x_i}$ and $\nu = \sum_{j=1}^{m} b_j \delta_{y_j}$ such that $g = \mu * f$ and $h = \nu * f$. We write
\[ (g, h) := \tilde{\nu} * \mu * f(0) = \sum_{i=1}^{n} \sum_{j=1}^{m} f(y_j - x_i) a_i \overline{b_j}. \]
Since
\[ (g, h) = \sum_{j=1}^{m} g(y_j) \overline{b_j} = \sum_{i=1}^{n} \overline{h(x_i)} a_i, \]
the definition of $(g, h)$ does not depend on the particular choice of $\mu$ and $\nu$. Using Lemma 1.4.3 and Lemma 1.4.8 we see that $(\cdot, \cdot)$ is a positive semidefinite inner product on $T(f)$. It follows from the definition of $(\cdot, \cdot)$ that
\[ g(t) = (g, U_t f), \qquad t \in T, \; g \in T(f). \tag{1} \]
Moreover, the translation operators $E_t$ are isometric on $T(f)$:
\[ (E_t g, E_t g) = (g, g), \qquad g \in T(f). \]
Using equation (1), together with the Cauchy–Schwarz inequality, we obtain
\[ |g(t)|^2 = |(g, E_t f)|^2 \le (g, g)(E_t f, E_t f) = (g, g)(f, f). \tag{2} \]
From this we conclude that $(g, g) = 0$ if and only if $g = 0$, so $T(f)$ is a pre-Hilbert space. Moreover, every function $g \in T(f)$ is bounded.
Let $\{g_n\}$ be a Cauchy sequence in $T(f)$. It follows from (2) that $\{g_n\}$, as a sequence of functions on $T$, is a Cauchy sequence in the metric of uniform convergence on $T$. Consequently, there exists a complex-valued function $g$ on $T$ with
\[ \lim_{n\to\infty} g_n(t) = g(t), \qquad t \in T, \tag{3} \]
where the convergence is uniform. Let $H(f)$ be the completion of $T(f)$ constructed by means of functions on $T$. These functions are obviously bounded. It follows easily

from (3) and from the fact that $E_t$ is isometric on $T(f)$ that $H(f)$ is invariant under $E_t$ and $(E_t g, E_t g) = (g, g)$ holds for all $g \in H(f)$. Denote by $U_t$ the restriction of $E_t$ to $H(f)$. Since $U_t U_{-t} = U_{-t} U_t = I$ (the identity operator), the operator $U_t$ is invertible and hence unitary. It is easy to check that $(U_t)$ is a unitary representation of $T$ in $H(f)$. The linear space spanned by the set $\{U_t f : t \in T\}$ is equal to $T(f)$ and hence it is dense in $H(f)$.
Since $g \mapsto (g, U_t f)$ is continuous on $H(f)$ and $T(f)$ is dense in $H(f)$, equation (1) holds for all $g \in H(f)$. Using the Cauchy–Schwarz inequality again, we obtain
\[ |g_n(t) - g(t)|^2 = |(g_n - g, U_t f)|^2 \le \|g_n - g\|^2 \|f\|^2. \]
Thus, $g_n \to g$ uniformly on $T$ whenever $\{g_n\}$ converges to $g$ in $H(f)$.
Property (vi) follows immediately from the equation $g_\alpha(t) = (g_\alpha, U_t f)$.
To prove (vii), let $g, h \in H(f)$ be arbitrary and choose $\mu_n, \nu_n \in M_f(T)$ such that $\lim_n \mu_n * f = g$ and $\lim_n \nu_n * f = h$, the convergence being in $H(f)$. Then the function $t \mapsto (g, U_t h)$ is the uniform limit of the functions
\[ t \mapsto (\mu_n * f, U_t \nu_n * f) = \tilde{\nu}_n * \mu_n * f(t) \]
which are continuous. Thus, the representation $(U_t)$ is continuous. The continuity of the functions in $H(f)$ now follows from (iv). The last statement can be proved using the same arguments.
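The inner product on $T(f)$, the reproducing property (iv) and the bound (2) can be tried out on a concrete example. This is a minimal sketch assuming $T = \mathbb{R}$ and the Gaussian $f(t) = e^{-t^2}$, which is positive definite; the coefficients `a`, the points `xs` and the helper names are hypothetical.

```python
import math

def f(t):
    """Gaussian, a continuous positive definite function on R (illustrative choice)."""
    return math.exp(-t * t)

def inner(a, xs, b, ys):
    """(mu * f, nu * f) = sum_i sum_j f(y_j - x_i) a_i conj(b_j)
    for mu = sum_i a_i delta_{x_i} and nu = sum_j b_j delta_{y_j}."""
    return sum(f(y - x) * ai * complex(bj).conjugate()
               for ai, x in zip(a, xs) for bj, y in zip(b, ys))

a = [1 + 1j, -0.5, 2j]   # hypothetical coefficients a_i
xs = [-1.0, 0.2, 1.5]    # hypothetical points x_i

def g(t):
    """g = mu * f, i.e. g(t) = sum_i a_i f(t - x_i)."""
    return sum(ai * f(t - x) for ai, x in zip(a, xs))

gg = inner(a, xs, a, xs).real  # (g, g)
grid = [k / 10 for k in range(-50, 51)]

# Reproducing property: U_t f = delta_t * f, so (g, U_t f) = g(t).
reproducing_error = max(abs(inner(a, xs, [1], [t]) - g(t)) for t in grid)

# Inequality (2): |g(t)|^2 <= (g, g) (f, f), where (f, f) = f(0).
bound_ok = all(abs(g(t)) ** 2 <= gg * f(0) + 1e-12 for t in grid)

print(reproducing_error, bound_ok)
```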

Throughout the rest of this section $(U_t)$ denotes the unitary representation of $T$ in $H(f)$, constructed in the previous theorem. If $f = C$ is the correlation function of a stationary process $Z$, then $(U_t)$ is called the canonical unitary representation of $Z$ in $H(C)$.

Theorem 2.11.2. Let $f \in P(T)$ and let $H_1$ and $H_2$ be $(U_t)$-invariant subspaces of $H(f)$ such that $H(f) = H_1 \oplus H_2$. Then the orthogonal projection $f_i$ of $f$ onto $H_i$ $(i = 1, 2)$ is a positive definite function. Moreover, the equation
\[ g_i(t) = (g_i, U_t f_i) \]
holds for all $t \in T$ and $g_i \in H_i$ $(i = 1, 2)$.

Proof. Using that $H_1 \perp H_2$ and the fact that $H_i$ is $(U_t)$-invariant, we obtain
\[ g_i(t) = (g_i, U_t f) = (g_i, U_t (f_1 + f_2)) = (g_i, U_t f_i), \qquad g_i \in H_i. \]
In particular, $f_i(t) = (f_i, U_t f_i)$ and hence $f_i$ is positive definite in view of Lemma 2.10.9.

Theorem 2.11.3. If $f = f_1 + f_2$, where $f, f_1, f_2 \in P(T)$, then $f_1, f_2 \in H(f)$ and there exists a nonnegative operator $Q$ on $H(f)$ such that $\|Q\| \le 1$ and
\[
\begin{aligned}
Q U_t &= U_t Q & (1) \\
f_1(t) &= (Q f, U_t f) & (2) \\
f_2(t) &= ((I - Q) f, U_t f) & (3)
\end{aligned}
\]
for all $t \in T$. Conversely, if $f \in P(T)$ and an operator $Q$ on $H(f)$ has the properties above, then the functions $f_1$ and $f_2$ defined by (2) and (3) are positive definite and $f = f_1 + f_2$.

Proof. We define in $T(f)$ a new positive semidefinite inner product $(\cdot, \cdot)_1$ by setting
\[ (\mu * f, \nu * f)_1 := \tilde{\nu} * \mu * f_1(0), \qquad \mu, \nu \in M_f(T). \tag{4} \]
Since $f_1$ and $f - f_1$ are positive definite, we have
\[ 0 \le (g, g)_1 \le (g, g), \qquad g \in T(f). \]
From Theorem D.3.5 we conclude that the inner product $(\cdot, \cdot)_1$ can be extended to a positive semidefinite inner product on $H(f)$, which we also denote by $(\cdot, \cdot)_1$. By the second part of Theorem D.3.5 there exists a nonnegative linear operator $Q$ on $H(f)$ such that $\|Q\| \le 1$ and
\[ (g, h)_1 = (Q g, h), \qquad g, h \in H(f). \]
From equation (4) we see that $(U_t g, h)_1 = (g, U_{-t} h)_1$ holds for all $g, h \in H(f)$ and $t \in T$. Using this, we obtain
\[ (U_t Q g, h) = (Q g, U_{-t} h) = (g, U_{-t} h)_1 = (U_t g, h)_1 = (Q U_t g, h). \]
This being true for all $g, h \in H(f)$, we must have $U_t Q = Q U_t$, $t \in T$. Taking into account the definition of $(\cdot, \cdot)_1$, we obtain
\[ f_1(t) = (f, U_t f)_1 = (Q f, U_t f) = (Q f)(t) \]
and
\[ f_2(t) = f(t) - f_1(t) = ((I - Q) f, U_t f) = ((I - Q) f)(t). \]
Therefore, $f_1 = Q f \in H(f)$ and $f_2 = (I - Q) f \in H(f)$.
Suppose now that $Q$ is a nonnegative operator on $H(f)$ with $\|Q\| \le 1$ and $Q U_t = U_t Q$, $t \in T$. Then the operator $I - Q$ is nonnegative as well. Setting $Q_1 := \sqrt{Q}$ and $Q_2 := \sqrt{I - Q}$, the operators $Q_1$ and $Q_2$ commute with $U_t$ for every $t \in T$ (cf. Theorem D.3.7). Thus, we have
\[ f_1(t) = (Q f, U_t f) = (Q_1 f, U_t Q_1 f) \]

and
\[ f_2(t) = ((I - Q) f, U_t f) = (Q_2 f, U_t Q_2 f). \]
It is obvious that $f = f_1 + f_2$. It follows from Lemma 2.10.9 that $f_1$ and $f_2$ are positive definite.

Corollary 2.11.4. Let $f_1, f_2$ be positive definite functions on $\mathbb{R}^d$.
If $f_1 + f_2$ is continuous, then $f_1$ and $f_2$ are also continuous.
If $f_1 + f_2$ is measurable (vanishes $\lambda_d$-almost everywhere), then $f_1$ and $f_2$ are measurable (vanish $\lambda_d$-almost everywhere, respectively).

Proof. By the previous theorem, applied to $f := f_1 + f_2$, we have $f_1, f_2 \in H(f)$ and therefore the corollary follows from the last two statements of Theorem 2.11.1.

Theorem 2.11.5. Let $f \in P^m(\mathbb{R}^d)$, $h \in L^1(\mathbb{R}^d)$ and define the mapping $A_h$ by $A_h g := h * g$, $g \in H(f)$.³¹ Then
(i) $A_h$ is a bounded linear operator from $H(f)$ into $H(f)$;
(ii) if $H_0$ is a closed $(U_t)$-invariant subspace of $H(f)$, then $H_0$ is invariant under $A_h$;
(iii) $(A_h)^{*} = A_{\tilde{h}}$.

Proof. By Theorem 2.11.1, the function $t \mapsto (U_t g_1, g_2)$ is measurable for all $g_1, g_2 \in H(f)$. We define the sesquilinear functional $B_h$ by
\[ B_h(g_1, g_2) := \int_{\mathbb{R}^d} (U_t g_1, g_2) h(t) \, dt, \qquad g_1, g_2 \in H(f). \]
Since $|(U_t g_1, g_2)| \le \|g_1\| \|g_2\|$, we have
\[ |B_h(g_1, g_2)| \le \|g_1\| \|g_2\| \int_{\mathbb{R}^d} |h(t)| \, dt. \]
Consequently, by the second part of Theorem D.3.5, there exists a bounded linear operator $C_h : H(f) \to H(f)$ for which $B_h(g_1, g_2) = (C_h g_1, g_2)$ for all $g_1, g_2 \in H(f)$. Using the definition of $B_h$ we obtain
\[ (C_h g)(t) = (C_h g, U_t f) = \int_{\mathbb{R}^d} (U_s g, U_t f) h(s) \, ds = \int_{\mathbb{R}^d} g(t - s) h(s) \, ds = h * g(t) \]
and so $A_h g = C_h g \in H(f)$ if $g \in H(f)$.

31 Note that $h * g$ is a continuous function because $g$ is bounded and measurable by Theorem 2.11.1.

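Let $H_0$ be a closed $(U_t)$-invariant subspace of $H(f)$. If $g_1 \in H_0$, then
\[ (A_h g_1, g_2) = \int_{\mathbb{R}^d} (U_t g_1, g_2) h(t) \, dt = 0 \]
holds for every $g_2 \in H_0^{\perp}$. Thus, $A_h g_1 \in H_0^{\perp\perp} = H_0$, i.e., $H_0$ is invariant under $A_h$.
We now compute the adjoint $A_h^{*}$ of $A_h$:
\[
\begin{aligned}
(A_h^{*} g)(x) &= (A_h^{*} g, U_x f) = (g, A_h U_x f) = \overline{(A_h U_x f, g)} \\
&= \int_{\mathbb{R}^d} \overline{(U_{x+y} f, g)} \, \overline{h(y)} \, dy \\
&= \int_{\mathbb{R}^d} g(x + y) \overline{h(y)} \, dy \\
&= (\tilde{h} * g)(x) = (A_{\tilde{h}} g)(x).
\end{aligned}
\]
Thus, $A_h^{*} = A_{\tilde{h}}$.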

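Corollary 2.11.6. For all $f \in P^m(\mathbb{R}^d)$ we have:³²
(i) $(h * g_1, g_2) = (g_1, \tilde{h} * g_2)$, $g_1, g_2 \in H(f)$, $h \in L^1(\mathbb{R}^d)$;
(ii) $\displaystyle \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(y - x) h(x) \overline{h(y)} \, dx \, dy \ge 0$, $h \in L^1(\mathbb{R}^d)$.

Proof. Property (i) is a consequence of (2.11.5.iii) while (ii) is proved as follows:
\[
\begin{aligned}
\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} f(y - x) h(x) \overline{h(y)} \, dx \, dy &= \tilde{h} * h * f(0) \\
&= (\tilde{h} * h * f, f) = (h * f, h * f) \ge 0.
\end{aligned}
\]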
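A discretised version of inequality (ii) is easy to test: a midpoint Riemann sum of the double integral is itself a quadratic form $\sum f(y_j - x_i) c_i \overline{c_j}$ and is therefore nonnegative. The sketch assumes $d = 1$, the positive definite function $f(t) = e^{-|t|}$ and an arbitrary integrable $h$ supported on $[0, 1]$; all of these are illustrative choices.

```python
import cmath
import math

def f(t):
    """exp(-|t|), a continuous positive definite function on R (illustrative choice)."""
    return math.exp(-abs(t))

def h(x):
    """Hypothetical integrable test function supported on [0, 1]."""
    return x * cmath.exp(3j * x) if 0.0 <= x <= 1.0 else 0j

n = 200
dx = 1.0 / n
nodes = [(k + 0.5) * dx for k in range(n)]  # midpoints of the grid cells
c = [h(x) * dx for x in nodes]              # weights c_i = h(x_i) dx

# Riemann sum of  int int f(y - x) h(x) conj(h(y)) dx dy
riemann = sum(f(y - x) * cx * cy.conjugate()
              for x, cx in zip(nodes, c) for y, cy in zip(nodes, c))
print(riemann)
```

The printed value is real up to rounding and positive, in line with (ii).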

We are now able to prove a decomposition theorem for measurable positive definite
functions.

Theorem 2.11.7. Let $f \in P^m(\mathbb{R}^d)$. Then
\[ f = f_c + f_0 \]
where $f_c \in P^c(\mathbb{R}^d)$, $f_0 \in P^m(\mathbb{R}^d)$ and $f_0 = 0$ $\lambda_d$-almost everywhere.

Proof. Denote by $H_c$ the set of all continuous functions in $H(f)$ and by $H_0$ the orthogonal complement of $H_c$. Then $H_c$ and $H_0$ are closed $(U_t)$-invariant subspaces of $H(f)$ and
\[ H(f) = H_c \oplus H_0. \]
32 See also Lemma 1.5.8 and Corollary 2.4.8.



By Theorem 2.11.2 the function $f$ can be decomposed as
\[ f = f_c + f_0, \qquad f_c \in P^c(\mathbb{R}^d) \cap H_c, \quad f_0 \in P(\mathbb{R}^d) \cap H_0. \]
It remains to prove that $f_0 = 0$ $\lambda_d$-almost everywhere. Theorem 2.11.5 shows that $h * f_0 \in H_0$ for all $h \in L^1(\mathbb{R}^d)$. But $h * f_0$ is continuous, i.e., $h * f_0 \in H_c$, and therefore we must have $h * f_0 = 0$. This being true for all $h \in L^1(\mathbb{R}^d)$, we obtain $f_0 = 0$ $\lambda_d$-almost everywhere.

Theorem 2.11.8. If $f \in P(\mathbb{R}^d)$ is Lebesgue measurable on a neighborhood $V$ of 0, then $f$ is Lebesgue measurable on $\mathbb{R}^d$.

Proof. Let $B$ be an open ball with center 0 such that $B + B \subset V$ and denote by $H_B$ the subspace of $H(f)$ spanned by the functions $U_t f$, $t \in B$. As $f$ is measurable on $V$ and $B + B \subset V$, the function $U_t f$ is measurable on $B$ for all $t \in B$. Since convergence in $H(f)$ implies uniform convergence (cf. Theorem 2.11.1) every function in $H_B$ is measurable on $B$. If $g^{\perp} \in H_B^{\perp}$, then $g^{\perp}(t) = (g^{\perp}, U_t f) = 0$, $t \in B$. Applying the decomposition
\[ H(f) = H_B \oplus H_B^{\perp} \]
we see that every function $h \in H(f)$ can be written as
\[ h = g + g^{\perp} \]
where $g$ is measurable on $B$ and $g^{\perp}$ is identically zero on $B$. Thus, $h$ is measurable on $B$. In particular, $U_t f$ is measurable on $B$ for all $t \in \mathbb{R}^d$. From this it follows that $f$ is measurable on $\mathbb{R}^d$.
Chapter 3

Special properties

3.1 Strict positive definiteness


In several applications, for example in interpolation problems, it is more advantageous
to use strictly positive definite functions.1 The aim of the present section is to collect
basic properties of these functions and to give necessary and sufficient conditions for
a function to be strictly positive definite.2

Definition 3.1.1. A function $f : \mathbb{R}^d \to \mathbb{C}$ is said to be strictly positive definite if the matrix
\[ A = \bigl( f(t_i - t_j) \bigr)_{i,j=1}^{n} \]
is positive definite for every positive integer $n$ and for all pairwise distinct $t_1, \ldots, t_n \in \mathbb{R}^d$.

We denote by $SP(\mathbb{R}^d)$ the set of strictly positive definite functions on $\mathbb{R}^d$ while $SP^c(\mathbb{R}^d)$ is the set of all continuous functions in $SP(\mathbb{R}^d)$. It is easy to check that $SP(\mathbb{R}^d) \subset P(\mathbb{R}^d)$.
The next lemma follows immediately from the definition. We omit the proof.

Lemma 3.1.2. Let $f$ be a complex-valued function on $\mathbb{R}^d$. Then $f \in SP(\mathbb{R}^d)$ if and only if the inequality
\[ \sum_{i,j=1}^{n} f(t_i - t_j) c_i \overline{c_j} > 0 \]
holds for every positive integer $n$, for all pairwise distinct $t_1, \ldots, t_n \in \mathbb{R}^d$, and for all $c \in \mathbb{C}^n \setminus \{0\}$.

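Theorem 3.1.3. Let $f \in SP(\mathbb{R}^d)$ and $g \in P(\mathbb{R}^d)$. We have
(i) $f(0) > 0$ and $|f(t)| < f(0)$ for all $t \neq 0$;
(ii) $\overline{f}, f + g \in SP(\mathbb{R}^d)$. In particular, $\operatorname{Re} f \in SP(\mathbb{R}^d)$;
(iii) If $r \in \mathbb{R} \setminus \{0\}$, then $f(r\,\cdot) \in SP(\mathbb{R}^d)$;
(iv) If $g \neq 0$, then $fg \in SP(\mathbb{R}^d)$. In particular, $|f|^2 \in SP(\mathbb{R}^d)$.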
1 We refer to [58] for an introduction to multivariate scattered data approximation.
2 More specific results and references can be found in [23].

Proof. Setting $n = 1$ and $c = 1$ in Lemma 3.1.2 shows that $f(0) > 0$, while the case $n = 2$, $t_1 = x$, $t_2 = 0$ of Definition 3.1.1 shows that the determinant of the positive definite matrix $A$ is positive, i.e.,
\[ f(0)^2 - f(x) f(-x) = f(0)^2 - |f(x)|^2 > 0 \]
whenever $x \neq 0$.
(iv) is a consequence of Schur's Theorem D.2.12 while the remaining statements follow immediately from the definition of strict positive definiteness.

Recall that (complex) trigonometric polynomials on $\mathbb{R}^d$ are functions $P$ of the form
\[ P(t) = \sum_{j=1}^{n} c_j e^{i(t, x_j)}, \qquad t \in \mathbb{R}^d, \]
where $n \in \mathbb{N}$, $c \in \mathbb{C}^n$ and $x_j \in \mathbb{R}^d$.

Theorem 3.1.4. Let $f \in P^c(\mathbb{R}^d)$ and let $\mu$ be the corresponding spectral measure. Then $f \in SP^c(\mathbb{R}^d)$ if and only if there is no trigonometric polynomial $P \neq 0$ on $\mathbb{R}^d$ which vanishes on the support of $\mu$.

Proof. Let $P \neq 0$ be a trigonometric polynomial on $\mathbb{R}^d$. We write $P$ in the form
\[ P(t) = \sum_{j=1}^{n} c_j e^{i(t, x_j)}, \qquad t \in \mathbb{R}^d, \]
where $c \in \mathbb{C}^n \setminus \{0\}$ and the $x_j$'s are pairwise distinct. The proof of inequality (1.1.2.v) shows that
\[ \sum_{i,j=1}^{n} f(x_i - x_j) c_i \overline{c_j} = \int_{\mathbb{R}^d} |P(t)|^2 \, d\mu(t). \]
Since $|P|^2$ is continuous and nonnegative, the integral on the right-hand side is zero if and only if $P$ vanishes on the support of $\mu$. The theorem follows immediately from this observation.
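A simple one-dimensional example shows how the criterion separates the two cases. The sketch below assumes $f = \cos$, whose spectral measure is $(\delta_1 + \delta_{-1})/2$; the trigonometric polynomial $P(t) = 1 - e^{2\pi i t}$ vanishes at $t = \pm 1$, so the quadratic form with points $0, 2\pi$ and coefficients $1, -1$ must vanish. For comparison the same form is evaluated for the Gaussian $e^{-t^2}$, whose spectral measure has full support. The helper name `quad_form` is hypothetical.

```python
import math

def quad_form(f, points, coeffs):
    """sum_{i,j} f(x_i - x_j) c_i conj(c_j)."""
    return sum(f(xi - xj) * ci * cj.conjugate()
               for xi, ci in zip(points, coeffs)
               for xj, cj in zip(points, coeffs))

points = [0.0, 2.0 * math.pi]           # frequencies x_1, x_2 of P
coeffs = [complex(1.0), complex(-1.0)]  # coefficients c_1, c_2 of P

# cos is positive definite but not strictly: the form degenerates.
q_cos = quad_form(math.cos, points, coeffs)

# The Gaussian is strictly positive definite: the form stays positive.
q_gauss = quad_form(lambda t: math.exp(-t * t), points, coeffs)

print(q_cos, q_gauss)
```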

The next two theorems provide simple sufficient conditions to ensure strict positive
definiteness on Rd .

Theorem 3.1.5. If $f \in P^c(\mathbb{R}^d)$, $d \ge 2$, is radial and not constant, then $f$ is strictly positive definite.

Proof. By Lemma 3.6.3 the spectral measure $\mu$ of $f$ is radial. Since $f$ is not constant the support of $\mu$ contains a sphere of positive radius centered at the origin. Therefore, Theorem C.10.3 shows that there is no trigonometric polynomial $P \neq 0$ vanishing on the support of $\mu$. From Theorem 3.1.4 we infer that $f$ is strictly positive definite.

Theorem 3.1.6. Let $f \in P^c(\mathbb{R}^d)$ and let $\mu$ be the corresponding spectral measure. If the support of $\mu$ contains a nonempty open set then $f$ is strictly positive definite.

Proof. The statement follows by the same argument as in the proof of Theorem 3.1.5. Instead of Theorem C.10.3 we apply Theorem C.10.2.

We now turn to the special case d D 1.3

Proposition 3.1.7. Let $f \in P^c(\mathbb{R})$ and let $\mu$ be the corresponding spectral measure. If the support of $\mu$ has accumulation points, then $f$ is strictly positive definite.

Proof. If a trigonometric polynomial vanishes on the support of $\mu$ then, being analytic, it vanishes on $\mathbb{R}$. Thus, the statement follows from Theorem 3.1.4.

Corollary 3.1.8. If the support of $\mu$ is not countable, then $f$ is strictly positive definite. In particular, if $\mu$ is not purely discrete, then $f \in SP^c(\mathbb{R})$.

Corollary 3.1.9. If $f \in P^c(\mathbb{R})$ and
\[ \limsup_{x\to\infty} |f(x)| < f(0), \]
then $f$ is strictly positive definite.

Proof. Theorem 1.8.5 with $s = 0$ shows that the spectral measure of $f$ cannot be purely discrete so we can apply Corollary 3.1.8.

3.2 Infinitely differentiable and rapidly decreasing functions
In this section we construct positive definite functions which are rapidly decreasing
(cf. Definition 3.2.6). These functions can be useful for example as technical tools in
proofs. We also consider the restriction of the Fourier transformation to the Schwartz
space S.Rd / of all rapidly decreasing functions (see Theorem 3.2.8).

3.2.1. Recall that $C^\infty(\mathbb{R}^d)$ denotes the set of infinitely differentiable functions on $\mathbb{R}^d$ while $C^\infty_{00}(\mathbb{R}^d)$ stands for the set of functions in $C^\infty(\mathbb{R}^d)$ which have compact support. It is easy to check that the function
\[
g(t) :=
\begin{cases}
e^{-1/t} & \text{if } t > 0, \\
0 & \text{if } t \le 0
\end{cases}
\tag{1}
\]
(see Figure 3.1) is infinitely differentiable on $\mathbb{R}$. Since $x \mapsto \|x\|^2$ is infinitely differentiable on $\mathbb{R}^d$ we see that the function
\[ h(x) := g(1 - \|x\|^2), \qquad x \in \mathbb{R}^d, \tag{2} \]
(see Figure 3.2) belongs to $C^\infty_{00}(\mathbb{R}^d)$. Moreover, $h$ is radial and its support is equal to the closed unit ball $B^c(1)$.
The support of
\[ f := h * \tilde{h} \tag{3} \]
is the ball $B^c(2)$. In addition, $f$ is positive definite, infinitely differentiable and radial (cf. Lemma 3.6.5).

Figure 3.1. The function $g$ from equation (3.2.1.1).

3 For special results in the one-dimensional case we refer to [12].
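The functions $g$ and $h$ from (1) and (2) are straightforward to evaluate. The following sketch (function names as in the text, sample points chosen arbitrarily) checks that $h$ vanishes outside the open unit ball and depends only on $\|x\|$.

```python
import math

def g(t):
    """g(t) = exp(-1/t) for t > 0 and 0 otherwise, as in equation (1)."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def h(x):
    """h(x) = g(1 - ||x||^2) for a point x given as a tuple of coordinates."""
    r2 = sum(c * c for c in x)
    return g(1.0 - r2)

print(h((0.0, 0.0)), h((0.3, 0.4)), h((0.5, 0.0)), h((1.0, 0.0)), h((2.0, 0.0)))
```

The points $(0.3, 0.4)$ and $(0.5, 0)$ have the same norm, so $h$ takes the same value there up to rounding.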

Theorem 3.2.2. A set $S \subset \mathbb{R}^d$ has the form $S = \{x \in \mathbb{R}^d : f(x) \neq 0\}$ for some infinitely differentiable positive definite function $f \neq 0$ if and only if
(i) $0 \in S$;
(ii) $S$ is open;
(iii) $S = -S$.

Proof. If $f \in P^c(\mathbb{R}^d)$ and $f \neq 0$, then the set
\[ S := \{x \in \mathbb{R}^d : f(x) \neq 0\} \]
obviously has the properties (i)–(iii).


[Figure 3.2. The function h from equation (3.2.1.2) with d = 1.]

Assume that $S \subset \mathbb{R}^d$ is such that (i)–(iii) hold and let $s_1, s_2, \dots$ be all points of $S$ having rational coordinates. For each $j \in \mathbb{N}$ choose $r_j > 0$ such that

$$B^c(r_j) \cup [B^c(r_j) + s_j] \cup [B^c(r_j) - s_j] \subset S. \tag{1}$$

Let $h \in P^c(\mathbb{R}^d) \cap C^\infty(\mathbb{R}^d)$ be such that $h \ge 0$ and $\operatorname{supp} h = B^c(1)$. We define the function $g_j$, $j \in \mathbb{N}$, by

$$g_j(x) = 2h(x/r_j) + h((x + s_j)/r_j) + h((x - s_j)/r_j), \quad x \in \mathbb{R}^d. \tag{2}$$

By Lemma 1.4.18, the function $g_j$ is positive definite. In addition, $g_j$ is nonnegative, $g_j \in C^\infty(\mathbb{R}^d)$ and

$$\operatorname{supp} g_j = B^c(r_j) \cup [B^c(r_j) + s_j] \cup [B^c(r_j) - s_j] \subset S.$$

Using equation (2) and Theorem 1.2.1 we see that

$$|D^\alpha g_j| \le \frac{4 A_\alpha}{r_j^{|\alpha|}}, \quad \alpha \in \mathbb{N}_0^d,$$

where $A_\alpha$ denotes the absolute moment corresponding to $h$. Using this, Lemma B.1.16 shows the existence of a sequence $\{p_k\}_1^\infty$ of positive real numbers such that the series

$$\sum_{j=1}^\infty p_j |D^\alpha g_j|$$

converges for all $\alpha \in \mathbb{N}_0^d$. Setting

$$f := \sum_{j=1}^\infty p_j g_j$$

we have $f \in P^c(\mathbb{R}^d)$. Theorem B.2.10 shows that $f \in C^\infty(\mathbb{R}^d)$. Noting that $f(s_j) \ge p_j g_j(s_j) > 0$, the relation

$$\{x \in \mathbb{R}^d : f(x) \ne 0\} = S$$

follows from equation (1) and from the fact that for each $s \in S$ there exists a subsequence $\{s_{k_j}\}$ converging to $s$.

The next three lemmata are often useful as technical tools in proofs.

Lemma 3.2.3. There exists a sequence $\{g_n\}_1^\infty$ of radial positive definite functions $g_n \in C^\infty_{00}(\mathbb{R}^d)$ of the form $g_n = h_n * \tilde{h}_n$, where $h_n \in C^\infty_{00}(\mathbb{R}^d)$, such that $g_n(0) = 1$ and $\{g_n\}_1^\infty$ tends to 1 uniformly on compact sets.

Proof. This lemma can be proved in the same way as Lemma 1.5.6 by using the func-
tion f from (3.2.1.3).

Lemma 3.2.4. There exists a sequence $\{f_n\}$ of radial positive definite functions $f_n \in C^\infty_{00}(\mathbb{R}^d)$ such that
(i) $\hat{f}_n \to 1$ uniformly on compact sets;
(ii) $\int_{\mathbb{R}^d} f_n h \, d\lambda \to h(0)$ for every continuous function $h$.

Proof. Consider the positive definite function $f := h * \tilde{h}$ from (3.2.1.3), where $h$ is the function defined in (3.2.1.2). It is clear that $f$ is radial and $f \in C^\infty_{00}(\mathbb{R}^d)$. Scaling and normalizing this function we obtain a sequence $\{f_n\}$ of radial positive definite functions such that $f_n$ is a density and the support of $f_n$ is contained in $B^c(1/n)$. It is a routine exercise to show that (ii) holds. Replacing $h$ in (ii) by the characters of $\mathbb{R}^d$, we see that $\hat{f}_n \to 1$ pointwise. The uniform convergence follows from Theorem 1.5.11.

Lemma 3.2.5. For all $\varepsilon, \delta \in (0, 1)$ there exists a radial positive definite function $p \in C^\infty_{00}(\mathbb{R}^d)$ such that $p$ is a density and

$$0 \le \hat{p}(t) < \delta, \quad t \in \mathbb{R}^d \setminus B^o(\varepsilon).$$

Proof. Let $f \in C^\infty_{00}(\mathbb{R}^d)$ be an arbitrary radial positive definite function which is a density (cf. the proof of Lemma 3.2.4). Then $0 \le \hat{f}(t) \le 1 = \hat{f}(0)$ for all $t$. By the Riemann–Lebesgue Lemma 1.8.4, the function $\hat{f}$ is in $C_0(\mathbb{R}^d)$. From Corollary 1.4.13 we conclude that $0 \le \hat{f}(t) < 1$, $t \ne 0$. Consequently,

$$(\hat{f}(t))^n < \delta, \quad t \in \mathbb{R}^d \setminus B^o(\varepsilon)$$

for some $n \in \mathbb{N}$. The function $p := f * \cdots * f$ ($n$-fold convolution) has the desired properties.

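Definition 3.2.6. A function $f \in C^\infty(\mathbb{R}^d)$ is called rapidly decreasing if

$$\|f\|_{\alpha,\beta} := \sup_{x \in \mathbb{R}^d} |x^\alpha D^\beta f(x)| < \infty$$

for all $\alpha, \beta \in \mathbb{N}_0^d$.

We denote by $S(\mathbb{R}^d)$ the linear space of all rapidly decreasing functions⁴ on $\mathbb{R}^d$. If $g$ is a complex-valued function on $\mathbb{R}^d$, then we write⁵

$$M_\alpha g(x) := x^\alpha g(x), \quad x \in \mathbb{R}^d$$

so that

$$\|f\|_{\alpha,\beta} = \|M_\alpha D^\beta f\|_\infty.$$

It is clear that $C^\infty_{00}(\mathbb{R}^d) \subset S(\mathbb{R}^d)$. Another example of a rapidly decreasing function is given by

$$f(x) = e^{-\|x\|^2}, \quad x \in \mathbb{R}^d.$$

The next lemma is simple; we omit the proof.

Lemma 3.2.7. If $f \in S(\mathbb{R}^d)$, then also $M_\alpha D^\beta f \in S(\mathbb{R}^d)$ for all $\alpha, \beta \in \mathbb{N}_0^d$.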

Theorem 3.2.8. The Fourier transformation maps $S(\mathbb{R}^d)$ onto $S(\mathbb{R}^d)$ and

$$M_\beta D^\alpha \hat{f} = (-1)^{|\alpha|} (D^\beta M_\alpha f)^\wedge, \quad f \in S(\mathbb{R}^d). \tag{1}$$

Moreover, if $f_n, f \in S(\mathbb{R}^d)$ and

$$\lim_{n \to \infty} \|f_n - f\|_{\alpha,\beta} = 0$$

for all $\alpha, \beta \in \mathbb{N}_0^d$, then

$$\lim_{n \to \infty} \|\hat{f}_n - \hat{f}\|_{\alpha,\beta} = 0$$

for all $\alpha, \beta \in \mathbb{N}_0^d$.

Proof. Equation (1) is a consequence of Theorem 1.2.1 and Theorem 1.2.9.
Let $Q$ be a positive polynomial such that⁶

$$\int_{\mathbb{R}^d} \frac{1}{Q} \, d\lambda = 1.$$

If $g \in S(\mathbb{R}^d)$ we have

$$(2\pi)^{d/2} |\hat{g}| \le \int_{\mathbb{R}^d} |g| \, d\lambda = \int_{\mathbb{R}^d} \frac{|gQ|}{Q} \, d\lambda \le \|gQ\|_\infty < \infty.$$

Replacing here $g$ by $D^\beta M_\alpha f$ and using equation (1) we see that $\hat{f} \in S(\mathbb{R}^d)$ and that the second statement of the theorem is true.
In the same way, we see that $\check{f} \in S(\mathbb{R}^d)$ whenever $f \in S(\mathbb{R}^d)$. Using this, the fact that the Fourier transformation maps $S(\mathbb{R}^d)$ onto $S(\mathbb{R}^d)$ follows from the relation

$$f = (\check{f})^\wedge, \quad f \in S(\mathbb{R}^d).$$

The proof is complete.

4 The function space $S(\mathbb{R}^d)$ is usually called Schwartz space.
5 Note that the symbol $M_\alpha$ is also used to denote moments of measures.
6 We can take for example $Q(x) = c(1 + \|x\|^2)^d$ with a suitable constant $c > 0$.

3.3 Analytic characteristic functions of one variable


Definition 3.3.1. A complex-valued function f on R is said to be analytic if it is
infinitely often differentiable and its Taylor series at 0 has positive convergence radius.
If the convergence radius is equal to infinity, then f is said to be an entire function.

We note that $f$ is analytic if and only if there exists a complex-valued function $\varphi$ of one complex variable, which is holomorphic in an open disk around 0 and $f(t) = \varphi(t)$ for all real $t$ in this disk. If $\varphi$ is holomorphic in the whole complex plane and $f(t) = \varphi(t)$ for all $t \in \mathbb{R}$, then $f$ is entire.

Characteristic functions that are not differentiable at 0, like $e^{-|t|}$ or $\max(1 - |t|, 0)$, are not analytic. The characteristic functions $e^{-t^2}$ and $\cos t$ are entire, while $\frac{1}{1 - it}$ and $\frac{1}{1 + t^2}$ are analytic but not entire.

3.3.2. We have seen in Section 1.2 that the existence of derivatives of a characteristic function $f$ is closely related to the existence of moments of the corresponding distribution $\mu$. We will prove that analyticity is closely related to finiteness of integrals of the form $\int e^{-rs} \, d\mu(s)$ for certain $r \in \mathbb{R}$. We introduce the numbers

$$s^+ = s_f^+ = \sup\Big\{ r \in \mathbb{R} : \int_{-\infty}^{\infty} e^{-rx} \, d\mu(x) < \infty \Big\},$$

$$s^- = s_f^- = \inf\Big\{ r \in \mathbb{R} : \int_{-\infty}^{\infty} e^{-rx} \, d\mu(x) < \infty \Big\}$$

and the set

$$S = S_f = \{ z \in \mathbb{C} : s^- < \operatorname{Im} z < s^+ \}.$$

Note that $s^- \le 0 \le s^+$ and hence $S$ is empty if and only if $s^- = s^+ = 0$.

If the support of $\mu$ is bounded, then $S = \mathbb{C}$. If $\mu$ is the exponential distribution with parameter $\alpha$, then $f(t) = \frac{\alpha}{\alpha - it}$ and $S = \{ z \in \mathbb{C} : -\alpha < \operatorname{Im} z < \infty \}$.
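
The exponential example can be checked numerically. The sketch below assumes the density $\alpha e^{-\alpha x}$ on $(0, \infty)$ and approximates $\int_0^\infty e^{izx} \alpha e^{-\alpha x} \, dx$ by a truncated trapezoid rule; the helper names, the truncation point and the step count are illustrative choices, not part of the text.

```python
import cmath

def char_fn_numeric(z, alpha, upper=60.0, steps=60000):
    """Trapezoid approximation of int_0^upper e^{izx} * alpha * e^{-alpha x} dx."""
    step = upper / steps
    total = 0j
    for i in range(steps + 1):
        x = i * step
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * cmath.exp((1j * z - alpha) * x)
    return alpha * total * step

def char_fn_closed(z, alpha):
    """Closed form alpha / (alpha - i z), valid for Im z > -alpha."""
    return alpha / (alpha - 1j * z)

z = 0.7 - 0.5j  # lies in the strip because Im z > -alpha for alpha = 1
print(abs(char_fn_numeric(z, 1.0) - char_fn_closed(z, 1.0)))
```

For $\operatorname{Im} z \le -\alpha$ the integrand no longer decays, which is the numerical counterpart of $s^- = -\alpha$.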

Theorem 3.3.3. Let $\mu$ be a probability distribution with characteristic function $f$. If $S$ is not empty, then the integral

$$f(z) := \int_{-\infty}^{\infty} e^{izx} \, d\mu(x)$$

exists for all $z$ in the strip $S$ and it represents there a holomorphic function. Moreover,

$$f^{(n)}(z) = \int_{-\infty}^{\infty} (ix)^n \, e^{izx} \, d\mu(x), \quad n \in \mathbb{N} \tag{1}$$

for all $z \in S$.⁷

Proof. The fact that $f(z)$ exists follows from $|e^{izx}| = e^{-\operatorname{Im} z \cdot x}$. We now proceed in a similar way as in the proof of (1.2.1.i). Let $z \in S$ be fixed and choose a positive $\delta$ such that $z + h \in S$ for all $h \in \mathbb{C}$ with $|h| < \delta$. If, in addition, $h \ne 0$ then we have

$$\frac{f(z + h) - f(z)}{h} = \int_{-\infty}^{\infty} x \cdot \frac{e^{ihx} - 1}{hx} \, e^{izx} \, d\mu(x). \tag{2}$$

Theorem B.1.5 with $n = 0$ shows that the integrand times $\frac{\delta}{2}$ is dominated by

$$\frac{\delta |x|}{2} \, e^{|hx|} \, e^{-\operatorname{Im} z \cdot x}.$$

Using that $r < e^r$, $r \in \mathbb{R}$, we see that this expression is less than

$$e^{\delta |x|} \cdot e^{-\operatorname{Im} z \cdot x}$$

whenever $|h| < \frac{\delta}{2}$. This function is $\mu$-integrable with respect to $x$ by our choice of $\delta$. Taking the limit $h \to 0$ in (2) and applying Lebesgue's theorem on dominated convergence we obtain (1) for $n = 1$. The general case is obtained by repeating the arguments above.

Corollary 3.3.4. If the support of a distribution is bounded, then its characteristic function is entire.

Corollary 3.3.5. The characteristic function of $\mu$ is analytic if and only if

$$s^- < 0 < s^+.$$

In the sequel, we always assume that an analytic characteristic function $f$ is defined in the strip $S_f$.

7 See also Lemma 3.4.1.

Theorem 3.3.6. Let $f$ be an analytic characteristic function.
(i) If $g$ is an analytic characteristic function and $f$ and $g$ coincide on an interval $(-a, a) \subset \mathbb{R}$, then $f = g$ on $\mathbb{R}$.
(ii) If $g$ is an entire function and $f$ and $g$ coincide on an interval $(-a, a) \subset \mathbb{R}$, then $f = g$ on $\mathbb{R}$.⁸
(iii) If two probability distributions have analytic characteristic functions and the same moments, then they are equal.

Proof. (i) It follows from Theorem 3.3.3 that $f$ and $g$ have holomorphic extensions to the strip $S := S_f \cap S_g$. Since $S$ is connected and $f = g$ on $(-a, a)$, the extensions are equal on $S$ (cf. Theorem C.1.8). This implies that the corresponding distributions are equal, hence $S_f = S_g$.
(ii) We extend $f$ to the strip $S_f$ and consider $g$ on the whole complex plane. Since $S_f$ is simply connected and $f = g$ on $(-a, a)$, the extensions are equal on $S_f$.
(iii) By Theorem 1.2.1, the corresponding characteristic functions have the same Taylor expansion at zero. Hence (iii) follows from (i).

Remark 3.3.7. A distribution is in general not uniquely determined by its moments. To see an example we consider the log-normal distribution which has the density $p$ given by

$$p(x) = \frac{1}{x \sqrt{2\pi}} \, e^{-(\log x)^2/2}, \quad x > 0$$

(see Figure 3.3). The moments $M_k$ of this distribution are given by

$$M_k = \frac{1}{\sqrt{2\pi}} \int_0^\infty x^{k-1} e^{-(\log x)^2/2} \, dx, \quad k \in \mathbb{N}.$$

Substituting $t = \log x$, we get

$$M_k = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/2 + kt} \, dt = e^{k^2/2} \, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(t-k)^2/2} \, dt = e^{k^2/2}.$$

[Figure 3.3. The function p from Remark 3.3.7.]

The same argument as above shows that

$$\int_0^\infty x^k \sin(2\pi \log x) \, p(x) \, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/2 + kt} \sin(2\pi t) \, dt = e^{k^2/2} \, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2/2} \sin(2\pi y) \, dy = 0$$

for all $k \in \mathbb{N}_0$. Consequently, all the functions $p_a$, $a \in [-1, 1]$, defined by

$$p_a(x) = p(x) \big( 1 + a \sin(2\pi \log x) \big), \quad x > 0$$

are densities and they have the same moment sequence.⁹

8 See also Theorem 4.2.3.
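
The claim of Remark 3.3.7 can be illustrated numerically. The sketch below works in the variable $t = \log x$, pulls out the factor $e^{k^2/2}$ by completing the square, and applies a trapezoid rule on a window around $t = k$; the function name, window width and step count are assumptions chosen for illustration.

```python
import math

def lognormal_moment(k, a, half_width=12.0, steps=24000):
    """k-th moment of p_a(x) = p(x) * (1 + a*sin(2*pi*log x)), computed via t = log x."""
    lo = k - half_width
    step = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * step
        weight = 0.5 if i in (0, steps) else 1.0
        # -t^2/2 + k*t = -(t - k)^2/2 + k^2/2; the factor e^{k^2/2} is applied at the end
        total += weight * math.exp(-0.5 * (t - k) ** 2) * (1.0 + a * math.sin(2.0 * math.pi * t))
    return math.exp(0.5 * k * k) * total * step / math.sqrt(2.0 * math.pi)

for a in (-1.0, 0.0, 1.0):
    print(a, [lognormal_moment(k, a) / math.exp(0.5 * k * k) for k in range(4)])
```

Each printed ratio should be close to 1 regardless of `a`, in line with the moments $e^{k^2/2}$ being shared by all $p_a$.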


Next we present an application of analytic characteristic functions.

Theorem 3.3.8. If the random variables $X$ and $Y$ have analytic characteristic functions and the equation

$$E(X^k \cdot Y^l) = E(X^k) \cdot E(Y^l)$$

holds for all $k$ and $l$ in $\mathbb{N}$, then $X$ and $Y$ are independent.¹⁰

Proof. Let $f_X$, $f_Y$, and $g$ be the characteristic functions of $X$, $Y$, and the random vector $(X, Y)$, respectively. If $s, t \in \mathbb{R}$ and $|s|$ and $|t|$ are small enough, then we have

$$f_X(s) \cdot f_Y(t) = \sum_{k=0}^\infty \frac{E(X^k) (is)^k}{k!} \cdot \sum_{l=0}^\infty \frac{E(Y^l) (it)^l}{l!} = \sum_{k,l=0}^\infty \frac{E(X^k) (is)^k \, E(Y^l) (it)^l}{k! \, l!} = \sum_{k,l=0}^\infty \frac{E(X^k Y^l) (is)^k (it)^l}{k! \, l!}.$$

Using Theorem 1.2.1 we see that the last sum is the Taylor series of the function $g$ at zero. Now fix $s$ and $t$. The display above shows that $r \mapsto g(rs, rt)$, $r \in \mathbb{R}$, is an analytic characteristic function which is equal to the analytic characteristic function $r \mapsto f_X(rs) \cdot f_Y(rt)$ in a neighborhood of zero. By (3.3.6.i), the equality holds on $\mathbb{R}$. In particular,

$$g(s, t) = f_X(s) \cdot f_Y(t).$$

Thus, $X$ and $Y$ are independent (cf. Theorem 1.3.10).

9 See, e.g., [1, 35, 55] or Chapter 2 in [6] for more information on this topic.
10 We refer to [7] for some related results.

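Theorem 3.3.9. Let $f_1$ and $f_2$ be characteristic functions. If the characteristic function $f = f_1 f_2$ is analytic, then so are $f_1$ and $f_2$. Moreover, the relations $S_{f_j} \supset S_f$ $(j = 1, 2)$ hold.¹¹

Proof. Denote by $\mu_1$, $\mu_2$ and $\mu$ the corresponding distributions. For all $A, B > 0$ and $s_f^- < r < s_f^+$ we have

$$\int_{-A}^{A} e^{-rx} \, d\mu_1(x) \int_{-B}^{B} e^{-ry} \, d\mu_2(y) = \int_{-A}^{A} \int_{-B}^{B} e^{-r(x+y)} \, d\mu_1(x) \, d\mu_2(y) \le \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-r(x+y)} \, d\mu_1(x) \, d\mu_2(y)$$

$$= \int_{-\infty}^{\infty} e^{-rs} \, d(\mu_1 * \mu_2)(s) = \int_{-\infty}^{\infty} e^{-rs} \, d\mu(s) < \infty.$$

This being true for all $A, B > 0$, we conclude that

$$\int_{-\infty}^{\infty} e^{-rx} \, d\mu_1(x) < \infty \quad \text{and} \quad \int_{-\infty}^{\infty} e^{-rx} \, d\mu_2(x) < \infty$$

whenever $s_f^- < r < s_f^+$.

11 See also Theorem 3.3.14.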

Corollary 3.3.10. Let $f_1$ and $f_2$ be characteristic functions. If $f_1 \cdot f_2$ is an entire characteristic function, then $f_1$ and $f_2$ are entire characteristic functions as well.

We have seen in Theorem 1.1.2 that characteristic functions are positive definite. We now show that analytic characteristic functions are positive definite in the corresponding strip in the following sense.

Lemma 3.3.11. Every analytic characteristic function $f$ satisfies the inequality

$$\sum_{j,k=1}^n f(z_j - \bar{z}_k) \, c_j \bar{c}_k \ge 0$$

for an arbitrary choice of $c_1, \dots, c_n \in \mathbb{C}$ and $z_1, \dots, z_n \in S_f$ such that $z_j - \bar{z}_k \in S_f$.

Proof. Denoting by $\mu$ the corresponding distribution, we have

$$f(z) = \int_{-\infty}^{\infty} e^{izx} \, d\mu(x), \quad z \in S_f$$

and hence

$$\sum_{j,k=1}^n f(z_j - \bar{z}_k) \, c_j \bar{c}_k = \int_{-\infty}^{\infty} \sum_{j,k=1}^n e^{i(z_j - \bar{z}_k)x} c_j \bar{c}_k \, d\mu(x) = \int_{-\infty}^{\infty} \sum_{j,k=1}^n e^{iz_j x} \, \overline{e^{iz_k x}} \, c_j \bar{c}_k \, d\mu(x) = \int_{-\infty}^{\infty} \Big| \sum_{j=1}^n e^{iz_j x} c_j \Big|^2 d\mu(x) \ge 0.$$

Next we prove some simple but useful properties of analytic characteristic functions.

Theorem 3.3.12. Let $f$ be an analytic characteristic function.
(i) The function $x \mapsto f(x + iy)$ is positive definite on $\mathbb{R}$ for all $y \in (s_f^-, s_f^+)$.
(ii) $f(-\bar{z}) = \overline{f(z)}$, $z \in S_f$.
(iii) $f(iy) > 0$, $iy \in S_f$.
(iv) $|f(x + iy)| \le f(iy)$, $x \in \mathbb{R}$, $iy \in S_f$.
(v) The function $y \mapsto \log f(iy)$ is convex on the interval $(s_f^-, s_f^+)$.

Proof. The first statement is obtained from Lemma 3.3.11 with $z_j = x_j + iy/2$ while (ii) and (iv) follow from (i) (see Lemma 1.4.8 and Theorem 1.4.12).¹² Let $\mu$ be the distribution corresponding to $f$. Then

$$f(iy) = \int_{-\infty}^{\infty} e^{-yx} \, d\mu(x) > 0.$$

By Lemma 3.3.11, the matrix $A = \big( f(z_j - \bar{z}_k) \big)_{j,k=1}^n$ is positive semidefinite and hence $\det A \ge 0$. In the special case where $n = 2$, $z_j = iy_j$, this inequality gives

$$f(i[y_1 + y_2])^2 \le f(i2y_1) \, f(i2y_2), \quad s_f^- < 2y_1, \, 2y_2, \, y_1 + y_2 < s_f^+. \tag{1}$$

From (iii) we see that $f$ is strictly positive on the imaginary axis. Inequality (1) and the continuity of $f$ imply that the function $y \mapsto \log f(iy)$ is convex.

12 Actually, these properties also follow easily from the integral representation of $f$.
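
Properties (ii), (iv) and (v) are easy to test on a concrete example. The sketch below uses a hypothetical three-point distribution (the atoms and weights are made up for illustration); its characteristic function is a finite exponential sum and hence entire, so $S_f = \mathbb{C}$.

```python
import cmath
import math

# Hypothetical discrete distribution: atoms and their probabilities.
POINTS = [-1.0, 0.5, 2.0]
WEIGHTS = [0.2, 0.5, 0.3]

def f(z):
    """Characteristic function sum_m w_m * exp(i z x_m), extended to complex z."""
    return sum(w * cmath.exp(1j * z * x) for w, x in zip(WEIGHTS, POINTS))

def log_f_on_imaginary_axis(y):
    """log f(iy); f(iy) is real and positive by (iii)."""
    return math.log(f(1j * y).real)

z = 0.8 + 0.3j
print(abs(f(z)), f(0.3j).real)              # (iv): the first value is at most the second
print(abs(f(-z.conjugate()) - f(z).conjugate()))  # (ii): difference is numerically zero
```

Midpoint convexity of `log_f_on_imaginary_axis` can be checked in the same way for pairs of real arguments.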

Lemma 3.3.13. Let $f$ be an analytic characteristic function of the form $f = e^g$, where $g$ is holomorphic in $S_f$ and $g(0) = 0$. Then
(i) for $x, y \in \mathbb{R}$ such that $|y| \le r < \min(-s_f^-, s_f^+)$ we have

$$\operatorname{Re} g(x + iy) \le g(iy) \le \max\big( g(ir), g(-ir) \big);$$

(ii) $g(iy) \ge \dfrac{dg(iy)}{dy}\Big|_{y=0} \cdot y$.

Proof. By (3.3.12.v), the function $y \mapsto \log f(iy)$ is convex and hence the inequality

$$g(iy) \le \max\big( g(ir), g(-ir) \big) \tag{1}$$

holds. Since $|f| = e^{\operatorname{Re} g}$, we obtain (i) from (3.3.12.iv) and (1).
In view of $g(iy) = \log f(iy)$, the function $g$ is convex on the imaginary axis. Convexity and the fact that $g(0) = 0$ imply (ii).

The next theorem, which is essentially due to Yu. V. Linnik,¹³ is a generalization of Theorem 3.3.9. If all the numbers $\alpha_k$ below are integers or rational numbers, then Theorem 3.3.14 follows easily from Theorem 3.3.9.

Theorem 3.3.14 (Linnik). Let $f$ be a holomorphic function on the open disk $D(0, \delta)$, $\delta > 0$, such that $f(-\bar{z}) = \overline{f(z)}$ for all $z \in D(0, \delta)$. Further, let $f_1, \dots, f_r$ be characteristic functions and $\alpha_1, \dots, \alpha_r$ be positive real numbers such that

$$\prod_{k=1}^r |f_k(t)|^{\alpha_k} = |f(t)|, \quad -\delta < t < \delta.$$

Then $f_k$ is analytic and $S_{f_k} \supset D(0, \delta)$, $k = 1, \dots, r$.

Proof. As in the proof of Theorem 1.2.13 we write

$$g_k(t) = f_k(t) f_k(-t) = |f_k(t)|^2, \quad t \in \mathbb{R}$$

and $g(z) = f(z) f(-z)$, $z \in D(0, \delta)$, so that $g(t) = |f(t)|^2$ for real $t$. Then $g$ is an even holomorphic function on $D(0, \delta)$, the $g_k$'s are characteristic functions and $g$, $g_k$ are nonnegative on the interval $(-\delta, \delta)$. Moreover,

$$\prod_{k=1}^r g_k^{\alpha_k}(t) = g(t), \quad -\delta < t < \delta. \tag{1}$$

We will prove the statement of the theorem for the functions $g_k$. The statement for $f_k$ follows then from Theorem 3.3.9. We choose $d \in (0, \delta]$ such that $g(z) \ne 0$ if $|z| < d$. From (1) we obtain

$$\sum_{k=1}^r \alpha_k \log \frac{1}{g_k(t)} = h(t), \quad -d < t < d \tag{2}$$

where $h(t) = -\log g(t)$.

13 Linnik's original formulation contains the conditions $f(z) \ne 0$, $z \in D(0, \delta)$, and $\prod_{k=1}^r f_k^{\alpha_k}(t) = f(t)$, $-\delta < t < \delta$. Our conditions are somewhat weaker and allow us to avoid the use of powers of complex numbers.


By Theorem 1.2.13, the functions $g_k$ are infinitely often differentiable. We prove first that they are analytic and $S_{g_k} \supset D(0, R)$ for some $R > 0$. We may suppose that $\alpha_k \ge 1$ $(k = 1, \dots, r)$. Indeed, if $N_0 \in \mathbb{N}$ is sufficiently large, then $N_0 \alpha_k \ge 1$ for all $k$. Raising both sides of (1) to the power $N_0$ we obtain an equation of the same type containing factors of the form $g_k^{N_0 \alpha_k}$.
We introduce the notation $\varphi_q = g^{2q}$ and $\varphi_{q,k}(t) = g_k^{2q\alpha_k}(t)$, $t \in (-\delta, \delta)$. Raising both sides of (1) to the power $2q$, differentiating the new equation $2q$ times using the Leibniz rule (cf. (B.1.13.1)), and setting $t = 0$ we obtain

$$\varphi_q^{(2q)}(0) = \sum_{l_1 + \cdots + l_r = 2q} \frac{(2q)!}{l_1! \cdots l_r!} \, \varphi_{q,1}^{(l_1)}(0) \cdots \varphi_{q,r}^{(l_r)}(0). \tag{3}$$

To compute the derivatives on the right-hand side we apply Faà di Bruno's formula (B.1.14) which yields

$$\varphi_{q,k}^{(m)}(0) = \sum c(b, m) \, g_k'(0)^{b_1} \, g_k''(0)^{b_2} \cdots g_k^{(m)}(0)^{b_m} \tag{4}$$

where the sum is over all $b \in \mathbb{N}_0^m$ satisfying $b_1 + 2b_2 + \cdots + m b_m = m$ and

$$c(b, m) = \frac{m! \, 2q\alpha_k (2q\alpha_k - 1) \cdots (2q\alpha_k - |b| + 1)}{b_1! \, b_2! \cdots b_m! \, (1!)^{b_1} (2!)^{b_2} \cdots (m!)^{b_m}}.$$

Since $\alpha_k \ge 1$ and $|b| \le m$, the number $c(b, m)$ is positive if $m \le 2q$. Noting that $g_k^{(2l-1)}(0) = 0$ and the sign of $[g_k^{(2l)}(0)]^{b_{2l}}$ is $(-1)^{l b_{2l}}$, we see that the sign of the nonvanishing terms¹⁴ on the right of (4) is equal to $(-1)^{m/2}$. From this we conclude that the sign of the nonvanishing terms on the right-hand side of (3) is $(-1)^q$. Since the right-hand side of (3) contains the terms $2q\alpha_k g_k^{(2q)}(0)$ we obtain

$$\Big| \sum_{k=1}^r 2q\alpha_k \, g_k^{(2q)}(0) \Big| \le |\varphi_q^{(2q)}(0)|$$

and hence

$$\big| g_k^{(2q)}(0) \big| \le \frac{1}{2q\alpha_k} \, |\varphi_q^{(2q)}(0)|, \quad k = 1, \dots, r. \tag{5}$$

Applying the inequality (C.1.13) to the Taylor expansion of $\varphi_q = g^{2q}$ with $r$ replaced by $\delta_0 = \frac{\delta}{2}$ we get

$$\frac{|\varphi_q^{(2q)}(0)|}{(2q)!} \le \frac{2}{\delta_0^{2q}} \sup_{|z| = \delta_0} \big| \operatorname{Re} \big( g^{2q}(z) \big) \big| \le 2M^{2q} \tag{6}$$

where $M = \frac{1}{\delta_0} \cdot \sup_{|z| = \delta_0} |g(z)|$. The inequalities (5) and (6) show that

$$\limsup_{q \to \infty} \left( \frac{|g_k^{(2q)}(0)|}{(2q)!} \right)^{\frac{1}{2q}} \le M$$

and hence $g_k$ is holomorphic on the disk $D(0, 1/M)$.

14 Note that $m$ is even if there are nonvanishing terms.


Denote by $\delta_k$ the radius of convergence of the Taylor expansion of $g_k$. We already know that $\delta_k > 0$. Without loss of generality we may assume that $\delta_1 \le \cdots \le \delta_r$. To finish the proof we have to show that $\delta_1 \ge \delta$. Assume, on the contrary, that $\delta_1 < \delta$. The equations

$$G_k(t) = \sum_{j=0}^\infty \frac{g_k^{(2j)}(0)}{(2j)!} (-t)^j, \quad t \in (-\delta_k^2, \delta_k^2)$$

and

$$G(t) = \sum_{j=0}^\infty \frac{g^{(2j)}(0)}{(2j)!} (-t)^j, \quad t \in (-\delta^2, \delta^2)$$

define functions whose Taylor expansions have $\delta_k^2$ and $\delta^2$ as radii of convergence, respectively. We have $G_k(t^2) = g_k(it)$, $G(t^2) = g(it)$ and therefore

$$\prod_{k=1}^r G_k^{\alpha_k}(t) = G(t), \quad t \in (-\delta_1^2, \delta_1^2). \tag{7}$$

Relation (1.2.1.ii) shows that

$$G_k^{(l)}(0) \ge 0, \quad l \in \mathbb{N}_0. \tag{8}$$

Let $\delta_0 \in (0, \delta_1^2)$ be such that $G(\delta_0) \ne 0$ and define the functions $H_k$ by

$$H_k(t) = G_k(\delta_0 + t), \quad |\delta_0 + t| < \delta_1^2, \quad k = 1, \dots, r.$$

By (7) we have

$$\prod_{k=1}^r H_k^{\alpha_k}(t) = G(\delta_0 + t), \quad |\delta_0 + t| < \delta_1^2,$$

hence

$$\prod_{k=1}^r \left( \frac{H_k(t)}{H_k(0)} \right)^{\alpha_k} = \frac{G(\delta_0 + t)}{G(\delta_0)}.$$

We raise this equation to the power $n$, differentiate the resulting equation $n$ times and set $t = 0$. The left-hand side contains the term $n\alpha_1 H_1^{(n)}(0)/H_1(0)$. Since $H_k^{(l)}(0) \ge 0$ for all $l$ (cf. (8)) we conclude that

$$n\alpha_1 H_1^{(n)}(0) \le H_1(0) \cdot \frac{d^n}{dt^n} \frac{G^n(\delta_0 + t)}{G^n(\delta_0)} \bigg|_{t=0}.$$

We now choose arbitrary real numbers $r_1$ and $r_2$ such that $\delta_1^2 < r_1 < r_2 < \delta^2$. Applying the inequality (C.1.13) to the Taylor expansion of $G^n(\delta_0 + t)$ with $r$ replaced by $r_1 - \delta_0$ we get

$$\frac{1}{n!} \cdot \frac{d^n}{dt^n} G^n(\delta_0 + t) \bigg|_{t=0} \le \frac{2}{(r_1 - \delta_0)^n} \cdot \sup_{|z| = r_1 - \delta_0} \big| \operatorname{Re} \big( G^n(\delta_0 + z) \big) \big| \le 2M^n,$$

where $M = \frac{1}{r_1 - \delta_1^2} \cdot \sup_{|z| \le r_2} |G(z)|$. Consequently,

$$\limsup_{n \to \infty} \left( \frac{H_1^{(n)}(0)}{n!} \right)^{\frac{1}{n}} \le M$$

and therefore $H_1$ is holomorphic on the disk $D(\delta_0, 1/M)$. Since $M$ does not depend on $\delta_0$, we can choose $\delta_0$ such that this disk contains $\delta_1^2$. This shows that $G_1$ is bounded on $(\delta_1^2 - \varepsilon, \delta_1^2 + \varepsilon)$ for some positive $\varepsilon$. On the other hand, the Taylor series of $G_1$ at 0 has nonnegative coefficients (cf. inequality (8)) and hence this series cannot converge for values larger than $\delta_1^2$ (see Lemma B.1.15). This contradiction shows that $\delta_1 \ge \delta$.

3.4 Holomorphic L2 Fourier transforms


In this section we consider the Fourier transforms of square integrable functions as well as their extensions to certain regions of $\mathbb{C}$.¹⁵ We will use the notation

$$\mathbb{C}^+_{\mathrm{im}} := \{ z \in \mathbb{C} : \operatorname{Im} z > 0 \}, \quad \mathbb{C}^-_{\mathrm{im}} := \{ z \in \mathbb{C} : \operatorname{Im} z < 0 \}$$

and

$$\mathbb{C}^+_{\mathrm{re}} := \{ z \in \mathbb{C} : \operatorname{Re} z > 0 \}, \quad \mathbb{C}^-_{\mathrm{re}} := \{ z \in \mathbb{C} : \operatorname{Re} z < 0 \}.$$

15 For further reading on this topic we refer to Rudin's book [46].

Lemma 3.4.1. If $F \in L^2(0, \infty)$, then the function

$$f(z) = \int_0^\infty F(t) e^{itz} \, dt, \quad z \in \mathbb{C}^+_{\mathrm{im}}$$

is holomorphic. Moreover,

$$\int_{-\infty}^{\infty} |f(x + iy)|^2 \, dx \le 2\pi \int_0^\infty |F(t)|^2 \, dt, \quad y > 0.$$

Proof. If $z = x + iy$ where $y > 0$, then $|e^{itz}| = e^{-ty}$ showing that the integral above exists.
Assume that $y > \delta > 0$, $\operatorname{Im} z_n > \delta$, and $z_n \to z$. Using the Cauchy–Schwarz inequality we obtain

$$|f(z) - f(z_n)|^2 \le \int_0^\infty |F(t)|^2 \, dt \cdot \int_0^\infty \big| e^{itz} - e^{itz_n} \big|^2 \, dt.$$

The second integrand on the right is bounded by the integrable function $t \mapsto 4e^{-2\delta t}$ and tends to zero as $n \to \infty$. The dominated convergence theorem shows that $f$ is continuous.
Let $\gamma$ be a closed path in $\mathbb{C}^+_{\mathrm{im}}$. By Fubini's theorem we have

$$\int_\gamma f(z) \, dz = \int_0^\infty F(t) \int_\gamma e^{itz} \, dz \, dt.$$

Cauchy's Theorem C.1.5 shows that

$$\int_\gamma e^{itz} \, dz = 0.$$

Thus, the integral of $f$ over $\gamma$ is zero and hence, by Morera's Theorem C.1.6, $f$ is holomorphic on $\mathbb{C}^+_{\mathrm{im}}$.
To prove the last statement let $y > 0$ be fixed. By the definition of $f$, we have

$$\int_{-\infty}^{\infty} f(x + iy) \, dx = \int_{-\infty}^{\infty} \int_0^\infty \big( F(t) e^{-ty} \big) e^{itx} \, dt \, dx, \quad y > 0.$$

Applying Plancherel's Theorem 1.8.9 we obtain

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} |f(x + iy)|^2 \, dx = \int_0^\infty |F(t)|^2 e^{-2ty} \, dt \le \int_0^\infty |F(t)|^2 \, dt.$$
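
As an illustration of Lemma 3.4.1, take the hypothetical choice $F(t) = e^{-t}$ (not from the text). Then $f(z) = 1/(1 - iz)$ and $\int_0^\infty |F|^2 \, dt = 1/2$, so the bound reads $\int |f(x+iy)|^2 \, dx \le \pi$. The sketch below evaluates the left-hand side with the substitution $x = c \tan\theta$, $c = 1 + y$, and a midpoint rule; the function names and step count are illustrative assumptions.

```python
import math

def f(z):
    """Closed form of int_0^inf e^{-t} e^{itz} dt, valid for Im z > -1."""
    return 1.0 / (1.0 - 1j * z)

def l2_norm_sq_on_line(y, steps=20000):
    """int_R |f(x + iy)|^2 dx via x = c*tan(theta), midpoint rule in theta."""
    c = 1.0 + y
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * dtheta
        x = c * math.tan(theta)
        # dx = c / cos(theta)^2 dtheta
        total += abs(f(x + 1j * y)) ** 2 * c / math.cos(theta) ** 2
    return total * dtheta

for y in (0.1, 1.0, 5.0):
    print(y, l2_norm_sq_on_line(y), math.pi / (1.0 + y))
```

For this $F$ the Plancherel identity gives the exact value $\pi/(1+y)$, which stays below $\pi$ for every $y > 0$.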

Lemma 3.4.2. Let $0 < a < \infty$ and $F \in L^2(-a, a)$. The function

$$f(z) = \int_{-a}^{a} F(t) e^{itz} \, dt, \quad z \in \mathbb{C}$$

is entire and

$$|f(z)| \le C e^{a|z|}, \quad z \in \mathbb{C}$$

with some constant $C \ge 0$.

Proof. The fact that $f$ is entire can be proved by the same arguments as in the proof of Lemma 3.4.1 while the inequality follows from

$$|f(z)| \le \int_{-a}^{a} |F(t)| e^{-ty} \, dt \le e^{a|y|} \int_{-a}^{a} |F(t)| \, dt$$

since $|y| \le |z|$.

Definition 3.4.3. For $a > 0$ we denote by $E(a)$ the set of all entire functions of exponential type $a$, i.e., $F \in E(a)$ provided that $F$ is entire and

$$|F(w)| \le C e^{a|w|}, \quad w \in \mathbb{C}$$

for some $C < \infty$.

Theorem 3.4.4 (Paley–Wiener). Let $a$ and $C$ be positive constants and $f \in E(a)$ be an entire function such that

$$\int_{-\infty}^{\infty} |f(x)|^2 \, dx < \infty. \tag{1}$$

Then the $L^2$ Fourier transform $F$ of $f|_{\mathbb{R}}$ vanishes outside $[-a, a]$ and

$$f(z) = \int_{-a}^{a} F(t) e^{itz} \, dt \tag{2}$$

for all $z \in \mathbb{C}$.

Proof. For $\varepsilon > 0$ define the function $f_\varepsilon$ by

$$f_\varepsilon(x) := f(x) e^{-\varepsilon |x|}, \quad x \in \mathbb{R}.$$

We will prove that

$$\lim_{\varepsilon \to 0} \int_{-\infty}^{\infty} f_\varepsilon(x) e^{-itx} \, dx = 0, \quad t \in \mathbb{R} \setminus [-a, a]. \tag{3}$$

First we show that this relation implies the theorem.
Considering the restriction of $f$ to the real axis, we have

$$\lim_{\varepsilon \to 0} \|f_\varepsilon - f\|_2 = 0.$$

Plancherel's theorem (cf. Theorem 1.8.9) implies that the Fourier transforms of the $f_\varepsilon$'s converge in $L^2(\mathbb{R})$ to the $L^2$ Fourier transform $F$ of $f$. By equation (3) the function $F$ vanishes outside $[-a, a]$. Since $f$ is equal to the inverse Fourier transform of $F$ we see that equation (2) holds for almost every $z \in \mathbb{R}$. As each side of (2) is an entire function (see Corollary 3.3.4), we conclude that (2) holds for all $z \in \mathbb{C}$.
To prove (3) let $\gamma_\alpha$ be the path defined by

$$\gamma_\alpha(s) = s e^{i\alpha}, \quad \alpha \in \mathbb{R}, \; s \in [0, \infty). \tag{4}$$

Write

$$\Pi_\alpha = \{ w \in \mathbb{C} : \operatorname{Re}(w e^{i\alpha}) > a \},$$

and for $w \in \Pi_\alpha$, define the function $\Psi_\alpha$ by¹⁶

$$\Psi_\alpha(w) = \int_{\gamma_\alpha} f(z) e^{-wz} \, dz = e^{i\alpha} \int_0^\infty f(s e^{i\alpha}) \, e^{-w s e^{i\alpha}} \, ds.$$

Using the fact that $f \in E(a)$ we see that the modulus of the integrand is at most

$$C e^{-(\operatorname{Re}(w e^{i\alpha}) - a)s}.$$

The same arguments as in the proof of Lemma 3.4.1 show that $\Psi_\alpha$ is holomorphic in the half plane $\Pi_\alpha$. In the cases $\alpha = 0$ and $\alpha = \pi$, we extend the definition of $\Psi_\alpha$ to larger half planes by

$$\Psi_0(w) = \int_0^\infty f(x) e^{-wx} \, dx, \quad w \in \mathbb{C}^+_{\mathrm{re}}$$

and

$$\Psi_\pi(w) = -\int_{-\infty}^0 f(x) e^{-wx} \, dx, \quad w \in \mathbb{C}^-_{\mathrm{re}}.$$

Inequality (1) and Lemma 3.4.1 show that $\Psi_0$ and $\Psi_\pi$ are holomorphic in the indicated half planes. Moreover, it is easy to check that

$$\int_{-\infty}^{\infty} f_\varepsilon(x) e^{-itx} \, dx = \Psi_0(\varepsilon + it) - \Psi_\pi(-\varepsilon + it), \quad t \in \mathbb{R}.$$

Hence, it remains to prove that the right-hand side of the equation above tends to zero as $\varepsilon \to 0$ if $|t| > a$.
Next we show that any two of the functions $\Psi_\alpha$ are equal in the intersection of their domains of definition.
Let $0 < \beta - \alpha < \pi$ and write

$$\gamma = \frac{\alpha + \beta}{2}, \quad \eta = \cos \frac{\beta - \alpha}{2}.$$

Note that $\eta$ is positive. If $w = r e^{-i\gamma}$, where $r \in (a/\eta, \infty)$, then $w \in \Pi_\alpha \cap \Pi_\beta$. This follows from

$$\operatorname{Re}\big( w e^{i\alpha} \big) = r\eta = \operatorname{Re}\big( w e^{i\beta} \big).$$

Let $\Gamma$ be the curve given by $\Gamma(t) = R e^{it}$, $R > 0$, $\alpha \le t \le \beta$. If $z = \Gamma(t)$ is a point of this curve and $w$ is as above, then

$$\operatorname{Re}(wz) = rR \cos(t - \gamma) \ge rR\eta$$

and hence the modulus of the integrand of

$$I_R := \int_\Gamma f(z) e^{-wz} \, dz$$

is at most

$$C e^{(a - r\eta)R}.$$

Since $r \in (a/\eta, \infty)$, we conclude that the integral $I_R$ tends to zero as $R \to \infty$. By Cauchy's Theorem C.1.5

$$\int_{[0, Re^{i\beta}]} f(z) e^{-wz} \, dz = I_R + \int_{[0, Re^{i\alpha}]} f(z) e^{-wz} \, dz.$$

Taking the limit $R \to \infty$ we see that

$$\Psi_\alpha(w) = \Psi_\beta(w), \quad w = r e^{-i\gamma}, \; r > \frac{a}{\eta}.$$

Theorem C.1.8 shows that $\Psi_\alpha$ and $\Psi_\beta$ coincide in the intersection of their domains of definition. Thus, the difference $\Psi_0(\varepsilon + it) - \Psi_\pi(-\varepsilon + it)$ does not change if we replace $\Psi_0$ and $\Psi_\pi$ by $\Psi_{\pi/2}$ if $t < -a$, and by $\Psi_{-\pi/2}$ if $t > a$. From this we infer that the difference tends to 0 as $\varepsilon \to 0$.

16 See Section C.1 for the definition of line integrals.

Theorem 3.4.5 (Rudin). Suppose $F \in E(2a)$, $F$ is even, nonnegative on the real axis, and

$$\int_{-\infty}^{\infty} F^2(u) \, du < \infty.$$

Then there are even functions $G_j, H_j \in E(a)$ such that

$$F(u) = \sum_{j=1}^\infty |G_j(u)|^2 + u^2 \sum_{j=1}^\infty |H_j(u)|^2, \quad u \in \mathbb{R}. \tag{1}$$

Proof. We may assume that $a = 1$. If $F(0) = 0$, then $F(w) = w^{2k} F_0(w)$ with some positive integer $k$ and an entire function $F_0$ having no zero at 0. It is easy to check that $F_0$ satisfies the conditions of the theorem. If we have shown that $F_0$ admits a decomposition analogous to (1), then we obtain the desired decomposition for $F$ by multiplying $G_j$ and $H_j$ by $w^k$ and changing the roles of $G_j$ and $H_j$ if $k$ is odd. Therefore we may assume, without loss of generality, that $F(0) = 1$.
Let $\pm\lambda_1, \pm\lambda_2, \dots$ be the zeros of $F$, counted with their multiplicities. Since $F \in E(2)$, Corollary C.1.16 shows that

$$n \le \frac{\log C + 2|\lambda_n|}{\log 2}$$

with some constant $C > 0$ and hence $|\lambda_n| > cn$, for some $c > 0$. Thus,

$$\sum_{n=1}^\infty \frac{1}{|\lambda_n|^2} < \infty.$$

Using this and the facts that $F$ is even and $F \in E(2)$, Hadamard's factorization Theorem C.1.19 shows that

$$F(z) = \prod_{n=1}^\infty \left( 1 - \frac{z^2}{\lambda_n^2} \right), \quad z \in \mathbb{C}. \tag{2}$$

Let $t_1 \le t_2 \le \cdots$ be those positive numbers for which $it_n$ is a zero of $F$ and write

$$A_m(z) = \prod_{n=1}^m \left( 1 + \frac{z^2}{t_n^2} \right), \quad Q_m = F/A_m, \quad Q = \lim_{m \to \infty} Q_m.$$

We show that $Q \in E(2)$. It is clear that $Q_m \in E(2)$. The Paley–Wiener Theorem 3.4.4 shows that the Fourier transform of the restriction of $Q_m$ to $\mathbb{R}$ vanishes outside $[-2, 2]$. By definition, $Q_m \to Q$ pointwise. Since $0 \le Q_m \le F$ on the real axis, Lebesgue's dominated convergence theorem shows that $Q_m \to Q$ in $L^2(\mathbb{R})$. Consequently, $\hat{Q}_m \to \hat{Q}$ in $L^2(\mathbb{R})$ in view of Theorem 1.8.9 and hence $\hat{Q}$ vanishes outside of $[-2, 2]$. Thus, $\hat{Q}$ is Lebesgue integrable. Taking the inverse Fourier transform of $\hat{Q}$ and using Lemma 3.4.2 we see that $Q \in E(2)$.
The function $Q$ is nonnegative on the real axis, it is even and has no pure imaginary zeros. Therefore, the real zeros of $Q$ have even multiplicity and the non-real zeros occur in conjugate pairs. Hence, we can write $Q$ as $Q = F_1 F_2$ where $F_1$ and $F_2$ are subproducts of the product in (2), $F_1$ and $F_2$ are even, entire and $F_2(w) = \overline{F_1(\bar{w})}$.
Next we show that $F_1 \in E(1)$ and $F_2 \in E(1)$. Since $|\lambda_n| > cn$, the subproducts of (2) are dominated by

$$\prod_{n=1}^\infty \left( 1 + \Big| \frac{z}{cn} \Big|^2 \right).$$

Using this and the product representation of the sine function from Remark C.1.20, we see that $F_1$ and $F_2$ are of exponential type. The equation $|F_1|^2 = Q$ holds on both the real and imaginary axes. On the real axis, the function $Q$ is the inverse Fourier transform of the integrable function $\hat{Q}$, hence it is bounded. The function $R$ defined by $R(w) := e^{iw} F_1(w)$ is obviously bounded on the real axis. Using that $Q \in E(2)$ we see that

$$|R(it)|^2 = e^{-2t} |F_1(it)|^2 = e^{-2t} Q(it) \le e^{-2t} \, C e^{2t}, \quad t > 0,$$

with some $C > 0$, i.e., $R$ is bounded on the upper half of the imaginary axis. Since $F_1$ is of exponential type, $R$ is of exponential type as well. The Phragmén–Lindelöf Theorem C.1.21 shows that $R$ is bounded in the first and second quadrant. Applying the same argument to the lower half plane we conclude that $F_1 \in E(1)$ and $F_2 \in E(1)$.
By the definition of $A_m$ and $Q$ we have

$$F(z) = Q(z) \prod_{n=1}^\infty \left( 1 + \frac{z^2}{t_n^2} \right) = F_1(z) F_2(z) \sum_{k=0}^\infty c_k^2 z^{2k}$$

with some nonnegative numbers $c_k$. If $u$ is real, then $F_1(u) F_2(u) = |F_1(u)|^2$ and therefore

$$F(u) = \sum_{k=0}^\infty |c_k u^k F_1(u)|^2 = \sum_{j=0}^\infty |c_{2j} u^{2j} F_1(u)|^2 + u^2 \sum_{j=0}^\infty |c_{2j+1} u^{2j} F_1(u)|^2, \quad u \in \mathbb{R}.$$

Thus, the functions $G_j(z) := c_{2j} z^{2j} F_1(z)$ and $H_j(z) := c_{2j+1} z^{2j} F_1(z)$ have the desired properties.

3.5 Further properties of Gaussian distributions


We now have all the necessary tools to prove further important properties of Gaussian
distributions.

Theorem 3.5.1. Let $P$ be a polynomial on $\mathbb{R}^d$ with complex coefficients such that $P(0) = 0$. The function $f = e^P$ is the characteristic function of a $d$-dimensional random vector $X$ if and only if

$$P(t) = i(m, t) - \frac{1}{2}(Ct, t), \quad t \in \mathbb{R}^d \tag{1}$$

with some $m \in \mathbb{R}^d$ and a $d \times d$ positive semidefinite real matrix $C = (c_{ij})$.

Proof. First we show that if $f = e^P$ is a characteristic function, then the degree of $P$ is at most two. Suppose first that $d = 1$. Since $P(0) = 0$ the constant term of $P$ is 0. The case $P \equiv 0$ being trivial, assume that

$$P(t) = \sum_{k=1}^n c_k t^k, \quad c_k \in \mathbb{C}, \; t \in \mathbb{R}$$

where $n \ge 1$ and $c_n \ne 0$. Since $|f(z)| = e^{\operatorname{Re} P(z)}$, $z \in \mathbb{C}$, the inequality (3.3.12.iv) implies that

$$\operatorname{Re} \sum_{k=1}^n c_k z^k \le \sum_{k=1}^n c_k (i \operatorname{Im} z)^k, \quad z \in \mathbb{C}.$$

Replacing here $z$ by $s \cdot z$, $s \in \mathbb{R}$, dividing both sides by $s^n$ and letting $s \to \infty$ we see that

$$\operatorname{Re}(c_n z^n) \le c_n (i \operatorname{Im} z)^n.$$

Putting all $n$-th roots $z_1, \dots, z_n$ of $|c_n|/c_n$ into this inequality, we obtain

$$|c_n| \le c_n (i \operatorname{Im} z_j)^n, \quad j = 1, \dots, n.$$

Taking modulus and dividing by $|c_n|$ shows that $|\operatorname{Im} z_j| \ge 1$. Since $|z_j| = 1$, we conclude that $|\operatorname{Im} z_j| = 1$ for all $j = 1, \dots, n$. This is only possible if $n = 1$ or $n = 2$. Thus, the degree of $P$ is at most 2. The case where $d$ is arbitrary follows now from the fact that $s \mapsto f(st)$ is a characteristic function of one variable for all $t \in \mathbb{R}^d$ (see, e.g., Remark 1.1.10) by using Lemma B.6.7.
Next we show that $P$ has the form (1). We already know that

$$P(t) = \sum_{j=1}^d a_j t_j + \sum_{j,k=1}^d b_{jk} t_j t_k$$

with some $a_j, b_{jk} \in \mathbb{C}$. We may obviously assume that $b_{jk} = b_{kj}$. Theorem 1.2.9 shows that all moments of $X$ exist. Hence, by Theorem 1.2.1,

$$\frac{\partial}{\partial t_j} f(0) = a_j = iE(X_j),$$

$$\frac{\partial^2}{\partial t_j \partial t_k} f(0) = a_j a_k + b_{jk} + b_{kj} = a_j a_k + 2b_{jk} = -E(X_j X_k).$$

Thus, $m := -i(a_1, \dots, a_d)$ is the expectation and $C := -2(b_{jk})_{j,k=1}^d$ is the covariance matrix of $X$. From this it follows that $m \in \mathbb{R}^d$ and that $C$ is real and positive semidefinite (see page 349).
Finally, suppose that $C$ is a positive semidefinite real matrix and $m \in \mathbb{R}^d$. Then $f$ is the characteristic function of the random vector $Y = BX + m$ where $X$ is standard Gaussian and $B$ is a symmetric real matrix such that $B^2 = C$ (see 1.10.1).
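
The degree argument in the proof of Theorem 3.5.1 rests on the fact that all $n$-th roots of a unimodular number can have $|\operatorname{Im} z_j| = 1$ only when $n \le 2$. The following sketch checks this numerically; the function name, the tolerance, and the parametrization of the unimodular number as $e^{i\theta}$ are illustrative assumptions.

```python
import cmath
import math

def roots_all_have_unit_imag(n, theta, tol=1e-9):
    """True if every n-th root z of exp(i*theta) satisfies |Im z| = 1 (up to tol)."""
    roots = [cmath.exp(1j * (theta + 2.0 * math.pi * j) / n) for j in range(n)]
    return all(abs(abs(z.imag) - 1.0) < tol for z in roots)

print(roots_all_have_unit_imag(1, math.pi / 2))  # the single root is i
print(roots_all_have_unit_imag(2, math.pi))      # the roots are i and -i
print(any(roots_all_have_unit_imag(3, 2.0 * math.pi * k / 360) for k in range(360)))
```

For $n \ge 3$ the roots are spread over more than two points of the unit circle, so they cannot all equal $\pm i$, which is why the scan over `theta` finds no admissible case.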
156 Chapter 3 Special properties

Theorem 3.5.2. Let $f$ be a characteristic function of a $d$-dimensional random vector $X$ and $P$ be a polynomial on $\mathbb{R}^d$. If $f = e^P$ in a neighborhood of zero, then $X$ is Gaussian.

Proof. For fixed $t \in \mathbb{R}^d$ consider the characteristic function $x \mapsto f(xt)$, $x \in \mathbb{R}$. By (3.3.6.ii), $f(xt) = e^{P(xt)}$ for all $x \in \mathbb{R}$ and hence $f = e^P$. We can now apply Theorem 3.5.1.

Remark 3.5.3. The function

$$f(t) = e^{-\alpha |t| - |t|^\alpha}, \quad t \in \mathbb{R}$$

is a characteristic function for all $\alpha \in [2, \infty)$ (see Figure 3.4). To see this we show that $f$ is convex on the interval $[0, \infty)$. The fact that $f$ is a characteristic function follows then from Pólya's Theorem 3.9.11. The second derivative of $f$ at $t > 0$ is

$$\big[ \alpha^2 (1 + t^{\alpha - 1})^2 - \alpha(\alpha - 1) t^{\alpha - 2} \big] \cdot f(t).$$

It suffices to prove that the expression in the brackets is positive, which is equivalent to

$$t + t^\alpha > \sqrt{1 - 1/\alpha} \cdot t^{\alpha/2}.$$

This inequality holds because $t \ge t^{\alpha/2}$ if $0 \le t \le 1$ and $t^\alpha \ge t^{\alpha/2}$ if $t \ge 1$.

[Figure 3.4. The function f from Remark 3.5.3 with α = 3.]
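
The convexity claim in Remark 3.5.3 can be spot-checked numerically. In the sketch below, `bracket` is the expression in square brackets from the remark, and `second_derivative_fd` is a central finite-difference approximation introduced only for comparison; the names, the grid and the step size are illustrative assumptions.

```python
import math

def f(t, alpha):
    """The function exp(-alpha*|t| - |t|^alpha) from Remark 3.5.3."""
    return math.exp(-alpha * abs(t) - abs(t) ** alpha)

def bracket(t, alpha):
    """alpha^2 (1 + t^(alpha-1))^2 - alpha (alpha-1) t^(alpha-2), for t > 0."""
    return alpha ** 2 * (1.0 + t ** (alpha - 1.0)) ** 2 - alpha * (alpha - 1.0) * t ** (alpha - 2.0)

def second_derivative_fd(t, alpha, step=1e-4):
    """Central finite-difference approximation of f'' at t > step."""
    return (f(t + step, alpha) - 2.0 * f(t, alpha) + f(t - step, alpha)) / step ** 2

alpha, t = 3.0, 0.5
print(bracket(t, alpha) * f(t, alpha), second_derivative_fd(t, alpha))
```

The two printed numbers should agree to several digits, and `bracket` stays positive on any grid of $t > 0$ for $\alpha \ge 2$.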

Theorem 3.5.4 (Lévy–Cramér). If $f_1$ and $f_2$ are characteristic functions on $\mathbb{R}$ such that for some $a \in \mathbb{R}$ and $b > 0$

$$f_1(t) f_2(t) = e^{iat - bt^2}, \quad t \in \mathbb{R}$$

then

$$f_j(t) = e^{ia_j t - b_j t^2}, \quad t \in \mathbb{R}$$

for some $a_j \in \mathbb{R}$ and $b_j \ge 0$, $j = 1, 2$.

Proof. Without loss of generality we may assume that $a = 0$ and $b = 1$. From Corollary 3.3.10 we infer that $f_1$ and $f_2$ are entire characteristic functions. The equation $e^{-z^2} = f_1(z) f_2(z)$ holds clearly for every $z \in \mathbb{C}$. It follows that $f_1$ has no zeros and therefore, by Lemma C.1.11, $f_1 = e^g$ for some entire function $g$ of the form

$$g(z) = \sum_{k=1}^\infty c_k z^k, \quad c_k, z \in \mathbb{C}.$$

We may assume that $c_1 = g'(0) = 0$, for otherwise we replace $f_1(z)$ by $e^{-c_1 z} f_1(z)$ and $f_2(z)$ by $e^{c_1 z} f_2(z)$.
Applying inequality (3.3.13.ii) to the functions $f_1(z) = e^{g(z)}$ and $f_2(z) = e^{-g(z) - z^2}$ we obtain

$$g(iy) \ge 0 \quad \text{and} \quad -g(iy) + y^2 \ge 0, \quad y \in \mathbb{R}.$$

Applying now the inequality on the left of (3.3.13.i) to $f_1$ and $f_2$, we see that

$$\operatorname{Re} g(x + iy) \le g(iy) \quad \text{and} \quad \operatorname{Re}\big( -g(x + iy) - (x + iy)^2 \big) \le -g(iy) + y^2$$

for all $x, y \in \mathbb{R}$. These four inequalities imply that $|\operatorname{Re} g(z)| \le |z|^2$, $z \in \mathbb{C}$. Using Corollary C.1.14 we obtain that $c_n = 0$ for each $n > 2$. Since $f_1$ is bounded on $\mathbb{R}$ and $f_1(-x) = \overline{f_1(x)}$, $x \in \mathbb{R}$, we must have $c_2 \le 0$, completing the proof.

Corollary 3.5.5. If $X$ and $Y$ are independent d-dimensional random vectors and
$X + Y$ is Gaussian, then $X$ and $Y$ are Gaussian as well.
Proof. Let $a \in \mathbb{R}^d$ be arbitrary. By Lemma 1.10.3 the random variable $(a, X + Y) =
(a, X) + (a, Y)$ is Gaussian. The previous theorem shows that $(a, X)$ and $(a, Y)$ are
Gaussian. The statement follows now from Theorem 1.10.4.

Theorem 3.5.6. Let $f_1, \dots, f_r$ be characteristic functions on $\mathbb{R}^d$ and let $\delta,
\alpha_1, \dots, \alpha_r$ be positive real numbers such that
\[ \prod_{k=1}^{r} |f_k(t)|^{\alpha_k} = e^{P(t)}, \quad t \in \mathbb{R}^d, \ \|t\| < \delta \]
holds for some real polynomial $P$. Then the distributions corresponding to the $f_k$'s
are Gaussian.

Proof. Suppose first that $d = 1$. By Theorem 3.3.14, the functions $f_k$ are entire. As
in the proofs of Theorems 3.3.14 and 1.2.13 we write
\[ g_k(t) = f_k(t) f_k(-t) = |f_k(t)|^2, \quad t \in \mathbb{R} \]
and $Q(t) = P(t) + P(-t)$. Then $g_k$ is an entire, nonnegative characteristic function.
Moreover,
\[ \prod_{k=1}^{r} g_k^{\alpha_k}(t) = e^{Q(t)}, \quad t \in \mathbb{R}. \tag{1} \]
We will prove the statement of the theorem for the functions $g_k$. The statement for $f_k$
follows then from Theorem 3.5.4.
By Lemma C.1.11 there exist entire functions $h_k$ such that $g_k = e^{h_k}$ and $h_k(0) = 0$.
These functions satisfy the equation
\[ \sum_{k=1}^{r} \alpha_k h_k(z) = Q(z), \quad z \in \mathbb{C}. \]
As $h_k$ is even, its first derivative at 0 is equal to 0. Inequality (3.3.13.ii) shows that
$h_k(it) \ge 0$, $t \in \mathbb{R}$. Consequently, $Q(it) \ge 0$ and $0 \le \alpha_k h_k(it) \le Q(it)$. From (3.3.13.i)
and Corollary C.1.14 we conclude that $h_k$ is a polynomial. That the distribution
corresponding to $g_k$ is Gaussian follows from Theorem 3.5.1.
Now let $d$ be arbitrary. For each $a \in \mathbb{R}^d \setminus \{0\}$ the univariate characteristic functions
$t \mapsto f_k(ta)$, $t \in \mathbb{R}$, satisfy the equation
\[ \prod_{k=1}^{r} |f_k(ta)|^{\alpha_k} = e^{P(ta)}, \quad t \in \mathbb{R}, \ |t| < \frac{\delta}{\|a\|}, \]
hence they correspond to Gaussian distributions. The statement of the theorem follows
now from Theorem 1.10.4.

Theorem 3.5.7 (Darmois–Skitovitch). Let $X = (X_1, \dots, X_d)$ be a random vector
where $X_1, \dots, X_d$ are independent and let $a, b \in \mathbb{R}^d$. If the random variables
\[ Y_a = (a, X) = \sum_{k=1}^{d} a_k X_k, \quad Y_b = (b, X) = \sum_{k=1}^{d} b_k X_k \]
are independent, then $X_k$ is Gaussian for all $k$ with $a_k b_k \ne 0$.^{17}

Proof. Since $Y_a$ and $Y_b$ are independent, Theorem 1.1.8 shows that
\[ f_{(Y_a, Y_b)}(t, s) = f_{Y_a}(t) \cdot f_{Y_b}(s), \quad t, s \in \mathbb{R}. \]

17 See also Theorem 1.10.10.



Using this and the independence of the $X_k$'s we obtain
\[ \prod_{k=1}^{d} f_k(a_k t + b_k s) = \prod_{k=1}^{d} f_k(a_k t) \prod_{k=1}^{d} f_k(b_k s), \quad t, s \in \mathbb{R}, \tag{1} \]
where $f_k$ denotes the characteristic function of $X_k$. We may assume that $a_k$ and $b_k$ are
not both zero. Changing the enumeration if necessary we also assume that $a_k b_k \ne 0$
if $1 \le k \le m$ and $a_k b_k = 0$ if $k > m$. If $k > m$ and $a_k \ne 0$, then the factor $f_k(a_k t)$
appears on both sides of (1). The same holds for $f_k(b_k s)$ if $k > m$ and $b_k \ne 0$. We
choose $\delta > 0$ such that none of the factors in (1) is equal to zero if $t, s \in (-\delta, \delta)$. Then
we have
\[ \prod_{k=1}^{m} f_k(a_k t + b_k s) = \prod_{k=1}^{m} f_k(a_k t) \prod_{k=1}^{m} f_k(b_k s), \quad t, s \in (-\delta, \delta). \tag{2} \]
We show that this equation implies that $X_k$ is Gaussian. By the same argument as in
the proof of Theorem 3.5.6, we may assume that all the factors in (2) are positive.
Setting $g_k = \log f_k$ we write (2) in the form
\[ \sum_{k=1}^{m} g_k(a_k t + b_k s) = \sum_{k=1}^{m} g_k(a_k t) + \sum_{k=1}^{m} g_k(b_k s), \quad t, s \in (-\delta, \delta). \]
If the numbers $b_k / a_k$ are all different, then Theorem C.9.5 shows that $g_k$ is a polynomial
in a neighborhood of 0 and hence, by Theorem 3.5.2, $X_k$ is Gaussian. Assume
that $b_1/a_1 = \dots = b_{n_1}/a_{n_1}$ for some $n_1$ and $b_1/a_1 \ne b_k/a_k$ if $k > n_1$. We then
write $Y_a$ and $Y_b$ in the form
\[ Y_a = a_1 \cdot \Bigl( X_1 + \frac{a_2}{a_1} X_2 + \dots + \frac{a_{n_1}}{a_1} X_{n_1} \Bigr) + \dots \]
and
\[ Y_b = b_1 \cdot \Bigl( X_1 + \frac{b_2}{b_1} X_2 + \dots + \frac{b_{n_1}}{b_1} X_{n_1} \Bigr) + \dots \]
The expressions in the brackets are equal. Applying the same arguments to the remaining
$X_j$'s, we obtain that the sums in the brackets are Gaussian. By Theorem 3.5.4 of
Lévy and Cramér the summands are Gaussian as well.
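The functional equation (1) of the proof can be made concrete in the simplest case $d = 2$, $a = (1, 1)$, $b = (1, -1)$ with $X_1, X_2$ identically distributed. The sketch below (the two test distributions and the point $(t, s)$ are chosen for illustration) evaluates the defect of equation (1) for a standard Gaussian, where it vanishes, and for a symmetric $\pm 1$ variable with characteristic function $\cos$, where it equals $\sin^2 t \, \sin^2 s$.

```python
import math

def defect(cf, t, s):
    """|f(t+s) f(t-s) - f(t)^2 f(s) f(-s)|: defect of equation (1) for X1, X2 i.i.d.
    with characteristic function cf, a = (1, 1) and b = (1, -1)."""
    return abs(cf(t + s) * cf(t - s) - cf(t) ** 2 * cf(s) * cf(-s))

def gauss_cf(x):
    """Characteristic function of the standard Gaussian distribution."""
    return math.exp(-x * x / 2)

def rademacher_cf(x):
    """Characteristic function of a variable taking the values +1 and -1 with probability 1/2."""
    return math.cos(x)

t, s = 0.5, 0.5
print(defect(gauss_cf, t, s))        # zero up to rounding: X1 + X2 and X1 - X2 are independent
print(defect(rademacher_cf, t, s))   # sin(t)^2 sin(s)^2 > 0: the sums are dependent
```

This only tests a necessary condition at one point; the theorem itself asserts that the identity for all small $t, s$ forces Gaussianity.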

The next theorem is a multivariate analogue of the previous one.18

18 We refer to [33] for further reading on results of this type and their applications.

Theorem 3.5.8. Let $X_1, \dots, X_r$ be independent random vectors in $\mathbb{R}^d$ and let
$A_k, B_k \in \mathbb{R}^{d \times d}$, $k = 1, \dots, r$, be nonsingular matrices. If the random vectors
\[ Y_A = \sum_{k=1}^{r} A_k X_k \quad \text{and} \quad Y_B = \sum_{k=1}^{r} B_k X_k \]
are independent, then $X_k$ is Gaussian for all $k$.

Proof. Proceeding in the same way as in the proof of Theorem 3.5.7, we obtain the
functional equation
\[ \sum_{k=1}^{r} g_k(A_k t + B_k s) = \sum_{k=1}^{r} g_k(A_k t) + \sum_{k=1}^{r} g_k(B_k s), \quad t, s \in B^{o}(\delta) \]
with some $\delta > 0$, where $g_k = \log f_k$ and $f_k$ is the characteristic function of $X_k$.
Theorem C.9.5 shows that both sums on the right-hand side are polynomials of $t$ and
$s$, respectively. In view of Theorem 3.5.2, the random vectors $Y_A$ and $Y_B$ are Gaussian.
Hence, by Theorem 3.5.4 of Lévy and Cramér, $A_k X_k$ is Gaussian as well. Since $A_k$
is nonsingular, we infer that $X_k$ is Gaussian.

3.6 Fourier transformation of radial measures and functions
Radial measures and functions have the advantage of a simple structure. In this section
we start exploiting radiality in more detail.

Definition 3.6.1. A function $g : \mathbb{R}^d \to \mathbb{C}$ is called radial if
\[ g(t) = g(Ot), \quad t \in \mathbb{R}^d \]
for every orthogonal matrix $O \in O(d)$. A measure $\mu \in M(\mathbb{R}^d)$ is called radial if
\[ \mu(B) = \mu(OB), \quad B \in \mathcal{B}(\mathbb{R}^d) \]
for all $O \in O(d)$.

Setting $O = -I$ where $I$ is the identity matrix, we see that radial positive definite
functions are real-valued and even.

Lemma 3.6.2. A complex-valued function $g$ on $\mathbb{R}^d$ is radial if and only if there exists
a complex-valued function $h$ on $[0, \infty)$ such that
\[ g(t) = h(\|t\|), \quad t \in \mathbb{R}^d. \]

Proof. Assume that $g$ is radial and define the function $h$ by
\[ h(r) := g(r e_1), \quad r \in [0, \infty) \]
where $e_1 = (1, 0, \dots, 0)$ is the first vector of the standard basis of $\mathbb{R}^d$. Let $t \in \mathbb{R}^d$ be
arbitrary and choose an orthogonal matrix $O_t \in O(d)$ such that $O_t t = \|t\| \cdot e_1$. We
then have
\[ g(t) = g(O_t t) = g(\|t\| \cdot e_1) = h(\|t\|). \]
To prove the converse statement assume that $g(t) = h(\|t\|)$ with some function $h$.
Then $g(Ot) = h(\|Ot\|) = h(\|t\|) = g(t)$, $t \in \mathbb{R}^d$, i.e., $g$ is radial.
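The orthogonal matrix $O_t$ used in the proof can be written down explicitly. One possible choice, sketched below, is a Householder reflection; this particular construction is an assumption of the example (the proof only needs existence), and the helper names are hypothetical.

```python
import math

def householder_to_e1(t):
    """Return an orthogonal matrix O (a Householder reflection, or the identity)
    with O t = ||t|| e_1."""
    d = len(t)
    norm = math.sqrt(sum(x * x for x in t))
    v = [t[0] - norm] + list(t[1:])            # v = t - ||t|| e_1
    vv = sum(x * x for x in v)
    identity = [[1.0 if j == k else 0.0 for k in range(d)] for j in range(d)]
    if vv < 1e-30:                             # t is already a nonnegative multiple of e_1
        return identity
    # O = I - 2 v v^T / (v, v)
    return [[identity[j][k] - 2.0 * v[j] * v[k] / vv for k in range(d)] for j in range(d)]

def apply(O, t):
    """Matrix-vector product O t."""
    return [sum(O[j][k] * t[k] for k in range(len(t))) for j in range(len(O))]

def g(t):
    """A radial test function, g(t) = exp(-||t||)."""
    return math.exp(-math.sqrt(sum(x * x for x in t)))

t = [1.0, -2.0, 2.0]                           # ||t|| = 3
O = householder_to_e1(t)
Ot = apply(O, t)
print(Ot)                                      # approximately (3, 0, 0)
print(g(t), g(Ot))                             # equal, since g is radial
```

With this $O_t$ the profile $h(r) = g(r e_1)$ from the proof is recovered as `g([r, 0.0, 0.0])`.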

Lemma 3.6.3. A measure $\mu \in M_b(\mathbb{R}^d)$ is radial if and only if the function $\hat{\mu}$ is
radial.

Proof. If $\mu$ is radial, then
\[ \int g(t) \, d\mu(t) = \int g(Ot) \, d\mu(t) \tag{1} \]
for all functions $g$ of the form
\[ g = \sum_{j=1}^{n} c_j 1_{B_j}, \quad n \in \mathbb{N}, \ c_j \in \mathbb{C}, \ B_j \in \mathcal{B}(\mathbb{R}^d) \]
and for all $O \in O(d)$. Since these functions span a dense linear subspace of $L^1(\mu)$, we
conclude that equation (1) holds for all continuous bounded functions $g$. In particular,
\[ \hat{\mu}(Ot) = \int e^{i(Ot, s)} \, d\mu(s) = \int e^{i(t, O^{-1} s)} \, d\mu(s) = \hat{\mu}(t). \]
The converse statement can be proved in the same way by using the fact that functions
of the form $t \mapsto e^{i(t, s)}$, $s \in \mathbb{R}^d$, span a dense linear subspace of $L^1(\mu)$.

Next we consider radiality in terms of random vectors.

Theorem 3.6.4. Let $X$ be a d-dimensional random vector. The following statements
are equivalent:
(i) $X$ and $OX$ have the same distribution for every orthogonal matrix $O \in O(d)$;
(ii) the distribution of $X$ is radial;
(iii) the characteristic function of $X$ is radial;
(iv) for any $a \in \mathbb{R}^d$ the random variables $(a, X)$ and $\|a\| X_1$ have the same
distribution.

Proof. The equivalence of (i) and (ii) is trivial while the equivalence of (ii) and (iii)
follows from Lemma 3.6.3.
Assume that (i) holds and let $O_a \in O(d)$ be such that $O_a a = \|a\| e_1$ where $e_1 =
(1, 0, \dots, 0) \in \mathbb{R}^d$. Then
\[ (a, O_a^{-1} X) = (O_a a, X) = \|a\| X_1 \]
so that $(a, X)$ and $\|a\| X_1$ have the same distribution.
Finally, assume that $(a, X)$ and $\|a\| X_1$ have the same distribution. We then have
\[ f_X(a) = E\bigl[ e^{i(a, X)} \bigr] = E\bigl[ e^{i\|a\| X_1} \bigr] = f_{X_1}(\|a\|) \]
i.e., $f_X$ is radial.

Lemma 3.6.5. Let $g \in L^2(\mathbb{R}^d)$ be a radial function. Then the function
\[ f(t) := g * \tilde{g}(t) = \int_{\mathbb{R}^d} g(t + y) \overline{g(y)} \, dy, \quad t \in \mathbb{R}^d \]
is radial, $f \in P^{c}(\mathbb{R}^d)$, and $f$ vanishes at infinity.

Proof. In view of Lemma 1.5.4 it remains to prove that $f$ is radial. Let $O \in O(d)$
be an orthogonal matrix. Using the fact that the Lebesgue measure is invariant under
orthogonal transformations we obtain
\[ f(Ot) = \int_{\mathbb{R}^d} g(Ot + y) \overline{g(y)} \, dy
        = \int_{\mathbb{R}^d} g(Ot + Oy) \overline{g(Oy)} \, dy
        = \int_{\mathbb{R}^d} g(t + y) \overline{g(y)} \, dy = f(t), \quad t \in \mathbb{R}^d \]
i.e., $f$ is radial.

Using Bessel functions, the Fourier transform of a radial function can be expressed
as a univariate integral.

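Theorem 3.6.6. The Fourier transform of a radial function $f \in L^1(\mathbb{R}^d)$ is given by
\[ \hat{f}(t) = \frac{(2\pi)^{d/2}}{\|t\|^{d/2 - 1}} \int_0^{\infty} r^{d/2} J_{d/2 - 1}(r\|t\|) f(r \cdot e) \, dr, \quad t \in \mathbb{R}^d \setminus \{0\}, \]
where $e \in \mathbb{R}^d$ is an arbitrary element of unit length.

Proof. Since $J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}} \cos x$ the equation above is trivial if $d = 1$. Suppose
that $d > 1$. By Lemma 3.6.3 the function $\hat{f}$ is radial. We may therefore assume that
$e = (1, 0, \dots, 0)$. Writing $h(r) := f(r \cdot e)$, $r \ge 0$, we have
\[ \hat{f}(t) = \hat{f}(\|t\| \cdot e) = \int_{\mathbb{R}^d} e^{i\|t\| x_1} h(\|x\|) \, dx
   = \int_{\mathbb{R}} e^{i\|t\| x_1} \int_{\mathbb{R}^{d-1}} h\Bigl( \sqrt{x_1^2 + \|y\|^2} \Bigr) \, dy \, dx_1. \]
Using Corollary B.7.7 we see that the integral over $\mathbb{R}^{d-1}$ is equal to
\[ \sigma_{d-2}(S^{d-2}) \int_0^{\infty} h\Bigl( \sqrt{x_1^2 + u^2} \Bigr) u^{d-2} \, du \]
where, in view of Proposition B.7.8,
\[ \sigma_{d-2}(S^{d-2}) = \frac{2 \pi^{(d-1)/2}}{\Gamma((d-1)/2)}. \]
Therefore, we have
\[ \hat{f}(t) = \sigma_{d-2}(S^{d-2}) \int_{-\infty}^{\infty} e^{i\|t\| x_1} \int_0^{\infty} h\Bigl( \sqrt{x_1^2 + u^2} \Bigr) u^{d-2} \, du \, dx_1. \]
Introducing polar coordinates $(r, \varphi)$ by $x_1 = r \cos\varphi$ and $u = r \sin\varphi$, it follows
\[ \hat{f}(t) = \sigma_{d-2}(S^{d-2}) \int_0^{\infty} h(r) r^{d-1} \int_0^{\pi} e^{i\|t\| r \cos\varphi} \sin^{d-2}\varphi \, d\varphi \, dr. \]
From Lemma C.5.3 we know that
\[ J_{\nu}(x) = \frac{1}{\sqrt{\pi} \, \Gamma\bigl(\nu + \frac12\bigr)} \Bigl( \frac{x}{2} \Bigr)^{\nu} \int_0^{\pi} e^{ix\cos\varphi} \sin^{2\nu}\varphi \, d\varphi, \quad x \ge 0, \ \nu \ge 0 \]
from which the assertion follows.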

Corollary 3.6.7. If $f \in L^1(\mathbb{R}^3)$ is radial, then
\[ \hat{f}(t) = \frac{4\pi}{\|t\|} \int_0^{\infty} r \sin(r\|t\|) f(r \cdot e) \, dr, \quad t \in \mathbb{R}^3 \setminus \{0\}. \]
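As a numerical illustration of Corollary 3.6.7, the sketch below evaluates the right-hand side for the radial profile $f(r \cdot e) = e^{-r}$ and compares it with $8\pi / (1 + \|t\|^2)^2$, the value given by Lemma 3.7.3 for $d = 3$. The Simpson quadrature, the truncation point `r_max` and the helper names are choices made for this example only.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def radial_ft_3d(profile, s, r_max=60.0, n=60000):
    """Right-hand side of Corollary 3.6.7 for f(x) = profile(||x||) at ||t|| = s,
    with the integral truncated at r_max."""
    integral = simpson(lambda r: r * math.sin(r * s) * profile(r), 0.0, r_max, n)
    return 4 * math.pi / s * integral

s = 1.5
numeric = radial_ft_3d(lambda r: math.exp(-r), s)
closed_form = 8 * math.pi / (1 + s * s) ** 2     # Lemma 3.7.3 with d = 3
print(numeric, closed_form)
```

The truncation is harmless here because the profile decays exponentially; slowly decaying profiles would need a more careful treatment of the oscillatory tail.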

If $f$ is a complex-valued function on $[0, \infty)$ such that
\[ F_d(t) := f(\|t\|), \quad t \in \mathbb{R}^d \]
is Lebesgue integrable, then the Fourier transform of $F_d$ is radial. We will use the
notation
\[ \hat{F}_d(t) = \mathcal{F}_d(f)(\|t\|), \quad t \in \mathbb{R}^d. \]
Next we prove a recursion formula for the Fourier transformation for radial functions.^{19}
19 See [54] and [26] for more information on recursion formulae.

Theorem 3.6.8. Let $f : [0, \infty) \to \mathbb{C}$ be such that the functions $t \mapsto f(\|t\|)$, $t \in
\mathbb{R}^{d+2}$, and $t \mapsto f(\|t\|)$, $t \in \mathbb{R}^d$, are in $L^1(\mathbb{R}^{d+2})$ and $L^1(\mathbb{R}^d)$, respectively. Then
we have
\[ \mathcal{F}_{d+2}(f)(r) = -\frac{1}{2\pi} \frac{1}{r} \frac{d}{dr} \mathcal{F}_d(f)(r), \quad r > 0. \tag{1} \]
Proof. In view of Theorem 3.6.6
\[ \mathcal{F}_d(f)(s) = (2\pi)^{d/2} \int_0^{\infty} \tilde{J}_{d/2 - 1}(rs) f(r) r^{d-1} \, dr, \quad s > 0 \tag{2} \]
where $\tilde{J}_{\nu}(x) := x^{-\nu} J_{\nu}(x)$. By the second equation in Proposition C.5.5 we have^{20}
\[ \frac{d}{dx} \tilde{J}_{\nu}(x) = -x \tilde{J}_{\nu + 1}(x), \tag{3} \]
while Lemma C.5.2 shows that
\[ |\tilde{J}_{\nu}(x)| \le C_{\nu}, \quad \nu \ge 0 \]
with some positive constant $C_{\nu}$. Using equation (3) and differentiating both sides of
equation (2) with respect to $s$ we obtain (1). The fact that interchanging differentiation
and integration is permissible follows from
\[ \int_0^{\infty} \Bigl| \frac{d}{ds} \tilde{J}_{d/2 - 1}(rs) \Bigr| \, |f(r)| r^{d-1} \, dr
   = s \int_0^{\infty} \bigl| \tilde{J}_{d/2}(rs) \bigr| \, |f(r)| r^{d+1} \, dr
   \le s C_{d/2} \int_0^{\infty} |f(r)| r^{d+1} \, dr < \infty. \]
That the last integral is finite is a consequence of Corollary B.7.7 since, by assumption,
the function $t \mapsto f(\|t\|)$, $t \in \mathbb{R}^{d+2}$, is in $L^1(\mathbb{R}^{d+2})$.

Using induction on $d$, we immediately obtain the following corollary:

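Corollary 3.6.9. Let $f : [0, \infty) \to \mathbb{C}$ be such that the functions $t \mapsto f(\|t\|)$, $t \in \mathbb{R}^n$
are in $L^1(\mathbb{R}^n)$ for all $1 \le n \le 2d + 2$. Then we have
\[ \mathcal{F}_{2d+1}(f)(r) = \frac{1}{(2\pi)^d} \sum_{j=1}^{d} \frac{(-1)^j (2d - j - 1)!}{2^{d-j} (d - j)! \, (j - 1)!} \, \frac{1}{r^{2d - j}} \Bigl( \frac{d}{dr} \Bigr)^{j} \mathcal{F}_1(f)(r) \]
and
\[ \mathcal{F}_{2d+2}(f)(r) = \frac{1}{(2\pi)^d} \sum_{j=1}^{d} \frac{(-1)^j (2d - j - 1)!}{2^{d-j} (d - j)! \, (j - 1)!} \, \frac{1}{r^{2d - j}} \Bigl( \frac{d}{dr} \Bigr)^{j} \mathcal{F}_2(f)(r). \]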

20 See also Remark C.5.6.



3.7 Radial characteristic functions


Radial positive definite functions have many applications in probability and statistics.
They occur for example as characteristic functions of radial distributions and as covari-
ance functions of isotropic random fields. We have already shown (cf. Lemma 1.3.4)
that the characteristic function of the radial density
\[ \varphi(x) = \frac{1}{(2\pi)^{d/2}} e^{-\frac12 \|x\|^2}, \quad x \in \mathbb{R}^d \]
is given by
\[ g(t) = e^{-\frac12 \|t\|^2}, \quad t \in \mathbb{R}^d. \]

We continue with further examples of radial characteristic functions.

Lemma 3.7.1. Let $U$ be a random vector that is uniformly distributed on the closed
ball $B^{c}(R) \subset \mathbb{R}^d$. Then its characteristic function $f_U$ (see Figure 3.5) is given by
\[ f_U(t) = \Gamma(d/2 + 1) \Bigl( \frac{2}{R\|t\|} \Bigr)^{d/2} J_{d/2}(R\|t\|), \quad t \in \mathbb{R}^d \setminus \{0\}. \]

Proof. The case $d = 1$ being simple, assume that $d > 1$. Theorem 3.6.6 and Corollary
C.5.5 show that
\[ f_U(t) = \frac{(2\pi)^{d/2}}{\lambda_d(B^{c}(R))} \|t\|^{1 - d/2} \int_0^{R} r^{d/2} J_{d/2 - 1}(r\|t\|) \, dr
          = \frac{(2\pi)^{d/2}}{\lambda_d(B^{c}(R))} \|t\|^{-d} \int_0^{R\|t\|} x^{d/2} J_{d/2 - 1}(x) \, dx
          = \frac{(2\pi)^{d/2}}{\lambda_d(B^{c}(R))} \|t\|^{-d} \Bigl[ x^{d/2} J_{d/2}(x) \Bigr]_{x=0}^{R\|t\|} \]
from which the lemma follows.
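For $d = 1$ and $d = 3$ the formula of Lemma 3.7.1 reduces to elementary functions, namely $\sin x / x$ and $3(\sin x - x \cos x)/x^3$ with $x = R\|t\|$. The sketch below checks this numerically; the Bessel function is evaluated by a truncated power series, which is an implementation choice of this example and only adequate for moderate arguments.

```python
import math

def bessel_j(nu, x, terms=60):
    """Truncated power series of J_nu(x) for x > 0 (adequate for moderate x)."""
    return sum((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1))
               * (x / 2) ** (2 * k + nu)
               for k in range(terms))

def cf_uniform_ball(d, R, norm_t):
    """f_U(t) from Lemma 3.7.1 as a function of ||t|| > 0."""
    x = R * norm_t
    return math.gamma(d / 2 + 1) * (2 / x) ** (d / 2) * bessel_j(d / 2, x)

x = 2.0
print(cf_uniform_ball(1, 1.0, x), math.sin(x) / x)
print(cf_uniform_ball(3, 1.0, x), 3 * (math.sin(x) - x * math.cos(x)) / x ** 3)
```

For large arguments the alternating series suffers from cancellation, so a library implementation of $J_\nu$ would be preferable there.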

Recall that a d-dimensional random vector $X$ is said to be uniformly distributed on the
d-dimensional sphere of radius $R$ if the distribution of $X/R$ is equal to the measure
$\omega_d$ introduced in Theorem B.7.5.

Lemma 3.7.2. Let $U$ be a random vector that is uniformly distributed on the d-dimensional
sphere of radius $R$. Then its characteristic function $f_U$ is given by
\[ f_U(t) = \Gamma(d/2) \Bigl( \frac{2}{R\|t\|} \Bigr)^{d/2 - 1} J_{d/2 - 1}(R\|t\|), \quad t \in \mathbb{R}^d \setminus \{0\}. \]
Proof. To simplify the notation we may assume that $R = 1$. By Lemma 3.6.3 the
function $f_U$ is radial and hence $f_U(t) = f_U(\|t\| e_1)$ where $e_1 = (1, 0, \dots, 0) \in \mathbb{R}^d$.

Figure 3.5. The characteristic function $f_U$ from Lemma 3.7.1 with $d = 2$ and $R = 1$ (shown
as a function of the norm).

Using this and equation (B.7.11.1) we obtain
\[ f_U(t) = \frac{1}{\sigma_{d-1}(S^{d-1})} \int_{S^{d-1}} e^{i(t, u)} \, d\sigma_{d-1}(u)
          = \frac{1}{\sigma_{d-1}(S^{d-1})} \int_{S^{d-1}} e^{i\|t\| u_1} \, d\sigma_{d-1}(u)
          = \frac{1}{\sigma_{d-1}(S^{d-1})} \int_{S^{d-2}} \int_{-1}^{1} (1 - s^2)^{(d-3)/2} e^{i\|t\| s} \, ds \, d\sigma_{d-2}(v). \]
The definition of $J_{d/2 - 1}$ (cf. Definition C.5.1 and Remark C.5.6) and Proposition
B.7.8 lead immediately to the stated formula.

Lemma 3.7.3. The function $f$ defined by
\[ f(t) = e^{-\|t\|}, \quad t \in \mathbb{R}^d, \ d \in \mathbb{N} \]
is a characteristic function and
\[ \hat{f}(t) = 2^{d} \pi^{(d-1)/2} \, \Gamma((d+1)/2) \cdot (1 + \|t\|^2)^{-(d+1)/2}, \quad t \in \mathbb{R}^d. \tag{1} \]

Proof. Theorem 3.6.6 shows that
\[ \hat{f}(t) = (2\pi)^{d/2} \|t\|^{1 - d/2} \int_0^{\infty} r^{d/2} J_{d/2 - 1}(r\|t\|) e^{-r} \, dr, \quad t \in \mathbb{R}^d \setminus \{0\}. \]
Using the formula
\[ J_{d/2 - 1}(x) = \Bigl( \frac{x}{2} \Bigr)^{d/2 - 1} \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + d/2)} \Bigl( \frac{x}{2} \Bigr)^{2k} \]
(cf. Theorem C.5.4) we obtain
\[ \hat{f}(t) = 2 \pi^{d/2} \sum_{k=0}^{\infty} \frac{(-1)^k \|t\|^{2k}}{2^{2k} \, k! \, \Gamma(k + d/2)} \int_0^{\infty} r^{2k + d - 1} e^{-r} \, dr \]
for all $t$ where the power series above is absolutely convergent. The definition of the
Gamma function and Legendre's duplication formula (cf. Proposition C.4.5) show that
the integral above is equal to
\[ \frac{1}{\sqrt{\pi}} \, 2^{2k + d - 1} \, \Gamma(k + d/2) \, \Gamma(k + (d+1)/2) \]
and hence
\[ \hat{f}(t) = 2^{d} \pi^{(d-1)/2} \sum_{k=0}^{\infty} \frac{(-1)^k \, \Gamma(k + (d+1)/2)}{k!} \cdot \|t\|^{2k}, \quad \|t\| < 1. \]
Using this and Proposition B.1.3 we see that equation (1) holds whenever $\|t\| < 1$
while Theorem 3.3.6 implies that it holds for all $t \in \mathbb{R}^d$.
The fact that $f$ is a characteristic function follows from $\hat{f} \ge 0$.

Corollary 3.7.4. Let $P$ be a polynomial with real coefficients such that $P(0) = 1$
and $P(z) \ne 0$ if $\operatorname{Re} z \ge 0$. Then
\[ t \mapsto 1 / P(\|t\|)^{b}, \quad t \in \mathbb{R}^d \]
is a characteristic function for all $b > 0$ and all $d \in \mathbb{N}$.
Proof. Since $P$ has no zero with nonnegative real part, it is the product of polynomials
of the form $Q(t) = 1 + pt + qt^2$ where $p, q \ge 0$. Thus, it suffices to show that
$t \mapsto 1 / Q(\|t\|)^{b}$ is positive definite. We have
\[ \Gamma(b) = \int_0^{\infty} x^{b-1} e^{-x} \, dx = s^{b} \int_0^{\infty} u^{b-1} e^{-su} \, du, \quad b, s > 0. \]
Setting $s = Q(\|t\|)$ we obtain
\[ \frac{1}{Q(\|t\|)^{b}} = \frac{1}{\Gamma(b)} \int_0^{\infty} u^{b-1} e^{-Q(\|t\|) u} \, du, \quad b > 0, \ t \in \mathbb{R}^d. \tag{1} \]
Using Lemma 3.7.3 and Lemma 1.3.4 we see that
\[ t \mapsto e^{-Q(\|t\|) u} = e^{-u} e^{-up\|t\|} e^{-uq\|t\|^2}, \quad t \in \mathbb{R}^d \]
is positive definite if $u \ge 0$. Therefore, the Corollary follows from Lemma 1.5.7.

Taking the inverse Fourier transform in equation (3.7.3.1), we see that the function
$t \mapsto (1 + \|t\|^2)^{-(d+1)/2}$ is the characteristic function of an absolutely continuous
distribution. The next theorem states more.

Theorem 3.7.5. The function
\[ f(t) = \frac{1}{(1 + \|t\|^2)^{b}}, \quad t \in \mathbb{R}^d \]
is a characteristic function for all $b > 0$. If $b > \frac{d}{2}$, then^{21}
\[ \hat{f}(x) = \frac{2^{1-b}}{\Gamma(b)} \|x\|^{b - d/2} K_{b - d/2}(\|x\|), \quad x \in \mathbb{R}^d \setminus \{0\}. \]

Proof. The first statement follows from Corollary 3.7.4.
If $b > d/2$, then $f$ is Lebesgue integrable (cf. Corollary B.7.7). By equation
(3.7.4.1) we have
\[ f(t) = \frac{1}{\Gamma(b)} \int_0^{\infty} u^{b-1} e^{-(1 + \|t\|^2) u} \, du. \]

Figure 3.6. The function $\hat{f}$ from Theorem 3.7.5 with $b = 3$ and $d = 4$ (shown as a function
of the norm).

21 Recall that $K_{\nu}$ denotes the modified Bessel function (cf. Definition C.5.7). The function $\hat{f}$ is shown
in Figure 3.6.

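Consequently,
\[ \hat{f}(x) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} f(t) e^{-i(t, x)} \, dt
   = (2\pi)^{-d/2} \frac{1}{\Gamma(b)} \int_{\mathbb{R}^d} \int_0^{\infty} u^{b-1} e^{-(1 + \|t\|^2) u} e^{-i(t, x)} \, du \, dt
   = (2\pi)^{-d/2} \frac{1}{\Gamma(b)} \int_0^{\infty} u^{b-1} e^{-u} \int_{\mathbb{R}^d} e^{-\|t\|^2 u} e^{-i(t, x)} \, dt \, du. \]
Using Lemma 1.3.4 we obtain
\[ \hat{f}(x) = \frac{1}{2^{d/2} \Gamma(b)} \int_0^{\infty} u^{b - d/2 - 1} e^{-u} e^{-\|x\|^2 / (4u)} \, du. \]
From Lemma C.5.8 we know that
\[ K_{\nu}(r) = \frac{1}{2 a^{\nu}} \int_0^{\infty} e^{-\frac{r}{2} \left( \frac{u}{a} + \frac{a}{u} \right)} u^{\nu - 1} \, du, \quad a, r > 0, \ \nu \in \mathbb{R}. \]
Setting $r = \|x\|$, $a = \|x\|/2$, and $\nu = b - d/2$ leads to
\[ K_{b - d/2}(\|x\|) = \frac12 \Bigl( \frac{\|x\|}{2} \Bigr)^{d/2 - b} \int_0^{\infty} e^{-u} e^{-\|x\|^2 / (4u)} u^{b - d/2 - 1} \, du
   = 2^{b-1} \Gamma(b) \|x\|^{d/2 - b} \hat{f}(x), \quad x \ne 0, \]
from which the second statement follows.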
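The integral representation obtained from Lemma 1.3.4 can be tested in the case $d = 1$, $b = 1$, where $K_{1/2}(r) = \sqrt{\pi/(2r)} \, e^{-r}$ and the formula of Theorem 3.7.5 reduces to $\hat{f}(x) = \sqrt{\pi/2} \, e^{-|x|}$. The sketch below evaluates the $u$-integral after the substitution $u = v^2$; the quadrature rule, the cut-off `v_max` and the function names are assumptions of this example.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def fhat_via_u_integral(x, b, d, v_max=8.0, n=8000):
    """(2^{d/2} Gamma(b))^{-1} * int_0^inf u^{b-d/2-1} e^{-u} e^{-x^2/(4u)} du for x != 0,
    computed after the substitution u = v^2 and truncated at v = v_max."""
    def integrand(v):
        if v == 0.0:
            return 0.0                      # the factor e^{-x^2/(4u)} vanishes as u -> 0
        u = v * v
        return 2 * v * u ** (b - d / 2 - 1) * math.exp(-u - x * x / (4 * u))
    return simpson(integrand, 0.0, v_max, n) / (2 ** (d / 2) * math.gamma(b))

x = 1.0
numeric = fhat_via_u_integral(x, b=1.0, d=1)
closed_form = math.sqrt(math.pi / 2) * math.exp(-abs(x))   # 2^{1-b}/Gamma(b) |x|^{1/2} K_{1/2}(|x|)
print(numeric, closed_form)
```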

Next we compute the self-convolution of the indicator function of a ball.

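Lemma 3.7.6 (Euclid’s hat). The function^{22}
\[ g_d := \frac{1}{\lambda_d(B^{c}(1/2))} \cdot 1_{B^{c}(1/2)} * 1_{B^{c}(1/2)} \]
is a radial characteristic function on $\mathbb{R}^d$ and we have
\[ g_d(t) = \begin{cases}
   \dfrac{d \, \Gamma(d/2)}{\sqrt{\pi} \, \Gamma((d+1)/2)} \displaystyle\int_{\|t\|}^{1} (1 - x^2)^{(d-1)/2} \, dx, & t \in B^{c}(1), \\[2ex]
   0, & t \in \mathbb{R}^d \setminus B^{c}(1).
   \end{cases} \]
The distribution corresponding to $g_d$ is absolutely continuous with density (see Figure 3.7)
\[ p_d(x) = \frac{\Gamma(d/2 + 1) \, J_{d/2}^2(\|x\|/2)}{\pi^{d/2} \|x\|^{d}}, \quad x \in \mathbb{R}^d \setminus \{0\}. \]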

22 Scale mixtures of Euclid’s hat have been investigated in [20].



Figure 3.7. The density $p_d$ from Lemma 3.7.6 with $d = 2$ (shown as a function of the norm).

Proof. It follows from Lemma 3.6.5 that $g_d$ is a radial characteristic function. To
prove the remaining statements we may assume that $d \ge 2$.^{23} We have
\[ q(t) := 1_{B^{c}(1/2)} * 1_{B^{c}(1/2)}(t) = \int_{\mathbb{R}^d} 1_{B^{c}(1/2)}(t + y) 1_{B^{c}(1/2)}(y) \, dy
        = \lambda_d\bigl( B^{c}(1/2) \cap (B^{c}(1/2) - t) \bigr). \]
If $\|t\| > 1$, then the measure above is zero. Assume that $\|t\| \le 1$ and let $e :=
(1, 0, \dots, 0)$. Since $q$ is radial, we have $q(t) = q(\|t\| \cdot e)$. Using the fact that
the set $B^{c}(1/2) \cap (B^{c}(1/2) + \|t\| \cdot e)$ is symmetric with respect to the hyperplane
$\{ y \in \mathbb{R}^d : y_1 = \|t\|/2 \}$ we obtain
\[ q(t) = 2 \lambda_d \bigl\{ y \in \mathbb{R}^d : \|t\|/2 \le y_1 \le 1/2, \ \|y\| \le 1/2 \bigr\}
        = 2 \int_{\|t\|/2}^{1/2} \lambda_{d-1}\Bigl( B^{c}\Bigl( \sqrt{\tfrac14 - s^2} \Bigr) \Bigr) \, ds. \]
By Proposition B.7.9
\[ q(t) = \frac{\pi^{(d-1)/2}}{2^{d-2} \, \Gamma((d+1)/2)} \int_{\|t\|/2}^{1/2} (1 - 4s^2)^{\frac{d-1}{2}} \, ds
        = \frac{\pi^{(d-1)/2}}{2^{d-1} \, \Gamma((d+1)/2)} \int_{\|t\|}^{1} (1 - x^2)^{\frac{d-1}{2}} \, dx \]
from which the representation of $g_d$ follows.
23 We already considered the case $d = 1$ in Remark 1.5.5.

From Lemma 3.7.1 we conclude that the Fourier transform of $1_{B^{c}(1/2)}$ is given by
\[ \hat{1}_{B^{c}(1/2)}(x) = \frac{J_{d/2}(\|x\|/2)}{(2\|x\|)^{d/2}}. \]
Thus, the last statement is an immediate consequence of Theorem 1.3.6 and equation
(1.8.2.iii).

Remark 3.7.7. Write the function $g_d$ in the form $g_d(t) = h_d(\|t\|)$. For $0 \le s \le 1$
we then have $h_1(s) = 1 - s$ and
\[ h_2(s) = \frac{2}{\pi} \bigl( \arccos s - s (1 - s^2)^{1/2} \bigr) \]
(see Figure 3.8).
Partial integration of the integral in Lemma 3.7.6 gives the recursion formula^{24}
\[ h_d(s) = h_{d-2}(s) - \frac{\Gamma(d/2)}{\sqrt{\pi} \, \Gamma((d+1)/2)} \, s (1 - s^2)_{+}^{(d-1)/2}, \quad s \in \mathbb{R}, \ d \ge 3. \]
The derivative
\[ h_d'(s) = -\frac{d \, \Gamma(d/2)}{\sqrt{\pi} \, \Gamma((d+1)/2)} (1 - s^2)_{+}^{(d-1)/2} \]
is nonpositive and nondecreasing, so that $h_d$ is convex and nonincreasing for all
$d \in \mathbb{N}$.^{25}
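The closed forms and the recursion of Remark 3.7.7 can be compared with a direct numerical evaluation of the integral in Lemma 3.7.6. The following sketch does this for $d = 1, \dots, 4$ at one sample point; the Simpson rule and the number of subintervals are choices of this example, and the accuracy for even $d$ is limited by the square-root behaviour of the integrand at $x = 1$.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def hat_profile(d, s, n=2000):
    """h_d(s) from Lemma 3.7.6, by numerical integration over [s, 1]."""
    if s >= 1.0:
        return 0.0
    const = d * math.gamma(d / 2) / (math.sqrt(math.pi) * math.gamma((d + 1) / 2))
    return const * simpson(lambda x: max(1 - x * x, 0.0) ** ((d - 1) / 2), s, 1.0, n)

def recursion_term(d, s):
    """Gamma(d/2) / (sqrt(pi) Gamma((d+1)/2)) * s * (1 - s^2)_+^{(d-1)/2}."""
    return (math.gamma(d / 2) / (math.sqrt(math.pi) * math.gamma((d + 1) / 2))
            * s * max(1 - s * s, 0.0) ** ((d - 1) / 2))

s = 0.3
h1_closed = 1 - s
h2_closed = 2 / math.pi * (math.acos(s) - s * math.sqrt(1 - s * s))
print(hat_profile(1, s), h1_closed)
print(hat_profile(2, s), h2_closed)
print(hat_profile(3, s), h1_closed - recursion_term(3, s))   # recursion with d = 3
print(hat_profile(4, s), h2_closed - recursion_term(4, s))   # recursion with d = 4
```

For $d = 3$ the recursion gives $h_3(s) = 1 - \frac32 s + \frac12 s^3$ on $[0, 1]$.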

Figure 3.8. The functions $h_2$, $h_3$ and $h_4$ from Remark 3.7.7.

24 See also the proof of Proposition C.1.10.


25 More information on positive definite functions related to Euclid’s hat can be found in [20].

Corollary 3.7.8. The function $\varphi_d$ defined by
\[ \varphi_d(t) = (1 - \|t\|)_{+}^{\lfloor \frac{d+2}{2} \rfloor}, \quad t \in \mathbb{R}^d \]
is a characteristic function.

Proof. If $d$ is even, then $\varphi_d(t) = \varphi_{d+1}(t, 0)$, $t \in \mathbb{R}^d$. Therefore, it suffices to
consider the case where $d = 2n + 1$, $n \in \mathbb{N}$. Lemma 3.7.6 shows that the function
\[ g_d(t) = c \, 1_{B^{c}(1)}(t) P_n(\|t\|), \quad t \in \mathbb{R}^d \]
is positive definite, where $c$ is some positive constant and
\[ P_n(s) = \int_s^{1} (1 - x^2)^{n} \, dx, \quad s \in \mathbb{R}. \]
By Proposition C.1.10
\[ P_n(s) = (1 - s)^{n+1} Q_n(s), \]
where $Q_n$ is a polynomial of degree $n$, having only zeros with negative real part. Using
the definition of $P_n$ we see that
\[ (1 - \|t\|)_{+}^{n+1} = \frac{1}{c \, Q_n(\|t\|)} \cdot g_d(t), \quad t \in \mathbb{R}^d. \]
Thus, the statement follows from Corollary 3.7.4.
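Positive definiteness of $\varphi_d$ means in particular that every matrix $(\varphi_d(x_j - x_k))_{j,k}$ built from points $x_1, \dots, x_n \in \mathbb{R}^d$ is positive semidefinite. The sketch below checks this for $d = 2$ on a small grid by attempting a Cholesky factorization; the grid of points and the hand-written factorization routine are illustrative choices, and a successful factorization of one matrix is of course only a consistency check, not a proof.

```python
import math

def askey(d, t):
    """phi_d(t) = (1 - ||t||)_+^{floor(d/2) + 1} from Corollary 3.7.8."""
    r = math.sqrt(sum(x * x for x in t))
    return max(1.0 - r, 0.0) ** (d // 2 + 1)

def cholesky(A):
    """Plain Cholesky factorization A = L L^T; raises ValueError if a pivot is not positive."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

# 16 grid points in R^2 with spacing 0.4 (illustrative)
points = [(0.4 * i, 0.4 * j) for i in range(4) for j in range(4)]
gram = [[askey(2, (p[0] - q[0], p[1] - q[1])) for q in points] for p in points]
L = cholesky(gram)                      # succeeds if the matrix is positive definite
print(min(L[j][j] for j in range(len(L))))
```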

3.8 Schoenberg’s theorems on radial characteristic functions

In this section we first characterize radial characteristic functions on $\mathbb{R}^d$ (Theorem
3.8.2). Then we investigate what happens if the dimension $d$ tends to infinity
(Theorem 3.8.5).
3.8.1. We will use the notation
\[ \Omega_d(r) = \Gamma(d/2) \Bigl( \frac{2}{r} \Bigr)^{d/2 - 1} J_{d/2 - 1}(r), \quad r > 0 \tag{1} \]
and $\Omega_d(0) = 1$ (see Figure 3.9). By Lemma 3.7.2, $t \mapsto \Omega_d(\|t\|)$ is the characteristic
function of a random vector that is uniformly distributed on $S^{d-1}$.

Theorem 3.8.2 (Schoenberg). A real-valued function $f$ on $\mathbb{R}^d$ is a radial characteristic
function if and only if $f(t) = \phi(\|t\|)$ where
\[ \phi(r) = \int_0^{\infty} \Omega_d(rs) \, d\mu(s) \tag{1} \]
for some probability measure $\mu$ on $[0, \infty)$.

Figure 3.9. The function $\Omega_4$ from equation (3.8.1.1).

Proof. Since $t \mapsto \Omega_d(\|t\| s)$ is a positive definite function on $\mathbb{R}^d$ for all $s$, the function
$f(t) := \phi(\|t\|)$, where $\phi$ is defined by (1), is positive definite as well. Thus, $f$ is a
characteristic function by Bochner's Theorem 1.7.3. The other direction follows from
the fact that $f$ is the characteristic function of the random vector $RX$ where $R$ is a
random variable with distribution $\mu$ and $X$ is a random vector uniformly distributed
on $S^{d-1}$ and independent of $R$ (cf. Theorem 1.1.6).

As the proof of the preceding theorem shows, we have the following characterization
of random vectors with a radial distribution.

Corollary 3.8.3. A d-dimensional random vector $Y$ has a radial distribution if and
only if there exist a nonnegative random variable $R$ and a random vector $X$ which is
uniformly distributed on $S^{d-1}$ such that $R$ and $X$ are independent and $RX$ has the
same distribution as $Y$.
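A concrete instance of the representation (3.8.2.1) is the standard Gaussian vector $Y$ in $\mathbb{R}^3$: here $\Omega_3(r) = \sin r / r$, and $R = \|Y\|$ has the density $\sqrt{2/\pi} \, s^2 e^{-s^2/2}$ on $[0, \infty)$. The sketch below verifies numerically that the mixture reproduces $e^{-r^2/2}$; the choice of this example, the quadrature rule and the cut-off `s_max` are assumptions made for illustration.

```python
import math

def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def omega3(r):
    """Omega_3(r) = sin(r)/r, the profile of the characteristic function of the uniform law on S^2."""
    return 1.0 if r == 0.0 else math.sin(r) / r

def chi3_density(s):
    """Density of ||Y|| for a standard Gaussian vector Y in R^3."""
    return math.sqrt(2 / math.pi) * s * s * math.exp(-s * s / 2)

def mixture(r, s_max=12.0, n=12000):
    """int_0^inf Omega_3(r s) dmu(s) with dmu(s) = chi3_density(s) ds, truncated at s_max."""
    return simpson(lambda s: omega3(r * s) * chi3_density(s), 0.0, s_max, n)

r = 1.3
print(mixture(r), math.exp(-r * r / 2))
```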
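The set of all characteristic functions $\phi$ on $\mathbb{R}$ such that $t \mapsto \phi(\|t\|)$ is a characteristic
function on $\mathbb{R}^d$ is denoted by $\Phi_d$. If $\phi \in \Phi_d$ for all $d$, then we write $\phi \in \Phi_{\infty}$.
Notice that
\[ \Phi_{\infty} \subset \dots \subset \Phi_2 \subset \Phi_1. \]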

Lemma 3.8.4. The relation
\[ \lim_{d \to \infty} \Omega_d(r\sqrt{d}) = e^{-r^2/2} \]
holds uniformly for $r \in \mathbb{R}$.

Proof. Using the definition of the Bessel function $J_{d/2 - 1}$ (see Definition C.5.1) it is
easy to see that $r \mapsto \Omega_d(r\sqrt{d})$, $r \in \mathbb{R}$, is the characteristic function corresponding
to the density
\[ p_d(x) = \frac{\Gamma(d/2)}{\sqrt{\pi d} \, \Gamma((d-1)/2)} \Bigl( 1 - \frac{x^2}{d} \Bigr)^{(d-3)/2} 1_{(-\sqrt{d}, \sqrt{d})}(x), \quad x \in \mathbb{R}. \]
In view of Stirling's formula (cf. Corollary C.4.10)
\[ \lim_{d \to \infty} \frac{\Gamma(d/2)}{\sqrt{d} \, \Gamma((d-1)/2)} = \frac{1}{\sqrt{2}}. \]
Using this and the relation
\[ \lim_{d \to \infty} \Bigl( 1 - \frac{x^2}{d} \Bigr)^{(d-3)/2} = e^{-x^2/2} \]
we see that
\[ \lim_{d \to \infty} p_d(x) = p_{\infty}(x) := \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \quad x \in \mathbb{R}. \]
The characteristic function corresponding to the density $p_{\infty}$ is $r \mapsto e^{-r^2/2}$. Therefore,
the lemma follows from Proposition 1.6.6.
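The convergence in Lemma 3.8.4 can be observed numerically. The sketch below evaluates $\Omega_d$ through the power series $\Omega_d(r) = \sum_{k \ge 0} (-1)^k (r/2)^{2k} \Gamma(d/2) / (k! \, \Gamma(k + d/2))$, which follows from the series of $J_{d/2-1}$; the number of terms and the sample dimensions are choices of this example, and the series is only used for moderate values of $r / \sqrt{d}$.

```python
import math

def omega(d, r, terms=200):
    """Omega_d(r) via its power series; consecutive terms differ by the
    factor -(r/2)^2 / ((k + 1)(k + d/2))."""
    q = (r / 2) ** 2
    term, total = 1.0, 1.0
    for k in range(terms):
        term *= -q / ((k + 1) * (k + d / 2))
        total += term
    return total

def gap(d, r):
    """|Omega_d(r sqrt(d)) - exp(-r^2/2)|."""
    return abs(omega(d, r * math.sqrt(d)) - math.exp(-r * r / 2))

for d in (10, 100, 1000):
    print(d, gap(d, 1.0))       # the gap shrinks roughly like 1/d
```

A short expansion of the series suggests $\Omega_d(r\sqrt{d}) \approx e^{-r^2/2} (1 - r^4/(4d))$ for large $d$, which matches the observed rate.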

Theorem 3.8.5 (Schoenberg). A characteristic function $\phi$ on $\mathbb{R}$ belongs to $\Phi_{\infty}$ if and
only if
\[ \phi(r) = \int_0^{\infty} e^{-r^2 s^2 / 2} \, d\mu(s), \quad r \in \mathbb{R} \]
for some probability measure $\mu$ on $[0, \infty)$.

Proof. Since $t \mapsto e^{-\|t\|^2 s^2 / 2}$ is a characteristic function on $\mathbb{R}^d$ for all $s$ and for all
$d$ (cf. Lemma 1.3.4), the same argument as in the proof of Theorem 3.8.2 shows that
$\phi \in \Phi_{\infty}$.
Next assume that $\phi \in \Phi_d$ for all $d$. Then for each $d$ there exists a probability
measure $\mu_d$ on $[0, \infty)$ such that
\[ \phi(r) = \int_0^{\infty} \Omega_d(rs\sqrt{d}) \, d\mu_d(s), \quad r \in \mathbb{R}. \tag{1} \]
In view of Theorem E.1.13, the sequence $\{\mu_d\}_{d=1}^{\infty}$ contains a subsequence $\{\mu_{d_k}\}_{k=1}^{\infty}$
converging vaguely to some finite nonnegative measure $\mu$ satisfying $\mu(\mathbb{R}) \le 1$.
Let $\varepsilon > 0$ and $r \in \mathbb{R} \setminus \{0\}$ be arbitrary. Equation (1) and Lemma 3.8.4 imply
\[ \phi(r) = \int_0^{\infty} e^{-r^2 s^2 / 2} \, d\mu_d(s) + \varepsilon_d(r) \]

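where $|\varepsilon_d| < \varepsilon$ for sufficiently large values of $d$. This implies that
\[ \phi(r) = \int_0^{a} e^{-r^2 s^2 / 2} \, d\mu_d(s) + \varepsilon_{d,a}(r) \]
where $|\varepsilon_{d,a}| < 2\varepsilon$ whenever both $a$ and $d$ are sufficiently large. If $a$ is a continuity
point of $\mu$, letting $d \to \infty$ we get
\[ \phi(r) = \int_0^{a} e^{-r^2 s^2 / 2} \, d\mu(s) + \delta_a(r) \]
where $|\delta_a| < 2\varepsilon$. Finally, letting $a \to \infty$ and then $\varepsilon \to 0$ we obtain the desired
representation of $\phi$.^{26}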

3.9 Convex and completely monotone functions


As we will see in this section, sufficient conditions for positive definiteness can be
formulated in terms of convexity. Convexity conditions can be checked efficiently by
taking derivatives and checking for nonnegativity. Using numerical algorithms this
leads to simple and powerful tests for positive definiteness. One of the main results of
the present section is Theorem 3.9.8 which describes the connection between radial
characteristic functions and completely monotone functions.

Definition 3.9.1. For $a \in \mathbb{R}$ and $n \in \mathbb{N}_0$ we denote by $C_n^{a}$ the set of all functions
$g : [a, \infty) \to \mathbb{R}$ having the following properties:
(i) $g$ is bounded;
(ii) $g$ is n-times differentiable;^{27}
(iii) $g^{(n)}$ is monotone.

Note that $C_0^{a}$ is the set of functions which are bounded and monotone.

Lemma 3.9.2. For all $n \in \mathbb{N}$ we have
\[ C_n^{a} \subset C_{n-1}^{a} \subset \dots \subset C_0^{a}. \]
Moreover, the derivatives $g^{(k)}$, $1 \le k \le n$, of a function $g \in C_n^{a}$ are alternately
nonnegative and decreasing or nonpositive and increasing.

Proof. Assume that $g \in C_n^{a}$ for some $n \ge 1$. Without loss of generality we may
further assume that $g^{(n)}$ is decreasing. We prove that $g^{(n)}$ is nonnegative and $g^{(n-1)}$
is nonpositive. The lemma follows then by induction on $n$.
26 See also the proof of Bernstein’s Theorem 3.9.6 for a similar argument.
27 We consider differentiability from the right at a.

Since $g^{(n)}$ is monotone the limit
\[ \delta_n := \lim_{x \to \infty} g^{(n)}(x) \]
exists. We show that $\delta_n = 0$. If $\delta_n > 0$, then there exists $A_n > a$ such that
$g^{(n)}(x) > \delta_n / 2$ whenever $x \ge A_n$. Thus, the function $g^{(n-1)}$ is increasing on the
interval $[A_n, \infty)$. Let $x \ge A_n$. By the mean value theorem there exists $\xi \in [x, x + 1]$
such that
\[ g^{(n-1)}(x + 1) = g^{(n-1)}(x) + g^{(n)}(\xi) > g^{(n-1)}(x) + \delta_n / 2. \]
This relation shows that $\lim_{x \to \infty} g^{(n-1)}(x) = \infty$. Repeating this argument we obtain
that $\lim_{x \to \infty} g(x) = \infty$, contradicting our assumption. In the same way we see that
$\delta_n < 0$ is not possible and hence $\delta_n = 0$. Using this and the fact that $g^{(n)}$ is decreasing,
we conclude that $g^{(n)}$ must be nonnegative on $[a, \infty)$. This implies that $g^{(n-1)}$
is increasing. The same argument as above shows that $\lim_{x \to \infty} g^{(n-1)}(x) = 0$ and
hence $g^{(n-1)}$ is nonpositive.

Lemma 3.9.3. If $n \in \mathbb{N}$ and $g \in C_n^{a}$, then $\lim_{x \to \infty} x^{k} g^{(k)}(x) = 0$ for all $k =
1, \dots, n$.

Proof. Lemma 3.9.2 shows that $g$ is monotone and hence the limit $\delta_0 :=
\lim_{x \to \infty} g(x)$ exists. Replacing $g$ by $g - \delta_0$ we may suppose that $\delta_0 = 0$. Without
loss of generality we may further assume that $g$ is decreasing. Then $g \ge 0$, $g' \le 0$
and $g'$ is increasing. We consider first the case $n = 1$. For $0 < c < 1$ we have
\[ g(cx) = -\int_{cx}^{\infty} g'(u) \, du \ge -\int_{cx}^{x} g'(u) \, du \ge -(1 - c) x g'(x) \ge 0 \tag{1} \]
whenever $cx \ge a$, showing that $\lim_{x \to \infty} x g'(x) = 0$.
Assume that the lemma is true for some $n$ and let $g \in C_{n+1}^{a}$. Since $C_{n+1}^{a} \subset C_n^{a}$
we have $g^{(n)} \in C_1^{a}$. Applying the relation (1) for $g^{(n)}$ instead of $g$ we see that
$\lim_{x \to \infty} x^{n+1} g^{(n+1)}(x) = 0$.

Theorem 3.9.4. Let $n \in \mathbb{N}_0$ and $g : [0, \infty) \to \mathbb{R}$ be an n-times differentiable
bounded function such that $g^{(n)}$ is convex or concave. Then^{28}
\[ \lim_{x \to \infty} x^{k} g^{(k)}(x) = 0, \quad k = 1, \dots, n + 1 \tag{1} \]

28 This result is essentially due to P. Lévy [40] though his formulation and proof is slightly different
from ours.

where $g^{(n+1)}$ denotes the right-hand derivative.^{29} Moreover, the function
\[ F(u) := \frac{(-1)^{n+1}}{(n+1)!} \int_u^{\infty} v^{n+1} \, dg^{(n+1)}(v), \quad u \in [0, \infty) \]
is monotone, bounded and
\[ g(x) = \int_0^{\infty} \Bigl( 1 - \frac{x}{u} \Bigr)_{+}^{n+1} dF(u) + \lim_{u \to \infty} g(u), \quad x \in [0, \infty). \]

Proof. It suffices to consider the case where $g^{(n)}$ is convex. The same argument as
in the proof of Lemma 3.9.3 shows that we may suppose $\lim_{u \to \infty} g(u) = 0$. By
Theorem B.4.4 there exists $a \ge 0$ such that $g^{(n)}$ is monotone on $[a, \infty)$. Consequently,
by Lemma 3.9.3, the relation (1) holds if $k \le n$.
Since $g^{(n)}$ is convex it is absolutely continuous and $g^{(n+1)}$ is increasing (cf. Theorem
B.4.7 and Theorem B.4.3). Consequently, we may replace $g$ by $g^{(n)}$ in relation
(3.9.3.1). This leads to the inequality
\[ g^{(n)}(cx) \ge -(1 - c) x g^{(n+1)}(x) \ge 0, \quad cx \ge a \]
from which (1) with $k = n + 1$ follows.
Using the relation (1) repeated integration by parts (cf. relation (B.5.6.1)) gives
\[ g(0) = -\int_0^{\infty} g'(v) \, dv = \dots = \frac{(-1)^{n}}{(n+1)!} \int_0^{\infty} v^{n+1} \, dg^{(n+1)}(v). \]
This shows that the integral on the right-hand side converges. Thus, the integral in
the definition of $F$ converges as well. It is clear that $F$ is monotone and bounded.
Integrating by parts again and using Theorem B.5.4 we obtain
\[ g(x) = -\int_x^{\infty} g'(u) \, du = \frac{(-1)^{n}}{(n+1)!} \int_x^{\infty} (u - x)^{n+1} \, dg^{(n+1)}(u)
        = \int_x^{\infty} (u - x)^{n+1} \frac{1}{u^{n+1}} \, dF(u)
        = \int_x^{\infty} \Bigl( 1 - \frac{x}{u} \Bigr)^{n+1} dF(u)
        = \int_0^{\infty} \Bigl( 1 - \frac{x}{u} \Bigr)_{+}^{n+1} dF(u). \]

Definition 3.9.5. An infinitely differentiable function $g : (0, \infty) \to \mathbb{R}$ is called
completely monotone if
\[ (-1)^{n} g^{(n)}(x) \ge 0 \]
for all $n \in \mathbb{N}_0$ and $x > 0$. A continuous function $g : [0, \infty) \to \mathbb{R}$ is called completely
monotone if $g|_{(0, \infty)}$ is completely monotone.

29 That the right-hand derivative exists follows from Theorem B.4.3.



The function $x \mapsto e^{-ux}$ is completely monotone for all $u \in [0, \infty)$. The next
theorem shows that this function is the prototype of completely monotone functions.

Theorem 3.9.6 (Hausdorff–Bernstein–Widder). An infinitely differentiable function
$g : [0, \infty) \to \mathbb{R}$ is completely monotone if and only if it admits the representation
\[ g(x) = \int_0^{\infty} e^{-ux} \, d\mu(u), \quad x \in [0, \infty) \]
where $\mu$ is a finite nonnegative measure on $[0, \infty)$.

Proof. It is clear that $g$ is completely monotone if it admits the representation above.
It follows from Theorem 3.9.4 that for all $n \in \mathbb{N}$ there exists a finite nonnegative
measure $\mu_n$ on $[0, \infty)$ such that
\[ g(x) = \int_0^{\infty} \Bigl( 1 - \frac{xv}{n} \Bigr)_{+}^{n} d\mu_n(v), \quad x \ge 0. \tag{1} \]
Since $g(0) = \mu_n([0, \infty))$, some subsequence $\{\mu_{n_k}\}$ converges weakly to a finite
nonnegative measure $\mu$ (cf. Theorem E.1.13 and Theorem E.1.12). Moreover,
\[ \lim_{n \to \infty} \Bigl( 1 - \frac{xv}{n} \Bigr)_{+}^{n} = e^{-xv} \]
where for fixed $x$ the convergence is uniform in $v$. Using this and taking the limit along
the subsequence $\{\mu_{n_k}\}$ in equation (1) we obtain the desired representation of $g$.^{30}

Corollary 3.9.7. An infinitely differentiable function $g : (0, \infty) \to \mathbb{R}$ is completely
monotone if and only if it admits the representation
\[ g(x) = \int_0^{\infty} e^{-ux} \, d\mu(u), \quad x \in (0, \infty) \]
where $\mu$ is a nonnegative measure on $[0, \infty)$.

Proof. It is easy to show that $g$ is completely monotone if it admits the representation
above (cf. Lemma C.7.4).
To prove the converse, assume that $g$ is completely monotone on $(0, \infty)$. Then for
each $a > 0$ the function $x \mapsto g(x + a)$ is completely monotone on $[0, \infty)$. Hence, by
Theorem 3.9.6, there is a finite nonnegative measure $\mu_a$ on $[0, \infty)$ such that
\[ g(x + a) = \int_0^{\infty} e^{-ux} \, d\mu_a(u), \quad x \in [0, \infty). \]

30 We used the same idea as in the proof of Schoenberg’s Theorem 3.8.5. Here the situation is simpler
since we have uniform convergence of the integrands.

We define the measure  by d.x/ WD eax da .x/. We then have


Z 1
g.x/ D eux d.u/ (1)
0
for all x > a. Using this and the uniqueness of the Laplace transform (cf. Theorem
C.7.5) we conclude that eax da .x/ D ebx db .x/ for all b > 0. Thus,  does not
depend on a and therefore equation (1) remains valid for all x > 0.
Combining the Hausdorff–Bernstein–Widder Theorem 3.9.6 and Schoenberg’s
Theorem 3.8.5 we obtain the following characterization.

Theorem 3.9.8. For a continuous function $g\colon [0,\infty) \to \mathbb{R}$ the following properties
are equivalent:
(i) $g(\|\cdot\|)$ is positive definite on every $\mathbb{R}^d$;
(ii) $g(\sqrt{\cdot}\,)$ is completely monotone on $[0,\infty)$;
(iii) $g$ admits the representation
$$g(r) = \int_0^\infty e^{-r^2 s}\,d\mu(s), \qquad r \in [0,\infty),$$
where $\mu$ is a finite nonnegative measure on $[0,\infty)$.
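The implication from (iii) to (i) can be probed numerically. The following sketch assumes a hypothetical discrete measure with atoms $(s_i, w_i)$, so that $g$ is a finite mixture of Gaussians, and evaluates the quadratic form $\sum_{i,j} g(\|x_i - x_j\|)c_ic_j$ for random points in several dimensions. Random sampling only gives evidence, not a proof, of positive definiteness.

```python
import math
import random

def g(r, atoms):
    """g(r) = sum_i w_i * exp(-r^2 * s_i), a discrete instance of (iii)."""
    return sum(w * math.exp(-r * r * s) for s, w in atoms)

def quadratic_form(points, coeffs, atoms):
    """sum_{i,j} g(||x_i - x_j||) * c_i * c_j for real coefficients c_i."""
    return sum(g(math.dist(p, q), atoms) * a * b
               for p, a in zip(points, coeffs)
               for q, b in zip(points, coeffs))

random.seed(0)
atoms = [(0.0, 0.2), (0.5, 1.0), (4.0, 0.3)]   # hypothetical (s_i, w_i), w_i >= 0
smallest = float("inf")
for d in (1, 2, 5):                             # the same g is tested in several dimensions
    for _ in range(50):
        points = [[random.uniform(-2, 2) for _ in range(d)] for _ in range(6)]
        coeffs = [random.uniform(-1, 1) for _ in range(6)]
        smallest = min(smallest, quadratic_form(points, coeffs, atoms))
print(smallest)  # expected to be nonnegative up to rounding
```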

Theorem 3.9.9 (Askey). Let $d \in \mathbb{N}$, $k := \lfloor d/2 \rfloor$ and let $g\colon [0,\infty) \to \mathbb{R}$ be a
continuous function such that
(i) $g(0) = 1$;
(ii) $\lim_{r\to\infty} g(r) = 0$;
(iii) $(-1)^k g^{(k)}$ is convex.
Then the function $f$ defined by
$$f(t) := g(\|t\|), \qquad t \in \mathbb{R}^d,$$
is a characteristic function in $\mathbb{R}^d$. Moreover, there exists a probability measure $\mu$ on
$(0,\infty)$ such that
$$f(t) = \int_0^\infty \varphi_d(rt)\,d\mu(r), \qquad t \in \mathbb{R}^d,$$
where $\varphi_d$ is the characteristic function defined in Corollary 3.7.8.³¹
Proof. The function $g$ is obviously bounded. Since $(-1)^k g^{(k)}$ is convex, the function
$(-1)^k g^{(k+1)}$ is increasing where $g^{(k+1)}$ denotes the right-hand derivative. The
statements of the theorem follow immediately from Theorem 3.9.4 by setting $d\mu(r) :=
d(F(1/r))$ and noting that the function $F$ is increasing.
31 Generalizations of Askey’s theorem as well as further references can be found in [22].

Definition 3.9.10. An even continuous function $f\colon \mathbb{R} \to [0,\infty)$ with $f(0) = 1$ is
said to be of Pólya-type if $f$ is convex on $(0,\infty)$ and $\lim_{t\to\infty} f(t) = 0$.

The case d D 1 of the previous theorem has been considered by G. Pólya in [43].

Theorem 3.9.11 (Pólya). Every function of the Pólya type is the characteristic func-
tion of an absolutely continuous distribution.
3.9.12. Below we give a more detailed statement. For its proof we will need the
following facts. Every function $f$ of the Pólya type admits the representation
$$f(t) = \int_0^\infty (1 - |t/y|)_+\,d\mu(y), \qquad t \in \mathbb{R},$$
where $d\mu(y) = d[1 - f(y) + y f'(y)]$ is a probability distribution on $(0,\infty)$ and $f'$
denotes the right-hand derivative. This follows from Theorem 3.9.4 but it can also be
proved directly using the relation
$$f(t) = -t \int_t^\infty \frac{d}{dy}\Big(\frac{f(y)}{y}\Big)\,dy = \int_t^\infty (1 - t/y)\,d[1 - f(y) + y f'(y)], \qquad t > 0,$$
where $\frac{d}{dy}$ denotes the right-hand derivative. The fact that the function
$$F(y) := 1 - f(y) + y f'(y) \tag{1}$$
is increasing is illustrated by Figure 3.10.

Figure 3.10. The function $F$ from equation (3.9.12.1).

For $y \in (0,\infty)$ and $x \in \mathbb{R}$ let
$$K(x,y) := \frac{2}{\pi}\,\frac{\sin^2(yx/2)}{y x^2} = \frac{y}{2\pi}\Big(\frac{\sin(yx/2)}{yx/2}\Big)^2$$
if $x \ne 0$ and $K(0,y) := y/(2\pi)$. By Remark 1.5.5 the function $x \mapsto K(x,y)$ is
a density for any fixed $y$ and the corresponding characteristic function is given by
$t \mapsto (1 - |t/y|)_+$. That is,
$$(1 - |t/y|)_+ = \int_{\mathbb{R}} K(x,y)\,e^{itx}\,dx. \tag{2}$$
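As a worked instance of the mixture representation, take the Pólya-type function $f(t) = e^{-|t|}$ (this choice is ours, for illustration). Then $F(y) = 1 - e^{-y} - ye^{-y}$, so the mixing distribution has density $F'(y) = ye^{-y}$ on $(0,\infty)$. The sketch below approximates the integral with a plain midpoint rule (the cutoff `upper` and the number of steps are arbitrary assumptions) and compares it with $e^{-|t|}$.

```python
import math

def polya_mixture(t, upper=40.0, steps=40000):
    """Midpoint-rule value of int_0^upper (1 - |t|/y)_+ * y * exp(-y) dy."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += max(0.0, 1.0 - abs(t) / y) * y * math.exp(-y) * h
    return total

for t in (0.0, 0.5, 1.0, 3.0):
    # the two printed numbers should agree to several decimal places
    print(t, polya_mixture(t), math.exp(-abs(t)))
```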

Theorem 3.9.13. Let $f$ be a Pólya-type function. Then we have
(i) $f$ is the characteristic function of an absolutely continuous distribution with
density
$$p(x) = \frac{2}{\pi}\int_0^\infty \frac{\sin^2(yx/2)}{y x^2}\,d\mu(y)$$
where $d\mu(y) = d[1 - f(y) + y f'(y)]$ is a probability distribution on $(0,\infty)$;
(ii) $p$ is finite and continuous on $\mathbb{R} \setminus \{0\}$;
(iii) $p$ is bounded if and only if
$$\int_0^\infty y\,d\mu(y) < \infty.$$

Proof. (i) We have seen in 3.9.12 that $\mu$ is a probability distribution. By
Fubini's theorem,
$$1 = \int_0^\infty 1\,d\mu(y) = \int_0^\infty\!\int_{-\infty}^\infty K(x,y)\,dx\,d\mu(y)
= \int_{-\infty}^\infty\!\int_0^\infty K(x,y)\,d\mu(y)\,dx
= \int_{\mathbb{R}\times(0,\infty)} K(x,y)\,d(\lambda\times\mu)(x,y)$$
showing that $K \in L^1(\lambda\times\mu)$. Moreover, the function $p$ defined by
$$p(x) := \int_0^\infty K(x,y)\,d\mu(y), \qquad x \in \mathbb{R},$$
is a probability density. Since $K \in L^1(\lambda\times\mu)$, applying Fubini's theorem and
(3.9.12.2) we obtain
$$\int_{-\infty}^\infty p(x)\,e^{itx}\,dx = \int_{-\infty}^\infty\!\int_0^\infty K(x,y)\,d\mu(y)\,e^{itx}\,dx
= \int_0^\infty\!\int_{-\infty}^\infty K(x,y)\,e^{itx}\,dx\,d\mu(y)
= \int_0^\infty (1 - |t/y|)_+\,d\mu(y) = f(t).$$
(ii) Using the inequalities
$$K(x,y) \le \frac{1}{2\pi},\quad 0 < y \le 1, \qquad\text{and}\qquad K(x,y) \le \frac{2}{\pi x^2},\quad y \ge 1,$$
we see that $p(x) < \infty$, $x \ne 0$. Moreover, in view of these inequalities, Lebesgue's
dominated convergence theorem shows that $p$ is continuous on $\mathbb{R} \setminus \{0\}$.
(iii) If $\int y\,d\mu(y) < \infty$ then, using the inequality
$$K(x,y) \le y/(2\pi) = K(0,y)$$
we conclude that $p(x) \le p(0) < \infty$ for every $x$.
Assume that $p$ is bounded. Then, by Fatou's lemma,
$$\infty > \liminf_{x\to 0} p(x) = \liminf_{x\to 0}\int_0^\infty K(x,y)\,d\mu(y)
\ge \int_0^\infty \liminf_{x\to 0} K(x,y)\,d\mu(y) = \frac{1}{2\pi}\int_0^\infty y\,d\mu(y).$$
This completes the proof.
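For the illustrative choice $f(t) = e^{-|t|}$ (ours, not the text's), the mixing distribution has density $ye^{-y}$, and the density formula in (i) can be evaluated in closed form: $p(x) = 1/(\pi(1+x^2))$, the Cauchy density. Here $\int_0^\infty y\,d\mu(y) = 2 < \infty$, so $p$ is bounded with $p(0) = 1/\pi$, as (iii) predicts. The sketch below checks this with a midpoint rule whose cutoff and step count are arbitrary assumptions.

```python
import math

def K(x, y):
    """Kernel K(x, y) from 3.9.12, with the value y/(2*pi) at x = 0."""
    if x == 0.0:
        return y / (2.0 * math.pi)
    return 2.0 * math.sin(y * x / 2.0) ** 2 / (math.pi * y * x * x)

def density(x, upper=40.0, steps=40000):
    """Midpoint-rule value of p(x) = int_0^upper K(x, y) * y * exp(-y) dy."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += K(x, y) * y * math.exp(-y) * h
    return total

def cauchy(x):
    """Standard Cauchy density, the expected closed form of p."""
    return 1.0 / (math.pi * (1.0 + x * x))

for x in (0.0, 0.5, 2.0, 5.0):
    print(x, density(x), cauchy(x))
```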

Remark 3.9.14. The fact that a function $f$ of Pólya-type is positive definite can also
be proved in the following way. For $m \in \mathbb{N}$ let the function $g_m\colon \mathbb{R} \to [0,\infty)$ be
defined by the relations (see Figure 3.11)
(i) $g_m(j/m) = f(j/m)$, $j = -m^2, \dots, m^2$;
(ii) $g_m$ is linear on the intervals $[j/m, (j+1)/m]$, $j = -m^2, \dots, m^2 - 1$;
(iii) $g_m(x) = f(m)$ if $|x| > m$.
It is not hard to see that $g_m$ can be written as
$$g_m = f(m) + \sum_{j=-m^2}^{m^2} p_j\,\Delta_{j/m}$$
where $\Delta_a(x) = (1 - |x/a|)_+$ and $p_j \ge 0$. This shows that $g_m$ is positive definite. On
the other hand, $g_m \to f$ uniformly on $\mathbb{R}$.
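The weights in this decomposition can be computed from the slope changes of the piecewise linear interpolant. The sketch below is one possible bookkeeping, not taken from the text: since $\Delta_{-a} = \Delta_a$, it collects everything into the terms with $j = 1, \dots, m^2$, and it uses $f(t) = e^{-|t|}$ and $m = 3$ as hypothetical test data.

```python
import math

def triangle(a, x):
    """Delta_a(x) = (1 - |x/a|)_+ for a > 0."""
    return max(0.0, 1.0 - abs(x / a))

def triangle_weights(f, m):
    """Weights p_1, ..., p_{m^2} with g_m = f(m) + sum_j p_j * Delta_{j/m}."""
    M = m * m
    knots = [j / m for j in range(M + 1)]
    # slopes[j] is the slope of g_m on [j/m, (j+1)/m]; beyond m the slope is 0
    slopes = [(f(knots[j + 1]) - f(knots[j])) * m for j in range(M)] + [0.0]
    # each triangle Delta_{j/m} changes the slope at j/m by p_j / (j/m)
    return [knots[j] * (slopes[j] - slopes[j - 1]) for j in range(1, M + 1)]

def g_m(f, m, x):
    """Evaluate the piecewise linear interpolant through its triangle expansion."""
    weights = triangle_weights(f, m)
    return f(m) + sum(p * triangle(j / m, x)
                      for j, p in enumerate(weights, start=1))

f = lambda t: math.exp(-abs(t))   # hypothetical Pólya-type function
m = 3
weights = triangle_weights(f, m)
print(min(weights))               # convexity of f should make all weights nonnegative
```

Evaluating `g_m(f, m, j / m)` at the knots should reproduce `f(j / m)`, and for `|x| > m` only the constant `f(m)` remains.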

Example 3.9.15. We have shown in Remark 3.5.3 that the function
$$f_\alpha(t) = e^{-\alpha|t| - |t|^\alpha}, \qquad t \in \mathbb{R},$$
is of Pólya-type for all $\alpha \in [2,\infty)$. It is easy to check that the functions
$$g_\alpha(t) = e^{-|t|^\alpha} \qquad\text{and}\qquad h_\alpha(t) = \frac{1}{1 + |t|^\alpha}, \qquad t \in \mathbb{R},$$
are of the Pólya type if $0 < \alpha \le 1$. Note that $g_\alpha$ and $h_\alpha$ are characteristic functions
also for $1 < \alpha \le 2$. To prove this we show that these functions are positive definite.
As to $g_\alpha$, we may assume that $0 < \alpha < 2$ and write
$$D_\alpha = \int_0^\infty \frac{1 - \cos s}{s^{1+\alpha}}\,ds.$$
The substitution $s = ty$ leads to the formula
$$-|t|^\alpha = C_\alpha \int_{-\infty}^\infty \frac{\cos ty - 1}{|y|^{1+\alpha}}\,dy, \qquad t \in \mathbb{R},$$
where $C_\alpha = 1/(2D_\alpha) > 0$. Replacing the integral above by $\int_{|y| \ge 1/m}$, $m \in \mathbb{N}$, we see
that $g_\alpha$ is the pointwise limit of functions of the form $q_m = e^{\varphi_m + c_m}$ where $\varphi_m$ is
positive definite and $c_m \in \mathbb{R}$. By Lemma 1.4.17 the function $q_m$ is positive definite.
The positive definiteness of $h_\alpha$ follows from the equation
$$h_\alpha(t) = \int_0^\infty e^{-y(1 + |t|^\alpha)}\,dy$$
and from the positive definiteness of $g_\alpha$.

Figure 3.11. The functions $f$ (continuous line) and $g_m$ from Remark 3.9.14.



3.10 Convolution roots with compact support

If $f$ is a characteristic function and $f = h * \tilde h$ with some square integrable function
$h$, then $h$ is called a convolution root of $f$. By Theorem 1.8.16, a characteristic
function has a convolution root if and only if its spectral measure is absolutely
continuous. In this section we assume that $f$ has compact support and investigate the
existence of compactly supported convolution roots. This problem is closely related
to factorization of entire functions (cf. Corollary 3.10.3). Applications of results
of this type range from probability theory, time series, and spatial statistics to optics,
crystallography, and signal processing.³²

We consider first the discrete one-dimensional case.

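Theorem 3.10.1. Let $N \in \mathbb{N}$ and $f \in P(\mathbb{Z})$ be such that $\operatorname{supp} f \subset [-2N, 2N]$. Then
there exists a complex-valued function $h$ on $\mathbb{Z}$ such that $\operatorname{supp} h \subset [-N, N]$ and
$$f(n) = h * \tilde h(n) = \sum_{k=-N}^{N} h(n+k)\,\overline{h(k)}, \qquad n \in \mathbb{Z}.$$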

Proof. Define the polynomial $p$ by
$$p(z) = \sum_{n=-2N}^{2N} f(n)\,z^n, \qquad z \in \mathbb{C}.$$
It follows from Theorem 1.9.6 that $p$ is nonnegative on $\mathbb{T}$. By Theorem B.1.4, $p$ can
be written in the form
$$p(z) = q(z)\,\overline{q(1/\bar z)}, \qquad z \in \mathbb{C} \setminus \{0\},$$
where
$$q(z) = \sum_{k=0}^{2N} b_k z^k, \qquad z, b_k \in \mathbb{C}.$$
Setting $b_k := 0$ if $k \in \mathbb{Z} \setminus [0, 2N]$ and equating the coefficients we obtain
$$f(n) = \sum_{k=0}^{2N} b_{n+k}\,\overline{b_k}, \qquad n \in \mathbb{Z}.$$
The function $h$ defined by $h(k) := b_{k+N}$ has the desired properties.
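The coefficient identity at the end of the proof is easy to check in the reverse direction. The sketch below does not perform the factorization of Theorem B.1.4; it starts from hypothetical coefficients $b_0, \dots, b_{2N}$, forms $f(n) = \sum_k b_{n+k}\overline{b_k}$, and confirms that $p(z) = |q(z)|^2$ on the unit circle and that $h(k) := b_{k+N}$ is supported in $[-N, N]$.

```python
import cmath
import math

def autocorrelation(b):
    """f(n) = sum_k b[n+k] * conj(b[k]) for |n| <= len(b) - 1 (b is zero elsewhere)."""
    M = len(b) - 1
    return {n: sum(b[n + k] * b[k].conjugate()
                   for k in range(M + 1) if 0 <= n + k <= M)
            for n in range(-M, M + 1)}

def evaluate(coeffs, z):
    """Evaluate sum_n coeffs[n] * z**n for a dict indexed by integer exponents."""
    return sum(c * z ** n for n, c in coeffs.items())

N = 2
b = [1 + 0j, 0.5 - 1j, -0.25 + 0j, 0.75j, 2 + 0.5j]   # hypothetical b_0, ..., b_{2N}
f = autocorrelation(b)                                  # supported in [-2N, 2N]
h = {k: b[k + N] for k in range(-N, N + 1)}             # h(k) := b_{k+N}
q = dict(enumerate(b))

for j in range(8):
    z = cmath.exp(1j * 2 * math.pi * j / 8)
    # p(z) and |q(z)|^2 should coincide on the unit circle
    print(evaluate(f, z).real, abs(evaluate(q, z)) ** 2)
```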

32 See [15] for more details and references.



Theorem 3.10.2 (Boas–Kac, Kreĭn). Let $f \in P^c(\mathbb{R})$ be such that the support of $f$ is
contained in $[-2r, 2r]$. Then there is a square integrable function $h$ vanishing outside
$[-r, r]$ such that
$$f(x) = h * \tilde h(x) = \int_{-r}^{r} h(x+y)\,\overline{h(y)}\,dy, \qquad x \in \mathbb{R}.$$

Proof. We may suppose that $r = \tfrac12$. For all $m \in \mathbb{N}$ the sequence $\{f(n/m)\}_{n\in\mathbb{Z}}$ is
positive definite and vanishes outside $\{-m, -m+1, \dots, m\}$. Hence, there exists a
sequence $b^{(m)} = \{b_n^{(m)}\}_{n=-\infty}^{\infty}$ vanishing outside $\{0, 1, \dots, m\}$ and satisfying
$$f(n/m) = \sum_{k=0}^{m} b_{n+k}^{(m)}\,\overline{b_k^{(m)}}, \qquad n \in \mathbb{Z}, \tag{1}$$
(see the proof of Theorem 3.10.1). For each $m$ let³³
$$h_m := \sqrt{m}\; b^{(m)} * 1_{[-1/(2m),\,1/(2m)]} \qquad\text{and}\qquad f_m := h_m * \tilde h_m.$$
It follows from (1) and from the definition of $h_m$ that $f_m$ takes the same values as $f$ at
the points $n/m$ and is linear in between. Since $f$ is uniformly continuous, we see that
$\lim_{m\to\infty} f_m(t) = f(t)$ uniformly on $\mathbb{R}$. As $\|h_m\|_2^2 = f_m(0) = f(0)$, there exists a
subsequence $\{h_{m_k}\}$ converging weakly in $L^2(\mathbb{R})$ to some square integrable function
$h$ (see Theorem D.5.6). If $g \in L^2(\mathbb{R})$ and $\operatorname{supp}(g) \subset \mathbb{R} \setminus [-1, 1]$ then
$$\int_{-\infty}^\infty g(x)\,\overline{h(x)}\,dx = \lim_{k\to\infty}\int_{-\infty}^\infty g(x)\,\overline{h_{m_k}(x)}\,dx = 0$$
and hence $\operatorname{supp}(h) \subset [-1, 1]$. Therefore, $h$ is integrable. Using that $h_{m_k} \xrightarrow{w} h$ and the
fact that the supports of $h$ and $h_{m_k}$ are contained in $[-2, 2]$ we see that
$$\lim_{k\to\infty} \hat h_{m_k}(t) = \hat h(t), \qquad t \in \mathbb{R}.$$
Consequently,
$$\hat f(t) = \lim_{k\to\infty} \hat f_{m_k}(t) = \lim_{k\to\infty} |\hat h_{m_k}(t)|^2 = |\hat h(t)|^2,$$
i.e., $f = h * \tilde h$.

Corollary 3.10.3. If $F$ is an entire function of finite exponential type $2r$ which is
nonnegative and integrable on $\mathbb{R}$, then there exists an entire function $H$ of exponential
type $r$ such that
$$F(z) = H(z)\,\overline{H(\bar z)}, \qquad z \in \mathbb{C}. \tag{1}$$

33 The first convolution below is defined by considering $b^{(m)}$ as a complex measure in $M_f(\mathbb{R})$.

Proof. Since $F$ is integrable and nonnegative on $\mathbb{R}$, the Fourier transform $f$ of $F|_{\mathbb{R}}$ is
a continuous positive definite function. The Paley–Wiener Theorem 3.4.4 shows that
$f$ vanishes outside $[-2r, 2r]$. By Theorem 3.10.2 there is a square integrable function
$h$ vanishing outside $[-r, r]$ such that $f = h * \tilde h$. Setting
$$H(z) := \int_{-r}^{r} h(t)\,e^{itz}\,dt, \qquad z \in \mathbb{C},$$
Lemma 3.4.2 shows that $H$ is an entire function of exponential type $r$. Moreover, it is
easy to check that equation (1) holds.
Recall that $B^o(r) \subset \mathbb{R}^d$ denotes the open ball with center 0 and radius $r \ge 0$.

Theorem 3.10.4 (Rudin). Let $f \in P^c(\mathbb{R}^d)$ be an infinitely differentiable radial
function such that the support of $f$ lies in $B^o(2r)$, $r > 0$. Then $f$ is the sum of a uniformly
convergent series³⁴
$$f = \sum_{k=1}^\infty f_k * \tilde f_k$$
where each $f_k$ is infinitely differentiable and vanishes outside $B^o(r)$.
Proof. Define the function $F$ by
$$F(w) = \int_{\mathbb{R}^d} e^{iwt_1} f(t)\,dt, \qquad w \in \mathbb{C}.$$
Since $f$ is even, $F$ is even as well. Note that $F$ can be written as
$$F(w) = \int_{-2r}^{2r} e^{iwt_1} g(t_1)\,dt_1 \tag{1}$$
where $g = f$ if $d = 1$ and
$$g(t_1) = \int_{\mathbb{R}^{d-1}} f(t_1, t_2, \dots, t_d)\,dt_2 \cdots dt_d, \qquad t_1 \in \mathbb{R},$$
if $d > 1$. In both cases $g$ is an infinitely differentiable positive definite function
vanishing outside $[-2r, 2r]$. By Theorem 3.2.8 the function $F|_{\mathbb{R}}$ is rapidly decreasing. Using
the representation (1), Lemma 3.4.2 and Theorem 1.5.9 show that $F$ is an entire
function of exponential type $2r$ which is nonnegative on the real axis. By Theorem 3.4.5
there are even functions $G_j, H_j \in E(r)$ such that
$$F(u) = \sum_{j=1}^\infty |G_j(u)|^2 + u^2 \sum_{j=1}^\infty |H_j(u)|^2, \qquad u \in \mathbb{R}. \tag{2}$$

34 The paper [14] contains an analogue of Rudin’s result where f is not supposed to be infinitely differ-
entiable and d  3. See also [15] where this analogue is stated without proof.
Since $|G_j(u)|^2 \le F(u)$ and $|H_j(u)|^2 \le F(u)$, the functions $G_j|_{\mathbb{R}}$ and $H_j|_{\mathbb{R}}$ are rapidly
decreasing. Thus, $s \mapsto G_j(\|s\|)$ and $s \mapsto H_j(\|s\|)$ are integrable on $\mathbb{R}^d$. We define
the function $g_j$ by
$$g_j(t) := \int_{\mathbb{R}^d} e^{i(t,s)}\,G_j(\|s\|)\,ds. \tag{3}$$
By Lemma 3.6.3 the function $g_j$ is radial while Theorem 3.2.8 shows that $g_j$ is rapidly
decreasing. Since $G_j$ is even and entire, for fixed $t_2, \dots, t_d \in \mathbb{R}$ there is an entire
function $Q_j = Q_j^{t_2,\dots,t_d}$ such that
$$Q_j(t_1) = G_j(\|(t_1, t_2, \dots, t_d)\|), \qquad t_1 \in \mathbb{R}.$$
Using that $G_j \in E(r)$ it is easy to check that $Q_j$ belongs to $E(r)$. By the Paley–
Wiener Theorem 3.4.4 the Fourier transform of $Q_j$ vanishes outside $[-r, r]$. Hence,
$$g_j(t_1, 0, \dots, 0) = \int_{\mathbb{R}^{d-1}}\int_{\mathbb{R}} e^{it_1 x}\,Q_j(x)\,dx\,dt_2 \cdots dt_d = 0$$
whenever $|t_1| \ge r$ (if $d = 1$, then the relation above does not contain the outer
integral). Since $g_j$ is radial we conclude that $g_j$ vanishes outside $B^o(r)$.
Next, we have
$$|u|^2\,|H_j(|u|)|^2 = \sum_{k=1}^{d} |u_k H_j(|u|)|^2.$$
Associate the function $h_j$ with $H_j$ as the function $g_j$ was associated with $G_j$ by
equation (3). Then $h_j$ is rapidly decreasing. Moreover, if $h_{kj}$ denotes the partial derivative
of $h_j$ with respect to $t_k$, then $(2\pi)^{d/2}\,t_k H_j(\|t\|)$ is the Fourier transform of $h_{kj}$.
From equation (2) we see that
$$\hat f(t) = \sum_{j=1}^\infty |\hat g_j(t)|^2 + \sum_{k=1}^{d}\sum_{j=1}^\infty |\hat h_{kj}(t)|^2$$
where the series converges in $L^1(\mathbb{R}^d)$. Taking inverse Fourier transforms we obtain
the uniformly convergent representation
$$f = \sum_{k=1}^\infty g_k * \tilde g_k + \sum_{k=1}^{d}\sum_{j=1}^\infty h_{kj} * \tilde h_{kj}$$
from which the theorem follows.

3.11 Infinitely divisible characteristic functions

In this section we study a special class of characteristic functions on $\mathbb{R}^d$ known as
infinitely divisible characteristic functions. This class plays an important role in the
study of decomposition of distributions, in the theory of processes with independent
increments (cf. Theorem 2.1.3), as well as in the study of limit theorems.

Definition 3.11.1. A characteristic function $f$ on $\mathbb{R}^d$ is called infinitely divisible if
for every positive integer $n$ there exists a characteristic function $f_n$ such that
$$f = (f_n)^n.$$
Equivalently, a probability measure $\mu$ is said to be infinitely divisible if for every
$n \in \mathbb{N}$ there exists a probability measure $\mu_n$ such that
$$\mu = \underbrace{\mu_n * \cdots * \mu_n}_{n \text{ times}}.$$
The characteristic functions
$$t \mapsto e^{i(a,t) - q\|t\|^2} \qquad\text{and}\qquad t \mapsto \frac{1}{(1 + \|t\|^2)^q}, \qquad t \in \mathbb{R}^d,$$
where $a \in \mathbb{R}^d$, $q \ge 0$, are easily seen to be infinitely divisible (cf. Theorem 3.7.5).
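For these two families the $n$-th roots can be written down explicitly: replacing $(a, q)$ by $(a/n, q/n)$, respectively $q$ by $q/n$, stays within the same family. The sketch below checks $f = (f_n)^n$ numerically in dimension $d = 1$; the function names and the sample parameters are ours.

```python
import cmath

def normal_type_cf(t, a, q):
    """t -> exp(i*a*t - q*t^2), the first example with d = 1."""
    return cmath.exp(1j * a * t - q * t * t)

def power_type_cf(t, q):
    """t -> (1 + t^2)^(-q), the second example with d = 1."""
    return (1.0 + t * t) ** (-q)

def nth_root_error(t, a, q, n):
    """Largest deviation of (f_n)^n from f at the point t for both examples."""
    err1 = abs(normal_type_cf(t, a / n, q / n) ** n - normal_type_cf(t, a, q))
    err2 = abs(power_type_cf(t, q / n) ** n - power_type_cf(t, q))
    return max(err1, err2)

for n in (2, 5, 50):
    print(n, max(nth_root_error(t, 0.7, 1.3, n) for t in (-2.0, 0.0, 0.5, 3.0)))
```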


The next lemma follows immediately from Definition 3.11.1. We omit the proof.

Lemma 3.11.2. If the characteristic functions $f$ and $g$ are infinitely divisible, then
so are $\bar f$, $|f|^2$ and $fg$.

The next theorem shows that characteristic functions having zeros cannot be
infinitely divisible.

Theorem 3.11.3. Let $f$ be an infinitely divisible characteristic function on $\mathbb{R}^d$. Then
$f$ has no zeros.

Proof. For $n \in \mathbb{N}$ let $f_n$ be a characteristic function such that $f = (f_n)^n$. The function
$|f|^{2/n} = |f_n|^2$ is a characteristic function for all $n$. Define $g$ by
$$g(t) := \lim_{n\to\infty} |f_n(t)|^2 = \lim_{n\to\infty} |f(t)|^{2/n}, \qquad t \in \mathbb{R}^d.$$
If $f(t) \ne 0$, then $g(t) = 1$, otherwise $g(t) = 0$. From this we conclude that $g$ is
equal to 1 in a neighborhood of 0. Since $g$ is positive definite, Corollary 1.5.2 and
Lemma 1.5.1 show that $g(t) = 1$ and hence $f(t) \ne 0$ for all $t \in \mathbb{R}^d$.
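The limit function $g$ in the proof is simply the indicator of the set where $f$ does not vanish. The sketch below evaluates $|f(t)|^{2/n}$ for the triangle function $f(t) = (1 - |t|)_+$, a Pólya-type characteristic function with zeros chosen here for illustration. Its candidate limit equals 1 on $(-1, 1)$ and 0 elsewhere, which is incompatible with the conclusion $g \equiv 1$, so this $f$ cannot be infinitely divisible.

```python
def triangle_cf(t):
    """(1 - |t|)_+, a characteristic function of Pólya type that has zeros."""
    return max(0.0, 1.0 - abs(t))

def g_approx(t, n):
    """|f(t)|^(2/n) for f = triangle_cf; tends to 1 where f(t) != 0."""
    return abs(triangle_cf(t)) ** (2.0 / n)

for t in (0.0, 0.5, 0.99, 1.0, 2.0):
    print(t, [g_approx(t, n) for n in (1, 100, 10 ** 6)])
```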

Remark 3.11.4. Let $g\colon \mathbb{R}^d \to \mathbb{C} \setminus \{0\}$ be a continuous function such that $g(0) > 0$.
By Theorem C.8.7 there exists a unique continuous function $\theta\colon \mathbb{R}^d \to \mathbb{R}$ such that
$\theta(0) = 0$ and
$$g(t) = |g(t)|\,e^{i\theta(t)}, \qquad t \in \mathbb{R}^d.$$
We define the functions $\ln g$ and $g^p$, $p \ge 0$, by the relations
$$\ln g(t) = \ln|g(t)| + i\theta(t),$$
$$g^p(t) = |g(t)|^p\,e^{ip\theta(t)}, \qquad t \in \mathbb{R}^d.$$

Theorem 3.11.5. Let $\{f_k\}_{k=1}^\infty$ be a sequence of infinitely divisible characteristic
functions converging pointwise on $\mathbb{R}^d$ to a continuous function $f$. Then $f$ is an
infinitely divisible characteristic function.

Proof. Since $f_k$ is infinitely divisible, $|f_k|^{2/n}$ is a characteristic function for all
positive integers $k$ and $n$. Lévy's continuity Theorem 1.6.3 and the relation
$$|f(t)|^{2/n} = \lim_{k\to\infty} |f_k(t)|^{2/n}, \qquad t \in \mathbb{R}^d,$$
show that $f$ and $|f|^{2/n}$, $n \in \mathbb{N}$, are characteristic functions. Thus, $|f|^2$ is an infinitely
divisible characteristic function and therefore it has no zeros. Using this we obtain
$$f^{1/n}(t) = e^{\frac1n \ln f(t)} = \lim_{k\to\infty} e^{\frac1n \ln f_k(t)} = \lim_{k\to\infty} f_k^{1/n}(t), \qquad t \in \mathbb{R}^d.$$
This relation shows that $f$ is infinitely divisible.

Corollary 3.11.6. If $f$ is an infinitely divisible characteristic function, then so are
$f^p$ and $|f|^p$ for all $p \ge 0$.

Proof. If p is rational, then the statement follows easily from Definition 3.11.1. Using
this, the general case is a consequence of Theorem 3.11.5.

In the next section we give an integral representation for the continuous logarithm
of infinitely divisible characteristic functions.

3.12 Conditionally positive definite functions


In the present section we generalize the concept of positive definiteness and show that
the new concept is closely related to infinite divisibility. We prove an integral
representation and give a useful sufficient condition for a radial function to be conditionally
positive definite.³⁵

35 Conditionally positive definite functions and their generalizations have many applications, see for
example [9, 58] and also the notes in Remark 3.12.16.
Definition 3.12.1. A complex-valued Hermitian function $f$ on $\mathbb{R}^d$ is called
conditionally positive definite if the inequality
$$\sum_{i,j=1}^{n} f(t_i - t_j)\,c_i\,\overline{c_j} \ge 0$$
holds for all $n \in \mathbb{N}$, $t_1, \dots, t_n \in \mathbb{R}^d$ and for all complex numbers $c_1, \dots, c_n$ such that
$c_1 + \cdots + c_n = 0$.

Note that the inequality above is satisfied if $f$ is constant. Therefore, we cannot
conclude from this inequality alone that $f$ is Hermitian.
Positive definite functions are obviously conditionally positive definite. Let $a \in \mathbb{R}^d$
be arbitrary and write
$$l(t) := i(t, a), \qquad t \in \mathbb{R}^d.$$
If $a \ne 0$, then $l$ is not bounded and hence it is not positive definite. The function $l$ is
Hermitian and we have
$$\sum_{i,j=1}^{n} l(t_i - t_j)\,c_i\,\overline{c_j} = \sum_{i,j=1}^{n} [l(t_i) - l(t_j)]\,c_i\,\overline{c_j}
= \sum_{i=1}^{n} l(t_i)\,c_i \sum_{j=1}^{n} \overline{c_j} - \sum_{j=1}^{n} l(t_j)\,\overline{c_j} \sum_{i=1}^{n} c_i = 0$$
whenever $\sum c_i = 0$. Thus, $l$ is conditionally positive definite.
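Definition 3.12.1 lends itself to a direct numerical spot check. The sketch below evaluates the quadratic form for zero-sum complex coefficients, first for $l(t) = i(t,a)$, where it should vanish, and then for $t \mapsto -\|t\|^2$, a function of the form $-(Ct,t)$ with $C$ the identity. The vector `a`, the random points and the helper names are assumptions made for illustration.

```python
import random

def quadratic_form(f, points, coeffs):
    """sum_{i,j} f(t_i - t_j) * c_i * conj(c_j) for points in R^d."""
    total = 0j
    for ti, ci in zip(points, coeffs):
        for tj, cj in zip(points, coeffs):
            diff = [x - y for x, y in zip(ti, tj)]
            total += f(diff) * ci * cj.conjugate()
    return total

def centered(coeffs):
    """Subtract the mean so that the coefficients sum to zero."""
    mean = sum(coeffs) / len(coeffs)
    return [c - mean for c in coeffs]

a = [1.0, -2.0, 0.5]                                   # hypothetical vector a in R^3
l = lambda t: 1j * sum(x * y for x, y in zip(t, a))    # l(t) = i (t, a)
neg_sq = lambda t: -sum(x * x for x in t)              # t -> -||t||^2

random.seed(1)
points = [[random.uniform(-3, 3) for _ in range(3)] for _ in range(7)]
coeffs = centered([complex(random.uniform(-1, 1), random.uniform(-1, 1))
                   for _ in range(7)])
print(abs(quadratic_form(l, points, coeffs)))       # close to 0
print(quadratic_form(neg_sq, points, coeffs).real)  # expected to be nonnegative
```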
The next lemma states elementary properties of conditionally positive definite
functions which follow immediately from Definition 3.12.1. We omit the proof.

Lemma 3.12.2. Let $f$ and $g$ be conditionally positive definite functions on $\mathbb{R}^d$. Then
so are the functions $\bar f$, $\operatorname{Re} f$ and $pf + qg$ for all $p, q \ge 0$. Moreover, the pointwise
limit of conditionally positive definite functions is conditionally positive definite as
well.
Note that the product of conditionally positive definite functions need not be
conditionally positive definite. A simple example is given by the product of the functions
$t \mapsto -1$ and $t \mapsto e^{it}$, $t \in \mathbb{R}$.

Lemma 3.12.3. If $f$ is conditionally positive definite, then the Hermitian matrix
$$A = \big(f(t_i - t_j)\big)_{i,j=1}^{n}$$
has at most one negative eigenvalue³⁶ for all finite systems $t_1, \dots, t_n$ of elements
of $\mathbb{R}^d$.
36 See also Remark 3.12.16.

Proof. The set
$$\{x \in \mathbb{C}^n : x_1 + \cdots + x_n = 0\}$$
is an $(n-1)$-dimensional nonnegative subspace of $A$ (cf. Definition D.2.5). We
conclude that $A$ has no negative subspace of dimension greater than 1. Hence, the
statement follows from Theorem D.2.6.

Theorem 3.12.4. Let $f\colon \mathbb{R}^d \to \mathbb{C}$ be a Hermitian function with $f(0) \le 0$. Then $f$
is conditionally positive definite if and only if the matrix
$$B = \big(f(t_i - t_j) - f(t_i) - \overline{f(t_j)}\big)_{i,j=1}^{n}$$
is positive semidefinite for all finite systems $t_1, \dots, t_n \in \mathbb{R}^d$.
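A concrete instance may help to see what the matrix $B$ looks like. For the hypothetical choice $f(t) = -|t|$ on $\mathbb{R}$ (conditionally positive definite, for example by Theorem 3.12.14 with $h(r) = -\sqrt{r}$) one finds $B_{ij} = |t_i| + |t_j| - |t_i - t_j|$, which equals $2\min(|t_i|, |t_j|)$ when $t_i$ and $t_j$ have the same sign and 0 otherwise. The sketch below checks this and evaluates the quadratic form of $B$ for unrestricted random complex coefficients.

```python
import random

def f(t):
    """f(t) = -|t| on R; Hermitian (real and even) with f(0) = 0."""
    return -abs(t)

def b_entry(ti, tj):
    """Entry f(t_i - t_j) - f(t_i) - conj(f(t_j)) of B (f is real here)."""
    return f(ti - tj) - f(ti) - f(tj)

def b_form(points, coeffs):
    """sum_{i,j} B_ij * c_i * conj(c_j); real because B is real symmetric."""
    return sum((b_entry(ti, tj) * ci * cj.conjugate()).real
               for ti, ci in zip(points, coeffs)
               for tj, cj in zip(points, coeffs))

random.seed(2)
smallest = float("inf")
for _ in range(100):
    points = [random.uniform(-5, 5) for _ in range(6)]
    coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(6)]
    smallest = min(smallest, b_form(points, coeffs))
print(b_entry(2.0, 3.0), b_entry(-1.0, 2.0), smallest)
```

No zero-sum constraint is imposed on the coefficients here; that is the point of passing from $f$ to $B$.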

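Proof. To prove the "if part" let $t_1, \dots, t_n \in \mathbb{R}^d$ be arbitrary and let $c_1, \dots, c_n \in \mathbb{C}$
be such that $\sum_{i=1}^{n} c_i = 0$. Then we have
$$\sum_{i,j=1}^{n} f(t_i - t_j)\,c_i\,\overline{c_j} = \sum_{i,j=1}^{n} \big(f(t_i - t_j) - f(t_i) - \overline{f(t_j)}\big)\,c_i\,\overline{c_j} \ge 0.$$
Thus, $f$ is conditionally positive definite.
Conversely, suppose that $f$ is conditionally positive definite. Let $t_1, \dots, t_n \in \mathbb{R}^d$
and $c_1, \dots, c_n \in \mathbb{C}$ be arbitrary and write $t_{n+1} := 0$, $c_{n+1} := -\sum_{i=1}^{n} c_i$. Then
$\sum_{i=1}^{n+1} c_i = 0$ and we have
$$\sum_{i,j=1}^{n} \big(f(t_i - t_j) - f(t_i) - \overline{f(t_j)}\big)\,c_i\,\overline{c_j}
= \sum_{i,j=1}^{n} f(t_i - t_j)\,c_i\,\overline{c_j} + \sum_{i=1}^{n} f(t_i)\,c_i\,\overline{c_{n+1}} + \sum_{j=1}^{n} \overline{f(t_j)}\,c_{n+1}\,\overline{c_j}$$
$$= \sum_{i,j=1}^{n+1} f(t_i - t_j)\,c_i\,\overline{c_j} - f(0)\,|c_{n+1}|^2 \ge 0.$$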

The next theorem establishes an important connection between infinitely divisible
characteristic functions and conditionally positive definite functions.

Theorem 3.12.5. A Hermitian function $f$ on $\mathbb{R}^d$ is conditionally positive definite if
and only if the function
$$t \mapsto e^{pf(t)}, \qquad t \in \mathbb{R}^d,$$
is positive definite for all $p > 0$.
Proof. Suppose that $e^{pf(\cdot)}$ is positive definite for all $p > 0$. Then $(e^{pf(\cdot)} - 1)/p$ is
conditionally positive definite for all $p > 0$. From the relation
$$f(t) = \lim_{p\to +0} \frac{e^{pf(t)} - 1}{p}, \qquad t \in \mathbb{R}^d,$$
we conclude that the function $f$ is conditionally positive definite.
To prove the converse statement, suppose that $f$ is conditionally positive definite.
Then so is $f - f(0)$ and therefore we may suppose that $f(0) = 0$. Since $pf$ is
conditionally positive definite if $p > 0$, it suffices to consider the case $p = 1$. If $(a_{ij})$
is a positive semidefinite matrix, then so is $(\exp(a_{ij}))$ (cf. Corollary D.2.13). Using
this, it follows from Theorem 3.12.4 that the matrix
$$\Big(\exp\big(f(t_i - t_j) - f(t_i) - \overline{f(t_j)}\big)\Big)_{i,j=1}^{n}$$
is positive semidefinite. We have
$$\sum_{i,j=1}^{n} \exp(f(t_i - t_j))\,c_i\,\overline{c_j}
= \sum_{i,j=1}^{n} \exp\big(f(t_i - t_j) - f(t_i) - \overline{f(t_j)}\big)\,\exp(f(t_i))\,\exp\big(\overline{f(t_j)}\big)\,c_i\,\overline{c_j}$$
$$= \sum_{i,j=1}^{n} \exp\big(f(t_i - t_j) - f(t_i) - \overline{f(t_j)}\big)\,d_i\,\overline{d_j} \ge 0$$
where $d_i = \exp(f(t_i))\,c_i$.
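Both directions of the theorem can be spot-checked for $f(t) = -|t|^\alpha$ on $\mathbb{R}$ with $0 < \alpha \le 2$, since $e^{pf}$ is then a rescaled copy of $g_\alpha$ from Example 3.9.15. The sketch below uses the hypothetical value $\alpha = 1.5$, tests the quadratic form of $e^{pf}$ for unrestricted random coefficients, and evaluates the difference quotient $(e^{pf(t)} - 1)/p$ for small $p$.

```python
import math
import random

ALPHA = 1.5   # hypothetical exponent in (0, 2]

def f(t):
    """f(t) = -|t|^ALPHA, a conditionally positive definite function on R."""
    return -abs(t) ** ALPHA

def exp_form(p, points, coeffs):
    """Real part of sum_{i,j} exp(p * f(t_i - t_j)) * c_i * conj(c_j)."""
    return sum((math.exp(p * f(ti - tj)) * ci * cj.conjugate()).real
               for ti, ci in zip(points, coeffs)
               for tj, cj in zip(points, coeffs))

def difference_quotient(p, t):
    """(exp(p * f(t)) - 1) / p, which tends to f(t) as p -> +0."""
    return math.expm1(p * f(t)) / p

random.seed(3)
points = [random.uniform(-4, 4) for _ in range(8)]
coeffs = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(8)]
print([exp_form(p, points, coeffs) for p in (0.1, 1.0, 10.0)])   # all nonnegative
print(difference_quotient(1e-6, 2.0), f(2.0))                    # nearly equal
```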

As a corollary we obtain that the function
$$t \mapsto a + i(x, t) - (Ct, t), \qquad t \in \mathbb{R}^d,$$
is conditionally positive definite for all $a \in \mathbb{R}$, $x \in \mathbb{R}^d$ and all $d \times d$ positive
semidefinite real matrices $C$.

Lemma 3.12.6. A Hermitian function $f$ on $\mathbb{R}^d$ is conditionally positive definite if
and only if the inequality
$$\mu * \tilde\mu * f(0) \ge 0$$
holds for all finitely supported complex measures $\mu$ on $\mathbb{R}^d$ such that $\hat\mu(0) = 0$.

Proof. If
$$\mu = \sum_{j=1}^{n} c_j\,\delta_{t_j}$$
then $\hat\mu(0) = c_1 + \cdots + c_n$. Using this the statement can be proved in the same way as
Lemma 1.4.3.

Lemma 3.12.7. Let $f$ be a conditionally positive definite function on $\mathbb{R}^d$ and let $\mu$
be a complex measure with finite support on $\mathbb{R}^d$. Then the function
$$g = \mu * \tilde\mu * f$$
is conditionally positive definite. If $\hat\mu(0) = 0$, then $g$ is positive definite.
In particular, the function
$$t \mapsto 2f(t) - f(t + y) - f(t - y), \qquad t \in \mathbb{R}^d,$$
is positive definite for all $y \in \mathbb{R}^d$.

Proof. Let $\nu$ be an arbitrary finitely supported complex measure on $\mathbb{R}^d$. The first two
statements of the lemma follow from the equation
$$\nu * \tilde\nu * g(0) = (\mu * \nu) * (\mu * \nu)^\sim * f(0)$$
using Lemma 3.12.6 and the fact that $(\mu * \nu)^\wedge(0) = \hat\mu(0)\,\hat\nu(0)$. Setting $\mu = \delta_0 - \delta_y$
we obtain the last statement.
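The last statement of the lemma has a familiar special case. For the conditionally positive definite function $f(t) = -|t|$ on $\mathbb{R}$ (our choice of example) and $y > 0$, the second difference $2f(t) - f(t+y) - f(t-y)$ equals $2y\,(1 - |t/y|)_+$, a positive multiple of the triangle function that also appears in the Pólya representation of Section 3.9. The sketch below verifies this identity on a grid.

```python
def second_difference(f, t, y):
    """2 f(t) - f(t + y) - f(t - y), as in Lemma 3.12.7."""
    return 2.0 * f(t) - f(t + y) - f(t - y)

def triangle(a, t):
    """(1 - |t/a|)_+ for a > 0."""
    return max(0.0, 1.0 - abs(t / a))

f = lambda t: -abs(t)   # hypothetical conditionally positive definite function on R

max_gap = 0.0
for y in (0.5, 1.0, 2.5):
    for k in range(-20, 21):
        t = 0.25 * k
        gap = abs(second_difference(f, t, y) - 2.0 * y * triangle(y, t))
        max_gap = max(max_gap, gap)
print(max_gap)  # should be at rounding level
```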

The previous simple lemma has several useful corollaries.

Corollary 3.12.8. Let $t \mapsto f(t)$, $t \in \mathbb{R}^d$, be a conditionally positive definite
function. If $f$ is twice continuously differentiable, then the function
$$-\frac{\partial^2}{\partial t_j^2} f$$
is positive definite for all $j = 1, \dots, d$.

Proof. Let $e_1, \dots, e_d$ be the standard orthonormal basis in $\mathbb{R}^d$. Then we have
$$-\frac{\partial^2}{\partial t_j^2} f(t) = \lim_{\varepsilon\to 0} \frac{2f(t) - f(t + \varepsilon e_j) - f(t - \varepsilon e_j)}{\varepsilon^2}, \qquad t \in \mathbb{R}^d.$$
Hence, the statement follows from Lemma 3.12.7.

Corollary 3.12.9. Let $h\colon [0,\infty) \to \mathbb{R}$ be a continuous function which is twice
continuously differentiable on $(0,\infty)$. If $h(\|\cdot\|^2)$ is conditionally positive definite on
$\mathbb{R}^{d+1}$ for some $d \in \mathbb{N}$, then the function $-h'(\|\cdot\|^2)$ is positive definite on $\mathbb{R}^d$.

Proof. For $\varepsilon \in \mathbb{R}$ we consider the function
$$t \mapsto 2h(\|t\|^2) - h(\|t + \varepsilon e_{d+1}\|^2) - h(\|t - \varepsilon e_{d+1}\|^2), \qquad t \in \mathbb{R}^{d+1},$$
where $e_1, \dots, e_{d+1}$ is the standard basis of $\mathbb{R}^{d+1}$. Lemma 3.12.7 shows that this
function is positive definite. Its restriction to $\mathbb{R}^d$ is given by
$$s \mapsto 2\,[h(\|s\|^2) - h(\|s\|^2 + \varepsilon^2)], \qquad s \in \mathbb{R}^d.$$
Dividing by $\varepsilon^2$ and letting $\varepsilon$ tend to zero we obtain the statement of the corollary.

Corollary 3.12.10. Let $f$ be a conditionally positive definite function and let $\mu, \nu \in
M_f(\mathbb{R}^d)$ be such that $\hat\mu(0) = \hat\nu(0) = 0$. Then there exists a complex measure $\sigma$ such
that $\mu * \nu * f = \check\sigma$.

Proof. The corollary follows from the identity
$$\mu * \tilde\nu = \frac14 \sum_{l=0}^{3} i^l\,(\mu + i^l\nu) * (\mu + i^l\nu)^\sim$$
using Lemma 3.12.7 and the fact that $(\mu + i^l\nu)^\wedge(0) = 0$.

Lemma 3.12.11. For every continuous conditionally positive definite function $f$
there exists a finite radial measure $\mu_0$ such that
(i) the convolution $f * \mu_0$ exists and $f * \mu_0$ is a continuous positive definite
function;
(ii) $\hat\mu_0(t) > 0$, $t \in \mathbb{R}^d \setminus \{0\}$.

Proof. In view of Lemma 3.2.5 for every $n \in \mathbb{N}$ there exists a radial probability
measure $\nu_n$ with compact support satisfying
$$0 \le \hat\nu_n(t) \le 1/2, \qquad \|t\| \ge 1/n.$$
Setting $\mu_n := \delta_0 - \nu_n$ we have $\hat\mu_n(0) = 0$ and
$$\hat\mu_n(t) \ge \frac12, \qquad \|t\| \ge 1/n. \tag{1}$$
By Lemma 3.12.7, the function $\mu_n * \tilde\mu_n * f$ is positive definite. Consequently, writing
$r_n := \mu_n * \tilde\mu_n * f(0)$, we have
$$|\mu_n * \tilde\mu_n * f(t)| \le r_n, \qquad t \in \mathbb{R}^d. \tag{2}$$
Since $\|\mu_n\| \le 2$, the equation
$$\mu_0 = \sum_{n=1}^\infty \frac{1}{2^n \max(1, r_n)}\,\mu_n * \tilde\mu_n \tag{3}$$
defines a finite radial measure. Using inequality (2) we see that the convolution $f * \mu_0$
exists and $f * \mu_0 \in P^c(\mathbb{R}^d)$.
Property (ii) follows from inequality (1).
Theorem 3.12.12 (Lévy–Khinchin formula). Let $f$ be a continuous Hermitian
function on $\mathbb{R}^d$ and let $h\colon \mathbb{R}^d \to \mathbb{R}^d$ be a bounded continuous function such that $h(t) = t$
in some neighborhood of zero and $h(-t) = -h(t)$, $t \in \mathbb{R}^d$. The function $f$ is
conditionally positive definite if and only if
$$f(t) = a + i(t, x) - (Ct, t) + \int_{\mathbb{R}^d} \big(e^{i(t,y)} - 1 - i(t, h(y))\big)\,d\tau(y), \qquad t \in \mathbb{R}^d, \tag{1}$$
where $a \in \mathbb{R}$, $x \in \mathbb{R}^d$, $C$ is a $d \times d$ positive semidefinite real matrix and $\tau$ is a
nonnegative measure on $\mathbb{R}^d$ such that $\tau(\{0\}) = 0$, $\tau(\mathbb{R}^d \setminus U) < \infty$ and
$$\int_U \|y\|^2\,d\tau(y) < \infty \tag{2}$$
for every neighborhood $U$ of zero. The representation (1) is unique.

Proof. Assume first that $f$ admits the integral representation (1). That $f$ is
conditionally positive definite follows immediately from the fact that the functions
$$t \mapsto a + i(t, x) - (Ct, t) \qquad\text{and}\qquad t \mapsto e^{i(t,y)} - 1 - i(t, h(y))$$
are conditionally positive definite.
Next suppose that $f$ is conditionally positive definite and let $\mu_0$ be the measure
from Lemma 3.12.11. Since $\mu_0 * f$ is positive definite it can be represented as
$$\mu_0 * f = \check\sigma_0$$
with some nonnegative measure $\sigma_0$. We define the measure $\tau$ by $\tau(\{0\}) := 0$ and
$$d\tau(y) := \frac{1}{\hat\mu_0(y)}\,d\sigma_0(y), \qquad y \ne 0.$$
Then $\tau$ is nonnegative and (3.12.11.ii) implies that
$$\tau(\mathbb{R}^d \setminus U) < \infty$$
for every neighborhood $U$ of zero. By Corollary 3.12.10, for $\mu, \nu \in M_f(\mathbb{R}^d)$ with
$\hat\mu(0) = \hat\nu(0) = 0$ there is a complex measure $\sigma$ such that
$$\mu * \nu * f = \check\sigma.$$
It follows from $\mu_0 * (\mu * \nu * f) = \mu * \nu * (\mu_0 * f)$ that
$$\hat\mu_0(y)\,d\sigma(y) = (\mu * \nu)^\wedge(y)\,d\sigma_0(y).$$
Using this relation and the definition of $\tau$ we obtain
$$d\sigma(y) = (\mu * \nu)^\wedge(y)\,d\tau(y), \qquad y \ne 0. \tag{3}$$
In particular, the right-hand side represents a finite measure. The special case $\mu_x :=
\delta_0 - \delta_x$ and $\nu := \tilde\mu_x$ gives
$$\int_{\mathbb{R}^d} |\hat\mu_x(y)|^2\,d\tau(y) = \int_{\mathbb{R}^d} |1 - e^{i(x,y)}|^2\,d\tau(y)
= 2\int_{\mathbb{R}^d} [1 - \cos((x, y))]\,d\tau(y) < \infty, \qquad x \in \mathbb{R}^d,$$
from which the relation (2) readily follows.
Now we put
$$p(t) := f(t) - \int_{\mathbb{R}^d} \big(e^{i(t,y)} - 1 - i(t, h(y))\big)\,d\tau(y), \qquad t \in \mathbb{R}^d.$$
That the integral above exists follows from (2) using the inequality in Lemma B.1.5
and the fact that $h(y) = y$ in some neighborhood of zero. Next we show that $\mu * \nu * p$
is a constant for all $\mu, \nu \in M_f(\mathbb{R}^d)$ with $\hat\mu(0) = \hat\nu(0) = 0$. Writing $\rho := \mu * \nu$ for
short, and applying Proposition C.9.6 we obtain
$$\rho * p(t) = \rho * f(t) - \int_{\mathbb{R}^d} \hat\rho(y)\,e^{i(t,y)}\,d\tau(y)
= \check\sigma(t) - \int_{\mathbb{R}^d} \hat\rho(y)\,e^{i(t,y)}\,d\tau(y), \qquad t \in \mathbb{R}^d.$$
From equation (3) we see that the last integral is equal to
$$\int_{\mathbb{R}^d \setminus \{0\}} e^{i(t,y)}\,d\sigma(y) = \check\sigma(t) - \sigma(\{0\})$$
and hence
$$\rho * p(t) = \sigma(\{0\}), \qquad t \in \mathbb{R}^d.$$
In particular, $(\delta_0 - \delta_x) * (\delta_0 - \delta_s) * p$ is constant for all $x, s \in \mathbb{R}^d$. Applying
Lemma C.9.4 twice we see that $p$ is a polynomial of degree at most 2. Setting $\nu := \tilde\mu$,
we obtain that
$$\mu * \tilde\mu * p(0) = \sigma(\{0\}) \ge 0$$
and therefore, by Lemma 3.12.6, the function $p$ is conditionally positive definite. Thus,
$e^p$ is positive definite in view of Theorem 3.12.5. The special form of $p$ follows from
(the simple part of) Theorem 3.5.1.
To prove uniqueness of the representation (1) assume that
$$p_1(t) + \int_{\mathbb{R}^d} \big(e^{i(t,y)} - 1 - i(t, h(y))\big)\,d\tau_1(y)
= p_2(t) + \int_{\mathbb{R}^d} \big(e^{i(t,y)} - 1 - i(t, h(y))\big)\,d\tau_2(y), \qquad t \in \mathbb{R}^d,$$
where $p_j$ is a polynomial of degree at most 2 and $\tau_j$ is a measure having the properties
of $\tau$, $j = 1, 2$. Let $x \in \mathbb{R}^d$ be arbitrary. Convolving both sides of the equation above
with $\rho := (\delta_0 - \delta_x) * (\delta_0 - \delta_x) * (\delta_0 - \delta_x)$ and applying Proposition C.9.6 we obtain
$$\int_{\mathbb{R}^d} \hat\rho(y)\,e^{i(t,y)}\,d\tau_1(y) = \int_{\mathbb{R}^d} \hat\rho(y)\,e^{i(t,y)}\,d\tau_2(y), \qquad t \in \mathbb{R}^d.$$
Since $\hat\rho(y)\,d\tau_j(y) = \big[1 - e^{i(x,y)}\big]^3\,d\tau_j(y)$ is a finite measure, in view of
property (2), we conclude that
$$\big[1 - e^{i(x,y)}\big]^3\,d\tau_1(y) = \big[1 - e^{i(x,y)}\big]^3\,d\tau_2(y), \qquad x \in \mathbb{R}^d.$$
Taking into account that $\tau_1(\{0\}) = \tau_2(\{0\}) = 0$ we infer that $\tau_1 = \tau_2$. Hence we
also have $p_1 = p_2$.

Remark 3.12.13. If $f$ is radial, then the measure $\tau$ in Theorem 3.12.12 is radial as
well. This follows immediately from the construction of $\tau$. In this case, $x = 0$ and $C$
is a nonnegative multiple of the identity matrix.

Theorem 3.12.14 (Micchelli). Let $h\colon [0,\infty) \to \mathbb{R}$ be a continuous function which is
infinitely differentiable on $(0,\infty)$. Then the function $f := h(\|\cdot\|^2)$ is conditionally
positive definite on $\mathbb{R}^d$ for all $d$ if and only if $-h'$ is completely monotone on $(0,\infty)$.³⁷

Proof. If $-h'$ is completely monotone on $(0,\infty)$ then, by Corollary 3.9.7, it admits
the representation
$$-h'(r) = \int_0^\infty e^{-rs}\,d\mu(s), \qquad r \in (0,\infty),$$
where $\mu$ is a nonnegative measure on $[0,\infty)$. Let $\varepsilon > 0$ and define the function $h_\varepsilon$ by
$h_\varepsilon(r) := h(r + \varepsilon)$, $r \in (-\varepsilon, \infty)$. Then we have
$$h_\varepsilon(r) = h_\varepsilon(0) + \int_0^r h_\varepsilon'(t)\,dt
= h_\varepsilon(0) - \int_0^r\!\int_0^\infty e^{-(t+\varepsilon)s}\,d\mu(s)\,dt$$
$$= h_\varepsilon(0) - \int_0^\infty e^{-\varepsilon s}\int_0^r e^{-ts}\,dt\,d\mu(s)
= h_\varepsilon(0) + \int_0^\infty e^{-\varepsilon s}\,\frac{e^{-rs} - 1}{s}\,d\mu(s).$$
Since the function $e^{-\|\cdot\|^2 s} - 1$ is conditionally positive definite on $\mathbb{R}^d$ for all $s \in [0,\infty)$
and for all $d$, the function $h_\varepsilon(\|\cdot\|^2)$ is conditionally positive definite on $\mathbb{R}^d$ for all $d$.
Letting $\varepsilon$ tend to zero, we conclude that the same is true for $h$.
37 This result can easily be generalized for conditionally positive definite functions of order m (see [58]).

Next assume that $h(\|\cdot\|^2)$ is conditionally positive definite on every $\mathbb{R}^d$.
Corollary 3.12.9 shows that the function $-h'(\|\cdot\|^2)$ is positive definite on every $\mathbb{R}^d$. From
Schoenberg's Theorem 3.8.5 we conclude that $-h'$ is completely monotone.

Example 3.12.15. Let $a > 0$, $0 < b < 1$, and write $g(r) := -(a + r)^b$, $r \ge 0$. From
the relation
$$g^{(k)}(r) = -b(b-1)\cdots(b-k+1)\,(a + r)^{b-k}, \qquad k \in \mathbb{N},$$
we see that $-g'$ is completely monotone. Thus, $g(\|\cdot\|^2)$ is conditionally positive
definite on every $\mathbb{R}^d$.
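A quick numerical spot check of this example: the sketch below takes the hypothetical parameters $a = 1$, $b = 1/2$, evaluates the quadratic form of $t \mapsto -(a + \|t\|^2)^b$ at random points of $\mathbb{R}^3$ for real coefficients with zero sum, and records the smallest value found. Random sampling cannot prove conditional positive definiteness; it can only fail to find a counterexample.

```python
import random

A, B = 1.0, 0.5   # hypothetical parameters with a > 0 and 0 < b < 1

def f(t):
    """f(t) = g(||t||^2) with g(r) = -(a + r)^b."""
    return -(A + sum(x * x for x in t)) ** B

def form(points, coeffs):
    """sum_{i,j} f(t_i - t_j) * c_i * c_j for real coefficients."""
    return sum(f([x - y for x, y in zip(p, q)]) * c * e
               for p, c in zip(points, coeffs)
               for q, e in zip(points, coeffs))

random.seed(4)
smallest = float("inf")
for _ in range(100):
    points = [[random.uniform(-3, 3) for _ in range(3)] for _ in range(6)]
    raw = [random.uniform(-1, 1) for _ in range(6)]
    mean = sum(raw) / len(raw)
    coeffs = [c - mean for c in raw]          # enforce c_1 + ... + c_n = 0
    smallest = min(smallest, form(points, coeffs))
print(smallest)            # expected to be nonnegative up to rounding
print(f([0.0, 0.0, 0.0]))  # negative, so f itself is not positive definite
```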

Remark 3.12.16. M. G. Kreĭn (see the historical remarks in [49]) generalized the
concept of positive definiteness in the following way: A complex-valued Hermitian
function $f$ on $\mathbb{R}^d$ is said to have $k$ negative squares if the Hermitian matrix
$$A = \big(f(t_i - t_j)\big)_{i,j=1}^{n}$$
has at most $k$ negative eigenvalues (counted with their multiplicities) for any choice
of $n$ and $t_1, \dots, t_n \in \mathbb{R}^d$, and for some choice of $n$ and $t_1, \dots, t_n$ the matrix $A$ has
exactly $k$ negative eigenvalues. Denote by $P_k(\mathbb{R}^d)$ the set of all functions on $\mathbb{R}^d$ with
$k$ negative squares. As we have seen in Lemma 3.12.3, conditionally positive definite
functions belong to $P_0(\mathbb{R}^d) \cup P_1(\mathbb{R}^d)$. Further examples are given by the functions
$$t \mapsto t^k e^{t} + (-t)^k e^{-t} \in P_{k+1}(\mathbb{R}),$$
$$t \mapsto (-1)^k t^{2k} \in P_k(\mathbb{R}),$$
$$t \mapsto (-1)^{k+1} t^{2k} \in P_{k+1}(\mathbb{R}),$$
$$t \mapsto (-1)^k |t|^a \in P_k(\mathbb{R}), \qquad a \in (2k-2, 2k],\ k > 0,$$
(see [49] for proofs).

In [36] M. G. Kreĭn proved that every continuous function $f \in P_k(\mathbb{R})$ is
definitizable in the following sense: there exists a polynomial $Q$ of degree $k$ such that the
inequality
$$\int_{-\infty}^\infty\!\int_{-\infty}^\infty f(x - y)\,Q\Big(-i\frac{d}{dy}\Big)h(y)\;\overline{Q\Big(-i\frac{d}{dx}\Big)h(x)}\,dy\,dx \ge 0$$
holds for every infinitely differentiable function $h$ with compact support. He obtained
the integral representation
$$f(x) = p(x) + \int_{-\infty}^\infty \frac{e^{itx} - S(x, t)}{|Q_0(t)|^2}\,d\sigma(t)$$
where $p$ is a Hermitian solution of the differential equation
$$Q^*\Big(-i\frac{d}{dx}\Big)\,Q\Big(-i\frac{d}{dx}\Big)\,p(x) = 0$$
where $Q^*(t) := \overline{Q(\bar t)}$; $Q_0$ is a polynomial that is obtained by deleting the non-real
zeros of $Q$, $S$ is a regularizing correction compensating for the real zeros of $Q$, and
$\sigma$ is a nonnegative measure satisfying
$$\int_{-\infty}^\infty \frac{1}{(1 + t^2)^m}\,d\sigma(t) < \infty$$
where $m$ denotes the degree of $Q_0$.

The concepts of negative squares and definitizability, as well as Kreı̆n’s integral rep-
resentation have been generalized to commutative groups. Functions with k negative
squares on a commutative group are definitizable in the sense that certain linear com-
binations of their translates are positive definite. We refer to [49] for more information
on this topic. Definitizable functions are closely related to certain random fields which
are generalizations of stationary processes.38
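To see multivariate examples of functions with negative squares, denote by $s_r^d$, $1 \le r \le d$, the elementary symmetric polynomial of degree $r$ in $d$ real variables:
\begin{align*}
s_1^d(t) &= t_1 + t_2 + \dots + t_d, \\
s_2^d(t) &= t_1 t_2 + t_1 t_3 + \dots + t_{d-1} t_d, \\
&\;\;\vdots \\
s_r^d(t) &= t_1 \cdots t_r + \dots + t_{d-r+1} \cdots t_d, \\
&\;\;\vdots \\
s_d^d(t) &= t_1 t_2 \cdots t_d, \qquad t \in \mathbb{R}^d.
\end{align*}
Then we have
\[
\pm i s_1^d \in P_1(\mathbb{R}^d), \quad s_2^d \in P_2(\mathbb{R}^d), \quad -s_2^d \in P_d(\mathbb{R}^d), \quad
\pm i s_3^d \in P_{d+1}(\mathbb{R}^d), \quad \pm i^d s_d^d \in P_{2^{d-1}}(\mathbb{R}^d)
\]
(see [51] for proofs).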
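For readers who want to experiment with these polynomials, here is a small sketch (ours, not from the text) that evaluates the elementary symmetric polynomial of degree $r$ directly from its definition as a sum over all $r$-element index sets. The function name `elementary_symmetric` is an assumption made for illustration.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(r, t):
    """Sum of all products of r distinct coordinates of the point t."""
    return sum(prod(chosen) for chosen in combinations(t, r))

t = (1, 2, 3)
s1 = elementary_symmetric(1, t)  # t1 + t2 + t3
s2 = elementary_symmetric(2, t)  # t1 t2 + t1 t3 + t2 t3
s3 = elementary_symmetric(3, t)  # t1 t2 t3
```

The direct sum has $\binom{d}{r}$ terms, which is adequate for small $d$ only.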

38 See, among others, [4, 9, 42, 52, 53] and the references therein.
Chapter 4

The extension problem

In this chapter, we study the extendability of positive definite functions defined on
a certain subset $V$ of a commutative group. After proving some general results we
consider the groups $\mathbb{R}^d$ and $\mathbb{Z}^d$.

4.1 General results

Throughout this section $G$ is a discrete commutative group with character group $\hat G$
and $V$ denotes a symmetric subset¹ of $G$ containing 0.

Definition 4.1.1. A complex-valued function $f$ on $V$ is called positive definite if the
inequality
\[
\sum_{i,j=1}^{n} f(x_i - x_j)\, c_i \overline{c_j} \ge 0 \tag{1}
\]
holds for all $c_1, \dots, c_n \in \mathbb{C}$ and all $x_1, \dots, x_n \in V$ such that $x_i - x_j \in V$, $i, j = 1, \dots, n$.

We denote by $P(V)$ the set of all positive definite functions on $V$ and write
\[
P_0(V) := \{ f : f \in P(V),\ f(0) \le 1 \}.
\]

Remark 4.1.2. If $f \in P(V)$, then inequality (4.1.1.1) holds for all $x_1, \dots, x_n \in G$
such that $x_i - x_j \in V$. Indeed, setting $y_i := x_i - x_1$ we have $y_i \in V$ and $y_i - y_j = x_i - x_j$.
If $V = T - T$ with some $T \subset G$, then every $f \in P(V)$ is $T$-positive definite
(cf. Definition 2.9.1). Now let $G = \mathbb{R}$ and $V = (-2a, 2a)$ with some $a > 0$. Then
$V = T - T$ where $T = (-a, a)$. In this case a complex-valued function $f$ on $V$ is $T$-positive definite if and only if $f$ is positive definite on $V$. To see this, let $x_1, \dots, x_n \in V$ be such that $x_i - x_j \in V$ for all $i$ and $j$. Then the diameter of the set $\{x_1, \dots, x_n\}$
is less than $2a$ and hence there exists $x_0 \in \mathbb{R}$ such that $x_i \in (x_0 - a, x_0 + a)$ for all
$i$. Setting $y_i := x_i - x_0$ we have $y_i \in T$ and $y_i - y_j = x_i - x_j$. Consequently, $f$ is
positive definite on $V$.
Example 4.1.4 shows that there exist a finite set $T \subset \mathbb{R}^2$ and a function $f$ on $T - T$
such that $f$ is $T$-positive definite but not positive definite on $T - T$.

¹ A subset $V$ of $G$ is called symmetric if $V = -V$.

Example 4.1.3. For $a \in (0, \pi/2)$ let
\[
V_a := \{ e^{it} : -a < t < a \} \subset G := \mathbb{T}
\]
and let $r \in \mathbb{R}$ be arbitrary.² Then the function
\[
\psi_r(e^{it}) := e^{irt}, \qquad -a < t < a,
\]
is positive definite on $V_a$. Indeed, if $x, y, x - y \in V_a$ then $\psi_r(x - y) = \psi_r(x)\overline{\psi_r(y)}$
and hence we can repeat the computations in Example 1.4.6:
\[
\sum_{i,j=1}^{n} \psi_r(x_i - x_j)\, c_i \overline{c_j} = \Bigl| \sum_{i=1}^{n} c_i \psi_r(x_i) \Bigr|^2 \ge 0.
\]
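The identity behind this example is easy to confirm numerically. The sketch below is ours; the sample angles, the coefficients and the function names are arbitrary illustrative choices. It evaluates both sides for the function $e^{it} \mapsto e^{irt}$ with a non-integer $r$, using angles whose pairwise differences stay inside $(-a, a)$ so that no wrap-around on the circle occurs.

```python
import cmath

def local_character(r, t):
    """The function e^{it} -> e^{irt}, written in the angle t with -a < t < a."""
    return cmath.exp(1j * r * t)

def quadratic_form(r, angles, coeffs):
    """Left-hand side: sum over i, j of f(x_i - x_j) c_i conj(c_j)."""
    return sum(
        local_character(r, ti - tj) * ci * cj.conjugate()
        for ti, ci in zip(angles, coeffs)
        for tj, cj in zip(angles, coeffs)
    )

def squared_modulus(r, angles, coeffs):
    """Right-hand side: |sum over i of c_i f(x_i)|^2."""
    return abs(sum(ci * local_character(r, ti) for ti, ci in zip(angles, coeffs))) ** 2

a = 1.0                           # any a in (0, pi/2)
r = 0.37                          # deliberately not an integer
angles = [-0.45, -0.1, 0.2, 0.49]  # all differences lie in (-a, a)
coeffs = [1 + 2j, -0.5j, 0.3 + 0j, -1.1 + 0.4j]
lhs = quadratic_form(r, angles, coeffs)
rhs = squared_modulus(r, angles, coeffs)
```

The two values agree up to rounding, although for non-integer $r$ the function is not the restriction of a character of the circle group.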

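Example 4.1.4. Write³
\[
T := \Bigl\{ \Bigl( \cos\frac{2\pi k}{6},\ \sin\frac{2\pi k}{6} \Bigr) : k = 0, 1, \dots, 5 \Bigr\}
\]
and $V := T - T$. We show that there is a function $\psi : [0, 2] \to [-1, 1]$ such that the
function $f(x) = \psi(\|x\|)$ is $T$-positive definite but $f \notin P(V)$.
Note that $V$ consists of 19 points, of which 6 have norm 1, 6 have norm $\sqrt{3}$, and 6
have norm 2 (see Figure 4.1). Thus, apart from $\psi(0) = 1$, it suffices to specify three
numbers:
\[
\psi(1), \quad \psi(\sqrt{3}) \quad \text{and} \quad \psi(2).
\]
Choose any
\[
\gamma \in [-1, -1/2), \quad a \in [-1/2, 1], \quad \text{and} \quad b \in [-1, 1/2]
\]
and put
\[
\psi(1) := \alpha := \tfrac{1}{2}\,[a(1 + \gamma) + b(1 - \gamma)], \qquad
\psi(\sqrt{3}) := \beta := \tfrac{1}{2}\,[a(1 + \gamma) - b(1 - \gamma)]
\]
and $\psi(2) := \gamma$. To see that $f$ is $T$-positive definite, write
\[
t_k = (\cos 2\pi k/6,\ \sin 2\pi k/6)
\]
so we have to show that the matrix
\[
A = \bigl( f(t_j - t_k) \bigr)_{j,k=0}^{5}
\]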

is positive semidefinite. Writing down $A$, we see that it is the Toeplitz matrix corresponding to the function $(1, \alpha, \beta, \gamma, \beta, \alpha)$ on the group $\{0, 1, 2, 3, 4, 5\} = \mathbb{Z}/6\mathbb{Z}$ (factor
group). By (1.9.6.iii) it suffices to show that the Fourier transform of this function is
nonnegative. So one has to verify
\[
1 + \alpha \cdot (z^k + z^{-k}) + \beta \cdot (z^{2k} + z^{-2k}) + \gamma \cdot z^{3k} \ge 0
\]
for all $k = 0, \dots, 5$, where $z = e^{i 2\pi/6}$. This is readily done.⁴

[Figure 4.1. The set $V = T - T$ from Example 4.1.4.]

² Below we use additive notation for the group operation in $\mathbb{T}$ which is actually multiplication.
³ This example is due to T. M. Bisgaard.


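To see that $f$ is not positive definite on $V$, write $R = \{0, 2t_0, 2t_1\}$ and note that
$R - R \subset V$. The corresponding $3 \times 3$ matrix is
\[
B = \begin{bmatrix} 1 & \gamma & \gamma \\ \gamma & 1 & \gamma \\ \gamma & \gamma & 1 \end{bmatrix}.
\]
Setting $v := [1, 1, 1]^T$ we have $(Bv, v) = 3(1 + 2\gamma) < 0$. Thus, $B$ is not positive
semidefinite.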
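Both halves of Example 4.1.4 can be checked for a concrete parameter choice. The sketch below is ours and rests on our reading of the example: the three values at norms 1, $\sqrt{3}$ and 2 are taken to be $\alpha = \tfrac12[a(1+\gamma) + b(1-\gamma)]$, $\beta = \tfrac12[a(1+\gamma) - b(1-\gamma)]$ and $\gamma$, and the particular numbers $\gamma = -0.8$, $a = 0.5$, $b = -0.5$ are arbitrary admissible parameters. It computes the discrete Fourier transform of $(1, \alpha, \beta, \gamma, \beta, \alpha)$ on $\mathbb{Z}/6\mathbb{Z}$ and the quadratic form $(Bv, v)$.

```python
import cmath
import math

gamma, a, b = -0.8, 0.5, -0.5  # assumed sample parameters within the stated ranges
alpha = 0.5 * (a * (1 + gamma) + b * (1 - gamma))
beta = 0.5 * (a * (1 + gamma) - b * (1 - gamma))

# Function on Z/6Z whose Toeplitz (circulant) matrix is A.
values = [1.0, alpha, beta, gamma, beta, alpha]
z = cmath.exp(2j * math.pi / 6)
# Fourier transform at k; it is real because the function is symmetric.
fourier = [sum(values[n] * z ** (-n * k) for n in range(6)).real for k in range(6)]

# The 3 x 3 matrix B belonging to R = {0, 2 t_0, 2 t_1}, and v = (1, 1, 1).
B = [[1.0, gamma, gamma], [gamma, 1.0, gamma], [gamma, gamma, 1.0]]
v = [1.0, 1.0, 1.0]
form_Bv_v = sum(B[i][j] * v[j] * v[i] for i in range(3) for j in range(3))
```

With these parameters every Fourier coefficient is nonnegative, so $A$ is positive semidefinite, while $(Bv, v) = 3(1 + 2\gamma)$ is negative.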
⁴ By symmetry of the function it suffices to consider the values $k = 0, 1, 2, 3$.

In the next theorem we list basic properties of positive definite functions on $V$ which
can be proved in the same way as in Section 1.4 and Section 1.5.

Theorem 4.1.5.
(i) If $f \in P(V)$, then $f$ is Hermitian.
(ii) Let $f_1, f_2 \in P(V)$. Then the functions $\overline{f_1}$, $\operatorname{Re} f_1$, $|f_1|^2$, and $f_1 \cdot f_2$ are positive
definite. Moreover, $p_1 f_1 + p_2 f_2$ is positive definite for all $p_1, p_2 \ge 0$.
(iii) The set $P(V)$ is a convex cone closed in the topology of pointwise convergence,
while $P_0(V)$ is a compact convex set.
(iv) The inequalities of Theorem 1.4.12 hold for a function $f \in P(V)$ whenever the
used arguments of $f$ lie in $V$.
(v) Let $G = \mathbb{R}^d$ or $G = \mathbb{T}^d$, let $V$ be open and $f \in P(V)$. If $\operatorname{Re} f$ is continuous
at the neutral element, then $f$ is uniformly continuous on $V$.
(vi) Let $V \subset \mathbb{R}^d$ and assume that $V$ is an open ball with center 0. If $f \in P^c(V)$ is
such that $|f(t)| = 1$ for all $t$ from a neighborhood of 0, then
\[
f(t) = e^{i(x,t)}, \qquad t \in V,
\]
for some $x \in \mathbb{R}^d$.
The next proposition can be proved in the same way as Lemma 1.5.8. We omit the
proof.

Proposition 4.1.6. Let $r > 0$ and $f \in P^c(B^o(2r))$. Then the inequality
\[
\int_{B^o(r)} \int_{B^o(r)} f(x - y)\, c(x) \overline{c(y)}\, dx\, dy = \int_{B^o(2r)} f(x)\, (c * \tilde c)(x)\, dx \ge 0
\]
holds for every continuous function $c$ on $\mathbb{R}^d$ whose support lies in $B^o(r)$.


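Notation 4.1.7. Recall that the support $\operatorname{supp}(\mu)$ of a measure $\mu = \sum_j c_j \delta_{x_j} \in M_f(G)$ is given by
\[
\operatorname{supp}(\mu) = \{ x_j : c_j \ne 0 \}.
\]
We write
\begin{align*}
\mathcal{T}_V &:= \{ \hat\mu : \mu \in M_f(G),\ \operatorname{supp}(\mu) \subset V \}, \\
\mathcal{T}_V^r &:= \{ p \in \mathcal{T}_V : p \text{ is real-valued} \}, \\
\mathcal{T}_V^+ &:= \{ p \in \mathcal{T}_V : p \ge 0 \}.
\end{align*}
Further, let $\mathcal{T}_V^2$ denote the set of all $q \in \mathcal{T}_V^+$ of the form
\[
q = \sum_{j=1}^{n} |q_j|^2 \quad \text{for some } n \in \mathbb{N},
\]
where
\[
q_j = \hat\mu_j \in \mathcal{T}_V \quad \text{and} \quad \operatorname{supp}(\mu_j) - \operatorname{supp}(\mu_j) \subset V. \tag{1}
\]
It is clear that
\[
\mathcal{T}_V^2 \subset \mathcal{T}_V^+ \subset \mathcal{T}_V^r \subset \mathcal{T}_V.
\]
$\mathcal{T}_V$ is a complex linear space, $\mathcal{T}_V^r$ is a real linear space and the sets $\mathcal{T}_V^2$ and $\mathcal{T}_V^+$ are
convex cones. Note that $\mathcal{T}_V^2$ contains the constant character and hence $\mathcal{T}_V^2 \ne \emptyset$. By
(1.8.3.vi) a function $p = \hat\mu \in \mathcal{T}_V$ belongs to $\mathcal{T}_V^r$ if and only if $\mu = \tilde\mu$. On $\mathcal{T}_V$ and $\mathcal{T}_V^r$
we will use the supremum norm
\[
\|p\|_\infty = \sup\{ |p(\gamma)| : \gamma \in \hat G \}.
\]
Assume that $V$ is finite. Then both of the linear spaces $\mathcal{T}_V$ and $\mathcal{T}_V^r$ are finite dimensional and we have $\dim(\mathcal{T}_V) = \dim(\mathcal{T}_V^r) = d$, where $d$ is the number of points of
$V$. For a complex-valued Hermitian function $f$ on $V$ we set
\[
L_f(p) := \sum_{x \in V} f(x)\, \check p(x) = \int_V f\, d\mu, \qquad p = \hat\mu \in \mathcal{T}_V^r.
\]
Then $L_f$ is a real linear functional on $\mathcal{T}_V^r$. Conversely, if $L$ is a real linear functional
on $\mathcal{T}_V^r$, then $L = L_f$ with some Hermitian function $f$ on $V$.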

Lemma 4.1.8. If $V$ is finite, then the cone $\mathcal{T}_V^2$ is closed in $\mathcal{T}_V^r$ with respect to the norm
$\|\cdot\|_\infty$.

Proof. Let $d$ denote the dimension of $\mathcal{T}_V^r$. We prove that every $q \in \mathcal{T}_V^2$ is a sum of $d$
squares $|q_j|^2$ with some $q_j$ satisfying (4.1.7.1). Suppose that
\[
q = \sum_{j=1}^{m} |q_j|^2 \tag{1}
\]
where $m > d$. Since $m > \dim(\mathcal{T}_V^r)$, there exist real numbers $r_j$, not all zero, such
that
\[
\sum_{j=1}^{m} r_j |q_j|^2 = 0.
\]
We may assume, without loss of generality, that $|r_1| \ge |r_2| \ge \dots \ge |r_m|$ and $r_1 \ne 0$.
Solving the above equation for $|q_1|^2$ and substituting the solution into (1) we obtain
\[
q = \sum_{j=2}^{m} (1 - r_j/r_1)\, |q_j|^2
\]
which is a sum of $m - 1$ squares, since all coefficients $1 - r_j/r_1$ are nonnegative. Repeating these arguments we see that every $q \in \mathcal{T}_V^2$
is a sum of $d$ squares.
Suppose that a sequence $\{q_n\}_1^\infty \subset \mathcal{T}_V^2$ converges to $q \in \mathcal{T}_V^r$ with respect to the
norm $\|\cdot\|_\infty$. By the first part of the proof there exist functions $q_{j,n}$ satisfying (4.1.7.1)
such that
\[
q_n = \sum_{j=1}^{d} |q_{j,n}|^2. \tag{2}
\]
The sequence $\{q_n\}_1^\infty$ being convergent, there exists a constant $K$ such that
$\|q_n\|_\infty \le K$ for all $n$. It follows from equation (2) that $\|q_{j,n}\|_\infty^2 \le K$ for all $j$
and $n$. The normed space $\mathcal{T}_V$ being finite dimensional, there is a sequence $n_i \to \infty$
such that $q_{j,n_i}$ converges to some $q_j^0 \in \mathcal{T}_V$, $j = 1, \dots, d$. Since the corresponding
measures $\mu_{j,n_i}$ converge weakly, $q_j^0$ satisfies the condition on the support in (4.1.7.1).
Obviously we have
\[
q = \sum_{j=1}^{d} |q_j^0|^2.
\]
Thus, $q \in \mathcal{T}_V^2$, showing that $\mathcal{T}_V^2$ is closed.

Lemma 4.1.9. Let $f : V \to \mathbb{C}$ be a Hermitian function. Then $f$ is positive definite
on $V$ if and only if⁵
\[
L_f(q) \ge 0 \quad \text{for all } q \in \mathcal{T}_V^2. \tag{1}
\]
Moreover, $f$ can be extended to a positive definite function on $G$ if and only if
\[
L_f(p) \ge 0 \quad \text{for all } p \in \mathcal{T}_V^+.
\]

Proof. To prove the first statement assume that $f \in P(V)$ and let $\mu = \sum_{i=1}^{n} c_i \delta_{x_i}$
be such that $x_i \in V$ and $x_i - x_j \in V$ for all $i, j$. Then $\operatorname{supp}(\mu) - \operatorname{supp}(\mu) \subset V$.
Since $(\mu * \tilde\mu)\hat{\ } = |\hat\mu|^2$ we have
\[
L_f(|\hat\mu|^2) = \int_V f\, d(\mu * \tilde\mu) = \sum_{i,j=1}^{n} f(x_i - x_j)\, c_i \overline{c_j}.
\]
This relation shows that $f$ is positive definite if and only if (1) holds.
To prove the second statement, suppose that $f$ can be extended to a positive definite
function $\psi \in P(G)$. If $p \in \mathcal{T}_V^+$, then $\check p$ is a positive definite function on $G$ with
$\operatorname{supp}(\check p) \subset V$. Using the fact that $\psi \check p \in P(G)$ and applying Theorem 1.9.6 we obtain
\[
L_f(p) = \sum_{x \in V} f(x)\, \check p(x) = \sum_{x \in G} \psi(x)\, \check p(x) = (\psi \check p)\hat{\ }(0) \ge 0.
\]

To prove the converse statement, assume that $L_f(p) \ge 0$ for every $p \in \mathcal{T}_V^+$. Without
loss of generality we may suppose that $f(0) = 1$. If $p \le q$, $p, q \in \mathcal{T}_V^r$, then $q - p \in \mathcal{T}_V^+$ and hence $L_f(p) \le L_f(q)$. From this we conclude that
\[
|L_f(p)| \le L_f(1) = 1
\]
whenever $p \in \mathcal{T}_V^r$ and $-1 \le p \le 1$. Thus, $L_f$ is a linear functional of norm⁶ 1
on $\mathcal{T}_V^r$. By the Hahn–Banach theorem (cf. Theorem D.4.4), $L_f$ can be extended to a
linear functional $L$ of norm 1 on the linear space $C_r(\hat G)$. Since
\[
1 = L(1) \le \|L\| = 1
\]
$L$ is nonnegative (cf. Lemma E.1.5). The Riesz representation theorem (see Theorem E.1.3) shows the existence of a (nonnegative) measure $\sigma \in M(\hat G)$ such that
\[
L(p) = L_f(p) = \int_{\hat G} p(\gamma)\, d\sigma(\gamma), \qquad p \in \mathcal{T}_V^r. \tag{2}
\]
Let $x \in V$ be arbitrary. Since $V$ is symmetric, the function $p$ defined by $p(\gamma) := \gamma(x) + \gamma(-x)$ belongs to $\mathcal{T}_V^r$. Substituting this $p$ into (2) we obtain
\[
f(x) + f(-x) = \check\sigma(x) + \check\sigma(-x).
\]
If we define $p$ by $p(\gamma) := i(\gamma(x) - \gamma(-x))$, then (2) gives
\[
f(x) - f(-x) = \check\sigma(x) - \check\sigma(-x).
\]
Thus, $f(x) = \check\sigma(x)$ for all $x \in V$, i.e., $\check\sigma$ is an extension of $f$. Since $\sigma$ is nonnegative the function $\check\sigma$ is positive definite. The proof is complete.

⁵ See Notation 4.1.7 for the definition of $L_f$.
⁶ Note that we consider the supremum norm on $C_r(\hat G)$.

Theorem 4.1.10. Let $V$ be finite. The following two conditions are equivalent:
(i) every function $f \in P(V)$ can be extended to a function in $P(G)$;
(ii) $\mathcal{T}_V^+ = \mathcal{T}_V^2$.

Proof. Suppose that $\mathcal{T}_V^+ = \mathcal{T}_V^2$ and let $f \in P(V)$. By the first statement of
Lemma 4.1.9 the inequality $L_f(p) \ge 0$ holds for all $p \in \mathcal{T}_V^2 = \mathcal{T}_V^+$. It follows
from the second statement of Lemma 4.1.9 that $f$ admits a positive definite extension.
By Lemma 4.1.8, the convex cone $\mathcal{T}_V^2$ is closed. If $\mathcal{T}_V^2 \ne \mathcal{T}_V^+$ then there exists a real
linear functional $L$ on $\mathcal{T}_V^r$ such that $L$ is nonnegative on $\mathcal{T}_V^2$ and $L(p_0) < 0$ for some
$p_0 \in \mathcal{T}_V^+$ (cf. Corollary D.4.3 with $F = \mathcal{T}_V^2$ and $C = \{p_0\}$). Let $f$ be the Hermitian
function on $V$ for which $L = L_f$. Lemma 4.1.9 shows that $f$ is positive definite on
$V$ but has no positive definite extension.

Example 4.1.11. Let $V_a$ and $\psi_r$ be as in Example 4.1.3 and assume that $\psi_r$ can be
extended to a function $f_r \in P(\mathbb{T})$. By (4.1.5.v), the function $f_r$ is continuous. Arguing
as in the proof of Corollary 1.4.13, we see that $f_r$ must be a character of $\mathbb{T}$. Thus,
$f_r(t) = e^{int}$ with some $n \in \mathbb{Z}$. Consequently, $\psi_r$ can be extended to a positive definite
function on $\mathbb{T}$ if and only if $r$ is an integer.

In the rest of this section $G_1$ and $G_2$ are commutative groups, $V_2$ is a symmetric
subset of $G_2$ containing 0 and $G$ denotes the product group $G_1 \times G_2$. We identify
$G_1$ and $G_2$ with the subgroups $G_1 \times \{0\}$ and $\{0\} \times G_2$, respectively. Doing so, every
element $x \in G$ has a unique decomposition $x = x_1 + x_2$ with $x_i \in G_i$, $i = 1, 2$.
We will investigate positive definite functions on the set $V := G_1 \times V_2 \subset G$.

Lemma 4.1.12. Let $f \in P(V)$, $y \in G_1$, and $c \in \mathbb{C}$. Then the function
\[
f_c^y(x) := (1 + |c|^2) f(x) + c f(x + y) + \overline{c} f(x - y), \qquad x \in V,
\]
is positive definite on $V$.

Proof. It is easy to see that $x - y,\ x + y \in V$ if $x \in V$ and $y \in G_1$. Thus, the function
$f_c^y$ is well defined on $V$. Writing $h_1 := 0$, $h_2 := y$, $d_1 := 1$ and $d_2 := c$, we have
\[
f_c^y(x) = \sum_{k,l=1}^{2} f(x + h_l - h_k)\, d_l \overline{d_k}, \qquad x \in V.
\]
Now let $c_1, \dots, c_n \in \mathbb{C}$ be arbitrary and $x_1, \dots, x_n \in V$ be such that $x_i - x_j \in V$.
Setting $x_{j,k} := x_j + h_k$, $c_{j,k} := c_j d_k$, using the equation above and the fact that
$f \in P(V)$, we obtain
\[
\sum_{i,j=1}^{n} f_c^y(x_i - x_j)\, c_i \overline{c_j} = \sum_{k,l=1}^{2} \sum_{i,j=1}^{n} f(x_{i,l} - x_{j,k})\, c_{i,l} \overline{c_{j,k}} \ge 0,
\]
i.e., $f_c^y \in P(V)$.

Lemma 4.1.13. If $f$ is an extremal point of the convex set $P_0(V)$, then
\[
f(x + y) = f(x) f(y)
\]
holds for all $x \in G_1$ and $y \in V$.

Proof. This lemma can be proved by the same argument that was used in the first part
of the proof of Theorem 1.4.20. We omit the details.

Theorem 4.1.14. If every positive definite function on $V_2$ can be extended to a positive definite function on $G_2$, then every positive definite function on $V$ has a positive
definite extension to $G$.

Proof. First we prove that every extremal point $f$ of $P_0(V)$ can be extended to a
positive definite function on $G$. By the previous lemma we have
\[
f(x + y) = f(x) f(y), \qquad x \in G_1,\ y \in V.
\]
The function $f_2$ defined by $f_2(y) := f(y)$, $y \in V_2$, is positive definite on $V_2$. By
assumption, it has an extension $\psi_2 \in P(G_2)$. Setting
\[
\psi(x + y) := f(x)\, \psi_2(y), \qquad x \in G_1,\ y \in G_2,
\]
the function $\psi$ is positive definite on $G$ (cf. Lemma 1.4.16). Moreover, $\psi$ is an extension of $f$.
The set $P_0(V)$ is compact in the topology of pointwise convergence. Kreĭn–Milman's Theorem D.4.5 shows that every function $g \in P_0(V)$ is the pointwise limit
of a net $\{f_\alpha\}$, where $f_\alpha$ is a finite convex combination of extremal points. By what
we have proved, $f_\alpha$ admits an extension $\psi_\alpha \in P(G)$. Since $P_0(V)$ is compact, the net
$\{\psi_\alpha\}$ has a subnet converging pointwise to some complex-valued function $\psi$ on $G$. It
is clear that $\psi$ is a positive definite extension of $g$.

4.2 The cases $\mathbb{R}^d$ and $\mathbb{Z}^d$

We start with the one-dimensional case.

Theorem 4.2.1. Let $N$ be a nonnegative integer. If $V = \{ n \in \mathbb{Z} : |n| \le N \}$, then
every positive definite function on $V$ can be extended to a positive definite function
on $\mathbb{Z}$.

Proof. By Theorem 4.1.10 it suffices to show that $\mathcal{T}_V^+ = \mathcal{T}_V^2$. The character group
of $\mathbb{Z}$ is $\mathbb{T}$ and each function $p \in \mathcal{T}_V$ can be written as
\[
p(z) = \sum_{n=-N}^{N} c_n z^n, \qquad z \in \mathbb{T},
\]
with some complex numbers $c_n$. The equality $\mathcal{T}_V^+ = \mathcal{T}_V^2$ follows now at once from
Theorem B.1.4.
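Theorem B.1.4 is not reproduced in this section; the sketch below only illustrates, under the assumption that it provides a factorization of Fejér–Riesz type, what the equality of the two cones means in the simplest case $N = 1$. The nonnegative trigonometric polynomial $p(z) = 2 + z + z^{-1}$ is the squared modulus of $q(z) = 1 + z$ on the unit circle, and the coefficient support $\{0, 1\}$ of $q$ has its difference set inside $V = \{-1, 0, 1\}$. The polynomial and the sample points are our own illustrative choices.

```python
import cmath
import math

def p(z):
    """A nonnegative trigonometric polynomial with frequencies in {-1, 0, 1}."""
    return 2 + z + 1 / z

def q(z):
    """Candidate square root: coefficients supported on {0, 1}."""
    return 1 + z

# Sample the unit circle and compare p with |q|^2.
circle = [cmath.exp(2j * math.pi * k / 12) for k in range(12)]
max_gap = max(abs(p(z) - abs(q(z)) ** 2) for z in circle)
min_value = min(p(z).real for z in circle)
```

The gap vanishes up to rounding, so this particular $p$ is a single square of the required form.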

Theorem 4.2.2. Let $a \in (0, \infty)$ and set $V := (-a, a)$.
(i) Every function $f \in P(V)$ can be extended to a function $\varphi \in P(\mathbb{R})$.
(ii) If $f$ is measurable (continuous) on $V$, then every positive definite extension of
$f$ is measurable (continuous) on $\mathbb{R}$.

Proof. (i) Let $g$ be a positive definite function on $\mathbb{R}$ such that the support of $g$ is a
finite subset of $V$. We show that the function $h$, which is equal to $fg$ on $V$ and is
equal to zero on $\mathbb{R} \setminus V$, is positive definite on $\mathbb{R}$. For this we prove that the matrix
$\bigl( h(x_i - x_j) \bigr)_{i,j=1}^{n}$ is positive semidefinite for all $x_1, \dots, x_n \in \mathbb{R}$. Without loss of
generality assume that $x_1 \le x_2 \le \dots \le x_n$. We set
\[
S := \{ (i, j) : x_i - x_j \in V \} \quad \text{and} \quad c_{ij} := f(x_i - x_j), \quad (i, j) \in S.
\]
Then $S$ and $c_{ij}$ satisfy the conditions of Theorem D.2.19 (see also the beginning of
Remark 4.1.2). Thus, there exists a positive semidefinite matrix $\bigl( a_{ij} \bigr)_{i,j=1}^{n}$ such that
$a_{ij} = c_{ij}$ for $(i, j) \in S$. Schur's Theorem D.2.12 shows that the matrix
\[
\bigl( a_{ij}\, g(x_i - x_j) \bigr)_{i,j=1}^{n}
\]
is positive semidefinite. Since the support of $g$ is contained in $V$, we have $g(x_i - x_j) = 0$ for $(i, j) \notin S$ and hence
\[
h(x_i - x_j) = a_{ij}\, g(x_i - x_j), \qquad i, j = 1, \dots, n.
\]
Thus, $h$ is positive definite and hence (cf. Theorem 1.9.6)
\[
\hat h(1) = \sum_{x \in V} f(x) g(x) \ge 0. \tag{1}
\]
Setting⁷
\[
L_f(q) := \sum_{x \in V} f(x)\, \check q(x), \qquad q \in \mathcal{T}_V^r,
\]
inequality (1) shows that $L_f$ is nonnegative on $\mathcal{T}_V^+$. Considering $\mathbb{R}$ as a discrete group
and arguing as in the second part of the proof of (4.1.9.ii), we see that there exists a
nonnegative finite measure $\sigma$ on the character group of $\mathbb{R}$ such that $f(x) = \check\sigma(x)$, $x \in V$. Consequently, $\check\sigma$ is a positive definite extension of $f$.
(ii) The statements concerning measurability and continuity of $\varphi$ follow from Theorem 2.11.8 and Corollary 1.5.2.

As a corollary we obtain the following interesting result.

Theorem 4.2.3. If an analytic function $g : \mathbb{R} \to \mathbb{C}$ is positive definite on an interval
$(-a, a)$, then it is positive definite on $\mathbb{R}$.

Proof. By the previous theorem, the restriction of $g$ to $(-a, a)$ can be extended to a
continuous positive definite function $f$ on $\mathbb{R}$. It follows from (3.3.6.ii) that $f = g$.

Remark 4.2.4. In a series of papers M. G. Kreĭn investigated the problem of describing all positive definite continuations to $\mathbb{R}$ of a given positive definite function on a
finite interval.⁸ To give details would go beyond the scope of this book. Below we
present an example which was studied in [38].

⁷ We use the Notation 4.1.7 with $G = \mathbb{R}$.
⁸ See, e.g., [38] or [25] for a list of references.

We consider the function
\[
f(t) = 1 - |t|, \qquad t \in \mathbb{R},
\]
and its restrictions $f_a := f|_{(-2a, 2a)}$, $a > 0$. Using Pólya's Theorem 3.9.11 we see
that $f_a$ is positive definite if $a \le 1$. If $a > 1$, then $f_a$ is not positive definite since its
modulus is not majorized by $f(0)$. If $a \le 1$, then the periodic extension $\tilde f_a$ of $f_a$ is
positive definite. This can be seen from its Fourier series
\[
\tilde f_a(t) = 1 - a + \sum_{k=-\infty}^{\infty} \frac{a}{(k + \frac{1}{2})^2 \pi^2}\, e^{it(k + \frac{1}{2})\pi/a}, \qquad t \in \mathbb{R}.
\]
If $a = 1$ and $\tilde f_1$ is a positive definite continuation of $f_1$ to $\mathbb{R}$, then $\tilde f_1(2) = -1$ and
therefore $\tilde f_1$ is periodic with period 4 in view of Corollary 1.4.14.
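The Fourier series of the periodic extension can be sanity-checked numerically. The sketch below is ours: it compares a symmetric partial sum of the series (read as having constant term $1 - a$ and coefficients $a/((k + \tfrac12)^2\pi^2)$ at the frequencies $(k + \tfrac12)\pi/a$) with the $4a$-periodic extension of $1 - |t|$. The function names, the truncation level and the sample value $a = 0.5$ are assumptions made for illustration.

```python
import math

def periodic_extension(t, a):
    """4a-periodic extension of 1 - |t| from the interval [-2a, 2a)."""
    reduced = (t + 2 * a) % (4 * a) - 2 * a
    return 1 - abs(reduced)

def fourier_partial_sum(t, a, terms=2000):
    """Constant term 1 - a plus the paired terms for k and -k-1, k = 0..terms-1."""
    total = 1 - a
    for k in range(terms):
        w = (k + 0.5) * math.pi
        # The indices k and -k-1 carry the same coefficient and opposite
        # frequencies, so together they contribute a cosine term.
        total += 2 * a / (w * w) * math.cos(t * w / a)
    return total

a = 0.5
samples = [0.0, 0.3, 1.5, -2.2]
errors = [abs(fourier_partial_sum(t, a) - periodic_extension(t, a)) for t in samples]
```

All coefficients other than the constant term are positive, so the series has nonnegative coefficients exactly when $1 - a \ge 0$, in line with the condition $a \le 1$.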
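Using Kreĭn's results it was shown in [38] that for $a < 1$ the relation
\[
i \int_0^{\infty} e^{izt}\, \tilde f_{a,\omega}(t)\, dt
= \frac{\tan az}{z^2} - \frac{1}{z} - \frac{(a - 1)^2}{z^2 \cos^2 az\, \bigl( \omega(z) + (a - 1)^2 \tan az + (a - 1)/z \bigr)}, \qquad \operatorname{Im} z > 0,
\]
establishes a bijective correspondence between all continuations $\tilde f_{a,\omega}$ of $f_a$ and all
functions $\omega$ of the form $\omega \equiv \infty$ or
\[
\omega(z) = \alpha + \beta z + \int_{-\infty}^{\infty} \frac{tz + 1}{t - z}\, d\tau(t), \qquad z \in \mathbb{C} \setminus \mathbb{R},
\]
where $\alpha \in \mathbb{R}$, $\beta \ge 0$ and $\tau$ is a nonnegative finite measure.⁹

We consider two special extensions.
If $\omega \equiv \infty$, then the corresponding extension is the $4a$-periodic extension of $f_a$.
If $\omega(z) = (a - 1)^2 i$, $\operatorname{Im} z > 0$, and $\sigma_{a,\omega}$ is the measure corresponding to $\tilde f_{a,\omega}$, then
\[
d\sigma_{a,\omega}(x) = \frac{(a - 1)^2}{\pi \bigl( (a - 1)^2 x^2 + (a - 1) x \sin(2ax) + \cos^2(ax) \bigr)}\, dx.
\]
This implies the relation
\[
\frac{2}{\pi} \int_0^{\infty} \frac{(a - 1)^2 \cos(tx)}{(a - 1)^2 x^2 + (a - 1) x \sin(2ax) + \cos^2(ax)}\, dx = 1 - |t|
\]
where $0 < a < 1$, $-2a < t < 2a$.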
We now turn to the multivariate extension problem. Below we show that the extension is not always possible if $d > 1$ (cf. Theorem 4.2.6 and Theorem 4.2.8). For
$M \in \mathbb{N}$ write
\[
S_M^d := \{ i \in \mathbb{Z}^d : |i_k| \le M \text{ for all } k \}.
\]

Theorem 4.2.5. If $G = \mathbb{Z}^d$ and $V = S_M^d$ where $d$ and $M$ are greater than 1, then
$\mathcal{T}_V^+ \ne \mathcal{T}_V^2$.

⁹ $\omega$ is a so-called Nevanlinna function: it is holomorphic in the upper half plane and has a nonnegative
imaginary part there.

Proof. We consider only the case $d = 2$; the general case can be treated in the same
way. Denote by $\mathcal{L}$ the real linear space of all polynomials
\[
P(s, t) = \sum_{m,n=0}^{2M} a_{m,n}\, s^m t^n, \qquad a_{m,n} \in \mathbb{R},
\]
of two real variables. Since the character group of $\mathbb{Z}^2$ is $\mathbb{T}^2$, the linear space $\mathcal{T}_V^r$ consists
of functions of the form
\[
p(z_1, z_2) = \sum_{m,n=-M}^{M} c_{m,n}\, z_1^m z_2^n, \qquad z_1, z_2 \in \mathbb{T},
\]
where $c_{m,n} \in \mathbb{C}$ and $c_{m,n} = \overline{c_{-m,-n}}$. For $p \in \mathcal{T}_V^r$ we define the function $Lp$ by
\[
(Lp)(s, t) := (1 + s^2)^M (1 + t^2)^M\, p\Bigl( \frac{s + i}{s - i},\ \frac{t + i}{t - i} \Bigr), \qquad s, t \in \mathbb{R}.
\]
Noting that $(r + i)/(r - i) \in \mathbb{T}$ for all $r \in \mathbb{R}$, we see that $Lp \in \mathcal{L}$. The mapping $L$ is
obviously linear and $Lp \ne 0$ if $p \ne 0$. From
\[
\dim(\mathcal{L}) = \dim(\mathcal{T}_V^r) = (2M + 1)^2
\]
we conclude that $L$ is a linear isomorphism from $\mathcal{T}_V^r$ onto $\mathcal{L}$. Moreover, $p \ge 0$ if and
only if $Lp \ge 0$. Let $P$ be the polynomial from Lemma B.6.4. Then $P \in \mathcal{L}$ and setting
$p := L^{-1} P$ we have $p \in \mathcal{T}_V^+$. We show that $p \notin \mathcal{T}_V^2$.
Suppose, on the contrary, that $p \in \mathcal{T}_V^2$. Then there exist $q_j = \hat\mu_j \in \mathcal{T}_V$ with
$p = \sum |q_j|^2$ where
\[
\operatorname{supp}(\mu_j) \subset V \quad \text{and} \quad \operatorname{supp}(\mu_j) - \operatorname{supp}(\mu_j) \subset V. \tag{1}
\]
We may suppose that
\[
\operatorname{supp}(\mu_j) \subset S_M^+ := \{ (k, l) \in \mathbb{Z}^2 : 0 \le k, l \le M \}.
\]
Indeed, it follows from (1) that there exists $x_j \in \mathbb{Z}^2$ such that $\operatorname{supp}(\delta_{x_j} * \mu_j) \subset S_M^+$.
Moreover, we have
\[
\bigl| (\delta_{x_j} * \mu_j)\hat{\ } \bigr|^2 = |\hat\mu_j|^2 = |q_j|^2.
\]
Since $\operatorname{supp}(\mu_j) \subset S_M^+$, the function
\[
Q_j(s, t) := (s - i)^M (t - i)^M\, q_j\Bigl( \frac{s + i}{s - i},\ \frac{t + i}{t - i} \Bigr)
\]
is a complex polynomial in each of the variables $s$ and $t$. Applying the definition of $L$
we see that $|Q_j|^2 = L|q_j|^2$. Consequently,
\[
P = Lp = \sum L|q_j|^2 = \sum |Q_j|^2.
\]
Writing $Q_j = g_j + i h_j$ where $g_j$ and $h_j$ are polynomials with real coefficients, we
obtain
\[
P = \sum (g_j^2 + h_j^2)
\]
in contradiction to the choice of $P$. Thus, $p \notin \mathcal{T}_V^2$.
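The transfer map $L$ used in this proof can be illustrated with a small computation. The sketch below is ours and assumes the reading $(Lp)(s,t) = (1+s^2)^M (1+t^2)^M\, p\bigl(\tfrac{s+i}{s-i}, \tfrac{t+i}{t-i}\bigr)$; the test function $p(z_1, z_2) = z_1 + z_1^{-1}$ with $M = 1$ and the names `cayley` and `transfer` are illustrative choices. For this $p$ a direct calculation gives $(Lp)(s,t) = 2(s^2 - 1)(1 + t^2)$, a real polynomial of degree at most $2M$ in each variable.

```python
def cayley(r):
    """Map the real line into the unit circle: r -> (r + i) / (r - i)."""
    return (r + 1j) / (r - 1j)

def transfer(p, M):
    """Return the function (s, t) -> (1+s^2)^M (1+t^2)^M p(cayley(s), cayley(t))."""
    def Lp(s, t):
        return (1 + s * s) ** M * (1 + t * t) ** M * p(cayley(s), cayley(t))
    return Lp

def p(z1, z2):
    """Real-valued trigonometric polynomial z1 + 1/z1 = 2 Re z1 on the torus."""
    return z1 + 1 / z1

Lp = transfer(p, 1)
grid = [(-2.0, 0.5), (0.0, 0.0), (0.7, -1.3), (3.0, 2.0)]
gaps = [abs(Lp(s, t) - 2 * (s * s - 1) * (1 + t * t)) for s, t in grid]
moduli = [abs(cayley(r)) for r in (-5.0, 0.0, 0.7, 12.0)]
```

Because the prefactor $(1+s^2)^M(1+t^2)^M$ is strictly positive, the sign of $Lp$ at $(s,t)$ is the sign of $p$ at the corresponding point of the torus.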

Combining Theorem 4.1.10 and Theorem 4.2.5 we obtain the following theorem:

Theorem 4.2.6 (Calderón–Pepinsky). If $d$ and $M$ are greater than 1, then there exists
a positive definite function on $S_M^d$ which cannot be extended to a positive definite
function on $\mathbb{Z}^d$.

Theorem 4.2.7. Let $M, d \in \mathbb{N}$ be arbitrary. Every positive definite function on $S_M^d$
can be extended to a continuous positive definite function on the set
\[
T_M^d := \{ t \in \mathbb{R}^d : |t_j| \le M \text{ for all } j \} \subset \mathbb{R}^d.
\]

Proof. Write
\[
U := \{ t \in \mathbb{R}^d : |t_j| < 1/2 \text{ for all } j \}
\]
and choose a continuous function $h$ such that $\operatorname{supp}(h) \subset U$ and $\int_{\mathbb{R}^d} |h|^2\, d\lambda = 1$. Let
$K := h * \tilde h$ and define $\psi$ by
\[
\psi(t) := \sum_{n \in S_M^d} f(n)\, K(t - n), \qquad t \in \mathbb{R}^d.
\]
Note that $K$ and $\psi$ are continuous. Using that $K(0) = 1$ and $K(m) = 0$ if $m \in \mathbb{Z}^d \setminus \{0\}$,
we see that $\psi$ is an extension of $f$. It remains to prove that $\psi$ is positive definite, i.e.,
\[
\sum_{i,j=1}^{n} \psi(x_i - x_j)\, c_i \overline{c_j} \ge 0 \tag{1}
\]
for every choice of $n \in \mathbb{N}$, $c_i \in \mathbb{C}$ and points $x_i \in T_M^d$ with $x_i - x_j \in T_M^d$. We may
suppose that the $x_i$'s have nonnegative coordinates. Indeed, denote by $r_k$ the smallest
of the $k$-th coordinates of the points $x_i$ and write $x_0 := (r_1, \dots, r_d)$ and $y_i := x_i - x_0$.
The points $y_i$ lie in $T_M^d$, have nonnegative coordinates, and $\psi(x_i - x_j) = \psi(y_i - y_j)$.
Put
\[
H(y) := \sum_{i=1}^{n} c_i\, h(y - x_i), \qquad y \in \mathbb{R}^d. \tag{2}
\]
Setting $f(n) := 0$ outside of $S_M^d$ and using the definition of $\psi$ we see that the left-hand
side of (1) is equal to
\[
\sum_{n \in \mathbb{Z}^d} f(n) \int_{\mathbb{R}^d} H(y - n)\, \overline{H(y)}\, dy. \tag{3}
\]
It follows from
\[
\int_{\mathbb{R}^d} H(y - n)\, \overline{H(y)}\, dy = \sum_{m \in \mathbb{Z}^d} \int_U H(y - n - m)\, \overline{H(y - m)}\, dy
\]
that (3) is equal to
\[
\int_U \sum_{m,n \in \mathbb{Z}^d} f(m - n)\, H(y + m)\, \overline{H(y + n)}\, dy. \tag{4}
\]
Using the definition of $H$ we see that $H(y + n)$, where $y \in U$ and $n \in \mathbb{Z}^d$, can be
different from zero only if $n \in S_M^d$ and $n$ has nonnegative coordinates. Thus, (4) is not
changed if we restrict $m$ and $n$ to lie in $S_M^d$ and to have nonnegative coordinates. For
such $m$ and $n$ we have $m - n \in S_M^d$. Since $f$ is positive definite on $S_M^d$, the integrand
in (4) is nonnegative from which (1) follows.
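The interpolation step of this proof is easy to reproduce in one dimension. The sketch below is ours: it takes the tent function $h(x) = \max(0, 1 - 4|x|)$, whose support lies in $(-\tfrac14, \tfrac14) \subset U$, normalises it numerically so that $K(0) = 1$, and forms the sum of $f(n) K(t - n)$ over $|n| \le M$ for the sample data $f(n) = 2^{-|n|}$. The midpoint quadrature, the number of steps and all names are assumptions made for illustration; only the interpolation property is checked, not positive definiteness.

```python
def h(x):
    """Continuous bump supported in (-1/4, 1/4), a subset of U = (-1/2, 1/2)."""
    return max(0.0, 1.0 - 4.0 * abs(x))

STEPS = 2000

def integrate(func, lo=-0.5, hi=0.5):
    """Midpoint rule on [lo, hi]; the integrands below vanish outside it."""
    width = (hi - lo) / STEPS
    return sum(func(lo + (i + 0.5) * width) for i in range(STEPS)) * width

NORM = integrate(lambda v: h(v) ** 2)  # numerical value of the integral of |h|^2

def K(t):
    """(h * h~)(t) = integral of h(t + v) h(v) dv for real h, scaled so K(0) = 1."""
    return integrate(lambda v: h(t + v) * h(v)) / NORM

def extension(t, f, M):
    """Sum over |n| <= M of f(n) K(t - n)."""
    return sum(f(n) * K(t - n) for n in range(-M, M + 1))

def f(n):
    """Sample data on {-M, ..., M}."""
    return 0.5 ** abs(n)

M = 2
interpolation_errors = [abs(extension(n, f, M) - f(n)) for n in range(-M, M + 1)]
```

The errors vanish because the translates of $h$ by distinct integers have disjoint supports, which is exactly why $K(m) = 0$ for nonzero integers $m$.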

The next result follows from Theorems 4.2.6 and 4.2.7.

Theorem 4.2.8 (Rudin). Let $d \ge 2$ and $V \subset \mathbb{R}^d$ be a symmetric closed square. Then
there exists a continuous positive definite function on $V$ which cannot be extended to a positive definite
function on $\mathbb{R}^d$.

In Section 4.4 we return to the extension problem and show that if $V$ is a ball in $\mathbb{R}^d$
and $f$ is a continuous radial positive definite function on $V$, then there exists a radial
positive definite function on $\mathbb{R}^d$ that extends $f$.

4.3 Decomposition of locally defined positive definite functions

Throughout the present section $V$ denotes a symmetric open neighborhood of $0 \in \mathbb{R}^d$.
The main result of this section is that every positive definite function $f$ on $V$ can be
written as $f = f_c + f_0$ where $f_c$ and $f_0$ are positive definite, $f_c$ is continuous and
$f_0$ is near to zero in a certain sense.

Definition 4.3.1. Let $h$ be a complex-valued function on $V$. We say that $h$ averages to
zero on $V$ if for every neighborhood $W$ of 0 and for any $\varepsilon > 0$ there exist $x_1, \dots, x_n \in W$ and positive numbers $p_1, \dots, p_n$ summing to 1 such that
\[
\Bigl| \sum_{i=1}^{n} p_i\, h(x + x_i) \Bigr| < \varepsilon
\]
whenever $x \in V$ and $x + x_i \in V$ for all $i$.

Example 4.3.2. The function $1_{\{t\}}$, $t \in V$, averages to zero on $V$. Indeed, if
$x_1, \dots, x_n \in V$ are mutually distinct, then for all $x \in V$ the equation $x + x_i = t$
holds for at most one $i$ and hence
\[
\Bigl| \sum_{i=1}^{n} p_i\, 1_{\{t\}}(x + x_i) \Bigr| \le \frac{1}{n}
\]
where $p_i = 1/n$. More generally, functions with finite support average to zero.
To see another example, consider the subgroup $\mathbb{Q}^d \subset \mathbb{R}^d$ and let $h$ be the restriction
of $1_{\mathbb{Q}^d}$ to $V$. Then for all $n \in \mathbb{N}$ there exist $x_1, \dots, x_n \in V$ such that $x_i - x_j \notin \mathbb{Q}^d$ if
$i \ne j$. Choosing an arbitrary $x \in V$, the relation $x + x_i \in \mathbb{Q}^d$ holds for at most one
$i$. As above, we conclude that $h$ averages to zero.
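The counting argument for the indicator of a single point can be replayed with exact rational arithmetic. In the sketch below (ours; the point $t = 1/3$, the shifts $k/1000$ and the function name are arbitrary choices) the equal-weight average of the indicator over $n$ distinct shifts never exceeds $1/n$, and it equals $1/n$ exactly when one shifted argument hits $t$.

```python
from fractions import Fraction

def average_of_indicator(t, x, shifts):
    """Equal-weight average of the indicator of {t} over the points x + x_i."""
    hits = sum(1 for s in shifts if x + s == t)
    return Fraction(hits, len(shifts))

n = 50
shifts = [Fraction(k, 1000) for k in range(1, n + 1)]  # mutually distinct, close to 0
t = Fraction(1, 3)

# Arguments x for which some x + x_i equals t, and arguments that miss t.
hitting = [average_of_indicator(t, t - s, shifts) for s in shifts]
missing = [average_of_indicator(t, t - s + Fraction(1, 7), shifts) for s in shifts]
worst = max(hitting + missing)
```

Taking $n$ large makes the bound $1/n$ smaller than any prescribed $\varepsilon$, and the shifts can be chosen inside any neighborhood of 0.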

Theorem 4.3.3. Let $h$ be a bounded Lebesgue measurable function on $V$. If $h$ averages to zero on $V$, then $h$ is equal to zero $\lambda$-almost everywhere.

Proof. We may assume that $h$ is real-valued. If $h$ does not vanish $\lambda$-almost everywhere, then there exist $x_0 \in V$ and an open symmetric neighborhood $W$ of 0 such
that $x_0 + W + W \subset V$ and $h$ does not vanish $\lambda$-almost everywhere on $x_0 + W$. We
choose a function $g \in C_{00}^r(\mathbb{R}^d)$ vanishing off $x_0 + W$ such that
\[
\int_{W + x_0} h(y) g(y)\, dy = 1.
\]
Let $H$ be the function on $\mathbb{R}^d$ which is equal to $h$ on $V$ and is equal to zero otherwise.
The function $H * g$ is continuous and $H * g(0) = 1$. Let $W_0$ be an open set such that
$0 \in W_0 \subset W$ and $H * g(x) > \frac{1}{2}$ for all $x \in W_0$. If $p_i \ge 0$ and $\sum p_i = 1$, then we have
\[
\sum_{i=1}^{n} p_i\, H * g(x_i) > \frac{1}{2}
\]
for all $x_i \in W_0$. Since $h$ averages to zero, for all $\varepsilon > 0$ we can choose $p_i$ and $x_i$ such
that the inequality
\[
\Bigl| \sum_{i=1}^{n} p_i\, h(y + x_i) \Bigr| < \varepsilon
\]
holds whenever $y,\ y + x_i \in V$. Using that $y \in x_0 + W$ implies
$y + x_i \in x_i + x_0 + W \subset x_0 + W + W \subset V$, we obtain
\begin{align*}
\frac{1}{2} < \sum_{i=1}^{n} p_i\, H * g(x_i) &= \int_{\mathbb{R}^d} \sum_{i=1}^{n} p_i\, H(y + x_i)\, g(y)\, dy \\
&\le \int_{\mathbb{R}^d} |g(y)|\, dy \cdot \sup_{y \in x_0 + W} \Bigl| \sum_{i=1}^{n} p_i\, h(y + x_i) \Bigr| \\
&\le \varepsilon \int_{\mathbb{R}^d} |g(y)|\, dy
\end{align*}
for all $\varepsilon > 0$. This contradiction shows that $h$ vanishes $\lambda$-almost everywhere.

The main result of this section is the following theorem:

Theorem 4.3.4 (Sasvári). Every function $f \in P(V)$ admits a unique decomposition
\[
f = f_c + f_0
\]
where $f_c \in P^c(V)$, $f_0 \in P(V)$ and $f_0$ averages to zero. If $f$ is Lebesgue measurable, then $f_0$ vanishes $\lambda$-almost everywhere.¹⁰

First we introduce some notation and basic facts that will be used in the proof.
Notation 4.3.5. Let $\mu = \sum_i c_i \delta_{t_i}$ and $\nu = \sum_j d_j \delta_{s_j}$ be measures in $M_f(V)$ such
that $s_j - t_i \in V$ for all $i, j$. We write¹¹
\[
(\mu, \nu) := \sum_{i,j} f(s_j - t_i)\, c_i \overline{d_j}.
\]
Since $f$ is positive definite we have $(\mu, \mu) \ge 0$. Moreover, the Cauchy–Schwarz
inequality
\[
|(\mu, \nu)|^2 \le (\mu, \mu)(\nu, \nu) \tag{1}
\]
holds whenever all expressions are defined. This can be proved in the same way as
in positive semidefinite inner product spaces. For $t \in V$ and $\mu \in M_f(V)$ such that
$\delta_t * \mu \in M_f(V)$ we set
\[
U_t \mu := \delta_t * \mu.
\]
Then the mapping $\mu \mapsto U_t \mu$ is linear, where it is defined, and we have
\begin{align}
(U_t \mu, U_t \nu) &= (\mu, \nu), \tag{2} \\
f(t) &= (\delta_0, U_t \delta_0). \tag{3}
\end{align}
Let $\{W_n\}_{n=1}^{\infty}$ be a sequence of balls with center zero such that
\[
W_1 + W_1 \subset V, \qquad W_{n+1} + W_{n+1} \subset W_n \qquad \text{and} \qquad \bigcap_{n=1}^{\infty} W_n = \{0\}.
\]
For all $t \in V$ let $N(t) \in \mathbb{N}$ be the smallest integer such that $t + W_{N(t)} \subset V$. Then
$N(0) = 1$ and $\{t + W_n\}_{n=N(t)}^{\infty}$ is a neighborhood basis of $t$. If $n \ge N(t)$, then $(\cdot, \cdot)$
is a well-defined semidefinite inner product on the linear manifold $M_f(t + W_n)$.
Therefore, $M_f(t + W_n)$ can be completed to a Hilbert space $H_n(t)$. The element of
$H_n(t)$ corresponding to a $\mu \in M_f(t + W_n)$ will also be denoted by $\mu$. We assume
that the completions are chosen such that¹² $t + W_n \subset s + W_m$ implies $H_n(t) \subset H_m(s)$.
For all $t, s \in V$ such that $t - s \in V$ let $N(t, s) \in \mathbb{N}$ be the smallest integer such
that $t + W_{N(t,s)} - (s + W_{N(t,s)}) \subset V$. If $n \ge N(t, s)$, then $(\mu, \nu)$ is well-defined
for all $\mu \in M_f(t + W_n)$ and $\nu \in M_f(s + W_n)$. In view of inequality (1), $(\cdot, \cdot)$ is
a bounded sesquilinear functional on $M_f(t + W_n) \times M_f(s + W_n)$ and hence it has
a unique extension to a bounded sesquilinear functional on $H_n(t) \times H_n(s)$ which we
also denote by $(\cdot, \cdot)$. By continuity, the inequality
\[
|(h, g)|^2 \le (h, h)(g, g)
\]
holds for all $h \in H_n(t)$ and $g \in H_n(s)$ if $n \ge N(t, s)$.

¹⁰ The case where $V$ is the whole group has been considered in [11] in a more general setting.
¹¹ Note that if $V \ne \mathbb{R}^d$, then $(\cdot, \cdot)$ is not an inner product in $M_f(V)$ since it is not defined for all pairs
of $\mu$ and $\nu$.


Let $t \in V$ be fixed. Since $U_t \mu \in M_f(s - t + W_{N(t,s)})$ for all measures $\mu \in M_f(s + W_{N(t,s)})$, the isometric linear mapping
\[
U_t : M_f(s + W_{N(t,s)}) \to M_f(s - t + W_{N(t,s)})
\]
can be uniquely extended to an isometric mapping
\[
U_t : H_{N(t,s)}(s) \to H_{N(t,s)}(s - t).
\]
Doing so for all $s \in V$ we obtain a linear isometric mapping¹³ $U_t$ defined on
\[
\operatorname{dom}(t) := \bigcup_{s \in V} H_{N(t,s)}(s).
\]
Using the properties of $U_t$ on $M_f(s + W_{N(t)})$ we see that the relations
\[
(U_{t+s} h, g) = (U_t U_s h, g) = (U_s h, U_{-t} g) \tag{4}
\]
hold for all $t, s \in V$ and for all $h, g$ for which all the expressions above are well-defined.
Set
\[
H_0 := \bigcap_{n=1}^{\infty} H_n(0).
\]
Then $\delta_0 \in H_0$ and $H_0$ is a Hilbert space. Since $H_{N(t,0)}(0) \subset \operatorname{dom}(t)$, the mapping
$U_t$ is well-defined on $H_0$.
Next we define two semigroups of operators that will play an important role in the
proof of Theorem 4.3.4. If $m > 1$, then $W_m + W_m \subset W_{m-1}$ and hence the relations
$t \in W_m$ and $\mu \in M_f(W_m)$ imply that $U_t \mu \in M_f(W_{m-1})$. Thus, $U_t$ maps $H_m(0)$ and
$H_0$ into $H_{m-1}(0)$. For $t \in W_m$ we denote by $U_t^0$ the restriction of $U_t$ to $H_0$. Then $U_t^0$
is an isometric linear operator on $H_0$ with range in $H_{m-1}(0)$. For $m > 1$ set
\[
\mathcal{U}_m := \{ U_t^0 : t \in W_m \}
\]
and denote by $\overline{\mathcal{U}}_m$ the closure of $\mathcal{U}_m$ in the weak operator topology.¹⁴ Write further
\[
S_0 := \bigcap_{m > 1} \overline{\mathcal{U}}_m
\]
and let $S$ be the closure, in the weak operator topology, of the convex hull of $S_0$.
The set $\mathcal{U}_m$ consists of operators of norm 1 and hence the sets $\overline{\mathcal{U}}_m$ and $S_0$ are weakly
compact. By Theorem D.5.9, the set $S$ is weakly compact as well. Moreover, in view
of Theorem D.5.5, the norm of the operators in $S$ and $S_0$ is at most 1.

¹² That this is possible can be easily seen, e.g., by applying the completion procedure which uses Cauchy
sequences.
¹³ We use the word "mapping" to indicate that domain and range of $U_t$ need not be Hilbert spaces.

We now show several properties of $S$, $S_0$ and the mappings $U_t$.

Lemma 4.3.6. The operators in $S$ and $S_0$ map $H_0$ into $H_0$.

Proof. It suffices to prove the statement concerning $S_0$. If $m' > m > 1$, then we have
$W_{m'} + W_{m'} \subset W_m$ and
\[
U_t^0 H_0 \subset H_m(0), \qquad t \in W_{m'}.
\]
Consequently,
\[
\overline{\mathcal{U}}_{m''} H_0 \subset H_m(0), \qquad m'' \ge m'.
\]
We conclude that $S_0 H_0 \subset H_m(0)$ whenever $m > 1$ and therefore $S_0 H_0 \subset H_0$.

In the sequel we will consider the elements of $S$ and $S_0$ as operators from $H_0$ into
$H_0$.

Lemma 4.3.7. $S$ and $S_0$ are semigroups.

Proof. Let $s_1, s_2 \in S_0$ be arbitrary. We show that $s_1 s_2 \in \overline{\mathcal{U}}_m$ for all $m > 1$. If
$m' > m$, then $W_{m'} + W_{m'} \subset W_m$ and $\mathcal{U}_{m'} \mathcal{U}_{m'} \subset \mathcal{U}_m$. Since multiplication of operators
is separately continuous in the weak operator topology (cf. Lemma D.5.2), we infer
that $\overline{\mathcal{U}}_{m'} \overline{\mathcal{U}}_{m'} \subset \overline{\mathcal{U}}_m$. From the definition of $S_0$ we see that $s_1 s_2 \in \overline{\mathcal{U}}_{m'} \overline{\mathcal{U}}_{m'}$ and hence
$s_1 s_2 \in \overline{\mathcal{U}}_m$ for all $m > 1$. Consequently, $s_1 s_2 \in S_0$ showing that $S_0$ is a semigroup.
Since the convex hull of $S_0$ is also a semigroup, the same is true for the weak operator
closure $S$.

Notation 4.3.8. By Theorem D.5.11 the semigroup $S$ contains an orthogonal projection $P$ such that $sP = Ps = P$ for all $s \in S$. We introduce the notation $\mu_c := P\delta_0$
and $\mu_0 := \delta_0 - \mu_c$.

¹⁴ We consider the weak operator topology on the set of all bounded linear operators from $H_0$ to
$H_{m-1}(0)$, cf. D.5.1.

Lemma 4.3.9. Let $m' < m$. The mapping $t \mapsto U_t^0 \mu_c$ from $W_{m'}$ into $H_m(0)$ is weakly
continuous at 0.

Proof. We show that $U_{t_n}^0 \mu_c \to \mu_c$ for an arbitrary sequence $t_n \to 0$, $t_n \in W_m$. The
sequence $\{U_{t_n}^0\}$ can have cluster points only in $S_0$. Consequently, $\{U_{t_n}^0 \mu_c\}$ can only
have cluster points in $S_0 \mu_c$. Using the relation
\[
sP = P, \qquad s \in S_0 \subset S,
\]
we see that $S_0 \mu_c = S_0 P \delta_0 = \{P \delta_0\} = \{\mu_c\}$ and therefore $U_{t_n} \mu_c \to \mu_c$.

Lemma 4.3.10. The function $g$ defined by
\[
g(t) = (U_t^0 \mu_c, h), \qquad t \in V,
\]
is continuous on $V$ for all $h \in H_0$.

Proof. The function $g$ is continuous at $t_0 \in V$ if and only if the function
\[
g_0(t) := (U_{t + t_0} \mu_c, h)
\]
which is defined in a neighborhood of 0, is continuous at 0. Using (4.3.5.4) we obtain
\[
(U_{t + t_0} \mu_c, h) = (U_t \mu_c, U_{-t_0} h). \tag{1}
\]
Since $v \mapsto (v, U_{-t_0} h)$ is a continuous linear functional on $H_0$, the continuity of $g_0$ at
0 follows at once from equation (1) and from Lemma 4.3.9.

Lemma 4.3.11. For all  > 0 and m 2 N there exist t1 ; : : : ; tn 2 Wm and


p1 ; : : : ; pn  0 summing to 1 such that
 n 
X 
 p i U ti  0 
  < : (1)
iD1

Proof. Since 0 D P 0 2 S0 , the lemma follows from the definitions of S and
S0 and from the fact that the weak and norm closures of the convex hull of S0 0
coincide.

Lemma 4.3.12. For all t0 2 V we have

.Ut0 c ; 0 / D .Ut0 0 ; c / D 0:

Proof. In view of Lemma 4.3.10 the function t 7! .Ut c ; 0 / is continuous on V .


Thus, for all  > 0 there exists N 2 N such that

j.Ut0 t c ; 0 /  .Ut0 c ; 0 /j < ; t 2 WN :


Section 4.3 Decomposition of locally defined positive definite functions 219

Consequently, if pi  0; p1 C    C pn D 1 and ti 2 WN , then


ˇ n ˇ
ˇ X ˇ
ˇ p U  ;   .U  ;  /ˇ
0 ˇ < ; t 2 WN : (1)
ˇ i t0 t c 0 t0 c
iD1

By the previous lemma we can choose pi and ti such that inequality (4.3.11.1) holds.
We then have
ˇ n ˇ ˇ ˇ
ˇ X ˇ ˇ X
n
ˇ
ˇ ˇ ˇ
pi Ut0 ti c ; 0 ˇ D ˇ Ut0 c ; pi Uti 0 ˇˇ    kUt0 c k:
ˇ
iD1 iD1

Comparing this with (1) we obtain .Ut0 c ; 0 / D 0 and hence .Ut0 0 ; c / D


.0 ; Ut0 c / D 0.
Proof of Theorem 4.3.4. Setting
fc .t / WD .c ; Ut c /; f0 .t / WD .0 ; Ut 0 /; t 2V
the functions fc and f0 are positive definite on V . Indeed, write '.t / WD .x; Ut x/; t 2
V , where x 2 H0 is arbitrary. Using the relation (4.3.5.4) we obtain
X
n X
n X
n X
n
'.ti  tj /ci cj D x; ci cj Uti tj x D ci Uti x; ci Uti x 0
i;j D1 i;j D1 iD1 iD1

where tj 2 V and cj 2 C.
By the previous lemma we have
f .t / D .ı0 ; Ut ı0 / D .c C 0 ; Ut .c C 0 // D fc .t / C f0 .t /; t 2 V:
The continuity of fc follows from Lemma 4.3.10. To prove that f0 averages to zero, let
 > 0 and m 2 N be arbitrary and choose tj and pj as in Lemma 4.3.11. We then have
ˇ n ˇ ˇ ˇ
ˇX ˇ ˇ X
n
ˇ
ˇ ˇ ˇ
pi f0 .t  ti /ˇ D ˇ 0 ; pi Utti 0 ˇˇ (2)
ˇ
iD1 iD1
ˇ n ˇ
ˇ X ˇ
D ˇˇ p i U ti  0 ; U t  0 ; ˇ
ˇ (3)
iD1
X 
 n 

 kUt 0 k   p i U ti  0 
  : (4)
iD1

To prove the uniqueness, assume that f D fc C f0 D gc C g0 where


gc 2 P c .V /; g0 2 P .V / and g0 averages to zero. Then the function
h WD fc  gc D g0  f0 averages to zero and it is continuous. Theorem 4.3.3 shows
that h D 0, i.e., we have fc D gc and f0 D g0 .
The second statement follows from Theorem 4.3.3.

Recall that $B^o(r) \subset \mathbb{R}^d$ denotes the open ball with center $0$ and radius $r > 0$.
A function $g$ defined on $B^o(r)$ is called a radial function if $g(t) = g(Ot)$ for all
$t \in B^o(r)$ and for all orthogonal matrices $O \in O(d)$. Taking $O = -I$ we see that
radial positive definite functions on $B^o(r)$ are real-valued.
The same argument as in Lemma 3.6.2 shows that $g : B^o(r) \to \mathbb{C}$ is radial if and
only if there exists a complex-valued function $h$ on $[0, r)$ such that

$$g(t) = h(\|t\|), \qquad t \in B^o(r).$$

Lemma 4.3.13. If V D B o .r/ and f 2 P .V / is radial, then the functions fc and f0


in Theorem 4.3.4 are radial as well.

Proof. Let e 2 B o .1/ be arbitrary and define the function h by

h.t / WD f .t e/; t 2 I WD .r; r/:

Since f is radial h does not depend on e. By Theorem 4.3.4, h D hc C h0 where


hc 2 P c .I /; h0 2 P .I / and h0 averages to zero. On the other hand, we have h.t / D
fc .t e/ C f0 .t e/. The uniqueness statement of Theorem 4.3.4 shows that fc .t e/ D
hc .t / and f0 .t e/ D h0 .t / for all t . Thus, fc and f0 are radial.

Lemma 4.3.14. If $V = B_d^o(r)$, $d \ge 2$, and $f_0 \in P(V)$ is a radial function which
vanishes Lebesgue almost everywhere, then $f_0 = p\,1_{\{0\}}$ with some nonnegative constant $p$.^{15}

Proof. We may assume that $f_0(0) = 1$. Let $k$ be a positive integer and let $0 < q < r$.
Let $t \in \mathbb{R}^d$ be a fixed point of norm $q$ and choose $\delta > 0$ such that $S_\delta - S_\delta \subset B^o(r)$
where

$$S_\delta = \{x \in \mathbb{R}^d : \|x\| = q,\ \|x - t\| \le \delta\}.$$

Choose independent random vectors $x_1, \ldots, x_k$ on a probability space $(\Omega, \mathcal{A}, P)$ such
that $x_j(\omega) \in S_\delta$ for all $\omega$ and $x_j$ is uniformly distributed in $S_\delta$ for all $j$. Further,
let $x_0$ be the random variable on $\Omega$ which is identically zero. Write $\alpha_0 = 1$ and
$\alpha_j = -f_0(t)$, $1 \le j \le k$. Since $f_0$ is positive definite and $f_0(x_j(\omega)) = f_0(t)$ by radiality, the inequality

$$\sum_{i=0}^{k} \sum_{j=0}^{k} f_0(x_i(\omega) - x_j(\omega))\,\alpha_i \alpha_j
= 1 - k f_0(t)^2 + 2 f_0(t)^2 \sum_{1 \le i < j \le k} f_0(x_i(\omega) - x_j(\omega)) \ge 0$$

holds for all $\omega \in \Omega$. Assume that $d = 2$. It is easy to see that the random variables
$\|x_i - x_j\|$, $i \ne j$, are absolutely continuous with respect to the Lebesgue measure on

15 The example $f = 1_{\mathbb{Q}}$ shows that the lemma is not true if $d = 1$.



$\mathbb{R}$.^{16} Since $f_0$ is radial and vanishes Lebesgue almost everywhere, we conclude that
$E(f_0(x_i - x_j)) = 0$, $1 \le i < j \le k$. Thus, the inequality $1 - k f_0(t)^2 \ge 0$ holds for
all $k$, from which $f_0(t) = 0$ follows. This proves the lemma for $d = 2$.
In the case $d > 2$, we consider the function $g_0(s) := f_0(s, 0, \ldots, 0)$, $s \in B_2^o(r)$.
By the first part of the proof, $g_0(s) = 0$ for all $s \ne 0$. Since $f_0$ is radial we infer that
$f_0(t) = 0$ for all $t \in B^o(r) \setminus \{0\}$.

Combining Theorem 4.3.4 and the preceding lemmata we obtain the following
result.17

Theorem 4.3.15. Let $d \ge 2$ and let $f$ be a Lebesgue measurable radial positive definite
function on $B^o(r) \subset \mathbb{R}^d$. Then $f$ admits the decomposition

$$f = f_c + p\,1_{\{0\}}$$

where $f_c$ is a continuous radial positive definite function on $B^o(r)$ and $p$ is a nonnegative constant.

4.4 Extension of radial positive definite functions


We have seen in Section 4.2 that there exist locally defined positive definite functions
which cannot be extended to the whole of $\mathbb{Z}^d$ or $\mathbb{R}^d$. The situation changes if we
assume radiality.

Theorem 4.4.1 (Rudin). Every radial function $\varphi \in P^c(B_d^o(2r))$ can be extended to
a radial positive definite function on $\mathbb{R}^d$.

Proof. Denote by $C_r^\infty$ the set of all radial, infinitely differentiable functions on $\mathbb{R}^d$
whose support lies in $B^o(2r)$ and let $P_r$ be the set of all $f \in C_r^\infty$ which are positive
definite. By Theorem 3.10.4, each $f \in P_r$ is the sum of a uniformly convergent series

$$f = \sum_{k=1}^{\infty} f_k * \tilde{f}_k \tag{1}$$

where $f_k$ is infinitely differentiable and vanishes outside $B^o(r)$. Note that $\hat{f}$ is nonnegative and radial.
Fix $f \in C_r^\infty$, write $M_f := \|\hat{f}\|_\infty$, and choose $\varepsilon \in (0, 1)$ such that the support
of $f$ lies in $B^o(2r - \varepsilon)$. By Lemma 3.2.4 there exists a sequence $\{g_i\}$ of functions
$g_i \in P_r$ such that

16 This argument is due to T. Gneiting.


17 See also [21] and [50] for decomposition results concerning more general norms.

(i) $\operatorname{supp} g_i \subset B^o(\varepsilon)$
(ii) $\hat{g}_i \to 1$ uniformly on compact sets
(iii) $\int_{\mathbb{R}^d} g_i(x) h(x)\,dx \to h(0)$ for every continuous function $h$.

Let $\delta > 0$ be arbitrary. There is a compact set $K \subset \mathbb{R}^d$ such that

$$|\hat{f}(y)| < \delta, \qquad y \in \mathbb{R}^d \setminus K$$

and there is a positive integer $i_0$ such that

$$M_f\,\hat{g}_i(y) > M_f - \delta, \qquad y \in K,\ i \ge i_0.$$

Then $\hat{f} < M_f\,\hat{g}_i + \delta$ on $\mathbb{R}^d$. Multiplying this inequality by $\hat{g}_i$ we see that the Fourier
transform of the function

$$h_i := M_f\, g_i * g_i + \delta g_i - f * g_i$$

is nonnegative. Noting that the support of $h_i$ lies in $B^o(2r)$ we conclude that $h_i \in P_r$.
Proposition 4.1.6 together with (1) shows that

$$\int_{B^o(2r)} (f * g_i)(x)\,\varphi(x)\,dx \le \int_{B^o(2r)} \bigl[\delta g_i(x) + M_f (g_i * g_i)(x)\bigr]\varphi(x)\,dx, \qquad i \ge i_0.$$

Letting $i \to \infty$ we obtain

$$\int_{B^o(2r)} f(x)\,\varphi(x)\,dx \le (\delta + M_f)\,\varphi(0).$$

Letting $\delta \to 0$ and applying the same argument to $-f$ in place of $f$ we obtain the inequality

$$\Bigl|\int_{B^o(2r)} f(x)\,\varphi(x)\,dx\Bigr| \le M_f\,\varphi(0), \qquad f \in C_r^\infty. \tag{2}$$

Denote by $E$ the linear subspace of all $g \in C_0(\mathbb{R}^d)$ which are integrable and satisfy $\check{g} \in C_r^\infty$.
We endow $C_0(\mathbb{R}^d)$ with the supremum norm and define the linear functional $L$ by

$$Lg := \int_{B^o(2r)} \check{g}(x)\,\varphi(x)\,dx, \qquad g \in E.$$

By inequality (2) we have $|Lg| \le \varphi(0)\|g\|_\infty$, $g \in E$. The Hahn–Banach Theorem D.4.4
shows that $L$ can be extended to a linear functional $\tilde{L}$ on $C_0(\mathbb{R}^d)$ such that
$|\tilde{L}g| \le \varphi(0)\|g\|_\infty$, $g \in C_0(\mathbb{R}^d)$. Therefore, in view of the Riesz–Markov Theorem E.1.4,
there is a complex Radon measure $\mu$ on $\mathbb{R}^d$ such that

$$\int_{B^o(2r)} f(x)\,\varphi(x)\,dx = \int_{\mathbb{R}^d} \hat{f}(y)\,d\mu(y), \qquad f \in C_r^\infty \tag{3}$$

and $|\mu|(\mathbb{R}^d) \le \varphi(0)$. It is not hard to check that $\mu$ is radial and real. Applying equation
(3) to $g_i$ in place of $f$ and letting $i \to \infty$, we obtain $\varphi(0) = \mu(\mathbb{R}^d) \le |\mu|(\mathbb{R}^d)$.

Hence $\mu$ is nonnegative. Equation (3) is the same as

$$\int_{B^o(2r)} f(x)\,\varphi(x)\,dx = \int_{B^o(2r)} f(x) \int_{\mathbb{R}^d} e^{i(x,y)}\,d\mu(y)\,dx, \qquad f \in C_r^\infty.$$

Since this holds for all $f \in C_r^\infty$ we conclude that

$$\varphi(x) = \int_{\mathbb{R}^d} e^{i(x,y)}\,d\mu(y), \qquad x \in B^o(2r).$$

The right-hand side of this equation defines a radial positive definite function on the
whole of $\mathbb{R}^d$. This completes the proof of the theorem.

Combining Theorem 4.4.1 with Theorem 4.3.15 we obtain the following result.

Theorem 4.4.2 (Gneiting–Sasvári). Every Lebesgue measurable radial function in
$P(B_d^o(2r))$ can be extended to a Lebesgue measurable radial positive definite function
on the whole of $\mathbb{R}^d$.
Chapter 5

Selected applications

A chapter on applications could easily fill several hundred pages so that we had to be
very restrictive in our choice of the topics. We start with the probably ‘most classical’
application of characteristic functions: limit theorems. Even here we had to restrict
ourselves to present only basic methods. In the remaining sections we treat topics
which usually cannot be found in other textbooks in this form.

5.1 Limit theorems


First we prove a version of the central limit theorem.

Theorem 5.1.1. Let $\mu$ be a $d$-dimensional distribution such that the first order moments
are all equal to zero while the second order moments are finite. If $X_1, X_2, \ldots$
are independent random vectors all having the distribution $\mu$, then the distributions
of the random vectors

$$S_n = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}$$

tend weakly, as $n \to \infty$, to the normal distribution having the same first and second
moments as $\mu$.

Proof. Denoting by $f$ the characteristic function of $\mu$, the characteristic function $f_n$
of $S_n$ is given by $f_n(t) = f(t/\sqrt{n})^n$. By Corollary 1.2.7,

$$f_n(t) = \Bigl(1 - \frac{1}{2n} \sum_{i,j=1}^{d} \bigl[E(X_i X_j) + R_{i,j}(t/\sqrt{n})\bigr]\, t_i t_j\Bigr)^{n}, \qquad t \in \mathbb{R}^d$$

where $\lim_{t \to 0} R_{i,j}(t) = 0$. Using Lemma B.1.2 we obtain

$$\lim_{n \to \infty} f_n(t) = \exp\Bigl(-\frac{1}{2} \sum_{i,j=1}^{d} E(X_i X_j)\, t_i t_j\Bigr).$$

The statement now follows from Lévy’s continuity Theorem 1.6.3.
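The convergence $f(t/\sqrt{n})^n \to e^{-t^2/2}$ used in this proof can be checked numerically in a simple one-dimensional case. The sketch below is only an illustration and assumes the symmetric distribution with $P(X = 1) = P(X = -1) = 1/2$, whose characteristic function is $\cos t$; the function names are hypothetical and not taken from the text.

```python
import math


def rademacher_cf(t: float) -> float:
    """Characteristic function cos(t) of the law P(X = 1) = P(X = -1) = 1/2."""
    return math.cos(t)


def normalized_sum_cf(t: float, n: int) -> float:
    """Characteristic function f(t / sqrt(n))**n of S_n = (X_1 + ... + X_n) / sqrt(n)."""
    return rademacher_cf(t / math.sqrt(n)) ** n


def standard_normal_cf(t: float) -> float:
    """Characteristic function exp(-t^2 / 2) of the standard normal law."""
    return math.exp(-0.5 * t * t)


def gap(t: float, n: int) -> float:
    """Distance between the two characteristic functions at the point t."""
    return abs(normalized_sum_cf(t, n) - standard_normal_cf(t))


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print(n, gap(1.0, n))
```

The mean is zero and the variance is one here, so the limit is the standard normal characteristic function, in line with the statement of Theorem 5.1.1.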

The next application is Khinchin’s weak law of large numbers.



Theorem 5.1.2. Let $X_1, X_2, \ldots$ be independent identically distributed $d$-dimensional
random vectors and put

$$S_n = \frac{X_1 + X_2 + \cdots + X_n}{n}.$$

If $E(X_1) = m$ exists then

$$\lim_{n \to \infty} P(\|S_n - m\| \ge \varepsilon) = 0$$

for all $\varepsilon > 0$.

Proof. Denote by $f$ the characteristic function of the random vectors $X_j$. Then the
characteristic function $f_n$ of $S_n$ is given by $f_n(t) = f(t/n)^n$. By Theorem 1.2.6,

$$f_n(t) = \Bigl(1 + \frac{1}{n} \sum_{j=1}^{d} \bigl[i m_j + R_j(t/n)\bigr]\, t_j\Bigr)^{n}, \qquad t \in \mathbb{R}^d$$

where $\lim_{t \to 0} R_j(t) = 0$. Lemma B.1.2 shows that

$$\lim_{n \to \infty} f_n(t) = e^{i(m,t)}.$$

By Lévy’s continuity Theorem 1.6.3, the distributions of the $S_n$’s converge weakly,
and hence also in probability, to $\delta_m$.
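As a quick numerical illustration of the limit $f(t/n)^n \to e^{i(m,t)}$, the following sketch assumes the exponential distribution with mean $m = 1$, whose characteristic function is $1/(1 - it)$. This choice of distribution and the function names are assumptions made for the example only.

```python
import cmath


def exponential_cf(t: float) -> complex:
    """Characteristic function 1 / (1 - i t) of the exponential law with mean 1."""
    return 1.0 / (1.0 - 1j * t)


def sample_mean_cf(t: float, n: int) -> complex:
    """Characteristic function f(t / n)**n of S_n = (X_1 + ... + X_n) / n."""
    return exponential_cf(t / n) ** n


def point_mass_cf(t: float, m: float = 1.0) -> complex:
    """Characteristic function exp(i m t) of the point mass at m."""
    return cmath.exp(1j * m * t)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, abs(sample_mean_cf(1.0, n) - point_mass_cf(1.0)))
```

The printed gaps shrink roughly like $t^2/(2n)$, which is what the second-order term of the expansion suggests.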

The next result is Poisson’s limit theorem.

Theorem 5.1.3. Let

$$\begin{array}{l}
X_{1,1} \\
X_{2,1},\ X_{2,2} \\
\quad\vdots \\
X_{n,1},\ X_{n,2},\ \ldots,\ X_{n,n} \\
\quad\vdots
\end{array}$$

be random variables which are independent and identically distributed within each
row of the array above and such that

$$P(X_{n,k} = 1) = p_n, \qquad P(X_{n,k} = 0) = 1 - p_n$$

where $\lim_{n \to \infty} n p_n = \lambda$. Writing

$$S_n = X_{n,1} + X_{n,2} + \cdots + X_{n,n}$$

we have

$$\lim_{n \to \infty} P(S_n = k) = e^{-\lambda}\,\frac{\lambda^k}{k!}, \qquad k \in \mathbb{N}_0. \tag{1}$$
Proof. The characteristic function of $S_n$ is given by

$$f_n(t) = \bigl(1 + p_n (e^{it} - 1)\bigr)^{n}, \qquad t \in \mathbb{R}.$$

By our assumption, $p_n = (\lambda + \varepsilon_n)/n$ where $\varepsilon_n \to 0$. Using this and Lemma B.1.2
we obtain

$$\lim_{n \to \infty} f_n(t) = e^{\lambda (e^{it} - 1)}.$$

Thus, by 1.6.3, the distributions of the $S_n$’s converge weakly to the Poisson distribution
with parameter $\lambda$ (see (1.1.13.d)) from which (1) follows.
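The limit (1) can be observed directly by comparing binomial and Poisson probabilities. The sketch below takes $p_n = \lambda/n$ exactly, which is one admissible choice under the assumption $n p_n \to \lambda$; the helper names are hypothetical.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(S_n = k) for a sum of n independent Bernoulli(p) variables."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability exp(-lam) * lam**k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def max_pmf_gap(n: int, lam: float, k_max: int = 10) -> float:
    """Largest difference between the two probabilities for k = 0, ..., k_max."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1))


if __name__ == "__main__":
    for n in (10, 100, 10_000):
        print(n, max_pmf_gap(n, lam=2.0))
```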

5.2 Sums of independent random vectors and the


Jessen–Wintner purity law
In the present section we study infinite series $X_1 + X_2 + \cdots$ where the $X_j$’s are
independent random vectors. The investigation of such series leads us to infinite
convolutions of distributions and infinite products of characteristic functions. The
main results are Kolmogorov’s three-series theorem, the equivalence of convergence
in law and almost sure convergence, and the Jessen–Wintner purity law (cf. Theorem
5.2.14, Corollary 5.2.12 and Theorem 5.2.15).^{1}

If $\{\mu_n\}$ is a sequence of distributions converging weakly to some distribution $\mu$,
then we simply write $\mu_n \to \mu$. Recall that the sum $A + B$ of two sets $A, B \subset \mathbb{R}^d$ is
defined as the set of those points in $\mathbb{R}^d$ that may be represented in at least one way as
$a + b$ where $a \in A$ and $b \in B$. Next we extend this notation to an infinite sequence
of sets.

Definition 5.2.1. By the limit $\lim B_n$ of a sequence $\{B_n\}$ of subsets of $\mathbb{R}^d$ we understand
the set of those points in $\mathbb{R}^d$ that may be represented in at least one way as the
limit of a sequence $\{x_n\}$ where $x_n \in B_n$. By the sum

$$B_1 + B_2 + \cdots$$

we mean the set $\lim (B_1 + \cdots + B_n)$.

It is easy to check that lim Bn , which may be empty, is a closed set.

1 The presentation of the results is based on the paper [31].



Definition 5.2.2. If $\{\mu_n\}_1^\infty$ is a sequence of distributions, we say that the infinite
convolution $\mu_1 * \mu_2 * \cdots$ is convergent if there exists a distribution $\mu$ such that
$\mu_1 * \cdots * \mu_n \to \mu$.

Lemma 5.2.3. If $\mu_n \to \mu$ and $\mu_n * \nu_n \to \mu$, then $\nu_n \to \delta_0$.

Proof. Denote by $f_n$, $f$ and $g_n$ the characteristic functions of $\mu_n$, $\mu$ and $\nu_n$, respectively.
For all $t \in \mathbb{R}^d$ we have $f_n(t) \to f(t)$ and $f_n(t) g_n(t) \to f(t)$. This implies
that $g_n(t) \to 1$ if $f(t) \ne 0$. Since the set of such $t$’s contains a neighborhood of $0$,
Corollary 1.4.15 shows that $g_n(t) \to 1$ for all $t \in \mathbb{R}^d$. Thus, $\nu_n \to \delta_0$ by Lévy’s
continuity Theorem 1.6.3.

Theorem 5.2.4. Let $\mu_n$ be a distribution with characteristic function $f_n$, $n \in \mathbb{N}$. The
following conditions are equivalent:
(i) The infinite convolution $\mu_1 * \mu_2 * \cdots$ is convergent.
(ii) The sequence $\{p_n\}_1^\infty$ with $p_n = f_1 \cdots f_n$ converges uniformly on every compact
subset of $\mathbb{R}^d$.
(iii) $\mu_{n+1} * \cdots * \mu_{n+d(n)} \to \delta_0$ for an arbitrary function $d : \mathbb{N} \to \mathbb{N}$.

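Proof. The equivalence of (i) and (ii) follows immediately from Theorems 1.6.1
and 1.6.3. Let $d : \mathbb{N} \to \mathbb{N}$ be arbitrary. Writing $\rho_n = \mu_{n+1} * \cdots * \mu_{n+d(n)}$ and
$\nu_n = \mu_1 * \cdots * \mu_n$ we have $\nu_{n+d(n)} = \nu_n * \rho_n$. Consequently, if $\nu_n \to \mu$ we also
have $\nu_n * \rho_n \to \mu$, hence $\rho_n \to \delta_0$ by Lemma 5.2.3. Conversely, assume that $\rho_n \to \delta_0$
for all $d : \mathbb{N} \to \mathbb{N}$ and let $K \subset \mathbb{R}^d$ be compact. Then

$$\|1 - \hat{\rho}_n\| = \|1 - f_{n+1} \cdots f_{n+d(n)}\| \to 0 \tag{1}$$

where $\|\cdot\|$ denotes the supremum norm on $K$. We show that $\{p_n\}$ is a Cauchy sequence
with respect to the supremum metric on $K$ from which (ii) follows. Assume, on the
contrary, that there exists $\varepsilon > 0$ such that for all $n \in \mathbb{N}$ there exist positive integers
$a(n), b(n)$ with $n < a(n) < b(n)$ such that the distance of $p_{a(n)}$ and $p_{b(n)}$ is at least
$\varepsilon$. We have

$$\begin{aligned}
\varepsilon &\le \|p_{a(n)} - p_{b(n)}\| \\
&= \|p_{a(n)} (1 - f_{a(n)+1} \cdots f_{b(n)})\| \\
&\le \|1 - f_{a(n)+1} \cdots f_{b(n)}\| \\
&\le \|1 - f_{n+1} \cdots f_{b(n)}\| + \|f_{n+1} \cdots f_{b(n)} - f_{a(n)+1} \cdots f_{b(n)}\| \\
&= \|1 - f_{n+1} \cdots f_{b(n)}\| + \|f_{a(n)+1} \cdots f_{b(n)} (1 - f_{n+1} \cdots f_{a(n)})\| \\
&\le \|1 - f_{n+1} \cdots f_{b(n)}\| + \|1 - f_{n+1} \cdots f_{a(n)}\| \to 0
\end{aligned}$$

where we have used (1). This contradiction completes the proof.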
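Condition (ii) can be watched at work in a classical example that is not taken from the text: for $\mu_n = \tfrac12(\delta_{2^{-n}} + \delta_{-2^{-n}})$ the characteristic functions are $f_n(t) = \cos(t/2^n)$, and by Viète's product formula the partial products $p_n = f_1 \cdots f_n$ converge to $\sin t / t$, the characteristic function of the uniform distribution on $[-1, 1]$. The sketch below, with hypothetical function names, compares a partial product with this limit.

```python
import math


def partial_product(t: float, n: int) -> float:
    """p_n(t) = cos(t/2) * cos(t/4) * ... * cos(t/2**n)."""
    prod = 1.0
    for k in range(1, n + 1):
        prod *= math.cos(t / 2**k)
    return prod


def uniform_cf(t: float) -> float:
    """Characteristic function sin(t)/t of the uniform law on [-1, 1]."""
    return 1.0 if t == 0 else math.sin(t) / t


def sup_gap(n: int, radius: float = 5.0, grid: int = 200) -> float:
    """Largest gap between p_n and the limit on a grid of the compact set [-radius, radius]."""
    points = [-radius + 2.0 * radius * j / grid for j in range(grid + 1)]
    return max(abs(partial_product(t, n) - uniform_cf(t)) for t in points)


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(n, sup_gap(n))
```

Every factor is discrete while the limit is absolutely continuous, so this example also falls under case (iii) of the purity law in Theorem 5.2.15.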



Using Theorem B.3.6 we obtain the following corollary:

Corollary 5.2.5. If for each compact set $K \subset \mathbb{R}^d$ there exists a constant $C_K \ge 0$
such that

$$\sum_{n=1}^{\infty} |1 - f_n(t)| \le C_K, \qquad t \in K,$$

then the infinite convolution $\mu_1 * \mu_2 * \cdots$ is convergent.
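5.2.6. If $x \mapsto x_j$ is $\mu$-integrable for each $j = 1, \ldots, d$, then we denote by $E(\mu)$
the center of gravity or barycenter of $\mu$, i.e., $E(\mu)$ is the point of $\mathbb{R}^d$ which has the
moments

$$M_j(\mu) = \int_{\mathbb{R}^d} x_j \, d\mu(x)$$

of order one as coordinates. We denote by $\mu^c$ the distribution given by

$$\mu^c(B) = \mu(B + E(\mu)), \qquad B \in \mathcal{B}_d$$

so that $E(\mu^c) = 0$. For an arbitrary distribution $\mu$ on $\mathbb{R}^d$ with finite moments

$$M_{j,j}(\mu) = \int_{\mathbb{R}^d} x_j^2 \, d\mu(x), \qquad j = 1, \ldots, d$$

of order two, we write

$$\operatorname{Var}(\mu) = \int_{\mathbb{R}^d} \|x - E(\mu)\|^2 \, d\mu(x) = \int_{\mathbb{R}^d} \bigl[(x_1 - M_1)^2 + \cdots + (x_d - M_d)^2\bigr] \, d\mu(x)$$

where $M_j = M_j(\mu)$. Note that if $\mu$ is the distribution of a random vector $X$, then
$E(\mu) = E(X)$, $\operatorname{Var}(\mu) = E(\|X - E X\|^2)$ and $\mu^c$ is the distribution of $X - E(X)$.
This fact can be used to show easily the following simple properties (provided the
existence of $E$ or $\operatorname{Var}$, respectively):
(i) $E(\mu_1 * \mu_2) = E(\mu_1) + E(\mu_2)$;
(ii) $\operatorname{Var}(\mu_1 * \mu_2) = \operatorname{Var}(\mu_1) + \operatorname{Var}(\mu_2)$;
(iii) $(\mu_1 * \mu_2)^c = \mu_1^c * \mu_2^c$;
(iv) $f^c(t) = e^{-iE(\mu)t} \cdot f(t)$, $t \in \mathbb{R}^d$, where $f$ and $f^c$ denote the characteristic
functions of $\mu$ and $\mu^c$, respectively;
(v) if $\mu_n \to \mu$ and $\liminf_n \operatorname{Var}(\mu_n)$ is finite, then $\operatorname{Var}(\mu)$ is finite, namely
$\operatorname{Var}(\mu) \le \liminf_n \operatorname{Var}(\mu_n)$.^{2}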
2 This follows from the relation

$$\int f(x) \, d\mu(x) \le \liminf_{n} \int f(x) \, d\mu_n(x)$$

which holds for every nonnegative, continuous function $f$ on $\mathbb{R}^d$.
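Properties (i) and (ii) of 5.2.6 are easy to verify by hand for finitely supported distributions on $\mathbb{R}$. The following sketch represents such a distribution as a dictionary mapping points to probabilities; this representation and the helper names are assumptions made for the illustration. Exact rational arithmetic avoids rounding issues.

```python
from collections import defaultdict
from fractions import Fraction


def convolve(mu: dict, nu: dict) -> dict:
    """Convolution of two finitely supported distributions on the real line."""
    out = defaultdict(Fraction)
    for x, p in mu.items():
        for y, q in nu.items():
            out[x + y] += p * q  # mass of the pair (x, y) lands at x + y
    return dict(out)


def mean(mu: dict) -> Fraction:
    """Barycenter E(mu)."""
    return sum(x * p for x, p in mu.items())


def variance(mu: dict) -> Fraction:
    """Var(mu), the integral of (x - E(mu))^2 with respect to mu."""
    m = mean(mu)
    return sum((x - m) ** 2 * p for x, p in mu.items())


if __name__ == "__main__":
    mu = {0: Fraction(1, 2), 1: Fraction(1, 2)}
    nu = {-1: Fraction(1, 3), 2: Fraction(2, 3)}
    conv = convolve(mu, nu)
    print(mean(conv), mean(mu) + mean(nu))
    print(variance(conv), variance(mu) + variance(nu))
```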



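Theorem 5.2.7. If $M_{j,j}(\mu_n)$ is finite for every $n \in \mathbb{N}$ and $j = 1, \ldots, d$, then the
convergence of the two series

$$E(\mu_1) + E(\mu_2) + \cdots \quad\text{and}\quad \operatorname{Var}(\mu_1) + \operatorname{Var}(\mu_2) + \cdots$$

implies that $\mu_1 * \mu_2 * \cdots$ converges to some distribution $\mu$. Furthermore, $\operatorname{Var}(\mu)$ is
finite and the series above converge to $E(\mu)$ and $\operatorname{Var}(\mu)$, respectively.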

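Proof. Assume first that $E(\mu_n) = 0$. Since $M_{j,j}(\mu_n) < \infty$, the characteristic function
$g_n$ of $\mu_n$ possesses continuous partial derivatives of order at least two and those of the
first order vanish at $0$. Using this, Corollary 1.2.7 shows that

$$g_n(t) - 1 = \frac{1}{2} \sum_{i,j=1}^{d} \bigl[\operatorname{Re} g_{n,t_i,t_j}(\vartheta_n t) + i \cdot \operatorname{Im} g_{n,t_i,t_j}(\eta_n t)\bigr]\, t_i t_j$$

where the subscripts $t_i, t_j$ denote partial differentiation. By Theorem 1.2.1,

$$|g_{n,t_i,t_j}| \le \int_{\mathbb{R}^d} |t_i t_j| \, d\mu_n(t)
\le \Bigl(\int_{\mathbb{R}^d} t_i^2 \, d\mu_n(t)\Bigr)^{1/2} \Bigl(\int_{\mathbb{R}^d} t_j^2 \, d\mu_n(t)\Bigr)^{1/2} \le \operatorname{Var}(\mu_n).$$

Using this we obtain

$$|g_n(t) - 1| \le \frac{1}{\sqrt{2}} \cdot \operatorname{Var}(\mu_n) \cdot \sum_{i,j=1}^{d} |t_i t_j| \le \frac{d}{\sqrt{2}} \cdot \operatorname{Var}(\mu_n) \cdot \|t\|^2 \tag{1}$$

which proves that $\mu_1 * \mu_2 * \cdots$ converges to some distribution $\mu$ (cf. Corollary 5.2.5).
From (5.2.6.v) we conclude that $\operatorname{Var}(\mu) \le \sum_{j=1}^{\infty} \operatorname{Var}(\mu_j) < \infty$. Since $\operatorname{Var}(\mu) =
\operatorname{Var}(\mu_1 * \cdots * \mu_n) + \operatorname{Var}(\mu_{n+1} * \cdots)$, we have $\operatorname{Var}(\mu) \ge \operatorname{Var}(\mu_1) + \cdots + \operatorname{Var}(\mu_n)$,
and hence $\operatorname{Var}(\mu) \ge \sum_{j=1}^{\infty} \operatorname{Var}(\mu_j)$.
Applying (1) to the characteristic function $f_n := g_1 \cdots g_n$ instead of $g_n$ and using
(5.2.6.ii) we get

$$|f_n(t) - 1| \le \frac{d}{\sqrt{2}} \cdot \sum_{j=1}^{n} \operatorname{Var}(\mu_j) \cdot \|t\|^2.$$

Taking the limit $n \to \infty$ we see that the partial derivatives of order one of the characteristic
function of $\mu$ are zero at $0$, thus $E(\mu) = 0$.
The general case where $E(\mu_n)$ is not supposed to be zero follows from the relation
$g_n(t) = e^{iE(\mu_n)t} \cdot g_n^c(t)$.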

Remark 5.2.8. The converse of the previous theorem is false. To see an example,
let $X_1, X_2, \ldots$ be independent random variables such that $P(X_n = 0) = 1 - p_n$
where $0 \le p_n \le 1$ and $\sum_{n=1}^{\infty} p_n < \infty$. The Borel–Cantelli Lemma F.2.1 shows that
$\sum_{n=1}^{\infty} X_n$ is a finite sum with probability $1$. On the other hand, the series $\sum_{n=1}^{\infty} E(X_n)$
and $\sum_{n=1}^{\infty} \operatorname{Var}(X_n)$ may not converge. We can take for example $p_n = \frac{1}{n^2}$ and choose
$X_n$ such that $P(X_n = n^2) = \frac{1}{n^2}$. Then $E(X_n) = 1$ and $\operatorname{Var}(X_n) = n^2 - 1$.
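The moments in this counterexample follow from a two-point computation, which the sketch below carries out with exact fractions: $X_n$ takes the value $n^2$ with probability $1/n^2$ and the value $0$ otherwise. The function name is hypothetical.

```python
from fractions import Fraction


def mean_and_variance(n: int) -> tuple:
    """Mean and variance of X_n with P(X_n = n^2) = 1/n^2 and P(X_n = 0) = 1 - 1/n^2."""
    p = Fraction(1, n * n)
    value = n * n
    mean = p * value                 # E(X_n)
    second_moment = p * value**2     # E(X_n^2)
    return mean, second_moment - mean**2


if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, mean_and_variance(n))
```

Both $\sum E(X_n)$ and $\sum \operatorname{Var}(X_n)$ therefore diverge, although $\sum p_n = \sum 1/n^2$ is finite.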

We have, however, the following theorem.

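Theorem 5.2.9. If for some $R \ge 0$ the supports of all $\mu_n$ are contained in the sphere
$\{x : \|x\| \le R\}$, then the convergence of the two series

$$E(\mu_1) + E(\mu_2) + \cdots \quad\text{and}\quad \operatorname{Var}(\mu_1) + \operatorname{Var}(\mu_2) + \cdots$$

is necessary and sufficient for the convergence of $\mu_1 * \mu_2 * \cdots$.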

Proof. The sufficiency follows from Theorem 5.2.7. In order to prove the necessity
assume that $\mu_1 * \mu_2 * \cdots$ converges to some distribution $\mu$ and let $g_n$, $g_n^c$
and $g$ be the characteristic functions of $\mu_n$, $\mu_n^c$ and $\mu$, respectively. Suppose that
the series $\operatorname{Var}(\mu_1) + \operatorname{Var}(\mu_2) + \cdots$ is divergent. Then it follows from
$\operatorname{Var}(\mu_n) = \sum_{j=1}^{d} M_{j,j}(\mu_n^c)$ that for some value of $j$ the series
$M_{j,j}(\mu_1^c) + M_{j,j}(\mu_2^c) + \cdots$ is divergent. Without loss of generality assume that $j = 1$.
In view of Corollary 1.11.2 there exists an open neighborhood $U$ of $0 \in \mathbb{R}$ such that

$$|g_n(s, 0, \ldots, 0)| = |g_n^c(s, 0, \ldots, 0)| \le 1 - \frac{1}{8} \cdot s^2 M_{1,1}(\mu_n^c)$$

holds for all $s \in U$ and for all $n$. The second part of Theorem B.3.4 shows that

$$g(s, 0, \ldots, 0) = \prod_{n=1}^{\infty} g_n(s, 0, \ldots, 0) = 0, \qquad s \in U \setminus \{0\}.$$

This, however, contradicts the continuity of $g$.

A fundamental theorem of the theory of sums of independent random vectors is that


for the sequence of partial sums there is equivalence between convergence in law and
convergence almost everywhere. Theorem 5.2.11 is an immediate consequence of this
fact. Below we present a different proof given in the paper [31] of B. Jessen and A.
Wintner since it contains a nice application of Jessen’s martingale Theorem 5.2.10.3
To formulate this theorem we need some notation (see Section F.3 for the definition
of products of probability spaces).

3 I would like to thank Christian Berg for making Jessen’s original work [30] available to me.

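For all $n \in \mathbb{N}$ let $(\Omega_n, \mathcal{A}_n, P_n)$ be a probability space and write

$$\begin{aligned}
\tilde{\Omega} &= \Omega_1 \times \Omega_2 \times \cdots \\
\tilde{\mathcal{A}} &= \mathcal{A}_1 \otimes \mathcal{A}_2 \otimes \cdots \\
\tilde{P} &= P_1 \otimes P_2 \otimes \cdots \\
\tilde{\omega}_n &= (\omega_{n+1}, \omega_{n+2}, \ldots), \qquad \tilde{\omega} = (\omega_1, \omega_2, \ldots) \in \tilde{\Omega} \\
\tilde{\Omega}_n &= \Omega_{n+1} \times \Omega_{n+2} \times \cdots \\
\tilde{\mathcal{A}}_n &= \mathcal{A}_{n+1} \otimes \mathcal{A}_{n+2} \otimes \cdots \\
\tilde{P}_n &= P_{n+1} \otimes P_{n+2} \otimes \cdots.
\end{aligned}$$

We will identify $(\omega_1, \omega_2, \ldots)$ and $(\omega_1, \ldots, \omega_n, \tilde{\omega}_n)$.

Theorem 5.2.10 (Jessen). Let $f \in L^1(\tilde{P})$ and for each $n \in \mathbb{N}$ define the function $f_n$
by

$$f_n(\tilde{\omega}) = \int_{\tilde{\Omega}_n} f(\omega_1, \ldots, \omega_n, \tilde{\omega}_n) \, d\tilde{P}_n(\tilde{\omega}_n), \qquad \tilde{\omega} \in \tilde{\Omega}.$$

Then

$$\lim_{n \to \infty} f_n(\tilde{\omega}) = f(\tilde{\omega})$$

holds for $\tilde{P}$-almost all $\tilde{\omega} \in \tilde{\Omega}$.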

Theorem 5.2.11. A necessary and sufficient condition for the convergence of the infinite
convolution $\mu_1 * \mu_2 * \cdots$ is that if $X_1, X_2, \ldots$ are independent random vectors
on a probability space $(\Omega, \mathcal{A}, P)$ having the distributions $\mu_1, \mu_2, \ldots$, then the series
$X_1 + X_2 + \cdots$ is convergent $P$-almost everywhere. The distribution of the random
vector $S = X_1 + X_2 + \cdots$ is then $\mu = \mu_1 * \mu_2 * \cdots$.

Proof. By independence, the distribution of the partial sum $S_n = X_1 + \cdots + X_n$ is
$\nu_n = \mu_1 * \cdots * \mu_n$. Moreover, by the very definition of the convolution, $\nu_n$ is also the
distribution of the random vector

$$\tilde{S}_n(\tilde{\omega}) = X_1(\omega_1) + \cdots + X_n(\omega_n), \qquad \tilde{\omega} = (\omega_1, \omega_2, \ldots) \in \tilde{\Omega}.$$

If $\{S_n\}$ converges $P$-almost everywhere to some random vector $S$, then $S_n \to S$
in law, showing that $\nu_n \to \mu$ where $\mu$ is the distribution of $S$.
Now assume that $\{\nu_n\}$ is convergent and let $d : \mathbb{N} \to \mathbb{N}$ be arbitrary. The distribution of

$$\tilde{R}_n(\tilde{\omega}) = X_{n+1}(\omega_{n+1}) + \cdots + X_{n+d(n)}(\omega_{n+d(n)})$$

is $\rho_n = \mu_{n+1} * \cdots * \mu_{n+d(n)}$. From Theorem 5.2.4 we see that $\rho_n \to \delta_0$ implying
that $\tilde{R}_n \to 0$ in probability (see Section F.2). This shows that $\tilde{S}_n$ converges to some
random vector $\tilde{S}$ in probability. Setting $\tilde{Q}_n(\tilde{\omega}) = \sum_{k=n+1}^{\infty} X_k(\omega_k)$, $\tilde{Q}_n$ is defined
$\tilde{P}$-almost everywhere and

$$f(\tilde{\omega}) := e^{i(\tilde{S}(\tilde{\omega}), y)} = e^{i(\tilde{S}_n(\tilde{\omega}), y)} \cdot e^{i(\tilde{Q}_n(\tilde{\omega}), y)}$$

for every $y \in \mathbb{R}^d$. The function $f$ is measurable and bounded, hence integrable, so
that we may apply Theorem 5.2.10 with $(\Omega_n, \mathcal{A}_n, P_n) = (\Omega, \mathcal{A}, P)$ for all $n$. We find

$$f_n(\tilde{\omega}) = e^{i(\tilde{S}_n(\tilde{\omega}), y)} \cdot z_n(y)$$

with some $z_n(y) \in \mathbb{C}$ satisfying $\lim_n z_n(y) = 1$. Hence, $e^{i(\tilde{S}_n(\tilde{\omega}), y)}$ converges
$\tilde{P}$-almost everywhere for every $y$. Lemma B.1.17 shows that $\tilde{S}_n$ is convergent $\tilde{P}$-almost
everywhere. Since the random variables

$$\omega \mapsto (X_1(\omega), X_2(\omega), \ldots) \quad\text{and}\quad \tilde{\omega} \mapsto (X_1(\omega_1), X_2(\omega_2), \ldots)$$

defined on $\Omega$ and $\tilde{\Omega}$, respectively, have the same distribution (by independence), we
conclude that $S_n$ is convergent $P$-almost everywhere.

The set of points where an infinite series of random vectors converges is a tail
event. Combining the previous theorem and the zero–one law of Kolmogorov (cf.
Theorem F.2.5) we obtain the following corollary.

Corollary 5.2.12. An infinite series $X_1 + X_2 + \cdots$ of independent random vectors is
convergent in law if and only if it is convergent almost surely. Moreover, the series is
always either almost surely convergent or almost surely divergent.

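5.2.13. The following notation will be used in the next theorem. Let $\{X_n\}$ be a sequence
of $d$-dimensional random vectors and, as in Lemma 1.11.3, define the random
vectors $X_{n,R}$, where $n \in \mathbb{N}$ and $R \in (0, \infty)$, by

$$X_{n,R}(\omega) = \begin{cases} X_n(\omega) & \text{if } \|X_n(\omega)\| \le R \\ 0 & \text{if } \|X_n(\omega)\| > R \end{cases} \qquad \omega \in \Omega.$$

Denoting by $\mu_n$ and $\mu_{n,R}$ the corresponding distributions we have

$$\mu_{n,R} = \nu_{n,R} + (1 - \mu_n(B_R)) \cdot \delta_0$$

where $B_R = \{x \in \mathbb{R}^d : \|x\| \le R\}$ and $\nu_{n,R}(A) = \mu_n(B_R \cap A)$, $A \in \mathcal{B}_d$.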

Theorem 5.2.14 (Kolmogorov’s three-series theorem). A necessary and sufficient
condition for the convergence of the infinite convolution $\mu_1 * \mu_2 * \cdots$ is the convergence
of the three series

$$(1 - \mu_1(B_R)) + (1 - \mu_2(B_R)) + \cdots \tag{1}$$

$$E(\mu_{1,R}) + E(\mu_{2,R}) + \cdots \quad\text{and}\quad \operatorname{Var}(\mu_{1,R}) + \operatorname{Var}(\mu_{2,R}) + \cdots \tag{2}$$

for a fixed $R > 0$ (or for all $R > 0$).

Proof. Let $X_1, X_2, \ldots$ be independent random vectors having the distributions
$\mu_1, \mu_2, \ldots$. By Theorem 5.2.11 the convergence of $\mu_1 * \mu_2 * \cdots$ is necessary and
sufficient for the almost sure convergence of $X_1 + X_2 + \cdots$.
If $X_1 + X_2 + \cdots$ is almost surely convergent, then $X_n \to 0$ almost surely, hence
the probability that the event $[\|X_n\| > R]$ happens for infinitely many $n$ is equal to $0$.
By Theorem F.2.1, we then have

$$\sum_{n=1}^{\infty} P(X_n \ne X_{n,R}) = \sum_{n=1}^{\infty} (1 - \mu_n(B_R)) < \infty$$

so that $X_{1,R} + X_{2,R} + \cdots$ is almost surely convergent in view of Lemma F.2.2. From
Theorem 5.2.9 we see that the two series in (2) are convergent.
Assume now that the three series in (1) and (2) are convergent. Theorem 5.2.9
then shows that $X_{1,R} + X_{2,R} + \cdots$ is almost surely convergent. Applying again
Lemma F.2.2 we see that $X_1 + X_2 + \cdots$ is almost surely convergent.
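To see how the three series behave in a concrete case, the sketch below assumes symmetric two-point variables $X_n = \pm a_n$ with probability $1/2$ each (an example chosen here, not taken from the text) and accumulates partial sums of the series (1) and (2) for a given $R$. For such variables the truncated means vanish by symmetry, and the truncated variance is $a_n^2$ when $a_n \le R$ and $0$ otherwise.

```python
import math
from typing import Callable, Tuple


def three_series_partial_sums(
    scale: Callable[[int], float], n_terms: int, radius: float = 1.0
) -> Tuple[float, float, float]:
    """Partial sums of the three series for X_n = +/- scale(n) with probability 1/2 each.

    Returns (sum of P(|X_n| > R), sum of truncated means, sum of truncated variances).
    """
    tail_sum = 0.0
    mean_sum = 0.0  # stays 0: each truncated variable is symmetric about the origin
    var_sum = 0.0
    for n in range(1, n_terms + 1):
        a = scale(n)
        if a > radius:
            tail_sum += 1.0   # |X_n| > R with probability 1, truncated variable is 0
        else:
            var_sum += a * a  # truncated variable equals X_n, variance a^2
    return tail_sum, mean_sum, var_sum


if __name__ == "__main__":
    print(three_series_partial_sums(lambda n: 1.0 / n, 10_000))
    print(three_series_partial_sums(lambda n: 1.0 / math.sqrt(n), 10_000))
```

With $a_n = 1/n$ the variance sums stay below $\pi^2/6$, so all three series converge; with $a_n = 1/\sqrt{n}$ the variance sums grow like the harmonic series, and by the theorem the infinite convolution cannot converge.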

Theorem 5.2.15 (Jessen–Wintner purity law). If $\mu = \mu_1 * \mu_2 * \cdots$ is a convergent
infinite convolution of distributions $\mu_n$ each of which is discrete, then the following
three cases are possible: $\mu$ is
(i) discrete;
(ii) singular and continuous;
(iii) absolutely continuous.

Proof. Let $X_1, X_2, \ldots$ be a sequence of independent random vectors on a probability
space $(\Omega, \mathcal{A}, P)$ such that $\mu_n$ is the distribution of $X_n$. By Theorem 5.2.11, the series
$X_1(\omega) + X_2(\omega) + \cdots$ converges with probability $1$. We may suppose that each $X_n$
takes values in a countable set and that $S = X_1 + X_2 + \cdots$ converges everywhere.
Let $G$ denote the smallest group in $\mathbb{R}^d$ containing all possible values of the $X_n$’s.
Then $G$ is countable. Let $E \subset \mathbb{R}^d$ be an arbitrary Borel set and write

$$A := \{\omega \in \Omega : S(\omega) \in E + G\}.$$

Since

$$X_1(\omega) + \cdots + X_n(\omega) \in G, \qquad n \in \mathbb{N},\ \omega \in \Omega$$

and $G$ is a group, we see that

$$A = \{\omega : X_n(\omega) + X_{n+1}(\omega) + \cdots \in E + G\}, \qquad n \in \mathbb{N}.$$

Thus, $A$ is a tail event (cf. F.2.4) with respect to $X_1, X_2, \ldots$. By Theorem F.2.5,
$P(A) = 1$ whenever $P(A) > 0$. This means that $\mu(E + G) = 1$ whenever
$\mu(E + G) > 0$ and a fortiori when $\mu(E) > 0$ (since $E \subset E + G$).
If $\mu$ is not continuous, then there is a one-point set $E$ with $\mu(E) > 0$. The set $E + G$
is then countable, its $\mu$-measure is $1$, showing that $\mu$ is discrete.
Assume now that $\mu$ is continuous. If there is a Borel set $E$ of Lebesgue measure zero such that
$\mu(E) > 0$, then $E + G$ has Lebesgue measure zero as well (since $G$ is countable) and $\mu(E + G) = 1$, i.e., $\mu$ is
singular. If there is no such set $E$ then, by the theorem of Radon and Nikodym, $\mu$ is
absolutely continuous.

5.3 Ergodic theorems for stationary fields


We start with a simple lemma. As an immediate corollary, we obtain a limit theorem
for stationary sequences by applying the isometric operator IZ from Remark 2.9.12.

Lemma 5.3.1. The sequence $\{s_n\}_{n=1}^{\infty}$ of complex-valued functions where

$$s_n(x) = \frac{1}{n} \sum_{k=1}^{n} e^{ikx}, \qquad x \in \mathbb{R},\ n \in \mathbb{N}$$

is uniformly bounded and converges pointwise to $1_{\{0\}}$ (see Figure 5.1).

Figure 5.1. The function $s_5$ from Lemma 5.3.1 (curve in the complex plane; axes Re and Im).

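Proof. We have $s_n(0) = 1$ and $|s_n(x)| \le 1$ for all $x$. If $x \ne 0$ then

$$|s_n(x)|^2 = \frac{1}{n^2} \cdot \left| \frac{e^{ix} - e^{i(n+1)x}}{1 - e^{ix}} \right|^2 \tag{1}$$

$$\le \frac{1}{n^2} \cdot \frac{4}{|1 - e^{ix}|^2} \to 0, \qquad n \to \infty. \tag{2}$$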
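The bound (2) is easy to test numerically. The sketch below evaluates the averages $s_n(x)$ directly and compares $|s_n(x)|$ with $2 / (n\,|1 - e^{ix}|)$; it assumes that $x$ is not an integer multiple of $2\pi$, so that the denominator does not vanish. The function names are hypothetical.

```python
import cmath


def s(n: int, x: float) -> complex:
    """Average (1/n) * sum_{k=1}^{n} exp(i k x)."""
    return sum(cmath.exp(1j * k * x) for k in range(1, n + 1)) / n


def geometric_bound(n: int, x: float) -> float:
    """Upper bound 2 / (n * |1 - exp(i x)|) from the geometric sum formula."""
    return 2.0 / (n * abs(1.0 - cmath.exp(1j * x)))


if __name__ == "__main__":
    for n in (5, 50, 500, 5000):
        print(n, abs(s(n, 1.0)), geometric_bound(n, 1.0))
```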

Theorem 5.3.2. Let Z be a stationary time series on Z with representing measure .


Then we have
1X
n
lim Z.k/ D .f0g/:
n!1 n
kD1

Proof. In view of Lemma 5.3.1 the sequence fsn g converges in L2 . / to 1f0g . The
theorem follows by applying the isometric operator IZ (cf. Remark 2.9.12) to sn and
noting that IZ .1f0g / D .f0g/ (cf. equation (2.7.1.3)).

Next we prove the continuous version of the previous result.

Theorem 5.3.3. If Z is a continuous stationary process on R with representing meas-


ure , then
Z
1 T
lim Z.t / dt D .f0g/:
T !1 T 0

Proof. This theorem can be proved in the same way as the previous one by using the
relations

$$\frac{1}{T} \int_0^T e^{itx} \, dt = \frac{e^{iTx} - 1}{iTx}, \qquad x \ne 0,\ T > 0 \tag{1}$$

and

$$I_Z\Bigl(\frac{1}{T} \int_0^T e^{itx} \, dt\Bigr) = \frac{1}{T} \int_0^T Z(t) \, dt$$

(see equation (2.9.13.1)).

Remark 5.3.4. Let Z be as in Theorem 5.3.2 or in Theorem 5.3.3 and denote by m


the mean of Z. We say that Z is ergodic if
1X
n
lim Z.k/ D m
n!1 n
kD1
or Z T
1
lim Z.t / dt D m
T !1 T 0
respectively. By the theorems above, Z is ergodic if and only if .f0g/ D m. If Z is
ergodic and  denotes the spectral measure of Z, then
Z
 
 .f0g/ D 1f0g d D E j.f0g/j2 D jmj2 :

Consequently, if m D 0, then Z is ergodic if and only if its spectral measure has no


atom at the origin.

Remark 5.3.5. Let Z be a stationary time series and write Sn D Z1 C  CZn ; n 2 N.


We already know that the sequence fSn =ng converges to some Y 2 H.Z/. Denote
by .Un / the canonical unitary representation of Z in H.Z/. We then have
Sn Z1Ck C    C ZnCk SnCk  Sk SnCk n C k Sk
Uk D D D  
n n n nCk n n

for all k 2 Z such that n C k ¤ 0. Taking the limit n ! 1 and replacing k by k


we obtain
Uk Y D Y; k 2 Z
i.e., Y is a so-called fixed point of .Un /. Similar computations can be carried out for
continuous stationary processes and integrals instead of sums. This is the motivation
for the following investigation of fixed points of unitary representations.
Notation 5.3.6. Let $(U_t)$ be a unitary representation of $T = \mathbb{R}^d$ or $T = \mathbb{Z}^d$ in a
Hilbert space $H$. For $g \in H$ we denote by $H(g)$ the closure of the linear manifold

$$\operatorname{span}\{U_t g : t \in T\}.$$

It is easy to check that $H(g)$ is a $(U_t)$-invariant subspace of $H$. The orthogonal projection
from $H(Z)$ onto $H(g)$ will be denoted by $P_g$. It follows from Lemma 2.10.16
that $P_g$ commutes with all operators $U_t$:

$$U_t P_g = P_g U_t, \qquad t \in T,\ g \in H. \tag{1}$$

An element $h \in H$ is called a fixed point of $(U_t)$ if

$$U_t h = h, \qquad t \in T.$$

We denote by $H_0$ the set of all fixed points. This set is a $(U_t)$-invariant subspace, the
orthogonal projection onto $H_0$ will be denoted by $P^0$. We have

$$P^0 U_t = U_t P^0 = P^0. \tag{2}$$

The first equation follows from the $(U_t)$-invariance of $H_0$, the second one from
$P^0 H = H_0$.

Lemma 5.3.7. For all $g \in H$ we have
(i) $P_g P^0 = P^0 P_g$
(ii) $H(g) \cap H_0 = \mathbb{C} \cdot P^0 g$.

Proof. (i) Equations (5.3.6.1) and (5.3.6.2) show that $U_t P_g P^0 = P_g P^0$. Thus, for
all $h \in H$ the vector $P_g P^0 h$ is a fixed point, i.e., $P_g P^0 h \in H_0$. Consequently,

$$P^0 P_g P^0 h = P_g P^0 h, \qquad h \in H$$

and hence

$$P^0 P_g P^0 = P_g P^0. \tag{1}$$

Since $P_g$ and $P^0$ are orthogonal projections, we see that

$$(P^0 P_g h, h') = (h, P_g P^0 h') = (h, P^0 P_g P^0 h') = (P^0 P_g P^0 h, h')$$

for all $h, h' \in H$. This relation and (1) imply that $P^0 P_g = P^0 P_g P^0 = P_g P^0$.

(ii) By (i) we have

$$P^0 g = P^0 P_g g = P_g P^0 g \in H(g)$$

and hence $\mathbb{C} \cdot P^0 g \subset H(g) \cap H_0$. If $h_0 \in H_0 \cap H(g)$, then

$$h_0 = \lim_{n \to \infty} \sum_{j=1}^{k_n} c_{j,n} U_{t_{j,n}} g$$

with some $k_n \in \mathbb{N}$, $c_{j,n} \in \mathbb{C}$ and $t_{j,n} \in T$. Using equation (5.3.6.1) we see that

$$h_0 = P^0 h_0 = \lim_{n \to \infty} \sum_{j=1}^{k_n} c_{j,n} P^0 U_{t_{j,n}} g = \lim_{n \to \infty} \sum_{j=1}^{k_n} c_{j,n} \cdot P^0 g$$

i.e., $h_0$ is proportional to $P^0 g$.

Theorem 5.3.8. Let $Z$ be a continuous stationary field on $T = \mathbb{R}^d$ or $T = \mathbb{Z}^d$ with representing measure $\zeta$, spectral measure $\sigma$ and correlation function $C$. Denote by $(U_t)$, $(V_t)$ and $(W_t)$ the canonical unitary representations in $H(Z)$, $L^2(\sigma)$ and $H(C)$, respectively. Then we have:
(i) The set of all fixed points of $(V_t)$ is $\mathbb{C} \cdot 1_{\{0\}}$.
(ii) The set of all fixed points of $(W_t)$ is $\mathbb{C} \cdot 1_T \cap H(C)$.
(iii) The set $H_0$ of all fixed points of $(U_t)$ is $\mathbb{C} \cdot \zeta(\{0\})$ and

\[ \zeta(\{0\}) = P_0 Z(s) \]

for all $s \in T$.

Proof. We consider only the case $\mathbb{R}^d$; the case $\mathbb{Z}^d$ can be treated in the same way.
(i) If $h$ is a fixed point of $(V_t)$, then

\[ h(x) = V_t h(x) = e^{i(t,x)} h(x), \qquad x, t \in \mathbb{R}^d. \]

We conclude that $h(x) = 0$ if $x \neq 0$, i.e., $h \in \mathbb{C} \cdot 1_{\{0\}}$. On the other hand, $1_{\{0\}}$ is a fixed point of $(V_t)$ so that (i) holds.
(ii) A function $h \in H(C)$ is a fixed point for the translation operators $W_t$ if and only if $h(x - t) = h(x)$ for all $x, t \in T$. This equation is satisfied if and only if $h$ is constant.
(iii) It is easy to see that the isometric operator $I_Z$ from Remark 2.10.15 maps the set of fixed points of $(V_t)$ onto $H_0$. From (i) we conclude that

\[ H_0 = \mathbb{C} \cdot I_Z(1_{\{0\}}) = \mathbb{C} \cdot \zeta(\{0\}). \]

If $\sigma(\{0\}) = 0$, then $\zeta(\{0\}) = 0$. The equation above shows that $H_0 = \{0\}$ and hence $P_0 = 0$. Thus, the equation in (iii) holds whenever $\sigma(\{0\}) = 0$.
If $\sigma(\{0\}) \neq 0$, then $\zeta(\{0\}) \neq 0$, the subspace $H_0$ is one-dimensional and $P_0 \neq 0$. We obviously have $P_0 Z(s) \in H_0$. By equation (5.3.6.2),

\[ P_0 Z(s) = P_0 U_s Z(0) = U_s P_0 Z(0) = P_0 Z(0) \]

and hence $P_0 Z(s)$ does not depend on $s$. From the definition of $H(Z)$ and from the fact that $P_0 \neq 0$ we conclude that $P_0 Z(s) \neq 0$. Since $H_0$ is one-dimensional, $P_0 Z(s) = P_0 Z(0) = c\, I_Z(1_{\{0\}})$ with some $c \in \mathbb{C} \setminus \{0\}$. As $P_0$ is an orthogonal projection we have

\[ (P_0 Z(0), P_0 Z(0)) = (Z(0), P_0 Z(0)). \]

Using the fact that $I_Z$ is isometric we obtain

\[ (P_0 Z(0), P_0 Z(0)) = (c 1_{\{0\}}, c 1_{\{0\}}) \]

and

\[ (Z(0), P_0 Z(0)) = (1, c 1_{\{0\}}) = (1_{\{0\}}, c 1_{\{0\}}). \]

Thus, $c = 1$.

Corollary 5.3.9. $H_0 = \{0\}$ if and only if $\sigma(\{0\}) = 0$.


We are now able to prove a generalization of Theorem 5.3.2 and Theorem 5.3.3. Note that the expressions

\[ \frac{1}{n} \sum_{k=1}^{n} Z(k) \qquad\text{and}\qquad \frac{1}{T} \int_0^T Z(t)\, dt \]

can be written as $\int Z\, d\mu$ where

\[ \mu = \frac{1}{n} \sum_{k=1}^{n} \delta_k \qquad\text{or}\qquad \mu = \frac{1}{T}\, \lambda|_{[0,T]}. \]

The Fourier–Stieltjes transforms of these measures converge to $1_{\{0\}}$ if $n \to \infty$ and $T \to \infty$, respectively (cf. Lemma 5.3.1 and equation (5.3.3.1)).

Theorem 5.3.10 (ergodic theorem). Let $Z$ be a continuous stationary field on the set $T = \mathbb{R}^d$ or $T = \mathbb{Z}^d$ and let $\{\mu_n\}$ be a sequence of probability measures such that

\[ \lim_{n\to\infty} \check{\mu}_n(t) = 0, \qquad t \in T \setminus \{0\}. \]

Then

\[ \lim_{n\to\infty} \int_T Z(t)\, d\mu_n(t) = \zeta(\{0\}) = P_0 Z(s), \qquad s \in T, \]

where $\zeta$ denotes the representing measure of $Z$.

Proof. Since $|\check{\mu}_n| \le 1$, the sequence $\{\check{\mu}_n\}$ converges in $L^2(\sigma)$ to $1_{\{0\}}$, where $\sigma$ is the spectral measure of $Z$. By (2.9.13.1) and (2.9.14.1) we have $\int Z\, d\mu_n = I_Z(\check{\mu}_n)$. Using this and Theorem 5.3.8 we obtain

\[ \lim_{n\to\infty} \int_T Z(t)\, d\mu_n(t) = I_Z\bigl(1_{\{0\}}\bigr) = \zeta(\{0\}) = P_0 Z(s). \]

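Example 5.3.11. (i) Let $\mu_n$ be the Gaussian distribution on $\mathbb{R}^d$ with density

\[ x \mapsto \frac{1}{(2\pi n)^{d/2}}\, e^{-\frac{\|x\|^2}{2n}} \]

and characteristic function $t \mapsto \check{\mu}_n(t) = e^{-\frac{n}{2}\|t\|^2}$. By the previous theorem we have

\[ \lim_{n\to\infty} \frac{1}{(2\pi n)^{d/2}} \int_{\mathbb{R}^d} Z(t)\, e^{-\frac{\|t\|^2}{2n}}\, dt = \zeta(\{0\}). \]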

(ii) Let n be the uniform distribution on the cube Œ0; nd  Rd . The characteristic
function is then
Yd
1  e2itj n
t 7! :
2itj n
j D1

The sequence fn g satisfies the condition of the previous theorem.


(iii) Examples of sequences of distributions on $\mathbb{Z}^d$ satisfying the condition of Theorem 5.3.10 can be given by considering the uniform distribution on $\{1, \dots, n\}^d$ or on $\{-n, \dots, n\}^d$.
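The decay condition of Theorem 5.3.10 can be observed numerically for $d = 1$. The sketch below evaluates the Fourier–Stieltjes transform of the uniform distribution on $\{1, \dots, n\}$, taken here as $x \mapsto \frac{1}{n}\sum_{k=1}^{n} e^{ikx}$. The sign convention in the exponent is an assumption (it does not change the modulus), and the function name `uniform_transform` is hypothetical.

```python
import cmath


def uniform_transform(n: int, x: float) -> complex:
    """Transform of the uniform distribution on {1, ..., n} at the point x.

    Computes (1/n) * sum_{k=1}^{n} exp(i*k*x); the sign in the exponent is an
    assumed convention and has no influence on the modulus.
    """
    return sum(cmath.exp(1j * k * x) for k in range(1, n + 1)) / n


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        # Away from x = 0 the modulus shrinks roughly like 1/n,
        # while at x = 0 the value is identically 1.
        print(n, abs(uniform_transform(n, 1.0)), abs(uniform_transform(n, 0.0)))
```

The geometric sum gives the bound $2 / (n\,|e^{ix} - 1|)$ for the modulus, which explains the $1/n$ decay away from $x = 0$.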

5.4 Filtration of discrete stationary fields


First we prove a more detailed version of Theorem 2.9.19 where we replace  by a
square integrable function.

Theorem 5.4.1. Let $Z$ be a stationary field on $\mathbb{Z}^d$ with representing measure $\zeta_Z$ and spectral measure $\sigma_Z$. Further, let $a \in L^2(\mathbb{Z}^d)$ be such that the series

\[ \hat{a}(\cdot) := \sum_{n \in \mathbb{Z}^d} a(n)\, e^{-i(\cdot,n)} \tag{1} \]

converges in $L^2(\sigma_Z)$. Then the series

\[ a * Z(t) := \sum_{n \in \mathbb{Z}^d} a(n) Z(t - n), \qquad t \in \mathbb{Z}^d, \]

converges in $H(Z)$. The field $Y = a * Z$ is stationary with representing measure $\zeta_Y$ and spectral measure $\sigma_Y$ given by

\[ d\zeta_Y = \hat{a}\, d\zeta_Z \qquad\text{and}\qquad d\sigma_Y = |\hat{a}|^2\, d\sigma_Z. \]

If $\sigma_Z(\{x : \hat{a}(x) = 0\}) = 0$, then

\[ Z(t) = \int_{[0,2\pi)^d} e^{i(t,x)}\, \frac{1}{\hat{a}(x)}\, d\zeta_Y(x), \qquad t \in \mathbb{Z}^d, \]

i.e., the convolution $Z \mapsto a * Z$ is invertible.

Proof. Using the spectral representation of $Z$ we obtain

\[ \sum_{j :\, |j| \le n} a(j) Z(t - j) = \int_{[0,2\pi)^d} e^{i(t,x)} \sum_{j :\, |j| \le n} a(j)\, e^{-i(j,x)}\, d\zeta_Z(x). \]

The sum on the right-hand side converges in $L^2(\sigma_Z)$. Using the isometric operator $I_Z$ we see that the sum on the left-hand side converges in $H(Z)$. Taking the limit $n \to \infty$ we obtain the spectral representation of $Y$. The rest follows from Lemma 2.6.11.

Remark 5.4.2. If $\sigma_Z$ has a constant density, then the series (5.4.1.1) converges in $L^2(\sigma_Z)$ for all $a \in L^2(\mathbb{Z}^d)$, since the functions $e^{-i(\cdot,n)}$ have the same norm and form a complete orthogonal system in this Hilbert space.
The next theorem is important in signal processing.

Theorem 5.4.3. A stationary field $Y$ on $\mathbb{Z}^d$ admits the representation

\[ Y(t) = a * X(t) = \sum_{n \in \mathbb{Z}^d} a(n) X(t - n), \qquad t \in \mathbb{Z}^d, \]

with a white noise $X : N(0,1)$ on $\mathbb{Z}^d$ and $a \in L^2(\mathbb{Z}^d)$ if and only if the spectral measure of $Y$ is absolutely continuous.

Proof. Assume that $Y$ has the representation above. By Example 2.9.5 and Theorem 5.4.1, the spectral measure $\sigma$ of $Y$ is given by

\[ d\sigma = \frac{1}{(2\pi)^d}\, |\hat{a}|^2\, d\lambda_d, \]

i.e., $\sigma$ is absolutely continuous.
Now assume that $\sigma$ is absolutely continuous and let $C$ be the correlation function of $Y$. By Theorem 1.8.16 there exists a function $g \in L^2(\mathbb{Z}^d)$ such that

\[ C(t) = g * \tilde{g}(t) = \sum_{n \in \mathbb{Z}^d} g(t + n)\, \overline{g(n)}, \qquad t \in \mathbb{Z}^d, \]

and $d\sigma(x) = |\hat{g}(x)|^2\, d\lambda_d(x)$. Let

\[ Y(t) = \int_{[0,2\pi)^d} e^{i(t,x)}\, d\zeta(x) \]

be the spectral representation of $Y$ where $\zeta$ is a random orthogonal measure with structure measure $\sigma$. We define the random orthogonal measure $\zeta_h$ as in Lemma 2.6.11 by

\[ \zeta_h(A) := \int_A h\, d\zeta, \qquad A \in \mathcal{B}\bigl([0,2\pi)^d\bigr), \]

where

\[ h = \frac{1}{(2\pi)^{d/2}\, \hat{g}} \in L^2(\sigma) \]

(since $d\sigma(x) = |\hat{g}(x)|^2\, d\lambda_d(x)$ we may ignore the zeros of $\hat{g}$). In view of

\[ (\zeta_h(A), \zeta_h(B)) = \int_{[0,2\pi)^d} 1_A 1_B\, |h|^2 |\hat{g}|^2\, d\lambda_d = \frac{1}{(2\pi)^d} \int_{A \cap B} 1\, d\lambda_d \]

the structure measure of $\zeta_h$ is $\lambda_d / (2\pi)^d$. Consequently,

\[ X(t) = \int_{[0,2\pi)^d} e^{i(t,x)}\, d\zeta_h(x), \qquad t \in \mathbb{Z}^d, \]

defines a white noise $X : N(0,1)$ (see Example 2.9.5). By Lemma 2.6.11 we have

\[
\begin{aligned}
Y(t) &= \int e^{i(t,x)}\, d\zeta(x) = \int e^{i(t,x)} (2\pi)^{d/2}\, \hat{g}(x)\, d\zeta_h(x) \\
&= \int e^{i(t,x)} (2\pi)^{d/2} \sum_{n \in \mathbb{Z}^d} e^{-i(x,n)} g(n)\, d\zeta_h(x) \\
&= (2\pi)^{d/2} \sum_{n \in \mathbb{Z}^d} g(n) \int e^{i(t-n,x)}\, d\zeta_h(x) \\
&= (2\pi)^{d/2} \sum_{n \in \mathbb{Z}^d} g(n) X(t - n),
\end{aligned}
\]

completing the proof.
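For $d = 1$ and a filter with finitely many taps, the link between the filter coefficients and the spectral density in Theorem 5.4.3 can be checked numerically. The sketch below assumes the conventions $\hat{a}(x) = \sum_n a(n) e^{-inx}$ and $C(t) = \int e^{itx}\, d\sigma(x)$ with $d\sigma = (2\pi)^{-1} |\hat{a}|^2\, d\lambda_1$, and compares the integral with $\sum_n a(t+n)\overline{a(n)}$, the correlation function of $a * X$ for a unit-variance white noise $X$. All function names and the sample taps are hypothetical.

```python
import cmath
import math


def filter_autocorrelation(a: dict, t: int) -> complex:
    """C(t) = sum_n a(t + n) * conj(a(n)) for taps given as {index: value}."""
    return sum(a.get(t + n, 0) * complex(v).conjugate() for n, v in a.items())


def transfer(a: dict, x: float) -> complex:
    """a_hat(x) = sum_n a(n) * exp(-i*n*x) (assumed sign convention)."""
    return sum(v * cmath.exp(-1j * n * x) for n, v in a.items())


def spectral_integral(a: dict, t: int, grid: int = 256) -> complex:
    """(1/(2*pi)) * integral over [0, 2*pi) of exp(i*t*x) * |a_hat(x)|^2 dx.

    An equispaced Riemann sum is exact here because the integrand is a
    trigonometric polynomial of degree far below `grid`.
    """
    total = 0j
    for k in range(grid):
        x = 2 * math.pi * k / grid
        total += cmath.exp(1j * t * x) * abs(transfer(a, x)) ** 2
    return total / grid


if __name__ == "__main__":
    taps = {0: 1.0, 1: 0.5, 3: -0.25j}  # illustrative filter, not from the text
    for t in range(-3, 4):
        print(t, filter_autocorrelation(taps, t), spectral_integral(taps, t))
```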


Appendix A

Basic notation

A.1 Standard notation


The list below contains some standard notation that is used throughout the book.
$\mathbb{N} = \{1, 2, 3, \dots\}$
$\mathbb{N}_0 = \{0, 1, 2, \dots\}$
$\mathbb{Z} = \{0, \pm 1, \pm 2, \dots\}$
$\mathbb{Q}$    rational numbers
$\mathbb{R}$    real numbers
$\mathbb{C}$    complex numbers
$\mathbb{T}$    complex numbers with modulus one
$C^\infty(\mathbb{R}^d)$    the set of infinitely differentiable functions on $\mathbb{R}^d$
$C^\infty_{00}(\mathbb{R}^d)$    the set of infinitely differentiable functions on $\mathbb{R}^d$ with compact support
$\mathcal{L}^p(\mathbb{R}^d)$    the set of complex-valued Lebesgue measurable functions $f$ on $\mathbb{R}^d$ for which $|f|^p$ is Lebesgue integrable
$L^p(\mathbb{R}^d)$    $L^p$-space with respect to Lebesgue measure on $\mathbb{R}^d$
$L^p(\mathbb{Z}^d)$    $L^p$-space with respect to the counting measure on $\mathbb{Z}^d$
$\mathcal{L}^p_r,\ L^p_r$    the corresponding spaces of real-valued functions
$x_+ = \max(0, x)$, $x \in \mathbb{R}$
$\exp(z) = e^z$, $z \in \mathbb{C}$
$1_A$    indicator function of a set $A$
$\delta_x$    one-point or Dirac measure concentrated at $x$
$\|\cdot\|_p$    $L^p$-norm
$\tilde{f}(x) = \overline{f(-x)}$
span    linear span
$f(x + 0)$    right-hand limit
$f(x - 0)$    left-hand limit
$E(X)$    expectation of the random variable or random vector $X$
$\operatorname{cov}(X)$    covariance of the random variable or random vector $X$

Let $S$ be a topological space. By $C(S)$ we denote the set of continuous complex-valued functions on $S$. The symbol $C_{00}(S)$ denotes the set of continuous functions with compact support while $C_0(S)$ is the set of continuous functions vanishing at infinity.$^1$

1 Recall that a continuous function $f$ on $S$ vanishes at infinity if and only if for all $\varepsilon > 0$ there exists a compact set $K \subset S$ such that $|f(x)| < \varepsilon$, $x \in S \setminus K$.

A.2 Multidimensional notation

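If $x \in \mathbb{R}^d$ ($\mathbb{C}^d$, $\mathbb{N}_0^d$, etc.), then $x_i$, $1 \le i \le d$, denotes the $i$-th coordinate of $x$ and we write $x$ as $x = (x_1, \dots, x_d)$. In expressions involving matrix operations, e.g., in $Ax$ where $A$ is a matrix, we consider $x$ as a column vector. The zero element of $\mathbb{R}^d$ will also be denoted by $0$. The standard basis of $\mathbb{R}^d$ or $\mathbb{Z}^d$ consists of elements $e_j \in \mathbb{R}^d$, $1 \le j \le d$, such that the $j$-th coordinate of $e_j$ is equal to 1 and the other coordinates are zero.
For $\alpha, \beta \in \mathbb{R}^d$ we write $\beta \le \alpha$ if $\beta_j \le \alpha_j$ for all $j$. The relation $\alpha < \beta$ holds if and only if $\alpha \le \beta$ and $\alpha \neq \beta$.
Let $x \in \mathbb{R}^d$ or $\mathbb{C}^d$ and let $\alpha, \beta \in \mathbb{N}_0^d$. Then we write

\[
\begin{aligned}
x^\alpha &= x_1^{\alpha_1} \cdot x_2^{\alpha_2} \cdot \ldots \cdot x_d^{\alpha_d}, \\
\|x\| &= \sqrt{|x_1|^2 + \dots + |x_d|^2}, \\
|x| &= |x_1| + \dots + |x_d|, \\
\alpha! &= \alpha_1! \cdot \ldots \cdot \alpha_d!, \\
\binom{\alpha}{\beta} &= \frac{\alpha!}{(\alpha - \beta)! \cdot \beta!} = \binom{\alpha_1}{\beta_1} \cdot \ldots \cdot \binom{\alpha_d}{\beta_d}, \qquad \beta \le \alpha, \\
D^\alpha &= \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}}.
\end{aligned}
\]

When $\alpha = (0, \dots, 0)$ then $x^\alpha = 1$. Note that $D^\alpha D^\beta g = D^{\alpha+\beta} g$ where $g$ denotes an arbitrary real- or complex-valued function on $\mathbb{R}^d$ for which the partial derivative $D^{\alpha+\beta} g$ exists. By definition, $D^\alpha g = g$ when $\alpha = (0, \dots, 0)$. We also use the notation

\[ g_{x_i} := \frac{\partial g}{\partial x_i} \qquad\text{or}\qquad g_{x_i, x_j} := \frac{\partial^2 g}{\partial x_i\, \partial x_j}. \]

By $dx$, $d\lambda(x)$ or $d\lambda_d(x)$ we denote integration with respect to the Lebesgue measure $\lambda = \lambda_d$ on $\mathbb{R}^d$, while $(x, y)$ is the inner product of $x, y \in \mathbb{R}^d$ or $\mathbb{C}^d$.
For $r \ge 0$ the open and closed balls with radius $r$ and center $0 \in \mathbb{R}^d$ are denoted by $B^o(r) = B_d^o(r)$ and $B^c(r) = B_d^c(r)$, respectively. That is,

\[ B_d^o(r) = \{t \in \mathbb{R}^d : \|t\| < r\} \]

and

\[ B_d^c(r) = \{t \in \mathbb{R}^d : \|t\| \le r\}. \]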
Appendix B

Basic analysis

B.1 Miscellaneous results from classical analysis


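Lemma B.1.1. The inequality

\[ \left| \sum_{k=1}^{n} \frac{z^k - w^k}{k!} \right| \le |z - w| \cdot e^{\max(|z|,|w|)} \]

holds for all $z, w \in \mathbb{C}$ and $n \in \mathbb{N}$.

Proof. Using the identity

\[ z^k - w^k = (z - w) \cdot \sum_{s=0}^{k-1} z^s w^{k-1-s}, \qquad k \in \mathbb{N}, \]

we obtain

\[
\begin{aligned}
\left| \sum_{k=1}^{n} \frac{z^k - w^k}{k!} \right|
&\le |z - w| \cdot \sum_{k=1}^{n} \frac{1}{k!} \sum_{s=0}^{k-1} |z|^s |w|^{k-1-s} \\
&\le |z - w| \cdot \sum_{k=1}^{n} \frac{1}{k!} \cdot k \cdot \max(|z|,|w|)^{k-1} \\
&= |z - w| \cdot \sum_{k=0}^{n-1} \frac{1}{k!} \max(|z|,|w|)^{k} \\
&\le |z - w| \cdot e^{\max(|z|,|w|)}.
\end{aligned}
\]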

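Lemma B.1.2. Let $\{z_n\}_1^\infty$ be a sequence of complex numbers tending to some complex number $z$. Then

\[ \lim_{n\to\infty} \left(1 + \frac{z_n}{n}\right)^n = e^z. \]

Proof. Setting

\[ d_n = \sum_{k=0}^{n} \frac{z^k}{k!} - \left(1 + \frac{z_n}{n}\right)^n, \qquad n \in \mathbb{N}, \]

it suffices to prove that $\lim_{n\to\infty} d_n = 0$. We have

\[
\begin{aligned}
|d_n| &= \left| \sum_{k=1}^{n} \left( \frac{z^k}{k!} - \frac{n(n-1)\cdots(n-k+1)}{k!} \cdot \frac{z_n^k}{n^k} \right) \right| \\
&\le \sum_{k=1}^{n} \frac{1}{k!} \left| z^k - z_n^k \prod_{j=1}^{k-1} \left(1 - \frac{j}{n}\right) \right| \\
&\le \sum_{k=1}^{n} \frac{\left| z^k - z_n^k \right|}{k!} + \sum_{k=1}^{n} \frac{|z_n|^k}{k!} \left( 1 - \prod_{j=1}^{k-1} \left(1 - \frac{j}{n}\right) \right).
\end{aligned}
\]

By Lemma B.1.1, the first sum in the last display tends to zero if $n \to \infty$. Denote by $s_n$ the second sum. Applying Bernoulli's inequality$^1$ we obtain

\[ |s_n| \le \sum_{k=1}^{n} \frac{|z_n|^k}{k!} \cdot \frac{(k-1)k}{2n} \le \frac{|z_n|^2}{2n} \cdot e^{|z_n|} \to 0 \qquad (n \to \infty). \]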
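The convergence in Lemma B.1.2 is easy to observe numerically. The sketch below uses the illustrative sequence $z_n = z + 1/n$, which is an assumption made only for this example; the helper names are hypothetical.

```python
import cmath


def compound_power(z_n: complex, n: int) -> complex:
    """Return (1 + z_n / n) ** n."""
    return (1 + z_n / n) ** n


def limit_error(z: complex, n: int) -> float:
    """|(1 + z_n/n)^n - e^z| for the illustrative sequence z_n = z + 1/n."""
    return abs(compound_power(z + 1 / n, n) - cmath.exp(z))


if __name__ == "__main__":
    z = 1 + 1j
    for n in (10, 1000, 10**6):
        # The error shrinks roughly like 1/n for this choice of z_n.
        print(n, limit_error(z, n))
```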

Proposition B.1.3 (binomial series). For all $\alpha > 0$ and $s \in (-1, 1)$ we have

\[ \frac{1}{(1 + s)^\alpha} = \sum_{k=0}^{\infty} (-1)^k\, \frac{\Gamma(k + \alpha)}{\Gamma(k + 1)\,\Gamma(\alpha)} \cdot s^k \tag{1} \]

where the series is absolutely convergent.

Proof. Denote by $a_k$ the coefficient of $s^k$. Using the relation

\[ \Gamma(x + 1) = x\,\Gamma(x) \tag{2} \]

(cf. Lemma C.4.2) we see that $\lim_{k\to\infty} |a_{k+1}/a_k| = 1$. Thus, the radius of convergence of the power series above is equal to 1. Applying again (2), the relation $\Gamma(k + 1) = k!$, and using induction on $k$ it is easy to check that (1) is the Taylor series expansion of $s \mapsto \frac{1}{(1+s)^\alpha}$ at $s = 0$. This completes the proof.
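A minimal numerical sketch of the binomial series (1): the coefficients are generated by the recurrence $a_{k+1} = -a_k (k + \alpha)/(k + 1)$, which follows from $\Gamma(x+1) = x\Gamma(x)$, and the partial sum is compared with $(1+s)^{-\alpha}$. The function names and the truncation at 200 terms are choices made for this illustration.

```python
import math


def binomial_series(alpha: float, s: float, terms: int = 200) -> float:
    """Partial sum of sum_k (-1)^k Gamma(k+alpha) / (Gamma(k+1) Gamma(alpha)) s^k."""
    total, coeff = 0.0, 1.0  # coeff holds a_k, starting with a_0 = 1
    for k in range(terms):
        total += coeff * s ** k
        # a_{k+1} = -a_k * (k + alpha) / (k + 1), from Gamma(x + 1) = x * Gamma(x)
        coeff *= -(k + alpha) / (k + 1)
    return total


def gamma_coefficient(alpha: float, k: int) -> float:
    """Coefficient of s^k written directly with the Gamma function."""
    return (-1) ** k * math.gamma(k + alpha) / (math.gamma(k + 1) * math.gamma(alpha))


if __name__ == "__main__":
    print(binomial_series(2.5, 0.5), 1.5 ** -2.5)
```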

Theorem B.1.4 (Fejér–Riesz). Let $p$ be a function of the form

\[ p(z) = \sum_{n=-N}^{N} c_n z^n, \qquad c_n \in \mathbb{C},\ z \in \mathbb{C} \setminus \{0\}. \]

If $p(z) \ge 0$ for all $z \in \mathbb{T}$, then there exists a polynomial $q$ of the form

\[ q(z) = \sum_{n=0}^{N} b_n z^n, \qquad z, b_n \in \mathbb{C}, \]

such that

\[ p(z) = q(z)\, \overline{q(1/\bar{z})}, \qquad z \in \mathbb{C} \setminus \{0\}. \]

In particular, $p(z) = |q(z)|^2$ holds for all $z \in \mathbb{T}$.

1 Bernoulli's inequality states that
\[ (1 + x_1)(1 + x_2) \cdots (1 + x_n) \ge 1 + x_1 + x_2 + \dots + x_n \]
where $x_j \in (-1, 0]$ for all $j$ or $x_j \in [0, \infty)$ for all $j$. This inequality can easily be proved by induction on $n$.

Proof. Without loss of generality assume that $c_N \neq 0$. Since $p$ is real-valued on $\mathbb{T}$ we have

\[ 0 = p(z) - \overline{p(z)} = \sum_{n=-N}^{N} (c_n - \overline{c_{-n}})\, z^n, \qquad z \in \mathbb{T}, \]

showing that $c_{-n} = \overline{c_n}$, $n = 0, 1, \dots, N$. Using this we see that the polynomial $g$ given by

\[ g(z) := z^N p(z) = \overline{c_N} + \dots + c_0 z^N + \dots + c_N z^{2N}, \qquad z \in \mathbb{C}, \]

satisfies the equation

\[ g(z) = z^{2N}\, \overline{g(1/\bar{z})}, \qquad z \in \mathbb{C} \setminus \{0\}. \tag{1} \]

Denote by $z_1, \dots, z_{2N}$ the zeros of $g$ counted according to their multiplicities. Since $c_N \neq 0$, all these zeros are different from 0. It follows from (1) that $1/\bar{z}_j$ is a zero of $g$ having the same multiplicity as $z_j$.
Next we show that if $z_j = e^{it_j} \in \mathbb{T}$, then $z_j$ has even multiplicity. The function $\varphi(t) := p(e^{it})$, $t \in \mathbb{R}$, is analytic and nonnegative on $\mathbb{R}$. Moreover, $t_j$ is a zero of $\varphi$. It suffices to prove that the multiplicity of $t_j$ is even but this is a consequence of the fact that the function $\varphi$, being nonnegative, has a local minimum at $t_j$. It follows from the statements concerning the zeros of $g$ that we can, if necessary, rearrange these zeros $z_j$ such that $z_1, \dots, z_N, 1/\bar{z}_1, \dots, 1/\bar{z}_N$ represent all the $2N$ zeros of $g$. We then have

\[ g(z) = c_N \prod_{j=1}^{N} (z - z_j) \prod_{j=1}^{N} (z - 1/\bar{z}_j) \]

and hence

\[
\begin{aligned}
p(z) = z^{-N} g(z) &= c_N \prod_{j=1}^{N} (z - z_j) \prod_{j=1}^{N} \left(1 - \frac{1}{z \bar{z}_j}\right) \\
&= c \prod_{j=1}^{N} (z - z_j) \prod_{j=1}^{N} (1/z - \bar{z}_j)
\end{aligned}
\tag{2}
\]

where $c = c_N \prod_{j=1}^{N} (-\bar{z}_j)^{-1}$. Since $p$ is nonnegative on $\mathbb{T}$ the constant $c$ is positive. Setting $q(z) := \sqrt{c}\, \prod_{j=1}^{N} (z - z_j)$ and applying (2) we obtain

\[ p(z) = q(z)\, \overline{q(1/\bar{z})}, \qquad z \in \mathbb{C} \setminus \{0\}. \]

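Theorem B.1.5. The inequality

\[
\left| e^{iz} - 1 - iz - \dots - \frac{(iz)^n}{n!} \right| \le
\begin{cases}
\dfrac{|z|^{n+1}}{(n+1)!}, & \text{if } \operatorname{Im} z \ge 0, \\[2ex]
\dfrac{|z|^{n+1}}{(n+1)!} \cdot e^{|z|}, & \text{if } \operatorname{Im} z \le 0,
\end{cases}
\]

holds for all $n \in \mathbb{N}_0$ and $z \in \mathbb{C}$.

Proof. Assume first that $n = 0$ and let $z = a + ib$, $a, b \in \mathbb{R}$, be arbitrary. Using the identity

\[ e^{iz} - 1 = iz \int_0^1 e^{itz}\, dt \]

we obtain

\[
\begin{aligned}
\left| e^{iz} - 1 \right| &= |z| \left| \int_0^1 e^{itz}\, dt \right| \le |z| \int_0^1 \left| e^{itz} \right| dt \\
&= |z| \int_0^1 \left| e^{ita - tb} \right| dt = |z| \int_0^1 e^{-tb}\, dt \le |z| \cdot
\begin{cases}
1, & \text{if } b \ge 0, \\
e^{|z|}, & \text{if } b \le 0.
\end{cases}
\end{aligned}
\]

Setting

\[ r_n(z) := e^{iz} - 1 - iz - \dots - \frac{(iz)^n}{n!}, \qquad n \in \mathbb{N}_0, \]

it is not hard to check that

\[ r_{n+1}(z) = iz \int_0^1 r_n(tz)\, dt. \]

Using this equation, the theorem follows by induction on $n$.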
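The two cases of Theorem B.1.5 can be spot-checked numerically. The sketch below evaluates the Taylor remainder of $e^{iz}$ and the corresponding bound, choosing the first bound when $\operatorname{Im} z \ge 0$ and the second one otherwise; the function names and sample points are hypothetical.

```python
import cmath
import math


def exp_remainder(z: complex, n: int) -> float:
    """|e^{iz} - sum_{k=0}^{n} (iz)^k / k!|."""
    partial = sum((1j * z) ** k / math.factorial(k) for k in range(n + 1))
    return abs(cmath.exp(1j * z) - partial)


def remainder_bound(z: complex, n: int) -> float:
    """Bound from Theorem B.1.5; the factor e^{|z|} is used only when Im z < 0."""
    bound = abs(z) ** (n + 1) / math.factorial(n + 1)
    return bound if z.imag >= 0 else bound * math.exp(abs(z))


if __name__ == "__main__":
    for z in (2 + 1j, 2 - 1j):
        for n in range(4):
            print(z, n, exp_remainder(z, n), remainder_bound(z, n))
```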

B.1.6. Here we collect some simple but useful inequalities for trigonometric functions. Integrating the inequalities

\[ -1 \le \cos x \le 1 \qquad\text{and}\qquad -1 \le \sin x \le 1, \qquad x \in \mathbb{R}, \]

several times we find

\[ |\sin x| < |x|, \qquad x \neq 0, \tag{1} \]
\[ 1 - \frac{1}{2} x^2 < \cos x < 1 - \frac{1}{2} x^2 + \frac{1}{24} x^4, \qquad x \neq 0, \tag{2} \]
\[ x - \frac{1}{6} x^3 < \sin x < x - \frac{1}{6} x^3 + \frac{1}{120} x^5, \qquad x > 0. \tag{3} \]

The inequality

\[ \sin x \le x \cdot \frac{\sin y}{y}, \qquad 0 < y \le \frac{\pi}{2},\ x \ge y, \tag{4} \]

can be proved easily, e.g., by using geometrical arguments. Setting here $y = 1$ and applying (3) we conclude that

\[ \frac{\sin x}{x} \le \frac{\sin 1}{1} < \frac{101}{120}, \qquad |x| \ge 1. \tag{5} \]

Next we show that

\[ h(t) := n \cos^n t - \cos nt \le n - 1, \qquad t \in \mathbb{R},\ n \in \mathbb{N}. \tag{6} \]

The function $h$ is periodic, so that it attains its maximum at some $y \in \mathbb{R}$. Since

\[ 0 = h'(y) = -n^2 \cos^{n-1} y \sin y + n \sin ny \]

either $\sin y = 0$ or

\[ \sin y \neq 0 \qquad\text{and}\qquad n \cos^{n-1} y = \frac{\sin ny}{\sin y}. \tag{7} \]

It is easy to check that $\sin y = 0$ implies (6). If (7) holds, then

\[ h(y) = \frac{\cos y \sin ny}{\sin y} - \cos ny = \frac{\cos y \sin ny - \sin y \cos ny}{\sin y} = \frac{\sin (n-1)y}{\sin y}. \]

A simple induction argument shows that

\[ \frac{\sin (n-1)y}{\sin y} \le n - 1, \qquad n \in \mathbb{N}, \]

and therefore $h(y) \le n - 1$.

The function

\[ \operatorname{Si}(T) := \int_0^T \frac{\sin t}{t}\, dt, \qquad T \in \mathbb{R}, \tag{8} \]

is frequently used (see Figure B.1). We will need its limit at infinity.

[Figure B.1. The function Si from equation (B.1.6.8).]

Theorem B.1.7. We have

\[ \lim_{T\to\infty} \operatorname{Si}(T) = \frac{\pi}{2}. \]

Proof. We have

\[ \operatorname{Si}(T) = \int_0^T \sin t \int_0^\infty e^{-ty}\, dy\, dt = \int_0^\infty \int_0^T \sin t\; e^{-ty}\, dt\, dy. \]

Applying integration by parts twice gives

\[ \operatorname{Si}(T) = \int_0^\infty \left( \frac{1}{1 + y^2} - \frac{y \sin T + \cos T}{1 + y^2}\, e^{-Ty} \right) dy \]

and therefore

\[ \lim_{T\to\infty} \operatorname{Si}(T) = \int_0^\infty \frac{1}{1 + y^2}\, dy = \arctan y\, \Big|_0^\infty = \frac{\pi}{2}. \]
From Theorem B.1.7 we easily obtain the following corollary.

Corollary B.1.8.

\[ \frac{1}{\pi} \lim_{T\to\infty} \int_{-T}^{T} \frac{\sin xt}{t}\, dt = \operatorname{sign} x, \qquad x \in \mathbb{R}. \]

Theorem B.1.9. If $-\infty < a \le b < \infty$ then

\[ \left| \int_a^b \frac{\sin t}{t}\, dt \right| < 2\pi. \]

Proof. Define the real numbers $c_n$, $n = 0, 1, 2, \dots$, by

\[ c_n := \int_{n\pi}^{(n+1)\pi} \frac{\sin t}{t}\, dt = (-1)^n \int_0^\pi \frac{\sin t}{n\pi + t}\, dt. \]

Then $|c_n| \ge |c_{n+1}|$. From this we conclude that the inequalities

\[ 0 \le \int_0^T \frac{\sin t}{t}\, dt \le c_0 < \pi \]

hold for all $T \ge 0$, from which the theorem follows.

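Lemma B.1.10. We have

\[ \int_0^\infty e^{-x^2}\, dx = \frac{\sqrt{\pi}}{2}. \]

Proof. Denote by $I$ the integral above. Using Fubini's theorem and the substitution $y = xs$ we obtain

\[
\begin{aligned}
I^2 &= \int_0^\infty e^{-x^2}\, dx \cdot \int_0^\infty e^{-y^2}\, dy
= \int_0^\infty \int_0^\infty e^{-(x^2 + y^2)}\, dy\, dx \\
&= \int_0^\infty \int_0^\infty x \cdot e^{-x^2 (1 + s^2)}\, ds\, dx
= \int_0^\infty \left[ -\frac{1}{2(1 + s^2)}\, e^{-x^2 (1 + s^2)} \right]_{x=0}^{\infty} ds \\
&= \frac{1}{2} \int_0^\infty \frac{1}{1 + s^2}\, ds = \frac{1}{2} \arctan s\, \Big|_0^\infty = \frac{\pi}{4}.
\end{aligned}
\]

Using an obvious substitution we obtain the following corollary.

Corollary B.1.11. For every $a, \sigma \in \mathbb{R}$, $\sigma > 0$, the function

\[ \varphi_{a,\sigma}(x) = \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-\frac{(x - a)^2}{2\sigma^2}}, \qquad x \in \mathbb{R}, \]

is a probability density (see Figure B.2).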

Lemma B.1.12. For all $x > 0$ and $a > -x$ we have

\[ \int_0^\infty \frac{e^{-xt} - e^{-(x+a)t}}{t}\, dt = \log\left(1 + \frac{a}{x}\right). \]

Proof. Denote by $f(x)$ and $g(x)$ the left- and right-hand side of the equation above. We have

\[ \lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = 0 \]

and $f'(x) = g'(x)$.$^2$ Consequently, $f(x) = g(x)$ for all $x > 0$.

[Figure B.2. The function $\varphi_{0,\sigma}$ from Corollary B.1.11, plotted for $\sigma = 0.3$ and $\sigma = 1$.]

B.1.13. The numbers

\[ \binom{m}{k_1, k_2, \dots, k_n} := \frac{m!}{k_1!\, k_2! \cdots k_n!} \]

are called multinomial coefficients. They satisfy the multinomial formula

\[ (x_1 + \dots + x_n)^m = \sum_{|k| = m} \binom{m}{k_1, k_2, \dots, k_n}\, x^k, \qquad n \in \mathbb{N},\ m \in \mathbb{N}_0, \]

where $k \in \mathbb{N}_0^n$.$^3$ This formula can be proved by induction on $n$, applying the binomial formula to the identity

\[ (x_1 + x_2 + \dots + x_n + x_{n+1})^m = (x_1 + x_2 + \dots + [x_n + x_{n+1}])^m. \]

It follows from the multinomial formula that the multinomial coefficients have a combinatorial interpretation as the number of partitions$^4$ of a set with $m$ elements in $n$ blocks, with $k_1$ elements in the first block, $k_2$ elements in the second block, and so on.

2 Differentiating under the integral sign can be avoided by replacing $\frac{1}{t}$ by $\int_0^\infty e^{-st}\, ds$ and then using Fubini's theorem.
3 Expressions of the form $0^0$ are defined to be equal to 1.
4 A partition of a set $S$ is a collection of nonempty pairwise disjoint subsets of $S$ whose union is $S$. The subsets are also called blocks.

The multinomial coefficients also occur in the identity

\[ \frac{d^m}{dt^m} \prod_{i=1}^{n} f_i(t) = \sum_{|k| = m} \binom{m}{k_1, \dots, k_n} \prod_{i=1}^{n} f_i^{(k_i)}(t) \tag{1} \]

called the generalized Leibniz rule. This identity can be proved by induction on $n$ showing first that

\[ (fg)^{(m)} = \sum_{k=0}^{m} \binom{m}{k} f^{(k)} g^{(m-k)}. \]
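The multinomial formula from B.1.13 can be verified for small cases by brute force. The sketch below computes the multinomial coefficient and sums the right-hand side over all $k \in \mathbb{N}_0^n$ with $|k| = m$; the function names are hypothetical. Python evaluates `0 ** 0` as 1, in line with the convention $0^0 = 1$ used for the formula.

```python
from itertools import product
from math import factorial, prod


def multinomial(ks) -> int:
    """m! / (k_1! * ... * k_n!) with m = k_1 + ... + k_n."""
    result = factorial(sum(ks))
    for k in ks:
        result //= factorial(k)  # every intermediate quotient is an integer
    return result


def multinomial_expansion(xs, m: int):
    """Right-hand side of the multinomial formula for (x_1 + ... + x_n)^m."""
    total = 0
    for ks in product(range(m + 1), repeat=len(xs)):
        if sum(ks) == m:
            total += multinomial(ks) * prod(x ** k for x, k in zip(xs, ks))
    return total


if __name__ == "__main__":
    print(multinomial((2, 1, 1)))                 # number of ordered splits of 4 items
    print(multinomial_expansion((1, 2, 3), 4), (1 + 2 + 3) ** 4)
```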

Next we prove Faà di Bruno’s formula for the higher chain rule for differentiation.5

Theorem B.1.14 (Faà di Bruno's formula). If $f$ and $g$ are real functions with a sufficient number of derivatives, then

\[ \frac{d^m}{dt^m}\, g(f(t)) = \sum \frac{m!}{b_1!\, b_2! \cdots b_m!}\; g^{(|b|)}(f(t)) \left( \frac{f'(t)}{1!} \right)^{b_1} \left( \frac{f''(t)}{2!} \right)^{b_2} \cdots \left( \frac{f^{(m)}(t)}{m!} \right)^{b_m} \tag{1} \]

where the sum is over all different solutions in nonnegative integers $b_1, \dots, b_m$ of $b_1 + 2b_2 + \dots + m b_m = m$, and $|b| = b_1 + \dots + b_m$.

Proof. We prove that

\[ \frac{d^m}{dt^m}\, g(f(t)) = \sum g^{(k)}(f(t))\, (f'(t))^{b_1} (f''(t))^{b_2} \cdots (f^{(m)}(t))^{b_m} \tag{2} \]

where the sum is over all partitions of $S_m = \{1, 2, \dots, m\}$, and, for each partition, $k$ is its number of blocks and $b_i$ is the number of blocks with exactly $i$ elements. The number of partitions of $S_m$ into $b_1$ blocks with 1 element, $b_2$ blocks with 2 elements, etc. would be (cf. B.1.13) the multinomial coefficient

\[ \binom{m}{\underbrace{1, \dots, 1}_{b_1\ 1\text{'s}},\ \underbrace{2, \dots, 2}_{b_2\ 2\text{'s}},\ \underbrace{3, \dots, 3}_{b_3\ 3\text{'s}},\ \dots} \]

except that this makes artificial distinctions among the $i$-blocks for each $i$. Therefore, we have to divide this number by $b_1!\, b_2! \cdots b_m!$. Thus, (1) follows from equation (2).
We prove (2) by induction on $m$. The case $m = 1$ being simple, assume that the statement is true for some $m$. Every partition of $S_{m+1}$ can be obtained in a unique way by adjoining $m + 1$ to a partition of $S_m$.
If we add $\{m + 1\}$ as a new singleton block, we increase the number of blocks of size 1 by one, and the total number of blocks by one. This corresponds to applying $d/dt$ to $g^{(k)}(f(t))$ to get $g^{(k+1)}(f(t))\, f'(t)$.
If we add $m + 1$ to an existing block of size $i$, then the number of such blocks decreases by one, the number of blocks of size $i + 1$ increases by one, and the total number of blocks remains the same. This corresponds to applying $d/dt$ to $(f^{(i)}(t))^{b_i}$, to get $b_i\, (f^{(i)}(t))^{b_i - 1} f^{(i+1)}(t)$.

5 Our proof is taken from the survey paper [32] which contains interesting remarks on the history of this formula.
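The combinatorial count behind formula (1) can be checked directly: for a block profile $(b_1, \dots, b_m)$ the number of partitions of $S_m$ is $m! / \prod_i b_i!\,(i!)^{b_i}$, and summing over all profiles must give the total number of partitions of $S_m$ (the Bell number). The sketch below enumerates the profiles; the function names are hypothetical.

```python
from math import factorial


def block_profiles(m: int):
    """Yield all tuples (b_1, ..., b_m) of nonnegative integers with sum(i * b_i) = m."""
    def rec(i, remaining):
        if i > m:
            if remaining == 0:
                yield ()
            return
        for b in range(remaining // i + 1):
            for rest in rec(i + 1, remaining - i * b):
                yield (b,) + rest
    yield from rec(1, m)


def partition_count(b) -> int:
    """Number of set partitions with b[i-1] blocks of size i (coefficient in (1))."""
    m = sum(i * bi for i, bi in enumerate(b, start=1))
    denominator = 1
    for i, bi in enumerate(b, start=1):
        denominator *= factorial(bi) * factorial(i) ** bi
    return factorial(m) // denominator


if __name__ == "__main__":
    for m in range(1, 7):
        # Totals are the Bell numbers 1, 2, 5, 15, 52, 203.
        print(m, sum(partition_count(b) for b in block_profiles(m)))
```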

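Lemma B.1.15. Let $a_k \ge 0$, $k \in \mathbb{N}$, and denote by $R$ the radius of convergence of the power series

\[ S(z) = \sum_{k=1}^{\infty} a_k z^k, \qquad z \in \mathbb{C}. \]

If $0 \le R < \infty$, then $S(R + \varepsilon) = \infty$ for all $\varepsilon > 0$.

Proof. Indeed, if $S(R + \varepsilon)$ were finite for some positive $\varepsilon$, then we would have

\[ |S(z)| \le \sum_{k=1}^{\infty} a_k (R + \varepsilon)^k < \infty \]

whenever $|z| \le R + \varepsilon$, i.e., the radius of convergence would be larger than $R$.

Lemma B.1.16. Let $\{a_k\}_1^\infty$ be a sequence of nonnegative real numbers. There exists a sequence $\{p_k\}_1^\infty$ of positive real numbers such that

\[ \sum_{k=1}^{\infty} p_k a_k^n < \infty \]

for all $n \in \mathbb{N}$.

Proof. Setting $b_k := \max(a_k, k)$ we have $a_k \le b_k$ and $b_k \to \infty$. Let $p_k = e^{-b_k}$. Then for each $n \in \mathbb{N}$ there exists $N(n) \in \mathbb{N}$ such that

\[ p_k b_k^n \le e^{-b_k/2} \le e^{-k/2} \]

if $k \ge N(n)$. Thus,

\[ \sum_{k=1}^{\infty} p_k a_k^n \le \sum_{k=1}^{\infty} p_k b_k^n < \infty \]

for all $n \in \mathbb{N}$.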

Lemma B.1.17. Let $x \in \mathbb{R}^d$ and let $\{x_n\}$ be a sequence in $\mathbb{R}^d$ such that

\[ \lim_{n\to\infty} e^{i(t,x_n)} = e^{i(t,x)} \]

for all $t \in \mathbb{R}^d$. Then $\lim_{n\to\infty} x_n = x$.$^6$

6 Note that for the divergent sequence $\{x_n\}$ in $\mathbb{R}$ with $x_n = 2\pi\, n!$ the relation above holds for all rational $t$.

Proof. We may assume that $x = 0$ and $d = 1$ so that

\[ \lim_{n\to\infty} e^{itx_n} = 1 \tag{1} \]

for all $t$. Assume first that the sequence $\{x_n\}$ is bounded. If $s$ is an accumulation point of this sequence then, by equation (1), $e^{its} = 1$ for all $t$ and hence $s = 0$. From this we conclude that $\{x_n\}$ tends to 0.
Next assume that $\{x_n\}$ is unbounded and choose a subsequence $\{y_n\}$ tending to $\infty$ or to $-\infty$. Using equation (1) we see that

\[ 1 = \int_0^1 \lim_{n\to\infty} e^{ity_n}\, dt = \lim_{n\to\infty} \int_0^1 e^{ity_n}\, dt = \lim_{n\to\infty} \frac{e^{iy_n} - 1}{iy_n} = 0. \]

Thus, the sequence $\{x_n\}$ cannot be unbounded.

Lemma B.1.18. For all t; a  0 we have


X . t /k
lim e t
D 1Œ0;a :
!1 kŠ
0k a

Proof. Let X be a Poisson random variable with parameter t . Then


X . t /k
P .X  a/ D e t
:

0k a

Thus, we have to prove that


lim P .X  a/ D 1Œ0;a :
!1

We have E .X / D Var .X / D t (cf. Example 1.2.4). If t > a, then Chebyshev’s


inequality (cf. (F.1.6)) shows that
t
P .X  a/  P .jX  t j  .t  a//  ! 0; ! 1:
.t  a/2
If t  a, we find in the same way
P .X  a/ D 1  P .X  t > .a  t //
 1  P .jX  t j > .a  t // ! 1; ! 1:

Example B.1.19. Let $h$ be the 2-periodic function on $\mathbb{R}$ with

\[ h(x) = 1 - |x|, \qquad |x| \le 1. \]

We show that the function $f$ defined by (see Figure B.3)

\[ f(x) = \frac{1}{4} \sum_{n=0}^{\infty} \left( \frac{3}{4} \right)^n h(4^n x) \]

is nowhere differentiable.$^7$

[Figure B.3. The 5th partial sum of the series from Example B.1.19.]

Let $x \in \mathbb{R}$, $m \in \mathbb{N}$, and

\[ \varepsilon_m := \pm \frac{1}{2 \cdot 4^m} \]

where the sign is chosen in such a way that there is no integer lying between $4^m x$ and $4^m (x + \varepsilon_m)$. Write

\[ d_{nm} := \frac{h(4^n (x + \varepsilon_m)) - h(4^n x)}{\varepsilon_m}. \]

Using the obvious inequality $|h(s) - h(t)| \le |s - t|$, we see that $|d_{nm}| \le 4^n$ for all $n, m$. Since there is no integer between $4^m x$ and $4^m (x + \varepsilon_m)$, we conclude that $|d_{mm}| = 4^m$. If $n > m$, then $4^n \varepsilon_m$ is an even integer and hence $d_{nm} = 0$. Therefore,

\[ 4 \cdot \left| \frac{f(x + \varepsilon_m) - f(x)}{\varepsilon_m} \right| = \left| \sum_{n=0}^{m} \left( \frac{3}{4} \right)^n d_{nm} \right| \ge 3^m - \sum_{n=0}^{m-1} 3^n = \frac{1}{2} (3^m + 1). \]

Since $\lim_{m\to\infty} \varepsilon_m = 0$ we see that $f$ is not differentiable at $x$.

7 Note that $h$ is equal to the function $\tilde{f}_{1/2}$ from Remark 4.2.4. Therefore $h$, and hence also $f$, are positive definite. Bochner's Theorem 1.7.3 shows that they are characteristic functions.
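The growth of the difference quotients in Example B.1.19 can be reproduced numerically. The sketch below implements the tent function $h$, picks the sign of $\varepsilon_m$ so that no integer lies strictly between $4^m x$ and $4^m(x + \varepsilon_m)$, and evaluates the quotient with the partial sum up to $n = m$ (the terms with $n > m$ cancel, as shown in the example). Function names and sample points are hypothetical.

```python
import math


def tent(x: float) -> float:
    """2-periodic function with h(x) = 1 - |x| for |x| <= 1."""
    r = (x + 1.0) % 2.0 - 1.0  # representative of x in [-1, 1)
    return 1.0 - abs(r)


def f_partial(x: float, terms: int) -> float:
    """(1/4) * sum_{n < terms} (3/4)^n * h(4^n x)."""
    return 0.25 * sum(0.75 ** n * tent(4 ** n * x) for n in range(terms))


def difference_quotient(x: float, m: int) -> float:
    """(f(x + eps_m) - f(x)) / eps_m with eps_m = +-1 / (2 * 4^m)."""
    y = 4 ** m * x
    # '+' is admissible when (y, y + 1/2) contains no integer, otherwise use '-'.
    sign = 1.0 if math.floor(y + 0.5) == math.floor(y) else -1.0
    eps = sign / (2 * 4 ** m)
    # Terms with n > m contribute nothing, so m + 1 terms suffice.
    return (f_partial(x + eps, m + 1) - f_partial(x, m + 1)) / eps


if __name__ == "__main__":
    for m in range(1, 9):
        print(m, 4 * abs(difference_quotient(0.3, m)), (3 ** m + 1) / 2)
```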

B.2 Uniform convergence of continuous functions


The first theorem of this section is due to S. N. Bernstein.

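Theorem B.2.1 (Bernstein). Let $f$ be a continuous real-valued function on $[0, 1]$. Then

\[ f(x) = \lim_{n\to\infty} \sum_{k=0}^{n} f\left(\tfrac{k}{n}\right) \binom{n}{k} x^k (1 - x)^{n-k}, \qquad x \in [0, 1], \]

where the convergence is uniform on $[0, 1]$.

Proof. For each $x \in [0, 1]$ and $n \in \mathbb{N}$ let $Y_n^x$ be a random variable on a probability space $(\Omega, \mathcal{A}, P)$ such that $n Y_n^x$ has binomial distribution with parameters $n$ and $x$. We then have $E(Y_n^x) = x$, $\operatorname{Var}(Y_n^x) = \frac{x(1-x)}{n}$ and

\[ E(f(Y_n^x)) = \sum_{k=0}^{n} f\left(\tfrac{k}{n}\right) \binom{n}{k} x^k (1 - x)^{n-k}. \]

Therefore, it suffices to show that

\[ \lim_{n\to\infty} E(f(Y_n^x) - f(x)) = 0, \qquad x \in [0, 1], \]

uniformly on $[0, 1]$. By Chebyshev's inequality (see (F.1.6))

\[ P\left( |Y_n^x - x| > n^{-1/4} \right) \le n^{1/2} \operatorname{Var}(Y_n^x) = \frac{x(1 - x)}{\sqrt{n}} \le \frac{1}{\sqrt{n}}. \tag{1} \]

Choose $K \in \mathbb{R}$ such that $|f| \le K$ and denote by $A$ the random event on the left-hand side of (1). Then $P(A) \le \frac{1}{\sqrt{n}}$ and

\[
\begin{aligned}
E(|f(Y_n^x) - f(x)|) &= \int_\Omega |f(Y_n^x) - f(x)|\, dP \\
&= \int_A |f(Y_n^x) - f(x)|\, dP + \int_{\Omega \setminus A} |f(Y_n^x) - f(x)|\, dP \\
&\le \frac{2K}{\sqrt{n}} + \sup_{|y - z| \le n^{-1/4}} |f(y) - f(z)|.
\end{aligned}
\]

The theorem follows now from the fact that $f$ is uniformly continuous.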

Corollary B.2.2 (Weierstraß). Every continuous, real-valued (complex-valued) function on a finite interval $[a, b] \subset \mathbb{R}$ can be uniformly approximated by real (complex, respectively) polynomials.

[Figure B.4. The Bernstein polynomials $B_1$, $B_2$ and $B_5$ from Remark B.2.3 for the function $f : x \mapsto x^{0.3}$.]

Remark B.2.3. Let $f$ be a function on $[0, 1]$. The polynomials $B_n(f)$, $n \in \mathbb{N}_0$, defined by

\[ B_n(f)(x) := \sum_{k=0}^{n} f\left(\tfrac{k}{n}\right) \binom{n}{k} x^k (1 - x)^{n-k}, \qquad x \in [0, 1], \]

are called Bernstein polynomials (see Figure B.4).
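A short sketch of the Bernstein polynomials from Remark B.2.3, restricted to $n \ge 1$. It evaluates $B_n(f)$ directly from the definition and measures the error on a grid; the test function $x \mapsto |x - 1/2|$, the grid size and the function names are choices made for this illustration.

```python
from math import comb


def bernstein(f, n: int, x: float) -> float:
    """B_n(f)(x) = sum_{k=0}^{n} f(k/n) * C(n, k) * x^k * (1 - x)^(n - k), n >= 1."""
    return sum(f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k) for k in range(n + 1))


def sup_error(f, n: int, grid: int = 101) -> float:
    """Maximal deviation |B_n(f)(x) - f(x)| over an equispaced grid in [0, 1]."""
    points = [i / (grid - 1) for i in range(grid)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in points)


if __name__ == "__main__":
    kink = lambda x: abs(x - 0.5)
    for n in (5, 20, 80, 200):
        # The error decays slowly (of order n^(-1/2)) because of the kink at 1/2.
        print(n, sup_error(kink, n))
```

Affine functions are reproduced exactly by $B_n$, which gives a convenient sanity check for an implementation.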


Theorem B.2.4 (Stone–Weierstraß). Suppose $X$ is a compact Hausdorff space and $A$ is a linear subspace of $C(X)$ such that
(i) if $f, g \in A$, then $\bar{f}$ and $f \cdot g$ belong to $A$, as well;
(ii) $A$ contains the constant functions;
(iii) for all $x, y \in X$, $x \neq y$, there exists a function $f \in A$ such that $f(x) \neq f(y)$.
Then every function in $C(X)$ can be uniformly approximated by functions in $A$.

Proof. Denote by $\overline{A}$ the closure of $A$ in the topology of uniform convergence. We have to show that $\overline{A} = C(X)$.
It is easy to check that $\overline{A}$ is a linear subspace and $\bar{f}$, $f \cdot g$, $\operatorname{Im} f$, $\operatorname{Re} f \in \overline{A}$ whenever $f, g \in \overline{A}$. From this we conclude that if $p$ is a polynomial, then $p(f) \in \overline{A}$ for all $f \in \overline{A}$.
Now let $g \in \overline{A}$ be a real-valued function. We show that $|g| \in \overline{A}$. We may suppose that $-1 \le g \le 1$. By Corollary B.2.2, for each $n \in \mathbb{N}$ there exists a real polynomial $p_n$ such that

\[ \bigl| |t| - p_n(t) \bigr| \le \frac{1}{n}, \qquad -1 \le t \le 1. \]

We already know that $p_n(g)$ belongs to $\overline{A}$. The inequalities above show that the sequence $\{p_n(g)\}$ converges uniformly to $|g|$, and hence $|g| \in \overline{A}$.
If $f, g \in \overline{A}$ are real-valued, then $\max(f, g)$ and $\min(f, g)$ belong to $\overline{A}$, since

\[ \max(f, g) = \frac{f + g + |f - g|}{2}, \qquad \min(f, g) = \frac{f + g - |f - g|}{2}. \]

To finish the proof it suffices to show that every real-valued function $h \in C(X)$ belongs to $\overline{A}$. It follows at once from (iii) that for all $x, y \in X$, $x \neq y$, we can choose a real-valued function $f_{xy} \in \overline{A}$ such that $f_{xy}(x) = h(x)$ and $f_{xy}(y) = h(y)$. Let $\varepsilon > 0$ be arbitrary and define the sets $U_{xy}$ and $V_{xy}$ by

\[ U_{xy} = \{t \in X : f_{xy}(t) < h(t) + \varepsilon\}, \]
\[ V_{xy} = \{t \in X : f_{xy}(t) > h(t) - \varepsilon\}. \]

These sets are open and $x, y \in U_{xy} \cap V_{xy}$. Therefore, $\{U_{xy}\}_{x \in X}$ is an open covering of $X$ for each $y$. By compactness of $X$, there exist $n = n(y) \in \mathbb{N}$ and $x_i = x_i(y) \in X$, $i = 1, \dots, n$, such that

\[ X = \bigcup_{i=1}^{n} U_{x_i y}. \]

By what we have already proved, $f_y := \min(f_{x_1 y}, \dots, f_{x_n y}) \in \overline{A}$. Moreover,

\[ f_y(t) < h(t) + \varepsilon, \qquad t \in X, \]
\[ f_y(t) > h(t) - \varepsilon, \qquad t \in V_y := \bigcap_{i=1}^{n} V_{x_i y}. \]

The sets $V_y$ are open and $y \in V_y$. Using again the compactness of $X$ we see that there exist $m \in \mathbb{N}$ and $y_1, \dots, y_m \in X$ such that $\bigcup_{i=1}^{m} V_{y_i} = X$. Choosing $f := \max(f_{y_1}, \dots, f_{y_m})$ we have $f \in \overline{A}$ and

\[ h(t) - \varepsilon < f(t) < h(t) + \varepsilon, \qquad t \in X, \]

completing the proof.

Theorem B.2.5 (Dini). Let $\{f_n\}$ be a sequence of continuous real-valued functions on a compact space $X$ converging pointwise to a continuous function $g$. If this sequence is monotonically increasing or decreasing, then the convergence is uniform.

Proof. We assume that the sequence is increasing. For every $\varepsilon > 0$ and $x \in X$ there exists an index $n(x) = n(x, \varepsilon)$ such that

\[ 0 \le g(x) - f_m(x) \le \frac{\varepsilon}{3}, \qquad m \ge n(x). \]

Let $V(x)$ be an open neighborhood of $x$ such that the inequalities

\[ |g(x) - g(y)| \le \frac{\varepsilon}{3}, \]
\[ |f_{n(x)}(x) - f_{n(x)}(y)| \le \frac{\varepsilon}{3} \]

hold for all $y \in V(x)$. For all $y \in V(x)$ we then have

\[ 0 \le g(y) - f_{n(x)}(y) \le \varepsilon. \]

By compactness we can choose finitely many points $x_i \in X$ such that the union of the sets $V(x_i)$ is equal to $X$. Let $n_0$ be the largest of the numbers $n(x_i)$. Since each $y \in X$ belongs to some $V(x_i)$ we have

\[ g(y) - f_n(y) \le g(y) - f_{n(x_i)}(y) \le \varepsilon, \qquad n \ge n_0. \]

The proof is complete.

Definition B.2.6. Let $X$ be a topological space. A set $F$ of complex-valued functions on $X$ is called equicontinuous if for all $x \in X$ and $\varepsilon > 0$ there exists a neighborhood $V_x$ of $x$ such that the inequality

\[ |f(x) - f(y)| < \varepsilon \]

holds for all $f \in F$ and for all $y \in V_x$.

Theorem B.2.7 (Ascoli). Let $\{f_\alpha\}$ be a net of equicontinuous complex-valued functions on a topological space $X$ converging pointwise to a function $f$. Then this net converges uniformly on compact sets and $f$ is continuous.

Proof. We show first that $f$ is continuous. For all $x, y \in X$ and for all $\alpha$ we have

\[ |f(x) - f(y)| \le |f(x) - f_\alpha(x)| + |f_\alpha(x) - f_\alpha(y)| + |f_\alpha(y) - f(y)|. \tag{1} \]

Let $x \in X$ and $\varepsilon > 0$ be arbitrary and let $V_x$ be a neighborhood of $x$ such that

\[ |f_\alpha(x) - f_\alpha(y)| < \varepsilon, \qquad y \in V_x. \]

For each $y \in V_x$ we choose an index $\alpha(y) = \alpha$ such that $|f(x) - f_\alpha(x)| < \varepsilon$ and $|f(y) - f_\alpha(y)| < \varepsilon$. It follows from inequality (1) that $|f(x) - f(y)| < 3\varepsilon$, $y \in V_x$. Thus, $f$ is continuous in $x$.
Obviously the set $\{f_\alpha, f\}$ is also equicontinuous. Let $K \subset X$ be compact. For all $x \in K$ and for all $\varepsilon > 0$ there exists an open neighborhood $V_x$ of $x$ such that the inequalities

\[ |f(x) - f(y)| < \varepsilon \qquad\text{and}\qquad |f_\alpha(x) - f_\alpha(y)| < \varepsilon \]

hold for all $\alpha$ and for all $y \in V_x$. Since $K$ is compact there exist $x_1, \dots, x_n$ such that $K \subset \bigcup_{i=1}^{n} V_{x_i}$. Choose $\beta$ such that

\[ |f_\alpha(x_i) - f(x_i)| < \varepsilon, \qquad \alpha \ge \beta,\ i = 1, \dots, n. \]

Now let $y \in K$ be arbitrary. Then $y \in V_{x_i}$ for some $i$. Using this we see that

\[ |f_\alpha(y) - f(y)| \le |f_\alpha(y) - f_\alpha(x_i)| + |f_\alpha(x_i) - f(x_i)| + |f(x_i) - f(y)| < 3\varepsilon \]

holds whenever $\alpha \ge \beta$. Thus, $f_\alpha \to f$ uniformly on $K$.

Theorem B.2.8. Let $F$ be a set of complex-valued functions on a set $X \neq \emptyset$ and consider on $F$ the topology of pointwise convergence. If

\[ \sup\{|f(x)| : f \in F\} < \infty \]

for all $x \in X$, then the closure of $F$ is compact.

Proof. Let $C(x)$ be the closure of the set $\{f(x) : f \in F\} \subset \mathbb{C}$. Then $C(x)$ is bounded and hence compact. Now $F$ is a subset of the product space

\[ C = \prod_{x \in X} C(x) \]

and pointwise convergence is the same as convergence in the product topology. By Tychonoff's theorem, $C$ is compact.

Combining Theorem B.2.7 and Theorem B.2.8 we obtain the following result.

Theorem B.2.9 (Arzelà–Ascoli). Let $\{f_\alpha\}$ be an equicontinuous net of complex-valued functions on a topological space $X$ such that

\[ \sup_\alpha |f_\alpha(x)| < \infty \]

for all $x \in X$. Then this net contains a subnet converging uniformly on compact sets to a continuous function.
A proof of the next theorem can be found for example in [47].

Theorem B.2.10. Let $\{f_n\}$ be a sequence of differentiable real-valued functions on a finite interval $I = [a, b] \subset \mathbb{R}$ such that
(i) for some $x_0 \in I$ the sequence $\{f_n(x_0)\}$ is convergent;
(ii) the sequence $\{f_n'\}$ is uniformly convergent on $I$.
Then the sequence $\{f_n\}$ converges uniformly on $I$ to a differentiable function $f$ and

\[ f'(x) = \lim_{n\to\infty} f_n'(x), \qquad x \in I. \]

B.3 Infinite products

Definition B.3.1. Given a sequence $\{z_j\}_{j=1}^{\infty}$ of complex numbers let $p_n = z_1 z_2 \cdots z_n$, $n \ge 1$, be the $n$-th partial product. If there is a $p \in \mathbb{C}$ such that $\lim_n p_n = p$, then we write$^8$

\[ p = \prod_{n=1}^{\infty} z_n. \]

Lemma B.3.2. For a sequence $\{z_j\}_1^\infty$ of complex numbers the following conditions are equivalent:
(i) The $n$-th partial products $p_n$ converge to a nonzero limit as $n \to \infty$.
(ii) $p_n \neq 0$ for all $n$, and for each $\varepsilon > 0$ there is an $N \in \mathbb{N}$ such that $|p_n/p_m - 1| < \varepsilon$ for all $n, m \ge N$.
(iii) There is a $\delta > 0$ such that $|p_n| > \delta$ for all $n$, and for each $\varepsilon > 0$ there is an $N \in \mathbb{N}$ such that $|p_n - p_m| < \varepsilon$ for all $n, m \ge N$.

Proof. The equivalence of (i) and (iii) is obvious. If (iii) holds then

\[ \left| \frac{p_n}{p_m} - 1 \right| < \frac{\varepsilon}{|p_m|} < \frac{\varepsilon}{\delta}, \qquad n, m \ge N, \]

from which (ii) follows. Assume finally that (ii) is true and let $0 < \varepsilon < 1$. Then

\[ \left| \left| \frac{p_n}{p_m} \right| - 1 \right| \le \left| \frac{p_n}{p_m} - 1 \right| < \varepsilon, \qquad n, m \ge N. \]

Replacing $m$ by $N$ and writing $A = |p_N|$ we obtain

\[ -\varepsilon < \frac{|p_n|}{A} - 1 < \varepsilon \]

or

\[ (1 - \varepsilon) A < |p_n| < (1 + \varepsilon) A, \qquad n \ge N. \]

The lower estimate shows that the condition $|p_n| > \delta$ in (iii) is satisfied while the upper estimate shows that the sequence $\{p_n\}$ is bounded. Multiplying the inequality $|p_n/p_m - 1| < \varepsilon$ by $|p_m|$ and using boundedness, we see that the condition $|p_n - p_m| < \varepsilon$ in (iii) is satisfied as well.

Since $p_n/p_{n-1} = z_n$, $n > 1$, we obtain the following corollary.

Corollary B.3.3. If $\prod_{n=1}^{\infty} z_n = p \neq 0$ then $\lim_{n\to\infty} z_n = 1$.

8 Note that we do not introduce the terminology convergent infinite product which is treated differently in the literature.

Theorem B.3.4. Let $\{a_n\}_{1}^{\infty}$ be a sequence of nonnegative real numbers. Then

\[ \prod_{n=1}^{\infty} (1 + a_n) < \infty \]

if and only if $\sum a_n < \infty$. If $a_n < 1$ for all $n$, then

\[ \prod_{n=1}^{\infty} (1 - a_n) > 0 \]

if and only if $\sum a_n < \infty$.

Proof. Writing as shorthands $s_n = a_1 + \cdots + a_n$, $p_n = (1 + a_1) \cdots (1 + a_n)$ and $q_n = (1 - a_1) \cdots (1 - a_n)$, the sequences $\{s_n\}$ and $\{p_n\}$ are increasing while $\{q_n\}$ is decreasing so that their (possibly infinite) limits exist. Moreover,

\[ s_n < p_n \le e^{s_n}. \tag{1} \]

Indeed, the lower estimate for $p_n$ is a consequence of

\[ p_n = (1 + a_1) \cdots (1 + a_n) = 1 + a_1 + \cdots + a_n + a_1 a_2 + \cdots \]

while the upper one follows by applying the inequalities $1 + a_k \le e^{a_k}$, $k = 1, \ldots, n$.
From (1) we conclude that $\{s_n\}$ and $\{p_n\}$ are either both bounded or both unbounded. This shows the first statement. If $a_k < 1$, then $0 < 1 - a_k \le e^{-a_k}$, hence $0 < q_n \le e^{-s_n}$. Thus, $\lim q_n = 0$ if $\lim s_n = \infty$. The case $\lim s_n < \infty$ follows from the next theorem.⁹
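
The estimate (1) and the corresponding bound for $q_n$ can be checked numerically. The following Python sketch is only an illustration: the sequence $a_k = 1/(k+1)^2$ and the helper names are assumptions made for the example and are not taken from the text.

```python
import math

def partial_sums_and_products(a):
    """Return the lists (s_n), (p_n), (q_n) for a finite list a of numbers in [0, 1)."""
    s, p, q = 0.0, 1.0, 1.0
    sums, prods_plus, prods_minus = [], [], []
    for ak in a:
        s += ak            # s_n = a_1 + ... + a_n
        p *= 1.0 + ak      # p_n = (1 + a_1) ... (1 + a_n)
        q *= 1.0 - ak      # q_n = (1 - a_1) ... (1 - a_n)
        sums.append(s)
        prods_plus.append(p)
        prods_minus.append(q)
    return sums, prods_plus, prods_minus

def bounds_hold(a, rel_tol=1e-12):
    """Check s_n < p_n <= exp(s_n) and 0 < q_n <= exp(-s_n), up to rounding."""
    sums, prods_plus, prods_minus = partial_sums_and_products(a)
    for s, p, q in zip(sums, prods_plus, prods_minus):
        if not (s < p <= math.exp(s) * (1.0 + rel_tol)):
            return False
        if not (0.0 < q <= math.exp(-s) * (1.0 + rel_tol)):
            return False
    return True

# Illustrative summable sequence a_k = 1/(k+1)^2, k = 1, ..., 2000.
example = [1.0 / (k + 1) ** 2 for k in range(1, 2001)]
print(bounds_hold(example))
```

The small relative tolerance only guards against floating-point rounding; the inequalities themselves are exact.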

Theorem B.3.5. Let $\{u_n\}_{1}^{\infty}$ be a sequence of complex numbers different from $-1$ such that

\[ \sum_{n=1}^{\infty} |u_n| < \infty. \]

Then there exists $p \in \mathbb{C}$ such that

\[ \prod_{n=1}^{\infty} (1 + u_n) = p \ne 0. \]

Proof. Write

\[ a_n = |u_n|, \qquad q_n = \prod_{j=1}^{n} (1 + a_j), \qquad p_n = \prod_{j=1}^{n} (1 + u_j) \]

9 Note that Theorem B.3.5 uses only the first part of Theorem B.3.4.

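and let $\varepsilon > 0$ be arbitrary. It follows from the first part of Theorem B.3.4 that $\lim q_n < \infty$. By Lemma B.3.2, there exists $N \in \mathbb{N}$ such that

\[ \frac{q_{n+d}}{q_n} - 1 < \varepsilon, \qquad n \ge N,\ d \in \mathbb{N}_0. \]

We have

\[
\begin{aligned}
\left| \frac{p_{n+d}}{p_n} - 1 \right| &= |(1 + u_{n+1})(1 + u_{n+2}) \cdots (1 + u_{n+d}) - 1| \\
&= |u_{n+1} + \cdots + u_{n+d} + u_{n+1} u_{n+2} + \cdots| \\
&\le a_{n+1} + \cdots + a_{n+d} + a_{n+1} a_{n+2} + \cdots \\
&= \frac{q_{n+d}}{q_n} - 1 < \varepsilon, \qquad n \ge N,\ d \in \mathbb{N}_0.
\end{aligned}
\]

Thus, the theorem follows from Lemma B.3.2.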

Theorem B.3.6. Let $\{u_n\}_{1}^{\infty}$ be a sequence of complex-valued functions on some set $D \ne \emptyset$ such that for some $K > 0$

\[ \sum_{n=1}^{\infty} |u_n(z)| \le K, \qquad z \in D. \]

Then the sequence

\[ p_n(z) = (1 + u_1(z)) \cdots (1 + u_n(z)) \]

converges uniformly on $D$. The limit $u(z)$ is equal to zero if and only if $1 + u_n(z) = 0$ for some $n$.

Proof. We have $p_{n+1}(z) - p_n(z) = u_{n+1}(z) p_n(z)$ and hence for $m > n$

\[ p_m(z) - p_n(z) = p_m(z) - p_{m-1}(z) + \cdots + p_{n+1}(z) - p_n(z) = \sum_{j=n}^{m-1} u_{j+1}(z) p_j(z). \]

Using this and the inequality $1 + x \le e^x$, $x \in \mathbb{R}$, we obtain

\[
\begin{aligned}
|p_m(z) - p_n(z)| &= \Bigl| \sum_{j=n}^{m-1} u_{j+1}(z) p_j(z) \Bigr| \\
&\le \sum_{j=n}^{m-1} |u_{j+1}(z)| \cdot e^{|u_1(z)|} \cdots e^{|u_j(z)|} \\
&\le \sum_{j=n}^{m-1} |u_{j+1}(z)| \cdot e^{K}.
\end{aligned}
\]

By our assumption the right-hand side tends to 0 uniformly as $m, n \to \infty$, i.e., the sequence $\{p_n\}$ is uniformly convergent. The last statement follows from Theorem B.3.5.

B.4 Convex functions


For proofs of the results presented in this section we refer to [61].

Definition B.4.1. A real-valued function $f$ on an interval $I \subset \mathbb{R}$ is said to be convex if the inequality

\[ f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2) \]

holds for all $x_1, x_2 \in I$ and $\lambda \in [0, 1]$.

Geometrically, a convex function $f$ has the property that for any two points $x_1 < x_2$ in $I$ the chord joining the points $(x_1, f(x_1))$ and $(x_2, f(x_2))$ is above the graph of $f$.

Proposition B.4.2. Let $f$ be a convex function on an interval $I \subset \mathbb{R}$ and $u, v, w \in I$ be such that $u < v < w$. Then we have

\[ \frac{f(v) - f(u)}{v - u} \le \frac{f(w) - f(u)}{w - u} \le \frac{f(w) - f(v)}{w - v}. \]

The inequalities in Proposition B.4.2 play a basic role in the proof of the next theorem.

Theorem B.4.3. Let $f$ be a convex function on an open interval $I \subset \mathbb{R}$. Then we have
(i) $f$ is continuous on $I$;
(ii) $f$ is both left- and right-differentiable at every $x_0 \in I$, i.e.,

\[ (D_l f)(x_0) := \lim_{x \uparrow x_0} \frac{f(x) - f(x_0)}{x - x_0} \in \mathbb{R}, \qquad (D_r f)(x_0) := \lim_{x \downarrow x_0} \frac{f(x) - f(x_0)}{x - x_0} \in \mathbb{R}; \]

(iii) $D_l f$ and $D_r f$ are increasing functions, $D_l f \le D_r f$ and

\[ (D_r f)(x_1) \le \frac{f(x_2) - f(x_1)}{x_2 - x_1} \le (D_l f)(x_2) \]

for all $x_1, x_2 \in I$ such that $x_1 < x_2$;
(iv) $f$ is differentiable everywhere except at countably many points in $I$ and $f'$ is an increasing function on the subset of $I$ on which it exists.

Theorem B.4.4. Let $f$ be a convex function on an open interval $I \subset \mathbb{R}$. Then either $f$ is monotone on $I$ or else there exists $x_0 \in I$ such that $f$ is decreasing on $I \cap (-\infty, x_0]$ and increasing on $I \cap [x_0, \infty)$.
Theorem B.4.7 below states that convex functions are absolutely continuous. Before
formulating it we recall the definition of absolute continuity.

Definition B.4.5. A complex-valued function $F$, defined on an interval $[a, b] \subset \mathbb{R}$, is called absolutely continuous if there corresponds to every $\varepsilon > 0$ a $\delta > 0$ such that

\[ \sum_{i=1}^{n} |F(b_i) - F(a_i)| < \varepsilon \]

for any $n \in \mathbb{N}$ and any disjoint collection of intervals $[a_i, b_i] \subset [a, b]$ satisfying

\[ \sum_{i=1}^{n} |b_i - a_i| < \delta. \]

Taking $n = 1$ in the definition above, we see that such an $F$ is continuous. Moreover, $F$ is of bounded variation.¹⁰ Indeed, let $\delta > 0$ correspond to $\varepsilon = 1$. If $a = x_0 < \cdots < x_m = b$, by inserting further subdivision points if necessary, we can collect the intervals $(x_{i-1}, x_i)$ into at most $N := \lfloor (b - a)/\delta \rfloor + 1$ groups such that the sum of the interval lengths within a group is less than $\delta$. It follows that the total variation of $F$ is at most $N$.
The next theorem is usually referred to as the fundamental theorem of calculus for
Lebesgue integrals.

Theorem B.4.6 (Lebesgue). If $-\infty < a < b < \infty$ and $F : [a, b] \to \mathbb{C}$, then the following statements are equivalent:
(i) $F$ is absolutely continuous;
(ii) $F(x) - F(a) = \int_a^x f(t)\,dt$, $x \in [a, b]$, for some Lebesgue integrable function $f$ on $[a, b]$;
(iii) $F$ is Lebesgue almost everywhere differentiable, $F'$ is Lebesgue integrable and $F(x) - F(a) = \int_a^x F'(t)\,dt$, $x \in [a, b]$.

Theorem B.4.7. Let $f$ be a convex function on an interval $I \subset \mathbb{R}$.
(i) If $I$ is open, then $f$ is absolutely continuous on every finite closed interval contained in $I$.
(ii) If $I$ is a closed finite interval and $f$ is continuous at the endpoints of $I$, then $f$ is absolutely continuous on $I$.
10 See page 288 for the definition of bounded variation.

Theorem B.4.8. Let $f$ be a real-valued function on an open interval $I \subset \mathbb{R}$. Suppose
(i) $f$ is absolutely continuous on every finite closed interval contained in $I$, and
(ii) $f'$ is an increasing function on the subset of $I$ on which it exists.
Then $f$ is a convex function on $I$.

B.5 The Riemann–Stieltjes integral


At a few places in the book we need the Riemann–Stieltjes integral. D. V. Widder’s classic book [60] contains a short introduction to this topic.
Let $\alpha$ and $f$ be real-valued functions defined on some finite interval $[a, b] \subset \mathbb{R}$. Denote by $\Delta = (x_0, x_1, \ldots, x_n)$ a subdivision of this interval by the points $x_0, x_1, \ldots, x_n$, where

\[ a = x_0 < x_1 < \cdots < x_n = b. \]

Further, let $\delta = \delta(\Delta)$ be the largest of the numbers $x_{i+1} - x_i$, $i = 0, \ldots, n - 1$.

Definition B.5.1. If the limit

\[ \lim_{\delta \to 0} \sum_{i=0}^{n-1} f(s_i) [\alpha(x_{i+1}) - \alpha(x_i)], \]

where $x_i \le s_i \le x_{i+1}$, exists independently of the manner of subdivision and of the choice of the numbers $s_i$, then the limit is called the Riemann–Stieltjes integral of $f$ with respect to $\alpha$ from $a$ to $b$ and is denoted by

\[ \int_a^b f(x)\,d\alpha(x). \]

The Riemann–Stieltjes integral reduces to the Riemann integral if $\alpha(x) = x$.
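
To make the definition concrete, the sums appearing in it can be evaluated directly. The Python sketch below is only an illustration under stated assumptions: it uses equal subintervals and midpoints for the $s_i$ (one admissible choice among many), and the function name and the sample data $f(x) = x$, $\alpha(x) = x^2$ are hypothetical.

```python
def riemann_stieltjes_sum(f, alpha, a, b, n):
    """Sum of f(s_i) * (alpha(x_{i+1}) - alpha(x_i)) over n equal subintervals of [a, b].

    The intermediate point s_i is taken to be the midpoint of [x_i, x_{i+1}].
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x_left = a + i * h
        x_right = a + (i + 1) * h
        s_i = 0.5 * (x_left + x_right)
        total += f(s_i) * (alpha(x_right) - alpha(x_left))
    return total

# Illustrative data: f(x) = x and alpha(x) = x^2 on [0, 1].
# Since alpha is continuously differentiable, the integral equals the
# Riemann integral of x * 2x over [0, 1], i.e. 2/3.
approx = riemann_stieltjes_sum(lambda x: x, lambda x: x * x, 0.0, 1.0, 1000)
print(approx)
```

With $\alpha(x) = x$ the same function returns an ordinary Riemann sum.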

Theorem B.5.2. If $f$ is continuous and $\alpha$ is of bounded variation in $[a, b]$, then the Riemann–Stieltjes integral of $f$ with respect to $\alpha$ from $a$ to $b$ exists.

Theorem B.5.3 (integration by parts). If the Riemann–Stieltjes integral of $f$ with respect to $\alpha$ from $a$ to $b$ exists, then so does the Riemann–Stieltjes integral of $\alpha$ with respect to $f$, and

\[ \int_a^b f(x)\,d\alpha(x) = f(b)\alpha(b) - f(a)\alpha(a) - \int_a^b \alpha(x)\,df(x). \]

Another useful property of the Riemann–Stieltjes integral is given next.

Theorem B.5.4. If $f$ and $\varphi$ are continuous and $\alpha$ is of bounded variation in $[a, b]$, and if

\[ \beta(x) = \int_c^x \varphi(t)\,d\alpha(t), \qquad x \in [a, b], \]

where $c \in [a, b]$ is fixed, then

\[ \int_a^b f(x)\,d\beta(x) = \int_a^b f(x)\varphi(x)\,d\alpha(x). \]

Definition B.5.5 (improper Riemann–Stieltjes integral). Let $\alpha$ and $f$ be real-valued functions defined on the interval $[a, \infty) \subset \mathbb{R}$. Then

\[ \int_a^\infty f(x)\,d\alpha(x) \]

is said to be convergent and equal to $A$ provided $\int_a^b f(x)\,d\alpha(x)$ exists for all $b \ge a$ and

\[ A = \lim_{b \to \infty} \int_a^b f(x)\,d\alpha(x) \]

is finite.

B.5.6. Using the notation from the previous definition and applying Theorem B.5.3 we obtain the formula

\[ \int_a^\infty f(x)\,d\alpha(x) = \lim_{b \to \infty} [f(b)\alpha(b) - f(a)\alpha(a)] - \int_a^\infty \alpha(x)\,df(x) \tag{1} \]

provided the improper integrals and the limit exist.

B.6 Multivariate calculus


Theorem B.6.1 (Taylor’s theorem for multivariate functions). Let $f$ be a real- or complex-valued function defined on an open convex set $S \subset \mathbb{R}^d$ such that for some positive integer $n$ all partial derivatives $D^\alpha f$, $\alpha \in \mathbb{N}_0^d$, $|\alpha| \le n$, exist and are continuous. Then for all $t, a \in S$ we have

\[ f(t) = \sum_{|\alpha| \le n} \frac{D^\alpha f(a)}{\alpha!} \cdot (t - a)^\alpha + \sum_{|\alpha| = n} R_\alpha(t) \cdot (t - a)^\alpha \]

where

\[ R_\alpha(t) = \frac{n}{\alpha!} \int_0^1 \bigl[ D^\alpha f(a + x(t - a)) - D^\alpha f(a) \bigr] \cdot (1 - x)^{n-1}\,dx. \]

The remainder terms $R_\alpha$ satisfy the inequality

\[ |R_\alpha(t)| \le \frac{1}{\alpha!} \cdot \sup_{y \in L} |D^\alpha f(y) - D^\alpha f(a)| \]

where $L$ denotes the line segment connecting $a$ and $t$.

Proof. Since $S$ is open and convex we can choose an open interval $I \supset [0, 1]$ such that $a + s(t - a) \in S$ for all $s \in I$. The function $g$ of one real variable defined by

\[ g(s) = f(a + s(t - a)), \qquad s \in I, \tag{1} \]

is $n$-times continuously differentiable. Thus, for $n > 0$ the one-dimensional Taylor formula

\[ g(s) = \sum_{k=0}^{n} \frac{g^{(k)}(0)}{k!} s^k + \frac{1}{(n-1)!} \int_0^s \bigl[ g^{(n)}(x) - g^{(n)}(0) \bigr] \cdot (s - x)^{n-1}\,dx, \qquad s \in I, \]

holds. Consequently,

\[ f(t) = g(1) = \sum_{k=0}^{n} \frac{g^{(k)}(0)}{k!} + \frac{1}{(n-1)!} \int_0^1 \bigl[ g^{(n)}(x) - g^{(n)}(0) \bigr] \cdot (1 - x)^{n-1}\,dx. \tag{2} \]

Using (1), the chain rule, and induction on $k$ we get

\[
\begin{aligned}
g^{(k)}(s) &= \sum_{i_1=1}^{d} \cdots \sum_{i_k=1}^{d} \frac{\partial^k f}{\partial t_{i_1} \cdots \partial t_{i_k}}(a + s(t - a)) \cdot (t_{i_1} - a_{i_1}) \cdots (t_{i_k} - a_{i_k}) \\
&= \sum_{|\alpha| = k} \frac{k!}{\alpha!} \cdot D^\alpha f(a + s(t - a)) \cdot (t - a)^\alpha.
\end{aligned}
\]

Inserting this into (2) we obtain the first statement of the theorem.
The remainders are easily seen to satisfy the given upper estimate.

Theorem B.6.2. Under the conditions of Theorem B.6.1 there exist $\theta, \eta \in (0, 1)$ depending on $t$ such that

\[ f(t) = \sum_{|\alpha| \le n-1} \frac{D^\alpha f(a)}{\alpha!} \cdot (t - a)^\alpha + \sum_{|\alpha| = n} Q_\alpha(t) \cdot (t - a)^\alpha \]

where

\[ Q_\alpha(t) = \frac{1}{\alpha!} \cdot \bigl[ \operatorname{Re} D^\alpha f(a + \theta \cdot (t - a)) + i \cdot \operatorname{Im} D^\alpha f(a + \eta \cdot (t - a)) \bigr]. \]

Proof. The proof is almost the same as that of Theorem B.6.1, but now we apply the one-dimensional Taylor formula

\[ h(s) = \sum_{k=0}^{n-1} \frac{h^{(k)}(0)}{k!} s^k + \frac{1}{n!} \cdot h^{(n)}(\theta s) \cdot s^n, \]

where $h$ is real-valued, for the real and imaginary part of $g$.

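Lemma B.6.3. Let $f$ be a complex-valued function on $(a, b) \times (a, b) \subset \mathbb{R}^2$ having continuous partial derivatives of the second order. Then

\[ D^{(1,1)} f(x, y) = f_{xy}(x, y) = \lim_{\varepsilon, h \to 0} \frac{\Delta^2 f(x, y; \varepsilon, h)}{\varepsilon h}, \qquad x, y \in (a, b), \]

where

\[ \Delta^2 f(x, y; \varepsilon, h) = f(x + \varepsilon, y + h) - f(x + \varepsilon, y) - f(x, y + h) + f(x, y). \]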

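Proof. We may assume that $f$ is real-valued. Let $x, y \in (a, b)$ and choose $\delta > 0$ such that $x + \varepsilon, y + h \in (a, b)$ whenever $\varepsilon, h \in I_\delta := (-\delta, \delta)$. Setting

\[ g(t) := f(x + t, y + h) - f(x + t, y) - f_{xy}(x, y)\,t h, \qquad t \in I_\delta, \]

the function $g$ is continuously differentiable on $I_\delta$. For $\varepsilon \in I_\delta$ we have

\[ g(\varepsilon) - g(0) = \Delta^2 f(x, y; \varepsilon, h) - f_{xy}(x, y)\,\varepsilon h. \]

Applying the mean value theorem to $g$ we obtain

\[
\begin{aligned}
|g(\varepsilon) - g(0)| &\le |\varepsilon| \cdot \sup_{s \in I_\delta} |g'(s)| \\
&= |\varepsilon| \cdot \sup_{s \in I_\delta} |f_x(x + s, y + h) - f_x(x + s, y) - f_{xy}(x, y) h|.
\end{aligned}
\]

Now we apply the mean value theorem to the continuously differentiable function $r(h) = f_x(x + s, y + h) - f_{xy}(x, y) h$:

\[
\begin{aligned}
|r(h) - r(0)| &= |f_x(x + s, y + h) - f_x(x + s, y) - f_{xy}(x, y) h| \\
&\le |h| \cdot \sup_{q \in I_\delta} |r'(q)| \\
&= |h| \cdot \sup_{q \in I_\delta} |f_{xy}(x + s, y + q) - f_{xy}(x, y)|.
\end{aligned}
\]

Combining the two upper estimates we see that

\[ \left| \frac{\Delta^2 f(x, y; \varepsilon, h)}{\varepsilon h} - f_{xy}(x, y) \right| \le \sup_{s, q \in I_\delta} |f_{xy}(x + s, y + q) - f_{xy}(x, y)|, \qquad \varepsilon, h \in I_\delta \setminus \{0\}. \]

Since $f_{xy}$ is continuous, the right-hand side tends to zero as $\delta \to 0$.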
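
The convergence of the mixed difference quotient can be observed numerically. The following Python sketch is an illustration only: the test function $f(x, y) = \sin x \cdot e^{y}$, the evaluation point and the helper names are assumptions made for the example.

```python
import math

def second_difference(f, x, y, eps, h):
    """Mixed second difference f(x+eps, y+h) - f(x+eps, y) - f(x, y+h) + f(x, y)."""
    return f(x + eps, y + h) - f(x + eps, y) - f(x, y + h) + f(x, y)

def f(x, y):
    # Illustrative smooth function; its mixed derivative f_xy is cos(x) * exp(y).
    return math.sin(x) * math.exp(y)

def mixed_quotient(x, y, step):
    """Difference quotient from the lemma with both increments equal to step."""
    return second_difference(f, x, y, step, step) / (step * step)

x0, y0 = 0.3, 0.2
exact = math.cos(x0) * math.exp(y0)
for step in (1e-1, 1e-2, 1e-3):
    # The printed error should shrink as the increments tend to zero.
    print(step, abs(mixed_quotient(x0, y0, step) - exact))
```

Very small increments are avoided here because the quotient divides a rounding-affected difference by the product of the increments.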



Lemma B.6.4. Let $a, b \in (0, \infty)$ be such that $b > (a/3)^3$. Then the polynomial

\[ P(s, t) = s^2 t^2 (s^2 + t^2 - a) + b, \qquad s, t \in \mathbb{R}, \]

is strictly positive but it is not a sum of squares of polynomials with real coefficients.

Proof. It is easy to show that $P$ attains its minimum at $s = \pm\sqrt{a/3}$, $t = \pm\sqrt{a/3}$ and the minimum is equal to $b - (a/3)^3 > 0$.
Assume that

\[ P = P_1^2 + \cdots + P_n^2 \tag{1} \]

where $P_j$ is a polynomial with real coefficients. It follows from $P(s, 0) = P(0, t) = b$ that $P_j(s, 0)$ and $P_j(0, t)$ are constant for all $j$. Hence,

\[ P_j(s, t) = a_j + s t\, Q_j(s, t) \tag{2} \]

where $Q_j$ is a polynomial and $a_j \in \mathbb{R}$. From (1) we see that the degree of $Q_j$ is at most one. Putting (2) into (1) and comparing the coefficients we obtain

\[ s^2 + t^2 - a = \sum_{j=1}^{n} Q_j^2(s, t). \]

This, however, is a contradiction because the right-hand side is always nonnegative whereas the left-hand side is negative if $s^2 + t^2 < a$.
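
The claim about the minimum of $P$ can be sanity-checked on a grid. The Python sketch below is only a numerical illustration, not a proof: the parameters $a = 3$, $b = 2$, the grid size and the helper names are assumptions chosen for the example.

```python
def P(s, t, a, b):
    """The polynomial s^2 t^2 (s^2 + t^2 - a) + b from the lemma."""
    return s * s * t * t * (s * s + t * t - a) + b

def grid_minimum(a, b, radius=3.0, steps=241):
    """Smallest value of P on a uniform grid over [-radius, radius]^2."""
    pts = [-radius + 2.0 * radius * i / (steps - 1) for i in range(steps)]
    return min(P(s, t, a, b) for s in pts for t in pts)

# Illustrative parameters with b > (a/3)^3: a = 3, b = 2.
a, b = 3.0, 2.0
claimed_min = b - (a / 3.0) ** 3                      # value stated in the proof
root = (a / 3.0) ** 0.5
value_at_critical_point = P(root, root, a, b)         # P at s = t = sqrt(a/3)
print(claimed_min, value_at_critical_point, grid_minimum(a, b))
```

The grid minimum can never fall below the claimed minimum, and it stays strictly positive, in line with the first statement of the lemma.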

In the rest of this section we prove some simple results about multivariate poly-
nomials.

Lemma B.6.5. Let $P$ be a polynomial of degree $m$ on $\mathbb{R}^d$. Then for all $t, s \in \mathbb{R}^d$ the function

\[ Q_{t,s}(r) = P(t + r(s - t)), \qquad r \in \mathbb{R}, \]

is a polynomial of degree at most $m$. Moreover, every nonempty open set $S \subset \mathbb{R}^d$ contains elements $t, s$ such that the degree of $Q_{t,s}$ is equal to $m$.

Proof. The first statement follows from the fact that $Q_{t,s}$ is a linear combination of terms of the form

\[ \prod_{j=1}^{d} \bigl( t_j + r(s_j - t_j) \bigr)^{\alpha_j} \]

where $\alpha_j \in \mathbb{N}_0$ and $\alpha_1 + \cdots + \alpha_d \le m$. We now show the second statement. If $m = 0$, then there is nothing to show; otherwise we have

\[ P(t) = \sum_{|\alpha| = m} a_\alpha t^\alpha + R(t) \]

where not all of the coefficients $a_\alpha$ are equal to zero and $R$ is a polynomial of degree less than $m$. We choose $t \in S$ such that the sum above is different from zero. Further, let $\delta \ne 0$ be a real number such that $s = (1 + \delta) t \in S$. Then

\[ Q_{t,s}(r) = P\bigl( (1 + r\delta) t \bigr) = (1 + r\delta)^m \sum_{|\alpha| = m} a_\alpha t^\alpha + R\bigl( t + r(s - t) \bigr). \]

Applying the first statement to $R$ we see that the degree of $Q_{t,s}$ is equal to $m$.

Theorem B.6.6. Let $f$ be a real- or complex-valued function defined on an open convex set $S \subset \mathbb{R}^d$. Assume that for some nonnegative integer $m$ and for all $t, s \in S$ the function

\[ r \mapsto f(t + r(s - t)), \qquad r \in [0, 1], \]

is a polynomial of degree at most $m$. Then $f$ is a polynomial of degree at most $m$.

Proof. To simplify the notation we assume that $d = 2$ and write the function $f$ as $f(x, y)$, $(x, y) \in S$. We choose mutually different $a_j$ and mutually different $b_j$, $0 \le j \le m$, such that $(a_i, b_j) \in S$ for all $i$ and $j$, and denote by $A_i$ and $B_i$ the corresponding Lagrange polynomials:

\[ A_i(x) = \prod_{k=0,\ k \ne i}^{m} \frac{x - a_k}{a_i - a_k}, \qquad B_i(y) = \prod_{k=0,\ k \ne i}^{m} \frac{y - b_k}{b_i - b_k}. \]

The function

\[ P(x, y) = \sum_{i=0}^{m} \bigl[ A_i(x) f(a_i, y) + B_i(y) f(x, b_i) \bigr] - \sum_{i,j=0}^{m} A_i(x) B_j(y) f(a_i, b_j) \]

is a polynomial of degree at most $m$ in each variable. Moreover, we have $P(a_k, b_l) = f(a_k, b_l)$ for all $0 \le k, l \le m$. Since $f$ and $P$ are polynomials of degree at most $m$ of the first variable, we conclude that $P(x, b_l) = f(x, b_l)$ whenever $(x, b_l) \in S$. Knowing this, the same argument shows that $P(x, y) = f(x, y)$ for all $(x, y) \in S$, i.e., $f$ is a polynomial. That the degree of $f$ is at most $m$ follows from Lemma B.6.5.

Lemma B.6.7. Let $P$ be a polynomial on $\mathbb{R}^d$ with complex coefficients such that for all $t \in \mathbb{R}^d$ the polynomial $s \mapsto P(st)$, $s \in \mathbb{R}$, of one variable is of degree at most $k$. Then the degree of $P$ is at most $k$.

Proof. We write $P$ in the form $P = P_0 + \cdots + P_n$ where $n$ is the degree of $P$ and

\[ P_j(t) = \sum_{|\alpha| = j} c_\alpha t^\alpha, \qquad c_\alpha \in \mathbb{C},\ t \in \mathbb{R}^d. \]

Choose $t_0$ such that $P_n(t_0) \ne 0$. Using the relation $P_j(s t_0) = s^j P_j(t_0)$ we see that the degree of $s \mapsto P(s t_0)$ is equal to $n$ and hence $n \le k$.

B.7 The Lebesgue integral on Rd


We refer to [18] for basic properties of the multidimensional Lebesgue integral.

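B.7.1. Assume that the real-valued functions $h_1, \ldots, h_d$ are defined on the open set $U \subset \mathbb{R}^d$ and let $h(x) = (h_1(x), \ldots, h_d(x))$, $x \in U$. Suppose further that all partial derivatives $\partial h_i / \partial x_j$, $i, j = 1, \ldots, d$, exist. Then the matrix-valued function

\[
J h(x) = \left( \frac{\partial h_i}{\partial x_j}(x) \right)_{i,j=1}^{d}
= \begin{pmatrix}
\dfrac{\partial h_1}{\partial x_1}(x) & \cdots & \dfrac{\partial h_1}{\partial x_d}(x) \\
\vdots & & \vdots \\
\dfrac{\partial h_d}{\partial x_1}(x) & \cdots & \dfrac{\partial h_d}{\partial x_d}(x)
\end{pmatrix}, \qquad x \in U,
\]

is called the Jacobian matrix of $h$. The real-valued function $x \mapsto \det(J h(x))$ is usually called the Jacobian of $h$.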

Theorem B.7.2. Let $U$ be an open subset of $\mathbb{R}^d$ and $h : U \to \mathbb{R}^d$ be an injective function with continuous partial derivatives, the Jacobian of which is nonzero for every $x$ in $U$. A real- or complex-valued function $g$ is Lebesgue integrable over $h(U)$ if and only if $x \mapsto g(h(x)) \cdot \det(J h(x))$, $x \in U$, is Lebesgue integrable over $U$. In this case¹¹

\[ \int_{h(U)} g(y)\,dy = \int_U g(h(x)) \cdot |\det(J h(x))|\,dx. \]

Corollary B.7.3. Let $h$ be as in the previous theorem and write $\varphi = h^{-1}$. If $X$ is a $d$-dimensional random vector having density $p_X$, then the random vector $Y = h(X)$ has density

\[ p_Y(x) = p_X(\varphi(x)) \cdot |\det(J\varphi(x))|, \qquad x \in \mathbb{R}^d. \]

If $h(x) = Ax$ with a nonsingular matrix $A$, then

\[ p_Y(x) = \frac{1}{|\det A|} \cdot p_X(A^{-1} x), \qquad x \in \mathbb{R}^d. \]

11 Note that the assumptions imply that h.U / is open.



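B.7.4. Let $S^{d-1} = \{ t \in \mathbb{R}^d : \|t\| = 1 \}$. The map $F$ defined by

\[ F(t) := (\|t\|, t/\|t\|), \qquad t \in \mathbb{R}^d \setminus \{0\}, \]

is a continuous bijection from $\mathbb{R}^d \setminus \{0\}$ to $(0, \infty) \times S^{d-1}$ whose inverse is

\[ F^{-1}(r, s) = r s, \qquad (r, s) \in (0, \infty) \times S^{d-1}. \]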

Theorem B.7.5. There is a unique Borel measure $\sigma = \sigma_d$ on $S^{d-1}$ such that if $f$ is Borel measurable on $\mathbb{R}^d$ and $f \ge 0$ or $f \in L^1(\lambda)$, then

\[ \int_{\mathbb{R}^d} f(t)\,dt = \int_0^\infty \int_{S^{d-1}} f(rs)\, r^{d-1}\,d\sigma(s)\,dr. \tag{1} \]

Moreover, $\sigma$ is rotation-invariant.

Proof. For $E \in \mathcal{B}(S^{d-1})$ let $\sigma(E) := d \cdot \lambda(E_1)$, where

\[ E_a = F^{-1}((0, a] \times E) = \{ r s : 0 < r \le a,\ s \in E \}, \qquad a > 0. \]

Since the map $E \mapsto E_1$ takes Borel sets to Borel sets and commutes with unions, intersections and complements, it is clear that $\sigma$ is a Borel measure on $S^{d-1}$. The rotation-invariance of $\lambda$ implies that $\sigma$ is rotation-invariant as well. Since $E_a$ is the image of $E_1$ under the map $t \mapsto a t$, Theorem B.7.2 shows that $\lambda(E_a) = a^d \lambda(E_1)$.
We denote by $\nu$ the Borel measure on $(0, \infty) \times S^{d-1}$ induced by the mapping $F$, i.e., $\nu(B) = \lambda(F^{-1}(B))$. Further, define the measure $\rho$ on $(0, \infty)$ by $\rho(A) := \int_A r^{d-1}\,dr$. We show that $\nu = \rho \times \sigma$, from which equation (1) follows. We have

\[
\nu((a, b] \times E) = \lambda(E_b \setminus E_a) = \frac{b^d - a^d}{d} \cdot \sigma(E)
= \int_a^b r^{d-1}\,dr \cdot \sigma(E) = \rho((a, b])\,\sigma(E), \qquad E \in \mathcal{B}(S^{d-1}).
\]

Using this, a standard uniqueness argument shows that for fixed $E \in \mathcal{B}(S^{d-1})$ the measures $\nu$ and $\rho \times \sigma$ coincide on the $\sigma$-algebra

\[ \{ A \times E : A \in \mathcal{B}((0, \infty)) \}. \]

Since $E \in \mathcal{B}(S^{d-1})$ is arbitrary we conclude that $\nu = \rho \times \sigma$ on all Borel sets of $(0, \infty) \times S^{d-1}$.

Remark B.7.6. By considering the completion of the measure $\sigma$, which we denote also by $\sigma$, equation (B.7.5.1) can be extended to Lebesgue measurable functions. Note that the completion is rotation-invariant, as well.
Figure B.5. The function $d \mapsto 2\pi^{d/2}/\Gamma(d/2)$, $d \in (0, \infty)$, from Proposition B.7.8.

Corollary B.7.7. If $f$ is a measurable function on $\mathbb{R}^d$, nonnegative or integrable, such that $f(t) = g(\|t\|)$, $t \in \mathbb{R}^d$, for some function $g$ on $(0, \infty)$, then

\[ \int_{\mathbb{R}^d} f(t)\,dt = \sigma(S^{d-1}) \int_0^\infty g(r)\, r^{d-1}\,dr. \]

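Proposition B.7.8. We have¹²

\[ \sigma(S^{d-1}) = \frac{2\pi^{d/2}}{\Gamma(d/2)}. \]

Proof. Using Lemma B.1.10, Corollary B.7.7, and the substitution $s = r^2$ we obtain

\[
\begin{aligned}
\pi^{d/2} &= \prod_{j=1}^{d} \int_{-\infty}^{\infty} e^{-t_j^2}\,dt_j = \int_{\mathbb{R}^d} e^{-\|t\|^2}\,dt \\
&= \sigma(S^{d-1}) \int_0^\infty r^{d-1} e^{-r^2}\,dr = \frac{\sigma(S^{d-1})}{2} \int_0^\infty s^{(d-2)/2} e^{-s}\,ds \\
&= \frac{\sigma(S^{d-1})}{2}\,\Gamma(d/2).
\end{aligned}
\]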

12 See Figure B.5 for a plot of this expression.



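Proposition B.7.9. For $R \ge 0$ we have

\[ \lambda(\{ t \in \mathbb{R}^d : \|t\| \le R \}) = \frac{R^d}{d} \cdot \sigma(S^{d-1}) = \frac{R^d \pi^{d/2}}{\Gamma(d/2 + 1)}. \]

Proof. By Corollary B.7.7 and the previous proposition

\[ \lambda(\{ t \in \mathbb{R}^d : \|t\| \le R \}) = \sigma(S^{d-1}) \int_0^R r^{d-1}\,dr = \frac{R^d}{d} \cdot \sigma(S^{d-1}) = \frac{R^d \pi^{d/2}}{\Gamma(d/2 + 1)}. \]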
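
The two closed formulas of Propositions B.7.8 and B.7.9 are easy to evaluate. The short Python sketch below is an illustration; the function names are hypothetical and only the standard library function `math.gamma` is used.

```python
import math

def sphere_area(d):
    """Surface measure of the unit sphere S^{d-1} in R^d: 2 * pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def ball_volume(d, R=1.0):
    """Lebesgue measure of the ball of radius R in R^d: R^d * pi^(d/2) / Gamma(d/2 + 1)."""
    return R ** d * math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

for d in (1, 2, 3, 4):
    # Each row lists the dimension, the sphere area and the unit ball volume.
    print(d, sphere_area(d), ball_volume(d))
```

For $d = 2$ and $d = 3$ the values reduce to the familiar circumference $2\pi$, area $4\pi$ and volume $4\pi/3$.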
Spherical coordinates B.7.10. Recall that the angle $\theta$ between $x, y \in \mathbb{R}^d \setminus \{0\}$ is defined by

\[ \cos\theta = \frac{(x, y)}{\|x\| \cdot \|y\|}, \qquad 0 \le \theta \le \pi. \]

If $x = 0$ or $y = 0$, then we set $\theta := 0$.
Let $e_1, \ldots, e_d$ be the standard orthonormal basis in $\mathbb{R}^d$, $d \ge 2$. For $x \in \mathbb{R}^d$ write $r = \|x\|$. Let $u_j$ be the unit vector in the direction of the orthogonal projection of $x$ onto the space spanned by $e_j, \ldots, e_d$, $j = 2, \ldots, d - 1$. If the orthogonal projection is $0$, then we set $u_j := 0$. Denote by $\theta_1$ the angle between $x$ and $e_1$ and by $\theta_j$ the angle between $u_j$ and $e_j$, $j = 2, \ldots, d - 2$. Below we show that
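
\[ x = \sum_{j=1}^{d-2} \Bigl( r \prod_{k=1}^{j-1} \sin\theta_k \cos\theta_j \Bigr) \cdot e_j + \Bigl( r \prod_{k=1}^{d-2} \sin\theta_k \Bigr) \cdot u_{d-1}. \tag{1} \]

Since $u_{d-1}$ is a linear combination of $e_{d-1}$ and $e_d$, there is a unique $\varphi \in [0, 2\pi)$ such that $u_{d-1} = \cos\varphi \cdot e_{d-1} + \sin\varphi \cdot e_d$. From equation (1) we obtain

\[ x_1 = r \cos\theta_1, \tag{2} \]
\[ x_j = r \cos\theta_j \prod_{k=1}^{j-1} \sin\theta_k, \qquad j = 2, \ldots, d - 2, \tag{3} \]
\[ x_{d-1} = r \cos\varphi \prod_{k=1}^{d-2} \sin\theta_k, \tag{4} \]
\[ x_d = r \sin\varphi \prod_{k=1}^{d-2} \sin\theta_k, \tag{5} \]

where $r \in [0, \infty)$, $\theta_j \in [0, \pi]$, $1 \le j \le d - 2$, and $\varphi \in [0, 2\pi)$.
The numbers $r, \theta_1, \ldots, \theta_{d-2}$ and $\varphi$ are called the $d$-dimensional spherical coordinates of $x$.¹³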
13 This derivation of the spherical coordinates is taken from [8].

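Proof of (1). Since $x_1 = (x, e_1) = r \cos\theta_1$,

\[ x = r \cos\theta_1 \cdot e_1 + \sum_{j=2}^{d} x_j \cdot e_j. \]

We have

\[ r^2 = \|x\|^2 = r^2 \cos^2\theta_1 + \sum_{j=2}^{d} x_j^2 \quad \text{and hence} \quad \sum_{j=2}^{d} x_j^2 = r^2 \sin^2\theta_1. \]

Define $a_j$ by $x_j = a_j r \sin\theta_1$ if $\sin\theta_1 \ne 0$ and let $a_j = 0$, $j = 2, \ldots, d$, otherwise. Then $\sum_{j=2}^{d} a_j e_j$ has unit length and

\[ x = r \cos\theta_1 \cdot e_1 + r \sin\theta_1 \sum_{j=2}^{d} a_j \cdot e_j. \]

Since $\sin\theta_1 \ge 0$ we see that $u_2 = \sum_{j=2}^{d} a_j \cdot e_j$. We have $\cos\theta_2 = (u_2, e_2) = a_2$ and

\[ u_2 = \cos\theta_2 \cdot e_2 + \sum_{j=3}^{d} a_j \cdot e_j. \]

Moreover,

\[ 1 = \|u_2\|^2 = \cos^2\theta_2 + \sum_{j=3}^{d} a_j^2 \quad \text{and hence} \quad \sum_{j=3}^{d} a_j^2 = \sin^2\theta_2. \]

Define $b_j$ by $a_j = b_j \sin\theta_2$ if $\sin\theta_2 \ne 0$ and let $b_j = 0$, $j = 3, \ldots, d$, otherwise. Then $\sum_{j=3}^{d} b_j e_j$ has unit length and

\[ u_2 = \cos\theta_2 \cdot e_2 + \sin\theta_2 \cdot \sum_{j=3}^{d} b_j e_j, \]

i.e., $u_3 = \sum_{j=3}^{d} b_j e_j$. Thus,

\[ x = r \cos\theta_1 \cdot e_1 + r \sin\theta_1 \cos\theta_2 \cdot e_2 + r \sin\theta_1 \sin\theta_2 \cdot u_3. \]

Continuing this process we obtain the equation (1).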
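
Formulas (2) to (5) of B.7.10 translate directly into a conversion routine. The following Python sketch is an illustration under stated assumptions: the function name and argument layout are hypothetical, the polar angles are passed as a list, and the last argument plays the role of the angle in $[0, 2\pi)$.

```python
import math

def spherical_to_cartesian(r, thetas, phi):
    """Cartesian coordinates (x_1, ..., x_d) from d-dimensional spherical coordinates.

    thetas holds the d - 2 polar angles in [0, pi]; phi is the last angle in [0, 2*pi).
    For d = 2 the list thetas is empty and the result is (r cos phi, r sin phi).
    """
    x = []
    prefix = r                      # r * sin(theta_1) * ... * sin(theta_{j-1})
    for theta in thetas:
        x.append(prefix * math.cos(theta))
        prefix *= math.sin(theta)
    x.append(prefix * math.cos(phi))
    x.append(prefix * math.sin(phi))
    return x

# Illustrative point in R^5 (the numerical values are arbitrary).
point = spherical_to_cartesian(1.7, [0.3, 1.1, 2.0], 4.0)
print(point, math.sqrt(sum(c * c for c in point)))
```

The Euclidean norm of the returned vector equals $r$ up to rounding, which gives a quick consistency check of the product structure.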

Remark B.7.11. Note that the Jacobian of the transformation (B.7.10.1) is given by

\[ J(r, \theta_1, \ldots, \theta_{d-2}, \varphi) = r^{d-1} \prod_{k=1}^{d-2} \sin^{d-1-k}\theta_k. \]

Let $d \ge 2$. Setting $r = 1$ in equation (B.7.10.1) we see that $S^{d-1}$ can be parameterized by angles $\theta_j \in [0, \pi]$, $1 \le j \le d - 2$, and $\varphi \in [0, 2\pi)$, which are related to the Cartesian coordinates $x_1, \ldots, x_d$ by

\[ x_1 = \cos\theta_1, \quad x_2 = \sin\theta_1 \cos\theta_2, \quad x_3 = \sin\theta_1 \sin\theta_2 \cos\theta_3, \quad \ldots, \]
\[ x_{d-1} = \sin\theta_1 \cdots \sin\theta_{d-2} \cos\varphi, \quad x_d = \sin\theta_1 \cdots \sin\theta_{d-2} \sin\varphi. \]

Using this parameterization the surface measure $\sigma_d$ on $S^{d-1}$ can be given explicitly by

\[ d\sigma_d(\theta_1, \ldots, \theta_{d-2}, \varphi) = \sin^{d-2}\theta_1 \sin^{d-3}\theta_2 \cdots \sin\theta_{d-2}\,d\theta_1 \cdots d\theta_{d-2}\,d\varphi. \]

Making the substitution $t = \cos\theta_1$ on the right-hand side we obtain the recursive relation

\[ d\sigma_d(\theta_1, \ldots, \theta_{d-2}, \varphi) = (1 - t^2)^{(d-3)/2}\,d\sigma_{d-1}(\theta_2, \ldots, \theta_{d-2}, \varphi)\,dt \tag{1} \]

for $d \ge 3$.
Appendix C

Advanced analysis

C.1 Functions of a complex variable


We refer to [46] and [56] for proofs of the classical results not presented in the present section.
Throughout this section the symbol $D$ denotes a nonempty open subset of $\mathbb{C}$. If $r \in (0, \infty]$ and $z_0 \in \mathbb{C}$, then we write

\[ D(z_0, r) = \{ z \in \mathbb{C} : |z - z_0| < r \}. \]

Definition C.1.1. A complex-valued function $f$ on $D$ is said to be differentiable at $z_0 \in D$ if the limit

\[ \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \]

exists. This limit is then denoted by $f'(z_0)$ and is called the derivative of $f$ at $z_0$.

Higher order derivatives are defined recursively in the same way as for real-valued
functions. If f is differentiable at all points of D, then f is said to be holomorphic on
D. The set of all holomorphic functions on D will be denoted by H.D/. Functions
that are holomorphic on C are also called entire.
Unlike the real case, holomorphic functions are infinitely often differentiable.

Theorem C.1.2. If $f \in H(D)$ then $f' \in H(D)$.


Power series and Taylor expansion C.1.3.
A power series is a series of the form

\[ \sum_{k=0}^{\infty} c_k \cdot (z - z_0)^k \]

where $z$, $z_0$ and $c_k$ are complex numbers. The radius of convergence $r$ of this series is a nonnegative number or $\infty$ given by

\[ r = \frac{1}{\limsup_{k \to \infty} \sqrt[k]{|c_k|}}. \]

If $|z - z_0| < r$, then the series converges absolutely; if $|z - z_0| > r$, then the series is not convergent. If $r > 0$, then the function

\[ g(z) = \sum_{k=0}^{\infty} c_k \cdot (z - z_0)^k, \qquad z \in D(z_0, r), \]

is holomorphic on $D(z_0, r)$.
If $f$ is holomorphic on $D$ and $D(z_0, q) \subset D$ for some $q > 0$, then the convergence radius of the power series

\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(z_0)}{k!} \cdot (z - z_0)^k \]

is at least $q$ and the series coincides with $f$ on $D(z_0, q)$. This series is called the Taylor expansion of $f$ at $z_0$.

Line integrals C.1.4. A curve in the complex plane is a continuous mapping $\gamma$ of a finite interval $[a, b] \subset \mathbb{R}$ into $\mathbb{C}$. If $\gamma(a) = \gamma(b)$, then $\gamma$ is said to be closed.
A curve $\gamma$ can be written in the form

\[ \gamma(t) = x(t) + i y(t), \qquad t \in [a, b], \]

where $x$ and $y$ are continuous real-valued functions on $[a, b]$. If $x$ and $y$ are piecewise continuously differentiable, then $\gamma$ is called a path. We define $\gamma'(t)$ by

\[ \gamma'(t) = x'(t) + i y'(t) \]

for all $t$ where the derivatives $x'(t)$ and $y'(t)$ exist.
Let $f : D \to \mathbb{C}$ be a continuous function and $\gamma : [a, b] \to D$, $a, b \in \mathbb{R}$, be a path in $D$. The line integral

\[ \int_\gamma f(z)\,dz \]

of $f$ along $\gamma$ is defined by

\[ \int_\gamma f(z)\,dz = \int_a^b f(\gamma(t)) \cdot \gamma'(t)\,dt. \]

We use the notation

\[ \int_{[z_0, z_1]} f(z)\,dz := \int_\gamma f(z)\,dz = (z_1 - z_0) \int_0^1 f(z_0 + (z_1 - z_0) t)\,dt, \qquad z_0, z_1 \in \mathbb{C}, \]

where $\gamma$ is the path given by

\[ \gamma(t) = z_0 + (z_1 - z_0) t, \qquad 0 \le t \le 1. \]

For $z_0, z_1, z_2 \in \mathbb{C}$ let $\Delta = \Delta(z_0, z_1, z_2)$ denote the triangle with vertices at $z_0$, $z_1$ and $z_2$ and define

\[ \int_{\partial\Delta} f(z)\,dz := \int_{[z_0, z_1]} f(z)\,dz + \int_{[z_1, z_2]} f(z)\,dz + \int_{[z_2, z_0]} f(z)\,dz. \]

Theorem C.1.5 (Cauchy). Suppose $D$ is a convex open set and $f \in H(D)$. Then

\[ \int_\gamma f(z)\,dz = 0 \]

for every closed path $\gamma$ in $D$. If $z_0 \in D$, then the function

\[ F(z) := \int_{[z_0, z]} f(w)\,dw, \qquad z \in D, \]

belongs to $H(D)$ and $F' = f$.

Theorem C.1.6 (Morera). Let $f : D \to \mathbb{C}$ be a continuous function such that

\[ \int_{\partial\Delta} f(z)\,dz = 0 \]

for every triangle $\Delta \subset D$. Then $f \in H(D)$.


Simply connected sets C.1.7. Suppose $\gamma_0$ and $\gamma_1$ are closed curves in $X \subset \mathbb{C}$, both with parameter interval $[0, 1]$. We say that these curves are $X$-homotopic if there is a continuous mapping $H$ from $[0, 1] \times [0, 1]$ into $X$ such that

\[ H(s, 0) = \gamma_0(s), \quad H(s, 1) = \gamma_1(s), \quad H(0, t) = H(1, t), \qquad t, s \in [0, 1]. \]

If $X$ is connected and every closed curve in $X$ is $X$-homotopic to a constant mapping, then $X$ is said to be simply connected. Intuitively, this means that every closed curve can be continuously deformed to a single point within $X$.

Theorem C.1.8. Suppose that $D$ is connected and let $\{z_n\}$ be a sequence in $D$ having at least one accumulation point in $D$. If $f$ and $g$ are holomorphic functions in $D$ and $f(z_n) = g(z_n)$ for all $n$, then $f(z) = g(z)$ for all $z \in D$.
Next we prove a result on the zeros of certain holomorphic functions.

Theorem C.1.9 (Eneström–Kakeya¹). Let $a_0 \ge a_1 \ge a_2 \ge \cdots \ge 0$ and $a_0 \ne 0$. Then

\[ f(z) := \sum_{j=0}^{\infty} a_j z^j \ne 0, \qquad |z| < 1. \]

Proof. We may assume that $a_0 = 1$. Setting

\[ g(z) := \sum_{j=1}^{\infty} (a_{j-1} - a_j) z^j, \qquad |z| < 1, \]

1 The original Eneström–Kakeya theorem is on the zeros of polynomials. The present formulation and
its simple proof is due to T. Koornwinder.

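we have $(1 - z) f(z) = 1 - g(z)$. Since

\[ \sum_{j=1}^{\infty} (a_{j-1} - a_j) = 1 - \lim_{j \to \infty} a_j \le 1 \]

and $a_{j-1} - a_j \ge 0$ we see that

\[ |g(z)| \le \sum_{j=1}^{\infty} (a_{j-1} - a_j) |z|^j \le \sum_{j=1}^{\infty} (a_{j-1} - a_j) |z| \le |z|. \]

So

\[ |(1 - z) f(z)| \ge 1 - |g(z)| \ge 1 - |z| > 0, \qquad |z| < 1. \]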
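
The final inequality of the proof lends itself to a numerical spot check. The Python sketch below is only an illustration: it uses the finite coefficient sequence $a_j = 1/(j+1)$, $j = 0, \ldots, 20$ (continued by zeros, so it is still nonincreasing), and the helper names and sample radii are assumptions made for the example.

```python
import cmath

def f(z, coeffs):
    """Evaluate sum_j coeffs[j] * z**j by Horner's scheme."""
    value = 0.0 + 0.0j
    for c in reversed(coeffs):
        value = value * z + c
    return value

def lower_bound_holds(coeffs, radii, n_angles=360, tol=1e-12):
    """Check |(1 - z) f(z)| >= 1 - |z| on sample points of the open unit disc.

    Assumes coeffs[0] == 1 and that coeffs is nonincreasing and nonnegative.
    """
    for rho in radii:
        for k in range(n_angles):
            z = cmath.rect(rho, 2.0 * cmath.pi * k / n_angles)
            if abs((1.0 - z) * f(z, coeffs)) < 1.0 - abs(z) - tol:
                return False
    return True

# Illustrative coefficients a_j = 1/(j+1), j = 0, ..., 20.
coeffs = [1.0 / (j + 1) for j in range(21)]
print(lower_bound_holds(coeffs, [0.1, 0.5, 0.9, 0.99]))
```

A passing check on finitely many points does not replace the proof; it only confirms that the bound behaves as stated for this sample sequence.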

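Proposition C.1.10. For $n \in \mathbb{N}_0$ we have

\[ \int_{[z, 1]} (1 - t^2)^n\,dt = (1 - z)^{n+1} Q_n(z) \]

where $Q_n$ is a polynomial of degree $n$, the zeros of which lie in the closed disc $\overline{D}(z_0, r)$ where²

\[ z_0 = -\frac{1}{2} \Bigl( n + 1 + \frac{1}{n+1} \Bigr) \quad \text{and} \quad r = \frac{1}{2} \Bigl( n + 1 - \frac{1}{n+1} \Bigr). \]

Proof. We consider the polynomial

\[ P_j(z) = \int_{[z, 1]} (1 - t)^{n+j} (1 + t)^{n-j}\,dt, \qquad 0 \le j \le n,\ z \in \mathbb{C}. \]

Then

\[ P_n(z) = \frac{1}{2n+1} (1 - z)^{2n+1} \tag{1} \]

and integration by parts gives

\[ P_j(z) = \frac{1}{n+j+1} (1 - z)^{n+j+1} (1 + z)^{n-j} + \frac{n-j}{n+j+1} P_{j+1}(z) \tag{2} \]

for all $j < n$. Setting

\[ c_j := \frac{1}{n+1} \cdot \frac{n}{n+2} \cdot \frac{n-1}{n+3} \cdots \frac{n-j+1}{n+j+1} \]

from (1) and (2) we obtain

\[ P_0(z) = \sum_{j=0}^{n} c_j (1 - z)^{n+1+j} (1 + z)^{n-j} \]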

2 This proposition is taken from [39] where the authors state that the zeros of Qn lie in the open disc
D.z0 ; r/. However, the real part of two zeros of Q1 is equal to  12 .
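and therefore

\[ Q_n(z) = (1 + z)^n \sum_{j=0}^{n} c_j \Bigl( \frac{1 - z}{1 + z} \Bigr)^j. \tag{3} \]

Using the notation

\[ a_j := c_j \Bigl( \frac{n+2}{n} \Bigr)^j, \qquad H(w) := \sum_{j=0}^{n} a_j w^j, \]

equation (3) yields

\[ Q_n(z) = (1 + z)^n H\Bigl( \frac{n}{n+2} \cdot \frac{1 - z}{1 + z} \Bigr). \]

Since

\[ 0 < \frac{a_{j+1}}{a_j} = \frac{n-j}{n+j+2} \cdot \frac{n+2}{n} \le 1, \]

the Eneström–Kakeya Theorem C.1.9 shows that all of the zeros of $H$ lie in the set $S := \mathbb{C} \setminus D(0, 1)$. The proposition follows from the fact that the image of $S \cup \{\infty\}$ under the inverse of the map

\[ z \mapsto w = \frac{n}{n+2} \cdot \frac{1 - z}{1 + z} \]

is $\overline{D}(z_0, r)$.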

Lemma C.1.11. Let $f$ be a holomorphic function on a convex open set $D \subset \mathbb{C}$ and assume that $f$ has no zeros. Then there exists a holomorphic function $g$ on $D$ such that $f = e^g$. If $z_0 \in D$ and $f(z_0) = 1$, then $g$ can be chosen such that $g(z_0) = 0$.³
Proof. It suffices to prove the second statement. Since $f$ has no zeros, the function $h = f'/f$ is holomorphic. The second part of Theorem C.1.5 shows that

\[ g(z) = \int_{[z_0, z]} h(w)\,dw, \qquad z \in D, \]

is holomorphic, as well. Since $g' = h$, we have

\[ \frac{d}{dz} \Bigl( f(z) e^{-g(z)} \Bigr) = e^{-g(z)} \Bigl[ f'(z) - f(z) \cdot \frac{f'(z)}{f(z)} \Bigr] = 0. \]

Thus, $f e^{-g}$ is constant and therefore $f(z) e^{-g(z)} = f(z_0) e^{-g(z_0)} = 1$.

Lemma C.1.12. Let $g$ be a holomorphic function on $D(0, R)$ given by

\[ g(z) = \sum_{k=0}^{\infty} c_k z^k, \qquad |z| < R. \]

3 See also Section C.8.



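Then

\[ c_n r^n = \frac{1}{\pi} \int_0^{2\pi} \operatorname{Re} g\bigl( r e^{it} \bigr) \cdot e^{-int}\,dt, \qquad r \in \mathbb{R},\ |r| < R,\ n \in \mathbb{N}. \]

Proof. Writing $c_k = a_k + i b_k$, $a_k, b_k \in \mathbb{R}$, and using that the series is absolutely convergent for $|z| \le r$ we obtain

\[ \int_0^{2\pi} \operatorname{Re} g\bigl( r e^{it} \bigr) \cdot e^{-int}\,dt = \sum_{k=0}^{\infty} r^k \int_0^{2\pi} [a_k \cos kt - b_k \sin kt] \cdot [\cos nt - i \sin nt]\,dt = \pi c_n r^n. \]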
kD0

Corollary C.1.13. Let $g$ be as in the previous lemma. Then the inequality

\[ |c_n| \le \frac{2}{r^n} \cdot \sup_{|z| = r} |\operatorname{Re} g(z)|, \qquad 0 < r < R, \]

holds for all $n \in \mathbb{N}$.

Corollary C.1.14. Let $g$ be an entire function such that for some $k \in \mathbb{N}_0$ and constants $C > 0$, $R > 0$ the inequality

\[ |\operatorname{Re} g(z)| \le C |z|^k, \qquad |z| \ge R, \]

holds. Then $g$ is a polynomial of degree at most $k$.

Proof. By Corollary C.1.13, the inequality $|c_n| \le 2 C r^{k-n}$ holds for all $r \ge R$ and $n \ge 1$. Thus, $c_n = 0$ if $n > k$.

Theorem C.1.15 (Jensen’s formula). Suppose $0 < r < R$, and let $f$ be a holomorphic function on $D(0, R)$ such that $f(0) \ne 0$. Denote by $z_1, \ldots, z_N$ the zeros of $f$ in the closed disc $\overline{D}(0, r)$, listed according to their multiplicities. Then

\[ |f(0)| \cdot \prod_{n=1}^{N} \frac{r}{|z_n|} = \exp\Bigl\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log |f(r e^{i\theta})|\,d\theta \Bigr\}. \]

Corollary C.1.16. Let $f$ be an entire function such that $f(0) = 1$ and denote by $n(r)$, $0 < r < \infty$, the number of zeros of $f$ in the closed disc $\overline{D}(0, r)$. Then

\[ n(r) \le \frac{\log M(2r)}{\log 2} \]

where

\[ M(r) = \sup_{\theta \in [0, 2\pi)} |f(r e^{i\theta})|. \]
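Proof. Let $\{z_k\}$ be the sequence of zeros of $f$, listed according to their multiplicities and arranged so that the sequence $\{|z_k|\}$ is increasing. By Jensen’s formula (applied with radius $2r$ and $f(0) = 1$) and the inequalities $|z_k| \le 2r$ for $k \le n(2r)$ and $|z_k| \le r$ for $k \le n(r)$ we have

\[
M(2r) \ge \exp\Bigl\{ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log |f(2r e^{i\theta})|\,d\theta \Bigr\}
= \prod_{k=1}^{n(2r)} \frac{2r}{|z_k|} \ge \prod_{k=1}^{n(r)} \frac{2r}{|z_k|} \ge 2^{n(r)}.
\]

Taking logarithms we obtain the desired inequality.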

Definition C.1.17. An entire function $f$ is said to be of finite order if there exist positive numbers $A$ and $C$ such that

\[ |f(z)| \le C e^{|z|^A}, \qquad z \in \mathbb{C}. \]

The infimum $\rho$ of all numbers $A$ for which this is true is called the order of $f$.

A polynomial is of order zero. The functions $e^z$, $\cos z$ and $\sin z$ are of order 1 while $e^{z^k}$, $k \in \mathbb{N}$, is of order $k$. The function $e^{e^z}$ is of infinite order.

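Lemma C.1.18. If $f$ is an entire function of order $\rho$, $f(0) \ne 0$, and $r_1, r_2, \ldots$ are the moduli of the zeros of $f$, then the series

\[ \sum_{n \ge 1} \frac{1}{r_n^{\alpha}} \]

is convergent if $\alpha > \rho$.

Theorem C.1.19 (Hadamard). If $f$ is an entire function of order $\rho$, with zeros $z_1, z_2, \ldots$ such that $f(0) \ne 0$, then

\[ f(z) = e^{Q(z)} \prod_{n \ge 1} \Bigl( 1 - \frac{z}{z_n} \Bigr) \exp\Bigl\{ \frac{z}{z_n} + \frac{1}{2} \Bigl( \frac{z}{z_n} \Bigr)^2 + \cdots + \frac{1}{p} \Bigl( \frac{z}{z_n} \Bigr)^p \Bigr\} \]

where $Q$ is a polynomial of degree not greater than $\rho$ and $p$ is the smallest integer for which

\[ \sum_{n \ge 1} \frac{1}{|z_n|^{p+1}} \]

is convergent.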

Remark C.1.20. Since the functions $\cos$ and $\sin$ are of order 1, the well-known product formulae

\[ \sin z = z \prod_{n=1}^{\infty} \Bigl( 1 - \frac{z^2}{n^2 \pi^2} \Bigr), \qquad \cos z = \prod_{n=1}^{\infty} \Bigl( 1 - \frac{z^2}{(n - 1/2)^2 \pi^2} \Bigr) \]

are consequences of Hadamard’s theorem.⁴
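
The slow but steady convergence of these two products can be observed with a few lines of code. The Python sketch below is an illustration restricted to real arguments; the function names, the sample point $z = 1$ and the numbers of factors are assumptions made for the example.

```python
import math

def sin_product(z, terms):
    """Partial product z * prod_{n=1}^{terms} (1 - z^2 / (n^2 pi^2))."""
    value = z
    for n in range(1, terms + 1):
        value *= 1.0 - z * z / (n * n * math.pi ** 2)
    return value

def cos_product(z, terms):
    """Partial product prod_{n=1}^{terms} (1 - z^2 / ((n - 1/2)^2 pi^2))."""
    value = 1.0
    for n in range(1, terms + 1):
        value *= 1.0 - z * z / ((n - 0.5) ** 2 * math.pi ** 2)
    return value

z = 1.0
for terms in (10, 1000, 20000):
    # Absolute errors of the partial products against math.sin and math.cos.
    print(terms, abs(sin_product(z, terms) - math.sin(z)), abs(cos_product(z, terms) - math.cos(z)))
```

The error decays roughly like the reciprocal of the number of factors, which reflects the size of the tail of $\sum 1/n^2$.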

Theorem C.1.21 (Phragmén–Lindelöf). Let $\alpha < \beta \le \alpha + 2\pi$ and let $f$ be a function, holomorphic in the sector

\[ S = \{ z \in \mathbb{C} : \alpha < \arg z < \beta \} \]

and continuous on its boundary. If $|f(z)| \le M$ for all $z$ on the boundary of $S$ and

\[ |f(z)| \le C e^{|z|^A}, \qquad z \in S, \]

where $0 \le C$ and $0 \le A < \dfrac{\pi}{\beta - \alpha}$, then $|f(z)| \le M$ for all $z \in S$.

C.2 Almost periodic functions

Definition C.2.1. A complex-valued function $g$ on $\mathbb{R}$ is called almost periodic if for every $\varepsilon > 0$ there exist finitely many subsets $U_1, \ldots, U_n$ of $\mathbb{R}$ with

\[ \bigcup_{j=1}^{n} U_j = \mathbb{R} \]

such that the inequality

\[ |g(s + x) - g(s + y)| < \varepsilon \]

holds for every $s \in \mathbb{R}$ whenever $x, y \in U_j$ with some $j = 1, \ldots, n$. We call $\{U_1, \ldots, U_n\}$ an $\varepsilon$-partition for $g$.

Next we prove a few basic facts about almost periodic functions.⁵

Lemma C.2.2. Every continuous periodic function g is almost periodic.

4 See, e.g., § 3.23 in [56] for another proof of these product representations.
5 See Maak [41] for more information on almost periodic functions.

Proof. Assume, without loss of generality, that $g$ is periodic with period 1. Since $g$ is uniformly continuous on $[0, 1]$, for every $\varepsilon > 0$ there exists $n \in \mathbb{N}$ such that $|g(t) - g(s)| < \varepsilon$ whenever $t, s \in [0, 1]$ and $|t - s| \le 1/n$. Using this and the fact that $g$ is periodic with period 1 we see that
\[ U_j := \bigcup_{k=-\infty}^{\infty} [k + (j-1)/n, \; k + j/n], \quad j = 1, \ldots, n \]
is an $\varepsilon$-partition for $g$.

Lemma C.2.3. If g and h are almost periodic, then so is their sum.

Proof. If $\{U_1, \ldots, U_n\}$ and $\{V_1, \ldots, V_m\}$ are $\varepsilon$-partitions for $g$ and $h$, respectively, then the sets
\[ U_i \cap V_j, \quad i = 1, \ldots, n; \; j = 1, \ldots, m \]
form a $2\varepsilon$-partition for $g + h$.

Lemma C.2.4. If $g$ is the uniform limit of almost periodic functions $g_n$, $n \in \mathbb{N}$, then $g$ is almost periodic.

Proof. For each $\varepsilon > 0$ there exists $n \in \mathbb{N}$ such that
\[ |g(t) - g_n(t)| < \varepsilon/3, \quad t \in \mathbb{R}. \]
Since
\[ |g(s+x) - g(s+y)| \le |g(s+x) - g_n(s+x)| + |g_n(s+x) - g_n(s+y)| + |g_n(s+y) - g(s+y)| \le |g_n(s+x) - g_n(s+y)| + 2\varepsilon/3 \]
we see that every $\varepsilon/3$-partition for $g_n$ is an $\varepsilon$-partition for $g$.

Applying the previous lemmata we obtain the following corollary.

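Corollary C.2.5. Every function $g$ of the form
\[ g(t) = \sum_{n=1}^{\infty} c_n e^{i \alpha_n t}, \quad t \in \mathbb{R} \]
where $\alpha_n \in \mathbb{R}$, $c_n \in \mathbb{C}$ and $\sum_{n=1}^{\infty} |c_n| < \infty$, is almost periodic.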

Theorem C.2.6. Let $g$ be a continuous almost periodic function. Then for all $\varepsilon > 0$ there exists a positive number $L = L(\varepsilon)$ such that every closed interval of the form $[a, a + L]$, $a \in \mathbb{R}$, contains a number $\tau$ satisfying⁶
\[ |g(s + \tau) - g(s)| < \varepsilon, \quad s \in \mathbb{R}. \tag{1} \]

Proof. Let $\{U_1, \ldots, U_n\}$ be an $\varepsilon$-partition for $g$. Further, let $a \in \mathbb{R}$ and $a_j \in U_j$ be arbitrary and write $L := 2 \max(|a_1|, \ldots, |a_n|)$. Choose $j \in \{1, \ldots, n\}$ such that $a + L/2 \in U_j$ and define $\tau$ by $\tau := a + L/2 - a_j$. Then $\tau \in [a, a + L]$. Since $a_j$ and $a + L/2$ are in $U_j$ we have
\[ |g(s + a + L/2) - g(s + a_j)| < \varepsilon, \quad s \in \mathbb{R}. \]
Replacing here $s$ by $s - a_j$ we see that $\tau$ satisfies the inequality (1).

Corollary C.2.7. If $g$ is a continuous almost periodic function, then the relation
\[ \liminf_{t \to \infty} |g(t) - g(s)| = 0 \]
holds for all $s \in \mathbb{R}$.
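The following numerical sketch is not part of the original text. It illustrates Theorem C.2.6 for the particular almost periodic function $g(t) = e^{it} + e^{i\sqrt{2}t}$ (an instance of Corollary C.2.5). The function name `translation_defect` and the choice of frequencies are assumptions made for illustration only. It uses the elementary bound $|g(s+\tau) - g(s)| \le \sum_k |e^{i\alpha_k \tau} - 1|$, which is uniform in $s$.

```python
import math

def translation_defect(tau, freqs=(1.0, math.sqrt(2.0))):
    """Upper bound for sup_s |g(s + tau) - g(s)|, g(t) = sum_k exp(i * freq_k * t).

    Uses |exp(i*w*tau) - 1| = 2 |sin(w * tau / 2)| for each frequency w.
    """
    return sum(abs(2.0 * math.sin(0.5 * w * tau)) for w in freqs)

# Candidates tau = 2*pi*k kill the first frequency; the second one is small
# whenever sqrt(2) * k is close to an integer (k = 29 and k = 70 work well).
for k in (5, 12, 29, 70):
    print(k, translation_defect(2.0 * math.pi * k))
```

Values of $\tau$ with a small defect recur along the real line, which is the behaviour the theorem describes.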

C.3 Fourier series


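Let $f$ be a Lebesgue integrable function on the interval $[0, 2\pi)$. The series
\[ \sum_{n \in \mathbb{Z}} \hat f(n) e^{inx}, \quad x \in [0, 2\pi) \]
where
\[ \hat f(n) = \frac{1}{2\pi} \int_0^{2\pi} f(x) e^{-inx} \, dx, \quad n \in \mathbb{Z} \]
is called the Fourier series of $f$. If $N$ is a nonnegative integer, the $N$-th symmetric partial sum of the Fourier series of $f$ is
\[ s_N f(x) = \sum_{|n| \le N} \hat f(n) e^{inx}. \]
If $f = 1_{[a,b)}$ where $[a, b) \subset [0, 2\pi)$, then
\[ s_N f(x) = \frac{1}{2\pi} \sum_{|j| \le N} \frac{e^{-ija} - e^{-ijb}}{ij} \, e^{ijx}, \]
where the term with $j = 0$ is understood as $b - a$.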

6 It can be shown that this property implies almost periodicity; see Maak [41], p. 94.

Recall that the total variation $V(f)$ of a complex-valued function $f$ defined on an interval $[a, b]$ is the quantity
\[ V(f) := \sup \sum_{i=1}^{m} |f(x_i) - f(x_{i-1})| \]
where the supremum is taken over all $m \in \mathbb{N}$ and all $x_i \in [a, b]$ such that $a = x_0 < x_1 < \cdots < x_m = b$. If $V(f)$ is finite, then $f$ is said to be of bounded variation.
A real-valued function f is of bounded variation if and only if it can be expressed
as the difference of two monotone functions.
A proof of the next theorem can be found, e.g., in Section 10.1 of [13].

Theorem C.3.1. Let $f$ be a Lebesgue integrable function on the interval $[0, 2\pi)$ and suppose that the total variation of $f$ is finite. Then
\[ \lim_{N \to \infty} s_N f(x) = \frac12 \cdot [f(x + 0) + f(x - 0)], \quad x \in [0, 2\pi) \]
and
\[ |s_N f(x)| \le K \]
with some constant $K \ge 0$.⁷
Applying Lebesgue’s theorem on dominated convergence we obtain the following corollary.

Corollary C.3.2. Let $f$ be as in the previous theorem and let $\mu$ be a complex Borel measure on $[0, 2\pi)$. If $\mu(\{x\}) = 0$ for all discontinuity points⁸ of $f$, then $s_N f$ converges to $f$ in $L^2(\mu)$.
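As an illustration that is not taken from the original text, the sketch below evaluates the symmetric partial sums of the Fourier series of an indicator function $1_{[a,b)}$ and compares them with the limits predicted by Theorem C.3.1: the value $1/2$ at a jump, $1$ inside the interval and $0$ outside. The function name `partial_sum_indicator` and the sample interval are assumptions.

```python
import cmath
import math

def partial_sum_indicator(x, a, b, N):
    """N-th symmetric partial sum of the Fourier series of 1_[a,b) at the point x."""
    total = (b - a) / (2.0 * math.pi)  # the coefficient with index 0
    for j in range(1, N + 1):
        for n in (j, -j):
            coeff = (cmath.exp(-1j * n * a) - cmath.exp(-1j * n * b)) / (2.0 * math.pi * 1j * n)
            total += coeff * cmath.exp(1j * n * x)
    return total.real  # the imaginary parts of the +-j terms cancel

a, b = 1.0, 3.0
for x in (a, 2.0, 5.0):
    print(x, partial_sum_indicator(x, a, b, 2000))
```

The convergence at the jump is slow (of order $1/N$), so only a few digits should be expected.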

C.4 The Gamma function and the formulae of Stirling and Binet

Definition C.4.1. The Gamma function $\Gamma$ is defined by
\[ \Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t} \, dt, \quad x \in (0, \infty). \]

Note that if $Z$ is an exponentially distributed random variable with expectation one, then
\[ E(Z^x) = \int_0^{\infty} t^x e^{-t} \, dt = \Gamma(x + 1), \quad x > -1. \]

7 $f(0 - 0)$ is defined by the $2\pi$-periodic extension of $f$.
8 Discontinuity in 0 is defined by the $2\pi$-periodic extension of $f$.

Figure C.1. The Gamma function $\Gamma$ (continuous line) and the function $x \mapsto n! \, n^x / (x(x+1) \cdots (x+n))$ from relation (C.4.2.v), shown for $n = 6$ and $n = 12$.

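Lemma C.4.2. We have
(i) $\Gamma(x) < \infty$ and $\Gamma$ is continuous;
(ii) $\Gamma(x + 1) = x \Gamma(x)$;
(iii) $\Gamma(n + 1) = n!$, $n \in \mathbb{N}_0$;
(iv) $\Gamma(1/2) = \sqrt{\pi}$;
(v) $\Gamma(x) = \lim_{n \to \infty} \dfrac{n! \, n^x}{x(x+1) \cdots (x+n)}$.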

The relation (C.4.2.v) is illustrated in Figure C.1.

Proof. (i) It is easy to check that $\Gamma(x) < \infty$ while the continuity can be proved by using Lebesgue’s theorem on dominated convergence.
(ii) Integrating by parts we obtain
\[ \Gamma(x + 1) = \int_0^{\infty} t^x e^{-t} \, dt = -t^x e^{-t} \Big|_0^{\infty} + x \int_0^{\infty} t^{x-1} e^{-t} \, dt = x \Gamma(x). \]
(iii) Since $\Gamma(1) = 1$, (iii) follows from (ii) by induction on $n$.
(iv) We have
\[ \Gamma(1/2) = \int_0^{\infty} t^{1/2 - 1} e^{-t} \, dt = \int_0^{\infty} \frac{1}{\sqrt{t}} e^{-t} \, dt = 2 \int_0^{\infty} e^{-u^2} \, du = \sqrt{\pi} \]
where we made the substitution $t = u^2$ and applied Lemma B.1.10.
(v) Repeated integration by parts gives
\[ \int_0^n t^{x-1} \Big(1 - \frac{t}{n}\Big)^n dt = \frac{1}{x} \int_0^n t^x \Big(1 - \frac{t}{n}\Big)^{n-1} dt = \cdots = \frac{n! \, n^x}{x(x+1) \cdots (x+n)} \]
from which (v) follows by dominated convergence.⁹
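A small numerical sketch, not part of the original text, may help to see how slowly the limit in (C.4.2.v) is approached. The helper name `gauss_product` is an assumption; the product is evaluated through logarithms to avoid overflow, and the reference value comes from Python’s `math.gamma`.

```python
import math

def gauss_product(x, n):
    """n! * n**x / (x (x+1) ... (x+n)), evaluated through logarithms."""
    log_value = math.lgamma(n + 1) + x * math.log(n)
    log_value -= sum(math.log(x + k) for k in range(n + 1))
    return math.exp(log_value)

x = 2.5
for n in (6, 12, 1000, 100000):
    print(n, gauss_product(x, n), math.gamma(x))
```

The approximations increase with $n$, in line with the monotonicity remark in the footnote to the proof.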

Definition C.4.3. The Beta function $B$ (see Figure C.2) is defined by the equation
\[ B(a, b) = \int_0^1 (1 - t)^{a-1} t^{b-1} \, dt, \quad a, b > 0. \]

Figure C.2. The functions $B_y : x \mapsto B(x, y)$, where $B$ is the Beta function from Definition C.4.3, shown for $y = 0.2$, $y = 0.4$ and $y = 2.0$.

Proposition C.4.4. We have
\[ B(a, b) = \frac{\Gamma(a) \Gamma(b)}{\Gamma(a + b)}, \quad a, b > 0. \]

9 It is easy to check by differentiating that $1_{[0,n]}(t) (1 - t/n)^n$ is an increasing function of $n$.

Proof. It follows from the definition of the Gamma function that the function $p_a$ defined by
\[ p_a(x) := 1_{[0,\infty)}(x) \cdot \frac{x^{a-1} e^{-x}}{\Gamma(a)}, \quad x \in \mathbb{R} \]
is a probability density. We have
\[ p_a * p_b(x) = \int_{-\infty}^{\infty} p_a(x - y) p_b(y) \, dy = \frac{e^{-x}}{\Gamma(a) \Gamma(b)} \cdot \int_0^x (x - y)^{a-1} y^{b-1} \, dy = \frac{x^{a+b-1} e^{-x}}{\Gamma(a) \Gamma(b)} \cdot \int_0^1 (1 - t)^{a-1} t^{b-1} \, dt \]
where we made the substitution $y = tx$. The relation above shows $p_a * p_b = c \, p_{a+b}$ with some constant $c$. Since $p_a * p_b$ and $p_{a+b}$ are probability densities, we conclude that $c = 1$ from which the assertion follows.
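The identity of Proposition C.4.4 can be checked numerically. The sketch below is an illustration added here, not the author’s; the names `beta_integral` and `beta_gamma` are assumptions. It approximates the defining integral by a midpoint rule, which is adequate for $a, b > 1$ where the integrand is bounded.

```python
import math

def beta_integral(a, b, steps=20000):
    """Midpoint-rule approximation of the integral defining B(a, b)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += (1.0 - t) ** (a - 1.0) * t ** (b - 1.0)
    return h * total

def beta_gamma(a, b):
    """Right-hand side of Proposition C.4.4."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

print(beta_integral(2.5, 3.5), beta_gamma(2.5, 3.5))
```

For parameters below 1 the integrand is unbounded at an endpoint and a plain midpoint rule converges poorly, so a different quadrature would be needed there.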

Proposition C.4.5 (Legendre’s duplication formula). We have
\[ \sqrt{\pi} \, \Gamma(2x) = 2^{2x-1} \Gamma(x) \Gamma\Big(x + \frac12\Big), \quad x > 0. \]

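Proof. Let $f_1(x) = e^{-x}$, $f_2(x) = e^{-x} x^{1/2}$ and denote by $g$ the multiplicative convolution of $f_1$ and $f_2$, i.e.,
\[ g(x) = \int_0^{\infty} f_1(x/u) f_2(u) \frac{1}{u} \, du, \quad x \in (0, \infty). \]
Then $Mf_1(s) = \Gamma(s)$, $Mf_2(s) = \Gamma(s + \frac12)$ and, by Lemma C.6.3,
\[ Mg(s) = Mf_1(s) \cdot Mf_2(s) \]
where $M$ denotes Mellin transformation. We prove that
\[ Mg(s) = \frac{\sqrt{\pi} \, \Gamma(2s)}{2^{2s-1}} \]
from which the assertion follows. We have
\[ g(x) = \int_0^{\infty} e^{-(x/u + u)} \frac{1}{\sqrt{u}} \, du = 2 \int_0^{\infty} e^{-(x/t^2 + t^2)} \, dt. \]
Changing $t$ to $\sqrt{x}/t$ yields
\[ g(x) = 2 \int_0^{\infty} e^{-(x/t^2 + t^2)} \frac{\sqrt{x}}{t^2} \, dt \]
and therefore
\[ g(x) = \int_0^{\infty} e^{-(x/t^2 + t^2)} \Big( 1 + \frac{\sqrt{x}}{t^2} \Big) dt. \]
Substituting $u = t - \sqrt{x}/t$ we obtain
\[ g(x) = \int_{-\infty}^{\infty} e^{-(u^2 + 2\sqrt{x})} \, du = e^{-2\sqrt{x}} \int_{-\infty}^{\infty} e^{-u^2} \, du = \sqrt{\pi} \, e^{-2\sqrt{x}}. \]
Finally, set $y = 2\sqrt{x}$ and obtain
\[ Mg(s) = \sqrt{\pi} \int_0^{\infty} e^{-2\sqrt{x}} x^{s-1} \, dx = \sqrt{\pi} \int_0^{\infty} e^{-y} \Big( \frac{y}{2} \Big)^{2s-1} dy = \frac{\sqrt{\pi} \, \Gamma(2s)}{2^{2s-1}}. \]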

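Theorem C.4.6 (Binet). We have
\[ \Gamma(x + 1) = \Big( \frac{x}{e} \Big)^x \sqrt{2\pi x} \cdot e^{\mu(x)}, \quad x > 0 \tag{1} \]
where¹⁰
\[ \mu(x) = \int_0^{\infty} \Big( \frac{1}{e^t - 1} - \frac{1}{t} + \frac12 \Big) e^{-xt} \frac{1}{t} \, dt. \]

C.4.7. For the proof of Binet’s formula we define the function $\varphi$ by the equation
\[ e^{\varphi(x)} = \frac{1}{\sqrt{2\pi}} \sqrt{x} \int_0^{\infty} \big( t e^{1-t} \big)^x \, dt, \quad x > 0 \]
so that
\[ \Gamma(x + 1) = \Big( \frac{x}{e} \Big)^x \sqrt{2\pi x} \cdot e^{\varphi(x)}. \tag{1} \]
Binet’s formula is equivalent to $\mu(x) = \varphi(x)$. For the proof of this we need that $\mu$ and $\varphi$ satisfy a certain difference equation and that $\mu(\frac12) = \varphi(\frac12)$. First we show these facts.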

Lemma C.4.8. For all $x > 0$ we have
\[ \varphi(x) - \varphi(x + 1) = \mu(x) - \mu(x + 1) = \Big( x + \frac12 \Big) \log\Big( 1 + \frac{1}{x} \Big) - 1. \tag{1} \]
Proof. Denote by $g(x)$ the right-hand side of the relation above. The equation
\[ \varphi(x) - \varphi(x + 1) = g(x) \]

10 That the integral below exists follows from the relation
\[ \lim_{t \to 0} \Big( \frac{1}{e^t - 1} - \frac{1}{t} + \frac12 \Big) \cdot \frac{1}{t} = \frac{1}{12} \]
which can be shown, e.g., by using the Bernoulli–l’Hospital rule.

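follows immediately from (C.4.7.1) by using the relation
\[ \Gamma(x + 2) = (x + 1) \Gamma(x + 1). \]
To prove the statement on $\mu$, first note that
\[ \lim_{x \to \infty} \big( \mu(x) - \mu(x + 1) \big) = \lim_{x \to \infty} g(x) = 0. \]
Moreover,¹¹
\[ \mu'(x) - \mu'(x + 1) = \int_0^{\infty} \Big( \frac{e^{-xt} - e^{-(x+1)t}}{t} - \frac{e^{-xt} + e^{-(x+1)t}}{2} \Big) dt. \]
Applying Lemma B.1.12 we obtain
\[ \mu'(x) - \mu'(x + 1) = \ln\Big( 1 + \frac{1}{x} \Big) - \frac12 \Big( \frac{1}{x} + \frac{1}{x+1} \Big) = g'(x) \]
completing the proof.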

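Lemma C.4.9. $\varphi(\frac12) = \mu(\frac12) = \frac12 - \frac12 \log 2$.

Proof. Since $\Gamma(\frac32) = \frac12 \sqrt{\pi}$, equation (C.4.7.1) shows that $\varphi(\frac12) = \frac12 - \frac12 \log 2$. By an obvious substitution,
\[ \mu(1) = \int_0^{\infty} \Big( \frac{1}{e^{t/2} - 1} - \frac{2}{t} + \frac12 \Big) e^{-t/2} \frac{1}{t} \, dt. \]
Using this, we obtain
\[ \mu(1/2) = \big( \mu(1/2) - \mu(1) \big) + \mu(1) \]
\[ = \int_0^{\infty} \Big( \frac{e^{-t/2}}{t} - \frac{1}{e^t - 1} \Big) \frac{1}{t} \, dt + \int_0^{\infty} \Big( \frac{1}{e^t - 1} - \frac{1}{t} + \frac12 \Big) e^{-t} \frac{1}{t} \, dt \]
\[ = \int_0^{\infty} \Big( \frac{e^{-t/2} - e^{-t}}{t} - \frac12 e^{-t} \Big) \frac{1}{t} \, dt \]
\[ = \int_0^{\infty} \Big( -\frac{d}{dt} \, \frac{e^{-t/2} - e^{-t}}{t} - \frac{e^{-t/2} - e^{-t}}{2t} \Big) dt. \]
Applying Lemma B.1.12 we obtain the desired result.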

Proof of Binet’s formula. We have to show that $\varphi(x) = \mu(x)$. By Lemma C.4.8,
\[ \mu(x) - \mu(x + 1) = \varphi(x) - \varphi(x + 1). \]

11 The footnote to Lemma B.1.12 also holds for the differentiation below.

Applying this to $x, x + 1, \ldots, x + n - 1$ and summing these equations we see that
\[ \mu(x) - \mu(x + n) = \varphi(x) - \varphi(x + n). \]
Since $\lim_{n \to \infty} \mu(x + n) = 0$, we immediately obtain
\[ \mu(x) = \varphi(x) - \lim_{n \to \infty} \varphi(x + n) =: \varphi(x) - h(x). \tag{1} \]
Next we show that the function $h$ is decreasing. If $0 \le y \le x$ and $0 \le p \le 1$, then
\[ \sqrt{x + n} \, p^{x+n} - \sqrt{y + n} \, p^{y+n} \le \sqrt{x + n} \, p^{y+n} - \sqrt{y + n} \, p^{y+n} \le \big( \sqrt{x + n} - \sqrt{y + n} \big) p \]
for all $n \ge 1$. Noting that $0 \le t e^{1-t} \le 1$, $t \ge 0$, and using the definition of $\varphi$, we conclude that
\[ e^{\varphi(x+n)} - e^{\varphi(y+n)} \le \big( \sqrt{x + n} - \sqrt{y + n} \big) e^{\varphi(1)}. \]
Taking the limit as $n \to \infty$, we obtain $e^{h(x)} - e^{h(y)} \le 0$, i.e., $h(x) \le h(y)$. Since the function $h$ is also periodic with period 1, it must be constant. Applying (1) and Lemma C.4.9, we now obtain that $h(x) = h(\frac12) = 0$ for all $x > 0$.

Since $\lim_{x \to \infty} \mu(x) = 0$, from Theorem C.4.6 we immediately obtain:

Corollary C.4.10 (Stirling’s formula).
\[ \lim_{n \to \infty} \frac{n!}{(n/e)^n \sqrt{2\pi n}} = 1. \]

To prove a more precise statement we will need the Bernoulli numbers.

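C.4.11. The Bernoulli numbers $B_n$, $n \in \mathbb{N}_0$, are defined by¹²
\[ \frac{z}{e^z - 1} = \frac{1}{\frac{1}{1!} + \frac{z}{2!} + \frac{z^2}{3!} + \cdots} = \sum_{n=0}^{\infty} \frac{B_n}{n!} z^n, \quad z \in \mathbb{C}, \; |z| < 2\pi. \tag{1} \]
It follows from this definition that
\[ \Big( \frac{1}{1!} + \frac{z}{2!} + \frac{z^2}{3!} + \cdots \Big) \Big( \frac{B_0}{0!} + \frac{B_1}{1!} z + \frac{B_2}{2!} z^2 + \cdots \Big) = 1, \quad |z| < 2\pi. \]
Comparing the coefficients we see that $B_0 = 1$ and
\[ \frac{1}{n!} \cdot \frac{B_0}{0!} + \frac{1}{(n-1)!} \cdot \frac{B_1}{1!} + \frac{1}{(n-2)!} \cdot \frac{B_2}{2!} + \cdots + \frac{1}{1!} \cdot \frac{B_{n-1}}{(n-1)!} = 0 \]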
12 The expression on the left-hand side is defined by continuity at $z = 0$.

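if $n > 1$. Multiplying both sides by $n!$ we obtain
\[ \binom{n}{0} B_0 + \binom{n}{1} B_1 + \binom{n}{2} B_2 + \cdots + \binom{n}{n-1} B_{n-1} = 0. \tag{2} \]
In particular,
\[ 0 = 1 + 2B_1, \]
\[ 0 = 1 + 3B_1 + 3B_2, \]
\[ 0 = 1 + 4B_1 + 6B_2 + 4B_3, \]
\[ 0 = 1 + 5B_1 + 10B_2 + 10B_3 + 5B_4 \]
and hence
\[ B_1 = -\frac12, \quad B_2 = \frac16, \quad B_3 = 0, \quad B_4 = -\frac{1}{30}. \]
Using that $B_1 = -\frac12$, relation (1) gives
\[ 1 + \frac{B_2}{2!} z^2 + \cdots = \frac{z}{e^z - 1} + \frac{z}{2} = \frac{z}{2} \cdot \frac{e^z + 1}{e^z - 1} = \frac{z}{2} \cdot \frac{e^{z/2} + e^{-z/2}}{e^{z/2} - e^{-z/2}}. \tag{3} \]
The function on the right-hand side is even, showing that $B_{2k+1} = 0$ for all $k \in \mathbb{N}$.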
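Relation (2) determines the Bernoulli numbers recursively, since the last term contains $B_{n-1}$ with the nonzero factor $\binom{n}{n-1} = n$. The following sketch, added here for illustration with the assumed helper name `bernoulli_numbers`, carries out this recursion in exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] using sum_{k<n} C(n, k) B_k = 0 for n > 1."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        # Relation (2) with index n: solve for the last unknown B_{n-1}.
        partial = sum(comb(n, k) * B[k] for k in range(n - 1))
        B.append(-partial / comb(n, n - 1))
    return B

print(bernoulli_numbers(9))
```

The output reproduces the values listed above and shows the vanishing of the odd-indexed numbers from $B_3$ on.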

Theorem C.4.12. For all $n, N \in \mathbb{N}$ we have
\[ \sum_{j=1}^{2N} \frac{B_{2j}}{2j(2j-1) n^{2j-1}} < \log \frac{n!}{(n/e)^n \sqrt{2\pi n}} < \sum_{j=1}^{2N+1} \frac{B_{2j}}{2j(2j-1) n^{2j-1}}. \]

Proof. From equation (C.4.11.3) we see that
\[ \frac{1}{e^z - 1} - \frac{1}{z} + \frac12 = \sum_{j=1}^{\infty} \frac{B_{2j}}{(2j)!} z^{2j-1}. \]
By Problem 154 in Part I, Chapter 4 of [44], the series on the right-hand side has the so-called enveloping property, i.e.,
\[ \sum_{j=1}^{2N} \frac{B_{2j}}{(2j)!} t^{2j-1} < \frac{1}{e^t - 1} - \frac{1}{t} + \frac12 < \sum_{j=1}^{2N+1} \frac{B_{2j}}{(2j)!} t^{2j-1}, \quad t > 0. \]
Using the equation $j! = \int_0^{\infty} t^j e^{-t} \, dt$ and the definition of $\mu$ we immediately obtain the assertion.
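The two-sided bound of Theorem C.4.12 can be observed numerically for small $n$ and $N$. The sketch below is an added illustration; the helper names are assumptions, and the Bernoulli numbers are recomputed from relation (C.4.11.2) so that the block is self-contained. The logarithm of $n!$ is taken from `math.lgamma`.

```python
import math
from fractions import Fraction

def bernoulli_numbers(count):
    """[B_0, ..., B_{count-1}] from the recursion sum_{k<n} C(n, k) B_k = 0, n > 1."""
    B = [Fraction(1)]
    for n in range(2, count + 1):
        partial = sum(math.comb(n, k) * B[k] for k in range(n - 1))
        B.append(-partial / n)
    return B

def stirling_log_ratio(n):
    """log( n! / ((n/e)^n sqrt(2 pi n)) )."""
    return math.lgamma(n + 1) - (n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n))

def bernoulli_partial_sum(n, terms):
    """sum_{j=1}^{terms} B_{2j} / (2j (2j-1) n^(2j-1))."""
    B = bernoulli_numbers(2 * terms + 1)
    return sum(float(B[2 * j]) / (2 * j * (2 * j - 1) * n ** (2 * j - 1))
               for j in range(1, terms + 1))

for n in (1, 2, 5):
    print(n, bernoulli_partial_sum(n, 2), stirling_log_ratio(n), bernoulli_partial_sum(n, 3))
```

For larger $n$ the gap between the two bounds quickly falls below double-precision resolution, so the comparison is only meaningful for small arguments.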

C.5 Bessel functions

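Definition C.5.1. For $\nu \ge 0$ we define¹³ the Bessel function $J_\nu$ by
\[ J_\nu(x) = \frac{1}{\sqrt{\pi} \, \Gamma(\nu + \frac12)} \Big( \frac{x}{2} \Big)^{\nu} \int_{-1}^{1} (1 - t^2)^{\nu - 1/2} e^{itx} \, dt, \quad x \ge 0. \tag{1} \]
For $x = \nu = 0$ the expression $(\frac{x}{2})^{\nu}$ is equal to 1 by definition. Using that $\Gamma(1/2) = \sqrt{\pi}$ we see that
\[ J_0(0) = \frac{1}{\pi} \int_{-1}^{1} \frac{1}{\sqrt{1 - t^2}} \, dt = \frac{1}{\pi} \arcsin t \, \Big|_{-1}^{1} = 1 \]
while $J_\nu(0) = 0$ if $\nu > 0$.
If $\nu = 1/2$, then the integral above can be easily evaluated. We obtain:
\[ J_{1/2}(x) = \sqrt{\frac{2}{\pi x}} \cdot \sin x. \]
Note that $(1 - t^2)^{\nu - 1/2}$ is even in $t$, hence one can replace $e^{itx}$ by $\cos tx$ in Definition C.5.1.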

Lemma C.5.2. The function $J_\nu$ is continuous and
\[ |J_\nu(x)| \le C_\nu \cdot x^{\nu}, \quad \nu, x \ge 0 \]
where
\[ C_\nu = \frac{1}{\sqrt{\pi} \, \Gamma(\nu + \frac12) \, 2^{\nu}} \int_{-1}^{1} (1 - t^2)^{\nu - 1/2} \, dt. \]

Proof. The continuity follows from the fact that the integral in Definition C.5.1 represents a characteristic function, up to a constant factor. The second statement is obvious.

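The substitution $t = \cos \varphi$ in Definition C.5.1 yields:

Lemma C.5.3. For $\nu \ge 0$ we have
\[ J_\nu(x) = \frac{1}{\sqrt{\pi} \, \Gamma(\nu + \frac12)} \Big( \frac{x}{2} \Big)^{\nu} \int_0^{\pi} e^{ix \cos \varphi} \sin^{2\nu} \varphi \, d\varphi, \quad x \ge 0. \]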

13 We will extend the definition of $J_\nu$ for all $\nu \in \mathbb{R}$ in Remark C.5.6. Note that the extension is possible for all complex $\nu$ but we do not need it in this book. The functions $J_0$ and $J_1$ are shown in Figure C.3.

Figure C.3. The Bessel functions $J_0$ and $J_1$ from Definition C.5.1.

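Theorem C.5.4. The formula
\[ J_\nu(x) = \Big( \frac{x}{2} \Big)^{\nu} \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + \nu + 1)} \Big( \frac{x}{2} \Big)^{2k} \]
holds for all $\nu \ge 0$ and $x \ge 0$.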

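Proof. Replacing $e^{itx}$ by its power series expansion in Definition C.5.1 we see that $J_\nu(x)$ is equal to
\[ \frac{1}{\sqrt{\pi} \, \Gamma(\nu + \frac12)} \Big( \frac{x}{2} \Big)^{\nu} \sum_{k=0}^{\infty} \frac{1}{(2k)!} \int_{-1}^{1} (ixt)^{2k} (1 - t^2)^{\nu - 1/2} \, dt. \]
Proposition C.4.4 implies
\[ \int_{-1}^{1} t^{2k} (1 - t^2)^{\nu - 1/2} \, dt = \frac{\Gamma(k + 1/2) \Gamma(\nu + 1/2)}{\Gamma(k + \nu + 1)} \]
so
\[ J_\nu(x) = \frac{1}{\Gamma(\nu + \frac12)} \Big( \frac{x}{2} \Big)^{\nu} \sum_{k=0}^{\infty} \frac{1}{(2k)!} (ix)^{2k} \frac{\Gamma(k + 1/2) \Gamma(\nu + 1/2)}{\sqrt{\pi} \, \Gamma(k + \nu + 1)}. \]
Using $(2k)! = \Gamma(2k + 1)$, $\sqrt{\pi} = \Gamma(1/2)$, and the duplication formula (cf. Proposition C.4.5), which implies
\[ \frac{\Gamma(k + 1/2)}{\Gamma(1/2) \Gamma(2k + 1)} = \frac{2^{-2k}}{\Gamma(k + 1)} \]
we obtain the assertion.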
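To make the equivalence of the integral and series representations concrete, here is a numerical sketch that is not part of the original text. The names `bessel_series` and `bessel_integral` are assumptions. The integral is taken in the form of Lemma C.5.3 (with $e^{ix\cos\varphi}$ replaced by its real part, which is allowed by the evenness remark after Definition C.5.1) and approximated by a midpoint rule.

```python
import math

def bessel_series(nu, x, terms=60):
    """Partial sum of the power series of Theorem C.5.4 (nu >= 0, x >= 0)."""
    return sum((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1.0))
               * (x / 2.0) ** (2 * k + nu) for k in range(terms))

def bessel_integral(nu, x, steps=20000):
    """Midpoint-rule value of the integral representation in Lemma C.5.3."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        phi = (k + 0.5) * h
        total += math.cos(x * math.cos(phi)) * math.sin(phi) ** (2.0 * nu)
    return (x / 2.0) ** nu * h * total / (math.sqrt(math.pi) * math.gamma(nu + 0.5))

for nu, x in ((0.0, 1.0), (0.5, 3.0), (1.3, 4.0)):
    print(nu, x, bessel_series(nu, x), bessel_integral(nu, x))
```

For $\nu = 1/2$ both values can also be compared with the closed form $\sqrt{2/(\pi x)} \sin x$ given after Definition C.5.1.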

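Proposition C.5.5. We have
\[ \frac{d}{dx} \big( x^{\nu} J_\nu(x) \big) = x^{\nu} J_{\nu - 1}(x), \quad \nu \ge 1, \; x \ge 0 \]
and
\[ \frac{d}{dx} \big( x^{-\nu} J_\nu(x) \big) = -x^{-\nu} J_{\nu + 1}(x), \quad \nu \ge 0, \; x > 0. \]
Proof. Applying Theorem C.5.4 we obtain
\[ \frac{d}{dx} \big( x^{\nu} J_\nu(x) \big) = \frac{d}{dx} \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + \nu + 1)} \cdot \frac{1}{2^{2k+\nu}} \, x^{2k + 2\nu} \]
\[ = \sum_{k=0}^{\infty} \frac{(-1)^k (2k + 2\nu)}{k! \, \Gamma(k + \nu + 1)} \cdot \frac{1}{2^{2k+\nu}} \, x^{2k + 2\nu - 1} \]
\[ = \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + \nu)} \cdot \frac{1}{2^{2k+\nu-1}} \, x^{2k + 2\nu - 1} \]
\[ = x^{\nu} J_{\nu - 1}(x). \]
The second equation can be proved in the same way:
\[ \frac{d}{dx} \big( x^{-\nu} J_\nu(x) \big) = \frac{d}{dx} \sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + \nu + 1)} \cdot \frac{1}{2^{2k+\nu}} \, x^{2k} \]
\[ = \sum_{k=1}^{\infty} \frac{(-1)^k \, 2k}{k! \, \Gamma(k + \nu + 1)} \cdot \frac{1}{2^{2k+\nu}} \, x^{2k-1} \]
\[ = \sum_{k=0}^{\infty} \frac{(-1)^{k+1} (2k + 2)}{(k+1)! \, \Gamma(k + \nu + 2)} \cdot \frac{1}{2^{2k+2+\nu}} \, x^{2k+1} \]
\[ = -\sum_{k=0}^{\infty} \frac{(-1)^k}{k! \, \Gamma(k + \nu + 2)} \cdot \frac{1}{2^{2k+1+\nu}} \, x^{2k+1} \]
\[ = -x^{-\nu} J_{\nu + 1}(x). \]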

Remark C.5.6. We use the first equation in Proposition C.5.5 to define $J_\nu(x)$ recursively for all $\nu \in \mathbb{R}$ and $x > 0$. Doing so, both equations in Proposition C.5.5 hold for all such $\nu$ and $x$. Note that the first equation can also be written in the form
\[ \Big( \frac{d}{dx} + \frac{\nu}{x} \Big) J_\nu(x) = J_{\nu - 1}(x), \quad x > 0. \tag{1} \]
From $J_{1/2}(x) = \sqrt{\frac{2}{\pi x}} \sin x$ we obtain
\[ J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}} \cos x, \quad x > 0. \]
The second equation in Proposition C.5.5 can also be written in the form
\[ \Big( \frac{\nu}{x} - \frac{d}{dx} \Big) J_\nu(x) = J_{\nu + 1}(x), \quad x > 0. \tag{2} \]
Putting together (1) and (2) produces the equation
\[ \Big( \frac{d}{dx} + \frac{\nu + 1}{x} \Big) \Big( \frac{\nu}{x} - \frac{d}{dx} \Big) J_\nu(x) = J_\nu(x) \]
which is equivalent to
\[ \Big( \frac{d^2}{dx^2} + \frac{1}{x} \frac{d}{dx} + 1 - \frac{\nu^2}{x^2} \Big) J_\nu(x) = 0 \]
known as Bessel’s equation. Adding and subtracting (1) and (2) we obtain the identities
\[ 2 J_\nu'(x) = J_{\nu - 1}(x) - J_{\nu + 1}(x) \]
\[ \frac{2\nu}{x} J_\nu(x) = J_{\nu - 1}(x) + J_{\nu + 1}(x). \]
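The recurrence identities at the end of Remark C.5.6 lend themselves to a numerical check. The sketch below is an added illustration, not the author’s; `bessel_series` is an assumed helper that sums the power series of Theorem C.5.4, and the derivative is approximated by a central difference. Evaluating the series at $\nu = -1/2$ relies on the (standard) fact that the series also represents the extended function there; orders for which $k + \nu + 1$ hits a nonpositive integer are not handled.

```python
import math

def bessel_series(nu, x, terms=60):
    """Partial sum of (x/2)^nu * sum_k (-1)^k / (k! Gamma(k+nu+1)) * (x/2)^(2k), x > 0."""
    return sum((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1.0))
               * (x / 2.0) ** (2 * k + nu) for k in range(terms))

def recurrence_gaps(nu, x, h=1e-5):
    """Absolute defects of the two identities 2J' = J_{nu-1} - J_{nu+1} and (2 nu / x) J = J_{nu-1} + J_{nu+1}."""
    derivative = (bessel_series(nu, x + h) - bessel_series(nu, x - h)) / (2.0 * h)
    lower, upper = bessel_series(nu - 1.0, x), bessel_series(nu + 1.0, x)
    gap_derivative = abs(2.0 * derivative - (lower - upper))
    gap_sum = abs(2.0 * nu / x * bessel_series(nu, x) - (lower + upper))
    return gap_derivative, gap_sum

print(recurrence_gaps(1.5, 2.0))
print(bessel_series(-0.5, 2.0), math.sqrt(2.0 / (math.pi * 2.0)) * math.cos(2.0))
```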

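Definition C.5.7. The modified Bessel function of the second kind¹⁴ of index $\nu \in \mathbb{C}$ is defined by
\[ K_\nu(z) = \int_0^{\infty} e^{-z \cosh t} \cosh(\nu t) \, dt, \quad z \in \mathbb{C}, \; \operatorname{Re} z > 0. \]

It follows from the definition that $K_\nu$ is continuous, $K_\nu = K_{-\nu}$ if $\nu \in \mathbb{R}$, and $K_\nu(r) > 0$ if $\nu \in \mathbb{R}$ and $r > 0$. In this book we need only the following simple lemma.

Lemma C.5.8. For all $b \in \mathbb{R}$ and $a > 0$ we have
\[ K_b(r) = \frac{1}{2a^b} \int_0^{\infty} e^{-\frac{r}{2} \left( \frac{u}{a} + \frac{a}{u} \right)} u^{b-1} \, du, \quad r > 0. \]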

14 This function has also been called by the now-rare name modified Bessel function of the third kind. The function $K_1$ is shown in Figure C.4.

Figure C.4. The modified Bessel function $K_1$ from Definition C.5.7.

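Proof. Using the substitution $u = a e^t$ we obtain
\[ \frac{1}{2a^b} \int_0^{\infty} e^{-\frac{r}{2} \left( \frac{u}{a} + \frac{a}{u} \right)} u^{b-1} \, du = \frac12 \int_{-\infty}^{\infty} e^{-\frac{r}{2} (e^t + e^{-t})} e^{bt} \, dt = \frac12 \int_{-\infty}^{\infty} e^{-r \cosh t} e^{bt} \, dt = K_b(r). \]

Note that setting $a = r/2$ in the previous lemma leads to the inequality
\[ K_b(r) = 2^{b-1} r^{-b} \int_0^{\infty} e^{-u} e^{-r^2/(4u)} u^{b-1} \, du \le \Gamma(b) \, 2^{b-1} r^{-b}, \quad r, b > 0. \]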

C.6 The Mellin transform


C.6.1. We consider $G = (0, \infty)$ as a group with multiplication as operation. The mapping $\varphi : r \mapsto \ln(r)$, $r \in G$, is then an algebraic and topological isomorphism between $G$ and the additive group $\mathbb{R}$ of all real numbers.
Using this isomorphism we see that $\chi$ is a continuous character of $G$ if and only if it has the form $\chi(r) = r^{ix}$ with some $x \in \mathbb{R}$. Denoting by $\mu$ the image of the Lebesgue measure under the mapping $\varphi^{-1}$, we have
\[ \mu((a, b)) = \ln(b) - \ln(a), \quad 0 < a \le b < \infty. \]
More generally,
\[ \int_0^{\infty} f(s) \, d\mu(s) = \int_0^{\infty} \frac{f(s)}{s} \, ds, \quad f \in L^1(\mu). \]
Since $\mu$ is invariant under translations we have
\[ \int_0^{\infty} f(rs) \, d\mu(s) = \int_0^{\infty} f(s) \, d\mu(s), \quad f \in L^1(\mu). \]
The multiplicative convolution on $G$ corresponds to the convolution on $\mathbb{R}$. It is given by
\[ h(r) = \int_0^{\infty} f(r/s) g(s) \, d\mu(s), \quad r \in G, \; f, g \in L^2(\mu). \]

Definition C.6.2. Let $f$ be a Lebesgue measurable, complex-valued function on $(0, \infty)$. For all $z \in \mathbb{C}$ such that $\int_0^{\infty} |f(t) t^{z-1}| \, dt < \infty$ we define the Mellin transform $Mf$ of $f$ by
\[ Mf(z) := \int_0^{\infty} f(s) s^{z-1} \, ds = \int_0^{\infty} f(s) s^z \, d\mu(s). \]

Lemma C.6.3. If $h$ is the multiplicative convolution of $f$ and $g$, then
\[ Mh(z) = Mf(z) \cdot Mg(z) \]
for all $z \in \mathbb{C}$ for which the Mellin transforms $Mf(z)$ and $Mg(z)$ exist.

Proof. Indeed, using that $\mu$ is invariant under multiplications, we obtain
\[ Mf(z) Mg(z) = \int_0^{\infty} f(r) r^z \, d\mu(r) \cdot \int_0^{\infty} g(s) s^z \, d\mu(s) = \int_0^{\infty} \int_0^{\infty} f(r) g(s) (rs)^z \, d\mu(r) \, d\mu(s) = \int_0^{\infty} \int_0^{\infty} f(r/s) g(s) r^z \, d\mu(r) \, d\mu(s) = Mh(z). \]

Remark C.6.4. Substituting $s = e^t$ and setting $g(t) := f(e^t)$ in Definition C.6.2 we obtain
\[ Mf(z) = \int_{-\infty}^{\infty} g(t) e^{tz} \, dt. \tag{1} \]
Setting here $z = iy$, $y \in \mathbb{R}$, we obtain
\[ Mf(iy) = \int_{-\infty}^{\infty} e^{iyt} g(t) \, dt = \hat g(y), \quad y \in \mathbb{R}. \]
In the same way as in Section 3.3 we see that there exist $a, b \in [-\infty, \infty]$, $a \le b$, such that the integral in (1) exists if $a < \operatorname{Re} z < b$ and $Mf$ is holomorphic in this strip.

Example C.6.5. If $f(s) = e^{-s}$, then
\[ Mf(z) = \int_0^{\infty} e^{-t} t^{z-1} \, dt. \]
In this case, $Mf$ is holomorphic in the region $\operatorname{Re}(z) > 0$ and $Mf$ is equal to the Gamma function.
Let $f(s) = 1/(1 + s)$. Using the substitution $s = x/(1 - x)$ we see that
\[ Mf(z) = \int_0^1 x^{z-1} (1 - x)^{-z} \, dx. \]
In this case, $Mf$ is holomorphic in the region $\{z \in \mathbb{C} : 0 < \operatorname{Re}(z) < 1\}$ and $Mf(z) = B(z, 1 - z)$.
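Both examples can be reproduced numerically through the exponential substitution of Remark C.6.4, which turns the Mellin transform into an integral over the real line with a rapidly decaying integrand. The sketch below is an added illustration for real arguments $z$; the function name `mellin_real` and the truncation bounds are assumptions. For $f(s) = 1/(1+s)$ and $z = 1/2$ the expected value is $B(\frac12, \frac12) = \Gamma(\frac12)^2 = \pi$ by Proposition C.4.4 and Lemma C.4.2.

```python
import math

def mellin_real(f, z, lo=-60.0, hi=60.0, steps=24000):
    """Trapezoidal approximation of int g(t) e^{t z} dt with g(t) = f(e^t), real z."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        t = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * f(math.exp(t)) * math.exp(t * z)
    return h * total

print(mellin_real(lambda s: 1.0 / (1.0 + s), 0.5), math.pi)
print(mellin_real(lambda s: math.exp(-s), 2.5), math.gamma(2.5))
```

The truncation to $[-60, 60]$ is harmless for these two integrands but would have to be widened for values of $z$ close to the boundary of the strip of convergence.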

C.7 The Laplace transform


Throughout this section $\mu$ denotes a nonnegative measure on $[0, \infty)$ such that the function $x \mapsto e^{-tx}$ is $\mu$-integrable for all $t > 0$.

Definition C.7.1. The function $L\mu$ defined by
\[ L\mu(t) = \int_0^{\infty} e^{-tx} \, d\mu(x), \quad t > 0 \]
is called the Laplace transform of $\mu$.¹⁵

Example C.7.2. If $d\mu(x) = x^a \, dx$, $a > -1$, then
\[ L\mu(t) = \int_0^{\infty} x^a e^{-tx} \, dx = \frac{1}{t^{a+1}} \int_0^{\infty} y^a e^{-y} \, dy = \frac{\Gamma(a + 1)}{t^{a+1}}, \quad t > 0. \]

Lemma C.7.3. The measure $\mu$ is finite if and only if its Laplace transform is bounded.

15 A classical reference for Laplace transforms is [60].

Proof. If $\mu$ is finite, then
\[ L\mu(t) = \int_0^{\infty} e^{-tx} \, d\mu(x) \le \int_0^{\infty} 1 \, d\mu(x) = \mu([0, \infty)) < \infty, \quad t > 0. \]
Assume that $L\mu \le K$ for some $K \in [0, \infty)$. By monotone convergence we have
\[ K \ge \lim_{t \to 0} L\mu(t) = \lim_{t \to 0} \int_0^{\infty} e^{-tx} \, d\mu(x) = \int_0^{\infty} 1 \, d\mu(x) = \mu([0, \infty)) \]
i.e., $\mu$ is finite.

Lemma C.7.4. The function $L\mu$ is infinitely differentiable and
\[ (L\mu)^{(k)}(t) = (-1)^k \int_0^{\infty} x^k e^{-tx} \, d\mu(x), \quad t > 0 \]
for all $k \in \mathbb{N}_0$.

Proof. The lemma follows by successive differentiation which can be justified since
\[ x^k e^{-tx} = x^k e^{-tx/2} e^{-tx/2} \le K_{k,t} \, e^{-tx/2}, \quad x > 0 \]
where $K_{k,t}$ is uniformly bounded for all $t$ from a fixed compact subset of $(0, \infty)$.

Theorem C.7.5. The measure $\mu$ is uniquely determined by its Laplace transform.

Proof. Writing $d\nu(x) := e^{-x} \, d\mu(x)$, the measure $\nu$ is finite. Since $L\nu(t) = L\mu(t + 1)$, it suffices to show that $\nu$ is uniquely determined by its Laplace transform. By Lemma C.7.4 we have
\[ (-1)^k (L\nu)^{(k)}(t) = \int_0^{\infty} x^k e^{-tx} \, d\nu(x), \quad t > 0, \; k \in \mathbb{N}_0 \]
and hence
\[ \sum_{0 \le k \le ta} (-1)^k (L\nu)^{(k)}(t) \cdot \frac{t^k}{k!} = \int_0^{\infty} \sum_{0 \le k \le ta} e^{-tx} \frac{(tx)^k}{k!} \, d\nu(x), \quad a, t > 0. \]
The integrand on the right-hand side is at most 1. Using this, dominated convergence and Lemma B.1.18 show that
\[ \lim_{t \to \infty} \sum_{0 \le k \le ta} (-1)^k (L\nu)^{(k)}(t) \cdot \frac{t^k}{k!} = \int_0^{\infty} 1_{[0,a]}(x) \, d\nu(x) = \nu([0, a]). \]
This relation shows that $\nu$ is uniquely determined by $L\nu$.
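The inversion formula in the proof can be watched at work in the simplest case, a unit point mass $\nu = \delta_{x_0}$ with $x_0 > 0$, for which $(-1)^k (L\nu)^{(k)}(t) = x_0^k e^{-t x_0}$. The sketch below is an added illustration under this assumption, with the hypothetical helper name `inversion_sum_point_mass`. The sum is then a Poisson distribution function, and it should approach $1$ when $x_0 < a$ and $0$ when $x_0 > a$.

```python
import math

def inversion_sum_point_mass(x0, a, t):
    """sum_{0 <= k <= t*a} e^{-t x0} (t x0)^k / k! for the unit point mass at x0 > 0."""
    lam = t * x0
    upper = int(math.floor(t * a))
    # Each term is computed through logarithms to avoid overflow for large t.
    return sum(math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
               for k in range(upper + 1))

for t in (10.0, 50.0, 200.0):
    print(t, inversion_sum_point_mass(1.0, 1.5, t), inversion_sum_point_mass(2.0, 1.5, t))
```

The boundary case $x_0 = a$ is deliberately left out of this experiment.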



C.8 Existence of continuous logarithms

Definition C.8.1. Let $f$ be a complex-valued function defined on some topological space $X$. A continuous complex-valued function $F$ on $X$ is called a continuous logarithm of $f$ if
\[ f(x) = e^{F(x)}, \quad x \in X. \]
By a continuous argument of $f$ we mean any real-valued continuous function $\theta$ on $X$ such that
\[ f(x) = |f(x)| \cdot e^{i\theta(x)}, \quad x \in X. \]

Remark C.8.2. Note that if $F$ is a continuous logarithm of $f$, then $\theta := \operatorname{Im} F$ is a continuous argument of $f$.
It is clear that functions which have a continuous logarithm cannot have zeros. If $X = \mathbb{T} \subset \mathbb{C}$ and $f(z) = z$, $z \in \mathbb{T}$, then $f$ is continuous and has no zeros. However, $f$ does not have a continuous logarithm. To see this, assume that
\[ z = e^{i\theta(z)}, \quad z \in \mathbb{T} \]
with some continuous function $\theta : \mathbb{T} \to \mathbb{R}$. In particular, $\theta$ is injective and hence $\theta(z) - \theta(-z) \ne 0$. Consequently,
\[ g(z) := \frac{\theta(z) - \theta(-z)}{|\theta(z) - \theta(-z)|}, \quad z \in \mathbb{T} \]
defines a continuous real-valued function on $\mathbb{T}$. Since $|g(z)| = 1$ we must have $g(\mathbb{T}) \subset \{-1, 1\}$. But $g(-z) = -g(z)$ and therefore $g(\mathbb{T}) = \{-1, 1\}$. This is a contradiction, since $\mathbb{T}$ is connected while $\{-1, 1\}$ is disconnected.

Lemma C.8.3. Let $X$ be a connected space and $f : X \to \mathbb{C} \setminus \{0\}$ be a continuous function. If $\theta_1$ and $\theta_2$ are continuous arguments of $f$, then there exists $k \in \mathbb{Z}$ such that
\[ \theta_2(x) = \theta_1(x) + 2k\pi, \quad x \in X. \]

Proof. From
\[ f(x) = |f(x)| \cdot e^{i\theta_1(x)} = |f(x)| \cdot e^{i\theta_2(x)} \]
we see that $\theta_2(x) = \theta_1(x) + 2k(x)\pi$ with some function $k : X \to \mathbb{Z}$. Since $\theta_1$ and $\theta_2$ are continuous, so is $k$. The continuous image of connected sets is connected, therefore $k(X)$ consists of a single integer.

Lemma C.8.4. Every continuous function $f : X \to \mathbb{T}$ such that $f(X) \ne \mathbb{T}$ has a continuous argument.

Proof. Choose $\alpha \in \mathbb{R}$ such that $e^{i\alpha} \in \mathbb{T} \setminus f(X)$. The mapping
\[ t \mapsto e^{it}, \quad t \in (\alpha, \alpha + 2\pi) \]
is continuous, injective and it maps $(\alpha, \alpha + 2\pi)$ onto $\mathbb{T} \setminus \{e^{i\alpha}\}$. Denoting by $\psi$ the inverse mapping we have
\[ z = e^{i\psi(z)}, \quad z \in \mathbb{T} \setminus \{ e^{i\alpha} \}. \]
Thus, the function $\theta := \psi(f)$ is a continuous argument of $f$.

Lemma C.8.5. Let $f_1$ and $f_2$ be continuous functions on $X$ with values in $\mathbb{T}$ such that $f_1(x) \ne -f_2(x)$ for all $x \in X$. If $f_1$ has a continuous argument, then so does $f_2$.

Proof. Let $\theta_1$ be a continuous argument of $f_1$. The function $f := f_1/f_2$ is continuous, maps $X$ into $\mathbb{T}$ and $f(x) \ne -1$ for all $x$. By Lemma C.8.4 it has a continuous argument $\theta$. It follows from the definition of $f$ that $\theta_1 - \theta$ is a continuous argument of $f_2$.

Proposition C.8.6. Let $X$ be a compact topological space and $f : X \times [0, 1] \to \mathbb{C} \setminus \{0\}$ be a continuous function. If the restriction of $f$ to $X \times \{0\}$ has a continuous argument, then so does the restriction of $f$ to $X \times \{1\}$.

Proof. Replacing $f$ by $f/|f|$ we may assume that $|f| = 1$. The function $f$ is uniformly continuous on the compact set $X \times [0, 1]$. Therefore, there exists $n \in \mathbb{N}$ such that
\[ |f(x, s) - f(x, t)| \le 1, \quad x \in X \tag{1} \]
whenever $|s - t| \le 1/n$. For $j = 0, 1, \ldots, n$ we define the function $f_j$ by
\[ f_j(x) := f(x, j/n), \quad x \in X. \]
Inequality (1) shows that $|f_j - f_{j+1}| \le 1$. Since $|f_j| = |f_{j+1}| = 1$ we conclude that $f_j(x) \ne -f_{j+1}(x)$ for all $x$. The statement of the proposition follows by applying Lemma C.8.5 $n$ times.

Theorem C.8.7. Let $f : \mathbb{R}^d \to \mathbb{C} \setminus \{0\}$ be a continuous function such that $f(0) > 0$. Then $f$ has a unique continuous argument $\theta$ such that $\theta(0) = 0$.¹⁶

Proof. Replacing $f$ by $f/|f|$ we may assume that $|f| = 1$ and $f(0) = 1$. Denote by $f_n$ the restriction of $f$ to the compact set $B_d^c(n)$, $n \in \mathbb{N}$. We claim that $f_n$ has a continuous argument $\theta_n$ such that $\theta_n(0) = 0$. To show this define the function $g_n$ by
\[ g_n(x, t) := f_n(tx), \quad (x, t) \in B_d^c(n) \times [0, 1]. \]

16 See also Lemma C.1.11.

We have $g_n(x, 0) = 1$ and $g_n(x, 1) = f_n(x)$. Hence, the existence of a continuous argument for $f_n$ follows from Proposition C.8.6.
Using Lemma C.8.3 and the fact that $0 = \theta_n(0) = \theta_{n+1}(0)$ we see that $\theta_n(x) = \theta_{n+1}(x)$, $x \in B_d^c(n)$. Let $\theta$ be the unique function on $\mathbb{R}^d$ such that $\theta(x) = \theta_n(x)$, $x \in B_d^c(n)$. It is clear that $\theta$ is a continuous argument of $f$ while the uniqueness follows from Lemma C.8.3.

C.9 Solutions of certain functional equations


Recall that $B^o(\delta) = B_d^o(\delta)$ denotes the open ball
\[ \{ t \in \mathbb{R}^d : \|t\| < \delta \}. \]

Lemma C.9.1 (Cauchy’s functional equation). Let $f$ be a continuous real-valued function on $B^o(\delta)$ such that
\[ f(t + s) = f(t) + f(s), \quad t, s, t + s \in B^o(\delta). \tag{1} \]
Then there exists $c \in \mathbb{R}^d$ such that $f(t) = (c, t)$ for all $t \in B^o(\delta)$.

Proof. Assume first that $d = 1$. From $f(0) = f(0 + 0) = f(0) + f(0)$ we get $f(0) = 0$. Repeatedly applying equation (1) gives $f(t) = f(t/n + \cdots + t/n) = n f(t/n)$ for all $n \in \mathbb{N}$ and for all $t$. Now let $r = m/n$ be a rational number such that $rt \in B^o(\delta)$. Then
\[ f(rt) = f(mt/n) = m f(t/n) = \frac{m}{n} f(t) = r f(t). \]
By continuity, the equation $f(rt) = r f(t)$ holds for all $r \in \mathbb{R}$ such that $rt, t \in B^o(\delta)$. We replace in this equation $t$ by $t_0 \ne 0$ and $r$ by $t/t_0$ and obtain
\[ f(t) = \frac{f(t_0)}{t_0} \cdot t, \quad t \in B^o(\delta). \]
To prove the general case, let $e_1, \ldots, e_d$ be the standard basis of $\mathbb{R}^d$. Then
\[ f(t) = f(t_1 \cdot e_1 + \cdots + t_d \cdot e_d) = f_1(t_1) + \cdots + f_d(t_d) \]
where $f_j(t_j) = f(t_j \cdot e_j)$. We now apply the first part of the proof to the functions $f_j$.

Theorem C.9.2. A continuous complex-valued function $\varphi \ne 0$ on $B^o(\delta)$ satisfies the equation
\[ \varphi(t + s) = \varphi(t) \varphi(s), \quad t, s, t + s \in B^o(\delta) \tag{1} \]
if and only if there exists $c \in \mathbb{C}^d$ such that
\[ \varphi(t) = e^{(c, t)}, \quad t \in B^o(\delta). \]

Proof. It is clear that any function of the above form satisfies the equation (1). To prove the other direction we consider only the case $d = 1$. The general case can be reduced to this one in the same way as in the proof of Lemma C.9.1. From $\varphi(t + 0) = \varphi(t) \varphi(0)$ we see that $\varphi(0) = 1$. Thus, $1 = \varphi(t - t) = \varphi(t) \varphi(-t)$ and therefore $\varphi(t) \ne 0$ for all $t$. The function $|\varphi|$ satisfies the same equation as $\varphi$. Applying Lemma C.9.1 to the logarithm of $|\varphi|$, we obtain $|\varphi(t)| = e^{rt}$ with some $r \in \mathbb{R}$. The function $\chi = \varphi/|\varphi|$ satisfies the same equation as $\varphi$. By continuity, there exists $\delta_0 > 0$ such that the real part of $\chi(t)$ is positive whenever $t \in (-\delta_0, \delta_0)$. Setting $f(t) = \arctan(\operatorname{Im} \chi(t) / \operatorname{Re} \chi(t))$ we have $\chi(t) = e^{i f(t)}$, $t \in (-\delta_0, \delta_0)$. The function $f$ is continuous and satisfies equation (C.9.1.1). Applying Lemma C.9.1 and using $\varphi = |\varphi| \cdot \chi$ we conclude that
\[ \varphi(t) = e^{ct}, \quad t \in (-\delta_0, \delta_0) \]
with some $c \in \mathbb{C}$. To show that this equation holds for arbitrary $t \in B^o(\delta)$, let $n \in \mathbb{N}$ be such that $t/n \in (-\delta_0, \delta_0)$. Repeated application of (1) shows that
\[ \varphi(t) = \varphi(t/n + \cdots + t/n) = \varphi(t/n)^n = e^{ct}. \]

Lemma C.9.3 (Pexider’s functional equation). Let $f$, $g$ and $h$ be real-valued functions on $B^o(\delta)$ such that $f$ is continuous and
\[ f(t + s) = g(t) + h(s), \quad t, s, t + s \in B^o(\delta). \]
Then there exist $a, b \in \mathbb{R}$ and $c \in \mathbb{R}^d$ such that $f(t) = (c, t) + a + b$, $g(t) = (c, t) + a$, $h(t) = (c, t) + b$ for all $t \in B^o(\delta)$.

Proof. Let $a = g(0)$ and $b = h(0)$. Setting $s = 0$ in the above equation we get $g(t) = f(t) - b$. Similarly, $h(s) = f(s) - a$. Thus,
\[ f(t + s) = f(t) + f(s) - a - b \]
and therefore the function $f - a - b$ satisfies Cauchy’s functional equation. The assertion follows now from Lemma C.9.1.

Lemma C.9.4. Let $k \in \mathbb{N}_0$ and let $f$ be a continuous real-valued function on $B^o(\delta)$ such that for all $s \in B^o(\delta)$ the function
\[ t \mapsto f(t + s) - f(t), \quad t, t + s \in B^o(\delta) \]
is a polynomial of degree at most $k$. Then $f$ is a polynomial of degree at most $k + 1$.

Proof. First, let $d = 1$. By assumption,
\[ f(t + s) - f(t) = a_k(s) t^k + \cdots + a_1(s) t + a_0(s), \quad t, s, t + s \in (-\delta, \delta) \tag{1} \]

where $a_0, \ldots, a_k$ are real-valued functions. First we show that the functions $a_j$ are continuous. Let $s$ be fixed and choose mutually different $t_j$ such that $t_j, s + t_j \in (-\delta, \delta)$, $j = 0, \ldots, k$. Replacing $t$ by $t_j$ in (1) we obtain $k + 1$ equations which can be written in the form $\mathbf{f} = V\mathbf{a}$, where $V$ is the Vandermonde matrix corresponding to the $t_j$’s and $\mathbf{f}$ and $\mathbf{a}$ are column vectors with coordinates $f(t_j + s) - f(t_j)$ and $a_j(s)$, respectively. Since $V$ is invertible, we see that each $a_j$ can be written as
\[ a_j(s) = \sum_{i=0}^{k} r_i \cdot [f(t_i + s) - f(t_i)] \]
with some $r_i = r_i(t_0, \ldots, t_k) \in \mathbb{R}$, from which the continuity of $a_j$ follows.
We now prove the lemma by induction on $k$. The case $k = 0$ follows from Lemma C.9.3 with $g = f$. Assume that $k > 0$ and that the statement is true for $0, 1, \ldots, k - 1$. Replacing $s$ by $s + x$ in (1) gives
\[ f(t + s + x) - f(t) = \sum_{j=0}^{k} a_j(s + x) t^j \]
while replacing $t$ by $t + x$ gives
\[ f(t + s + x) - f(t + x) = \sum_{j=0}^{k} a_j(s) (t + x)^j. \]
These equations hold whenever $s, t, x, t + x, s + x, s + t + x \in (-\delta, \delta)$. Subtracting the second equation from the first one yields
\[ f(t + x) - f(t) = \sum_{j=0}^{k} \big( a_j(s + x) t^j - a_j(s) (t + x)^j \big). \]
On the other hand, (1) shows that
\[ f(t + x) - f(t) = \sum_{j=0}^{k} a_j(x) t^j. \]
For fixed $s$ and $x$ these equations hold for all $t$ in a neighborhood of zero. Comparing the coefficients of $t^k$ we obtain
\[ a_k(s + x) = a_k(s) + a_k(x), \quad s, x, s + x \in (-\delta, \delta). \]
Thus, in view of Lemma C.9.1, $a_k(s) = c \cdot s$ with some $c \in \mathbb{R}$. Define the function $g$ by $g(t) = f(t) - \frac{c}{k+1} \cdot t^{k+1}$. Then
\[ g(t + s) - g(t) = b_{k-1}(s) t^{k-1} + \cdots + b_0(s) \]
with some functions $b_j$. By the induction assumption $g$ is a polynomial of degree at most $k$, completing the proof in the case $d = 1$.

Now let $d$ be arbitrary. By assumption,
\[ f(t + s) - f(t) = \sum_{|\alpha| \le k} a_\alpha(s) t^\alpha, \quad t, s, t + s \in B^o(\delta) \tag{2} \]
with some real-valued functions $a_\alpha$. For fixed $t, s \in B^o(\delta)$ define the function $h$ by
\[ h(x) = f(t + x \cdot (s - t)), \quad x \in I := \{ r \in \mathbb{R} : t + r \cdot (s - t) \in B^o(\delta) \}. \]
Note that $I$ is an open interval containing $[0, 1]$. If $x, y, x + y \in I$, then
\[ h(x + y) - h(x) = f(t + x \cdot (s - t) + y \cdot (s - t)) - f(t + x \cdot (s - t)) = \sum_{|\alpha| \le k} a_\alpha(y \cdot (s - t)) \cdot (t + x \cdot (s - t))^\alpha. \]
The right-hand side is a polynomial in $x$ of degree at most $k$. By the first part of the proof, $h$ is a polynomial of degree at most $k + 1$.¹⁷ Application of Theorem B.6.6 completes the proof.

Theorem C.9.5. Let $f_1, \ldots, f_m$, $g$ and $h$ be continuous real-valued functions on $B^o(\delta)$. Further, let $A_i, B_i \in \mathbb{R}^{d \times d}$ be invertible matrices. If for some $\delta_0 \in (0, \delta]$ the equation
\[ \sum_{i=1}^{m} f_i(A_i t + B_i s) = g(t) + h(s), \quad t, s \in B^o(\delta_0) \tag{1} \]
holds, then in some neighborhood of zero the functions $g$ and $h$ are polynomials of degree at most $m$. If the matrices $B_i A_i^{-1} - B_j A_j^{-1}$ are invertible for all $i \ne j$, then all functions $f_i$ are polynomials of degree at most $m$.

Proof. Without loss of generality we may assume that $A_i = I_d$ (the identity matrix) for all $i$. We prove the assertions by induction on $m$. The case $m = 1$ follows from Lemma C.9.3. Let $m > 1$ and assume that the assertions are true for $1, \ldots, m - 1$. For $t$, $s$ and $x$ near to the origin we have
\[ \sum_{i=1}^{m} f_i(t + (B_m - B_i) x + B_i s) = \sum_{i=1}^{m} f_i(t + B_m x + B_i (s - x)) = g(t + B_m x) + h(s - x). \]
Subtracting from this equation (1) with $A_i = I_d$ we obtain
\[ \sum_{i=1}^{m-1} [f_i(t + (B_m - B_i) x + B_i s) - f_i(t + B_i s)] = [g(t + B_m x) - g(t)] + [h(s - x) - h(s)]. \]

17 The first part remains valid if we replace $(-\delta, \delta)$ by the possibly nonsymmetric interval $I$.

We consider the functions
\[ F_i(t) = f_i(t + (B_m - B_i) x) - f_i(t) \]
and
\[ G(t) = g(t + B_m x) - g(t), \quad H(s) = h(s - x) - h(s). \]
These functions satisfy the equation
\[ \sum_{i=1}^{m-1} F_i(t + B_i s) = G(t) + H(s). \tag{2} \]
The induction assumption (concerning $g$ and $h$) and Lemma C.9.4 show that $g$ and $h$ are polynomials of degree at most $m$. If $B_m - B_i$ is nonsingular, then the induction assumption (concerning $f_i$) and Lemma C.9.4 show that $f_i$ is a polynomial of degree at most $m$.

Proposition C.9.6. If $f : \mathbb{R}^d \to \mathbb{C}$ is a polynomial of degree $k > 0$, then for all $s \in \mathbb{R}^d$ the function

\[ t \mapsto f(t + s) - f(t) \]

is a polynomial of degree at most $k - 1$. More generally, if the measure

\[ \mu = \sum_{i=1}^{n} c_i \delta_{x_i}, \qquad c \in \mathbb{R}^n,\ x \in \mathbb{R}^n \]

is such that $\hat\mu(0) = \sum_{i=1}^{n} c_i = 0$, then $\mu * f$ is a polynomial of degree at most $k - 1$.

Proof. It is easy to check that the statement is true for polynomials of the form

\[ f(t) = \prod_{j=1}^{d} t_j^{\alpha_j}, \qquad t \in \mathbb{R}^d,\ \alpha \in \mathbb{N}_0^d \]

from which the general statement follows.
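
The first statement of Proposition C.9.6 can be illustrated in one variable. The following is only a sketch under stated assumptions: the helper names `shift_difference` and `degree` are hypothetical, polynomials are represented by coefficient lists (lowest degree first), and only the case $d = 1$ is treated.

```python
from math import comb

def shift_difference(coeffs, s):
    """Coefficients of t -> f(t + s) - f(t), where f(t) = sum(coeffs[j] * t**j)."""
    k = len(coeffs) - 1
    out = [0] * (k + 1)
    for j, a in enumerate(coeffs):
        # Binomial expansion: (t + s)**j = sum_i C(j, i) * s**(j - i) * t**i
        for i in range(j + 1):
            out[i] += a * comb(j, i) * s ** (j - i)
        out[j] -= a  # subtract the corresponding term of f(t)
    return out

def degree(coeffs):
    """Degree of the polynomial; -1 for the zero polynomial."""
    nonzero = [j for j, a in enumerate(coeffs) if a != 0]
    return max(nonzero) if nonzero else -1

f = [5, 0, -2, 1]            # f(t) = 5 - 2 t^2 + t^3, degree 3
g = shift_difference(f, 2)   # coefficients of f(t + 2) - f(t)
print(g, degree(g))
```

For this particular $f$ the difference is $6t^2 + 4t$, so the degree drops from 3 to 2, in agreement with the proposition.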

C.10 Linear independence of exponential functions


C.10.1. Let $P$ be a trigonometric polynomial of the form

\[ P(t) = \sum_{j=1}^{n} c_j e^{i(t, x_j)}, \qquad t \in \mathbb{R}^d \]

where $n \in \mathbb{N}$, $c \in \mathbb{C}^n$ and the $x_j$'s are pairwise distinct. Then $P$ can be regarded as the inverse Fourier transform of the measure

\[ \mu = \sum_{j=1}^{n} c_j \delta_{x_j}. \]

Since $\mu$ is uniquely determined by its inverse Fourier transform, we conclude that $P = 0$ implies $\mu = 0$. In other words, the functions $t \mapsto e^{i(t, x_j)}$, $t \in \mathbb{R}^d$, are linearly independent (over $\mathbb{C}$).
In this section we generalize this statement. We will use the notation

x .t / WD ei.x;t/ ; x; t 2 Rd :

Theorem C.10.2. Suppose that x1 ; : : : ; xn 2 Rd are pairwise distinct and let


; ¤ U  Rd be open. Then the functions xj jU ; 1  j  n, are linearly inde-
pendent.
Proof. For c 2 Cn write
X
n
P .t / WD cj xj .t / D 0; t 2 Rd
j D1

and assume that $P(t) = 0$ for all $t \in U$. For each $t_0 \in U$ the function $\varphi$ defined by

\[ \varphi(x) := P(x t_0), \qquad x \in \mathbb{R} \]

is analytic on $\mathbb{R}$. Since $\varphi$ vanishes on an open interval containing 1, we conclude that $\varphi = 0$. Consequently, $P(t) = 0$ for all $t \in \mathbb{R}^d$. The remarks in the preceding paragraph show that $c = 0$, from which the theorem follows.

Theorem C.10.3. Suppose that x1 ; : : : ; xn 2 Rd ; d  2, are pairwise distinct and


let S  Rd be a sphere of positive radius. Then the functions xj jS ; 1  j  n, are
linearly independent.18
Proof. We may suppose that S D S d 1 . We have to show that the relation
X
n X
n
cj xj .t / D cj ei.t;xj / D 0; t 2 S d 1 (1)
j D1 j D1

where $c_j \in \mathbb{C}$, implies that $c = 0$. Assume, on the contrary, that $c \ne 0$. Removing the zeros from the numbers $c_j$ we suppose that $c_j \ne 0$ for all $j$. The case $n = 1$ being trivial, let $n \ge 2$. Write

\[ R := \max_{1 \le j \le n} \|x_j\| > 0 \]

18 The proof of this result is taken from [24].



and let $e_1, \ldots, e_d$ be the standard basis of $\mathbb{R}^d$. We may assume that $x_1 = R e_1$. Denoting by $P$ the orthogonal projection onto $\mathrm{span}\{e_1, e_2\}$ we have

\[ P x_j = r_j [\cos(\varphi_j)\, e_1 + \sin(\varphi_j)\, e_2] \]

where $r_j = \|P x_j\| \le \|x_j\| \le R$ and $0 \le \varphi_j < 2\pi$. Noting that $r_1 = R$ we may assume that for some $n_0 \in \mathbb{N}$ we have $r_j = R$ if $1 \le j \le n_0$ and $r_j < R$ if $j > n_0$. Then $\|P x_j\| = \|x_j\|$ and hence $P x_j = x_j$ for all $j \le n_0$. Since the $x_j$'s are pairwise distinct we may also assume that

\[ 0 = \varphi_1 < \varphi_2 < \cdots < \varphi_{n_0} < 2\pi. \tag{2} \]

For all $\varphi \in \mathbb{R}$ we have $t := \cos(\varphi)\, e_1 + \sin(\varphi)\, e_2 \in S^{d-1}$. Putting this $t$ into equation (1) and using that $(t, x_j) = (Pt, x_j) = (t, P x_j)$ we obtain

\[ \sum_{j=1}^{n} c_j e^{i r_j \cos(\varphi - \varphi_j)} = 0, \qquad \varphi \in \mathbb{R}. \]

It follows from Theorem C.1.8 that this equation holds in the whole complex plane:

\[ \sum_{j=1}^{n} c_j e^{i r_j \cos(z - \varphi_j)} = 0, \qquad z = x + iy \in \mathbb{C}. \tag{3} \]

Since

\[ \cos(z - \varphi_j) = \frac{1}{2}\left( e^{i(z - \varphi_j)} + e^{-i(z - \varphi_j)} \right) \]

we have

\[ \left| e^{i r_j \cos(z - \varphi_j)} \right| = \exp\left( -\frac{r_j}{2} \operatorname{Im}\left( e^{i(z - \varphi_j)} + e^{-i(z - \varphi_j)} \right) \right) = e^{r_j \sinh(y) \sin(x - \varphi_j)}. \]

Let $\psi_j(z) \in [0, 2\pi)$ be the argument of $e^{i r_j \cos(z - \varphi_j)}$. Then, by relation (3),

\[ \sum_{j=1}^{n} c_j e^{r_j \sinh(y) \sin(x - \varphi_j)} e^{i\psi_j(z)} = 0. \]

Multiplying this equation by $e^{-R \sinh(y)}$, we obtain

\[ \sum_{j=1}^{n_0} c_j e^{R \sinh(y)[\sin(x - \varphi_j) - 1]} e^{i\psi_j(z)} + \sum_{j=n_0+1}^{n} c_j e^{\sinh(y)[r_j \sin(x - \varphi_j) - R]} e^{i\psi_j(z)} = 0. \tag{4} \]

Set $x := \varphi_1 + \frac{\pi}{2}$. Then $\sin(x - \varphi_j) - 1$ is equal to 0 if $j = 1$ and it is negative if $1 < j \le n_0$, in view of (2). Since $r_j < R$ if $j > n_0$ we conclude that $r_j \sin(x - \varphi_j) - R$ is negative for all such $j$. Taking the limit $y \to \infty$ in (4), we obtain $c_1 = 0$. This contradiction shows that the relation (1) implies that $c = 0$. The theorem is proved.
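
The modulus identity used in the proof of Theorem C.10.3, namely that the absolute value of $e^{ir\cos(z-\varphi)}$ equals $e^{r\sinh(y)\sin(x-\varphi)}$ for $z = x + iy$, can be checked numerically. The sketch below is only an illustration: the function names and the sample values are arbitrary choices, not taken from the text.

```python
import cmath
import math

def modulus_lhs(r, phi, z):
    """|exp(i * r * cos(z - phi))| evaluated with complex arithmetic."""
    return abs(cmath.exp(1j * r * cmath.cos(z - phi)))

def modulus_rhs(r, phi, z):
    """exp(r * sinh(y) * sin(x - phi)) for z = x + i*y."""
    x, y = z.real, z.imag
    return math.exp(r * math.sinh(y) * math.sin(x - phi))

# Arbitrary sample points (r, phi, z) chosen for illustration.
samples = [
    (1.5, 0.3, complex(0.7, 1.1)),
    (2.0, 2.5, complex(-1.2, 0.4)),
    (0.5, 4.0, complex(3.0, -0.9)),
]
max_rel_err = max(
    abs(modulus_lhs(*s) - modulus_rhs(*s)) / modulus_rhs(*s) for s in samples
)
print(max_rel_err)
```

The relative error should be at the level of floating-point rounding, which is consistent with the closed form derived from $\operatorname{Im}\cos(a + iy) = -\sin(a)\sinh(y)$.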
Appendix D

Functional analysis

D.1 Inner product spaces


Let $V$ and $W$ be linear spaces over $K$ where $K = \mathbb{R}$ or $K = \mathbb{C}$. The dimension of $V$ is denoted by $\dim(V)$.
A linear operator from $V$ into $W$ is a mapping $A : V \to W$ satisfying

\[ A(av + bw) = aA(v) + bA(w), \qquad a, b \in K,\ v, w \in V. \]

If $W = V$ then we say that $A$ is a linear operator in $V$. If $W = K$ then $A$ is called a linear functional on $V$. A mapping $(\cdot,\cdot)$ from $V \times V$ into $K$ satisfying
(i) $(c_1 v_1 + c_2 v_2, w) = c_1 (v_1, w) + c_2 (v_2, w)$
(ii) $(w, c_1 v_1 + c_2 v_2) = \overline{c_1}\,(w, v_1) + \overline{c_2}\,(w, v_2)$
for all $v_1, v_2, w \in V$ and $c_1, c_2 \in K$, is called a sesquilinear form on $V$.
A positive semidefinite inner product is a sesquilinear form such that $(v, w) = \overline{(w, v)}$ and $(v, v) \ge 0$. If $(v, v) > 0$ whenever $v \ne 0$, then the inner product $(\cdot,\cdot)$ is called positive definite. The linear space $V$ together with the inner product $(\cdot,\cdot)$ is called a (positive definite or positive semidefinite, respectively) inner product space.
Two elements $v, w$ of an inner product space $V$ are called orthogonal if $(v, w) = 0$. Orthogonality of two sets is defined correspondingly. Orthogonality of elements and sets is denoted by the symbol $\perp$. The orthogonal complement $M^\perp$ of a set $M \subset V$ is the set of all elements of $V$ which are orthogonal to $M$: $M^\perp = \{ v \in V : v \perp M \}$. The orthogonal complement is a linear space. If $L, M \subset V$ are linear spaces, then the notation

\[ V = L \oplus M \]

means that $V = L + M$ and $L \perp M$.
If $V$ is finite dimensional then any linear functional $l$ on $V$ can be written as

\[ l(v) = (v, w), \qquad v \in V \]

with some $w \in V$.${}^{1}$ In case of a positive definite inner product, $w$ is uniquely determined. Any positive semidefinite inner product $\langle\cdot,\cdot\rangle$ on the linear space $K^d$ can be written as $\langle v, w\rangle = (Av, w)$ where $A \in K^{d\times d}$ is positive semidefinite and

\[ (v, w) = \sum_{j=1}^{d} v_j \overline{w_j} \]

is the standard inner product of $K^d$.

1 See also Theorem D.3.4.


Writing

\[ \|v\| = \sqrt{(v, v)}, \qquad v \in V \]

for all $v, w \in V$ and $a \in K$ we have
(i) $\|v\| \ge 0$
(ii) $\|av\| = |a| \cdot \|v\|$
(iii) $|(v, w)| \le \|v\| \cdot \|w\|$ (Cauchy–Schwarz inequality)
(iv) $\|v + w\| \le \|v\| + \|w\|$ (triangle inequality)
(v) $\|v + w\|^2 + \|v - w\|^2 = 2\|v\|^2 + 2\|w\|^2$ (parallelogram law).
Any mapping $\|\cdot\| : V \to [0, \infty)$ satisfying (ii) and (iv) is called a seminorm. If $\|v\| > 0$ for all $v \in V \setminus \{0\}$ then $\|\cdot\|$ is called a norm. A linear space equipped with a norm is called a normed linear space. Defining the distance $\rho(v, w)$ of $v$ and $w$ by $\rho(v, w) := \|v - w\|$, the linear space $V$ becomes a metric space. If this metric space is complete, i.e., if every Cauchy sequence is convergent, then $V$ is called a Banach space.
The inner product $(\cdot,\cdot)$ of a semidefinite inner product space can be expressed in terms of the corresponding seminorm:

\[ (v, w) = \frac{1}{4}\left( \|v + w\|^2 - \|v - w\|^2 \right), \qquad \text{if } K = \mathbb{R} \tag{1} \]

\[ (v, w) = \frac{1}{4} \sum_{k=0}^{3} i^k \|v + i^k w\|^2, \qquad \text{if } K = \mathbb{C}. \tag{2} \]

The two equations above are called polarization identities. A result, attributed to M. R. Fréchet, J. von Neumann and P. Jordan, states that if a seminorm (norm) $\|\cdot\|$ satisfies the parallelogram law, then the equations above define a semidefinite (positive definite, respectively) inner product $(\cdot,\cdot)$.
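
As a small numerical illustration of the complex polarization identity (2), the following sketch recovers the standard inner product of $\mathbb{C}^3$ from norms alone. The helper names and the sample vectors are hypothetical choices made only for this example; the inner product is taken linear in the first argument and conjugate-linear in the second, as in D.1.

```python
def inner(v, w):
    """Standard inner product on C^d: linear in v, conjugate-linear in w."""
    return sum(a * b.conjugate() for a, b in zip(v, w))

def norm_sq(v):
    """Squared norm ||v||^2 = (v, v)."""
    return inner(v, v).real

def polarization(v, w):
    """Right-hand side of identity (2): (1/4) * sum_k i^k * ||v + i^k w||^2."""
    total = 0
    for k in range(4):
        ik = 1j ** k
        shifted = [a + ik * b for a, b in zip(v, w)]
        total += ik * norm_sq(shifted)
    return total / 4

# Arbitrary sample vectors in C^3.
v = [1 + 2j, -0.5j, 3 + 0j]
w = [2 - 1j, 1 + 1j, -1j]
print(inner(v, w), polarization(v, w))
```

Both printed values should agree up to rounding, since the cross terms in $\sum_k i^k \|v + i^k w\|^2$ add up to $4(v, w)$.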

D.2 Matrices and kernels


Basic facts and notation D.2.1. The identity matrix is $I = I_n = (\delta_{ij}) \in \mathbb{R}^{n\times n}$, where $\delta_{ii} = 1$ and $\delta_{ij} = 0$ if $i \ne j$. A permutation matrix is obtained by reordering the columns of the identity matrix. The symbol $\mathrm{diag}(c_1, \ldots, c_n)$ where $c_j \in \mathbb{C}$ denotes the diagonal matrix $(d_{ij})$ with $d_{ii} = c_i$ and $d_{ij} = 0$, $i \ne j$. More generally, if $D_1, \ldots, D_n$ are matrices, then we write

\[ \mathrm{diag}(D_1, \ldots, D_n) = \begin{bmatrix} D_1 & & \\ & \ddots & \\ & & D_n \end{bmatrix} \]

where we use the convention that blank entries in a matrix denote zeros.
For a matrix $A = (a_{ij}) \in \mathbb{C}^{m\times n}$, $A^T$ denotes the transpose $(a_{ji}) \in \mathbb{C}^{n\times m}$ and $A^*$ denotes the conjugate transpose $(\overline{a_{ji}})$. The rank of $A$ is defined by $\mathrm{rank}(A) = \dim(\mathrm{range}(A))$, where${}^{2}$

\[ \mathrm{range}(A) = A\mathbb{C}^n = \{ Ax : x \in \mathbb{C}^n \} \subset \mathbb{C}^m. \]

The rank of A is equal to the maximum number of linearly independent column vectors
of A, which is the same as the maximum number of linearly independent row vectors.
If the product AB is defined then

rank.AB/  min.rank.A/; rank.B//:

Let $A \in \mathbb{C}^{n\times n}$ be a square matrix for the rest of this paragraph. $A$ is diagonalizable if there exists an invertible matrix $X \in \mathbb{C}^{n\times n}$ such that $XAX^{-1}$ is diagonal. $A$ is symmetric if $A = A^T$ and $A$ is Hermitian if $A = A^*$. $A$ is called unitary if $A^* = A^{-1}$. If $A$ is real and $A^T = A^{-1}$, then $A$ is called orthogonal. We denote by $O(n)$ the set of all $n \times n$ orthogonal matrices. $A$ is said to be normal if $A^*A = AA^*$. A matrix is normal if and only if it is diagonalizable by a unitary matrix.
$A$ is positive semidefinite if $x^*Ax \ge 0$ for all $x \in \mathbb{C}^n$ and positive definite if $x^*Ax > 0$ for all nonzero $x \in \mathbb{C}^n$. We show below (cf. Theorem D.2.3) that positive semidefinite matrices are Hermitian.
The determinant of $A$ is denoted by $\det(A)$. Key properties are

\[ \det(AB) = \det(A) \cdot \det(B) \]

\[ \det(cA) = c^n \det(A), \qquad c \in \mathbb{C}. \]

The complex number $\lambda$ is an eigenvalue of $A$ with corresponding eigenvector $x \in \mathbb{C}^n \setminus \{0\}$ if $Ax = \lambda x$. The eigenvalues are the zeros of the characteristic polynomial

\[ q(t) = \det(tI - A), \qquad t \in \mathbb{C} \]

which has degree $n$. The eigenspace of $A$ corresponding to an eigenvalue $\lambda$ is the linear space $\{ x \in \mathbb{C}^n : Ax = \lambda x \}$. The algebraic multiplicity of $\lambda$ is its multiplicity as a zero of the characteristic polynomial. The geometric multiplicity of $\lambda$ is the dimension of the eigenspace of $A$ corresponding to $\lambda$. If $A$ is Hermitian, then the eigenvalues are real; if $A$ is positive semidefinite, then they are nonnegative.
2 Recall our convention from A.2 that in expressions involving matrix operations, e.g., as in $Ax$, we consider $x \in \mathbb{C}^n$ as a column vector.

The trace of $A$ is defined by

\[ \mathrm{trace}(A) = \sum_{i=1}^{n} a_{ii}. \]

It is easy to check that $\mathrm{trace}(AB) = \mathrm{trace}(BA)$ and hence $\mathrm{trace}(X^{-1}AX) = \mathrm{trace}(A)$ for an arbitrary invertible matrix $X \in \mathbb{C}^{n\times n}$. The trace of $A$ is the sum of all eigenvalues of $A$ while the determinant is the product of all eigenvalues of $A$.
A Vandermonde matrix $V$ has the form

\[ V = V(c_1, \ldots, c_n) = \begin{bmatrix} 1 & c_1 & c_1^2 & \cdots & c_1^{n-1} \\ 1 & c_2 & c_2^2 & \cdots & c_2^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & c_n & c_n^2 & \cdots & c_n^{n-1} \end{bmatrix}; \]

its determinant is equal to

\[ \prod_{1 \le i < j \le n} (c_j - c_i) \]

and hence $V$ is invertible if and only if the $c_j$'s are mutually distinct.
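
The product formula for the Vandermonde determinant can be checked on a small example with exact rational arithmetic. The code below is a minimal sketch: `det`, `vandermonde` and `product_formula` are hypothetical helper names, and the determinant is computed by plain Gaussian elimination rather than by any particular library routine.

```python
from fractions import Fraction
from itertools import combinations

def det(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def vandermonde(cs):
    """Rows (1, c, c^2, ..., c^(n-1)) for each node c."""
    n = len(cs)
    return [[c ** p for p in range(n)] for c in cs]

def product_formula(cs):
    """Product of (c_j - c_i) over all pairs i < j."""
    prod = Fraction(1)
    for i, j in combinations(range(len(cs)), 2):
        prod *= cs[j] - cs[i]
    return prod

cs = [2, -1, 3, 5]
print(det(vandermonde(cs)), product_formula(cs))
```

With a repeated node, for instance `[2, 3, 2]`, both expressions vanish, which matches the invertibility criterion stated above.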


Let $f$ be a complex-valued function defined on some set $S \subset \mathbb{C}$ such that $S$ contains all eigenvalues of $A$. Assume that $A$ is diagonalizable. Then $A$ can be expressed in the form $A = X^{-1}DX$ with some invertible matrix $X$ and a diagonal matrix $D$. The diagonal entries $\lambda_1, \ldots, \lambda_n$ of $D$ are the eigenvalues of $A$. The matrix $f(A)$ is defined by

\[ f(A) = f(X^{-1}DX) = X^{-1}\, \mathrm{diag}(f(\lambda_1), \ldots, f(\lambda_n))\, X. \]

The diagonal matrix $D$ is unique up to the ordering of the eigenvalues, but the matrix $X$ is not unique. However, $f(A)$ can be shown to be independent of the particular representation of $A$.${}^{3}$

Kernels D.2.2. Let $X$ be a nonempty set. A kernel on $X \times X$ is a complex-valued function on $X \times X$. A kernel $K$ is called Hermitian if $K(y, x) = \overline{K(x, y)}$ for all $x, y \in X$ and it is called positive definite if the matrix $(K(x_j, x_k))_{j,k=1}^{n}$ is positive definite for every choice of $n \in \mathbb{N}$, $x_1, \ldots, x_n \in X$ such that $x_j \ne x_k$ for $j \ne k$. The kernel $K$ is positive semidefinite if the same matrices are always positive semidefinite. In this case, they are positive semidefinite even without the assumption that $x_j \ne x_k$ for $j \ne k$. Indeed, if $x_1, \ldots, x_n \in X$ are arbitrary and $x_{l_1}, \ldots, x_{l_p}$ are the mutually different elements among the $x_j$'s, then

\[ \sum_{j,k=1}^{n} c_j \overline{c_k}\, K(x_j, x_k) = \sum_{j,k=1}^{p} d_j \overline{d_k}\, K(x_{l_j}, x_{l_k}) \]

where

\[ d_k = \sum_{i :\, x_i = x_{l_k}} c_i, \qquad k = 1, \ldots, p. \]

Next we give proofs of some results on matrices which are used in the book.

3 There are several equivalent ways to define $f(A)$ for arbitrary square matrices; a detailed treatment is given in [29].

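Theorem D.2.3. Every positive semidefinite matrix $A = (a_{jk})$ is Hermitian and $a_{jj} \ge 0$ for all $j$.

Proof. For $u, v \in \mathbb{C}$ we have

\[ 0 \le |u|^2 a_{jj} + |v|^2 a_{kk} + u\overline{v}\, a_{jk} + v\overline{u}\, a_{kj}. \]

Taking $(u, v)$ equal to $(1, 0)$ or $(0, 1)$, we see that $a_{jj}$ and $a_{kk}$ are nonnegative and, in particular, real. Thus,

\[ u\overline{v}\, a_{jk} + v\overline{u}\, a_{kj} \in \mathbb{R}, \qquad u, v \in \mathbb{C}. \]

Taking $(u, v) = (1, 1)$ we get $a_{jk} + a_{kj} \in \mathbb{R}$. Taking $(u, v) = (1, i)$ we get that $-i(a_{jk} - a_{kj}) \in \mathbb{R}$. Writing $a = a_{jk} + a_{kj} \in \mathbb{R}$ and $b = -i(a_{jk} - a_{kj}) \in \mathbb{R}$, we infer $a_{jk} = \frac{a + ib}{2}$ and $a_{kj} = \frac{a - ib}{2}$, from which the desired equality $a_{kj} = \overline{a_{jk}}$ is immediate.

We remark that the matrix $(a_{jk}) = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}$ is not Hermitian but we have

\[ \sum_{j,k=1}^{2} a_{jk} r_j r_k = r_1^2 + r_2^2 > 0 \]

for arbitrary real numbers $r_1$ and $r_2$, not both zero.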
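
The remark can be made concrete with a short computation. The sketch below uses one non-Hermitian $2\times 2$ matrix with the stated property (diagonal entries 1 and off-diagonal entries $1$ and $-1$; the exact placement of the minus sign is an assumption of this example) and evaluates the form $x^*Ax$ on a real and on a complex vector. The helper name `form` is hypothetical.

```python
# Non-Hermitian matrix whose real quadratic form is r1^2 + r2^2.
A = [[1, 1], [-1, 1]]

def form(A, x):
    """x^* A x = sum over j, k of conj(x_j) * a_jk * x_k."""
    n = len(x)
    return sum(
        complex(x[j]).conjugate() * A[j][k] * x[k]
        for j in range(n)
        for k in range(n)
    )

real_value = form(A, [3, -2])      # real vector: equals 3^2 + (-2)^2
complex_value = form(A, [1, 1j])   # complex vector: the value is not real
print(real_value, complex_value)
```

For complex vectors the form takes non-real values, so this matrix is not positive semidefinite in the sense of D.2.1, which is consistent with Theorem D.2.3.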

Theorem D.2.4. A real matrix $A = (a_{jk})_{j,k=1}^{n}$ is positive semidefinite if and only if it is symmetric and

\[ \sum_{j,k=1}^{n} a_{jk} r_j r_k \ge 0 \]

for all real numbers $r_1, \ldots, r_n$.

Proof. The “only if” part follows immediately from Theorem D.2.3. If $A$ is real and symmetric, then

\[ \sum_{j,k=1}^{n} a_{jk} c_j \overline{c_k} = \sum_{j,k=1}^{n} a_{jk} \operatorname{Re}(c_j \overline{c_k}) \in \mathbb{R} \]

for all $c_j \in \mathbb{C}$. Using this and writing $c_j = x_j + i y_j$, $x_j, y_j \in \mathbb{R}$, we have

\[ \sum_{j,k=1}^{n} a_{jk} c_j \overline{c_k} = \sum_{j,k=1}^{n} a_{jk} (x_j x_k + y_j y_k) + i \sum_{j,k=1}^{n} a_{jk} (y_j x_k - x_j y_k) = \sum_{j,k=1}^{n} a_{jk} x_j x_k + \sum_{j,k=1}^{n} a_{jk} y_j y_k \]

from which the “if” part follows.

Definition D.2.5. Suppose $A = (a_{jk})_{j,k=1}^{n}$ is a Hermitian matrix. Then $A$ has $n$ real eigenvalues, counted with multiplicity, and there exists an orthonormal basis of $\mathbb{C}^n$ consisting of eigenvectors of $A$. The number of negative squares of $A$ is the number of negative eigenvalues of $A$, counted with multiplicity. A negative (respectively nonnegative, respectively positive) subspace for $A$ is a linear subspace $E$ of $\mathbb{C}^n$ such that

\[ \sum_{j,k=1}^{n} a_{jk}\, \overline{x_j}\, x_k = (Ax, x) < 0 \]

(respectively $\ge 0$, respectively $> 0$) for all $x \in E \setminus \{0\}$.

Note that, by orthogonality, the linear space spanned by eigenvectors corresponding to negative (nonnegative, positive, nonpositive) eigenvalues is negative (nonnegative, positive, nonpositive, respectively).

Theorem D.2.6. If $A$ is a Hermitian matrix, then the number of negative squares of $A$ is equal to the maximal dimension of a negative subspace for $A$.

Proof. Let the order of $A$ be $n$ and let $k$ be the number of negative squares of $A$. In view of the remark after Definition D.2.5, the matrix $A$ has a negative subspace of dimension $k$ and $A$ has a nonnegative subspace $P$ of dimension $n - k$. Now let $N$ be an arbitrary negative subspace for $A$. Since $N$ cannot intersect $P$ except in 0 we must have $\dim(N) \le k$.

Theorem D.2.7. If $A \in \mathbb{C}^{n\times n}$ is an invertible Hermitian matrix, then the number of negative squares of $A$ is equal to the number of changes of sign in the sequence

\[ (1, \det A_1, \ldots, \det A_n) \]

where $A_k = (a_{ij})_{i,j=1}^{k}$.

Proof. For $1 \le k \le n$ we identify $\mathbb{C}^k$ with the linear subspace

\[ \{ (x_1, \ldots, x_k, 0, \ldots, 0) : (x_1, \ldots, x_k) \in \mathbb{C}^k \} \]

of $\mathbb{C}^n$. Further, let $m_k$ be the number of negative squares of $A_k$. It follows from Theorem D.2.6 that $m_{k-1} \le m_k$ for $k = 2, \ldots, n$. If $k \ge 2$ and if $E$ is a negative subspace for $A_k$ of dimension $m_k$, then $E \cap \mathbb{C}^{k-1}$ is a negative subspace for $A_{k-1}$ of co-dimension at most 1 in $E$, hence of dimension at least $m_k - 1$. Thus, $m_{k-1} \le m_k \le m_{k-1} + 1$ for $k = 2, \ldots, n$. The determinant of a Hermitian matrix is equal to the product of all eigenvalues (considered with multiplicities). Therefore, the sign of $\det A_k$ is $(-1)^{m_k}$, from which the statement follows.
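
The sign-change rule of Theorem D.2.7 is easy to try out on small real symmetric matrices. The following is a sketch under explicit assumptions: the helper names are hypothetical, entries are rational so that determinants are exact, and the code additionally requires all leading principal minors to be nonzero so that every sign in the sequence is well defined.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def leading_minors(a):
    """det A_1, ..., det A_n for the leading principal submatrices A_k."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]

def sign_changes(seq):
    """Number of adjacent pairs with opposite signs."""
    return sum(1 for p, q in zip(seq, seq[1:]) if p * q < 0)

def negative_squares(a):
    """Sign changes in (1, det A_1, ..., det A_n); assumes nonzero minors."""
    minors = leading_minors(a)
    if any(m == 0 for m in minors):
        raise ValueError("this sketch assumes nonzero leading principal minors")
    return sign_changes([Fraction(1)] + minors)

print(negative_squares([[1, 0, 0], [0, -1, 0], [0, 0, -1]]))  # eigenvalues 1, -1, -1
print(negative_squares([[2, 1], [1, -1]]))                    # determinant -3
print(negative_squares([[2, 1], [1, 2]]))                     # positive definite
```

In the diagonal example the count can be compared directly with the number of negative diagonal entries; in the second example the negative determinant of a $2 \times 2$ matrix forces exactly one negative eigenvalue.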

Noting that positive definite matrices are nonsingular, we obtain the following corollary.

Corollary D.2.8. An invertible Hermitian matrix $A \in \mathbb{C}^{n\times n}$ is positive definite if and only if $\det A_k > 0$ for $k = 1, \ldots, n$.

As the example

\[ A = \begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix} \]

shows, there exist matrices with $\det A_k \ge 0$ for all $k$ that are not positive semidefinite. However, the following result holds.

Theorem D.2.9. A Hermitian matrix $A \in \mathbb{C}^{n\times n}$ is positive semidefinite if and only if all subdeterminants of $A$ are nonnegative.

Proof. If $A = (a_{jk})$ is positive semidefinite, then the eigenvalues and hence the determinant of $A$ are nonnegative.
On the other hand, let us assume that all subdeterminants are nonnegative. For $\varepsilon > 0$ we define the matrix $A_\varepsilon = A + \varepsilon I_n$. Then

\[ \det(A_\varepsilon) = \sum_{l=0}^{n} d_l\, \varepsilon^l \]

where $d_n = 1$ and

\[ d_l = \sum_{S_l} \det\bigl( (a_{jk})_{j,k \in S_l} \bigr) \ge 0, \qquad l = 0, 1, \ldots, n-1. \]

Here the sum is over all sets $S_l \subset \{1, \ldots, n\}$ having $n - l$ elements. From this we conclude that $\det(A_\varepsilon) \ge \varepsilon^n > 0$. Corollary D.2.8 shows that $A_\varepsilon$ is positive definite. Consequently, the pointwise limit $A = \lim_{\varepsilon \to 0} A_\varepsilon$ is positive semidefinite.

As an application of Theorem D.2.9 we present the following result:

Lemma D.2.10. The kernels

\[ K_1(x, y) = \min(x, y), \qquad x, y \in [0, \infty) \]

and

\[ K_2(x, y) = \frac{1}{2} \cdot (|x| + |y| - |x - y|), \qquad x, y \in \mathbb{R} \]

are positive semidefinite.${}^{4}$

Proof. We consider first the kernel $K_1$ and show that the matrix

\[ A = \bigl( \min(x_i, x_j) \bigr)_{i,j=1}^{n} \]

is positive semidefinite for all $x_1, \ldots, x_n \in [0, \infty)$. We may assume that $x_1 \le x_2 \le \cdots \le x_n$. We then have

\[ D_n(x_1, \ldots, x_n) := \det(A) = \begin{vmatrix} x_1 & x_1 & x_1 & \cdots & x_1 \\ x_1 & x_2 & x_2 & \cdots & x_2 \\ x_1 & x_2 & x_3 & \cdots & x_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_1 & x_2 & x_3 & \cdots & x_n \end{vmatrix}. \]

Subtracting the first row from all other rows and expanding the determinant along the first column we see that

\[ D_n(x_1, x_2, \ldots, x_n) = x_1 \cdot D_{n-1}(x_2 - x_1, \ldots, x_n - x_1). \]

Induction on $n$ leads to

\[ D_n(x_1, x_2, \ldots, x_n) = x_1 (x_2 - x_1)(x_3 - x_2) \cdots (x_n - x_{n-1}) \]

from which $\det A \ge 0$ follows. Since all subdeterminants of $A$ have the same structure, they are also nonnegative. Theorem D.2.9 shows that $A$ is positive semidefinite.
We now consider the kernel $K_2$. If $xy \le 0$ then $K_2(x, y) = 0$, otherwise $K_2(x, y) = K_1(|x|, |y|)$. Thus, for $x_1 \le \cdots \le x_n$,

\[ \bigl( K_2(x_i, x_j) \bigr)_{i,j=1}^{n} = \mathrm{diag}(A_1, A_2) \]

where $A_1$ and $A_2$ are, by the first part of the lemma, positive semidefinite matrices. This completes the proof.
Alternative proof. We present an alternative proof for the positive semidefiniteness of the matrix $A = (\min(x_i, x_j))_{i,j=1}^{n}$, where $0 \le x_1 \le x_2 \le \cdots \le x_n$. By continuity, it suffices to consider positive rational $x_j$'s. Multiplying $A$ by a positive integer we may even assume that the $x_j$'s are positive integers. Then for some $m$ the matrix $A$ is a submatrix of $(\min(i, j))_{i,j=1}^{m}$ which is positive definite since it is the square of the symmetric matrix $B = (b_{ij})$ where $b_{ij} = 0$ if $i + j \le m$ and $b_{ij} = 1$ if $i + j > m$. For example,

\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 3 \\ 1 & 2 & 3 & 4 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}^2. \]

4 Note that $K_1(x, y) = K_2(x, y)$ if $x, y \ge 0$.
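
The factorization used in the alternative proof of Lemma D.2.10 can be verified mechanically for small $m$. The sketch below builds the 0/1 matrix $B$ described there and compares $B^2$ with $(\min(i, j))$; the function names are hypothetical and chosen only for this illustration.

```python
def anti_triangular(m):
    """Matrix B with b_ij = 1 if i + j > m and 0 otherwise (i, j = 1..m)."""
    return [[1 if i + j > m else 0 for j in range(1, m + 1)]
            for i in range(1, m + 1)]

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def min_matrix(m):
    """Matrix with entries min(i, j) for i, j = 1..m."""
    return [[min(i, j) for j in range(1, m + 1)] for i in range(1, m + 1)]

m = 4
B = anti_triangular(m)
print(matmul(B, B) == min_matrix(m))
```

The identity holds because the $(i, j)$ entry of $B^2$ counts the indices $k$ with $k > m - i$ and $k > m - j$, and there are exactly $\min(i, j)$ of them.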

Theorem D.2.11. A matrix $A \in \mathbb{C}^{n\times n}$ is positive semidefinite if and only if there exists a Hermitian matrix $B \in \mathbb{C}^{n\times n}$ such that $A = B^2$. If $A$ is real then $B$ can be chosen real as well.

Proof. If $A = B^2$ for some $n \times n$ Hermitian matrix $B$, then

\[ (Ax, x) = (B^2 x, x) = (Bx, Bx) \ge 0, \qquad x \in \mathbb{C}^n. \]

Thus, $A$ is positive semidefinite.

For the converse, let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of $\mathbb{C}^n$ (or of $\mathbb{R}^n$ if $A$ is real), where $e_j$ is an eigenvector of $A$ with corresponding eigenvalue $r_j \ge 0$. Consider $e_j$ as a column vector and let $Q = [e_1, \ldots, e_n]$. Using orthonormality we see that $Q$ is unitary, i.e., $Q^*Q$ is the identity matrix. Moreover,

\[ Q\, \mathrm{diag}(r_1, \ldots, r_n)\, Q^* e_j = r_j e_j = A e_j \]

and hence $A = Q\, \mathrm{diag}(r_1, \ldots, r_n)\, Q^*$. It is now easy to check that the square of the Hermitian matrix

\[ B = Q\, \mathrm{diag}(\sqrt{r_1}, \ldots, \sqrt{r_n})\, Q^* \]

is equal to $A$.

Theorem D.2.12 (Schur). Let $A = (a_{ij})$ and $B = (b_{ij})$ be $n \times n$ positive semidefinite matrices. Then the matrix $C = (a_{ij} b_{ij})_{i,j=1}^{n}$ is positive semidefinite. If $A \ne 0$ and $B$ is positive definite, then $C$ is positive definite, as well.

Proof. Suppose first that $A$ and $B$ are positive semidefinite. We assume that $A \ne 0$. By Theorem D.2.11 there exists an $n \times n$ Hermitian matrix $D$ such that $A = D^2$. Writing $D = (d_{ij})$, we have for $x = (x_i) \in \mathbb{C}^n$

\[ (Cx, x) = \sum_{i,j=1}^{n} a_{ij} b_{ij} x_j \overline{x_i} = \sum_{i,j=1}^{n} \sum_{l=1}^{n} \overline{d_{li}}\, d_{lj}\, b_{ij} x_j \overline{x_i} = \sum_{l=1}^{n} \sum_{i,j=1}^{n} b_{ij}\, y_{lj}\, \overline{y_{li}} = \sum_{l=1}^{n} (B y_l, y_l) \ge 0 \]

by the positive semidefiniteness of $B$, where $y_l = (y_{l1}, \ldots, y_{ln}) \in \mathbb{C}^n$ with components $y_{li} = d_{li} x_i$; $i, l = 1, \ldots, n$.
The second statement can be proved in the same way.

Corollary D.2.13. Let $A = (a_{ij})$ be an $n \times n$ positive semidefinite matrix and let $p_k$, $k \in \mathbb{N}_0$, be nonnegative numbers such that the series

\[ \varphi(z) := \sum_{k=0}^{\infty} p_k z^k \]

is convergent for all $z \in \mathbb{C}$. Then the matrix $(\varphi(a_{ij}))$ is positive semidefinite.

Lemma D.2.14. Let $C = (c_{jk})_{j,k=1}^{n}$ be a positive semidefinite matrix and $A = (a_{jk}) = \operatorname{Re} C$, $B = (b_{jk}) = \operatorname{Im} C$. Then the $2n \times 2n$ matrix

\[ D = (d_{jk}) = \begin{bmatrix} A & -B \\ B & A \end{bmatrix} \]

is positive semidefinite as well.

Proof. Since $c_{jk} = \overline{c_{kj}}$ we conclude that $A^T = A$, $B^T = -B$ and hence $D = D^T$. In view of Theorem D.2.4 it suffices to show that

\[ \sum_{j,k=1}^{2n} d_{jk} r_j r_k \ge 0 \tag{1} \]

for all $r_1, \ldots, r_{2n} \in \mathbb{R}$. By assumption,

\[ 0 \le \sum_{j,k=1}^{n} c_{jk} (r_j - i r_{n+j})(r_k + i r_{n+k}) = \sum_{j,k=1}^{n} c_{jk} (r_j r_k + r_{n+j} r_{n+k}) - i \sum_{j,k=1}^{n} c_{jk} (r_{n+j} r_k - r_j r_{n+k}) =: S_1 - i S_2. \]

Using the relation $c_{jk} = \overline{c_{kj}}$ we see that $S_1$ is real, while $S_2$ is purely imaginary. Consequently,

\[ S_1 - i S_2 = \sum_{j,k=1}^{n} a_{jk} (r_j r_k + r_{n+j} r_{n+k}) + \sum_{j,k=1}^{n} b_{jk} (r_{n+j} r_k - r_j r_{n+k}) = \sum_{j,k=1}^{2n} d_{jk} r_j r_k \]

showing the inequality (1).
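
The identity behind the proof of Lemma D.2.14, with $D$ built from the blocks $A = \operatorname{Re} C$, $-B$, $B = \operatorname{Im} C$ and $A$, can be checked on a small example. The sketch below is illustrative only: the helper names and the sample matrix $C$ (a $2 \times 2$ Hermitian matrix with positive diagonal and positive determinant) are assumptions of this example.

```python
def realify(C):
    """Build the 2n x 2n real matrix [[A, -B], [B, A]] from C = A + iB."""
    n = len(C)
    A = [[C[j][k].real for k in range(n)] for j in range(n)]
    B = [[C[j][k].imag for k in range(n)] for j in range(n)]
    top = [A[j] + [-b for b in B[j]] for j in range(n)]
    bottom = [B[j] + A[j] for j in range(n)]
    return top + bottom

def real_form(D, r):
    """Sum over j, k of d_jk * r_j * r_k."""
    m = len(r)
    return sum(D[j][k] * r[j] * r[k] for j in range(m) for k in range(m))

def complex_form(C, r):
    """Sum over j, k of c_jk * (r_j - i r_{n+j}) * (r_k + i r_{n+k})."""
    n = len(C)
    return sum(
        C[j][k] * complex(r[j], -r[n + j]) * complex(r[k], r[n + k])
        for j in range(n)
        for k in range(n)
    )

C = [[2 + 0j, 1 + 1j], [1 - 1j, 3 + 0j]]   # Hermitian, positive definite
r = [1, 2, 3, -1]
print(real_form(realify(C), r), complex_form(C, r))
```

Both forms evaluate to the same nonnegative real number, as the chain of equalities in the proof predicts.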

Definition D.2.15. Let $V$ be a positive definite inner product space over $K$ where $K = \mathbb{C}$ or $K = \mathbb{R}$ and let $v_1, \ldots, v_n \in V$. The Gram matrix $G$ associated with $v_1, \ldots, v_n$ is defined by

\[ G = G(v_1, \ldots, v_n) = \bigl( (v_i, v_j) \bigr)_{i,j=1}^{n} \]

where $(\cdot,\cdot)$ denotes the inner product of $V$.

Choosing an orthonormal basis in the linear span of the vectors $v_j$ and building a matrix $A$ by writing the coordinates of $v_j$ as the $j$-th row of $A$ we have $G = AA^*$.

Theorem D.2.16. (With the notation of the previous definition.)
(i) The Gram matrix $G(v_1, \ldots, v_n)$ is positive semidefinite.
(ii) If $w_1, \ldots, w_m \in V$ are any vectors for which

\[ v_i = \sum_{j=1}^{m} t_{ij} w_j, \qquad t_{ij} \in K,\ i = 1, \ldots, n \]

then

\[ G(v_1, \ldots, v_n) = T\, G(w_1, \ldots, w_m)\, T^* \tag{1} \]

where $T = (t_{ij})$.
(iii) The dimension of the linear space $L$ spanned by the vectors $v_1, \ldots, v_n$ is equal to the rank of $G(v_1, \ldots, v_n)$.

Proof. The first statement follows from${}^{5}$

\[ \sum_{i,j=1}^{n} (v_i, v_j)\, c_i \overline{c_j} = \left( \sum_{i=1}^{n} v_i c_i, \sum_{i=1}^{n} v_i c_i \right) \ge 0, \qquad c_i \in K \]

while the second one is a consequence of

\[ (v_i, v_k) = \sum_{j,l=1}^{m} t_{ij}\, (w_j, w_l)\, \overline{t_{kl}}. \]

To prove the last statement choose arbitrary vectors $w_1, \ldots, w_n$ which span the same linear space $L$. Since the rank of the product of matrices is not greater than that of any factor, equation (1) shows that

\[ \mathrm{rank}\, G(v_1, \ldots, v_n) \le \mathrm{rank}\, G(w_1, \ldots, w_n). \]

Exchanging the role of the $v$'s and $w$'s we conclude that we have equality in the above display. The result now follows by choosing the $w$'s such that $w_1, \ldots, w_k$ form an orthonormal basis of $L$ and $w_j = 0$ for $j > k = \dim L$.

5 The positive semidefiniteness also follows from the representation $G = AA^*$ after Definition D.2.15.
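
The first part of the proof of Theorem D.2.16 rests on the identity between the quadratic form of the Gram matrix and the squared norm of a linear combination. The sketch below checks this for three vectors in $\mathbb{C}^3$, the third being the sum of the first two; all names and sample values are hypothetical choices for this illustration.

```python
def inner(v, w):
    """Standard inner product on C^d: linear in v, conjugate-linear in w."""
    return sum(a * complex(b).conjugate() for a, b in zip(v, w))

def gram(vectors):
    """Gram matrix with entries (v_i, v_j)."""
    return [[inner(vi, vj) for vj in vectors] for vi in vectors]

def gram_form(G, c):
    """Sum over i, j of (v_i, v_j) * c_i * conj(c_j)."""
    n = len(c)
    return sum(
        G[i][j] * c[i] * complex(c[j]).conjugate()
        for i in range(n)
        for j in range(n)
    )

def combination_norm_sq(vectors, c):
    """Squared norm of sum_i c_i v_i."""
    d = len(vectors[0])
    combo = [sum(ci * v[p] for ci, v in zip(c, vectors)) for p in range(d)]
    return inner(combo, combo).real

vectors = [[1, 1j, 0], [2, 0, 1 - 1j], [3, 1j, 1 - 1j]]  # third = first + second
G = gram(vectors)
c = [1 + 1j, -2, 0.5j]
print(gram_form(G, c), combination_norm_sq(vectors, c))
```

Because the three vectors are linearly dependent, the coefficients $(1, 1, -1)$ give the value 0, which reflects part (iii): the Gram matrix has rank smaller than 3 here.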

Lemma D.2.17. Let $C_n \in \mathbb{C}^{d\times d}$, $n \in \mathbb{N}$, be positive semidefinite matrices such that the limit

\[ q(t) = \lim_{n\to\infty} (C_n t, t) \]

exists for all $t \in \mathbb{R}^d$. Then there exists a positive semidefinite matrix $C \in \mathbb{C}^{d\times d}$ such that $q(t) = (Ct, t)$, $t \in \mathbb{R}^d$. Moreover, the sequence $\{C_n\}_{1}^{\infty}$ converges to $C$ entrywise.

Proof. It is clear that the above limit also exists for all $t \in \mathbb{C}^d$; we denote it in this case also by $q(t)$. Since $C_n$ is positive semidefinite, $\|t\|_n = \sqrt{(C_n t, t)}$ is a seminorm on $\mathbb{C}^d$ satisfying the parallelogram law (cf. (D.1.v)). The same is true for the limit $\sqrt{q}$. By the theorem of Fréchet, von Neumann and Jordan, $q$ is generated by a positive semidefinite inner product $\langle\cdot,\cdot\rangle$ on $\mathbb{C}^d$:

\[ q(t) = \langle t, t\rangle, \qquad t \in \mathbb{C}^d. \]

The first statement of the lemma follows from the fact that $\langle t, t\rangle = (Ct, t)$ with some positive semidefinite matrix $C \in \mathbb{C}^{d\times d}$. Let $e_1, \ldots, e_d$ be the standard basis of $\mathbb{C}^d$. Using the polarization identity (D.1.2) with $v = e_j$ and $w = e_k$, we obtain the second statement.

Lemma D.2.18. If $A$ is a positive semidefinite matrix, then $A_\varepsilon := A + \varepsilon I$ is positive definite for all $\varepsilon > 0$.

Proof. The matrix $A_\varepsilon$ is obviously positive semidefinite. If it were not positive definite, then it would have an eigenvector with eigenvalue 0. This vector would be an eigenvector of $A$ with eigenvalue $-\varepsilon$, contradicting the positive semidefiniteness of $A$.

Theorem D.2.19. Let $n \in \mathbb{N}$ and let $S$ be a subset of the square

\[ S_n := \{ (i, j) \in \mathbb{Z}^2 : 1 \le i, j \le n \} \]

such that
(i) $(i, i) \in S$ whenever $1 \le i \le n$;
(ii) if $(l, k) \in S$ then $\{ (i, j) : \min(l, k) \le i, j \le \max(l, k) \} \subset S$.
Suppose that for each $(i, j) \in S$ a complex number $c_{ij}$ is given such that
(iii) for all $(l, k) \in S$ with $l \le k$ the matrix $(c_{ij})_{i,j=l}^{k}$ is positive semidefinite.
Then there exists a positive semidefinite matrix $A = (a_{ij})_{i,j=1}^{n}$ such that $a_{ij} = c_{ij}$ for $(i, j) \in S$.

Proof. The case $n = 1$ being trivial suppose that $n > 1$.

We consider first the special case where $S = S_n \setminus \{(1, n), (n, 1)\}$. For $z \in \mathbb{C}$ we define the matrix $A(z)$ by

\[ A(z) := \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1,n-1} & z \\ c_{21} & c_{22} & \cdots & c_{2,n-1} & c_{2n} \\ \vdots & & & & \vdots \\ c_{n-1,1} & c_{n-1,2} & \cdots & c_{n-1,n-1} & c_{n-1,n} \\ \overline{z} & c_{n2} & \cdots & c_{n,n-1} & c_{nn} \end{bmatrix}. \]

Set

\[ A_1 := (c_{ij})_{i,j=1}^{n-1}, \qquad A_3 := (c_{ij})_{i,j=2}^{n}, \qquad A_5 := (c_{ij})_{i,j=2}^{n-1} \]

and let $A_2(z)$ ($A_4(z)$) denote the matrix obtained by canceling the first (last) column and the last (first) row of $A(z)$. Assume that all matrices in (iii) are positive definite. By Corollary D.2.8 it suffices to prove the existence of a complex number $z$ for which the determinant $\det(A(z))$ is positive. We have

\[ \det(A(z)) = \frac{\det(A_1)\det(A_3) - \det(A_2(z))\det(A_4(z))}{\det(A_5)}. \]

Since $A(z)$ is Hermitian, $\det(A_4(z)) = \overline{\det(A_2(z))}$ and hence the inequality $\det(A(z)) > 0$ holds if and only if

\[ |\det(A_2(z))|^2 < \det(A_1)\det(A_3). \tag{1} \]

By assumption, the right-hand side of this inequality is positive. Expanding the determinant $\det(A_2(z))$ along the first row we see that

\[ \det(A_2(z)) = (-1)^n \det(A_5)\, z + c \]

where $\det(A_5) > 0$ and $c$ is some complex number not depending on $z$. We conclude that there exist infinitely many $z \in \mathbb{C}$ satisfying (1).
If the matrices in (iii) are not all positive definite, then we replace $A(z)$ by the matrices $A_\varepsilon(z) := A(z) + \varepsilon I$ where $I$ denotes the $n \times n$ identity matrix and $\varepsilon > 0$. By Lemma D.2.18 the corresponding matrices are positive definite. For $m \in \mathbb{N}$ we choose $z_m \in \mathbb{C}$ such that $A_{1/m}(z_m)$ is positive semidefinite. Then

\[ \det \begin{bmatrix} c_{11} + \frac{1}{m} & z_m \\ \overline{z_m} & c_{nn} + \frac{1}{m} \end{bmatrix} \ge 0 \]

and hence the sequence $\{z_m\}$ is bounded. Consequently, there exists a subsequence converging to some $z \in \mathbb{C}$. The matrix $A(z)$ is then positive semidefinite.
We now turn to the general case. We may suppose that $S \ne S_n$. Let $k$ be the greatest integer with the property that $(1, 1), \ldots, (1, k-1) \in S$ and let $l$ be the greatest integer with $(1, k), \ldots, (l, k) \notin S$. Using (i) and (ii) we see that $1 \le l < k \le n$ and

\[ S \cap \{ (i, j) : l \le i, j \le k \} = \{ (i, j) : l \le i, j \le k \} \setminus \{ (l, k), (k, l) \}. \]

The first part of the proof shows the existence of a complex number $z$ such that the matrix $(c_{ij})_{i,j=l}^{k}$ is positive semidefinite, where $c_{kl} := \overline{z}$, $c_{lk} := z$. Replacing $S$ by $S \cup \{(l, k), (k, l)\}$, this new index set and the corresponding complex numbers $c_{ij}$ also satisfy the conditions of the theorem. Thus, repeating the arguments above we obtain the desired matrix $A$.

D.3 Hilbert spaces and linear operators


Hilbert spaces D.3.1. Let $V$ be a normed linear space. We say that a sequence $\{v_n\}_{1}^{\infty}$ of elements of $V$ converges strongly to or simply converges to $v \in V$ if

\[ \lim_{n\to\infty} \|v - v_n\| = 0. \]

In case of strong convergence we write $v = \lim_{n\to\infty} v_n$ or $v_n \to v$. When speaking about convergence, continuity, closed or open sets, etc., we always refer to the corresponding notion for the topology generated by the norm.${}^{6}$
A Hilbert space $H$ over $K$ where $K = \mathbb{R}$ or $K = \mathbb{C}$ is a positive definite inner product space over $K$ which is complete with respect to the norm

\[ \|h\| = \sqrt{(h, h)}, \qquad h \in H \]

where $(\cdot,\cdot)$ denotes the inner product of $H$. Examples of Hilbert spaces are $\mathbb{R}^d$, $\mathbb{C}^d$ and $L^2(\Omega, \mathcal{A}, P)$. Throughout the rest of this section the symbol $H$ denotes a Hilbert space.
We call a nonempty set $L \subset H$ a linear manifold if for any two vectors $g, h \in L$ and any $a, b \in K$ we have $ag + bh \in L$. A linear manifold is called a subspace if it is closed. The orthogonal complement $M^\perp$ of a nonempty subset $M \subset H$ is a subspace of $H$. If $L$ is a subspace of $H$, then

\[ H = L \oplus L^\perp. \]

Thus, each $h \in H$ can be represented in the form

\[ h = h_L + h_{L^\perp}, \qquad h_L \in L,\ h_{L^\perp} \in L^\perp. \]

The element $h_L$ is called the orthogonal projection of $h$ onto $L$. The orthogonal projection $h_L$ is the (unique) element of $L$ having minimal distance to $h$:

\[ \|h - h_L\| = \inf_{v \in L} \|h - v\|. \]

6 See also D.5.1 for the definition of weak convergence.



Lemma D.3.2 (Cauchy criterion). For a sequence $\{h_n\}_{1}^{\infty}$ of elements of $H$ the following conditions are equivalent:
(i) $\{h_n\}_{1}^{\infty}$ is convergent;
(ii) $\lim_{n,m\to\infty} \|h_n - h_m\| = 0$;
(iii) the limit $\lim_{n,m\to\infty} (h_n, h_m)$ exists.
If the limit in (iii) exists, then it is equal to $\| \lim_{n\to\infty} h_n \|^2$.
Linear operators D.3.3. Let $V$ and $W$ be normed linear spaces. A linear operator $A$ from $V$ into $W$ is called bounded if there exists a $C \ge 0$ such that $\|Av\| \le C\|v\|$ for all $v \in V$. We denote by $B(V, W)$ the set of all bounded linear operators from $V$ to $W$. A linear operator is bounded if and only if it is continuous. The norm $\|A\|$ of a bounded operator is defined by

\[ \|A\| := \inf \{ C \ge 0 : \|Av\| \le C\|v\| \text{ for all } v \in V \}. \tag{1} \]

If $\|A\| \le 1$, then $A$ is called a (linear) contraction. We have

\[ \|A\| = \sup \{ \|Ah\| : h \in H,\ \|h\| = 1 \} = \sup \{ \|Ah\| : h \in H,\ \|h\| \le 1 \} \]

and

\[ \|Av\| \le \|A\| \cdot \|v\| \quad \text{and} \quad \|AB\| \le \|A\| \cdot \|B\| \]

where $B$ is a bounded linear operator with values in $V$.
If $W = H$ is a Hilbert space, then

\[ \|A\| = \sup \{ |(Ag, h)| : g \in V,\ h \in H,\ \|g\| \le 1,\ \|h\| \le 1 \}. \]

A linear operator $A$ in $H$ is called isometric if $\|Ah\| = \|h\|$ for all $h \in H$. An isometric operator $A$ is obviously bounded and its norm is equal to 1. If $A$ is an isometric operator in $H$ and $A(H) = H$, then $A$ is called unitary.
For an arbitrary bounded linear operator $A : H \to H$ there exists a uniquely determined bounded linear operator $A^* : H \to H$, the adjoint of $A$, such that

\[ (Ah, g) = (h, A^* g), \qquad h, g \in H. \]

Key properties are $\|A\| = \|A^*\|$ and $(AB)^* = B^* A^*$. The operator $A$ is called self-adjoint if $A^* = A$. It is called normal if $AA^* = A^*A$. If $U$ is unitary, then $U$ is invertible and $U^* = U^{-1}$. Unitary and self-adjoint operators are normal. Two eigenvectors of a normal operator are orthogonal whenever they correspond to different eigenvalues. All eigenvalues of a self-adjoint operator are real, while the eigenvalues of a unitary operator have modulus 1.
For a self-adjoint operator $A$, the inner product $(Ah, h)$ is real for all $h \in H$. If, in addition, $(Ah, h) \ge 0$ for all $h \in H$, then $A$ is said to be nonnegative. The eigenvalues of nonnegative operators are nonnegative.
A linear manifold $L \subset H$ is said to be invariant under an operator $A$, or $A$-invariant, if $A(L) \subset L$. If $L$ is $A$-invariant, then $L^\perp$ is invariant under $A^*$.

Theorem D.3.4 (Riesz–Fréchet). For each bounded linear functional $l$ on $H$ there exists a unique $v \in H$ such that

\[ l(h) = (h, v), \qquad h \in H \]

and $\|l\| = \|v\|$.

Theorem D.3.5. Let $\langle\cdot,\cdot\rangle_L$ be a sesquilinear form on a dense linear manifold $L \subset H$ such that

\[ |\langle g, h\rangle_L| \le C \|g\| \cdot \|h\|, \qquad g, h \in L \]

with some $C \in [0, \infty)$. This form can be uniquely extended to a sesquilinear form $\langle\cdot,\cdot\rangle$ on $H$ such that

\[ |\langle g, h\rangle| \le C \|g\| \cdot \|h\|, \qquad g, h \in H. \]

Moreover, there exists a linear operator $A$ on $H$ such that $\|A\| \le C$ and

\[ \langle g, h\rangle = (Ag, h), \qquad g, h \in H. \]

Theorem D.3.6. Let $H_1$ and $H_2$ be two Hilbert spaces and let $I : L \to H_2$ be an isometric linear operator defined on a linear manifold $L \subset H_1$. Then $I$ can be uniquely extended to an isometric linear operator from the closure of $L$ onto the closure of $I(L)$.

Theorem D.3.7. For every bounded nonnegative operator $P$ on $H$ there exists a unique bounded nonnegative operator $\sqrt{P}$ such that $P = \sqrt{P}\sqrt{P}$. If a bounded linear operator commutes with $P$, then it commutes with $\sqrt{P}$, as well.

Theorem D.3.8 (Banach–Steinhaus, principle of uniform boundedness). Let $B$ be a
Banach space and $V$ be a normed linear space. If $\mathcal{A} \subseteq B(B, V)$ is such that for each
$v \in B$
\[ \sup \{\|Av\| : A \in \mathcal{A}\} < \infty, \]
then
\[ \sup \{\|A\| : A \in \mathcal{A}\} < \infty. \]

Definition D.3.9. A bounded linear operator $K$ in $H$ is called compact if $\{K(x_n)\}_1^\infty$
contains a convergent subsequence for all bounded sequences $\{x_n\}_1^\infty$ in $H$.

The next lemma states basic properties of compact operators which follow imme-
diately from the definition of compactness.
330 Appendix D Functional analysis

Lemma D.3.10.
(i) If $K_1$ and $K_2$ are compact operators, then so is $a_1 K_1 + a_2 K_2$ for all $a_1, a_2 \in \mathbb{K}$.
(ii) If $K$ is a compact operator and $A$ is a bounded linear operator, then $AK$ and $KA$ are
compact.

Theorem D.3.11. If $K$ is a compact self-adjoint operator in $H$, then $K$ has an
eigenvalue $\lambda \in \mathbb{R}$ such that $|\lambda| = \|K\|$.

Proof. The case $K = 0$ being trivial, we may assume that $\|K\| \ne 0$. We choose a
sequence $\{h_n\}$ in $H$ such that $\|h_n\| = 1$ and $\|Kh_n\|$ converges to $\|K\|$. Since $K$ is
self-adjoint we have $(K^2 h_n, h_n) = (Kh_n, Kh_n) = \|Kh_n\|^2$. Using this we obtain
\begin{align*}
0 &\le \bigl\| K^2 h_n - \|Kh_n\|^2 h_n \bigr\|^2 \\
  &= \|K^2 h_n\|^2 - 2\|Kh_n\|^2 (K^2 h_n, h_n) + \|Kh_n\|^4 \\
  &= \|K^2 h_n\|^2 - \|Kh_n\|^4 \\
  &\le \|K\|^2 \|Kh_n\|^2 - \|Kh_n\|^4 .
\end{align*}
Since the sequence $\{\|Kh_n\|\}$ converges to $\|K\|$ we see that
\[ \lim_{n \to \infty} \bigl\| K^2 h_n - \|Kh_n\|^2 h_n \bigr\| = 0. \tag{1} \]
By Lemma D.3.10, the operator $K^2$ is compact. Thus, we can choose a subsequence
$\{K^2 h_{n_k}\}$ converging to an element, which we write in the form $\|K\|^2 g$. Relation (1)
shows that $\{h_{n_k}\}$ converges to $g$ and
\[ K^2 g = \|K\|^2 g. \]
To complete the proof we observe that $\|g\| = 1$ and
\[ (K - \|K\|)(K + \|K\|) g = 0. \]
If $h := (K + \|K\|) g = 0$, then $-\|K\|$ is an eigenvalue of $K$. Otherwise we have
$(K - \|K\|) h = 0$, i.e., $\|K\|$ is an eigenvalue of $K$.

Theorem D.3.12. Let $K \ne 0$ be a self-adjoint compact operator in $H$. For all $h \in H$
we have
\[ h = \sum_{n \in S} (h, \varphi_n) \varphi_n \tag{1} \]
and
\[ Kh = \sum_{n \in S} (h, \varphi_n) \lambda_n \varphi_n \tag{2} \]
where
(i) $S = \{1, \ldots, N\}$ with some $N \in \mathbb{N}$ or $S = \mathbb{N}$;
(ii) $\{\lambda_n : n \in S\} \subset \mathbb{R}$ is the set of all nonzero eigenvalues of $K$;
(iii) $\{\varphi_n : n \in S\}$ is an orthonormal system in $H$ and $\varphi_n$ is an eigenvector of $K$
with eigenvalue $\lambda_n$;
(iv) if $S = \mathbb{N}$, then $\lim_{n \to \infty} \lambda_n = 0$.

Proof. By Theorem D.3.11 there exists an eigenvalue $\lambda_1 \in \mathbb{R}$ of $K$ with $|\lambda_1| = \|K\|$.
Let $\varphi_1$ be an eigenvector with eigenvalue $\lambda_1$ such that $(\varphi_1, \varphi_1) = 1$ and denote by $H_1$
the subspace generated by $\varphi_1$. We have
\[ H = H_1 \oplus H_1^\perp \]
where $H_1$ is $K$-invariant. Since $K$ is self-adjoint, $H_1^\perp$ is $K$-invariant as well. The
restriction of $K$ to $H_1^\perp$ is a compact self-adjoint operator with norm at most $\|K\|$. If
this restriction is not the zero operator, then we apply Theorem D.3.11 to it. Proceeding
in this way we obtain a finite or infinite sequence $\lambda_1, \lambda_2, \ldots$ of eigenvalues such
that $|\lambda_n| \ge |\lambda_{n+1}| > 0$ and a sequence of corresponding subspaces $H_n$ spanned by
eigenvectors of $K$ with eigenvalue $\lambda_n$. If this sequence is finite, then
\[ H = H_1 \oplus \cdots \oplus H_N \oplus \mathcal{N} \]
where $N \in \mathbb{N}$ and $K(\mathcal{N}) = \{0\}$, from which (1) and (2) follow.
Assume that this sequence is infinite and suppose that the sequence $\{|\lambda_n|\}$ has
a positive lower bound. Then the sequence $\{\varphi_n / \lambda_n\}$ is bounded but the sequence
$\{K\varphi_n / \lambda_n\} = \{\varphi_n\}$ does not contain a convergent subsequence: since $\|\varphi_n - \varphi_m\|^2 = 2$
for $m \ne n$, no subsequence can be a Cauchy sequence. This contradicts the compactness
of $K$. Thus, $\lim_n \lambda_n = 0$. We have
\[ H = \bigoplus_{n=1}^{\infty} H_n \oplus \mathcal{N} \]
where $\mathcal{N}$ is a $K$-invariant subspace. By our procedure, $\|K|_{\mathcal{N}}\| \le |\lambda_n|$ for all $n$, where
$K|_{\mathcal{N}}$ denotes the restriction of $K$ to $\mathcal{N}$. Thus $K(\mathcal{N}) = \{0\}$. The decompositions (1)
and (2) follow as in the finite case.
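
The two expansions can be illustrated on a small example. The sketch below assumes $H = \mathbb{R}^2$ and the symmetric matrix $K$ with rows $(2, 1)$ and $(1, 2)$, whose eigenvalues are $3$ and $1$ with orthonormal eigenvectors $(1, 1)/\sqrt{2}$ and $(1, -1)/\sqrt{2}$. This $K$ has no zero eigenvalue, so the eigenvectors already span the space. The names `eigenpairs`, `expand` and `spectral_apply` are illustrative.

```python
import math

r = 1 / math.sqrt(2)
K = [[2.0, 1.0], [1.0, 2.0]]
# (eigenvalue, orthonormal eigenvector) pairs of K, ordered by decreasing modulus
eigenpairs = [(3.0, (r, r)), (1.0, (r, -r))]

def inner(u, v):
    """Real inner product on R^2."""
    return sum(a * b for a, b in zip(u, v))

def apply(M, h):
    """Matrix-vector product."""
    return [inner(row, h) for row in M]

def expand(h):
    """Right-hand side of (1): sum of (h, phi_n) phi_n."""
    out = [0.0, 0.0]
    for _, phi in eigenpairs:
        c = inner(h, phi)
        out = [o + c * p for o, p in zip(out, phi)]
    return out

def spectral_apply(h):
    """Right-hand side of (2): sum of (h, phi_n) lambda_n phi_n."""
    out = [0.0, 0.0]
    for lam, phi in eigenpairs:
        c = lam * inner(h, phi)
        out = [o + c * p for o, p in zip(out, phi)]
    return out

h = [3.0, -1.0]
Kh_direct = apply(K, h)          # (5, 1)
Kh_spectral = spectral_apply(h)  # the same vector, up to rounding
```

The largest eigenvalue modulus, here $3$, equals $\|K\|$, in agreement with Theorem D.3.11.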

Remark D.3.13. If the Hilbert space $H$ in the previous theorem is separable and
infinite-dimensional, then there exists an orthonormal basis $\{\varphi_n\}_1^\infty$ of $H$ consisting
of eigenvectors $\varphi_n$ of $K$ with (possibly zero) eigenvalues $\lambda_n$. This follows from the
previous proof by selecting an orthonormal basis of the subspace $\mathcal{N}$.

D.4 Convex sets and the theorem of Kreı̆n and Milman


A short introduction to the theory of locally convex spaces can be found in Chapter 1
of [3].
D.4.1. Throughout this section the symbol $L$ will denote a linear space over a field $\mathbb{K}$
which is either $\mathbb{R}$ or $\mathbb{C}$. We assume that $L$ is equipped with a Hausdorff topology $\mathcal{O}$
such that the following hold:
(i) The mappings $(x, y) \mapsto x + y$ of $L \times L$ into $L$ and $(r, x) \mapsto rx$ of $\mathbb{K} \times L$ into
$L$ are continuous.⁷
(ii) Each neighborhood of $0 \in L$ contains a convex neighborhood of $0$.
The linear space $L$ together with such a topology $\mathcal{O}$ is called locally convex. Let $A$ and
$B$ be subsets of $L$ such that $A \subseteq B$. Then $A$ is called an extreme subset of $B$ if for all
$x, y \in B$ and $p \in (0, 1)$ the relation
\[ px + (1 - p)y \in A \]
implies that $x, y \in A$. In the special case $A = \{a\}$ we say that $a$ is an extreme point
of $B$. The set of all extreme points of $B$ will be denoted by $\operatorname{ex}(B)$.
The next separation result is often useful when dealing with convex sets. A proof
can be found, e.g., in [3] (cf. Theorem 2.3 in Chapter 1).

Theorem D.4.2. Let $E$ be a locally convex space over $\mathbb{K}$. Further, let $F$ and $C$ be
disjoint nonempty convex subsets of $E$ such that $F$ is closed and $C$ is compact. Then
there exists a continuous linear functional $l : E \to \mathbb{K}$ such that
\[ \sup_{x \in C} \operatorname{Re} l(x) < \inf_{x \in F} \operatorname{Re} l(x). \]

Since a locally convex space over C can also be considered as a locally convex
space over R, we obtain:

Corollary D.4.3. Under the assumptions of the preceding theorem there exists a
continuous $\mathbb{R}$-linear functional $l : E \to \mathbb{R}$ such that
\[ \sup_{x \in C} l(x) < \inf_{x \in F} l(x). \]

Theorem D.4.4 (Hahn–Banach). Let $E$ be a linear space over $\mathbb{K}$, let $p$ be a seminorm
on $E$ and let $L \subseteq E$ be a linear manifold. For every linear functional $l$ on $L$ such
that $|l(x)| \le p(x)$ for all $x \in L$ there exists a linear functional $\tilde{l}$ on $E$ such that
$\tilde{l}(x) = l(x)$ for all $x \in L$ and $|\tilde{l}(x)| \le p(x)$ for all $x \in E$.
⁷ Here $\mathbb{K}$ is equipped with its usual topology and the product sets are equipped with the product topology.

In the proof of the next theorem we will repeatedly use the following simple fact:
If A is an extreme subset of B and B is an extreme subset of C then A is an extreme
subset of C .

Theorem D.4.5 (Kreı̆n–Milman). Every compact convex subset K of a locally convex


space L is the closed convex hull of its extreme points.

Proof. First we prove that an arbitrary non-void compact set $C \subseteq L$ has at least
one extreme point. Applying Zorn's lemma we see that $C$ contains a closed extreme
subset $M$ which is minimal with respect to inclusion. We claim that $M$ has only one
element. This element is then an extreme point of $C$. Assume, on the contrary, that $M$
has more than one element. Then, by Corollary D.4.3, there exists a continuous real
linear functional $l$ on $L$ such that $l$ is not constant on $M$. It is easy to check that the set
$N = \{x \in M : l(x) = \sup l(M)\}$ is a proper, closed and extreme subset of $M$. This
implies that $N$ is a closed extreme subset of $C$, contradicting the minimality of $M$.
Denote by $K_0$ the closed convex hull of the extreme points of $K$. Assume that there
exists a $y \in K \setminus K_0$. Corollary D.4.3 with $C = K_0$ and $F = \{y\}$ shows the existence
of a continuous real linear functional $l$ on $L$ such that
\[ \sup l(K) \ge l(y) > \sup l(K_0). \tag{1} \]
By the first part of the proof, the compact convex set
\[ N = \{x \in K : l(x) = \sup l(K)\}, \]
which is obviously not empty, contains at least one extreme point $x_0$. As $N$ is an
extreme subset of $K$, $x_0$ is an extreme point of $K$ as well and hence $x_0 \in K_0$. Since
this contradicts (1), we must have $K = K_0$.
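
In the plane the theorem is easy to visualize: the convex hull of a finite set of points is compact and convex, and its extreme points are exactly the vertices of the hull. The following sketch, assuming $L = \mathbb{R}^2$ and a finite point set, computes these vertices with the standard monotone chain method. The function name `extreme_points` is illustrative and the algorithm is not part of the text.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def extreme_points(points):
    """Extreme points of the convex hull of a finite subset of R^2.

    Monotone chain: points lying in the interior of the hull or in the
    interior of an edge are discarded, so only the vertices remain.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

# Corners of a square, one interior point and one point on an edge.
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
vertices = extreme_points(points)
```

Only the four corners survive; the interior point $(1, 1)$ and the edge midpoint $(1, 0)$ are proper convex combinations of other points of the hull and hence are not extreme.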

Corollary D.4.6. Let $K$ be a compact convex subset of $L$. For every $x \in K$ there
exists a Radon probability measure $\mu_x$ on $\overline{\operatorname{ex}(K)}$ such that
\[ l(x) = \int l(y) \, d\mu_x(y) \]
holds for every continuous linear functional $l$ on $L$.

Proof. By the previous theorem there exists a net $\{x_\alpha\}$ of points of $K$ of the form
\[ x_\alpha = \sum_{j=1}^{n_\alpha} p_\alpha^j \cdot x_\alpha^j, \quad n_\alpha < \infty, \]
where $x_\alpha^j \in \operatorname{ex}(K)$, $p_\alpha^j \ge 0$, $\sum_j p_\alpha^j = 1$, such that $\lim_\alpha x_\alpha = x$. We define the
probability measures $\mu_\alpha$ on $\overline{\operatorname{ex}(K)}$ by
\[ \mu_\alpha = \sum_{j=1}^{n_\alpha} p_\alpha^j \cdot \delta_{x_\alpha^j}. \]
Since $\overline{\operatorname{ex}(K)}$ is compact, Theorem E.1.13 shows the existence of a subnet $\{\mu_\beta\}$
converging weakly to a probability measure $\mu_x$ on $\overline{\operatorname{ex}(K)}$. For an arbitrary continuous
linear functional $l$ we then have
\[ \int l(y) \, d\mu_x(y) = \lim_\beta \int l(y) \, d\mu_\beta(y) = \lim_\beta \sum_j p_\beta^j \cdot l(x_\beta^j) = l(x), \]
completing the proof.

D.5 Weak topologies


D.5.1. Let $V$ be a normed linear space and denote by $V^*$ the linear space of all bounded
linear functionals on $V$. Equipped with the norm (D.3.3.1), $V^*$ is a Banach space.
The weak topology on $V$ is the coarsest topology (the topology with the fewest open
sets) such that each element of $V^*$ is a continuous function. The weak-$*$ topology on
$V^*$ is the coarsest topology such that for all $v \in V$ the maps $l \mapsto l(v)$, $l \in V^*$,
are continuous. Thus, a net $\{v_\alpha\}$ in $V$ converges weakly to $v \in V$ if and only if
\[ \lim_\alpha l(v_\alpha) = l(v), \quad l \in V^*, \]
while a net $\{l_\alpha\}$ in $V^*$ converges in the weak-$*$ topology to $l \in V^*$ if and only if
\[ \lim_\alpha l_\alpha(v) = l(v), \quad v \in V. \]
In case of weak and weak-$*$ convergence we write $v_\alpha \xrightarrow{w} v$ and $l_\alpha \xrightarrow{w^*} l$, respectively.
The limits of weakly and weak-$*$ convergent sequences are unique. Moreover, strong
convergence implies weak and weak-$*$ convergence.
If $V = H$ is a Hilbert space then, by Theorem D.3.4, $V^*$ can be identified with $H$.
Moreover, the weak and weak-$*$ topologies on $H$ coincide and a net $\{h_\alpha\}$ converges
weakly to $h$ if and only if
\[ \lim_\alpha (h_\alpha, g) = (h, g), \quad g \in H. \]
In $H$ any orthonormal sequence $\{e_n\}_{n=1}^\infty$ converges weakly to $0$. This follows from
Bessel's inequality
\[ \sum_{n=1}^{\infty} |(e_n, h)|^2 \le \|h\|^2, \quad h \in H. \]
Since $\|e_n\| = 1$, the sequence $\{e_n\}$ does not converge strongly to $0$.
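
A concrete instance may help here. The sketch below assumes $H = \ell^2$ with the standard unit vectors $e_n$ and the particular element $h = (1, 1/2, 1/3, \ldots)$, for which $(e_n, h) = 1/n$ and $\|h\|^2 = \pi^2/6$. The choice of $h$ and the names in the code are illustrative only.

```python
import math

def coefficient(n):
    """(e_n, h) for h = (1, 1/2, 1/3, ...) in l^2 and e_n the n-th unit vector."""
    return 1.0 / n

# ||h||^2 is the sum of 1/k^2 over k >= 1, which equals pi^2 / 6.
norm_h_squared = math.pi ** 2 / 6

# Partial sums of Bessel's series: they increase but stay below ||h||^2.
partial_sums = []
total = 0.0
for n in range(1, 10001):
    total += coefficient(n) ** 2
    partial_sums.append(total)

# The coefficients (e_n, h) tend to 0 although every e_n has norm 1.
tail_coefficient = coefficient(10000)
```

The coefficients shrink to zero while $\|e_n - 0\| = 1$ for every $n$, which is exactly the gap between weak and strong convergence described above.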
Let $H_1$ and $H_2$ be Hilbert spaces. The weak operator topology on the set $B(H_1, H_2)$
is the coarsest topology rendering all of the maps⁸
\[ A \mapsto (Ag, h), \quad g \in H_1, \; h \in H_2, \]
continuous, where $(\cdot, \cdot)$ denotes the inner product of $H_2$. A net $\{A_\alpha\}$ of operators
$A_\alpha \in B(H_1, H_2)$ converges in the weak operator topology to an operator $A \in B(H_1, H_2)$
if and only if
\[ \lim_\alpha (A_\alpha g, h) = (Ag, h), \quad g \in H_1, \; h \in H_2. \]
The relation above is equivalent to
\[ A_\alpha g \xrightarrow{w} Ag, \quad g \in H_1. \]
In case of convergence in the weak operator topology we write $A_\alpha \xrightarrow{wo} A$.
⁸ Recall that $B(H_1, H_2)$ denotes the set of all bounded linear operators from $H_1$ to $H_2$ (cf. D.3.3).

Lemma D.5.2. Multiplication of bounded operators in a Hilbert space $H$ is separately
continuous in the weak operator topology, i.e., if $A_\alpha \xrightarrow{wo} A$ then $A_\alpha B \xrightarrow{wo} AB$
and $BA_\alpha \xrightarrow{wo} BA$.

Proof. Assume that $A_\alpha \xrightarrow{wo} A$ so that
\[ \lim_\alpha (A_\alpha g, h) = (Ag, h), \quad g, h \in H. \]
Replacing $g$ by $Bg$ we see that $A_\alpha B \xrightarrow{wo} AB$. The second statement follows from
\[ \lim_\alpha (BA_\alpha g, h) = \lim_\alpha (A_\alpha g, B^* h) = (Ag, B^* h) = (BAg, h). \]

Remark D.5.3. Multiplication of operators is not jointly continuous in the weak
operator topology. To see an example, let $H$ be a Hilbert space with orthonormal basis
$\{e_n\}_{n=-\infty}^{\infty}$ and write $Ue_n := e_{n+1}$. It is easy to check that $U$ can be extended to a
unitary operator in $H$. Using that any orthonormal sequence in $H$ converges weakly
to $0$ (cf. D.5.1) we see that $U^n \xrightarrow{wo} 0$ and $U^{-n} \xrightarrow{wo} 0$ as $n \to \infty$. On the other hand,
$U^n U^{-n}$ is the identity operator for all $n$.
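
The shift example can be traced on finitely supported vectors. In the sketch below a vector $\sum_k c_k e_k$ with finitely many nonzero coefficients is stored as a dictionary `{k: c_k}` indexed by integers $k \in \mathbb{Z}$; this representation and the names `shift` and `inner` are assumptions made for illustration.

```python
def shift(vec, n):
    """Apply U^n to a finitely supported vector {index: coefficient}.

    U maps e_k to e_{k+1}; a negative n applies the inverse shift.
    """
    return {k + n: c for k, c in vec.items()}

def inner(g, h):
    """Inner product (g, h) of two finitely supported vectors."""
    return sum(c * h[k].conjugate() for k, c in g.items() if k in h)

g = {0: 1, 1: 2}
h = {-1: 1, 3: 1j}

# (U^n g, h) and (U^{-n} g, h) vanish once the supports no longer overlap.
forward = [inner(shift(g, n), h) for n in range(4, 10)]
backward = [inner(shift(g, -n), h) for n in range(4, 10)]

# U^n U^{-n} is the identity, so the product does not tend to 0.
round_trip = shift(shift(g, -7), 7)
```

For finitely supported $g$ and $h$ the inner products are eventually exactly zero, while `round_trip` returns `g` unchanged.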

Theorem D.5.4. Convergence of a net in the weak, weak-$*$ or weak operator topology
implies that the net is bounded.

Proof. Let $V$ be a normed linear space and let $\{l_\alpha\}$ be a net in $V^*$ with $l_\alpha \xrightarrow{w^*} l$. Then for
each $v \in V$ the net $\{l_\alpha(v)\}$ converges to $l(v)$ and hence it is bounded. Theorem D.3.8
shows that the net $\{l_\alpha\}$ is bounded.
Assume that $\{v_\alpha\}$ is a weakly convergent net in $V$ and define the linear functional
$L_\alpha$ on $V^*$ by $L_\alpha(l) := l(v_\alpha)$. Then $\|L_\alpha\| = \|v_\alpha\|$ and the net $\{L_\alpha\}$ converges in the
weak-$*$ topology. By the first part of the proof $\{L_\alpha\}$, and hence also $\{v_\alpha\}$, is bounded.
Let $\{A_\alpha\}$ be a net in $B(H_1, H_2)$ where $H_1$ and $H_2$ are Hilbert spaces. If $A_\alpha \xrightarrow{wo} A$,
then $A_\alpha g \xrightarrow{w} Ag$ for all $g \in H_1$. It follows that $\{A_\alpha g\}$ is bounded for all $g \in H_1$.
Applying again Theorem D.3.8, we conclude that $\{A_\alpha\}$ is bounded.

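Theorem D.5.5. Let $V$ be a normed linear space, $H_1, H_2$ be Hilbert spaces, and
$\{v_\alpha\}$, $\{l_\alpha\}$ and $\{A_\alpha\}$ be nets in $V$, $V^*$ and $B(H_1, H_2)$, respectively.⁹
(i) If $v_\alpha \xrightarrow{w} v$, then $\|v\| \le \liminf_\alpha \|v_\alpha\|$.
(ii) If $l_\alpha \xrightarrow{w^*} l$, then $\|l\| \le \liminf_\alpha \|l_\alpha\|$.
(iii) If $A_\alpha \xrightarrow{wo} A$, then $\|A\| \le \liminf_\alpha \|A_\alpha\|$.

Proof. (i) Using the linear functionals $L_\alpha$ from the proof of Theorem D.5.4 we see that (i)
follows from (ii).
(ii) Let $\{l_\beta\}$ be a subnet of $\{l_\alpha\}$ such that $\lim_\beta \|l_\beta\| = \liminf_\alpha \|l_\alpha\|$. For all $v \in V$
we have
\[ |l(v)| = \lim_\beta |l_\beta(v)| \le \lim_\beta \|l_\beta\| \cdot \|v\| = \liminf_\alpha \|l_\alpha\| \cdot \|v\| \]
from which (ii) follows.
(iii) The proof is similar to that of (ii). We choose a subnet $\{A_\beta\}$ of $\{A_\alpha\}$ such that
$\lim_\beta \|A_\beta\| = \liminf_\alpha \|A_\alpha\|$. For all $g \in H_1$ and $h \in H_2$ we have
\[ |(Ag, h)| = \lim_\beta |(A_\beta g, h)| \le \lim_\beta \|A_\beta\| \cdot \|g\| \cdot \|h\| = \liminf_\alpha \|A_\alpha\| \cdot \|g\| \cdot \|h\| \]
from which the inequality in (iii) follows.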

Theorem D.5.6 (Banach–Alaoglu). Let $V$ be a normed linear space and $K \subseteq V^*$
be a bounded set which is closed in the weak-$*$ topology. Then $K$ is compact in the
weak-$*$ topology.

Proof. It suffices to consider the special case $K = \{l \in V^* : \|l\| \le r\}$, $r > 0$.
This set is weak-$*$ closed in view of (D.5.5.ii) and $|l(v)| \le r\|v\|$ for all $l \in K$ and
all $v \in V$. The weak-$*$ topology is the topology of pointwise convergence. Hence,
by Tychonoff's theorem, each net in $K$ has a subnet converging pointwise to some
function $l_0$ on $V$. This function is obviously linear and $|l_0(v)| \le r\|v\|$, i.e., $l_0 \in K$.
Thus, $K$ is compact in the weak-$*$ topology.

Corollary D.5.7. Let $H_1$ and $H_2$ be Hilbert spaces and let $S \subseteq B(H_1, H_2)$ be closed
in the weak operator topology. Then $S$ is compact in the weak operator topology if
and only if $Sg$ is bounded for all $g \in H_1$.

Proof. The "if" part follows by the same argument as in the proof of the previous
theorem. Here we use the fact that the weak operator topology is the topology of pointwise
convergence, considering the weak-$*$ topology on $H_2$.
To show the "only if" part, assume that $Sg$ is unbounded for some $g \in H_1$. Then
for all $n \in \mathbb{N}$ there exists $s_n \in S$ such that $\|s_n g\| \ge n$. By compactness we can choose
a subsequence $\{s_{n_k}\}$ which is convergent in the weak operator topology. In view of
Theorem D.5.4 this subsequence is bounded and hence $\{s_{n_k} g\}$ is bounded as well.
This contradiction completes the proof.
⁹ For a net $\{r_\alpha\}$ of real numbers, $\liminf_\alpha r_\alpha$ is defined by $\liminf_\alpha r_\alpha = \lim_\alpha \inf \{r_\beta : \beta \ge \alpha\}$.

Theorem D.5.8. Let $V$ be a normed linear space and let $K \subseteq V$ be convex. Then the
weak and strong closures of $K$ coincide.
Proof. It suffices to prove that every strongly closed convex set $E \subseteq V$ is weakly
closed. By Corollary D.4.3, for all $x \notin E$ there exists a continuous real linear
functional $l_x$ on $V$ such that
\[ m_x := \inf \{l_x(y) : y \in E\} > l_x(x). \]
Then
\[ E = \bigcap_{x \notin E} \{y \in V : l_x(y) \ge m_x\}. \]
Since each set $\{y \in V : l_x(y) \ge m_x\}$ is weakly closed, the set $E$ is weakly closed as
well.

Theorem D.5.9. If $S \subseteq B(H_1, H_2)$ is compact in the weak operator topology, then
so is the weak operator closure of its convex hull.
Proof. Corollary D.5.7 and Theorem D.3.8 show that $S$ is bounded and hence so is its
convex hull. By (D.5.5.iii) the weak operator closure $Q$ of the convex hull is bounded
as well. Therefore, the compactness of $Q$ follows from Corollary D.5.7.

Lemma D.5.10. Let $B$ be a Banach space such that $B$ ($B^*$) is strictly convex.¹⁰ If
$K$ is a convex subset of $B$ ($B^*$) and $K$ is closed in the weak (weak-$*$) topology, then
there exists a unique element of $K$ with minimal norm.
Proof. We consider only $B$; the statement about $B^*$ can be proved in the same way.
Choose a sequence $\{v_n\}$ in $K$ such that
\[ \lim_{n \to \infty} \|v_n\| = \inf \{\|v\| : v \in K\}. \]
By Theorem D.5.6 this sequence has a subsequence converging weakly to some
element $v \in B$. Since $K$ is weakly closed we have $v \in K$. Theorem D.5.5 shows that
the norm of $v$ is minimal. The uniqueness of $v$ follows from the fact that $B$ is strictly
convex.

Theorem D.5.11. Let $S$ be a convex semigroup of linear contractions in $H$. If $S$ is
compact in the weak operator topology, then there exists a unique orthogonal
projection $s_0 \in S$ such that $s_0 s = s s_0 = s_0$ holds for all $s \in S$.¹¹
¹⁰ A normed space $V$ is called strictly convex if the relations $\|v\| = \|w\|$ and $v \ne w$ imply the inequality
$\|(v + w)/2\| < \|v\|$. Hilbert spaces are strictly convex.
¹¹ A more general result can be found in [48].

Proof. Denote by $H_0$ the set of common fixed points of the operators in $S$. Then
$0 \in H_0$ and $H_0$ is an $S$-invariant subspace of $H$. Write
\[ S^* := \{s^* : s \in S\}. \]
It is easy to check that $S^*$ is a compact convex semigroup of contractions. Moreover,
the subspace $H_0^\perp$ is $S^*$-invariant. Indeed, if $h_0 \in H_0$ and $h \in H_0^\perp$ are arbitrary then
\[ (s^* h, h_0) = (h, s h_0) = (h, h_0) = 0, \quad s \in S. \]
For all $h \in H$ the set $S^* h$ is convex. Using the fact that $S^*$ is compact we see that $S^* h$
is weakly compact. By Lemma D.5.10 there exists a unique $h_0 \in S^* h$ having minimal
norm. Since $S^*$ is a semigroup of contractions, $s^* h_0$ is in $S^* h$ and $\|s^* h_0\| \le \|h_0\|$ for
all $s \in S$. By the uniqueness of $h_0$ we must have $s^* h_0 = h_0$.
We prove that $h_0 = 0$ if $h \in H_0^\perp$. Let $g \in H$ be arbitrary and consider the weakly
compact convex set $Sg$. The same arguments as above show the existence of $g_0 \in Sg$
such that $s g_0 = g_0$ for all $s \in S$. Thus, $g_0 \in H_0$ and hence $(g_0, h_0) = 0$. Choose
$s_0 \in S$ such that $g_0 = s_0 g$. Then
\[ (g, h_0) = (g, s_0^* h_0) = (s_0 g, h_0) = (g_0, h_0) = 0, \quad g \in H, \]
and hence $h_0 = 0$. We have thus shown that for all $h' \in H_0^\perp$ the set $S^* h'$ contains $0$.
Now let $h_1, \ldots, h_n \in H_0^\perp$ be arbitrary and choose $s_1 \in S$ such that $s_1^* h_1 = 0$.
Next we choose $s_2 \in S$ such that $s_2^* (s_1^* h_2) = 0$. Continuing this process we obtain an
operator $s \in S$ with $s^* h_j = 0$ for all $j = 1, \ldots, n$. A simple compactness argument
shows the existence of an operator $s_0 \in S$ such that $s_0^* h = 0$ for all $h \in H_0^\perp$. It follows
that $s_0 H \subseteq H_0$ and hence $s s_0 = s_0$.
It remains to prove that $s_0 s = s_0$. Note first that the equations $s_0 s s_0 = s_0$ and
$s_0 s s_0 s = s_0 s$ hold because of $s s_0 = s_0$. Taking adjoints, we have
\[ \|s^* s_0^* h\| = \|s^* s_0^* s^* s_0^* h\| \le \|s_0^* s^* s_0^* h\| \le \|s^* s_0^* h\|, \quad h \in H. \]
Using this and the relation $s_0^* s^* s_0^* = s_0^*$ we see that $\|s^* s_0^* h\| = \|s_0^* h\|$. If
$s_0^* h \ne s^* s_0^* h$ for some $h \in H$, then
\[ \|s_0^* h\| = \|s^* s_0^* h\| = \tfrac{1}{2} \|s_0^* (s^* s_0^* h + s_0^* h)\| \le \tfrac{1}{2} \|s^* s_0^* h + s_0^* h\| < \|s_0^* h\|. \]
This contradiction shows that $s_0^* = s^* s_0^*$ and therefore $s_0 s = s_0$. The uniqueness of
$s_0$ follows at once from $s s_0 = s_0 s = s_0$.
Appendix E

Measure theory

E.1 Borel measures, weak and vague convergence

Mainly we need only measures on Rd or on Zd . However, at a few places we use


Borel measures on topological spaces in general. In this section we collect some
basic facts that will be sufficient for our purposes. We refer to Chapter 2 of [3] or to
§§30–31 of [2] for the proofs.¹

Throughout this section $X$ will denote a Hausdorff topological space. The measures
considered will be defined on the $\sigma$-algebra $\mathcal{B}(X)$ of all Borel subsets of $X$, i.e., the
$\sigma$-algebra generated by the open subsets of $X$. A measure (nonnegative, real or
complex) defined on $\mathcal{B}(X)$ will be called a Borel measure. In our terminology a measure
need not be nonnegative. If $\mu$ and $\nu$ are finite Borel measures and $c \in \mathbb{C}$, then the
measures $c\mu$ and $\mu + \nu$ are defined by $(c\mu)(B) = c\mu(B)$ and $(\mu + \nu)(B) = \mu(B) + \nu(B)$,
$B \in \mathcal{B}(X)$.
Definition E.1.1. A nonnegative Radon measure $\mu$ on $X$ is a nonnegative Borel
measure such that
(i) $\mu(K) < \infty$ for each compact set $K \subseteq X$;
(ii) $\mu(B) = \sup \{\mu(K) : K \subseteq B, \; K \text{ compact}\}$ for each Borel set $B \subseteq X$.

E.1.2. The set of all nonnegative Radon measures is denoted by $M_+(X)$ while
$M_+^b(X)$ denotes the set of all finite measures in $M_+(X)$. A complex Radon measure
$\mu$ on $X$ is a complex measure of the form
\[ \mu = (\mu_1 - \mu_2) + i(\mu_3 - \mu_4) \]
where the $\mu_j$'s are in $M_+^b(X)$. If a Borel measurable function $f$ is $\mu_j$-integrable for
all $j$, then we say that $f$ is $\mu$-integrable. The integral $\int f \, d\mu$ is then defined by
\[ \int f \, d\mu = \int f \, d\mu_1 - \int f \, d\mu_2 + i \int f \, d\mu_3 - i \int f \, d\mu_4. \]
We write $M^b(X)$ for the set of all complex Radon measures on $X$ while $M^f(X)$
stands for the set of all complex measures with finite support.
¹ We remark that there are some small differences in the terminology of these books, e.g., in the
definitions of Borel and Radon measures.

We define the measure $\bar{\mu}$ by $\bar{\mu}(B) = \overline{\mu(B)}$, $B \in \mathcal{B}(X)$. If $\mu \in M^b(\mathbb{R}^d)$, then the
measures $\check{\mu}$ and $\tilde{\mu}$ are defined by
\[ \check{\mu}(B) = \mu(-B) \quad \text{and} \quad \tilde{\mu}(B) = \overline{\mu(-B)}, \quad B \in \mathcal{B}(\mathbb{R}^d), \]
respectively. Note that on $\mathbb{R}^d$ each nonnegative Borel measure that is finite on compact
sets is a Radon measure.
If $\mu$ and $\nu$ are complex Radon measures on a locally compact space $X$ and
\[ \int f \, d\mu = \int f \, d\nu, \quad f \in C_{00}(X), \]
then $\mu = \nu$.
Let $\mu$ be a complex Radon measure on a locally compact space $X$. Then there exist
a finite nonnegative Borel measure $|\mu|$ and a Borel measurable function $g$ on $X$ such
that² $|g(x)| = 1$ for all $x \in X$ and
\[ \int f \, d\mu = \int f g \, d|\mu| \tag{1} \]
for every $\mu$-integrable function $f$. This equation implies the inequality
\[ \left| \int f \, d\mu \right| \le \int |f| \, d|\mu|. \]

Theorem E.1.3 (Riesz representation theorem). Let $X$ be locally compact. Then there
is a bijection between all nonnegative linear functionals³ $L$ on $C_{00}(X)$ and all Radon
measures $\mu$ on $X$ given by
\[ L(f) = \int f \, d\mu, \quad f \in C_{00}(X). \]

Theorem E.1.4 (Riesz–Markov theorem). Let $X$ be locally compact. For any
continuous⁴ linear functional $L$ on $C_0(X)$, there is a unique complex Radon measure $\mu$
on $X$ such that
\[ L(f) = \int f \, d\mu, \quad f \in C_0(X). \]
The norm of $L$ as a linear functional is $|\mu|(X)$.

Lemma E.1.5. Let $\mathcal{F}$ be a linear space of real-valued bounded functions on some set
$X$ such that $1 := 1_X \in \mathcal{F}$. Further, let $L$ be a real linear functional on $\mathcal{F}$ such that
$L(1) = 1$ and
\[ |L(f)| \le \sup_{x \in X} |f(x)|, \quad f \in \mathcal{F}. \]
Then $L$ is nonnegative, i.e., $L(p) \ge 0$ whenever $p \ge 0$.
² See Theorem 14.14 and Corollary 11.41 in [27].
³ Nonnegativity means that $L(f) \ge 0$ whenever $f \ge 0$.
⁴ We consider the supremum norm on $C_0(X)$.

Proof. If $L$ were not nonnegative, then we could find a function $p \in \mathcal{F}$ such that
$0 \le p \le 1$ and $L(p) < 0$. From $1 = L(1) = L(p) + L(1 - p)$ it follows that
$L(1 - p) > 1$. This contradicts the assumptions since $0 \le 1 - p \le 1$.

Definition E.1.6. The weak topology on $M_+^b(X)$ is the coarsest topology (the
topology with the fewest open sets) such that the functions $\mu \mapsto \int f \, d\mu$ become lower
semicontinuous for every bounded lower semicontinuous real-valued function $f$ on
$X$.⁵

The space $M_+^b(X)$ is a Hausdorff space in the weak topology. Convergence in the
weak topology is characterized by the next theorem.

Theorem E.1.7. For $\mu \in M_+^b(X)$ and a net $\{\mu_\alpha\}$ in $M_+^b(X)$ the following
properties are equivalent:
(i) $\mu_\alpha \to \mu$ in the weak topology;
(ii) $\liminf \mu_\alpha(G) \ge \mu(G)$ for all open $G \subseteq X$ and $\lim \mu_\alpha(X) = \mu(X)$;
(iii) $\limsup \mu_\alpha(F) \le \mu(F)$ for all closed $F \subseteq X$ and $\lim \mu_\alpha(X) = \mu(X)$;
(iv) $\liminf \int f \, d\mu_\alpha \ge \int f \, d\mu$ for all bounded lower semicontinuous
$f : X \to \mathbb{R}$;
(v) $\limsup \int f \, d\mu_\alpha \le \int f \, d\mu$ for all bounded upper semicontinuous
$f : X \to \mathbb{R}$.
If (i)–(v) are fulfilled, then
(vi) $\lim \int f \, d\mu_\alpha = \int f \, d\mu$ for all bounded continuous $f : X \to \mathbb{R}$.
Finally, if $X$ is a completely regular space,⁶ then property (vi) implies (i)–(v).
We say that a net $\{\mu_\alpha\}$ of complex Radon measures on a completely regular space
$X$ converges weakly to some complex measure $\mu$ if the relation (E.1.7.vi) holds.

The set of finitely supported nonnegative measures is a dense subset of $M_+^b(X)$
with respect to the weak topology. This set is dense even in a stronger sense:

Theorem E.1.8. For every $\mu \in M_+^b(X)$ there exists a net $\{\mu_\alpha\}$ of nonnegative finitely
supported measures such that
\[ \lim_\alpha \mu_\alpha(B) = \mu(B), \quad B \in \mathcal{B}(X). \]

⁵ Note that a real-valued function $f$ on $X$ is called lower (upper) semicontinuous if the set
$\{x \in X : f(x) > r\}$ (the set $\{x \in X : f(x) < r\}$, respectively) is open for every $r \in \mathbb{R}$.
⁶ A topological space $X$ is completely regular if, given any closed set $F$ and any point $x \in X \setminus F$, there
is a continuous real-valued function $f$ on $X$ such that $f(x) = 0$ and $f(y) = 1$ for every $y \in F$.
Locally compact Hausdorff spaces and metric spaces are completely regular.

The next simple result shows that finitely supported measures on $\mathbb{R}^d$ can be
approximated by absolutely continuous measures with compact support.

Lemma E.1.9. For every finitely supported nonnegative measure $\mu$ on $\mathbb{R}^d$ there exists
a sequence $\{h_n\}$ of nonnegative functions $h_n \in C_{00}(\mathbb{R}^d)$ such that the sequence $\{\nu_n\}$,
where $d\nu_n(t) = h_n(t) \, d\lambda(t)$, converges weakly to $\mu$.

Proof. Consider first the special case $\mu = \delta_x$, $x \in \mathbb{R}^d$, and denote by $B_n(x)$ the
open ball with center $x$ and radius $1/n$, $n \in \mathbb{N}$. For each $n$ we choose a nonnegative
function $h_n$ such that $\int h_n \, d\lambda = 1$ and $h_n(t) = 0$, $t \notin B_n(x)$, and define $\nu_n$ as in the
statement of the lemma. Now let $f$ be an arbitrary continuous real-valued function on
$\mathbb{R}^d$. For an arbitrary $\varepsilon > 0$ there exists $N(\varepsilon) \in \mathbb{N}$ such that
\[ f(x) - \varepsilon < f(t) < f(x) + \varepsilon, \quad t \in B_n(x), \; n \ge N(\varepsilon). \]
Integrating both sides with respect to $\nu_n$ gives
\[ f(x) - \varepsilon < \int f(t) \, d\nu_n(t) < f(x) + \varepsilon, \quad n \ge N(\varepsilon). \]
Noting that $\int f \, d\mu = f(x)$ and applying Theorem E.1.7 we see that $\{\nu_n\}$ converges
weakly to $\mu$.
The general case follows by linearity.
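
The special case $\mu = \delta_x$ of the proof can be reproduced numerically. The sketch below assumes $d = 1$ and takes for $h_n$ the triangular bump of height $n$ supported on $[x - 1/n, x + 1/n]$, which is continuous, compactly supported and has integral $1$. The integral $\int f \, d\nu_n$ is approximated by a midpoint rule. The choice of bump, the test function $\cos$ and the name `integrate_against_bump` are illustrative assumptions.

```python
import math

def integrate_against_bump(f, x, n, steps=2000):
    """Midpoint-rule approximation of the integral of f * h_n over R, where
    h_n(t) = n * (1 - n * |t - x|) on [x - 1/n, x + 1/n] and 0 elsewhere."""
    a = 1.0 / n                      # radius of the support of h_n
    width = 2 * a / steps            # width of one integration cell
    total = 0.0
    for i in range(steps):
        t = x - a + (i + 0.5) * width
        h_n = n * (1.0 - n * abs(t - x))
        total += f(t) * h_n * width
    return total

x = 0.5
# Errors |integral of f d(nu_n) - f(x)| for f = cos and n = 1, 10, 100.
errors = [abs(integrate_against_bump(math.cos, x, n) - math.cos(x))
          for n in (1, 10, 100)]
# Total mass of nu_n for n = 10 (should be 1 up to rounding).
mass = integrate_against_bump(lambda t: 1.0, x, 10)
```

The errors decrease as $n$ grows, which is the convergence $\int f \, d\nu_n \to f(x)$ used in the proof.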

Definition E.1.10. The vague topology on $M_+(X)$ is the coarsest topology such that
the functions $\mu \mapsto \int f \, d\mu$ are continuous for every function $f \in C_{00}(X)$.

Convergence in the vague topology is characterized by the next theorem.

Theorem E.1.11. Let $X$ be locally compact. For $\mu \in M_+(X)$ and a net $\{\mu_\alpha\}$ in
$M_+(X)$ the following properties are equivalent:
(i) $\mu_\alpha \to \mu$ in the vague topology;
(ii) $\limsup \mu_\alpha(K) \le \mu(K)$ for all compact $K \subseteq X$ and
$\liminf \mu_\alpha(G) \ge \mu(G)$ for all relatively compact⁷ open sets $G \subseteq X$;
(iii) $\lim \mu_\alpha(B) = \mu(B)$ for all relatively compact sets $B \in \mathcal{B}(X)$ such that
$\mu(\partial B) = 0$.⁸
The following relationship holds between the weak and the vague topologies:

⁷ A set is called relatively compact if its closure is compact.
⁸ The symbol $\partial B$ denotes the boundary of $B$. It consists of all points of the closure of $B$ which are not
inner points of $B$.

Theorem E.1.12. Suppose that $X$ is locally compact and let $\{\mu_\alpha\}$ be a net in
$M_+^b(X)$. Then $\{\mu_\alpha\}$ converges weakly to a measure $\mu \in M_+^b(X)$ if and only if it
converges vaguely to $\mu$ and $\lim \mu_\alpha(X) = \mu(X)$.
The next theorem is very useful. A special case of it is known as Helly's selection
theorem.

Theorem E.1.13. Suppose that $X$ is locally compact and let $p > 0$. The set
\[ \{\mu \in M_+^b(X) : \mu(X) \le p\} \]
is vaguely compact. If $X$ is compact, then the set of all Radon probability measures
on $X$ is compact in the weak topology.
If $\mu$ and $\nu$ are Radon measures on spaces $X$ and $Y$, then there is a unique Radon
measure $\mu \otimes \nu$ on $X \times Y$, called product measure, satisfying
\[ (\mu \otimes \nu)(A \times B) = \mu(A) \cdot \nu(B) \quad \text{for all compact } A \subseteq X, \; B \subseteq Y. \]
The next theorem states that the product measure depends continuously on both of its
arguments.

Theorem E.1.14. Let $X$ and $Y$ be Hausdorff spaces. Then $(\mu, \nu) \mapsto \mu \otimes \nu$ is weakly
continuous as a mapping from $M_+^b(X) \times M_+^b(Y)$ to $M_+^b(X \times Y)$. If $X$ and $Y$
are locally compact, then $(\mu, \nu) \mapsto \mu \otimes \nu$ is vaguely continuous as a mapping from
$M_+(X) \times M_+(Y)$ to $M_+(X \times Y)$.

Theorem E.1.15. Let $X$ and $Y$ be Hausdorff spaces and let $f : X \to Y$ be a
continuous mapping. Then for any $\mu \in M_+^b(X)$ the image measure $\mu^f$ belongs
to $M_+^b(Y)$. Moreover, the transformation $\mu \mapsto \mu^f$ from $M_+^b(X)$ to $M_+^b(Y)$ is
continuous in the weak topology.
The previous theorem has a useful corollary.

Corollary E.1.16. Let $f : \mathbb{R}^d \to \mathbb{R}^k$ be continuous. If a sequence $\{X_n\}$ of
$d$-dimensional random vectors converges in distribution⁹ to a random vector $X$, then $\{f(X_n)\}$
converges in distribution to $f(X)$.

Theorem E.1.17. Let $(O_\alpha)_{\alpha \in I}$ be an open covering of $X$ and on each $O_\alpha$ let a Radon
measure $\mu_\alpha$ be given such that $\mu_\alpha(B) = \mu_\beta(B)$ for each pair of indices $\alpha, \beta \in I$
and for each Borel set $B \subseteq O_\alpha \cap O_\beta$. Then there is a uniquely determined Radon
measure $\mu$ on $X$ such that $\mu(B) = \mu_\alpha(B)$ for every Borel set $B \subseteq O_\alpha$.

⁹ Recall that, by definition, a sequence of random vectors converges in distribution to a random vector
$X$ if the sequence of the corresponding distributions converges weakly to the distribution of $X$.

Lemma E.1.18. Let $f$ be a Borel measurable function on a locally compact Hausdorff
space $X$ and $\mu$ be a complex Radon measure on $X$. Suppose that there exists
$K \ge 0$ such that the inequality
\[ \left| \int_S f \, d\mu \right| \le K \]
holds for all compact sets $S \subseteq X$. Then $f$ is $\mu$-integrable and the same inequality
holds with $S$ replaced by $X$.

Proof. Let $g$ be as in (E.1.2.1) and write $h = fg$. Then
\[ \left| \int_S f \, d\mu \right| = \left| \int_S h \, d|\mu| \right| \le K \]
holds for all compact sets $S \subseteq X$. We show first that $h$ is $|\mu|$-integrable. Since
$|\operatorname{Re} z| \le |z|$ and $|\operatorname{Im} z| \le |z|$, $z \in \mathbb{C}$, we may suppose that $h$ is real-valued.
Setting $h_+ = \max(0, h)$, $h_- = -\min(0, h)$ and $S_{h_+} = \{x \in X : h(x) \ge 0\}$, we have
$h = h_+ - h_-$ and
\[ 0 \le \int_S h_+ \, d|\mu| = \int_{S \cap S_{h_+}} h \, d|\mu| \le K. \]
The validity of the last inequality follows from the fact that $\mu$ is a Radon measure and
hence the integral over $S \cap S_{h_+}$ can be approximated by integrals over compact sets.
From this we conclude that $h_+$ is $|\mu|$-integrable. The integrability of $h_-$ follows in the
same way. Thus $h$ is integrable. Since $|f| = |h|$, the function $f$ is integrable as well.
As $\mu$ is a Radon measure, there exist compact sets $K_n \subseteq X$, $n \in \mathbb{N}$, such that
\[ \lim_{n \to \infty} \int_{K_n} f \, d\mu = \int_X f \, d\mu \]
from which the required inequality follows.

Lemma E.1.19 (Scheffé). Let $\mu$ be a nonnegative measure on a measurable space
$(\Omega, \mathcal{A})$ and let $f, f_n$ be nonnegative, $\mu$-integrable functions such that
\[ \lim_{n \to \infty} f_n = f \]
$\mu$-almost everywhere and
\[ \lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu. \]
Then
\[ \lim_{n \to \infty} \int |f_n - f| \, d\mu = 0. \]

Proof. The function $f_n + f - |f_n - f|$ is obviously nonnegative. Applying Fatou's
lemma we obtain
\begin{align*}
2 \int f \, d\mu &= \int \liminf_{n \to \infty} \, (f_n + f - |f_n - f|) \, d\mu \\
&\le \liminf_{n \to \infty} \int (f_n + f - |f_n - f|) \, d\mu \\
&= 2 \int f \, d\mu - \limsup_{n \to \infty} \int |f_n - f| \, d\mu
\end{align*}
from which the statement follows.

E.2 Convolution of measures and functions


This section contains basic facts about the convolution of functions and measures.
Proofs and more details can be found in the books [2, 3, 27, 28]. All measures in this
section will be complex Borel measures on Rd (cf. Section E.1).

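Definition E.2.1. The additive convolution or simply convolution $\mu * \nu$ of two finite
Borel measures $\mu$ and $\nu$ is defined as the image of the product measure $\mu \otimes \nu$ under
the mapping $A : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}^d$ where $A(x, y) = x + y$.
If $d = 1$, then the multiplicative convolution $\mu \star \nu$ is the image of the product
measure $\mu \otimes \nu$ under the mapping $M : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ where $M(x, y) = xy$.

E.2.2. The convolution $\mu * \nu$ and the multiplicative convolution $\mu \star \nu$ are again finite
Borel measures satisfying
\[ \int f \, d(\mu * \nu) = \iint f(x + y) \, d\mu(x) \, d\nu(y) \]
and
\[ \int g \, d(\mu \star \nu) = \iint g(xy) \, d\mu(x) \, d\nu(y) \]
for arbitrary bounded continuous functions $f$ on $\mathbb{R}^d$ and $g$ on $\mathbb{R}$, respectively. In the
sequel we consider only the additive convolution; analogous formulae hold for the
multiplicative convolution as well.
We have
(i) $\mu * \nu = \nu * \mu$;
(ii) $(\mu * \nu) * \sigma = \mu * (\nu * \sigma)$;
(iii) $(c \cdot \mu) * \nu = c \cdot (\mu * \nu)$, $c \in \mathbb{C}$;
(iv) $\mu * (\nu + \sigma) = \mu * \nu + \mu * \sigma$;
(v) $\delta_x * \delta_y = \delta_{x+y}$.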

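The last three equations imply that the convolution of the finitely supported measures
$\mu = \sum_{j=1}^n c_j \delta_{x_j}$ and $\nu = \sum_{k=1}^m d_k \delta_{y_k}$, where $x_j, y_k \in \mathbb{R}^d$, $c_j, d_k \in \mathbb{C}$, is
given by
\[ \mu * \nu = \sum_{j=1}^{n} \sum_{k=1}^{m} c_j d_k \, \delta_{x_j + y_k}. \]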
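
The double sum translates directly into code. The sketch below assumes $d = 1$ and stores a finitely supported measure $\sum_j c_j \delta_{x_j}$ as a dictionary `{x_j: c_j}`; for $d > 1$ the keys could be tuples with componentwise addition. The function name `convolve` is illustrative.

```python
from collections import defaultdict

def convolve(mu, nu):
    """Convolution of two finitely supported measures on R.

    Each measure is a dictionary {atom: weight}; the result collects the
    weights c_j * d_k at the atoms x_j + y_k, merging coinciding atoms.
    """
    out = defaultdict(int)
    for x, c in mu.items():
        for y, d in nu.items():
            out[x + y] += c * d
    return dict(out)

mu = {0: 1, 1: 2}
nu = {0: 3, 2: 1}
product = convolve(mu, nu)            # {0: 3, 2: 1, 1: 6, 3: 2}

# Distribution of a fair coin flip and of the sum of two independent flips.
coin = {0: 0.5, 1: 0.5}
two_flips = convolve(coin, coin)      # {0: 0.25, 1: 0.5, 2: 0.25}
```

Commutativity (E.2.2.i) and the rule $\delta_x * \delta_y = \delta_{x+y}$ can be read off directly: `convolve(mu, nu)` equals `convolve(nu, mu)`, and `convolve({2: 1}, {3: 1})` is `{5: 1}`.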

E.2.3. Let $\nu \in M^b(\mathbb{R}^d)$ and $f, g \in L^1(\mathbb{R}^d)$. Recall that $\lambda = \lambda_d$ denotes the Lebesgue
measure on $\mathbb{R}^d$. We define the functions $f * \nu$ and $f * g$ for $x \in \mathbb{R}^d$ by
\[ f * \nu(x) = \int f(x - y) \, d\nu(y) \]
and
\[ f * g(x) = \int f(x - y) g(y) \, d\lambda(y) \]
if the integrals exist; otherwise we set $f * \nu(x) = 0$ or $f * g(x) = 0$, respectively.
It can be shown that the integrals above exist $\lambda$-almost everywhere (see also
Theorem E.2.4 below). Moreover, the analogues of (E.2.2.i–iv) hold for convolutions of
functions and measures and for convolutions of functions as well ($\lambda$-almost
everywhere).
If $\mu$ is absolutely continuous with respect to the Lebesgue measure, i.e.,
$d\mu(x) = f(x) \, d\lambda(x)$ with some $f \in L^1(\mathbb{R}^d)$, then $\mu * \nu$ is absolutely continuous
as well, and
\[ d(\mu * \nu)(x) = f * \nu(x) \, d\lambda(x). \]
If $\nu$ is absolutely continuous, too, i.e., $d\nu(x) = g(x) \, d\lambda(x)$ with some $g \in L^1(\mathbb{R}^d)$,
then
\[ d(\mu * \nu)(x) = f * g(x) \, d\lambda(x). \]

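Theorem E.2.4. If $f, g \in L^2(\mathbb{R}^d)$ then the integral
\[
f * g(x) := \int f(x - y) g(y) \, d\lambda(y)
\]
exists for all $x \in \mathbb{R}^d$ and $f * g \in C_0(\mathbb{R}^d)$.
If $f \in L^1(\mathbb{R}^d)$ and $g \in L^p(\mathbb{R}^d)$, $1 \le p \le \infty$, then the integral above exists
$\lambda$-almost everywhere, $f * g \in L^p(\mathbb{R}^d)$ and $\|f * g\|_p \le \|f\|_1 \cdot \|g\|_p$.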
The next theorem states that the convolution depends continuously on both of its
arguments.

Theorem E.2.5. If the nets $\{\mu_\alpha\}$, $\{\nu_\alpha\}$ converge weakly to $\mu$ and $\nu$, respectively, then
$\{\mu_\alpha * \nu_\alpha\}$ converges weakly to $\mu * \nu$.
Appendix F

Probability

In this chapter we collect some basic definitions, facts, and notation of probability
theory. For more details see, e.g., [10, 16, 17, 37, 45].

F.1 Basic notions


A probability space is a triple $(\Omega, \mathcal{A}, P)$, where $\Omega$ is a nonempty set, $\mathcal{A}$ is a $\sigma$-algebra
of subsets of $\Omega$, and $P$ is a probability measure$^1$ on $\mathcal{A}$. A $d$-dimensional random vector
on $(\Omega, \mathcal{A}, P)$ is an $(\mathcal{A}, \mathcal{B}_d)$-measurable function $X\colon \Omega \to \mathbb{R}^d$, where $\mathcal{B}_d$ denotes the
$\sigma$-algebra of Borel subsets of $\mathbb{R}^d$. If $d = 1$, then we call $X$ a random variable. If
we mention two or more random vectors in a theorem, definition, etc., we implicitly
assume that all random vectors are defined on the same probability space.
Two random vectors $X$ and $Y$ are called independent if the equality
\[
P(X \in B, \, Y \in D) = P(X \in B) \cdot P(Y \in D)
\]
holds for all $B, D \in \mathcal{B}_d$. Let $g$ and $h$ be $(\mathcal{B}_d, \mathcal{B}_n)$-measurable functions from $\mathbb{R}^d$
into $\mathbb{R}^n$. If $X$ and $Y$ are independent, then the random vectors $g(X)$ and $h(Y)$ are
independent as well. Complex-valued random variables, complex random vectors and
their independence are defined in the same way by considering the $\sigma$-algebra of Borel
subsets of $\mathbb{C}^d$. If not stated otherwise, random vectors are supposed to be real.
The probability measure $\mu_X$ defined by
\[
\mu_X(B) := P(X \in B), \quad B \in \mathcal{B}_d
\]
is called the distribution of the random vector $X$, while the distribution function $F_X$
of $X$ is defined by
\[
F_X(t) = \mu_X\bigl((-\infty, t_1) \times \cdots \times (-\infty, t_d)\bigr), \quad t \in \mathbb{R}^d.
\]
A random vector $X$ is said to be continuous if $\mu_X$ is continuous, i.e., if the equation
$\mu_X(\{t\}) = P(X = t) = 0$ holds for all $t \in \mathbb{R}^d$. If there exists a nonnegative Borel
measurable function $p$ on $\mathbb{R}^d$ such that
\[
\mu_X(B) = \int_B p(x) \, d\lambda(x), \quad B \in \mathcal{B}_d
\]

$^1$ That is, a nonnegative measure such that $P(\Omega) = 1$.



i.e., if $\mu_X$ is absolutely continuous with respect to the Lebesgue measure, then $p$ is
called the density of $X$ (or of $\mu_X$). In this case we say that $X$ is an absolutely
continuous random vector. If there exists a Borel subset $S$ of $\mathbb{R}^d$ such that $\lambda(S) = 0$ and
$\mu_X(S) = 1$, then $X$ and $\mu_X$ are said to be singular (with respect to the Lebesgue
measure). If there exists a denumerable subset $S$ of $\mathbb{R}^d$ such that $\mu_X(S) = 1$, then
$X$ and $\mu_X$ are said to be discrete. Note that discrete distributions are singular.
For arbitrary $X$ the distribution $\mu_X$ can be written as
\[
\mu_X = p_1 \mu_d + p_2 \mu_{ac} + p_3 \mu_{sc}
\]
where $p_1, p_2, p_3$ are nonnegative numbers summing to 1 and $\mu_d, \mu_{ac}, \mu_{sc}$ are
probability distributions such that $\mu_d$ is discrete, $\mu_{ac}$ is absolutely continuous, and $\mu_{sc}$ is
continuous and singular.
If $X$ and $Y$ are independent, then
\[
\mu_{X+Y} = \mu_X * \mu_Y
\]
where $*$ denotes convolution$^2$ of measures. We have
\[
\int_{\mathbb{R}^d} h(t) \, d(\mu_X * \mu_Y)(t) = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} h(x + y) \, d\mu_X(x) \, d\mu_Y(y) \tag{1}
\]
for every bounded, continuous complex-valued function $h$ on $\mathbb{R}^d$. If $Y$ is absolutely
continuous, then $X + Y$ is absolutely continuous as well, and its density $p_{X+Y}$ is
given by
\[
p_{X+Y}(y) = \int_{\mathbb{R}^d} p_Y(y - x) \, d\mu_X(x) \tag{2}
\]
where $p_Y$ denotes the density of $Y$.
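
Formula (2) becomes a finite sum when $X$ is discrete. The following Python sketch illustrates this under assumptions chosen only for the example: $X$ is a fair coin on $\{0, 1\}$, $Y$ is uniform on $[0, 1]$, and the helper names `density_of_sum` and `uniform_density` are hypothetical.

```python
def density_of_sum(mu_x, p_y):
    """Density of X + Y for independent X, Y, following formula (2).

    mu_x is a finitely supported distribution {point: probability} and
    p_y is the density of Y, so the integral over mu_x is a finite sum.
    """
    def p_sum(y):
        return sum(weight * p_y(y - x) for x, weight in mu_x.items())
    return p_sum


def uniform_density(t):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0


coin = {0: 0.5, 1: 0.5}          # X: fair coin on {0, 1}
p = density_of_sum(coin, uniform_density)
```

With these choices $X + Y$ is uniform on $[0, 2]$, so `p` takes the value 0.5 inside that interval and 0 outside it.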
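If $d = 1$ and $X$ and $Y$ are independent, then
\[
\mu_{XY} = \mu_X \star \mu_Y
\]
where $\star$ denotes multiplicative convolution. We have
\[
\int_{\mathbb{R}} h(t) \, d(\mu_X \star \mu_Y)(t) = \int_{\mathbb{R}} \int_{\mathbb{R}} h(xy) \, d\mu_X(x) \, d\mu_Y(y) \tag{3}
\]
for every bounded, continuous complex-valued function $h$ on $\mathbb{R}$.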


Let $X$ be a $P$-integrable real- or complex-valued random variable. The number
\[
E(X) := \int_{\Omega} X(\omega) \, dP(\omega)
\]
is called the expectation of $X$ while the variance $\mathrm{Var}(X)$ of $X$ is defined by
\[
\mathrm{Var}(X) := E\bigl(|X - E(X)|^2\bigr).
\]
Note that this definition allows the variance to be $\infty$.

$^2$ For properties of convolution of functions and measures we refer to Section E.2.

If $X$ and $Y$ are independent and integrable, then $XY$ is integrable and we have
\[
E(XY) = E(X) \cdot E(Y).
\]
More generally, let $X = (X_1, \ldots, X_d)$ be a $d$-dimensional real or complex random
vector such that $X_j$ is $P$-integrable for all $j$. Then the expectation of $X$ is defined by
\[
E(X) := (EX_1, \ldots, EX_d).
\]
Suppose that $X_j$ is square integrable for all $j$ and put $m_j = EX_j$. The matrix
$\mathrm{cov}(X) = (c_{jk})_{j,k=1}^{d}$ where
\[
c_{jk} = E\bigl[(X_j - m_j)\overline{(X_k - m_k)}\bigr] = E\bigl[X_j \overline{X_k}\bigr] - m_j \overline{m_k}
\]
is called the covariance matrix of $X$. The covariance matrix is positive semidefinite
(cf. Section D.2), since
\[
\sum_{j,k=1}^{d} c_{jk} z_j \overline{z_k} = E\Bigl|\sum_{j=1}^{d} z_j \cdot (X_j - m_j)\Bigr|^2 \ge 0, \quad z_j \in \mathbb{C}.
\]

If the entries of a matrix $Y = (Y_{jk})$ are integrable random variables, then we write
\[
E(Y) := (EY_{jk}).
\]
Using this notation, the covariance matrix can also be written in the form$^3$
\[
\mathrm{cov}(X) = E\bigl[(X - E(X)) \cdot (X - E(X))^*\bigr].
\]
Let $A \in \mathbb{C}^{d \times d}$ be a square matrix. Then, as it is easy to verify,
\[
\mathrm{cov}(AX) = A \, \mathrm{cov}(X) \, A^*.
\]
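
The identity $\mathrm{cov}(AX) = A\,\mathrm{cov}(X)\,A^*$ can be checked exactly on a small example. The Python sketch below is an illustration under simplifying assumptions: the random vector is real and takes four values with equal probability, so $A^*$ reduces to the transpose, and all helper names and the particular numbers are hypothetical.

```python
from fractions import Fraction


def expectation(dist):
    """Mean vector of a finitely supported distribution {point: prob}."""
    d = len(next(iter(dist)))
    return [sum(p * x[j] for x, p in dist.items()) for j in range(d)]


def covariance(dist):
    """Covariance matrix (c_jk) of a real finitely supported distribution."""
    m = expectation(dist)
    d = len(m)
    return [[sum(p * (x[j] - m[j]) * (x[k] - m[k]) for x, p in dist.items())
             for k in range(d)] for j in range(d)]


def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def push_forward(dist, a):
    """Distribution of AX when X has distribution dist."""
    out = {}
    for x, p in dist.items():
        y = tuple(sum(a[i][j] * x[j] for j in range(len(x)))
                  for i in range(len(a)))
        out[y] = out.get(y, 0) + p
    return out


quarter = Fraction(1, 4)
dist = {(0, 0): quarter, (1, 0): quarter, (0, 2): quarter, (3, 1): quarter}
A = [[1, 2], [0, -1]]

lhs = covariance(push_forward(dist, A))                   # cov(AX)
rhs = matmul(matmul(A, covariance(dist)), transpose(A))   # A cov(X) A^T
```

Because all arithmetic is done with `Fraction`, `lhs` and `rhs` can be compared for exact equality.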

Let $\alpha = (\alpha_1, \ldots, \alpha_d) \in \mathbb{N}_0^d$ be a $d$-tuple of nonnegative integers and let $\mu$ be
the distribution of a $d$-dimensional random vector $X$. If the function $x \mapsto x^\alpha$ is
$\mu$-integrable, then we call the number
\[
M_\alpha = M_\alpha(\mu) = M_\alpha(X) = \int_{\mathbb{R}^d} x^\alpha \, d\mu(x)
\]
the moment of order $\alpha$ of $\mu$ (of $X$). In this case we also say that the moment of order
$\alpha$ of $\mu$ exists. Note that $M_\alpha$ exists if and only if
\[
A_\alpha = A_\alpha(\mu) = A_\alpha(X) = \int_{\mathbb{R}^d} |x^\alpha| \, d\mu(x) < \infty.
\]

$^3$ Recall our convention from Section A.2 saying that in expressions involving matrix operations the
elements of $\mathbb{R}^d$ and $\mathbb{C}^d$ are considered as column vectors.

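The number $A_\alpha$ is called the absolute moment of order $\alpha$. The existence of $M_\alpha$ implies
the existence of $M_\beta$ for all $\beta \le \alpha$. In the case $d = 1$ the inequality
\[
|x|^{2k-1} \le \frac{x^{2k} + x^{2k-2}}{2}, \quad x \in \mathbb{R}, \; k \ge 1
\]
shows that
\[
A_{2k-1} \le \frac{M_{2k} + M_{2k-2}}{2}, \quad k \in \mathbb{N}. \tag{4}
\]
Let $n > k \ge 1$ and suppose that $A_n$ exists. By Hölder’s inequality,
\[
A_k = \int_{\mathbb{R}} |x|^k \cdot 1 \, d\mu(x) \le \Bigl(\int_{\mathbb{R}} |x|^{pk} \, d\mu(x)\Bigr)^{1/p}
\]
for every $p > 1$. Setting $p = \frac{n}{k}$ we find
\[
A_k^{1/k} \le A_n^{1/n}, \quad 1 \le k \le n. \tag{5}
\]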
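
Inequality (5) says that $k \mapsto A_k^{1/k}$ is nondecreasing. A quick numerical illustration in Python follows; the four-point distribution and the function name `absolute_moment` are assumptions made only for this sketch.

```python
def absolute_moment(dist, k):
    """Absolute moment A_k of a finitely supported distribution on R."""
    return sum(p * abs(x) ** k for x, p in dist.items())


# Hypothetical distribution {point: probability}; probabilities sum to 1.
dist = {-2: 0.2, 0: 0.5, 1: 0.2, 3: 0.1}

# A_k^(1/k) for k = 1, ..., 5; by (5) this list should be nondecreasing.
roots = [absolute_moment(dist, k) ** (1.0 / k) for k in range(1, 6)]
```

The comparison is done in floating point, so a small tolerance is advisable when checking monotonicity.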

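We will also use the simple but useful inequality
\[
P\bigl(|X - E(X)| \ge \varepsilon \sigma\bigr) \le \frac{1}{\varepsilon^2}, \quad \varepsilon > 0 \tag{6}
\]
where $\sigma = \sqrt{\mathrm{Var}(X)}$. This inequality is called Chebyshev’s inequality; it holds for
every random variable $X$ with finite variance.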
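
Chebyshev's inequality can be verified exactly for a simple distribution. The Python sketch below uses a fair die and the multiplier 5/4, both chosen only for illustration; the function names are hypothetical, and the event $|X - E(X)| \ge \varepsilon\sigma$ is tested in squared form to avoid square roots.

```python
from fractions import Fraction


def mean(dist):
    return sum(p * x for x, p in dist.items())


def variance(dist):
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())


def tail_probability(dist, eps):
    """P(|X - E(X)| >= eps * sigma), using (X - E(X))^2 >= eps^2 * Var(X)."""
    m, var = mean(dist), variance(dist)
    return sum(p for x, p in dist.items() if (x - m) ** 2 >= eps ** 2 * var)


die = {k: Fraction(1, 6) for k in range(1, 7)}
eps = Fraction(5, 4)
tail = tail_probability(die, eps)   # left-hand side of Chebyshev's inequality
bound = 1 / eps ** 2                # right-hand side, 1 / eps^2
```

For the die only the outcomes 1 and 6 lie at least $\tfrac{5}{4}\sigma$ from the mean, so the tail probability stays well below the bound.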

F.2 Convergence of random vectors


A sequence $\{X_n\}$ of random vectors is said to converge in distribution, or converge
weakly, or converge in law to a random vector $X$ if the sequence $\{\mu_n\}$ of their
distributions converges weakly (see Section E.1) to the distribution of $X$. We say that the
sequence $\{X_n\}$ converges in probability to $X$ if for all $\varepsilon > 0$
\[
\lim_{n \to \infty} P(\|X - X_n\| \ge \varepsilon) = 0.
\]
If $g\colon \mathbb{R}^d \to \mathbb{R}^n$ is continuous and $\{X_n\}$ is a sequence of $d$-dimensional random
vectors which converges in distribution (in probability) to $X$, then $\{g(X_n)\}$ converges
in distribution$^4$ (in probability, respectively) to $g(X)$. Finally, $\{X_n\}$ converges to $X$
almost surely if
\[
P\bigl(\{\omega \in \Omega : \lim_{n \to \infty} X_n(\omega) = X(\omega)\}\bigr) = 1.
\]
Note that the set of all $\omega$ for which $\lim X_n(\omega)$ exists belongs to $\mathcal{A}$.
Almost sure convergence implies convergence in probability, which in turn implies
convergence in distribution. The converse implications do not hold in general, but if a
sequence $\{X_n\}$ converges in distribution to a constant random vector, then the sequence
is convergent in probability, as well.

$^4$ See Corollary E.1.16 for the convergence in distribution.

Theorem F.2.1 (zero–one law of Borel–Cantelli). Let $(\Omega, \mathcal{A}, P)$ be a probability
space and $\{A_n\}_{n=1}^{\infty}$ be a sequence of events in $\mathcal{A}$.
(i) If $\sum P(A_n) < \infty$, then the probability that infinitely many of the $A_n$’s occur
is 0.
(ii) If the events are pairwise independent and $\sum P(A_n) = \infty$, then the probability
that infinitely many of them occur is 1.

Lemma F.2.2. Let $\{X_n\}$ and $\{Y_n\}$ be sequences of random vectors such that
\[
\sum P(X_n \ne Y_n) < \infty.
\]
Then $\sum X_n$ is almost surely convergent if and only if $\sum Y_n$ is almost surely convergent
(and they converge to the same limit).
Proof. The lemma follows immediately from the fact that, by the zero–one law of
Borel–Cantelli (see Theorem F.2.1), the probability that the event $[X_n \ne Y_n]$ happens
for infinitely many $n$ is zero.

Theorem F.2.3. Let $X_1, X_2, \ldots$ be random variables and $c_1, c_2, \ldots$ be complex
numbers such that
\[
\sup_{n \in \mathbb{N}} E|X_n| < \infty \quad \text{and} \quad \sum_{n=1}^{\infty} |c_n| < \infty.
\]
Then the series
\[
\sum_{n=1}^{\infty} c_n X_n \tag{1}
\]
converges absolutely with probability one. If in addition $\sup_{n \in \mathbb{N}} E|X_n|^2 < \infty$, then
the series converges in mean square to the same limit.$^5$
Proof. The monotone convergence theorem gives
\[
E \sum_{n=1}^{\infty} |c_n X_n| = \lim_{N \to \infty} E \sum_{n=1}^{N} |c_n X_n|
\le \lim_{N \to \infty} \sum_{n=1}^{N} |c_n| \cdot \sup_n E|X_n| < \infty
\]
from which it follows that $\sum_n |c_n X_n(\omega)|$ is finite with probability one. Thus, the
sequence of partial sums of (1) converges almost surely to a limit $X$.
If $K := \sup_n E|X_n|^2 < \infty$ and $0 < m < n$, then
\[
E\Bigl|\sum_{m < j \le n} c_j X_j\Bigr|^2 = \sum_{m < j \le n} \sum_{m < k \le n} c_j \overline{c_k} \, E(X_j \overline{X_k})
\le K \cdot \Bigl(\sum_{m < j \le n} |c_j|\Bigr)^2.
\]
The right-hand side tends to zero as $m, n \to \infty$ and so by the Cauchy criterion
(cf. D.3.2) the sequence of partial sums of (1) converges in mean square to a limit $Y$.
Since mean square convergence and almost sure convergence imply convergence in
probability, we conclude that $X = Y$ with probability one.

$^5$ Note that $E|X| \le (E|X|^2)^{1/2}$ in view of (F.1.5) and therefore the condition $\sup_n E|X_n|^2 < \infty$
implies that $\sup_n E|X_n| < \infty$.

F.2.4. For a $d$-dimensional random vector $X$ we denote by $\sigma(X)$ the smallest $\sigma$-algebra
containing all events of the form $[X \in B]$, where $B \in \mathcal{B}_d$. Similarly, if $\{X_n\}_1^\infty$ is a
sequence of $d$-dimensional random vectors on the same probability space $(\Omega, \mathcal{A}, P)$,
then $\sigma(X_1, X_2, \ldots)$ denotes the smallest $\sigma$-algebra containing $\sigma(X_n)$ for all $n$. An
event $A \in \mathcal{A}$ is called a tail event (with respect to $\{X_n\}_1^\infty$) if
\[
A \in \bigcap_{n=1}^{\infty} \sigma(X_n, X_{n+1}, \ldots).
\]

Theorem F.2.5 (zero–one law of Kolmogorov). If $A$ is a tail event with respect to a
sequence of independent random vectors, then the probability of $A$ is 0 or 1.

F.3 Products of probability spaces


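Suppose that we are given a nonempty index set $I$ and a family $(\Omega_i, \mathcal{A}_i, P_i)$, $i \in I$,
of probability spaces. We consider the Cartesian product
\[
\Omega := \prod_{i \in I} \Omega_i.
\]
If $I = \mathbb{N}$, then we also use the notation
\[
\Omega = \Omega_1 \times \Omega_2 \times \cdots \quad \text{and} \quad \omega = (\omega_1, \omega_2, \ldots) \in \Omega.
\]
Let $\mathcal{A}$ be the smallest $\sigma$-algebra in $\Omega$ containing all sets $A$ of the form
\[
A = \prod_{i \in I} A_i
\]
where $A_i \in \mathcal{A}_i$ and $A_i = \Omega_i$ for all but a finite number of $i \in I$.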



Theorem F.3.1. There exists a unique probability measure $P$ on $\mathcal{A}$ such that
\[
P(A) = \prod_{i \in I} P_i(A_i)
\]
for all $A \in \mathcal{A}$ of the form $A = \prod_{i \in I} A_i$ where $A_i \in \mathcal{A}_i$ and $A_i = \Omega_i$ for all but a
finite number of $i \in I$.

The probability space $(\Omega, \mathcal{A}, P)$ is called the product of the probability spaces
$(\Omega_i, \mathcal{A}_i, P_i)$, $i \in I$. The measure $P$ is usually written as $P = \prod_{i \in I} P_i$, and, in
case of $I = \mathbb{N}$, also as $P = P_1 \otimes P_2 \otimes \cdots$.
The next result is an immediate corollary of the previous theorem.

Corollary F.3.2. For an arbitrary family $\{\mu_\alpha\}$ of $d$-dimensional distributions there
exists a family $\{X_\alpha\}$ of independent $d$-dimensional random vectors such that $\mu_\alpha$ is the
distribution of $X_\alpha$.

A more general existence theorem than F.3.1 was given by Kolmogorov.

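Theorem F.3.3 (Kolmogorov’s existence theorem). Let $T \ne \emptyset$ be an arbitrary set
and $d \in \mathbb{N}$. For each finite subset $\{t_1, \ldots, t_k\} \subset T$ let $\mu_{t_1, \ldots, t_k}$ be a probability
measure on $(\mathbb{R}^d)^k$. Suppose that these measures satisfy the so-called consistency
conditions:
(i) For all permutations $\pi$ of $\{1, \ldots, k\}$ and all Borel sets $B_1, \ldots, B_k \subset \mathbb{R}^d$
\[
\mu_{t_1, \ldots, t_k}(B_1 \times \cdots \times B_k) = \mu_{t_{\pi(1)}, \ldots, t_{\pi(k)}}(B_{\pi(1)} \times \cdots \times B_{\pi(k)}).
\]
(ii) For all $n \in \mathbb{N}$ and Borel sets $B_1, \ldots, B_k \subset \mathbb{R}^d$
\[
\mu_{t_1, \ldots, t_k}(B_1 \times \cdots \times B_k) = \mu_{t_1, \ldots, t_k, t_{k+1}, \ldots, t_{k+n}}(B_1 \times \cdots \times B_k \times (\mathbb{R}^d)^n).
\]
Then there exist a probability space $(\Omega, \mathcal{A}, P)$ and $d$-dimensional random vectors $X_t$
on $\Omega$ such that $\mu_{t_1, \ldots, t_k}$ is the distribution of the random vector $(X_{t_1}, \ldots, X_{t_k})$.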
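
To see what the two consistency conditions ask for, one can compute them on a toy family of finite-dimensional distributions. The Python sketch below is only an illustration: the state space $\{0, 1\}$, the two-state chain used to define the joint law, and the helper names `marginal`, `measure` and `chain_prob` are all assumptions. Since the lower-dimensional laws are obtained here as marginals of one joint law, consistency holds automatically; the theorem is about the converse direction.

```python
from fractions import Fraction
from itertools import product


def marginal(joint, keep):
    """Law of the coordinates listed in keep (in that order).

    Covers both reordering coordinates (condition (i)) and dropping
    coordinates (condition (ii)) of a joint pmf {tuple: probability}.
    """
    out = {}
    for point, p in joint.items():
        key = tuple(point[i] for i in keep)
        out[key] = out.get(key, 0) + p
    return out


def measure(pmf, boxes):
    """Measure of the rectangle B_1 x ... x B_k, given as a list of sets."""
    return sum(p for point, p in pmf.items()
               if all(x in b for x, b in zip(point, boxes)))


def chain_prob(path):
    """Hypothetical joint law: uniform start, state kept with probability 3/4."""
    p = Fraction(1, 2)
    for a, b in zip(path, path[1:]):
        p *= Fraction(3, 4) if a == b else Fraction(1, 4)
    return p


state_space = {0, 1}
mu_123 = {path: chain_prob(path) for path in product((0, 1), repeat=3)}
mu_12 = marginal(mu_123, (0, 1))   # law of (X_{t1}, X_{t2})
mu_21 = marginal(mu_123, (1, 0))   # law of (X_{t2}, X_{t1})

B1, B2 = {0}, {1}
lhs = measure(mu_12, [B1, B2])
permuted = measure(mu_21, [B2, B1])                 # condition (i)
extended = measure(mu_123, [B1, B2, state_space])   # condition (ii)
```

All three quantities equal $P(X_{t_1} = 0, X_{t_2} = 1)$, which is what conditions (i) and (ii) require.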

Remark F.3.4. Let $\mu_{t_1, \ldots, t_k}$ be as in Theorem F.3.3 and denote by $f_{t_1, \ldots, t_k}$ the
corresponding characteristic function. Using Theorem 1.1.8 and Corollary 1.1.9 we see that
the consistency conditions can be formulated in terms of the characteristic functions
as follows:
(i) For all permutations $\pi$ of $\{1, \ldots, k\}$ and all $x_1, \ldots, x_k \in \mathbb{R}^d$
\[
f_{t_1, \ldots, t_k}(x_1, \ldots, x_k) = f_{t_{\pi(1)}, \ldots, t_{\pi(k)}}(x_{\pi(1)}, \ldots, x_{\pi(k)}).
\]
(ii) For all $n \in \mathbb{N}$ and $x_1, \ldots, x_k \in \mathbb{R}^d$
\[
f_{t_1, \ldots, t_k}(x_1, \ldots, x_k) = f_{t_1, \ldots, t_k, t_{k+1}, \ldots, t_{k+n}}(x_1, \ldots, x_k, 0, \ldots, 0).
\]

F.4 Conditional expectation


Let $(\Omega, \mathcal{A}, P)$ be a probability space and $\mathcal{B} \subset \mathcal{A}$ be a $\sigma$-algebra. For any $P$-integrable
random variable $X$ there exists a $\mathcal{B}$-measurable $P$-integrable random variable $Z$ such
that
\[
\int_B Z \, dP = \int_B X \, dP, \quad B \in \mathcal{B}.
\]
This random variable $Z$, which is $P$-almost everywhere uniquely determined, is called
the conditional expectation of $X$ with respect to $\mathcal{B}$ and is denoted by $E(X \mid \mathcal{B})$ or
$E_{\mathcal{B}}(X)$. If $\mathcal{B} = \sigma(Y)$ where $Y = (Y_1, \ldots, Y_d)$ is a random vector we also write
$E(X \mid Y_1, \ldots, Y_d)$ or $E(X \mid Y)$ instead of $E(X \mid \mathcal{B})$. For an arbitrary
$\sigma(Y_1, \ldots, Y_d)$-measurable random variable $Z$ there exists a real-valued Borel measurable function
$g$ on $\mathbb{R}^d$ such that $Z = g(Y_1, \ldots, Y_d)$ holds $P$-almost everywhere. In particular, if
$Z = E(X \mid Y_1, \ldots, Y_d)$, then
\[
E(X \mid Y_1, \ldots, Y_d) = g(Y_1, \ldots, Y_d). \tag{1}
\]
For this $g$ the notation
\[
E(X \mid Y = y) := g(y), \quad y \in \mathbb{R}^d
\]
is used and $E(X \mid Y = y)$ is called the conditional expectation of $X$ given $[Y = y]$.
$E(X \mid Y = y)$ is determined uniquely up to sets of $\mu_Y$-measure 0 where $\mu_Y$ denotes
the distribution of $Y$. The function $g$ is $\mu_Y$-integrable and for all Borel sets $B \subset \mathbb{R}^d$
we have
\[
\int_B g(y) \, d\mu_Y(y) = \int_{Y^{-1}(B)} X(\omega) \, dP(\omega). \tag{2}
\]
Conversely, if a Borel measurable function $g$ satisfies this equation then (1) holds.
From (2) it is easy to deduce the more general equation
\[
\int_{\mathbb{R}^d} h(y) g(y) \, d\mu_Y(y) = \int_{\Omega} h(Y(\omega)) X(\omega) \, dP(\omega) \tag{3}
\]
which holds for any complex-valued Borel measurable function $h$ on $\mathbb{R}^d$ such that $hg$
is $\mu_Y$-integrable.
The conditional variance of a square integrable random variable $X$ with respect to
a $\sigma$-algebra $\mathcal{B} \subset \mathcal{A}$ is defined by
\[
\mathrm{Var}(X \mid \mathcal{B}) := E\bigl((X - E(X \mid \mathcal{B}))^2 \mid \mathcal{B}\bigr) = E(X^2 \mid \mathcal{B}) - (E(X \mid \mathcal{B}))^2.
\]

In the next two theorems we collect some basic properties of the conditional ex-
pectation. Equations and inequalities for conditional expectations are meant to hold
almost surely.

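Theorem F.4.1. Let $X$ and $Y$ be integrable random variables. The following relations
hold:
(i) $E_{\mathcal{B}}(aX + bY) = aE_{\mathcal{B}}(X) + bE_{\mathcal{B}}(Y)$, $a, b \in \mathbb{R}$ or $\mathbb{C}$;
(ii) if $X \le Y$, then $E_{\mathcal{B}}(X) \le E_{\mathcal{B}}(Y)$;
(iii) if $X$ is $\mathcal{B}$-measurable, then $E_{\mathcal{B}}(X) = X$;
(iv) $E(E_{\mathcal{B}}(X)) = E(X)$;
(v) $|E_{\mathcal{B}}(X)| \le E_{\mathcal{B}}(|X|)$ and $\|E_{\mathcal{B}}(X)\|_1 \le \|X\|_1$;
(vi) if $Z$ is a $\mathcal{B}$-measurable random variable such that $Z \cdot X$ is integrable, then
$E_{\mathcal{B}}(Z \cdot X) = Z \cdot E_{\mathcal{B}}(X)$;
(vii) if $\mathcal{B}_1$ and $\mathcal{B}_2$ are sub-$\sigma$-algebras of $\mathcal{A}$ and $\mathcal{B}_1 \subset \mathcal{B}_2$, then
$E_{\mathcal{B}_2}(E_{\mathcal{B}_1}(X)) = E_{\mathcal{B}_1}(E_{\mathcal{B}_2}(X)) = E_{\mathcal{B}_1}(X)$;
(viii) if $\mathcal{B}$ and $\sigma(X)$ are independent,$^6$ then $E_{\mathcal{B}}(X) = E(X)$.
By the previous theorem, $E_{\mathcal{B}}$ is a linear operator from $L^1(P)$ into $L^1(P)$. This
operator is nonnegative, contractive and idempotent; the $\mathcal{B}$-measurable random variables
are fixed points of $E_{\mathcal{B}}$.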
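
For discrete random variables the conditional expectation $E(X \mid Y = y)$ is a weighted average, and the defining relation (F.4.2) as well as property (iv) can be checked by hand. The Python sketch below does this for a small joint distribution; the numbers and the function name `conditional_expectation` are assumptions made for illustration only.

```python
from fractions import Fraction


def conditional_expectation(joint):
    """Return g with g[y] = E(X | Y = y) for a joint pmf {(x, y): prob}."""
    mass, weighted = {}, {}
    for (x, y), p in joint.items():
        mass[y] = mass.get(y, 0) + p              # P(Y = y)
        weighted[y] = weighted.get(y, 0) + p * x  # E(X; Y = y)
    return {y: weighted[y] / mass[y] for y in mass if mass[y] > 0}


# Hypothetical joint distribution of (X, Y).
joint = {
    (1, 0): Fraction(1, 8), (3, 0): Fraction(3, 8),
    (2, 1): Fraction(1, 4), (6, 1): Fraction(1, 4),
}
g = conditional_expectation(joint)

# Property (iv): E(E(X | Y)) = E(X).
e_x = sum(p * x for (x, y), p in joint.items())
e_g = sum(p * g[y] for (x, y), p in joint.items())

# Relation (2) with B = {0}: integral of g over B equals integral of X over [Y in B].
left = g[0] * sum(p for (x, y), p in joint.items() if y == 0)
right = sum(p * x for (x, y), p in joint.items() if y == 0)
```

The dictionary `g` plays the role of the function $g$ in (F.4.1); it is only defined on values of $Y$ with positive probability, matching the fact that $g$ is determined up to sets of $\mu_Y$-measure 0.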

Lemma F.4.2. Let $X_0$ be an integrable random variable, $X = (X_1, \ldots, X_d)$ be
a random vector and $g$ be a real-valued Borel measurable function on $\mathbb{R}^d$. The
equation
\[
E(X_0 \mid X) = g(X) \tag{1}
\]
holds if and only if
\[
E\bigl(e^{i(t,X)} \cdot X_0\bigr) = E\bigl(e^{i(t,X)} \cdot g(X)\bigr), \quad t \in \mathbb{R}^d. \tag{2}
\]

Proof. Multiplying both sides of (1) by $e^{i(t,X)}$ and taking expectations yields
\[
E\bigl(e^{i(t,X)} \cdot E(X_0 \mid X)\bigr) = E\bigl(e^{i(t,X)} \cdot g(X)\bigr).
\]
Equation (2) now follows from (F.4.1.vi) and (F.4.1.iv).
For the proof of the other direction, let $\mu$ denote the distribution of $X$. By (F.4.3)
equation (2) can be rewritten as
\[
\int_{\mathbb{R}^d} e^{i(t,x)} \cdot E(X_0 \mid X = x) \, d\mu(x) = \int_{\mathbb{R}^d} e^{i(t,x)} \cdot g(x) \, d\mu(x), \quad t \in \mathbb{R}^d.
\]
By the uniqueness theorem for the Fourier–Stieltjes transform (cf. Theorem 1.8.3) we
have $E(X_0 \mid X = x) \, d\mu(x) = g(x) \, d\mu(x)$. Thus, $E(X_0 \mid X = x) = g(x)$ for $\mu$-almost
all $x$ from which (1) follows.

$^6$ Two subsets $\mathcal{B}_1$ and $\mathcal{B}_2$ of $\mathcal{A}$ are called independent if the events $B_1$ and $B_2$ are independent for
arbitrary $B_1 \in \mathcal{B}_1$, $B_2 \in \mathcal{B}_2$.
Bibliography

[1] N. I. Akhiezer, The Classical Moment Problem, Oliver & Boyd, Edinburgh, 1965.
[2] H. Bauer, Measure and Integration Theory, De Gruyter, Berlin-New York, 2001.
[3] C. Berg, J. P. R. C. Christensen, and P. Ressel, Harmonic Analysis on Semigroups,
Springer, Berlin, 1984.
[4] G. Berschneider, Spectral representation of intrinsically stationary fields, Stoch. Proc.
Appl. 122 (2012), 3837–3851.
[5] G. Berschneider and Z. Sasvári, On a theorem of Karhunen and related moment prob-
lems and quadrature formulae, in: Spectral Theory, Mathematical System Theory, Evolu-
tion Equations, Differential and Difference Equations. 21st International Workshop on
Operator Theory and Applications. Berlin, July 2010, Operator Theory: Advances and
Applications 221, pp. 171–185, Springer, Basel, 2012.
[6] T. M. Bisgaard and Z. Sasvári, Characteristic Functions and Moment Sequences. Posi-
tive Definiteness in Probability, Nova Science Publishers, Huntington, 2000.
[7] T. M. Bisgaard and Z. Sasvári, When does E.X k  X j / D E.X k /  E.X j / imply inde-
pendence?, Statist. Probab. Lett. 76 (2006), 1111–1116.
[8] L. E. Blumenson, A derivation of n-dimensional spherical coordinates, Amer. Math.
Monthly 67 (1960), 63–66.
[9] J.-P. Chilès and P. Delfiner, Geostatistics. Modelling Spatial Uncertainty, Wiley, New
York, 1999.
[10] B. De Finetti, Theory of Probability. Vol. 2, Wiley, New York, 1975.
[11] K. de Leeuw and I. Glicksberg, The decomposition of certain group representations,
J. Anal. Math. 15 (1965), 135–192.
[12] F. Derrien, Strictly positive definite functions on the real line, HAL: Hyper Articles
en Ligne, hal-00519325, version 1 (2010). http://hal-univ-artois.archives-ouvertes.fr/docs/00/51/93/25/PDF/SPD.pdf.
[13] R. E. Edwards, Fourier Series. A Modern Introduction. Vol. 1, Springer, New York-
Heidelberg-Berlin, 1979.
[14] A. V. Efimov, An analogue of Rudin’s theorem for continuous radial positive definite
functions of several variables (in Russian), Trudi Instituta Matematiki i Mehaniki UrO
RAN 18 (2012), 1–8.
[15] W. Ehm, T. Gneiting, and D. Richards, Convolution roots of radial positive definite
functions with compact support, Trans. Amer. Math. Soc. 356 (2004), 4655–4685.

[16] W. Feller, An Introduction to Probability Theory and its Applications. Vol. 1, 3rd edn.,
Wiley, New York, 1968.
[17] W. Feller, An Introduction to Probability Theory and its Applications, Vol. 2, 2nd edn.,
Wiley, New York, 1971.
[18] G. B. Folland, Real Analysis, Wiley, New York, 1984.
[19] I. I. Gikhman and I. V. Skorokhod, The Theory of Stochastic Processes. I, translated
from the Russian by S. Kotz. Corrected printing of the 1st edn. Classics in Mathematics,
Springer, Berlin, 2004.
[20] T. Gneiting, Radial positive definite functions generated by Euclid’s hat, J. Multivar.
Anal. 69 (1999), 88–119.
[21] T. Gneiting, Decomposition theorems for $\alpha$-symmetric positive definite functions, Math.
Nachr. 208 (1999), 117–120.
[22] T. Gneiting, Criteria of Pólya type for radial positive definite functions, Proc. Amer.
Math. Soc. 129 (2001), 2309–2318.
[23] T. Gneiting, Strictly and non-strictly positive definite functions on spheres, Cornell Uni-
versity Library, arXiv:1111.7077v4 [math.PR] (2012), pp. 1–30.
[24] N. Goloshchapova, M. Malamud, and V. Zastavnyi, Radial positive definite functions
and spectral theory of the Schrödinger operators with point interactions, Math. Nachr.
285 (2012), 1839–1859.
[25] V. I. Gorbachuk and M. L. Gorbachuk, M.G. Krein’s Lectures on Entire Operators,
Birkhäuser, Basel-Boston-Berlin, 1997.
[26] L. Grafakos and G. Teschl, On Fourier transforms of radial functions and distributions,
Cornell University Library, arXiv:1112.5469 [math.CA] (2012), pp. 1–12.
[27] E. Hewitt and K. A. Ross, Abstract Harmonic Analysis. I, Springer, Berlin, 1963.
[28] E. Hewitt and K. Stromberg, Real and Abstract Analysis, Springer, New York-Heidel-
berg-Berlin, 1965.
[29] N. J. Higham, Functions of Matrices. Theory and Computation, SIAM, Philadelphia,
2008.
[30] B. Jessen, Abstrakt Maal- og Integraltheori, Ch. Johansens Bogtrykkeri, København,
1947.
[31] B. Jessen and A. Wintner, Distribution functions and the Riemann zeta function, Trans.
Amer. Math. Soc. 38 (1935), 48–88.
[32] W. P. Johnson, The curious history of Faà di Bruno’s formula, Amer. Math. Monthly
109 (2002), 217–234.
[33] A. M. Kagan, Yu. V. Linnik, and C. R. Rao, Characterization Problems in Mathematical
Statistics, Wiley, New York-London-Sydney-Toronto, 1973.
[34] K. Karhunen, Über lineare Methoden in der Wahrscheinlichkeitsrechnung, Ann. Acad.
Sci. Fenn. Ser. A I 37 (1947), 1–79.

[35] C. Kleiber and J. Stoyanov, Multivariate distributions and the moment problem, J. Mul-
tivar. Anal. 113 (2013), 7–18.
[36] M. G. Krein, On the integral representation of a continuous Hermitian-indefinite func-
tion with a finite number of negative squares. (Russian), Doklady Akademii Nauk SSSR,
Serija matematika 125 (1959), 31–34.
[37] R. G. Laha and V. K. Rohatgi, Probability Theory, Wiley, New York–Chichester–
Brisbane–Toronto, 1979.
[38] H. Langer, M. Langer, and Z. Sasvári, Continuations of Hermitian indefinite functions
and corresponding canonical systems: an example, Meth. Func. Anal. Top. 10 (2004),
39–53.
[39] G. Letac and Q. I. Rahman, A factorisation of the Askey’s characteristic function
$(1 - \|t\|_{2n+1})_+^{n+1}$, Ann. Inst. Henri Poincaré, Sect. B 22 (1986), 169–174.
[40] P. Lévy, Extensions d’un théorème de D. Dugué et M. Girault, Z. Wahrscheinlichkeits-
theorie Verw. Geb. 1 (1962), 159–173.
[41] W. Maak, Fastperiodische Funktionen, Springer, Berlin, 1950.
[42] G. Matheron, The intrinsic random functions and their applications, Adv. Appl. Prob. 5
(1973), 439–468.
[43] G. Pólya, Remarks on characteristic functions, Proc. Berkeley Symp. Math. Stat. Probab.
1 (1949), 115–123.
[44] G. Pólya and G. Szegö, Problems and Theorems in Analysis, Springer, Berlin-Heidel-
berg-New York, 1972.
[45] A. Rényi, Foundations of Probability, Holden-Day, San Francisco, 1970.
[46] W. Rudin, Real and Complex Analysis, McGraw-Hill, New York, 1966.
[47] W. Rudin, Principles of Mathematical Analysis, 3rd edn., McGraw-Hill, New York,
1976.
[48] Z. Sasvári, On common fixed points of linear contractions, Proc. Amer. Math. Soc. 108
(1990), 565–566.
[49] Z. Sasvári, Positive Definite and Definitizable Functions, Akademie Verlag, Berlin,
1994.
[50] Z. Sasvári, On norm dependent positive definite functions, Monatsh. Math. 120 (1995),
319–325.
[51] Z. Sasvári, On the number of negative squares of certain functions, in: Contributions
to operator theory in spaces with an indefinite metric: The Heinz Langer anniversary
volume, Operator Theory: Advances and Applications 106, pp. 337–353, Birkhäuser,
Basel, 1998.
[52] Z. Sasvári, On generalized correlation functions of intrinsically stationary processes of
order k, Statist. Probab. Lett. 78 (2008), 3381–3387.
[53] Z. Sasvári, Correlation functions of intrinsically stationary random fields, in: Modern
Analysis and Applications: The Mark Krein Centenary Conference. Vol. 1, Operator
Theory: Advances and Applications 190, pp. 451–470, Birkhäuser, Basel–Berlin, 2009.

[54] R. Schaback and Z. Wu, Operators on radial functions, J. Comput. Appl. Math. 73
(1996), 257–270.
[55] J. A. Shohat and J. D. Tamarkin, The Problem of Moments, American Math. Soc., Prov-
idence, 1943, published in 1950.
[56] E. C. Titchmarsh, The Theory of Functions, 2nd edn., Oxford University Press, London,
1939.
[57] N. G. Ushakov, Selected Topics in Characteristic Functions, VSP, Utrecht, 1999.
[58] H. Wendland, Scattered Data Approximation, Cambridge University Press, Cambridge,
2005.
[59] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, 4th edn. (repr. 1927),
Cambridge University Press, Cambridge, 1999.
[60] D. V. Widder, The Laplace Transform, Princeton University Press, Princeton, 1946.
[61] J. Yeh, Lectures on Real Analysis, World Scientific Publishing, Singapore, 2000.