Stanf215 Notes

This document provides an introduction to complex analysis by describing complex numbers and their algebraic properties. It defines complex numbers as ordered pairs of real numbers and introduces addition and multiplication rules that make the set of complex numbers a field. The document also presents polar and Cartesian representations of complex numbers and discusses the Riemann sphere model which compactifies the complex plane by adding a point at infinity.

Math 215 Complex Analysis

Lenya Ryzhik copy pasting from others

November 25, 2013

1 The Holomorphic Functions


We begin with the description of complex numbers and their basic algebraic properties.
We will assume that the reader had some previous encounters with the complex numbers
and will be fairly brief, with the emphasis on some specifics that we will need later.

1.1 The Complex Plane


1.1.1 The complex numbers
We consider the set C of pairs of real numbers (x, y), or equivalently of points on the
plane R^2. Two vectors z1 = (x1 , y1 ) and z2 = (x2 , y2 ) are equal if and only if x1 = x2
and y1 = y2 . Two vectors z = (x, y) and z̄ = (x, −y) that are symmetric to each other
with respect to the x-axis are said to be complex conjugate to each other. We identify
the vector (x, 0) with a real number x. We denote by R the set of all real numbers (the
x-axis).
We introduce now the operations of addition and multiplication on C that turn it
into a field. The sum of two complex numbers and multiplication by a real number
λ ∈ R are defined in the same way as in R2 :

(x1 , y1 ) + (x2 , y2 ) = (x1 + x2 , y1 + y2 ), λ(x, y) = (λx, λy).

Then we may write each complex number z = (x, y) as

z = x · 1 + y · i = x + iy, (1.1)

where we denoted the two unit vectors in the directions of the x and y-axes by 1 = (1, 0)
and i = (0, 1).
You have previously encountered two ways of defining a product of two vectors:
the inner product (z1 · z2 ) = x1 x2 + y1 y2 and the skew product [z1 , z2 ] = x1 y2 − x2 y1 .
However, neither of them turns C into a field, and, actually, C is not even closed under these
operations: both the inner product and the skew product of two vectors are numbers, not
vectors. This leads us to introduce yet another product on C. Namely, we postulate
that $i \cdot i = i^2 = -1$ and define z1 z2 as the vector obtained by multiplying x1 + iy1 and
x2 + iy2 using the usual rules of algebra with the additional convention that $i^2 = -1$.
That is, we define
z1 z2 = x1 x2 − y1 y2 + i(x1 y2 + x2 y1 ). (1.2)
More formally we may write

(x1 , y1 )(x2 , y2 ) = (x1 x2 − y1 y2 , x1 y2 + x2 y1 )

but we will not use this somewhat cumbersome notation.

Exercise 1.1 Check that the product (1.2) turns C into a field, that is, the distributive,
commutative and associative laws hold, and for any $z \neq 0$ there exists a number $z^{-1}$ ∈ C
so that $z z^{-1} = 1$. Hint: $z^{-1} = \dfrac{x}{x^2 + y^2} - \dfrac{iy}{x^2 + y^2}$.

Exercise 1.2 Show that the following operations do not turn C into a field: (a) z1 z2 =
x1 x2 + iy1 y2 , and (b) z1 z2 = x1 x2 + y1 y2 + i(x1 y2 + x2 y1 ).
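
As a quick numerical sanity check of the product rule (1.2), the following Python sketch treats complex numbers as ordered pairs of reals. The helper names `pair_mul` and `pair_inv` are illustrative choices, not notation from these notes, and the comparison with Python's built-in `complex` type is only a check on sample values, not a proof of the field axioms.

```python
# Complex numbers as ordered pairs (x, y); helper names are illustrative only.

def pair_mul(z1, z2):
    """Product (1.2): (x1*x2 - y1*y2, x1*y2 + x2*y1)."""
    x1, y1 = z1
    x2, y2 = z2
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)

def pair_inv(z):
    """Inverse of a nonzero pair, following the hint to Exercise 1.1."""
    x, y = z
    n = x * x + y * y
    if n == 0:
        raise ZeroDivisionError("the pair (0, 0) has no inverse")
    return (x / n, -y / n)

i = (0.0, 1.0)
print(pair_mul(i, i))                        # the pair representing i^2

# Compare with Python's built-in complex arithmetic on sample values.
p = pair_mul((1.0, 2.0), (3.0, -4.0))
print(complex(*p), (1 + 2j) * (3 - 4j))
```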

The product (1.2) turns C into a field (see Exercise 1.1) that is called the field of complex
numbers and its elements, vectors of the form z = x + iy are called complex numbers.
The real numbers x and y are traditionally called the real and imaginary parts of z and
are denoted by
x = Re z, y = Im z. (1.3)
A number z = (0, y) that has the real part equal to zero, is called purely imaginary.
The Cartesian way (1.1) of representing a complex number is convenient for per-
forming the operations of addition and subtraction, but one may see from (1.2) that
multiplication and division in the Cartesian form are quite tedious. These operations,
as well as raising a complex number to a power are much more convenient in the polar
representation of a complex number:

z = r(cos φ + i sin φ), (1.4)

that is obtained from (1.1) by passing to the polar coordinates for (x, y). The polar coordinates
of a complex number z are the polar radius $r = \sqrt{x^2 + y^2}$ and the polar angle φ,
the angle between the vector z and the positive direction of the x-axis. They are called
the modulus and argument of z and are denoted by

r = |z|, φ = Arg z. (1.5)

The modulus is determined uniquely while the argument is determined up to addition
of a multiple of 2π. We will use the shorthand notation

$\cos\varphi + i\sin\varphi = e^{i\varphi}$. (1.6)

Note that we have not yet defined the operation of raising a number to a complex power,
so the right side of (1.6) should be understood at the moment just as a shorthand for
the left side. We will define this operation later and will show that (1.6) indeed holds.
With this convention the polar form (1.4) takes the short form

$z = r e^{i\varphi}$. (1.7)

Using the basic trigonometric identities we observe that

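$r_1 e^{i\varphi_1}\, r_2 e^{i\varphi_2} = r_1(\cos\varphi_1 + i\sin\varphi_1)\, r_2(\cos\varphi_2 + i\sin\varphi_2)$ (1.8)
$= r_1 r_2 \bigl(\cos\varphi_1\cos\varphi_2 - \sin\varphi_1\sin\varphi_2 + i(\cos\varphi_1\sin\varphi_2 + \sin\varphi_1\cos\varphi_2)\bigr)$
$= r_1 r_2 \bigl(\cos(\varphi_1 + \varphi_2) + i\sin(\varphi_1 + \varphi_2)\bigr) = r_1 r_2 e^{i(\varphi_1 + \varphi_2)}$.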

This explains why notation (1.6) is quite natural. Relation (1.8) says that the modulus
of the product is the product of the moduli, while the argument of the product is the
sum of the arguments.
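
The rule that moduli multiply and arguments add can be checked numerically. The sketch below uses Python's standard `cmath` module; the function name `polar_product` and the sample values are assumptions made only for illustration.

```python
import cmath
import math

def polar_product(r1, phi1, r2, phi2):
    """Rule (1.8): moduli multiply, arguments add."""
    return r1 * r2, phi1 + phi2

# Sample values, chosen arbitrarily.
r1, phi1 = 2.0, math.pi / 3
r2, phi2 = 0.5, math.pi / 4

direct = cmath.rect(r1, phi1) * cmath.rect(r2, phi2)   # Cartesian multiplication
r, phi = polar_product(r1, phi1, r2, phi2)
via_polar = cmath.rect(r, phi)                          # r * e^{i*phi}

print(abs(direct - via_polar))     # difference at rounding-error level
```

Note that `cmath.phase` returns the argument reduced to the interval (−π, π], which reflects the fact that the argument is only determined up to a multiple of 2π.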
Sometimes it is convenient to consider a compactification of the set C of complex
numbers. This is done by adding an ideal element that is called the point at infinity
z = ∞. However, algebraic operations are not defined for z = ∞. We will call the
compactified complex plane, that is, the plane C together with the point at infinity, the
closed complex plane, denoted by C̄. Sometimes we will call C the open complex plane
in order to stress the difference between C and C̄.
One can make the compactification more visual if we represent the complex numbers
as points not on the plane but on a two-dimensional sphere as follows. Let ξ, η and ζ
be the Cartesian coordinates in the three-dimensional space so that the ξ and η-axes
coincide with the x and y-axes on the complex plane. Consider the unit sphere

$S : \xi^2 + \eta^2 + \zeta^2 = 1$ (1.9)

in this space. Then for each point z = (x, y) ∈ C we may find a corresponding point
Z = (ξ, η, ζ) on the sphere, namely the point other than N where S meets the ray that starts at
the “North pole” N = (0, 0, 1) and passes through the point z = (x, y, 0) on the complex plane.
The mapping z → Z is called the stereographic projection. The ray N z may
be parameterized as ξ = tx, η = ty, ζ = 1 − t, t ≥ 0. Then the intersection point is
$Z = (t_0 x, t_0 y, 1 - t_0)$, with $t_0 \neq 0$ being the solution of

$t_0^2 x^2 + t_0^2 y^2 + (1 - t_0)^2 = 1$,

so that $(1 + |z|^2)\, t_0 = 2$. Therefore the point Z has the coordinates


$\xi = \dfrac{2x}{1 + |z|^2}, \quad \eta = \dfrac{2y}{1 + |z|^2}, \quad \zeta = \dfrac{|z|^2 - 1}{1 + |z|^2}$. (1.10)

The last equation above implies that $\dfrac{2}{1 + |z|^2} = 1 - \zeta$. We find from the first two
equations the explicit formulae for the inverse map Z → z:

$x = \dfrac{\xi}{1 - \zeta}, \quad y = \dfrac{\eta}{1 - \zeta}$. (1.11)

Expressions (1.10) and (1.11) show that the stereographic projection is a one-to-one
map from C to S\N (clearly N does not correspond to any point z). We postulate that
N corresponds to the point at infinity z = ∞. This makes the stereographic projection
a one-to-one map from C̄ to S. We will usually identify C̄ and the sphere S. The
latter is called the sphere of complex numbers or the Riemann sphere. The open plane
C may be identified with S\N , the sphere with the North pole deleted.
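
The formulas (1.10) and (1.11) translate directly into code. The following Python sketch is a minimal illustration; the function names `to_sphere` and `from_sphere` are assumptions, not notation from these notes, and the point at infinity is deliberately left out of the forward map.

```python
def to_sphere(z):
    """Formula (1.10): map a finite complex number z to (xi, eta, zeta) on S."""
    s = abs(z) ** 2
    return (2 * z.real / (1 + s), 2 * z.imag / (1 + s), (s - 1) / (1 + s))

def from_sphere(point):
    """Formula (1.11): inverse map, defined away from the North pole."""
    xi, eta, zeta = point
    if zeta == 1:
        raise ValueError("the North pole corresponds to z = infinity")
    return complex(xi / (1 - zeta), eta / (1 - zeta))

z = 1 + 2j
Z = to_sphere(z)
print(Z, sum(c * c for c in Z))    # the coordinates and xi^2 + eta^2 + zeta^2
print(from_sphere(Z))              # recovers z up to rounding
```

Points with |z| < 1 land on the lower hemisphere (ζ < 0), points with |z| > 1 on the upper one, and z = 0 is sent to the “South pole” (0, 0, −1).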

Exercise 1.3 Let t and u be the longitude and the latitude of a point Z. Show that the
corresponding point is $z = s e^{it}$, where s = tan(π/4 + u/2).

We may introduce two metrics (distances) on C according to the two geometric descrip-
tions presented above. The first is the usual Euclidean metric with the distance between
the points z1 = x1 + iy1 and z2 = x2 + iy2 in C given by
$|z_2 - z_1| = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. (1.12)

The second is the spherical metric with the distance between z1 and z2 defined as the
Euclidean distance in the three-dimensional space between the corresponding points Z1
and Z2 on the sphere. A straightforward calculation shows that
$\rho(z_1, z_2) = \dfrac{2|z_2 - z_1|}{\sqrt{1 + |z_1|^2}\,\sqrt{1 + |z_2|^2}}$. (1.13)

This formula may be extended to C̄ by setting

$\rho(z, \infty) = \dfrac{2}{\sqrt{1 + |z|^2}}$. (1.14)
Note that (1.14) may be obtained from (1.13) if we let z1 = z, divide the numerator and
denominator by |z2 | and let |z2 | → +∞.
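
The spherical distance formulas can be compared numerically with the Euclidean distance between the projected points in three-dimensional space. The Python sketch below is only a spot check on sample values (it does not replace the calculation behind (1.13)); the function names are assumptions introduced for illustration.

```python
import math

def to_sphere(z):
    """Stereographic projection of a finite point z, as in (1.10)."""
    s = abs(z) ** 2
    return (2 * z.real / (1 + s), 2 * z.imag / (1 + s), (s - 1) / (1 + s))

def chordal(z1, z2):
    """Spherical distance (1.13) between two finite points."""
    return 2 * abs(z2 - z1) / (math.sqrt(1 + abs(z1) ** 2) * math.sqrt(1 + abs(z2) ** 2))

def chordal_to_infinity(z):
    """Spherical distance (1.14) from a finite point to infinity."""
    return 2 / math.sqrt(1 + abs(z) ** 2)

def euclidean3(p, q):
    """Euclidean distance between two points of three-dimensional space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

NORTH_POLE = (0.0, 0.0, 1.0)
z1, z2 = 1 + 2j, -0.5 + 0.3j       # arbitrary sample points
print(chordal(z1, z2), euclidean3(to_sphere(z1), to_sphere(z2)))
print(chordal_to_infinity(z1), euclidean3(to_sphere(z1), NORTH_POLE))
```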

Exercise 1.4 Use the formula (1.10) for the stereographic projection to verify (1.13).

Clearly we have ρ(z1 , z2 ) ≤ 2 for all z1 , z2 ∈ C. It is straightforward to verify that both
of the metrics introduced above turn C into a metric space, that is, all the usual axioms
of a metric space are satisfied. In particular, the triangle inequality for the Euclidean
metric (1.12) is equivalent to the usual triangle inequality for the two-dimensional plane:
|z1 + z2 | ≤ |z1 | + |z2 |.

Exercise 1.5 Verify the triangle inequality for the metric ρ(z1 , z2 ) on C̄ defined by
(1.13) and (1.14).

We note that the Euclidean and spherical metrics are equivalent on bounded sets M ⊂ C
that lie inside a fixed disk {|z| ≤ R}, R < ∞. Indeed, if M ⊂ {|z| ≤ R} then (1.13)
implies that for all z1 , z2 ∈ M we have
$\dfrac{2}{1 + R^2}\,|z_2 - z_1| \le \rho(z_1, z_2) \le 2|z_2 - z_1|$ (1.15)

(this will be elaborated in the next section). Because of that the spherical metric is
usually used only for unbounded sets. Typically, we will use the Euclidean metric for C
and the spherical metric for C̄.
Now is the time for a little history. We find the first mention of the complex numbers as
square roots of negative numbers in the book “Ars Magna” by Girolamo Cardano published in
1545. He thought that such numbers could be introduced in mathematics but opined that this
would be useless: “Dismissing mental tortures, and multiplying $5 + \sqrt{-15}$ by $5 - \sqrt{-15}$, we
obtain 25 − (−15). Therefore the product is 40. .... and thus far does arithmetical subtlety go,
of which this, the extreme, is, as I have said, so subtle that it is useless.” The baselessness of
his verdict was realized fairly soon: Raphael Bombelli published his “Algebra” in 1572 where
he introduced the algebraic operations over the complex numbers and explained how they
may be used for solving cubic equations. One may find in Bombelli’s book the relation
$(2 + \sqrt{-121})^{1/3} + (2 - \sqrt{-121})^{1/3} = 4$. Still, the complex numbers remained somewhat of a
mystery for a long time. Leibniz considered them to be “a beautiful and majestic refuge of
the human spirit”, but he also thought that it was impossible to factor $x^4 + 1$ into a product of
two quadratic polynomials (though this is done in an elementary way with the help of complex
numbers).
The active use of complex numbers in mathematics began with the works of Leonhard
Euler. He also discovered the relation $e^{i\varphi} = \cos\varphi + i\sin\varphi$. The geometric interpretation
of complex numbers as planar vectors appeared first in the work of the Danish geographical
surveyor Caspar Wessel in 1799 and later in the work of Jean Robert Argand in 1806. These
papers were not widely known; even Cauchy, who obtained numerous fundamental results
in complex analysis, considered early in his career the complex numbers simply as symbols
that were convenient for calculations, and equality of two complex numbers as a shorthand
notation for equality of two real-valued variables.
The first systematic description of complex numbers, operations over them, and their
geometric interpretation was given by Carl Friedrich Gauss in 1831 in his memoir “Theoria
residuorum biquadraticorum”. He also introduced the name “complex numbers”.

1.2 The topology of the complex plane


We have introduced distances on C and C̄ that turned them into metric spaces. We will
now introduce the two topologies that correspond to these metrics.
Let ε > 0. Then an ε-neighborhood U (z0 , ε) of z0 ∈ C in the Euclidean metric is the
disk of radius ε centered at z0 , that is, the set of points z ∈ C that satisfy the inequality

|z − z0 | < ε. (1.16)

An ε-neighborhood of a point z0 ∈ C̄ is the set of all points z ∈ C̄ such that

ρ(z, z0 ) < ε. (1.17)


Expression (1.14) shows that the inequality ρ(z, ∞) < ε is equivalent to $|z| > \sqrt{\dfrac{4}{\varepsilon^2} - 1}$.
Therefore an ε-neighborhood of the point at infinity is the outside of a disk centered at
the origin, together with the point z = ∞.

We say that a set Ω in C (or C̄) is open if for any point z0 ∈ Ω there exists a
neighborhood of z0 that is contained in Ω. It is straightforward to verify that this
notion of an open set turns C and C̄ into topological spaces, that is, the usual axioms of
a topological space are satisfied.
Sometimes it will be convenient to make use of the so-called punctured neighborhoods,
that is, the sets of the points z ∈ C (or z ∈ C̄) that satisfy

0 < |z − z0 | < ε, 0 < ρ(z, z0 ) < ε. (1.18)

1.3 Paths and curves


Definition 1.6 A path γ is a continuous map of an interval [α, β] of the real axis into
the complex plane C (or C̄). In other words, a path is a complex valued function z = γ(t)
of a real argument t, that is continuous at every point t0 ∈ [α, β] in the following sense:
for any ε > 0 there exists δ > 0 so that |γ(t) − γ(t0 )| < ε (or ρ(γ(t), γ(t0 )) < ε if
γ(t0 ) = ∞) provided that |t − t0 | < δ. The points a = γ(α) and b = γ(β) are called the
endpoints of the path γ. The path is closed if γ(α) = γ(β). We say that a path γ lies in
a set M if γ(t) ∈ M for all t ∈ [α, β].

Sometimes it is convenient to distinguish between a path and a curve. In order to
introduce the latter we say that two paths

γ1 : [α1 , β1 ] → C and γ2 : [α2 , β2 ] → C

are equivalent (γ1 ∼ γ2 ) if there exists an increasing continuous function

τ : [α1 , β1 ] → [α2 , β2 ] (1.19)

such that τ (α1 ) = α2 , τ (β1 ) = β2 and so that γ1 (t) = γ2 (τ (t)) for all t ∈ [α1 , β1 ].

Exercise 1.7 Verify that the relation ∼ is reflexive: γ ∼ γ, symmetric: if γ1 ∼ γ2 , then
γ2 ∼ γ1 , and transitive: if γ1 ∼ γ2 and γ2 ∼ γ3 then γ1 ∼ γ3 .

Example 1.8 Let us consider the paths γ1 (t) = t, t ∈ [0, 1]; γ2 (t) = sin t, t ∈ [0, π/2];
γ3 (t) = cos t, t ∈ [0, π/2] and γ4 (t) = sin t, t ∈ [0, π]. The set of values of γj (t) is always
the same: the interval [0, 1]. However, we only have γ1 ∼ γ2 . These two paths trace
[0, 1] from left to right once. The paths γ3 and γ4 are neither equivalent to these two,
nor to each other: the interval [0, 1] is traced in a different way by those paths: γ3 traces
it from right to left, while γ4 traces [0, 1] twice.

Exercise 1.9 Which of the following paths: a) $e^{2\pi i t}$, t ∈ [0, 1]; b) $e^{4\pi i t}$, t ∈ [0, 1]; c)
$e^{-2\pi i t}$, t ∈ [0, 1]; d) $e^{4\pi i \sin t}$, t ∈ [0, π/6] are equivalent to each other?

Definition 1.10 A curve is an equivalence class of paths. Sometimes, when this
causes no confusion, we will use the word ‘curve’ to describe a set γ ⊂ C that may be
represented as an image of an interval [α, β] under a continuous map z = γ(t).

Below we will introduce some restrictions on the curves and paths that we will consider.
We say that γ : [α, β] → C is a Jordan path if the map γ is continuous and one-to-one.
The definition of a closed Jordan path is left to the reader as an exercise.
A path γ : [α, β] → C (γ(t) = x(t) + iy(t)) is continuously differentiable if the derivative
$\gamma'(t) := x'(t) + i y'(t)$ exists and is continuous for all t ∈ [α, β]. A continuously differentiable path is said
to be smooth if $\gamma'(t) \neq 0$ for all t ∈ [α, β]. This condition is introduced in order to avoid
singularities. A path is called piecewise smooth if γ(t) is continuous on [α, β], and [α, β]
may be divided into a finite number of closed sub-intervals so that the restriction of γ(t)
on each of them is a smooth path.
We will also use the standard notation to describe smoothness of functions and
paths: the class of continuous functions is denoted $C$, or $C^0$, the class of continuously
differentiable functions is denoted $C^1$, etc. A function that has n continuous derivatives
is said to be a $C^n$-function.

Example 1.11 The paths γ1 , γ2 and γ3 of the previous example are Jordan, while γ4 is
not Jordan. The circle $z = e^{it}$, t ∈ [0, 2π] is a closed smooth Jordan path; the four-petal
rose $z = e^{it} \cos 2t$, t ∈ [0, 2π] is a smooth non-Jordan path; the semi-cubic parabola
$z = t^2(t + i)$, t ∈ [−1, 1] is a Jordan continuously differentiable piecewise smooth path.
The path $z = t\left(1 + i \sin\dfrac{1}{t}\right)$, t ∈ [−1/π, 1/π] is a Jordan non-piecewise smooth
path.

One may introduce similar notions for curves. A Jordan curve is a class of paths that
are equivalent to some Jordan path (observe that since the change of variables (1.19) is
one-to-one, all paths equivalent to a Jordan path are also Jordan).
The definition of a smooth curve is slightly more delicate: this notion has to be
invariant with respect to a replacement of a path that represents a given curve by an
equivalent one. However, a continuous monotone change of variables (1.19) may map
a smooth path onto a non-smooth one unless we impose some additional conditions on
the functions τ allowed in (1.19).
More precisely, a smooth curve is a class of paths that may be obtained out of a
smooth path by all possible re-parameterizations (1.19) with τ (s) being a continuously
differentiable function with a positive derivative. One may define a piecewise smooth
curve in a similar fashion: the change of variables has to be continuous everywhere, and
in addition have a continuous positive derivative except possibly at a finite set of points.
Sometimes we will use a more geometric interpretation of a curve, and say that a
Jordan, or smooth, or piecewise smooth curve is a set of points γ ⊂ C that may be
represented as the image of an interval [α, β] under a map z = γ(t) that defines a
Jordan, smooth or piecewise smooth path.

1.4 Functions of a complex variable
1.4.1 Differentiability
The notion of differentiability is intricately connected to linear approximations so we
start with the discussion of linear functions of complex variables.
Definition 1.12 A function l : C → C is C-linear, or R-linear, respectively, if
(a) l(z1 + z2 ) = l(z1 ) + l(z2 ) for all z1 , z2 ∈ C,
(b) l(λz) = λl(z) for all λ ∈ C, or, respectively, λ ∈ R.
Thus R-linear functions are linear over the field of real numbers while C-linear are linear
over the field of complex numbers. The latter form a subset of the former.
Let us find the general form of an R-linear function. We let z = x + iy, and use
properties (a) and (b) to write l(z) = xl(1) + yl(i). Let us denote α = l(1) and β = l(i),
and replace x = (z + z̄)/2 and y = (z − z̄)/(2i). We obtain the following theorem.
Theorem 1.13 Any R-linear function has the form

l(z) = az + bz̄, (1.20)

where a = (α − iβ)/2 and b = (α + iβ)/2 are complex valued constants.


Similarly writing z = 1 · z we obtain
Theorem 1.14 Any C-linear function has the form

l(z) = az, (1.21)

where a = l(1) is a complex valued constant.

Theorem 1.15 An R-linear function is C-linear if and only if

l(iz) = il(z). (1.22)

Proof. The necessity of (1.22) follows immediately from the definition of a C-linear
function. Theorem 1.13 implies that l(z) = az + bz̄, so l(iz) = i(az − bz̄). Therefore,
l(iz) = il(z) if and only if
iaz − ibz̄ = iaz + ibz̄.
Therefore if l(iz) = il(z) for all z ∈ C then b = 0 and hence l is C-linear.
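
The decomposition of Theorem 1.13 and the criterion of Theorem 1.15 can be illustrated with a short Python sketch. The function names and the three sample maps below are hypothetical and chosen only for illustration; the code assumes that the map passed in really is R-linear and does not verify this.

```python
def r_linear_coefficients(l):
    """Return (a, b) with l(z) = a*z + b*conj(z), as in Theorem 1.13."""
    alpha, beta = l(1 + 0j), l(1j)
    return (alpha - 1j * beta) / 2, (alpha + 1j * beta) / 2

def is_c_linear(l, tol=1e-12):
    """Theorem 1.15 in coefficient form: l is C-linear exactly when b = 0."""
    _, b = r_linear_coefficients(l)
    return abs(b) <= tol

# Hypothetical sample maps, chosen only for illustration.
conjugation = lambda z: z.conjugate()                    # R-linear, not C-linear
rotation_dilation = lambda z: (2 + 3j) * z               # C-linear
mixed = lambda z: (1 + 1j) * z + 0.5j * z.conjugate()    # R-linear, not C-linear

for name, l in [("conjugation", conjugation),
                ("rotation_dilation", rotation_dilation),
                ("mixed", mixed)]:
    print(name, r_linear_coefficients(l), is_c_linear(l))
```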
We set a = a1 + ia2 , b = b1 + ib2 , and also z = x + iy, w = u + iv. We may represent
an R-linear function w = az + bz̄ as two real equations

u = (a1 + b1 )x − (a2 − b2 )y, v = (a2 + b2 )x + (a1 − b1 )y.

Therefore geometrically an R-linear function is a linear transform of the plane, taking
the vector (x, y) to the vector (u, v), with the matrix

$A = \begin{pmatrix} a_1 + b_1 & -(a_2 - b_2) \\ a_2 + b_2 & a_1 - b_1 \end{pmatrix}$. (1.23)

Its Jacobian is
$J = a_1^2 - b_1^2 + a_2^2 - b_2^2 = |a|^2 - |b|^2$. (1.24)
This transformation is non-singular when $|a| \neq |b|$. It transforms lines into lines, parallel
lines into parallel lines and squares into parallelograms. It preserves the orientation when
|a| > |b| and changes it if |a| < |b|.
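
The matrix (1.23) and the Jacobian formula (1.24) can be spot-checked numerically. In the Python sketch below the helper names `real_matrix` and `det2` and the sample coefficients are assumptions made for illustration.

```python
def real_matrix(a, b):
    """Matrix (1.23) of the map w = a*z + b*conj(z) acting on (x, y)."""
    a1, a2, b1, b2 = a.real, a.imag, b.real, b.imag
    return ((a1 + b1, -(a2 - b2)),
            (a2 + b2, a1 - b1))

def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

a, b = 1 + 2j, 0.5 - 1j       # arbitrary sample coefficients
A = real_matrix(a, b)
print(det2(A), abs(a) ** 2 - abs(b) ** 2)    # both sides of (1.24)

# The matrix reproduces the complex formula on a sample point.
z = 0.3 - 0.7j
w = a * z + b * z.conjugate()
u = A[0][0] * z.real + A[0][1] * z.imag
v = A[1][0] * z.real + A[1][1] * z.imag
print(w, complex(u, v))
```

For complex conjugation (a = 0, b = 1) the determinant equals −1, in agreement with the statement that orientation is reversed when |a| < |b|.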
However, a C-linear transformation w = az cannot change orientation since its
Jacobian is $J = |a|^2 \ge 0$. Such a transformation is non-singular unless a = 0. Letting $a = |a| e^{i\alpha}$ and
recalling the geometric interpretation of multiplication of complex numbers we find that
a non-degenerate C-linear transformation

$w = |a| e^{i\alpha} z$ (1.25)

is the composition of dilation by |a| and rotation by the angle α. Such transformations
preserve angles and map squares onto squares.
We note that preservation of angles characterizes C-linear transformations. More-
over, the following theorem holds.
Theorem 1.16 If an R-linear transformation w = az + bz̄ preserves orientation and
angles between three non-parallel vectors $e^{i\alpha_1}$, $e^{i\alpha_2}$, $e^{i\alpha_3}$, αj ∈ R, j = 1, 2, 3, then w is
C-linear.
Proof. Let us assume that $w(e^{i\alpha_1}) = \rho e^{i\beta_1}$ and define $w_0(z) = e^{-i\beta_1} w(z e^{i\alpha_1})$. Then
$w_0(z) = a_0 z + b_0 \bar z$ with
$a_0 = a e^{i(\alpha_1 - \beta_1)}, \quad b_0 = b e^{-i(\alpha_1 + \beta_1)}$,
and, moreover, $w_0(1) = e^{-i\beta_1} \rho e^{i\beta_1} = \rho > 0$. Therefore we have $a_0 + b_0 > 0$. Furthermore,
$w_0$ preserves the orientation and angles between the vectors $v_1 = 1$, $v_2 = e^{i(\alpha_2 - \alpha_1)}$ and
$v_3 = e^{i(\alpha_3 - \alpha_1)}$. Since both $v_1$ and its image lie on the positive semi-axis and the angles
between $v_1$ and $v_2$ and their images are the same, we have $w_0(v_2) = h_2 v_2$ with $h_2 > 0$.
This means that
$a_0 e^{i\beta_2} + b_0 e^{-i\beta_2} = h_2 e^{i\beta_2}, \quad \beta_2 = \alpha_2 - \alpha_1$,
and similarly
$a_0 e^{i\beta_3} + b_0 e^{-i\beta_3} = h_3 e^{i\beta_3}, \quad \beta_3 = \alpha_3 - \alpha_1$,
with $h_3 > 0$. Hence we have

$a_0 + b_0 > 0, \quad a_0 + b_0 e^{-2i\beta_2} > 0, \quad a_0 + b_0 e^{-2i\beta_3} > 0$.

This means that unless $b_0 = 0$ there exist three different vectors that connect the point
$a_0$ to the real axis, all having the same length $|b_0|$. This is impossible, and hence $b_0 = 0$
and w is C-linear.

Exercise 1.17 (a) Give an example of an R-linear transformation that is not C-linear
but preserves angles between two vectors.
(b) Show that if an R-linear transformation preserves orientation and maps some square
onto a square it is C-linear.

Now we may turn to the notion of differentiability of complex functions. Intuitively,
a function is differentiable if it is well approximated by linear functions. Two differ-
ent definitions of linear functions that we have introduced lead to different notions of
differentiability.
Definition 1.18 Let z ∈ C and let U be a neighborhood of z. A function f : U → C is
R-differentiable (respectively, C-differentiable) at the point z if we have for sufficiently
small |∆z|:
∆f = f (z + ∆z) − f (z) = l(∆z) + o(∆z), (1.26)
where l(∆z) (with z fixed) is an R-linear (respectively, C-linear) function of ∆z, and
o(∆z) satisfies o(∆z)/∆z → 0 as ∆z → 0. The function l is called the differential of f
at z and is denoted df .
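The increment of an R-differentiable function has, therefore, the form

$\Delta f = a\,\Delta z + b\,\overline{\Delta z} + o(\Delta z)$. (1.27)

Taking the increment ∆z = ∆x along the x-axis, so that $\overline{\Delta z} = \Delta x$, and passing to the
limit ∆x → 0 we obtain
$\lim_{\Delta x \to 0} \dfrac{\Delta f}{\Delta x} = \dfrac{\partial f}{\partial x} = a + b$.
Similarly, taking ∆z = i∆y (the increment is along the y-axis), so that $\overline{\Delta z} = -i\Delta y$, we
obtain
$\lim_{\Delta y \to 0} \dfrac{\Delta f}{i\Delta y} = \dfrac{1}{i}\dfrac{\partial f}{\partial y} = a - b$.
The two relations above imply that
$a = \dfrac{1}{2}\left(\dfrac{\partial f}{\partial x} - i\dfrac{\partial f}{\partial y}\right), \quad b = \dfrac{1}{2}\left(\dfrac{\partial f}{\partial x} + i\dfrac{\partial f}{\partial y}\right)$.
These coefficients are denoted as
$\dfrac{\partial f}{\partial z} = \dfrac{1}{2}\left(\dfrac{\partial f}{\partial x} - i\dfrac{\partial f}{\partial y}\right), \quad \dfrac{\partial f}{\partial \bar z} = \dfrac{1}{2}\left(\dfrac{\partial f}{\partial x} + i\dfrac{\partial f}{\partial y}\right)$ (1.28)
and are sometimes called the formal derivatives of f at the point z. They were first
introduced by Riemann in 1851.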
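
The formal derivatives (1.28) can be approximated numerically by replacing the partial derivatives in x and y with central differences. The Python sketch below is a rough illustration under that assumption; the function name `formal_derivatives`, the step size `h` and the sample functions are not part of the notes.

```python
def formal_derivatives(f, z, h=1e-6):
    """Approximate (df/dz, df/dzbar) at z via central differences in x and y, as in (1.28)."""
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return (fx - 1j * fy) / 2, (fx + 1j * fy) / 2

z0 = 1 + 1j
print(formal_derivatives(lambda z: z * z, z0))            # close to (2*z0, 0)
print(formal_derivatives(lambda z: z.conjugate(), z0))    # close to (0, 1)
print(formal_derivatives(lambda z: abs(z) ** 2, z0))      # close to (conj(z0), z0)
```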

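Exercise 1.19 Show that (a) $\dfrac{\partial z}{\partial \bar z} = 0$, $\dfrac{\partial \bar z}{\partial \bar z} = 1$; (b) $\dfrac{\partial}{\partial \bar z}(f + g) = \dfrac{\partial f}{\partial \bar z} + \dfrac{\partial g}{\partial \bar z}$, $\dfrac{\partial}{\partial \bar z}(fg) = \dfrac{\partial f}{\partial \bar z}\, g + f\, \dfrac{\partial g}{\partial \bar z}$.
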
Using the obvious relations dz = ∆z, dz̄ = ∆z̄ we arrive at the formula for the differential
of R-differentiable functions
$df = \dfrac{\partial f}{\partial z}\,dz + \dfrac{\partial f}{\partial \bar z}\,d\bar z$. (1.29)
Therefore, all functions f = u+iv such that u and v have usual differentials as functions
of two real variables x and y turn out to be R-differentiable. This notion does not bring
any essential new ideas to analysis. The complex analysis really starts with the notion
of C-differentiability.
The increment of a C-differentiable function has the form

∆f = a∆z + o(∆z) (1.30)

and its differential is a C-linear function of ∆z (with z fixed). Expression (1.29) shows
that C-differentiable functions are distinguished from R-differentiable ones by an addi-
tional condition
$\dfrac{\partial f}{\partial \bar z} = 0$. (1.31)
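If f = u + iv then (1.28) shows that
$\dfrac{\partial f}{\partial \bar z} = \dfrac{1}{2}\left(\dfrac{\partial u}{\partial x} - \dfrac{\partial v}{\partial y}\right) + \dfrac{i}{2}\left(\dfrac{\partial u}{\partial y} + \dfrac{\partial v}{\partial x}\right)$,

so that the complex equation (1.31) may be written as a pair of real equations

$\dfrac{\partial u}{\partial x} = \dfrac{\partial v}{\partial y}, \quad \dfrac{\partial u}{\partial y} = -\dfrac{\partial v}{\partial x}$. (1.32)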
The notion of complex differentiability is clearly very restrictive: while it is fairly difficult
to construct an example of a continuous but nowhere real differentiable function, even
very simple functions may turn out to be non-differentiable in the complex sense. For example,
the function f (z) = x + 2iy is nowhere C-differentiable: $\dfrac{\partial u}{\partial x} = 1$, $\dfrac{\partial v}{\partial y} = 2$, and conditions
(1.32) fail everywhere.
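
The pair of real equations (1.32) is easy to test numerically for a given pair u, v. The sketch below approximates the four partial derivatives by central differences; the function name `cauchy_riemann_residuals`, the step size and the sample point are assumptions made for illustration, and z^2 is used as a standard example of a function satisfying (1.32).

```python
def cauchy_riemann_residuals(u, v, x, y, h=1e-6):
    """Return (u_x - v_y, u_y + v_x), which both vanish when (1.32) holds."""
    ux = (u(x + h, y) - u(x - h, y)) / (2 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2 * h)
    vx = (v(x + h, y) - v(x - h, y)) / (2 * h)
    vy = (v(x, y + h) - v(x, y - h)) / (2 * h)
    return ux - vy, uy + vx

# f(z) = x + 2iy from the text: the first residual is 1 - 2 = -1 at every point.
print(cauchy_riemann_residuals(lambda x, y: x, lambda x, y: 2 * y, 0.3, -1.2))

# f(z) = z^2 = (x^2 - y^2) + 2ixy: both residuals are zero up to rounding.
print(cauchy_riemann_residuals(lambda x, y: x * x - y * y, lambda x, y: 2 * x * y, 0.3, -1.2))
```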

Exercise 1.20 1. Show that C-differentiable functions of the form u(x) + iv(y) are
necessarily C-linear.
2. Let f = u + iv be C-differentiable in the whole plane C and $u = v^2$ everywhere. Show
that f = const.

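Let us consider the notion of a derivative starting with that of the directional derivative.
We fix a point z ∈ C, its neighborhood U and a function f : U → C. Setting $\Delta z = |\Delta z| e^{i\theta}$
we obtain from (1.27) and (1.29):

$\Delta f = \dfrac{\partial f}{\partial z}|\Delta z| e^{i\theta} + \dfrac{\partial f}{\partial \bar z}|\Delta z| e^{-i\theta} + o(\Delta z)$.
We divide both sides by ∆z, pass to the limit |∆z| → 0 with θ fixed and obtain the
derivative of f at the point z in direction θ:
$\dfrac{\partial f}{\partial z_\theta} = \lim_{|\Delta z| \to 0,\ \arg \Delta z = \theta} \dfrac{\Delta f}{\Delta z} = \dfrac{\partial f}{\partial z} + \dfrac{\partial f}{\partial \bar z}\, e^{-2i\theta}$. (1.33)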

This expression shows that when z is fixed and θ changes between 0 and 2π the point
$\dfrac{\partial f}{\partial z_\theta}$ traverses twice a circle centered at $\dfrac{\partial f}{\partial z}$ with the radius $\left|\dfrac{\partial f}{\partial \bar z}\right|$.
Hence if $\dfrac{\partial f}{\partial \bar z} \neq 0$ then the directional derivative depends on the direction θ, and only if
$\dfrac{\partial f}{\partial \bar z} = 0$, that is, if f is C-differentiable, all directional derivatives at z are the same.
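
The circle traced by the directional derivative (1.33) can be observed numerically. The Python sketch below uses the sample function |z|^2, for which the formal derivatives at z are z̄ and z; the function name `directional_derivative`, the step size and the sample point are assumptions chosen for illustration.

```python
import cmath
import math

def directional_derivative(f, z, theta, h=1e-6):
    """Central-difference approximation of the derivative of f at z in direction theta, cf. (1.33)."""
    dz = h * cmath.exp(1j * theta)
    return (f(z + dz) - f(z - dz)) / (2 * dz)

f = lambda z: abs(z) ** 2          # sample function with df/dz = conj(z), df/dzbar = z
z0 = 1 + 2j
center, radius = z0.conjugate(), abs(z0)

for k in range(8):
    theta = 2 * math.pi * k / 8
    d = directional_derivative(f, z0, theta)
    # The distance from the center stays equal to the radius;
    # theta and theta + pi give the same point, so the circle is traversed twice.
    print(round(theta, 3), d, abs(d - center))
```

For a holomorphic sample function such as z^3 the same routine returns (up to discretization error) one and the same value for every θ.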
Clearly, the derivative of f at z exists if and only if the latter condition holds. It is
defined by
$f'(z) = \lim_{\Delta z \to 0} \dfrac{\Delta f}{\Delta z}$. (1.34)

The limit is understood in the topology of C. It is also clear that if $f'(z)$ exists then it
is equal to $\dfrac{\partial f}{\partial z}$. This proposition is so important despite its simplicity that we formulate
it as a separate theorem.
Theorem 1.21 Complex differentiability of f at z is equivalent to the existence of the
derivative $f'(z)$ at z.
Proof. If f is C-differentiable at z then (1.30) with $a = \dfrac{\partial f}{\partial z}$ implies that
$\Delta f = \dfrac{\partial f}{\partial z}\Delta z + o(\Delta z)$.
Then, since $\lim_{\Delta z \to 0} \dfrac{o(\Delta z)}{\Delta z} = 0$, we obtain that the limit $f'(z) = \lim_{\Delta z \to 0} \dfrac{\Delta f}{\Delta z}$ exists and is
equal to $\dfrac{\partial f}{\partial z}$. Conversely, if $f'(z)$ exists then by the definition of the limit we have
$\dfrac{\Delta f}{\Delta z} = f'(z) + \alpha(\Delta z)$,
where α(∆z) → 0 as ∆z → 0. Therefore the increment $\Delta f = f'(z)\Delta z + \alpha(\Delta z)\Delta z$ may
be split into two parts so that the first is linear in ∆z and the second is o(∆z), which is
equivalent to C-differentiability of f at z.
The definition of the derivative of a function of a complex variable is exactly the
same as in the real analysis, and all the arithmetic rules of dealing with derivatives
translate into the complex realm without any changes. Thus the elementary theorems
regarding derivatives of a sum, product, ratio, composition and inverse function apply
verbatim in the complex case. We skip their formulation and proofs.
We should have convinced ourselves that the notion of C-differentiability is very
natural. However, as we will see later, C-differentiability at one point is not sufficient
to build an interesting theory. Therefore we will require C-differentiability not at one
point but in a whole neighborhood.
Definition 1.22 A function f is holomorphic (or analytic) at a point z ∈ C if it is
C-differentiable in a neighborhood of z.

Example 1.23 The function $f(z) = |z|^2 = z\bar z$ is clearly R-differentiable everywhere in
C. However, $\dfrac{\partial f}{\partial \bar z} = z$ vanishes only at z = 0, so f is C-differentiable only at z = 0 and is not
holomorphic at this point.

The set of functions holomorphic at a point z is denoted by $O_z$. Sums and products of
functions in $O_z$ also belong to $O_z$, so this set is a ring. We note that the ratio f /g of
two functions in $O_z$ might not belong to $O_z$ if g(z) = 0.
Functions that are C-differentiable at all points of an open set D ⊂ C are clearly
also holomorphic at all points z ∈ D. We say that such functions are holomorphic in D
and denote their collection by O(D). The set O(D) is also a ring. In general a function
is holomorphic on a set M ⊂ C if it may be extended to a function that is holomorphic on
an open set D that contains M .
Finally we say that f is holomorphic at infinity if the function g(z) = f (1/z) is
holomorphic at z = 0. This definition allows us to consider functions holomorphic in C̄.
However, the notion of derivative at z = ∞ is not defined.

1.5 Geometric and Hydrodynamic Interpretations


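The differentials of an R-differentiable and, respectively, a C-differentiable function at
a point z have the form
$df = \dfrac{\partial f}{\partial z}\,dz + \dfrac{\partial f}{\partial \bar z}\,d\bar z, \quad df = f'(z)\,dz$. (1.35)
The Jacobians of such maps are given by (see (1.24))
$J_f(z) = \left|\dfrac{\partial f}{\partial z}\right|^2 - \left|\dfrac{\partial f}{\partial \bar z}\right|^2, \quad J_f(z) = |f'(z)|^2$. (1.36)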
Let us assume that f is R-differentiable at z and z is not a critical point of f , that is,
$J_f(z) \neq 0$. The implicit function theorem implies that locally f is a homeomorphism,
that is, there exists a neighborhood U of z so that f maps U continuously and
one-to-one onto a neighborhood of f (z). Expressions (1.36) show that in general $J_f$ may
have an arbitrary sign if f is just R-differentiable. However, the critical points of a
C-differentiable map coincide with the points where the derivative vanishes, while such maps
preserve orientation at non-critical points: $J_f(z) = |f'(z)|^2 > 0$.
Furthermore, an R-differentiable map is said to be conformal at z ∈ C if its differ-
ential df at z is a non-degenerate transformation that is a composition of dilation and
rotation. Since the latter property characterizes C-linear maps we obtain the following
geometric interpretation of C-differentiability:
Complex differentiability of f at a point z together with the condition $f'(z) \neq 0$ is
equivalent to f being a conformal map at z.
A map f : D → C conformal at every point z ∈ D is said to be conformal in D. It
is realized by a holomorphic function in D with no critical points ($f'(z) \neq 0$ in D). Its
differential at every point of the domain is a composition of a dilation and a rotation,
in particular it preserves angles. Such mappings were first considered by Euler in 1777
in relation to his participation in the project of producing geographic maps of Russia.
The name “conformal mapping” was introduced by F. Schubert in 1789.
So far we have studied differentials of maps. Let us look now at how the properties
of the map itself depend on it being conformal. Assume that f is conformal in a
neighborhood U of a point z and that $f'$ is continuous in U.¹ Consider a smooth path
γ : I = [0, 1] → U that starts at z, that is, $\gamma'(t) \neq 0$ for all t ∈ I and γ(0) = z. Its
image γ∗ = f ◦ γ is also a smooth path since
$\gamma_*'(t) = f'[\gamma(t)]\,\gamma'(t), \quad t \in I$, (1.37)
and $f'$ is continuous and different from zero everywhere in U by assumption.
Geometrically $\gamma'(t) = \dot x(t) + i\dot y(t)$ is the vector tangent to γ at the point γ(t), and
$|\gamma'(t)|\,dt = \sqrt{\dot x^2 + \dot y^2}\,dt = ds$ is the differential of the arc length of γ at the same point.
Similarly, $|\gamma_*'(t)|\,dt = ds_*$ is the differential of the arc length of γ∗ at the point γ∗ (t). We
conclude from (1.37) at t = 0 that

$|f'(z)| = \dfrac{|\gamma_*'(0)|}{|\gamma'(0)|} = \dfrac{ds_*}{ds}$. (1.38)
Thus the modulus of $f'(z)$ is equal to the dilation coefficient at $z$ under the mapping $f$.
The left side does not depend on the curve $\gamma$ as long as $\gamma(0) = z$, so under
our assumptions all arcs are dilated by the same factor. Consequently a conformal map $f$
has a circle property: it maps small circles centered at $z$ into curves that differ from
circles centered at $f(z)$ only by terms of higher order.
Going back to (1.37) we see that
$$\arg f'(z) = \arg \gamma_*'(0) - \arg \gamma'(0), \tag{1.39}$$
so that $\arg f'(z)$ is the rotation angle of the tangent lines at $z$ under $f$.
The left side also does not depend on the choice of γ as long as γ(0) = z, so that all
such arcs are rotated by the same angle. Thus a conformal map f preserves angles: the
angle between any two curves at z is equal to the angle between their images at f (z).
If $f$ is holomorphic at $z$ but $z$ is a critical point then the circle property holds
in a degenerate form: the dilation coefficient of all curves at $z$ is equal to 0. Angle
preservation does not hold at all: for instance, under the mapping $z \mapsto z^2$ the angle
between the lines $\arg z = \alpha_1$ and $\arg z = \alpha_2$ doubles. Moreover, smoothness of curves
may be violated at a critical point. For instance, the smooth curve $\gamma(t) = t + it^2$, $t \in [-1, 1]$,
is mapped under the same map $z \mapsto z^2$ into the curve $\gamma_*(t) = t^2(1 - t^2) + 2it^3$ with a
cusp at $\gamma_*(0) = 0$.
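The angle statements above can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper `image_angle`, the sample point, and the two directions are illustrative assumptions. It measures the angle between the images of two short segments leaving a point, once at a non-critical point of $e^z$ and once at the critical point $z = 0$ of $z^2$.

```python
import cmath

def image_angle(f, z, d1, d2, h=1e-6):
    """Angle between the images under f of two short segments
    leaving z in the directions d1 and d2 (hypothetical helper)."""
    w1 = f(z + h * d1) - f(z)
    w2 = f(z + h * d2) - f(z)
    return cmath.phase(w2 / w1)

d1 = cmath.exp(0.3j)
d2 = cmath.exp(1.1j)   # the two directions differ by 0.8 rad

# Non-critical point of exp: the angle between the curves is preserved.
angle_regular = image_angle(cmath.exp, 0.5 + 0.2j, d1, d2)

# Critical point z = 0 of z -> z**2: the angle doubles.
angle_critical = image_angle(lambda z: z * z, 0.0, d1, d2)

print(angle_regular, angle_critical)
```

The first value stays close to 0.8 while the second is close to 1.6, in line with the discussion of (1.39) and of the map $z \mapsto z^2$.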
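Exercise 1.24 Let $u(x, y)$ and $v(x, y)$ be real valued $\mathbb{R}$-differentiable functions and let
$\nabla u = \frac{\partial u}{\partial x} + i\frac{\partial u}{\partial y}$, $\nabla v = \frac{\partial v}{\partial x} + i\frac{\partial v}{\partial y}$. Find the geometric meaning of the conditions
$(\nabla u, \nabla v) = 0$ and $|\nabla u| = |\nabla v|$, and their relation to the $\mathbb{C}$-differentiability of $f = u + iv$
and the conformality of $f$.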
¹ We will later see that the existence of $f'$ implies its continuity and, moreover, the existence of derivatives
of all orders.
Let us now find the hydrodynamic meaning of complex differentiability and deriva-
tive. We consider a steady two-dimensional flow. That means that the flow vector field
v = (v1 , v2 ) does not depend on time. The flow is described by

v = v1 (x, y) + iv2 (x, y). (1.40)

Let us assume that in a neighborhood U of the point z the functions v1 and v2 have
continuous partial derivatives. We will also assume that the flow v is irrotational in U ,
that is,
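$$\operatorname{curl} v = \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} = 0 \tag{1.41}$$
and incompressible:
$$\operatorname{div} v = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} = 0 \tag{1.42}$$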
at all z ∈ U .
Condition (1.41) implies the existence of a potential function φ such that v = ∇φ,
that is,
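$$v_1 = \frac{\partial\phi}{\partial x}, \quad v_2 = \frac{\partial\phi}{\partial y}. \tag{1.43}$$
The incompressibility condition (1.42) implies that there exists a stream function $\psi$ so
that
$$v_2 = -\frac{\partial\psi}{\partial x}, \quad v_1 = \frac{\partial\psi}{\partial y}. \tag{1.44}$$
We have $d\psi = -v_2\,dx + v_1\,dy = 0$ along the level set of $\psi$ and thus $\frac{dy}{dx} = \frac{v_2}{v_1}$. This shows
that the level set is an integral curve of $v$.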
Consider now a complex function

f = φ + iψ, (1.45)

that is called the complex potential of v. Relations (1.43) and (1.44) imply that φ and
ψ satisfy
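$$\frac{\partial\phi}{\partial x} = \frac{\partial\psi}{\partial y}, \quad \frac{\partial\phi}{\partial y} = -\frac{\partial\psi}{\partial x}. \tag{1.46}$$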
The above conditions coincide with (1.32) and show that the complex potential f is
holomorphic at z ∈ U .
Conversely let the function f = φ + iψ be holomorphic in a neighborhood U of a
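point $z$, and let the functions $\phi$ and $\psi$ be twice continuously differentiable. Define the
vector field $v = \nabla\phi = \frac{\partial\phi}{\partial x} + i\frac{\partial\phi}{\partial y}$. It is irrotational in $U$ since $\operatorname{curl} v = \frac{\partial^2\phi}{\partial x\,\partial y} - \frac{\partial^2\phi}{\partial y\,\partial x} = 0$.
It is also incompressible since, by the Cauchy-Riemann equations, $\operatorname{div} v = \frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = \frac{\partial^2\psi}{\partial x\,\partial y} - \frac{\partial^2\psi}{\partial y\,\partial x} = 0$. The complex
potential of the vector field $v$ is clearly the function $f$.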
Therefore the function f is holomorphic if and only if it is the complex potential of
a steady fluid flow that is both irrotational and incompressible.
It is easy to establish the hydrodynamic meaning of the derivative:
$$f' = \frac{\partial\phi}{\partial x} + i\frac{\partial\psi}{\partial x} = v_1 - iv_2, \tag{1.47}$$
so that the derivative of the complex potential is the vector that is the complex conjugate
of the flow vector. The critical points of f are the points where the flow vanishes.
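As a quick sanity check of this correspondence, the sketch below takes a sample potential (the choice $f(z) = z^3$ and the helper names are assumptions made only for illustration), forms the flow vector as the complex conjugate of $f'$, and estimates its curl and divergence by central differences.

```python
def velocity(fprime, z):
    """Flow vector v = v1 + i*v2, the complex conjugate of f'(z), cf. (1.47)."""
    return fprime(z).conjugate()

def curl_and_div(fprime, z, h=1e-5):
    """Central-difference estimates of curl v and div v at the point z."""
    dv_dx = (velocity(fprime, z + h) - velocity(fprime, z - h)) / (2 * h)
    dv_dy = (velocity(fprime, z + 1j * h) - velocity(fprime, z - 1j * h)) / (2 * h)
    curl = dv_dx.imag - dv_dy.real   # dv2/dx - dv1/dy
    div = dv_dx.real + dv_dy.imag    # dv1/dx + dv2/dy
    return curl, div

# Hypothetical potential f(z) = z**3, so that f'(z) = 3 z**2.
curl, div = curl_and_div(lambda z: 3 * z * z, 0.7 + 0.4j)
print(curl, div)
```

Both printed values are zero up to rounding, as expected for the complex potential of an irrotational and incompressible flow.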

Example 1.25 Let us find the complex potential of an infinitely deep flow over a flat
bottom with a line obstacle of height h perpendicular to the bottom. This is a flow in
the upper half-plane that goes around an interval of length h that we may consider lying
on the imaginary axis.
The boundary of the domain consists, therefore, of the real axis and the interval
[0, ih] on the imaginary axis. The boundary must be a stream line of the flow. We set
it to be the level set ψ = 0 and will assume that ψ > 0 everywhere in D. In order
to find the complex potential f it suffices to find a conformal mapping of D onto the
upper half-plane ψ > 0. One function that provides such a mapping may be obtained as
follows. The mapping $z_1 = z^2$ maps $D$ onto the plane without the half-line $\operatorname{Re} z_1 \geq -h^2$,
$\operatorname{Im} z_1 = 0$. The map $z_2 = z_1 + h^2$ maps this half-line onto the positive semi-axis $\operatorname{Re} z_2 \geq 0$,
$\operatorname{Im} z_2 = 0$. Now the mapping $w = \sqrt{z_2} = \sqrt{|z_2|}\,e^{i(\arg z_2)/2}$ with $0 < \arg z_2 < 2\pi$ maps the
complex plane without the positive semi-axis onto the upper half-plane. It remains to
write explicitly the resulting map
$$w = \sqrt{z_2} = \sqrt{z_1 + h^2} = \sqrt{z^2 + h^2} \tag{1.48}$$

that provides the desired mapping of D onto the upper half-plane. We may obtain the
equation for the stream-lines of the flow by writing $(\phi + i\psi)^2 = (x + iy)^2 + h^2$. The
streamline $\psi = \psi_0$ is obtained by solving
$$\phi^2 - \psi_0^2 = h^2 + x^2 - y^2, \quad 2\phi\psi_0 = 2xy.$$
This leads to $\phi = xy/\psi_0$ and
$$y = \psi_0\sqrt{1 + \frac{h^2}{x^2 + \psi_0^2}}. \tag{1.49}$$

The magnitude of the flow is $|v| = \left|\frac{dw}{dz}\right| = \frac{|z|}{\sqrt{|z^2 + h^2|}}$ and is equal to one at infinity.
The point $z = 0$ is the critical point of the flow. One may show that the general form
of the solution is
$$f(z) = v_\infty\sqrt{z^2 + h^2}, \tag{1.50}$$
where $v_\infty > 0$ is the flow speed at infinity.
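The streamline formula (1.49) can be tested against the potential (1.48) directly. The sketch below is only an illustration: the values $h = 1$ and $\psi_0 = 0.5$, the sample abscissas, and the function names are assumptions. It uses the branch of the square root with $0 \le \arg z_2 < 2\pi$ and evaluates the potential at points of the curve (1.49).

```python
import cmath
import math

def sqrt_branch(z2):
    """Square root with 0 <= arg z2 < 2*pi, so the result has Im >= 0."""
    arg = cmath.phase(z2) % (2 * math.pi)
    return math.sqrt(abs(z2)) * cmath.exp(0.5j * arg)

def potential(z, h=1.0):
    """Complex potential w = sqrt(z**2 + h**2) of (1.48)."""
    return sqrt_branch(z * z + h * h)

def streamline_y(x, psi0, h=1.0):
    """Height y of the streamline psi = psi0 above the abscissa x, from (1.49)."""
    return psi0 * math.sqrt(1 + h * h / (x * x + psi0 * psi0))

psi0 = 0.5
samples = [potential(complex(x, streamline_y(x, psi0))) for x in (-2.0, -0.3, 0.4, 3.0)]
for w in samples:
    print(w.real, w.imag)   # the imaginary part stays close to psi0
```

The imaginary part of the potential is constant along the computed curve, which is what it means for (1.49) to describe a streamline.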

1.6 Möbius transforms
We will later prove that the only conformal maps C̄ → C̄ are the ones given by rational
functions. It is clear how to identify the conformal automorphisms amongst these maps,
at least on the non-rigorous level. Indeed, the fact that there is only one (and simple
since the map is one-to-one) solution to
$$\frac{P(z)}{Q(z)} = 0 \tag{1.51}$$
means (via the fundamental theorem of algebra that we will prove soon) that $P(z)$ is
linear. Furthermore, if $P(z) \neq \mathrm{const}$ then (1.51) has a solution different from $z = \infty$.
Therefore, we can not have $P(\infty)/Q(\infty) = 0$ in that case, which means that $Q(z)$ also
has to be linear. Finally, when $P(z) = P_0 = \mathrm{const}$, one sees that $Q(z)$ is linear since
the equation $P_0/Q(z) = w$ has exactly one solution for each $w \in \bar{\mathbb{C}}$. Based on this
argument (which the reader for now can ignore if desired), the next lemma identifies all
automorphisms of C̄.
Lemma 1.26 Every matrix $A \in GL(2, \mathbb{C})$ defines a transformation
$$T_A(z) := \frac{az + b}{cz + d}, \quad A = \begin{bmatrix} a & b \\ c & d \end{bmatrix},$$
which is holomorphic as a map from $\bar{\mathbb{C}} \to \bar{\mathbb{C}}$. It is called a fractional linear or Möbius
transformation. The map $A \mapsto T_A$ only depends on the equivalence class of $A$ under
the relation $A \sim B$ iff $A = \lambda B$, $\lambda \in \mathbb{C}^*$. In other words, the family of all Möbius
transformations is the same as
$$PSL(2, \mathbb{C}) := SL(2, \mathbb{C})/\{\pm \mathrm{Id}\}. \tag{1.52}$$
We have $T_A \circ T_B = T_{A \circ B}$ and $T_A^{-1} = T_{A^{-1}}$. In particular, every Möbius transformation
is an automorphism of $\bar{\mathbb{C}}$.
Proof. It is clear that each $T_A$ is a holomorphic map $\bar{\mathbb{C}} \to \bar{\mathbb{C}}$. The composition law
$T_A \circ T_B = T_{A \circ B}$ and $T_A^{-1} = T_{A^{-1}}$ are simple computations that we leave to the reader.
In particular, $T_A$ has a conformal inverse and is thus an automorphism of $\bar{\mathbb{C}}$. To prove
the last claim, note that if $T_A = T_{\tilde{A}}$ where $A, \tilde{A} \in SL(2, \mathbb{C})$, then the derivatives also
coincide:
$$T_A'(z) = \frac{ad - bc}{(cz + d)^2} = T_{\tilde{A}}'(z) = \frac{\tilde{a}\tilde{d} - \tilde{b}\tilde{c}}{(\tilde{c}z + \tilde{d})^2}$$
and thus $cz + d = \pm(\tilde{c}z + \tilde{d})$, as $A$ and $\tilde{A}$ obey the normalization
$$ad - bc = \tilde{a}\tilde{d} - \tilde{b}\tilde{c} = 1.$$
Hence, $A$ and $\tilde{A}$ are the same matrices in $SL(2, \mathbb{C})$ possibly up to a choice of sign, which
establishes (1.52).
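The computations left to the reader can at least be spot-checked numerically. The sketch below uses two arbitrarily chosen invertible matrices and a sample point (all assumptions for illustration only) to compare $T_A \circ T_B$ with the map of the matrix product, and to confirm that a scalar multiple of $A$ defines the same map.

```python
def mobius(A, z):
    """Apply T_A(z) = (a z + b) / (c z + d) for A = ((a, b), (c, d)); finite z only."""
    (a, b), (c, d) = A
    return (a * z + b) / (c * z + d)

def matmul(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = A
    (p, q), (r, s) = B
    return ((a * p + b * r, a * q + b * s), (c * p + d * r, c * q + d * s))

A = ((1 + 2j, 3), (0.5j, 2 - 1j))
B = ((2, -1j), (1, 4 + 1j))
z = 0.3 - 0.7j

lhs = mobius(A, mobius(B, z))     # (T_A o T_B)(z)
rhs = mobius(matmul(A, B), z)     # T_{AB}(z)

lam = 2 - 3j
scaled = tuple(tuple(lam * entry for entry in row) for row in A)
same_map_gap = abs(mobius(scaled, z) - mobius(A, z))
print(abs(lhs - rhs), same_map_gap)
```

Both printed differences are at the level of rounding error.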
Fractional linear transformations enjoy many important properties which can be
checked separately for each of the following four elementary transformations.

Lemma 1.27 Every Möbius transformation is a composition of elementary maps of four types:
• translations $z \mapsto z + z_0$
• dilations $z \mapsto \lambda z$, $\lambda > 0$
• rotations $z \mapsto e^{i\theta} z$, $\theta \in \mathbb{R}$
• inversion $z \mapsto \frac{1}{z}$

Proof. If $c = 0$, then $T_A(z) = \frac{a}{d} z + \frac{b}{d}$. If $c \neq 0$, then
$$T_A(z) = \frac{bc - ad}{c^2}\,\frac{1}{z + \frac{d}{c}} + \frac{a}{c}$$
and we are done.
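The decomposition in the proof can be followed step by step. The sketch below (coefficients and sample point are illustrative assumptions) applies a translation, the inversion, a dilation, a rotation and a final translation, and compares the result with the direct evaluation of $(az + b)/(cz + d)$ for $c \neq 0$.

```python
import cmath

def mobius_by_elementary_maps(a, b, c, d, z):
    """Evaluate (a z + b)/(c z + d), c != 0, as a chain of elementary maps."""
    w = z + d / c                              # translation
    w = 1 / w                                  # inversion
    k = (b * c - a * d) / (c * c)
    w = abs(k) * w                             # dilation by |k| > 0
    w = cmath.exp(1j * cmath.phase(k)) * w     # rotation by arg k
    return w + a / c                           # translation

a, b, c, d = 2 + 1j, -1, 1j, 3
z = 0.4 + 0.9j
direct = (a * z + b) / (c * z + d)
chained = mobius_by_elementary_maps(a, b, c, d, z)
print(abs(direct - chained))
```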


The reader will have no difficulty verifying that the transformation $z \mapsto \frac{z-1}{z+1}$ maps
the right half-plane $\{\operatorname{Re} z > 0\}$ onto the unit disk $\mathbb{D} := \{|z| < 1\}$. In particular,
the imaginary axis $i\mathbb{R}$ is mapped onto the unit circle and $z = 1$ is mapped to zero.
Similarly, the transformation $z \mapsto \frac{2z-1}{2-z}$ maps $\mathbb{D}$ onto itself with the boundary going
onto the boundary, since
$$\left|\frac{2e^{i\theta} - 1}{2 - e^{i\theta}}\right| = \left|\frac{2 - e^{-i\theta}}{2 - e^{i\theta}}\right| = 1, \quad \text{for any } \theta \in \mathbb{R}.$$
If we include all lines into the family of circles (they may be thought of as circles passing
through $\infty$, and their images on the unit sphere under the stereographic projection are
true circles on the sphere) then these examples motivate the following lemma.
Lemma 1.28 Fractional linear transformations map circles onto circles.
Proof. In view of Lemma 1.27, the only case requiring an argument is the inversion.
Thus, let $|z - z_0| = r$ be a circle and set $w = \frac{1}{z}$. Then
$$0 = |z|^2 - 2\operatorname{Re}(\bar{z}z_0) + |z_0|^2 - r^2 = \frac{1}{|w|^2} - 2\,\frac{\operatorname{Re}(wz_0)}{|w|^2} + |z_0|^2 - r^2.$$
If $|z_0| = r$, then one obtains the equation of a line in $w$. Note that this is precisely the
case when the circle passes through the origin. Otherwise, we obtain the equation
$$0 = \left|w - \frac{\bar{z}_0}{|z_0|^2 - r^2}\right|^2 - \frac{r^2}{(|z_0|^2 - r^2)^2}$$
which is a circle. Finally, a line is given by an equation
$$2\operatorname{Re}(z\bar{z}_0) = a$$
which transforms into $2\operatorname{Re}(z_0 w) = a|w|^2$. If $a = 0$, then we simply obtain another line
through the origin. Otherwise, we obtain the equation $|w - \bar{z}_0/a|^2 = |z_0/a|^2$ which is a
circle.
Since
$$Tz = \frac{az + b}{cz + d} = z$$
is a quadratic equation² for any Möbius transform $T$, we see that $T$ can have at most
two fixed points unless it is the identity.
It is also clear that every Möbius transform has at least one fixed point. The map
$Tz = z + 1$ has exactly one fixed point, namely $z = \infty$, whereas $Tz = \frac{1}{z}$ has two, $z = \pm 1$.
Lemma 1.29 A fractional linear transformation is determined completely by its action
on three distinct points. Moreover, given z1 , z2 , z3 ∈ C̄ distinct, there exists a unique
fractional linear transformation T with T z1 = 0, T z2 = 1, T z3 = ∞.
Proof. For the first statement, suppose that $S, T$ are Möbius transformations that
agree at three distinct points. Then $S^{-1} \circ T$ has three fixed points and is thus the
identity. For the second statement, let
$$Tz := \frac{z - z_1}{z - z_3}\,\frac{z_2 - z_3}{z_2 - z_1}$$
in case $z_1, z_2, z_3 \in \mathbb{C}$. If any one of these points is $\infty$, then we obtain the correct formula
by passing to the limit here.
Definition 1.30 The cross ratio of four points $z_0, z_1, z_2, z_3 \in \bar{\mathbb{C}}$ is defined as
$$[z_0 : z_1 : z_2 : z_3] := \frac{z_0 - z_1}{z_0 - z_3}\,\frac{z_2 - z_3}{z_2 - z_1}.$$
This concept is most relevant for its relation to Möbius transformations.
Lemma 1.31 The cross ratio of any four distinct points is preserved under Möbius
transformations. Moreover, four distinct points lie on a circle iff their cross ratio is
real.
Proof. Let $z_1, z_2, z_3$ be distinct, $T$ be a Möbius transformation, and let $Tz_j = w_j$,
$j = 1, 2, 3$. Then for all $z \in \mathbb{C}$, we have
$$[w : w_1 : w_2 : w_3] = [z : z_1 : z_2 : z_3] \quad \text{provided } w = Tz.$$
The reason is that the cross ratio on the left side defines a Möbius transformation $S_1 w$
with the property that $S_1 w_1 = 0$, $S_1 w_2 = 1$, $S_1 w_3 = \infty$, whereas the right side defines
a transformation $S_0$ with $S_0 z_1 = 0$, $S_0 z_2 = 1$, $S_0 z_3 = \infty$. Hence $S_1^{-1} \circ S_0 z_j = w_j$ for
$j = 1, 2, 3$, which implies that $S_1^{-1} \circ S_0 = T$ as claimed, by virtue of Lemma 1.29. The
second statement is a consequence of the first: by Lemmas 1.28 and 1.29 the circle through
three of the points can be mapped onto the extended real line by a Möbius transformation, and for any
three distinct points $z_1, z_2, z_3 \in \mathbb{R}$, a fourth point $z_0$ has a real-valued cross ratio with
these three iff $z_0 \in \mathbb{R}$.
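Both statements of Lemma 1.31 lend themselves to a numerical illustration. In the sketch below the circle, the off-circle point and the sample Möbius map `T` are arbitrary choices made for this example; the cross ratio follows Definition 1.30 for finite points.

```python
import cmath

def cross_ratio(z0, z1, z2, z3):
    """[z0 : z1 : z2 : z3] as in Definition 1.30 (finite points only)."""
    return (z0 - z1) / (z0 - z3) * (z2 - z3) / (z2 - z1)

def T(z):
    """A sample Möbius map, chosen only for illustration."""
    return ((2 + 1j) * z + 1) / (1j * z + 2)

# Four points on the circle |z - (1 + i)| = 2 ...
on_circle = [1 + 1j + 2 * cmath.exp(1j * t) for t in (0.2, 1.3, 2.9, 4.4)]
# ... and the same points with the first one moved off the circle.
off_circle = [0.1 + 0.1j] + on_circle[1:]

cr_on = cross_ratio(*on_circle)
cr_off = cross_ratio(*off_circle)
cr_image = cross_ratio(*[T(z) for z in on_circle])
print(cr_on, cr_off, cr_image)
```

The cross ratio of the concyclic points is real up to rounding, it acquires a visible imaginary part once one point leaves the circle, and it is unchanged after applying `T`.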
We can now define what it means for two points to be symmetric relative to a circle
(or line — recall that we consider lines to be circles passing through z = ∞).
² Strictly speaking, this is a quadratic equation provided $c \neq 0$; if $c = 0$ one obtains a linear equation
with a fixed point in $\mathbb{C}$ and another one at $z = \infty$.
Definition 1.32 Let $z_1, z_2, z_3 \in \Gamma$ where $\Gamma \subset \mathbb{C}_\infty$ is a circle. We say that $z$ and $z^*$ are
symmetric relative to $\Gamma$ if
$$\overline{[z : z_1 : z_2 : z_3]} = [z^* : z_1 : z_2 : z_3].$$
Obviously, if $\Gamma = \mathbb{R}$, then $z^* = \bar{z}$. In other words, if $\Gamma$ is a line, then $z^*$ is the reflection
of $z$ across that line. If $\Gamma$ is a circle of a finite radius, then the symmetric point is given
by what is known in elementary geometry as an inversion.
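Lemma 1.33 Let $\Gamma = \{|z - z_0| = r\}$. Then for any $z \in \mathbb{C}_\infty$,
$$z^* = z_0 + \frac{r^2}{\bar{z} - \bar{z}_0}.$$
Proof. It is sufficient to consider the unit circle: the general case follows by
translation and dilation. If $z_1, z_2, z_3$ lie on the unit circle, then $\bar{z}_j = z_j^{-1}$, hence
$$\overline{[z : z_1 : z_2 : z_3]} = [\bar{z} : z_1^{-1} : z_2^{-1} : z_3^{-1}] = \frac{\bar{z} - z_1^{-1}}{\bar{z} - z_3^{-1}}\,\frac{z_2^{-1} - z_3^{-1}}{z_2^{-1} - z_1^{-1}} = \frac{z_1\bar{z} - 1}{\bar{z} - z_3^{-1}}\,\frac{z_2^{-1} - z_3^{-1}}{z_1 z_2^{-1} - 1}$$
$$= \frac{z_1\bar{z} - 1}{\bar{z} - z_3^{-1}}\,\frac{1 - z_2 z_3^{-1}}{z_1 - z_2} = \frac{z_1\bar{z} - 1}{\bar{z}z_3 - 1}\,\frac{z_3 - z_2}{z_1 - z_2} = [1/\bar{z} : z_1 : z_2 : z_3].$$
In other words, we have $z^* = \bar{z}^{-1}$, as claimed.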
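The relation between the cross-ratio definition of symmetry and the inversion formula can be checked on a concrete circle. The sketch assumes the standard inversion formula $z^* = z_0 + r^2/(\bar z - \bar z_0)$ and the conjugated cross ratio in the definition of symmetric points; the circle, the three points on it and the test point are illustrative choices.

```python
import cmath

def cross_ratio(z0, z1, z2, z3):
    """[z0 : z1 : z2 : z3] for finite points."""
    return (z0 - z1) / (z0 - z3) * (z2 - z3) / (z2 - z1)

def reflect(z, center, r):
    """Point symmetric to z relative to the circle |z - center| = r."""
    return center + r * r / (z - center).conjugate()

center, r = 1 - 2j, 1.5
z1, z2, z3 = (center + r * cmath.exp(1j * t) for t in (0.4, 2.0, 5.1))
z = 0.3 + 0.2j
z_star = reflect(z, center, r)

lhs = cross_ratio(z, z1, z2, z3).conjugate()
rhs = cross_ratio(z_star, z1, z2, z3)
radius_product = abs(z - center) * abs(z_star - center)
print(z_star, abs(lhs - rhs), radius_product)
```

The two cross ratios agree, and the product of the distances of $z$ and $z^*$ from the center equals $r^2$, which is the elementary description of an inversion.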

Möbius transformations are important for several reasons. We already observed that
they are precisely the automorphisms of the Riemann sphere (though to see that every
automorphism is a Möbius transformation requires additional material). In the 19th
century there was much excitement surrounding non-Euclidean geometry and there is
an important connection between Möbius transformations and hyperbolic geometry:
the isometries of the hyperbolic plane H are precisely those Möbius transformations
which preserve it. Let us be more precise. Consider the upper half-plane model of the
hyperbolic plane given by
$$\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im} z > 0\}, \quad ds^2 = \frac{dx^2 + dy^2}{y^2} = \frac{dz\,d\bar{z}}{(\operatorname{Im} z)^2}.$$
It is not hard to see that Möbius transformations that preserve the upper half-plane are
given by
$$z \mapsto \frac{az + b}{cz + d}$$
with a, b, c, d ∈ R with ad − bc = 1 (up to multiplication of a, b, c, d by a complex
number λ ∈ C∗ ). Indeed, a Möbius transformation preserves the real line if and only
if a, b, c, d ∈ λR for some λ ∈ C∗ . Without loss of generality we may assume that
ad − bc = ±1. If the determinant equals +1 (so that the corresponding matrix is
in P SL(2, R)), then the upper half-plane is preserved, while those with a negative
determinant interchange the upper and the lower half-planes. It is easy to check that
P SL(2, R) operates transitively on H and preserves the metric: for the latter, one simply
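computes that if
$$w = \frac{az + b}{cz + d}, \quad a, b, c, d \in \mathbb{R}, \quad ad - bc = 1,$$
then
$$dw = \frac{a(cz + d) - (az + b)c}{(cz + d)^2}\,dz = \frac{1}{(cz + d)^2}\,dz,$$
and
$$2i\operatorname{Im} w = \frac{az + b}{cz + d} - \frac{a\bar{z} + b}{c\bar{z} + d} = \frac{acz\bar{z} + bc\bar{z} + adz + bd - acz\bar{z} - ad\bar{z} - bcz - bd}{(cz + d)(c\bar{z} + d)} = \frac{(ad - bc)(z - \bar{z})}{(cz + d)(c\bar{z} + d)} = \frac{2i\operatorname{Im} z}{(cz + d)(c\bar{z} + d)},$$
hence
$$\frac{dw\,d\bar{w}}{(\operatorname{Im} w)^2} = \frac{dz\,d\bar{z}}{(cz + d)^2(c\bar{z} + d)^2}\cdot\frac{(cz + d)^2(c\bar{z} + d)^2}{(\operatorname{Im} z)^2} = \frac{dz\,d\bar{z}}{(\operatorname{Im} z)^2}.$$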
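A numerical spot check of this computation is easy to set up. In the sketch below the real matrix entries (with $ad - bc = 1$) and the sample point are assumptions; it verifies the identity $\operatorname{Im} w = \operatorname{Im} z / |cz + d|^2$ and compares the hyperbolic length element $|dw| / \operatorname{Im} w$, estimated by a finite difference, with $|dz| / \operatorname{Im} z$.

```python
def psl2r(a, b, c, d, z):
    """Apply w = (a z + b)/(c z + d) with real a, b, c, d."""
    return (a * z + b) / (c * z + d)

a, b, c, d = 2.0, 1.0, 3.0, 2.0          # ad - bc = 1
z = 0.3 + 0.8j
w = psl2r(a, b, c, d, z)

# Im w = Im z / |c z + d|^2
imag_identity_gap = abs(w.imag - z.imag / abs(c * z + d) ** 2)

# |dw| / Im w versus |dz| / Im z, with |dw/dz| from a finite difference
h = 1e-6
dw = abs(psl2r(a, b, c, d, z + h) - w) / h
ratio_w = dw / w.imag
ratio_z = 1 / z.imag
print(imag_identity_gap, ratio_w, ratio_z)
```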
In particular, the geodesics are preserved under the Möbius transformations from
P SL(2, R). Since the metric does not depend on x it follows that all vertical lines are
geodesics. In order to see what the general geodesics are, note that any two points
z1,2 in H lie on a unique circle S12 that is perpendicular to the real axis. It is easy
to see that there exists a Möbius transformation T12 from P SL(2, R) that maps one of
the intersection points of S12 and R to infinity and, in addition, preserves H. As T12
preserves angles, it maps S12 to a line perpendicular to R, that is, to a geodesic in H.
As Möbius transformations from P SL(2, R) map geodesics to geodesics, it follows that
S12 itself is a geodesic. Hence, we have shown that the geodesics of H are precisely all
circles which intersect the real line at a right angle (with the vertical lines being counted
as circles of infinite radius).
It is clear from the above that the hyperbolic plane satisfies all axioms of Euclidean
geometry except for the axiom of parallel lines: there are many “lines” (i.e., geodesics)
passing through a point which is not on a fixed geodesic that do not intersect that
geodesic. Let us now prove the famous Gauss-Bonnet theorem which describes the
hyperbolic area of a triangle whose three sides are geodesics (those are called geodesic
triangles).

Theorem 1.34 Let T be a geodesic triangle with angles α1 , α2 , α3 , then

Area(T ) = π − (α1 + α2 + α3 ).

Proof. There are four essentially distinct types of geodesic triangles, depending on
how many of its vertices lie on the real line. Up to equivalences via transformations
in P SL(2, R) (which are isometries and therefore also preserve the area) we see that
it suffices to consider precisely those cases described in Figure 1.3. Let us start with
the case in which exactly two vertices belong to R as shown in that figure (the second
triangle from the right). Without loss of generality one vertex coincides with 1, the
other with ∞, and the circular arc lies on the unit circle with the projection of the
second finite vertex onto the real axis being x0 . Then
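$$\operatorname{Area}(T) = \int_{x_0}^{1}\int_{\sqrt{1 - x^2}}^{\infty} \frac{dx\,dy}{y^2} = \int_{x_0}^{1} \frac{dx}{\sqrt{1 - x^2}} = \int_{\alpha_0}^{0} \frac{d\cos\phi}{\sqrt{1 - \cos^2\phi}} = \alpha_0 = \pi - \alpha_1$$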

as desired since the other two angles are zero. By additivity of the area we can deal
with the other two cases in which at least one vertex is real. We leave the case where
no vertex lies on the (extended) real axis to the reader, the idea is to use Figure 1.4.
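The area formula can be tested numerically in a closely related configuration, which is an assumption of this sketch rather than one of the cases singled out in the proof: a triangle with one vertex at $\infty$ and two finite vertices $e^{i\phi_{\text{left}}}$, $e^{i\phi_{\text{right}}}$ on the unit circle. Its interior angles are $\pi - \phi_{\text{left}}$, $\phi_{\text{right}}$ and $0$, so the theorem predicts the area $\phi_{\text{left}} - \phi_{\text{right}}$.

```python
import math

def area_above_arc(phi_left, phi_right, n=20000):
    """Hyperbolic area of the region above the unit-circle arc between the
    angles phi_left > phi_right, bounded by the vertical lines through its
    endpoints. The inner integral of dy / y**2 from sqrt(1 - x**2) to
    infinity equals 1 / sqrt(1 - x**2); the outer integral is approximated
    by the midpoint rule."""
    x_lo, x_hi = math.cos(phi_left), math.cos(phi_right)
    step = (x_hi - x_lo) / n
    return sum(step / math.sqrt(1 - (x_lo + (k + 0.5) * step) ** 2) for k in range(n))

phi_left, phi_right = 2.0, 0.5
area = area_above_arc(phi_left, phi_right)
# Interior angles: pi - phi_left and phi_right at the finite vertices, 0 at infinity.
angle_sum = (math.pi - phi_left) + phi_right + 0.0
predicted = math.pi - angle_sum
print(area, predicted)
```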

2 Properties of Holomorphic Functions
2.1 The Integral
Definition 2.1 Let γ : I → C be a piecewise smooth path, where I = [α, β] is an
interval on the real axis. Let a complex-valued function f be defined on γ(I) so that the
function f ◦ γ is a continuous function on I. The integral of f along the path γ is
$$\int_\gamma f\,dz = \int_\alpha^\beta f(\gamma(t))\,\gamma'(t)\,dt. \tag{2.1}$$
The integral in the right side of (2.1) is understood to be $\int_\alpha^\beta g_1(t)\,dt + i\int_\alpha^\beta g_2(t)\,dt$, where
$g_1$ and $g_2$ are the real and imaginary parts of the function $f(\gamma(t))\gamma'(t) = g_1(t) + ig_2(t)$.
Note that the functions $g_1$ and $g_2$ may have only finitely many discontinuities on $I$ so
that the integral (2.1) exists in the usual Riemann integral sense.
Example 2.2 Let $\gamma$ be a circle $\gamma(t) = a + re^{it}$, $t \in [0, 2\pi]$, and $f(z) = (z - a)^n$, where
$n = 0, \pm 1, \dots$ is an integer. Then we have $\gamma'(t) = ire^{it}$, $f(\gamma(t)) = r^n e^{int}$ so that
$$\int_\gamma (z - a)^n\,dz = r^{n+1}\,i\int_0^{2\pi} e^{i(n+1)t}\,dt.$$
We have to consider two cases: when $n \neq -1$ we have
$$\int_\gamma (z - a)^n\,dz = r^{n+1}\,\frac{e^{2\pi i(n+1)} - 1}{n + 1} = 0,$$
because of the periodicity of the exponential function, while when $n = -1$
$$\int_\gamma \frac{dz}{z - a} = i\int_0^{2\pi} dt = 2\pi i.$$
Therefore the integer powers of $z - a$ have the "orthogonality" property
$$\int_\gamma (z - a)^n\,dz = \begin{cases} 0, & \text{if } n \neq -1, \\ 2\pi i, & \text{if } n = -1, \end{cases} \tag{2.2}$$
that we will use frequently.
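Definition (2.1) translates directly into a numerical quadrature, which gives an independent check of (2.2). The sketch below is an illustration only: the center, the radius, the range of exponents and the midpoint rule are choices made for this example.

```python
import cmath
import math

def circle_integral(f, a, r, n_steps=2000):
    """Midpoint-rule approximation of the integral of f over
    gamma(t) = a + r e^{it}, 0 <= t <= 2 pi, following (2.1)."""
    dt = 2 * math.pi / n_steps
    total = 0j
    for k in range(n_steps):
        t = (k + 0.5) * dt
        point = a + r * cmath.exp(1j * t)
        velocity = 1j * r * cmath.exp(1j * t)    # gamma'(t)
        total += f(point) * velocity * dt
    return total

a, r = 1 + 1j, 0.75
results = {n: circle_integral(lambda z, n=n: (z - a) ** n, a, r) for n in range(-3, 3)}
for n, value in results.items():
    print(n, value)
```

Only the exponent $n = -1$ produces a non-zero value, close to $2\pi i$, independently of the radius.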
Example 2.3 Let $\gamma : I \to \mathbb{C}$ be an arbitrary piecewise smooth path and $n \neq -1$. We
also assume that the path $\gamma(t)$ does not pass through the point $z = 0$ in the case $n < 0$.
The chain rule implies that $\frac{d}{dt}\gamma^{n+1}(t) = (n + 1)\gamma^n(t)\gamma'(t)$ so that
$$\int_\gamma z^n\,dz = \int_\alpha^\beta \gamma^n(t)\gamma'(t)\,dt = \frac{1}{n + 1}\left[\gamma^{n+1}(\beta) - \gamma^{n+1}(\alpha)\right]. \tag{2.3}$$
We observe that the integrals of $z^n$, $n \neq -1$, depend not on the path but only on its
endpoints. Their integrals over a closed path vanish.
The integral is invariant under a re-parameterization of the path.
Theorem 2.4 Let a path $\gamma_1 : [\alpha_1, \beta_1] \to \mathbb{C}$ be obtained from a piecewise smooth path
$\gamma : [\alpha, \beta] \to \mathbb{C}$ by a legitimate re-parameterization, that is $\gamma = \gamma_1 \circ \tau$ where $\tau$ is an
increasing piecewise smooth map $\tau : [\alpha, \beta] \to [\alpha_1, \beta_1]$. Then we have for any function $f$
that is continuous on $\gamma$ (and hence on $\gamma_1$):
$$\int_{\gamma_1} f\,dz = \int_\gamma f\,dz. \tag{2.4}$$

Proof. The definition of the integral implies that
$$\int_{\gamma_1} f\,dz = \int_{\alpha_1}^{\beta_1} f(\gamma_1(s))\gamma_1'(s)\,ds.$$
Introducing the new variable $t$ so that $\tau(t) = s$ and using the usual rules for the change
of real variables in an integral we obtain
$$\int_{\gamma_1} f\,dz = \int_{\alpha_1}^{\beta_1} f(\gamma_1(s))\gamma_1'(s)\,ds = \int_\alpha^\beta f(\gamma_1(\tau(t)))\gamma_1'(\tau(t))\tau'(t)\,dt = \int_\alpha^\beta f(\gamma(t))\gamma'(t)\,dt = \int_\gamma f\,dz. \quad \square$$

This theorem has an important corollary: the integral that we defined for a path makes
sense also for a curve that is an equivalence class of paths. More precisely, the value of
the integral along any path that defines a given curve is independent of the choice of
path in the equivalence class of the curve.

Theorem 2.5 Let $f$ be a continuous function defined on a piecewise smooth path
$\gamma : [\alpha, \beta] \to \mathbb{C}$. Then the following inequality holds:
$$\left|\int_\gamma f\,dz\right| \leq \int_\gamma |f|\,|d\gamma|, \tag{2.5}$$
where $|d\gamma| = |\gamma'(t)|\,dt$ is the differential of the arc length of $\gamma$ and the integral on the
right side is the real integral along a curve.
Proof. Let us denote $J = \int_\gamma f\,dz$ and let $J = |J|e^{i\theta}$, then we have
$$|J| = \int_\gamma e^{-i\theta} f\,dz = \int_\alpha^\beta e^{-i\theta} f(\gamma(t))\gamma'(t)\,dt.$$
The integral on the right side is a real number and hence
$$|J| = \int_\alpha^\beta \operatorname{Re}\left[e^{-i\theta} f(\gamma(t))\gamma'(t)\right]dt \leq \int_\alpha^\beta |f(\gamma(t))|\,|\gamma'(t)|\,dt = \int_\gamma |f|\,|d\gamma|. \quad \square$$
Corollary 2.6 Let the assumptions of the previous theorem hold and assume that $|f(z)| \leq M$
for a constant $M$, then
$$\left|\int_\gamma f\,dz\right| \leq M|\gamma|, \tag{2.6}$$
where $|\gamma|$ is the length of the path $\gamma$.

Inequality (2.6) is obtained from (2.5) if we estimate the integral on the right side of
(2.5) and note that $\int_\gamma |d\gamma| = |\gamma|$.
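The chain of estimates (2.5) and (2.6) can be observed on a concrete path. The sketch below is an illustration under assumed data: $f(z) = e^z$ on the parabolic arc $\gamma(t) = t + it^2$, $0 \le t \le 1$, with all three quantities approximated by the midpoint rule and $M$ taken as the largest sampled value of $|f|$.

```python
import cmath

def path_integral_and_bounds(f, gamma, dgamma, alpha, beta, n=20000):
    """Return (|integral of f dz|, integral of |f| |dgamma|, M * length),
    all approximated by the midpoint rule on n sub-intervals."""
    dt = (beta - alpha) / n
    integral, weighted, length, sup = 0j, 0.0, 0.0, 0.0
    for k in range(n):
        t = alpha + (k + 0.5) * dt
        value, speed = f(gamma(t)), dgamma(t)
        integral += value * speed * dt
        weighted += abs(value) * abs(speed) * dt
        length += abs(speed) * dt
        sup = max(sup, abs(value))
    return abs(integral), weighted, sup * length

# Hypothetical data: f(z) = exp(z) on gamma(t) = t + i t^2, 0 <= t <= 1.
lhs, middle, rhs = path_integral_and_bounds(
    cmath.exp, lambda t: t + 1j * t * t, lambda t: 1 + 2j * t, 0.0, 1.0)
exact = abs(cmath.exp(1 + 1j) - 1)   # from the anti-derivative exp(z)
print(lhs, middle, rhs, exact)
```

The three printed values are increasing, and the first agrees with the exact value obtained from the anti-derivative of $e^z$.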

2.1.1 The anti-derivative


Definition 2.7 An anti-derivative of a function f in a domain D is a holomorphic
function F such that at every point z ∈ D we have

$$F'(z) = f(z). \tag{2.7}$$

Let us now address the existence of anti-derivative. First we will look at the question of
existence of a local anti-derivative that exists in a neighborhood of a point. We begin
with a theorem that expresses in the simplest form the Cauchy theorem that lies at the
core of the theory of integration of holomorphic functions.

Theorem 2.8 (Cauchy) Let $f \in \mathcal{O}(D)$, that is, $f$ is holomorphic in $D$. Then the
integral of $f$ along the oriented boundary³ of any triangle $\Delta$ that is properly contained⁴
in $D$ is equal to zero:
$$\int_{\partial\Delta} f\,dz = 0. \tag{2.8}$$

Proof. Let us assume that this is false, that is, there exists a triangle $\Delta$ properly
contained in $D$ so that
$$\left|\int_{\partial\Delta} f\,dz\right| = M > 0. \tag{2.9}$$

Let us divide ∆ into four sub-triangles by connecting the midpoints of all sides and
assume that the boundaries both of ∆ and these triangles are oriented counter-clockwise.
Then clearly the integral of f over ∂∆ is equal to the sum of the integrals over the
boundaries of the small triangles since each side of a small triangle that is not part of
the boundary ∂∆ belongs to two small triangles with two different orientations so that
they do not contribute to the sum. Therefore there exists at least one small triangle
that we denote $\Delta_1$ so that
$$\left|\int_{\partial\Delta_1} f\,dz\right| \geq \frac{M}{4}.$$
³ We assume that the boundary $\partial\Delta$ (that we treat as a piecewise smooth curve) is oriented in such
a way that the triangle $\Delta$ remains on one side of $\partial\Delta$ when one traces $\partial\Delta$.
⁴ A set $S$ is properly contained in a domain $S'$ if $S$ is contained in a compact subset of $S'$.
We divide $\Delta_1$ into four smaller sub-triangles and using the same considerations we find
one of them denoted $\Delta_2$ so that $\left|\int_{\partial\Delta_2} f\,dz\right| \geq \frac{M}{4^2}$.
Continuing this procedure we construct a sequence of nested triangles $\Delta_n$ so that
$$\left|\int_{\partial\Delta_n} f\,dz\right| \geq \frac{M}{4^n}. \tag{2.10}$$

The closed triangles $\Delta_n$ have a common point $z_0 \in \Delta \subset D$. The function $f$ is holomorphic
at $z_0$ and hence for any $\varepsilon > 0$ there exists $\delta > 0$ so that we may decompose
$$f(z) - f(z_0) = f'(z_0)(z - z_0) + \alpha(z)(z - z_0) \tag{2.11}$$
with $|\alpha(z)| < \varepsilon$ for all $z \in U = \{|z - z_0| < \delta\}$.


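We may find a triangle $\Delta_n$ that is contained in $U$. Then (2.11) implies that
$$\int_{\partial\Delta_n} f\,dz = \int_{\partial\Delta_n} f(z_0)\,dz + \int_{\partial\Delta_n} f'(z_0)(z - z_0)\,dz + \int_{\partial\Delta_n} \alpha(z)(z - z_0)\,dz.$$
However, the first two integrals on the right side vanish since the factors $f(z_0)$ and $f'(z_0)$
may be pulled out of the integrals and the integrals of $1$ and $z - z_0$ over a closed path $\partial\Delta_n$
are equal to zero (see Example 2.3). Therefore, we have $\int_{\partial\Delta_n} f\,dz = \int_{\partial\Delta_n} \alpha(z)(z - z_0)\,dz$,
where $|\alpha(z)| < \varepsilon$ for all $z \in \partial\Delta_n$. Furthermore, we have $|z - z_0| \leq |\partial\Delta_n|$ for all $z \in \partial\Delta_n$
and hence we obtain using Theorem 2.5
$$\left|\int_{\partial\Delta_n} f\,dz\right| = \left|\int_{\partial\Delta_n} \alpha(z)(z - z_0)\,dz\right| < \varepsilon|\partial\Delta_n|^2.$$
However, by construction we have $|\partial\Delta_n| = |\partial\Delta|/2^n$, where $|\partial\Delta|$ is the perimeter of $\Delta$,
so that
$$\left|\int_{\partial\Delta_n} f\,dz\right| < \varepsilon|\partial\Delta|^2/4^n.$$
This together with (2.10) implies that $M < \varepsilon|\partial\Delta|^2$ which in turn implies $M = 0$ since $\varepsilon$
is an arbitrary positive number. This contradicts assumption (2.9) and the conclusion
of Theorem 2.8 follows. $\square$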
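A numerical experiment illustrates both the theorem and the role of holomorphy. In the sketch below the triangle, the two integrands and the midpoint rule are assumptions chosen for illustration: the boundary integral of the holomorphic function $z^2 e^z$ vanishes up to discretization error, while that of the non-holomorphic function $\bar z$ equals $2i$ times the area of the triangle.

```python
import cmath

def segment_integral(f, p, q, n=4000):
    """Midpoint-rule approximation of the integral of f over the segment [p, q]."""
    dz = (q - p) / n
    return sum(f(p + (k + 0.5) * dz) for k in range(n)) * dz

def triangle_integral(f, a, b, c):
    """Integral of f over the boundary of the triangle a -> b -> c -> a."""
    return segment_integral(f, a, b) + segment_integral(f, b, c) + segment_integral(f, c, a)

a, b, c = 0j, 2 + 0j, 1 + 1.5j            # counter-clockwise, area 1.5
holomorphic = triangle_integral(lambda z: cmath.exp(z) * z * z, a, b, c)
not_holomorphic = triangle_integral(lambda z: z.conjugate(), a, b, c)
print(abs(holomorphic), not_holomorphic)
```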
We will consider the Cauchy theorem in its full generality in the next section. At the
moment we will deduce the local existence of anti-derivative from the above Theorem.

Theorem 2.9 Let $f \in \mathcal{O}(D)$, then it has an anti-derivative in any disk $U = \{|z - a| < r\} \subset D$:
$$F(z) = \int_{[a,z]} f(\zeta)\,d\zeta, \tag{2.12}$$
where the integral is taken along the straight segment $[a, z] \subset U$.
Proof. We fix an arbitrary point z ∈ U and assume that |∆z| is so small that the point
z + ∆z ∈ U . Then the triangle ∆ with vertices a, z and z + ∆z is properly contained
in D so that Theorem 2.8 implies that
$$\int_{[a,z]} f(\zeta)\,d\zeta + \int_{[z,z+\Delta z]} f(\zeta)\,d\zeta + \int_{[z+\Delta z,a]} f(\zeta)\,d\zeta = 0.$$
The first term above is equal to $F(z)$ and the third to $-F(z + \Delta z)$ so that
$$F(z + \Delta z) - F(z) = \int_{[z,z+\Delta z]} f(\zeta)\,d\zeta. \tag{2.13}$$
On the other hand we have
$$f(z) = \frac{1}{\Delta z}\int_{[z,z+\Delta z]} f(z)\,d\zeta$$
(we have pulled the constant factor $f(z)$ out of the integral sign above), which allows
us to write
$$\frac{F(z + \Delta z) - F(z)}{\Delta z} - f(z) = \frac{1}{\Delta z}\int_{[z,z+\Delta z]} [f(\zeta) - f(z)]\,d\zeta. \tag{2.14}$$
We use now continuity of the function $f$: for any $\varepsilon > 0$ we may find $\delta > 0$ so that if
$|\Delta z| < \delta$ then we have $|f(\zeta) - f(z)| < \varepsilon$ for all $\zeta \in [z, z + \Delta z]$. We conclude from (2.14)
that
$$\left|\frac{F(z + \Delta z) - F(z)}{\Delta z} - f(z)\right| < \frac{1}{|\Delta z|}\,\varepsilon|\Delta z| = \varepsilon$$
provided that $|\Delta z| < \delta$. The above implies that $F'(z)$ exists and is equal to $f(z)$. $\square$
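The construction (2.12) can be imitated numerically. The sketch below is an illustration with assumed data: the integrand $\cos z$, the base point $a$, the evaluation point and the complex increment are arbitrary choices. It integrates along straight segments and compares a complex difference quotient of $F$ with $f(z)$.

```python
import cmath

def F(f, a, z, n=4000):
    """F(z) = integral of f over the segment [a, z], cf. (2.12), by the midpoint rule."""
    dz = (z - a) / n
    return sum(f(a + (k + 0.5) * dz) for k in range(n)) * dz

f = cmath.cos            # hypothetical integrand, holomorphic everywhere
a, z = 0.5 + 0.5j, 1.2 - 0.3j

# Difference quotient of F in an arbitrarily chosen complex direction.
dz = 1e-4 * cmath.exp(0.7j)
quotient = (F(f, a, z + dz) - F(f, a, z)) / dz
exact_F = cmath.sin(z) - cmath.sin(a)
print(quotient, f(z), abs(F(f, a, z) - exact_F))
```

The difference quotient is close to $f(z)$ whatever the direction of the increment, and $F$ agrees with $\sin z - \sin a$, the anti-derivative of $\cos z$ that vanishes at $a$.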
Remark 2.10 We have used only two properties of the function f in the proof of Theo-
rem 2.9: f is continuous and its integral over any triangle ∆ that is contained properly in D
vanishes. Therefore we may claim that the function F defined by (2.12) is a local anti-derivative
of any function f that has these two properties.

The problem of existence of a global anti-derivative in the whole domain $D$ is somewhat
more complicated. We will address it only in the next section, and now will just show
how an anti-derivative that acts along a given path may be glued together out of local
anti-derivatives.
Definition 2.11 Let a function f be defined in a domain D and let γ : I = [α, β] → D
be an arbitrary continuous path. A function Φ : I → C is an anti-derivative of f along
the path γ if (i) Φ is continuous on I, and (ii) for any t0 ∈ I there exists a neighborhood
$U \subset D$ of the point $z_0 = \gamma(t_0)$ so that $f$ has an anti-derivative $F_U$ in $U$ such that
$$F_U(\gamma(t)) = \Phi(t) \tag{2.15}$$
for all $t$ in a neighborhood $u_{t_0} \subset I$.
We note that if f has an anti-derivative F in the whole domain D then the function
F (γ(t)) is an anti-derivative along the path γ. However, the above definition does not
require the existence of a global anti-derivative in all of D – it is sufficient for it to exist
locally, in a neighborhood of each point $z_0 \in \gamma$. Moreover, if $\gamma(t') = \gamma(t'')$ with $t' \neq t''$
then the two anti-derivatives of $f$ that correspond to the neighborhoods $u_{t'}$ and $u_{t''}$
need not coincide: they may differ by a constant (observe that they are anti-derivatives
of $f$ in a neighborhood of the same point $z'$ and hence their difference is a constant).
Therefore the anti-derivative along a path, being a function of the parameter $t$, might not be
a function of the point $z$.
Theorem 2.12 Let f ∈ O(D) and let γ : I → D be a continuous path. Then the
anti-derivative of f along γ exists and is defined up to a constant.
Proof. Let us divide the interval $I = [\alpha, \beta]$ into $n$ sub-intervals $I_k = [t_k, t_k']$ so that
each pair of adjacent sub-intervals overlap on an interval ($t_k < t_{k-1}' < t_{k+1} < t_k'$, $t_1 = \alpha$,
$t_n' = \beta$). Using uniform continuity of the function $\gamma(t)$ we may choose $I_k$ so small
that the image $\gamma(I_k)$ is contained in a disk $U_k \subset D$. Theorem 2.9 implies that $f$ has
an anti-derivative $F$ in each disk $U_k$. Let us choose arbitrarily an anti-derivative of $f$
in $U_1$ and denote it $F_1$. Consider an anti-derivative of $f$ defined in $U_2$. It may differ
only by a constant from $F_1$ in the intersection $U_1 \cap U_2$. Therefore we may choose the
anti-derivative $F_2$ of $f$ in $U_2$ that coincides with $F_1$ in $U_1 \cap U_2$.
We may continue in this fashion choosing the anti-derivative $F_k$ in each $U_k$ so that
$F_k = F_{k-1}$ in the intersection $U_{k-1} \cap U_k$, $k = 2, \dots, n$. The function
$$\Phi(t) = F_k \circ \gamma(t), \quad t \in I_k, \quad k = 1, 2, \dots, n,$$
is an anti-derivative of $f$ along $\gamma$. Indeed it is clearly continuous on $I$ and for each $t_0 \in I$
one may find a neighborhood $u_{t_0}$ where $\Phi(t) = F_U \circ \gamma(t)$ where $F_U$ is an anti-derivative
of $f$ in a neighborhood of the point $\gamma(t_0)$.
It remains to prove the second part of the theorem. Let $\Phi_1$ and $\Phi_2$ be two
anti-derivatives of $f$ along $\gamma$. We have $\Phi_1 = F^{(1)} \circ \gamma(t)$, $\Phi_2 = F^{(2)} \circ \gamma(t)$ in a neighborhood
$u_{t_0}$ of each point $t_0 \in I$. Here $F^{(1)}$ and $F^{(2)}$ are two anti-derivatives of $f$ defined in a
neighborhood of the point $\gamma(t_0)$. They may differ only by a constant so that $\phi(t) =
\Phi_1(t) - \Phi_2(t)$ is constant in a neighborhood $u_{t_0}$ of $t_0$. However, a locally constant function
defined on a connected set is constant on the whole set.⁵ Therefore $\Phi_1(t) - \Phi_2(t) = \mathrm{const}$
for all $t \in I$. $\square$
If the anti-derivative of $f$ along a path $\gamma$ is known then the integral of $f$ over $\gamma$ is
computed using the usual Newton-Leibniz formula.
Theorem 2.13 Let $\gamma : [\alpha, \beta] \to \mathbb{C}$ be a piecewise smooth path and let $f$ be continuous
on $\gamma$ and have an anti-derivative $\Phi(t)$ along $\gamma$, then
$$\int_\gamma f\,dz = \Phi(\beta) - \Phi(\alpha). \tag{2.16}$$
⁵ Indeed, let $E = \{t \in I : \phi(t) = \phi(t_0)\}$. This set is not empty since it contains $t_0$. It is open since $\phi$
is locally constant so that if $t \in E$ and $\phi(t) = \phi(t_0)$ then $\phi(t') = \phi(t) = \phi(t_0)$ for all $t'$ in a neighborhood
$u_t$ and thus $u_t \subset E$. However, it is also closed since $\phi$ is a continuous function (because it is locally
constant) so that $\phi(t_n) = \phi(t_0)$ and $t_n \to t''$ implies $\phi(t'') = \phi(t_0)$. Therefore $E = I$.
Proof. Let us assume first that γ is a smooth path and its image is contained in a domain
D where f has an anti-derivative F . Then the function F ◦ γ is an anti-derivative of f
along γ and hence differs from Φ only by a constant so that Φ(t) = F ◦ γ(t) + C. Since
$\gamma$ is a smooth path and $F'(z) = f(z)$ the derivative $\Phi'(t) = f(\gamma(t))\gamma'(t)$ exists and is
continuous at all $t \in [\alpha, \beta]$. However, using the definition of the integral we have
$$\int_\gamma f\,dz = \int_\alpha^\beta f(\gamma(t))\gamma'(t)\,dt = \int_\alpha^\beta \Phi'(t)\,dt = \Phi(\beta) - \Phi(\alpha)$$
and the theorem is proved in this particular case.
In the general case we may divide $\gamma$ into a finite number of paths $\gamma_\nu : [\alpha_\nu, \alpha_{\nu+1}] \to \mathbb{C}$
($\alpha_0 = \alpha < \alpha_1 < \alpha_2 < \dots < \alpha_n = \beta$) so that each of them is smooth and is contained in
a domain where $f$ has an anti-derivative. As we have just shown,
$$\int_{\gamma_\nu} f\,dz = \Phi(\alpha_{\nu+1}) - \Phi(\alpha_\nu),$$
and summing over $\nu$ we obtain (2.16). $\square$

Remark 2.14 We may extend our definition of the integral to continuous paths (from
piecewise smooth) by defining the integral of f over an arbitrary continuous path γ
as the increment of its anti-derivative along this path over the interval [α, β] of
the parameter change. Clearly the right side of (2.16) does not change under a re-
parameterization of the path. Therefore one may consider integrals of holomorphic
functions over arbitrary continuous curves.

Remark 2.15 Theorem 2.13 allows us to verify that a holomorphic function might have
no global anti-derivative in a domain that is not simply connected. Let D = {0 < |z| <
2} be a punctured disk and consider the function f (z) = 1/z that is holomorphic in D.
This function cannot have an anti-derivative in $D$. Indeed, were the anti-derivative $F$
of $f$ to exist in $D$, the function $F(\gamma(t))$ would be an anti-derivative along any path $\gamma$
contained in $D$. Theorem 2.13 would imply that
$$\int_\gamma f\,dz = F(b) - F(a),$$
where $a = \gamma(\alpha)$, $b = \gamma(\beta)$ are the end-points of $\gamma$. In particular the integral of $f$ along
any closed path $\gamma$ would vanish. However, we know that the integral of $f$ over the unit
circle is
$$\int_{|z|=1} f\,dz = 2\pi i.$$
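The gluing of local anti-derivatives along a path can be carried out explicitly for $f(z) = 1/z$ on the unit circle. The sketch below is an illustration under stated assumptions: the circle is cut into a fixed number of short arcs, and on each arc the increment of a local anti-derivative is taken to be the principal logarithm of the ratio of the endpoints, which is a local branch of $\log z$ up to a constant.

```python
import cmath
import math

def antiderivative_along_circle(n_arcs=8):
    """Glue local branches of log z along gamma(t) = e^{it}, 0 <= t <= 2 pi.
    On each short arc the increment of a local anti-derivative of 1/z is the
    principal logarithm of the ratio of the arc's endpoints."""
    phi = 0j                      # Phi(0), fixed arbitrarily
    values = [phi]
    for k in range(n_arcs):
        start = cmath.exp(2j * math.pi * k / n_arcs)
        end = cmath.exp(2j * math.pi * (k + 1) / n_arcs)
        phi += cmath.log(end / start)
        values.append(phi)
    return values

values = antiderivative_along_circle()
increment = values[-1] - values[0]   # Phi(2 pi) - Phi(0)
print(increment)
```

Although the path returns to its starting point, the anti-derivative along the path does not return to its initial value: the increment is $2\pi i$, in agreement with Theorem 2.13 and with the value of the integral over the unit circle.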

2.2 The Cauchy Theorem
We will prove now the Cauchy theorem in its general form – the basic theorem of the
theory of integration of holomorphic functions (we have proved it in its simplest form
in the last section). This theorem claims that the integral of a function holomorphic in
some domain does not change if the path of integration is changed continuously inside
the domain provided that its end-points remain fixed or a closed path remains closed.
We have to define first what we mean by a continuous deformation of a path. We assume
for simplicity that all our paths are parameterized so that t ∈ I = [0, 1]. This assumption
may be made without any loss of generality since any path may be re-parameterized in
this way without changing the equivalence class of the path and hence the value of the
integral.

Definition 2.16 Two paths γ0 : I → D and γ1 : I → D with common ends γ0 (0) =
γ1 (0) = a, γ0 (1) = γ1 (1) = b are homotopic to each other in a domain D if there exists
a continuous map γ(s, t) : I × I → D so that
$$\begin{aligned} &\gamma(0,t) = \gamma_0(t), \quad \gamma(1,t) = \gamma_1(t), \quad t \in I, \\ &\gamma(s,0) = a, \quad \gamma(s,1) = b, \quad s \in I. \end{aligned} \tag{2.17}$$

The function γ(s0 , t) : I → D defines a path inside the domain D for each fixed
s0 ∈ I. These paths vary continuously as s0 varies and their family “connects” the
paths γ0 and γ1 in D. Therefore the homotopy of two paths in D means that one path
may be deformed continuously into the other inside D.
Similarly two closed paths γ0 : I → D and γ1 : I → D are homotopic in a domain
D if there exists a continuous map γ(s, t) : I × I → D so that

$$\begin{aligned} &\gamma(0,t) = \gamma_0(t), \quad \gamma(1,t) = \gamma_1(t), \quad t \in I, \\ &\gamma(s,0) = \gamma(s,1), \quad s \in I. \end{aligned} \tag{2.18}$$

Homotopy is usually denoted by the symbol ∼; we will write γ0 ∼ γ1 if γ0 is homotopic
to γ1 .
It is quite clear that homotopy defines an equivalence relation. Therefore all paths
with common end-points and all closed paths may be separated into equivalence classes.
Each class contains all paths that are homotopic to each other.
A special homotopy class is that of paths homotopic to zero. We say that a closed
path γ is homotopic to zero in a domain D if there exists a continuous mapping
γ(s, t) : I × I → D that satisfies conditions (2.18) and such that γ1 (t) = const.
That means that γ may be contracted to a point by a continuous transformation.
Any closed path is homotopic to zero in a simply connected domain, and thus any
two paths with common ends are homotopic to each other there. Therefore the homotopy
classes in a simply connected domain are trivial.
We have introduced the notion of the integral first for a path and then verified
that the value of the integral is determined not by a path but by a curve, that is, by

an equivalence class of paths. The general Cauchy theorem claims that the integral of a
holomorphic function is determined not even by a curve but by the homotopy class of
the curve. In other words, the following theorem holds.
Theorem 2.17 (Cauchy) Let f ∈ O(D) and let γ0 and γ1 be two paths homotopic to each
other in D either as paths with common ends or as closed paths. Then
$$\int_{\gamma_0} f\,dz = \int_{\gamma_1} f\,dz. \tag{2.19}$$

Proof. Let γ : I × I → D be a function that defines the homotopy of the paths γ0
and γ1 . We construct a system of squares Kmn , m, n = 1, . . . , N that covers the square
K = I × I so that each Kmn overlaps each neighboring square. Uniform continuity of
the function γ implies that the squares Kmn may be chosen so small that the image of
each Kmn is contained in a disk Umn ⊂ D. The function f has an anti-derivative Fmn in
each of those disks (we use the fact that a holomorphic function has an anti-derivative
in any disk). We fix the subscript m and proceed as in the proof of Theorem 2.12. We
choose arbitrarily the anti-derivative Fm1 defined in Um1 and pick the anti-derivative
Fm2 defined in Um2 so that Fm1 = Fm2 in the intersection Um1 ∩ Um2 . Similarly we may
choose Fm3 , . . . , FmN so that Fm,n+1 = Fmn in the intersection Um,n+1 ∩ Umn and define
the function
Φm (s, t) = Fmn ◦ γ(s, t) for (s, t) ∈ Kmn , n = 1, . . . , N . (2.20)
The function Φm is clearly continuous in the rectangle $K_m = \cup_{n=1}^{N} K_{mn}$ and is defined
up to an arbitrary constant. We choose arbitrarily Φ1 and pick Φ2 so that Φ1 = Φ2 in
the intersection K1 ∩ K2 ⁶. The functions Φ3 , . . . , ΦN are chosen in exactly the same
fashion so that Φm+1 = Φm in Km+1 ∩ Km . This allows us to define the function
Φ(s, t) = Φm (s, t) for (s, t) ∈ Km , m = 1, . . . , N . (2.21)
The function Φ(s, t) is clearly an anti-derivative along the path γs (t) = γ(s, t) : I → D
for each fixed s. Therefore the Newton-Leibnitz formula implies that
$$\int_{\gamma_s} f\,dz = \Phi(s,1) - \Phi(s,0). \tag{2.22}$$

We consider now two cases separately.


(a) The paths γ0 and γ1 have common ends. Then according to the definition of
homotopy we have γ(s, 0) = a and γ(s, 1) = b for all s ∈ I. Therefore the functions
Φ(s, 0) and Φ(s, 1) are locally constant as functions of s ∈ I at all s and hence they
are constant on I. Therefore Φ(0, 0) = Φ(1, 0) and Φ(0, 1) = Φ(1, 1) so that (2.22)
implies (2.19).
(b) The paths γ0 and γ1 are closed. In this case we have γ(s, 0) = γ(s, 1) so that the
function Φ(s, 0) − Φ(s, 1) is locally constant on I, and hence this function is a constant
on I. Therefore once again (2.22) implies (2.19).
⁶ This is possible since the function Φ1 − Φ2 is locally constant on the connected set K1 ∩ K2 and is
therefore constant on this set.
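Case (b) of Theorem 2.17 may be illustrated numerically. The sketch below is our own illustration under stated assumptions: the family of ellipses `gamma(s, t)`, the test function exp(z)/z and the midpoint rule are hypothetical choices, not taken from the notes. The ellipses form a homotopy of closed paths inside D = C \ {0}, so all the integrals should agree.

```python
import cmath
import math

def gamma(s, t):
    """Homotopy of closed paths in C \\ {0}: the unit circle (s = 0) is deformed into an ellipse (s = 1)."""
    theta = 2 * math.pi * t
    return (1 + s) * math.cos(theta) + 1j * (1 + 3 * s) * math.sin(theta)

def dgamma_dt(s, t):
    """Derivative of gamma(s, t) with respect to t."""
    theta = 2 * math.pi * t
    return 2 * math.pi * (-(1 + s) * math.sin(theta) + 1j * (1 + 3 * s) * math.cos(theta))

def integral_along(f, s, n=2000):
    """Midpoint-rule approximation of the integral of f over the path t -> gamma(s, t), t in [0, 1]."""
    h = 1.0 / n
    total = 0j
    for k in range(n):
        t = (k + 0.5) * h
        total += f(gamma(s, t)) * dgamma_dt(s, t) * h
    return total

f = lambda z: cmath.exp(z) / z      # holomorphic in C \ {0}
values = [integral_along(f, s) for s in (0.0, 0.25, 0.5, 1.0)]
for v in values:
    print(v)                        # each value is numerically close to 2*pi*i
```

The common value is not zero because these paths are not homotopic to zero in the punctured plane; the theorem only asserts that the value is the same along the whole family.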

2.3 Some special cases
We consider in this section some special cases of the Cauchy theorem that are especially
important and deserve to be stated separately.

Theorem 2.18 Let f ∈ O(D). Then its integral along any closed path that is contained in D
and is homotopic to zero vanishes:
$$\int_\gamma f\,dz = 0 \quad \text{if } \gamma \sim 0. \tag{2.23}$$

Proof. Since γ ∼ 0 this path may be continuously deformed into a point a ∈ D and
thus into a circle γε = {|z − a| = ε} of an arbitrarily small radius ε > 0. The general
Cauchy theorem implies that
$$\int_\gamma f\,dz = \int_{\gamma_\varepsilon} f\,dz.$$

The integral on the right side vanishes in the limit ε → 0 since the function f is bounded
in a neighborhood of the point a. However, the left side is independent of ε and thus it
must be equal to zero. 
Any closed path is homotopic to zero in a simply connected domain and thus the
Cauchy theorem has a particularly simple form for such domains - this is its classical
statement:

Theorem 2.19 If a function f is holomorphic in a simply connected domain D ⊂ C
then its integral over any closed path γ : I → D vanishes.

It is easy to deduce from the Cauchy theorem the global theorem of existence of an
anti-derivative in a simply connected domain.

Theorem 2.20 Any function f holomorphic in a simply connected domain D has an
anti-derivative in this domain.

Proof. We first show that the integral of f along a path in D is independent of the
choice of the path and is completely determined by the end-points of the path. Indeed,
let γ1 and γ2 be two paths that connect in D two points a and b. Without any loss of
generality we may assume that the path γ1 is parameterized on an interval [α, β1 ] and
γ2 is parameterized on an interval [β1 , β], α < β1 < β. Let us denote by γ the union of
the paths γ1 and γ2⁻ ; this is a closed path contained in D, and, moreover,
$$\int_\gamma f\,dz = \int_{\gamma_1} f\,dz - \int_{\gamma_2} f\,dz.$$
However, by Theorem 2.19 the integral of f over any closed path vanishes, and this implies our
claim⁷.
⁷ One may also obtain this result directly from the general Cauchy theorem using the fact that any
two paths with common ends are homotopic to each other in a simply connected domain.

We fix now a point a ∈ D and let z be a point in D. The integral of f over any path
γ = $\widetilde{az}$ that connects a and z depends only on z and not on the choice of γ:
$$F(z) = \int_{\widetilde{az}} f(\zeta)\,d\zeta. \tag{2.24}$$

Repeating verbatim the arguments in the proof of Theorem 2.9 we verify that F (z) is
holomorphic in D and F′(z) = f (z) for all z ∈ D so that F is an anti-derivative of f in
D. 
The example of the function f = 1/z in the punctured disk {0 < |z| < 2} (see Remark 2.15)
shows that the assumption that D is simply connected is essential: the global existence
theorem of anti-derivative does not hold in general for multiply connected domains.
The same example shows that the integral of a holomorphic function over a closed
path in a multiply connected domain might not vanish, so that the Cauchy theorem
in its classical form (Theorem 2.19) may not be extended to non-simply connected
domains. However, one may present a reformulation of this theorem that allows such a
generalization.

Definition 2.21 Let the boundary of a compact domain D⁸ consist of a finite number of
closed curves γν , ν = 0, . . . , n. We assume that the outer boundary γ0 , that is, the curve
that separates D from infinity, is oriented counterclockwise while the other boundary
curves γν , ν = 1, . . . , n are oriented clockwise. In other words, all the boundary curves
are oriented in such a way that D remains on the left side as they are traced. The
boundary of D with this orientation is called the oriented boundary and is denoted by ∂D.

We may now state the Cauchy theorem for multiply connected domains as follows.
Theorem 2.22 Let a compact domain D be bounded by a finite number of continuous
curves and let f be holomorphic in its closure D̄. Then the integral of f over its oriented
boundary ∂D is equal to zero:
$$\int_{\partial D} f\,dz = \int_{\gamma_0} f\,dz + \sum_{\nu=1}^{n} \int_{\gamma_\nu} f\,dz = 0. \tag{2.25}$$

Proof. Let us introduce a finite number of cuts $\lambda_\nu^{\pm}$ that connect the components of the
boundary of this domain. It is clear that the closed curve Γ that consists of the oriented
boundary ∂D and the unions $\Lambda^+ = \cup\lambda_\nu^+$ and $\Lambda^- = \cup\lambda_\nu^-$ is homotopic to zero in a
domain G that contains D̄ and such that f is holomorphic in G. Theorem 2.18 implies
that the integral of f along Γ vanishes so that
$$0 = \int_\Gamma f\,dz = \int_{\partial D} f\,dz + \int_{\Lambda^+} f\,dz + \int_{\Lambda^-} f\,dz = \int_{\partial D} f\,dz,$$
since the integrals of f along Λ⁺ and Λ⁻ cancel each other.
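As a sanity check of (2.25) one may take the annulus {1 < |z| < 2}, whose oriented boundary consists of the outer circle traced counterclockwise and the inner circle traced clockwise. The sketch below is ours, under stated assumptions: the helper `circle_integral`, its `clockwise` flag and the test function exp(z)/z² are hypothetical choices made only for illustration.

```python
import cmath
import math

def circle_integral(f, radius, clockwise=False, n=2000):
    """Approximate the integral of f over |z| = radius with the given orientation."""
    sign = -1.0 if clockwise else 1.0
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        w = cmath.exp(sign * 1j * k * dt)              # e^{it} or e^{-it}
        total += f(radius * w) * (sign * 1j * radius * w * dt)
    return total

f = lambda z: cmath.exp(z) / z**2      # holomorphic in the closed annulus 1 <= |z| <= 2

outer = circle_integral(f, 2.0)                    # gamma_0, counterclockwise
inner = circle_integral(f, 1.0, clockwise=True)    # gamma_1, clockwise
boundary_integral = outer + inner

print(outer)                    # each piece alone is close to 2*pi*i in absolute value
print(abs(boundary_integral))   # the sum over the oriented boundary is close to 0
```

Neither boundary component gives zero on its own, since the singular point z = 0 lies inside both circles; only the sum over the oriented boundary vanishes.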


⁸ Recall that a domain D is compact if its closure does not contain the point at infinity.

2.4 The Cauchy Integral Formula
We will obtain here a representation of functions holomorphic in a compact domain with
the help of the integral over the boundary of the domain.
Theorem 2.23 Let the function f be holomorphic in the closure of a compact domain
D that is bounded by a finite number of continuous curves. Then the function f at any
point z ∈ D may be represented as
$$f(z) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta, \tag{2.26}$$
where ∂D is the oriented boundary of D.
The right side of (2.26) is called the Cauchy integral.
Proof. Let ρ > 0 be such that the disk Uρ = {z′ : |z − z′| < ρ} is properly contained
in D and let Dρ = D \ Ūρ . The function g(ζ) = f (ζ)/(ζ − z) is holomorphic in D̄ρ as a ratio
of two holomorphic functions with the denominator different from zero. The oriented
boundary of Dρ consists of the union of ∂D and the circle ∂Uρ = {ζ : |ζ − z| = ρ}
oriented clockwise. Therefore we have
$$\frac{1}{2\pi i}\int_{\partial D_\rho} g(\zeta)\,d\zeta = \frac{1}{2\pi i}\int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta - \frac{1}{2\pi i}\int_{\partial U_\rho} \frac{f(\zeta)}{\zeta - z}\,d\zeta,$$
where in the last integral the circle ∂Uρ is traced counterclockwise.

However, the function g is holomorphic in D̄ρ (its singular point ζ = z lies outside this
set) and hence the Cauchy theorem for multiply connected domains may be applied.
We conclude that the integral of g over ∂Dρ vanishes.
Therefore,
$$\frac{1}{2\pi i}\int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi i}\int_{\partial U_\rho} \frac{f(\zeta)}{\zeta - z}\,d\zeta, \tag{2.27}$$
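where ρ may be taken arbitrarily small. Since the function f is continuous at the point
z, for any ε > 0 we may choose δ > 0 so that
$$|f(\zeta) - f(z)| < \varepsilon \quad \text{for all } \zeta \in \partial U_\rho$$
for all ρ < δ. Therefore the difference
$$f(z) - \frac{1}{2\pi i}\int_{\partial U_\rho} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi i}\int_{\partial U_\rho} \frac{f(z) - f(\zeta)}{\zeta - z}\,d\zeta \tag{2.28}$$
does not exceed $\frac{1}{2\pi}\cdot\frac{\varepsilon}{\rho}\cdot 2\pi\rho = \varepsilon$ in absolute value and thus goes to zero as ρ → 0. However, (2.27) shows that
the left side in (2.28) is independent of ρ and hence is equal to zero for all sufficiently
small ρ, so that
$$f(z) = \frac{1}{2\pi i}\int_{\partial U_\rho} \frac{f(\zeta)}{\zeta - z}\,d\zeta.$$
This together with (2.27) implies (2.26).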
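Formula (2.26) is easy to test numerically when D is a disk. The following sketch is our own, with assumed names: `cauchy_integral` is a hypothetical helper that evaluates the Cauchy integral over a circle by an equally spaced sum, and exp is an arbitrary test function. For a point inside the disk it should reproduce f(z), and for a point outside the closed disk it should return zero.

```python
import cmath
import math

def cauchy_integral(f, z, center=0j, radius=1.0, n=2000):
    """Approximate (1 / (2*pi*i)) * integral of f(zeta) / (zeta - z) over |zeta - center| = radius."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        w = cmath.exp(1j * k * dt)
        zeta = center + radius * w
        total += f(zeta) / (zeta - z) * (1j * radius * w * dt)
    return total / (2j * math.pi)

z_inside = 0.3 + 0.2j     # a point of the unit disk
z_outside = 2.0 + 0.0j    # a point outside the closed unit disk

value_inside = cauchy_integral(cmath.exp, z_inside)
value_outside = cauchy_integral(cmath.exp, z_outside)

print(value_inside, cmath.exp(z_inside))   # the two numbers agree to high accuracy
print(abs(value_outside))                  # close to 0
```

Only boundary values of f enter the computation, yet the interior value is recovered.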

Remark 2.24 If the point z lies outside D̄ and the conditions of Theorem 2.23 hold then
$$\frac{1}{2\pi i}\int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta = 0. \tag{2.29}$$
This follows immediately from the Cauchy theorem since now the function g(ζ) = f (ζ)/(ζ − z)
is holomorphic in D̄.

The Cauchy integral formula expresses a very interesting fact: the values of a function
f holomorphic in the closure D̄ of a domain D are completely determined by its values on the boundary
∂D. Indeed, if the values of f on ∂D are given then the right side of (2.26) is known
and thus the value of f at any point z ∈ D is also determined. This property is the
main difference between holomorphic functions and differentiable functions in the real
analysis sense.

Exercise 2.25 Let the function f be holomorphic in the closure of a domain D that
contains the point at infinity and the boundary ∂D is oriented so that D remains on
the left as the boundary is traced. Show that then
$$f(z) = \frac{1}{2\pi i}\int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta + f(\infty).$$
An easy corollary of Theorem 2.23 is

Theorem 2.26 The value of the function f ∈ O(D) at each point z ∈ D is equal to the
average of its values on any sufficiently small circle centered at z:
$$f(z) = \frac{1}{2\pi}\int_0^{2\pi} f(z + \rho e^{it})\,dt. \tag{2.30}$$

Proof. Consider a disk Uρ = {z′ : |z − z′| < ρ} that is properly contained in
D. The Cauchy integral formula implies that
$$f(z) = \frac{1}{2\pi i}\int_{\partial U_\rho} \frac{f(\zeta)}{\zeta - z}\,d\zeta. \tag{2.31}$$
Introducing the parameterization $\zeta = z + \rho e^{it}$, t ∈ [0, 2π], of ∂Uρ and replacing $d\zeta = i\rho e^{it}\,dt$
we obtain (2.30) from (2.31).
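The mean value property (2.30) may be contrasted numerically with the behavior of a function that is smooth in the real sense but not holomorphic. This is a sketch under our own assumptions: the helper `circle_mean` and both test functions are hypothetical choices, and the average is approximated by an equally spaced sum.

```python
import cmath
import math

def circle_mean(f, z, rho, n=2000):
    """Average of f over n equally spaced points of the circle |zeta - z| = rho."""
    total = 0
    for k in range(n):
        total += f(z + rho * cmath.exp(2j * math.pi * k / n))
    return total / n

z0 = 1 + 1j
rho = 0.5

holomorphic = lambda z: z**3 + cmath.exp(z)
non_holomorphic = lambda z: abs(z) ** 2       # smooth in the real sense only

mean_hol = circle_mean(holomorphic, z0, rho)
mean_non = circle_mean(non_holomorphic, z0, rho)

print(mean_hol, holomorphic(z0))         # the mean equals the value at the center
print(mean_non, non_holomorphic(z0))     # the mean is |z0|^2 + rho^2, not |z0|^2
```

For |z|² the circle average exceeds the value at the center by ρ², so the mean value property genuinely fails for this real-differentiable function.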
The mean value theorem shows that holomorphic functions are built in a very regular
fashion, so to speak, and their values are intricately related to the values at other points.
This explains why these functions have specific properties that the real differentiable
functions lack. We will consider many other such properties later.
Before we conclude we present an integral representation of R-differentiable functions
that generalizes the Cauchy integral formula.

Theorem 2.27 Let f ∈ C¹(D̄) be a continuously differentiable function in the real
sense in the closure of a compact domain D bounded by a finite number of piecewise
smooth curves. Then we have
$$f(z) = \frac{1}{2\pi i}\int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta - \frac{1}{\pi}\iint_D \frac{\partial f}{\partial\bar\zeta}\,\frac{d\xi\,d\eta}{\zeta - z} \tag{2.32}$$
for all z ∈ D (here ζ = ξ + iη inside the integral).

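Proof. Let us delete a small disk Ūρ = {ζ : |ζ − z| ≤ ρ} out of D and apply Green's
formula to the function g(ζ) = f (ζ)/(ζ − z), which is continuously differentiable in the domain
Dρ = D \ Ūρ ⁹:
$$\int_{\partial D} \frac{f(\zeta)}{\zeta - z}\,d\zeta - \int_{\partial U_\rho} \frac{f(\zeta)}{\zeta - z}\,d\zeta = 2i \iint_{D_\rho} \frac{\partial f}{\partial\bar\zeta}\,\frac{d\xi\,d\eta}{\zeta - z}. \tag{2.33}$$
The function f is continuous at z so that f (ζ) = f (z) + O(ρ) for ζ ∈ ∂Uρ , where O(ρ) → 0
as ρ → 0, and thus
$$\int_{\partial U_\rho} \frac{f(\zeta)}{\zeta - z}\,d\zeta = f(z)\int_{\partial U_\rho} \frac{1}{\zeta - z}\,d\zeta + \int_{\partial U_\rho} \frac{O(\rho)}{\zeta - z}\,d\zeta = 2\pi i f(z) + O(\rho).$$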

Passing to the limit in (2.33) and using the fact that the double integrals in (2.32) and
(2.33) are convergent¹⁰ we obtain (2.32).

2.5 The Taylor series


We will obtain the representation of holomorphic functions as sums of power series (the
Taylor series) in this section. Let us recall the simplest results regarding series familiar
from the real analysis.
One of the main theorems of the complex analysis is

Theorem 2.28 Let f ∈ O(D) and let z0 ∈ D be an arbitrary point in D. Then the
function f may be represented as a sum of a convergent power series
$$f(z) = \sum_{n=0}^{\infty} c_n (z - z_0)^n \tag{2.34}$$
inside any disk U = {|z − z0 | < R} ⊂ D.


⁹ We have $\frac{\partial g}{\partial\bar\zeta} = \frac{1}{\zeta - z}\,\frac{\partial f}{\partial\bar\zeta}$ since the function 1/(ζ − z) is holomorphic in ζ so that its derivative with
respect to ζ̄ vanishes.
¹⁰ Our argument shows that the limit $\lim_{\rho\to 0}\iint_{D_\rho} \frac{\partial f}{\partial\bar\zeta}\,\frac{d\xi\,d\eta}{\zeta - z}$ exists. Moreover, since f ∈ C¹(D) the
double integral in (2.32) exists as can be easily seen by passing to the polar coordinates and thus this
limit coincides with it.

Proof. Let z ∈ U be an arbitrary point. Choose r > 0 so that |z − z0 | < r < R and
denote γr = {ζ : |ζ − z0 | = r}. The Cauchy integral formula implies that
$$f(z) = \frac{1}{2\pi i}\int_{\gamma_r} \frac{f(\zeta)}{\zeta - z}\,d\zeta.$$
In order to represent f as a power series let us represent the kernel of this integral as
the sum of a geometric series:
$$\frac{1}{\zeta - z} = \left[(\zeta - z_0)\left(1 - \frac{z - z_0}{\zeta - z_0}\right)\right]^{-1} = \sum_{n=0}^{\infty} \frac{(z - z_0)^n}{(\zeta - z_0)^{n+1}}. \tag{2.35}$$
We multiply both sides by $\frac{1}{2\pi i} f(\zeta)$ and integrate the series term-wise along γr . The
series (2.35) converges uniformly on γr since
$$\left|\frac{z - z_0}{\zeta - z_0}\right| = \frac{|z - z_0|}{r} = q < 1$$
for all ζ ∈ γr . Uniform convergence is preserved under multiplication by a continuous
and hence bounded function $\frac{1}{2\pi i} f(\zeta)$. Therefore our term-wise integration is legitimate
and we obtain
$$f(z) = \frac{1}{2\pi i}\sum_{n=0}^{\infty} \int_{\gamma_r} \frac{f(\zeta)\,d\zeta}{(\zeta - z_0)^{n+1}}\,(z - z_0)^n = \sum_{n=0}^{\infty} c_n (z - z_0)^n,$$
where¹¹
$$c_n = \frac{1}{2\pi i}\int_{\gamma_r} \frac{f(\zeta)\,d\zeta}{(\zeta - z_0)^{n+1}}, \qquad n = 0, 1, \ldots. \tag{2.36}$$
Definition 2.29 The power series (2.34) with coefficients given by (2.36) is the Taylor
series of the function f at the point z0 (or centered at z0 ).
The Cauchy theorem 2.17 implies that the coefficients cn of the Taylor series defined by
(2.36) do not depend on the radius r of the circle γr , 0 < r < R.
The Cauchy inequalities. Let the function f be holomorphic in a closed disk
Ū = {|z − z0 | ≤ r} and let its absolute value on the circle γr = ∂U be bounded by a
constant M . Then the coefficients of the Taylor series of f at z0 satisfy the inequalities
$$|c_n| \le M/r^n, \qquad n = 0, 1, \ldots. \tag{2.37}$$
Proof. We deduce from (2.36) using the fact that |f (ζ)| ≤ M for all ζ ∈ γr :
$$|c_n| \le \frac{1}{2\pi}\,\frac{M}{r^{n+1}}\,2\pi r = \frac{M}{r^n}.$$
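Formula (2.36) and the inequalities (2.37) can be checked numerically for f = exp at z0 = 0, where the Taylor coefficients are known to be 1/n!. The sketch below is ours under stated assumptions: `taylor_coefficient` is a hypothetical helper that discretizes (2.36) on the circle of radius r by an equally spaced sum, and M = e^r is used as the bound for |exp| on that circle.

```python
import cmath
import math

def taylor_coefficient(f, z0, n, r=1.0, samples=256):
    """Approximate c_n from (2.36) by an equally spaced sum over the circle |zeta - z0| = r."""
    total = 0j
    for k in range(samples):
        w = cmath.exp(2j * math.pi * k / samples)
        # with zeta = z0 + r*w we have f(zeta) d(zeta) / (zeta - z0)^(n+1) = f(zeta) * i dt / (r*w)^n
        total += f(z0 + r * w) / w**n
    return total / (samples * r**n)

r = 2.0
M = math.exp(r)                  # maximum of |exp(z)| on |z| = r
coefficients = [taylor_coefficient(cmath.exp, 0j, n, r=r) for n in range(8)]

for n, c in enumerate(coefficients):
    # columns: n, |c_n|, the exact value 1/n!, the Cauchy bound M / r^n
    print(n, abs(c), 1 / math.factorial(n), M / r**n)
```

In agreement with the remark after Definition 2.29, the computed coefficients should not depend on the particular radius r used in the contour.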
¹¹ This theorem was presented by Cauchy in 1831 in Turin. Its proof was first published in Italy, and
it appeared in France in 1841. However, Cauchy did not justify the term-wise integration of the series.
This caused a remark by Chebyshev in his paper from 1844 that such integration is possible only in
some “particular cases”.

Exercise 2.30 Let P (z) be a polynomial in z of degree n. Show that if |P (z)| ≤ M for
|z| = 1 then $|P(z)| \le M|z|^n$ for all |z| ≥ 1.
The Cauchy inequalities imply the interesting
Theorem 2.31 (Liouville¹²) If the function f is holomorphic in the whole complex
plane and bounded then it is equal identically to a constant.
Proof. According to Theorem 2.28 the function f may be represented by a Taylor series
$$f(z) = \sum_{n=0}^{\infty} c_n z^n$$
in any closed disk Ū = {|z| ≤ R}, R < ∞ with the coefficients that do not depend on
R. Since f is bounded in C, say, |f (z)| ≤ M , the Cauchy inequalities imply that
for any n = 0, 1, . . . we have $|c_n| \le M/R^n$. We may take R to be arbitrarily large and
hence for n ≥ 1 the right side tends to zero as R → +∞ while the left side is independent of R.
Therefore cn = 0 for n ≥ 1 and hence f (z) = c0 for all z ∈ C.
Therefore the two properties of a function – to be holomorphic and bounded are
realized simultaneously only for the trivial functions that are equal identically to a
constant.

Exercise 2.32 Prove the following properties of functions f holomorphic in the whole
plane C:
(1) Let $M(r) = \sup_{|z|=r} |f(z)|$. If $M(r) = Ar^N + B$, where r is an arbitrary positive
real number and A, B and N are constants, then f is a polynomial of degree not higher
than N .
(2) If all values of f belong to the right half-plane then f = const.
(3) If limz→∞ f (z) = ∞ then the set {z ∈ C : f (z) = 0} is not empty.

The Liouville theorem may be reformulated:

Theorem 2.33 If a function f is holomorphic in the closed complex plane C̄ then it is
equal identically to a constant.
Proof. If the function f is holomorphic at infinity, the limit $\lim_{z\to\infty} f(z)$ exists and is
finite. Therefore f is bounded in a neighborhood U = {|z| > R} of this point. However,
f is also bounded in the complement $U^c$ = {|z| ≤ R} since it is continuous there and
the set $U^c$ is compact. Therefore f is holomorphic and bounded in C and thus Theorem
2.31 implies that it is equal to a constant.

Exercise 2.34 Show that a function f (z) that is holomorphic at z = 0 and satisfies
f (z) = f (2z), is equal identically to a constant.
¹² Actually this theorem was proved by Cauchy in 1844 while Liouville had proved only a partial
result in the same year. The wrong attribution was started by a student of Liouville who had learned
the theorem at one of his lectures.

Theorem 2.28 claims that any function holomorphic in a disk may be represented as
a sum of a convergent power series inside this disk. We would like to show now that,
conversely, the sum of a convergent power series is a holomorphic function. Let us first
recall some properties of power series that are familiar from the real analysis.

Lemma 2.35 If the terms of a power series
$$\sum_{n=0}^{\infty} c_n (z - a)^n \tag{2.38}$$
are bounded at some point z0 ∈ C, that is,
$$|c_n (z_0 - a)^n| \le M, \qquad n = 0, 1, \ldots, \tag{2.39}$$

then the series converges in the disk U = {z : |z − a| < |z0 − a|}. Moreover, it converges
absolutely and uniformly on any set K that is properly contained in U .

Proof. As in real analysis.

Theorem 2.36 (Abel) Let the power series (2.38) converge at a point z0 ∈ C. Then
this series converges in the disk U = {z : |z − a| < |z0 − a|} and, moreover, it converges
uniformly and absolutely on every compact subset of U .

Proof. If the series converges at z0 then its terms tend to zero and hence are bounded, so the claim follows immediately from Lemma 2.35.


The Cauchy-Hadamard formula. Let the coefficients of the power series (2.38)
satisfy
$$\limsup_{n\to\infty} |c_n|^{1/n} = \frac{1}{R}, \tag{2.40}$$
with 0 ≤ R ≤ ∞ (we set 1/0 = ∞ and 1/∞ = 0). Then the series (2.38) converges at
all z such that |z − a| < R and diverges at all z such that |z − a| > R.
Proof. As in real analysis.
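The formula (2.40) can be explored numerically, with the caveat that a lim sup can never be computed from finitely many terms. The sketch below is a heuristic of our own: the helper names, the window 1000 ≤ n ≤ 2000 and the three coefficient sequences are assumptions chosen for illustration, and the maximum over the window merely stands in for the lim sup. Logarithms of |cn| are used to avoid overflow.

```python
import math

def root_test_estimate(log_abs_c, n_min=1000, n_max=2000):
    """Maximum of |c_n|^(1/n) over n_min <= n <= n_max, a finite stand-in for the lim sup."""
    return max(math.exp(log_abs_c(n) / n) for n in range(n_min, n_max + 1))

def radius_estimate(log_abs_c):
    """Heuristic estimate of the radius of convergence R = 1 / lim sup |c_n|^(1/n)."""
    return 1.0 / root_test_estimate(log_abs_c)

# c_n = 2^n for even n and c_n = 1 for odd n: the limit does not exist, the lim sup is 2
log_c_mixed = lambda n: n * math.log(2.0) if n % 2 == 0 else 0.0
# c_n = n: the lim sup is 1, approached slowly
log_c_linear = lambda n: math.log(n)
# c_n = 1/n!: the lim sup is 0, so R is infinite
log_c_factorial = lambda n: -math.lgamma(n + 1)

radius_mixed = radius_estimate(log_c_mixed)
radius_linear = radius_estimate(log_c_linear)
root_factorial = root_test_estimate(log_c_factorial)

print(radius_mixed)      # close to 0.5
print(radius_linear)     # slightly below 1
print(root_factorial)    # small, consistent with 1/R = 0
```

The first example shows why the lim sup, and not the limit, appears in (2.40): the odd-indexed coefficients do not affect the radius.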

Definition 2.37 The domain of convergence of a power series (2.38) is the interior of
the set E of the points z ∈ C where the series converges.

Theorem 2.38 The domain of convergence of the power series (2.38) is the open disk
{|z − a| < R}, where R is the number determined by the Cauchy-Hadamard formula.

Proof. The previous proposition shows that the set E where the series (2.38) converges
consists of the disk U = {|z − a| < R} and possibly some other set of points on the
boundary {|z − a| = R} of U . Therefore the interior of E is the open disk {|z − a| < R}.

The open disk in Theorem 2.38 is called the disk of convergence of the power series
(2.38), and the number R is its radius of convergence. We pass now to the proof that
the sum of a power series is holomorphic.

Theorem 2.39 The sum of a power series
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n \tag{2.41}$$
is holomorphic in its domain of convergence.


Proof. We assume that the radius of convergence R > 0, otherwise there is nothing to
prove. Let us define the formal series of derivatives
$$\sum_{n=1}^{\infty} n c_n (z - a)^{n-1} = \varphi(z). \tag{2.42}$$
Its convergence is equivalent to that of the series $\sum_{n=1}^{\infty} n c_n (z - a)^n$. However, since
$\limsup_{n\to\infty} |n c_n|^{1/n} = \limsup_{n\to\infty} |c_n|^{1/n}$ the radius of convergence of the series (2.42) is also
equal to R. Therefore this series converges uniformly on compact subsets of the disk
U = {|z − a| < R} and hence the function φ(z) is continuous in this disk.
Moreover, for the same reason the series (2.42) may be integrated term-wise along
the boundary of any triangle ∆ that is properly contained in U :
$$\int_{\partial\Delta} \varphi\,dz = \sum_{n=1}^{\infty} n c_n \int_{\partial\Delta} (z - a)^{n-1}\,dz = 0.$$
The integrals on the right side vanish by the Cauchy theorem. Therefore we may apply
Theorem 2.9 and Remark 2.10 which imply that the function
$$\int_{[a,z]} \varphi(\zeta)\,d\zeta = \sum_{n=1}^{\infty} n c_n \int_{[a,z]} (\zeta - a)^{n-1}\,d\zeta = \sum_{n=1}^{\infty} c_n (z - a)^n$$
has a derivative at all z ∈ U that is equal to φ(z). Once again we used uniform
convergence to justify the term-wise integration above. However, then the function
$$f(z) = c_0 + \int_{[a,z]} \varphi(\zeta)\,d\zeta$$
has a derivative at all z ∈ U that is also equal to φ(z).

2.5.1 Properties of holomorphic functions


We discuss some corollaries of Theorem 2.39.
Theorem 2.40 The derivative of a function f ∈ O(D) is holomorphic in the domain D.
Proof. Given a point z0 ∈ D we construct a disk U = {|z − z0 | < R} that is contained in
D. Theorem 2.28 implies that f may be represented as a sum of a converging power series
in this disk. Theorem 2.39 implies that its derivative f′ = φ may also be represented as
a sum of a power series converging in the same disk. Therefore one may apply Theorem
2.39 also to the function φ and hence φ is holomorphic in the disk U .
This theorem also implies directly the necessary condition for the existence of anti-
derivative that we have mentioned in Section 2.1.1:

Corollary 2.41 If a continuous function f has an anti-derivative F in a domain D
then f is holomorphic in D.

Using Theorem 2.40 once again we obtain

Theorem 2.42 Any function f ∈ O(D) has derivatives of all orders in D that are also
holomorphic in D.

The next theorem establishes uniqueness of the power series representation of a function
relative to a given point.

Theorem 2.43 Let a function f have a representation
$$f(z) = \sum_{n=0}^{\infty} c_n (z - z_0)^n \tag{2.43}$$
in a disk {|z − z0 | < R}. Then the coefficients cn are determined uniquely as
$$c_n = \frac{f^{(n)}(z_0)}{n!}, \qquad n = 0, 1, \ldots \tag{2.44}$$
Proof. Inserting z = z0 in (2.43) we find c0 = f (z0 ). Differentiating (2.43) termwise we
obtain
$$f'(z) = c_1 + 2c_2 (z - z_0) + 3c_3 (z - z_0)^2 + \ldots$$
Inserting z = z0 above we obtain c1 = f′(z0 ). Differentiating (2.43) n times we obtain
(we do not write out the formulas for $\tilde c_j$ below)
$$f^{(n)}(z) = n!\,c_n + \tilde c_1 (z - z_0) + \tilde c_2 (z - z_0)^2 + \ldots$$
and once again using z = z0 we obtain $c_n = f^{(n)}(z_0)/n!$.


Sometimes Theorem 2.43 is formulated as follows: "Every converging power series is
the Taylor series for its sum."
Comparing expressions (2.44) for the coefficients cn with their values given by (2.36)
we obtain the formulas for the derivatives of holomorphic functions:
$$f^{(n)}(z_0) = \frac{n!}{2\pi i}\int_{\gamma_r} \frac{f(\zeta)\,d\zeta}{(\zeta - z_0)^{n+1}}, \qquad n = 1, 2, \ldots \tag{2.45}$$

If the function f is holomorphic in a domain D and G is a sub-domain of D that is
bounded by finitely many continuous curves and such that z0 ∈ G then we may replace
the contour γr in (2.45) by the oriented boundary ∂G, using the invariance of the integral
under homotopy of paths. Writing z in place of z0 we obtain the Cauchy integral formula for derivatives
of holomorphic functions:
$$f^{(n)}(z) = \frac{n!}{2\pi i}\int_{\partial G} \frac{f(\zeta)\,d\zeta}{(\zeta - z)^{n+1}}, \qquad n = 1, 2, \ldots \tag{2.46}$$

These formulas may be also obtained from the Cauchy integral formula
$$f(z) = \frac{1}{2\pi i}\int_{\partial G} \frac{f(\zeta)\,d\zeta}{\zeta - z},$$
by differentiating with respect to the parameter z inside the integral. Our indirect
argument allowed us to bypass the justification of this operation.
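Formula (2.46) may be tested on a contour that is not centered at the point z. The sketch below is ours under stated assumptions: `derivative_by_cauchy` is a hypothetical helper, G is taken to be the disk {|ζ| < 2}, and f = sin, whose third derivative is −cos, serves as an arbitrary test function.

```python
import cmath
import math

def derivative_by_cauchy(f, z, order, center=0j, radius=2.0, n=2000):
    """Approximate f^(order)(z) by formula (2.46) over the circle |zeta - center| = radius."""
    total = 0j
    dt = 2 * math.pi / n
    for k in range(n):
        w = cmath.exp(1j * k * dt)
        zeta = center + radius * w
        total += f(zeta) / (zeta - z) ** (order + 1) * (1j * radius * w * dt)
    return math.factorial(order) * total / (2j * math.pi)

z = 0.5
third_derivative = derivative_by_cauchy(cmath.sin, z, 3)

print(third_derivative, -math.cos(z))   # the third derivative of sin is -cos
```

No difference quotients are involved: the derivative is obtained purely from values of f on the boundary of G.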
Theorem 2.44 (Morera¹³) If a function f is continuous in a domain D and its integral
over the boundary ∂∆ of any triangle ∆ vanishes then f is holomorphic in D.
Proof. Given a ∈ D we construct a disk U = {|z − a| < r} ⊂ D. The function
$F(z) = \int_{[a,z]} f(\zeta)\,d\zeta$ is holomorphic in U and F′ = f there (see remark after Theorem 2.9). Theorem 2.40
implies then that f = F′ is also holomorphic in U . This proves that f is holomorphic at all
a ∈ D.
Remark 2.45 The Morera Theorem states the converse to the Cauchy theorem as
formulated in Theorem 2.8, that is, to the statement that the integral of a holomorphic function over the
boundary of any triangle vanishes. However, the Morera theorem also requires that f is
continuous in D. This assumption is essential: for instance, the function f that is equal
to zero everywhere in C except at z = 0, where it is equal to one, is not even continuous
at z = 0 but its integral over any triangle vanishes.
However, the Morera theorem does not require any differentiability of f : from the
modern point of view we may say that a function satisfying the assumptions of this
theorem is a generalized solution of the Cauchy-Riemann equations. The theorem asserts
that any generalized solution is a classical solution, that is, it has partial derivatives that
satisfy the Cauchy-Riemann equations.
Remark 2.46 We have seen that the representation as a power series in a disk {|z − a| <
R} is a necessary and sufficient condition for f to be holomorphic in this disk. However,
convergence of the power series on the boundary of the disk is not related to it being
holomorphic at those points. This may be seen on the simplest examples. Indeed, the
geometric series
$$\frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n \tag{2.47}$$
converges in the open disk {|z| < 1}. The series (2.47) diverges at all points on {|z| = 1}
since its n-th term does not vanish in the limit n → ∞, although its sum 1/(1 − z) is
holomorphic at every point of this circle except z = 1. On the other hand, the series
$$f(z) = \sum_{n=1}^{\infty} \frac{z^n}{n^2} \tag{2.48}$$
converges at all points of {|z| = 1} since it is majorized by the convergent number
series $\sum_{n=1}^{\infty} \frac{1}{n^2}$. However, its sum cannot be holomorphic at z = 1 since its derivative
$f'(z) = \sum_{n=1}^{\infty} \frac{z^{n-1}}{n}$ is unbounded as z tends to one along the real axis.
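The behavior of the series (2.48) near z = 1 can be observed with partial sums. This is a rough numerical sketch of our own: the helper names and the truncation lengths are assumptions, and the closed forms used for comparison (π²/6 for the number series and −ln(1 − x)/x for the derivative series on 0 < x < 1) are standard facts that are not derived in these notes.

```python
import math

def partial_sum_f(z, terms):
    """Partial sum of (2.48): the sum of z^n / n^2 for n = 1, ..., terms."""
    return sum(z**n / n**2 for n in range(1, terms + 1))

def partial_sum_derivative(x, terms):
    """Partial sum of the derivative series: the sum of x^(n-1) / n for n = 1, ..., terms."""
    return sum(x ** (n - 1) / n for n in range(1, terms + 1))

# The series itself converges at the boundary point z = 1.
boundary_value = partial_sum_f(1.0, 100000)

# The derivative series grows without bound as x approaches 1 along the real axis.
points = (0.9, 0.99, 0.999)
derivative_values = [partial_sum_derivative(x, 60000) for x in points]

print(boundary_value, math.pi**2 / 6)
for x, d in zip(points, derivative_values):
    print(x, d, -math.log(1 - x) / x)
```

The derivative values grow like ln(1/(1 − x)), which is slow but unbounded, and this is enough to rule out holomorphy at z = 1.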
¹³ The theorem was proved by an Italian mathematician Giacinto Morera in 1889.

2.6 The Uniqueness theorem
Definition 2.47 A zero of the function f is a point a ∈ C where f vanishes, that is,
a solution of f (z) = 0.

Zeroes of differentiable functions in the real analysis may have limit points where the
function f remains differentiable; for example, f (x) = x² sin(1/x) behaves in this manner
at x = 0. The situation is different in the complex analysis: zeroes of a holomorphic
function that is not identically zero must be isolated; they may have limit points only on the boundary of the
domain where the function is holomorphic.
Theorem 2.48 Let the point a ∈ C be a zero of the function f that is holomorphic
at this point, and f is not equal identically to zero in a neighborhood of a. Then there
exists a number n ∈ N so that
$$f(z) = (z - a)^n \varphi(z), \tag{2.49}$$
where the function φ is holomorphic at a and is different from zero in a neighborhood of
a.
Proof. Indeed, f may be represented by a power series in a neighborhood of a: f (z) =
$\sum_{n=0}^{\infty} c_n (z - a)^n$. The first coefficient c0 = 0 but not all cn are zero, otherwise f would
vanish identically in a neighborhood of a. Therefore there exists the smallest n so that
cn ≠ 0 and the power series has the form
$$f(z) = c_n (z - a)^n + c_{n+1} (z - a)^{n+1} + \ldots, \qquad c_n \ne 0. \tag{2.50}$$
Let us denote by
$$\varphi(z) = c_n + c_{n+1} (z - a) + \ldots \tag{2.51}$$
so that $f(z) = (z - a)^n \varphi(z)$. The series (2.51) converges in a neighborhood of a (it has
the same radius of convergence as the series for f ) and thus φ is holomorphic in this neighborhood.
Moreover, since φ(a) = cn ≠ 0 and φ is continuous at a, φ(z) ≠ 0 in a neighborhood of
a.
Theorem 2.49 (Uniqueness) Let f1 , f2 ∈ O(D), then if f1 = f2 on a set E that has a
limit point in D then f1 (z) = f2 (z) for all z ∈ D.
Proof. The function f = f1 − f2 is holomorphic in D. We should prove that f ≡ 0
in D, that is, that the set F = {z ∈ D : f (z) = 0}, that contains in particular the
set E, coincides with D. The limit point a of E belongs to F since
f is continuous. Theorem 2.48 implies that f ≡ 0 in a neighborhood of a, otherwise it
would be impossible for a to be a limit point of the set of zeroes of f .
Therefore the interior F° of F is not empty: it contains a. Moreover, F° is an open
set as the interior of a set. However, it is also closed in the relative topology of D.
Indeed, let b ∈ D be a limit point of F°, then the same Theorem 2.48 implies that f ≡ 0
in a neighborhood of b so that b ∈ F°. Finally, the set D being a domain is connected,
and hence F° = D by Theorem 1.29 of Chapter 1.

This theorem shows another important difference between holomorphic functions and
functions differentiable in the sense of real analysis. Indeed, even two infinitely
differentiable functions may coincide on an open set without being identically equal to
each other everywhere else. However, according to the previous theorem two holomor-
phic functions that coincide on a set that has a limit point in the domain where they
are holomorphic (for instance on a small disk, or an arc inside the domain) have to be
equal identically in the whole domain.
Exercise 2.50 Show that if f is holomorphic at z = 0 then there exists n ∈ N so that
$f(1/n) \ne (-1)^n/n^3$.
We note that one may simplify the formulation of Theorem 2.48 using the Uniqueness
theorem. That is, the assumption that f is not equal identically to zero in any neighbor-
hood of the point a may be replaced by the assumption that f is not equal identically
to zero everywhere (these two assumptions coincide by the Uniqueness theorem).
Theorem 2.48 shows that holomorphic functions vanish as an integer power of (z −a).
Definition 2.51 The order, or multiplicity, of a zero a ∈ C of a function f holomorphic
at this point, is the order of the first non-zero derivative $f^{(k)}(a)$. In other words, a point
a is a zero of f of order n if
$$f(a) = \cdots = f^{(n-1)}(a) = 0, \qquad f^{(n)}(a) \ne 0, \qquad n \ge 1. \tag{2.52}$$
Expressions $c_k = f^{(k)}(a)/k!$ for the coefficients of the Taylor series show that the order
of zero is the index of the first non-zero Taylor coefficient of the function f at the point
a, or, alternatively, the number n in Theorem 2.48. The Uniqueness theorem shows
that holomorphic functions that are not equal identically to zero may not have zeroes
of infinite order.
Similar to what is done for polynomials, one may define the order of zeroes using
division.
Theorem 2.52 The order of a zero a ∈ C of a holomorphic function f coincides with
the largest k such that $(z-a)^k$ is a divisor of f, in the sense that the ratio
$f(z)/(z-a)^k$ (extended by continuity to z = a) is a holomorphic function at a.
Proof. Let us denote by n the order of the zero a and by N the highest power of (z − a)
that is a divisor of f. Expression (2.49) shows that f is divisible by any power k ≤ n:
$$\frac{f(z)}{(z-a)^k} = (z-a)^{n-k}\phi(z),$$
and thus N ≥ n. Let f be divisible by $(z-a)^N$ so that the ratio
$$\psi(z) = \frac{f(z)}{(z-a)^N}$$
is a holomorphic function at a. Developing ψ as a power series in (z − a) we find that the
Taylor expansion of f at a starts with a power not smaller than N. Therefore n ≥ N
and since we have already shown that n ≤ N we conclude that n = N. □

Example 2.53 The function f(z) = sin z − z has a third order zero at z = 0. Indeed, we
have $f(0) = f'(0) = f''(0) = 0$ but $f'''(0) \neq 0$. This may also be seen from the representation
$$f(z) = -\frac{z^3}{3!} + \frac{z^5}{5!} - \dots$$
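The order of the zero in this example can be probed numerically. The following is a minimal sketch (the helper name `zero_order_quotient` is hypothetical, not part of the notes): it evaluates $(\sin z - z)/z^k$ near z = 0, which should stay close to $-1/6$ for k = 3 and blow up for k = 4.

```python
import cmath


def zero_order_quotient(z: complex, k: int) -> complex:
    """Return (sin z - z) / z**k for a non-zero complex z."""
    return (cmath.sin(z) - z) / z**k


if __name__ == "__main__":
    # For k = 3 the quotient approaches -1/6, the first non-zero Taylor coefficient.
    for z in (0.1, 0.01 + 0.01j, -0.02j):
        print(z, zero_order_quotient(z, 3))
```

Dividing by one more power of z than the order of the zero produces a quotient that is no longer bounded near z = 0, in agreement with Theorem 2.52.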
Remark 2.54 Let f be holomorphic at infinity and equal to zero there. It is natural
to define the order of zero at this point as the order of zero at z = 0 of
the function φ(z) = f(1/z). The theorem we just proved remains true also for a = ∞
if instead of dividing by $(z-a)^k$ we consider multiplication by $z^k$. For example, the
function $f(z) = \frac{1}{z^3} + \frac{1}{z^2}$ has order 2 at infinity, since $\phi(z) = z^3 + z^2 = z^2(1+z)$.

2.7 The Weierstrass theorem


Recall that termwise differentiation of a series in real analysis requires uniform conver-
gence of the series in a neighborhood of a point as well as uniform convergence of the
series of derivatives. The situation is simpler in complex analysis. The following
theorem holds.
Theorem 2.55 (Weierstrass) If the series
$$f(z) = \sum_{n=0}^{\infty} f_n(z) \tag{2.53}$$
of functions holomorphic in a domain D converges uniformly on any compact subset of
this domain then
(i) the sum of this series is holomorphic in D;
(ii) the series may be differentiated termwise arbitrarily many times at any point in D.
Proof. Let a be an arbitrary point in D and consider the disk U = {|z − a| < r} that
is properly contained in D. The series (2.53) converges uniformly in U by assumption
and thus its sum is continuous in U. Let ∆ ⊂ U be a triangle contained in U and let
γ = ∂∆. Since the series (2.53) converges uniformly in U we may integrate it termwise
along γ:
$$\int_{\gamma} f(z)\,dz = \sum_{n=0}^{\infty} \int_{\gamma} f_n(z)\,dz.$$

However, the Cauchy theorem implies that all integrals on the right side vanish since
the functions fn are holomorphic. Hence the Morera theorem implies that the function
f is holomorphic and part (i) is proved.
In order to prove part (ii) we once again take an arbitrary point a ∈ D, consider the
same disk U as in the proof of part (i) and denote by $\gamma_r = \partial U = \{|z-a| = r\}$. The
Cauchy formulas for derivatives imply that
$$f^{(k)}(a) = \frac{k!}{2\pi i}\int_{\gamma_r} \frac{f(\zeta)}{(\zeta-a)^{k+1}}\,d\zeta. \tag{2.54}$$
The series
$$\frac{f(\zeta)}{(\zeta-a)^{k+1}} = \sum_{n=0}^{\infty} \frac{f_n(\zeta)}{(\zeta-a)^{k+1}} \tag{2.55}$$
differs from (2.53) by a factor that has constant absolute value $1/r^{k+1}$ for all $\zeta \in \gamma_r$.
Therefore it converges uniformly on $\gamma_r$ and may be integrated termwise in (2.54). Using
expression (2.55) in (2.54) we obtain
$$f^{(k)}(a) = \frac{k!}{2\pi i}\sum_{n=0}^{\infty}\int_{\gamma_r} \frac{f_n(\zeta)}{(\zeta-a)^{k+1}}\,d\zeta = \sum_{n=0}^{\infty} f_n^{(k)}(a),$$

and part (ii) is proved. 
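As an illustration of part (ii), here is a small sketch using the geometric series $\sum_{n\ge 0} z^n = 1/(1-z)$, which converges uniformly on compact subsets of the unit disk. The choice of series and the function names are assumptions made for this example only; the termwise derivative should approach $1/(1-z)^2$.

```python
def geometric_partial_sum(z: complex, terms: int) -> complex:
    """Partial sum of sum_{n=0}^{terms} z**n."""
    return sum(z**n for n in range(terms + 1))


def termwise_derivative_partial_sum(z: complex, terms: int) -> complex:
    """Partial sum of the termwise derivative sum_{n=1}^{terms} n * z**(n-1)."""
    return sum(n * z ** (n - 1) for n in range(1, terms + 1))


if __name__ == "__main__":
    z = 0.3 + 0.2j  # a point inside the unit disk
    print(abs(geometric_partial_sum(z, 200) - 1 / (1 - z)))
    print(abs(termwise_derivative_partial_sum(z, 200) - 1 / (1 - z) ** 2))
```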



Exercise 2.56 Explain why the series $\sum_{n=1}^{\infty} \frac{\sin(n^3 z)}{n^2}$ may not be differentiated termwise.

3 Properties of Harmonic functions


We will now discuss some basic properties of the harmonic functions. Most of them hold
not only in two but also in higher dimensions, but some are specific to two dimensions.
We have already seen the mean value principle: if u(z) is harmonic in a disk |z − z0| < R,
then for any 0 < ρ < R
$$u(z_0) = \frac{1}{2\pi}\int_0^{2\pi} u(z_0 + \rho e^{i\phi})\,d\phi.$$
Here is the generalization to harmonic functions in higher dimensions.
Theorem 3.1 Let $U \subset \mathbb{R}^n$ be an open set and let B(x, r) be a ball centered at $x \in \mathbb{R}^n$
of radius r > 0 contained in U. Assume that the function u(x) satisfies ∆u = 0 for all
x ∈ U and that $u \in C^2(U)$. Then we have
$$u(x) = \frac{1}{|B(x,r)|}\int_{B(x,r)} u\,dy = \frac{1}{|\partial B(x,r)|}\int_{\partial B(x,r)} u\,dS. \tag{3.1}$$
Proof. Let us fix the point x ∈ U and define
$$\phi(r) = \frac{1}{|\partial B(x,r)|}\int_{\partial B(x,r)} u(z)\,dS(z). \tag{3.2}$$
It is easy to see that, since u(x) is continuous, we have
$$\lim_{r\downarrow 0}\phi(r) = u(x). \tag{3.3}$$
Therefore, we would be done if we knew that $\phi'(r) = 0$ for all r > 0 (and such that the
ball B(x, r) is contained in U). To this end, using the polar coordinates z = x + ry,
with y ∈ ∂B(0, 1), we may rewrite (3.2) as
$$\phi(r) = \frac{1}{|\partial B(0,1)|}\int_{\partial B(0,1)} u(x+ry)\,dS(y).$$

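Then differentiating in r gives
$$\phi'(r) = \frac{1}{|\partial B(0,1)|}\int_{\partial B(0,1)} y\cdot\nabla u(x+ry)\,dS(y).$$
Going back to the z-variables gives
$$\phi'(r) = \frac{1}{|\partial B(x,r)|}\int_{\partial B(x,r)} \frac{1}{r}(z-x)\cdot\nabla u(z)\,dS(z) = \frac{1}{|\partial B(x,r)|}\int_{\partial B(x,r)} \frac{\partial u}{\partial\nu}\,dS(z).$$
Here we used the fact that the outward normal to B(x, r) at a point z ∈ ∂B(x, r) is
simply ν = (z − x)/r. Using Green's formula
$$\int_U f\,\Delta g\,dy = \int_{\partial U} f\,\frac{\partial g}{\partial\nu}\,dS - \int_U \nabla f\cdot\nabla g\,dy,$$
on the ball B(x, r) with f = 1 and g = u gives now
$$\phi'(r) = \frac{1}{|\partial B(x,r)|}\int_{B(x,r)} \Delta u(y)\,dy = 0.$$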

It follows that φ(r) is a constant and then (3.3) implies that
$$u(x) = \frac{1}{|\partial B(x,r)|}\int_{\partial B(x,r)} u\,dS, \tag{3.4}$$
which is the second identity in (3.1).
In order to prove the first equality in (3.1) we use the polar coordinates once again:
$$\frac{1}{|B(x,r)|}\int_{B(x,r)} u\,dy = \frac{1}{|B(x,r)|}\int_0^r\Big(\int_{\partial B(x,s)} u\,dS\Big)ds = \frac{1}{|B(x,r)|}\int_0^r u(x)\,n\alpha(n)s^{n-1}\,ds = u(x)\,\frac{\alpha(n)r^n}{\alpha(n)r^n} = u(x).$$
In the second equality above we used two facts: first, the already proved identity (3.4)
about averages on spherical shells, and, second, that the area of an (n − 1)-dimensional
unit sphere is nα(n), where α(n) is the volume of the unit ball in $\mathbb{R}^n$. Now, the proof of (3.1) is complete. □
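The spherical mean value identity can be checked numerically in two dimensions. The sketch below is an illustration only: the sample functions and the helper `circle_average` are assumptions, not part of the notes. It averages a function over equally spaced points of a circle; for the harmonic polynomial $x^2 - y^2 + 3x$ the average matches the value at the center, while for $x^2 + y^2$ (whose Laplacian equals 4) the average exceeds the central value by $\rho^2$.

```python
import math


def circle_average(u, x0, y0, rho, samples=360):
    """Average of u over `samples` equally spaced points of the circle |z - z0| = rho."""
    total = 0.0
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        total += u(x0 + rho * math.cos(angle), y0 + rho * math.sin(angle))
    return total / samples


def harmonic_example(x, y):
    """x^2 - y^2 + 3x is harmonic: its Laplacian vanishes identically."""
    return x * x - y * y + 3.0 * x


def non_harmonic_example(x, y):
    """x^2 + y^2 has Laplacian 4, so the mean value property fails for it."""
    return x * x + y * y


if __name__ == "__main__":
    print(circle_average(harmonic_example, 0.5, -1.0, 2.0), harmonic_example(0.5, -1.0))
    print(circle_average(non_harmonic_example, 0.5, -1.0, 2.0), non_harmonic_example(0.5, -1.0))
```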

The maximum principle


The first consequence of the mean value property is the maximum principle that says
that a harmonic function attains its maximum over any domain on the boundary and
not inside the domain. Once again, in one dimension this is obvious: a linear function
does not have any local extremal points.
Theorem 3.2 (The maximum principle) Let u(x) be a harmonic function in a connected
domain U and assume that $u \in C^2(U) \cap C(\bar U)$. Then
$$\max_{x\in\bar U} u(x) = \max_{y\in\partial U} u(y). \tag{3.5}$$

Moreover, if u(x) achieves its maximum at a point x0 in the interior of U then u(x) is
identically equal to a constant in U .

Proof. Suppose that u(x) attains its maximum at an interior point x0 ∈ U, and set
M = u(x0). Then for any r > 0 sufficiently small (so that the ball B(x0, r) is contained
in U) we have
$$M = u(x_0) = \frac{1}{|B(x_0,r)|}\int_{B(x_0,r)} u\,dy \le M,$$
with the equality above holding only if u(y) = M for all y in the ball B(x0, r). Therefore,
the set S of points where u(x) = M is open. Since u(x) is continuous, this set is also
closed. Since S is non-empty, both open and closed in U, and U is connected, it follows that S = U,
hence u(x) = M at all points x ∈ U. □
Of course, if we replace u by (−u) (which is equally harmonic), we get the minimum
principle for u.

Corollary 3.3 (Strict positivity) Assume that U is a connected domain, and u solves

∆u = 0 in U (3.6)
u = g on ∂U .

Assume, in addition, that g ≥ 0, g is continuous on ∂U, and g ≢ 0. Then u(x) > 0
at all x ∈ U.

Proof. This is an immediate consequence of the minimum principle: $\min_{x\in\bar U} u(x) \ge 0$,
and u can not attain its minimum inside U unless it is a constant, which would then be positive
since g ≢ 0. Thus u(x) > 0 for all x ∈ U. □

Maximum principle for subharmonic functions


We say that a function u(x) is subharmonic in a domain U if

∆u(x) ≥ 0 for all x ∈ U . (3.7)

Inspecting the proof of the mean-value property we see that if u(x) is sub-harmonic in
a ball B(x0, R) and we set
$$\phi(r) = \frac{1}{|\partial B(x_0,r)|}\int_{\partial B(x_0,r)} u\,dS,$$
then
$$\phi'(r) = \frac{1}{|\partial B(x_0,r)|}\int_{B(x_0,r)} \Delta u\,dy \ge 0.$$
This means that
$$u(x_0) \le \frac{1}{|\partial B(x_0,r)|}\int_{\partial B(x_0,r)} u\,dS,$$

for any 0 < r < R. It follows that sub-harmonic functions can not attain a maximum
inside a domain U unless they are equal to a constant – the proof is identical to that
for harmonic functions.

A typical example of a sub-harmonic function comes about as follows. Assume that
u(x) is harmonic: ∆u = 0 and Φ(s) is a twice differentiable convex function. Set v(x) = Φ(u(x)), then:

$$\Delta v = \nabla\cdot(\Phi'(u)\nabla u) = \Phi'(u)\Delta u + \Phi''(u)|\nabla u|^2 = \Phi''(u)|\nabla u|^2 \ge 0,$$

so that the function v(x) is sub-harmonic. In particular, if u(x) is harmonic then $u^2(x)$
is sub-harmonic. This fact has an important implication in complex analysis.

Theorem 3.4 (Maximum modulus principle) Let f (z) be holomorphic in the closure of
a bounded domain D. Then |f (z)| attains its maximum on the boundary ∂D.

Proof. Let f(z) = u(z) + iv(z), then the functions u and v are harmonic, whence $|f(z)|^2 = u^2 + v^2$
is subharmonic. It follows that $|f(z)|^2$, and hence |f(z)|, attains its maximum on the boundary ∂D. □

The three lines theorem


A key ingredient in the proof of the Riesz interpolation theorem that we will soon
encounter is the following basic result that is related to the maximum modulus principle.

Theorem 3.5 Let F (z) be a bounded analytic function in the strip

S = {z : 0 ≤ Rez ≤ 1},

such that |F(iy)| ≤ m0, |F(1 + iy)| ≤ m1, with m0, m1 > 0 for all y ∈ R. Then

$$|F(x+iy)| \le m_0^{1-x} m_1^{x} \quad \text{for all } 0 \le x \le 1,\ y \in \mathbb{R}. \tag{3.8}$$

Proof. It is convenient to set
$$F_1(z) = \frac{F(z)}{m_0^{1-z} m_1^{z}},$$
so that |F1(iy)| ≤ 1, |F1(1 + iy)| ≤ 1 and F1 is uniformly bounded in S. It suffices to
show that |F1(x + iy)| ≤ 1 for all (x, y) ∈ S under these assumptions: this will give
immediately (3.8). If the strip S were a bounded domain, this would follow immediately
from the maximum modulus principle.
Assume first that F1(x + iy) → 0 as |y| → +∞, uniformly in x ∈ [0, 1]. Then
|F1(x + iy)| ≤ 1/2 for all y with |y| ≥ M, if M > 0 is large enough. The maximum
modulus principle, applied in the rectangle {0 ≤ x ≤ 1, |y| ≤ M}, implies that |F1(x + iy)| ≤ 1
for |y| ≤ M, and, since |F1(x + iy)| ≤ 1/2
for all y with |y| ≥ M, it follows that |F1(x + iy)| ≤ 1 for all (x, y) ∈ S.
In general, set
$$G_n(z) = F_1(z)e^{(z^2-1)/n},$$
then
$$|G_n(iy)| \le |F_1(iy)|e^{(-y^2-1)/n} \le 1,$$
and
$$|G_n(1+iy)| \le |F_1(1+iy)|e^{-y^2/n} \le 1,$$
but in addition, $G_n$ goes to zero as |y| → +∞, uniformly in x ∈ [0, 1]:
$$|G_n(x+iy)| \le |F_1(z)|e^{(x^2-y^2-1)/n} \le C_0 e^{-y^2/n},$$
with a constant C0 such that |F1(z)| ≤ C0 for all z ∈ S. It follows from the previous
part of the proof that $|G_n(z)| \le 1$, hence
$$|F_1(z)| \le e^{(1+y^2)/n},$$
for all z ∈ S and all n ∈ N. Letting n → +∞ we deduce that |F1(z)| ≤ 1 for all z ∈ S. □
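To see the bound (3.8) at work, here is a small numerical sketch for the sample function F(z) = 1/(1 + z), which is bounded and analytic in the strip with $m_0 = 1$ and $m_1 = 1/2$. The choice of F, the grid sizes and the function names are assumptions made for this illustration. The ratio $|F(x+iy)|/(m_0^{1-x}m_1^x)$ should never exceed 1 on the sampled grid.

```python
def three_lines_bound(m0: float, m1: float, x: float) -> float:
    """Right side of (3.8): m0^(1-x) * m1^x."""
    return m0 ** (1.0 - x) * m1**x


def sample_function(z: complex) -> complex:
    """F(z) = 1/(1+z); |F(iy)| <= 1 and |F(1+iy)| <= 1/2 for real y."""
    return 1.0 / (1.0 + z)


def worst_ratio(grid_x: int = 21, grid_y: int = 201, y_max: float = 50.0) -> float:
    """Largest value of |F(x+iy)| / (m0^(1-x) m1^x) over a grid in the strip."""
    worst = 0.0
    for i in range(grid_x):
        x = i / (grid_x - 1)
        for j in range(grid_y):
            y = -y_max + 2.0 * y_max * j / (grid_y - 1)
            ratio = abs(sample_function(complex(x, y))) / three_lines_bound(1.0, 0.5, x)
            worst = max(worst, ratio)
    return worst


if __name__ == "__main__":
    print(worst_ratio())
```

For this particular F the bound is attained on the boundary lines at y = 0, so the printed value is expected to be close to 1.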

Regularity of harmonic functions


Now, we prove that if u(x) is a twice continuously differentiable harmonic function then
it is infinitely differentiable – in two dimensions this is simply the reflection of the fact
that a harmonic function is the real part of a holomorphic function.

Theorem 3.6 (Regularity) Let $u \in C^2(U)$ be a harmonic function in a domain U.
Then u is infinitely differentiable in U.

Proof. The proof is via a miracle: we first define a "smoothed" version of u, and then
verify that the "smoothed" version coincides with the original, hence the original is also
infinitely smooth. This is as close to a free lunch as it gets.
Consider a radial non-negative function η(x) ≥ 0 that depends only on |x| such that
(i) η(x) = 0 for |x| ≥ 1, (ii) η(x) is infinitely differentiable, and (iii) $\int_{\mathbb{R}^n} \eta(x)\,dx = 1$.
Also, for each ε ∈ (0, 1) define its rescaled version
$$\eta_\varepsilon(x) = \frac{1}{\varepsilon^n}\,\eta\Big(\frac{x}{\varepsilon}\Big).$$
It is straightforward to verify that $\eta_\varepsilon$ satisfies the same properties (i)-(iii) above,
with (i) replaced by $\eta_\varepsilon(x) = 0$ for |x| ≥ ε. Moreover, the function
$$u_\varepsilon(x) = \int_{\mathbb{R}^n} \eta_\varepsilon(x-y)u(y)\,dy \tag{3.9}$$
is infinitely differentiable in the slightly smaller domain $U_\varepsilon = \{x \in U : \mathrm{dist}(x,\partial U) > \varepsilon\}$.


The reason is that we can differentiate infinitely many times under the integral sign in
(3.9) – this follows from the standard multivariable calculus theorem on differentiation
of integrals depending on a parameter (the variable x plays the role of a parameter here).
Our main claim is that, because of the mean value property, we have

uε (x) = u(x) for all x ∈ Uε . (3.10)

This will immediately imply that u(x) is infinitely differentiable in the domain Uε . And,
as any point x from U lies in Uε if ε < dist(x, ∂U ), it follows that u(x) is infinitely
differentiable at all points x ∈ U .

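Let us now verify (3.10):
$$u_\varepsilon(x) = \int_{\mathbb{R}^n} \eta_\varepsilon(x-y)u(y)\,dy = \frac{1}{\varepsilon^n}\int_U \eta\Big(\frac{|x-y|}{\varepsilon}\Big)u(y)\,dy = \frac{1}{\varepsilon^n}\int_{B(x,\varepsilon)} \eta\Big(\frac{|x-y|}{\varepsilon}\Big)u(y)\,dy.$$
The last equality holds because η(z) = 0 if |z| ≥ 1, whence $\eta_\varepsilon(z) = 0$ if |z| ≥ ε. Changing
variables y = x + εz gives
$$u_\varepsilon(x) = \int_{B(0,1)} \eta(z)\,u(x+\varepsilon z)\,dz.$$
Going to the polar coordinates leads to
$$u_\varepsilon(x) = \int_0^1 \eta(r)\Big(\int_{\partial B(0,1)} u(x+\varepsilon r\omega)\,dS(\omega)\Big) r^{n-1}\,dr. \tag{3.11}$$
The mean value property implies that
$$\int_{\partial B(0,1)} u(x+\varepsilon r\omega)\,dS(\omega) = u(x)\,|\partial B(0,1)|.$$
Using this in (3.11), we obtain
$$u_\varepsilon(x) = u(x)\int_0^1 \eta(r)\,|\partial B(0,1)|\,r^{n-1}\,dr = u(x)\int_{B(0,1)} \eta(y)\,dy = u(x), \tag{3.12}$$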

which is (3.10). We used the fact that η has integral equal to one in the last step. 
This regularity property is quite fundamental and appears in one way or other for
the class of elliptic equations (and not just for the Laplace equation) we will discuss
later. One of their main qualitative properties is that solutions are more regular than
the data prescribed, and they behave much nicer than, say, solutions of wave equations
and other hyperbolic problems.
Let us now give a more quantitative estimate on how large the derivatives of the
harmonic functions can be (in two dimensions these follow from the Cauchy integral
formula).

Theorem 3.7 Let u(x) be a harmonic function in a domain U and let B(y0, r) be a
ball contained in U centered at a point y0 ∈ U. Then there exist universal constants $C_n$
and $D_n$ that depend only on the dimension n so that we have
$$|u(y_0)| \le \frac{C_n}{r^n}\int_{B(y_0,r)} |u(y)|\,dy \tag{3.13}$$
and
$$|\nabla u(y_0)| \le \frac{D_n}{r^{n+1}}\int_{B(y_0,r)} |u(y)|\,dy. \tag{3.14}$$

The remarkable fact about the estimate (3.14) is that we are able to estimate the size
of the derivatives of a harmonic function in terms of its values – this means that a
harmonic function can not oscillate (oscillation means, essentially, that the function is
much smaller than its derivative).
Proof. First, the estimate (3.13) follows immediately from the first equality in the
mean value formula (3.1). In order to obtain the derivative bound (3.14) note that if
u(x) is harmonic then so are the partial derivatives $\partial u/\partial x_j$, whence
$$\Big|\frac{\partial u(y_0)}{\partial x_j}\Big| = \frac{1}{|B(y_0,r/2)|}\Big|\int_{B(y_0,r/2)} \frac{\partial u(y)}{\partial x_j}\,dy\Big| = \frac{1}{|B(y_0,r/2)|}\Big|\int_{\partial B(y_0,r/2)} u(y)\,n_j(y)\,dS(y)\Big|, \tag{3.15}$$
where $n_j(y)$ is the j-th component of the outward normal. Continuing this estimate we
see that (we use the fact that the area of the unit sphere is nα(n))
$$\Big|\frac{\partial u(y_0)}{\partial x_j}\Big| \le \frac{2^n}{\alpha(n)r^n}\cdot\frac{n\alpha(n)r^{n-1}}{2^{n-1}}\sup_{z\in B(y_0,r/2)} |u(z)| = \frac{2n}{r}\sup_{z\in B(y_0,r/2)} |u(z)|. \tag{3.16}$$

Now, we can use the estimate (3.13) applied at any point z ∈ B(y0, r/2):
$$|u(z)| \le \frac{C_n}{(r/2)^n}\int_{B(z,r/2)} |u(z')|\,dz'. \tag{3.17}$$

However, since |y0 − z| ≤ r/2 (this is why we took a smaller ball in (3.15)!), any such
ball B(z, r/2) is contained inside the ball B(y0, r), thus (3.17) implies that
$$|u(z)| \le \frac{C_n}{(r/2)^n}\int_{B(y_0,r)} |u(z')|\,dz'.$$

Now, it follows from (3.16) that
$$\Big|\frac{\partial u(y_0)}{\partial x_j}\Big| \le \frac{2n}{r}\,\frac{C_n}{(r/2)^n}\int_{B(y_0,r)} |u(z')|\,dz' = \frac{D_n}{r^{n+1}}\int_{B(y_0,r)} |u(z)|\,dz, \tag{3.18}$$

which is (3.14). □
Theorem 3.7 is another expression of the fact that harmonic functions do not oscillate
– the first estimate says that the value of the function at a point is bounded by its
averages (but we have seen that already in the mean value property), while the second
bound says in a quantitative way that derivative at a point can not be large without
the function being large around the point. This rules out oscillatory behavior.

The Liouville theorem


The Liouville theorem says that a function which is harmonic in all of Rn is either
unbounded or is identically equal to a constant.
Theorem 3.8 Let u(x) be a harmonic bounded function in Rn . Then u(x) is equal
identically to a constant.

Proof. Let us assume that |u(x)| ≤ M for all $x \in \mathbb{R}^n$. We fix $x_0 \in \mathbb{R}^n$ and use
Theorem 3.7:
$$|\nabla u(x_0)| \le \frac{C}{r^{n+1}}\int_{B(x_0,r)} |u(y)|\,dy \le \frac{C\alpha(n)r^n}{r^{n+1}}\,M = \frac{C\alpha(n)M}{r}.$$

As this is true for any r > 0 we may let r → ∞ and conclude that ∇u(x0) = 0. Since x0 is
arbitrary, u(x) is equal identically to a constant. □
This theorem is, of course, a direct generalization to higher dimensions of the familiar
Liouville theorem in complex analysis.

Harnack’s inequality
Here is another way to express lack of oscillations of nonnegative harmonic functions:
their maximum cannot be much larger than their minimum. To trivialize, consider
the one-dimensional situation. Let u(x) be a harmonic function on the
interval (0, 1), that is, u(x) = ax + b with some constants a, b ∈ R. We claim that if
u(x) > 0 for all x ∈ [0, 1] then
$$\frac{1}{3} \le \frac{u(x)}{u(y)} \le 3, \tag{3.19}$$
for all x, y in the smaller interval (1/4, 3/4). The constants 1/3 and 3 in (3.19) depend on
the choice of the "smaller" interval: they would change if we replaced (1/4, 3/4)
by another subinterval of [0, 1]. But once we fix the subinterval, they do not depend
on the choice of the harmonic function. Let us now show that (3.19) holds for all
x, y ∈ (1/4, 3/4). Without loss of generality we may assume that x > y. The case a = 0 is
trivial. Next, consider the case a > 0. Then, since u(x) is increasing (because a > 0), we have
$$1 \le \frac{u(x)}{u(y)} \le \frac{u(3/4)}{u(1/4)} = \frac{3a+4b}{a+4b}. \tag{3.20}$$

As u(x) > 0 on [0, 1] we know that b > 0 (and a > 0 by assumption), using this in (3.20)
gives, with c = a/b:
$$1 \le \frac{u(x)}{u(y)} \le \frac{3c+4}{c+4} = 3 - \frac{8}{c+4} \le 3.$$
On the other hand, if a < 0 then the function u is decreasing, and
$$1 \ge \frac{u(x)}{u(y)} \ge \frac{u(3/4)}{u(1/4)} = \frac{3c+4}{c+4} = 3 - \frac{8}{c+4}.$$

As u(1) > 0 we know that a + b > 0, and we still have b > 0 since u(0) > 0. Thus,
c > −1, and therefore,
$$1 \ge \frac{u(x)}{u(y)} \ge 3 - \frac{8}{c+4} \ge 3 - \frac{8}{3} = \frac{1}{3}.$$
We conclude that (3.19), indeed, holds. Geometrically, (3.19) expresses a very simple
fact: if u(3/4) ≫ u(1/4) then the slope of the straight line connecting the points
(1/4, u(1/4)) and (3/4, u(3/4)) is too large so that it would go below the x-axis at x = 0
contradicting the assumption that the linear function is positive on the interval (0, 1).
On the other hand, if u(1/4) ≫ u(3/4) then this line would go below the x-axis at
x = 1. Therefore, the condition that u(x) > 0 on the larger interval [0, 1] is very
important here.
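The one-dimensional bound (3.19) is easy to test numerically. The sketch below is an illustration under stated assumptions: the helper name `harnack_ratio` and the sample coefficients are hypothetical. It computes the largest ratio u(x)/u(y) over a grid in [1/4, 3/4] for a linear function that is positive on [0, 1].

```python
def harnack_ratio(a: float, b: float, samples: int = 100) -> float:
    """Largest ratio u(x)/u(y) over a grid of x, y in [1/4, 3/4] for u(s) = a*s + b.

    Requires u > 0 on [0, 1], that is b > 0 and a + b > 0.
    """
    if b <= 0 or a + b <= 0:
        raise ValueError("u(s) = a*s + b must be positive on [0, 1]")
    values = [a * (0.25 + 0.5 * k / samples) + b for k in range(samples + 1)]
    return max(values) / min(values)


if __name__ == "__main__":
    for a, b in [(1.0, 1.0), (-0.999, 1.0), (100.0, 0.01), (0.0, 2.0)]:
        print(a, b, harnack_ratio(a, b))
```

The ratio approaches 3 only when u nearly vanishes at one of the endpoints of [0, 1], which matches the geometric explanation above.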
Now, we turn to the general case of dimension larger than one. We say that a set
V is strictly contained in U if V ⊂ U and there exists ε0 > 0 so that for any x ∈ V we
have dist(x, ∂U ) ≥ ε0 .
Theorem 3.9 (Harnack's inequality) Let U be an open set and let V be a connected
compact set strictly contained in U. Then there exists a constant C that depends on U
and V but nothing else so that for any nonnegative harmonic function u in U we have
$$\sup_{x\in V} u(x) \le C \inf_{x\in V} u(x). \tag{3.21}$$

Proof. Let r = (1/4)dist(V, ∂U) and choose two points x, y ∈ V such that |x − y| ≤ r.
Then B(x, 2r) ⊂ U so u is harmonic in this ball, and the mean-value principle implies
that
$$u(x) = \frac{1}{|B(x,2r)|}\int_{B(x,2r)} u(z)\,dz. \tag{3.22}$$
Note also that since |x − y| ≤ r, the ball B(y, r) is contained inside B(x, 2r), hence,
as u ≥ 0, (3.22) implies that
$$u(x) \ge \frac{1}{\alpha(n)2^n r^n}\int_{B(y,r)} u(z)\,dz. \tag{3.23}$$
It follows, on the other hand, from the mean-value principle that
$$u(y) = \frac{1}{\alpha(n)r^n}\int_{B(y,r)} u(z)\,dz. \tag{3.24}$$
Putting (3.23) and (3.24) together gives
$$u(x) \ge \frac{1}{2^n}\,u(y). \tag{3.25}$$
Reversing the argument we can similarly conclude that
$$u(y) \ge \frac{1}{2^n}\,u(x), \tag{3.26}$$
hence
$$\frac{1}{2^n}\,u(x) \le u(y) \le 2^n u(x), \quad \text{for all } x, y \in V \text{ such that } |x-y| \le (1/4)\,\mathrm{dist}(V,\partial U). \tag{3.27}$$
Now, there exists a number N so that we may cover the compact set V by N balls of
radius r/2. Then given any two points x, y ∈ V we can connect them by a piece-wise
straight line curve with no more than N segments, each segment at most r long. It
follows that for any x, y ∈ V we have
$$\frac{1}{2^{Nn}}\,u(x) \le u(y) \le 2^{Nn} u(x), \quad \text{for all } x, y \in V. \tag{3.28}$$
This, of course, implies (3.21) with $C = 2^{nN}$. □

4 The Hilbert transform and the Riesz-Thorin interpolation
The Poisson kernel
Given a Schwartz class function $f(x) \in \mathcal{S}(\mathbb{R}^n)$ define a function
$$u(x,t) = \int_{\mathbb{R}^n} e^{-2\pi t|\xi|}\hat f(\xi)e^{2\pi i x\cdot\xi}\,d\xi, \quad t \ge 0,\ x \in \mathbb{R}^n.$$

The function u(x, t) is harmonic:
$$\Delta_{x,t} u = 0 \ \text{ in } \mathbb{R}^{n+1}_+ = \mathbb{R}^n \times (0,+\infty),$$
and satisfies the boundary condition on the hyper-plane t = 0:
$$u(x,0) = f(x), \quad x \in \mathbb{R}^n.$$
We can write u(x, t) as a convolution
$$u(x,t) = P_t \star f = \int P_t(x-y)f(y)\,dy,$$
with
$$\hat P_t(\xi) = e^{-2\pi t|\xi|}.$$
Note that in one dimension
$$P_t(x) = \int e^{-2\pi t|\xi| + 2\pi i x\xi}\,d\xi = 2\,\mathrm{Re}\int_0^{\infty} e^{-2\pi t\xi + 2\pi i x\xi}\,d\xi = \frac{1}{2\pi}\Big(\frac{1}{t+ix} + \frac{1}{t-ix}\Big) = \frac{t}{\pi(t^2+x^2)}.$$

In dimension n > 1 we have
$$P_t(x) = \int e^{-2\pi t|\xi| + 2\pi i \xi\cdot x}\,d\xi = C_n\,\frac{t}{(t^2+|x|^2)^{(n+1)/2}}.$$
Here the constant $C_n$ depends only on the spatial dimension. We will focus on dimension
n = 1 and address two questions: the first is in which sense is f(x) the boundary data
for u(x, t) as t → 0. The second is as follows: given the harmonic function u(x, t) defined
for t ≥ 0 we can find its harmonic conjugate v(x, t) such that u + iv is a holomorphic
function in the upper half plane {t > 0}. Let g(x) be the boundary value of v(x, t) at
t = 0. What can we say about the map f → g? Both questions are surprisingly rich,
and we will begin with the second question.
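The one-dimensional formula $P_t(x) = t/(\pi(t^2+x^2))$ can be compared against a direct numerical evaluation of the Fourier integral. The following sketch is an illustration only: the trapezoid rule, the truncation point `xi_max`, the step count and the function names are all assumptions of this example.

```python
import math


def poisson_kernel(x: float, t: float) -> float:
    """Closed form P_t(x) = t / (pi (t^2 + x^2))."""
    return t / (math.pi * (t * t + x * x))


def poisson_kernel_from_fourier(x: float, t: float, xi_max: float = 10.0, steps: int = 20000) -> float:
    """Trapezoid approximation of 2 * int_0^xi_max exp(-2 pi t xi) cos(2 pi x xi) d xi.

    The truncation at xi_max is harmless when exp(-2 pi t xi_max) is negligible.
    """
    h = xi_max / steps

    def integrand(xi: float) -> float:
        return 2.0 * math.exp(-2.0 * math.pi * t * xi) * math.cos(2.0 * math.pi * x * xi)

    total = 0.5 * (integrand(0.0) + integrand(xi_max))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h


if __name__ == "__main__":
    print(poisson_kernel(1.0, 0.5), poisson_kernel_from_fourier(1.0, 0.5))
```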

The conjugate Poisson kernel


In order to construct the conjugate harmonic function, for $f \in \mathcal{S}(\mathbb{R})$ define u(x, t) =
$P_t \star f$, set z = x + it and write
$$u(z) = \int_{\mathbb{R}} e^{-2\pi t|\xi|}\hat f(\xi)e^{2\pi i x\xi}\,d\xi = \int_0^{\infty} \hat f(\xi)e^{2\pi i z\xi}\,d\xi + \int_{-\infty}^{0} \hat f(\xi)e^{2\pi i \bar z\xi}\,d\xi.$$

Consider the function v(z) given by
$$iv(z) = \int_0^{\infty} \hat f(\xi)e^{2\pi i z\xi}\,d\xi - \int_{-\infty}^{0} \hat f(\xi)e^{2\pi i \bar z\xi}\,d\xi.$$

As the function
$$u(z) + iv(z) = 2\int_0^{\infty} \hat f(\xi)e^{2\pi i z\xi}\,d\xi$$
is analytic in the upper half-plane {Im z > 0} (simply because we can differentiate the
integral), the function v is the harmonic conjugate of u. It can be written as
$$v(z) = \int_{\mathbb{R}} (-i\,\mathrm{sgn}(\xi))e^{-2\pi t|\xi|}\hat f(\xi)e^{2\pi i x\xi}\,d\xi = Q_t \star f, \tag{4.1}$$

with
$$\hat Q_t(\xi) = -i\,\mathrm{sgn}(\xi)e^{-2\pi t|\xi|}, \tag{4.2}$$
and
$$Q_t(x) = -i\int e^{2\pi i x\xi}\,\mathrm{sgn}(\xi)e^{-2\pi t|\xi|}\,d\xi = -\frac{i}{2\pi}\Big(\frac{1}{t-ix} - \frac{1}{t+ix}\Big) = \frac{1}{\pi}\,\frac{x}{t^2+x^2}.$$
The Poisson kernel and its conjugate are related by
$$P_t(x) + iQ_t(x) = \frac{i}{\pi(x+it)},$$
which is analytic in {Im z > 0}. The main problem with the conjugate Poisson kernel
is that it does not decay fast enough at infinity to be in $L^1(\mathbb{R})$ nor is it regular at x = 0
as t → 0. Therefore, we need to make precise the meaning of the limit t → 0 in the
convolution in (4.1).
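The relation between the Poisson kernel and its conjugate is a one-line algebraic identity, and it can be confirmed numerically. A minimal sketch follows; the function names are hypothetical.

```python
import math


def poisson_kernel(x: float, t: float) -> float:
    """P_t(x) = t / (pi (t^2 + x^2))."""
    return t / (math.pi * (t * t + x * x))


def conjugate_poisson_kernel(x: float, t: float) -> float:
    """Q_t(x) = x / (pi (t^2 + x^2))."""
    return x / (math.pi * (t * t + x * x))


def cauchy_kernel(x: float, t: float) -> complex:
    """i / (pi (x + i t)), the analytic function with real part P_t and imaginary part Q_t."""
    return 1j / (math.pi * complex(x, t))


if __name__ == "__main__":
    for x, t in [(0.0, 1.0), (1.5, 0.2), (-3.0, 0.01)]:
        lhs = poisson_kernel(x, t) + 1j * conjugate_poisson_kernel(x, t)
        print(x, t, abs(lhs - cauchy_kernel(x, t)))
```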

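The principal value of 1/x


In order to consider the limit of $Q_t$ as t → 0 let us define the principal value of 1/x
which is an element of $\mathcal{S}'(\mathbb{R})$ (the Schwartz distributions) defined by
$$\mathrm{P.V.}\frac{1}{x}(\phi) = \lim_{\varepsilon\to 0}\int_{|x|>\varepsilon} \frac{\phi(x)}{x}\,dx, \quad \phi \in \mathcal{S}(\mathbb{R}).$$

This is well-defined because
$$\mathrm{P.V.}\frac{1}{x}(\phi) = \int_{|x|<1} \frac{\phi(x)-\phi(0)}{x}\,dx + \int_{|x|>1} \frac{\phi(x)}{x}\,dx,$$
thus
$$\Big|\mathrm{P.V.}\frac{1}{x}(\phi)\Big| \le C\big(\|\phi'\|_{L^\infty} + \|x\phi\|_{L^\infty}\big),$$
and therefore P.V.(1/x) is, indeed, a distribution in $\mathcal{S}'(\mathbb{R})$. The conjugate Poisson kernel
$Q_t$ and the principal value of 1/x are related as follows.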

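Proposition 4.1 Let $Q_t(x) = \frac{1}{\pi}\,\frac{x}{t^2+x^2}$, then for any function $\phi \in \mathcal{S}(\mathbb{R})$
$$\frac{1}{\pi}\,\mathrm{P.V.}\frac{1}{x}(\phi) = \lim_{t\to 0}\int_{\mathbb{R}} Q_t(x)\phi(x)\,dx.$$

Proof. Let
$$\psi_t(x) = \frac{1}{x}\,\chi_{t<|x|}(x)$$
so that
$$\mathrm{P.V.}\frac{1}{x}(\phi) = \lim_{t\to 0}\int_{\mathbb{R}} \psi_t(x)\phi(x)\,dx.$$

Note, however, that
$$\begin{aligned}
\int_{\mathbb{R}} (\pi Q_t(x) - \psi_t(x))\phi(x)\,dx &= \int_{\mathbb{R}} \frac{x\phi(x)}{x^2+t^2}\,dx - \int_{|x|>t} \frac{\phi(x)}{x}\,dx \\
&= \int_{|x|<t} \frac{x\phi(x)}{x^2+t^2}\,dx + \int_{|x|>t} \Big(\frac{x}{x^2+t^2} - \frac{1}{x}\Big)\phi(x)\,dx \\
&= \int_{|x|<1} \frac{x\phi(tx)}{x^2+1}\,dx - \int_{|x|>t} \frac{t^2\phi(x)}{x(x^2+t^2)}\,dx = \int_{|x|<1} \frac{x\phi(tx)}{x^2+1}\,dx - \int_{|x|>1} \frac{\phi(tx)}{x(x^2+1)}\,dx.
\end{aligned} \tag{4.3}$$

The dominated convergence theorem implies that both integrals on the utmost right
side above tend to zero as t → 0: their limits are φ(0) times integrals of odd functions
over symmetric sets. □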
It is important to note that the computation in (4.3) worked only because the kernel
1/x is odd – this produces the cancellation that saves the day. This would not happen,
for instance, for a kernel behaving as 1/|x| near x = 0.

The Hilbert transform


Motivated by the previous discussion, for a function $f \in \mathcal{S}(\mathbb{R})$, we define the Hilbert
transform as
$$Hf(x) = \lim_{t\to 0} Q_t \star f(x) = \frac{1}{\pi}\lim_{\varepsilon\to 0}\int_{|y|>\varepsilon} \frac{f(x-y)}{y}\,dy.$$
It follows from (4.2) that
$$\widehat{Hf}(\xi) = \lim_{t\to 0} \hat Q_t(\xi)\hat f(\xi) = -i\,\mathrm{sgn}(\xi)\hat f(\xi). \tag{4.4}$$

Therefore, the Hilbert transform may be extended to an isometry $L^2(\mathbb{R}) \to L^2(\mathbb{R})$, with
$\|Hf\|_{L^2} = \|f\|_{L^2}$, H(Hf) = −f and
$$\int (Hf)(x)g(x)\,dx = -\int f(x)(Hg)(x)\,dx. \tag{4.5}$$
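The multiplier description (4.4) suggests a simple numerical experiment. The sketch below works with a periodic, discrete analogue rather than the transform on the line itself: it applies the multiplier $-i\,\mathrm{sgn}(k)$ to the discrete Fourier coefficients of a sampled periodic function. This discretization, the naive DFT and all function names are assumptions of the example. In this setting the transform of a sampled cosine is the sampled sine, and applying it twice returns minus the input.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (O(n^2)), adequate for small n."""
    n = len(values)
    return [sum(values[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]


def inverse_dft(coeffs):
    """Inverse of dft above."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]


def periodic_hilbert(values):
    """Apply the multiplier -i sgn(k) to the DFT coefficients of real samples."""
    n = len(values)
    out = []
    for k, c in enumerate(dft(values)):
        if k == 0 or 2 * k == n:
            multiplier = 0.0   # zero frequency and Nyquist frequency have no sign
        elif 2 * k < n:
            multiplier = -1j   # positive frequencies
        else:
            multiplier = 1j    # negative frequencies (stored in the upper half)
        out.append(multiplier * c)
    return [v.real for v in inverse_dft(out)]


if __name__ == "__main__":
    n = 32
    cos_samples = [math.cos(2.0 * math.pi * m / n) for m in range(n)]
    print(periodic_hilbert(cos_samples)[:4])
```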

The following extension of the Hilbert transform to Lp -spaces for 1 < p < ∞ is due to
M. Riesz.

Theorem 4.2 Given 1 < p < ∞ there exists $C_p > 0$ so that

$$\|Hf\|_{L^p} \le C_p\|f\|_{L^p} \quad \text{for all } f \in L^p(\mathbb{R}). \tag{4.6}$$

The proof of this theorem requires some basic interpolation theory that we will now
develop. Note that the result fails at both ends: both for p = 1 and p = ∞.

Interpolation in Lp -spaces
A simple example of an interpolation inequality is a bound that tells us that a function
f which lies in two spaces $L^{p_0}(\mathbb{R}^n, d\mu)$ and $L^{p_1}(\mathbb{R}^n, d\mu)$ has to lie also in all intermediate
spaces $L^p(\mathbb{R}^n, d\mu)$ with $p_0 \le p \le p_1$. Indeed, if $p = \alpha p_0 + (1-\alpha)p_1$, 0 < α < 1, then, by
Hölder's inequality, with q = 1/α and q′ = 1/(1 − α), we get
$$\int |f|^{\alpha p_0 + (1-\alpha)p_1}\,d\mu \le \Big(\int |f|^{p_0}\,d\mu\Big)^{\alpha}\Big(\int |f|^{p_1}\,d\mu\Big)^{1-\alpha}.$$
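The inequality above can be tried out on a finite sequence, that is, with the counting measure on a finite set in place of µ. This setting and the helper names below are assumptions of the sketch.

```python
def power_sum(values, p: float) -> float:
    """Sum of |v|^p, i.e. the integral of |f|^p against the counting measure."""
    return sum(abs(v) ** p for v in values)


def interpolation_sides(values, p0: float, p1: float, alpha: float):
    """Return (left, right) sides of the Hölder interpolation inequality."""
    p = alpha * p0 + (1.0 - alpha) * p1
    left = power_sum(values, p)
    right = power_sum(values, p0) ** alpha * power_sum(values, p1) ** (1.0 - alpha)
    return left, right


if __name__ == "__main__":
    print(interpolation_sides([0.5, -2.0, 3.0, 0.1], 1.0, 4.0, 0.3))
```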

The Riesz-Thorin interpolation theorem


The Riesz-Thorin interpolation theorem deals with the following question, somewhat
motivated by the above. Let (M, µ) and (N, ν) be two measure spaces and consider an
operator A which maps $L^{p_0}(M)$ to a space $L^{q_0}(N)$, and also $L^{p_1}(M)$ to a space $L^{q_1}(N)$.
More precisely, there exist operators $A_0 : L^{p_0}(M) \to L^{q_0}(N)$ and $A_1 : L^{p_1}(M) \to L^{q_1}(N)$
so that $A = A_0 = A_1$ on $L^{p_0}(M) \cap L^{p_1}(M)$. The question is whether A can be defined
on $L^p(M)$ with $p_0 < p < p_1$, and what is its target space. Let us define $p_t \in (p_0, p_1)$ and
$q_t \in (q_0, q_1)$ by
$$\frac{1}{p_t} = \frac{t}{p_1} + \frac{1-t}{p_0}, \quad \frac{1}{q_t} = \frac{t}{q_1} + \frac{1-t}{q_0}, \quad 0 \le t \le 1, \tag{4.7}$$
as well as
$$k_0 = \|A\|_{L^{p_0}(M)\to L^{q_0}(N)}, \quad k_1 = \|A\|_{L^{p_1}(M)\to L^{q_1}(N)}.$$

Theorem 4.3 (The Riesz-Thorin interpolation theorem) For any t ∈ [0, 1] there exists a
bounded linear operator $A_t : L^{p_t}(M) \to L^{q_t}(N)$ that coincides with A on $L^{p_0}(M) \cap L^{p_1}(M)$
and whose operator norm satisfies

$$\|A_t\|_{L^{p_t}(M)\to L^{q_t}(N)} \le k_0^{1-t} k_1^{t}. \tag{4.8}$$

Before proving the Riesz-Thorin interpolation theorem we mention some of its implica-
tions. We already know that the Fourier transform maps L1 (Rn ) to L∞ (Rn ) and L2 (Rn )
to itself. This allows us to extend the Fourier transform to all intermediate spaces
Lp (Rn ) with 1 ≤ p ≤ 2.

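Corollary 4.4 (The Hausdorff-Young inequality) Let 1 ≤ p ≤ 2, then if $f \in L^p(\mathbb{R}^n)$
then its Fourier transform $\hat f \in L^{p'}(\mathbb{R}^n)$ with $\frac{1}{p} + \frac{1}{p'} = 1$ and $\|\hat f\|_{L^{p'}} \le \|f\|_{L^p}$.

Proof. We take $p_0 = 1$, $p_1 = 2$, $q_0 = \infty$, $q_1 = 2$. Then for any t ∈ [0, 1] the
corresponding $p_t$ and $q_t$ are given by
$$\frac{1}{p_t} = \frac{1-t}{1} + \frac{t}{2} = 1 - \frac{t}{2}, \quad \frac{1}{q_t} = \frac{t}{2},$$

which means that $1/p_t + 1/q_t = 1$, as claimed. Furthermore, as $\|\hat f\|_{L^2} = \|f\|_{L^2}$ by the
Parseval identity and $\|\hat f\|_{L^\infty} \le \|f\|_{L^1}$, it follows that the norm of the Fourier transform
as an operator $L^{p_t} \to L^{q_t}$ is at most 1. □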
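The bookkeeping of exponents in (4.7) is easy to mechanize. The sketch below is a small illustration with hypothetical helper names: it interpolates the reciprocals $1/p$ and $1/q$ linearly in t using exact rational arithmetic, with the reciprocal 0 standing for the exponent ∞, and confirms that the Hausdorff-Young exponents are conjugate.

```python
from fractions import Fraction


def interpolate_reciprocal(inv0: Fraction, inv1: Fraction, t: Fraction) -> Fraction:
    """Return (1 - t) * inv0 + t * inv1, the reciprocal exponent from (4.7)."""
    return (1 - t) * inv0 + t * inv1


def hausdorff_young_exponents(t: Fraction):
    """Reciprocals (1/p_t, 1/q_t) for p0 = 1, p1 = 2, q0 = infinity, q1 = 2."""
    inv_p = interpolate_reciprocal(Fraction(1), Fraction(1, 2), t)
    inv_q = interpolate_reciprocal(Fraction(0), Fraction(1, 2), t)
    return inv_p, inv_q


if __name__ == "__main__":
    for k in range(0, 11, 5):
        t = Fraction(k, 10)
        inv_p, inv_q = hausdorff_young_exponents(t)
        print(t, inv_p, inv_q, inv_p + inv_q)
```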
The next corollary of the Riesz-Thorin theorem, also sometimes called the Hausdorff-Young
inequality, allows us to estimate convolutions.

Corollary 4.5 Let $f \in L^p(\mathbb{R}^n)$ and $g \in L^q(\mathbb{R}^n)$, then $f \star g \in L^r(\mathbb{R}^n)$, and

$$\|f \star g\|_{L^r} \le \|f\|_{L^p}\|g\|_{L^q}, \tag{4.9}$$

with r determined by
$$\frac{1}{r} + 1 = \frac{1}{p} + \frac{1}{q}. \tag{4.10}$$
Proof. We do this in two steps. First, fix $g \in L^1(\mathbb{R}^n)$. Obviously, we have
$$\|f \star g\|_{L^1} \le \iint |f(x-y)||g(y)|\,dy\,dx = \|f\|_{L^1}\|g\|_{L^1}, \tag{4.11}$$

and
$$\|f \star g\|_{L^\infty} \le \|f\|_{L^\infty}\|g\|_{L^1}. \tag{4.12}$$
The Riesz-Thorin theorem applied to the map f → f ⋆ g implies then that

$$\|f \star g\|_{L^p} \le \|g\|_{L^1}\|f\|_{L^p}, \tag{4.13}$$

which is a special case of (4.9) with q = 1 and r = p. On the other hand, Hölder's
inequality implies that
$$\|f \star g\|_{L^\infty} \le \|f\|_{L^p}\|g\|_{L^{p'}}, \quad \frac{1}{p} + \frac{1}{p'} = 1. \tag{4.14}$$

Let us take $p_0 = 1$, $q_0 = p$, $p_1 = p'$ and $q_1 = \infty$ in the Riesz-Thorin interpolation
theorem applied to the mapping g → f ⋆ g, with f fixed. Then (4.13) and (4.14) imply
that, for all t ∈ [0, 1],
$$\|f \star g\|_{L^r} \le \|f\|_{L^p}\|g\|_{L^q},$$
with
$$\frac{1}{q} = \frac{1}{p_t} = \frac{1-t}{1} + \frac{t}{p'},$$
and
$$\frac{1}{r} = \frac{1}{q_t} = \frac{1-t}{p} + \frac{t}{\infty}.$$

It follows that t = 1 − p/r, thus
$$\frac{1}{q} = 1 - \Big(1 - \frac{p}{r}\Big) + \frac{1}{p'}\Big(1 - \frac{p}{r}\Big) = \frac{p}{r} + \Big(1 - \frac{p}{r}\Big)\Big(1 - \frac{1}{p}\Big) = 1 - \frac{1}{p} + \frac{1}{r},$$

which is (4.10). □
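The inequality (4.9) has a direct analogue for finitely supported sequences, where the convolution is a finite sum and the integrals become sums. The following sketch checks it in that discrete setting; the setting itself, the sample data and the helper names are assumptions of the example.

```python
def convolve(f, g):
    """Full discrete convolution of two finite sequences."""
    result = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            result[i + j] += a * b
    return result


def lp_norm(values, p: float) -> float:
    """The l^p norm of a finite sequence, for finite p >= 1."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)


def young_exponent(p: float, q: float) -> float:
    """Solve 1/r + 1 = 1/p + 1/q for r; the case r = infinity is not handled here."""
    inv_r = 1.0 / p + 1.0 / q - 1.0
    if inv_r <= 0.0:
        raise ValueError("need 1/p + 1/q > 1 for a finite exponent r")
    return 1.0 / inv_r


if __name__ == "__main__":
    f = [1.0, -2.0, 0.5, 3.0]
    g = [0.3, 0.0, -1.0]
    p, q = 1.5, 1.2
    r = young_exponent(p, q)
    print(r, lp_norm(convolve(f, g), r), lp_norm(f, p) * lp_norm(g, q))
```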
The next example arises in microlocal analysis. Given a function $a(x,\xi) \in \mathcal{S}(\mathbb{R}^{2n})$
we define a semiclassical operator
$$A(x,\varepsilon D)f = \int e^{2\pi i \xi\cdot x}a(x,\varepsilon\xi)\hat f(\xi)\,d\xi.$$

Here ε ∈ (0, 1) is the parameter that plays the role of the Planck constant in physics
and is, therefore, small.

Corollary 4.6 The family of operators A(x, εD), 0 < ε ≤ 1, is uniformly bounded from
any Lp (Rn ), 1 ≤ p ≤ +∞, to itself.

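Proof. Let us write
$$A(x,\varepsilon D)f = \int e^{2\pi i \xi\cdot x}a(x,\varepsilon\xi)\hat f(\xi)\,d\xi = \iint e^{2\pi i \xi\cdot x + 2\pi i \varepsilon\xi\cdot y}\,\tilde a(x,y)\hat f(\xi)\,d\xi\,dy = \int \tilde a(x,y)f(x+\varepsilon y)\,dy,$$

where $\tilde a(x,y)$ is the Fourier transform of the function a(x, ξ) in the variable ξ. It follows
that
$$\|A(x,\varepsilon D)f\|_{L^\infty} \le \|f\|_{L^\infty}\sup_{x\in\mathbb{R}^n}\int |\tilde a(x,y)|\,dy = C_1(a)\|f\|_{L^\infty},$$

and
$$\|A(x,\varepsilon D)f\|_{L^1} \le \iint |\tilde a(x,y)||f(x+\varepsilon y)|\,dy\,dx \le \iint \Big(\sup_{z\in\mathbb{R}^n}|\tilde a(z,y)|\Big)|f(x+\varepsilon y)|\,dy\,dx = \|f\|_{L^1}\int \sup_{z\in\mathbb{R}^n}|\tilde a(z,y)|\,dy = C_2(a)\|f\|_{L^1}.$$

The Riesz-Thorin interpolation theorem implies then that for any p ∈ [1, +∞] there
exists $C_p(a)$ which does not depend on ε ∈ (0, 1] so that $\|A(x,\varepsilon D)\|_{L^p\to L^p} \le C_p(a)$. □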

The proof of the Riesz-Thorin interpolation theorem


The proof is based on the three lines theorem. First, let us define how the operator A
acts on $L^{p_t}(M)$ with $p_t$ as in (4.7). Given $f \in L^{p_t}(M)$ we can decompose it as

$$f(x) = f_1(x) + f_0(x), \quad f_1(x) = f(x)\chi_{|f|\le 1}(x), \quad f_0(x) = f(x)\chi_{|f|>1}(x).$$

Then, as $p_t \le p_1$:
$$\int_M |f_1|^{p_1}\,d\mu = \int_M |f|^{p_1}\chi_{|f|\le 1}\,d\mu \le \int_M |f|^{p_t}\chi_{|f|\le 1}\,d\mu \le \int_M |f|^{p_t}\,d\mu = \|f\|_{L^{p_t}}^{p_t},$$

and, as $p_0 \le p_t$:
$$\int_M |f_0|^{p_0}\,d\mu = \int_M |f|^{p_0}\chi_{|f|>1}\,d\mu \le \int_M |f|^{p_t}\chi_{|f|>1}\,d\mu \le \int_M |f|^{p_t}\,d\mu = \|f\|_{L^{p_t}}^{p_t},$$

so that $f_1 \in L^{p_1}(M)$ and $f_0 \in L^{p_0}(M)$. As A is defined both on $L^{p_0}(M)$ and $L^{p_1}(M)$,
we can set
$$Af = Af_1 + Af_0.$$
We now need to verify that A maps $L^{p_t}(M)$ to $L^{q_t}(N)$ continuously, and obtain the
bound on its norm as in the theorem. Recall that the space $L^p(M)$, 1 < p ≤ +∞, is the
dual space of $L^{p'}(M)$ where p and p′ are related by
$$\frac{1}{p} + \frac{1}{p'} = 1.$$
Note also that the norm of a bounded linear functional $L_f : L^{p'}(M) \to \mathbb{C}$,
$$L_f(g) = \int_M fg\,d\mu, \quad f \in L^p(M),$$

is $\|L_f\| = \|f\|_{L^p}$, for all p ∈ [1, +∞]. To see that, for $f(x) = |f(x)|e^{i\alpha(x)}$ simply take
(after normalization in $L^{p'}$)
$g(x) = |f(x)|^{p/p'}\exp\{-i\alpha(x)\}$ for 1 < p < +∞, $g(x) = \exp\{-i\alpha(x)\}$ for p = 1, and
$g(x) = \chi_{A_\varepsilon}(x)\exp\{-i\alpha(x)\}$, where $A_\varepsilon$ is a set of finite measure such that
$|f(x)| > (1-\varepsilon)\|f\|_{L^\infty}$ on $A_\varepsilon$, for p = +∞. We conclude that
$$\|f\|_{L^p} = \sup_{\|g\|_{L^{p'}}=1}\int_M fg\,d\mu, \quad \frac{1}{p} + \frac{1}{p'} = 1.$$

For an operator mapping $L^p$ to $L^q$ we have the corresponding representation for its
norm:
$$\|A\|_{L^p(M)\to L^q(N)} = \sup_{\|f\|_{L^p(M)}=1}\|Af\|_{L^q(N)} = \sup_{\|f\|_{L^p(M)}=1,\ \|g\|_{L^{q'}(N)}=1}\int_N (Af)g\,d\nu. \tag{4.15}$$

We will base our estimate of the norm of $A : L^{p_t}(M) \to L^{q_t}(N)$ on (4.15). Moreover,
as simple functions are dense in $L^{p_t}(M)$ and $L^{q_t'}(N)$, it suffices to use in (4.15) only simple
functions f and g with $\|f\|_{L^{p_t}(M)} = \|g\|_{L^{q_t'}(N)} = 1$, of the form
$$f(x) = \sum_{j=1}^{n} a_j e^{i\alpha_j(x)}\chi_{A_j}(x), \quad g(y) = \sum_{j=1}^{m} b_j e^{i\beta_j(y)}\chi_{B_j}(y), \quad x \in M,\ y \in N, \tag{4.16}$$

with $a_j, b_j > 0$, µ-measurable sets $A_j$ and ν-measurable sets $B_j$. Since 0 < t < 1, neither
$p_t$ nor $q_t'$ can be equal to +∞, hence $\mu(A_j), \nu(B_j) < +\infty$.
Let us now extend the definition of $p_t$ and $q_t$ to all complex numbers ζ with 0 ≤
Re ζ ≤ 1:
$$\frac{1}{p(\zeta)} = \frac{1-\zeta}{p_0} + \frac{\zeta}{p_1}, \quad \frac{1}{q(\zeta)} = \frac{1-\zeta}{q_0} + \frac{\zeta}{q_1}, \quad \frac{1}{q'(\zeta)} = \frac{1-\zeta}{q_0'} + \frac{\zeta}{q_1'}.$$

Fix t ∈ (0, 1) and a pair of (complex-valued) functions $f \in L^{p_t}(M)$ and $g \in L^{q_t'}(N)$ of
the form (4.16). Consider a family of functions
$$u(x,\zeta) = \sum_{j=1}^{n} a_j^{p_t/p(\zeta)} e^{i\alpha_j(x)}\chi_{A_j}(x), \quad v(y,\zeta) = \sum_{j=1}^{m} b_j^{q_t'/q'(\zeta)} e^{i\beta_j(y)}\chi_{B_j}(y),$$

with x ∈ M, y ∈ N and 0 ≤ Re ζ ≤ 1. Note that, when ζ = t,
$$u(x,t) = f(x) \ \text{ and } \ v(y,t) = g(y). \tag{4.17}$$
As both 1/p(ζ) and 1/q′(ζ) are linear in ζ, the functions u(x, ζ) and v(y, ζ) are analytic
in ζ in the strip S = {ζ : 0 ≤ Re ζ ≤ 1}. Since u(x, ζ) and v(y, ζ) are simple functions
of x and y, respectively, vanishing outside of a set of finite measure for each ζ ∈ S fixed,
they lie in $L^{p_0}(M) \cap L^{p_1}(M)$, and $L^{q_0'}(N) \cap L^{q_1'}(N)$, respectively. Therefore, we can
define
$$F(\zeta) = \int_N (Au)(y,\zeta)v(y,\zeta)\,d\nu = \sum_{j=1}^{n}\sum_{k=1}^{m} a_j^{p_t/p(\zeta)} b_k^{q_t'/q'(\zeta)}\int_N (A\Psi_j)(y)e^{i\beta_k(y)}\chi_{B_k}(y)\,d\nu,$$

with $\Psi_j(x) = e^{i\alpha_j(x)}\chi_{A_j}(x)$. According to (4.15) and (4.17), in order to prove that
$$\|A_t\|_{L^{p_t}(M)\to L^{q_t}(N)} \le k_0^{1-t}k_1^{t}, \tag{4.18}$$
it suffices to show that
$$|F(t)| \le k_0^{1-t}k_1^{t}. \tag{4.19}$$
The function F(ζ) is analytic and bounded in the strip S, as, for instance, for ζ = η + iξ,
0 ≤ η ≤ 1:
$$\big|a_j^{p_t/p(\zeta)}\big| = \big|a_j^{p_t\zeta/p_1 + p_t(1-\zeta)/p_0}\big| = a_j^{p_t\eta/p_1 + p_t(1-\eta)/p_0} \le C_j < +\infty.$$

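On the boundary of the strip $S$ we have the following bounds. We may assume that the sets $A_j$ are pairwise disjoint, and likewise the sets $B_j$. Along the line $\eta = 0$, for $\zeta = i\xi$,
\[ \|u(x,i\xi)\|_{L^{p_0}(M)} = \left( \int_M \sum_{j=1}^{n} \left| a_j^{p_t(i\xi)/p_1 + p_t(1-i\xi)/p_0} \right|^{p_0} \chi_{A_j}(x) \, d\mu \right)^{1/p_0} = \left( \int_M \sum_{j=1}^{n} |a_j|^{p_t} \chi_{A_j}(x) \, d\mu \right)^{1/p_0} = \|f\|_{L^{p_t}(M)}^{p_t/p_0} = 1, \]
and
\[ \|v(y,i\xi)\|_{L^{q_0'}(N)} = \left( \int_N \sum_{j=1}^{m} \left| b_j^{q_t'(i\xi)/q_1' + q_t'(1-i\xi)/q_0'} \right|^{q_0'} \chi_{B_j}(y) \, d\nu \right)^{1/q_0'} = \left( \int_N \sum_{j=1}^{m} |b_j|^{q_t'} \chi_{B_j}(y) \, d\nu \right)^{1/q_0'} = \|g\|_{L^{q_t'}(N)}^{q_t'/q_0'} = 1. \]
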
It follows that
\[ |F(i\xi)| \le \|(Au)(i\xi)\|_{L^{q_0}(N)} \|v(i\xi)\|_{L^{q_0'}(N)} \le \|A\|_{L^{p_0}(M) \to L^{q_0}(N)} \|u(i\xi)\|_{L^{p_0}(M)} \|v(i\xi)\|_{L^{q_0'}(N)} \le k_0. \]
Similarly, we can show that along the line $\zeta = 1 + i\xi$ we have $\|u(x, 1+i\xi)\|_{L^{p_1}(M)} \le 1$ and $\|v(y, 1+i\xi)\|_{L^{q_1'}(N)} \le 1$, which implies that $|F(1+i\xi)| \le k_1$. The three lines theorem now implies that $|F(\eta + i\xi)| \le k_0^{1-\eta} k_1^{\eta}$, hence (4.19) holds. $\Box$
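
The interpolation bound can be sanity-checked in the simplest finite-dimensional setting. The sketch below is an illustration, not part of the proof: it assumes $M = N = \{1, 2, 3\}$ with counting measure, $p_0 = q_0 = 1$, $p_1 = q_1 = \infty$, and an arbitrary sample matrix `SAMPLE` chosen for this example. In that setting $k_0$ is the largest absolute column sum and $k_1$ the largest absolute row sum, and random complex vectors give a lower estimate of the $\ell^p \to \ell^p$ norm that should stay below $k_0^{1-t} k_1^{t}$ with $t = 1 - 1/p$.

```python
import random

def norm_l1_to_l1(a):
    """Operator norm on l^1: the largest absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))

def norm_linf_to_linf(a):
    """Operator norm on l^infinity: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in a)

def lp_norm(x, p):
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def apply_matrix(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def interpolated_bound(a, p):
    # With p0 = q0 = 1 and p1 = q1 = infinity, 1/p = 1 - t, so t = 1 - 1/p.
    t = 1.0 - 1.0 / p
    return norm_l1_to_l1(a) ** (1.0 - t) * norm_linf_to_linf(a) ** t

def largest_observed_ratio(a, p, trials=2000, seed=0):
    """Lower estimate of the l^p -> l^p norm from random complex vectors."""
    rng = random.Random(seed)
    n = len(a[0])
    best = 0.0
    for _ in range(trials):
        x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        best = max(best, lp_norm(apply_matrix(a, x), p) / lp_norm(x, p))
    return best

# Hypothetical sample operator, chosen only for illustration.
SAMPLE = [[2.0, -1.0, 0.5], [0.0, 3.0, 1.0], [1.5, 0.5, -2.0]]

if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        print(p, largest_observed_ratio(SAMPLE, p), interpolated_bound(SAMPLE, p))
```

Random sampling only gives a lower estimate of the true norm, so the check cannot prove the bound; it can only fail to contradict it.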

The Lp -bounds on the Hilbert transform


Recall that we would like to prove the following theorem.

Theorem 4.7 Given 1 < p < ∞ there exists Cp > 0 so that

kHf kLp ≤ Cp kf kLp for all f ∈ Lp (R). (4.20)

Proof. The issue here is that we would like to use the interpolation theorem for the
proof but we only have one trivial bound:

kHf k2 = kf k2 ,

and interpolation requires two bounds! Hence, we start fishing for the second bound.
We first consider p ≥ 2. It suffices to establish (4.20) for f ∈ S(R). Consider a smaller
set
S0 = {f ∈ S : ∃ε > 0 such that fˆ(ξ) = 0 for |ξ| < ε}.
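Let us show that $S_0$ is dense in $L^p(\mathbb{R})$. Given any $f \in S$ we'll find a sequence $g_n \in S_0$ such that $\|f - g_n\|_{L^p} \to 0$ as $n \to +\infty$. For $p = 2$ this is trivial: take a smooth function $\chi(\xi)$ such that $0 \le \chi(\xi) \le 1$, $\chi(\xi) = 0$ for $|\xi| \le 1$, $\chi(\xi) = 1$ for $|\xi| > 2$, and set
\[ g_n(x) = \int e^{2\pi i \xi x} \hat f(\xi) \chi(n\xi) \, d\xi, \]
so that
\[ \|f - g_n\|_{L^2}^2 \le \int_{-2/n}^{2/n} |\hat f(\xi)|^2 \, d\xi \to 0 \quad \text{as } n \to +\infty. \tag{4.21} \]
On the other hand, for $p = +\infty$ we have
\[ \|f - g_n\|_{L^\infty} \le \int_{-2/n}^{2/n} |\hat f(\xi)| \, d\xi \to 0 \quad \text{as } n \to +\infty. \tag{4.22} \]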

Interpolating between p = 2 and p = +∞ we conclude that

kf − gn kLp → 0 as n → +∞ (4.23)

for all p ≥ 2, hence S0 is dense in Lp (R) for 2 ≤ p < +∞.

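Given $f \in S_0$, $\widehat{Hf}(\xi) = -i(\mathrm{sgn}\,\xi)\hat f(\xi)$ is a Schwartz class function (there is no discontinuity at $\xi = 0$), thus $Hf$ is also in $S(\mathbb{R})$. We may then write
\[ p(x) = (f + iHf)(x) = \int_{\mathbb{R}} (1 + \mathrm{sgn}(\xi)) \hat f(\xi) e^{2\pi i \xi x} \, d\xi = 2 \int_0^\infty \hat f(\xi) e^{2\pi i \xi x} \, d\xi, \]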

and consider its extension to the complex plane:
\[ p(z) = 2 \int_0^\infty \hat f(\xi) e^{2\pi i \xi z} \, d\xi. \]

The function $p(z)$ is holomorphic in the upper half-plane $\{\mathrm{Im}\, z > 0\}$ and is continuous up to the boundary $y = 0$. Since $f \in S_0$ there exists $\varepsilon > 0$ so that $\hat f(\xi) = 0$ for $|\xi| \le \varepsilon$. Thus, $p(z)$ satisfies an exponential decay bound
\[ |p(z)| \le 2 e^{-2\pi \varepsilon y} \|\hat f\|_{L^1}, \qquad z = x + iy. \tag{4.24} \]
Integrating $p^4(z)$ along the contour $C_R$ which consists of the interval $[-R, R]$ along the real axis and the semicircle $\{x^2 + y^2 = R^2, \ y > 0\}$, and passing to the limit $R \to +\infty$ with the help of (4.24) leads to
\[ \lim_{R \to +\infty} \int_{-R}^{R} (f(x) + iHf(x))^4 \, dx = 0. \]

As both $f$ and $Hf$ are in $S_0$, the integral above converges absolutely, hence
\[ \int_{\mathbb{R}} (f(x) + iHf(x))^4 \, dx = 0. \]

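The real part above gives
\[ \int_{\mathbb{R}} (Hf(x))^4 \, dx = \int_{\mathbb{R}} \left[ -f^4(x) + 6 f^2(x) (Hf)^2(x) \right] dx \le 6 \int_{\mathbb{R}} f^2(x) (Hf)^2(x) \, dx \le \int_{\mathbb{R}} \left( 18 f^4(x) + \frac{1}{2} (Hf)^4(x) \right) dx, \]
where the last step uses $6ab \le 18a^2 + \frac{1}{2} b^2$ with $a = f^2$ and $b = (Hf)^2$. Hence
\[ \int_{\mathbb{R}} (Hf(x))^4 \, dx \le 36 \int_{\mathbb{R}} f^4(x) \, dx, \tag{4.25} \]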

for any function f ∈ S0 . As we have shown that S0 is dense in any Lp (R), 2 ≤ p < ∞, we
know that (4.25) holds for all f ∈ L4 (R). Therefore, the Hilbert transform is a bounded
operator L4 (R) → L4 (R). As we know that it is also bounded from L2 (R) to L2 (R), the
Riesz-Thorin interpolation theorem implies that kHf kLp ≤ Cp kf kLp for all 2 ≤ p ≤ 4.
An argument identical to the above, integrating the function p2k (z) over the same
contour, shows that H is bounded from L2k (R) to L2k (R) for all integers k ≥ 1. It
follows then from the Riesz-Thorin interpolation theorem that kHf kLp ≤ Cp kf kLp for
all 2 ≤ p < +∞.
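
The vanishing of the real part of $\int (f + iHf)^4 \, dx$ can be checked numerically on one explicit pair. The sketch below assumes the standard pair $f(x) = 1/(1+x^2)$, $Hf(x) = x/(1+x^2)$, which is not taken from the text; this $f$ is not in $S_0$, but $f + iHf = 1/(1 - ix)$ still extends holomorphically to the upper half-plane. Expanding $(f + iHf)^4$, the real part of the integrand is $f^4 - 6 f^2 (Hf)^2 + (Hf)^4$. The substitution $x = \tan\theta$ turns each integral into one over $(-\pi/2, \pi/2)$, which a midpoint rule handles well.

```python
import math

def fourth_moments(n=4096):
    """Return (int f^4, int f^2 (Hf)^2, int (Hf)^4) over the real line.

    Assumes f(x) = 1/(1+x^2) and Hf(x) = x/(1+x^2). With x = tan(theta),
    f = cos^2(theta), Hf = sin(theta) cos(theta), dx = d(theta)/cos^2(theta).
    """
    h = math.pi / n
    f4 = mixed = h4 = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h   # midpoints avoid cos(theta) = 0
        c, s = math.cos(theta), math.sin(theta)
        f, hf = c * c, s * c
        w = h / (c * c)                        # Jacobian of x = tan(theta)
        f4 += f ** 4 * w
        mixed += f * f * hf * hf * w
        h4 += hf ** 4 * w
    return f4, mixed, h4

if __name__ == "__main__":
    f4, mixed, h4 = fourth_moments()
    print("real part of the integral:", f4 - 6 * mixed + h4)
    print("ratio int (Hf)^4 / int f^4:", h4 / f4)
```

For this pair the three integrals have closed forms $5\pi/16$, $\pi/16$ and $\pi/16$, so the real part cancels exactly.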

It remains to consider 1 < p < 2 – this is done using the duality argument. Let
q > 2 be the dual exponent of p, 1/p + 1/q = 1. As the operator H : Lq (R) → Lq (R)
is bounded – this follows from what we have already done, as q > 2, so is its adjoint
H ∗ : Lp (R) → Lp (R) defined by hH ∗ f, gi = hf, Hgi, with f ∈ Lp (R), g ∈ Lq (R).
However, identity (4.5) says that H ∗ = −H, hence the boundedness of H ∗ implies that
H : Lp (R) → Lp (R) is also bounded. 
The Hilbert transform does not map $L^1(\mathbb{R}) \to L^1(\mathbb{R})$ but we have the following result due to Kolmogorov. We say that a function $g$ belongs to weak-$L^1$, $g \in L^1_w$, if
\[ P(\lambda) := |\{x : |g(x)| > \lambda\}| \le \frac{C}{\lambda} \]
for all $\lambda > 0$. This notion is motivated by the identity
\[ \int |g(x)| \, dx = \int_0^\infty |\{x : |g(x)| > \lambda\}| \, d\lambda. \tag{4.26} \]


Hence, for $L^1$-functions the "tail distribution function" $P(\lambda)$ is integrable while for $L^1_w$-functions it may barely miss being integrable. Note that if $g \in L^1(\mathbb{R})$ then
\[ \lambda |\{x : |g(x)| > \lambda\}| \le \int |g(x)| \, dx = \|g\|_1, \]
thus all functions in $L^1$ lie in $L^1_w$. On the other hand, for instance, the function $g(x) = 1/x$ in $\mathbb{R}$ lies in $L^1_w(\mathbb{R})$ but not in $L^1(\mathbb{R})$. We have the following theorem.

Theorem 4.8 There exists $C > 0$ so that for any $f \in L^1(\mathbb{R})$ and any $\lambda > 0$ the following estimate holds:
\[ m\{x : |Hf(x)| \ge \lambda\} \le \frac{C}{\lambda} \int_{\mathbb{R}} |f(x)| \, dx. \]
We will not prove this theorem here. However, we note that the Marcinkiewicz interpolation theorem (a generalization of the Riesz-Thorin theorem) allows us to conclude from Theorem 4.8 and the obvious boundedness of the Hilbert transform from $L^2$ to $L^2$ that the Hilbert transform is bounded from $L^p(\mathbb{R})$ to $L^p(\mathbb{R})$ for $1 < p < 2$. Combined with the duality argument above, this provides an alternative proof of Theorem 4.7.
The Marcinkiewicz theorem says the following. First, let us generalize (4.26) to $p > 1$. Generally, for an increasing differentiable function $\phi(s)$ with $\phi(0) = 0$ we have the relation
\[ \int \phi(|f(x)|) \, dx = \int_0^\infty \phi'(\lambda) P(\lambda) \, d\lambda, \qquad P(\lambda) = |\{x : |f(x)| > \lambda\}|. \tag{4.27} \]


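To see that, we simply write
\[ \int \phi(|f(x)|) \, dx = \int \int_0^{|f(x)|} \phi'(\lambda) \, d\lambda \, dx = \int_0^\infty \phi'(\lambda) |\{x : |f(x)| > \lambda\}| \, d\lambda = \int_0^\infty \phi'(\lambda) P(\lambda) \, d\lambda. \]
As a consequence, for $p > 1$ we have
\[ \|f\|_p^p = p \int_0^\infty \lambda^{p-1} P(\lambda) \, d\lambda. \tag{4.28} \]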
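
A quick numerical check of the layer-cake formula for $\|f\|_p^p$ on one example may be helpful. The sketch assumes $f(x) = x$ on $[0,1]$ with Lebesgue measure (an example chosen here, not taken from the text), for which $P(\lambda) = \max(0, 1 - \lambda)$ and $\|f\|_p^p = 1/(p+1)$.

```python
def distribution_function(lam):
    """P(lambda) = |{x in [0,1] : x > lambda}| for the assumed example f(x) = x."""
    return max(0.0, 1.0 - lam)

def layer_cake(p, n=20000):
    """Midpoint rule for p * int_0^1 lambda^(p-1) P(lambda) d(lambda).

    The upper limit is 1 because P vanishes for lambda >= 1 in this example.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) * h
        total += p * lam ** (p - 1.0) * distribution_function(lam)
    return total * h

def direct_norm_power(p):
    """int_0^1 x^p dx, evaluated in closed form."""
    return 1.0 / (p + 1.0)

if __name__ == "__main__":
    for p in (1.0, 1.5, 2.0, 4.0):
        print(p, layer_cake(p), direct_norm_power(p))
```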

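We say that an operator $T$ from $L^p$ to $L^q$ (with $q < \infty$) is weak $(p, q)$ if
\[ |\{y : |Tf(y)| > \lambda\}| \le \left( \frac{C \|f\|_p}{\lambda} \right)^q, \]
and we say that $T$ is weak $(p, \infty)$ if $T$ is bounded from $L^p$ to $L^\infty$. Note that if $T$ is bounded from $L^p$ to $L^q$ then it is weak $(p, q)$, because if we set
\[ E_\lambda = \{y : |Tf(y)| > \lambda\}, \]
then
\[ |E_\lambda| = \int_{E_\lambda} 1 \le \int_{E_\lambda} \left| \frac{Tf(x)}{\lambda} \right|^q dx \le \frac{\|Tf\|_q^q}{\lambda^q} \le \left( \frac{C \|f\|_p}{\lambda} \right)^q. \]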
Theorem 4.9 (Marcinkiewicz Interpolation Theorem) Let $1 \le p_0 < p_1 \le \infty$, and let $T$ be a sublinear operator that is weak $(p_0, p_0)$ and weak $(p_1, p_1)$. Then $T$ is a bounded operator from $L^p$ to $L^p$ for any $p_0 < p < p_1$.
The remarkable fact is that weak bounds at the end-points imply a strong bound for
intermediate values of p.

5 Harmonic functions on D
We now consider the properties of harmonic functions on the unit disk D = {|z| < 1}.

5.1 The Poisson kernel


We have discussed above the Hilbert transform. Recall how it was defined: we take a function $f$, extend it to a harmonic function $u(x, t)$ in the upper half plane, find the harmonic function $v(x, t)$ conjugate to $u$, and consider the limit $t \to 0$. Then $Hf(x) = v(x, 0)$, at least in the $L^p$-sense. Here, we will be concerned with the pointwise convergence of the harmonic extension $u(x, t)$ to the function $f(x)$ itself. We will also consider the problem on the unit disk rather than on the upper half plane, but these questions are equivalent for these two domains.
Let us derive a candidate solution formula for the above problem: given a function
f on the boundary of D find a harmonic function u on D which attains these boundary
values. This Dirichlet problem is formulated too vaguely – much of what we will do now
will be devoted to a proper interpretation of what we mean by attaining the boundary
values and what kind of regularity we wish u to satisfy on all of D. For the moment, let
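us proceed heuristically. Starting with the Fourier series for the function $f$:
\[ f(\theta) = \sum_{n \in \mathbb{Z}} \hat f(n) e^{2\pi i n \theta}, \]
we observe that one harmonic extension to the interior is given by
\[ u(z) = \sum_{n \in \mathbb{Z}} \hat f(n) z^n = \sum_{n \in \mathbb{Z}} \hat f(n) r^n e^{2\pi i n \theta}, \qquad z = r e^{2\pi i \theta}. \]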

This is singular at $z = 0$, though, in case $\hat f(n) \ne 0$ for some $n < 0$. Since both $z^n$ and $\bar z^n$ are (complex) harmonic, we can avoid the singularity by defining
\[ u(z) = \sum_{n=0}^{\infty} \hat f(n) z^n + \sum_{n=-\infty}^{-1} \hat f(n) \bar z^{|n|}, \tag{5.1} \]
which at least formally is a solution of our Dirichlet problem.


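Inserting $z = r e^{2\pi i \theta}$ and
\[ \hat f(n) = \int_0^1 e^{-2\pi i n \varphi} f(\varphi) \, d\varphi \]
into (5.1) yields
\[ u(r e^{2\pi i \theta}) = \int_0^1 \sum_{n \in \mathbb{Z}} r^{|n|} e^{2\pi i n (\theta - \varphi)} f(\varphi) \, d\varphi =: \int_0^1 P_r(\theta - \varphi) f(\varphi) \, d\varphi, \]
where the Poisson kernel is
\[ P_r(\theta) := \sum_{n \in \mathbb{Z}} r^{|n|} e^{2\pi i n \theta} = \frac{1 - r^2}{1 - 2r \cos(2\pi\theta) + r^2}, \]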
as can be verified by explicit summation. This is a formal answer, and our goal is to
understand how it can be interpreted. We start with some properties of Pr .
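
The explicit summation can be sanity-checked numerically. The sketch below compares a truncated version of the series $\sum_n r^{|n|} e^{2\pi i n \theta}$ (written with cosines) against the closed form, and also averages the closed form over the circle; the truncation length and grid size are arbitrary choices made for this illustration.

```python
import math

def poisson_closed(r, theta):
    """Closed form (1 - r^2) / (1 - 2 r cos(2 pi theta) + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(2.0 * math.pi * theta) + r * r)

def poisson_series(r, theta, terms=200):
    """Truncated series: 1 + 2 * sum_{n=1}^{terms} r^n cos(2 pi n theta)."""
    total = 1.0
    for n in range(1, terms + 1):
        total += 2.0 * r ** n * math.cos(2.0 * math.pi * n * theta)
    return total

def mean_over_circle(r, n=2048):
    """Uniform-grid average of P_r over [0, 1); approximates int_0^1 P_r."""
    return sum(poisson_closed(r, k / n) for k in range(n)) / n

if __name__ == "__main__":
    print(poisson_series(0.7, 0.13), poisson_closed(0.7, 0.13))
    print(mean_over_circle(0.7))
```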

Lemma 5.1 The function $u(z) = P_r(\theta)$, $z = r e^{2\pi i \theta}$, is a positive harmonic function on $\mathbb{D}$. It satisfies
\[ \int_0^1 P_r(\theta) \, d\theta = 1 \]
for any $0 \le r < 1$, and for any (complex) Borel measure $\mu$ on $\mathbb{T}$,
\[ u(z) = (P_r * \mu)(\theta) \]
defines a harmonic function on $\mathbb{D}$.

Proof. The fact that the integral of the Poisson kernel equals one follows immediately from its defining series. The fact that a convolution with a measure is harmonic follows from the formula for $P_r(\theta)$.
The Poisson kernel close to the boundary is an approximation of identity in the
following sense.

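Definition 5.2 A sequence $\{\Phi_n\} \subset L^\infty(\mathbb{T})$ is called an approximate identity provided
(A1) $\int_0^1 \Phi_n(\theta) \, d\theta = 1$ for all $n$,
(A2) $\sup_n \int_0^1 |\Phi_n(\theta)| \, d\theta < \infty$,
(A3) for all $\delta > 0$ one has $\int_{|\theta| > \delta} |\Phi_n(\theta)| \, d\theta \to 0$ as $n \to \infty$.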

The same definition applies, with obvious modifications, to families of the form {Φt }0<t<1
(with n → ∞ replaced by t → 1−).
A standard example is the box kernel
\[ \left\{ \frac{1}{2\varepsilon} \chi_{[-\varepsilon, \varepsilon]} \right\}_{0 < \varepsilon < 1/2} \]
in the limit $\varepsilon \to 0$. The main example for us right now is, of course, the Poisson kernel $\{P_r\}_{0 < r < 1}$. We leave it to the reader to check that it satisfies (A1)–(A3) as $r \to 1$. The significance of approximate identities lies in the following.
Lemma 5.3 For any approximate identity Φn one has
1. If f ∈ C(T), then kΦn ∗ f − f k∞ → 0 as n → ∞
2. If f ∈ Lp (T) where 1 ≤ p < ∞, then kΦn ∗ f − f kp → 0 as n → ∞.
These statements carry over to approximate identities Φt , 0 < t < 1 simply by replacing
n → ∞ with t → 1.
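Proof. For the proof of the first statement, note that, since $\mathbb{T}$ is compact, $f$ is uniformly continuous. Given $\varepsilon > 0$, let $\delta > 0$ be such that
\[ \sup_x \sup_{|y| < \delta} |f(x - y) - f(x)| < \varepsilon. \]
Then, by (A1)–(A3), we have
\[ |(\Phi_n * f)(x) - f(x)| = \left| \int_{\mathbb{T}} (f(x - y) - f(x)) \Phi_n(y) \, dy \right| \le \sup_{x \in \mathbb{T}} \sup_{|y| < \delta} |f(x - y) - f(x)| \int_{\mathbb{T}} |\Phi_n(t)| \, dt + \int_{|y| \ge \delta} |\Phi_n(y)| \, 2\|f\|_\infty \, dy < C\varepsilon, \]
provided $n$ is large.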
For the second part, fix $f \in L^p$. Let $g \in C(\mathbb{T})$ with $\|f - g\|_p < \varepsilon$. Then
\[ \|\Phi_n * f - f\|_p \le \|\Phi_n * (f - g)\|_p + \|f - g\|_p + \|\Phi_n * g - g\|_p \le \left( \sup_n \|\Phi_n\|_1 + 1 \right) \|f - g\|_p + \|\Phi_n * g - g\|_\infty, \]
where we have used Young's inequality
\[ \|f_1 * f_2\|_p \le \|f_1\|_1 \|f_2\|_p \]
to obtain the first term on the right-hand side. Using (A2), the assumption on $g$, as well as the first part finishes the proof.
An immediate consequence is the following simple and fundamental result.
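Theorem 5.4 Let $f \in C(\mathbb{T})$. The unique harmonic function $u$ on $\mathbb{D}$, with $u \in C(\bar{\mathbb{D}})$ and $u = f$ on $\mathbb{T}$, is given by $u(z) = (P_r * f)(\theta)$, $z = r e(\theta)$, where $e(\theta) := e^{2\pi i \theta}$.
Proof. Uniqueness follows from the maximum principle. For the existence, we observed before that $u(z) := (P_r * f)(\theta)$ with $|z| < 1$ is harmonic on $\mathbb{D}$. By Lemma 5.3, we have
\[ \sup_{\theta \in \mathbb{T}} |u(r e^{2\pi i \theta}) - f(\theta)| \to 0 \quad \text{as } r \to 1-. \]
This implies that we can extend $u$ continuously to $\bar{\mathbb{D}}$ by setting it equal to $f$ on $\mathbb{T}$.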
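
The uniform convergence of the Poisson integral to continuous boundary data can be observed numerically. The sketch below assumes the boundary function $f(\theta) = \cos(2\pi\theta) + \frac{1}{2}\sin(4\pi\theta)$, chosen for this illustration because its harmonic extension $r\cos(2\pi\theta) + \frac{1}{2} r^2 \sin(4\pi\theta)$ is known in closed form. The convolution is approximated by a Riemann sum on a uniform grid, and the sup norm is sampled at finitely many angles.

```python
import math

def poisson_kernel(r, theta):
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(2.0 * math.pi * theta) + r * r)

def boundary_data(theta):
    """Assumed sample boundary function f on the circle."""
    return math.cos(2.0 * math.pi * theta) + 0.5 * math.sin(4.0 * math.pi * theta)

def poisson_integral(r, theta, n=4096):
    """Riemann sum for int_0^1 P_r(theta - phi) f(phi) d(phi)."""
    total = 0.0
    for k in range(n):
        phi = k / n
        total += poisson_kernel(r, theta - phi) * boundary_data(phi)
    return total / n

def exact_extension(r, theta):
    """Harmonic extension of the sample data: each mode n is damped by r^|n|."""
    return r * math.cos(2.0 * math.pi * theta) + 0.5 * r * r * math.sin(4.0 * math.pi * theta)

def sup_error(r, samples=32):
    """Sampled sup over theta of |(P_r * f)(theta) - f(theta)|."""
    return max(abs(poisson_integral(r, j / samples) - boundary_data(j / samples))
               for j in range(samples))

if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        print(r, sup_error(r))
```

The grid must be fine compared with $1 - r$ for the Riemann sum to resolve the peak of $P_r$, so the default `n` would need to grow for $r$ closer to 1.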

5.2 Hardy classes of harmonic functions


Next, we wish to reverse this process and understand which classes of harmonic functions
on D assume boundary values on T. Moreover, we need to clarify which boundary values
arise here and what we mean by “assume”. Particularly important classes known as the
“little” Hardy spaces are as follows:
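Definition 5.5 For any $1 \le p \le \infty$ define
\[ h^p(\mathbb{D}) := \left\{ u : \mathbb{D} \to \mathbb{C} \text{ harmonic} \ \Big| \ \sup_{0 < r < 1} \int_0^1 |u(r e^{2\pi i \theta})|^p \, d\theta < \infty \right\} \]
with the norm
\[ |||u|||_p := \sup_{0 < r < 1} \|u(r e(\cdot))\|_{L^p(\mathbb{T})}. \]
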

By the mean value property, any positive harmonic function belongs to the space $h^1(\mathbb{D})$: the $L^1$-norm of $u$ over any circle centered at the origin equals its value at the center and is thus bounded. Among those, the most important example for us is $P_r(\theta) \in h^1(\mathbb{D})$. Observe that this function has boundary values $P_r \to \delta_0$ (the Dirac mass at $\theta = 0$) as $r \to 1-$, where the convergence is in the sense of distributions. This already shows that even for a very smooth function in $h^1(\mathbb{D})$ the boundary value may not be a function but only a distribution or a measure. In what follows, $M(\mathbb{T})$ denotes the complex-valued Borel measures and $M^+(\mathbb{T}) \subset M(\mathbb{T})$ the positive Borel measures.
Theorem 5.6 There is a one-to-one correspondence between $h^1(\mathbb{D})$ and $M(\mathbb{T})$ given by $\mu \in M(\mathbb{T}) \mapsto F_r(\theta) := (P_r * \mu)(\theta)$. Under this map, any $\mu \in M^+(\mathbb{T})$ relates uniquely to a positive harmonic function. Furthermore,
\[ \|\mu\| = \sup_{0 < r < 1} \|F_r\|_1 = \lim_{r \to 1} \|F_r\|_1 \tag{5.2} \]
and the following properties hold:

1. The measure $\mu$ is absolutely continuous with respect to Lebesgue measure ($\mu \ll d\theta$) if and only if $\{F_r\}$ has a limit in $L^1(\mathbb{T})$ as $r \to 1$. If so, then $d\mu = f \, d\theta$ where $f$ is the $L^1$-limit of $F_r$.
2. The following are equivalent for $1 < p \le \infty$:
(i) $d\mu = f \, d\theta$ with $f \in L^p(\mathbb{T})$,
(ii) $\{F_r\}_{0 < r < 1}$ is $L^p$-bounded,
(iii) $\{F_r\}$ converges in $L^p$ if $1 < p < \infty$, and in the weak-$*$ sense in $L^\infty$ if $p = \infty$, both as $r \to 1$.
3. The following are equivalent: (i) $f$ is continuous, (ii) $F_r = P_r * f$ extends to a continuous function on $\bar{\mathbb{D}}$, (iii) $F_r$ converges uniformly as $r \to 1-$.

This theorem identifies $h^1(\mathbb{D})$ with $M(\mathbb{T})$, and $h^p(\mathbb{D})$ with $L^p(\mathbb{T})$ for $1 < p \le \infty$. Note that there is an improvement of regularity: even if the boundary value is a measure, the restriction of the function to any circle of radius $r < 1$ lies in $L^1(\mathbb{T})$. Moreover, $h^\infty(\mathbb{D})$ contains the subclass of harmonic functions that can be extended continuously onto $\bar{\mathbb{D}}$; this subclass is the same as $C(\mathbb{T})$. Before proving the theorem we present two simple lemmas. In what follows we use the notation $F_r(\theta) := F(r e^{2\pi i \theta})$.

Lemma 5.7
(i) If $F \in C(\bar{\mathbb{D}})$ and $\Delta F = 0$ in $\mathbb{D}$, then $F_r = P_r * F_1$ for any $0 \le r < 1$.
(ii) As a function of $r \in (0, 1)$ the norms $\|F_r\|_p$ are non-decreasing for any $1 \le p \le \infty$.

Proof. Part (i) is a restatement of Theorem 5.4. For (ii), first note that for any $0 < s \le r < 1$ we have
\[ F_s(\theta) = (P_{s/r} * F_r)(\theta), \]
by a simple rescaling. Young's inequality then implies
\[ \|F_s\|_p \le \|P_{s/r}\|_1 \|F_r\|_p = \|F_r\|_p, \]
as claimed.

Lemma 5.8 Let F ∈ h1 (D). Then there exists a unique measure µ ∈ M(T) such that
Fr = Pr ∗ µ.

Proof. Since the unit ball of $M(\mathbb{T})$ is weak-$*$ compact, there exists a subsequence $r_j \to 1$ with $F_{r_j} \to \mu$ in the weak-$*$ sense for some $\mu \in M(\mathbb{T})$. We first claim that
\[ F_r = P_r * \mu. \tag{5.3} \]
To see that, note that for any $0 < r < 1$, we have
\[ P_r * \mu = \lim_{j \to \infty} (P_r * F_{r_j}) = \lim_{j \to \infty} F_{r r_j} = F_r \]
by Lemma 5.7. Next, in order to show that such $\mu$ is unique, consider any $f \in C(\mathbb{T})$. Then, by Lemma 5.3 and (5.3):
\[ \langle F_r, f \rangle = \langle P_r * \mu, f \rangle = \langle \mu, P_r * f \rangle \to \langle \mu, f \rangle \]
as $r \to 1$. This shows that, in the weak-$*$ sense,
\[ \mu = \lim_{r \to 1} F_r, \tag{5.4} \]
which implies uniqueness of $\mu$.


Proof of Theorem 5.6. If $\mu \in M(\mathbb{T})$, then $P_r * \mu \in h^1(\mathbb{D})$ by Lemma 5.1. Conversely, given $F \in h^1(\mathbb{D})$, by Lemma 5.8 there is a unique $\mu$ so that $F_r = P_r * \mu$. This gives the one-to-one correspondence between $h^1(\mathbb{D})$ and $M(\mathbb{T})$. Moreover, (5.4) and Lemma 5.7 show that
\[ \|\mu\| \le \limsup_{r \to 1} \|F_r\|_1 = \sup_{0 < r < 1} \|F_r\|_1 = \lim_{r \to 1} \|F_r\|_1. \]
Since one also has
\[ \sup_{0 < r < 1} \|F_r\|_1 \le \sup_{0 < r < 1} \|P_r\|_1 \|\mu\| = \|\mu\|, \]
equality (5.2) follows. If $f \in L^1(\mathbb{T})$ and $d\mu = f \, d\theta$, then Lemma 5.3 shows that $F_r \to f$ in $L^1(\mathbb{T})$. Conversely, if $F_r \to f$ in the sense of $L^1(\mathbb{T})$, then because of (5.4) we necessarily have $d\mu = f \, d\theta$, which proves the first part of Theorem 5.6. The other parts are proved similarly, and we omit the details: one invokes Lemma 5.3, part (2) for $1 < p < \infty$ and Lemma 5.3, part (1) if $p = \infty$. $\Box$
Let us make a remark on the Hilbert transform on the circle: it is given by the convolution with the kernel $Q_r(\theta)$, which is the harmonic conjugate of $P_r(\theta)$. It is easy to find $Q_r(\theta)$ since, with $z = r e^{2\pi i \theta}$,
\[ P_r(\theta) = \mathrm{Re}\left( \frac{1+z}{1-z} \right) \]
and therefore
\[ Q_r(\theta) = \mathrm{Im}\left( \frac{1+z}{1-z} \right) = \frac{2r \sin(2\pi\theta)}{1 - 2r \cos(2\pi\theta) + r^2}. \]
Observe that $\{Q_r\}_{0 < r < 1}$ is not an approximate identity, since $Q_1(\theta) = \cot(\pi\theta)$, which is not the density of a measure: it behaves like $1/(\pi\theta)$ close to $\theta = 0$. The Hilbert transform on the circle is the map which is formally defined as follows:
\[ f \mapsto u_f \mapsto \tilde u_f \mapsto \tilde u_f \big|_{\mathbb{T}}, \]
where $u_f$ denotes the harmonic extension to $\mathbb{D}$ and $\tilde u_f$ its harmonic conjugate. From the preceding, $Q_1$ is the kernel of the Hilbert transform. We will not discuss the results for the Hilbert transform on the disk, as they are similar to those in a half space.
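
The relation between the two kernels and the map $(1+z)/(1-z)$ is easy to confirm numerically. The sketch below evaluates both closed forms against the real and imaginary parts of $(1+z)/(1-z)$, compares $Q_r$ with $\cot(\pi\theta)$ for $r$ close to 1, and estimates $\int_0^1 |Q_r| \, d\theta$ by a midpoint rule to illustrate that, unlike for $P_r$, these $L^1$ norms grow as $r \to 1$. Grid sizes and sample points are arbitrary choices for this illustration.

```python
import cmath
import math

def poisson_kernel(r, theta):
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(2.0 * math.pi * theta) + r * r)

def conjugate_kernel(r, theta):
    return (2.0 * r * math.sin(2.0 * math.pi * theta)
            / (1.0 - 2.0 * r * math.cos(2.0 * math.pi * theta) + r * r))

def mobius_value(r, theta):
    """(1 + z) / (1 - z) at z = r exp(2 pi i theta)."""
    z = r * cmath.exp(2j * math.pi * theta)
    return (1.0 + z) / (1.0 - z)

def conjugate_kernel_l1_norm(r, n=20000):
    """Midpoint rule for int_0^1 |Q_r(theta)| d(theta)."""
    return sum(abs(conjugate_kernel(r, (k + 0.5) / n)) for k in range(n)) / n

if __name__ == "__main__":
    w = mobius_value(0.8, 0.15)
    print(w.real, poisson_kernel(0.8, 0.15))
    print(w.imag, conjugate_kernel(0.8, 0.15))
    for r in (0.5, 0.9, 0.99):
        print(r, conjugate_kernel_l1_norm(r))
```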

5.3 Almost everywhere convergence to the boundary data
Finally, we turn to the issue of almost everywhere convergence of $P_r * f$ to $f$ as $r \to 1$. The main idea here is to mimic the proof of the Lebesgue differentiation theorem, which says that for any $f \in L^1(\mathbb{R}^n)$ we have
\[ \frac{1}{|B(x, r)|} \int_{B(x, r)} f(y) \, dy \to f(x) \quad \text{as } r \to 0, \text{ a.e. in } \mathbb{R}^n. \tag{5.5} \]
In fact, the proof we describe below gives the proof of (5.5) as well. In particular, we need as a tool the Hardy-Littlewood maximal function $Mf$, which is defined (for the torus) as follows:
\[ Mf(x) = \sup_{x \in I \subset \mathbb{T}} \frac{1}{|I|} \int_I |f(y)| \, dy, \]
where $I \subset \mathbb{T}$ is an (open) interval and $|I|$ is the length of $I$. It is convenient to think of $Mf(x)$ as, essentially, the maximal function of the box kernel:
\[ \sup_n \left[ n \int_{-1/(2n)}^{1/(2n)} |f(x - y)| \, dy \right] = \sup_n (\chi_n * |f|)(x), \]
where $\chi_n(x) = n \chi_{[-1/2n, 1/2n]}(x)$ is the box kernel.


The most basic facts concerning this (sublinear) operator are contained in the following result.

Proposition 5.9 The Hardy-Littlewood maximal function $M$ is bounded from $L^1$ to weak $L^1$, i.e.,
\[ |\{x \in \mathbb{T} : Mf(x) > \lambda\}| \le \frac{C}{\lambda} \|f\|_1 \]
for all $\lambda > 0$. For any $1 < p \le \infty$, $M$ is bounded on $L^p$.

Proof. Fix some $\lambda > 0$ and any compact set $K$ such that
\[ K \subset \{x \mid Mf(x) > \lambda\}. \tag{5.6} \]
There exists a finite cover $\{I_j\}_{j=1}^{N}$ of $K$ by open arcs $I_j$ such that
\[ \int_{I_j} |f(y)| \, dy > \lambda |I_j| \tag{5.7} \]
for each $j$. We now pass to a more convenient sub-cover (this is known as Wiener's covering lemma and is very much like Vitali's covering lemma). Select an arc of maximal length from $\{I_j\}$; call it $J_1$. Maximality of $J_1$ implies that any $I_j$ such that $I_j \cap J_1 \ne \emptyset$ satisfies $I_j \subset 10 \cdot J_1$, where $10 \cdot J_1$ is the arc with the same center as $J_1$ and ten times the length (if $10 \cdot J_1$ has length larger than 1, then set $10 \cdot J_1 = \mathbb{T}$). Next, remove all arcs from $\{I_j\}_{j=1}^{N}$ that intersect $J_1$. Let $J_2$ be one of the remaining ones with maximal length. Continuing in this fashion we obtain arcs $\{J_\ell\}_{\ell=1}^{L}$ which are pairwise disjoint and so that
\[ \bigcup_{j=1}^{N} I_j \subset \bigcup_{\ell=1}^{L} 10 \cdot J_\ell. \]


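In view of (5.6) and (5.7) therefore,
\[ \mathrm{meas}(K) \le \mathrm{meas}\left( \bigcup_{\ell=1}^{L} 10 \cdot J_\ell \right) \le 10 \sum_{\ell=1}^{L} \mathrm{meas}(J_\ell) \le \frac{10}{\lambda} \sum_{\ell=1}^{L} \int_{J_\ell} |f(y)| \, dy \le \frac{10}{\lambda} \|f\|_1, \]
as claimed. To prove the $L^p$ statement, one interpolates the weak $L^1$ bound with the trivial $L^\infty$ bound
\[ \|Mf\|_\infty \le \|f\|_\infty \]
by means of Marcinkiewicz's interpolation theorem.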
We now introduce a class of approximate identities which can be dominated by the
box kernels. The importance of this idea is that it allows us to dominate the maximal
function associated with an approximate identity by the Hardy-Littlewood maximal
function, see Lemma 5.11 below.

Definition 5.10 Let $\Phi_n$ be an approximate identity as in Definition 5.2. We say that it is radially bounded if there exist functions $\Psi_n$ on $\mathbb{T}$ so that the following additional property holds (it is convenient to think of the torus as the interval $[-1/2, 1/2]$ rather than $[0, 1]$ here):
(A4) $|\Phi_n| \le \Psi_n$, where $\Psi_n$ is even and decreasing, i.e., $\Psi_n(\theta) \le \Psi_n(\varphi)$ for $0 \le \varphi \le \theta \le \frac{1}{2}$, for all $n \ge 1$. Finally, we require that $\sup_n \|\Psi_n\|_1 < \infty$.

We have the following domination lemma.


Lemma 5.11 If $\Phi_n$ satisfies (A4), then for any $f \in L^1(\mathbb{T})$ one has
\[ \sup_n |(\Phi_n * f)(x)| \le Mf(x) \, \sup_n \|\Psi_n\|_1 \tag{5.8} \]
for all $x \in \mathbb{T}$.
Note that the left side of (5.8) can be thought of as the maximal function associated to the approximation of identity $\Phi_n$ in the same way as the Hardy-Littlewood maximal function is associated to the box kernel.
Proof. It suffices to show the following statement: let a non-negative function $K(x)$ defined on the torus $\mathbb{T} = [-1/2, 1/2]$ be even and decreasing on $[0, 1/2]$. Then for any $f \in L^1(\mathbb{T})$ we have
\[ |(K * f)(x)| \le \|K\|_1 Mf(x). \tag{5.9} \]
Indeed, assume that (5.9) holds. Then
\[ \sup_n |(\Phi_n * f)(x)| \le \sup_n (\Psi_n * |f|)(x) \le \left( \sup_n \|\Psi_n\|_1 \right) Mf(x) \]
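and the lemma follows. The idea behind (5.9) is to show that a positive, even and decreasing function $K$ can be written as an average of box kernels, i.e., for some positive measure $\mu$ we have
\[ K(\varphi) = \int_0^{1/2} \chi_{[-\theta, \theta]}(\varphi) \, d\mu(\theta). \tag{5.10} \]
Let us check that
\[ d\mu(\theta) = -dK(\theta) + K(1/2) \, \delta_{\theta = 1/2} \]
is a suitable choice: the right side of (5.10) defines an even function of $\varphi$ and for $0 \le \varphi \le 1/2$ we have
\[ \int_0^{1/2} \chi_{[-\theta, \theta]}(\varphi) \, d\mu(\theta) = \int_\varphi^{1/2} \left[ -dK(\theta) + K(1/2) \delta(\theta - 1/2) \right] = K(\varphi) - K(1/2) + K(1/2) = K(\varphi). \]
Note that (5.10) implies that
\[ \|K\|_1 = \int_{-1/2}^{1/2} K(\varphi) \, d\varphi = 2 \int_0^{1/2} K(\varphi) \, d\varphi = 2 \int_0^{1/2} \left( \int_\varphi^{1/2} d\mu(\theta) \right) d\varphi = 2 \int_0^{1/2} \theta \, d\mu(\theta). \]
Moreover, by (5.10), we have
\[ |(K * f)(\varphi)| = \left| \int_0^{1/2} \left( \frac{1}{2\theta} \chi_{[-\theta, \theta]} * f \right)(\varphi) \, 2\theta \, d\mu(\theta) \right| \le \int_0^{1/2} Mf(\varphi) \, 2\theta \, d\mu(\theta) = Mf(\varphi) \|K\|_1, \]
which is (5.9).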
Finally, we can properly address the question of whether Pr ∗ f → f in the almost
everywhere sense for f ∈ L1 (T). The idea is as follows: the pointwise convergence
is clear from Lemma 5.3 for continuous f . This suggests approximating f ∈ L1 by
a sequence of continuous functions gn , in the L1 norm. This leads to the problem of
an interchange of limits, namely r → 1 and n → ∞. As always in such a situation,
we require some form of uniform control, which is furnished by the Hardy–Littlewood
maximal function.

Theorem 5.12 If Φn satisfies (A1)–(A4), then for any f ∈ L1 (T) one has Φn ∗ f → f
almost everywhere as n → ∞.

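Proof. Pick $\varepsilon > 0$ and let $g \in C(\mathbb{T})$ with $\|f - g\|_1 < \varepsilon$. By Lemma 5.3, with $h = f - g$ one has ($|\cdot|$ denotes the Lebesgue measure of a set)
\[ \begin{aligned}
\left| \{x \in \mathbb{T} : \limsup_{n \to \infty} |(\Phi_n * f)(x) - f(x)| > \sqrt{\varepsilon}\} \right|
&= \left| \{x \in \mathbb{T} : \limsup_{n \to \infty} |(\Phi_n * h)(x) - h(x)| > \sqrt{\varepsilon}\} \right| \\
&\le \left| \{x \in \mathbb{T} : \limsup_{n \to \infty} |(\Phi_n * h)(x)| > \sqrt{\varepsilon}/2\} \right| + \left| \{x \in \mathbb{T} : |h(x)| > \sqrt{\varepsilon}/2\} \right| \\
&\le \left| \{x \in \mathbb{T} : \sup_n |(\Phi_n * h)(x)| > \sqrt{\varepsilon}/2\} \right| + \left| \{x \in \mathbb{T} : |h(x)| > \sqrt{\varepsilon}/2\} \right| \\
&\le \left| \{x \in \mathbb{T} : C Mh(x) > \sqrt{\varepsilon}/2\} \right| + \left| \{x \in \mathbb{T} : |h(x)| > \sqrt{\varepsilon}/2\} \right| \\
&\le C \sqrt{\varepsilon}.
\end{aligned} \]
We used Lemma 5.11 in the next to last inequality, while in the final step we used Proposition 5.9 as well as the Chebyshev inequality (recall that $\|h\|_1 < \varepsilon$). Since $\varepsilon > 0$ is arbitrary, the limsup vanishes almost everywhere.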
As a first corollary, we obtain the classical Lebesgue differentiation theorem: we have
\[ \frac{1}{2\varepsilon} \int_{x - \varepsilon}^{x + \varepsilon} f(y) \, dy \to f(x) \quad \text{a.e. on } \mathbb{T} \]
if $f \in L^1(\mathbb{T})$ (this theorem holds, of course, in any dimension, not just one).
We also get the almost everywhere convergence of the Poisson integrals $P_r * f \to f$ for any $f \in L^1(\mathbb{T})$ as $r \to 1-$. In view of Theorem 5.6 we of course would like to know whether a similar statement holds for measures instead of $L^1$ functions. It turns out that $P_r * \mu \to f$ almost everywhere, where $f$ is the density of the absolutely continuous component of $\mu$ in the Lebesgue decomposition. A most important example here is $P_r$ itself! Indeed, its boundary measure is $\delta_0$ and the almost everywhere limit is identically zero. Hence, in the almost everywhere limit we lose a lot of information: the singular part of the boundary measure. An amazing fact that we will prove next, known as the F. & M. Riesz theorem, states that there is no such loss in the class $h^1(\mathbb{D}) \cap H(\mathbb{D})$, that is, for analytic functions in $h^1(\mathbb{D})$. Indeed, we will see that any such function is the Poisson integral of an $L^1$ function rather than a measure.

5.4 The F. and M. Riesz theorem


In this section we will prove the following theorem due to F. and M. Riesz.
Theorem 5.13 If $\mu \in M(\mathbb{T})$ satisfies
\[ \hat\mu(n) = \int_0^1 e^{-2\pi i n \theta} \, d\mu(\theta) = 0 \]
for all $n < 0$, then $\mu$ is absolutely continuous with respect to Lebesgue measure on $\mathbb{T}$.
An important role in the proof of this theorem will be played by subharmonic functions. Previously, we have defined subharmonic functions by the inequality $\Delta\phi \ge 0$. We will first need to broaden this notion, requiring $\phi$ to be only continuous.

Definition 5.14 Let $\Omega$ be a domain in $\mathbb{R}^2$. We say that a function $f$ which takes values in $\mathbb{R} \cup \{-\infty\}$ is subharmonic if it is continuous and for each $z \in \Omega$ there exists $r_z > 0$ so that for all $0 < r < r_z$ we have
\[ f(z) \le \int_0^1 f(z + r e^{2\pi i \theta}) \, d\theta. \tag{5.11} \]
We summarize the basic properties of subharmonic functions in the following proposition.
Proposition 5.15 (i) Let f and g be subharmonic in Ω, then u(z) = max(f (z), g(z))
is also subharmonic in Ω.
(ii) Let f ∈ C 2 (Ω), then f is subharmonic in Ω if and only if ∆f ≥ 0 in Ω.
(iii) If f is subharmonic, and φ convex and increasing then φ(f (z)) is subharmonic.
(iv) If F is analytic in Ω then log |F | and |F |α , with α > 0, are subharmonic.
The important aspect of part (iv) is that the functions log u and uα (with 0 < α < 1)
are concave not convex but nevertheless log |F | and |F |α are sub-harmonic.
Proof. (i) follows immediately from (5.11) for $f$ and $g$. As for (ii), we have already shown that $\Delta f \ge 0$ implies the sub-mean value property (5.11). The converse follows from our proof of the mean value property for harmonic functions, as well: if we fix $z \in \Omega$, and set
\[ \phi(r) = \int_0^1 f(z + r e^{2\pi i \theta}) \, d\theta, \]
then we have shown that
\[ \phi'(r) = \frac{1}{2\pi r} \int_0^r \int_0^{2\pi} \Delta f(z + \rho e^{i\vartheta}) \, \rho \, d\rho \, d\vartheta. \]
Therefore, if $\Delta f(z) < 0$, we would have $\phi'(r) < 0$ for small $r$, contradicting the local sub-mean value property. Property (iii) can be seen as follows: if $f$ is subharmonic and $\phi$ is increasing then
\[ \phi(f(z)) \le \phi\left( \int_0^1 f(z + r e^{2\pi i \theta}) \, d\theta \right). \]
Next, if, in addition, $\phi$ is convex, we have
\[ \phi\left( \int_0^1 f(z + r e^{2\pi i \theta}) \, d\theta \right) \le \int_0^1 \phi(f(z + r e^{2\pi i \theta})) \, d\theta \]
by Jensen's inequality. Together, the last two inequalities show that $\phi(f(z))$ is subharmonic. Finally, (iv) is shown as follows: if $F(z_0) \ne 0$ then a branch of $\log F(z)$ is analytic in a disk around $z_0$, hence its real part $\log |F(z)|$ is not just subharmonic but actually harmonic near $z_0$. On the other hand, if $F(z_0) = 0$, so that $\log |F(z_0)| = -\infty$, then there is nothing to prove. Finally, we can write $|F|^\alpha(z) = \phi(\log |F|)$, where $\phi(s) = \exp(\alpha s)$ is increasing and convex.
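
The sub-mean value property of $|F|^\alpha$ and $\log|F|$ can be observed on a concrete example. The sketch below assumes $F(z) = z^2 - 1/4$ (zeros at $\pm 1/2$) and $\alpha = 1/2$, both chosen for this illustration, and approximates circle averages by a uniform Riemann sum. On a disk free of zeros the average of $\log|F|$ should equal the central value; once a zero lies inside, the average should be strictly larger.

```python
import cmath
import math

def F(z):
    """Assumed sample analytic function with zeros at +1/2 and -1/2."""
    return z * z - 0.25

def abs_power(z, alpha=0.5):
    return abs(F(z)) ** alpha

def log_abs(z):
    return math.log(abs(F(z)))

def circle_mean(g, center, radius, n=4096):
    """Uniform-grid approximation of int_0^1 g(center + radius e^{2 pi i t}) dt."""
    return sum(g(center + radius * cmath.exp(2j * math.pi * k / n))
               for k in range(n)) / n

if __name__ == "__main__":
    # Disk |z - 0.1| < 0.3 contains no zero of F: log|F| is harmonic there.
    print(log_abs(0.1), circle_mean(log_abs, 0.1, 0.3))
    # Disk |z - 0.3| < 0.3 contains the zero 1/2: strict sub-mean value inequality.
    print(log_abs(0.3), circle_mean(log_abs, 0.3, 0.3))
    # |F|^(1/2) at the zero 1/2 and at a regular point.
    print(abs_power(0.5), circle_mean(abs_power, 0.5, 0.3))
    print(abs_power(0.1), circle_mean(abs_power, 0.1, 0.3))
```

The radii are chosen so that no circle passes through a zero of $F$, which keeps the integrands bounded on the grid.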
The maximum principle holds for the "extended" notion of a subharmonic function as well.

Proposition 5.16 Let f ∈ C(Ω̄) be subharmonic and u ∈ C(Ω̄) be harmonic. If f ≤ u
on the boundary ∂Ω then f ≤ u in Ω.
The proof is verbatim the same as before so we leave it to the reader.
The next proposition allows us to pass from the “local” sub-mean value property to
all balls contained in the region where the function f is sub-harmonic.
Proposition 5.17 Let $f$ be subharmonic in a domain $\Omega$ and let $B(z, r) \subset \Omega$, then
\[ f(z) \le \int_0^1 f(z + r e^{2\pi i \theta}) \, d\theta. \]
Proof. Let $n \ge 1$ and set $g_n = \max(f, -n)$; the function $g_n$ is also subharmonic. Next, consider the function $u_n$ which is harmonic in $B(z, r)$ and coincides with $g_n$ on the circle $\partial B(z, r)$. The maximum principle and the definition of $g_n$ imply that
\[ f(z) \le g_n(z) \le u_n(z) = \int_0^1 g_n(z + r e^{2\pi i \theta}) \, d\theta. \]
Letting $n \to \infty$, the monotone convergence theorem now implies that
\[ f(z) \le \int_0^1 f(z + r e^{2\pi i \theta}) \, d\theta, \]
and the proof is complete. $\Box$


We now define the radial maximal function that will play a crucial role in the rest of this section.
Definition 5.18 Let $F(z)$ be a function defined on $\mathbb{D}$, then the radial maximal function $F^* : \mathbb{T} \to \mathbb{R}$ is defined as
\[ F^*(\theta) = \sup_{0 < r < 1} |F(r e^{2\pi i \theta})|. \]
Recall that we have already shown (see Lemma 5.11) that if $u \in h^1(\mathbb{D})$ is harmonic, and $\mu$ is its boundary measure, $u(r e^{2\pi i \theta}) = (P_r * \mu)(\theta)$, then, for all $0 < r < 1$, we have
\[ |u(r e^{2\pi i \theta})| \le C M\mu(\theta), \]
that is, $u^*(\theta) \le C M\mu(\theta)$. Here $M\mu$ is the Hardy-Littlewood maximal function of the measure $\mu$. The same result holds for subharmonic functions.
Proposition 5.19 Let $g$ be subharmonic in $\mathbb{D}$, $g \ge 0$, and
\[ |||g|||_1 := \sup_{0 < r < 1} \int_0^1 g(r e^{2\pi i \theta}) \, d\theta < +\infty. \tag{5.12} \]
Then, (i) for all $\lambda > 0$ we have
\[ |\{\theta \in \mathbb{T} : g^*(\theta) > \lambda\}| \le \frac{C}{\lambda} |||g|||_1, \tag{5.13} \]
(ii) if for some $1 < p \le \infty$ we have
\[ |||g|||_p := \sup_{0 < r < 1} \left( \int_0^1 g(r e^{2\pi i \theta})^p \, d\theta \right)^{1/p} < +\infty, \]
then $\|g^*\|_{L^p(\mathbb{T})} \le C_p |||g|||_p$.


Proof. (i) The uniform bound (5.12) implies that there exists a sequence $r_n \to 1$ so that $g_{r_n}$ converges weak-$*$ to a measure $\mu \in M(\mathbb{T})$, and
\[ \|\mu\| \le |||g|||_1. \tag{5.14} \]
Moreover, for every $0 < s < 1$ we have
\[ g_s(\theta) = \lim_{n \to +\infty} g_{r_n s}(\theta), \]
while (as $g$ is subharmonic)
\[ g_{r_n s}(\theta) \le (P_s * g_{r_n})(\theta) \to (P_s * \mu)(\theta). \]
We conclude that $g_s \le P_s * \mu$. Now, Lemma 5.11 shows that
\[ g_s(\theta) \le C M\mu(\theta), \tag{5.15} \]
and (5.13) follows from the weak $L^1$-bound for the maximal function in Proposition 5.9, as well as (5.14).
(ii) If, in addition, $|||g|||_p < +\infty$ for some $1 < p \le +\infty$, then $\mu$ has a density $f \in L^p(\mathbb{T})$: $d\mu = f \, d\theta$, and $\|f\|_{L^p(\mathbb{T})} \le |||g|||_p$. Then the $L^p$-bound on the maximal function in Proposition 5.9, together with (5.15), implies that
\[ \|g^*\|_{L^p(\mathbb{T})} \le C \|f\|_{L^p(\mathbb{T})} \le C |||g|||_p, \]
completing the proof.


This gives us the following first version of the F. and M. Riesz theorem (the result is not true without the analyticity assumption).
Proposition 5.20 Let $F \in h^1(\mathbb{D})$ be analytic, then $F^* \in L^1(\mathbb{T})$.
Proof. As $F$ is analytic, it follows from part (iv) of Proposition 5.15 that $|F|^{1/2}$ is a subharmonic function. As $F \in h^1(\mathbb{D})$, we have $||| \, |F|^{1/2} \, |||_2 < +\infty$, from which we conclude, using Proposition 5.19, that $(|F|^{1/2})^* \in L^2(\mathbb{T})$. But we have, obviously, $(|F|^{1/2})^* = (F^*)^{1/2}$, hence $F^* \in L^1(\mathbb{T})$.
Next, recall that any function $F \in h^1(\mathbb{D})$ has the form $F_r = P_r * \mu$, where $\mu$ is a measure on $\mathbb{T}$. It has the Lebesgue decomposition $\mu = \mu_{ac} + \mu_s$, where $\mu_{ac}$ is absolutely continuous with respect to the Lebesgue measure $d\theta$, and $\mu_s$ and $d\theta$ are mutually singular. The measure $\mu_{ac}$ has the form $f(\theta) \, d\theta$ with $f \in L^1(\mathbb{T})$. A useful exercise is to show that $P_r * \mu \to f$ a.e., or, equivalently, $P_r * \mu_s \to 0$ a.e. In other words, $F(r e^{2\pi i \theta})$ has a pointwise limit as $r \to 1$ for a.e. $\theta \in \mathbb{T}$. The reason is that if $\mu_s$ is singular with respect to the Lebesgue measure, then for a.e. $\theta \in \mathbb{T}$ we have
\[ \frac{1}{2\varepsilon} \mu_s([\theta - \varepsilon, \theta + \varepsilon]) \to 0 \quad \text{as } \varepsilon \to 0. \]
In particular, as we have already mentioned, the Poisson kernel itself satisfies $P_r(\theta) \to 0$ as $r \to 1$ a.e. The next result shows that "for analytic functions this cannot happen".
Proposition 5.21 Assume F ∈ h1 (D) and F is analytic, and let f (θ) = limr→1 F (re2πiθ ),
then Fr = Pr ∗ f for all 0 < r < 1.
Proof. We have Fr → f for a.e. θ ∈ T, and |Fr | ≤ F ∗ ∈ L1 (T). The Lebesgue
dominated convergence theorem implies that Fr → f in L1 (T). Now, Theorem 5.6 (part
(1)) implies that Fr = Pr ∗ f .
Finally, we can prove Theorem 5.13: if a measure µ on the torus satisfies µ̂(n) = 0 for
all n < 0, then µ is absolutely continuous with respect to the Lebesgue measure on T.
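Proof. (Of Theorem 5.13). Assume that
\[ \hat\mu(n) = 0 \quad \text{for all } n < 0, \tag{5.16} \]
and set
\[ F(r e^{2\pi i \theta}) = \sum_{n=0}^{\infty} r^n \hat\mu(n) e^{2\pi i n \theta} = \sum_{n \in \mathbb{Z}} r^{|n|} \hat\mu(n) e^{2\pi i n \theta} = (P_r * \mu)(\theta). \]
Note that $|\hat\mu(n)| \le \|\mu\|$, so the above series converges for all $0 \le r < 1$; because of (5.16) it is a power series in $z = r e^{2\pi i \theta}$, so the function $F$ is analytic in $\mathbb{D}$. Proposition 5.21 implies that $F_r$ has an $L^1$-limit $f$ as $r \to 1$, and $d\mu = f \, d\theta$.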
The last theorem by Riesz brothers is as follows.
Theorem 5.22 Let $F$ be analytic in $\mathbb{D}$ and $L^1$-bounded, that is, $F \in h^1(\mathbb{D})$. Assume that $F \not\equiv 0$ and set $f = \lim_{r \to 1} F_r$, then $f$ cannot vanish on a set of positive measure.
Proof. The idea is to show that $\log |f| \in L^1(\mathbb{T})$, so that, in particular, $\log |f|$ is finite a.e. First, note that $\log^+ |f| \le |f| \in L^1(\mathbb{T})$ by Proposition 5.20. Furthermore, if $F(0) \ne 0$, then, as $\log |F|$ is subharmonic, we have
\[ -\infty < \log |F(0)| \le \int_{\mathbb{T}} \log |F_r(\theta)| \, d\theta \]
for any $0 < r < 1$. However, Fatou's lemma shows that the last inequality implies
\[ -\infty < \log |F(0)| \le \int_{\mathbb{T}} \log |f(\theta)| \, d\theta, \tag{5.17} \]
after passing to $r \to 1$ and recalling that $F_r(\theta) \to f(\theta)$ a.e., and $\log^+ |f| \in L^1(\mathbb{T})$. Finally, if $F(0) = 0$ and there exists a point $z_0 \in \mathbb{D}$ such that $F(z_0) \ne 0$, we simply consider an automorphism of the unit disk to itself that maps $z_0 \to 0$ and repeat the previous argument.

6 The Fourier transform and holomorphic functions
We now turn to further connections between the analytic functions and the Fourier
transform. We will be following the Stein-Shakarchi book.

6.1 The Fourier transform of moderately decaying functions


The Fourier transform of a function $f(x)$ defined on $\mathbb{R}$ is
\[
\hat f(\xi) = \int_{\mathbb{R}} f(x)e^{-2\pi i x\xi}\,dx, \tag{6.1}
\]

defined for $\xi \in \mathbb{R}$. A convenient class of functions relating Fourier analysis and complex analysis is the following: given $a > 0$ we denote by $F_a$ the class of all functions that satisfy two conditions: (i) $f$ is holomorphic in the strip

Sa = {z ∈ C : |Im(z)| < a},

and (ii) there exists a constant A > 0 (that depends on f ) so that

\[
|f(x+iy)| \le \frac{A}{1+x^2}, \quad \text{for all } x \in \mathbb{R} \text{ and } |y| < a.
\]
For example, the Gaussian $f(z) = e^{-\pi z^2}$ belongs to $F_a$ for all $a > 0$. On the other hand, the function
\[
f(z) = \frac{1}{z^2+c^2},
\]
with c > 0 belongs to Fa only for 0 < a < c. We will also denote by F the class of
functions that belong to Fa for some a > 0.
The first result relates the exponential decay of the Fourier transform and the mod-
erate decay of the function itself.

Theorem 6.1 Let $f \in F_a$ for some $a > 0$, then for any $0 \le b < a$ there exists $B$ so that $|\hat f(\xi)| \le Be^{-2\pi b|\xi|}$.

Proof. The case b = 0 is simple: if f ∈ Fa for some a > 0 then its restriction to the
real axis satisfies $f \in L^1(\mathbb{R})$, which means that $|\hat f(\xi)| \le \|f\|_1$.
When $0 < b < a$ let us first assume that $\xi > 0$ and consider the function $g(z) = f(z)e^{-2\pi i\xi z}$ (with $\xi$ fixed) and integrate it over the boundary of the rectangle that is formed by the points $(-R, 0)$, $(R, 0)$, $(R, -b)$, and $(-R, -b)$ (connected in that order). The integral over the vertical sides goes to zero as $R \to +\infty$ because of the uniform decay of $f$:
\[
\left|\int_{-b}^{0} f(-R+iy)e^{-2\pi i\xi(-R+iy)}\,dy\right| \le \frac{bA}{1+R^2} \to 0 \quad \text{as } R \to +\infty,
\]

with a similar estimate over the interval $[R - ib, R]$. The Cauchy theorem implies, therefore, that the integrals over the two horizontal sides are equal in the limit $R \to +\infty$:
\[
\int_{-\infty}^{\infty} f(x)e^{-2\pi i\xi x}\,dx = \int_{-\infty}^{\infty} f(x-ib)e^{-2\pi i\xi(x-ib)}\,dx,
\]
or
\[
\hat f(\xi) = \int_{-\infty}^{\infty} f(x-ib)e^{-2\pi i\xi(x-ib)}\,dx. \tag{6.2}
\]

This expression (interesting in itself) leads to the estimate
\[
|\hat f(\xi)| \le \int_{-\infty}^{\infty} \frac{A}{1+x^2}\,e^{-2\pi\xi b}\,dx \le Be^{-2\pi b\xi},
\]

as claimed. The proof for ξ < 0 is identical except the real line is shifted up by b.
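The decay rate in Theorem 6.1 can be checked numerically on the example $f(z) = 1/(z^2+c^2)$ from above. The following is a minimal sketch, not part of the original notes: it assumes $c = 1$, uses a plain trapezoidal rule on a truncated interval (the names `fourier_transform`, `lorentzian` and `exact_transform` are illustrative), and compares the result with the closed form $\hat f(\xi) = (\pi/c)e^{-2\pi c|\xi|}$ obtained from the residue at the pole.

```python
import math


def fourier_transform(f, xi, half_width=200.0, step=0.01):
    """Trapezoidal approximation of the integral of f(x) * exp(-2*pi*i*x*xi)
    over [-half_width, half_width]."""
    n = int(round(2 * half_width / step))
    total = 0j
    for k in range(n + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        phase = complex(math.cos(2 * math.pi * x * xi),
                        -math.sin(2 * math.pi * x * xi))
        total += weight * f(x) * phase
    return total * step


C_POLE = 1.0  # assumed value of c; f has poles at +i*c and -i*c


def lorentzian(x):
    """f(x) = 1 / (x^2 + c^2), which belongs to F_a exactly for 0 < a < c."""
    return 1.0 / (x * x + C_POLE * C_POLE)


def exact_transform(xi):
    """Closed form (pi / c) * exp(-2*pi*c*|xi|) of the transform of f."""
    return (math.pi / C_POLE) * math.exp(-2 * math.pi * C_POLE * abs(xi))


if __name__ == "__main__":
    for xi in (0.25, 0.5, 1.0):
        print(xi, abs(fourier_transform(lorentzian, xi)), exact_transform(xi))
```

The closed form decays exactly like $e^{-2\pi c|\xi|}$, so the bound $Be^{-2\pi b|\xi|}$ holds for every $b < c$, in agreement with the theorem.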
The above theorem relates the decay of the Fourier transform to the possibility of extending $f$ as an analytic function to a wide strip. The ultimate step in this direction is to ask for which functions $f$ the transform $\hat f$ has compact support (this is the “ultimate” decay), and this is what we will soon study.
First, we establish the Fourier inversion formula using the complex analysis tools.
Theorem 6.2 If $f \in F$ then the Fourier inversion formula holds:
\[
f(x) = \int_{-\infty}^{\infty} \hat f(\xi)e^{2\pi i x\xi}\,d\xi. \tag{6.3}
\]

Proof. As $f \in F$, the Fourier transform $\hat f(\xi)$ is exponentially decaying by Theorem 6.1, so the right side of (6.3) makes sense. As $f \in F$, we can find $a > 0$ so that $f \in F_a$ and choose $b \in (0, a)$. For $\xi > 0$ we will then use expression (6.2):
\[
\hat f(\xi) = \int_{-\infty}^{\infty} f(x-ib)e^{-2\pi i\xi(x-ib)}\,dx, \tag{6.4}
\]

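so that the integral as in (6.3) but over $\xi > 0$ can be written as
\[
\begin{aligned}
\int_0^{\infty} \hat f(\xi)e^{2\pi i x\xi}\,d\xi
&= \int_0^{\infty} d\xi \int_{-\infty}^{\infty} ds\, f(s-ib)e^{-2\pi i\xi(s-ib)}e^{2\pi i x\xi} \\
&= \int_{-\infty}^{\infty} ds\, f(s-ib) \int_0^{\infty} d\xi\, e^{-2\pi i\xi(s-x-ib)}
 = \int_{-\infty}^{\infty} f(s-ib)\,\frac{1}{2\pi i(s-x-ib)}\,ds \\
&= \frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{f(s-ib)}{s-ib-x}\,ds
 = \frac{1}{2\pi i}\int_{L_1} \frac{f(\zeta)\,d\zeta}{\zeta-x}.
\end{aligned}
\]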

Here $L_1$ is the horizontal line $\{\operatorname{Im}\zeta = -b\}$, oriented from the left to the right. A similar computation shows that for $\xi < 0$ we have
\[
\int_{-\infty}^{0} \hat f(\xi)e^{2\pi i x\xi}\,d\xi = \frac{1}{2\pi i}\int_{L_2} \frac{f(\zeta)\,d\zeta}{\zeta-x},
\]

where $L_2$ is the horizontal line $\{\operatorname{Im}\zeta = b\}$ oriented from the right to the left. Let us now consider the rectangular contour $\Gamma_R$ that connects counterclockwise the vertices $-R+ib$, $-R-ib$, $R-ib$ and $R+ib$. The Cauchy integral formula implies that, for $|x| < R$,
\[
f(x) = \frac{1}{2\pi i}\int_{\Gamma_R} \frac{f(\zeta)\,d\zeta}{\zeta-x}. \tag{6.5}
\]

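On the other hand, as in the proof of Theorem 6.1, the integral over the vertical sides vanishes as $R \to +\infty$, and the integral over the horizontal lines becomes the sum of the integrals over $L_1$ and $L_2$, which gives, passing to $R \to +\infty$ in (6.5):
\[
\begin{aligned}
f(x) &= \frac{1}{2\pi i}\int_{L_1} \frac{f(\zeta)\,d\zeta}{\zeta-x} + \frac{1}{2\pi i}\int_{L_2} \frac{f(\zeta)\,d\zeta}{\zeta-x}
 = \int_0^{\infty} \hat f(\xi)e^{2\pi i x\xi}\,d\xi + \int_{-\infty}^{0} \hat f(\xi)e^{2\pi i x\xi}\,d\xi \\
&= \int_{-\infty}^{\infty} \hat f(\xi)e^{2\pi i x\xi}\,d\xi,
\end{aligned}
\]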

and the proof is complete.

6.2 The Paley-Wiener theorem


We will need to use the inversion formula for the Fourier transform under slightly different conditions. We say that $f$ is of moderate decrease if $f$ and $\hat f$ satisfy
\[
|f(x)| \le \frac{A}{1+|x|^2}, \qquad |\hat f(\xi)| \le \frac{B}{1+|\xi|^2}. \tag{6.6}
\]
Theorem 6.3 Assume that $f$ and $g$ are of moderate decrease and continuous, then
\[
\int_{\mathbb{R}^n} f(x)\hat g(x)\,dx = \int_{\mathbb{R}^n} \hat f(x)g(x)\,dx, \tag{6.7}
\]
and
\[
f(x) = \int \hat f(\xi)e^{2\pi i x\xi}\,d\xi. \tag{6.8}
\]

Proof. We begin with a lemma that is one of the cornerstones of probability theory.

Lemma 6.4 Let $f(x) = e^{-\pi|x|^2}$, then $\hat f(x) = f(x)$.

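Proof. The proof is a glimpse of how useful the Fourier transform is for differential equations and vice versa. We treat the one-dimensional case; in higher dimensions the Gaussian factors into a product of one-dimensional Gaussians. The function $f(x)$ satisfies the ordinary differential equation
\[
u' + 2\pi x u = 0, \tag{6.9}
\]
with the boundary condition $u(0) = 1$. However, (6.9), together with the formulas for the Fourier transforms of the derivative $f'$ and of $xf$:
\[
\widehat{f'}(\xi) = 2\pi i\xi \hat f(\xi), \qquad \widehat{(-2\pi i x f)}(\xi) = \frac{d\hat f}{d\xi}(\xi),
\]
implies that $\hat f$ satisfies the same differential equation (6.9), with the same boundary condition $\hat f(0) = \int e^{-\pi x^2}\,dx = 1$. It follows that $f(x) = \hat f(x)$ for all $x \in \mathbb{R}$. □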
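Lemma 6.4 is easy to test numerically in one dimension. The sketch below is an illustration added here, not taken from the notes: it approximates the transform of the Gaussian with the trapezoidal rule on a truncated interval, and it computes only the cosine part of the integral since the sine part vanishes by symmetry. The names `gaussian` and `gaussian_transform` are hypothetical.

```python
import math


def gaussian(x):
    """f(x) = exp(-pi * x^2)."""
    return math.exp(-math.pi * x * x)


def gaussian_transform(xi, half_width=8.0, step=0.001):
    """Trapezoidal approximation of the integral of
    exp(-pi*x^2) * cos(2*pi*x*xi) over [-half_width, half_width].
    The sine part of exp(-2*pi*i*x*xi) integrates to zero because f is even."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for k in range(n + 1):
        x = -half_width + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * gaussian(x) * math.cos(2 * math.pi * x * xi)
    return total * step


if __name__ == "__main__":
    for xi in (0.0, 0.5, 1.0, 2.0):
        print(xi, gaussian_transform(xi), gaussian(xi))
```

The two printed columns should agree to many digits, which also confirms the normalization $\hat f(0) = 1$ used in the proof.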
We continue with the proof of Theorem 6.3. The Parseval identity (6.7) can be verified directly using Fubini's theorem:
\[
\int f(x)\hat g(x)\,dx = \iint f(x)g(\xi)e^{-2\pi i\xi\cdot x}\,dx\,d\xi = \int \hat f(\xi)g(\xi)\,d\xi,
\]

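as $f$, $\hat f$, $g$ and $\hat g$ are all integrable. Finally, we prove the inversion formula using a rescaling argument. For any $\lambda > 0$ we have
\[
\int_{\mathbb{R}^n} f(x)\hat g(\lambda x)\,dx = \iint_{\mathbb{R}^{2n}} f(x)g(\xi)e^{-2\pi i\lambda\xi\cdot x}\,dx\,d\xi = \int_{\mathbb{R}^n} \hat f(\lambda\xi)g(\xi)\,d\xi = \frac{1}{\lambda^n}\int_{\mathbb{R}^n} \hat f(\xi)\,g\Big(\frac{\xi}{\lambda}\Big)\,d\xi.
\]
Multiplying by $\lambda^n$ and changing variables on the left side we obtain
\[
\int f\Big(\frac{x}{\lambda}\Big)\hat g(x)\,dx = \int \hat f(\xi)\,g\Big(\frac{\xi}{\lambda}\Big)\,d\xi.
\]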
Letting now $\lambda \to \infty$ and using the Lebesgue dominated convergence theorem gives
\[
f(0)\int \hat g(x)\,dx = g(0)\int \hat f(\xi)\,d\xi, \tag{6.10}
\]
for all continuous functions $f$ and $g$ of moderate decrease. Taking $g(x) = e^{-\pi|x|^2}$ in (6.10) and using Lemma 6.4 leads to
\[
f(0) = \int \hat f(\xi)\,d\xi. \tag{6.11}
\]

The inversion formula (6.8) now follows if we apply (6.11) to a shifted function $f_y(x) = f(x+y)$, because
\[
\hat f_y(\xi) = \int f(x+y)e^{-2\pi i\xi\cdot x}\,dx = e^{2\pi i\xi\cdot y}\hat f(\xi),
\]
so that
\[
f(y) = f_y(0) = \int \hat f_y(\xi)\,d\xi = \int e^{2\pi i\xi\cdot y}\hat f(\xi)\,d\xi,
\]
which is (6.8). □
Theorem 6.1 has the following “partial converse”.
Theorem 6.5 Assume that $\hat f(\xi)$ satisfies $|\hat f(\xi)| \le Ae^{-2\pi a|\xi|}$ for some $a, A > 0$. Then $f(x)$ is the restriction to the real axis of a function holomorphic in the strip $S_a = \{|\operatorname{Im} z| < a\}$.
Proof. The function $f$ can be extended to any sub-strip $S_b$, $0 < b < a$, as
\[
f(z) = \int_{-\infty}^{\infty} \hat f(\xi)e^{2\pi i\xi z}\,d\xi.
\]

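The integral converges absolutely for any $z \in S_a$:
\[
|f(z)| \le \int_{-\infty}^{\infty} |\hat f(\xi)|e^{2\pi|\xi||\operatorname{Im} z|}\,d\xi \le A\int_{-\infty}^{\infty} e^{-2\pi(a-|\operatorname{Im} z|)|\xi|}\,d\xi.
\]
In addition, the functions
\[
f_n(z) = \int_{-n}^{n} \hat f(\xi)e^{2\pi i\xi z}\,d\xi
\]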

are entire simply because they can be differentiated. Finally, for any $z \in S_a$ we have
\[
|f(z) - f_n(z)| \le A\int_{|\xi|>n} e^{-2\pi(a-|\operatorname{Im} z|)|\xi|}\,d\xi \to 0,
\]
as $n \to +\infty$, uniformly in any sub-strip $\{|\operatorname{Im} z| < b\}$ with $b \in (0, a)$. It follows that $f_n(z)$ converges uniformly to $f(z)$ in any such sub-strip, hence, as $f_n(z)$ are holomorphic, so is $f(z)$.

Corollary 6.6 If $|\hat f(\xi)| \le Ae^{-2\pi a|\xi|}$ for some $a > 0$, and $f$ vanishes on a non-empty open interval then $f \equiv 0$.

Proof. Any such $f$ is a restriction of a holomorphic function, whence the uniqueness theorem implies the claim.
As a particularly important example, it follows that it is impossible that both $f$ and $\hat f$ are compactly supported, unless $f \equiv 0$. Hence, it is natural to ask for which functions it is possible to have compactly supported Fourier transform.
Theorem 6.7 Assume that $f$ is continuous and has moderate decrease. Then $\hat f$ is supported inside an interval $-M \le \xi \le M$ if and only if $f$ is a restriction to the real line of an entire function that satisfies $|f(z)| \le Ae^{2\pi M|z|}$ for some $A > 0$.
Proof. First, if $\hat f$ is supported in $[-M, M]$ then both $f$ and $\hat f$ have moderate decrease, thus
\[
f(x) = \int_{-M}^{M} \hat f(\xi)e^{2\pi i\xi x}\,d\xi.
\]
Therefore, $f$ can be extended to the whole complex plane by setting
\[
f(z) = \int_{-M}^{M} \hat f(\xi)e^{2\pi i\xi z}\,d\xi.
\]
This function is entire and
\[
|f(z)| \le \int_{-M}^{M} |\hat f(\xi)|e^{2\pi M|\operatorname{Im} z|}\,d\xi \le Ce^{2\pi M|z|}.
\]

The other direction is much less trivial. We will assume that f is an entire function, and
make progressively weaker assumptions on f , eventually getting to |f (z)| ≤ Ae2πM |z| ,
and show that each one guarantees that fˆ(ξ) = 0 for |ξ| ≥ M .

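First, assume that
\[
|f(x+iy)| \le \frac{Ae^{2\pi M|y|}}{1+x^2}. \tag{6.12}
\]
Let $\xi > M$, then, using (6.4) we get, for any $y > 0$:
\[
\hat f(\xi) = \int_{-\infty}^{\infty} f(x)e^{-2\pi i\xi x}\,dx = \int_{-\infty}^{\infty} f(x-iy)e^{-2\pi i\xi(x-iy)}\,dx.
\]
It follows that
\[
|\hat f(\xi)| \le A\int_{-\infty}^{\infty} \frac{e^{2\pi My-2\pi\xi y}}{1+x^2}\,dx \le Ce^{2\pi My-2\pi\xi y}.
\]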
As ξ > M , the right side above can be made arbitrarily small by taking y large. We
conclude that fˆ(ξ) = 0. A nearly identical argument proves this for ξ < −M .
Now, instead of (6.12) assume that
\[
|f(x+iy)| \le Ae^{2\pi M|y|}, \tag{6.13}
\]
which is still stronger than $|f(z)| \le Ae^{2\pi M|z|}$, which is what we ultimately need. However, assumption (6.13) can be reduced to (6.12) as follows. Take $\xi > M$, and set
\[
f_\varepsilon(z) = \frac{f(z)}{(1+i\varepsilon z)^2},
\]

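with $\varepsilon \in (0, 1)$ small. The function $f_\varepsilon(z)$ satisfies
\[
|f_\varepsilon(x+iy)| = \frac{|f(x+iy)|}{(1-\varepsilon y)^2+\varepsilon^2x^2} \le \frac{A_\varepsilon e^{2\pi M|y|}}{1+x^2}, \quad \text{for } y \le 0, \tag{6.14}
\]
with $A_\varepsilon = A/\varepsilon^2$. That is, $f_\varepsilon(z)$ satisfies (6.12) in the lower half-plane. Note that
\[
|\hat f_\varepsilon(\xi) - \hat f(\xi)| \le \int_{-\infty}^{\infty} |f_\varepsilon(x) - f(x)|\,dx \le \int_{-\infty}^{\infty} |f(x)|\left|\frac{1}{(1+i\varepsilon x)^2} - 1\right|dx \to 0,
\]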

as $\varepsilon \to 0$ by the Lebesgue dominated convergence theorem, as $f$ is of moderate decrease and $|1 + i\varepsilon x| \ge 1$ for all $x \in \mathbb{R}$. Observe that in showing that (6.12) implies $\hat f(\xi) = 0$ for $\xi > M$ we only used (6.12) in the lower half plane. Therefore, as $f_\varepsilon(z)$ satisfies (6.14), we conclude that $\hat f_\varepsilon(\xi) = 0$ for $\xi > M$ and all $\varepsilon \in (0, 1)$. It follows that $\hat f(\xi) = 0$ also. The argument for $\xi < -M$ is similar except the factor $1/(1 + i\varepsilon z)^2$ in the definition of $f_\varepsilon$ should be replaced by $1/(1 - i\varepsilon z)^2$.
The last step is to show that condition (6.13) holds under the assumptions of the theorem. More precisely, we will show that if $|f(x)| \le 1$ for all $x \in \mathbb{R}$ and $|f(z)| \le e^{2\pi M|z|}$ for all $z \in \mathbb{C}$, then
\[
|f(x+iy)| \le e^{2\pi M|y|}. \tag{6.15}
\]
This will follow from the following lemma (the Phragmén-Lindelöf theorem), which is another version of the three lines theorem that we have seen before.

Lemma 6.8 Let F be a function holomorphic in the sector

S = {z : − π/4 < arg z < π/4},

and continuous up to the boundary of $S$. Assume that $|F(z)| \le 1$ on the boundary of $S$, and that $|F(z)| \le Ae^{c|z|}$ for all $z \in S$, with some constants $A, c > 0$. Then $|F(z)| \le 1$ for all $z \in S$.

Note that some restriction on the growth of $F(z)$ is necessary – for example, the function $F(z) = e^{z^2}$ satisfies
\[
|F(x \pm ix)| = |e^{\pm 2ix^2}| = 1
\]
on the boundary of $S$, but is unbounded on the real axis.
Let us first finish the proof of the Paley-Wiener theorem, and then return to the proof of Lemma 6.8. We need to show that the conditions $|f(x)| \le 1$ for all $x \in \mathbb{R}$ and
\[
|f(x+iy)| \le e^{2\pi M|x+iy|}
\]
imply that
\[
|f(x+iy)| \le e^{2\pi M|y|}.
\]
Of course, the Phragmén-Lindelöf principle applies to any quadrant, not just the one specified in Lemma 6.8. We will apply it to the first quadrant $Q = \{x > 0, y > 0\}$. On its boundary we have $|f(x)| \le 1$ and $|f(iy)| \le e^{2\pi M|y|}$. In order to remove the growth on the $y$-axis we set
\[
F(z) = f(z)e^{2\pi iMz}.
\]
Then we have on the boundary of $Q$: $|F(x)| = |f(x)| \le 1$, as well as
\[
|F(iy)| = |f(iy)|e^{-2\pi My} \le 1, \quad \text{for any } y > 0.
\]
We also have the growth condition inside $Q$:
\[
|F(z)| = |f(x+iy)|e^{-2\pi My} \le e^{2\pi M|z|}.
\]
We conclude from Lemma 6.8 that $|F(z)| \le 1$ everywhere in $Q$, which means
\[
|f(x+iy)| \le |e^{-2\pi iM(x+iy)}| = e^{2\pi My}.
\]


The argument for the other three quadrants is very similar. This finishes the proof of
the Paley-Wiener theorem except for the proof of Lemma 6.8.
Proof. (Of Lemma 6.8). The proof is very similar to that of the three lines theorem. Consider the function
\[
F_\varepsilon(z) = F(z)e^{-\varepsilon z^{3/2}}.
\]
Note that the function $z^{3/2}$ is holomorphic in $S$: it is given by (for $z = re^{i\theta}$, $-\pi < \theta < \pi$)
\[
z^{3/2} = r^{3/2}e^{3i\theta/2}.
\]

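The modulus of the exponential factor is
\[
\left|e^{-\varepsilon z^{3/2}}\right| = e^{-\varepsilon r^{3/2}\cos(3\theta/2)}.
\]
The key point is that if $-\pi/4 < \theta < \pi/4$ then
\[
-\frac{\pi}{2} < -\frac{3\pi}{8} < \frac{3\theta}{2} < \frac{3\pi}{8} < \frac{\pi}{2},
\]
which means that $\cos(3\theta/2) \ge \alpha_0 > 0$ in this sector, and the exponential factor decays as $|z| \to +\infty$ at least as $e^{-\varepsilon\alpha_0|z|^{3/2}}$. As a consequence, $F_\varepsilon(z)$ is bounded in $S$ and tends to zero as $|z| \to +\infty$, as the growth of $F(z)$ at infinity is at most $e^{c|z|}$. Moreover, as $|F_\varepsilon(z)| \le |F(z)|$, we have $|F_\varepsilon(z)| \le 1$ on the boundary of $S$. Now, as in the proof of the three lines theorem we may conclude that $|F_\varepsilon(z)| \le 1$: take $R$ sufficiently large so that $|F_\varepsilon(z)| \le 1$ for all $z \in S$ with $|z| > R$. Then on the boundary of the region $S_R = \{z \in S : |z| \le R\}$ we have $|F_\varepsilon(z)| \le 1$. The maximum principle implies that $|F_\varepsilon(z)| \le 1$ in all of $S_R$. Therefore, $F(z)$ satisfies
\[
|F(z)| = \left|F_\varepsilon(z)e^{\varepsilon z^{3/2}}\right| \le e^{\varepsilon|z|^{3/2}},
\]
and letting $\varepsilon \to 0$ we conclude that $|F(z)| \le 1$ for all $z \in S$.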

6.3 The du Bois-Reymond example


We will now give an example of a continuous periodic function whose Fourier series diverges at the point $x = 0$ (of course, we can move this point to an arbitrary point on the circle); this material is taken from M. Pinsky's book. We will look for $f(x)$ in the form
\[
f(x) = \sum_{k=1}^{\infty} e^{2\pi iN_kx}\,\frac{B_k(x)}{k^2}. \tag{6.16}
\]
The coefficients $B_k(x)$ themselves will have the form
\[
B_k(x) = \sum_{j=-m_k}^{m_k} a_je^{2\pi ijx}.
\]

The main point here is to choose appropriately the integers $N_k$ and $m_k$, as well as the coefficients $a_j$. To be very concrete we will take $a_k$ to be the Fourier coefficients of the function $u(x) = 1/2 - x$, $0 < x < 1$, extended periodically (which leads to a discontinuous function at $x = 0$):
\[
a_k = \int_0^1 \Big(\frac12 - x\Big)e^{-2\pi ikx}\,dx = \left.\frac{xe^{-2\pi ikx}}{2\pi ik}\right|_{x=0}^{x=1} = \frac{1}{2\pi ik}, \quad k \ne 0,
\]
and $a_0 = 0$.
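The value of these coefficients can be confirmed by direct quadrature. The following sketch is only an illustration (the function name `sawtooth_coefficient` and the midpoint rule are choices made here, not part of the notes); it approximates the integral defining $a_k$ and can be compared with $1/(2\pi ik)$.

```python
import cmath
import math


def sawtooth_coefficient(k, samples=20000):
    """Midpoint-rule approximation of the integral of
    (1/2 - x) * exp(-2*pi*i*k*x) over [0, 1]."""
    h = 1.0 / samples
    total = 0j
    for idx in range(samples):
        x = (idx + 0.5) * h
        total += (0.5 - x) * cmath.exp(-2j * math.pi * k * x)
    return total * h


if __name__ == "__main__":
    for k in (1, 2, -3, 10):
        print(k, sawtooth_coefficient(k), 1 / (2j * math.pi * k))
```

For $k = 0$ the same routine returns a value close to zero, consistent with the mean of $u$ being zero.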

Note that the partial sums
\[
\sum_{k=-N}^{N} a_ke^{2\pi ikx}
\]

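are uniformly bounded in $x$ for all $N$. This is seen as follows:
\[
\begin{aligned}
\sum_{k=-N}^{N} a_ke^{2\pi ikx}
&= \int_0^1 \Big(\frac12 - y\Big)\frac{\sin[(2N+1)\pi(x-y)]}{\sin[\pi(x-y)]}\,dy \\
&= \int_0^1 u(x-y)\frac{\sin[(2N+1)\pi y]}{\sin(\pi y)}\,dy \sim \int_0^1 u(x-y)\frac{\sin[(2N+1)\pi y]}{\pi y}\,dy \\
&= \int_0^x \Big(\frac12 - x + y\Big)\frac{\sin[(2N+1)\pi y]}{\pi y}\,dy + \int_x^1 \Big(y - x - \frac12\Big)\frac{\sin[(2N+1)\pi y]}{\pi y}\,dy \le M,
\end{aligned}
\]
as there exists a constant $C$ such that for any $(a, b)$ we have
\[
\left|\int_a^b \frac{\sin y}{y}\,dy\right| \le C.
\]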

Therefore, $|B_k(x)| \le M$, and the series (6.16) converges uniformly, hence the sum $f(x)$ is a continuous function. Let us choose $N_k$ and $m_k$ so that
\[
N_{k+1} - m_{k+1} > N_k + m_k.
\]
This implies that the frequencies coming from the term $e^{2\pi iN_kx}B_k(x)$ and those coming from $e^{2\pi iN_{k+1}x}B_{k+1}(x)$ do not overlap. This allows us to compute the Fourier coefficients of $f$:
\[
f_n = \int_0^1 f(x)e^{-2\pi inx}\,dx = \sum_{k=1}^{\infty} \frac{1}{k^2}\int_0^1 B_k(x)e^{2\pi iN_kx}e^{-2\pi inx}\,dx.
\]

As all the terms $e^{2\pi iN_kx}B_k(x)$ involve different frequencies, all integrals in the right side above will vanish except possibly for one $k$ (if it exists) that satisfies $|n - N_k| \le m_k$. That is, if there exists $k$ such that $N_k - m_k \le n \le N_k + m_k$, then
\[
f_n = \frac{a_{n-N_k}}{k^2},
\]
and otherwise $f_n = 0$. Consider then the partial sum $S_{N_k}f$ of the Fourier series for $f$:
\[
\begin{aligned}
S_{N_k}f(0) = \sum_{n=-N_k}^{N_k} f_n
&= \sum_{j=1}^{k-1}\sum_{n=N_j-m_j}^{N_j+m_j} \frac{a_{n-N_j}}{j^2} + \sum_{n=N_k-m_k}^{N_k} \frac{a_{n-N_k}}{k^2} \\
&= \sum_{j=1}^{k-1} \frac{1}{j^2}\sum_{n=-m_j}^{m_j} a_n + \frac{1}{k^2}\sum_{j=1}^{m_k} a_{-j}
 = \sum_{j=1}^{k-1} \frac{B_j(0)}{j^2} + \frac{1}{k^2}\sum_{j=1}^{m_k} a_{-j}.
\end{aligned}
\]

The first term in the right side above converges as $k \to +\infty$, since $|B_j(0)| \le M$. However, the second term satisfies, by our choice of $a_j$, the lower bound
\[
\left|\frac{1}{k^2}\sum_{j=1}^{m_k} a_{-j}\right| \ge \frac{C\log m_k}{k^2}.
\]


Therefore, in order for the Fourier series of the function $f$ to diverge at $x = 0$ we may choose $m_k$ so that $(\log m_k)/k^2 \to +\infty$, for instance, we may take $m_k = 2^{k^4}$. After choosing $m_k$ we choose $N_k$ so that $N_{k+1} - m_{k+1} > N_k + m_k$. This completes the construction of the du Bois-Reymond example.
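The logarithmic lower bound can be made concrete. Since $a_{-j} = -1/(2\pi ij)$, the modulus of the sum of $a_{-j}$ over $1 \le j \le m$ equals the harmonic number $H_m$ divided by $2\pi$, which is at least $(\log m)/(2\pi)$. The sketch below is an added illustration: it checks this numerically and evaluates the resulting lower bound for one admissible choice, assumed here to be $m_k = 2^{k^4}$. The function names are hypothetical.

```python
import math


def tail_sum_modulus(m):
    """|a_{-1} + ... + a_{-m}| with a_{-n} = 1 / (2*pi*i*(-n)).
    This equals H_m / (2*pi), where H_m is the m-th harmonic number."""
    total = 0j
    for n in range(1, m + 1):
        total += 1 / (2j * math.pi * (-n))
    return abs(total)


def second_term_lower_bound(k):
    """log(m_k) / (2*pi*k^2) for the assumed choice m_k = 2^(k^4).
    log(m_k) is computed as k^4 * log(2) to avoid forming the huge integer."""
    return (k ** 4) * math.log(2) / (2 * math.pi * k ** 2)


if __name__ == "__main__":
    for m in (10, 1000, 100000):
        print(m, tail_sum_modulus(m), math.log(m) / (2 * math.pi))
    for k in (1, 2, 5, 10):
        print(k, second_term_lower_bound(k))
```

With this choice the lower bound grows like $k^2$, so the partial sums $S_{N_k}f(0)$ are unbounded.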

7 Entire functions
This material is also taken from the Stein-Shakarchi book.

7.1 Counting zeros of an entire function


The fundamental theorem of algebra shows that there is a link between the growth of a polynomial and the number of zeros it can have: a polynomial of degree $n$ has $n$ zeros; in particular, it can not have more than $n$ zeros, so growth restricts the possible number of zeros. A convenient tool to count the zeros is given by Jensen's formula. Let $f$ be analytic in a neighborhood of the closed disk $D_R = \{|z| \le R\}$, and let $n(r)$ be the number of zeros $f$ has in the disk $D_r$, $0 < r < R$ (with multiplicities). Our goal is to relate $n(R)$ to the values of $f$ on the circle $\{|z| = R\}$. We assume in this section that $f(0) \ne 0$, and $f$ has no zeros on the circle $\{|z| = R\}$. Let us denote all zeros in $D_R$ by $z_1, \dots, z_N$ and make the following observation:
\[
\int_0^R n(r)\,\frac{dr}{r} = \sum_{k=1}^{N} \log\frac{R}{|z_k|}. \tag{7.1}
\]

To see that (7.1) holds, note that the right side can be written as
\[
\sum_{k=1}^{N} \log\frac{R}{|z_k|} = \sum_{k=1}^{N} \int_{|z_k|}^{R} \frac{dr}{r},
\]
and the right side above equals the left side of (7.1) – it is important that $f(0) \ne 0$, so that $n(r) = 0$ for $r$ sufficiently small.
The next step is to express the right side of (7.1), up to sign, as (this is known as Jensen's formula)
\[
\sum_{k=1}^{N} \log\frac{|z_k|}{R} = \log|f(0)| - \int_0^1 \log|f(Re^{2\pi i\theta})|\,d\theta. \tag{7.2}
\]

This is verified as follows. First, if $f(z)$ has no zeros in $D_R$ then $\log f(z)$ is a holomorphic function, and $\log|f(z)|$ is its real part, hence harmonic. Therefore, (7.2) holds for such $f$ by the mean-value property of harmonic functions. In the general case, we may write $f(z)$ as
\[
f(z) = (z-z_1)(z-z_2)\dots(z-z_N)g(z),
\]
with a holomorphic function $g(z)$ that never vanishes in $D_R$ (and for which (7.2) holds by the above argument). As $\log|z_1z_2| = \log|z_1| + \log|z_2|$, it follows that we only need to establish (7.2) for linear functions $f(z) = z - w$, with $|w| < R$. Then, (7.2) has the form
\[
\log\frac{|w|}{R} = \log|w| - \int_0^1 \log|Re^{2\pi i\theta} - w|\,d\theta, \tag{7.3}
\]
which is equivalent to
\[
\int_0^1 \log|e^{2\pi i\theta} - a|\,d\theta = 0, \tag{7.4}
\]

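for all $a$ with $|a| < 1$ (here $a = w/R$). This may, in turn, be rewritten, using $|e^{2\pi i\theta} - a| = |1 - ae^{-2\pi i\theta}|$ and the change of variables $\theta \to -\theta$, as
\[
\int_0^1 \log|ae^{2\pi i\theta} - 1|\,d\theta = 0, \quad \text{for all } a \in \{|z| < 1\}. \tag{7.5}
\]
The function $p(z) = 1 - az$ does not vanish in the closed unit disk, hence $\log|p(z)|$ is harmonic there, and the left side of (7.5) equals $\log|p(0)| = 0$ by the mean-value property, whence (7.2) holds.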
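Jensen's formula (7.2) can be tested on a polynomial with known zeros. The sketch below is an added illustration under stated assumptions: the zeros $0.5$, $0.3+0.4i$ and $2$ and the radius $R = 1$ are arbitrary choices, and the circle average is approximated by an equally spaced Riemann sum. The names `circle_log_mean` and `poly` are hypothetical.

```python
import cmath
import math


def circle_log_mean(f, radius, samples=4096):
    """Average of log|f| over the circle |z| = radius, using equally spaced
    sample points (the integral over theta in [0, 1] in the notes)."""
    total = 0.0
    for idx in range(samples):
        z = radius * cmath.exp(2j * math.pi * idx / samples)
        total += math.log(abs(f(z)))
    return total / samples


ZEROS = [0.5, 0.3 + 0.4j, 2.0]  # assumed example; two zeros lie inside |z| < 1


def poly(z):
    """Monic polynomial with the zeros listed in ZEROS."""
    value = 1.0
    for root in ZEROS:
        value *= (z - root)
    return value


R = 1.0
inside = [w for w in ZEROS if abs(w) < R]
lhs = sum(math.log(abs(w) / R) for w in inside)
rhs = math.log(abs(poly(0))) - circle_log_mean(poly, R)

if __name__ == "__main__":
    print(lhs, rhs)
```

Both sides should agree up to quadrature error; only the zeros inside the disk contribute to the left side.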
Together, (7.1) and (7.2) imply that if $f(0) \ne 0$, and $f$ has no zeros on the circle $\{|z| = R\}$, then
\[
\int_0^R n(r)\,\frac{dr}{r} = \int_0^1 \log|f(Re^{2\pi i\theta})|\,d\theta - \log|f(0)|. \tag{7.6}
\]
In order to relate the number of zeros to the growth of an entire function at infinity, it is convenient to make the following definition: we say that an entire function has an order of growth $\rho$ if for any $s > \rho$ we can find two constants $A_s$ and $B_s$ so that
\[
|f(z)| \le A_se^{B_s|z|^s}, \quad \text{for all } z \in \mathbb{C}.
\]
Theorem 7.1 Let $f$ be an entire function of an order of growth $\rho$, then for any $s > \rho$ we have $n(r) \le C_s(1+r)^s$, and its zeros $z_k$ (such that $z_k \ne 0$) satisfy
\[
\sum_{k=1}^{\infty} \frac{1}{|z_k|^s} < \infty, \quad \text{for all } s > \rho. \tag{7.7}
\]

Proof. We may assume without loss of generality that $f(0) \ne 0$ – otherwise, we simply consider $g(z) = f(z)/z^m$, where $m$ is the order of vanishing of $f$ at $z = 0$ – this does not change the order of $f$, nor modify the sum in (7.7). If $f(0) \ne 0$, we have
\[
\int_0^R n(r)\,\frac{dr}{r} = \int_0^1 \log|f(Re^{2\pi i\theta})|\,d\theta - \log|f(0)|. \tag{7.8}
\]
It follows, applying (7.8) with $2R$ in place of $R$, that
\[
\int_R^{2R} n(r)\,\frac{dr}{r} \le \int_0^1 \log|f(2Re^{2\pi i\theta})|\,d\theta - \log|f(0)|.
\]
As $n(r)$ is increasing, we have
\[
\int_R^{2R} n(r)\,\frac{dr}{r} \ge n(R)\int_R^{2R} \frac{dr}{r} = n(R)\log 2.
\]
The growth condition on $f$ means that the right side in (7.8) can be bounded, for all $s > \rho$, as
\[
\int_0^1 \log|f(Re^{2\pi i\theta})|\,d\theta \le \int_0^1 \log\left|Ae^{BR^s}\right|d\theta \le C(1+R)^s.
\]
Together, the last three inequalities give the first claim, $n(R) \le C_s(1+R)^s$.
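For the second statement we can write, using the dyadic blocks, for any $s > \rho$ (the finitely many zeros with $|z_k| < 1$ do not affect convergence):
\[
\sum_{|z_k|\ge 1} \frac{1}{|z_k|^s} = \sum_{j=0}^{\infty}\ \sum_{2^j\le|z_k|<2^{j+1}} \frac{1}{|z_k|^s} \le \sum_{j=0}^{\infty} \frac{1}{2^{js}}\,n(2^{j+1}) \le C\sum_{j=0}^{\infty} \frac{(1+2^{j+1})^{s'}}{2^{js}} < +\infty.
\]
Here we have chosen $s' \in (\rho, s)$.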
The simple example $f(z) = \sin z$ shows that the condition $s > \rho$ can not be removed in the theorem. Indeed, we have $|\sin z| \le e^{|z|}$ so $\rho = 1$. However, the zeros are of the form $z = \pi n$, with $n \in \mathbb{Z}$, so that
\[
\sum_{z_j \ne 0} \frac{1}{|z_j|^{\rho}} = +\infty
\]
in this case.

7.2 Entire functions with prescribed zeros


Now, we ask the following question: given a sequence zk ∈ C, can we find an entire
function that has exactly these zeros? If the set {zk } is finite, the answer is simple:

f (z) = (z − z1 )(z − z2 ) . . . (z − zN ).

It turns out that if the set {zk } is infinite then the answer is, basically, the same.
Moreover, the obvious obstruction – if the set {zk } has a limit point, then f can only
be identically equal to zero by the uniqueness theorem, is the only obstruction. In other
words, given any set {zk } with no limit points in C, there exists an entire function that
vanishes exactly at zk .
In order to construct such entire functions, we need to use infinite products. Let us recall the following basic result on infinite products.

Lemma 7.2 If $\sum_{n=1}^{\infty} |a_n| < +\infty$, then the product $\prod_{n=1}^{\infty} (1 + a_n)$ converges. Moreover, the product is non-zero unless $1 + a_n = 0$ for some $n$.
Proof. As the series $\sum a_n$ is absolutely convergent, we may assume without loss of generality that all $|a_n| < 1/2$. Therefore, we may choose one branch of $\log(1 + z)$ that is holomorphic in the disk $\{|z| < 1/2\}$ and write
\[
\prod_{n=1}^{N} (1 + a_n) = \exp\Big(\sum_{n=1}^{N} \log(1 + a_n)\Big).
\]

Note that $|\log(1 + z)| \le 2|z|$ for $|z| < 1/2$. It follows that the series
\[
\sum_{n=1}^{\infty} \log(1 + a_n)
\]
converges to a limit $A$, proving the first assertion. Moreover, the limit of the infinite product is non-zero since it is the exponential of the limit of the series above.
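As a concrete instance of Lemma 7.2, take $a_n = 1/n^2$. The sketch below is an added illustration, not from the notes: it computes partial products directly and via the exponential of the partial sums of logarithms, as in the proof, and compares them with the classical closed form $\sinh(\pi)/\pi$ for this particular product. The helper names are hypothetical.

```python
import math


def partial_product(terms, count):
    """Product of (1 + terms(n)) for n = 1, ..., count."""
    product = 1.0
    for n in range(1, count + 1):
        product *= 1.0 + terms(n)
    return product


def partial_log_sum(terms, count):
    """Sum of log(1 + terms(n)) for n = 1, ..., count, as used in the proof."""
    return sum(math.log1p(terms(n)) for n in range(1, count + 1))


def inverse_square(n):
    """a_n = 1 / n^2, an absolutely summable sequence."""
    return 1.0 / (n * n)


if __name__ == "__main__":
    count = 200000
    print(partial_product(inverse_square, count))
    print(math.exp(partial_log_sum(inverse_square, count)))
    print(math.sinh(math.pi) / math.pi)
```

The partial products approach the limit at rate roughly $1/N$, reflecting the tail of the series $\sum 1/n^2$.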
This result generalizes to the product of holomorphic functions.

Proposition 7.3 Let $F_n$ be a sequence of holomorphic functions in a domain $\Omega$. Assume that there exists a sequence $c_n > 0$ so that
\[
\sum_{n=1}^{\infty} c_n < +\infty,
\]
and $|F_n(z) - 1| \le c_n$ for all $n \ge 1$ and all $z \in \Omega$. Then the product $F(z) = \prod_{n=1}^{\infty} F_n(z)$ converges to a holomorphic function in $\Omega$, and if $F_n(z)$ does not vanish for any $n$ then
\[
\frac{F'(z)}{F(z)} = \sum_{n=1}^{\infty} \frac{F_n'(z)}{F_n(z)}.
\]

The proof uses Lemma 7.2 and the fact that the sum of a uniformly converging series
of holomorphic functions is itself holomorphic.
We now describe Weierstrass’ construction of an entire function with prescribed
zeros.
Theorem 7.4 Given any sequence $a_n \in \mathbb{C}$ with $|a_n| \to \infty$ as $n \to \infty$ there exists an entire function $f(z)$ whose set of zeros coincides with $\{a_n\}$. Any other such function is given by $f(z)e^{g(z)}$ with an entire function $g(z)$.
The second part of the theorem is easy: if f1 and f2 are two entire functions with the
same set of zeros (with multiplicities) then f1 /f2 is an entire function that vanishes
nowhere, hence it must have the form eg(z) with an entire function g(z).
In order to construct the required entire function it is tempting to set $f(z)$ to be the product
\[
\prod_{n=1}^{\infty} \Big(1 - \frac{z}{a_n}\Big).
\]
It is not clear, however, why the product would converge, hence we need to correct this expression. Imagine that we have constructed functions $E_n(z)$ such that the only zero of $E_n(z)$ is $z = 1$, and we have
\[
|1 - E_n(z)| \le \frac{C}{2^{n+1}}, \quad \text{for } |z| < 1/2. \tag{7.9}
\]
Consider then the function
\[
f(z) = z^m \prod_{n=1}^{\infty} E_n\Big(\frac{z}{a_n}\Big),
\]

where $m$ is the (prescribed) order of vanishing at $z = 0$. We claim that $f(z)$ satisfies the requirements of the theorem. First, we have $f(a_n) = 0$. Moreover, for any $R > 0$ the product
\[
\prod_{|a_n|\ge 2R} E_n\Big(\frac{z}{a_n}\Big)
\]
is then a holomorphic function in |z| < R, and does not vanish in that disk. Hence, f
is a holomorphic function in all of C and vanishes only at an .
Thus, we only need to exhibit $E_n(z)$ that vanish only at $z = 1$ and satisfy (7.9). They are defined as:
\[
E_0(z) = 1 - z, \qquad E_k(z) = (1 - z)\exp\Big(z + \frac{z^2}{2} + \dots + \frac{z^k}{k}\Big). \tag{7.10}
\]
In order to verify that (7.9) holds, note that for $|z| < 1/2$ we have
\[
\log E_k(z) = \log(1 - z) + z + \frac{z^2}{2} + \dots + \frac{z^k}{k} = -\sum_{n=k+1}^{\infty} \frac{z^n}{n},
\]
or
\[
|\log E_k(z)| \le |z|^{k+1}\sum_{n=k+1}^{\infty} \frac{|z|^{n-k-1}}{n} \le |z|^{k+1}\sum_{n=0}^{\infty} |z|^n \le C|z|^{k+1} \le \frac{C}{2^{k+1}}.
\]
It follows that
\[
|1 - E_k(z)| = \left|1 - e^{\log E_k(z)}\right| \le C|\log E_k(z)| \le \frac{C}{2^{k+1}},
\]
and the proof of the theorem is complete.
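The canonical factors (7.10) and the estimate on $|1 - E_k(z)|$ are easy to probe numerically. The sketch below is an added illustration: `canonical_factor` is a hypothetical name, and the check uses the constant $C = 1$, which is the value in the standard form of this estimate; the notes themselves only need some constant $C$.

```python
import cmath
import math


def canonical_factor(k, z):
    """E_k(z) = (1 - z) * exp(z + z^2/2 + ... + z^k/k), with E_0(z) = 1 - z."""
    exponent = sum(z ** n / n for n in range(1, k + 1))
    return (1 - z) * cmath.exp(exponent)


def bound_holds(k, z, constant=1.0):
    """Check |1 - E_k(z)| <= constant * |z|^(k+1), with a small rounding slack."""
    return abs(1 - canonical_factor(k, z)) <= constant * abs(z) ** (k + 1) + 1e-12


if __name__ == "__main__":
    for k in range(0, 4):
        z = 0.5 * cmath.exp(2j * math.pi * 0.3)
        print(k, abs(1 - canonical_factor(k, z)), abs(z) ** (k + 1))
```

The printed pairs show how quickly $E_k(z)$ approaches 1 near the origin as $k$ grows, which is what makes the Weierstrass product converge.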

7.3 Hadamard’s factorization theorem


Recall that a function has an order of growth $\rho$ if $\rho$ is the smallest number such that for all $s > \rho$ we have
\[
|f(z)| \le A_se^{B_s|z|^s}.
\]
We have already proved that if $f$ has growth of order $\rho$ then for any $s > \rho$ we have $n(r) \le C_s(1 + r)^s$, and its zeros $a_1, \dots, a_n, \dots$ satisfy
\[
\sum_{n=1}^{\infty} \frac{1}{|a_n|^s} < +\infty.
\]

We will prove the following.

Theorem 7.5 Let $f$ be entire and have growth order $\rho_0$, with zeros $a_1, \dots, a_n, \dots$, and let $k$ be an integer such that $k \le \rho_0 < k + 1$, then $f(z)$ has the representation
\[
f(z) = e^{P(z)}z^m \prod_{n=1}^{\infty} E_k\Big(\frac{z}{a_n}\Big). \tag{7.11}
\]
Here $P(z)$ is a polynomial of degree at most $k$, and $m$ is the order of zero of $f$ at $z = 0$.

The main improvement here compared to the Weierstrass factorization is that the factors $E_k(z/a_n)$ have a constant order $k$, and the overall exponential factor is the exponential of a polynomial. The reason this is possible is as follows. Consider the function
\[
E(z) = z^m \prod_{n=1}^{\infty} E_k\Big(\frac{z}{a_n}\Big). \tag{7.12}
\]
Previously, when we considered the function
\[
E_1(z) = z^m \prod_{n=1}^{\infty} E_n\Big(\frac{z}{a_n}\Big),
\]

in order to show that the function $E_1(z)$ is entire, we used the estimate
\[
|1 - E_n(z)| \le C|z|^{n+1} \tag{7.13}
\]
for $|z| < 1/2$, which is summable in $n$. Now, however, we know, in addition, that $|a_n|$ have to grow at a certain rate: the series
\[
\sum_{n=1}^{\infty} \frac{1}{|a_n|^{k+1}} < +\infty
\]
converges. Therefore, we use the bound (7.13) with $n = k$ only: for any fixed $z$ and sufficiently large $n$ we have
\[
\left|1 - E_k\Big(\frac{z}{a_n}\Big)\right| \le C\left|\frac{z}{a_n}\right|^{k+1},
\]
hence for any $R > 0$, and $|z| < R$ the series
\[
\sum_{|a_n|\ge 2R} \left|1 - E_k\Big(\frac{z}{a_n}\Big)\right|
\]
is majorized by a constant multiple of the series
\[
\sum_{|a_n|\ge 2R} \left|\frac{z}{a_n}\right|^{k+1},
\]

which converges. Hence, the function $E(z)$ given by (7.12) is entire and has the same zeros (with multiplicity) as $f$. Therefore, the ratio $f(z)/E(z)$ can be written as
\[
\frac{f(z)}{E(z)} = e^{g(z)},
\]
with an entire function $g(z)$. Our task is to show that $g(z)$ has to be a polynomial of degree at most $k$. We will do this by showing that the ratio $f(z)/E(z)$ does not grow too fast. This relies on the following two lemmas.

Lemma 7.6 For any $s > \rho_0$ there exists a sequence $r_m \to +\infty$ so that
\[
\left|\prod_{n=1}^{\infty} E_k\Big(\frac{z}{a_n}\Big)\right| \ge e^{-c|z|^s}, \tag{7.14}
\]
for $z$ with $|z| = r_m$.

Lemma 7.7 Assume that g is an entire function and there exists a sequence rn → +∞
so that
Re g(z) ≤ crns , for all z with |z| = rn . (7.15)
Then g is a polynomial of degree less or equal to s.

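The end of the proof of Hadamard's theorem is as follows: for any $s > \rho_0$ we have
\[
|f(z)| \le A_se^{B|z|^s},
\]
and by Lemma 7.6 there exists a sequence $r_m \to +\infty$ so that
\[
|E(z)| \ge c_1e^{-c|z|^s}, \quad \text{for all } z \text{ with } |z| = r_m.
\]
Therefore, we have
\[
e^{\operatorname{Re} g(z)} = \left|\frac{f(z)}{E(z)}\right| \le C_1e^{C_2|z|^s} \quad \text{for all } z \text{ with } |z| = r_m,
\]
or
\[
\operatorname{Re} g(z) \le C|z|^s, \quad \text{for all } z \text{ with } |z| = r_m.
\]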
This, by Lemma 7.7, implies that g(z) is a polynomial of degree less or equal to s. As s
is an arbitrary number larger than ρ0 and k ≤ ρ0 < k + 1, g(z) is a polynomial of degree
at most k.
Therefore, to finish the proof we need to prove Lemmas 7.6 and 7.7. We begin with Lemma 7.7 which is shorter. As $g(z)$ is an entire function, we can write it as
\[
g(z) = \sum_{n=0}^{\infty} a_nz^n,
\]
and our goal is to show that $a_n = 0$ for $n > s$. Observe that for $n \ge 0$ we have,
\[
a_nr^n = \int_0^1 g(re^{2\pi i\theta})e^{-2\pi in\theta}\,d\theta, \quad n \ge 0,
\]
while for $n < 0$ these integrals vanish:
\[
\int_0^1 g(re^{2\pi i\theta})e^{-2\pi in\theta}\,d\theta = 0, \quad n < 0.
\]

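It follows, taking the complex conjugate of the last identity with $-n$ in place of $n$ and adding it to the previous one, that
\[
a_nr^n = 2\int_0^1 \operatorname{Re} g(re^{2\pi i\theta})e^{-2\pi in\theta}\,d\theta, \quad n > 0,
\]
while for $n = 0$ we simply have
\[
\operatorname{Re} a_0 = \int_0^1 \operatorname{Re} g(re^{2\pi i\theta})\,d\theta.
\]
As the integral of $e^{-2\pi in\theta}$ over $[0, 1]$ vanishes for $n \ne 0$, we conclude that for all $n > 0$, any $C > 0$ and any $r > 0$ we have
\[
a_n = \frac{2}{r^n}\int_0^1 \left[\operatorname{Re} g(re^{2\pi i\theta}) - Cr^s\right]e^{-2\pi in\theta}\,d\theta, \quad n > 0.
\]
Now, taking $C = c$ and $r = r_m$ as in the assumption of the lemma, so that the expression in the brackets is non-positive, we have
\[
\begin{aligned}
|a_n| &\le \frac{2}{r_m^n}\int_0^1 \left|\operatorname{Re} g(r_me^{2\pi i\theta}) - Cr_m^s\right|d\theta = \frac{2}{r_m^n}\int_0^1 \left[Cr_m^s - \operatorname{Re} g(r_me^{2\pi i\theta})\right]d\theta \\
&\le \frac{C'}{r_m^{n-s}} - \frac{2\operatorname{Re} a_0}{r_m^n} \to 0 \quad \text{as } r_m \to +\infty,
\end{aligned}
\]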

for all n > s, and the proof of Lemma 7.7 is complete.
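The key identity in this proof, recovering $a_n$ for $n > 0$ from the real part of $g$ alone, can be checked on a polynomial. The sketch below is an added illustration with arbitrarily chosen coefficients; the function name `coefficient_from_real_part` is hypothetical. For a polynomial, the equally spaced Riemann sum is exact up to rounding once the number of sample points exceeds twice the degree.

```python
import cmath
import math


def coefficient_from_real_part(g, n, radius, samples=256):
    """Evaluate (2 / r^n) * integral over [0, 1] of
    Re g(r e^{2 pi i theta}) * e^{-2 pi i n theta} d theta
    with an equally spaced Riemann sum."""
    total = 0j
    for idx in range(samples):
        theta = idx / samples
        z = radius * cmath.exp(2j * math.pi * theta)
        total += g(z).real * cmath.exp(-2j * math.pi * n * theta)
    return 2 * total / samples / radius ** n


COEFFS = [1 + 2j, -0.5j, 3.0, 0.25 - 1j]  # assumed example coefficients a_0..a_3


def g(z):
    """Polynomial with Taylor coefficients COEFFS."""
    return sum(a * z ** n for n, a in enumerate(COEFFS))


if __name__ == "__main__":
    for n in range(1, 4):
        print(n, coefficient_from_real_part(g, n, 1.7), COEFFS[n])
```

For $n = 0$ the same expression returns $2\operatorname{Re} a_0$, which is the quantity appearing in the final estimate of the proof.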


The proof of Lemma 7.6 is quite a bit longer. An obvious potential obstacle is that $E_k(z/a_n) = 0$ when $z = a_n$. Therefore, the conclusion of the lemma can not hold for all $z$ and the best we can hope for is to prove it for points separated away from $a_n$. The main step is to prove the lower bound for the product outside of “forbidden disks” centered at $a_n$, and then prove that the union of forbidden disks does not cover too many circles centered at zero. More precisely, we will show that for any $s$ such that $\rho_0 < s < k + 1$ we have
\[
\left|\prod_{n=1}^{\infty} E_k\Big(\frac{z}{a_n}\Big)\right| \ge e^{-c|z|^s}, \quad \text{for all } z \text{ such that } |z - a_n| > \frac{1}{|a_n|^{k+1}} \text{ for all } n \ge 1. \tag{7.16}
\]

This is sufficient because the series
\[
\sum_{n=1}^{\infty} \frac{1}{|a_n|^{k+1}} < +\infty
\]
converges. Indeed, we can choose $N$ so that
\[
\sum_{n=N}^{\infty} \frac{1}{|a_n|^{k+1}} < \frac{1}{10}.
\]

It follows that for every positive integer $L$ we can find $r \in (L, L + 1)$ so that the circle $\{|z| = r\}$ does not intersect any of the forbidden disks with $n \ge N$ (the sum of their diameters is less than $1/5$), and the claim of

Lemma 7.6 will follow. Hence, we concentrate on the proof of (7.16). For a given $z$ we will consider separately
\[
I = \prod_{|a_n|>2|z|} E_k\Big(\frac{z}{a_n}\Big)
\]
and
\[
II = \prod_{|a_n|\le 2|z|} E_k\Big(\frac{z}{a_n}\Big). \tag{7.17}
\]

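In the first term we will use the fact that for $|z| < 1/2$ we can write
\[
|E_k(z)| = \left|\exp\Big\{\log(1 - z) + \sum_{n=1}^{k} \frac{z^n}{n}\Big\}\right| = \left|\exp\Big\{-\sum_{n=k+1}^{\infty} \frac{z^n}{n}\Big\}\right| \ge e^{-c|z|^{k+1}}.
\]
This gives
\[
|I| \ge \prod_{|a_n|>2|z|} e^{-c|z/a_n|^{k+1}} \ge \exp\Big\{-c|z|^{k+1}\sum_{|a_n|>2|z|} \frac{1}{|a_n|^{k+1}}\Big\}.
\]
However, for $|a_n| > 2|z|$ and $s < k + 1$ we have
\[
\frac{1}{|a_n|^{k+1}} = \frac{1}{|a_n|^s}\,\frac{1}{|a_n|^{k+1-s}} \le \frac{C}{|a_n|^s}\,\frac{1}{|z|^{k+1-s}}.
\]
As the series
\[
\sum_{n=1}^{\infty} \frac{1}{|a_n|^s} < +\infty
\]
is summable, we deduce that
\[
|I| \ge \exp\{-c|z|^s\}.
\]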
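Finally, in order to bound the second term given by (7.17) we note that for $|z| \ge 1/2$ we have
\[
\begin{aligned}
|E_k(z)| &= |1 - z|\left|\exp\Big(z + \frac{z^2}{2} + \dots + \frac{z^k}{k}\Big)\right| \ge |1 - z|\exp\Big(-\Big|z + \frac{z^2}{2} + \dots + \frac{z^k}{k}\Big|\Big) \\
&\ge |1 - z|\exp(-c|z|^k).
\end{aligned} \tag{7.18}
\]
It follows that
\[
|II| \ge \prod_{|a_n|\le 2|z|} \left|1 - \frac{z}{a_n}\right| \prod_{|a_n|\le 2|z|} e^{-c|z/a_n|^k}. \tag{7.19}
\]
The second product above is estimated as before, by writing
\[
\sum_{|a_n|\le 2|z|} \left|\frac{z}{a_n}\right|^k = |z|^k\sum_{|a_n|\le 2|z|} |a_n|^{s-k}\,\frac{1}{|a_n|^s} \le C|z|^k|z|^{s-k}\sum_{|a_n|\le 2|z|} \frac{1}{|a_n|^s} \le C|z|^s,
\]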

whence
\[
\prod_{|a_n|\le 2|z|} e^{-c|z/a_n|^k} \ge e^{-c|z|^s}.
\]

Finally, we come to the first term in the right side of (7.19) that forces us to keep $z$ outside of the forbidden disks (it is clear from its form that some assumption on the distance from $z$ to $a_n$ is needed). Note that, if $|a_n - z| > 1/|a_n|^{k+1}$ (that is, $z$ is outside all of the forbidden disks), we have
\[
\prod_{|a_n|\le 2|z|} \left|1 - \frac{z}{a_n}\right| = \prod_{|a_n|\le 2|z|} \frac{|a_n - z|}{|a_n|} \ge \prod_{|a_n|\le 2|z|} \frac{1}{|a_n|^{k+2}}.
\]

Observe that (recall that $n(r)$ is the number of zeros of $f$ inside the disk $\{|z| \le r\}$)
\[
\sum_{|a_n|\le 2|z|} (k + 2)\log|a_n| \le (k + 2)\,n(2|z|)\log(2|z|).
\]
Theorem 7.1 implies that $n(r) \le C(1 + r)^{s'}$ for any $s' > \rho_0$, thus
\[
\sum_{|a_n|\le 2|z|} (k + 2)\log|a_n| \le C(k + 2)(1 + |z|)^{s'}\log(2|z|) \le C|z|^s,
\]
if we take $\rho_0 < s' < s$. As a consequence, we have
\[
\prod_{|a_n|\le 2|z|} \left|1 - \frac{z}{a_n}\right| \ge \prod_{|a_n|\le 2|z|} \frac{1}{|a_n|^{k+2}} \ge e^{-c|z|^s},
\]
and the proof of Theorem 7.5 is complete!

8 The Basics of the Geometric Theory


This section (taken from the Shabat book) recalls the basics of the geometric theory of
functions of a complex variable.

8.1 The Argument Principle


Let the function $f$ be holomorphic in a punctured neighborhood $\{0 < |z - a| < r\}$ of a point $a \in \mathbb{C}$. We assume also that $f$ does not vanish in this neighborhood. The logarithmic residue of the function $f$ at the point $a$ is the residue of the logarithmic derivative
\[
\frac{f'(z)}{f(z)} = \frac{d}{dz}\operatorname{Ln} f(z) \tag{8.1}
\]
of this function at the point $a$.
Apart from isolated singular points the function f may have a non-zero logarithmic
residue at its zeros. Let a ∈ C be a zero of order n of a function f holomorphic at a.

Then we have $f(z) = (z - a)^n\varphi(z)$ in a neighborhood $U_a$ of $a$ with the function $\varphi$ holomorphic and different from zero in $U_a$. Therefore we have in $U_a$
\[
\frac{f'(z)}{f(z)} = \frac{n(z-a)^{n-1}\varphi(z) + (z-a)^n\varphi'(z)}{(z-a)^n\varphi(z)} = \frac{1}{z-a}\cdot\frac{n\varphi(z) + (z-a)\varphi'(z)}{\varphi(z)}
\]
with the second factor holomorphic in $U_a$. Hence it may be expanded into the Taylor series with the zero order term equal to $n$. Therefore we have in $U_a$
\[
\frac{f'(z)}{f(z)} = \frac{1}{z-a}\left\{n + c_1(z-a) + c_2(z-a)^2 + \dots\right\} = \frac{n}{z-a} + c_1 + c_2(z-a) + \dots \tag{8.2}
\]

This shows that the logarithmic derivative has a pole of order one with residue equal to
n at the zero of order n of f : the logarithmic residue at a zero of a function is equal to
the order of this zero.
If $a$ is a pole of $f$ of order $p$ then $1/f$ has a zero of order $p$ at this point. Observing that
\[
\frac{f'(z)}{f(z)} = -\frac{d}{dz}\operatorname{Ln}\frac{1}{f(z)},
\]
and using (8.2) we conclude that the logarithmic derivative has residue equal to $-p$ at a pole of order $p$: the logarithmic residue at a pole is equal to the order of this pole with the minus sign.
These observations allow us to compute the number of zeros and poles of meromorphic functions. We adopt the convention that a pole and a zero are counted as many times as their order is.

Theorem 8.1 Let the function $f$ be meromorphic in a domain $D \subset \mathbb{C}$ and let $G$ be a
domain properly contained in $D$ with the boundary $\partial G$ that is a continuous curve. Let
us assume that $\partial G$ contains neither poles nor zeros of $f$ and let $N$ and $P$ be the total
number of zeros and poles of $f$ in the domain $G$. Then

$$N - P = \frac{1}{2\pi i}\int_{\partial G}\frac{f'(z)}{f(z)}\,dz. \tag{8.3}$$

Proof. The function $f$ has only finitely many zeros $a_1,\dots,a_l$ and poles $b_1,\dots,b_m$ in
$G$ since $G$ is properly contained in $D$. The function $g = f'/f$ is holomorphic in a
neighborhood of $\partial G$ since the boundary of $G$ does not contain poles or zeros of $f$. Applying
the Cauchy theorem on residues to $g$ we find

$$\frac{1}{2\pi i}\int_{\partial G}\frac{f'}{f}\,dz = \sum_{\nu=1}^{l}\mathrm{res}_{a_\nu}\, g + \sum_{\nu=1}^{m}\mathrm{res}_{b_\nu}\, g. \tag{8.4}$$

However, according to our previous remark,

$$\mathrm{res}_{a_\nu}\, g = n_\nu, \qquad \mathrm{res}_{b_\nu}\, g = -p_\nu.$$

Here $n_\nu$ and $p_\nu$ are the orders of the zero $a_\nu$ and of the pole $b_\nu$, respectively. Using this in (8.4)
and counting the multiplicities of zeros and poles we obtain (8.3) since $N = \sum n_\nu$ and
$P = \sum p_\nu$. □
The theorem that we have just proved has a geometric interpretation. Let us parameterize
$\partial G$ as $z = z(t)$, $\alpha \le t \le \beta$ and denote by $\Phi(t)$ the anti-derivative of $f'/f$ along
this path. The Newton-Leibniz formula implies that
$$\int_{\partial G}\frac{f'(z)}{f(z)}\,dz = \Phi(\beta) - \Phi(\alpha). \tag{8.5}$$

However, clearly, $\Phi(t) = \ln[f(z(t))]$, where $\ln$ denotes any branch of the logarithm that
varies continuously along the path $\partial G$. It suffices to choose a branch of $\arg f$ that varies
continuously along $\partial G$ since $\mathrm{Ln}\, f = \ln|f| + i\,\mathrm{Arg}\, f$ and the function $\ln|f|$ is single-valued.
The increment of $\ln|f|$ along a closed path $\partial G$ is equal to zero and thus

$$\Phi(\beta) - \Phi(\alpha) = i\{\arg f(z(\beta)) - \arg f(z(\alpha))\}.$$

We denote the increment of the argument of $f$ in the right side by $\Delta_{\partial G}\arg f$ and re-write
(8.5) as
$$\int_{\partial G}\frac{f'}{f}\,dz = i\,\Delta_{\partial G}\arg f.$$
Theorem 8.1 may now be expressed as
Theorem 8.2 (The argument principle) Under the assumptions of Theorem 8.1 the
difference between the number of zeros $N$ and the number of poles $P$ of a function $f$ in a
domain $G$ is equal to the increment of the argument of this function along the oriented
boundary of $G$ divided by $2\pi$:
$$N - P = \frac{1}{2\pi}\,\Delta_{\partial G}\arg f. \tag{8.6}$$

Geometrically the right side of (8.6) is the total number of turns the vector $w = f(z)$
makes around $w = 0$ as $z$ varies along $\partial G$. Let us denote by $\partial G^*$ the image of $\partial G$ under
the map $f$, that is, the path $w = f(z(t))$, $\alpha \le t \le \beta$. Then this number is equal to
the total number of times the vector $w$ rotates around $w = 0$ as it varies along $\partial G^*$.
This number is called the winding number of $\partial G^*$ around $w = 0$; we will denote it by
$\mathrm{ind}_0\,\partial G^*$. The argument principle states that
$$N - P = \frac{1}{2\pi}\,\Delta_{\partial G}\arg f = \mathrm{ind}_0\,\partial G^*. \tag{8.7}$$
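
The counting formula (8.3) is easy to test numerically. The following is only a sketch under our own assumptions: the test function (a double zero at 0.5 and a simple pole at -0.25, both inside the unit disk), the helper name `count_zeros_minus_poles` and the number of sample points are illustrative choices and do not come from the text. The contour integral over the unit circle is approximated by an equally spaced Riemann sum.

```python
import cmath
import math

def count_zeros_minus_poles(f, df, center=0j, radius=1.0, samples=4000):
    """Approximate (1/(2*pi*i)) * integral of f'/f over |z - center| = radius."""
    total = 0j
    for k in range(samples):
        t = 2 * math.pi * k / samples
        z = center + radius * cmath.exp(1j * t)
        # dz = i * radius * e^{it} dt for the parameterization z(t)
        dz = 1j * radius * cmath.exp(1j * t) * (2 * math.pi / samples)
        total += df(z) / f(z) * dz
    return total / (2j * math.pi)

# Illustrative function: double zero at 0.5, simple pole at -0.25
f = lambda z: (z - 0.5) ** 2 / (z + 0.25)
df = lambda z: (2 * (z - 0.5) * (z + 0.25) - (z - 0.5) ** 2) / (z + 0.25) ** 2

n_minus_p = count_zeros_minus_poles(f, df)
print(round(n_minus_p.real))  # N - P = 2 - 1
```

Here the zero is counted twice and the pole once, in agreement with the convention on multiplicities adopted above.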

Remark 8.3 We may consider the $a$-points of $f$, that is, solutions of $f(z) = a$, and not only
its zeros: it suffices to replace $f$ by $f(z) - a$ in our arguments. If $\partial G$ contains neither
poles nor $a$-points of $f$ then
$$N_a - P = \frac{1}{2\pi i}\int_{\partial G}\frac{f'(z)}{f(z)-a}\,dz = \frac{1}{2\pi}\,\Delta_{\partial G}\arg\{f(z)-a\}, \tag{8.8}$$
where $N_a$ is the number of $a$-points of $f$ in the domain $G$. Passing to the plane $w = f(z)$
and introducing the index of the path $\partial G^*$ around the point $a$ we may re-write (8.8) as
$$N_a - P = \frac{1}{2\pi}\,\Delta_{\partial G}\arg\{f(z)-a\} = \mathrm{ind}_a\,\partial G^*. \tag{8.9}$$

The next theorem is an example of the application of the argument principle.

Theorem 8.4 (Rouché$^{14}$) Let the functions $f$ and $g$ be holomorphic in a closed domain
$\bar G$ with a continuous boundary $\partial G$ and let

$$|f(z)| > |g(z)| \quad \text{for all } z \in \partial G. \tag{8.10}$$

Then the functions $f$ and $f + g$ have the same number of zeros in $G$.

Proof. Assumption (8.10) shows that neither $f$ nor $f + g$ vanishes on $\partial G$ and thus the
argument principle may be applied to both of these functions. Moreover, since $f \neq 0$
on $\partial G$, we have $f + g = f\left(1 + \frac{g}{f}\right)$ and thus we have with the appropriate choice of a
branch of the argument:
$$\Delta_{\partial G}\arg(f+g) = \Delta_{\partial G}\arg f + \Delta_{\partial G}\arg\left(1 + \frac{g}{f}\right). \tag{8.11}$$

However, since $\left|\frac{g}{f}\right| < 1$ on $\partial G$, the point $\omega = \frac{g}{f}$ lies in $\{|\omega| < 1\}$ for all $z \in \partial G$.
Therefore the vector $w = 1 + \omega$ may not turn around zero and hence the second term
in the right side of (8.11) vanishes. Therefore, $\Delta_{\partial G}\arg(f+g) = \Delta_{\partial G}\arg f$ and the
argument principle implies the statement of the theorem. □
The Rouché theorem is useful in counting the zeros of holomorphic functions. In
particular it implies the fundamental theorem of algebra in a very simple way.

Theorem 8.5 Any polynomial $P_n$ of degree $n$ has exactly $n$ roots in $\mathbb{C}$ (counted with multiplicity).

Proof. All zeros of $P_n$ must lie in a disk $\{|z| < R\}$ since $P_n$ has a pole at infinity. Let
$P_n = f + g$ where $f = a_0 z^n$, $a_0 \neq 0$ and $g = a_1 z^{n-1} + \dots + a_n$. Then, possibly after
increasing $R$, we may assume that $|f| > |g|$ on $\{|z| = R\}$ since $|f| = |a_0|R^n$ while $g$ is
a polynomial of degree less than $n$. The Rouché theorem implies that $P_n$ has as many
roots in $\{|z| < R\}$ as $f = a_0 z^n$, that is, exactly $n$ of them. □
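
As an illustration of how the Rouché theorem and the winding number work together, here is a small numerical sketch. The polynomial $z^5 + 3z + 1$, the splitting into $f = 3z$ and $g = z^5 + 1$, and the helper `winding_number` are our own illustrative assumptions. On the unit circle $|g| \le 2 < 3 = |f|$, so the polynomial should have one zero in the unit disk, while on $\{|z| = 2\}$ the leading term dominates and all five zeros are enclosed.

```python
import cmath
import math

def winding_number(h, radius=1.0, samples=4000):
    """Number of turns of h(z) around 0 as z runs once over |z| = radius."""
    total = 0.0
    prev = h(complex(radius, 0.0))
    for k in range(1, samples + 1):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        cur = h(z)
        total += cmath.phase(cur / prev)  # small increment of the argument
        prev = cur
    return round(total / (2 * math.pi))

f = lambda z: 3 * z
g = lambda z: z ** 5 + 1
p = lambda z: f(z) + g(z)

# Check the hypothesis |f| > |g| on sample points of the unit circle
circle = [cmath.exp(2j * math.pi * k / 720) for k in range(720)]
rouche_holds = all(abs(f(z)) > abs(g(z)) for z in circle)

zeros_f_unit = winding_number(f)            # zeros of f in |z| < 1
zeros_p_unit = winding_number(p)            # zeros of f + g in |z| < 1
zeros_p_big = winding_number(p, radius=2.0) # zeros of f + g in |z| < 2
print(rouche_holds, zeros_f_unit, zeros_p_unit, zeros_p_big)
```

Since the functions here have no poles, the winding number of the image curve equals the number of zeros inside by (8.7).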

8.2 The Open Mapping Theorem


Theorem 8.6$^{15}$ If a function $f$ holomorphic in a domain $D$ is not identically equal to
a constant then the image $D^* = f(D)$ is also a domain.

14 Eugène Rouché (1832-1910) was a French mathematician.
15 This theorem was proved by Riemann in 1851.

Proof. We have to show that $D^*$ is connected and open. Let $w_1$ and $w_2$ be two arbitrary
points in $D^*$ and let $z_1$ and $z_2$ be some pre-images of $w_1$ and $w_2$, respectively. Since the
domain $D$ is path-wise connected there exists a path $\gamma : [\alpha, \beta] \to D$ that connects $z_1$
and $z_2$. Its image $\gamma^* = f \circ \gamma$ connects $w_1$ and $w_2$ and is a path since the function $f$ is
continuous. Moreover, it is clearly contained in $D^*$ and hence the set $D^*$ is path-wise
connected.
Let $w_0$ be an arbitrary point in $D^*$ and let $z_0$ be a pre-image of $w_0$. There exists a
disk $\{|z - z_0| < r\}$ centered at $z_0$ that is properly contained in $D$ since $D$ is open. After
decreasing $r$ we may assume that $\{|z - z_0| \le r\}$ contains no other $w_0$-points of $f$ except
$z_0$: since $f \neq \mathrm{const}$ its $w_0$-points are isolated in $D$. We denote by $\gamma = \{|z - z_0| = r\}$ the
boundary of this disk and let

$$\mu = \min_{z\in\gamma}|f(z) - w_0|. \tag{8.12}$$

Clearly $\mu > 0$ since the continuous function $|f(z) - w_0|$ attains its minimum on $\gamma$, so
that if $\mu = 0$ then there would exist a $w_0$-point of $f$ on $\gamma$ contrary to our construction
of the disk.
Let us now show that the set $\{|w - w_0| < \mu\}$ is contained in $D^*$. Indeed, let $w_1$ be
an arbitrary point in this disk, that is, $|w_1 - w_0| < \mu$. Then we have

$$f(z) - w_1 = f(z) - w_0 + (w_0 - w_1), \tag{8.13}$$

and, moreover, $|f(z) - w_0| \ge \mu$ on $\gamma$. Then, since $|w_0 - w_1| < \mu$, the Rouché theorem
implies that the function $f(z) - w_1$ has as many roots inside $\gamma$ as $f(z) - w_0$. Hence it
has at least one zero (the point $z_0$ may be a zero of order higher than one of $f(z) - w_0$).
Thus the function $f$ takes the value $w_1$ and hence $w_1 \in D^*$. However, $w_1$ is an arbitrary
point in the disk $\{|w - w_0| < \mu\}$ and hence this whole disk is contained in $D^*$ so that
$D^*$ is open. □

Exercise 8.7 Let f be holomorphic in {Imz ≥ 0}, real on the real axis and bounded.
Show that f ≡ const.

A similar but more detailed analysis leads to the solution of the problem of local inversion
of holomorphic functions. This problem is formulated as follows.
A holomorphic function $w = f(z)$ is defined at $z_0$; find a function $z = g(w)$ analytic
at $w_0 = f(z_0)$ so that $g(w_0) = z_0$ and $f(g(w)) = w$ in a neighborhood of $w_0$.
We should distinguish two cases in the solution of this problem:
I. The point $z_0$ is not a critical point: $f'(z_0) \neq 0$. As in the proof of the open mapping
theorem we choose a disk $\{|z - z_0| \le r\}$ that contains no $w_0$-points except $z_0$, and define
$\mu$ according to (8.12). Let $w_1$ be an arbitrary point in the disk $\{|w - w_0| < \mu\}$. Then
the same argument (using (8.13) and the Rouché theorem) shows that the function $f$
takes the value $w_1$ in $\{|z - z_0| < r\}$ as many times as $w_0$. However, the value $w_0$ is taken only once and,
moreover, $z_0$ is a simple zero of $f(z) - w_0$ since $f'(z_0) \neq 0$.
Therefore the function $f$ takes every value in the disk $\{|w - w_0| < \mu\}$ exactly once in the disk
$\{|z - z_0| < r\}$. In other words, the function $f$ is a local bijection at $z_0$.

Then the function $z = g(w)$ is defined in the disk $\{|w - w_0| < \mu\}$ so that $g(w_0) = z_0$
and $f \circ g(w) = w$. Furthermore, the derivative $g'(w)$ exists at every point of the disk
$\{|w - w_0| < \mu\}$:
$$g'(w) = \frac{1}{f'(z)}, \qquad z = g(w), \tag{8.14}$$
and thus $g$ is holomorphic in this disk$^{16}$.
II. The point $z_0$ is a critical point: $f'(z_0) = \dots = f^{(p-1)}(z_0) = 0$, $f^{(p)}(z_0) \neq 0$, $p \ge 2$.
We repeat the same argument as before, choosing a disk $\{|z - z_0| < r\}$ that contains,
apart from $z_0$ itself, neither $w_0$-points of $f$ nor zeros of the derivative $f'$ (we use the uniqueness theorem once
again). As before, we choose $\mu > 0$, take an arbitrary point $w_1$ in the disk $\{|w - w_0| < \mu\}$
and find that $f$ takes the value $w_1$ as many times as $w_0$. However, in the present case
the $w_0$-point $z_0$ has multiplicity $p$: $z_0$ is a zero of order $p$ of $f(z) - w_0$. Furthermore,
since $f'(z) \neq 0$ for $0 < |z - z_0| < r$ the value $w_1 \neq w_0$ has to be taken at $p$ different points.
Therefore, the function $f$ takes each such value $p$ times in $\{|z - z_0| < r\}$.
The above analysis implies the following

Theorem 8.8 The condition $f'(z_0) \neq 0$ is necessary and sufficient for the local invertibility
of a holomorphic function $f$ at the point $z_0$.
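
The two cases can be checked on simple examples. The functions below are illustrative assumptions of ours, not taken from the text: $f(z) = z + z^2$ at $z_0 = 0$ (where $f'(0) = 1$), whose local inverse is known in closed form, and $z^2$ at the critical point $z_0 = 0$. The sketch compares a finite-difference derivative of the inverse with formula (8.14), and exhibits the two pre-images of a small value near the critical point.

```python
import cmath

def f(z):
    return z + z * z          # f'(z) = 1 + 2z, so f'(0) = 1 != 0

def g(w):
    # branch of the inverse with g(0) = 0; valid for |w| < 1/4
    return (-1 + cmath.sqrt(1 + 4 * w)) / 2

w = 0.05 + 0.02j
h = 1e-6
numeric_derivative = (g(w + h) - g(w - h)) / (2 * h)
formula_derivative = 1 / (1 + 2 * g(w))   # 1 / f'(z) with z = g(w), as in (8.14)

# Critical point: z**2 has zero derivative at 0 and takes each small w1 != 0 twice
w1 = 0.01j
roots = [cmath.sqrt(w1), -cmath.sqrt(w1)]
print(abs(numeric_derivative - formula_derivative), roots)
```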

Remark 8.9 The general inverse function theorem of real analysis implies that
the assumption $f'(z_0) \neq 0$ is sufficient for the local invertibility since the Jacobian
$J_f(z) = |f'(z)|^2$ of the map $(x, y) \to (u, v)$ is non-zero at this point. However, for an
arbitrary differentiable map the condition $J_f(z) \neq 0$ is not necessary for local invertibility.
This may be seen on the example of the map $f = x^3 + iy$ that has Jacobian equal to
zero at $z = 0$ but that is nevertheless one-to-one.

Remark 8.10 The local invertibility condition $f'(z) \neq 0$ for all $z \in D$ is not sufficient
for the global invertibility of the function in the whole domain $D$. This may be seen
on the example of $f(z) = e^z$ that is locally invertible at every point in $\mathbb{C}$ but is not
one-to-one in any domain that contains two points that differ by $2k\pi i$ where $k \neq 0$ is
an integer.

8.3 The maximum modulus principle and the Schwarz lemma


The maximum modulus principle is expressed by the following theorem.

Theorem 8.11 If the function f is holomorphic in a domain D and its modulus |f |


achieves its (local) maximum at a point z0 ∈ D then f is constant.

Proof. We use the open mapping theorem. If $f \neq \mathrm{const}$ then the image of any disk
$\{|z - z_0| < \delta\} \subset D$ is a domain that contains the point $w_0 = f(z_0)$, and hence
contains a disk $\{|w - w_0| < \mu\}$ centered at $w_0$. There must be a point $w_1$ in this disk
so that $|w_1| > |w_0|$. The value $w_1$ is taken
by the function $f$ in the $\delta$-neighborhood of the point $z_0$, which contradicts the fact that $|f|$
achieves its (local) maximum at this point. □

16 Expression (8.14) shows that in order for the derivative to exist we need $f' \neq 0$. Using continuity of $f'$
we may conclude that $f' \neq 0$ in the disk $\{|z - z_0| < r\}$, possibly decreasing $r$ if needed.

Taking into account the properties of continuous functions on a closed set the max-
imum modulus principle may be reformulated as
Theorem 8.12 If a function f is holomorphic in a domain D and continuous in D̄
then |f | achieves its maximum on the boundary ∂D.
Proof. If $f = \mathrm{const}$ in $D$ (and hence in $\bar D$ by continuity) the statement is trivial.
Otherwise if $f \neq \mathrm{const}$ then $|f|$ may not attain its maximum at the points of $D$.
However, since this maximum is attained in $\bar D$ it must be achieved on $\partial D$. □
Exercise 8.13 1. Let $P(z)$ be a polynomial of degree $n$ in $z$ and let $M(r) = \max_{|z|=r}|P(z)|$.
Show that $M(r)/r^n$ is a decreasing function.
2. Formulate and prove the maximum principle for the real part of a holomorphic
function.
A similar statement for the minimum of modulus is false in general. This may be seen
on the example of the function f (z) = z in the disk {|z| < 1} (the minimum of |f | is
attained at z = 0). However, the following theorem holds.
Theorem 8.14 Let a function f be holomorphic in a domain D and not vanish any-
where in D. Then |f | may attain its (local) minimum in D only if f = const.
For the proof of this theorem it suffices to apply Theorem 8.11 to the function $g = 1/f$
that is holomorphic since $f \neq 0$.
A simple corollary of the maximum modulus principle is
Lemma 8.15 (The Schwarz lemma$^{17}$) Let a function $f$ be holomorphic in the unit disk
$U = \{|z| < 1\}$, satisfy $|f(z)| \le 1$ for all $z \in U$ and $f(0) = 0$. Then we have

$$|f(z)| \le |z| \tag{8.15}$$

for all $z \in U$. Moreover, if the equality in (8.15) holds for at least one $z \neq 0$ then it
holds everywhere in $U$ and in this case $f(z) = e^{i\alpha}z$, where $\alpha$ is a real constant.
Proof. Consider the function $\varphi(z) = f(z)/z$; it is holomorphic in $U$ since $f(0) = 0$.
Let $U_r = \{|z| < r\}$, $r < 1$ be an arbitrary disk centered at zero. The function $|\varphi(z)|$
attains its maximum in $\bar U_r$ on the boundary $\gamma_r = \{|z| = r\}$ according to Theorem 8.12.
However, we have $|\varphi| \le 1/r$ on $\gamma_r$ since $|f| \le 1$ by assumption. Therefore we have

$$|\varphi(z)| \le 1/r \tag{8.16}$$

everywhere in $U_r$. We fix $z \in U$ and observe that $z \in U_r$ for $r > |z|$. Therefore (8.16)
holds for any given $z$ with all $r > |z|$. Passing to the limit $r \to 1$ we
obtain $|\varphi(z)| \le 1$ or $|f(z)| \le |z|$. This proves the inequality (8.15).
Let us assume that equality in (8.15) holds for some $z \in U$, $z \neq 0$; then $|\varphi|$ attains its
maximum equal to 1 at this point. Then $\varphi$ is equal to a constant so that $\varphi(z) = e^{i\alpha}$ and
$f(z) = e^{i\alpha}z$. □

17 Hermann Schwarz (1843-1921) was a German mathematician, a student of Weierstrass. This
important lemma appeared in his papers of 1869-70.

The Schwarz lemma implies that a holomorphic map $f$ that maps the disk $\{|z| < 1\}$
into the disk $\{|w| < 1\}$ and that takes the center to the center, maps any circle $\{|z| = r\}$
into the closed disk $\{|w| \le r\}$. The image of $\{|z| = r\}$ may intersect $\{|w| = r\}$ if and only if $f$
is a rotation around $z = 0$.
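
A quick numerical check of the inequality (8.15) may be helpful. The example function below is our own assumption: $z$ multiplied by a Blaschke factor with a hypothetical parameter `A`, which is holomorphic in $U$, bounded by one and vanishes at the origin. The sketch evaluates $|f(z)|/|z|$ on a polar grid and estimates $|f'(0)|$ by a central difference.

```python
import cmath
import math

A = complex(0.5, 0.0)  # hypothetical parameter, |A| < 1

def f(z):
    # z times a Blaschke factor: holomorphic in U, |f| <= 1, f(0) = 0
    return z * (z - A) / (1 - A.conjugate() * z)

# Polar grid of sample points in the punctured unit disk
points = [r * cmath.exp(2j * math.pi * k / 36)
          for r in (0.1, 0.3, 0.5, 0.7, 0.9)
          for k in range(36)]

worst_ratio = max(abs(f(z)) / abs(z) for z in points)

step = 1e-6
derivative_at_zero = abs((f(step) - f(-step)) / (2 * step))  # |f'(0)| = |A|
print(worst_ratio, derivative_at_zero)
```

Since this $f$ is not a rotation, the ratio stays strictly below one at every sample point.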
Exercise 8.16 1. Show that under the assumptions of the Schwarz lemma we have
$|f'(0)| \le 1$ and equality is attained if and only if $f(z) = e^{i\alpha}z$.
2. Let $f \in O(U)$, $f : U \to U$ and $f(0) = \dots = f^{(k-1)}(0) = 0$. Show that then
$|f(z)| \le |z|^k$ for all $z \in U$.

8.4 The Riemann Theorem


Any holomorphic one-to-one function defined in a domain $D$ defines a conformal map of
this domain since the above assumptions imply that $f$ has no critical points in $D$. We
have encountered such maps many times before. Here we consider a problem that is more
difficult and important for practical purposes:
Given two domains $D_1$ and $D_2$ find a one-to-one conformal map $f : D_1 \to D_2$ of
one of these domains onto the other.

Definition 8.17 A conformal one-to-one map of a domain D1 onto D2 is said to be


a (conformal) isomorphism, while the domains D1 and D2 that admit such a map are
isomorphic (or conformally equivalent). Isomorphism of a domain onto itself is called a
(conformal) automorphism.

It is easy to see that the set of all automorphisms $\varphi : D \to D$ of a domain $D$ forms a
group that is denoted $\mathrm{Aut}\,D$. The group operation is the composition $\varphi_1 \circ \varphi_2$, the unity
is the identity map and the inverse is the inverse map $z = \varphi^{-1}(w)$.
The richness of the group of automorphisms of a domain allows us to understand the
richness of the family of the conformal maps onto it of a different domain, as may be
seen from the next
Theorem 8.18 Let $f_0 : D_1 \to D_2$ be a fixed isomorphism. Then any other isomorphism
of $D_1$ onto $D_2$ has the form
$$f = \varphi \circ f_0 \tag{8.17}$$
where $\varphi$ is an automorphism of $D_2$.
Proof. First, it is clear that all maps of the form of the right side of (8.17) are isomorphisms
from $D_1$ onto $D_2$. Furthermore, if $f : D_1 \to D_2$ is an arbitrary isomorphism
then $\varphi = f \circ f_0^{-1}$ is a conformal map of $D_2$ onto itself, that is, an automorphism of $D_2$.
Then (8.17) follows. □
In the sequel we will only consider simply connected domains $D$. We will distinguish
three special domains that we will call canonical: the closed plane $\overline{\mathbb{C}}$, the open plane $\mathbb{C}$
and the unit disk $\{|z| < 1\}$. We have previously found the group of all fractional-linear
automorphisms of those domains. In fact, the following theorem holds.
Theorem 8.19 Any conformal automorphism of a canonical domain is a fractional-
linear transformation.
Proof. Let $\varphi$ be an automorphism of $\overline{\mathbb{C}}$. There exists a unique point $z_0$ that is mapped
to infinity. Therefore $\varphi$ is holomorphic everywhere in $\overline{\mathbb{C}}$ except at $z_0$ where it has a
pole. This pole has multiplicity one since in a neighborhood of a pole of higher order
the function $\varphi$ could not be one-to-one. Therefore, since the only singularities of $\varphi$ are
poles, $\varphi$ is a rational function. Since it has only one simple pole, $\varphi$ should be of the form
$\varphi(z) = \frac{A}{z - z_0} + B$ if $z_0 \neq \infty$ and $\varphi(z) = Az + B$ if $z_0 = \infty$. The case of the open
complex plane $\mathbb{C}$ is similar.
Let $\varphi$ be an arbitrary automorphism of the unit disk $U$. Let us denote $w_0 = \varphi(0)$
and consider a fractional linear transformation
$$\lambda : w \to \frac{w - w_0}{1 - \bar w_0 w}$$
of the disk $U$ that maps $w_0$ into 0. The composition $f = \lambda \circ \varphi$ is also an automorphism
of $U$ so that $f(0) = 0$. Moreover, $|f(z)| < 1$ for all $z \in U$. Therefore the Schwarz
lemma implies that $|f(z)| \le |z|$ for all $z \in U$. However, the inverse map $z = f^{-1}(w)$
also satisfies the assumptions of the Schwarz lemma and hence $|f^{-1}(w)| \le |w|$ for all
$w \in U$, which in turn implies that $|z| \le |f(z)|$ for all $z \in U$. Thus $|f(z)| = |z|$ for all $z \in U$
so that the Schwarz lemma implies that $f(z) = e^{i\alpha}z$. Then $\varphi = \lambda^{-1} \circ f = \lambda^{-1}(e^{i\alpha}z)$ is
also a fractional-linear transformation. □
Taking into account our results on the Möbius transformations we obtain the com-
plete description of all conformal automorphisms of the canonical domains.
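(I) The closed complex plane:
$$\mathrm{Aut}\,\overline{\mathbb{C}} = \left\{z \to \frac{az + b}{cz + d},\ ad - bc \neq 0\right\}. \tag{8.18}$$
(II) The open plane:
$$\mathrm{Aut}\,\mathbb{C} = \{z \to az + b,\ a \neq 0\}. \tag{8.19}$$
(III) The unit disk:
$$\mathrm{Aut}\,U = \left\{z \to e^{i\alpha}\frac{z - a}{1 - \bar a z},\ |a| < 1,\ \alpha \in \mathbb{R}\right\}. \tag{8.20}$$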
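
The description (8.20) of the automorphisms of the unit disk can be verified numerically. In the sketch below the helper names `disk_automorphism` and `inverse_automorphism`, and the particular values of $a$ and $\alpha$, are our own illustrative assumptions. We check that the map sends $a$ to zero, preserves the unit circle, maps an interior point into $U$, and is inverted by a map of the same fractional-linear type.

```python
import cmath
import math

def disk_automorphism(a, alpha):
    """Return z -> e^{i alpha} (z - a) / (1 - conj(a) z), assuming |a| < 1."""
    def phi(z):
        return cmath.exp(1j * alpha) * (z - a) / (1 - a.conjugate() * z)
    return phi

def inverse_automorphism(a, alpha):
    """Inverse map: undo the rotation, then apply the disk map with -a."""
    def psi(w):
        u = cmath.exp(-1j * alpha) * w
        return (u + a) / (1 + a.conjugate() * u)
    return psi

a, alpha = 0.3 - 0.4j, 1.1   # illustrative parameters
phi = disk_automorphism(a, alpha)
psi = inverse_automorphism(a, alpha)

boundary = [cmath.exp(2j * math.pi * k / 360) for k in range(360)]
boundary_error = max(abs(abs(phi(z)) - 1) for z in boundary)

z = 0.2 + 0.6j
round_trip_error = abs(psi(phi(z)) - z)
print(abs(phi(a)), boundary_error, abs(phi(z)), round_trip_error)
```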
It is easy to see that different canonical domains are not isomorphic to each other.
Indeed, the closed complex plane $\overline{\mathbb{C}}$ is not even homeomorphic to $\mathbb{C}$ and $U$ and hence
it may not be mapped conformally onto these domains. The domains $\mathbb{C}$ and $U$ are
homeomorphic but there is no conformal map of $\mathbb{C}$ onto $U$ since such a map would have
to be realized by an entire function $f$ such that $|f(z)| < 1$, which then has to be equal to
a constant by the Liouville theorem.

A domain that has no boundary (boundary is an empty set) coincides with $\overline{\mathbb{C}}$. Domains
with boundary that consists of one point are the plane $\overline{\mathbb{C}}$ without a point, which
are clearly conformally equivalent to $\mathbb{C}$ (even by a fractional linear transformation).
The main result of this section is the Riemann theorem that asserts that any simply
connected domain $D$ with a boundary that contains more than one point (and hence
infinitely many points since the boundary of a simply connected domain is connected) is
conformally equivalent to the unit disk $U$.
This theorem will be presented later while at the moment we prove the uniqueness
theorem for conformal maps.
Theorem 8.20 If a domain $D$ is conformally equivalent to the unit disk $U$ then the set
of all conformal maps of $D$ onto $U$ depends on three real parameters. In particular there
exists a unique conformal map $f$ of $D$ onto $U$ normalized by
$$f(z_0) = 0, \qquad \arg f'(z_0) = \theta, \tag{8.21}$$
where $z_0$ is an arbitrary point of $D$ and $\theta$ is an arbitrary real number.
Proof. The first statement follows from Theorem 8.18 since the group $\mathrm{Aut}\,U$ depends
on three real parameters: two coordinates of the point $a$ and the number $\alpha$ in (8.20).
In order to prove the second statement let us assume that there exist two maps
$f_1$ and $f_2$ of the domain $D$ onto $U$ normalized as in (8.21). Then $\varphi = f_1 \circ f_2^{-1}$ is an
automorphism of $U$ such that $\varphi(0) = 0$ and $\arg \varphi'(0) = 0$. Expression (8.20) implies
that then $a = 0$ and $\alpha = 0$, that is, $\varphi(z) = z$ and $f_1 = f_2$. □
In order to prove the Riemann
theorem we need to develop some methods that are useful in other areas of
complex analysis.

The compactness principle


Definition 8.21 A family $\{f\}$ of functions defined in a domain $D$ is locally uniformly
bounded if for any domain $K$ properly contained in $D$ there exists a constant $M = M(K)$
such that
$$|f(z)| \le M \quad \text{for all } z \in K \text{ and all } f \in \{f\}. \tag{8.22}$$
A family $\{f\}$ is locally equicontinuous if for any $\varepsilon > 0$ and any domain $K$ properly
contained in $D$ there exists $\delta = \delta(\varepsilon, K)$ so that
$$|f(z') - f(z'')| < \varepsilon \tag{8.23}$$
for all $z', z'' \in K$ so that $|z' - z''| < \delta$ and all $f \in \{f\}$.
Theorem 8.22 If a family {f } of holomorphic functions in a domain D is locally uni-
formly bounded then it is locally equicontinuous.
Proof. Let $K$ be a domain properly contained in $D$. Let us denote by $2\rho$ the distance
between the closed sets $\bar K$ and $\partial D$, and let
$$K^{(\rho)} = \bigcup_{z_0 \in K}\{z : |z - z_0| < \rho\}$$
be a $\rho$-enlargement of $K$. The set $K^{(\rho)}$ is properly contained in $D$ and thus there exists a
constant $M$ so that $|f(z)| \le M$ for all $z \in K^{(\rho)}$ and $f \in \{f\}$. Let $z'$ and $z''$ be arbitrary
points in $K$ so that $|z' - z''| < \rho$. The disk $U_\rho = \{z : |z - z'| < \rho\}$ is contained in $K^{(\rho)}$
and hence $|f(z) - f(z')| \le 2M$ for all $z \in U_\rho$. The mapping $\zeta = \frac{1}{\rho}(z - z')$ maps $U_\rho$ onto
the disk $|\zeta| < 1$ and the function
$$g(\zeta) = \frac{1}{2M}\{f(z' + \zeta\rho) - f(z')\}$$
satisfies the assumptions of the Schwarz lemma.
This lemma implies that $|g(\zeta)| \le |\zeta|$ for all $\zeta$, $|\zeta| < 1$, which means
$$|f(z) - f(z')| \le \frac{2M}{\rho}|z - z'| \quad \text{for all } z \in U_\rho. \tag{8.24}$$
Given $\varepsilon > 0$ we choose $\delta = \min\left(\rho, \frac{\varepsilon\rho}{2M}\right)$ and obtain from (8.24) that $|f(z') - f(z'')| < \varepsilon$
for all $f \in \{f\}$ provided that $|z' - z''| < \delta$. □
Definition 8.23 A family of functions $\{f\}$ defined in a domain $D$ is compact in $D$
if any sequence $f_n$ of functions of this family has a subsequence $f_{n_k}$ that converges
uniformly on any domain $K$ properly contained in $D$.

Theorem 8.24 (Montel$^{18}$) If a family of functions $\{f\}$ holomorphic in a domain $D$ is
locally uniformly bounded then it is compact in $D$.

Proof. (a) We first show that if a sequence $f_n \in \{f\}$ converges at every point of an
everywhere dense set $E \subset D$ then it converges uniformly on every compact subset $K$ of
$D$. We fix $\varepsilon > 0$ and the set $K$. Using equicontinuity of the family $\{f\}$ (Theorem 8.22) we may choose a
partition of $D$ into squares with sides parallel to the coordinate axes and so small
that for any two points $z', z'' \in K$ that belong to the same square and any $f \in \{f\}$ we
have
$$|f(z') - f(z'')| < \frac{\varepsilon}{3}. \tag{8.25}$$
The set $K$ is covered by a finite number of such squares $q_p$, $p = 1, \dots, P$. Each $q_p$
contains a point $z_p \in E$ since the set $E$ is dense in $D$. Moreover, since the sequence $\{f_n\}$
converges on $E$ there exists $N$ so that
$$|f_m(z_p) - f_n(z_p)| < \frac{\varepsilon}{3} \tag{8.26}$$
for all $m, n > N$ and all $z_p$, $p = 1, \dots, P$.
Let now $z$ be an arbitrary point in $K$. Then there exists a point $z_p$ that belongs to
the same square as $z$. We have for all $m, n > N$:

$$|f_m(z) - f_n(z)| \le |f_m(z) - f_m(z_p)| + |f_m(z_p) - f_n(z_p)| + |f_n(z_p) - f_n(z)| < \varepsilon$$

due to (8.25) and (8.26). The Cauchy criterion implies that the sequence $\{f_n(z)\}$ converges
for all $z \in K$ and the convergence is uniform on $K$.
(b) Let us show now that any sequence $\{f_n\}$ has a subsequence that converges at
every point of a dense subset $E$ of $D$. We choose $E$ as the set of points $z = x + iy \in D$ with both
coordinates $x$ and $y$ rational numbers. This set is clearly countable and dense in $D$; let
$E = \{z_\nu\}_{\nu=1}^{\infty}$.
The sequence $f_n(z_1)$ is bounded and hence it has a converging subsequence; we denote the
corresponding functions by $f_k^1 = f_{n_k}$, $k = 1, 2, \dots$. The sequence $f_n^1(z_2)$ is also bounded so we may extract its
converging subsequence $f_k^2 = f_{n_k}^1$, $k = 1, 2, \dots$. The sequence $f_n^2$ converges at least at the points
$z_1$ and $z_2$. Then we extract a subsequence $f_k^3 = f_{n_k}^2$ of the sequence $f_n^2$ so that
$f_n^3$ converges at least at $z_1$, $z_2$ and $z_3$. We may continue this procedure indefinitely. It
remains to choose the diagonal sequence
$$f_1^1, f_2^2, \dots, f_n^n, \dots$$
This sequence converges at any point $z_p \in E$ since by construction all its entries after
index $p$ belong to the subsequence $f_n^p$ that converges at $z_p$.
Parts (a) and (b) together imply the statement of the theorem. □

18 Paul Montel (1876-1975) was a French mathematician.
The Montel theorem is often called the compactness principle.
Exercise 8.25 Show that any sequence $\{f_n\}$ of functions holomorphic in a domain $D$
with $\mathrm{Re}\, f_n \ge 0$ everywhere in $D$ has a subsequence that converges locally uniformly
either to a holomorphic function or to infinity.
Definition 8.26 A functional $J$ of a family $\{f\}$ of functions defined in a domain $D$
is a mapping $J : \{f\} \to \mathbb{C}$, that is, $J(f)$ is a complex number. A functional $J$ is
continuous if given any sequence of functions $f_n \in \{f\}$ that converges uniformly to a
function $f_0 \in \{f\}$ on any compact set $K \subset D$ we have
$$\lim_{n\to\infty} J(f_n) = J(f_0).$$

Example 8.27 Let $O(D)$ be the family of all functions $f$ holomorphic in $D$ and let $a$
be an arbitrary point in $D$. Consider the $p$-th coefficient of the Taylor series in $a$:
$$c_p(f) = \frac{f^{(p)}(a)}{p!}.$$
This is a functional on the family $O(D)$. Let us show that it is continuous. If $f_n \to f_0$
uniformly on every compact set $K \subset D$, we may take $K$ to be the circle $\gamma = \{|z - a| = r\} \subset D$.
Then, given any $\varepsilon > 0$ we may find $N$ so that $|f_n(z) - f_0(z)| < \varepsilon$ for all $n > N$
and all $z \in \gamma$. The Cauchy formula for $c_p$
$$c_p(f) = \frac{1}{2\pi i}\int_\gamma \frac{f(z)}{(z - a)^{p+1}}\,dz$$
implies that
$$|c_p(f_n) - c_p(f_0)| \le \frac{\varepsilon}{r^p}$$
for all $n > N$ which in turn implies the continuity of the functional $c_p(f)$.
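
The Cauchy formula for $c_p$ and the resulting stability estimate can be tried out numerically. In the sketch below the function $e^z$, the perturbation $\varepsilon\sin z$, and the helper `taylor_coefficient` are illustrative assumptions of ours. The integral over the circle $\{|z - a| = r\}$ is replaced by an equally spaced sum in the angle.

```python
import cmath
import math

def taylor_coefficient(f, p, a=0j, r=1.0, samples=256):
    """Approximate c_p(f) = (1/(2 pi i)) * integral of f(z)/(z-a)^(p+1) over |z-a| = r."""
    total = 0j
    for k in range(samples):
        t = 2 * math.pi * k / samples
        # with z = a + r e^{it} the integrand reduces to f(z) e^{-ipt} / r^p
        total += f(a + r * cmath.exp(1j * t)) * cmath.exp(-1j * p * t)
    return total / (samples * r ** p)

c3 = taylor_coefficient(cmath.exp, 3)   # Taylor coefficient of e^z, i.e. 1/3!

# A small perturbation changes c_p by at most sup|difference| / r^p
eps = 1e-3
perturbed = lambda z: cmath.exp(z) + eps * cmath.sin(z)
circle = [cmath.exp(2j * math.pi * k / 256) for k in range(256)]
sup_diff = max(abs(perturbed(z) - cmath.exp(z)) for z in circle)
coefficient_change = abs(taylor_coefficient(perturbed, 3) - c3)
print(c3, coefficient_change, sup_diff)
```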

Definition 8.28 A compact family of functions {f } is sequentially compact if the limit
of any sequence fn that converges uniformly on every compact subset K ⊂ D belongs to
the family {f }.
Theorem 8.29 Any functional $J$ that is continuous on a sequentially compact family
$\{f\}$ is bounded and attains its least upper bound. That is, there exists a function $f_0 \in \{f\}$
so that we have
$$|J(f_0)| \ge |J(f)|$$
for all $f \in \{f\}$.
Proof. We let $A = \sup_{f \in \{f\}}|J(f)|$ (this number may be equal to infinity). By definition
of the supremum, there exists a sequence $f_n \in \{f\}$ so that $|J(f_n)| \to A$. Since $\{f\}$ is a
sequentially compact family there exists a subsequence $f_{n_k}$ that converges to a function
$f_0 \in \{f\}$. Continuity of the functional $J$ implies that
$$|J(f_0)| = \lim_{k\to\infty}|J(f_{n_k})| = A.$$

This means, first, that $A < \infty$ and, second, that $|J(f_0)| \ge |J(f)|$ for all $f \in \{f\}$. □
We will consider below families of univalent functions in a domain D. The following
theorem is useful to establish sequential compactness of such families.
Theorem 8.30 (Hurwitz$^{19}$) Let a sequence of functions $f_n$ holomorphic in a domain $D$
converge uniformly on any compact subset $K$ of $D$ to a function $f \neq \mathrm{const}$. If $f(z_0) = 0$,
then given any disk $U_r = \{|z - z_0| < r\}$ there exists $N$ so that all functions $f_n$ vanish at
some point in $U_r$ when $n > N$.
Proof. The Weierstrass theorem implies that $f$ is holomorphic in $D$. The uniqueness
theorem implies that there exists a punctured disk $\{0 < |z - z_0| \le \rho\} \subset D$ where $f \neq 0$
(we may assume that $\rho < r$). We denote $\gamma = \{|z - z_0| = \rho\}$ and $\mu = \min_{z\in\gamma}|f(z)|$, and
observe that $\mu > 0$. However, $f_n$ converges uniformly to $f$ on $\gamma$ and hence there exists
$N$ so that
$$|f_n(z) - f(z)| < \mu$$
for all $z \in \gamma$ and all $n > N$. The Rouché theorem implies that for such $n$ the function
$f_n = f + (f_n - f)$ has as many zeros (with multiplicities) as $f$ inside $\gamma$, that is, $f_n$ has
at least one zero inside $U_\rho$. □
Corollary 8.31 If a sequence of holomorphic and univalent functions fn in a domain
D converges uniformly on every compact subset K of D then the limit function f is
either a constant or univalent.
Proof. Assume that $f(z_1) = f(z_2)$ but $z_1 \neq z_2$, $z_{1,2} \in D$ and $f \not\equiv \mathrm{const}$. Consider a
sequence of functions $g_n(z) = f_n(z) - f_n(z_2)$ and a disk $\{|z - z_1| < r\}$ with $r < |z_1 - z_2|$.
The limit function $g(z) = f(z) - f(z_2)$ vanishes at the point $z_1$. Hence according to the Hurwitz theorem
all functions $g_n$ starting with some $N$ vanish in this disk, that is, $f_n$ takes the value $f_n(z_2)$
at a point of the disk, which does not contain $z_2$. This, however, contradicts the
assumption that $f_n(z)$ are univalent. □

19 Adolf Hurwitz (1859-1919) was a German mathematician, a student of Weierstrass.

The Riemann theorem
Theorem 8.32 Any simply connected domain D with a boundary that contains more
than one point is conformally equivalent to the unit disk U .

Proof. The idea of the proof is as follows. Consider the family $S$ of holomorphic and
univalent functions $f$ in $D$ bounded by one in absolute value, that is, those that map
$D$ into the unit disk $U$. We fix a point $a \in D$ and look for a function $f$ that maximizes
the dilation coefficient $|f'(a)|$ at the point $a$. Restricting ourselves to a sequentially
compact subset $S_1$ of $S$ and using continuity of the functional $J(f) = |f'(a)|$ we may
find a function $f_0$ with the maximal dilation at the point $a$. Finally we check that $f_0$
maps $D$ onto $U$ and not just into $U$ as other functions in $S$.
Such a variational method, when one looks for a function that realizes the extremum
of a functional, is often used in analysis.
(i) Let us show that there exists a holomorphic univalent function in $D$ that is
bounded by one in absolute value. By assumption the boundary $\partial D$ contains at least
two points $\alpha$ and $\beta$. The square root $\sqrt{\frac{z-\alpha}{z-\beta}}$ admits two branches $\varphi_1$ and $\varphi_2$ that differ
by a sign. Each one of them is univalent in $D$$^{20}$ since the equality $\varphi_\nu(z_1) = \varphi_\nu(z_2)$ ($\nu = 1$
or 2) implies
$$\frac{z_1 - \alpha}{z_1 - \beta} = \frac{z_2 - \alpha}{z_2 - \beta} \tag{8.27}$$
which implies $z_1 = z_2$ since fractional linear transformations are univalent. The two
branches $\varphi_1$ and $\varphi_2$ map $D$ onto domains $D_1^* = \varphi_1(D)$ and $D_2^* = \varphi_2(D)$ that have
no overlap. Otherwise there would exist two points $z_{1,2} \in D$ so that $\varphi_1(z_1) = \varphi_2(z_2)$
which would in turn imply (8.27) so that $z_1 = z_2$ and then $\varphi_1(z_1) = -\varphi_2(z_1) = -\varphi_2(z_2)$. This is a
contradiction since $\varphi_\nu(z) \neq 0$ in $D$.
The domain $D_2^*$ contains a disk $\{|w - w_0| < \rho\}$. Hence $\varphi_1$ does not take values in
this disk. Therefore the function
$$f_1(z) = \frac{\rho}{\varphi_1(z) - w_0} \tag{8.28}$$
is clearly holomorphic and univalent in $D$ and takes values inside the unit disk: we have
$|f_1(z)| \le 1$ for all $z \in D$.
(ii) Let us denote by $S$ the family of functions that are holomorphic and univalent in
$D$, and are bounded by one in absolute value. This family is not empty since it contains
the function $f_1$. It is compact by the Montel theorem. The subset $S_1$ of the family $S$
that consists of all functions $f \in S$ such that

$$|f'(a)| \ge |f_1'(a)| > 0 \tag{8.29}$$

at some fixed point $a \in D$ is sequentially compact. Indeed Corollary 8.31 implies that
the limit of any sequence of functions $f_n \in S_1$ that converges uniformly on any compact subset $K$
of $D$ may be only a univalent function (and hence belong to $S_1$) or be a constant but
the latter case is ruled out by (8.29).
Consider the functional $J(f) = |f'(a)|$ defined on $S_1$. It is a continuous functional
as was shown in Example 8.27. Therefore there exists a function $f_0 \in S_1$ that attains its
maximum, that is, such that
$$|f'(a)| \le |f_0'(a)| \tag{8.30}$$
for all $f \in S_1$, and hence, in view of (8.29), for all $f \in S$.

20 In general we may define a univalent branch of $\sqrt{\frac{z-a}{z-b}}$ in a domain $D$ if neither $a$ nor $b$ is in $D$.
(iii) The function $f_0 \in S_1$ maps $D$ conformally into the unit disk $U$. Let us show
that $f_0(a) = 0$. Otherwise, the function

$$g(z) = \frac{f_0(z) - f_0(a)}{1 - \overline{f_0(a)}\,f_0(z)}$$

would belong to $S_1$ and have

$$|g'(a)| = \frac{1}{1 - |f_0(a)|^2}\,|f_0'(a)| > |f_0'(a)|,$$

contrary to the extremum property (8.30) of the function $f_0$.


Finally, let us show that $f_0$ maps $D$ onto $U$. Indeed, let $f_0$ omit some value $b \in U$.
Then $b \neq 0$ since $f_0(a) = 0$. However, the value $b^* = 1/\bar b$ is also not taken by $f_0$ in $D$
since $|b^*| > 1$. Therefore one may define in $D$ a single valued branch of the square root
$$\psi(z) = \sqrt{\frac{f_0(z) - b}{1 - \bar b f_0(z)}} \tag{8.31}$$

that also belongs to $S$: it is univalent for the same reason as the square root in part
(i), and $|\psi(z)| \le 1$. However, then the function

$$h(z) = \frac{\psi(z) - \psi(a)}{1 - \overline{\psi(a)}\,\psi(z)}$$

also belongs to $S$. We have $|h'(a)| = \frac{1 + |b|}{2\sqrt{|b|}}\,|f_0'(a)|$. However, $1 + |b| > 2\sqrt{|b|}$ since
$|b| < 1$ and thus $h \in S_1$ and $|h'(a)| > |f_0'(a)|$ contrary to the extremal property of $f_0$. □
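
The gain factor $\frac{1+|b|}{2\sqrt{|b|}}$ in the last step depends only on the omitted value $b$, so it can be checked without knowing $f_0$: writing $h = H \circ f_0$ with $f_0(a) = 0$, it suffices to differentiate $H$ at $w = 0$. The sketch below does this with a central difference. The value of $b$, the helper `disk_map` and the use of the principal branch of the square root (which is continuous near $w = 0$ for this particular $b$) are our own assumptions.

```python
import cmath

def disk_map(c):
    """Automorphism w -> (w - c) / (1 - conj(c) w) of the unit disk."""
    return lambda w: (w - c) / (1 - c.conjugate() * w)

b = 0.3 + 0.4j                      # hypothetical omitted value, |b| = 0.5
T_b = disk_map(b)
psi = lambda w: cmath.sqrt(T_b(w))  # principal branch; fine near w = 0 for this b
T_c = disk_map(psi(0j))             # moves psi(a) back to the origin
H = lambda w: T_c(psi(w))           # h = H o f0, with f0(a) = 0

step = 1e-6
gain = abs((H(step) - H(-step)) / (2 * step))   # |H'(0)| = |h'(a)| / |f0'(a)|
expected = (1 + abs(b)) / (2 * abs(b) ** 0.5)
print(gain, expected)
```

The computed gain exceeds one, which is exactly what produces the contradiction with the extremal property of $f_0$.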
The Riemann theorem implies that any two simply connected domains $D_1$ and $D_2$
with boundaries that contain more than one point are conformally equivalent. Indeed,
as we have shown there exist conformal isomorphisms $f_j : D_j \to U$ of these domains
onto the unit disk. Then $f = f_2^{-1} \circ f_1$ is a conformal isomorphism between $D_1$ and $D_2$.
Theorem 8.20 implies that an isomorphism $f : D_1 \to D_2$ is uniquely determined by a
normalization
$$f(z_0) = w_0, \qquad \arg f'(z_0) = \theta, \tag{8.32}$$
where $z_0 \in D_1$, $w_0 \in D_2$ and $\theta$ is a real number.

9 Elliptic functions
We will be following here the books by Stein and Shakarchi and by Ahlfors, and the
Whittaker-Watson classic.

The fundamental parallelogram


An elliptic function is a doubly-periodic meromorphic function. More precisely, there
exist two complex numbers ω1 and ω2 so that

f (z + ω1 ) = f (z), and f (z + ω2 ) = f (z), (9.1)

for all z ∈ C. In order for this definition to be interesting, ω1 and ω2 should be linearly
independent over R. Otherwise, as the reader can check, if the ratio ω1/ω2 is a (real)
rational number, condition (9.1) simply means that the function f(z) is periodic with a
single period. On the other hand, if the ratio ω1/ω2 is an irrational real number, the
function f has to be identically equal to a constant.
The periods of a doubly-periodic function f form a module over the integers: if ω is a
period, then nω is a period for any n ∈ Z, and if ω1 and ω2 are periods, then ω1 + ω2 is
also a period. It is easy to see that unless f is constant, the module M of its periods
consists of discrete points, and we have the following.

Proposition 9.1 A discrete module of complex numbers consists either of ω = 0 alone,


of the integral multiples nω, n ∈ Z of a given complex number ω, or of all linear
combinations n1 ω1 + n2 ω2 , with n1 , n2 ∈ Z of two complex numbers ω1 , ω2 with a non-
real ratio ω1 /ω2 .

Proof. If M contains a non-zero number, it has to contain a number ω1 that has the
smallest absolute value among all non-zero ω ∈ M. Assume that M also contains a number
that is not an integer multiple of ω1, and choose among all such numbers one, ω2, with
the smallest absolute value. The ratio ω2/ω1 is not real: otherwise there would be an
integer n with n < ω2/ω1 < n + 1, and the element ω2 − nω1 of M would satisfy
0 < |ω2 − nω1| < |ω1|, contradicting the choice of ω1. We claim that any ω ∈ M has the
form ω = n1ω1 + n2ω2 with n1, n2 ∈ Z. Indeed, as ω1 and ω2 are linearly independent
over R, any ω can be written as
\[ \omega = \lambda_1 \omega_1 + \lambda_2 \omega_2, \]
with λ1, λ2 ∈ R. Choose integers n1 and n2 so that |λ1 − n1| ≤ 1/2 and |λ2 − n2| ≤ 1/2.
As ω is in M, so is
\[ \omega' = \omega - n_1 \omega_1 - n_2 \omega_2. \]
It follows that
\[ |\omega'| < \tfrac{1}{2}|\omega_1| + \tfrac{1}{2}|\omega_2| \le |\omega_2|. \]
The first inequality above is strict since ω1 and ω2 are linearly independent over R. It
follows from the way ω2 was chosen that ω′ is an integer multiple of ω1, and hence ω has
the required form.
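The rounding step of the proof can be tried numerically. The following is a minimal sketch, assuming sample generators `w1`, `w2` chosen here for illustration (they are not from the text); the helper names `decompose` and `remainder` are hypothetical.

```python
def decompose(omega, w1, w2):
    """Return real (lam1, lam2) with omega = lam1*w1 + lam2*w2.

    Assumes w2/w1 is not real, so that w1, w2 form a basis of C over R.
    The 2x2 real linear system is solved by Cramer's rule.
    """
    det = w1.real * w2.imag - w1.imag * w2.real
    lam1 = (omega.real * w2.imag - omega.imag * w2.real) / det
    lam2 = (w1.real * omega.imag - w1.imag * omega.real) / det
    return lam1, lam2


def remainder(omega, w1, w2):
    """Subtract the lattice point n1*w1 + n2*w2 with n_j nearest to lam_j."""
    lam1, lam2 = decompose(omega, w1, w2)
    n1, n2 = round(lam1), round(lam2)
    return omega - n1 * w1 - n2 * w2, (n1, n2)


# Sample generators (an assumption made for this illustration).
w1, w2 = 1 + 0j, 0.3 + 1.2j

# A lattice point is reduced to zero, recovering its integer coefficients.
rest, coeffs = remainder(7 * w1 - 4 * w2, w1, w2)
print(coeffs, abs(rest))

# For an arbitrary point the remainder obeys the bound used in the proof.
r, _ = remainder(2.37 - 5.11j, w1, w2)
print(abs(r) <= 0.5 * abs(w1) + 0.5 * abs(w2))
```

For an arbitrary complex number the remainder need not vanish; the proof uses that for ω ∈ M the remainder is again in M and is too short to be anything but an integer multiple of ω1.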

Note that simply by defining F(z) = f(ω1 z) we obtain a function F(z) that satisfies
F(z + 1) = F(z), and any other period ω of F satisfies |ω| ≥ 1. It will often be convenient
to adopt this normalization, and we will denote τ = ω2/ω1. The lattice of periods in
this case is given by
Λ = {n + mτ : n, m ∈ Z}.
Its fundamental parallelogram is

P0 = {z ∈ C : z = a + bτ, with 0 ≤ a < 1 and 0 ≤ b < 1}.

The behavior of f in all of C is completely determined by its behavior in P0. Indeed,
we say that z ∼ w (z is congruent to w) if z − w ∈ Λ. We claim that for each z ∈ C
there exists a unique w ∈ P0 such that z ∼ w. It is constructed as in the proof of
Proposition 9.1: choose a, b ∈ R so that z = a + bτ, and let n, m be the greatest integers
that are less than or equal to a and b, respectively. Then w = a − n + (b − m)τ is congruent
to z and lies in P0. For uniqueness, assume that z ∼ w = a + bτ and z ∼ w′ = a′ + b′τ,
with both w and w′ in P0. Then w ∼ w′, so we have
\[ w - w' = n + m\tau, \]
with integer n and m, but we also have
\[ w - w' = a - a' + (b - b')\tau, \]
with 0 ≤ a, a′ < 1 and 0 ≤ b, b′ < 1. It follows that n = a − a′ = 0 and m = b − b′ = 0.
Therefore, the values of f everywhere in C are uniquely determined by its values on P0.
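The construction of the representative in P0 is easy to carry out in code. Below is a minimal sketch; the function name `reduce_mod_lattice` and the sample values of `tau` and `z` are assumptions made for illustration only.

```python
import math


def reduce_mod_lattice(z, tau):
    """Return the point of P0 = {a + b*tau : 0 <= a, b < 1} congruent to z.

    Assumes Im(tau) > 0. Writing z = a + b*tau with real a, b gives
    Im z = b * Im tau, which determines b and then a.
    """
    b = z.imag / tau.imag
    a = z.real - b * tau.real
    # Subtract the greatest integers not exceeding a and b.
    return (a - math.floor(a)) + (b - math.floor(b)) * tau


tau = 0.25 + 1.5j          # sample value (assumption)
z = 3.7 - 2.2j             # sample point (assumption)

w = reduce_mod_lattice(z, tau)
# Shifting z by a lattice vector should not change the representative.
w_shifted = reduce_mod_lattice(z + 5 - 3 * tau, tau)
print(w, abs(w - w_shifted))
```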

The basic properties of the elliptic functions


A consequence of the above observation is Liouville’s theorem for elliptic functions.
Theorem 9.2 An entire elliptic function is a constant.
Proof. If f is an entire elliptic function then f is bounded on the closure of P0, which
is compact, hence it is bounded on all of C, hence it is a constant.
Therefore, a non-constant elliptic function must have poles in the fundamental
parallelogram. As usual, we will count them with multiplicities. As an elliptic function is
meromorphic (its only singularities are poles), it can have only finitely many poles inside
P0. They can lie on the boundary of P0, but since there are only finitely many poles and
zeros in P0, we may always consider a shift F(z) = f(z + a), so that the shifted function
F(z) has neither zeros nor poles on the boundary of P0.
Theorem 9.3 The sum of residues of an elliptic function is zero.
Proof. We may assume without loss of generality that there are no poles on the
boundary of P0. Then, the sum of the residues of f inside P0 is
\[ \sum \operatorname{Res} f = \frac{1}{2\pi i} \int_{\partial P_0} f\, dz, \]
with ∂P0 traced counterclockwise, and the sum taken over all poles of f in P0. However,
due to periodicity of f, the integrals over the opposite sides cancel each other, hence
\[ \sum \operatorname{Res} f = 0, \]
and we are done.

Corollary 9.4 An elliptic function has at least two poles in P0 (counting with multi-
plicity).

Indeed, f has to have at least one pole in P0, and satisfy
\[ \sum \operatorname{Res} f = 0. \]
It follows that it cannot have a single pole of order one.


We say that an elliptic function has order m if it has m poles in P0 (counted with
their multiplicity).

Theorem 9.5 An elliptic function of order m has m zeros (with multiplicities).

Proof. Without loss of generality we may assume that f has neither poles nor zeros
on the boundary of P0. The argument principle implies that
\[ N_z - N_p = \frac{1}{2\pi i} \int_{\partial P_0} \frac{f'(z)}{f(z)}\, dz. \]
Here Np and Nz are the numbers of poles and zeros of f inside P0, respectively. Periodicity
of f (and hence of f′/f) once again implies that the integrals over the opposite sides of
the fundamental parallelogram cancel each other, whence Np = Nz.

Theorem 9.6 The zeros a1 , . . . , an and the poles b1 , . . . , bn of an elliptic function satisfy

a1 + · · · + an = b 1 + · · · + b n ,

modulo the period lattice.

Proof. Once again, without loss of generality we may assume that f has neither
poles nor zeros on the boundary of P0. Consider the integral
\[ \frac{1}{2\pi i} \int_{\partial P_0} \frac{z f'(z)}{f(z)}\, dz. \]
By the residue theorem it is given by (recall that the residue of the function f′(z)/f(z)
at a zero ak is the multiplicity of ak, and its residue at a pole bk is minus the order
of bk):
\[ \frac{1}{2\pi i} \int_{\partial P_0} \frac{z f'(z)}{f(z)}\, dz = a_1 + \dots + a_n - b_1 - \dots - b_n. \]
Let us now inspect the integral on each side of P0 , and without loss of generality we
assume that P0 has the form (that is, f (z) has basic periods 1 and τ )

P0 = {z ∈ C : z = a + bτ, with 0 ≤ a < 1 and 0 ≤ b < 1},

and its four sides are (0, 1), (1, 1 + τ), (1 + τ, τ), (τ, 0). Let us look at
\[ \frac{1}{2\pi i} \left( \int_0^1 - \int_\tau^{1+\tau} \right) \frac{z f'(z)}{f(z)}\, dz
   = -\frac{\tau}{2\pi i} \int_0^1 \frac{f'(z)}{f(z)}\, dz. \]
Except for the factor of −τ, the right side is the winding number around the origin of the
curve traced by f(z) as z varies from z = 0 to z = 1. As f(0) = f(1), this curve is closed,
hence the winding number is an integer. A similar argument applies to
\[ \frac{1}{2\pi i} \left( \int_1^{1+\tau} - \int_0^{\tau} \right) \frac{z f'(z)}{f(z)}\, dz
   = \frac{1}{2\pi i} \int_1^{1+\tau} \frac{f'(z)}{f(z)}\, dz, \]
which is again an integer. It follows that
\[ a_1 + \dots + a_n - b_1 - \dots - b_n = m_1 + m_2 \tau, \]
with some integers m1 and m2, and the proof is complete.

The Weierstrass ℘ function


We have shown that no elliptic function can have one simple pole, and the next simplest
step would be to construct an elliptic function that has one double pole. If we were to
construct a 1-periodic function, not a doubly periodic function, a natural choice would
be
\[ F(z) = \sum_{n=-\infty}^{\infty} \frac{1}{z + n}, \]
which is 1-periodic by inspection, and has a pole at every integer. The problem is that
the series defining F(z) does not converge absolutely. The solution to this would be to
re-write F(z) as
\[ F(z) = \frac{1}{z} + \sum_{n=1}^{\infty} \left( \frac{1}{z + n} + \frac{1}{z - n} \right), \]
which converges absolutely. Another trick would be to write
\[ \frac{1}{z+n} + \frac{1}{z-n} = \left( \frac{1}{z+n} - \frac{1}{n} \right) + \left( \frac{1}{z-n} + \frac{1}{n} \right), \]
which would lead to
\[ F(z) = \frac{1}{z} + \sum_{n=1}^{\infty} \left( \frac{1}{z+n} - \frac{1}{n} \right)
        + \sum_{n=1}^{\infty} \left( \frac{1}{z+(-n)} - \frac{1}{(-n)} \right)
        = \frac{1}{z} + \sum_{n \ne 0} \left( \frac{1}{z+n} - \frac{1}{n} \right). \]
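As a sanity check on the regrouped series, one can compare its symmetric partial sums with π cot(πz), which is the classical closed form of this sum (a standard identity that is not derived in these notes). The sketch below is only an illustration; the function name `F_partial` and the sample point are assumptions.

```python
import cmath


def F_partial(z, N):
    """Symmetric partial sum 1/z + sum_{n=1}^{N} (1/(z+n) + 1/(z-n))."""
    total = 1 / z
    for n in range(1, N + 1):
        total += 1 / (z + n) + 1 / (z - n)
    return total


z = 0.3 + 0.2j                                 # sample non-integer point
N = 100_000
approx = F_partial(z, N)
closed_form = cmath.pi / cmath.tan(cmath.pi * z)   # pi * cot(pi * z)

# The truncation error decays only like 1/N, so agreement is modest.
print(abs(approx - closed_form))
# 1-periodicity holds up to the two boundary terms of the partial sum.
print(abs(F_partial(z + 1, N) - approx))
```

The slow 1/N decay of the error reflects the fact that the paired terms behave like 2z/(z² − n²), which is summable but only just; the double-pole series used for ℘ converges faster.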

The idea is to construct an elliptic function with a double pole at zero following the
second recipe above. Let Λ be the period lattice. If we try to define
\[ \sum_{\omega \in \Lambda} \frac{1}{(z + \omega)^2}, \]
we get a series that does not converge absolutely. The remedy is to consider instead
\[ \wp(z) = \frac{1}{z^2} + \sum_{\omega \in \Lambda^*} \left( \frac{1}{(z + \omega)^2} - \frac{1}{\omega^2} \right). \tag{9.2} \]
Here Λ∗ is the period lattice without the point (0, 0). More explicitly, we may write
\[ \wp(z) = \frac{1}{z^2} + \sum_{(n,m) \ne (0,0)} \left( \frac{1}{(z + n + m\tau)^2} - \frac{1}{(n + m\tau)^2} \right). \tag{9.3} \]

Our task is now to show that (1) the series defining the function ℘(z) converges, (2)
the function ℘(z) is meromorphic and has a double pole at all ω ∈ Λ, and only at those
points, and (3) that ℘(z) is doubly periodic with the periods ω1 = 1 and ω2 = τ.
In order to see that the series converges, consider a disk {|z| ≤ R} and notice that
for any z with |z| < R we can write
\[ \wp(z) = \frac{1}{z^2}
   + \sum_{0 < |\omega| \le 2R} \left( \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right)
   + \sum_{|\omega| > 2R} \left( \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right). \tag{9.4} \]
The first sum above has finitely many terms and is meromorphic in {|z| < R}: it has
double poles at the lattice points ω with |ω| < R. The second sum is holomorphic in
{|z| < R}: there exists C > 0 so that for any |z| < R, |ω| > 2R we have
\[ \left| \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right|
   = \left| \frac{z^2 + 2z\omega}{(z+\omega)^2 \omega^2} \right| \le \frac{CR}{|\omega|^3}, \]
and we have (see Stein-Shakarchi for a detailed proof of convergence of this series):
\[ \sum_{\omega \in \Lambda^*} \frac{1}{|\omega|^3} < +\infty. \]
Therefore, indeed, ℘(z) is well-defined and has double poles exactly at the points z ∈ Λ.
Let us now check that ℘(z) is doubly-periodic. This is done in an indirect way. The
derivative ℘′(z) is
\[ \wp'(z) = -2 \sum_{\omega \in \Lambda} \frac{1}{(z - \omega)^3}. \tag{9.5} \]
Note that this series converges absolutely and clearly defines a doubly periodic function.
It follows that there exist two constants a and b so that
\[ \wp(z + 1) - \wp(z) = a, \qquad \wp(z + \tau) - \wp(z) = b, \qquad \text{for all } z \in \mathbb{C}. \]
The definition of ℘(z) implies that it is an even function. Then, taking z = −1/2 implies
that a = 0 while taking z = −τ /2 implies that b = 0.
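The evenness and double periodicity of ℘ can be observed numerically from a truncated version of the series (9.3). This is a rough sketch under stated assumptions: the truncation to a square box |n|, |m| ≤ N, the value of `tau` and the test point `z` are all illustrative choices, and the printed defects are truncation errors rather than exact zeros.

```python
def wp(z, tau, N=60):
    """Truncated series (9.3): lattice points n + m*tau with |n|, |m| <= N."""
    total = 1 / z**2
    for n in range(-N, N + 1):
        for m in range(-N, N + 1):
            if n == 0 and m == 0:
                continue                  # the omega = 0 term is 1/z^2 above
            w = n + m * tau
            total += 1 / (z + w)**2 - 1 / w**2
    return total


tau = 0.3 + 1.1j                          # sample value with Im tau > 0
z = 0.31 + 0.17j                          # sample point away from the lattice

even_defect = abs(wp(-z, tau) - wp(z, tau))
period_1_defect = abs(wp(z + 1, tau) - wp(z, tau))
period_tau_defect = abs(wp(z + tau, tau) - wp(z, tau))
print(even_defect, period_1_defect, period_tau_defect)
```

Because the box is symmetric under ω → −ω, the evenness defect is at rounding level, while the periodicity defects shrink as N grows.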
Let us now investigate the derivative ℘′(z). First, as ℘(z) is even, its derivative is
both odd and doubly periodic. It follows that
\[ \wp'(1/2) = -\wp'(-1/2) = -\wp'(-1/2 + 1) = -\wp'(1/2), \]
hence ℘′(1/2) = 0, and similarly, we can show that ℘′(τ/2) = ℘′((1 + τ)/2) = 0. As the
function ℘′(z) is elliptic and has a single triple pole in P0, it has order three, so it
has no other zeros in P0 except for 1/2, τ/2 and (1 + τ)/2. Let us set
\[ e_1 = \wp(1/2), \qquad e_2 = \wp(\tau/2), \qquad e_3 = \wp\left( \frac{1+\tau}{2} \right). \]
The equation ℘(z) = e1 has a double root at z = 1/2, and similarly the equation ℘(z) = e2
has a double root at z = τ/2, while ℘(z) = e3 has a double root at z = (1 + τ)/2. The
numbers e1, e2, e3 have to be distinct, for otherwise, say, the equation ℘(z) = e1 would
have four solutions (with multiplicities), which would imply that the elliptic function
℘(z) − e1 would have four poles in the fundamental domain (also with multiplicity), while
it only has two poles.

Theorem 9.7 The function (℘′)² is a cubic polynomial in ℘:
\[ (\wp'(z))^2 = 4(\wp(z) - e_1)(\wp(z) - e_2)(\wp(z) - e_3). \tag{9.6} \]
Proof. The functions
\[ F(z) = (\wp(z) - e_1)(\wp(z) - e_2)(\wp(z) - e_3) \]
and (℘′(z))² have the same roots in the fundamental domain, with the same multiplicity
two. In addition, they both have poles of order six at the vertices of the lattice Λ. The
function F(z)/(℘′(z))² is, therefore, a holomorphic doubly-periodic function, whence a
constant B. To find B, note that for z close to zero we have
\[ \wp(z) = \frac{1}{z^2} + \dots, \qquad \wp'(z) = -\frac{2}{z^3} + \dots, \]
which means that B = 1/4.
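The differential equation (9.6) can be checked numerically at a sample point. The sketch below truncates the lattice sums to a finite box, so the two sides agree only up to a truncation error; the helper names, `tau`, `z` and the box size are assumptions made for this illustration. The derivative is summed with (z + ω)³, which equals (9.5) because the lattice is symmetric under ω → −ω.

```python
def lattice_points(tau, N):
    """Non-zero lattice points n + m*tau with |n|, |m| <= N."""
    return [n + m * tau
            for n in range(-N, N + 1)
            for m in range(-N, N + 1)
            if (n, m) != (0, 0)]


def wp(z, pts):
    """Truncated Weierstrass function, following (9.2)."""
    return 1 / z**2 + sum(1 / (z + w)**2 - 1 / w**2 for w in pts)


def wp_prime(z, pts):
    """Truncated derivative: -2 * sum over the whole lattice of 1/(z+w)^3."""
    return -2 / z**3 - 2 * sum(1 / (z + w)**3 for w in pts)


tau = 0.3 + 1.1j                      # sample value with Im tau > 0
pts = lattice_points(tau, 60)
z = 0.31 + 0.17j                      # sample point away from the lattice

e1, e2, e3 = wp(0.5, pts), wp(tau / 2, pts), wp((1 + tau) / 2, pts)
lhs = wp_prime(z, pts) ** 2
rhs = 4 * (wp(z, pts) - e1) * (wp(z, pts) - e2) * (wp(z, pts) - e3)
relative_defect = abs(lhs - rhs) / abs(lhs)
print(relative_defect)
```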
As a consequence, we have the expression for ℘(z) as the inverse of an elliptic integral:
\[ z - z_0 = \int_{\wp(z_0)}^{\wp(z)} \frac{dw}{\sqrt{4(w - e_1)(w - e_2)(w - e_3)}}. \]

Proposition 9.8 Every even elliptic function with periods 1 and τ is a rational function
of ℘(z).

Proof. First, if F(z) has a zero or a pole on the lattice Λ, we may consider F1(z) =
F(z)℘^m(z), with an integer m chosen so that F1(z) has no pole or zero on the lattice. The
function F is even, hence if F(a) = 0 then F(−a) = 0. Moreover, as in the argument
for ℘′(z), a zero a has an even order if a is a half-period. The same is true for
℘(z) − ℘(a): it has a zero of even order at a if and only if a is a half-period. Therefore,
if a1, −a1, a2, −a2, . . . , an, −an are the zeros of F, listed with multiplicity, then,
since the even function ℘(z) − ℘(ak) vanishes at both ak and −ak, the product
\[ (\wp(z) - \wp(a_1))(\wp(z) - \wp(a_2)) \cdots (\wp(z) - \wp(a_n)) \]
has exactly the same zeros as F, with the same multiplicities. The same argument
applies to the poles b1, −b1, . . . , bn, −bn. As a consequence, the ratio F/G, with
\[ G(z) = \frac{(\wp(z) - \wp(a_1))(\wp(z) - \wp(a_2)) \cdots (\wp(z) - \wp(a_n))}
               {(\wp(z) - \wp(b_1))(\wp(z) - \wp(b_2)) \cdots (\wp(z) - \wp(b_n))}, \]
is an elliptic function without poles, hence a constant.

Theorem 9.9 Every elliptic function is a rational function of ℘ and ℘0 .

Proof. Recall that ℘′(z) is an odd function. Given an elliptic function f(z) we write
it as f(z) = h(z) + g(z) with an even elliptic function h and an odd elliptic function g.
It follows from the previous proposition that h is a rational function of ℘, but also that
the (even) ratio g/℘′ is a rational function of ℘.
We now get the addition rule for the Weierstrass function. Consider the equations
\[ \wp'(z) = A\wp(z) + B, \qquad \wp'(y) = A\wp(y) + B, \]
which determine A and B as functions of z and y (unless ℘(z) = ℘(y), that is, unless
z = ±y modulo the period lattice). Next, consider the function
\[ \wp'(\zeta) - A\wp(\zeta) - B \]
as a function of ζ. It is an elliptic function that has a triple pole at ζ = 0, hence it has
exactly three zeros. As the sum of all zeros equals the sum of all poles (modulo the
period lattice), the sum of all zeros equals zero. Two of the zeros are z and y, hence
the third zero is equal to −z − y (modulo the period lattice), that is:
\[ \wp'(-z - y) = A\wp(-z - y) + B. \]
It follows (using the fact that ℘ is even and ℘′ is odd) that the following determinant
vanishes:
\[ \begin{vmatrix}
   \wp(z) & \wp'(z) & 1 \\
   \wp(y) & \wp'(y) & 1 \\
   \wp(z+y) & -\wp'(z+y) & 1
   \end{vmatrix} = 0. \]
The derivatives that appear above can be expressed as functions of ℘(z), ℘(y) and
℘(z + y) using the differential equation (9.6). This expresses ℘(z + y) algebraically in
terms of ℘(z) and ℘(y).
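The vanishing of the determinant can be observed numerically. The following sketch uses truncated lattice sums, so the determinant is small compared with its individual terms rather than exactly zero; all names and sample values (`tau`, `z`, `y`, the box size) are assumptions for illustration.

```python
def lattice_points(tau, N):
    """Non-zero lattice points n + m*tau with |n|, |m| <= N."""
    return [n + m * tau
            for n in range(-N, N + 1)
            for m in range(-N, N + 1)
            if (n, m) != (0, 0)]


def wp(z, pts):
    """Truncated Weierstrass function."""
    return 1 / z**2 + sum(1 / (z + w)**2 - 1 / w**2 for w in pts)


def wp_prime(z, pts):
    """Truncated derivative of the Weierstrass function."""
    return -2 / z**3 - 2 * sum(1 / (z + w)**3 for w in pts)


def det3_with_scale(rows):
    """Return the 3x3 determinant and the sum of |terms| in its expansion."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    terms = [a * e * i, b * f * g, c * d * h, -c * e * g, -b * d * i, -a * f * h]
    return sum(terms), sum(abs(t) for t in terms)


tau = 0.3 + 1.1j                      # sample value with Im tau > 0
pts = lattice_points(tau, 60)
z, y = 0.31 + 0.17j, 0.12 + 0.4j      # sample points with z != ±y mod lattice

rows = [
    (wp(z, pts), wp_prime(z, pts), 1),
    (wp(y, pts), wp_prime(y, pts), 1),
    (wp(z + y, pts), -wp_prime(z + y, pts), 1),
]
det, scale = det3_with_scale(rows)
print(abs(det) / scale)
```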

10 The Theta functions
The Jacobi theta function
The Jacobi theta function is defined as
\[ \Theta(z|\tau) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n z}. \tag{10.1} \]
Here z ∈ C and Im τ > 0, so that the series converges. Note that when τ = it with t > 0,
the function Θ(z|it) becomes the heat kernel
\[ \Theta(z|it) = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 t} e^{2\pi i n z}, \]
providing a link between complex analysis, number theory and PDEs.
Our first goal is to obtain a different expression for Θ(z|τ ), known as the product
formula.
Theorem 10.1 For all z ∈ C and τ ∈ H = {Im τ > 0} we have
\[ \Theta(z|\tau) = \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n-1} e^{-2\pi i z}), \tag{10.2} \]
where q = e^{πiτ}.
This will be done in several steps. First, we establish some basic properties of Θ(z|τ ).
Proposition 10.2 The function Θ(z|τ ) satisfies the following properties:
(i) Θ is entire in z ∈ C and holomorphic in τ ∈ H.
(ii) Θ(z + 1|τ ) = Θ(z|τ ),
(iii) Θ(z + τ|τ) = Θ(z|τ) e^{−πiτ} e^{−2πiz}.
(iv) Θ(z|τ ) = 0 whenever z = 1/2 + τ /2 + n + mτ and n, m ∈ Z.
Proof. To prove the first claim, note that if Im τ = t ≥ t0 > 0, and |z| ≤ R, then
\[ \left| e^{2\pi i n z} e^{\pi i n^2 \tau} \right| \le e^{2\pi |n| R - \pi n^2 t_0}, \]
hence the series defining Θ(z|τ) converges absolutely and uniformly on the sets
{|z| ≤ R, Im τ ≥ t0}. Therefore, Θ(z|τ) is holomorphic both in z and τ on these sets, and
the conclusion of part (i) follows. The property (ii) is simply a consequence of the
1-periodicity of the exponentials e^{2πinz}, n ∈ Z. To prove (iii) we need to compute:
\[ \begin{aligned}
\Theta(z + \tau|\tau) &= \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n (z + \tau)}
   = \sum_{n=-\infty}^{\infty} e^{\pi i (n^2 + 2n) \tau} e^{2\pi i n z} \\
 &= e^{-\pi i \tau} \sum_{n=-\infty}^{\infty} e^{\pi i (n+1)^2 \tau} e^{2\pi i n z}
   = e^{-\pi i \tau} e^{-2\pi i z}\, \Theta(z|\tau).
\end{aligned} \]
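Finally, property (iv) will follow from (ii) and (iii) if we show that
\[ \Theta\left( \tfrac{1+\tau}{2} \,\Big|\, \tau \right) = 0. \tag{10.3} \]
This is seen as follows: we have
\[ \begin{aligned}
\Theta\left( \tfrac{1}{2} + \tfrac{\tau}{2} \,\Big|\, \tau \right)
 &= \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n (1/2 + \tau/2)}
  = \sum_{n=-\infty}^{\infty} (-1)^n e^{\pi i (n^2 + n) \tau} \\
 &= \sum_{n \ge 0} (-1)^n e^{\pi i (n^2 + n) \tau} + \sum_{n=-\infty}^{-1} (-1)^n e^{\pi i (n^2 + n) \tau} \\
 &= \sum_{n \ge 0} (-1)^n e^{\pi i (n^2 + n) \tau} + \sum_{n=-\infty}^{0} (-1)^{n-1} e^{\pi i ((n-1)^2 + n - 1) \tau} \\
 &= \sum_{n \ge 0} (-1)^n e^{\pi i (n^2 + n) \tau} + \sum_{n \ge 0} (-1)^{n-1} e^{\pi i ((n+1)^2 - n - 1) \tau} = 0,
\end{aligned} \]
and we are done.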
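Properties (ii)-(iv) are easy to observe numerically, since the series (10.1) converges very fast. The sketch below is an illustration only: the truncation level `N` and the sample values of `tau` and `z` are assumptions.

```python
import cmath


def theta(z, tau, N=12):
    """Truncated series (10.1): sum over |n| <= N."""
    return sum(cmath.exp(cmath.pi * 1j * (n * n * tau + 2 * n * z))
               for n in range(-N, N + 1))


tau = 0.2 + 0.9j                      # sample value with Im tau > 0
z = 0.3 + 0.4j                        # sample point

# (ii): 1-periodicity in z.
defect_ii = abs(theta(z + 1, tau) - theta(z, tau))

# (iii): quasi-periodicity under z -> z + tau.
factor = cmath.exp(-cmath.pi * 1j * tau) * cmath.exp(-2j * cmath.pi * z)
defect_iii = abs(theta(z + tau, tau) - factor * theta(z, tau))

# (iv): zero at the center of the fundamental parallelogram.
value_at_center = abs(theta((1 + tau) / 2, tau))

print(defect_ii, defect_iii, value_at_center)
```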


The next step in the proof of Theorem 10.1 is to show that the right side of (10.2)
satisfies exactly the same properties as Θ(z|τ). Let us set
\[ \Pi(z|\tau) = \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n-1} e^{-2\pi i z}), \tag{10.4} \]
with q = e^{πiτ}.
Proposition 10.3 The function Π(z|τ ) satisfies the following properties:
(i) Π is entire in z ∈ C and holomorphic in τ ∈ H.
(ii) Π(z + 1|τ ) = Π(z|τ ),
(iii) Π(z + τ|τ) = Π(z|τ) e^{−πiτ} e^{−2πiz}.
(iv) Π(z|τ ) = 0 whenever z = 1/2 + τ /2 + n + mτ and n, m ∈ Z. Moreover, these are
simple zeros of Π(z|τ ), and Π(z|τ ) has no other zeros.
Proof. First, if t = Im τ ≥ t0 > 0 and |z| ≤ R, then |q| ≤ e^{−πt0}, and
\[ \left| (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n-1} e^{-2\pi i z}) - 1 \right| \le C |q|^{2n-1} e^{2\pi R}, \]
hence the infinite product converges and defines a function that is entire in z and
holomorphic for τ ∈ H. Furthermore, 1-periodicity of Π(z|τ) in z follows immediately from
its definition. In order to prove (iii) we compute, using q² = e^{2πiτ}:
\[ \begin{aligned}
\Pi(z + \tau|\tau) &= \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i (z+\tau)})(1 + q^{2n-1} e^{-2\pi i (z+\tau)}) \\
 &= \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n+1} e^{2\pi i z})(1 + q^{2n-3} e^{-2\pi i z})
  = \Pi(z|\tau)\, \frac{1 + q^{-1} e^{-2\pi i z}}{1 + q e^{2\pi i z}} \\
 &= \Pi(z|\tau)\, q^{-1} e^{-2\pi i z},
\end{aligned} \]
which proves (iii). In order to see (iv) recall that an infinite product vanishes if and
only if one of the factors vanishes. As |q| < 1 for τ ∈ H, the only possibility for Π(z|τ)
to vanish is that
\[ 1 + q^{2n-1} e^{2\pi i z} = 0, \]
or
\[ 1 + q^{2n-1} e^{-2\pi i z} = 0. \]
Since q = e^{πiτ}, these can be re-written as
\[ (2n-1)\tau + 2z = 1 \pmod 2, \quad \text{or} \quad (2n-1)\tau - 2z = 1 \pmod 2, \]
that is,
\[ z = \frac{1}{2} + \frac{\tau}{2} - n\tau \pmod 1, \quad \text{or} \quad z = -\frac{1}{2} - \frac{\tau}{2} + n\tau \pmod 1, \]
which is the set of zeros claimed in (iv). It is easy to check that they are all simple
zeros.
Now, we can prove Theorem 10.1. Let us fix τ ∈ H. Consider the ratio F(z) =
Θ(z|τ)/Π(z|τ). According to the last two propositions, this function is entire and is
doubly periodic, with periods 1 and τ. It follows that F(z) is a constant, which we will
denote by c(τ):
\[ \Theta(z|\tau) = c(\tau)\, \Pi(z|\tau). \tag{10.5} \]
Our goal is to show that c(τ) ≡ 1. Taking z = 1/2 in (10.5) (hence e^{2πiz} = e^{−2πiz} = −1)
gives
\[ \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} (-1)^n = c(\tau) \prod_{n=1}^{\infty} (1 - q^{2n})(1 - q^{2n-1})(1 - q^{2n-1}), \]
or
\[ \sum_{n=-\infty}^{\infty} (-1)^n q^{n^2}
   = c(\tau) \prod_{n=1}^{\infty} (1 - q^{2n})(1 - q^{2n-1})(1 - q^{2n-1})
   = c(\tau) \prod_{n=1}^{\infty} (1 - q^{n})(1 - q^{2n-1}). \]
The last identity is obtained simply by noticing that 2n − 1 runs over all odd positive
integers, and 2n over all even ones. It follows that
\[ c(\tau) = \frac{\sum_{n=-\infty}^{\infty} (-1)^n q^{n^2}}{\prod_{n=1}^{\infty} (1 - q^{n})(1 - q^{2n-1})}. \tag{10.6} \]

Next, we take z = 1/4 in (10.5), with e^{2πiz} = i. The terms with odd n cancel in pairs
n, −n, so that
\[ \Theta(1/4|\tau) = \sum_{n=-\infty}^{\infty} q^{n^2} i^n = \sum_{n \text{ even}} q^{n^2} i^n = \sum_{n=-\infty}^{\infty} q^{4n^2} (-1)^n. \]
And we also have
\[ \begin{aligned}
\Pi(1/4|\tau) &= \prod_{n=1}^{\infty} (1 - q^{2n})(1 + i q^{2n-1})(1 - i q^{2n-1})
   = \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{4n-2}) \\
 &= \prod_{n=1}^{\infty} (1 - q^{4n})(1 - q^{4n-2})(1 + q^{4n-2})
   = \prod_{n=1}^{\infty} (1 - q^{4n})(1 - q^{8n-4}).
\end{aligned} \]
We conclude that
\[ c(\tau) = \frac{\sum_{n=-\infty}^{\infty} (-1)^n q^{4n^2}}{\prod_{n=1}^{\infty} (1 - q^{4n})(1 - q^{8n-4})}. \tag{10.7} \]
Comparing (10.6) and (10.7) and recalling that q(4τ) = q(τ)⁴, we conclude that
\[ c(\tau) = c(4\tau), \tag{10.8} \]
and, as a consequence, c(τ) = c(4^k τ) for all k ∈ Z. We also have q(4^k τ) → 0 as k → +∞,
for any τ ∈ H. As the right side of (10.7) tends to 1 when q → 0, it follows that
c(τ) = 1, and the proof of Theorem 10.1 is complete.
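The product formula (10.2) can be compared with the series (10.1) at a sample point. This is a minimal numerical sketch; the truncation levels and the sample values of `tau` and `z` are assumptions made for illustration.

```python
import cmath


def theta_series(z, tau, N=12):
    """Truncated series (10.1)."""
    return sum(cmath.exp(cmath.pi * 1j * (n * n * tau + 2 * n * z))
               for n in range(-N, N + 1))


def theta_product(z, tau, N=30):
    """Truncated product (10.4) with q = exp(pi*i*tau)."""
    q = cmath.exp(cmath.pi * 1j * tau)
    w = cmath.exp(2j * cmath.pi * z)
    prod = 1
    for n in range(1, N + 1):
        prod *= (1 - q**(2 * n)) * (1 + q**(2 * n - 1) * w) * (1 + q**(2 * n - 1) / w)
    return prod


tau = 0.2 + 0.9j                      # sample value with Im tau > 0
z = 0.3 + 0.4j                        # sample point

c = theta_series(z, tau) / theta_product(z, tau)   # should be close to 1
print(abs(c - 1))
```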
Theorem 10.1 also shows a link between the Θ function and the Weierstrass ℘ func-
tion.
Corollary 10.4 For each τ ∈ H fixed, the function
\[ (\log \Theta(z|\tau))'' = \frac{\Theta(z|\tau)\, \Theta''(z|\tau) - (\Theta'(z|\tau))^2}{\Theta(z|\tau)^2} \tag{10.9} \]
is an elliptic function of order 2 with periods 1 and τ, and with a double pole at z =
1/2 + τ/2.
Proof. Let
\[ F(z) = (\log \Theta(z|\tau))' = \frac{\Theta'(z|\tau)}{\Theta(z|\tau)}, \]
then Proposition 10.2 implies that F(z + 1) = F(z), and
\[ F(z + \tau) = F(z) - 2\pi i, \]
so that F′(z) = (log Θ(z|τ))′′ is doubly periodic. Since Θ(z|τ) vanishes only at
z = 1/2 + τ/2 in the fundamental parallelogram, and that zero is simple, the function F(z)
has a single simple pole at this point, meaning that F′(z) has a double pole there.
As a consequence of this corollary, (log Θ(z|τ))′′ = −℘(z − 1/2 − τ/2) + bτ, with some
constant bτ that can be computed in terms of the values of the first several terms of the
Taylor series for Θ(z|τ) at z = 1/2 + τ/2.

Generating functions and partitions of integers


We will now study some applications of the theta functions to number theory. An
important tool in this study is the generating function: given a sequence Fn we may
often infer its properties from the behavior of the function
\[ F(x) = \sum_{n=0}^{\infty} F_n x^n. \]
A simple example of this comes from the Fibonacci sequence defined by F0 = 0, F1 = 1,
and
\[ F_n = F_{n-1} + F_{n-2}, \qquad n \ge 2. \]
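It is easy to check that the function
\[ F(x) = \sum_{n=0}^{\infty} F_n x^n \]
satisfies
\[ F(x) = x^2 F(x) + x F(x) + x, \]
so that
\[ F(x) = \frac{x}{1 - x - x^2} = \frac{A}{1 - \alpha x} + \frac{B}{1 - \beta x}, \]
where
\[ \alpha = \frac{1 + \sqrt{5}}{2}, \qquad \beta = \frac{1 - \sqrt{5}}{2}, \qquad A = \frac{1}{\alpha - \beta}, \qquad B = \frac{1}{\beta - \alpha}. \]
It follows that the general formula for the Fibonacci sequence is
\[ F_n = A\alpha^n + B\beta^n. \]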
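The closed formula can be checked against the recurrence. The sketch below uses the initial values 0 and 1, which are the ones encoded by the generating function x/(1 − x − x²); the function names are chosen here for illustration.

```python
from math import sqrt


def fib_recurrence(count):
    """First `count` terms of the sequence with initial values 0, 1."""
    seq = [0, 1]
    while len(seq) < count:
        seq.append(seq[-1] + seq[-2])
    return seq[:count]


alpha = (1 + sqrt(5)) / 2
beta = (1 - sqrt(5)) / 2
A = 1 / (alpha - beta)
B = 1 / (beta - alpha)


def fib_closed(n):
    """Closed form A*alpha^n + B*beta^n (floating point)."""
    return A * alpha**n + B * beta**n


terms = fib_recurrence(30)
agree = all(round(fib_closed(n)) == f for n, f in enumerate(terms))
print(terms[:10], agree)
```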

We will now use this method to find the formula for the partition function defined
as follows: for an integer n ≥ 1 we denote by p(n) the number of ways n can be written as
a sum of positive integers, where the order of the summands is not taken into account.
For example, p(1) = 1, p(2) = 2 and p(3) = 3 because
\[ 3 = 3 + 0 = 2 + 1 = 1 + 1 + 1, \]
while p(4) = 5, because
\[ 4 = 4 + 0 = 3 + 1 = 2 + 2 = 1 + 1 + 1 + 1 = 1 + 1 + 2, \]
and so on. By convention, we set p(0) = 1.

Theorem 10.5 (Euler) If |x| < 1 then
\[ \sum_{n=0}^{\infty} p(n) x^n = \prod_{k=1}^{\infty} \frac{1}{1 - x^k}. \tag{10.10} \]
Note that
\[ \frac{1}{1 - x^k} = 1 + O(x^k), \]
hence the product in the right side of (10.10) converges. Moreover, as each factor can be
written as
\[ \frac{1}{1 - x^k} = \sum_{m=0}^{\infty} x^{km}, \]
when we multiply them out, the coefficient in front of x^n is equal exactly to p(n) and
Euler’s formula follows.
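Multiplying out the product in (10.10) is a finite computation once the series are truncated at a fixed degree. A minimal sketch (the function name is chosen here for illustration; the constant term of the product, 1, is what the code stores as the coefficient of x⁰):

```python
def partition_numbers(limit):
    """Coefficients of prod_{k>=1} 1/(1 - x^k), truncated at degree `limit`."""
    p = [1] + [0] * limit                 # start from the empty product, 1
    for k in range(1, limit + 1):
        # Multiplying by 1/(1 - x^k) = 1 + x^k + x^{2k} + ... in place:
        # the ascending loop lets the part k be used any number of times.
        for n in range(k, limit + 1):
            p[n] += p[n - k]
    return p


p = partition_numbers(10)
print(p)
```

Factors with k larger than `limit` cannot contribute to the retained coefficients, which is why the loop over k stops there.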

Next, let po(n) be the number of partitions of n into odd parts, and pu(n) be the
number of partitions of n into unequal parts. An argument similar to the proof of
Theorem 10.5 shows that the generating function for po(n) is
\[ \sum_{n=0}^{\infty} p_o(n) x^n = \prod_{n=1}^{\infty} \frac{1}{1 - x^{2n-1}}, \]
while the generating function for pu(n) is
\[ \sum_{n=0}^{\infty} p_u(n) x^n = \prod_{n=1}^{\infty} (1 + x^n). \]
It is easy to see that these two generating functions are the same:
\[ \prod_{n=1}^{\infty} \frac{1}{1 - x^{2n-1}} = \prod_{n=1}^{\infty} (1 + x^n). \tag{10.11} \]
This is because
\[ \prod_{n=1}^{\infty} (1 + x^n) \prod_{n=1}^{\infty} (1 - x^n) = \prod_{n=1}^{\infty} (1 - x^{2n}), \]
and
\[ \prod_{n=1}^{\infty} (1 - x^{2n}) \prod_{n=1}^{\infty} (1 - x^{2n-1}) = \prod_{n=1}^{\infty} (1 - x^n). \]
Hence, we have proved

Proposition 10.6 Let po(n) be the number of partitions of n into odd parts, and pu(n)
be the number of partitions of n into unequal parts, then po(n) = pu(n).
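The equality of the two counts can be observed by expanding both products up to a fixed degree. A small sketch, with function names chosen here for illustration:

```python
def odd_part_counts(limit):
    """Coefficients of prod over odd k of 1/(1 - x^k), up to degree `limit`."""
    c = [1] + [0] * limit
    for k in range(1, limit + 1, 2):
        for n in range(k, limit + 1):         # ascending: part k may repeat
            c[n] += c[n - k]
    return c


def distinct_part_counts(limit):
    """Coefficients of prod_{k>=1} (1 + x^k), up to degree `limit`."""
    c = [1] + [0] * limit
    for k in range(1, limit + 1):
        for n in range(limit, k - 1, -1):     # descending: part k used at most once
            c[n] += c[n - k]
    return c


odd = odd_part_counts(40)
distinct = distinct_part_counts(40)
print(odd[:11], odd == distinct)
```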
Next we show the following. An integer n is pentagonal if it is of the form n =
k(3k − 1)/2, with k ∈ Z∗, or, equivalently, of the form n = k(3k + 1)/2, k ∈ Z∗.
Theorem 10.7 (Euler) Let pe,u(n) be the number of partitions of n into an even number
of unequal parts, and po,u(n) be the number of partitions of n into an odd number of
unequal parts. Then pe,u(n) = po,u(n) unless n is a pentagonal number, in which case
\[ p_{e,u}(n) - p_{o,u}(n) = (-1)^k, \quad \text{if } n = k(3k+1)/2. \tag{10.12} \]
Proof. The proof is based on two key observations. First, we have, for |x| < 1:
\[ \prod_{n=1}^{\infty} (1 - x^n) = 1 + \sum_{n=1}^{\infty} [p_{e,u}(n) - p_{o,u}(n)]\, x^n, \tag{10.13} \]
and
\[ \prod_{n=1}^{\infty} (1 - x^n) = \sum_{k=-\infty}^{\infty} (-1)^k x^{k(3k+1)/2}. \tag{10.14} \]
Expression (10.13) follows from multiplying out the terms in the product in the left side:
this gives monomials of the form
\[ (-1)^r x^{n_1 + \dots + n_r}, \]
where all nk, k = 1, . . . , r are different. Therefore, each partition of n into an even
number of unequal parts contributes +1 to the coefficient in front of x^n, while each
partition of n into an odd number of unequal parts contributes −1 to this coefficient,
leading to the right side of (10.13).
Next, we prove (10.14). Set x = e^{2πiu}, with u ∈ H, then we can write
\[ \prod_{n=1}^{\infty} (1 - x^n) = \prod_{n=1}^{\infty} (1 - e^{2\pi i n u}), \]
and further, with q = e^{3πiu} and z = 1/2 + u/2 we can re-write this as
\[ \begin{aligned}
\prod_{n=1}^{\infty} (1 - x^n) &= \prod_{n=1}^{\infty} (1 - e^{2\pi i n u})
   = \prod_{n=1}^{\infty} (1 - e^{2\pi i\, 3n u})(1 - e^{2\pi i (3n-1) u})(1 - e^{2\pi i (3n-2) u}) \\
 &= \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n-1} e^{-2\pi i z})
   = \Theta(z|\tau = 3u) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n z} \\
 &= \sum_{n=-\infty}^{\infty} e^{\pi i n^2 3u} e^{2\pi i n (1/2 + u/2)}
   = \sum_{n=-\infty}^{\infty} (-1)^n e^{\pi i (3n^2 + n) u}
   = \sum_{n=-\infty}^{\infty} (-1)^n x^{n(3n+1)/2},
\end{aligned} \]
which is nothing but (10.14).
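Identity (10.14) can be checked coefficient by coefficient up to a fixed degree. A minimal sketch, with function names chosen here for illustration:

```python
def euler_product_coeffs(limit):
    """Coefficients of prod_{n>=1} (1 - x^n), up to degree `limit`."""
    c = [1] + [0] * limit
    for k in range(1, limit + 1):
        for n in range(limit, k - 1, -1):   # descending: each factor used once
            c[n] -= c[n - k]
    return c


def pentagonal_series_coeffs(limit):
    """Coefficients of sum over k in Z of (-1)^k x^{k(3k+1)/2}, up to `limit`."""
    c = [0] * (limit + 1)
    for k in range(-limit, limit + 1):
        e = k * (3 * k + 1) // 2            # always a non-negative integer
        if e <= limit:
            c[e] += (-1) ** abs(k)
    return c


lhs = euler_product_coeffs(60)
rhs = pentagonal_series_coeffs(60)
print(lhs[:16], lhs == rhs)
```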

Writing integers as sums of squares


Integers that can be written as a sum of two squares
We now address the question of which integers can be written as a sum of squares,
starting with just the sum of two squares. We let r2(n) be the number of ways an integer
n can be written as a sum of two squares, including obvious repetitions: for example,
r2(5) = 8 because
\[ 5 = (\pm 1)^2 + (\pm 2)^2 = (\pm 2)^2 + (\pm 1)^2. \]
It is easy to see that r2(3) = 0 and r2(7) = 0, so not all integers can be written as a sum
of two squares.

Theorem 10.8 An integer n can be written as a sum of two squares if and only if every
prime pj of the form 4k + 3 that occurs in its prime number factorization
\[ n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r} \]
has an even exponent aj.
This result will follow from the following. Let d1(n) be the number of divisors of n of
the form 4k + 1 and d3(n) the number of its divisors of the form 4k + 3.
Theorem 10.9 For all n ≥ 1 we have r2(n) = 4(d1(n) − d3(n)).
Theorem 10.9 implies the conclusion of Theorem 10.8 as follows. First, if n is a prime
number of the form n = 4k + 1, then d1(n) = 2, d3(n) = 0, so r2(n) = 8, meaning that
n has a unique decomposition into the sum of two squares up to the trivial repetitions.
Next, if n is a number of the form n = q^a, with a prime number q = 4k + 3, then n
has a + 1 divisors. Furthermore, if a is even then d1(n) = a/2 + 1 and d3(n) = a/2, so
that r2(n) = 4, while if a is odd, then d1(n) = d3(n) = (1 + a)/2, and r2(n) = 0. For a
general number
\[ n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}, \]
a divisor of the form
\[ m = p_1^{b_1} p_2^{b_2} \cdots p_r^{b_r}, \qquad 0 \le b_j \le a_j, \tag{10.15} \]
has the form m = 4k + 1 if and only if p = 2 does not appear in the prime number
factorization of m, and the sum of the exponents bs for all primes ps = 4k + 3 that
appear in (10.15) is even, while an odd divisor m has the form m = 4k + 3 if that sum is
odd. It is easy to see that to have d1(n) > d3(n) we need all aj to be even if pj = 4k + 3.
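The statement r2(n) = 4(d1(n) − d3(n)) can be tested by brute force for small n. The sketch below is only a numerical check, not a proof; the function names are chosen here for illustration.

```python
from math import isqrt


def r2(n):
    """Number of pairs (a, b) of integers with a^2 + b^2 = n."""
    count = 0
    bound = isqrt(n)
    for a in range(-bound, bound + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b * b == rest:
            count += 1 if b == 0 else 2     # b and -b are counted separately
    return count


def r2_from_divisors(n):
    """4 * (d1(n) - d3(n)), with d1, d3 counting divisors = 1, 3 mod 4."""
    d1 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 1)
    d3 = sum(1 for d in range(1, n + 1) if n % d == 0 and d % 4 == 3)
    return 4 * (d1 - d3)


mismatches = [n for n in range(1, 201) if r2(n) != r2_from_divisors(n)]
print(r2(5), r2(3), r2(9), mismatches)
```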
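The proof of Theorem 10.9 is rather long. We will consider the function
\[ \theta(\tau) = \Theta(0|\tau) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} = \sum_{n=-\infty}^{\infty} q^{n^2}, \qquad \tau \in \mathbb{H}, \quad q = e^{\pi i \tau}. \]
It follows that
\[ (\theta(\tau))^2 = \left( \sum_{n_1=-\infty}^{\infty} q^{n_1^2} \right) \left( \sum_{n_2=-\infty}^{\infty} q^{n_2^2} \right)
   = \sum_{(n_1,n_2) \in \mathbb{Z}^2} q^{n_1^2 + n_2^2} = \sum_{n=0}^{\infty} r_2(n)\, q^n, \]
that is, θ(τ)² is the generating function of the sequence r2(n), with r2(0) = 1. Next, we
make the following observation.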
Lemma 10.10 The identity r2(n) = 4(d1(n) − d3(n)), n ≥ 1, is equivalent to the
identities
\[ \theta(\tau)^2 = 2 \sum_{n=-\infty}^{\infty} \frac{1}{q^n + q^{-n}} = 1 + 4 \sum_{n=1}^{\infty} \frac{q^n}{1 + q^{2n}}, \qquad q = e^{\pi i \tau}, \quad \tau \in \mathbb{H}. \tag{10.16} \]
Proof. Both series in (10.16) converge because |q| < 1 and they are equal to each
other because
\[ \frac{1}{q^n + q^{-n}} = \frac{q^{|n|}}{1 + q^{2|n|}}. \]
To prove the equivalence we note that the right side of (10.16) can be written as
\[ 1 + 4 \sum_{n=1}^{\infty} \frac{q^n}{1 + q^{2n}}
   = 1 + 4 \sum_{n=1}^{\infty} \frac{q^n (1 - q^{2n})}{1 - q^{4n}}
   = 1 + 4 \sum_{n=1}^{\infty} \left( \frac{q^n}{1 - q^{4n}} - \frac{q^{3n}}{1 - q^{4n}} \right). \]
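The first term above can be written as
\[ \sum_{n=1}^{\infty} \frac{q^n}{1 - q^{4n}} = \sum_{n=1}^{\infty} \sum_{m=0}^{\infty} q^n q^{4mn} = \sum_{k=1}^{\infty} d_1(k)\, q^k, \]
while the second is
\[ \sum_{n=1}^{\infty} \frac{q^{3n}}{1 - q^{4n}} = \sum_{n=1}^{\infty} \sum_{m=0}^{\infty} q^{3n} q^{4mn} = \sum_{k=1}^{\infty} d_3(k)\, q^k. \]
Hence the right side of (10.16) is the generating function of the sequence
4(d1(k) − d3(k)), k ≥ 1, with the constant term 1, while θ(τ)² is the generating function
of r2(n). Comparing the coefficients finishes the proof of the lemma.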


It is convenient to set
\[ C(\tau) = 2 \sum_{n=-\infty}^{\infty} \frac{1}{q^n + q^{-n}} = \sum_{n=-\infty}^{\infty} \frac{1}{\cos(n\pi\tau)}, \qquad q = e^{\pi i \tau}, \quad \tau \in \mathbb{H}. \]
Lemma 10.10 shows that our main task in the proof of Theorem 10.9 is to prove that
C(τ) = θ²(τ).