Math 212a - Real Analysis: Taught by Yum-Tong Siu Notes by Dongryul Kim Fall 2017
This course was taught by Yum-Tong Siu. The class met on Tuesdays and
Thursdays at 2:30–4pm, and the textbooks used were Real analysis: measure
theory, integration, and Hilbert spaces by Stein and Shakarchi, and Partial
differential equations by Evans. There were weekly problem sets, a timed
take-home midterm, and a take-home final. There were 6 students enrolled, and
detailed lecture notes and problem sets can be found on the course website.
Contents
1 August 31, 2017
1.1 Convergence of the Fourier series
2 September 5, 2017
2.1 Measure of a set
2.2 More properties of the measure
3 September 7, 2017
3.1 Integration
3.2 Egorov’s theorem
3.3 Convergence theorems
10 October 3, 2017
10.1 Riesz representation theorem
10.2 Adjoint operator
11 October 5, 2017
11.1 Approximating by convolutions
18 October 31, 2017
18.1 Smearing out by polynomials
18.2 Differential equations with constant coefficients
19 November 2, 2017
19.1 Distributions
19.2 Tempered distributions
20 November 7, 2017
20.1 Malgrange–Ehrenpreis on unbounded region
21 November 9, 2017
21.1 Division problem for distributions
21.2 Changing the domain of integration
Math 212a Notes 4
is true. Lebesgue answered this question in 1902, using Lebesgue theory.
Actually this didn’t solve the problem entirely even for constant coefficients,
and it remained unsolved until 1955, when Malgrange and Ehrenpreis solved it
for constant coefficients.
This sum in the integral is called the Dirichlet–Dini kernel. We can sum
them as
\[ D_n(x) = \sum_{k=-n}^{n} e^{ikx} = \frac{e^{-inx} - e^{i(n+1)x}}{1 - e^{ix}} = \frac{\sin(n + \frac{1}{2})x}{\sin\frac{1}{2}x}. \]
as the convolution.
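As a quick sanity check on the closed form of the kernel, here is a minimal numerical sketch. The function names `dirichlet_sum` and `dirichlet_closed` are made up for this illustration; the code only compares the direct sum of $e^{ikx}$ over $-n \le k \le n$ with $\sin((n+\frac12)x)/\sin(\frac12 x)$ at a few sample points away from multiples of $2\pi$.

```python
import cmath
import math


def dirichlet_sum(n, x):
    """Direct sum of e^{ikx} for k = -n..n (the imaginary parts cancel)."""
    return sum(cmath.exp(1j * k * x) for k in range(-n, n + 1)).real


def dirichlet_closed(n, x):
    """Closed form sin((n + 1/2)x) / sin(x/2); x must not be a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)


# Compare the two expressions at a few sample points.
for x in (0.3, 1.0, 2.5):
    print(x, dirichlet_sum(5, x), dirichlet_closed(5, x))
```

The two columns should agree up to rounding error; at $x = 0$ the closed form has to be read as the limit $2n + 1$.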
To argue about the limit of $s_n(x)$ as $n \to \infty$, we need to say something
about the commutation of limits. This is not valid for most functions, for
instance characteristic functions of bad sets. Lebesgue had this big idea of
partitioning the range instead of the domain. The base of the rectangles may be
strange, though.
If we change the period from $[0, 2\pi]$ to $[-L, L]$, we can rescale the length and
use $e^{in\pi x/L}$ instead. In the limiting case $L \to \infty$, we will get another theory.
Then
\[ \hat{f}(\xi) = \int_{x=-\infty}^{\infty} f(x) e^{-2\pi i \xi x}\,dx, \qquad f(x) = \int_{\xi=-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i \xi x}\,d\xi. \]
These are called the Fourier transform and the inversion formula.
This still doesn’t solve the differential equations with constant coefficients.
Let me tell you why. Suppose you have a differential equation like
\[ \Bigl( \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2} \Bigr) f = g. \]
and then divide by the polynomial. But this is a problem because there might
be zeros in the polynomial. This was a big problem, and it took a long time to
handle these.
2 September 5, 2017
So we start out with the details. We want to solve differential equations by
eigenfunctions, but there are two problems here. The first is the convergence of
the series like $f = \sum_{n=-\infty}^{\infty} c_n e^{inx}$. The second is the effect of differentiation.
But here, it is easier to do integration, because it will be easier to exchange
sums. So the tool we need here is the fundamental theorem of calculus. For
Riemann integration, we need “uniformity”, but this is too strong and so we will
use Lebesgue theory. We are going to study the size of the set when a certain
property holds or fails.
Proof. The key is that a point disconnects an open set in R. Then any connected
component of O is an open interval. This is because, if x ∈ O, then
is of the form (x, b). If you do the same thing on the other side, the connected
component of x in O is (a, b). Countability follows from the fact that the rational
numbers are dense and countable.
This means that we can write $O = \bigcup_{k \in J} (a_k, b_k)$ as a disjoint union. This
allows us to define its measure as
\[ m(O) = \sum_{k \in J} (b_k - a_k). \]
Lebesgue’s original approach was to also define an inner measure like $m_i(E) =
\sup_{F \subseteq E} m(F)$ and call $E$ measurable if the outer and inner measures coincide.
But we are not doing this. Instead, we define
Definition 2.3. Given $E \subseteq \mathbb{R}$, $E$ is measurable if for each $\epsilon > 0$, there exists
an open set $O \supseteq E$ such that $m^*(O - E) < \epsilon$.
and by countable sub-additivity of the outer measure, the right hand side has
outer measure at most $\epsilon$.
Proving (1) is more complicated, because outer approximation and inner
approximation get mixed up.
Proposition 2.5 (Additivity of outer measure). If $E_1, E_2 \subseteq \mathbb{R}$ have positive
distance between them, i.e., $\inf_{x_j \in E_j} |x_1 - x_2| > 0$, then $m^*(E_1 \cup E_2) = m^*(E_1) +
m^*(E_2)$.
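Proof. Let $\delta$ be the distance. Let $G_j = \{y \in \mathbb{R} : \operatorname{dist}(y, E_j) < \delta/2\}$. This is an
open set by the triangle inequality. Also $G_1$ and $G_2$ are disjoint. Given $\epsilon > 0$,
take an open set $O_\epsilon \supseteq E_1 \cup E_2$ with $m(O_\epsilon) < m^*(E_1 \cup E_2) + \epsilon$; we can assume
that $O_\epsilon \subseteq G_1 \cup G_2$ by replacing it with the intersection.
Let $O_{\epsilon,j} = G_j \cap O_\epsilon$ so that $O_\epsilon$ is the disjoint union of $O_{\epsilon,1}$ and $O_{\epsilon,2}$. Then
\[ m^*(E_1) + m^*(E_2) \le m(O_{\epsilon,1}) + m(O_{\epsilon,2}) = m(O_\epsilon) < m^*(E_1 \cup E_2) + \epsilon. \]
Since $\epsilon$ is arbitrary, we get the inequality on the other side.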
Proposition 2.6. Every closed subset $F \subseteq \mathbb{R}$ is measurable.
Proof. By countable union, we may assume that $F$ is bounded; $F = \bigcup_n F \cap
[-n, n]$. Assume that $F \subseteq (a, b)$. Then the complement of $F$ in $(a, b)$ is open,
so again we can write
\[ (a, b) - F = \bigcup_{j \in J} (a_j, b_j). \]
Now we can approximate $(a_j, b_j)$ from the inside by closed sets, and then take
the complement. Here, we actually need to select a finite number of big intervals.
This and (2) imply (3), and then (1) follows. Thus measurable sets form
a σ-algebra.
Proof. First reduce to the case when each $E_j$ is bounded, by breaking them up
into $E_j \cap [n, n + 1)$. For the other side of the inequality, we approximate $E_j$ from
the inside by closed sets $F_j \subseteq E_j$ so that $m(E_j - F_j) < 2^{-j}\epsilon$. Here the point is that any
two disjoint compact sets have positive distance. So we can use the previous
additivity property to get additivity for the $F_j$.
Proposition 2.9. The measure of a limit of an increasing sequence is the limit
of the measure.
Proof. You just use the fact that $E_n = (E_n - E_{n-1}) \cup (E_{n-1} - E_{n-2}) \cup \cdots$.
Proposition 2.10. The measure of a limit of a decreasing sequence is the limit
of the measure, if the measure of at least one is finite.
This m(E1 ) < ∞ assumption is crucial. Here is a counterexample. If En =
(n, ∞), then E = ∅ and m(En ) = ∞.
3 September 7, 2017
Let me make a remark about Borel measurable sets. Abstract measure theory is
just taking some sets and taking countable unions, intersections, complements.
Borel measurable sets are the sets that can be obtained by these operations
from open sets. There are more Lebesgue measurable sets, namely the sets with
measure zero.
Let me give you two counter-examples. We are going to construct a non-measurable
set $E \subseteq [0, 1]$ such that a countable disjoint union of translates of $E$
contains $[0, 1]$ but is contained in $[-1, 2]$. Consider the equivalence relation $x \sim y$ if and
only if $x - y \in \mathbb{Q}$ on $[0, 1]$, and pick one representative for each class. Then this
set $E$ has the property, with the translates taken by the rationals in $[-1, 1]$.
The second example is the Cantor function, which does not satisfy the fun-
damental theorem of calculus.
3.1 Integration
Let’s look at the definition of Riemann integration. Riemann’s idea was to cut
the interval [a, b] into smaller intervals, and take the limit as the mesh goes to
zero. Here, we require that f is continuous, or almost continuous. We then
approximate the function with the step function
\[ \sum_j a_j \chi_{I_j}. \]
The definition depends on the fact that this step function approaches $f$ uniformly.
Then by definition,
\[ \sum_j a_j m(I_j) = \int \sum_j a_j \chi_{I_j} \to \int f. \]
In Lebesgue’s case, we assume that 0 ≤ f ≤ M and cut the target into pieces
0 = y0 < y1 < · · · < yp = M . Let
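has $m(E - \hat{A}_\epsilon) < \epsilon/2$. This is the “good set”, but we also have the condition
that this has to be closed. Approximate $\hat{A}_\epsilon$ by a “good” closed subset $A_\epsilon \subseteq \hat{A}_\epsilon$
such that $m(\hat{A}_\epsilon - A_\epsilon) < \epsilon/2$.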
All the other theorems deal with the question of whether $\lim_{k \to \infty} \int f_k =
\int f$ if $f_k \to f$.
Before I go on, I want to make some comments on the definition of Lebesgue
integration. So far, we have defined the integral of $0 \le f \le M$ on $E$ with
$m(E) < \infty$. Now we relax the conditions, one by one. Suppose $f \ge 0$ but that
it could be unbounded. What we can do is to truncate $f$ at $n$ so that it becomes
$\min(f, n)$. The integral of this is well-defined and we can then take the limit
as $n \to \infty$.
Next, we would like to remove the condition $m(E) < \infty$. Here we truncate
the domain and replace $E$ with $E \cap [-n, n]$. But we still need the $f \ge 0$ condition,
because otherwise we could get $\infty - \infty$.
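To see the truncation step in a concrete case, here is a small numerical sketch. The example function $f(x) = x^{-1/2}$ on $(0, 1)$ and the helper name `truncated_integral` are assumptions chosen for illustration, not something from the lecture. For this $f$ one can compute by hand that the integral of $\min(f, n)$ is $2 - 1/n$, which increases to $\int_0^1 f = 2$.

```python
def truncated_integral(n, steps=100_000):
    """Midpoint-rule approximation of the integral over (0, 1) of min(f, n),
    where f(x) = x ** -0.5 is unbounded near 0 (illustrative choice)."""
    h = 1.0 / steps
    return sum(min(((i + 0.5) * h) ** -0.5, n) for i in range(steps)) * h


# The truncated integrals increase with n and approach 2.
for n in (1, 2, 4, 8, 16):
    print(n, truncated_integral(n), 2 - 1 / n)
```

The second and third columns should agree closely, and both increase toward 2 as the truncation level grows.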
Definition 3.6. We say that $f$ is integrable if $\int f_+ = \int \max\{0, f\}$ is finite
and $\int f_- = \int \max\{0, -f\}$ is also finite. In this case, we have $f = f_+ - f_-$ and
we define
\[ \int f = \int f_+ - \int f_-. \]
The dominated convergence theorem concerns the case $|f_k| \le g$ where
$g$ is integrable. Here we need absolute continuity, i.e., $\int_A g \to 0$ as $m(A) \to 0$.
Proof of Fatou’s lemma. The proof is actually quite simple. Let $f = \liminf_k f_k$
and $\varphi_n = \inf_{k \ge n} f_k$ so that $\varphi_n \nearrow f$. Here we can use monotone convergence.
On the other hand, $\varphi_n \le f_n$ implies $\int \varphi_n \le \int f_n$. Taking lim inf of both sides
gives the result immediately.
If things are not finite, you can truncate the functions and then take the
limit.
For the first part, the left hand side is only defined by f up to “almost every-
where”. So this is the best you can expect. But then we need to ask when
dF/dx makes sense, almost everywhere.
Dini had the idea of defining 4 derivates. When you take the limit of the
difference quotient
\[ \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h}, \]
you actually have limits from the left and from the right, each with a lim sup and a lim inf.
Under what condition are they the same? Vitali decided to look at the cumulative
effect of the errors. Assume that $f$ is nondecreasing. The “bad” sets we
want to look at are the ones with
\[ \liminf \frac{f(x_0 + h) - f(x_0)}{h} < \limsup \frac{f(x_0 + h) - f(x_0)}{h}. \]
Because this is hard to control, fix rationals α < β and only look at the x0 such
that the left hand side is smaller than α and the right hand side greater than β.
Lemma 4.2. If $E_k \nearrow E$ in $\mathbb{R}$ then $\lim_{k \to \infty} m^*(E_k) = m^*(E)$.
Proof. Approximate $E_k$ by open sets from the outside.
Lemma 4.3 (Vitali covering). Let $E \subseteq \mathbb{R}$ with $m^*(E) < \infty$. For every $x \in E$,
suppose there exists a nonempty set $A_x$ of positive numbers. Then for every $\epsilon > 0$,
there exist $x_1, \ldots, x_k \in E$ and $r_{x_j} \in A_{x_j}$ such that the $(x_j, x_j + r_{x_j})$ are all disjoint
and
\[ m^*\Bigl( E \cap \bigcup_{j=1}^{k} (x_j, x_j + r_{x_j}) \Bigr) \ge m^*(E) - \epsilon. \]
The main difficulty is that we are only allowed to use finitely many intervals.
Proof. Let
\[ E_n = \{x \in E : \sup A_x > 1/n,\ x \in [-n, n]\}. \]
Then $E_n \nearrow E$ and so at sufficiently large $n$, we can replace $E$ by $E_n$ with
error at most $\epsilon$. Now you use the intervals of length at least $1/n$ to cover $E$ almost
entirely.
Let $f$ be nondecreasing on $[a, b]$ with $a < b$ in $\mathbb{R}$. We would like to show that
$D^+ f(x) = D_+ f(x)$ almost everywhere. You need two other similar statements
to get differentiability entirely.
For 0 ≤ α < β rational numbers, let
Now for each $x \in E_\alpha$ there exists an $r_x > 0$ such that $f(x + r_x) - f(x) < \alpha r_x$.
Cover this set $E_\alpha$ by $(x_j, x_j + r_{x_j})$ up to error $\epsilon$. Then the total length of the
$f$-image of the union is going to be at most
\[ \alpha \sum_j r_{x_j}. \]
Now replace [a, b] by [xj , xj + rxj ] for each j to apply Vitali again. Define
We can then cover the intervals $[x_j, x_j + r_{x_j}]$ with $[x_{j,l}, x_{j,l} + s_{x_{j,l}}]$ up to measure
$\epsilon$. It then follows that the total length of the $f$-image of $\{[x_{j,l}, x_{j,l} + s_{x_{j,l}}]\}$ is
at least $\beta$ times the total length. These two inequalities give a contradiction as
$\epsilon \to 0$. This finishes the proof.
We may assume that f > 0 since we can take the positive and negative part,
and write f as the difference of the two. This also means that dF/dx is almost
everywhere defined.
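The difference quotient is, if $\epsilon_n > 0$,
\[ \int_a^b \frac{F(x + \epsilon_n) - F(x)}{\epsilon_n} = \frac{1}{\epsilon_n} \int_a^b F(x + \epsilon_n) - \frac{1}{\epsilon_n} \int_a^b F(x) = \frac{1}{\epsilon_n} \int_b^{b+\epsilon_n} F - \frac{1}{\epsilon_n} \int_a^{a+\epsilon_n} F. \]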
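The identity for the integrated difference quotient (the integral over $[a, b]$ of $(F(x+h) - F(x))/h$ equals the average of $F$ over $[b, b+h]$ minus the average over $[a, a+h]$) can be checked numerically. The sketch below assumes the illustrative choice $F(x) = x^3$, with exact antiderivative `G`; the names `lhs` and `rhs` are made up for this example.

```python
def F(x):
    return x ** 3


def G(x):
    # An exact antiderivative of F, used for the right-hand side.
    return x ** 4 / 4


def lhs(a, b, h, steps=20_000):
    """Midpoint rule for the integral over [a, b] of (F(x + h) - F(x)) / h."""
    dx = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * dx
        total += (F(x + h) - F(x)) / h
    return total * dx


def rhs(a, b, h):
    """(1/h) * (integral of F over [b, b+h]) - (1/h) * (integral of F over [a, a+h])."""
    return ((G(b + h) - G(b)) - (G(a + h) - G(a))) / h


for h in (0.1, 0.01, 0.001):
    print(h, lhs(0.0, 1.0, h), rhs(0.0, 1.0, h))
```

As $h \to 0$ both sides approach $F(1) - F(0) = 1$, which is the statement one hopes to prove for more general $F$.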
If $f$ is not bounded, we can use the truncation $f_n = \min(f, n)$ and then for
$F_n(x) = \int_a^x f_n$, we have $F_n'(x) = f_n(x)$ almost everywhere. Fatou’s lemma tells
us that for any $F$ increasing and continuous,
\[ \int_a^b F' \le F(b) - F(a). \]
by monotone convergence.
Theorem 5.1 (Fundamental theorem of calculus I). If $f$ is integrable, then
\[ \frac{d}{dx} \int_a^x f(t)\,dt = f(x) \]
for almost every $x$.
This is stronger than uniform continuity, because that is only for one interval.
The technique for proving this is again comparing the functions. Let’s assume
for now that we know that $F'$ exists and is integrable. Let
\[ G(x) = \int_a^x F'. \]
Then Var(P, f ) = Var+ (P, f ) − Var− (P, f ) and f (b) − f (a) = Var+ (P, f ) +
Var− (P, f ). It follows that
and likewise for Var− . Then we can take sup on both sides. After doing this,
we can define
\[ f_+ = \sup_P \operatorname{Var}_+(P_{[a,x]}, f) \]
disjoint means that their interiors are disjoint. Is the measure defined in this
way unique? Actually you don’t have to worry about this, because you’re going
to define the exterior measure as
\[ m^*(E) = \inf \sum_j m(Q_j). \]
Now you can repeat everything again and get a measure theory on $\mathbb{R}^d$.
The next step is the fundamental theorem of calculus, and here you run into
trouble, because you have many variables. One analogue of this can be thought
of as
\[ \lim_{x \in B,\ m(B) \to 0} \frac{1}{m(B)} \int_B f = f(x) \]
almost everywhere. (Note the one-dimensional case.) People call this an “averaging
problem”, and it was solved by Lebesgue.
How are we going to approach this problem? In the one-dimensional case,
we used a truncation of $f$ and used the approximation $f_n \nearrow f$. Here, we
are going to approximate $f$ by a continuous function $g$ of compact support,
in the $L^1$ norm. And then you hope that a $3\epsilon$ argument will work out.
The comparison between $f(x)$ and $g(x)$ is called Tchebychev’s theorem, and
the comparison between $\frac{1}{m(B)} \int_B f$ and $\frac{1}{m(B)} \int_B g$ is the inequality on the
Hardy–Littlewood maximal function.
\[ \lim_{B \to p+} \frac{\mu(E)}{m(E)} = g. \]
The second part of the fundamental theorem of calculus can then be stated as
$\mu(E) = \int_E g$. This is what we are going to do.
Example 6.1. Before starting, let me mention the Cantor function. This
function $f : [0, 1] \to [0, 1]$ is defined as
\[ x = 0.a_1 a_2 a_3 \ldots_{(3)} \mapsto f(x) = 0.b_1 b_2 b_3 \ldots_{(2)} \quad \text{where } b_j = \Bigl\lfloor \frac{a_j}{2} \Bigr\rfloor. \]
This satisfies $f' = 0$ almost everywhere.
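The digit description of the Cantor function translates directly into a short program. The sketch below is only an illustration: the name `cantor` and the digit cutoff are made up, and for points outside the Cantor set (ternary expansions containing the digit 1) it uses the common convention of stopping at the first digit 1 and recording a binary 1 there, which is an assumption that goes beyond the halving rule stated for digits 0 and 2.

```python
from fractions import Fraction


def cantor(x, digits=40):
    """Approximate the Cantor function at x in [0, 1] from ternary digits.

    Digits 0 and 2 are halved to give binary digits 0 and 1. At the first
    ternary digit equal to 1 we emit a binary 1 and stop (assumed convention
    for points outside the Cantor set).
    """
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    result = Fraction(0)
    weight = Fraction(1, 2)
    for _ in range(digits):
        x *= 3
        a = int(x)          # next ternary digit (0, 1 or 2)
        x -= a
        if a == 1:
            return result + weight
        result += weight * (a // 2)
        weight /= 2
    return result


print(cantor(Fraction(1, 3)), cantor(Fraction(2, 3)))
print(float(cantor(Fraction(1, 4))))
```

The function is constant on each removed middle third (for example it takes the value 1/2 on all of $[1/3, 2/3]$), which is the reason its derivative vanishes almost everywhere even though it climbs from 0 to 1.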
for almost every $x$. The proof is done by looking at the stronger version
\[ \limsup_{m(B) \to 0} \frac{1}{m(B)} \int_B |f(y) - f(x)|\,dy = 0 \]
\[ m(\{F^* > \alpha\}) \le \frac{3^d}{\alpha} \|F\|_{L^1}. \]
This is done by Vitali’s covering technique.
Proposition 6.2. Given a finite collection $\mathcal{B}$ of open balls, there exists a
subcollection $\mathcal{B}'$ of disjoint open balls such that
\[ m\Bigl( \bigcup_{B \in \mathcal{B}} B \Bigr) \le 3^d \sum_{B \in \mathcal{B}'} m(B). \]
Proof. Choose $B_0 \in \mathcal{B}$ with maximal radius. Consider the ball $3B_0$ with the
same center and three times the radius. Among the balls in $\mathcal{B}$ that are not completely
covered by $3B_0$, take the maximal radius ball $B_1 \in \mathcal{B}$. Note that $B_0$
and $B_1$ cannot intersect by maximality. Among the balls in $\mathcal{B}$ that are not
completely covered by $3B_0 \cup 3B_1$, take the maximal radius ball $B_2 \in \mathcal{B}$. Repeat
this until we run out of balls, and let $\mathcal{B}' = \{B_0, B_1, \ldots\}$.
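The selection procedure in this proof is easy to try out in dimension one, where balls are intervals and the constant is $3^d = 3$. The sketch below uses a closely related greedy rule (go through the balls in order of decreasing radius and keep a ball if it is disjoint from all balls kept so far) rather than the exact rule in the proof; this variant, the sample data, and the helper names are assumptions made for illustration.

```python
def greedy_disjoint(balls):
    """Greedy selection: balls is a list of (center, radius) open intervals.

    Visit balls by decreasing radius and keep each one that does not meet
    any ball already kept. Every discarded ball meets a kept ball of at
    least its own radius, so it lies inside the 3x dilate of that ball.
    """
    chosen = []
    for c, r in sorted(balls, key=lambda ball: -ball[1]):
        if all(abs(c - c2) >= r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen


def union_length(balls):
    """Lebesgue measure of a finite union of intervals (center, radius)."""
    total, cur_lo, cur_hi = 0.0, None, None
    for lo, hi in sorted((c - r, c + r) for c, r in balls):
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total


balls = [(0.0, 1.0), (0.5, 0.8), (1.5, 0.6), (3.0, 0.5), (3.2, 0.4), (-2.0, 0.3)]
chosen = greedy_disjoint(balls)
print(chosen)
print(union_length(balls), 3 * sum(2 * r for _, r in chosen))
```

The first printed number (measure of the whole union) should not exceed the second (three times the total length of the selected disjoint intervals).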
Theorem 6.3 (Weak-type inequality). If $F$ is integrable, then
\[ m(\{F^* > \alpha\}) \le \frac{3^d}{\alpha} \|F\|_{L^1}. \]
Proof. We first note that F ∗ is measurable because {F ∗ > α} is in fact open.
Wiggling around doesn’t change the integral much.
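Consider any compact subset $K \subseteq E_\alpha = \{F^* > \alpha\}$. Then there is a finite
covering $\mathcal{B}$ of $K$ by large balls, and then
\[ m(K) \le m\Bigl( \bigcup_{B \in \mathcal{B}} B \Bigr) \le 3^d \sum_{B \in \mathcal{B}'} m(B) \le \frac{3^d}{\alpha} \sum_{B \in \mathcal{B}'} \int_B |F| \le \frac{3^d}{\alpha} \|F\|_1. \]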
Definition 6.4. The set on which (∗) holds is called the Lebesgue set of f .
This has an interesting consequence. If $f = g$ almost everywhere, we regard
them as more or less the same. The first fundamental theorem of calculus allows
us to pick a representative of the equivalence class. Of course, the limit may
not exist at some points, but we can let the representative take the value 0 there.
(We are doing this last step so that there is a better chance of convergence.) So
\[ s_n(x) - L = \frac{1}{2\pi} \int_0^{\pi} (f(x - y) + f(x + y) - 2L) \frac{\sin(n + \frac{1}{2})y}{\sin\frac{1}{2}y}\,dy \]
\[ \frac{f(x - y) + f(x + y) - 2L}{\sin\frac{1}{2}y} \]
together. If f 0 exists, then we indeed let L = f (x) and get an integrable function.
But this doesn’t really work very nicely in general. Cesàro decided to look
at the Lebesgue set.
Proposition 6.5. If $x$ is in the Lebesgue set, then
\[ \sigma_n = \frac{s_0 + s_1 + \cdots + s_{n-1}}{n} \to f(x) \]
as $n \to \infty$.
The first thing we want to establish is Dini’s test, which is that the Fourier
series for $f(x)$ converges at $x$ to $L$ if
\[ \frac{f(x + y) + f(x - y) - 2L}{y} \]
is integrable as a function of $y$ near $y = 0$.
There is also Cesàro convergence. If $x$ is in the Lebesgue set, then we can show
that
\[ \sigma_n = \frac{s_0 + s_1 + \cdots + s_{n-1}}{n} \to f(x). \]
Why is this interesting? We can interpret
\[ \sigma_n = \frac{1}{n} \sum_{k=0}^{n-1} s_k = \sum_{k=0}^{n-1} \frac{n - k}{n} a_k \]
as a sum with some weight. We can also look at other weights, for instance,
Abel’s
\[ A_r = \sum_{n=-\infty}^{\infty} r^{|n|} a_n. \]
Ordinary convergence implies Cesàro convergence, and this implies Abel
convergence. Can you go back? Results in this direction are called Tauberian
theorems, and this is an interesting field.
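A standard toy example separating these notions of convergence is the one-sided series $1 - 1 + 1 - 1 + \cdots$: its partial sums oscillate, but both the Cesàro means and the Abel means tend to $1/2$. The sketch below is only an illustration under that assumption (a one-sided sequence $a_0, a_1, \ldots$ rather than the two-sided Fourier setting), with made-up helper names; it also checks the weighted-sum formula for $\sigma_n$.

```python
def partial_sums(a):
    """Return the list s_0, s_1, ... of partial sums of the sequence a."""
    out, total = [], 0.0
    for term in a:
        total += term
        out.append(total)
    return out


def cesaro_mean(a, n):
    """sigma_n = (s_0 + ... + s_{n-1}) / n."""
    return sum(partial_sums(a[:n])) / n


def cesaro_weighted(a, n):
    """The same quantity written as a weighted sum of the terms a_k."""
    return sum((n - k) / n * a[k] for k in range(n))


def abel_mean(a, r):
    """Truncated Abel sum: sum of r**k * a_k over the available terms."""
    return sum(r ** k * a[k] for k in range(len(a)))


a = [(-1) ** k for k in range(20000)]
print(partial_sums(a)[:6])           # oscillates between 1 and 0
print(cesaro_mean(a, 1000), cesaro_weighted(a, 1000))
print(abel_mean(a, 0.99), 1 / 1.99)  # Abel sum equals 1 / (1 + r)
```

Going back from Abel or Cesàro convergence to ordinary convergence needs an extra hypothesis on the terms, which is the content of Tauberian theorems.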
and then
\[ s_n - L = \frac{1}{2\pi} \int_{y=-\pi}^{\pi} (f(x - y) - L) D_n(y)\,dy = \frac{1}{2\pi} \int_{y=0}^{\pi} (f(x - y) + f(x + y) - 2L) D_n(y)\,dy. \]
Does this go to 0 as $n \to \infty$?
The first thing done was to look at the function
\[ \frac{f(x - y) + f(x + y) - 2f(x)}{y}. \]
If this is integrable, then $s_n - f$ actually converges to 0.
as λ → ∞.
Proof. You approximate $f$ by a smooth function $g$ in $L^1[a, b]$. We first check
\[ \int_a^b g(x) \sin \lambda x\,dx = -g(x) \frac{\cos \lambda x}{\lambda} \Big|_a^b + \int_a^b g'(x) \frac{\cos \lambda x}{\lambda}\,dx, \]
If $f$ is continuous at $x$, for instance, you can show that this goes to 0 as $n \to \infty$,
because we have the factor $1/n$.
What is really the difference between Dn and Fn ? The intuitive picture is
that Fn is positive and approximates the Dirac delta for n → ∞. On the other
hand, Dn has some oscillation that does not disappear in magnitude as n → ∞.
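The contrast between the two kernels can be seen numerically. The sketch below assumes the usual normalizations $D_k(x) = \sin((k+\frac12)x)/\sin(\frac12 x)$ and $F_n = \frac1n (D_0 + \cdots + D_{n-1})$, which has the closed form $\sin^2(nx/2) / (n \sin^2(x/2))$; the helper names are made up for this illustration.

```python
import math


def dirichlet(k, x):
    """Dirichlet kernel D_k(x) = sin((k + 1/2)x) / sin(x/2), x not in 2*pi*Z."""
    return math.sin((k + 0.5) * x) / math.sin(0.5 * x)


def fejer_average(n, x):
    """Fejer kernel as the average of D_0, ..., D_{n-1}."""
    return sum(dirichlet(k, x) for k in range(n)) / n


def fejer_closed(n, x):
    """Closed form sin^2(nx/2) / (n sin^2(x/2)), which is visibly nonnegative."""
    return math.sin(0.5 * n * x) ** 2 / (n * math.sin(0.5 * x) ** 2)


# Midpoints of a uniform grid on (-pi, pi), chosen so that x = 0 is avoided.
grid = [-math.pi + (i + 0.5) * 2 * math.pi / 400 for i in range(400)]

print(min(fejer_closed(8, x) for x in grid))   # never negative
print(min(dirichlet(8, x) for x in grid))      # takes negative values
```

The averaged kernel stays nonnegative on the grid, while the Dirichlet kernel dips well below zero.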
Let us make this more precise. A good kernel is a family of integrable
functions $K_\delta$ for $\delta > 0$ satisfying
(1) $\int_{\mathbb{R}^d} K_\delta(x)\,dx = 1$ (unit integral),
(2) $\int_{\mathbb{R}^d} |K_\delta(x)|\,dx \le A$ independent of $\delta$ (uniform integrability),
(3) for every $\eta > 0$, $\int_{|x| > \eta} |K_\delta(x)|\,dx \to 0$ as $\delta \to 0$ (small $L^1$ norm outside the
origin).
This gives convergence almost everywhere along a sequence, but we don’t know if things converge
on Lebesgue sets.
So we strengthen the condition, and say that $K_\delta$ is an approximation to
identity if it satisfies (1) and the following two strengthenings:
(2s) $|K_\delta(x)| \le \dfrac{A}{\delta^d}$ for all $\delta > 0$ (parameter pole order estimate),
(3s) $|K_\delta(x)| \le \dfrac{A\delta}{|x|^{d+1}}$ for all $\delta > 0$ (coordinate pole order estimate).
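For a concrete family satisfying the strengthened conditions, one can take the Poisson kernel on the line, $K_\delta(x) = \delta / (\pi (x^2 + \delta^2))$, with $d = 1$ and $A = 1/\pi$. This example is not taken from the lecture; the sketch below just checks the two pointwise bounds on a grid and uses the exact antiderivative $\frac{1}{\pi}\arctan(x/\delta)$ to look at the total mass and the mass away from the origin.

```python
import math


def poisson(delta, x):
    """Poisson kernel on the real line (illustrative choice of K_delta)."""
    return delta / (math.pi * (x * x + delta * delta))


def mass(delta, radius):
    """Exact integral of poisson(delta, .) over [-radius, radius]."""
    return 2 / math.pi * math.atan(radius / delta)


A = 1 / math.pi
grid = [i / 100 for i in range(-500, 501)]

for delta in (1.0, 0.1, 0.01):
    ok_2s = all(poisson(delta, x) <= A / delta + 1e-12 for x in grid)
    ok_3s = all(poisson(delta, x) <= A * delta / (x * x) + 1e-12 for x in grid if x != 0)
    tail = 1 - mass(delta, 0.5)   # mass outside |x| <= 0.5
    print(delta, ok_2s, ok_3s, mass(delta, 1e9), tail)
```

The total mass stays at 1 while the mass outside a fixed neighborhood of the origin shrinks as $\delta \to 0$, so the family is also a good kernel.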
Proposition 7.3. If $K_\delta$ is a good kernel, and $f$ is an integrable function, then
there is a sequence $\delta_\nu \to 0$ such that $f * K_{\delta_\nu} \to f$ almost everywhere.
Proof. Write $f_y(x) = f(x - y)$. Note that
\[ (f * K_\delta)(x) - f(x) = \int_{\mathbb{R}^d} (f_y(x) - f(x)) K_\delta(y)\,dy. \]
Then we can use approximation of $f$ by smooth functions to show $\lim_{\eta \to 0} \sup_{|y| \le \eta} \|f_y -
f\|_{L^1} = 0$.
So we get $\lim_{\delta \to 0} \|f * K_\delta - f\| = 0$. You have shown in your homework that
then there is a subsequence $\delta_\nu$ such that $f * K_{\delta_\nu} \to f$ almost everywhere.
Let us now look at the approximation to identity case. Consider an approximation
to identity $K_\delta(x)$. If $x$ is in the Lebesgue set, we have
\[ A_x(r) = \frac{1}{r^d} \int_{|y| < r} |f(x - y) - f(x)|\,dy \to 0 \]
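Proof. We have
\begin{align*}
|(f * K_\delta)(x) - f(x)| &\le \int_{\mathbb{R}^d} |f(x - y) - f(x)| |K_\delta(y)|\,dy \\
&= \int_{|y| \le \delta} |f(x - y) - f(x)| |K_\delta(y)|\,dy \\
&\quad + \sum_{k=0}^{\infty} \int_{2^k \delta < |y| \le 2^{k+1} \delta} |f(x - y) - f(x)| |K_\delta(y)|\,dy \\
&\le c A(\delta) + \sum_{k=0}^{\infty} \frac{c\delta}{(2^k \delta)^{d+1}} (2^{k+1}\delta)^d A(2^{k+1} \delta).
\end{align*}
Here the factor $(2^{k+1}\delta)^d$ comes from the normalization $1/r^d$ in the definition of $A(r)$, so the
$k$-th term is a constant times $2^{-k} A(2^{k+1}\delta)$.
From the properties we had for $A(r)$, it follows that this goes to 0 as $\delta \to 0$.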
You can ask when the inverse Fourier transform is actually the inverse of the
Fourier transform. It turns out this is easier to deal with than the Fourier
series, because $f$ and $\hat{f}$ are both defined on $\mathbb{R}^d$.
The idea is that if $f$ and $\hat{f}$ are in $L^1$, then the Fourier transform is formally
symmetric. That is, if $f, g \in L^1$, then $\int f \hat{g} = \int \hat{f} g$. The verification is Fubini’s
theorem and rescaling the Gauss distribution to use it as a good kernel.
where $F_n$ is the Fejér kernel. Here, we are going to take care of $0 < y < \eta$ with
the Lebesgue point condition and handle $\eta < y < \pi$ with the oscillation. Let us
write $\varphi(y) = \frac{1}{2}(f(x + y) + f(x - y) - 2f(x))$ and
\[ \Phi(t) = \int_{y=0}^{t} |\varphi(y)|\,dy. \]
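Then $\Phi(t) \le 2\|f\|_{L^1}$ and the Lebesgue point property gives $\Phi(t)/t \to 0$ as $t \to 0$.
Given $\epsilon > 0$, choose $0 < \eta < \pi$ such that $0 \le \Phi(t) < \epsilon t$ for all $t \le \eta$. Now
we want to show that
\[ \frac{1}{2\pi n} \int_{y=0}^{\pi} \frac{\sin^2 \frac{ny}{2}}{y^2} \varphi(y)\,dy \to 0 \]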
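as $n \to \infty$. We have from $0 < y < 1/n$,
\[ \frac{1}{2\pi n} \int_0^{1/n} \frac{\sin^2 \frac{ny}{2}}{y^2} \varphi(y)\,dy \le \frac{1}{2\pi n} n^2 \int_0^{1/n} |\varphi(y)|\,dy = \frac{n}{2\pi} \Phi\Bigl(\frac{1}{n}\Bigr) \to 0, \]
because $\sin^2 \theta \le \theta^2$.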
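For the other interval, we note that integration by parts gives
\[ \int_{y=a}^{b} \frac{\sin^2 \frac{ny}{2}}{y^2} |\varphi(y)|\,dy \le \int_a^b \frac{1}{y^2} |\varphi(y)|\,dy = \frac{\Phi(b)}{b^2} - \frac{\Phi(a)}{a^2} + 2 \int_{y=a}^{b} \frac{1}{y^3} \Phi(y)\,dy. \]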
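So we have
\begin{align*}
\frac{1}{2\pi n} \int_{1/n}^{\eta} \frac{\sin^2 \frac{ny}{2}}{y^2} \varphi(y)\,dy &\le \frac{1}{2\pi n} \frac{\Phi(\eta)}{\eta^2} - \frac{n}{2\pi} \Phi\Bigl(\frac{1}{n}\Bigr) + \frac{1}{\pi n} \int_{1/n}^{\eta} \frac{1}{y^3} \Phi(y)\,dy \\
&\le \frac{\epsilon}{2\pi} + \frac{\epsilon}{\pi n} \int_{y=1/n}^{\eta} \frac{dy}{y^2} \le \frac{3\epsilon}{2\pi}.
\end{align*}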
Finally, we have
\[ \frac{1}{2\pi n} \int_{y=\eta}^{\pi} \frac{\sin^2 \frac{ny}{2}}{y^2} \varphi(y)\,dy \]
goes to zero, because the amplitude of the oscillation goes to zero.
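\[ s_f(x) = \frac{1}{2i} \sum_{n \neq 0} \frac{e^{inx}}{n}. \]
This is the Fourier series of the sawtooth function, which looks something like
\[ s_f(x) = \begin{cases} -\dfrac{x + \pi}{2} & -\pi < x < 0, \\[4pt] -\dfrac{x - \pi}{2} & 0 < x < \pi. \end{cases} \]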
This is actually uniformly bounded for all $N \in \mathbb{N}$ and all $\theta \in \mathbb{R}$. On the other
hand, the non-principal sum
\[ Q_N(\theta) = \sum_{n=-N}^{-1} \frac{e^{in\theta}}{n} \]
cannot converge because $|Q_N(0)| = \sum_{n=1}^{N} \frac{1}{n} \ge \log N$. To show that $P_N(\theta)$ is
uniformly bounded, you use Abel summation.
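Choose $\alpha_k > 0$ and $N_k \in \mathbb{N}$ so that
(1) $\sum_{k=1}^{\infty} \alpha_k < \infty$,
(2) $N_{k+1} > 3N_k$,
(3) $\alpha_k \log N_k \to \infty$ as $k \to \infty$,
e.g., $\alpha_k = k^{-2}$, $N_k = 2^{k^3}$, and then construct the function
\[ f(\theta) = \sum_{k=1}^{\infty} \alpha_k e^{i 2 N_k \theta} P_{N_k}(\theta). \]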
The condition (2) ensures that the coefficients do not interfere. Then if we
cut the sum at 2Nk , then we get one large non-principal sum and some small
principal sums that are uniformly bounded. This means that the Fourier series
cannot converge.
i.e., $f \mapsto \hat{f}$ is formally symmetric with respect to the pairing. This is just Fubini,
because we can replace $\hat{g}$ and $\hat{f}$ with integrals.
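We first note that $e^{-\pi x^2}$ is its own Fourier transform. This can be verified
directly. Now take
\[ g(\xi) = e^{-\pi \delta |\xi|^2} e^{2\pi i x \cdot \xi}, \qquad \hat{g}(y) = \frac{1}{\delta^{d/2}} e^{-\pi |x - y|^2 / \delta}. \]
Then as $\delta \to 0$, we immediately get by dominated convergence,
\[ f(x) = \int_{\xi \in \mathbb{R}^d} \hat{f}(\xi) g(\xi)\,d\xi \]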
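The claim that $e^{-\pi x^2}$ is its own Fourier transform (with the convention $\hat f(\xi) = \int f(x) e^{-2\pi i \xi x}\,dx$) can be checked numerically in one dimension. The sketch below is an illustration only: the function name, the truncation of the integral to $[-8, 8]$, and the step count are choices made for this example. The imaginary part of the integrand is odd and integrates to zero, so only the cosine part is computed.

```python
import math


def fourier_transform_gaussian(xi, half_width=8.0, steps=16_000):
    """Trapezoid rule for the integral of exp(-pi x^2) cos(2 pi xi x) over
    [-half_width, half_width]; the tails beyond that are negligible."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-math.pi * x * x) * math.cos(2 * math.pi * xi * x)
    return total * h


for xi in (0.0, 0.5, 1.0, 2.0):
    print(xi, fourier_transform_gaussian(xi), math.exp(-math.pi * xi * xi))
```

The two columns should agree to many digits, since the trapezoid rule is very accurate for rapidly decaying smooth integrands.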
This holds when $\partial f / \partial y$ is in $L^1([a, b] \times [c, d])$. As I have said, we can use Fubini
and the fundamental theorem of calculus.
The interpretation is that the left hand side is the weighted average of the
constant function 1. This can be replaced by a constant weight somewhere.
You can also look at the Stieltjes version
\[ \int_a^b f(x) \varphi(x)\,dx = f(\xi) \int_a^b \varphi(x)\,dx. \]
But what if you want to have the weight concentrated at the ends? The
second mean value theorem is, if $\varphi$ is monotone and $f$ is an integrable function on
$(a, b)$, there exists a $\xi$ such that
\[ \int_a^b f(x) \varphi(x)\,dx = \varphi(a+) \int_a^{\xi} f(x)\,dx + \varphi(b-) \int_{\xi}^{b} f(x)\,dx. \]
can talk about Lebesgue points and so forth. What about the second part? If
you think about $\int_a^b F' = F(b) - F(a)$, these are two different ways of giving an
identical “measure” to an interval. So the higher dimensional analogue will also
be equating two measures.
This is how abstract measure theory works. People look at abstract measure
spaces and also define signed measures. For two measures $\mu$ and $\nu$, we say that
$\mu$ is absolutely continuous with respect to $\nu$ if $\nu(E) = 0$ implies $\mu(E) = 0$.
Then there is a Radon–Nikodym derivative, which is a measurable function,
such that
\[ \mu(E) = \int_E \frac{d\mu}{d\nu}\,d\nu. \]
To get the inverse, we need that the map is invertible, i.e., has nonzero eigen-
values. We have
and so is automatically invertible. But for this to be true for function spaces,
we need an estimate. And to formulate this, we need to set up the notion of
Hilbert spaces.
In an inner product space, we have the parallelogram law
Definition 9.1. Let $E$ be a measurable set. For $p \ge 1$, the space $L^p(E)$ is the
space of measurable functions $f$ such that $\int_E |f|^p < \infty$. The $L^p$-norm is defined
as
\[ \|f\|_{L^p} = \|f\|_p = \Bigl( \int_E |f|^p \Bigr)^{1/p}. \]
The first thing you need to worry about is the triangle inequality. This is
Minkowski’s inequality, and it comes from concavity. The most basic inequality
is $\sqrt{ab} \le (a + b)/2$, which can be shown easily, and its general case is $a^\alpha b^\beta \le
\alpha a + \beta b$ for $\alpha + \beta = 1$ and $a, b, \alpha, \beta > 0$. The condition $\alpha + \beta = 1$ ensures homogeneity
and so we may set $b = 1$ and show $a^\alpha \le \alpha(a - 1) + 1$. The left hand side is a
concave function in $a$, and the right hand side is a tangent line at $a = 1$.
From this, we can deduce other inequalities. Suppose we have numbers
$a_1, \ldots, a_n$ and $b_1, \ldots, b_n$, and by normalization assume $\sum a_j = \sum b_j = 1$. Using
$a_j^\alpha b_j^\beta \le \alpha a_j + \beta b_j$, we get
\[ \sum_{j=1}^{n} a_j^\alpha b_j^\beta \le 1 = \Bigl( \sum_{j=1}^{n} a_j \Bigr)^{\alpha} \Bigl( \sum_{j=1}^{n} b_j \Bigr)^{\beta}. \]
If we rescale $a_j$ and $b_j$, and let $p = 1/\alpha$, $q = 1/\beta$, then we get the standard
Hölder inequality
\[ \sum_{j=1}^{n} a_j b_j \le \Bigl( \sum_{j=1}^{n} a_j^p \Bigr)^{1/p} \Bigl( \sum_{j=1}^{n} b_j^q \Bigr)^{1/q}. \]
Then we can move things to the other side to get the Minkowski inequality
\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]
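Both inequalities are easy to test on finite sequences. The sketch below is a small numerical illustration, not a proof: the exponent $p = 3$, the random nonnegative data, and the helper name `p_norm` are all choices made for this example, with $q$ the conjugate exponent $p/(p-1)$.

```python
import random


def p_norm(v, p):
    """Finite-dimensional l^p norm: (sum |v_j|^p)^(1/p)."""
    return sum(abs(t) ** p for t in v) ** (1 / p)


random.seed(0)
p = 3.0
q = p / (p - 1)          # conjugate exponent, so that 1/p + 1/q = 1

a = [random.random() for _ in range(50)]
b = [random.random() for _ in range(50)]

# Hoelder: sum a_j b_j <= ||a||_p ||b||_q
holder_lhs = sum(s * t for s, t in zip(a, b))
holder_rhs = p_norm(a, p) * p_norm(b, q)

# Minkowski: ||a + b||_p <= ||a||_p + ||b||_p
mink_lhs = p_norm([s + t for s, t in zip(a, b)], p)
mink_rhs = p_norm(a, p) + p_norm(b, p)

print(holder_lhs, holder_rhs)
print(mink_lhs, mink_rhs)
```

In each printed pair the first number should not exceed the second.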
(i) complete, i.e., it is complete with respect to the metric defined by the norm,
(ii) separable, i.e., there exists a countable dense subset.
Example 9.3. For any measurable space E, the space L2 (E) is a Hilbert space.
We know that it is separable, because every function can be approximated by
simple functions and we can make them have rational coefficients.
10 October 3, 2017
We will be looking at linear differential equations. We can ask how to get
$A^*(AA^* + S^*S)^{-1} b$ in function spaces. So we look at Hilbert spaces, which are
complete and separable. One difficulty, which held people up until Friedrichs (1944),
was that some operators are not defined on the whole space.
The operators like $d/dx$ are not defined on all of $L^2(\mathbb{R})$. Rather, they are defined on
differentiable functions, which form a dense subspace. The operator $T^*$ means that
\[ (Tx, y) = (x, T^* y). \]
by the parallelogram law. Then $\|2v - y_n - y_m\|^2 \ge 4\mu^2$ and the right hand side
goes to $4\mu^2$. This shows that $\|y_n - y_m\| \to 0$ and that this is a Cauchy sequence.
Then $y_n \to w$.
Now we need to show that it is perpendicular. Let $0 \neq y \in Y$. We have
vf = f (v)v
does the job. Because x decomposes into a linear multiple of v and an element
of Y we only need to show (x, f (v)v) = f (x) for x ∈ Y and x = v. We have
already checked it for x = v. For x ∈ Y , (x, f (v)v) = 0 = f (x).
\[ |f(x)| \le \|x\| \|v_f\| \]
\[ (Tx, y) = (x, T^* y). \]
If we fix $y$, then
\[ |(Tx, y)| \le \|Tx\| \|y\| \le \|T\| \|y\| \|x\|, \]
and so there exists this element $T^* y$ that represents the map. Moreover, we
would have the inequality
\[ \|T^* y\| = \|x \mapsto (Tx, y)\| \le \|T\| \|y\|. \]
where H1 , H2 , H3 are Hilbert spaces but T and S are only densely defined. We
can assume that the graphs of T and S are closed.
What do I mean by this? We want to make this function defined on a domain
as large as possible, so we look at the closure of the graph
{(x, T x) : x in domain of T } ⊆ H1 × H2 .
(fν0 , h) = (fν , h0 ) → 0
for all g ∈ dom(T ∗ ) ∩ dom(S), for some c > 0. The conclusion is that if
g ∈ dom S such that Sg = 0, then there exists a f ∈ dom T such that T f = g
and f ⊥ ker T .
You could try to define (T ∗ T + SS ∗ )−1 , but it becomes very complicated
because the domain is bad. Instead, we do this directly.
11 October 5, 2017
We need to prove the following proposition.
Proposition 11.1. Let $T : H_1 \to H_2$ and $S : H_2 \to H_3$ be densely defined
closed operators between Hilbert spaces with $ST = 0$ and
for all $g \in \operatorname{dom}(T^*) \cap \operatorname{dom}(S)$. Then the equation $Tu = f$ with $f \in \ker S$ can
be solved uniquely with $u \perp \ker T$ and moreover there is the estimate
\[ \|u\|_{H_1} \le \frac{1}{\sqrt{c}} \|f\|_{H_2}. \]
\[ |(g, f)| = |(g_1, f)| \le \frac{\|f\|}{\sqrt{c}} \|T^* g_1\| = \frac{\|f\|}{\sqrt{c}} \|T^* g\|. \]
So $T^* g \mapsto (g, f)$ is bounded by $\|f\| / \sqrt{c}$. This is a map $\operatorname{im} T^* \to \mathbb{R}$ that is
bounded, and so we can extend it to the closure $\overline{\operatorname{im} T^*}$ and then to $H_1$ by orthogonal
projection. Now by the Riesz representation theorem, there exists a $u \in H_1$
with
\[ \|u\| \le \frac{\|f\|}{\sqrt{c}} \]
such that $(T^* g, u) = (g, f)$ for all $g \in \operatorname{dom} T^*$.
This implies that $f = (T^*)^* u$ with $u \in \operatorname{dom}(T^*)^*$. But it can be shown that
$(T^*)^* = T$ and so $f = Tu$.
H1 × H2 = (Graph T ) ⊕ (Graph T )⊥ .
for all f ∈ dom T ∗ . Then (0, h) ∈ Graph(T ) which is not possible unless h = 0.
This shows that dom T ∗ is dense.
Finally, T = (T ∗ )∗ because Graph((T ∗ )∗ ) is going to be the orthogonal
complement of Graph(T ∗ ), which is Graph(T ).
\[ \chi_\epsilon(x) = \frac{1}{\epsilon} \chi\Bigl( \frac{x}{\epsilon} \Bigr), \]
so that the support has order $\epsilon$.
Now assume $u \in \operatorname{dom} L$ and let $u_\epsilon = u * \chi_\epsilon$. Now we are good if we could
prove that
\[ (u_\epsilon, L u_\epsilon) \to (u, Lu) \]
in the $L^2$ norm of the product space.
Theorem 11.3. Let $u, Lu \in L^2$ for some differential operator $L$. Then for any
$\chi \in C_0^\infty$, $L(u * \chi_\epsilon) \to Lu$ in $L^2$ as $\epsilon \to 0$.
Proof. The main idea is the $3\epsilon$ argument applied to the sequence. We have
\[ u_\epsilon = u * \chi_\epsilon \to u, \qquad (Lu) * \chi_\epsilon \to Lu \]
\[ L(u * \chi_\epsilon) - (Lu) * \chi_\epsilon \to 0 \]
in $L^2$.
For simplicity, consider the operator
\[ L = a(x) \frac{d}{dx} + b(x). \]
Suppose we show that
But the first term is bounded by $C\|u - v\|$ and the second part goes to 0 as
$\epsilon \to 0$. So if we prove this statement, we get
\[ \chi_\epsilon * Lu - L(\chi_\epsilon * u) \to 0 \]
as $\epsilon \to 0$.
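Now let us show the inequality (†). We have
\begin{align*}
\chi_\epsilon * Lu - L(\chi_\epsilon * u) &= \chi_\epsilon * \Bigl( a \frac{\partial u}{\partial x} \Bigr) - a \frac{\partial}{\partial x} (\chi_\epsilon * u) \\
&= \chi_\epsilon * \frac{\partial}{\partial x}(au) - \chi_\epsilon * \Bigl( \frac{\partial a}{\partial x} u \Bigr) - a \frac{\partial}{\partial x} (\chi_\epsilon * u) \\
&= \Bigl( \chi_\epsilon * \frac{\partial}{\partial x}(au) - a \frac{\partial}{\partial x} (\chi_\epsilon * u) \Bigr) - \chi_\epsilon * \Bigl( \frac{\partial a}{\partial x} u \Bigr).
\end{align*}
The last term goes to zero, so the first two terms are
\begin{align*}
&\int \Bigl( \frac{\partial \chi_\epsilon}{\partial y}(y)\, a(x - y) u(x - y) - a(x) \frac{\partial \chi_\epsilon}{\partial y}(y)\, u(x - y) \Bigr)\,dy \\
&\qquad = \int \frac{\partial \chi_\epsilon(y)}{\partial y} (a(x - y) - a(x)) u(x - y)\,dy.
\end{align*}
Now as $\epsilon \to 0$, the first derivative grows with order $\epsilon^{-1}$ and $a(x - y) - a(x)$
decreases with order $\epsilon$.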
\[ \lim_{n \to \infty} \|T - T_n\| = 0. \]
Let H be a Hilbert space, and let L(H) be all bounded linear operators from
H to H. This is an algebra, i.e., there is bilinear multiplication. Let Lc (H) be
the space of all compact operators from H to itself.
Theorem 12.3. $L_c(H)$ is a two-sided, closed, adjoint-invariant ideal of $L(H)$. The
finite-rank elements of $L_c(H)$ are dense in $L_c(H)$.
Proof. We need to show that if $T \in L(H)$ and $S \in L_c(H)$, then $TS \in L_c(H)$
and $ST \in L_c(H)$. This is clear, just by mapping the sequence.
To show closedness, we need to check that if $\|T - T_n\| \to 0$ and $T_n \in L_c(H)$
then $T \in L_c(H)$. This can be done by taking a diagonal subsequence from a sequence of
nested subsequences. The usual $3\epsilon$ argument works.
Now let us show that elements in $L_c(H)$ can be approximated by finite rank
operators. Let $\{e_n\}$ be an orthonormal basis. Then define
\[ T_n f = \sum_{k \le n} (Tf, e_k) e_k, \qquad Q_n f = \sum_{k > n} (f, e_k) e_k \]
Proof of the spectral theorem. Let $\mu = \|T\|$. Then there exist $x_n$ with $\|x_n\| = 1$
such that $(Tx_n, x_n) \to \pm\mu$. (This is from the lemma.) By compactness, there
exists a subsequence $x_{n_k}$ such that $Tx_{n_k}$ converges, and let
\[ \lim_{k \to \infty} Tx_{n_k} = y. \]
\[ Lf = \frac{d^2}{dx^2} f = g \]
with $f(a) = f(b) = 0$. This operator $L$ is not compact, but Green figured out
that $L^{-1}$ is compact. He constructed the kernel
\[ K(x, y) = \begin{cases} \dfrac{(x - a)(b - y)}{a - b} & a \le x \le y \le b, \\[6pt] \dfrac{(b - x)(y - a)}{a - b} & a \le y \le x \le b. \end{cases} \]
You can differentiate to actually verify this. This $K(x, y)$ is called the Green’s
kernel.
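One way to convince yourself that this kernel inverts $d^2/dx^2$ with zero boundary values is to apply it to a simple right-hand side. The sketch below is an illustration with assumed choices: the interval $[0, 1]$, the right-hand side $g \equiv 1$ (whose solution is $f(x) = x(x-1)/2$), and the made-up helper names `green` and `solve`, where `solve` approximates $f(x) = \int_a^b K(x, y) g(y)\,dy$ with the midpoint rule.

```python
def green(x, y, a=0.0, b=1.0):
    """Green's kernel for f'' = g on [a, b] with f(a) = f(b) = 0."""
    if x <= y:
        return (x - a) * (b - y) / (a - b)
    return (b - x) * (y - a) / (a - b)


def solve(g, x, a=0.0, b=1.0, steps=4000):
    """Midpoint-rule approximation of the integral of K(x, y) g(y) over [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        y = a + (i + 0.5) * h
        total += green(x, y, a, b) * g(y)
    return total * h


# For g = 1 the exact solution of f'' = 1, f(0) = f(1) = 0 is x(x - 1)/2.
for x in (0.0, 0.3, 0.5, 0.9, 1.0):
    print(x, solve(lambda y: 1.0, x), x * (x - 1) / 2)
```

The two columns should agree closely, and the computed values vanish at both endpoints as the boundary condition requires.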
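The Sturm–Liouville equation is solved similarly. Consider the operator
\[ Lf = (pf')' - qf. \]
Suppose we can find $\varphi_-$ and $\varphi_+$ such that
\[ L(\varphi_-) = 0, \quad \varphi_-(a) = 0, \quad \varphi_-'(a) \neq 0, \]
\[ L(\varphi_+) = 0, \quad \varphi_+(b) = 0, \quad \varphi_+'(b) \neq 0. \]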
is actually constant in x.
This is bounded by the fact that there are finitely many eigenvalues outside a
neighborhood of 0.
Actually the Fredholm alternative holds for all compact operators T . The
techniques we will use are the lower bound technique and the iterated approxi-
mation to get surjectivity.
Theorem 13.1. If $T$ is compact and $\lambda \neq 0$ is not an eigenvalue, then $(T - \lambda)^{-1}$
is bounded.
This actually works for Banach spaces too.
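Proof. Let’s first prove that $\|(T - \lambda)x\| \ge c\|x\|$ for some $c > 0$. If this is not true, there is a
sequence of $x_n$ with $\|x_n\| = 1$ such that $(T - \lambda)x_n \to 0$. By compactness, after
passing to a subsequence we may assume that $Tx_n \to y$.
Then
\[ y \leftarrow Tx_n = \lambda x_n + (T - \lambda)x_n. \]
So $x_n \to y/\lambda$ and $Tx_n \to Ty/\lambda$. So $Ty = \lambda y$, and $y \neq 0$ because $\|y\| = |\lambda|$; that is, $y$ is
an eigenvector with eigenvalue $\lambda$.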
Now we need to show that T − λ is surjective, so that we can invert it. Let
kTn − T k → 0 with Tn finite dimension range. Now the claim is that there exists
a c > 0 such that
k(Tn − λ)xk ≥ ckxk
for all sufficiently large n. This is by the same technique. If (sn − λ)xn → 0,
then by the diagonal argument we can assume sn xk → yn as k → ∞ for all n.
Also assume T xk → y. Then
is dense in X.
Proof. Let $U$ be a nonempty open set. Then $U \cap O_1$ is nonempty, and so
there is an open ball $B_1$ with $\overline{B_1} \subseteq U \cap O_1$. By the same argument there is an open
ball $B_2$ with $\overline{B_2} \subseteq B_1 \cap O_2$, and so forth, with radii shrinking to $0$. Then the
intersection of the $B_n$ is nonempty by completeness, and so $U \cap E \ne \emptyset$.
Theorem 13.3 (Open mapping theorem). Let X, Y be Banach spaces and T :
X → Y be continuous, i.e., bounded. If it is surjective, then the map is open.
\[ F_n = \{x \in X : \sup_{T \in \mathcal{T}} \|Tx\| \le n\} \]
is closed, and we also have $\bigcup_{n \in \mathbb{N}} F_n = X$. So some $F_n$ contains an open ball.
Theorem 14.3 (Open mapping theorem). Let $T : X \to Y$ be a bounded and
surjective map of Banach spaces. Then $T$ is open.
Proof. For $B_X$ the open unit ball, we need to show that $T(B_X)$ contains some
open ball in $Y$. But we have $\bigcup_n T(nB_X) = Y$, and so $\bigcup_n \overline{T(nB_X)} = Y$. The
Baire category theorem implies that some $\overline{T(nB_X)}$ contains an open ball in $Y$.
We may assume that $\overline{T(nB_X)} \supseteq B_Y$.
The whole point is to get rid of the closure. The condition we have obtained
says that for each $y \in Y$ and $\varepsilon > 0$, there exists an $x \in X$ such that
for any disjoint $(\xi_1, \eta_1), \ldots, (\xi_k, \eta_k) \subseteq [a, b]$, then $F$ is the indefinite integral of
an $L^q$ function. Let
\[ f = \sum_{j=1}^{k} \frac{|F(\eta_j) - F(\xi_j)|^{q-1} \operatorname{sgn}(F(\eta_j) - F(\xi_j))}{(\eta_j - \xi_j)^{q-1}} \chi_{[\xi_j, \eta_j]} \]
so that $\Phi(f)$ is the sum we want to estimate. Now if you compute $\|f\|_{L^p}$, it is
\[ \|f\|_{L^p} = \Bigl( \sum_{j=1}^{k} \frac{|F(\eta_j) - F(\xi_j)|^q}{|\eta_j - \xi_j|^{q-1}} \Bigr)^{1/p} = (\Phi(f))^{1/p}. \]
\[ f(y - y') \le \|y - y'\| \le \|y + z\| + \|y' + z\| \]
Proof of $\mathbb{C}$-linear version. Suppose $|f(x)| \le \|x\|$. Then clearly $|u(x)| \le \|x\|$.
We extend $u(x)$ to $U(x)$, and then define $F(x) = U(x) - iU(ix)$. The norm
estimate follows automatically. We have
This is sort of the fundamental theorem of calculus over C, i.e., Cauchy’s theory.
But we can go in another direction, which is to interpret it as d(P dx+Qdy) = 0.
There is the problem with boundaries, so Hodge looked at compact manifolds
without boundary.
We have discussed solving T u = f subject to Sf = 0. But for compact
manifolds there is a local question and a global question. The local one is called
the Poincaré lemma. We can let T = d on (p − 1)-forms and S = d on p-forms.
This satisfies the estimate, but globally it fails up to finite dimension. These
finite dimensional exceptions are called harmonic forms and have important
consequences in geometry and topology.
Let us look at the $p$-forms. If $M$ is a smooth ($C^\infty$) manifold, there is
the notion of a tangent vector. This relates to each smooth function its
directional derivative. The tangent vector maps the germs of smooth functions
to $\mathbb{R}$. This map is $\mathbb{R}$-linear and satisfies the Leibniz formula
What are the 2-forms then? If you just take the derivative, the second order
terms will enter. To keep them out, you take the skew-symmetrization. Define
\[ d\Bigl( \sum_j \varphi_j \, dx_j \Bigr) = \sum_{j,k} \frac{\partial \varphi_j}{\partial x_k} \, dx_k \wedge dx_j. \]
where $\omega_{j_1, \ldots, j_p}$ is alternating. Assume that $d\omega = 0$ locally. Then we would like
to say that $\omega = d\gamma$ for some local smooth $(p-1)$-form $\gamma$.
Lemma 15.2 (Poincaré lemma). Let Ω be a star-like domain. If ω is a p-form
on Ω such that dω = 0, then there is a (p − 1)-form γ on Ω such that dγ = ω.
There are two ways of doing this: one is using polar coordinates and the
other is using Cartesian coordinates, one at a time.
This is the same thing as looking at the operator $TT^* + S^*S$. You ask if this
allows diagonalization. Here it is easier to use $1 + TT^* + S^*S$.
One way is to use Ascoli–Arzelà. Let us look at the $L^2$ space and the $L^2_1$
space. What is the $L^2$ space of global $p$-forms? Note that locally $p$-forms look
like linear combinations of $dx_{j_1} \wedge \cdots \wedge dx_{j_p}$ with some coefficients. So we can take
the $L^2$ norm of this after defining the metric and volume form on the manifold.
I am going to do this.
Now the $L^2_1$ space consists of $L^2$ functions whose first derivative is in $L^2$
in the weak sense. This means that there exists a $g \in L^2$ such that
\[ -\int f \varphi' = \int g \varphi \]
T : Hs → Hr
where $g_{ij}$ is symmetric and positive-definite. This gives a pointwise inner product,
and to make this global, we need a volume form
\[ \sqrt{\det g_{ij}} \, dx_1 \wedge dx_2 \wedge \cdots \wedge dx_n. \]
Now to a $p$-form
\[ \varphi = \frac{1}{p!} \sum \varphi_{j_1 \cdots j_p} \, dx_{j_1} \wedge \cdots \wedge dx_{j_p}, \]
In the old days, people tried to solve this locally using Poincaré, but then
they don’t agree at the boundary. This difference is then going to have zero
differential, so can be written as d of something. This motivated Leray to define
sheaves.
Let us call $\mathcal{A}^p$ the space of all smooth $p$-forms on a smooth compact manifold
$X$. Also, let $\mathcal{A}^p_{L^2}$ be the space of all $p$-forms with locally $L^2$ coefficients. We
have a sequence
\[ \cdots \to \mathcal{A}^{p-2} \xrightarrow{d} \mathcal{A}^{p-1} \xrightarrow{d} \mathcal{A}^p \xrightarrow{d} \mathcal{A}^{p+1} \xrightarrow{d} \mathcal{A}^{p+2} \to \cdots. \]
We are then trying to understand the gap in the inclusion $\ker(\mathcal{A}^p \to \mathcal{A}^{p+1}) \supseteq \operatorname{im}(\mathcal{A}^{p-1} \to \mathcal{A}^p)$.
But our method of solving this involves limits, so we are forced to look
instead at
\[ \mathcal{A}^{p-1}_{L^2} \xrightarrow{d_{p-1}} \mathcal{A}^p_{L^2} \xrightarrow{d_p} \mathcal{A}^{p+1}_{L^2}. \]
It is clear that $d_p \circ d_{p-1} = 0$. So the only thing we need is the a priori
estimate. We ask if $d_{p-1} d_{p-1}^* + d_p^* d_p \ge c$ is true. Then we have all our machinery
that allows us to solve the equation. But this is not all we want, because we are
interested in $\mathcal{A}^p$. So we have another regularity problem to solve.
It turns out that $d_{p-1} d_{p-1}^* + d_p^* d_p \ge c$ up to a finite dimension. That is,
there is some finite-dimensional subspace that behaves badly, and if we exclude the bad
guys by taking the orthogonal complement, we get the right inequality.
We will show that $(I + d_{p-1} d_{p-1}^* + d_p^* d_p)^{-1}$ is compact. This will prove
that we have the right estimate up to finite dimension, by the spectral theorem.
There are two ingredients, and the first one is Gårding’s inequality.
Theorem 16.1 (Gårding’s inequality). If $\varphi \in \mathcal{A}^p_{L^2}$ is in $\operatorname{Dom} d_p \cap \operatorname{Dom} d_{p-1}^*$,
then
\[ \|d_{p-1}^* \varphi\|^2_{\mathcal{A}^{p-1}_{L^2}} + \|d_p \varphi\|^2_{\mathcal{A}^{p+1}_{L^2}} + \|\varphi\|^2_{\mathcal{A}^p_{L^2}} \ge c \|\nabla \varphi\|^2_{\mathcal{A}^p_{L^2}}. \]
Note that this also implies that $\varphi$ is in $L^2_1$. The second ingredient is Rellich’s
lemma.
Lemma 16.2 (Rellich’s lemma). The map $L^2_r \hookrightarrow L^2_s$ is compact for $r > s$.
Proof. Let $\varphi_\nu$ be a sequence of $L^2$ $p$-forms on $X$ where the coefficients are in
$L^2_1$ and are bounded uniformly. We want to select a subsequence that converges
in $L^2$. We can localize by using a partition of unity. Then assume $\operatorname{supp} \varphi_\nu \subseteq (-\pi, \pi)^n$.
Look at the Fourier series and write
\[ \varphi_\nu = \sum a_{m_1, \ldots, m_n, \nu} \, e^{i m_1 x_1 + \cdots + i m_n x_n}. \]
Note that $g$ is bounded both above and below, and hence we may just assume $g$
is the standard metric, because we only need convergence. Then up to constant,
\[ \|\varphi_\nu\|^2 = \sum_{m_1, \ldots, m_n} |a_{m_1, \ldots, m_n, \nu}|^2 \le 1. \]
Together with the uniform $L^2_1$ bound, this shows that the tail part
$\sum_{\max_j |m_j| \ge N} |a_{m_1, \ldots, m_n, \nu}|^2$ is bounded by $1/N$, independent
of $\nu$. Then we can truncate this tail part, get a finite-dimensional
space, and find a subsequence. Now let $N \to \infty$ and use the diagonal argument.
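then
\[ d\varphi = \sum_{j,k} \partial_k \varphi_j \, dx_k \wedge dx_j = \sum_{j<k} (\partial_j \varphi_k - \partial_k \varphi_j) \, dx_j \wedge dx_k. \]
\[ \|d\varphi\|^2 = \sum_{j<k} \int (\partial_j \varphi_k - \partial_k \varphi_j)^2, \qquad \|d^* \varphi\|^2 = \int \Bigl( \sum_j \partial_j \varphi_j \Bigr)^2. \]
Also
\[ \|\nabla \varphi\|^2 = \sum_{j,k} \int (\partial_j \varphi_k)^2. \]
Here, we are adding sums of squares of linear combinations of partial derivatives
of coefficients. This already shows that you can’t bound $\|\nabla\varphi\|^2$ by $\|d\varphi\|^2$
and $\|d^*\varphi\|^2$ by linear algebra alone. This is because there are $n^2$ squares in $\|\nabla\varphi\|^2$
but only $n(n-1)/2$ squares in $\|d\varphi\|^2$ and $1$ square in $\|d^*\varphi\|^2$.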
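But this is not linear algebra and we have the squares. The idea is
$|\vec{u} \cdot \vec{v}|^2 + |\vec{u} \times \vec{v}|^2 = |\vec{u}|^2 |\vec{v}|^2$. Then
\[ |\operatorname{tr}(\vec{u} \otimes \vec{v})|^2 + \|\vec{u} \wedge \vec{v}\|^2 = \|\vec{u} \otimes \vec{v}\|^2 \]
and if you let $u = (\partial_1, \ldots, \partial_n)$ and $v = (\varphi_1, \ldots, \varphi_n)$, then you formally get the
inequality.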
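For ordinary vectors the identity above is Lagrange's identity, and it can be checked directly in any dimension. The sketch below assumes the convention that the wedge norm sums the squares $(u_j v_k - u_k v_j)^2$ over pairs $j < k$ and the tensor norm sums $(u_j v_k)^2$ over all pairs; the function name `lagrange_sides` and the sample vectors are illustrative.

```python
# Sketch: Lagrange's identity |u.v|^2 + |u ^ v|^2 = |u (x) v|^2 in dimension n.

def lagrange_sides(u, v):
    """Return (trace^2 + wedge norm^2, tensor norm^2) for vectors u and v."""
    n = len(u)
    trace_sq = sum(u[j] * v[j] for j in range(n)) ** 2
    wedge_sq = sum(
        (u[j] * v[k] - u[k] * v[j]) ** 2
        for j in range(n)
        for k in range(j + 1, n)
    )
    tensor_sq = sum((u[j] * v[k]) ** 2 for j in range(n) for k in range(n))
    return trace_sq + wedge_sq, tensor_sq


# Integer vectors keep the arithmetic exact.
lhs3, rhs3 = lagrange_sides([1, 2, 3], [4, 5, 6])
lhs5, rhs5 = lagrange_sides([2, -1, 0, 3, 5], [1, 4, -2, 0, 7])
print(lhs3, rhs3, lhs5, rhs5)
```

In the formal application, $u$ is the vector of partial derivatives, so the products $u_j v_k$ become $\partial_j \varphi_k$, and the identity only holds after integrating by parts.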
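So formally, pointwise,
\[ \sum_{j<k} (\partial_j \varphi_k - \partial_k \varphi_j)^2 = \frac{1}{2} \sum_{j,k} (\partial_j \varphi_k - \partial_k \varphi_j)^2 = \sum_{j,k} (\partial_j \varphi_k)^2 - \sum_{j,k} (\partial_k \varphi_j)(\partial_j \varphi_k). \]
But when we integrate, we can move $\partial$ around, so $\int (\partial_k \varphi_j)(\partial_j \varphi_k) = \int (\partial_j \varphi_j)(\partial_k \varphi_k)$.
This is the case for Euclidean space. But we have a Riemannian metric.
Here,
\[ \|d\varphi\|^2 = \int_X (\partial_j \varphi_k - \partial_k \varphi_j)(\partial_l \varphi_m - \partial_m \varphi_l) \, g^{jl} g^{km} \sqrt{\det g}. \]
Here, you can mostly ignore the $g^{jl} g^{km}$ part because we are working up to constant.
But when you do integration by parts, you end up taking the derivative of $g$.
This is why you need an additional $\|\varphi\|^2$ in the general case.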
What about for $p$-forms? Let me explain a bit about covariant differentiation
in this case. Because the difference quotient depends on the coordinates, you need a
connection on the manifold. Levi-Civita gave a way to get a connection from a
Riemannian metric.
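\[ \varphi * \chi_\varepsilon \to \varphi, \qquad d_p(\varphi * \chi_\varepsilon) \to d_p \varphi, \qquad d_{p-1}^*(\varphi * \chi_\varepsilon) \to d_{p-1}^* \varphi. \]
Then we use the $C^\infty$ version of Gårding’s inequality. To take care of $\nabla\varphi$, use
Fatou.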
If you want to solve the problem with boundary, then things quickly get
complicated. You have done the D2 case in the homework.
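\[ (\Delta|_{H^\perp})^{-1} \vec{e}_j = \frac{1}{\mu(\vec{e}_j)} \vec{e}_j, \]
which has bound $1/\mu_0$ where $\mu_0$ is the minimal nonzero $\mu_n$. We can extend
$(\Delta|_{H^\perp})^{-1}$ by $0$ on $H$. This operator $G$ is called the Green’s operator.
Now we are looking at
\[ \mathcal{A}^{p-1}_{L^2} \xrightarrow{d_{p-1}} \mathcal{A}^p_{L^2} \xrightarrow{d_p} \mathcal{A}^{p+1}_{L^2}. \]
\[ \psi = d_{p-1}^* G \varphi. \]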
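\[ d_{p-1} \psi = \varphi, \qquad d_{p-2}^* \psi = 0. \]
\[ d(\rho_\nu \varphi) = \rho_\nu \, d\varphi + E, \]
\[ d_p \varphi = f + E, \qquad d_{p-1}^* \varphi = g + E' \]
This $[D, d_p]$ is of order $1$ by the product formula. Because $D d_p \varphi$ and $[D, d_p]\varphi$
are in $L^2$, we get $d_p D\varphi \in L^2$. Then likewise we get $d_{p-1}^* D\varphi \in L^2$. Again
by Gårding, we get $D\varphi \in L^2_1$. Then we can bootstrap this iteratively to get
$\varphi \in C^\infty$.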
Here, we don’t know if the gradient is in $L^2$. So we use a difference quotient instead of
a partial derivative, and apply Gårding’s inequality. Then we can take the limit
using Fatou’s lemma. This problem arises because we can’t really do integration
by parts when everything is a weak derivative.
This is the weighted average of the exponential function. So after taking the
weighted average, it is still going to be complex-analytic. Then we have the
mean-value property. Even though f might have a zero, it might not have a
zero on a circle.
We want to solve differential equations in a weak sense. So we want to use
test functions. We can use the compactly supported smooth functions $C_c^\infty$. But
using the Schwartz space is good enough. These are the functions all of whose derivatives $D^\alpha f$
decay faster than any polynomial order.
When people tried to solve Laplace’s equation
\[ \sum_{j=1}^{n} \frac{\partial^2 u}{\partial x_j^2} = f, \]
So we have
\[ \widehat{\Bigl(\frac{\partial}{\partial x}\Bigr)^{\alpha} f}(\xi) = (2\pi i \xi)^\alpha \hat{f}(\xi). \]
So we would have
\[ \hat{u} = \frac{\hat{f}}{-4\pi^2 |\xi|^2}. \]
Here you have a problem because you’re dividing by zero.
Proof. The idea is to reflect the zeros outside the unit circle. Let us write
$P(z) = P_1(z)P_2(z)$ where the roots of $P_1(z)$ are outside the unit disk and those
of $P_2(z)$ are inside the unit disk. For $|\beta| < 1$, there is a Möbius transformation
\[ z \mapsto \frac{z - \beta}{1 - \bar{\beta} z} \]
which maps the circle to itself. The polynomial $P_2$ is the problem, so we reflect
to define
\[ \tilde{P}_2(z) = \prod_{|\beta| < 1,\, P(\beta) = 0} (1 - \bar{\beta} z). \]
So if we let $\tilde{P}(z) = P_1(z)\tilde{P}_2(z)$ then all roots of $\tilde{P}$ are outside the unit disk and
\[ |\tilde{P}(0)| = \prod_{|\alpha| \ge 1,\, P(\alpha) = 0} |\alpha| \ge 1, \qquad |P(e^{i\theta})F(e^{i\theta})| = |\tilde{P}(e^{i\theta})F(e^{i\theta})|. \]
Applying it to $\tilde{P}F$ gives
\[ |\tilde{P}(0)F(0)|^2 \le \frac{1}{2\pi} \int_{\theta=0}^{2\pi} |\tilde{P}(e^{i\theta})F(e^{i\theta})|^2 \, d\theta. \]
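The reflection step rests on the fact that a factor $z - \beta$ and its reflection $1 - \bar{\beta} z$ have the same modulus on the unit circle. A small numerical sketch, assuming a monic quadratic with one hypothetical root `beta` inside the disk and one hypothetical root `alpha` outside, illustrates both displayed claims.

```python
import cmath
import math

beta = 0.3 + 0.4j    # assumed root inside the unit disk
alpha = 1.5 - 0.5j   # assumed root outside the unit disk


def P(z):
    """Monic polynomial with roots alpha and beta."""
    return (z - alpha) * (z - beta)


def P_tilde(z):
    """Same polynomial with the factor (z - beta) reflected to (1 - conj(beta) z)."""
    return (z - alpha) * (1 - beta.conjugate() * z)


thetas = [2 * math.pi * k / 360 for k in range(360)]
# Largest difference between |P| and |P_tilde| over sample points on the circle.
max_gap = max(
    abs(abs(P(cmath.exp(1j * t))) - abs(P_tilde(cmath.exp(1j * t))))
    for t in thetas
)
value_at_zero = abs(P_tilde(0))
print(max_gap, value_at_zero)
```

Here `value_at_zero` equals $|\alpha|$, the product of the moduli of the roots outside the disk, which is why the reflected polynomial satisfies $|\tilde{P}(0)| \ge 1$.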
which is
\[ L^* = \sum_\alpha (-1)^{|\alpha|} \bar{a}_\alpha \Bigl(\frac{\partial}{\partial x}\Bigr)^{\alpha}. \]
We assume for now that $\Omega \subseteq \mathbb{R}^n$ is a bounded domain where we are solving the
equation.
So it suffices to find a $c > 0$ such that
If this is the case, we will be able to use Riesz representation to show that for
all $f \in L^2(\Omega)$ there exists a $u \in L^2(\Omega)$ such that $Lu = f$ and
$\|u\|_{L^2(\Omega)} \le c\|f\|_{L^2(\Omega)}$.
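How do we go from $L^2$ to holomorphic? Assume that $\Omega = (-M, M)$ with
$M > 0$. Assume $g$ is in $L^2(\mathbb{R})$ with $\operatorname{supp} g \subseteq \Omega$. Then I claim that
\[ \hat{g}(\xi + i\eta) = \int_{-M}^{M} g(x) e^{-2\pi i (\xi + i\eta) x} \, dx \]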
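But note that this is exactly in the situation of the key lemma. We need
\[ \int_{\xi, \sigma_1, \ldots, \sigma_n} |Q(\xi + i\eta, \sigma_1, \ldots, \sigma_n) \hat{\psi}(\xi + i\eta, \sigma_1, \ldots, \sigma_n)|^2 \]
But
\[ \widehat{L^*\psi}(\xi + \cos\theta + i\sin\theta, \sigma_1, \ldots, \sigma_n) = \widehat{L^*\psi}(\xi, \sigma_1, \ldots, \sigma_n) \, e^{-2\pi i (\xi + \cos\theta + i\sin\theta) x}. \]
Because $\Omega \subseteq [-M, M] \times \mathbb{R}^n$, the factor $e^{-2\pi i(\xi + \cos\theta + i\sin\theta)x}$ is bounded. This shows
that we have our inequality.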
19 November 2, 2017
Last time we looked at the method of Malgrange–Ehrenpreis. Here, we proved
the estimate
\[ \|u\|_{L^2(\Omega)} \le c \|L^* u\|_{L^2(\Omega)} \]
for all $u \in C_0^\infty(\Omega)$. Then we were able to conclude that $L$ is surjective and so
$Lu = f$ is always solvable.
Note that this is not a special case of the a priori estimate
The a priori estimate needs to hold for all $g \in \operatorname{Dom} S \cap \operatorname{Dom} T^*$. But we have
only shown this for $g$ a compactly supported smooth function.
When we say that $Lu = f$ in the weak sense, we are simply saying that
$(L^* g, u) = (g, f)$ for all $g \in C_0^\infty(\Omega)$. This is not saying that $u$ is in the domain
of $L$ and $Lu = f$. To do this, we need some Friedrichs-type argument and we have a
problem for higher order derivatives.
19.1 Distributions
Definition 19.1. A distribution or a generalized function is a continuous
linear functional on $C_0^\infty(\Omega) = \mathcal{D}(\Omega)$. The space of distributions is called $\mathcal{D}'(\Omega)$.
But what does continuous mean? This should mean that $\varphi_n \to 0$ implies
$T(\varphi_n) \to 0$.
Definition 19.2. We introduce a topology such that $\varphi_n \to 0$ means that there
exists a compact $K$ such that $\operatorname{supp} \varphi_n \subseteq K$ for all $n$, and $D^\alpha \varphi_n \to 0$ uniformly
for all $\alpha$.
So when $u \in L^2$, the function $Lu$ makes sense as a distribution. We define
it as
\[ (Lu, \varphi) = (u, L^* \varphi). \]
This is going to be a continuous functional.
This also comes up when you want $\Omega$ unbounded in Malgrange–Ehrenpreis.
If $\Omega \subseteq [-M, M] \times \mathbb{R}^n$ all is good because the Fourier transform is bounded, but
once $\Omega$ is too big we get into trouble. Then we are forced to use distributions.
Using this setting, we can use language like
\[ \Delta \Bigl( c_n \frac{1}{|x|^{n-2}} \Bigr) = \delta(x). \]
So we would like to say $\hat{f}(\varphi) = f(\hat{\varphi})$. But here we have trouble because $\hat{\varphi}$
doesn’t have compact support. I will talk about this more, but you can justify
this.
Let us write $E = c_n/|x|^{n-2}$ for the fundamental solution. Then
\[ \hat{E}(\xi) = -\frac{1}{4\pi^2 |\xi|^2}. \]
We want to say something like this, but this can’t be justified even if we know
the answer.
Recall that Plancherel follows from (1). If we let $g(x) = \overline{f(-x)}$ and evaluate
$f * g$ at $0$, then
\[ f * g(0) = \int \widehat{(f * g)}(\xi) \, d\xi = \int |\hat{f}|^2 \, d\xi. \]
The inversion formula follows from $e^{-\pi x^2}$ being its own Fourier transform and
the identity (2).
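The self-duality of the Gaussian used here can be checked by direct quadrature. The sketch below assumes the normalization $\hat{f}(\xi) = \int f(x) e^{-2\pi i x \xi}\,dx$, consistent with the factor $(2\pi i \xi)^\alpha$ appearing in these notes; the truncation width and step size are arbitrary illustrative choices.

```python
import math


def gaussian_fourier_transform(xi, half_width=8.0, step=0.001):
    """Trapezoid-rule approximation of the integral of e^{-pi x^2} e^{-2 pi i x xi}."""
    n = int(round(2 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0
        # The imaginary part cancels by symmetry, so only the cosine term is kept.
        total += weight * math.exp(-math.pi * x * x) * math.cos(2 * math.pi * x * xi)
    return total * step


frequencies = (0.0, 0.5, 1.0, 2.0)
errors = [
    abs(gaussian_fourier_transform(xi) - math.exp(-math.pi * xi * xi))
    for xi in frequencies
]
print(max(errors))
```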
Theorem 19.4. Let $P(D)$ be a (constant coefficient) polynomial in the partial
differential operators on $\mathbb{R}^n$. There exists a fundamental solution $E$ for $P(D)$,
which means that $E$ is a (tempered) distribution on $\mathbb{R}^n$ with
\[ P(D)E = \delta_0, \]
where $\delta_0$ is the Dirac delta at $0$.
We can basically repeat the same argument. But we are going to refine the
trick of Malgrange–Ehrenpreis. Recall that we were only using this trick when
$\Omega \subseteq [-M, M] \times \mathbb{R}^n$. Also, there was one direction $x$ such that the highest term
is just $(\partial/\partial x)^m$. This is not true anymore, so we are going to average over all
the special coordinates. That is, we let $z_j = e^{i\theta_j}$ and average over the torus
$T^n$.
Let us write $P = P_0 + P_1 + \cdots + P_N$ where $P_\nu$ is homogeneous of degree $\nu$.
We are going to use the mean-value property of a holomorphic function. For $f$
a holomorphic function on $\mathbb{C}^n$, consider
\[ F(\lambda) = f(z + r\lambda w) \]
where
\[ \frac{1}{A} = \int_{w \in T^n} |P_N(w)| \, \frac{d\theta_1}{2\pi} \cdots \frac{d\theta_n}{2\pi}. \]
20 November 7, 2017
We have looked at Malgrange–Ehrenpreis, and here we need the monic assumption
and also a bounded domain $\Omega \subseteq [-M, M] \times \mathbb{R}^n$. To get rid of the boundedness
assumption in one coordinate, we were forced to look at distributions.
Distributions form the dual of the space of test functions. The test functions supported on
a fixed compact set form a Fréchet space, and so the space $\mathcal{D}(\mathbb{R}^n)$ of all test functions
is a direct limit of Fréchet spaces. Then the dual is going to be the distributions.
Duals of Fréchet spaces are DF spaces, but we are looking at the dual of a direct
limit of Fréchet spaces.
Other than Malgrange–Ehrenpreis, there is a way of solving a differential
equation with probability. When I was at Yale, I once heard this ingenious idea
from Kakutani and was very impressed. Suppose you want to solve Dirichlet’s
problem with some boundary condition. To find the value at an interior point,
consider a random walk starting at this point. It is going
to hit the boundary at some finite time; look at the expected value of the
boundary function at the hitting point. This enjoys the harmonic property because
the walk has to first go somewhere, and then the expected value is the average of the
expected values of walks starting at the nearby points.
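The probabilistic idea can be tried in the simplest discrete setting. The sketch below is an assumption-laden toy, not the construction from the lecture: it uses a symmetric random walk on the integers $\{0, \ldots, N\}$, where discrete harmonic functions are linear, so the expected boundary value from a starting point $s$ with boundary data $0$ at $0$ and $1$ at $N$ should be $s/N$. The function name, the seed, and the number of trials are illustrative.

```python
import random


def estimate_harmonic_value(start, length, boundary_values, trials, rng):
    """Average the boundary value at the exit point of a symmetric random walk."""
    total = 0.0
    for _ in range(trials):
        position = start
        while 0 < position < length:
            position += rng.choice((-1, 1))
        total += boundary_values[0] if position == 0 else boundary_values[1]
    return total / trials


rng = random.Random(2017)   # fixed seed so the run is reproducible
estimate = estimate_harmonic_value(
    start=3, length=10, boundary_values=(0.0, 1.0), trials=20000, rng=rng
)
exact = 3 / 10              # linear interpolation of the boundary data
print(estimate, exact)
```

The mean-value property is visible in the code: after one step the walk sits at a neighbor with probability one half each, so the expected exit value at `start` is the average of the expected exit values at the two neighbors.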
then
\[ L_1 = \sum_{|\alpha| \le m} (-1)^{|\alpha|} a_\alpha \partial_x^\alpha. \]
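Using this we can write $(LE)\varphi = E(L_1\varphi)$, so we want this $E$ to send $L_1\varphi$ to
$\varphi(0)$.
Now it suffices to show that $L_1\varphi \mapsto \varphi(0)$ defined on $L_1\mathcal{S}(\mathbb{R}^n)$ is bounded.
Then we have Hahn–Banach and so extend to $\mathcal{S}(\mathbb{R}^n)$ with the same bound.
Note that on $\mathcal{S}(\mathbb{R}^n)$, the Fourier transform is an isomorphism of topological
vector spaces. Let $Q$ be the symbol (the characteristic polynomial) of $L_1$. Then
we want a bound for
\[ Q(\xi)\hat{\varphi}(\xi) \mapsto \varphi(0). \]
That is, we want to show that
\[ |\varphi(0)| \le C_\ell \|Q(\xi)\hat{\varphi}(\xi)\|_{\ell, \mathcal{S}}. \]
We apply Malgrange–Ehrenpreis to $\eta \mapsto Q(\eta + \xi)\hat{\varphi}(\eta + \xi)$ for fixed $\xi$. Then
we will get
\[ |\hat{\varphi}(\xi)| \le C \frac{1}{(2\pi)^n} \int_{\theta_j = 0}^{2\pi} |(Q\hat{\varphi})(\xi_1 + e^{i\theta_1}, \ldots, \xi_n + e^{i\theta_n})| \, d\theta_1 \cdots d\theta_n. \]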
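Then
\begin{align*}
|\varphi(0)| &\le \int_{\xi \in \mathbb{R}^n} |\hat{\varphi}(\xi)| \\
&\le \frac{C}{(2\pi)^n} \int_{\theta_j = 0}^{2\pi} \int_{\xi \in \mathbb{R}^n} |(Q\hat{\varphi})(\xi_1 + e^{i\theta_1}, \ldots, \xi_n + e^{i\theta_n})| \, d\xi \, d\theta_1 \cdots d\theta_n \\
&= C \int_{\xi \in \mathbb{R}^n} \|(Q\hat{\varphi})(\xi)\| \\
&= C \int_{\xi \in \mathbb{R}^n} \frac{1}{1 + |\xi|^{n+1}} (1 + |\xi|^{n+1}) |Q\hat{\varphi}(\xi)| \le C' \|\varphi\|_{\ell, \mathcal{S}},
\end{align*}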
because we have rapid decay. Here you need to be a bit careful because we are
shifting in the complex direction, but this can be done.
There is another way of handling this problem. If we have $LE = \delta_0$, then
we have $E(L_1\varphi) = \varphi(0)$. Then
\[ \varphi(0) = \int \hat{\varphi}(\xi) = \int \frac{Q\hat{\varphi}}{Q}. \]
If $Q$ has no zero, we are fine, but if $Q$ has a zero, we can’t just do this. So
what we can do is to shift the domain. Instead of $\xi_n$, we are going to use
$\xi_n + i\tilde{\eta}(\xi_1, \ldots, \xi_{n-1})$, where $\tilde{\eta}$ depends on $\xi_1, \ldots, \xi_{n-1}$. Then define
\[ E(\varphi) = \int_{\xi_1, \ldots, \xi_{n-1}} \int_{\xi_n \in \mathbb{R}} \frac{\hat{\varphi}}{Q}(\xi_1, \ldots, \xi_{n-1}, \xi_n + i\tilde{\eta}(\xi_1, \ldots, \xi_{n-1})). \]
21 November 9, 2017
People were looking at simple linear PDEs with constant coefficients. When
$L$ is a differential operator, we want to solve $LE = \delta_0$. Here $\mathcal{D}(\mathbb{R}^n)$
is the direct limit of the Fréchet spaces $\mathcal{D}_K(\mathbb{R}^n)$. This is not nice, so we look at
the Schwartz space $\mathcal{S}(\mathbb{R}^n)$, which is a Fréchet space. Another nice thing is that
the Fourier transform maps $\mathcal{S}(\mathbb{R}^n)$ to itself, and $\mathcal{S}(\mathbb{R}^n)'$ is the space of tempered distributions.
The fundamental solution was obtained first by Malgrange–Ehrenpreis by
reflection with the unit circle, and next by extension of a linear functional. We
wanted to take the Fourier transform and say $Q\hat{E} = 1$, but then we had this
problem of dividing by zero. What we want is
\[ E : L_1\varphi \mapsto \varphi(0) \]
and then extend it to $E : \mathcal{D}(\mathbb{R}^n) \to \mathbb{C}$. So we check that $E$ on $L_1(\mathcal{D}(\mathbb{R}^n))$ is
already continuous. Then we want to check that for fixed $K$,
\[ |\varphi(0)| \le C_K \|L_1\varphi\|_{\ell_K, K}. \]
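The Malgrange–Ehrenpreis trick is
\[ |\hat{\varphi}(\xi)| \le \frac{C}{(2\pi)^n} \int_{0 \le \theta_j \le 2\pi} |(Q\hat{\varphi})(\xi_1 + e^{i\theta_1}, \ldots, \xi_n + e^{i\theta_n})| \, d\theta_1 \cdots d\theta_n. \]
Then
\begin{align*}
|\varphi(0)| &= \Bigl| \int_{\xi \in \mathbb{R}^n} \hat{\varphi}(\xi) \Bigr| \le \int_{\xi \in \mathbb{R}^n} |\hat{\varphi}(\xi)| \\
&\le \frac{C}{(2\pi)^n} \int_{0 \le \theta_j \le 2\pi} \int_{\xi \in \mathbb{R}^n} |(Q\hat{\varphi})(\xi_1 + e^{i\theta_1}, \ldots, \xi_n + e^{i\theta_n})| \, d\theta_1 \cdots d\theta_n.
\end{align*}
Here
\[ (Q\hat{\varphi})(\xi_j + e^{i\theta_j}) = \int_{x \in K} (L_1\varphi)(x) \, e^{-2\pi i \sum_\nu x_\nu (\xi_\nu + \cos\theta_\nu + i\sin\theta_\nu)} \]
and so
\[ |(Q\hat{\varphi})(\xi_j + e^{i\theta_j})| \le \int_{x \in K} |(L_1\varphi)(x)| \, e^{2\pi \sum_\nu |x_\nu|} \le C''_K \sup_{x \in K} |(L_1\varphi)(x)|. \]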
and this would be an answer. This is already good for tempered distributions.
This is because we would have
\[ E(L_1\varphi) = \int_{\mathbb{R}^n} \frac{\widehat{L_1\varphi}}{Q} = \int_{\mathbb{R}^n} \hat{\varphi} = \varphi(0). \]
But what if $Q$ has a zero? We are going to choose $\tilde{\eta}_1$ such that, for fixed
$\xi_2, \ldots, \xi_n$, the set $\{\eta_1 = \tilde{\eta}_1\}$ is disjoint from the zeros of $Q(\xi_1 + i\eta_1, \xi_2, \ldots, \xi_n)$.
We can do this so that $|Q|$ is even bounded from below.
If $\varphi \in \mathcal{D}_K(\mathbb{R}^n)$ then $\hat{\varphi}$ is holomorphic on $\mathbb{C}^n$. Then we can write
\[ E(\varphi) = \int \frac{\hat{\varphi}}{Q}(\xi_1 + i\tilde{\eta}_1(\xi_2, \ldots, \xi_n), \xi_2, \ldots, \xi_n) \, d\xi_1 \cdots d\xi_n. \]
Then todo
Let f = LU . Then
Q ∗ f = (LQ) ∗ U = (δ0 + r) ∗ U = U + r ∗ U
Let me give you how to solve this, and then other dimensions will be similar.
and then
\[ R_n(x) = \frac{(x - a)^{n+1}}{(n+1)!} f^{(n+1)}(a + O(x - a)). \]
If there is a zero at a with order ≥ n, then
For the second case, we are dividing by a smooth function so we don’t have to
worry. For the first case, we are going to have
Then
\[ \Bigl| \frac{f(x)}{P(x)} \Bigr| \le \frac{1}{c} \sup_{|y - a| \le \eta} \frac{1}{h!} |f^{(h)}(y)| + \cdots \le C_\eta \sup_{\ell \le h+1,\, |y - x| \le \eta} |f^{(\ell)}(y)|. \]
For the points outside $\bigcup_j B_\eta(a_j)$, we have $|g| \le \tilde{C}|f|$ so
Then
\[ P(D^m g) = D^m f - \sum_{p=1}^{m} \binom{m}{p} (D^p P)(D^{m-p} g) \]
and use induction to get the bound.
You can read Hörmander, Arkiv för Mat. 53 (1958), 555–568 or M. Atiyah,
Comm. Pure Appl. Math 23 (1970), 145–150. Here is what Atiyah did. In the
case when the polynomial is locally nice, like
\[ P = \prod_{\nu=1}^{p_j} (x_\nu - a_\nu(j))^{\ell_{\nu,j}}, \]
then we are good. So what you do is to change coordinates and blow up. For
instance, if you have $P(x, y) = x^3 - y^2$ then we can use $y = xu$ and $x = v$ to get
$P = v^2(v - u^2)$, whose zero set away from $v = 0$ is the smooth curve $u^2 = v$.
Then you still get a manifold, and you can do analysis locally. After
you do the resolution of singularities and get a manifold, and also a “tempered
distribution” on the manifold, you push forward.
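The substitution in the cusp example is easy to verify by machine. The sketch below only checks the algebraic identity $P(v, vu) = v^2(v - u^2)$ on a grid of integer points; the function names and the sample grid are illustrative and are not taken from Atiyah's paper.

```python
def P(x, y):
    """The cuspidal cubic from the example."""
    return x ** 3 - y ** 2


def pulled_back(u, v):
    """P after the substitution x = v, y = x * u = v * u."""
    return P(v, v * u)


samples = [(u, v) for u in range(-3, 4) for v in range(-3, 4)]
# Points where the pulled-back polynomial differs from v^2 (v - u^2).
mismatches = [
    (u, v) for u, v in samples if pulled_back(u, v) != v ** 2 * (v - u ** 2)
]
print(mismatches)
```

After the blow-up the polynomial is a monomial $v^2$ times a factor with a smooth zero set, which is the locally nice form described above.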
This suggests that we can use Hölder to bound things. The following is a
theorem of Nirenberg, Ann. d. Scuola Normale Superiore d. Pisa 13 (1959),
115–162.
Assume that Ω is bounded for simplicity. Take a point x ∈ Ω, and then integrate
along the radial direction to get something about x. Then average over the paths
around the sphere and use Hölder’s inequality.
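But this is only for one coordinate. So we integrate with respect to $x_1$. The
first factor is independent of $x_1$, so we can take it out and have $n - 1$ factors.
Then by Hölder,
\[ \int |f_1 \cdots f_{n-1}| \le \Bigl( \int |f_1|^{n-1} \Bigr)^{\frac{1}{n-1}} \cdots \Bigl( \int |f_{n-1}|^{n-1} \Bigr)^{\frac{1}{n-1}}. \]
But we have to integrate in $x_2$ too. Again, there is precisely one factor that does
not depend on $x_2$, so we can do the same thing. After all this, we get
\[ \int_{\mathbb{R}^n} |2u(x)|^{\frac{n}{n-1}} \le \prod_{j=1}^{n} \Bigl( \int_{\mathbb{R}^n} |\partial_j u| \Bigr)^{\frac{1}{n-1}}. \]
Then, by the arithmetic–geometric mean inequality,
\[ \Bigl( \int_{\mathbb{R}^n} |2u(x)|^{\frac{n}{n-1}} \Bigr)^{\frac{n-1}{n}} \le \frac{1}{n} \sum_{j=1}^{n} \int_{\mathbb{R}^n} |\partial_j u| \]
\[ \|u\|_{L^{n/(n-1)}} \le C\|u\|_{L^1_1}. \]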
The usual way of getting this inequality is to replace $u$ by $|u|^\gamma$. Then the case
$p = 1$ implies
\[ \| |u|^\gamma \|_{L^{\frac{n}{n-1}}(\mathbb{R}^n)} \le C \|D(|u|^\gamma)\|_{L^1(\mathbb{R}^n)}, \]
and then you can use Hölder on the right hand side and choose an appropriate
$\gamma$.
\[ \| |u|^\gamma \|_{L^{\frac{n}{n-1}}} \le C\gamma \| |u|^{\gamma-1} \|_{L^{\frac{p}{p-1}}} \|Du\|_{L^p} \]
for $p, \gamma > 1$, by Hölder. We are going to iterate this to get the $L^\infty$ estimate.
First, note that we can scale the function so that $C\|Du\|_{L^p(\mathbb{R}^n)} = 1$. Then
\[ \| |u|^\gamma \|_{L^{\frac{n}{n-1}}} \le \gamma \| |u|^{\gamma-1} \|_{L^{\frac{p}{p-1}}}, \]
Let us assume that $p > n$, so that we have $n' > p'$, where $n' = n/(n-1)$ and $p' = p/(p-1)$.
Then we increase the norm exponent by a factor of $n'/p'$ every time we apply the inequality. Here, we
can be a bit generous and replace the $L^{(\gamma-1)p'}$ norm by the $L^{\gamma p'}$ norm, using
Hölder. There will be a constant coming from the volume of $\Omega$, but we can
again scale $\Omega$ so that its volume is $1$. Then
\[ \|u\|_{L^{\gamma n'}} \le \gamma^{1/\gamma} \|u\|_{L^{\gamma p'}}^{1 - \frac{1}{\gamma}}. \]
Write $\delta = n'/p'$. If you iterate this, you get
\[ \|u\|_{L^{\delta^\nu n'}} \le \prod_{\mu=1}^{\nu} (\delta^\mu)^{\frac{1}{\delta^\mu}}. \]
Theorem 25.3. Assume that $\mu, \nu$ are two (signed) $\sigma$-finite ($|\mu|$ and $|\nu|$) measures.
Then we can decompose $\nu = \nu_a + \nu_s$ where $\nu_a$ is absolutely continuous
with respect to $\mu$ (i.e., $\mu(E) = 0$ implies $\nu_a(E) = 0$) and $\nu_s$ is (mutually) singular
with respect to $\mu$ (i.e., $X = A \cup B$ for disjoint $A, B$ and $\operatorname{supp} \nu_s \subseteq B$ and
$\operatorname{supp} \mu \subseteq A$). Moreover,
\[ \nu_a(E) = \int_E f \, d\mu \]
for some $f$ measurable on $X$.
Proof. Von Neumann’s proof is the most elegant. This uses Riesz representation.
Let us consider the special case $\mu, \nu \ge 0$ and $\mu(X), \nu(X) < \infty$. Let $\rho = \mu + \nu$
and consider the Hilbert space $L^2(X, \rho)$. Consider the linear functional
\[ \Phi(f) = \int f \, d\nu. \]
Then this is continuous because $X$ has finite measure, and so there exists a
$g \in L^2(X, \rho)$ such that
\[ \Phi(f) = \int_X f \, d\nu = \int_X f g \, d\rho. \]
His main idea is that all the information between $\mu$ and $\nu$ is encoded in $g$. First
note that $0 \le g \le 1$ almost everywhere, because otherwise testing with $f = \chi_E$ for
$E = \{g < 0\}$ or $E = \{g > 1\}$ gives a contradiction. Now
consider the sets
\[ B = \{g = 1\}, \qquad A = \{g < 1\}. \]
Then $\mu(B) = 0$ by testing with $f = \chi_B$.
Now what happens in $E \subseteq A$? Set $f = \chi_E(1 + g + \cdots + g^{n-1})$. Then we get
\[ \int_E (1 - g^n) \, d\nu = \int_E g(1 + g + \cdots + g^{n-1}) \, d\mu. \]
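On a finite set the whole argument can be carried out by hand, which makes the role of $g$ concrete. The sketch below assumes a four-point space with hypothetical weights for $\mu$ and $\nu$. It also assumes the standard conclusion of the argument, which the excerpt does not show: letting $n \to \infty$ in the identity above gives the density $g/(1-g)$ on $A$.

```python
from fractions import Fraction

# Assumed point masses of mu and nu on a four-point space {0, 1, 2, 3}.
mu = [Fraction(1), Fraction(2), Fraction(0), Fraction(1)]
nu = [Fraction(3), Fraction(0), Fraction(2), Fraction(1)]

rho = [m + n for m, n in zip(mu, nu)]
g = [n / r for n, r in zip(nu, rho)]   # Riesz representative: nu = g * rho

B = [i for i, gi in enumerate(g) if gi == 1]
A = [i for i, gi in enumerate(g) if gi < 1]

# Assumed limit of the geometric series on A: d(nu_a)/d(mu) = g / (1 - g).
density = {i: g[i] / (1 - g[i]) for i in A}
nu_a = [density[i] * mu[i] if i in density else Fraction(0) for i in range(len(mu))]
nu_s = [nu[i] if i in B else Fraction(0) for i in range(len(mu))]


def identity_gap(n):
    """Difference of the two sides of the displayed identity with E = A."""
    left = sum((1 - g[i] ** n) * nu[i] for i in A)
    right = sum(g[i] * sum(g[i] ** k for k in range(n)) * mu[i] for i in A)
    return left - right


print(g, nu_a, nu_s, identity_gap(5))
```

In this example $B$ is the single point where $\mu$ has no mass, so $\nu_s$ lives on a $\mu$-null set while $\nu_a$ has a density with respect to $\mu$.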
Of course, you can’t add measure spaces. So you take a function f defined
on X, and let τ : X → X be an isometry. Then we define
(T f )(x) = f (τ (x)).
\[ f = F_0 + (1 - T)F_1 + F_2 \]
where $\|F_2\| < \varepsilon$. Then the $(1 - T)F_1$ term telescopes and contributes nothing in the limit.
So the averages converge to $F_0$ in $L^2$.
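A concrete instance shows the averages settling down. The sketch below assumes the circle $[0, 1)$ with an irrational rotation $\tau(x) = x + \alpha \bmod 1$, which preserves Lebesgue measure, and the test function $f(t) = 1 + \cos(2\pi t)$; these choices and the function name are illustrative. Here the invariant part $F_0$ is the constant $\int f = 1$, and the cosine part is of the form $(1 - T)F_1$.

```python
import math


def ergodic_average(f, x, alpha, m):
    """Average of f along the orbit x, x + alpha, ..., x + (m - 1) alpha (mod 1)."""
    return sum(f((x + k * alpha) % 1.0) for k in range(m)) / m


alpha = math.sqrt(2)                              # assumed irrational rotation number
f = lambda t: 1.0 + math.cos(2 * math.pi * t)     # space average equals 1
lengths = (10, 100, 1000, 10000)
averages = [ergodic_average(f, 0.1, alpha, m) for m in lengths]
print(averages)
```

For this $f$ the oscillating part sums as a geometric series, so the deviation from $1$ decays like a constant over $m$, which is the telescoping in the proof made explicit.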
Theorem 25.5 (Maximal ergodic theorem). Let $f \in L^1(X, \mu)$ and define a
maximal function
\[ f^*(x) = \sup_{m \ge 1} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))|. \]
almost everywhere, and by Fatou,
\[ \int_X |P'f| \, d\mu \le \int_X |f| \, d\mu. \]
Proof. First note that $L^2$ is dense in $L^1$ by truncating the range. Here, we have
the decomposition $H = S \oplus S^\perp$ and so $f = F_0 + (1 - T)F_1 + F_2$. We don’t
need anything for $F_0$, and $(1 - T)F_1$ also telescopes well. To deal with $F_2$, we
use the maximal bound. Then the set where it does not converge is going to have
measure zero.
Note that if $m^*_\alpha(E) < \infty$ for some $\alpha \ge 0$, then $m^*_\beta(E) = 0$ for $\beta > \alpha$. This
is because there is an extra factor of $\delta^{\beta - \alpha}$. Likewise, if $m^*_\alpha(E) > 0$ for some
$\alpha > 0$ then $m^*_\beta(E) = \infty$ for $\beta < \alpha$.
Definition 26.2. The Hausdorff dimension is defined as
$\sup\{\alpha : m^*_\alpha(E) = \infty\} = \inf\{\alpha : m^*_\alpha(E) = 0\}$.
If $F_1$ and $F_2$ are positively separated sets, then $m^*_\alpha(F_1) + m^*_\alpha(F_2) = m^*_\alpha(F_1 \cup F_2)$.
This is because we can set $\delta$ sufficiently small and they don’t interfere.
Example 26.3. Consider the Cantor set $C \subseteq [0, 1]$. Its Hausdorff dimension is
$\alpha = \log 2/\log 3$. We can cover $C$ by some $2^k$ intervals of the form $[3^{-k} l, 3^{-k}(l + 1)]$
and then $m^*_\alpha(C) \le 1$. The other direction uses the Cantor–Lebesgue function.
This function is $\alpha$-Hölder continuous. So we get $m^*_\alpha(C) > 0$.
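The upper bound in the Cantor set example comes from a one-line computation: the $2^k$ covering intervals of length $3^{-k}$ contribute $2^k \cdot 3^{-k\alpha}$, which equals $1$ exactly when $\alpha = \log 2/\log 3$. The sketch below evaluates this covering sum at the critical exponent and slightly above and below it; the offsets $\pm 0.1$ and the function name are illustrative.

```python
import math

alpha = math.log(2) / math.log(3)


def covering_sum(k, exponent):
    """Sum of (length)^exponent over the 2^k intervals of length 3^-k covering C."""
    return (2 ** k) * (3.0 ** (-k)) ** exponent


critical = [covering_sum(k, alpha) for k in range(1, 21)]
above = covering_sum(20, alpha + 0.1)   # tends to 0 as k grows
below = covering_sum(20, alpha - 0.1)   # tends to infinity as k grows
print(critical[-1], above, below)
```

This only illustrates the upper bound and the dichotomy around the critical exponent; the lower bound $m^*_\alpha(C) > 0$ still needs the Hölder argument from the text.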
where β = α/γ.
Example 26.5. There is also the Sierpinski triangle. This set is a triangle,
minus the middle triangle of side length $\frac{1}{2}$, and from each of the three remaining
triangles remove the middle half-size triangle, and so on. This has dimension $\log 3/\log 2$.
Example 26.6. There is also the Koch curve. There is also the Peano curve,
or the space-filling curve.
This is why we need the condition $d \ge 3$. The main tool we were using was the
invariance of $L^2$ under the Fourier transform. Here, we are using the Euclidean
version and the spherical version. So if $f$ is continuous and compactly supported,
then
\[ \int_{S^{d-1}} \int_{-\infty}^{\infty} |\hat{R}(f)(\lambda, \gamma)|^2 |\lambda|^{d-1} \, d\lambda \, d\sigma(\gamma) = 2 \int_{\mathbb{R}^d} |f(x)|^2 \, dx. \]