
Introduction to pseudo-differential operators

Michael Ruzhansky
January 21, 2014

Abstract
The present notes give an introduction to the theory of pseudo-differential operators
on Euclidean spaces. The first part is devoted to the necessary analysis of
functions, such as basics of the Fourier analysis and the theory of distributions
and Sobolev spaces. The second part is devoted to pseudo-differential operators
and their applications to partial differential equations. We refer to the monograph
[1] by Ruzhansky and Turunen for further details on this theory on the Euclidean
space, torus, and more general compact Lie groups and homogeneous spaces. This
book will be the main source of examples and further details to complement these
notes. The course will contain the material from these notes and from [1], but not
everything, with the rest of the material provided here for the reader’s convenience.
The reader interested in the relations of this topic to the harmonic analysis
may consult the monograph [3] by Stein, and those interested in the relations to
the spectral theory can consult the monograph [2] by Shubin, to mention only very
few related sources of material.

Contents
1 Analysis of functions
1.1 Basic properties of the Fourier transform
1.1.1 Lp–spaces
1.1.2 Definition of the Fourier transform
1.1.3 Fourier transform in L1(Rn)
1.1.4 Riemann-Lebesgue theorem
1.1.5 Schwartz space
1.1.6 Differentiation and multiplication
1.1.7 Fourier transform in the Schwartz space
1.1.8 Fourier inversion formula
1.1.9 Multiplication formula for the Fourier transform
1.1.10 Fourier transform of Gaussian distributions
1.1.11 Proof of the Fourier inversion formula
1.1.12 Convolutions
1.2 Inequalities
1.2.1 Cauchy's inequality
1.2.2 Cauchy–Schwarz inequality
1.2.3 Young's inequality
1.2.4 Hölder's inequality
1.2.5 Minkowski's inequality
1.2.6 Interpolation for Lp–norms
1.3 Fourier transforms of distributions
1.3.1 Tempered distributions
1.3.2 Fourier transform of tempered distributions
1.3.3 Two principles for tempered distributions
1.3.4 Functions as distributions
1.3.5 Plancherel's formula
1.3.6 Operations with distributions
1.3.7 Fourier inversion formula for tempered distributions
1.3.8 C0∞(Ω) and sequential density
1.3.9 Distributions (generalised functions)
1.3.10 Weak derivatives
1.3.11 Sobolev spaces
1.3.12 Example of a point singularity
1.3.13 Some properties of Sobolev spaces
1.3.14 Mollifiers
1.3.15 Approximation of Sobolev space functions

2 Pseudo-differential operators on Rn
2.1 Analysis of operators
2.1.1 Motivation and definition
2.1.2 Freezing principle for PDEs
2.1.3 Pseudo-differential operators on the Schwartz space
2.1.4 Alternative definition of pseudo-differential operators
2.1.5 Pseudo-differential operators on S′(Rn)
2.1.6 Kernel representation of pseudo-differential operators
2.1.7 Smoothing operators
2.1.8 Convolution of distributions
2.1.9 L2–boundedness of pseudo-differential operators
2.1.10 Compositions of pseudo-differential operators
2.1.11 Amplitudes
2.1.12 Symbols of pseudo-differential operators
2.1.13 Symbols of amplitude operators
2.1.14 Adjoint operators
2.1.15 Changes of variables
2.1.16 Principal symbol and classical symbols
2.1.17 Asymptotic sums
2.2 Applications to partial differential equations
2.2.1 Solving partial differential equations
2.2.2 Elliptic equations
2.2.3 Parametrix for elliptic operators and estimates
2.2.4 Sobolev spaces revisited
2.2.5 Proof of the statement on the Lp–continuity
2.2.6 Calculus proof of L2–boundedness

3 Appendix
3.1 Interpolation
3.1.1 Distribution functions
3.1.2 Weak type (p, p)
3.1.3 Marcinkiewicz interpolation theorem
3.1.4 Calderon–Zygmund covering lemma
3.1.5 Remarks on Lp–continuity of pseudo-differential operators

1 Analysis of functions
1.1 Basic properties of the Fourier transform
1.1.1 Lp –spaces
Let Ω ⊂ Rn be a measurable subset of Rn . For simplicity, we may always think of Ω
being open or closed in Rn . In this section we will mostly have Ω = Rn .
Let $1 \le p < \infty$. A function $f : \Omega \to \mathbb{C}$ is said to be in $L^p(\Omega)$ if it is measurable and
its norm
\[ \|f\|_{L^p(\Omega)} = \left( \int_\Omega |f(x)|^p \, dx \right)^{1/p} \]
is finite. In particular, $L^1(\Omega)$ is the space of absolutely integrable functions on $\Omega$ with
\[ \|f\|_{L^1(\Omega)} = \int_\Omega |f(x)| \, dx. \]
In the case $p = \infty$, it is said that $f \in L^\infty(\Omega)$ if it is measurable and essentially bounded,
i.e. if
\[ \|f\|_{L^\infty(\Omega)} = \operatorname{ess\,sup}_{x \in \Omega} |f(x)| \]
is finite. Here esssupx∈Ω |f (x)| is defined as the smallest M such that |f (x)| ≤ M for
almost all x ∈ Ω. We will often abbreviate the ||f ||Lp (Ω) norm by ||f ||Lp , or by ||f ||p , if
the choice of Ω is clear from the context.

1.1.2 Definition of the Fourier transform


For $f \in L^1(\mathbb{R}^n)$ we define its Fourier transform by
\[ (\mathcal{F}f)(\xi) = \widehat{f}(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} f(x) \, dx. \]

In fact, other similar definitions are often encountered in the literature. For example,
one can use e−ix·ξ instead of e−2πix·ξ , multiply the integral by the constant (2π)−n/2 , etc.
Changes in definitions may lead to changes in constants in formulas.
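
As a rough numerical illustration of this definition in dimension n = 1, the following Python sketch approximates the integral by the trapezoidal rule on a truncated interval. The helper names `fourier_transform` and `exact`, the truncation radius and the number of steps are illustrative assumptions and not part of the notes. The test function is f(x) = e^{-|x|}, whose transform under the convention above is 2/(1 + 4π²ξ²).

```python
import cmath
import math


def fourier_transform(f, xi, radius=40.0, steps=80_000):
    """Approximate the integral of exp(-2*pi*i*x*xi) * f(x) over [-radius, radius].

    Composite trapezoidal rule in one dimension; `radius` and `steps`
    are illustrative choices, not canonical values.
    """
    h = 2.0 * radius / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -radius + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-2j * math.pi * x * xi) * f(x)
    return total * h


def f(x):
    return math.exp(-abs(x))


def exact(xi):
    # Closed form of the transform of exp(-|x|) for the kernel exp(-2*pi*i*x*xi).
    return 2.0 / (1.0 + 4.0 * math.pi ** 2 * xi ** 2)


if __name__ == "__main__":
    for xi in (0.0, 0.25, 1.0):
        print(xi, fourier_transform(f, xi).real, exact(xi))
```

With the kernel e^{-ix·ξ} instead, the same function would have the transform 2/(1 + ξ²), which illustrates how the choice of convention changes the constants.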

1.1.3 Fourier transform in L1 (Rn )
It is easy to check that $\mathcal{F} : L^1(\mathbb{R}^n) \to L^\infty(\mathbb{R}^n)$ is a bounded linear operator with norm
one:
\[ \|\widehat{f}\|_\infty \le \|f\|_1. \]
Moreover, if $f \in L^1(\mathbb{R}^n)$, its Fourier transform $\widehat{f}$ is continuous, which follows from
Lebesgue's dominated convergence theorem. For completeness, let us state it here:

Lebesgue's dominated convergence theorem. Let $\{f_k\}_{k=1}^\infty$ be a sequence of measurable
functions on $\Omega$ such that $f_k \to f$ pointwise almost everywhere on $\Omega$ as $k \to \infty$.
Suppose there is an integrable function $g \in L^1(\Omega)$ such that $|f_k| \le g$ for all $k$. Then $f$ is
integrable and
\[ \int_\Omega f \, dx = \lim_{k\to\infty} \int_\Omega f_k \, dx. \]

1.1.4 Riemann-Lebesgue theorem


It is quite difficult to characterise the image of the space L1 (Rn ) under the Fourier
transform. But we have the following theorem.

Riemann-Lebesgue Theorem. Let $f \in L^1(\mathbb{R}^n)$. Then its Fourier transform $\widehat{f}$ is a
continuous function on $\mathbb{R}^n$ vanishing at infinity, i.e. $\widehat{f}(\xi) \to 0$ as $\xi \to \infty$.
Proof. It is enough to make an explicit calculation for $f$ being the characteristic function
of a cube and then use a standard limiting argument from measure theory. Thus,
let $f$ be the characteristic function of the cube $[-1,1]^n$, i.e. $f(x) = 1$ for $x \in [-1,1]^n$ and
$f(x) = 0$ otherwise. Then
\[
\widehat{f}(\xi) = \int_{[-1,1]^n} e^{-2\pi i x\cdot\xi} \, dx
= \prod_{j=1}^n \int_{-1}^{1} e^{-2\pi i x_j \xi_j} \, dx_j
= \prod_{j=1}^n \frac{1}{-2\pi i \xi_j} \, e^{-2\pi i x_j \xi_j} \Big|_{-1}^{1}
= \left( \frac{i}{2\pi} \right)^n \frac{1}{\xi_1 \cdots \xi_n} \prod_{j=1}^n \left( e^{-2\pi i \xi_j} - e^{2\pi i \xi_j} \right).
\]
The product of the differences of exponentials is bounded, so the whole expression tends to zero
as $\xi \to \infty$ away from the coordinate axes. In the case when some of the $\xi_j$'s are zero, an
obvious modification of this argument yields the same result.
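
To make the decay in this computation concrete, here is a small Python sketch under the same convention. Each one-dimensional factor in the product equals sin(2πξ_j)/(πξ_j), which is bounded by 2 and by 1/(π|ξ_j|); since the largest |ξ_j| is at least |ξ|/√n, this gives |f̂(ξ)| ≤ 2^{n-1}√n/(π|ξ|), including the case when some ξ_j vanish. The function names are hypothetical helpers introduced only for this illustration.

```python
import math


def cube_transform_1d(xi):
    """Transform of the indicator of [-1, 1] for the kernel exp(-2*pi*i*x*xi)."""
    if xi == 0.0:
        return 2.0  # the integral of 1 over [-1, 1]
    return math.sin(2.0 * math.pi * xi) / (math.pi * xi)


def cube_transform(xi_vector):
    """Transform of the indicator of [-1, 1]^n as a product of 1D factors."""
    result = 1.0
    for xi in xi_vector:
        result *= cube_transform_1d(xi)
    return result


def decay_bound(xi_vector):
    """The bound 2^(n-1) * sqrt(n) / (pi * |xi|) discussed in the lead-in."""
    n = len(xi_vector)
    norm = math.sqrt(sum(xi * xi for xi in xi_vector))
    return 2.0 ** (n - 1) * math.sqrt(n) / (math.pi * norm)


if __name__ == "__main__":
    for xi in [(3.0, 0.0), (0.25, 7.5), (100.3, -40.1)]:
        print(xi, abs(cube_transform(xi)), decay_bound(xi))
```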

1.1.5 Schwartz space


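We define the Schwartz space $\mathcal{S}(\mathbb{R}^n)$ of rapidly decreasing functions as follows. First, for
multi-indices $\alpha = (\alpha_1, \ldots, \alpha_n)$ and $\beta = (\beta_1, \ldots, \beta_n)$ with integer entries $\alpha_j, \beta_j \ge 0$, we
define
\[
\partial^\alpha \varphi(x) = \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}} \varphi(x)
\quad\text{and}\quad
x^\beta = x_1^{\beta_1} \cdots x_n^{\beta_n}.
\]
For such multi-indices we will write $\alpha, \beta \ge 0$. Then we say that $\varphi \in \mathcal{S}(\mathbb{R}^n)$ if $\varphi$ is smooth
on $\mathbb{R}^n$ and
\[ \sup_{x\in\mathbb{R}^n} \left| x^\beta \partial^\alpha \varphi(x) \right| < \infty \]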

holds for all multi-indices α, β ≥ 0. The length of the multi-index α will be denoted by
|α| = α1 + · · · + αn .
It is easy to see that a smooth function $\varphi$ is in the Schwartz space if and only if for
all $\alpha \ge 0$ and $N \ge 0$ there is a constant $C_{\alpha,N}$ such that
\[ |\partial^\alpha \varphi(x)| \le C_{\alpha,N} (1 + |x|)^{-N} \]
holds for all $x \in \mathbb{R}^n$.


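The space $\mathcal{S}(\mathbb{R}^n)$ is a topological space. Without going into detail, let us only
introduce the convergence of functions in $\mathcal{S}(\mathbb{R}^n)$. We will say that $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ as
$j \to \infty$, if $\varphi_j, \varphi \in \mathcal{S}(\mathbb{R}^n)$ and if
\[ \sup_{x\in\mathbb{R}^n} \left| x^\beta \partial^\alpha (\varphi_j - \varphi)(x) \right| \to 0 \]
as $j \to \infty$, for all multi-indices $\alpha, \beta \ge 0$.

If one is familiar with functional analysis, one can take the expressions
\[ \sup_{x\in\mathbb{R}^n} \left| x^\beta \partial^\alpha \varphi(x) \right| \]
as seminorms on the space $\mathcal{S}(\mathbb{R}^n)$, turning it into a locally convex linear topological
space.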

1.1.6 Differentiation and multiplication


Since the definition of the Fourier transform contains the complex exponential, it is often
convenient to use the notation
\[ D_j = \frac{1}{2\pi i} \frac{\partial}{\partial x_j} \quad\text{and}\quad D^\alpha = D_1^{\alpha_1} \cdots D_n^{\alpha_n}. \]
If $D_j$ is applied to a function of $\xi$, it will mean $\frac{1}{2\pi i} \frac{\partial}{\partial \xi_j}$; there should
be no confusion with this convention. The following theorem relates multiplication and
differentiation under the Fourier transform.

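Theorem. Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then
\[ \widehat{D_j \varphi}(\xi) = \xi_j \widehat{\varphi}(\xi) \quad\text{and}\quad \widehat{x_j \varphi}(\xi) = -D_j \widehat{\varphi}(\xi). \]
Proof. From the definition we readily see that
\[ D_j \widehat{\varphi}(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} (-x_j) \varphi(x) \, dx. \]
This gives the second formula. Since the integrals converge uniformly, we can integrate
by parts with respect to $x_j$ in the following expression to get
\[
\xi_j \widehat{\varphi}(\xi) = \int_{\mathbb{R}^n} \left( -D_j e^{-2\pi i x\cdot\xi} \right) \varphi(x) \, dx
= \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} D_j \varphi(x) \, dx.
\]
This implies the first formula. Note that we do not get boundary terms when integrating
by parts because the function $\varphi$ vanishes at infinity.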
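
The two formulas can be checked numerically on a concrete example. The sketch below, in one dimension, uses the Gaussian φ(x) = e^{-πx²}, for which φ̂(ξ) = e^{-πξ²}, Dφ(x) = i x φ(x) and Dφ̂(ξ) = i ξ φ̂(ξ) by direct differentiation. The helper `transform` and the quadrature parameters are assumptions made for this illustration only.

```python
import cmath
import math


def transform(g, xi, radius=8.0, steps=1600):
    """Riemann-sum approximation of the integral of exp(-2*pi*i*x*xi) * g(x)."""
    h = 2.0 * radius / steps
    return h * sum(
        cmath.exp(-2j * math.pi * (-radius + k * h) * xi) * g(-radius + k * h)
        for k in range(steps + 1)
    )


def phi(x):
    return math.exp(-math.pi * x * x)


def d_phi(x):
    # D phi = (1 / (2*pi*i)) * phi'(x), computed by hand for this Gaussian.
    return 1j * x * phi(x)


def x_phi(x):
    return x * phi(x)


def d_of_phi_hat(xi):
    # D applied to phi_hat(xi) = exp(-pi * xi^2), also computed by hand.
    return 1j * xi * math.exp(-math.pi * xi * xi)


if __name__ == "__main__":
    xi = 0.7
    print(transform(d_phi, xi), xi * transform(phi, xi))   # first formula
    print(transform(x_phi, xi), -d_of_phi_hat(xi))         # second formula
```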

This theorem allows one to tackle some differential equations already. For example, let
us look at the equation
\[ \Delta u = f \]
with the Laplace operator
\[ \Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_n^2}. \]
Taking the Fourier transform and using the theorem we arrive at the equation
\[ -4\pi^2 |\xi|^2 \, \widehat{u} = \widehat{f}. \]
If we knew how to invert the Fourier transform we could find the solution
\[ u = -\mathcal{F}^{-1} \left( \frac{1}{4\pi^2 |\xi|^2} \, \widehat{f} \right). \]

1.1.7 Fourier transform in the Schwartz space


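Corollary. Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then
\[ \xi^\beta D^\alpha \widehat{\varphi}(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} D^\beta \left( (-x)^\alpha \varphi(x) \right) dx. \]
Hence also
\[
\sup_{\xi\in\mathbb{R}^n} \left| \xi^\beta D^\alpha \widehat{\varphi}(\xi) \right|
\le C \sup_{x\in\mathbb{R}^n} (1 + |x|)^{n+1} \left| D^\beta (x^\alpha \varphi(x)) \right|,
\]
with $C = \int_{\mathbb{R}^n} (1 + |x|)^{-n-1} \, dx < \infty$.

Here we used the following useful

Integrability criterion. We have
\[ \int_{\mathbb{R}^n} \frac{dx}{(1 + |x|)^\rho} < \infty \quad\text{if and only if}\quad \rho > n. \]
We also have
\[ \int_{|x|\le 1} \frac{dx}{|x|^\rho} < \infty \quad\text{if and only if}\quad \rho < n. \]
Both of these criteria can be easily seen by passing to polar coordinates.
Corollary 1.1.7 implies that the Fourier transform $\mathcal{F}$ maps $\mathcal{S}(\mathbb{R}^n)$ to itself. In fact,
much more is true (see the next paragraph). Let us note here that Corollary 1.1.7
together with Lebesgue's dominated convergence theorem imply that the Fourier
transform $\mathcal{F} : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is continuous, i.e. $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ implies
$\widehat{\varphi_j} \to \widehat{\varphi}$ in $\mathcal{S}(\mathbb{R}^n)$.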

1.1.8 Fourier inversion formula
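Theorem. The Fourier transform $\mathcal{F} : \varphi \mapsto \widehat{\varphi}$ is an isomorphism of $\mathcal{S}(\mathbb{R}^n)$ into $\mathcal{S}(\mathbb{R}^n)$,
whose inverse is given by
\[ \varphi(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi} \, \widehat{\varphi}(\xi) \, d\xi. \]
This formula is called the Fourier inversion formula and the inverse Fourier transform
is denoted by
\[ (\mathcal{F}^{-1} f)(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi} f(\xi) \, d\xi. \]
Thus, we can say that
\[ \mathcal{F} \circ \mathcal{F}^{-1} = \mathcal{F}^{-1} \circ \mathcal{F} = \text{identity on } \mathcal{S}(\mathbb{R}^n). \]
The proof of this theorem will rely on several lemmas which are of interest in their
own right.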

1.1.9 Multiplication formula for the Fourier transform


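Theorem. Let $f, g \in L^1(\mathbb{R}^n)$. Then $\int_{\mathbb{R}^n} \widehat{f} g \, dx = \int_{\mathbb{R}^n} f \widehat{g} \, dx$.
Proof. We will apply Fubini's theorem. Thus,
\[
\int_{\mathbb{R}^n} \widehat{f} g \, dx
= \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} e^{-2\pi i x\cdot y} f(y) \, dy \right) g(x) \, dx
= \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} e^{-2\pi i x\cdot y} g(x) \, dx \right) f(y) \, dy
= \int_{\mathbb{R}^n} \widehat{g} f \, dy.
\]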

1.1.10 Fourier transform of Gaussian distributions


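We will show the equality
\[
\int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} e^{-2\pi^2 \epsilon |x|^2} \, dx = (2\pi\epsilon)^{-n/2} e^{-|\xi|^2/(2\epsilon)}
\tag{1.1}
\]
for $\epsilon > 0$. By the change of variables $2\pi x \to x$ it is equivalent to
\[
\int_{\mathbb{R}^n} e^{-i x\cdot\xi} e^{-\epsilon |x|^2/2} \, dx = \left( \frac{2\pi}{\epsilon} \right)^{n/2} e^{-|\xi|^2/(2\epsilon)}.
\tag{1.2}
\]
We will use the standard identities
\[
\int_{-\infty}^{\infty} e^{-t^2/2} \, dt = \sqrt{2\pi}
\quad\text{and hence}\quad
\int_{\mathbb{R}^n} e^{-|x|^2/2} \, dx = (2\pi)^{n/2}.
\tag{1.3}
\]
In fact, (1.2) will follow from the one-dimensional case, where we have
\[
\int_{-\infty}^{\infty} e^{-it\tau} e^{-t^2/2} \, dt
= e^{-\tau^2/2} \int_{-\infty}^{\infty} e^{-(t+i\tau)^2/2} \, dt
= e^{-\tau^2/2} \int_{-\infty}^{\infty} e^{-t^2/2} \, dt
= \sqrt{2\pi} \, e^{-\tau^2/2},
\]
where we used the Cauchy theorem about changing the contour of integration for analytic
functions. Changing $t \to \sqrt{\epsilon}\, t$ and $\tau \to \tau/\sqrt{\epsilon}$ gives
\[
\sqrt{\epsilon} \int_{-\infty}^{\infty} e^{-it\tau} e^{-\epsilon t^2/2} \, dt = \sqrt{2\pi} \, e^{-\tau^2/(2\epsilon)}.
\]
Extending this to $n$ dimensions yields (1.2).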
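
A quick numerical sanity check of (1.1) in dimension n = 1 can be sketched as follows, writing `eps` for the positive parameter in the Gaussian. The function names, the truncation radius and the step count are illustrative assumptions; the quadrature is a plain Riemann sum, which is very accurate for Gaussians.

```python
import cmath
import math


def gaussian_transform(xi, eps, radius=10.0, steps=4000):
    """Riemann sum for the left-hand side of (1.1) with n = 1."""
    h = 2.0 * radius / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        x = -radius + k * h
        total += cmath.exp(-2j * math.pi * x * xi) * math.exp(
            -2.0 * math.pi ** 2 * eps * x * x
        )
    return total * h


def right_hand_side(xi, eps):
    """Right-hand side of (1.1) with n = 1."""
    return (2.0 * math.pi * eps) ** -0.5 * math.exp(-xi * xi / (2.0 * eps))


if __name__ == "__main__":
    for eps in (0.05, 0.5, 2.0):
        print(eps, gaussian_transform(0.3, eps).real, right_hand_side(0.3, eps))
```

The choice eps = 1/(2π) recovers the familiar statement that e^{-πx²} is its own Fourier transform under this convention.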

1.1.11 Proof of the Fourier inversion formula
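Now we can prove the Fourier inversion formula 1.1.8. So, for $\varphi \in \mathcal{S}(\mathbb{R}^n)$, we want to
prove that
\[ \varphi(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi} \, \widehat{\varphi}(\xi) \, d\xi. \]
By Lebesgue's dominated convergence theorem we can replace the right hand side
by
\begin{align*}
\mathrm{RHS}
&= \lim_{\epsilon\to 0} \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi} \, \widehat{\varphi}(\xi) \, e^{-2\pi^2 \epsilon |\xi|^2} \, d\xi \\
&= \lim_{\epsilon\to 0} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{2\pi i (x-y)\cdot\xi} \varphi(y) \, e^{-2\pi^2 \epsilon |\xi|^2} \, dy \, d\xi
&& \text{(change $y \to y + x$)} \\
&= \lim_{\epsilon\to 0} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{-2\pi i y\cdot\xi} \varphi(y + x) \, e^{-2\pi^2 \epsilon |\xi|^2} \, dy \, d\xi
&& \text{(Fubini's theorem)} \\
&= \lim_{\epsilon\to 0} \int_{\mathbb{R}^n} \varphi(y + x) \left( \int_{\mathbb{R}^n} e^{-2\pi i y\cdot\xi} e^{-2\pi^2 \epsilon |\xi|^2} \, d\xi \right) dy
&& \text{(F.T. of Gaussian)} \\
&= \lim_{\epsilon\to 0} \int_{\mathbb{R}^n} \varphi(y + x) \, (2\pi\epsilon)^{-n/2} e^{-|y|^2/(2\epsilon)} \, dy
&& \text{(change $y = \sqrt{\epsilon}\, z$)} \\
&= (2\pi)^{-n/2} \lim_{\epsilon\to 0} \int_{\mathbb{R}^n} \varphi(\sqrt{\epsilon}\, z + x) \, e^{-|z|^2/2} \, dz \\
&= \varphi(x).
\end{align*}
This finishes the proof.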


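Remark. In fact, in the proof we implicitly made use of the following useful relation
between Fourier transforms and translations of functions. Let $h \in \mathbb{R}^n$ and define
$(\tau_h f)(x) = f(x - h)$. Then we also see that
\begin{align*}
\widehat{(\tau_h f)}(\xi)
&= \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} (\tau_h f)(x) \, dx \\
&= \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} f(x - h) \, dx && \text{(change $y = x - h$)} \\
&= \int_{\mathbb{R}^n} e^{-2\pi i (y+h)\cdot\xi} f(y) \, dy \\
&= e^{-2\pi i h\cdot\xi} \widehat{f}(\xi).
\end{align*}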

1.1.12 Convolutions
For functions $f, g \in L^1(\mathbb{R}^n)$, we define their convolution by
\[ (f * g)(x) = \int_{\mathbb{R}^n} f(x - y) g(y) \, dy. \]

It is easy to see that f ∗ g ∈ L1 (Rn ) with norm inequality

||f ∗ g||L1 (Rn ) ≤ ||f ||L1 (Rn ) ||g||L1 (Rn )

and that
f ∗ g = g ∗ f.
Also, in particular for f, g ∈ S(Rn ), integrals are absolutely convergent and we can
differentiate under the integral sign to get

∂ α (f ∗ g) = ∂ α f ∗ g = f ∗ ∂ α g.

The following properties relate convolutions with Fourier transforms.


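Theorem. Let $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$. Then we have

(i) $\int_{\mathbb{R}^n} \varphi \overline{\psi} \, dx = \int_{\mathbb{R}^n} \widehat{\varphi} \, \overline{\widehat{\psi}} \, d\xi$;

(ii) $\widehat{\varphi * \psi}(\xi) = \widehat{\varphi}(\xi) \widehat{\psi}(\xi)$;

(iii) $\widehat{\varphi\psi}(\xi) = (\widehat{\varphi} * \widehat{\psi})(\xi)$.

Proof. (i) Let us denote
\[
\chi(\xi) = \overline{\widehat{\psi}(\xi)} = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi} \, \overline{\psi(x)} \, dx = \mathcal{F}^{-1}(\overline{\psi})(\xi),
\]
so that $\widehat{\chi} = \overline{\psi}$. It follows now that
\[
\int_{\mathbb{R}^n} \varphi \overline{\psi} = \int_{\mathbb{R}^n} \varphi \widehat{\chi} = \int_{\mathbb{R}^n} \widehat{\varphi} \chi = \int_{\mathbb{R}^n} \widehat{\varphi} \, \overline{\widehat{\psi}},
\]
where we used the multiplication formula for the Fourier transform.

(ii) We can easily calculate
\begin{align*}
\widehat{\varphi * \psi}(\xi)
&= \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} (\varphi * \psi)(x) \, dx
= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} \varphi(x - y) \psi(y) \, dy \, dx \\
&= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{-2\pi i (x-y)\cdot\xi} \varphi(x - y) \, e^{-2\pi i y\cdot\xi} \psi(y) \, dy \, dx \\
&= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} e^{-2\pi i z\cdot\xi} \varphi(z) \, e^{-2\pi i y\cdot\xi} \psi(y) \, dy \, dz
= \widehat{\varphi}(\xi) \widehat{\psi}(\xi),
\end{align*}
where we used the substitution $z = x - y$. We leave (iii) as an exercise.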
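
Property (ii) can be illustrated numerically in one dimension. The sketch below is only a rough check under stated assumptions: the test functions (a Gaussian and a shifted Gaussian), the grid, and the helper names `transform` and `convolution` are all choices made for this example, and both integrals are replaced by Riemann sums on the same grid.

```python
import cmath
import math

RADIUS, STEPS = 8.0, 320          # illustrative truncation and resolution
H = 2.0 * RADIUS / STEPS
GRID = [-RADIUS + k * H for k in range(STEPS + 1)]


def phi(x):
    return math.exp(-math.pi * x * x)


def psi(x):
    return math.exp(-math.pi * (x - 1.0) ** 2)   # phi translated by 1


def transform(g, xi):
    """Riemann sum for the integral of exp(-2*pi*i*x*xi) * g(x)."""
    return H * sum(cmath.exp(-2j * math.pi * x * xi) * g(x) for x in GRID)


def convolution(f, g):
    """Return the function x -> (f * g)(x), with the y-integral as a Riemann sum."""
    return lambda x: H * sum(f(x - y) * g(y) for y in GRID)


if __name__ == "__main__":
    xi = 0.4
    print(transform(convolution(phi, psi), xi))
    print(transform(phi, xi) * transform(psi, xi))
```

By the translation remark in 1.1.11, the transform of the shifted Gaussian is e^{-2πiξ}e^{-πξ²}, so both printed values should be close to e^{-2πiξ}e^{-2πξ²}.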

1.2 Inequalities
This section will be devoted to several important inequalities which are very useful in
Fourier analysis and in many types of analysis involving spaces of functions.

1.2.1 Cauchy's inequality. For all $a, b \in \mathbb{R}$ we have
\[ ab \le \frac{a^2}{2} + \frac{b^2}{2}. \]
Moreover, for any $\epsilon > 0$, we also have
\[ ab \le \epsilon a^2 + \frac{b^2}{4\epsilon}. \]
The first inequality follows from $0 \le (a - b)^2 = a^2 - 2ab + b^2$. The second one follows
if we apply the first one to $ab = (\sqrt{2\epsilon}\, a)(b/\sqrt{2\epsilon})$.
As a consequence, we immediately obtain Cauchy's inequality for functions:
\[
\int_\Omega |f(x) g(x)| \, dx \le \frac{1}{2} \int_\Omega \left( |f(x)|^2 + |g(x)|^2 \right) dx
\quad\text{and}\quad
\|fg\|_{L^1(\Omega)} \le \frac{1}{2} \left( \|f\|_{L^2(\Omega)}^2 + \|g\|_{L^2(\Omega)}^2 \right).
\]
1.2.2 Cauchy–Schwarz inequality. Let $x, y \in \mathbb{R}^n$. Then $|x \cdot y| \le |x||y|$.
Proof. For $\epsilon > 0$, we have $0 \le |x \pm \epsilon y|^2 = |x|^2 \pm 2\epsilon\, x \cdot y + \epsilon^2 |y|^2$. This implies
$\pm x \cdot y \le \frac{1}{2\epsilon} |x|^2 + \frac{\epsilon}{2} |y|^2$. Setting $\epsilon = \frac{|x|}{|y|}$, we obtain the required inequality, provided $y \ne 0$
(if $y = 0$ it is trivial).
An alternative proof may be given as follows. We can observe that the inequality
\[ 0 \le |x + \epsilon y|^2 = |x|^2 + 2\epsilon\, x \cdot y + \epsilon^2 |y|^2 \]
implies that the discriminant of the quadratic (in $\epsilon$) polynomial on the right hand side
must be non-positive, which means $|x \cdot y|^2 - |x|^2 |y|^2 \le 0$.
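1.2.3 Young's inequality. Let $1 < p, q < \infty$ be such that $\frac{1}{p} + \frac{1}{q} = 1$. Then
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q} \quad\text{for all } a, b > 0. \]
Moreover, if $\epsilon > 0$, we have
\[ ab \le \epsilon a^p + C(\epsilon) b^q \]
for all $a, b > 0$, where $C(\epsilon) = (\epsilon p)^{-q/p} q^{-1}$.

To prove the first inequality, we will use the fact that the exponential function $x \mapsto e^x$
is convex (a function $f : \mathbb{R} \to \mathbb{R}$ is called convex if
\[ f(\tau x + (1 - \tau) y) \le \tau f(x) + (1 - \tau) f(y), \]
for all $x, y \in \mathbb{R}$ and all $0 \le \tau \le 1$). This implies
\[
ab = e^{\ln a + \ln b} = e^{\frac{1}{p} \ln a^p + \frac{1}{q} \ln b^q}
\le \frac{1}{p} e^{\ln a^p} + \frac{1}{q} e^{\ln b^q} = \frac{a^p}{p} + \frac{b^q}{q}.
\]
The second inequality with $\epsilon$ follows if we apply the first one to the product
$ab = \left( (\epsilon p)^{1/p} a \right) \left( b/(\epsilon p)^{1/p} \right)$.
As a consequence, we immediately obtain that if $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$, then
$fg \in L^1(\Omega)$ with
\[ \|fg\|_{L^1} \le \frac{1}{p} \|f\|_{L^p}^p + \frac{1}{q} \|g\|_{L^q}^q. \]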

1.2.4 Hölder's inequality. Let $1 \le p, q \le \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$. Let $f \in L^p(\Omega)$ and
$g \in L^q(\Omega)$. Then $fg \in L^1(\Omega)$ and
\[ \|fg\|_{L^1(\Omega)} = \int_\Omega |fg| \, dx \le \|f\|_{L^p(\Omega)} \|g\|_{L^q(\Omega)}. \]
In the formulation we use the standard convention of setting $1/\infty = 0$. In the case of
$p = q = 2$ Hölder's inequality is often called the Cauchy–Schwarz inequality.
Proof. In the case $p = 1$ or $p = \infty$ the inequality is obvious, so let us assume $1 < p < \infty$.
Let us first consider the case when $\|f\|_{L^p} = \|g\|_{L^q} = 1$. Then by Young's inequality
with $1 < p, q < \infty$, we have
\[
\|fg\|_{L^1} \le \frac{1}{p} \|f\|_{L^p}^p + \frac{1}{q} \|g\|_{L^q}^q = \frac{1}{p} + \frac{1}{q} = 1 = \|f\|_{L^p} \|g\|_{L^q},
\]
which is the desired inequality. Now, let us consider general $f, g$. We observe that we
may assume that $\|f\|_{L^p} \ne 0$ and $\|g\|_{L^q} \ne 0$, since otherwise one of the functions is zero
almost everywhere in $\Omega$ and Hölder's inequality becomes trivial. It follows from the
considered case that
\[ \int_\Omega \left| \frac{f}{\|f\|_p} \, \frac{g}{\|g\|_q} \right| dx \le 1, \]
which implies the general case by the linearity of the integral.

General Hölder's inequality. Let $1 \le p_1, \ldots, p_m \le \infty$ be such that $\frac{1}{p_1} + \cdots + \frac{1}{p_m} = 1$.
Let $f_k \in L^{p_k}(\Omega)$ for all $k = 1, \ldots, m$. Then the product $f_1 \cdots f_m \in L^1(\Omega)$ and
\[ \|f_1 \cdots f_m\|_{L^1(\Omega)} \le \prod_{k=1}^m \|f_k\|_{L^{p_k}(\Omega)}. \]
This inequality follows from Hölder's inequality by induction on the number of functions.
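1.2.5 Minkowski's inequality. Let $1 \le p \le \infty$. Let $f, g \in L^p(\Omega)$. Then
\[ \|f + g\|_{L^p(\Omega)} \le \|f\|_{L^p(\Omega)} + \|g\|_{L^p(\Omega)}. \]
In particular, this means that $\|\cdot\|_{L^p}$ satisfies the triangle inequality and is a norm, so
$L^p(\Omega)$ is a normed space.
Proof. The cases of $p = 1$ or $p = \infty$ follow from the triangle inequality for complex
numbers and are, therefore, trivial. So we may assume $1 < p < \infty$. Then we have
\begin{align*}
\|f + g\|_{L^p(\Omega)}^p
&= \int_\Omega |f + g|^p \, dx \le \int_\Omega |f + g|^{p-1} (|f| + |g|) \, dx \\
&= \int_\Omega |f + g|^{p-1} |f| \, dx + \int_\Omega |f + g|^{p-1} |g| \, dx
&& \text{(Hölder's inequality with $p = p$, $q = \tfrac{p}{p-1}$)} \\
&\le \left( \int_\Omega |f + g|^p \, dx \right)^{\frac{p-1}{p}}
\left[ \left( \int_\Omega |f|^p \, dx \right)^{\frac{1}{p}} + \left( \int_\Omega |g|^p \, dx \right)^{\frac{1}{p}} \right] \\
&= \|f + g\|_{L^p(\Omega)}^{p-1} \left( \|f\|_{L^p(\Omega)} + \|g\|_{L^p(\Omega)} \right),
\end{align*}
which implies the desired inequality.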

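1.2.6 Interpolation for Lp–norms. Let $1 \le s \le r \le t \le \infty$ be such that $\frac{1}{r} = \frac{\theta}{s} + \frac{1-\theta}{t}$
for some $0 \le \theta \le 1$. Let $f \in L^s(\Omega) \cap L^t(\Omega)$. Then $f \in L^r(\Omega)$ and
\[ \|f\|_{L^r(\Omega)} \le \|f\|_{L^s(\Omega)}^{\theta} \, \|f\|_{L^t(\Omega)}^{1-\theta}. \]
To prove this, we use that $\frac{\theta r}{s} + \frac{(1-\theta) r}{t} = 1$ and so we can apply Hölder's inequality in the
following way:
\[
\int_\Omega |f|^r \, dx = \int_\Omega |f|^{\theta r} |f|^{(1-\theta) r} \, dx
\le \left( \int_\Omega |f|^{\theta r \cdot \frac{s}{\theta r}} \, dx \right)^{\frac{\theta r}{s}}
\left( \int_\Omega |f|^{(1-\theta) r \cdot \frac{t}{(1-\theta) r}} \, dx \right)^{\frac{(1-\theta) r}{t}},
\]
which is the desired inequality.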
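
The interpolation inequality can be tried out on sampled data. In the sketch below a weighted finite sum plays the role of the integral (it is the integral with respect to a discrete measure, so the inequality applies to it as well); only finite exponents are handled. The function names, the sample function 1/(1 + x²) and the grid are assumptions made for this example.

```python
def lp_norm(values, weights, p):
    """Weighted discrete L^p norm: (sum of w * |v|^p)^(1/p), for finite p."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


def interpolation_check(values, weights, s, t, theta):
    """Return (lhs, rhs) of ||f||_r <= ||f||_s^theta * ||f||_t^(1 - theta)."""
    r = 1.0 / (theta / s + (1.0 - theta) / t)
    lhs = lp_norm(values, weights, r)
    rhs = (
        lp_norm(values, weights, s) ** theta
        * lp_norm(values, weights, t) ** (1.0 - theta)
    )
    return lhs, rhs


# Samples of f(x) = 1 / (1 + x^2) on [-5, 5] with step 0.1.
XS = [-5.0 + 0.1 * k for k in range(101)]
VALUES = [1.0 / (1.0 + x * x) for x in XS]
WEIGHTS = [0.1] * len(XS)

if __name__ == "__main__":
    # s = 1, t = 4, theta = 1/2 gives 1/r = 0.625, i.e. r = 1.6.
    print(interpolation_check(VALUES, WEIGHTS, 1.0, 4.0, 0.5))
```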

1.3 Fourier transforms of distributions


In this section we will introduce several spaces of distributions and will extend the Fourier
transform to more general spaces of functions than S(Rn ) or L1 (Rn ) in the first section.
The main problem with the immediate extension is that the integral in the definition
of the Fourier transform may no longer converge if we go beyond the space L1 (Rn ) of
integrable functions.

1.3.1 Tempered distributions


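We define the space of tempered distributions $\mathcal{S}'(\mathbb{R}^n)$ as the space of all continuous linear
functionals on $\mathcal{S}(\mathbb{R}^n)$. This means that $u \in \mathcal{S}'(\mathbb{R}^n)$ if it is a functional $u : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$
such that

1. $u$ is linear, i.e. $u(\alpha\varphi + \beta\psi) = \alpha u(\varphi) + \beta u(\psi)$ for all $\alpha, \beta \in \mathbb{C}$ and all $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$;

2. $u$ is continuous, i.e. $u(\varphi_j) \to u(\varphi)$ in $\mathbb{C}$ whenever $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$.

Here one should also recall the definition of the convergence $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ from 1.1.5,
which said that $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ as $j \to \infty$, if $\varphi_j, \varphi \in \mathcal{S}(\mathbb{R}^n)$ and if
\[ \sup_{x\in\mathbb{R}^n} \left| x^\beta \partial^\alpha (\varphi_j - \varphi)(x) \right| \to 0 \]
as $j \to \infty$, for all multi-indices $\alpha, \beta \ge 0$.

We can also define the convergence in the space $\mathcal{S}'(\mathbb{R}^n)$ of tempered distributions.
Let $u_j, u \in \mathcal{S}'(\mathbb{R}^n)$. We will say that $u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ as $j \to \infty$ if $u_j(\varphi) \to u(\varphi)$ in $\mathbb{C}$
as $j \to \infty$, for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$.
Functions in $\mathcal{S}(\mathbb{R}^n)$ are called the test functions for tempered distributions in $\mathcal{S}'(\mathbb{R}^n)$.
Another notation for $u(\varphi)$ will be $\langle u, \varphi \rangle$.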

1.3.2 Fourier transform of tempered distributions
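If $u \in \mathcal{S}'(\mathbb{R}^n)$, we can define its (generalised) Fourier transform by setting
\[ \widehat{u}(\varphi) = u(\widehat{\varphi}), \]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$.

Then we can readily see that also $\widehat{u} \in \mathcal{S}'(\mathbb{R}^n)$. Indeed, since $\varphi \in \mathcal{S}$, it follows that
$\widehat{\varphi} \in \mathcal{S}$ and so $u(\widehat{\varphi})$ is a well-defined complex number. Moreover, $\widehat{u}$ is linear since both
$u$ and the Fourier transform $\mathcal{F}$ are linear. Finally, $\widehat{u}$ is continuous because $\varphi_j \to \varphi$ in $\mathcal{S}$
implies $\widehat{\varphi_j} \to \widehat{\varphi}$ in $\mathcal{S}$, and hence
\[ \widehat{u}(\varphi_j) = u(\widehat{\varphi_j}) \to u(\widehat{\varphi}) = \widehat{u}(\varphi) \]
by the continuity of both $u$ from $\mathcal{S}(\mathbb{R}^n)$ to $\mathbb{C}$ and of the Fourier transform $\mathcal{F}$ as a mapping
from $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{S}(\mathbb{R}^n)$ (see 1.1.7).
Now, it follows that the Fourier transform is also continuous as a mapping from $\mathcal{S}'(\mathbb{R}^n)$ to $\mathcal{S}'(\mathbb{R}^n)$, i.e.
$u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ implies that $\widehat{u_j} \to \widehat{u}$ in $\mathcal{S}'(\mathbb{R}^n)$. Indeed, if $u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$, we have
\[ \widehat{u_j}(\varphi) = u_j(\widehat{\varphi}) \to u(\widehat{\varphi}) = \widehat{u}(\varphi) \]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$, which means that $\widehat{u_j} \to \widehat{u}$ in $\mathcal{S}'(\mathbb{R}^n)$.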

1.3.3 Two principles for tempered distributions


Here we give two immediate but important principles for distributions.
Convergence principle. Let $X$ be a topological subspace in $\mathcal{S}'(\mathbb{R}^n)$ (i.e. convergence
in $X$ implies convergence in $\mathcal{S}'(\mathbb{R}^n)$). Suppose that $u_j \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ and that $u_j \to v$
in $X$. Then $u \in X$ and $u = v$.
This statement is simply a consequence of the fact that the space $\mathcal{S}'(\mathbb{R}^n)$ is Hausdorff,
hence it has the uniqueness of limits property (recall that a topological space is called
Hausdorff if any two distinct points have disjoint open neighborhoods, i.e. disjoint open sets
containing them). The convergence principle is also related to another principle which
we call
Uniqueness principle for distributions. Let $u, v \in \mathcal{S}'(\mathbb{R}^n)$ and suppose that $u(\varphi) =
v(\varphi)$ for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then $u = v$.
This can be reformulated by saying that if an element $o \in \mathcal{S}'(\mathbb{R}^n)$ satisfies $o(\varphi) = 0$ for
all $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then $o$ is the zero element in $\mathcal{S}'(\mathbb{R}^n)$.

1.3.4 Functions as distributions


We can interpret functions in $L^p(\mathbb{R}^n)$, $1 \le p \le \infty$, as tempered distributions. If $f \in
L^p(\mathbb{R}^n)$, we define the functional $u_f$ by
\[ u_f(\varphi) = \int_{\mathbb{R}^n} f(x) \varphi(x) \, dx, \]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$. By Hölder's inequality, we observe that
\[ |u_f(\varphi)| \le \|f\|_{L^p} \|\varphi\|_{L^q}, \]
for $\frac{1}{p} + \frac{1}{q} = 1$, and hence $u_f(\varphi)$ is well-defined in view of the simple inclusion $\mathcal{S}(\mathbb{R}^n) \subset
L^q(\mathbb{R}^n)$, for all $1 \le q \le \infty$. It needs to be verified that $u_f$ is a linear continuous functional
on $\mathcal{S}(\mathbb{R}^n)$. It is clearly linear in $\varphi$, while its continuity follows from the inequality
\[ |u_f(\varphi_j) - u_f(\varphi)| \le \|f\|_{L^p} \|\varphi_j - \varphi\|_{L^q} \]
and
Lemma. We have $\mathcal{S}(\mathbb{R}^n) \subset L^q(\mathbb{R}^n)$ with continuous embedding, i.e. $\varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$
implies that $\varphi_j \to \varphi$ in $L^q(\mathbb{R}^n)$.
To summarise, any function $f \in L^p(\mathbb{R}^n)$ leads to a tempered distribution $u_f \in \mathcal{S}'(\mathbb{R}^n)$
in the canonical way described above. In this way we will view functions in $L^p(\mathbb{R}^n)$ as
tempered distributions and continue to simply write $f$ instead of $u_f$. There should be no
confusion with this notation since writing $f(x)$ suggests that $f$ is a function while $f(\varphi)$
suggests that it is applied to test functions and so it is viewed as a distribution $u_f$.
Consistency of all definitions. With this identification, the definition of the Fourier
transform for functions in $L^1(\mathbb{R}^n)$ agrees with the definition of the Fourier transform of
tempered distributions. Indeed, let $f \in L^1(\mathbb{R}^n)$. Then we have two ways of looking at
its Fourier transform:

1. We can use the first definition $\widehat{f}(\xi) = \int_{\mathbb{R}^n} e^{-2\pi i x\cdot\xi} f(x) \, dx$, and then we know that
$\widehat{f} \in L^\infty(\mathbb{R}^n)$. In this way we get $u_{\widehat{f}} \in \mathcal{S}'(\mathbb{R}^n)$.

2. We can immediately think of $f \in L^1(\mathbb{R}^n)$ as of $u_f \in \mathcal{S}'(\mathbb{R}^n)$, and the second
definition then produces its Fourier transform $\widehat{u_f} \in \mathcal{S}'(\mathbb{R}^n)$.

Fortunately, these two approaches are consistent and produce the same tempered
distribution $u_{\widehat{f}} = \widehat{u_f} \in \mathcal{S}'(\mathbb{R}^n)$. Indeed, we have
\[ \widehat{u}(\varphi) = \int_{\mathbb{R}^n} \widehat{u} \, \varphi \, dx = \int_{\mathbb{R}^n} u \, \widehat{\varphi} \, dx = u(\widehat{\varphi}). \]
Here we used the multiplication formula for the Fourier transform and the fact that both
$u \in L^1(\mathbb{R}^n)$ and $\widehat{u} \in L^\infty(\mathbb{R}^n)$ can be viewed as tempered distributions in the canonical
way. It follows that we have $\widehat{u}(\varphi) = u(\widehat{\varphi})$, which was exactly the definition in 1.3.2.
Finally, we note that if $u \in L^1(\mathbb{R}^n)$ and also $\widehat{u} \in L^1(\mathbb{R}^n)$, then the Fourier inversion
formula 1.1.8 holds for almost all $x \in \mathbb{R}^n$. A more general Fourier inversion formula for
tempered distributions will be given in 1.3.7.

1.3.5 Plancherel’s formula


It turns out that the Fourier transform acts especially nicely on one of the spaces Lp (Rn ),
namely on the space L2 (Rn ), which is also a Hilbert space. These two facts lead to a
very rich Fourier analysis on L2 (Rn ) which we will deal with only briefly.

Theorem. Let $u \in L^2(\mathbb{R}^n)$. Then $\widehat{u} \in L^2(\mathbb{R}^n)$ and
\[ \|\widehat{u}\|_{L^2(\mathbb{R}^n)} = \|u\|_{L^2(\mathbb{R}^n)} \quad\text{(Plancherel's formula)}. \]
Moreover, for all $u, v \in L^2(\mathbb{R}^n)$ we have
\[ \int_{\mathbb{R}^n} u \overline{v} \, dx = \int_{\mathbb{R}^n} \widehat{u} \, \overline{\widehat{v}} \, d\xi \quad\text{(Parseval's formula)}. \]
Proof. We will use the fact (a special case of the fact to be proved later) that $\mathcal{S}(\mathbb{R}^n)$ is
sequentially dense in $L^2(\mathbb{R}^n)$, i.e. for every $u \in L^2(\mathbb{R}^n)$ there exists a sequence $u_j \in \mathcal{S}(\mathbb{R}^n)$
such that $u_j \to u$ in $L^2(\mathbb{R}^n)$. Then Theorem 1.1.12, (i), with $\varphi = \psi = u_j - u_k$, implies
that
\[ \|\widehat{u_j} - \widehat{u_k}\|_{L^2}^2 = \|u_j - u_k\|_{L^2}^2 \to 0, \]
since $u_j$ is a convergent sequence in $L^2(\mathbb{R}^n)$. Thus, $\widehat{u_j}$ is a Cauchy sequence in the
complete (Banach) space $L^2(\mathbb{R}^n)$. It follows that it must converge to some $v \in L^2(\mathbb{R}^n)$.
By the continuity of the Fourier transform in $\mathcal{S}'(\mathbb{R}^n)$ we must also have $\widehat{u_j} \to \widehat{u}$ in $\mathcal{S}'(\mathbb{R}^n)$.
By the convergence principle for distributions, we get that $\widehat{u} = v \in L^2(\mathbb{R}^n)$. Applying
Theorem 1.1.12, (i), again, to $\varphi = \psi = u_j$, we get
\[ \|\widehat{u_j}\|_{L^2}^2 = \|u_j\|_{L^2}^2. \]
Passing to the limit, we get
\[ \|\widehat{u}\|_{L^2}^2 = \|u\|_{L^2}^2, \]
which is Plancherel's formula.
Finally, for $u, v \in L^2(\mathbb{R}^n)$, let $u_j, v_j \in \mathcal{S}(\mathbb{R}^n)$ be such that $u_j \to u$ and $v_j \to v$ in
$L^2(\mathbb{R}^n)$. Applying Theorem 1.1.12, (i), to $\varphi = u_j$, $\psi = v_j$, and passing to the limit, we
obtain Parseval's identity.
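
Plancherel's formula can be illustrated on a concrete function in one dimension. The sketch below takes u(x) = e^{-|x|}, whose transform under the convention of 1.1.2 is û(ξ) = 2/(1 + 4π²ξ²), and compares the two squared L² norms by the trapezoidal rule; both should be close to 1. The helper `trapezoid` and the truncation parameters are assumptions of this example.

```python
import math


def trapezoid(g, radius, steps):
    """Composite trapezoidal rule for the integral of g over [-radius, radius]."""
    h = 2.0 * radius / steps
    inner = sum(g(-radius + k * h) for k in range(1, steps))
    return h * (0.5 * g(-radius) + inner + 0.5 * g(radius))


def u_squared(x):
    return math.exp(-2.0 * abs(x))               # |u(x)|^2 for u(x) = exp(-|x|)


def u_hat_squared(xi):
    return (2.0 / (1.0 + 4.0 * math.pi ** 2 * xi ** 2)) ** 2


# Squared L^2 norms of u and of its Fourier transform (truncated integrals).
norm_u_sq = trapezoid(u_squared, 30.0, 60_000)
norm_u_hat_sq = trapezoid(u_hat_squared, 200.0, 40_000)

if __name__ == "__main__":
    print(norm_u_sq, norm_u_hat_sq)
```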

1.3.6 Operations with distributions


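Besides the Fourier transform, there are several other operations that can be extended
from functions in $\mathcal{S}(\mathbb{R}^n)$ to tempered distributions in $\mathcal{S}'(\mathbb{R}^n)$.
For example, partial differentiation $\frac{\partial}{\partial x_j}$ can be extended to a continuous operator
$\frac{\partial}{\partial x_j} : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$. Indeed, for $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$, let us define
\[ \left( \frac{\partial}{\partial x_j} u \right)(\varphi) = -u\left( \frac{\partial\varphi}{\partial x_j} \right). \]
It is necessary to include the negative sign in this definition. Indeed, if $u \in \mathcal{S}(\mathbb{R}^n)$, then
the integration by parts formula and the identification of functions with distributions
1.3.4 yield
\[
\left( \frac{\partial}{\partial x_j} u \right)(\varphi)
= \int_{\mathbb{R}^n} \left( \frac{\partial u}{\partial x_j} \right)(x) \varphi(x) \, dx
= -\int_{\mathbb{R}^n} u(x) \left( \frac{\partial\varphi}{\partial x_j} \right)(x) \, dx
= -u\left( \frac{\partial\varphi}{\partial x_j} \right),
\]
which explains the sign. This also shows the consistency of this definition of the derivative
with the usual definition for differentiable functions. More generally, for any multi-index
$\alpha$, one can define
\[ (\partial^\alpha u)(\varphi) = (-1)^{|\alpha|} u(\partial^\alpha \varphi). \]
Then $\partial^\alpha : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is continuous. Indeed, if $\varphi_k \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$, then clearly also
$\partial^\alpha \varphi_k \to \partial^\alpha \varphi$ in $\mathcal{S}(\mathbb{R}^n)$, and, therefore,
\[ (\partial^\alpha u)(\varphi_k) = (-1)^{|\alpha|} u(\partial^\alpha \varphi_k) \to (-1)^{|\alpha|} u(\partial^\alpha \varphi) = (\partial^\alpha u)(\varphi), \]
which means that $\partial^\alpha u \in \mathcal{S}'(\mathbb{R}^n)$. Moreover, let $u_k \to u$ in $\mathcal{S}'(\mathbb{R}^n)$. Then
\[ \partial^\alpha u_k(\varphi) = (-1)^{|\alpha|} u_k(\partial^\alpha \varphi) \to (-1)^{|\alpha|} u(\partial^\alpha \varphi) = \partial^\alpha u(\varphi), \]
for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$, i.e. $\partial^\alpha$ is continuous on $\mathcal{S}'(\mathbb{R}^n)$.

If a smooth function $f$ and all of its derivatives are bounded by some polynomial
functions, we can define the multiplication of tempered distributions by $f$ by setting
\[ (fu)(\varphi) = u(f\varphi). \]
This is well-defined since $\varphi \in \mathcal{S}(\mathbb{R}^n)$ implies $f\varphi \in \mathcal{S}(\mathbb{R}^n)$ if $f$ is a function as above.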

1.3.7 Fourier inversion formula for tempered distributions


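As we have seen, statements on $\mathcal{S}(\mathbb{R}^n)$ can usually be extended to corresponding
statements on $\mathcal{S}'(\mathbb{R}^n)$. This applies to the Fourier inversion formula as well. Let us define
$\mathcal{F}^{-1}$ on $\mathcal{S}'(\mathbb{R}^n)$ by
\[ (\mathcal{F}^{-1} u)(\varphi) = u(\mathcal{F}^{-1} \varphi), \]
for $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$. As before, it can be readily checked that $\mathcal{F}^{-1} : \mathcal{S}'(\mathbb{R}^n) \to
\mathcal{S}'(\mathbb{R}^n)$ is well-defined and continuous.
Theorem. $\mathcal{F}$ and $\mathcal{F}^{-1}$ are inverse to each other on $\mathcal{S}'(\mathbb{R}^n)$, i.e. $\mathcal{F}\mathcal{F}^{-1} = \mathcal{F}^{-1}\mathcal{F} = \mathrm{id}$
on $\mathcal{S}'(\mathbb{R}^n)$.
To prove this, let $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then by 1.1.8 and definitions, we get
\[ (\mathcal{F}\mathcal{F}^{-1} u)(\varphi) = (\mathcal{F}^{-1} u)(\mathcal{F}\varphi) = u(\mathcal{F}^{-1}\mathcal{F}\varphi) = u(\varphi), \]
so $\mathcal{F}\mathcal{F}^{-1} u = u$ by the uniqueness principle for distributions. A similar argument applies
to show that $\mathcal{F}^{-1}\mathcal{F} = \mathrm{id}$.

To give an example of these operations, let us define the Heaviside function $H$ on $\mathbb{R}$ by
setting
\[ H(x) = \begin{cases} 0, & \text{if } x < 0, \\ 1, & \text{if } x \ge 0. \end{cases} \]
Clearly $H \in L^\infty(\mathbb{R})$, so in particular, it is a tempered distribution. Let us also define
the $\delta$–distribution by setting
\[ \delta(\varphi) = \varphi(0) \]
for all $\varphi \in \mathcal{S}(\mathbb{R})$.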

We claim first that $H' = \delta$. Indeed, we have
\[ H'(\varphi) = -H(\varphi') = -\int_0^\infty \varphi'(x) \, dx = \varphi(0) = \delta(\varphi), \]
hence $H' = \delta$ by the uniqueness principle for distributions.

Let us now calculate the Fourier transform of $\delta$. According to the definitions, we
have
\[ \widehat{\delta}(\varphi) = \delta(\widehat{\varphi}) = \widehat{\varphi}(0) = \int_{\mathbb{R}} \varphi(x) \, dx = 1(\varphi), \]
hence $\widehat{\delta} = 1$. Here we used the fact that the constant one is in $L^\infty(\mathbb{R})$, hence also a
tempered distribution. It can be also checked that $\widehat{1} = \delta$.
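
Both computations can be tested on a single concrete test function. The sketch below uses the (hypothetical, chosen only for illustration) Schwartz function φ(x) = e^{-π(x - 0.3)²} and evaluates the two pairings by the trapezoidal rule on truncated intervals: the first should be close to φ(0) and the second close to the integral of φ, which equals 1 for this Gaussian.

```python
import math


def phi(x):
    return math.exp(-math.pi * (x - 0.3) ** 2)


def phi_prime(x):
    return -2.0 * math.pi * (x - 0.3) * phi(x)


def trapezoid(g, a, b, steps=20_000):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * (0.5 * g(a) + sum(g(a + k * h) for k in range(1, steps)) + 0.5 * g(b))


# H'(phi) = -H(phi') = -(integral of phi' over [0, infinity)), truncated at x = 10.
heaviside_derivative_on_phi = -trapezoid(phi_prime, 0.0, 10.0)

# delta_hat(phi) = phi_hat(0) = integral of phi over R, truncated to [-10, 10].
delta_hat_on_phi = trapezoid(phi, -10.0, 10.0)

if __name__ == "__main__":
    print(heaviside_derivative_on_phi, phi(0.0))   # both approximate delta(phi)
    print(delta_hat_on_phi)                        # approximates 1(phi)
```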

1.3.8 C0∞ (Ω) and sequential density


For an open set $\Omega \subset \mathbb{R}^n$, the space $C_0^\infty(\Omega)$ is defined as the space of smooth functions $\varphi : \Omega \to \mathbb{C}$ with compact support. Here the support of $\varphi$ is defined as the closure of the set where $\varphi$ is non-zero, i.e. by

\[ \operatorname{supp} \varphi = \overline{\{x \in \Omega : \varphi(x) \neq 0\}}. \]

It can be seen that this space contains non-zero functions. For example, if we define the function $\chi(t)$ by $\chi(t) = e^{-1/t^2}$ for $t > 0$ and by $\chi(t) = 0$ for $t \leq 0$, then $f(t) = \chi(t)\chi(1 - t)$ is a smooth compactly supported function on $\mathbb{R}$. Consequently, $\varphi(x) = f(x_1) \cdots f(x_n)$ is a function in $C_0^\infty(\mathbb{R}^n)$, with $\operatorname{supp} \varphi = [0, 1]^n$.
Another example is the function $\psi$ defined by $\psi(x) = e^{1/(|x|^2 - 1)}$ for $|x| < 1$ and by $\psi(x) = 0$ for $|x| \geq 1$. We have $\psi \in C_0^\infty(\mathbb{R}^n)$ with $\operatorname{supp} \psi = \{|x| \leq 1\}$.
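
The two examples above can be written down directly. The sketch below only evaluates the functions (it does not prove smoothness); the function names are chosen here for illustration, and the reciprocal is formed before squaring so that very small positive arguments evaluate to zero instead of raising an error.

```python
import math

def chi(t):
    # chi(t) = exp(-1/t^2) for t > 0 and 0 for t <= 0.
    if t <= 0.0:
        return 0.0
    inv = 1.0 / t
    return math.exp(-(inv * inv))

def f(t):
    # f(t) = chi(t) * chi(1 - t), supported in [0, 1].
    return chi(t) * chi(1.0 - t)

def phi(x):
    # Product bump on R^n with support [0, 1]^n; x is a sequence of coordinates.
    return math.prod(f(c) for c in x)

def psi(x):
    # Radial bump: exp(1/(|x|^2 - 1)) for |x| < 1 and 0 otherwise.
    r2 = sum(c * c for c in x)
    return math.exp(1.0 / (r2 - 1.0)) if r2 < 1.0 else 0.0

print(f(0.5), phi((0.5, 0.5)), psi((0.0, 0.0)), psi((1.0, 0.0)))
```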
Although these examples are quite special, products of these functions with any other smooth function, as well as their derivatives, are all in $C_0^\infty(\mathbb{R}^n)$. On the other hand, $C_0^\infty(\mathbb{R}^n)$ cannot contain non-zero analytic functions, making it relatively small. Still, it is dense in very large spaces of functions/distributions in their respective topologies.

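Theorem. The space $C_0^\infty(\mathbb{R}^n)$ is sequentially dense in $\mathcal{S}'(\mathbb{R}^n)$, i.e. for every $u \in \mathcal{S}'(\mathbb{R}^n)$ there exists a sequence $u_k \in C_0^\infty(\mathbb{R}^n)$ such that $u_k \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ as $k \to \infty$.

Lemma. The space $C_0^\infty(\mathbb{R}^n)$ is sequentially dense in $\mathcal{S}(\mathbb{R}^n)$, i.e. for every $\varphi \in \mathcal{S}(\mathbb{R}^n)$ there exists a sequence $\varphi_k \in C_0^\infty(\mathbb{R}^n)$ such that $\varphi_k \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ as $k \to \infty$.
Proof of Lemma. Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Let us fix some $\psi \in C_0^\infty(\mathbb{R}^n)$ such that $\psi = 1$ in a neighborhood of the origin and let us define $\psi_k(x) = \psi(x/k)$. Then it can be easily checked that $\varphi_k = \psi_k \varphi \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$.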
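Proof of Theorem. Let $u \in \mathcal{S}'(\mathbb{R}^n)$ and let $\psi$ and $\psi_k$ be as in the proof of the lemma. Then $\psi u \in \mathcal{S}'(\mathbb{R}^n)$ is well-defined by $(\psi u)(\varphi) = u(\psi\varphi)$, for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$. We have that $\psi_k u \to u$ in $\mathcal{S}'(\mathbb{R}^n)$. Indeed, we have that

\[ (\psi_k u)(\varphi) = u(\psi_k \varphi) \to u(\varphi) \]

by the lemma. Similarly, we have that $\psi_k \widehat{u} \to \widehat{u}$ in $\mathcal{S}'(\mathbb{R}^n)$, and hence also $\mathcal{F}^{-1}(\psi_k \widehat{u}) \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ because of the continuity of the Fourier transform in $\mathcal{S}'(\mathbb{R}^n)$. Consequently, we have
\[ u_{kj} = \psi_j \left(\mathcal{F}^{-1}(\psi_k \widehat{u})\right) \to u \]
in $\mathcal{S}'(\mathbb{R}^n)$ as $k, j \to \infty$. It remains to show that $u_{kj} \in C_0^\infty(\mathbb{R}^n)$.
In general, let $\chi \in C_0^\infty(\mathbb{R}^n)$ and let $w = \chi \widehat{u}$. We claim that $\mathcal{F}^{-1} w \in C^\infty(\mathbb{R}^n)$. Indeed, we have
\[ (\mathcal{F}^{-1} w)(\varphi) = w(\mathcal{F}^{-1}\varphi) = w_\xi\left(\int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\varphi(x)\,dx\right) = \int_{\mathbb{R}^n} w_\xi(e^{2\pi i x\cdot\xi})\varphi(x)\,dx, \]
where we write $w_\xi$ to emphasize that $w$ acts on the test function as a function of the $\xi$-variable, and where we used the continuity of $w$ and the fact that $w_\xi(e^{2\pi i x\cdot\xi}) = \widehat{u}(\chi e^{2\pi i x\cdot\xi})$ is well-defined. Now, it follows that $\mathcal{F}^{-1} w$ can be identified with the function $(\mathcal{F}^{-1} w)(x) = \widehat{u}_\xi(\chi(\xi)e^{2\pi i x\cdot\xi})$, which is smooth with respect to $x$. Indeed, we can note first that the right hand side depends continuously on $x$ because of the continuity of $\widehat{u}$ on $\mathcal{S}(\mathbb{R}^n)$. Here we also use that everything is well-defined since $\chi \in C_0^\infty(\mathbb{R}^n)$. Moreover, since the function $\chi(\xi)e^{2\pi i x\cdot\xi}$ is compactly supported in $\xi$, so are its derivatives with respect to $x$, and hence all the derivatives of $(\mathcal{F}^{-1} w)(x)$ are also continuous in $x$, proving the claim and the theorem.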

1.3.9 Distributions (generalised functions)


Since our main interest is in Fourier analysis, we started with the space $\mathcal{S}'(\mathbb{R}^n)$ of tempered distributions, which allows the definition and use of the Fourier transform. However, there is a bigger space of distributions which we will sketch here. It will contain some important classes of functions that $\mathcal{S}'(\mathbb{R}^n)$ does not contain.
Localisations of $L^p$-spaces. We define local versions of the spaces $L^p(\Omega)$ as follows. We will say that $f \in L^p_{loc}(\Omega)$ if $\varphi f \in L^p(\Omega)$ for all $\varphi \in C_0^\infty(\Omega)$. We note that the spaces $L^p_{loc}(\mathbb{R}^n)$ are not subspaces of $\mathcal{S}'(\mathbb{R}^n)$ since they do not encode any information on the global behaviour of functions. For example, $e^{|x|^2}$ is smooth, and hence belongs to all $L^p_{loc}(\mathbb{R}^n)$, $1 \leq p \leq \infty$, but it is not in $\mathcal{S}'(\mathbb{R}^n)$. There is a natural notion of convergence in the localised spaces $L^p_{loc}(\Omega)$. Thus, we will write

\[ f_m \to f \text{ in } L^p_{loc}(\Omega) \text{ as } m \to \infty, \]

if $f$ and $f_m$ belong to $L^p_{loc}(\Omega)$ for all $m$, and if $\varphi f_m \to \varphi f$ in $L^p(\Omega)$ as $m \to \infty$, for all $\varphi \in C_0^\infty(\Omega)$.
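The difference between the space of distributions $\mathcal{D}'(\mathbb{R}^n)$ and the space of tempered distributions $\mathcal{S}'(\mathbb{R}^n)$ is the choice of the set $C_0^\infty(\mathbb{R}^n)$ rather than $\mathcal{S}(\mathbb{R}^n)$ as the space of test functions. At the same time, choosing $C_0^\infty(\Omega)$ as test functions allows one to obtain the space $\mathcal{D}'(\Omega)$ of distributions in $\Omega$. The definition and facts below are only sketched, as they are similar to 1.3.1.
We say that $\varphi_k \to \varphi$ in $C_0^\infty(\Omega)$ if $\varphi_k, \varphi \in C_0^\infty(\Omega)$, if there is a compact set $K \subset \Omega$ such that $\operatorname{supp} \varphi_k \subset K$ for all $k$, and if

\[ \sup_{x \in \Omega} |\partial^\alpha(\varphi_k - \varphi)(x)| \to 0 \]

for all multi-indices $\alpha$. Then $\mathcal{D}'(\Omega)$ is defined as the set of all linear continuous functionals $u : C_0^\infty(\Omega) \to \mathbb{C}$.
It is easy to see that $C_0^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ and that if $\varphi_k \to \varphi$ in $C_0^\infty(\mathbb{R}^n)$, then $\varphi_k \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$. Thus, if $u \in \mathcal{S}'(\mathbb{R}^n)$ and if $\varphi_k \to \varphi$ in $C_0^\infty(\mathbb{R}^n)$, we have $u(\varphi_k) \to u(\varphi)$, which means that $u \in \mathcal{D}'(\mathbb{R}^n)$. Thus, we showed that $\mathcal{S}'(\mathbb{R}^n) \subset \mathcal{D}'(\mathbb{R}^n)$. We say that $u_k \to u$ in $\mathcal{D}'(\Omega)$ if $u_k, u \in \mathcal{D}'(\Omega)$ and if $u_k(\varphi) \to u(\varphi)$ for all $\varphi \in C_0^\infty(\Omega)$. One readily checks that $u_k \to u$ in $\mathcal{S}'(\mathbb{R}^n)$ implies $u_k \to u$ in $\mathcal{D}'(\mathbb{R}^n)$.
Finally, one can readily see that the canonical identification 1.3.4 yields the inclusions $L^p_{loc}(\Omega) \subset \mathcal{D}'(\Omega)$ for all $1 \leq p \leq \infty$.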

1.3.10 Weak derivatives


There is a notion of a weak derivative which is a special case of the distributional derivative from 1.3.6. However, it allows a realisation in an integral form and we mention it here briefly.
Let $\Omega$ be an open subset of $\mathbb{R}^n$ and let $u, v \in L^1_{loc}(\Omega)$. We say that $v$ is the $\alpha$-th weak partial derivative of $u$ if
\[ \int_\Omega u\,\partial^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\varphi\,dx, \quad \text{for all } \varphi \in C_0^\infty(\Omega). \]

In this case we also write $v = \partial^\alpha u$. The constant $(-1)^{|\alpha|}$ ensures consistency with the corresponding identity for smooth functions obtained by integration by parts in $\Omega$. This is also the reason to include the same constant $(-1)^{|\alpha|}$ in the definition in 1.3.6.
The weak derivative defined in this way is uniquely determined:
Lemma. Let $u \in L^1_{loc}(\Omega)$. If a weak $\alpha$-th derivative of $u$ exists, it is uniquely defined up to a set of measure zero.
Indeed, assume that there are two functions $v, w \in L^1_{loc}(\Omega)$ such that
\[ \int_\Omega u\,\partial^\alpha\varphi\,dx = (-1)^{|\alpha|}\int_\Omega v\varphi\,dx = (-1)^{|\alpha|}\int_\Omega w\varphi\,dx, \]
for all $\varphi \in C_0^\infty(\Omega)$. Then $\int_\Omega (v - w)\varphi\,dx = 0$ for all $\varphi \in C_0^\infty(\Omega)$. A standard result from measure theory now implies that $v = w$ almost everywhere in $\Omega$.

Examples and Exercises.


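(1) Let us define $u, v : \mathbb{R} \to \mathbb{R}$ by
\[ u(x) = \begin{cases} x, & \text{if } x \leq 1, \\ 1, & \text{if } x > 1, \end{cases} \qquad v(x) = \begin{cases} 1, & \text{if } x \leq 1, \\ 0, & \text{if } x > 1. \end{cases} \]
Prove that $u' = v$ weakly.

(2) Define $u : \mathbb{R} \to \mathbb{R}$ by
\[ u(x) = \begin{cases} x, & \text{if } x \leq 1, \\ 2, & \text{if } x > 1. \end{cases} \]
Prove that $u$ has no weak derivative.
(3) Calculate the distributional derivative of $u$ from (2).
(4) Prove that the $\delta$-distribution is not an element of $L^1_{loc}(\mathbb{R}^n)$.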
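
As a numerical sanity check of exercise (1), not a proof, one can test the defining identity of the weak derivative against a single compactly supported test function. In the sketch below, the bump `phi` with centre `C` and radius `R`, and the midpoint quadrature, are illustrative assumptions; the functions `u` and `v` are those of exercise (1).

```python
import math

C, R = 0.5, 2.0  # centre and radius of the support of phi (illustrative choice)

def phi(x):
    # Smooth bump supported in [C - R, C + R]; the kink of u at x = 1 lies inside.
    s = (x - C) / R
    return math.exp(1.0 / (s * s - 1.0)) if abs(s) < 1.0 else 0.0

def dphi(x):
    # Exact derivative of phi by the chain rule.
    s = (x - C) / R
    if abs(s) >= 1.0:
        return 0.0
    d = s * s - 1.0
    return phi(x) * (-2.0 * s / (d * d)) / R

def u(x):
    return x if x <= 1.0 else 1.0

def v(x):
    return 1.0 if x <= 1.0 else 0.0

def integrate(g, a=C - R, b=C + R, steps=40_000):
    # Midpoint rule; with this step count x = 1 is a grid node.
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

lhs = integrate(lambda x: u(x) * dphi(x))   # int u * phi' dx
rhs = -integrate(lambda x: v(x) * phi(x))   # (-1)^1 * int v * phi dx
print(lhs, rhs)
```

The two values should agree up to quadrature error. A single test function of course proves nothing; the exercise asks for the identity for all test functions.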

1.3.11 Sobolev spaces
Let $1 \leq p \leq \infty$ and let $k \in \mathbb{N} \cup \{0\}$. The Sobolev space $L^p_k(\Omega)$ (or $W^{p,k}(\Omega)$) consists of all $u \in L^1_{loc}(\Omega)$ such that for all multi-indices $\alpha$ with $|\alpha| \leq k$, $\partial^\alpha u$ exists weakly (or distributionally) and $\partial^\alpha u \in L^p(\Omega)$.
Since $p \geq 1$, we know that $L^p_{loc}(\Omega) \subset L^1_{loc}(\Omega)$, so it does not matter whether we take a weak or a distributional derivative.
In the case $p = 2$, one often uses the notation $H^k(\Omega)$ for $L^2_k(\Omega)$, and in the case $p = 2$ and $k = 0$, we get $H^0(\Omega) = L^2(\Omega)$. As usual, we identify functions in $L^p_k(\Omega)$ which are equal almost everywhere.
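For $u \in L^p_k(\Omega)$, we define
\[ \|u\|_{L^p_k(\Omega)} = \Big(\sum_{|\alpha|\leq k} \|\partial^\alpha u\|^p_{L^p}\Big)^{1/p} = \Big(\sum_{|\alpha|\leq k} \int_\Omega |\partial^\alpha u|^p\,dx\Big)^{1/p}, \]
for $1 \leq p < \infty$, and for $p = \infty$ we define
\[ \|u\|_{L^\infty_k(\Omega)} = \sum_{|\alpha|\leq k} \operatorname*{ess\,sup}_\Omega |\partial^\alpha u|. \]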

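It can be readily checked that these expressions are norms. Indeed, we clearly have

\[ \|\lambda u\|_{L^p_k} = |\lambda|\,\|u\|_{L^p_k} \]

for all $\lambda \in \mathbb{C}$, and $\|u\|_{L^p_k} = 0$ if and only if $u = 0$ almost everywhere. For the triangle inequality, the case $p = \infty$ is straightforward. For $1 \leq p < \infty$ and for $u, v \in L^p_k(\Omega)$, Minkowski's inequality implies
\begin{align*}
\|u + v\|_{L^p_k} &= \Big(\sum_{|\alpha|\leq k} \|\partial^\alpha u + \partial^\alpha v\|^p_{L^p}\Big)^{1/p} \leq \Big(\sum_{|\alpha|\leq k} \big(\|\partial^\alpha u\|_{L^p} + \|\partial^\alpha v\|_{L^p}\big)^p\Big)^{1/p} \\
&\leq \Big(\sum_{|\alpha|\leq k} \|\partial^\alpha u\|^p_{L^p}\Big)^{1/p} + \Big(\sum_{|\alpha|\leq k} \|\partial^\alpha v\|^p_{L^p}\Big)^{1/p} \\
&= \|u\|_{L^p_k} + \|v\|_{L^p_k}.
\end{align*}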

Localisations of Sobolev spaces. We define local versions of the spaces $L^p_k(\Omega)$ similarly to local versions of $L^p$-spaces. We will say that $f \in L^p_k(\Omega)_{loc}$ if $\varphi f \in L^p_k(\Omega)$ for all $\varphi \in C_0^\infty(\Omega)$. We will write $f_m \to f$ in $L^p_k(\Omega)_{loc}$ as $m \to \infty$, if $f$ and $f_m$ belong to $L^p_k(\Omega)_{loc}$ for all $m$, and if $\varphi f_m \to \varphi f$ in $L^p_k(\Omega)$ as $m \to \infty$, for all $\varphi \in C_0^\infty(\Omega)$.

1.3.12 Example of a point singularity


An often encountered example of a function with a point singularity is $u(x) = |x|^{-a}$, defined for $x \in \Omega = B(0, 1) \subset \mathbb{R}^n$, $x \neq 0$. We may ask a question: for which $a > 0$ do we have $u \in L^p_1(\Omega)$?

First we observe that away from the origin, $u$ is a smooth function and can be differentiated pointwise with $\partial_{x_j} u = -a x_j |x|^{-a-2}$ and hence also $|\nabla u(x)| = |a|\,|x|^{-a-1}$, $x \neq 0$. In particular, $|\nabla u| \in L^1(\Omega)$ for $a + 1 < n$. We also have $|\nabla u| \in L^p(\Omega)$ for $(a + 1)p < n$. So we must assume $a + 1 < n$ and $(a + 1)p < n$. Let us now calculate the weak (distributional) derivative of $u$ in $\Omega$. Let $\varphi \in C_0^\infty(\Omega)$. Let $\epsilon > 0$. On $\Omega \setminus B(0, \epsilon)$ we can integrate by parts to get
\[ \int_{\Omega\setminus B(0,\epsilon)} u\,\partial_{x_j}\varphi\,dx = -\int_{\Omega\setminus B(0,\epsilon)} \partial_{x_j}u\,\varphi\,dx + \int_{\partial B(0,\epsilon)} u\varphi\nu^j\,d\sigma, \]
where $d\sigma$ is the surface measure on the sphere $\partial B(0, \epsilon)$ and $\nu = (\nu^1, \ldots, \nu^n)$ is the inward pointing normal on $\partial B(0, \epsilon)$. Now, since $u = \epsilon^{-a}$ on $\partial B(0, \epsilon)$, we can estimate
\[ \left|\int_{\partial B(0,\epsilon)} u\varphi\nu^j\,d\sigma\right| \leq \|\varphi\|_{L^\infty} \int_{\partial B(0,\epsilon)} \epsilon^{-a}\,d\sigma \leq C\epsilon^{n-1-a} \to 0 \]
as $\epsilon \to 0$, since $a + 1 < n$. Passing to the limit in the integration by parts formula, we get
\[ \int_\Omega u\,\partial_{x_j}\varphi\,dx = -\int_\Omega \partial_{x_j}u\,\varphi\,dx, \]
which means that $\partial_{x_j} u$ is also the weak derivative of $u$. So, $u \in L^p_1(\Omega)$ if $u, |\nabla u| \in L^p(\Omega)$, which holds for $(a + 1)p < n$, i.e. for $a < (n - p)/p$.
We leave it as an exercise to find conditions on $a$ for which $u \in L^p_k(\Omega)$.
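
The integrability conditions used above can be made explicit by passing to polar coordinates. The following worked computation is a sketch for $1 \leq p < \infty$; the notation $|S^{n-1}|$ for the surface area of the unit sphere and the auxiliary exponent $s$ are introduced here and are not taken from the notes.

```latex
% For an exponent s >= 0, integrate |x|^{-s} over the unit ball in polar coordinates:
\[
\int_{B(0,1)} |x|^{-s}\,dx
  = |S^{n-1}| \int_0^1 r^{n-1-s}\,dr
  = \begin{cases}
      \dfrac{|S^{n-1}|}{n-s}, & \text{if } s < n, \\[2mm]
      \infty, & \text{if } s \geq n.
    \end{cases}
\]
% Taking s = ap gives u \in L^p(\Omega) if and only if ap < n.
% Taking s = (a+1)p gives |\nabla u| \in L^p(\Omega) if and only if (a+1)p < n,
% which is the stronger condition and is equivalent to a < (n-p)/p.
```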

1.3.13 Some properties of Sobolev spaces


Since $L^p(\Omega) \subset \mathcal{D}'(\Omega)$, we can work with $u \in L^p(\Omega)$ both as with functions and as with distributions. In particular, we can differentiate them distributionally, etc. Moreover, as we have already seen, the equality of objects (be it functions, functionals, distributions, etc.) depends on the spaces in which the equality is considered. In Sobolev spaces we can use tools from measure theory, so we work with functions defined almost everywhere. Thus, an equality in Sobolev spaces (as in the following theorem) means pointwise equality almost everywhere.
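Theorem. Let $u, v \in L^p_k(\Omega)$, and let $\alpha$ be a multi-index with $|\alpha| \leq k$. Then

(i) $\partial^\alpha u \in L^p_{k-|\alpha|}(\Omega)$, and $\partial^\alpha(\partial^\beta u) = \partial^\beta(\partial^\alpha u) = \partial^{\alpha+\beta} u$, for all multi-indices $\alpha, \beta$ such that $|\alpha| + |\beta| \leq k$.

(ii) For all $\lambda, \mu \in \mathbb{C}$ we have $\lambda u + \mu v \in L^p_k(\Omega)$ and $\partial^\alpha(\lambda u + \mu v) = \lambda\partial^\alpha u + \mu\partial^\alpha v$.

(iii) If $\widetilde{\Omega}$ is an open subset of $\Omega$, then $u \in L^p_k(\widetilde{\Omega})$.

(iv) If $\chi \in C_0^\infty(\Omega)$, then $\chi u \in L^p_k(\Omega)$ and we have the Leibniz formula
\[ \partial^\alpha(\chi u) = \sum_{\beta\leq\alpha} C_\alpha^\beta\,(\partial^\beta\chi)(\partial^{\alpha-\beta}u), \]
where the binomial coefficient is $C_\alpha^\beta = \frac{\alpha!}{\beta!(\alpha-\beta)!}$.

(v) $L^p_k(\Omega)$ is a Banach space.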

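Proof. Statements (i), (ii), and (iii) are easy. For example, if $\varphi \in C_0^\infty(\Omega)$ then also $\partial^\beta\varphi \in C_0^\infty(\Omega)$, and (i) follows from
\[ \int_\Omega \partial^\alpha u\,\partial^\beta\varphi\,dx = (-1)^{|\alpha|}\int_\Omega u\,\partial^{\alpha+\beta}\varphi\,dx = (-1)^{|\alpha|+|\alpha+\beta|}\int_\Omega \partial^{\alpha+\beta}u\,\varphi\,dx, \]
since $(-1)^{|\alpha|+|\alpha+\beta|} = (-1)^{|\beta|}$.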


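Let us now show (iv). The proof will be carried out by induction on $|\alpha|$. For $|\alpha| = 1$, writing $\langle u, \varphi\rangle$ for $u(\varphi) = \int_\Omega u\varphi\,dx$, we get

\[ \langle\partial^\alpha(\chi u), \varphi\rangle = (-1)^{|\alpha|}\langle u, \chi\partial^\alpha\varphi\rangle = -\langle u, \partial^\alpha(\chi\varphi) - (\partial^\alpha\chi)\varphi\rangle = \langle\chi\partial^\alpha u, \varphi\rangle + \langle(\partial^\alpha\chi)u, \varphi\rangle, \]

which is what was required. Now, suppose that the Leibniz formula is valid for all $|\beta| \leq l$, and let us take $\alpha$ with $|\alpha| = l + 1$. Then we can write $\alpha = \beta + \gamma$ with some $|\beta| = l$ and $|\gamma| = 1$. We get
\begin{align*}
\langle\chi u, \partial^\alpha\varphi\rangle &= \langle\chi u, \partial^\beta(\partial^\gamma\varphi)\rangle \\
&= (-1)^{|\beta|}\langle\partial^\beta(\chi u), \partial^\gamma\varphi\rangle && \text{(by induction hypothesis)} \\
&= (-1)^{|\beta|}\Big\langle\sum_{\sigma\leq\beta} C_\beta^\sigma\,\partial^\sigma\chi\,\partial^{\beta-\sigma}u, \partial^\gamma\varphi\Big\rangle && \text{(by definition)} \\
&= (-1)^{|\beta|+|\gamma|}\Big\langle\sum_{\sigma\leq\beta} C_\beta^\sigma\,\partial^\gamma(\partial^\sigma\chi\,\partial^{\beta-\sigma}u), \varphi\Big\rangle && \text{(set } \rho = \sigma + \gamma\text{)} \\
&= (-1)^{|\alpha|}\Big\langle\sum_{\sigma\leq\beta} C_\beta^\sigma\big(\partial^\rho\chi\,\partial^{\alpha-\rho}u + \partial^\sigma\chi\,\partial^{\alpha-\sigma}u\big), \varphi\Big\rangle \\
&= (-1)^{|\alpha|}\Big\langle\sum_{\rho\leq\alpha} C_\alpha^\rho\,\partial^\rho\chi\,\partial^{\alpha-\rho}u, \varphi\Big\rangle,
\end{align*}
where we used that $C_\beta^\sigma + C_\beta^\rho = C_{\alpha-\gamma}^{\rho-\gamma} + C_{\alpha-\gamma}^\rho = C_\alpha^\rho$.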
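Now let us prove (v). We have already shown in 1.3.11 that $L^p_k(\Omega)$ is a normed space. Let us show now that the completeness of $L^p(\Omega)$ implies the completeness of $L^p_k(\Omega)$. Let $u_m$ be a Cauchy sequence in $L^p_k(\Omega)$. Then $\partial^\alpha u_m$ is a Cauchy sequence in $L^p(\Omega)$ for any $|\alpha| \leq k$. Since $L^p(\Omega)$ is complete, there exists some $u_\alpha \in L^p(\Omega)$ such that $\partial^\alpha u_m \to u_\alpha$ in $L^p(\Omega)$. Let $u = u_{(0,\cdots,0)}$, so in particular, we have $u_m \to u$ in $L^p(\Omega)$. Let us now show that in fact $u \in L^p_k(\Omega)$ and $\partial^\alpha u = u_\alpha$ for all $|\alpha| \leq k$. Let $\varphi \in C_0^\infty(\Omega)$. Then
\begin{align*}
\langle\partial^\alpha u, \varphi\rangle &= (-1)^{|\alpha|}\langle u, \partial^\alpha\varphi\rangle = (-1)^{|\alpha|}\lim_{m\to\infty}\langle u_m, \partial^\alpha\varphi\rangle \\
&= \lim_{m\to\infty}\langle\partial^\alpha u_m, \varphi\rangle = \langle u_\alpha, \varphi\rangle,
\end{align*}
which implies $u \in L^p_k(\Omega)$ and $\partial^\alpha u = u_\alpha$. Moreover, we have $\partial^\alpha u_m \to \partial^\alpha u$ in $L^p(\Omega)$ for all $|\alpha| \leq k$, which means that $u_m \to u$ in $L^p_k(\Omega)$ and hence $L^p_k(\Omega)$ is complete.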

1.3.14 Mollifiers
In 1.3.8 we saw that we can approximate irregular functions or (tempered) distributions by much more regular functions. The argument relied on the use of Fourier analysis and worked well on $\mathbb{R}^n$. Such a technique is very powerful, as could be seen from the proof of Plancherel's formula. On the other hand, when working in subsets of $\mathbb{R}^n$ we may be unable to use the Fourier transform (since for its definition we used the whole space $\mathbb{R}^n$). We want to be able to approximate functions (or distributions) by smooth functions without using Fourier techniques. This turns out to be possible using the so-called mollification of functions.
Assume for a moment that we are in $\mathbb{R}^n$ again and let us first argue very informally. Let us first look at the Fourier transform of the convolution with a $\delta$-distribution. Thus, for a function $f$ we must have $\widehat{\delta * f} = \widehat{\delta}\widehat{f} = \widehat{f}$, if we use that $\widehat{\delta} = 1$. Taking the inverse Fourier transform we obtain the important identity

\[ \delta * f = f, \]

which will be justified formally later. Now, if we take a sequence of smooth functions $\eta_\epsilon$ approximating the $\delta$-distribution, i.e. if $\eta_\epsilon \to \delta$ in some sense, and if this convergence is preserved by the convolution, we should get

\[ \eta_\epsilon * f \to \delta * f = f \text{ as } \epsilon \to 0. \]

Now, the convolution $\eta_\epsilon * f$ may be defined locally in $\mathbb{R}^n$, and the functions $\eta_\epsilon * f$ will be smooth if the $\eta_\epsilon$ are, thus giving us a way to approximate $f$. We will now make this argument precise. For this, we will proceed in a straightforward manner by looking at the limit of $\eta_\epsilon * f$ for a suitably chosen sequence of functions $\eta_\epsilon$, without referring either to the $\delta$-distribution or to the Fourier transform.
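For an open set $\Omega \subset \mathbb{R}^n$ and $\epsilon > 0$ we define $\Omega_\epsilon = \{x \in \Omega : \operatorname{dist}(x, \partial\Omega) > \epsilon\}$. Let us define $\eta \in C_0^\infty(\mathbb{R}^n)$ by
\[ \eta(x) = \begin{cases} C e^{\frac{1}{|x|^2-1}}, & \text{if } |x| < 1, \\ 0, & \text{if } |x| \geq 1, \end{cases} \]
where the constant $C$ is chosen so that $\int_{\mathbb{R}^n} \eta\,dx = 1$. The function $\eta$ is called a mollifier. For $\epsilon > 0$, we define
\[ \eta_\epsilon(x) = \frac{1}{\epsilon^n}\,\eta\Big(\frac{x}{\epsilon}\Big), \]
so that $\operatorname{supp} \eta_\epsilon \subset \overline{B(0, \epsilon)}$ and $\int_{\mathbb{R}^n} \eta_\epsilon\,dx = 1$.
Let $f \in L^1_{loc}(\Omega)$. A mollification of $f$ corresponding to $\eta$ is a family $f^\epsilon = \eta_\epsilon * f$ in $\Omega_\epsilon$, i.e.
\[ f^\epsilon(x) = \int_\Omega \eta_\epsilon(x - y)f(y)\,dy = \int_{B(0,\epsilon)} \eta_\epsilon(y)f(x - y)\,dy, \quad \text{for } x \in \Omega_\epsilon. \]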

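Theorem. Let $f \in L^1_{loc}(\Omega)$. Then we have the following properties.

(i) $f^\epsilon \in C^\infty(\Omega_\epsilon)$.

(ii) $f^\epsilon \to f$ almost everywhere as $\epsilon \to 0$.

(iii) If $f \in C(\Omega)$, then $f^\epsilon \to f$ uniformly on compact subsets of $\Omega$.

(iv) $f^\epsilon \to f$ in $L^p_{loc}(\Omega)$ for all $1 \leq p < \infty$.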
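
The mollification can be illustrated in one dimension. The following is a rough numerical sketch, assuming n = 1, a step function as the function to be smoothed, and midpoint quadrature for both the normalising constant and the convolution; none of these choices come from the notes.

```python
import math

def eta_raw(x):
    # Unnormalised mollifier exp(1/(x^2 - 1)) on (-1, 1), zero elsewhere.
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def midpoint(g, a, b, steps=4000):
    # Composite midpoint rule on [a, b].
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# Normalising constant C, chosen so that the mollifier integrates to 1.
NORM = 1.0 / midpoint(eta_raw, -1.0, 1.0)

def eta_eps(x, eps):
    # Rescaled mollifier eps^{-1} * eta(x / eps) in dimension n = 1.
    return NORM * eta_raw(x / eps) / eps

def mollify(f, x, eps, steps=4000):
    # f^eps(x) = int_{-eps}^{eps} eta_eps(y) f(x - y) dy
    return midpoint(lambda y: eta_eps(y, eps) * f(x - y), -eps, eps, steps)

def step(x):
    # Discontinuous but locally integrable function.
    return 1.0 if x >= 0.0 else 0.0

for x in (-0.5, 0.0, 0.05, 0.5):
    print(x, mollify(step, x, 0.1))
```

Away from the jump (at distance more than eps) the mollification reproduces the step function, while inside the eps-neighbourhood of the jump it interpolates smoothly between 0 and 1.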


Proof. To show (i), we can differentiate $f^\epsilon(x) = \int_\Omega \eta_\epsilon(x - y)f(y)\,dy$ under the integral sign and use the fact that $f \in L^1_{loc}(\Omega)$. The proof of (ii) will rely on the following

Lebesgue's differentiation theorem. Let $f \in L^1_{loc}(\Omega)$. Then
\[ \lim_{r\to 0} \frac{1}{|B(x, r)|}\int_{B(x,r)} |f(y) - f(x)|\,dy = 0 \quad \text{for a.e. } x \in \Omega. \]

Now, for all $x$ for which the statement of Lebesgue's differentiation theorem is true, we can estimate
\begin{align*}
|f^\epsilon(x) - f(x)| &= \left|\int_{B(x,\epsilon)} \eta_\epsilon(x - y)(f(y) - f(x))\,dy\right| \\
&\leq \epsilon^{-n}\int_{B(x,\epsilon)} \eta\Big(\frac{x - y}{\epsilon}\Big)|f(y) - f(x)|\,dy \\
&\leq C\,\frac{1}{|B(x, \epsilon)|}\int_{B(x,\epsilon)} |f(y) - f(x)|\,dy,
\end{align*}
where the last expression goes to zero as $\epsilon \to 0$. For (iii), let $K$ be a compact subset of $\Omega$. Let $K_0 \subset \Omega$ be another compact set such that $K$ is contained in the interior of $K_0$. Then $f$ is uniformly continuous on $K_0$ and the limit in Lebesgue's differentiation theorem holds uniformly for $x \in K$. The same argument as in (ii) then shows that $f^\epsilon \to f$ uniformly on $K$.
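Finally, to show (iv), let us choose open sets $U \subset V \subset \Omega$ such that $U \subset V_\delta$ and $V \subset \Omega_\delta$ for some small $\delta > 0$. Let us show first that $\|f^\epsilon\|_{L^p(U)} \leq \|f\|_{L^p(V)}$ for all sufficiently small $\epsilon > 0$. Indeed, for all $x \in U$, we can estimate
\begin{align*}
|f^\epsilon(x)| &= \left|\int_{B(x,\epsilon)} \eta_\epsilon(x - y)f(y)\,dy\right| \\
&\leq \int_{B(x,\epsilon)} \eta_\epsilon^{1-1/p}(x - y)\,\eta_\epsilon^{1/p}(x - y)|f(y)|\,dy && \text{(H\"older's inequality)} \\
&\leq \left(\int_{B(x,\epsilon)} \eta_\epsilon(x - y)\,dy\right)^{1-1/p}\left(\int_{B(x,\epsilon)} \eta_\epsilon(x - y)|f(y)|^p\,dy\right)^{1/p}.
\end{align*}
Since $\int_{B(x,\epsilon)} \eta_\epsilon(x - y)\,dy = 1$, we get
\begin{align*}
\int_U |f^\epsilon(x)|^p\,dx &\leq \int_U\left(\int_{B(x,\epsilon)} \eta_\epsilon(x - y)|f(y)|^p\,dy\right)dx \\
&\leq \int_V\left(\int_{B(y,\epsilon)} \eta_\epsilon(x - y)\,dx\right)|f(y)|^p\,dy \\
&= \int_V |f(y)|^p\,dy.
\end{align*}
Now, let $\delta > 0$ and let us choose $g \in C(V)$ such that $\|f - g\|_{L^p(V)} < \delta$. (Here we used the fact that $C(V)$ is (sequentially) dense in $L^p(V)$.) Then
\begin{align*}
\|f^\epsilon - f\|_{L^p(U)} &\leq \|f^\epsilon - g^\epsilon\|_{L^p(U)} + \|g^\epsilon - g\|_{L^p(U)} + \|g - f\|_{L^p(U)} \\
&\leq 2\|f - g\|_{L^p(V)} + \|g^\epsilon - g\|_{L^p(U)} \\
&< 2\delta + \|g^\epsilon - g\|_{L^p(U)}.
\end{align*}
Since $g^\epsilon \to g$ uniformly on the closure of $V$ by (iii), it follows that $\|f^\epsilon - f\|_{L^p(U)} \leq 3\delta$ for small enough $\epsilon > 0$, completing the proof of (iv).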
There is a simple but useful corollary of Lebesgue's differentiation theorem, partly explaining its name.
Corollary of Lebesgue's differentiation theorem. Let $f \in L^1_{loc}(\Omega)$. Then
\[ \lim_{r\to 0} \frac{1}{|B(x, r)|}\int_{B(x,r)} f(y)\,dy = f(x) \quad \text{for a.e. } x \in \Omega. \]

1.3.15 Approximation of Sobolev space functions


With the use of mollifications we can approximate functions in Sobolev spaces by smooth functions. We have a local approximation in the localised spaces $L^p_k(\Omega)_{loc}$, a global approximation in $L^p_k(\Omega)$, and further approximations dependent on the regularity of the boundary of $\Omega$. Although the set $\Omega$ is bounded, we still say that an approximation in $L^p_k(\Omega)$ is global if it works up to the boundary.
Local approximation by smooth functions. Assume that $\Omega \subset \mathbb{R}^n$ is open. Let $f \in L^p_k(\Omega)$ for $1 \leq p < \infty$ and $k \in \mathbb{N} \cup \{0\}$. Let $f^\epsilon = \eta_\epsilon * f$ in $\Omega_\epsilon$ be the mollification of $f$, $\epsilon > 0$. Then $f^\epsilon \in C^\infty(\Omega_\epsilon)$ and $f^\epsilon \to f$ in $L^p_k(\Omega)_{loc}$ as $\epsilon \to 0$, i.e. $f^\epsilon \to f$ in $L^p_k(K)$ as $\epsilon \to 0$ for all compact $K \subset \Omega$.
Proof. It was already proved in Theorem 1.3.14, (i), that $f^\epsilon \in C^\infty(\Omega_\epsilon)$. Since $f$ is locally integrable, we can differentiate the convolution under the integral sign to get $\partial^\alpha f^\epsilon = \eta_\epsilon * \partial^\alpha f$ in $\Omega_\epsilon$. Now, let $U$ be an open and bounded subset of $\Omega$ containing $K$, with closure contained in $\Omega$. Then by Theorem 1.3.14, (iv), we get $\partial^\alpha f^\epsilon \to \partial^\alpha f$ in $L^p(U)$ as $\epsilon \to 0$, for all $|\alpha| \leq k$. Hence
\[ \|f^\epsilon - f\|^p_{L^p_k(U)} = \sum_{|\alpha|\leq k} \|\partial^\alpha f^\epsilon - \partial^\alpha f\|^p_{L^p(U)} \to 0 \]
as $\epsilon \to 0$, proving the statement.


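Global approximation by smooth functions. Assume that $\Omega \subset \mathbb{R}^n$ is open and bounded. Let $f \in L^p_k(\Omega)$ for $1 \leq p < \infty$ and $k \in \mathbb{N} \cup \{0\}$. Then there is a sequence $f_m \in C^\infty(\Omega) \cap L^p_k(\Omega)$ such that $f_m \to f$ in $L^p_k(\Omega)$.
Proof. Let us write $\Omega = \bigcup_{j=1}^\infty \Omega_j$, where

\[ \Omega_j = \{x \in \Omega : \operatorname{dist}(x, \partial\Omega) > 1/j\}. \]

Let $V_j = \Omega_{j+3} \setminus \overline{\Omega_{j+1}}$ (this definition will be very important). Take also any open $V_0$ with $\overline{V_0} \subset \Omega$ so that $\Omega = \bigcup_{j=0}^\infty V_j$. Let $\chi_j$ be a partition of unity subordinate to $V_j$, i.e. a family $\chi_j \in C_0^\infty(V_j)$ such that $0 \leq \chi_j \leq 1$ and $\sum_{j=0}^\infty \chi_j = 1$ in $\Omega$. Then $\chi_j f \in L^p_k(\Omega)$ and $\operatorname{supp}(\chi_j f) \subset V_j$. Let us fix some $\delta > 0$ and choose $\epsilon_j > 0$ so small that the function $f^j = \eta_{\epsilon_j} * (\chi_j f)$ is supported in $W_j = \Omega_{j+4} \setminus \overline{\Omega_j}$ and satisfies

\[ \|f^j - \chi_j f\|_{L^p_k(\Omega)} \leq \delta 2^{-j-1} \]

for all $j$. Let now $g = \sum_{j=0}^\infty f^j$. Then $g \in C^\infty(\Omega)$ since in any open set $U$ with $\overline{U} \subset \Omega$ there are only finitely many non-zero terms in the sum. Moreover, since $f = \sum_{j=0}^\infty \chi_j f$, for each such $U$ we have
\[ \|g - f\|_{L^p_k(U)} \leq \sum_{j=0}^\infty \|f^j - \chi_j f\|_{L^p_k(\Omega)} \leq \delta\sum_{j=0}^\infty \frac{1}{2^{j+1}} = \delta. \]

Taking the supremum over all such open subsets $U$ of $\Omega$, we obtain $\|g - f\|_{L^p_k(\Omega)} \leq \delta$, completing the proof.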
In general, there are many versions of these results depending on the set $\Omega$, in particular on the regularity of its boundary. For example, we give here without proof the following
Further result. Let $\Omega$ be a bounded subset of $\mathbb{R}^n$ with $C^1$ boundary. Let $f \in L^p_k(\Omega)$ for $1 \leq p < \infty$ and $k \in \mathbb{N} \cup \{0\}$. Then there is a sequence $f_m \in C^\infty(\overline{\Omega})$ such that $f_m \to f$ in $L^p_k(\Omega)$.

2 Pseudo-differential operators on Rn
2.1 Analysis of operators
2.1.1 Motivation and definition
We will start with an informal observation that if $T$ is a translation invariant linear operator on some space of functions on $\mathbb{R}^n$, then we can write

\[ T(e^{2\pi i x\cdot\xi}) = a(\xi)e^{2\pi i x\cdot\xi} \quad \text{for all } \xi \in \mathbb{R}^n. \]

Indeed, if $T$ acts on functions of the variable $y$, we can denote $f(x, \xi) = T(e^{2\pi i y\cdot\xi})(x)$. To avoid ambiguity here and in the sequel, denoting $e_\xi(x) = e^{2\pi i x\cdot\xi}$, we have $f(x, \xi) = (Te_\xi)(x)$. Let $(\tau_h f)(x) = f(x - h)$ be the translation operator by $h \in \mathbb{R}^n$. We say that $T$ is translation invariant if $T\tau_h = \tau_h T$ for all $h$. By our assumptions on $T$ we get

\[ f(x + h, \xi) = T(e^{2\pi i(y+h)\cdot\xi})(x) = e^{2\pi i h\cdot\xi}\,T(e^{2\pi i y\cdot\xi})(x) = e^{2\pi i h\cdot\xi}f(x, \xi). \]

Now, setting $x = 0$, we get $f(h, \xi) = e^{2\pi i h\cdot\xi}f(0, \xi)$, so we obtain the desired formula with $a(\xi) = f(0, \xi)$. If we now formally apply $T$ to the Fourier inversion formula
\[ f(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\widehat{f}(\xi)\,d\xi \]
and use the linearity of $T$, we obtain
\[ Tf(x) = \int_{\mathbb{R}^n} T(e^{2\pi i x\cdot\xi})\widehat{f}(\xi)\,d\xi = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}a(\xi)\widehat{f}(\xi)\,d\xi. \]

This formula allows one to reduce certain properties of the operator $T$ to properties of the multiplication by the corresponding function $a(\xi)$, called the symbol of $T$. For example, continuity of $T$ on $L^2$ would reduce to the boundedness of $a(\xi)$, composition of two operators $T_1 \circ T_2$ would reduce to the multiplication of their symbols $a_1(\xi)a_2(\xi)$, etc.
Pseudo-differential operators will extend this construction to operators which are not necessarily translation invariant. In fact, as we saw above we can always denote
\[ a(x, \xi) := e^{-2\pi i x\cdot\xi}\,T(e^{2\pi i x\cdot\xi}), \]
so that we would have
\[ T(e^{2\pi i x\cdot\xi}) = e^{2\pi i x\cdot\xi}a(x, \xi). \]
Consequently, reasoning as above, we could arrive at the formula
\[ Tf(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}a(x, \xi)\widehat{f}(\xi)\,d\xi. \qquad (2.1) \]

Now, in order to avoid several rather informal conclusions in the arguments above, one usually takes the opposite route and adopts the last formula as the definition of the pseudo-differential operator with symbol $a(x, \xi)$. Such operators are then often denoted by $T_a$ or by $a(x, D)$.
The simplest and perhaps most useful class of symbols allowing this approach to work well is the following class denoted by $S^m$, or by $S^m_{1,0}$. We will say that $a \in S^m$ if $a = a(x, \xi)$ is smooth on $\mathbb{R}^n \times \mathbb{R}^n$ and if
\[ |\partial_x^\beta\partial_\xi^\alpha a(x, \xi)| \leq A_{\alpha\beta}(1 + |\xi|)^{m-|\alpha|} \quad \text{for all } \alpha, \beta \text{ and all } x, \xi \in \mathbb{R}^n. \]
Note that for partial differential operators symbols are just the characteristic polynomials. One can readily see that the symbol of the differential operator
\[ L = \sum_{|\alpha|\leq m} a_\alpha(x)\partial_x^\alpha \]
is given by
\[ a(x, \xi) = \sum_{|\alpha|\leq m} a_\alpha(x)(2\pi i\xi)^\alpha, \]
and $a \in S^m$ if the coefficients $a_\alpha$ and all of their derivatives are smooth and bounded on $\mathbb{R}^n$.
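
The relation $L(e^{2\pi i x\cdot\xi}) = a(x,\xi)e^{2\pi i x\cdot\xi}$ for a differential operator can be checked numerically in one dimension. The sketch below is only an illustration: the second-order operator with the coefficients in `COEFFS` is a hypothetical example chosen here, and derivatives are approximated by central finite differences.

```python
import cmath
import math

# Hypothetical smooth bounded coefficients a_0, a_1, a_2 of L = a_0 + a_1 d/dx + a_2 d^2/dx^2.
COEFFS = {
    0: lambda x: 1.0,
    1: lambda x: math.sin(x),
    2: lambda x: 1.0 / (1.0 + x * x),
}

def e_xi(x, xi):
    # The exponential e^{2 pi i x xi}.
    return cmath.exp(2j * math.pi * x * xi)

def symbol(x, xi):
    # a(x, xi) = sum_k a_k(x) (2 pi i xi)^k
    return sum(a(x) * (2j * math.pi * xi) ** k for k, a in COEFFS.items())

def apply_L(g, x, h=1e-4):
    # Apply L to g at the point x, using central finite differences.
    d1 = (g(x + h) - g(x - h)) / (2.0 * h)
    d2 = (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)
    return COEFFS[0](x) * g(x) + COEFFS[1](x) * d1 + COEFFS[2](x) * d2

x0, xi0 = 0.3, 0.7
lhs = apply_L(lambda y: e_xi(y, xi0), x0)   # (L e_xi)(x0)
rhs = symbol(x0, xi0) * e_xi(x0, xi0)       # a(x0, xi0) e_xi(x0)
print(abs(lhs - rhs))
```

The printed difference should be small, limited only by the finite-difference error.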

2.1.2 Freezing principle for PDEs


Suppose we want to solve the following equation for an unknown function $u = u(x)$:
\[ (Lu)(x) = \sum_{1\leq i,j\leq n} a_{ij}(x)\frac{\partial^2 u}{\partial x_i\partial x_j}(x) = f(x), \]
where the matrix $\{a_{ij}(x)\}$ is real-valued, smooth, symmetric and positive definite. If we want to proceed in analogy to the Laplace equation in 1.1.6, we should look for the inverse of the operator $L$. In the case of an operator with variable coefficients this may turn out to be difficult, so we may look for an approximate inverse $P$ such that
\[ LP = I + E, \]

where the error $E$ is small in some sense. To argue similarly to 1.1.6, we "freeze" $L$ at $x_0$ to get the constant coefficients operator
\[ L_{x_0} = \sum_{1\leq i,j\leq n} a_{ij}(x_0)\frac{\partial^2}{\partial x_i\partial x_j}. \]
Now, $L_{x_0}$ has the exact inverse which is an operator of multiplication by
\[ \Big(-4\pi^2\sum_{1\leq i,j\leq n} a_{ij}(x_0)\xi_i\xi_j\Big)^{-1} \]
on the Fourier transform side. To avoid a singularity at the origin, we introduce a cut-off function $\chi \in C^\infty$ which is 0 near the origin and 1 for large $\xi$. Then we define
\[ (P_{x_0}f)(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\Big(-4\pi^2\sum_{1\leq i,j\leq n} a_{ij}(x_0)\xi_i\xi_j\Big)^{-1}\chi(\xi)\widehat{f}(\xi)\,d\xi. \]
Then we can readily see that
\[ (L_{x_0}P_{x_0}f)(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\chi(\xi)\widehat{f}(\xi)\,d\xi = f(x) + \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}(\chi(\xi) - 1)\widehat{f}(\xi)\,d\xi. \]
It follows that
\[ L_{x_0}P_{x_0} = I + E_{x_0}, \]
where
\[ (E_{x_0}f)(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}(\chi(\xi) - 1)\widehat{f}(\xi)\,d\xi \]
is an operator of multiplication by a compactly supported function on the Fourier transform side. Writing it as a convolution with a smooth test function we can readily see that it is smoothing.
Now, we can "unfreeze" the point $x_0$ expecting that the inverse $P$ will be close to $P_{x_0}$ for $x$ close to $x_0$, and define
\[ (Pf)(x) = (P_x f)(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}\Big(-4\pi^2\sum_{1\leq i,j\leq n} a_{ij}(x)\xi_i\xi_j\Big)^{-1}\chi(\xi)\widehat{f}(\xi)\,d\xi. \]
It will be clear from a later discussion that we still have

\[ LP = I + E_1 \]

with error $E_1$ being "smoothing of order one". We can then set up an iterative procedure to improve the approximation of the inverse operator relying on the calculus of the appearing operators.

2.1.3 Pseudo-differential operators on the Schwartz space
It can be shown that if $a \in S^m$ and $f \in \mathcal{S}(\mathbb{R}^n)$, then $T_a f \in \mathcal{S}(\mathbb{R}^n)$. First we observe that the integral in (2.1) converges absolutely. The same is true for all of its derivatives with respect to $x$ by Lebesgue's dominated convergence theorem, which implies that $T_a f \in C^\infty(\mathbb{R}^n)$. Let us show now that in fact $T_a f \in \mathcal{S}(\mathbb{R}^n)$. Introducing the operator

\[ L_\xi = (1 + 4\pi^2|x|^2)^{-1}(I - \Delta_\xi) \]

with the property $L_\xi e^{2\pi i x\cdot\xi} = e^{2\pi i x\cdot\xi}$, integrating (2.1) by parts $N$ times yields
\[ (T_a f)(x) = \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}(L_\xi)^N[a(x, \xi)\widehat{f}(\xi)]\,d\xi. \]
From this we get
\[ |(T_a f)(x)| \leq C_N(1 + |x|)^{-2N} \]
for all $N$, so $T_a f$ is rapidly decreasing. The same argument applies to derivatives of $T_a f$ to show that $T_a f \in \mathcal{S}(\mathbb{R}^n)$.
The following convergence criterion will be useful in the sequel. It follows directly from Lebesgue's dominated convergence theorem.
Convergence criterion for pseudo-differential operators. Suppose we have a sequence of symbols $a_k \in S^m$ which satisfies uniform symbolic estimates

\[ |\partial_x^\beta\partial_\xi^\alpha a_k(x, \xi)| \leq A_{\alpha\beta}(1 + |\xi|)^{m-|\alpha|}, \]

for all $\alpha, \beta$, all $x, \xi \in \mathbb{R}^n$, and all $k$. Suppose that $a \in S^m$ is such that $a_k(x, \xi)$ and all of its derivatives converge to $a(x, \xi)$ and its derivatives, respectively, pointwise as $k \to \infty$. Then $T_{a_k}f \to T_a f$ in $\mathcal{S}(\mathbb{R}^n)$ for any $f \in \mathcal{S}(\mathbb{R}^n)$.

2.1.4 Alternative definition of pseudo-differential operators


If we write out the Fourier transform in (2.1), we obtain
\[ (T_a f)(x) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i(x-y)\cdot\xi}a(x, \xi)f(y)\,dy\,d\xi. \qquad (2.2) \]

However, a problem with this argument and with this formula is that the $\xi$-integral does not converge absolutely even for $f \in \mathcal{S}(\mathbb{R}^n)$. To overcome this difficulty, one uses the idea of approximating $a(x, \xi)$ by symbols with compact support. To this end, let us fix some $\gamma \in C_0^\infty(\mathbb{R}^n \times \mathbb{R}^n)$ such that $\gamma = 1$ near the origin. Let us now define

\[ a_\epsilon(x, \xi) = a(x, \xi)\gamma(\epsilon x, \epsilon\xi). \]

Then one can readily check that $a_\epsilon \in C_0^\infty(\mathbb{R}^n \times \mathbb{R}^n)$ and

• if $a \in S^m$, then $a_\epsilon \in S^m$ uniformly in $0 < \epsilon \leq 1$ (this means that constants in symbolic inequalities may be chosen independent of $0 < \epsilon \leq 1$);

• $a_\epsilon \to a$ pointwise as $\epsilon \to 0$, uniformly in $0 < \epsilon \leq 1$. The same is true for derivatives of $a_\epsilon$ and $a$.

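It follows now from the convergence criterion of 2.1.3 that

\[ T_{a_\epsilon}f \to T_a f \text{ in } \mathcal{S}(\mathbb{R}^n) \text{ as } \epsilon \to 0, \]

for all $f \in \mathcal{S}(\mathbb{R}^n)$. Here $T_a f$ is defined as in (2.1). Now, formula (2.2) does make sense for $a_\epsilon \in C_0^\infty$, so we may define the double integral in (2.2) as the limit in $\mathcal{S}(\mathbb{R}^n)$ of $T_{a_\epsilon}f$, i.e. take
\[ (T_a f)(x) = \lim_{\epsilon\to 0}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i(x-y)\cdot\xi}a_\epsilon(x, \xi)f(y)\,dy\,d\xi, \quad f \in \mathcal{S}(\mathbb{R}^n). \]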

2.1.5 Pseudo-differential operators on S 0 (Rn )


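Recall that we can define the $L^2$-adjoint $T_a^*$ of the operator $T_a$ by the formula

\[ (T_a f, g)_{L^2} = (f, T_a^* g)_{L^2}, \quad f, g \in \mathcal{S}(\mathbb{R}^n), \]

where
\[ (u, v)_{L^2} = \int_{\mathbb{R}^n} u(x)\overline{v(x)}\,dx \]
is the usual $L^2$-inner product. From this formula we can readily calculate that
\[ (T_a^* g)(y) = \lim_{\epsilon\to 0}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i(y-x)\cdot\xi}\,\overline{a_\epsilon(x, \xi)}\,g(x)\,dx\,d\xi, \quad g \in \mathcal{S}(\mathbb{R}^n). \]
With the same understanding of divergent integrals as in (2.2) and replacing $x$ by $z$ to eliminate any confusion, we can write
\[ (T_a^* g)(y) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i(y-z)\cdot\xi}\,\overline{a(z, \xi)}\,g(z)\,dz\,d\xi, \quad g \in \mathcal{S}(\mathbb{R}^n). \]
As before, by integration by parts, we can check that $T_a^* : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$. Let $u \in \mathcal{S}'(\mathbb{R}^n)$. We can now define $T_a u$ by the formula

\[ (T_a u)(\varphi) = u(\overline{T_a^*\overline{\varphi}}) \quad \text{for all } \varphi \in \mathcal{S}(\mathbb{R}^n). \]

We clearly have
\[ \overline{T_a^*\overline{\varphi}}(y) = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i(z-y)\cdot\xi}a(z, \xi)\varphi(z)\,dz\,d\xi, \]
so if $u, \varphi \in \mathcal{S}(\mathbb{R}^n)$, we have the consistency in
\[ (T_a u)(\varphi) = \int_{\mathbb{R}^n} T_a u(x)\varphi(x)\,dx = (T_a u, \overline{\varphi})_{L^2} = (u, T_a^*\overline{\varphi})_{L^2} = \int_{\mathbb{R}^n} u(x)\overline{T_a^*\overline{\varphi}(x)}\,dx = u(\overline{T_a^*\overline{\varphi}}). \]
One can readily check that $T_a u \in \mathcal{S}'(\mathbb{R}^n)$ and that $T_a : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is continuous. Indeed, let $u_k \to u$ in $\mathcal{S}'(\mathbb{R}^n)$. Then we have

\[ (T_a u_k)(\varphi) = u_k(\overline{T_a^*\overline{\varphi}}) \to u(\overline{T_a^*\overline{\varphi}}) = (T_a u)(\varphi), \]

so $T_a u_k \to T_a u$ in $\mathcal{S}'(\mathbb{R}^n)$ and, therefore, $T_a : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is continuous.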

2.1.6 Kernel representation of pseudo-differential operators
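Summarising, we can now write pseudo-differential operators in different ways:
\begin{align*}
T_a f(x) &= \int_{\mathbb{R}^n} e^{2\pi i x\cdot\xi}a(x, \xi)\widehat{f}(\xi)\,d\xi \\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i(x-y)\cdot\xi}a(x, \xi)f(y)\,dy\,d\xi \\
&= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} e^{2\pi i z\cdot\xi}a(x, \xi)f(x - z)\,dz\,d\xi \\
&= \int_{\mathbb{R}^n} k(x, z)f(x - z)\,dz \\
&= \int_{\mathbb{R}^n} K(x, y)f(y)\,dy,
\end{align*}
with kernels
\[ K(x, y) = k(x, x - y), \qquad k(x, z) = \int_{\mathbb{R}^n} e^{2\pi i z\cdot\xi}a(x, \xi)\,d\xi. \]
Theorem. Let $a \in S^m$. Then the kernel $K(x, y)$ satisfies
\[ |\partial^\beta_{x,y}K(x, y)| \leq C|x - y|^{-N} \]
for $N > m + n + |\beta|$ and $x \neq y$. Thus, for $x \neq y$, the kernel $K(x, y)$ is a smooth function, rapidly decreasing as $|x - y| \to \infty$.
Proof. We notice that $k(x, \cdot)$ is the inverse Fourier transform of $a(x, \cdot)$. It follows then that $(-2\pi i z)^\alpha\partial_z^\beta k(x, z)$ is the inverse Fourier transform with respect to $\xi$ of the derivative $\partial_\xi^\alpha\left[(2\pi i\xi)^\beta a(x, \xi)\right]$, i.e.

\[ (-2\pi i z)^\alpha\partial_z^\beta k(x, z) = \mathcal{F}_\xi^{-1}\left(\partial_\xi^\alpha\left[(2\pi i\xi)^\beta a(x, \xi)\right]\right)(z). \]

Since $(2\pi i\xi)^\beta a(x, \xi) \in S^{m+|\beta|}$ is a symbol of order $m + |\beta|$, we have that

\[ \left|\partial_\xi^\alpha\left[(2\pi i\xi)^\beta a(x, \xi)\right]\right| \leq C_{\alpha\beta}\langle\xi\rangle^{m+|\beta|-|\alpha|}. \]

Therefore, $\partial_\xi^\alpha\left[(2\pi i\xi)^\beta a(x, \xi)\right]$ is in $L^1(\mathbb{R}^n_\xi)$ with respect to $\xi$, if $|\alpha| > m + n + |\beta|$. Consequently, its inverse Fourier transform is bounded:

\[ (-2\pi i z)^\alpha\partial_z^\beta k(x, z) \in L^\infty(\mathbb{R}^n_z) \quad \text{for } |\alpha| > m + n + |\beta|. \]

Since taking derivatives of $k(x, z)$ with respect to $x$ does not change the argument, this implies the statement of the theorem.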

2.1.7 Smoothing operators


We can define symbols of order $-\infty$ by setting $S^{-\infty} = \bigcap_{m\in\mathbb{R}} S^m$, so that $a \in S^{-\infty}$ if $a \in C^\infty$ and if
\[ |\partial_x^\beta\partial_\xi^\alpha a(x, \xi)| \leq A_{\alpha\beta N}(1 + |\xi|)^{-N}, \]
for all $N$.
Proposition. Let $a \in S^{-\infty}$. Then the integral kernel $K$ of $T_a$ is smooth on $\mathbb{R}^n \times \mathbb{R}^n$.
Proof. Since $a(x, \cdot) \in L^1(\mathbb{R}^n)$, we immediately get $k \in L^\infty(\mathbb{R}^n)$. Moreover,
\[ \partial_x^\beta\partial_z^\alpha k(x, z) = \int_{\mathbb{R}^n} e^{2\pi i z\cdot\xi}(2\pi i\xi)^\alpha\partial_x^\beta a(x, \xi)\,d\xi. \]

Since $(2\pi i\xi)^\alpha\partial_x^\beta a(x, \xi)$ is absolutely integrable, it follows from Lebesgue's dominated convergence theorem that $\partial_x^\beta\partial_z^\alpha k$ is continuous. This is true for all $\alpha, \beta$, hence $k$, and then also $K$, are smooth. Let us write $k_x(\cdot) = k(x, \cdot)$.
Corollary. Let $a \in S^{-\infty}$. Then $k_x \in \mathcal{S}(\mathbb{R}^n)$. We have
\[ T_a f(x) = (k_x * f)(x) \]
and, consequently, $T_a f \in C^\infty(\mathbb{R}^n)$ for all $f \in \mathcal{S}'(\mathbb{R}^n)$.
We note that the convolution in the corollary is understood in the sense of distributions. We will prove it and discuss this notion in detail in the next section.

2.1.8 Convolution of distributions


We can write the convolution of two functions $f, g \in \mathcal{S}(\mathbb{R}^n)$ in the following way
\[ (f * g)(x) = \int_{\mathbb{R}^n} f(z)g(x - z)\,dz = \int_{\mathbb{R}^n} f(z)(\tau_x Rg)(z)\,dz, \]
where $(Rg)(x) = g(-x)$ and $(\tau_h g)(x) = g(x - h)$, so that
\[ (\tau_x Rg)(z) = (Rg)(z - x) = g(x - z). \]
Recalling our identification of functions with distributions, we can write
\[ (f * g)(x) = f(\tau_x Rg). \]
This can now be extended to distributions. Indeed, for $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$, we can define
\[ (u * \varphi)(x) = u(\tau_x R\varphi), \]
which makes sense since $\tau_x R\varphi \in \mathcal{S}(\mathbb{R}^n)$ and since $\tau_x, R : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ are continuous.
Lemma. Let $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Then $u * \varphi \in C^\infty(\mathbb{R}^n)$.
Proof. We can observe that $(u * \varphi)(x) = u(\tau_x R\varphi)$ is continuous in $x$ since $\tau_x : \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ and $u : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$ are continuous. The same applies when we look at derivatives in $x$, implying that $u * \varphi$ is smooth. Here we note that we are allowed to pass the limit through $u$ since it is a continuous functional.
Now Corollary 2.1.7 follows from the fact that for $a \in S^{-\infty}$ we can write $T_a f(x) = (k_x * f)(x)$ with $k_x(\cdot) = k(x, \cdot) \in \mathcal{S}(\mathbb{R}^n)$. So
\[ (T_a f)(x) = f(\tau_x Rk_x). \]
If now $f \in \mathcal{S}'(\mathbb{R}^n)$, it follows that $T_a f \in C^\infty$ because of the continuity of $f(\tau_x Rk_x)$ and all of its derivatives with respect to $x$.

2.1.9 L2 –boundedness of pseudo-differential operators
Theorem. Let $a \in S^0$. Then $T_a : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ is a bounded linear operator.
Proof. First of all, we note that by a standard functional analytic argument it is sufficient to show the boundedness inequality

\[ \|T_a f\|_{L^2(\mathbb{R}^n)} \leq C\|f\|_{L^2(\mathbb{R}^n)} \qquad (2.3) \]

only for $f \in \mathcal{S}(\mathbb{R}^n)$, with the constant $C$ independent of the choice of $f$. Indeed, let $f \in L^2(\mathbb{R}^n)$ and let $f_k \in \mathcal{S}(\mathbb{R}^n)$ be a sequence of rapidly decreasing functions such that $f_k \to f$ in $L^2(\mathbb{R}^n)$. Then by (2.3) we have

\[ \|T_a(f_k - f_m)\|_{L^2(\mathbb{R}^n)} \leq C\|f_k - f_m\|_{L^2(\mathbb{R}^n)}, \]

so $T_a f_k$ is a Cauchy sequence in $L^2(\mathbb{R}^n)$. By completeness there is some $g \in L^2(\mathbb{R}^n)$ such that $T_a f_k \to g$ in $L^2(\mathbb{R}^n)$. On the other hand $T_a f_k \to T_a f$ in $\mathcal{S}'(\mathbb{R}^n)$. By the uniqueness principle we have $T_a f = g \in L^2(\mathbb{R}^n)$. Passing to the limit in (2.3) applied to $f_k$, we get $\|T_a f\|_{L^2(\mathbb{R}^n)} \leq C\|f\|_{L^2(\mathbb{R}^n)}$, with the same constant $C$.
The proof of (2.3) will consist of two parts, and we follow [3] in this. In the first part
we establish it for compactly supported (with respect to x) symbols and in the second
part we will extend it to the general case of a ∈ S 0 .
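So, let us first assume that $a(x,\xi)$ has compact support with respect to $x$. This will
allow us to use the Fourier transform with respect to $x$, in particular the formulae
\[ a(x,\xi)=\int_{\mathbb{R}^n}e^{2\pi ix\cdot\lambda}\,\widehat{a}(\lambda,\xi)\,d\lambda,\qquad
   \widehat{a}(\lambda,\xi)=\int_{\mathbb{R}^n}e^{-2\pi ix\cdot\lambda}\,a(x,\xi)\,dx, \]
with absolutely convergent integrals. We will use the fact that $a(\cdot,\xi)\in C_0^{\infty}(\mathbb{R}^n)\subset\mathcal{S}(\mathbb{R}^n)$,
so that $a(\cdot,\xi)$ is in the Schwartz space in the first variable. Consequently, we have
$\widehat{a}(\cdot,\xi)\in\mathcal{S}(\mathbb{R}^n)$ uniformly in $\xi$. To see the uniformity, we can notice that
\[ (2\pi i\lambda)^{\alpha}\,\widehat{a}(\lambda,\xi)=\int_{\mathbb{R}^n}e^{-2\pi ix\cdot\lambda}\,\partial_x^{\alpha}a(x,\xi)\,dx, \]
and hence $|(2\pi i\lambda)^{\alpha}\widehat{a}(\lambda,\xi)|\le C_{\alpha}$ for all $\xi\in\mathbb{R}^n$. It follows that
\[ \sup_{\xi\in\mathbb{R}^n}|\widehat{a}(\lambda,\xi)|\le C_N(1+|\lambda|)^{-N} \]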
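for all $N$. Now we can write
\[
\begin{aligned}
T_af(x) &= \int_{\mathbb{R}^n}e^{2\pi ix\cdot\xi}\,a(x,\xi)\,\widehat{f}(\xi)\,d\xi\\
 &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi ix\cdot\xi}\,e^{2\pi ix\cdot\lambda}\,\widehat{a}(\lambda,\xi)\,\widehat{f}(\xi)\,d\lambda\,d\xi\\
 &= \int_{\mathbb{R}^n}(Sf)(\lambda,x)\,d\lambda,
\end{aligned}
\]
where
\[ (Sf)(\lambda,x)=e^{2\pi ix\cdot\lambda}\,(T_{\widehat{a}(\lambda,\xi)}f)(x). \]
Here $T_{\widehat{a}(\lambda,\xi)}f$ is just a Fourier multiplier with symbol independent of $x$, so by Plancherel's
identity we get
\[ \|T_{\widehat{a}(\lambda,\xi)}f\|_{L^2}=\|\widehat{T_{\widehat{a}(\lambda,\xi)}f}\|_{L^2}=\|\widehat{a}(\lambda,\cdot)\widehat{f}\|_{L^2}
   \le\sup_{\xi\in\mathbb{R}^n}|\widehat{a}(\lambda,\xi)|\,\|\widehat{f}\|_{L^2}\le C_N(1+|\lambda|)^{-N}\|f\|_{L^2}, \]
for all $N\ge 0$. Hence we get
\[ \|T_af\|_{L^2}\le\int_{\mathbb{R}^n}\|Sf(\lambda,\cdot)\|_{L^2}\,d\lambda\le C_N\int_{\mathbb{R}^n}(1+|\lambda|)^{-N}\|f\|_{L^2}\,d\lambda\le C\|f\|_{L^2}, \]
if we take $N>n$.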
Now, to pass to symbols which are not necessarily compactly supported with respect
to $x$, we will use the following inequality
\[ \int_{|x-x_0|\le 1}|T_af(x)|^2\,dx\le C_N\int_{\mathbb{R}^n}\frac{|f(x)|^2\,dx}{(1+|x-x_0|)^N}, \tag{2.4} \]
which holds for every $x_0\in\mathbb{R}^n$ and for every $N\ge 0$, with $C_N$ independent
of $x_0$ and dependent only on constants in the symbolic inequalities for $a$. Let us show
first that (2.4) implies (2.3). Writing $\chi_{|x-x_0|\le 1}$ for the characteristic function of the set
$|x-x_0|\le 1$ and integrating (2.4) with respect to $x_0$ yields
\[ \int_{\mathbb{R}^n}\left(\int_{\mathbb{R}^n}\chi_{|x-x_0|\le 1}|T_af(x)|^2\,dx\right)dx_0
   \le C_N\int_{\mathbb{R}^n}\left(\int_{\mathbb{R}^n}\frac{|f(x)|^2\,dx}{(1+|x-x_0|)^N}\right)dx_0. \]
Changing the order of integration and taking $N>n$, we arrive at
\[ \mathrm{vol}(B(1))\int_{\mathbb{R}^n}|T_af(x)|^2\,dx\le\widetilde{C}_N\int_{\mathbb{R}^n}|f(x)|^2\,dx, \]
which is (2.3). Let us now prove (2.4).
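
The passage from the integrated inequality to (2.3) rests on two elementary integrals in $x_0$. The following sketch spells them out, assuming $N>n$ so that the second one converges; the symbols $A_N$ and $|S^{n-1}|$ (the surface area of the unit sphere) are notation introduced here only for illustration:

```latex
\begin{align*}
\int_{\mathbb{R}^n}\chi_{|x-x_0|\le 1}\,dx_0 &= \mathrm{vol}(B(1)),\\
A_N := \int_{\mathbb{R}^n}\frac{dx_0}{(1+|x-x_0|)^N}
  &= |S^{n-1}|\int_0^{\infty}\frac{r^{n-1}}{(1+r)^N}\,dr < \infty
  \quad\text{if } N>n .
\end{align*}
```

Both values are independent of $x$, so by Fubini's theorem one may take $\widetilde{C}_N=C_NA_N$.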


Let us prove it for $x_0=0$ first. We can write $f=f_1+f_2$, where $f_1$ and $f_2$ are smooth
functions such that $|f_1|\le|f|$, $|f_2|\le|f|$, and $\mathrm{supp}\,f_1\subset\{|x|\le 3\}$, $\mathrm{supp}\,f_2\subset\{|x|\ge 2\}$.
We will do the estimate for $f_1$ first. Let us fix $\eta\in C_0^{\infty}(\mathbb{R}^n)$ such that $\eta(x)=1$ for $|x|\le 1$.
Then $\eta T_a=T_{\eta a}$ is a pseudo-differential operator with a compactly supported symbol,
thus by the first part we have
\[ \int_{\{|x|\le 1\}}|T_af_1(x)|^2\,dx=\int_{\{|x|\le 1\}}|T_{\eta a}f_1(x)|^2\,dx\le\int_{\mathbb{R}^n}|T_{\eta a}f_1(x)|^2\,dx
   \le C\int_{\mathbb{R}^n}|f_1(x)|^2\,dx\le C\int_{\{|x|\le 3\}}|f(x)|^2\,dx, \]
which is the required estimate for $f_1$. Let us now do the estimate for $f_2$. If $|x|\le 1$, then
$x\notin\mathrm{supp}\,f_2$, so we can write
\[ T_af_2(x)=\int_{\{|y|\ge 2\}}k(x,x-y)f_2(y)\,dy, \]
where $k$ is the kernel of $T_a$. Since $|x|\le 1$ and $|y|\ge 2$, we have $|x-y|\ge 1$ and hence by
Theorem 2.1.6 we can estimate
\[ |k(x,x-y)|\le C_1|x-y|^{-N}\le C_2|y|^{-N} \]
for all $N\ge 0$. Thus we can estimate
\[ |T_af_2(x)|\le C_1\int_{\{|y|\ge 2\}}\frac{|f_2(y)|}{|y|^N}\,dy\le C_2\int_{\mathbb{R}^n}\frac{|f(y)|}{(1+|y|)^N}\,dy
   \le C_3\left(\int_{\mathbb{R}^n}\frac{|f(y)|^2}{(1+|y|)^N}\,dy\right)^{1/2}, \]
where by the Cauchy-Schwarz inequality we have $C_3=C_2\left(\int_{\mathbb{R}^n}\frac{1}{(1+|y|)^N}\,dy\right)^{1/2}<\infty$ for
$N>n$. This in turn implies
\[ \int_{\{|x|\le 1\}}|T_af_2(x)|^2\,dx\le C\int_{\mathbb{R}^n}\frac{|f(y)|^2}{(1+|y|)^N}\,dy, \]
which is the required estimate for $f_2$. These estimates for $f_1$ and $f_2$ imply (2.4) with
$x_0=0$. We note that the resulting constant depends only on the dimension and on the constants
in symbolic inequalities for $a$.
Let us now show (2.4) with an arbitrary $x_0\in\mathbb{R}^n$. Let us define
\[ a_{x_0}(x,\xi)=a(x+x_0,\xi). \]
Since $(T_af)(x+x_0)=\big(T_{a_{x_0}}f(\cdot+x_0)\big)(x)$, we immediately see that estimate (2.4) for $T_a$ in the
ball $\{|x-x_0|\le 1\}$ is equivalent to the same estimate for $T_{a_{x_0}}$ in the ball $\{|x|\le 1\}$. Finally we
note that since constants in symbolic inequalities for $a$ and $a_{x_0}$ are the same, we obtain (2.4)
with constant $C_N$ independent of $x_0$. This completes the proof of Theorem 2.1.9.

2.1.10 Compositions of pseudo-differential operators


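Theorem. Let $a\in S^{m_1}$ and $b\in S^{m_2}$. Then there exists some $c\in S^{m_1+m_2}$ such that
$T_c=T_a\circ T_b$. Moreover, we have
\[ c\sim\sum_{\alpha}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}(\partial_{\xi}^{\alpha}a)(\partial_x^{\alpha}b), \]
which means that for all $N>0$ we have
\[ c-\sum_{|\alpha|<N}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}(\partial_{\xi}^{\alpha}a)(\partial_x^{\alpha}b)\in S^{m_1+m_2-N}. \]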
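
To see the formula at work, here is a minimal one-dimensional check (an illustrative choice, not taken from the text): take $a(x,\xi)=2\pi i\xi$, so that $T_a=\frac{d}{dx}$, and let $b(x,\xi)=b(x)$ be a smooth bounded function with bounded derivatives, so that $T_b$ is multiplication by $b$. Only the terms with $\alpha=0,1$ survive:

```latex
\begin{align*}
(T_aT_bf)(x) &= \frac{d}{dx}\big(b(x)f(x)\big) = b(x)f'(x)+b'(x)f(x),\\
c(x,\xi) &= \underbrace{2\pi i\xi\,b(x)}_{\alpha=0}
  + \underbrace{(2\pi i)^{-1}\,\partial_{\xi}(2\pi i\xi)\,\partial_xb(x)}_{\alpha=1}
  = 2\pi i\xi\,b(x)+b'(x).
\end{align*}
```

The two lines agree, since the symbol $2\pi i\xi\,b(x)+b'(x)$ corresponds to the operator $f\mapsto bf'+b'f$; in this example the expansion terminates and the formula is exact.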

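Proof. Let us assume first that all symbols are compactly supported, so we can change
the order of integration freely. Indeed, we can think of, say, symbol $a(x,\xi)$ as of
\[ a_{\epsilon}(x,\xi)=a(x,\xi)\gamma(\epsilon x,\epsilon\xi), \]
make all the calculations uniformly in $0<\epsilon\le 1$, use that $a_{\epsilon}\in S^{m_1}$ uniformly in $\epsilon$, and
then take a limit as $\epsilon\to 0$. Let us now plug in
\[ (T_bf)(y)=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(y-z)\cdot\xi}\,b(y,\xi)f(z)\,dz\,d\xi \]
into
\[ (T_ag)(x)=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot\eta}\,a(x,\eta)g(y)\,dy\,d\eta \]
to get
\[
\begin{aligned}
T_a(T_bf)(x) &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot\eta}\,a(x,\eta)\,e^{2\pi i(y-z)\cdot\xi}\,b(y,\xi)f(z)\,dz\,d\xi\,dy\,d\eta\\
 &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-z)\cdot\xi}\,c(x,\xi)f(z)\,dz\,d\xi,
\end{aligned}
\]
with
\[
\begin{aligned}
c(x,\xi) &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot(\eta-\xi)}\,a(x,\eta)\,b(y,\xi)\,dy\,d\eta\\
 &= \int_{\mathbb{R}^n}e^{2\pi ix\cdot(\eta-\xi)}\,a(x,\eta)\,\widehat{b}(\eta-\xi,\xi)\,d\eta\\
 &= \int_{\mathbb{R}^n}e^{2\pi ix\cdot\eta}\,a(x,\xi+\eta)\,\widehat{b}(\eta,\xi)\,d\eta.
\end{aligned}
\]
Here $\widehat{b}$ is the Fourier transform with respect to the first variable, and we used that
\[ (x-y)\cdot\eta+(y-z)\cdot\xi-(x-z)\cdot\xi=(x-y)\cdot(\eta-\xi). \]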

Asymptotic formula. Let us assume first that $b(x,\xi)$ has compact support in $x$ (although
if we think of $b$ as $b_{\epsilon}$, it will have a compact support, but the size of the support
is not uniform in $\epsilon$; so, we can approach the case of general symbols by treating the
compactly supported symbols in a way of getting estimates that are uniform with respect to
the size of the support). Since $b$ is compactly supported in $x$, its Fourier transform with
respect to $x$ is rapidly decaying, so we can estimate
\[ |\widehat{b}(\eta,\xi)|\le C_M(1+|\eta|)^{-M}(1+|\xi|)^{m_2}, \]
for all $M\ge 0$. Taylor's formula for $a(x,\xi+\eta)$ in the second variable gives
\[ a(x,\xi+\eta)=\sum_{|\alpha|<N}\frac{1}{\alpha!}\,\partial_{\xi}^{\alpha}a(x,\xi)\,\eta^{\alpha}+R_N(x,\xi,\eta), \]
with remainder $R_N$ that we will analyse later. Plugging this formula into our expression
for $c$ and looking at terms in the sum, we get
\[ \int_{\mathbb{R}^n}e^{2\pi ix\cdot\eta}\,\partial_{\xi}^{\alpha}a(x,\xi)\,\eta^{\alpha}\,\widehat{b}(\eta,\xi)\,d\eta=(2\pi i)^{-|\alpha|}\,\partial_{\xi}^{\alpha}a(x,\xi)\,\partial_x^{\alpha}b(x,\xi), \]
so that we have
\[ c(x,\xi)=\sum_{|\alpha|<N}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}\,\partial_{\xi}^{\alpha}a(x,\xi)\,\partial_x^{\alpha}b(x,\xi)+\int_{\mathbb{R}^n}e^{2\pi ix\cdot\eta}\,R_N(x,\xi,\eta)\,\widehat{b}(\eta,\xi)\,d\eta. \]

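Remainder. For the remainder in Taylor's series we have the estimate
\[ |R_N(x,\xi,\eta)|\le C_N|\eta|^N\max\left\{|\partial_{\xi}^{\alpha}a(x,\zeta)|:\ |\alpha|=N,\ \zeta\text{ on the line between }\xi\text{ and }\xi+\eta\right\}. \]
Now, if $|\eta|\le|\xi|/2$, then points $\zeta$ on the line between $\xi$ and $\xi+\eta$ satisfy
$|\xi|/2\le|\zeta|\le 3|\xi|/2$, i.e. $|\zeta|$ is comparable to $|\xi|$, so we can estimate
\[ |R_N(x,\xi,\eta)|\le C_N|\eta|^N(1+|\xi|)^{m_1-N}\qquad(|\eta|\le|\xi|/2). \]
On the other hand, if $N\ge m_1$, we get the following estimate for all $\xi$ and $\eta$:
\[ |R_N(x,\xi,\eta)|\le C_N|\eta|^N\qquad(N\ge m_1). \]
Using the estimate for $\widehat{b}(\eta,\xi)$ and these two estimates, we get
\[
\begin{aligned}
\left|\int_{\mathbb{R}^n}e^{2\pi ix\cdot\eta}\,R_N(x,\xi,\eta)\,\widehat{b}(\eta,\xi)\,d\eta\right|
 &\le C_{N,M}(1+|\xi|)^{m_1+m_2-N}\int_{\mathbb{R}^n}(1+|\eta|)^{-M}|\eta|^N\,d\eta\\
 &\quad+C_{N,M}(1+|\xi|)^{m_2}\int_{|\eta|\ge|\xi|/2}(1+|\eta|)^{-M}|\eta|^N\,d\eta.
\end{aligned}
\]
Taking $M$ large enough, we can estimate both terms on the right hand side by
$C(1+|\xi|)^{m_1+m_2-N}$. Making an estimate for $\partial_x^{\alpha}\partial_{\xi}^{\beta}R_N$ in a similar way, we can estimate
\[ \left|\int_{\mathbb{R}^n}e^{2\pi ix\cdot\eta}\,\partial_x^{\alpha}\partial_{\xi}^{\beta}R_N(x,\xi,\eta)\,\widehat{b}(\eta,\xi)\,d\eta\right|\le C(1+|\xi|)^{m_1+m_2-N-|\beta|}, \]
implying the statement of the theorem for $b$ compactly supported with respect to $x$.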
General symbols. We note that it is sufficient to have the asymptotic formula and
an estimate for the remainder for $x$ near some fixed point $x_0$, uniformly in $x_0$. Let
$\chi\in C_0^{\infty}(\mathbb{R}^n)$ be such that $\mathrm{supp}\,\chi\subset\{x:|x-x_0|\le 2\}$ and such that $\chi(x)=1$ for
$|x-x_0|\le 1$. Let us decompose
\[ b=\chi b+(1-\chi)b=b_1+b_2. \]
Since symbol $b_1=\chi b$ is compactly supported with respect to $x$, the composition formula
for $T_a\circ T_{b_1}$ is given by the theorem, and it is equal to the claimed series for $T_a\circ T_b$ near $x_0$.
We now have to show that the operator $T_a\circ T_{b_2}$ is smoothing, i.e. its symbol $c_2(x,\xi)$ is
of order $-\infty$, and does not change the asymptotic formula for the composition. Indeed,
we already know that the symbol of operator $T_a\circ T_{b_2}$ is given by
\[ c_2(x,\xi)=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot(\eta-\xi)}\,a(x,\eta)\,b_2(y,\xi)\,dy\,d\eta, \]
and we claim that
\[ |c_2(x,\xi)|\le C_N(1+|\xi|)^{m_1+m_2-N}\quad\text{for all }N\ge 0,\ |x-x_0|\le\tfrac{1}{2}. \]
In the above integral for $c_2$ we can integrate by parts to derive various properties of its
decay. For example, we can integrate by parts with respect to $\eta$ using the identity
\[ \Delta_{\eta}^{N_1}e^{2\pi i(x-y)\cdot(\eta-\xi)}=(-4\pi^2)^{N_1}|x-y|^{2N_1}e^{2\pi i(x-y)\cdot(\eta-\xi)}, \]
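to see that in the integral we can replace $a(x,\eta)$ by $\dfrac{\Delta_{\eta}^{N_1}a(x,\eta)}{(-4\pi^2)^{N_1}|x-y|^{2N_1}}$, i.e.
\[ c_2(x,\xi)=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot(\eta-\xi)}\,\frac{\Delta_{\eta}^{N_1}a(x,\eta)}{(-4\pi^2)^{N_1}|x-y|^{2N_1}}\,b_2(y,\xi)\,dy\,d\eta. \]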
Here $|x-x_0|\le 1/2$ and $y\in\mathrm{supp}(1-\chi)$ imply $|x-y|\ge 1/2$, so the integration by
parts is well defined. We can also integrate by parts with respect to $y$ using the identity
\[ (1-\Delta_y)^{N_2}e^{2\pi i(x-y)\cdot(\eta-\xi)}=(1+4\pi^2|\xi-\eta|^2)^{N_2}e^{2\pi i(x-y)\cdot(\eta-\xi)}. \]
Moreover, we can use that $1+|\xi|\le 1+|\xi-\eta|+|\eta|\le(1+|\xi-\eta|)(1+|\eta|)$, and hence
\[ \frac{1}{(1+|\xi-\eta|)^{2N_2}}\le\frac{(1+|\eta|)^{2N_2}}{(1+|\xi|)^{2N_2}}. \]
Thus, integrating by parts with respect to $y$ and using this estimate together with
symbolic estimates for $a$ and $b_2$, we get
\[
\begin{aligned}
|c_2(x,\xi)| &\le C\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\frac{(1+|\eta|)^{m_1-2N_1}(1+|\xi|)^{m_2}}{(1+|x-y|)^{2N_1}(1+|\xi-\eta|)^{2N_2}}\,dy\,d\eta\\
 &\le C\int_{\mathbb{R}^n}(1+|\eta|)^{m_1-2N_1+2N_2}(1+|\xi|)^{m_2-2N_2}\,d\eta\\
 &\le C(1+|\xi|)^{m_1+m_2-N}\int_{\mathbb{R}^n}(1+|\eta|)^{N-2N_1}\,d\eta,
\end{aligned}
\]
if we take $N=m_1+2N_2$, so that $m_2-2N_2=m_1+m_2-N$ and $m_1-2N_1+2N_2=N-2N_1$. Now, taking
large $N_2$ and $2N_1>N+n$, we obtain the desired estimate for $c_2(x,\xi)$. Similar estimates
can be done for $\partial_x^{\alpha}\partial_{\xi}^{\beta}c_2(x,\xi)$ simply using
symbolic inequalities for $a$ and $b_2$, so we obtain that $c_2\in S^{-\infty}$ for $|x-x_0|\le 1/2$.
Finally, we notice that (in the general case) we have never used any information
on the size of the support of symbols, so all constants depend only on the constants
in symbolic inequalities. Thus, they remain uniformly bounded in $0<\epsilon\le 1$ for the
composition $T_{c_{\epsilon}}=T_{a_{\epsilon}}\circ T_{b_{\epsilon}}$, and we have $c_{\epsilon}\in S^{m_1+m_2}$ with symbolic constants uniform
in $\epsilon$. Moreover, the asymptotic formula is satisfied uniformly in $\epsilon$. Since $c_{\epsilon}\to c$ pointwise
as $\epsilon\to 0$, we can conclude that $c\in S^{m_1+m_2}$, that $T_c=T_a\circ T_b$, and that the asymptotic
formula of the theorem holds.

2.1.11 Amplitudes
We have already seen that when taking the adjoint of a pseudo-differential operator, we
get the symbol which depends on the “wrong” set of variables: (y, ξ) instead of (x, ξ).
Nevertheless, we want the adjoint to be a pseudo-differential operator as well. For this
we need to study operators with symbols depending on all combinations of variables.
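This leads to the definition of an amplitude. We will write $c=c(x,y,\xi)\in\widetilde{S}^m$ if
$c\in C^{\infty}(\mathbb{R}^n\times\mathbb{R}^n\times\mathbb{R}^n)$ and if
\[ |\partial_y^{\gamma}\partial_x^{\beta}\partial_{\xi}^{\alpha}c(x,y,\xi)|\le C_{\alpha,\beta,\gamma}(1+|\xi|)^{m-|\alpha|} \]
holds for all $x,y,\xi\in\mathbb{R}^n$ and all multi-indices $\alpha,\beta,\gamma$. The corresponding operator $T_{[c]}$ is
defined by
\[ (T_{[c]}f)(x)=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot\xi}\,c(x,y,\xi)f(y)\,dy\,d\xi. \]
As in 2.1.4, we can justify this formula by considering $c_{\epsilon}(x,y,\xi)=c(x,y,\xi)\gamma(\epsilon y,\epsilon\xi)$, with
$\gamma$ as in 2.1.4. Then $c_{\epsilon}\to c$ pointwise (also with the pointwise convergence of derivatives),
uniformly in $\widetilde{S}^m$ for $0<\epsilon\le 1$, so $T_{[c_{\epsilon}]}f\to T_{[c]}f$ in $\mathcal{S}(\mathbb{R}^n)$ for all $f\in\mathcal{S}(\mathbb{R}^n)$. Thus, $T_{[c]}$
is well defined and continuous as an operator from $\mathcal{S}(\mathbb{R}^n)$ to $\mathcal{S}(\mathbb{R}^n)$.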

2.1.12 Symbols of pseudo-differential operators


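Proposition. A pseudo-differential operator $T\in\Psi^m$ defines its symbol $a\in S^m$ uniquely,
so that $T=T_a$. The symbol $a(x,\xi)$ is defined by the formula
\[ a(x,\xi)=e^{-2\pi ix\cdot\xi}\,T(e^{2\pi ix\cdot\xi}). \]
The notation used in the statement is a useful abbreviation for
\[ e^{-2\pi ix\cdot\xi}\,T(e^{2\pi ix\cdot\xi})=e^{-2\pi ix\cdot\xi}\big(T(e^{2\pi iy\cdot\xi})\big)(x)=e^{-2\pi ix\cdot\xi}\big(T(e^{2\pi i\langle\cdot,\xi\rangle})\big)(x). \]
The formula for the symbol can be justified either in $\mathcal{S}'(\mathbb{R}^n)$ or as the limit in the
expression
\[ e^{-2\pi ix\cdot\xi}\,T(e^{2\pi ix\cdot\xi}\varphi_{\epsilon})\to a(x,\xi), \]
as $\varphi_{\epsilon}\to 1$ for a family of $\varphi_{\epsilon}\in C_0^{\infty}(\mathbb{R}^n)$.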
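
For instance, for a differential operator with smooth coefficients, written here as $L=\sum_{|\alpha|\le m}a_{\alpha}(x)\partial_x^{\alpha}$ purely as an illustrative case, the formula recovers the usual symbol, because $\partial_x^{\alpha}e^{2\pi ix\cdot\xi}=(2\pi i\xi)^{\alpha}e^{2\pi ix\cdot\xi}$:

```latex
\[
a(x,\xi) = e^{-2\pi ix\cdot\xi}\sum_{|\alpha|\le m}a_{\alpha}(x)\,\partial_x^{\alpha}e^{2\pi ix\cdot\xi}
         = \sum_{|\alpha|\le m}a_{\alpha}(x)\,(2\pi i\xi)^{\alpha}.
\]
```

In particular, $\partial_{x_j}$ has symbol $2\pi i\xi_j$ and the Laplacian $\Delta$ has symbol $-4\pi^2|\xi|^2$.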

2.1.13 Symbols of amplitude operators


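Proposition. Let $c\in\widetilde{S}^m$ be an amplitude. Then there exists a symbol $a\in S^m$ such that
$T_a=T_{[c]}$. Moreover, the asymptotic expansion for $a$ is given by
\[ a(x,\xi)-\sum_{|\alpha|<N}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}\,\partial_{\xi}^{\alpha}\partial_y^{\alpha}c(x,y,\xi)\big|_{y=x}\in S^{m-N},\qquad\forall N\ge 0. \]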

Proof. The proof is similar to the proof of the composition formula 2.1.10. As in that
proof, we first formally conclude that we must have
\[
\begin{aligned}
a(x,\xi) &= e^{-2\pi ix\cdot\xi}\,T(e^{2\pi ix\cdot\xi})\\
 &= e^{-2\pi ix\cdot\xi}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot\eta}\,c(x,y,\eta)\,e^{2\pi iy\cdot\xi}\,dy\,d\eta\\
 &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}c(x,y,\eta)\,e^{2\pi i(x-y)\cdot(\eta-\xi)}\,dy\,d\eta\\
 &= \int_{\mathbb{R}^n}\widehat{c}(x,\eta,\xi+\eta)\,e^{2\pi ix\cdot\eta}\,d\eta,
\end{aligned}
\]
where $\widehat{c}=\mathcal{F}_yc$ is the Fourier transform of $c$ with respect to $y$ and where we used the
change of variables $\eta\mapsto\eta+\xi$. By the usual justification, we may work with amplitudes
compactly supported in $y$, and make sure that all our arguments do not depend on the
size of the support of $c$ in $y$. Taking the Taylor expansion of $\widehat{c}(x,\eta,\xi+\eta)$ at $\xi$, we
obtain
\[ \widehat{c}(x,\eta,\xi+\eta)=\sum_{|\alpha|<N}\frac{1}{\alpha!}\,\partial_{\xi}^{\alpha}\widehat{c}(x,\eta,\xi)\,\eta^{\alpha}+R_N(x,\eta,\xi), \]
where, as before, we can show that the remainder $R_N(x,\eta,\xi)$ satisfies estimates
\[ |R_N(x,\eta,\xi)|\le A|\eta|^N(1+|\eta|)^{-M}(1+|\xi|)^{m-N}\quad\text{for large }M\text{ and for }2|\eta|\le|\xi|, \]
and
\[ |R_N(x,\eta,\xi)|\le A|\eta|^N(1+|\eta|)^{-M}\quad\text{for all large }M\text{ and }N. \]
The last estimate is used in the region $2|\eta|\ge|\xi|$ and, similar to the proof of the
composition formula in Section 2.1.10, we can complete the proof in the case of $c(x,y,\xi)$
compactly supported in $y$.
To treat the general case where $c(x,y,\xi)$ is not necessarily compactly supported in
$y$ we reduce the analysis to the case when $c(x,y,\xi)$ vanishes for $y$ away from $y=x$.
For this, we can use the same argument as in the proof of the composition formula in
Section 2.1.10, where the estimate for the remainder by the integration by parts argument
becomes
\[ \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\frac{(1+|\eta|)^{m-2N_1}}{(1+|\eta-\xi|)^{2N_2}}\,\frac{1}{(1+|x-y|)^{2N_1}}\,dy\,d\eta=O\big((1+|\xi|)^{m-N}\big), \]
for large $N_1,N_2$. The proof is complete.
Remark. Note that we can also justify the following alternative argument. We know
by Proposition 2.1.12 that for $T\in\Psi^m$ its symbol can be defined by the formula
\[ a(x,\xi)=e^{-2\pi ix\cdot\xi}\big(T(e^{2\pi iy\cdot\xi})\big)(x). \]
Let now $T=T_{[c]}$ be the operator with amplitude $c$. Then Proposition 2.1.13 says that
we can write
\[ (T_{[c]}u)(x)=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot\xi}\,a(x,\xi)u(y)\,dy\,d\xi, \]
for all $u\in C_0^{\infty}(\mathbb{R}^n)$. To show this, we write the Fourier inversion formula
\[ u(y)=\int_{\mathbb{R}^n}e^{2\pi iy\cdot\xi}\,\widehat{u}(\xi)\,d\xi, \]
and, justifying the application of $T$ to it and using the formula for the symbol from
Proposition 2.1.12, we get
\[ (Tu)(x)=\int_{\mathbb{R}^n}T(e^{2\pi iy\cdot\xi})(x)\,\widehat{u}(\xi)\,d\xi=\int_{\mathbb{R}^n}e^{2\pi ix\cdot\xi}\,a(x,\xi)\,\widehat{u}(\xi)\,d\xi. \]
For the asymptotic expansion, we write
\[
\begin{aligned}
(Te^{2\pi i\langle\cdot,\xi\rangle})(x) &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot\eta}\,c(x,y,\eta)\,e^{2\pi iy\cdot\xi}\,dy\,d\eta\\
 &= e^{2\pi ix\cdot\xi}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{-2\pi iy\cdot\eta}\,c(x,x+y,\xi+\eta)\,dy\,d\eta,
\end{aligned}
\]
where we changed variables $y\mapsto y+x$ and $\eta\mapsto\eta+\xi$, as well as recalculated the phase
in new variables as
\[ (x-(y+x))\cdot(\eta+\xi)+(y+x)\cdot\xi=-y\cdot\eta+x\cdot\xi. \]
Now, using the Taylor expansion and estimates for the remainder, we can obtain
Proposition 2.1.13 again.

2.1.14 Adjoint operators


Theorem. Let $a\in S^m$. Then there exists a symbol $a^*\in S^m$ such that $T_a^*=T_{a^*}$, where
$T_a^*$ is the $L^2$-adjoint operator of $T_a$. Moreover, we have the asymptotic expansion
\[ a^*(x,\xi)-\sum_{|\alpha|<N}\frac{(2\pi i)^{-|\alpha|}}{\alpha!}\,\partial_{\xi}^{\alpha}\partial_x^{\alpha}\overline{a(x,\xi)}\in S^{m-N},\quad\text{for all }N\ge 0, \]
where $\overline{a(x,\xi)}$ is the complex conjugate of $a(x,\xi)$.
Proof. As we have already calculated, we have $T_a^*=T_{[c]}$, which is an operator with
amplitude $c(x,y,\xi)=\overline{a(y,\xi)}$. Applying Proposition 2.1.13, we obtain the statement of
Theorem 2.1.14.
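
A one-dimensional sanity check of the expansion (an illustrative example with a hypothetical smooth bounded coefficient $g$ having bounded derivatives, not taken from the text): let $a(x,\xi)=2\pi i\xi\,g(x)$, so that $T_af=gf'$. Integrating by parts gives the adjoint directly, and the first two terms of the expansion reproduce it:

```latex
\begin{align*}
(T_a^*h)(x) &= -\frac{d}{dx}\big(\overline{g(x)}\,h(x)\big)
  = -\overline{g(x)}\,h'(x)-\overline{g'(x)}\,h(x),\\
a^*(x,\xi) &= \overline{a(x,\xi)}+(2\pi i)^{-1}\,\partial_{\xi}\partial_x\overline{a(x,\xi)}
  = -2\pi i\xi\,\overline{g(x)}-\overline{g'(x)} .
\end{align*}
```

The terms with $|\alpha|\ge 2$ vanish because $a$ is linear in $\xi$, so here the expansion is exact.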

2.1.15 Changes of variables


It is clear from the definition of the symbol class S m that it is locally invariant under
smooth changes of variables, i.e. if we take a local change of variable x in the symbol
from S m , it will still belong to the same symbol class S m . We will now investigate what
happens with pseudo-differential operators when we make a change of variable in space.
Let U, V ⊂ Rn be open bounded sets and let τ : V → U be a surjective diffeomor-
phism. For f ∈ C0∞ (V ), we define its pullback by the change of variables τ by
(τ f )(x) = f (τ −1 (x)).
It easily follows that the new function satisfies τ f ∈ C0∞ (U ). Let now a ∈ S m be
a symbol of order m with compact support with respect to x, and assume that this
support is contained in U . Then we will show that there exists a symbol b ∈ S m with
compact support with respect to x which is contained in V such that τ −1 Ta τ = Tb . In
other words, we have
τ −1 Ta τ f = Tb f for all f ∈ C0∞ (V ).
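More precisely, we have the following expression for the "main part" of $b$:
Proposition. We have $b(x,\xi)=a\big(\tau(x),\big(\tfrac{\partial\tau}{\partial x}\big)'\xi\big)$ modulo $S^{m-1}$, where $\big(\tfrac{\partial\tau}{\partial x}\big)'=\big((D\tau)^T\big)^{-1}$.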

Proof. We can write
\[
\begin{aligned}
(\tau^{-1}T_a\tau f)(x) &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(\tau(x)-y)\cdot\xi}\,a(\tau(x),\xi)\,f(\tau^{-1}(y))\,dy\,d\xi\\
 &= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(\tau(x)-\tau(y))\cdot\xi}\,a(\tau(x),\xi)\,f(y)\left|\det\frac{\partial\tau}{\partial y}\right|dy\,d\xi,
\end{aligned}
\tag{2.5}
\]
where we changed variables $y\mapsto\tau(y)$ to get the last equality. Now we will argue that
the main contribution in this integral comes from variables $y$ close to $x$. Indeed, let us
insert a cut-off function $\chi(x,y)$ in the integral, where $\chi$ is a smooth function supported
in the set where $|y-x|$ is small and where $\chi(x,x)=1$ for all $x$. The remaining integral
\[ \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(\tau(x)-\tau(y))\cdot\xi}\,(1-\chi(x,y))\,a(\tau(x),\xi)\,f(y)\left|\det\frac{\partial\tau}{\partial y}\right|dy\,d\xi \]
has a smooth kernel since we can integrate by parts any number of times with respect to
$\xi$ to see that the symbol of this operator belongs to symbol classes $S^{-N}$ for all $N\ge 0$.
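Let us now analyse what happens for $y$ close to $x$. By the mean value theorem, for $y$
sufficiently close to $x$, we have
\[ \tau(x)-\tau(y)=L_{x,y}(x-y), \tag{2.6} \]
where $L_{x,y}$ is an invertible linear mapping which is smooth in $x$ and $y$, and satisfies
\[ L_{x,x}=\frac{\partial\tau}{\partial x}(x). \]
Using (2.6) in (2.5), we get
\[
\begin{aligned}
&\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi iL_{x,y}(x-y)\cdot\xi}\,\chi(x,y)\,a(\tau(x),\xi)\left|\det\frac{\partial\tau}{\partial y}\right|f(y)\,dy\,d\xi\\
&\qquad=\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}e^{2\pi i(x-y)\cdot\xi}\,\chi(x,y)\,a(\tau(x),L'_{x,y}\xi)\left|\det\frac{\partial\tau}{\partial y}\right||\det L_{x,y}^{-1}|\,f(y)\,dy\,d\xi,
\end{aligned}
\]
where we changed variables $L_{x,y}^T\xi\mapsto\xi$ and where $L_{x,y}^T$ is the transpose matrix of $L_{x,y}$
and $L'_{x,y}=(L_{x,y}^T)^{-1}$. Thus, we get an operator with the amplitude $c$ defined by
\[ c(x,y,\xi)=\chi(x,y)\,a(\tau(x),L'_{x,y}\xi)\left|\det\frac{\partial\tau}{\partial y}\right||\det L_{x,y}^{-1}|. \]
Applying the asymptotic expansion in Proposition 2.1.13, we see that the first term of
this expansion is given by
\[ c(x,y,\xi)\big|_{y=x}=\chi(x,x)\,a(\tau(x),L'_{x,x}\xi)\left|\det\frac{\partial\tau}{\partial x}\right||\det L_{x,x}^{-1}|=a\Big(\tau(x),\Big(\frac{\partial\tau}{\partial x}\Big)'\xi\Big), \]
completing the proof.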
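
In the special case of a linear change of variables the computation above is exact, which gives a convenient check of the formula. The sketch below assumes $\tau(x)=Ax$ with an invertible matrix $A$ (ignoring, for the sake of illustration, the restriction to bounded sets), so that $L_{x,y}=A$ for all $x,y$; the variable $\zeta$ is notation introduced here:

```latex
\begin{align*}
(\tau^{-1}T_a\tau f)(x)
 &= \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} e^{2\pi i A(x-y)\cdot\xi}\,a(Ax,\xi)\,f(y)\,|\det A|\,dy\,d\xi\\
 &= \int_{\mathbb{R}^n}\!\int_{\mathbb{R}^n} e^{2\pi i (x-y)\cdot\zeta}\,a\big(Ax,(A^T)^{-1}\zeta\big)\,f(y)\,dy\,d\zeta ,
\end{align*}
```

where $\zeta=A^T\xi$ and $d\xi=|\det A|^{-1}d\zeta$. Hence $b(x,\xi)=a(Ax,(A^T)^{-1}\xi)$, with no lower order terms in this case.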

2.1.16 Principal symbol and classical symbols


We see in Proposition 2.1.15 that the equivalence class modulo $S^{m-1}$ has some meaning
for the changes of variables. In fact, we may notice that the transformation
\[ (x,\xi)\mapsto\Big(\tau(x),\Big(\frac{\partial\tau}{\partial x}\Big)'\xi\Big) \]
is the same as the change of variables in the cotangent bundle $T^*\mathbb{R}^n$ of $\mathbb{R}^n$ which is
induced by the change of variables $x\mapsto\tau(x)$ in $\mathbb{R}^n$. This observation allows one to make
an invariant geometric interpretation of the class of symbols in $S^m$ modulo terms of order
$S^{m-1}$. We note that we can use the asymptotic expansion for amplitudes in the proof of
Proposition 2.1.15 to find also lower order terms in the asymptotic expansion of $b(x,\xi)$,
but most of these terms will not have such a nice invariant interpretation. Without going
into detail, let us just mention that Proposition 2.1.15 allows one to introduce a notion of a
principal symbol of a pseudo-differential operator with symbol in $S^m$ as the equivalence
class of this symbol modulo the subclass $S^{m-1}$, and this principal symbol becomes a
function on the cotangent bundle of $\mathbb{R}^n$. This construction can be further carried out on
manifolds leading to many remarkable applications, in particular to those in geometry
and index theory.
We will not pursue this path further in this chapter, but we will clarify the notion
of such equivalence classes for the so-called classical symbols, a class of symbols
that plays a very important role in applications to partial differential equations. First,
we define homogeneous functions/symbols.
Definition. We will say that a symbol $a_k=a_k(x,\xi)\in S^k$ is positively homogeneous (or
simply homogeneous) of order $k$ if for all $x\in\mathbb{R}^n$ we have
\[ a_k(x,\lambda\xi)=\lambda^ka_k(x,\xi)\quad\text{for all }\lambda>0\text{ and all }|\xi|>1. \]
We note that we exclude small $\xi$ from this definition because if we assumed the property
$a_k(x,\lambda\xi)=\lambda^ka_k(x,\xi)$ for all $\xi\in\mathbb{R}^n$, such a function $a_k$ would not in general be smooth
at $\xi=0$.
Definition. We will say that $a$ is a classical symbol of order $m$, and write $a\in S^m_{cl}$, if $a\in S^m$
and if there is an asymptotic expansion $a\sim\sum_{k=0}^{\infty}a_{m-k}$, where each $a_{m-k}$ is homogeneous
of order $m-k$, and if $a-\sum_{k=0}^{N}a_{m-k}\in S^{m-N-1}$, for all $N\ge 0$.
Now, for a classical symbol $a\in S^m_{cl}$ the principal symbol, i.e. its equivalence class modulo
$S^{m-1}$, can be easily found. This is simply the first term $a_m$ in the asymptotic expansion
in its definition. We will discuss asymptotic sums in more detail in the next section,
namely we will show that if we start with the asymptotic sum $\sum_{k=0}^{\infty}a_{m-k}$, we can in
turn interpret it as a symbol from $S^m$.
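
A standard example, sketched here under the assumption that each homogeneous term is multiplied by a cut-off vanishing near $\xi=0$ to keep it smooth: the symbol $(1+|\xi|^2)^{1/2}\in S^1$ is classical, as the binomial series shows for $|\xi|>1$:

```latex
\[
(1+|\xi|^2)^{1/2} = |\xi|\,\big(1+|\xi|^{-2}\big)^{1/2}
  = |\xi| + \tfrac{1}{2}|\xi|^{-1} - \tfrac{1}{8}|\xi|^{-3} + \cdots .
\]
```

Thus $a_1=|\xi|$ is the principal symbol, $a_0=a_{-2}=0$, $a_{-1}=\tfrac{1}{2}|\xi|^{-1}$ and $a_{-3}=-\tfrac{1}{8}|\xi|^{-3}$.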

2.1.17 Asymptotic sums


Our next objective is to show how the constructed calculus of pseudo-differential op-
erators can be applied to “easily” solve so-called elliptic partial differential equations.
However, in order to carry out this application we will need another very useful con-
struction.
Proposition. Let $a_j\in S^{m_j}$, $j=0,1,2,\ldots$, where the sequence $m_j$ of orders satisfies
$m_0>m_1>m_2>\cdots$ and $m_j\to-\infty$ as $j\to\infty$. Then there exists a symbol $a\in S^{m_0}$
such that
\[ a\sim\sum_{j=0}^{\infty}a_j=a_0+a_1+a_2+\ldots, \]
which means that we have
\[ a-\sum_{j=0}^{k}a_j\in S^{m_k},\qquad k=1,2,\ldots \]
We can also note that in this notation a ∼ 0 if and only if a ∈ S −∞ = ∩m∈R S m .

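Proof. Let us fix a function $\chi\in C^{\infty}(\mathbb{R}^n)$ such that $\chi(\xi)=1$ for all $|\xi|\ge 1$ and such
that $\chi(\xi)=0$ for all $|\xi|\le 1/2$. Then, for some sequence $\tau_j$ increasing sufficiently fast
and to be chosen later, we define
\[ a(x,\xi)=\sum_{j=0}^{\infty}a_j(x,\xi)\,\chi\Big(\frac{\xi}{\tau_j}\Big). \]
We note that this sum is well-defined pointwise because it is in fact locally finite since
$\chi\big(\tfrac{\xi}{\tau_j}\big)=0$ for $|\xi|<\tau_j/2$. In order to show that $a\in S^{m_0}$ we first take a sequence $\tau_j$ such
that the inequality
\[ \left|\partial_x^{\beta}\partial_{\xi}^{\alpha}\Big[a_j(x,\xi)\,\chi\Big(\frac{\xi}{\tau_j}\Big)\Big]\right|\le 2^{-j}(1+|\xi|)^{m_{j-1}-|\alpha|} \tag{2.7} \]
is satisfied for all $|\alpha|,|\beta|\le j$. We note that because the sequence $m_j$ is strictly decreasing
we can also replace the right hand side by $2^{-j}(1+|\xi|)^{m_j+1-|\alpha|}$. We first show that the function
$\xi^{\alpha}\partial_{\xi}^{\alpha}\chi\big(\tfrac{\xi}{\tau_j}\big)$ is uniformly bounded in $\xi$ for each $j$. Indeed, for $\alpha\neq 0$ we have
\[
\xi^{\alpha}\partial_{\xi}^{\alpha}\chi\Big(\frac{\xi}{\tau_j}\Big)=
\begin{cases}
0, & |\xi|<\tau_j/2,\\
\text{bounded by }C\,\big|\tfrac{\xi}{\tau_j}\big|^{|\alpha|}, & \tau_j/2\le|\xi|\le\tau_j,\\
0, & \tau_j<|\xi|,
\end{cases}
\]
so that $\big|\xi^{\alpha}\partial_{\xi}^{\alpha}\chi\big(\tfrac{\xi}{\tau_j}\big)\big|\le C$ is uniformly bounded for all $\xi$, for any given $j$. Using this
fact, we can also estimate
\[
\begin{aligned}
\left|\partial_x^{\beta}\partial_{\xi}^{\alpha}\Big[a_j(x,\xi)\,\chi\Big(\frac{\xi}{\tau_j}\Big)\Big]\right|
 &= \left|\sum_{\alpha_1+\alpha_2=\alpha}c_{\alpha_1\alpha_2}\,\partial_x^{\beta}\partial_{\xi}^{\alpha_1}a_j(x,\xi)\,\partial_{\xi}^{\alpha_2}\chi\Big(\frac{\xi}{\tau_j}\Big)\right|\\
 &\le C\sum_{\alpha_1+\alpha_2=\alpha}|c_{\alpha_1\alpha_2}|\,(1+|\xi|)^{m_j-|\alpha_1|}(1+|\xi|)^{-|\alpha_2|}\\
 &\le C(1+|\xi|)^{m_j-|\alpha|}\\
 &= C(1+|\xi|)^{-1}(1+|\xi|)^{m_j+1-|\alpha|}.
\end{aligned}
\]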

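Now, the left hand side in estimate (2.7) is zero for $|\xi|<\tau_j/2$, so we may assume that
$|\xi|\ge\tau_j/2$. Hence we can have
\[ C(1+|\xi|)^{-1}\le C(1+\tau_j/2)^{-1}<2^{-j} \]
if we take $\tau_j$ sufficiently large. This implies that we can take the sum of $\partial_x^{\beta}\partial_{\xi}^{\alpha}$-derivatives
in the definition of $a(x,\xi)$, and (2.7) implies that $a\in S^{m_0}$. Finally, to show the asymptotic
formula, we can write
\[ a-\sum_{j=0}^{k}a_j=\sum_{j=k+1}^{\infty}a_j(x,\xi)\,\chi\Big(\frac{\xi}{\tau_j}\Big) \]
(up to the terms $a_j\big(\chi(\xi/\tau_j)-1\big)$, $j\le k$, which are compactly supported in $\xi$ and hence
belong to $S^{-\infty}$), and so
\[ \left|\partial_x^{\beta}\partial_{\xi}^{\alpha}\Big[a-\sum_{j=0}^{k}a_j\Big]\right|\le C(1+|\xi|)^{m_k-|\alpha|}. \]
In this argument we fix $\alpha$ and $\beta$ first, and then use the required estimates for all
$j\ge|\alpha|,|\beta|$. This shows that $a-\sum_{j=0}^{k}a_j\in S^{m_k}$, finishing the proof.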

2.2 Applications to partial differential equations


2.2.1 Solving partial differential equations
The main question in the theory of partial differential equations is how to solve the
equation
\[ Lu=f \]
for a given partial differential operator $L$ and a given function $f$. In other words, how
to find the inverse of $L$, i.e. an operator $L^{-1}$ such that
\[ L\circ L^{-1}=L^{-1}\circ L=I \tag{2.8} \]
is the identity operator (on some space of functions where everything is well-defined). In
this case the function $u=L^{-1}f$ gives a solution to the partial differential equation $Lu=f$.
First of all we can observe that if $L$ is an operator with variable coefficients, in
most cases it is impossible or very hard to find an explicit formula for its inverse $L^{-1}$
(even when it exists). However, in many questions in the theory of partial differential
equations one is actually not so much interested in having a precise explicit formula for
$L^{-1}$. Indeed, in reality one is mostly interested not in knowing the solution $u$ to the
equation $Lu=f$ explicitly but rather in knowing some fundamental properties of $u$.
Among the most important properties are the position and the strength of singularities
of $u$. Thus, the question becomes whether we can say something about singularities of $u$
knowing singularities of $f=Lu$. In this case we do not need to solve the equation $Lu=f$
exactly but it is sufficient to know its solution modulo the class of smooth functions.
Namely, instead of $L^{-1}$ in (2.8) one is interested in finding an "approximate" inverse of
$L$ modulo smooth functions, i.e. an operator $P$ such that
\[ u=Pf \]
solves the equation $Lu=f$ modulo smooth functions, i.e. such that $(PL-I)f$ and $(LP-I)f$ are
smooth for all functions $f$ from some class. Recalling that operators in $\Psi^{-\infty}$ have such a
property, we have the following definition.
Definition. An operator $P$ is called a right parametrix of $L$ if $LP-I\in\Psi^{-\infty}$. An operator
$S$ is called a left parametrix of $L$ if $SL-I\in\Psi^{-\infty}$.
In fact, left and right parametrices are closely related. Indeed, by definition we have
$LP-I=R_1$ and $SL-I=R_2$ with some $R_1,R_2\in\Psi^{-\infty}$. Then we have
\[ S=S(LP-R_1)=(SL)P-SR_1=P+R_2P-SR_1. \]
If $L,S,P$ are pseudo-differential operators of finite orders, the composition formulae
imply that $R_2P,SR_1\in\Psi^{-\infty}$, i.e. $S-P$ is a smoothing operator. Thus, we will be mainly
interested in the right parametrix $P$ because $u=Pf$ immediately solves the equation $Lu=f$
modulo smooth functions.
We also note that since we work here modulo smoothing operators (i.e. operators in
$\Psi^{-\infty}$), parametrices are obviously not unique, but finding one of them is already very good
because any two parametrices differ by a smoothing operator.

2.2.2 Elliptic equations


We will now show how we can use the calculus to "solve" elliptic partial differential
equations. First, we recall the notion of ellipticity.
Definition. A symbol $a\in S^m$ is called elliptic if for some $A>0$ it satisfies
\[ |a(x,\xi)|\ge A|\xi|^m \]
for all $|\xi|\ge 1$ and all $x\in\mathbb{R}^n$. We also say that the symbol $a$ is elliptic in $U\subset\mathbb{R}^n$ if the
above estimate holds for all $x\in U$. Pseudo-differential operators with elliptic symbols
are also called elliptic.
Now, let $L=T_a$ be an elliptic pseudo-differential operator with symbol $a\in S^m$ (which
is then also elliptic by definition). Let us introduce a cut-off function $\chi\in C^{\infty}(\mathbb{R}^n)$ such
that $\chi(\xi)=0$ for small $\xi$, e.g. for $|\xi|\le 1$, and such that $\chi(\xi)=1$ for large $\xi$, e.g. for
$|\xi|>2$. The ellipticity of $a(x,\xi)$ assures that it can be inverted pointwise for $|\xi|\ge 1$, so
we can define the symbol
\[ b(x,\xi)=\chi(\xi)\,[a(x,\xi)]^{-1}. \]
Since $a\in S^m$ is elliptic, we easily see that $b\in S^{-m}$. If we take $P_0=T_b$ then by
composition theorems we obtain
\[ LP_0=I+E_1,\qquad P_0L=I+E_2, \]
for some $E_1,E_2\in\Psi^{-1}$. Thus, we may view $P_0$ as a good first approximation for a
parametrix of $L$. In order to find a parametrix of $L$, we need to modify $P_0$ in such a way
that $E_1$ and $E_2$ would be in $\Psi^{-\infty}$. This construction can be carried out in an iterative
way. We now give a slightly more general statement which is useful for other purposes
as well.
Theorem. Let $a\in S^m$ be elliptic on an open set $U\subset\mathbb{R}^n$, i.e. there exists some $A>0$
such that
\[ |a(x,\xi)|\ge A|\xi|^m \]
for all $x\in U$ and all $|\xi|\ge 1$. Let $c\in S^l$ be a symbol of order $l$ whose support with respect
to $x$ is a compact subset of $U$. Then there exists a symbol $b\in S^{l-m}$ such that
\[ T_bT_a=T_c-T_e \]
for some symbol $e\in S^{-\infty}$.

Proof. In order to simplify the notation, in this proof we will write $a_1\circ a_2=a_3$ to
express the relation $T_{a_1}T_{a_2}=T_{a_3}$. We will construct the symbol $b$ as an asymptotic sum
\[ b\sim b_0+b_1+b_2+\cdots \]
and then use Proposition 2.1.17 to justify this infinite sum. We will also work with
$|\xi|\ge 1$ since small $\xi$ are not relevant for the symbolic constructions.
First, we take $b_0=c/a$ which is well defined for $|\xi|\ge 1$ in view of the ellipticity of $a$.
Then we have
\[ b_0\circ a=c-e_0,\qquad b_0\in S^{l-m},\quad\text{with }e_0=c-b_0\circ a\in S^{l-1}. \]
Then we take $b_1=e_0/a\in S^{l-m-1}$ so that we have
\[ (b_0+b_1)\circ a=c-e_0+b_1\circ a=c-e_1,\quad\text{with }e_1=e_0-b_1\circ a\in S^{l-2}. \]
Inductively, we define $b_j=e_{j-1}/a\in S^{l-m-j}$ and we have
\[ (b_0+b_1+\cdots+b_j)\circ a=c-e_j,\quad\text{with }e_j=e_{j-1}-b_j\circ a\in S^{l-j-1}. \]
Now, Proposition 2.1.17 shows that $b\in S^{l-m}$ and it satisfies $b\circ a=c-e$ with $e\in S^{-\infty}$
by its construction, completing the proof.
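
To make the first step of the iteration concrete, consider the global case $c=1$ and $|\xi|\ge 1$ (an assumption made here only for illustration; the theorem itself takes $c$ compactly supported in $x$). With $b_0=1/a$, the composition formula of Section 2.1.10 gives, modulo $S^{-2}$:

```latex
\begin{align*}
b_0\circ a &\equiv b_0\,a+(2\pi i)^{-1}\sum_{j=1}^{n}\partial_{\xi_j}\Big(\frac{1}{a}\Big)\,\partial_{x_j}a
  = 1-(2\pi i)^{-1}\sum_{j=1}^{n}\frac{(\partial_{\xi_j}a)(\partial_{x_j}a)}{a^2},\\
e_0 = 1-b_0\circ a &\equiv (2\pi i)^{-1}\sum_{j=1}^{n}\frac{(\partial_{\xi_j}a)(\partial_{x_j}a)}{a^2},
  \qquad b_1=\frac{e_0}{a}.
\end{align*}
```

Here $e_0\in S^{-1}$, since $\partial_{\xi_j}a\in S^{m-1}$, $\partial_{x_j}a\in S^m$ and $a^{-2}\in S^{-2m}$ for $|\xi|\ge 1$, and hence $b_1\in S^{-m-1}$. If $a$ does not depend on $x$, all terms with $|\alpha|\ge 1$ vanish, so $e_0=0$ and $b_0$ is already an exact inverse for $|\xi|\ge 1$.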

2.2.3 Parametrix for elliptic operators and estimates


Now, a parametrix for an elliptic partial differential operator $L$ is given by Theorem 2.2.2
with $T_c=I$ being the identity operator. More precisely, we also have the following local
version of this.
Corollary. Let $L=\sum_{|\alpha|\le m}a_{\alpha}(x)\partial_x^{\alpha}$ be an elliptic partial differential operator in an
open set $U\subset\mathbb{R}^n$. Let $\chi_1,\chi_2,\chi_3\in C_0^{\infty}(\mathbb{R}^n)$ be such that $\chi_2=1$ on the support of $\chi_1$ and
$\chi_3=1$ on the support of $\chi_2$. Then there is an operator $P\in\Psi^{-m}$ such that
\[ P(\chi_2L)=\chi_1I+E\chi_3, \]
for some $E\in\Psi^{-\infty}$.
Proof. We take $T_a=\chi_2L$ and $T_c=\chi_1I$. Then $T_a$ is elliptic on the support of $\chi_2$ and
we can take $P=T_b$ with $b\in S^{-m}$ from Theorem 2.2.2.
We will now apply this result to obtain a statement on the regularity of solution to
elliptic partial differential equations. We assume that the order m below is an integer
which is certainly true when L is a partial differential operator. However, if we take into
account the discussion from the next section, we will see that the statements below are
still true for any m ∈ R.
Theorem. Let $L\in\Psi^m$ be an elliptic pseudo-differential operator in an open set $U\subset\mathbb{R}^n$
and let $Lu=f$ in $U$. Assume that $f\in(L^2_k)_{loc}$. Then $u\in(L^2_{m+k})_{loc}$.
This theorem shows that if $u$ is a solution of an elliptic partial differential equation
$Lu=f$ then there is a local gain of $m$ derivatives for $u$ compared to $f$, where $m$ is the
order of the operator $L$.
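
For example (a standard illustration, with the Laplacian chosen here as the test case), $\Delta$ has symbol $-4\pi^2|\xi|^2$, which is elliptic of order $m=2$ with $A=4\pi^2$, so the theorem yields a local gain of two derivatives:

```latex
\[
|a(x,\xi)| = 4\pi^2|\xi|^2 \quad (|\xi|\ge 1), \qquad
\Delta u = f,\ \ f\in (L^2_k)_{loc} \ \Longrightarrow\ u\in (L^2_{k+2})_{loc}.
\]
```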

Proof. Let $\chi_1,\chi_2,\chi_3\in C_0^{\infty}(U)$ be non-zero functions such that $\chi_2=1$ on the support
of $\chi_1$ and $\chi_3=1$ on the support of $\chi_2$. Then, similar to the proof of Corollary 2.2.3 we
have
\[ P(\chi_2L)=\chi_1I+E\chi_3, \]
with some $P\in\Psi^{-m}$. Since $f\in(L^2_k)_{loc}$ we have $P(\chi_2f)\in(L^2_{m+k})_{loc}$. Also, $E(\chi_3u)\in C^{\infty}$
so that $\|\chi_3E(\chi_3u)\|_{L^2_k}\le C\|\chi_3u\|_{L^2}$ for any $k$. Summarising and using
\[ P(\chi_2f)=\chi_1u+E(\chi_3u), \]
we obtain
\[ \|\chi_1u\|_{L^2_{k+m}}\le C\big(\|\chi_2f\|_{L^2_k}+\|\chi_3u\|_{L^2}\big), \]
which implies that $u\in(L^2_{m+k})_{loc}$.


Remark. We can observe from the proof that, by the calculus and the existence of a
parametrix, properties of the solution $u$ are reduced to the fact that pseudo-differential
operators in $\Psi^{-m}$ map $L^2_k$ to $L^2_{k+m}$. In fact, in this way many properties of solutions to
partial differential equations are reduced to questions about general pseudo-differential
operators. In particular, let us assume without proof the following statement where for
now we will take $\mu$ and $k$ to be integers or zero. However, if we adopt the definition of
Sobolev spaces from the next section, this statement is valid for all $\mu,k\in\mathbb{R}$.
Statement on $L^p$-continuity. Let $T\in\Psi^{-\mu}$ be a pseudo-differential operator of order
$-\mu$ and let $1<p<\infty$. Then $T$ is bounded from the Sobolev space $L^p_k(\mathbb{R}^n)$ to the Sobolev
space $L^p_{k+\mu}(\mathbb{R}^n)$.
If $\mu$ and $k$ are integers or zero, we will prove this statement in the next section. As an
immediate consequence, by the same argument as in the proof of the theorem, we also
obtain
Corollary. Let $L\in\Psi^m$ be an elliptic pseudo-differential operator in an open set $U\subset\mathbb{R}^n$,
let $1<p<\infty$, and let $Lu=f$ in $U$. Assume that $f\in(L^p_k)_{loc}$. Then $u\in(L^p_{m+k})_{loc}$.

2.2.4 Sobolev spaces revisited


Up to now we defined Sobolev spaces $L^p_k$ assuming that the index $k$ is an integer. In fact,
using the calculus of pseudo-differential operators we can show that these spaces can be
defined for all $k \in \mathbb{R}$, thus allowing one to measure the regularity of functions much more
precisely. In the following discussion we assume the statement on the $L^p$–continuity of
pseudo-differential operators from Section 2.2.3.
We recall from Section 1.3.11 that for an integer $k \in \mathbb{N}$ we defined the Sobolev space
$L^p_k(\mathbb{R}^n)$ as the space of all $f \in L^p(\mathbb{R}^n)$ such that their distributional derivatives satisfy
$\partial_x^\alpha f \in L^p(\mathbb{R}^n)$ for all $0 \le |\alpha| \le k$. This space is equipped with the norm
\[ \|f\|_{L^p_k} = \sum_{|\alpha| \le k} \|\partial_x^\alpha f\|_{L^p} \]
(or with any equivalent norm).

Let now $s \in \mathbb{R}$ be a real number and let us consider the operators $(I - \Delta)^{s/2} \in \Psi^s$, which are
pseudo-differential operators with symbols $a(x, \xi) = (1 + 4\pi^2 |\xi|^2)^{s/2}$. We will say that
\[ f \in L^p_s(\mathbb{R}^n) \quad \text{if} \quad (I - \Delta)^{s/2} f \in L^p(\mathbb{R}^n). \]
We equip this space with the norm
\[ \|f\|_{L^p_s} = \|(I - \Delta)^{s/2} f\|_{L^p}. \]

Proposition. If $s \in \mathbb{N}$ is an integer, the space $L^p_s(\mathbb{R}^n)$ coincides with the space $L^p_k(\mathbb{R}^n)$ with
$k = s$, with equivalence of norms.
Proof. We will use the index $k$ for both spaces. Since the operator $(I - \Delta)^{k/2}$ is a
pseudo-differential operator of order $k$, by the statement on the $L^p$–continuity in Section 2.2.3
we get that it is bounded from $L^p_k$ to $L^p$, i.e. we have
\[ \|(I - \Delta)^{k/2} f\|_{L^p} \le C \sum_{|\alpha| \le k} \|\partial_x^\alpha f\|_{L^p}. \]
Conversely, let $P_\alpha$ be the pseudo-differential operator defined by $P_\alpha = \partial_x^\alpha (I - \Delta)^{-k/2}$, i.e. the
pseudo-differential operator with symbol $p_\alpha(x, \xi) = (2\pi i \xi)^\alpha (1 + 4\pi^2 |\xi|^2)^{-k/2}$, independent
of $x$. If $|\alpha| \le k$, we get that $p_\alpha \in S^{|\alpha| - k} \subset S^0$, so that $P_\alpha \in \Psi^0$ for all $|\alpha| \le k$. By the
statement on the $L^p$–continuity in Section 2.2.3 the operators $P_\alpha$ are bounded on $L^p(\mathbb{R}^n)$.
Therefore, we obtain
\[ \sum_{|\alpha| \le k} \|\partial_x^\alpha f\|_{L^p} = \sum_{|\alpha| \le k} \|P_\alpha (I - \Delta)^{k/2} f\|_{L^p} \le C \|(I - \Delta)^{k/2} f\|_{L^p}, \]
completing the proof.
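For $p = 2$ and $n = 1$ the equivalence of norms can be seen directly from Plancherel's identity, which gives $\|(I - \Delta)^{1/2} f\|_{L^2}^2 = \|f\|_{L^2}^2 + \|f'\|_{L^2}^2$, so that the two norms differ by at most a factor $\sqrt{2}$. The following Python sketch checks this numerically; it is an illustration only and not part of the argument. The Gaussian $f(x) = e^{-\pi a x^2}$ with $a = 2$, its transform $\hat{f}(\xi) = a^{-1/2} e^{-\pi \xi^2 / a}$ (assuming the convention $\hat{f}(\xi) = \int e^{-2\pi i x \xi} f(x)\,dx$, which matches the symbol $(1 + 4\pi^2|\xi|^2)^{s/2}$), the helper `midpoint` and the truncation to $[-8, 8]$ are all choices made for this example.

```python
import math

def midpoint(g, lo, hi, n=4000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

a = 2.0  # width parameter of the Gaussian (illustrative choice)

def f(x):
    return math.exp(-math.pi * a * x * x)

def df(x):
    # Derivative of f, computed by hand.
    return -2.0 * math.pi * a * x * f(x)

def f_hat(xi):
    # Fourier transform of f for the convention with exp(-2*pi*i*x*xi).
    return math.exp(-math.pi * xi * xi / a) / math.sqrt(a)

# Classical Sobolev norm for k = 1: ||f||_{L^2} + ||f'||_{L^2}.
norm_f = math.sqrt(midpoint(lambda x: f(x) ** 2, -8.0, 8.0))
norm_df = math.sqrt(midpoint(lambda x: df(x) ** 2, -8.0, 8.0))
classical = norm_f + norm_df

# Norm defined through (I - Delta)^{1/2}, computed on the Fourier side.
bessel_sq = midpoint(
    lambda xi: (1.0 + 4.0 * math.pi ** 2 * xi ** 2) * f_hat(xi) ** 2, -8.0, 8.0
)
bessel = math.sqrt(bessel_sq)

print(bessel, classical, math.sqrt(2.0) * bessel)
```

The three printed numbers should appear in increasing order, reflecting $\|f\|_{L^2_s} \le \|f\|_{L^2_k} \le \sqrt{2}\,\|f\|_{L^2_s}$ for $s = k = 1$. For $p \neq 2$ no such elementary computation is available, which is why the proof relies on the statement on the $L^p$–continuity.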

2.2.5 Proof of the statement on the Lp –continuity


Now, let us prove the statement on the $L^p$–continuity from Section 2.2.3. However, we
will assume without proof that pseudo-differential operators of order zero are bounded on
$L^p(\mathbb{R}^n)$ for all $1 < p < \infty$. This falls outside the scope of this course and can be proved,
for example, by the so-called Calderón–Zygmund theory of singular integral operators,
which include the pseudo-differential operators from this course if we view them as integral
operators with singular kernels. We will give some indications to this end in the last
part of these notes.
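Proof of the statement on $L^p$–continuity. Let $f \in L^p_s(\mathbb{R}^n)$. By definition this means
that $(I - \Delta)^{s/2} f \in L^p(\mathbb{R}^n)$. For $T \in \Psi^{-\mu}$ we can then write, using the calculus of
pseudo-differential operators:
\[ (I - \Delta)^{(s+\mu)/2} T f = (I - \Delta)^{(s+\mu)/2} T (I - \Delta)^{-s/2} (I - \Delta)^{s/2} f \in L^p(\mathbb{R}^n), \]
since the operator $(I - \Delta)^{(s+\mu)/2} T (I - \Delta)^{-s/2}$ is a pseudo-differential operator of order
$(s + \mu) - \mu - s = 0$ and is, therefore, bounded on $L^p(\mathbb{R}^n)$. Hence $Tf \in L^p_{s+\mu}(\mathbb{R}^n)$.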

2.2.6 Calculus proof of L2 –boundedness
In this section we will give a short proof of the fact that pseudo-differential operators of
order zero are bounded on L2 (Rn ) which was also proved in Section 2.1.9. The proof will
rely on the following lemma that we will give without proof here, and which is sometimes
called Schur’s lemma.
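Lemma. Let $T$ be an integral operator of the form
\[ Tu(x) = \int_{\mathbb{R}^n} K(x, y) u(y)\, dy \]
with kernel $K \in L^1_{loc}(\mathbb{R}^n \times \mathbb{R}^n)$ satisfying
\[ \sup_{x \in \mathbb{R}^n} \int_{\mathbb{R}^n} |K(x, y)|\, dy \le C, \qquad \sup_{y \in \mathbb{R}^n} \int_{\mathbb{R}^n} |K(x, y)|\, dx \le C. \]
Then $T$ is bounded on $L^2(\mathbb{R}^n)$.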
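To get a feeling for the lemma, one can look at its discrete analogue, in which the kernel is a matrix, the integrals are row and column sums, and the conclusion reads $\|Ku\|_{\ell^2} \le \sqrt{C_1 C_2}\, \|u\|_{\ell^2}$ with $C_1$ and $C_2$ the largest row and column sums. The Python sketch below checks this on a few vectors. The kernel $K_{ij} = 1/(1 + (i - j)^2)$, the size $n = 40$ and the test vectors are hypothetical choices for illustration; the sketch does not replace the proof, which is omitted in these notes.

```python
import math

n = 40
# Hypothetical discrete kernel with summable off-diagonal decay.
K = [[1.0 / (1.0 + (i - j) ** 2) for j in range(n)] for i in range(n)]

# Discrete versions of the two suprema in the lemma.
row_bound = max(sum(abs(K[i][j]) for j in range(n)) for i in range(n))
col_bound = max(sum(abs(K[i][j]) for i in range(n)) for j in range(n))
schur_bound = math.sqrt(row_bound * col_bound)

def apply_kernel(kernel, u):
    """Discrete analogue of (Tu)(x) = integral of K(x, y) u(y) dy."""
    return [sum(row[j] * u[j] for j in range(len(u))) for row in kernel]

def l2_norm(u):
    return math.sqrt(sum(x * x for x in u))

test_vectors = [
    [1.0] * n,                                        # constant vector
    [(-1.0) ** j for j in range(n)],                  # oscillating vector
    [math.sin(0.3 * j) for j in range(n)],            # slowly varying vector
    [1.0 if j == n // 2 else 0.0 for j in range(n)],  # point mass
]

# Each ratio ||K u|| / ||u|| should not exceed the Schur bound.
ratios = [l2_norm(apply_kernel(K, u)) / l2_norm(u) for u in test_vectors]
print(max(ratios), "<=", schur_bound)
```

Since this kernel is symmetric, the row and column bounds coincide, as in the lemma where both suprema are bounded by the same constant $C$.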


Calculus proof of Theorem 2.1.9. Let $T \in \Psi^0$ be a pseudo-differential operator
of order zero with symbol $a \in S^0$ and principal symbol $\sigma_a$. Then its adjoint satisfies
$T^* \in \Psi^0$ and hence also the composition $T^* T \in \Psi^0$. The operator $T^* T$ has bounded principal
symbol $|\sigma_a(x, \xi)|^2$ by the composition formula, so that $|\sigma_a(x, \xi)|^2 < M$ for some constant
$0 < M < \infty$. Then the function
\[ p(x, \xi) = \sqrt{M - |\sigma_a(x, \xi)|^2} \]
is well-defined and it is easy to check that $p \in S^0$. Let $P \in \Psi^0$ be the pseudo-differential
operator with symbol $p$. By the calculus again we have the identity
\[ T^* T = M - P^* P + R, \]
for some pseudo-differential operator $R \in \Psi^{-1}$. Then
\[ \|Tu\|^2_{L^2} = \langle Tu, Tu \rangle = \langle T^* T u, u \rangle = M \|u\|^2_{L^2} - \|Pu\|^2_{L^2} + \langle Ru, u \rangle \le M \|u\|^2_{L^2} + \langle Ru, u \rangle, \]

so that $T$ is bounded on $L^2(\mathbb{R}^n)$ if $R$ is. The boundedness of $R$ on $L^2(\mathbb{R}^n)$ can be proved
by induction. Indeed, using the estimate
\[ \|Ru\|^2_{L^2} = \langle Ru, Ru \rangle = \langle R^* R u, u \rangle \le C \|R^* R u\|_{L^2} \|u\|_{L^2} \]
we see that $R \in \Psi^{-1}$ is bounded on $L^2(\mathbb{R}^n)$ if $R^* R \in \Psi^{-2}$ is bounded on $L^2(\mathbb{R}^n)$.
Continuing this argument we can reduce the question of $L^2$–boundedness to the boundedness
of pseudo-differential operators $S \in \Psi^m$ for some sufficiently negative $m < 0$. We can
now use Theorem 2.1.6 with $N = 0$ for $x$ close to $y$ to show that the integral kernel
$K(x, y)$ of $S$ is bounded for $|x - y| \le 1$, while it decreases fast for $|x - y|$ large if we
take sufficiently large $N$. Therefore, we can use Lemma 2.2.6 to conclude that $S$ must
be bounded on $L^2(\mathbb{R}^n)$, thus completing the proof.

3 Appendix
3.1 Interpolation
3.1.1 Distribution functions
Let $\mu$ be the Lebesgue measure on $\mathbb{R}^n$ (a volume element, if you are not familiar with
measure theory).
Definition. For a function $f : \mathbb{R}^n \to \mathbb{C}$ we define its distribution function $\mu_f(\lambda)$ by
\[ \mu_f(\lambda) = \mu\{x \in \mathbb{R}^n : |f(x)| > \lambda\}. \]

We have the following useful relation between the $L^p$–norm and the distribution function of a
function.
Theorem. Let $f \in L^p(\mathbb{R}^n)$. Then we have the identity
\[ \int_{\mathbb{R}^n} |f(x)|^p\, dx = p \int_0^\infty \mu_f(\lambda) \lambda^{p-1}\, d\lambda. \]

Proof. Let us define a measure on $\mathbb{R}$ by setting
\[ \nu((a, b]) = \mu_f(b) - \mu_f(a) = -\mu\{x \in \mathbb{R}^n : a < |f(x)| \le b\} = -\mu(|f|^{-1}((a, b])). \]
By the standard extension property of measures we can then extend $\nu$ to all Borel sets
$E \subset (0, \infty)$ by setting $\nu(E) = -\mu(|f|^{-1}(E))$. We note that this is well-defined
since $|f|$ is measurable if $f$ is measurable. Then we claim that we have the following
property for, say, integrable functions $\varphi : [0, \infty) \to \mathbb{R}$:
\[ \int_{\mathbb{R}^n} \varphi \circ |f|\, d\mu = -\int_0^\infty \varphi(\alpha)\, d\mu_f(\alpha). \qquad (3.1) \]

Indeed, if $\varphi = \chi_{(a,b]}$ is the characteristic function of a set $(a, b]$, i.e. equal to one on $(a, b]$
and zero on its complement, then the definition of $\nu$ implies
\[ \int_{\mathbb{R}^n} \chi_{(a,b]} \circ |f|\, d\mu = \int_{a < |f(x)| \le b} d\mu = -\int_a^b d\mu_f = -\int_0^\infty \chi_{(a,b]}\, d\mu_f, \]
which verifies (3.1) for characteristic functions. By the linearity of integrals, we then
have (3.1) for finite linear combinations of characteristic functions and, consequently, for
all integrable functions by Lebesgue's monotone convergence theorem. Now, taking
$\varphi(\alpha) = \alpha^p$ in (3.1), we get
\[ \int_{\mathbb{R}^n} |f|^p\, d\mu = -\int_0^\infty \alpha^p\, d\mu_f(\alpha) = p \int_0^\infty \alpha^{p-1} \mu_f(\alpha)\, d\alpha, \]
where we integrated by parts in the last equality. The proof is complete.
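The identity of the theorem can be tested on a function whose distribution function is known explicitly. For $f(x) = e^{-|x|}$ on $\mathbb{R}$ the set where $|f| > \lambda$ is the interval $|x| < \log(1/\lambda)$ for $0 < \lambda < 1$ and is empty for $\lambda \ge 1$, so that $\mu_f(\lambda) = 2 \log(1/\lambda)$ on $(0, 1)$ and zero otherwise. The Python sketch below compares both sides for $p = 3$. The choice of $f$ and $p$, the helper `midpoint` and the truncation of the $x$-integral to $[-20, 20]$ are assumptions made for this example only.

```python
import math

def midpoint(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

p = 3.0

def f(x):
    return math.exp(-abs(x))

def mu_f(lam):
    """Lebesgue measure of {x : |f(x)| > lam} for f(x) = exp(-|x|)."""
    return 2.0 * math.log(1.0 / lam) if 0.0 < lam < 1.0 else 0.0

# Left-hand side: integral of |f|^p over the real line (tails are negligible).
lhs = midpoint(lambda x: f(x) ** p, -20.0, 20.0)

# Right-hand side: p times the integral of mu_f(lam) * lam^(p-1);
# mu_f vanishes for lam >= 1, so it suffices to integrate over (0, 1).
rhs = p * midpoint(lambda lam: mu_f(lam) * lam ** (p - 1.0), 0.0, 1.0)

print(lhs, rhs)
```

Both numbers approximate $2/p = 2/3$, which is the exact value of either side in this example.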

3.1.2 Weak type (p, p)
Definition. We say that an operator $T$ is of weak type $(p, p)$ if there is a constant $C > 0$
such that for every $\lambda > 0$ we have
\[ \mu\{x \in \mathbb{R}^n : |Tu(x)| > \lambda\} \le C\, \frac{\|u\|^p_{L^p}}{\lambda^p}. \]

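Proposition. If $T$ is bounded from $L^p(\mathbb{R}^n)$ to $L^p(\mathbb{R}^n)$ then $T$ is also of weak type $(p, p)$.
Proof. If $v \in L^1(\mathbb{R}^n)$ then for all $\rho > 0$ we have a simple estimate
\[ \rho\, \mu\{x \in \mathbb{R}^n : |v(x)| > \rho\} \le \int_{|v(x)| > \rho} |v(x)|\, d\mu(x) \le \|v\|_{L^1}. \]
Now, if we take $v(x) = |Tu(x)|^p$ and $\rho = \lambda^p$, this gives
$\mu\{x \in \mathbb{R}^n : |Tu(x)| > \lambda\} \le \lambda^{-p} \|Tu\|^p_{L^p} \le C \lambda^{-p} \|u\|^p_{L^p}$, so that $T$ is of weak type
$(p, p)$.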
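The simple estimate in this proof can be tried out on a finite set of points, each of measure one, where the integral becomes a sum. The Python sketch below does this for $p = 2$. The sample values standing for $Tu$ and the chosen levels $\lambda$ are hypothetical, and the sketch only covers the first step of the proof: the bound $\|Tu\|_{L^p} \le C \|u\|_{L^p}$ still has to be supplied by the boundedness of $T$.

```python
# Hypothetical sample values of T u on eight points, each of measure one.
Tu = [0.1, -0.5, 1.2, 3.0, 0.05, -2.2, 0.9, 4.1]
p = 2.0

# ||Tu||_{L^p}^p with respect to the counting measure.
norm_p_to_p = sum(abs(v) ** p for v in Tu)

def measure_above(lam):
    """Measure of the set {x : |Tu(x)| > lam}."""
    return sum(1 for v in Tu if abs(v) > lam)

# Pairs (left-hand side, right-hand side) of the estimate
# mu{|Tu| > lam} <= ||Tu||_p^p / lam^p for several levels lam.
checks = [(measure_above(lam), norm_p_to_p / lam ** p) for lam in (0.25, 1.0, 2.0, 4.0)]

for left, right in checks:
    print(left, "<=", right)
```

For small $\lambda$ the right-hand side is far larger than the measure of the whole set, so the estimate carries information mainly for large $\lambda$.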

3.1.3 Marcinkiewicz interpolation theorem


The following theorem is extremely valuable in proving $L^p$–continuity of operators since
it reduces the analysis to weak type continuity for only two values of the index.
Theorem. Let $r < q$ and assume that the operator $T$ is of weak types $(r, r)$ and $(q, q)$. Then
$T$ is bounded from $L^p(\mathbb{R}^n)$ to $L^p(\mathbb{R}^n)$ for all $r < p < q$.
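Proof. Let $u \in L^p(\mathbb{R}^n)$. For each $\lambda > 0$ we can define functions $u_1$ and $u_2$ by $u_1(x) =
u(x)$ for $|u(x)| > \lambda$ and by $u_2(x) = u(x)$ for $|u(x)| \le \lambda$, and to be zero otherwise. Then
we have the identity $u = u_1 + u_2$ and estimates $|u_1|, |u_2| \le |u|$. It follows that
\[ \mu_{Tu}(2\lambda) \le \mu_{Tu_1}(\lambda) + \mu_{Tu_2}(\lambda) \le C_1 \frac{\|u_1\|^r_{L^r}}{\lambda^r} + C_2 \frac{\|u_2\|^q_{L^q}}{\lambda^q}, \]
since $T$ is of weak types $(r, r)$ and $(q, q)$. Therefore, substituting $\lambda \mapsto 2\lambda$ in the identity of
Theorem 3.1.1 and absorbing the resulting factor $2^p$ into the constants $C_1$ and $C_2$, we can estimate
\[ \int_{\mathbb{R}^n} |Tu(x)|^p\, dx = p \int_0^\infty \lambda^{p-1} \mu_{Tu}(\lambda)\, d\lambda \le C_1 p \int_0^\infty \lambda^{p-1-r} \left( \int_{|u| > \lambda} |u(x)|^r\, dx \right) d\lambda + C_2 p \int_0^\infty \lambda^{p-1-q} \left( \int_{|u| \le \lambda} |u(x)|^q\, dx \right) d\lambda. \]
Using Fubini's theorem, the first term on the right hand side can be rewritten as
\begin{align*}
\int_0^\infty \lambda^{p-1-r} \left( \int_{|u| > \lambda} |u(x)|^r\, dx \right) d\lambda
&= \int_0^\infty \lambda^{p-1-r} \left( \int_{\mathbb{R}^n} \chi_{|u| > \lambda} |u(x)|^r\, dx \right) d\lambda \\
&= \int_{\mathbb{R}^n} |u(x)|^r \left( \int_0^\infty \lambda^{p-1-r} \chi_{|u| > \lambda}\, d\lambda \right) dx \\
&= \int_{\mathbb{R}^n} |u(x)|^r \left( \int_0^{|u(x)|} \lambda^{p-1-r}\, d\lambda \right) dx \\
&= \frac{1}{p - r} \int_{\mathbb{R}^n} |u(x)|^r |u(x)|^{p-r}\, dx \\
&= \frac{1}{p - r} \int_{\mathbb{R}^n} |u(x)|^p\, dx,
\end{align*}
where $\chi_{|u| > \lambda}$ is the characteristic function of the set $\{x \in \mathbb{R}^n : |u(x)| > \lambda\}$. Similarly, we
have
\[ \int_0^\infty \lambda^{p-1-q} \left( \int_{|u| \le \lambda} |u(x)|^q\, dx \right) d\lambda = \frac{1}{q - p} \int_{\mathbb{R}^n} |u(x)|^p\, dx, \]
completing the proof.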
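The splitting $u = u_1 + u_2$ used in this proof puts the large part of $u$ into $L^r$ and the small part into $L^q$. Indeed, since $r < p < q$, on the set $|u| > \lambda$ we have $|u|^r \le \lambda^{r-p} |u|^p$, and on the set $|u| \le \lambda$ we have $|u|^q \le \lambda^{q-p} |u|^p$. The Python sketch below checks these two bounds on a finite set of points of measure one. The values of $u$, the exponents and the level $\lambda$ are hypothetical choices for illustration.

```python
r, p, q = 1.0, 2.0, 4.0   # exponents with r < p < q
lam = 1.2                 # truncation height (arbitrary for this example)

# Hypothetical values of u on eight points, each of measure one.
u = [0.2, -0.7, 1.5, 3.0, -0.9, 0.05, -2.4, 1.0]

u1 = [x if abs(x) > lam else 0.0 for x in u]    # part where |u| > lam
u2 = [x if abs(x) <= lam else 0.0 for x in u]   # part where |u| <= lam

def norm_pow(v, s):
    """The s-th power of the L^s norm for the counting measure."""
    return sum(abs(x) ** s for x in v)

u_p = norm_pow(u, p)

# ||u1||_r^r <= lam^(r-p) ||u||_p^p and ||u2||_q^q <= lam^(q-p) ||u||_p^p.
large_ok = norm_pow(u1, r) <= lam ** (r - p) * u_p
small_ok = norm_pow(u2, q) <= lam ** (q - p) * u_p

print(large_ok, small_ok)
```

These are the bounds that make the weak type $(r, r)$ and $(q, q)$ estimates applicable to $u_1$ and $u_2$ for a function $u$ that is only assumed to be in $L^p$.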

3.1.4 Calderon–Zygmund covering lemma


As an important tool (which will not be used here, so it is given just for information)
for proving various results on boundedness in $L^1(\mathbb{R}^n)$ or on weak type $(1, 1)$, we have the
following fundamental decomposition of integrable functions.
Theorem. Let $u \in L^1(\mathbb{R}^n)$ and $\lambda > 0$. Then there exist $v, w_k \in L^1(\mathbb{R}^n)$ and there
exists a collection of disjoint cubes $Q_k$, $k \in \mathbb{N}$, centred at some points $x_k$, such that the
following properties are satisfied:
\[ u = v + \sum_{k=1}^\infty w_k, \qquad \|v\|_{L^1} + \sum_{k=1}^\infty \|w_k\|_{L^1} \le 3 \|u\|_{L^1}, \]
\[ \operatorname{supp} w_k \subset Q_k, \qquad \int_{Q_k} w_k(x)\, dx = 0, \]
\[ \sum_{k=1}^\infty \mu(Q_k) \le \lambda^{-1} \|u\|_{L^1}. \]

3.1.5 Remarks on Lp –continuity of pseudo-differential operators


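Let $a \in S^0$. Then by Theorem 2.1.6 the integral kernel $K(x, y)$ of the pseudo-differential
operator $T_a$ satisfies the estimates
\[ |\partial_x^\alpha \partial_y^\beta K(x, y)| \le A_{\alpha\beta} |x - y|^{-n-|\alpha|-|\beta|} \]
for all $x \neq y$. In particular, for $\alpha = \beta = 0$ this gives
\[ |K(x, y)| \le A |x - y|^{-n} \quad \text{for all } x \neq y. \qquad (3.2) \]
Moreover, if we use it for $\alpha = 0$ and $|\beta| = 1$, we get
\[ \int_{|x - z| \ge 2\delta} |K(x, y) - K(x, z)|\, dx \le A \quad \text{if } |y - z| \le \delta, \text{ for all } \delta > 0. \qquad (3.3) \]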

Now, if we take a general integral operator $T$ of the form
\[ Tu(x) = \int_{\mathbb{R}^n} K(x, y) u(y)\, dy, \]
properties (3.2) and (3.3) of the kernel are the starting point of the so-called
Calderón–Zygmund theory of singular integral operators. In particular, one can conclude that such
operators are of weak type $(1, 1)$, i.e. they satisfy the estimate
\[ \mu\{x \in \mathbb{R}^n : |Tu(x)| > \lambda\} \le C\, \frac{\|u\|_{L^1}}{\lambda}. \]
Since we also know from Section 2.1.9 that operators $T_a \in \Psi^0$ are bounded on $L^2(\mathbb{R}^n)$, and since
we know from Section 3.1.2 that this implies that $T_a$ is of weak type $(2, 2)$, we get
that pseudo-differential operators of order zero are of weak types $(1, 1)$ and $(2, 2)$. Then,
by the Marcinkiewicz interpolation theorem, we conclude that $T_a$ is bounded on $L^p(\mathbb{R}^n)$ for
all $1 < p < 2$. By the standard duality argument, this implies that $T_a$ is bounded on
$L^p(\mathbb{R}^n)$ also for all $2 < p < \infty$. Since we also have the boundedness on $L^2(\mathbb{R}^n)$, we obtain
Theorem. Let $T \in \Psi^0$. Then $T$ extends to a bounded operator from $L^p(\mathbb{R}^n)$ to $L^p(\mathbb{R}^n)$,
for all $1 < p < \infty$.
We note that there exist different proofs of this theorem. An alternative method is to
reduce the $L^p$–boundedness to the question of uniform boundedness of Fourier multipliers
in $L^p(\mathbb{R}^n)$, which then follows from Hörmander's theorem on Fourier multipliers.

References
[1] M. Ruzhansky and V. Turunen, Pseudo-differential operators and symmetries.
Background analysis and advanced topics. Birkhäuser, 2010.

[2] M. A. Shubin, Pseudodifferential operators and spectral theory. Nauka, Moscow,
1978.

[3] E. M. Stein, Harmonic analysis: real-variable methods, orthogonality, and
oscillatory integrals. Princeton University Press, 1993.
