Notes March2002

Copyright
© Attribution Non-Commercial (BY-NC)
LECTURES IN HARMONIC ANALYSIS
Thomas H. Wolff
Revised version, March 2002

1. $L^1$ Fourier transform

If $f \in L^1(\mathbb{R}^n)$ then its Fourier transform is $\hat f : \mathbb{R}^n \to \mathbb{C}$ defined by
\[ \hat f(\xi) = \int e^{-2\pi i x \cdot \xi} f(x)\,dx. \]

More generally, let $M(\mathbb{R}^n)$ be the space of finite complex-valued measures on $\mathbb{R}^n$ with the norm $\|\mu\| = |\mu|(\mathbb{R}^n)$, where $|\mu|$ is the total variation. Thus $L^1(\mathbb{R}^n)$ is contained in $M(\mathbb{R}^n)$ via the identification $f \to \mu$, $d\mu = f\,dx$. We can generalize the definition of Fourier transform via
\[ \hat\mu(\xi) = \int e^{-2\pi i x \cdot \xi}\,d\mu(x). \]

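Example 1 Let $a \in \mathbb{R}^n$ and let $\delta_a$ be the Dirac measure at $a$, $\delta_a(E) = 1$ if $a \in E$ and $\delta_a(E) = 0$ if $a \notin E$. Then $\hat\delta_a(\xi) = e^{-2\pi i a \cdot \xi}$.

Example 2 Let $\Gamma(x) = e^{-\pi |x|^2}$. Then
\[ \hat\Gamma(\xi) = e^{-\pi |\xi|^2}. \tag{1} \]

Proof The integral in question is
\[ \hat\Gamma(\xi) = \int e^{-2\pi i x \cdot \xi} e^{-\pi |x|^2}\,dx. \]
Notice that this factors as a product of one variable integrals. So it suffices to prove (1) when $n = 1$. For this we use the formula for the integral of a Gaussian: $\int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = 1$. It follows that
\[
\begin{aligned}
\int_{-\infty}^{\infty} e^{-2\pi i x \xi} e^{-\pi x^2}\,dx
&= \int_{-\infty}^{\infty} e^{-\pi (x + i\xi)^2}\,dx \cdot e^{-\pi \xi^2} \\
&= \int_{-\infty + i\xi}^{\infty + i\xi} e^{-\pi x^2}\,dx \cdot e^{-\pi \xi^2} \\
&= \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx \cdot e^{-\pi \xi^2} \\
&= e^{-\pi \xi^2},
\end{aligned}
\]
where we used contour integration at the next to last line.

There are some basic estimates for the $L^1$ Fourier transform, which we state as Propositions 1.1 and 1.2 below. Consideration of Example 1 above shows that in complete generality not that much more can be said.

Proposition 1.1 If $\mu \in M(\mathbb{R}^n)$ then $\hat\mu$ is a bounded function, indeed
\[ \|\hat\mu\|_\infty \le \|\mu\|_{M(\mathbb{R}^n)}. \tag{2} \]

Proof For any $\xi$,
\[ |\hat\mu(\xi)| = \Big| \int e^{-2\pi i x \cdot \xi}\,d\mu(x) \Big| \le \int |e^{-2\pi i x \cdot \xi}|\,d|\mu|(x) = \|\mu\|. \]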
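Example 2 and the bound (2) can be checked numerically in dimension one. The following is only a sketch under stated assumptions: the helper name `fourier_transform_1d`, the truncation interval and the step count are illustrative choices that do not come from the notes, and the integral is approximated by the trapezoid rule.

```python
import cmath
import math


def fourier_transform_1d(f, xi, half_width=8.0, steps=16000):
    """Trapezoid-rule approximation of the integral of
    exp(-2*pi*i*x*xi) * f(x) over [-half_width, half_width].
    Only meaningful when f is negligible outside that interval."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cmath.exp(-2j * math.pi * x * xi) * f(x)
    return total * h


def gaussian(x):
    # The function Gamma of Example 2 in dimension one.
    return math.exp(-math.pi * x * x)


for xi in (0.0, 0.5, 1.0, 2.0):
    approx = fourier_transform_1d(gaussian, xi)
    exact = math.exp(-math.pi * xi * xi)
    # The L^1 norm of the Gaussian is 1, so (2) predicts |approx| <= 1.
    print(xi, abs(approx - exact), abs(approx))
```

The printed differences should be at the level of rounding error, since the trapezoid rule is very accurate for rapidly decaying analytic integrands.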

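Proposition 1.2 If $\mu \in M(\mathbb{R}^n)$ then $\hat\mu$ is a continuous function.

Proof Fix $\xi$ and consider
\[ \hat\mu(\xi + h) = \int e^{-2\pi i x \cdot (\xi + h)}\,d\mu(x). \]
As $h \to 0$ the integrands converge pointwise to $e^{-2\pi i x \cdot \xi}$. Since all the integrands have absolute value 1 and $|\mu|(\mathbb{R}^n) < \infty$, the result follows from the dominated convergence theorem.

We now list some basic formulas for the Fourier transform; the ones listed here are roughly speaking those that do not involve any differentiations. They can all be proved by using the formula $e^{a+b} = e^a e^b$ and appropriate changes of variables. Let $f \in L^1$, $\tau \in \mathbb{R}^n$, and let $T$ be an invertible linear map from $\mathbb{R}^n$ to $\mathbb{R}^n$.

1. Let $f_\tau(x) = f(x - \tau)$. Then
\[ \widehat{f_\tau}(\xi) = e^{-2\pi i \tau \cdot \xi} \hat f(\xi). \tag{3} \]

2. Let $e_\tau(x) = e^{2\pi i x \cdot \tau}$. Then
\[ \widehat{e_\tau f}(\xi) = \hat f(\xi - \tau). \tag{4} \]

3. Let $T^{-t}$ be the inverse transpose of $T$. Then
\[ \widehat{f \circ T} = |\det(T)|^{-1}\, \hat f \circ T^{-t}. \tag{5} \]

4. Define $\tilde f(x) = \overline{f(-x)}$. Then
\[ \hat{\tilde f} = \overline{\hat f}. \tag{6} \]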
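Formulas (3), (4) and the one-dimensional dilation case of (5) can be tested on the Gaussian of Example 2, whose transform is known in closed form. This is a hedged sketch: the helper `ft`, the sample values of `tau`, `r`, `xi`, and the quadrature parameters are assumptions made for illustration only.

```python
import cmath
import math


def ft(f, xi, half_width=12.0, steps=24000):
    # Trapezoid-rule approximation of the Fourier integral on [-half_width, half_width].
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * cmath.exp(-2j * math.pi * x * xi) * f(x)
    return total * h


def gauss(x):
    return math.exp(-math.pi * x * x)


def gauss_hat(xi):
    # Example 2: the Gaussian exp(-pi x^2) is its own Fourier transform.
    return math.exp(-math.pi * xi * xi)


tau, r, xi = 0.7, 2.0, 0.9

# (3): translation by tau becomes multiplication by exp(-2 pi i tau xi).
lhs3 = ft(lambda x: gauss(x - tau), xi)
rhs3 = cmath.exp(-2j * math.pi * tau * xi) * gauss_hat(xi)

# (4): multiplication by exp(2 pi i x tau) becomes translation by tau.
lhs4 = ft(lambda x: cmath.exp(2j * math.pi * x * tau) * gauss(x), xi)
rhs4 = gauss_hat(xi - tau)

# (5) with T x = r x in dimension one: |det T|^{-1} = 1/r and T^{-t} xi = xi / r.
lhs5 = ft(lambda x: gauss(r * x), xi)
rhs5 = gauss_hat(xi / r) / r

print(abs(lhs3 - rhs3), abs(lhs4 - rhs4), abs(lhs5 - rhs5))
```

All three printed differences should be close to machine precision.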

We note some special cases of 3. If $T$ is an orthogonal transformation (i.e. $TT^t$ is the identity map) then $\widehat{f \circ T} = \hat f \circ T$, since $\det(T) = \pm 1$. In particular, this implies that if $f$ is radial then so is $\hat f$, since orthogonal transformations act transitively on spheres. If $T$ is a dilation, i.e. $Tx = r \cdot x$ for some $r > 0$, then 3. says that the Fourier transform of the function $f(rx)$ is the function $r^{-n} \hat f(r^{-1}\xi)$. Replacing $r$ with $r^{-1}$ and multiplying through by $r^{-n}$, we see that the reverse formula also holds: the Fourier transform of the function $r^{-n} f(r^{-1}x)$ is the function $\hat f(r\xi)$.

There is a general principle that if $f$ is localized in space, then $\hat f$ should be smooth, and conversely if $f$ is smooth then $\hat f$ should be localized. We now discuss some simple manifestations of this. Let $D(x, r) = \{y \in \mathbb{R}^n : |y - x| < r\}$.

Proposition 1.3 Suppose that $\mu \in M(\mathbb{R}^n)$ and $\operatorname{supp} \mu$ is compact. Then $\hat\mu$ is $C^\infty$ and
\[ D^\alpha \hat\mu = \widehat{(-2\pi i x)^\alpha \mu}. \tag{7} \]
Further, if $\operatorname{supp} \mu \subset D(0, R)$ then
\[ \|D^\alpha \hat\mu\|_\infty \le (2\pi R)^{|\alpha|} \|\mu\|. \tag{8} \]

We are using multiindex notation here and will do so below as well. Namely, a multiindex is a vector $\alpha \in \mathbb{R}^n$ whose components are nonnegative integers. If $\alpha$ is a multiindex then by definition
\[ D^\alpha = \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}} \cdots \frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}}, \qquad x^\alpha = \prod_{j=1}^n x_j^{\alpha_j}. \]
The length of $\alpha$, denoted $|\alpha|$, is $\sum_j \alpha_j$. One defines a partial order on multiindices via
\[ \alpha \le \beta \iff \alpha_i \le \beta_i \text{ for each } i, \qquad \alpha < \beta \iff \alpha \le \beta \text{ and } \alpha \ne \beta. \]

Proof of Proposition 1.3 Notice that (8) follows from (7) and Proposition 1.1 since the norm of the measure $(-2\pi i x)^\alpha \mu$ is $\le (2\pi R)^{|\alpha|} \|\mu\|$.

Furthermore, for any $\alpha$ the measure $(-2\pi i x)^\alpha \mu$ is again a finite measure with compact support. Accordingly, if we can prove that $\hat\mu$ is $C^1$ and that (7) holds when $|\alpha| = 1$, then the lemma will follow by a straightforward induction.

Fix then a value $j \in \{1, \ldots, n\}$, and let $e_j$ be the $j$th standard basis vector. Also fix $\xi \in \mathbb{R}^n$, and consider the difference quotient
\[ \Delta(h) = \frac{\hat\mu(\xi + h e_j) - \hat\mu(\xi)}{h}. \tag{9} \]
This is equal to
\[ \int \frac{e^{-2\pi i h x_j} - 1}{h}\, e^{-2\pi i \xi \cdot x}\,d\mu(x). \tag{10} \]
As $h \to 0$, the quantity
\[ \frac{e^{-2\pi i h x_j} - 1}{h} \]
converges pointwise to $-2\pi i x_j$. Furthermore, $\big| \frac{e^{-2\pi i h x_j} - 1}{h} \big| \le 2\pi |x_j|$ for each $h$. Accordingly, the integrands in (10) are dominated by $|2\pi x_j|$, which is a bounded function on the support of $\mu$. It follows by the dominated convergence theorem that
\[ \lim_{h \to 0} \Delta(h) = \int \lim_{h \to 0} \frac{e^{-2\pi i h x_j} - 1}{h}\, e^{-2\pi i \xi \cdot x}\,d\mu(x), \]
which is equal to
\[ \int -2\pi i x_j\, e^{-2\pi i \xi \cdot x}\,d\mu(x). \]
This proves the formula (7) when $|\alpha| = 1$. Formula (7) and Proposition 1.2 imply that $\hat\mu$ is $C^1$.

Remark The estimate (8) is tied to the support of $\mu$. However, the fact that $\hat\mu$ is $C^\infty$ and the formula (7) are still valid whenever $\mu$ has enough decay to justify the differentiations under the integral sign. For example, they are valid if $\mu$ has moments of all orders, i.e. $\int |x|^N\,d|\mu|(x) < \infty$ for all $N$.

The estimate (2) can be seen as justification of the idea that if $\mu$ is localized then $\hat\mu$ should be smooth. We now consider the converse statement, $\mu$ smooth implies $\hat\mu$ localized.

Proposition 1.4 Suppose that $f$ is $C^N$ and that $D^\alpha f \in L^1$ for all $\alpha$ with $0 \le |\alpha| \le N$. Then
\[ \widehat{D^\alpha f}(\xi) = (2\pi i \xi)^\alpha \hat f(\xi) \tag{11} \]
when $|\alpha| \le N$ and furthermore
\[ |\hat f(\xi)| \le C (1 + |\xi|)^{-N} \tag{12} \]
for a suitable constant $C$.

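The proof is based on an integration by parts which is most easily justified when $f$ has compact support. Accordingly, we include the following lemma before giving the proof. Let $\phi : \mathbb{R}^n \to \mathbb{R}$ be a $C^\infty$ function with the following properties (4. is actually irrelevant for present purposes):

1. $\phi(x) = 1$ if $|x| \le 1$
2. $\phi(x) = 0$ if $|x| \ge 2$
3. $0 \le \phi \le 1$.
4. $\phi$ is radial.

Define $\phi_k(x) = \phi(x/k)$; thus $\phi_k$ is similar to $\phi$ but lives on scale $k$ instead of 1. If $\alpha$ is a multiindex, then there is a constant $C_\alpha$ such that $|D^\alpha \phi_k| \le C_\alpha k^{-|\alpha|}$ uniformly in $k$. Furthermore, if $\alpha \ne 0$ then the support of $D^\alpha \phi_k$ is contained in the region $k \le |x| \le 2k$.

Lemma 1.5 If $f$ is $C^N$, $D^\alpha f \in L^1$ for all $\alpha$ with $|\alpha| \le N$ and if we let $f_k = \phi_k f$ then $\lim_{k \to \infty} \|D^\alpha f_k - D^\alpha f\|_1 = 0$ for all $\alpha$ with $|\alpha| \le N$.

Proof It is obvious that
\[ \lim_{k \to \infty} \|\phi_k D^\alpha f - D^\alpha f\|_1 = 0, \]
so it suffices to show that
\[ \lim_{k \to \infty} \|D^\alpha(\phi_k f) - \phi_k D^\alpha f\|_1 = 0. \tag{13} \]
However, by the Leibniz rule
\[ D^\alpha(\phi_k f) - \phi_k D^\alpha f = \sum_{0 < \beta \le \alpha} c_\beta\, D^{\alpha - \beta} f\, D^\beta \phi_k, \]
where the $c_\beta$'s are certain constants. Thus
\[
\begin{aligned}
\|D^\alpha(\phi_k f) - \phi_k D^\alpha f\|_1
&\le C \sum_{0 < \beta \le \alpha} \|D^\beta \phi_k\|_\infty\, \|D^{\alpha - \beta} f\|_{L^1(\{x : |x| \ge k\})} \\
&\le C k^{-1} \sum_{0 < \beta \le \alpha} \|D^{\alpha - \beta} f\|_{L^1(\{x : |x| \ge k\})}.
\end{aligned}
\]
The last line clearly goes to zero as $k \to \infty$. There are two reasons for this (either would suffice): the factor $k^{-1}$, and the fact that the $L^1$ norms are taken only over the region $|x| \ge k$.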

Proof of Proposition 1.4 If $f$ is $C^1$ with compact support, then by integration by parts we have
\[ \int \frac{\partial f}{\partial x_j}(x)\, e^{-2\pi i x \cdot \xi}\,dx = 2\pi i \xi_j \int e^{-2\pi i x \cdot \xi} f(x)\,dx, \]
i.e. (11) holds when $|\alpha| = 1$. An easy induction then proves (11) for all $\alpha$ provided that $f$ is $C^N$ with compact support.

To remove the compact support assumption, let $f_k$ be as in Lemma 1.5. Then (11) holds for $f_k$. Now we pass to the limit as $k \to \infty$. On the one hand $\widehat{D^\alpha f_k}$ converges uniformly to $\widehat{D^\alpha f}$ as $k \to \infty$ by Lemma 1.5 and Proposition 1.1. On the other hand $\hat f_k$ converges uniformly to $\hat f$, so $(2\pi i \xi)^\alpha \hat f_k$ converges to $(2\pi i \xi)^\alpha \hat f$ pointwise. This proves (11) in general.

To prove (12), observe that (11) and Proposition 1.1 imply that $\xi^\alpha \hat f \in L^\infty$ if $|\alpha| \le N$. On the other hand, it is easy to estimate
\[ C_N^{-1} (1 + |\xi|)^N \le \sum_{|\alpha| \le N} |\xi^\alpha| \le C_N (1 + |\xi|)^N, \tag{14} \]
so (12) follows.

Together with (14), let us note the inequality
\[ 1 + |x| \le (1 + |y|)(1 + |x - y|), \qquad x, y \in \mathbb{R}^n, \tag{15} \]
which will be used several times below.
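Inequality (15) is stated without proof. One short way to see it, offered here as a sketch rather than as the argument the notes have in mind, uses only the triangle inequality and the fact that $|y|\,|x - y| \ge 0$:

```latex
\begin{aligned}
1 + |x| &\le 1 + |y| + |x - y| \\
        &\le 1 + |y| + |x - y| + |y|\,|x - y| \\
        &= (1 + |y|)(1 + |x - y|).
\end{aligned}
```

Raising both sides to a power $N \ge 0$ gives the form in which the inequality is typically applied to weights $(1 + |x|)^N$.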

2. Schwartz space

The Schwartz space $\mathcal{S}$ is the space of functions $f : \mathbb{R}^n \to \mathbb{C}$ such that:

1. $f$ is $C^\infty$,
2. $x^\alpha D^\beta f$ is a bounded function for each pair of multiindices $\alpha$ and $\beta$.

For $f \in \mathcal{S}$ we define
\[ \|f\|_{\alpha\beta} = \|x^\alpha D^\beta f\|_\infty. \]
It is possible to see that $\mathcal{S}$ with the family of norms $\|\cdot\|_{\alpha\beta}$ is a Fréchet space, but we don't discuss such questions here (see [27]). However, we define a notion of sequential convergence in $\mathcal{S}$:

A sequence $\{f_k\} \subset \mathcal{S}$ converges in $\mathcal{S}$ to $f \in \mathcal{S}$ if $\lim_{k \to \infty} \|f_k - f\|_{\alpha\beta} = 0$ for each pair of multiindices $\alpha$ and $\beta$.

Examples

1. Let $C_0^\infty$ be the $C^\infty$ functions with compact support. Then $C_0^\infty \subset \mathcal{S}$. Namely, to prove that $x^\alpha D^\beta f$ is bounded, just note that if $f \in C_0^\infty$ then $D^\beta f$ is a continuous function with compact support, hence bounded, and that $x^\alpha$ is a bounded function on the support of $D^\beta f$.

2. Let $f(x) = e^{-\pi |x|^2}$. Then $f \in \mathcal{S}$. For the proof, notice that if $p(x)$ is a polynomial, then any first partial derivative $\frac{\partial}{\partial x_j}\big(p(x) e^{-\pi |x|^2}\big)$ is again of the form $q(x) e^{-\pi |x|^2}$ for some polynomial $q$. It follows by induction that each $D^\alpha f$ is a polynomial times $f$ for each $\alpha$. Hence $x^\alpha D^\beta f$ is a polynomial times $f$ for each $\alpha$ and $\beta$. This implies using L'Hospital's rule that $x^\alpha D^\beta f$ is bounded for each $\alpha$ and $\beta$.

3. The following functions are not in $\mathcal{S}$: $f_N(x) = (1 + |x|^2)^{-N}$ for any given $N$, and $g(x) = e^{-|x|^2} \sin(e^{|x|^2})$. Roughly, although $f_N$ decays rapidly at $\infty$, it does not decay rapidly enough, whereas $g$ decays rapidly enough but its derivatives do not decay. Detailed verification is left to the reader.

We now discuss some simple properties of $\mathcal{S}$, then some which are slightly less simple.

I. $\mathcal{S}$ is closed under differentiations and under multiplication by polynomials. Furthermore, these operations are continuous on $\mathcal{S}$ in the sense that they preserve sequential convergence. Also $f, g \in \mathcal{S}$ implies $fg \in \mathcal{S}$.

Proof Let $f \in \mathcal{S}$. If $\gamma$ is a multiindex, then $x^\alpha D^\beta (D^\gamma f) = x^\alpha D^{\beta + \gamma} f$, which is bounded since $f \in \mathcal{S}$. So $D^\gamma f \in \mathcal{S}$.

By the Leibniz rule, $x^\alpha D^\beta (x^\gamma f)$ is a finite sum of terms each of which is a constant multiple of $x^\alpha D^\delta(x^\gamma) D^{\beta - \delta} f$ for some $\delta \le \beta$. Furthermore, $D^\delta(x^\gamma)$ is a constant multiple of $x^{\gamma - \delta}$ if $\delta \le \gamma$, and otherwise is zero. Thus $x^\alpha D^\beta (x^\gamma f)$ is a linear combination of monomials times derivatives of $f$, and is therefore bounded. So $x^\gamma f \in \mathcal{S}$.

The continuity statements follow from the proofs of the closure statements; we will normally omit these arguments. As an indication of how they are done, let us show that if $\gamma$ is a multiindex then $f \to D^\gamma f$ is continuous. Suppose that $f_n \to f$ in $\mathcal{S}$. Fix a pair of multiindices $\alpha$ and $\beta$. Applying the definition of convergence with the multiindices $\alpha$ and $\beta + \gamma$, we have
\[ \lim_{n \to \infty} \|x^\alpha D^{\beta + \gamma}(f_n - f)\|_\infty = 0. \]
Equivalently,
\[ \lim_{n \to \infty} \|x^\alpha D^\beta (D^\gamma f_n - D^\gamma f)\|_\infty = 0, \]
which says that $D^\gamma f_n$ converges to $D^\gamma f$.

The last statement (that $\mathcal{S}$ is an algebra) follows readily from the product rule and the definitions.

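II. The following alternate definitions of $\mathcal{S}$ are often useful.
\[ f \in \mathcal{S} \iff (1 + |x|)^N D^\alpha f \text{ is bounded for each } N \text{ and } \alpha, \tag{16} \]
\[ f \in \mathcal{S} \iff \lim_{x \to \infty} x^\alpha D^\beta f = 0 \text{ for each } \alpha \text{ and } \beta. \tag{17} \]
Indeed, (16) follows from the definition and (14). The backward implication in (17) is trivial, while the forward implication follows by applying the definition with $\alpha$ replaced by appropriate larger multiindices, e.g. $\alpha + e_j$ for arbitrary $j \in \{1, \ldots, n\}$.

Proposition 2.1 $C_0^\infty$ is dense in $\mathcal{S}$, i.e. for any $f \in \mathcal{S}$ there is a sequence $\{f_k\} \subset C_0^\infty$ with $f_k \to f$ in $\mathcal{S}$.

Proof This is almost the same as the proof of Lemma 1.5. Namely, define $\phi_k$ as there and consider $f_k = \phi_k f$, which is evidently in $C_0^\infty$. We must show that $x^\alpha D^\beta(\phi_k f) \to x^\alpha D^\beta f$ uniformly as $k \to \infty$. For this, we estimate
\[ \|x^\alpha D^\beta(\phi_k f) - x^\alpha D^\beta f\|_\infty \le \|\phi_k x^\alpha D^\beta f - x^\alpha D^\beta f\|_\infty + \|x^\alpha (D^\beta(\phi_k f) - \phi_k D^\beta f)\|_\infty. \]
The first term is bounded by $\sup_{|x| \ge k} |x^\alpha D^\beta f|$ and therefore goes to zero as $k \to \infty$ by (17). The second term is estimated using the Leibniz rule by
\[ C \sum_{\gamma < \beta} \|x^\alpha D^\gamma f\|_\infty\, \|D^{\beta - \gamma} \phi_k\|_\infty. \tag{18} \]
Since $f \in \mathcal{S}$ and $\|D^{\beta - \gamma} \phi_k\|_\infty \le C/k$, the expression (18) goes to zero as $k \to \infty$.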

There is a stronger density statement which is sometimes needed. Define a $C_0^\infty$ tensor function to be a function $f : \mathbb{R}^n \to \mathbb{C}$ of the form
\[ f(x) = \prod_j \phi_j(x_j), \]
where each $\phi_j \in C_0^\infty(\mathbb{R})$.

Proposition 2.1′ Linear combinations of $C_0^\infty$ tensor functions are dense in $\mathcal{S}$.

Proof In view of Proposition 2.1 it suffices to show that if $f \in C_0^\infty$ then there is a sequence $\{g_k\}$ such that:

1. Each $g_k$ is a linear combination of $C_0^\infty$ tensor functions.
2. The supports of the $g_k$ are contained in a fixed compact set $E$ which is independent of $k$.
3. $D^\alpha g_k$ converges uniformly to $D^\alpha f$ for each $\alpha$.

To construct $\{g_k\}$, we use the fact (a basic fact about Fourier series) that if $f$ is a $C^\infty$ function in $\mathbb{R}^n$ which is $2\pi$-periodic in each variable then $f$ can be expanded in a series
\[ f(\theta) = \sum_{\nu \in \mathbb{Z}^n} a_\nu e^{i \nu \cdot \theta} \]
where the $\{a_\nu\}$ satisfy
\[ \sum_\nu (1 + |\nu|)^N |a_\nu| < \infty \]
for each $N$. Considering partial sums of the Fourier series, we therefore obtain a sequence of trigonometric polynomials $p_k$ such that $D^\alpha p_k$ converges uniformly to $D^\alpha f$ for each $\alpha$.

In constructing $\{g_k\}$ we can assume that $x \in \operatorname{supp} f$ implies $|x_j| \le 1$, say, for each $j$; otherwise we work with $f(Rx)$ for suitable fixed $R$ instead and undo the rescaling at the end. Let $\phi$ be a $C_0^\infty$ function of one variable which is equal to 1 on $[-1, 1]$ and vanishes outside $[-2, 2]$. Let $\tilde f$ be the function which is equal to $f$ on $[-\pi, \pi] \times \cdots \times [-\pi, \pi]$ and is $2\pi$-periodic in each variable. Then we have a sequence of trigonometric polynomials $p_k$ such that $D^\alpha p_k$ converges uniformly to $D^\alpha \tilde f$ for each $\alpha$. Let
\[ g_k(x) = \prod_{j=1}^n \phi(x_j) \cdot p_k(x). \]
Then $g_k$ clearly satisfies 1. and 2., and an argument with the product rule as in Lemma 1.5 and Proposition 2.1 will show that $\{g_k\}$ satisfies 3. The proof is complete.

The next proposition is an alternate definition of $\mathcal{S}$ using $L^1$ instead of $L^\infty$ norms.

Proposition 2.2 A $C^\infty$ function $f$ is in $\mathcal{S}$ iff the norms $\|x^\alpha D^\beta f\|_1$ are finite for each $\alpha$ and $\beta$. Furthermore, a sequence $\{f_k\} \subset \mathcal{S}$ converges in $\mathcal{S}$ to $f \in \mathcal{S}$ iff
\[ \lim_{k \to \infty} \|x^\alpha D^\beta (f_k - f)\|_1 = 0 \]
for each $\alpha$ and $\beta$.

Proof We only prove the first part; the equivalence of the two notions of convergence follows from the proof and is left to the reader. First suppose that $f \in \mathcal{S}$. Fix $\alpha$ and $\beta$. Let $N = |\alpha| + n + 1$. Then we know that $(1 + |x|)^N D^\beta f$ is bounded. Accordingly,
\[ \|x^\alpha D^\beta f\|_1 \le \|(1 + |x|)^N D^\beta f\|_\infty\, \|x^\alpha (1 + |x|)^{-N}\|_1 < \infty, \]
using that the function $(1 + |x|)^{-n-1}$ is integrable.

For the converse, we first make a definition and state a lemma. If $f : \mathbb{R}^n \to \mathbb{C}$ is $C^k$ and if $x \in \mathbb{R}^n$ then
\[ \Delta_k^f(x) \overset{\text{def}}{=} \sum_{|\alpha| = k} |D^\alpha f(x)|. \]
We denote $D(x, r) = \{y : |x - y| \le r\}$. We also now start to use the notation $x \lesssim y$ to mean that $x \le Cy$ where $C$ is a fixed but unspecified constant.

Lemma 2.2 Suppose $f$ is a $C^\infty$ function. Then for any $x$
\[ |f(x)| \lesssim \sum_{0 \le j \le n+1} \|\Delta_j^f\|_{L^1(D(x,1))}. \]

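This is contained in Lemma A2 which is stated and proved at the end of the section. To finish the proof of Proposition 2.2, we apply the preceding lemma to $D^\beta f$. This gives
\[ |D^\beta f(x)| \lesssim \sum_{|\beta| \le |\gamma| \le |\beta| + n + 1} \int_{D(x,1)} |D^\gamma f(y)|\,dy, \]
therefore
\[
\begin{aligned}
(1 + |x|)^N |D^\beta f(x)|
&\lesssim (1 + |x|)^N \sum_{|\beta| \le |\gamma| \le |\beta| + n + 1} \int_{D(x,1)} |D^\gamma f(y)|\,dy \\
&\lesssim \sum_{|\beta| \le |\gamma| \le |\beta| + n + 1} \int_{D(x,1)} (1 + |y|)^N |D^\gamma f(y)|\,dy,
\end{aligned}
\]
where we used the elementary inequality
\[ 1 + |x| \le 2 \min_{y \in D(x,1)} (1 + |y|). \]
It follows that
\[ \|(1 + |x|)^N D^\beta f\|_\infty \lesssim \sum_{|\beta| \le |\gamma| \le |\beta| + n + 1} \|(1 + |x|)^N D^\gamma f\|_1, \]
and then Proposition 2.2 follows from (14).

Theorem 2.3 If $f \in \mathcal{S}$ then $\hat f \in \mathcal{S}$. Furthermore, the map $f \to \hat f$ is continuous from $\mathcal{S}$ to $\mathcal{S}$.

Proof As usual we explicitly prove only the first statement. If $f \in \mathcal{S}$ then $f \in L^1$, so $\hat f$ is bounded. Thus if $f \in \mathcal{S}$, then $\widehat{D^\alpha x^\beta f}$ is bounded for any given $\alpha$ and $\beta$, since $D^\alpha x^\beta f$ is again in $\mathcal{S}$. However, Propositions 1.3 and 1.4 imply that
\[ \widehat{D^\alpha x^\beta f}(\xi) = (2\pi i)^{|\alpha|} (-2\pi i)^{-|\beta|}\, \xi^\alpha D^\beta \hat f(\xi), \]
so $\xi^\alpha D^\beta \hat f$ is again bounded, which means that $\hat f \in \mathcal{S}$.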

Appendix: pointwise Poincaré inequalities

This is a little more technical than the preceding and we will omit some details. We prove a frequently used pointwise estimate for a function in terms of integrals of its gradient, which plays a role similar to the one the mean value inequality plays in calculus. Then we prove a generalization involving higher derivatives which includes Lemma 2.3.

Let $\omega$ be the volume of the unit ball.

Lemma A1 Suppose that $f$ is $C^1$. Then
\[ \Big| f(x) - \frac{1}{\omega} \int_{D(x,1)} f(y)\,dy \Big| \lesssim \int_{D(x,1)} \frac{|\nabla f(y)|}{|x - y|^{n-1}}\,dy. \]

Proof Applying the fundamental theorem of calculus to the function $t \to f(x + t(y - x))$ shows that
\[ |f(y) - f(x)| \le |x - y| \int_0^1 |\nabla f(x + t(y - x))|\,dt. \]
Integrate this with respect to $y$ over $D(x, 1)$ and divide by $\omega$. Thus
\[
\begin{aligned}
\Big| f(x) - \frac{1}{\omega} \int_{D(x,1)} f(y)\,dy \Big|
&\le \frac{1}{\omega} \int_{D(x,1)} |f(x) - f(y)|\,dy \\
&\le \frac{1}{\omega} \int_{D(x,1)} |x - y| \int_0^1 |\nabla f(x + t(y - x))|\,dt\,dy \\
&= \frac{1}{\omega} \int_0^1 \int_{D(x,1)} |x - y|\, |\nabla f(x + t(y - x))|\,dy\,dt.
\end{aligned}
\tag{19}
\]
Make the change of variables $z = x + t(y - x)$, and then reverse the order of integration again. This leads to
\[
\begin{aligned}
(19) &= \frac{1}{\omega} \int_{t=0}^1 \int_{D(x,t)} t^{-1} |z - x|\, |\nabla f(z)|\,\frac{dz}{t^n}\,dt \\
&= \frac{1}{\omega} \int_{D(x,1)} |z - x|\, |\nabla f(z)| \int_{t = |z - x|}^1 t^{-(n+1)}\,dt\,dz \\
&\lesssim \int_{D(x,1)} |x - z|^{-(n-1)} |\nabla f(z)|\,dz
\end{aligned}
\]
as claimed.

Lemma A.2 Suppose that $f$ is $C^k$. Then
\[
|f(x)| \lesssim \sum_{0 \le j < k} \|\Delta_j^f\|_{L^1(D(x,1))} +
\begin{cases}
\int_{D(x,1)} |x - y|^{-(n-k)} \Delta_k^f(y)\,dy & \text{if } 1 \le k \le n - 1, \\
\int_{D(x,1)} \log \frac{1}{|x - y|}\, \Delta_n^f(y)\,dy & \text{if } k = n, \\
\|\Delta_{n+1}^f\|_{L^1(D(x,1))} & \text{if } k = n + 1.
\end{cases}
\tag{20}
\]
The case $k = n + 1$ is Lemma 2.2.

Proof The case $k = 1$ follows immediately from Lemma A.1, so it has already been proved. To pass to general $k$ we use induction based on the inequalities (here $a > 0$, $b > 0$, $|z - x| \le$ constant)
\[
\int_{y \in D(x,C)} |x - y|^{-(n-a)} |z - y|^{-(n-b)}\,dy \lesssim
\begin{cases}
|x - z|^{-(n-a-b)} & \text{if } a + b < n, \\
\log \frac{C}{|z - x|} & \text{if } a + b = n, \\
C & \text{if } a + b > n,
\end{cases}
\tag{21}
\]
and
\[ \int_{y \in D(x,C)} |x - y|^{-(n-1)} \log \frac{1}{|z - y|}\,dy \le C. \tag{22} \]
In fact, (21) may be proved by subdividing the region of integration in the three regimes $|y - x| \le \frac{1}{2}|z - x|$, $|y - z| \le \frac{1}{2}|z - x|$ and "the rest", and noting that on the third regime the integrand is comparable to $|y - x|^{-(2n-a-b)}$. (22) may be proved similarly.

We now prove (20) by induction on $k$. We have done the case $k = 1$. Suppose that $2 \le k \le n - 1$ and that the cases up to and including $k - 1$ have been proved. Then
\[
\begin{aligned}
|f(x)| &\lesssim \sum_{j \le k-2} \|\Delta_j^f\|_{L^1(D(x,1))} + \int_{D(x,1)} |x - y|^{-(n-k+1)} \Delta_{k-1}^f(y)\,dy \\
&\lesssim \sum_{j \le k-2} \|\Delta_j^f\|_{L^1(D(x,1))} + \int_{D(x,1)} |x - y|^{-(n-k+1)} \int_{D(y,1)} \Delta_{k-1}^f(z)\,dz\,dy \\
&\qquad + \int_{D(x,1)} |x - y|^{-(n-k+1)} \int_{D(y,1)} |y - z|^{-(n-1)} \Delta_k^f(z)\,dz\,dy \\
&\lesssim \sum_{j \le k-1} \|\Delta_j^f\|_{L^1(D(x,2))} + \int_{D(x,2)} |x - z|^{-(n-k)} \Delta_k^f(z)\,dz.
\end{aligned}
\]
For the first two inequalities we used (20) with $k$ replaced by $k - 1$ and 1 respectively, and for the last inequality we reversed the order of integration and used (21). The disc $D(x, 2)$ can be replaced by $D(x, 1)$ using rescaling, so we have proved (20) for $k \le n - 1$. To pass from $k = n - 1$ to $k = n$ we argue similarly using the second case of (21), and to pass from $k = n$ to $k = n + 1$ we argue similarly using (22).

3. Fourier inversion and Plancherel

Convolution of $\phi$ and $f$ is defined as follows:
\[ \phi * f(x) = \int \phi(y) f(x - y)\,dy. \tag{23} \]
We assume the reader has seen this definition before but will summarize some facts, mostly without giving the proofs. There is an issue of the appropriate conditions on $\phi$ and $f$ under which the integral (23) makes sense. We recall the following.

1. If $\phi \in L^1$ and $f \in L^p$, $1 \le p \le \infty$, then the integral (23) is an absolutely convergent Lebesgue integral for a.e. $x$ and
\[ \|\phi * f\|_p \le \|\phi\|_1 \|f\|_p. \tag{24} \]

2. If $\phi$ is a continuous function with compact support and $f \in L^1_{loc}$, then the integral (23) is an absolutely convergent Lebesgue integral for every $x$ and $\phi * f$ is continuous.

3. If $\phi \in L^p$ and $f \in L^{p'}$, $\frac{1}{p} + \frac{1}{p'} = 1$, then the integral (23) is an absolutely convergent Lebesgue integral for every $x$, and $\phi * f$ is continuous. Furthermore,
\[ \|\phi * f\|_\infty \le \|\phi\|_p \|f\|_{p'}. \tag{25} \]

The absolute convergence of (23) in 2. is trivial, and (25) follows from Hölder's inequality (Cauchy-Schwarz when $p = 2$). Continuity then follows from the dominated convergence theorem. 1. is obvious when $p = \infty$. It is also true for $p = 1$, by Fubini's theorem and a change of variables. The general case follows by interpolation, see the Riesz-Thorin theorem in Section 4.

In any of the above situations convolution is commutative: the integral defining $f * \phi$ is again convergent for the same values of $x$, and $f * \phi = \phi * f$. This follows by making the change of variables $y \to x - y$. Notice also that
\[ \operatorname{supp}(\phi * f) \subset \operatorname{supp} \phi + \operatorname{supp} f, \]
where the sum $E + F$ means $\{x + y : x \in E, y \in F\}$.

In many applications the function $\phi$ is fixed and very nice, and one considers convolution as an operator $f \to \phi * f$.
Lemma 3.1 If $\phi \in C_0^\infty$ and $f \in L^1_{loc}$ then $\phi * f$ is $C^\infty$ and
\[ D^\alpha(\phi * f) = (D^\alpha \phi) * f. \tag{26} \]

Proof It is enough to prove that $\phi * f$ is $C^1$ and (26) holds for multiindices of length 1, since one can then use induction. Fix $j$ and consider difference quotients
\[ d(h) = \frac{1}{h}\big( (\phi * f)(x + h e_j) - (\phi * f)(x) \big). \]
Using (23) and commutativity of convolution, we can rewrite this as
\[ d(h) = \int \frac{1}{h}\big( \phi(x + h e_j - y) - \phi(x - y) \big) f(y)\,dy. \]
The quotients
\[ A_h(y) = \frac{1}{h}\big( \phi(x + h e_j - y) - \phi(x - y) \big) \]
are bounded by $\big\| \frac{\partial \phi}{\partial x_j} \big\|_\infty$ by the mean value theorem. For fixed $x$ and for $|h| \le 1$, the support of $A_h$ is contained in the fixed compact set $E = D(x, 1) + \operatorname{supp} \phi$. Thus the integrands $A_h f$ are dominated by the $L^1$ function $\big\| \frac{\partial \phi}{\partial x_j} \big\|_\infty \chi_E |f|$. The dominated convergence theorem implies
\[ \lim_{h \to 0} d(h) = \int f(y) \lim_{h \to 0} A_h(y)\,dy = \int f(y) \frac{\partial \phi}{\partial x_j}(x - y)\,dy. \]
This proves (26) (when $|\alpha| = 1$). The continuity of the partials then follows from 2. above.

Corollary 3.1 If $f, g \in \mathcal{S}$ then $f * g \in \mathcal{S}$.

Proof By Lemma 3.1 it suffices to show that if $(1 + |x|)^N f(x)$ and $(1 + |x|)^N g(x)$ are bounded for every $N$ then so is $(1 + |x|)^N f * g(x)$. This follows by writing out the definitions and using (15); the details are left to the reader.

Convolution interacts with the Fourier transform as follows: Fourier transform converts convolution to ordinary pointwise multiplication. Thus we have the following formulas:
\[ \widehat{f * g} = \hat f \hat g, \qquad f, g \in L^1, \tag{27} \]
\[ \widehat{fg} = \hat f * \hat g, \qquad f, g \in \mathcal{S}. \tag{28} \]
(27) follows from Fubini's theorem and is in many textbooks; the proof is left to the reader. (28) then follows easily from the inversion theorem, so we defer the proof until after Theorem 3.4.

Let $\phi \in \mathcal{S}$, and assume that $\int \phi = 1$. Define $\phi^\epsilon(x) = \epsilon^{-n} \phi(\epsilon^{-1} x)$. The family of functions $\{\phi^\epsilon\}$ is called an approximate identity. Notice that $\int \phi^\epsilon = 1$ for all $\epsilon$. Thus one can regard the $\phi^\epsilon$ as roughly convergent to the Dirac mass $\delta_0$ as $\epsilon \to 0$. Indeed, the following fact is basic but quite standard; see any reasonable book on real analysis for the proof.

Lemma 3.2 Let $\phi \in \mathcal{S}$ and $\int \phi = 1$. Then:

1. If $f$ is a continuous function which goes to zero at $\infty$ then $\phi^\epsilon * f \to f$ uniformly as $\epsilon \to 0$.
2. If $f \in L^p$, $1 \le p < \infty$ then $\phi^\epsilon * f \to f$ in $L^p$ as $\epsilon \to 0$.
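Part 1 of Lemma 3.2 can be observed numerically in dimension one. The sketch below is illustrative only: it takes $\phi$ to be the Gaussian of Example 2 and $f(x) = 1/(1 + x^2)$, both chosen here rather than taken from the notes, approximates the convolution by the trapezoid rule, and measures the error on a finite grid instead of computing a true supremum.

```python
import math


def phi(x):
    # Gaussian with integral 1 (Example 2 evaluated at xi = 0).
    return math.exp(-math.pi * x * x)


def f(x):
    # A continuous function that tends to zero at infinity.
    return 1.0 / (1.0 + x * x)


def mollify(func, eps, x, cutoff=6.0, steps=1200):
    """Approximate (phi^eps * func)(x), where phi^eps(y) = phi(y / eps) / eps.

    Substituting y = eps * u turns the convolution integral into the
    integral of phi(u) * func(x - eps * u) du, truncated here to |u| <= cutoff."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for k in range(steps + 1):
        u = -cutoff + k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * phi(u) * func(x - eps * u)
    return total * h


def grid_sup_error(eps, half_width=10.0, points=401):
    # Largest error on a uniform grid; a proxy for the uniform norm.
    xs = [-half_width + 2.0 * half_width * i / (points - 1) for i in range(points)]
    return max(abs(mollify(f, eps, x) - f(x)) for x in xs)


errors = [grid_sup_error(eps) for eps in (1.0, 0.5, 0.25, 0.125)]
print(errors)
```

The errors should shrink as `eps` decreases, roughly like `eps**2` for this smooth `f`.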

Let us note the following corollary:

Lemma 3.3 Suppose $f \in L^1_{loc}$. Then there is a fixed sequence $\{g_k\} \subset C_0^\infty$ such that if $p \in [1, \infty)$ and $f \in L^p$, then $g_k \to f$ in $L^p$. If $f$ is continuous and goes to zero at $\infty$, then $g_k \to f$ uniformly.

The reason for stating the lemma in this way is that one sometimes has to deal with several notions of convergence simultaneously, e.g., $L^1$ and $L^2$ convergence, and it is convenient to be able to approximate $f$ in both norms simultaneously.

Proof Let $\psi \in C_0^\infty$, $\int \psi = 1$, $\psi \ge 0$, and let $\phi$ be as in Lemma 1.5. Fix a sequence $\epsilon_k \downarrow 0$. Let $g_k(x) = \phi(\frac{x}{k}) \cdot (\psi^{\epsilon_k} * f)$. If $f \in L^p$, then for large $k$ the quantity $\|\psi^{\epsilon_k} * f\|_{L^p(|x| \ge k)}$ is bounded by $\|f\|_{L^p(|x| \ge k-1)}$ using (24) and that $\operatorname{supp} \psi^{\epsilon_k}$ is contained in $D(0, 1)$. Accordingly, $\|g_k - \psi^{\epsilon_k} * f\|_p \to 0$ as $k \to \infty$. On the other hand, $\|\psi^{\epsilon_k} * f - f\|_p \to 0$ by Lemma 3.2. If $f$ is continuous and goes to zero at $\infty$, then one can argue the same way using the first part of Lemma 3.2. Smoothness of $g_k$ follows from Lemma 3.1, so the proof is complete.

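Theorem 3.4 (Fourier inversion) Suppose that $f \in L^1$, and assume that $\hat f$ is also in $L^1$. Then for a.e. $x$,
\[ f(x) = \int \hat f(\xi) e^{2\pi i x \cdot \xi}\,d\xi. \tag{29} \]
Equivalently,
\[ \hat{\hat f}(x) = f(-x) \text{ for a.e. } x. \tag{30} \]

The proof uses Lemma 3.2 and also the following facts:

A. The gaussian $\Gamma(x) = e^{-\pi |x|^2}$ satisfies $\hat\Gamma = \Gamma$, and therefore also satisfies (30). So at any rate there is one function $f$ for which Theorem 3.4 is true. In fact this implies that there are many such functions. Indeed, if we form the functions
\[ \Gamma_\epsilon(x) = e^{-\pi \epsilon^2 |x|^2} \]
then we have
\[ \widehat{\Gamma_\epsilon}(\xi) = \epsilon^{-n} e^{-\pi |\xi|^2 / \epsilon^2}. \tag{31} \]
Applying this again with $\epsilon$ replaced by $\epsilon^{-1}$, one can verify that $\Gamma_\epsilon$ satisfies (30). See the discussion after formula (5).

B. The duality relation for the Fourier transform, i.e., the following lemma.

Lemma 3.5 Suppose that $\mu \in M(\mathbb{R}^n)$ and $\nu \in M(\mathbb{R}^n)$. Then
\[ \int \hat\mu\,d\nu = \int \hat\nu\,d\mu. \tag{32} \]
In particular, if $f, g \in L^1$, then
\[ \int \hat f(x) g(x)\,dx = \int f(x) \hat g(x)\,dx. \]

Proof This follows from Fubini's theorem:
\[
\begin{aligned}
\int \hat\mu\,d\nu &= \int \int e^{-2\pi i \xi \cdot x}\,d\mu(x)\,d\nu(\xi) \\
&= \int \int e^{-2\pi i \xi \cdot x}\,d\nu(\xi)\,d\mu(x) \\
&= \int \hat\nu\,d\mu.
\end{aligned}
\]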

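Proof of Theorem 3.4 Consider the integral in (29) with a damping factor included:
\[ I_\epsilon(x) = \int \hat f(\xi) e^{-\pi \epsilon^2 |\xi|^2} e^{2\pi i \xi \cdot x}\,d\xi. \tag{33} \]
We evaluate the limit as $\epsilon \to 0$ in two different ways.

1. As $\epsilon \to 0$, $I_\epsilon(x) \to \int \hat f(\xi) e^{2\pi i \xi \cdot x}\,d\xi$ for each fixed $x$. This follows from the dominated convergence theorem, since $\hat f \in L^1$.

2. With $x$ and $\epsilon$ fixed, define $g(\xi) = e^{-\pi \epsilon^2 |\xi|^2} e^{2\pi i \xi \cdot x}$. Thus
\[ I_\epsilon(x) = \int f(y) \hat g(y)\,dy \]
by Lemma 3.5. On the other hand, we can evaluate $\hat g$ using the fact that $g(\xi) = e_x(\xi) \Gamma_\epsilon(\xi)$ and (4), (31). Thus
\[ \hat g(y) = \widehat{\Gamma_\epsilon}(y - x) = \Gamma^\epsilon(x - y), \]
where $\Gamma^\epsilon(y) = \epsilon^{-n} \Gamma(\epsilon^{-1} y)$ is an approximate identity as in Lemma 3.2, and we have used that $\Gamma$ is even. Accordingly,
\[ I_\epsilon = f * \Gamma^\epsilon, \]
and we conclude by Lemma 3.2 that $I_\epsilon \to f$ in $L^1$ as $\epsilon \to 0$.

Summing up, we have seen that the functions $I_\epsilon$ converge pointwise to $\int \hat f(\xi) e^{2\pi i \xi \cdot x}\,d\xi$, and converge in $L^1$ to $f$. This is only possible when (29) holds.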
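The damped integrals $I_\epsilon$ of (33) can be evaluated numerically for a concrete example. The sketch below assumes $n = 1$ and the test function $f(x) = e^{-|x|}$, whose transform $2/(1 + 4\pi^2 \xi^2)$ is a standard hand computation and is not taken from the notes. The function names, the truncation interval and the step count are likewise illustrative assumptions.

```python
import cmath
import math


def f(x):
    return math.exp(-abs(x))


def f_hat(xi):
    # Fourier transform of exp(-|x|) in dimension one:
    # the integral of exp(-|x|) exp(-2 pi i x xi) dx equals 2 / (1 + 4 pi^2 xi^2).
    return 2.0 / (1.0 + 4.0 * math.pi ** 2 * xi ** 2)


def damped_inverse(x, eps, half_width=60.0, steps=12000):
    """Trapezoid approximation of I_eps(x) from (33):
    the integral of f_hat(xi) exp(-pi eps^2 xi^2) exp(2 pi i x xi) d xi,
    truncated to |xi| <= half_width (adequate for the eps values used below)."""
    h = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        xi = -half_width + k * h
        w = 0.5 if k in (0, steps) else 1.0
        damping = math.exp(-math.pi * eps * eps * xi * xi)
        total += w * f_hat(xi) * damping * cmath.exp(2j * math.pi * x * xi)
    return (total * h).real


x = 0.5
for eps in (0.4, 0.2, 0.1, 0.05):
    print(eps, abs(damped_inverse(x, eps) - f(x)))
```

The printed errors should decrease toward zero as `eps` shrinks, in line with the identity $I_\epsilon = f * \Gamma^\epsilon$ from the proof.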

Corollaries of the inversion theorem

The first corollary below is not really a corollary, but a reformulation of the proof without the assumption that $\hat f \in L^1$. This is the form the inversion theorem takes for general $f$. Notice that the integrals $I_\epsilon$ are well defined for any $f \in L^1$, since the Gaussian $e^{-\pi \epsilon^2 |x|^2}$ is integrable for each fixed $\epsilon$. Corollary 3.6 is often stated as "the Fourier transform of $f$ is Gauss-Weierstrass summable to $f$", and can be compared to the theorem on Cesàro summability for Fourier series.

Corollary 3.6
1. Suppose $f \in L^1$ and define $I_\epsilon(x)$ via (33). Then $I_\epsilon \to f$ in $L^1$ as $\epsilon \to 0$.
2. If $1 < p < \infty$ and additionally $f \in L^p$, then $I_\epsilon \to f$ in $L^p$ as $\epsilon \to 0$. If instead $f$ is continuous and goes to zero at $\infty$, then $I_\epsilon \to f$ uniformly.

Proof This follows from the preceding argument showing that $I_\epsilon = f * \Gamma^\epsilon$, together with Lemma 3.2.

Corollary 3.7 If $f \in L^1$ and $\hat f = 0$, then $f = 0$.

This is immediate from Theorem 3.4.

Theorem 3.8 The Fourier transform maps $\mathcal{S}$ onto $\mathcal{S}$.

Proof Given $f \in \mathcal{S}$, let $F(x) = f(-x)$ and let $g = \hat F$. Then $g \in \mathcal{S}$ by Theorem 2.3, and (30) implies
\[ \hat g(x) = \hat{\hat F}(x) = F(-x) = f(x). \]

Let us also prove formula (28). Let $f, g \in \mathcal{S}$. Then
\[ \widehat{\hat f * \hat g}(-x) = \hat{\hat f}(-x)\, \hat{\hat g}(-x) = f(x) g(x) \]
by (27) and Theorem 3.4. Using Theorem 3.4 again, it follows that $\hat f * \hat g = \widehat{fg}$.

Theorem 3.9 (Plancherel theorem, first version) If $u, v \in \mathcal{S}$ then
\[ \int \hat u\, \overline{\hat v} = \int u\, \overline{v}. \tag{34} \]

Proof By the inversion theorem,
\[ \int u(x) \overline{v(x)}\,dx = \int \hat{\hat u}(-x) \overline{v(x)}\,dx = \int \hat{\hat u}(x) \overline{v(-x)}\,dx, \]
i.e., $\int u \overline{v} = \int \hat{\hat u}\, \tilde v$, with $\tilde v$ as in (6).

Applying the duality relation to the right-hand side we obtain
\[ \int u \overline{v} = \int \hat u\, \hat{\tilde v}, \]
and now (34) follows from (6).

Theorem 3.9 says that the Fourier transform restricted to Schwartz functions is an isometry in the $L^2$ norm. Since $\mathcal{S}$ is dense in $L^2$ (e.g., by Lemma 3.3) this suggests a way of extending the Fourier transform to $L^2$.

Theorem 3.10 (Plancherel theorem, second version) There is a unique bounded operator $\mathcal{F} : L^2 \to L^2$ such that $\mathcal{F} f = \hat f$ when $f \in \mathcal{S}$. $\mathcal{F}$ has the following additional properties:

1. $\mathcal{F}$ is a unitary operator.
2. $\mathcal{F} f = \hat f$ if $f \in L^1 \cap L^2$.

Proof The existence and uniqueness statement is immediate from Theorem 3.9, as is the fact that $\|\mathcal{F} f\|_2 = \|f\|_2$. In view of this isometry property, the range of $\mathcal{F}$ must be closed, and unitarity of $\mathcal{F}$ will follow if we show that the range is dense. However, the latter statement is immediate from Theorem 3.8 and Lemma 3.3.

It remains to prove 2. For $f \in \mathcal{S}$, 2. is true by definition. Suppose now that $f \in L^1 \cap L^2$. By Lemma 3.3, there is a sequence $\{g_k\} \subset \mathcal{S}$ which converges to $f$ both in $L^1$ and in $L^2$. By Proposition 1.1, $\hat g_k$ converges to $\hat f$ uniformly. On the other hand, $\hat g_k$ converges to $\mathcal{F} f$ in $L^2$ by boundedness of the operator $\mathcal{F}$. It follows that $\mathcal{F} f = \hat f$.

Statement 2. allows us to use the notation $\hat f$ for $\mathcal{F} f$ if $f \in L^2$ without any possible ambiguity. We may therefore extend the definition of the Fourier transform to $L^1 + L^2$ (in fact to $\sigma$-finite measures of the form $\mu + f\,dx$, $\mu \in M(\mathbb{R}^n)$, $f \in L^2$) via $\widehat{f + g} = \hat f + \hat g$.

Corollary 3.12 The following form of the duality relation is valid:
\[ \int \hat\nu\, \psi = \int \hat\psi\,d\nu, \qquad \psi \in \mathcal{S}, \]
if $\nu = \mu + f\,dx$, $\mu \in M(\mathbb{R}^n)$, $f \in L^2$.

Proof We have already proved this in Lemma 3.5 if $f = 0$, so it suffices to prove it when $\mu = 0$, i.e., to show that
\[ \int \hat f\, \psi = \int f\, \hat\psi \]
if $f \in L^2$, $\psi \in \mathcal{S}$. This is true by Lemma 3.5 if $f \in L^1 \cap L^2$. Therefore it is also true for $f \in L^2$, since for fixed $\psi$ both sides depend continuously on $f$ (in the case of the left-hand side this follows from the Plancherel theorem).

Theorem 3.13 If $\mu \in M(\mathbb{R}^n)$, $f \in L^2$ and $\hat\mu + \hat f = 0$, then $\mu = -f\,dx$. In particular, if $\mu \in M(\mathbb{R}^n)$ and $\hat\mu \in L^2$, then $\mu$ is absolutely continuous with respect to the Lebesgue measure with an $L^2$ density.

Proof By the Riesz representation theorem for measures on compact sets, the measure $\mu + f\,dx$ will be zero provided
\[ \int \phi\,d\mu + \int \phi f\,dx = 0 \tag{35} \]
for continuous $\phi$ with compact support. If $\phi \in C_0^\infty$ then (35) follows from Corollary 3.12. In general, we choose (e.g., by Lemma 3.3) a sequence $\phi_k$ in $C_0^\infty$ which converges to $\phi$ uniformly and in $L^2$. We write down (35) for the $\phi_k$'s and pass to the limit. This proves (35).

To prove the last statement, suppose that $\hat\mu \in L^2$, and choose (by Theorem 3.10) a function $g \in L^2$ with $\hat g = \hat\mu$. Then $d\mu - g\,dx$ has Fourier transform zero, so by the first part of the proof $d\mu = g\,dx$.

All the basic formulas for the $L^1$ Fourier transform extend to the $L^1 + L^2$-Fourier transform by approximation arguments. This was done above in the case of the duality relation. Let us note in particular that the transformation formulas in Section 1 extend. For example, in the case of (5), one has
\[ \widehat{f \circ T} = |\det(T)|^{-1}\, \hat f \circ T^{-t} \tag{36} \]

if $f \in L^1 + L^2$. Since we already know this when $f \in L^1$, it suffices to prove it when $f \in L^2$. Choose $\{f_k\} \subset L^1 \cap L^2$, $f_k \to f$ in $L^2$. Composition with $T$ is continuous on $L^2$, as is Fourier transform, so we can write down (36) for the $\{f_k\}$ and pass to the limit.

The $L^1 + L^2$ domain for the Fourier transform is wide enough to include many natural examples. Note in particular that $L^p \subset L^1 + L^2$ if $p \in (1, 2)$. Furthermore, certain homogeneous functions belong to $L^1 + L^2$ although none of them can belong to $L^p$ for any fixed $p$. For example, $|x|^{-a}$ belongs to $L^1 + L^2$ if $\frac{n}{2} < a < n$, since it belongs to $L^1$ at the origin and to $L^2$ at infinity.

However, the $L^1 + L^2$ domain is not always sufficient. The most natural way to proceed would be to develop the idea of tempered distributions, but we don't want to do this explicitly. Instead, we further broaden the definition of Fourier transform as follows: a tempered function is a function $f \in L^1_{loc}(\mathbb{R}^n)$ such that
\[ \int (1 + |x|)^{-N} |f(x)|\,dx < \infty \]
for some constant $N$. Roughly, $f$ has at most polynomial growth in the sense of $L^1$ averages. It is clear that if $f$ is tempered and $\phi \in \mathcal{S}$, then $\int |\phi f| < \infty$. Furthermore, the map $\phi \to \int \phi f$ is continuous on $\mathcal{S}$. It follows that $\phi * f$ is well defined if $\phi \in \mathcal{S}$ and $f$ is tempered, and a simple estimation shows that $\phi * f$ is again tempered.

If $f$ and $g$ are tempered functions, we say that $g$ is the distributional Fourier transform of $f$ if
\[ \int g \phi = \int f \hat\phi \tag{37} \]
for all $\phi \in \mathcal{S}$. For given $f$, such a function $g$ is unique using the density properties of $\mathcal{S}$. We denote $g$ by $\hat f$, as in several previous arguments. Notice also that if $f \in L^1 + L^2$, then its $L^1 + L^2$-Fourier transform coincides with its distributional Fourier transform by Corollary 3.12.

All the basic formulas, in particular (3), (4), (5), (6), extend to the case of distributional Fourier transforms, e.g., if $g$ is the distributional Fourier transform of $f$, then $|\det T|^{-1}\, g \circ T^{-t}$ is the distributional Fourier transform of $f \circ T$. This may be seen by making appropriate changes of variable in the integrals in (37). We indicate how these arguments are carried out by proving the extended version of formula (27). Namely, if $f$ is tempered, $\phi \in \mathcal{S}$, and $f$ has a distributional Fourier transform, then so does $\phi * f$ and
\[ \widehat{\phi * f} = \hat\phi \hat f. \tag{38} \]
The proof is as follows. Let $\psi$ be another Schwartz function. Then
\[
\begin{aligned}
\int \hat\phi \hat f\, \psi &= \int \hat f\, (\hat\phi \psi) \\
&= \int f\, \widehat{(\hat\phi \psi)} \\
&= \int f\, (\hat{\hat\phi} * \hat\psi) \\
&= \int \int f(x)\, \phi(-(x - y))\, \hat\psi(y)\,dy\,dx \\
&= \int \phi * f(y)\, \hat\psi(y)\,dy.
\end{aligned}
\]
The second line followed from the definition of distributional Fourier transform, the third line from (28), and the next to last line used the inversion theorem for $\phi$. Comparing with the definition (37), we see that we have proved (38).

Let us note also that the inversion theorem is true for distributional Fourier transforms: if $f$ is tempered and has a distributional Fourier transform $\hat f$, then $\hat f$ has the distributional Fourier transform $f(-x)$. Here is the proof. If $\phi \in \mathcal{S}$, then
\[
\begin{aligned}
\int f(-x) \phi(x)\,dx &= \int f(x) \phi(-x)\,dx \\
&= \int f(x)\, \hat{\hat\phi}(x)\,dx \\
&= \int \hat f(x)\, \hat\phi(x)\,dx.
\end{aligned}
\]
We used a change of variables, the inversion theorem for $\phi$, and the definition (37) of $\hat f$. Comparing again with (37), we have the stated result.

4. Some specics, and Lp for p < 2 We rst discuss a couple of basic examples where the Fourier transform can be calculated, namely powers of the distance to the origin and complex Gaussians.
(a/2) a Proposition 4.1 Let ha (x) = . Then ha = hna in the sense of L1 + L2 a/2 |x| Fourier transforms if n < Re (a) < n, and in the sense of distributional Fourier transforms 2 if 0 < Re (a) < n.

Here is the gamma function, i.e.,

(s ) =
0

et ts1 dt.

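Proof Suppose that $a$ is real and $\frac{n}{2} < a < n$. Then $h_a \in L^1 + L^2$. The functions of the form $f(x) = c|x|^{-a}$ with $c$ constant may be characterized by the following two transformation properties:

1. $f$ is radial, i.e., $f \circ \rho = f$ for all linear $\rho : \mathbb{R}^n \to \mathbb{R}^n$ with $\rho^t \rho = $ identity.

2. $f$ is homogeneous of degree $-a$, i.e.,
\[ f(\epsilon x) = \epsilon^{-a} f(x) \tag{39} \]
for each $\epsilon > 0$.

We will use the notation
\[ f_\epsilon(x) = f(\epsilon x), \tag{40} \]
\[ f^\epsilon(x) = \epsilon^{-n} f\!\left(\frac{x}{\epsilon}\right). \tag{41} \]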

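Let $f(x) = |x|^{-a}$, $\frac{n}{2} < a < n$. Taking Fourier transforms we obtain from 1. and 2. the following (see the discussion in Section 1 regarding special cases of (5)): $\hat f$ is radial, and $\hat f^{\,\epsilon} = \epsilon^{-a}\hat f$, which is equivalent to $\hat f_\epsilon = \epsilon^{-(n-a)}\hat f$. Hence $\hat f = c|x|^{-(n-a)}$, and it remains to evaluate the constant $c$. For this we use the duality relation, taking the Schwartz function to be the Gaussian $e^{-\pi|x|^2}$ of Example 2. Thus
\[ \int |x|^{-a} e^{-\pi|x|^2}\,dx = c \int |x|^{-(n-a)} e^{-\pi|x|^2}\,dx. \tag{42} \]
To evaluate the left hand side, change to polar coordinates and then make the change of variable $t = \pi r^2$. Thus, if $\sigma$ is the area of the unit sphere, we get
\begin{align*}
\int |x|^{-a} e^{-\pi|x|^2}\,dx &= \sigma \int_0^\infty e^{-\pi r^2} r^{n-a}\,\frac{dr}{r} \\
&= \sigma \int_0^\infty e^{-t}\left(\frac{t}{\pi}\right)^{\frac{n-a}{2}} \frac{dt}{2t} \\
&= \frac{\sigma}{2}\,\pi^{-\frac{n-a}{2}}\,\Gamma\!\left(\frac{n-a}{2}\right),
\end{align*}
and similarly the right hand side of (42) is $c\,\frac{\sigma}{2}\,\pi^{-\frac{a}{2}}\,\Gamma(\frac{a}{2})$. Hence
\[ c = \pi^{a - \frac{n}{2}}\,\frac{\Gamma\!\left(\frac{n-a}{2}\right)}{\Gamma\!\left(\frac{a}{2}\right)} \]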

and the proposition is proved in the case $\frac{n}{2} < a < n$.

For the general case, fix $\phi \in \mathcal{S}$ and consider the two integrals
\[ A(z) = \int h_z\,\hat\phi, \qquad B(z) = \int h_{n-z}\,\phi. \]
Both $A$ and $B$ may be seen to be analytic in $z$ in the indicated regime: since $\Gamma$ is analytic, this reduces to showing that $\int |x|^{-z}\phi(x)\,dx$ is analytic when $\phi \in \mathcal{S}$, which may be done by using the dominated convergence theorem to justify complex differentiation under the integral sign. By Proposition 3.14, $A$ and $B$ agree for $z$ in $(\frac{n}{2}, n)$. So they agree everywhere by the uniqueness theorem. This proves that the distributional Fourier transform of $h_a$ exists and is $h_{n-a}$. If $\mathrm{Re}\,a > \frac{n}{2}$, then $h_a \in L^1 + L^2$, so that its $L^1 + L^2$ and distributional Fourier transforms coincide.


Let $T$ be an invertible $n \times n$ real symmetric matrix. The signature of $T$ is the quantity $k_+ - k_-$ where $k_+$ and $k_-$ are the numbers of positive and negative eigenvalues of $T$, counted with multiplicity. We also define
\[ G_T(x) = e^{-\pi i \langle Tx, x\rangle}, \]
and observe that $G_T$ has absolute value 1 and is therefore tempered.

Proposition 4.2 Let $T$ be an invertible $n \times n$ real symmetric matrix with signature $\sigma$. Then $G_T$ has a distributional Fourier transform, which is equal to
\[ e^{-\frac{\pi i \sigma}{4}}\,|\det T|^{-\frac{1}{2}}\,G_{-T^{-1}}. \]

Remark This can easily be generalized to complex symmetric $T$ with nonnegative imaginary part (the latter condition is needed, else $G_T$ is not tempered). See [17], Theorem 7.6.1. If $n = 1$, we do this case in the course of the proof.

Proof We need to show that
\[ \int e^{-\pi i \langle Tx, x\rangle}\,\hat\phi(x)\,dx = e^{-\frac{\pi i \sigma}{4}}\,|\det T|^{-\frac{1}{2}} \int e^{\pi i \langle T^{-1}x, x\rangle}\,\phi(x)\,dx \tag{43} \]
if $\phi \in \mathcal{S}$ and $T$ is invertible real symmetric.

First consider the $n = 1$ case. Let $\sqrt{z}$ be the branch of the square root defined on the complement of the nonpositive real numbers and positive on the positive real axis. Thus $\sqrt{\pm i} = e^{\pm\frac{\pi i}{4}}$. Accordingly, (43) with $n = 1$ is equivalent to
\[ \int e^{-\pi z x^2}\,\hat\phi(x)\,dx = (\sqrt{z})^{-1} \int e^{-\frac{\pi x^2}{z}}\,\phi(x)\,dx \tag{44} \]

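if $\phi \in \mathcal{S}$ and $z$ is pure imaginary and not equal to zero. We prove this formula by analytic continuation from the real case. Namely, if $z = 1$ then (44) is Example 2 in Section 1, and the case of $z$ real and positive then follows from scaling, i.e., the fact that the Fourier transform of $f_\epsilon$ is $\hat f^{\,\epsilon}$, see (5). Both sides of (44) are easily seen to be analytic in $z$ when $\mathrm{Re}\,z > 0$ and continuous in $z$ when $\mathrm{Re}\,z \geq 0$, $z \neq 0$, so (44) is proved.

Now consider the $n \geq 2$ case. Observe that if (43) is true for a given $T$ (and all $\phi$), it is true also when $T$ is replaced by $UTU^{-1}$ for any $U \in SO(n)$. This follows from the fact that $\widehat{f \circ U} = \hat f \circ U$. However, since we did not give an explicit proof of the latter fact for distributional Fourier transforms, we will now exhibit the necessary calculations. Let $S = UTU^{-1}$. Thus $S$ and $T$ have the same determinant and the same signature. Accordingly, if (43) holds for $T$ then
\begin{align*}
\int e^{-\pi i \langle Sx, x\rangle}\,\hat\phi(x)\,dx &= \int e^{-\pi i \langle TU^{-1}x,\, U^{-1}x\rangle}\,\hat\phi(x)\,dx \\
&= \int e^{-\pi i \langle Tx, x\rangle}\,\hat\phi(Ux)\,dx \\
&= \int e^{-\pi i \langle Tx, x\rangle}\,\widehat{\phi \circ U}(x)\,dx \\
&= e^{-\frac{\pi i \sigma}{4}}\,|\det T|^{-\frac{1}{2}} \int e^{\pi i \langle T^{-1}x, x\rangle}\,(\phi \circ U)(x)\,dx \\
&= e^{-\frac{\pi i \sigma}{4}}\,|\det T|^{-\frac{1}{2}} \int e^{\pi i \langle T^{-1}U^{-1}x,\, U^{-1}x\rangle}\,\phi(x)\,dx \\
&= e^{-\frac{\pi i \sigma}{4}}\,|\det S|^{-\frac{1}{2}} \int e^{\pi i \langle S^{-1}x, x\rangle}\,\phi(x)\,dx.
\end{align*}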

We used that $\widehat{\phi \circ U} = \hat\phi \circ U$ for Schwartz functions $\phi$, see the comments after formula (5) in Section 1.

It therefore suffices to prove (43) when $T$ is diagonal. If $T$ is diagonal and $\phi$ is a tensor function, then the integrals in (43) factor as products of one variable integrals and (43) follows immediately from (44). The general case then follows from Proposition 2.1 and the fact that integration against a tempered function defines a continuous linear functional on $\mathcal{S}$.

We now briefly discuss the $L^p$ Fourier transform, $1 < p < 2$. The most basic result is the Hausdorff-Young theorem, which is a formal consequence of the Plancherel theorem and Proposition 1.1 via the following.

Riesz-Thorin interpolation theorem. Let $T$ be a linear operator with domain $L^{p_0} + L^{p_1}$, $1 \leq p_0 < p_1 \leq \infty$. Assume that $f \in L^{p_0}$ implies
\[ \|Tf\|_{q_0} \leq A_0 \|f\|_{p_0} \tag{45} \]
and $f \in L^{p_1}$ implies
\[ \|Tf\|_{q_1} \leq A_1 \|f\|_{p_1} \tag{46} \]
for some $1 \leq q_0, q_1 \leq \infty$. Suppose that for a certain $\theta \in (0, 1)$,
\[ \frac{1}{p} = \frac{1-\theta}{p_0} + \frac{\theta}{p_1} \tag{47} \]
and
\[ \frac{1}{q} = \frac{1-\theta}{q_0} + \frac{\theta}{q_1}. \tag{48} \]
Then $f \in L^p$ implies
\[ \|Tf\|_q \leq A_0^{1-\theta} A_1^{\theta} \|f\|_p. \]

For the proof see [20], [34], or numerous other textbooks.
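To see how (47) and (48) determine the interpolated pair of exponents, here is a small illustrative sketch in exact rational arithmetic. The helper names `recip` and `interpolate` and the use of `None` as a stand-in for the exponent $\infty$ are assumptions made for this example only. The sample call uses the endpoints $p_0 = 1$, $q_0 = \infty$, $p_1 = q_1 = 2$, i.e. the bounds supplied by Proposition 1.1 and the Plancherel theorem.

```python
from fractions import Fraction

INF = None  # stand-in for the exponent "infinity"; 1/INF is treated as 0

def recip(p):
    """Return 1/p as an exact fraction, with 1/infinity = 0."""
    return Fraction(0) if p is INF else 1 / Fraction(p)

def interpolate(p0, q0, p1, q1, theta):
    """Solve (47) and (48) for (p, q) given the endpoint exponents and theta in (0, 1)."""
    inv_p = (1 - theta) * recip(p0) + theta * recip(p1)
    inv_q = (1 - theta) * recip(q0) + theta * recip(q1)
    to_exponent = lambda inv: INF if inv == 0 else 1 / inv
    return to_exponent(inv_p), to_exponent(inv_q)

p, q = interpolate(1, INF, 2, 2, Fraction(1, 2))
print(p, q)                      # the interpolated pair for theta = 1/2
print(recip(p) + recip(q) == 1)  # True exactly when q is the conjugate exponent of p
```

For these endpoints one gets $1/p = 1 - \theta/2$ and $1/q = \theta/2$, so $q$ is always the conjugate exponent of $p$.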


We will adopt the convention that when indices $p$ and $p'$ are used we must have
\[ \frac{1}{p} + \frac{1}{p'} = 1. \]

Proposition 4.3 (Hausdorff-Young) If $1 \leq p \leq 2$ then
\[ \|\hat f\|_{p'} \leq \|f\|_p. \tag{49} \]

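Proof We interpolate between the cases $p = 1$ and $2$, which we already know. Namely, apply the Riesz-Thorin theorem with $p_0 = 1$, $q_0 = \infty$, $p_1 = q_1 = 2$, $A_0 = A_1 = 1$. The hypotheses (45) and (46) follow from Proposition 1.1 and Theorem 3.10 respectively. For given $p$, $q$, existence of $\theta \in (0, 1)$ for which (47) and (48) hold is equivalent to $1 < p < 2$ and $q = p'$. The result follows.

For later reference we insert here another basic result which follows from Riesz-Thorin, although this one (in contrast to Hausdorff-Young) could also be proved by elementary manipulation of inequalities.

Proposition 4.4 (Young's inequality) Let $\phi \in L^p$, $\psi \in L^r$, where $1 \leq p, r \leq \infty$ and $\frac{1}{p} + \frac{1}{r} \geq 1$. Let $\frac{1}{q} = \frac{1}{p} - \frac{1}{r'}$. Then the integral defining $\phi * \psi$ is absolutely convergent for a.e. $x$ and
\[ \|\phi * \psi\|_q \leq \|\phi\|_p \|\psi\|_r. \]

Proof View $\phi$ as fixed, i.e., define $T\psi = \phi * \psi$. Inequalities (24) and (25) imply that $T : L^1 + L^{p'} \to L^p + L^\infty$ with
\[ \|T\psi\|_p \leq \|\phi\|_p \|\psi\|_1, \qquad \|T\psi\|_\infty \leq \|\phi\|_p \|\psi\|_{p'}. \]
If $\frac{1}{q} = \frac{1}{p} - \frac{1}{r'}$ then there is $\theta \in [0, 1]$ with
\[ \frac{1}{r} = \frac{1-\theta}{1} + \frac{\theta}{p'}, \qquad \frac{1}{q} = \frac{1-\theta}{p} + \frac{\theta}{\infty}. \]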

The result now follows from Riesz-Thorin.

Remarks 1. Unless $p = 1$ or $2$, the constant 1 in the Hausdorff-Young inequality is not the best possible; indeed the best constant is found by testing the Gaussian function $e^{-\pi|x|^2}$. This is much deeper and is due to Babenko when $p'$ is an even integer and to Beckner [1] in general. There are some related considerations in connection with Proposition 4.4, due also to Beckner.

2. Except in the case $p = 2$ the inequality (49) is not reversible, in the sense that there is no constant $C$ such that $\|f\|_p \leq C\|\hat f\|_{p'}$ when $f \in \mathcal{S}$. Equivalently (in view of the inversion theorem) the result does not extend to the case $p > 2$. This is not at all difficult to show, but we discuss it at some length in order to illustrate a few different techniques used for constructing examples in connection with the $L^p$ Fourier transform. Here is the most elementary argument.

Exercise Using translation and multiplication by characters, construct a sequence of Schwartz functions $\{\phi_n\}$ so that

1. Each $\phi_n$ has the same $L^p$ norm.

2. Each $\hat\phi_n$ has the same $L^{p'}$ norm.

3. The supports of the $\hat\phi_n$ are disjoint.

4. The supports of the $\phi_n$ are "essentially disjoint", meaning that
\[ \Big\| \sum_{n=1}^N \phi_n \Big\|_p^p \approx \sum_{n=1}^N \|\phi_n\|_p^p \quad (\approx N) \]
uniformly in $N$.

Use this to disprove the converse of Hausdorff-Young.

Here is a second argument based on Proposition 4.2. This argument can readily be adapted to show that there are functions $f \in L^p$ for any $p > 2$ which do not have a distributional Fourier transform in our sense. See [17], Theorem 7.6.6.

Take $n = 1$ and $f_\lambda(x) = \phi(x)e^{-\pi i \lambda x^2}$, where $\phi \in C_0^\infty$ is fixed. Here $\lambda$ is a large positive number. Then $\|f_\lambda\|_p$ is independent of $\lambda$ for any $p$. By the Plancherel theorem, $\|\hat f_\lambda\|_2$ is also independent of $\lambda$. On the other hand, $\hat f_\lambda$ is the convolution of $\hat\phi$, which is in $L^1$, with $(\sqrt{i\lambda})^{-1} e^{\pi i \lambda^{-1} x^2}$, which has $L^\infty$ norm $\lambda^{-\frac{1}{2}}$. Accordingly, if $p < 2$ then
\[ \|\hat f_\lambda\|_{p'} \leq \|\hat f_\lambda\|_2^{\frac{2}{p'}}\,\|\hat f_\lambda\|_\infty^{1 - \frac{2}{p'}} \lesssim \lambda^{-\left(\frac{1}{2} - \frac{1}{p'}\right)}. \]
Since $\|f_\lambda\|_p$ is independent of $\lambda$, this shows that when $p < 2$ there is no constant $C$ such that $C\|\hat f\|_{p'} \geq \|f\|_p$ for all $f \in \mathcal{S}$.


Here now is another important technique ("randomization") and a third disproof of the converse of Hausdorff-Young. Let $\{\omega_n\}_{n=1}^N$ be independent random variables taking values $\pm 1$ with equal probability. Denote expectation (a.k.a. integral over the probability space in question) by $\mathbb{E}$, and probability (a.k.a. measure) by Prob. Let $\{a_n\}_{n=1}^N$ be complex numbers.

Proposition 4.5 (Khinchin's inequality)
\[ \mathbb{E}\Big( \Big| \sum_{n=1}^N a_n \omega_n \Big|^p \Big) \approx \Big( \sum_{n=1}^N |a_n|^2 \Big)^{\frac{p}{2}} \tag{50} \]

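for any $0 < p < \infty$, where the implicit constants depend on $p$ only.

Most books on probability and many analysis books give proofs. Here is the proof in the case $p > 1$. There are three steps.

(i) When $p = 2$ it is simple to see from independence that (50) is true with equality: expand out the left side and observe that the cross terms cancel.

(ii) The upper bound. This is best obtained as a consequence of a stronger ("subgaussian") estimate. One can clearly assume the $\{a_n\}$ are real and (52) below is for real $\{a_n\}$. Let $t > 0$. We have
\[ \mathbb{E}\big(e^{t \sum_n a_n \omega_n}\big) = \prod_n \mathbb{E}\big(e^{t a_n \omega_n}\big) = \prod_n \frac{1}{2}\big(e^{t a_n} + e^{-t a_n}\big), \]
where the first equality follows from independence and the fact that $e^{x+y} = e^x e^y$. Use the numerical inequality
\[ \frac{1}{2}\big(e^x + e^{-x}\big) \leq e^{\frac{x^2}{2}} \tag{51} \]
to conclude that
\[ \mathbb{E}\big(e^{t \sum_n a_n \omega_n}\big) \leq e^{\frac{t^2}{2} \sum_n a_n^2}, \]
therefore
\[ \mathrm{Prob}\Big( \sum_n a_n \omega_n \geq \lambda \Big) \leq e^{-t\lambda + \frac{t^2}{2} \sum_n a_n^2} \]
for any $t > 0$ and $\lambda > 0$. Taking $t = \frac{\lambda}{\sum_n a_n^2}$ gives
\[ \mathrm{Prob}\Big( \sum_n a_n \omega_n \geq \lambda \Big) \leq e^{-\frac{\lambda^2}{2 \sum_n a_n^2}}, \tag{52} \]
hence
\[ \mathrm{Prob}\Big( \Big| \sum_n a_n \omega_n \Big| \geq \lambda \Big) \leq 2 e^{-\frac{\lambda^2}{2 \sum_n a_n^2}}. \]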

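From this and the formula for the $L^p$ norm in terms of the distribution function,
\[ \mathbb{E}(|f|^p) = p \int_0^\infty \lambda^{p-1}\,\mathrm{Prob}(|f| \geq \lambda)\,d\lambda, \]
one gets
\[ \mathbb{E}\Big( \Big| \sum_n a_n \omega_n \Big|^p \Big) \leq 2p \int_0^\infty \lambda^{p-1} e^{-\frac{\lambda^2}{2 \sum_n a_n^2}}\,d\lambda = 2^{\frac{p}{2}}\,p\,\Gamma\!\Big(\frac{p}{2}\Big) \Big( \sum_n a_n^2 \Big)^{\frac{p}{2}}. \]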

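This proves the upper bound.

(iii) The lower bound. This follows from (i) and (ii) by duality. Namely
\begin{align*}
\sum_n |a_n|^2 &= \mathbb{E}\Big( \Big| \sum_n a_n \omega_n \Big|^2 \Big) \\
&\leq \mathbb{E}\Big( \Big| \sum_n a_n \omega_n \Big|^p \Big)^{\frac{1}{p}}\; \mathbb{E}\Big( \Big| \sum_n a_n \omega_n \Big|^{p'} \Big)^{\frac{1}{p'}} \\
&\lesssim \mathbb{E}\Big( \Big| \sum_n a_n \omega_n \Big|^p \Big)^{\frac{1}{p}} \Big( \sum_n |a_n|^2 \Big)^{\frac{1}{2}},
\end{align*}
so that
\[ \mathbb{E}\Big( \Big| \sum_n a_n \omega_n \Big|^p \Big)^{\frac{1}{p}} \gtrsim \Big( \sum_n |a_n|^2 \Big)^{\frac{1}{2}} \]
as claimed.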
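For small $N$ the expectation in (50) is a finite average over the $2^N$ sign patterns, so the three steps of the proof can be checked exactly. The sketch below is illustrative only; the helper names `moment` and `upper_constant`, and the sample coefficients, are assumptions of this example. The constant $p\,2^{p/2}\,\Gamma(p/2)$ is what one obtains by carrying out the integral in step (ii).

```python
import itertools
import math

def moment(a, p):
    """Exact E|sum_n a_n w_n|^p, averaging over all 2^N equally likely sign patterns."""
    total = 0.0
    for signs in itertools.product((1, -1), repeat=len(a)):
        s = sum(w * x for w, x in zip(signs, a))
        total += abs(s) ** p
    return total / 2 ** len(a)

def upper_constant(p):
    """Constant from integrating the subgaussian tail bound: p * 2^(p/2) * Gamma(p/2)."""
    return p * 2 ** (p / 2) * math.gamma(p / 2)

a = [1.0, 0.5, -2.0, 0.25, 1.5, 0.75]   # hypothetical real coefficients
l2 = sum(x * x for x in a)              # sum of a_n^2

# Step (i): equality when p = 2 (cross terms cancel).
print(moment(a, 2), l2)

# Steps (ii)/(iii): the ratio E|S|^p / (sum a_n^2)^(p/2) stays between constants depending on p.
for p in (1, 1.5, 3, 4, 6):
    ratio = moment(a, p) / l2 ** (p / 2)
    print(p, ratio, upper_constant(p))

# The numerical inequality (51) on a grid of sample points.
print(all(math.cosh(k / 10) <= math.exp((k / 10) ** 2 / 2) for k in range(-50, 51)))
```

Changing the coefficients or $N$ changes the ratios but not the fact that they stay below `upper_constant(p)`, which is the content of the upper bound.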

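To apply this in connection with the converse of Hausdorff-Young, let $\phi$ be a $C_0^\infty$ function, and let $\{k_j\}_{j=1}^N$ be such that the functions $\phi_j \overset{\mathrm{def}}{=} \phi(\cdot - k_j)$ have disjoint support. Thus $\hat\phi_n(\xi) = e^{-2\pi i \xi \cdot k_n}\hat\phi(\xi)$. The $L^p$ norm of $\sum_{n \leq N} \omega_n \phi_n$ is independent of $\omega$ in view of the disjoint support, indeed
\[ \Big\| \sum_{n \leq N} \omega_n \phi_n \Big\|_p = C N^{\frac{1}{p}}, \tag{53} \]
where $C = \|\phi\|_p$. Now consider the corresponding Fourier side norms, more precisely the expectation of their $p'$ powers:
\[ \mathbb{E}\Big( \Big\| \sum_{n \leq N} \omega_n \hat\phi_n \Big\|_{p'}^{p'} \Big). \tag{54} \]
We have by Fubini's theorem
\[ (54) = \int |\hat\phi(\xi)|^{p'}\; \mathbb{E}\Big( \Big| \sum_{n \leq N} \omega_n e^{-2\pi i \xi \cdot k_n} \Big|^{p'} \Big)\,d\xi \approx N^{\frac{p'}{2}}, \]
where at the last step we used Khinchin. It follows that we can make a choice of $\omega$ so that $\| \sum_{n \leq N} \omega_n \hat\phi_n \|_{p'} \lesssim N^{\frac{1}{2}}$. If $p < 2$ and if $N$ is large, this is much smaller than the right hand side of (53), so we are done.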

5. Uncertainty Principle

The uncertainty principle is$^1$ the heuristic statement that if a measure $\mu$ is supported on $R$, then for many purposes $\hat\mu$ may be regarded as being constant on any dual ellipsoid $R^*$.

The simplest rigorous statement is

Proposition 5.1 ($L^2$ Bernstein inequality) Assume that $f \in L^2$ and $\hat f$ is supported in $D(0, R)$. Then $f$ is $C^\infty$ and there is an estimate
\[ \|D^\alpha f\|_2 \leq (2\pi R)^{|\alpha|} \|f\|_2. \tag{55} \]

Proof Essentially this is an immediate consequence of the Plancherel theorem, see the calculation at the end of the proof. However, there are some details to take care of if one wants to be rigorous.

The Fourier inversion formula
\[ f(x) = \int \hat f(\xi)\,e^{2\pi i x \cdot \xi}\,d\xi \tag{56} \]
is valid (in the naive sense). Namely, note that the support assumption implies $\hat f \in L^1$, and choose a sequence of Schwartz functions $\psi_k$ which converges to $\hat f$ both in $L^1$ and in $L^2$. Let $\phi_k \in \mathcal{S}$ satisfy $\hat\phi_k = \psi_k$. The formula (56) is valid for $\phi_k$. As $k \to \infty$, the left sides converge in $L^2$ to $f$ by Theorem 3.10 and the right sides converge uniformly to $\int \hat f(\xi)\,e^{2\pi i x \cdot \xi}\,d\xi$, which proves (56) for $f$.

Proposition 1.3 applied to $\hat f$ now implies that $f$ is $C^\infty$ and that $D^\alpha f$ is obtained by differentiation under the integral sign in (56). The estimate (55) holds since
\[ \|D^\alpha f\|_2 = \|\widehat{D^\alpha f}\|_2 = \|(2\pi i \xi)^\alpha \hat f\|_2 \leq (2\pi R)^{|\alpha|} \|\hat f\|_2 = (2\pi R)^{|\alpha|} \|f\|_2. \]

A corresponding statement is also true in $L^p$ norms, but proving this and other related results needs a different argument since there is no Plancherel theorem.

Lemma 5.2 There is a fixed Schwartz function $\phi$ such that if $f \in L^1 + L^2$ and $\hat f$ is supported in $D(0, R)$, then
\[ f = \phi^{R^{-1}} * f. \]

$^1$ This should be qualified by adding "as far as we are concerned". There are various more sophisticated related statements which are also called uncertainty principle; see for example [14], [15] and references there.


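Proof Take $\phi \in \mathcal{S}$ so that $\hat\phi$ is equal to 1 on $D(0, 1)$. Thus $\widehat{\phi^{R^{-1}}}(\xi) = \hat\phi(R^{-1}\xi)$ is equal to 1 on $D(0, R)$, so $(\phi^{R^{-1}} * f - f)^{\wedge}$ vanishes identically. Hence $\phi^{R^{-1}} * f = f$.

Proposition 5.3 (Bernstein's inequality for a disc) Suppose that $f \in L^1 + L^2$ and $\hat f$ is supported in $D(0, R)$. Then

1. For any $\alpha$ and $p \in [1, \infty]$,
\[ \|D^\alpha f\|_p \leq (CR)^{|\alpha|} \|f\|_p. \]

2. For any $1 \leq p \leq q \leq \infty$,
\[ \|f\|_q \leq C R^{n(\frac{1}{p} - \frac{1}{q})} \|f\|_p. \]

Proof The function $\psi = \phi^{R^{-1}}$ satisfies
\[ \|\psi\|_r = C R^{\frac{n}{r'}} \tag{57} \]
for any $r \in [1, \infty]$, where $C = \|\phi\|_r$. Also, using the chain rule,
\[ \|\nabla\psi\|_1 = R\,\|\nabla\phi\|_1. \tag{58} \]
We know that $f = \psi * f$. In the case of first derivatives, 1. therefore follows from (58) and (24), since $\nabla f = (\nabla\psi) * f$. The general case of 1. then follows by induction.

For 2., let $r$ satisfy $\frac{1}{q} = \frac{1}{p} - \frac{1}{r'}$. Apply Young's inequality obtaining
\begin{align*}
\|f\|_q = \|\psi * f\|_q &\leq \|\psi\|_r \|f\|_p \\
&\lesssim R^{\frac{n}{r'}} \|f\|_p \\
&= R^{n(\frac{1}{p} - \frac{1}{q})} \|f\|_p.
\end{align*}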

We now extend the $L^p \to L^q$ bound to ellipsoids instead of balls, using change of variable. An ellipsoid in $\mathbb{R}^n$ is a set of the form
\[ E = \Big\{ x \in \mathbb{R}^n : \sum_j \frac{|(x - a) \cdot e_j|^2}{r_j^2} \leq 1 \Big\} \tag{59} \]
for some $a \in \mathbb{R}^n$ (called the center of $E$), some choice of orthonormal basis $\{e_j\}$ (the axes) and some choice of positive numbers $r_j$ (the axis lengths). If $E$ and $E^*$ are two ellipsoids,


then we say that $E^*$ is dual to $E$ if $E^*$ has the same axes as $E$ and reciprocal axis lengths, i.e., if $E$ is given by (59) then $E^*$ should be of the form
\[ \Big\{ x \in \mathbb{R}^n : \sum_j r_j^2\,|(x - b) \cdot e_j|^2 \leq 1 \Big\} \]

for some choice of the center point $b$.

Proposition 5.4 (Bernstein's inequality for an ellipsoid) Suppose that $f \in L^1 + L^2$ and $\hat f$ is supported in an ellipsoid $E$. Then
\[ \|f\|_q \lesssim |E|^{(\frac{1}{p} - \frac{1}{q})} \|f\|_p \]
if $1 \leq p \leq q \leq \infty$.

One could similarly extend the first part of Proposition 5.3 to ellipsoids centered at the origin, but the statement is awkward since one has to weight different directions differently, so we ignore this.

Proof Let $k$ be the center of $E$. Let $T$ be a linear map taking the unit ball onto $E - k$. Let $S = T^{-t}$; thus $T = S^{-t}$ also. Let $f_1(x) = e^{-2\pi i k \cdot x} f(x)$ and $g = f_1 \circ S$, so that
\begin{align*}
\hat g(\xi) &= |\det S|^{-1}\,\hat f_1(S^{-t}(\xi)) \\
&= |\det S|^{-1}\,\hat f(S^{-t}(\xi) + k) \\
&= |\det T|\,\hat f(T(\xi) + k).
\end{align*}
Thus $\hat g$ is supported in the unit ball, so by Proposition 5.3
\[ \|g\|_q \lesssim \|g\|_p. \]
On the other hand,
\[ \|g\|_q = |\det S|^{-\frac{1}{q}} \|f\|_q = |\det T|^{\frac{1}{q}} \|f\|_q \approx |E|^{\frac{1}{q}} \|f\|_q \]
and likewise with $q$ replaced by $p$. So
\[ |E|^{\frac{1}{q}} \|f\|_q \lesssim |E|^{\frac{1}{p}} \|f\|_p \]
as claimed.

For some purposes one needs a related pointwise statement, roughly that if $\mathrm{supp}\,\hat f \subset E$, then for any dual ellipsoid $E^*$ the values on $E^*$ are controlled by the average over $E^*$. To formulate this precisely, let $N$ be a large number and let $\phi(x) = (1 + |x|^2)^{-N}$. Suppose an ellipsoid $E^*$ is given. Define
\[ \phi_{E^*}(x) = \phi(T(x - k)), \]
where $k$ is the center of $E^*$ and $T$ is a selfadjoint linear map taking $E^* - k$ onto the unit ball. If $T_1$ and $T_2$ are two such maps, then $T_1 \circ T_2^{-1}$ is an orthogonal transformation, so $\phi_{E^*}$ is well defined. Essentially, $\phi_{E^*}$ is roughly equal to 1 on $E^*$ and decays rapidly as one moves away from $E^*$. We could also write more explicitly
\[ \phi_{E^*}(x) = \Big( 1 + \sum_j \frac{|(x - k) \cdot e_j|^2}{r_j^2} \Big)^{-N}, \]
where $e_j$ and $r_j$ are the axes and axis lengths of $E^*$.

Proposition 5.5 Suppose that $f \in L^1 + L^2$ and $\hat f$ is supported in an ellipsoid $E$. Then for any dual ellipsoid $E^*$ and any $z \in E^*$,
\[ |f(z)| \leq C_N\,\frac{1}{|E^*|} \int |f(x)|\,\phi_{E^*}(x)\,dx. \tag{60} \]

Proof Assume first that $E$ is the unit ball, and $E^*$ is also the unit ball. Then $f$ is the convolution of itself with a fixed Schwartz function $\psi$. Accordingly
\begin{align*}
|f(z)| &\leq \int |f(x)|\,|\psi(z - x)|\,dx \\
&\leq C_N \int |f(x)|\,(1 + |z - x|^2)^{-N}\,dx \\
&\leq C_N \int |f(x)|\,(1 + |x|^2)^{-N}\,dx.
\end{align*}
We used the Schwartz space bounds for $\psi$ and that $1 + |z - x|^2 \approx 1 + |x|^2$ uniformly in $x$ when $|z| \leq 1$. This proves (60) when $E = E^* = $ unit ball.

Suppose next that $E$ is centered at zero but $E$ and $E^*$ are otherwise arbitrary. Let $k$ and $T$ be as above, and consider
\[ g(x) = f(T^{-1}x + k). \]
Its Fourier transform is supported on $T^{-1}E$, and if $T$ maps $E^*$ onto the unit ball, then $T^{-1}$ maps $E$ onto the unit ball. Accordingly,
\[ |g(y)| \lesssim \int \phi(x)\,|g(x)|\,dx \]
if $y \in D(0, 1)$, so that
\[ |f(T^{-1}z + k)| \lesssim \int \phi(x)\,|f(T^{-1}x + k)|\,dx = |\det T| \int \phi_{E^*}(x)\,|f(x)|\,dx \]
by changing variables. Since $|\det T| = \frac{1}{|E^*|}$ (up to a dimensional constant), we get (60).

If $E$ isn't centered at zero, then we can apply the preceding with $f$ replaced by $e^{-2\pi i k \cdot x} f(x)$ where $k$ is the center of $E$.


Remarks 1. Proposition 5.5 is an example of an estimate "with Schwartz tails". It is not possible to make the stronger conclusion that, say, $|f(x)|$ is bounded by the average of $f$ over the double of $E^*$ when $x \in E^*$, even in the one dimensional case with $E = E^* = $ unit interval. For this, consider a fixed Schwartz function $g$ whose Fourier transform is supported in the unit interval $[-1, 1]$. Consider also the functions
\[ f_N(x) = \Big( 1 - \frac{x^2}{4} \Big)^N g(x). \]
Since $\hat f_N$ are linear combinations of $\hat g$ and its derivatives, they have the same support as $\hat g$. Moreover, they converge pointwise boundedly to zero on $[-2, 2]$, except at the origin. It follows that there can be no estimate of the value of $f_N$ at the origin by its average over $[-2, 2]$.

2. All the estimates related to Bernstein's inequality are sharp except for the values of the constants. For example, if $E$ is an ellipsoid, $E^*$ a dual ellipsoid, $N < \infty$, then there is a function $f$ with $\mathrm{supp}\,\hat f \subset E$ and with
\[ \|f\|_1 \geq |E^*|, \tag{61} \]
\[ |f(x)| \leq C \phi_{E^*}^{(N)}(x), \tag{62} \]
where $\phi_{E^*} = \phi_{E^*}^{(N)}$ was defined above. In the case $E = E^* = $ unit ball this is obvious: take $f$ to be any Schwartz function with Fourier support in the unit ball and with the appropriate $L^1$ norm. The general case then follows as above by making changes of variable. The estimates (61) and (62) imply that $\|f\|_p \approx |E^*|^{\frac{1}{p}}$ for any $p$, so it follows that Proposition 5.4 is also sharp.

6. Stationary phase

Let $\phi$ be a real valued $C^\infty$ function, let $a$ be a $C_0^\infty$ function, and define
\[ I(\lambda) = \int e^{-\pi i \lambda \phi(x)}\,a(x)\,dx. \]
Here $\lambda$ is a parameter, which we always assume to be positive. The issue is the behavior of the integral $I(\lambda)$ as $\lambda \to +\infty$.

Some general remarks

1. $|I(\lambda)|$ is clearly bounded by a constant depending on $a$ only. One may expect decay as $\lambda \to \infty$, since when $\lambda$ is large the integral will involve a lot of cancellation.

2. On the other hand, if $\phi$ is constant then $|I(\lambda)|$ is independent of $\lambda$. So one needs to put nondegeneracy hypotheses on $\phi$. As it turns out, properties of $a$ are less important.

Note also that one can always cut up $a$ with a partition of unity, which means that the question of how fast $I(\lambda)$ decays can be localized to a small neighborhood of a point.

3. Suppose that $\phi_1 = \phi_2 \circ G$ where $G$ is a smooth diffeomorphism. Then
\begin{align*}
\int e^{-\pi i \lambda \phi_2(x)}\,a(x)\,dx &= \int e^{-\pi i \lambda \phi_1(G^{-1}x)}\,a(x)\,dx \\
&= \int e^{-\pi i \lambda \phi_1(y)}\,a(Gy)\,d(Gy) \\
&= \int e^{-\pi i \lambda \phi_1(y)}\,a(Gy)\,|J_G(y)|\,dy,
\end{align*}
where $J_G$ is the Jacobian determinant. The function $y \to a(Gy)|J_G(y)|$ is again $C_0^\infty$, so we see that any bound for $I(\lambda)$ which is independent of choice of $a$ will be "diffeomorphism invariant".

4. Recall from advanced calculus [23] the normal forms for a function near a regular point or a nondegenerate critical point:

Straightening Lemma Suppose $\Omega \subset \mathbb{R}^n$ is open, $f : \Omega \to \mathbb{R}$ is $C^\infty$, $p \in \Omega$ and $\nabla f(p) \neq 0$. Then there are neighborhoods $U$ and $V$ of $0$ and $p$ respectively and a $C^\infty$ diffeomorphism $G : U \to V$ with $G(0) = p$ and
\[ f \circ G(x) = f(p) + x_n. \]

Morse Lemma Suppose $\Omega \subset \mathbb{R}^n$ is open, $f : \Omega \to \mathbb{R}$ is $C^\infty$, $p \in \Omega$, $\nabla f(p) = 0$, and suppose that the Hessian matrix $H_f(p) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j}(p) \right)$ is invertible. Then, for a unique $k$ (= number of positive eigenvalues of $H_f$; see Lemma 6.3 below) there are neighborhoods $U$ and $V$ of $0$ and $p$ respectively and a $C^\infty$ diffeomorphism $G : U \to V$ with $G(0) = p$ and
\[ f \circ G(x) = f(p) + \sum_{j=1}^{k} x_j^2 - \sum_{j=k+1}^{n} x_j^2. \]

We consider now $I(\lambda)$ first when $a$ is supported near a regular point, and then when $a$ is supported near a nondegenerate critical point. Degenerate critical points are easy to deal with if $n = 1$, see [33], chapter 8, but in higher dimensions they are much more complicated and only the two-dimensional case has been worked out, see [36].

Proposition 6.1 (Nonstationary phase) Suppose $\Omega \subset \mathbb{R}^n$ is open, $\phi : \Omega \to \mathbb{R}$ is $C^\infty$, $p \in \Omega$ and $\nabla\phi(p) \neq 0$. Suppose $a \in C_0^\infty$ has its support in a sufficiently small neighborhood of $p$. Then
\[ \forall N\ \exists C_N : \quad |I(\lambda)| \leq C_N \lambda^{-N}, \]
and furthermore $C_N$ depends only on bounds for finitely many derivatives of $\phi$ and $a$ and a lower bound for $|\nabla\phi(p)|$ (and on $N$).

Proof The straightening lemma and the calculation in 3. above reduce this to the case $\phi(x) = x_n + c$. In this case, letting $e_n = (0, \ldots, 0, 1)$ we have
\[ I(\lambda) = e^{-\pi i \lambda c}\,\hat a\Big( \frac{\lambda}{2} e_n \Big), \]
and this has the requisite decay by Proposition 2.3.

Now we consider the nondegenerate critical point case, and as in the preceding proof we first consider the normal form.
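Proposition 6.2 Let $T$ be a real symmetric invertible matrix with signature $\sigma$, let $a$ be $C_0^\infty$ (or just in $\mathcal{S}$), and define
\[ I(\lambda) = \int e^{-\pi i \lambda \langle Tx, x\rangle}\,a(x)\,dx. \]
Then, for any $N$,
\[ I(\lambda) = e^{-\frac{\pi i \sigma}{4}}\,|\det T|^{-\frac{1}{2}}\,\lambda^{-\frac{n}{2}} \Big( a(0) + \sum_{j=1}^{N} \lambda^{-j} D_j a(0) + O(\lambda^{-(N+1)}) \Big). \]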

Here $D_j$ are certain explicit homogeneous constant coefficient differential operators of order $2j$, depending on $T$ only, and the implicit constant depends only on $T$ and on bounds for finitely many Schwartz space seminorms of $a$.

Proof Essentially this is just another way of looking at the formula for the Fourier transform of an imaginary Gaussian. By Proposition 4.2, the definition of distributional Fourier transform, and the Fourier inversion theorem for $a$ we have
\[ I(\lambda) = e^{-\frac{\pi i \sigma}{4}}\,\lambda^{-\frac{n}{2}}\,|\det T|^{-\frac{1}{2}} \int \hat a(-\xi)\,e^{\pi i \lambda^{-1} \langle T^{-1}\xi, \xi\rangle}\,d\xi. \]
We can replace $\hat a(-\xi)$ with $\hat a(\xi)$ by making a change of variables, since the Gaussian is even. To understand the resulting integral, use that $\lambda^{-1} \to 0$ as $\lambda \to \infty$, so the Gaussian term is approaching 1. To make this quantitative, use Taylor's theorem for $e^{ix}$:
\[ e^{\pi i \lambda^{-1} \langle T^{-1}\xi, \xi\rangle} = \sum_{j=0}^{N} \frac{(\pi i \lambda^{-1} \langle T^{-1}\xi, \xi\rangle)^j}{j!} + O\Big( \frac{|\xi|^{2N+2}}{\lambda^{N+1}} \Big) \]
uniformly in $\xi$ and $\lambda$. Accordingly,
\[ \int \hat a(\xi)\,e^{\pi i \lambda^{-1} \langle T^{-1}\xi, \xi\rangle}\,d\xi = \int \hat a(\xi) \Big( 1 + \sum_{j=1}^{N} \frac{(\pi i \lambda^{-1} \langle T^{-1}\xi, \xi\rangle)^j}{j!} \Big) d\xi + O\Big( \int |\hat a(\xi)|\,\frac{|\xi|^{2N+2}}{\lambda^{N+1}}\,d\xi \Big). \]
Now observe that $\int \hat a(\xi)\,d\xi = a(0)$ by the inversion theorem, and similarly
\[ \int \hat a(\xi)\,\frac{(\pi i \langle T^{-1}\xi, \xi\rangle)^j}{j!}\,d\xi \]
is the value at zero of $D_j a$ for an appropriate differential operator $D_j$. This gives the result, since $\int |\hat a(\xi)|\,|\xi|^{2N+2}\,d\xi$ is bounded in terms of Schwartz space seminorms of $\hat a$, and therefore in terms of derivatives of $a$.

Now we consider the case of a general phase function with a nondegenerate critical point. It is clear that this should be reducible to the Gaussian case using the Morse lemma and remark 3. above. However, there is more calculation involved than in the proof of Proposition 6.1, since we need to obtain the correct form for the asymptotic expansion. We recall the following formula which follows from the chain rule:

Lemma 6.3 Suppose that $\phi$ is smooth, $\nabla\phi(p) = 0$ and $G$ is a smooth diffeomorphism, $G(0) = p$. Then
\[ H_{\phi \circ G}(0) = DG(0)^t\,H_\phi(p)\,DG(0). \]
Thus $H_\phi(p)$ and $H_{\phi \circ G}(0)$ have the same signature and
\[ \det(H_{\phi \circ G}(0)) = J_G(0)^2 \det(H_\phi(p)). \]

Proposition 6.4 Let $\phi$ be $C^\infty$ and assume that $\nabla\phi(p) = 0$ and $H_\phi(p)$ is invertible. Let $\sigma$ be the signature of $H_\phi(p)$, and let $\Delta = 2^{-n}|\det(H_\phi(p))|$. Let $a$ be $C_0^\infty$ and supported in a sufficiently small neighborhood of $p$. Define
\[ I(\lambda) = \int e^{-\pi i \lambda \phi(x)}\,a(x)\,dx. \]
Then, for any $N$,
\[ I(\lambda) = e^{-\pi i \lambda \phi(p)}\,e^{-\frac{\pi i \sigma}{4}}\,\Delta^{-\frac{1}{2}}\,\lambda^{-\frac{n}{2}} \Big( a(p) + \sum_{j=1}^{N} \lambda^{-j} D_j a(p) + O(\lambda^{-(N+1)}) \Big). \]

Here $D_j$ are certain differential operators of order$^2$ $\leq 2j$, with coefficients depending on $\phi$, and the implicit constant depends on $\phi$ and on bounds for finitely many derivatives of $a$.

$^2$ Actually the order is exactly $2j$ but we have no need to know that.
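The leading term of these expansions can be checked against a case where $I(\lambda)$ is known in closed form. The sketch below is an illustration under stated assumptions, not part of the notes: it takes $n = 1$, $\phi(x) = x^2$ (so $T = 1$, $\sigma = 1$, $\Delta = 1$) and the Gaussian amplitude $a(x) = e^{-x^2}$, which is in $\mathcal{S}$ rather than $C_0^\infty$ and so falls under the hypotheses of Proposition 6.2. It uses the standard complex Gaussian integral $\int e^{-zx^2}dx = \sqrt{\pi/z}$ for $\mathrm{Re}\,z > 0$ (principal branch). The names `exact_I` and `leading_term` are hypothetical.

```python
import cmath
import math

def exact_I(lam):
    """Closed form of  int exp(-pi i lam x^2) exp(-x^2) dx  =  sqrt(pi / (1 + pi i lam))."""
    return cmath.sqrt(math.pi / (1 + 1j * math.pi * lam))

def leading_term(lam):
    """e^{-pi i sigma/4} |det T|^{-1/2} lam^{-n/2} a(0) with n = 1, T = 1, sigma = 1, a(0) = 1."""
    return cmath.exp(-1j * math.pi / 4) * lam ** (-0.5)

for lam in (10.0, 100.0, 1000.0):
    err = abs(exact_I(lam) - leading_term(lam))
    # With N = 0 the remainder is lam^{-n/2} * O(lam^{-1}), so err * lam^{3/2} should stay bounded.
    print(lam, err, err * lam ** 1.5)
```

In this example the rescaled error tends to $\frac{1}{2\pi}$, which is the size of the $j = 1$ term: expanding the closed form gives $1 + \frac{i}{2\pi\lambda} + \ldots$ inside the bracket.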


Proof We can assume that $\phi(p) = 0$; else we replace $\phi$ with $\phi - \phi(p)$. Choose a $C^\infty$ diffeomorphism $G$ by the Morse lemma and apply remark 3. Thus
\[ I(\lambda) = \int e^{-\pi i \lambda \langle Ty, y\rangle}\,a(Gy)\,|J_G(y)|\,dy, \]
where $T$ is a diagonal matrix with diagonal entries $\pm 1$ and with signature $\sigma$. Also $|J_G(0)| = \Delta^{-\frac{1}{2}}$ by Lemma 6.3 and an obvious calculation of the Hessian determinant of the function $y \to \langle Ty, y\rangle$. Let $D_j$ be associated to this $T$ as in Proposition 6.2 and let $b(y) = a(Gy)|J_G(y)|$. Then
\[ I(\lambda) = e^{-\frac{\pi i \sigma}{4}}\,\lambda^{-\frac{n}{2}} \Big( b(0) + \sum_{j=1}^{N} \lambda^{-j} D_j b(0) + O(\lambda^{-(N+1)}) \Big) \]
by Proposition 6.2. Now $b(0) = |J_G(0)|\,a(p) = \Delta^{-\frac{1}{2}} a(p)$, so we can write this as
\[ I(\lambda) = e^{-\frac{\pi i \sigma}{4}}\,\Delta^{-\frac{1}{2}}\,\lambda^{-\frac{n}{2}} \Big( a(p) + \sum_{j=1}^{N} \lambda^{-j} \Delta^{\frac{1}{2}} D_j b(0) + O(\lambda^{-(N+1)}) \Big). \]
Further, it is clear from the chain rule and product rule that any $2j$-th order derivative of $b$ at the origin can be expressed as a linear combination of derivatives of $a$ at $p$ of order $\leq 2j$ with coefficients depending on $G$, i.e., on $\phi$. Otherwise stated, the term $\Delta^{\frac{1}{2}} D_j b(0)$ can be expressed in the form $\tilde D_j a(p)$, where $\tilde D_j$ is a new differential operator of order $\leq 2j$ with coefficients depending on $\phi$. This gives the result.

In practice, it is often more useful to have estimates for $I(\lambda)$ instead of an asymptotic expansion. Clearly an estimate $|I(\lambda)| \leq C\lambda^{-\frac{n}{2}}$ could be derived from Proposition 6.4, but one also sometimes needs estimates for the derivatives of $I(\lambda)$ with respect to suitable parameters. For now we just consider the technically easiest case where the parameter is $\lambda$ itself.

Proposition 6.5 (i) Assume that $\nabla\phi(p) \neq 0$. Then for $a$ supported in a small neighborhood of $p$,
\[ \Big| \frac{d^j I(\lambda)}{d\lambda^j} \Big| \leq C_{jN} \lambda^{-N} \]
for any $N$.

(ii) Assume that $\nabla\phi(p) = 0$, and $H_\phi(p)$ is invertible. Then, for $a$ supported in a small neighborhood of $p$,
\[ \Big| \frac{d^k}{d\lambda^k} \big( e^{\pi i \lambda \phi(p)} I(\lambda) \big) \Big| \leq C_k \lambda^{-(\frac{n}{2} + k)}. \]

Proof We only prove (ii), since (i) follows easily from Proposition 6.1 after differentiating under the integral sign as in the proof below. For (ii) we need the following.

Claim. Let $\{\zeta_i\}_{i=1}^M$ be real valued smooth functions and assume that $\zeta_i(p) = 0$, $\nabla\zeta_i(p) = 0$. Let $\zeta = \prod_{i=1}^M \zeta_i$. Then all partial derivatives of $\zeta$ of order less than $2M$ also vanish at $p$.

Proof By the product rule any partial $D^\alpha \zeta$ is a linear combination of terms of the form
\[ \prod_{i=1}^{M} D^{\beta_i} \zeta_i \]
with $\sum_i \beta_i = \alpha$. If $|\alpha| < 2M$, then some $|\beta_i|$ must be less than 2, so by hypothesis all such terms vanish at $p$.

To prove the proposition, differentiate $I(\lambda)$ under the integral sign obtaining
\[ \frac{d^k}{d\lambda^k} \big( e^{\pi i \lambda \phi(p)} I(\lambda) \big) = (-\pi i)^k \int (\phi(x) - \phi(p))^k\,a(x)\,e^{-\pi i \lambda (\phi(x) - \phi(p))}\,dx. \]
Let $b(x) = (\phi(x) - \phi(p))^k a(x)$. By the above claim all partials of $b$ of order less than $2k$ vanish at $p$. Now look at the expansion in Proposition 6.4 replacing $a$ with $b$ and setting $N = k - 1$. By the claim the terms $D_j b(p)$ must vanish when $j < k$, as well as $b(p)$ itself. Hence Proposition 6.4 shows that $\frac{d^k}{d\lambda^k}(e^{\pi i \lambda \phi(p)} I(\lambda)) = O(\lambda^{-(\frac{n}{2} + k)})$ as claimed.

As an application we estimate the Fourier transform of the surface measure $\sigma$ on the sphere $S^{n-1} \subset \mathbb{R}^n$. For this and for other similar calculations one wants to work with an integral over a submanifold instead of over $\mathbb{R}^n$. This is not significantly different since it is always possible to work in local coordinates. However, things are easier if one uses the local coordinates as economically as possible. Recall then that if $\phi : \mathbb{R}^n \to \mathbb{R}$ is smooth and if $M$ is a $k$-dimensional submanifold, $p \in M$, and if $F : U \to M$ is a local coordinate (more precisely the inverse map to one) near $p$, then $\phi \circ F$ will have a critical point at $F^{-1}p$ if and only if $\nabla\phi(p)$ is orthogonal to the tangent space to $M$ at $p$; in particular this is independent of the choice of $F$.

Notice that $\hat\sigma$ is a radial function, because the surface measure is rotation invariant (exercise: prove this rigorously), and is smooth by Proposition 1.3. It therefore suffices to consider $\hat\sigma(\lambda e_n)$ where $e_n = (0, \ldots, 0, 1)$ and $\lambda > 0$. Put local coordinates on the sphere as follows: the first "local coordinate" is the map
\[ x \to \big( x, \sqrt{1 - |x|^2} \big), \qquad \mathbb{R}^{n-1} \supset D(0, \tfrac{1}{2}) \to S^{n-1}. \]
The second is the map
\[ x \to \big( x, -\sqrt{1 - |x|^2} \big), \qquad \mathbb{R}^{n-1} \supset D(0, \tfrac{1}{2}) \to S^{n-1}, \]
and the remaining ones map onto sets whose closures do not contain $\{\pm e_n\}$. Let $\{q_k\}$ be a suitable partition of unity subordinate to this covering by charts. Define $\phi(x) = e_n \cdot x$, $\phi : \mathbb{R}^n \to \mathbb{R}$. Thus the gradient of $\phi$ is $e_n$ and is normal to the sphere at $\pm e_n$ only.

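Now
\begin{align*}
\hat\sigma(\lambda e_n) &= \int e^{-2\pi i \lambda e_n \cdot x}\,d\sigma(x) \\
&= \sum_k \int e^{-2\pi i \lambda e_n \cdot x}\,q_k(x)\,d\sigma(x) \\
&= \int_{D(0, \frac{1}{2})} e^{-2\pi i \lambda \sqrt{1 - |x|^2}}\,\frac{q_1(x)}{\sqrt{1 - |x|^2}}\,dx + \int_{D(0, \frac{1}{2})} e^{2\pi i \lambda \sqrt{1 - |x|^2}}\,\frac{q_2(x)}{\sqrt{1 - |x|^2}}\,dx \\
&\quad + \sum_{k \geq 3} \int e^{-2\pi i \lambda \phi_k(x)}\,a_k(x)\,dx, \tag{63}
\end{align*}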
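where the $dx$ integrals are in $\mathbb{R}^{n-1}$, and the phase functions $\phi_k$ for $k \geq 3$ have no critical points in the support of $a_k$. The Hessian of $2\sqrt{1 - |x|^2}$ at the origin is $-2$ times the identity matrix, and in particular is invertible. It is also clear that the first and second terms are complex conjugates. We conclude from Proposition 6.5 that
\[ \hat\sigma(\lambda e_n) = \mathrm{Re}\big( a(\lambda) e^{2\pi i \lambda} \big) + y(\lambda) \]
with
\[ \Big| \frac{d^j a(\lambda)}{d\lambda^j} \Big| \leq C_j \lambda^{-(\frac{n-1}{2} + j)}, \tag{64} \]
\[ \Big| \frac{d^j y(\lambda)}{d\lambda^j} \Big| \leq C_{jN} \lambda^{-N} \tag{65} \]
for any $N$. In fact $\hat\sigma$ is real and even and therefore $y$ must be real valued. Multiplying $y$ by $e^{-2\pi i \lambda}$ does not affect the estimate (65), so we can absorb $y$ into $a$ and rewrite this as
\[ \hat\sigma(\lambda e_n) = \mathrm{Re}\big( a(\lambda) e^{2\pi i \lambda} \big), \]
where $a$ satisfies (64). Since $\hat\sigma$ is radial, we have proved the following.

Corollary 6.6 The function $\hat\sigma$ (is a $C^\infty$ function and) satisfies
\[ \hat\sigma(x) = \mathrm{Re}\big( a(|x|) e^{2\pi i |x|} \big), \tag{66} \]
where for large $r$
\[ \Big| \frac{d^j a}{dr^j} \Big| \leq C_j r^{-(\frac{n-1}{2} + j)}. \tag{67} \]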

Furthermore, looking at the first term in (63), and using the expansion of Proposition 6.4, with $N = 0$, we can obtain the leading behavior at $\infty$. Namely, for the first term in (63) at its critical point $x = 0$ we have a phase function $2\sqrt{1 - |x|^2}$ with $\phi(0) = 2$, $\Delta = 1$ and signature $-(n-1)$, and an amplitude $q_1(x)(1 - |x|^2)^{-1/2}$ which is 1 at the critical point. By Proposition 6.4 the integral is
\[ e^{-2\pi i \lambda}\,e^{\frac{\pi i}{4}(n-1)}\,\lambda^{-\frac{n-1}{2}} + O(\lambda^{-\frac{n+1}{2}}). \]
The second term is the complex conjugate and the others are $O(\lambda^{-N})$ for any $N$. Hence the quantity (63) is
\[ 2\lambda^{-\frac{n-1}{2}} \cos\Big( 2\pi\Big( \lambda - \frac{n-1}{8} \Big) \Big) + O(\lambda^{-\frac{n+1}{2}}) \]
and we have proved

Corollary 6.7 For large $|x|$
\[ \hat\sigma(x) = 2|x|^{-\frac{n-1}{2}} \cos\Big( 2\pi\Big( |x| - \frac{n-1}{8} \Big) \Big) + O(|x|^{-\frac{n+1}{2}}). \]
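Corollary 6.7 can be compared with direct computations in low dimensions. The sketch below is only an illustration and rests on two facts not derived in the notes: for $n = 3$ the transform of surface measure on $S^2$ has the closed form $\hat\sigma(x) = 2\sin(2\pi|x|)/|x|$ (integrate $e^{-2\pi i r t}$ over $t \in [-1, 1]$ with weight $2\pi$), and for $n = 2$ one has $\hat\sigma(x) = \int_0^{2\pi} \cos(2\pi |x| \cos\theta)\,d\theta$, which the trapezoid rule evaluates very accurately because the integrand is smooth and periodic. The function names are hypothetical.

```python
import math

def sigma_hat_circle(r, m=4096):
    """Transform of arc length measure on S^1 at |x| = r, via the trapezoid rule in the angle."""
    h = 2 * math.pi / m
    return h * sum(math.cos(2 * math.pi * r * math.cos(k * h)) for k in range(m))

def sigma_hat_sphere(r):
    """Closed form for surface measure on S^2 in R^3 at |x| = r."""
    return 2 * math.sin(2 * math.pi * r) / r

def leading(r, n):
    """Main term of Corollary 6.7: 2 r^{-(n-1)/2} cos(2 pi (r - (n-1)/8))."""
    return 2 * r ** (-(n - 1) / 2) * math.cos(2 * math.pi * (r - (n - 1) / 8))

for r in (5.3, 20.3, 50.7):
    err2 = abs(sigma_hat_circle(r) - leading(r, 2))
    err3 = abs(sigma_hat_sphere(r) - leading(r, 3))
    # n = 2: the error should scale like r^{-3/2}; n = 3: the main term is exact up to rounding.
    print(r, err2 * r ** 1.5, err3)
```

In dimension 3 the error term in Corollary 6.7 happens to vanish identically, while in dimension 2 the rescaled error stays bounded, consistent with the $O(|x|^{-\frac{n+1}{2}})$ remainder.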

Remarks Of course it is possible to consider surfaces other than the sphere, see for example [17], Theorem 7.7.14. The main point in regard to the latter is that the nondegeneracy of the critical points of the phase function which arises when calculating the Fourier transform is equivalent to nonzero Gaussian curvature, so a hypersurface with nonzero Gaussian curvature everywhere behaves essentially the same as the sphere, whereas if there are flat directions the decay becomes weaker.

Obtaining derivative bounds like Corollary 6.6 in the above manner requires a somewhat more complicated version of Proposition 6.5 with $\phi$ and $a$ depending on an auxiliary parameter $z$, which we now explain without giving the proofs. Suppose that $\phi(x, z)$ is a $C^\infty$ function of $x$ and $z$, where $x \in \mathbb{R}^n$, and $z \in \mathbb{R}^k$ should be regarded as a parameter. Assume that for a certain $p$ and $z_0$ we have $\nabla_x \phi(p, z_0) = 0$ and that the matrix of second $x$-partials of $\phi$ at $(p, z_0)$ is invertible.

1. Prove that there are neighborhoods $U$ of $z_0$ and $V$ of $p$ and a smooth function $\kappa : U \to V$ with the following property: if $z \in U$ and $x \in V$, then $\nabla_x \phi(x, z) = 0$ if and only if $x = \kappa(z)$.

2. Let $a(x, z)$ be $C_0^\infty$ and supported in a small enough neighborhood of $(p, z_0)$. Define
\[ I(\lambda, z) = \int e^{-\pi i \lambda \phi(x, z)}\,a(x, z)\,dx. \]
Prove the following:
\[ \Big| \frac{d^{j+k}\big( e^{\pi i \lambda \phi(\kappa(z), z)} I(\lambda, z) \big)}{d\lambda^j\,dz^k} \Big| \leq C_{jk} \lambda^{-(\frac{n}{2} + j)}. \]
This is the analogue of Proposition 6.5 for general parameters.

7. Restriction problem

We now suppose given a function $f : S^{n-1} \to \mathbb{C}$ and consider the Fourier transform
\[ \widehat{f d\sigma}(\xi) = \int_{S^{n-1}} f(x)\,e^{-2\pi i x \cdot \xi}\,d\sigma(x). \tag{68} \]
If $f$ is smooth, then one can use stationary phase to evaluate $\widehat{f d\sigma}$ to any desired degree of precision, just as with Corollary 6.6. In particular this leads to the bound
\[ |\widehat{f d\sigma}(\xi)| \leq C \|f\|_{C^2} (1 + |\xi|)^{-\frac{n-1}{2}}, \]
say, where $\|f\|_{C^2} = \sum_{0 \leq |\alpha| \leq 2} \|D^\alpha f\|_{L^\infty}$.

On the other hand there can be no similar decay estimate for functions $f$ which are just bounded. The reason for this is that then there is no distinguished reference point in Fourier space. Thus, if we let $f_k(x) = e^{2\pi i k \cdot x}$ and set $\xi = k$, we have $|\widehat{f_k d\sigma}(\xi)| = \sigma(S^{n-1}) \approx 1$. Taking a sum of the form $f = \sum_j j^{-2} f_{k_j}$, where $|k_j| \to \infty$ sufficiently rapidly, we obtain a continuous function $f$ such that there is no estimate $|\widehat{f d\sigma}(\xi)| \leq C(1 + |\xi|)^{-\epsilon}$ for any $\epsilon > 0$.

On the other hand, if we consider instead $L^q$ norms then the issue of a distinguished origin is no longer relevant. The following is a long standing open problem in the area.

Restriction conjecture (Stein) Prove that if $f \in L^\infty(S^{n-1})$ then
\[ \|\widehat{f d\sigma}\|_q \leq C_q \|f\|_\infty \tag{69} \]
for all $q > \frac{2n}{n-1}$.
The example of a constant function shows that the regime $q > \frac{2n}{n-1}$ would be best possible. Namely, Corollary 6.7 implies that $\hat\sigma \in L^q$ if and only if $q \cdot \frac{n-1}{2} > n$.

The corresponding problem for $L^2$ densities $f$ was solved in the 1970's:

Theorem 7.1 (P. Tomas-Stein) If $f \in L^2(S^{n-1})$ then
\[ \|\widehat{f d\sigma}\|_q \leq C \|f\|_{L^2(S^{n-1})} \tag{70} \]
for $q \geq \frac{2n+2}{n-1}$, and this range of $q$ is best possible.

Remarks 1. Notice that the assumptions on $q$ in (70) and (69) are of the form $q > q_0$ or $q \ge q_0$. The reason for this is that there is an obvious estimate
\[ \|\widehat{f d\sigma}\|_\infty \le \|f\|_1 \]
by Proposition 1.1, and it follows by the Riesz-Thorin theorem that if (69) or (70) holds for a given $q$, then it also holds for any larger $q$.

2. The restriction conjecture (69) is known to be true when $n = 2$; this is due to C. Fefferman and Stein, early 1970s. See [12] and [33].

3. Of course there is a difference in the $L^q$ exponent in (70) and the one which is conjectured for $L^\infty$ densities. Until fairly recently it was unknown (in three or more dimensions) whether the estimate (69) was true even for some $q$ less than the Stein-Tomas exponent $\frac{2n+2}{n-1}$. This was first shown by Bourgain [3], a paper which has been the starting point for a lot of recent work.

4. The fact that $q \ge \frac{2n+2}{n-1}$ is best possible for (70) is due I believe to A. Knapp. We now discuss the construction. Notice that in order to distinguish between $L^2$ and $L^\infty$ norms, one should use a function $f$ which is highly localized. Next, in view of the nice behavior of rectangles under the Fourier transform discussed e.g. in our section 5, it is natural to take the support of $f$ to be the intersection of $S^{n-1}$ with a small rectangle.

Now we set up the proof. Let
\[ C_\delta = \{x \in S^{n-1} : 1 - x\cdot e_n \le \delta^2\}, \]
where $e_n = (0, \ldots, 0, 1)$. Since $|x - e_n|^2 = 2(1 - x\cdot e_n)$, it is easy to show that
\[ |x - e_n| \le C^{-1}\delta \;\Rightarrow\; x \in C_\delta \;\Rightarrow\; |x - e_n| \le C\delta \tag{71} \]
for an appropriate constant $C$. Now let $f = f_\delta$ be the indicator function of $C_\delta$. We calculate $\|f\|_{L^2(S^{n-1})}$ and $\|\widehat{f d\sigma}\|_q$. All constants are of course independent of $\delta$. In the first place, $\|f\|_2$ is the square root of the measure of $C_\delta$, so by (71) and the dimensionality of the sphere we have
\[ \|f\|_2 \approx \delta^{\frac{n-1}{2}}. \tag{72} \]
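As a quick check of (71) and (72), the identity quoted above can be spelled out as follows. This is only a sketch: the explicit constant $\sqrt{2}$ is worked out here for illustration and is not stated in the notes.

```latex
% For x on the unit sphere we have |x| = |e_n| = 1, so
\begin{align*}
|x - e_n|^2 = |x|^2 - 2\,x\cdot e_n + |e_n|^2 = 2\,(1 - x\cdot e_n).
\end{align*}
% Hence membership in the cap is a condition on the distance to e_n:
\begin{align*}
x \in C_\delta
\iff 1 - x\cdot e_n \le \delta^2
\iff |x - e_n| \le \sqrt{2}\,\delta ,
\end{align*}
% so (71) holds with C = \sqrt{2}. The cap has diameter comparable to \delta
% inside the (n-1)-dimensional sphere, which gives
\begin{align*}
\sigma(C_\delta) \approx \delta^{\,n-1},
\qquad
\|f_\delta\|_{L^2(S^{n-1})} = \sigma(C_\delta)^{1/2} \approx \delta^{\frac{n-1}{2}} .
\end{align*}
```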

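The support of $f d\sigma$ is contained in the rectangle centered at $e_n$ with length about $\delta^2$ in the $e_n$ direction and length about $\delta$ in the orthogonal directions. We look at $\widehat{f d\sigma}$ on the dual rectangle centered at 0. Suppose then that $|\xi_n| \le C_1^{-1}\delta^{-2}$ and that $|\xi_j| \le C_1^{-1}\delta^{-1}$ when $j < n$; here $C_1$ is a large constant. Then
\begin{align*}
|\widehat{f d\sigma}(\xi)| &= \Big| \int_{C_\delta} e^{-2\pi i x\cdot\xi}\, d\sigma(x) \Big| \\
&= \Big| \int_{C_\delta} e^{-2\pi i (x - e_n)\cdot\xi}\, d\sigma(x) \Big| \\
&\ge \int_{C_\delta} \cos(2\pi (x - e_n)\cdot\xi)\, d\sigma(x).
\end{align*}
Our conditions on $\xi$ imply, if $C_1$ is large enough, that $|2\pi(x - e_n)\cdot\xi| \le \frac{\pi}{3}$, say, for all $x \in C_\delta$. Accordingly,
\[ |\widehat{f d\sigma}(\xi)| \ge \tfrac12\, \sigma(C_\delta) \approx \delta^{n-1}. \]
Our set of $\xi$ has volume about $\delta^{-(n+1)}$, so we conclude that
\[ \|\widehat{f d\sigma}\|_q \gtrsim \delta^{\,n-1-\frac{n+1}{q}}. \]
Comparing this estimate with (72) we find that if (70) holds then
\[ \delta^{\,n-1-\frac{n+1}{q}} \lesssim \delta^{\frac{n-1}{2}} \]
uniformly in $\delta \in (0,1]$. Hence $n - 1 - \frac{n+1}{q} \ge \frac{n-1}{2}$, i.e. $q \ge \frac{2n+2}{n-1}$.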
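The last step is elementary algebra; the following worked instance records it together with the numerical values in low dimensions. The specific dimensions $n = 2, 3$ are chosen here purely for illustration.

```latex
% Rearranging the exponent inequality from the Knapp example:
\begin{align*}
n - 1 - \frac{n+1}{q} \ge \frac{n-1}{2}
\iff \frac{n-1}{2} \ge \frac{n+1}{q}
\iff q \ge \frac{2(n+1)}{n-1}.
\end{align*}
% Comparison of the Stein-Tomas exponent with the conjectured L^\infty exponent:
\begin{align*}
n = 2:&\quad \frac{2n+2}{n-1} = 6, \qquad \frac{2n}{n-1} = 4, \\
n = 3:&\quad \frac{2n+2}{n-1} = 4, \qquad \frac{2n}{n-1} = 3.
\end{align*}
```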

For future reference we record the following variant on the above example: if $f$ is as above and $g = e^{2\pi i x\cdot\eta} f\circ T$, where $\eta \in \mathbb{R}^n$ and $T$ is a rotation mapping $e_n$ to $v \in S^{n-1}$, then $g$ is supported on
\[ \{x \in S^{n-1} : 1 - x\cdot v \le \delta^2\}, \]
and $|\widehat{g d\sigma}| \gtrsim \delta^{n-1}$ on a cylinder of length $C_1^{-1}\delta^{-2}$ and cross-section radius $C_1^{-1}\delta^{-1}$, centered at $\eta$ and with the axis parallel to $v$.

Before giving the proof of Theorem 7.1 we need to discuss convolution of a Schwartz function with a measure, since this wasn't previously considered. Let $\mu \in M(\mathbb{R}^n)$; assume $\mu$ has compact support for simplicity, although this assumption is not really needed. Define
\[ \varphi * \mu(x) = \int \varphi(x - y)\, d\mu(y). \]

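Observe that $\varphi * \mu$ is $C^\infty$, since differentiation under the integral sign is justified as in Lemma 3.1.

It is convenient to use the notation $\tilde\mu$ for $\mu(-x)$. We need to extend some of our formulas to the present context. In particular the following extends (28), since if $\mu \in \mathcal{S}$ then the Fourier transform of $\hat\mu$ is $\tilde\mu$ by Theorem 3.4:
\[ \widehat{\varphi\hat\mu} = \hat\varphi * \tilde\mu \quad \text{when } \varphi \in \mathcal{S}, \tag{73} \]
\[ \widehat{\varphi * \mu} = \hat\varphi\,\hat\mu \quad \text{when } \varphi \in \mathcal{S}. \tag{74} \]

Notice that (73) can be interpreted naively: Proposition 1.3 and the product rule imply that $\varphi\hat\mu$ is a Schwartz function. To prove (73), by uniqueness of distributional Fourier transforms it suffices to show that
\[ \int \hat\psi\, \varphi\, \hat\mu\, dx = \int \psi\, (\hat\varphi * \tilde\mu)\, dx \]
if $\psi$ is another Schwartz function. This is done as follows. Denote $Tx = -x$; then
\begin{align*}
\int \hat\psi(x)\varphi(x)\hat\mu(x)\, dx &= \int \widehat{(\hat\psi\varphi)}\, d\mu \quad \text{by the duality relation} \\
&= \int \big((\psi\circ T) * \hat\varphi\big)\, d\mu \\
&= \int\!\!\int \psi(y)\, \hat\varphi(x + y)\, dy\, d\mu(x) \\
&= \int \psi(y)\, (\hat\varphi * \tilde\mu)(y)\, dy.
\end{align*}

For (74), again let $\psi$ be another Schwartz function. Then
\begin{align*}
\int \psi\, \hat\varphi\, \hat\mu\, dx &= \int \widehat{(\psi\hat\varphi)}\, d\mu \quad \text{by the duality relation} \\
&= \int (\hat\psi * \tilde\varphi)\, d\mu \\
&= \int \hat\psi\, (\varphi * \mu)\, dx.
\end{align*}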

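The last line may be seen by writing out the definition of the convolution and using Fubini's theorem. Since this worked for all $\psi \in \mathcal{S}$, we get (74).

Lemma 7.2 Let $f, g \in \mathcal{S}$, and let $\mu$ be a (say) compactly supported measure. Then
\[ \int \hat f\, \overline{\hat g}\, d\mu = \int (\hat\mu * \bar g)\cdot f\, dx. \tag{75} \]

Proof Recall that $\overline{\hat g} = \widehat{\tilde{\bar g}}$, where $\tilde h(x) = h(-x)$, so that
\[ \widehat{\overline{\hat g}} = \widehat{\widehat{\tilde{\bar g}}} = \bar g \]
by the inversion theorem. Now apply the duality relation and (74), obtaining
\begin{align*}
\int \hat f\, \overline{\hat g}\, d\mu &= \int f\cdot \widehat{(\overline{\hat g}\,\mu)}\, dx \\
&= \int f\cdot (\bar g * \hat\mu)\, dx
\end{align*}
as claimed.

Lemma 7.3 Let $\mu$ be a finite positive measure. The following are equivalent for any $q$ and any $C$.

1. $\|\widehat{f d\mu}\|_q \le C\|f\|_2$, $f \in L^2(d\mu)$.
2. $\|\hat g\|_{L^2(d\mu)} \le C\|g\|_{q'}$, $g \in \mathcal{S}$.
3. $\|\hat\mu * f\|_q \le C^2\|f\|_{q'}$, $f \in \mathcal{S}$.

Proof Let $g \in \mathcal{S}$, $f \in L^2(d\mu)$. By the duality relation
\[ \int \hat g\, f\, d\mu = \int \widehat{f d\mu}\cdot g\, dx. \tag{76} \]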

If 1. holds, then the right side of (76) is $\le \|g\|_{q'}\|\widehat{f d\mu}\|_q \le C\|g\|_{q'}\|f\|_{L^2(d\mu)}$ for any $f \in L^2(d\mu)$. Hence so is the left side. This proves 2. by duality. If 2. holds then the left side is $\le \|\hat g\|_{L^2(d\mu)}\|f\|_{L^2(d\mu)} \le C\|g\|_{q'}\|f\|_{L^2(d\mu)}$ for $g \in \mathcal{S}$. Hence so is the right side. Since $\mathcal{S}$ is dense in $L^{q'}$, this proves 1. by duality.

If 3. holds, then the right side of (75) is $\le C^2\|f\|_{q'}^2$ when $f = g \in \mathcal{S}$. Hence so is the left side, which proves 2. If 2. holds then, for any $f, g \in \mathcal{S}$, using also the Schwartz inequality the left side of (75) is $\le C^2\|f\|_{q'}\|g\|_{q'}$. Hence the right side of (75) is also $\le C^2\|f\|_{q'}\|g\|_{q'}$, which proves 3. by duality.

Remark One can fit Lemma 7.3 into the abstract setup
\[ T : L^2 \to L^q \iff T^* : L^{q'} \to L^2 \iff TT^* : L^{q'} \to L^q. \]
This is the standard way to think about the lemma, although it is technically a bit easier to present the proof in the above ad hoc manner. Namely, if $T$ is the operator $f \to \widehat{f d\mu}$, where we regard $f$ as being defined on the measure space associated to $\mu$, then $T^*$ is the operator $f \to \hat f$, and $TT^*$ is convolution with $\hat\mu$.

Proof of Theorem 7.1 We will not give a complete proof; we only prove (70) when $q > \frac{2n+2}{n-1}$ instead of $\ge$. For the endpoint, see for example [35], [9], [32], [33].

We will show that if $q > \frac{2n+2}{n-1}$, then
\[ \|\hat\sigma * f\|_q \le C_q \|f\|_{q'}, \tag{77} \]

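which suffices by Lemma 7.3. The relevant properties of $\sigma$ will be
\[ \sigma(D(x,r)) \lesssim r^{n-1}, \tag{78} \]
which reflects the $(n-1)$-dimensionality of the sphere, and the bound
\[ |\hat\sigma(\xi)| \lesssim (1+|\xi|)^{-\frac{n-1}{2}} \tag{79} \]
from Corollary 6.6.

Let $\varphi$ be a $C^\infty$ function with the following properties:
\[ \operatorname{supp}\varphi \subset \{x : \tfrac14 \le |x| \le 1\}, \]
\[ \text{if } |x| \ge 1 \text{ then } \sum_{j \ge 0} \varphi(2^{-j}x) = 1. \]
Such a function may be obtained as follows: let $\chi$ be a $C^\infty$ function which is equal to 1 when $|x| \ge 1$ and to 0 when $|x| \le \frac12$, and let $\varphi(x) = \chi(2x) - \chi(x)$.

We now cut up $\hat\sigma$ as follows: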

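\[ \hat\sigma = K_{-\infty} + \sum_{j=0}^{\infty} K_j, \]
where
\[ K_j(x) = \varphi(2^{-j}x)\,\hat\sigma(x), \qquad K_{-\infty}(x) = \Big(1 - \sum_{j=0}^{\infty}\varphi(2^{-j}x)\Big)\hat\sigma(x). \]
Then $K_{-\infty}$ is a $C_0^\infty$ function, so
\[ \|K_{-\infty} * f\|_q \lesssim \|f\|_p \]
by Young's inequality, provided $q \ge p$. In particular, since $q > 2$ we may take $p = q'$.

We now consider the terms in the sum. The logic will be that we estimate convolution with $K_j$ as an operator from $L^1$ to $L^\infty$ and from $L^2$ to $L^2$, and then we use Riesz-Thorin. We have
\[ \|K_j\|_\infty \lesssim 2^{-j\frac{n-1}{2}} \]
by (79). Using the trivial bound $\|K_j * f\|_\infty \le \|K_j\|_\infty\|f\|_1$ we conclude our $L^1 \to L^\infty$ bound,
\[ \|K_j * f\|_\infty \lesssim 2^{-j\frac{n-1}{2}}\|f\|_1. \tag{80} \]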

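On the other hand, we can use (78) to estimate $\hat K_j$. Namely, let $\psi = \hat\varphi$. Note also that $\psi = \check\varphi$, since $\varphi$ and therefore $\psi$ are invariant under the reflection $x \to -x$. Accordingly, we have
\[ \hat K_j = \psi_{2^{-j}} * \sigma, \qquad \psi_{2^{-j}}(\xi) = 2^{jn}\psi(2^j\xi), \]
using (73) and the fact that $\tilde\sigma = \sigma$. Since $\psi \in \mathcal{S}$, it follows that
\[ |\hat K_j(\xi)| \le C_N 2^{jn} \int (1 + 2^j|\xi - \eta|)^{-N}\, d\sigma(\eta) \]
for any fixed $N < \infty$. Therefore
\begin{align*}
|\hat K_j(\xi)| &\le C_N 2^{jn}\Big( \int_{D(\xi,2^{-j})} (1 + 2^j|\xi-\eta|)^{-N}\, d\sigma(\eta) + \sum_{k \ge 0} \int_{D(\xi,2^{k+1-j})\setminus D(\xi,2^{k-j})} (1 + 2^j|\xi-\eta|)^{-N}\, d\sigma(\eta) \Big) \\
&\le C_N 2^{jn}\Big( \sigma(D(\xi,2^{-j})) + \sum_{k \ge 0} 2^{-Nk}\, \sigma\big(D(\xi,2^{k+1-j})\setminus D(\xi,2^{k-j})\big) \Big) \\
&\lesssim 2^{jn}\Big( 2^{-j(n-1)} + \sum_{k \ge 0} 2^{-Nk}\, 2^{(n-1)(k-j)} \Big) \\
&\lesssim 2^j,
\end{align*}
where we used (78) at the next to last line, and at the last line we fixed $N$ to be equal to $n$ and summed a geometric series. Thus
\[ \|\hat K_j\|_\infty \lesssim 2^j. \tag{81} \]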

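Now we mention the trivial but important fact that
\[ \|K * f\|_2 \le \|\hat K\|_\infty \|f\|_2 \tag{82} \]
if say $K$ and $f$ are in $\mathcal{S}$. This follows since
\begin{align*}
\|K * f\|_2 &= \|\widehat{K * f}\|_2 = \|\hat K \hat f\|_2 \\
&\le \|\hat K\|_\infty \|\hat f\|_2 = \|\hat K\|_\infty \|f\|_2.
\end{align*}
Combining (81) and (82) we conclude
\[ \|K_j * f\|_2 \lesssim 2^j \|f\|_2. \tag{83} \]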

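Accordingly, by (80), (83) and Riesz-Thorin we have
\[ \|K_j * f\|_q \lesssim 2^{j\theta}\, 2^{-j\frac{n-1}{2}(1-\theta)} \|f\|_{q'} \]
if $\frac{\theta}{2} = \frac1q$. This works out to
\[ \|K_j * f\|_q \lesssim 2^{j\left(\frac{n+1}{q} - \frac{n-1}{2}\right)} \|f\|_{q'} \tag{84} \]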


for any $q \in [2, \infty]$. If $q > \frac{2n+2}{n-1}$ then the exponent $\frac{n+1}{q} - \frac{n-1}{2}$ is negative, so we conclude that
\[ \sum_j \|K_j * f\|_q \lesssim \|f\|_{q'}. \]
Since $f$ is a Schwartz function, the sum
\[ K_{-\infty} * f + \sum_j K_j * f \]
is easily seen to converge pointwise to $\hat\sigma * f$. We conclude using Fatou's lemma that $\|\hat\sigma * f\|_q \lesssim \|f\|_{q'}$, as claimed.

Further remarks 1. Notice that the $L^2$ estimate in the preceding argument was based only on dimensionality considerations. This suggests that there should be an $L^2$ bound for $\widehat{f d\sigma}$ valid under very general conditions.

Theorem 7.4 Let $\nu$ be a positive finite measure satisfying the estimate
\[ \nu(D(x,r)) \le C r^\alpha. \tag{85} \]
Then there is a bound
\[ \|\widehat{f d\nu}\|_{L^2(D(0,R))} \le C R^{\frac{n-\alpha}{2}} \|f\|_{L^2(d\nu)}. \tag{86} \]

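The proof uses the following "generic" test for $L^2$ boundedness.

Lemma 7.5 (Schur's test) Let $(X, \mu)$ and $(Y, \nu)$ be measure spaces, and let $K(x,y)$ be a measurable function on $X \times Y$ with
\[ \int_X |K(x,y)|\, d\mu(x) \le A \quad \text{for each } y, \tag{87} \]
\[ \int_Y |K(x,y)|\, d\nu(y) \le B \quad \text{for each } x. \tag{88} \]
Define $T_K f(x) = \int K(x,y) f(y)\, d\nu(y)$. Then for $f \in L^2(d\nu)$ the integral defining $T_K f$ converges a.e. $(d\mu(x))$ and there is an estimate
\[ \|T_K f\|_{L^2(d\mu)} \le \sqrt{AB}\, \|f\|_{L^2(d\nu)}. \tag{89} \]

Proof It is possible to use Riesz-Thorin here, since (88) implies $\|T_K f\|_\infty \le B\|f\|_\infty$ and (87) implies $\|T_K f\|_1 \le A\|f\|_1$. A more "elementary" argument goes as follows. If $a$ and $b$ are positive numbers then we have
\[ \sqrt{ab} = \min_{\epsilon \in (0,\infty)} \tfrac12(\epsilon a + \epsilon^{-1} b), \tag{90} \]
since $\le$ is the arithmetic-geometric mean inequality and $\ge$ follows by taking $\epsilon = \sqrt{b/a}$.

To prove (89) it suffices to show that if $\|f\|_{L^2(d\mu)} \le 1$, $\|g\|_{L^2(d\nu)} \le 1$, then
\[ \int\!\!\int |K(x,y)||f(x)||g(y)|\, d\mu(x)\, d\nu(y) \le \sqrt{AB}. \tag{91} \]
To show (91), we estimate
\begin{align*}
\int\!\!\int |K(x,y)||f(x)||g(y)|\, d\mu(x)\, d\nu(y)
&\le \tfrac12 \min_\epsilon \Big( \epsilon \int\!\!\int |K(x,y)||g(y)|^2\, d\mu(x)\, d\nu(y) + \epsilon^{-1} \int\!\!\int |K(x,y)||f(x)|^2\, d\nu(y)\, d\mu(x) \Big) \\
&\le \tfrac12 \min_\epsilon \Big( \epsilon A \int |g(y)|^2\, d\nu(y) + \epsilon^{-1} B \int |f(x)|^2\, d\mu(x) \Big) \\
&\le \tfrac12 \min_\epsilon (\epsilon A + \epsilon^{-1} B) \\
&= \sqrt{AB}.
\end{align*}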
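To see Schur's test in action, here is a standard special case, written out as a sketch. The choice of a translation-invariant kernel $k(x-y)$ with $k \in L^1(\mathbb{R}^n)$ is an illustrative assumption, not part of the notes.

```latex
% Take X = Y = R^n with Lebesgue measure and K(x,y) = k(x - y), k in L^1.
% Both hypotheses (87) and (88) hold with the same constant:
\begin{align*}
\int |k(x-y)|\,dx = \|k\|_1 \ \text{ for each } y,
\qquad
\int |k(x-y)|\,dy = \|k\|_1 \ \text{ for each } x .
\end{align*}
% So A = B = \|k\|_1, and (89) becomes the p = 2 case of Young's inequality:
\begin{align*}
\|k * f\|_2 \le \sqrt{\|k\|_1 \cdot \|k\|_1}\;\|f\|_2 = \|k\|_1\,\|f\|_2 .
\end{align*}
```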

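To prove Theorem 7.4, let $\varphi$ be an even Schwartz function which is $\ge 1$ on the unit disc and whose Fourier transform has compact support. (Exercise: show that such a function exists.) In the usual way define $\varphi_{R^{-1}}(x) = \varphi(R^{-1}x)$. Then
\begin{align*}
\|\widehat{f d\nu}\|_{L^2(D(0,R))} &\le \|\varphi_{R^{-1}}(x)\, \widehat{f d\nu}(x)\|_{L^2(dx)} \\
&= \|\widehat{\varphi_{R^{-1}}} * (f d\nu)^{\sim}\|_2
\end{align*}
by (73) and Plancherel. Since $\varphi$ is even, the last line is the $L^2(dx)$ norm of the function
\[ \int R^n \hat\varphi(R(x-y))\, f(y)\, d\nu(y). \]
We have estimates
\[ \int |R^n \hat\varphi(R(x-y))|\, dx = \|\hat\varphi\|_1 < \infty \]
for each fixed $y$, by change of variables, and
\[ \int |R^n \hat\varphi(R(x-y))|\, d\nu(y) \lesssim R^{n-\alpha} \]
for each fixed $x$, by (85) and the compact support of $\hat\varphi$. By Lemma 7.5
\[ \Big\| \int R^n \hat\varphi(R(x-y))\, f(y)\, d\nu(y) \Big\|_{L^2(dx)} \lesssim R^{\frac{n-\alpha}{2}} \|f\|_{L^2(d\nu)}, \]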

and the proof is complete.

2. Another remark is that it is possible to base the whole proof of Theorem 7.1 on the stationary phase asymptotics in section 6, instead of explicitly using the dimensionality of $\sigma$. This sort of argument has the obvious advantage that it is more flexible, since it works also in other situations where the convolution kernel $\hat\sigma(x-y)$ is replaced by a kernel $K(x,y)$ satisfying appropriate conditions. See for example [29], [32], [33]. On the other hand, it is more complicated and is not as relevant in connection with more delicate questions such as the restriction conjecture, which is known to be false in most of the more general situations (see [5], [26], [30]). We give a brief sketch omitting details.

The basic result is the so-called "variable coefficient Plancherel theorem", due to Hörmander [16]. Let $\phi$ be a real valued $C^\infty$ function defined on $\mathbb{R}^n \times \mathbb{R}^n$, let $a \in C_0^\infty(\mathbb{R}^n \times \mathbb{R}^n)$ and consider the oscillatory integral operators
\[ T_\lambda f(x) = \int e^{-\pi i \lambda\phi(x,y)} a(x,y) f(y)\, dy. \tag{92} \]
Since $a$ has compact support, it is obvious that these map $L^2(\mathbb{R}^n)$ to $L^2(\mathbb{R}^n)$ with a norm bound independent of $\lambda$, but we want to show that the norm decays in a suitable way as $\lambda \to \infty$. As with the oscillatory integrals of section 6, this will not be the case if $\phi$ is too degenerate. In the present situation, note that if $\phi$ depends on $x$ only, then the factor $e^{-\pi i\lambda\phi}$ in (92) may be taken outside the integral sign, so the norm is independent of $\lambda$. Similarly, if $\phi$ depends on $y$ only, then the factor $e^{-\pi i\lambda\phi}$ may be incorporated into $f$. We conclude in fact that if $\phi(x,y) = a(x) + b(y)$, then $\|T_\lambda\|_{L^2 \to L^2}$ is independent of $\lambda$. This strongly suggests that the appropriate nondegeneracy condition should involve the "mixed Hessian"
\[ H_\phi = \Big( \frac{\partial^2\phi}{\partial x_i \partial y_j} \Big)_{i,j=1}^{n}, \]
since the mixed Hessian vanishes identically if $\phi(x,y) = a(x) + b(y)$.

Theorem A (Hörmander) Assume that
\[ \det(H_\phi(x,y)) \ne 0 \]
at all points $(x,y) \in \operatorname{supp} a$. Then
\[ \|T_\lambda\|_{L^2 \to L^2} \le C\lambda^{-\frac{n}{2}}. \]

Sketch of proof This is evidently related to stationary phase, but one cannot apply stationary phase directly to the integral (92), since $f$ isn't smooth. Instead, one looks at $T_\lambda T_\lambda^*$, which is an integral operator $T_K$ with kernel
\[ K(x,y) = \int e^{-\pi i\lambda(\phi(x,z) - \phi(y,z))} a(x,z)\overline{a(y,z)}\, dz. \tag{93} \]

The assumption about the mixed Hessian guarantees that the phase function in (93) has no critical points if $x$ and $y$ are close together. Using a version (footnote 3) of nonstationary phase one can obtain the estimate
\[ \forall N\ \exists C_N : \quad |K(x,y)| \le C_N (1 + \lambda|x-y|)^{-N}, \]
provided $|x-y|$ is less than a suitable constant. It follows that if $a$ has small support then
\[ \int |K(x,y)|\, dy \lesssim \lambda^{-n} \]
for each fixed $x$, and similarly
\[ \int |K(x,y)|\, dx \lesssim \lambda^{-n} \]
for each $y$. Then Schur's test shows that $\|T_\lambda T_\lambda^*\|_{L^2 \to L^2} \lesssim \lambda^{-n}$, so $\|T_\lambda\|_{L^2 \to L^2} \lesssim \lambda^{-\frac{n}{2}}$. The small support assumption on $a$ can then be removed using a partition of unity.

It is possible to generalize this to the case where the rank of $H_\phi$ is $\ge k$, where $k \in \{1, \ldots, n\}$; just replace the exponent $\frac{n}{2}$ by $\frac{k}{2}$. Furthermore, the compact support assumption on $a$ may be replaced by "proper support" (see the statement below), and finally one can obtain $L^{q'} \to L^q$ estimates by interpolating with the trivial $\|T_\lambda\|_{L^1 \to L^\infty} \lesssim 1$. Here then is the "variable coefficient Plancherel", souped up in a manner which makes it applicable in connection with Stein-Tomas. See the references mentioned above.

Theorem B (Hörmander) Assume that $a$ is a $C^\infty$ function supported on the set $\{(x,y) \in \mathbb{R}^n \times \mathbb{R}^n : |x-y| \le C\}$, all of whose partial derivatives are bounded. Let $\phi$ be a real valued $C^\infty$ function defined on a neighborhood of $\operatorname{supp} a$, all of whose partial derivatives are bounded, and such that the rank of $H_\phi(x,y)$ is at least $k$ everywhere. Assume furthermore that the sum of the absolute values of the determinants of the $k$ by $k$ minors of $H_\phi$ is bounded away from zero. Then there is a bound
\[ \|T_\lambda\|_{L^{q'} \to L^q} \le C\lambda^{-\frac{k}{q}} \]
when $2 \le q \le \infty$.
Now look back at the proof we gave for Theorem 7.1. The main point was to obtain the bound (84). Now, $K_j(x)$ is the real part of
\[ \tilde K_j(x) \overset{\mathrm{def}}{=} \varphi(2^{-j}x)\, a(|x|)\, e^{2\pi i|x|}, \]
where $a$ satisfies the estimates in Corollary 6.6. Accordingly, it suffices to prove (84) with $K_j$ replaced by $\tilde K_j$. Let $T_j$ be convolution with $\tilde K_j$, and rescale by $2^j$; thus we consider the operator $S_j$ given by
\[ S_j f(x) = \big[T_j\big(f(2^{-j}\,\cdot)\big)\big](2^j x). \]

Footnote 3. One needs something a bit more quantitative than our Proposition 6.1; the necessary lemma is best proved by integration by parts. See for example [32].

This is an integral operator $S_j$ whose kernel is
\[ 2^{nj}\varphi(x-y)\, a(2^j|x-y|)\, e^{2\pi i 2^j|x-y|}. \]
We want to apply Theorem B to $S_j$; toward this end we make the following observations.

(i) From the estimates in Corollary 6.6, we see that the functions
\[ 2^{\frac{n-1}{2}j}\varphi(x-y)\, a(2^j|x-y|) \]
have derivative bounds which are independent of $j$, and clearly they are supported in $\frac14 \le |x-y| \le 1$.

(ii) The mixed Hessian of the function $|x-y|$ has rank $n-1$. This is a calculation which we leave to the reader, just noting that the exceptional direction corresponds to the direction along the line segment $xy$.

It follows that the operators $2^{-\frac{n+1}{2}j}S_j$ satisfy the hypotheses of Theorem B with $k = n-1$, uniformly in $j$, if we take $\lambda = 2^{j+1}$. Accordingly,
\[ \|S_j f\|_q \lesssim 2^{\left(\frac{n+1}{2} - \frac{n-1}{q}\right)j}\|f\|_{q'}, \]
and therefore using change of variables
\[ 2^{-\frac{n}{q}j}\|T_j f\|_q \lesssim 2^{\left(\frac{n+1}{2} - \frac{n-1}{q}\right)j}\, 2^{-\frac{n}{q'}j}\|f\|_{q'}, \]
i.e.
\[ \|T_j f\|_q \lesssim 2^{\left(\frac{n+1}{q} - \frac{n-1}{2}\right)j}\|f\|_{q'}, \]
which is (84).
Exercise: Use Theorem A for an appropriate phase function, and a rescaling argument of the preceding type, to prove the bound f
2

C f

2.

This explains the name variable coecient Plancherel theorem.

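8. Hausdorff measures

Fix $\alpha > 0$, and let $E \subset \mathbb{R}^n$. For $\epsilon > 0$, one defines
\[ H_\alpha^\epsilon(E) = \inf\Big( \sum_{j=1}^{\infty} r_j^\alpha \Big), \]
where the infimum is taken over all countable coverings of $E$ by discs $D(x_j, r_j)$ with $r_j < \epsilon$. It is clear that $H_\alpha^\epsilon(E)$ increases as $\epsilon$ decreases, and we define
\[ H_\alpha(E) = \lim_{\epsilon \to 0} H_\alpha^\epsilon(E). \]
It is also clear that $H_\alpha^\epsilon(E) \le H_\beta^\epsilon(E)$ if $\alpha > \beta$ and $\epsilon \le 1$; thus
\[ H_\alpha(E) \text{ is a nonincreasing function of } \alpha. \tag{94} \]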

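Remarks 1. If $H_\alpha^1(E) = 0$, then $H_\alpha(E) = 0$. This follows readily from the definition, since a covering showing that $H_\alpha^1(E) < \delta$ will necessarily consist of discs of radius $< \delta^{1/\alpha}$.

2. It is also clear that $H_\alpha(E) = 0$ for all $E$ if $\alpha > n$, since one can then cover $\mathbb{R}^n$ by discs $D(x_j, r_j)$ with $\sum_j r_j^\alpha$ arbitrarily small.

Lemma 8.1 There is a unique number $\alpha_0$, called the Hausdorff dimension of $E$ or $\dim E$, such that $H_\alpha(E) = \infty$ if $\alpha < \alpha_0$ and $H_\alpha(E) = 0$ if $\alpha > \alpha_0$.

Proof Define $\alpha_0$ to be the supremum of all $\alpha$ such that $H_\alpha(E) = \infty$. Thus $H_\alpha(E) = \infty$ if $\alpha < \alpha_0$, by (94). Suppose $\alpha > \alpha_0$. Let $\beta \in (\alpha_0, \alpha)$. Define $M = 1 + H_\beta(E) < \infty$. If $\epsilon > 0$, then we have a covering by discs with $\sum_j r_j^\beta \le M$ and $r_j < \epsilon$. So
\[ \sum_j r_j^\alpha \le \epsilon^{\alpha-\beta}\sum_j r_j^\beta \le \epsilon^{\alpha-\beta}M, \]
which goes to 0 as $\epsilon \to 0$. Thus $H_\alpha(E) = 0$.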

Further remarks 1. The set function $H_\alpha$ may be seen to be countably additive on Borel sets, i.e. defines a Borel measure. See standard references in the area like [6], [10], [25]. This is part of the reason one considers $H_\alpha$ instead of, say, $H_\alpha^1$. Notice in this connection that if $E$ and $F$ are disjoint compact sets, then evidently $H_\alpha(E \cup F) = H_\alpha(E) + H_\alpha(F)$. This statement is already false for $H_\alpha^1$.

2. The Borel measure $H_n$ coincides with $\frac{1}{\omega}$ times Lebesgue measure, where $\omega$ is the volume of the unit ball. If $\alpha < n$, then $H_\alpha$ is non-sigma finite; this follows e.g. by Lemma 8.1, which implies that any set with nonzero Lebesgue measure will have infinite $H_\alpha$-measure.

Examples The canonical example is the usual $\frac13$-Cantor set on the line. This has a covering by $2^n$ intervals of length $3^{-n}$, so it has finite $H_{\frac{\log 2}{\log 3}}$-measure. It is not difficult to show that in fact its $H_{\frac{\log 2}{\log 3}}$-measure is nonzero; this can be done geometrically, or one can apply Proposition 8.2 below to the Cantor measure. In particular, the dimension of the Cantor set is $\frac{\log 2}{\log 3}$.
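The finiteness claim for the Cantor set is a one-line computation, sketched below. The symbol $\alpha = \log 2/\log 3$ and the name $C$ for the Cantor set are shorthand introduced here; if discs are taken to be open, the radii should be enlarged slightly, which does not affect the conclusion.

```latex
% Put \alpha = \log 2 / \log 3, so that 3^{\alpha} = 2.
% At stage n the Cantor set C is covered by 2^n intervals of length 3^{-n},
% i.e. by 2^n one-dimensional discs of radius r = 3^{-n}/2. Therefore
\begin{align*}
\sum_j r_j^{\alpha}
= 2^n \Big(\frac{3^{-n}}{2}\Big)^{\alpha}
= 2^{-\alpha}\,\big(2\cdot 3^{-\alpha}\big)^{n}
= 2^{-\alpha}.
\end{align*}
% The radii tend to 0 as n grows, so H^{\epsilon}_{\alpha}(C) \le 2^{-\alpha}
% for every \epsilon > 0, hence H_{\alpha}(C) \le 2^{-\alpha} < \infty.
% For \beta > \alpha the same covering gives
\begin{align*}
2^n \Big(\frac{3^{-n}}{2}\Big)^{\beta} = 2^{-\beta}\,\big(2\cdot 3^{-\beta}\big)^{n} \to 0,
\end{align*}
% so H_{\beta}(C) = 0 and \dim C \le \log 2/\log 3.
```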

Now consider instead a Cantor set with variable dissection ratios $\{\epsilon_n\}$, i.e. one starts with the interval $[0,1]$, removes the middle $\epsilon_1$ proportion, then removes the middle $\epsilon_2$ proportion of each of the resulting intervals and so forth. If we assume that $\epsilon_{n+1} \le \epsilon_n$, and let $\epsilon = \lim_{n\to\infty}\epsilon_n$, then it is not hard to show that the dimension of the resulting set $E$ will be $\frac{\log 2}{\log(\frac{2}{1-\epsilon})}$. In particular, if $\epsilon_n \to 0$ then $\dim E = 1$. On the other hand, $H_1(E)$ will be positive only if $\sum_n \epsilon_n < \infty$, so this gives examples of sets with zero Lebesgue measure but "full" Hausdorff dimension.

There are numerous other notions of dimension. We mention only one of them, the Minkowski dimension, which is only defined for compact sets. Namely, if $E$ is compact then let $E_\delta = \{x \in \mathbb{R}^n : \operatorname{dist}(x, E) < \delta\}$. Let $\alpha_0$ be the supremum of all numbers $\alpha$ such that, for some constant $C$,
\[ |E_\delta| \ge C\delta^{n-\alpha} \]
for all $\delta \in (0,1]$. Then $\alpha_0$ is called the lower Minkowski dimension and denoted $d_L(E)$. Let $\alpha_1$ be the supremum of all numbers $\alpha$ such that, for some constant $C$,
\[ |E_\delta| \ge C\delta^{n-\alpha} \]
for a sequence of $\delta$'s which converges to zero. Then $\alpha_1$ is called the upper Minkowski dimension and denoted $d_U(E)$.

It would also be possible to define these like Hausdorff dimension but restricting to coverings by discs all the same size, namely: define a set $S$ to be $\delta$-separated if any two distinct points $x, y \in S$ satisfy $|x-y| > \delta$. Let $\mathcal{E}_\delta(E)$ ("$\delta$-entropy on $E$") be the maximal possible cardinality for a $\delta$-separated subset (footnote 4) of $E$. Then it is not hard to show that
\[ d_L(E) = \liminf_{\delta \to 0} \frac{\log \mathcal{E}_\delta(E)}{\log\frac1\delta}, \qquad d_U(E) = \limsup_{\delta \to 0} \frac{\log \mathcal{E}_\delta(E)}{\log\frac1\delta}. \]
Notice that a countable set may have positive lower Minkowski dimension; for example, the set $\{\frac1n\}_{n=1}^{\infty} \cup \{0\}$ has upper and lower Minkowski dimension $\frac12$.

If $E$ is a compact set, then let $P(E)$ be the space of the probability measures supported on $E$.

Proposition 8.2 Suppose $E \subset \mathbb{R}^n$ is compact. Assume that there is a $\mu \in P(E)$ with
\[ \mu(D(x,r)) \le Cr^\alpha \tag{95} \]
for a suitable constant $C$ and all $x \in \mathbb{R}^n$, $r > 0$. Then $H_\alpha(E) > 0$. Conversely, if $H_\alpha(E) > 0$, there is a $\mu \in P(E)$ such that (95) holds.
Footnote 4. Exercise: show that $\mathcal{E}_\delta(E)$ is comparable to the minimum number of $\delta$-discs required to cover $E$.


Proof The first part is easy: let $\{D(x_j, r_j)\}$ be any covering of $E$ by discs. Then
\[ 1 = \mu(E) \le \sum_j \mu(D(x_j, r_j)) \le C\sum_j r_j^\alpha, \]
which shows that $H_\alpha(E) \ge C^{-1}$.

The proof of the converse involves constructing a suitable measure, which is most easily done using dyadic cubes. Thus we let $\mathcal{Q}_k$ be all cubes of side length $\ell(Q) = 2^{-k}$ whose vertices are at points of $2^{-k}\mathbb{Z}^n$. We can take these to be closed cubes, for definiteness. It is standard to work with these in such contexts because of their nice combinatorics: if $Q \in \mathcal{Q}_k$, then there is a unique $\tilde Q \in \mathcal{Q}_{k-1}$ with $Q \subset \tilde Q$; furthermore if we fix $Q_1 \in \mathcal{Q}_{k-1}$, then $Q_1$ is the union of those $Q \in \mathcal{Q}_k$ with $\tilde Q = Q_1$, and the union is disjoint except for edges. A dyadic cube is a cube which is in $\mathcal{Q}_k$ for some $k$.

If $Q$ is a dyadic cube, then clearly there is a disc $D(x,r)$ with $Q \subset D(x,r)$ and $r \le C\ell(Q)$. Likewise, if we fix $D(x,r)$, then there are a bounded number of dyadic cubes $Q_1, \ldots, Q_C$ with $\ell(Q_j) \le Cr$ and whose union contains $D(x,r)$. From these properties, it is easy to see that the definition of Hausdorff measure and also the property (95) could equally well be given in terms of dyadic cubes. Thus, except for the values of the constants,
\[ \mu \text{ satisfies (95)} \iff \mu(Q) \le C\ell(Q)^\alpha \text{ for all dyadic cubes } Q. \]
Furthermore, if we define
\[ h_\alpha^\epsilon(E) = \inf\Big( \sum_{Q \in \mathcal{F}} \ell(Q)^\alpha : E \subset \bigcup_{Q \in \mathcal{F}} Q \Big), \]
where $\mathcal{F}$ runs over all coverings of $E$ by dyadic cubes of side length $\ell(Q) < \epsilon$, and
\[ h_\alpha(E) = \lim_{\epsilon \to 0} h_\alpha^\epsilon(E), \]
then we have
\[ C^{-1}H_\alpha(E) \le h_\alpha(E) \le CH_\alpha(E), \]
therefore
\[ h_\alpha(E) > 0 \iff H_\alpha(E) > 0. \]

We return now to the proof of Proposition 8.2. We may assume that $E$ is contained in the unit cube $[0,1] \times \cdots \times [0,1]$. By the preceding remarks and Remark 1. above we may assume that $h_\alpha^1(E) > 0$, and it suffices to find $\mu \in P(E)$ so that $\mu(Q) \le C\ell(Q)^\alpha$ for all dyadic cubes $Q$ with $\ell(Q) \le 1$.

We now make a further reduction.

Claim It suffices to find, for each fixed $m \in \mathbb{Z}^+$, a positive measure $\mu$ with the following properties:
\[ \mu \text{ is supported on the union of the cubes } Q \in \mathcal{Q}_m \text{ which intersect } E; \tag{96} \]
\[ \|\mu\| \ge C^{-1}; \tag{97} \]
\[ \mu(Q) \le \ell(Q)^\alpha \text{ for all dyadic cubes with } \ell(Q) \ge 2^{-m}. \tag{98} \]
Here $C$ is independent of $m$.

Namely, if this can be done, then denote the measures satisfying (96), (97), (98) by $\mu_m$. (98) implies a bound on $\|\mu_m\|$, so there is a weak* limit point $\mu$. (96) then shows that $\mu$ is supported on $E$, (98) shows that $\mu(Q) \le \ell(Q)^\alpha$ for all dyadic cubes, and (97) shows that $\|\mu\| \ge C^{-1}$. Accordingly, a suitable scalar multiple of $\mu$ gives us the necessary probability measure.

There are a number of ways of constructing the measures satisfying (96), (97), (98). Roughly, the issue is that (97) and (98) are competing conditions, and one has to find a measure with the appropriate support and with total mass roughly as large as possible given that (98) holds. This can be done for example by using finite dimensional convexity theory (exercise!). We present a different (more constructive) argument taken from [6], Chapter 2.

We fix $m$, and will construct a finite sequence of measures $\nu_m, \ldots, \nu_0$, in that order; $\nu_0$ will be the measure we want. Start by defining $\nu_m$ to be the unique measure with the following properties.

1. On each $Q \in \mathcal{Q}_m$, $\nu_m$ is a scalar multiple of Lebesgue measure.
2. If $Q \in \mathcal{Q}_m$ and $Q \cap E = \emptyset$, then $\nu_m(Q) = 0$.
3. If $Q \in \mathcal{Q}_m$ and $Q \cap E \ne \emptyset$, then $\nu_m(Q) = 2^{-m\alpha}$.

If we set $k = m$, then $\nu_k$ has the following properties: it is absolutely continuous to Lebesgue measure, and
\[ \nu_k(Q) \le \ell(Q)^\alpha \text{ if } Q \text{ is a dyadic cube with side } 2^{-j},\ k \le j \le m; \tag{99} \]
if $Q_1$ is a dyadic cube of side $2^{-k}$, then there is a covering $\mathcal{F}_{Q_1}$ of $Q_1 \cap E$ by dyadic cubes contained in $Q_1$ such that
\[ \nu_k(Q_1) \ge \sum_{Q \in \mathcal{F}_{Q_1}} \ell(Q)^\alpha. \tag{100} \]

Assume now that $1 \le k \le m$ and we have constructed an absolutely continuous measure $\nu_k$ with properties (96), (99) and (100). We will construct $\nu_{k-1}$ having these same properties, where in (99) and (100) $k$ is replaced by $k-1$. Namely, to define $\nu_{k-1}$ it suffices to define $\nu_{k-1}(Y)$ when $Y$ is contained in a cube $Q \in \mathcal{Q}_{k-1}$. Fix $Q \in \mathcal{Q}_{k-1}$. Consider two cases.

(i) $\nu_k(Q) \le \ell(Q)^\alpha$. In this case we let $\nu_{k-1}$ agree with $\nu_k$ on subsets of $Q$.

(ii) $\nu_k(Q) \ge \ell(Q)^\alpha$. In this case we let $\nu_{k-1}$ agree with $c\nu_k$ on subsets of $Q$, where $c$ is the scalar $\frac{2^{-(k-1)\alpha}}{\nu_k(Q)}$.

Notice that $\nu_{k-1}(Y) \le \nu_k(Y)$ for any set $Y$, and furthermore $\nu_{k-1}(Q) \le \ell(Q)^\alpha$ if $Q \in \mathcal{Q}_{k-1}$. These properties and (99) for $\nu_k$ give (99) for $\nu_{k-1}$, and (96) for $\nu_{k-1}$ follows trivially from (96) for $\nu_k$. To see (100) for $\nu_{k-1}$, fix $Q \in \mathcal{Q}_{k-1}$. If $Q$ is as in case (ii), then $\nu_{k-1}(Q) = \ell(Q)^\alpha$, so we can use the covering by the singleton $\{Q\}$. If $Q$ is as in case (i), then for each of the cubes $Q_j \in \mathcal{Q}_k$ whose union is $Q$ we have the covering of $Q_j \cap E$ associated with (100) for $\nu_k$. Since $\nu_k$ and $\nu_{k-1}$ agree on subsets of $Q$, we can simply put these coverings together to obtain a suitable covering of $Q \cap E$. This concludes the inductive step from $\nu_k$ to $\nu_{k-1}$.

We therefore have constructed $\nu_0$. It has properties (96), (98) (since for $\nu_0$ this is equivalent to (99)), and by (100) and the definition of $h_\alpha^1$ it has property (97).

Let us now define the $\alpha$-dimensional energy of a (positive) measure $\mu$ with compact support (footnote 5) by the formula
\[ I_\alpha(\mu) = \int\!\!\int |x-y|^{-\alpha}\, d\mu(x)\, d\mu(y). \]
We always assume that $0 < \alpha < n$. We also define
\[ V_\mu^\alpha(y) = \int |x-y|^{-\alpha}\, d\mu(x). \]
Thus
\[ I_\alpha(\mu) = \int V_\mu^\alpha\, d\mu. \tag{101} \]

The "potential" $V_\mu^\alpha$ is very important in other contexts (namely elliptic theory, since it is harmonic away from $\operatorname{supp}\mu$ when $\alpha = n-2$) but less important than the energy here. Nevertheless we will use it in a technical way below. Notice that it is actually the convolution of the function $|x|^{-\alpha}$ with the measure $\mu$.

Roughly, one expects a measure to have $I_\alpha(\mu) < \infty$ if and only if it satisfies (95); this precise statement is false, but we see below that nevertheless the Hausdorff dimension of a compact set can be defined in terms of the energies of measures in $P(E)$.

Footnote 5. The compact support assumption is not needed; it is included to simplify the presentation.

Lemma 8.3 (i) If $\mu$ is a probability measure with compact support satisfying (95), then $I_\beta(\mu) < \infty$ for all $\beta < \alpha$.

(ii) Conversely, if $\mu$ is a probability measure with compact support and with $I_\alpha(\mu) < \infty$, then there is another probability measure $\nu$ such that $\nu(X) \le 2\mu(X)$ for all sets $X$ and such that $\nu$ satisfies (95).

Proof (i) We can assume that the diameter of the support of $\mu$ is $\le 1$. Then
\[ V_\mu^\beta(x) \lesssim \sum_{j=0}^{\infty} 2^{j\beta}\mu(D(x, 2^{-j})). \]
Accordingly, if $\mu$ satisfies (95), and $\beta < \alpha$, then
\[ V_\mu^\beta(x) \lesssim \sum_{j=0}^{\infty} 2^{j\beta}2^{-j\alpha} \lesssim 1. \]
It follows by (101) that $I_\beta(\mu) < \infty$.

(ii) Let $F$ be the set of points $x$ such that $V_\mu^\alpha(x) \le 2I_\alpha(\mu)$. Then $\mu(F) \ge \frac12$ by (101). Let $\chi_F$ be the indicator function of $F$ and let $\nu(X) = \mu(X \cap F)/\mu(F)$. We need to show that $\nu$ satisfies (95). Suppose first that $x \in F$. If $r > 0$ then
\[ r^{-\alpha}\nu(D(x,r)) \le V_\nu^\alpha(x) \le 2V_\mu^\alpha(x) \le 4I_\alpha(\mu). \]

This verifies (95) when $x \in F$. For general $x$, consider two cases. If $r$ is such that $D(x,r) \cap F = \emptyset$ then evidently $\nu(D(x,r)) = 0$. If $D(x,r) \cap F \ne \emptyset$, let $y \in D(x,r) \cap F$. Then $\nu(D(x,r)) \le \nu(D(y,2r)) \lesssim r^\alpha$ by the first part of the proof.

Proposition 8.4 If $E$ is compact then the Hausdorff dimension of $E$ coincides with the number
\[ \sup\{\alpha : \exists \mu \in P(E) \text{ with } I_\alpha(\mu) < \infty\}. \]

Proof Denote the above supremum by $s$. If $\beta < s$ then by (ii) of Lemma 8.3 $E$ supports a measure with $\mu(D(x,r)) \le Cr^\beta$. Then by Proposition 8.2 $H_\beta(E) > 0$, so $\beta \le \dim E$. So $s \le \dim E$. Conversely, if $\beta < \dim E$ then by Proposition 8.2 $E$ supports a measure with $\mu(D(x,r)) \le Cr^{\beta+\epsilon}$ for $\epsilon > 0$ small enough. Then $I_\beta(\mu) < \infty$, so $\beta \le s$, which shows that $\dim E \le s$.

The energy is a quadratic expression in $\mu$ and is therefore susceptible to Fourier transform arguments. Indeed, the following formula is essentially just Lemma 7.2 combined with the formula for the Fourier transform of $|x|^{-\alpha}$.

Proposition 8.5 Let $\mu$ be a positive measure with compact support and $0 < \alpha < n$. Then
\[ \int\!\!\int |x-y|^{-\alpha}\, d\mu(x)\, d\mu(y) = c_\alpha \int |\hat\mu(\xi)|^2 |\xi|^{-(n-\alpha)}\, d\xi, \tag{102} \]
where $c_\alpha = \frac{\Gamma\left(\frac{n-\alpha}{2}\right)\pi^{\alpha - \frac{n}{2}}}{\Gamma\left(\frac{\alpha}{2}\right)}$.

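Proof Suppose first that $f \in L^1$ is real and even, and that $d\mu(x) = \varphi(x)\,dx$ with $\varphi \in \mathcal{S}$. Then we have
\[ \int\!\!\int f(x-y)\, d\mu(x)\, d\mu(y) = \int |\hat\mu(\xi)|^2 \hat f(\xi)\, d\xi. \tag{103} \]
This is proved like Lemma 7.2 using (73) instead of (74).

Now fix $\varphi$. Then both sides of (103) are easily seen to define continuous linear maps from $f \in L^2$ to $\mathbb{R}$. Accordingly, (103) remains valid when $f \in L^1 + L^2$, $\varphi \in \mathcal{S}$. Applying Proposition 4.1, we conclude (102) if $d\mu(x) = \varphi(x)\,dx$, $\varphi \in \mathcal{S}$.

To pass to general measures, we use the following fact.

Lemma 8.6 Let $\varphi$ be any radial decreasing Schwartz function with $L^1$ norm 1, and let $0 < \alpha < n$. Then
\[ \int |x-y|^{-\alpha}\varphi(y)\, dy \lesssim |x|^{-\alpha}, \]
where the implicit constant depends only on $\alpha$, not on the choice of $\varphi$.

We sketch the proof as follows: one can easily reduce to the case where $\varphi = \frac{1}{|D(0,R)|}\chi_{D(0,R)}$, and this case can be done by explicit calculation.

Now let $\varphi(x) = e^{-\pi|x|^2}$ and $\varphi_\epsilon(x) = \epsilon^{-n}\varphi(\epsilon^{-1}x)$. We have then $\varphi_\epsilon * \mu \in \mathcal{S}$, so
\[ \int\!\!\int \Big( \int\!\!\int |x-y|^{-\alpha}\varphi_\epsilon(x-z)\varphi_\epsilon(y-w)\, dx\, dy \Big) d\mu(z)\, d\mu(w) = c_\alpha \int |\hat\mu(\xi)|^2 |\hat\varphi(\epsilon\xi)|^2 |\xi|^{-(n-\alpha)}\, d\xi. \tag{104} \]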

Now let $\epsilon \to 0$. On the left side of (104), the expression inside the parentheses converges pointwise to $|z-w|^{-\alpha}$ using a minor variant on Lemma 3.2. If $I_\alpha(\mu) < \infty$ then the convergence is dominated in view of the preceding lemma, so the integrals on the left side converge to $I_\alpha(\mu)$. If $I_\alpha(\mu) = \infty$, then this remains true using Fatou's lemma. On the right hand side of (104) we can argue similarly: the integrands converge pointwise to $|\hat\mu(\xi)|^2|\xi|^{-(n-\alpha)}$. If $\int |\hat\mu(\xi)|^2|\xi|^{-(n-\alpha)}\, d\xi < \infty$ then the convergence is dominated since the factors $\hat\varphi(\epsilon\xi)$ are bounded by 1, so the integrals converge to $\int |\hat\mu(\xi)|^2|\xi|^{-(n-\alpha)}\, d\xi$. If $\int |\hat\mu(\xi)|^2|\xi|^{-(n-\alpha)}\, d\xi = \infty$ then this remains true by Fatou's lemma. Accordingly, we can pass to the limit from (104) to obtain the proposition.

Corollary 8.7 Suppose $\mu$ is a compactly supported probability measure on $\mathbb{R}^n$ with
\[ |\hat\mu(\xi)| \le C|\xi|^{-\beta} \tag{105} \]
for some $0 < \beta < n/2$, or more generally that (105) is true in the sense of $L^2$ means:
\[ \int_{D(0,N)} |\hat\mu(\xi)|^2\, d\xi \le CN^{n-2\beta}. \tag{106} \]

Then the dimension of the support of $\mu$ is at least $2\beta$.

Proof It suffices by Proposition 8.4 to show that if (106) holds then $I_\alpha(\mu) < \infty$ for all $\alpha < 2\beta$. However,
\begin{align*}
\int_{|\xi| \ge 1} |\hat\mu(\xi)|^2|\xi|^{-(n-\alpha)}\, d\xi &\lesssim \sum_{j=0}^{\infty} 2^{-j(n-\alpha)} \int_{2^j \le |\xi| \le 2^{j+1}} |\hat\mu(\xi)|^2\, d\xi \\
&\lesssim \sum_{j=0}^{\infty} 2^{-j(n-\alpha)}2^{j(n-2\beta)} \\
&< \infty
\end{align*}
if $\alpha < 2\beta$ and (106) holds. Observe also that the integral over $|\xi| \le 1$ is finite since $|\hat\mu(\xi)| \le \|\mu\| = 1$. This completes the proof in view of Proposition 8.5.

One can ask the converse question, whether a compact set with dimension $\alpha$ must support a measure $\mu$ with
\[ |\hat\mu(\xi)| \le C_\epsilon(1+|\xi|)^{-\frac{\alpha}{2}+\epsilon} \tag{107} \]
for all $\epsilon > 0$. The answer is (emphatically) no (footnote 6). Indeed, there are many sets with positive dimension which do not support any measure whose Fourier transform goes to zero as $|\xi| \to \infty$. The easiest way to see this is to consider the line segment $E = [0,1] \times \{0\} \subset \mathbb{R}^2$. $E$ has dimension 1, but if $\mu$ is a measure supported on $E$ then $\hat\mu(\xi)$ depends on $\xi_1$ only, so it cannot go to zero at $\infty$. If one considers only the case $n = 1$, this question is related to the classical question of "sets of uniqueness". See e.g. [28], [40]. One can show for example that the standard $\frac13$ Cantor set does not support any measure such that $\hat\mu$ vanishes at $\infty$.

Footnote 6. On the other hand, if one interprets decay in an $L^2$ averaged sense the answer becomes yes, because the calculation in the proof of the above corollary is reversible.

Indeed, it is nontrivial to show that a "noncounterexample" exists, i.e. a set $E$ with given dimension $\alpha$ which supports a measure satisfying (107). We describe a construction of such a set due to R. Kaufman in the next section.

As a typical application (which is also important in its own right) we now discuss a special case of Marstrand's projection theorem. Let $e$ be a unit vector in $\mathbb{R}^n$ and $E \subset \mathbb{R}^n$ a compact set. The projection $P_e(E)$ is the set $\{x\cdot e : x \in E\}$. We want to relate the dimensions of $E$ and of its projections. Notice first of all that $\dim P_eE \le \dim E$; this follows from the definition of dimension and the fact that the projection $P_e$ is a Lipschitz function.

A reasonable example, although not very typical, is a smooth curve in $\mathbb{R}^2$. This is one-dimensional, and most of its projections will be also one-dimensional. However, if the curve is a line, then one of its projections will be just a point.

Theorem 8.8 (Marstrand's projection theorem for one-dimensional projections) Assume that $E \subset \mathbb{R}^n$ is compact and $\dim E = \alpha$. Then

(i) If $\alpha \le 1$ then for a.e. $e \in S^{n-1}$ we have $\dim P_eE = \alpha$.

(ii) If $\alpha > 1$ then for a.e. $e \in S^{n-1}$ the projection $P_eE$ has positive one-dimensional Lebesgue measure.

Proof If $\mu$ is a measure supported on $E$, $e \in S^{n-1}$, then the projected measure $\mu_e$ is the measure on $\mathbb{R}$ defined by
\[ \int f\, d\mu_e = \int f(x\cdot e)\, d\mu(x) \]
for continuous $f$. Notice that $\hat\mu_e$ may readily be calculated from this definition:
\begin{align*}
\hat\mu_e(k) &= \int e^{-2\pi i k\, x\cdot e}\, d\mu(x) \\
&= \hat\mu(ke).
\end{align*}
Let $\alpha < \dim E$, and let $\mu$ be a measure supported on $E$ with $I_\alpha(\mu) < \infty$. We have then
\[ \int\!\!\int |\hat\mu(ke)|^2|k|^{-1+\alpha}\, dk\, d\sigma(e) < \infty \tag{108} \]
by Proposition 8.5 and polar coordinates. Thus, for a.e. $e$ we have
\[ \int |\hat\mu(ke)|^2|k|^{-1+\alpha}\, dk < \infty. \]
It follows by Proposition 8.5 with $n = 1$ that for a.e. $e$ the projected measure $\mu_e$ has finite $\alpha$-dimensional energy. This and Proposition 8.4 give part (i), since $\mu_e$ is supported on the projected set $P_eE$. For part (ii), we note that if $\dim E > 1$ we can take $\alpha = 1$ in (108). Thus $\hat\mu_e$ is in $L^2$ for almost all $e$. By Theorem 3.13, this condition implies that $\mu_e$ has an $L^2$ density, and in particular is absolutely continuous with respect to Lebesgue measure. Accordingly $P_eE$ must have positive Lebesgue measure.

Remark Theorem 8.8 has a natural generalization to $k$-dimensional instead of 1-dimensional projections, which is proved in the same way. See [10].
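The passage from Proposition 8.5 to (108) "by polar coordinates" can be sketched as follows. This is an illustrative derivation under the normalizations used in the notes; the factor $\frac12$ comes from folding the half-line integral into an integral over all of $\mathbb{R}$ and plays no role in the finiteness statement.

```latex
% Polar coordinates \xi = k e with k > 0, e \in S^{n-1}, d\xi = k^{n-1}\,dk\,d\sigma(e):
\begin{align*}
\int_{\mathbb{R}^n} |\hat\mu(\xi)|^2\,|\xi|^{-(n-\alpha)}\,d\xi
&= \int_{S^{n-1}}\int_0^\infty |\hat\mu(ke)|^2\,k^{-(n-\alpha)}\,k^{n-1}\,dk\,d\sigma(e) \\
&= \int_{S^{n-1}}\int_0^\infty |\hat\mu(ke)|^2\,k^{-1+\alpha}\,dk\,d\sigma(e) \\
&= \frac12 \int_{S^{n-1}}\int_{-\infty}^{\infty} |\hat\mu(ke)|^2\,|k|^{-1+\alpha}\,dk\,d\sigma(e),
\end{align*}
% where the last step uses the symmetry e \mapsto -e of \sigma.
% The left side equals c_\alpha^{-1} I_\alpha(\mu) < \infty by (102), which is (108).
% For fixed e, since \hat\mu_e(k) = \hat\mu(ke), Proposition 8.5 with n = 1
% (valid for 0 < \alpha < 1) identifies the inner integral with the energy of \mu_e:
\begin{align*}
I_\alpha(\mu_e) = c_\alpha \int_{-\infty}^{\infty} |\hat\mu_e(k)|^2\,|k|^{-(1-\alpha)}\,dk .
\end{align*}
```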

9. Sets with maximal Fourier dimension, and distance sets

A. Sets with maximal Fourier dimension

Jarník's theorem is the following Proposition 9A1. Fix a number $\alpha > 0$, and let
$$E_\alpha = \Big\{x \in \mathbb{R} : \exists \text{ infinitely many rationals } \frac{a}{q} \text{ such that } \Big|x - \frac{a}{q}\Big| \le q^{-(2+\alpha)}\Big\}.$$

Proposition 9A1 The Hausdorff dimension of $E_\alpha$ is equal to $\frac{2}{2+\alpha}$.

Proof We show only that $\dim E_\alpha \le \frac{2}{2+\alpha}$. The converse inequality is not much harder (see [10]) but we have no need to give a proof of it since it follows from Theorem 9A2 below using Corollary 8.7.

It suffices to prove the upper bound for $E_\alpha \cap [-N, N]$. Consider the set of intervals $I_{aq} = (\frac{a}{q} - q^{-(2+\alpha)}, \frac{a}{q} + q^{-(2+\alpha)})$, where $0 \le a \le Nq$ are integers. Then
$$\sum_{q > q_0} \sum_a |I_{aq}|^\beta \lesssim \sum_{q > q_0} q \cdot q^{-\beta(2+\alpha)},$$
which is finite and goes to 0 as $q_0 \to \infty$ if $\beta > \frac{2}{2+\alpha}$. For any given $q_0$ the set $\{I_{aq} : q > q_0\}$ covers $E_\alpha \cap [-N, N]$, which therefore has $H^\beta(E_\alpha \cap [-N, N]) = 0$ when $\beta > \frac{2}{2+\alpha}$, as claimed.
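The covering sum in this proof is easy to explore numerically. The sketch below is illustrative only: the function name `covering_sum`, the truncation parameter `q_max`, and the choice $N = 1$ are our assumptions and not part of the notes. It evaluates the truncated sum of $|I_{aq}|^\beta$ over the intervals with $q_0 < q \le$ `q_max`.

```python
def covering_sum(beta, alpha, q0, q_max=200_000, N=1):
    """Truncated covering sum: sum over q0 < q <= q_max of (#intervals) * |I_aq|^beta.

    For each q there are N*q + 1 integers a with 0 <= a <= N*q, and each
    interval I_aq has length 2 * q^-(2 + alpha).
    """
    total = 0.0
    for q in range(q0 + 1, q_max + 1):
        n_intervals = N * q + 1
        length = 2.0 * q ** (-(2.0 + alpha))
        total += n_intervals * length ** beta
    return total


if __name__ == "__main__":
    alpha = 1.0  # critical exponent is 2 / (2 + alpha) = 2/3
    for q0 in (10, 100, 1000):
        print(q0, covering_sum(0.9, alpha, q0))   # shrinks as q0 grows
    print(covering_sum(0.5, alpha, 10, q_max=20_000))  # large: beta below critical
```

With $\beta$ above $\frac{2}{2+\alpha}$ the tail shrinks as $q_0$ grows, which is what forces $H^\beta$ to vanish. With $\beta$ below the critical value the truncated sum keeps growing with `q_max`, so this particular covering gives no information.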

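Theorem 9A2 (Kaufman [21]). For any $\alpha > 0$ there is a positive measure $\mu$ supported on a subset of $E_\alpha$ such that
$$|\hat\mu(\xi)| \le C_\epsilon |\xi|^{-\frac{1}{2+\alpha}+\epsilon} \tag{109}$$
for all $\epsilon > 0$.

This shows then that Corollary 8.7 is best possible of its type. The proof is most naturally done using periodic functions, so we start with the following general remarks concerning periodization and deperiodization.

Let $\mathbb{T}^n$ be the $n$-torus, which we regard as $[0,1] \times \dots \times [0,1]$ with edges identified; thus a function on $\mathbb{T}^n$ is the same as a function on $\mathbb{R}^n$ periodic for the lattice $\mathbb{Z}^n$. If $f \in L^1(\mathbb{T}^n)$ then one defines its Fourier coefficients by
$$\hat f(k) = \int_{\mathbb{T}^n} f(x) e^{-2\pi i k \cdot x}\, dx, \quad k \in \mathbb{Z}^n,$$
and one also makes the analogous definition for measures. If $f$ is smooth then one has $|\hat f(k)| \le C_N |k|^{-N}$ for all $N$ and $\sum_{k \in \mathbb{Z}^n} \hat f(k) e^{2\pi i k \cdot x} = f(x)$. Also, if $f \in L^1(\mathbb{R}^n)$ one defines its periodization by
$$f_{per}(x) = \sum_{\nu \in \mathbb{Z}^n} f(x - \nu).$$
Then $f_{per} \in L^1(\mathbb{T}^n)$, and we have

Lemma 9A3 If $k \in \mathbb{Z}^n$ then $\widehat{f_{per}}(k) = \hat f(k)$.

Proof
$$\begin{aligned}
\hat f(k) &= \int_{\mathbb{R}^n} f(x) e^{-2\pi i k \cdot x}\, dx \\
&= \sum_{\nu \in \mathbb{Z}^n} \int_{[0,1] \times \dots \times [0,1] - \nu} f(x) e^{-2\pi i k \cdot x}\, dx \\
&= \sum_{\nu \in \mathbb{Z}^n} \int_{[0,1] \times \dots \times [0,1]} f(x - \nu) e^{-2\pi i k \cdot (x - \nu)}\, dx \\
&= \int_{[0,1] \times \dots \times [0,1]} \sum_{\nu \in \mathbb{Z}^n} f(x - \nu) e^{-2\pi i k \cdot (x - \nu)}\, dx \\
&= \int_{[0,1] \times \dots \times [0,1]} f_{per}(x) e^{-2\pi i k \cdot x}\, dx.
\end{aligned}$$
At the last line we used that $e^{2\pi i k \cdot \nu} = 1$.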
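Lemma 9A3 can be checked numerically in one dimension. The following sketch is an illustration under stated assumptions: it takes the Gaussian $f(x) = e^{-\pi x^2}$, whose Fourier transform with the normalization of these notes is $\hat f(\xi) = e^{-\pi \xi^2}$, truncates the periodization to finitely many terms, and approximates the Fourier coefficient by the midpoint rule. The names `f`, `f_per` and `fourier_coefficient` are ours.

```python
import cmath
import math


def f(x):
    """Gaussian whose Fourier transform is exp(-pi * xi^2)."""
    return math.exp(-math.pi * x * x)


def f_per(x, terms=10):
    """Periodization of f, truncated to |nu| <= terms (the tail is negligible)."""
    return sum(f(x - nu) for nu in range(-terms, terms + 1))


def fourier_coefficient(g, k, samples=2000):
    """Midpoint-rule approximation of the integral of g(x) e^{-2 pi i k x} over [0, 1]."""
    h = 1.0 / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        total += g(x) * cmath.exp(-2j * math.pi * k * x)
    return total * h


if __name__ == "__main__":
    for k in range(4):
        numeric = fourier_coefficient(f_per, k)
        exact = math.exp(-math.pi * k * k)
        print(k, abs(numeric - exact))
```

The printed differences should be at the level of rounding error, since the midpoint rule is very accurate for smooth periodic integrands.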

Suppose now that $f$ is a smooth function on $\mathbb{T}^n$; regard it as a periodic function on $\mathbb{R}^n$. Let $\phi \in \mathcal{S}$ and consider the function $F(x) = \phi(x) f(x)$. We have then
$$\hat F(\xi) = \sum_{\nu \in \mathbb{Z}^n} \hat f(\nu) \int e^{-2\pi i (\xi - \nu) \cdot x} \phi(x)\, dx = \sum_{\nu \in \mathbb{Z}^n} \hat f(\nu) \hat\phi(\xi - \nu).$$
This formula extends by a limiting argument to the case where the smooth function $f$ is replaced by a measure; we omit this argument$^7$. Thus we have the following: let $\mu$ be a measure on $\mathbb{T}^n$, let $\phi \in \mathcal{S}$, and define a measure $\nu$ on $\mathbb{R}^n$ by
$$d\nu(x) = \phi(x)\, d\mu(\{x\}), \tag{110}$$
where $\{x\}$ is the fractional part of $x$. Then for $\xi \in \mathbb{R}^n$
$$\hat\nu(\xi) = \sum_{k \in \mathbb{Z}^n} \hat\mu(k) \hat\phi(\xi - k). \tag{111}$$

$^7$ It is based on the fact that every measure on $\mathbb{T}^n$ is the weak limit of a sequence of absolutely continuous measures with smooth densities, which is a corollary e.g. of the Stone-Weierstrass theorem.

A corollary of this formula by simple estimates with absolutely convergent sums, using the Schwartz decay of $\hat\phi$, is the following.

Lemma 9A4 If $\mu$ is a measure on $\mathbb{T}^n$, satisfying
$$|\hat\mu(k)| \le C (1 + |k|)^{-\alpha}$$
for a certain $\alpha > 0$, and if $\nu \in M(\mathbb{R}^n)$ is defined by (110), then
$$|\hat\nu(\xi)| \le C' (1 + |\xi|)^{-\alpha}.$$

This can be proved by using (111) and considering the range $|\xi - k| \le |\xi|/2$ and its complement separately. The details are left to the reader.

We now start to construct a measure on the 1-torus $\mathbb{T}$, which will be used to prove Theorem 9A2. Let $\phi$ be a nonnegative $C_0^\infty$ function on $\mathbb{R}$ supported in $[-1, 1]$ and with $\int \phi = 1$. Define $\phi^\epsilon(x) = \epsilon^{-1} \phi(\epsilon^{-1} x)$ and let $\Phi^\epsilon$ be the periodization of $\phi^\epsilon$. Let $P(M)$ be the set of prime numbers which lie in the interval $(\frac{M}{2}, M]$. By the prime number theorem, $|P(M)| \approx \frac{M}{\log M}$ for large $M$. If $p \in P(M)$ then the function $\Phi^\epsilon_p(x) \overset{def}{=} \Phi^\epsilon(px)$ is again 1-periodic$^8$ and we have
$$\widehat{\Phi^\epsilon_p}(k) = \begin{cases} \hat\phi(\frac{\epsilon k}{p}) & \text{if } p \mid k, \\ 0 & \text{otherwise.} \end{cases} \tag{112}$$

$^8$ Of course for fixed $p$ it is $p^{-1}$-periodic.

To see this, start from the formula
$$\widehat{\Phi^\epsilon}(k) = \widehat{\phi^\epsilon}(k) = \hat\phi(\epsilon k),$$
which follows from Lemma 9A3. Thus
$$\Phi^\epsilon(x) = \sum_k \hat\phi(\epsilon k) e^{2\pi i k x}, \qquad \Phi^\epsilon_p(x) = \sum_k \hat\phi(\epsilon k) e^{2\pi i k p x},$$
which is equivalent to (112). Now define
$$F = \frac{1}{|P(M)|} \sum_{p \in P(M)} \Phi^\epsilon_p.$$
Then $F$ is smooth, 1-periodic, and $\int_0^1 F = 1$ (cf. (113)). Of course $F$ depends on $M$ but we suppress this dependence.

Lemma The Fourier coefficients of $F$ behave as follows:
$$\hat F(0) = 1, \tag{113}$$
$$\hat F(k) = 0 \text{ if } 0 < |k| \le \frac{M}{2}, \tag{114}$$
and for any $N$ there is $C_N$ such that
$$|\hat F(k)| \le C_N \frac{\log |k|}{M} \Big(1 + \frac{\epsilon |k|}{M}\Big)^{-N} \text{ for all } k \ne 0. \tag{115}$$

Proof Both (113) and (114) are self-evident from (112). For (115) we use that a given integer $k > 0$ has at most $C \frac{\log k}{\log M}$ different prime divisors in the interval $(M/2, M]$. Hence, by (112) and the Schwartz decay of $\hat\phi$,
$$|\hat F(k)| \le C_N \frac{\log M}{M} \frac{\log |k|}{\log M} \Big(1 + \frac{\epsilon |k|}{M}\Big)^{-N}$$
as claimed.

We now make up our mind to choose $\epsilon = M^{-(1+\alpha)}$, and denote the function $F$ by $F_M$. Thus we have the following:
$$\operatorname{supp} F_M \subset \Big\{x : \Big|x - \frac{a}{p}\Big| \le p^{-(2+\alpha)} \text{ for some } p \in P(M),\ a \in [0, p]\Big\}, \tag{116}$$
$$|\hat F_M(k)| \le C_N \frac{\log |k|}{M} \Big(1 + \frac{|k|}{M^{2+\alpha}}\Big)^{-N}, \tag{117}$$
and furthermore $F_M$ is nonnegative and satisfies (113) and (114). It is easy to deduce from (117) with $N = 1$ that
$$|\hat F_M(k)| \lesssim |k|^{-\frac{1}{2+\alpha}} \log |k| \tag{118}$$
uniformly in $M$. In view of (116) we could now try to prove the theorem by taking a weak limit of the measures $F_M\, dx$ and then using Lemma 9A4. However this would not be correct since the set $E_\alpha$ is not closed, and ((116) notwithstanding) there is no reason why the weak limit should be supported on $E_\alpha$. Indeed, (114) and (113) imply easily that the weak limit is the Lebesgue measure. The following is the standard way of getting around this kind of problem. It has something in common with the classical Riesz product construction; see [25].

I. First consider a fixed smooth function $\psi$ on $\mathbb{T}$. We claim that if $M$ has been chosen large enough then
$$|\widehat{\psi F_M}(k) - \hat\psi(k)| \lesssim \begin{cases} \frac{\log |k|}{M} \Big(1 + \frac{|k|}{M^{2+\alpha}}\Big)^{-100} & \text{when } |k| \ge \frac{M}{4}, \\ M^{-100} & \text{when } |k| \le \frac{M}{4}. \end{cases} \tag{119}$$

To see this, we write, using (111) and (114), (113),
$$\widehat{\psi F_M}(k) - \hat\psi(k) = \sum_{l \in \mathbb{Z}} \hat\psi(l) \hat F_M(k - l) - \hat\psi(k) = \sum_{l : |k - l| \ge M/2} \hat\psi(l) \hat F_M(k - l). \tag{120}$$
Since $\psi$ is smooth, $|\hat\psi(l)| \le c_N |l|^{-N}$ for any $N$. Hence the right side of (120) is bounded by
$$C \max_{l : |k - l| \ge M/2} |\hat F_M(k - l)|.$$
Using (117), we can estimate this by
$$C_N \max_{l : |k - l| \ge M/2} \frac{\log |k - l|}{M} \Big(1 + \frac{|k - l|}{M^{2+\alpha}}\Big)^{-N}. \tag{121}$$
For any fixed $N$ the function
$$f(t) = \frac{\log t}{M} \Big(1 + \frac{t}{M^{2+\alpha}}\Big)^{-N}$$
is decreasing for $t \ge M/10$, provided that $M$ is large enough. Thus (121) is bounded by
$$C_N \frac{\log(M/2)}{M} \Big(1 + \frac{M/2}{M^{2+\alpha}}\Big)^{-N} \lesssim M^{-100},$$
which proves the second part of (119). To prove the first part, we again use (120) and consider separately the range $|k - l| \ge |k|/2$ and its complement. For $|k - l| \ge |k|/2$ we argue as above, replacing $|k - l|$ by $|k|/2$ in (121). For $|k - l| \le |k|/2$ we have $|l| \ge |k|/2$, hence the estimate follows easily using the decay of the Fourier coefficients of $\psi$ and the fact that $|\hat F_M(k)| \le 1$. The details are left to the reader.

II. Let
$$g(r) = \begin{cases} r^{-\frac{1}{2+\alpha}} \log r & \text{when } r \ge r_0, \\ r_0^{-\frac{1}{2+\alpha}} \log r_0 & \text{when } r \le r_0, \end{cases}$$
where $r_0 > 1$ is chosen so that $g(r) \le 1$ and $g(r)$ is nonincreasing for all $r$. Then for any $\psi \in C^\infty(\mathbb{T})$, $\epsilon > 0$, and $M_0 > 10 r_0$ we can choose $N$ large enough and a rapidly increasing sequence $M_0 < M_1 < M_2 < \dots < M_N$ so that
$$|\widehat{\psi G}(k) - \hat\psi(k)| \le \epsilon\, g(|k|), \tag{122}$$
where $G = N^{-1}(F_{M_1} + \dots + F_{M_N})$.

This can be done as follows. Denote $E_M(k) = |\widehat{\psi F_M}(k) - \hat\psi(k)|$ for $M \ge M_1$. We will need the following consequences of (119):
$$E_M(k) \le C g(|k|) \text{ if } |k| \ge M/4, \tag{123}$$
$$\lim_{|k| \to \infty} \frac{E_M(k)}{g(|k|)} = 0 \text{ for any fixed } M, \tag{124}$$
$$E_M(k) \le C M^{-100} \text{ if } |k| \le M/4, \tag{125}$$
with the constant in (123), (125) independent of $k$, $M$. Fix $N$ so that $\frac{C}{N} < \frac{\epsilon}{100}$. Observe that for all $M$ large enough
$$C M^{-100} < \frac{\epsilon}{100}\, g(M). \tag{126}$$

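We now choose $M_1, M_2, \dots, M_N$ inductively so that (126) holds, $M_{j+1} > 4 M_j$ and
$$\frac{1}{N} \sum_{i=1}^{j} E_{M_i}(k) \le \frac{\epsilon}{100}\, g(|k|) \text{ if } |k| > M_{j+1}, \tag{127}$$
which is possible by (124). We claim that (122) holds for this choice of $M_j$. To show this, we start with
$$|\widehat{\psi G}(k) - \hat\psi(k)| \le \frac{1}{N} \sum_{i=1}^{N} E_{M_i}(k). \tag{128}$$
Assume that $M_j \le |k| \le M_{j+1}$ (the cases $|k| \le M_1$ and $|k| \ge M_N$ are similar and are left to the reader). By (127) we have
$$\frac{1}{N} \sum_{i=1}^{j-1} E_{M_i}(k) \le \frac{\epsilon}{100}\, g(|k|).$$
We also see from (123), (125) that
$$\frac{1}{N} E_{M_j}(k) \le \frac{C}{N}\, g(|k|) \le \frac{\epsilon}{100}\, g(|k|),$$
$$\frac{1}{N} E_{M_{j+1}}(k) \le \frac{C}{N}\, g(|k|) + \frac{C}{N} M_{j+1}^{-100} \le \frac{\epsilon}{100}\, g(|k|) + \frac{C}{N} M_{j+1}^{-100},$$
$$\frac{1}{N} E_{M_i}(k) \le \frac{C}{N} M_i^{-100} \text{ for } j + 2 \le i \le N.$$
Thus the right side of (128) is bounded by
$$\frac{3\epsilon}{100}\, g(|k|) + \frac{C}{N} \sum_{i=j+1}^{N} M_i^{-100}.$$
By (126), we can estimate the last sum by
$$\frac{\epsilon}{100 N} \sum_{i=j+1}^{N} g(M_i) \le \frac{\epsilon}{100}\, g(M_{j+1}) \le \frac{\epsilon}{100}\, g(|k|),$$
which proves (122).

We note that the support properties of $G$ are similar to those of the $F$'s. Namely, it follows from (116) that
$$\operatorname{supp} G \subset \Big\{x : \Big|x - \frac{a}{p}\Big| \le p^{-(2+\alpha)} \text{ for some } p \in \Big(\frac{M_1}{2}, M_N\Big),\ a \in [0, p]\Big\}. \tag{129}$$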

III. We now construct inductively the functions $G_m$ and $H_m$, $m = 1, 2, \dots$, as follows. Let $G_0 \equiv 1$. If $G_m$ has been constructed, we let $G_{m+1}$ be as in step II with $\psi = G_0 G_1 \dots G_m$, $M_0 \ge 10 r_0 + m$, and $\epsilon = 2^{-m-2}$. Then the functions $H_m = G_1 \dots G_m$ satisfy
$$\frac{1}{2} \le \hat H_m(0) \le \frac{3}{2}$$
for each $m$ and moreover the estimate (118) holds also for the $H$'s, i.e.
$$|\hat H_m(k)| \lesssim |k|^{-\frac{1}{2+\alpha}} \log |k|.$$

IV. Now let $\mu$ be a weak limit point of the sequence $\{H_m\, dx\}$. The support of $\mu$ will be contained in the intersection of the supports of the $\{G_m\}$, hence by (129) it will be a compact subset of $E_\alpha$. From step III, we will have $|\hat\mu(k)| \lesssim |k|^{-\frac{1}{2+\alpha}} \log |k|$. The theorem now follows by Lemma 9A4.
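The decay rate $|k|^{-\frac{1}{2+\alpha}} \log |k|$ that drives the whole construction comes from (117) with $N = 1$. A small numerical sanity check of that deduction is sketched below. It drops the constant $C_N$, and the names `bound_117`, `bound_118` and `worst_ratio` are ours, introduced only for illustration.

```python
import math


def bound_117(k, M, alpha, N=1):
    """Right side of (117) without the constant C_N."""
    return (math.log(k) / M) * (1.0 + k / M ** (2.0 + alpha)) ** (-N)


def bound_118(k, alpha):
    """Right side of (118): |k|^(-1/(2+alpha)) * log|k|."""
    return k ** (-1.0 / (2.0 + alpha)) * math.log(k)


def worst_ratio(alpha, Ms, ks):
    """Largest value of bound_117 / bound_118 over the given M and k (k >= 2)."""
    return max(bound_117(k, M, alpha) / bound_118(k, alpha) for M in Ms for k in ks)


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 3.0):
        print(alpha, worst_ratio(alpha, [2, 10, 100, 1000], range(2, 5000)))
```

The ratio stays bounded independently of $M$, which is the uniformity asserted after (118). The reason is the case split $|k| \le M^{2+\alpha}$ (use $M^{-1} \le |k|^{-\frac{1}{2+\alpha}}$) versus $|k| \ge M^{2+\alpha}$ (use $(1 + |k| M^{-(2+\alpha)})^{-1} \le M^{2+\alpha}/|k|$).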

B. Distance sets

If $E$ is a compact set in $\mathbb{R}^2$ (or in $\mathbb{R}^n$), the distance set $\Delta(E)$ is defined as
$$\Delta(E) = \{|x - y| : x, y \in E\}.$$
One version of Falconer's distance set problem is the conjecture that
$$E \subset \mathbb{R}^2,\ \dim E > 1 \Rightarrow |\Delta(E)| > 0.$$
One can think of this as a version of the Marstrand projection theorem where the nonlinear projection $(x, y) \to |x - y|$ replaces the linear ones. In fact, it is also possible to make the stronger conjecture that the pinned distance sets
$$\{|x - y| : y \in E\}$$
should have positive measure for some $x \in E$, or for a set of $x \in E$ with large Hausdorff dimension. This would be analogous to Theorem 8.8 with the nonlinear maps $y \to |x - y|$ replacing the projections $P_e$.

Alternately, one can consider this problem as a continuous analogue of a well known open problem in discrete geometry (Erdős' distance set problem): prove that for finite sets $F \subset \mathbb{R}^2$ there is a bound $|\Delta(F)| \ge C_\epsilon^{-1} |F|^{1-\epsilon}$, $\epsilon > 0$. The example $F = \mathbb{Z}^2 \cap D(0, N)$, $N \to \infty$ can be used to show that in Erdős' problem one cannot take $\epsilon = 0$, and a related example [11] shows that in Falconer's problem it does not suffice to assume that $H^1(E) > 0$. The current best result on Erdős' problem is $\epsilon = \frac{1}{7}$ due to Solymosi and Tóth [31] (there were many previous contributions), and on Falconer's problem the current best result is $\dim E > \frac{4}{3}$ due to myself [37] using previous work of Mattila [24] and Bourgain [4].

The strongest results on Falconer's problem have been proved using Fourier transforms in a manner analogous to the proof of Theorem 8.8. We describe the basic strategy, which is due to Mattila [24]. Given a measure $\mu$ on $E$, there is a natural way to put a measure on $\Delta(E)$, namely push forward the measure $\mu \times \mu$ by the map $\Delta : (x, y) \to |x - y|$. If one can show that the pushforward measure has an $L^2$ Fourier transform, then $\Delta(E)$ must have positive measure by Theorem 3.13.

In fact one proceeds slightly differently for technical reasons. Let $\mu$ be a measure in $\mathbb{R}^2$, then [24] one associates to it the measure $\nu$ defined as follows. Let $\nu_0 = \Delta(\mu \times \mu)$, i.e.
$$\int f\, d\nu_0 = \int\!\!\int f(|x - y|)\, d\mu(x)\, d\mu(y).$$
Observe that
$$\int t^{-\frac{1}{2}}\, d\nu_0(t) = I_{\frac{1}{2}}(\mu).$$
Thus if $I_{\frac{1}{2}}(\mu) < \infty$, as we will always assume, then the measure we now define will be in $M(\mathbb{R})$. Namely, let
$$d\nu(t) = e^{i\frac{\pi}{4}} t^{-\frac{1}{2}}\, d\nu_0(t) + e^{-i\frac{\pi}{4}} |t|^{-\frac{1}{2}}\, d\nu_0(-t). \tag{130}$$
Since $\nu_0$ is supported on $\Delta(E)$, $\nu$ is supported on $\Delta(E) \cup -\Delta(E)$.
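The lattice example $F = \mathbb{Z}^2 \cap D(0, N)$ from the discussion of Erdős' problem can be explored by brute force. The sketch below is an illustration only, and the function name `lattice_distance_count` is ours. It counts the points of $F$ and the number of distinct nonzero distances between them, using squared distances so that all arithmetic is exact.

```python
def lattice_distance_count(N):
    """Return (|F|, |Delta(F)|) for F = Z^2 intersected with the closed disc D(0, N).

    Distances are compared through their squares, which are integers, and only
    pairs of distinct points are considered (the distance 0 is not counted).
    """
    points = [(x, y)
              for x in range(-N, N + 1)
              for y in range(-N, N + 1)
              if x * x + y * y <= N * N]
    squared = set()
    for i, (x1, y1) in enumerate(points):
        for (x2, y2) in points[i + 1:]:
            squared.add((x1 - x2) ** 2 + (y1 - y2) ** 2)
    return len(points), len(squared)


if __name__ == "__main__":
    for N in (2, 5, 10):
        n_points, n_distances = lattice_distance_count(N)
        print(N, n_points, n_distances, n_distances / n_points)
```

Every squared distance is an integer at most $4N^2$ that is a sum of two squares, and such integers have density tending to zero, so the ratio $|\Delta(F)|/|F|$ drifts slowly downward as $N$ grows. This is the mechanism behind the remark that $\epsilon = 0$ is not attainable.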

Proposition 9B1 (Mattila [24]) Assume that $I_\alpha(\mu) < \infty$ for some $\alpha > 1$. Then the following are equivalent:
(i) $\hat\nu \in L^2(\mathbb{R})$,
(ii) the estimate
$$\int_{R=1}^{\infty} \Big(\int |\hat\mu(Re^{i\theta})|^2\, d\theta\Big)^2 R\, dR < \infty. \tag{131}$$

Corollary 9B2 [24] Suppose that $\alpha > 1$ is a number with the following property: if $\mu$ is a positive compactly supported measure with $I_\alpha(\mu) < \infty$ then
$$\int |\hat\mu(Re^{i\theta})|^2\, d\theta \le C_\mu R^{-(2-\alpha)}. \tag{132}$$
Then any compact subset of $\mathbb{R}^2$ with dimension $> \alpha$ must have a positive measure distance set.

Here and below we identify $\mathbb{R}^2$ with $\mathbb{C}$ in the obvious way.

Proof of the corollary Assume $\dim E > \alpha$. Then $E$ supports a measure $\mu$ with $I_\alpha(\mu) < \infty$. We have
$$\begin{aligned}
\int_{R=1}^{\infty} \Big(\int |\hat\mu(Re^{i\theta})|^2\, d\theta\Big)^2 R\, dR &\lesssim \int_{R=1}^{\infty} \Big(\int |\hat\mu(Re^{i\theta})|^2\, d\theta\Big) R^{-(2-\alpha)} R\, dR \\
&< \infty.
\end{aligned}$$
On the first line we used (132) to estimate one of the two angular integrals, and the last line then follows by recognizing that the resulting expression corresponds to the Fourier representation of the energy in Proposition 8.5. By Proposition 9B1, $\Delta(E) \cup -\Delta(E)$ supports a measure whose Fourier transform is in $L^2$, which suffices by Theorem 3.13.

At the end of the section we will prove (132) in the easy case $\alpha = \frac{3}{2}$ where it follows from the uncertainty principle; we believe this is due to P. Sjölin. It is known [37] that (132) holds when $\alpha > \frac{4}{3}$, and this is essentially sharp since (132) fails when $\alpha < \frac{4}{3}$. The negative result follows from a variant on the Knapp argument (Remark 4. in section 7); this is due to [24], and is presented also in several other places, e.g. [37]. The positive result requires more sophisticated $L^p$ type arguments.

Before proving the proposition we record a few more formulas. Let $\sigma_R$ be the angular measure on the circle of radius $R$ centered at zero; thus we are normalizing the arclength measure on this circle to have total mass $2\pi$. Let $\mu$ be any measure with compact support. We then have
$$\int |\hat\mu(Re^{i\theta})|^2\, d\theta = \int \hat\sigma_R * \mu\, d\mu. \tag{133}$$

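This is just one more instance of the formula which first appeared in Lemma 7.2 and was used in the proof of Proposition 8.5. This version is contained in Lemma 7.2 if $\mu$ has a Schwartz space density, and a limiting argument like the one in the proof of Proposition 8.4 shows that it holds for general $\mu$. We also record the asymptotics for $\hat\sigma_R$ which of course follow from those for $\hat\sigma_1$ (Corollary 6.7) using dilations. Notice that the passage from $\sigma_1$ to $\sigma_R$ preserves the total mass, i.e. essentially $\sigma_R = (\sigma_1)^R$. We conclude that
$$\hat\sigma_R(x) = 2 (R|x|)^{-\frac{1}{2}} \cos\Big(2\pi\Big(R|x| - \frac{1}{8}\Big)\Big) + O\big((R|x|)^{-\frac{3}{2}}\big) \tag{134}$$
when $R|x| \ge 1$, and $|\hat\sigma_R|$ is clearly also bounded independently of $R$.

Proof of the proposition From the definition of $\nu$ we have
$$\begin{aligned}
\hat\nu(k) &= e^{i\frac{\pi}{4}} \int\!\!\int |x - y|^{-\frac{1}{2}} e^{-2\pi i k |x - y|}\, d\mu(x)\, d\mu(y) + e^{-i\frac{\pi}{4}} \int\!\!\int |x - y|^{-\frac{1}{2}} e^{2\pi i k |x - y|}\, d\mu(x)\, d\mu(y) \\
&= 2 \int\!\!\int |x - y|^{-\frac{1}{2}} \cos\Big(2\pi\Big(|k||x - y| - \frac{1}{8}\Big)\Big)\, d\mu(x)\, d\mu(y).
\end{aligned}$$
On the other hand, by (133) and (134) we have
$$\begin{aligned}
\int |\hat\mu(ke^{i\theta})|^2\, d\theta &= |k|^{-\frac{1}{2}} \int\!\!\int 2 |x - y|^{-\frac{1}{2}} \cos\Big(2\pi\Big(|k||x - y| - \frac{1}{8}\Big)\Big)\, d\mu(x)\, d\mu(y) \\
&\quad + O\Big(\int\!\!\int_{|x - y| \ge |k|^{-1}} (|k||x - y|)^{-\frac{3}{2}}\, d\mu(x)\, d\mu(y)\Big) \\
&\quad + O\Big(\int\!\!\int_{|x - y| \le |k|^{-1}} (|k||x - y|)^{-\frac{1}{2}}\, d\mu(x)\, d\mu(y)\Big).
\end{aligned}$$
The last error term arises by comparing $\hat\sigma_R$, which is bounded, to the main term on the right side of (134), which is $O((R|x|)^{-\frac{1}{2}})$, in the regime $R|x| < 1$. We may combine the two error terms to obtain
$$\int |\hat\mu(ke^{i\theta})|^2\, d\theta = |k|^{-\frac{1}{2}} \int\!\!\int 2 |x - y|^{-\frac{1}{2}} \cos\Big(2\pi\Big(|k||x - y| - \frac{1}{8}\Big)\Big)\, d\mu(x)\, d\mu(y) + O\Big(\int\!\!\int (|k||x - y|)^{-\alpha}\, d\mu(x)\, d\mu(y)\Big)$$
for any $\alpha \in [\frac{1}{2}, \frac{3}{2}]$. Therefore
$$\hat\nu(k) = |k|^{\frac{1}{2}} \int |\hat\mu(ke^{i\theta})|^2\, d\theta + O\big(|k|^{\frac{1}{2} - \alpha} I_\alpha(\mu)\big).$$
The error term here is evidently bounded by $|k|^{\frac{1}{2} - \alpha} I_\alpha(\mu)$ for any $\alpha \in (1, \frac{3}{2})$, and therefore belongs to $L^2(|k| \ge 1)$. We conclude then that $\hat\nu$ belongs to $L^2$ on $|k| \ge 1$ if and only if $|k|^{\frac{1}{2}} \int |\hat\mu(ke^{i\theta})|^2\, d\theta$ does. This gives the proposition, since $\hat\nu$ (being the Fourier transform of a measure) clearly belongs to $L^2$ on $[-1, 1]$.

Proposition 9B3 If $\alpha \ge 1$ and if $\mu$ is a positive measure with compact support$^9$ then
$$\int |\hat\mu(Re^{i\theta})|^2\, d\theta \lesssim I_\alpha(\mu) R^{-(\alpha - 1)},$$
where the implicit constant depends on a bound for the radius of a disc centered at 0 which contains the support of $\mu$. In particular, (132) holds if $\alpha = \frac{3}{2}$.

Corollary 9B4 (originally due to Falconer [11] with a different proof) If $\dim E > \frac{3}{2}$ then the distance set of $E$ has positive measure.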
Proofs The corollary is immediate from the proposition and Corollary 9B2. The proof of the proposition is very similar to the proofs of Bernstein's inequality and of Theorem 7.4.

We can evidently assume that $R$ is large. Let $\phi$ be a radial $C_0^\infty$ function whose Fourier transform is $\ge 1$ on the support of $\mu$. Let $d\nu(x) = (\hat\phi(x))^{-1}\, d\mu(x)$. Then it is obvious (from the definition, not the Fourier representation) that $I_\alpha(\nu) \lesssim I_\alpha(\mu)$. Also $\hat\mu = \phi * \hat\nu$. Accordingly
$$\begin{aligned}
\int |\hat\mu(Re^{i\theta})|^2\, d\theta &= \int |\phi * \hat\nu(Re^{i\theta})|^2\, d\theta \\
&\lesssim \int\!\!\int |\phi(Re^{i\theta} - x)|\, |\hat\nu(x)|^2\, dx\, d\theta \\
&= \int |\hat\nu(x)|^2 \int |\phi(x - Re^{i\theta})|\, d\theta\, dx \\
&\lesssim R^{-1} \int_{||x| - R| \le C} |\hat\nu(x)|^2\, dx \\
&\lesssim R^{-1 + 2 - \alpha} \int |x|^{-(2 - \alpha)} |\hat\nu(x)|^2\, dx \\
&\lesssim R^{1 - \alpha} I_\alpha(\mu).
\end{aligned}$$
Here the second line follows by writing
$$|\phi * \hat\nu(Re^{i\theta})| \le \int |\phi(Re^{i\theta} - x)|\, |\hat\nu(x)|\, dx$$
and applying the Schwarz inequality. The fourth line follows since for fixed $x$ the set of $\theta$ where $\phi(x - Re^{i\theta}) \ne 0$ has measure $\lesssim R^{-1}$, and is empty if $||x| - R|$ is large. The proof is complete.

$^9$ Here, as opposed to in some previous situations, the compact support is important.

Remark The exponent $\alpha - 1$ is of course far from sharp; the sharp exponent is $\frac{\alpha}{2}$ if $\alpha > 1$, $\frac{1}{2}$ if $\alpha \in [\frac{1}{2}, 1]$ and $\alpha$ if $\alpha < \frac{1}{2}$.

Exercise Prove this in the case $\alpha \le 1$. (This is a fairly hard exercise.)

Exercise Carry out Mattila's construction (formula (130) and the preceding discussion) in the case where $\mu$ is a measure in $\mathbb{R}^n$ instead of $\mathbb{R}^2$, and prove analogues of Proposition 9B1, Corollary 9B2, Proposition 9B3. Conclude that a set in $\mathbb{R}^n$ with dimension greater than $\frac{n+1}{2}$ has a positive measure distance set. (See [24]. The dimension result is also due originally to Falconer. The conjectured sharp exponent is $\frac{n}{2}$.)

10. The Kakeya Problem

A Besicovitch set, or a Kakeya set, is a compact set $E \subset \mathbb{R}^n$ which contains a unit line segment in every direction, i.e.
$$\forall e \in S^{n-1}\ \exists x \in \mathbb{R}^n : x + te \in E\ \ \forall t \in \Big[-\frac{1}{2}, \frac{1}{2}\Big]. \tag{135}$$

Theorem 10.1 (Besicovitch, 1920) If $n \ge 2$, then there are Kakeya sets in $\mathbb{R}^n$ with measure zero.

There are many variants on Besicovitch's construction in the literature, cf. [10], Chapter 7, or [39], Section 1. There is a basic open question about Besicovitch sets which can be stated vaguely as "How small can this really be?" This can be formulated more precisely in terms of fractal dimension. If one uses the Hausdorff dimension, then the main question is the following.

Open question (the Kakeya conjecture) If $E \subset \mathbb{R}^n$ is a Kakeya set, does $E$ necessarily have Hausdorff dimension $n$?

If $n = 2$ then the answer is yes; this was proved by Davies [8] in 1971. For general $n$, what is known at present is that $\dim(E) \ge \max(\frac{n+2}{2}, (2 - \sqrt{2})(n - 4) + 3)$; the first bound, which is better for $n = 3$, is due to myself [38], and the second one is due to Katz and Tao [19]. Instead of the Hausdorff dimension one can use other notions of dimension, for example the Minkowski dimension defined in Section 8. The current best results for the upper Minkowski dimension are due to Katz, Łaba and Tao [18], [22], [19].

There is also a more quantitative formulation of the problem in terms of the Kakeya maximal functions, which are defined as follows. For any $\delta > 0$, $e \in S^{n-1}$ and $a \in \mathbb{R}^n$, let
$$T_e^\delta(a) = \Big\{x \in \mathbb{R}^n : |(x - a) \cdot e| \le \frac{1}{2},\ |(x - a)^\perp| \le \delta\Big\},$$
where $x^\perp = x - (x \cdot e) e$. Thus $T_e^\delta(a)$ is essentially the $\delta$-neighborhood of the unit line segment in the $e$ direction centered at $a$. Then the Kakeya maximal function of $f \in L^1_{loc}(\mathbb{R}^n)$ is the function $f_\delta^* : S^{n-1} \to \mathbb{R}$ defined by
$$f_\delta^*(e) = \sup_{a \in \mathbb{R}^n} \frac{1}{|T_e^\delta(a)|} \int_{T_e^\delta(a)} |f|. \tag{136}$$
The issue is to prove a $\delta^{-\epsilon}$ estimate for $f_\delta^*$, i.e. an estimate of the form
$$\forall \epsilon\ \exists C_\epsilon : \|f_\delta^*\|_{L^p(S^{n-1})} \le C_\epsilon \delta^{-\epsilon} \|f\|_p \tag{137}$$
for some $p < \infty$.

Remarks 1. It is clear from the definition that
$$\|f_\delta^*\|_\infty \le \|f\|_\infty, \tag{138}$$
$$\|f_\delta^*\|_\infty \le \delta^{-(n-1)} \|f\|_1. \tag{139}$$

2. If $n \ge 2$ and $p < \infty$, there can be no bound of the form
$$\|f_\delta^*\|_q \le C \|f\|_p, \tag{140}$$
with $C$ independent of $\delta$. This can be seen as follows. Consider a zero measure Kakeya set $E$. Let $E_\delta$ be the $\delta$-neighborhood of $E$, and let $f = \chi_{E_\delta}$. Then $f_\delta^*(e) = 1$ for all $e \in S^{n-1}$, so that $\|f_\delta^*\|_q \approx 1$. On the other hand, $\lim_{\delta \to 0} |E_\delta| = 0$, hence $\lim_{\delta \to 0} \|f\|_p = 0$ for any $p < \infty$.

3. Let $f = \chi_{D(0,\delta)}$. Then for all $e \in S^{n-1}$ the tube $T_e^\delta(0)$ contains $D(0, \delta)$, so that $f_\delta^*(e) = \frac{|D(0,\delta)|}{|T_e^\delta(0)|} \gtrsim \delta$. Hence $\|f_\delta^*\|_p \gtrsim \delta$. However, $\|f\|_p \approx \delta^{n/p}$. This shows that (137) cannot hold for any $p < n$.

Open problem (the Kakeya maximal function conjecture): prove that (137) holds with $p = n$, i.e.
$$\forall \epsilon\ \exists C_\epsilon : \|f_\delta^*\|_{L^n(S^{n-1})} \le C_\epsilon \delta^{-\epsilon} \|f\|_n. \tag{141}$$
When $n = 2$, this was proved by Córdoba [7] in a somewhat different formulation and by Bourgain [3] as stated. These results are relatively easy; from one point of view, this is because (141) is then an $L^2$ estimate. In higher dimensions the problem remains open. There are partial results on (141) which can be understood as follows. Interpolating between (139), which is the best possible bound on $L^1$, and (141) gives a family of conjectured inequalities
$$\|f_\delta^*\|_q \le C_\epsilon \delta^{-\frac{n}{p} + 1 - \epsilon} \|f\|_p, \quad q = q(p). \tag{142}$$
Note that if (142) holds for some $p_0 > 1$, it also holds for all $1 \le p \le p_0$ (again by interpolating with (139)). The current best results in this direction are that (142) holds with $p = \max((n + 2)/2, (4n + 3)/7)$ and a suitable $q$ [38], [19].
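To see which of the quoted bounds is the stronger one in a given dimension, one can simply tabulate them. The sketch below is illustrative, and the function names are ours. It treats both dimension bounds as valid lower bounds, so the larger one applies, and likewise compares the two exponents $p$ for (142), where the larger $p$ is the stronger statement because (142) for $p_0$ implies it for all smaller $p$.

```python
import math


def wolff_dimension_bound(n):
    """Lower bound (n + 2) / 2 for the Hausdorff dimension of Kakeya sets [38]."""
    return (n + 2) / 2


def katz_tao_dimension_bound(n):
    """Lower bound (2 - sqrt(2)) (n - 4) + 3 from [19]."""
    return (2 - math.sqrt(2)) * (n - 4) + 3


def maximal_function_exponents(n):
    """The two exponents p quoted for (142): (n + 2) / 2 and (4n + 3) / 7."""
    return (n + 2) / 2, (4 * n + 3) / 7


if __name__ == "__main__":
    for n in range(3, 12):
        d1, d2 = wolff_dimension_bound(n), katz_tao_dimension_bound(n)
        p1, p2 = maximal_function_exponents(n)
        print(n, round(max(d1, d2), 4), round(max(p1, p2), 4))
```

The two dimension bounds coincide at $n = 4$, with $\frac{n+2}{2}$ larger for $n = 3$ and the Katz-Tao bound larger for $n \ge 5$. The two exponents $p$ coincide at $n = 8$.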

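Proposition 10.2 If (137) holds for some $p < \infty$, then Besicovitch sets in $\mathbb{R}^n$ have Hausdorff dimension $n$.

Remark The inequality
$$|E_\delta| \ge C_\epsilon^{-1} \delta^\epsilon \tag{143}$$
for any Kakeya set $E$ follows immediately from (137) by the same argument that showed (140). (143) says that Besicovitch sets in $\mathbb{R}^n$ have lower Minkowski dimension $n$.

Proof of the proposition Let $E$ be a Besicovitch set. Fix a covering of $E$ by discs $D_j = D(x_j, r_j)$; we can assume that all $r_j$'s are $\le 1/100$. Let
$$J_k = \{j : 2^{-k} \le r_j \le 2^{-(k-1)}\}.$$
For every $e \in S^{n-1}$, $E$ contains a unit line segment $I_e$ parallel to $e$. Let
$$S_k = \Big\{e \in S^{n-1} : \Big|I_e \cap \bigcup_{j \in J_k} D_j\Big| \ge \frac{1}{100 k^2}\Big\}.$$
Since $\sum_k \frac{1}{100 k^2} < 1$ and $\sum_k |I_e \cap \bigcup_{j \in J_k} D_j| \ge |I_e| = 1$, it follows that $\bigcup_{k=1}^{\infty} S_k = S^{n-1}$. Let
$$f = \chi_{F_k}, \quad F_k = \bigcup_{j \in J_k} D(x_j, 10 r_j).$$
Then for $e \in S_k$ we have
$$|T_e^{2^{-k}}(a_e) \cap F_k| \gtrsim \frac{1}{100 k^2} |T_e^{2^{-k}}(a_e)|,$$
where $a_e$ is the midpoint of $I_e$ so that $T_e^{2^{-k}}(a_e)$ is a tube of radius $2^{-k}$ around $I_e$. Hence
$$\|f_{2^{-k}}^*\|_p \gtrsim k^{-2} \sigma(S_k)^{1/p}. \tag{144}$$
On the other hand, (137) implies that
$$\|f_{2^{-k}}^*\|_p \le C_\epsilon 2^{k\epsilon} \|f\|_p \le C_\epsilon 2^{k\epsilon} \big(|J_k| \cdot 2^{-(k-1)n}\big)^{1/p}. \tag{145}$$
Comparing (144) and (145), we see that
$$\sigma(S_k) \lesssim 2^{kp\epsilon - kn} k^{2p} |J_k| \lesssim 2^{-k(n - 2p\epsilon)} |J_k|.$$
Therefore
$$\sum_j r_j^{n - 2p\epsilon} \ge \sum_k 2^{-k(n - 2p\epsilon)} |J_k| \gtrsim \sum_k \sigma(S_k) \gtrsim 1.$$
We have shown that $\sum_j r_j^\alpha \gtrsim 1$ for any $\alpha < n$, which implies the claimed Hausdorff dimension bound.

Remark By the same argument as in the proof of Proposition 10.2, (142) implies that the dimension of a Kakeya set in $\mathbb{R}^n$ is at least $p$.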

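A. The n = 2 case

Theorem 10.3 If $n = 2$, then there is a bound
$$\|f_\delta^*\|_2 \le C \Big(\log \frac{1}{\delta}\Big)^{1/2} \|f\|_2.$$

We give two different proofs of the theorem. The first one is due to Bourgain [3] and uses Fourier analysis. The second one is due to Córdoba [7] and is based on geometric arguments.

Proof 1 (Bourgain) We can assume that $f$ is nonnegative. Let $\rho_e^\delta(x) = (2\delta)^{-1} \chi_{T_e^\delta(0)}$, then
$$f_\delta^*(e) = \sup_{a \in \mathbb{R}^2} (\rho_e^\delta * f)(a).$$
Let $\phi$ be a nonnegative Schwartz function on $\mathbb{R}$ such that $\hat\phi$ has compact support and $\phi(x) \ge 1$ when $|x| \le 1$. Define $\psi : \mathbb{R}^2 \to \mathbb{R}$ by
$$\psi(x) = \phi(x_1)\, \delta^{-1} \phi(\delta^{-1} x_2).$$
Note that $\psi \ge \rho_e^\delta$ when $e = e_1$, so that $f_\delta^*(e_1) \le \sup_a (\psi * f)(a)$. Similarly
$$f_\delta^*(e) \le \sup_a (\psi_e * f)(a),$$
where $\psi_e = \psi \circ p_e$ for an appropriate rotation $p_e$. Hence
$$f_\delta^*(e) \le \|\psi_e * f\|_\infty \le \|\hat\psi_e \hat f\|_1 = \int |\hat\psi_e(\xi)|\, |\hat f(\xi)|\, d\xi. \tag{146}$$
By Hölder's inequality,
$$\int |\hat\psi_e(\xi)|\, |\hat f(\xi)|\, d\xi \le \Big(\int |\hat\psi_e(\xi)|\, |\hat f(\xi)|^2 (1 + |\xi|)\, d\xi\Big)^{1/2} \Big(\int \frac{|\hat\psi_e(\xi)|}{1 + |\xi|}\, d\xi\Big)^{1/2}. \tag{147}$$
Note that $\hat\psi_e = \hat\psi \circ p_e$ and $\hat\psi(\xi) = \hat\phi(\xi_1) \hat\phi(\delta \xi_2)$, so that $|\hat\psi_e| \lesssim 1$ and $\hat\psi_e$ is supported on a rectangle $R_e$ of size about $1 \times 1/\delta$. Accordingly,
$$\int \frac{|\hat\psi_e(\xi)|}{1 + |\xi|}\, d\xi \lesssim \int_{R_e} \frac{d\xi}{1 + |\xi|} \lesssim \int_1^{1/\delta} \frac{ds}{s} = \log\Big(\frac{1}{\delta}\Big). \tag{148}$$
Using (146), (147) and (148) we obtain
$$\begin{aligned}
\|f_\delta^*\|_2^2 &\lesssim \log\Big(\frac{1}{\delta}\Big) \int_{S^1} \int |\hat\psi_e(\xi)|\, |\hat f(\xi)|^2 (1 + |\xi|)\, d\xi\, de \\
&\lesssim \log\Big(\frac{1}{\delta}\Big) \int_{\mathbb{R}^2} |\hat f(\xi)|^2 (1 + |\xi|) \Big(\int_{S^1} |\hat\psi_e(\xi)|\, de\Big) d\xi \\
&\lesssim \log\frac{1}{\delta} \int_{\mathbb{R}^2} |\hat f(\xi)|^2\, d\xi \\
&= \log\frac{1}{\delta}\, \|f\|_2^2.
\end{aligned}$$
Here the third line follows since for fixed $\xi$ the set of $e \in S^{n-1}$ where $\hat\psi_e(\xi) \ne 0$ has measure $\lesssim 1/(1 + |\xi|)$. The proof is complete.

Remark For $n \ge 3$, the same argument shows that
$$\|f_\delta^*\|_2 \lesssim \delta^{-(n-2)/2} \|f\|_2, \tag{149}$$
which is the best possible $L^2$ bound.

Proof 2 (Córdoba) The proof uses the following duality argument.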
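Lemma 10.4 Let $1 < p < \infty$, and let $p'$ be the dual exponent of $p$: $\frac{1}{p} + \frac{1}{p'} = 1$. Suppose that $p$ has the following property: if $\{e_k\} \subset S^{n-1}$ is a maximal $\delta$-separated set, and if $\delta^{n-1} \sum_k y_k^{p'} \le 1$, then for any choice of points $a_k \in \mathbb{R}^n$ we have
$$\Big\|\sum_k y_k \chi_{T_{e_k}^\delta(a_k)}\Big\|_{p'} \le A.$$
Then there is a bound
$$\|f_\delta^*\|_{L^p(S^{n-1})} \lesssim A \|f\|_p.$$

Proof. Let $\{e_k\}$ be a maximal $\delta$-separated subset of $S^{n-1}$. Observe that if $|e - e'| < \delta$ then $f_\delta^*(e) \le C f_\delta^*(e')$; this is because any $T_e^\delta(a)$ can be covered by a bounded number of tubes $T_{e'}^\delta(a')$. Therefore
$$\|f_\delta^*\|_p^p \le \sum_k \int_{D(e_k, \delta)} |f_\delta^*(e)|^p\, de \lesssim \delta^{n-1} \sum_k |f_\delta^*(e_k)|^p,$$
so that
$$\|f_\delta^*\|_p \lesssim \Big(\delta^{n-1} \sum_k |f_\delta^*(e_k)|^p\Big)^{1/p} = \delta^{n-1} \sum_k y_k |f_\delta^*(e_k)|$$
for some sequence $y_k$ with $\delta^{n-1} \sum_k y_k^{p'} = 1$. On the last line we used the duality between $l^p$ and $l^{p'}$. Hence
$$\|f_\delta^*\|_p \lesssim \delta^{n-1} \sum_k y_k \frac{1}{|T_{e_k}^\delta(a_k)|} \int_{T_{e_k}^\delta(a_k)} |f|$$
for some choice of $\{a_k\}$. Since $|T_{e_k}^\delta(a_k)| \approx \delta^{n-1}$, it follows that
$$\begin{aligned}
\|f_\delta^*\|_p &\lesssim \int \Big(\sum_k y_k \chi_{T_{e_k}^\delta(a_k)}\Big) |f| \\
&\le \Big\|\sum_k y_k \chi_{T_{e_k}^\delta(a_k)}\Big\|_{p'} \|f\|_p \quad \text{(Hölder's inequality)} \\
&\le A \|f\|_p
\end{aligned}$$
as claimed.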

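We continue with Córdoba's proof. In view of Lemma 10.4, it suffices to prove that for any sequence $\{y_k\}$ with $\delta \sum_k y_k^2 = 1$ and any maximal $\delta$-separated subset $\{e_k\}$ of $S^1$ we have
$$\Big\|\sum_k y_k \chi_{T_{e_k}^\delta(a_k)}\Big\|_2 \lesssim \sqrt{\log \frac{1}{\delta}}. \tag{150}$$
The relevant geometric fact is
$$|T_{e_k}^\delta(a) \cap T_{e_l}^\delta(b)| \lesssim \frac{\delta^2}{|e_k - e_l| + \delta}. \tag{151}$$
Using (151) we estimate
$$\begin{aligned}
\Big\|\sum_k y_k \chi_{T_{e_k}^\delta(a_k)}\Big\|_2^2 &= \sum_{k,l} y_k y_l |T_{e_k}^\delta(a_k) \cap T_{e_l}^\delta(a_l)| \\
&\lesssim \sum_{k,l} y_k y_l \frac{\delta^2}{|e_k - e_l| + \delta} \\
&= \sum_{k,l} \sqrt{\delta} y_k \sqrt{\delta} y_l \frac{\delta}{|e_k - e_l| + \delta}.
\end{aligned} \tag{152}$$
Observe that for fixed $k$
$$\sum_l \frac{\delta}{|e_l - e_k| + \delta} \approx \sum_{l \le 1/\delta} \frac{\delta}{l\delta + \delta} = \sum_{l \le 1/\delta} \frac{1}{l + 1} \approx \log \frac{1}{\delta},$$
and similarly for fixed $l$
$$\sum_k \frac{\delta}{|e_k - e_l| + \delta} \approx \log \frac{1}{\delta}.$$
Applying Schur's test (Lemma 7.5) to the kernel $\delta/(|e_k - e_l| + \delta)$ we obtain that
$$\Big\|\sum_k y_k \chi_{T_{e_k}^\delta(a_k)}\Big\|_2^2 \lesssim \log \frac{1}{\delta} \sum_k (\sqrt{\delta} y_k)^2 = \log \frac{1}{\delta}, \tag{153}$$
which proves (150).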
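The row sums that enter Schur's test here can be computed directly. The sketch below is an illustration under simplifying assumptions: it uses equally spaced directions on $S^1$ with angular spacing close to $\delta$ as a stand-in for a maximal $\delta$-separated set, and the function name `schur_row_sum` is ours.

```python
import math


def schur_row_sum(delta, k=0):
    """Sum over l of delta / (|e_l - e_k| + delta) for roughly delta-spaced directions on S^1."""
    m = int(2 * math.pi / delta)  # number of directions; angular spacing is about delta
    directions = [(math.cos(2 * math.pi * j / m), math.sin(2 * math.pi * j / m))
                  for j in range(m)]
    ek = directions[k]
    total = 0.0
    for e in directions:
        chord = math.hypot(e[0] - ek[0], e[1] - ek[1])  # |e_l - e_k|
        total += delta / (chord + delta)
    return total


if __name__ == "__main__":
    for delta in (1e-2, 1e-3, 1e-4):
        s = schur_row_sum(delta)
        print(delta, s, s / math.log(1 / delta))
```

The ratio of the row sum to $\log \frac{1}{\delta}$ stays within fixed bounds as $\delta$ decreases, which is the content of the two displayed "$\approx \log \frac{1}{\delta}$" estimates.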

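B. Kakeya Problem vs. Restriction Problem

Recall that the restriction conjecture states that
$$\|\widehat{f d\sigma}\|_p \le C_p \|f\|_{L^\infty(S^{n-1})} \text{ if } p > \frac{2n}{n-1}.$$
In fact, the stronger estimate
$$\|\widehat{f d\sigma}\|_p \le C_p \|f\|_{L^p(S^{n-1})} \text{ if } p > \frac{2n}{n-1} \tag{154}$$
can be proved to be formally equivalent, see e.g. [3].

It is known that the restriction conjecture implies the Kakeya conjecture. This is due to Bourgain [3], although a related construction had appeared earlier in [2]; both constructions are variants on the argument in [13].

Proposition 10.5 (Fefferman, Bourgain) If (154) is true then the conjectured bound
$$\|f_\delta^*\|_n \le C_\epsilon \delta^{-\epsilon} \|f\|_n$$
is also true.

Proof We will use Lemma 10.4. Accordingly, we choose a maximal $\delta$-separated set $\{e_j\}$ on $S^{n-1}$; observe that such a set has cardinality $\approx \delta^{-(n-1)}$. For each $j$ pick a tube $T_{e_j}^\delta(a_j)$, and let $\tau_j$ be the cylinder obtained by dilating $T_{e_j}^\delta(a_j)$ by a factor of $\delta^{-2}$ around the origin. Thus $\tau_j$ has length $\delta^{-2}$, cross-section radius $\delta^{-1}$, and axis in the $e_j$ direction. Also let
$$S_j = \{e \in S^{n-1} : 1 - e \cdot e_j \le C^{-1} \delta^2\}.$$
Then $S_j$ is a spherical cap of radius approximately $C^{-1} \delta$, centered at $e_j$. We choose the constant $C$ large enough so that the $S_j$'s are disjoint. Knapp's construction (see chapter 7) gives a smooth function $f_j$ on $S^{n-1}$ such that $f_j$ is supported on $S_j$ and
$$\|f_j\|_{L^\infty(S^{n-1})} = 1, \quad |\widehat{f_j d\sigma}| \gtrsim \delta^{n-1} \text{ on } \tau_j.$$
We consider functions of the form
$$f_\omega = \sum_j \omega_j y_j f_j,$$
where $y_j$ are nonnegative coefficients and $\omega_j$ are independent random variables taking values $\pm 1$ with equal probability. Since the $f_j$ have disjoint supports, we have
$$\|f_\omega\|_{L^q(S^{n-1})}^q = \sum_j \|y_j f_j\|_{L^q(S^{n-1})}^q \approx \delta^{n-1} \sum_j y_j^q. \tag{155}$$
On the other hand,
$$\begin{aligned}
E\big(\|\widehat{f_\omega d\sigma}\|_{L^q(\mathbb{R}^n)}^q\big) &= \int_{\mathbb{R}^n} E\big(|\widehat{f_\omega d\sigma}(x)|^q\big)\, dx \\
&\approx \int_{\mathbb{R}^n} \Big(\sum_j y_j^2 |\widehat{f_j d\sigma}(x)|^2\Big)^{q/2} dx \quad \text{(Khinchin's inequality)} \\
&\gtrsim \delta^{q(n-1)} \int \Big|\sum_j y_j^2 \chi_{\tau_j}(x)\Big|^{q/2} dx.
\end{aligned} \tag{156}$$
Assume now that (154) is true. Then for any $q > \frac{2n}{n-1}$ it follows from (155) and (156) that
$$\delta^{q(n-1)} \int_{\mathbb{R}^n} \Big|\sum_j y_j^2 \chi_{\tau_j}(x)\Big|^{q/2} dx \lesssim \delta^{n-1} \sum_j y_j^q.$$
Let $z_j = y_j^2$ and $p' = q/2$, then the above inequality is equivalent to the statement
$$\text{if } \delta^{n-1} \sum_j z_j^{q/2} \le 1, \text{ then } \Big\|\sum_j z_j \chi_{\tau_j}\Big\|_{q/2} \lesssim \delta^{-2(n-1)} \tag{157}$$
for any $p' > \frac{n}{n-1}$. We now rescale this by $\delta^2$ to obtain
$$\text{if } \delta^{n-1} \sum_j z_j^{p'} \le 1, \text{ then } \Big\|\sum_j z_j \chi_{T_j}\Big\|_{p'} \lesssim \delta^{2(\frac{n}{p'} - (n-1))},$$
where $T_j = T_{e_j}^\delta(a_j)$. Observe that $\frac{n}{p'} - (n - 1) \to 0$ as $p' \to \frac{n}{n-1}$. Thus for any $\epsilon > 0$ we have
$$\text{if } \delta^{n-1} \sum_j z_j^{p'} \le 1, \text{ then } \Big\|\sum_j z_j \chi_{T_j}\Big\|_{p'} \lesssim \delta^{-\epsilon} \tag{158}$$
if $p'$ is close enough to $\frac{n}{n-1}$. By Lemma 10.4, this implies that for any $\epsilon > 0$
$$\|f_\delta^*\|_p \lesssim \delta^{-\epsilon} \|f\|_p$$
provided that $p < n$ is close enough to $n$. Interpolating this with the trivial $L^\infty$ bound, we conclude that
$$\|f_\delta^*\|_n \lesssim \delta^{-\epsilon} \|f\|_n$$
as claimed.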

We proved that the restriction conjecture is stronger than the Kakeya conjecture. Bourgain [3] partially reversed this and obtained a restriction theorem beyond Stein-Tomas by using a Kakeya set estimate that is stronger than the $L^2$ bound (153) used in the proof of (149). It is not known whether (either version of) the Kakeya conjecture implies the full restriction conjecture.

Theorem 10.6 (Bourgain [3]) Suppose that we have an estimate
$$\Big\|\sum_j \chi_{T_{e_j}^\delta(a_j)}\Big\|_{q'} \le C_\epsilon \delta^{-(\frac{n}{q} - 1 + \epsilon)} \tag{159}$$
for any given $\epsilon > 0$ and for some fixed $q > 2$. Then
$$\|\widehat{f d\sigma}\|_p \le C_p \|f\|_{L^\infty(S^{n-1})} \tag{160}$$
for some $p < \frac{2n+2}{n-1}$.

Remark The geometrical statement corresponding to (159) is that Kakeya sets in $\mathbb{R}^n$ have dimension at least $q$.

We will sketch the proof only for $n = 3$. Recall that in $\mathbb{R}^3$ we have the estimates
$$\|\widehat{f d\sigma}\|_4 \lesssim \|f\|_{L^2(S^2)} \tag{161}$$
from the Stein-Tomas theorem, and
$$\|\widehat{f d\sigma}\|_{L^2(D(0,R))} \lesssim R^{1/2} \|f\|_{L^2(S^2)} \tag{162}$$
from Theorem 7.4 with $\alpha = n - 1 = 2$. Interpolating (161) and (162) yields a family of estimates
$$\|\widehat{f d\sigma}\|_{L^p(D(0,R))} \lesssim R^{\frac{2}{p} - \frac{1}{2}} \|f\|_{L^2(S^2)} \tag{163}$$
for $2 \le p \le 4$. Below we sketch an argument showing that the exponent of $R$ in (163) can be lowered if the $L^2$ norm on the right side is replaced by the $L^\infty$ norm.

Proposition 10.7 Let $n = 3$, $2 < p < 4$, and assume that (159) holds for some $q > 2$. Then
$$\|\widehat{f d\sigma}\|_{L^p(D(0,R))} \lesssim R^{\alpha(p)} \|f\|_{L^\infty(S^{n-1})},$$
where $\alpha(p) < \frac{2}{p} - \frac{1}{2}$.

This of course implies (160) for all $p$ such that $\alpha(p) \le 0$; in particular, there are $p < 4$ for which (160) holds.

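Heuristic proof of the proposition Assume that $\|f\|_{L^\infty(S^{n-1})} = 1$, and let $\delta = R^{-1}$. We cover $S^2$ by spherical caps
$$S_j = \{e \in S^2 : 1 - e \cdot e_j \le \delta\},$$
where $\{e_j\}$ is a maximal $\sqrt{\delta}$-separated set on $S^2$. Then
$$f = \sum_j f_j,$$
where each $f_j$ is supported on $S_j$. Let $G = \widehat{f d\sigma}$ and $G_j = \widehat{f_j d\sigma}$, so that $G = \sum_j G_j$. By the uncertainty principle $|G_j|$ is roughly constant on cylinders of length $R$ and diameter $\sqrt{R}$ pointing in the $e_j$ direction. To simplify the presentation, we now make the assumption$^{10}$ that $G_j$ is supported on only one such cylinder $\tau_j$.

$^{10}$ It is because of this assumption that our proof is merely heuristic. Of course the Fourier transform of a compactly supported measure cannot be compactly supported; the rigorous proof uses the Schwartz decay of $G_j$ instead.

Next, we cover $D(0, R)$ with disjoint cubes $Q$ of side $\sqrt{R}$. For each $Q$ we denote by $N(Q)$ the number of cylinders $\tau_j$ which intersect it. Note that $G|_Q = \sum_j G_j|_Q$, where we sum only over those $j$'s for which $\tau_j$ intersects $Q$. Using this and (163), we can estimate $\|G\|_{L^p(Q)}$ for $2 \le p \le 4$:
$$\begin{aligned}
\|G\|_{L^p(Q)} &\lesssim \big(\sqrt{R}\big)^{\frac{2}{p} - \frac{1}{2}} \Big\|\sum_{j : \tau_j \cap Q \ne \emptyset} f_j\Big\|_{L^2(S^2)} \\
&\lesssim R^{\frac{1}{p} - \frac{1}{4}} \big(N(Q) \cdot |S_i|\big)^{1/2} \\
&\approx \delta^{\frac{3}{4} - \frac{1}{p}} N(Q)^{1/2}.
\end{aligned} \tag{164}$$
Summing over $Q$, we obtain
$$\begin{aligned}
\|G\|_{L^p(D(0,R))}^p &\lesssim \delta^{\frac{3p}{4} - 1} \sum_Q N(Q)^{p/2} \\
&\approx \delta^{\frac{3p}{4} + \frac{1}{2}} \Big\|\sum_j \chi_{\tau_j}\Big\|_{p/2}^{p/2}.
\end{aligned} \tag{165}$$
On the last line we used that
$$\Big\|\sum_j \chi_{\tau_j}\Big\|_{p/2}^{p/2} = \sum_Q N(Q)^{p/2} |Q| = \delta^{-3/2} \sum_Q N(Q)^{p/2}.$$
We now let $p = 2q'$, where $q$ is the exponent in (159), and assume that $p$ is sufficiently close to 4 (interpolate (159) with (149) if necessary). We have from (159), applied with $\sqrt{\delta}$ in place of $\delta$,
$$\Big\|\sum_j \chi_{T_{e_j}^{\sqrt{\delta}}(a_j)}\Big\|_{q'} \le C_\epsilon \sqrt{\delta}^{\,-(\frac{3}{q} - 1 + \epsilon)}.$$
Rescaling this inequality by $\delta^{-1}$ we obtain
$$\Big\|\sum_j \chi_{\tau_j}\Big\|_{q'} \lesssim \sqrt{\delta}^{\,-(\frac{3}{q} - 1 + \epsilon)}\, \delta^{-3/q'} \lesssim \delta^{-1 - \frac{3}{p} - \epsilon}.$$
We combine this with (165) and conclude that (with $\epsilon$ denoting an arbitrarily small positive number whose value may change from line to line)
$$\|G\|_{L^p(D(0,R))}^p \lesssim \delta^{\frac{p}{4} - 1 - \epsilon},$$
i.e.,
$$\|\widehat{f d\sigma}\|_{L^p(D(0,R))} \lesssim \delta^{\frac{1}{4} - \frac{1}{p} - \epsilon} = R^{\frac{1}{p} - \frac{1}{4} + \epsilon},$$
which proves the proposition since $\frac{1}{p} - \frac{1}{4} < \frac{2}{p} - \frac{1}{2}$ if $p < 4$.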
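The exponent bookkeeping in this heuristic argument is easy to get wrong, so it may help to check it with exact rational arithmetic. The sketch below rests on our reading of the argument, which should be treated as an assumption: $\delta = R^{-1}$, the summed estimate carries the power $\delta^{\frac{3p}{4} + \frac{1}{2}}$, and the rescaled Kakeya estimate in $L^{p/2}$ carries the power $\delta^{-(1 + \frac{3}{p})}$, with all $\epsilon$ losses dropped. The function name `heuristic_exponents` is ours.

```python
from fractions import Fraction


def heuristic_exponents(p):
    """Return (power of delta bounding ||G||_p^p, resulting alpha(p), exponent 2/p - 1/2 from (163))."""
    p = Fraction(p)
    summed = Fraction(3, 4) * p + Fraction(1, 2)  # power of delta after summing over cubes Q
    rescaled = -(1 + 3 / p)                       # power of delta in the rescaled Kakeya estimate
    total = summed + (p / 2) * rescaled           # the L^{p/2} norm enters to the power p/2
    alpha_p = -(total / p)                        # delta = 1/R, then take p-th roots
    interpolation_exponent = 2 / p - Fraction(1, 2)
    return total, alpha_p, interpolation_exponent


if __name__ == "__main__":
    for p in (Fraction(5, 2), 3, Fraction(7, 2)):
        total, alpha_p, baseline = heuristic_exponents(p)
        print(p, total, alpha_p, baseline, alpha_p < baseline)
```

For every $2 < p < 4$ this gives the power $\frac{p}{4} - 1$ of $\delta$, hence $\alpha(p) = \frac{1}{p} - \frac{1}{4}$ up to $\epsilon$, which is strictly smaller than the exponent $\frac{2}{p} - \frac{1}{2}$ obtained by interpolation alone.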


References
[1] W. Beckner, Inequalities in Fourier analysis, Ann. of Math. 102 (1975), 159-182.
[2] W. Beckner, A. Carbery, S. Semmes, F. Soria, A note on restriction of the Fourier transform to spheres, Bull. London Math. Soc. 21 (1989), 394-398.
[3] J. Bourgain, Besicovitch type maximal operators and applications to Fourier analysis, Geometric and Functional Analysis 1 (1991), 147-187.
[4] J. Bourgain, Hausdorff dimension and distance sets, Israel J. Math. 87 (1994), 193-201.
[5] J. Bourgain, $L^p$ estimates for oscillatory integrals in several variables, Geometric and Functional Analysis 1 (1991), 321-374.
[6] L. Carleson, Selected Problems on Exceptional Sets, Van Nostrand Mathematical Studies No. 13, Van Nostrand Co., Inc., Princeton-Toronto-London, 1967.
[7] A. Córdoba, The Kakeya maximal function and spherical summation multipliers, Amer. J. Math. 99 (1977), 1-22.
[8] R. O. Davies, Some remarks on the Kakeya problem, Proc. Cambridge Phil. Soc. 69 (1971), 417-421.
[9] K. M. Davis, Y. C. Chang, Lectures on Bochner-Riesz Means, London Mathematical Society Lecture Note Series 114, Cambridge University Press, Cambridge, 1987.
[10] K. J. Falconer, The geometry of fractal sets, Cambridge University Press, 1985.
[11] K. J. Falconer, On the Hausdorff dimension of distance sets, Mathematika 32 (1985), 206-212.
[12] C. Fefferman, Inequalities for strongly singular convolution operators, Acta Math. 124 (1970), 9-36.
[13] C. Fefferman, The multiplier problem for the ball, Ann. Math. 94 (1971), 330-336.
[14] G. Folland, Harmonic Analysis in Phase Space, Annals of Mathematics Studies 122, Princeton University Press, Princeton, NJ, 1989.
[15] V. Havin, B. Jöricke, The Uncertainty Principle in Harmonic Analysis, Springer-Verlag, 1994.
[16] L. Hörmander, Oscillatory integrals and multipliers on $FL^p$, Ark. Mat. 11 (1973), 1-11.
[17] L. Hörmander, The Analysis of Linear Partial Differential Operators, volume 1, 2nd edition, Springer Verlag, 1990.
[18] N. Katz, I. Łaba, T. Tao, An improved bound on the Minkowski dimension of Besicovitch sets in $\mathbb{R}^3$, Annals of Math. 152 (2000), 383-446.
[19] N. Katz, T. Tao, New bounds for Kakeya sets, preprint, 2000.
[20] Y. Katznelson, An Introduction to Harmonic Analysis, 2nd edition, Dover Publications, Inc., New York, 1976.
[21] R. Kaufman, On the theorem of Jarník and Besicovitch, Acta Arithmetica 39 (1981), 265-267.
[22] I. Łaba, T. Tao, An improved bound for the Minkowski dimension of Besicovitch sets in medium dimension, Geom. Funct. Anal. 11 (2001), 773-806.
[23] J. E. Marsden, Elementary Classical Analysis, W. H. Freeman and Co., San Francisco, 1974.
[24] P. Mattila, Spherical averages of Fourier transforms of measures with finite energy; dimension of intersections and distance sets, Mathematika 34 (1987), 207-228.
[25] P. Mattila, Geometry of sets and measures in Euclidean spaces, Cambridge University Press, 1995.
[26] W. Minicozzi, C. Sogge, Negative results for Nikodym maximal functions and related oscillatory integrals in curved space, Math. Res. Lett. 4 (1997), 221-237.
[27] W. Rudin, Functional Analysis, McGraw-Hill, Inc., New York, 1991.
[28] R. Salem, Algebraic Numbers and Fourier Analysis, D.C. Heath and Co., Boston, Mass., 1963.
[29] C. Sogge, Fourier integrals in Classical Analysis, Cambridge University Press, Cambridge, 1993.
[30] C. Sogge, Concerning Nikodym-type sets in 3-dimensional curved space, Journal Amer. Math. Soc., 1999.
[31] J. Solymosi, C. Tóth, Distinct distances in the plane, Discrete Comput. Geometry 25 (2001), 629-634.
[32] E. M. Stein, in Beijing Lectures in Harmonic Analysis, edited by E. M. Stein.
[33] E. M. Stein, Harmonic Analysis, Princeton University Press, 1993.
[34] E. M. Stein, G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, Princeton, N.J., 1990.
[35] P. A. Tomas, A restriction theorem for the Fourier transform, Bull. Amer. Math. Soc. 81 (1975), 477-478.
[36] A. Varchenko, Newton polyhedra and estimations of oscillatory integrals, Funct. Anal. Appl. 18 (1976), 175-196.
[37] T. Wolff, Decay of circular means of Fourier transforms of measures, International Mathematics Research Notices 10 (1999), 547-567.
[38] T. Wolff, An improved bound for Kakeya type maximal functions, Revista Mat. Iberoamericana 11 (1995), 651-674.
[39] T. Wolff, Recent work connected with the Kakeya problem, in Prospects in Mathematics, H. Rossi, ed., Amer. Math. Soc., Providence, R.I. (1999), 129-162.
[40] A. Zygmund, Trigonometric Series, Cambridge University Press, Cambridge, 1968.
