A Short Introduction To Distribution Theory: 1 The Classical Fourier Integral
Sven Nordebo
School of Computer Science, Physics and Mathematics
Linnaeus University
September 8, 2010
Consider the classical Fourier transform pair

$$X(\omega) = \mathcal{F}\{x(t)\} = \int_{-\infty}^{\infty} x(t)\, e^{-i\omega t}\, dt$$

$$x(t) = \mathcal{F}^{-1}\{X(\omega)\} = \frac{1}{2\pi}\int_{-\infty}^{\infty} X(\omega)\, e^{i\omega t}\, d\omega, \qquad (1)$$
where the integrals are assumed to exist in the usual sense, i.e., as Riemann integrals or as Lebesgue integrals [3]. If, e.g., the function x(t) is absolutely integrable, i.e., if x(t) ∈ L¹ so that ∫|x(t)| dt < ∞ exists finitely, then the Fourier transform X(ω) = F{x(t)} exists in the sense of the integral in (1), and X(ω) is a continuous and bounded function, i.e., |X(ω)| ≤ A where A is a constant, see e.g., [3]. Similarly, if the function X(ω) ∈ L¹, then the inverse Fourier transform x(t) = F⁻¹{X(ω)} exists in the sense of the integral in (1), and it is a continuous and bounded function. Alternatively, if x(t) ∈ L², i.e., if ∫|x(t)|² dt < ∞ exists finitely, then both integrals in (1) are well defined and X(ω) ∈ L² (∫|X(ω)|² dω < ∞), see e.g., [3].
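As a quick numerical illustration (a sketch of ours, not part of the original text), the Fourier pair (1) can be checked for the Gaussian x(t) = e^(−t²) ∈ L¹ ∩ L², whose transform under this convention is known to be X(ω) = √π e^(−ω²/4); the grid sizes below are arbitrary choices.

```python
import numpy as np

# Sketch: check the Fourier pair (1) for x(t) = exp(-t^2) by Riemann sums.
# The exact transform is X(w) = sqrt(pi) * exp(-w^2 / 4).
t = np.linspace(-10.0, 10.0, 2001)
dt = t[1] - t[0]
w = np.linspace(-8.0, 8.0, 801)
dw = w[1] - w[0]

x = np.exp(-t**2)

# Forward transform: X(w) = int x(t) exp(-i w t) dt
X = np.exp(-1j * np.outer(w, t)) @ x * dt
print(np.max(np.abs(X - np.sqrt(np.pi) * np.exp(-w**2 / 4))))  # small

# Inverse transform: x(t) = (1/(2 pi)) int X(w) exp(i w t) dw
x_rec = np.real(np.exp(1j * np.outer(t, w)) @ X * dw / (2 * np.pi))
print(np.max(np.abs(x_rec - x)))  # small
```

The residual errors come only from grid truncation and the Riemann-sum approximation, both negligible here because the Gaussian decays rapidly.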
However, there are many important functions that are neither in L¹ nor in L², and for which the classical Fourier transform in the sense of (1) does not exist. Examples are the functions x(t) = 1, sin ω₀t, u(t), sgn(t), 1/t, δ(t), etc., some of which can be
regarded as infinite energy signals. Nevertheless, it is of great importance that we are
able to perform some kind of Fourier transformation for these functions as well. It
is for this purpose that generalized functions or distributions [4] are now introduced.
Before we introduce the distributions, however, we will first establish two fundamental properties of the classical Fourier transform (1), which will be used later in the definition of the distributions. Hence, suppose that the function x(t) is locally integrable (∫_E x(t) dt exists over any finite interval E), continuously differentiable (the derivative x′(t) exists and is continuous), and that the Fourier transform X(ω) exists in the sense of the integrals in (1). Suppose further that φ(t) is a suitable testing function, also locally integrable and continuously differentiable, such that Fφ(t) exists and φ(t) decays sufficiently fast so that

$$\lim_{|t|\to\infty} x(t)\varphi(t) = 0. \qquad (2)$$
Consider now the linear functional

$$\langle x(t), \varphi(t)\rangle = \int_{-\infty}^{\infty} x(t)\varphi(t)\, dt, \qquad (3)$$

which should be interpreted as a linear mapping from the space of testing functions φ(t) to the real (or complex) numbers.
Theorem: The linear functional (3) has the following symmetry properties with respect to the derivative and the Fourier transform, respectively:

$$\langle x'(t), \varphi(t)\rangle = -\langle x(t), \varphi'(t)\rangle, \qquad (4)$$

$$\langle \mathcal{F}x(t), \varphi(\omega)\rangle = \langle x(\omega), \mathcal{F}\varphi(t)\rangle. \qquad (5)$$

The proof of (4) follows from integration by parts:

$$\langle x'(t), \varphi(t)\rangle = \int_{-\infty}^{\infty} x'(t)\varphi(t)\, dt = \big[x(t)\varphi(t)\big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} x(t)\varphi'(t)\, dt = -\langle x(t), \varphi'(t)\rangle, \qquad (6)$$

where the boundary term vanishes by (2).
The proof of (5) follows by changing the order of integration:

$$\langle \mathcal{F}x(t), \varphi(\omega)\rangle = \int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} x(t)\, e^{-i\omega t}\, dt\right)\varphi(\omega)\, d\omega = \int_{-\infty}^{\infty} x(\omega)\left(\int_{-\infty}^{\infty} \varphi(t)\, e^{-i\omega t}\, dt\right) d\omega = \langle x(\omega), \mathcal{F}\varphi(t)\rangle, \qquad (7)$$

where the integration order has been interchanged and the variable names t and ω swapped in the final step.
2 Tempered distributions

The space S of testing functions of rapid descent [4] (also called the Schwartz class) consists of all functions φ(t) which are infinitely differentiable (infinitely smooth), i.e., all the derivatives φ⁽ᵏ⁾(t) exist and are continuous for any order k ≥ 0, and are such that, as |t| → ∞, φ(t) and all its derivatives φ⁽ᵏ⁾(t) decay faster than any power of 1/|t|. In other words, for any order k ≥ 0 and integer power m ≥ 0, there is a constant c_km such that |φ⁽ᵏ⁾(t)| ≤ c_km(1 + |t|)⁻ᵐ for all t.
The space S′ of tempered distributions [4] (also called distributions of slow growth) consists of all continuous linear functionals on the space S of testing functions of rapid descent. Hence, a tempered distribution x(t) is a rule that assigns a number ⟨x(t), φ(t)⟩ to each φ(t) in S, in such a way that the following conditions are fulfilled:
Linearity: If φ₁(t) and φ₂(t) are in S and if α is a number, then

$$\langle x(t), \varphi_1(t) + \varphi_2(t)\rangle = \langle x(t), \varphi_1(t)\rangle + \langle x(t), \varphi_2(t)\rangle \qquad (8)$$

$$\langle x(t), \alpha\varphi_1(t)\rangle = \alpha\langle x(t), \varphi_1(t)\rangle. \qquad (9)$$
Continuity: If the sequence φₙ(t) ∈ S converges to zero in S, i.e., if

$$\sup_{-\infty<t<\infty}\left|t^m \varphi_n^{(k)}(t)\right| \to 0 \quad \text{as } n\to\infty, \qquad (10)$$

then ⟨x(t), φₙ(t)⟩ → 0 as n → ∞, where the condition (10) is required to hold for all orders k ≥ 0 and integer powers m ≥ 0.
2.1 Examples

The following linear functionals define tempered distributions:
$$\langle 1, \varphi(t)\rangle = \int_{-\infty}^{\infty} \varphi(t)\, dt \qquad (11)$$

$$\langle \sin\omega_0 t, \varphi(t)\rangle = \int_{-\infty}^{\infty} \sin(\omega_0 t)\varphi(t)\, dt \qquad (12)$$

$$\langle \ln|t|, \varphi(t)\rangle = \int_{-\infty}^{\infty} \ln|t|\,\varphi(t)\, dt \qquad (13)$$

$$\langle u(t), \varphi(t)\rangle = \int_{-\infty}^{\infty} u(t)\varphi(t)\, dt = \int_0^{\infty} \varphi(t)\, dt \qquad (14)$$

$$\langle \operatorname{sgn}(t), \varphi(t)\rangle = \int_{-\infty}^{\infty} \operatorname{sgn}(t)\varphi(t)\, dt = \int_0^{\infty} \varphi(t)\, dt - \int_{-\infty}^{0} \varphi(t)\, dt \qquad (15)$$

$$\Big\langle \frac{1}{t}, \varphi(t)\Big\rangle = \lim_{\epsilon\to 0}\int_{|t|\ge\epsilon} \frac{\varphi(t)}{t}\, dt = P\!\int_{-\infty}^{\infty} \frac{\varphi(t)}{t}\, dt \qquad (16)$$

$$\langle \delta(t), \varphi(t)\rangle = \varphi(0), \qquad (17)$$
where P denotes the Cauchy principal value of the integral. The last example is the definition of the Dirac, or delta, distribution. It can be shown that the delta distribution δ(t) cannot be obtained from an ordinary integral as in (3), see e.g., [4]. Nevertheless, it is very common to retain the notation of an integral and write ⟨δ(t), φ(t)⟩ = ∫ δ(t)φ(t) dt = φ(0), even though it is not an integral in the classical
sense. In this case, we must remember that the exact meaning of the integral notation
is defined as above. Alternatively, and as we will see later, the integral notation could
also be used to denote the limit of a sequence of classical integrals.
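The defining idea, that a distribution is simply a rule assigning a number to each testing function, can be mirrored directly in code. A minimal sketch (the names `regular` and `delta` are ours, not from the text): a distribution becomes a Python callable taking a test function φ.

```python
import numpy as np

# Sketch: a distribution as a rule phi -> number.
# Regular distributions integrate x(t)*phi(t); delta simply evaluates phi(0).
t = np.linspace(-50.0, 50.0, 200001)
dt = t[1] - t[0]

def regular(x):
    """Distribution <x, phi> = int x(t) phi(t) dt (Riemann sum)."""
    return lambda phi: np.sum(x(t) * phi(t)) * dt

delta = lambda phi: phi(0.0)      # <delta, phi> = phi(0), no integral involved

phi = lambda s: np.exp(-s**2)     # a testing function of rapid descent
sgn = regular(np.sign)

print(delta(phi))                 # 1.0
print(sgn(phi))                   # ~ 0 (odd integrand, even test function)
```

Note that `delta` never touches the grid, which mirrors the statement above that ⟨δ, φ⟩ is not an integral in the classical sense.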
3 Operations on distributions

3.1 Equality of distributions
Two distributions x(t) and y(t) are said to be equal if ⟨x(t), φ(t)⟩ = ⟨y(t), φ(t)⟩ for all testing functions φ(t) ∈ S.
3.2 Limits of distributions
A sequence of distributions xₙ(t) is said to converge to the distribution x(t) if

$$\lim_{n\to\infty}\langle x_n(t), \varphi(t)\rangle = \langle x(t), \varphi(t)\rangle \qquad (18)$$

for all testing functions φ(t) ∈ S. In this case, we write lim_{n→∞} xₙ(t) = x(t). The same definition is made with an indexed set of distributions, where, e.g., lim_{ε→0} x_ε(t) = x(t) means that lim_{ε→0} ⟨x_ε(t), φ(t)⟩ = ⟨x(t), φ(t)⟩ for all testing functions φ(t) ∈ S.
Example: The distribution

$$x_\sigma(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-t^2/2\sigma^2} \qquad \text{(Gaussian pulse)}$$

has zero mean, variance σ² and integral ∫ x_σ(t) dt = 1, and hence

$$\lim_{\sigma\to 0}\langle x_\sigma(t), \varphi(t)\rangle = \lim_{\sigma\to 0}\int_{-\infty}^{\infty} x_\sigma(t)\varphi(t)\, dt = \varphi(0) = \langle\delta(t), \varphi(t)\rangle. \qquad (19)$$

It is therefore concluded that lim_{σ→0} x_σ(t) = δ(t).
Example: The rectangular pulse

$$x_T(t) = \frac{1}{T}\Big(u\big(t + \tfrac{T}{2}\big) - u\big(t - \tfrac{T}{2}\big)\Big)$$

has integral ∫ x_T(t) dt = 1, and it has the same property as above in that

$$\lim_{T\to 0}\langle x_T(t), \varphi(t)\rangle = \lim_{T\to 0}\frac{1}{T}\int_{-T/2}^{T/2}\varphi(t)\, dt = \varphi(0), \qquad (20)$$

and hence lim_{T→0} x_T(t) = δ(t). Obviously, we can construct many distribution sequences that converge to the delta distribution in this way.
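A numerical sketch of the limit (19): shrinking σ in the Gaussian pulse drives ⟨x_σ, φ⟩ toward φ(0). The test function below is an arbitrary choice of ours with φ(0) = 1.

```python
import numpy as np

# Sketch: <x_sigma, phi> -> phi(0) as sigma -> 0 for the Gaussian pulse.
t = np.linspace(-10.0, 10.0, 400001)
dt = t[1] - t[0]
phi = np.cos(t) * np.exp(-t**2)        # testing function, phi(0) = 1

def pairing(sigma):
    x = np.exp(-t**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    return np.sum(x * phi) * dt

for sigma in (1.0, 0.1, 0.01):
    print(sigma, pairing(sigma))       # approaches phi(0) = 1
```

The error behaves like O(σ²), consistent with a second-order Taylor expansion of φ around t = 0.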
3.3 Shifting, reflection and periodicity

The shifted distribution x(t − t₀) is defined by

$$\langle x(t - t_0), \varphi(t)\rangle = \langle x(t), \varphi(t + t_0)\rangle \qquad (21)$$

for all testing functions φ(t) ∈ S. Note that this property is shared by (3) by a simple substitution of variables.
The reflection x(−t) of a distribution x(t) is defined by

$$\langle x(-t), \varphi(t)\rangle = \langle x(t), \varphi(-t)\rangle \qquad (22)$$

for all testing functions φ(t) ∈ S. Note that this property is also shared by (3) by a simple substitution of variables. A distribution is said to be even if x(−t) = x(t), and odd if x(−t) = −x(t).

A periodic distribution x(t) = x(t + T) is defined by the property that

$$\langle x(t), \varphi(t)\rangle = \langle x(t + T), \varphi(t)\rangle = \langle x(t), \varphi(t - T)\rangle \qquad (23)$$

for all testing functions φ(t) ∈ S.
Example: The periodic pulse train

$$x(t) = \sum_{m=-\infty}^{\infty} \delta(t - mT) \qquad (24)$$

is a tempered distribution with

$$\Big\langle \sum_{m=-\infty}^{\infty}\delta(t - mT), \varphi(t)\Big\rangle = \sum_{m=-\infty}^{\infty}\langle\delta(t - mT), \varphi(t)\rangle = \sum_{m=-\infty}^{\infty}\langle\delta(t), \varphi(t + mT)\rangle = \sum_{m=-\infty}^{\infty}\varphi(mT). \qquad (25)$$
3.4 Multiplication of distributions
The multiplication of two distributions x(t) and y(t) is in general not defined. However, if (at least) one of the distributions is also a function g(t) of slow growth, then the product g(t)x(t) is well defined. A function g(t) of slow growth is infinitely differentiable (infinitely smooth), i.e., the derivatives g⁽ᵏ⁾(t) exist for all orders k ≥ 0, and they grow slower than some polynomial, i.e., |g⁽ᵏ⁾(t)| ≤ C_k(1 + |t|)^{N_k} for all t, where the constant C_k and the polynomial order N_k depend on k. It can be shown that the product of a testing function φ(t) ∈ S and a function g(t) of slow growth is also a testing function, i.e., g(t)φ(t) ∈ S, see e.g., [4]. Hence, the product of a distribution x(t) and a function g(t) of slow growth (e.g., a polynomial) is defined as the distribution g(t)x(t) by the following property:
$$\langle g(t)x(t), \varphi(t)\rangle = \langle x(t), g(t)\varphi(t)\rangle \qquad (26)$$

for all testing functions φ(t) ∈ S.

Example: For the delta distribution,

$$\langle g(t)\delta(t), \varphi(t)\rangle = \langle \delta(t), g(t)\varphi(t)\rangle = g(0)\varphi(0) = \langle g(0)\delta(t), \varphi(t)\rangle \qquad (27)$$

for all testing functions φ(t) ∈ S. Hence g(t)δ(t) = g(0)δ(t), by the definition of equality of distributions.
Example: The product of the delta distribution δ(t) with itself, i.e., δ(t)δ(t), cannot be defined as a distribution. Hence, the notation δ²(t) has no meaningful interpretation.
3.5 Derivatives of distributions
The derivative x′(t) of a distribution x(t) ∈ S′ is defined by

$$\langle x'(t), \varphi(t)\rangle = -\langle x(t), \varphi'(t)\rangle \qquad (28)$$

for all testing functions φ(t) ∈ S. Note that the right-hand side of (28) is well defined as a distribution, and that the definition is chosen in accordance with the classical property (4). By induction, it follows from the definition (28) that all the derivatives of x(t) exist as distributions, i.e., x⁽ᵏ⁾(t) ∈ S′ for all integer orders k ≥ 1. In particular, the derivatives x⁽ᵏ⁾(t) are defined by

$$\langle x^{(k)}(t), \varphi(t)\rangle = (-1)^k \langle x(t), \varphi^{(k)}(t)\rangle \qquad (29)$$

for all testing functions φ(t) ∈ S.

Example: The derivatives of the delta distribution are given by

$$\langle \delta'(t), \varphi(t)\rangle = -\langle\delta(t), \varphi'(t)\rangle = -\varphi'(0) \qquad (30)$$

and

$$\langle \delta^{(k)}(t), \varphi(t)\rangle = (-1)^k\langle\delta(t), \varphi^{(k)}(t)\rangle = (-1)^k\varphi^{(k)}(0). \qquad (31)$$
Example: The distributional derivatives of the unit step function u(t) and the sign function sgn(t) are given by

$$u'(t) = \delta(t) \qquad (32)$$

$$\operatorname{sgn}'(t) = 2\delta(t). \qquad (33)$$

To see this, note that

$$\langle u'(t), \varphi(t)\rangle = -\langle u(t), \varphi'(t)\rangle = -\int_0^{\infty}\varphi'(t)\, dt = \varphi(0) = \langle\delta(t), \varphi(t)\rangle, \qquad (34)$$

which shows that u′(t) = δ(t). The derivative of the sign function follows immediately by writing sgn(t) = 2u(t) − 1.
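The computation (34) can be checked numerically: −∫₀^∞ φ′(t) dt equals φ(0) for any test function that vanishes at infinity. The φ below is our arbitrary choice with φ(0) = 1.

```python
import numpy as np

# Sketch: <u'(t), phi> = -<u(t), phi'> = -int_0^inf phi'(t) dt = phi(0).
t = np.linspace(0.0, 20.0, 200001)
dt = t[1] - t[0]
phi = lambda s: np.exp(-s**2) * (1 + s)              # testing function, phi(0) = 1
dphi = lambda s: np.exp(-s**2) * (1 - 2*s*(1 + s))   # its derivative

y = dphi(t)
lhs = -np.sum((y[:-1] + y[1:]) / 2) * dt             # trapezoid rule for -int_0^inf phi'(t) dt
print(lhs)                                           # ~ phi(0) = 1
```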
3.6 Fourier transform of distributions

It can be shown that if φ(t) is a testing function of rapid descent, then the classical Fourier transform Fφ(t) defined in (1) exists and generates a new function Φ(ω) = Fφ(t) which is also a testing function of rapid descent, see e.g., [4]. Hence, if φ(t) ∈ S then Fφ(t) ∈ S. Based on this property, the Fourier transform of a distribution x(t) ∈ S′ is a distribution Fx(t) ∈ S′ defined by

$$\langle \mathcal{F}x(t), \varphi(\omega)\rangle = \langle x(\omega), \mathcal{F}\varphi(t)\rangle \qquad (35)$$

for all testing functions φ(t) ∈ S. Note that the right-hand side of (35) is well defined as a distribution, and that the definition is chosen in accordance with the classical property (5).
Example: F1 = 2πδ(ω). To prove this, let Φ(ω) = Fφ(t) where φ(t) ∈ S. Then

$$\langle \mathcal{F}1, \varphi(\omega)\rangle = \langle 1, \mathcal{F}\varphi(t)\rangle = \int_{-\infty}^{\infty}\Phi(\omega)\, d\omega = 2\pi\varphi(0) = \langle 2\pi\delta(\omega), \varphi(\omega)\rangle, \qquad (36)$$

where the inversion formula in (1) has been used at t = 0, which shows that F1 = 2πδ(ω).

Example: Fδ(t) = 1. To prove this, let Φ(ω) = Fφ(t) where φ(t) ∈ S. Then

$$\langle \mathcal{F}\delta(t), \varphi(\omega)\rangle = \langle \delta(\omega), \mathcal{F}\varphi(t)\rangle = \Phi(0) = \int_{-\infty}^{\infty}\varphi(t)\, dt = \langle 1, \varphi(\omega)\rangle, \qquad (37)$$

which shows that Fδ(t) = 1.
3.7 Fourier transform of a derivative

Let x(t) be a distribution in S′ and X(ω) = Fx(t). The Fourier transform of the derivative x′(t) is given by

$$\mathcal{F}x'(t) = i\omega X(\omega), \qquad (38)$$

and, more generally,

$$\mathcal{F}x^{(k)}(t) = (i\omega)^k X(\omega), \qquad (39)$$

where k ≥ 1.

Proof: The proof is based on the corresponding properties of the testing functions, together with the definition of the Fourier transform. Hence, let φ(t) be an arbitrary testing function in S. Then

$$\langle \mathcal{F}x'(t), \varphi(\omega)\rangle = \langle x'(\omega), \mathcal{F}\varphi(t)\rangle = -\Big\langle x(\omega), \frac{d}{d\omega}\mathcal{F}\varphi(t)\Big\rangle = \langle x(\omega), \mathcal{F}\{it\varphi(t)\}\rangle = \langle \mathcal{F}x(t), i\omega\varphi(\omega)\rangle = \langle i\omega X(\omega), \varphi(\omega)\rangle, \qquad (40)$$

where (d/dω)Fφ(t) = F{−itφ(t)} has been used. The general case (39) follows by induction.
3.8 Fourier transform of tx(t)

Let x(t) be a distribution in S′ and X(ω) = Fx(t). The Fourier transform of tx(t) is given by

$$\mathcal{F}\{tx(t)\} = i\frac{d}{d\omega}X(\omega), \qquad (41)$$

and, more generally,

$$\mathcal{F}\{t^k x(t)\} = i^k\frac{d^k}{d\omega^k}X(\omega), \qquad (42)$$

where k ≥ 1.

Proof: Let φ(t) be an arbitrary testing function in S and Φ(ω) = Fφ(t). Then

$$\langle \mathcal{F}\{tx(t)\}, \varphi(\omega)\rangle = \langle \omega x(\omega), \mathcal{F}\varphi(t)\rangle = \langle x(\omega), \omega\Phi(\omega)\rangle = -i\langle x(\omega), i\omega\Phi(\omega)\rangle = -i\langle x(\omega), \mathcal{F}\varphi'(t)\rangle = -i\langle \mathcal{F}x(t), \varphi'(\omega)\rangle = -i\Big\langle X(\omega), \frac{d}{d\omega}\varphi(\omega)\Big\rangle = \Big\langle i\frac{d}{d\omega}X(\omega), \varphi(\omega)\Big\rangle, \qquad (43)$$

where Fφ′(t) = iωΦ(ω) and the definition (28) of the distributional derivative have been used, which shows that F{tx(t)} = i(d/dω)X(ω). The general case (42) follows by induction.
3.9 Repeated Fourier transformation

Let x(t) be a distribution in S′. Then

$$\mathcal{F}\mathcal{F}x(t) = 2\pi x(-t). \qquad (44)$$

It follows from this property that the Fourier transform F is an invertible operator, i.e., the inverse Fourier transform F⁻¹ exists.

Proof: Let φ(t) be an arbitrary testing function in S. Then FFφ(t) = 2πφ(−t). Hence,

$$\langle \mathcal{F}\mathcal{F}x(t), \varphi(t)\rangle = \langle \mathcal{F}x(\omega), \mathcal{F}\varphi(\omega)\rangle = \langle x(t), \mathcal{F}\mathcal{F}\varphi(t)\rangle = \langle x(t), 2\pi\varphi(-t)\rangle = \langle 2\pi x(-t), \varphi(t)\rangle, \qquad (45)$$

which shows that FFx(t) = 2πx(−t).
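The property (44) has an exact discrete analogue that is easy to verify (a side observation of ours, not from the text): applying the DFT twice to a length-N sequence returns N times the index-reversed sequence, with 2π replaced by N and t → −t replaced by n → (−n) mod N.

```python
import numpy as np

# Sketch: discrete analogue of FFx(t) = 2*pi*x(-t).
# fft(fft(x))[n] == N * x[(-n) mod N], exactly up to rounding.
rng = np.random.default_rng(0)
N = 64
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

double = np.fft.fft(np.fft.fft(x))
reflected = N * x[(-np.arange(N)) % N]
print(np.max(np.abs(double - reflected)))   # ~ 0 (rounding only)
```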
3.10 The derivative of an even distribution

If x(t) is an even distribution, then its derivative x′(t) is an odd distribution, i.e.,

$$x'(-t) = -x'(t). \qquad (46)$$

Proof: Suppose that x(t) is an even distribution, i.e., x(−t) = x(t), and let φ(t) be an arbitrary testing function in S. Then

$$\langle x'(-t), \varphi(t)\rangle = \langle x'(t), \varphi(-t)\rangle = -\Big\langle x(t), \frac{d}{dt}\varphi(-t)\Big\rangle = \langle x(t), \varphi'(-t)\rangle = \langle x(-t), \varphi'(t)\rangle = \langle x(t), \varphi'(t)\rangle = -\langle x'(t), \varphi(t)\rangle, \qquad (47)$$

which shows that x′(−t) = −x′(t).
3.11 Some important Fourier transforms

Based on the distribution theory introduced above, we are now ready to establish the following important Fourier transforms:

$$\mathcal{F}1 = 2\pi\delta(\omega) \qquad (48)$$

$$\mathcal{F}\delta(t) = 1 \qquad (49)$$

$$\mathcal{F}u(t) = \frac{1}{i\omega} + \pi\delta(\omega) \qquad (50)$$

$$\mathcal{F}\operatorname{sgn}(t) = \frac{2}{i\omega} \qquad (51)$$

$$\mathcal{F}\frac{1}{t} = -i\pi\operatorname{sgn}(\omega). \qquad (52)$$

The first two transforms were established in section 3.6. To establish Fsgn(t), take the Fourier transform of sgn′(t) = 2δ(t). Using the derivative rule (39) with k = 1 together with Fδ(t) = 1, we obtain

$$i\omega\,\mathcal{F}\operatorname{sgn}(t) = 2, \qquad (53)$$

which has the general distributional solution

$$\mathcal{F}\operatorname{sgn}(t) = \frac{2}{i\omega} + A\delta(\omega). \qquad (54)$$

Since sgn(t) is an odd distribution, its Fourier transform must also be odd, and hence A = 0. The transform of the unit step then follows by writing u(t) = (1 + sgn(t))/2 and using F1 = 2πδ(ω). Finally, applying the Fourier transform once more to Fsgn(t) and using (44) yields

$$\mathcal{F}\Big\{\frac{2}{it}\Big\} = \mathcal{F}\mathcal{F}\operatorname{sgn}(t) = 2\pi\operatorname{sgn}(-\omega) = -2\pi\operatorname{sgn}(\omega), \qquad (55)$$

which shows that F{1/t} = −iπ sgn(ω).
3.12 Time shift and modulation

Time shift and modulation are important operations. Let x(t) be a distribution and X(ω) = Fx(t). Then

$$\mathcal{F}\{e^{i\omega_0 t}x(t)\} = X(\omega - \omega_0) \quad \text{(modulation)} \qquad (56)$$

$$\mathcal{F}\{x(t - t_0)\} = e^{-i\omega t_0}X(\omega) \quad \text{(time shift)}. \qquad (57)$$

Proof: The proof is based on the corresponding time shift and modulation properties of the testing functions, together with the definition of the Fourier transform. Hence, let φ(t) be an arbitrary testing function in S and Φ(ω) = Fφ(t). Then

$$\langle \mathcal{F}\{e^{i\omega_0 t}x(t)\}, \varphi(\omega)\rangle = \langle e^{i\omega_0\omega}x(\omega), \mathcal{F}\varphi(t)\rangle = \langle x(\omega), e^{i\omega_0\omega}\Phi(\omega)\rangle = \langle x(\omega), \mathcal{F}\{\varphi(t + \omega_0)\}\rangle = \langle \mathcal{F}x(t), \varphi(\omega + \omega_0)\rangle = \langle X(\omega), \varphi(\omega + \omega_0)\rangle = \langle X(\omega - \omega_0), \varphi(\omega)\rangle, \qquad (58)$$

which shows that F{e^{iω₀t}x(t)} = X(ω − ω₀).

Similarly,

$$\langle \mathcal{F}\{x(t - t_0)\}, \varphi(\omega)\rangle = \langle x(\omega - t_0), \mathcal{F}\varphi(t)\rangle = \langle x(\omega - t_0), \Phi(\omega)\rangle = \langle x(\omega), \Phi(\omega + t_0)\rangle = \langle x(\omega), \mathcal{F}\{e^{-it_0 t}\varphi(t)\}\rangle = \langle \mathcal{F}x(t), e^{-it_0\omega}\varphi(\omega)\rangle = \langle e^{-it_0\omega}X(\omega), \varphi(\omega)\rangle, \qquad (59)$$

which shows that F{x(t − t₀)} = e^{−iωt₀}X(ω).
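Both rules (56) and (57) have exact discrete counterparts in the DFT, which makes for a quick sanity check (a sketch of ours): circularly shifting a sequence multiplies its DFT by a phase ramp, and multiplying by a complex exponential circularly shifts the DFT.

```python
import numpy as np

# Sketch: DFT analogues of the time-shift and modulation rules (56)-(57).
rng = np.random.default_rng(1)
N = 128
n = np.arange(N)
x = rng.standard_normal(N)

n0, k0 = 5, 7
X = np.fft.fft(x)

# Time shift: DFT of x[(n - n0) mod N] equals exp(-2*pi*i*k*n0/N) * X[k].
lhs = np.fft.fft(np.roll(x, n0))
rhs = np.exp(-2j * np.pi * n * n0 / N) * X
print(np.max(np.abs(lhs - rhs)))   # ~ 0

# Modulation: DFT of exp(2*pi*i*k0*n/N) * x[n] equals X[(k - k0) mod N].
lhs = np.fft.fft(np.exp(2j * np.pi * k0 * n / N) * x)
rhs = np.roll(X, k0)
print(np.max(np.abs(lhs - rhs)))   # ~ 0
```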
3.13 Convolution of distributions
The convolution of two distributions x(t) and y(t) is in general not defined. However, if (at least) one of the distributions has a Fourier transform which is also a function of slow growth, then the product Fx(t)·Fy(t) is well defined as a distribution. In this case, the convolution x(t) ∗ y(t) is a distribution defined by F{x(t) ∗ y(t)} = Fx(t)·Fy(t), or

$$x(t) * y(t) = \mathcal{F}^{-1}\{X(\omega)Y(\omega)\}, \qquad (60)$$

where X(ω) = Fx(t) and Y(ω) = Fy(t). Note that the definition (60) is in agreement with the corresponding properties of the classical Fourier transform (1).

In general, if both distributions x(t) and y(t) have Fourier transforms X(ω) and Y(ω) which are also ordinary functions, and which can be multiplied together to yield a well defined distribution X(ω)Y(ω), then the convolution x(t) ∗ y(t) is well defined by (60).
Example: The Fourier transform of the delta distribution is Fδ(t) = 1, which is a function of slow growth. Hence, the convolution of a distribution x(t) with the delta distribution is given by

$$x(t) * \delta(t) = \mathcal{F}^{-1}\{\mathcal{F}x(t)\,\mathcal{F}\delta(t)\} = \mathcal{F}^{-1}\{\mathcal{F}x(t)\} = x(t). \qquad (61)$$

Hence, the delta distribution is a unit element under convolution. Note, e.g., that δ(t) ∗ δ(t) = δ(t).
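On a discrete grid the identity (61) can be mimicked (our illustration): a narrow unit-area pulse, one sample of height 1/Δt, acts as the unit element of discrete convolution.

```python
import numpy as np

# Sketch: x * delta = x, with delta approximated by a one-sample pulse of area 1.
t = np.linspace(-5.0, 5.0, 1001)
dt = t[1] - t[0]
x = np.exp(-t**2) * np.cos(3 * t)

d = np.zeros_like(t)
d[len(t) // 2] = 1.0 / dt                  # unit-area "delta" at t = 0

y = np.convolve(x, d, mode="same") * dt    # discrete version of (x * delta)(t)
print(np.max(np.abs(y - x)))               # ~ 0
```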
3.14 Fourier transform of a limit of distributions

If a sequence of distributions xₙ(t) converges to x(t) in the distributional sense, then

$$\lim_{n\to\infty}\mathcal{F}x_n(t) = \mathcal{F}x(t). \qquad (62)$$

The same property holds for an indexed set of distributions. Hence, if lim_{a→0} x_a(t) = x(t), then lim_{a→0} Fx_a(t) = Fx(t). This property shows that the Fourier transform F is a continuous mapping from S′ to S′.

Proof: Let φ(t) be an arbitrary testing function in S. Then

$$\lim_{n\to\infty}\langle \mathcal{F}x_n(t), \varphi(\omega)\rangle = \lim_{n\to\infty}\langle x_n(\omega), \mathcal{F}\varphi(t)\rangle = \langle x(\omega), \mathcal{F}\varphi(t)\rangle = \langle \mathcal{F}x(t), \varphi(\omega)\rangle, \qquad (63)$$

which shows that lim Fxₙ(t) = Fx(t).
Example: Define the distribution x_a(t) by the ordinary function

$$x_a(t) = e^{-at}u(t) - e^{at}u(-t), \qquad (64)$$

where a > 0. The distributional Fourier transform X_a(ω) = Fx_a(t) exists also as a classical Fourier integral, and is given by

$$X_a(\omega) = \frac{-2i\omega}{(a - i\omega)(a + i\omega)}. \qquad (65)$$

Since lim_{a→0+} x_a(t) = sgn(t), it is concluded that

$$\lim_{a\to 0^+}\mathcal{F}x_a(t) = \lim_{a\to 0^+}\frac{-2i\omega}{(a - i\omega)(a + i\omega)} = \mathcal{F}\operatorname{sgn}(t) = \frac{2}{i\omega}, \qquad (66)$$

where the limits should be interpreted in the distributional sense, and the distribution 1/(iω) corresponds to a principal value integral.
Example: Define the distribution x_a(t) by the ordinary function

$$x_a(t) = e^{-at}u(t), \qquad (67)$$

where a > 0. The distributional Fourier transform X_a(ω) = Fx_a(t) exists also as a classical Fourier integral, and is given by

$$X_a(\omega) = \frac{1}{a + i\omega}. \qquad (68)$$

Since lim_{a→0+} x_a(t) = u(t), it is concluded that

$$\lim_{a\to 0^+}\mathcal{F}x_a(t) = \lim_{a\to 0^+}\frac{1}{a + i\omega} = \mathcal{F}u(t) = \frac{1}{i\omega} + \pi\delta(\omega), \qquad (69)$$

where the limits should be interpreted in the distributional sense, and the distribution 1/(iω) corresponds to a principal value integral.
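The classical integral behind (68) is easy to confirm numerically (parameter values below are arbitrary choices of ours):

```python
import numpy as np

# Sketch: int_0^inf exp(-a t) exp(-i w t) dt == 1/(a + i w), checked by a Riemann sum.
a = 2.0
t = np.linspace(0.0, 40.0, 400001)
dt = t[1] - t[0]
w = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

x = np.exp(-a * t)
X = np.exp(-1j * np.outer(w, t)) @ x * dt     # Riemann approximation of the integral
print(np.max(np.abs(X - 1.0 / (a + 1j * w)))) # small
```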
4 The Laplace transform

The classical (two-sided) Laplace transform of a function x(t) is defined by

$$X(s) = \int_{-\infty}^{\infty} x(t)\, e^{-st}\, dt, \qquad (70)$$

where the region of convergence (ROC) is the (open) region in the complex s-plane where the integral converges absolutely. Hence, if s = σ + iω ∈ ROC, then

$$\int_{-\infty}^{\infty}|x(t)e^{-st}|\, dt = \int_{-\infty}^{\infty}|x(t)|e^{-\sigma t}\, dt < \infty. \qquad (71)$$

In particular, if the imaginary axis s = iω belongs to the ROC, then the Fourier transform of x(t) exists in the classical sense of (1). In this case,

$$X(\omega) = \mathcal{F}x(t) = X(s)\big|_{s=i\omega}. \qquad (72)$$
Example: Let the function x(t) be given by

$$x(t) = \begin{cases} e^{-at}, & t > 0, \\ -e^{at}, & t < 0, \end{cases} \qquad (73)$$

where a > 0, and where the ROC is given by

$$-a < \operatorname{Re} s < a. \qquad (74)$$

The classical Laplace transform is given by

$$X(s) = \frac{1}{a + s} - \frac{1}{a - s} = \frac{-2s}{(a - s)(a + s)}, \qquad (75)$$

and since the imaginary axis s = iω belongs to the ROC, the Fourier transform of x(t) exists in the classical sense and is given by

$$X(\omega) = X(s)\big|_{s = i\omega} = \frac{-2i\omega}{(a - i\omega)(a + i\omega)}. \qquad (76)$$

Example: Let the function x(t) be given by

$$x(t) = e^{-at}u(t), \qquad (77)$$

where a > 0, with ROC given by Re s > −a. The classical Laplace transform is given by

$$X(s) = \frac{1}{a + s}, \qquad (78)$$

and since the imaginary axis belongs to the ROC, the Fourier transform of x(t) exists in the classical sense and is given by

$$X(\omega) = X(s)\big|_{s = i\omega} = \frac{1}{a + i\omega}. \qquad (79)$$
Example: Let the function x(t) be given by

$$x(t) = \operatorname{sgn}(t). \qquad (80)$$

The classical Laplace transform does not exist in this case. To see this, let s = σ + iω, and consider the integral

$$\int_{-\infty}^{\infty}|x(t)e^{-st}|\, dt = \int_{-\infty}^{\infty}|\operatorname{sgn}(t)|e^{-\sigma t}\, dt = \int_{-\infty}^{\infty}e^{-\sigma t}\, dt = \infty, \qquad (81)$$

which diverges for every choice of σ, so that the ROC is empty.

Example: Let the function x(t) be given by

$$x(t) = u(t). \qquad (82)$$

The classical Laplace transform is given by

$$X(s) = \frac{1}{s}, \qquad (83)$$

with ROC given by Re s > 0. Since the imaginary axis s = iω does not belong to the ROC, the Fourier transform of u(t) cannot be obtained simply by substituting s = iω:

$$\mathcal{F}u(t) = \frac{1}{i\omega} + \pi\delta(\omega) \neq X(s)\big|_{s=i\omega} = \frac{1}{i\omega}. \qquad (84)$$

However, for σ > 0 the classical Fourier transform of e^{−σt}u(t) exists and is given by

$$\mathcal{F}\{e^{-\sigma t}u(t)\} = \frac{1}{\sigma + i\omega} = \frac{1}{s}, \qquad (85)$$
where s = σ + iω, and where the Fourier integral exists in the classical sense of (1) for σ > 0. Since lim_{σ→0+} e^{−σt}u(t) = u(t) in the distributional sense, the following relation between the classical Laplace transform and the distributional Fourier transform is valid:

$$\lim_{\sigma\to 0^+}X(s)\big|_{s=\sigma+i\omega} = \lim_{\sigma\to 0^+}\mathcal{F}\{e^{-\sigma t}u(t)\} = \mathcal{F}u(t) = \frac{1}{i\omega} + \pi\delta(\omega) = X(\omega), \qquad (86)$$

where the limits should be interpreted in the distributional sense, and the distribution 1/(iω) corresponds to a principal value integral.
5 The Hilbert transform

Hilbert transform relationships have many important applications in physics, telecommunications, etc., see e.g., [1]. Here, we summarize briefly some important properties.
5.1 Hilbert transform in the time domain

Let X(ω) = Fx(t) where x(t) is a distribution. When the product −i sgn(ω)X(ω) is well defined as a distribution, it defines the Hilbert transform in the time domain by

$$\frac{1}{\pi t} * x(t) = \mathcal{F}^{-1}\{-i\operatorname{sgn}(\omega)X(\omega)\}. \qquad (87)$$

In telecommunication applications, the Hilbert transform has the important property of shifting the phase of a signal by π/2 radians without changing its envelope. This is a property that can be used, e.g., to obtain the quadrature components of a signal in a telecommunication system, see e.g., [2].
Example: Let ω₀ be the carrier frequency, and let φ(t) be an ordinary function with band-limited spectrum Φ(ω) = Fφ(t) where Φ(ω) = 0 for |ω| > B, and where B < ω₀. The Hilbert transform has the following phase-shifting properties:

$$\frac{1}{\pi t} * \cos\omega_0 t = \sin\omega_0 t \qquad (88)$$

$$\frac{1}{\pi t} * \big(\varphi(t)\cos\omega_0 t\big) = \varphi(t)\sin\omega_0 t. \qquad (89)$$
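The phase-shift property (88) can be verified on a periodic grid with an FFT-based Hilbert transformer, i.e., the multiplier −i sgn(ω) of (87) applied to the DFT frequencies. This is a standard numerical trick, not part of the text.

```python
import numpy as np

# Sketch: FFT implementation of F^-1{-i sgn(w) X(w)} on a periodic grid,
# checking that the Hilbert transform maps cos(w0 t) to sin(w0 t).
N = 256
t = np.arange(N) * (2 * np.pi / N)     # one period
x = np.cos(5 * t)                      # w0 = 5 cycles per period

freqs = np.fft.fftfreq(N)              # signed DFT frequencies
y = np.real(np.fft.ifft(-1j * np.sign(freqs) * np.fft.fft(x)))
print(np.max(np.abs(y - np.sin(5 * t))))   # ~ 0
```

Note that `np.sign(0) = 0` removes the DC component, consistent with sgn(ω) vanishing at ω = 0.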
5.2 Hilbert transform in the frequency domain

Let X(ω) = Fx(t) where x(t) is a distribution. When the product sgn(t)x(t) is well defined as a distribution, it defines the Hilbert transform in the frequency domain by

$$\mathcal{F}\{\operatorname{sgn}(t)x(t)\} = \frac{1}{2\pi}\,\frac{2}{i\omega} * X(\omega) \qquad (92)$$

or, equivalently,

$$\frac{1}{\pi\omega} * X(\omega) = \mathcal{F}\{i\operatorname{sgn}(t)x(t)\}. \qquad (93)$$
5.3 Hilbert transform relations for causal distributions

Any distribution x(t) can be decomposed into its even and odd parts as

$$x(t) = x_e(t) + x_o(t), \qquad (94)$$

where

$$x_e(t) = \frac{x(t) + x(-t)}{2} \qquad (95)$$

$$x_o(t) = \frac{x(t) - x(-t)}{2}. \qquad (96)$$
If X(ω) = Fx(t) = Re X(ω) + i Im X(ω) and if x(t) is real, it can be shown that Re X(ω) is even and Im X(ω) is odd. Hence, for real distributions x(t) the following relationships hold:

$$\operatorname{Re}X(\omega) = \mathcal{F}x_e(t) \qquad (97)$$

$$i\operatorname{Im}X(\omega) = \mathcal{F}x_o(t). \qquad (98)$$

Suppose now that x(t) is causal with x(t) = 0 for t < 0. It is then readily verified that

$$\operatorname{sgn}(t)x(t) = x(t) \qquad (99)$$

$$\operatorname{sgn}(t)x(-t) = -x(-t), \qquad (100)$$

and hence, from (95) and (96),

$$x_e(t) = \operatorname{sgn}(t)x_o(t) \qquad (101)$$

$$x_o(t) = \operatorname{sgn}(t)x_e(t), \qquad (102)$$
where it has been assumed that the product sgn(t)x(t) is well defined as a distribution.
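Relations (101) and (102) are easy to confirm pointwise for a concrete causal function (x(t) = e^(−t)u(t) is our arbitrary choice):

```python
import numpy as np

# Sketch: for causal x(t), x_e = sgn(t) x_o and x_o = sgn(t) x_e pointwise.
t = np.linspace(-10.0, 10.0, 2001)
x = np.where(t > 0, np.exp(-t), 0.0)          # causal: x(t) = 0 for t < 0

xr = x[::-1]                                   # x(-t) on the symmetric grid
xe = (x + xr) / 2
xo = (x - xr) / 2

print(np.max(np.abs(xe - np.sign(t) * xo)))    # ~ 0
print(np.max(np.abs(xo - np.sign(t) * xe)))    # ~ 0
```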
A situation of practical interest is when x(t) contains a delta function Cδ(t) supported at the origin t = 0. In this case, the relations (101) and (102) above must be modified as follows:

$$x_e(t) = \operatorname{sgn}(t)x_o(t) + C\delta(t) \qquad (103)$$

$$x_o(t) = \operatorname{sgn}(t)\big(x_e(t) - C\delta(t)\big), \qquad (104)$$
where the distribution Cδ(t) is associated with the even part since δ(t) is even.
By taking the Fourier transform of (101) and (102) (assuming that C = 0), the following Hilbert transform relationships are now obtained:

$$\operatorname{Re}X(\omega) = \frac{1}{2\pi}\,\frac{2}{i\omega} * i\operatorname{Im}X(\omega) \qquad (105)$$

$$i\operatorname{Im}X(\omega) = \frac{1}{2\pi}\,\frac{2}{i\omega} * \operatorname{Re}X(\omega) \qquad (106)$$
or

$$\operatorname{Re}X(\omega) = \frac{1}{\pi}\,P\!\int_{-\infty}^{\infty}\frac{\operatorname{Im}X(\omega')}{\omega - \omega'}\, d\omega' \qquad (107)$$

$$\operatorname{Im}X(\omega) = -\frac{1}{\pi}\,P\!\int_{-\infty}^{\infty}\frac{\operatorname{Re}X(\omega')}{\omega - \omega'}\, d\omega', \qquad (108)$$

where the integrals (when they exist) should be interpreted as Cauchy principal values.
In conclusion, for a causal linear system where x(t) = 0 for t < 0, the real part Re X(ω) of the frequency function X(ω) is given explicitly by the Hilbert transform of its imaginary part Im X(ω), and vice versa.
References
[1] F. W. King. Hilbert transforms vol. III. Cambridge University Press, 2009.
[2] J. G. Proakis. Digital Communications. McGraw-Hill, third edition, 1995.
[3] W. Rudin. Real and Complex Analysis. McGraw-Hill, New York, 1987.
[4] A. H. Zemanian. Distribution theory and transform analysis. Dover Publications,
New York, 1965.