
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 36, NO. 6, NOVEMBER 1990

Substituting (19) and (16) into (15), and finally using N = σ2^R (so that N² = σ²2^{2R}), we have

$$ I(X_N; Y_N) \ge C - \frac{1}{2}\log\frac{\pi e}{6}, $$

which is (7) and Part b).

A Binary Analog to the Entropy-Power Inequality

SHLOMO SHAMAI (SHITZ), SENIOR MEMBER, IEEE, AND AARON D. WYNER, FELLOW, IEEE

Manuscript received September 13, 1989; revised March 9, 1990.
S. Shamai (Shitz) is with the Department of Electrical Engineering, Technion–Israel Institute of Technology, Technion City, 32000 Haifa, Israel. He was a visitor at AT&T Bell Laboratories while this work was researched.
A. D. Wyner is with AT&T Bell Laboratories, Rm. 2C-365, 600 Mountain Ave., Murray Hill, NJ 07974.
IEEE Log Number 9037505.

Abstract — Let {X_n}, {Y_n} be independent stationary binary random sequences with entropy H(X), H(Y), respectively. Let h(ξ) = −ξ log ξ − (1−ξ) log(1−ξ), 0 ≤ ξ ≤ 1/2, be the binary entropy function, and let α(X) = h⁻¹(H(X)), α(Y) = h⁻¹(H(Y)). Let Z_n = X_n ⊕ Y_n, where ⊕ denotes modulo-2 addition. The following analog of the "entropy-power" inequality provides a lower bound on H(Z), the entropy of {Z_n}:

$$ \alpha(Z) \ge \alpha(X) * \alpha(Y), $$

where α(Z) = h⁻¹(H(Z)) and α * β = α(1−β) + β(1−α). When {Y_n} are independent identically distributed (i.i.d.), this reduces to "Mrs. Gerber's Lemma" from Wyner and Ziv.

The "entropy-power inequality" [1, Sect. 7.10], [3, Sect. 22] is a useful lower bound on the differential entropy of the sum of two independent real-valued stationary random sequences. In this correspondence, we will establish an analogous inequality for the modulo-2 sum of two independent stationary binary random sequences. This bound is a generalization of "Mrs. Gerber's Lemma" [4, Theorem 1] and is proved in a similar way.

We begin by stating the entropy-power inequality. Let {X_n}, −∞ < n < ∞, be a stationary, real-valued random sequence, with the probability density function of X^(n) = (X_1, ..., X_n) given by p_n(x), x ∈ R^n, 1 ≤ n < ∞. The "nth order differential entropy" of the sequence is¹

$$ H_n(X) = -\frac{1}{n}\int_{\mathbb{R}^n} p_n(x)\log p_n(x)\,dx, \tag{1} $$

and the "nth order entropy-power" is

$$ \sigma_n(X) = \frac{1}{2\pi e}\, 2^{2H_n(X)}. \tag{2} $$

¹All logarithms in this correspondence are taken to the base 2.

Thus the entropy-power σ_n(X) is the variance of an i.i.d. Gaussian random sequence {X̃_n} for which

$$ H_n(\tilde{X}) = H_n(X). \tag{3} $$

The well-known entropy-power inequality states that for two independent stationary sequences {X_n} and {Y_n}, and Z_n = X_n + Y_n (−∞ < n < ∞),

$$ \sigma_n(Z) \ge \sigma_n(X) + \sigma_n(Y). \tag{4} $$

In particular, as n → ∞, (4) becomes

$$ \sigma(X+Y) \ge \sigma(X) + \sigma(Y), \tag{5} $$

where

$$ \sigma(X) = \lim_{n\to\infty} \sigma_n(X) = \frac{1}{2\pi e}\, 2^{2H(X)} $$

and H(X) = lim_{n→∞} H_n(X), etc. These limits exist whenever X^(n) and Y^(n) have densities and EX_n², EY_n² < ∞.

In this correspondence we establish an analog of (5) for binary random sequences. Let {X_n}, −∞ < n < ∞, be a stationary, binary-valued (i.e., 0/1) random sequence. Let the nth order entropy be defined by

$$ H_n(X) = \frac{1}{n} H(X_1,\ldots,X_n) = \frac{1}{n}\sum_{k=1}^{n} H(X_k \mid X_1, X_2,\ldots, X_{k-1}), $$

where H(·) here and in the remainder of the paper denotes discrete entropy and discrete conditional entropy. The entropy of {X_n} is defined by

$$ H(X) = \lim_{n\to\infty} H_n(X) = \lim_{n\to\infty} H(X_0 \mid X_{-n},\ldots,X_{-1}). \tag{6} $$

Corresponding to the stationary binary sequence {X_n}, let {X̃_n} be the independent identically distributed (i.i.d.) binary sequence with the same entropy, i.e.,

$$ H(X) = H(\tilde{X}) = H(\tilde{X}_n) = h(\alpha(X)), \tag{7} $$

where h(ξ) = −ξ log ξ − (1−ξ) log(1−ξ), 0 ≤ ξ ≤ 1, is the binary entropy function, and α(X) is taken to be in [0, 1/2]. In other words, α(X), defined by

$$ \alpha(X) = h^{-1}(H(X)), \qquad 0 \le \alpha(X) \le \tfrac{1}{2}, \tag{8} $$

is the "success" probability of a Bernoulli sequence with the same entropy as {X_n}. The quantity α(X) corresponding to the binary random sequence {X_n} is analogous to the entropy power of a continuous random sequence in that

$$ \alpha(X) \le \min\big[\Pr\{X_n = 1\},\ \Pr\{X_n = 0\}\big], \tag{9} $$

with equality if and only if {X_n} is i.i.d.
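To make (8) and (9) concrete, here is a small numerical sketch — ours, not the correspondence's — using a stationary binary Markov chain with illustrative transition probabilities; for such a non-i.i.d. sequence the inequality in (9) is strict.

```python
# A numerical illustration of (8) and (9), with our own example: for a
# stationary binary Markov chain (not i.i.d.), alpha(X) falls strictly
# below min[Pr{X_n = 1}, Pr{X_n = 0}]. Transition probabilities are
# illustrative choices.
from math import log2

def h(p):  # binary entropy function; base-2 logs, as in the correspondence
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def h_inv(H):  # inverse of h restricted to [0, 1/2], by bisection
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < H else (lo, mid)
    return (lo + hi) / 2

a, b = 0.1, 0.3                        # Pr{1 | 0} and Pr{0 | 1}
pi1 = a / (a + b)                      # stationary Pr{X_n = 1} = 0.25
H_X = (1 - pi1) * h(a) + pi1 * h(b)    # entropy rate of the chain, per (6)
alpha = h_inv(H_X)

print(f"H(X) = {H_X:.4f}, alpha(X) = {alpha:.4f}, min prob = {min(pi1, 1 - pi1):.2f}")
# H(X) = 0.5721, alpha(X) = 0.1354 < 0.25, consistent with (9)
```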
Our main result is the following analog to (5). Let {X_n}, {Y_n} be independent stationary binary random sequences and let Z_n = X_n ⊕ Y_n, where "⊕" denotes modulo-2 addition. Let α(·) be defined by (8). We will prove the following theorem.

Theorem:

$$ \alpha(Z) \ge \alpha(X) * \alpha(Y), \tag{10} $$

where α * β = α(1−β) + β(1−α), 0 ≤ α, β ≤ 1.
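As a worked special case — ours, for illustration; compare Remark a) below — the bound (10) holds with equality for memoryless inputs. Assume {X_n} is i.i.d. Bernoulli(p) and {Y_n} is i.i.d. Bernoulli(q), with p, q ∈ [0, 1/2]:

```latex
% Equality in (10) for i.i.d. inputs: a worked check, not from the paper.
% Here alpha(X) = h^{-1}(h(p)) = p and alpha(Y) = q, since p, q <= 1/2.
\begin{align*}
\Pr\{Z_n = 1\} &= p(1-q) + q(1-p) = p * q
    && \text{(independence of } X_n \text{ and } Y_n\text{)}\\
H(Z) &= h(p * q)
    && (\{Z_n\}\ \text{is itself i.i.d.})\\
\alpha(Z) &= h^{-1}\big(h(p * q)\big) = p * q = \alpha(X) * \alpha(Y).
\end{align*}
% Note p * q = 1/2 - 2(1/2 - p)(1/2 - q) <= 1/2, so h^{-1} undoes h here.
```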



Remarks:

a) Equality in (10) holds when {X_n} and {Y_n} are i.i.d.

b) When {Y_n} is i.i.d. and Pr{Y_n = 1} = β, we can think of X_n and Z_n as the input and output of a BSC with crossover probability β. In this case, the theorem specializes to "Mrs. Gerber's Lemma" [4].

c) We do not see how to generalize this inequality to nonbinary random sequences. Note that even Mrs. Gerber's Lemma [4] does not extend directly to nonbinary inputs [5] (see also [7]).

d) We can define a binary analog to the nth order entropy power by

$$ \alpha_n(X) = h^{-1}(H_n(X)). $$

However, here is an example of two stationary sequences {X_n}, {Y_n} for which

$$ \alpha_n(Z) < \alpha_n(X) * \alpha_n(Y). \tag{11} $$

Let X̃_{2k}, −∞ < k < ∞, be i.i.d. binary random variables with uniform distribution. Let X̃_{2k+1} = X̃_{2k}, −∞ < k < ∞. Let Θ be another binary random variable, independent of the {X̃_{2k}}, also with uniform distribution. Finally, let X_n = X̃_{n+Θ}, −∞ < n < ∞. The "random phase" Θ makes {X_n} stationary. Let {Y_n} be a sequence independent of {X_n} with the same distribution, and let Z_n = X_n ⊕ Y_n, −∞ < n < ∞. Then it is easy to show that H_2(X) = H_2(Y) = 0.90564, so that α_2(X) = α_2(Y) = 0.32116 and α_2(X) * α_2(Y) = 0.43603. But H_2(Z) = 0.9772, and α_2(Z) = 0.4113, satisfying (11). (A numerical check of these values appears in the sketch following these remarks.) Thus our result (10) is an inequality that holds only in the limit.

e) As an application of the theorem, consider the problem of estimating the capacity of the following binary channel. The channel input is a binary sequence {X_n} with a "finite state" constraint (as discussed, for example, in [1, Chapter 2] or [3]). The channel output is Z_n = X_n ⊕ Y_n, where the binary "noise" sequence {Y_n} is stationary (for example, Gilbert's burst-error channel [2]). The channel capacity is

$$ C = \sup H(Z) - H(Y), $$

where the supremum is taken over all stationary random sequences {X_n} that satisfy the given constraint. Let H̄ be the maximum achievable entropy of a stationary binary random sequence {X_n} that satisfies the given input constraints. (It is known [1], [3] that H̄ = log λ*, where λ* is the largest eigenvalue of the state-transition matrix of the finite-state machine that defines the constraints.) If {X_n} is this max-entropic sequence, then our theorem gives a lower bound on H(Z), which implies that

$$ C \ge h\big[\bar{\alpha}(X) * \alpha(Y)\big] - H(Y), $$

where

$$ \bar{\alpha}(X) = h^{-1}(\bar{H}), \qquad \alpha(Y) = h^{-1}(H(Y)). $$

A special case in which a (d, k) run-length constraint is imposed on the binary inputs transmitted through a memoryless BSC is reported in [6]. (An illustrative evaluation of this bound appears after the proof below.)
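The following sketch — ours, not part of the correspondence — checks the numbers in Remark d). The pair distribution (3/8, 1/8, 1/8, 3/8) follows from the random-phase construction above: with probability 1/2 the two observed bits fall in the same pair of {X̃_n} (and so coincide), and with probability 1/2 they straddle two pairs (and so are independent uniform bits).

```python
# Numerical check of the example in Remark d).
from math import log2
from itertools import product

def h(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def h_inv(H):  # inverse of h on [0, 1/2], by bisection
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < H else (lo, mid)
    return (lo + hi) / 2

def star(a, b):  # a * b = a(1 - b) + b(1 - a)
    return a * (1 - b) + b * (1 - a)

pX = {(0, 0): 3/8, (0, 1): 1/8, (1, 0): 1/8, (1, 1): 3/8}   # law of (X_1, X_2)
H2X = -sum(p * log2(p) for p in pX.values()) / 2            # = 0.90564
a2X = h_inv(H2X)                                            # = 0.32116

pZ = {}   # law of (Z_1, Z_2), with {Y_n} an independent copy of {X_n}
for (x, px), (y, py) in product(pX.items(), pX.items()):
    z = (x[0] ^ y[0], x[1] ^ y[1])
    pZ[z] = pZ.get(z, 0.0) + px * py
H2Z = -sum(p * log2(p) for p in pZ.values()) / 2            # = 0.9772
a2Z = h_inv(H2Z)                                            # = 0.4113

print(a2Z < star(a2X, a2X))   # True: 0.4113 < 0.43603, i.e., (11) holds
```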

Proof of the Theorem: For n = 1, 2, ..., write X^(n) = (X_{−n}, ..., X_{−1}), and similarly Y^(n), Z^(n). Then write

$$ H(Z_0 \mid Z^{(n)}) \overset{1)}{\ge} H(Z_0 \mid Z^{(n)}, X^{(n)}, Y^{(n)}) \overset{2)}{=} H(Z_0 \mid X^{(n)}, Y^{(n)}) \overset{3)}{=} \sum_{y\in\{0,1\}^n} \Pr\{Y^{(n)} = y\} \sum_{x\in\{0,1\}^n} \Pr\{X^{(n)} = x\}\, H(Z_0 \mid X^{(n)} = x,\, Y^{(n)} = y) \overset{4)}{=} \sum_{y} \Pr\{Y^{(n)} = y\} \sum_{x} \Pr\{X^{(n)} = x\}\, h\big(\hat{\alpha}(x) * \hat{\beta}(y)\big), \tag{12a} $$

where

$$ \hat{\alpha}(x) = \Pr\{X_0 = 1 \mid X^{(n)} = x\}, \tag{12b} $$

$$ \hat{\beta}(y) = \Pr\{Y_0 = 1 \mid Y^{(n)} = y\}. \tag{12c} $$

Step 1) follows from the fact that conditioning decreases entropy, Step 2) from Z_n = X_n ⊕ Y_n, Step 3) from the independence of {X_n} and {Y_n}, and Step 4) from

$$ \Pr\{Z_0 = 1 \mid X^{(n)} = x,\, Y^{(n)} = y\} = \hat{\alpha}(x) * \hat{\beta}(y). $$

Now take

$$ \tilde{\alpha}(x) = \min\big[\hat{\alpha}(x),\, 1 - \hat{\alpha}(x)\big] \tag{13} $$

and observe that

$$ h\big(\hat{\alpha}(x) * \hat{\beta}(y)\big) = h\big(\tilde{\alpha}(x) * \hat{\beta}(y)\big). \tag{14} $$

Also

$$ \tilde{\alpha}(x) = h^{-1}(u(x)), \tag{15a} $$

where

$$ u(x) = H(X_0 \mid X^{(n)} = x). \tag{15b} $$

Substituting (14) and (15) into (12a), we obtain

$$ H(Z_0 \mid Z^{(n)}) \ge \sum_{y} \Pr\{Y^{(n)} = y\} \left[ \sum_{x} \Pr\{X^{(n)} = x\}\, h\big(\hat{\beta}(y) * h^{-1}(u(x))\big) \right]. $$

Now it is shown in [4, Lemma 1] that for 0 ≤ u ≤ 1 and 0 ≤ β ≤ 1,

$$ f(u) = h\big(\beta * h^{-1}(u)\big) $$

is a convex function of u. Thus, applying Jensen's inequality, the term in square brackets is

$$ \ge h\big(\hat{\beta}(y) * h^{-1}(H(X_0 \mid X^{(n)}))\big), $$

since

$$ \sum_{x} \Pr\{X^{(n)} = x\}\, u(x) = H(X_0 \mid X^{(n)}). $$

Repeating the entire argument for Y instead of X (with H(X_0 | X^(n)) fixed), we have

$$ H(Z_0 \mid Z^{(n)}) \ge h\big[ h^{-1}(H(X_0 \mid X^{(n)})) * h^{-1}(H(Y_0 \mid Y^{(n)})) \big]. $$

Taking h⁻¹ of both sides and letting n → ∞ yields

$$ h^{-1}(H(Z)) \ge h^{-1}(H(X)) * h^{-1}(H(Y)), $$

which is the theorem. □
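As a concrete illustration of the capacity bound in Remark e) — our example, not the paper's — take the (d, k) = (1, ∞) run-length constraint, whose state-transition matrix [[1, 1], [1, 0]] has largest eigenvalue λ* equal to the golden ratio, and i.i.d. BSC noise with a hypothetical crossover probability of 0.05, so that H(Y) = h(0.05) and α(Y) = 0.05.

```python
# Illustrative evaluation of the bound in Remark e); all parameter
# choices (the (1, inf) constraint, crossover 0.05) are our assumptions.
from math import log2, sqrt

def h(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def h_inv(H):  # inverse of h on [0, 1/2], by bisection
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < H else (lo, mid)
    return (lo + hi) / 2

def star(a, b):  # a * b = a(1 - b) + b(1 - a)
    return a * (1 - b) + b * (1 - a)

H_bar = log2((1 + sqrt(5)) / 2)   # maximum input entropy, ~ 0.6942 bits
alpha_bar = h_inv(H_bar)          # ~ 0.1866
alpha_Y = 0.05                    # i.i.d. noise: alpha(Y) = Pr{Y_n = 1}

C_lower = h(star(alpha_bar, alpha_Y)) - h(alpha_Y)
print(f"C >= {C_lower:.4f} bits per channel use")   # ~ 0.47
```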
REFERENCES

[1] R. E. Blahut, Principles and Practice of Information Theory. Reading, MA: Addison-Wesley, 1987.
[2] E. N. Gilbert, "Capacity of a burst-noise channel," BSTJ, vol. 39, pp. 1253–1266, Sept. 1960.
[3] C. E. Shannon, "A mathematical theory of communication," BSTJ, vol. 27, pp. 379–423, 623–656, Oct. 1948. Reprinted in D. Slepian, Key Papers in the Development of Information Theory. NY: IEEE Press, 1974.
[4] A. D. Wyner and J. Ziv, "A theorem on the entropy of certain binary sequences and applications (Part I)," IEEE Trans. Inform. Theory, vol. IT-19, pp. 769–777, Nov. 1973.
[5] H. S. Witsenhausen, "Entropy inequalities for discrete channels," IEEE Trans. Inform. Theory, vol. IT-20, pp. 610–616, Sept. 1974.
[6] S. Shamai (Shitz) and Y. Kofman, "On the capacity of binary and Gaussian channels with run-length limited inputs," to appear in IEEE Trans. Commun.
[7] R. Ahlswede and J. Körner, "On the connection between the entropies of input and output distributions of discrete memoryless channels," Proc. Fifth Conf. Probability Theory, Brasov, 1974, pp. 13–23, Academy Rep. Soc. Romania, Bucharest, 1977.
A New Recursive Filter for Systems with Multiplicative Noise

B. S. CHOW AND W. P. BIRKEMEIER

Manuscript received June 1989; revised August 1989.
B. S. Chow is with the Department of Electrical Engineering, National Sun Yat-Sen University, Kaohsiung, Taiwan 80424, R.O.C.
W. P. Birkemeier is at S-11 463 Soeldner Road, Spring Green, WI 53588.
IEEE Log Number 9038000.

Abstract — In a previous work, an optimal linear recursive MMSE estimator was developed for a zero-mean signal corrupted by multiplicative noise in its measurement model. This recursive filter cannot be obtained by the recursive structure of a conventional Kalman filter, in which the new estimate is a linear combination of the previous estimate and the new data. Instead, the recursive structure was achieved by combining the previous estimate with a recursive innovation: a linear combination of the two most recent data samples and the previous estimate. In this correspondence the signal is extended to be nonzero-mean. In the conventional Kalman filter, the superposition principle can be applied to both the signal and the measurement models for this nonzero-mean extension. However, when multiplicative noise exists, the measurement model becomes nonlinear. Therefore, a new recursive structure for the innovation process needs to be developed to achieve a recursive filter.

Index Terms — Multiplicative noise, Kalman filter, recursive estimation, innovation process, nonlinear measurement model.

I. INTRODUCTION

The Kalman filter [1] is a well-known estimator with a simple recursive structure. This structure is made possible by the elegant form of its innovation process [2]. The innovation process is based on its linear signal and measurement models. Therefore, the signal and measurement models are very important in the development of Kalman-filter-type estimators.

However, the signal in the Kalman filter's measurement model is assumed to be corrupted only by an additive noise. In our previous work [3] we included a multiplicative noise in the measurement model and retained the same signal model (with a zero-mean constraint). Since multiplicative noise makes the measurement model nonlinear, we cannot exploit the Kalman filter's form of innovation. As a result, we developed a new structure of a recursive estimator based upon a new form of recursive innovation process.

In this correspondence, we generalize the zero-mean signal to the nonzero-mean case. For the Kalman filter, the superposition principle can be applied to both the signal and the measurement models for the nonzero-mean extension. For our problem, because of the nonlinear measurement model, we need to work on a new corresponding measurement model (relative to our original measurement model) after we apply superposition to the linear models of the signal and the multiplicative noise.

Multiplicative noise is important in many cases, such as fading or reflection of the transmitted signal over an ionospheric channel, and also in certain situations involving sampling, gating, or amplitude modulation. Most of the research about multiplicative noise concerns uncertain observations [4]–[8] in the presence of a discrete type of random switching sequence, the problem being to determine whether a signal is present in the data. Under the assumption of a white switching sequence, Nahi [4] derived an optimal linear MMSE recursive filter. Monzingo [5] extended it to an optimal linear smoother. Tugnait [6] studied the stability of Nahi's estimator. Hadidi and Schwartz [7] used a two-state Markov chain to develop a more general model for the switching sequence. They also proved that the optimal linear MMSE filter cannot be achieved by the conventional structure of the Kalman filter. Wang [8] reached the same conclusion by a different approach.

However, little research has been done for the case of multiplicative noise with a continuous range of values. Rajasekaran [9] developed a linear MMSE recursive estimator for a continuous white-noise case, and Tugnait [10] has also analyzed the stability of this estimator. In a different category, Koning [11] has studied optimal estimation for systems with white stochastic parameters in the signal model. In our previous work [3], we developed a model for nonwhite continuous multiplicative noise, described by a dynamic equation. Rajasekaran's model turns out to be a special case of our model, but his approach is not suitable for the more general case of our model, because our nonwhite multiplicative noise in the measurement model makes his form of innovation process invalid.

II. PROBLEM FORMULATION

A. Notation Specification

The notation in this correspondence obeys the following two rules.

1) Random variables are distinguished from deterministic constants by their time arguments being parenthesized instead of subscripted.
2) Matrices and vectors are distinguished from scalars by being written in upper case.

B. Systems Models

Consider the following system (a short simulation sketch follows these definitions).

Signal Model:

$$ X(k+1) = A_k X(k) + B_k U(k). \tag{1} $$

Multiplicative Noise Model:

$$ \gamma(k+1) = c_k \gamma(k) + d_k \xi(k). \tag{2} $$

Measurement Model:

$$ Z(k) = \gamma(k) H_k X(k) + F_k N(k), \tag{3} $$

where

1) X(k), γ(k), N(k), and Z(k) are the signal, the multiplicative noise, the additive noise, and the data, respectively;
2) U(k) and ξ(k) are the generating random sequences for the signal and the multiplicative noise.
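The following is a minimal simulation sketch of the system (1)–(3) — our illustration, not the authors' code — assuming scalar, time-invariant coefficients, white Gaussian generating sequences, and a nonzero-mean signal input as in this correspondence's extension. All numerical values are hypothetical.

```python
# Minimal simulation sketch of models (1)-(3). The coefficients and the
# Gaussian choice for U(k), xi(k), N(k) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
A, B = 0.9, 1.0   # signal model (1):          X(k+1) = A X(k) + B U(k)
c, d = 0.8, 0.5   # multiplicative noise (2):  gamma(k+1) = c gamma(k) + d xi(k)
H, F = 1.0, 0.3   # measurement model (3):     Z(k) = gamma(k) H X(k) + F N(k)
u_mean = 0.2      # nonzero-mean generating sequence, per the extension here

x, gamma = 0.0, 1.0
for k in range(5):
    z = gamma * H * x + F * rng.standard_normal()        # (3): observed data
    print(f"k={k}: X={x:+.3f}  gamma={gamma:+.3f}  Z={z:+.3f}")
    x = A * x + B * (u_mean + rng.standard_normal())     # (1): signal update
    gamma = c * gamma + d * rng.standard_normal()        # (2): noise update
```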
