An Information-Theoretic Framework For Receiver Quantization in Communication

This work was supported in part by the National Natural Science Foundation of China through Grant 62231022 and in part by Henan Key Laboratory of Visible Light Communications through Grant HKLVLC2023-B03. The material in this paper will be presented in part at the IEEE International Symposium on Information Theory (ISIT), Ann Arbor, MI, USA, June 2025 [1]. (Corresponding author: Wenyi Zhang.)
Jing Zhou is with the Department of Computer Science and Engineering, Shaoxing University, Shaoxing 312000, China (e-mail: [email protected]).
Shuqin Pang and Wenyi Zhang are with the Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China (e-mail: [email protected], [email protected]).

Abstract—We investigate information-theoretic limits and design of communication under receiver quantization. Unlike most existing studies that focus on low-resolution quantization, this work is more focused on the impact of weak nonlinear distortion due to resolution reduction from high to low. We consider a standard transceiver architecture, which includes an independent and identically distributed (i.i.d.) complex Gaussian codebook at the transmitter, and a symmetric quantizer cascaded with a nearest neighbor decoder at the receiver. Employing the generalized mutual information (GMI), an achievable rate under general quantization rules is obtained in an analytical form, which shows that the rate loss due to quantization is log(1 + γSNR), where SNR is the signal-to-noise ratio at the receiver front-end, and γ is determined by the thresholds and levels of the quantizer. Based on this result, the performance under uniform receiver quantization is analyzed comprehensively. We show that the front-end gain control, which determines the loading factor (normalized one-sided quantization range) of quantization, has an increasing impact on performance as the resolution decreases. In particular, we prove that the unique loading factor that minimizes the mean square error (MSE) of the uniform quantizer also maximizes the GMI, and the corresponding irreducible rate loss is given by log(1 + mmse · SNR), where mmse is the minimum MSE normalized by the variance of the quantizer input, and is equal to the minimum of γ. A geometrical interpretation for the optimal uniform quantization at the receiver is further established. Moreover, by asymptotic analysis, we characterize the impact of biased gain control, including how small rate losses decay to zero and achievable rate approximations under large bias. From asymptotic expressions of the optimal loading factor and mmse, approximations and several "per-bit rules" for performance are also provided. Finally, we discuss more types of receiver quantization and show that the consistency between achievable rate maximization and MSE minimization does not hold in general.

Index Terms—Achievable rate, analog-to-digital converter, Gaussian channel, generalized mutual information, nearest neighbor decoding rule, mean square error, MMSE, transceiver design, uniform quantization.

I. INTRODUCTION

THE analog-to-digital conversion (ADC), including sampling and quantization, is essential for any digital receiver. The power dissipation of state-of-the-art ADCs increases four times as the resolution increases by one bit, while every doubling of the sampling rate leads to a one-bit loss of resolution [2], [3]. The impact of ADC on the performance has received increasing attention along with the recent evolution of wireless communications, in which several challenges are faced, such as the increasing processing speed due to the utilization of larger bandwidth in mmWave and higher frequencies, the increasing scale of hardware due to the use of massive multiple-input-multiple-output (MIMO), and the critical need for low-cost energy-efficient devices in emerging scenarios, e.g., massive machine-type communications (mMTC).

A majority of the studies on ADC at communication receivers have focused on the performance and design under low-resolution output quantization, and one-bit quantization has been of particular interest due to its negligible power dissipation and simplicity of implementation, even without requiring automatic gain control (AGC). In such studies the end-to-end channel is highly nonlinear, typically incurring a substantial performance loss, and necessitating a rethinking of the transceiver design.

On the other side, the transceiver architecture used in present wireless systems is built without considering the effect of output quantization. It is thus necessary to ask, under such a conventional transceiver architecture, how much is the loss caused by output quantization with moderate to high resolution? In other words, if a small loss in achievable rate is acceptable, how fine need the quantization be? Analytical results on these problems appear to be lacking. Moreover, limited resolution of quantization leads to new problems, e.g., sensitivity of performance to the error of gain control, residual interference in multiuser systems, and so on. These largely unexplored problems prompt us to revisit the topic of receiver quantization in communication in this work.

A. Related Work on Quantization at Communication Receivers

We begin from the impact of output quantization in the (discrete-time) additive white Gaussian noise (AWGN) channel, which is a benchmark model in communication theory. The performance gain of using more output quantization levels in coded transmission (i.e., soft-decision decoding) was observed very early in [4] via an information-theoretic approach. In the classic textbook of Wozencraft and Jacobs [5, Chap. 6.2], cutoff rate analysis showed that, for equiprobable uniformly spaced pulse amplitude modulation (PAM) input, the degradation due to output quantization is approximately 2 dB when the alphabet size equals the number of quantization levels, and the degradation vanishes when the quantization
becomes increasingly fine. Particularly, in the low-signal-to-noise-ratio (low-SNR) limit, hard-decision decoding (one-bit quantization that observes the sign of the output) leads to a power loss of π/2 (approximately 2 dB) [4], [5]; see also [6, Chap. 2.11 and 3.4].¹ For K-level output quantization, it was proved that a discrete input of at most K + 1 mass points suffices to achieve the constrained capacity [8].

¹Interestingly, the loss can be fully recovered if we replace the hard-decision decoder (a.k.a. sign quantizer) by a carefully designed asymmetric one-bit quantizer (a.k.a. threshold quantizer) and employ asymmetric input [7].

In vector (MIMO) channels, the quantization loss in achievable rate can be very small at low-to-moderate SNR even if 1~3-bit output quantization is used. This fact was shown in information-theoretic studies [9]–[11] by numerical examples for full-rank channels with multiplexing gains 2~4. Although the constrained capacity is still unknown even in the one-bit case, recent information-theoretic studies have provided various results on MIMO systems with coarse output quantization [12]–[19], which show that proper transceiver design is critical in realizing efficient communication in such systems. In particular, for one-bit output quantization, it was shown that the high-SNR capacity grows linearly with the rank of the channel [13], while the low-SNR asymptotic power loss of π/2 still exists in the case of the vector channel [12].

In light of these positive theoretical results, performance and design of wireless systems with low-resolution quantization have been extensively studied in recent years; see, e.g., [20]–[26]. The most common approach therein, however, is not information-theoretic (since exact evaluation of mutual information can be difficult). Instead, achievable rate estimation based on the additive quantization noise model (AQNM) has been widely used, which comes from Bussgang-like decomposition [27]–[29]. Results in [12]–[14], [20] suggested that such estimation approximates the mutual information well at low SNR, but becomes inaccurate at high SNR.

Although mutual information is a fundamental performance measure, for communications under transceiver nonlinearity it has limited operational meaning, in the sense that the decoder that achieves the predicted rate can be too complex to implement, while that rate is not necessarily achievable by a standard transceiver architecture designed without considering nonlinearity, because the decoder is typically mismatched to the nonlinear channel. In [30], [31], a more meaningful performance measure that takes the decoding rule into account, namely the generalized mutual information (GMI) [32], has been adopted, yielding analytical expressions of the achievable rate under output quantization and the nearest neighbor decoding rule. Under a given (possibly mismatched) decoder, the GMI determines the highest rate below which the average probability of error, averaged over a given i.i.d. codebook ensemble, converges to zero as the block length N grows without bound, and it is thus a lower bound on the mismatch capacity [32], [33]. The GMI has been applied in various scenarios for performance evaluation, including bit-interleaved coded modulation (BICM) [34], fading channels [35]–[37], and nonlinear fiber-optic channels [38]. In fact, the rate estimation based on the AQNM (or Bussgang-like decomposition) is consistent with the GMI for the scalar channel under Gaussian input and nearest neighbor decoding [30]; see [39] for more discussions. However, GMI analyses also show that the AQNM-based estimation is not accurate in general. For example, it may overestimate the achievable rate in multiantenna systems [40].

B. Related Work in Quantization Theory

The rich theory of quantization was surveyed comprehensively in [41], in which two well-established asymptotic theories were emphasized. The first is Shannon's information-theoretic approach (rate distortion theory [42]), which places quantization in the framework of lossy source coding and focuses on the high-dimension regime, thereby shedding light on vector quantization. The second is the asymptotic quantization theory, which sheds light on quantizer design in the high-resolution regime. The asymptotic quantization theory is more relevant to receiver quantization in communication, which typically does not employ coding or vector quantization. Although not applicable directly to receiver quantization, some classical results in asymptotic quantization theory, especially those for uniform scalar quantization, are reviewed here for comparison.

A basic result known in [43] and [44] (rigorously proved in [45]) states that, for a high-resolution uniform quantizer with step size ℓ, the mean square error (MSE) can be approximated by ℓ²/12. This yields the "6-dB-per-bit rule" that each additional bit in resolution reduces the MSE by 6.02 dB. The rule reflects the impact of the step size, which causes granular distortion; but it ignores overload distortion due to the finite quantization range. The interplay between these two types of distortion determines how the optimal quantization range scales with the resolution. The scaling law has been characterized in the seminal work [46] for several types of input densities. Take the Gaussian source as an example. In [46] it has been shown that, for the optimal 2K-level uniform quantization that minimizes the MSE, 1) the loading factor² scales like √(2 ln(2K)) (cf. the conventional "four-sigma" rule of thumb [44], [47], [48]), and 2) the granular distortion dominates and the overload distortion is asymptotically negligible. Further properties of the uniform quantization have been analyzed in [49]–[51]. For quantization at communication receivers, we need parallel results to characterize the optimal loading factor, which is essential for the design of the AGC. We note that, although the importance of AGC design in the presence of output quantization has been recognized for a long time [6], [29], it was only investigated by numerical results in several works; see, e.g., [9], [10], [22], [52], [53].

²The loading factor of a quantizer is the one-sided width of its quantization range normalized by the standard deviation of the input [47].

Similar to the AQNM, there is also an additive noise model for source quantization [41], [44], [47], which approximates the quantization error as an independent white noise term added to the quantizer input (but does not include a scaling factor like that in the AQNM), though the "noise" is in fact a deterministic function of the input.
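The ℓ²/12 approximation and the 6-dB-per-bit rule quoted above are easy to check numerically. The sketch below is ours, not from the paper; the loading factor L = 5 and the resolutions are arbitrary test choices, and the helper name `midrise` is ours. It quantizes a unit-variance Gaussian source with a mid-rise uniform quantizer and compares the empirical MSE with ℓ²/12; with L held fixed, each extra bit shrinks the step by half and the MSE by roughly a factor of 4 (6.02 dB).

```python
import numpy as np

def midrise(x, step, K):
    """Mid-rise uniform quantizer with 2K levels: outputs are +/-(k - 1/2)*step."""
    idx = np.clip(np.floor(np.abs(x) / step), 0, K - 1)
    return np.sign(x) * (idx + 0.5) * step

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)       # unit-variance Gaussian source

L = 5.0                                  # loading factor, comfortably above four sigma
results = {}
for b in (10, 11):                       # resolution in bits (2^b levels)
    K = 2 ** (b - 1)
    step = L / K
    mse = float(np.mean((midrise(x, step, K) - x) ** 2))
    results[b] = (mse, step ** 2 / 12)
    print(b, mse, step ** 2 / 12)        # empirical MSE vs. the l^2/12 rule
```

With L fixed well into the Gaussian tail, overload distortion is negligible and the granular term dominates, which is exactly the regime in which the rule is valid.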
The quantization error can be white (uncorrelated between samples) when the source is i.i.d., and it is approximately uncorrelated with the input when the resolution is sufficiently high. So the model may give a useful approximation under certain circumstances (see, e.g., [54]).

C. Our Work

1) Problem: In this paper, we consider communications in the presence of output quantization, and focus on the effect of resolution reduction on the performance of a standard transceiver architecture designed without considering that effect. Rather than considering only low-resolution output quantization (1~3 bits) as in most existing studies, we consider the entire region of resolution, especially the transition from high resolution (typically 8~12 bits or more, so that the performance loss is negligible) to low resolution. Since in the transition only weak to moderate nonlinearity is introduced, it is natural to keep the transceiver architecture unchanged rather than rebuild it. The considered standard transceiver architecture includes an independent and identically distributed (i.i.d.) complex Gaussian codebook at the transmitter, and a uniform output quantizer cascaded with a (weighted) nearest neighbor decoder at the receiver, where the loading factor of the quantizer can be adjusted by gain control. See Sec. II for details. We choose such an architecture in view of the following facts.

• When the impact of quantization is negligible, this architecture is capacity-achieving in several important channel models, such as the AWGN channel and the flat-fading channel with Gaussian noise and channel state information at the receiver [55]. It is also a very robust architecture in general noisy channels [56]. This architecture has been adopted in various performance evaluation problems, e.g., [35], [36], [56] for linear channels with fading and [30], [38], [57] for nonlinear channels.
• The nearest neighbor decoding (minimum Euclidean distance decoding) can be implemented efficiently and has been widely employed as a standard decoding rule in communication systems. The uniform quantizer is also a standard component of practical receivers as well as a common assumption in performance analysis [9]–[11], [14], [15], [25], [53], [58].
• The achievable rate of complex Gaussian input approximates that of regular high-order modulation schemes such as quadrature amplitude modulation (QAM). In the AWGN channel, the high-SNR gap between their achievable rates is 1.53 dB, which can be further reduced by constellation shaping [59].

2) Method: The achievable rate results in this paper are derived based on the GMI, which, as discussed in Sec. I-A, is a convenient performance measure for the problem of information transmission under transceiver nonlinearity. Moreover, as a performance measure under a mismatched decoder (since the nearest neighbor decoding rule becomes suboptimal in the presence of receiver quantization), the GMI possesses optimality in the sense that it is the maximum achievable rate of the i.i.d. random code ensemble, thereby indicating the performance of a "typical" codebook [32], [35]. Specifically, for Gaussian codebook and nearest neighbor decoding, the GMI has a simple expression which can be evaluated by the correlation between the channel input and output. Our asymptotic analyses also rely on methods and results in asymptotic (high-resolution) quantization theory. A notable tool originating from numerical analysis is the Euler-Maclaurin summation formula [60], which was initially introduced to source quantization theory in [49].

3) Summary of Contribution: We provide information-theoretic results for the transceiver architecture including i.i.d. complex Gaussian codebook and nearest neighbor decoding, in the presence of complex Gaussian noise and symmetric receiver quantization, especially uniform quantization. Our main contributions, including exact expressions, asymptotic formulas, and numerical results, are summarized as follows.

• In Sec. III, for the considered transceiver architecture, we show that the GMI under a given SNR can be expressed by

IGMI = C − log(1 + γSNR),   (1)

where C is the channel capacity when the resolution of quantization is unlimited, and the parameter γ, which does not depend on the SNR, is determined by the thresholds and levels of the quantizer in an analytical form.
• In Sec. IV, for uniform quantization (with equispaced thresholds and mid-rise levels), we show that optimizing the loading factor by gain control before quantization (thereby minimizing γ in (1)) is increasingly important as the resolution decreases, thus imposing a critical challenge to the AGC. Interestingly, the problems of MSE minimization and achievable rate maximization are proven to be consistent, in the sense that there is a unique loading factor L = L∗ satisfying

L∗ = arg max_L IGMI(L) = arg min_L mse(L),   (2)

where IGMI(L) is the achievable rate as a function of the loading factor. This fact, combined with an existing result, implies that the optimal loading factor L∗ scales like 2√(b ln 2) as the resolution b (in bits) increases. We further prove that the minimum of γ is exactly the minimum mean square error (MMSE) normalized by the variance of the quantizer input (denoted by mmse), so that the irreducible loss in achievable rate due to uniform quantization is determined by

C − IGMI(L∗) = log(1 + mmse · SNR).   (3)

For uniform quantization with the optimal loading factor, we establish a geometrical interpretation of how the MMSE and the additive Gaussian noise jointly determine the achievable rate.
• The uniform receiver quantization is further studied by asymptotic analysis. In Sec. V, the impact of biased gain control is characterized by asymptotic behaviors and approximations of IGMI(L). Approximations of IGMI(L) for the overload region (L < L∗) and the underload region (L > L∗) of the loading factor are proposed, respectively. Specifically, in the high-resolution regime, we characterize the loading loss in achievable rate, showing that i) the loading
ℓk/g, k = 1, ..., K − 1}, where g can be adjusted to optimize the performance. A special case is the uniform quantizer, which has equispaced thresholds

ℓk = kℓ, k = 1, ..., K − 1,   (9)

and mid-rise levels

yk = (k − 1/2)ℓ, k = 1, ..., K,   (10)

where ℓ is the step size. Thus, we define its quantization range or support as [−Kℓ, Kℓ]. Then the loading factor or support limit of the uniform quantizer is L = Kℓ.

B. An Achievable Rate Formula from Generalized Mutual Information

Following the notation of [33], consider a memoryless channel X → Y with general alphabets X and Y, input probability distribution PX(x), transition probability PY|X(y|x), and decoding metric d(X, Y). The decoder selects a message according to

m̂ = arg min_{m∈M} Σ_{n=1}^{N} d(Xn(m), Yn).   (11)

As a lower bound on the mismatch capacity, the GMI can be given by its dual expression as [32], [33]

IGMI = sup_{s≥0} E[ log( e^{−s·d(X,Y)} / E[e^{−s·d(X′,Y)} | Y] ) ],   (12)

where (X, Y, X′) ∼ PX(x)PY|X(y|x)PX(x′). The GMI gives the maximum rate below which the probability of decoding error, averaged over the i.i.d. random codebook ensemble, converges to zero as the coding block length grows without bound. For communications with transceiver nonlinearity, if the input distribution PX(x) is complex Gaussian with variance σx² and the decoding rule (11) is specified by (7), then from (12) we obtain [30]

IGMI = sup_{s≥0} { log(1 + s|a|²σx²) − s·E[|Y − aX|²] + s·E[|Y|²]/(1 + s|a|²σx²) }.   (13)

Maximizing the GMI by optimizing the scaling factor a yields the following result [30, Appendix C], which provides a general approach for achievable rate analysis in the presence of transceiver nonlinearity with known transition probability.

Proposition 1 [30]: For a memoryless SISO channel X → Y with transition probability pY|X(y|x) and nearest neighbor decoding rule (7), where X, Y ∈ C and Var(X) = σx², the maximum GMI under an i.i.d. complex Gaussian codebook is given by

IGMI = log(1/(1 − ∆)),   (14)

where

∆ = |E[X*Y]|²/(σx²·E[|Y|²]).   (15)

To achieve the maximum GMI given in (14), the scaling factor in (7) should be set as

a = α := E[X*Y]/σx².   (16)

Besides the proof in [30] based on direct evaluation and optimization of the dual expression of the GMI, here we provide a sketch of an alternative proof.

Proof Sketch: It has been noted in [35] and [66] that, for an additive uncorrelated noise channel Y = S + U (i.e., the noise U satisfies E[S*U] = 0, but is not necessarily independent of the input S), if S ∼ NC(0, E[|S|²]), then

I(X; Y) ≥ log(1 + E[|S|²]/E[|U|²]).   (17)

In [56], under Gaussian codebook and nearest neighbor decoding rule, the achievability of the RHS of (17) and a random coding converse for it are established by a geometric argument,⁴ where the Gaussian codebook can either be an i.i.d. Gaussian codebook or an equienergy one. In the former case the rate (17) is the GMI. For the scalar channel X → Y, X ∼ NC(0, σx²), by a Bussgang-like decomposition, we can always write Y = αX + D, where α is given in (16), and D = Y − αX satisfies E[X*D] = 0. That is, we eliminate the correlation between the scaled input αX and the corresponding distortion D by a carefully chosen scaling factor. It is straightforward to show that (14) can be obtained by (17) when S = αX and U = D.

⁴The proof in [56] was intended for independent noise. However, as noted in [35], it goes through verbatim for uncorrelated noise.

Let σy be the standard deviation of Y. We note that when E[Y] = 0 (e.g., when the quantizer is symmetric, the channel output in (4) has zero mean) we have ∆ = |ρXY|², where

ρXY = cov(X, Y)/(σx·σy)   (18)

is the Pearson correlation coefficient between X and Y. Also note that the equality in (17) holds if and only if X ⊥ D and D ∼ NC(0, σ²) (e.g., when there is no quantization in (4)), so that the channel X → Y reduces to the AWGN channel, the nearest neighbor decoding rule is optimal, and the GMI equals the mutual information (also the channel capacity) log(1 + σx²/σ²).

III. ACHIEVABLE RATE UNDER RECEIVER QUANTIZATION: EXACT AND ASYMPTOTIC RESULTS

Based on Proposition 1, we establish the following result, which provides an analytical expression for the achievable rate of the transceiver architecture considered in this paper.

Theorem 2: For the channel (4) where q(·) is the symmetric quantizer described in Sec. II-A2, the achievable rate under i.i.d. complex Gaussian codebook and nearest neighbor decoding rule (7) is

IGMI = log(1 + SNR) − log(1 + γSNR),   (19)
where SNR = |h|²σx²/σ² is the SNR at the receiver front-end, and γ is a parameter determined by the quantizer as

γ = 1 − A²/B,   (20)

in which

A = √(2π) Σ_{k=1}^{K} yk (φ(ℓ_{k−1}) − φ(ℓk)),   (21)

and

B = π Σ_{k=1}^{K} yk² (Q(ℓ_{k−1}) − Q(ℓk)).   (22)

Proof: Applying Proposition 1, it is sufficient to show that

∆ = (SNR/(1 + SNR))·(A²/B),   (23)

which can be obtained by showing that

E[X*Y] = 2E[Re(X)·Re(Y)] − 2j·E[Im(X)·Re(Y)]   (24a)
= (2Re(h)σx²·A)/√(π(|h|²σx² + σ²)) + (2j·Im(h)σx²·A)/√(π(|h|²σx² + σ²))   (24b)
= (2hσx²·A)/√(π(|h|²σx² + σ²))   (24c)

and

E[|Y|²] = 2E[Re(Y)²] = (4/π)·B.   (25)

The identities (24c) and (25) can be obtained by lengthy but straightforward evaluations of expectations (cf. [30, Appendix D]).

Remark: In [30, Sec. V], a parallel result of Theorem 2 for the real-valued channel with symmetric output quantization was given by deriving its effective SNR, while Theorem 2 shows that the GMI in the complex-valued case has exactly the same expression except that the pre-log factor is doubled.

We note that, if the impact of quantization is omitted, then the model (4) reduces to a linear Gaussian channel Yn = hXn(m) + Zn, and the achievable rate of the considered transceiver architecture is log(1 + SNR), which is also the channel capacity C when the power of the input is σx². The expression of IGMI in Theorem 2 shows explicitly that the rate loss due to quantization is

C − IGMI = log(1 + γSNR),   (26)

which has the same expression as C except for an SNR reduction of at least 4.4 dB (see (28) given later). For a given SNR, the reduction is determined by the parameter γ, which reflects the impact of nonlinearity and does not depend on the SNR.

Remark: Dithering, an intentional randomization technique in source quantization [41], may also be beneficial in receiver quantization; see, e.g., [31], which showed that nonsubtractive Gaussian dithering improves the GMI under certain circumstances.⁵ However, in our setting, nonsubtractive Gaussian dithering is always harmful, because it is equivalent to reducing the SNR, while the GMI is a monotonically increasing function of the SNR since its derivative (in nats) satisfies

dIGMI/dSNR = (1 − γ)/((1 + SNR)(1 + γSNR)) > 0.   (27)

⁵In nonsubtractive dithering, one adds a dither signal Wd independent of the quantizer input V before quantization, yielding an output q(V + Wd). Subtractive dithering has an additional step that subtracts the dither signal after quantization and finally obtains q(V + Wd) − Wd. See [41] for more discussion.

From Theorem 2, we immediately obtain the following corollary, which shows that γ dominates the asymptotic behavior of performance, especially the low-SNR slope of achievable rate and the high-SNR saturation rate.

Corollary 3: The achievable rate IGMI given in (19) has the following properties.
• The parameter γ satisfies

0 < γ ≤ 1 − 2/π,   (28)

where the equality holds when q(·) is a one-bit quantizer, corresponding to an achievable rate

IGMI^{1-bit} = log( (1 + SNR)/(1 + ((π − 2)/π)·SNR) ),   (29)

which converges to log₂(π/(π − 2)) = 1.4604 bits/c.u. at high SNR (see also footnote 6). As the quantization becomes increasingly fine, we have γ → 0 (from above), SNRe → SNR (from below), and

C − IGMI = SNR·γ − (SNR²/2)·γ² + o(γ²) nats/c.u.   (30)

• High- and low-SNR asymptotics: As SNR → ∞, we have SNRe → (1 − γ)/γ and

IGMI = log(1/γ) − ((1 − γ)/γ)·(1/SNR) + o(1/SNR).   (31)

So the saturation rate is given by

ĪGMI = log(1/γ).   (32)

As SNR → 0, we have

IGMI = (1 − γ)·SNR − ((1 − γ²)/2)·SNR² + o(SNR²) nats/c.u.   (33)

An alternative expression of (19) is given in terms of the effective SNR (or signal to noise-and-distortion ratio) as

IGMI = log(1 + SNRe),   (34)

where

SNRe = (1 − γ)|h|²σx²/(γ|h|²σx² + σ²) = ((1 − γ)/(γSNR + 1))·SNR.   (35)
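As a numerical sanity check (ours, not part of the paper), the snippet below evaluates γ from (20)-(22) for a mid-rise uniform quantizer, assuming that the thresholds ℓk are expressed in units of the standard deviation of each real dimension of the quantizer input, and compares the rate (19) against a Monte Carlo estimate of the GMI obtained directly from (14)-(15). It also confirms that γ = 1 − 2/π for one-bit quantization, independently of the gain control, cf. (28). The parameters b = 3, L = 2.5, and SNR = 10 dB, and the helper names, are arbitrary choices of ours.

```python
import math
import numpy as np

SQRT_2PI = math.sqrt(2 * math.pi)

def gamma_uniform(b, L):
    """gamma = 1 - A^2/B from (20)-(22) for a 2^b-level mid-rise uniform quantizer;
    thresholds are in units of the std of each real dimension of the input."""
    K = 2 ** (b - 1)
    step = L / K
    thr = [k * step for k in range(K)] + [math.inf]       # l_0 = 0, ..., l_K = inf
    lev = [(k - 0.5) * step for k in range(1, K + 1)]     # mid-rise levels y_k
    phi = [math.exp(-t * t / 2) / SQRT_2PI for t in thr]  # Gaussian pdf
    Q = [0.5 * math.erfc(t / math.sqrt(2)) for t in thr]  # Gaussian tail function
    A = SQRT_2PI * sum(y * (phi[k] - phi[k + 1]) for k, y in enumerate(lev))
    B = math.pi * sum(y * y * (Q[k] - Q[k + 1]) for k, y in enumerate(lev))
    return 1 - A * A / B

# One-bit quantization: gamma = 1 - 2/pi regardless of the gain control.
print(gamma_uniform(1, 1.0), 1 - 2 / math.pi)

# Monte Carlo cross-check of (19) via Proposition 1: estimate Delta from samples.
b, L, snr = 3, 2.5, 10.0
rng = np.random.default_rng(0)
n = 400_000
x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / math.sqrt(2)
z = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / math.sqrt(2 * snr)
scale = math.sqrt((1 + 1 / snr) / 2)      # std per real dimension of x + z
Kq = 2 ** (b - 1)
step = scale * L / Kq                     # gain control sets the loading factor L

def q(u):                                 # mid-rise quantizer, applied per dimension
    idx = np.clip(np.floor(np.abs(u) / step), 0, Kq - 1)
    return np.sign(u) * (idx + 0.5) * step

v = x + z
y = q(v.real) + 1j * q(v.imag)
delta = abs(np.mean(np.conj(x) * y)) ** 2 / np.mean(np.abs(y) ** 2)  # (15), sigma_x^2 = 1
gmi_mc = math.log2(1 / (1 - delta))                                  # (14), in bits
gmi_th = math.log2(1 + snr) - math.log2(1 + gamma_uniform(b, L) * snr)  # (19)
print(gmi_mc, gmi_th)
```

The two estimates agree to within the Monte Carlo accuracy, which also validates the normalization assumption above.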
From this expression, we see that the effect of quantization is to transfer a fraction γ of the power to the denominator of the effective SNR. At low SNR, the power loss can be evaluated by SNR/SNRe, which asymptotically converges to 1/(1 − γ) ≤ π/2, where equality holds when one-bit quantization is used.⁶

From Theorem 2, it is simple to check that, for one-bit quantization, the gain control factor g has no impact on the achievable rate. In general, the achievable rate has the following property.

Proposition 4: The achievable rate IGMI given in (19), as a function of the gain control factor g, satisfies

lim_{g→0} IGMI(g) = lim_{g→∞} IGMI(g) = IGMI^{1-bit}.   (36)

From (40), the normalized MSE is minimized only if

√(2π)·(dA/dg) = dB/dg.   (43)

The different expressions of mse and γ show that, in general, the gain control factor that minimizes the MSE does not necessarily maximize the achievable rate. However, for uniform quantization, the next section will show that the two optimization problems are consistent; i.e., there is a unique gain control factor (and correspondingly, a unique loading factor) that solves both problems simultaneously.

Fig. 2. Achievable rate with uniform output quantization and optimal loading factor.

Fig. 3. Convergence of achievable rate to channel capacity.

unique loading factor is denoted by L∗. Its uniqueness will be proved later. In Fig. 4 and Fig. 5, we show how the achievable rate IGMI varies with the loading factor L (some more numerical results and details therein will be interpreted in subsequent sections). As the resolution increases, the increase of L∗ is clear. A "waterfall" near L = 0 can be observed in all figures, implying that an underestimate of the optimal loading factor always causes serious rate loss. To interpret this phenomenon, we write the normalized MSE of uniform quantization as the sum of overload distortion and granular distortion as

mse = mseo + mseg,   (48)

where

mseo := ∫_L^∞ (t − L + ℓ/2)² φ(t) dt,   (49)

and

mseg := Σ_{k=1}^{K} ∫_{(k−1)ℓ}^{kℓ} (t − (k − 1/2)ℓ)² φ(t) dt.   (50)

Clearly, decreasing L reduces the granular distortion but increases the overload distortion. When L < L∗ the overload distortion increases quickly, thereby causing the waterfall of achievable rate.

The numerical results in Fig. 4 and Fig. 5 reveal the increasing importance of gain control (realized by an AGC module in practical systems) as the resolution decreases: 1) Under high-resolution output quantization, we only require a rough estimate of the channel gain to guarantee that the loading factor is no less than a predefined threshold, say 4 (from the four-sigma rule of thumb [44], [47]), so that we can stay away from the waterfall; 2) Under low-resolution output quantization, such a simple strategy may increase the rate loss considerably, but perfect gain control needs accurate channel estimation, which is also challenging in this case.

B. Consistency Between Rate Maximization and MSE Minimization and a GMI-MMSE Formula

From source quantization theory [46], [50] we know that, for Gaussian input, the MSE of the uniform quantizer is a strictly convex function of the step size (or loading factor) and the minimum is located at the unique solution of dMSE/dℓ = 0, denoted by ℓ∗. In fact, that solution is also the unique step size that maximizes the GMI; that is, for uniform quantization at the receiver there is a consistency between achievable rate maximization and MSE minimization. This consistency further leads to a GMI-MMSE formula which connects the maximum achievable rate and the MMSE of quantization in a simple and closed form. We first give the following lemma, and then use it to establish the aforementioned findings in Theorem 8.

Lemma 7: For uniform quantization at the receiver, for b > 1 (i.e., except for one-bit quantization), we have

dIGMI/dℓ = 0 ⇔ dmse/dℓ = 0,   (51)

and both of them are equivalent to

A/B = √(2/π).   (52)

Proof: See Appendix D.

Theorem 8: For the channel (4) where q(·) is the uniform quantizer described in Sec. II-A2 with resolution b > 1, the input employs an i.i.d. complex Gaussian codebook, and the receiver employs the nearest neighbor decoding rule (7), we have the following properties.
• The loading factor L∗ that maximizes the GMI (19) is unique, and it is also the unique loading factor that minimizes the MSE (39), i.e.,

arg max_L IGMI(L) = arg min_L mse(L).   (53)

• The minimum of γ as a function of L, namely γ(L∗), is exactly the normalized MMSE (42) and satisfies

0 < mmse = γ(L∗) ≤ γ ≤ 1 − 2/π,   (54)

and the maximum GMI can be written as

IGMI∗ = log(1 + SNR) − log(1 + mmse·SNR).   (55)
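Theorem 8 can be verified numerically. The sketch below is ours; it uses the moment identities E[V q(V)] = √(2/π)·A and E[q(V)²] = (2/π)·B for a unit-variance Gaussian input per real dimension, which follow from the sums in (21)-(22) but are our own rearrangement, and the helper names are ours. A grid search over the loading factor L at resolution b = 4 shows that the minimizer of γ (equivalently, the maximizer of the GMI) coincides with the minimizer of the normalized MSE, that the two minimum values agree (mmse = γ(L∗), cf. (54)), and that the optimality condition (52) holds at the optimum.

```python
import math

SQRT_2PI = math.sqrt(2 * math.pi)

def AB(b, L):
    """A and B from (21)-(22) for a 2^b-level mid-rise uniform quantizer;
    thresholds are in units of the std of each real dimension of the input."""
    K = 2 ** (b - 1)
    step = L / K
    thr = [k * step for k in range(K)] + [math.inf]       # l_0 = 0, ..., l_K = inf
    lev = [(k - 0.5) * step for k in range(1, K + 1)]     # mid-rise levels y_k
    phi = [math.exp(-t * t / 2) / SQRT_2PI for t in thr]
    Q = [0.5 * math.erfc(t / math.sqrt(2)) for t in thr]
    A = SQRT_2PI * sum(y * (phi[k] - phi[k + 1]) for k, y in enumerate(lev))
    B = math.pi * sum(y * y * (Q[k] - Q[k + 1]) for k, y in enumerate(lev))
    return A, B

def gamma_of(b, L):                        # gamma = 1 - A^2/B, cf. (20)
    A, B = AB(b, L)
    return 1 - A * A / B

def mse_of(b, L):
    # Normalized quantizer MSE for unit-variance Gaussian input:
    # mse = 1 - 2 E[V q(V)] + E[q(V)^2] = 1 - 2*sqrt(2/pi)*A + (2/pi)*B.
    A, B = AB(b, L)
    return 1 - 2 * math.sqrt(2 / math.pi) * A + 2 * B / math.pi

b = 4
grid = [0.5 + 0.001 * i for i in range(4000)]
L_rate = min(grid, key=lambda L: gamma_of(b, L))   # maximizes the GMI (19)
L_mse = min(grid, key=lambda L: mse_of(b, L))      # minimizes the MSE
A, B = AB(b, L_rate)
print(L_rate, L_mse)                               # the two optima coincide
print(gamma_of(b, L_rate), mse_of(b, L_mse))       # mmse = gamma(L*)
print(A / B, math.sqrt(2 / math.pi))               # optimality condition (52)
```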
Fig. 4. Impact of loading factor on achievable rate: fixed resolution, varying SNR. Vertical lines and corresponding values show the optimal loading factors. (a) b = 12 bits: simple gain control strategy is always enough. (b) b = 10 bits: simple gain control strategy is usually enough. (c) b = 8 bits: four-sigma rule of thumb is convenient. (d) b = 6 bits: four-sigma rule of thumb causes small loss at high SNR. (e) b = 4 bits: four-sigma rule of thumb causes considerable loss. (f) b = 2 bits: four-sigma rule of thumb causes significant loss.
Fig. 5. Impact of loading factor on achievable rate: fixed SNR, varying resolution. (a) SNR = −10 dB. (b) SNR = 0 dB. (c) SNR = 10 dB ≈ SNR_q(2). (d) SNR = 20 dB ≈ SNR_q(4). (e) SNR = 30 dB ≈ SNR_q(6). (f) SNR = 40 dB ≈ SNR_q(8).
Proof: First, in [46, Sec. V-A], it has been proved that the MSE is minimized if and only if the loading factor is set to be the unique solution of dmse/dL = 0, denoted by L∗. The uniqueness of L∗ is confirmed by showing that the second derivative of the MSE is strictly positive. Lemma 7 implies that the loading factor that satisfies dI_GMI/dL = (1/K) dI_GMI/dℓ = 0 is also unique. Noting that the achievable rate I_GMI is a continuous and differentiable function of L, from Proposition 4 we can infer that L∗ also maximizes I_GMI(L) (it is not difficult to exclude the other possible case that L∗ minimizes I_GMI(L)). Then the first part of Theorem 8 is proved.

According to Lemma 7, if L = L∗ then (52) holds. Combining this fact with (40) and (20), we have

γ(L∗) = 1 − (2/π)B(ℓ∗) = mmse, (56)

thereby completing the proof of the second part of Theorem 8.

From the proof of Theorem 8, we can infer that I_GMI(L) > I_GMI^{1-bit} holds for 0 < L < ∞. Combining this with Proposition 4 yields the following corollary.

Corollary 9: For uniform receiver quantization, the achievable rate given in (19), as a function of the loading factor L, satisfies

inf_L I_GMI(L) = lim_{L→0} I_GMI(L) = lim_{L→∞} I_GMI(L) = I_GMI^{1-bit}, (57)

where I_GMI^{1-bit} is given in (29).

In Theorem 8 we exclude the case b = 1. In fact, for symmetric one-bit quantization, we have A = ℓ/2, B = πℓ²/8, and γ ≡ 1 − 2/π for 0 < ℓ < ∞, so that gain control is unnecessary. But when

ℓ = 4/√(2π) = 1.5958, (58)

the normalized MSE achieves its minimum as mmse^{1-bit} = 1 − 2/π, corresponding to the upper bound in (54).

Remark: From Theorem 2, the rate loss due to uniform quantization with a loading factor L is given by log(1 + γ(L)SNR). For uniform quantization, Theorem 8 characterizes the minimum of the rate loss when L = L∗. Thus we should distinguish two parts of the total rate loss as follows.⁷
• Irreducible loss: the unavoidable part for given resolution and SNR, given by

C − I_GMI^∗ = log(1 + mmse · SNR). (59)

Numerical results in Fig. 2 and Fig. 3 consider only irreducible loss.
• Loading loss: the remaining part, given by

I_GMI^∗ − I_GMI(L) = log((1 + γ(L)SNR)/(1 + mmse · SNR)), (60)

which is due to a suboptimal loading factor and can be reduced by improving the accuracy of gain control. Numerical results in Fig. 4 and Fig. 5 show the importance of reducing loading loss.

⁷We note that the loss (59) is irreducible in the sense of GMI, and it is not necessarily irreducible in general since the GMI is only a lower bound on the mismatch capacity of the channel (4).

Theorem 8 enables us to utilize known results on uniform quantization for minimizing the MSE. The following result follows immediately from the equivalence (53) and existing results in [46], [67]. In particular, it is shown that the optimal loading factor L∗ grows with the resolution b like √(2b ln 2).

Corollary 10: In the channel (4) under i.i.d. complex Gaussian codebook and nearest neighbor decoding rule (7), the optimal step size ℓ∗ that maximizes the GMI (19) satisfies

lim_{K→∞} Kℓ∗ = ∞, lim_{K→∞} ℓ∗ = 0. (61)

The optimal loading factor L∗ = Kℓ∗ grows monotonically with K and satisfies

lim_{K→∞} L∗/√(2 ln(2K)) = 1. (62)

The corresponding MMSE, consisting of the granular distortion mmse_g and the overload distortion mmse_o, satisfies

lim_{K→∞} mmse/(ℓ∗²/12) = lim_{K→∞} mmse_g/(ℓ∗²/12) = 1; (63)

that is, the granular distortion dominates the MMSE as the resolution increases:

lim_{K→∞} mmse_o/mmse_g = 0. (64)

For (61) and (62), we also provide new proofs based on asymptotic results on the achievable rate; see Sec. VI-A.

Remark: In quantization theory, the "SNR" of a quantizer is often defined as the ratio between the variance of the quantizer input and the MSE. For the optimal uniform quantizer we denote

SNR_q = 1/mmse. (65)

Theorem 8 implies that the maximum saturation rate can be expressed as

Ī_GMI^∗ = log(1/mmse) = log SNR_q, (66)

and the corresponding effective SNR is given by

sup_{SNR>0} SNR_e = lim_{SNR→∞} SNR_e = SNR_q − 1. (67)

For finite SNR we have

SNR_e = (1 − mmse)SNR/(mmse · SNR + 1) = (1 − mmse)E_x/(mmse · E_x + σ²). (68)

In Table I,⁸ we show numerical results of optimal uniform quantization under our transceiver architecture (an approximation of mmse given by ln(2K)/(3K²), and an approximation of Ī_GMI^∗, denoted by Î̄_GMI, will be introduced in Sec. VI-B).

⁸Some numerical results therein have been given in [46, Table II], in which an error occurred in the actual SNR computation for Gaussian source, b = 8.

As the SNR decreases, numerical results in Fig. 5 show that the supremum and infimum of I_GMI have a decreasing ratio which converges to π/2, coinciding with the "2 dB loss" result for hard-decision decoding. Thus, a major part of the low-SNR capacity can always be utilized.
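The algebra connecting (55), (65), (67), and (68) is easy to check mechanically; a small sketch (Python, with helper names of our own) assuming only the formulas above:

```python
import math

def max_gmi_nats(snr, mmse):
    # maximum GMI (55): log(1 + SNR) - log(1 + mmse*SNR), in nats
    return math.log(1.0 + snr) - math.log(1.0 + mmse * snr)

def effective_snr(snr, mmse):
    # effective SNR (68)
    return (1.0 - mmse) * snr / (mmse * snr + 1.0)
```

For any SNR, log(1 + SNR_e) reproduces (55), and as the SNR grows, SNR_e increases and saturates at SNR_q − 1 = 1/mmse − 1, matching (67).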
TABLE I
Numerical Results of Optimal Parameters and Performance Metrics for Uniform Quantization
At moderate-to-high SNR (Figs. 5(c)-5(f)), we may roughly separate different scenarios of receiver quantization as follows, which show that channel estimation and AGC design become more challenging as the resolution decreases.
• SNR-limited (high resolution) scenario: b > 2 log₁₀ SNR + b₁ bits and SNR_q ≫ SNR, where b₁ can be 2–3. The irreducible loss (59) is negligible, and significant rate improvement can be obtained by increasing the SNR. In fact, we have

d(C − I_GMI^∗)/dSNR = mmse + o(mmse²), (69)

which suggests that the irreducible loss increases slowly with the SNR. A large overestimate of the optimal gain control factor does not cause significant loading loss, thereby allowing the simple gain control strategy described at the end of Sec. IV-A. In this scenario the impact of receiver quantization is not important. There is still room to reduce the resolution if accurate gain control is possible.
• Resolution-limited (low resolution) scenario: b ≤ 2 log₁₀ SNR bits and SNR_q < SNR. The achievable rate is seriously limited due to large irreducible loss, while accurate gain control is required to avoid large loading loss. Significant rate improvement can be obtained by increasing the resolution.
• Moderate resolution scenario (the remaining cases): The resolution is enough to maintain a small irreducible loss, but an overestimate of the optimal gain control factor may cause considerable loading loss.

We note that the consistency between rate maximization and MSE minimization does not hold in general; see discussions on nonuniform quantization in Sec. VII.

C. Geometry of Optimal Uniform Quantization at Receiver

We have shown that, in the standard transceiver architecture shown in Fig. 1, if the gain control factor g is set appropriately so that the loading factor of the quantizer q(·) is equal to the optimal value L∗, then the achievable rate attains its maximum given by the GMI-MMSE formula (55). In fact, a geometrical interpretation for such optimal uniform quantization can be established. For simplification we first introduce an equivalent model shown in Fig. 6, where we let X = hX, Y = σᵥY = Y·√((|h|²σₓ² + σ²)/2), V = Vᴿ + jVᴵ, Z = Z, and W = Y − V (we write W(V) in Fig. 6 to emphasize that W is a function of V), so that

Y = X + Z + W = V + W, (70)

where the quantization error W satisfies

E|W|² = MMSE = mmse · E|V|² = 2 · mmse · σᵥ². (71)

Thus, the high-resolution limit of Y is V, and the high-SNR limit of V is X.

Now we are ready to illustrate the geometry of optimal uniform quantization at the receiver in N-dimensional Euclidean space, as shown in Fig. 7. We use boldface letters to denote codewords or signal vectors in the equivalent model (70), e.g., X = [X₁, ..., X_N]. Since i.i.d. codebook is considered, the vector X and other vectors in Fig. 7 are all i.i.d. random vectors. Thus, we have E‖X‖² = N·E_x, and the empirical average power of the input codeword converges in probability to E_x, i.e.,

lim_{N→∞} Pr(|‖X‖²/N − E_x| > ε) = 0, ∀ε > 0. (72)

We thus briefly say that the length of X (in an asymptotic sense) is √(E‖X‖²) = √(N·E_x). Similarly, the length of Z is √(N·σ²). The geometry in Fig. 7 includes two Pythagorean relations as follows.
• For the additive noise channel V = X + Z, we have E⟨X, Z⟩ = 0, implying

E‖V‖² = E‖X‖² + E‖Z‖². (73)

• For the quantization channel Y = V + W, it has been shown in [67] that MMSE uniform quantization satisfies E⟨Y, W⟩ = 0; i.e., the error W is uncorrelated with the output of the quantizer, yielding

E‖V‖² = E‖Y‖² + E‖W‖². (74)
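The orthogonality E⟨Y, W⟩ = 0 behind the second Pythagorean relation (74) can be illustrated with a scalar Monte Carlo sketch (Python; helper names are ours). The step size 0.586 used below as the near-MSE-optimal step of an 8-level uniform quantizer for a unit Gaussian input is an assumption taken from classical quantization tables:

```python
import math
import random

def quantize(v, ell, K):
    # symmetric uniform mid-rise quantizer: levels +-(k + 1/2)*ell, K cells per side
    k = min(int(abs(v) / ell), K - 1)
    return math.copysign((k + 0.5) * ell, v)

random.seed(1)
samples = [random.gauss(0.0, 1.0) for _ in range(100000)]
K = 4                                   # 2K = 8 levels (b = 3)

def error_output_corr(ell):
    # empirical E[(V - q(V)) * q(V)]
    return sum((v - quantize(v, ell, K)) * quantize(v, ell, K)
               for v in samples) / len(samples)

corr_opt = error_output_corr(0.586)     # near-MSE-optimal step: error ~ uncorrelated
corr_bad = error_output_corr(0.30)      # step too small: heavy overload, clear correlation
```

At the MSE-optimal scaling the error-output correlation is statistically indistinguishable from zero, while a badly scaled quantizer shows a clearly positive correlation, so the right angle in Fig. 7 is specific to optimal loading.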
The following result shows that the rate loss due to only overload distortion decays exponentially as the loading factor increases. Asymptotically, the loading loss is directly proportional to the overload distortion, and is also directly proportional to the SNR.

Theorem 12: In the channel (4) under i.i.d. complex Gaussian codebook and nearest neighbor decoding rule (7), the rate loss due to uniform quantization with a loading factor L satisfies

C − I_GMI = (m̲se_o − (4φ²(L)/L²)(1 + o_L(1))) SNR nats/c.u. (95a)
= (4φ(L)/L³)(1 + o_L(1)) SNR nats/c.u. (95b)

in the high-resolution limit, where

m̲se_o := 2 ∫_L^∞ (t − L)² φ(t) dt (96)

is the infimum of the normalized overload distortion mse_o given in (49).

Proof: Combining (46c) and (47c), we obtain the high-resolution limit of γ as a function of L as

γ̄(L) := lim_{K→∞} γ (97a)
= (1/4 − ∫_L^∞ tQ(t)dt − (1/2 − Q(L))²) / (1/4 − ∫_L^∞ tQ(t)dt) (97b)
= (4 ∫_L^∞ (φ(t) − tQ(t)) dt − 4Q²(L)) / (1 − 4 ∫_L^∞ tQ(t)dt). (97c)

In the high-resolution limit, the granular distortion vanishes and the normalized MSE includes only the overload distortion, namely m̲se_o. From Proposition 5, (46c), and (47c), we obtain another expression of the overload distortion as

m̲se_o = 4 ∫_L^∞ (φ(t) − tQ(t)) dt. (98)

From [46, Lemma 7 and Eqn. A9], we can infer that⁹

m̲se_o = (4φ(L)/L³)(1 + o_L(1)). (99)

We thus obtain

γ̄(L) = (m̲se_o − 4Q²(L)) / (1 − 4 ∫_L^∞ tQ(t)dt) (100a)
= m̲se_o − (4φ²(L)/L²)(1 + o_L(1)) (100b)
= (4φ(L)/L³)(1 + o_L(1)), (100c)

where we utilize the fact

Q(t) = (φ(t)/t)(1 + o_t(1)), (101)

which follows from bounds for the Q function as [68]

φ(t)/t > Q(t) > (t/(1 + t²))φ(t). (102)

The proof is completed by combining (100) and (30).

⁹The derivation in [46] begins from the original form (96). However, we can also begin from (98) and confirm (99) directly by bounds of the Q-function [68], e.g., (1/(t + 1/t))φ(t) < Q(t) ≤ (4/(3t + √(t² + 8)))φ(t).

Remark: By the lower bound in (102), one can show that φ(t) − tQ(t) < φ(t)/t². Thus the integral in (98) can be upper bounded by

∫_L^∞ (φ(t)/t²) dt = L⁻¹φ(L) − Q(L). (103)

Using the lower bound in (102) again, we obtain

m̲se_o < 4φ(L)/L³, (104)

which implies that, as L increases, m̲se_o converges to 4φ(L)L⁻³ from below (cf. (99)).

On the other hand, the rate loss due to only granular distortion decays quadratically as the step size vanishes. To prove this we need the following lemma, which is one of the various forms of the Euler-Maclaurin summation formula [60].

Lemma 13: For a real-valued continuously differentiable function f(t) defined on [a, b], we have

∫_a^b f(t)dt = ℓ (f(a)/2 + Σ_{k=1}^{K−1} f(a + kℓ) + f(b)/2) − (ℓ²/12)(f′(b) − f′(a)) + o(ℓ²), (105)

where ℓ = (b − a)/K.

This lemma characterizes the error of numerical integration using the composite trapezoidal rule with an evenly spaced (uniform) grid. Based on Lemma 13, the following result can be obtained.

Theorem 14: In the channel (4) under i.i.d. complex Gaussian codebook and nearest neighbor decoding rule (7), the rate loss due to uniform quantization with a step size ℓ satisfies

C − I_GMI = (m̄se_g + o(ℓ²)) SNR nats/c.u. (106a)
= (ℓ²/12 + o(ℓ²)) SNR nats/c.u. (106b)

in the high-resolution limit, where

m̄se_g := lim_{K→∞} mse_g (107)

is the supremum of the normalized granular distortion mse_g given in (50).

Proof: Applying the Euler-Maclaurin summation formula, we obtain

mse = m̄se_g = ℓ²/12 + o(ℓ²) (110)

and

γ̄ = 1 − (π/2)(1 + o(ℓ²)) / (π/2 + πℓ²/24 + o(ℓ²)) = ℓ²/12 + o(ℓ²). (111)

The proof of (106a) is completed by combining (110) and (30), while the proof of (106b) is completed by combining (111) and (30).

In Theorem 12 and Theorem 14 we assume output quantization with unlimited resolution. We now check whether they provide useful approximations of rate loss under finite resolutions. In Fig. 8, we show the rate loss C − I_GMI due to 12-bit output quantization. According to Theorem 12, a small irreducible loss in the overload region can be approximated by

C − I_GMI(L) ≈ 4φ(L)L⁻³ SNR nats/c.u., (112)

and according to Theorem 14, a small irreducible loss in the underload region can be approximated by

C − I_GMI(L) ≈ (L²/12K²) SNR nats/c.u. (113)

Clearly, these approximations successfully capture the decay of rate loss when it is dominated by the overload distortion. When the loading factor L exceeds 4, the increasing granular distortion kicks in and quickly dominates the performance. In this case the rate loss is well approximated by L²SNR/12K². When the resolution decreases, e.g., in Fig. 9 where the resolution is 6 bits, the approximation (112) becomes less useful, especially at high SNR. But the approximation (113) is still satisfactory. For higher accuracy, by considering both the overload distortion and the granular distortion, we propose an approximation given by

C − I_GMI(L) ≈ log(1 + m̂se · SNR), (114)

where

m̂se = 4φ(L)/L³ − 4φ²(L)/L² + L²/12K², (115)

in which the first two terms of the RHS come from (95a), and the last term comes from (106). As shown in Fig. 9, the proposed formula well approximates the transition from the overload region to the underload region.

Fig. 9. Approximations of small irreducible rate loss: b = 6 bits.
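Both asymptotic regimes above can be checked numerically. The limiting overload distortion (96) has the closed form 2(1 + L²)Q(L) − 2Lφ(L) (our own integration of (96) by parts), and keeping only the leading overload and granular terms of the combined approximation locates the optimal loading factor. A sketch (Python; helper names ours):

```python
import math

def phi(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Q(t):
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def overload_mse(L):
    # closed form of (96): 2 * integral_L^inf (t - L)^2 phi(t) dt
    return 2.0 * (1.0 + L * L) * Q(L) - 2.0 * L * phi(L)

def overload_asymptote(L):
    # leading-order behavior (99): 4*phi(L)/L^3
    return 4.0 * phi(L) / L ** 3

# convergence from below toward the asymptote, cf. (99) and (104)
ratios = [overload_mse(L) / overload_asymptote(L) for L in (3.0, 4.0, 5.0, 6.0)]

def mse_hat_leading(L, K):
    # leading overload term plus the granular term of (106)
    return overload_asymptote(L) + L * L / (12.0 * K * K)

K = 32                                          # b = 6 bits
grid = [2.0 + 0.001 * i for i in range(3001)]   # loading factors in [2, 5]
L_star = min(grid, key=lambda L: mse_hat_leading(L, K))
```

The ratios increase toward 1 while staying below it, and for b = 6 the minimizing loading factor lands near 3.4, with a minimum of the same order as the high-resolution MMSE approximation ln(2K)/(3K²).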
dI_GMI/dL ≈ −SNR · dγ/dL, (117)
Remark: To approach I_GMI^∗, it is sufficient to use a simple linear approximation of L∗ as

L̂_lin = (b + 4)/3. (136)

Fig. 12 shows that it is accurate for 2 ≤ b ≤ 7. When b ≥ 8, its accuracy degrades, but the loading loss caused is negligible since it grows quadratically with L; see numerical results in Fig. 4.

B. Maximum Achievable Rate Approximations and Per-Bit Rules

We have found a simple relationship between I_GMI^∗ and mmse in Theorem 8, and we have also given numerical results of mmse for b = 1, 2, ..., 12 in Table I. However, a direct connection between I_GMI^∗ and the resolution is still very useful. The following result provides such a connection via approximating mmse by b.

Proposition 16: The achievable rate given in (55) satisfies

I_GMI^∗ = Î_GMI(b) + o_b(1), (137)

where

Î_GMI(b) = C − log(1 + (4b ln 2/(3·4^b)) SNR). (138)

Proof: From (62) and (63) we obtain

lim_{K→∞} (3K²/ln(2K)) mmse = 1, (139)

which yields a high-resolution approximation of the MMSE given by

mmse = (1 + o_b(1)) (4b ln 2)/(3·4^b) ≈ (4b ln 2)/(3·4^b). (140)

If we replace mmse in (55) by 4b ln 2/(3·4^b), then we obtain Î_GMI(b). According to (140), it is straightforward to show that the gap between I_GMI^∗ and Î_GMI(b) is o_b(1) for an arbitrary finite SNR, thereby completing the proof.

The approximation (140) implies a 6-dB-per-bit rule as

10 log₁₀ SNR_q ≈ 6.02b − 10 log₁₀ b + 0.34 (dB), (141)

which has been known since [46]. Therefore, in the high-resolution regime, each additional bit of resolution reduces the MSE by a factor of four:

lim_{b→∞} mmse(b − 1)/mmse(b) = 4. (142)

The approximation (140) is still not accurate enough for moderate to low resolutions; see Table I. More approximations for the MMSE or SNR_q can be found in [46, Sec. IV] based on refined asymptotic formulas of L∗ and its asymptotic relationship with the MMSE. However, using (140) is enough to get the simple and useful approximation of I_GMI^∗ in Proposition 16. In Fig. 13 we compare I_GMI^∗ with its approximation Î_GMI. It is shown that the accuracy is acceptable when b ≥ 2.

We next introduce some per-bit rules for other performance metrics. The following one is obtained from Proposition 16 immediately.
• A 2-bpcu-per-bit rule for saturation rate: The saturation rate Ī_GMI^∗(b) given in (66) satisfies

Ī_GMI^∗(b) = Î̄_GMI(b) + o(1), (143)

where

Î̄_GMI(b) = 2b − log₂ b + 0.11 bits/c.u., (144)

which implies that

lim_{b→∞} (Ī_GMI^∗(b) − Ī_GMI^∗(b − 1)) = 2 bits/c.u. (145)
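The constants in (141) and (144) follow from taking logarithms in (140); a sketch (Python, helper names ours) confirming the per-bit rules over a range of resolutions:

```python
import math

def mmse_approx(b):
    # high-resolution approximation (140): 4*b*ln2 / (3*4^b)
    return 4.0 * b * math.log(2.0) / (3.0 * 4.0 ** b)

def snr_q_db(b):
    # quantizer SNR (65) in dB, under approximation (140)
    return 10.0 * math.log10(1.0 / mmse_approx(b))

def snr_q_db_rule(b):
    # 6-dB-per-bit rule (141)
    return 6.02 * b - 10.0 * math.log10(b) + 0.34

def sat_rate_rule(b):
    # saturation-rate rule (144), in bits/c.u.
    return 2.0 * b - math.log2(b) + 0.11
```

For 2 ≤ b ≤ 12 the rules (141) and (144) track the approximation (140) to well within 0.05 dB and 0.05 bits/c.u., and the per-bit MSE ratio approaches the factor of four in (142).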
Fig. 14. Normalized MMSE and saturation rate under different resolutions.

Fig. 16. For irreducible rate loss, a 6-dB-per-bit rule is inaccurate.
Numerical evaluations of Î̄_GMI(b) for 1 ≤ b ≤ 12 are shown in Table I. In Fig. 14, it is shown that mmse decreases exponentially as the resolution increases, and correspondingly, the saturation rate Ī_GMI^∗ grows approximately linearly with the resolution. The improvement per bit is 1.6-1.9 bits/c.u. in our range of interest, although the high-resolution limit is 2 bits/c.u.

The second per-bit rule can be observed from results in Table I, suggesting a roughly 5-dB-per-bit increase of SNR_q:

10 log₁₀ SNR_q ≈ 5b. (146)

Since the rate loss in (55) is determined by SNR/SNR_q, (146) implies that, in our range of interest, a 5 dB increase of SNR requires an extra bit of resolution to maintain the same rate loss. More specifically, we have the following rule.
• A 5-dB-per-bit rule for irreducible rate loss: We require a resolution of at least

2 log₁₀ SNR + b₀ (147)

bits so that the irreducible loss C − I_GMI^∗ can be as small as 10^{−b₀} bits/c.u., where b₀ ≥ 0.

This rule can be confirmed by numerical results in Fig. 15. Although the 6-dB-per-bit rule is well known in quantization theory and gives the correct asymptotics (as shown in (141)), it is less accurate unless the rate loss is extremely small; see Fig. 16.

VII. DISCUSSIONS ON FURTHER QUANTIZATION RULES AT THE RECEIVER

Based on the analytical framework given in Theorem 2, a major part of this paper has focused on the simplest receiver quantization scheme, namely scalar uniform quantization with mid-rise levels. This section briefly discusses nonuniform and other types of quantization rules in several aspects.

A. Relationship Between MSE and Achievable Rate: Numerical Examples

For the channel (4) with a uniform quantizer q(·) whose resolution is at least two bits, we have shown that the unique gain control factor that minimizes the MSE also maximizes the GMI. When non-uniform quantization rules are allowed, one
Fig. 17. GMI, MSE and γ under gain control: SNR = 10 dB. (a) Uniform quantizer: thresholds 0, ±0.25, ±0.5, ±0.75; levels ±0.125, ±0.375, ±0.625, ±0.875. (b) Optimal nonuniform quantizer: thresholds 0, ±0.5006, ±1.0500, ±1.7480; levels ±0.2451, ±0.7560, ±1.3440, ±2.1520. (c) Nonuniform quantizer: thresholds 0, ±0.2, ±0.6, ±0.7; levels ±0.1, ±0.5, ±0.7, ±0.9. (d) Part of (a): 2.1 ≤ L ≤ 2.6. (e) Optimized quantizer with equispaced thresholds: thresholds 0, ±0.25, ±0.5, ±0.75; levels ±0.1224, ±0.3673, ±0.6122, ±0.9693. (f) Nonuniform quantizer: thresholds 0, ±0.2, ±0.6, ±0.7; levels ±0.1, ±0.9, ±0.8, ±0.2.
may vary the thresholds {ℓ_k} and levels {y_k} of q(·) (possibly under some constraints) to reduce γ so that the GMI (19) can be optimized, yielding thresholds {ℓ_k^∗} and levels {y_k^∗}. However, when this optimized quantization rule is applied in a given channel, to achieve that optimized GMI, we need to set the gain control factor g appropriately according to σᵥ (the standard deviation of the quantizer input) to guarantee that the thresholds satisfy ℓ_k = l_k/g = ℓ_k^∗ for 1 ≤ k < K. Otherwise, the performance will be degraded. We next explore the impact of gain control on the MSE and the GMI of the channel (4) under nonuniform quantization rules by examples; see numerical results given in Fig. 17, where we set SNR = 10 dB. The examples are described as follows, each of which satisfies 2K = 8, i.e., b = 3.
• The uniform quantizer (with equispaced thresholds and mid-rise levels); see Fig. 17(a) and Fig. 17(d).
• The optimal nonuniform quantizer: the thresholds and levels are optimized to maximize the GMI; see Fig. 17(b).
• Two nonuniform quantizers without optimization: the first one has monotonically increasing levels, and the second one is highly nonlinear since its levels are no longer monotonically increasing; see Fig. 17(c) and Fig. 17(f), respectively.
• An optimized quantizer with equispaced thresholds: its thresholds and levels are optimized to maximize the GMI under the constraint that the thresholds must be equispaced; see Fig. 17(e).

From Fig. 17(a), we confirm that, under uniform quantization, the parameter γ converges to mmse when the loading factor approaches its optimal value L∗ (in this example it is 2.3441); see also Fig. 17(d). Interestingly, this consistency also occurs in Fig. 17(b), where we use the optimal 8-level quantizer obtained numerically in [30] by Lloyd's algorithm [70]. But for the nonuniform quantizer given in Fig. 17(c) and the optimized quantizer with equispaced thresholds (also obtained in [30]) given in Fig. 17(e), the gain control factor that maximizes the GMI and the one that minimizes the MSE are different. The highly nonlinear quantizer given in Fig. 17(f) provides an example of unusual behavior of I_GMI(g), which achieves its maximum when the MSE is relatively large.

Remark: In achievable rate evaluation, if we use the additive noise model of the quantizer, i.e., treating the MSE as additive Gaussian noise, then we obtain an estimate as

R̂ = log(1 + E_x/(σ² + 2 · mse · σᵥ²)) (148)
= log(1 + SNR) − log(1 + (mse/(1 + mse))SNR), (149)

which may cause underestimate or overestimate. In particular, it overestimates the achievable rate when the optimal uniform quantization is used, while it may significantly underestimate the achievable rate when mse is much larger than γ, which is possible; see the numerical results in Fig. 17 when the gain control factor (or loading factor) is far away from its optimal
regime is still 2.72 times larger than the theoretical limit implied by the rate-distortion function of a Gaussian source [42].¹³ If we allow coded uniform scalar quantization (i.e., representing the quantizer output by a variable-rate lossless code), the gap can be reduced to only 1.53 dB (πe/6) worse than the theoretical limit (see [41] and references therein). Unfortunately, coded quantization does not help in communication receivers since the bottleneck therein is the limited resolution of the quantizer rather than the cost of representing its output. Applying vector quantization to communication receivers is a more challenging topic. Nevertheless, the gap to the fundamental limit reminds us that it is possible to alleviate the ADC resolution bottleneck in communication receivers by exploring new quantization mechanisms.

¹³The comparison is made under the assumption that the signal at the receiver front-end is represented by b bits per channel use before being fed into the decoder. The MSE of the representation can be reduced by lossy source coding (treating the received signal as a source). A scalar quantizer is a very simple lossy source coding scheme.

VIII. CONCLUDING REMARKS

The goal of this study is to evaluate the impact of resolution reduction on information-theoretic limits of communications with receiver quantization. Leveraging the GMI as a basic tool, which enables us to take the decoding rule into consideration, we establish an array of exact and asymptotic results under a standard transceiver architecture. Our results indicate a critical issue in system design that arises as the resolution decreases, namely, optimizing the loading factor by gain control, which minimizes the loss in achievable rate by eliminating the loading loss. The remaining irreducible loss is an appropriate evaluation of the inherent robustness of the considered transceiver architecture. Our results also establish explicit connections between the MSE (affected by the gain control) and the rate loss. Although for general receiver quantization rules a smaller MSE does not necessarily imply a smaller rate loss, for the commonly used uniform quantizer we prove that the unique loading factor that minimizes the MSE also maximizes the GMI (i.e., achieves zero loading loss). For perfect gain control, we show that the irreducible loss is determined by the product of the normalized MMSE and the SNR, and provide a geometrical interpretation for this result. Performance approximations and per-bit rules in this case are also given. For imperfect gain control, to understand its impact, we characterize the decay of small rate loss in the high-resolution regime, and propose approximations of the achievable rate as a function of the loading factor which are fairly accurate for moderate resolutions. These results provide insight into transceiver design with nonnegligible quantization effects, especially the choice of quantizer resolution and the design of AGC.

A limitation of this work is that the obtained analytical results are derived under the Gaussian codebook. Like many classical information-theoretic results given by mutual information, for the GMI it is also not easy to get closed-form or analytical results without assuming Gaussian input. For practical systems with finite alphabets, we can expect that our results apply well for Gaussian-like constellations that come from constellation shaping. But it is natural to ask whether our results are still useful in the cases of QAM, phase-shift keying (PSK), and other commonly used inputs, which are all bounded (unlike the unbounded Gaussian input). Since in these cases Proposition 1 does not apply, the GMI evaluation may rely heavily on numerical computation. We leave this problem to future study. Nevertheless, the insight gained in this work will be very helpful for the finite-alphabet case.

Finally, we list some topics that can be addressed following our information-theoretic framework of receiver quantization.
• Orthogonal frequency division multiplexing (OFDM), which typically generates Gaussian-like signals. Therein, a new effect is that the nonlinear distortion due to quantization leads to intercarrier interference. The problem becomes more complicated when the channel introduces time dispersion.
• Multiuser channels, especially the Gaussian multiple-access channel. In this case a new phenomenon due to receiver quantization is that, when successive interference cancellation (SIC) is used to decode a user, there exists residual interference from other users which may significantly reduce the achievable rate of that user.
• Multiantenna channels. We need an extended version of the proposed analytical framework which applies for different receiver architectures (e.g., MMSE receiver, maximal ratio combining receiver).

APPENDIX A
PROOF OF PROPOSITION 4

We first note that all the adjustable thresholds in {ℓ_k = l_k/g, k = 1, ..., K − 1} tend to zero as g → ∞, and they tend to infinity as g → 0. For the case of g → ∞, from (21) and (22) we have lim_{g→∞} A/y_K = √(2π)φ(0) and lim_{g→∞} B/y_K² = πQ(0). For the case of g → 0, noting that A and B can be expressed by

A = y₁ + √(2π) Σ_{k=1}^{K−1} φ(ℓ_k)(y_{k+1} − y_k) (157)

and

B = (π/2)y₁² + π Σ_{k=1}^{K−1} Q(ℓ_k)(y_{k+1}² − y_k²), (158)

respectively, it is direct to check that lim_{g→0} A/y₁ = 1 and lim_{g→0} B/y₁² = π/2. Therefore, in both cases γ tends to 1 − 2/π and the limits of the GMI are the same, I_GMI^{1-bit}.

APPENDIX B
PROOF OF PROPOSITION 5

We begin from the definition (39), in which the quantization rule is given by (8) and is equivalent to

q(gV) = y_k · sgn(V), if ℓ_{k−1} ≤ v < ℓ_k, (159)

where v = |V|/σᵥ with V/σᵥ ∼ N(0, 1). Utilizing the symmetries of the input distribution and the quantization rule with respect to the
origin, we have

mse = 2 Σ_{k=1}^{K} ∫_{ℓ_{k−1}}^{ℓ_k} y_k² φ(v)dv − 2 Σ_{k=1}^{K} ∫_{ℓ_{k−1}}^{ℓ_k} 2y_k v φ(v)dv + 1 (160a)
= 1 − 4 Σ_{k=1}^{K} y_k (φ(ℓ_{k−1}) − φ(ℓ_k)) + 2 Σ_{k=1}^{K} y_k² (Q(ℓ_{k−1}) − Q(ℓ_k)) (160b)
= 1 − (2/π)(√(2π)A − B). (160c)

The lower bound and condition of equality follow from

mse − γ = (1/B)(A − √(2/π)B)² ≥ 0. (161)

APPENDIX C
PROOF OF PROPOSITION 6

According to (21), we can derive (44) by straightforward computation as

A = Σ_{k=1}^{K−1} (k − 1/2)ℓ (exp(−(k−1)²ℓ²/2) − exp(−k²ℓ²/2)) + (K − 1/2)ℓ exp(−(K−1)²ℓ²/2) (162a)
= Σ_{k=1}^{K−1} kℓ (exp(−(k−1)²ℓ²/2) − exp(−k²ℓ²/2)) − (ℓ/2) Σ_{k=1}^{K−1} (exp(−(k−1)²ℓ²/2) − exp(−k²ℓ²/2)) + Kℓ exp(−(K−1)²ℓ²/2) − (ℓ/2) exp(−(K−1)²ℓ²/2) (162b)
= Σ_{k=0}^{K−1} ℓ exp(−k²ℓ²/2) − ℓ/2, (162c)

and according to (22), we obtain (45) as

B = π Σ_{k=1}^{K−1} (k − 1/2)²ℓ² (Q((k−1)ℓ) − Q(kℓ)) + π(K − 1/2)²ℓ² Q((K−1)ℓ) (163a)
= πℓ² (Σ_{k=1}^{K−1} k² (Q((k−1)ℓ) − Q(kℓ)) − Σ_{k=1}^{K−1} k (Q((k−1)ℓ) − Q(kℓ)) + (1/4) Σ_{k=1}^{K−1} (Q((k−1)ℓ) − Q(kℓ))) + π(K − 1/2)²ℓ² Q((K−1)ℓ) (163b)
= πℓ² (Σ_{k=0}^{K−1} (2k + 1)Q(kℓ) − Σ_{k=0}^{K−1} Q(kℓ) + 1/8) (163c)
= π Σ_{k=0}^{K−1} 2kℓ² Q(kℓ) + πℓ²/8. (163d)

APPENDIX D
PROOF OF LEMMA 7

Let K > 1. According to Theorem 2, the LHS of (51) is equivalent to dγ/dℓ = 0, or

dA/dℓ = (A/2B) dB/dℓ. (164)

According to Proposition 5, the RHS of (51) is equivalent to

dA/dℓ = (1/√(2π)) dB/dℓ. (165)

Let

C = Σ_{k=0}^{K−1} ℓ · k²ℓ² · exp(−k²ℓ²/2), (166)

which is strictly positive when K > 1. Noting that

ℓ · dA/dℓ = Σ_{k=0}^{K−1} ℓ exp(−k²ℓ²/2) − ℓ/2 − Σ_{k=0}^{K−1} ℓ · k²ℓ² · exp(−k²ℓ²/2) (167a)
= A − Σ_{k=0}^{K−1} ℓ · k²ℓ² · exp(−k²ℓ²/2) (167b)
= A − C (167c)

and

(1/√(2π)) ℓ · dB/dℓ = √(2π) Σ_{k=0}^{K−1} 2kℓ² Q(kℓ) + (√(2π)/8)ℓ² − Σ_{k=0}^{K−1} ℓ · k²ℓ² exp(−k²ℓ²/2) (168a)
= √(2/π) B − Σ_{k=0}^{K−1} ℓ · k²ℓ² exp(−k²ℓ²/2) (168b)
= √(2/π) B − C, (168c)

it is direct to check that both (164) and (165) are equivalent to (52), thereby completing the proof.

APPENDIX E
PROOF OF PROPOSITION 11

First, (80) and (81) can be obtained by the Pythagorean relations (73) and (74), respectively. We begin from (24c) and (25) and rewrite them as

E[X*Y] = √(2/π) A · E|X|² = √(2/π) A · E[X*V] (169)
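The differential identities (167c) and (168c) admit a quick finite-difference check; a sketch (Python, helper names ours) at K = 8, ℓ = 0.4:

```python
import math

def Q(t):
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def A_fun(ell, K):
    # closed form (162c)
    return sum(ell * math.exp(-0.5 * (k * ell) ** 2) for k in range(K)) - 0.5 * ell

def B_fun(ell, K):
    # closed form (163d)
    return math.pi * sum(2 * k * ell ** 2 * Q(k * ell) for k in range(K)) + math.pi * ell ** 2 / 8.0

def C_fun(ell, K):
    # definition (166)
    return sum(ell * (k * ell) ** 2 * math.exp(-0.5 * (k * ell) ** 2) for k in range(K))

K, ell, h = 8, 0.4, 1e-5
dA = (A_fun(ell + h, K) - A_fun(ell - h, K)) / (2.0 * h)
dB = (B_fun(ell + h, K) - B_fun(ell - h, K)) / (2.0 * h)
lhs_a = ell * dA                               # should equal A - C, cf. (167)
lhs_b = ell * dB / math.sqrt(2.0 * math.pi)    # should equal sqrt(2/pi)*B - C, cf. (168)
```

Both central differences match the right-hand sides to within numerical precision, which in turn makes the equivalence of (164) and (165) with (52) transparent: setting either derivative condition reduces to A/B = √(2/π) once C cancels.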
[41] R. M. Gray and D. L. Neuhoff, "Quantization," IEEE Trans. Inf. Theory, vol. 44, no. 6, Oct. 1998.
[42] T. Berger, Rate Distortion Theory: A Mathematical Basis for Data Compression. Englewood Cliffs, NJ: Prentice-Hall, 1971.
[43] B. M. Oliver, J. R. Pierce, and C. E. Shannon, "The philosophy of PCM," Proc. IRE, vol. 36, no. 11, pp. 1324–1331, Nov. 1948.
[44] W. R. Bennett, "Spectra of quantized signals," Bell Syst. Tech. J., vol. 27, no. 3, pp. 446–472, 1948.
[45] H. Gish and J. N. Pierce, "Asymptotically efficient quantizing," IEEE Trans. Inf. Theory, vol. IT-14, pp. 676–683, Sept. 1968.
[46] D. Hui and D. L. Neuhoff, "Asymptotic analysis of optimal fixed-rate uniform scalar quantization," IEEE Trans. Inf. Theory, vol. 47, no. 3, Mar. 2001.
[47] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Boston, MA: Kluwer, 1992.
[48] N. S. Jayant and P. Noll, Digital Coding of Waveforms: Principles and Applications to Speech and Video. Englewood Cliffs, NJ: Prentice-Hall, 1984.
[49] S. Na and D. L. Neuhoff, "Asymptotic MSE distortion of mismatched uniform scalar quantization," IEEE Trans. Inf. Theory, vol. 58, no. 5, pp. 3169–3171, May 2012.
[50] S. Na and D. L. Neuhoff, "On the convexity of the MSE distortion of symmetric uniform scalar quantization," IEEE Trans. Inf. Theory, vol. 64, no. 4, pp. 2626–2638, Apr. 2018.
[51] S. Na and D. L. Neuhoff, "Monotonicity of step sizes of MSE-optimal symmetric uniform scalar quantizers," IEEE Trans. Inf. Theory, vol. 65, no. 3, pp. 1782–1792, Mar. 2019.
[52] F. Sun, J. Singh, and U. Madhow, "Automatic gain control for ADC-limited communication," in Proc. IEEE Global Telecommun. Conf. (GLOBECOM), Miami, FL, USA, Dec. 2010.
[53] S. Krone and G. Fettweis, "Optimal gain control for single-carrier communications with uniform quantization at the receiver," in Proc. 2010 Int. Conf. Acoust. Speech Sig. Process. (ICASSP'10), Dallas, USA, Mar. 2010.
[54] D. Marco and D. L. Neuhoff, "The validity of the additive noise model for uniform scalar quantizers," IEEE Trans. Inf. Theory, vol. 51, no. 5, pp. 1739–1755, May 2005.
[55] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. New York: Cambridge University Press, 2005.
[69] I. Mező and Á. Baricz, "On the generalization of the Lambert W function," Trans. Amer. Math. Soc., vol. 369, no. 11, pp. 7917–7934, Nov. 2017.
[70] S. Lloyd, "Least squares quantization in PCM," IEEE Trans. Inf. Theory, vol. IT-28, no. 2, pp. 129–137, Mar. 1982.
[71] J. L. Massey, "Coding and modulation in digital communication," in Proc. Zurich Sem. Digital Commun., vol. 2, no. 1, 1974.
[72] Q. Yu and M. Médard, "The asymptotic solutions of the capacity maximal quantization problem," in Proc. 2015 IEEE 82nd Veh. Technol. Conf. (VTC2015-Fall), 2015.
[73] P. F. Panter and W. Dite, "Quantization distortion in pulse-count modulation with nonuniform spacing of levels," Proc. IRE, vol. 39, pp. 44–48, Jan. 1951.
[56] A. Lapidoth, “Nearest neighbor decoding for additive non-Gaussian
noise channels,” IEEE Trans. Inf. Theory, vol. 42, no. 5, pp. 1520–1529,
May 1996.
[57] W. Zhang, Y. Wang, C. Shen, and N. Liang, “A regression approach
to certain information transmission problems,” IEEE J. Select. Areas
Commun., vol. 37, no. 11, pp. 2517–2531, Nov. 2019.
[58] Y. Wu, L. M. Davis, and R. Calderbank, “On the capacity of the discrete-
time channel with uniform output quantization,” in Proc. 2009 IEEE Int.
Symp. Inf. Theory (ISIT’09), Seoul, Korea, 2009, pp. 2194–2198.
[59] G. D. Forney and G. Ungerboeck, “Modulation and coding for linear
Gaussian channels,” IEEE Trans. Inf. Theory, vol. 44, no.6, pp. 2384–
2415, Oct. 1998.
[60] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis. 3rd edition.
New York, NY: Springer.
[61] E. N. Gilbert, “Increased information rate by oversampling,” IEEE Trans.
Inf. Theory, vol. 39, no. 6, pp. 1973–1976, Nov. 1993.
[62] S. Shamai (Shitz), “Information rates by oversampling the sign of a
bandlimited process,” IEEE Trans. Inf. Theory, vol. 40, no. 4, pp. 1230–
1236, July 1994.
[63] T. Koch and A. Lapidoth, “Increased capacity per unit-cost by over-
sampling,” in Proc. IEEE 26th Conv. Elect. Electron. Eng. Israel, Eilat,
Israel, Nov. 2010, pp. 684–688.
[64] L. T. N. Landau, M. Dörpinghaus, and G. P. Fettweis, “1-bit quantiza-
tion and oversampling at the receiver: Sequence-based communication,”
EURASIP J. Wirel. Commun. Netw., vol. 2018, no. 1, pp. 83, Apr. 2018.
[65] R. Deng, J. Zhou, and W. Zhang, “Bandlimited communication with one-
bit quantization and oversampling: Transceiver design and performance
evaluation,” IEEE Trans. Commun., vol. 69, no. 2, pp. 845-862, Feb.
2021.
[66] B. Hassibi and B. M. Hochwald, “How much training is needed in
multiple-antenna wireless links?” IEEE Trans. Inf. Theory, vol. 49, no.4,
pp. 951–963, Apr. 2003.
[67] J. A. Bucklew and N. C. Gallagher, “Some properties of uniform step
size quantizers,” IEEE Trans. Inf. Theory, vol. IT-26, no. 5, pp. 610–613,
Sept. 1980.
[68] P. Borjesson and C.-E. Sundberg, “Simple approximations of the error
function Q (x) for communications applications,” IEEE Trans. Commun.,
vol. 27, no. 3, pp. 639–643, Mar. 1979.