Distributed Channel Estimation and Pilot Contamination Analysis For Massive MIMO-OFDM Systems
arXiv:1507.08150v1 [cs.IT] 29 Jul 2015

Abstract: Massive MIMO communication systems, by virtue of utilizing a very large number of antennas, have the potential to yield higher spectral and energy efficiency than conventional MIMO systems. In this paper, we consider uplink channel estimation in massive MIMO-OFDM systems with frequency selective channels. With an increased number of antennas, the channel estimation problem becomes very challenging as an exceptionally large number of channel parameters have to be estimated. We propose an efficient distributed linear minimum mean square error (LMMSE) algorithm that can achieve near optimal channel estimates at very low complexity by exploiting the strong spatial correlations and symmetry of large antenna array elements. The proposed method involves solving a (fixed) reduced dimensional LMMSE problem at each antenna followed by a repetitive sharing of information through collaboration among neighboring antenna elements. To further enhance the channel estimates and/or reduce the number of reserved pilot tones, we propose a data-aided estimation technique that relies on finding a set of most reliable data carriers. We also analyse the effect of pilot contamination on the mean square error (MSE) performance of different channel estimation techniques. Unlike the conventional approaches, we use stochastic geometry to obtain an analytical expression for the interference variance (or power) across OFDM frequency tones and use it to derive the MSE expressions for different algorithms under both noise and pilot contaminated regimes. Simulation results validate our analysis and the near optimal MSE performance of the proposed estimation algorithms.

Index Terms: Channel estimation, massive MIMO, stochastic geometry, OFDM, LMMSE.

I. INTRODUCTION

In wireless communications, the demand for higher data rates has been dramatically increasing, mostly owing to the unprecedented usage of data-hungry devices e.g., smart-phones, super-phones, tablets etc., for wireless multimedia applications [1]. Over the years, the MIMO technology (that exploits multiple antennas at the transmitter and/or receiver) has played a pivotal role in sustaining the increased data rates. Installing multiple antennas offers key advantages such as multiplexing gain and diversity gain due to increased spatial reuse [2], [3]. The MIMO technology has already been incorporated into many wireless products and standards such as WiFi IEEE 802.11n [4], WiMAX IEEE 802.16e [5], LTE (4G) [6].

Recently, it was established that the use of very large antenna arrays, typically of the order of a few hundred, at the base station (BS) can potentially provide huge gains in system throughput, energy efficiency, security and robustness of wireless communication systems [7]. Such systems, known as massive MIMO or large scale MIMO systems [8]–[10], overcome many limitations of traditional MIMO systems. Massive MIMO increases system capacity by simultaneously serving tens of users using the same time-frequency resources. Moreover, the large number of low power active antennas makes it possible to focus energy in a small spatial region by forming a sharp beam towards desired users. This additionally implies that there will be little intra-cell interference [9]. Because of these vital advantages, massive MIMO has attracted a lot of research interest and is envisioned as an enabling technology for next generation (5G) wireless communications [11].

Hand in hand with the advantages are entirely new research challenges that need to be tackled for massive MIMO. The bottleneck in achieving the full advantages of massive MIMO is the accurate estimation of the channel impulse response (CIR) for each transmit-receive antenna pair. Having a very large number of antennas means that a significant number of channel coefficients need to be estimated, far more than could be handled by traditional pilot-based MIMO channel estimation techniques (see [12] and references therein). In this regard, the Bayesian minimum mean square error (MMSE) estimator provides an optimal estimate in the presence of additive white Gaussian noise (AWGN). The method is complex and therefore, a number of approaches have been developed to reduce its complexity such as those proposed in [13]–[17]. Unlike the least squares (LS) or interpolation based techniques [18], the MMSE estimation has a clear edge in that it can effectively utilize the channel statistics to improve the estimation accuracy. However, the direct generalization of these techniques to massive MIMO has some drawbacks. In particular, they suffer from huge complexity due to matrix inversion of very large dimensionality, making them impractical. Some methods to reduce the complexity of the MMSE estimator in massive MIMO have also been proposed e.g., [19]–[24]. It is important to note that most of the existing methods make assumptions that are not always true. For example, many methods deal with flat fading channels only while others assume that the channels are sparse. Therefore, low complexity channel estimation approaches suited to multi-cell and multi-carrier massive MIMO systems need further investigation.

In this paper we propose a distributed algorithm for the estimation of correlated Rayleigh fading channels in massive MIMO-OFDM systems. The novel distributed LMMSE algorithm significantly reduces the computational complexity while attaining near optimal CIR estimates. The distributed approach is inspired by our previous work in [25] (where channels are
assumed to be sparse and exhibit common support with the neighboring antennas). Furthermore, in order to enhance the estimation performance, we also propose a data-aided estimation technique that relies on finding a set of most reliable data carriers to increase the number of measurements, instead of increasing the reserved pilot tones [26]. Equivalently, by using the data-aided technique, the number of reserved pilot tones can be reduced to attain a performance that is comparable to pilot-based estimation, thus increasing the spectral efficiency.

In a multi-cell setting, allocation of orthogonal pilot sequences for all users cannot be guaranteed due to the finite coherence time of the channel and the limited available bandwidth [7]. Therefore, it is inevitable to reuse the pilot sequences across the cells. One of the major consequences of pilot reuse is that when the BS in a cell is performing channel estimation via uplink training, the channel estimates will be severely distorted (contaminated) by the pilots of the neighboring cell users. The impact of pilot contamination on channel estimation is far greater than that of AWGN. In fact, it was shown in [27] that the effect of uncorrelated interference and fast Rayleigh fading diminishes as the number of BS antennas increases while the effect of pilot contamination is not eliminated. Hence, it is important to investigate the effect of pilot contamination on the MSE performance of different channel estimation techniques.

Although the effect of pilot contamination on system performance has been analysed by many researchers e.g., [28], [29], only few studies have analysed its impact on channel estimation performance [20]. Moreover, in these works, the analysis is carried out for fixed locations of (interfering) users. It can also be seen from the analytical expressions derived in these works that the pathloss, which is determined by the users' locations, plays an important role in MSE performance evaluation. As such, the above works cannot analytically answer how the randomness of users' locations would affect MSE performance under pilot contamination. In contrast to existing studies, we approach the problem by using concepts from stochastic geometry. By assuming that the interfering users are distributed according to a homogeneous Poisson point process (PPP), we derive analytical expressions for the MSE of LS and LMMSE based channel estimation algorithms in the presence of both AWGN and pilot contamination. The analytical results are validated by simulations. The results clearly show the dependence of the MSE performance on important massive MIMO network parameters, such as pathloss and user density, and give clues for mitigating the effect of pilot contamination. It is shown that increasing the number of pilots does not improve the estimation performance in the presence of pilot contamination. Moreover, the dependence of MSE on antenna spatial correlations suggests that the massive antenna array structure could be optimized to slightly improve the estimation performance under pilot contamination.

The remainder of the paper is organized as follows. Section II describes the system and spatial channel correlation model. In Section III, we present the MMSE and LS based channel estimation in the presence of AWGN only and discuss their limitations for massive MIMO. The proposed distributed LMMSE algorithm is presented in Section IV. To enhance estimation performance, the data-aided approach is considered in Section V. Section VI describes the effect of pilot contamination on channel estimation, and the expression for interference correlation is presented. Based on this, the MSE expressions for different algorithms are derived under AWGN and pilot contamination. Simulation results are presented in Section VII and finally we conclude in Section VIII.

A. Notations

We use the lower case letters $x$ and lower case boldface letters $\mathbf{x}$ to represent the scalar and the (column) vector respectively. Matrices are denoted by upper case boldface letters $\mathbf{X}$ whereas the calligraphic notation $\mathcal{X}$ is reserved for vectors in the frequency domain. The $i$th entry of $\mathbf{x}$ is represented by $x(i)$, the element of $\mathbf{X}$ in the $i$th row and $j$th column is denoted by $x_{i,j}$ and the vector $\mathbf{x}_k$ represents the $k$th column of $\mathbf{X}$. We use $\mathbf{x}(P)$ to denote a vector formed by selecting the entries of $\mathbf{x}$ indexed by set $P$ and $\mathbf{X}(P)$ to denote a matrix formed by selecting the rows of $\mathbf{X}$ indexed by $P$. We also use $\mathbf{X}_{ij}$ to refer to the $(i,j)$th block entry of a block matrix. Further, $(.)^T$, $(.)^*$ and $(.)^H$ represent transpose, conjugate and conjugate transpose (Hermitian) operations respectively. We use $\mathrm{diag}(\mathbf{x})$ to transform a vector $\mathbf{x}$ into a diagonal matrix with the entries of $\mathbf{x}$ spread along the diagonal. $\langle \hat{\mathcal{X}}(k) \rangle$ denotes the hard decoding i.e., maximum likelihood (ML) decision of $\hat{\mathcal{X}}(k)$. $E\{.\}$ represents the statistical expectation. The discrete Fourier transform (DFT) and inverse DFT (IDFT) matrices are represented by $\mathbf{F}$ and $\mathbf{F}^H$ respectively, where the $(l,k)$th entry of $\mathbf{F}$ is defined as $f_{l,k} = N^{-1/2} e^{-\jmath 2\pi lk/N}$, $l,k = 0, 1, 2, \cdots, N-1$ for an $N$-dimensional Fourier transform. Finally, the weighted norm of a vector $\mathbf{x}$ is given by $\|\mathbf{x}\|_{\mathbf{A}}^2 \triangleq \mathbf{x}^H \mathbf{A} \mathbf{x}$.

II. SYSTEM MODEL

We consider a multi-cell massive MIMO-OFDM wireless system as shown in Fig. 1, where the BS in each cell is equipped with a uniform planar array (UPA) consisting of a large number of antennas. Moreover, we assume that each BS serves a number of single antenna user terminals. The antennas on UPAs are distributed across $M$ rows and $G$ columns with horizontal and vertical spacing of $d_x$ and $d_y$ respectively. We define the $(m,g)$th antenna as the antenna element in the $m$th row and $g$th column, which corresponds to the antenna index $r = m + M(g-1)$, where $1 \le m \le M$, $1 \le g \le G$ and $1 \le r \le R$, and where $R = MG$ is the total number of antennas in a UPA. Fig. 2 shows an example of an $M \times G$ UPA structure with antenna indexing. Note that, depending on the values of $G$ and $M$, the antennas could have a linear or a rectangular configuration. We, however, confine our attention to the rectangular UPA structure, which is a viable configuration in deployment scenarios for massive MIMO [9].

Each user communicates with the BS using OFDM and transmits uplink pilots for channel estimation. We assume that all users in a particular cell are assigned orthogonal frequency tones so that there is no intra-cell interference. However, due to the necessary reuse of pilots, there are users in the neighboring cells that transmit pilots at the same frequency tones, resulting in inter-cell interference or pilot contamination. Since only the user in a particular cell of interest will experience
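[Figure 1: Multi-cell massive MIMO system layout. Labels recovered from the figure: massive UPA antenna array, Cells 1 to 3 with their UEs, and pilot contamination links from neighboring cells.]

delay profile (PDP). As manifested by (1), $\mathbf{R}_{array}$ is assumed to be identical across the $l$ taps while $\mathbf{R}_{tap}$ is assumed to be identical across the array. For the spatial correlation matrix $\mathbf{R}_{array}$, we adopt a ray-based 3D channel model from [30] which is more appropriate for rectangular arrays. Accordingly, the spatial correlation between array elements $r=(m,g)$ and $r'=(p,q)$ is given by,

$$[\mathbf{R}_{array}]_{r,r'} = \frac{D_1}{\sqrt{D_5}}\, e^{-\frac{D_7 + (D_2 (\sin\phi)\sigma)^2}{2 D_5}}\, e^{\jmath \frac{D_2 D_6}{D_5}}, \qquad (2)$$

where the $D_i$'s are defined as,

$$D_1 = e^{\jmath \frac{2\pi d_x}{\nu}(p-m)\cos(\theta)}\, e^{-\frac{1}{2}\left(\xi \frac{2\pi d_x}{\nu}\right)^2 (p-m)^2 \sin^2\theta},$$

$$D_2 = \frac{2\pi d_x}{\nu}(q-g)\sin(\theta),$$

[Figure 2 (UPA structure with antenna indexing): labels recovered from the figure include the antenna indices 1, 2, M+1, M+2, the spacings $d_x$ and $d_y$, and the up neighbor $r_U$.]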
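The column-wise indexing $r = m + M(g-1)$ of Section II, and the direct (up, down, left, right) neighbors that an antenna has inside the array, can be illustrated with a short sketch. The function names below are hypothetical helpers introduced only for illustration; they are not taken from the paper.

```python
def antenna_index(m, g, M):
    """Map (row m, column g), both 1-based, to the linear index r = m + M*(g - 1)."""
    return m + M * (g - 1)


def antenna_position(r, M):
    """Inverse map: 1-based linear index r back to (row m, column g)."""
    g0, m0 = divmod(r - 1, M)
    return m0 + 1, g0 + 1


def direct_neighbors(r, M, G):
    """Linear indices of the up/down/left/right neighbors of antenna r in an M x G UPA."""
    m, g = antenna_position(r, M)
    candidates = [(m - 1, g), (m + 1, g), (m, g - 1), (m, g + 1)]
    # Elements on the edges of the array have fewer than four neighbors.
    return sorted(antenna_index(p, q, M) for p, q in candidates
                  if 1 <= p <= M and 1 <= q <= G)


if __name__ == "__main__":
    M, G = 3, 3
    for r in (1, 5):
        print(r, antenna_position(r, M), direct_neighbors(r, M, G))
```

With three rows, a corner antenna such as index 1 has only two direct neighbors, while an interior antenna such as index 5 has four.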
response between the user and receive antenna $r$ i.e.,

$$\mathcal{H}_r = \sqrt{N}\,\mathbf{F}\begin{bmatrix}\mathbf{h}_r \\ \mathbf{0}_{N-L\times 1}\end{bmatrix} = \sqrt{N}\,\underline{\mathbf{F}}\,\mathbf{h}_r . \qquad (5)$$

Here $\underline{\mathbf{F}}$ is the truncated Fourier matrix formed by selecting the first $L$ columns of $\mathbf{F}$. Using (5), we can re-write (4) as,

$$\mathcal{Y}_r = \sqrt{N}\,\mathrm{diag}(\mathcal{X})\,\underline{\mathbf{F}}\,\mathbf{h}_r + \mathcal{W}_r = \mathbf{A}\mathbf{h}_r + \mathcal{W}_r , \qquad (6)$$

where $\mathbf{A} \triangleq \sqrt{N}\,\mathrm{diag}(\mathcal{X})\,\underline{\mathbf{F}}$ and the noise vector $\mathcal{W}_r$ is assumed to be uncorrelated with the channel vector $\mathbf{h}_r$. We assume that $K$ sub-carriers are reserved for pilots and the remaining $N-K$ for the data transmission. Further, it is best to allocate the pilots uniformly as shown in [31]. Hence, for a set of pilot indices denoted by vector $P$, the system equation (6) reduces to,

$$\mathcal{Y}_r(P) = \mathbf{A}(P)\mathbf{h}_r + \mathcal{W}_r(P) , \qquad (7)$$

where $\mathcal{Y}_r(P)$ and $\mathcal{W}_r(P)$ are formed by selecting the entries of $\mathcal{Y}_r$ and $\mathcal{W}_r$ indexed by $P$ while $\mathbf{A}(P)$ is a $K \times L$ matrix formed by selecting the rows of $\mathbf{A}$ indexed by $P$.

We can now collect the pilot measurements (7) received by all antennas into a single system of equations as follows,

$$\mathcal{Y}(P) = [\mathbf{I}_R \otimes \mathbf{A}(P)]\mathbf{h} + \mathcal{W}(P) , \qquad (8)$$

where $\mathcal{Y}(P) = [\mathcal{Y}_1^T(P), \cdots, \mathcal{Y}_R^T(P)]^T$, $\mathcal{W}(P) = [\mathcal{W}_1^T(P), \cdots, \mathcal{W}_R^T(P)]^T$, $\mathbf{I}_R$ represents an $R \times R$ identity matrix and $\mathbf{h}$, as defined earlier, represents the composite channel vector from user to the BS. For convenience, we assume the noise variance to be identical across the array so that $\mathcal{W}(P) \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_w = \sigma_w^2 \mathbf{I}_{RK})$. Note that the number of unknown channel coefficients in (8) is $RL$ whereas the total number of equations is $RK$. Therefore, a necessary condition to solve (8) for $\mathbf{h}$ (and also (7) for $\mathbf{h}_r$) using least squares is that the number of pilots be at least equal to $L$ i.e., $K \ge L$. However, $K$ could be reduced if we utilize the correlation information. With the models defined above, we are ready to estimate the CIRs between the user and each BS antenna. We pursue different approaches that can be adopted for channel estimation in a massive MIMO setup depending on whether the information processing takes place independently at each antenna element or jointly at a centralized processor. We start with naive LMMSE and LS based techniques and discuss their limitations, and then propose a new distributed approach in section IV which is further extended in section V with the help of a data-aided approach.

III. LMMSE AND LS BASED CHANNEL ESTIMATION

In this section, we present three different techniques for channel estimation in massive MIMO-OFDM based on the well-known LMMSE and LS estimators and discuss their limitations. For now, we assume that estimates are corrupted only by the white noise. Hence, without loss of generality, we consider a single-cell single-user scenario for the approaches presented below.

A. The Localized LMMSE (L-LMMSE) estimation

In this approach, all CIRs are estimated independently based on the observations received at each antenna element by using the classical LMMSE estimation. Using the linear system model in (7), the LMMSE estimate of $\mathbf{h}_r$ is obtained by minimizing the (local) MSE, $E\{\|\mathbf{h}_r - \hat{\mathbf{h}}_r\|^2\}$, over $\hat{\mathbf{h}}_r$ as follows [32]

$$\hat{\mathbf{h}}_r = \left(\mathbf{R}_{tap}^{-1} + \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1} \mathbf{A}^H \mathbf{R}_w^{-1} \mathcal{Y}_r , \qquad (9)$$

where we drop the index vector $P$ for convenience. Similarly, it follows that the (minimum) MSE is,

$$\mathrm{mse}_r = \mathrm{trace}\left(\left(\mathbf{R}_{tap}^{-1} + \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1}\right) . \qquad (10)$$

The overall global MSE is obtained by taking the summation over all array elements i.e., $\mathrm{MSE}^{(L)} = \sum_{r=1}^{R} \mathrm{mse}_r$, which after simplifying (10), can be expressed as,

$$\mathrm{MSE}^{(L)} = R \sum_{i=1}^{L} \frac{\delta_i}{1 + \rho K \delta_i} , \qquad (11)$$

where $\{\delta_i\}_{i=1}^{L}$ are eigenvalues of $\mathbf{R}_{tap}$, $\rho \triangleq E_x/\sigma_w^2$ is the SNR with $E_x$ representing the average signal energy per symbol and the superscript $(L)$ indicates L-LMMSE. Observe from (11) that the channel delay spread $L$ has an adverse effect on MSE performance, which can be reduced by increasing the number of pilot tones. The computational complexity of L-LMMSE is of the order $O(RL^3)$ (see Table I), which increases linearly with the number of BS antennas. However, the CIR estimates are not optimal in the sense of minimizing the overall or global MSE. The estimates would have been optimal, had the antennas been placed sufficiently apart so that the channel vectors were effectively uncorrelated. But for massive MIMO with an extremely large number of antennas, it is expected that antennas are located in close proximity, so the channel vectors are highly likely to be correlated with each other.

B. The Optimal LMMSE (O-LMMSE) Solution

In this strategy all the channel vectors are estimated simultaneously by minimizing the global MSE, $E\{\|\mathbf{h} - \hat{\mathbf{h}}\|^2\}$, over the composite channel vector $\hat{\mathbf{h}}$. This could be realized by sending all observations to a central processor and then invoking the LMMSE estimation based on the composite system model in (8). The solution to this problem is given by,

$$\hat{\mathbf{h}} = \left(\mathbf{R}_h^{-1} + \acute{\mathbf{A}}^H \mathbf{R}_w^{-1} \acute{\mathbf{A}}\right)^{-1} \acute{\mathbf{A}}^H \mathbf{R}_w^{-1} \mathcal{Y} , \qquad (12)$$

where $\acute{\mathbf{A}} = \mathbf{I}_R \otimes \mathbf{A}$, $\mathbf{R}_h$ is as given in (1) and for notational convenience we dropped the index $P$. The corresponding MSE is,

$$\mathrm{MSE}^{(O)} = \mathrm{trace}\left(\left(\mathbf{R}_h^{-1} + \acute{\mathbf{A}}^H \mathbf{R}_w^{-1} \acute{\mathbf{A}}\right)^{-1}\right) , \qquad (13)$$

which can be simplified to yield,

$$\mathrm{MSE}^{(O)} = \sum_{j=1}^{R} \sum_{i=1}^{L} \frac{\eta_j \delta_i}{1 + \rho K \eta_j \delta_i} , \qquad (14)$$

where $\eta_j$ and $\delta_i$ are eigenvalues of $\mathbf{R}_{array}$ and $\mathbf{R}_{tap}$ respectively. By comparing (14) with (11), we conclude that
in presence of spatial correlation, the optimal solution yields better MSE performance than the localized strategy, however, it has the following two major drawbacks:

1) Realization of the optimal strategy requires global sharing of information to/from the central processor that results in communication overhead (as it requires complex signalling which can be very expensive).
2) As evident from (12), the computation of the optimal LMMSE requires inverting a non-trivial matrix of very high dimension ($RK \times RK$) that leads to computational complexity of order $O(R^3 L^3)$, which is cubic in the number of BS antennas.

In a massive MIMO scenario where $R$ is of the order of a few hundred, both of the above mentioned operations are very expensive and possibly impractical.

C. Estimation using Least Square (LS)

If the channel statistics are unknown, one can employ simple LS based estimation. In the absence of correlation, we can let the inverse of the channel correlation matrix go to zero, i.e., $\mathbf{R}_{tap}^{-1} \to \mathbf{0}$, thereby ignoring the channel statistics. Therefore, the localized LS solution from (9) is,

$$\hat{\mathbf{h}}_r^{ls} = \left(\mathbf{A}^H \mathbf{A}\right)^{-1} \mathbf{A}^H \mathcal{Y}_r , \qquad (15)$$

and the resulting MSE is given by,

$$\mathrm{mse}_r^{ls} = \mathrm{trace}\left(\left(\mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1}\right) . \qquad (16)$$

In this case, the overall MSE simplifies to,

$$\mathrm{MSE}^{(LS)} = \sum_{r=1}^{R} \mathrm{mse}_r^{ls} = \frac{RL}{\rho K} . \qquad (17)$$

Comparing (17) with (11), we conclude that LS has poor performance in comparison with the LMMSE as it does not utilize the channel statistics. It is for this reason that the centralized LS (C-LS) solution would achieve the same MSE performance as the localized one as shown below.

$$\begin{aligned}
\mathrm{MSE}^{(C-LS)} &= \mathrm{trace}\left(\left((\mathbf{I}_R \otimes \mathbf{A})^H (\mathbf{I}_R \otimes \mathbf{R}_w)^{-1} (\mathbf{I}_R \otimes \mathbf{A})\right)^{-1}\right) \\
&= \mathrm{trace}\left(\mathbf{I}_R \otimes \left(\mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1}\right), \\
&= \sum_{r=1}^{R} \mathrm{trace}\left(\left(\mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1}\right), \\
&= \mathrm{MSE}^{(LS)} ,
\end{aligned}$$

where we have used the Kronecker product identities, $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = \mathbf{AC} \otimes \mathbf{BD}$ and $(\mathbf{A} \otimes \mathbf{B})^{-1} = \mathbf{A}^{-1} \otimes \mathbf{B}^{-1}$.

In short, the L-LMMSE estimation has the advantage of low complexity (and better performance than LS) but it is unable to exploit the strong spatial correlation among antenna elements which is inevitable in massive MIMO systems. On the other hand, O-LMMSE exploits the spatial correlations but at a significantly higher computational cost. This motivates us to propose a method that can overcome the shortcomings of the aforementioned techniques without affecting the estimation quality. Specifically, we propose a distributed estimation of CIRs based on antenna coordination that attains near optimal performance with tractable complexity. The proposed distributed LMMSE estimation is described below and is further extended in section V via a data-aided technique.

IV. THE PROPOSED DISTRIBUTED LMMSE (D-LMMSE) ESTIMATION

It is well known from equivalence results in linear estimation theory [33] that the O-LMMSE solution (12) could be alternatively obtained by solving an $RL$ dimensional optimization problem,

$$\arg\min_{\mathbf{h}} \; \|\mathcal{Y} - \acute{\mathbf{A}}\mathbf{h}\|^2_{\mathbf{R}_w^{-1}} + \|\mathbf{h}\|^2_{\mathbf{R}_h^{-1}} , \qquad (18)$$

where all the variables are as defined earlier. Instead of solving (18) globally (as done earlier), we aim to solve it in a distributed manner over $R$ antennas in which the $r$th antenna has access to $\mathcal{Y}_r$ only. Moreover, the antenna $r$ is interested only in determining its own CIR (i.e., $\mathbf{h}_r$) without worrying about other $\mathbf{h}_j$'s. Here, we would like to mention that this problem is fundamentally different from those considered in the context of adaptive networks [34]. Also, most of the existing distributed estimation techniques in adaptive networks deal with single task problems in which all nodes in the network estimate a single common parameter of interest. Furthermore, they rely on full cooperation between the nodes, i.e., exchanging both the estimates and the observations with the neighbors. Although the distributed recursive least squares (RLS) algorithm of [35] might be adopted to solve (18), it would be gravely complex in the number of dimensions (due to large $R$ in massive MIMO and large channel delay spread) and hence might suffer from convergence issues. Our proposed solution, the distributed LMMSE (D-LMMSE) algorithm, as will become clear, is much simpler in that it exploits the structure of the spatial correlation matrix $\mathbf{R}_{array}$ and relies only on exchanging the (partial) weighted estimates of CIRs with immediate neighbors, thus reducing the communication and computational cost significantly. The proposed D-LMMSE algorithm is composed of three main steps namely the estimation, sharing and update, as explained below.

A. Estimation

In the estimation step, each antenna acting as a center antenna $r_C$ estimates not only its own CIR but also the CIRs of its neighborhood. The neighborhood of $r_C$ consists of 4 direct neighbors represented by the set $\mathcal{N} = \{r_L, r_R, r_U, r_D\}$ (see footnote 1) on the left, right, top and bottom positions respectively as shown in Fig. 3(a). Also, let the corresponding channel vectors be represented by $\mathbf{h}_C$, $\mathbf{h}_L$, $\mathbf{h}_R$, $\mathbf{h}_U$ and $\mathbf{h}_D$ respectively and let $\mathbf{h}_c$ represent the $|\mathcal{N}^+|L \times 1$ dimensional composite channel vector of the central antenna and its $|\mathcal{N}|$ direct neighbors (i.e., $\mathbf{h}_c = [\mathbf{h}_C^T, \mathbf{h}_L^T, \mathbf{h}_R^T, \mathbf{h}_U^T, \mathbf{h}_D^T]^T$). During the estimation process, each antenna acting as a central antenna computes the estimate of $\mathbf{h}_c$ by solving a reduced dimensional weighted least squares

Footnote 1: Note that for elements lying at the edges of a UPA, the number of neighbors is different, so that $2 \le |\mathcal{N}| \le 4$. The set of neighbors including the central antenna is represented by $\mathcal{N}^+$.
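The closed-form MSE expressions (11), (14) and (17) of Section III can be checked numerically against the trace forms (10), (13) and (16). The sketch below is only an illustration under stated assumptions: the toy sizes, the exponential power delay profile used for $\mathbf{R}_{tap}$, the exponential spatial correlation used for $\mathbf{R}_{array}$ (a stand-in for the 3D model of (2)), and the Kronecker form $\mathbf{R}_h = \mathbf{R}_{array} \otimes \mathbf{R}_{tap}$ are all choices made here, not values from the paper.

```python
import numpy as np

N, L, K, R = 64, 4, 8, 3            # sub-carriers, channel taps, pilots, antennas (toy sizes)
Ex, sigma2_w = 1.0, 0.1             # symbol energy and noise variance (assumed values)
rho = Ex / sigma2_w                 # SNR as defined after (11)

# DFT matrix with entries N^{-1/2} exp(-j 2 pi l k / N); keep its first L columns.
n = np.arange(N)
F_trunc = (np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N))[:, :L]

# Uniformly spaced pilot tones carrying identical unit-modulus symbols.
P = np.arange(0, N, N // K)
X = np.sqrt(Ex) * np.exp(1j * np.pi / 4) * np.ones(N)
A_P = (np.sqrt(N) * np.diag(X) @ F_trunc)[P, :]   # K x L matrix A(P) of (7)
gram = A_P.conj().T @ A_P / sigma2_w              # A^H R_w^{-1} A, equal to rho*K*I here

# Placeholder statistics (assumptions): exponential power delay profile for
# R_tap and exponential spatial correlation for R_array.
delta = np.exp(-0.5 * np.arange(L))
delta /= delta.sum()
R_tap = np.diag(delta)
idx = np.arange(R)
R_array = 0.9 ** np.abs(idx[:, None] - idx[None, :])
eta = np.linalg.eigvalsh(R_array)
R_h = np.kron(R_array, R_tap)

inv = np.linalg.inv
# Trace forms (10), (13) and (16), summed over the R antennas where needed.
mse_L_trace = R * np.trace(inv(inv(R_tap) + gram)).real
mse_O_trace = np.trace(inv(inv(R_h) + np.kron(np.eye(R), gram))).real
mse_LS_trace = R * np.trace(inv(gram)).real

# Closed forms (11), (14) and (17).
mse_L_closed = R * np.sum(delta / (1 + rho * K * delta))
prod = np.outer(eta, delta)
mse_O_closed = np.sum(prod / (1 + rho * K * prod))
mse_LS_closed = R * L / (rho * K)

print("O-LMMSE:", mse_O_trace, mse_O_closed)
print("L-LMMSE:", mse_L_trace, mse_L_closed)
print("LS     :", mse_LS_trace, mse_LS_closed)
```

For this uniform pilot design the matrix $\mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}$ reduces to $\rho K \mathbf{I}_L$, which is what makes the eigenvalue forms possible, and the three values come out ordered as O-LMMSE, then L-LMMSE, then LS, in line with the comparison drawn in Section III.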
Figure 3: (a) During the first iteration rC (blue antenna) receives information from its 4-direct neighbors (pink antennas). In
the second iteration, the information from next nearest neighbors (green antennas) also comes in and so on. (b) An example
of a 3 × 4 antenna array where the neighboring antennas (indices 4 and 2) share the selected estimates (highlighted) with the
central antenna (index 1).
estimates and inverse error covariance matrices to minimize the computational requirements. The recursions in the update equations are initialized by (23) and (22) respectively, which are available after the estimation step. In the subsequent iterations, each antenna would also require the partial matrices, $\mathbf{P}_j$'s and $\mathbf{R}_{h_j}$'s, for each of its $|\mathcal{N}|$ neighbors. Fortunately, they can be obtained from $\mathbf{P}_c$ and $\mathbf{R}_{h_c}$ respectively (which are available at the central antenna) by exploiting the symmetrical structure of $\mathbf{R}_{array}$. Thus, there is no need to share them across the neighboring elements, which in turn saves a significant amount of communication burden. Specifically, the matrices $\mathbf{R}_{h_c}$ and $\mathbf{P}_c$ exhibit the following two properties (see footnote 2):

Property 1: The matrix $\mathbf{R}_{h_c}$ is identical for all elements in the neighborhood of $r_C$ i.e., $\mathbf{R}_{h_c} = \mathbf{R}_{h_j}, \forall j \in \mathcal{N}$

Property 2: The matrix $\mathbf{P}_c$ is identical for all elements in the neighborhood of $r_C$ i.e., $\mathbf{P}_c = \mathbf{P}_j, \forall j \in \mathcal{N}$

Property 1 is attributed to the symmetric nature of the spatial correlation matrix $\mathbf{R}_{array}$, which implies that the spatial correlation between any two antennas placed the same distance apart is the same. Therefore, it is not difficult to see that property 1 holds exactly under the Kronecker model and our earlier assumption of identical tap correlation across the antenna array in section II. Property 2 is the consequence of property 1 when incorporated into (22).

Hence, to obtain the partial correlation matrices, $\mathbf{R}_{h_j}, j \in \mathcal{N}$, we use property 1 to first set $\mathbf{R}_{h_j} = \mathbf{R}_{h_c}$ and then modify the off-diagonal block entries corresponding to the null vectors of partial estimates as $\mathbf{R}_{ij} = \mathbf{0}$ if any of $\hat{\mathbf{h}}_{w_i}, \hat{\mathbf{h}}_{w_j}$ is $\mathbf{0}$, and the diagonal block entries as $\mathbf{R}_{ii} = \mathbf{I}_L$ if $\hat{\mathbf{h}}_{w_i} = \mathbf{0}$, where the subscript $ij$ denotes the $(i,j)$th block. The matrices $\mathbf{P}_j$'s are obtained in a similar fashion except that the diagonal block entry corresponding to null vectors is replaced by $a\mathbf{I}$ where $0 < a \ll 1$ is a small positive number, which indicates very low weight or confidence in null estimates (that are not shared). In essence, the central element has the full information needed to construct the $\mathbf{P}_j$'s and $\mathbf{R}_{h_j}$'s corresponding to the shared estimates $\hat{\mathbf{h}}_w^j$. We illustrate how these matrices could be obtained for the example in Fig. 3(b). Consider the central antenna $r_C = 1$ and its $|\mathcal{N}| = 2$ direct neighbors with (shared) partial estimates given in (24). The partial correlation and error covariance matrices associated with those estimates (shown underlined) along with that of the central element are given in (29) and (30) respectively.

Based on the above steps and procedures, the proposed D-LMMSE algorithm is summarized in Algorithm 1.

$$\mathbf{R}_{h_1} = \begin{bmatrix} \mathbf{R}_{11} & \mathbf{R}_{14} & \mathbf{R}_{12} \\ \mathbf{R}_{41} & \mathbf{R}_{44} & \mathbf{R}_{42} \\ \mathbf{R}_{21} & \mathbf{R}_{24} & \mathbf{R}_{22} \end{bmatrix}, \quad \mathbf{R}_{h_4} = \begin{bmatrix} \mathbf{R}_{44} & \mathbf{R}_{41} & \mathbf{0} \\ \mathbf{R}_{14} & \mathbf{R}_{11} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{I}_L \end{bmatrix} \quad \text{and} \quad \mathbf{R}_{h_2} = \begin{bmatrix} \mathbf{R}_{22} & \mathbf{0} & \mathbf{R}_{21} \\ \mathbf{0} & \mathbf{I}_L & \mathbf{0} \\ \mathbf{R}_{12} & \mathbf{0} & \mathbf{R}_{11} \end{bmatrix} \qquad (29)$$

$$\mathbf{P}_1 = \begin{bmatrix} \mathbf{P}_{11} & \mathbf{P}_{14} & \mathbf{P}_{12} \\ \mathbf{P}_{41} & \mathbf{P}_{44} & \mathbf{P}_{42} \\ \mathbf{P}_{21} & \mathbf{P}_{24} & \mathbf{P}_{22} \end{bmatrix}, \quad \mathbf{P}_4 = \begin{bmatrix} \mathbf{P}_{44} & \mathbf{P}_{41} & \mathbf{0} \\ \mathbf{P}_{14} & \mathbf{P}_{11} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & a\mathbf{I} \end{bmatrix} \quad \text{and} \quad \mathbf{P}_2 = \begin{bmatrix} \mathbf{P}_{22} & \mathbf{0} & \mathbf{P}_{21} \\ \mathbf{0} & a\mathbf{I} & \mathbf{0} \\ \mathbf{P}_{12} & \mathbf{0} & \mathbf{P}_{11} \end{bmatrix} \qquad (30)$$

Footnote 2: These properties are generally satisfied as the spatial correlation matrix is usually symmetric; if not, then the antennas can share these matrices as well.

Remarks:
1) The information sharing and update take place during each iteration of the algorithm such that after a few iterations the information diffuses across the whole antenna array. This concept of sharing is depicted in Fig. 3(a)
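The block-masking rule used to build the partial matrices $\mathbf{R}_{h_j}$ and $\mathbf{P}_j$ from the matrices held at the central antenna can be sketched as follows. This is a minimal illustration, assuming the blocks are already arranged in the neighbor's own ordering (as in (29) and (30)); the function name and its arguments are hypothetical and do not come from the paper.

```python
import numpy as np


def partial_block_matrix(M_c, L, null_blocks, diag_scale=1.0):
    """Copy M_c and overwrite the L x L blocks tied to null (unshared) partial estimates.

    Off-diagonal blocks in the block-rows and block-columns listed in
    `null_blocks` are set to zero; the matching diagonal blocks become
    diag_scale * I_L (1.0 for R_hj, a small a > 0 for P_j).
    """
    M_j = np.array(M_c, dtype=complex)
    for i in null_blocks:
        rows = slice(i * L, (i + 1) * L)
        M_j[rows, :] = 0.0                        # block-row i
        M_j[:, rows] = 0.0                        # block-column i
        M_j[rows, rows] = diag_scale * np.eye(L)  # diagonal block
    return M_j


if __name__ == "__main__":
    L = 2
    rng = np.random.default_rng(0)
    B = rng.standard_normal((3 * L, 3 * L))
    R_c = B @ B.T                                 # stand-in for a 3-block correlation matrix

    # Third block null, mirroring the zero pattern of R_h4 in (29).
    R_j = partial_block_matrix(R_c, L, null_blocks=[2])
    # Middle block null with a small weight, mirroring the pattern of P_2 in (30).
    P_j = partial_block_matrix(R_c, L, null_blocks=[1], diag_scale=1e-3)
    print(np.round(R_j.real, 2))
    print(np.round(P_j.real, 3))
```

Only the zero pattern and the replaced diagonal block are reproduced here; the reordering of blocks that appears in (29) and (30) is assumed to have been applied beforehand.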
V. DATA-AIDED CHANNEL ESTIMATION

The basic idea of data-aided channel estimation is to exploit the data sub-carriers in order to improve the initial channel estimates obtained using only the pilots. As the data-aided technique does not require additional pilots, it is spectrally more efficient. Here, the pilot-based channel estimate is used for data detection, which along with the reserved pilots can significantly enhance the channel estimation. It is possible that some of the data-pilots be erroneous due to noise and channel estimation errors, while some of the other data-carriers are reliable i.e., they are likely to be decoded correctly. An important problem is how to down-select a subset of the most reliable data-carriers to be used as data-pilots.

A. Reliable Carriers Selection

Consider the received OFDM symbol at any antenna as shown in (4), and let $\hat{\mathbf{h}}$ and $\hat{\mathcal{H}}$ be the CIR and CFR estimates obtained using pilots. Then, the tentative estimates of the data symbols are obtained by equalizing the received OFDM

The proposed strategy for selecting the subset $\mathcal{R}$ of the most reliable data-carriers, motivated by [26], is based on the criteria,

$$R(k) = \frac{f_z\left(Z(k) = \mathcal{X}(k) - \langle \hat{\mathcal{X}}(k) \rangle\right)}{\sum_{m=1,\, A_m \neq \langle \hat{\mathcal{X}}(k) \rangle}^{M} f_z\left(Z(k) = \mathcal{X}(k) - A_m\right)} , \qquad (36)$$

where $f_z(.)$ is the pdf of $Z(k)$ and $A_m$ represents the set of constellation alphabets. Note that the numerator in (36) is the probability that $\mathcal{X}(k)$ will be decoded correctly while the denominator sums the probabilities of all possible incorrect decisions due to distortion $Z(k)$. The subset $\mathcal{R}$ is formed by selecting only those data-carriers for which $R(k) > 1$ i.e.,

$$\mathcal{R} = \{k \mid R(k) > 1\} . \qquad (37)$$

The metric (37) is intuitively appealing as it selects only those sub-carriers which are likely to be decoded correctly with high probability. Fig. 4 further elaborates this idea: even though $\hat{\mathcal{X}}(1)$ and $\hat{\mathcal{X}}(2)$ have the same distance from $\mathcal{X}$, $\hat{\mathcal{X}}(2)$ is more likely to be decoded correctly than $\hat{\mathcal{X}}(1)$, as it is farther from the nearest neighbours and therefore is less likely to be decoded as any other constellation point.
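The reliability test of (36) and (37) can be sketched in a few lines. Two assumptions are made here that the text does not state: the distortion $Z(k)$ is modelled as circular complex Gaussian with variance `sigma2_z`, and the pdf is evaluated at the equalized sample `x_hat`, since that is the quantity available at the receiver. The function name and the sample values are illustrative only.

```python
import numpy as np


def reliability(x_hat, constellation, sigma2_z):
    """Ratio of the distortion pdf at the hard decision to its sum over all other points."""
    # Squared distance from every equalized sample to every constellation point.
    d2 = np.abs(x_hat[:, None] - constellation[None, :]) ** 2
    # Assumed CN(0, sigma2_z) density of the distortion Z(k).
    pdf = np.exp(-d2 / sigma2_z) / (np.pi * sigma2_z)
    nearest = np.argmin(d2, axis=1)               # index of the hard (ML) decision
    num = pdf[np.arange(len(x_hat)), nearest]
    den = pdf.sum(axis=1) - num                   # all competing constellation points
    return num / den


if __name__ == "__main__":
    qam4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
    x_hat = np.array([0.70 + 0.72j, 0.02 + 0.01j, -0.90 - 0.60j])
    R_k = reliability(x_hat, qam4, sigma2_z=0.1)
    reliable = np.flatnonzero(R_k > 1)            # the subset of (37)
    print(R_k, reliable)
```

Samples that land close to a constellation point get a ratio far above one, while a sample near the decision boundaries (the second entry above) falls below the threshold and is not used as a data-pilot.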
Lemma 2. Using the network model of VI-A, the mean and variance of interference $I$ is,

$$\mu_I = E\{I\} = 0 \qquad (43)$$

and

$$\sigma_I^2 = E\{|I|^2\} = \pi\lambda(\beta-1)^{-1} E\{|x|^2\} E_x \Omega \left[\frac{1}{\gamma_o^{2\beta-2}} - \frac{1}{\gamma_m^{2\beta-2}}\right] \qquad (44)$$

respectively.

Proof: See Appendix A.

Although (44) is derived by considering that the interference space is annular, it can be extended for an infinite interference space with a protection region of $\gamma_o$ by taking the limit as $\gamma_m \to \infty$ yielding,

$$\sigma_I^2 = \pi\lambda\gamma_o^2(\beta-1)^{-1} E\{|x|^2\} \frac{E_x \Omega}{\gamma_o^{2\beta}} . \qquad (45)$$

where $\sigma_I^2$ is given in (44) and $\mathbf{\Lambda}$ is a diagonal matrix with eigenvalues of $\mathbf{R}_{tap}$ spread along the diagonal and all users are assumed to have similar channel characteristics.

Proof: See Appendix B.

Theorem 1 shows that the MSE is composed of two terms. The first term due to AWGN can be suppressed by increasing the number of pilot tones but the second term due to pilot contamination cannot be reduced by adding more pilots and even persists at high SNR (i.e., $\rho \to \infty$).

Theorem 2. For the system model described in section II and pilot contamination as characterised in section VI, the MSE expression for the L-LMMSE estimation algorithm presented in section III-A under both AWGN and pilot contamination is given by,

$$\mathrm{MSE}^{(L)} = R \sum_{i=1}^{L} \frac{\delta_i \left(1 + \rho K \delta_i \sigma_I^2\right)}{1 + \rho K \delta_i + \rho K \delta_i \sigma_I^2} , \qquad (48)$$
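The interference variance of Lemma 2 can be cross-checked with a small Monte Carlo experiment. The sketch below assumes that each interferer contributes a power proportional to $r^{-2\beta}$ at distance $r$ (the form implied by (44)) and sets the product $E\{|x|^2\} E_x \Omega$ to one through the `scale` argument; both function names are hypothetical. The default values $\beta = 2$, $\gamma_o = 2$ and an outer radius of 5 follow the simulation setup quoted in the experiments, and $\lambda = 0.5$ is an arbitrary choice.

```python
import numpy as np


def interference_variance(lam, beta, gamma_o, gamma_m=np.inf, scale=1.0):
    """Evaluate (44); gamma_m = inf gives the infinite-plane limit."""
    e = 2 * beta - 2
    far = 0.0 if np.isinf(gamma_m) else gamma_m ** (-e)
    return np.pi * lam * scale / (beta - 1) * (gamma_o ** (-e) - far)


def simulate_interference_power(lam, beta, gamma_o, gamma_m, trials=20000, seed=0):
    """Monte Carlo mean of sum_i r_i^(-2*beta) for a homogeneous PPP on an annulus."""
    rng = np.random.default_rng(seed)
    area = np.pi * (gamma_m ** 2 - gamma_o ** 2)
    # Total number of interferers over all trials (Poisson count per trial).
    total = int(rng.poisson(lam * area, size=trials).sum())
    # Uniform points on the annulus: r^2 is uniform on [gamma_o^2, gamma_m^2].
    r = np.sqrt(rng.uniform(gamma_o ** 2, gamma_m ** 2, size=total))
    return np.sum(r ** (-2.0 * beta)) / trials


if __name__ == "__main__":
    lam, beta, gamma_o, gamma_m = 0.5, 2, 2.0, 5.0
    theory = interference_variance(lam, beta, gamma_o, gamma_m)
    simulated = simulate_interference_power(lam, beta, gamma_o, gamma_m)
    limit = interference_variance(lam, beta, gamma_o)   # gamma_m -> infinity
    print(theory, simulated, limit)
```

The variance grows linearly with the density $\lambda$ and shrinks as the protection radius $\gamma_o$ grows, which is the dependence that the MSE expressions inherit through $\sigma_I^2$.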
MSE
MSE
10−1
Array element spacing dx , dy 0.3ν, 0.5ν 100 D=0
D=1
Number of OFDM sub-carriers (N ) 256 10−2
D=2
D=3
Number of pilots (K) 32 10−3
0 1 2 3 4 5 6 7 8 9 −10 −5 0 5 10 15 20 25 30
Signal constellation modulation 4/16/64 – QAM # of iterations (D) SNR (dB)
103
102 LS
10 LS
LS LS (Th.) LS (Th.)
LS (Th.) L-LMMSE L-LMMSE
102 L-LMMSE (Th.) 6 L-LMMSE (Th.)
101 L-LMMSE D-LMMSE (D=3) D-LMMSE (D=3)
L-LMMSE (Th.) O-LMMSE O-LMMSE
MSE (dB)
O-LMMSE (Th.) 2 O-LMMSE (Th.)
D-LMMSE (D=3)
MSE
101
O-LMMSE
100 O-LMMSE (Th.)
−2
MSE
100 −6
10−1 −10
10−1
−10 −5 0 5 10 15 20 25 30 10−2 10−1 100 101
SNR (dB) λ
10−2
(a) (b)
10 −3
−10 −5 0 5 10 15 20 25 30 Figure 10: Effect of pilot contamination on MSE performance
SNR (dB)
(a) MSE as a function of SNR for λ=0.1 (b) MSE as a function
Figure 7: MSE performance of different algorithms in white of λ for SNR fixed at 10 dB.
Gaussian noise.
10−1
[Figure 9 plots: mean and variance (σI2) of the interference versus λ; curves for Simulation and Theory.]
Figure 9: Mean and variance of interference at single OFDM sub-carrier as a function of λ.

user at a distance of 1 m from the BS, while interfering users are distributed according to a PPP with density λ in a region of radius 5 m with a protection region of γo = 2 m and pathloss exponent β = 2. All users communicate with the BS using OFDM with N = 256, L = 8 and K = 32 identical pilot symbols drawn from a 4-QAM constellation. Fig. 9 compares the mean and variance of the interference observed on a single, randomly picked OFDM sub-carrier in simulation with the expressions given in Lemma 2, as a function of λ. The results indicate a close match between simulation and theory.

different algorithms in presence of both AWGN and pilot
MSE is more severe than AWGN.

We present a similar analysis in Fig. 10(b), where the MSE is plotted as a function of λ with the SNR fixed at 10 dB. All algorithms perform well for small values of λ. However, as λ increases, the interference due to pilot contamination dominates the AWGN and severely degrades performance, as indicated by the sharp increase in the MSE curves. Note that LMMSE channel estimation is more robust to pilot contamination than simple LS-based channel estimation. Also observe the close match between simulation and theoretical analysis, shown in Fig. 10, over a wide range of λ.

E. Experiment 5: Computational Complexity
In this experiment we compare the average runtime of the various algorithms, which can be regarded as a measure of computational complexity. Fig. 11 shows the average runtime with an increasing number of BS antennas under the default simulation parameters of Table II. The computational requirements of the proposed D-LMMSE algorithm, for different values of the parameter D, grow at a much slower pace than those of the O-LMMSE algorithm as the number of BS antennas increases. Further, in terms of memory requirements and communication overhead (not shown here), the advantages of D-LMMSE are even more tangible.
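The comparison between simulated interference and Lemma 2 in the pilot-contamination experiment can be illustrated with a small Monte Carlo sketch. It uses the geometry quoted in the text (annulus from γo = 2 m to 5 m, β = 2), but the rest is an assumption made for illustration: each interferer's contribution on one sub-carrier is modelled as a unit-energy 4-QAM symbol times a CN(0, Ω) gain, the full OFDM pilot structure is not simulated, and all function names are hypothetical.

```python
import numpy as np


def lemma2_variance(lam, beta, gamma_o, gamma_m, Ex=1.0, Omega=1.0):
    """Closed-form interference variance of Lemma 2, taking E{|x|^2} = 1."""
    return (np.pi * lam * Ex * Omega / (beta - 1)
            * (gamma_o ** (2 - 2 * beta) - gamma_m ** (2 - 2 * beta)))


def simulate_interference(lam, beta=2.0, gamma_o=2.0, gamma_m=5.0,
                          Ex=1.0, Omega=1.0, trials=20000, seed=0):
    """Return the sample mean and sample power of I over PPP realisations."""
    rng = np.random.default_rng(seed)
    area = np.pi * (gamma_m ** 2 - gamma_o ** 2)
    samples = np.empty(trials, dtype=complex)
    for t in range(trials):
        n = rng.poisson(lam * area)  # number of interferers in the annulus
        # Uniform points in an annulus: r^2 is uniform on [gamma_o^2, gamma_m^2].
        r = np.sqrt(rng.uniform(gamma_o ** 2, gamma_m ** 2, size=n))
        bits = 2 * rng.integers(0, 2, size=(2, n)) - 1
        x = (bits[0] + 1j * bits[1]) / np.sqrt(2)  # unit-energy 4-QAM (assumed)
        h = np.sqrt(Omega / 2) * (rng.standard_normal(n)
                                  + 1j * rng.standard_normal(n))  # assumed CN(0, Omega)
        samples[t] = np.sum(np.sqrt(Ex) * x * h / r ** beta)
    return samples.mean(), np.mean(np.abs(samples) ** 2)


mean_sim, var_sim = simulate_interference(lam=1.0)
var_th = lemma2_variance(lam=1.0, beta=2.0, gamma_o=2.0, gamma_m=5.0)
print(abs(mean_sim), var_sim, var_th)
```

Only the distances matter for the received power, so the angular coordinate of each interferer is not drawn. Sweeping `lam` over a grid such as 10⁻² to 10¹ would give curves of the kind described for Fig. 9, within the limits of the simplified per-sub-carrier model.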
[Figure 11 plot: average runtime (sec) versus number of BS antennas (4, 16, 36, 64, 100) for LS, L-LMMSE, D-LMMSE (D=1), D-LMMSE (D=2), D-LMMSE (D=3) and O-LMMSE.]
Figure 11: Average runtime of various algorithms.

VIII. CONCLUSION

Channel estimation is a challenging problem in massive MIMO systems, as the conventional techniques applicable to MIMO systems cannot be employed owing to the exceptionally large number of unknown channel coefficients. We proposed a distributed algorithm that attains a near-optimal solution at significantly reduced complexity by relying on coordination among antennas. To reduce the pilot overhead, the distributed LMMSE algorithm is extended using data-aided estimation based on reliable carriers. To gain insight into the effect of pilot contamination on channel estimation performance, we used stochastic geometry to obtain the aggregate interference power and, based on this, derived MSE expressions for different algorithms under AWGN and pilot-contaminated scenarios. The derived expressions were verified using simulation results. Extending the obtained results to the analysis of system throughput under pilot contamination remains open for future work.

APPENDIX A
MEAN AND VARIANCE OF INTERFERENCE

The mean of I can be determined as follows,
\[
\begin{aligned}
\mu_I = E\{I\} &= E\left\{ \sum_{i\in\Psi} \frac{\sqrt{E_x}\, z_i}{\gamma_i^{\beta}} \right\} \\
&= E_{\Psi}\left\{ \sum_{i\in\Psi} \frac{\sqrt{E_x}\, E_z\{z_i\}}{\gamma_i^{\beta}} \right\} \\
&\overset{(a)}{=} \lambda \sqrt{E_x}\, E\{z_i\} \int_{\mathbb{R}^2} \frac{1}{r^{\beta}}\, r\, dr\, d\theta = 0
\end{aligned}
\]
where (a) results from Campbell's theorem [40], and then the fact that E{zi} = 0 yields the zero mean. Similarly, the variance of interference can be computed as follows,
\[
\begin{aligned}
\sigma_I^2 = E\{|I|^2\} &= E_{\Psi}\left\{ E_z\left\{ \sum_{i\in\Psi} \frac{\sqrt{E_x}\, z_i}{\gamma_i^{\beta}} \sum_{j\in\Psi} \frac{\sqrt{E_x}\, z_j^{*}}{\gamma_j^{\beta}} \right\} \right\} \\
&\overset{(a)}{=} E_{\Psi}\left\{ \sum_{i\in\Psi} \frac{E_x\, E_z\{|z_i|^2\}}{\gamma_i^{2\beta}} \right\} \\
&\overset{(b)}{=} \lambda E_x\, E\{|z_i|^2\} \int_{0}^{2\pi}\!\!\int_{\gamma_o}^{\gamma_m} \frac{1}{r^{2\beta}}\, r\, dr\, d\theta \\
&\overset{(c)}{=} \pi\lambda(\beta-1)^{-1} E_x \Omega\, E\{|x|^2\} \left[ \frac{1}{\gamma_o^{2\beta-2}} - \frac{1}{\gamma_m^{2\beta-2}} \right]
\end{aligned}
\]
where (a) is due to the fact that the zi are independent SS random variables, in (b) we employed Campbell's theorem, and in (c) we used the result $E\{|z_i|^2\} = E\{a_i^2 \alpha_i^2\} = \Omega E\{|x|^2\}$, where we note that ai and αi are independent random variables, which completes the proof.

APPENDIX B
PROOF OF THEOREM 1

By replacing $R_w$ with $R_w + R_{I_r}$ in the MSE expression of (16), we obtain
\[
\begin{aligned}
\mathrm{mse}_r^{ls} &= \mathrm{trace}\left\{ \left( A^H \left( R_w + R_{I_r} \right)^{-1} A \right)^{-1} \right\} \\
&= \mathrm{trace}\left\{ \left( A^H \left( R_w + \sigma_I^2 A R_{tap} A^H \right)^{-1} A \right)^{-1} \right\} \\
&\overset{(a)}{=} \mathrm{trace}\left\{ \left( A^H R_w^{-1} A - \sigma_I^2 A^H R_w^{-1} A \left( R_{tap}^{-1} + \sigma_I^2 A^H R_w^{-1} A \right)^{-1} A^H R_w^{-1} A \right)^{-1} \right\}
\end{aligned}
\]
where (a) follows from the matrix inversion lemma. Now, using the EVD of the channel correlation matrix $R_{tap} = Q \Lambda Q^H$ and the fact that $A^H R_w^{-1} A = \frac{K E_x}{\sigma_w^2} I_L$, we obtain
\[
\begin{aligned}
\mathrm{mse}_r^{ls} &\overset{(b)}{=} \mathrm{trace}\left\{ \left( \frac{K E_x}{\sigma_w^2} I_L - \sigma_I^2 \left( \frac{K E_x}{\sigma_w^2} \right)^{2} \left( \Lambda^{-1} + \frac{\sigma_I^2 K E_x}{\sigma_w^2} I_L \right)^{-1} \right)^{-1} \right\} \\
&= \sum_{i=1}^{L} \left( \frac{K E_x}{\sigma_w^2} - \sigma_I^2 \left( \frac{K E_x}{\sigma_w^2} \right)^{2} \left( \delta_i^{-1} + \frac{\sigma_I^2 K E_x}{\sigma_w^2} \right)^{-1} \right)^{-1}
\end{aligned} \tag{53}
\]
where (b) follows from the property that $\mathrm{trace}(Q R Q^H) = \mathrm{trace}(R)$ if Q is unitary. After simple algebraic manipulations, the summation simplifies to $\frac{\sigma_w^2 L}{K E_x} + \sigma_I^2 \sum_{i=1}^{L} \delta_i$, which completes the proof.
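The result of Appendix B can be sanity-checked numerically. The sketch below evaluates the trace expression directly and compares it with the closed form σ_w²L/(K E_x) + σ_I² Σ δ_i. The setup is assumed for illustration only: R_w = σ_w² I, a pilot matrix A built from K equispaced rows of the N-point DFT matrix (so that A^H A = K E_x I_L), a randomly generated Hermitian R_tap, and hypothetical noise and interference powers.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, L, Ex = 256, 32, 8, 1.0
sigma_w2, sigma_I2 = 0.1, 0.05  # hypothetical noise and interference powers

# Assumed pilot matrix: K equispaced sub-carriers, first L DFT columns.
pilots = np.arange(0, N, N // K)
A = np.sqrt(Ex) * np.exp(-2j * np.pi * np.outer(pilots, np.arange(L)) / N)

# Random Hermitian positive-definite channel correlation matrix (illustrative).
B = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
R_tap = B @ B.conj().T / L
delta = np.linalg.eigvalsh(R_tap)  # eigenvalues delta_i of R_tap

# Direct evaluation: trace{(A^H (R_w + sigma_I^2 A R_tap A^H)^{-1} A)^{-1}}.
C = sigma_w2 * np.eye(K) + sigma_I2 * A @ R_tap @ A.conj().T
M = A.conj().T @ np.linalg.solve(C, A)
mse_direct = np.trace(np.linalg.inv(M)).real

# Closed form reached at the end of the proof.
mse_closed = sigma_w2 * L / (K * Ex) + sigma_I2 * delta.sum()
print(mse_direct, mse_closed)
```

The two printed values should agree to numerical precision. The second term, σ_I² Σ δ_i, does not depend on K, which is the part of the LS error that additional pilots cannot reduce.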