Distributed Channel Estimation and Pilot Contamination Analysis For Massive MIMO-OFDM Systems

Alam Zaib, Mudassir Masood, Anum Ali, Weiyu Xu and Tareq Y. Al-Naffouri
arXiv:1507.08150v1 [cs.IT] 29 Jul 2015

Abstract—Massive MIMO communication systems, by virtue of utilizing a very large number of antennas, have the potential to yield higher spectral and energy efficiency in comparison with conventional MIMO systems. In this paper, we consider uplink channel estimation in massive MIMO-OFDM systems with frequency selective channels. With an increased number of antennas, the channel estimation problem becomes very challenging as an exceptionally large number of channel parameters have to be estimated. We propose an efficient distributed linear minimum mean square error (LMMSE) algorithm that can achieve near optimal channel estimates at very low complexity by exploiting the strong spatial correlations and symmetry of large antenna array elements. The proposed method involves solving a (fixed) reduced dimensional LMMSE problem at each antenna followed by a repetitive sharing of information through collaboration among neighboring antenna elements. To further enhance the channel estimates and/or reduce the number of reserved pilot tones, we propose a data-aided estimation technique that relies on finding a set of most reliable data carriers. We also analyse the effect of pilot contamination on the mean square error (MSE) performance of different channel estimation techniques. Unlike the conventional approaches, we use stochastic geometry to obtain an analytical expression for the interference variance (or power) across OFDM frequency tones and use it to derive the MSE expressions for different algorithms under both noise and pilot contaminated regimes. Simulation results validate our analysis and the near optimal MSE performance of the proposed estimation algorithms.

Index Terms: Channel estimation, massive MIMO, stochastic geometry, OFDM, LMMSE.

I. INTRODUCTION

In wireless communications, the demand for higher data rates has been increasing dramatically, mostly owing to the unprecedented usage of data-hungry devices, e.g., smart-phones, super-phones, tablets etc., for wireless multimedia applications [1]. Over the years, the MIMO technology (that exploits multiple antennas at the transmitter and/or receiver) has played a pivotal role in sustaining the increased data rates. Installing multiple antennas offers key advantages such as multiplexing gain and diversity gain due to increased spatial reuse [2], [3]. The MIMO technology has already been incorporated into many wireless products and standards such as WiFi IEEE 802.11n [4], WiMAX IEEE 802.16e [5] and LTE (4G) [6].

Recently, it was established that the use of very large antenna arrays, typically of the order of a few hundred elements, at the base station (BS) can potentially provide huge gains in system throughput, energy efficiency, security and robustness of wireless communication systems [7]. Such systems, known as massive MIMO or large scale MIMO systems [8]–[10], overcome many limitations of traditional MIMO systems. Massive MIMO increases system capacity by simultaneously serving tens of users using the same time-frequency resources. Moreover, the large number of low power active antennas allows energy to be focused in a small spatial region by forming a sharp beam towards desired users. This additionally implies that there will be little intra-cell interference [9]. Because of these vital advantages, massive MIMO has attracted a lot of research interest and is envisioned as an enabling technology for next generation (5G) wireless communications [11].

Hand in hand with the advantages are entirely new research challenges that need to be tackled for massive MIMO. The bottleneck in achieving the full advantages of massive MIMO is the accurate estimation of the channel impulse response (CIR) for each transmit-receive antenna pair. Having a very large number of antennas means that a significant number of channel coefficients need to be estimated, far more than could be handled by traditional pilot-based MIMO channel estimation techniques (see [12] and references therein). In this regard, the Bayesian minimum mean square error (MMSE) estimator provides an optimal estimate in the presence of additive white Gaussian noise (AWGN). The method is complex and therefore a number of approaches have been developed to reduce its complexity, such as those proposed in [13]–[17]. Unlike the least squares (LS) or interpolation based techniques [18], MMSE estimation has a clear edge in that it can effectively utilize the channel statistics to improve the estimation accuracy. However, the direct generalization of these techniques to massive MIMO has some drawbacks. In particular, they suffer from huge complexity due to the inversion of matrices of very large dimensionality, making them impractical. Some methods to reduce the complexity of the MMSE estimator in massive MIMO have also been proposed, e.g., [19]–[24]. It is important to note that most of the existing methods make assumptions that are not always true. For example, many methods deal with flat fading channels only while others assume that the channels are sparse. Therefore, low complexity channel estimation approaches suited to multi-cell and multi-carrier massive MIMO systems need further investigation.

In this paper we propose a distributed algorithm for the estimation of correlated Rayleigh fading channels in massive MIMO-OFDM systems. The novel distributed LMMSE algorithm significantly reduces the computational complexity while attaining near optimal CIR estimates. The distributed approach is inspired by our previous work in [25] (where channels are
assumed to be sparse and exhibit common support with the neighboring antennas). Furthermore, in order to enhance the estimation performance, we also propose a data-aided estimation technique that relies on finding a set of most reliable data carriers to increase the number of measurements, instead of increasing the reserved pilot tones [26]. Equivalently, by using the data-aided technique, the number of reserved pilot tones can be reduced to attain a performance that is comparable to pilot-based estimation, thus increasing the spectral efficiency.

In a multi-cell setting, allocation of orthogonal pilot sequences for all users cannot be guaranteed due to the finite coherence time of the channel and the limited available bandwidth [7]. Therefore, it is inevitable to reuse the pilot sequences across the cells. One of the major consequences of pilot reuse is that when the BS in a cell is performing channel estimation via uplink training, the channel estimates will be severely distorted (contaminated) by the pilots of the neighboring cell users. The impact of pilot contamination on channel estimation is far greater than that of AWGN. In fact, it was shown in [27] that the effect of uncorrelated interference and fast Rayleigh fading diminishes as the number of BS antennas increases while the effect of pilot contamination is not eliminated. Hence, it is important to investigate the effect of pilot contamination on the MSE performance of different channel estimation techniques.

Although the effect of pilot contamination on system performance has been analysed by many researchers, e.g., [28], [29], only a few studies have analysed its impact on channel estimation performance [20]. Moreover, in these works, the analysis is carried out for fixed locations of (interfering) users. Also, it can be seen from the analytical expressions derived in these works that the pathloss, which is determined by the users' locations, plays an important role in MSE performance evaluation. As such, the above works cannot analytically answer how the randomness of the users' locations would affect MSE performance under pilot contamination. In contrast to existing studies, we approach the problem by using concepts from stochastic geometry. By assuming that the interfering users are distributed according to a homogeneous Poisson point process (PPP), we derive analytical expressions for the MSE of LS and LMMSE based channel estimation algorithms in the presence of both AWGN and pilot contamination. The analytical results are validated by simulations. The results clearly show the dependence of the MSE performance on important massive MIMO network parameters, such as pathloss and user density, and give clues on how to mitigate the effect of pilot contamination. It is shown that increasing the number of pilots does not improve the estimation performance in the presence of pilot contamination. Moreover, the dependence of MSE on antenna spatial correlations suggests that the massive antenna array structure could be optimized to slightly improve the estimation performance under pilot contamination.

The remainder of the paper is organized as follows. Section II describes the system and spatial channel correlation model. In Section III, we present the MMSE and LS based channel estimation in the presence of AWGN only and discuss their limitations for massive MIMO. The proposed distributed LMMSE algorithm is presented in Section IV. To enhance estimation performance, the data-aided approach is considered in Section V. Section VI describes the effect of pilot contamination on channel estimation, and the expression for interference correlation is presented. Based on this, the MSE expressions for different algorithms are derived under AWGN and pilot contamination. Simulation results are presented in Section VII and finally we conclude in Section VIII.

A. Notations

We use lower case letters $x$ and lower case boldface letters $\mathbf{x}$ to represent scalars and (column) vectors respectively. Matrices are denoted by upper case boldface letters $\mathbf{X}$, whereas the calligraphic notation $\mathcal{X}$ is reserved for vectors in the frequency domain. The $i$th entry of $\mathbf{x}$ is represented by $x(i)$, the element of $\mathbf{X}$ in the $i$th row and $j$th column is denoted by $x_{i,j}$ and the vector $\mathbf{x}_k$ represents the $k$th column of $\mathbf{X}$. We use $\mathbf{x}(\mathcal{P})$ to denote a vector formed by selecting the entries of $\mathbf{x}$ indexed by the set $\mathcal{P}$ and $\mathbf{X}(\mathcal{P})$ to denote a matrix formed by selecting the rows of $\mathbf{X}$ indexed by $\mathcal{P}$. We also use $\mathbf{X}_{ij}$ to refer to the $(i,j)$th block entry of a block matrix. Further, $(\cdot)^T$, $(\cdot)^*$ and $(\cdot)^H$ represent the transpose, conjugate and conjugate transpose (Hermitian) operations respectively. We use $\mathrm{diag}(\mathbf{x})$ to transform a vector $\mathbf{x}$ into a diagonal matrix with the entries of $\mathbf{x}$ spread along the diagonal. $\langle \hat{\mathcal{X}}(k) \rangle$ denotes the hard decoding, i.e., the maximum likelihood (ML) decision of $\hat{\mathcal{X}}(k)$. $E\{\cdot\}$ represents the statistical expectation. The discrete Fourier transform (DFT) and inverse DFT (IDFT) matrices are represented by $\mathbf{F}$ and $\mathbf{F}^H$ respectively, where the $(l,k)$th entry of $\mathbf{F}$ is defined as $f_{l,k} = N^{-1/2} e^{-\mathrm{j}2\pi lk/N}$, $l,k = 0, 1, 2, \cdots, N-1$, with $\mathrm{j} = \sqrt{-1}$, for an $N$-dimensional Fourier transform. Finally, the weighted norm of a vector $\mathbf{x}$ is given by $\|\mathbf{x}\|_{\mathbf{A}}^2 \triangleq \mathbf{x}^H \mathbf{A} \mathbf{x}$.

II. SYSTEM MODEL

We consider a multi-cell massive MIMO-OFDM wireless system as shown in Fig. 1, where the BS in each cell is equipped with a uniform planar array (UPA) consisting of a large number of antennas. Moreover, we assume that each BS serves a number of single antenna user terminals. The antennas on the UPAs are distributed across $M$ rows and $G$ columns with horizontal and vertical spacing of $d_x$ and $d_y$ respectively. We define the $(m,g)$th antenna as the antenna element in the $m$th row and $g$th column, which corresponds to the antenna index $r = m + M(g-1)$, where $1 \le m \le M$, $1 \le g \le G$ and $1 \le r \le R$, with $R = MG$ the total number of antennas in a UPA. Fig. 2 shows an example of an $M \times G$ UPA structure with antenna indexing. Note that, depending on the values of $G$ and $M$, the antennas could have a linear or a rectangular configuration. We, however, confine our attention to the rectangular UPA structure, which is a viable configuration in deployment scenarios for massive MIMO [9].

Each user communicates with the BS using OFDM and transmits uplink pilots for channel estimation. We assume that all users in a particular cell are assigned orthogonal frequency tones so that there is no intra-cell interference. However, due to the necessary reuse of pilots, there are users in the neighboring cells that transmit pilots at the same frequency tones, resulting in inter-cell interference or pilot contamination. Since only the user in a particular cell of interest will experience
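As an illustration of the antenna indexing convention $r = m + M(g-1)$ introduced in the system model, the following Python sketch maps between (row, column) positions and linear indices. The function names are ours and serve only as an illustration; the 3 × 4 array size is an assumption chosen to match the example array of Fig. 3(b).

```python
def antenna_index(m, g, M):
    """Linear index r = m + M*(g - 1) of the antenna in row m, column g (1-based)."""
    return m + M * (g - 1)


def antenna_position(r, M):
    """Inverse mapping: (row m, column g) of the antenna with linear index r."""
    g0, m0 = divmod(r - 1, M)  # zero-based column and row
    return m0 + 1, g0 + 1


M, G = 3, 4  # illustrative array size (assumption)
grid = [[antenna_index(m, g, M) for g in range(1, G + 1)] for m in range(1, M + 1)]
for row in grid:
    print(row)
```

With this column-wise numbering, vertically adjacent antennas differ by 1 in index and horizontally adjacent antennas differ by M.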
[Figure 1: Multi-cell massive MIMO system layout. Three cells (Cell 1, Cell 2, Cell 3), each serving user equipments UE 1 to UE 3 with a massive UPA antenna array; pilot contamination arrives from users in neighboring cells.]

[Figure 2: An example of M × G UPA structure with antenna indexing. Linear indices of antennas run 1, 2, ..., M down the first column, M+1, M+2, ... down the second column, up to GM; a central antenna r has left, right, up and down neighbors rL, rR, rU and rD; element spacings are dx and dy.]

interference from the users of neighboring cells that share pilots at the same frequency tones, without loss of generality it suffices to consider one user per cell with all users transmitting pilots at the same OFDM frequency tones.

A. Channel Model

In the discussion that follows, we assume that there is no inter-cell interference and thus focus on a single-cell single-user scenario (the multi-cell case will be treated in Section VI). Further, we assume a multi-path channel between the user and receive antenna $r$ modeled by a Gaussian $L$-tap CIR vector. Specifically, the channel between the user and antenna $r$ is defined by $\mathbf{h}_r \triangleq [h_r(0), h_r(1), \cdots, h_r(L-1)]^T$, where $h_r(l) \in \mathbb{C}$ represents the $l$th tap complex channel gain. We append all the CIR vectors from a user to the $R$ antennas of the BS to form an $RL$ dimensional composite channel vector $\mathbf{h} \triangleq [\mathbf{h}_1^T, \mathbf{h}_2^T, \cdots, \mathbf{h}_R^T]^T$. Further, we collect the $l$th tap of all transmit-receive pairs to form an $R$ dimensional $l$th tap vector $\mathbf{h}^{(l)} \triangleq [h_1(l), h_2(l), \cdots, h_R(l)]^T$. Then, the $RL \times RL$ dimensional composite channel correlation matrix can be written as,

$$\mathbf{R}_h \triangleq E\{\mathbf{h}\mathbf{h}^H\} = \mathbf{R}_{array} \otimes \mathbf{R}_{tap}, \tag{1}$$

which is the Kronecker product ($\otimes$) of two components: (i) the $R \times R$ dimensional antenna spatial correlation matrix, $\mathbf{R}_{array} = E\{\mathbf{h}^{(l)} \mathbf{h}^{(l)H}\}$, $\forall l = 0, 1, \cdots, L-1$, which represents the correlation among the $l$th taps across the array and (ii) the $L \times L$ dimensional channel tap correlation matrix, $\mathbf{R}_{tap} = E\{\mathbf{h}_r \mathbf{h}_r^H\}$, $\forall r = 1, 2, \cdots, R$, which represents the correlation among the CIR taps that depends on the channel power delay profile (PDP). As manifested by (1), $\mathbf{R}_{array}$ is assumed to be identical across the $L$ taps while $\mathbf{R}_{tap}$ is assumed to be identical across the array. For the spatial correlation matrix $\mathbf{R}_{array}$, we adopt a ray-based 3D channel model from [30] which is more appropriate for rectangular arrays. Accordingly, the spatial correlation between array elements $r = (m,g)$ and $r' = (p,q)$ is given by,

$$[\mathbf{R}_{array}]_{r,r'} = \frac{D_1}{\sqrt{D_5}} \, e^{-\frac{D_7 + (D_2 \sin(\phi)\sigma)^2}{2 D_5}} \, e^{\mathrm{j}\frac{D_2 D_6}{D_5}}, \tag{2}$$

where the $D_i$'s are defined as,

$$D_1 = e^{\mathrm{j}\frac{2\pi d_x}{\nu}(p-m)\cos(\theta)} \, e^{-\frac{1}{2}\left(\xi \frac{2\pi d_x}{\nu}\right)^2 (p-m)^2 \sin^2\theta},$$
$$D_2 = \frac{2\pi d_x}{\nu}(q-g)\sin(\theta),$$
$$D_3 = \xi \frac{2\pi d_x}{\nu}(q-g)\cos(\theta),$$
$$D_4 = \frac{1}{2}\left(\xi \frac{2\pi}{\nu}\right)^2 (p-m)(q-g)\sin(2\theta),$$
$$D_5 = (D_3)^2 (\sin(\phi)\sigma)^2 + 1,$$
$$D_6 = D_4 (\sin(\phi)\sigma)^2 + \cos(\phi),$$
$$D_7 = (D_3)^2 \cos^2\phi - (D_4)^2 (\sin(\phi)\sigma)^2 - 2 D_4 \cos\phi.$$

Here, $\nu$ is the carrier-frequency wavelength in meters, $\phi$ and $\theta$ are the mean horizontal angle-of-departure (AoD) and the mean vertical AoD in radians respectively, and $\sigma$ and $\xi$ are the standard deviation of the horizontal AoD and the standard deviation of the vertical AoD respectively. As shown in [30], the spatial correlation matrix can be well approximated as,

$$\mathbf{R}_{array} \approx \mathbf{R}_{az} \otimes \mathbf{R}_{el}, \tag{3}$$

where $\mathbf{R}_{az}$ and $\mathbf{R}_{el}$ are the correlation matrices in the azimuth (horizontal) and elevation (vertical) directions, having dimensions $(G \times G)$ and $(M \times M)$ respectively, consistent with the index ranges of their entries, and are defined as,

$$[\mathbf{R}_{el}]_{m,p} = e^{\mathrm{j}\frac{2\pi d_x}{\nu}(p-m)\cos(\theta)} \, e^{-\frac{1}{2}\left(\xi \frac{2\pi d_x}{\nu}\right)^2 (p-m)^2 \sin^2\theta},$$
$$[\mathbf{R}_{az}]_{g,q} = \frac{1}{\sqrt{D_5}} \, e^{-\frac{D_3^2 \cos^2\phi}{2 D_5}} \, e^{\mathrm{j}\frac{D_2 \cos\phi}{D_5}} \, e^{-\frac{1}{2}\frac{(D_2 \sigma)^2}{D_5}}.$$

B. Signal model

We assume that there are $N$ OFDM sub-carriers and let $\mathcal{X}$ represent the $N$-dimensional information symbol whose entries are drawn from a bi-dimensional constellation, e.g., Q-QAM. The equivalent time-domain symbol is obtained by taking the inverse Fourier transform, i.e., $\mathbf{x} = \mathbf{F}^H \mathcal{X}$. The time-domain symbol is then transmitted after inserting a cyclic prefix (CP) of length at least $L-1$ to avoid inter-symbol interference (ISI). After removing the CP at the receiver, the frequency-domain OFDM symbol at the $r$th antenna can be represented as,

$$\mathcal{Y}_r = \mathrm{diag}(\mathcal{X})\,\mathcal{H}_r + \mathcal{W}_r, \tag{4}$$

where $\mathcal{W}_r$ is the frequency domain AWGN vector of zero mean and covariance $\mathbf{R}_w = \sigma_w^2 \mathbf{I}_N$ and $\mathcal{H}_r$ is the channel frequency
response between the user and receive antenna $r$, i.e.,

$$\mathcal{H}_r = \sqrt{N}\,\mathbf{F} \begin{bmatrix} \mathbf{h}_r \\ \mathbf{0}_{(N-L)\times 1} \end{bmatrix} = \sqrt{N}\,\underline{\mathbf{F}}\,\mathbf{h}_r. \tag{5}$$

Here $\underline{\mathbf{F}}$ is the truncated Fourier matrix formed by selecting the first $L$ columns of $\mathbf{F}$. Using (5), we can re-write (4) as,

$$\mathcal{Y}_r = \sqrt{N}\,\mathrm{diag}(\mathcal{X})\,\underline{\mathbf{F}}\,\mathbf{h}_r + \mathcal{W}_r = \mathbf{A}\mathbf{h}_r + \mathcal{W}_r, \tag{6}$$

where $\mathbf{A} \triangleq \sqrt{N}\,\mathrm{diag}(\mathcal{X})\,\underline{\mathbf{F}}$ and the noise vector $\mathcal{W}_r$ is assumed to be uncorrelated with the channel vector $\mathbf{h}_r$. We assume that $K$ sub-carriers are reserved for pilots and the remaining $N-K$ for data transmission. Further, it is best to allocate the pilots uniformly as shown in [31]. Hence, for a set of pilot indices denoted by the vector $\mathcal{P}$, the system equation (6) reduces to,

$$\mathcal{Y}_r(\mathcal{P}) = \mathbf{A}(\mathcal{P})\mathbf{h}_r + \mathcal{W}_r(\mathcal{P}), \tag{7}$$

where $\mathcal{Y}_r(\mathcal{P})$ and $\mathcal{W}_r(\mathcal{P})$ are formed by selecting the entries of $\mathcal{Y}_r$ and $\mathcal{W}_r$ indexed by $\mathcal{P}$ while $\mathbf{A}(\mathcal{P})$ is a $K \times L$ matrix formed by selecting the rows of $\mathbf{A}$ indexed by $\mathcal{P}$.

We can now collect the pilot measurements (7) received by all antennas into a single system of equations as follows,

$$\mathcal{Y}(\mathcal{P}) = [\mathbf{I}_R \otimes \mathbf{A}(\mathcal{P})]\,\mathbf{h} + \mathcal{W}(\mathcal{P}), \tag{8}$$

where $\mathcal{Y}(\mathcal{P}) = [\mathcal{Y}_1^T(\mathcal{P}), \cdots, \mathcal{Y}_R^T(\mathcal{P})]^T$, $\mathcal{W}(\mathcal{P}) = [\mathcal{W}_1^T(\mathcal{P}), \cdots, \mathcal{W}_R^T(\mathcal{P})]^T$, $\mathbf{I}_R$ represents an $R \times R$ identity matrix and $\mathbf{h}$, as defined earlier, represents the composite channel vector from the user to the BS. For convenience, we assume the noise variance to be identical across the array so that $\mathcal{W}(\mathcal{P}) \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_w = \sigma_w^2 \mathbf{I}_{RK})$. Note that the number of unknown channel coefficients in (8) is $RL$ whereas the total number of equations is $RK$. Therefore, a necessary condition to solve (8) for $\mathbf{h}$ (and also (7) for $\mathbf{h}_r$) using least squares is that the number of pilots be at least equal to $L$, i.e., $K \ge L$. However, $K$ could be reduced if we utilize the correlation information. With the models defined above, we are ready to estimate the CIRs between the user and each BS antenna. We pursue different approaches that can be adopted for channel estimation in a massive MIMO setup depending on whether the information processing takes place independently at each antenna element or jointly at a centralized processor. We start with naive LMMSE and LS based techniques and discuss their limitations, and then propose a new distributed approach in Section IV which is further extended in Section V with the help of a data-aided approach.

III. LMMSE AND LS BASED CHANNEL ESTIMATION

In this section, we present three different techniques for channel estimation in massive MIMO-OFDM based on the well-known LMMSE and LS estimators and discuss their limitations. For now, we assume that the estimates are corrupted only by white noise. Hence, without loss of generality, we consider a single-cell single-user scenario for the approaches presented below.

A. The Localized LMMSE (L-LMMSE) estimation

In this approach, all CIRs are estimated independently based on the observations received at each antenna element by using classical LMMSE estimation. Using the linear system model in (7), the LMMSE estimate of $\mathbf{h}_r$ is obtained by minimizing the (local) MSE, $E\{\|\mathbf{h}_r - \hat{\mathbf{h}}_r\|^2\}$, over $\hat{\mathbf{h}}_r$ as follows [32]

$$\hat{\mathbf{h}}_r = \left(\mathbf{R}_{tap}^{-1} + \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1} \mathbf{A}^H \mathbf{R}_w^{-1} \mathcal{Y}_r, \tag{9}$$

where we drop the index vector $\mathcal{P}$ for convenience. Similarly, it follows that the (minimum) MSE is,

$$\mathrm{mse}_r = \mathrm{trace}\left\{\left(\mathbf{R}_{tap}^{-1} + \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1}\right\}. \tag{10}$$

The overall global MSE is obtained by taking the summation over all array elements, i.e., $\mathrm{MSE}^{(L)} = \sum_{r=1}^{R} \mathrm{mse}_r$, which after simplifying (10), can be expressed as,

$$\mathrm{MSE}^{(L)} = R \sum_{i=1}^{L} \frac{\delta_i}{1 + \rho K \delta_i}, \tag{11}$$

where $\{\delta_i\}_{i=1}^{L}$ are the eigenvalues of $\mathbf{R}_{tap}$, $\rho \triangleq E_x/\sigma_w^2$ is the SNR with $E_x$ representing the average signal energy per symbol and the superscript $(L)$ indicates L-LMMSE. Observe from (11) that the channel delay spread $L$ has an adverse effect on MSE performance, which can be reduced by increasing the number of pilot tones. The computational complexity of L-LMMSE is of the order $O(RL^3)$ (see Table I), which increases linearly with the number of BS antennas. However, the CIR estimates are not optimal in the sense of minimizing the overall or global MSE. The estimates would have been optimal had the antennas been placed sufficiently far apart so that the channel vectors were effectively uncorrelated. But for massive MIMO with an extremely large number of antennas, it is expected that antennas are located in close proximity, so the channel vectors are highly likely to be correlated with each other.

B. The Optimal LMMSE (O-LMMSE) Solution

In this strategy all the channel vectors are estimated simultaneously by minimizing the global MSE, $E\{\|\mathbf{h} - \hat{\mathbf{h}}\|^2\}$, over the composite channel vector $\hat{\mathbf{h}}$. This could be realized by sending all observations to a central processor and then invoking LMMSE estimation based on the composite system model in (8). The solution to this problem is given by,

$$\hat{\mathbf{h}} = \left(\mathbf{R}_h^{-1} + \acute{\mathbf{A}}^H \mathbf{R}_w^{-1} \acute{\mathbf{A}}\right)^{-1} \acute{\mathbf{A}}^H \mathbf{R}_w^{-1} \mathcal{Y}, \tag{12}$$

where $\acute{\mathbf{A}} = \mathbf{I}_R \otimes \mathbf{A}$, $\mathbf{R}_h$ is as given in (1) and for notational convenience we dropped the index $\mathcal{P}$. The corresponding MSE is,

$$\mathrm{MSE}^{(O)} = \mathrm{trace}\left\{\left(\mathbf{R}_h^{-1} + \acute{\mathbf{A}}^H \mathbf{R}_w^{-1} \acute{\mathbf{A}}\right)^{-1}\right\}, \tag{13}$$

which can be simplified to yield,

$$\mathrm{MSE}^{(O)} = \sum_{j=1}^{R} \sum_{i=1}^{L} \frac{\eta_j \delta_i}{1 + \rho K \eta_j \delta_i}, \tag{14}$$

where $\eta_j$ and $\delta_i$ are the eigenvalues of $\mathbf{R}_{array}$ and $\mathbf{R}_{tap}$ respectively. By comparing (14) with (11), we conclude that
in the presence of spatial correlation, the optimal solution yields better MSE performance than the localized strategy; however, it has the following two major drawbacks:

1) Realization of the optimal strategy requires global sharing of information to/from the central processor, which results in communication overhead (as it requires complex signalling which can be very expensive).
2) As evident from (12), the computation of the optimal LMMSE requires inverting a non-trivial matrix of very high dimension ($RL \times RL$), which leads to a computational complexity of order $O(R^3 L^3)$, which is cubic in the number of BS antennas.

In a massive MIMO scenario where $R$ is of the order of a few hundred, both of the above mentioned operations are very expensive and possibly impractical.

C. Estimation using Least Squares (LS)

If the channel statistics are unknown, one can employ simple LS based estimation. In the absence of correlation, we can let the inverse of the channel correlation matrix go to zero, i.e., $\mathbf{R}_{tap}^{-1} \to \mathbf{0}$, thereby ignoring the channel statistics. Therefore, the localized LS solution from (9) is,

$$\hat{\mathbf{h}}_r^{ls} = \left(\mathbf{A}^H \mathbf{A}\right)^{-1} \mathbf{A}^H \mathcal{Y}_r, \tag{15}$$

and the resulting MSE is given by,

$$\mathrm{mse}_r^{ls} = \mathrm{trace}\left\{\left(\mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1}\right\}. \tag{16}$$

In this case, the overall MSE simplifies to,

$$\mathrm{MSE}^{(LS)} = \sum_{r=1}^{R} \mathrm{mse}_r^{ls} = \frac{RL}{\rho K}. \tag{17}$$

Comparing (17) with (11), we conclude that LS has poor performance in comparison with the LMMSE as it does not utilize the channel statistics. It is for this reason that the centralized LS (C-LS) solution would achieve the same MSE performance as the localized one, as shown below.

$$\begin{aligned}
\mathrm{MSE}^{(C\text{-}LS)} &= \mathrm{trace}\left\{\left[(\mathbf{I}_R \otimes \mathbf{A})^H (\mathbf{I}_R \otimes \mathbf{R}_w)^{-1} (\mathbf{I}_R \otimes \mathbf{A})\right]^{-1}\right\} \\
&= \mathrm{trace}\left\{\left[\mathbf{I}_R \otimes \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right]^{-1}\right\} \\
&= \sum_{r=1}^{R} \mathrm{trace}\left\{\left(\mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A}\right)^{-1}\right\} \\
&= \mathrm{MSE}^{(LS)},
\end{aligned}$$

where we have used the Kronecker product identities $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = \mathbf{A}\mathbf{C} \otimes \mathbf{B}\mathbf{D}$ and $(\mathbf{A} \otimes \mathbf{B})^{-1} = \mathbf{A}^{-1} \otimes \mathbf{B}^{-1}$.

In short, the L-LMMSE estimation has the advantage of low complexity (and better performance than LS) but it is unable to exploit the strong spatial correlation among antenna elements which is inevitable in massive MIMO systems. On the other hand, O-LMMSE exploits the spatial correlations but at a significantly higher computational cost. This motivates us to propose a method that can overcome the shortcomings of the aforementioned techniques without affecting the estimation quality. Specifically, we propose a distributed estimation of CIRs based on antenna coordination that attains near optimal performance with tractable complexity. The proposed distributed LMMSE estimation is described below and is further extended in Section V via a data-aided technique.

IV. THE PROPOSED DISTRIBUTED LMMSE (D-LMMSE) ESTIMATION

It is well known from equivalence results in linear estimation theory [33] that the O-LMMSE solution (12) could be alternatively obtained by solving an $RL$ dimensional optimization problem,

$$\underset{\mathbf{h}}{\mathrm{argmin}} \; \|\mathcal{Y} - \acute{\mathbf{A}}\mathbf{h}\|_{\mathbf{R}_w^{-1}}^2 + \|\mathbf{h}\|_{\mathbf{R}_h^{-1}}^2, \tag{18}$$

where all the variables are as defined earlier. Instead of solving (18) globally (as done earlier), we aim to solve it in a distributed manner over $R$ antennas in which the $r$th antenna has access to $\mathcal{Y}_r$ only. Moreover, antenna $r$ is interested only in determining its own CIR (i.e., $\mathbf{h}_r$) without worrying about the other $\mathbf{h}_j$'s. Here, we would like to mention that this problem is fundamentally different from those considered in the context of adaptive networks [34]. Also, most of the existing distributed estimation techniques in adaptive networks deal with single task problems in which all nodes in the network estimate a single common parameter of interest. Furthermore, they rely on full cooperation between the nodes, i.e., exchanging both the estimates and the observations with the neighbors. Although the distributed recursive least squares (RLS) algorithm of [35] might be adopted to solve (18), it would be highly complex in the number of dimensions (due to the large $R$ in massive MIMO and the large channel delay spread) and hence might suffer from convergence issues. Our proposed solution, the distributed LMMSE (D-LMMSE) algorithm, as will become clear, is much simpler in that it exploits the structure of the spatial correlation matrix $\mathbf{R}_{array}$ and relies only on exchanging the (partial) weighted estimates of CIRs with immediate neighbors, thus reducing the communication and computational cost significantly. The proposed D-LMMSE algorithm is composed of three main steps, namely estimation, sharing and update, as explained below.

A. Estimation

In the estimation step, each antenna, acting as a center antenna $r_C$, estimates not only its own CIR but also the CIRs of its neighborhood. The neighborhood of $r_C$ consists of 4 direct neighbors represented by the set $\mathcal{N} = \{r_L, r_R, r_U, r_D\}$¹ on the left, right, top and bottom positions respectively as shown in Fig. 3(a). Also, let the corresponding channel vectors be represented by $\mathbf{h}_C$, $\mathbf{h}_L$, $\mathbf{h}_R$, $\mathbf{h}_U$ and $\mathbf{h}_D$ respectively and let $\mathbf{h}^c$ represent the $|\mathcal{N}^+|L \times 1$ dimensional composite channel vector of the central antenna and its $|\mathcal{N}|$ direct neighbors (i.e., $\mathbf{h}^c = [\mathbf{h}_C^T, \mathbf{h}_L^T, \mathbf{h}_R^T, \mathbf{h}_U^T, \mathbf{h}_D^T]^T$). During the estimation process, each antenna acting as a central antenna computes the estimate of $\mathbf{h}^c$ by solving a reduced dimensional weighted least squares

¹ Note that for elements lying at the edges of a UPA, the number of neighbors is different, so that $2 \le |\mathcal{N}| \le 4$. The set of neighbors including the central antenna is represented by $\mathcal{N}^+$.
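The neighborhood structure, including the edge cases mentioned in footnote 1, can be sketched as follows. This is an illustrative helper and not part of the original algorithm description: the function name is ours, and we assume the column-wise indexing $r = m + M(g-1)$ of Section II with row indices increasing downwards, as suggested by Fig. 3(b).

```python
def neighbors(r, M, G):
    """Direct neighbors of antenna r in an M x G UPA, keyed by position.

    Keys are "L", "R", "U", "D" (left, right, up, down). Antennas on the
    edges of the array simply have fewer entries.
    """
    g0, m0 = divmod(r - 1, M)  # zero-based column and row of antenna r
    found = {}
    if g0 > 0:
        found["L"] = r - M      # same row, previous column
    if g0 < G - 1:
        found["R"] = r + M      # same row, next column
    if m0 > 0:
        found["U"] = r - 1      # previous row, same column
    if m0 < M - 1:
        found["D"] = r + 1      # next row, same column
    return found


M, G = 3, 4  # array size of the example in Fig. 3(b)
for r in (1, 5, 12):
    print(r, neighbors(r, M, G))
```

For the 3 × 4 array, the corner antenna 1 has the two neighbors 4 (right) and 2 (down), while the interior antenna 5 has all four.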
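To make the comparison of the closed-form expressions (11), (14) and (17) concrete, they can be evaluated numerically. The sketch below is only illustrative: the function names and the eigenvalue profiles ($\delta_i$ for $\mathbf{R}_{tap}$, $\eta_j$ for $\mathbf{R}_{array}$) are hypothetical values chosen by us, with the $\eta_j$ normalized to sum to $R$.

```python
def mse_local(delta, R, rho, K):
    """Eq. (11): L-LMMSE, summed over R antennas."""
    return R * sum(d / (1 + rho * K * d) for d in delta)


def mse_optimal(eta, delta, rho, K):
    """Eq. (14): O-LMMSE, using eigenvalues of R_array and R_tap."""
    return sum(e * d / (1 + rho * K * e * d) for e in eta for d in delta)


def mse_ls(R, L, rho, K):
    """Eq. (17): least squares, which ignores the channel statistics."""
    return R * L / (rho * K)


# Hypothetical parameters (assumptions, not taken from the paper).
delta = [0.5, 0.3, 0.15, 0.05]        # eigenvalues of R_tap, L = 4
eta_uncorrelated = [1.0, 1.0, 1.0, 1.0]  # R_array = I, R = 4
eta_correlated = [2.5, 1.0, 0.4, 0.1]    # same sum, unequal spread
rho, K = 10.0, 8
R, L = len(eta_correlated), len(delta)

print("LS       :", mse_ls(R, L, rho, K))
print("L-LMMSE  :", mse_local(delta, R, rho, K))
print("O-LMMSE  :", mse_optimal(eta_correlated, delta, rho, K))
```

When all $\eta_j = 1$ (no spatial correlation), (14) reduces to (11). Any unequal $\eta_j$ with the same sum gives a smaller value, because $x/(1 + \rho K x)$ is concave in $x$.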
(WLS) optimization problem,

$$\hat{\mathbf{h}}^c = \underset{\mathbf{h}^c}{\mathrm{argmin}} \; \|\mathcal{Y}_C(\mathcal{P}) - \mathbf{A}(\mathcal{P})\mathbf{h}_C\|_{\mathbf{R}_w^{-1}}^2 + \|\mathbf{h}^c\|_{\mathbf{R}_{h^c}^{-1}}^2, \tag{19}$$

where $\mathcal{Y}_C(\mathcal{P})$ represents the pilot observations at the central antenna, $\mathbf{R}_{h^c}$ is the channel correlation matrix defined as $\mathbf{R}_{h^c} \triangleq E\{\mathbf{h}^c (\mathbf{h}^c)^H\}$ and $\mathbf{R}_w = \sigma_w^2 \mathbf{I}_K$ is the noise covariance matrix at the central antenna. From (19) it is clear that information is processed locally at each antenna, as each antenna uses only its own observations and interacts with its neighborhood only through $\mathbf{R}_{h^c}$ (it is assumed that the central antenna has the correlation information of its neighborhood available to construct $\mathbf{R}_{h^c}$). The solution to the above WLS minimization problem can be obtained by first re-writing (19) explicitly in terms of $\mathbf{h}^c$ as,

$$\hat{\mathbf{h}}^c = \underset{\mathbf{h}^c}{\mathrm{argmin}} \; \|\bar{\mathcal{Y}} - \bar{\mathbf{A}}\mathbf{h}^c\|_{\mathbf{R}_w^{-1}}^2 + \|\mathbf{h}^c\|_{\mathbf{R}_{h^c}^{-1}}^2, \tag{20}$$

where $\bar{\mathcal{Y}} = \mathcal{Y}_C(\mathcal{P})$ and $\bar{\mathbf{A}} = [\mathbf{A}(\mathcal{P}) \;\; \mathbf{0}_{K \times L|\mathcal{N}|}]$. Then, by invoking the equivalence between LMMSE and WLS estimation problems we obtain,

$$\hat{\mathbf{h}}^c = \left(\mathbf{R}_{h^c}^{-1} + \bar{\mathbf{A}}^H \mathbf{R}_w^{-1} \bar{\mathbf{A}}\right)^{-1} \bar{\mathbf{A}}^H \mathbf{R}_w^{-1} \bar{\mathcal{Y}}. \tag{21}$$

We define $\mathbf{P}^c \triangleq (\mathbf{C}_e^c)^{-1}$ as the inverse of the error covariance matrix at the central element, which is given by the expression,

$$\mathbf{P}^c = \mathbf{R}_{h^c}^{-1} + \bar{\mathbf{A}}^H \mathbf{R}_w^{-1} \bar{\mathbf{A}}. \tag{22}$$

Then, by substituting (22) into (21), the weighted estimate of the composite channel at each antenna is simply,

$$\hat{\mathbf{h}}_w^c = \mathbf{P}^c \hat{\mathbf{h}}^c = \bar{\mathbf{A}}^H \mathbf{R}_w^{-1} \bar{\mathcal{Y}}. \tag{23}$$

This weighting of the estimates ensures that we put more confidence in the estimates which are more reliable and vice versa. The estimation step is non-recursive and is computed once for all antennas in the array. Thus, having found the $\mathbf{P}$ matrices in (22) and the weighted estimates in (23), each antenna is ready to initiate sharing.

B. Sharing

The sharing step is the key to the proposed distributed algorithm, where the information is shared through collaboration between antennas. Let us define the sub-vector $\hat{\mathbf{h}}_{w_j}$ of the composite vector $\hat{\mathbf{h}}_w^k$ as a (weighted) CIR estimate of antenna $j$ (i.e. the vector $\hat{\mathbf{h}}_j$) computed by antenna $k$. In the sharing step, each antenna acting as a central element shares only partial information with its neighbors such that antenna $k$, with composite vector $\hat{\mathbf{h}}_w^k$, would share only the selected components, namely its own (weighted) estimate $\hat{\mathbf{h}}_{w_k}$ and the (weighted) estimate $\hat{\mathbf{h}}_{w_j}$, $j \in \mathcal{N}$, with its $j$th neighbor. Henceforth, the shared vectors will be termed partial vectors and represented by an underlined notation. An example of how this sharing takes place is also depicted in Fig. 3(b) for a $3 \times 4$ array with central element $r_C = 1$ having only two neighbors, $\mathcal{N} = \{r_R = 4, r_D = 2\}$. As shown, each of the neighboring elements shares only two sub-vectors (i.e., partial information) of its composite vector with the central antenna. The collaboration between the rest of the array elements takes place in a similar fashion.

As a result of information sharing, each antenna acting as a central node $r_C$ receives $|\mathcal{N}|$ partial vectors, $\underline{\hat{\mathbf{h}}}_w^j$, $j \in \mathcal{N}$, from its neighbors, each of dimension $|\mathcal{N}^+|L \times 1$ and having only two non-zero components, $\hat{\mathbf{h}}_{w_j}$ and $\hat{\mathbf{h}}_{w_c}$. For the example in Fig. 3(b), the composite vector of the central node and the partial vectors received from its neighbors are given as follows,

$$\hat{\mathbf{h}}_w^1 = \begin{bmatrix} \hat{\mathbf{h}}_{w_1} \\ \hat{\mathbf{h}}_{w_4} \\ \hat{\mathbf{h}}_{w_2} \end{bmatrix}, \quad \underline{\hat{\mathbf{h}}}_w^4 = \begin{bmatrix} \hat{\mathbf{h}}_{w_1} \\ \hat{\mathbf{h}}_{w_4} \\ \mathbf{0} \end{bmatrix} \quad \text{and} \quad \underline{\hat{\mathbf{h}}}_w^2 = \begin{bmatrix} \hat{\mathbf{h}}_{w_1} \\ \mathbf{0} \\ \hat{\mathbf{h}}_{w_2} \end{bmatrix}. \tag{24}$$

Note that the estimates which are not shared have been assigned as null vectors.

C. Update

Having received the (partial) LMMSE estimates from the neighboring elements, each antenna acting as the central element updates its estimate and error covariance matrix. The update rule is based on the optimal combining of estimators, a standard result in LMMSE estimation theory. The result is summarized in the following lemma.

Lemma 1. Let $\mathbf{y}_1$ and $\mathbf{y}_2$ be two separate observations of a zero mean random vector $\mathbf{h}$, such that $\mathbf{y}_1 = \mathbf{A}_1 \mathbf{h} + \mathbf{w}_1$ and $\mathbf{y}_2 = \mathbf{A}_2 \mathbf{h} + \mathbf{w}_2$, where we assume that $\mathbf{h}$ is uncorrelated with both $\mathbf{w}_1$ and $\mathbf{w}_2$. Let $\hat{\mathbf{h}}_1$ and $\hat{\mathbf{h}}_2$ denote the LMMSE estimates of $\mathbf{h}$ and $\mathbf{C}_1$ and $\mathbf{C}_2$ be the corresponding error covariance matrices in the two experiments. Then, the optimal LMMSE estimator and the error covariance matrix of $\mathbf{h}$ given both the observations are,

$$\mathbf{C}^{-1} \hat{\mathbf{h}} = \mathbf{C}_1^{-1} \hat{\mathbf{h}}_1 + \mathbf{C}_2^{-1} \hat{\mathbf{h}}_2, \tag{25}$$

and

$$\mathbf{C}^{-1} = \mathbf{C}_1^{-1} + \mathbf{C}_2^{-1} + \mathbf{R}_h^{-1} - \mathbf{R}_1^{-1} - \mathbf{R}_2^{-1}, \tag{26}$$

where $\mathbf{R}_h = E\{\mathbf{h}\mathbf{h}^H\}$ and $\mathbf{R}_1$ and $\mathbf{R}_2$ are the covariance matrices of $\mathbf{h}$ in the two experiments.

Proof: See [33].

The aforementioned lemma can be easily extended to more than two observations. The lemma suggests an optimal way of combining the individual estimates obtained from independent observations. We use this lemma at each antenna to improve the initial channel estimate by combining it with the estimates computed and shared by the $|\mathcal{N}|$ neighbors. Consequently, by treating each antenna as a central element $r_C$, the update rule is given by the following equations,

$$\hat{\mathbf{h}}_w^{c(i)} = \hat{\mathbf{h}}_w^{c(i-1)} + \sum_{j \in \mathcal{N}} \underline{\hat{\mathbf{h}}}_w^{j(i-1)}, \tag{27}$$

and

$$\mathbf{P}^{c(i)} = \mathbf{P}^{c(i-1)} + \sum_{j \in \mathcal{N}} \left(\mathbf{P}^{j(i-1)} - \mathbf{R}_{h^j}^{-1}\right), \tag{28}$$

where $\mathbf{P}^j$ and $\mathbf{R}_{h^j}$ represent the partial (inverse) error covariance and correlation matrices associated with the partial estimates $\underline{\hat{\mathbf{h}}}_w^j$ and $i$ represents the iteration index. Note that in the update equations, we employed the weighted
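The bookkeeping behind the partial vectors in (24) can be sketched in a few lines of Python. This is a hypothetical helper of our own (the names and the data layout are assumptions, not the authors' implementation): each antenna stores its weighted sub-vectors in a dictionary keyed by antenna index, and what it shares is arranged in the receiver's block order, with unshared blocks set to null vectors.

```python
def partial_vector(sender_blocks, sender, receiver, receiver_order):
    """Blocks that `sender` shares with `receiver`, in the receiver's block order.

    sender_blocks maps an antenna index to the weighted sub-vector (a list of
    L taps) that the sender computed for that antenna. Only the sender's own
    block and the receiver's block are shared; the rest become null vectors.
    """
    L = len(sender_blocks[sender])
    shared = {sender, receiver}
    return [
        list(sender_blocks[k]) if k in shared else [0.0] * L
        for k in receiver_order
    ]


# Example of Fig. 3(b): central antenna 1 with neighbors 4 (right) and 2 (down).
# The numeric tap values are placeholders with L = 2.
blocks_at_4 = {4: [0.4, 0.1], 1: [0.9, 0.2], 7: [0.3, 0.0], 5: [0.5, 0.1]}
blocks_at_2 = {2: [0.6, 0.2], 1: [0.8, 0.3], 5: [0.4, 0.1], 3: [0.2, 0.0]}
order_at_1 = [1, 4, 2]  # central antenna first, then its neighbors

from_4 = partial_vector(blocks_at_4, sender=4, receiver=1, receiver_order=order_at_1)
from_2 = partial_vector(blocks_at_2, sender=2, receiver=1, receiver_order=order_at_1)
print(from_4)
print(from_2)
```

The two results have the same zero pattern as the partial vectors in (24): the block for antenna 2 is null in the vector received from antenna 4, and the block for antenna 4 is null in the vector received from antenna 2.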
7
 
ĥ 4
rA rB rE rF rG  
 ĥ1 
ĥ4 =  
 ĥ7 
 
ĥ5 ĥ7
1   4 7
ĥ ĥ4
 1
ĥ1 = 
 ĥ4 

rH rI rU rJ rK ĥ2  
ĥ
 
ĥ  5
 
ĥ  2  ĥ2 
 1 2  ĥ5   
ĥ1 = 
 ĥ4 
 ĥ = 


 ĥ 4 ĥ5 =  ĥ
 8
 ĥ7 ĥ8
 ĥ1   
ĥ2  ĥ4   
ĥ3 ĥ8
ĥ6  
rM rL rC rR rN 
ĥ8 = 

ĥ5 


 ĥ7 
ĥ 5 ĥ9
2 5   8
ĥ2 ĥ
 5
 ĥ2 
5
 
ĥ =  ĥ8 

rO rP rD rQ rS  

ĥ

  
 ĥ4 


ĥ  5 ĥ
 2  ĥ2   6 ĥ 6
 ĥ3 
ĥ5 
   
ĥ2 = 
  ĥ 3 ĥ5 = 
 ĥ8 
 ĥ6 = 
 ĥ 
 ĥ8 ĥ9
 ĥ1   ĥ   9
   4
ĥ3 ĥ5
ĥ6 ĥ 6
 
ĥ3 
 
ĥ6 = 
rT rV rW rX rY  
 ĥ9 
ĥ5 ĥ9
3 6 9
ĥ3 ĥ6
(a) Information diffusion process (b) Information sharing process
Figure 3: (a) During the first iteration rC (blue antenna) receives information from its 4-direct neighbors (pink antennas). In
the second iteration, the information from next nearest neighbors (green antennas) also comes in and so on. (b) An example
of a 3 × 4 antenna array where the neighboring antennas (indices 4 and 2) share the selected estimates (highlighted) with the
central antenna (index 1).
estimates and inverse error covariance matrices to minimize the computational requirements. The recursions in the update equations are initialized by (23) and (22) respectively, which are available after the estimation step. In the subsequent iterations, each antenna would also require the partial matrices $\underline{P}^{j}$ and $\underline{R}_{h_j}$ for each of its $|\mathcal{N}|$ neighbors. Fortunately, they can be obtained from $P^{c}$ and $R_{h_c}$ respectively (which are available at the central antenna) by exploiting the symmetrical structure of $R_{array}$. Thus, there is no need to share them across the neighboring elements, which in turn saves a significant amount of communication. Specifically, the matrices $R_{h_c}$ and $P^{c}$ exhibit the following two properties:2

Property 1: The matrix $R_{h_c}$ is identical for all elements in the neighborhood of $r_C$, i.e., $R_{h_c} = R_{h_j}$, $\forall j \in \mathcal{N}$.
Property 2: The matrix $P^{c}$ is identical for all elements in the neighborhood of $r_C$, i.e., $P^{c} = P^{j}$, $\forall j \in \mathcal{N}$.

2 These properties are generally satisfied as the spatial correlation matrix is usually symmetric; if not, then the antennas can share these matrices as well.

Property 1 is attributed to the symmetric nature of the spatial correlation matrix $R_{array}$, which implies that the spatial correlation between any two antennas placed the same distance apart is the same. Therefore, it is not difficult to see that property 1 holds exactly under the Kronecker model and our earlier assumption of identical tap correlation across the antenna array in section II. Property 2 is the consequence of property 1 when incorporated into (22).
Hence, to obtain the partial correlation matrices $\underline{R}_{h_j}$, $j \in \mathcal{N}$, we use property 1 to first set $\underline{R}_{h_j} = R_{h_c}$ and then modify the off-diagonal block entries corresponding to the null vectors of the partial estimates as $R_{ij} = 0$ if either $\hat{h}_{w_i} = 0$ or $\hat{h}_{w_j} = 0$, and the diagonal block entries as $R_{ii} = I_L$ if $\hat{h}_{w_i} = 0$, where the subscript ij denotes the (i, j)th block. The matrices $\underline{P}^{j}$ are obtained in a similar fashion, except that the diagonal block entry corresponding to null vectors is replaced by $aI$, where $0 < a \ll 1$ is a small positive number that indicates very low weight or confidence in null estimates (those that are not shared). In essence, the central element has the full information needed to construct the $\underline{P}^{j}$ and $\underline{R}_{h_j}$ corresponding to the shared estimates $\underline{\hat{h}}_w^{j}$. We illustrate how these matrices could be obtained for the example in Fig. 3(b). Consider the central antenna $r_C = 1$ and its $|\mathcal{N}| = 2$ direct neighbors with (shared) partial estimates given in (24). The partial correlation and error covariance matrices associated with those estimates (shown underlined), along with those of the central element, are given in (29) and (30) respectively.
$$R_{h_1} = \begin{bmatrix} R_{11} & R_{14} & R_{12} \\ R_{41} & R_{44} & R_{42} \\ R_{21} & R_{24} & R_{22} \end{bmatrix}, \quad \underline{R}_{h_4} = \begin{bmatrix} R_{44} & R_{41} & 0 \\ R_{14} & R_{11} & 0 \\ 0 & 0 & I_L \end{bmatrix} \quad \text{and} \quad \underline{R}_{h_2} = \begin{bmatrix} R_{22} & 0 & R_{21} \\ 0 & I_L & 0 \\ R_{12} & 0 & R_{11} \end{bmatrix} \tag{29}$$
$$P^{1} = \begin{bmatrix} P_{11} & P_{14} & P_{12} \\ P_{41} & P_{44} & P_{42} \\ P_{21} & P_{24} & P_{22} \end{bmatrix}, \quad \underline{P}^{4} = \begin{bmatrix} P_{44} & P_{41} & 0 \\ P_{14} & P_{11} & 0 \\ 0 & 0 & aI \end{bmatrix} \quad \text{and} \quad \underline{P}^{2} = \begin{bmatrix} P_{22} & 0 & P_{21} \\ 0 & aI & 0 \\ P_{12} & 0 & P_{11} \end{bmatrix} \tag{30}$$
Based on the above steps and procedures, the proposed D-LMMSE algorithm is summarized in Algorithm 1.

Remarks:
1) The information sharing and update take place during each iteration of the algorithm such that after a few iterations the information diffuses across the whole antenna array. This concept of sharing is depicted in Fig. 3(a)
which shows that the information diffusion process grows exponentially, resulting in fast convergence.
2) The repetitive sharing enables each antenna in the array to utilize the observations from distant elements, thereby improving its estimate in each iteration until it converges to a near optimal solution.
3) As opposed to the central processing mechanism, the proposed sharing step is more convenient and computationally more efficient, as the antennas do not all communicate with each other. The collaboration takes place only among neighboring antennas. Therefore, the complexity of the proposed algorithm is significantly less than that of the centralized approach.
4) Note that the antennas share only partial information because only selected vectors are transmitted to the neighbors, which saves a significant amount of communication. Also, the estimation step and the repetitive sharing, pre-processing and update steps require simple linear block processing and have a fixed-size data structure, which is well suited for real implementations. In contrast, the memory and processing requirements for the centralized approach are even more challenging with large array dimensions.

Algorithm 1 Distributive LMMSE (D-LMMSE) algorithm
1) (Estimation) Each antenna, acting as a central element $r_C$, computes $\hat{h}_w^{c}$ and $P^{c}$ by using (23) and (22) respectively.
2) (Sharing) Each antenna, acting as a central element $r_C$, shares partial estimates $\underline{\hat{h}}_w^{c}$ with its $|\mathcal{N}|$ neighbors as described in IV-B.
3) (Pre-processing) Using $R_{h_c}$, $P^{c}$ from step 1 and the received (partial) information $\{\underline{\hat{h}}_w^{j}\}_{j=1}^{|\mathcal{N}|}$ in step 2, each antenna, acting as a central element $r_C$, constructs $\{\underline{R}_{h_j}^{-1}\}$, $\{\underline{P}^{j}\}$, $j \in \mathcal{N}$.
4) (Update) Each antenna, acting as a central element $r_C$, updates its weighted estimate and (inverse) error covariance using (27) and (28) respectively.
5) (Iterate) Repeat steps 2-4 D times, where D represents the maximum number of iterations.
6) (Output) Compute $\hat{h}^{c} = (P^{c})^{-1} \hat{h}_w^{c}$ and output the estimated CIR $\hat{h}^{C}$.

D. Complexity Analysis
In Table I, we compare the computational complexity of the proposed D-LMMSE algorithm with LS, L-LMMSE and the centralized O-LMMSE algorithm in terms of multiply and add operations. The figures indicate that the complexity of the proposed algorithm is slightly higher than that of L-LMMSE (while remaining linear in the number of BS antennas) but is significantly less than that of the centralized approach. For the proposed D-LMMSE algorithm, it is also worth mentioning that the P matrices in (22) can be computed off-line and in parallel at all antennas, as they do not depend on the observations. Moreover, the computation of weighted estimates in (23) does not involve any matrix inversion. Further, the update in (27) requires simple addition during each iteration, while (28) needs a one-time computation of the inverses $\underline{R}_{h_j}^{-1}$, as they do not depend on the iteration index. Finally, the inverse $(P^{c})^{-1}$ is required only after convergence, when each antenna outputs its final estimate.

Table I: Computational Complexity
Algorithm | Multiplications (×) | Additions (+) | Complexity
LS | $RK(L+1)$ | $R(KL-1)$ | $O(RLK)$
L-LMMSE | $R[2L^3 + L^2 + K(L+1)]$ | $RL[L+K-1]$ | $O(RL^3)$
O-LMMSE | $R[(L^3+1)R^2 + RL(L+K) + K + L^3]$ | $R^2LK$ | $O(R^3L^3)$
D-LMMSE | $R[(5^3+1)L^3 + 2(5L)^2 + L(K+1) + 5^3]$ | $R[D(5L)^3 + (5L)^2 + L(K-1) - D]$ | $O(R5^3L^3)$

E. Choice of D
The choice of the parameter D, i.e., the maximum number of required iterations, has a great influence on the computational complexity and convergence of the proposed D-LMMSE algorithm. A trivial choice is to set D to the largest dimension of the array, i.e., D = max(M, G), which ensures that each antenna receives information from every other antenna in the array. This choice guarantees the convergence of the proposed D-LMMSE algorithm, but such a high value of D is very inefficient from the computational complexity point of view, particularly if the array dimensions are large. We therefore derive a simple, loose upper bound on the maximum number of iterations D that is much better than the trivial choice. To this end, we first note that the total number of antennas sharing information in D iterations of the algorithm is 2D(D + 1) + 1. Hence, in order to ensure that each antenna receives information from every other antenna in the array, we should have 2D(D + 1) + 1 ≤ R. Solving this inequality we get,
$$D \le \sqrt{\frac{R}{2} - \frac{1}{4}} - \frac{1}{2}. \tag{31}$$
It must be emphasised here that the actual value of D also depends on the spatial correlations among antennas. If the antennas are not very strongly correlated, then we might not gain from sharing and a small number of iterations might be sufficient. In fact, if the antennas are completely uncorrelated, then sharing cannot improve the channel estimates, as the O-LMMSE solution will converge to the L-LMMSE solution (see Section III).

F. Convergence of D-LMMSE Algorithm
The notion of convergence of the D-LMMSE algorithm is attributed to the fact that every iteration of the algorithm successively brings new information from the neighboring tiers to the central antenna.
For example, consider the antenna array depicted in Fig. 3(a) and focus on the central antenna $r_C$. Let $A^{(i)}$ and $R^{(i)}$ represent the extended data matrix and channel correlation
matrix respectively, during the i-th iteration. Then, by defining $I_1 \triangleq 1$, we can write,
$$A^{(i)} = I_{i+1} \otimes A \quad \text{and} \quad R^{(i)} = R_{array}^{(i)} \otimes R_{tap} \tag{32}$$
where $R_{array}^{(i)}$ represents the spatial correlation matrix of the central antenna and all its neighbors up to the i-th tier. Further assume that $\{\delta_l\}_{l=1}^{L}$ and $\{\eta_j\}_{j=1}^{R}$ are the eigenvalues of $R_{tap}$ and $R_{array}$ respectively, arranged in decreasing order of magnitude. Then for D = 0 (i.e., the no-sharing case), the resulting MSE at the central element is,
$$\mathrm{mse}_{r_C}^{(0)} = \mathrm{trace}\left[ \left( \left(R^{(0)}\right)^{-1} + A^{(0)H} R_w^{-1} A^{(0)} \right)^{-1} \right] = \sum_{l=1}^{L} \frac{\delta_l}{1 + \rho K \delta_l} \tag{33}$$
which is the MSE of L-LMMSE in (10). Similarly, for D = 1 (i.e., sharing up to the first tier), $r_C$ receives information from its $|\mathcal{N}|$ neighbors (e.g., red antennas of the 1st tier) and updates its estimate by optimal combining of the neighboring estimates as described in IV-C. Therefore, the resulting MSE at $r_C$ can be written as,
$$\mathrm{mse}_{r_C}^{(1)} = \frac{1}{|\mathcal{N}^{+}|} \mathrm{trace}\left[ \left( \left(R^{(1)}\right)^{-1} + A^{(1)H} R_w^{-1} A^{(1)} \right)^{-1} \right] = \frac{1}{|\mathcal{N}^{+}|} \sum_{j=1}^{|\mathcal{N}^{+}|} \sum_{l=1}^{L} \frac{\eta_j \delta_l}{1 + \rho K \eta_j \delta_l}. \tag{34}$$
Comparing (33) with (34), we note that $\mathrm{mse}_{r_C}^{(1)} \le \mathrm{mse}_{r_C}^{(0)}$, where the equality holds only if $\eta_j = 1$, $\forall j$ (i.e., spatially uncorrelated channels). Proceeding similarly, it can be shown that $\mathrm{mse}_{r_C}^{(D)} \le \mathrm{mse}_{r_C}^{(D-1)}$, so that the MSE decreases monotonically with each iteration until it converges after utilizing observations from all antennas in the array.

V. DATA-AIDED CHANNEL ESTIMATION
The basic idea of data-aided channel estimation is to exploit the data sub-carriers in order to improve the initial channel estimates obtained using only the pilots. As the data-aided technique does not require additional pilots, it is spectrally more efficient. Here, the pilot-based channel estimate is used for data detection, and the detected data, along with the reserved pilots, can significantly enhance the channel estimation. It is possible that some of the data-pilots are erroneous due to noise and channel estimation errors, while some of the other data-carriers are reliable, i.e., they are likely to be decoded correctly. An important problem is how to down-select a subset of the most reliable data-carriers to be used as data-pilots.

A. Reliable Carriers Selection
Consider the received OFDM symbol at any antenna as shown in (4), and let $\hat{h}$ and $\hat{H}$ be the CIR and CFR estimates obtained using pilots. Then, the tentative estimates of the data symbols are obtained by equalizing the received OFDM symbol using zero-forcing (ZF) as follows,
$$\begin{aligned} \hat{X}(k) &= \frac{Y(k)}{\hat{H}(k)}, \quad k \in \{1, 2, \cdots, N\} \setminus \mathcal{P} \\ &\approx X(k) + \frac{W(k)}{\hat{H}(k)} = X(k) + Z(k), \end{aligned} \tag{35}$$
where $Z(k)$ represents the distortion on the k-th data-carrier due to noise and channel estimation error. Given the CFR estimate, $Z(k)$ can be modelled as Gaussian with zero mean and variance $\sigma_z^2 = |\hat{H}(k)|^{-2} \sigma_w^2$. The recovery of data symbols is then performed by simple hard decisions on the estimated symbols $\hat{X}(k)$, denoted by $\langle \hat{X}(k) \rangle$. Clearly, the errors in the decoding process occur due to noise as well as inaccurate channel estimates. Hence, some data-carriers would be severely affected by the noise and channel perturbation errors, i.e., $Z(k)$, and fall outside their correct decision regions, while for some other data-carriers the distortion is not strong enough and they are decoded correctly. All those data carriers $\hat{X}(k)$ which satisfy the condition $\langle \hat{X}(k) \rangle = X(k)$ with high probability are termed reliable carriers.
The proposed strategy for selecting the subset $\mathcal{R}$ of the most reliable data-carriers, motivated by [26], is based on the criterion,
$$\mathcal{R}(k) = \frac{f_z\left( Z(k) = \hat{X}(k) - \langle \hat{X}(k) \rangle \right)}{\sum_{m=1,\, A_m \ne \langle \hat{X}(k) \rangle}^{M} f_z\left( Z(k) = \hat{X}(k) - A_m \right)}, \tag{36}$$
where $f_z(\cdot)$ is the pdf of $Z(k)$ and $A_m$ represents the set of constellation alphabets. Note that the numerator in (36) is the probability that $X(k)$ will be decoded correctly, while the denominator sums the probabilities of all possible incorrect decisions due to the distortion $Z(k)$. The subset $\mathcal{R}$ is formed by selecting only those data-carriers for which $\mathcal{R}(k) > 1$, i.e.,
$$\mathcal{R} = \{ k \mid \mathcal{R}(k) > 1 \}. \tag{37}$$

[Figure 4 graphic omitted: constellation points (marked ×) labelled X, Xa, Xb and Xc, together with two equalized symbols X̂1 and X̂2.]
Figure 4: Concept of reliable carriers selection. Here, $\hat{X}(2)$ has higher probability of decoding correctly than $\hat{X}(1)$.

The metric (37) is intuitively appealing as it selects only those sub-carriers which are likely to be decoded correctly with high probability. Fig. 4 further elaborates this idea: even though $\hat{X}(1)$ and $\hat{X}(2)$ have the same distance from $X$, $\hat{X}(2)$ is more likely to be decoded correctly than $\hat{X}(1)$, as it is farther from the nearest neighbours and therefore is less likely to be decoded as any other constellation point.
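The selection rule in (36) and (37) can be illustrated with a short NumPy sketch. It is only a sketch under stated assumptions: the function name `select_reliable_carriers`, its argument names and the 4-QAM example values are hypothetical and do not come from the paper, and the pdf $f_z$ is taken to be circularly symmetric complex Gaussian with the per-carrier variance $\sigma_z^2$ given after (35). Because the Gaussian normalisation constant cancels in the ratio, the metric is evaluated in the log domain, where the test $\mathcal{R}(k) > 1$ becomes $\log \mathcal{R}(k) > 0$.

```python
import numpy as np

def select_reliable_carriers(x_hat, h_hat, sigma_w2, constellation):
    """Sketch of the reliability metric (36) and the selection rule (37).

    x_hat         : ZF-equalized data symbols, one per data carrier
    h_hat         : CFR estimates on the same carriers
    sigma_w2      : noise variance (assumed known at the receiver)
    constellation : the M constellation points A_m
    Returns the hard decisions, log R(k), and the indices with R(k) > 1.
    """
    x_hat = np.asarray(x_hat, dtype=complex)
    h_hat = np.asarray(h_hat, dtype=complex)
    points = np.asarray(constellation, dtype=complex)
    rows = np.arange(x_hat.size)

    # Per-carrier distortion variance, sigma_z^2 = sigma_w^2 / |H(k)|^2.
    sigma_z2 = sigma_w2 / np.abs(h_hat) ** 2

    # Squared distance from every equalized symbol to every point A_m.
    d2 = np.abs(x_hat[:, None] - points[None, :]) ** 2

    # Log of the Gaussian pdf up to a constant that cancels in the ratio.
    log_f = -d2 / sigma_z2[:, None]

    # Hard decision: the nearest constellation point.
    decided = np.argmin(d2, axis=1)
    log_num = log_f[rows, decided]

    # Denominator: sum over all points other than the decided one,
    # computed with a log-sum-exp for numerical stability.
    log_others = log_f.copy()
    log_others[rows, decided] = -np.inf
    peak = log_others.max(axis=1)
    log_den = peak + np.log(np.exp(log_others - peak[:, None]).sum(axis=1))

    log_metric = log_num - log_den
    reliable = np.flatnonzero(log_metric > 0.0)  # R(k) > 1
    return points[decided], log_metric, reliable

# Hypothetical example: unit-energy 4-QAM, two data carriers.
qam4 = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
decisions, log_r, reliable = select_reliable_carriers(
    x_hat=[qam4[0] + 0.05, 0.02 + 0.01j],
    h_hat=[1.0, 1.0],
    sigma_w2=0.1,
    constellation=qam4,
)
print(reliable)
```

A symbol that falls close to a constellation point yields a large positive log-metric, whereas a symbol near a decision boundary yields a value at or below zero and is excluded from the set of data-pilots.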
B. Revisiting the Estimation Step
We now revisit the estimation step of the proposed Algorithm 1 using both the pilots and the reliable carriers in order to enhance the initial estimates. Let $\mathcal{R}^{r}$ be the set of indices of reliable data carriers for antenna r, obtained in the reliable carriers selection process. Each antenna could revisit the estimation step by solving (19) using an extended set of indices, $\mathcal{P} \cup \mathcal{R}^{r}$, corresponding to pilots and reliable data carriers. To make it computationally efficient, we instead proceed by exploiting the block form of RLS to update the pilot-based estimates. Skipping the derivation, the update equations for data-aided estimation at antenna r are given by,
$$\hat{h}_d^{r} = \hat{h}^{r} + C_e^{r} \bar{A}_d^{H} G \left( \bar{Y}_d^{r} - \bar{A}_d \hat{h}^{r} \right), \tag{38}$$
$$C_{e_d}^{r} = C_e^{r} - C_e^{r} \bar{A}_d^{H} G \bar{A}_d C_e^{r}, \tag{39}$$
$$G = \left( R_w + \bar{A}_d C_e^{r} \bar{A}_d^{H} \right)^{-1}, \tag{40}$$
where $\bar{Y}_d^{r} = Y^{r}(\mathcal{P} \cup \mathcal{R}^{r})$ is the extended set of observations, $\bar{A}_d = \left[ A(\mathcal{P} \cup \mathcal{R}^{r}) \;\; 0_{|\mathcal{P} \cup \mathcal{R}^{r}| \times |\mathcal{N}|L} \right]$ is the extended data matrix, G represents the gain matrix, and $\hat{h}^{r}$ and $C_e^{r}$ are respectively the estimate and the error covariance matrix obtained using only the pilots during the estimation step. The complete data-aided approach is described in Algorithm 2.

Algorithm 2 Data-aided Distributive LMMSE (DAD-LMMSE) Algorithm
1) Run step 1 of Algorithm 1 to get $\hat{h}^{r}$ and $C_e^{r}$ at each antenna index r.
2) Each antenna uses its CIR estimate $\hat{h}^{r}$ to form the subset $\mathcal{R}^{r}$ of the most reliable data-carriers.
3) Update the estimates and error covariance in step (1) using (38)-(39).
4) Run steps (2)-(6) of Algorithm 1, with $P^{r} = (C_e^{r})^{-1}$ and $\hat{h}_w^{r} = P^{r} \hat{h}^{r}$.

VI. EFFECT OF PILOT CONTAMINATION
So far, we assumed a single-cell scenario where all the users have been allocated orthogonal resources for uplink channel estimation, so that the pilot observations are corrupted only by AWGN. In a multi-cell scenario, dominated by pilot contamination due to aggressive reuse of the pilots, knowledge of the interference statistics is critical in studying the effect of pilot contamination on channel estimation techniques. Unlike the existing pilot contamination analyses, we take a stochastic geometry based approach to derive analytical expressions for the interference correlation.

A. Modified Network Model
To characterise the inter-cell interference resulting from pilot contamination, we modify our previous 2-D network model of Fig. 1 by introducing interferers that are assumed to be distributed according to a PPP. Due to its simplicity and tractability, the PPP has been widely used in stochastic geometry for modelling of the interference in cellular networks (see [36] and references therein). Specifically, without loss of generality, we assume a single user in a reference cell of radius $\gamma_o$, communicating with the BS located at the origin O in a 2-D plane. The interfering users (outside radius $\gamma_o$) are distributed over a circular region of radius $\gamma_m$ according to a homogeneous PPP, denoted by $\Psi$ and having intensity $\lambda$. Thus, the interfering space is an annular region with radii $\gamma_o$ and $\gamma_m$, where the distance of the ith interferer from the BS satisfies $\gamma_o < \gamma_i < \gamma_m$. Fig. 5 shows a realization of interferers distributed according to a homogeneous PPP of $\lambda = 0.3$ with $\gamma_o = 2$ m and $\gamma_m = 5$ m. Further, from [37]–[39], we conclude that the interference itself is not correlated across OFDM frequency tones. This makes the analysis considerably simpler and more tractable because each OFDM frequency tone can be treated as an independent narrow-band channel. Hence, it suffices to characterize the interference at a single OFDM tone.

[Figure 5 graphic omitted: scatter of interferers in the annulus between the radii γo and γm around the base station.]
Figure 5: Realization of interferers distributed according to a PPP of $\lambda = 0.3$, $\gamma_o = 2$ m and $\gamma_m = 5$ m with the BS at the origin.

Consider the complex received interference at any given sub-carrier (at the BS antenna r) due to all interfering users, which can be represented as [39],
$$I = \sum_{i \in \Psi} \sqrt{E_x}\, x_i h_i \tag{41}$$
where $x_i = a_i \exp\{j\theta_i\}$ is the interfering symbol and $h_i = \gamma_i^{-\beta} \alpha_i \exp\{j\phi_i\}$ is the interfering channel, in which $\beta > 1$ is the pathloss exponent, $\alpha_i$ is an independent Rayleigh distributed random variable with $\Omega = E\{\alpha_i^2\} = 1$ and $\phi_i$ is an independent random variable that is uniformly distributed over $[0, 2\pi)$. The symbols $x_i$ are generated from a general bi-dimensional constellation with M equiprobable symbols $A_m = a^{(m)} \exp\{j\theta^{(m)}\}$, $m = 1, 2, \cdots, M$. We assume that all interfering users transmit with the same average energy per symbol $E_x$ and that the transmission constellation is normalized so that $E\{|x_i|^2\} = 1$. Therefore, (41) can be expressed as,
$$I = \sum_{i \in \Psi \setminus O} \frac{\sqrt{E_x}\, a_i \alpha_i \exp\{j(\theta_i + \phi_i)\}}{\gamma_i^{\beta}} = \sum_{i \in \Psi \setminus O} \frac{\sqrt{E_x}\, z_i}{\gamma_i^{\beta}} \tag{42}$$
where $z_i = a_i \alpha_i \exp\{j(\theta_i + \phi_i)\}$.

B. Interference characterization
Although I can be completely characterized, to simplify our analysis of pilot contamination we assume I to be Gaussian and thus require only the first two moments, i.e., the mean and the variance. They are given in the following lemma.
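The network model of Section VI-A and the aggregate interference in (42) are straightforward to simulate, which gives an empirical handle on the two moments named above. The following NumPy sketch is illustrative only: the function names, the choice of unit-energy 4-QAM interfering symbols and the number of trials are assumptions made here, not details taken from the paper. The closed-form variance it is compared against is the expression that Lemma 2 states in (44).

```python
import numpy as np

def sample_interference(lam, gamma_o, gamma_m, beta, e_x=1.0, rng=None):
    """Draw one realization of the aggregate interference I in (42)."""
    rng = np.random.default_rng() if rng is None else rng
    # Number of interferers in the annulus: Poisson with mean lam * area.
    area = np.pi * (gamma_m ** 2 - gamma_o ** 2)
    n = rng.poisson(lam * area)
    # Distances of points placed uniformly over the annulus (r^2 is uniform).
    # The angular position is not needed because only the distance enters (42).
    r = np.sqrt(rng.uniform(gamma_o ** 2, gamma_m ** 2, size=n))
    # Assumed constellation: unit-energy 4-QAM, so E{|x|^2} = 1.
    x = (rng.choice([-1.0, 1.0], size=n)
         + 1j * rng.choice([-1.0, 1.0], size=n)) / np.sqrt(2)
    # Rayleigh fading with Omega = E{alpha^2} = 1, i.e. scale^2 = 1/2.
    alpha = rng.rayleigh(scale=np.sqrt(0.5), size=n)
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n)
    h = alpha * np.exp(1j * phi) / r ** beta
    return np.sqrt(e_x) * np.sum(x * h)

def interference_variance(lam, gamma_o, gamma_m, beta,
                          e_x=1.0, omega=1.0, e_abs_x2=1.0):
    """Closed-form variance of I for the annular region, as in (44)."""
    return (np.pi * lam / (beta - 1.0) * e_abs_x2 * e_x * omega
            * (gamma_o ** (2 - 2 * beta) - gamma_m ** (2 - 2 * beta)))

def monte_carlo_moments(trials, lam, gamma_o, gamma_m, beta, seed=0):
    """Empirical mean and second moment of I over independent realizations."""
    rng = np.random.default_rng(seed)
    samples = np.array([
        sample_interference(lam, gamma_o, gamma_m, beta, rng=rng)
        for _ in range(trials)
    ])
    return samples.mean(), np.mean(np.abs(samples) ** 2)

if __name__ == "__main__":
    # Parameters of the realization shown in Fig. 5, with beta = 2.
    mean_i, var_i = monte_carlo_moments(20000, lam=0.3, gamma_o=2.0,
                                        gamma_m=5.0, beta=2.0)
    print("empirical mean:", mean_i)
    print("empirical variance:", var_i)
    print("closed form (44):", interference_variance(0.3, 2.0, 5.0, 2.0))
```

Since the mean is zero, the second moment estimated here is also the variance; the agreement with (44) improves as the number of trials grows.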
Lemma 2. Using the network model of VI-A, the mean and variance of the interference I are,
$$\mu_I = E\{I\} = 0 \tag{43}$$
and
$$\sigma_I^2 = E\{|I|^2\} = \pi\lambda(\beta - 1)^{-1} E\{|x|^2\} E_x \Omega \left( \frac{1}{\gamma_o^{2\beta-2}} - \frac{1}{\gamma_m^{2\beta-2}} \right) \tag{44}$$
respectively.
Proof: See Appendix A.
Although (44) is derived by considering that the interference space is annular, it can be extended to an infinite interference space with a protection region of $\gamma_o$ by taking the limit as $\gamma_m \to \infty$, yielding,
$$\sigma_I^2 = \pi\lambda(\beta - 1)^{-1} E\{|x|^2\} \frac{E_x \Omega}{\gamma_o^{2\beta-2}}. \tag{45}$$

C. Effect of PC on MSE Performance
The knowledge of the interference statistics at a single OFDM frequency tone, obtained through Lemma 2, allows us to evaluate the aggregate interference correlation over all OFDM tones and/or across the whole BS antenna array using known channel statistics. Consider the received OFDM symbol at the rth BS antenna, after omitting the index $\mathcal{P}$,
$$Y^{r} = A h^{r} + I^{r} + W^{r} = A h^{r} + E^{r} \tag{46}$$
where $I^{r}$ is the interference at antenna r of the BS due to pilot contamination and $E^{r}$ is the interference term which captures the effect of both pilot contamination and the noise. Due to the independence of the noise and pilot contamination terms, each having zero mean, the correlation matrix of $E^{r}$ is $R_{E^r} = R_{I^r} + R_w$. Now, using the interference power (or variance) at each OFDM sub-carrier from Lemma 2, the interference correlation matrix $R_{I^r}$ across OFDM tones can be easily obtained as $R_{I^r} = \sigma_I^2 A R_{tap} A^{H}$, where we assumed that all user channels (both desired and interfering) have identical correlations (as in section II) and use the same pilots, which is the worst case scenario from the pilot contamination perspective. Similarly, in the multi-antenna case, based on the system model of (8), the interference correlation matrix for the whole BS array can be obtained as $R_E = R_I + R_w$, with $R_I = \sigma_I^2 \acute{A} R_h \acute{A}^{H}$. Using these interference correlations, we can derive the MSE expressions for the LS, L-LMMSE and O-LMMSE algorithms in the presence of noise and pilot contamination by replacing the noise covariance matrix $R_w = \sigma^2 I$ with the matrix $R_{E^r}$ or $R_E$ in the MSE expressions already obtained in section III. The results are presented in the following theorems.

Theorem 1. For the system model described in section II and pilot contamination as characterised in section VI, the MSE expression for the LS estimation algorithm of section III-C under both AWGN and pilot contamination is given by,
$$\mathrm{MSE}^{(LS)} = \frac{RL}{\rho K} + R \sigma_I^2\, \mathrm{trace}(\Lambda), \tag{47}$$
where $\sigma_I^2$ is given in (44), $\Lambda$ is a diagonal matrix with the eigenvalues of $R_{tap}$ along the diagonal, and all users are assumed to have similar channel characteristics.
Proof: See Appendix B.
Theorem 1 shows that the MSE is composed of two terms. The first term, due to AWGN, can be suppressed by increasing the number of pilot tones, but the second term, due to pilot contamination, cannot be reduced by adding more pilots and persists even at high SNR (i.e., $\rho \to \infty$).

Theorem 2. For the system model described in section II and pilot contamination as characterised in section VI, the MSE expression for the L-LMMSE estimation algorithm presented in section III-A under both AWGN and pilot contamination is given by,
$$\mathrm{MSE}^{(L)} = R \sum_{i=1}^{L} \frac{\delta_i \left( 1 + \rho K \delta_i \sigma_I^2 \right)}{1 + \rho K \delta_i + \rho K \delta_i \sigma_I^2}, \tag{48}$$
where $\sigma_I^2$ is given in (44), $\delta_i$ are the eigenvalues of $R_{tap}$ and all users are assumed to have similar channel characteristics.
Proof: Replace $R_w$ with $R_w + R_{I^r}$ in the MSE expression (10); then, invoking the eigenvalue decomposition (EVD) of $R_{tap}$, follow the steps of Theorem 1 given in Appendix B. We skip the detailed proof due to its similarity to Theorem 1.
Note that (48) reduces to the MSE expression for AWGN (given in (10)) when there is no pilot contamination. At high SNR (i.e., $\rho \gg 1$), when essentially only the effect of pilot contamination remains, the MSE expression (48) reduces to,
$$\mathrm{MSE}^{(L)} \xrightarrow{\text{high SNR}} R\, \mathrm{trace}(\Lambda) \left( \frac{\sigma_I^2}{1 + \sigma_I^2} \right), \tag{49}$$
which shows that the MSE is independent of the number of pilots and that LMMSE is more robust to pilot contamination than LS.

Theorem 3. For the system model described in section II and pilot contamination as characterised in section VI, the MSE expression for the O-LMMSE estimation algorithm presented in section III-B under both AWGN and pilot contamination is given by,
$$\mathrm{MSE}^{(O)} = \sum_{j=1}^{R} \sum_{i=1}^{L} \frac{\mu_j \delta_i \left( 1 + \rho K \mu_j \delta_i \sigma_I^2 \right)}{1 + \rho K \mu_j \delta_i + \rho K \mu_j \delta_i \sigma_I^2}, \tag{50}$$
where $\sigma_I^2$ is given in (44), $\mu_j$ and $\delta_i$ are the eigenvalues of $R_{array}$ and $R_{tap}$ respectively, and all users are assumed to have similar channel characteristics.
Proof: See Appendix C.
Note that (50) reduces to the MSE expression for AWGN given in (13) in the absence of pilot contamination. Again observe that, under the assumption of high SNR, when the effect of pilot contamination dominates AWGN, the MSE expression in (50) simplifies to,
$$\mathrm{MSE}^{(O)} \xrightarrow{\text{high SNR}} \left( \frac{\sigma_I^2}{1 + \sigma_I^2} \right) \mathrm{trace}(R_{array})\, \mathrm{trace}(\Lambda). \tag{51}$$
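The closed-form expressions (47), (48) and (50) are simple to evaluate numerically, which makes the error floors in (49) and (51) easy to inspect. The sketch below is a minimal illustration; the function names and all numerical values in the example (the eigenvalue profiles in particular) are hypothetical choices made here and are not taken from the paper's simulations.

```python
import numpy as np

def mse_ls(rho, K, R, deltas, sigma_i2):
    """LS estimator under AWGN and pilot contamination, eq. (47)."""
    deltas = np.asarray(deltas, dtype=float)
    L = deltas.size
    return R * L / (rho * K) + R * sigma_i2 * deltas.sum()

def mse_l_lmmse(rho, K, R, deltas, sigma_i2):
    """L-LMMSE estimator under AWGN and pilot contamination, eq. (48)."""
    d = np.asarray(deltas, dtype=float)
    num = d * (1.0 + rho * K * d * sigma_i2)
    den = 1.0 + rho * K * d + rho * K * d * sigma_i2
    return R * np.sum(num / den)

def mse_o_lmmse(rho, K, mus, deltas, sigma_i2):
    """O-LMMSE estimator under AWGN and pilot contamination, eq. (50)."""
    g = np.outer(mus, deltas)  # all products mu_j * delta_i
    num = g * (1.0 + rho * K * g * sigma_i2)
    den = 1.0 + rho * K * g + rho * K * g * sigma_i2
    return np.sum(num / den)

# Hypothetical example values (assumptions for illustration only).
R, L, K = 100, 8, 32
deltas = np.exp(-np.arange(L))       # assumed eigenvalues of R_tap
weights = 0.8 ** np.arange(R)
mus = R * weights / weights.sum()    # assumed eigenvalues of R_array, sum = R
sigma_i2 = 0.2                       # assumed interference variance

for snr_db in (0, 10, 20, 30):
    rho = 10.0 ** (snr_db / 10.0)
    print(snr_db,
          mse_ls(rho, K, R, deltas, sigma_i2),
          mse_l_lmmse(rho, K, R, deltas, sigma_i2),
          mse_o_lmmse(rho, K, mus, deltas, sigma_i2))
```

Setting all $\mu_j$ to one (spatially uncorrelated antennas) makes `mse_o_lmmse` coincide with `mse_l_lmmse`, and setting `sigma_i2` to zero recovers the AWGN-only forms.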
Expression (51) indicates that the MSE depends strongly on the interference power and is independent of the number of pilots K. Since trace($R_{array}$) ≤ R, the O-LMMSE seems to be more robust to pilot contamination than both LS and L-LMMSE. The MSE expression also suggests that the effect of pilot contamination can be minimized by exploiting the spatial correlations and by optimizing the BS antenna array design.
The above theorems quantify the effect of pilot contamination on the MSE performance of channel estimation in terms of the interference power (or variance), which in turn depends on the different parameters described in Lemma 2. The MSE performance against various parameters will be numerically analysed through simulations.

VII. SIMULATION RESULTS
We adopt the channel model in (1) with the spatial correlation matrix given in (3), whose parameters are: $\phi = \pi/3$ (mean horizontal AoD in radians), $\theta = 3\pi/8$ (mean vertical AoD in radians), $\sigma = \pi/12$ (standard deviation of horizontal AoD) and $\xi = \pi/36$ (standard deviation of vertical AoD). The channel tap correlation matrix follows an exponentially decaying PDP, $E\{|h^{r}(\tau)|^2\} = e^{-\tau}$, while the rest of the parameters are given in Table II, where $\nu$ represents the carrier wavelength in meters. It is also assumed that the receiver has knowledge of the channel correlations.

Table II: Parameters for simulation
Parameter | Value
Array Size (M × G) | 10 × 10
Array element spacing $d_x$, $d_y$ | 0.3ν, 0.5ν
Number of OFDM sub-carriers (N) | 256
Number of pilots (K) | 32
Signal constellation modulation | 4/16/64-QAM
Channel length (L) | 8

To assess the performance of different algorithms we use the following MSE performance criterion:
$$\mathrm{MSE} = \frac{1}{\Theta} \sum_{i=1}^{\Theta} \| h^{i} - \hat{h}^{i} \|^2 \tag{52}$$
where $h^{i}$ and $\hat{h}^{i}$ are the true and estimated CIR vectors (at the ith trial) respectively, each of size RL × 1, and $\Theta$ represents the total number of trials. We used $\Theta = 100$ in our simulations.
We conduct five different experiments to study the performance of our proposed approach and compare it with the three methods, i.e., LS, L-LMMSE and O-LMMSE, described earlier in Section III. We also perform experiments to validate our analysis and study the impact of pilot contamination on all these methods.

A. Experiment 1: How many iterations (D)?
In this experiment we are interested in finding the number of iterations required for convergence of the proposed distributed LMMSE algorithm. We plot the MSE of the proposed D-LMMSE algorithm (red curve) against the parameter D (i.e., the number of iterations) in Fig. 6(a). The SNR was fixed at 0 dB. The MSE values of the other algorithms, which do not depend on the parameter D, are also shown. It can be seen that the proposed algorithm converges very closely to the optimal in 3 iterations. Note that when the antennas do not collaborate (i.e., D = 0), the MSE of the distributed algorithm coincides with that of L-LMMSE because no information sharing takes place. As the information from neighbors comes in during the next few iterations, the MSE decays exponentially until it converges to a near optimal solution. Fig. 6(b) also suggests that there would be hardly any improvement in MSE for D > 3.

[Figure 6 plots omitted. Panel (a): MSE versus number of iterations (D) at SNR = 0 dB for LS, L-LMMSE, D-LMMSE and O-LMMSE. Panel (b): MSE versus SNR (dB) for D = 0, 1, 2, 3.]
Figure 6: Number of iterations (D) required to achieve the convergence of the distributed algorithm.

B. Experiment 2: MSE Performance in AWGN
In this experiment, we compare the MSE performance of different algorithms in the presence of AWGN using the parameters in Table II. The results given in Fig. 7 show that O-LMMSE performs better than both LS and L-LMMSE in terms of MSE, as it is able to utilize the antenna spatial correlations. As shown, the proposed D-LMMSE algorithm (Algorithm 1) achieves near optimal results in just 3 iterations. The analytical MSE expressions given in Section III for LS, L-LMMSE and O-LMMSE under AWGN are also plotted with the legend (Th.), and they agree with the simulation results.
Fig. 8(a) shows the MSE performance of the proposed data-aided algorithm (DAD-LMMSE in Algorithm 2) against the other pilot-based algorithms. The data-aided approach has the best performance of all, and the effect of using reliable carriers is more pronounced at higher SNR. Fig. 8(b) demonstrates the MSE behaviour of different algorithms with a varying number of pilots K, with the SNR fixed at 20 dB. As shown, increasing the number of pilot tones yields better estimation performance, but this comes at the cost of lower spectral efficiency. The data-aided algorithm, however, is able to achieve the best performance even for a small number of pilot tones.

C. Experiment 3: Mean and variance of interference
This experiment aims to validate the mean and variance of the interference given in Lemma 2. In order to mimic the setup described in Section VI-A, we use a single-antenna BS and assume that the CIRs from each user to the BS have a uniform PDP. Further, we assume that the BS is located at the origin, the desired
13
103
102 LS
10 LS
LS LS (Th.) LS (Th.)
LS (Th.) L-LMMSE L-LMMSE
102 L-LMMSE (Th.) 6 L-LMMSE (Th.)
101 L-LMMSE D-LMMSE (D=3) D-LMMSE (D=3)
L-LMMSE (Th.) O-LMMSE O-LMMSE
MSE (dB)
O-LMMSE (Th.) 2 O-LMMSE (Th.)
D-LMMSE (D=3)
MSE
101
O-LMMSE
100 O-LMMSE (Th.)
−2
MSE
100 −6
10−1 −10
10−1
−10 −5 0 5 10 15 20 25 30 10−2 10−1 100 101
SNR (dB) λ
10−2
(a) (b)
10 −3
−10 −5 0 5 10 15 20 25 30 Figure 10: Effect of pilot contamination on MSE performance
SNR (dB)
(a) MSE as a function of SNR for λ=0.1 (b) MSE as a function
Figure 7: MSE performance of different algorithms in white of λ for SNR fixed at 10 dB.
Gaussian noise.
D. Experiment 4: MSE Performance under AWGN and Pilot

SNR = 20 dB
101 101
LS LS (Th.)
Contamination
L-LMMSE L-LMMSE (Th.)
100
100
D-LMMSE (D=3) O-LMMSE (Th.) In this experiment we study the MSE performance of
O-LMMSE D-LMMSE (DA)
10−1
different algorithms in presence of both AWGN and pilot
MSE
MSE
10−1 contamination. For simulations, we use the parameters given

10−2
LS
L-LMMSE
in Table II with the interfering users distributed according to a
10−2
10−3
O-LMMSE
D-LMMSE (D=3)
PPP of λ=0.1 and pathloss β=2. The desired user is assumed
D-LMMSE (DA)
10−4
0 5 10 15 20 25 30
10−3
0 10 20 30 40 50 60 70
1m away from BS located at origin while the interfering
users are distributed in a circular region of radius 5 m with a protection region of γo = 2 m. In Fig. 10(a), the simulated MSE performance of different algorithms is compared over a wide range of SNR with the analytical expressions given in Theorems 1, 2 and 3 (see Section VI-C). From Fig. 10(a), note that all MSE curves decrease with increasing SNR in the lower range but reach an error floor at higher SNR. This is in stark contrast to the AWGN case (see Fig. 7), where the MSE always decreases with increasing SNR. This shows that pilot contamination persists even at higher SNR and that its effect on MSE is more severe than that of AWGN.

We present a similar analysis in Fig. 10(b), where the MSE is plotted as a function of λ with SNR fixed at 10 dB. All algorithms perform well for small values of λ. However, when λ increases, the interference due to pilot contamination dominates AWGN, severely degrading the performance, as indicated by a sharp increase in the MSE curves. Note that LMMSE channel estimation is more robust to pilot contamination than simple LS-based channel estimation. Also observe the close match between simulation and theoretical analysis, shown in Fig. 10, over a wide range of λ.

E. Experiment 5: Computational Complexity

In this experiment we compare the average runtime of the various algorithms, which can be regarded as a measure of computational complexity. Fig. 11 shows the average runtime with an increasing number of BS antennas under the default simulation parameters of Table II. The computational requirements of the proposed D-LMMSE algorithm, for different values of the parameter D, grow at a much slower pace than those of the O-LMMSE algorithm as the number of BS antennas increases. Further, in terms of memory requirements and communication overhead (not shown here), the advantages of D-LMMSE are even more tangible.

Figure 8: MSE performance comparison of data-aided D-LMMSE algorithm with pilot-based techniques in white Gaussian noise. Panel (a) is plotted against SNR (dB) and panel (b) against the number of pilots (K).

user at a distance of 1 m from BS while interfering users are distributed in a region of radius 5 m and with a protection region of γo = 2 m according to a PPP with density λ and pathloss exponent β = 2. All users communicate with BS using OFDM with N = 256, L = 8 and K = 32 identical pilot symbols drawn from a 4-QAM constellation. Fig. 9 compares the mean and variance of interference observed on a single OFDM carrier (randomly picked) due to simulated sources with the expressions given in Lemma 2, as a function of λ. The results indicate a close match between simulation and theory.

Figure 9: Mean and variance of interference at single OFDM sub-carrier as a function of λ. (Top panel: µI versus λ; bottom panel: σI2 versus λ; simulation and theory curves in both.)
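The comparison between simulated and analytical interference statistics can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation code. It assumes that interferers form a homogeneous PPP of density λ in the annulus between γo and γm, that each z_i is drawn as a zero-mean circular complex Gaussian with E{|z_i|²} = Ω (an assumption standing in for z_i = a_i α_i exp{j(θ_i + φ_i)} with unit-modulus pilots), and that E{|x|²} = 1. The function names `theoretical_variance` and `simulate_interference` are hypothetical.

```python
import numpy as np


def theoretical_variance(lam, beta, gamma_o, gamma_m, Ex=1.0, Omega=1.0, Ex2=1.0):
    """Closed-form interference variance:
    pi*lam/(beta-1) * Ex * Omega * E{|x|^2} * (gamma_o^(2-2beta) - gamma_m^(2-2beta))."""
    return (np.pi * lam / (beta - 1.0)) * Ex * Omega * Ex2 * (
        gamma_o ** (2 - 2 * beta) - gamma_m ** (2 - 2 * beta)
    )


def simulate_interference(lam, beta, gamma_o, gamma_m, Ex=1.0, Omega=1.0,
                          trials=20000, seed=0):
    """Monte Carlo estimate of E{I} and E{|I|^2} for
    I = sum_i sqrt(Ex) * z_i / gamma_i^beta over a PPP in an annulus."""
    rng = np.random.default_rng(seed)
    area = np.pi * (gamma_m ** 2 - gamma_o ** 2)

    # Number of interferers per trial is Poisson with mean lam * area.
    counts = rng.poisson(lam * area, size=trials)
    total = int(counts.sum())

    # Uniform points in the annulus: r^2 is uniform on [gamma_o^2, gamma_m^2].
    r = np.sqrt(rng.uniform(gamma_o ** 2, gamma_m ** 2, size=total))

    # Assumed fading model: z_i ~ CN(0, Omega), independent across interferers.
    z = np.sqrt(Omega / 2.0) * (
        rng.standard_normal(total) + 1j * rng.standard_normal(total)
    )
    terms = np.sqrt(Ex) * z / r ** beta

    # Sum the per-interferer terms within each trial.
    trial_idx = np.repeat(np.arange(trials), counts)
    I = (np.bincount(trial_idx, weights=terms.real, minlength=trials)
         + 1j * np.bincount(trial_idx, weights=terms.imag, minlength=trials))

    return I.mean(), np.mean(np.abs(I) ** 2)


if __name__ == "__main__":
    for lam in (0.01, 0.1, 1.0):
        mu, var = simulate_interference(lam, beta=2.0, gamma_o=2.0, gamma_m=5.0)
        th = theoretical_variance(lam, beta=2.0, gamma_o=2.0, gamma_m=5.0)
        print(f"lambda={lam}: |mean|={abs(mu):.4f}, var_sim={var:.4f}, var_theory={th:.4f}")
```

With β = 2, γo = 2 m and γm = 5 m, the closed form reduces to 0.21πλ, so the variance grows linearly in λ, which matches the trend described for Fig. 9.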
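Figure 11: Average runtime of various algorithms. (Plot: average runtime in seconds, log scale, versus number of BS antennas from 4 to 100, for LS, L-LMMSE, D-LMMSE with D = 1, 2, 3, and O-LMMSE.)

VIII. CONCLUSION

Channel estimation is a challenging problem in massive MIMO systems, as the conventional techniques applicable to MIMO systems cannot be employed owing to an exceptionally large number of unknown channel coefficients. We proposed a distributed algorithm that attains a near-optimal solution at significantly reduced complexity by relying on coordination among antennas. To reduce the pilot overhead, the distributed LMMSE algorithm is extended using data-aided estimation based on reliable carriers. To gain insight into the effect of pilot contamination on channel estimation performance, we used stochastic geometry to obtain the aggregated interference power and, based on this, derived MSE expressions for different algorithms under AWGN and pilot contaminated scenarios. The derived expressions were verified using simulation results. Extending the obtained results to analyze the system throughput under pilot contamination remains open for future work.

APPENDIX A
MEAN AND VARIANCE OF INTERFERENCE

The mean of $I$ can be determined as follows,

$$
\begin{aligned}
\mu_I = E\{I\} &= E\left\{ \sum_{i\in\Psi} \frac{\sqrt{E_x}\, z_i}{\gamma_i^{\beta}} \right\} \\
&= E_\Psi\left\{ \sum_{i\in\Psi} \frac{\sqrt{E_x}\, E_z\{z_i\}}{\gamma_i^{\beta}} \right\} \\
&\overset{(a)}{=} \lambda \sqrt{E_x}\, E\{z_i\} \int_{\mathbb{R}^2} \frac{1}{r^{\beta}}\, r\,dr\,d\theta = 0
\end{aligned}
$$

where $(a)$ results from Campbell's theorem [40], and the fact that $E\{z_i\} = 0$ yields the zero mean. Similarly, the variance of interference can be computed as follows,

$$
\begin{aligned}
\sigma_I^2 = E\{|I|^2\} &= E_\Psi\left\{ E_z\left\{ \sum_{i\in\Psi} \frac{\sqrt{E_x}\, z_i}{\gamma_i^{\beta}} \sum_{j\in\Psi} \frac{\sqrt{E_x}\, z_j^{*}}{\gamma_j^{\beta}} \right\} \right\} \\
&\overset{(a)}{=} E_\Psi\left\{ \sum_{i\in\Psi} \frac{E_x\, E_z\{|z_i|^2\}}{\gamma_i^{2\beta}} \right\} \\
&\overset{(b)}{=} \lambda E_x\, E\{|z_i|^2\} \int_{0}^{2\pi}\!\!\int_{\gamma_o}^{\gamma_m} \frac{1}{r^{2\beta}}\, r\,dr\,d\theta \\
&\overset{(c)}{=} \pi\lambda(\beta-1)^{-1} E_x\, \Omega\, E\{|x|^2\} \left[ \frac{1}{\gamma_o^{2\beta-2}} - \frac{1}{\gamma_m^{2\beta-2}} \right]
\end{aligned}
$$

where $(a)$ is due to the fact that the $z_i$ are independent SS random variables, in $(b)$ we employed Campbell's theorem, and in $(c)$ we used the result $E\{|z_i|^2\} = E\{a_i^2 \alpha_i^2\} = \Omega E\{|x|^2\}$, where we note that $a_i$ and $\alpha_i$ are independent random variables. This completes the proof.

APPENDIX B
PROOF OF THEOREM 1

By replacing $\mathbf{R}_w$ with $\mathbf{R}_w + \mathbf{R}_{I_r}$ in the MSE expression of (16), we obtain

$$
\begin{aligned}
\mathrm{mse}_r^{ls} &= \mathrm{trace}\left\{ \left( \mathbf{A}^H (\mathbf{R}_w + \mathbf{R}_{I_r})^{-1} \mathbf{A} \right)^{-1} \right\} \\
&= \mathrm{trace}\left\{ \left( \mathbf{A}^H \left( \mathbf{R}_w + \sigma_I^2 \mathbf{A}\mathbf{R}_{tap}\mathbf{A}^H \right)^{-1} \mathbf{A} \right)^{-1} \right\} \\
&\overset{(a)}{=} \mathrm{trace}\left\{ \left( \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A} - \sigma_I^2 \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A} \left( \mathbf{R}_{tap}^{-1} + \sigma_I^2 \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A} \right)^{-1} \mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A} \right)^{-1} \right\}
\end{aligned}
$$

where $(a)$ follows from the matrix inversion lemma. Now, using the EVD of the channel correlation matrix $\mathbf{R}_{tap} = \mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^H$ and the fact that $\mathbf{A}^H \mathbf{R}_w^{-1} \mathbf{A} = \frac{KE_x}{\sigma_w^2}\mathbf{I}_L$, we obtain

$$
\begin{aligned}
\mathrm{mse}_r^{ls} &\overset{(b)}{=} \mathrm{trace}\left\{ \left( \frac{KE_x}{\sigma_w^2}\mathbf{I}_L - \sigma_I^2 \left( \frac{KE_x}{\sigma_w^2} \right)^{2} \left( \boldsymbol{\Lambda}^{-1} + \frac{\sigma_I^2 KE_x}{\sigma_w^2}\mathbf{I}_L \right)^{-1} \right)^{-1} \right\} \\
&= \sum_{i=1}^{L} \left( \frac{KE_x}{\sigma_w^2} - \sigma_I^2 \left( \frac{KE_x}{\sigma_w^2} \right)^{2} \left( \delta_i^{-1} + \frac{\sigma_I^2 KE_x}{\sigma_w^2} \right)^{-1} \right)^{-1}
\end{aligned} \tag{53}
$$

where $(b)$ follows from the property that $\mathrm{trace}(\mathbf{Q}\mathbf{R}\mathbf{Q}^H) = \mathrm{trace}(\mathbf{R})$ if $\mathbf{Q}$ is unitary. After simple algebraic manipulations, each term inside the summation reduces to $\frac{\sigma_w^2}{KE_x} + \sigma_I^2 \delta_i$, so the sum simplifies to $\frac{\sigma_w^2 L}{KE_x} + \sigma_I^2 \sum_{i=1}^{L} \delta_i$, which completes the proof.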
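The closed form obtained in the proof of Theorem 1 can be sanity-checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a hypothetical pilot matrix A with orthogonal columns scaled so that A^H A = K E_x I_L, white noise R_w = σ_w² I_K, and an arbitrary randomly generated Hermitian positive definite R_tap. The helper names (`ls_mse_direct`, `ls_mse_closed_form`, `make_example`) are introduced here for illustration.

```python
import numpy as np


def ls_mse_direct(A, Rw, Rtap, sigma_I2):
    """Evaluate trace((A^H (Rw + sigma_I2 * A Rtap A^H)^{-1} A)^{-1}) numerically."""
    R_total = Rw + sigma_I2 * (A @ Rtap @ A.conj().T)
    M = A.conj().T @ np.linalg.solve(R_total, A)
    return float(np.trace(np.linalg.inv(M)).real)


def ls_mse_closed_form(K, L, Ex, sigma_w2, sigma_I2, Rtap):
    """Closed form: sigma_w^2 * L / (K * Ex) + sigma_I^2 * sum_i delta_i."""
    delta = np.linalg.eigvalsh(Rtap)  # eigenvalues of the Hermitian matrix Rtap
    return float(sigma_w2 * L / (K * Ex) + sigma_I2 * delta.sum())


def make_example(K=32, L=8, Ex=1.0, sigma_w2=0.1, seed=0):
    """Build an assumed test case with A^H A = K * Ex * I_L and Rw = sigma_w2 * I_K."""
    rng = np.random.default_rng(seed)

    # Orthonormal columns from a reduced QR factorization, then rescale.
    G = rng.standard_normal((K, L)) + 1j * rng.standard_normal((K, L))
    Q, _ = np.linalg.qr(G)
    A = np.sqrt(K * Ex) * Q

    Rw = sigma_w2 * np.eye(K)

    # Arbitrary Hermitian positive definite tap correlation matrix (assumption).
    B = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
    Rtap = B @ B.conj().T / L + 0.01 * np.eye(L)
    return A, Rw, Rtap


if __name__ == "__main__":
    K, L, Ex, sigma_w2, sigma_I2 = 32, 8, 1.0, 0.1, 0.5
    A, Rw, Rtap = make_example(K, L, Ex, sigma_w2)
    print("direct     :", ls_mse_direct(A, Rw, Rtap, sigma_I2))
    print("closed form:", ls_mse_closed_form(K, L, Ex, sigma_w2, sigma_I2, Rtap))
```

The two values agree to numerical precision under these assumptions, and the second term makes the error floor explicit: as σ_w² tends to zero the MSE tends to σ_I² Σ δ_i rather than to zero.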
APPENDIX C
PROOF OF THEOREM 3

Under both AWGN and pilot contamination, we replace $\mathbf{R}_w$ with $\mathbf{R}_E = \mathbf{R}_w + \sigma_I^2 \acute{\mathbf{A}}\mathbf{R}_h\acute{\mathbf{A}}^H$ to get

$$
\begin{aligned}
\mathrm{MSE}^{(O)} &= \mathrm{trace}\left\{ \left( \mathbf{R}_h^{-1} + \acute{\mathbf{A}}^H \left( \mathbf{R}_w + \sigma_I^2 \acute{\mathbf{A}}\mathbf{R}_h\acute{\mathbf{A}}^H \right)^{-1} \acute{\mathbf{A}} \right)^{-1} \right\} \\
&\overset{(a)}{=} \mathrm{trace}\left\{ \left( \mathbf{R}_h^{-1} + \acute{\mathbf{A}}^H\mathbf{R}_w^{-1}\acute{\mathbf{A}} - \sigma_I^2 \acute{\mathbf{A}}^H\mathbf{R}_w^{-1}\acute{\mathbf{A}} \left( \mathbf{R}_h^{-1} + \sigma_I^2 \acute{\mathbf{A}}^H\mathbf{R}_w^{-1}\acute{\mathbf{A}} \right)^{-1} \acute{\mathbf{A}}^H\mathbf{R}_w^{-1}\acute{\mathbf{A}} \right)^{-1} \right\}
\end{aligned}
$$

where $(a)$ follows from the matrix inversion lemma. Using the properties of the Kronecker product, it can be shown that $\acute{\mathbf{A}}^H\mathbf{R}_w^{-1}\acute{\mathbf{A}} = \frac{KE_x}{\sigma_w^2}(\mathbf{I}_R \otimes \mathbf{I}_L)$. Further, the channel correlation matrix $\mathbf{R}_h = \mathbf{R}_{array} \otimes \mathbf{R}_{tap}$ can be decomposed as $\mathbf{R}_h = (\mathbf{V} \otimes \mathbf{Q})(\mathbf{S} \otimes \boldsymbol{\Lambda})(\mathbf{V} \otimes \mathbf{Q})^H$, where we introduced the EVDs $\mathbf{R}_{array} = \mathbf{V}\mathbf{S}\mathbf{V}^H$ and $\mathbf{R}_{tap} = \mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^H$. Incorporating these results in $(a)$ yields

$$
\begin{aligned}
\mathrm{MSE}^{(O)} &\overset{(b)}{=} \mathrm{trace}\left\{ \left( \mathbf{S}^{-1} \otimes \boldsymbol{\Lambda}^{-1} + \frac{KE_x}{\sigma_w^2}(\mathbf{I}_R \otimes \mathbf{I}_L) - \sigma_I^2 \left( \frac{KE_x}{\sigma_w^2} \right)^{2} \left( \mathbf{S}^{-1} \otimes \boldsymbol{\Lambda}^{-1} + \frac{\sigma_I^2 KE_x}{\sigma_w^2}(\mathbf{I}_R \otimes \mathbf{I}_L) \right)^{-1} \right)^{-1} \right\} \\
&\overset{(c)}{=} \sum_{j=1}^{R}\sum_{i=1}^{L} \left( \frac{1}{\mu_j\delta_i} + \frac{KE_x}{\sigma_w^2} - \sigma_I^2 \left( \frac{KE_x}{\sigma_w^2} \right)^{2} \left( \frac{1}{\mu_j\delta_i} + \frac{\sigma_I^2 KE_x}{\sigma_w^2} \right)^{-1} \right)^{-1}
\end{aligned}
$$

where $(b)$ follows from the property $\mathrm{trace}(\mathbf{Q}\mathbf{R}\mathbf{Q}^H) = \mathrm{trace}(\mathbf{R})$ when $\mathbf{Q}$ is unitary, and $(c)$ is due to the diagonal nature of the matrix inside the trace operator, where $\mu_j$ and $\delta_i$ represent the eigenvalues of the matrices $\mathbf{R}_{array}$ and $\mathbf{R}_{tap}$ respectively. After some algebraic manipulations, $(c)$ simplifies to the result given in Theorem 3.

REFERENCES

[1] K. E. S. WWRF, L. Sorensen, “2020: Beyond 4G: Radio Evolution for the Gigabit Experience,” July 2009. [Online]. Available: http://www.wireless-world-research.org.
[2] A. J. Paulraj and T. Kailath, “Increasing capacity in wireless broadcast systems using distributed transmission/directional reception (DTDR),” Sep. 6 1994, US Patent 5,345,599. [Online]. Available: http://www.google.com/patents/US5345599
[3] E. Telatar, “Capacity of Multi-antenna Gaussian Channels,” European Transactions on Telecommunications, vol. 10, no. 6, pp. 585–595, 1999.
[4] B. P. Crow, I. Widjaja, J. G. Kim, and P. Sakai, “IEEE 802.11 Wireless Local Area Networks,” IEEE Communications Magazine, vol. 35, no. 9, pp. 116–126, Sep 1997.
[5] I. Koffman and V. Roman, “Broadband wireless access solutions based on OFDM access in IEEE 802.16,” IEEE Communications Magazine, vol. 40, no. 4, pp. 96–103, Apr 2002.
[6] G. T. 36.211, “Evolved Universal Terrestrial Radio Access (E-UTRA); Physical Channels and Modulation,” ver. 10.5.0, Sep. 2012.
[7] T. L. Marzetta, “Noncooperative Cellular Wireless with Unlimited Numbers of Base Station Antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–3600, November 2010.
[8] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, O. Edfors, and F. Tufvesson, “Scaling Up MIMO: Opportunities and Challenges with Very Large Arrays,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40–60, Jan 2013.
[9] E. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next generation wireless systems,” IEEE Communications Magazine, vol. 52, no. 2, pp. 186–195, February 2014.
[10] J. Hoydis, S. T. Brink, and M. Debbah, “Massive MIMO in the UL/DL of Cellular Networks: How Many Antennas Do We Need?” IEEE Journal on Selected Areas in Communications, vol. 31, no. 2, pp. 160–171, February 2013.
[11] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, “Five disruptive technology directions for 5G,” IEEE Communications Magazine, vol. 52, no. 2, pp. 74–80, February 2014.
[12] E. Bjornson and B. Ottersten, “A Framework for Training-Based Estimation in Arbitrarily Correlated Rician MIMO Channels With Rician Disturbance,” IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1807–1820, March 2010.
[13] J.-J. van de Beek, O. Edfors, M. Sandell, S. K. Wilson, and P. O. Borjesson, “On channel estimation in OFDM systems,” in IEEE 45th Vehicular Technology Conference, vol. 2, Jul 1995, pp. 815–819 vol.2.
[14] O. Edfors, M. Sandell, J.-J. van de Beek, S. K. Wilson, and P. O. Borjesson, “OFDM channel estimation by singular value decomposition,” in IEEE 46th Vehicular Technology Conference on Mobile Technology for the Human Race, vol. 2, Apr 1996, pp. 923–927 vol.2.
[15] M. K. Ozdemir, H. Arslan, and E. Arvas, “Toward real-time adaptive low-rank LMMSE channel estimation of MIMO-OFDM systems,” IEEE Transactions on Wireless Communications, vol. 5, no. 10, pp. 2675–2678, Oct 2006.
[16] C. Mehlfuhrer, S. Caban, and M. Rupp, “An accurate and low complex channel estimator for OFDM WiMAX,” in 3rd International Symposium on Communications, Control and Signal Processing (ISCCSP), March 2008, pp. 922–926.
[17] M. Simko, D. Wu, C. Mehlfuehrer, J. Eilert, and D. Liu, “Implementation Aspects of Channel Estimation for 3GPP LTE Terminals,” in 11th European Wireless Conference on Sustainable Wireless Technologies, April 2011, pp. 1–5.
[18] M. K. Ozdemir and H. Arslan, “Channel estimation for wireless OFDM systems,” IEEE Communications Surveys Tutorials, vol. 9, no. 2, pp. 18–48, Second 2007.
[19] N. Shariati, E. Bjornson, M. Bengtsson, and M. Debbah, “Low-complexity channel estimation in large-scale MIMO using polynomial expansion,” in IEEE 24th International Symposium on Personal Indoor and Mobile Radio Communications (PIMRC), Sept 2013, pp. 1157–1162.
[20] P. Xu, J. Wang, J. Wang, and F. Qi, “Analysis and Design of Channel Estimation in Multicell Multiuser MIMO OFDM Systems,” IEEE Transactions on Vehicular Technology, vol. PP, no. 99, pp. 1–11, 2014.
[21] H. Yin, D. Gesbert, M. Filippou, and Y. Liu, “A Coordinated Approach to Channel Estimation in Large-Scale Multiple-Antenna Systems,” IEEE Journal on Selected Areas in Communications, vol. 31, no. 2, pp. 264–273, February 2013.
[22] H. Q. Ngo and E. G. Larsson, “EVD-based channel estimation in multicell multiuser MIMO systems with very large antenna arrays,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), March 2012, pp. 3249–3252.
[23] S. L. H. Nguyen and A. Ghrayeb, “Compressive sensing-based channel estimation for massive multiuser MIMO systems,” in IEEE Wireless Communications and Networking Conference (WCNC), April 2013, pp. 2890–2895.
[24] Y. Barbotin, A. Hormati, S. Rangan, and M. Vetterli, “Estimating Sparse MIMO channels having Common Support,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011, pp. 2920–2923.
[25] M. Masood, L. H. Afify, and T. Y. Al-Naffouri, “Efficient Coordinated Recovery of Sparse Channels in Massive MIMO,” IEEE Transactions on Signal Processing, vol. 63, no. 1, pp. 104–118, Jan 2015.
[26] E. B. Al-Safadi and T. Y. Al-Naffouri, “Pilotless recovery of clipped OFDM signals by compressive sensing over reliable data carriers,” in IEEE 13th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), June 2012, pp. 580–584.
[27] H. Q. Ngo, T. L. Marzetta, and E. G. Larsson, “Analysis of the pilot contamination effect in very large multicell multiuser MIMO systems for physical channel models,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2011, pp. 3464–3467.
[28] J. Jose, A. Ashikhmin, T. L. Marzetta, and S. Vishwanath, “Pilot Contamination and Precoding in Multi-Cell TDD Systems,” IEEE Transactions on Wireless Communications, vol. 10, no. 8, pp. 2640–2651, August 2011.
[29] B. Gopalakrishnan and N. Jindal, “An analysis of pilot contamination on multi-user MIMO cellular systems with many antennas,” in IEEE 12th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), June 2011, pp. 381–385.
[30] Y. Dawei, F. W. Vook, T. A. Thomas, D. J. Love, and A. Ghosh, “Kronecker product correlation model and limited feedback codebook design in a 3D channel model,” in IEEE International Conference on Communications (ICC), June 2014, pp. 5865–5870.
[31] R. Negi and J. Cioffi, “Pilot tone selection for channel estimation in a mobile OFDM system,” IEEE Transactions on Consumer Electronics, vol. 44, no. 3, pp. 1122–1128, Aug 1998.
[32] A. H. Sayed, Fundamentals of Adaptive Filtering. Wiley, 2003. [Online]. Available: http://books.google.com.sa/books?id=VaAV4uqMuKYC
[33] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation, ser. Prentice-Hall information and system sciences series. Prentice Hall, 2000. [Online]. Available: http://books.google.com.sa/books?id=zNJFAQAAIAAJ
[34] A. H. Sayed, “Adaptation, Learning, and Optimization over Networks,” Foundations and Trends in Machine Learning, vol. 7, no. 4-5, pp. 311–801, 2014. [Online]. Available: http://dx.doi.org/10.1561/2200000051
[35] F. S. Cattivelli, C. G. Lopes, and A. H. Sayed, “Diffusion recursive least-squares for distributed estimation over adaptive networks,” IEEE Transactions on Signal Processing, vol. 56, no. 5, pp. 1865–1877, May 2008.
[36] H. ElSawy, E. Hossain, and M. Haenggi, “Stochastic Geometry for Modeling, Analysis, and Design of Multi-Tier and Cognitive Cellular Wireless Networks: A Survey,” IEEE Communications Surveys Tutorials, vol. 15, no. 3, pp. 996–1019, Third 2013.
[37] M. Di Renzo and W. Lu, “The Equivalent-in-Distribution (EiD)-Based Approach: On the Analysis of Cellular Networks Using Stochastic Geometry,” IEEE Communications Letters, vol. 18, no. 5, pp. 761–764, May 2014.
[38] R. K. Ganti and M. Haenggi, “Spatial and temporal correlation of the interference in ALOHA ad hoc networks,” IEEE Communications Letters, vol. 13, no. 9, pp. 631–633, Sept 2009.
[39] A. Ali, H. Elsawy, T. Y. Al-Naffouri, and M. Alouini, “Narrowband Interference Parameterization for Sparse Bayesian Recovery,” in IEEE International Conference on Communications (ICC), June 2015 (Accepted).
[40] S. Chiu, D. Stoyan, W. Kendall, and J. Mecke, Stochastic Geometry and Its Applications, ser. Wiley Series in Probability and Statistics. Wiley, 2013. [Online]. Available: https://books.google.com.sa/books?id=GCRI8Q-RUEkC
