
Received 4 April 2023; revised 19 September 2023; accepted 9 November 2023.

Date of publication 28 November 2023; date of current version 18 December 2023.


The associate editor coordinating the review of this article and approving it for publication was K. Seddik.
Digital Object Identifier 10.1109/TMLCN.2023.3334712

Deep-Learning-Based High-Precision
Localization With Massive MIMO
GUODA TIAN (Member, IEEE), ILAYDA YAMAN (Student Member, IEEE),
MICHIEL SANDRA (Student Member, IEEE), XUESONG CAI (Senior Member, IEEE),
LIANG LIU (Member, IEEE), AND FREDRIK TUFVESSON (Fellow, IEEE)
Department of Electrical and Information Technology, Lund University, 221 00 Lund, Sweden
CORRESPONDING AUTHORS: G. TIAN ([email protected]) AND X. CAI ([email protected])
This work was supported in part by Ericsson AB, in part by the Horizon Europe Framework Program through the Marie Skłodowska-Curie
Grant under Agreement 101059091, in part by the Swedish Research Council under Grant 2022-04691, and in part by the Strategic
Research Area Excellence Center at Linköping-Lund in Information Technology (ELLIIT).

ABSTRACT High-precision localization and machine learning (ML) are envisioned to be key technologies in future wireless systems. This paper presents an ML pipeline to solve localization tasks. It consists of multiple parallel processing chains, each trained using a different fingerprint to estimate the position of the user equipment. In this way, ensemble learning can be utilized to fuse all chains to improve localization performance. Nevertheless, a common problem of ML-based techniques is that network training and fine-tuning can be challenging due to the increase in network size when applied to (massive) multiple-input multiple-output (MIMO) systems. To address this issue, we utilize a subarray-based approach: we divide the large antenna array into several subarrays and feed the fingerprints of the subarrays into the pipeline. In our case, such an approach eases the training process while maintaining or even enhancing the performance. We also use the Nyquist sampling theorem to gain insight into how to appropriately sample and average the training data. Finally, an indoor measurement campaign is conducted at 3.7 GHz using the Lund University massive MIMO testbed to evaluate the approaches. Centimeter-level localization accuracy has been reached in this particular measurement campaign.

INDEX TERMS Channel measurements, deep learning, localization, massive MIMO.

I. INTRODUCTION

CELLULAR-based localization is expected to pave the way for various location-aware applications such as robotic navigation, emergency healthcare, and smart transportation [1], [2], [3], [4], [5], [6], [7]. The technology has undergone significant improvement over the years, and high-precision wireless localization is now included as a key feature in the fifth generation new radio (NR) standard, with strict requirements on localization accuracy [8].

Traditional localization approaches include proximity, triangulation (trilateration), fingerprint matching, and simultaneous localization and mapping [7]. Proximity approaches examine whether user equipment (UE) is close to pre-known locations by analyzing received wireless signal characteristics such as the received signal strength indicator (RSSI). Triangulation or trilateration technology estimates UE locations from delays or angles according to geometry. The general concept of fingerprint-based localization is to establish a radio map for the area of interest by storing channel features or fingerprints; the UE coordinates are estimated by comparing the received fingerprints with the previously stored fingerprints. Furthermore, with the aid of ultra-wideband (UWB) [9] and/or massive multiple-input multiple-output (MIMO) systems, it is possible to improve positioning accuracy thanks to the high delay resolution of UWB and the high angular resolution of massive MIMO [10], [11], [12], [13], [14], [15], [16]. For example, the work in [10], [15], and [16] proposed novel estimators to jointly estimate angles and positions with large-scale arrays. The authors in [11], [12], and [13] provided solutions for localization by designing tracking filters to exploit and track important propagation channel characteristics, i.e., the autocorrelation function of the received signal and the phase of the multipath components,
respectively. In particular, [12] validated their methods on a real massive MIMO testbed and showed that the localization accuracy can be significantly enhanced with 40 MHz bandwidth. [14] presented a direct localization method that treats localization as a joint optimization problem, which bypasses the channel estimation step and still achieves good positioning accuracy.

All of the aforementioned localization methods belong to the traditional signal processing family. The main challenges are high algorithm complexity and the requirement of base station (BS) array calibration [7]. On the other hand, machine learning (ML) based localization algorithms have gained significant interest [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32]. It is essential to appropriately select both fingerprints and algorithms. One can choose either the raw transfer function [17], [18], [19] or various channel fingerprints such as RSS, power delay profile (PDP), angular spectrum, correlation function, etc. [20], [21], [22], [23], [24], [25], [26], [27], [28] as learning features. Moreover, a variety of ML algorithms have been investigated, which can mainly be classified into two categories, namely the traditional ML family, such as K-nearest neighbors (KNN), support vector machines, kernel methods, random forest, and Gaussian process regression [21], [22], [23], [32], and the deep learning family [17], [18], [20], [24], [25], [26], [27], [28], [29], [30], [31]. Considering the features of massive MIMO systems, there is also great potential to apply ML techniques to massive MIMO systems to solve localization tasks. Early research [17], [18], [24] used convolutional neural networks (CNN) for localization. The work in [30] proposed an algorithm that trains an autoencoder to first calibrate the antenna array; the angle spectrum is then computed as a training feature. The work in [22] applied Gaussian process regression to perform localization with distributed massive MIMO systems.

However, there are still some research gaps that need to be further addressed: i) Most of the existing ML-based localization algorithms directly output the position of the UE without considering the uncertainty of the estimation, thus lacking effective information fusion from different channel fingerprints. ii) The size of the neural networks increases significantly with the increasing number of antennas, which may hinder the training and fine-tuning of the network. Therefore, it is essential to develop efficient localization algorithms that are suitable for massive MIMO systems. iii) A theoretical analysis of the necessary training density is missing in the literature. It is important to investigate the density of the training samples under different channel conditions, since training data collection is a time-consuming task. To address those limitations, our main contributions are as follows.¹

• We apply a localization framework that blends channel fingerprints containing information from the delay and angular domains, respectively. It is not necessary to calibrate the whole BS array to obtain those channel fingerprints.
• By dividing the whole array into subarrays, the network size can be reduced, which facilitates the training process while improving the localization performance.
• We apply the Nyquist sampling theorem to analyze how to appropriately collect and average training data.
• Finally, an indoor measurement campaign with a massive MIMO testbed was conducted to evaluate our approach. The results show that our pipeline can reach centimeter-level positioning accuracy with only 20 MHz bandwidth for this measurement campaign.

The remainder of this paper is organized as follows. In Section II, we introduce the signal model and briefly discuss the selected fingerprints. In Section III, we present the localization algorithms. Section IV illustrates the measurement campaign, and Section V presents the results. Finally, concluding remarks are given in Section VI.

¹ A preliminary version of this work [33] was presented at the 2023 IEEE International Conference on Communications. Unlike [33], this paper presents new material on the subarray method and a detailed analysis of the necessary training density. In addition, the pipeline in this paper is used to estimate both the UE position and the error variances.

II. SYSTEM MODEL AND FINGERPRINT GENERATION
We consider the uplink of a single-user massive MIMO system, which uses orthogonal frequency division multiplexing (OFDM) with F subcarriers. The UE has one antenna, while the BS is equipped with M antennas. Each antenna element is connected to an RF chain and a digital processing chain, which allows the BS to simultaneously process the received signals from all antennas. We assume that the UE moves at walking speed, and the 2-D position of the UE is given by p_i = (x_0, y_0) ∈ R², shortened to p_i in the following sections. Taking the propagation channel into account, the transfer function matrix Y_{p_i} = [y_{p_i,1}, ..., y_{p_i,F}] ∈ C^{M×F} for all subcarriers, corresponding to the UE position p_i, can be written as

Y_{p_i} = H_{p_i} \odot \Gamma + N,  (1)

where H_{p_i} ∈ C^{M×F} represents the uplink wireless propagation channel and Γ ∈ C^{M×F} the complex coefficients (amplitude scaling and phase drift) of all M RF chains and F subcarriers. Additionally, ⊙ is the Hadamard product, and N ∈ C^{M×F} denotes the receiver noise at all M RF chains. When the UE moves, a total of T snapshots are recorded and T different receive matrices Y_{p_i} are collected. Our aim is to find a functional relationship between Y_{p_i} and p_i, which falls into the category of a regression (estimation) task.

ML-based localization algorithms have the potential to achieve good performance if adequate channel fingerprints are selected as the input to the algorithms. Such fingerprints can be extracted from the raw received transfer function Y_{p_i}. In this paper, we analyze two fingerprints, namely the spatial covariance matrix and the truncated channel impulse response (CIR), since they can be obtained even with an uncalibrated array.


A. SPATIAL COVARIANCE MATRIX
It is sometimes challenging to extract calibrated fingerprints such as the angle of arrival due to the presence of the RF chain matrix Γ, see (1). Therefore, we consider using the covariance matrix C_i = E{y_{p_i} y_{p_i}^H} ∈ C^{M×M} as a fingerprint. The main diagonal elements of C_i (auto-correlations) indicate the received signal power at each antenna, whereas the off-diagonal elements of C_i represent the cross-correlations between different antennas.

Note that in practice one can typically only estimate the covariance matrix from a limited number of samples when carrying out the expectation operation. Suppose that for each position p_i there exist in total N_{p_i} positions in the neighborhood of p_i whose channel responses are accessible. Those N_{p_i} samples lie inside a circular area with p_i as the center and d as the diameter, i.e., ||p_j − p_i||_2 ≤ d/2, j = 1, 2, ..., N_{p_i}. We then define the sample covariance matrix C̃_{i,N_{p_i}} ∈ C^{M×M} to estimate C_i. Specifically,

\tilde{C}_{i,N_{p_i}} = \frac{1}{N_{p_i}} \sum_{j=1}^{N_{p_i}} Y_{p_j} Y_{p_j}^{H}.  (2)

As shown in (2), C̃_{i,N_{p_i}} depends on N_{p_i} and thus on d. A special case is d = 0 and N_{p_i} = 1, where C_i is estimated by correlating only across all subcarriers of Y_{p_i} at the fixed position p_i. We name this specific matrix the one-sample covariance matrix. Note that it is challenging to estimate C_i with this matrix for a narrowband system, since the channel responses at different subcarriers are strongly correlated. In contrast, when d is larger than half a wavelength, a major difference in the propagation channel can be observed and C̃_{i,N_{p_i}} can therefore better approach C_i. If d is large enough, the fingerprint C̃_{i,N_{p_i}} changes much more slowly than the one-sample covariance matrix as the UE moves, since the influence of small-scale fading is reduced by the averaging operation. Due to this, fewer training samples are needed.

Since the sample covariance matrix C̃_{i,N_{p_i}} is a Hermitian matrix, i.e., C̃_{i,N_{p_i}} = C̃_{i,N_{p_i}}^H, the elements above the diagonal contain the same information as those below. To decrease the computational complexity, we introduce another matrix C̆_{i,N_{p_i}} ∈ R^{M×M} and a vector c̃_{i,N_{p_i}} ∈ R^{M²} as

\breve{C}_{i,N_{p_i}} = \mathrm{ltril}\{\Re(\tilde{C}_{i,N_{p_i}})\} + \mathrm{sltril}\{\Im(\tilde{C}_{i,N_{p_i}})\},
\tilde{c}_{i,N_{p_i}} = \mathrm{vec}\{\breve{C}_{i,N_{p_i}}\},  (3)

where ℜ{.} and ℑ{.} denote taking the real and imaginary parts of a given matrix, respectively. ltril{.} represents a matrix operation that sets all values above the diagonal to zero while keeping the other matrix elements. The operation sltril{.} keeps all elements below the diagonal and sets all the remaining matrix elements (including the diagonal elements) to zero. The vec{.} operator converts a matrix to a vector.
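As a concrete illustration of (2)-(3), the following NumPy sketch builds the covariance fingerprint from a set of neighboring snapshots. All names are illustrative, and the placement of the imaginary part in the strictly upper triangle (via the transpose) is an assumption made here so that the real and imaginary entries occupy disjoint positions; it is not taken from the authors' code.

```python
import numpy as np

def covariance_fingerprint(snapshots):
    """Sample covariance fingerprint of Section II-A, cf. (2)-(3).

    snapshots : complex array (N_pi, M, F) of received transfer-function
                matrices Y_{p_j} from the N_pi neighboring positions
                (N_pi = 1 yields the one-sample covariance matrix).
    Returns the real-valued feature vector c~ of length M*M.
    """
    N_pi, M, F = snapshots.shape
    # (2): average Y Y^H over the neighborhood (the product also sums
    # over the F subcarriers)
    C_tilde = sum(Y @ Y.conj().T for Y in snapshots) / N_pi
    # (3): lower triangle (incl. diagonal) of the real part plus the strictly
    # lower triangle of the imaginary part.  The transpose places the
    # imaginary part in the strictly upper triangle so that the two parts do
    # not overlap; drop the .T for a literal reading of (3).
    C_breve = np.tril(C_tilde.real) + np.tril(C_tilde.imag, k=-1).T
    # vec{.}: column-major vectorization
    return C_breve.flatten(order="F")
```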
B. TRUNCATED CHANNEL IMPULSE RESPONSE
The fingerprint C_i does not contain channel information from the delay domain; however, it is still important to utilize the delay information to further improve the localization accuracy. To this end, the truncated CIR matrix Ξ ∈ C^{M×L} is generated by calculating the inverse discrete Fourier transform (IDFT) along each row of Y_{p_i}, followed by keeping the first L delay elements. We introduce a vector ξ ∈ R^{2ML}, which includes all elements of Ξ. Specifically, ξ = [vec{ℜ(Ξ)}^T, vec{ℑ(Ξ)}^T]^T.
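A minimal sketch of the truncated-CIR fingerprint described above, assuming NumPy's IDFT as the transform; the function name and the default value of L are only for illustration (L_w = 10 is the setting reported later in Section V-A).

```python
import numpy as np

def truncated_cir_fingerprint(Y, L=10):
    """Truncated-CIR fingerprint xi of Section II-B.

    Y : complex array (M, F), received transfer function Y_{p_i}.
    L : number of delay bins to keep.
    Returns a real vector of length 2*M*L.
    """
    cir = np.fft.ifft(Y, axis=1)      # IDFT along each row (frequency axis)
    Xi = cir[:, :L]                   # keep only the first L delay bins
    # xi = [vec{Re(Xi)}^T, vec{Im(Xi)}^T]^T, column-major vec{.}
    return np.concatenate([Xi.real.flatten(order="F"),
                           Xi.imag.flatten(order="F")])
```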
FIGURE 1. A typical structure of an FCNN.

III. ML-BASED LOCALIZATION APPROACH
A. NEURAL NETWORK BASICS
Neural networks have been widely used to solve various tasks such as channel estimation and wireless sensing, owing to their excellent ability to learn non-linear complex models [34]. These models can generally be represented as a multivariate function f : R^{V_1} → R^{V_2}, where V_1 and V_2 represent the dimensions of the learning features and of the targets, respectively. An example of a typical fully connected neural network (FCNN) is illustrated in Fig. 1, consisting of an input layer, several hidden layers, and an output layer. The numbers of nodes of the input and output layers are identical to V_1 and V_2, respectively. Specific to this 2-D localization task, we view the output of the neural network as a Gaussian distribution, which is determined by the estimated position of the UE (p̂ = [p̂_x, p̂_y]^T ∈ R²) and the variance (σ̂² = [σ̂_x², σ̂_y²]^T ∈ R²).

Two processes are usually involved when training a neural network, namely forward and backward propagation. In the forward propagation process, the input signals enter the neural network through the input layer, propagate through multiple hidden layers, and ultimately reach the output layer. At each layer, the output of a node is determined by the inputs from the previous layer, the respective weights and biases, and a non-linear activation function that is specific to that node. For example, suppose that an FCNN has γ_i nodes in the i-th layer, whose values are collected in a signal vector x^i = [x_1^i, ..., x_{γ_i}^i] ∈ R^{γ_i}. The value of the k-th node is calculated by applying a weight vector w^{i−1} = [w_1^{i−1}, ..., w_{γ_{i−1}}^{i−1}] ∈ R^{γ_{i−1}} to the signal vector x^{i−1} of the previous layer.

Specifically, x_k^i is computed as

x_k^i = g_i\left( \sum_{j=1}^{\gamma_{i-1}} x_j^{i-1} w_j^{i-1} + b_i \right),  (4)

where b_i represents an optional bias term and g_i(.) the activation function. The same propagation pattern is followed for each layer, generating an output vector ν = [p̂, σ̂²].

To train the network, it is important to select an appropriate training criterion, or so-called loss function. A popular criterion is the mean-square error (MSE), which measures the difference between the estimated localization coordinates and the ground-truth labels. However, the uncertainty of the predictions is not evaluated by the MSE, and therefore we consider the negative log-likelihood (NLL) loss function instead [35]. Suppose that the entire training dataset contains in total N_tr training samples. For the i-th sample, the network outputs the estimated UE coordinate p̂_i = [p̂_{x_i}, p̂_{y_i}]^T ∈ R² and the variance vector σ̂_i² = [σ̂_{x_i}², σ̂_{y_i}²]^T ∈ R², while the ground truth is p_i = [p_{x_i}, p_{y_i}]^T ∈ R². Taking all N_tr training samples into account, the loss function ψ is

\psi = \frac{1}{2 N_{tr}} \sum_i \left( \frac{\log \hat{\sigma}_{x_i}^2 \hat{\sigma}_{y_i}^2}{2} + \frac{(p_{x_i} - \hat{p}_{x_i})^2}{2 \hat{\sigma}_{x_i}^2} + \frac{(p_{y_i} - \hat{p}_{y_i})^2}{2 \hat{\sigma}_{y_i}^2} \right).  (5)

Observe that ψ can be negative owing to the log term. After selecting the training criterion, all trainable parameters, namely all weights and bias terms in (4) in each layer, need to be fine-tuned to minimize ψ. This optimization is carried out by backward propagation, which propagates the error signal back through each neural network layer to update the weights. Due to page limitations, we do not present the mathematical derivations; the relevant material can be found in [34].

As an evaluation procedure, we collect the test datasets and select the NLL loss as the evaluation criterion [35]. As indicated in (5), an under-confident variance estimate increases the first term, while an over-confident variance estimate increases the second and third terms.
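The criterion (5) can be written compactly as follows. This NumPy sketch is only meant to make the formula explicit; it is not the authors' implementation, and the array interface is assumed.

```python
import numpy as np

def nll_loss(p, p_hat, var_hat):
    """Gaussian NLL training criterion psi of (5).

    p, p_hat : arrays (N_tr, 2), true and estimated 2-D positions.
    var_hat  : array (N_tr, 2), estimated variances [sigma_x^2, sigma_y^2].
    """
    log_term = np.log(var_hat[:, 0] * var_hat[:, 1]) / 2.0
    sq_term = ((p - p_hat) ** 2 / (2.0 * var_hat)).sum(axis=1)
    # psi can be negative because of the log term
    return (log_term + sq_term).sum() / (2.0 * p.shape[0])
```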
FIGURE 2. The positioning neural network structure.

B. ML-BASED LOCALIZATION PIPELINE
We apply the idea of ensemble learning to the localization task. As a popular ML approach, ensemble learning targets performance improvements by training multiple base learners and then fusing their outputs [36]. Each learner should individually deliver decent results, and it is also important to embed enough diversity when selecting those base learners. Based on this insight, we apply the processing pipeline illustrated in Fig. 2. We select χ fingerprints and feed each fingerprint to an individual processing chain. Those fingerprints can be either the entire covariance matrix, or submatrices of it (see the subarray method in the following section), or the truncated CIR. Each processing chain estimates the 2-D UE coordinates as well as the variances. Suppose that the j-th processing chain estimates the position of the UE and the variance as p̂_{i,j} = [p̂_{x_{i,j}}, p̂_{y_{i,j}}] ∈ R² and σ̂²_{i,j} = [σ̂²_{x_{i,j}}, σ̂²_{y_{i,j}}] ∈ R². By fusing all χ processing chains according to the maximum ratio combining (MRC) approach [37], p̂_i and σ̂²_i are calculated as

\hat{\sigma}_{x_i}^2 = \frac{1}{\sum_j 1/\hat{\sigma}_{x_{i,j}}^2}, \qquad \hat{\sigma}_{y_i}^2 = \frac{1}{\sum_j 1/\hat{\sigma}_{y_{i,j}}^2},  (6)

\hat{p}_{x_i} = \hat{\sigma}_{x_i}^2 \sum_j \frac{\hat{p}_{x_{i,j}}}{\hat{\sigma}_{x_{i,j}}^2}, \qquad \hat{p}_{y_i} = \hat{\sigma}_{y_i}^2 \sum_j \frac{\hat{p}_{y_{i,j}}}{\hat{\sigma}_{y_{i,j}}^2}.  (7)

However, the variance estimated by (6) may be overconfident, especially when the network is overfitted. According to (6), σ̂²_{x_i} and σ̂²_{y_i} are smaller than each individual σ̂²_{x_{i,j}} and σ̂²_{y_{i,j}}, respectively. This may increase the NLL, since σ̂²_{x_i} and σ̂²_{y_i} act as the denominators of the second and third terms, respectively. To address this issue, we multiply σ̂²_i by a factor χ to obtain the modified vector σ̂²_{i,mod} ∈ R², which is the harmonic average of all estimated variances. Specifically,

\hat{\sigma}_{i,\mathrm{mod}}^2 = \chi \, \hat{\sigma}_i^2.  (8)
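A small sketch of the MRC fusion (6)-(8), assuming the per-chain outputs are already available as arrays; the helper name is hypothetical.

```python
import numpy as np

def fuse_chains(p_hat, var_hat):
    """Fuse the chi processing-chain outputs by MRC, cf. (6)-(8).

    p_hat   : array (chi, 2), per-chain position estimates [x, y].
    var_hat : array (chi, 2), per-chain variance estimates.
    Returns the fused position, the MRC variance (6), and the modified
    variance (8), i.e. the harmonic average of the per-chain variances.
    """
    chi = var_hat.shape[0]
    inv_var = 1.0 / var_hat                              # (chi, 2)
    var_mrc = 1.0 / inv_var.sum(axis=0)                  # (6)
    p_fused = var_mrc * (p_hat * inv_var).sum(axis=0)    # (7)
    var_mod = chi * var_mrc                              # (8)
    return p_fused, var_mrc, var_mod
```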
C. TRAINING ON THE SUBARRAYS
The size of the neural network increases significantly with the number of antennas, which leads to a risk of overfitting. To address this problem, subarray methods can be considered. In this paper, we use the covariance matrix as an example; however, this method can be generalized to other fingerprints. We assume that an M_1 × M_2 rectangular antenna array is deployed at the BS side. The spatial correlation between the channel responses of two antennas is reduced to a large extent if their separation is larger than the coherence distance. Motivated by this fact, we divide the whole antenna array into I subarrays and train I neural networks instead of feeding the whole covariance matrix into the processing chain. The subarrays are selected as follows.

We define a rectangular sliding kernel with a size of N_1 rows and N_2 columns, which captures in total N_1 N_2 antennas. We first place the kernel in the upper left corner of the whole array, to select the antennas that belong to the first N_1 rows and N_2 columns. The sliding kernel then moves S_2 columns to the right and assigns its antennas to a new group. When the sliding kernel reaches the last column,


it moves S_1 rows downward, followed by moving S_2 columns to the left until the kernel hits the first column. This procedure is repeated until the entire array is scanned by the kernel and I = (⌊(M_1 − N_1)/S_1⌋ + 1)(⌊(M_2 − N_2)/S_2⌋ + 1) training groups are formed, where ⌊.⌋ denotes the floor function. We then formulate I sample covariance matrices that correspond to the UE position p_i, which are denoted as Ĉ¹_{i,N_{p_i}}, ..., Ĉᴵ_{i,N_{p_i}} ∈ C^{N_1 N_2 × N_1 N_2}. These covariance matrices are fed into the pipeline shown in Fig. 3, to obtain the estimated UE positions and variances.

FIGURE 3. The subarray method.
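The sliding-kernel grouping can be sketched as below. Only the kernel/stride mechanics follow Section III-C; the strides S1 = S2 = 4 in the usage example are an assumption (Table 3 in Section V-B lists the actual antenna groups used in the measurements).

```python
import numpy as np

def subarray_groups(M1, M2, N1, N2, S1, S2):
    """Enumerate the antenna-index groups selected by the sliding kernel.

    Antennas are indexed row-wise on the M1 x M2 array.  Returns a list of
    I = (floor((M1-N1)/S1)+1) * (floor((M2-N2)/S2)+1) index arrays,
    each containing N1*N2 antenna indices.
    """
    idx = np.arange(M1 * M2).reshape(M1, M2)   # antenna index grid
    groups = []
    for r in range(0, M1 - N1 + 1, S1):        # kernel moves S1 rows at a time
        for c in range(0, M2 - N2 + 1, S2):    # and S2 columns at a time
            groups.append(idx[r:r + N1, c:c + N2].flatten())
    return groups

# Example: a 4 x 25 panel with a 4 x 8 kernel and assumed strides S1 = S2 = 4
# yields I = 5 overlapping subarrays of 32 antennas each.
groups = subarray_groups(M1=4, M2=25, N1=4, N2=8, S1=4, S2=4)
```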

D. TRAINING DENSITY
A fundamental question of ML-based localization is how many training samples are necessary. According to the Nyquist sampling theorem, an insufficient number of training samples results in aliasing, which has a detrimental effect on system performance. To determine the necessary training density, we apply this theorem to investigate the maximum separation distance between two adjacent training samples during the training process. Some degree of aliasing is allowed, since our task is to estimate the UE location rather than to perfectly reconstruct the propagation channel. For convenience, we confine the scope of our approach to uniform sampling.

We consider the vector c̃_{i,N_{p_i}} generated by (3), which varies as the UE moves to Q different positions. To simplify this analysis, the UE position labels are assumed to be evenly distributed along a straight line, and the geographical distance between these Q positions is δ_d. We define a matrix Č = [c̃_{1,N_{p_i}}, ..., c̃_{Q,N_{p_i}}] ∈ R^{M²×Q} to collect all those Q channel response vectors. By performing the 1-D discrete Fourier transform (DFT) of Č along the horizontal axis, we can formulate a matrix Ψ ∈ C^{M²×Q} that characterizes the channel variations along those Q positions. Specifically,

\Psi = \check{C} \Lambda,  (9)

where Λ ∈ C^{Q×Q} is the DFT matrix. We then define a spectrum window L, which covers the L consecutive columns corresponding to the lower frequency components of Ψ. Once L is selected, the corresponding sampling distance Δ_d between two adjacent samples can be calculated as

\Delta_d = Q \, \delta_d / L.  (10)

We then form those L columns into a new matrix Ψ_L ∈ C^{M²×L} and define η as the ratio between the squared Frobenius norms of Ψ_L and Ψ, that is, η = ||Ψ_L||²_F / ||Ψ||²_F. Here, η shows the extent of aliasing for different sampling intervals. In the following sections, we will analyze the influence of η on the localization accuracy and discuss the choice of Δ_d.
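A sketch of the spatial spectrum (9) and the aliasing ratio η, assuming that the "lower frequency components" are taken as the first L DFT bins; a window symmetric around DC is an alternative reading. Names are illustrative.

```python
import numpy as np

def spatial_spectrum_ratio(C_check, L):
    """Spatial spectrum of a fingerprint sequence and the ratio eta, cf. (9).

    C_check : real array (M*M, Q), fingerprint vectors c~ at Q positions
              spaced delta_d apart along a line.
    L       : width of the low-frequency spectrum window.
    """
    # (9): 1-D DFT along the position axis (multiplication by the DFT matrix)
    Psi = np.fft.fft(C_check, axis=1)
    Psi_L = Psi[:, :L]           # keep the L lowest-frequency columns
    eta = np.linalg.norm(Psi_L, 'fro') ** 2 / np.linalg.norm(Psi, 'fro') ** 2
    return Psi, eta

# The implied training-sample spacing for a window of width L, cf. (10):
# Delta_d = Q * delta_d / L
```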
IV. MASSIVE MIMO MEASUREMENT CAMPAIGN
To validate our approach, an indoor measurement campaign was carried out in the Lund University Humanities Lab motion capture (mocap) studio. Photos of the mocap studio are shown in Fig. 4. We give a brief introduction to the measurement campaign here; more details can be found in [38].

A. INTRODUCTION TO THE MEASUREMENT CAMPAIGN
In this measurement, we use a robot to carry the UE, which has a single dipole antenna placed at a height of 1.73 m. The parameter settings of our measurement system are similar to those of an LTE system. Specifically, our system occupies 20 MHz of bandwidth, which consists of 100 physical resource blocks (PRBs), and each PRB has 12 subcarriers. The subcarrier resource is allocated to multiple users in such a way that each UE occupies every 12th subcarrier, in total 100 subcarriers. Specific to this measurement, the UE transmits uplink pilots on the 1st, 13th, 25th, ..., 1188th subcarrier to estimate the uplink channel, and the estimated channel responses are recorded every 10 ms. Those pilots are received by the Lund University massive MIMO testbed (LuMaMi) [39], with 100 active patch antennas operating at a center frequency of 3.7 GHz (wavelength λ ≈ 0.081 m). The antennas are separated by a distance of around 4 cm (half a wavelength at 3.7 GHz) in both the vertical and horizontal directions. Since our objective is to exploit more information from the azimuth domain than from the elevation domain, a wide 4 × 25 antenna configuration is selected.

We analyzed 75 pre-defined robot trajectories, where the robot was the only moving object and all other objects were static. T_i channel snapshots were recorded on the i-th trajectory, and each snapshot is represented by a matrix of dimension M × F (M = F = 100). A complex tensor A_i ∈ C^{T_i×M×F} was then formulated to collect all snapshots.

FIGURE 4. The indoor measurement campaign.

FIGURE 5. Measurement arrangement in the mocap studio; the red dotted arrow shows the trajectory of the antenna.

While the robot was moving, the position of the antenna was continuously recorded every 10 ms by the mocap system. The measurements began by locating the robot at the edge of the predefined 4.2 × 2.5 m² measurement area. The robot moved at a speed of 0.1 m/s straight along the x direction; see Fig. 5. Between different measurements, the robot was moved approximately 5 cm along the y direction while maintaining its orientation. This procedure was repeated 75 times to densely scan the entire measurement area with a resolution of approximately 5 cm in the y direction and 1 mm in the x direction. When scanning the whole measurement area, we collect T = Σ_i T_i = 302500 channel snapshots. We define a tensor A′ ∈ C^{T×M×N} that combines all A_i. A′ is then normalized by multiplying it with a scalar so that the Euclidean norm of A′ is equal to TMN. All T collected samples are divided into two datasets, namely the training dataset with X samples and the testing dataset with T − X samples. Training samples are evenly distributed along the x-axis with spacing Δ_d. Channel samples that are not selected for training are used as testing data unless otherwise noted.

B. MEASURED PROPAGATION CHANNEL CHARACTERISTICS
One UE position is selected (position A, see Fig. 5) to illustrate the measured indoor channel properties. We present the power delay profile and the power of the transfer functions for all 100 antennas in Fig. 6. The power delay profile shows a typical indoor short-range channel characteristic: the first few delay bins contain the majority of the power in the delay domain. Such characteristics are also revealed in the transfer functions in Fig. 6, which show significant variations in the channel responses among different antennas, whereas the frequency correlation is rather high: the channel responses vary much more smoothly across different subcarriers for every single antenna.

We evaluate the spatial correlation of the channel at different UE positions by computing the correlation coefficient ρ(Δ_d) as

\rho(\Delta_d) = \frac{1}{P'} \sum_{x} \frac{\tilde{y}_{p_x}^{H} \tilde{y}_{p_x+\Delta_d}}{\sqrt{\|\tilde{y}_{p_x}\|^2 \, \|\tilde{y}_{p_x+\Delta_d}\|^2}},  (11)

where ỹ_{p_x} ∈ C^{MN} is obtained by reorganizing the received channel matrix Y_{p_i} as a vector. P′ denotes the total number of UE positions, while Δ_d denotes the distance between two adjacent UE positions. To visualize the spatial correlation, the absolute value of ρ(Δ_d) with respect to the first UE trajectory is plotted according to (11) in Fig. 7. The separation distance Δ_d ranges from 0 to 2λ. As shown, a strong spatial correlation can be expected when Δ_d ≤ 1/8 λ; however, it decreases significantly for larger separations.
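The empirical correlation (11) can be computed along one trajectory as sketched below (the paper plots its absolute value); the snapshot-list interface is an assumption.

```python
import numpy as np

def spatial_correlation(Y_list, lag):
    """Empirical spatial correlation coefficient rho(Delta_d), cf. (11).

    Y_list : list of complex arrays (M, F), channel snapshots along one
             trajectory, ordered by position with uniform spacing.
    lag    : separation in number of snapshots, so that
             Delta_d = lag * (spacing between adjacent positions).
    """
    y = [Y.flatten() for Y in Y_list]            # reorganize Y as vectors
    P = len(y) - lag                             # number of usable pairs
    rho = 0.0 + 0.0j
    for x in range(P):
        num = np.vdot(y[x], y[x + lag])          # y_x^H y_{x+lag}
        den = np.sqrt(np.vdot(y[x], y[x]).real *
                      np.vdot(y[x + lag], y[x + lag]).real)
        rho += num / den
    return rho / P
```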
For all measurement data, the SNR at the signal frequency points ranges from 1 dB to 11 dB, depending on the distance between the UE and the BS and on the constructive or destructive influence of small-scale fading.

V. RESULTS AND DISCUSSION
In this section, we evaluate our localization pipeline using the measurement dataset. We first investigate various channel fingerprints and then demonstrate the localization accuracy gain achieved by the subarray method. Spatial spectra of estimated covariance matrices are generated in order to further evaluate the impact of the training density on the localization accuracy, leveraging the Nyquist sampling theorem. Finally,


we compare our approach with a classic K-nearest neighbors (KNN) based algorithm and a CNN-based algorithm [40].

FIGURE 6. Power delay profile and power of the transfer function at position A.

FIGURE 7. Empirical spatial correlation function w.r.t. one UE moving trajectory.

TABLE 1. Parameter settings of the neural network.

A. INVESTIGATION ON CHANNEL FINGERPRINTS
We first investigate two commonly used channel fingerprints, namely the truncated CIR and the one-sample covariance matrix, which respectively capture the delay-domain and spatial-domain CSI. It is straightforward to generate these fingerprints. Their localization performance is compared to the case when only the channel transfer function is used. To this end, we train three neural networks: network 1 trains on the raw received transfer function Y_{p_i} itself; network 2 on the one-sample covariance matrix of the whole array with M = 100 antennas; and network 3 on the truncated CIR in the first L_w = 10 delay bins, which accounts for the limited system bandwidth (20 MHz) and the typical indoor measurement scenario with a strong line-of-sight (LoS) component. The frameworks of the three FCNNs are programmed based on Fig. 1 and are summarized in Table 1. Since it is important to avoid the vanishing gradient problem [41], we apply a leaky rectified linear unit (LReLU) as the nonlinear activation function at the input layer and at all hidden layers. At the output layer, softmax is applied as the activation function to estimate the variances of the position, while LReLU is applied to estimate the UE positions. We initially set the learning rate of the first FCNN to 10⁻⁵ and those of the second and third to 10⁻⁴, and all learning rates are reduced by 20% every 10 epochs. Compared to our previous work [33], we reduced the time complexity of Network 2 from O(M⁴) to O(M³).
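For orientation, a PyTorch sketch of one processing chain is given below. The LReLU activations and the learning-rate schedule follow the description above; the layer widths, the optimizer choice, and the softplus variance activation are assumptions of this sketch (Table 1 specifies the actual structure, and the paper states that softmax is applied at the output layer for the variances).

```python
import torch
import torch.nn as nn

class PositioningChain(nn.Module):
    """One processing chain: an FCNN mapping a fingerprint vector to a 2-D
    position estimate and per-coordinate variances (illustrative sizes)."""
    def __init__(self, in_dim, hidden=(1024, 512, 256)):
        super().__init__()
        layers, d = [], in_dim
        for h in hidden:
            layers += [nn.Linear(d, h), nn.LeakyReLU()]
            d = h
        self.backbone = nn.Sequential(*layers)
        self.pos_head = nn.Sequential(nn.Linear(d, 2), nn.LeakyReLU())
        self.var_head = nn.Sequential(nn.Linear(d, 2), nn.Softplus())

    def forward(self, x):
        z = self.backbone(x)
        return self.pos_head(z), self.var_head(z)

def gaussian_nll(p, p_hat, var_hat):
    """Torch version of the NLL criterion (5)."""
    per_sample = 0.5 * torch.log(var_hat).sum(dim=1) \
        + ((p - p_hat) ** 2 / (2.0 * var_hat)).sum(dim=1)
    return per_sample.mean() / 2.0

# Training setup mirroring Section V-A: initial LR 1e-4, reduced by 20 %
# every 10 epochs (the use of Adam is an assumption).
model = PositioningChain(in_dim=100 * 100)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=10, gamma=0.8)
```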
Fig. 8 shows the localization accuracy of applying the three networks individually, as well as the accuracy when fusing networks 2 and 3 according to (7). In Fig. 8 (a), Δ_d equals 1/8 λ along all 75 robot trajectories, compared to Fig. 8 (b) where Δ_d = 3/4 λ. As presented in Fig. 8, training on the truncated CIR outperforms the raw transfer function, although they embed the same CSI. We postulate that when training on the truncated CIR, the reduction in network size facilitates the training process. The signal-to-noise ratio (SNR) is also enhanced when the tail part of the CIR is truncated, since this part contains only noise instead of useful CSI. The localization accuracy when training on the one-sample covariance matrix significantly outperforms the raw transfer function and the CIR, although the delay-domain information is not embedded in this fingerprint. There are two potential explanations: i) it is challenging to resolve multipath components due to the limited bandwidth, and the propagation channel has a strong LoS property; ii) owing to the pre-processing, the angular information can be better exploited by the neural network. The fusion algorithm results in a slight improvement in localization accuracy in comparison to using the pure one-sample covariance matrix, since the system bandwidth is limited to 20 MHz and it is challenging to provide a good delay resolution. However, the CSI in the delay domain is still beneficial for localization tasks even with limited bandwidth.

FIGURE 8. Positioning error cumulative density function with respect to different training densities: (a) Δ_d = 1/8 λ, (b) Δ_d = 3/4 λ.

Thus, we believe that the delay-domain information can contribute more under scenarios with rich multipath or for a system occupying a wider bandwidth. Compared to Fig. 8 (a), the localization accuracy shown in Fig. 8 (b) decreases significantly. We postulate that when Δ_d = 3/4 λ, the training density is not sufficient to represent the instantaneous channel properties.

We then calculate the NLL loss of all the aforementioned localization algorithms on the training and test datasets; the results for the test dataset are listed in Table 2. As mentioned in Section III, this loss function considers the localization accuracy and the estimated variance jointly. The NLL loss of Network 3 is higher than that of the other two networks, even though it delivers better localization accuracy. We observe that the standard deviation predicted by Network 3 is much smaller than the position error, which results in a significant increase of the second and third terms of (5). Based on this observation, Network 3 is overconfident. This problem is even more severe if we fuse the outputs of Network 2 and Network 3 according to (6), because (6) produces a fused variance that is smaller than that of all individual chains. In contrast, this problem can be alleviated by calculating the harmonic average of the estimated variances of Network 2 and Network 3 according to (8). In the following section, we apply the subarray method to further address this overconfidence problem and focus merely on spatial channel fingerprints, considering the limited system bandwidth.

TABLE 2. The NLL loss evaluated on the testing dataset.

TABLE 3. Antenna indexes in 5 groups.

TABLE 4. Network structure for each subarray.

B. ENHANCEMENT BY SUBARRAY METHOD
We apply the subarray method in order to address the overconfidence problem and further enhance the localization performance. Specific to this measurement setup, we consider the trade-off between complexity and localization accuracy and formulate in total 5 subarrays, each with 32 antennas (N_1 = 4 and N_2 = 8). We present the antenna indexes for each subarray in Table 3. The antenna indexes are grouped in such a way that the physical distances between the antennas within each group are small; therefore, the signals captured by those antennas are strongly correlated. Note that a few antennas belong to multiple groups, so that the spatial correlation information among antennas from different subarray groups is included as well. These subarrays are fed into 5 subnetworks with identical network structures, which are presented in Table 4. Compared to Network 2, the size of each subnetwork is significantly reduced, which facilitates the training process since it is easier to avoid overfitting. For all 5 networks, the activation functions and the total number of training epochs are the same as for Network 2. The initial learning rates of all 5 networks are set to 2 × 10⁻⁴ and reduced by 20% every 10 epochs.


FIGURE 9. Positioning errors of using subarrays and the whole array w.r.t. different training densities: (a) Δ_d = 1/8 λ, (b) Δ_d = 3/4 λ.

Fig. 9 compares the localization performance of the subarray method with that of the whole-array method. The localization accuracy and the NLL loss when Δ_d is 1/8 λ and 3/4 λ are shown in Fig. 9 (a) and (b), respectively. Fig. 9 shows that the localization errors of all 5 groups are close to each other and comparable to using the one-sample covariance matrix of the entire array. The NLL loss of the subarray method is much lower than that of the whole array, which indicates that the subarray method better estimates the uncertainty. The localization performance, in terms of both accuracy and NLL loss, can be further improved by applying the MRC method to fuse the outputs of all 5 subarrays. This result illustrates the importance of selecting a proper training input, since the performance gain can be clearly seen even though the covariance matrix of the entire array contains the same information as all subarrays together. However, we still observe that if the training density is decreased, the localization accuracy degrades. Therefore, we address this problem in the following sections by first investigating the influence of the training density on the localization performance. In the next step, more accurate estimates of the covariance matrices are calculated by averaging more samples at different positions, and those matrices are applied as the training fingerprints.

FIGURE 10. Spatial spectra of covariance matrices w.r.t. different average distances: (a) d = 0 (one-sample covariance matrix), (b) d = λ/2, (c) d = 2λ.

C. TRAINING DENSITY ANALYSIS USING NYQUIST THEOREM
We apply the Nyquist theorem to the measurement dataset to analyze the influence of the training density. This paper focuses on the covariance matrix as an example, but our method can be applied to other channel fingerprints as well. The spatial spectra of the covariance matrices with respect to different average distances (d = 0, λ/2, 2λ) for all 75 measurements are computed according to (2)-(3) and (9). The spatial spectra of the i-th measurement w.r.t. the three distances are denoted as Ψ_{i,d=0}, Ψ_{i,d=λ/2}, Ψ_{i,d=2λ} ∈ C^{M²×T_i}. To visualize the spectra, we select i = 75 and plot those three spatial spectra in Fig. 10. As shown, when d increases, most of the spectral energy is concentrated in the low-frequency region, indicating that the channel changes more slowly when the UE moves to different positions. This phenomenon can be explained from a channel propagation perspective: small-scale fading is smoothed out by the averaging operation, so that swift changes of the channel responses are no longer observed. This allows us to further reduce the necessary training samples.

FIGURE 11. η′ with respect to sampling distance Δ_d.

We further investigate the relationship between Δ_d along the x-axis (see Fig. 5) and the level of aliasing noise introduced to the system. Once Δ_d increases, the captured spectrum window L_i = T_i δ_d / Δ_d in the i-th measurement decreases, see (10), and more aliasing noise is introduced. To simplify the evaluation of the effect of aliasing noise, Δ_d is selected to be the same for all 75 measurement trajectories during the training phase. Based on this, we define a parameter η′ = Σ_i ||Ψ_{L_i}||²_F / Σ_i ||Ψ_{i,d}||²_F to characterize the extent of frequency aliasing. If η′ is close to 1, the aliasing noise is weak. We plot η′ for the three covariance matrices in Fig. 11. If d = 0 and Δ_d exceeds the Nyquist distance Δ_nqt = 1/4 λ, η′ drops noticeably, and the influence of aliasing noise is not negligible. For d = 1/2 λ, η′ drops more smoothly after Δ_nqt = 0.5λ. If d = 2λ, Δ_nqt increases to λ; even when Δ_d exceeds the Nyquist distance, η′ drops very slowly compared to d = 1/2 λ. This indicates that the influence of small-scale fading is rather weak.

D. COMPARISON BETWEEN DIFFERENT COVARIANCE MATRICES
Fig. 12 illustrates the localization performance of the three aforementioned covariance matrices with different training densities. To compare the performance fairly, subarray methods are applied and the antennas are grouped in the same way as in Table 3. All three networks are programmed according to Table 4. It can be observed from Fig. 12 that when Δ_d exceeds Δ_nqt, the positioning accuracy decreases more, because the negative effect of the aliasing noise can no longer be ignored. The Nyquist distance Δ_nqt can be extended by increasing the average distance d to cover more samples when formulating the covariance matrix. By comparing Fig. 12 (a), (b), and (c), we see that when d increases, both the localization accuracy and the NLL improve, especially under low training density. We postulate that three important factors contribute to this improvement: (1) by averaging more samples, the noise energy is reduced and the SNR is increased; (2) the system bandwidth is limited to 20 MHz, while the coherence bandwidth of the channel is around 10 MHz. Under this condition, C_i = E{y_{p_i} y_{p_i}^H} cannot be represented by the one-sample covariance matrix, since many subcarriers are still strongly correlated. However, if we consider different positions that are far enough from each other but within the wide-sense stationary region, their corresponding channel responses are weakly correlated, and the estimated covariance matrix C̃_{i,N_{p_i}} better approaches C_i. (3) When d is large, C̃_{i,N_{p_i}} changes much more smoothly over different positions due to the absence of small-scale fading, and η′ drops much more slowly. This guarantees that, with the same training density, less aliasing noise is introduced into the system.

E. COMPARISON WITH OTHER APPROACHES
We now compare our pipeline with two other representative approaches, namely the traditional KNN localization (naive fingerprinting) approach and the deep-learning-based approach using CNNs [40].

1) KNN APPROACH
This approach first establishes a database that stores all training fingerprints. When a new localization request is received, the BS finds the K closest fingerprints in the database. In this paper, the estimated covariance matrix is selected as the fingerprint. We denote Ĉ_{Tr,i} as the i-th training fingerprint stored in the database and Ĉ_{Te} as the fingerprint of a testing sample. We then define a scalar l_i = ||Ĉ_{Tr,i} − Ĉ_{Te}||²_F. After calculating all N_tr distances l_i, we select the k = 4 lowest l_i and denote their coordinates as p̃_i. Applying the weighted KNN algorithm [42], the final estimated position p̆ ∈ R² is calculated as

\breve{p} = \sum_{i=1}^{4} w_i \tilde{p}_i,  (12)

where the weight w_i is defined as w_i = (1/l_i) / Σ_j (1/l_j).
MATRICES method with respect to different 1d . For a fair compari-
Fig. 12 illustrates the localization performances of three son, the same training data are used here as our pipeline.
aforementioned covariance matrices with different training As a concern of the complexity issue, we randomly select
densities. To fairly compare performance, subarray methods 20000 testing channel samples rather than using all available
are applied and antennas are grouped in the same way as ones. We observe that the KNN methods perform better than
in Table 3. All three networks are programmed according the pipeline if the channel is densely sampled. However, when
to Table. 4. It can be observed from Fig. 12 that when 1d the training density is reduced, the neural network method
further exceeds 1nqt , positioning accuracy decreases more approaches and outperforms the KNN. This can be explained
because the negative effect of the aliasing noise cannot be as follows: when the channel is heavily oversampled, it is
ignored. The Nyquist distance 1nqt can be extended by possible to find a few pre-stored channel fingerprints in the
increasing the average distance d to cover more samples when database, which are very similar to the test channel finger-
formulating the covariance matrix. By comparing Fig. 12 print. The localization accuracy is already good by directly
(a), (b), and (c), we see that when d increases, both the reading the coordinates of the closest fingerprints in the
localization accuracy and the NLL improve, especially under database, let alone the further improvement by the interpo-
low training density. We postulate that three important factors lation operation shown in (12). In comparison, the neural
contribute to this improvement: (1) by averaging more sam- network mechanism estimates the UE position based on inter-
ples, the noise energy is reduced and the SNR is increased; polating on the whole training datasets rather than the few
(2) the system bandwidth is limited to 20 MHz, while the closest fingerprints, which results in suboptimal localization


FIGURE 12. Comparison between the positioning error of the pipeline and the kNN method when using covariance matrices w.r.t. different average distances as training fingerprints: (a) d = 0, one-sample covariance matrix, (b) d = λ/2, (c) d = 2λ.

FIGURE 13. Positioning errors of our pipeline and the localization approach in [40] under the LoS scenario.

accuracy. However, when the training density is reduced, it cannot be guaranteed that the channel fingerprints in the database are close to the test channel fingerprint; the method therefore only performs well if such a fingerprint exists. In contrast, the neural network is likely more suitable for the localization task thanks to its nonlinear interpolation ability. Bear in mind that the KNN method generally requires computing the Euclidean distance between the testing fingerprint and all N_r prestored training fingerprints. This leads to a high time complexity of O(M² N_r) according to (12). In comparison, our pipeline has a better time complexity of O(M³), since the antenna number M is much smaller than N_r for most commercial devices. Even when M becomes larger, one can use the subarray method to reduce the number of antennas in each group and thus the running time in practice. On the other hand, if one wants to achieve a better localization result using the KNN method, it is necessary to pre-store a very large number of measurement samples in the database. However, it is a resource-intensive endeavor, in terms of both cost and time, to construct such densely sampled indoor measurement datasets if the distance between adjacent samples has to be smaller than the Nyquist distance (only a few centimeters at sub-6 GHz frequencies). For most applications, one would spend fewer resources on data collection, and thus lower densities are expected. Therefore, the usability of the KNN method in a real-time operation scenario is rather limited. From this perspective, the processing pipeline still has advantages even under the condition of a high training density.

2) CNN-BASED APPROACH
We then compare our localization approach with [40], which trained a deep residual CNN to perform indoor localization tasks and for which open-source code is available. The authors of [40] first formulated a tensor Y_{p_i} ∈ R^{M×F×2} by collecting the real and imaginary parts of the raw received complex transfer function Y_{p_i}. Then Y_{p_i} was converted to the polar domain by calculating the amplitude and phase of each entry of the received transfer function matrix, forming a tensor Ỹ_{p_i} ∈ R^{M×F×2}. The inverse Fourier transform was also performed to obtain the CIR matrix Ξ̃ ∈ C^{M×F} and the corresponding tensor Ŷ_{p_i} ∈ R^{M×F×2}. In the next step, the authors formulated a tensor I ∈ R^{M×F×6} by concatenating Y_{p_i}, Ỹ_{p_i}, and Ŷ_{p_i} and fed this tensor to a residual CNN. The network structure is programmed according to [40] and its open-source code. We modify the size of the input layer, since the antenna number in our case is 100 instead of 64. We plot the localization accuracy of this approach in Fig. 13, where Δ_d = 1/4 λ and the training percentage is around 5%. As illustrated, our pipeline achieves better localization accuracy than [40], even when a one-sample covariance matrix is used for training. The localization accuracy can be slightly improved if we increase the average distance d to 1/2 λ at this training density. We postulate that our pipeline benefits from the pre-processing step as well as from the subarray method. We have also performed a time complexity analysis of the convolutional neural network, which is O(M F l_1 l_2 C_in C_out), where l_1 and l_2 represent the 2-D size of the convolutional kernel and C_in and C_out the numbers of input and output channels at each layer. In comparison, the time complexity of our pipeline is O(M³). Considering that M² ≤ F l_1 l_2 C_in C_out for most commercial systems, we also have an advantage in terms of time complexity. Furthermore, our system has the ability to predict uncertainty, which is an additional advantage.

FIGURE 14. Positioning errors of our pipeline under different SNR.

F. INVESTIGATING DIFFERENT CASES
1) THE INFLUENCE OF SNR ON POSITIONING ACCURACY
As stated in Section IV, the SNRs at the subcarriers range between 1 and 11 dB in our measurement scenario. To further test the performance of our localization pipeline, especially under low SNR, synthetic white Gaussian noise is added to our measurement data to emulate measurement environments with mean SNRs of −5 dB, −10 dB, and −15 dB. Fig. 14 illustrates the localization accuracy of our algorithm trained on the covariance matrix (d = 1/2 λ) in the different SNR scenarios. The training percentage is around 2.5% and Δ_d = 1/2 λ. As illustrated, our algorithm still delivers good localization performance in the −10 dB SNR scenario. Even when the SNR drops to −15 dB, the localization accuracy is still acceptable for applications such as indoor navigation. This is because our processing pipeline can harvest an SNR gain from correlating over the frequencies and from averaging over the other one-sample covariance matrices in the neighborhood region according to (2).
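The noise-injection step can be sketched as below; the normalization (estimating the signal power from the snapshot itself) is an assumption, since the paper does not specify it.

```python
import numpy as np

def add_noise(Y, target_snr_db, rng=None):
    """Add white Gaussian noise to a channel snapshot at a target mean SNR,
    as done in Section V-F-1 to emulate -5/-10/-15 dB scenarios."""
    rng = np.random.default_rng() if rng is None else rng
    sig_power = np.mean(np.abs(Y) ** 2)
    noise_power = sig_power / (10.0 ** (target_snr_db / 10.0))
    noise = np.sqrt(noise_power / 2.0) * (
        rng.standard_normal(Y.shape) + 1j * rng.standard_normal(Y.shape))
    return Y + noise
```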
FIGURE 15. Positioning errors of our pipeline and the localization approach in [40] under the NLoS scenario.

TABLE 5. Antenna indexes in 4 groups.

2) INVESTIGATING NLoS MEASUREMENT SCENARIOS
We investigate the localization performance of our proposed pipeline in non-line-of-sight (NLoS) measurement scenarios. To this end, our localization approach is applied to an open-source indoor measurement dataset [43]. We provide a brief introduction to the NLoS measurement campaign and the parameter settings; more details can be found in [40]. Fig. 16 illustrates the arrangement of the indoor measurement, where 4 UEs, which occupy different subcarriers, move within four gray squares, each with a size of 1.2 × 1.2 m. Each UE is equipped with a dipole antenna placed at a height of 0.4 m. The UE trajectories are densely sampled, resulting in up to 252004 channel samples with a geographical distance of 5 mm between adjacent samples. The ground-truth positions of the UEs are recorded by a mechanical device with an error of less than 1 mm. The base station consists of 64 patch antennas operating at a center frequency of 2.6 GHz, and the antennas form a uniform rectangular array of size 0.56 × 0.56 m. The index of the antenna in the i-th row and the j-th column is 8(i − 1) + j. A metal blocker of size 1.6 × 1.3 m is placed between the base station and the UE, blocking the LoS component. Each UE sounds the uplink with an OFDM pilot signal for channel estimation, which has 100 subcarriers occupying in total 20 MHz.

The subarray method is also applied and 4 groups are formed; each group contains 36 antennas, and the antenna indices are shown in Table 5. The covariance matrices of each subarray are formed and sent to four individual FCNNs for training. Each FCNN has the same structure as in Table 4, except that the input layer size is 1296 × 1024. Fig. 15 illustrates the localization performance of our pipeline when using the estimated covariance matrix, compared with the approach illustrated in [40]. Specifically, when estimating the covariance matrix, three average distances are investigated, namely d = 0 (one-sample covariance matrix), d = 0.5λ, and d = λ. The distances between training samples are 1/4 λ in both the horizontal and vertical directions. As illustrated, the algorithm in [40] achieves better performance than training on the one-sample covariance matrix in the NLoS scenario. However, if the average distance d increases beyond 1/2 λ, the localization error decreases and our approach outperforms [40]. We postulate that the one-sample covariance matrix is more unstable in the NLoS scenario because the absence of a dominant LoS component amplifies the effect of small-scale fading. Therefore, in this scenario, it is challenging to estimate the covariance matrix C_i at position p_i from only one channel sample. In contrast, when the average distance d increases, the effect of small-scale fading becomes weak and C̃_{i,N_{p_i}} can better approach C_i. This shows that if covariance matrices are selected as training fingerprints in the NLoS scenario, it is more important to cover


FIGURE 16. A demonstration of the indoor NLoS environment.

FIGURE 17. Comparison between different ways of constructing the dataset.

more samples in the neighborhood region of p_i than in the LoS scenario. On the basis of the observations above, our pipeline is also suitable for the NLoS scenario and can still achieve better performance than the literature.

3) RANDOM SELECTION OF THE TRAINING SAMPLES
We now investigate the localization performance of our pipeline when the training dataset is constructed by randomly selecting training samples from the robot trajectories. To enable a fair comparison, the network structures and all other parameters, such as the training percentages of the two datasets (evenly and randomly sampled), are the same. Specific to our dataset, the training percentage is 5% when the distance between two adjacent samples is Δ_d = 1/4 λ; if Δ_d = 1/2 λ, the training percentage drops to 2.5%. Fig. 17 illustrates the localization accuracy of our pipeline when we train on the one-sample covariance matrix. As shown, the localization accuracy deteriorates when the training samples are randomly selected. The channel properties may then not be well captured, which degrades the localization performance. This example illustrates the importance of appropriately selecting training samples when constructing the training datasets.

VI. CONCLUSION AND FUTURE WORK
This paper investigated the potential of applying ML to a massive MIMO system for solving localization tasks. We analyzed a novel ML-based localization pipeline, which estimates UE positions and variances by using different channel fingerprints, including covariance matrices and the truncated CIR. For a system with a massive number of antennas, a subarray method was applied to facilitate the training process. Furthermore, we applied the Nyquist sampling theorem to investigate the effect of the training density. An indoor massive MIMO measurement campaign was conducted at 3.7 GHz using 20 MHz bandwidth to evaluate our approaches, where centimeter-level localization accuracy was achieved. The measurement results show that: 1) the information from both the delay and angle domains contributes to the localization performance, although in our case the delay-domain CSI contributes less than the angle-domain CSI due to the limited available bandwidth; 2) compared to training on the whole antenna array, the subarray method achieves significant enhancements in both positioning accuracy and uncertainty prediction quality; 3) as expected, the localization accuracy decreases when the sampling interval is larger than the Nyquist sampling distance. It is worth mentioning that during the measurement campaign the channel remained stationary and no individuals were present. In upcoming research, we will examine how the presence of people and other moving objects, as well as differences in the properties of the UE antenna between the training and testing phases, affect the localization accuracy, and we will apply transfer learning to address possible problems. We will also investigate localization pipelines that jointly process information from multiple snapshots.

ACKNOWLEDGMENT
The authors would like to thank the Humanities Laboratory, Lund University, and the LTH Robotic Laboratory for help with the equipment. They also thank Henrik Garde, Alexander Dürr, Steffen Malkowsky, Sara Willhammar, and Sirvan Abdollah Poor for their assistance throughout the measurements. They also appreciate the discussions with Ph.D. candidate Ziliang Xiong regarding the uncertainty prediction.

REFERENCES
[1] X. Cai, X. Cheng, and F. Tufvesson, "Toward 6G with terahertz communications: Understanding the propagation channels," 2022, arXiv:2209.07864.
[2] R. Whiton, "Cellular localization for autonomous driving: A function pull approach to safety-critical wireless localization," IEEE Veh. Technol. Mag., vol. 17, no. 4, pp. 28–37, Dec. 2022.
[3] A. A. Abdallah, C.-S. Jao, Z. M. Kassas, and A. M. Shkel, "A pedestrian indoor navigation system using deep-learning-aided cellular signals and ZUPT-aided foot-mounted IMUs," IEEE Sensors J., vol. 22, no. 6, pp. 5188–5198, Mar. 2022.
ACKNOWLEDGMENT
The authors would like to thank the Humanities Laboratory, Lund University, and the LTH Robotic Laboratory for help with the equipment. They also thank Henrik Garde, Alexander Dürr, Steffen Malkowsky, Sara Willhammar, and Sirvan Abdollah Poor for their assistance throughout the measurements. They also appreciate the discussion with Ph.D. candidate Ziliang Xiong regarding uncertainty prediction.
GUODA TIAN (Member, IEEE) received the bachelor's degree in automation and control engineering from Northeastern University in 2016 and the master's degree in wireless communication from Lund University in 2018, where he is currently pursuing the Ph.D. degree under the supervision of Fredrik Tufvesson, Ove Edfors, Bo Bernhardsson, and Xuesong Cai. His current research interests include applying machine learning to wireless localization, sensing, and communication systems. During the master's degree, he received the Lund University Global Scholarship.
ILAYDA YAMAN (Student Member, IEEE) received the bachelor's degree from Istanbul Technical University in 2018 and the master's degree in embedded electronics engineering from Lund University, Sweden, in 2020, where she is currently pursuing the Ph.D. degree under the supervision of Liang Liu, Ove Edfors, Kalle Åström, and Steffen Malkowsky. Her current research interests include fusing vision and radio sensors to design low-power hardware using machine learning algorithms. During the master's degree, she received the Lund University Global Scholarship for her academic success.

LIANG LIU (Member, IEEE) received the Ph.D. degree from Fudan University in 2010. He joined the Department of Electrical and Information Technology (EIT), Lund University, Sweden, where he held a post-doctoral position. Since 2016, he has been an Associate Professor with Lund University. His current research interests include wireless systems and digital integrated circuit design. He is a member of the Technical Committee of VLSI Systems and Applications and CAS for Communications of the IEEE Circuits and Systems Society.
MICHIEL SANDRA (Student Member, IEEE) received the M.Sc. degree (summa cum laude) in engineering technology from KU Leuven in 2020. He is currently pursuing the Ph.D. degree in channel modeling for maritime communications with Lund University, under the supervision of Xuesong Cai and Anders J. Johansson. His current research interests include channel sounding, channel modeling, signal processing, antenna design, and wireless system design.
XUESONG CAI (Senior Member, IEEE) received the B.S. and Ph.D. (Hons.) degrees from Tongji University, Shanghai, China, in 2013 and 2018, respectively. In 2015, he conducted a three-month internship with Huawei Technologies, Shanghai. He was a Visiting Scholar with Universidad Politécnica de Madrid, Madrid, Spain, in 2016. From 2018 to 2022, he conducted several post-doctoral stays at Aalborg University; Nokia Bell Labs, Denmark; and Lund University, Sweden. He is currently an Assistant Professor of communications engineering and a Marie Skłodowska-Curie Fellow with Lund University, closely cooperating with Ericsson and Sony. His current research interests include radio channel characterization, high-resolution parameter estimation, over-the-air testing, resource optimization, and radio-based localization for 5G/B5G wireless systems. He was a recipient of the China National Scholarship (the highest honor for Ph.D. candidates) in 2016, the Outstanding Doctorate Graduate awarded by the Shanghai Municipal Education Commission in 2018, the Marie Skłodowska-Curie Actions (MSCA) Seal of Excellence in 2019, the EU MSCA Fellowship (ranking top 1.2%, overall success rate 14%), and the Starting Grant (success rate 12%) funded by the Swedish Research Council in 2022. He was also selected by the ZTE Blue Sword-Future Leaders Plan in 2018 and the Huawei Genius Youth Program in 2021. He is an Associate Editor of IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, IET Communications, and Wireless Communications and Mobile Computing.

FREDRIK TUFVESSON (Fellow, IEEE) received the Ph.D. degree from Lund University, Lund, Sweden, in 2000. After two years at a startup company, he joined the Department of Electrical and Information Technology, Lund University, where he is currently a Professor of radio systems. He has authored around 100 journal articles and 150 conference papers. His current research interests include the interplay between the radio channel and the rest of the communication system with various applications in 5G/B5G systems, such as massive multiple-input multiple-output (MIMO), mmWave communication, vehicular communication, and radio-based positioning. His research has been awarded the Neal Shepherd Memorial Award for the Best Propagation Paper in IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY and the IEEE Communications Society Best Tutorial Paper Award.