Digital Twin Based Beam Prediction: Can We Train in The Digital World and Deploy in Reality?
Abstract—Realizing the potential gains of large-scale MIMO systems requires the accurate estimation of their channels or the fine adjustment of their narrow beams. This, however, is

digital twin-based communications assume that the BSs have information on the position, orientation, dynamics, shapes, and materials of the surrounding objects. With this information, the
arXiv:2301.07682v1 [eess.SP] 18 Jan 2023
Fig. 1. This figure shows a real-world communication system and its digital replica. In the real world, the channel between the transmitter and the receiver is
determined by the communication environment and the signal propagation law. The digital replica employs the 3D model to approximate the communication
environment and the ray tracing to model the signal propagation.
channel acquisition overhead. (iii) We approximate the digital replicas with ML models to further reduce the computational complexity. To evaluate the proposed digital twin aided wireless communication approaches, (v) we build a digital twin dataset comprising a real-world dataset (from DeepSense 6G [7]) and a digital replica dataset based on accurate 3D ray-tracing.

II. SYSTEM MODEL AND PROBLEM FORMULATION

In this paper, we consider a general MIMO communication system where $A$ base stations (BSs) serve $B$ mobile user equipments (UEs). The $a$-th ($a \in \{1, \ldots, A\}$) BS is equipped with an antenna array of $N_a$ elements. The $b$-th ($b \in \{1, \ldots, B\}$) UE employs an antenna array of $M_b$ elements. Moreover, we assume that the BSs have knowledge of the surrounding environments, which includes information about the positions, orientations, dynamics, shapes, and materials of the BS, the UEs, and the other surrounding objects (that can act as reflectors/scatterers).

Without loss of generality, let $\mathbf{H}_{b,a}$ denote the channel between the $a$-th BS and the $b$-th UE. The channels between all the BSs and UEs can then be represented by a set $\mathcal{H} = \{\mathbf{H}_{b,a} \mid b \in \{1, \ldots, B\}, a \in \{1, \ldots, A\}\}$. Accurate information about the channels $\mathcal{H}$ is crucial for realizing the potential gains of the MIMO systems; many essential communication tasks for MIMO systems, such as precoding, beamforming/beam tracking, handover, resource allocation, and interference coordination, require full or partial channel information. Let $\mathcal{T}$ denote the solution space of one such communication task that requires the channel information $\mathcal{H}$. Let $T^\star$ denote an optimal solution (where $T^\star \in \mathcal{T}$). Further, let $S_{\mathcal{T}}$ denote an existing method that optimally solves the communication task given the channel information $\mathcal{H}$. Then, $T^\star$ can be written as

$$T^\star = S_{\mathcal{T}}(\mathcal{H}). \tag{1}$$

While the channel information $\mathcal{H}$ is vital for MIMO systems, obtaining this channel information often requires a large acquisition overhead (beam sweeping, channel training/feedback, etc.) that degrades the overall system efficiency. The objective of this paper is to solve the communication task in (1) while eliminating (or significantly reducing) the channel acquisition overhead. To that end, we propose a novel research direction that aims to solve (1) by approximating the real world with a digital replica.

III. FROM THE REAL WORLD TO THE DIGITAL REPLICA

In the real world, the communication channels $\mathcal{H}$ are determined by the following two key components: (i) the communication environment, including the positions, orientations, dynamics, shapes, and materials of the BS, the UE, and other objects (reflectors/scatterers) in the surroundings, and (ii) the laws governing the wireless signal propagation phenomena. Let $E$ denote the communication environment and $g(\cdot)$ denote the signal propagation law. The communication channels can then be written as

$$\mathcal{H} = g(E). \tag{2}$$

When the communication environment $E$ and the signal propagation law $g(\cdot)$ are known, the solution $T^\star$ to the communication task can be obtained by substituting (2) into (1) as:

$$T^\star = S_{\mathcal{T}}\big(g(E)\big). \tag{3}$$

Nevertheless, the precise ground-truth communication environment $E$ is difficult to obtain, and the exact expression of the signal propagation law $g(\cdot)$ remains unclear in complex environments. To that end, we propose to solve the communication task in (3) by approximating the communication environment $E$ and the signal propagation law $g(\cdot)$ in a digital replica. Particularly, we approximate the communication environment $E$ with electromagnetic (EM) 3D models and the signal propagation law $g(\cdot)$ with ray tracing.

EM 3D model $\tilde{E}$: The EM 3D models contain information about the positions, orientations, dynamics, shapes, and materials of the BS, the UE, and other surrounding objects (reflectors/scatterers). Note that this 3D model information can be obtained using several approaches. (i) For static and fixed objects such as the neighboring buildings, the BS can
memorize their position, shape, and material information since these objects are not likely to change frequently. (ii) In the context of sensing-aided information, the BSs can exploit various sensors such as cameras, radars, and LiDARs to obtain sensing information on both the surrounding stationary and dynamic objects. From this sensing information, the BSs can infer the position, orientation, dynamics, shapes, and materials of these objects. Moreover, once the BS identifies an object, the BS can use memorized and/or online data to refine the information about this object. For instance, if the BS detects a car of a certain model, the BS can search for the shape and material information of this car model in its own or an online database. (iii) Thanks to the recent development in the internet of things [8], objects with communication capability can report/broadcast information that is useful for the 3D model.

Ray tracing $\tilde{g}(\cdot)$: Based on the information in the 3D models, the channels $\mathcal{H}$ can be modeled using stochastic or deterministic channel models. The stochastic channel models assume that the propagation parameters, such as pathloss, delay spread, and angle spread, follow certain probability distributions. However, it is difficult to define these probability distributions for a specific scenario. By contrast, deterministic channel modeling methods like ray tracing do not rely on assumptions about the probability distributions of the propagation parameters. Instead, ray tracing attempts to track the propagation paths between each transmit-receive antenna pair based on the geometry and material information in the 3D models, which preserves the spatial and temporal consistency. In this process, multiple propagation paths are explicitly modeled by considering various propagation effects, such as transmission, reflection, scattering, and diffraction. For each propagation path, the ray tracing produces path parameters including path gain, propagation delay, and propagation angles.

These propagation path parameters generated by the 3D model and ray tracing can then be exploited to construct the channels in the digital replica. The channel impulse response $\tilde{h}(t)$ between a transmit-receive antenna pair in the digital replica can be written as the sum of all the $L$ multi-path components, which is given by

$$\tilde{h}(t) = \sum_{l=1}^{L} \alpha_l \, \delta(t - \tau_l) \, G_t(\theta_l^{\mathrm{AoD}}) \, G_r(\theta_l^{\mathrm{AoA}}), \tag{4}$$

where $\alpha_l$ and $\tau_l$ represent the complex gain and propagation delay of the $l$-th path. The angle of arrival and angle of departure are denoted by $\theta_l^{\mathrm{AoA}}$ and $\theta_l^{\mathrm{AoD}}$. $G_t$ and $G_r$ are the radiation patterns of the transmit and receive antennas.

With an accurate 3D model $\tilde{E}$ and ray tracing $\tilde{g}(\cdot)$, the solution $T^\star$ in (3) can be approximated using the solution obtained from the digital replica, $\tilde{T}^\star$, as shown by

$$\tilde{T}^\star = S_{\mathcal{T}}\big(\tilde{g}(\tilde{E})\big). \tag{5}$$

To investigate the accuracy of the digital replica, we define $s(T^\star, \tilde{T}^\star)$ as the similarity function of $\tilde{T}^\star$ and $T^\star$. The accuracy requirements on the 3D model and ray tracing can vary in different communication configurations and tasks. For instance, the channel state information prediction task in the sub-6GHz band may require more accurate 3D models and ray tracing than the beam prediction task in the mmWave band.

One potential challenge of the digital twin lies in the high computational complexity of accurate ray tracing. This complexity can increase further when the 3D models have more details and contain a large number of interacting objects. As a result, the digital twin can suffer from high latency, which makes it less suitable for real-time applications. Next, we exploit ML to reduce such computational complexity.

IV. APPROXIMATING THE DIGITAL REPLICA WITH MACHINE LEARNING

The computational overhead and consequential latency of (5) limit the feasibility of the digital replicas in real-time applications. Therefore, it is interesting to design a function $f(\cdot)$ that processes $\tilde{E}$ and approximates the solution in (5) with lower computational complexity and latency. In this paper, we take a data-driven approach and learn $f(\cdot)$ with ML.

Let $f(\cdot\,; \Theta)$ denote an ML model with $\Theta$ representing the model parameters. The ML model is developed to learn a mapping function that takes in $\tilde{E}$ and produces a solution $\hat{T}$ that approximates the $\tilde{T}^\star$ in (5). The objective of the ML optimization problem can be written as

$$\max_{f(\cdot\,; \Theta)} \; \mathbb{E}_{\tilde{E} \sim \tilde{\Upsilon}} \left\{ s(\tilde{T}^\star, \hat{T}) \mid \tilde{E} \right\}, \tag{6}$$

where $\hat{T}$ is the output of the ML model $f(\cdot\,; \Theta)$ given the 3D model $\tilde{E}$. $\tilde{\Upsilon}$ denotes the underlying probability distribution of the 3D models. $\mathbb{E}\{\cdot\}$ is the expectation operator. The optimal ML model $f^\star(\cdot\,; \Theta^\star)$ that solves the ML optimization problem in (6) can be written as

$$f^\star(\cdot\,; \Theta^\star) = \arg\max_{f(\cdot\,; \Theta)} \; \mathbb{E}_{\tilde{E} \sim \tilde{\Upsilon}} \left\{ s\big(\tilde{T}^\star, f(p(\tilde{E}); \Theta)\big) \right\}, \tag{7}$$

where $p(\tilde{E})$ is a function that extracts useful features from the 3D model; depending on the communication configurations and tasks, not all information in the 3D model $\tilde{E}$ is useful. Therefore, we only input the useful features $p(\tilde{E})$ to the ML model.

The optimal ML model $f^\star(p(\tilde{E}); \Theta^\star)$ can be obtained via a supervised learning approach. First, we randomly sample a total number of $D$ 3D models from the 3D model distribution $\tilde{\Upsilon}$. For the $d$-th 3D model sample $\tilde{E}_d$, we calculate the corresponding solution $\tilde{T}_d^\star$ to the communication task $\mathcal{T}$ using (5). This way, we can construct a dataset of $D$ data points, and the $d$-th data point can be written as $(\tilde{E}_d, \tilde{T}_d^\star)$. Then, we train the ML model to minimize a loss function on this dataset. The loss function measures how much the ML model approximation differs from the digital replica solution, which is given by

$$J_{\mathrm{train}} = -\sum_{d=1}^{D} s\big(\tilde{T}_d^\star, f(p(\tilde{E}_d); \Theta)\big). \tag{8}$$

Since the training data for the ML model $f(p(\tilde{E}); \Theta)$ is generated from the digital replica, a large amount of training data can be relatively accessible for the training process.
However, when the ML model is trained solely on the data generated by the digital replica, the ML model can be biased by the impairments in the 3D model and/or the ray tracing.

[Figure residue: coordinate frame with the BS at the origin; axes X (North) and Y (East).]
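The supervised procedure of Section IV (sample 3D models, label them with the digital replica, then minimize the loss in (8)) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the similarity function s is taken to be the log-probability that the model assigns to the replica-optimal beam, which turns (8) into a cross-entropy loss, and `toy_replica_solver`, `extract_features`, `uniform_model`, and the 4-beam codebook are hypothetical stand-ins that do not come from the paper.

```python
import math

NUM_BEAMS = 4  # toy codebook size, chosen for illustration only

def extract_features(env):
    # p(E~): keep only the UE position from a toy 3D-model description.
    return env["ue_xy"]

def toy_replica_solver(env):
    # Hypothetical stand-in for Eq. (5): quantize the UE azimuth into a beam index.
    x, y = env["ue_xy"]
    angle = math.atan2(y, x)  # in (-pi, pi]
    index = int((angle + math.pi) / (2 * math.pi) * NUM_BEAMS)
    return min(index, NUM_BEAMS - 1)

def build_dataset(env_samples):
    # One data point per sampled 3D model: (features, replica solution).
    return [(extract_features(env), toy_replica_solver(env)) for env in env_samples]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(v - peak) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]

def similarity(replica_beam, probs):
    # Assumed s(.,.): log-probability assigned to the replica-optimal beam.
    return math.log(probs[replica_beam])

def train_loss(dataset, model):
    # J_train = -sum_d s(T~_d*, f(p(E~_d); Theta)), as in Eq. (8).
    return -sum(similarity(label, softmax(model(feats))) for feats, label in dataset)

def uniform_model(features):
    # Untrained placeholder model: equal score for every beam.
    return [0.0] * NUM_BEAMS

envs = [{"ue_xy": (1.0, 1.0)}, {"ue_xy": (-1.0, 1.0)}, {"ue_xy": (-1.0, -1.0)}]
dataset = build_dataset(envs)
loss_uniform = train_loss(dataset, uniform_model)
```

With the uniform placeholder model, every data point contributes log(4) to the loss; any model that concentrates probability on the replica-optimal beams yields a lower value, which is what training would aim for.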
[Fig. 4 plot. Legend: Top-1 to Top-5. Horizontal axis groups: Accuracy, Relative Receive Power.]
Fig. 4. This figure shows the top-k accuracy and relative receive power performance obtained by the nearest neighbor in the digital replica.

C. ML Model

We employ a fully connected NN architecture incorporating two hidden layers. Each hidden layer has 256 nodes and applies the ReLU activation. The input to the network is the position of the UE in both Cartesian and polar coordinates. Although this input contains redundant information, we found that it leads to more stable ML performance with our dataset. To normalize the input data, we re-scale the distance in the polar coordinates by the maximum distance, and re-scale the x and y in the Cartesian coordinates by a shared maximum absolute value. The output layer adopts the standard classification setting, i.e., it has 16 nodes and employs the softmax activation function. Each node produces a confidence score for one beam in the 16-beam codebook being the optimal beam. We employ the Adam optimizer, and the learning rate is set to $1 \times 10^{-2}$ and $1 \times 10^{-4}$ for the training and fine-tuning, respectively.

VII. EVALUATION RESULTS

In this section, we evaluate the beam prediction performance of the proposed digital twin and ML approaches. We adopt two performance metrics: (i) The top-k accuracy is defined as the percentage of the test data points whose ground-truth optimal beam lies in the predicted k beams with the highest scores. (ii) The top-k relative receive power measures the ratio between the highest receive power achieved by the top-k predicted beams and the receive power of the ground-truth optimal beam.

A. Does the Digital Twin Match the Reality?

The key idea presented in this paper is to utilize the digital replicas to simulate real-world communication systems. Here, we first investigate the accuracy of the digital twin in the context of the position-aided beam prediction task. Ideally,

nearest neighbor in the digital replica. The maximum difference between the optimal beam indices obtained from the real world and the digital replica is two, which translates to around 10° difference in beam angle.

Next, we directly apply the optimal beams of the nearest neighbors in the digital replica to the corresponding data points in the real world. Fig. 4 presents the top-k accuracy and relative receive power obtained by this nearest neighbor approach. It can be seen that the top-1 accuracy and relative receive power are 56.1% and 84.8%, respectively. Despite the relatively low top-1 accuracy, the high receive power implies that the nearest neighbor often provides sub-optimal beams with near-optimal receive power. It is worth highlighting that, with the nearest neighbor approach, 84.8% receive power can be obtained using the digital twin without any beam training or channel estimation.

B. Can the Model Trained in the Digital World Work in Reality?

The large computational latency of the digital replica may not suit real-time applications. Here, we investigate the performance of using the NN model to solve the digital replica simulation and the beam prediction task in order to reduce computation. In Fig. 5, we train the NN model on the real-world data or the digital replica data, and test the NN model on unseen real-world data. We experiment with two beamforming codebooks in the digital replica, namely the measured codebook and the uniform beam codebook. In the measured codebook, we extract the beam angles of the beam patterns used in the DeepSense testbed, and construct a beam steering codebook with these angles. For the uniform beam codebook, we uniformly discretize the BS field-of-view into 16 angles and construct a DFT beam steering codebook accordingly.

It can be seen from Fig. 5 that training on the digital replica data using the measured codebook can achieve a relatively good top-2 accuracy of 91.4% when testing on unseen real-world data. It is worth noting that this ML approach does not need any real-world training data. Furthermore, the NN model can converge with only 100 data points from the digital replica, which demonstrates its high data efficiency. It can also be observed that the (inaccurate) uniform beam steering codebook decreases the top-2 accuracy to 84.8%. If we want to rely only
[Plots for Figs. 5 and 6. Vertical axis: Accuracy / Relative Receive Power.]

Fig. 5. This figure shows the top-2 performance obtained by training the neural network on the real-world (real) or digital twin (synthetic) data and testing on the real-world data. [Horizontal axis: Number of Training Data Points. Curves: accuracy and power when trained on real data, and when trained on synthetic data with the measured codebook and with the uniform codebook.]

Fig. 6. This figure shows the top-2 performance by applying transfer learning to the NN model using the measured and the uniform beamforming codebooks. [Horizontal axis: Number of Real Data Points Used for Training. Curves: accuracy and power when trained on real data, and with transfer learning using the measured codebook and the uniform codebook.]
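The two metrics used throughout Section VII, top-k accuracy and top-k relative receive power, can be written out as a short Python sketch. It makes a few assumptions that the text does not state explicitly: the ground-truth optimal beam is taken as the beam with the highest measured receive power, powers are in linear scale, and the relative receive power is averaged over the test points. The function names and the toy numbers are illustrative only.

```python
def top_k_beams(scores, k):
    # Indices of the k beams with the highest predicted scores.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

def top_k_accuracy(all_scores, all_powers, k):
    # Fraction of test points whose ground-truth optimal beam is among the top-k.
    hits = 0
    for scores, powers in zip(all_scores, all_powers):
        best_beam = max(range(len(powers)), key=lambda i: powers[i])
        if best_beam in top_k_beams(scores, k):
            hits += 1
    return hits / len(all_scores)

def top_k_relative_power(all_scores, all_powers, k):
    # Average ratio between the best power among the top-k predicted beams
    # and the power of the ground-truth optimal beam (linear scale assumed).
    ratios = []
    for scores, powers in zip(all_scores, all_powers):
        achieved = max(powers[i] for i in top_k_beams(scores, k))
        ratios.append(achieved / max(powers))
    return sum(ratios) / len(ratios)

# Toy example: two test points, three beams each.
toy_scores = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
toy_powers = [[1.0, 4.0, 2.0], [1.0, 2.0, 0.5]]

top1_acc = top_k_accuracy(toy_scores, toy_powers, k=1)
top1_pow = top_k_relative_power(toy_scores, toy_powers, k=1)
top2_acc = top_k_accuracy(toy_scores, toy_powers, k=2)
```

In the toy data, the second test point misses the optimal beam at k = 1 yet still recovers half of the optimal power, which mirrors the observation in the text that a low top-1 accuracy can coexist with a comparatively high relative receive power.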