IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 10, NO. 2, JUNE 2020 149

Artificial Intelligence for 5G and Beyond 5G: Implementations, Algorithms, and Optimizations

Chuan Zhang, Member, IEEE, Yeong-Luh Ueng, Senior Member, IEEE, Christoph Studer, Senior Member, IEEE, and Andreas Burg, Member, IEEE

Abstract: The communication industry is rapidly advancing towards 5G and beyond 5G (B5G) wireless technologies in order to fulfill the ever-growing needs for higher data rates and improved quality-of-service (QoS). Emerging applications require wireless connectivity with tremendously increased data rates, substantially reduced latency, and growing support for a large number of devices. These requirements pose new challenges that can no longer be efficiently addressed by conventional approaches. Artificial intelligence (AI) is considered one of the most promising solutions to improve the performance and robustness of 5G and B5G systems, fueled by the massive amount of data generated in 5G and B5G networks and the availability of powerful data processing fabrics. As a consequence, a plethora of research on AI-based communication technologies has emerged recently, promising higher data rates and improved QoS with affordable implementation overhead. In this overview paper, we summarize the state-of-the-art of AI-based 5G and B5G techniques on the algorithm, implementation, and optimization levels. We shed light on the advantages and limitations of AI-based solutions, and we provide a summary of emerging techniques and open research problems.

Index Terms: Artificial intelligence, fifth-generation (5G) and beyond 5G (B5G), detection and precoding, channel coding, baseband processing, model uncertainties, localization.

Manuscript received April 7, 2020; revised May 16, 2020; accepted May 20, 2020. Date of publication June 4, 2020; date of current version June 12, 2020. This work was supported in part by the National Key R&D Program of China under Grant 2020YFB2205503, in part by NSFC under Grants 61871115 and 61501116, in part by the Jiangsu Provincial NSF for Excellent Young Scholars under Grant BK20180059, in part by the Six Talent Peak Program of Jiangsu Province under Grant 2018-DZXX-001, in part by the Distinguished Perfection Professorship of Southeast University, in part by the Fundamental Research Funds for the Central Universities, in part by the Ministry of Science and Technology, Taiwan under Grant MOST 107-2221-E-007-017-MY3, and in part by Qualcomm Technologies under Grant SOW NAT-435533 and Agreement NAT-391796. This article was recommended by Guest Editor A. Wu. (Corresponding author: Chuan Zhang.)
Chuan Zhang is with the LEADS, Southeast University, Nanjing 211189, China, also with the National Mobile Communications Research Laboratory, Southeast University, Nanjing 211189, China, also with the Quantum Information Center, Southeast University, Nanjing 211189, China, and also with the Purple Mountain Laboratories, Nanjing 211189, China (e-mail: [email protected]).
Yeong-Luh Ueng is with the Department of Electrical Engineering, National Tsing Hua University, Hsinchu 300, Taiwan (e-mail: [email protected]).
Christoph Studer was with the School of Electrical and Computer Engineering, Cornell Tech, New York, NY 10044 USA, and also with the Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14850 USA. He is now with the Department of Information Technology and Electrical Engineering (D-ITET), ETH Zurich, 8092 Zürich, Switzerland (e-mail: [email protected]).
Andreas Burg is with the Telecommunications Circuits Laboratory, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland (e-mail: [email protected]).
Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JETCAS.2020.3000103
2156-3357 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSITAS TELKOM. Downloaded on December 06,2021 at 07:57:00 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

THE communication industry is expected to grow continuously and at an extreme rate, which is key in fueling innovation and productivity of other economic sectors, such as transportation, health care, agriculture, finance, services, consumer electronics, and so on [1]. For example, the ongoing coronavirus disease (COVID-19) outbreak impressively shows how vital the information industry is to society [2]. In the near future, we will witness a variety of emerging technologies which will fundamentally change our everyday life, including the Internet of Things (IoT), vehicle-to-everything (V2X), distance learning, (enhanced) virtual (and augmented) reality, unmanned aerial vehicles (UAVs), and robotization [3]. Consequently, researchers and engineers are facing new challenges in order to enable these technologies.

In order to keep up with these trends, wireless communication systems must sustain the extreme requirements of emerging networks and devices. As the existing 3G/4G wireless networks are unable to meet such demands, research is focusing on the fifth-generation (5G) and beyond 5G (B5G) eras. The upcoming 5G and B5G systems aim at connecting tens of billions of wireless devices with gigabit-per-second data rates and millisecond-level latency [4]. Admittedly, emerging approaches such as centimeter/millimeter-wave (cm/mmWave) frequencies, cognitive radio (CR), device-centric networks, cooperative networks with joint processing, and core network virtualization are temporary solutions [5], but a large number of severe challenges still remain.

The aforementioned applications force wireless communication systems to meet the trends of eMBB (enhanced Mobile Broadband), mMTC (massive Machine Type Communications), and URLLC (Ultra-Reliable Low Latency Communications) [6], [7]. Consequently, future systems must provide tremendously increased data rates, substantially reduce latency, provide strong support to connect a massive number of devices that possibly share the same time-frequency resources, increase energy efficiency, and provide stringent quality-of-service (QoS) guarantees. For 5G/B5G systems, these requirements should be met even when a priori knowledge, such as the channel condition, is varying, incomplete, or unavailable. Conventional algorithm and implementation solutions are expected to be no longer sufficient, and more advanced communication technologies are necessary.

Having witnessed the successful applications of artificial intelligence (AI) in fields including computer vision, speech recognition, natural language processing, audio recognition, machine translation, social network mining, bioinformatics, and compound design [8], researchers are considering applying AI techniques to wireless communications. Existing results at the physical-layer level have demonstrated that AI can help to understand the wireless contents, recognize unrevealed patterns, reduce
complexity, and produce results comparable to and, in some cases, surpassing conventional approaches [9].

Nowadays, research combining AI with 5G/B5G has drawn significant attention from both academia and industry. Although some related initiatives have been named, suitable algorithms, implementations, and optimizations are unfortunately incomplete and still in their infancy. The significant potential for AI is expected to advance network architectures [10], signal processing solutions [11], semiconductor technologies [12], as well as system-level optimization [13]. Recently, a number of special issues of IEEE Transactions [14]–[17], special sessions at key IEEE conferences [18]–[23], and schools and tutorials [24] have appeared on this emerging topic. In contrast, this special issue and this overview paper emphasize “AI for 5G and B5G” related algorithms, implementations, and optimizations to provide an overview of recent progress from a circuits and systems perspective. We note that instead of simply reviewing the literature, we focus on representative results and outline the technical foundation of these techniques. Though many algorithm-level results have been published recently, we focus only on those close to circuit and system implementations. We refer interested readers to [25]–[27] for survey papers from the communication society (ComSoc) perspective.

The remainder of this overview paper is organized as follows. Section II reviews algorithms and implementations for AI-based massive multiple-input multiple-output (MIMO) detection and precoding. Section III discusses AI-based channel coding. Section IV summarizes AI-based processing for other baseband modules. Section V introduces machine learning (ML) methods that deal with model uncertainties, including AI-based full-duplex SI cancellation and RF/PA linearization. Section VI summarizes methods for AI-based device localization. Section VII investigates future research directions. Section VIII concludes this overview.

II. AI-BASED MIMO DETECTION AND PRECODING

Due to its advantages in spectral efficiency (SE), energy efficiency (EE), and quality-of-service, MIMO technology is a core component of many wireless systems. However, efficient implementations of MIMO detection and precoding must always trade off performance versus complexity, in order to cope with the NP-hardness of the underlying problems and the high problem dimensions. Existing solutions, such as linear, linear iterative, and message passing solvers, have been reported in the literature. Recently, research has focused on AI-based solutions for MIMO detection and precoding in order to improve performance, complexity, and robustness.

A. AI-Based MIMO Detection

1) Introduction: We consider a narrow-band MIMO communication system with $N_t$ transmitting antennas and $N_r$ receiving antennas. The narrow-band system model is as follows:

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n}. \tag{1}$$

Here, $\mathbf{y} \in \mathbb{C}^{N_r}$ is the received vector, $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is the channel matrix, $\mathbf{s} \in \Omega^{N_t}$ is the transmitted vector, where $\Omega$ denotes the constellation of each component $s_i$, and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_r})$ models zero-mean additive white Gaussian noise (AWGN). We assume perfect channel state information (CSI) at the receiver. Unfortunately, exact maximum likelihood detection is infeasible in practice for systems with a large number of transmitters and higher-order modulation schemes. Therefore, a variety of approximate detection algorithms have been proposed in the past; see Fig. 1 for a classification of the most prominent methods.

Fig. 1. Classification of MIMO detection algorithms.

The linear algorithms, such as zero forcing (ZF) and minimum mean squared error (MMSE)-based equalization, and linear iterative methods, e.g., the Gauss-Seidel (GS), coordinate descent (CD), successive over relaxation (SOR), and steepest descent (SD) algorithms, have low complexity. Nevertheless, their performance is often unacceptable under realistic propagation conditions. Nonlinear detection algorithms, such as tree search, interference cancellation (IC), and message passing, can achieve near-optimal performance but at significantly higher complexity than linear methods. In particular, message-passing-based methods often suffer severe performance degradation under realistic channels, which motivates the development of AI-based solutions. The pros and cons of different MIMO detection algorithms are summarized in Table I.

2) Deep Learning Based Linear Detection: A neural network architecture called DetNet, proposed in [28], is inspired by iterative projected gradient descent and can be described by the following procedure

$$\begin{aligned}
\mathbf{q}_k &= \hat{\mathbf{s}}_{k-1} - \theta_k^{(1)} \mathbf{H}^H \mathbf{y} + \theta_k^{(2)} \mathbf{H}^H \mathbf{H} \hat{\mathbf{s}}_{k-1}, \\
\mathbf{z}_k &= \mathrm{ReLU}\!\left( \boldsymbol{\Theta}_k^{(1)} \begin{bmatrix} \mathbf{q}_k \\ \mathbf{v}_{k-1} \end{bmatrix} + \boldsymbol{\theta}_k^{(1)} \right), \\
\hat{\mathbf{s}}_k &= \boldsymbol{\Theta}_k^{(2)} \mathbf{z}_k + \boldsymbol{\theta}_k^{(2)}, \\
\mathbf{v}_k &= \boldsymbol{\Theta}_k^{(3)} \mathbf{z}_k + \boldsymbol{\theta}_k^{(3)},
\end{aligned} \tag{2}$$

where $\hat{\mathbf{s}}_k$ is the estimate of $\mathbf{s}$ at the $k$-th iteration, $\mathrm{ReLU}(x) = \max(x, 0)$ is the ReLU activation function, $\{\theta_k^{(1)}, \theta_k^{(2)}, \boldsymbol{\theta}_k^{(1)}, \boldsymbol{\theta}_k^{(2)}, \boldsymbol{\theta}_k^{(3)}, \boldsymbol{\Theta}_k^{(1)}, \boldsymbol{\Theta}_k^{(2)}, \boldsymbol{\Theta}_k^{(3)}\}_{k=1}^{L}$ are the trainable parameters, and $L$ is the total number of layers of the neural network. DetNet performs very well when the channel matrix is i.i.d. complex Gaussian. However, this method yields suboptimal performance under realistic channel conditions, such as the 3GPP 3D MIMO channel model or clustered delay line (CDL) channel models. Furthermore, other limitations of DetNet exist. First, a large number of network parameters make the
neural network unnecessarily complex, which results in high computational complexity and storage overhead. Second, it is difficult to gain insight into the function of the learned neural network [29].

TABLE I. COMPARISON OF MIMO DETECTORS

By combining deep learning methods with expert knowledge, a model-driven approach called ComNet was proposed in [30]. ComNet first estimates the transmitted vector by using ZF as initialization, then applies a fully connected neural network to refine the coarse estimate. Simulation results show that ComNet is able to outperform the traditional approach, but the improvement is small. Nevertheless, the idea of combining deep learning with expert knowledge is a promising direction for the design of AI-based detection algorithms.

3) Deep Learning Based on Message Passing Algorithms: To improve the original approximate message passing (AMP) algorithm, which has been used for MIMO detection in [31], orthogonal AMP (OAMP) was proposed to relax the i.i.d. Gaussian channel matrix assumption [32]. A neural network called OAMPNet, based on the OAMP algorithm, was proposed to further improve the performance of OAMP [33]. Two trainable parameters $\{\gamma_k, \theta_k\}$ are introduced to every iteration of the OAMP algorithm, which is given by

$$\begin{aligned}
\mathbf{r}_k &= \hat{\mathbf{s}}_k + \gamma_k \mathbf{W}_k (\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}_k), \\
\hat{\mathbf{s}}_{k+1} &= \mathbb{E}\left\{ \mathbf{s} \mid \mathbf{r}_k, \tau_k^2(\theta_k) \right\}.
\end{aligned} \tag{3}$$

Here, the optimal choice of the matrix $\mathbf{W}_k$ is the linear MMSE matrix [32]. In [34], this architecture has been modified to OAMPNet2, whose performance is further improved compared with OAMPNet. In addition, joint channel estimation and signal detection (JCESD) has also been investigated. Although OAMPNet has only a few trainable parameters and outperforms the original OAMP algorithm, some limitations remain. First, a matrix inverse is required in each iteration, which results in high complexity, especially in the case of massive MIMO. Second, the principle of OAMP is based on the assumption that the channel matrix is unitarily invariant. Realistic channel matrices do not satisfy this condition, which can lead to poor performance.

To overcome the performance degradation of OAMPNet for realistic channels, [29] proposes a deep learning MIMO detection scheme called MMNet based on the AMP algorithm. Reference [29] presents two neural network models for i.i.d. Gaussian channels and realistic channels separately. For realistic channel models, each iteration of MMNet consists of the following equations:

$$\begin{aligned}
\mathbf{r}_k &= \hat{\mathbf{s}}_k + \boldsymbol{\Theta}_k^{(1)} (\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}_k), \\
\hat{\mathbf{s}}_{k+1} &= \mathbb{E}\left\{ \mathbf{s} \mid \mathbf{r}_k, \tau_k^2 \boldsymbol{\theta}_k^{(2)} \right\}.
\end{aligned} \tag{4}$$

Here, $\boldsymbol{\Theta}_k^{(1)}$ is an $N_t \times N_r$ complex-valued trainable matrix, and $\boldsymbol{\theta}_k^{(2)}$ is a trainable $N_t \times 1$ vector, which allows the noise variance to be computed separately for each symbol. For i.i.d. Gaussian channels, the two terms $\boldsymbol{\Theta}_k^{(1)}$ and $\tau_k^2 \boldsymbol{\theta}_k^{(2)}$ are reduced to $\theta_k^{(1)} \mathbf{H}^H$ and $\tau_k^2 \theta_k^{(2)}$, respectively, and the trainable matrices/vectors $\{\boldsymbol{\Theta}_k^{(1)}, \boldsymbol{\theta}_k^{(2)}\}$ are reduced to two trainable parameters $\{\theta_k^{(1)}, \theta_k^{(2)}\}$. Furthermore, due to the relatively low complexity of MMNet, online training can be adopted to adapt to varying channel conditions. Simulation results have shown that the MMNet algorithm is able to outperform other methods under realistic channel conditions, while keeping the complexity within reasonable bounds.

Fig. 2. The structure of DNN-based MPD in [35].

A deep neural network (DNN) is applied to belief propagation (BP) in [35]. The structure of the proposed DNN-based modified message passing detectors (MPDs) is shown in Fig. 2. Three different MPD algorithms are proposed, which are damped BP DNN (DNN-dBP), max-sum (MS) BP DNN (DNN-MS), and simplified MPD DNN (DNN-sMPD). Numerical results demonstrate that the proposed MPD method with DNN is able to achieve superior performance and improved robustness compared to the original BP algorithm and other MPD methods for a range of system configurations and channel conditions.

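
The unfolded structure of Eq. (4) can be sketched for the i.i.d. Gaussian case, where the linear stage reduces to $\theta_k^{(1)} \mathbf{H}^H$ and the denoiser variance to the scalar $\tau_k^2 \theta_k^{(2)}$. The code below is a hedged illustration, not the trained network of [29]: the QPSK constellation, the posterior-mean denoiser under a uniform prior, the residual-based estimate of $\tau_k^2$, the variance floor, and the fixed (untrained) per-layer parameters `theta1` and `theta2` are all assumptions introduced here. In a learned detector these per-layer parameters would be obtained by training.

```python
import numpy as np

# Unit-energy QPSK constellation; an illustrative choice for the set Omega.
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)


def denoise(r, var):
    """Posterior mean E{s | r} for r = s + w, w ~ CN(0, var), uniform prior."""
    logits = -np.abs(r[:, None] - QPSK[None, :]) ** 2 / var
    logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ QPSK


def unfolded_detect(H, y, sigma2, theta1, theta2):
    """Run len(theta1) unfolded layers of the i.i.d. form of Eq. (4)."""
    n_r, n_t = H.shape
    s_hat = np.zeros(n_t, dtype=complex)
    for t1, t2 in zip(theta1, theta2):
        A = t1 * H.conj().T                    # linear stage theta^(1) H^H
        resid = y - H @ s_hat
        r = s_hat + A @ resid
        # Assumed heuristic for tau^2: residual-based estimate of the error
        # in s_hat, propagated through (I - A H), plus the filtered noise.
        err = max(np.linalg.norm(resid) ** 2 - n_r * sigma2, 0.0)
        err /= np.linalg.norm(H) ** 2
        tau2 = (np.linalg.norm(np.eye(n_t) - A @ H) ** 2 * err
                + np.linalg.norm(A) ** 2 * sigma2) / n_t
        s_hat = denoise(r, t2 * max(tau2, 1e-9))  # floor avoids division by 0
    return s_hat


rng = np.random.default_rng(0)
n_r, n_t, sigma2, layers = 32, 4, 0.01, 5
H = (rng.standard_normal((n_r, n_t))
     + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2 * n_r)
s = rng.choice(QPSK, size=n_t)
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(n_r)
                               + 1j * rng.standard_normal(n_r))
s_soft = unfolded_detect(H, H @ s + noise, sigma2,
                         theta1=[1.0] * layers, theta2=[1.0] * layers)
print(np.round(s_soft, 3))
```

Because no matrix inverse appears inside the loop, each layer costs only matrix-vector products, which is consistent with the low per-iteration complexity attributed to MMNet above.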

B. Multi-User (MU) MIMO Precoding

AI-based methods can also be used for multi-user (MU) precoding in massive MIMO systems [36]. The goal of MU-MIMO precoding is to transmit constellation points to the $U$ user equipments (UEs) while adhering to a power constraint and avoiding interference caused by the channel between the base station (BS) and the UEs. Linear precoding methods multiply the transmit vector $\mathbf{s} \in \Omega^U$, where $\Omega$ is the constellation set, by a precoding matrix $\mathbf{P} \in \mathbb{C}^{B \times U}$, where $B$ is the number of BS antennas, so that the signals received at the UEs minimize a suitably-defined cost function (e.g., the mean-square error). Let us assume the following narrowband input-output relation in the massive MU-MIMO downlink:

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}. \tag{5}$$

Here, $\mathbf{y} \in \mathbb{C}^U$ contains the receive signals at each of the $U$ UEs, $\mathbf{H} \in \mathbb{C}^{U \times B}$ is the downlink MIMO channel matrix, $\mathbf{x} \in \mathbb{C}^B$ is the precoded vector, and $\mathbf{n} \in \mathbb{C}^U$ models thermal noise at the UEs with variance $N_0$ per entry. Linear precoding simply computes $\mathbf{x} = \mathbf{P}\mathbf{s}$, where one of the most common precoding matrices is the Wiener-filter precoder $\mathbf{P}^{\mathrm{WF}} = \beta^{\mathrm{WF}} \mathbf{Q}$, which minimizes the UE-side MSE, and is given by [37]

$$\mathbf{Q}^{\mathrm{WF}} = \left( \mathbf{H}^H \mathbf{H} + \frac{U N_0}{\rho^2} \mathbf{I}_B \right)^{-1} \mathbf{H}^H. \tag{6}$$

Here, $\beta^{\mathrm{WF}} = \sqrt{\mathrm{tr}(\mathbf{Q}^H \mathbf{Q}) E_s / \rho^2}$ ensures that the average power constraint $\mathbb{E}\{\|\mathbf{x}\|^2\} = \rho^2$ is met and $E_s$ is the symbol power [38]. Since Eq. (6) is a regularized matrix inverse, the precoding operation $\mathbf{x} = \mathbf{P}\mathbf{s}$ can be implemented using the same techniques proposed for linear detection in Section II. Related methods have been proposed for precoding in mmWave massive MIMO systems that build on hybrid analog-digital architectures [39].

Nonlinear MU-MIMO precoding methods have been proposed in [40] for massive MU-MIMO systems that use coarsely-quantized digital-to-analog converters (DACs) at the BS. For 1-bit DACs, each entry of the transmit vector $\mathbf{x}$ is constrained to the quaternary set $\{\pm\alpha \pm i\alpha\}$, where $\alpha^2 = \rho^2/(2B)$ enforces the power constraint. In order to compute an MSE-optimal precoding vector $\mathbf{x}$, one can resort to the iterative algorithm put forward in [41] combined with deep unfolding. The iterative nonlinear precoding algorithm proposed in [41] performs the following sequence of operations

$$\mathbf{z}^{(t+1)} = \mathbf{x}^{(t)} - \tau^{(t)} \mathbf{A}^H \mathbf{A} \mathbf{x}^{(t)}, \tag{7}$$

$$\mathbf{x}^{(t+1)} = \mathrm{prox}_g\!\left(\mathbf{z}^{(t+1)}; \eta^{(t)}, \xi^{(t)}\right), \tag{8}$$

for the iterations $t = 1, \ldots, T$. Here, $\mathbf{A} = \left(\mathbf{I} - \mathbf{s}\mathbf{s}^H / \|\mathbf{s}\|_2^2\right)\mathbf{H}$ and the proximal operator [42]

$$\mathrm{prox}_g(x; \eta, \xi) = \mathrm{clip}(\eta\,\Re\{x\}, \xi) + i\,\mathrm{clip}(\eta\,\Im\{x\}, \xi) \tag{9}$$

is applied element-wise to the vector $\mathbf{z}^{(t+1)}$ and clips the real and imaginary parts to the interval $[-\xi, +\xi]$. As shown in [40], one can now unfold this iterative procedure for a fixed number of iterations $T$ and learn the algorithm parameters from data. Specifically, Eq. (7) corresponds to a linear layer with given (and fixed) weights $\mathbf{A}$ and trainable per-iteration step-size parameter $\tau^{(t)}$; Eq. (8) can be interpreted as a nonlinear activation function with trainable parameters $\eta^{(t)}$ and $\xi^{(t)}$. The parameters $\{\tau^{(t)}, \eta^{(t)}, \xi^{(t)}\}_{t=1}^{T}$ can now be learned using neural network learning tools, where a typical cost function is to minimize the UE-side MSE $\|\mathbf{s} - \mathbf{H}\mathbf{x}^{(T+1)}\|^2$ over a large number of signal, channel, and noise realizations. For the same MSE performance, this learning-based approach was shown to reduce the number of algorithm iterations $T$ by 2× over the original method in [41], which uses manually-tuned parameters $\{\tau, \eta, \xi\}$ that remain constant over the iterations.

III. AI-BASED CHANNEL CODING

Recent results in AI-based channel decoding have shown that the performance of existing methods can (often significantly) be improved. Some results use neural networks to train the scaling parameters of conventional algorithms, some apply neural networks to post-processing of the decoders’ output, and others try to directly decode with neural networks via one-shot decoding. AI-aided code construction is another area where ML techniques have been used. Since both low-density parity-check (LDPC) codes and polar codes have been standardized by 5G, most recent research focuses on these types of codes.

A. LDPC and Polar Decoding Algorithms

The belief-propagation (BP) algorithm, also known as the sum-product algorithm (SPA) [43], is a message passing algorithm, which iteratively decodes LDPC codes while providing near-optimal error-rate performance. However, due to the rather high complexity of the BP algorithm, there exist several hardware-friendly simplifications, such as the min-sum (MS) algorithm [44], the normalized MS (NMS) algorithm [45], and the offset MS (OMS) algorithm [45]. Polar codes are able to asymptotically achieve the Shannon capacity using the successive cancellation (SC) decoding algorithm as the code length approaches infinity [46]. The SC decoding algorithm sequentially estimates the information bits based on the received channel log-likelihood ratios (LLRs) and the encoding structure. Other than the SC decoding algorithm, the BP algorithm can also decode polar codes by passing LLR values based on the encoding graph in an iterative fashion. Compared with the SC decoding algorithm, BP decoding provides higher throughput and shorter latency, but suffers from a substantial error-rate performance degradation. For both SC decoding and BP decoding, research is working towards a balance between error-rate performance and complexity. Since some of the underlying mechanisms are difficult to model, AI has recently been considered to optimize this tradeoff.

B. Optimized BP Decoders Based on Neural Networks

There exist several results devoted to applying neural networks to optimize simplified BP decoders by modeling their architectures using NNs and training the parameters of these decoders in order to improve the decoding performance.

A linear approximation min-sum (LAMS) algorithm to decode LDPC codes is proposed in [47]. In the LAMS algorithm, every check node output or channel value is multiplied by a normalization factor and biased by an offset. To optimize these parameters, a three-layer neural network based on the Tanner graph is constructed to model the check node update and the variable node update functions within one iteration, and these parameters are iteratively optimized (as shown in Fig. 3) using stochastic gradient descent (SGD). Simulation results show that the optimized LAMS algorithm is able to outperform
both the NMS and OMS algorithms. In the high signal-to-noise ratio (SNR) regime, this AI-assisted method can even outperform the SPA.

Fig. 3. The iterative procedures for optimizing the normalization and offset factors in each iteration proposed in [47].

A multi-layer perceptron neural network (MLPNN) to optimize the finite-alphabet iterative decoder (FAID) of LDPC codes was constructed in [48]. The MLPNN is constructed based on the Tanner graph with trainable weights and biases in the variable node update functions, and every two layers correspond to one iteration of the FAID. Corresponding results show that the MLPNN decoder with only 3-bit precision outperforms the floating-point NMS decoder over the AWGN channel.

In [49], the neural offset min-sum (NOMS) algorithm is proposed, which introduces trainable additive offsets in check node update functions and optimizes these offsets using a DNN. A DNN-based polar decoder is proposed in [50], which introduces multiplicative weights and trains these weights using a DNN. The proposed decoder can outperform the conventional BP decoder for polar codes within fewer iterations. It is worth mentioning that the methodology in [50] is general and can be applied to other BP decoders. For example, a recurrent neural network (RNN) based polar decoder, which follows a similar concept as [50] but is more hardware-friendly, is proposed in [51].

In [52], the authors apply a similar concept to implement a total of six types of decoders for high density parity check (HDPC) codes, e.g., BCH codes, which are the neural BP decoder, the BP-RNN decoder, the neural NMS (NNMS) decoder, the NNMS-RNN decoder, the neural OMS (NOMS) decoder, and the NOMS-RNN decoder. Simulation results have shown that all six decoders outperform BP and MS decoders. Furthermore, the decoder optimized with a DNN is able to perform better than that using an RNN, while the latter consists of significantly fewer parameters. Besides, the NNMS and the NNMS-RNN decoders achieve similar error rate performance compared with the neural BP and the BP-RNN decoders, and outperform the NOMS and the NOMS-RNN decoders. However, the NOMS and the NOMS-RNN decoders are more hardware-friendly. This work offers a list of trade-off options between the decoding performance and hardware complexity.

For BP decoders, determining the number of iterations is an interesting research problem. A naïve approach is to empirically fix a maximum number of iterations. For lower complexity, early termination schemes are preferable. For polar codes, early termination schemes include the G-matrix scheme [53], the magnitude scheme [54], [55], and the sign scheme [55]. In [56], a DNN has been employed to predict the minimum required iteration number. Simulation results have shown that in the large SNR regime, the DNN can accurately predict the number of required iterations. Compared with existing early termination schemes, the time complexity can also be reduced.

C. Neural Post Processing for Channel Decoders

Instead of optimizing BP decoders using neural networks, several research works concatenate a neural network to the output of the channel decoder to process its output, and feed back information to the decoder for iterative processing. In [57], an iterative BP-CNN (convolutional neural network) decoder is proposed to deal with correlated noise. The BP decoder’s output is fed into a feed-forward CNN to estimate the correlated noise between channels. The estimated noise is then subtracted from the received channel values for the BP decoder to decode again (as shown in Fig. 4).

Fig. 4. The iterative BP-CNN decoder proposed in [57].

In [58], a long-short term memory (LSTM) network-aided SC flip algorithm is proposed to decode polar codes. If the cyclic redundancy check (CRC) fails, the SC decoder’s output is fed into an LSTM network to locate and flip the first error bit during the next SC decoding attempt. Simulation results show that, for a (64, 32 + 8) polar code, the proposed algorithm with at most 6 attempts outperforms the CRC-aided SC-List (SCL) algorithm with a list size of 4. Another work [59] applies LSTM to assist bit-flipping for SCL decoding, and achieves state-of-the-art performance at low complexity.

D. Neural Network Decoders

Several research results directly replace the decoder with a neural network to do one-shot (i.e., non-iterative) decoding, where correct codewords are used as the training labels and the channel LLRs are used as the network’s input.

In [60], a neural network decoder (NND) is used to decode unstructured codes (LDPC and HDPC codes), and structured codes (polar codes). Simulation results show that for both unstructured and structured codes, the NND can approach the maximum a posteriori probability (MAP) performance for short block lengths (N ≤ 1024). It is observed that the decoding algorithm of structured codes is easier for neural networks to learn. In [64], the decoding performance and complexity of DNN, CNN, and RNN are compared. This work is also helpful to explain the problem of decoding longer block-length codewords.

Since the number of required training samples grows exponentially with the number of information bits, NNDs for moderate or long block lengths (N > 1024) become impracticable. To address this issue, in [61] the authors partition a polar code of a long block length into sub-blocks and decode each sub-block by a small NND, whose training is feasible. The NND results are propagated via a conventional BP structure.
Simulation results show that the BER performance of the proposed decoder is similar to that of the SC or BP decoder. The decoding latency is much lower due to the one-shot decoding property of NNDs.

An optimization-based LDPC decoding algorithm has been proposed in [62] using a hardware-friendly feed-forward neural network. Internal parameters, such as the step size parameters, a softness parameter, and the penalty coefficients, are optimized using stochastic gradient descent. Simulation results have shown that the performance is, in some cases, better than that of the SPA.

In [63], a hybrid decoder for both LDPC and polar codes with a single neural network is proposed. The idea is based on the fact that BP algorithms can decode both LDPC and polar codes. A group of additional input nodes called the indicator section is added to recognize different codes. Simulation results show that the DNN decoder outperforms the LSTM decoder, and the CNN decoder performs the worst.

E. Construction of Channel Codes

Code construction is critical for error-correction performance. However, the search space of code construction is usually large and the modeling is complex. To balance complexity and performance, AI algorithms have been adopted to construct LDPC and polar codes, including deep reinforcement learning, deep learning, and heuristic algorithms.

In [65], a deep reinforcement learning method for LDPC code construction is proposed. The DNN is trained by the reinforcement learning algorithm and the training data are generated by constructing the code repeatedly with a Monte-Carlo tree search (MCTS). Simulation results have shown that the LDPC codes constructed by the DNN yield performance comparable to those designed using the progressive edge growth (PEG) algorithm.

A heuristic search, namely a genetic algorithm, has been proposed in [66] to optimize the LDPC code design. Other than the ideal AWGN channel, other factors, such as channel conditions, code length, and iteration number, are taken into consideration during the code design. The constructed LDPC code delivers 0.325 to 0.8 dB coding gains under various channels compared to 5G standardized LDPC codes.

decoding to offer an end-to-end channel coding methodology. However, a suitable balance between error-rate performance and hardware complexity is always a prime goal.

IV. OTHER AI-BASED BASEBAND MODULES

Besides the two key baseband processing modules, namely the channel decoder and MIMO detector, the design and implementation of other baseband modules such as the channel estimator, equalizer, and non-orthogonal multiple access (NOMA) detector have also used AI methods in order to exploit specific patterns, especially nonlinear behavior, which often cannot be handled by conventional methods. Existing research focuses on both algorithms and hardware implementations.

Fig. 5. System model for a transmitter with an encoder and a receiver with an equalizer and decoder.

A. Equalization

1) Introduction: Channel equalization is employed to deal with the distortions introduced by effects caused by channels and system impairments. These effects include inter-symbol interference (ISI) and nonlinearities caused by amplifiers as well as data converters [73], as shown in Fig. 5. Taking $\mathbf{s}$ as the modulated symbols and $\mathbf{r}$ as the received sequence, the channel effect is modeled as:

$$\begin{aligned}
\mathbf{v} &= \mathbf{s} \otimes \mathbf{h}, \\
\mathbf{r} &= g[\mathbf{v}] + \mathbf{n},
\end{aligned} \tag{10}$$

where $\mathbf{h}$ denotes the channel coefficients, $\mathbf{n}$ is AWGN, $g$ is a nonlinear function, and $\otimes$ stands for the linear convolution operation. Various types of NN-based equalizers have been shown to achieve better performance than conventional linear equalizers [74]–[77]. However, after neural network-based equalization, the noise
The decoding performance of polar codes relies on both is typically not i.i.d. white Gaussian distributed, which may
channel conditions and construction methods. State-of-the-art lead to (often severe) performance degradation in the chan-
construction approaches [67], [68] as well as the standardized nel decoder [74], [78]. A solution to this issue is to take
5G polar codes [69] are merely optimal or near-optimal for equalization and decoding into consideration jointly. Normally,
specific channel models. To overcome this limitation, a general the joint implementation of channel equalizer and decoder
method based on a genetic algorithm [70] has been presented is based on iterative methods [79], e.g., soft information of
for polar code construction. Moreover, the framework is tai- both parts is processed and exchanged iteratively, which is a
lored for decoding algorithms like SC, BP, and SCL [71]. time-consuming process.
The decoder-tailored property leads to superior performance 2) Deep Learning Based Joint Equalizing and Decoding:
compared to that of other construction methods. For a shorter processing time, an end-to-end approach has
A learning-based construction method that fits the char- been proposed in [78] This method employs a DNN for
acteristics of BP decoding has been proposed in [72]. The joint channel equalization and decoding. Particularly, the DNN
information and frozen bits are regarded as the trainable takes the received symbols as input and the estimated code-
weights of a neural network. The learned construction pattern word as output. The simulation results in [78] show that it
of polar codes delivers considerable performance gain over outperforms traditional methods (e.g., Gaussian processing for
the 5G standards [69] under both AWGN and Rayleigh fading classification and the SC algorithm) for polar code (16, 8).
channels. However, in terms of long code lengths, such a model is
To sum up, existing AI-aided channel coding methods can expected to require a longer time for off-line training and a
be either model-driven or data-driven. Detailed elaboration on deeper network for better error-correcting performance.
the difference between model-driven or data-driven can be Using a DNN as channel equalizer suffers a number of
found in [25]. One can also combine AI-aided encoding and limitations. Firstly, the pre-defined DNN is not adaptable

for symbol sequences with varying lengths. Secondly, the complexity of DNN inference, in terms of the number of parameters, grows exponentially as the length of the data sequence increases. Fortunately, these problems can be addressed by introducing CNNs [80] and RNNs [81], [82], which can be adaptive to various input lengths and may require fewer parameters compared with densely connected DNNs. In addition, to eliminate ISI, CNNs can be more efficient equalizers than DNNs since ISI only has an effect on a limited number of consecutive symbols of the transmitted data, and a CNN can have better performance in feature extraction of consecutive data. In [80], the joint training of a CNN equalizer and a DNN decoder shows better transmission performance than the DNN model in [78]. For RNN-based joint equalization and decoding, the authors in [82] utilize three layers for equalization, log-likelihood processing, and decoding, respectively. Using gated recurrent units, the method from [82] requires fewer parameters compared to [78], [80] and achieves better performance (e.g., more than 0.5 dB gain).

B. Non-Orthogonal Multiple Access (NOMA)

By introducing controllable nonorthogonality, NOMA techniques can effectively increase the spectral efficiency and, thus, have been considered for future wireless systems. Reported NOMA techniques include sparse-code multiple access (SCMA) [83], multi-user shared access (MUSA) [84], pattern-division multiple access (PDMA) [85], interleave-division multiple access (IDMA), and so on. Among them, SCMA enjoys lower decoding complexity as well as higher spectral and shaping gains in terms of multi-dimensional constellations [86].

1) Limitations of Prior Art: For a $K$ complex-dimension SCMA system with $J$ layers, given the input data $x_j = (x_{1j}, \ldots, x_{Kj})^T$, the received signal is

$$ y = \sum_{j=1}^{J} \operatorname{diag}(h_j)\, x_j + n, \tag{11} $$

where $h_j = (h_{1j}, \ldots, h_{Kj})^T$ denotes the channel vector and $n$ is the channel noise.

The performance of SCMA depends heavily on the channel estimation quality. Specifically, existing SCMA detection algorithms, such as the deterministic message passing algorithm (DMPA) [87], require precise channel estimates. Once the channel estimates are inaccurate or the channel conditions are unknown, the performance of SCMA will degrade significantly. To detect the CSI, [88] designs a new kind of linear estimator and then proposes a novel CSI detector based on this estimator. The CSI ensures that the interference on the clean signal is as small as possible and that the decoding is as accurate as possible in uplink NOMA systems. However, it cannot be denied that it is difficult to obtain the CSI with traditional methods. Besides, the design of a suitable codebook is also an essential factor affecting the performance of SCMA. Many techniques for manually designing codebooks, such as considering the distance of each constellation [89], [90] and the phase between constellations [91], still have many problems. Hence, codebook design is also a tricky issue.

In recent years, with the rapid development of deep learning, many challenging problems can be solved effectively via deep learning. The problems that arise in NOMA or SCMA systems mentioned above can also be addressed by deep learning.

Fig. 6. The AI-based techniques classification of NOMA system.

Fig. 6 shows a classification of AI-based NOMA techniques, and we take SCMA as an example.

2) Image Processing Based Blind Detection: In [92], a blind detector for SCMA from the perspective of image processing is proposed, which utilizes the relationship between the resource nodes and the layer nodes to determine the pattern of SCMA, that is, the conversion rules between the transmitted signal and the pixels of the constructed two-dimensional image. Image inpainting techniques, such as total variation (TV), are applied in [92] to reduce the initial noise of the two-dimensional SCMA model. The image constructed in [92] is the input of the CNN, which contributes to denoising the image. The proposed denoising CNN combined with DMPA is simulated in [92]. As mentioned above, the performance of DMPA depends heavily on the CSI. Thus, the performance of DMPA is greatly reduced in the case of inaccurate channel estimation. The blind detector of SCMA can compensate well for this issue.

3) Deep Learning Based SCMA Construction: In [93], a deep learning-aided SCMA system, named D-SCMA, is proposed in order to construct the codebook adaptively. The idea is to utilize a DNN in order to learn the codebook and the decoding strategy of SCMA autonomously. For the DNN-based SCMA decoder, denoted by $g(y; \theta_g)$, the parameter $\theta_g$ can be determined by tackling the optimization problem:

$$ \min_{\theta_g} \| r - g(y; \theta_g) \|^2. \tag{12} $$

Here, $\theta_g$ denotes the weights and the biases of the decoder, $y$ denotes the signal at the receiver, and $r$ denotes the original signal. By learning the mapping relationship between the input data and the constellation plane with the aid of a DNN, i.e., the weights and the biases of the DNN, the construction of the DNN decoder is transformed into solving a parameter optimization problem. The DNN-based decoder can be composed of DNN units that are separated from each other. A combination of all information can be achieved through a fully connected (FC) network. Utilizing stochastic gradient descent (SGD), the weights and biases can be iteratively updated. The adaptive D-SCMA system overcomes the shortcomings of previous manual codebook designs.

Other advantages of D-SCMA, such as performance and computational complexity, are also shown in [93] through simulations. Designing the codebook and decoder of sparse multidimensional signals such as SCMA with the aid of DL is therefore advantageous.

C. Channel Estimation

In wireless communication systems, coherent detection and precoding require accurate estimates of the channel’s transfer function. In practice, the channel is typically estimated
by pilots which are known to the receiver. The conventional pilot-based channel estimation methods include least squares (LS) and linear MMSE (LMMSE). LS is easy to implement but has poor performance, while LMMSE performs better since the statistics of channel and noise are considered. Recently, deep learning methods have been adopted for channel estimation. With deep learning, the linear channel estimation methods can be further improved. Deep learning methods can learn the channel structure, which enhances the performance of channel estimation.

Inspired by image processing, [94] applies the learned denoising-based approximate message passing (LDAMP) neural network to channel estimation by considering the channel matrix as an image. The denoising CNN (DnCNN), whose architecture is shown in Fig. 7, is utilized as the denoiser of the LDAMP network, and the channel structure can be learnt from the training process. The simulation results show that the LDAMP network outperforms other compressed sensing-based channel estimation methods.

Fig. 7. Architecture of the DnCNN denoiser [94].

Similarly, a network named ChannelNet is proposed in [95] by considering the time-frequency response of the channel as a 2D image. With deep learning, the “high-resolution” channel matrix can be recovered from the pilot sequence, which is regarded as a low-resolution image. Besides, by learning the features of the channel state with a denoising network, the performance of channel estimation can be further enhanced. The results show that the performance is competitive with LMMSE.

In [96], a deep learning-based joint pilot design and channel estimation scheme is proposed. A two-layer neural network (TNN) and a DNN are used to construct the pilot designer and channel estimator. The simulation results show that the network outperforms the conventional LMMSE methods. The TNN and DNN structure is also used in [97]. A two-stage process, including deep learning-based pilot-aided and data-aided channel estimation, is proposed to enhance the accuracy of channel estimation. According to the simulations, the scheme performs better than conventional methods.

Aiming at the scenarios of fast time-varying and nonstationary channels, [98] proposes a channel estimation network, called ChanEstNet, to improve the performance of the downlink channel estimation. The results show that the proposed method performs better than conventional methods in high-speed scenarios.

To sum up, AI techniques have been exploited for other modules of baseband processing. Performance improvement has been reported. However, due to the nature of different modules, the AI approaches may vary. Also, their efficient hardware designs are expected in future research.

V. ML FOR DEALING WITH MODEL UNCERTAINTIES

The previous sections have illustrated how ML techniques can be deployed as an alternative to conventional, sub-optimal algorithms. Such solutions are interesting where theoretically optimal algorithms or closed-form solutions are either unknown (which quickly becomes the case for complex system models) or where their computational complexity is too high for economic implementation.

A slightly different, but equally valid motivation for the use of ML techniques in communications are scenarios or problems for which the corresponding system model itself is not or insufficiently well known. In this situation, conventional algorithms can only be derived based on often simplistic assumptions which in some cases may turn out to be difficult to make or which may suffer from considerable uncertainty. In such cases, the ability to learn a generic model or to train an algorithm based on observations without the need to explicitly know the underlying system model can be a great opportunity. In the following, we provide three examples for such applications of ML algorithms.

A. Full-Duplex SI Cancellation

The first example is concerned with a new approach to radio communication in which a system transmits and receives at the same time in the same frequency band. This simple, but effective idea of in-band full-duplex (IBFD) communication [99] has gained attention for a variety of future wireless systems as it promises significant improvements in spectral efficiency. The main difficulty with IBFD communication lies in the need for a highly efficient cancellation of the self-interference, which is often >80 dB above the weak received signal. The reconstruction and cancellation of this self-interference (SI) should in principle be straightforward as the transmitted signal is known. Unfortunately, the radio frequency (RF) components in a real-world system introduce significant non-linear distortions. These distortions need to be modeled and reconstructed, but existing models do not capture all of them with sufficient accuracy, which must be far better than what has been considered so far for simply analyzing their impact on the transmitted signal quality.

Fig. 8. Reconstruction of self interference with non-linear distortions based on a neural network for full-duplex systems.

ML techniques offer an approach to reconstructing the SI signal that is mostly unbiased by assumptions on the underlying hardware and the associated potential distortion. This basic idea was first described in [100] and is illustrated in Fig. 8. The two-layer network used in this work and some later publications receives both the real and imaginary parts of the transmitted signal as well as the time-delayed signal to account for the memory of the non-linear SI channel. It was shown that such a construction is able to match the SI cancellation performance of conventional IBFD systems with often lower computational complexity [101]. Other works in this direction incorporate limited prior knowledge about the system, but still use ML or at least the associated training to adjust system parameters. For example, the work in [102]
combines an LMS filter with a neural network to remove 2nd-order intermodulation interference in 5G transceivers.

B. RF/PA Linearization

The second example for exploiting the expressive power of neural networks for working with systems that have no well-defined model is closely related to the work on IBFD communication. In [103], the authors propose to use a neural network to perform pre-distortion to reduce the negative impact of power amplifier (PA) non-linearities on the quality of the transmitted signal. This task captures both the difficulty of exactly modeling those nonlinearities and that of designing a corresponding inverse with limited complexity. The results presented in the paper indicate that, again, ML techniques can at least match the performance of conventional techniques with an overall significantly lower complexity. Two further works [104], [105] employ real-valued time-delay neural networks for distortion compensation in PA linearization.

C. Estimation and Tracking of Fading Channels

A third example of an application where closed-form, broadly applicable models are often difficult to obtain is the behavior of fading wide-band channels. Conventional estimation techniques to track such rapidly changing channels need strong assumptions on their statistical properties which in practice are often not valid. Unfortunately, the diversity of such systems makes it hard to come up with more precise models that are not only broadly applicable, but also allow for the application of established estimation algorithms. ML techniques can again provide a more model-agnostic approach to solve this difficult problem. An example is the work in [106], which uses a DNN to predict the channel based on pilots and previous channel estimates.

Reference [107] has also shown that ML can help to exploit physically motivated relationships which are not well captured in sufficiently simple and closed-form models. In this paper, the authors exploit the clearly existing, but complex relationship between the up-link and down-link channel in a frequency-division duplex massive MIMO system to predict one from the other to reduce training overhead. To this end, the employed sparse complex-valued neural network approximates the up-link-to-down-link mapping function, which is a difficult problem for conventional algorithms.

VI. AI-BASED LOCALIZATION

ML techniques are currently finding widespread application for localization purposes in wireless networks [108]–[110]. We next summarize methods that require supervision while learning from large datasets as well as self-supervised techniques that provide localization and prediction capabilities without the need for expensive measurement campaigns.

A. Localization Using Channel Fingerprinting

Traditional localization approaches for outdoor applications are typically based on triangulation, which maps time-of-flight (ToF), angle-of-arrival, or received signal strength (RSS) information to location using geometrical and physical models. Such methods heavily rely on line-of-sight (LoS) connectivity to multiple BSs, access points, or satellites; a popular instance are global navigation-satellite systems (GNSS) that combine ToF measurements from multiple satellites to infer location. In scenarios that lack LoS connectivity (which is the case for indoor scenarios) or for devices where GNSS receivers are unavailable (which is the case for ultra-low-power sensors), alternative positioning methods are required.

1) Location Fingerprinting Basics: Fingerprinting is a widespread approach for device localization under challenging propagation conditions [109], [111]–[113]. This approach performs localization in two phases. In the first phase, CSI $\{f_n\}_{n=1}^{N} = F$ and associated ground-truth location information $\{x_n\}_{n=1}^{N} = X$ are measured and stored in a database. Here, the vectors $f_n \in \mathbb{R}^D$ correspond to $D$-dimensional CSI-fingerprints at positions $x_n \in \mathbb{R}^d$, where $D$ is typically large and $d$ is two or three. CSI-fingerprints $f_n$ can contain RSS measured at multiple antennas, power-delay profiles, angle-of-arrival, and many others [109]. In the second phase, an estimate for the location $x_n$ of a new transmitting device with index $n$ is generated. After extracting a CSI-fingerprint $f_n$ from this transmitter, the indices associated with the $K$ most similar fingerprints in the database $\{f_n\}_{n=1}^{N}$ are identified. One can then estimate the location of transmitter $n$ from the set of similar locations via averaging [114].

While the K-nearest neighbor search (KNNS) in the CSI-fingerprint database can provide a simple (often accurate) estimate of the transmitter’s location, the complexity of KNNS in large fingerprinting databases as well as storage of the fingerprints can quickly become a bottleneck. To address the complexity bottleneck, reference [115] proposed the use of locality-sensitive hashing (LSH) [116], [117], a powerful method for approximate KNNS that is widely used in ML applications. The idea of LSH is to construct hash functions for which similar datapoints have matching hash values and dissimilar datapoints have mismatched hash values. This hash function is then applied to all points in the CSI-fingerprint dataset, i.e., $\{h(f_n)\}_{n=1}^{N}$. For a new query point $f_n$, one computes $h(f_n)$ and compares the resulting hash value to those in the dataset. One can then compare the true, high-dimensional CSI feature dissimilarity $d(f_m, f_n)$, e.g., the Euclidean distance $d(f, f') = \|f - f'\|$, associated to only those indices for which there was a hash collision. As shown in [115], such an LSH-based localization approach is able to reduce the number of fingerprint comparisons by more than 10× while achieving the same average distance error as a traditional KNNS. Unfortunately, LSH does not help to reduce the amount of storage for the CSI fingerprints and hash values.

2) Location Fingerprinting via Neural Networks: In order to mitigate the complexity and storage bottlenecks of location fingerprinting, DNNs have been proposed recently in [118]–[121]. Such methods avoid a KNNS altogether and directly map measured CSI-fingerprints to location. Concretely, one learns a neural network $g_\theta$, where the weights and biases are contained in the vector $\theta$, that maps the CSI features $\{f_n\}_{n=1}^{N}$ to locations $\{x_n\}_{n=1}^{N}$ by computing $x_n = g_\theta(f_n)$. In order to learn the network parameters $\theta$, one can minimize the positioning error by minimizing the following MSE loss function:

$$ L(\theta) = \sum_{n=1}^{N} \| x_n - g_\theta(f_n) \|^2. \tag{13} $$

After training, a new, unseen CSI feature $f_n$ can then be localized by computing $x_n = g_\theta(f_n)$. In practice,
simple neural networks with fewer than ten layers and ReLU activations are often sufficient for accurate positioning; what matters the most is the way measured CSI is converted into fingerprints. Fingerprints should be resilient to small-scale fading properties of the channel and capture large-scale aspects; a combination of angle-of-arrival and delay information appears to be among the best-performing features for such applications [120], [122].

Fig. 9. Siamese neural network for channel charting.
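As a rough illustration of the two fingerprinting approaches discussed in this subsection, the following sketch localizes a transmitter once with a K-nearest-neighbor average over a fingerprint database and once with a small ReLU network trained on the MSE loss in (13). Everything concrete in it is an assumption made for illustration only: the synthetic log-distance fingerprints, the four anchor positions, the network size, the learning rate, and the names `knn_locate`, `g`, and `mse_loss` do not come from the cited works, and the loss is divided by N to keep the gradient step size stable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Phase 1 (assumed synthetic "measurement campaign"): N ground-truth
# positions x_n in the unit square and D-dimensional fingerprints f_n.
N, D, d, H = 200, 4, 2, 16
X = rng.uniform(0.0, 1.0, size=(N, d))
anchors = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
dist = np.linalg.norm(X[:, None, :] - anchors[None, :, :], axis=2)
F = -np.log(dist + 0.1)                    # toy RSS-like feature per anchor
F = (F - F.mean(axis=0)) / F.std(axis=0)   # standardize each feature


def knn_locate(f_query, K=5):
    """Average the locations of the K most similar stored fingerprints."""
    nearest = np.argsort(np.linalg.norm(F - f_query, axis=1))[:K]
    return X[nearest].mean(axis=0)


# Network g_theta with one hidden ReLU layer; theta = (W1, b1, W2, b2).
W1 = rng.normal(0.0, 0.3, size=(D, H))
b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.3, size=(H, d))
b2 = np.zeros(d)


def g(F_in):
    """Return hidden activations and predicted locations for a batch."""
    hidden = np.maximum(F_in @ W1 + b1, 0.0)
    return hidden, hidden @ W2 + b2


def mse_loss(F_in, X_true):
    """Loss (13), divided by the batch size."""
    _, pred = g(F_in)
    return float(np.sum((X_true - pred) ** 2) / len(F_in))


initial_loss = mse_loss(F, X)
learning_rate = 0.01
for _ in range(2000):                      # full-batch gradient descent
    hidden, pred = g(F)
    err = 2.0 * (pred - X) / N             # gradient of the loss w.r.t. pred
    grad_W2 = hidden.T @ err
    grad_b2 = err.sum(axis=0)
    back = (err @ W2.T) * (hidden > 0)     # backpropagate through the ReLU
    grad_W1 = F.T @ back
    grad_b1 = back.sum(axis=0)
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2
final_loss = mse_loss(F, X)

print(f"MSE before/after training: {initial_loss:.3f} / {final_loss:.3f}")
print("KNN estimate:", knn_locate(F[0]), "true:", X[0])
```

In a real deployment, `F` and `X` would come from measurements rather than a toy path-loss model. The sketch mainly shows the trade-off described in the text: the KNN estimate needs the whole database at query time, whereas the trained network only needs its weights.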
B. Channel Charting for Self-Supervised Location Sensing

Unfortunately, fingerprinting and neural-network-based localization approaches require extensive, time-consuming, and costly measurement campaigns, which have to be repeated if large-scale properties of the environment change. In many applications, e.g., for predictive radio resource management (RRM) in wireless networks, cell or beam association and handover, and UE pairing and grouping, exact location is not necessary and relative position information is sufficient. Since carefully designed CSI features change only slowly when a transmitter moves through space, one can extract a low-dimensional representation from high-dimensional CSI features only, which provides relative location information. This approach is known as channel charting (CC) [122], which uses CSI features to form a channel chart in which nearby points will be nearby in physical space. Such a channel chart can be constructed using dimensionality reduction [123] in a self-supervised (or unsupervised) fashion which does not require extensive measurement campaigns.

1) Channel Charting via Sammon’s Mapping: Mathematically, channel charting only requires measured CSI features $\{f_n\}_{n=1}^{N}$ and computes for each feature $f_n \in \mathbb{R}^D$ a point $y_n \in \mathbb{R}^d$ in the low-dimensional channel chart ($d$ is two or three). The idea is that for similar features, the points in the channel chart should be similar as well. Mathematically, one can solve an optimization problem of the form

$$ \{y_n\}_{n=1}^{N} = \arg\min_{y_n,\, n=1,\ldots,N} \sum_{m \neq n} w(f_n, f_m) \big( \|f_n - f_m\| - \|y_n - y_m\| \big)^2, \tag{14} $$

which is known as Sammon’s mapping [124]. Here, the function $w(f_n, f_m)$ is used to de-weight the cost function for features that are dissimilar, i.e., one wants to enforce similarity of pairs of features and pairs of locations in the channel chart only if the feature pair is similar; the standard choice for Sammon’s mapping is $w(f_n, f_m) = \|f_n - f_m\|^{-1}$. While the problem in (14) can be solved efficiently using gradient-descent techniques, Sammon’s mapping is nonparametric in the sense that it does not naturally provide a function that maps CSI features $f_n$ to points $y_n$ in the channel chart.

2) Channel Charting via Siamese Neural Networks: In order to arrive at a channel charting method that is parametric, i.e., provides a function $c_\theta$ that maps features to points in the channel chart, reference [121] proposed to solve an alternative optimization problem

$$ \hat{\theta} = \arg\min_{\theta} \sum_{m \neq n} w(f_n, f_m) \big( \|f_n - f_m\| - \|c_\theta(f_n) - c_\theta(f_m)\| \big)^2, \tag{15} $$

where $y_n$ is replaced by a DNN $y_n = c_\theta(f_n)$ in (14). The structure of this network is known as a “Siamese neural network” [125] that is illustrated in Fig. 9. This network corresponds to two parallel neural networks that share the same weights and biases; the output is the distance between the networks’ outputs. Besides the learned neural network $c_\theta(f_n)$, which can be used for RRM and other predictive tasks, Siamese networks can also be used to perform semi-supervised learning where one provides the channel charting method with a subset of annotated locations $x_n$; this approach provides absolute location capabilities while requiring orders-of-magnitude fewer location-annotated channel features [121], [126].

VII. FUTURE RESEARCH

Although AI has already shown its advantages, future research on AI for 5G and B5G still needs to address the following issues.

A. Learning Methods vs. Conventional Methods

In certain situations, conventional methods can be a satisfactory solver with even lower complexity compared to learning methods. Existing works have shown that learning-based methods demonstrate their advantages when handling problems that cannot be either exactly modeled or solved [9]. Future research needs to identify the application areas of AI for 5G/B5G.

B. Model Driven vs. Data Driven

Combined with expert knowledge, model-driven learning can effectively lower the training complexity. On the other hand, data-driven learning can help to discover useful patterns which could not be figured out by expertise, for better performance, though the complexity might be prohibitive. Whether to introduce expertise depends on how reliable it is. Fully making use of expertise can improve the learning efficiency and convergence rate, and lower complexity to meet the real-time requirements. Future research is expected to point out which approach is appropriate for a specific problem.

C. Data Validity

For “AI for 5G/B5G”, data is the necessary prerequisite and is at least as important as learning. Data validity includes accuracy, identifiability, ergodicity, and integrity. It is critical to check the data validity before learning. The data generated by wireless systems usually lack perfect ergodicity. For Gaussian distributed data, the further the data is away from the mean value, the less likely it will be collected for learning, and the probability of under-fitting is higher. Therefore, without proper manual intervention on the data, the learning efficiency based on it will be reduced.

D. Data Sharing and Data Security

For efficient data usage, AI for 5G/B5G usually needs access to all data generated by the network layer, data link layer, and physical layer. However, due to the phenomenon of “Data
Silos,” the massive data generated by wireless networks cannot be fully opened and shared even for research purposes. Second, today’s data lacks a scientific and unified collection and storage specification, which will further hinder the development of this area. On the other hand, the requirement of data sharing comes with the consideration of data security. How to protect users’ rights, interests, and privacy is also important [127].

E. Offline Training vs. Online Training

For good learning performance, a large number of parameters need to be trained. To lower both time complexity and space complexity, designers usually prefer offline training. The successful application of offline-trained parameters depends on their robustness. Otherwise, online training should be considered. For real-time considerations, the number of parameters to be trained online should be reduced and hardware acceleration is required. How to efficiently obtain the training data for online learning is another issue that needs to be addressed [128], [129].

F. Separate Learning vs. Joint Learning

Most existing works focus on learning for one or two modules. Separate learning is a natural choice since different wireless modules have different features. It is generally agreed that joint learning for more modules will bring advantages in performance, complexity, and robustness. However, the implementation of joint learning requires the common nature of multiple modules. An extreme case is end-to-end learning [130], [131], which will be more interesting and complex for future research.

VIII. CONCLUSION

Development of AI for 5G/B5G requires domain-specific knowledge from wireless communication, expertise in AI and machine learning, and experience in hardware design. As the review paper in the special issue “Artificial Intelligence for 5G and Beyond 5G: Implementations, Algorithms, and Optimizations” of the IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), this review aims to provide a synthesized source and a wide view of research problems, methodologies, and recent results in this area, and to highlight further directions for readers. We hope this paper and special issue will be interesting for a large portion of related researchers and will be timely to trigger future research and pave the road towards the B5G era.

ACKNOWLEDGMENT

The authors would like to thank Prof. Xiaohu You for useful discussions on future research directions.

REFERENCES

[1] M. Liyanage, I. Ahmad, A. Abro, A. Gurtov, and M. Ylianttila, A Comprehensive Guide to 5G Security. Hoboken, NJ, USA: Wiley, 2018.
[2] Oxford Analytica, “COVID-19 will probably accelerate remote working trend,” in Emerald Expert Briefings. Oxford, U.K.: Oxford Analytica, 2020.
[3] H. Ullah, N. G. Nair, A. Moore, C. Nugent, P. Muschamp, and M. Cuevas, “5G communication: An overview of vehicle-to-everything, drones, and healthcare use-cases,” IEEE Access, vol. 7, pp. 37251–37268, 2019.
[4] C. Zhang, Y.-H. Huang, F. Sheikh, and Z. Wang, “Advanced baseband processing algorithms, circuits, and implementations for 5G communication,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 7, no. 4, pp. 477–490, Dec. 2017.
[5] Z. Ma, Z. Zhang, Z. Ding, P. Fan, and H. Li, “Key techniques for 5G wireless communications: Network architecture, physical layer, and MAC layer perspectives,” Sci. China Inf. Sci., vol. 58, no. 4, pp. 1–20, Apr. 2015.
Performance for IMT-2020 Radio Interface (S), document ITU-R
SG05 Contribution, Draft New Report ITU-R M.IMT-2020, 2017,
G. Physical Layer vs. Cross Layer vol. 40.
Existing works mostly focus AI’s applications in a single [7] 5G; Study on Scenarios and Requirements for Next Generation Access
layer, e.g., the physical layer. Though good performance Technologies, document TR 38.913, 3GPP, 2017.
and cost balance can be achieved by model-driven learning, [8] Wikipedia. (2020). Artificial Intelligence—Wikipedia, the Free
Encyclopedia. Accessed: Apr. 3, 2020. [Online]. Available: http://en.
physical layer learning is somehow constrained by expert wikipedia.org/w/index.php?title=Artificial%20intelligence&oldid
knowledge, otherwise the end-to-end learning becomes fea- =948602554
sible. Some researchers are arguing we should not limit AI’s [9] X. You, C. Zhang, X. Tan, S. Jin, and H. Wu, “AI for 5G: Research
application within the physical layer but need to consider for directions and paradigms,” Sci. China Inf. Sci., vol. 62, no. 2, p. 21301,
Feb. 2019.
cross-layer applications. For example, nowadays researchers
[10] A. Zappone, M. Di Renzo, and M. Debbah, “Wireless networks design
have started AI-based network optimization. Learning crossing in the era of deep learning: Model-based, AI-based, or both?” IEEE
two or three layers can expect more benefits. Trans. Commun., vol. 67, no. 10, pp. 7331–7376, Oct. 2019.
[11] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo,
“Machine learning paradigms for next-generation wireless networks,”
H. Learning With Software vs. Learning With Hardware IEEE Wireless Commun., vol. 24, no. 2, pp. 98–105, Apr. 2017.
Learning with software will always be the first choice due [12] S.-Y. Wu, “Key technology enablers of innovations in the AI and 5G
to its convenience and adaptivity. However, for complicated era,” in IEDM Tech. Dig., Dec. 2019, p. 36.
learning tasks, software implementation will introduce high [13] R. Li et al., “Intelligent 5G: When cellular networks meet artificial
intelligence,” IEEE Wireless Commun., vol. 24, no. 5, pp. 175–183,
time and space complexity. Therefore, learning with hardware Oct. 2017.
has also been considered. For efficient hardware implementa- [14] D. Gesbert, D. Gündüz, P. de Kerret, C. R. Murthy, M. van der Schaar,
tion, the corresponding algorithm should be hardware-friendly, and N. D. Sidiropoulos, “Guest editorial special issue on machine
for example, robust to quantization and structure-regular. learning in wireless communication—Part 1,” IEEE J. Sel. Areas
Commun., vol. 37, no. 10, pp. 2181–2183, Oct. 2019.
Hardware design expertise on both FPGA and ASIC is
[15] D. Gesbert, D. Gündüz, P. de Kerret, C. R. Murthy, M. van der Schaar,
essential for this issue. On the other hand, AI can also help and N. D. Sidiropoulos, “Guest editorial special issue on machine
to design reconfigurable hardware for learning. Existing learning in wireless Communication—Part 2,” IEEE J. Sel. Areas
hardware auto-generator can be the basis for future research. Commun., vol. 37, no. 11, pp. 2409–2412, Nov. 2019.


[16] S. Gong et al., “Introduction to the special section on deep reinforcement learning for future wireless communication networks,” IEEE Trans. Cognit. Commun. Netw., vol. 5, no. 4, pp. 1019–1023, Dec. 2019.
[17] IEEE J. Sel. Topics Signal Process. (JSTSP). (2020). Call for Papers: Special Issue on Compact Deep Neural Networks With Industrial Applications. Accessed: Apr. 3, 2020. [Online]. Available: https://signalprocessingsociety.org/sites/default/files/uploads/special_issues_deadlines/JSTSP_SI_compact_deep.pdf
[18] IEEE Global Commun. Conf. (GLOBECOM). (2019). Call for Workshop Papers: Workshop on Machine Learning for Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: https://globecom2019.ieee-globecom.org/authors/call-workshop-papers
[19] IEEE Global Commun. Conf. (GLOBECOM). (2019). Call for Workshop Papers: Workshop on Artificial Intelligence for Next-Generations Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: https://globecom2019.ieee-globecom.org/authors/call-workshop-papers
[20] IEEE Global Conf. Signal Info. (GlobalSIP). (2019). Call for Papers: Symposium on Machine Learning for Wireless Communications, Networking, and Security. Accessed: Apr. 3, 2020. [Online]. Available: http://2019.ieeeglobalsip.org/pages/machine-learning-wireless-communications-networking-and-security
[21] IEEE Global Conf. Signal Info. (GlobalSIP). (2019). Call for Papers: Symposium on Artificial Intelligence for Future Wireless Communication. Accessed: Apr. 3, 2020. [Online]. Available: http://2019.ieeeglobalsip.org/pages/symposium-artificial-intelligence-future-wireless-communication
[22] IEEE Vehi. Tech. Conf. (VTC)-Fall. (2019). Call for Workshop Papers: Workshop on Machine Learning for Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: http://www.ieeevtc.org/vtc2019fall/cfw-pprs.php#wkshp_3
[23] Brooklyn 5G Summit. (2020). Call for Workshop Papers: Workshop on Machine Learning for Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: https://brooklyn5gsummit.com/agenda-2020/
[24] IEEE Communications Society. (2019). Machine Learning for Communications Emerging Technologies Initiative. Accessed: Apr. 3, 2020. [Online]. Available: https://mlc.committees.comsoc.org/workshops-tutorials-symposia/
[25] Z. Qin, H. Ye, G. Y. Li, and B.-H.-F. Juang, “Deep learning in physical layer communications,” IEEE Wireless Commun., vol. 26, no. 2, pp. 93–99, Apr. 2019.
[26] H. He, S. Jin, C.-K. Wen, F. Gao, G. Y. Li, and Z. Xu, “Model-driven deep learning for physical layer communications,” IEEE Wireless Commun., vol. 26, no. 5, pp. 77–83, Oct. 2019.
[27] L. Liang, H. Ye, G. Yu, and G. Y. Li, “Deep-learning-based wireless resource allocation with application to vehicular networks,” Proc. IEEE, vol. 108, no. 2, pp. 341–356, Feb. 2020.
[28] N. Samuel, T. Diskin, and A. Wiesel, “Learning to detect,” IEEE Trans. Signal Process., vol. 67, no. 10, pp. 2554–2564, May 2019.
[29] M. Khani, M. Alizadeh, J. Hoydis, and P. Fleming, “Adaptive neural signal detection for massive MIMO,” 2019, arXiv:1906.04610. [Online]. Available: http://arxiv.org/abs/1906.04610
[30] X. Gao, S. Jin, C.-K. Wen, and G. Y. Li, “ComNet: Combination of deep learning and expert knowledge in OFDM receivers,” IEEE Commun. Lett., vol. 22, no. 12, pp. 2627–2630, Dec. 2018.
[31] C. Jeon, O. Castañeda, and C. Studer, “A 354 Mb/s 0.37 mm² 151 mW 32-user 256-QAM near-MAP soft-input soft-output massive MU-MIMO data detector in 28 nm CMOS,” IEEE Solid-State Circuits Lett., vol. 2, no. 9, pp. 127–130, Oct. 2019.
[32] J. Ma and L. Ping, “Orthogonal AMP,” IEEE Access, vol. 5, pp. 2020–2033, 2017.
[33] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “A model-driven deep learning network for MIMO detection,” in Proc. IEEE Global Conf. Signal Inf. Process. (GlobalSIP), Nov. 2018, pp. 584–588.
[34] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “Model-driven deep learning for MIMO detection,” IEEE Trans. Signal Process., vol. 68, pp. 1702–1715, Feb. 2020.
[35] X. Tan et al., “Improving massive MIMO message passing detectors with deep neural network,” IEEE Trans. Veh. Technol., vol. 69, no. 2, pp. 1267–1280, Feb. 2020.
[36] A. Balatsoukas-Stimming and C. Studer, “Deep unfolding for communications systems: A survey and some new directions,” in Proc. IEEE Int. Workshop Signal Process. Syst. (SiPS), Oct. 2019, pp. 1–6.
[37] M. Joham, W. Utschick, and J. A. Nossek, “Linear transmit processing in MIMO communications systems,” IEEE Trans. Signal Process., vol. 53, no. 8, pp. 2700–2712, Aug. 2005.
[38] K. Li, C. Jeon, J. R. Cavallaro, and C. Studer, “Feedforward architectures for decentralized precoding in massive MU-MIMO systems,” in Proc. 52nd Asilomar Conf. Signals, Syst., Comput. (ACSSC), Oct. 2018, pp. 1659–1665.
[39] A. M. Elbir, “CNN-based precoder and combiner design in mmWave MIMO systems,” IEEE Commun. Lett., vol. 23, no. 7, pp. 1240–1243, Jul. 2019.
[40] A. Balatsoukas-Stimming, O. Castañeda, S. Jacobsson, G. Durisi, and C. Studer, “Neural-network optimized 1-bit precoding for massive MU-MIMO,” in Proc. IEEE 20th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jul. 2019, pp. 1–5.
[41] O. Castañeda, S. Jacobsson, G. Durisi, M. Coldrey, T. Goldstein, and C. Studer, “1-bit massive MU-MIMO precoding in VLSI,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 7, no. 4, pp. 508–522, Dec. 2017.
[42] O. Castañeda, T. Goldstein, and C. Studer, “VLSI designs for joint channel estimation and data detection in large SIMO wireless systems,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 65, no. 3, pp. 1120–1132, Oct. 2017.
[43] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.
[44] D. J. C. MacKay, “Good error-correcting codes based on very sparse matrices,” IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 399–431, Mar. 1999.
[45] J. Chen and M. P. C. Fossorier, “Density evolution for two improved BP-based decoding algorithms of LDPC codes,” IEEE Commun. Lett., vol. 6, no. 5, pp. 208–210, May 2002.
[46] E. Arikan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, Jul. 2009.
[47] X. Wu, M. Jiang, and C. Zhao, “Decoding optimization for 5G LDPC codes by machine learning,” IEEE Access, vol. 6, pp. 50179–50186, 2018.
[48] B. Vasic, X. Xiao, and S. Lin, “Learning to decode LDPC codes with finite-alphabet message passing,” in Proc. Inf. Theory Appl. Workshop (ITA), Feb. 2018, pp. 1–9.
[49] L. Lugosch and W. J. Gross, “Neural offset min-sum decoding,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jun. 2017, pp. 1361–1365.
[50] W. Xu, Z. Wu, Y.-L. Ueng, X. You, and C. Zhang, “Improved polar decoder based on deep learning,” in Proc. IEEE Int. Workshop Signal Process. Syst. (SiPS), Oct. 2017, pp. 1–6.
[51] C.-F. Teng, C.-H.-D. Wu, A. Kuan-Shiuan Ho, and A.-Y.-A. Wu, “Low-complexity recurrent neural network-based polar decoder with weight quantization mechanism,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 1413–1417.
[52] E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein, and Y. Be’ery, “Deep learning methods for improved decoding of linear codes,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 119–131, Feb. 2018.
[53] B. Yuan and K. K. Parhi, “Early stopping criteria for energy-efficient low-latency belief-propagation polar code decoders,” IEEE Trans. Signal Process., vol. 62, no. 24, pp. 6496–6506, Dec. 2014.
[54] C. Simsek and K. Turk, “Simplified early stopping criterion for belief-propagation polar code decoders,” IEEE Commun. Lett., vol. 20, no. 8, pp. 1515–1518, Aug. 2016.
[55] Y. Ren, C. Zhang, X. Liu, and X. You, “Efficient early termination schemes for belief-propagation decoding of polar codes,” in Proc. IEEE 11th Int. Conf. ASIC (ASICON), Nov. 2015, pp. 1–4.
[56] Y. Wang, S. Zhang, C. Zhang, X. Chen, and S. Xu, “A low-complexity belief propagation based decoding scheme for polar codes–decodability detection and early stopping prediction,” IEEE Access, vol. 7, pp. 159808–159820, 2019.
[57] F. Liang, C. Shen, and F. Wu, “An iterative BP-CNN architecture for channel decoding,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 144–159, Feb. 2018.
[58] X. Wang et al., “Learning to flip successive cancellation decoding of polar codes with LSTM networks,” in Proc. IEEE 30th Annu. Int. Symp. Pers., Indoor Mobile Radio Commun. (PIMRC), Sep. 2019, pp. 1–5.


[59] C.-H. Chen, C.-F. Teng, and A.-Y.-A. Wu, “Low-complexity LSTM-assisted bit-flipping algorithm for successive cancellation list polar decoder,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2020, pp. 1708–1712.
[60] T. Gruber, S. Cammerer, J. Hoydis, and S. ten Brink, “On deep learning-based channel decoding,” in Proc. 51st Annu. Conf. Inf. Sci. Syst. (CISS), Mar. 2017, pp. 1–6.
[61] S. Cammerer, T. Gruber, J. Hoydis, and S. ten Brink, “Scaling deep learning-based decoding of polar codes via partitioning,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2017, pp. 1–6.
[62] T. Wadayama and S. Takabe, “Deep learning-aided trainable projected gradient decoding for LDPC codes,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jul. 2019, pp. 2444–2448.
[63] Y. Wang, Z. Zhang, S. Zhang, S. Cao, and S. Xu, “A unified deep learning based polar-LDPC decoder for 5G communication systems,” in Proc. 10th Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct. 2018, pp. 1–6.
[64] W. Lyu, Z. Zhang, C. Jiao, K. Qin, and H. Zhang, “Performance evaluation of channel decoding with deep neural networks,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2018, pp. 1–6.
[65] M. Zhang, Q. Huang, S. Wang, and Z. Wang, “Construction of LDPC codes based on deep reinforcement learning,” in Proc. 10th Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct. 2018, pp. 1–4.
[66] A. Elkelesh, M. Ebada, S. Cammerer, L. Schmalen, and S. ten Brink, “Decoder-in-the-loop: Genetic optimization-based LDPC code design,” IEEE Access, vol. 7, pp. 141161–141170, 2019.
[67] G. He et al., “Beta-expansion: A theoretical framework for fast and recursive construction of polar codes,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2017, pp. 1–6.
[68] I. Tal and A. Vardy, “How to construct polar codes,” IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6562–6582, Oct. 2013.
[69] 3GPP. (Feb. 2017). Final Report of 3GPP TSG RAN WG1 #87. Accessed: Apr. 3, 2020. [Online]. Available: http://www.3gpp.org/DynaReport/TDocExMt–R1-87–31665.htm
[70] A. Elkelesh, M. Ebada, S. Cammerer, and S. ten Brink, “Decoder-tailored polar code design using the genetic algorithm,” IEEE Trans. Commun., vol. 67, no. 7, pp. 4521–4534, Jul. 2019.
[71] I. Tal and A. Vardy, “List decoding of polar codes,” IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2213–2226, May 2015.
[72] M. Ebada, S. Cammerer, A. Elkelesh, and S. ten Brink, “Deep learning-based polar code design,” in Proc. 57th Annu. Allerton Conf. Commun., Control, Comput. (Allerton), Sep. 2019, pp. 177–183.
[73] B. Mitchinson and R. F. Harrison, “Digital communications channel equalization using the kernel adaline,” IEEE Trans. Commun., vol. 50, no. 4, pp. 571–576, Apr. 2002.
[74] X. Lyu, W. Feng, R. Shi, Y. Pei, and N. Ge, “Artificial neural network-based nonlinear channel equalization: A soft-output perspective,” in Proc. 22nd Int. Conf. Telecommun. (ICT), Apr. 2015, pp. 243–248.
[75] J. C. Patra, W. B. Poh, N. S. Chaudhari, and A. Das, “Nonlinear channel equalization with QAM signal using Chebyshev artificial neural network,” in Proc. IEEE Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2005, pp. 3214–3219.
[76] K. Burse, R. N. Yadav, and S. C. Shrivastava, “Channel equalization using neural networks: A review,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 3, pp. 352–357, May 2010.
[77] C.-F. Teng, H.-M. Ou, and A.-Y.-A. Wu, “Neural network-based equalizer by utilizing coding gain in advance,” in Proc. IEEE Global Conf. Signal Inf. Process. (GlobalSIP), Nov. 2019, pp. 1–5.
[78] H. Ye and G. Y. Li, “Initial results on deep learning for joint channel equalization and decoding,” in Proc. IEEE 86th Veh. Technol. Conf. (VTC-Fall), Sep. 2017, pp. 1–5.
[79] P. M. Olmos, J. J. Murillo-Fuentes, and F. Perez-Cruz, “Joint nonlinear channel equalization and soft LDPC decoding with Gaussian processes,” IEEE Trans. Signal Process., vol. 58, no. 3, pp. 1183–1192, Mar. 2010.
[80] W. Xu, Z. Zhong, Y. Be’ery, X. You, and C. Zhang, “Joint neural network equalizer and decoder,” in Proc. 15th Int. Symp. Wireless Commun. Syst. (ISWCS), Aug. 2018, pp. 1–5.
[81] G. Kechriotis, E. Zervas, and E. S. Manolakos, “Using recurrent neural networks for adaptive communication channel equalization,” IEEE Trans. Neural Netw., vol. 5, no. 2, pp. 267–278, Mar. 1994.
[82] Y. Hu, L. Zhao, and Y. Hu, “Joint channel equalization and decoding with one recurrent neural network,” in Proc. IEEE Int. Symp. Broadband Multimedia Syst. Broadcast. (BMSB), Jun. 2019, pp. 1–4.
[83] L. Dai, B. Wang, Y. Yuan, S. Han, C.-L. I, and Z. Wang, “Non-orthogonal multiple access for 5G: Solutions, challenges, opportunities, and future research trends,” IEEE Commun. Mag., vol. 53, no. 9, pp. 74–81, Sep. 2015.
[84] F. Luo and C. Zhang, Non-Orthogonal Multi-User Superposition Shared Access. Hoboken, NJ, USA: Wiley, 2016, pp. 115–142.
[85] S. Chen, B. Ren, Q. Gao, S. Kang, S. Sun, and K. Niu, “Pattern division multiple access—A novel nonorthogonal multiple access for fifth-generation radio networks,” IEEE Trans. Veh. Technol., vol. 66, no. 4, pp. 3185–3196, Apr. 2017.
[86] H. Nikopour et al., “SCMA for downlink multiple access of 5G wireless networks,” in Proc. IEEE Global Commun. Conf., Dec. 2014, pp. 3940–3945.
[87] C. Yang, C. Zhang, S. Zhang, and X. You, “Efficient hardware architecture of deterministic MPA decoder for SCMA,” in Proc. IEEE Asia Pacific Conf. Circuits Syst. (APCCAS), Oct. 2016, pp. 293–296.
[88] Y. Tan, J. Zhou, and J. Qin, “Novel channel estimation for non-orthogonal multiple access systems,” IEEE Signal Process. Lett., vol. 23, no. 12, pp. 1781–1785, Dec. 2016.
[89] M. Taherzadeh, H. Nikopour, A. Bayesteh, and H. Baligh, “SCMA codebook design,” in Proc. IEEE 80th Veh. Technol. Conf. (VTC-Fall), Sep. 2014, pp. 1–5.
[90] H. Nikopour and M. Baligh, “Systems and methods for sparse code multiple access,” U.S. Patent 9 240 853, Jan. 19, 2016.
[91] M. Alam and Q. Zhang, “Performance study of SCMA codebook design,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Mar. 2017, pp. 1–5.
[92] C. Yang, W. Xu, Z. Zhang, X. You, and C. Zhang, “A channel-blind detection for SCMA based on image processing techniques,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2018, pp. 1–5.
[93] M. Kim, N.-I. Kim, W. Lee, and D.-H. Cho, “Deep learning-aided SCMA,” IEEE Commun. Lett., vol. 22, no. 4, pp. 720–723, Apr. 2018.
[94] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “Deep learning-based channel estimation for beamspace mmWave massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 852–855, Oct. 2018.
[95] M. Soltani, V. Pourahmadi, A. Mirzaei, and H. Sheikhzadeh, “Deep learning-based channel estimation,” IEEE Commun. Lett., vol. 23, no. 4, pp. 652–655, Apr. 2019.
[96] C.-J. Chun, J.-M. Kang, and I.-M. Kim, “Deep learning-based joint pilot design and channel estimation for multiuser MIMO channels,” IEEE Commun. Lett., vol. 23, no. 11, pp. 1999–2003, Nov. 2019.
[97] C.-J. Chun, J.-M. Kang, and I.-M. Kim, “Deep learning-based channel estimation for massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 8, no. 4, pp. 1228–1231, Aug. 2019.
[98] Y. Liao, Y. Hua, X. Dai, H. Yao, and X. Yang, “ChanEstNet: A deep learning based channel estimation for high-speed scenarios,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2019, pp. 1–6.
[99] K. E. Kolodziej, B. T. Perry, and J. S. Herd, “In-band full-duplex technology: Techniques and systems survey,” IEEE Trans. Microw. Theory Techn., vol. 67, no. 7, pp. 3025–3041, Jul. 2019.
[100] A. Balatsoukas-Stimming, “Non-linear digital self-interference cancellation for in-band full-duplex radios using neural networks,” in Proc. IEEE 19th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2018, pp. 1–5.
[101] Y. Kurzo, A. T. Kristensen, A. Burg, and A. Balatsoukas-Stimming, “Hardware implementation of neural self-interference cancellation,” 2020, arXiv:2001.04543. [Online]. Available: http://arxiv.org/abs/2001.04543
[102] O. Ploder, O. Lang, T. Paireder, and M. Huemer, “An adaptive machine learning based approach for the cancellation of second-order-intermodulation distortions in 4G/5G transceivers,” in Proc. IEEE 90th Veh. Technol. Conf. (VTC-Fall), Sep. 2019, pp. 1–7.
[103] C. Tarver, A. Balatsoukas-Stimming, and J. R. Cavallaro, “Design and implementation of a neural network based predistorter for enhanced mobile broadband,” 2019, arXiv:1907.00766. [Online]. Available: http://arxiv.org/abs/1907.00766
[104] M. Rawat, K. Rawat, and F. M. Ghannouchi, “Adaptive digital predistortion of wireless power amplifiers/transmitters using dynamic real-valued focused time-delay line neural networks,” IEEE Trans. Microw. Theory Techn., vol. 58, no. 1, pp. 95–104, Jan. 2010.
[105] D. Wang, M. Aziz, M. Helaoui, and F. M. Ghannouchi, “Augmented real-valued time-delay neural network for compensation of distortions and impairments in wireless transmitters,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 1, pp. 242–254, Jan. 2019.


[106] Y. Yang, F. Gao, X. Ma, and S. Zhang, “Deep learning-based channel estimation for doubly selective fading channels,” IEEE Access, vol. 7, pp. 36579–36589, 2019.
[107] Y. Yang, F. Gao, G. Y. Li, and M. Jian, “Deep learning-based downlink channel prediction for FDD massive MIMO system,” IEEE Commun. Lett., vol. 23, no. 11, pp. 1994–1998, Nov. 2019.
[108] F. Gustafsson and F. Gunnarsson, “Mobile positioning using wireless networks: Possibilities and fundamental limitations based on available wireless network measurements,” IEEE Signal Process. Mag., vol. 22, no. 4, pp. 41–53, Jul. 2005.
[109] H. Liu, H. Darabi, P. Banerjee, and J. Liu, “Survey of wireless indoor positioning techniques and systems,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 37, no. 6, pp. 1067–1080, Nov. 2007.
[110] S. Kumar, S. Gil, D. Katabi, and D. Rus, “Accurate indoor localization with zero start-up cost,” in Proc. 20th Annu. Int. Conf. Mobile Comput. Netw. (MobiCom), Sep. 2014, pp. 483–494.
[111] K. Kaemarungsi and P. Krishnamurthy, “Modeling of indoor positioning systems based on location fingerprinting,” in Proc. IEEE Int. Conf. Comput. Commun. (INFOCOM), vol. 2, Mar. 2004, pp. 1012–1022.
[112] K. Wu, J. Xiao, Y. Yi, D. Chen, X. Luo, and L. M. Ni, “CSI-based indoor localization,” IEEE Trans. Parallel Distrib. Syst., vol. 24, no. 7, pp. 1300–1309, Jul. 2013.
[113] Y. Chapre, A. Ignjatovic, A. Seneviratne, and S. Jha, “CSI-MIMO: An efficient Wi-Fi fingerprinting using channel state information with MIMO,” Pervas. Mobile Comput., vol. 23, pp. 89–103, Oct. 2015.
[114] A. Smailagic, J. Small, and D. P. Siewiorek, “Determining user location for context aware computing through the use of a wireless LAN infrastructure,” Inst. Complex Engineered Syst., Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep. 15213, Dec. 2000, Art. no. 15213.
[115] L. Tang, R. Ghods, and C. Studer, “Reducing the complexity of fingerprinting-based positioning using locality-sensitive hashing,” 2019, arXiv:1912.00831. [Online]. Available: http://arxiv.org/abs/1912.00831
[116] A. Gionis, P. Indyk, and R. Motwani, “Similarity search in high dimensions via hashing,” in Proc. Very Large Data Base (VLDB), Sep. 1999, vol. 99, no. 6, pp. 518–529.
[117] S. Har-Peled, P. Indyk, and R. Motwani, “Approximate nearest neighbor: Towards removing the curse of dimensionality,” Theory Comput., vol. 8, no. 1, pp. 321–350, Jul. 2012.
[118] X. Wang, L. Gao, and S. Mao, “CSI phase fingerprinting for indoor localization with a deep learning approach,” IEEE Internet Things J., vol. 3, no. 6, pp. 1113–1123, Dec. 2016.
[119] J. Vieira, E. Leitinger, M. Sarajlic, X. Li, and F. Tufvesson, “Deep convolutional neural networks for massive MIMO fingerprint-based positioning,” in Proc. IEEE 28th Annu. Int. Symp. Pers., Indoor, Mobile Radio Commun. (PIMRC), Oct. 2017, pp. 1–6.
[120] M. Arnold, S. Dörner, S. Cammerer, and S. ten Brink, “On deep learning-based massive MIMO indoor user localization,” in Proc. IEEE 19th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2018, pp. 1–5.
[121] E. Lei, O. Castañeda, O. Tirkkonen, T. Goldstein, and C. Studer, “Siamese neural networks for wireless positioning and channel charting,” in Proc. 57th Annu. Allerton Conf. Commun., Control, Comput. (Allerton), Sep. 2019, pp. 200–207.
[122] C. Studer, S. Medjkouh, E. Gönültas, T. Goldstein, and O. Tirkkonen, “Channel charting: Locating users within the radio environment using channel state information,” IEEE Access, vol. 6, pp. 47682–47698, 2018.
[123] L. J. P. van der Maaten, E. O. Postma, and H. J. van den Herik, “Dimensionality reduction: A comparative review,” J. Mach. Learn. Res., vol. 10, nos. 1–41, pp. 66–71, Oct. 2009.
[124] J. W. Sammon, “A nonlinear mapping for data structure analysis,” IEEE Trans. Comput., vol. C-18, no. 5, pp. 401–409, May 1969.
[125] J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah, “Signature verification using a ‘Siamese’ time delay neural network,” in Proc. Int. Conf. Neural Inf. Process. Syst. (NIPS), Jan. 1994, pp. 737–744.
[126] P. Huang et al., “Improving channel charting with representation-constrained autoencoders,” in Proc. IEEE 20th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jul. 2019, pp. 1–5.
[127] X. You, H. Yin, and H. Wu, “On 6G and wide-area IoT,” Chin. J. Internet Things, vol. 4, no. 1, pp. 3–11, Mar. 2020.
[128] S. Schibisch, S. Cammerer, S. Dörner, J. Hoydis, and S. ten Brink, “Online label recovery for deep learning-based communication through error correcting codes,” in Proc. 15th Int. Symp. Wireless Commun. Syst. (ISWCS), Aug. 2018, pp. 1–5.
[129] L. Lugosch and W. J. Gross, “Learning from the syndrome,” in Proc. 52nd Asilomar Conf. Signals, Syst., Comput., Oct. 2018, pp. 594–598.
[130] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. Cognit. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017.
[131] S. Dörner, S. Cammerer, J. Hoydis, and S. ten Brink, “Deep learning based communication over the air,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 132–143, Feb. 2018.

Chuan Zhang (Member, IEEE) received the B.E. degree (summa cum laude) in microelectronics and the M.E. degree (Hons.) in very-large scale integration (VLSI) design from Nanjing University, Nanjing, China, in 2006 and 2009, respectively, and the M.S.E.E. and Ph.D. degrees from the Department of Electrical and Computer Engineering, University of Minnesota, Twin Cities (UMN), USA, in 2012. He is currently the Excellence Professor and the Purple Mountain Professor with Southeast University. He is also with the LEADS, National Mobile Communications Research Laboratory, Quantum Information Center, Southeast University, and the Purple Mountain Laboratories, Nanjing. His current research interests include low-power high-speed VLSI design for digital signal processing and digital communication, bio-chemical computation and neuromorphic engineering, and quantum communication.
Dr. Zhang is a member of the Seasonal School of Signal Processing and Design and Implementation of Signal Processing Systems TC of the IEEE Signal Processing Society, and Circuits and Systems for Communications TC, VLSI Systems and Applications TC, and Digital Signal Processing TC of the IEEE Circuits and Systems Society. He is also the Secretary-Elect of the Circuits and Systems for Communications TC of the IEEE Circuits and Systems Society. He received the Best Contribution Award of the IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) in 2018, the Best Paper Award in 2016, the Best (Student) Paper Award of the IEEE International Conference on DSP in 2016, three best (student) paper awards of the IEEE International Conference on ASIC in 2015, 2017, and 2019, the Best Paper Award Nomination of the IEEE Workshop on Signal Processing Systems in 2015, three excellent paper awards and two excellent poster presentation awards of the International Collaboration Symposium on Information Production and Systems from 2016 to 2018, the Outstanding Achievement Award of the Intel Collaborative Research Institute in 2018, and the Merit (Student) Paper Award of the IEEE APCCAS in 2008. He also received the Three-Year University-Wide Graduate School Fellowship of UMN and the Doctoral Dissertation Fellowship of UMN. He also serves as an Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING and the IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS. He has twice served as a Corresponding Guest Editor of the IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS.

Yeong-Luh Ueng (Senior Member, IEEE) received the Ph.D. degree in communication engineering from National Taiwan University, Taipei, Taiwan, in 2001. From 2001 to 2005, he was with a private communication technology company, where he focused on the design and development of various wireless chips. Since 2005, he has been a member of the Faculty of National Tsing Hua University, Hsinchu, Taiwan, where he is currently a Full Professor with the Department of Electrical Engineering and the Institute of Communications Engineering. His research interests include coding theory, wireless communications, and communication ICs. In 2016, he received the Wu Ta-You Memorial Award from the Ministry of Science and Technology (MOST). In 2018, he received the Outstanding Electrical Engineering Professor Award from the Chinese Electrical Engineering Association and the Outstanding Research Award from the Ministry of Science and Technology (MOST), Taiwan.


Christoph Studer (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from ETH Zürich, Switzerland, in 2009.
In 2005, he was a Visiting Researcher with the Smart Antennas Research Group, Stanford University. From 2006 to 2009, he was a Research Assistant at the Integrated Systems Laboratory and the Communication Technology Laboratory (CTL), ETH Zürich. From 2009 to 2012, he was a Post-Doctoral Researcher at CTL, ETH Zürich, and the Digital Signal Processing Group, Rice University. In 2013, he held a research scientist position at Rice University, where he has been an Adjunct Professor since 2014. From 2014 to 2019, he was an Assistant Professor at Cornell University, and from 2019 to 2020, he was an Associate Professor at Cornell University and at Cornell Tech, New York City. He is currently an Associate Professor with the Department of Information Technology and Electrical Engineering, ETH Zürich, Switzerland. His research interests include the design of very large-scale integration (VLSI) circuits, as well as wireless communications, signal and image processing, optimization, and machine learning.
Dr. Studer received the ETH medals for his M.S. and Ph.D. theses in 2006 and 2009, respectively, the Swiss National Science Foundation Fellowship for Advanced Researchers in 2011, and the U.S. National Science Foundation CAREER Award in 2017. He won the Michael Tien ’72 Excellence in Teaching Award from the College of Engineering, Cornell University, in 2016. He shared the Swisscom/ICTnet Innovations Award in 2010 and 2013. He was the Winner of the Student Paper Contest of the 2007 Asilomar Conference on Signals, Systems, and Computers, received the Best Student Paper Award of the 2008 IEEE International Symposium on Circuits and Systems (ISCAS), and shared the Best Live Demonstration Award at the IEEE ISCAS in 2013. In 2019, he was the Technical Program Chair of the Asilomar Conference on Signals, Systems, and Computers, and the Technical Program Co-Chair of the IEEE International Workshop on Signal Processing Systems (SiPS). He is currently an Associate Editor of the IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS.

Andreas Burg (Member, IEEE) was born in Munich, Germany, in 1975. He received the Dipl.Ing. degree from the Swiss Federal Institute of Technology (ETH) Zürich, Zürich, Switzerland, in 2000, and the Dr.Sc.Techn. degree from the Integrated Systems Laboratory, ETH Zürich, in 2006.
In 1998, he was with Siemens Semiconductors, San Jose, CA, USA. During his Ph.D. studies, he was with Bell Labs Wireless Research for one year. From 2006 to 2007, he was a Post-Doctoral Researcher with the Integrated Systems Laboratory and with the Communication Theory Group, ETH Zürich. In 2007, he co-founded Celestrius, an ETH spinoff in the field of MIMO wireless communication, where he was responsible for the ASIC development as the Director for VLSI. In 2009, he joined ETH Zürich as an SNF Assistant Professor and as the Head of the Signal Processing Circuits and Systems Group, Integrated Systems Laboratory. In 2011, he became a Tenure Track Assistant Professor with the École Polytechnique Fédérale de Lausanne (EPFL), where he is currently leading the Telecommunications Circuits Laboratory. He was promoted to Associate Professor in June 2018. He is also a member of the EURASIP SAT SPCN and of the IEEE TC-DISPS. In 2000, he received the Willi Studer Award and the ETH Medal for his diploma and his diploma thesis, respectively. He also received the ETH Medal for his Ph.D. dissertation in 2006. In 2008, he received a four-year grant from the Swiss National Science Foundation (SNF) for an SNF Assistant Professorship. With his students, he received the Best Paper Award from the EURASIP Journal on Image and Video Processing in 2013 and the best demo/paper awards at ISCAS 2013, ICECS 2013, and ACSSC 2007. He has served on the TPCs of various conferences on signal processing, communications, and VLSI. He was the TPC Co-Chair of VLSI-SoC 2012, ESSCIRC 2016, and SiPS 2017. He was also the General Chair of ISLPED 2019. He served as an Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS in 2013 and on the Editorial Board of the Microelectronics Journal (Springer). He is an Editor of the Journal of Signal Processing Systems (Springer) and the Journal of Low Power Electronics and Applications (MDPI), and an Associate Editor of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION.
