Artificial Intelligence For 5G and Beyond 5G: Implementations, Algorithms, and Optimizations
Authorized licensed use limited to: UNIVERSITAS TELKOM. Downloaded on December 06,2021 at 07:57:00 UTC from IEEE Xplore. Restrictions apply.
150 IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, VOL. 10, NO. 2, JUNE 2020
complexity, and produce results comparable to and, in some cases, surpassing conventional approaches [9].

Nowadays, research combining AI with 5G/B5G has drawn significant attention from both academia and industry. Although some related initiatives have been named, suitable algorithms, implementations, and optimizations are still incomplete and in their infancy. The significant potential of AI is expected to advance network architectures [10], signal processing solutions [11], semiconductor technologies [12], as well as system-level optimization [13]. Recently, a number of special issues of IEEE Transactions [14]–[17], special sessions at key IEEE conferences [18]–[23], and schools and tutorials [24] have appeared on this emerging topic. In contrast, this special issue and this overview paper emphasize “AI for 5G and B5G” related algorithms, implementations, and optimizations to provide an overview of recent progress from a circuits and systems perspective. We note that instead of simply reviewing the literature, we focus on representative results and outline the technical foundation of these techniques. Though many algorithm-level results have been published recently, we focus only on those close to circuit and system implementations. We refer interested readers to [25]–[27] for surveys from the communication society (ComSoc) perspective.

The remainder of this overview paper is organized as follows. Section II reviews algorithms and implementations for AI-based massive multiple-input multiple-output (MIMO) detection and precoding. Section III discusses AI-based channel coding. Section IV summarizes AI-based processing for other baseband modules. Section V introduces machine learning (ML) methods that deal with model uncertainties, including AI-based full-duplex SI cancellation and RF/PA linearization. Section VI summarizes methods for AI-based device localization. Section VII investigates future research directions. Section VIII concludes this overview.

II. AI-BASED MIMO DETECTION AND PRECODING

Due to its advantages in spectral efficiency (SE), energy efficiency (EE), and quality-of-service, MIMO technology is a core component of many wireless systems. However, efficient implementations of MIMO detection and precoding must always trade off performance versus complexity in order to cope with the NP-hardness of the underlying problems and the high problem dimensions. Existing solutions, such as linear, linear iterative, and message passing solvers, have been reported in the literature. Recently, research has focused on AI-based solutions for MIMO detection and precoding in order to improve performance, complexity, and robustness.

A. AI-Based MIMO Detection

1) Introduction: We consider a narrow-band MIMO communication system with $N_t$ transmitting antennas and $N_r$ receiving antennas. The narrow-band system model is as follows:

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n}. \tag{1}$$

Here, $\mathbf{y} \in \mathbb{C}^{N_r}$ is the received vector, $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is the channel matrix, $\mathbf{s} \in \Omega^{N_t}$ is the transmitted vector, the constellation of each component $s_i$ is denoted by $\Omega$, and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2 \mathbf{I}_{N_r})$ models zero-mean additive white Gaussian noise (AWGN). We assume perfect channel state information (CSI) at the receiver. Unfortunately, exact maximum likelihood detection is infeasible in practice for systems with a large number of transmitters and higher-order modulation schemes. Therefore, a variety of approximate detection algorithms have been proposed in the past; see Fig. 1 for a classification of the most prominent methods.

The linear algorithms, such as zero forcing (ZF) and minimum mean squared error (MMSE)-based equalization, and linear iterative methods, e.g., the Gauss-Seidel (GS), coordinate descent (CD), successive over-relaxation (SOR), and steepest descent (SD) algorithms, have low complexity. Nevertheless, their performance is often unacceptable under realistic propagation conditions. Nonlinear detection algorithms, such as tree search, interference cancellation (IC), and message passing, can achieve near-optimal performance but at significantly higher complexity than linear methods. In particular, message-passing-based methods often suffer severe performance degradation under realistic channels, which motivates the development of AI-based solutions. The pros and cons of different MIMO detection algorithms are summarized in Table I.

2) Deep Learning Based Linear Detection: A neural network architecture called DetNet, proposed in [28], is inspired by iterative projected gradient descent and can be described by the following procedure:

\begin{align}
\mathbf{q}_k &= \hat{\mathbf{s}}_{k-1} - \theta_k^{(1)} \mathbf{H}^H \mathbf{y} + \theta_k^{(2)} \mathbf{H}^H \mathbf{H} \hat{\mathbf{s}}_{k-1}, \\
\mathbf{z}_k &= \mathrm{ReLU}\left( \boldsymbol{\Theta}_k^{(1)} \begin{bmatrix} \mathbf{q}_k \\ \mathbf{v}_{k-1} \end{bmatrix} + \boldsymbol{\theta}_k^{(1)} \right), \\
\hat{\mathbf{s}}_k &= \boldsymbol{\Theta}_k^{(2)} \mathbf{z}_k + \boldsymbol{\theta}_k^{(2)}, \\
\mathbf{v}_k &= \boldsymbol{\Theta}_k^{(3)} \mathbf{z}_k + \boldsymbol{\theta}_k^{(3)}, \tag{2}
\end{align}

where $\hat{\mathbf{s}}_k$ is the estimate of $\mathbf{s}$ at the $k$-th iteration, $\mathrm{ReLU}(x) = \max(x, 0)$ is the ReLU activation function, $\{\theta_k^{(1)}, \theta_k^{(2)}, \boldsymbol{\theta}_k^{(1)}, \boldsymbol{\theta}_k^{(2)}, \boldsymbol{\theta}_k^{(3)}, \boldsymbol{\Theta}_k^{(1)}, \boldsymbol{\Theta}_k^{(2)}, \boldsymbol{\Theta}_k^{(3)}\}_{k=1}^{L}$ are the trainable parameters, and $L$ is the total number of layers of the neural network. DetNet performs very well when the channel matrix is i.i.d. complex Gaussian. However, this method yields suboptimal performance under realistic channel conditions, such as the 3GPP 3D MIMO channel model or clustered delay line (CDL) channel models. Furthermore, other limitations of DetNet exist. First, a large number of network parameters make the
ZHANG et al.: AI FOR 5G AND B5G: IMPLEMENTATIONS, ALGORITHMS, AND OPTIMIZATIONS 151
TABLE I
COMPARISON OF MIMO DETECTORS
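As a point of reference for the linear detectors discussed in Section II-A, the following minimal sketch applies ZF and MMSE equalization to the model in Eq. (1) and then slices each soft estimate to the nearest constellation point. It is an illustration under stated assumptions, not an implementation taken from the cited works: the QPSK alphabet, the small 4×2 channel, the noise samples, and all helper names (`hermitian`, `matmul`, `solve`, `linear_detect`) are hypothetical choices, and only the Python standard library is used.

```python
def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0j] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def linear_detect(y, H, sigma2, omega, es=1.0, mmse=True):
    """ZF (mmse=False) or MMSE (mmse=True) equalization plus hard slicing.

    Solves (H^H H + (sigma2/es) I) s_soft = H^H y; the regularization term
    is dropped for ZF. Returns (hard decisions, soft estimates).
    """
    Hh = hermitian(H)
    G = matmul(Hh, H)  # Gram matrix H^H H
    if mmse:
        for i in range(len(G)):
            G[i][i] += sigma2 / es
    rhs = [sum(h * v for h, v in zip(row, y)) for row in Hh]  # H^H y
    soft = solve(G, rhs)
    hard = [min(omega, key=lambda a: abs(a - z)) for z in soft]
    return hard, soft

# Assumed example: unit-energy QPSK, Nr = 4 receive and Nt = 2 transmit antennas.
OMEGA = [complex(a, b) / 2 ** 0.5 for a in (1, -1) for b in (1, -1)]
H = [[1 + 0.5j, 0.2 - 0.1j],
     [0.3 + 0.2j, 0.9 - 0.4j],
     [-0.4 + 0.1j, 0.6 + 0.7j],
     [0.8 - 0.3j, -0.2 + 0.5j]]
s = [OMEGA[0], OMEGA[3]]
noise = [0.01 + 0.02j, -0.02 + 0.01j, 0.015 - 0.01j, -0.01 - 0.015j]
y = [sum(h * x for h, x in zip(row, s)) + w for row, w in zip(H, noise)]

s_zf, _ = linear_detect(y, H, sigma2=1e-3, omega=OMEGA, mmse=False)
s_mmse, _ = linear_detect(y, H, sigma2=1e-3, omega=OMEGA, mmse=True)
```

Both detectors share the same Gram-matrix solve; the only difference is the diagonal loading, which is why regularized inverses of this form recur throughout linear detection and precoding. In a practical design the explicit solve would typically be replaced by one of the iterative methods (GS, CD, SOR, SD) named above.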
B. Multi-User (MU) MIMO Precoding

AI-based methods can also be used for multi-user (MU) precoding in massive MIMO systems [36]. The goal of MU-MIMO precoding is to transmit constellation points to the $U$ user equipments (UEs) while adhering to a power constraint and avoiding interference caused by the channel between the basestation (BS) and the UEs. Linear precoding methods multiply the transmit vector $\mathbf{s} \in \Omega^U$, where $\Omega$ is the constellation set, by a precoding matrix $\mathbf{P} \in \mathbb{C}^{B \times U}$, where $B$ is the number of BS antennas, so that the signals received at the UEs minimize a suitably-defined cost function (e.g., the mean-square error). Let us assume the following narrowband input-output relation in the massive MU-MIMO downlink:

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}. \tag{5}$$

Here, $\mathbf{y} \in \mathbb{C}^U$ contains the receive signals at each of the $U$ UEs, $\mathbf{H} \in \mathbb{C}^{U \times B}$ is the downlink MIMO channel matrix, $\mathbf{x} \in \mathbb{C}^B$ is the precoded vector, and $\mathbf{n} \in \mathbb{C}^U$ models thermal noise at the UEs with variance $N_0$ per entry. Linear precoding simply computes $\mathbf{x} = \mathbf{P}\mathbf{s}$, where one of the most common precoding matrices is the Wiener-filter precoder $\mathbf{P}^{\mathrm{WF}} = \beta^{\mathrm{WF}} \mathbf{Q}^{\mathrm{WF}}$, which minimizes the UE-side MSE, and is given by [37]

$$\mathbf{Q}^{\mathrm{WF}} = \left( \mathbf{H}^H \mathbf{H} + \frac{U N_0}{\rho^2} \mathbf{I}_B \right)^{-1} \mathbf{H}^H. \tag{6}$$

Here, $\beta^{\mathrm{WF}} = \mathrm{tr}\big((\mathbf{Q}^{\mathrm{WF}})^H \mathbf{Q}^{\mathrm{WF}}\big) E_s / \rho^2$ ensures that the average power constraint $\mathbb{E}\{\|\mathbf{x}\|^2\} = \rho^2$ is met and $E_s$ is the symbol power [38]. Since Eq. (6) is a regularized matrix inverse, the precoding operation $\mathbf{x} = \mathbf{P}\mathbf{s}$ can be implemented using the same techniques proposed for linear detection in Section II. Related precoding methods have been proposed for mmWave massive MIMO systems that build on hybrid analog-digital architectures [39].

Nonlinear MU-MIMO precoding methods have been proposed in [40] for massive MU-MIMO systems that use coarsely-quantized digital-to-analog converters (DACs) at the BS. For 1-bit DACs, each entry of the transmit vector $\mathbf{x}$ is constrained to the quaternary set $\{\pm\alpha \pm i\alpha\}$, where $\alpha^2 = \rho^2/(2B)$ enforces the power constraint. In order to compute an MSE-optimal precoding vector $\mathbf{x}$, one can resort to the iterative algorithm put forward in [41] combined with deep unfolding. The iterative nonlinear precoding algorithm proposed in [41] performs the following sequence of operations

\begin{align}
\mathbf{z}^{(t+1)} &= \mathbf{x}^{(t)} - \tau^{(t)} \mathbf{A}^H \mathbf{A} \mathbf{x}^{(t)}, \tag{7} \\
\mathbf{x}^{(t+1)} &= \mathrm{prox}_g\big(\mathbf{z}^{(t+1)}; \eta^{(t)}, \xi^{(t)}\big), \tag{8}
\end{align}

for the iterations $t = 1, \ldots, T$. Here, $\mathbf{A} = \big(\mathbf{I} - \mathbf{s}\mathbf{s}^H / \|\mathbf{s}\|_2^2\big)\mathbf{H}$ and the proximal operator [42]

$$\mathrm{prox}_g(x; \eta, \xi) = \mathrm{clip}(\eta \Re\{x\}, \xi) + i\,\mathrm{clip}(\eta \Im\{x\}, \xi) \tag{9}$$

is applied element-wise to the vector $\mathbf{z}^{(t+1)}$ and clips the real and imaginary parts to the interval $[-\xi, +\xi]$. As shown in [40], one can now unfold this iterative procedure for a fixed number of iterations $T$ and learn the algorithm parameters from data. Specifically, Eq. (7) corresponds to a linear layer with given (and fixed) weights $\mathbf{A}$ and a trainable per-iteration step-size parameter $\tau^{(t)}$; Eq. (8) can be interpreted as a nonlinear activation function with trainable parameters $\eta^{(t)}$ and $\xi^{(t)}$. The parameters $\{\tau^{(t)}, \eta^{(t)}, \xi^{(t)}\}_{t=1}^{T}$ can now be learned using neural network learning tools, where a typical cost function is to minimize the UE-side MSE $\|\mathbf{s} - \mathbf{H}\mathbf{x}^{(T+1)}\|^2$ over a large number of signal, channel, and noise realizations. For the same MSE performance, this learning-based approach was shown to reduce the number of algorithm iterations $T$ by 2× over the original method in [41], which uses manually-tuned parameters $\{\tau, \eta, \xi\}$ that remain constant over the iterations.

III. AI-BASED CHANNEL CODING

Recent results in AI-based channel decoding have shown that the performance of existing methods can (often significantly) be improved. While some results use neural networks to train the scaling parameters of conventional algorithms, some apply neural networks to post-processing of the decoders’ output, and others try to directly decode with neural networks via one-shot decoding. AI-aided code construction is another area where ML techniques have been used. Since both low-density parity-check (LDPC) codes and polar codes have been standardized for 5G, most recent research focuses on these types of codes.

A. LDPC and Polar Decoding Algorithms

The belief-propagation (BP) algorithm, also known as the sum-product algorithm (SPA) [43], is a message passing algorithm that iteratively decodes LDPC codes while providing near-optimal error-rate performance. However, due to the rather high complexity of the BP algorithm, there exist several hardware-friendly simplifications, such as the min-sum (MS) algorithm [44], the normalized MS (NMS) algorithm [45], and the offset MS (OMS) algorithm [45]. Polar codes are able to asymptotically achieve the Shannon capacity using the successive cancellation (SC) decoding algorithm as the code length approaches infinity [46]. The SC decoding algorithm sequentially estimates the information bits based on the received channel log-likelihood ratios (LLRs) and the encoding structure. Other than the SC decoding algorithm, the BP algorithm can also decode polar codes by passing LLR values based on the encoding graph in an iterative fashion. Compared with the SC decoding algorithm, BP decoding provides higher throughput and shorter latency, but suffers from a substantial error-rate performance degradation. For both SC decoding and BP decoding, research is working towards a balance between error-rate performance and complexity. Since some of the underlying mechanisms are difficult to model, AI has recently been considered to optimize this tradeoff.

B. Optimized BP Decoders Based on Neural Networks

There exist several results devoted to applying neural networks to optimize simplified BP decoders by modeling their architectures using NNs and training the parameters of these decoders in order to improve the decoding performance.

A linear approximation min-sum (LAMS) algorithm to decode LDPC codes is proposed in [47]. In the LAMS algorithm, every check node output or channel value is multiplied by a normalization factor and biased by an offset. To optimize these parameters, a three-layer neural network based on the Tanner graph is constructed to model the check node update and the variable node update functions within one iteration, and the parameters are iteratively optimized (as shown in Fig. 3) using stochastic gradient descent (SGD). Simulation results show that the optimized LAMS algorithm is able to outperform
Fig. 3. The iterative procedures for optimizing the normalization and offset
factors in each iteration proposed in [47].
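To make the role of the normalization and offset factors concrete, the sketch below implements a generic min-sum check-node update in which the outgoing magnitude is scaled by a factor `alpha` and reduced by an offset `beta`. This is the textbook combination of the NMS and OMS rules, given here as an assumption for illustration; the exact LAMS update of [47] is not reproduced, and the function name, parameter names, and example LLR values are hypothetical. In a learned decoder, `alpha` and `beta` would be the per-iteration quantities optimized by SGD.

```python
def check_node_update(llrs_in, alpha=1.0, beta=0.0):
    """Min-sum check-node update with normalization alpha and offset beta.

    llrs_in: incoming variable-to-check LLRs on the edges of one check node.
    Returns one outgoing LLR per edge, each computed from all *other* edges.
    alpha=1, beta=0 gives plain min-sum; alpha<1 gives NMS; beta>0 gives OMS.
    """
    out = []
    for i in range(len(llrs_in)):
        others = llrs_in[:i] + llrs_in[i + 1:]
        # Sign of the outgoing message: product of the signs of the other edges.
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        # Magnitude: smallest |LLR| among the other edges, scaled and offset,
        # then floored at zero so the offset never flips the sign.
        magnitude = min(abs(v) for v in others)
        magnitude = max(alpha * magnitude - beta, 0.0)
        out.append(sign * magnitude)
    return out

incoming = [2.0, -0.5, 3.0, -1.5]
ms = check_node_update(incoming)                # [0.5, -1.5, 0.5, -0.5]
nms = check_node_update(incoming, alpha=0.75)   # [0.375, -1.125, 0.375, -0.375]
oms = check_node_update(incoming, beta=0.25)    # [0.25, -1.25, 0.25, -0.25]
```

Both corrections shrink the min-sum magnitudes, which compensates for the fact that plain min-sum overestimates the reliability computed by the sum-product rule. Hardware implementations usually track only the two smallest input magnitudes instead of recomputing the minimum per edge; the quadratic loop here is kept for readability.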
where $\mathbf{h}_j = (h_{1j}, \ldots, h_{Kj})^T$ denotes the channel vectors and $\mathbf{n}$ is the channel noise.

The performance of SCMA depends heavily on the channel estimation quality. Specifically, existing SCMA detection algorithms, such as the deterministic message passing algorithm (DMPA) [87], require precise channel estimates. Once the channel estimates are inaccurate or the channel conditions are unknown, the performance of SCMA will degrade significantly. To estimate the CSI, [88] designs a new kind of linear estimator and then proposes a novel CSI detector based on this estimator. Accurate CSI ensures that the interference on the clean signal is as small as possible and that decoding is as accurate as possible in uplink NOMA systems. However, it cannot be denied that it is difficult to obtain the CSI with traditional methods. Besides, the design of a suitable codebook is also an essential factor affecting the performance of SCMA. Many techniques for manually designing codebooks, such as considering the distance of each constellation [89], [90] and the phase between the constellations [91], still suffer from many problems. Hence, codebook design is also a tricky issue.

In recent years, with the rapid development of deep learning, many challenging problems can be solved effectively via deep learning. The problems that arise in NOMA or SCMA systems mentioned above can also be addressed by deep learning.

Here, $\theta_g$ denotes the weights and the biases of the decoder, $\mathbf{y}$ denotes the signal at the receiver, and $\mathbf{r}$ denotes the original signal. By learning the mapping between the input data and the constellation plane with the aid of a DNN, i.e., the weights and the biases of the DNN, the construction of the DNN decoder is transformed into a parameter optimization problem. The DNN-based decoder can be composed of DNN units, which are separated from each other. A combination of all information can be achieved through a fully connected (FC) network. Using stochastic gradient descent (SGD), the weights and biases can be updated. The adaptive D-SCMA system overcomes the shortcomings of manual codebook design in previous work.

Other advantages of D-SCMA, such as performance and computational complexity, are also shown in [93] through simulations. These results indicate that designing the codebook and decoder of sparse multidimensional signals such as SCMA with the aid of DL is advantageous.

C. Channel Estimation

In wireless communication systems, coherent detection and precoding require accurate estimates of the channel’s transfer function. In practice, the channel is typically estimated
combines an LMS filter with a neural network to remove 2nd-order intermodulation interference in 5G transceivers.

B. RF/PA Linearization

The second example of exploiting the expressive power of neural networks for working with systems that have no well-defined model is closely related to the work on IBFD communication. In [103], the authors propose to use a neural network to perform pre-distortion to reduce the negative impact of power amplifier (PA) nonlinearities on the quality of the transmitted signal. This task captures both the difficulty of exactly modeling those nonlinearities and the design of a corresponding inverse with limited complexity. The results presented in the paper indicate that, again, ML techniques can at least match the performance of conventional techniques with an overall significantly lower complexity. Two further works [104], [105] employ real-valued time-delay neural networks for distortion compensation in PA linearization.

C. Estimation and Tracking of Fading Channels

A third example of an application where closed-form, broadly applicable models are often difficult to reason about is the behavior of fading wide-band channels. Conventional estimation techniques to track such rapidly changing channels need strong assumptions on their statistical properties, which in practice are often not valid. Unfortunately, the diversity of such systems makes it hard to come up with more precise models that are not only broadly applicable, but also allow for the application of established estimation algorithms. ML techniques can again provide a more model-agnostic approach to solve this difficult problem. An example is the work in [106], which uses a DNN to predict the channel based on pilots and previous channel estimates.

Reference [107] has also shown that ML can help to exploit physically motivated relationships that are not well captured in sufficiently simple and closed-form models. In this paper, the authors exploit the clearly existing, but complex relationship between the up-link and down-link channels in a frequency-division duplex massive MIMO system to predict one from the other in order to reduce training overhead. To this end, the employed sparse complex-valued neural network approximates the up-link-to-down-link mapping function, which is a difficult problem for conventional algorithms.

VI. AI-BASED LOCALIZATION

ML techniques are currently finding widespread application for localization purposes in wireless networks [108]–[110]. We next summarize methods that require supervision while learning from large datasets as well as self-supervised techniques that provide localization and prediction capabilities without the need for expensive measurement campaigns.

ToF measurements from multiple satellites to infer location. In scenarios that lack LoS connectivity (which is the case for indoor scenarios) or for devices where GNSS receivers are unavailable (which is the case for ultra-low-power sensors), alternative positioning methods are required.

1) Location Fingerprinting Basics: Fingerprinting is a widespread approach for device localization under challenging propagation conditions [109], [111]–[113]. This approach performs localization in two phases. In the first phase, CSI $\{\mathbf{f}_n\}_{n=1}^{N} = \mathcal{F}$ and associated ground-truth location information $\{\mathbf{x}_n\}_{n=1}^{N} = \mathcal{X}$ are measured and stored in a database. Here, the vectors $\mathbf{f}_n \in \mathbb{R}^D$ correspond to $D$-dimensional CSI-fingerprints at positions $\mathbf{x}_n \in \mathbb{R}^d$, where the dimension $D$ is typically large and $d$ is two or three. CSI-fingerprints $\mathbf{f}_n$ can contain RSS measured at multiple antennas, power-delay profiles, angle-of-arrival, and many others [109]. In the second phase, an estimate for the location $\mathbf{x}_{n'}$ of a new transmitting device with index $n'$ is generated. After extracting a CSI-fingerprint $\mathbf{f}_{n'}$ from this transmitter, the indices associated with the $K$ most similar fingerprints in the database $\{\mathbf{f}_n\}_{n=1}^{N}$ are identified. One can then estimate the location of transmitter $n'$ from the set of similar locations via averaging [114].

While the K-nearest neighbor search (KNNS) in the CSI-fingerprint database can provide a simple (and often accurate) estimate of the transmitter’s location, the complexity of KNNS in large fingerprinting databases as well as the storage of the fingerprints can quickly become a bottleneck. To address the complexity bottleneck, reference [115] proposed the use of locality-sensitive hashing (LSH) [116], [117], a powerful method for approximate KNNS that is widely used in ML applications. The idea of LSH is to construct hash functions for which similar datapoints have matching hash values and dissimilar datapoints have mismatched hash values. This hash function is then applied to all points in the CSI-fingerprint dataset, i.e., $\{h(\mathbf{f}_n)\}_{n=1}^{N}$. For a new query point $\mathbf{f}_{n'}$, one computes $h(\mathbf{f}_{n'})$ and compares the resulting hash value to those in the dataset. One can then compare the true, high-dimensional CSI feature dissimilarity $d(\mathbf{f}_m, \mathbf{f}_n)$, e.g., the Euclidean distance $d(\mathbf{f}, \mathbf{f}') = \|\mathbf{f} - \mathbf{f}'\|$, for only those indices for which there was a hash collision. As shown in [115], such an LSH-based localization approach is able to reduce the number of fingerprint comparisons by more than 10× while achieving the same average distance error as a traditional KNNS. Unfortunately, LSH does not help to reduce the amount of storage for the CSI fingerprints and hash values.

2) Location Fingerprinting via Neural Networks: In order to mitigate the complexity and storage bottlenecks of location fingerprinting, DNNs have been proposed recently in [118]–[121]. Such methods avoid a KNNS altogether and directly map measured CSI-fingerprints to location. Concretely, one learns a neural network $g_{\boldsymbol{\theta}}$, where the weights and biases are contained in the vector $\boldsymbol{\theta}$, that maps the CSI features $\{\mathbf{f}_n\}_{n=1}^{N}$ to locations $\{\mathbf{x}_n\}_{n=1}^{N}$ by computing
simple neural networks with fewer than ten layers and ReLU activations are often sufficient for accurate positioning; what matters most is the way measured CSI is converted into fingerprints. Fingerprints should be resilient to small-scale fading properties of the channel and capture large-scale aspects; a combination of angle-of-arrival and delay information appears to be among the best-performing features for such applications [120], [122].

B. Channel Charting for Self-Supervised Location Sensing

Unfortunately, fingerprinting and neural-network-based localization approaches require extensive, time-consuming, and costly measurement campaigns, which have to be repeated if large-scale properties of the environment change. In many applications, e.g., predictive radio resource management (RRM) in wireless networks, cell or beam association and handover, and UE pairing and grouping, exact location is not necessary and relative position information is sufficient. Since carefully-designed CSI features change only slowly when a transmitter moves through space, one can extract a low-dimensional representation from high-dimensional CSI features only, which provides relative location information. This approach is known as channel charting (CC) [122], which uses CSI features to form a channel chart in which nearby points will be nearby in physical space. Such a channel chart can be constructed using dimensionality reduction [123] in a self-supervised (or unsupervised) fashion, which does not require extensive measurement campaigns.

1) Channel Charting via Sammon’s Mapping: Mathematically, channel charting only requires measured CSI features $\{\mathbf{f}_n\}_{n=1}^{N}$ and computes for each feature $\mathbf{f}_n \in \mathbb{R}^D$ a point $\mathbf{y}_n \in \mathbb{R}^d$ in the low-dimensional channel chart ($d$ is two or three). The idea is that for similar features, the points in the channel chart should be similar as well. Mathematically, one can solve an optimization problem of the form

$$\{\mathbf{y}_n\}_{n=1}^{N} = \arg\min_{\mathbf{y}_n,\, n=1,\ldots,N} \sum_{m \neq n} w(\mathbf{f}_n, \mathbf{f}_m) \big( \|\mathbf{f}_n - \mathbf{f}_m\| - \|\mathbf{y}_n - \mathbf{y}_m\| \big)^2, \tag{14}$$

which is known as Sammon’s mapping [124]. Here, the function $w(\mathbf{f}_n, \mathbf{f}_m)$ is used to de-weight the cost function for features that are dissimilar, i.e., one wants to enforce the similarity of pairs of features and pairs of locations in the channel chart only if the feature pair is similar. The standard choice for Sammon’s mapping is $w(\mathbf{f}_n, \mathbf{f}_m) = \|\mathbf{f}_n - \mathbf{f}_m\|^{-1}$. While the problem in (14) can be solved efficiently using gradient-descent techniques, Sammon’s mapping is nonparametric in the sense that it does not naturally provide a function that maps CSI features $\mathbf{f}_n$ to points $\mathbf{y}_n$ in the channel chart.

2) Channel Charting via Siamese Neural Networks: In order to arrive at a channel charting method that is parametric, i.e., provides a function $c_{\boldsymbol{\theta}}$ that maps features to points in the channel chart, reference [121] proposed to solve an alternative optimization problem

$$\boldsymbol{\theta} = \arg\min_{\boldsymbol{\theta}} \sum_{m \neq n} w(\mathbf{f}_n, \mathbf{f}_m) \big( \|\mathbf{f}_n - \mathbf{f}_m\| - \|c_{\boldsymbol{\theta}}(\mathbf{f}_n) - c_{\boldsymbol{\theta}}(\mathbf{f}_m)\| \big)^2, \tag{15}$$

where $\mathbf{y}_n$ is replaced by a DNN $\mathbf{y}_n = c_{\boldsymbol{\theta}}(\mathbf{f}_n)$ in (14). The structure of this network is known as a “Siamese neural network” [125] that is illustrated in Figure 9. This network corresponds to two parallel neural networks that share the same weights and biases; the output is the distance between the networks’ outputs. Besides the learned neural network $c_{\boldsymbol{\theta}}(\mathbf{f}_n)$, which can be used for RRM and other predictive tasks, Siamese networks can also be used to perform semi-supervised learning, where one provides the channel charting method with a subset of annotated locations $\mathbf{x}_n$; this approach provides absolute location capabilities while requiring orders-of-magnitude fewer location-annotated channel features [121], [126].

Fig. 9. Siamese neural network for channel charting.

VII. FUTURE RESEARCH

Although AI has already shown its advantages, future research on AI for 5G and B5G still needs to address the following issues.

A. Learning Methods vs. Conventional Methods

In certain situations, conventional methods can be satisfactory solvers with even lower complexity than learning methods. Existing works have shown that learning-based methods demonstrate their advantages on problems that cannot be exactly modeled or solved [9]. Future research needs to identify the application areas of AI for 5G/B5G.

B. Model Driven vs. Data Driven

Combined with expert knowledge, model-driven learning can effectively lower the training complexity. On the other hand, data-driven learning can help to discover useful patterns that cannot be identified by expertise, yielding better performance, though the complexity might be prohibitive. Whether to introduce expertise depends on how reliable it is. Making full use of expertise can improve the learning efficiency and convergence rate, and lower the complexity to meet real-time requirements. Future research is expected to point out which approach is appropriate for a specific problem.

C. Data Validity

As for “AI for 5G/B5G”, data is a necessary prerequisite and is at least as important as learning itself. Data validity includes accuracy, identifiability, ergodicity, and integrity. It is critical to check the data validity before learning. The data generated by wireless systems usually lack perfect ergodicity. For Gaussian distributed data, the further a data point is from the mean value, the less likely it is to be collected for learning, and the higher the probability of under-fitting. Therefore, without proper manual intervention in the data, the learning efficiency will be reduced.

D. Data Sharing and Data Security

For efficient data usage, AI for 5G/B5G usually needs access to all data generated by the network layer, data link layer, and physical layer. However, due to the phenomenon of “Data
Silos,” the massive data generated by wireless networks cannot be fully opened and shared, even for research purposes. Second, current data lack a scientific and unified collection and storage specification, which will further hinder the development of this area. On the other hand, the requirement of data sharing comes with the consideration of data security. How to protect users’ rights, interests, and privacy is also important [127].

E. Offline Training vs. Online Training

For good learning performance, a large number of parameters need to be trained. To lower both time complexity and space complexity, designers usually prefer offline training. The successful application of offline-trained parameters depends on their robustness. Otherwise, online training should be considered. For real-time operation, the number of parameters to be trained online should be reduced and hardware acceleration is required. How to efficiently obtain the training data for online learning is another issue that needs to be addressed [128], [129].

F. Separate Learning vs. Joint Learning

Most existing works focus on learning for one or two modules. Separate learning is a natural choice since different wireless modules have different features. It is generally agreed that joint learning across more modules will bring advantages in performance, complexity, and robustness. However, the implementation of joint learning requires that the modules share a common nature. An extreme case is end-to-end learning [130], [131], which will be more interesting and complex for future research.

G. Physical Layer vs. Cross Layer

Existing works mostly focus on AI’s applications in a single layer, e.g., the physical layer. Though a good balance of performance and cost can be achieved by model-driven learning, physical layer learning is somehow constrained by expert knowledge; otherwise, end-to-end learning becomes feasible. Some researchers argue that we should not limit AI’s application to the physical layer but should also consider cross-layer applications. For example, researchers have started to work on AI-based network optimization. Learning across two or three layers can be expected to bring more benefits.

H. Learning With Software vs. Learning With Hardware

VIII. CONCLUSION

Development of AI for 5G/B5G requires domain-specific knowledge from wireless communication, expertise in AI and machine learning, and experience in hardware design. As the review paper in the special issue “Artificial Intelligence for 5G and Beyond 5G: Implementations, Algorithms, and Optimizations” of the IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS), this review aims to provide a synthesized source and a wide view of research problems, methodologies, and recent results in this area, and to highlight further directions for readers. We hope this paper and the special issue will be of interest to a large portion of related researchers, and will be timely in triggering future research and paving the road towards the B5G era.

ACKNOWLEDGMENT

The authors would like to thank Prof. Xiaohu You for useful discussions on future research directions.

REFERENCES

[1] M. Liyanage, I. Ahmad, A. Abro, A. Gurtov, and M. Ylianttila, A Comprehensive Guide to 5G Security. Hoboken, NJ, USA: Wiley, 2018.
[2] Oxford Analytica, “COVID-19 will probably accelerate remote working trend,” in Emerald Expert Briefings. Oxford, U.K.: Oxford Analytica, 2020.
[3] H. Ullah, N. G. Nair, A. Moore, C. Nugent, P. Muschamp, and M. Cuevas, “5G communication: An overview of vehicle-to-everything, drones, and healthcare use-cases,” IEEE Access, vol. 7, pp. 37251–37268, 2019.
[4] C. Zhang, Y.-H. Huang, F. Sheikh, and Z. Wang, “Advanced baseband processing algorithms, circuits, and implementations for 5G communication,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 7, no. 4, pp. 477–490, Dec. 2017.
[5] Z. Ma, Z. Zhang, Z. Ding, P. Fan, and H. Li, “Key techniques for 5G wireless communications: Network architecture, physical layer, and MAC layer perspectives,” Sci. China Inf. Sci., vol. 58, no. 4, pp. 1–20, Apr. 2015.
[6] TECH PERF REQ-Minimum Requirements Related to Technical Performance for IMT-2020 Radio Interface (S), document ITU-R SG05 Contribution, Draft New Report ITU-R M.IMT-2020, 2017, vol. 40.
[7] 5G; Study on Scenarios and Requirements for Next Generation Access Technologies, document TR 38.913, 3GPP, 2017.
[8] Wikipedia. (2020). Artificial Intelligence—Wikipedia, the Free Encyclopedia. Accessed: Apr. 3, 2020. [Online]. Available: http://en.wikipedia.org/w/index.php?title=Artificial%20intelligence&oldid=948602554
[9] X. You, C. Zhang, X. Tan, S. Jin, and H. Wu, “AI for 5G: Research directions and paradigms,” Sci. China Inf. Sci., vol. 62, no. 2, p. 21301, Feb. 2019.
[10] A. Zappone, M. Di Renzo, and M. Debbah, “Wireless networks design in the era of deep learning: Model-based, AI-based, or both?” IEEE Trans. Commun., vol. 67, no. 10, pp. 7331–7376, Oct. 2019.
[11] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, “Machine learning paradigms for next-generation wireless networks,” IEEE Wireless Commun., vol. 24, no. 2, pp. 98–105, Apr. 2017.
Learning with software will always be the first choice due [12] S.-Y. Wu, “Key technology enablers of innovations in the AI and 5G
to its convenience and adaptivity. However, for complicated era,” in IEDM Tech. Dig., Dec. 2019, p. 36.
learning tasks, software implementation will introduce high [13] R. Li et al., “Intelligent 5G: When cellular networks meet artificial
intelligence,” IEEE Wireless Commun., vol. 24, no. 5, pp. 175–183,
time and space complexity. Therefore, learning with hardware Oct. 2017.
has also been considered. For efficient hardware implementa- [14] D. Gesbert, D. Gündüz, P. de Kerret, C. R. Murthy, M. van der Schaar,
tion, the corresponding algorithm should be hardware-friendly, and N. D. Sidiropoulos, “Guest editorial special issue on machine
for example, robust to quantization and structure-regular. learning in wireless communication—Part 1,” IEEE J. Sel. Areas
Commun., vol. 37, no. 10, pp. 2181–2183, Oct. 2019.
Hardware design expertise on both FPGA and ASIC is
[15] D. Gesbert, D. Gündüz, P. de Kerret, C. R. Murthy, M. van der Schaar,
essential for this issue. On the other hand, AI can also help and N. D. Sidiropoulos, “Guest editorial special issue on machine
to design reconfigurable hardware for learning. Existing learning in wireless Communication—Part 2,” IEEE J. Sel. Areas
hardware auto-generator can be the basis for future research. Commun., vol. 37, no. 11, pp. 2409–2412, Nov. 2019.
[16] S. Gong et al., “Introduction to the special section on deep reinforcement learning for future wireless communication networks,” IEEE Trans. Cognit. Commun. Netw., vol. 5, no. 4, pp. 1019–1023, Dec. 2019.
[17] J. IEEE Sel Topics Signal Process (JSTSP). (2020). Call for Papers: Special Issue on Compact Deep Neural Networks With Industrial Applications. Accessed: Apr. 3, 2020. [Online]. Available: https://signalprocessingsociety.org/sites/default/files/uploads/special_issues_deadlines/JSTSP_SI_compact_deep.pdf
[18] IEEE Global Commun. Conf. (GLOBECOM). (2019). Call for Workshop Papers: Workshop on Machine Learning for Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: https://globecom2019.ieee-globecom.org/authors/call-workshop-papers
[19] IEEE Global Commun. Conf. (GLOBECOM). (2019). Call for Workshop Papers: Workshop on Artificial Intelligence for Next-Generations Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: https://globecom2019.ieee-globecom.org/authors/call-workshop-papers
[20] IEEE Global Conf. Signal Info. (GlobalSIP). (2019). Call for Papers: Symposium on Machine Learning for Wireless Communications, Networking, and Security. Accessed: Apr. 3, 2020. [Online]. Available: http://2019.ieeeglobalsip.org/pages/machine-learning-wireless-communications-networking-and-security
[21] IEEE Global Conf. Signal Info. (GlobalSIP). (2019). Call for Papers: Symposium on Artificial Intelligence for Future Wireless Communication. Accessed: Apr. 3, 2020. [Online]. Available: http://2019.ieeeglobalsip.org/pages/symposium-artificial-intelligence-future-wireless-communication
[22] IEEE Vehi. Tech. Conf. (VTC)-Fall. (2019). Call for Workshop Papers: Workshop on Machine Learning for Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: http://www.ieeevtc.org/vtc2019fall/cfw-pprs.php#wkshp_3
[23] Brooklyn 5G Summit. (2020). Call for Workshop Papers: Workshop on Machine Learning for Wireless Communications. Accessed: Apr. 3, 2020. [Online]. Available: https://brooklyn5gsummit.com/agenda-2020/
[24] IEEE Communications Society. (2019). Machine Learning for Communications Emerging Technologies Initiative. Accessed: Apr. 3, 2020. [Online]. Available: https://mlc.committees.comsoc.org/workshops-tutorials-symposia/
[25] Z. Qin, H. Ye, G. Y. Li, and B.-H.-F. Juang, “Deep learning in physical layer communications,” IEEE Wireless Commun., vol. 26, no. 2, pp. 93–99, Apr. 2019.
[26] H. He, S. Jin, C.-K. Wen, F. Gao, G. Y. Li, and Z. Xu, “Model-driven deep learning for physical layer communications,” IEEE Wireless Commun., vol. 26, no. 5, pp. 77–83, Oct. 2019.
[27] L. Liang, H. Ye, G. Yu, and G. Y. Li, “Deep-learning-based wireless resource allocation with application to vehicular networks,” Proc. IEEE, vol. 108, no. 2, pp. 341–356, Feb. 2020.
[28] N. Samuel, T. Diskin, and A. Wiesel, “Learning to detect,” IEEE Trans. Signal Process., vol. 67, no. 10, pp. 2554–2564, May 2019.
[29] M. Khani, M. Alizadeh, J. Hoydis, and P. Fleming, “Adaptive neural signal detection for massive MIMO,” 2019, arXiv:1906.04610. [Online]. Available: http://arxiv.org/abs/1906.04610
[30] X. Gao, S. Jin, C.-K. Wen, and G. Y. Li, “ComNet: Combination of deep learning and expert knowledge in OFDM receivers,” IEEE Commun. Lett., vol. 22, no. 12, pp. 2627–2630, Dec. 2018.
[31] C. Jeon, O. Castañeda, and C. Studer, “A 354 mb/s 0.37 mm² 151 mW 32-user 256-QAM near-map soft-input soft-output massive mu-MIMO data detector in 28 nm CMOS,” IEEE Solid-State Circuits Lett., vol. 2, no. 9, pp. 127–130, Oct. 2019.
[32] J. Ma and L. Ping, “Orthogonal AMP,” IEEE Access, vol. 5, pp. 2020–2033, 2017.
[33] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “A model-driven deep learning network for MIMO detection,” in Proc. IEEE Global Conf. Signal Inf. Process. (GlobalSIP), Nov. 2018, pp. 584–588.
[34] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “Model-driven deep learning for MIMO detection,” IEEE Trans. Signal Process., vol. 68, pp. 1702–1715, Feb. 2020.
[35] X. Tan et al., “Improving massive MIMO message passing detectors with deep neural network,” IEEE Trans. Veh. Technol., vol. 69, no. 2, pp. 1267–1280, Feb. 2020.
[36] A. Balatsoukas-Stimming and C. Studer, “Deep unfolding for communications systems: A survey and some new directions,” in Proc. IEEE Int. Workshop Signal Process. Syst. (SiPS), Oct. 2019, pp. 1–6.
[37] M. Joham, W. Utschick, and J. A. Nossek, “Linear transmit processing in MIMO communications systems,” IEEE Trans. Signal Process., vol. 53, no. 8, pp. 2700–2712, Aug. 2005.
[38] K. Li, C. Jeon, J. R. Cavallaro, and C. Studer, “Feedforward architectures for decentralized precoding in massive MU-MIMO systems,” in Proc. 52nd Asilomar Conf. Signals, Syst., Comput. (ACSSC), Oct. 2018, pp. 1659–1665.
[39] A. M. Elbir, “CNN-based precoder and combiner design in mmWave MIMO systems,” IEEE Commun. Lett., vol. 23, no. 7, pp. 1240–1243, Jul. 2019.
[40] A. Balatsoukas-Stimming, O. Castañeda, S. Jacobsson, G. Durisi, and C. Studer, “Neural-network optimized 1-bit precoding for massive MU-MIMO,” in Proc. IEEE 20th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jul. 2019, pp. 1–5.
[41] O. Castaneda, S. Jacobsson, G. Durisi, M. Coldrey, T. Goldstein, and C. Studer, “1-bit massive MU-MIMO precoding in VLSI,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 7, no. 4, pp. 508–522, Dec. 2017.
[42] O. Castañeda, T. Goldstein, and C. Studer, “VLSI designs for joint channel estimation and data detection in large SIMO wireless systems,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 65, no. 3, pp. 1120–1132, Oct. 2017.
[43] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.
[44] D. J. C. MacKay, “Good error-correcting codes based on very sparse matrices,” IEEE Trans. Inf. Theory, vol. 45, no. 2, pp. 399–431, Mar. 1999.
[45] J. Chen and M. P. C. Fossorier, “Density evolution for two improved BP-based decoding algorithms of LDPC codes,” IEEE Commun. Lett., vol. 6, no. 5, pp. 208–210, May 2002.
[46] E. Arikan, “Channel polarization: A method for constructing capacity-achieving codes for symmetric binary-input memoryless channels,” IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3051–3073, Jul. 2009.
[47] X. Wu, M. Jiang, and C. Zhao, “Decoding optimization for 5G LDPC codes by machine learning,” IEEE Access, vol. 6, pp. 50179–50186, 2018.
[48] B. Vasic, X. Xiao, and S. Lin, “Learning to decode LDPC codes with finite-alphabet message passing,” in Proc. Inf. Theory Appl. Workshop (ITA), Feb. 2018, pp. 1–9.
[49] L. Lugosch and W. J. Gross, “Neural offset min-sum decoding,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jun. 2017, pp. 1361–1365.
[50] W. Xu, Z. Wu, Y.-L. Ueng, X. You, and C. Zhang, “Improved polar decoder based on deep learning,” in Proc. IEEE Int. Workshop Signal Process. Syst. (SiPS), Oct. 2017, pp. 1–6.
[51] C.-F. Teng, C.-H.-D. Wu, A. Kuan-Shiuan Ho, and A.-Y.-A. Wu, “Low-complexity recurrent neural network-based polar decoder with weight quantization mechanism,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2019, pp. 1413–1417.
[52] E. Nachmani, E. Marciano, L. Lugosch, W. J. Gross, D. Burshtein, and Y. Be’ery, “Deep learning methods for improved decoding of linear codes,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 119–131, Feb. 2018.
[53] B. Yuan and K. K. Parhi, “Early stopping criteria for energy-efficient low-latency belief-propagation polar code decoders,” IEEE Trans. Signal Process., vol. 62, no. 24, pp. 6496–6506, Dec. 2014.
[54] C. Simsek and K. Turk, “Simplified early stopping criterion for belief-propagation polar code decoders,” IEEE Commun. Lett., vol. 20, no. 8, pp. 1515–1518, Aug. 2016.
[55] Y. Ren, C. Zhang, X. Liu, and X. You, “Efficient early termination schemes for belief-propagation decoding of polar codes,” in Proc. IEEE 11th Int. Conf. ASIC (ASICON), Nov. 2015, pp. 1–4.
[56] Y. Wang, S. Zhang, C. Zhang, X. Chen, and S. Xu, “A low-complexity belief propagation based decoding scheme for polar codes–decodability detection and early stopping prediction,” IEEE Access, vol. 7, pp. 159808–159820, 2019.
[57] F. Liang, C. Shen, and F. Wu, “An iterative BP-CNN architecture for channel decoding,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 144–159, Feb. 2018.
[58] X. Wang et al., “Learning to flip successive cancellation decoding of polar codes with LSTM networks,” in Proc. IEEE 30th Annu. Int. Symp. Pers., Indoor Mobile Radio Commun. (PIMRC), Sep. 2019, pp. 1–5.
[59] C.-H. Chen, C.-F. Teng, and A.-Y.-A. Wu, “Low-complexity LSTM-assisted bit-flipping algorithm for successive cancellation list polar decoder,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), May 2020, pp. 1708–1712.
[60] T. Gruber, S. Cammerer, J. Hoydis, and S. T. Brink, “On deep learning-based channel decoding,” in Proc. 51st Annu. Conf. Inf. Sci. Syst. (CISS), Mar. 2017, pp. 1–6.
[61] S. Cammerer, T. Gruber, J. Hoydis, and S. ten Brink, “Scaling deep learning-based decoding of polar codes via partitioning,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2017, pp. 1–6.
[62] T. Wadayama and S. Takabe, “Deep learning-aided trainable projected gradient decoding for LDPC codes,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Jul. 2019, pp. 2444–2448.
[63] Y. Wang, Z. Zhang, S. Zhang, S. Cao, and S. Xu, “A unified deep learning based polar-LDPC decoder for 5G communication systems,” in Proc. 10th Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct. 2018, pp. 1–6.
[64] W. Lyu, Z. Zhang, C. Jiao, K. Qin, and H. Zhang, “Performance evaluation of channel decoding with deep neural networks,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2018, pp. 1–6.
[65] M. Zhang, Q. Huang, S. Wang, and Z. Wang, “Construction of LDPC codes based on deep reinforcement learning,” in Proc. 10th Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct. 2018, pp. 1–4.
[66] A. Elkelesh, M. Ebada, S. Cammerer, L. Schmalen, and S. ten Brink, “Decoder-in-the-loop: Genetic optimization-based LDPC code design,” IEEE Access, vol. 7, pp. 141161–141170, 2019.
[67] G. He et al., “Beta-expansion: A theoretical framework for fast and recursive construction of polar codes,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2017, pp. 1–6.
[68] I. Tal and A. Vardy, “How to construct polar codes,” IEEE Trans. Inf. Theory, vol. 59, no. 10, pp. 6562–6582, Oct. 2013.
[69] 3GPP. (Feb. 2017). Final Report of 3GPP TSG RAN WG1 #87. Accessed: Apr. 3, 2020. [Online]. Available: http://www.3gpp.org/DynaReport/TDocExMt–R1-87–31665.htm
[70] A. Elkelesh, M. Ebada, S. Cammerer, and S. T. Brink, “Decoder-tailored polar code design using the genetic algorithm,” IEEE Trans. Commun., vol. 67, no. 7, pp. 4521–4534, Jul. 2019.
[71] I. Tal and A. Vardy, “List decoding of polar codes,” IEEE Trans. Inf. Theory, vol. 61, no. 5, pp. 2213–2226, May 2015.
[72] M. Ebada, S. Cammerer, A. Elkelesh, and S. Ten Brink, “Deep learning-based polar code design,” in Proc. 57th Annu. Allerton Conf. Commun., Control, Comput. (Allerton), Sep. 2019, pp. 177–183.
[73] B. Mitchinson and R. F. Harrison, “Digital communications channel equalization using the kernel adaline,” IEEE Trans. Commun., vol. 50, no. 4, pp. 571–576, Apr. 2002.
[74] X. Lyu, W. Feng, R. Shi, Y. Pei, and N. Ge, “Artificial neural network-based nonlinear channel equalization: A soft-output perspective,” in Proc. 22nd Int. Conf. Telecommun. (ICT), Apr. 2015, pp. 243–248.
[75] J. C. Patra, W. B. Poh, N. S. Chaudhari, and A. Das, “Nonlinear channel equalization with QAM signal using chebyshev artificial neural network,” in Proc. IEEE Int. Joint Conf. Neural Netw. (IJCNN), Jul. 2005, pp. 3214–3219.
[76] K. Burse, R. N. Yadav, and S. C. Shrivastava, “Channel equalization using neural networks: A review,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 40, no. 3, pp. 352–357, May 2010.
[77] C.-F. Teng, H.-M. Ou, and A.-Y.-A. Wu, “Neural network-based equalizer by utilizing coding gain in advance,” in Proc. IEEE Global Conf. Signal Inf. Process. (GlobalSIP), Nov. 2019, pp. 1–5.
[78] H. Ye and G. Y. Li, “Initial results on deep learning for joint channel equalization and decoding,” in Proc. IEEE 86th Veh. Technol. Conf. (VTC-Fall), Sep. 2017, pp. 1–5.
[79] P. M. Olmos, J. J. Murillo-Fuentes, and F. Perez-Cruz, “Joint nonlinear channel equalization and soft LDPC decoding with Gaussian processes,” IEEE Trans. Signal Process., vol. 58, no. 3, pp. 1183–1192, Mar. 2010.
[80] W. Xu, Z. Zhong, Y. Be’ery, X. You, and C. Zhang, “Joint neural network equalizer and decoder,” in Proc. 15th Int. Symp. Wireless Commun. Syst. (ISWCS), Aug. 2018, pp. 1–5.
[81] G. Kechriotis, E. Zervas, and E. S. Manolakos, “Using recurrent neural networks for adaptive communication channel equalization,” IEEE Trans. Neural Netw., vol. 5, no. 2, pp. 267–278, Mar. 1994.
[82] Y. Hu, L. Zhao, and Y. Hu, “Joint channel equalization and decoding with one recurrent neural network,” in Proc. IEEE Int. Symp. Broadband Multimedia Syst. Broadcast. (BMSB), Jun. 2019, pp. 1–4.
[83] L. Dai, B. Wang, Y. Yuan, S. Han, C.-L. I, and Z. Wang, “Non-orthogonal multiple access for 5G: Solutions, challenges, opportunities, and future research trends,” IEEE Commun. Mag., vol. 53, no. 9, pp. 74–81, Sep. 2015.
[84] F. Luo and C. Zhang, Non-Orthogonal Multi-User Superposition Shared Access. Hoboken, NJ, USA: Wiley, 2016, pp. 115–142.
[85] S. Chen, B. Ren, Q. Gao, S. Kang, S. Sun, and K. Niu, “Pattern division multiple access—A novel nonorthogonal multiple access for fifth-generation radio networks,” IEEE Trans. Veh. Technol., vol. 66, no. 4, pp. 3185–3196, Apr. 2017.
[86] H. Nikopour et al., “SCMA for downlink multiple access of 5G wireless networks,” in Proc. IEEE Global Commun. Conf., Dec. 2014, pp. 3940–3945.
[87] C. Yang, C. Zhang, S. Zhang, and X. You, “Efficient hardware architecture of deterministic MPA decoder for SCMA,” in Proc. IEEE Asia Pacific Conf. Circuits Syst. (APCCAS), Oct. 2016, pp. 293–296.
[88] Y. Tan, J. Zhou, and J. Qin, “Novel channel estimation for non-orthogonal multiple access systems,” IEEE Signal Process. Lett., vol. 23, no. 12, pp. 1781–1785, Dec. 2016.
[89] M. Taherzadeh, H. Nikopour, A. Bayesteh, and H. Baligh, “SCMA codebook design,” in Proc. IEEE 80th Veh. Technol. Conf. (VTC-Fall), Sep. 2014, pp. 1–5.
[90] H. Nikopour and M. Baligh, “Systems and methods for sparse code multiple access,” U.S. Patent 9 240 853, Jan. 19, 2016.
[91] M. Alam and Q. Zhang, “Performance study of SCMA codebook design,” in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Mar. 2017, pp. 1–5.
[92] C. Yang, W. Xu, Z. Zhang, X. You, and C. Zhang, “A channel-blind detection for SCMA based on image processing techniques,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2018, pp. 1–5.
[93] M. Kim, N.-I. Kim, W. Lee, and D.-H. Cho, “Deep learning-aided SCMA,” IEEE Commun. Lett., vol. 22, no. 4, pp. 720–723, Apr. 2018.
[94] H. He, C.-K. Wen, S. Jin, and G. Y. Li, “Deep learning-based channel estimation for beamspace mmWave massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 7, no. 5, pp. 852–855, Oct. 2018.
[95] M. Soltani, V. Pourahmadi, A. Mirzaei, and H. Sheikhzadeh, “Deep learning-based channel estimation,” IEEE Commun. Lett., vol. 23, no. 4, pp. 652–655, Apr. 2019.
[96] C.-J. Chun, J.-M. Kang, and I.-M. Kim, “Deep learning-based joint pilot design and channel estimation for multiuser MIMO channels,” IEEE Commun. Lett., vol. 23, no. 11, pp. 1999–2003, Nov. 2019.
[97] C.-J. Chun, J.-M. Kang, and I.-M. Kim, “Deep learning-based channel estimation for massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 8, no. 4, pp. 1228–1231, Aug. 2019.
[98] Y. Liao, Y. Hua, X. Dai, H. Yao, and X. Yang, “ChanEstNet: A deep learning based channel estimation for high-speed scenarios,” in Proc. IEEE Int. Conf. Commun. (ICC), May 2019, pp. 1–6.
[99] K. E. Kolodziej, B. T. Perry, and J. S. Herd, “In-band full-duplex technology: Techniques and systems survey,” IEEE Trans. Microw. Theory Techn., vol. 67, no. 7, pp. 3025–3041, Jul. 2019.
[100] A. Balatsoukas-Stimming, “Non-linear digital self-interference cancellation for in-band full-duplex radios using neural networks,” in Proc. IEEE 19th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2018, pp. 1–5.
[101] Y. Kurzo, A. T. Kristensen, A. Burg, and A. Balatsoukas-Stimming, “Hardware implementation of neural self-interference cancellation,” 2020, arXiv:2001.04543. [Online]. Available: http://arxiv.org/abs/2001.04543
[102] O. Ploder, O. Lang, T. Paireder, and M. Huemer, “An adaptive machine learning based approach for the cancellation of second-order-intermodulation distortions in 4G/5G transceivers,” in Proc. IEEE 90th Veh. Technol. Conf. (VTC-Fall), Sep. 2019, pp. 1–7.
[103] C. Tarver, A. Balatsoukas-Stimming, and J. R. Cavallaro, “Design and implementation of a neural network based predistorter for enhanced mobile broadband,” 2019, arXiv:1907.00766. [Online]. Available: http://arxiv.org/abs/1907.00766
[104] M. Rawat, K. Rawat, and F. M. Ghannouchi, “Adaptive digital predistortion of wireless power amplifiers/transmitters using dynamic real-valued focused time-delay line neural networks,” IEEE Trans. Microw. Theory Techn., vol. 58, no. 1, pp. 95–104, Jan. 2010.
[105] D. Wang, M. Aziz, M. Helaoui, and F. M. Ghannouchi, “Augmented real-valued time-delay neural network for compensation of distortions and impairments in wireless transmitters,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 1, pp. 242–254, Jan. 2019.
[106] Y. Yang, F. Gao, X. Ma, and S. Zhang, “Deep learning-based channel estimation for doubly selective fading channels,” IEEE Access, vol. 7, pp. 36579–36589, 2019.
[107] Y. Yang, F. Gao, G. Y. Li, and M. Jian, “Deep learning-based downlink channel prediction for FDD massive MIMO system,” IEEE Commun. Lett., vol. 23, no. 11, pp. 1994–1998, Nov. 2019.
[108] F. Gustafsson and F. Gunnarsson, “Mobile positioning using wireless networks: Possibilities and fundamental limitations based on available wireless network measurements,” IEEE Signal Process. Mag., vol. 22, no. 4, pp. 41–53, Jul. 2005.
[109] H. Liu, H. Darabi, P. Banerjee, and J. Liu, “Survey of wireless indoor positioning techniques and systems,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 37, no. 6, pp. 1067–1080, Nov. 2007.
[110] S. Kumar, S. Gil, D. Katabi, and D. Rus, “Accurate indoor localization with zero start-up cost,” in Proc. 20th Annu. Int. Conf. Mobile Comput. Netw. (MobiCom), Sep. 2014, pp. 483–494.
[111] K. Kaemarungsi and P. Krishnamurthy, “Modeling of indoor positioning systems based on location fingerprinting,” in Proc. IEEE Int. Conf. Comput. Commun. (INFOCOM), vol. 2, Mar. 2004, pp. 1012–1022.
[112] K. Wu, J. Xiao, Y. Yi, D. Chen, X. Luo, and L. M. Ni, “CSI-based indoor localization,” IEEE Trans. Parallel Distrib. Syst., vol. 24, no. 7, pp. 1300–1309, Jul. 2013.
[113] Y. Chapre, A. Ignjatovic, A. Seneviratne, and S. Jha, “CSI-MIMO: An efficient Wi-Fi fingerprinting using channel state information with MIMO,” Pervas. Mobile Comput., vol. 23, pp. 89–103, Oct. 2015.
[114] A. Smailagic, J. Small, and D. P. Siewiorek, “Determining user location for context aware computing through the use of a wireless LAN infrastructure,” Inst. Complex Engineered Syst., Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep. 15213, Dec. 2000, Art. no. 15213.
[115] L. Tang, R. Ghods, and C. Studer, “Reducing the complexity of fingerprinting-based positioning using locality-sensitive hashing,” 2019, arXiv:1912.00831. [Online]. Available: http://arxiv.org/abs/1912.00831
[116] A. Gionis, P. Indyk, and R. Motwani, “Similarity search in high dimensions via hashing,” in Proc. Very Large Data Base (VLDB), Sep. 1999, vol. 99, no. 6, pp. 518–529.
[117] S. Har-Peled, P. Indyk, and R. Motwani, “Approximate nearest neighbor: Towards removing the curse of dimensionality,” Theory Comput., vol. 8, no. 1, pp. 321–350, Jul. 2012.
[118] X. Wang, L. Gao, and S. Mao, “CSI phase fingerprinting for indoor localization with a deep learning approach,” IEEE Internet Things J., vol. 3, no. 6, pp. 1113–1123, Dec. 2016.
[119] J. Vieira, E. Leitinger, M. Sarajlic, X. Li, and F. Tufvesson, “Deep convolutional neural networks for massive MIMO fingerprint-based positioning,” in Proc. IEEE 28th Annu. Int. Symp. Pers., Indoor, Mobile Radio Commun. (PIMRC), Oct. 2017, pp. 1–6.
[120] M. Arnold, S. Dorner, S. Cammerer, and S. Ten Brink, “On deep learning-based massive MIMO indoor user localization,” in Proc. IEEE 19th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jun. 2018, pp. 1–5.
[121] E. Lei, O. Castañeda, O. Tirkkonen, T. Goldstein, and C. Studer, “Siamese neural networks for wireless positioning and channel charting,” in Proc. 57th Annu. Allerton Conf. Commun., Control, Comput. (Allerton), Sep. 2019, pp. 200–207.
[122] C. Studer, S. Medjkouh, E. Gönültas, T. Goldstein, and O. Tirkkonen, “Channel charting: Locating users within the radio environment using channel state information,” IEEE Access, vol. 6, pp. 47682–47698, 2018.
[123] L. J. P. van der Maaten, E. O. Postma, and H. J. van den Herik, “Dimensionality reduction: A comparative review,” J. Mach. Learn. Res., vol. 10, nos. 1–41, pp. 66–71, Oct. 2009.
[124] J. W. Sammon, “A nonlinear mapping for data structure analysis,” IEEE Trans. Comput., vol. C-18, no. 5, pp. 401–409, May 1969.
[125] J. Bromley, I. Guyon, Y. LeCun, E. Säckinger, and R. Shah, “Signature verification using a ‘Siamese’ time delay neural network,” in Proc. Int. Conf. Neural Inf. Process. Syst. (NIPS), Jan. 1994, pp. 737–744.
[126] P. Huang et al., “Improving channel charting with representation-constrained autoencoders,” in Proc. IEEE 20th Int. Workshop Signal Process. Adv. Wireless Commun. (SPAWC), Jul. 2019, pp. 1–5.
[127] X. You, H. Yin, and H. Wu, “On 6G and wide-area IoT,” Chin. J. Internet Things, vol. 4, no. 1, pp. 3–11, Mar. 2020.
[128] S. Schibisch, S. Cammerer, S. Dörner, J. Hoydis, and S. Ten Brink, “Online label recovery for deep learning-based communication through error correcting codes,” in Proc. 15th Int. Symp. Wireless Commun. Syst. (ISWCS), Aug. 2018, pp. 1–5.
[129] L. Lugosch and W. J. Gross, “Learning from the syndrome,” in Proc. 52nd Asilomar Conf. Signals, Syst., Comput., Oct. 2018, pp. 594–598.
[130] T. O’Shea and J. Hoydis, “An introduction to deep learning for the physical layer,” IEEE Trans. Cognit. Commun. Netw., vol. 3, no. 4, pp. 563–575, Dec. 2017.
[131] S. Dorner, S. Cammerer, J. Hoydis, and S. T. Brink, “Deep learning based communication over the air,” IEEE J. Sel. Topics Signal Process., vol. 12, no. 1, pp. 132–143, Feb. 2018.

Chuan Zhang (Member, IEEE) received the B.E. degree (summa cum laude) in microelectronics and the M.E. degree (Hons.) in very-large scale integration (VLSI) design from Nanjing University, Nanjing, China, in 2006 and 2009, respectively, and the M.S.E.E. and Ph.D. degrees from the Department of Electrical and Computer Engineering, University of Minnesota, Twin Cities (UMN), USA, in 2012.
He is currently the Excellence Professor and the Purple Mountain Professor with Southeast University. He is also with the LEADS, National Mobile Communications Research Laboratory, Quantum Information Center, Southeast University, and the Purple Mountain Laboratories, Nanjing. His current research interests include low-power high-speed VLSI design for digital signal processing and digital communication, bio-chemical computation and neuromorphic engineering, and quantum communication.
Dr. Zhang is a member of the Seasonal School of Signal Processing and Design and Implementation of Signal Processing Systems TC of the IEEE Signal Processing Society, and Circuits and Systems for Communications TC, VLSI Systems and Applications TC, and Digital Signal Processing TC of the IEEE Circuits and Systems Society. He is also the Secretary-Elect of the Circuits and Systems for Communications TC of the IEEE Circuits and Systems Society. He received the Best Contribution Award of the IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) in 2018, the Best Paper Award in 2016, the Best (Student) Paper Award of the IEEE International Conference on DSP in 2016, three best (student) paper awards of the IEEE International Conference on ASIC in 2015, 2017, and 2019, the Best Paper Award Nomination of the IEEE Workshop on Signal Processing Systems in 2015, three excellent paper awards and two excellent poster presentation awards of the International Collaboration Symposium on Information Production and Systems from 2016 to 2018, the Outstanding Achievement Award of the Intel Collaborative Research Institute in 2018, and the Merit (Student) Paper Award of the IEEE APCCAS in 2008. He also received the Three-Year University-Wide Graduate School Fellowship of UMN and the Doctoral Dissertation Fellowship of UMN. He also serves as an Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING and the IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS. He serves as a Corresponding Guest Editor of the IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS twice.

Yeong-Luh Ueng (Senior Member, IEEE) received the Ph.D. degree in communication engineering from National Taiwan University, Taipei, Taiwan, in 2001. From 2001 to 2005, he was with a private communication technology company, where he focused on the design and development of various wireless chips. Since 2005, he has been a member of the Faculty of National Tsing Hua University, Hsinchu, Taiwan, where he is currently a Full Professor with the Department of Electrical Engineering and the Institute of Communications Engineering. His research interests include coding theory, wireless communications, and communication ICs. In 2016, he received the Wu Ta-You Memorial Award from the Ministry of Science and Technology (MOST). In 2018, he received the Outstanding Electrical Engineering Professor Award from the Chinese Electrical Engineering Association and the Outstanding Research Award from the Ministry of Science and Technology (MOST), Taiwan.
Christoph Studer (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from ETH Zürich, Switzerland, in 2009.
In 2005, he was a Visiting Researcher with the Smart Antennas Research Group, Stanford University. From 2006 to 2009, he was a Research Assistant at the Integrated Systems Laboratory and the Communication Technology Laboratory (CTL), ETH Zürich. From 2009 to 2012, he was a Post-Doctoral Researcher at CTL, ETH Zürich, and the Digital Signal Processing Group, Rice University. In 2013, he held a research scientist position at Rice University, where he has been an Adjunct Professor since 2014. From 2014 to 2019, he was an Assistant Professor at Cornell University, and from 2019 to 2020, he was an Associate Professor at Cornell University and at Cornell Tech, New York City. He is currently an Associate Professor with the Department of Information Technology and Electrical Engineering, ETH Zürich, Switzerland. His research interests include the design of very large-scale integration (VLSI) circuits, as well as wireless communications, signal and image processing, optimization, and machine learning.
Dr. Studer received the ETH medals for his M.S. and Ph.D. theses in 2006 and 2009, respectively, the Swiss National Science Foundation Fellowship for Advanced Researchers in 2011, and the U.S. National Science Foundation CAREER Award in 2017. He won the Michael Tien ’72 Excellence in Teaching Award from the College of Engineering, Cornell University, in 2016. He shared the Swisscom/ICTnet Innovations Award in 2010 and 2013. He was the Winner of the Student Paper Contest of the 2007 Asilomar Conference on Signals, Systems, and Computers, received the Best Student Paper Award of the 2008 IEEE International Symposium on Circuits and Systems (ISCAS), and shared the Best Live Demonstration Award at the IEEE ISCAS in 2013. In 2019, he was the Technical Program Chair of the Asilomar Conference on Signals, Systems, and Computers, and the Technical Program Co-Chair of the IEEE International Workshop on Signal Processing Systems (SiPS). He is currently an Associate Editor of the IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS.

Andreas Burg (Member, IEEE) was born in Munich, Germany, in 1975. He received the Dipl.Ing. degree from the Swiss Federal Institute of Technology (ETH) Zürich, Zürich, Switzerland, in 2000, and the Dr.Sc.Techn. degree from the Integrated Systems Laboratory, ETH Zürich, in 2006.
In 1998, he was with Siemens Semiconductors, San Jose, CA, USA. During his Ph.D. studies, he was with the Bell Labs Wireless Research for one year. From 2006 to 2007, he was a Post-Doctoral Researcher with the Integrated Systems Laboratory and with the Communication Theory Group, ETH Zürich. In 2007, he co-founded Celestrius, an ETH-spinoff in the field of MIMO wireless communication, where he was responsible for the ASIC development as the Director for VLSI. In 2009, he joined ETH Zürich as an SNF Assistant Professor and as the Head of the Signal Processing Circuits and Systems Group, Integrated Systems Laboratory. In 2011, he became a Tenure Track Assistant Professor with the École Polytechnique Fédérale de Lausanne (EPFL), where he is currently leading the Telecommunications Circuits Laboratory. He was promoted to an Associate Professor in June 2018. He is also a member of the EURASIP SAT SPCN and of the IEEE TC-DISPS. In 2000, he received the Willi Studer Award and the ETH Medal for his diploma and his diploma thesis, respectively. He also received the ETH Medal for his Ph.D. dissertation in 2006. In 2008, he received a four-year grant from the Swiss National Science Foundation (SNF) for an SNF Assistant Professorship. With his students, he received the Best Paper Award from the EURASIP Journal on Image and Video Processing in 2013 and the best demo/paper awards at ISCAS 2013, ICECS 2013, and ACSSC 2007. He has served on the TPC of various conferences on signal processing, communications, and VLSI. He was the TPC Co-Chair of VLSI-SoC 2012, ESSCIRC 2016, and SiPS 2017. He was also the General Chair of ISLPED 2019. He served as an Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS in 2013 and on the Editorial Board of the Microelectronics Journal (Springer). He is an Editor of the Journal of Signal Processing Systems (Springer) and Journal of Low Power Electronics and Applications (MDPI), and an Associate Editor of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION.