
Learning to Communicate with Autoencoders:

Rethinking Wireless Systems with Deep Learning


Manuel Eugenio Morocho-Cayamcela, Judith Nkechinyere Njoku
Dept. of Electronic Engineering
Kumoh National Institute of Technology
Gumi, South Korea
{eugeniomorocho, judithnjoku24}@kumoh.ac.kr

Jeonghun Park, Wansu Lim
Dept. of IT Convergence Engineering
Kumoh National Institute of Technology
Gumi, South Korea
{wlsemrl8, wansu.lim}@kumoh.ac.kr

Abstract—The design and implementation of conventional communication systems are based on strong probabilistic models and assumptions. These fixed and conventional communication theories exhibit limitations in the utilization of the limited spectrum resources and in the complexity of optimization for emerging wireless applications. Currently, new generations of wireless systems supported by artificial intelligence can learn from the wireless spectrum data, and optimize their utilization to enhance their performance. In this paper, we describe how deep learning can be used to design an end-to-end communication system using an encoder to replace the transmitter tasks such as modulation and coding, and a decoder for the receiver tasks such as demodulation and decoding. This flexible design can capture channel impairments effectively and optimize the operations of the transmitter and receiver together. We evaluate the case of a single-antenna system, incorporating impairments in the channel layer of the autoencoder and evaluating the response of different neural network optimization algorithms.

Index Terms—Deep learning, autoencoders, wireless systems, physical layer, channel estimation.

I. INTRODUCTION

The design and implementation of conventional communication systems are built upon strong probabilistic models and assumptions [1]. Furthermore, they are limited in translating theory into practice when handling the complexity of optimization for new wireless applications with high degrees of freedom. Deep learning has shown a high potential to address this challenge via data-driven solutions, improving the utilization of limited wireless spectrum resources [2]–[5]. Instead of following a rigid design, new generations of wireless systems empowered by cognitive radio can learn from spectrum data, and optimize their spectrum utilization to enhance their performance. These smart communication systems rely on various detection, classification, and prediction tasks, such as signal detection and signal type identification for spectrum sensing [6]. To address these tasks, deep learning provides powerful automated means for communication systems to learn from spectrum data and adapt to its dynamics [7]. Wireless communications data come in large volumes and at high rates, and are subject to interference and security threats due to the shared nature of the medium [8], [9]. Traditional modeling often falls short when capturing the delicate relationships within highly complex spectrum data, whereas deep learning has a robust capacity to meet the requirements (e.g., data rate, mobility, latency, connection density, energy efficiency, traffic capacity) of the next-generation mobile and wireless communication systems [10].

In this paper, we rethink the approach to a conventional communication system and describe how deep learning can be used to design an end-to-end communication system using an encoder to replace the transmitter functionalities such as modulation and coding, and a decoder for the receiver functionalities such as demodulation and decoding. We consider an autoencoder to improve the performance of conventional systems by jointly optimizing the communication between the transmitter and the receiver, instead of optimizing their individual modules. An autoencoder is a deep neural network that consists of an encoder that learns a (latent) representation of the given data, and a decoder that reconstructs the input data from the encoded data. In this setting, joint modulation and coding at the transmitter correspond to the encoder, and joint decoding and demodulation at the receiver correspond to the decoder. The proposed convolutional encoder-decoder design captures channel impairments and optimizes the transmitter and receiver operations jointly for a single-antenna system. Additionally, we compare different optimizers for the parameter update of our design, as well as the constellations generated by the autoencoder. Results demonstrate the power of deep learning optimization in providing novel means to design wireless communications.

II. AN END-TO-END COMMUNICATION SEQUENCE WITH DEEP LEARNING

A communication system comprises a transmitter, a receiver, and a channel that carries the information between the transmitter and the receiver. Claude E. Shannon, in his original paper on communication theory [11], stated that the fundamental problem of communication is that of "reproducing at one point either exactly or approximately a message selected at another point". That statement is equivalent to the concept of a modern autoencoder, whose job is to reconstruct a given input at its output. In this section, we revisit the physical layer of a conventional communication system design and reformulate it as an end-to-end reconstruction task that aims to optimize the transmitter and receiver components in a single operation.
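The end-to-end reconstruction task can be written compactly using the symbols this paper introduces in Sections II-A and II-B (transmitter mapping f, channel density p(y|x), receiver mapping g). The block below is only a restatement of those definitions; the starred symbols for the optimized pair are notation assumed here for illustration.

```latex
\begin{align}
  x &= f(s), \qquad s \in \mathcal{M} = \{1, 2, \dots, M\}, \\
  y &\sim p(y \mid x), \\
  \hat{s} &= g(y), \\
  (f^{\star}, g^{\star}) &= \arg\min_{f,\, g} \; \Pr\left(\hat{s} \neq s\right).
\end{align}
```

In practice the error probability is not minimized directly; the training procedure described in Section II-B uses a categorical cross-entropy loss in its place.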

978-1-7281-4985-1/20/$31.00 ©2020 IEEE. ICAIIC 2020.
Authorized licensed use limited to: University of the West Indies (UWI). Downloaded on July 03, 2023 at 20:14:31 UTC from IEEE Xplore. Restrictions apply.
Fig. 1. Block diagram of a conventional communication system. [Blocks shown. Transmitter: information source, source encoder, channel encoder, modulator. Channel. Receiver: channel estimation, channel detection, demodulator, channel decoder, source decoder, reconstructed information. Example bit patterns shown: {1010}, {1000}.]

Fig. 2. An autoencoder-based end-to-end communication system. [Blocks shown. Transmitter f(s): message s, one-hot vector 1_s {0...010...0}, dense layers, normalization layer, ℝ→ℂ, output x. Channel p(y|x): AWGN n, output y. Receiver g(y): dense layers, softmax activation, argmax, estimate ŝ.]

A. The Limitation of Conventional Communication Systems

Conventional communication systems are divided into multiple independent blocks for the transmitter and receiver. These independent pieces are optimized individually for different tasks [12] (Fig. 1). Each block at the transmitter prepares the signal for the effects of the communication channel and noise at the receiver. The source encoder compresses the input data and removes redundancy. The channel encoder adds redundancy to the output of the source encoder in a controlled way. The modulator changes the characteristics of the signal based on the required data rate. The transmitted signal is then distorted and attenuated by the channel. On top of that, the impairments of the receiver's hardware introduce extra noise to the signal. The transmitter processes are reversed at the receiver to recover the information. The optimization of these individual processing blocks is known to be suboptimal, given that it does not optimize the overall system collectively [13]. In this conventional communication system, the transmitter communicates one of the M available messages $s \in \mathcal{M} = \{1, 2, \dots, M\}$ to the receiver, making n uses of the channel. The transmitter applies the modulation $f : \mathcal{M} \to \mathbb{R}^n$ to the message s, and generates the signal $x = f(s) \in \mathbb{R}^n$ to be transmitted. Digital modulation maps the input symbols from a discrete alphabet to complex numbers that represent the points on the constellation diagram. The process of digital modulation in conventional communication systems has fixed and pre-established constellation diagrams. The desired data rate determines the constellation scheme and the grouping of the input bits for symbol construction. Linear decision regions make it simple to decode the information at the receiver.

B. An End-to-End Optimization Process with Autoencoders

As opposed to the independent block optimization of conventional communication systems, deep learning is capable of jointly optimizing multiple communications blocks at the transmitter and receiver by training them as deep neural networks (DNNs). In an autoencoder system for a single antenna, the output constellation diagrams are not pre-defined but learned, based on the desired performance metric to be minimized at the receiver (e.g., the symbol error rate, coherence time, distance, propagation loss). The hardware of the transmitter imposes the following constraints on x [14]:
(a) an energy constraint $\|x\|_2^2 \le n$,
(b) an amplitude constraint $|x_i| \le 1 \;\forall i$,
(c) an average power constraint $\mathbb{E}\left[|x_i|^2\right] \le 1 \;\forall i$.

The data rate of this system is computed as $R = k/n$ [bit/channel use], where $k = \log_2(M)$ represents the number of input bits and n includes both the input bits and the redundant bits added to reduce channel effects. The notation (n, k) implies that a communication system sends one of the $M = 2^k$ messages (i.e., k bits) over n channel uses. Figure 2 illustrates a block diagram of the channel autoencoder. The learning process exploits the distribution of the communication channel data under impairments. The communication channel is described by the conditional probability density $p(y|x)$, where $y \in \mathbb{R}^n$ denotes the signal at the receiver. The transmitted message s is observed as y at the receiver, where the operation $g : \mathbb{R}^n \to \mathcal{M}$ is applied to produce the estimate ŝ. The channel autoencoder is optimized to map x to y so that s can be recovered, by minimizing the probability of error. The autoencoder components are summarized as follows:

1) Input: The input symbol s is encoded as a one-hot vector, that is, s can only take legal combinations of values with a single high '1' bit and all the others low '0'. In digital-logic terms, this encoding lets a state machine run at a fast clock rate, because determining the state has the low and constant cost of accessing one flip-flop.

2) Transmitter: The transmitter is composed of a feed-forward neural network (FNN) with multiple dense layers. The last dense layer output is reshaped to represent two complex numbers with real (in-phase, I) and imaginary (quadrature, Q) parts for each modulated input symbol. The normalization layer ensures that physical constraints on x are met.
3) Channel: The channel layer is not trainable, and is represented by an additive white Gaussian noise (AWGN) layer with a variance $\beta = (2 R E_b/N_0)^{-1}$, where $E_b/N_0$ is the ratio of the energy per bit ($E_b$) to the noise power spectral density ($N_0$). The noise varies for every training example, and it is used in the forward pass to distort the transmitted signal, but neglected in the backward pass.

4) Receiver: Similar to the transmitter, the receiver is implemented as an FNN. The softmax activation of its last layer outputs the probability vector $p \in (0, 1)^M$ over all possible messages. The element of p with the highest probability value is selected as ŝ.

5) Training: The autoencoder is trained using different optimizers to update the weights of the FNN and compare their behavior. The optimizers used during training are: stochastic gradient descent (SGD), root mean square propagation (RMSprop), adaptive gradient (Adagrad), adaptive learning rate (Adadelta), adaptive moment (Adam), Adam-based infinity norm (Adamax), and Nesterov Adam (Nadam). The training batch is the set of all possible messages $s \in \mathcal{M}$. The gradient is derived from a categorical cross-entropy loss function between the one-hot vector $\mathbf{1}_s$ and p.

[Fig. 3 diagram, layer stack in order: input_1 (InputLayer), dense_1 (Dense), dense_2 (Dense), batch_normalization_1 (BatchNormalization), lambda_1 (Lambda), gaussian_noise_1 (GaussianNoise), dense_3 (Dense), dense_4 (Dense).]
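The autoencoder components summarized above (one-hot input, transmitter with normalization, AWGN channel, softmax receiver, cross-entropy loss) can be illustrated with a single forward pass. The following is a minimal sketch in plain Python, not the authors' implementation: it assumes one dense layer per side with random untrained weights, treats x as n real-valued channel uses, indexes messages from 0, and enforces only the energy constraint (a) with equality. All function names are hypothetical.

```python
import math
import random


def noise_variance(k, n, ebn0_db):
    """AWGN variance beta = (2 * R * Eb/N0)^-1 with rate R = k/n."""
    rate = k / n                         # bits per channel use
    ebn0 = 10.0 ** (ebn0_db / 10.0)      # dB -> linear ratio
    return 1.0 / (2.0 * rate * ebn0)


def one_hot(s, M):
    """Message index s in {0, ..., M-1} as a one-hot vector of length M."""
    return [1.0 if i == s else 0.0 for i in range(M)]


def dense(weights, bias, v):
    """Affine layer; weights is a list of rows, one row per output unit."""
    return [sum(w * a for w, a in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def normalize_energy(x):
    """Scale x so that ||x||_2^2 = n (energy constraint met with equality)."""
    n = len(x)
    norm = math.sqrt(sum(v * v for v in x))
    return [v * math.sqrt(n) / norm for v in x]


def awgn(x, variance, rng):
    """Add zero-mean Gaussian noise of the given variance to each channel use."""
    sigma = math.sqrt(variance)
    return [v + rng.gauss(0.0, sigma) for v in x]


def softmax(z):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, p):
    """Categorical cross-entropy between a one-hot target and probabilities p."""
    return -sum(t * math.log(q) for t, q in zip(target, p) if t > 0.0)


def random_layer(n_out, n_in, rng):
    """Hypothetical untrained layer: random weights in [-1, 1], zero bias."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(s, M, n, k, ebn0_db, tx, rx, rng):
    """One forward pass; returns (estimated message index, loss)."""
    target = one_hot(s, M)
    x = normalize_energy(dense(tx[0], tx[1], target))   # transmitter
    y = awgn(x, noise_variance(k, n, ebn0_db), rng)     # non-trainable channel
    p = softmax(dense(rx[0], rx[1], y))                 # receiver
    s_hat = max(range(M), key=p.__getitem__)            # argmax over messages
    return s_hat, cross_entropy(target, p)


if __name__ == "__main__":
    rng = random.Random(0)
    M, n, k = 4, 2, 2                    # k = log2(M) bits over n channel uses
    tx = random_layer(n, M, rng)
    rx = random_layer(M, n, rng)
    print(forward(1, M, n, k, 7.0, tx, rx, rng))
```

Because the weights are untrained, the estimate returned here is essentially arbitrary; training would adjust both layers with one of the listed optimizers so that the loss decreases.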
Fig. 3. Block diagram of the autoencoder used to compare the optimizers.

III. SIMULATION RESULTS AND PERFORMANCE EVALUATION
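The block error rate (BLER) comparisons in this section use uncoded BPSK as a baseline. As a reference point, the sketch below evaluates the standard closed-form BLER of uncoded BPSK over an AWGN channel with independent bit decisions. It assumes the horizontal axis is Eb/N0 in dB, which the paper does not state explicitly; the function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bpsk_bit_error_rate(ebn0_db):
    """Bit error rate of uncoded BPSK over AWGN: Q(sqrt(2 * Eb/N0))."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)      # dB -> linear ratio
    return q_function(math.sqrt(2.0 * ebn0))


def uncoded_bler(ebn0_db, k):
    """Block error rate of k independently detected bits: 1 - (1 - Pb)^k."""
    return 1.0 - (1.0 - bpsk_bit_error_rate(ebn0_db)) ** k


if __name__ == "__main__":
    # Baselines for blocks of 2 and 8 uncoded bits.
    for ebn0_db in range(-4, 9, 2):
        print(f"Eb/N0 = {ebn0_db:3d} dB  "
              f"BLER(2 bits) = {uncoded_bler(ebn0_db, 2):.3e}  "
              f"BLER(8 bits) = {uncoded_bler(ebn0_db, 8):.3e}")
```

A learned (n, k) system can be plotted against `uncoded_bler(ebn0_db, k)` to check whether it matches or improves on the baseline at each operating point.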
The autoencoder uses the data generated for transmission and the same data at the reception point. The autoencoder is considered an unsupervised learning system since the data used is not labeled externally. This concept allows the autoencoder to learn without any prior knowledge. According to [15], an autoencoder can achieve performance equivalent to the Hamming (7, 4) code with maximum likelihood decoding (MLD). The autoencoder achieves the same block error rate (BLER) as uncoded BPSK for a (2, 2) system, and outperforms uncoded BPSK for an (8, 8) system. We have reproduced the latter results and found that the autoencoder learns the coding and modulation scheme by jointly optimizing the cost function for the entire end-to-end model. Optimizing the encoder and decoder together is how we force the autoencoder to extract only the features that are necessary to characterize the input data and to store them in the bottleneck layer (i.e., where the smaller and dense representations are). After the training stage, the autoencoder has learned a heavily tailored compression scheme for the specific communication system. Figure 3 presents a block diagram of the simulated autoencoder architecture used to compare the different optimizers. Figure 4 shows how the loss decreases until converging to almost zero after around 100 epochs. The plot of SNR vs. BLER of our autoencoder (1,2) with different optimizers can be seen in Fig. 5. We identified that our autoencoder, trained with a categorical cross-entropy loss and optimized with Adadelta [16], gives the best performance in terms of BLER over the SNR range. The constellations received by our autoencoder with diverse optimizers are illustrated in Fig. 6. We notice that for some optimizers, e.g., SGD, the constellation points deviate from their ideal positions. This deviation increases the modulation error at the receiver, which agrees with Fig. 5, where SGD tends to diverge when we increase the SNR range.

Fig. 4. The number of epochs against the loss of Adadelta optimization.

IV. CONCLUSIONS AND FUTURE WORK

We have reviewed how deep learning architectures can help in the optimization of communication systems. First, we
discussed how to formulate a transmitter and receiver as an autoencoder for the physical layer. We have used an end-to-end optimization for the reconstruction loss, instead of optimizing the individual blocks of a conventional communication system (e.g., synchronization, symbol estimation, error correction, channel coding, modulation). We showed that this formulation makes it possible to capture channel impairments of single-antenna systems, and can match modulation baselines by just applying off-the-shelf DNNs. Future work in the field might include channel generalization by scaling from a simple AWGN model to more complex real-world channels. This channel generalization might be studied by combining generative RF models with discriminative RF models in an adversarial way to improve both. Additionally, researchers may leverage the theory we know about propagation and physics to propose better impairment models. From the autoencoder side, several learning strategies can be studied, such as different weight initializations, hyperparameter selection, and emerging autoencoder architectures. Moreover, additional autoencoders may be employed to extend this approach to multi-user systems and multiple-antenna systems. It would be interesting to see what new solutions autoencoders offer as we scale systems. Finally, this work may be transferred to specific domains, such as satellite communications, backhaul radios, dense urban wireless, 5G MIMO, etc. There is still a wide body of engineering knowledge that researchers might draw on to take advantage of autoencoders in wireless communications and finally enable a full deep learning-based communication system.

Fig. 5. Signal to noise ratio vs. block error rate for different autoencoder (AE) optimizers.

Fig. 6. Constellations generated by our autoencoder under different parameter optimizers.

ACKNOWLEDGMENT

This work was supported by Kumoh National Institute of Technology (2019-104-155), and by the Technology Development Program (S2508336) funded by the Ministry of SMEs and Startups (MSS, Korea).

REFERENCES

[1] T. L. Marzetta, E. G. Larsson, H. Yang, and H. Q. Ngo, Fundamentals of Massive MIMO, 1st ed. Cambridge, United Kingdom: Cambridge University Press, 2016.
[2] T. O'Shea and J. Hoydis, "An Introduction to Deep Learning for the Physical Layer," IEEE Transactions on Cognitive Communications and Networking, vol. 3, no. 4, pp. 563–575, Dec. 2017.
[3] M. E. Morocho-Cayamcela and W. Lim, "Finding the optimal path for V2V multi-hop connectivity with Q-learning and Convolutional Neural Networks," in The Korean Institute of Communications and Information Sciences Conference 2019 (KICS 2019), Jeju, South Korea, Jun. 2019.
[4] L. Wang and D. T. Delaney, "QoE Oriented Cognitive Network Based on Machine Learning and SDN," in 2019 IEEE 11th International Conference on Communication Software and Networks (ICCSN). IEEE, Jun. 2019, pp. 678–681.
[5] M. E. Morocho-Cayamcela and W. Lim, "Proposed cost function using wireless propagation for self-organizing networks," in The Korean Institute of Communications and Information Sciences Fall Conference 2019 (KICS 2019), Seoul, South Korea, Nov. 2019, pp. 172–174.
[6] M. E. Morocho-Cayamcela and W. Lim, "Artificial Intelligence in 5G Technology: A Survey," in 2018 International Conference on Information and Communication Technology Convergence (ICTC 2018), no. 1. IEEE, Oct. 2018, pp. 860–865.
[7] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, 1st ed., T. Dietterich, Ed. The MIT Press, 2016.
[8] A. Osseiran, J. F. Monserrat, and P. Marsch, 5G Mobile and Wireless Communications Technology, 1st ed. Cambridge University Press, 2017.
[9] M. E. Morocho-Cayamcela, S. R. Angsanto, W. Lim, and A. Caliwag, "An artificially structured step-index metasurface for 10GHz leaky waveguides and antennas," in 2018 IEEE 4th World Forum on Internet of Things (WF-IoT 2018). IEEE, Feb. 2018, pp. 568–573.
[10] M. E. Morocho-Cayamcela, H. Lee, and W. Lim, "Machine Learning for 5G/B5G Mobile and Wireless Communications: Potential, Limitations, and Future Directions," IEEE Access, vol. 7, pp. 137184–137206, Sep. 2019.
[11] C. E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948.
[12] E. Björnson, J. Hoydis, and L. Sanguinetti, Massive MIMO Networks, 1st ed. Pisa, Italy: now Publishers Inc., 2019, vol. 1.
[13] S. Dorner, S. Cammerer, J. Hoydis, and S. ten Brink, "Deep Learning Based Communication Over the Air," IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, pp. 132–143, Feb. 2018.
[14] T. Erpek, T. J. O'Shea, Y. E. Sagduyu, Y. Shi, and T. C. Clancy, "Deep Learning for Wireless Communications," in Development and Analysis of Deep Learning Architectures. Springer, Cham, 2020, pp. 223–266.
[15] T. J. O'Shea, T. Erpek, and T. C. Clancy, "Physical layer deep learning of encodings for the MIMO fading channel," in 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, Oct. 2017, pp. 76–80.
[16] M. D. Zeiler, "ADADELTA: An Adaptive Learning Rate Method," arXiv:1212.5701v1, Dec. 2012.
