Implementation of OTFS Transmitter IEEE FNWF23
4 authors, including:
Kapil Dandekar
Drexel University
All content following this page was uploaded by Murat Can Işık on 08 October 2023.
Abstract—Sixth-generation (6G) communication systems are poised to accommodate high data-rate wireless communication services in highly dynamic channels, with applications including high-speed trains, unmanned aerial vehicles, and intelligent transportation systems. Orthogonal frequency-division multiplexing (OFDM) modulation suffers from performance degradation in such high-mobility applications due to high Doppler spread in the channel. The recently proposed Orthogonal Time Frequency Space (OTFS) modulation scheme outperforms OFDM in terms of supporting a higher transmitter (Tx) and receiver (Rx) user velocity. Additionally, the highly dynamic time-frequency (TF) channel has little effect on OTFS modulated signals, which enables the realization of low-complexity pre-processing architectures for implementing massive multiple-input multiple-output (MIMO) based OTFS systems. However, while OTFS has received attention in the literature from a theory and simulation perspective, there has been comparatively little work on real-time FPGA implementation of OTFS waveforms. Thus, in this paper, we first present a mathematical overview of OTFS modulation and then describe an FPGA implementation of OTFS on hardware. Power, area, and timing analysis of the implemented design on a Zynq UltraScale+ RFSoC FPGA are provided for benchmarking purposes.

† Corresponding Author. All authors are with the Department of Electrical and Computer Engineering, Drexel University. This research is supported by the National Science Foundation under Grants CNS-1828236 and CNS-1816387. Any opinion, findings, and conclusion or recommendations expressed in this paper are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

I. INTRODUCTION

Multipath propagation channels are doubly selective due to time dispersion (frequency selectivity) and Doppler shift (time selectivity) [1]. Both of these phenomena are due to the highly varying nature of the time-frequency (TF) channel. Fourth-generation (4G) and fifth-generation (5G) communication systems address the time-dispersive nature of the channel using orthogonal frequency division multiplexing (OFDM), which mitigates inter-symbol interference (ISI) through the use of a cyclic prefix and longer symbol duration [2], [3]. However, a high-velocity communication scenario produces high Doppler shifts, which destroy the orthogonality of the sub-carriers generated by the OFDM modulator, thus causing inter-carrier interference (ICI) [4]. While OFDM cannot mitigate the effects of high Doppler shifts, the recently proposed orthogonal time frequency space (OTFS) modulation scheme is highly effective in these environments due to its delay-Doppler characteristic [5].

OTFS modulation multiplexes the information in the Delay-Doppler (DD) domain in contrast to the TF domain used in OFDM modulation. This is advantageous as the DD domain representation of any rapidly varying TF channel is both slowly varying and sparse in nature [6]. Consequently, modulating in the DD domain allows the received signal to interact with a DD channel that exhibits both sparsity and slow variability. This interaction improves the performance in terms of throughput and error rate, compared to a signal modulated in the TF domain that must deal with a rapidly changing TF channel. OTFS modulation has shown significant improvement in performance compared to OFDM modulation in high-Doppler scenarios where the user velocity was as high as 500 km/h in the 4 GHz band [5]. Considering mmWave communication, OTFS outperformed OFDM in the 28 GHz band at a user velocity of 40 km/h [7], [8]. However, these results were obtained using software simulation alone.

The ability of Field Programmable Gate Arrays (FPGAs) to implement complex digital signal processing algorithms in real time is making them increasingly popular in wireless communication systems. FPGAs are superior to traditional digital signal processors (DSPs) in wireless communication applications due to their high parallelism, flexibility, power efficiency, and speed. A complex modulation scheme like OFDM requires real-time processing of large amounts of data. The high parallelism and low latency of FPGAs make them ideal for processing OFDM signals. Several processing blocks are required for OFDM, including the fast Fourier transform (FFT), inverse fast Fourier transform (IFFT), modulation, demodulation, and channel estimation. Various wireless communication systems can be easily prototyped by modifying IP cores used by FPGAs to implement these processing blocks. FPGAs provide real-time processing of data, which is essential for wireless communication systems to provide reliable and efficient communication [9]–[12]. All of these characteristics also make FPGAs ideal for prototyping OTFS systems prior to standardization.

We propose that FPGA implementation would be a powerful solution for evaluating wireless communication modulation schemes such as OTFS. The high parallelism, high flexibility, low power consumption, and high processing speed of FPGAs make them an ideal platform for the processing of large amounts of data required by wireless communication systems. The use of customizable IP cores in FPGA-based implementations of OTFS provides a high degree of flexibility and allows the designer to optimize the design for specific system requirements, and to prototype different variations of the waveform to help inform the design of future 6G standards.
II. RELATED WORKS

This section summarizes recent literature on OTFS modulation [...] show that OTFS has a lower bit-error rate (BER) than OFDM in a number of varying case scenarios. Chockalingam et al. [14] described a low-complexity detection scheme based on Markov chain Monte Carlo (MCMC) sampling techniques and a Pseudo-Noise (PN) graph sequence-based channel estimation technique for the DD domain.

There is comparatively little research on real-time FPGA hardware implementations of OTFS waveforms. References [15], [16] report software-defined radio (SDR) implementations of an OTFS modem. [15] studied the performance of OTFS and OFDM modulation systems in real indoor wireless channel scenarios. [16] presents the performance of an OTFS system, where the received signal extracted from a 60 GHz millimeter wave carrier frequency is passed through a Linear Minimum Mean Square Error (LMMSE) equalizer.

In a study by [17], a novel VLSI architecture for OTFS modulation was proposed for high-speed vehicular communication scenarios. The authors presented a report on resource utilization for implementation of OTFS on an FPGA board, as well as a demonstration of the input-output relationship of an OTFS signal on a single-input, single-output channel under additive white Gaussian noise conditions. A CORDIC processor was used for non-linear function generation (sin(θ), cos(θ), etc.) for designing the transmitter and receiver. This design was implemented on a Xilinx Zynq-7 FPGA board, thus demonstrating a power-efficient approach. Our paper uniquely provides a detailed performance analysis, including power, area, and timing, which were previously unexplored in hardware implementations of OTFS, thus contributing to the advancement of practical applications for 6G communications systems.

More recently [18], a low-complexity implementation of an OTFS transmitter was proposed using a fully parallel and pipelined hardware architecture. FFTs and IFFTs were parallelized and deeply pipelined on an FPGA to accelerate OTFS execution, resulting in high accuracy and performance. The authors also proposed an optimized OTFS transmitter architecture with a modified Booth multiplier and memory, which needed fewer hardware resources while providing higher performance. The OTFS hardware architecture achieved a bandwidth of 196.67 Tbps at 139.64 MHz maximum operating frequency, thus making it suitable for future 5G and 6G wireless communications standards. Furthermore, the optimized hardware architecture reduced the LUTs on the Virtex-7 FPGA board by around 20% compared to a conventional OTFS transmitter.

To the best of our knowledge, and contrary to the previous works on OTFS which mostly focused on theoretical explanation and software simulation, this paper is one of the first to present OTFS implementation on an FPGA board.

III. OTFS MODULATION

OTFS modulation can be implemented as an extension to the existing OFDM modulation framework for 4G communication systems [6]. An OTFS transceiver system starts with the constellation mapping of the information bits on a discretized DD plane. The data symbols in the DD domain are then converted to the TF domain symbols using the two-dimensional (2D) Inverse Symplectic Finite Fourier transform (ISFFT) at the transmitter side. This is followed by the Heisenberg transform, which converts the TF symbols to a time-domain signal. This signal is then pulse-shaped with a suitable window and is sent over the channel. The transmitted OTFS frame also includes additional pilot symbols for channel estimation. In the receiver, the received time-domain signal is converted back to the symbols in the DD domain with the help of the Wigner transform, followed by an SFFT operation on the received signal. Fig. 1 provides a block diagram of the OTFS system.

Fig. 1. Block diagram of an OTFS transceiver system (blocks: information bits, constellation mapping, ISFFT and transmit windowing, Heisenberg transform).

Fig. 2. TF and DD plane (the DD grid Ω_DD with delay and Doppler axes and the TF grid Σ_TF with time and frequency axes, linked by the 2D SFFT and 2D ISFFT).

The mathematical representations of the above process are shown below:
• Discretized DD and TF grids are represented as:

$$\Omega_{DD} = \{(p\Delta\nu,\, q\Delta\tau),\; p = 0, \ldots, N-1;\; q = 0, \ldots, M-1\}$$

$$\Sigma_{TF} = \{(nT,\, m\Delta f),\; n = 0, \ldots, N-1;\; m = 0, \ldots, M-1\}$$

where

$$\Delta\nu = \frac{1}{NT}, \qquad \Delta\tau = \frac{1}{M\Delta f}$$

The discretized grids are shown in Fig. 2.
• The input bit stream is converted to its respective information symbols in the IQ plane and then mapped to the
2D DD grid $\Omega_{DD}$. The resultant data frame is represented by $x[p, q]$, having a dimension of $N \times M$. $x[p, q]$ resides in the DD domain.
• Next, the ISFFT operation in conjunction with a transmit windowing function maps the data frame in the DD domain to its equivalent TF domain $\Sigma_{TF}$, represented by $X_\rho[n, m]$:

$$X_\rho[n, m] = \frac{1}{\sqrt{NM}} \sum_{p=0}^{N-1} \sum_{q=0}^{M-1} x_\rho[p, q]\, e^{j2\pi\left(\frac{np}{N} - \frac{mq}{M}\right)}$$

$$X[n, m] = W_{tx}[n, m] \cdot X_\rho[n, m]$$

where $W_{tx}[n, m]$ is the square summable transmit windowing function.
• In the final step, the TF data frame is converted to a continuous time waveform for transmission using the Heisenberg transform:

$$s(t) = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} X[n, m]\, e^{j2\pi m\Delta f (t - nT)}\, g_{tx}(t - nT)$$

where $g_{tx}$ is the transmit pulse.

Fig. 3. Top Hardware Architecture (blocks: PRN Generator, QAM Modulator, Array Reshaping, ISFFT, Heisenberg transform, Wigner transform, SFFT, QAM Demodulator, Output).

Fig. 4. LFSR Structure
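The grid definitions, ISFFT, windowing, and Heisenberg steps of Section III can be illustrated with a short Python sketch. It is not the FPGA design: it evaluates the sums directly rather than using FFT cores, and it rests on assumptions that the text does not state, namely a unit-amplitude rectangular transmit pulse on [0, T), critical sampling with T·Δf = 1 (M samples per symbol), and an all-ones transmit window when none is given. The function names are hypothetical.

```python
import cmath
import math


def grid_resolution(N, M, T, delta_f):
    """Return (delta_nu, delta_tau) = (1/(N*T), 1/(M*delta_f))."""
    return 1.0 / (N * T), 1.0 / (M * delta_f)


def isfft(x_dd):
    """Direct evaluation of the ISFFT double sum; x_dd[p][q] is N x M."""
    N, M = len(x_dd), len(x_dd[0])
    scale = 1.0 / math.sqrt(N * M)
    return [[scale * sum(x_dd[p][q]
                         * cmath.exp(2j * math.pi * (n * p / N - m * q / M))
                         for p in range(N) for q in range(M))
             for m in range(M)]
            for n in range(N)]


def heisenberg_rect(X_tf):
    """Heisenberg transform sampled at t = n*T + k*T/M, k = 0..M-1.

    Assumption: rectangular g_tx on [0, T) and T * delta_f = 1, so only
    symbol n contributes and the sample is sum_m X[n][m]*exp(j*2*pi*m*k/M).
    """
    N, M = len(X_tf), len(X_tf[0])
    return [sum(X_tf[n][m] * cmath.exp(2j * math.pi * m * k / M)
                for m in range(M))
            for n in range(N) for k in range(M)]


def otfs_modulate(x_dd, w_tx=None):
    """DD symbols -> windowed TF symbols -> N*M time-domain samples."""
    X_rho = isfft(x_dd)
    N, M = len(X_rho), len(X_rho[0])
    if w_tx is None:
        # All-ones window: an assumption made for this sketch only.
        w_tx = [[1.0] * M for _ in range(N)]
    X = [[w_tx[n][m] * X_rho[n][m] for m in range(M)] for n in range(N)]
    return heisenberg_rect(X)


if __name__ == "__main__":
    # Small QPSK-like frame on a 2 x 4 DD grid (values are illustrative).
    frame = [[1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j],
             [-1 - 1j, 1 + 1j, 1 - 1j, -1 + 1j]]
    for sample in otfs_modulate(frame):
        print(f"{sample.real:+.4f} {sample.imag:+.4f}j")
```

The direct sums cost O(N²M²) operations and are only practical for small grids; they are used here because each term can be checked against the equations. A hardware or FFT-based implementation would compute the same quantities with N-point and M-point transforms.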
IV. FPGA IMPLEMENTATION

The architecture design implements the various processes in parallel, thus consuming less processing time. Xilinx Intellectual Property (IP) cores and custom modules are used in the implementation that is shown in Fig. 3. Modules are defined using VHDL based on the AXI Interface. The modules in the design use a 12-bit data configuration for the real and imaginary parts. A Xilinx FFT core is used to improve the efficiency and speed of the FFT and IFFT blocks used inside the ISFFT module, which has a maximum instantaneous frequency of 250 MHz and can operate up to 50 frames/sec. This architecture has six parts, described in the following sub-sections:
• Random-Bit Generator
• QAM Modulator
• Array reshaping
• ISFFT and Heisenberg
• Wigner and SFFT
• QAM Demodulator

A. Random-Bit Generator

The generation of random bits is a fundamental requirement for testing various communication and information processing systems. In this regard, a simple and efficient technique for generating pseudo-random binary sequences (PRBS) is the use of a linear feedback shift register (LFSR). In this work, a 16-bit LFSR is utilized to generate an 8192-bit PRBS. The feedback loop for the LFSR is implemented using a simple configuration, as shown in Fig. 4. The LFSR is configured as a 16-bit shift register, and the feedback polynomial is defined as follows:

$$f(x) = 1 + x^{11} + x^{13} + x^{14} + x^{16} \tag{1}$$

The LFSR is initialized with all 1's, and the output bit is generated by XORing the first, third, fourth, and sixth bits of the shift register. The generated bit is output and the shift register is updated for each clock cycle.

The code also includes control logic to set the number of random bits to be generated based on the desired modulation order. The number of bits is computed as follows:
• For M = 4, 8192 bits are generated (4096 × 2)
• For M = 8, 12288 bits are generated (4096 × 3)
• For M = 16, 16384 bits are generated (4096 × 4)
• For M = 32, 20480 bits are generated (4096 × 5)

After generating the required number of random bits, the control logic sets the generator to idle mode. The generated random bits were analyzed for their auto-correlation function, which is depicted in Fig. 5. This figure illustrates the degree of correlation between consecutive bits. A lower auto-correlation implies a sequence with characteristics that resemble randomness. It is worth noting, however, that while the 16-bit LFSR-generated sequence meets the requirements of this study, its predictability characteristics may vary depending on the application context.

The generated random bit sequence is evaluated in both MATLAB and VHDL environments. The auto-correlation analysis of the 16-bit LFSR-generated bit sequence demonstrates that it closely resembles white noise, indicating that the generated sequence is statistically random. Moreover, a VHDL testbench is designed to compare the performance of the PRBS generator implemented in VHDL with that of the PRBS generated in MATLAB. It is observed that the generated pseudo-random bit sequences in both environments are virtually identical.

LFSR has been widely adopted for generating pseudorandom bit sequences due to its simplicity and efficiency [19], [...]
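As an illustration of the random-bit generator described in this subsection, the following Python sketch models a 16-bit LFSR with the taps of Eq. (1), the all-ones initial state, and the bit count per modulation order. It is a software model under stated assumptions, not the authors' VHDL: the bit numbering (least significant bit counted as the first bit, register shifted right) is assumed, the factor 4096 is interpreted as the number of symbols per frame, and all function names are hypothetical.

```python
def lfsr16_step(state):
    """Advance the 16-bit LFSR by one clock; return (output_bit, new_state)."""
    # XOR of the first, third, fourth and sixth register bits, counting the
    # least significant bit as the first (assumed numbering). These taps
    # correspond to f(x) = 1 + x^11 + x^13 + x^14 + x^16.
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return bit, (state >> 1) | (bit << 15)


def lfsr16_bits(count, state=0xFFFF):
    """Return `count` output bits, starting from the all-ones state."""
    bits = []
    for _ in range(count):
        bit, state = lfsr16_step(state)
        bits.append(bit)
    return bits


def bits_for_order(mod_order, symbols=4096):
    """Bits needed for `symbols` QAM symbols: symbols * log2(mod_order)."""
    if mod_order < 2 or mod_order & (mod_order - 1):
        raise ValueError("modulation order must be a power of two")
    return symbols * (mod_order.bit_length() - 1)


def autocorrelation(bits, lag):
    """Circular autocorrelation of the bits mapped to +1/-1, scaled to 1 at lag 0."""
    n = len(bits)
    signs = [1 - 2 * b for b in bits]
    return sum(signs[i] * signs[(i + lag) % n] for i in range(n)) / n


if __name__ == "__main__":
    prbs = lfsr16_bits(bits_for_order(4))  # 8192 bits for M = 4
    print(len(prbs), autocorrelation(prbs, 0), autocorrelation(prbs, 1))
```

Because the output bit is also the bit shifted into the register, the output is a delayed copy of the register contents. A maximal-length 16-bit LFSR repeats after 65535 clocks, so an 8192-bit frame uses only part of one period.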
constellation's complex symbols are represented by floating-point numbers, with the real and imaginary parts stored as separate values. These floating-point values are converted to fixed-point notation with a 10-bit fractional resolution, resulting in 12-bit signed numbers in two's complement 2.10 format. For instance, the fixed-point value 971 corresponds to 971/1024, which is approximately 0.948 in decimal notation.
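The conversion to the signed 2.10 format can be sketched in Python as follows. This is a minimal model, not the authors' implementation: round-to-nearest and saturation on overflow are assumptions (the text does not state the rounding or overflow behaviour), and the function names are hypothetical.

```python
FRAC_BITS = 10           # fractional bits of the 2.10 format
WORD_BITS = 12           # total word length, sign bit included
SCALE = 1 << FRAC_BITS   # 1024


def to_q2_10(value):
    """Quantise a float to a 12-bit two's complement 2.10 code (0..4095)."""
    raw = round(value * SCALE)                 # round to nearest (assumed)
    lo = -(1 << (WORD_BITS - 1))               # -2048
    hi = (1 << (WORD_BITS - 1)) - 1            # +2047
    raw = max(lo, min(hi, raw))                # saturate (assumed)
    return raw & ((1 << WORD_BITS) - 1)        # two's complement bit pattern


def from_q2_10(code):
    """Convert a 12-bit 2.10 code back to a float."""
    if code >= 1 << (WORD_BITS - 1):           # sign bit set: negative value
        code -= 1 << WORD_BITS
    return code / SCALE


def quantise_symbol(symbol):
    """Quantise the real and imaginary parts of a complex symbol separately."""
    return to_q2_10(symbol.real), to_q2_10(symbol.imag)


if __name__ == "__main__":
    code = to_q2_10(0.948)
    print(code, from_q2_10(code))
```

With these assumptions, `to_q2_10(0.948)` returns 971, matching the example in the text, and the representable range is from -2 to 2 - 1/1024 in steps of 1/1024.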
Fig. 8. Error difference between VHDL and MATLAB implementation of the modulated signal.

Fig. 9. Error difference between VHDL and MATLAB implementation of the demodulated signal.
[16] R. Marsalek, J. Blumenstein, D. Schützenhöfer, and M. Pospisil, “OTFS modulation and influence of wideband RF impairments measured on a 60 GHz testbed,” in 2020 IEEE 21st International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), IEEE, 2020, pp. 1–5.
[17] A. R. Shadangi, S. S. Das, and I. Chakrabarti, “VLSI architecture for implementing OTFS,” 2023.
[18] S. K. Dora, H. B. Mishra, and M. Sahoo, “Low complexity implementation of OTFS transmitter using fully parallel and pipelined hardware architecture,” Journal of Signal Processing Systems, pp. 1–10, 2023.
[19] W. Payne, “Pseudorandom numbers for mini-and microcomputers: A generalized feedback shift register algorithm,” Behavior Research Methods & Instrumentation, vol. 5, no. 2, pp. 93–98, 1973.
[20] R. Stępień and J. Walczak, “Application of the DLFSR generators in spread spectrum communication,” in Proceedings of the 19th International Conference Mixed Design of Integrated Circuits and Systems-MIXDES 2012, IEEE, 2012, pp. 555–558.
[21] A. K. Panda, P. Rajput, and B. Shukla, “Fpga implementation of 8, 16 and 32 bit LFSR with maximum length feedback polynomial using VHDL,” in 2012 International Conference on Communication Systems and Network Technologies, IEEE, 2012, pp. 769–773.
[22] Xilinx, Inc., System generator for DSP reference guide, 14th ed., Oct. 2012.