Hardware Implementation of OTFS Modulation Using CORDIC Algorithm


Sai Kumar Dora, Himanshu B. Mishra, Manodipan Sahoo and Kapil Yadav
2024 International Conference on Signal Processing and Communications (SPCOM) | 979-8-3503-5045-6/24/$31.00 ©2024 IEEE | DOI: 10.1109/SPCOM60851.2024.10631593

Authorized licensed use limited to: Indian Institute of Technology (ISM) Dhanbad. Downloaded on October 22, 2024 at 10:37:01 UTC from IEEE Xplore. Restrictions apply.

Abstract: According to the existing literature, an Orthogonal Time Frequency Space (OTFS) transceiver can be implemented using two different methods: i) the two-step approach, which uses the inverse symplectic fast Fourier transform (ISFFT) and the Heisenberg transform at the transmitter, along with the SFFT and the Wigner transform at the receiver, and ii) the direct approach, which uses the inverse Zak (IZak) transform at the transmitter and the Zak transform at the receiver. In this work, to expedite the implementation of the OTFS transceiver for real-time wireless communication, we use field programmable gate array (FPGA) technology and the acceleration it provides. While implementing the two approaches, we employ the coordinate rotation digital computer (CORDIC) algorithm, as it is flexible and requires minimal area. We compare the hardware performance of the two approaches in terms of resource utilization, timing, and power by implementing them on the 7a200tiffg1156-1L FPGA board. We observe that the direct approach exhibits significant improvements over the two-step approach in the OTFS transceiver design: Look-Up Table (LUT) utilization drops by 47.57 percentage points (from 67.69% to 20.12%) and Flip-Flop (FF) utilization drops by 17.63 percentage points (from 27.03% to 9.40%).

I. INTRODUCTION

OTFS is a recently developed waveform scheme that mitigates the adverse effects of a highly time-varying channel by leveraging the quasi-static nature of the delay-Doppler (DD) channel. OTFS therefore transmits DD symbols, which requires a transformation from the DD domain to the time domain at the transmitter and from the time domain to the DD domain at the receiver. According to the existing literature [1], [2], the OTFS transceiver can be developed using two distinct approaches: i) the two-step approach, which involves the ISFFT and the Heisenberg transform at the transmitter, along with the SFFT and the Wigner transform at the receiver, and ii) the direct approach, which uses the IZak transform at the transmitter and the Zak transform at the receiver.

To measure the performance of OTFS in real-time wireless systems, T. Thaj et al. [3] implemented an OTFS modem on a software-defined radio (SDR) board and evaluated its real-time performance. A. Abushattal et al. [4] investigated RF impairments on OTFS and orthogonal frequency division multiplexing (OFDM) waveforms in a real indoor environment, demonstrating the superior performance of OTFS. The authors in [5] focused on the FPGA implementation of an OTFS transceiver using the pipelined CORDIC algorithm with an input size of 16 × 16. The authors in [6] recently explored the FPGA implementation of the OTFS transceiver using internally customized IP cores. Their study underscores the advantages of FPGAs as a platform for extensive data processing in wireless communication systems. It is worth noting that the internal IP core is specifically designed to operate with XILINX FPGAs and SoCs, implying potential incompatibility with other hardware platforms. This limitation could reduce flexibility in system design. Moreover, the authors in [7] focused on implementing the OTFS-based wireless transmitter using the Radix-2 Multi-path Delay Commutator (R2MDC) method. There, a conventional hardware architecture was extended to a parallel and deeply pipelined hardware architecture for FFTs/IFFTs on an FPGA. This design aimed at accelerating OTFS execution with high accuracy and maximum throughput. Furthermore, the authors in [7] proposed an optimized OTFS transmitter architecture integrating a modified Booth multiplier and memory. This optimized architecture not only demands few hardware resources but also delivers high throughput.

The aforementioned literature considers only the two-step approach when implementing the OTFS transceiver on hardware platforms such as SDR/FPGA. To the best of the authors' knowledge, no literature exists on the hardware implementation of the direct-approach OTFS transceiver. In this work, we develop VLSI architectures of the OTFS transceiver for both the two-step and the direct approaches. In contrast to the existing work, we further implement these architectures on an FPGA using the CORDIC algorithm, which is known for its efficiency. Here, CORDIC simplifies complex computations through a series of shift-and-add operations, making it well suited to hardware implementation. Through simulation and synthesis, we then compare the hardware performance of the two approaches in terms of resource utilization, timing, and power.

II. SYSTEM MODEL

An OTFS transmitter and receiver can be implemented through two methods: the two-step approach and the direct approach, as shown in Fig. 1 and Fig. 2, respectively. The following discussion details the two approaches.

A. OTFS Transmitter Section

1) Two-step approach: In the two-step approach, the transmitter comprises the ISFFT and the Heisenberg transform, described as follows. Given a set of $NM$ information symbols $\{d[k, l],\ k = 0, \ldots, N-1,\ l = 0, \ldots, M-1\}$ drawn from a modulation alphabet $A = \{a_1, \ldots, a_Q\}$ (e.g., QAM symbols) and arranged on the DD grid, the OTFS transmitter employs the ISFFT to map the symbols $D[k, l]$ to $NM$ samples $X[n, m]$ on the time-frequency grid. This mapping is given by

$$X[n, m] = \frac{1}{\sqrt{NM}} \sum_{k=0}^{N-1} \sum_{l=0}^{M-1} D[k, l]\, e^{j2\pi\left(\frac{nk}{N} - \frac{ml}{M}\right)}, \tag{1}$$

where $n = 0, \ldots, N-1$ and $m = 0, \ldots, M-1$. Subsequently, a time-frequency modulator transforms the samples $X[n, m]$ into a continuous-time waveform $s(t)$ using the Heisenberg transform:

$$s(t) = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} X[n, m]\, g_{tx}(t - nT)\, e^{j2\pi m \Delta f (t - nT)}, \tag{2}$$

where $g_{tx}(t)$ is the transmit windowing function. The Heisenberg transform can be viewed as a 2D IFFT parameterized by the window $g_{tx}(t)$ [1].

The two-step operation at the transmitter can be expressed concisely in matrix form. Specifically, the ISFFT at the transmitter can be represented as

$$X = \mathrm{ISFFT}(D) = F_M D F_N^H. \tag{3}$$

Here, $D \in \mathbb{C}^{M \times N}$ and $X \in \mathbb{C}^{M \times N}$ represent the data matrix in the DD domain and in the time-frequency domain, respectively, and $F_M$ and $F_N^H$ are the matrices of the $M$-point FFT and the $N$-point IFFT, respectively. The ISFFT comprises $N$ $M$-point FFTs and $M$ $N$-point IFFTs.

Next, the Heisenberg transform can be represented in matrix form as

$$S = G_{tx} F_M^H X. \tag{4}$$

Here, the Heisenberg transform comprises $N$ $M$-point IFFTs. For a rectangular waveform, the transmit waveform matrix is $G_{tx} = I_M$, where $I_M$ is the identity matrix of size $M \times M$. Column-wise vectorization of the $M \times N$ matrix $S$ yields the $MN \times 1$ vector $s = \mathrm{vec}(S)$.

2) Direct approach: The direct approach is an alternative method for implementing the OTFS transmitter that employs only the IZak transform. The conversion of the $M \times N$ DD-domain symbols $D$ to the $MN \times 1$ time-domain signal $s$ is accomplished using the IZak transform instead of the two-stage transform (ISFFT and Heisenberg transform). This transformation is expressed as

$$s[k + nM] = \frac{1}{\sqrt{N}} \sum_{l=0}^{N-1} D[k, l]\, e^{j2\pi \frac{ln}{N}}, \tag{5}$$

where $k = 0, \ldots, M-1$. We can represent (5) in matrix form as

$$S = \mathrm{IZAK}(D) = D F_N^H, \tag{6}$$

where $s = \mathrm{vec}(S)$. Here, $M$ $N$-point IFFTs are used, and the IFFT is applied to the rows of $D$. Notably, the output of the IZak transform is exactly the same as that of the two-step approach.

B. Receiver Section

Since our primary focus in this work is the hardware implementation of the transforms (ISFFT, Heisenberg transform, IZak transform, SFFT, Wigner transform, and Zak transform), we ignore the effect of the channel and noise on the received signal. We therefore take the received signal to be the same as the transmitted signal $s(t)$.

1) Two-step approach: In the two-step approach, the receiver consists of the Wigner transform followed by the SFFT. The Wigner transform converts the received time-domain signal $s(t)$ into the time-frequency domain signal $W[n, m]$, defined as

$$W[n, m] = \int g_{rx}(t - \tau)\, s(t)\, e^{-j2\pi\nu(t - \tau)}\, dt, \tag{7}$$

where $n$ and $m$ are the indices for time and frequency, respectively. Here, $\tau = nT$, $\nu = m\Delta f$, and $g_{rx}(t)$ is the receiver windowing function. The next block in the receiver is the SFFT, which converts the time-frequency domain signal to the DD domain:

$$Y[k, l] = \frac{1}{\sqrt{NM}} \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} W[n, m]\, e^{-j2\pi\left(\frac{nk}{N} - \frac{ml}{M}\right)}, \tag{8}$$

where $Y[k, l]$ denotes the received signal at the $k$th Doppler and $l$th delay.

Next, we can represent the Wigner transform in (7) in matrix form as

$$W = F_M G_{rx} S, \tag{9}$$

where, assuming a rectangular windowing function, $G_{rx} = I_M$. The Wigner transform can thus be computed using a 2D FFT in which the $M$-point FFT modules are used $N$ times. Furthermore, the matrix form of the SFFT operation in (8) is

$$Y = \mathrm{SFFT}(W) = F_M^H W F_N, \tag{10}$$

where $Y \in \mathbb{C}^{M \times N}$ is the output of the SFFT block.

2) Direct approach: In the direct approach, the receiver employs the Zak transform instead of the Wigner transform and the SFFT. This approach converts the 1D time-domain signal $s \in \mathbb{C}^{MN \times 1}$ into a 2D DD-domain signal $Y \in \mathbb{C}^{M \times N}$:

$$Y[k, l] = \frac{1}{\sqrt{N}} \sum_{m=0}^{N-1} s[k + mM]\, e^{-j2\pi \frac{l}{N} m}. \tag{11}$$

Equation (11) can be represented in matrix form as

$$Y = \mathrm{ZAK}(s) = S F_N. \tag{12}$$

Here, $s = \mathrm{vec}(S)$. One can notice from (12) that $Y$ is obtained by applying the $N$-point FFT along the rows of $S$. In the next section, we present the VLSI architectures of the OTFS transmitter and receiver using both the two-step and the direct approaches.
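The equivalence between the two-step and direct transmitters stated in Section II-A, and the matching result for the receivers, can be checked directly from the matrix forms (3), (4), (6), (9), (10) and (12). The derivation below is a sketch under the paper's rectangular-window assumption ($G_{tx} = G_{rx} = I_M$), plus one assumption that the text does not state explicitly: that $F_M$ and $F_N$ are normalised (unitary) DFT matrices, so that $F_M^H F_M = I_M$ and $F_N^H F_N = I_N$.

```latex
\begin{align*}
S_{\text{two-step}} &= G_{tx} F_M^H X = F_M^H \left(F_M D F_N^H\right) = D F_N^H = \mathrm{IZAK}(D), \\
Y_{\text{two-step}} &= F_M^H W F_N = F_M^H \left(F_M G_{rx} S\right) F_N = S F_N = \mathrm{ZAK}(s), \\
Y &= S F_N = D F_N^H F_N = D .
\end{align*}
```

Under these assumptions the $M$-point FFT/IFFT pair cancels in both the transmitter and the receiver, which is why the direct approach needs only the $N$-point transforms.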

Fig. 1. M × N OTFS transmitter and receiver implementation using two step approach method.
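As a numerical companion to the two block diagrams, the sketch below runs the two-step chain of Fig. 1 and the direct chain of Fig. 2 on the same 8 × 8 data matrix and checks that they agree. It is a floating-point illustration only, not the paper's fixed-point FPGA design. The helper names (`dft_matrix`, `matmul`, `two_step_tx`, and so on), the unitary DFT scaling, and the QPSK test pattern are assumptions introduced here for illustration.

```python
import cmath
import math


def dft_matrix(n, inverse=False):
    """Unitary n-point DFT matrix F, or its Hermitian F^H when inverse=True."""
    sign = 1 if inverse else -1
    scale = 1 / math.sqrt(n)
    return [[scale * cmath.exp(sign * 2j * math.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def two_step_tx(d, m, n):
    """ISFFT (3) followed by the Heisenberg transform (4), assuming G_tx = I_M."""
    x = matmul(matmul(dft_matrix(m), d), dft_matrix(n, inverse=True))
    return matmul(dft_matrix(m, inverse=True), x)


def direct_tx(d, n):
    """Inverse Zak transform (6): N-point IFFT along the rows of D."""
    return matmul(d, dft_matrix(n, inverse=True))


def two_step_rx(s, m, n):
    """Wigner transform (9), assuming G_rx = I_M, followed by the SFFT (10)."""
    w = matmul(dft_matrix(m), s)
    return matmul(matmul(dft_matrix(m, inverse=True), w), dft_matrix(n))


def direct_rx(s, n):
    """Zak transform (12): N-point FFT along the rows of S."""
    return matmul(s, dft_matrix(n))


def max_abs_diff(a, b):
    """Largest element-wise magnitude difference between two matrices."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


M = N = 8
QPSK = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
# Deterministic (hypothetical) test pattern of 64 DD-domain symbols.
D = [[QPSK[(3 * r + 5 * c) % 4] for c in range(N)] for r in range(M)]

S_two_step = two_step_tx(D, M, N)
S_direct = direct_tx(D, N)
Y_two_step = two_step_rx(S_two_step, M, N)
Y_direct = direct_rx(S_direct, N)

print("transmitters agree:", max_abs_diff(S_two_step, S_direct) < 1e-9)
print("two-step receiver recovers D:", max_abs_diff(Y_two_step, D) < 1e-9)
print("direct receiver recovers D:", max_abs_diff(Y_direct, D) < 1e-9)
```

With an ideal channel, as assumed in Section II-B, both receivers should return the transmitted DD symbols up to floating-point rounding.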

Fig. 2. M × N OTFS transmitter and receiver implementation using direct approach method.
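Both block diagrams reduce to banks of FFT/IFFT modules, and Section III replaces their twiddle-factor multipliers with CORDIC rotations. The sketch below illustrates that shift-and-add rotation in floating point. It is not the authors' RTL. The function name `cordic_rotate`, the default of 16 iterations, and the standard rotation-mode rule for choosing d(n) from the sign of the residual angle are assumptions made here; a hardware version would use fixed-point registers and right shifts in place of the multiplications by 2^-n.

```python
import math


def cordic_rotate(x, y, theta, iterations=16):
    """Rotate (x, y) by theta radians (|theta| up to about 1.74) with CORDIC micro-rotations."""
    # Elementary angles theta_n = atan(2^-n); hardware would keep these in a small table.
    angles = [math.atan(2.0 ** -n) for n in range(iterations)]
    # Product of cos(atan(2^-n)) terms (about 0.6073); its inverse is the CORDIC gain.
    scale = 1.0
    for n in range(iterations):
        scale *= 1.0 / math.sqrt(1.0 + 2.0 ** (-2 * n))
    z = theta  # residual angle still to be rotated
    for n in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # assumed rotation-mode rule: follow the sign of z
        # Tuple assignment uses the old x and y on the right-hand side.
        x, y = x - d * y * 2.0 ** -n, y + d * x * 2.0 ** -n
        z -= d * angles[n]
    # Apply the angle-independent scale factor once at the end.
    return x * scale, y * scale, scale


# Example: the 8-point twiddle factor W_8^1 = exp(-j*pi/4) applied to 1 + j0.
theta = -math.pi / 4
xr, yr, scale = cordic_rotate(1.0, 0.0, theta)
print("CORDIC   :", xr, yr)
print("reference:", math.cos(theta), math.sin(theta))
print("scale factor:", scale, "gain:", 1.0 / scale)
```

Because the scale factor does not depend on the angle, it can be applied once after all micro-rotations. Angles outside the convergence range (for example the twiddle angle of -3π/4) would need a quadrant pre-rotation, which this sketch omits.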

III. ARCHITECTURE FOR OTFS TRANSMITTER AND RECEIVER USING CORDIC ALGORITHM

In this section, we describe the hardware architectures for the OTFS transmitter and receiver. The OTFS system involves FFT and IFFT algorithms, which are implemented in this work using the CORDIC algorithm. The following subsections present i) the CORDIC algorithm in detail, ii) the application of CORDIC in the FFT algorithm, and iii) the VLSI architectures of OTFS using CORDIC for both the two-step and the direct approaches.

A. CORDIC Algorithm

The CORDIC algorithm, known for its iterative arithmetic computing capabilities, evaluates elementary functions through efficient shift-and-add operations. In the context of FFT processing, a processor based on the CORDIC algorithm is implemented. A key innovation in the FFT/IFFT architecture is the replacement of sine and cosine twiddle factors with iterative CORDIC rotations, which reduces ROM usage. The integration of CORDIC in the OTFS transceiver not only eliminates the need for multipliers but also saves area and power, which in turn reduces the cost of VLSI integrated chips (ICs). The versatility of CORDIC extends to various applications, providing a simpler method for computing complex multiplications. Furthermore, empirical evidence supports the assertion that CORDIC is a highly suitable alternative in the OTFS transceiver.

CORDIC addresses the problem of rotating a complex number $x + jy$ to a new position $x' + jy'$ by an angle $\theta$. The rotation can be expressed as

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$$

This can be rewritten as

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \cos(\theta) \begin{bmatrix} 1 & -\tan(\theta) \\ \tan(\theta) & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$$

Jack E. Volder proposed decomposing $\theta$ into a discrete set of angles $(\theta_0, \theta_1, \ldots, \theta_n, \ldots, \theta_{N-1})$, where $\theta_n = \tan^{-1}(2^{-n})$. This leads to the iterative equation

$$\begin{bmatrix} x'_n \\ y'_n \end{bmatrix} = \cos(\tan^{-1}(2^{-n})) \begin{bmatrix} 1 & -d(n)2^{-n} \\ d(n)2^{-n} & 1 \end{bmatrix} \begin{bmatrix} x'_{n-1} \\ y'_{n-1} \end{bmatrix},$$

where $\cos(\tan^{-1}(2^{-n})) = \frac{1}{\sqrt{1 + 2^{-2n}}}$ and $d(n) = \pm 1$. The rotated complex number $x' + jy'$ is then evaluated as

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \prod_{n=0}^{N-1} \cos(\tan^{-1}(2^{-n})) \begin{bmatrix} 1 & -d(n)2^{-n} \\ d(n)2^{-n} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.$$

Here, $N$ is the number of iterations, typically equal to the register size for an acceptable Signal-to-Quantization Noise Ratio (SQNR). As $N \to \infty$, the product term $\prod_{n=0}^{N-1} \cos(\tan^{-1}(2^{-n}))$ approaches 0.6073. The multiplicative inverse of 0.6073, which is approximately 1.647, is known as the CORDIC gain. Importantly, this gain is angle-independent, which simplifies system design, as most applications do not need compensation for this factor. While the CORDIC algorithm reduces complex rotations to shift-add operations, the fixed number of iterations can result in overhead, because the choice of $N$ usually accounts for the worst case over all angles supported by the CORDIC.

B. FFT using CORDIC Algorithm

In this work, we implement the 8-point FFT and IFFT algorithms using the R2MDC method, as shown in Fig. 3. This approach [7], [8] provides a straightforward and efficient way to implement pipelined and parallel versions of the FFT, and it effectively reduces the hardware latency. The 8-point FFT consists of three butterfly stages, each involving twiddle factor multiplication. To keep the design simple, we adopt a fully parallel implementation in which each butterfly stage, containing four butterflies, completes within a single clock cycle.

Fig. 3. Radix-2 multipath delay commutation architecture for 8-point FFT.

To optimize the twiddle factor multiplication, a delay unit is incorporated after each butterfly stage. Notably, we have replaced the conventional twiddle factor multiplication with a CORDIC unit. This substitution allows a more efficient and simpler computation. Each CORDIC module requires 16 clock cycles to execute.

C. Architecture for OTFS transmitter and receiver

1) Two-step approach: In the hardware implementation of the $M \times N$ OTFS modulation, we consider 64 DD information symbols, corresponding to $M = 8$ and $N = 8$. Each information symbol is mapped onto a subset of 2D orthogonal basis functions by using a QAM modulator. In the two-step transmitter, the serial symbols are converted into parallel symbols, which are then input to the ISFFT. The ISFFT stage comprises two operations: FFT and IFFT. The FFT operation on the columns of the DD signals provides time-delay information symbols, which are then input to the IFFT, yielding time-frequency symbols. In the ISFFT block, eight FFTs run in parallel in the first stage, followed by eight parallel IFFT operations in the second stage. In the two-step transmitter architecture, the ISFFT stage is followed by the 2D Heisenberg transform. In this stage, an 8-point IFFT is employed eight times concurrently, converting the time-frequency signal into a time signal. The Heisenberg transform is then followed by the transmit windowing function, as depicted in Fig. 1. Finally, the parallel time-domain data is converted into serial data by a parallel-to-serial converter. The transmitted signal contains $N$ time-domain symbols, each consisting of $M$ samples, and is then transmitted to the receiver.

The OTFS receiver begins with a serial-to-parallel conversion, which transforms the received serial data into parallel data. This is followed by the receiver windowing function. In the two-step receiver architecture, the receiver windowing function is followed by the 2D Wigner transform, which concurrently employs an 8-point FFT eight times. The next stage is the SFFT block, comprising IFFT and FFT operations. The IFFT operation on the columns of the time-frequency signals extracts Doppler-frequency information symbols. The rows of the resulting matrix are then directed to the FFT, generating DD symbols. Within the SFFT block, eight IFFT modules operate in parallel during the first stage, followed by eight parallel FFT modules in the second stage, as depicted in Fig. 1. The output of the SFFT block is passed to the parallel-to-serial converter, which converts the data from parallel to serial form. These symbols are subsequently fed into the QAM demodulation block, which maps the received output back to the original symbols.

2) Direct approach: In this alternative approach, the process begins with QAM modulation and serial-to-parallel conversion. In contrast to the two-step method, we use the IZak transform, which converts the signal from a 2D DD representation to a 1D time-domain signal. This transformation involves eight parallel 8-point IFFTs. Subsequently, the time symbols pass through the transmit window and a parallel-to-serial converter. The transmitted signal is composed of $N = 8$ time-domain symbols, each containing $M = 8$ samples, and is forwarded to the receiver. At the receiver, the operation begins with a serial-to-parallel conversion, followed by the receive window function applied to the time-domain signal. The Zak transform then converts the time signal into a DD representation. This transformation involves eight parallel 8-point FFTs. The resulting output is directed to the parallel-to-serial converter and the QAM demodulator.

The 8 × 8 OTFS modulation process consists of 8-point FFT and IFFT modules, with each module using 2 CORDIC ICs. The implementation of the two-step approach for the 8 × 8 OTFS transceiver involves 96 CORDIC ICs and 1152 adders ($3MN\log_2 N + 3MN\log_2 M$), distributed across three IFFT and three FFT stages, totaling eighteen butterfly stages. On the other hand, the direct method employs 16 CORDIC ICs and 384 adders ($2MN\log_2 N$), distributed across one IFFT and one FFT stage.

IV. RESULTS

We generated simulated output waveforms of the OTFS transmitter and receiver employing the two-step approach by applying appropriate parallel input DD symbols at the transmitter. These input symbols pass through two transformations (ISFFT and Heisenberg transform) to produce time symbols at the transmitter output. The resulting time symbols are processed by the OTFS receiver. The time-domain received symbols at the receiver are further processed to generate DD symbols, employing the inverse of the two transmitter transformations (SFFT and Wigner transform). The step-by-step simulated waveform results are illustrated in Fig. 4. The figure also depicts the step-by-step simulation results of the OTFS transmitter and receiver using the direct approach. Moreover, we verify the Vivado simulation output against the MATLAB simulation and obtain identical results on both software platforms.

Fig. 4. Simulation results of the two-step and direct approach of OTFS transmitter and receiver.

The resource utilization report for the OTFS transmitter and receiver on the 7a200tiffg1156-1L FPGA device is summarized in Table I. Compared with the two-step approach, the direct-approach OTFS transceiver lowers LUT utilization by 47.57 percentage points (from 67.69% to 20.12%) and FF utilization by 17.63 percentage points (from 27.03% to 9.40%). Table II provides the timing report for OTFS, indicating a clock frequency of 100 MHz for both the two-step and direct approaches. The throughput of the two-step approach and the direct approach in the OTFS transceiver design is reported as 118.51 Mbps and 336.84 Mbps, respectively. Power analysis results for the OTFS transceiver on the 7a200tiffg1156-1L FPGA device are presented in Table III.

TABLE I
RESOURCE UTILIZATION ON THE 7A200TIFFG1156-1L FPGA DEVICE.

Parameter    Two-step approach    Direct approach     Available
             (% utilization)      (% utilization)
LUT          91106 (67.69)        27078 (20.12)       134600
FF           72778 (27.03)        25316 (9.40)        269200
IOB          258 (51.60)          258 (51.60)         500

TABLE II
TIMING REPORT FOR THE OTFS ON THE 7A200TIFFG1156-1L FPGA DEVICE.

Parameter                          Two-step approach    Direct approach
Time period                        10 ns                10 ns
Throughput                         118.51 Mbps          336.84 Mbps
Worst Negative Slack (WNS)         1.756 ns             2.352 ns
Worst Hold Slack (WHS)             0.132 ns             0.026 ns
Worst Pulse Width Slack (WPWS)     4.5 ns               4.5 ns

TABLE III
POWER ANALYSIS ON THE 7A200TIFFG1156-1L FPGA DEVICE.

Parameter              Two-step approach    Direct approach
Total On-Chip Power    2.594 W              0.897 W
Dynamic Power          2.482 W              0.787 W
Static Power           0.112 W              0.110 W
Junction Temp.         28.8 °C              26.3 °C

V. CONCLUSION

In this work, we implemented the hardware architectures of the OTFS transceivers on an FPGA, considering both the direct and the two-step approaches. We observed that the direct approach outperforms the two-step approach in resource utilization, power consumption, timing, and throughput, indicating its usefulness for real-time wireless communication applications.

REFERENCES

[1] H. B. Mishra, P. Singh, A. K. Prasad, and R. Budhiraja, “OTFS channel estimation and data detection designs with superimposed pilots,” IEEE Transactions on Wireless Communications, vol. 21, no. 4, pp. 2258–2274, 2022.
[2] S. K. Mohammed, R. Hadani, A. Chockalingam, and R. Calderbank, “OTFS - predictability in the delay-Doppler domain and its value to communication and radar sensing,” IEEE BITS the Information Theory Magazine, pp. 1–20, 2023.
[3] T. Thaj and E. Viterbo, “OTFS modem SDR implementation and experimental study of receiver impairment effects,” in 2019 IEEE International Conference on Communications Workshops (ICC Workshops), 2019, pp. 1–6.
[4] A. Abushattal, S. E. Zegrar, A. Yazgan, and H. Arslan, “A comprehensive experimental emulation for OTFS waveform RF-impairments,” Sensors, vol. 23, no. 1, p. 38, 2022.
[5] A. R. Shadangi, S. S. Das, and I. Chakrabarti, “VLSI architecture for implementing OTFS,” researchsquare, 2023.
[6] M. Isik, M. Nkomo, A. Das, and K. R. Dandekar, “FPGA implementation of OTFS modulation for 6G communication systems,” 2023.
[7] S. K. Dora, H. B. Mishra, and M. Sahoo, “Low complexity implementation of OTFS transmitter using fully parallel and pipelined hardware architecture,” J. Signal Process. Syst., vol. 95, no. 8, pp. 955–964, 2023. [Online]. Available: https://doi.org/10.1007/s11265-023-01847-x
[8] S. K. Dora, R. K. Yadav, M. Sahoo, and H. B. Mishra, “VLSI architecture for low complexity zero forcing equalizer in OTFS modulation,” in 2023 International Conference on Electrical, Electronics, Communication and Computers (ELEXCOM), 2023, pp. 1–6.
