Navid Lashkarian, Signal Processing Division, Xilinx Inc., San Jose, USA
Chris Dick, Signal Processing Division, Xilinx Inc., San Jose, USA
ABSTRACT
Proceeding of the SDR 04 Technical Conference and Product Exposition. Copyright © 2004 SDR Forum. All Rights Reserved
[Figure 2 block diagram labels: Predistorter, Power Amplifier, 1/G, Predistorter Training; signals x(n), z(n), y(n), e(n), ẑ(n).]

Figure 2: Baseband equivalent model of the DPD circuit. G represents the gain of the power amplifier.

2. PREDISTORTER ARCHITECTURE

Our approach to nonlinear predistortion is based on the method proposed by Eun & Powers [5]. In this approach, two identical truncated Volterra systems are used for training and predistortion. Figure 2 depicts the block diagram of the equivalent baseband model of the digital predistortion network. The objective of the linearizer is to find a transformation of the signal (z̃(n) = V(x(n))) that, in combination with the nonlinear amplifier (responsible for the distortion), results in an identity system that reproduces the signal of interest without distortion at the output of the power amplifier (y(n) ≈ x(n)). The main challenge of this approach is to identify and track the time-varying characteristics of the amplifier. To address this task, a stochastic gradient adaptation mechanism is employed.

The adaptation of the truncated Volterra system is a two-stage process. During initialization, the input and output signals of the power amplifier are probed and the Volterra filter coefficients are adapted off-line using recursive least squares (RLS) or Kalman filtering estimation. This process is also known as initialization through indirect learning. Once the adaptive filter is initialized at an optimum stationary point, a stochastic adaptive mechanism is used to track the time-varying characteristics of the nonlinear amplifier.

2.1. Memory Polynomial Predistorter

We use the memory polynomial model (Eq. 1) for the predistorter block, as described in [6]:

$$
z(n) = \sum_{\substack{k=1 \\ k\ \mathrm{odd}}}^{K} \sum_{q=0}^{Q} a_{kq}\, x(n-q)\, |x(n-q)|^{k-1} \tag{1}
$$

where K is the nonlinearity order and Q represents the memory length of the power amplifier. In order to reduce the implementation complexity of the predistorter while maintaining acceptable performance, only the odd-order terms in the nonlinearity are included in the model. This compromise reduces the complexity of the predistorter by approximately 40% at the expense of 3 to 5 dB of spectral regrowth. A detailed investigation of the benefits of even-order terms in the baseband model is presented in [7].

2.2. Indirect Learning

Initialization of the DPD linearizer is performed using optimum filtering, which is done as an off-line computation in our DPD implementation. Adaptive filter coefficient estimation can be considered a linear optimization task. Any of the common estimation methods can be used: least squares estimation [8], minimum mean squared error (MMSE) or Wiener filtering [8], Kalman filtering [8], or recursive least squares (RLS) filtering [8].

We note that while all of the above methods solve the same optimization problem, namely linear parameter estimation, the stationary points they reach might be quite different. This is mainly because the error criteria of the approaches differ, which gives each a different error-surface profile.

2.3. Tracking and Direct Learning

The inverse of the nonlinear amplifier is adaptively tracked using a stochastic gradient method. Least mean squares (LMS) adaptive filters are known to have a slow convergence rate. However, since the power amplifier characteristics vary slowly as a function of time, the LMS approach is a reasonable choice for performing parameter tracking.

At each iteration of the stochastic gradient algorithm, an update for the unknown vector is obtained from

$$
W_{n+1} = W_n + \mu\, e_n\, X_n^{*} \tag{2}
$$

where the error is defined as

$$
e_n = z(n) - W_n X_n \tag{3}
$$

X_n is the vector containing all the necessary nonlinear products of the input sample and can be expressed as

$$
X_n = \begin{bmatrix}
y(n) \\
y(n)\,|y(n)|^{2} \\
y(n)\,|y(n)|^{4} \\
y(n-1) \\
y(n-1)\,|y(n-1)|^{2} \\
y(n-1)\,|y(n-1)|^{4} \\
y(n-2) \\
y(n-2)\,|y(n-2)|^{2} \\
y(n-2)\,|y(n-2)|^{4}
\end{bmatrix} \tag{4}
$$

[Figure: block diagram with input x(n), delay elements z^-1, coefficients a10, a11, a12 and a30, a31, a32, blocks M1, M2, M3, and output z(n).]
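As an illustration of Eqs. 1 to 4, the following NumPy sketch builds the regressor X_n for K = 5 and Q = 2 and applies the LMS update of Eqs. 2 and 3. It is a floating-point sketch under stated assumptions, not the System Generator design: the function names, the step size mu, the test signal and the reference weights w_true are hypothetical values chosen only for the demonstration.

```python
import numpy as np

def basis_vector(y, n, orders=(1, 3, 5), memory=2):
    """Regressor X_n of Eq. 4: y(n-q) * |y(n-q)|**(k-1), q = 0..memory, k in orders."""
    terms = []
    for q in range(memory + 1):
        # Assumption: samples before the start of the record are taken as zero.
        s = y[n - q] if n - q >= 0 else 0.0
        for k in orders:
            terms.append(s * abs(s) ** (k - 1))
    return np.array(terms, dtype=complex)

def lms_step(w, x_vec, z_n, mu):
    """One iteration of Eqs. 2-3: e_n = z(n) - W_n X_n, W_{n+1} = W_n + mu e_n X_n^*."""
    e = z_n - np.dot(w, x_vec)          # plain (unconjugated) product W_n X_n
    return w + mu * e * np.conj(x_vec), e

# Hypothetical test signal: bounded amplitude, uniformly distributed phase.
rng = np.random.default_rng(0)
N = 5000
y = rng.uniform(0.0, 1.0, N) * np.exp(2j * np.pi * rng.uniform(0.0, 1.0, N))

# Hypothetical reference weights (not taken from the paper).
w_true = np.array([1.0, -0.1 + 0.05j, 0.02, 0.2, 0.03j, 0.0, -0.05, 0.0, 0.01],
                  dtype=complex)

# Perturb the fourth coefficient by a factor of 3 and let LMS pull it back.
w0 = w_true.copy()
w0[3] *= 3.0

w = w0.copy()
mu = 0.1                                # hypothetical step size
for n in range(N):
    x_vec = basis_vector(y, n)
    z_n = np.dot(w_true, x_vec)         # noise-free desired sample
    w, e = lms_step(w, x_vec, z_n, mu)

print("initial weight error:", np.linalg.norm(w0 - w_true))
print("final weight error:  ", np.linalg.norm(w - w_true))
```

Because the test input satisfies |y(n)| ≤ 1, the regressor norm obeys ‖X_n‖² ≤ 9, so mu = 0.1 keeps mu‖X_n‖² below 2. In this noise-free setting the weight error norm therefore cannot grow from one iteration to the next.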
[Figure 5 block diagram: x(n) → H(z) → v(n) → F(v) → y(n).]

Figure 5: Wiener system PA model employed in the DPD simulation.

[Figure 6 plot: "Effectiveness of DPD in Suppressing Spectral Growth"; traces: complex baseband signal, amplifier output without DPD, amplifier output with DPD; vertical axis in dB, horizontal axis normalized frequency.]

Figure 6: Effectiveness of DPD in suppressing spectral regrowth. The figure shows an overlay of the baseband signal spectrum, the PA output without linearization and with linearization.

system, H(z), in cascade with a memoryless nonlinearity F(ν). Ding [6] provides expressions for

$$
H(z) = \frac{1 + 0.5 z^{-2}}{1 - 0.2 z^{-1}} \tag{10}
$$

and

$$
y(n) = \sum_{\substack{k=1 \\ k\ \mathrm{odd}}}^{K} b_k\, v(n)\, |v(n)|^{k-1} \tag{11}
$$

where v(n) and y(n) are the input and output of the memoryless non-linearity F(ν). Ding [6] provides values for the coefficients b_k based on measurements from a class AB PA as

$$
b_1 = 1.0108 + j0.0858 \tag{12}
$$

$$
b_3 = 0.0879 - j0.1583 \tag{13}
$$

$$
b_5 = -1.0992 - j0.8891 \tag{14}
$$

Figure 6 shows the effectiveness of the baseband predistortion in suppressing spectral regrowth. As shown in the figure, DPD can effectively reduce spectral regrowth by 40 dB.

3.1. Simulation Model

When the System Generator simulation is opened in the Simulink environment, a pre-load function is called that computes an initial estimate of the system coefficients using RLS estimation. The optimum coefficients resulting from the estimate are

 0.0003 - j0.0066
 0.0005 + j0.0120
-0.0036 + j0.0005
 1.1632 - j0.0936
 0.0890 + j0.3610
-0.0554 + j0.0254
-0.6712 + j0.0543
-0.0525 - j0.2041
 0.0295 - j0.0144

In order to demonstrate adaptive tracking, one of the coefficients is deliberately perturbed: the fourth coefficient (1.1632 − j0.0936) is scaled by a factor of 3. The modified coefficient vector is used as the initial condition for the adaptive processor.

The least mean squares processor in the system adaptively updates the coefficients, iteratively forcing the perturbed coefficient back to its optimum value. Figure 7 shows the trajectory of the real component of the modified element (fourth entry of the vector) as a function of the LMS update iteration number. The figure provides an overlay of the Matlab double-precision floating-point simulation and the fixed-point arithmetic FPGA implementation. The floating-point and fixed-point simulations are in close agreement. The residual mean-squared error (MSE) of the FPGA based LMS filtering is plotted in Figure 8.

The System Generator predistorter reference implementation employs a fully parallel adaptive processor for the adaptive learning sub-system. This means that the 9 complex coefficients in Eq. 6 are all updated at the output sample rate. This is a very high-frequency update rate and may be too rapid for many applications. It is straightforward to modify the adaptive processor to employ a decimated update. In this scenario the coefficients would be updated at a lower frequency than the Volterra filter processing rate. Using a decimated update permits functional unit folding in the adaptive processor so that the FPGA footprint can be reduced, i.e., both the number of logic slices and the number of embedded multipliers can be minimized.

When the simulation completes, a post-simulation stop function is executed that plots the linearizer input function overlaid with the predistorter output, generating a plot similar to Figure 6.
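The Wiener PA model of Eqs. 10 to 14 and an off-line initial coefficient estimate can be sketched in NumPy as follows. The coefficients b_k and the filter H(z) come from the equations above; everything else is an assumption made for illustration: the function names, the test signal, the gain G = 1, and the use of a batch least-squares solve (np.linalg.lstsq) in place of the RLS estimation used by the pre-load function.

```python
import numpy as np

# Coefficients b_k of Eqs. 12-14 (class AB PA, from Ding [6]).
B = {1: 1.0108 + 0.0858j, 3: 0.0879 - 0.1583j, 5: -1.0992 - 0.8891j}

def linear_part(x):
    """H(z) of Eq. 10 as a difference equation: v(n) = x(n) + 0.5 x(n-2) + 0.2 v(n-1)."""
    x = np.asarray(x, dtype=complex)
    v = np.zeros_like(x)
    for n in range(len(x)):
        v[n] = x[n]
        if n >= 2:
            v[n] += 0.5 * x[n - 2]
        if n >= 1:
            v[n] += 0.2 * v[n - 1]
    return v

def nonlinearity(v):
    """Memoryless odd-order polynomial F(v) of Eq. 11."""
    return sum(b * v * np.abs(v) ** (k - 1) for k, b in B.items())

def wiener_pa(x):
    """Wiener PA model: linear system H(z) followed by F(v)."""
    return nonlinearity(linear_part(x))

def regressors(u, orders=(1, 3, 5), memory=2):
    """Matrix whose rows are the memory polynomial terms u(n-q) |u(n-q)|**(k-1)."""
    u = np.asarray(u, dtype=complex)
    cols = []
    for q in range(memory + 1):
        # Delay by q samples, padding the start of the record with zeros.
        d = np.concatenate([np.zeros(q, dtype=complex), u[:len(u) - q]])
        for k in orders:
            cols.append(d * np.abs(d) ** (k - 1))
    return np.column_stack(cols)

# Hypothetical test signal with small, bounded amplitude.
rng = np.random.default_rng(1)
N = 4000
x = 0.3 * rng.uniform(0.0, 1.0, N) * np.exp(2j * np.pi * rng.uniform(0.0, 1.0, N))
y = wiener_pa(x)

# Indirect learning: fit a post-inverse that maps y(n)/G back to the PA input.
G = 1.0                                   # assumed gain normalization
Phi = regressors(y / G)
a, *_ = np.linalg.lstsq(Phi, x, rcond=None)

power = np.sum(np.abs(x) ** 2)
nmse_fit = np.sum(np.abs(x - Phi @ a) ** 2) / power
nmse_lin = np.sum(np.abs(x - y / G) ** 2) / power
print("NMSE with fitted post-inverse:", nmse_fit)
print("NMSE with y/G alone:          ", nmse_lin)
```

The first column of Phi is y(n)/G itself, so the least-squares solution can never fit worse than using y(n)/G directly. The two printed NMSE values give a rough idea of how much the additional memory and nonlinear terms contribute for this particular test signal.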
[Figure 7 plot: "Trajectory of the Perturbed Volterra Kernel Based on LMS Direct Learning"; legend entry: FPGA Fixed Point; vertical axis: perturbed coefficient (1.5 to 3.5); horizontal axis: iteration number (0 to 4 × 10^4).]

Figure 7: Volterra kernel tracking based on LMS direct learning. The figure shows the evolution of the fourth coefficient in the model. The floating-point and FPGA fixed-point simulation results are overlaid in the figure.

[Plot: "Residual MSE Error of LMS Tracking (FPGA Implementation)"; vertical axis in dB.]

Table 1: FPGA Resource Utilization for DPD and LMS Adaptive Learning. The Volterra filter coefficients are updated at the full output data rate using a fully parallel LMS processor. The design is easily modified to accommodate a decimated update using a reduced number of embedded multipliers.

                Volterra Filter    LMS     Total
Slices          2032               3483    5515
Block Memory    0                  0       0
Multipliers     48                 106     154

The computation rate of the predistorter alone is 212e6 × 48 = 10.176e9 operations per second. This 10 Giga-op processing rate exceeds the compute capacity of other programmable DSP technologies. The FPGA implementation easily supports the processing requirements, while providing the system architect with a flexible solution that can be easily modified based on evolving specifications or future system requirements.

4. ADAPTIVE COEFFICIENT UPDATE USING EMBEDDED PROCESSING

In many typical applications the PA characteristics do
like Virtex-II [12] and the low-cost Spartan-3 [13] family that do not include embedded PPC405 processors.

5. CONCLUSION

In this paper we have provided an architecture study for the FPGA implementation of a wideband digital baseband predistortion processor. As communication infrastructure providers operating in the UMTS, CDMA2000 and military radio application spaces continue to increase transmission bandwidth and support multi-carrier systems, traditional look-up table approaches to power amplifier linearization are no longer appropriate, and alternative methods that support wideband signals are required. Linearization techniques based on non-linear signal processing have been studied for some time, but their practical deployment has been restricted due to the limited processing capabilities of traditional configurable signal processors. While an application specific integrated circuit (ASIC) approach could meet the processing requirements, non-recurring engineering (NRE) costs, high mask-set costs, lengthy development schedules and lack of flexibility have limited the ASIC implementation of sophisticated PA linearizers.

The highly parallel nature of Xilinx FPGAs easily supports the processing requirements of complex non-linear signal processing algorithms. The System Generator design described in this paper implements a baseband linearizer that includes a 5th order non-linearity and a 2nd order term that accounts for PA memory. These design parameters are easily modified to reflect the characteristics of any given power amplifier.

The LMS coefficient update procedure used in the implementation is a fully parallel design that updates all of the linearizer coefficients at the output sample rate. Depending on the system requirements, the adaptive processor could be modified to include functional unit time-sharing that would reduce the FPGA footprint in return for a decimated coefficient update rate. The coefficient update procedure could be relocated, entirely or partially, to embedded software running on either a Microblaze soft processor core or an embedded PPC405 hard core in the Virtex-II Pro or Virtex-4 FPGA families.

References

[1] C. Liang et al., "Nonlinear amplifier effects in communications systems," IEEE Transactions on Microwave Theory and Techniques, Vol. 47, Issue 8, pp. 1461-1466, Aug. 1999.

[2] J. K. Cavers, "Amplifier linearization using a digital predistorter with fast adaptation and low memory requirements," IEEE Transactions on Vehicular Technology, Vol. 39, Issue 4, pp. 374-382, Nov. 1990.

[3] V. J. Mathews, G. Sicuranza, Polynomial Signal Processing, John Wiley & Sons, 2000.

[4] System Generator for DSP, Xilinx Inc., http://www.xilinx.com/xlnx/xebiz/designResources/ip product details.jsp?key=dr dt system generator

[5] C. Eun, E. Powers, "A New Volterra Predistorter Based on the Indirect Learning Architecture," IEEE Transactions on Signal Processing, Vol. 45, No. 1, January 1997.

[6] L. Ding et al., "A Robust Digital Baseband Predistorter Constructed Using Memory Polynomials," IEEE Transactions on Communications, Vol. 52, No. 1, January 2004.

[7] L. Ding et al., "Effects of Even-Order Nonlinear Terms on Power Amplifier Modeling and Predistortion Linearization," IEEE Transactions on Vehicular Technology, Vol. 53, Issue 1, pp. 156-162, Jan. 2004.

[8] S. Haykin, Adaptive Filter Theory, Prentice Hall, New Jersey, 1996.

[9] Virtex-II Pro Datasheet, Xilinx Inc., http://www.xilinx.com/xlnx/xweb/xil publications display.jsp?category=Publications/FPGA+Device+Families/Virtex-II+Pro&iLanguageID=1

[10] Xilinx Virtex-4 Revolutionizes Platform FPGAs, Xilinx Inc., http://www.xilinx.com/company/press/kits/v4 arch/v4 finalwhitepaper4.pdf

[11] Microblaze Soft Processor Core, Xilinx Inc., http://www.xilinx.com/xlnx/xebiz/designResources/ip product details.jsp?sSecondaryNavPick=Design+Tools&key=micro blaze

[12] Virtex-II Datasheet, Xilinx Inc., http://www.xilinx.com/xlnx/xweb/xil publications display.jsp?category=/Data+Sheets/FPGA+Device+Families/Virtex-II&iLanguageID=1

[13] Spartan-3 Datasheet, Xilinx Inc., http://www.xilinx.com/xlnx/xil prodcat landingpage.jsp?title=Spartan-3