Complex EMethod
Jean-Michel Muller
CNRS-Laboratoire LIP, projet Arénaire
Ecole Normale Supérieure de Lyon
46 Allée d’Italie
69364 Lyon Cedex 07, France
[email protected]
Abstract
The E-method, introduced in [2, 3], allows efficient parallel solution of diagonally
dominant systems of linear equations in real domain using simple and highly regular
hardware. Since the evaluation of polynomials and certain rational functions can be
achieved by solving the corresponding linear systems, the E-method is an attractive
general approach for function evaluation. We generalize the E-method to complex
linear systems, and show some potential applications such as the evaluation of complex
polynomials and rational functions.
1 Introduction
In this report we propose an extension of a digit-iterative method for solving systems of
linear equations, the E-method [2, 3, 7], to allow the use of the complex number system.
The proposed approach is suitable for hardware implementation. The main characteristics
of the method are: (i) an m-digit solution is computed in about m steps, each step consisting
of a sum of number-by-digit products, (ii) the cycle time depends on the number of nonzero
coefficients, (iii) the cycle time does not depend on the precision m (if redundant additions
are used), (iv) for a system of order n, the shortest latency requires n elementary units for
the real part, and n units for the imaginary part, and (v) the elementary units are
interconnected with digit-wide links. The approach is particularly efficient when the coefficient
matrix is sparse. This happens when the E-method is used to evaluate polynomials (one
off-diagonal element) and rational functions (two off-diagonal elements). Other examples are
a tridiagonal system (two off-diagonal elements), powers of the argument (one off-diagonal
element), and special expressions.
We first introduce the transform which allows the E-method to be used in the complex
field C. Then we show how to use the complex E-method (CE-method) in evaluating
complex polynomials and rational functions as particularly interesting cases. Evaluation of
consecutive powers of a complex argument is a special case of polynomial evaluation.
With the exception of complex addition and multiplication, other complex operations
are typically not implemented in hardware. Online algorithms for complex arithmetic have
been proposed and implemented in FPGAs in [8, 9]. Based on this work, algorithms and
implementations for complex FIR filters, complex matrix inversion, and the complex
Householder transform have been developed [10, 11, 12]. Recently, hardware-oriented methods for
complex division and square root have been introduced [4, 6]. The method proposed in this
report extends complex arithmetic to complex polynomials, complex powers, and rational
functions, a significant extension of the domain of hardware implementation for complex
arithmetic.
Complex polynomials appear in many areas such as digital signal and image processing,
control systems, and applied mathematics, in general. A Horner type method for evaluating
complex polynomials is proposed in [1] at the algorithm level, implicitly assuming a software
implementation. The method uses O(n) multiplications and O(n) additions for a complex
polynomial of degree n. If these multiplications and additions are performed in a sequential
order, the latency of the method is about $n \times T_{MULT-ADD}$, which is significantly slower
than our method. If a parallel algorithm for polynomial evaluation is used, the total time is
about $\log n \times T_{MULT-ADD}$, which is still slower than our method.
The case of rational functions with complex coefficients and argument is even more
attractive: using the proposed CE-method, we avoid explicit complex division and produce
the result, as mentioned above, in time proportional to the desired precision. We are not
aware of prior special algorithms for evaluation of complex rational functions in hardware.
In the next section we describe the transformation which maps computation from the
complex to the real domain. In Section 3 we show the CE-method. In Section 4 iterations
and convergence conditions are considered. Implementation aspects are discussed in Section
5.
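This isomorphism holds for complex addition and multiplication, which are used in the
proposed method:
$$
(a + ib) + (c + id) \leftrightarrow
\begin{pmatrix} a & -b \\ b & a \end{pmatrix} + \begin{pmatrix} c & -d \\ d & c \end{pmatrix}
= \begin{pmatrix} a + c & -b - d \\ b + d & a + c \end{pmatrix}
\leftrightarrow (a + c) + i(b + d) \tag{2}
$$
$$
(a + ib) \times (c + id) \leftrightarrow
\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \times \begin{pmatrix} c & -d \\ d & c \end{pmatrix}
= \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix}
\leftrightarrow (ac - bd) + i(bc + ad) \tag{3}
$$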
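The correspondences (2) and (3) can be checked numerically. The following is a minimal sketch, not part of the proposed hardware method: the helper names `to_matrix`, `mat_add` and `mat_mul` and the sample operands are illustrative assumptions.

```python
def to_matrix(z):
    """Map a + ib to the 2x2 real matrix [[a, -b], [b, a]]."""
    a, b = z.real, z.imag
    return [[a, -b], [b, a]]


def mat_add(m, n):
    """Entry-wise sum of two 2x2 matrices."""
    return [[m[r][c] + n[r][c] for c in range(2)] for r in range(2)]


def mat_mul(m, n):
    """Ordinary product of two 2x2 matrices."""
    return [[sum(m[r][t] * n[t][c] for t in range(2)) for c in range(2)]
            for r in range(2)]


# Arbitrary sample operands (hypothetical); dyadic values keep the
# floating-point arithmetic exact, so plain equality can be used.
u = complex(1.5, -2.0)
v = complex(0.25, 4.0)

# (2): the matrix of a sum is the sum of the matrices.
sum_matches = mat_add(to_matrix(u), to_matrix(v)) == to_matrix(u + v)
# (3): the matrix of a product is the product of the matrices.
product_matches = mat_mul(to_matrix(u), to_matrix(v)) == to_matrix(u * v)
```

Both flags evaluate to true for these operands, which is what the isomorphism predicts.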
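$$
\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & \cdots & a_{2,n} \\
a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & \cdots & a_{3,n} \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\
a_{n,1} & a_{n,2} & a_{n,3} & a_{n,4} & \cdots & a_{n,n}
\end{pmatrix}
\times
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ \vdots \\ z_n \end{pmatrix}
=
\begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ \vdots \\ t_n \end{pmatrix}
\tag{4}
$$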
is the 2n-dimensional real linear system
$$
\begin{pmatrix}
a^r_{1,1} & -a^i_{1,1} & a^r_{1,2} & -a^i_{1,2} & \cdots & a^r_{1,n} & -a^i_{1,n} \\
a^i_{1,1} & a^r_{1,1} & a^i_{1,2} & a^r_{1,2} & \cdots & a^i_{1,n} & a^r_{1,n} \\
a^r_{2,1} & -a^i_{2,1} & a^r_{2,2} & -a^i_{2,2} & \cdots & a^r_{2,n} & -a^i_{2,n} \\
a^i_{2,1} & a^r_{2,1} & a^i_{2,2} & a^r_{2,2} & \cdots & a^i_{2,n} & a^r_{2,n} \\
a^r_{3,1} & -a^i_{3,1} & a^r_{3,2} & -a^i_{3,2} & \cdots & a^r_{3,n} & -a^i_{3,n} \\
a^i_{3,1} & a^r_{3,1} & a^i_{3,2} & a^r_{3,2} & \cdots & a^i_{3,n} & a^r_{3,n} \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
a^r_{n,1} & -a^i_{n,1} & a^r_{n,2} & -a^i_{n,2} & \cdots & a^r_{n,n} & -a^i_{n,n} \\
a^i_{n,1} & a^r_{n,1} & a^i_{n,2} & a^r_{n,2} & \cdots & a^i_{n,n} & a^r_{n,n}
\end{pmatrix}
\times
\begin{pmatrix} z_1^r \\ z_1^i \\ z_2^r \\ z_2^i \\ z_3^r \\ z_3^i \\ \vdots \\ z_n^r \\ z_n^i \end{pmatrix}
=
\begin{pmatrix} t_1^r \\ t_1^i \\ t_2^r \\ t_2^i \\ t_3^r \\ t_3^i \\ \vdots \\ t_n^r \\ t_n^i \end{pmatrix}
\tag{5}
$$
where $a_{j,k} = a^r_{j,k} + i a^i_{j,k}$, $z_j = z_j^r + i z_j^i$ and $t_j = t_j^r + i t_j^i$. These two linear systems are
equivalent.
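The equivalence of (4) and (5) can be illustrated in software. The sketch below is an assumption-laden illustration: the function name `cr_transform` and the 2 × 2 sample system are hypothetical. It expands a complex system into its real counterpart and checks that the interleaved real and imaginary parts of a complex solution satisfy the real system.

```python
def cr_transform(A, t):
    """Expand an n x n complex system A z = t into the 2n x 2n real system (5).

    Each entry a = ar + i*ai becomes the block [[ar, -ai], [ai, ar]];
    each right-hand side entry becomes the pair (real part, imaginary part).
    """
    n = len(A)
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    rhs = [0.0] * (2 * n)
    for j in range(n):
        for k in range(n):
            ar, ai = A[j][k].real, A[j][k].imag
            M[2 * j][2 * k], M[2 * j][2 * k + 1] = ar, -ai
            M[2 * j + 1][2 * k], M[2 * j + 1][2 * k + 1] = ai, ar
        rhs[2 * j], rhs[2 * j + 1] = t[j].real, t[j].imag
    return M, rhs


# Hypothetical sample: pick a complex solution z, derive t = A z from it.
A = [[1 + 0j, -(0.5 + 0.25j)],
     [0j, 1 + 0j]]
z = [2 - 1j, 0.5 + 3j]
t = [sum(A[j][k] * z[k] for k in range(2)) for j in range(2)]

M, rhs = cr_transform(A, t)

# Interleave real and imaginary parts of z, as in the unknown vector of (5).
z_real = [part for zj in z for part in (zj.real, zj.imag)]
product = [sum(M[r][c] * z_real[c] for c in range(4)) for r in range(4)]
max_residual = max(abs(p - q) for p, q in zip(product, rhs))
```

Here `max_residual` is zero up to rounding, so the real vector built from `z` solves the transformed system.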
In other words, the real linear system (5) is obtained from the complex linear system (4)
by replacing each complex element x + iy by the 2 × 2 matrix defined in (1). In the next section we
consider a hardware-oriented method for solving such a system.
3 Complex E-method
The E-method [2, 3] provides an iterative approach to solving diagonally dominant real
linear systems. The method has characteristics desirable for efficient hardware
implementation: the basic operators are bit-vector multiplexers, redundant adders of [p : 2] type,
with p ∈ {3, 4, 6} for radix-2, and registers. The overall structure consists of n elementary
units, interconnected digit-serially. The method computes one digit of each component of
the solution per iteration in the MSDF (Most Significant Digit First) manner which allows
digit-serial communication between the modules which operate concurrently. The time to
obtain the solution to m digits of precision is about m cycles (iterations). The amount of
hardware required is roughly related to the number of nonzero terms of the matrix of the
system, which makes the E-method very efficient in hardware resources when the matrix
of the system is sparse. Typical applications of the E-method are evaluation of polynomial
and rational functions, since these correspond to sparse linear systems. The solution of the
linear system
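$$
\begin{pmatrix}
1 & -x & 0 & 0 & \cdots & 0 \\
0 & 1 & -x & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & & \vdots \\
0 & 0 & \cdots & 0 & 1 & -x \\
0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \\ y_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ \vdots \\ p_{n-1} \\ p_n \end{pmatrix}
$$
is
$$
\begin{pmatrix}
p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n \\
p_1 + p_2 x + \cdots + p_n x^{n-1} \\
\vdots \\
p_{n-1} + p_n x \\
p_n
\end{pmatrix}
$$
that is, the first component of the solution is
$$p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n$$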
whereas the solution of the linear system
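$$
\begin{pmatrix}
1 & -x & 0 & 0 & \cdots & 0 \\
q_1 & 1 & -x & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \ddots & & \vdots \\
q_{n-1} & 0 & \cdots & 0 & 1 & -x \\
q_n & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \\ y_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ \vdots \\ p_{n-1} \\ p_n \end{pmatrix}
$$
is
$$
\begin{pmatrix}
\dfrac{p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n} \\[2ex]
\dfrac{(p_1 - p_0 q_1) + (p_2 - p_0 q_2) x + \cdots + (p_n - p_0 q_n) x^{n-1}}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n} \\
\vdots \\
\dfrac{(p_n - q_n p_0) + (p_n q_1 - q_n p_1) x + \cdots + (p_n q_{n-1} - q_n p_{n-1}) x^{n-1}}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n}
\end{pmatrix}
$$
That is, the first component of the solution is the rational function
$$\frac{p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n}.$$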
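Both statements about the first component can be verified with exact rational arithmetic. The sketch below solves the two systems by ordinary substitution (this is not the digit-by-digit E-method, only a software cross-check); the function names and the sample coefficients are illustrative assumptions.

```python
from fractions import Fraction


def solve_polynomial_system(p, x):
    """Back-substitute the bidiagonal system: y_n = p_n, y_k = p_k + x*y_{k+1}."""
    y = [Fraction(0)] * len(p)
    y[-1] = Fraction(p[-1])
    for k in range(len(p) - 2, -1, -1):
        y[k] = p[k] + x * y[k + 1]
    return y


def solve_rational_system(p, q, x):
    """Solve the system whose first column holds q_1..q_n (q[0] is unused).

    Rows k >= 1 read q_k*y_0 + y_k - x*y_{k+1} = p_k, so each y_k can be
    written as u_k + v_k*y_0; row 0 (y_0 - x*y_1 = p_0) then fixes y_0.
    Requires n >= 1.
    """
    n = len(p) - 1
    u = [Fraction(0)] * (n + 1)
    v = [Fraction(0)] * (n + 1)
    u[n], v[n] = Fraction(p[n]), -Fraction(q[n])
    for k in range(n - 1, 0, -1):
        u[k] = p[k] + x * u[k + 1]
        v[k] = -q[k] + x * v[k + 1]
    y0 = (p[0] + x * u[1]) / (1 - x * v[1])
    return [y0] + [u[k] + v[k] * y0 for k in range(1, n + 1)]


def horner(c, x):
    """Direct evaluation of c_0 + c_1 x + ... + c_n x^n."""
    acc = Fraction(0)
    for ck in reversed(c):
        acc = acc * x + ck
    return acc


# Hypothetical sample data.
x = Fraction(1, 2)
p = [Fraction(1), Fraction(2), Fraction(3)]
q = [Fraction(1), Fraction(1, 4), Fraction(1, 8)]  # q[0] = 1 by convention

y_poly = solve_polynomial_system(p, x)
y_rat = solve_rational_system(p, q, x)
```

With these inputs, `y_poly[0]` equals `horner(p, x)` and `y_rat[0]` equals `horner(p, x) / horner(q, x)`, in agreement with the two solution vectors above.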
Now, let us turn to the evaluation of complex polynomials of a complex argument. We
wish to evaluate
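$$p(z) = p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n$$
where the $p_j$'s and $z$ are complex numbers. As in the real case, the desired value $p(z)$ is
clearly equal to the first component of the solution of the linear system
$$
\begin{pmatrix}
1 & -z & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & -z & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & -z & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \\ p_4 \\ \vdots \\ p_n \end{pmatrix}
\tag{6}
$$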
The E-method cannot directly solve the linear system (6), but now if we define real numbers
$x$ and $y$ as $x + iy = z$, and $p_j^r$ and $p_j^i$ as $p_j = p_j^r + i p_j^i$, then we can apply the CR-transform
of (6), and get the following linear system:
The matrix is
$$
E =
\begin{pmatrix}
1 & 0 & -x & y & 0 & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & -y & -x & 0 & 0 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & -x & y & 0 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & -y & -x & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & 1 & 0 & -x & y \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & 1 & -y & -x \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}
$$
and the system reads
$$
E \times
\begin{pmatrix} s_0^r \\ s_0^i \\ s_1^r \\ s_1^i \\ \vdots \\ s_{n-1}^r \\ s_{n-1}^i \\ s_n^r \\ s_n^i \end{pmatrix}
=
\begin{pmatrix} p_0^r \\ p_0^i \\ p_1^r \\ p_1^i \\ \vdots \\ p_{n-1}^r \\ p_{n-1}^i \\ p_n^r \\ p_n^i \end{pmatrix}
\tag{7}
$$
$p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n$.
For instance, in the case n = 3, we get
$$
s =
\begin{pmatrix}
-3xy^2 p_3^r + x^3 p_3^r - 3yx^2 p_3^i + x^2 p_2^r - 2xy\,p_2^i + x p_1^r + y^3 p_3^i - y^2 p_2^r - y p_1^i + p_0^r \\
-y^3 p_3^r + 3yx^2 p_3^r - 3y^2 x\,p_3^i + 2yx\,p_2^r - y^2 p_2^i + y p_1^r + x^3 p_3^i + x^2 p_2^i + x p_1^i + p_0^i \\
-y^2 p_3^r + x^2 p_3^r - 2yx\,p_3^i + x p_2^r - y p_2^i + p_1^r \\
x^2 p_3^i + 2xy\,p_3^r - y^2 p_3^i + y p_2^r + x p_2^i + p_1^i \\
x p_3^r - y p_3^i + p_2^r \\
y p_3^r + x p_3^i + p_2^i \\
p_3^r \\
p_3^i
\end{pmatrix}
$$
The linear system (7) is easily solved by the E-method, provided that it is diagonally
dominant (see Section 4 for details on the iterations and convergence conditions). Note that
the E-method does not directly evaluate the expressions given above for the solution $s_0$. Doing
so would require at least 16+16 full multiplications which, assuming enough multipliers, would
take at least 3 consecutive multiplication times. Moreover, the reduction of product terms would
require a [10:2] reduction, and all the interconnections would be of full precision. Instead,
as explained later, the complex E-method computes $s_0$ on 14 serial-parallel (left-to-right)
multipliers, including the additions, in about one serial-parallel multiplication time. In this
approach, the interconnections are digit-serial.
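As a software cross-check of system (7), and not a model of the digit-serial hardware, the sketch below back-substitutes the real system using real arithmetic only and compares the first two components with a direct complex evaluation. The function names are hypothetical; the sample coefficients and evaluation point are read off the worked example of Section 4.

```python
def solve_cr_polynomial(p, z):
    """Back-substitute system (7); returns [(s_k^r, s_k^i) for k = 0..n].

    Rows 2k and 2k+1 of (7) give
        s_k^r = p_k^r + x*s_{k+1}^r - y*s_{k+1}^i
        s_k^i = p_k^i + y*s_{k+1}^r + x*s_{k+1}^i
    """
    x, y = z.real, z.imag
    sr, si = p[-1].real, p[-1].imag
    out = [(sr, si)]
    for pk in reversed(p[:-1]):
        sr, si = pk.real + x * sr - y * si, pk.imag + y * sr + x * si
        out.append((sr, si))
    out.reverse()
    return out


def horner_complex(p, z):
    """Reference value computed with native complex arithmetic."""
    acc = 0j
    for pk in reversed(p):
        acc = acc * z + pk
    return acc


# Degree-3 sample: coefficients p_0..p_3 and point z from the Section 4 example.
p = [1 + 0j, 1 + 0j, -0.5 - 1.25j, 1 + 1j]
z = 0.01 + 0.1j

s = solve_cr_polynomial(p, z)
reference = horner_complex(p, z)
```

The pair `s[0]` agrees with the real and imaginary parts of `reference` up to rounding, and the last pair `s[3]` is simply the leading coefficient, as in the n = 3 solution vector above.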
Now, let us turn to rational functions of a complex argument with complex coefficients
(assuming the degree-0 coefficient of the denominator is 1). We wish to evaluate
$$R(z) = \frac{p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n}{1 + q_1 z + q_2 z^2 + \cdots + q_n z^n}$$
where the $p_j$'s, the $q_j$'s and $z$ are complex numbers. Clearly, $R(z)$ is equal to the first
component of the solution of the linear system
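$$
\begin{pmatrix}
1 & -z & 0 & 0 & 0 & \cdots & 0 \\
q_1 & 1 & -z & 0 & 0 & \cdots & 0 \\
q_2 & 0 & 1 & -z & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \ddots & & \vdots \\
q_n & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} s_0 \\ s_1 \\ s_2 \\ \vdots \\ s_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ \vdots \\ p_n \end{pmatrix}
\tag{8}
$$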
The E-method cannot directly solve the linear system (8), but it suffices to take the
CR-transform of that system. Define $z = x + iy$, $p_j = p_j^r + i p_j^i$ and $q_j = q_j^r + i q_j^i$. The
CR-transform results in the following linear system
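$$
\begin{pmatrix}
1 & 0 & -x & y & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & -y & -x & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
q_1^r & -q_1^i & 1 & 0 & -x & y & 0 & 0 & \cdots & 0 & 0 \\
q_1^i & q_1^r & 0 & 1 & -y & -x & 0 & 0 & \cdots & 0 & 0 \\
q_2^r & -q_2^i & 0 & 0 & 1 & 0 & -x & y & \cdots & 0 & 0 \\
q_2^i & q_2^r & 0 & 0 & 0 & 1 & -y & -x & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
q_n^r & -q_n^i & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\
q_n^i & q_n^r & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} s_0^r \\ s_0^i \\ s_1^r \\ s_1^i \\ s_2^r \\ s_2^i \\ \vdots \\ s_n^r \\ s_n^i \end{pmatrix}
=
\begin{pmatrix} p_0^r \\ p_0^i \\ p_1^r \\ p_1^i \\ p_2^r \\ p_2^i \\ \vdots \\ p_n^r \\ p_n^i \end{pmatrix}
\tag{9}
$$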
Again, that system is easily solved by the E-method, provided that it satisfies the
convergence conditions (see Section 4), and
Using (11), one can show that if the residuals $|w_k^{(j)}|$ are bounded, then for all $k$, $D_k^{(j)}$ goes
to $y_k$ as $j$ goes to infinity.
The problem at step $j$ is to find a selection function that gives a value of the digits $d_k^{(j)}$ from
the residuals $w_k^{(j)}$ such that the values $w_k^{(j+1)}$ will remain bounded. In [3], the following
selection function (a form of rounding) is proposed
$$
s(x) =
\begin{cases}
\operatorname{sign} x \times \lfloor |x + 1/2| \rfloor, & \text{if } |x| \le 1 \\
\operatorname{sign} x \times \lfloor |x| \rfloor, & \text{otherwise,}
\end{cases}
\tag{12}
$$
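For illustration, the selection function (12) can be transcribed directly. The sketch below follows the formula literally as printed, including the term |x + 1/2|; the function name `select` is an assumption, and floating-point inputs are used only for convenience.

```python
import math


def select(x):
    """Literal transcription of the selection function (12)."""
    sign = (x > 0) - (x < 0)          # -1, 0 or +1
    if abs(x) <= 1:
        return sign * math.floor(abs(x + 0.5))
    return sign * math.floor(abs(x))


# A few sample residual values and the digits selected for them.
samples = [0.3, 0.5, 1.0, -0.5, -1.25, 1.7]
digits = [select(v) for v in samples]
```

Read this way, the residuals 0.5 and 1.0 select the digit 1, while −0.5 selects 0 and −1.25 selects −1; these last two choices are the ones consistent with the residual vectors listed in the worked example of this section.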
$$w_{k,i}^{(0)} = p_k^i$$
The digit-vector $d^{(j)}$ will be denoted
$$d^{(j)} = [d_{0,r}^{(j)}, d_{0,i}^{(j)}, d_{1,r}^{(j)}, d_{1,i}^{(j)}, \cdots, d_{n,r}^{(j)}, d_{n,i}^{(j)}].$$
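• for $k = 0, \ldots, n - 1$,
$$
\begin{aligned}
w_{k,r}^{(j)} &= 2\left[ w_{k,r}^{(j-1)} - d_{k,r}^{(j-1)} + x\,d_{k+1,r}^{(j-1)} - y\,d_{k+1,i}^{(j-1)} \right] \\
w_{k,i}^{(j)} &= 2\left[ w_{k,i}^{(j-1)} - d_{k,i}^{(j-1)} + y\,d_{k+1,r}^{(j-1)} + x\,d_{k+1,i}^{(j-1)} \right]
\end{aligned}
\tag{13}
$$
• for $k = n$,
$$
\begin{aligned}
w_{n,r}^{(j)} &= 2\left[ w_{n,r}^{(j-1)} - d_{n,r}^{(j-1)} \right] \\
w_{n,i}^{(j)} &= 2\left[ w_{n,i}^{(j-1)} - d_{n,i}^{(j-1)} \right]
\end{aligned}
$$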
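The recurrence (13) can be simulated in software. The following is a minimal sketch under stated assumptions: non-redundant residuals (∆ = 0), digits selected by a literal transcription of (12), exact rational arithmetic, and accumulated values $D_k = \sum_j d_k^{(j)} 2^{-j}$. The names `select` and `ce_polynomial` and the step count are hypothetical; the coefficients and evaluation point are those of the worked example later in this section.

```python
from fractions import Fraction
from math import floor


def select(w):
    """Selection function (12), transcribed literally, for exact rationals."""
    sign = (w > 0) - (w < 0)
    if abs(w) <= 1:
        return sign * floor(abs(w + Fraction(1, 2)))
    return sign * floor(abs(w))


def ce_polynomial(p, x, y, steps):
    """Run recurrence (13); p[k] = (p_k^r, p_k^i).

    Returns (history, D): history[j] is the flattened residual vector w^(j),
    D[k] = [D_k^r, D_k^i] accumulates the selected digits d^(j) * 2^-j.
    """
    n = len(p) - 1
    w = [list(c) for c in p]                      # w^(0) holds the coefficients
    D = [[Fraction(0), Fraction(0)] for _ in p]
    history = [[v for pair in w for v in pair]]
    for j in range(steps):
        d = [(select(wr), select(wi)) for wr, wi in w]
        for k in range(n + 1):
            D[k][0] += Fraction(d[k][0], 2 ** j)
            D[k][1] += Fraction(d[k][1], 2 ** j)
        nxt = []
        for k in range(n + 1):
            dr1, di1 = d[k + 1] if k < n else (0, 0)   # no neighbour for k = n
            nxt.append([2 * (w[k][0] - d[k][0] + x * dr1 - y * di1),
                        2 * (w[k][1] - d[k][1] + y * dr1 + x * di1)])
        w = nxt
        history.append([v for pair in w for v in pair])
    return history, D


F = Fraction
coeffs = [(F(1), F(0)), (F(1), F(0)), (F(-1, 2), F(-5, 4)), (F(1), F(1))]
history, D = ce_polynomial(coeffs, F(1, 100), F(1, 10), steps=20)
```

With these inputs, `history[1]` and `history[2]` reproduce the residual vectors $w^{(1)}$ and $w^{(2)}$ of the example, and after the 20 steps chosen here `D[0]` is close to the real and imaginary parts of $p(z)$.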
Now, let us examine the convergence conditions. The iterations converge to the desired
result if vector w(j) is bounded. Define constants ξ, α and ∆ (with 0 ≤ ∆ < 1) such that
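1. $|x| + |y| \le \alpha$;
2. for any $k$ between 0 and $n$,
$$|p_k^r| \le \xi, \qquad |p_k^i| \le \xi$$
3. for any $k$ between 0 and $n$,
$$|w_{k,r}^{(j)} - \hat{w}_{k,r}^{(j)}| \le \frac{\Delta}{2}, \qquad |w_{k,i}^{(j)} - \hat{w}_{k,i}^{(j)}| \le \frac{\Delta}{2}$$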
Consider the following example: we wish to evaluate
$$p(z) = 1 + z + (-0.5 - 1.25i)\,z^2 + (1 + i)\,z^3$$
at point
$$z = \frac{1}{100} + \frac{i}{10}.$$
We assume that ∆ = 0 (that is, we use non-redundant residuals). We get:
• initialization:
$$w^{(0)} = [p_0^r, p_0^i, p_1^r, p_1^i, p_2^r, p_2^i, p_3^r, p_3^i]^t = [1, 0, 1, 0, -0.5, -1.25, 1, 1]^t,$$
which gives
$$w^{(1)} = [0.02, 0.2, 0.2, -0.02, -1.18, -0.28, 0, 0]^t,$$
which gives
$$w^{(2)} = [0.04, 0.4, 0.38, -0.24, -0.36, -0.56, 0, 0]^t.$$
The computed approximation is equal to
$$\frac{533789}{524288} + \frac{57727}{524288}\,i \approx 1.018121719 + 0.110105514\,i$$
whereas the exact value of $p(z)$ is $1.018121 + 0.110106\,i$.
Exactly as in the real case, even if polynomial $p$ and point $z$ do not satisfy the convergence
constraints, one can easily “transform” them using mere shifts, so that $p(z)$ can be computed
using the E-method. Once ∆ is chosen, and α is defined as $1/4 - \Delta/2$, this is done as follows:
1. Find the smallest integer $k$ such that $|\Re(z/2^k)| + |\Im(z/2^k)|$ is less than α;
2. Now, $p(z) = \pi(t)$, where $t = z/2^k$ and the degree-$m$ coefficient of polynomial $\pi$ is $2^{mk} p_m$. If at least
one of the coefficients of $\pi$ has the absolute value of its real or imaginary part greater
than ξ = 3/2, then divide $\pi$ by $2^\ell$, where $\ell$ is the smallest integer such that $\rho = \pi/2^\ell$
has the absolute value of the real and imaginary parts of its coefficients less than ξ;
3. What we actually compute using the E-method is $\rho(z/2^k)$. This result will then be
multiplied by $2^\ell$ (a simple left-shift) to get $p(z)$.
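The three scaling steps above can be sketched as follows. This is an illustration under assumptions not stated in the text: the function names are hypothetical, only non-negative shift counts $k$ and $\ell$ are searched, and the coefficients are scaled until their parts are strictly below ξ.

```python
def scale_for_e_method(p, z, delta=0.0, xi=1.5):
    """Return (k, ell, t, rho) with p(z) = 2**ell * rho(t) and t = z / 2**k."""
    alpha = 0.25 - delta / 2
    # Step 1: shift the argument until |Re t| + |Im t| < alpha.
    k = 0
    while (abs(z.real) + abs(z.imag)) / 2 ** k >= alpha:
        k += 1
    t = z / 2 ** k
    # Step 2: pi has degree-m coefficient 2**(m*k) * p_m, so pi(t) = p(z).
    pi = [c * 2 ** (m * k) for m, c in enumerate(p)]
    # Divide by 2**ell until all real and imaginary parts are below xi.
    ell = 0
    while any(max(abs(c.real), abs(c.imag)) / 2 ** ell >= xi for c in pi):
        ell += 1
    rho = [c / 2 ** ell for c in pi]
    return k, ell, t, rho


def horner(c, z):
    """Direct complex evaluation, used here only to check the scaling."""
    acc = 0j
    for ck in reversed(c):
        acc = acc * z + ck
    return acc


# Hypothetical polynomial and point that violate the convergence constraints.
p = [1 + 0j, 2 + 1j, -3j]
z = 3 - 2j
k, ell, t, rho = scale_for_e_method(p, z)
# Step 3: undo the coefficient scaling with a left shift by ell positions.
recovered = horner(rho, t) * 2 ** ell
```

For this sample the sketch finds k = 5 and ℓ = 12, and `recovered` matches `horner(p, z)` up to rounding. Note how quickly ℓ grows with the degree, since the degree-m coefficient is multiplied by $2^{mk}$ before being scaled back.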
4.2 Rational function evaluation
We now wish to evaluate
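$$R(z) = \frac{p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n}{1 + q_1 z + q_2 z^2 + \cdots + q_n z^n}$$
at the complex point $z = x + iy$, with $p_k = p_k^r + i p_k^i$ and $q_k = q_k^r + i q_k^i$. The matrix of the
CR-transform obtained in Section 3 is
$$
\begin{pmatrix}
1 & 0 & -x & y & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & -y & -x & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
q_1^r & -q_1^i & 1 & 0 & -x & y & 0 & 0 & \cdots & 0 & 0 \\
q_1^i & q_1^r & 0 & 1 & -y & -x & 0 & 0 & \cdots & 0 & 0 \\
q_2^r & -q_2^i & 0 & 0 & 1 & 0 & -x & y & \cdots & 0 & 0 \\
q_2^i & q_2^r & 0 & 0 & 0 & 1 & -y & -x & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
q_n^r & -q_n^i & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\
q_n^i & q_n^r & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
$$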
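• for $k = 0$,
$$
\begin{aligned}
w_{0,r}^{(j)} &= 2\left[ w_{0,r}^{(j-1)} - d_{0,r}^{(j-1)} + x\,d_{1,r}^{(j-1)} - y\,d_{1,i}^{(j-1)} \right] \\
w_{0,i}^{(j)} &= 2\left[ w_{0,i}^{(j-1)} - d_{0,i}^{(j-1)} + y\,d_{1,r}^{(j-1)} + x\,d_{1,i}^{(j-1)} \right]
\end{aligned}
$$
• for $k = 1, \ldots, n - 1$,
$$
\begin{aligned}
w_{k,r}^{(j)} &= 2\left[ w_{k,r}^{(j-1)} - d_{k,r}^{(j-1)} - q_k^r d_{0,r}^{(j-1)} + q_k^i d_{0,i}^{(j-1)} + x\,d_{k+1,r}^{(j-1)} - y\,d_{k+1,i}^{(j-1)} \right] \\
w_{k,i}^{(j)} &= 2\left[ w_{k,i}^{(j-1)} - d_{k,i}^{(j-1)} - q_k^i d_{0,r}^{(j-1)} - q_k^r d_{0,i}^{(j-1)} + y\,d_{k+1,r}^{(j-1)} + x\,d_{k+1,i}^{(j-1)} \right]
\end{aligned}
\tag{17}
$$
• for $k = n$,
$$
\begin{aligned}
w_{n,r}^{(j)} &= 2\left[ w_{n,r}^{(j-1)} - d_{n,r}^{(j-1)} - q_n^r d_{0,r}^{(j-1)} + q_n^i d_{0,i}^{(j-1)} \right] \\
w_{n,i}^{(j)} &= 2\left[ w_{n,i}^{(j-1)} - d_{n,i}^{(j-1)} - q_n^i d_{0,r}^{(j-1)} - q_n^r d_{0,i}^{(j-1)} \right]
\end{aligned}
$$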
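Similarly to the polynomial case, define constants ξ, α, and ∆ (with 0 ≤ ∆ < 1) so that
$$
\begin{aligned}
&\forall k,\ |p_k^r| \le \xi \\
&\forall k,\ |p_k^i| \le \xi \\
&\forall k,\ |x| + |y| + |q_k^r| + |q_k^i| \le \alpha \\
&\forall k,\ |w_{k,r}^{(j)} - \hat{w}_{k,r}^{(j)}| \le \frac{\Delta}{2} \\
&\forall k,\ |w_{k,i}^{(j)} - \hat{w}_{k,i}^{(j)}| \le \frac{\Delta}{2}
\end{aligned}
\tag{18}
$$
Again, for this bound to be valid, we must be sure that it is possible to find a suitable choice
of $d_{k,r}^{(j)}$ and $d_{k,i}^{(j)}$. This requires that $|w_{k,r}^{(j)}|$ and $|w_{k,i}^{(j)}|$ should be less than 3/2, which gives the
conditions
$$
\begin{aligned}
\Delta + 2\alpha &\le 1/2 \\
\xi &\le 3/2.
\end{aligned}
\tag{19}
$$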
Unfortunately, as in the real case, there is no simple transformation rule that allows one to
evaluate any rational function. In the real case, this problem is discussed in [3, 13].
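For inputs that do satisfy conditions (18) and (19), the recurrence (17) can be simulated in software. The following is a minimal sketch under stated assumptions: ∆ = 0, digits chosen by a literal transcription of (12), exact rational arithmetic, and a hypothetical degree-2 example (the coefficients, the point $z = 1/16 + i/32$ and the step count are illustrative and do not come from the report).

```python
from fractions import Fraction
from math import floor


def select(w):
    """Selection function (12), transcribed literally, for exact rationals."""
    sign = (w > 0) - (w < 0)
    if abs(w) <= 1:
        return sign * floor(abs(w + Fraction(1, 2)))
    return sign * floor(abs(w))


def ce_rational(p, q, x, y, steps):
    """Run recurrence (17); p[k] and q[k] are (real, imag) pairs, q[0] unused.

    Returns D, where D[k] = [D_k^r, D_k^i] accumulates the digits d^(j) * 2^-j.
    """
    n = len(p) - 1
    w = [list(c) for c in p]                      # w^(0) holds the numerator
    D = [[Fraction(0), Fraction(0)] for _ in p]
    for j in range(steps):
        d = [(select(wr), select(wi)) for wr, wi in w]
        for k in range(n + 1):
            D[k][0] += Fraction(d[k][0], 2 ** j)
            D[k][1] += Fraction(d[k][1], 2 ** j)
        d0r, d0i = d[0]
        nxt = []
        for k in range(n + 1):
            dr1, di1 = d[k + 1] if k < n else (0, 0)   # no neighbour for k = n
            qr, qi = q[k] if k > 0 else (0, 0)         # row 0 has no q term
            nxt.append([
                2 * (w[k][0] - d[k][0] - qr * d0r + qi * d0i + x * dr1 - y * di1),
                2 * (w[k][1] - d[k][1] - qi * d0r - qr * d0i + y * dr1 + x * di1),
            ])
        w = nxt
    return D


F = Fraction
X, Y = F(1, 16), F(1, 32)
P = [(F(1), F(0)), (F(1, 2), F(-1, 4)), (F(-1), F(1))]
Q = [None, (F(1, 16), F(1, 32)), (F(-1, 32), F(1, 16))]
# Here |x| + |y| + |q_k^r| + |q_k^i| = 3/16 <= 1/4 and |p_k| parts <= 1 <= 3/2.
D = ce_rational(P, Q, X, Y, steps=24)
```

After the 24 steps chosen here, `D[0]` approximates the real and imaginary parts of $R(z)$ for this sample; no complex division is performed anywhere in the loop.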
section. The corresponding implementations considered for the real domain E-method are
in [2, 3, 7].
A general scheme for evaluation of complex polynomials is shown in Figure 1 for n = 3
and the corresponding elementary unit (PEU) is illustrated in Figure 2. A bit-parallel bus
transmits x and y values in a broadcast mode, while the real and imaginary coefficients pr
and pi are loaded in separate cycles. Note that the initialization cycles could be shorter
than the iteration cycles.
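[Figure 1: General scheme for evaluation of complex polynomials, n = 3. A bus carrying x, y, p^r, p^i feeds the elementary units PEU0r, PEU0i, PEU1r, PEU1i, PEU2r, PEU2i, PEU3r, PEU3i, which produce s_0^r, s_0^i, s_1^r, s_1^i, s_2^r, s_2^i, s_3^r, s_3^i.]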
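[Figure 2: Block diagram of the elementary unit for polynomial evaluation (PEU). Labels recovered from the figure: inputs s_2^r, x, s_2^i, y, p_0^r, 0; residual words ws, wc; blocks MG, MG, REG, REG, [4:2] ADDER, SEL; output s_0^r.]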
• Registers (4)
• Multiple generators MG (2), producing {−1, 0, 1} × x and {−1, 0, 1} × y, with buffers
• Multiplexer MUX for initializing the residual
• A [4:2] adder
• Output digit selection SEL (a table or a gate network):
The cycle time, in terms of a full adder (complex gate) delay t, is estimated as
$$T_{PEU} = t_{BUFF} + t_{MG} + t_{SEL} + t_{[4:2]} + t_{REG} \approx (0.4 + 0.3 + 1 + 1.3 + 0.9)\,t = 3.9\,t \tag{21}$$
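[Figure 3: General scheme for rational function evaluation, n = 3. A bus carrying x, y, p^r, p^i, q^r, q^i feeds the units PEU0r, PEU0i, REU1r, REU1i, REU2r, REU2i, PEU3r, PEU3i, which produce s_0^r, s_0^i, s_1^r, s_1^i, s_2^r, s_2^i, s_3^r, s_3^i.]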
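[Figure 4 labels recovered from the figure: inputs s_2^r, x, s_2^i, y, s_0^r, q_1^r, s_0^i, q_1^i, p_1^r, 0; residual words ws, wc; blocks MG, MG, MG, MG, REG, REG, [6:2] ADDER, SEL; output s_1^r.]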
Figure 4: Block diagram of Elementary Unit for rational function evaluation (REU0 ).
A block diagram of an Elementary Unit for rational function evaluation (REU ) is shown
in Figure 4.
The modules in Figure 4 are:
• Registers (6)
• Multiple generators MG (4), producing {−1, 0, 1} × x etc. with buffers
• Multiplexer MUX for initializing the residual
• A [6:2] adder
• Output digit selection SEL (a table or a gate network):
The cycle time, in terms of a full adder (complex gate) delay t, is estimated as
$$T_{REU} = t_{BUFF} + t_{MG} + t_{SEL} + t_{[6:2]} + t_{REG} \approx (0.4 + 0.3 + 1 + 2.3 + 0.9)\,t = 4.9\,t \tag{23}$$
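The two cycle-time estimates, and the resulting latency for an m-digit result (about m cycles, as stated in the introduction), amount to simple sums. The sketch below only restates that arithmetic; the dictionary keys, the function names and the choice m = 24 are illustrative assumptions, and scaling steps are ignored.

```python
# Component delays in units of one full-adder delay t, as quoted in (21) and (23).
PEU_DELAYS = {"buffer": 0.4, "multiple_generator": 0.3, "selection": 1.0,
              "adder_4_2": 1.3, "register": 0.9}
REU_DELAYS = {"buffer": 0.4, "multiple_generator": 0.3, "selection": 1.0,
              "adder_6_2": 2.3, "register": 0.9}


def cycle_time(delays):
    """Cycle time of one elementary unit, in full-adder delays."""
    return sum(delays.values())


def latency(delays, m):
    """Approximate time to produce m digits, assuming one digit per cycle."""
    return m * cycle_time(delays)


t_peu = cycle_time(PEU_DELAYS)
t_reu = cycle_time(REU_DELAYS)
latency_peu_24 = latency(PEU_DELAYS, 24)
latency_reu_24 = latency(REU_DELAYS, 24)
```

The only difference between the two units in this estimate is the adder: replacing the [4:2] adder with a [6:2] adder adds one full-adder delay per cycle.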
then
• with polynomial approximations, one would need a degree-9 polynomial for the exp
function, a degree-8 polynomial for the cos and a degree-9 polynomial for the sin;
• with rational approximations, a (4/4)-fraction for the exp function, a (4/4)-fraction for
the cos and a (5/5)-fraction for the sin, where an (n/m)-fraction is a rational fraction
whose numerator has degree n, and whose denominator has degree m.
4.5 Potential Applications
For computing complex square roots with moderate (e.g., single) precision, our method
can be of interest. In [6] we adapted the real digit-recurrence square-root iteration to the
complex case. The basic iteration is simple, yet there is a prescaling initial step that requires
a look-up in a rather large table and a small multiplication. If the required precision is large
enough, that method is of much interest (the initial step can then be neglected). If this is
not the case, it is much simpler to use a Padé approximation of the square root and the
complex E-method. For instance, for computing $\sqrt{1 + z}$ with $\max(|\Re(z)|, |\Im(z)|) \le 1/2$ (this
domain is large enough that reduction to it is straightforward), one can use the
Padé approximation
It has real coefficients only, and the error is less than $9.3 \times 10^{-10}$.
4.6 Summary
We have presented a method for solving diagonally dominant linear systems in the complex
domain by a digit-recurrence algorithm. This is a generalization of the real-domain
E-method. The method is particularly area/cost-effective for solving systems with sparse
coefficient matrices. Specifically, the method is suitable for evaluating complex polynomials,
integer powers of a complex number, and complex rational functions. The latency is roughly
m cycles for m bits of precision and independent of the order of the system. This does not
take into account potentially needed scaling steps. The cycle time is independent of m.
We discussed the transform from the complex to the real domain, the iterations, and the
convergence conditions. The application of the method to polynomials, rational functions, and
division is described. Implementation is given at a high level with estimates of the cost and latency.
A detailed design and its hardware implementation with FPGAs are considered.
References
[1] K. Benmahammed, Evaluation of Complex Polynomials in One and Two Variables.
Multidimensional Systems and Signal Processing, 5, 245-261, 1994.
[2] M.D. Ercegovac. A general method for evaluation of functions and computation in a
digital computer. PhD thesis, Dept. of Computer Science, University of Illinois, Urbana-
Champaign, 1975.
[3] M.D. Ercegovac. A general hardware-oriented method for evaluation of functions and
computations in a digital computer. IEEE Trans. Comp., C-26(7):667–680, 1977.
[4] M.D. Ercegovac and J.-M. Muller. Complex Division with Prescaling of Operands. IEEE
International Conference on Application-Specific Systems, Architectures and Proces-
sors, pp. 293-303, 2003.
[5] M.D. Ercegovac and J.-M. Muller, Design of a complex divider. Proc. SPIE on Advanced
Signal Processing Algorithms, Architectures, and Implementations XII, pp. 51-59, 2004.
[6] M.D. Ercegovac and J.-M. Muller. Complex Square Root with Operand Prescal-
ing. IEEE International Conference on Application-Specific Systems, Architectures and
Processors, pp. 293-303, 2004.
[7] M.D. Ercegovac and T. Lang. Digital Arithmetic, Morgan Kaufmann Publishers - an
Imprint of Elsevier Science, San Francisco, 2004.
[8] R. McIlhenny and M.D. Ercegovac. On-Line Algorithms for Complex Number Arith-
metic. Proc. 32nd Asilomar Conference on Signals, Systems and Computers, pages
172-176, 1998.
[9] R. McIlhenny, Complex Number On-line Arithmetic for Reconfigurable Hardware: Al-
gorithms, Implementations, and Applications, PhD Dissertation, UCLA Computer Sci-
ence Department, 2002.
[10] R. McIlhenny and M.D. Ercegovac, On the Design of an On-Line Complex FIR Filter.
Proc. 38th Asilomar Conference on Signals, Systems and Computers, pp. 478-482, 2004.
[11] R. McIlhenny and M. D. Ercegovac, On the Design of an On-line Complex Matrix
Inversion Unit. Proc. 39th Asilomar Conference on Signals, Systems and Computers, 5
pps., 2005.
[12] R. McIlhenny and M. D. Ercegovac, On the Design of an On-line Complex Householder
Transform. Proc. 40th Asilomar Conference on Signals, Systems and Computers, 5 pps.,
2006.
[13] N. Brisebarre and J.-M. Muller. Functions approximable by E-fractions. 38th Asilomar
Conference on Signals, Systems and Computers, Pacific Grove, California, Nov. 2004.
[14] A.H. Nuttall, Efficient Evaluation of Polynomials and Exponentials of Polynomials for
Equispaced Arguments, IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-35,
pp. 1486-1487, 1987.
[15] F. W. J. Olver, Error Bounds for Polynomial Evaluation and Complex Arithmetic,
IMA Journal of Numerical Analysis 6, 373-379, 1986.
[16] J.H. Reif, Approximate Complex Polynomial Evaluation in Near Constant Work Per
Point, STOC 97, pp. 30-39, 1997.