Solving Systems of Linear Equations in Complex

Domain : Complex E-Method


Milos Ercegovac, Jean-Michel Muller

To cite this version:


Milos Ercegovac, Jean-Michel Muller. Solving Systems of Linear Equations in Complex Domain :
Complex E-Method. 2007. ⟨ensl-00125369v2⟩

HAL Id: ensl-00125369


https://ens-lyon.hal.science/ensl-00125369v2
Preprint submitted on 24 Jan 2007

HAL is a multi-disciplinary open access archive for the deposit and dissemination of
scientific research documents, whether they are published or not. The documents may come
from teaching and research institutions in France or abroad, or from public or private
research centers.
Solving Systems of Linear Equations in Complex
Domain: Complex E-Method
Miloš D. Ercegovac
Computer Science Department
3732 Boelter Hall
University of California at Los Angeles
Los Angeles, CA 90024, USA

Jean-Michel Muller
CNRS-Laboratoire LIP, projet Arénaire
Ecole Normale Supérieure de Lyon
46 Allée d’Italie
69364 Lyon Cedex 07, France

This is LIP research report number 2007-2


LIP is a laboratory of CNRS, Ecole Normale Supérieure de Lyon,
INRIA and Université Claude Bernard Lyon 1

Abstract
The E-method, introduced in [2, 3], allows efficient parallel solution of diagonally
dominant systems of linear equations in real domain using simple and highly regular
hardware. Since the evaluation of polynomials and certain rational functions can be
achieved by solving the corresponding linear systems, the E-method is an attractive
general approach for function evaluation. We generalize the E-method to complex
linear systems, and show some potential applications such as the evaluation of complex
polynomials and rational functions.

1 Introduction
In this report we propose an extension of a digit-iterative method for solving systems of
linear equations, the E-method [2, 3, 7], to allow the use of the complex number system.
The proposed approach is suitable for hardware implementation. The main characteristics
of the method are: (i) m-digit solution is computed in about m steps, each step consisting
of a sum of number-by-digit products, (ii) the cycle time depends on the number of nonzero
coefficients, (iii) the cycle time does not depend on the precision m (if redundant additions
are used), (iv) for a system of order n, the shortest latency requires n elementary units for
the real part, and n units for the imaginary part, and (v) the elementary units are inter-
connected with digit-wide links. The approach is particularly efficient when the coefficient
matrix is sparse. This happens when the E-method is used to evaluate polynomials (one
off-diagonal element) and rational functions (two off-diagonal elements). Other examples are
a tridiagonal system (two off-diagonal elements), powers of the argument (one off-diagonal
element), and special expressions.
We first introduce the transform which allows the E-method to be used in the complex
field C. Then we show how to use the complex E-method (CE-method) in evaluating
complex polynomials and rational functions as particularly interesting cases. Evaluation of
consecutive powers of a complex argument is a special case of polynomial evaluation.
With the exception of complex addition and multiplication, other complex operations
are typically not implemented in hardware. Online algorithms for complex arithmetic have
been proposed and implemented in FPGAs in [8, 9]. Based on this work, algorithms and
implementations for complex FIR filters, complex matrix inversion, and complex House-
holder transform have been developed [10, 11, 12]. Recently, hardware-oriented methods for
complex division and square root have been introduced [4, 6]. The method proposed in this
report extends complex arithmetic to complex polynomials, complex powers, and rational
functions - a significant extension of the domain of hardware implementation for complex
arithmetic.
Complex polynomials appear in many areas such as digital signal and image processing,
control systems, and applied mathematics, in general. A Horner type method for evaluating
complex polynomials is proposed in [1] at the algorithm level, implicitly assuming a software
implementation. The method uses O(n) multiplications and O(n) additions for a complex
polynomial of degree n. If these multiplications and additions are performed in a sequential
order, the latency of the method is about $n \times T_{\mathrm{MULT-ADD}}$, which is significantly slower
than our method. If a parallel algorithm for polynomial evaluation is used, the total time is
about $\log n \times T_{\mathrm{MULT-ADD}}$, which is still slower than our method.
The case of rational functions with complex coefficients and argument is even more
attractive: using the proposed CE-method, we avoid explicit complex division and produce
the result, as mentioned above, in time proportional to the desired precision. We are not
aware of prior special algorithms for evaluation of complex rational functions in hardware.
In the next section we describe the transformation which maps computation from the
complex to the real domain. In Section 3 we show the CE-method. In Section 4 iterations
and convergence conditions are considered. Implementation aspects are discussed in Section
4.4.

2 Complex-Real (CR) Transforms


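Complex numbers can be represented by 2 × 2 skew-symmetric matrices
$$x + iy \leftrightarrow \begin{pmatrix} x & -y \\ y & x \end{pmatrix} \tag{1}$$
This isomorphism holds for complex addition and multiplication, which are used in the
proposed method:
$$(a + ib) + (c + id) \leftrightarrow
\begin{pmatrix} a & -b \\ b & a \end{pmatrix} + \begin{pmatrix} c & -d \\ d & c \end{pmatrix}
= \begin{pmatrix} a + c & -b - d \\ b + d & a + c \end{pmatrix}
\leftrightarrow (a + c) + i(b + d) \tag{2}$$
$$(a + ib) \times (c + id) \leftrightarrow
\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \times \begin{pmatrix} c & -d \\ d & c \end{pmatrix}
= \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix}
\leftrightarrow (ac - bd) + i(bc + ad) \tag{3}$$
Consequently, an m × n matrix of complex numbers can be represented as a 2m × 2n matrix
of real numbers. For n × n complex matrices, considered in this paper, the transform from
the complex domain to the real domain is defined next.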
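The correspondences (1) to (3) can be checked numerically. The following is a minimal sketch in exact rational arithmetic; the helper names `to_matrix`, `mat_add` and `mat_mul` and the sample values are illustrative assumptions, not part of the report.

```python
from fractions import Fraction

def to_matrix(z_re, z_im):
    """2x2 real matrix [[x, -y], [y, x]] representing x + iy, as in (1)."""
    return [[z_re, -z_im], [z_im, z_re]]

def mat_add(m1, m2):
    """Entrywise sum of two 2x2 matrices."""
    return [[m1[r][c] + m2[r][c] for c in range(2)] for r in range(2)]

def mat_mul(m1, m2):
    """Ordinary product of two 2x2 matrices."""
    return [[sum(m1[r][k] * m2[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

a, b = Fraction(3, 4), Fraction(-1, 2)   # a + ib
c, d = Fraction(1, 8), Fraction(5, 2)    # c + id

# Sum and product computed on the matrix side ...
sum_m = mat_add(to_matrix(a, b), to_matrix(c, d))
prod_m = mat_mul(to_matrix(a, b), to_matrix(c, d))

# ... agree with the matrices of the complex sum and product, as in (2) and (3).
sum_ok = sum_m == to_matrix(a + c, b + d)
prod_ok = prod_m == to_matrix(a * c - b * d, b * c + a * d)
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a tolerance.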

Definition 1 The CR-transform of the n-dimensional complex linear system
$$\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & \cdots & a_{2,n} \\
a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & \cdots & a_{3,n} \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\
a_{n,1} & a_{n,2} & a_{n,3} & a_{n,4} & \cdots & a_{n,n}
\end{pmatrix}
\times
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ \vdots \\ z_n \end{pmatrix}
=
\begin{pmatrix} t_1 \\ t_2 \\ t_3 \\ \vdots \\ t_n \end{pmatrix} \tag{4}$$
is the 2n-dimensional real linear system
$$\begin{pmatrix}
a^r_{1,1} & -a^i_{1,1} & a^r_{1,2} & -a^i_{1,2} & \cdots & a^r_{1,n} & -a^i_{1,n} \\
a^i_{1,1} & a^r_{1,1} & a^i_{1,2} & a^r_{1,2} & \cdots & a^i_{1,n} & a^r_{1,n} \\
a^r_{2,1} & -a^i_{2,1} & a^r_{2,2} & -a^i_{2,2} & \cdots & a^r_{2,n} & -a^i_{2,n} \\
a^i_{2,1} & a^r_{2,1} & a^i_{2,2} & a^r_{2,2} & \cdots & a^i_{2,n} & a^r_{2,n} \\
a^r_{3,1} & -a^i_{3,1} & a^r_{3,2} & -a^i_{3,2} & \cdots & a^r_{3,n} & -a^i_{3,n} \\
a^i_{3,1} & a^r_{3,1} & a^i_{3,2} & a^r_{3,2} & \cdots & a^i_{3,n} & a^r_{3,n} \\
\vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
a^r_{n,1} & -a^i_{n,1} & a^r_{n,2} & -a^i_{n,2} & \cdots & a^r_{n,n} & -a^i_{n,n} \\
a^i_{n,1} & a^r_{n,1} & a^i_{n,2} & a^r_{n,2} & \cdots & a^i_{n,n} & a^r_{n,n}
\end{pmatrix}
\times
\begin{pmatrix} z^r_1 \\ z^i_1 \\ z^r_2 \\ z^i_2 \\ z^r_3 \\ z^i_3 \\ \vdots \\ z^r_n \\ z^i_n \end{pmatrix}
=
\begin{pmatrix} t^r_1 \\ t^i_1 \\ t^r_2 \\ t^i_2 \\ t^r_3 \\ t^i_3 \\ \vdots \\ t^r_n \\ t^i_n \end{pmatrix} \tag{5}$$
where $a_{j,k} = a^r_{j,k} + i a^i_{j,k}$, $z_j = z^r_j + i z^i_j$ and $t_j = t^r_j + i t^i_j$. These two linear systems are
equivalent.
In other words, the real linear system (5) is obtained from the complex linear system (4)
by replacing each matrix element x + iy by the 2 × 2 matrix defined in (1), and each
component of the vectors z and t by the pair of its real and imaginary parts. In the next
section we consider a hardware-oriented method for solving such a system.
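As an illustration of Definition 1, the sketch below builds the real system (5) from a small complex system and checks that the interleaved real and imaginary parts of a known solution satisfy it. The function name `cr_transform` and the 2 × 2 sample data are assumptions made for this example only.

```python
def cr_transform(a, t):
    """Return (a_real, t_real) for the complex system a * z = t, as in (5).

    Each entry x + iy of `a` becomes the block [[x, -y], [y, x]];
    each entry of `t` becomes the pair (real part, imaginary part).
    """
    n = len(a)
    a_real = [[0.0] * (2 * n) for _ in range(2 * n)]
    t_real = [0.0] * (2 * n)
    for j in range(n):
        for k in range(n):
            x, y = a[j][k].real, a[j][k].imag
            a_real[2 * j][2 * k], a_real[2 * j][2 * k + 1] = x, -y
            a_real[2 * j + 1][2 * k], a_real[2 * j + 1][2 * k + 1] = y, x
        t_real[2 * j], t_real[2 * j + 1] = t[j].real, t[j].imag
    return a_real, t_real

# Hypothetical 2x2 example: choose z, form t = a * z, then check that the
# interleaved real vector [z1r, z1i, z2r, z2i] satisfies the real system.
a = [[1 + 0j, 0.25 - 0.5j], [0.125j, 1 + 0j]]
z = [0.5 + 0.75j, -1.5 + 0.25j]
t = [sum(a[j][k] * z[k] for k in range(2)) for j in range(2)]

a_real, t_real = cr_transform(a, t)
z_real = [z[0].real, z[0].imag, z[1].real, z[1].imag]
residual = [sum(a_real[r][c] * z_real[c] for c in range(4)) - t_real[r]
            for r in range(4)]
```

The residual of the real system vanishes (up to rounding), which is the equivalence stated in the definition.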

3 Complex E-method
The E-method [2, 3] provides an iterative approach to solving diagonally dominant real
linear systems. The method has characteristics desirable for efficient hardware implemen-
tation: the basic operators are bit-vector multiplexers, redundant adders of [p : 2] type,
with p ∈ {3, 4, 6} for radix-2, and registers. The overall structure consists of n elementary
units, interconnected digit-serially. The method computes one digit of each component of
the solution per iteration in the MSDF (Most Significant Digit First) manner which allows
digit-serial communication between the modules which operate concurrently. The time to
obtain the solution to m digits of precision is about m cycles (iterations). The amount of
hardware required is roughly related to the number of nonzero terms of the matrix of the
system, which makes the E-method very efficient in hardware resources when the matrix
of the system is sparse. Typical applications of the E-method are evaluation of polynomial
and rational functions, since these correspond to sparse linear systems. The solution of the
linear system
$$\begin{pmatrix}
1 & -x & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & -x & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 1 & -x \\
0 & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \\ y_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ \vdots \\ p_{n-1} \\ p_n \end{pmatrix}$$
is
$$\begin{pmatrix}
p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n \\
p_1 + p_2 x + \cdots + p_n x^{n-1} \\
\vdots \\
p_{n-1} + p_n x \\
p_n
\end{pmatrix}$$
that is, the first component of the solution is
$$p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n$$
whereas the solution of the linear system
$$\begin{pmatrix}
1 & -x & 0 & 0 & 0 & \cdots & 0 \\
q_1 & 1 & -x & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\
q_{n-1} & 0 & \cdots & 0 & 0 & 1 & -x \\
q_n & 0 & 0 & \cdots & 0 & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_{n-1} \\ y_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ \vdots \\ p_{n-1} \\ p_n \end{pmatrix}$$
is
$$\begin{pmatrix}
\dfrac{p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n} \\[2ex]
\dfrac{(p_1 - p_0 q_1) + (p_2 - p_0 q_2) x + \cdots + (p_n - p_0 q_n) x^{n-1}}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n} \\[2ex]
\vdots \\[1ex]
\dfrac{(p_n - q_n p_0) + (p_n q_1 - q_n p_1) x + \cdots + (p_n q_{n-1} - q_n p_{n-1}) x^{n-1}}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n}
\end{pmatrix}$$
That is, the first component of the solution is the rational function
$$\frac{p_0 + p_1 x + p_2 x^2 + \cdots + p_n x^n}{1 + q_1 x + q_2 x^2 + \cdots + q_n x^n}.$$
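The two solution vectors above can be verified by solving both real systems exactly. The sketch below is only a software check (it uses ordinary back substitution, not the E-method); the function names and the sample coefficients are assumptions of this example.

```python
from fractions import Fraction

def solve_polynomial_system(p, x):
    """Back substitution on the bidiagonal system: y_k = p_k + x * y_{k+1}."""
    y = [Fraction(0)] * len(p)
    y[-1] = Fraction(p[-1])
    for k in range(len(p) - 2, -1, -1):
        y[k] = p[k] + x * y[k + 1]
    return y

def solve_rational_system(p, q, x):
    """Solve the system whose first column is (1, q_1, ..., q_n).

    Row k reads q_k*y_0 + y_k - x*y_{k+1} = p_k, so each y_k is an affine
    function a_k + b_k*y_0 of y_0; q[0] is unused and stands for 1.
    """
    n = len(p) - 1
    a, b = [Fraction(0)] * (n + 2), [Fraction(0)] * (n + 2)
    for k in range(n, 0, -1):
        a[k] = p[k] + x * a[k + 1]
        b[k] = -q[k] + x * b[k + 1]
    y0 = (p[0] + x * a[1]) / (1 - x * b[1])   # row 0: y_0 - x*y_1 = p_0
    return [y0] + [a[k] + b[k] * y0 for k in range(1, n + 1)]

def horner(c, x):
    """Evaluate c[0] + c[1] x + ... by Horner's rule."""
    acc = Fraction(0)
    for ck in reversed(c):
        acc = acc * x + ck
    return acc

x = Fraction(1, 8)
p = [Fraction(1), Fraction(-1, 2), Fraction(3, 4), Fraction(1, 3)]
q = [Fraction(1), Fraction(1, 16), Fraction(-1, 8), Fraction(1, 32)]

y_poly = solve_polynomial_system(p, x)   # y_poly[0] is the polynomial value
y_rat = solve_rational_system(p, q, x)   # y_rat[0] is the rational function value
```

In the rational case the denominator `1 - x * b[1]` expands to 1 + q_1 x + ... + q_n x^n, which is how the division appears without ever being a row of the system.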
Now, let us turn to the evaluation of complex polynomials of a complex argument. We
wish to evaluate
$$p(z) = p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n$$
where the $p_j$'s and z are complex numbers. As in the real case, the desired value p(z) is
clearly equal to the first component of the solution of the linear system
$$\begin{pmatrix}
1 & -z & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & -z & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & -z & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \\ \vdots \\ y_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ p_3 \\ p_4 \\ \vdots \\ p_n \end{pmatrix} \tag{6}$$
The E-method cannot directly solve the linear system (6), but if we define real numbers
x and y by x + iy = z, and $p^r_j$ and $p^i_j$ by $p_j = p^r_j + i p^i_j$, then we can apply the CR-transform
to (6), and get the following linear system.

The matrix is
$$E = \begin{pmatrix}
1 & 0 & -x & y & 0 & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & -y & -x & 0 & 0 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & -x & y & 0 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & -y & -x & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & 0 & 0 & 1 & 0 & -x & y \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & 1 & -y & -x \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}$$

The first two components of the solution s of the linear system
$$E \times
\begin{pmatrix} s^r_0 \\ s^i_0 \\ s^r_1 \\ s^i_1 \\ \vdots \\ s^r_{n-1} \\ s^i_{n-1} \\ s^r_n \\ s^i_n \end{pmatrix}
=
\begin{pmatrix} p^r_0 \\ p^i_0 \\ p^r_1 \\ p^i_1 \\ \vdots \\ p^r_{n-1} \\ p^i_{n-1} \\ p^r_n \\ p^i_n \end{pmatrix} \tag{7}$$
are equal to the real and imaginary parts of
$$p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n.$$
For instance, in the case n = 3, we get
$$s = \begin{pmatrix}
-3xy^2 p^r_3 + x^3 p^r_3 - 3yx^2 p^i_3 + x^2 p^r_2 - 2xy p^i_2 + x p^r_1 + y^3 p^i_3 - y^2 p^r_2 - y p^i_1 + p^r_0 \\
-y^3 p^r_3 + 3yx^2 p^r_3 - 3y^2 x p^i_3 + 2yx p^r_2 - y^2 p^i_2 + y p^r_1 + x^3 p^i_3 + x^2 p^i_2 + x p^i_1 + p^i_0 \\
-y^2 p^r_3 + x^2 p^r_3 - 2yx p^i_3 + x p^r_2 - y p^i_2 + p^r_1 \\
x^2 p^i_3 + 2xy p^r_3 - y^2 p^i_3 + y p^r_2 + x p^i_2 + p^i_1 \\
x p^r_3 - y p^i_3 + p^r_2 \\
y p^r_3 + x p^i_3 + p^i_2 \\
p^r_3 \\
p^i_3
\end{pmatrix}$$
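The n = 3 expressions can be cross-checked against complex arithmetic. The sketch below transcribes the eight entries of s and compares them with the back substitution s_k = p_k + z s_{k+1} on system (6); the function names and the sample point are assumptions of this example.

```python
def solution_n3(x, y, pr, pi):
    """Closed-form solution s of system (7) for n = 3 (real arithmetic only)."""
    s0r = (-3*x*y**2*pr[3] + x**3*pr[3] - 3*y*x**2*pi[3] + x**2*pr[2]
           - 2*x*y*pi[2] + x*pr[1] + y**3*pi[3] - y**2*pr[2] - y*pi[1] + pr[0])
    s0i = (-y**3*pr[3] + 3*y*x**2*pr[3] - 3*y**2*x*pi[3] + 2*y*x*pr[2]
           - y**2*pi[2] + y*pr[1] + x**3*pi[3] + x**2*pi[2] + x*pi[1] + pi[0])
    s1r = -y**2*pr[3] + x**2*pr[3] - 2*y*x*pi[3] + x*pr[2] - y*pi[2] + pr[1]
    s1i = x**2*pi[3] + 2*x*y*pr[3] - y**2*pi[3] + y*pr[2] + x*pi[2] + pi[1]
    s2r = x*pr[3] - y*pi[3] + pr[2]
    s2i = y*pr[3] + x*pi[3] + pi[2]
    return [s0r, s0i, s1r, s1i, s2r, s2i, pr[3], pi[3]]

def back_substitution(z, p):
    """Complex back substitution on (6): s_k = p_k + z * s_{k+1}."""
    s = [0j] * len(p)
    s[-1] = p[-1]
    for k in range(len(p) - 2, -1, -1):
        s[k] = p[k] + z * s[k + 1]
    return s

z = 0.25 - 0.5j
p = [1 + 0j, -0.5 + 2j, 0.75 - 0.25j, 1.5 + 1j]
s_real = solution_n3(z.real, z.imag, [c.real for c in p], [c.imag for c in p])
s_cplx = back_substitution(z, p)
max_diff = max(abs(complex(s_real[2 * k], s_real[2 * k + 1]) - s_cplx[k])
               for k in range(4))
```

The first complex component `s_cplx[0]` is p(z), so agreement of `s_real[0]` and `s_real[1]` with its real and imaginary parts is the statement made about system (7).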

The linear system (7) is easily solved by the E-method, provided that it is diagonally dom-
inant (see Section 4 for details on the iterations and convergence conditions). Note that
the E-method does not evaluate directly the expressions given for the solution s0 . These
would require at least 16+16 full multiplications, that, assuming enough multipliers, would
take at least 3 consecutive multiply times. Moreover, the reduction of product terms would
require a [10:2] reduction. Of course, all the interconnections are of full precision. Instead,
as explained later, the complex E-method computes s0 on 14 serial-parallel (left-to-right)
multipliers, including the additions, in about one serial-parallel multiplication time. In this
approach, the interconnections are digit-serial.
Now, let us turn to rational functions of a complex argument with complex coefficients
(assuming the degree-0 coefficient of the denominator is 1). We wish to evaluate
$$R(z) = \frac{p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n}{1 + q_1 z + q_2 z^2 + \cdots + q_n z^n}$$
where the $p_j$'s, the $q_j$'s and z are complex numbers. Clearly, R(z) is equal to the first
component of the solution of the linear system
$$\begin{pmatrix}
1 & -z & 0 & 0 & 0 & \cdots & 0 \\
q_1 & 1 & -z & 0 & 0 & \cdots & 0 \\
q_2 & 0 & 1 & -z & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
q_n & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} s_0 \\ s_1 \\ s_2 \\ \vdots \\ s_n \end{pmatrix}
=
\begin{pmatrix} p_0 \\ p_1 \\ p_2 \\ \vdots \\ p_n \end{pmatrix} \tag{8}$$
The E-method cannot directly solve the linear system (8), but it suffices to take the
CR-transform of that system. Define z = x + iy, $p_j = p^r_j + i p^i_j$ and $q_j = q^r_j + i q^i_j$. The
CR-transform results in the following linear system

$$\begin{pmatrix}
1 & 0 & -x & y & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & -y & -x & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
q^r_1 & -q^i_1 & 1 & 0 & -x & y & 0 & 0 & \cdots & 0 & 0 \\
q^i_1 & q^r_1 & 0 & 1 & -y & -x & 0 & 0 & \cdots & 0 & 0 \\
q^r_2 & -q^i_2 & 0 & 0 & 1 & 0 & -x & y & \cdots & 0 & 0 \\
q^i_2 & q^r_2 & 0 & 0 & 0 & 1 & -y & -x & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
q^r_n & -q^i_n & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\
q^i_n & q^r_n & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}
\times
\begin{pmatrix} s^r_0 \\ s^i_0 \\ s^r_1 \\ s^i_1 \\ s^r_2 \\ s^i_2 \\ \vdots \\ s^r_n \\ s^i_n \end{pmatrix}
=
\begin{pmatrix} p^r_0 \\ p^i_0 \\ p^r_1 \\ p^i_1 \\ p^r_2 \\ p^i_2 \\ \vdots \\ p^r_n \\ p^i_n \end{pmatrix} \tag{9}$$

Again, that system is easily solved by the E-method, provided that it satisfies the
convergence conditions (see Section 4), and
$$R(z) = s^r_0 + i s^i_0.$$

4 Iteration, and convergence conditions


To make the presentation simpler, we will focus on radix-2 iterations only. Adaptation
to higher radices is rather straightforward. The radix-2 E-method consists in solving the
n-dimensional linear system
$$Ax = P$$
by using the following basic recursion on residuals:
$$w^{(j)} = 2 \times \left[ w^{(j-1)} - A d^{(j-1)} \right] \tag{10}$$
with $w^{(0)} = [p_0, p_1, \ldots, p_n]^t$, and $d^{(j)} = [d^{(j)}_0, d^{(j)}_1, \ldots, d^{(j)}_n]^t$ where the digits $d^{(j)}_k$ are in {−1, 0, 1}.
Define the number $D^{(j)}_k = d^{(0)}_k . d^{(1)}_k d^{(2)}_k \ldots d^{(j)}_k$ (the $d^{(j)}_k$ are the digits of a radix-2 signed-digit
representation of $D^{(j)}_k$). By induction, we easily get,
$$w^{(j)} = 2^j \left[ w^{(0)} - A D^{(j-1)} \right]. \tag{11}$$
Using (11), one can show that if the residuals $|w^{(j)}_k|$ are bounded, then for all k, $D^{(j)}_k$ goes
to $y_k$ as j goes to infinity.
The problem at step j is to find a selection function that gives a value of the digits $d^{(j)}_k$ from
the residuals $w^{(j)}_k$ such that the values $w^{(j+1)}_k$ will remain bounded. In [3], the following
selection function (a form of rounding) is proposed
$$s(x) = \begin{cases}
\operatorname{sign} x \times \lfloor |x + 1/2| \rfloor, & \text{if } |x| \le 1 \\
\operatorname{sign} x \times \lfloor |x| \rfloor, & \text{otherwise,}
\end{cases} \tag{12}$$
and applied to the following cases:

1. $d^{(j)}_k = s(w^{(j)}_k)$, i.e., the selection uses a non-redundant $w^{(j)}_k$;
2. $d^{(j)}_k = s(\hat{w}^{(j)}_k)$, where $\hat{w}^{(j)}_k$ is an approximation to $w^{(j)}_k$ (in practice, $\hat{w}^{(j)}_k$ is deduced
from a few digits of $w^{(j)}_k$ by means of rounding or truncation).
Let us now see what this gives in two cases: complex polynomial evaluation and complex
rational function evaluation.
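Before specializing to the complex case, recursion (10) and identity (11) can be exercised in software on a small real system. The following is a sketch under stated assumptions: the digit selection `select` is a plain round-to-nearest with ties resolved toward zero, used here as a stand-in that keeps |d − w| ≤ 1/2 whenever |w| ≤ 3/2 (it is not a transcription of (12)), and the system and function names are hypothetical.

```python
from fractions import Fraction

def select(w):
    """Assumed digit selection: nearest digit in {-1, 0, 1}, ties toward zero."""
    if abs(w) <= Fraction(1, 2):
        return 0
    return 1 if w > 0 else -1

def e_method(a, p, steps):
    """Run `steps` iterations of recursion (10) on the system a * y = p.

    Returns (w, big_d): the residual w^(steps) and the accumulated digit
    vector D^(steps-1), i.e. the sum over t of d^(t) * 2^(-t).
    """
    n = len(p)
    w = [Fraction(v) for v in p]
    big_d = [Fraction(0)] * n
    for t in range(steps):
        d = [select(wk) for wk in w]
        big_d = [big_d[k] + Fraction(d[k], 2 ** t) for k in range(n)]
        w = [2 * (w[k] - sum(a[k][c] * d[c] for c in range(n))) for k in range(n)]
    return w, big_d

# Hypothetical example: p(x) = 1/2 - x/4 + 3x^2/4 at x = 1/8, written as the
# bidiagonal system of Section 3.
x = Fraction(1, 8)
a = [[1, -x, 0], [0, 1, -x], [0, 0, 1]]
p = [Fraction(1, 2), Fraction(-1, 4), Fraction(3, 4)]

steps = 24
w, big_d = e_method(a, p, steps)
exact = p[0] + p[1] * x + p[2] * x * x

# Identity (11): w^(j) = 2^j * (w^(0) - A * D^(j-1)), checked exactly.
identity_holds = all(
    w[k] == 2 ** steps * (p[k] - sum(a[k][c] * big_d[c] for c in range(3)))
    for k in range(3))
error = abs(big_d[0] - exact)
```

Because the residuals stay bounded, (11) forces the accumulated digits `big_d[0]` to approach the exact value at a rate of one bit per iteration.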

4.1 Polynomial evaluation


We wish to evaluate a degree-n polynomial
$$p_n z^n + p_{n-1} z^{n-1} + \cdots + p_0$$
at the complex point z = x + iy, with $p_k = p^r_k + i p^i_k$. The matrix of the CR-transform,
obtained in Section 3 is
$$A = \begin{pmatrix}
1 & 0 & -x & y & 0 & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & -y & -x & 0 & 0 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & -x & y & 0 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & -y & -x & 0 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
0 & \cdots & 0 & 0 & 0 & 0 & 1 & 0 & -x & y \\
0 & \cdots & 0 & 0 & 0 & 0 & 0 & 1 & -y & -x \\
0 & \cdots & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & \cdots & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{pmatrix}$$
Let us slightly modify the notations w and d of iteration (10), to adapt them to the
complex case. The residual vector $w^{(j)}$ will be denoted
$$w^{(j)} = [w^{(j)}_{0,r}, w^{(j)}_{0,i}, w^{(j)}_{1,r}, w^{(j)}_{1,i}, \cdots, w^{(j)}_{n,r}, w^{(j)}_{n,i}],$$
and its initial value will be given by
$$\begin{cases}
w^{(0)}_{k,r} = p^r_k \\
w^{(0)}_{k,i} = p^i_k
\end{cases}$$
The digit-vector $d^{(j)}$ will be denoted
$$d^{(j)} = [d^{(j)}_{0,r}, d^{(j)}_{0,i}, d^{(j)}_{1,r}, d^{(j)}_{1,i}, \cdots, d^{(j)}_{n,r}, d^{(j)}_{n,i}].$$

Therefore, iteration (10) becomes

• for k = 0, . . . , n − 1,
$$\begin{cases}
w^{(j)}_{k,r} = 2 \left[ w^{(j-1)}_{k,r} - d^{(j-1)}_{k,r} + x d^{(j-1)}_{k+1,r} - y d^{(j-1)}_{k+1,i} \right] \\
w^{(j)}_{k,i} = 2 \left[ w^{(j-1)}_{k,i} - d^{(j-1)}_{k,i} + y d^{(j-1)}_{k+1,r} + x d^{(j-1)}_{k+1,i} \right]
\end{cases} \tag{13}$$

• for k = n,
$$\begin{cases}
w^{(j)}_{n,r} = 2 \left[ w^{(j-1)}_{n,r} - d^{(j-1)}_{n,r} \right] \\
w^{(j)}_{n,i} = 2 \left[ w^{(j-1)}_{n,i} - d^{(j-1)}_{n,i} \right]
\end{cases}$$


Now, let us examine the convergence conditions. The iterations converge to the desired
result if vector $w^{(j)}$ is bounded. Define constants ξ, α and ∆ (with 0 ≤ ∆ < 1) such that
1. |x| + |y| ≤ α;
2. for any k between 0 and n,
$$\begin{cases}
|p^r_k| \le \xi \\
|p^i_k| \le \xi \\
|w^{(j)}_{k,r} - \hat{w}^{(j)}_{k,r}| \le \frac{\Delta}{2} \\
|w^{(j)}_{k,i} - \hat{w}^{(j)}_{k,i}| \le \frac{\Delta}{2}
\end{cases}$$

Since $|d^{(j-1)}_{k,r} - \hat{w}^{(j-1)}_{k,r}| \le 1/2$ and $|d^{(j-1)}_{k,i} - \hat{w}^{(j-1)}_{k,i}| \le 1/2$, from (13) we find
$$|w^{(j)}_{k,r}| \le 2 \left( \frac{1}{2} + \frac{\Delta}{2} + \alpha \right) = 1 + \Delta + 2\alpha. \tag{14}$$
The same bound holds for $|w^{(j)}_{k,i}|$. For this bound to be feasible, we must assure that a
suitable choice of $d^{(j)}_{k,r}$ and $d^{(j)}_{k,i}$ in {−1, 0, 1} is possible. This requires that $|w^{(j)}_{k,r}|$ and $|w^{(j)}_{k,i}|$
should be less than 3/2. This immediately gives the following condition
$$\Delta + 2\alpha \le \frac{1}{2}. \tag{15}$$
Now, let us turn to the initial values. Since $|w^{(0)}_{k,r}|$ and $|w^{(0)}_{k,i}|$ must also be less than 3/2, we
get
$$\xi \le \frac{3}{2}. \tag{16}$$

Consider the following example: we wish to evaluate
$$p(z) = (1 + i) z^3 - (0.5 + 1.25 i) z^2 + z + 1$$
at point
$$z = \frac{1}{100} + \frac{i}{10}.$$
We assume that ∆ = 0 (that is, we use non-redundant residuals). We get:
• initialization:
$$w^{(0)} = [p^r_0, p^i_0, p^r_1, p^i_1, p^r_2, p^i_2, p^r_3, p^i_3]^t = [1, 0, 1, 0, -0.5, -1.25, 1, 1]^t.$$

• Step 1: from $w^{(0)}$ and the selection function, we get
$$d^{(0)} = [1, 0, 1, 0, 0, -1, 1, 1]^t,$$
which gives
$$w^{(1)} = [0.02, 0.2, 0.2, -0.02, -1.18, -0.28, 0, 0]^t.$$

• Step 2: from $w^{(1)}$ and the selection function, we get
$$d^{(1)} = [0, 0, 0, 0, -1, 0, 0, 0]^t,$$
which gives
$$w^{(2)} = [0.04, 0.4, 0.38, -0.24, -0.36, -0.56, 0, 0]^t.$$

• after 20 iterations, the number
$$d^{(0)}_{0,r} . d^{(1)}_{0,r} d^{(2)}_{0,r} \cdots d^{(20)}_{0,r} + i \times d^{(0)}_{0,i} . d^{(1)}_{0,i} d^{(2)}_{0,i} \cdots d^{(20)}_{0,i}$$
is equal to
$$\frac{533789}{524288} + \frac{57727}{524288} i \approx 1.018121719 + 0.110105514 i$$
whereas the exact value of p(z) is
$$p(z) = 1.018121 + 0.110106 i$$
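The worked example can be replayed in exact rational arithmetic. The sketch below implements the recurrences (13) directly; the digit selection (nearest digit in {−1, 0, 1}, ties toward zero) is an assumption of this sketch, chosen because it satisfies |d − w| ≤ 1/2 and yields the digit vectors listed in Steps 1 and 2. The function names are hypothetical, and the later digits it produces need not coincide with those behind the 20-iteration figure quoted above.

```python
from fractions import Fraction

def select(w):
    """Assumed selection: nearest digit in {-1, 0, 1}, ties toward zero."""
    if abs(w) <= Fraction(1, 2):
        return 0
    return 1 if w > 0 else -1

def ce_polynomial(p, x, y, steps):
    """Iterate (13) for p[0] + p[1] z + ... + p[n] z^n, with p[k] = (re, im).

    Returns the residual vectors w^(0), ..., w^(steps) and the accumulated
    digits (D_re, D_im) of the first component.
    """
    n = len(p) - 1
    wr = [Fraction(c[0]) for c in p]
    wi = [Fraction(c[1]) for c in p]
    history = [[v for pair in zip(wr, wi) for v in pair]]
    d_re = d_im = Fraction(0)
    for t in range(steps):
        dr = [select(v) for v in wr] + [0]   # trailing 0 stands for d_{n+1}
        di = [select(v) for v in wi] + [0]
        d_re += Fraction(dr[0], 2 ** t)
        d_im += Fraction(di[0], 2 ** t)
        wr, wi = (
            [2 * (wr[k] - dr[k] + x * dr[k + 1] - y * di[k + 1])
             for k in range(n + 1)],
            [2 * (wi[k] - di[k] + y * dr[k + 1] + x * di[k + 1])
             for k in range(n + 1)],
        )
        history.append([v for pair in zip(wr, wi) for v in pair])
    return history, d_re, d_im

F = Fraction
x, y = F(1, 100), F(1, 10)
p = [(F(1), F(0)), (F(1), F(0)), (F(-1, 2), F(-5, 4)), (F(1), F(1))]
history, d_re, d_im = ce_polynomial(p, x, y, steps=30)
```

Here `history[1]` and `history[2]` are the residual vectors of Steps 1 and 2, and `d_re + i d_im` approaches p(z) by roughly one bit per iteration.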

Exactly as in the real case, even if polynomial p and point z do not satisfy the convergence
constraints, one can easily “transform” them using mere shifts, so that p(z) can be computed
using the E-method. Once ∆ is chosen, and α is defined as 1/4 − ∆/2, this is done as follows:

1. Find the smallest integer k such that $|\Re(z/2^k)| + |\Im(z/2^k)|$ is less than α;
2. Now, p(z) = π(t), where $t = z/2^k$ and the degree-m coefficient of polynomial π is $2^{mk} p_m$. If at least
one of the coefficients of π has the absolute value of its real or imaginary part greater
than ξ = 3/2, then divide π by $2^\ell$, where ℓ is the smallest integer such that $\rho = \pi/2^\ell$
has the absolute value of the real and imaginary parts of its coefficients less than ξ;
3. What we actually compute using the E-method is $\rho(z/2^k)$. This result will then be
multiplied by $2^\ell$ (a simple left-shift) to get p(z).
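The three scaling steps can be sketched as follows. This is a minimal illustration in exact arithmetic, assuming that only non-negative shifts k and ℓ are searched; the names `scale_for_ce_method` and `evaluate` and the sample polynomial are hypothetical.

```python
from fractions import Fraction

def scale_for_ce_method(p, z, delta=Fraction(0)):
    """Shift-only scaling of polynomial p (list of (re, im)) and point z = (x, y).

    Returns (rho, t, ell) with p(z) = 2**ell * rho(t). Only k >= 0 and
    ell >= 0 are searched, which is an assumption of this sketch.
    """
    alpha = Fraction(1, 4) - delta / 2
    xi = Fraction(3, 2)
    x, y = z
    k = 0
    while abs(x) / 2 ** k + abs(y) / 2 ** k >= alpha:       # step 1
        k += 1
    t = (x / 2 ** k, y / 2 ** k)
    pi_coeffs = [(re * 2 ** (m * k), im * 2 ** (m * k))     # step 2: pi
                 for m, (re, im) in enumerate(p)]
    ell = 0
    while any(max(abs(re), abs(im)) / 2 ** ell >= xi for re, im in pi_coeffs):
        ell += 1
    rho = [(re / 2 ** ell, im / 2 ** ell) for re, im in pi_coeffs]
    return rho, t, ell

def evaluate(coeffs, point):
    """Exact Horner evaluation on (re, im) pairs of Fractions."""
    x, y = point
    acc_re = acc_im = Fraction(0)
    for re, im in reversed(coeffs):
        acc_re, acc_im = acc_re * x - acc_im * y + re, acc_re * y + acc_im * x + im
    return acc_re, acc_im

F = Fraction
p = [(F(3), F(-1)), (F(1, 2), F(2)), (F(-5), F(1, 4))]
z = (F(3, 4), F(-1, 2))
rho, t, ell = scale_for_ce_method(p, z)
direct = evaluate(p, z)
scaled = tuple(2 ** ell * v for v in evaluate(rho, t))    # step 3: left shift
```

In this example the scaled point and coefficients satisfy the constraints, and `scaled` equals `direct` exactly, since only powers of two are involved.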

4.2 Rational function evaluation
We now wish to evaluate
$$R(z) = \frac{p_0 + p_1 z + p_2 z^2 + \cdots + p_n z^n}{1 + q_1 z + q_2 z^2 + \cdots + q_n z^n}$$
at the complex point z = x + iy, with $p_k = p^r_k + i p^i_k$ and $q_k = q^r_k + i q^i_k$. The matrix of the
CR-transform obtained in Section 3 is
$$\begin{pmatrix}
1 & 0 & -x & y & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
0 & 1 & -y & -x & 0 & 0 & 0 & 0 & \cdots & 0 & 0 \\
q^r_1 & -q^i_1 & 1 & 0 & -x & y & 0 & 0 & \cdots & 0 & 0 \\
q^i_1 & q^r_1 & 0 & 1 & -y & -x & 0 & 0 & \cdots & 0 & 0 \\
q^r_2 & -q^i_2 & 0 & 0 & 1 & 0 & -x & y & \cdots & 0 & 0 \\
q^i_2 & q^r_2 & 0 & 0 & 0 & 1 & -y & -x & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
q^r_n & -q^i_n & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 1 & 0 \\
q^i_n & q^r_n & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix}$$

Therefore, iteration (10) becomes

• for k = 0,
$$\begin{cases}
w^{(j)}_{0,r} = 2 \left[ w^{(j-1)}_{0,r} - d^{(j-1)}_{0,r} + x d^{(j-1)}_{1,r} - y d^{(j-1)}_{1,i} \right] \\
w^{(j)}_{0,i} = 2 \left[ w^{(j-1)}_{0,i} - d^{(j-1)}_{0,i} + y d^{(j-1)}_{1,r} + x d^{(j-1)}_{1,i} \right]
\end{cases}$$

• for k = 1, . . . , n − 1,
$$\begin{cases}
w^{(j)}_{k,r} = 2 \left[ w^{(j-1)}_{k,r} - d^{(j-1)}_{k,r} - q^r_k d^{(j-1)}_{0,r} + q^i_k d^{(j-1)}_{0,i} + x d^{(j-1)}_{k+1,r} - y d^{(j-1)}_{k+1,i} \right] \\
w^{(j)}_{k,i} = 2 \left[ w^{(j-1)}_{k,i} - d^{(j-1)}_{k,i} - q^i_k d^{(j-1)}_{0,r} - q^r_k d^{(j-1)}_{0,i} + y d^{(j-1)}_{k+1,r} + x d^{(j-1)}_{k+1,i} \right]
\end{cases} \tag{17}$$

• for k = n,
$$\begin{cases}
w^{(j)}_{n,r} = 2 \left[ w^{(j-1)}_{n,r} - d^{(j-1)}_{n,r} - q^r_n d^{(j-1)}_{0,r} + q^i_n d^{(j-1)}_{0,i} \right] \\
w^{(j)}_{n,i} = 2 \left[ w^{(j-1)}_{n,i} - d^{(j-1)}_{n,i} - q^i_n d^{(j-1)}_{0,r} - q^r_n d^{(j-1)}_{0,i} \right]
\end{cases}$$

Similarly to the polynomial case, define constants ξ, α, and ∆ (with 0 ≤ ∆ < 1) so that
$$\begin{cases}
\forall k, \; |p^r_k| \le \xi \\
\forall k, \; |p^i_k| \le \xi \\
\forall k, \; |x| + |y| + |q^r_k| + |q^i_k| \le \alpha \\
\forall k, \; |w^{(j)}_{k,r} - \hat{w}^{(j)}_{k,r}| \le \frac{\Delta}{2} \\
\forall k, \; |w^{(j)}_{k,i} - \hat{w}^{(j)}_{k,i}| \le \frac{\Delta}{2}
\end{cases} \tag{18}$$

As in the polynomial case, we find that
$$|w^{(j)}_{k,r}|, \; |w^{(j)}_{k,i}| \le 1 + \Delta + 2\alpha.$$

Again, for this bound to be valid, we must be sure that it is possible to find a suitable choice
of $d^{(j)}_{k,r}$ and $d^{(j)}_{k,i}$. This requires that $|w^{(j)}_{k,r}|$ and $|w^{(j)}_{k,i}|$ should be less than 3/2, which gives the
conditions
$$\begin{cases}
\Delta + 2\alpha \le 1/2 \\
\xi \le 3/2.
\end{cases} \tag{19}$$

Unfortunately, as in the real case, there is no simple rule of transformation that allows one to
evaluate any rational function. In the real case, this problem is discussed in [3, 13].
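For a rational function whose coefficients already satisfy (18) and (19), the recurrences (17) can be simulated in the same way as in the polynomial case. The sketch below assumes ∆ = 0, a nearest-digit selection with ties toward zero, and a hypothetical degree-2 example chosen so that |x| + |y| + |q_k^r| + |q_k^i| ≤ 1/4; none of these choices comes from the report.

```python
from fractions import Fraction

def select(w):
    """Assumed selection: nearest digit in {-1, 0, 1}, ties toward zero."""
    if abs(w) <= Fraction(1, 2):
        return 0
    return 1 if w > 0 else -1

def ce_rational(p, q, x, y, steps):
    """Iterate (17) for R(z) = P(z)/Q(z); p[k], q[k] are (re, im) pairs.

    q[0] is ignored (the degree-0 denominator coefficient is 1).
    Returns the accumulated digits (D_re, D_im) of the first component.
    """
    n = len(p) - 1
    qr = [Fraction(0)] + [Fraction(c[0]) for c in q[1:]]  # no q term in row 0
    qi = [Fraction(0)] + [Fraction(c[1]) for c in q[1:]]
    wr = [Fraction(c[0]) for c in p]
    wi = [Fraction(c[1]) for c in p]
    d_re = d_im = Fraction(0)
    for t in range(steps):
        dr = [select(v) for v in wr] + [0]   # trailing 0 stands for d_{n+1}
        di = [select(v) for v in wi] + [0]
        d_re += Fraction(dr[0], 2 ** t)
        d_im += Fraction(di[0], 2 ** t)
        wr, wi = (
            [2 * (wr[k] - dr[k] - qr[k] * dr[0] + qi[k] * di[0]
                  + x * dr[k + 1] - y * di[k + 1]) for k in range(n + 1)],
            [2 * (wi[k] - di[k] - qi[k] * dr[0] - qr[k] * di[0]
                  + y * dr[k + 1] + x * di[k + 1]) for k in range(n + 1)],
        )
    return d_re, d_im

F = Fraction
x, y = F(1, 16), F(-1, 16)
p = [(F(1), F(1, 2)), (F(-3, 4), F(1)), (F(1, 2), F(-1, 4))]
q = [(F(1), F(0)), (F(1, 16), F(1, 32)), (F(0), F(-1, 16))]
steps = 30
d_re, d_im = ce_rational(p, q, x, y, steps)

# Reference value computed with ordinary complex floating-point division.
z = complex(float(x), float(y))
pc = [complex(float(a), float(b)) for a, b in p]
qc = [complex(float(a), float(b)) for a, b in q]
reference = (pc[0] + pc[1] * z + pc[2] * z * z) / (qc[0] + qc[1] * z + qc[2] * z * z)
```

No division is performed inside `ce_rational`; the quotient emerges digit by digit from the first pair of residuals, which is the point made in the Introduction about avoiding explicit complex division.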

4.3 Approximations with real coefficients


Many useful math functions (e.g., the elementary functions) have polynomial or rational
approximations whose coefficients are real numbers. A straightforward example is the following
Padé approximation for the exponential function:
$$e^z \approx \frac{1 + \frac{1}{2} z + \frac{1}{9} z^2 + \frac{1}{72} z^3 + \frac{1}{1008} z^4 + \frac{1}{30240} z^5}{1 - \frac{1}{2} z + \frac{1}{9} z^2 - \frac{1}{72} z^3 + \frac{1}{1008} z^4 - \frac{1}{30240} z^5} \tag{20}$$
Having real coefficients would not simplify a polynomial evaluation much, since the
coefficients of the polynomial only appear in the initialization of the residual vector w. And
yet, this would significantly simplify a rational function evaluation, since recurrence (17)
becomes simpler if the terms $q^i_k$ are equal to zero.
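As a quick numerical look at approximation (20), the sketch below evaluates it with ordinary complex floating-point arithmetic (not with the E-method) and compares it with `cmath.exp` at the four corners of the square max(|Re z|, |Im z|) ≤ 1/2. The helper names are assumptions, and the sketch measures the error at those sample points only; it does not establish a bound over the whole domain.

```python
import cmath

# Coefficients of the numerator of (20); the denominator uses alternating signs.
PADE_EXP = [1.0, 1 / 2, 1 / 9, 1 / 72, 1 / 1008, 1 / 30240]

def horner(coeffs, z):
    """Evaluate coeffs[0] + coeffs[1] z + ... at a complex point."""
    acc = 0j
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc

def pade_exp(z):
    """Evaluate the (5/5) Pade approximant (20) of exp at a complex point."""
    numerator = horner(PADE_EXP, z)
    denominator = horner([c * (-1) ** k for k, c in enumerate(PADE_EXP)], z)
    return numerator / denominator

corners = [complex(a, b) for a in (-0.5, 0.5) for b in (-0.5, 0.5)]
worst = max(abs(pade_exp(z) - cmath.exp(z)) for z in corners)
```

Since all coefficients are real, the lists carry no imaginary parts; in recurrence (17) this corresponds to dropping the terms multiplied by $q^i_k$.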

4.4 Implementation Aspects


In this section we discuss implementation aspects of the complex E-method in general terms.
The main difference from implementation of real domain E-method is that the number of
non-zero off-diagonal elements doubles: for the polynomial case to two, and for the rational
case to four elements. This has two consequences. First, the bounds on the elements are
smaller by a factor of two, and second, the cycle time is increased as explained later in this
section. The corresponding implementations considered for the real domain E-method are
in [2, 3, 7].
A general scheme for evaluation of complex polynomials is shown in Figure 1 for n = 3
and the corresponding elementary unit (PEU) is illustrated in Figure 2. A bit-parallel bus
transmits x and y values in a broadcast mode, while the real and imaginary coefficients pr
and pi are loaded in separate cycles. Note that the initialization cycles could be shorter
than the iteration cycles.

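Figure 1: Overall scheme for evaluating complex polynomial of degree n = 3. (Diagram: a bus
carrying x, y, $p^r$, $p^i$ feeds the units PEU0r, PEU0i, PEU1r, PEU1i, PEU2r, PEU2i, PEU3r, PEU3i,
whose outputs are $s^r_0$, $s^i_0$, $s^r_1$, $s^i_1$, $s^r_2$, $s^i_2$, $s^r_3$, $s^i_3$.)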

A block diagram of an Elementary Unit for polynomial evaluation (PEU) is shown in
Figure 2.

Figure 2: Block diagram of Elementary Unit for polynomials (EU0). (Diagram: inputs $s^r_2$, x, $s^i_2$, y,
$p^r_0$, 0 and the residual in carry-save form ws, wc; blocks REG, MG, MUX, a [4:2] ADDER and SEL;
output digit $s^r_0$.)

The modules in Figure 2 are:


• Registers (4)
• Multiple generators MG (2), producing {−1, 0, 1} × x and {−1, 0, 1} × y, with buffers
• Multiplexer MUX for initializing the residual
• A [4:2] adder
• Output digit selection SEL (a table or a gate network)
The cycle time, in terms of a full adder (complex gate) delay t, is estimated as
$$T_{PEU} = t_{BUFF} + t_{MG} + t_{SEL} + t_{[4:2]} + t_{REG} \approx (0.4 + 0.3 + 1 + 1.3 + 0.9)t = 3.9t \tag{21}$$
The cost, again in terms of area of a full adder A, is estimated as
$$A_{PEU}(m) = A_{SEL} + A_{BUFF} + A_{MG} + A_{[4:2]} + A_{REG} \tag{22}$$
$$\approx [5 + 2 \times 0.4 + (m + 2)(2 \times 0.45 + 2.3 + 4 \times 0.6)]A \approx 6 + 6mA$$

A general scheme for evaluation of complex rational functions is shown in Figure 3
for n = 3 and the corresponding elementary unit (REU) is illustrated in Figure 4. As
mentioned above, a bit-parallel bus transmits x and y values in a broadcast mode, while the
real and imaginary coefficients $p^r$, $p^i$, $q^r$ and $q^i$ are loaded in separate cycles. Note that the
initialization cycles could be shorter than the iteration cycles.

Figure 3: Overall scheme for evaluating complex rational function of degree n = 3. (Diagram: a
bus carrying x, y, $p^r$, $p^i$, $q^r$, $q^i$ feeds the units PEU0r, PEU0i, REU1r, REU1i, REU2r, REU2i,
PEU3r, PEU3i, whose outputs are $s^r_0$, $s^i_0$, $s^r_1$, $s^i_1$, $s^r_2$, $s^i_2$, $s^r_3$, $s^i_3$.)

Figure 4: Block diagram of Elementary Unit for rational function evaluation (REU0). (Diagram:
inputs $s^r_2$, x, $s^i_2$, y, $s^r_0$, $q^r_1$, $s^i_0$, $q^i_1$, $p^r_1$, 0 and the residual in carry-save form ws, wc; blocks
REG, MG, MUX, a [6:2] ADDER and SEL; output digit $s^r_1$.)

A block diagram of an Elementary Unit for rational function evaluation (REU ) is shown
in Figure 4.
The modules in Figure 4 are:
• Registers (6)
• Multiple generators MG (4), producing {−1, 0, 1} × x etc. with buffers
• Multiplexer MUX for initializing the residual
• A [6:2] adder
• Output digit selection SEL (a table or a gate network)

The cycle time, in terms of a full adder (complex gate) delay t, is estimated as
$$T_{REU} = t_{BUFF} + t_{MG} + t_{SEL} + t_{[6:2]} + t_{REG} \approx (0.4 + 0.3 + 1 + 2.3 + 0.9)t = 4.9t \tag{23}$$
The cost, again in terms of area of a full adder A, is estimated as
$$A_{REU}(m) = A_{SEL} + A_{BUFF} + A_{MG} + A_{[6:2]} + A_{REG} \tag{24}$$
$$\approx [5 + 2 \times 0.4 + (m + 2)(4 \times 0.5 + 3.3 + 6 \times 0.6)]A \approx 20 + 7mA$$

About Comparisons with Real E-Method


For a “general” function, there is no way of comparing the real and complex methods,
since they do not apply to the same (real or complex) domain. However, for the usual
elementary functions, there are schoolbook transformation rules that make it possible
to rewrite a complex-valued function in terms of real-valued functions. So, let us
compare both methods with such a function. Assume we wish to approximate exp(z), where
$\max(|\Re(z)|, |\Im(z)|) \le 1/2$. The Padé approximation of Eq. (20) has a degree-5 numerator
and a degree-5 denominator, with real coefficients. The error of this approximation in the
considered domain is less than $2.2 \times 10^{-12}$. If one wishes to reach a similar accuracy using
the real E-method and the formula
$$e^{x+iy} = e^x (\cos y + i \sin y),$$
then
• with polynomial approximations, one would need a degree-9 polynomial for the exp
function, a degree-8 polynomial for the cos and a degree-9 polynomial for the sin;
• with rational approximations, a (4/4)-fraction for the exp function, a (4/4)-fraction for
the cos and a (5/5)-fraction for the sin, where an (n/m)-fraction is a rational fraction
whose numerator has degree n, and whose denominator has degree m.

4.5 Potential Applications
For computing complex square roots with moderate (e.g., single) precision, our method
can be of interest. In [6] we adapted the real digit-recurrence square-root iteration to the
complex case. The basic iteration is simple, yet there is a prescaling initial step that requires
a look-up in a rather big table and a small multiplication. If the required precision is large
enough, that method is of much interest (the initial step can then be neglected). If this is
not the case, it is much simpler to use a Padé approximation of the square root and the
complex E-method. For instance, for computing $\sqrt{1 + z}$ with $\max(|\Re(z)|, |\Im(z)|) \le 1/2$ (this
domain is large enough, so that reduction to it is straightforward), one can use the
Padé approximation
$$\sqrt{1 + z} \approx \frac{1 + \frac{15}{4} z + \frac{45}{8} z^2 + \frac{275}{64} z^3 + \frac{225}{128} z^4 + \frac{189}{512} z^5 + \frac{35}{1024} z^6 + \frac{15}{16384} z^7}{1 + \frac{13}{4} z + \frac{33}{8} z^2 + \frac{165}{64} z^3 + \frac{105}{128} z^4 + \frac{63}{512} z^5 + \frac{7}{1024} z^6 + \frac{1}{16384} z^7}$$
It has real coefficients only, and the error is less than $9.3 \times 10^{-10}$.

4.6 Summary
We have presented a method for solving diagonally-dominant linear systems in complex
domain by a digit-recurrence algorithm. This is a generalization of the real-domain E-
method. The method is particularly area/cost-effective for solving systems with sparse
coefficient matrices. Specifically, the method is suitable for evaluating complex polynomials,
integer powers of a complex number, and complex rational functions. The latency is roughly
m cycles for m bits of precision and independent of the order of the system. This does not
take into account potentially needed scaling steps. The cycle time is independent of m.
We discussed the transform from the complex to the real domain, the iteration, and the
convergence conditions. Applications of the method to polynomials, rational functions, and
division are described. Implementation is given at a high level with estimates of the cost and latency.
A detailed design and its hardware implementation with FPGAs are considered.

References
[1] K. Benmahammed, Evaluation of Complex Polynomials in One and Two Variables.
Multidimensional Systems and Signal Processing, 5, 245-261, 1994.
[2] M.D. Ercegovac. A general method for evaluation of functions and computation in a
digital computer. PhD thesis, Dept. of Computer Science, University of Illinois, Urbana-
Champaign, 1975.
[3] M.D. Ercegovac. A general hardware-oriented method for evaluation of functions and
computations in a digital computer. IEEE Trans. Comp., C-26(7):667–680, 1977.
[4] M.D. Ercegovac and J.-M. Muller. Complex Division with Prescaling of Operands. IEEE
International Conference on Application-Specific Systems, Architectures and Proces-
sors, pp. 293-303, 2003.

[5] M.D. Ercegovac and J.-M. Muller, Design of a complex divider. Proc. SPIE on Advanced
Signal Processing Algorithms, Architectures, and Implementations XII, pp. 51-59, 2004.
[6] M.D. Ercegovac and J.-M. Muller. Complex Square Root with Operand Prescal-
ing. IEEE International Conference on Application-Specific Systems, Architectures and
Processors, pp. 293-303, 2004.
[7] M.D. Ercegovac and T. Lang. Digital Arithmetic, Morgan Kaufmann Publishers - an
Imprint of Elsevier Science, San Francisco, 2004.
[8] R. McIlhenny and M.D. Ercegovac. On-Line Algorithms for Complex Number Arith-
metic. Proc. 32nd Asilomar Conference on Signals, Systems and Computers, pages
172-176, 1998.
[9] R. McIlhenny, Complex Number On-line Arithmetic for Reconfigurable Hardware: Al-
gorithms, Implementations, and Applications, PhD Dissertation, UCLA Computer Sci-
ence Department, 2002.
[10] R. McIlhenny and M.D. Ercegovac, On the Design of an On-Line Complex FIR Filter.
Proc. 38th Asilomar Conference on Signals, Systems and Computers, pp. 478-482, 2004.
[11] R. McIlhenny and M. D. Ercegovac, On the Design of an On-line Complex Matrix
Inversion Unit. Proc. 39th Asilomar Conference on Signals, Systems and Computers, 5
pps., 2005.
[12] R. McIlhenny and M. D. Ercegovac, On the Design of an On-line Complex Householder
Transform. Proc. 40th Asilomar Conference on Signals, Systems and Computers, 5 pps.,
2006.
[13] N. Brisebarre and J.-M. Muller. Functions approximable by E-fractions. 38th Asilomar
Conference on Signals, Systems and Computers, Pacific Grove, California, Nov. 2004.
[14] A.H. Nuttall, Efficient Evaluation of Polynomials and Exponentials of Polynomials for
Equispaced Arguments, IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-35,
pp. 1486-1487, 1987.
[15] F. W. J. Olver, Error Bounds for Polynomial Evaluation and Complex Arithmetic,
IMA Journal of Numerical Analysis 6, 373-379, 1986.

[16] J.H. Reif, Approximate Complex Polynomial Evaluation in Near Constant Work Per
Point, STOC 97, pp. 30-39, 1997.
