

IEEE TRANSACTIONS ON COMPUTERS, VOL. 39, NO. 8, AUGUST 1990

Square Rooting Algorithms for Integer and Floating-Point Numbers

Reza Hashemian
Abstract—A new algorithm is developed for evaluating the square root of integers and real numbers. The procedure consists of two parts: one to obtain a close estimate of the square root and the other to modify the initial value, iteratively, until a precise root is evaluated. The major effort, in this development, has been concentrated on two objectives: high speed and no division operation other than division by 2. The first objective is achieved through a simple two-step procedure for getting the first estimate, and then modifying it by employing a fast converging iteration technique. The second objective is also fulfilled through applying a bit-shift operation rather than a division operation. The algorithm is simulated for both integer and real numbers, and the results are compared to two methods being widely used. The results (tabulated) show considerable improvement in speed compared to these other two methods.

Index Terms—Initial value, integer number, iteration, no division, real number, square root.

Manuscript received November 21, 1989; revised March 8, 1990.
The author is with the Department of Electrical Engineering, Northern Illinois University, DeKalb, IL 60115.
IEEE Log Number 9036138.

I. INTRODUCTION

COMPUTATION of the square root is a basic and essential operation in many areas of science and engineering. It may be rated, in importance, next to the four basic arithmetic operations: addition, subtraction, multiplication, and division. Some typical applications of the square root are in: complex variables, trigonometry, solution of quadratic equations, two-dimensional modeling, error computation, statistics, graphics, image processing, and many others. Computation of gradients in the edge detection process or geometrical distances between pixels in an image are obvious examples in image processing that require frequent use of square rooting. Also in signal processing applications such as adaptive filtering, the square root is predominantly used [1]. In fact, in some of the latest developments in digital signal processors the square root is included in the normal instruction set instead of being a macroinstruction [2].

Probably the most widely used method of evaluating the square root of a number is the Newton-Raphson technique. This technique is a simple and effective iteration procedure which improves the first approximation of the root progressively. However, the method does not provide any scheme for getting a close first estimate; besides, it requires a division operation, which by itself is another recursive process in computer arithmetic, and many DSP units simply do not have a division instruction [2]. There are several developments in recent years on relatively high-speed square rooting techniques [3]-[7]. The square root algorithm presented by Prado and Alcantara [5] is a progressive bit-by-bit recursive technique. In their method, the speed is almost proportional to the binary length of the root, and, therefore, it slows down for large data size. Moreover, the method is restricted to fixed-point arithmetic.

An algorithm for evaluating the square root of any given number is presented in this paper. The algorithm is partitioned into three sections: in the first section, a simple procedure is developed for evaluation of the first estimate of the square root within an average accuracy of 1.7% (or ±0.85%). In the second section, a high-speed iterative technique is presented for computation of square roots of integer numbers, and the third section deals with the extension of the algorithm to floating-point arithmetic. As shown in the subsequent sections, the quadratic nature of the method makes it converge fast. Furthermore, it avoids the division operation, except division by 2, which is actually performed by shifting the operand.
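For contrast with the division-free iteration developed in the later sections, the following is a minimal C sketch of the Newton-Raphson update discussed above (the function name, arguments, and stopping rule are illustrative assumptions, not taken from the paper); note that every step performs a full division:

#include <math.h>

/* Conventional Newton-Raphson square root: x_{k+1} = (x_k + N/x_k) / 2.
   Each step needs a division by the current iterate, which is exactly what
   the method developed in this paper avoids. */
double sqrt_newton(double N, double x0, double tol)
{
    double x = x0;                        /* the caller must supply a first guess */
    for (;;) {
        double next = 0.5 * (x + N / x);  /* one NR step: requires a division     */
        if (fabs(next - x) <= tol)
            return next;
        x = next;
    }
}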
II. PRELIMINARIES

Before we discuss the actual procedure for evaluating the square root of an operand, it is important to develop some basic criteria and the accuracy involved in the computation of the square root. Lemmas 1-3, below, are presented to serve this purpose.

Let Y be an m-bit integer (an integer whose MSB is at position m), and let X = Y^2. It is known that X is either a 2m- or a (2m-1)-bit integer. Conversely, if X is a positive 2m-bit integer then Y is an m-bit integer.

Lemma 1: Let X be a 2m-bit integer and let Y = X^{1/2}. Replacing the m+1 least significant bits (LSBs) of X by 0's causes Y to be reduced by e, where

0 <= e < 2^{1/2}.   (1)

Proof: The lower limit holds if the m+1 LSBs of X are all zeros. For the upper limit, consider the worst case, where the m+1 LSBs of X are all 1's. Let X_0 be equal to X after replacing all m+1 LSBs of X by 0's. This can be mathematically represented by

X_0 = X - 2^{m+1} + 1.   (2)

Now, if Y and Y_0 are the square roots of X and X_0, respectively, then we must have

Y_0 = Y - e.   (3)


Combining (2) and (3) results in

e = (2^{m+1} - 1)/(2Y - e).   (4)

But Y is an m-bit integer, and the smallest value Y can have is

Y = 2^{(2m-1)/2} + e.   (5)

After substituting Y from (5) into (4) we obtain

e = (2^{m+1} - 1)/(2^{(2m+1)/2} + e) < 2^{1/2}   (6)

which agrees with the upper limit given in (1).

Lemma 1 provides an important criterion for efficient evaluation of square roots within a prescribed precision. It may loosely be restated as follows: by ignoring the bit values in the lower half plus one digits, one can still compute the square root of an integer within an accuracy up to the last digit (LSB) of the root.

Lemma 2 is a generalized version of Lemma 1, and could be used mainly for noninteger (fraction or floating-point) numbers. By applying Lemma 2 one may, in fact, remove the fixed value assigned to the upper limit of e, i.e., 2^{1/2}, and make it more adjustable and appropriate for computation of square roots with a selective accuracy.

Lemma 2: Let X be an n-bit integer. Replacing the k least significant bits of X by zeros will reduce the square root of X by e, where

0 <= e < 2^{k-(n+1)/2}.   (7)

Proof: We first test the special case of Lemma 1. If we assume n = 2m and k = m+1, the upper limit in (7) becomes 2^{1/2}, as denoted by (1). In general, if we replace the term "m+1" in (4) by k we obtain

e = (2^k - 1)/(2Y - e)   (8)

where Y is the square root of X and, similar to (5), the smallest value Y can get is

Y = 2^{(n-1)/2} + e.   (9)

After substituting Y from (9) into (8) we get

e = (2^k - 1)/(2^{(n+1)/2} + e) < 2^{k-(n+1)/2}.   (10)

It follows from Lemma 1 (as well as Lemma 2) that in order to evaluate the square root of an integer X with 2m digits, and for a desired accuracy up to the LSB, we can practically ignore the m+1 least significant bits of X. And for integers with an odd number of digits one can always apply a left shift to make the number of bits even, and later adjust the square root so obtained by multiplying it with a constant, as will be discussed later. As a consequence of this discussion, and in order to facilitate our analysis, we ignore the m+1 LSB's and replace them by zeros for the rest of the discussion, unless otherwise stated.
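As a quick illustration of the lemma, one could clear the k low-order bits of a sample operand and compare the two roots against the bound in (7); the following throwaway C check does exactly that (the sample values and the use of the double-precision library sqrt() are my own choices, not the paper's):

#include <assert.h>
#include <stdint.h>
#include <math.h>

/* Empirical check of Lemma 2: for an n-bit X, clearing the k LSBs lowers
   the square root by e with 0 <= e < 2^(k-(n+1)/2). */
int main(void)
{
    uint64_t X = 0xFFFFFFFFFull;                 /* a 36-bit worst-case operand */
    int n = 36, k = 19;                          /* clear the lower 19 bits     */

    uint64_t X0 = X & ~((1ull << k) - 1);        /* k LSBs replaced by zeros    */
    double e = sqrt((double)X) - sqrt((double)X0);

    assert(e >= 0.0 && e < pow(2.0, k - (n + 1) / 2.0));
    return 0;
}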

III. INITIAL ESTIMATE OF THE SQUARE ROOT

A simple two-step procedure is developed in this section for the evaluation of the first close estimate of the square root. For a given 2m-bit integer X let Y = X^{1/2}. It follows that Y is an m-bit integer, and

Y + y = 2^m   (11)

where y represents the two's complement of Y. Next let

Z = K(X + 2^{2m}) = K(Y^2 + 2^{2m})   (12)

where K = 2^{-(m+1)}, representing m+1 shifts to the right. Combining (11) and (12) it follows that

Z = K[Y^2 + (Y + y)^2] = K[2Y(Y + y) + y^2]   (13)

which results in

Z = Y + Ky^2   (14)

or, alternatively, Y = Z - Ky^2. It is evident from (12) that Z is also an m-bit integer, and therefore we can write

Z + z = 2^m   (15)

where z is the two's complement of Z.

As shown in Lemma 3, the integer Z is a close estimate for the square root of X, and is evaluated by a two-step procedure, as described in Algorithm 1.

Lemma 3: For an integer X with 2m bits, the integer Z given in (12) is a close approximation for the square root of X. More precisely, Z is always greater than Y (the exact square root of X) by about 1.7%, on average, and it may, at most, reach a ceiling of 6% greater than Y.

Proof: First we prove that

X = Y^2 = Z^2 - z^2.   (16)

From (12) and (15) we derive

2Z = X/2^m + 2^m = X/2^m + Z + z.   (17)

Simplifying the formula and multiplying both sides by 2^m results in

X = 2^m(Z - z) = Z^2 - z^2.   (18)

It is evident from (18) that, first of all, Z >= Y, and second, the difference between Z and Y becomes the largest when Z is at its lowest value (or z at its highest value), in which case the bit maps of Z and z are represented as

Z = 1 1 0 0 0 ... 0, and z = 0 1 0 0 0 ... 0.

This simply means that z = Z/3, and after substituting the values into (18) we get

Y^2 = Z^2 - Z^2/9 = (8/9)Z^2

which results in

Z = (3/8^{1/2}) Y = 1.06Y.   (19)

Equation (19) gives the worst case values for Z, where the difference between Z and Y is the largest and equal to 6%, as indicated. At the other extreme, Z is equal to the exact square root of X. This is true when z is at its minimum (which is 1 and practically negligible). However, due to the quadratic nature of both the error function and X, in terms of Y, the average increase in Z is reduced below the normal 6/2 = 3%. Experimental results show that this average is about 1.7%; and in fact, a mean accuracy of ±0.85% can be claimed if we multiply Z by a proper constant.

Algorithm 1: For a given integer X with 2m number of bits, one can always obtain the integer Z, the initial estimate of the square root of X, through the following two-step procedure:
a) apply m+1 right shifts to X, and
b) insert a 1 bit to the left of the most significant bit (MSB) of the result of step a).
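For illustration, a minimal C sketch of this two-step estimate, under my reading of Algorithm 1 and of (12), is given below (the function name and the fixed 32-bit operand width are assumptions, not the paper's):

#include <stdint.h>

/* Sketch of Algorithm 1: for a 2m-bit operand X, the estimate is
   Z = (X >> (m+1)) with a 1 inserted just left of the shifted result's MSB,
   which is the same as Z = (X + 2^(2m)) >> (m+1), i.e., Eq. (12). */
static uint32_t initial_estimate(uint32_t x, int m)   /* x has 2m bits */
{
    return (x >> (m + 1)) | (1u << (m - 1));
}

For example, X = 144 (so 2m = 8 and m = 4) gives Z = (144 >> 5) | 8 = 12, which here happens to be the exact root; in general Lemma 3 only guarantees an overestimate of at most about 6%.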
In the next section, we are going to develop a new iteration technique for the accurate evaluation of the square root of X, through modification of Z.

IV. ITERATION PROCEDURE

We showed in the previous section that we were able to evaluate the first estimate of the square root of X with a mean accuracy better than 1% (±0.85%). In this section, we try to implement a fast converging iteration procedure to improve the accuracy even further. As we will discuss shortly, the method is not only fast, but it also avoids the division operation as well. This is particularly interesting in applications such as digital signal processing and image processing, where division is a relatively slow process.

For the iteration procedure we start with (14), but to make it more uniform we utilize the equality Y - Z = z - y and modify (14) as

y = z + Ky^2.   (20)

Next, we assume the initial estimate of y as y_0 = z, and substitute y_0 for y in (20) to obtain the next estimate

y_1 = z + Ky_0^2.   (21)

Similarly, we can update y_1 and get

y_2 = z + Ky_1^2   (22)

and likewise, for the ith iteration step we have

y_i = z + Ky_{i-1}^2.   (23)

It is shown (see the Appendix) that the procedure always converges to y (or Y, the precise square root of X), and for the error function defined as

e_0 = (Z - Y)/Y and e_i = (Y_i - Y)/Y   (24)

we find

e_i = Ke_{i-1}(y + y_{i-1}).   (25)

By simple calculations it can be easily derived (see the Appendix) that y_{i-1}/y < 1, and also Y/y > 2.4. This clearly shows that Y_i converges to Y fairly quickly, and, as the experimental results indicate, for high density operands (32 bit) the number of iterations required to reach a precision up to the final bit (LSB) is around four.

Based on the foregoing discussion, Algorithm 2 presents a step-by-step procedure for evaluating the square root of an integer X with accuracy up to the least significant digit.

Algorithm 2: For a given integer X,
1) Find the number of (effective) bits of X, and if this number is odd apply a left shift to X to make the number of bits even; let this number be 2m.
2) Find the two's complement z of Z, the initial estimated square root of X, obtained by applying Algorithm 1.
3) Assume y_0 = z, and set i = 1.
4) Evaluate y_i using (23),

y_i = z + Ky_{i-1}^2   (23)

where K = 2^{-(m+1)} is a constant which represents m+1 shifts to the right.
5) Find d_i = y_i - y_{i-1}. If d_i is greater than zero, then let i = i+1 and go to step 4; else continue.
6) Evaluate the two's complement of y_i to get Y_i. If the number of bits of X was originally odd then multiply Y_i by the constant 0.7071067. The process is now terminated and Y = Y_i is the square root of X.
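Putting the pieces together, a compact C sketch of Algorithm 2 might read as follows (a sketch only: the function name, the 32-bit operand width, and the rounding applied to the final multiplication by 0.7071067 are my assumptions; consistent with the paper's stated accuracy, the result may differ from the exact integer root by about one unit in the last place):

#include <stdint.h>

/* Sketch of Algorithm 2: divisionless integer square root.
   y and z are two's complements with respect to 2^m (y = 2^m - Y, z = 2^m - Z),
   and the update y_i = z + K*y_{i-1}^2 of Eq. (23) uses only a multiply and
   an (m+1)-bit right shift. */
uint32_t sqrt_int(uint32_t x)
{
    if (x < 2)
        return x;

    int bits = 0;
    for (uint32_t t = x; t; t >>= 1)
        bits++;                              /* number of effective bits      */

    int odd = bits & 1;
    if (odd) {
        x <<= 1;                             /* step 1: make bit count even   */
        bits++;
    }
    int m = bits / 2;

    uint32_t Z = (x >> (m + 1)) | (1u << (m - 1));   /* Algorithm 1           */
    uint32_t z = (1u << m) - Z;              /* step 2: two's complement of Z */

    uint32_t y = z, prev;                    /* step 3: y_0 = z               */
    do {                                     /* step 4: Eq. (23)              */
        prev = y;
        y = z + (uint32_t)(((uint64_t)prev * prev) >> (m + 1));
    } while (y > prev);                      /* step 5: loop while d_i > 0    */

    uint32_t Y = (1u << m) - y;              /* step 6: back from complement  */
    if (odd)
        Y = (uint32_t)(Y * 0.7071067 + 0.5); /* compensate for the left shift */
    return Y;
}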

V. SQUARE ROOTS OF REAL NUMBERS

In this section, we consider X being a floating-point number with its fraction normalized for radix 2. This is achieved by shifting the fraction to the left (or right) until the bit to the immediate right of the radix point is the most significant 1 bit, and decreasing (or increasing) the exponent accordingly. It is important to note that if X is normalized so is Y, the square root of X. This is evident from the fact that the fraction of X, X_f, has a value in the range 1/2 <= X_f < 1, and if Y_f is the fraction of Y then 2^{1/2}/2 <= Y_f < 1. In order to evaluate the square root of a real number in normalized format we have to evaluate both the fraction as well as the exponent part of the square root. But the exponent part is simply evaluated by dividing the exponent of X by 2. More precisely, if X_e is even then Y_e = X_e/2, where X_e and Y_e are the exponents of X and Y, respectively. If, however, X_e is odd then Y_e is obtained as Y_e = (X_e + 1)/2. Now, to compensate for the extra half exponent, added in the latter case, we clearly need to modify the fraction part, and this is performed by multiplying Y_f by the constant 0.7071067. Note that the new modified Y_f still remains normalized.

Next we present an algorithmic procedure to evaluate Y_f, the fraction of Y, for a given X_f. In order to simplify the analysis, we choose to drop the subscripts and assume X, Y, Z, ... to represent X_f, Y_f, Z_f, ..., respectively.

First we show that

Z = (1.0 + X)/2.0   (26)

is a close initial estimate of the square root of X [this is similar to the integer case, as denoted by (12)]. To prove it, first note that Z is also a normalized real number, and z = 1.0 - Z is the two's complement of Z. Second, if we substitute in (16) we get

X = Z^2 - (1.0 - Z)^2 = 2Z - 1.0   (27)

or simply

Z = (1.0 + X)/2.0 and z = (1.0 - X)/2.0.   (28)

In order to evaluate a more precise square root we can now take Z as a seed value and start the iteration. However, contrary to the conventional iterations (such as Newton-Raphson), the iteration procedure developed here does not employ any division operation other than division by 2, and hence is more appropriate for DSP and similar applications. Once again, we start from (23) for the iteration procedure, and in order to make it suitable for floating-point arithmetic we change it accordingly to obtain (29):

y_i = z + y_{i-1}^2/2.0.   (29)

In using (29) we initially assign y_0 = z(1.0 + z/2.0) for y_{i-1}. It is important, at this point, to realize that any division by 2 is actually performed by shifting the (bits in the) operand one position to the right. Algorithm 3 provides a step-by-step procedure to evaluate the square root of a real number X for a given predefined precision.

Algorithm 3: First, normalize X (as described earlier) such that its fraction is between 0.5 and 1.0, i.e., 1/2 <= X < 1. Then:
1) Let n be the exponent of X, that is, the number of significant bits to the left of the radix point in the original X. If n is even, assign m = n/2 to be the exponent of Y, the square root of X. If, however, n is odd, then assign m = (1 + n)/2 as the exponent of Y.
2) Evaluate z = (1.0 - X)/2.0 and y_0 = z(1.0 + z/2.0), and set i = 1.
3) Evaluate y_i using (29).
4) Compute d_i = y_i - y_{i-1}. If d_i is greater than a specified precision index then let i = i+1 and go to step 3; otherwise, continue.
5) Evaluate Y_i = 1.0 - y_i. If n is odd, multiply Y_i by the constant 0.7071067. The final square root Y = Y_i is now evaluated, and the process terminates.
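As an illustration, a small C sketch of Algorithm 3 for IEEE-style doubles is given below (the use of frexp/ldexp for normalization, the function name, and the tolerance argument are my choices rather than the paper's; in a hardware realization the divisions by 2.0 would be one-bit right shifts of the fraction):

#include <math.h>

/* Sketch of Algorithm 3: square root of a positive real number via the
   division-free fraction iteration of Eq. (29) plus exponent halving. */
double sqrt_real(double x, double tol)
{
    if (x <= 0.0)
        return 0.0;                           /* domain guard; the paper assumes x > 0 */

    int n;                                    /* exponent of x                          */
    double f = frexp(x, &n);                  /* x = f * 2^n with 0.5 <= f < 1          */
    int m = (n % 2 == 0) ? n / 2 : (n + 1) / 2;   /* step 1: exponent of the root       */

    double z = (1.0 - f) / 2.0;               /* step 2: Eq. (28)                       */
    double y = z * (1.0 + z / 2.0);           /* starting value y_0                     */
    double prev;
    do {                                      /* steps 3-4: Eq. (29)                    */
        prev = y;
        y = z + prev * prev / 2.0;
    } while (y - prev > tol);                 /* stop at the precision index            */

    double Yf = 1.0 - y;                      /* step 5: fraction of the root           */
    if (n % 2 != 0)
        Yf *= 0.7071067;                      /* odd exponent: compensate by 1/sqrt(2)  */
    return ldexp(Yf, m);                      /* assemble fraction and exponent         */
}

A usage example: sqrt_real(9.0, 1e-9) normalizes 9 as 0.5625 * 2^4, iterates the fraction toward 0.25, and returns ldexp(0.75, 2) = 3.0.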

VI. SIMULATION RESULTS

Both square rooting algorithms, for fixed-point operands and for floating-point operands, have been simulated on a general-purpose computer (Sun-3/110) and the results are compared to two other known methods. For fixed-point arithmetic we compare our simulation results to those obtained from a simulation of Prado and Alcantara's (PA) method [5]. Table I shows the comparison results under identical conditions, and for different data ranges.

For the floating-point case the simulation results are compared to the Newton-Raphson (NR) method [8], which is predominantly used for square rooting in computer arithmetic. However, to have a clear understanding of the significance of the initial estimate Z in reducing the computation time, we have chosen to implement the NR algorithm in two different environments: one without any preassigned initial estimate (NR1) and the other with the initial estimate Z given by (28) (NR2), as described earlier. Table II shows the comparison results.

TABLE I
THE SIMULATION TIME (ms) OF THE SQUARE ROOT ALGORITHM BY PA AND THE PROPOSED METHOD
[Only part of this table is legible in the source: the data column includes operands such as 13579, 1956421, 34978139, 47134012, and 1067001974, and the proposed method's times lie between 0.15 and 0.17 ms; the PA column is not recoverable.]

TABLE II
THE SIMULATION TIME (ms) OF THE SQUARE ROOT ALGORITHM BY NR AND THE PROPOSED METHOD

Data          Square root    NR1 method    NR2 method    The proposed method
3786541.91    1945.903931    5.2           1.7           0.8
35689.3512    188.916260     3.8           1.12          0.85
57.9812       7.614540       2.2           1.11          0.67
4.815         2.216981       1.5           1.1           0.75
0.534         0.730754       1.25          1.0           0.92
0.0004581     0.021582       2.4           1.3           0.6

Note that with the initial estimate of the square root known, the simulation time is considerably reduced in both the NR and the proposed methods of simulation, and the computation time with the initial estimate also becomes more stable. Second, the execution time for floating-point square rooting is substantially greater than that of the fixed-point square root.

VII. CONCLUSION

We have discussed some characteristics of the square root, and the precision expected in evaluating the square root, in Section II. In Section III, a simple method for finding a close estimate of a square root has been developed. A new iteration technique has been presented for computation of an accurate square root in Section IV, and the method has been implemented for both integers and real numbers (Section V). It has been shown that the iteration procedure converges very fast without any division operation being required. The algorithm is simulated for both integers and real numbers, and the results are compared to two methods that are being widely used (Section VI). The results (tabulated) show considerable improvement in speed compared to those two methods.

APPENDIX

Convergence: We are going to show that the iteration process, denoted by (23), always converges to Y, the precise square root of X. First, we expand y_{i-1}^2 as

y_{i-1}^2 = y^2 - (y - y_{i-1})(y + y_{i-1}).   (A1)

From the definition of the error function

e_{i-1} = (Y_{i-1} - Y)/Y = (y - y_{i-1})/Y   (A2)

we can rewrite (A1) as

y_{i-1}^2 = y^2 - e_{i-1} Y (y + y_{i-1}).   (A3)

Next, if we substitute y_{i-1}^2 from (A3) into (23) we get

y_i = z + Ky^2 - e_{i-1} K Y (y + y_{i-1})   (A4)

and from (20) and the fact that y_i - y = Y - Y_i we obtain

Y_i - Y = e_{i-1} K Y (y + y_{i-1})   (A5)

or from the definition of e_i we get

e_i = Ke_{i-1}(y + y_{i-1})   (A6)

from which (25) follows.

Now, it remains to show the convergence, that is, to show that e_i < e_{i-1}, for all i. First we shall prove that Y/y > 2.4. The lower limit is met when Y is at its lowest (and y is at its highest) value, which is Y = 2^{1/2} 2^{m-1}, and therefore

Y/y = 2^{1/2} 2^{m-1} / (2^m - 2^{1/2} 2^{m-1}) = 2^{1/2} / (2 - 2^{1/2}) = 2^{1/2} + 1 > 2.4.

On the other hand, y_{i-1}/y < 1. This is evident from (23), indicating that y_{i-1} is monotonically increasing with i. Substituting the values for Y/y and y_{i-1}/y into (A6) results in e_i < 0.3 e_{i-1} (since K(y + y_{i-1}) < 2Ky = y/(Y + y) = 1/(1 + Y/y) < 0.3), which satisfies the convergence criteria.
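To see the contraction numerically, one can run the fraction iteration of (29) with the plain seed y_0 = z on the worst-case fraction X = 0.5 and print the successive error ratios; the short C check below does this (the sample value, the iteration count, and the use of the library sqrt() as the reference root are mine, not the paper's):

#include <stdio.h>
#include <math.h>

/* Numerical illustration of the contraction e_i < 0.3 e_{i-1}. */
int main(void)
{
    double X = 0.5;                      /* normalized fraction, 0.5 <= X < 1 */
    double Y = sqrt(X);                  /* reference root                    */
    double z = (1.0 - X) / 2.0;          /* Eq. (28)                          */

    double y = z;                        /* y_0 = z, i.e. Y_0 = Z             */
    double e_prev = ((1.0 - y) - Y) / Y; /* e_0 = (Z - Y)/Y                   */
    printf("e_0 = %.3e\n", e_prev);

    for (int i = 1; i <= 5; i++) {
        y = z + y * y / 2.0;             /* Eq. (29)                          */
        double e = ((1.0 - y) - Y) / Y;
        printf("e_%d = %.3e   e_%d/e_%d = %.3f\n", i, e, i, i - 1, e / e_prev);
        e_prev = e;
    }
    return 0;
}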

REFERENCES

[1] J. M. Cioffi and T. Kailath, "Fast recursive-least-squares transversal filters for adaptive filtering," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, no. 2, Apr. 1984.
[2] G. R. L. Sohie and K. L. Kloker, "A digital signal processor with IEEE floating-point arithmetic," IEEE Micro, vol. 8, no. 6, pp. 49-67, 1989.
[3] J. H. P. Zurawski and J. B. Gosling, "Design of a high-speed square root multiply and divide unit," IEEE Trans. Comput., vol. C-36, no. 1, pp. 13-23, Jan. 1987.
[4] V. G. Oklobdzija and M. D. Ercegovac, "An on-line square root algorithm," IEEE Trans. Comput., vol. C-31, no. 1, pp. 70-75, Jan. 1982.
[5] J. Prado and R. Alcantara, "A fast square-rooting algorithm using a digital signal processor," Proc. IEEE, vol. 75, no. 2, pp. 262-264, Feb. 1987.
[6] J. F. Cavanagh, Digital Computer Arithmetic. New York: McGraw-Hill, 1984, ch. 4.
[7] N. R. Scott, Computer Number Systems & Arithmetic. Englewood Cliffs, NJ: Prentice-Hall, 1985, ch. 5.
[8] Y. Jaluria, Computer Methods for Engineering. Boston, MA: Allyn and Bacon, 1988, ch. 4.
[9] IEEE Standard for Binary Floating-Point Arithmetic, IEEE Standard 754, IEEE Computer Society, 1985.

Reza Hashemian (S'65-M'68-SM'84) received the B.S.E.E. degree from Tehran University, Tehran, Iran, in 1960, and the M.S. and Ph.D. degrees from the University of Wisconsin, Madison, both in electrical engineering, in 1965 and 1968, respectively.
From 1968 to 1984 he was with Sharif University of Technology as an Assistant, Associate, and Full Professor. This includes eight years (1972-1980) of research and development on circuit simulation and MOS modeling and characterization at the Materials and Energy Research Center. He joined Signetics Corporation in 1984, where he worked on the design of semi-custom ICs. Currently he is with the Department of Electrical Engineering, Northern Illinois University, DeKalb, where he teaches and does research in IC design, CAD, and computer arithmetic.
