
The DSP Primer 3

Arithmetic for DSP


August 2005, University of Strathclyde, Scotland, UK For Academic Use Only


Introduction 3.1

• This section reviews arithmetic for DSP.

• The following key issues are presented here:

• Number representation techniques:


signed/unsigned integers, 1’s & 2’s complement, fixed point
and floating point;

• Arithmetic operation structures:


Addition/Subtraction, Multiplication, Division and Square Root;

• Complex arithmetic operations;

• FPGA specific arithmetic.

• Examples of implementing addition and multiplication in a Xilinx Virtex-II Pro FPGA are given.

August 2005, For Academic Use Only, All Rights Reserved


Notes:
This section of the course will introduce the following concepts:

Integer number representations - unsigned, one’s complement, two’s complement.

Non-integer number representations - fixed point, floating point.

Quantisation of signals, truncation, rounding, overflow, underflow and saturation.

Addition - decimal, two’s complement integer binary, two’s complement fixed point, hardware structures for
addition, Xilinx-specific FPGA structures for addition.

Multiplication - decimal, 2s complement integer binary, two’s complement fixed point, hardware structures for
multiplication, Xilinx-specific FPGA structures for multiplication.

Division.

Square root.
Integer Number Representations 3.2

• A fundamental consideration in DSP is the issue of:

Number Representation

• DSP, by its very nature, requires quantities to be represented digitally - using a number representation with finite precision.

• This representation must be sufficiently accurate to handle the “real-world” inputs and outputs of the DSP system.

• The representation must also be efficient in terms of its implementation in hardware.



Notes:
The use of binary numbers is of course a fundamental of any digital systems course and is well understood by most engineers. However, when dealing with large, complex DSP systems, there can be literally billions of multiplies and adds per second. Therefore any possible cost reduction from reducing the number of bits of representation is likely to be of significant value.

For example, assume we have a DSP filtering application using 16-bit resolution arithmetic. We will show later (see Slide 3.25) that the cost of a parallel multiplier (in terms of silicon area-speed product) can be approximated as the number of full adder cells. Therefore for a 16-bit by 16-bit parallel multiply the cost is of the order of 16 x 16 = 256 “cells”. The wordlength of 16 bits has been chosen (presumably) because the designer at some time demonstrated that 17 bits was too many, and 15 was not enough - or did they? Probably not! It's likely that we are using 16 bits because - well, that's what we usually use in DSP processors and we are creatures of habit! In the world of FPGA DSP arithmetic you can choose the resolution. Therefore, if it was demonstrated that in fact 9 bits was sufficient resolution, then the cost of a multiplier is 9 x 9 = 81 cells. This is approximately 30% of the cost of using 16-bit arithmetic.

Therefore it's important to get the wordlength right: too many bits wastes resources, and too few bits loses resolution. So how do we get it right? Well, you need to know your algorithms and DSP.
Unsigned Integers - Positive Values Only 3.3

• Unsigned integers can be used to represent non-negative numbers. For example, using 8 bits we can represent from 0 to 255:
Integer Value Binary Representation
Integer Value Binary Representation
0 00000000
1 00000001
2 00000010
3 00000011
4 00000100

64 01000000
65 01000001

131 10000011

255 11111111



Notes:
Note that the maximum value (255) is the sum of the powers of two from 0 to 7, where 8 is the number of bits:

i.e. 2^0 + 2^1 + 2^2 + 2^3 + 2^4 + 2^5 + 2^6 + 2^7 = 255 = 2^8 - 1

For the general case of N bits the maximum value is equal to 2^N - 1.
2’s Complement 3.4

• A more sensible number system for +ve and -ve numbers is 2’s complement, which has only one representation of 0 (zero):
Positive Numbers Negative Numbers
Integer Binary Integer Binary
0 00000000 0 100000000
1 00000001 Invert all bits -1 11111111
2 00000010 and ADD 1 -2 11111110
3 00000011 -3 11111101

125 01111101 -125 10000011


126 01111110 -126 10000010
127 01111111 -127 10000001
-128 10000000

• The 9th bit generated for 0 can be ignored. Note that -128 can be
represented but +128 cannot.



Notes:
When negating zero, it can be seen that a ninth bit is generated for the -ve zero. However, if we simply ignore this ninth bit, the representation for -ve zero becomes identical to the representation for +ve zero.

It is helpful to note the worth of each position in the number’s representation. For decimal 156:
1 × 10^2 + 5 × 10^1 + 6 × 10^0 = 156

This is to say that the string of symbols “156” represents the number 156 which is found by summing the product
of the worth and value at each position. The same is true for binary integers:

bit worth decimal


128 64 32 16 8 4 2 1 integer

0 0 0 0 0 0 0 1 1
1 0 1 0 0 0 0 0 160
1 1 0 0 0 1 1 1 199
1 1 1 1 1 1 1 1 255

For two’s complement integers, the same is true if we consider the leftmost column to have a negative value

bit worth decimal


-128 64 32 16 8 4 2 1 integer

0 0 0 0 0 0 0 1 1
1 0 1 0 0 0 0 0 -96
1 1 0 0 0 1 1 1 -57
1 1 1 1 1 1 1 1 -1
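The weighted-column interpretation above can be checked with a short Python sketch (the function name here is ours, not from the course) - the MSB carries a negative worth, all other bits a positive worth:

```python
def from_twos_complement(bits: str) -> int:
    """Interpret a bit string as a two's complement integer:
    the MSB has worth -2^(N-1); the remaining bits have positive worths."""
    value = -int(bits[0]) * 2 ** (len(bits) - 1)
    if len(bits) > 1:
        value += int(bits[1:], 2)
    return value

# Rows of the table above:
print(from_twos_complement("00000001"))  # 1
print(from_twos_complement("10100000"))  # -96
print(from_twos_complement("11000111"))  # -57
print(from_twos_complement("11111111"))  # -1
```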
Analogue to Digital Converter (ADC) 3.5

• An ADC is a device that can convert a voltage to a binary number,


according to its specific input-output characteristic.

[Figure: 8-bit ADC transfer characteristic, sampling at rate fs - voltage input on the horizontal axis, two's complement binary output on the vertical axis, from 10000000 (-128) up to 01111111 (+127).]

• We can generally assume ADCs operate using two’s complement arithmetic.
Notes:
Viewing the straight line portion of the device we are tempted to refer to the characteristic as “linear”. However
a quick consideration clearly shows that the device is non-linear (recall the definition of a linear system from
before) as a result of the discrete (staircase) steps, and also that the device clips above and below the maximum
and minimum voltage swings. However if the step sizes are small and the number of steps large, then we are
tempted to call the device “piecewise linear over its normal operating range”.

Note that the ADC does not necessarily have a linear (straight line) characteristic. In telecommunications for
example a defined standard nonlinear quantiser characteristic is often used (A-law and µ-law). Speech signals,
for example, have a very wide dynamic range: Harsh “oh” and “b” type sounds have a large amplitude, whereas
softer sounds such as “sh” have small amplitudes. If a uniform quantisation scheme were used then although
the loud sounds would be represented adequately the quieter sounds may fall below the threshold of the LSB
and therefore be quantised to zero and the information lost. Therefore non-linear quantisers are used such that
the quantisation level at low input levels is much smaller than for higher level signals. A-law quantisers are often
implemented by using a nonlinear circuit followed by a uniform quantiser. Two schemes are widely in use: the
A-law in Europe, and the µ-law in the USA and Japan. Similarly, the DAC can have a non-linear characteristic.
[Figure: non-linear quantiser characteristic - binary output versus voltage input.]
ADC Sampling “Error” 3.6

• Perfect signal reconstruction assumes that sampled data values are exact (i.e. infinite precision real numbers). In practice they are not, as an ADC will have a number of discrete levels.

• The ADC samples at the Nyquist rate, and the sampled data value is
the closest (discrete) ADC level to the actual value:
[Figure: a continuous voltage s(t) sampled by the ADC at rate fs (interval ts), producing discrete binary values v̂(n) on levels -4 to 4.]

v̂(n) = Quantise{ s(n·ts) }, for n = 0, 1, 2, …

• Hence every sample has a “small” quantisation error.



Notes:
For example purposes, we can assume our ADC or quantiser has 5 bits of resolution and maximum/minimum voltage swings of +15 and -16 volts. The input/output characteristic is shown below:

[Figure: 5-bit quantiser characteristic - binary output from 10000 (-16) to 01111 (+15) against voltage input, with a step size of 1 volt.]

In the above slide figure, for the second sample the true sample value is 1.589998..., however our ADC quantises to a value of 2.
Quantisation Error 3.7

• If the smallest step size of a linear ADC is q volts, then the error of any
one sample is at worst q/2 volts.

[Figure: linear quantiser characteristic from 10000 (-16) to 01111 (+15), with step size q volts, between -Vmax and Vmax.]



Notes:

Quantisation error is often modelled as an additive noise component, and indeed the quantisation process can be considered purely as the addition of this noise:

[Figure: x passing through an ADC to give y is equivalent to adding noise n_q to x, i.e. y = x + n_q.]
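As a small numerical illustration of this model (our own sketch, using the 1-volt step of the earlier ADC example), quantisation can be written as rounding to the nearest step, and the equivalent additive noise is then bounded by q/2:

```python
def quantise(x: float, q: float) -> float:
    """Round x to the nearest multiple of the step size q."""
    return q * round(x / q)

q = 1.0                      # step size of the example ADC (1 volt)
x = 1.589998                 # true sample value from the earlier slide
y = quantise(x, q)           # quantises to 2.0
n_q = y - x                  # the equivalent additive noise component
assert abs(n_q) <= q / 2     # the error of any one sample is at worst q/2
```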
An example 3.8

• Here is an example using a 3-bit ADC:


[Figure: three plots for the 3-bit ADC - (1) ADC input and output versus time (amplitude in volts, -4 to 4, over 0 to 1 seconds); (2) ADC input versus output transfer characteristic; (3) ADC error versus time, bounded by ±0.5 volts.]

Non-Integer Values 3.9

• What about non-integer values? In DSP systems, we often want to represent signals such as this sine wave:

[Figure: a sine wave with amplitude ±2.]

• 2’s complement is not much use, e.g. using just two bits gives values -2, -1, 0, 1, and we end up with a large quantisation error:

[Figure: the same sine wave coarsely quantised to the integer levels -2, -1, 0, 1.]

• Clearly what we need is a representation that can cope with non-integer values.



Notes:
A possible solution to the requirement for non-integer values here is just to allow the sine wave to scale up in amplitude and represent it using integers:

[Figure: the sine wave scaled up to amplitude ±127.]

This approach is fairly common, but in some cases it is either very convenient or essential to represent numbers between 0 and 1, and numbers between integers in general.

Representing fractional numbers is simple to do using decimal numbers. Recall, we represent non-integers by introducing a decimal point, and insert digits to the right of the point:

“12.34” ≡ 1 × 10^1 + 2 × 10^0 + 3 × 10^-1 + 4 × 10^-2 = 12.34

In words, the string of symbols “12.34” represents the number 12.34 as shown by the sum of multiples of powers of ten above.

We can do the same for binary numbers:

“10.01” ≡ 1 × 2^1 + 0 × 2^0 + 0 × 2^-1 + 1 × 2^-2 = 2.25

In words, the string of symbols “10.01” represents the number 2.25 as shown by the sum of multiples of powers of two above.
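The sum-of-weighted-digits evaluation can be sketched directly (a minimal, unsigned-only helper of our own, not part of the course material):

```python
def parse_binary(s: str) -> float:
    """Evaluate an unsigned binary string with an optional binary point
    as a sum of multiples of powers of two."""
    int_part, _, frac_part = s.partition(".")
    value = int(int_part, 2) if int_part else 0
    for i, bit in enumerate(frac_part, start=1):
        value += int(bit) * 2 ** -i   # worth of the i-th fractional bit is 2^-i
    return value

print(parse_binary("10.01"))   # 2.25, matching the worked example
```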
Fixed-point Binary Numbers 3.10

• We can now define what is known as a “fixed-point” number:

a number with a fixed position for the binary point.

• Bits on the left of the binary point are termed integer bits, and bits on
the right of the binary point are termed fractional bits, for example:
aaa.bbbbb 3 integer bits, 5 fractional bits

• This number behaves in a similar way to signed integers:


digit worth decimal
-(2^2) 2^1 2^0 2^-1 2^-2 2^-3 2^-4 2^-5 value

-4 2 1 0.5 0.25 0.125 0.0625 0.03125


0 0 0 0 0 0 0 1 0.03125
0 0 0 0 0 0 1 0 0.0625
1 0 1 0 0 0 0 0 -3.0
1 1 0 0 0 1 1 1 -1.78125
1 1 1 1 1 1 1 1 -0.03125



Notes:
Therefore, the second last example shows that 110.00111 (binary) = -1.78125 (decimal)

A very important class of fixed-point numbers is those with only one integer bit:
digit worth decimal
-(2^0) 2^-1 2^-2 2^-3 2^-4 2^-5 value

-1 0.5 0.25 0.125 0.0625 0.03125


0 0 0 0 0 1 0.03125
0 0 0 0 1 0 0.0625
1 0 0 0 0 0 -1.0
0 1 1 1 1 1 0.96875
1 1 1 1 1 1 -0.03125

For example, Motorola StarCore and TI C62x DSP processors both use a fixed point representation with only
one integer bit.

This format can be problematic as it cannot represent +1.0 - in fact, no fixed point representation can represent a positive number as large in magnitude as its most negative number. For the 3 integer bit, 5 fractional bit example in the slide above, -4.0 can be represented but +4.0 cannot.

The result is that great care must be taken when using fixed point. Some DSP processor architectures allow
extension of the format with one integer bit by the use of ‘extension bits’ - these are additional integer bits.
Fixed-point Quantisation 3.11

• Consider again the number format:

aaa.bbbbb 3 integer bits, 5 fractional bits

• Numbers between -4 and 3.96875 can be represented, in steps of 0.03125. As there are 8 bits, there are 2^8 = 256 different values.

• Revisiting our sine wave example, using this fixed-point format:

[Figure: the sine wave of amplitude ±2 quantised with this fixed-point format - the quantisation error is now much smaller.]

• Looks much better. We must always take into account the quantisation
when using fixed point - it will be +/- 1/2 of the LSB (least significant bit).



Notes:
Quantisation is simply the DSP term for the process of representing infinite precision numbers with finite precision numbers. In the decimal world, it is familiar to most to work with a given number of decimal places. The real number π can be represented as 3.14159265... and so on. We can quantise or represent π to 4 decimal places as 3.1416. If we use “rounding” here, the error is:

3.1416 - 3.14159265… = 0.00000735

If we truncated (just chopped off the digits below the 4th decimal place) then the error is larger:

3.14159265… - 3.1415 = 0.00009265

Clearly rounding is most desirable to maintain the best possible accuracy. However, it comes at a cost. The cost is relatively small, but it is not “free”.

When multiplying fractional numbers we will choose to work to a given number of places. For example, if we work to two decimal places then the calculation:

0.57 x 0.43 = 0.2451

can be rounded to 0.25, or truncated to 0.24. The results are different.

Once we start performing billions of multiplies and adds in a DSP system it is not difficult to see that these small
errors can begin to stack up.
Truncation 3.12

• In binary, truncation is the process of simply “removing” bits. This is usually done in a constrained way to convert from a larger to a smaller binary wordlength.

• Usually truncation is performed on least significant bits (LSBs):

16 bits

Truncating 7 LSBs

9 bits

• The net effect is that we lose precision.



Notes:
It is easy to show truncation for decimal numbers. Consider truncating 7.8992 to three significant digits - 7.89. Of course, we truncate the least significant digits and the result has lost some accuracy (or resolution), but it is still representative of the original 5-digit number. If we truncated the most significant digits instead - leaving 992 (or 0.0992) - the result is not what was intended and makes little sense in terms of the magnitude of the original number.

In the binary world the concept of truncating MSBs is rare and, as in the decimal example, truncating MSBs is usually catastrophic. However, in some (rare!) instances a sequence of operations may result in a reduction of the overall range of values and therefore merit the removal of MSBs.

[Figure: truncating 1 MSB from a 9-bit word to give 8 bits.]

Truncating MSBs can generally only be done when the bits to be truncated are empty. For example, truncating the MSB of a 9-bit representation of 1.25 leaves the value 1.25 intact, but truncating the MSB of the representation of -1.25 destroys the sign and the result is read as +0.25.

Truncating MSBs is especially problematic when using signed values as the sign bit will be lost.
Rounding 3.13

• Rounding is a more accurate, but more complicated, technique that requires an addition operation as well as the truncation.

9 bits 9 bits

+ 1

truncation rounding
• This process is equivalent to the technique for decimal rounding, i.e. to
go from 7.89 to one decimal place is accomplished by adding 0.05 then
truncating to 7.9.

• Note that rounding is not “free” - it requires one extra full adder.
Notes:
Some examples of truncation of LSBs of 16 bit numbers:
MSB 1 1 0 0 0 0
1 1 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
16 bits

1 1 0 0 0 0
1 1 0 0 0 0
0 0 1 1 0 0
0 -1.046875 1 0.0078125 1 0.0
0 1 1
0 0 0
0 0 0
0 no loss of precision 0 loss of precision 0 total loss of precision
0 0 0 (underflow)
LSB 0 0 0
-1.046875 0.013671875 0.005859375

The following rounding example is fairly extreme (but perfectly valid): 0.013671875 is very close to needing to be rounded up, so truncation makes a significantly larger error than rounding.

Truncation: 0.013671875 → 0.0078125, error = 0.0078125 - 0.013671875 = -0.005859375
Rounding: 0.013671875 → 0.015625, error = 0.015625 - 0.013671875 = 0.001953125
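The two wordlength-reduction rules above can be sketched in Python (our own helpers, working in units of the retained LSB): truncation is a floor, and binary rounding adds half an LSB first.

```python
import math

def truncate(x: float, frac_bits: int) -> float:
    """Keep frac_bits fractional bits by discarding the LSBs below them."""
    scale = 2 ** frac_bits
    return math.floor(x * scale) / scale

def round_lsb(x: float, frac_bits: int) -> float:
    """Add half an LSB, then truncate - binary rounding."""
    scale = 2 ** frac_bits
    return math.floor(x * scale + 0.5) / scale

x = 0.013671875                # the slide's example value
print(truncate(x, 7))          # 0.0078125  (error -0.005859375)
print(round_lsb(x, 7))         # 0.015625   (error  0.001953125)
```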
A different approach: Trounding 3.14

• Trounding is a compromise between truncation and rounding;

• It preserves information from beyond the LSB like rounding;

• However, unlike rounding it cannot affect any bit beyond the new LSB:

[Figure: trounding examples - reducing the wordlength of 0.005859375 and 0.013671875. In each case the first discarded bit is ORed into the new LSB, giving 0.0078125 for both.]



Notes:
A UK patent has been applied for in relation to this work:

UK Patent Application Number: GB0416896.9

Reducing wordlength of a fixed point number

Date Lodged: 29 Jul 2004

Trounding explained (briefly)

Trounding is best explained by comparing the logical OR operation with addition:

Input A Input B OR addition


0 0 0 0
0 1 1 1
1 0 1 1
1 1 1 0 carry 1

Only when both inputs are 1 does trounding differ from rounding. So compared to rounding, trounding “gets it right” three times out of 4. Truncation gets it right two times out of 4. Hence rounding is a 3 dB improvement in error over truncation, and trounding is a 1.5 dB improvement over truncation.

Trounding performs like rounding 75% of the time:

50% of the time, tround=round=truncate; 25% of the time, tround=round; 25% of the time, tround=truncate
Trounding has a lower mean quantisation error than truncation, but a higher mean quantisation error than
rounding; Trounding has a higher quantisation error variance than both rounding and truncation.
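Based on the OR-versus-addition table above, trounding on integers can be sketched as follows (our own formulation: the first discarded bit is ORed into, rather than added to, the new LSB):

```python
def tround(value: int, drop_bits: int) -> int:
    """Trounding: drop drop_bits LSBs, ORing the first discarded bit
    into the new LSB - so no carry can ripple beyond the new LSB."""
    kept = value >> drop_bits
    first_dropped = (value >> (drop_bits - 1)) & 1
    return kept | first_dropped

def round_int(value: int, drop_bits: int) -> int:
    """Conventional rounding: ADD the first discarded bit instead."""
    kept = value >> drop_bits
    first_dropped = (value >> (drop_bits - 1)) & 1
    return kept + first_dropped

# New LSB is 0: trounding agrees with rounding.
print(tround(0b0101, 1), round_int(0b0101, 1))  # 3 3
# New LSB and first dropped bit both 1: trounding agrees with truncation.
print(tround(0b0111, 1), round_int(0b0111, 1))  # 3 4
```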
Addition 3.15

• A couple of examples of integer arithmetic:

unsigned 2s
binary complement
integer
1 00000001 1
+1 +00000001 +1
2 00000010 2

131 10000011 -125


+3 +00000011 +3
134 10000110 -122

• The most important point to note is that when a binary addition is performed, the interpretation of the binary strings as either unsigned integers or 2s complement integers is valid - the additions are correct either way.



Notes:
The result of this is that the same hardware can be used for addition of signed or unsigned integers without
modification.
The Full Adder (FA) 3.16

• A full adder circuit can be built from two half adders plus an additional
or gate to provide support for carry in and carry out of the addition:

[Figure: full adder circuit - inputs A, B and CIN; outputs S and COUT.]



Notes:
The full adder truth table is shown below, along with the equivalent operations in normal mathematical notation:
A B CIN S COUT
0 0 0 0 0 0+0+0 = 0
0 0 1 1 0 0+0+1 = 1
0 1 0 1 0 0+1+0 = 1
0 1 1 0 1 0+1+1 = 2
1 0 0 1 0 1+0+0 = 1
1 0 1 0 1 1+0+1 = 2
1 1 0 0 1 1+1+0 = 2
1 1 1 1 1 1+1+1 = 3
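The truth table above is captured by the standard sum/majority-carry equations, sketched here in Python for checking (a simulation aid only, not part of the course material):

```python
def full_adder(a: int, b: int, cin: int):
    """One full adder: S = A xor B xor CIN; COUT is the majority function."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

# Reproduce every row of the truth table: 2*COUT + S must equal A + B + CIN.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder(a, b, cin)
            assert 2 * cout + s == a + b + cin
```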
Adding multi-bit numbers 3.17

• The full adder circuit can be used in a chain to add multi-bit numbers.
The following example shows 4 bits:
[Figure: 4-bit ripple-carry adder - a chain of full adders with operands A3..A0 and B3..B0; each COUT feeds the next CIN, the first CIN is 0, and the final carry out forms sum bit S4 alongside S3..S0.]

• This chain can be extended to any number of bits. Note that the last carry output forms an extra bit in the sum.

• If we do not allow for an extra bit in the sum, and a carry out of the last adder occurs, an “overflow” will result, i.e. the number will be incorrectly represented.



Notes:
Subtraction

Subtraction is very readily derived from addition. Remember two’s complement? All we need to do to get a
negative number is invert the bits and add 1.

Then if we add these numbers, we’ll get a subtraction:

[Figure: subtractor - the same full adder chain with the B inputs inverted and the chain's carry in set to 1.]

The addition of the 1 is done by setting the carry in to the chain.

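The full adder chain, including the invert-B-and-set-carry-in trick for subtraction, can be simulated bit by bit (our own sketch; bit lists are LSB first):

```python
def ripple_add_sub(a_bits, b_bits, subtract=False):
    """Chain full adders LSB-first. For subtraction, invert each B bit
    and set the chain's carry in to 1 (two's complement negation)."""
    carry = 1 if subtract else 0
    out = []
    for a, b in zip(a_bits, b_bits):
        b = b ^ 1 if subtract else b
        s = a ^ b ^ carry
        carry = (a & b) | (a & carry) | (b & carry)
        out.append(s)
    out.append(carry)   # extra sum bit for addition (inverted borrow when subtracting)
    return out

# 5 - 3 = 2 on 4 bits, LSB first: 0101 - 0011
result = ripple_add_sub([1, 0, 1, 0], [1, 1, 0, 0], subtract=True)
print(result[:4])   # S0..S3 = [0, 1, 0, 0], i.e. binary 0010 = 2
```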

Overflow 3.18

• An example of an addition which overflows:

65 01000001
+222 +11011110
287 100011111

• The result requires 9 bits from two 8-bit operands. If the ninth bit isn’t
present, the result becomes 00011111 = 31, which is incorrect.
Overflow has occurred.

• It is often not acceptable to allow the output width to grow as in the example above. For example, a filter with 128 unit weights has 128 consecutive additions and therefore potentially requires 7 extra bits at the output.



Notes:
Addition/Subtraction

Sometimes we need a combined adder/subtractor with the ability to switch between modes.

This can be achieved quite easily:


[Figure: combined adder/subtractor - the full adder chain with each B input gated by a Control signal, which also drives the chain's carry in.]

For: A + B, Control = 0

For: A - B, Control = 1

This structure will be seen again in the Division/Square Root slides!


Negative overflow 3.19

• We can get negative overflows as well:

-65 10111111
+ -112 +10010000
-177 101001111

• In this case, we lose the 9th bit (red) and the result “wraps round” to positive values: 01001111 = 79.

• The solution to overflow, both negative and positive, is to ensure that the results of operations will not exceed a certain pre-defined number of bits.

• For example, with 8-bit operands, we might allow 16 bits, regardless of how many consecutive additions we perform.

• This can be difficult to achieve in practice, so overflow is a problem.

Saturation 3.20
• One method to reduce the effects of overflow is to use a technique
known as saturation:
65 01000001
+222 +11011110
287 100011111
Detect overflow and
saturate the result
255 11111111

• When overflow is detected, the result is set to the largest possible value.

• Generally available in DSP processors - could be done on FPGA but requires additional logic.

• Very useful technique for dealing with the potential for overflow in, e.g., adaptive filtering algorithms.



Notes:
Saturation is extremely useful in adaptive algorithms. For example, in the Least Mean Squares (LMS) algorithm, the filter weights w are updated according to the equation:

w(k) = w(k-1) + 2µe(k)x(k)

Without further concern over the meaning of this equation, we can see that the term 2µe(k)x(k) is added to the weights at time epoch k-1 to generate the new weights at time epoch k.

If the operations that form 2µe(k)x(k) were to overflow, there is a high chance that the sign of the term would flip and drive the weights in completely the wrong direction, leading to instability.

With saturation however, if the term 2µe(k)x(k) gets very big and would overflow, saturation will limit it to the maximum value representable, causing the weights to change in the right direction, and at the fastest speed possible in the current representation. The result is a huge increase in the stability of the algorithm.
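Detect-and-clamp saturation can be sketched in a few lines (our own helper, clamping to the representable range of a given wordlength instead of wrapping):

```python
def saturating_add(a: int, b: int, bits: int = 8, signed: bool = False) -> int:
    """Add and clamp to the representable range instead of wrapping round."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return max(lo, min(hi, a + b))

print(saturating_add(65, 222))                  # 255, not the wrapped 31
print(saturating_add(-65, -112, signed=True))   # -128, not the wrapped 79
```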
Xilinx Virtex-II Pro addition 3.21



Notes:
Picture of Xilinx-II Pro slice (upper half) taken from “Virtex-II Pro Platform FPGAs: Introduction and Overview”,
DS083-1 (v2.4) January 20, 2003. http://www.xilinx.com

LookUp Table (LUT) programmed with a two-input XOR function:

G1 (A) G2 (B) D
0 0 0
0 1 1
1 0 1
1 1 0

Y = CIN XOR D; COUT = A when D = 0, and CIN when D = 1 (a multiplexer operation). The resulting truth table is:

G1 (A) G2 (B) CIN D Y COUT


0 0 0 0 0 0
0 0 1 0 1 0
0 1 0 1 1 0
0 1 1 1 0 1
1 0 0 1 1 0
1 0 1 1 0 1
1 1 0 0 0 1
1 1 1 0 1 1
Xilinx Virtex-II Pro addition (II) 3.22

• Although this looks complicated, the tools will handle all this complexity - you just need to specify that you want addition.

• The bottom half of a Virtex-II Pro slice can be programmed for an identical operation, with its COUT wired to the top-half’s CIN. Hence we can get two bits of addition per slice.

• Further bits can be added by wiring different slices in a column.

[Figure: a 2-bit addition uses the two full adders of 1 slice; a 4-bit addition chains 2 slices.]
Fixed point addition 3.23

• First some examples of decimal non-integer addition:

10.375 10.375
+ 3.125 + 8.125
13.500 18.500

• Now in fixed point binary (4 bits integer, 3 bits fractional):

1010.011 1010.011
+ 0011.001 + 1000.001
1101.100 10010.100

• Note that for large operands, an extra bit may be required. Care must be taken to interpret the binary point - it must stay in the same location w.r.t. the LSB - this means a change of location w.r.t. the MSB.

• Subtraction follows the same binary arithmetic as for integers.

Multiplication in decimal 3.24

• Starting with an example in decimal:


214
x45
1070
+8560
9630

• Note that we do 214 × 5 = 1070 and then add to it the result of 214 × 4 = 856 shifted left by one column.

• For each additional column in the second operand, we shift the multiplication of that column with the first operand by another place.
zzz
xaaaa
bbbb
+cccc0
+dddd00
+eeee000 etc...



Notes:
Multiplication in binary

Now the same example in binary:

11010110 A 7 …A 0
x00101101 B 7 …B 0
11010110
000000000
1101011000
11010110000
000000000000
1101011000000
00000000000000
000000000000000
0010010110011110 P 15 …P 0
Note that the product P is composed purely of selecting, shifting and adding A. The i-th bit of B indicates whether or not a version of A shifted by i places is selected in the i-th row of the sum.

So we can perform multiplication using just full adders and a little logic for selection, in a layout which performs
the shifting.
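The select-shift-add procedure can be written directly (a behavioural sketch of our own, not the hardware layout itself):

```python
def shift_add_multiply(a: int, b: int) -> int:
    """Unsigned multiply by selecting, shifting and adding A,
    one row per bit of B."""
    product = 0
    row = 0
    while b:
        if b & 1:                 # bit i of B selects A shifted left by i
            product += a << row
        b >>= 1
        row += 1
    return product

print(shift_add_multiply(0b11010110, 0b00101101))   # 214 * 45 = 9630
```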
Structure for multiplication 3.25

• This example shows a four-bit multiplication:


[Figure: 4-bit array multiplier built from full adder (FA) cells with AND gates; operands a3..a0 and b3..b0, product p7..p0. Example: 1101 (13) × 1011 (11) = 10001111 (143), formed by adding the shifted partial products 1101, 1101, 0000 and 1101.]

• The AND gate connected to a and b performs the selection for each bit. The diagonal structure of the multiplier effectively inserts zeros in the appropriate columns and shifts the a operands right.

• Note that this structure is not for signed 2’s complement (it needs modification)!
Xilinx Virtex-II Pro Slice multiplication 3.26



Notes:
Picture of Xilinx-II Pro slice (upper half) taken from “Virtex-II Pro Platform FPGAs: Introduction and Overview”,
DS083-1 (v2.4) January 20, 2003. http://www.xilinx.com

The LUT implements the XOR of two ANDs, with inputs G4 (A0), G3 (B1), G2 (A1) and G1 (B0):

D = (G4·G3) XOR (G2·G1) = (A0·B1) XOR (A1·B0)

The dedicated MULTAND unit is required as the intermediate product G1·G2 cannot be obtained from within the LUT, but is required as an input to MUXCY. The two AND gates perform a one-bit multiply each, and the result is added by the XOR plus the external logic (MUXCY, XORG):

Y = CIN XOR D; COUT = A1·B0 when D = 0, and CIN when D = 1

This structure will perform one bit of a multiply.


Xilinx Virtex-II Pro Slice multiplication (II) 3.27

• Can check that it works by making sure that this multiplication works:

[Figure: A1 A0 × B1 B0, with CIN and COUT chaining through the Y output.]

• Multiple units like this can be chained to do bigger multiplies:

[Figure: a 4-bit by 2-bit multiply producing Y3..Y0 from operands A3..A0 and B1, B0, chained through CIN/COUT. Thick black borders represent half-slice boundaries; green borders are slice boundaries.]



Notes:
The first half of the truth table for Y and C OUT (from Slide 3.26):

G1 (B0) G2 (A1) G3 (B1) G4 (A0) D CIN Y COUT

0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0
0 0 1 0 0 0 0 0
0 0 1 1 1 0 1 0
0 1 0 0 0 0 0 0
0 1 0 1 0 0 0 0
0 1 1 0 0 0 0 0
0 1 1 1 1 0 1 0
1 0 0 0 0 0 0 0
1 0 0 1 0 0 0 0
1 0 1 0 0 0 0 0
1 0 1 1 1 0 1 0
1 1 0 0 1 0 1 0
1 1 0 1 1 0 1 0
1 1 1 0 1 0 1 0
1 1 1 1 0 0 0 1
Xilinx Virtex-II Pro multiplication (III) 3.28

• As we can do one bit of a multiply in a half-slice, we can do an N-bit by 2-bit multiply in N/2 slices. In the example above, we have 4-bit by 2-bit in 2 slices.

• Perhaps the most important thing to note is that this is very complicated!

• Tools are designed to automate the process of connecting the components within a slice in order to perform efficient operations.

• But it is important to note that the tools aren’t infinitely clever, and sometimes we need to bear in mind the structure of the FPGA in order to generate an efficient design.



Notes:
The second half of the truth table for Y and C OUT (from Slide 3.26):

G1 (B0) G2 (A1) G3 (B1) G4 (A0) D CIN Y COUT

0 0 0 0 0 1 1 0
0 0 0 1 0 1 1 0
0 0 1 0 0 1 1 0
0 0 1 1 1 1 0 1
0 1 0 0 0 1 1 0
0 1 0 1 0 1 1 0
0 1 1 0 0 1 1 0
0 1 1 1 1 1 0 1
1 0 0 0 0 1 1 0
1 0 0 1 0 1 1 0
1 0 1 0 0 1 1 0
1 0 1 1 1 1 0 1
1 1 0 0 1 1 0 1
1 1 0 1 1 1 0 1
1 1 1 0 1 1 0 1
1 1 1 1 0 1 1 1
ROM-based multipliers 3.29

• Just as logical functions such as XOR can be stored in a LUT, as shown for addition, we can use storage-based methods to do other operations.

• By using a ROM, we can store the result of every possible multiplication of two operands.

• The two operands are concatenated to be used as the address by which to access the ROM.

• The value stored at that address is the multiplication result:

[Figure: two 4-bit operands A and B concatenated to address a 256 × 8-bit ROM whose data output is the product P.]



Notes:
There is one serious problem with this technique: as the operand size grows, the ROM size grows exponentially. For two N-bit input operands (and therefore a 2N-bit output operand), 2N × 2^(2N) bits of storage are required. For example, with 8-bit operands (a fairly reasonable size), 1 Mbit of storage is required - a large quantity. For bigger operands, e.g. 16 bits, a huge quantity of storage is required: 16-bit operands require 128 Gbits of storage!
Constant ROM-based multipliers 3.30

• Consider a ROM multiplier with 8-bit inputs: 65,536 16-bit locations are required.

[Figure: inputs A and B (8 bits each) concatenated into a 16-bit address for a ROM of 65,536 16-bit locations whose data output is P.]

• If input B is constant and B = k, only 256 locations are accessed.

[Figure: with input B removed, the ROM shrinks to 256 16-bit locations holding 0×k, 1×k, 2×k, 3×k, …, addressed directly by the 8 bits of A.]

• This constitutes a Constant Coefficient Multiplier (KCM)



Notes:
In the above example, 8-bit input B is fixed to one value, which means that in total only 256 out of 65,536 locations are accessed. Therefore, when one of the inputs of the ROM-based multiplier is fixed, the size of the required ROM can be reduced.

It is also possible to reduce the memory requirements of this structure if additional knowledge of the constant
value is available. For example, if the value of B is 10, the largest-magnitude output required for any 8-bit input
A will be –128 × 10 = –1280, which can be represented with 12 bits.
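As a sketch of the idea (the function name is ours, not from the slides), a KCM table for a signed 8-bit input and constant k can be precomputed like this; the ROM address is simply the raw two's complement bit pattern of A:

```python
def kcm_table(k, n_bits=8):
    # Precompute k times every possible n_bits signed input A; the ROM
    # address is A's raw two's complement bit pattern.
    table = {}
    for a in range(-(1 << (n_bits - 1)), 1 << (n_bits - 1)):
        table[a & ((1 << n_bits) - 1)] = a * k
    return table

rom = kcm_table(10)           # constant B = 10, as in the example above
print(rom[(-128) & 0xFF])     # -1280: the largest-magnitude entry
print(rom[45])                # 450
```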
Constant Coefficient Multiplier (KCM) 3.31

• ROM-based multipliers with a constant input

• This reduces the size of the required ROM

• Further reductions in size requirement can be made if there is
  knowledge of the constant value.

• Example: B = –83 requires an 8-bit representation. A is an 8-bit
  signed number, so its maximum absolute value is –128, giving a maximum
  product of (A × B)max = –128 × –83 = 10,624. Only a 15-bit
  representation is required - 1 bit saved!

(Figure: A’s 8 bits address a ROM of 256 locations storing 0×k, 1×k,
2×k, 3×k, ...; the data output P needs only 15 bits rather than 16)



Notes:
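The 15-bit claim on the slide can be checked with a small helper (ours, not from the slides) that finds the narrowest two's complement representation covering the product range:

```python
def min_signed_bits(lo, hi):
    # Smallest two's complement width whose range [-2^(n-1), 2^(n-1)-1]
    # covers [lo, hi].
    n = 1
    while lo < -(1 << (n - 1)) or hi > (1 << (n - 1)) - 1:
        n += 1
    return n

B = -83
products = [a * B for a in (-128, 127)]   # extremes of 8-bit signed A
lo, hi = min(products), max(products)
print(lo, hi)                   # -10541 10624
print(min_signed_bits(lo, hi))  # 15, one bit fewer than the full 16
```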
2’s complement multiplication 3.32

• For one negative and one positive operand just remember to sign
extend the negative operand.
11010110 -42
x00101101 x45
1111111111010110
0000000000000000
1111111101011000
sign 1111111010110000
extends 0000000000000000
1111101011000000
0000000000000000
0000000000000000
1111100010011110 -1890



Notes:
2’s complement multiplication (II) 3.33

• For both operands negative, subtract the last partial product.

• We use the trick of two’s complementing (inverting the bits and adding
  1) the last partial product and then adding it, rather than subtracting.
form last partial product negative
11010110 -42
x10101101 x-83
1111111111010110
0000000000000000
1111111101011000
1111111010110000
0000000000000000
two’s 1111101011000000
complement 0000000000000000
-1110101100000000 +0001010100000000
0000110110011110 3486

• Of course, if both operands are positive, just use the unsigned


technique!



Notes:
The difference between signed and unsigned multiplies results in different hardware being necessary. DSP
processors typically have separate unsigned and signed multiply instructions.
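The shift-and-add procedure from the last two slides can be sketched in Python (our own function, modelling n-bit hardware with masking): every partial product is sign extended to 2n bits, and the partial product for the multiplier's sign bit, which carries a negative weight, is two's complemented before being added.

```python
def twos_mult(a, b, n=8):
    # Shift-and-add two's complement multiply of two n-bit operands.
    mask = (1 << (2 * n)) - 1          # keep everything to 2n bits
    acc = 0
    for i in range(n):
        if (b >> i) & 1:
            pp = (a << i) & mask       # sign-extended partial product
            if i == n - 1:             # multiplier sign bit: weight is
                pp = (-pp) & mask      # -2^(n-1), so subtract this one
            acc = (acc + pp) & mask
    if acc >= 1 << (2 * n - 1):        # reinterpret 2n bits as signed
        acc -= 1 << (2 * n)
    return acc

print(twos_mult(-42, 45))    # -1890, as on slide 3.32
print(twos_mult(-42, -83))   # 3486, as on slide 3.33
```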
Fixed Point multiplication 3.34

• Fixed point multiplication is no more awkward than integer


multiplication:

      11010.110              26.750
    x 00101.101            x  5.625
         11.010110          0.133750
        000.000000          0.535000
       1101.011000         16.050000
      11010.110000        133.750000
     000000.000000        ----------
    1101011.000000        150.468750
   00000000.000000
  000000000.000000
  0010010110.011110

• Again we just need to remember to interpret the position of the binary


point correctly.



Notes:
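The same point can be made in code. A sketch (ours, not from the slides) that stores each fixed-point value as a scaled integer, multiplies the integers, and only reinterprets where the binary point sits:

```python
def to_fixed(x, frac_bits):
    # Store x as an integer scaled by 2^frac_bits.
    return int(round(x * (1 << frac_bits)))

def fixed_mult(xa, xb, frac_a, frac_b):
    # The integer product simply carries frac_a + frac_b fractional
    # bits; only the interpretation of the binary point changes.
    return xa * xb, frac_a + frac_b

a = to_fixed(26.750, 3)            # 11010.110
b = to_fixed(5.625, 3)             # 00101.101
p, frac = fixed_mult(a, b, 3, 3)
print(p, frac)                     # 9630 6
print(p / (1 << frac))             # 150.46875
```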
On-chip multipliers 3.35

• Many Xilinx FPGAs have various numbers of “on-chip” multipliers (from


4 to more than 500!).

• These are in hardware on the ASIC, not actually in the user FPGA area,
and therefore are permanently available, and they use no slices. They
also consume less power than a slice-based equivalent.

(Figure: operands A and B feed a dedicated 18 × 18 bit multiplier
producing the product P)

• A and B are 18-bit input operands, and P is the 36-bit product
  P = A × B. .......... Why 18 bits?

• Depending upon the particular device, between 12 and 556 of these


dedicated multipliers are available.



Notes:
Information on dedicated multipliers taken from “Virtex-II Pro Platform FPGAs: Introduction and Overview”,
DS083-1 (v2.4) January 20, 2003. http://www.xilinx.com
Division (i) 3.36

• Divisions are sometimes required in DSP, although not very often.

• 6 bit non-restoring division array:

(Figure: a 6-row array of controlled add/subtract cells computing
Q = B / A. Operand bits a5..a0 and b5..b0 enter at the top, each row
produces one quotient bit q5..q0, and each cell is a full adder with
carry, sum and operand-pass connections.)

• Note that each cell can perform either addition or subtraction as shown
  in an earlier slide ⇒ either Sin + Bin or Sin – Bin can be selected.
Notes:
A Direct method of computing division exists. This “paper and pencil” method may look familiar as it is often
taught in school. A binary example is given below. Note that each stage computes an addition or subtraction of
the divisor A. The quotient is made up of the carry bits from each addition/subtraction. If the quotient bit is a 0,
the next computation is an addition, and if it is a 1, the divisor is subtracted. It is not difficult to map this example
into the structure shown on the slide.

Example: B = 01011 (11), A = 01101 (13) ⇒ -A = 10011. Compute Q = B / A.

        01011   R0 = B
q4 = 0  10011   -A        (carry = 0)
        11110   R1
        11100   2·R1
q3 = 1  01101   +A        (carry = 1)
        01001   R2
        10010   2·R2
q2 = 1  10011   -A        (carry = 1)
        00101   R3
        01010   2·R3
q1 = 0  10011   -A        (carry = 0)
        11101   R4
        11010   2·R4
q0 = 1  01101   +A        (carry = 1)
        00111   R5

Q = B / A = 01101 × 2^-4 = 0.8125
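The stepping above can be mechanised in a few lines of Python (a behavioural sketch, ours, for the fractional case 0 ≤ B < A): each quotient bit is the carry out of an n-bit add, the carry selects +A or -A for the next row, and the remainder is doubled between rows.

```python
def nonrestoring_divide(b, a, n):
    # Non-restoring division: returns the n quotient bits of B/A,
    # to be read as Q = q * 2^-(n-1).
    mask = (1 << n) - 1
    r, q, add = b, 0, False            # first row subtracts A
    for _ in range(n):
        s = r + ((a if add else -a) & mask)
        carry = (s >> n) & 1           # carry out = quotient bit
        q = (q << 1) | carry
        add = (carry == 0)             # carry 0 -> add A next row
        r = ((s & mask) << 1) & mask   # double the remainder
    return q

q = nonrestoring_divide(0b01011, 0b01101, 5)
print(bin(q), q / 16)                  # 0b1101 0.8125
```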


Division (ii) 3.37

• There is an alternative way to compute division using another paper


and pencil technique.
(Figure: paper-and-pencil binary long division of 01011.0000 by 01101,
producing the quotient 00000.1101, shown alongside the step-by-step
remainders from the corresponding VHDL design)



Notes:
The Problem With Division 3.38

• An important aspect of division is to note that the quotient is generated


MSB first - unlike multiplication or addition/subtraction!

• This has implications for the rest of the system.

• It is unlikely that the quotient can be passed on to the next stage until
all the bits are computed - hence slowing down the system!

• Also, an N by N array has another problem - ripple through adders.

• Note that we must wait for N full adder delays before the next row can
begin its calculations.

• Unlike multiplication there is no way around this, and as a result
  division is always slower than multiplication even when performed on a
  parallel array - an N by N multiply will run faster than an N by N
  divide!



Notes:
By looking at the top two rows of a 4 x 4 division array we can see that the first bit to get generated is the MSB
of the quotient. This is unlike the multiplication array that can also be seen below, where the LSB is generated
first. This is a problem when using division as most operations require the LSBs to start a computation and
hence the whole solution will have to be generated before the next stage can begin.

Another problem for division is the fact that it takes N full adder delays before the next row can start. In the
examples below, the order in which the cells can start has been shown. So for the multiplier, the first cell on the
second row is the 3rd cell to start working. However, for the divider, the first cell on the second row is only the
5th cell to start working because it has to wait for the 4 cells on the first row to finish.
(Figure: the top rows of a 4 × 4 division array and a 4 × 4 multiplier
array, with cells numbered in the order they can begin computing. In the
divider the first cell of the second row is only the 5th to start, since
it must wait for all 4 cells of the first row; in the multiplier it is
the 3rd to start. FA is a full adder.)
Pipelining The Division Array 3.39

• The division array shown earlier can be pipelined to increase


throughput.
(Figure: the 6-bit division array with pipeline registers inserted
between rows; successive operand sets a5..a0, b5..b0 enter one row
apart, and the quotient bits q5..q0 are produced as before. Q = B / A.)



Notes:
To increase the throughput, the critical path can be broken down by inserting pipeline delays at appropriate
points. If pipelining is not used, the delay (critical path) from new data arriving to registering the full quotient is
N² full adders. This delay limits the rate at which new data can enter the array. However, by pipelining the
array, the critical path is reduced to just N full adders and thus the rate at which new data can arrive is
increased dramatically.
(Figure: two 4-bit division arrays. The longest register-to-register
path is the critical path: without pipelining it runs through N² full
adders, whereas with pipeline registers between rows it is only N full
adders.)


Square Root (i) 3.40

• 6 bit non-restoring square root array.


(Figure: a triangular array of controlled add/subtract cells computing
B = √A. The input bits enter in pairs (a7 a6, a5 a4, a3 a2, a1 a0), one
result bit b5..b0 emerges per row, and each cell is a full adder.)

• The square root is found (along with divides) in DSP algorithms such
  as the QR algorithm, vector magnitude calculations and communications
  constellation rotation.



Notes:
Looking carefully at the non-restoring square root array, we can note that this array is essentially “half” of the
division array! If the division array above is cut diagonally from the left we can see the cells that are needed for
the square root array. The 2 extra cells on the right hand side are standard cells which can be simplified. So
square root can be performed twice as fast as divide using half of the hardware!
(Figure: the division array cut diagonally from the left, showing the
cells needed for the square root array, alongside a worked non-restoring
square root of A = 10 11 01 01 producing the result bits b3 b2 b1 b0 =
1101 from remainders R1..R4.)
Square Root - An Alternative Approach 3.41

• Unfortunately the square root algorithm suffers from the same


problems as division although not to the same extent.

• These are:

• The result is generated MSB first.

• Each row has to wait longer and longer for the data it needs
from the previous row.

• A solution is to use memory to store the pre-computed square root


values. The input is then used as an address to look up the answer.

• This can be fast, but if the input wordlength is large this approach
  quickly becomes infeasible.

• Another approach is to use memory to look up a partial solution and


then use an iterative approach like the Newton-Raphson algorithm to
find the final solution.



Notes:
The Newton-Raphson equation can be used to find the square root of a number. It is an iterative technique
which can achieve accurate results with relatively few iterations. However, there are two parameters that make
it less than ideal for DSP.

• An initial guess is required to start the algorithm, and the accuracy of this guess affects the accuracy of the
solution after n iterations.

• The number of iterations n to achieve a desired accuracy are unknown.

The iterative algorithm is:

    x(n+1) = ( x(n) + Input / x(n) ) / 2

where x(n) is the current estimate of the square root (x(0) is the initial guess).

One approach that uses this algorithm is to take the b MSBs of the input and use them to address a memory
containing values for the initial guess x(0). This value is then fed into the Newton-Raphson algorithm for
n iterations.
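A sketch of that approach for inputs 1..255 (the table size, input width and iteration count are our illustrative choices, not from the slides): the top 3 bits of an 8-bit input index a small table of square roots of bin midpoints, and the result is refined by Newton-Raphson.

```python
def nr_sqrt(value, n_iter=4):
    # LUT of sqrt(bin midpoint) for the 8 bins of an 8-bit input
    # (value > 0 assumed).
    lut = [(32 * (i + 0.5)) ** 0.5 for i in range(8)]
    x = lut[(int(value) >> 5) & 0x7]          # initial guess x(0)
    for _ in range(n_iter):
        x = (x + value / x) / 2               # Newton-Raphson step
    return x

print(round(nr_sqrt(150), 6))   # 12.247449
```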
Square Root and Divide - Pythagoras! 3.42

• The main appearance of square roots and divides is in advanced


adaptive algorithms such as QR using givens rotations.

• For these techniques we often find equations of the form:

    cos θ = x / √(x² + y²)    and    sin θ = y / √(x² + y²)

• So in fact we actually have to perform two squares, a divide and a


square root. (Note that squaring is “simpler” than multiply!)

• There are a number of iterative techniques that can be used to


calculate square root. (However these routines invariably require
multiplies and divides and do not converge in a fixed time.)

• There seems to be some misinformation out there about square roots:
  for FPGA implementation, square roots are easier and cheaper to
  implement than divides...!



Notes:
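A direct sketch of those equations (ours, not from the slides): two squares, one square root and two divides, checked by rotating the vector so that its second component vanishes.

```python
import math

def givens(x, y):
    # cos(theta) = x / sqrt(x^2 + y^2), sin(theta) = y / sqrt(x^2 + y^2)
    mag = math.sqrt(x * x + y * y)     # two squares and a square root
    return x / mag, y / mag            # two divides

c, s = givens(3.0, 4.0)
print(c, s)                            # 0.6 0.8
print(-s * 3.0 + c * 4.0)              # ~0: the rotation zeros y
```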
Complex Addition/Subtraction 3.43

• Complex Addition and Subtraction obey the following:

( a + jb ) + ( c + jd ) = ( a + c ) + j ( b + d )
( a + jb ) – ( c + jd ) = ( a – c ) + j ( b – d )

• Thus 2 additions/subtractions are required:

(Figure: two adder/subtractors - a ± c forms the real part and b ± d
the imaginary part)



Notes:
For more information on Complex Arithmetic the following text may be useful:

[1] Press, Teukolsky, Vetterling, Flannery. Numerical Recipes in C, Cambridge University Press, 1992
Complex Multiplication 3.44

• Complex Multiplication requires more operations:

( a + jb ) × ( c + jd ) = ( ac – bd ) + j ( bc + ad )

• Thus, 4 multiplications and 2 additions are required:

(Figure: four multipliers form ac, bd, bc and ad; a subtractor gives
the real part ac – bd and an adder the imaginary part bc + ad)



Notes:
The total number of operations that must be performed for a complex multiplication is 6. But 4 of these
operations are multiplies. Generally multiplies are more costly in terms of speed and/or area than additions.
Thus, if we can reduce the number of multiplies at the expense of a few more additions, this can be beneficial.
Alternative Complex Multiplication 3.45

• The multiplication of two complex numbers can also be written as:

( a + jb ) × ( c + jd ) = ( ac – bd ) + j [ ( a + b ) × ( c + d ) – ac – bd ]

• This comprises 3 multiplications and 5 additions:

(Figure: three multipliers form (a + b)(c + d), ac and bd; the real part
is ac – bd and the imaginary part is (a + b)(c + d) – ac – bd)



Notes:
With some algebraic manipulation a complex multiplication can be expressed in terms of 8 operations as
opposed to 6. However, even though this form has 2 more operations than the previous one, there is 1 less
multiplier. We have effectively substituted a multiplier for 3 additions. This procedure offers an alternative
architecture which may be faster in systems where multiplication takes considerably longer than addition.
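Both formulations side by side (function names are ours); they must agree for all inputs, differing only in the multiply/add mix:

```python
def cmult4(a, b, c, d):
    # 4 multiplies, 2 adds: (ac - bd) + j(bc + ad)
    return a * c - b * d, b * c + a * d

def cmult3(a, b, c, d):
    # 3 multiplies, 5 adds: real = ac - bd,
    # imag = (a + b)(c + d) - ac - bd
    ac, bd = a * c, b * d
    s = (a + b) * (c + d)
    return ac - bd, s - ac - bd

print(cmult4(3, 4, 5, 6))   # (-9, 38)
print(cmult3(3, 4, 5, 6))   # (-9, 38)
```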
Complex Division 3.46

• Division of complex numbers uses more hardware than multiplication:


    a + jb     ( ac + bd ) + j ( bc – ad )
    ------  =  ---------------------------
    c + jd              c² + d²

• Hence, 6 multiplications, 2 divisions and 3 additions are required:


(Figure: six multipliers form ac, bd, bc, ad, c² and d²; adders form
ac + bd, bc – ad and c² + d², and two dividers produce the real and
imaginary parts)



Notes:
Clearly the division of complex numbers is even more expensive than multiplication in terms of the amount of
hardware required to carry out the operation. This process requires 6 multiplies, which we already know to be
slow, and divides. Dividers however are even slower than multipliers so it is clear that the division of complex
numbers is an expensive operation, both in terms of area and speed.
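The formula translates directly into code (a sketch, ours), with a quick check against Python's built-in complex division:

```python
def cdiv(a, b, c, d):
    # (a + jb) / (c + jd): 6 multiplies, 3 adds, 2 divides.
    den = c * c + d * d
    return (a * c + b * d) / den, (b * c - a * d) / den

re, im = cdiv(1.0, 2.0, 3.0, 4.0)
print(re, im)                       # 0.44 0.08
```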
Conclusions 3.47

• An overview of the principles of arithmetic for DSP has been given in
  this section;

• Number representation techniques have been presented; the different
  methods, with their advantages and disadvantages, have been introduced;

• Basic and advanced operations and their implementation in hardware
  were reviewed;

• Special attention has been paid to highly efficient implementation of
  addition and multiplication in Xilinx Virtex-II Pro FPGAs;

• Complex arithmetic operations and their implementations have also
  been considered;

• The current generation of DSP algorithms and architectures (QR, least
  squares, MIMO) requires square root and divide calculations - hence
  knowing how to derive these is very important.



Notes:
