Unit-Iv Adders:: Binary Adder Notations and Operations

The document discusses different types of binary adders, including half adders, full adders, ripple carry adders, carry lookahead adders, and carry save adders. It explains the basic operations and equations for 1-bit and multi-bit versions of each adder type. Ripple carry adders are built from full adders but have a delay that increases linearly with bit length, while carry lookahead adders reduce delay through carry propagation logic. Carry save adders minimize additions by computing partial sum and carry bits at each stage.

Uploaded by

Rene Dev

UNIT-IV

ADDERS:
Binary Adder Notations and Operations:
As mentioned previously, adders in VLSI digital systems use binary notation.
In that case, addition is performed bit by bit using Boolean equations.

1-bit Half Adder.

Consider a simple binary addition with two n-bit inputs A and B, a one-bit carry-in
cin, and an n-bit output S.
S = A + B + Cin
Where A = an-1, an-2……a0; B = bn-1, bn-2……b0.

The + in the above equation is the regular add operation. However, in the
binary world, only Boolean algebra works. For add-related operations, AND, OR and
Exclusive-OR (XOR) are required. In the following, a dot between two single-bit
variables, e.g. a . b, denotes 'a AND b'. Similarly, a + b denotes 'a OR b' and
a ^ b denotes 'a XOR b'. When adding two bits, the sum s and carry c can be
expressed using the Boolean operations mentioned above.

Si = ai ^ bi
Ci + 1 = ai . bi

The equation of Ci+1 can be implemented as shown in Fig.2.1. The figure shows a
half adder, which takes only 2 input bits. The solid line highlights the critical
path, i.e. the longest path from the input to the output. The equation of ci+1 can
be extended to perform a full add operation, where there is a carry input.

Si = ai ^ bi ^ ci
Ci + 1 = ai . bi + ai . ci + bi . ci
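
The half adder and full adder equations above can be sketched directly in Python as a quick check of the truth tables. This is only an illustration; the function names `half_adder` and `full_adder` are chosen here and do not come from the notes.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits: s = a ^ b, c = a . b."""
    return a ^ b, a & b


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry_out) for two input bits and a carry-in bit."""
    s = a ^ b ^ c                          # Si = ai ^ bi ^ ci
    carry = (a & b) | (a & c) | (b & c)    # Ci+1 = ai.bi + ai.ci + bi.ci
    return s, carry


if __name__ == "__main__":
    # Print the full adder truth table.
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, full_adder(a, b, c))
```

In every row, the two output bits read as a binary number (carry, sum) equal the arithmetic sum of the three input bits.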

1-bit Full Adder.

A full adder can be built based on the equations above. The block diagram of a
1-bit full adder is shown in Fig.2.2. The full adder is composed of 2 half adders
and an OR gate for computing the carry-out. Using Boolean algebra, the equivalence
can be easily proven. To help the computation of the carry for each bit, two binary
literals are introduced. They are called carry generate and carry propagate, denoted
by gi and pi. Another literal called temporary sum, ti, is employed as well. These
literals are related to the inputs as follows.

Gi = ai . bi
Pi = ai + bi
Ti = ai ^ bi
Where i is an integer and 0 ≤ i < n.
With the help of the literals above, output carry and sum at each bit can be
written as:

Ci + 1 = gi + pi . ci
Si = ti ^ ci
In some literature, the carry-propagate pi is replaced with the temporary sum ti
in order to reduce the number of logic gates. Here the two terms are kept separate
in order to clarify the concepts. For example, Ling adders use only pi as
carry-propagate.
The single-bit carry generate/propagate can be extended to the group versions G
and P. The following equations show the inherent relations.

Gi:k = Gi:j + Pi:j . Gj-1:k
Pi:k = Pi:j . Pj-1:k
Where i:k denotes the group term from i through k.

Using group carry generate/propagate, the carry can be expressed as in the
following equation.
Ci+1 = Gi:j + Pi:j . Cj
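
As a rough illustration of the generate/propagate relations above, the following Python sketch computes the per-bit literals and builds group terms recursively. The group is always split at j = i, which is only one of many valid splits, and the helper names (`bit_signals`, `group_signals`, `carries`) are hypothetical.

```python
def bit_signals(a_bits: list[int], b_bits: list[int]):
    """Per-bit generate g, propagate p and temporary sum t (index 0 = LSB)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # gi = ai . bi
    p = [a | b for a, b in zip(a_bits, b_bits)]   # pi = ai + bi
    t = [a ^ b for a, b in zip(a_bits, b_bits)]   # ti = ai ^ bi
    return g, p, t


def group_signals(g: list[int], p: list[int], i: int, k: int) -> tuple[int, int]:
    """Group (G, P) for bits i down to k, splitting the group at j = i."""
    if i == k:
        return g[i], p[i]
    g_low, p_low = group_signals(g, p, i - 1, k)
    # Gi:k = Gi:j + Pi:j . Gj-1:k   and   Pi:k = Pi:j . Pj-1:k
    return g[i] | (p[i] & g_low), p[i] & p_low


def carries(a_bits: list[int], b_bits: list[int], c0: int) -> list[int]:
    """Carries c1..cn, each computed as Ci+1 = Gi:0 + Pi:0 . C0."""
    g, p, _ = bit_signals(a_bits, b_bits)
    result = []
    for i in range(len(a_bits)):
        group_g, group_p = group_signals(g, p, i, 0)
        result.append(group_g | (group_p & c0))
    return result
```

Each carry here depends only on the inputs and c0, not on the previous carry, which is the property that look-ahead style adders exploit.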
Ripple carry adder
A ripple carry adder is an n-bit adder built from full adders. Fig 2.1 shows a 4-bit
ripple carry adder. One full adder is responsible for the addition of two binary
digits at each stage of the ripple carry. The carry-out of one stage is fed directly
to the carry-in of the next stage. Even though this is a simple adder and can be used
to add numbers of unrestricted bit length, it is not very efficient for large bit
widths.

4-b Ripple Carry Adder

One of the most serious drawbacks of this adder is that the delay increases
linearly with the bit length. The worst-case delay of the RCA occurs when a carry
signal transition ripples through all stages of the adder chain from the least
significant bit to the most significant bit, which is approximated by:
T = (n-1) tc + ts
where tc is the carry delay and ts is the sum delay of one full adder.
Delay :
The latency of a 4-bit ripple carry adder can be derived by considering the
worst-case signal propagation path. We can thus write the following
expression:

TRCA-4bit = TFA(A0,B0→Co) + TFA(Cin→C1) + TFA(Cin→C2) + TFA(Cin→S3)

It is easy to extend this to a k-bit RCA:

TRCA-kbit = TFA(A0,B0→Co) + (K-2) * TFA(Cin→Ci) + TFA(Cin→Sk-1)

Drawbacks :
The delay increases linearly with the bit length, so the adder is not very
efficient for large bit widths.
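
A minimal Python sketch of the ripple carry adder described above, assuming unsigned n-bit operands. The function names are illustrative only. The loop makes the serial dependency visible: each stage needs the carry from the previous one.

```python
def ripple_carry_add(a: int, b: int, cin: int, n: int) -> tuple[int, int]:
    """Add two n-bit unsigned integers bit by bit; return (sum, carry_out)."""
    carry = cin
    total = 0
    for i in range(n):                       # stage i waits for the carry of stage i-1
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        si = ai ^ bi ^ carry
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        total |= si << i
    return total, carry


def rca_worst_case_delay(n: int, t_carry: float, t_sum: float) -> float:
    """Worst-case delay estimate T = (n - 1) * tc + ts."""
    return (n - 1) * t_carry + t_sum
```

For example, with purely illustrative delays tc = 1 and ts = 2, `rca_worst_case_delay(4, 1.0, 2.0)` evaluates to 5.0, and doubling n roughly doubles the estimate.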
Carry Look-Ahead Adder

The carry look-ahead algorithm speeds up addition because the carries for the next
stages are calculated in advance from the input signals. In a CLA, the carry
propagation time is reduced to O(log2(Wd)) by using a tree-like circuit to compute
the carry rapidly. The figure shows the 4-bit Carry Look-Ahead Adder.

4-bit Carry Look Ahead Adder

The CLA exploits the fact that the carry generated by a bit position depends
on the three inputs to that position. If 'X' and 'Y' are the two inputs, then if
X=Y=1 a carry is generated independently of the carry from the previous bit
position, and if X=Y=0 no carry is generated. Similarly, if X ≠ Y, a carry is
generated if and only if the previous bit position generates a carry. If 'C' is the
initial carry and 'S' and 'Cout' are the output sum and carry respectively, then
the Boolean expressions for the next carry and the sum are:
Pi = Xi xor Yi -- Carry Propagation
Gi = Xi and Yi -- Carry Generation
Ci + 1 = Gi or (Pi and Ci) -- Next Carry
Si = Xi xor Yi xor Ci -- Sum Generation

Thus, for 4-bit adder, we can extend the carry, as shown below:

C1 = G0 + P0 · C0
C2 = G1 + P1 · C1 = G1 + P1 · G0 + P1 · P0 · C0
C3 = G2 + P2 · G1 + P2 · P1 · G0 + P2 · P1 · P0 · C0
C4 = G3 + P3 · G2 + P3 · P2 · G1 + P3 · P2 · P1 · G0+ P3 · P2 · P1 · P0 · C0
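
The expanded carry equations can be checked with a short Python sketch of a 4-bit CLA. The function name `cla_4bit` is hypothetical. Note that every carry is written in terms of the inputs and C0 only, so no carry waits for another.

```python
def cla_4bit(x: int, y: int, c0: int) -> tuple[int, int]:
    """4-bit carry look-ahead addition; returns (sum, carry_out C4)."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    p = [xi ^ yi for xi, yi in zip(xb, yb)]   # Pi = Xi xor Yi
    g = [xi & yi for xi, yi in zip(xb, yb)]   # Gi = Xi and Yi

    # Expanded carries, each a two-level function of P, G and C0.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    c = [c0, c1, c2, c3]
    s = sum((p[i] ^ c[i]) << i for i in range(4))   # Si = Pi xor Ci
    return s, c4
```

The growing number of product terms from C1 to C4 also hints at why the carry logic becomes unwieldy beyond a few bits.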

As with many design problems in digital logic, we can make tradeoffs between
area and performance (delay). In the case of adders, we can create faster (but
larger) designs than the RCA. The Carry Look-Ahead Adder (CLA) is one of these
designs (there are others too, but we will only look at the CLA).
Drawbacks :
For long bit lengths, a single-level carry look-ahead adder is not practical, but
a hierarchical structure improves matters considerably. The disadvantage of the CLA
is that the carry logic block gets very complicated for more than 4 bits. For that
reason, CLAs are usually implemented as 4-bit modules and are used in a hierarchical
structure to realize adders whose width is a multiple of 4 bits.

Carry Save Adder

The carry-save adder reduces the addition of 3 numbers to the addition of 2
numbers. The propagation delay is 3 gates regardless of the number of bits. The
carry-save unit consists of n full adders, each of which computes a single sum and
carry bit based solely on the corresponding bits of the three input numbers.
The entire sum can then be computed by shifting the carry sequence left by
one place, appending a 0 to the front (most significant bit) of the partial sum
sequence, and adding the two sequences with an RCA to produce the resulting
n+1-bit value.

This process can be continued indefinitely, adding an input for each stage of
full adders, without any intermediate carry propagation. These stages can be arranged
in a binary tree structure, with cumulative delay logarithmic in the number of inputs
to be added and independent of the number of bits per input. The carry-save
algorithm is best known for its use in multiplier architectures, and it is also used
for efficient CMOS implementation of a much wider variety of algorithms for
high-speed digital signal processing. A CSA applied in the partial product lines of
an array multiplier speeds up the carry propagation in the array.

4-bit Carry Save Adder

Basically, a carry save adder is used to compute the sum of three or more n-bit
binary numbers. Each cell of a carry save adder is the same as a full adder. As
shown in Fig.2.4, here we are computing the sum of two 4-bit binary numbers, so we
take 4 full adders at the first stage. The carry save unit consists of 4 full
adders, each of which computes a single sum and carry bit based only on the
corresponding bits of the two input numbers. Let X and Y be two 4-bit numbers that
produce the partial sum and carry S and C as shown below:

Si = Xi xor Yi ; Ci = Xi and Yi
The final addition is then computed as:
1. Shifting the carry sequence C left by one place.
2. Placing a 0 to the front (MSB) of the partial sum sequence S.
3. Finally, a ripple carry adder is used to add these two together and compute the
resulting sum.
Carry Save Adder Computation :
X:         1 0 0 1 1
Y:         1 1 0 0 1
Z:       + 0 1 0 1 1
S:         0 0 0 0 1
C:     + 1 1 0 1 1      (shifted left by one place)
SUM:     1 1 0 1 1 1
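
The worked example above can be reproduced with a small Python sketch. It treats each operand as an integer so that the bitwise operators act on all bit positions in parallel, as the row of full adders does. The function names are illustrative, and the final `+` stands in for the ripple carry adder of step 3.

```python
def carry_save(x: int, y: int, z: int) -> tuple[int, int]:
    """One carry-save stage: per-bit sum and carry of three operands."""
    s = x ^ y ^ z                        # partial sum bits
    c = (x & y) | (x & z) | (y & z)      # carry bits, not yet shifted
    return s, c


def add_three(x: int, y: int, z: int) -> int:
    """Reduce three operands to two, then perform one carry-propagating add."""
    s, c = carry_save(x, y, z)
    return s + (c << 1)                  # shift carries left, then final addition


if __name__ == "__main__":
    s, c = carry_save(0b10011, 0b11001, 0b01011)
    print(format(s, "05b"), format(c, "05b"))                    # 00001 11011
    print(format(add_three(0b10011, 0b11001, 0b01011), "06b"))   # 110111
```

The printed values match S, C and SUM in the table: 19 + 25 + 11 = 55.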
In this design a 128-bit carry save adder is used, since the output of the
multiplier is 128 bits (2N). The carry save adder reduces the addition of 3 numbers
to the addition of 2 numbers. The propagation delay is 3 gates regardless of the
number of bits. The carry save adder contains n full adders, each computing a single
sum and carry bit based only on the respective bits of the three input numbers. The
entire sum can be calculated by shifting the carry sequence left by one place and
then appending a 0 to the most significant bit of the partial sum sequence. The
partial sum sequence is then added with a ripple carry unit, resulting in an
n + 1 bit value. The ripple carry unit refers to the process where the carry-out of
one stage is fed directly to the carry-in of the next stage. This process is
continued without adding any intermediate carry propagation. Since drawing a
128-bit carry save adder is infeasible, a typical 8-bit carry save adder is shown in
figure 3. Here we are computing the sum of two 128-bit binary numbers, so 128 half
adders are used at the first stage instead of 128 full adders. Therefore, the carry
save unit comprises 128 half adders, each of which computes a single sum and carry
bit based only on the corresponding bits of the two input numbers.
bit carry save adder

If x and y are two 128-bit numbers, the unit produces the partial sum and carry
as S and C respectively.
Si = xi ^ yi (4)
Ci = xi & yi (5)
When two numbers are added using half adders, two ripple carry adders are used.
This is due to the fact that a ripple carry adder cannot compute a sum bit without
waiting for the previous carry bit to be produced, and hence its delay is equal to
that of n full adders. A carry-save adder, however, produces all of its output
values in parallel, so its total computation time is less than that of a ripple
carry adder. A Parallel In Parallel Out (PIPO) register is used as an accumulator in
the final stage.

2.5. Carry Select Adder

A carry-select adder is divided into sectors, each of which, except for the
least significant one, performs two additions in parallel, one assuming a carry-in
of zero, the other a carry-in of one. A four-bit carry select adder generally
consists of two ripple carry adders and a multiplexer. The carry-select adder is
simple but rather fast, having a gate level depth of O(√n). Adding two n-bit numbers
with a carry select adder is done with two adders (two ripple carry adders) in order
to perform the calculation twice, once with the assumption of the carry being zero
and once assuming one.
After the two results are calculated, the correct sum, as well as the correct
carry, is selected with the multiplexer once the correct carry is known. The
design schematic of the Carry Select Adder is shown in the figure.

The N-bit ripple carry adder is constructed from N single-bit full adders.
In the N-bit carry ripple adder, the delay time can be expressed as:

TCRA = (N-1) Tcarry + Tsum

In the N-bit carry select adder, the delay time is:

TCSA = Tsetup + (N/M) Tcarry + M Tmux + Tsum

In our proposed N-bit area-efficient carry select adder, the delay time is:

Tnew = Tsetup + (N-1) Tmux + Tsum

The carry select adder belongs to the category of conditional sum adders.
A conditional sum adder works on a condition: sum and carry are calculated
assuming an input carry of both 1 and 0 before the input carry arrives. When the
actual carry input arrives, the correct precomputed values of sum and carry are
selected using a multiplexer.
The conventional carry select adder consists of a k/2-bit adder for the lower half
of the bits, i.e. the least significant bits, and two k/2-bit adders for the upper
half, i.e. the most significant bits (MSBs). Of the MSB adders, one assumes a carry
input of one for performing the addition and the other assumes a carry input of
zero. The carry-out calculated by the least significant stage is used to select the
actual values of the output carry and sum. The selection is done by using a
multiplexer. This technique of dividing the adder into two stages increases the area
but speeds up the addition.
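
A minimal sketch of the two-stage carry select scheme described above, in Python. The names are hypothetical, and `block_add` uses integer arithmetic as a stand-in for a ripple carry block, since only the selection structure is of interest here.

```python
def block_add(a: int, b: int, cin: int, n: int) -> tuple[int, int]:
    """Stand-in for an n-bit ripple carry block; returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << n) - 1), total >> n


def carry_select_add(a: int, b: int, cin: int, k: int) -> tuple[int, int]:
    """k-bit carry select adder split into a lower and an upper half."""
    half = k // 2
    mask = (1 << half) - 1

    # Lower half: a single adder driven by the real carry-in.
    lo_sum, lo_carry = block_add(a & mask, b & mask, cin, half)

    # Upper half: two adders run in parallel, one per assumed carry-in.
    hi_a, hi_b = a >> half, b >> half
    sum0, carry0 = block_add(hi_a, hi_b, 0, k - half)
    sum1, carry1 = block_add(hi_a, hi_b, 1, k - half)

    # Multiplexer: the lower-half carry-out selects the precomputed result.
    hi_sum, cout = (sum1, carry1) if lo_carry else (sum0, carry0)
    return (hi_sum << half) | lo_sum, cout
```

The upper-half results are ready before `lo_carry` is known, so in hardware the remaining delay after the lower half finishes is only that of the multiplexer.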

2.6 Ripple Carry Adder

The basic addition operation at the bit level can be accomplished with a Full
Adder (FA) circuit. The FA adds two input bits Xi and Yi along with an input carry
Cin, resulting in a sum Si and a carry-out bit Cout as shown in Figure 3(b). The
operation performed by the FA is defined by the following Boolean equations for the
sum and the carry-out bits:

Si = Xi ⊕ Yi ⊕ Cin

Cout = (Xi ∧ Yi) ∨ (Cin ∧ (Xi ⊕ Yi))

= Majority(Xi, Yi, Cin)

The following notation for various Boolean operators will be used in this work to
avoid ambiguity

x ∨ y ↔ x OR y

x ∧ y ↔ x AND y

x ⊕ y ↔ x XOR y

x̄ ↔ NOT x
It is apparent from equations 2.4 and 2.5 that the realization of the sum function
requires two XOR logic gates, while two AND gates and one OR gate are needed for
the carry-out function. However, the FA sum and carry-out functions can be
represented by many different logic expressions, each of which determines the
structure of the circuit. Based upon those different logic expressions, many
full-adder cells and modules can be conceived. This provides the digital designer
with various alternative FA implementations to choose from and to investigate.
Shams et al. carried out a detailed performance analysis of twenty-three 1-bit FA
cells. Their study showed that each adder cell exhibits its own figures of power
consumption, delay and area. Other work compared the area and power-delay product
of six existing 1-bit FA cells and proposed a new design based on XOR/XNOR. A
further study proposed five different FA expressions based on XOR/XNOR
implementation to explore different performance tradeoffs, and then used the
proposed FA cells to improve the area and power of an array tree multiplier.

The 1-bit FA is cascaded as illustrated in Figure 4 to create an n-bit adder
known as the Ripple Carry Adder (RCA). The sum at each bit position i is determined
by the corresponding bit values of the operands at that position and the incoming
carry bit from the (i − 1)th position. The addition is completed once the carry
value propagates along the entire structure to the most significant bit (MSB)
position.

Ripple carry adder block diagram.

The area and delay of this adder can be roughly estimated using the unit-gate delay
and area model. This model is technology independent and assumes that each gate,
excluding exclusive-OR, counts as one elementary gate for both area and delay. An
exclusive-OR gate counts as two elementary gates for both area and delay.

Complex gates as well as multi-input gates are built from 2-input basic gates, and
their gate count equals the sum of the gate counts of the composing cells. Thus, the
RCA delay is estimated as 2n unit-gate delays while its area is 7n unit gates, where
n is the operand size. The main advantage of the RCA implementation is that it is
area efficient and easy to construct. However, its linear delay characteristic makes
it less suitable for high-speed implementations. An improved addition approach is
given next.
Multipliers

Introduction
Multiplication is an important fundamental function in arithmetic logic operations.
A system's performance is generally determined by the performance of the multiplier,
because the multiplier is generally the slowest element in the system. The objective
of a good multiplier is to provide a physically compact, high-speed and low-power
unit. To reduce the power consumption of a multiplier design, a good direction is to
reduce the number of operations, thereby reducing the dynamic power, which is a
major part of the total power dissipation.

An efficient multiplier should have the following characteristics:-

Accuracy:- A good multiplier should give the correct result.

Speed:- The multiplier should perform the operation at high speed.

Area:- The multiplier should occupy fewer slices and LUTs.

Power:- The multiplier should consume less power.

The multiplication process can be divided into three steps-

1. Partial product generation-The first step is radix-4 Booth encoding, in which a
partial product is produced from the multiplier and multiplicand.

2. Partial product reduction-The second step is an adder array or partial product
compression, which adds all partial products and converts them into sum and carry
form.

3. Final addition-The last step is the final addition, in which the final
multiplication result is generated by adding the sum and carry. Z=A*B+Z.

For the multiplication of an n-bit multiplicand with an m-bit multiplier, m partial
products are generated and the product formed is n + m bits long.
A = (an an-1 an-2 ... a0)   B = (bn bn-1 bn-2 ... b0)

AB = (A 2^n bn + A 2^(n-1) bn-1 + A 2^(n-2) bn-2 + ... + A 2^0 b0)
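
The sum of shifted partial products above is what the add-and-shift method evaluates. A minimal Python sketch for unsigned operands follows; the function name and the explicit multiplier width `m` are assumptions made for illustration.

```python
def shift_and_add(a: int, b: int, m: int) -> int:
    """Multiply unsigned a by an m-bit unsigned b using m partial products."""
    product = 0
    for i in range(m):
        bit = (b >> i) & 1
        partial = (a << i) if bit else 0   # partial product A * 2^i * bi
        product += partial                 # accumulate the partial products
    return product


if __name__ == "__main__":
    # 4-bit example: 13 * 11 = 143, which fits in n + m = 8 bits.
    print(shift_and_add(13, 11, 4))
```

Each loop iteration corresponds to one term of the sum, so an m-bit multiplier needs m additions in this simple form.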

Types of Multipliers
The common multiplication method is the "add and shift" algorithm.

1.) In parallel multipliers, the number of partial products to be added is the main
parameter that determines the performance of the multiplier.

2.) To reduce the number of partial products to be added, the Modified Booth
algorithm is one of the most popular algorithms.

3.) To achieve speed improvements, the Wallace Tree algorithm can be used to reduce
the number of sequential adding stages.

4.) On the other hand, "serial-parallel" multipliers compromise speed to achieve
better area and power consumption. The selection of a parallel or serial multiplier
depends on the nature of the application.

Applications
1.) Multiplication is a heavily used arithmetic operation that figures prominently
in signal processing and scientific applications.

2.) Multipliers are key components of many high-performance systems such as FIR
filters, microprocessors, digital signal processors, etc.

3.) Multipliers play an important role in today's digital signal processing and
various other applications.
Types of Multipliers

Serial-Parallel Multiplier
The serial multiplier uses a successive addition algorithm, where both operands are
entered serially, which leads to poor speed performance. In the parallel
multiplier, both operands are entered in parallel, which gives high speed but
occupies a much larger area than the serial multiplier. Hence, we go for the
serial-parallel multiplier.

The serial-parallel multiplier serves as a good trade-off between the
time-consuming serial multiplier and the area-consuming parallel multiplier. These
multipliers are used when there is demand for both high speed and small area. In a
device using the serial-parallel multiplier, one operand is entered serially and the
other operand is stored in parallel with a fixed number of bits. The resulting
improvement in processing speed and chip area becomes more significant when a large
number of independent operations are performed.

1.) This multiplier is the simplest one, the multiplication being considered as a
succession of additions.

If A = (an an-1 an-2 ... a0)

B = (bn bn-1 bn-2 ... b0)

then the product A.B may be expressed as

AB = (A 2^n bn + A 2^(n-1) bn-1 + A 2^(n-2) bn-2 + ... + A 2^0 b0)

2.) To implement this we use D flip-flops and full adders.

3.) Here the D flip-flops act as memory to store the data values, and the full adder
circuit is used for adding the partial products.

4.) A possible form of this multiplier for multiplying 4-bit quantities based on
this expression is shown in figure (1). The operation of the multiplier is as
follows-

i.) 'Number A' is entered in the 4 rightmost bits of the top row of D flip-flops,
which are further connected to three D flip-flops to form a 7-bit shift register.
The leftmost column of D flip-flops holds the B values.

ii.) The number A (a3 a2 a1 a0) is then multiplied with the least significant bit of
B (b0). Then the number A is shifted and multiplied with the other bits of B, one
after the other. The partial products are then added using full adders.

iii.) The shifting allows the multiplication of 'number A' by 2^1, 2^2, ..., 2^n,
thus forming the partial product at each stage of the process.

5.) This approach can be used to eliminate the least significant bits of the
product.

6.) A further reduction in hardware can be achieved by using the 3 additional D
flip-flops (which were earlier used as the shifting of A proceeds) for holding the
b values.

7.) This structure is suited only for positive or unsigned operands. If the
operands are negative and 2's complement encoded, then -

i.) The most significant bit of B has a negative weight, so a subtraction must
be performed as the last step.

ii.) The most significant bit of A must be replicated, since operand A must be
expanded to 2N bits.
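
The two adjustments for 2's complement operands listed in item 7 can be sketched in Python as follows. This models only the arithmetic, not the flip-flop structure, and the names `to_signed` and `multiply_twos_complement` are hypothetical.

```python
def to_signed(value: int, n: int) -> int:
    """Interpret the low n bits of value as a 2's complement number."""
    value &= (1 << n) - 1
    return value - (1 << n) if (value >> (n - 1)) & 1 else value


def multiply_twos_complement(a: int, b: int, n: int) -> int:
    """Multiply two n-bit 2's complement bit patterns; return a 2n-bit pattern."""
    mask = (1 << (2 * n)) - 1

    # ii.) Replicate the MSB of A so that A is expanded to 2N bits.
    a_ext = a & ((1 << n) - 1)
    if (a_ext >> (n - 1)) & 1:
        a_ext |= mask ^ ((1 << n) - 1)

    acc = 0
    for i in range(n - 1):                  # bits b0..b(n-2) have positive weight
        if (b >> i) & 1:
            acc = (acc + (a_ext << i)) & mask

    # i.) The MSB of B has negative weight: subtract as the last step.
    if (b >> (n - 1)) & 1:
        acc = (acc - (a_ext << (n - 1))) & mask
    return acc
```

For instance, with n = 4 the patterns 0b1101 (-3) and 0b0110 (6) give an 8-bit result that `to_signed(result, 8)` reads back as -18.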

Figure (1): 4-bit Serial and Parallel Multiplier

Braun Multiplier
Edward Louis Braun proposed the Braun multiplier in 1963. It is the simplest
parallel multiplier and is commonly known as the Carry Save Array Multiplier.

This multiplier consists of an array of AND gates and adders arranged in an
iterative structure that does not require logic registers. It is also known as the
non-additive multiplier, since it does not add an additional operand to the result
of the multiplication. The completion time is limited by the depth of the carry
save array and by the carry propagation in the final adder. This multiplier is
restricted to the multiplication of two unsigned (positive) numbers.
Architecture

1.) An n*n bit Braun multiplier is constructed with n(n-1) adders, n^2 AND gates and
(n-1) rows of Carry Save Adders.

2.) In the first rows there is no carry propagation (carry save adders are used). At
the bottom of the array, the output of the array is in carry save form, so an adder
converts it (by means of carry propagation) into the classical notation.

3.) All partial product bits can be generated in parallel with the AND gates. Each
partial product is added to the previous sum of partial products (which has been
produced by the preceding row of adders).

4.) The carry-outs are shifted by one bit position and then added to the sum of the
previous adder row and the newly generated partial product.

5.) The shifting is carried out with the help of the Carry Save Adder (CSA), and a
ripple carry adder is used for the final stage of the output.

6.) The schematic diagram is shown in figure (4.2).
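
The data flow described in the list above can be modelled at the word level with a short Python sketch. It assumes unsigned operands, uses integers so that each bitwise operation stands for a whole row of adders, and does not try to reproduce the exact cell count of a real Braun array. The function name is hypothetical.

```python
def braun_multiply(a: int, b: int, n: int) -> int:
    """Word-level model of an n x n carry save array multiplier (unsigned)."""
    # n * n AND gates: row i holds the bits a_j . b_i for j = 0..n-1.
    rows = [
        sum((((a >> j) & 1) & ((b >> i) & 1)) << j for j in range(n))
        for i in range(n)
    ]

    # (n - 1) carry save rows: carries are saved and shifted, never rippled.
    s, c = rows[0], 0
    for i in range(1, n):
        z = rows[i] << i                                  # newly generated partial product
        s, c = s ^ c ^ z, ((s & c) | (s & z) | (c & z)) << 1

    # Final stage: one carry-propagating addition converts carry save form
    # into ordinary binary (a ripple carry adder in the text).
    return s + c
```

The loop keeps the invariant that `s + c` equals the sum of the partial products added so far, so only the last line needs a carry-propagating adder.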

Figure - Carry save adder

Figure - Ripple Carry adder

Performance of Braun Multiplier

1.) The Braun multiplier performs well in terms of speed, power and area for
unsigned operands of less than 16 bits. It also has a simple structure compared to
other multipliers.

2.) The number of components required to build the Braun multiplier increases
quadratically with the number of bits, which makes it inefficient for large
operands.

3.) The main drawback of this multiplier is its potential susceptibility to
glitching due to the Ripple Carry Adder in the last stage. The delay depends on the
delay of the full adder and also on the final adder in the last stage.

To overcome these drawbacks

1.) The internal structure of the full adder can be realized using an FPGA. The
power and area can also be reduced by using two bypassing techniques called the Row
bypassing technique and the Column bypassing technique.

2.) The delay due to the final ripple adder can be minimized by using a very fast
Parallel Prefix Adder, the Kogge-Stone adder, which is a type of Carry Look-Ahead
Adder.

Speed consideration:

1.)The delay of the Braun multiplier is dependent on the delay of the full Adder cell
and also on the final Adder in the last row.

2.)In the multiplier array, a full Adder with balanced Carry and sum delays is
desirable because the sum and carry signals are both in the critical path.

3.)The speed and power of the full Adder is very important for large arrays.

Enhanced Braun Multipliers

1.) The performance of the Braun multiplier can be enhanced by replacing full
adders with half adders, which saves three logic gates, but the regularity of the
structure is disturbed.

2.) Another way to do this is to optimise the interconnection between the adders,
so that the delay through each adder path is approximately the same.
Baugh-Wooley multiplier
In signed multiplication, the length of the partial products and the number of
partial products are very high. An algorithm for signed multiplication, called the
Baugh-Wooley algorithm, was therefore introduced. Baugh-Wooley multiplication is
one of the cost-effective ways to handle the sign bits. The method was developed in
order to design regular multipliers suited to 2's complement numbers.

The Baugh-Wooley two's complement signed multiplier is the best-known algorithm for
signed multiplication, because it maximizes the regularity of the multiplier and
allows all the partial products to have positive sign bits.

Figure- unsigned 4-bit multiplication

Figure-signed 4-bit multiplication

Figure-Baugh-Wooley 4-bit algorithm

When multiplying two's complement numbers directly, each of the partial products to
be added is a signed number. Thus each partial product has to be sign extended to
the width of the final product in order to form a correct sum in the Carry Save
Adder (CSA) tree. The Baugh-Wooley approach suggests an efficient method of adding
extra entries to the bit matrix to avoid having to deal with the negatively
weighted bits in the partial product matrix.

Baugh-Wooley algorithm

This algorithm uses fewer steps and fewer adders. Here a0, a1, a2, a3 and b0, b1,
b2, b3 are the inputs, and the outputs are p0, p1... p7. Because pipelining
registers are used in this architecture, it takes less time to multiply a large
number of 2's complement operands.

Let us consider two numbers A and B (2's complement numbers).

The product of two numbers is

--->(3)

The first two terms of the above equation are positive and the last two terms are
negative. The last two terms are n-1 bits wide and extend in binary weight from
2^(n-1) up to 2^(2n-3). On the other hand, the final product is 2n bits wide and
extends in binary weight from 2^0 to 2^(2n-1). In order to calculate the product,
instead of subtracting the last two terms, it is possible to add the opposite
values.

We see that subtractor cells must be used. In order to use only adder cells, the
negative terms may be rewritten as :

----->(4)

Then A.B becomes

---->(5)

The final equation is

----->(6)

The above equation represents the Baugh-Wooley algorithm for the multiplication
process in two's complement form.

The Baugh-Wooley multiplier provides a high-speed, signed multiplication algorithm.
It applies parallel products to two's complement multiplication and adjusts the
partial products to maximize the regularity of the multiplication array. When a
number is represented in two's complement form, the sign of the number is embedded
in the Baugh-Wooley multiplier. This algorithm has the advantage that the signs of
the partial product bits are always kept positive, so that array addition techniques
can be directly employed. In two's complement multiplication, each partial product
bit is the AND of a multiplier bit and a multiplicand bit, and the signs of the
partial product bits are positive. In this scheme, a total of n(n - 1) + 3 full
adders are required. Hence, for the case of n = 4, the array requires 15 adders.
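
Since the equations of this section did not survive in the text, the following Python sketch uses one commonly quoted form of the Baugh-Wooley scheme as an assumption: the partial product bits that involve exactly one sign bit are complemented, and the constants 2^n and 2^(2n-1) are added, so that only additions of non-negative terms remain. It may differ in detail from the equations the notes intended, and the function names are hypothetical.

```python
def baugh_wooley_multiply(a: int, b: int, n: int) -> int:
    """Multiply two n-bit 2's complement patterns using additions only.

    Returns the 2n-bit 2's complement pattern of the product.
    """
    abit = [(a >> i) & 1 for i in range(n)]
    bbit = [(b >> i) & 1 for i in range(n)]
    total = 0

    # Positive terms: products of the non-sign bits, and of the two sign bits.
    for i in range(n - 1):
        for j in range(n - 1):
            total += (abit[i] & bbit[j]) << (i + j)
    total += (abit[n - 1] & bbit[n - 1]) << (2 * n - 2)

    # Negatively weighted terms, rewritten as complemented bits.
    for i in range(n - 1):
        total += (1 - (abit[i] & bbit[n - 1])) << (i + n - 1)
        total += (1 - (abit[n - 1] & bbit[i])) << (i + n - 1)

    # Correction constants that make up for the complementing.
    total += (1 << n) + (1 << (2 * n - 1))
    return total & ((1 << (2 * n)) - 1)


def full_adders_required(n: int) -> int:
    """Full adder count quoted in the text: n(n - 1) + 3."""
    return n * (n - 1) + 3
```

With n = 4, multiplying the patterns 0b1111 and 0b1111 (both -1) yields the pattern 0b00000001, and `full_adders_required(4)` gives the 15 adders mentioned above.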
FPGA AND CPLDS

Integrated circuit (IC) technology is the enabling technology for a whole host
of innovative devices and systems that have changed the way we live. VLSI systems
are much smaller and consume less power than the discrete components used to build
electronic systems before the 1960s. The electronics industry has achieved
phenomenal growth over the last two decades, mainly due to the rapid advances in
integration technologies and large-scale systems design, in short, due to the advent
of VLSI. The number of applications of integrated circuits in high-performance
computing, telecommunications, and consumer electronics has been rising steadily,
and at a very fast pace. Typically, the required computational power (or, in other
words, the intelligence) of these applications is the driving force for the fast
development of this field. The figure below gives an overview of the prominent
trends in information technologies over the next few decades. The current
leading-edge technologies (such as low bit-rate video and cellular communications)
already provide the end-users a certain amount of processing power and portability.

Trends of VLSI

This trend is expected to continue, with very important implications for VLSI
and systems design. One of the most important characteristics of information services
is their increasing need for very high processing power and bandwidth (in order to
handle real-time video, for example). The other important characteristic is that the
information services tend to become more and more personalized (as opposed to
collective services such as broadcasting), which means that the devices must be more
intelligent to answer individual demands, and at the same time they must be portable
to allow more flexibility/mobility. As more and more complex functions are required
in various data processing and telecommunications devices, the need to integrate
these functions in a small system/package is also increasing. The level of integration
as measured by the number of logic gates in a monolithic chip has been steadily rising
for almost three decades, mainly due to the rapid progress in processing technology
and interconnect technology.

Figure 6.2: Evolution of integration density and minimum feature size, as seen in the
early 1980s.

2. VLSI Design Flow

The design process at various levels is usually evolutionary in nature. It starts
with a given set of requirements. An initial design is developed and tested against
the requirements. When the requirements are not met, the design has to be improved.
If such improvement is either not possible or too costly, then a revision of the
requirements and its impact analysis must be considered. The three important domains
in VLSI are the behavioral domain, the structural domain and the geometrical layout
domain. The design flow starts from the algorithm that describes the behavior of the
target chip. The corresponding architecture of the processor is first defined. It is
mapped onto the chip surface by floor planning. The next design evolution in the
behavioral domain defines finite state machines (FSMs), which are structurally
implemented with functional modules such as registers and arithmetic logic units
(ALUs).

These modules are then geometrically placed onto the chip surface using CAD
tools for automatic module placement followed by routing, with the goal of minimizing
the interconnect area and signal delays. The third evolution starts with a behavioral
module description. Individual modules are then implemented with leaf cells. At this
stage the chip is described in terms of logic gates (leaf cells), which can be placed and
interconnected by using a cell placement & routing program. The last evolution
involves a detailed Boolean description of leaf cells followed by a transistor level
implementation of leaf cells and mask generation. In standard-cell based design, leaf
cells are already pre-designed and stored in a library for logic design use.

3. Design Hierarchy

The use of hierarchy, or the divide-and-conquer technique, involves dividing a
module into sub-modules and then repeating this operation on the sub-modules until
the complexity of the smaller parts becomes manageable. This approach is very
similar to the software case where large programs are split into smaller and smaller
sections until simple subroutines, with well-defined functions and interfaces, can be
written. The design of a VLSI chip can be represented in three domains.
Correspondingly, a hierarchy structure can be described in each domain separately.
However, it is important for the simplicity of design that the hierarchies in different
domains can be mapped into each other easily.

In the physical domain, partitioning a complex system into its various
functional blocks provides valuable guidance for the actual realization of these
blocks on the chip. The approximate shape and size (area) of each sub-module
should be estimated in order to provide a useful floor plan. For an adder, this physical
view describes the external geometry of the adder, the locations of input and output
pins, and how pin locations allow some signals (in this case the carry signals) to be
transferred from one sub-block to the other without external routing.
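
The hierarchy idea can be sketched in software terms. The following is a minimal illustration, not a layout or HDL description: a 16-bit adder is divided into 4-bit sub-modules, which are in turn built from 1-bit full adder leaf cells, and the carry is handed from one sub-block to the next. The function names and the 16-bit/4-bit split are assumptions chosen for this example.

```python
def full_adder(a, b, cin):
    """Leaf cell: 1-bit full adder, returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def adder_4bit(a_bits, b_bits, cin):
    """Sub-module: four leaf cells chained through their carry pins.

    a_bits and b_bits are lists of four bits, least significant bit first.
    """
    sum_bits = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


def adder_16bit(x, y, cin=0):
    """Top module: four 4-bit sub-modules; the carry passes block to block."""
    x_bits = [(x >> i) & 1 for i in range(16)]
    y_bits = [(y >> i) & 1 for i in range(16)]
    sum_bits = []
    carry = cin
    for start in range(0, 16, 4):
        block_sum, carry = adder_4bit(
            x_bits[start:start + 4], y_bits[start:start + 4], carry
        )
        sum_bits.extend(block_sum)
    total = sum(bit << i for i, bit in enumerate(sum_bits))
    return total, carry


print(adder_16bit(1234, 4321))   # (sum, carry_out)
print(adder_16bit(0xFFFF, 1))    # overflow sets the final carry
```

Each level only needs to know the interface of the level below it (inputs, outputs, and the carry pin), which mirrors the well-defined interfaces mentioned for the software case.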

4. VLSI Design Styles

Several design styles can be considered for chip implementation of specified
algorithms or logic functions. Each design style has its own merits and shortcomings,
and thus a proper choice has to be made by designers in order to provide the
functionality at low cost.

Field Programmable Gate Array (FPGA)

Fully fabricated FPGA chips containing thousands of logic gates or even more,
with programmable interconnects, are available to users for their custom hardware
programming to realize desired functionality. This design style provides a means for
fast prototyping and also for cost-effective chip design, especially for low-volume
applications. A typical field programmable gate array (FPGA) chip consists of I/O
buffers, an array of configurable logic blocks (CLBs), and programmable interconnect
structures. The programming of the interconnects is implemented by programming of
RAM cells whose output terminals are connected to the gates of MOS pass
transistors.

Performance of the design can be simulated and verified before downloading
the design for programming of the FPGA chip. The programming of the chip remains
valid as long as the chip is powered on or until new programming is done. In most
cases, full utilization of the FPGA chip area is not possible: many cell sites may
remain unused.
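
As a rough illustration of RAM-programmed interconnects, the sketch below models a small switch matrix in which one RAM bit sits at each crossing point and, when set, connects an input wire to an output wire (standing in for a pass transistor being turned on). The class name, its methods, and the choice to clear every bit on power-off are assumptions made for this example; it is an idealized model, not the architecture of any particular FPGA.

```python
class SwitchMatrix:
    """Idealized programmable interconnect: one RAM bit per crossing point."""

    def __init__(self, n_inputs, n_outputs):
        self.n_inputs = n_inputs
        self.n_outputs = n_outputs
        self.ram = self._blank()

    def _blank(self):
        return [[0] * self.n_outputs for _ in range(self.n_inputs)]

    def program(self, connections):
        """Load a new configuration from (input_index, output_index) pairs.

        Any earlier configuration is overwritten.
        """
        self.ram = self._blank()
        for i, o in connections:
            self.ram[i][o] = 1

    def power_off(self):
        """Assumption: the configuration is simply cleared without power."""
        self.ram = self._blank()

    def route(self, inputs):
        """Return the output values; None marks an unconnected output."""
        outputs = [None] * self.n_outputs
        for i, value in enumerate(inputs):
            for o in range(self.n_outputs):
                if self.ram[i][o]:
                    outputs[o] = value
        return outputs


matrix = SwitchMatrix(2, 2)
matrix.program([(0, 1), (1, 0)])   # cross the two wires
print(matrix.route([1, 0]))        # input 0 appears on output 1
matrix.power_off()
print(matrix.route([1, 0]))        # nothing is connected any more
```

Reprogramming is just another call to `program`, which is why no manufacturing step is needed to change the function of the chip.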

The largest advantage of FPGA-based design is the very short turn-around
time, i.e., the time required from the start of the design process until a functional chip
is available. Since no physical manufacturing step is necessary for customizing the
FPGA chip, a functional sample can be obtained almost as soon as the design is
mapped into a specific technology. The typical price of an FPGA chip is usually higher
than that of other realization alternatives (such as gate array or standard cells) of the
same design, but for small-volume production of ASIC chips and for fast prototyping,
the FPGA offers a very valuable option.

General architecture of Xilinx FPGA.

Gate Array Design

In view of the fast prototyping capability, the gate array (GA) comes after the
FPGA. While the design implementation of the FPGA chip is done with user
programming, that of the gate array is done with metal mask design and processing.
Gate array implementation requires a two-step manufacturing process: The first phase,
which is based on generic (standard) masks, results in an array of uncommitted
transistors on each GA chip. These uncommitted chips can be stored for later
customization, which is completed by defining the metal interconnects between the
transistors of the array. Since the patterning of metallic interconnects is done at the
end of the chip fabrication, the turn-around time can still be short, a few days to a few
weeks.

Basic processing steps required for gate array implementation.

The figure above shows a magnified portion of the internal array with metal mask
design (metal lines highlighted in dark) to realize a complex logic function. Typical
gate array platforms provide dedicated areas, called channels, for routing between the
rows of transistors. The availability of these routing channels simplifies the
interconnections, even when only one metal layer is used. The interconnection patterns
that realize basic logic gates can be stored in a library, which can then be used to
customize rows of uncommitted transistors according to the netlist. While most gate
array platforms only contain rows of uncommitted transistors separated by routing
channels, some other platforms also offer dedicated memory (RAM) arrays to allow a
higher density where memory functions are required.

Layout views of a conventional GA chip and a gate array with two memory banks.

Metal mask design to realize complex logic function on channeled GA

In general, the GA chip utilization factor, as measured by the used chip area
divided by the total chip area, is higher than that of the FPGA and so is the chip speed,
since more customized design can be achieved with metal mask designs. The current
gate array chips can implement as many as hundreds of thousands of logic gates.

Standard-Cells Based Design

The standard-cells based design is one of the most prevalent full custom
design styles, and it requires development of a full custom mask set. The standard
cell is also called the polycell. In this design style, all of the commonly used logic
cells are developed, characterized, and stored in a standard cell library. A typical
library may contain a few hundred cells including inverters, NAND gates, NOR gates,
complex AOI and OAI gates, D-latches, and flip-flops.

Each gate type can have multiple implementations to provide adequate driving
capability for different fan-outs. For instance, the inverter gate can have standard
size transistors, double size transistors, and quadruple size transistors so that the
chip designer can choose the proper size to achieve high circuit speed and layout
density.

VHDL contains constructs that are more specific to simulation and verification
than to synthesis. Synthesis software may ignore such constructs or rules. However,
the goal is to match the simulation specification with the code for synthesis.
Depending on the tools, this goal may not (and usually will not) be achievable. For
example, two different VHDL descriptions can describe the same design, one written
for simulation (which cannot be synthesized efficiently) and the other for synthesis.

A standard cell layout example.

After chip logic design is done using standard cells in the library, the most
challenging task is to place individual cells into rows and interconnect them in a way
that meets stringent design goals in circuit speed, chip area, and power consumption.
Many advanced CAD tools for place-and-route have been developed and used to
achieve such goals.

Full Custom Design

Although the standard-cells based design is often called full custom design, in a
strict sense it is somewhat less than fully custom, since the cells are pre-designed for
general use and the same cells are utilized in many different chip designs. In a fuller
custom design, the entire mask design is done anew without use of any library.
However, the development cost of such a design style is becoming prohibitively high.
Thus, the concept of design reuse is becoming popular in order to reduce design cycle
time and development cost.

Simplified floor plan consisting of two separate blocks and a common signal bus.

The most rigorous full custom design can be the design of a memory cell, be it
static or dynamic. Since the same layout design is replicated, there would not be any
alternative to high density memory chip design. For logic chip design, a good
compromise can be achieved by using a combination of different design styles on the
same chip, such as standard cells, data-path cells and PLAs. In real full-custom layout,
in which the geometry, orientation and placement of every transistor are done
individually by the designer, design productivity is usually very low: typically 10 to
20 transistors per day, per designer. In digital CMOS VLSI, full-custom design is
rarely used due to the high labor cost. Exceptions to this include the design of
high-volume products such as memory chips, high-performance microprocessors and
FPGA masters.
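
To make the productivity figure concrete, the short calculation below converts a transistor count into designer-days at the quoted rate of 10 to 20 transistors per day, per designer. The 100,000-transistor block is a hypothetical size picked for illustration, and the function name is likewise an assumption.

```python
def designer_days(transistor_count, transistors_per_day):
    """Full-custom layout effort in designer-days at a given productivity."""
    if transistors_per_day <= 0:
        raise ValueError("transistors_per_day must be positive")
    return transistor_count / transistors_per_day


BLOCK_SIZE = 100_000  # hypothetical block, not taken from the text

slow = designer_days(BLOCK_SIZE, 10)   # lower end of the quoted range
fast = designer_days(BLOCK_SIZE, 20)   # upper end of the quoted range
print(f"{fast:.0f} to {slow:.0f} designer-days")
```

Even at the upper end of the range, the effort runs to thousands of designer-days for a block of this assumed size, which is consistent with the remark that full-custom design is rarely used outside high-volume products.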

Cost of Manufacturing

IC manufacturing plants are extremely expensive. A single plant costs as
much as $4 billion. Given that a new state-of-the-art manufacturing process is
developed every three years, that is a sizeable investment. The investment makes sense
because a single plant can manufacture so many chips and can easily be switched to
manufacture different types of chips. In the early years of the IC business, companies
focused on building large quantities of a few standard parts. These parts are
commodities: one 80 ns, 256 MB dynamic RAM is more or less the same as any
other, regardless of the manufacturer. Companies concentrated on commodity parts
because manufacturing variations are easier to keep track of when the same part is
being fabricated day after day. Standard parts also made sense because designing
integrated circuits was hard: not only the circuit but also the layout had to be designed,
and there were few computer programs to help automate the design process.

