Carry Select Adder

Full Description about CSA

Uploaded by

harshithakr

Chapter 1

Introduction

The arithmetic logic unit (ALU) is the heart of every microprocessor and determines its
throughput. The core of the arithmetic logic unit is the adder. Therefore, a high-performance
adder is essential to maximize the microprocessor's speed. However, the high data activity
associated with this unit results in high power and thermal density, leading to increased cooling
costs. Thus, there is a critical need for breakthrough ideas in VLSI design methodology to reduce
the adder power consumption while maintaining the high performance target. There are many ways
to design an adder.
The Ripple Carry Adder has the most compact design but is the slowest. For an N-bit Ripple
Carry Adder, the delay is linearly proportional to N, so for large values of N the Ripple Carry
Adder gives the highest delay of all adders. The Carry Look-ahead Adder, in contrast, is the
fastest but consumes more area. A Carry Look-ahead Adder is fast for N <= 4, but for large values
of N its delay increases more than that of other adders; for a higher number of bits, the Carry
Look-ahead Adder therefore gives a higher delay than other adders due to the presence of a large
number of logic gates. The Carry Select Adder acts as a compromise between the small-area but
longer-delay Ripple Carry Adder and the large-area but shorter-delay Carry Look-ahead Adder.
Adders are among the most widely used blocks in electronic applications. They are used in
multipliers and in DSP to execute various algorithms such as FFT, FIR and IIR. Microprocessors
perform millions of instructions per second, so the speed of operation is the most important
constraint to be considered while designing multipliers.
For portable devices, the degree of miniaturization should be high and the power consumption
should be low. In the rapidly growing mobile industry, faster units are not the only concern;
smaller area and lower power have also become major concerns in the design of digital circuits.
In mobile electronics, reducing area and power consumption are key factors in increasing
portability and battery life. Even in servers and desktop computers, power dissipation is an
important design constraint. The design of area- and power-efficient high-speed data-path logic
systems is one of the most substantial areas of research in VLSI system design.
In digital adders, the speed of addition is limited by the time required to propagate a carry
through the adder. The sum for each bit position in an elementary adder is generated sequentially
only after the previous bit position has been summed and a carry propagated into the next
position. Among the various adders, the Carry Select Adder (CSLA) is intermediate with regard to
speed and area.
The CSLA is used in many computational systems to alleviate the problem of carry
propagation delay by independently generating multiple carries and then selecting a carry to
generate the sum. However, the CSLA is not area efficient because it uses multiple pairs of Ripple
Carry Adders (RCA) to generate the partial sum and carry by considering carry input Cin = 0 and
Cin = 1; the final sum and carry are then selected by the multiplexers.
The basic idea of this work is to use a Binary to Excess-1 Converter (BEC) instead of the RCA
with Cin = 1 in the regular CSLA to achieve lower area and power consumption. The main
advantage of the BEC logic is that it needs fewer logic gates than the n-bit Full Adder (FA)
structure.

The carry-select adder generally consists of two ripple carry adders and a multiplexer.
Adding two n-bit numbers with a carry-select adder is done with two adders (therefore two ripple
carry adders) in order to perform the calculation twice, one time with the assumption of the carry
being zero and the other assuming one. After the two results are calculated, the correct sum, as
well as the correct carry, is then selected with the multiplexer once the correct carry is known.

The number of bits in each carry select block can be uniform or variable. In the uniform
case, the optimal delay occurs for a block size of ⌊√n⌋. When variable, the block size should
have a delay, from addition inputs A and B to the carry out, equal to that of the multiplexer chain
leading into it, so that the carry out is calculated just in time. The O(√n) delay is derived from
uniform sizing, where the ideal number of full-adder elements per block is equal to the square
root of the number of bits being added, since that will yield an equal number of MUX delays.
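
As a small illustration of the uniform sizing rule (block size equal to the square root of the number of bits), the following Python sketch computes the block size and the resulting number of blocks for a few word lengths. The function names are hypothetical, and the rule is applied exactly as stated in the text, without any technology-dependent tuning.

```python
import math


def uniform_block_size(n_bits: int) -> int:
    """Block size for uniform carry-select blocks: integer square root of n_bits."""
    return math.isqrt(n_bits)


def block_count(n_bits: int) -> int:
    """Number of uniform blocks needed to cover n_bits (the last block may be shorter)."""
    size = uniform_block_size(n_bits)
    return -(-n_bits // size)  # ceiling division


for n in (16, 32, 64):
    print(f"{n}-bit adder: block size {uniform_block_size(n)}, blocks {block_count(n)}")
```

With equal block sizes, the number of full-adder stages inside a block and the number of multiplexer stages between blocks are roughly balanced, which is the motivation given above for the square-root choice.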

1.1 VLSI technology
Very large scale integration (VLSI) is the process of creating integrated circuits by combining
thousands of transistors into a single chip. VLSI began in the 1970s when
complex semiconductor and communication technologies were being developed.
The first semiconductor chips held two transistors. Subsequent advances added more and more
transistors and, as a consequence, more individual functions or systems were integrated over time.

The first integrated circuits held only a few devices, perhaps as many as
ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or
more logic gates on a single device.
The earlier integrated circuit design methods are SSI, MSI and LSI. SSI means small scale
integration, in which circuits held only a few devices, perhaps as many as ten diodes,
transistors, resistors and capacitors, making it possible to fabricate one or more logic gates.
MSI means medium scale integration; this technique led to devices with hundreds of logic gates.
LSI means large scale integration, i.e. systems with at least a thousand logic gates.
Digital VLSI circuits are predominantly CMOS based. The way normal blocks like latches
and gates are implemented is different from what we have seen so far but the behavior remains
the same. The miniaturization involves new things to consider. A lot of thought has to go into
implementations as well as design.
CMOS (Complementary metal oxide semiconductor) technology is used for constructing
integrated circuits. CMOS technology is used in microprocessors, microcontrollers, static RAM
and the other digital logic circuits. CMOS technology is also used for several analog circuits
such as image sensors, data converters, and highly integrated transceivers for many types of
communication.

1.2 Languages in VLSI

A hardware description language is used to design the hardware, and the design is carried
out using specific software. Direct structural implementation is not practical because a circuit
fabricated at the nanometre scale cannot be changed manually once it has been made. Before the
design is committed to hardware, its units should be functionally connected and their behavior as
a complex unit should be checked using software. Only then is it implemented in hardware. For
this, a predefined language is needed for describing the behavioral function of a particular
system. The language used depends on the user’s requirements and the designer’s knowledge.
The languages have predefined library functions which describe the functions of some
elements. The designer can specify the designed unit in terms of its behavior, data flow or the
logic elements used.
There are three hardware description languages:

i. AHDL
ii. VHDL
iii. Verilog
1.2.1. AHDL
AHDL means Analog HDL. It is not widely used because its behavioral results are less
accurate. The languages in current use are VHDL and Verilog HDL.
1.2.2. VHDL
VHDL means VHSIC hardware description language. VHDL is commonly used to write
text models that describe a logic circuit. Such a model is processed by a synthesis program, only
if it is part of the logic design. A simulation program is used to test the logic design using
simulation models to represent the logic circuits that interface to the design. This collection of
simulation models is commonly called a test bench.
1.2.3. Verilog HDL
In the semiconductor and electronic design industry, Verilog is a general purpose hardware
description language (HDL) used to model electronic systems. It is easy to learn and easy to use.
It is similar in syntax to the C programming language. It is most commonly used in the design,
verification, and implementation of digital logic chips at the register transfer level (RTL) of
abstraction. It is also used in the verification of analog and mixed signal circuits.
A Verilog design consists of a hierarchy of modules. Modules encapsulate design
hierarchy, and communicate with other modules through a set of declared input, output, and
bidirectional ports.

1.3 Introduction to adders

1.3.1 Basic adders


Two types of basic adder are discussed below.
i. Half Adder (HA)
ii. Full Adder (FA)

1.3.1.1. Half Adder (HA)

The half adder is an example of a simple, functional digital circuit built from two logic
gates. The half adder adds two one-bit binary numbers (A and B). The outputs are the sum of the
two bits (S) and the carry (C). The logic level diagram for the Half Adder is shown in Fig 1.1.
The Boolean expressions for the S and C bits are as shown below.

S=A⊕B (1.1)

C=A×B (1.2)

Fig 1.1 Logic level diagram of Half Adder

SUM bit is the XOR function of two inputs and CARRY bit is the AND function of the
two inputs. The truth table of a half adder is shown in Table 1.1.

Table 1.1 Truth Table of a Half Adder

INPUTS   OUTPUTS
A   B    S   C
0   0    0   0
0   1    1   0
1   0    1   0
1   1    0   1

Note how the same two inputs are directed to two different gates. The inputs to the XOR
gate are also the inputs to the AND gate. The input "wires" to the XOR gate are tied to the input
wires of the AND gate; thus, when voltage is applied to the A input of the XOR gate, the A input
to the AND gate receives the same voltage.
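
The half adder behavior can be checked with a short Python sketch that applies equations (1.1) and (1.2) to every input combination. The function name `half_adder` is an illustrative choice, and bits are modeled as the integers 0 and 1.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (S, C) for one-bit inputs a and b."""
    s = a ^ b  # equation (1.1): S = A XOR B
    c = a & b  # equation (1.2): C = A AND B
    return s, c


# Print the truth table in the same column order as Table 1.1.
print("A B S C")
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(a, b, s, c)
```

Both outputs are computed from the same two inputs, mirroring the way the XOR gate and the AND gate share their input wires.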

1.3.1.2. Full Adder (FA)

A full adder could be defined as a combinational circuit that forms the arithmetic sum of
three input bits. It consists of three inputs and two outputs. In our design, we have designated the
three inputs as A, B and C. The third input C represents the carry input to the first stage. The
outputs are S and C1. Fig 1.2 shows the logic level diagram of a full adder. The Boolean
expressions for the S and C1 bits are as shown below.

S = A ⊕ B ⊕ C (1.3)
C1 = (A × B) + (B × C) + (A × C) (1.4)

Fig 1.2 Logic level diagram of Full Adder

The SUM bit is the XOR function of all three inputs, and the CARRY bit is 1 whenever at least
two of the three inputs are 1 (the OR of the three pairwise AND terms in equation 1.4).

The truth table of a full adder is shown in Table 1.2. The truth table also indicates the
status of the CARRY bit; that is to say, whether the carry bit has been generated, deleted or
propagated. Depending on the status of input bits A and B, the CARRY bit is either generated,
deleted or propagated. If exactly one of the A and B inputs is ‘1’, then the previous carry is just
propagated, as the sum of A and B is ‘1’. If both A and B are ‘1’, then a carry is generated,
because summing A and B makes the output S ‘0’ and C1 ‘1’. If both A and B are ‘0’, then summing
A and B gives ‘0’; any previous carry is added into S, making the C1 bit ‘0’. This is in
effect deleting the CARRY.

Table 1.2 Truth Table of a Full Adder

INPUT        OUTPUT
A   B   C    S   C1
0   0   0    0   0
0   0   1    1   0
0   1   0    1   0
0   1   1    0   1
1   0   0    1   0
1   0   1    0   1
1   1   0    0   1
1   1   1    1   1
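
The full adder and the generate, propagate and delete cases described above can be sketched in Python as follows. The names `full_adder` and `carry_status` are illustrative and not taken from the report; the carry output is called `c1` to match Table 1.2.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (S, C1) for one-bit inputs a, b and carry input c."""
    s = a ^ b ^ c                        # sum: XOR of all three inputs
    c1 = (a & b) | (b & c) | (a & c)     # carry: at least two inputs are 1
    return s, c1


def carry_status(a: int, b: int) -> str:
    """Classify what happens to an incoming carry for inputs a and b."""
    if a & b:
        return "generate"   # both inputs 1: carry out is 1 regardless of carry in
    if a ^ b:
        return "propagate"  # exactly one input 1: carry out equals carry in
    return "delete"         # both inputs 0: carry out is 0 regardless of carry in


print("A B C S C1 status")
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            s, c1 = full_adder(a, b, c)
            print(a, b, c, s, c1, carry_status(a, b))
```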

1.3.2 Fast adder


Many types of fast adders exist. They are:

i. Ripple Carry Adder
ii. Carry Look-ahead Adder
iii. Carry Save Adder
iv. Carry Skip Adder
v. Carry Select Adder

1.3.2.1. Ripple Carry Adder (RCA)


A simple ripple carry adder is a digital circuit that produces the arithmetic sum of two
binary numbers. It can be constructed with full adders connected in cascade, with the carry output
from each full adder connected to the carry input of the next full adder in the chain. This is called
a carry ripple adder or ripple carry adder. Fig 1.3 shows the interconnection of four full adder
(FA) circuits to provide a 4-bit ripple carry adder. Notice from Fig 1.3 that the input is from the
right side because the first cell traditionally represents the least significant bit (LSB). Bits A0 and
B0 in the figure represent the least significant bits of the numbers to be added. The sum output is
represented by the bits S0-S3. The main problem with this type of adder is the delay needed to
produce the carry out signal and the most significant bits. This delay increases with the number
of bits to be added.

Fig 1.3 4-Bit Ripple Carry Adder (four cascaded full adders with inputs A3-A0, B3-B0 and Cin, and outputs S3-S0 and Cout)

Ripple carry adder calculates sum and carry according to the following equation.

Si = Ai ⊕ Bi ⊕ Ci (1.5)

Ci+1 = Ai Bi + (Ai + Bi) Ci (1.6)

where i = 0, 1, 2, ……, n-1.


As the carry ripples from one full adder to the next, it traverses the longest critical path and
exhibits the worst-case delay. The Ripple Carry Adder is the slowest of all adders, but it is very
compact in size. If the ripple carry adder is implemented by cascading n full adders, the delay of
such an adder is 2n gate delays from Cin to Cout. The delay of the adder increases linearly with
the number of bits.
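
The ripple carry equations (1.5) and (1.6) can be illustrated with a bit-level Python sketch. The helper names `to_bits`, `from_bits` and `ripple_carry_add` are hypothetical, and bit lists are stored with the least significant bit first.

```python
def to_bits(value: int, width: int) -> list[int]:
    """Split an integer into a list of bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: list[int]) -> int:
    """Rebuild an integer from a list of bits, least significant bit first."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits: list[int], b_bits: list[int], cin: int = 0):
    """Add two equal-length bit lists; return (sum_bits, cout)."""
    carry = cin
    sum_bits = []
    for ai, bi in zip(a_bits, b_bits):
        sum_bits.append(ai ^ bi ^ carry)          # equation (1.5)
        carry = (ai & bi) | ((ai | bi) & carry)   # equation (1.6)
    return sum_bits, carry


s_bits, cout = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s_bits), cout)
```

Each loop iteration needs the carry from the previous one, which is the sequential dependency responsible for the delay growing linearly with the number of bits.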

1.3.2.2. Carry Select Adder (CSLA)


The carry select adder belongs to the category of conditional sum adders. A conditional sum
adder computes its results under an assumed condition: the sum and carry are calculated by
assuming the input carry to be both 1 and 0 before the actual input carry arrives. When the actual
carry input arrives, the correct precomputed values of sum and carry are selected using a
multiplexer. The conventional carry select adder consists of one n-bit adder for the lower half of
the bits, i.e. the least significant bits (LSBs), and two n-bit adders for the upper half, i.e. the
most significant bits (MSBs). Of the two MSB adders, one assumes a carry input of one when
performing the addition and the other assumes a carry input of zero.

The carry out calculated from the previous stage, i.e. the least significant bit stage, is used to
select the actual values of output carry and sum. The selection is done by using a
multiplexer. This technique of dividing the adder into two stages increases the area but
speeds up the addition. The basic block diagram of the carry select adder is shown in Fig 1.4.

The Carry Select Adder (CSLA) is one of the fastest adders used in many data-processing
processors to perform fast arithmetic functions. The carry select adder partitions the adder into
several groups, each of which performs two additions in parallel.

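Fig 1.4 Carry Select Adder (block diagram: a 4-bit setup stage takes the A’s and B’s and produces the P’s and G’s; a ‘0’ carry propagation block and a ‘1’ carry propagation block feed a multiplexer controlled by Cin, which outputs Cout and the C’s; a sum generation stage produces the S’s)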


Two copies of the ripple-carry adder act as the carry evaluation block in each select stage.
One copy evaluates the carry chain assuming the block carry-in is zero, while the other assumes it
to be one. Once the carry signals are finally computed, the correct sum and carry-out signals are
simply selected by a set of multiplexers. The 4-bit adder block is an RCA. The Carry Select Adder
thus acts as a compromise between the small-area but longer-delay Ripple Carry Adder and the
large-area but shorter-delay Carry Look-ahead Adder.
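
The select principle can be sketched behaviorally in Python. This is only an illustration under stated assumptions: the function name `carry_select_add` is hypothetical, the word width is assumed to be a multiple of the block size, and Python integer addition stands in for the two ripple carry adders inside each block.

```python
def carry_select_add(a: int, b: int, width: int = 16, block: int = 4, cin: int = 0):
    """Behavioral model of a uniform carry select adder; returns (sum, cout)."""
    mask = (1 << block) - 1
    result = 0
    carry = cin
    for lo in range(0, width, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        # Both candidate results are formed without knowing the real carry-in.
        sum0 = a_blk + b_blk        # copy that assumes block carry-in = 0
        sum1 = a_blk + b_blk + 1    # copy that assumes block carry-in = 1
        # Multiplexer: the actual carry-in only selects between the two.
        chosen = sum1 if carry else sum0
        result |= (chosen & mask) << lo
        carry = chosen >> block     # selected carry-out feeds the next block
    return result, carry


print(carry_select_add(0x1234, 0x0FFF))
```

In hardware the two candidate sums of every block are produced in parallel, so only the multiplexer selection depends on the carry from the previous block.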

1.4 Binary to Excess-1 Converter (BEC)

A BEC is a circuit used to add 1 to its input number. The circuit of a 3-bit BEC and its
function table are shown in Fig 1.5 and Table 1.3 respectively. The main objective of this project
is to reduce the gate count by using the Binary to Excess-1 Converter. In order to reduce the area
and power, we use an (n+1)-bit Binary to Excess-1 Converter instead of an n-bit RCA.

Fig 1.5 3-Bit Binary to Excess-1 Converter

Table 1.3 Function Table of 3-Bit Binary to Excess-1 Converter

BINARY [2:0]    EXCESS-1 [2:0]
B2  B1  B0      X2  X1  X0
0   0   0       0   0   1
0   0   1       0   1   0
0   1   0       0   1   1
0   1   1       1   0   0
1   0   0       1   0   1
1   0   1       1   1   0
1   1   0       1   1   1
1   1   1       0   0   0

Fig 1.5 shows how the basic function of the CSLA is obtained with the BEC. One input of the
6:3 mux is the direct input (B2, B1 and B0), which comes from the RCA with Cin = 0, and the other
input of the mux is the BEC output (X2, X1 and X0). This produces the two possible partial results
in parallel, and the mux is used to select either the BEC output or the direct inputs according to
the control signal Cin. The importance of the BEC logic lies in the large silicon area reduction.
The Boolean expressions of the 3-bit BEC are shown below:

X0 = ~B0 (1.7)
X1 = B0 ⊕ B1 (1.8)
X2 = B2 ⊕ (B1 × B0) (1.9)
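
As a quick check, the following Python sketch evaluates equations (1.7) to (1.9) for all eight inputs and compares the result with adding 1 modulo 8, which is what Table 1.3 lists. The function name `bec3` is an illustrative choice.

```python
def bec3(b2: int, b1: int, b0: int) -> tuple[int, int, int]:
    """3-bit binary to excess-1 converter; returns (X2, X1, X0)."""
    x0 = b0 ^ 1            # equation (1.7): X0 = NOT B0
    x1 = b0 ^ b1           # equation (1.8)
    x2 = b2 ^ (b1 & b0)    # equation (1.9)
    return x2, x1, x0


print("B2 B1 B0 -> X2 X1 X0")
for value in range(8):
    b2, b1, b0 = (value >> 2) & 1, (value >> 1) & 1, value & 1
    x2, x1, x0 = bec3(b2, b1, b0)
    # The converter output should equal the input plus one, wrapping at 8.
    assert (x2 << 2) | (x1 << 1) | x0 == (value + 1) % 8
    print(b2, b1, b0, "->", x2, x1, x0)
```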

1.5 Delay and area calculation

The AND, OR, and Inverter (AOI) implementation of an XOR gate is shown in Fig 1.6.
The gates between the dotted lines are performing the operations in parallel and the numeric
representation of each gate indicates the delay contributed by that gate.

The delay and area evaluation methodology considers all gates to be made up of AND,
OR, and Inverter, each having delay equal to 1 unit and area equal to 1 unit. Then add up the
number of gates in the longest path of a logic block that contributes to the maximum delay.

Fig 1.6 Delay and Area Evaluation of an XOR Gate


The area evaluation is done by counting the total number of AOI gates required for each
logic block. The delay evaluation takes into account that gates operating in parallel, as in the
XOR gate, contribute only one unit of delay between them.

Based on this approach, the CSLA adder blocks of 2:1 mux, Half Adder (HA), and Full
Adder (FA) are evaluated and listed in Table 1.4.

Table 1.4 Delay and Area count of the blocks of CSLA

Adder Blocks    Delay   Area
XOR             3       5
2:1 Mux         3       4
Half Adder      3       6
Full Adder      6       13
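
The entries of Table 1.4 can be reproduced with a short Python sketch of the unit-delay, unit-area counting method. The gate-level decompositions used here are assumptions, since the report does not list them: an XOR built from 2 inverters, 2 AND gates and 1 OR gate, a 2:1 mux built from 1 inverter, 2 AND gates and 1 OR gate, and a full adder built from 2 XOR blocks, 2 AND gates and 1 OR gate. All function names are hypothetical.

```python
GATE_DELAY = 1  # every AND, OR and inverter costs one delay unit
GATE_AREA = 1   # every AND, OR and inverter costs one area unit


def xor_block() -> tuple[int, int]:
    """(delay, area) of an AOI XOR: inverters, then ANDs, then OR."""
    area = (2 + 2 + 1) * GATE_AREA
    delay = 3 * GATE_DELAY  # three levels; gates within a level work in parallel
    return delay, area


def mux2_block() -> tuple[int, int]:
    """(delay, area) of a 2:1 mux: inverter, then ANDs, then OR."""
    return 3 * GATE_DELAY, (1 + 2 + 1) * GATE_AREA


def half_adder_block() -> tuple[int, int]:
    """(delay, area) of a half adder: XOR for sum in parallel with AND for carry."""
    xor_delay, xor_area = xor_block()
    return max(xor_delay, GATE_DELAY), xor_area + GATE_AREA


def full_adder_block() -> tuple[int, int]:
    """(delay, area) of a full adder: two XORs in series, plus 2 ANDs and 1 OR."""
    xor_delay, xor_area = xor_block()
    sum_delay = 2 * xor_delay                 # A xor B, then xor with carry in
    carry_delay = xor_delay + 2 * GATE_DELAY  # XOR, then AND, then OR
    return max(sum_delay, carry_delay), 2 * xor_area + 3 * GATE_AREA


for name, block in [("XOR", xor_block), ("2:1 Mux", mux2_block),
                    ("Half Adder", half_adder_block), ("Full Adder", full_adder_block)]:
    print(name, *block())
```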

1.6 SIMULATION AND SYNTHESIS TOOL

Two software tools are used. They are:

i. Xilinx 9.1i software
ii. ModelSim 6.4a software

The Xilinx 9.1i software is used to synthesize the Modified CSLA, and the ModelSim 6.4a
software is used to simulate it.

1.6.1 ModelSim 6.4a Software

ModelSim is a powerful simulator that can be used to simulate the behavior and
performance of logic circuits. The simulator allows the user to apply inputs to the designed
circuit, usually referred to as test vectors, and to observe the outputs generated in response. The
user can use the Waveform Editor to represent the input signals as waveforms.

1.6.1.1 Basic simulation flow

Creating a working library

Compile design files

Run simulation

Debug results

Fig 1.7 Basic Simulation Flow

Creating the working library

In ModelSim, all designs, be they VHDL, Verilog, or some combination thereof, are
compiled into a library. You typically start a new simulation in ModelSim by creating a working
library called "work". "Work" is the library name used by the compiler as the default destination
for compiled design units.

Compiling design

After creating the working library, you compile your design units into it. The ModelSim
library format is compatible across all supported platforms. You can simulate your design on any
platform without having to recompile your design.

Running the simulation

With the design compiled, you invoke the simulator on a top-level module (Verilog) or a
configuration or entity/architecture pair (VHDL). Assuming the design loads successfully, the
simulation time is set to zero, and you enter a run command to begin simulation.

Debugging results

If you don’t get the results you expect, you can use ModelSim’s robust debugging
environment to track down the cause of the problem.

1.6.1.2 Project Flow

A project is a collection mechanism for an HDL design under specification or test. Even
though you don’t have to use projects in ModelSim, they may ease interaction with the tool and
are useful for organizing files and specifying simulation settings. The following Fig 1.8 shows
the basic steps for simulating a design within a ModelSim project.

Create a project

Add files to the project

Compile design file

Run simulation

Debug results

Fig 1.8 Project Flow

The flow is similar to the basic simulation flow. However, there are two important differences:
i. You do not have to create a working library in the project flow; it is done for you automatically.
ii. Projects are persistent. They will open every time you invoke ModelSim unless you specifically
close them.

Chapter 2

Literature review

[1].Low-Power and Area-Efficient Carry Select Adder.


Ram Kumar .B and Kittur H.M, “Low-Power and Area-Efficient Carry Select Adder”,
IEEE transactions on very large scale integration (VLSI) systems, vol. 20, no. 2, February 2012.

Ram Kumar et al. (2012) observed that the design of area- and power-efficient high-speed
data path logic systems is one of the most substantial areas of research in VLSI system design.
In digital adders, the speed of addition is limited by the time required to propagate a carry
through the adder. The sum for each bit position in an elementary adder is generated sequentially
only after the previous bit position has been summed and a carry propagated into the next
position.
[2].Area Efficient Carry Select Adder
Anitha Kumari R D, Nayana N D. “Low power and Area Efficient Carry Select Adder”,
National Conference on Electronics, Communication and Signal Processing, NCECS-2011.

Anitha Kumari et al. (2011) proposed that most of the VLSI applications, such as digital
signal processing, image and video processing, and microprocessors, extensively use arithmetic
operations. Addition, subtraction, and multiplication are examples of the most commonly used
operations. The 1-bit full adder cell is the building block of all these modules. Thus, enhancing
its performance is critical for enhancing the overall module performance.

[3].Improved Carry Select Adder with Reduced Area and Low Power
Consumption
Padma Devi, Ashima Girdher and Balwinder Singh, ”Improved Carry Select Adder
with Reduced Area and Low Power Consumption”, International Journal of Computer
Applications (0975 – 8887), Volume 3 -No.4, June 2010.

Padma Devi et al. (2010) proposed that power dissipation is one of the most important
design objectives in integrated circuits, after speed. As adders are the most widely used
components in such circuits, design of efficient adder is of much concern. The Carry Select
Adder (CSA) provides a good compromise between cost and performance in carry propagation
adder design.

[4].Carry-select adder using single ripple carry adder


Ceiang T Y and Hsiao M J, “Carry-select adder using single ripple carry adder,”
Electron. Lett., vol. 34, no. 22, pp. 2101–2103, Oct. 1998.

Ceiang et al. (1998) proposed that instead of using dual carry-ripple adders a carry select
adder scheme using an add-one circuit to replace one carry-ripple adder requires fewer
transistors. If speed is crucial for this 64-bit adder, then two of the original carry select adder
blocks can be substituted by the proposed scheme with an area saving and the same speed.

[5].Robust high-performance low power adder


Jeong .W and Roy .K, “Robust high-performance low power adder”, Proc. of the Asia
and South Pacific Design Automation Conference, pp. 503-506, 200

Jeong et al. (2003) proposed that a new circuit based on combining XOR gates and
double pass-transistor logic has been developed for implementing a full adder. The main design
objectives for these new circuits are low power consumption.

[6]. An area efficient 64-bit square root carry select adder for low power
applications
He Y, Chang C H, and Gu J, “An area efficient 64-bit square root carry select adder
for low power applications,” in Proc. IEEE Int. Symp. Circuits Syst., 2005, vol. 4, pp. 4082–
4085.

He et al. (2005) noted that the carry select method is deemed to be a good
compromise between cost and performance in carry propagation adder design. However, the
conventional carry select adder (CSLA) is still area-consuming due to the dual ripple carry adder
structure.

[7].Two New Low Power High Performance Full Adders with Minimum Gates
Hosseinghadiry M, Mohammadi H and Nadisenejani M, '' Two New Low Power
High Performance Full Adders with Minimum Gates", World Academy of Science, Engineering
and Technology 52 2009.

Hosseinghadiry et al. (2009) proposed that with increasing circuit complexity and demand
to use portable devices, power consumption is one of the most important parameters these days.
Full adders are the basic block of many circuits. Therefore reducing power consumption in full
adders is very important in low power circuits. One of the most power consuming modules in full
adders is XOR circuit.

[8].64-bit carry-select adder with reduced area


Kim Y and Kim L S, “64-bit carry-select adder with reduced area”, Electron.Lett., vol.
37, no. 10, pp. 614–615, May 2001.

Kim et al. (2001) proposed that a carry select adder can be implemented by using a single
ripple carry adder and an add-one circuit instead of using dual ripple carry adders. A multiplexer-
based add-one circuit is proposed to reduce the area with negligible speed penalty.

[9].Single bit full adder design using 8 transistors with novel 3 transistors
XNOR gate
Manoj Kumar, Sandeep K. Arya and SujataPandey, “Single bit full adder design
using 8 transistors with novel 3 transistors XNOR gate”, International Journal of VLSI design &
Communication Systems (VLSICS) Vol.2, No.4, December 2011.

Manoj Kumar et al. (2011) observed that with the exponential growth of portable electronic
devices such as laptops, multimedia and cellular devices, research efforts in the field of low
power VLSI (Very Large Scale Integration) systems have increased manyfold. With the rise in chip
density, the power consumption of VLSI systems is also increasing, and this further adds to
reliability and packaging problems. The packaging and cooling cost of VLSI systems also goes up
with high power dissipation.

[10].The Design of a High-Performance Full Adder Cell by Combining


Common Digital Gates and Majority Function

KeivanNavi and NedaKhandel, “The Design of a High-Performance Full Adder Cell by


Combining Common Digital Gates and Majority Function”, European Journal of Scientific
Research, ISSN 1450-216X Vol.23 No.4 (2008), pp.626-638.
KeivanNavi et al. (2008) proposed that a new circuit based on combining XOR gates and
double pass-transistor logic has been developed for implementing a full adder. The main design
objectives for these new circuits are low power consumption.

Chapter 3
Existing system

3.1.1 Regular carry select adder


The regular CSLA design uses two sets of ripple carry adders (RCA).

Schematic
The schematic structure of 16-bit regular carry select adder is shown in the Fig 3.1.

Fig 3.1 16-Bit Regular Carry Select Adder Schematic

It has five groups of RCAs of different sizes. The steps leading to the evaluation are as follows.

Fig 3.2 Group 1 Architecture

1. Group 1 has one ripple carry adder. This RCA contains two full adders. The carry input is
denoted as Cin. Cin is given to group 1 together with the data inputs a and b. The full adders
perform the addition and give the sum and carry. That carry is passed to group 2 as the control
(select) line of its multiplexer.

Fig 3.3 Group 2 Architecture

2. Group 2 contains two sets of ripple carry adders. Carry input Cin = 1 is fed to one RCA and
carry input Cin = 0 is fed to the other. The data is given to both RCAs, and each performs the
addition using its own carry input. The first RCA operates with Cin = 1 and contains two full
adders. The other RCA operates with Cin = 0 and contains one half adder and one full adder. The
control line, which is the carry out of group 1, is given to the multiplexer, and the sum and carry
values of the two RCAs are given to the same multiplexer. The multiplexer then produces one sum
and carry according to the control line. If the control line has the value 1, the sum and carry of
the first RCA (Cin = 1) are selected. Otherwise, the sum and carry of the other RCA (Cin = 0) are
selected. The selected carry out is passed on as the control line of the group 3 multiplexer.

Fig 3.4 Group 3 Architecture

3. Group 3 contains two sets of ripple carry adders. Carry input Cin = 1 is fed to one RCA and
carry input Cin = 0 is fed to the other. The data is given to both RCAs, and each performs the
addition using its own carry input. The first RCA operates with Cin = 1 and contains three full
adders. The other RCA operates with Cin = 0 and contains one half adder and two full adders. The
control line, which is the carry out of group 2, is given to the multiplexer, and the sum and carry
values of the two RCAs are given to the same multiplexer. The multiplexer then produces one sum
and carry according to the control line. If the control line has the value 1, the sum and carry of
the first RCA (Cin = 1) are selected. Otherwise, the sum and carry of the other RCA (Cin = 0) are
selected. The selected carry out is passed on as the control line of the group 4 multiplexer.

Fig 3.5 Group 4 Architecture

4. Group 4 contains two sets of ripple carry adders. Carry input Cin = 1 is fed to one RCA and
carry input Cin = 0 is fed to the other. The data is given to both RCAs, and each performs the
addition using its own carry input. The first RCA operates with Cin = 1 and contains four full
adders. The other RCA operates with Cin = 0 and contains one half adder and three full adders. The
control line, which is the carry out of group 3, is given to the multiplexer, and the sum and carry
values of the two RCAs are given to the same multiplexer. The multiplexer then produces one sum
and carry according to the control line. If the control line has the value 1, the sum and carry of
the first RCA (Cin = 1) are selected. Otherwise, the sum and carry of the other RCA (Cin = 0) are
selected. The selected carry out is passed on as the control line of the group 5 multiplexer.

Fig 3.6 Group 5 Architecture

5. Group 5 contains two sets of ripple carry adders. Carry input Cin = 1 is fed to one RCA and
carry input Cin = 0 is fed to the other. The data is given to both RCAs, and each performs the
addition using its own carry input. The first RCA operates with Cin = 1 and contains five full
adders. The other RCA operates with Cin = 0 and contains one half adder and four full adders. The
control line, which is the carry out of group 4, is given to the multiplexer, and the sum and carry
values of the two RCAs are given to the same multiplexer. The multiplexer then produces one sum
and carry according to the control line. If the control line has the value 1, the sum and carry of
the first RCA (Cin = 1) are selected. Otherwise, the sum and carry of the other RCA (Cin = 0) are
selected.

Group 2 has two sets of 2-bit RCA. Based on the delay values in Table 1.4, the arrival time
of the selection input C1 of the 6:3 multiplexer is earlier than that of S3 and later than that of
S2. Thus, the delay of sum3 is the arrival time of S3 plus the multiplexer delay, and the delay of
sum2 is the arrival time of C1 plus the multiplexer delay.

Except for group 2, the arrival time of the multiplexer selection input is always greater than
the arrival time of the data outputs from the RCAs. Thus, the delays of group 3 to group 5 are
determined, respectively, as follows:

{C6, Sum [6:4]} = C3 + Multiplexer


{C10, Sum [10:7]} = C6 + Multiplexer
{Cout, Sum [15:11]} = C10 + Multiplexer

The Group 1 architecture calculation is,

Gate Count = 19 (FA+HA)


FA = 13 (1×13)
HA = 6 (1×6)

The Group 2 architecture calculation is,

Gate Count = 57 (FA+HA+Mux)


FA = 39 (3×13)
HA = 6 (1×6)
Mux = 12 (3×4)

The Group 3 architecture calculation is,

Gate Count = 87 (FA+HA+Mux)


FA = 65 (5×13)
HA = 6 (1×6)
Mux = 16 (4×4)

The Group 4 architecture calculation is,

Gate Count = 117 (FA+HA+Mux)


FA = 91 (7×13)
HA = 6 (1×6)
Mux = 20 (5×4)

The Group 5 architecture calculation is,

Gate Count = 147 (FA+HA+Mux)


FA = 117 (9×13)
HA = 6 (1×6)
Mux = 24 (6×4)

25
3.1.2 Area calculation for RCSLA
The area of the regular CSLA is derived in the following steps. From the structure of the RCSLA,
the area is calculated for the 8-bit, 16-bit, 32-bit and 64-bit word sizes.

8-bit RCSLA

Initial carry input Cin:
2-bit RCA = 1
Carry input Cin = 0:
2-bit RCA = 1
4-bit RCA = 1
Carry input Cin = 1:
2-bit RCA = 1
4-bit RCA = 1

The gate counts of these building blocks are tabulated in Table 1.5.

Table 1.5 Area count of 8-bit RCSLA

Word size & Adder    Number of gates
2-bit RCA            19
4-bit RCA            45
2:1 Mux              4

The total area of the 8-bit regular CSLA is 179 gates. The total area for each word size is
tabulated in Table 1.6.

Table 1.6 Regular CSLA Area

Word Size    Adder    Area (no. of gates)
8-bit        RCSLA    179
16-bit       RCSLA    399
32-bit       RCSLA    839
64-bit       RCSLA    1719

Chapter 4

Proposed system

4.1 Modified carry select adder

4.1.1 Schematic

The block schematic of the 16-bit Modified Carry Select Adder (MCSLA) is shown in Fig
4.1.

Fig 4.1 16-Bit Modified Carry Select Adder Schematic

The structure of the carry select adder that uses a binary to excess-1 converter (BEC) in place
of the RCA with Cin = 1, in order to optimize area and power, is shown in Fig 4.1. In the proposed
method, the RCA with Cin = 1 is replaced by the BEC.

An n-bit RCA is replaced by an (n+1)-bit BEC, which uses fewer gates than the RCA. The 16-bit
modified carry select adder of Fig 4.1 has five groups, with RCAs and BECs of different sizes. The
steps leading to the evaluation are given here.
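As an illustration of the BEC, the sketch below gives a gate-level 3-bit binary to excess-1 converter, the size used with a 2-bit RCA. It uses one NOT, one AND and two XOR gates. The module and signal names (bec3, b, x) are assumptions made for this example, and the small test bench is not synthesizable.

```verilog
// 3-bit binary to excess-1 converter: x = b + 1 (hypothetical module name).
module bec3 (
    input  [2:0] b,
    output [2:0] x
);
    wire c;                       // b[1] AND b[0]

    not g0 (x[0], b[0]);          // X0 = ~B0
    xor g1 (x[1], b[1], b[0]);    // X1 = B0 ^ B1
    and g2 (c,    b[1], b[0]);
    xor g3 (x[2], b[2], c);       // X2 = B2 ^ (B0 & B1)
endmodule

// Test bench (simulation only): checks x == b + 1 for all eight inputs.
module bec3_tb;
    reg  [2:0] b;
    wire [2:0] x;
    integer    i;

    bec3 dut (.b(b), .x(x));

    initial begin
        for (i = 0; i < 8; i = i + 1) begin
            b = i;                // truncated to 3 bits
            #1;
            if (x !== b + 3'd1)
                $display("mismatch: b=%b x=%b", b, x);
        end
        $display("bec3 check finished");
        $finish;
    end
endmodule
```

In a group of the modified adder, b[1:0] would be the sum bits and b[2] the carry out of the 2-bit RCA with Cin = 0.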

Fig 4.2 Group 1 Architecture


1. Group 1 has one ripple carry adder, which contains two full adders. The carry input Cin and the
data inputs a and b are applied to group 1. The full adders perform the addition and produce the
sum and carry. The carry is passed to group 2, where it is the control (select) line of the
multiplexer.

Fig 4.3 Group 2 Architecture

2. Group 2 contains one ripple carry adder and one BEC. The RCA is evaluated with carry input
Cin = 0 and consists of one half adder and one full adder. The binary to excess-1 converter (BEC)
is used instead of the RCA with Cin = 1 of the regular CSLA to achieve low area and low power
consumption. The carry out of group 1 is the control line of the multiplexer, which receives the
sum and carry outputs of both the RCA and the BEC and produces one sum and one carry. If the
control line is 0, the RCA’s sum and carry are selected; otherwise the BEC’s sum and carry are
selected. The selected carry out is the control line of the group 3 multiplexer.

Fig 4.4 Group 3 Architecture

3. Group 3 contains one ripple carry adder and one BEC. The RCA is evaluated with carry input
Cin = 0 and consists of one half adder and two full adders. The binary to excess-1 converter (BEC)
is used instead of the RCA with Cin = 1 of the regular CSLA to achieve low area and low power
consumption. The carry out of group 2 is the control line of the multiplexer, which receives the
sum and carry outputs of both the RCA and the BEC and produces one sum and one carry. If the
control line is 0, the RCA’s sum and carry are selected; otherwise the BEC’s sum and carry are
selected. The selected carry out is the control line of the group 4 multiplexer.

Fig 4.5 Group 4 Architecture

4. Group 4 contains one ripple carry adder and one BEC. The RCA is evaluated with carry input
Cin = 0 and consists of one half adder and three full adders. The binary to excess-1 converter
(BEC) is used instead of the RCA with Cin = 1 of the regular CSLA to achieve low area and low
power consumption. The carry out of group 3 is the control line of the multiplexer, which receives
the sum and carry outputs of both the RCA and the BEC and produces one sum and one carry. If the
control line is 0, the RCA’s sum and carry are selected; otherwise the BEC’s sum and carry are
selected. The selected carry out is the control line of the group 5 multiplexer.

Fig 4.6 Group 5 Architecture

5. Group 5 contains one ripple carry adder and one BEC. The RCA is evaluated with carry input
Cin = 0 and consists of one half adder and four full adders. The binary to excess-1 converter
(BEC) is used instead of the RCA with Cin = 1 of the regular CSLA to achieve low area and low
power consumption. The carry out of group 4 is the control line of the multiplexer, which receives
the sum and carry outputs of both the RCA and the BEC and produces one sum and one carry. If the
control line is 0, the RCA’s sum and carry are selected; otherwise the BEC’s sum and carry are
selected.
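The five groups can be put together as in the following behavioural sketch. It is an illustration under stated assumptions rather than the project's actual code: the module names mcsla_group, mcsla16 and mcsla16_tb are hypothetical, the RCA and the BEC are modelled with the + operator, and the group boundaries (2, 2, 3, 4 and 5 bits) follow the Sum[6:4], Sum[10:7] and Sum[15:11] ranges used earlier.

```verilog
// One MCSLA group: N-bit RCA with Cin = 0, (N+1)-bit BEC, and a multiplexer.
module mcsla_group #(parameter N = 2) (
    input  [N-1:0] a, b,
    input          sel,             // carry out of the previous group
    output [N-1:0] sum,
    output         cout
);
    wire [N:0] r0 = a + b;          // RCA result with Cin = 0, bit N is its carry
    wire [N:0] r1 = r0 + 1'b1;      // BEC: adds one to the RCA result

    assign {cout, sum} = sel ? r1 : r0;   // sel = 0 selects the RCA, sel = 1 the BEC
endmodule

module mcsla16 (
    input  [15:0] a, b,
    input         cin,
    output [15:0] sum,
    output        cout
);
    wire c1, c3, c6, c10;

    // Group 1: plain 2-bit RCA with the external carry input.
    assign {c1, sum[1:0]} = a[1:0] + b[1:0] + cin;

    mcsla_group #(2) g2 (.a(a[3:2]),   .b(b[3:2]),   .sel(c1),  .sum(sum[3:2]),   .cout(c3));
    mcsla_group #(3) g3 (.a(a[6:4]),   .b(b[6:4]),   .sel(c3),  .sum(sum[6:4]),   .cout(c6));
    mcsla_group #(4) g4 (.a(a[10:7]),  .b(b[10:7]),  .sel(c6),  .sum(sum[10:7]),  .cout(c10));
    mcsla_group #(5) g5 (.a(a[15:11]), .b(b[15:11]), .sel(c10), .sum(sum[15:11]), .cout(cout));
endmodule

// Test bench (simulation only): compares the adder with the + operator.
module mcsla16_tb;
    reg  [15:0] a, b;
    reg         cin;
    wire [15:0] sum;
    wire        cout;
    integer     i, errors;

    mcsla16 dut (.a(a), .b(b), .cin(cin), .sum(sum), .cout(cout));

    initial begin
        errors = 0;
        for (i = 0; i < 1000; i = i + 1) begin
            a = $random; b = $random; cin = $random;
            #1;
            if ({cout, sum} !== a + b + cin) errors = errors + 1;
        end
        $display("errors = %0d", errors);
        $finish;
    end
endmodule
```

The behavioural description hides the gate-level structure, so it is useful for checking the function of the grouping but not for reproducing the gate counts of this chapter.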

Group 2 has one 2-bit RCA, consisting of one FA and one HA, for Cin = 0. Instead of a second 2-bit
RCA with Cin = 1, a 3-bit BEC is used, which adds one to the output of the 2-bit RCA. From the
delay values, the selection input C1 [time (t) = 7] of the 6:3 multiplexer arrives earlier than
S3 [t = 9] and C3 [t = 10] and later than S2 [t = 4]. Thus sum3 depends on S3 and the multiplexer
delay, the final C3 (output of the multiplexer) depends on the partial C3 (input to the
multiplexer) and the multiplexer delay, and sum2 depends on C1 and the multiplexer delay.

For the remaining groups, the arrival time of the multiplexer selection input is always greater
than the arrival time of the data inputs from the BECs. Thus, the delay of the remaining groups
depends on the arrival time of the multiplexer selection input and the multiplexer delay.

4.1.2 Area calculation for MCSLA

The area calculation of modified CSLA is derived from the following steps. From the
structure of MCSLA, 8-bit, 16-bit, 32-bit and 64-bit area is calculated.
The Group 1 architecture calculation is:

Gate Count = 19 (FA + HA)
FA = 13 (1×13)
HA = 6 (1×6)

The Group 2 architecture calculation is:

Gate Count = 43 (FA + HA + Mux + BEC)
FA = 13 (1×13)
HA = 6 (1×6)
Mux = 12 (3×4)
BEC:
AND = 1
NOT = 1
XOR = 10 (2×5)

The Group 3 architecture calculation is:

Gate Count = 61 (FA + HA + Mux + BEC)
FA = 26 (2×13)
HA = 6 (1×6)
Mux = 16 (4×4)
BEC:
AND = 2
NOT = 1
XOR = 15 (3×5)

The Group 4 architecture calculation is:

Gate Count = 84 (FA + HA + Mux + BEC)
FA = 39 (3×13)
HA = 6 (1×6)
Mux = 20 (5×4)
BEC = 24

The Group 5 architecture calculation is:

Gate Count = 107 (FA + HA + Mux + BEC)
FA = 52 (4×13)
HA = 6 (1×6)
Mux = 24 (6×4)
BEC = 30

The total area of the modified CSLA for each word size is tabulated in Table 1.7.

Table 1.7 Modified CSLA Area

Word Size    Adder    Area (no. of gates)
8-bit        MCSLA    145
16-bit       MCSLA    311
32-bit       MCSLA    643
64-bit       MCSLA    1307

Chapter 5
System requirement

5.1. Introduction to Verilog HDL

Verilog HDL is one of the two most common Hardware Description Languages (HDLs) used by
integrated circuit (IC) designers. The other one is VHDL. HDLs allow the design to be simulated
earlier in the design cycle in order to correct errors or experiment with different architectures.
Designs described in HDL are technology-independent, easy to design and debug, and are usually
more readable than schematics, particularly for large circuits.
Verilog can be used to describe designs at four levels of abstraction:
(i) Algorithmic level (much like C code with if, case and loop statements).
(ii) Register transfer level (RTL uses registers connected by Boolean equations).
(iii) Gate level (interconnected AND, NOR etc.).
(iv) Switch level (the switches are MOS transistors inside gates).
The language also defines constructs that can be used to control the input and output of
simulation. More recently Verilog is used as an input for synthesis programs which will generate
a gate-level description (a net list) for the circuit. Some Verilog constructs are not synthesizable.
Also the way the code is written will greatly affect the size and speed of the synthesized circuit.
Most readers will want to synthesize their circuits, so non-synthesizable constructs should be
used only for test benches. These are program modules used to generate I/O needed to simulate
the rest of the design. The words “not synthesizable” will be used for examples and constructs as
needed that do not synthesize.
There are two types of code in most HDLs:
Structural, which is a verbal wiring diagram without storage.
    assign a = b & c | d; /* “|” is an OR */
    assign d = e & (~c);
Here the order of the statements does not matter. Changing e will change a.
Procedural, which is used for circuits with storage, or as a convenient way to write conditional
logic.
    always @(posedge clk) // Execute the next statement on every rising clock edge.
        count <= count + 1;
Procedural code is written like C code and assumes every assignment is stored in memory until
overwritten. For synthesis, with flip-flop storage, this type of thinking generates too much
storage. However, people prefer procedural code because it is usually much easier to write; for
example, if and case statements are only allowed in procedural code. As a result, the synthesizers
have been constructed which can recognize certain styles of procedural code as actually
combinational. They generate a flip-flop only for left-hand variables which truly need to be
stored. However, if you stray from this style, beware: your synthesis will start to fill with
superfluous latches. This chapter introduces the basic and most common Verilog behavioral and
gate-level modeling constructs, as well as Verilog compiler directives and system functions. A full
description of the language can be found in the Cadence Verilog-XL Reference Manual and the
Synopsys HDL Compiler for Verilog Reference Manual. The latter emphasizes only those Verilog
constructs that are supported for synthesis by the Synopsys Design Compiler synthesis tool.
In all examples, Verilog keywords are shown in boldface. Comments are shown in italics.
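The two styles of code can be seen side by side in the sketch below. The module name style_demo and its ports are assumptions made for illustration. The last always block shows the procedural style that synthesizers recognise as combinational, because y is assigned on every path and so no latch is needed.

```verilog
module style_demo (
    input            clk,
    input            b, c, e,
    input            sel,
    output           a,
    output reg [3:0] count,
    output reg       y
);
    wire d;

    // Structural code: continuous assignments, the order does not matter.
    assign a = b & c | d;
    assign d = e & (~c);

    // Procedural code with storage: one flip-flop per bit of count.
    // (count starts as x in simulation because it is never reset in this sketch.)
    always @(posedge clk)
        count <= count + 1;

    // Procedural code that is really combinational: y is assigned in both
    // branches, so no latch is inferred.
    always @(*) begin
        if (sel) y = b;
        else     y = c;
    end
endmodule
```

Removing the else branch of the last block would leave y unassigned when sel is 0, which is the kind of code that produces a superfluous latch.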

5.1.1. Lexical Tokens

A Verilog source text file consists of the following lexical tokens:


White Space
White spaces separate words and can contain spaces, tabs, new-lines and form feeds. Thus a
statement can extend over multiple lines without special continuation characters.
Comments
Comments can be specified in two ways (exactly the same way as in C/C++):
- Begin the comment with double slashes (//). All text between these characters and the end of the
line will be ignored by the Verilog compiler.
- Enclose comments between the characters /* and */. Using this method allows you to continue
comments on more than one line. This is good for “commenting out” many lines of code, or for very
brief in-line comments.

Numbers
Number storage is defined as a number of bits, but values can be specified in binary, octal,
decimal or hexadecimal.
Examples are 3'b001, a 3-bit number, 5'd30 (= 5'b11110), and 16'h5ED4 (= 16'd24276).
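The number examples above can be checked in simulation with a short test module such as the following sketch (the module name number_demo is an assumption, and the code is not synthesizable).

```verilog
module number_demo;
    initial begin
        // <size>'<base><value>: size in bits, base b/o/d/h.
        $display("%b", 3'b001);
        $display("%b %0d", 5'd30, 5'd30);
        $display("%h %0d", 16'h5ED4, 16'h5ED4);

        // The same value written in two bases compares equal.
        if (5'd30 === 5'b11110 && 16'h5ED4 === 16'd24276)
            $display("literals agree");
        $finish;
    end
endmodule
```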

Identifiers
Identifiers are user-defined words for variables, function names, module names, block names and
instance names.
Identifiers begin with a letter or underscore (not with a digit or $) and can include any number
of letters, digits, underscores and dollar signs. Identifiers in Verilog are case-sensitive.

Syntax
Allowed symbols: ABCDE . . . abcdef . . . 1234567890 _ $
Not allowed: anything else, especially - & # @

5.1.2. Operators
Operators are one, two and sometimes three characters used to perform operations on variables.
Examples include >, +, ~, &, !=. Operators are described in detail in “Operators”.

5.1.3. Verilog Keywords


These are words that have special meaning in Verilog. Some examples are assign, case, while,
wire, reg, and, or, nand, and module. They should not be used as identifiers. Refer to the Cadence
Verilog-XL Reference Manual for a complete listing of Verilog keywords. A number of them are
introduced in this chapter. Verilog keywords also include Compiler Directives and System
Tasks and Functions.

5.1.4. Gate-Level Modeling

Primitive logic gates are part of the Verilog language. Two properties can be specified, drive
strength and delay. Drive strength specifies the strength at the gate outputs. The strongest output
is a direct connection to a source, next comes a connection through a conducting transistor, then a
resistive pull-up/down. The drive strength is usually not specified, in which case the strengths
default to strong1 and strong0. Refer to the Cadence Verilog-XL Reference Manual for more
details on strengths.

Delays: If no delay is specified, then the gate has no propagation delay; if two delays are
specified, the first represents the rise delay and the second the fall delay; if only one delay is
specified, then rise and fall are equal. Delays are ignored in synthesis. This method of specifying
delay is a special case of “Parameterized Modules”. The parameters for the primitive gates have
been predefined as delays.

5.1.5. Basic Gates


These implement the basic logic gates. They have one output and one or more inputs. In a gate
instantiation, the gate type is one of the keywords and, nand, or, nor, xor, xnor.

buf, not Gates
These implement buffers and inverters, respectively. They have one input and one or more
outputs. In a gate instantiation, the gate type is either the keyword buf or not.

Three-State Gates: bufif1, bufif0, notif1, notif0
These implement 3-state buffers and inverters. They propagate z (3-state or high-impedance) if
their control signal is deasserted. These can have three delay specifications: a rise time, a fall
time, and a time to go into 3-state.
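A minimal sketch of gate instantiation for the three families of primitives is given below. The module name gate_demo, the instance names and the delay values are arbitrary choices for illustration.

```verilog
module gate_demo (
    input  a, b, en,
    output y_and, y_nor, y_buf, y_inv, y_tri
);
    // Basic gates: the output comes first, followed by one or more inputs.
    and          g1 (y_and, a, b);
    nor  #(2)    g2 (y_nor, a, b);     // one delay: rise = fall = 2

    // buf and not: one or more outputs first, the single input last.
    buf          g3 (y_buf, a);
    not  #(1, 2) g4 (y_inv, a);        // rise delay 1, fall delay 2

    // Three-state buffer: output, data input, control. Drives z when en = 0.
    bufif1 #(1, 2, 3) g5 (y_tri, a, en);   // rise, fall and turn-off delays
endmodule
```

The delays affect simulation only; as noted above, they are ignored in synthesis.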

5.1.6. Data Types

Value Set
Verilog has only four basic values. Almost all Verilog data types store all these values:
0 (logic zero, or false condition)
1 (logic one, or true condition)
x (unknown logic value)
z (high impedance state)
x and z have limited use for synthesis.

Wire

A wire represents a physical wire in a circuit and is used to connect gates or modules. The value
of a wire can be read, but not assigned to, in a function or block. A wire does not store its
value but must be driven by a continuous assignment statement or by connecting it to the output of
a gate or module. Other specific types of wires include:
wand (wired-AND): the value of a wand depends on the logical AND of all the drivers connected to it.
wor (wired-OR): the value of a wor depends on the logical OR of all the drivers connected to it.
tri (three-state): all drivers connected to a tri must be z, except one (which determines the
value of the tri).

Reg
Declare type reg for all data objects on the left-hand side of expressions in initial and always
procedures, or functions. A reg is the data type that must be used for latches, flip-flops and
memories. However, it often synthesizes into wires rather than storage. In multi-bit registers,
data is stored as unsigned numbers and no sign extension is done for what the user might have
thought were two’s complement numbers.

Input, Output, Inout

These keywords declare input, output and bidirectional ports of a module or task. Input and
inout ports are of type wire. An output port can be configured to be of type wire, reg, wand, wor
or tri. The default is wire.

Integer
Integers are general-purpose variables. For synthesis they are used mainly for loop indices,
parameters, and constants. See “Parameter”. They are implicitly of type reg. However, they store
data as signed numbers, whereas explicitly declared reg types store them as unsigned.
If they hold numbers which are not defined at compile time, their size will default to 32 bits. If
they hold constants, the synthesizer adjusts them to the minimum width needed at compilation.

Supply0, Supply1
Supply0 and supply1 define wires tied to logic 0 (ground) and logic 1 (power), respectively.

Time
Time is a 64-bit quantity that can be used in conjunction with the $time system task to hold
simulation time. Time is not supported for synthesis and hence is used only for simulation
purposes.

Parameter
Parameters allow constants like word length to be defined symbolically in one place. This makes
it easy to change the word length later, by changing only the parameter. See also “Parameterized
Modules”. An alternative way to do the same thing is to use macro substitution, see “Macro
Definitions”.
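The data types of this section can be seen together in the following sketch. The module name datatype_demo, the parameter WIDTH and the bit-reversal function are assumptions chosen only to give each data type something to do.

```verilog
module datatype_demo #(parameter WIDTH = 8) (   // parameter: word length in one place
    input  wire             clk,
    input  wire [WIDTH-1:0] d,
    output wire             any_set,
    output reg  [WIDTH-1:0] q
);
    supply0 gnd;            // wire tied to logic 0 (declared for illustration, unused)
    supply1 vdd;            // wire tied to logic 1 (declared for illustration, unused)
    integer i;              // signed 32-bit variable, used as a loop index
    reg [WIDTH-1:0] rev;    // reg that synthesizes into wires, not storage

    // A wire must be driven, here by a continuous assignment.
    assign any_set = |d;

    // reg assigned in a combinational always block: reverses the bit order of d.
    always @(*) begin
        for (i = 0; i < WIDTH; i = i + 1)
            rev[i] = d[WIDTH-1-i];
    end

    // reg assigned on a clock edge: becomes WIDTH flip-flops.
    always @(posedge clk)
        q <= rev;
endmodule
```

An instance such as datatype_demo #(16) u1 (...) would change the word length without editing the module.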

5.1.6. Operators

Arithmetic Operators

These perform arithmetic operations. The + and - can be used as either unary (-z) or binary (x-y)
operators.

Relational Operators

Relational operators compare two operands and return a single bit, 1 or 0. These operators
synthesize into comparators. Wire and reg variables are unsigned (always positive). Thus
(-3'b001) == 3'b111 and (-3'b001) > 3'b110. However, for integers, -1 < 6.

Bit-wise Operators

Bit-wise operators do a bit-by-bit comparison between two operands. However, see “Reduction
Operators”.

Logical Operators

Logical operators return a single bit 1 or 0. They are the same as bit-wise operators only for
single bit operands. They can work on expressions, integers or groups of bits, and treat all values
that are nonzero as “1”. Logical operators are typically used in conditional (if ... else) statements
since they work with expressions.

Reduction Operators

Reduction operators operate on all the bits of an operand vector and return a single-bit value.
These are the unary (one argument) form of the bit-wise operators above.

Shift Operators

Shift operators shift the first operand by the number of bits specified by the second operand.
Vacated positions are filled with zeros for both left and right shifts.

Concatenation Operator

The concatenation operator combines two or more operands to form a larger vector.
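The operator classes above can be tried out in simulation with a sketch like the following (the module name operator_demo and the test values are assumptions, and the code is for simulation only).

```verilog
module operator_demo;
    reg  [2:0] a;
    reg  [3:0] v;
    integer    i;

    initial begin
        a = 3'b001; v = 4'b1010; i = -1;

        // Relational: reg values are unsigned, integers are signed.
        $display("%b %b %b", (-a) == 3'b111, (-a) > 3'b110, i < 6);

        // Bit-wise (4-bit result), logical (1-bit result), reduction (1-bit result).
        $display("%b %b %b", v & 4'b0110, v && 4'b0110, &v);

        // Shifts fill vacated positions with zeros; concatenation builds a 7-bit vector.
        $display("%b %b %b", v << 1, v >> 1, {v, a});
        $finish;
    end
endmodule
```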

5.1.7. Operands

Literals
Literals are constant-valued operands that can be used in Verilog expressions. The two common
Verilog literals are:
(a) String: A string literal is a one-dimensional array of characters enclosed in double quotes (" ").
(b) Numeric: constant numbers specified in binary, octal, decimal or hexadecimal.

Wires, Regs, and Parameters

Wires, regs and parameters can also be used as operands in Verilog expressions. These data
objects are described in more detail in “Data Types”.

Bit-Selects “x[3]” and Part-Selects “x[5:3]”


Bit-selects and part-selects are a selection of a single bit and a group of bits, respectively, from a
wire, reg or parameter vector using square brackets “[ ]”. Bit-selects and part-selects can be used
as operands in expressions in much the same way that their parent data objects are used.

Function Calls
The return value of a function can be used directly in an expression without first assigning it to a
register or wire variable. Simply place the function call as one of the operands. Make sure you
know the bit width of the return value of the function call. Construction of functions is described
in “Functions”.

5.1.8. Modules

Module Declaration
A module is the principal design entity in Verilog. The first line of a module declaration specifies
the name and port list (arguments). The next few lines specify the I/O type (input, output or
inout) and width of each port. The default port width is 1 bit.
Then the port variables must be declared wire, wand, . . ., reg. The default is wire. Typically
inputs are wire since their data is latched outside the module. Outputs are type reg if their signals
are stored inside an always or initial block.

Continuous Assignment

The continuous assignment is used to assign a value onto a wire in a module. It is the normal
assignment outside of always or initial blocks. Continuous assignment is done with an explicit
assign statement or by assigning a value to a wire during its declaration. Note that continuous
assignment statements are concurrent and are continuously executed during simulation. The order
of assign statements does not matter. Any change in any of the right-hand-side inputs will
immediately change a left-hand-side output.

Module Instantiations

Module declarations are templates from which one creates actual objects (instantiations).
Modules are instantiated inside other modules, and each instantiation creates a unique object
from the template. The exception is the top-level module, which is its own instantiation.
The instantiated module’s ports must be matched to those defined in the template. This is
specified:
(i) by name, using a dot (.): “.template_port_name (name_of_wire_connected_to_port)”, or
(ii) by position, placing the ports in exactly the same positions in the port lists of both the
template and the instance.

Modules may not be instantiated inside procedural blocks.
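Both ways of connecting ports are shown in the sketch below, which builds a full adder from two half adders. The module and signal names are assumptions chosen for this example.

```verilog
// Template: a half adder described with continuous assignments.
module half_adder (input a, b, output sum, carry);
    assign sum   = a ^ b;
    assign carry = a & b;
endmodule

// A full adder built from two instances of the half adder template.
module full_adder (input a, b, cin, output sum, cout);
    wire s1, c1, c2;

    // (i) Connection by name: the order of the connections is free.
    half_adder ha1 (.a(a), .b(b), .sum(s1), .carry(c1));

    // (ii) Connection by position: must follow the port order a, b, sum, carry.
    half_adder ha2 (s1, cin, sum, c2);

    assign cout = c1 | c2;
endmodule
```

Connection by name is longer to write but does not break silently if the port order of the template changes.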

5.2. STEP BY STEP PROCEDURE TO RUN A PROGRAM ON FPGA BOARD


This section explains how to program ("burn") a design onto a Xilinx FPGA board. A step-by-step
description of MATLAB+ISE co-simulation using System Generator for Spartan/Virtex FPGAs is a useful
starting point. Before programming the Xilinx board, check the program output in MATLAB+ISE
co-simulation using System Generator.

Software and hardware used:

- Xilinx ISE 14.1
- Nexys™3 Spartan-6 FPGA Board (XC6SLX16-CSG324)

A simple AND operation is used here as the example for the step-by-step procedure to run a
program on hardware.

Step 1: Open Xilinx ISE and create a new project. Click on Next.

Step 2: Select the Family, Device, Package and Speed of the Xilinx board and also select the
programming language (Verilog/VHDL). Verilog is used here. Click on Next. The Project Summary
window appears. Click on Finish.

Step 3: Click on Project > New Source.

Step 4: Select Verilog Module as the source type and enter the file name (ANDing_code). Click on
Next.

Step 5: Define Module window. Here the inputs and outputs and their bit/bus sizes are defined. In
the AND example there are two inputs and one output, each a single bit. Also define a clock signal
for clocked operation. Click on Next.

Step 6: Summary window. Click on Finish.

Step 7: The Project Navigator window now shows the new module. All the inputs and outputs were
already defined in the Define Module window, so they appear in the module in Project Navigator.

Step 8: Write the code for the AND operation in the module shown in Project Navigator.
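A module for Step 8 might look like the following sketch. Only the module name ANDing_code and the presence of two single-bit inputs, one single-bit output and a clock come from the steps above. The port names clk, a, b and y are hypothetical, and registering the output on the clock edge is one possible choice.

```verilog
module ANDing_code (
    input      clk,   // clock signal defined in Step 5
    input      a,     // hypothetical name: first single-bit input
    input      b,     // hypothetical name: second single-bit input
    output reg y      // hypothetical name: single-bit output
);
    // The AND of the two inputs is stored on every rising clock edge.
    always @(posedge clk)
        y <= a & b;
endmodule
```

A purely combinational alternative would declare y as a wire and use assign y = a & b; in which case the clock port is not needed.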

Step 9: Click on Project > New Source. Select Implementation Constraints File as the source type
and enter the file name (e.g. pinout). Click on Next, then click on Finish.

Step 10: Write the inputs, the outputs and their pin locations in the proper format of the .ucf
file (use the datasheet of the Xilinx board for the pin locations). Here two switches, SW0 and
SW1, are used as inputs and one LED, LD0, is used as the output.

Step 11: Open the main ANDing program and double click on Synthesize – XST. After synthesis
completes successfully, double click on Implement Design. Implement Design consists of three
parts:

- Translate
- Map
- Place and Route

After Implement Design completes successfully, double click on Programming File Generation.
Programming File Generation produces a bit stream for Xilinx device configuration.

Step 12: Double click on Configure Target Device. A new ISE iMPACT window opens.

Step 13: Connect the Xilinx board to your PC/laptop using a USB cable.

Step 14: Double click on Boundary Scan. Check the cable connection with Output > Cable Auto
Connect. If the cable is connected, the status is shown in the bottom part of the window.

Step 15: Click on File > Initialize Chain. iMPACT then asks “Do you want to continue and assign
configuration file(s)?” Click on Yes and select the generated bit-stream file.

Step 16: Click on Open. iMPACT then asks “Do you want to attach an SPI or BPI PROM to this
device?” Click on No. Click on Operations > Program. If programming is successful, iMPACT shows
Program Succeeded.

Step 17: Check the output on the hardware (board). Here the inputs are given through the switches
and the output is shown on the LED. Output:

1. Switch1(OFF) AND Switch2(OFF) = LED(OFF)
2. Switch1(ON) AND Switch2(OFF) = LED(OFF)
3. Switch1(OFF) AND Switch2(ON) = LED(OFF)
4. Switch1(ON) AND Switch2(ON) = LED(ON)

Chapter 6
Results and discussion

6.1 16-bit modified CSLA

6.1.1 Design summary

Fig 6.1 Design summary of 16-bit MCSLA


The design summary above reports the number of LUTs, IOBs and Slices used. The utilization of
Slices is 0%, of LUTs is 0% and of bonded IOBs is 20% of the resources available in the Spartan 3E
FPGA kit.

6.1.2 RTL schematic results in Xilinx 9.1i software

Fig 6.2 RTL Schematic of Modified CSLA Architecture

Fig 6.2 is the RTL schematic of the Modified Carry Select Adder (MCSLA), showing the internal
blocks of the 16-bit modified carry select adder. The first block consists of a 2-bit RCA whose
carry input is 0. The next block consists of a 2-bit RCA and a 3-bit BEC. The 2-bit RCA has
Cin = 0 and the 3-bit BEC takes the place of the RCA with Cin = 1. The carry of the first block
drives the select (control) line of the multiplexer of the next block. The sum value is obtained
at the output of the multiplexer.

6.1.3 Technology schematic of MCSLA

Fig 6.3 Technology schematic of MCSLA

Fig 6.3 is the technology schematic of the 64-bit modified carry select adder architecture,
which is the schematic representation of the MCSLA design architecture.

6.1.4 Simulation results in Xilinx 9.1i software

Fig 6.4 Simulation Results of MCSLA in Xilinx

Fig 6.4 shows the simulation results of the modified carry select adder in Xilinx 14.1i.
The output is the same as in ModelSim. Here the input data, in binary, is
‘a = 0011010011100011 and b = 0010011110101001’ (34E3 and 27A9 in hexadecimal). The output result
is ‘Sum = 0101110010001100 and Cout = 0’ (5C8C in hexadecimal).

6.2 Comparison of the regular and modified CSLA

Table 1.8 Comparison between RCSLA and MCSLA

Word Size    Adder            Area (in Gates)    Delay (in ns)    Power (in mW)
8-bit        Regular CSLA     179                27.867           0.367
             Modified CSLA    145                28.975           0.183
16-bit       Regular CSLA     399                28.003           0.538
             Modified CSLA    311                29.204           0.469
32-bit       Regular CSLA     839                28.095           1.176
             Modified CSLA    643                30.002           0.982
64-bit       Regular CSLA     1719               29.456           2.163
             Modified CSLA    1307               31.622           2.84

The table above compares the regular and modified CSLA. The 64-bit modified CSLA has a slightly
larger delay, with an overhead of 2.166 ns. The area and power of the modified CSLA are
significantly reduced, by 17% and 15% respectively. The modified CSLA architecture is low power,
low area, simple and efficient for VLSI hardware implementation.

Chapter 7

Conclusion

A simple approach is proposed in this project to reduce the area and power of the CSLA.
The reduced number of gates in this work offers a great advantage in the reduction of area and
of total power. The compared results show that the modified CSLA has a slightly larger
delay (only 2.166 ns), but the area and power of the 64-bit modified CSLA are significantly
reduced, by 17% and 15% respectively. The modified CSLA architecture is therefore low area,
low power, simple and efficient for VLSI hardware implementation. The design can be used in
various applications such as multipliers and DSP blocks that execute algorithms like FFT, FIR and
IIR. Using a model-based approach, it could be implemented and tested using Xilinx tools.

Chapter 8

Future work

It would be interesting to test the design of a modified 128-bit SQRT CSLA, and to use a
Carry Look-ahead Adder (CLA) instead of the RCA with Cin = 0 in the modified carry select adder
(MCSLA) to achieve higher speed.

Bibliography

The following books and study materials were consulted:

1. VHDL Reference guide


2. Synthesis and Simulation Design Guide
3. ModelSim XE User Manual

Online helps:

1. Google

References

1. Anitha Kumari R D, Nayana N D, “Low power and Area Efficient Carry Select Adder”,
National Conference on Electronics, Communication and Signal Processing, NCECS-2011.

2. Bedrij O J, “Carry-select adder,” IRE Trans. Electron. Comput., pp. 340–344, 1962.

3. Ceiang T Y and Hsiao M J, “Carry-select adder using single ripple carry adder,” Electron.
Lett., vol. 34, no. 22, pp. 2101–2103, Oct. 1998.

4. Jeong W and Roy K, “Robust high-performance low power adder”, Proc. of the Asia and
South Pacific Design Automation Conference, pp. 503–506, 2003.

5. He Y, Chang C H, and Gu J, “An area efficient 64-bit square root carry select adder for low
power applications,” in Proc. IEEE Int. Symp. Circuits Syst., 2005, vol. 4, pp. 4082–4085.

6. Hosseinghadiry M, Mohammadi H and Nadisenejani M, “Two New Low Power High
Performance Full Adders with Minimum Gates”, World Academy of Science, Engineering and
Technology 52, 2009.

7. Keivan Navi and Neda Khandel, “The Design of a High-Performance Full Adder Cell by
Combining Common Digital Gates and Majority Function”, European Journal of Scientific
Research, ISSN 1450-216X, Vol. 23, No. 4 (2008), pp. 626–638.

8. Kim Y and Kim L S, “64-bit carry-select adder with reduced area”, Electron. Lett., vol. 37, no.
10, pp. 614–615, May 2001.

9. Manoj Kumar, Sandeep K. Arya and Sujata Pandey, “Single bit full adder design using 8
transistors with novel 3 transistors XNOR gate”, International Journal of VLSI design &
Communication Systems (VLSICS), Vol. 2, No. 4, December 2011.

10. Massimo Alioto and Gaetano Palumbo, “Optimized Design of Carry-Bypass Adders”,
ECCTD’01 - European Conference on Circuit Theory and Design, August 28-31, 2001, Espoo,
Finland.

11. Padma Devi, Ashima Girdher and Balwinder Singh, “Improved Carry Select Adder with
Reduced Area and Low Power Consumption”, International Journal of Computer Applications
(0975 – 8887), Volume 3, No. 4, June 2010.

12. Ramkumar B and Kittur H M, “Low-Power and Area-Efficient Carry Select Adder”, IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, February 2012.

13. Saiful Islam Md, Muhammad Mahbubur Rahman, Zerina Begum and Mohd. Zulfiquar Hafiz,
“Fault Tolerant Reversible Logic Synthesis: Carry Look-Ahead and Carry-Skip Adders”, ACTEA
2009, July 15-17, 2009, Zouk Mosbeh, Lebanon.
