AVLSI - Module 1 Notes
MODULE 1
Introduction to ASICs
Programmable ASICs
– Tens of Transistors
– NAND, NOR
– Hundreds of Transistors
– Counters
– First Microprocessor
• Bipolar
– More accuracy
• MOS
– Gate-Aluminium
– Low power consumption
– Low cost
• CMOS
– Gate-Poly-Silicon
– Low power consumption
– Low cost
• BiCMOS
– High current drive
Origin of ASICs:
• A microelectronic system can be built by identifying the functions that we can
implement using standard ICs and then implementing the remaining logic functions
(sometimes called glue logic) with one or more custom ICs.
custom ICs, dynamic random access memory (DRAM) and static RAM (SRAM)
• Examples of ICs that are not ASICs include standard parts such as:
- microprocessors;
– Both of these examples are specific to an application (shades of an ASIC) but are
sold to many different system vendors (shades of a standard part). ASICs such
as these are sometimes called application-specific standard products (ASSPs).
Measurement of IC :
• Gate Equivalent
– Example: 0.5µm IC
Types of ASICs
• Full-Custom ASICs: Possibly all logic cells and all mask layers customized
• Semi-Custom ASICs: all logic cells are pre-designed and some (possibly all)
mask layers customized
Full-Custom ASICs
Manufacturing lead time is typically 8 weeks (the time taken to make the IC does
not include design time).
Advantages:
Disadvantages
Increased Complexity
Higher risk.
Some Examples:
Microprocessor,
Semi-Custom ASICs: all logic cells are pre-designed and some (possibly all) mask
layers customized
full-custom blocks
System-Level Macros(SLMs)
• Feedthrough cell:
• Piece of metal that is used to pass a signal through a cell or to a space in a cell
• Spacer cells
• The width of each row of standard cells is adjusted so that the rows may be aligned
using spacer cells.
• The power buses, or rails, are then connected to additional vertical power rails using
• If the rows of standard cells are long, then vertical power rails can also be run in
metal2 through the cell rows using special power cells that just connect to VDD and
GND.
Usually the designer manually controls the number and width of the vertical power rails
– The rows stack vertically to form flexible blocks, which can be reshaped during design
– Flexible blocks can be connected to other standard-cell blocks or to full-custom blocks
Advantages of CBIC
Disadvantages of CBIC:
– Time needed to fabricate all layers of the ASIC for new design
Only the top few layers of metal, which define the interconnect between
It is often called a masked gate array (MGA) or a pre-diffused array which uses
The key difference between a channel-less gate array and channeled gate array
• There are no predefined areas set aside for routing between cells on a
channel-less gate array.
• An embedded gate array or structured gate array (also known as master slice or
master image) combines some of the features of CBICs and MGAs.
• One of the disadvantages of the MGA is the fixed gate-array base cell. This makes
the implementation of memory, for example, difficult and inefficient.
• In an embedded gate array we set aside some of the IC area and dedicate it to a
specific function.
• This embedded area either can contain a different base cell that is more suitable for
building memory cells, or it can contain a complete circuit block, such as a
microcontroller.
Channelled gate array
Adv: Specific space for interconnection
Disadv: compared to CBIC space is not adjustable
Channelless gate array
A  B  Cout  Carry status
0  0  0     Kill
0  1  Cin   Propagate
1  0  Cin   Propagate
1  1  1     Generate
Carry-lookahead expansion:
• Can recursively expand carry formula.
• Having these we could design the circuit. We can now write the Boolean function for
the carry output of each stage and substitute for each Ci its value from the previous
equations
Generate : G = A . B
Propagate : P = A ⊕ B
Delete : D = A' . B' (both inputs are 0)
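The carry recursion described above can be sketched in Python. This is a minimal illustration, assuming the usual definitions G = A AND B and P = A XOR B and the stage equation C(i+1) = G(i) + P(i).C(i). The function names and the 4-bit default width are hypothetical choices made for the example, not part of the notes.

```python
def lookahead_carries(a, b, cin=0, n=4):
    """Return the carries C0..Cn for n-bit operands a and b."""
    carries = [cin & 1]
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi   # generate: both inputs are 1
        p = ai ^ bi   # propagate: exactly one input is 1
        # C(i+1) = G(i) + P(i).C(i). Substituting each C(i) from the previous
        # stage gives the expanded lookahead formula that hardware evaluates
        # in parallel; software simply walks the recursion.
        carries.append(g | (p & carries[i]))
    return carries


def cla_add(a, b, cin=0, n=4):
    """Add two n-bit numbers; returns (sum, carry_out)."""
    carries = lookahead_carries(a, b, cin, n)
    total = 0
    for i in range(n):
        p = ((a >> i) & 1) ^ ((b >> i) & 1)
        total |= (p ^ carries[i]) << i   # sum bit S(i) = P(i) xor C(i)
    return total, carries[n]
```

For example, cla_add(0b1011, 0b0110) gives a 4-bit sum of 0b0001 with a carry-out of 1, since 11 + 6 = 17.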
Condition: if BP = P0 . P1 . P2 . P3 = 1, then Co,3 = Ci,0
• When BP = P0 . P1 . P2 . P3 = 1, the incoming carry bypasses the block and is passed
immediately to the next block, hence the name carry-bypass adder.
• The delay involved is the setup time to evaluate the generate and propagate functions
To design an N-bit adder: divide the adder into equal-length bypass stages of
length M (for example, if N = 16, M = 4).
tsum : Time to generate the sum of the final stage where N=2M
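A minimal Python sketch of the bypass idea, assuming N = 16 and M = 4 as in the example above. The function name and the returned list of bypass flags are hypothetical and exist only to make the behaviour visible. The integer addition inside each block stands in for that block's ripple-carry chain.

```python
def carry_bypass_add(a, b, n=16, m=4, cin=0):
    """Add two n-bit numbers using bypass blocks of m bits each.

    Returns (sum, carry_out, bypassed), where bypassed[k] is True when
    block k passed its carry-in straight through.
    """
    mask = (1 << m) - 1
    total = 0
    carry = cin
    bypassed = []
    for start in range(0, n, m):
        a_blk = (a >> start) & mask
        b_blk = (b >> start) & mask
        # Block propagate BP = P0.P1...P(m-1): every bit position propagates.
        bp = (a_blk ^ b_blk) == mask
        s = a_blk + b_blk + carry   # stands in for the block's ripple chain
        total |= (s & mask) << start
        # When BP = 1 the carry-out equals the carry-in (bypass path);
        # otherwise it comes from the ripple chain.
        carry = carry if bp else (s >> m)
        bypassed.append(bp)
    return total, carry, bypassed
```

With a = 0x0FFF and b = 0x0001 the two middle blocks have BP = 1, so the carry produced by the lowest block skips over them. This also illustrates why the speed-up depends on the input combination.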
Advantages
• Improves delay compared to the RCA with minimal effort and complexity
Disadvantages
• Probabilistic speed improvement : the speed of a CBA improves for only some input
combinations
Applications
• CSA is a digital adder that can efficiently add three or more binary numbers.
• The CSA outputs two numbers, a partial sum and a carry, instead of a single sum.
• CSA uses bit level compressors to break the carry chain and sum two numbers without
carry propagation.
• The basic CSA is made up of a number of (3,2) counters in parallel, with no carry links.
Main idea: Don’t propagate carry signal until last possible stage.
• This adder adds three bits at once and produces two outputs, a sum bit and a carry bit.
• Instead of being propagated, the carry is stored in the current stage and added in,
like any other operand, in the next stage.
• A simple n-bit RCA (or any other carry-propagate adder) is used at the last level to
produce the final sum.
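A short Python sketch of the scheme described above: one level of (3,2) counters with no carry links, followed by a single carry-propagate addition at the end. The function names are hypothetical, and Python integers stand in for bit vectors of arbitrary width.

```python
def carry_save_add(x, y, z):
    """One level of (3,2) counters: three operands in, two numbers out."""
    partial_sum = x ^ y ^ z                      # per-bit sum, no carry links
    carry = ((x & y) | (y & z) | (x & z)) << 1   # per-bit majority, weight x2
    return partial_sum, carry


def add_many(values):
    """Sum non-negative integers, propagating carries only once at the end."""
    s, c = 0, 0
    for v in values:
        s, c = carry_save_add(s, c, v)
    return s + c   # final carry-propagate addition (an RCA in the notes)
```

For example, carry_save_add(5, 3, 6) returns the pair (0, 14), whose sum is 14 = 5 + 3 + 6. No carry has moved more than one bit position until the final addition.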
Advantages
• It consumes low power as compared to other types of adders due to few carry
propagation stages
Disadvantages
Applications
• Used in high-speed multiplication, where it performs better than an RCA or other
carry-propagate adder (CPA)
• A CSLA is a fast adder used in high-speed arithmetic, processing applications,
memory architectures, and digital communication systems.
• A CSLA precomputes the result for each possible carry-in value and then selects the
one needed for the final output. This reduces the carry-propagation delay.
• A carry-select adder is often used as the fast adder in a datapath library because its
layout is regular.
• In a carry-select adder we duplicate two small adders (usually 4-bit or 8-bit adders—often
CLAs) for the cases Cin = '0' and Cin = '1' and then use a MUX to select the case that we need—
wasteful, but fast [Bedrij, 1962].
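The duplicate-and-select idea can be sketched in Python as follows. This is an illustration under stated assumptions: 4-bit blocks in a 16-bit adder, with plain integer addition standing in for the small adders (CLAs in the notes). The function name is hypothetical.

```python
def carry_select_add(a, b, n=16, m=4, cin=0):
    """Add two n-bit numbers with m-bit carry-select blocks."""
    mask = (1 << m) - 1
    total = 0
    carry = cin
    for start in range(0, n, m):
        a_blk = (a >> start) & mask
        b_blk = (b >> start) & mask
        s0 = a_blk + b_blk       # small adder assuming Cin = '0'
        s1 = a_blk + b_blk + 1   # duplicate adder assuming Cin = '1'
        # MUX: the actual carry-in picks one of the two precomputed results.
        chosen = s1 if carry else s0
        total |= (chosen & mask) << start
        carry = chosen >> m
    return total, carry
```

In hardware both s0 and s1 for every block are computed at the same time, so only the MUX chain lies on the carry path. The loop here is sequential only because it is software.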
Advantages
• The adder speeds up addition by performing addition operation on lower and upper
positions of the word simultaneously
Disadvantages
• The price paid is additional hardware for word adder, a set of multiplexers and the
associated interconnect wiring. Area requirement is huge because of multiple pairs of
RCAs.
• The design is favourable when speed is more important than area consumption.
• A conditional sum adder is a recursive structure based on carry select adder. (CSLA)
• The carry select adder is a fast addition scheme that divides the n-bit operands into
smaller groups, allowing the serial carry propagation to be done in parallel.
• We can extend the idea behind a carry-select adder as follows. Suppose we have an
n-bit adder that generates two sums: one sum assumes a carry-in condition of '0', the
other sum assumes a carry-in condition of '1'.
• We can split this n-bit adder into an i-bit adder for the i LSBs and an (n – i)-bit
adder for the (n – i) MSBs. Both of the smaller adders generate two conditional sums as
well as true and complement carry signals.
• The two (true and complement) carry signals from the LSB adder are used to select
between the two (n – i + 1)-bit conditional sums from the MSB adder using 2(n – i + 1)
two-input MUXes.
• The above figure shows the simplest form of an n -bit conditional-sum adder that uses
n single-bit conditional adders, H (each with four outputs: two conditional sums, true
carry, and complement carry), together with a tree of 2:1 MUXes (Qi_j).
• We can recursively apply this technique. For example, we can split a 16-bit adder
(n = 16) using i = 8 into two 8-bit adders; then we can split one or both 8-bit adders
again, and so on.
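The recursive split can be sketched in Python. This is a minimal illustration that assumes the adder is always split in half (i = n // 2), down to single-bit conditional cells. The function names are hypothetical, and tuple indexing plays the role of the 2:1 MUXes.

```python
def conditional_sums(a, b, n):
    """Return ((sum0, cout0), (sum1, cout1)) for carry-in 0 and carry-in 1."""
    if n == 1:
        # Single-bit conditional cell: both answers are computed up front.
        a0, b0 = a & 1, b & 1
        return ((a0 ^ b0, a0 & b0), (1 ^ a0 ^ b0, a0 | b0))
    i = n // 2                                    # split point
    lo_mask = (1 << i) - 1
    lo = conditional_sums(a & lo_mask, b & lo_mask, i)   # i LSBs
    hi = conditional_sums(a >> i, b >> i, n - i)         # (n - i) MSBs
    result = []
    for cin in (0, 1):
        lo_sum, lo_carry = lo[cin]
        # MUX: the carry from the LSB adder selects one of the two
        # conditional results of the MSB adder.
        hi_sum, hi_carry = hi[lo_carry]
        result.append((lo_sum | (hi_sum << i), hi_carry))
    return tuple(result)


def conditional_sum_add(a, b, n=16, cin=0):
    """Add two n-bit numbers; returns (sum, carry_out)."""
    return conditional_sums(a, b, n)[cin]
```

Each level of the recursion corresponds to one level of the MUX tree, so the selection depth grows with log2(n) rather than with n.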
Advantages
Disadvantages
Booth’s Multiplier
• The algorithm loads the multiplicand and multiplier into registers, initializes a third
register to 0, and performs bitwise shifts and arithmetic operations
(addition/subtraction of the multiplicand) on the registers based on the values of bits
from the multiplier.
• This process builds up the product one bit at a time in a third register.
• The function of the algorithm is to determine the beginning and the end of the string of
ones in the multiplier and perform multiplicand addition-accumulation at the end of
the string or perform multiplicand subtraction-accumulation at the beginning of the
string.
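The register-level procedure described above can be sketched in Python. This is an illustration of the radix-2 form of Booth's algorithm under stated assumptions: the register names (acc, q, q_prev) and the default 8-bit width are hypothetical, and the accumulator is kept as an unbounded signed Python integer to avoid overflow corner cases that a fixed-width hardware register would need an extra bit to handle.

```python
def booth_multiply(multiplicand, multiplier, bits=8):
    """Multiply two signed integers that each fit in `bits` bits."""
    mask = (1 << bits) - 1
    acc = 0                  # third register, initialized to 0
    q = multiplier & mask    # multiplier register (two's complement)
    q_prev = 0               # extra bit to the right of the multiplier LSB
    for _ in range(bits):
        pair = (q & 1, q_prev)
        if pair == (1, 0):       # beginning of a string of ones: subtract
            acc -= multiplicand
        elif pair == (0, 1):     # end of a string of ones: add
            acc += multiplicand
        # Arithmetic right shift of the combined (acc, q, q_prev) register.
        q_prev = q & 1
        q = (q >> 1) | ((acc & 1) << (bits - 1))
        acc >>= 1                # Python's >> keeps the sign of acc
    return (acc << bits) | q     # signed product, high part in acc
```

For example, booth_multiply(3, -4, bits=4) recodes the multiplier 1100 as a single subtraction at the start of its string of ones and returns -12.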
Advantages
• Reduces partial products: making it useful for multipliers with long operands
Disadvantages
• The combinational datapath cells, NAND, NOR, and so on, and sequential datapath cells
(flip-flops and latches) have standard-cell equivalents and function identically.
• Bold outlines (1 point) are used for datapath cells instead of the regular (0.5 point)
outlines used for scalar symbols. We call a set of identical cells a vector of datapath
elements, in the same way that a bold symbol, A, represents a vector and a regular A
represents a scalar.
• MAJ(NOT(A), B, NOT(BIN)). These equations are the same as those for the full adder
(FA), except that the B input is inverted and the sense of the carry chain is inverted.
• To build a subtracter that calculates (A – B) we invert the entire B input bus and
connect the BIN[0] input to VDD (not to VSS as we did for CIN[0] in an adder).
• An adder/subtracter has a control signal that gates the A input with an exclusive-OR
cell (forming a programmable inversion) to switch between an adder or subtracter.
• Some adder/subtracters gate both inputs to allow us to compute (–A – B). We must be
careful to connect the input to the LSB of the carry chain (CIN[0] or BIN[0]) when
changing between addition (connect to VSS) and subtraction (connect to VDD).
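A minimal Python sketch of an adder/subtracter built this way. It assumes the programmable inversion is applied to the B bus so that the result is A – B, and the function and parameter names are hypothetical. The carry input to the LSB is 0 (VSS) for addition and 1 (VDD) for subtraction, as described above.

```python
def add_sub(a, b, subtract, n=8):
    """n-bit adder/subtracter: returns (result mod 2**n, carry_out)."""
    mask = (1 << n) - 1
    # Programmable inversion: XOR every bit of B with the control signal.
    b_gated = (b ^ (mask if subtract else 0)) & mask
    # LSB carry input: 0 (VSS) for addition, 1 (VDD) for subtraction.
    carry_in = 1 if subtract else 0
    total = (a & mask) + b_gated + carry_in
    return total & mask, total >> n
```

For example, add_sub(9, 5, True) returns (4, 1), while add_sub(5, 9, True) returns (252, 0), which is –4 in 8-bit two's complement.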
2) A barrel shifter rotates or shifts an input bus by a specified amount. For example, if
we have an eight-input barrel shifter with input '1111 0000' and we specify a shift of
'0001 0000' (3, coded by bit position), the right-shifted 8-bit output is '0001 1110'.
• A barrel shifter may rotate left or right (or switch between the two under a separate
control).
• A barrel shifter may also have an output width that is smaller than the input.
• This situation is equivalent to having a barrel shifter with two 4-bit inputs and a 4-bit
output.
• Barrel shifters are used extensively in floating-point arithmetic to align (we call this
normalize and denormalize ) floating-point numbers (with sign, exponent, and
mantissa).
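The shift example above can be reproduced with a short Python sketch. The function name and the `rotate` flag are hypothetical, and the shift amount is given as a plain integer rather than coded by bit position.

```python
def barrel_shift_right(value, amount, n=8, rotate=False):
    """Shift (or rotate) an n-bit value right by `amount` (0 <= amount < n)."""
    mask = (1 << n) - 1
    value &= mask
    amount %= n
    shifted = value >> amount
    if rotate:
        # Bits shifted out on the right re-enter on the left.
        shifted |= (value << (n - amount)) & mask
    return shifted
```

With the input '1111 0000' and a shift of 3, barrel_shift_right(0b11110000, 3) gives '0001 1110', matching the example.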
• The input is an n-bit bus, A; the output is an n-bit bus, S, with a single '1' in the
bit position corresponding to the most significant '1' in the input.
• Thus, for example, if the input is A = '0000 0101' the leading-one detector output
is S = '0000 0100', indicating the leading one in A is in bit position 2 (bit 7 is the
MSB, bit zero is the LSB).
• If we feed the output, S, of the leading-one detector to the shift select input of a
normalizing (left-shift) barrel shifter, the shifter will normalize the input A.
• In our example, with an input of A = '0000 0101', and a left-shift of S = '0000
0100', the barrel shifter will shift A left by five bits and the output of the shifter is
Z = '1010 0000'.
• Now that Z is aligned (with the MSB equal to '1') we can multiply Z with another
normalized number.
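A small Python sketch of the leading-one detector feeding a normalizing left shift, using the 8-bit example above. The function names are hypothetical, and an all-zero input is handled by returning zero, which is an assumption not covered in the notes.

```python
def leading_one(a, n=8):
    """Return an n-bit word with a single 1 at the most significant 1 of a."""
    a &= (1 << n) - 1
    if a == 0:
        return 0
    return 1 << (a.bit_length() - 1)


def normalize(a, n=8):
    """Left-shift a until its MSB is 1; returns (shifted value, shift count)."""
    s = leading_one(a, n)
    if s == 0:
        return 0, 0
    # Distance from the leading one to the MSB position (n - 1).
    shift = (n - 1) - (s.bit_length() - 1)
    return (a << shift) & ((1 << n) - 1), shift
```

For A = '0000 0101', leading_one returns '0000 0100' and normalize returns Z = '1010 0000' with a shift of five bits.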
4) The output of a priority encoder is the binary-encoded position of the leading one in an
input.
For example, with an input A = '0000 0101' the leading 1 is in bit position 2 (bit 7 is the
MSB, bit zero is the LSB), so the output of a 4-bit priority encoder would be Z = '0010' (2).
In some cell libraries the encoding is reversed so that the MSB has an output code of zero;
in this case Z = '0101' (5).
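A Python sketch of both encodings. It assumes bit positions are numbered from 0 at the LSB, so the MSB of an 8-bit input is bit 7. The function name and the `msb_is_zero` flag are hypothetical, and returning None for an all-zero input is an assumption, since the notes do not say how that case is signalled.

```python
def priority_encode(a, n=8, msb_is_zero=False):
    """Return the position of the leading one of an n-bit input."""
    a &= (1 << n) - 1
    if a == 0:
        return None   # no 1 in the input (assumed behaviour)
    position = a.bit_length() - 1          # 0 = LSB, n - 1 = MSB
    # Reversed encoding: the MSB position is reported as code zero.
    return (n - 1 - position) if msb_is_zero else position
```

For A = '0000 0101' this numbering gives 2 for the normal encoding and 5 for the reversed encoding.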
• The carry-in control input, CIN[0], thus acts as an enable: If it is set to '0' the output is
the same as the input.
• The implementation of arithmetic cells is often a little more complicated than we have
explained.
• This inverts COUT, so that in the following stage we must invert it again. If we push an
inverting bubble to the input CIN we find that:
• In many datapath implementations all odd-bit cells operate on inverted carry signals,
and thus the odd-bit and even-bit datapath elements are different.
• In fact, all the adder and subtracter datapath elements we have described may use this
technique.
• Normally this is completely hidden from the designer in the datapath assembly and
any output control signals are inverted, if necessary, by inserting buffers.
• The implementation may invert the odd carry signals, with CIN[0] again acting
as an enable.
• This has the effect of selecting either the increment or decrement function
• For a 4-bit number, for example, zero in ones' complement arithmetic is '1111' or
'0000', and zero in signed magnitude arithmetic is '1000' or '0000'.
7) A register file (or scratchpad memory) is a bank of flip-flops arranged across the bus;
sometimes these have the option of multiple ports (multiport register files) for read and
write.
• Normally these register files are the densest logic and hardest to fit in a datapath. For
large register files it may be more appropriate to use a multiport memory. We can add
control logic to a register file to create a first-in first-out register ( FIFO ), or last-in
first-out register ( LIFO ).
I/O Cells
When the output enable (OE) signal is high, the circuit functions as a noninverting
buffer driving the value of DATAout onto the I/O pad.
• When OE is low, the output transistors or drivers , M1 and M2, are disconnected. This allows
multiple drivers to be connected on a bus.
• It is up to the designer to make sure that a bus never has two drivers active at the
same time, a problem known as contention.
• In order to prevent the problem opposite to contention—a bus floating to an
intermediate voltage when there are no bus drivers—we can use a bus keeper or bus-
hold cell (TI calls this Bus-Friendly logic).
• A bus keeper normally acts like two weak (low drive-strength) cross-coupled inverters
that act as a latch to retain the last logic state on the bus, but the latch is weak enough
that it may be driven easily to the opposite state.
• Even though bus keepers act like latches, and will simulate like latches, they should
not be used as latches, since their drive strength is weak.
• Such large currents flowing in the output transistors must also flow in the power
supply bus and can cause problems.
• There is always some inductance in series with the power supply, between the point at
which the supply enters the ASIC package and reaches the power bus on the chip.
• The inductance is due to the bond wire, lead frame, and package pin.
• We can design the output buffer to limit the slew rate of the output (we call these slew-
rate limited I/O pads).
• Quiet-I/O cells also use two separate power supplies and two sets of I/O drivers:
• An AC supply (clean or quiet supply) with small AC drivers for the I/O circuits that
start and stop the output slewing at the beginning and end of an output transition
• A DC supply (noisy or dirty supply) for the transistors that handle large currents as
they slew the output.
• The three-state buffer allows us to employ the same pad for input and output—
bidirectional I/O .
• When we want to use the pad as an input, we set OE low and take the data from
DATAin.
• Of course, it is not necessary to have all these features on every pad: We can build
output-only or input-only pads
• We can also use many of these output cell features for input cells that have to drive
large on-chip loads (a clock pad cell, for example).
• Some gate arrays simply turn an output buffer around to drive a grid of interconnect
that supplies a clock signal internally.
• Some libraries include I/O cells that have passive pull-ups or pull-downs (resistors)
instead of the transistors, M1 and M2 (the resistors are normally still constructed from
transistors with long gate lengths).
• We can also omit one of the driver transistors, M1 or M2, to form open-drain outputs
that require an external pull-up or pull-down.
• We can design the output driver to produce TTL output levels rather than CMOS logic
levels.
• The input buffer can also include a level shifter to accept TTL input levels and shift the
input signal to CMOS levels.
• The gate oxide in CMOS transistors is extremely thin (100 Å or less). This leaves the
gate oxide of the I/O cell input transistors susceptible to breakdown from static
electricity (electrostatic discharge, or ESD).
• ESD arises when we or machines handle the package leads (like the shock I sometimes
get when I touch a doorknob after walking across the carpet at work).
• Sometimes this problem is called electrical overstress (EOS) since most ESD-related
failures are caused not by gate-oxide breakdown, but by the thermal stress (melting)
that occurs when the n -channel transistor in an output driver overheats (melts) due to
the large current that can flow in the drain diffusion connected to a pad during an ESD
event.
• To protect the I/O cells from ESD, the input pads are normally tied to device structures
that clamp the input voltage to below the gate breakdown voltage (which can be as low
as 10 V with a 100 Å gate oxide).
• Some I/O cells use transistors with a special ESD implant that increases breakdown
voltage and provides protection.
• I/O driver transistors can also use elongated drain structures (ladder structures) and
large drain-to-gate spacing to help limit current.
• In a salicide process, which lowers the drain resistance, ladder structures are difficult
to build. One solution is to mask the I/O cells during the salicide step.
• Another solution is to use pnpn and npnp diffusion structures called silicon-controlled
rectifiers (SCRs) to clamp voltages and divert current to protect the I/O circuits from
ESD
• There are several ways to model the capability of an I/O cell to withstand EOS.
• Typical voltages generated by the human body are in the range of 2–4 kV, and we often
see an I/O pad cell rated by the voltage it can withstand using the human-body model (HBM).
• The charge-device model (CDM, also called device charge–discharge) represents the
problem when an IC package is charged, in a shipping tube for example, and then
grounded.
• If the diffusion structures in the I/O cells are not designed with care, it is possible to
construct an SCR structure unwittingly, and instead of protecting the transistors the
SCR can enter a mode where it is latched on and conducting large enough currents to
destroy the chip.
• Latch-up can occur if the pn -diodes on a chip become forward-biased and inject
minority carriers (electrons in p -type material, holes in n -type material) into the
substrate.
• These injected minority carriers can travel fairly large distances and interact with
nearby transistors causing latch-up.
• This is a problem that can also occur in the logic core and this is one reason that we
normally include substrate and well connections to the power supplies in every cell
Cell Compilers
• The process of hand crafting circuits and layout for a full-custom IC is a tedious, time-
consuming, and error-prone task.
• There are two types of automated layout assembly tools, often known as silicon
compilers.
• The first type produces a specific kind of circuit, a RAM compiler or multiplier
compiler, for example.
• Dynamic RAM (DRAM) can use a cell with only one transistor, storing charge on a
capacitor that has to be periodically refreshed as the charge leaks away.
• ASIC RAM is invariably static (SRAM), so we do not need to refresh the bits.
• When we refer to RAM in an ASIC environment we almost always mean SRAM. Most
ASIC RAMs use a six-transistor cell (four transistors to form two cross-coupled
inverters that form the storage loop, and two more transistors to allow us to read from
and write to the cell).
• RAM compilers are available that produce single-port RAM (a single shared bus for
read and write) as well as dual-port RAMs and multiport RAMs.
• In a multi-port RAM the compiler may or may not handle the problem of address
contention (attempts to read and write to the same RAM address simultaneously).
• RAM can be asynchronous (the read and write cycles are triggered by control and/or
address transitions asynchronous to a clock) or synchronous (using the system clock).