Lec 4 - 5 - 6
Subsystem design
Subsystem: A system that is part of a larger system. Most chips are built from
a collection of subsystems such as adders, register files, state machines, etc.
Subsystem optimization: The cost of a design is measured in area, delay, and
power. Area and delay costs can be reduced by optimization at each level of
abstraction:
• Circuit. Transistor sizing is the first line of defense against circuits that
inherently require long wires. Advanced logic circuits, such as precharged
gates, may help reduce the delay within logic gates.
• Logic. Redesigning the logic to reduce the gate depth from input to output
can greatly reduce delay, though usually at the cost of area.
Shifter:
A barrel shifter can perform n-bit shifts in a single combinational function, and
it has a very efficient layout. It can rotate and extend signs as well. Its architecture
is shown in Figure 1. The barrel shifter accepts 2n data bits and n control signals
and produces n output bits. It shifts by transmitting an n-bit slice of the 2n data
bits to the outputs. The position of the transmitted slice is determined by the
control bits; the exact operation is determined by the values placed at the data
inputs. Consider two examples:
• Send a data word d into the top input and a word of all zeroes into the bottom
input. The output is a right shift (imagine standing at the output looking into the
barrel shifter) with zero fill. Setting the control bits to select the top-most n bits
is a shift of zero, while selecting the bottom-most n bits is an n-bit shift that
pushes the entire word out of the shifter. We can shift with a ones fill by sending
an all-ones word to the bottom input.
• Send the same data word into both the top and bottom inputs. The result is a
rotate operation—shifting out the top bits of a word causes those bits to reappear
at the bottom of the output.
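The two examples above can be checked with a small behavioral model. This is a sketch only, not a circuit description: the function name barrel_shift and the list representation (index 0 is the top-most bit) are assumptions made for illustration, and the shift amount is allowed to run from 0 to n so that both end cases described above can be tried.

```python
def barrel_shift(top, bottom, shift):
    """Return the n-bit slice of the 2n data bits selected by `shift`.

    Bits are listed top to bottom; shift=0 selects the top-most n bits
    and shift=n selects the bottom-most n bits.
    """
    n = len(top)
    if len(bottom) != n:
        raise ValueError("top and bottom words must have the same width")
    if not 0 <= shift <= n:
        raise ValueError("shift must be between 0 and n")
    data = list(top) + list(bottom)   # the 2n data inputs
    return data[shift:shift + n]      # the transmitted n-bit slice


d = [1, 0, 1, 1]
zero_fill = barrel_shift(d, [0, 0, 0, 0], 1)   # shift by one with zero fill
ones_fill = barrel_shift(d, [1, 1, 1, 1], 1)   # shift by one with ones fill
rotated = barrel_shift(d, d, 1)                # same word on both inputs: rotate
```

Here zero_fill is [0, 1, 1, 0], while rotated is [0, 1, 1, 1]: the bit shifted out of the top reappears at the bottom.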
ECE 405
Fig2
A barrel shifter with n output bits is built from a 2n vertical by n horizontal array
of cells, each of which has a single transistor and a few wires. The schematic for
a small group of contiguous cells is shown in Fig2. The core of the cell is a
transmission gate built from a single n-type transistor; a complementary
transmission gate would require too much area for the tubs. The control lines run
vertically; the input data run diagonally upward through the system; the output
data run horizontally. The control line values are set so that exactly one is 1,
which turns on all the transmission gates in a single column. The transmission
gates connect the diagonal input wires to the horizontal output wires; when a
column is turned on, all the inputs are shunted to the outputs. The length of the
shift is determined by the position of the selected column—the farther to the right
it is, the greater the distance the input bits have travelled upward before being
shunted to the output.
Note that, while this circuit has many transmission gates, each signal must
traverse only one transmission gate. The delay cost of the barrel shifter is largely
determined by the parasitic capacitances on the wires, which is the reason for
squeezing the size of the basic cell as much as possible. In this case, area and
delay savings go hand-in-hand.
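The column-select behavior of the array can be sketched in the same spirit. The model below is an illustration under stated assumptions, not the actual layout: the name barrel_array is hypothetical, and it assumes that the cell in a given row and column connects data input (row + column) to the output of that row, which is one way to represent the diagonal input wires. It also counts how many transmission gates each output signal passes through.

```python
def barrel_array(data, control):
    """Model the pass-transistor array with a one-hot control word.

    data    -- the 2n input bits, top to bottom
    control -- n control bits, exactly one of which must be 1
    Returns (outputs, gates_per_output).
    """
    n = len(control)
    if len(data) != 2 * n:
        raise ValueError("expected 2n data bits for n control lines")
    if any(c not in (0, 1) for c in control) or sum(control) != 1:
        raise ValueError("exactly one control line must be 1")
    outputs = [None] * n
    gates_per_output = [0] * n
    for column, selected in enumerate(control):
        if not selected:
            continue  # every transistor in this column is off
        for row in range(n):
            # Assumed wiring: the diagonal input wire reaching this cell
            # carries data[row + column].
            outputs[row] = data[row + column]
            gates_per_output[row] += 1
    return outputs, gates_per_output


outputs, gates = barrel_array([1, 0, 1, 1, 0, 0, 0, 0], [0, 1, 0, 0])
```

With the second column selected, outputs is [0, 1, 1, 0] and every entry of gates is 1, matching the observation that each signal crosses only one transmission gate.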
Describe CMOS VLSI design of a full adder circuit and hence verify the
truth table.
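As a starting point for the truth-table part of this exercise, the logic function of a full adder can be checked exhaustively. The sketch below uses the standard gate-level equations (sum = a XOR b XOR cin, carry = a.b + cin.(a XOR b)); it says nothing about the transistor-level CMOS circuit, and the name full_adder is only illustrative.

```python
from itertools import product


def full_adder(a, b, cin):
    """Gate-level full adder: returns (sum, carry_out) for one-bit inputs."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


# Build all eight rows of the truth table and compare each row
# with ordinary integer addition of the three input bits.
truth_table = [(a, b, cin) + full_adder(a, b, cin)
               for a, b, cin in product((0, 1), repeat=3)]
for a, b, cin, s, cout in truth_table:
    assert a + b + cin == 2 * cout + s
```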
Semiconductor memories
Types of RAM
DRAM stands for Dynamic Random Access Memory. It is used as main memory in most computers
and is the least expensive kind of RAM. It needs power to maintain its electrical state, and
even then the stored charge leaks away over time, which can result in loss of data. DRAM is
therefore refreshed (recharged) periodically to maintain its data. The processor cannot access
the data in DRAM while it is being refreshed, which is one reason it is slower than SRAM.
SRAM stands for Static Random Access Memory. It holds its data as long as power is applied,
without any need for refresh. The CPU does not have to wait for refresh cycles when accessing
SRAM, which is why it is faster than DRAM. Because it needs no refresh, it can also use less
power than DRAM when idle. SRAM is more expensive than DRAM. It is normally used to build
very fast memory known as cache memory.
MRAM stands for Magnetoresistive Random Access Memory. It stores data as magnetic states
instead of electrical charge. MRAM is said to use far less power than other RAM technologies,
which makes it attractive for portable devices, and to offer greater storage capacity and faster
access times than conventional RAM. It is non-volatile: it retains its contents when power is
removed from the computer.
The simplest dynamic RAM cell uses a three-transistor circuit. This circuit is
fairly large and slow. It is sometimes used in ASICs because it is denser than
SRAM and, unlike one-transistor DRAM, does not require special processing
steps.
The three-transistor DRAM circuit is shown in Fig3. The value is stored on the
gate capacitance of t1; the other two transistors are used to control access to that
value:
• To write, the value to be written is set on write_data, write is set to 1, and read
to 0. Charge sharing between write_data and t1’s gate capacitance forces t1 to the
desired value.
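The charge-sharing step in the write can be illustrated numerically. This is a rough sketch with loud assumptions: the write_data line is treated as an isolated capacitance already charged to the value being written, and the capacitance values (100 fF for the line, 2 fF for t1's gate) are hypothetical numbers chosen only to show the effect. They do not come from these notes or from any particular process.

```python
def charge_share(c_line, v_line, c_gate, v_gate):
    """Final voltage after two capacitors are connected (charge conservation)."""
    return (c_line * v_line + c_gate * v_gate) / (c_line + c_gate)


C_LINE_FF = 100.0   # hypothetical write_data line capacitance, in fF
C_GATE_FF = 2.0     # hypothetical gate capacitance of t1, in fF

# Write a 1 (normalized to 1.0 V) into a cell that previously held 0.
v_written = charge_share(C_LINE_FF, 1.0, C_GATE_FF, 0.0)
```

Because the line capacitance is assumed to be much larger than the gate capacitance, v_written stays close to the value on write_data (100/102 of it with these numbers), which is why the line, not the old cell contents, determines the stored value.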
Physical design (Floorplanning)
Floorplanning is chip-level layout design. When designing a leaf cell, we used
transistors and vias as our basic components; floorplanning uses the adders,
registers, and FSMs as the building blocks. The fundamental difference between
floorplanning and leaf-cell design is that floorplanning works with components
that are much larger than the wires connecting them. This great size mismatch
forces us to analyse the layout differently and to make different trade-offs during
design.
Block placement, as the name implies, places the blocks on the chip.
Global routing assigns wires to routing channels between the blocks.
Detailed routing designs the layouts for the wiring.
A block is characterized by its area and its aspect ratio (the ratio of its width to
its height). The wiring between the blocks (such as a rat’s nest plot) can be used
to adjust the positions of the blocks.
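Since a block is described by its area and aspect ratio, its width and height follow directly: width = sqrt(area x aspect ratio) and height = sqrt(area / aspect ratio). The helper below is a small sketch of that arithmetic; the name block_dimensions and the example numbers are illustrative only.

```python
import math


def block_dimensions(area, aspect_ratio):
    """Return (width, height) of a block from its area and width/height ratio."""
    if area <= 0 or aspect_ratio <= 0:
        raise ValueError("area and aspect ratio must be positive")
    width = math.sqrt(area * aspect_ratio)
    height = math.sqrt(area / aspect_ratio)
    return width, height


# Example: a block of area 8 (arbitrary units) that is twice as wide as it is tall.
w, h = block_dimensions(8.0, 2.0)
```

For this example the block is 4 units wide and 2 units high. Trying several aspect ratios for the same area is a quick way to see which block shapes might fit a given floorplan.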
Channel definition: the space between the blocks is divided into rectangular
regions, called channels, to simplify detailed routing.
• Sweep small components into larger ones. Block diagrams often have isolated
gates or slightly larger components. While these small components help describe
system operation, they create lots of problems during floorplanning: they require
extra effort for power/ground routing and they disrupt the flow of wires across
the chip. Put these small components into an existing larger block or create a glue
logic block to contain all the miscellaneous elements.
• Design wiring that looks simple. If your sketch of the block diagram or
floorplan looks like a plate of spaghetti, it will be hard to route. More importantly,
it will be harder to change the design when you need to make logic changes or
redesign to reduce delays. Move blocks, then move pin locations to simplify
routing topology.
• Design planar wiring. A set of nets is planar if all the nets can be routed in
the plane without crossing. While most interesting chips don’t have planar wiring,
a subset of the wires may be planar. It may help organize your thinking to first
design a floorplan on which the most important signals have a planar routing,
then add the less-critical signals later.
• Draw separate wiring plans for power and clock signals. You may want to
include these signals in your floorplan sketch, but they may be hard to distinguish
from the maze of signals on the chip. A separate chart of power and clock routing
will help you convince yourself that your design is good for signals, power, and
clock.
• Vias as thermal pipes. A simple step that can be taken to improve the
temperature characteristics of the chip is to add dummy vias in higher-level metal
layers. These vias help conduct heat away from the lower layers.
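The planar-wiring tip above can be tried out on a simplified case. The sketch below assumes two-pin nets whose pins all sit at distinct positions on the boundary of a single routing region, numbered in order around that boundary, with all wires routed inside the region. Under those assumptions two nets must cross exactly when their pin positions interleave. The function names are hypothetical, and real floorplans with multi-pin nets and several regions need a more general check.

```python
from itertools import combinations


def nets_cross(net_a, net_b):
    """True if two boundary-pin nets interleave and therefore must cross."""
    a1, a2 = sorted(net_a)
    b1, b2 = sorted(net_b)
    return a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2


def is_planar(nets):
    """True if no pair of two-pin nets is forced to cross."""
    return not any(nets_cross(p, q) for p, q in combinations(nets, 2))


nested = [(0, 3), (1, 2)]        # one net runs inside the other: no crossing
interleaved = [(0, 2), (1, 3)]   # pins alternate around the boundary: crossing
```

Here is_planar(nested) is True and is_planar(interleaved) is False. A check like this can be run on just the most important signals first, then repeated as less critical nets are added.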