Introduction to CMOS VLSI Design (E158)

Harris

Lecture 11: Decoders and Delay Estimation

David Harris

Harvey Mudd College

[email protected]

Based on EE271 developed by Mark Horowitz, Stanford University

MAH E158 Lecture 12 1


Decoders and Delay Estimation

Reading
W&E 4.5-4.6

Introduction
In the last lecture, we looked at memory design. Today we will look at various
methods for building decoders to drive the word lines and column multiplexer
circuitry.
To build a fast memory, we need to minimize the delay of the decoder. This
challenge will serve as a jumping off point for delay estimation and gate sizing
to minimize delay.



Peripheral Circuits

[Figure: memory peripheral circuits, with the decoder and the mux labeled]

We need to build the decoder and wordline drive circuits, and the column select
and bitline drive circuits. Both require a decoder: something to select the
correct line. Let's look at building decoders for CMOS memories.



Decoders

A decoder is just a structure that contains a number of AND gates, where each
gate is enabled for a different input value.

For an n-bit to 2^n decoder, we need to build 2^n n-input AND gates. We also want
these AND gates to lay out nicely (in a regular way).
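
As a rough illustration of the AND-gate view of a decoder, the following Python sketch models an n-bit to 2^n decoder behaviorally. The function name `decode` and the LSB-first bit ordering are assumptions made for this example, not part of the lecture.

```python
def decode(address_bits):
    """Behavioral model of an n-to-2^n decoder built from n-input AND gates.

    address_bits is a list of 0/1 values, least significant bit first
    (an assumed convention for this sketch).
    """
    n = len(address_bits)
    outputs = []
    for row in range(2 ** n):
        # The AND gate for `row` takes A_i where bit i of `row` is 1,
        # and the complement of A_i where bit i of `row` is 0.
        gate_inputs = [
            bit if (row >> i) & 1 else 1 - bit
            for i, bit in enumerate(address_bits)
        ]
        outputs.append(int(all(gate_inputs)))
    return outputs


# Address 5 = binary 101, written LSB first.
one_hot = decode([1, 0, 1])
```

Exactly one output is high for any address, which is the property the wordline drivers rely on.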



Large Fanin AND Gates

In CMOS, building this type of gate causes a problem, since large fanin implies a
series stack. We will see a little later in the notes that the best way to do this is to
use a two-level decoder by predecoding the inputs.

In nMOS the problem was easy: large fanin NOR gates work well, so a collection
of NOR gates solves the problem very nicely.



CMOS Decoders

In CMOS, a large fanin gate implies a series stack. So we need to build a decoder
that does not use a large fanin gate. But how? Use a 2-level decoder.
• An n-bit decoder requires 2n wires (~ marks the complement)
    A0, ~A0, A1, ~A1, …
    Each gate is an n-input NOR (or NAND) gate
• Could predecode the inputs
    Send ~A0·~A1, A0·~A1, ~A0·A1, A0·A1, ~A2·~A3, …
    Instead of A0, ~A0, A1, ~A1, …
    Maps 4 wires into 4 wires that need to go to the decoder
    Reduces the number of inputs to the decode gate by a factor of two.
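
The wire and fanin trade-off can be tabulated with a short sketch. The helper `decoder_costs` is hypothetical; it assumes the address bits are split into groups of `predecode_bits`, each group sending one wire per input combination, with `predecode_bits=1` standing for no predecode (true and complement of each bit).

```python
import math


def decoder_costs(n_bits, predecode_bits=1):
    """Return (address_wires, final_gate_fanin) for an n-bit decoder.

    predecode_bits=1 models no predecode: each bit sends its true and
    complement form, i.e. 2 wires per bit.
    """
    groups = math.ceil(n_bits / predecode_bits)
    wires = 0
    remaining = n_bits
    for _ in range(groups):
        size = min(predecode_bits, remaining)  # last group may be smaller
        wires += 2 ** size                     # one wire per combination
        remaining -= size
    # The final decode gate takes one wire from each group.
    return wires, groups


no_predecode = decoder_costs(6, 1)   # (12, 6)
two_bit = decoder_costs(6, 2)        # (12, 3)
three_bit = decoder_costs(6, 3)      # (16, 2)
```

For a 6-bit address, a 2-bit predecode keeps the wire count at 12 while halving the decode gate fanin from 6 to 3.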



Predecode Example

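[Figure: address wires feeding the decode gates. One panel, "2 Bit Predecode", shows wires carrying products of A0 and A1; the other, "No Predecode", shows wires carrying A0 and A1 in true and complement form.]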



Predecode

Predecode is just like what we did when we needed to make a single six-input
AND gate. We did it in a few levels:

[Figure: multi-level AND, with predecode gates feeding a decode gate]

One can do a 2-input predecode or a 3-input predecode:
• A 2-input predecoder generates 4 outputs
• A 3-input predecoder generates 8 outputs

The difference from standard logic is that we need to decode all possible inputs.
This means that each predecode gate can be reused by many ‘final’ decode
gates. A little planning can yield a regular layout.
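
The reuse of predecode outputs can be sketched behaviorally in Python. The names `predecode` and `predecoded_decode`, the LSB-first bit order, and the grouping of adjacent bits are assumptions for this example rather than details taken from the lecture.

```python
def predecode(bits):
    """One-hot lines for a small group of address bits (2 bits -> 4 lines)."""
    value = sum(b << i for i, b in enumerate(bits))
    return [int(value == v) for v in range(2 ** len(bits))]


def predecoded_decode(address_bits, group=2):
    """Two-level decoder: predecode each group, then AND one line per group."""
    n = len(address_bits)
    # Each group of predecode lines is computed once and shared by all rows.
    groups = [predecode(address_bits[i:i + group]) for i in range(0, n, group)]
    outputs = []
    for row in range(2 ** n):
        # The final gate for `row` picks the matching line from every group,
        # so its fanin is the number of groups rather than n.
        picked = [
            lines[(row >> (g * group)) & (2 ** group - 1)]
            for g, lines in enumerate(groups)
        ]
        outputs.append(int(all(picked)))
    return outputs


# Address 45 = binary 101101, written LSB first.
wordlines = predecoded_decode([1, 0, 1, 1, 0, 1])
selected = wordlines.index(1)
```

With a 6-bit address and 2-bit groups, each of the 64 final gates has only 3 inputs, and each of the 12 predecode lines is shared by 16 final gates.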



Predecode

A predecoded decoder:

[Figure: predecoded decoder with address inputs A0, A1, A2, A3, A4, A5]



Layout Issues

Often we need to build large array structures (for example, a large RAM),
so we want to lay out the decoder in as little space as possible. We need to find a
good way to lay out this structure.

Clearly we need to run the address lines through each decoder cell, and stack the
decoder cells next to each other.



Predecode Layout

The outputs of the predecode gates need to drive the address lines.
• These address lines are usually high capacitance
    So it is usually better to use a NAND with an inverter buffer as the
    predecode cell.
• Cells can be placed on top of the address lines, or to the left of the address
    lines.

[Figure: placement of the predecode cells and the decode cells]



Decoder Cell Layout

• Need to have n and p transistors
• Need to take up minimum space
• Want it to be easy to ‘program’ the cell
    While the layout is regular, each cell is different
    It connects to a different set of inputs
• Look at a couple of layout styles



Decoder Layout

Cell area is proportional to n^2. Decoder area is n^3.

[Figure: decoder cell layout crossed by the wires A0, A1, A2 (each in true and complement form), Gnd, and Vdd]

The problem with this layout is that most of the space is wasted. All of the area
under the wires is wasted. We should rotate the gate to fit under the wires.



A Slightly Better Decoder Layout

Better cell design (like we have talked about)

[Figure: two decoder cells with outputs Out1 and Out0, crossed by the wires A0, A1, A2 (each in true and complement form), Vdd, and Gnd]

In this layout, the basic cell remains unchanged; it is the wire contacts that are
programmed. This is sometimes a good idea, since it lets you optimize the decode
cell (in this case the 3-input gate).



A Smaller Layout

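Leave space for all the tracks in the cell

[Figure: compact decoder layout with address lines A0, A1, A2 (each in true and complement form) in M2/Poly, supply rails Vdd and Gnd, and outputs Out1 and Out0]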

Need to program the decoder by placing transistors, or metal.

With predecode, you have more tracks per transistor.



Wordline Driver

Decoder is just part of the wordline drive circuit
• Also need to qualify the wordline (AND with clock)
• Also need to buffer the signal to drive WL cap

Clock qualification can be done in the decoder
• A0 … An, Φ1 - just another input to the decoder
    Usually not a great idea, since this can lead to large skew
    Clock AND is usually done in last stage before driver
    can be large devices

[Figure: final stage combining decode_s1 with Φ1 to produce wordline_q1; or use a normal NAND gate]



Thin Drivers

Wordline pitch of a memory cell is not that tight (about 40λ), but not that large either.
There are some memories (ROMs, DRAMs) with much tighter pitch. For many of
these applications you need thin gates and drivers. The minimum useful space is
16λ.

[Figure: thin driver in a 16λ pitch, with In, Out, Gnd, and Vdd labeled. Annotations: "Decoder is here"; "Contacts can be shared".]

For the wordline driver, I might use two of these drivers in parallel, to reduce the
horizontal length (effectively folding the transistors again).



Putting it Together

Floorplan for a memory

[Figure: memory floorplan. Labeled blocks: Bit Line Precharge (Φ1), Predecoder, Row Decode, Memory Array, Mem Drv, Decoder, R/W, Column Mux (2:1 Mux), Bit IO, Address]

Built using array constructs
• Decoder base is often an array, with programming done by software
• Memory is built by arraying a cell that contains the cell and its mirror



Transistor Sizing

For memories (and other structures) you end up with long, high-capacitance wires
• Need to drive these large capacitors quickly, and this sets the device size
• We will look at a chain of inverters first, and then think about gates

Factors to consider in gate sizing:
• Need to think about the load you are driving
• Need to think about the load you present to your predecessor

Why transistor sizes matter when you are driving a large capacitance:

[Figure: minimum-size inverter (4λ:2λ) driving 2pF (10mm of metal2). Delay: 13ns falling, 26ns rising]
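
To make these numbers concrete, here is a back-of-the-envelope sketch that treats the driver as a resistor charging the load, so that delay is roughly R times C. This first-order model and the function names are assumptions for illustration; only the 2pF load and the 13ns and 26ns delays come from the slide.

```python
C_LOAD = 2e-12   # 2 pF wire load, from the slide
T_FALL = 13e-9   # falling delay of the minimum inverter, from the slide
T_RISE = 26e-9   # rising delay of the minimum inverter, from the slide


def effective_resistance(delay_s, cap_f):
    """Resistance implied by the assumed model delay = R * C."""
    return delay_s / cap_f


def delay_with_wider_driver(delay_s, width_multiple):
    """Assumes drive resistance scales as 1/width; ignores self-loading."""
    return delay_s / width_multiple


r_pulldown = effective_resistance(T_FALL, C_LOAD)   # about 6.5 kOhm
r_pullup = effective_resistance(T_RISE, C_LOAD)     # about 13 kOhm
rise_100x = delay_with_wider_driver(T_RISE, 100)    # about 0.26 ns
```

Under this model a driver 100 times wider would cut the rising delay to about 0.26ns, but it would also present about 100 times the input capacitance to whatever drives it.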



Buffer (or Gate) Sizing

But bigger gates have bigger input capacitance too:

[Figure: minimum-size inverter driving a large inverter (400-p, 200-n), which drives 2pF. First-stage delay = 4ns falling, 8ns rising; second-stage delay = 0.3ns]

Clearly we need to make the predriver larger too.

Is there an optimal solution? Yes, in a way
• Minimize delay of chain - for the minimum all delays will match (why?)

[Figure: inverter chain with relative sizes 1, f, f^2, f^3]

• Equalizing delay principle applies to any critical path through gates.
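
One way to see why the stage delays match at the optimum is a small numerical sketch. It assumes each stage's delay is proportional to its fanout (load capacitance over input capacitance) and ignores parasitic self-loading; the names `stage_ratio`, `chain_delay`, and `best_stage_count` are hypothetical.

```python
def stage_ratio(c_load, c_in, n_stages):
    """Per-stage size ratio f when all n stages share the same fanout."""
    return (c_load / c_in) ** (1.0 / n_stages)


def chain_delay(ratios, tau=1.0):
    """Total delay under the assumed model: each stage costs tau * fanout."""
    return tau * sum(ratios)


def best_stage_count(c_load, c_in, max_stages=20):
    """Stage count minimizing n * F^(1/n), where F = c_load / c_in."""
    total = c_load / c_in
    return min(range(1, max_stages + 1),
               key=lambda n: n * total ** (1.0 / n))


# Driving a load 64 times the input capacitance with three stages.
f = stage_ratio(64.0, 1.0, 3)             # about 4
equal = chain_delay([f, f, f])            # about 12
unequal = chain_delay([2.0, 4.0, 8.0])    # 14, same overall ratio of 64
stages = best_stage_count(64.0, 1.0)
```

With equal ratios the stages are sized 1, f, f^2, and so on. Any other split with the same overall ratio gives a larger sum, which follows from the arithmetic-geometric mean inequality.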


