Module-5 Memories Updated

Uploaded by

rashmi

VLSI Design 21ECE62

Module 5
Memory, Registers and Aspects of System Timing
System Timing Considerations:
1. A two-phase non-overlapping clock is assumed to be available, and this clock is used throughout the system.
2. The clock phases are ϕ1 and ϕ2, and ϕ1 is assumed to lead ϕ2.
3. Bits (data) to be stored are written to registers, storage elements and subsystems on ϕ1 of the clock, i.e., the WR (write) signal is ANDed with ϕ1.
4. Bits written into storage elements may be assumed to have settled before the ϕ2 signal that follows immediately, and ϕ2 may be used for refreshing the stored data.
5. Delays through data paths, combinational logic, etc. are assumed to be less than the interval between the leading edge of ϕ1 and the leading edge of the following ϕ2.
6. Bits or data may be read from storage elements on the next ϕ1, i.e., the RD (read) signal is ANDed with ϕ1. RD and WR must therefore be mutually exclusive.
7. A general requirement for system stability is that there must be at least one clocked storage element in series with every closed-loop signal path.
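
As a rough illustration of considerations 1 to 6, the sketch below models the two clock phases as sampled 0/1 waveforms and gates WR and RD request signals with ϕ1. The function names, the sample counts (`high`, `gap`) and the request pattern are assumptions made for illustration; they are not part of the original notes.

```python
def two_phase_clock(cycles, high=3, gap=1):
    """Return (phi1, phi2) as lists of 0/1 samples with a gap between phases."""
    phi1, phi2 = [], []
    for _ in range(cycles):
        # One cycle: phi1 high, gap, phi2 high, gap.
        phi1 += [1] * high + [0] * gap + [0] * high + [0] * gap
        phi2 += [0] * high + [0] * gap + [1] * high + [0] * gap
    return phi1, phi2


def gated(control, phase):
    """AND a control signal (WR or RD) with a clock phase, sample by sample."""
    return [c & p for c, p in zip(control, phase)]


phi1, phi2 = two_phase_clock(cycles=2)

n = len(phi1)
wr = [1] * (n // 2) + [0] * (n // 2)   # hypothetical: write during the first cycle
rd = [1 - w for w in wr]               # hypothetical: read during the second cycle

wr_phi1 = gated(wr, phi1)              # WR.phi1 strobes the write
rd_phi1 = gated(rd, phi1)              # RD.phi1 strobes the read on the next phi1
```

Because both strobes are derived from ϕ1, keeping WR and RD mutually exclusive is what prevents a cell from being written and read in the same phase.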

Some commonly used storage/memory elements:


The storage elements are compared on three parameters:
1. Area requirement
2. Estimated dissipation per bit stored.
3. Volatility

The Dynamic Shift Register Stage

Dept. of ECE, GAT


Dissipation:

Volatility:

A three-transistor Dynamic RAM Cell (3T DRAM Cell)

• This cell uses one transistor (T2) to store the data and one pass transistor each for the WR and RD access switches (T1 and T3).
• The arrangement has a pull-up network in either CMOS or nMOS technology, with the RD/WR circuit acting as the pull-down network.
• The binary data is stored as charge on the gate capacitance of T2; RD and WR are the control lines.
• T1 together with T2 is used for writing the data, and T3 together with T2 is used for reading it. Data is written to and read from point I.
• T2 is the storage transistor; T1 and T3 are pass transistors that act as access switches, controlled by WR and RD, for the write and read operations.

Write Operation
• The WR and RD signals are mutually exclusive, i.e., they are never asserted at the same time.
• For a write, WR = 1 and RD = 0. Because WR = 1, T1 is ON; because RD = 0, T3 is OFF, so the read path through T2 and T3 is disabled.
• If the data bit on the bus is 1, the pass transistor T1 passes (VDD – Vth) to the gate of T2, and the capacitance at I is charged to this potential.
• If the data bit is 0, T1 passes the 0 and the charge stored at I is 0.
• After the data has been stored at I, the WR signal is returned to 0.
Read Operation
• For a read, RD = 1 and WR = 0.
• As WR = 0, T1 is OFF, and as RD = 1, T3 is ON.
• T2 is ON or OFF depending on the charge stored at I (the gate capacitance of T2).
• If logic 1 is stored at I, T2 is ON. With both T3 and T2 ON, the bus has a discharge path and is pulled down to ground.
• If logic 0 is stored at I, T2 is OFF. The bus has no discharge path and remains at logic 1.
Note: The complement of the stored bit is read on the data bus.
• In a DRAM, a sense amplifier is connected to the bus. If the bus begins to fall from 1 towards 0, the sense amplifier outputs logic 1. If the bus does not change, the sense amplifier outputs logic 0.
• The mask layout for the three-transistor memory cell is shown below.
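
The write and read behaviour described above can be summarised in a small behavioural sketch. This is a logic-level model only (no threshold drop, leakage or timing), and the class and method names are hypothetical.

```python
class ThreeTransistorCell:
    """Logic-level model of the 3T DRAM cell (T1 = write switch,
    T2 = storage transistor, T3 = read switch)."""

    def __init__(self):
        self.node_i = 0  # logic level held on the gate capacitance of T2

    def write(self, bus_bit):
        # WR = 1, RD = 0: T1 conducts and the bus value is stored at node I.
        self.node_i = bus_bit

    def read_bus(self):
        # RD = 1, WR = 0: the bus starts at logic 1 (pulled up) and T3 conducts.
        bus = 1
        if self.node_i == 1:
            bus = 0  # T2 and T3 are both ON, so the bus is pulled to ground
        return bus   # this is the complement of the stored bit

    def read(self):
        # The sense amplifier restores the true value from the bus level.
        return 1 - self.read_bus()


cell = ThreeTransistorCell()
cell.write(1)
bus_level = cell.read_bus()   # complement of the stored bit appears on the bus
data_out = cell.read()        # sense amplifier output
```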

One-Transistor Dynamic Memory Cell:


• This approach reduces the area per bit.
• The cell consists of a storage capacitor Cm and a pass transistor (M1). The circuit arrangement and stick diagram are shown in Fig. a and b.

Write operation
• The capacitor Cm is charged when Read/Write = 1 and Row Select = 1.
• If the Read/Write line is driven with logic 1, Cm is charged to logic 1; if the line is driven with logic 0, the charge stored is logic 0.
Read operation
• If logic 0 is stored in Cm, then when the Row Select line is high M1 is ON. The sense amplifier on the bit line senses this and gives logic 0 at its output.
• If logic 1 is stored in Cm, then when the Row Select line is high M1 is ON and the stored charge begins to discharge, since a path now exists. The sense amplifier senses this and gives logic 1 at its output.
• The area occupied by a 1T DRAM cell is that of a single transistor and a capacitor.
• The larger the value of Cm, the longer the charge is retained, so Cm should be large. A larger Cm, however, consumes more area.
• The capacitor can, however, be built as part of the transistor structure.
• Cm can be fabricated by extending and enlarging the diffusion area that forms the source of the pass transistor. The capacitance used is then that between the n-diffusion and the p-substrate.
• This capacitance is very small compared with gate capacitance.
• A large area is therefore required to obtain a higher Cm.
• An alternative is to place a polysilicon plate over the diffusion area. This forms a three-plate capacitor structure in which the polysilicon plate is connected to VDD, as shown in Fig. c.
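
To see why a larger Cm helps the sense amplifier, the following sketch estimates the bit-line voltage change when Cm shares its charge with the bit-line capacitance during a read. The supply voltage, the mid-level precharge and the capacitance values are all assumed for illustration and do not come from the notes; the threshold drop through M1 is ignored.

```python
def bit_line_voltage(v_cell, v_pre, c_m, c_bit):
    """Voltage after Cm and the bit-line capacitance share their charge."""
    return (c_m * v_cell + c_bit * v_pre) / (c_m + c_bit)


VDD = 5.0          # assumed supply voltage (V)
V_PRE = VDD / 2    # assumed bit-line precharge level (V)
C_BIT = 450e-15    # assumed bit-line capacitance (F)


def read_swing(stored_bit, c_m):
    """Change in bit-line voltage seen by the sense amplifier."""
    v_cell = VDD if stored_bit else 0.0
    return bit_line_voltage(v_cell, V_PRE, c_m, C_BIT) - V_PRE


small = read_swing(1, c_m=50e-15)    # small storage capacitor
large = read_swing(1, c_m=150e-15)   # three times larger Cm
zero = read_swing(0, c_m=50e-15)     # stored 0 pulls the line down instead
```

Under these assumptions the swing grows with Cm, which is the trade-off against area noted in the bullets above.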

A Pseudo-static RAM/register cell:


This memory cell combines the high storage capability of DRAM with the ease of use of SRAM.
It can be used like an SRAM, since no external refresh circuit is required, while internally it is a dynamic cell with built-in refresh logic.
It behaves as a static storage cell that holds data indefinitely. This is achieved by storing the bit in two inverters with feedback; the feedback refreshes the data in every clock cycle.
Care must be taken not to allow read/write operations during the internal refresh.
The circuit arrangement is shown in the following figure.
ϕ1 and ϕ2 are mutually exclusive clock signals, and the WR and RD signals coincide with ϕ1.
When ϕ1 is high and WR = 1, transistor T1 is ON and the data is stored as charge on Cg (the input gate capacitance) of the inverter. This is the write operation.
When ϕ1 is high and RD = 1, the read pass transistor is ON and the data stored at the inverter stage, and also its complement, is made available at the output. This is the read operation.
When ϕ2 = 1, T3 is ON. The output is fed back and the bit is refreshed (read and stored back). The gated feedback path from the output of T2 is fed to the input of T1.
The bit is held as long as ϕ2 recurs at intervals shorter than the decay time of the stored charge.

Note:
WR and RD must be mutually exclusive, but both must coincide with ϕ1.
The cell must not be read while it is being refreshed, i.e., during ϕ2. If an attempt is made to read the cell data onto the bus, charge sharing between the bus and Cg (the input gate capacitance) may destroy the stored bit.
Other bus lines should be allowed to run through the cells so that register and memory arrays can be configured easily.
The pseudo-static memory cell can also be implemented using transmission gates (TG), as seen in the Fig. [replace the nMOS transistors with TGs].
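
The condition that ϕ2 must recur before the stored charge decays can be illustrated with a toy numerical model. The decay factor, supply voltage and switching threshold below are assumed values chosen only to make the effect visible, and `hold_bit` is a hypothetical helper.

```python
def hold_bit(steps, refresh_every, decay=0.9, vdd=5.0, threshold=2.5):
    """Return the logic value read back after `steps` time steps.

    A logic 1 is written at t = 0. The voltage on Cg leaks by `decay` per
    step; every `refresh_every` steps (phi2) the inverter pair restores it
    to a full logic level. refresh_every = 0 means no refresh at all.
    """
    v = vdd
    for t in range(1, steps + 1):
        v *= decay                       # leakage between clock phases
        if refresh_every and t % refresh_every == 0:
            # phi2: the two inverters regenerate the level and feed it back
            v = vdd if v > threshold else 0.0
    return 1 if v > threshold else 0


frequent = hold_bit(50, refresh_every=2)    # refresh well within decay time
too_slow = hold_bit(50, refresh_every=10)   # charge decays before refresh
never = hold_bit(50, refresh_every=0)       # purely dynamic storage, no refresh
```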

Four-Transistor Dynamic and Six-Transistor CMOS memory cell:
The cells here include both n-type and p-type transistors and are intended for CMOS systems.
Both the dynamic and the static elements use a two-bus-per-bit arrangement, so that each bit is available in both normal and complement form on the bit and bit’ buses.
Prior to a read or write operation, the buses are pre-charged to VDD (logic 1).
Figure (a) gives the arrangement of a four-transistor dynamic cell for storing one bit. Each bit is stored on the gate capacitances of the two n-type transistors T1 and T2; the write and read operations are described below.
4 Transistor Dynamic memory cell:

Write operation:
• Before writing to the memory, the bit and bit’ lines are pre-charged to logic 1 using pMOS transistors T5 and T6, in coincidence with clock signal ϕ1.
• Next, the appropriate column is selected in coincidence with clock signal ϕ2.
• Depending on the data on the bus, either bit or bit’ is discharged.
• During the same clock signal ϕ2, the row select line is activated, turning on transistors T3 and T4.
• The values on bit and bit’ are thus written via T3 and T4 and stored at T2 and T1 on the gate capacitances Cg2 and Cg1 respectively.
• The way in which T2 and T1 are cross-connected always gives complementary states when the row select line is activated. When the row line is deactivated, the stored data remains for as long as the gate capacitances can hold their charge.
• For refreshing, a sense amplifier is provided, which restores the data so that it is held permanently.
Read operation:
• Before reading, the bit and bit’ lines are again pre-charged to VDD using transistors T5 and T6.
• Suppose logic 1 is stored in the memory element, i.e., logic 1 is held at the gate of T2 (and hence logic 0 at the gate of T1).
• When the column and row lines are selected, T3 and T4 turn ON.
• As logic 1 is present at the gate of T2, T2 is ON and T1 is OFF. Thus T3 = ON, T1 = OFF, T4 = ON, T2 = ON. Under this condition the bit’ line, which was pre-charged to VDD, has a path to discharge to VSS. Hence bit’ = 0 and bit = 1, as shown in the Fig.
• The sense amplifier senses this voltage change on the bit’ line and outputs the data on the bus line. bit = 1 and bit’ = 0 represent the data in the memory.
• The sense amplifier is formed from an arrangement of transistors T1, T2, T3 and T4, which forms a flip-flop circuit.
• If the sense line is inactive, the bit line state is reflected on the gate capacitances of T1 and T3 with respect to VDD. This causes one of the transistors to turn ON and the other to turn OFF.
• When sense is enabled, current flows from VDD through the ON transistor and helps to maintain the state of the bit line.
• The sense amplifier performs two functions:
1. It rewrites the data after reading, i.e., it refreshes the memory cell so that the cell holds the data without signal degradation.
2. It predetermines the state of the data lines.
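
The read sequence above (precharge both lines, select the row, let one line discharge) can be captured in a few lines. This is a logic-level sketch only; the function name and the convention that the stored bit equals the level on the gate of T2 follow the description above but are otherwise assumptions.

```python
def read_four_transistor_cell(stored_bit):
    """Return (bit, bit_bar) after a read of the 4T dynamic cell."""
    # phi1: both bit lines are precharged to VDD through T5 and T6.
    bit, bit_bar = 1, 1

    # The cross-coupled pair holds complementary gate voltages.
    t2_on = stored_bit == 1
    t1_on = not t2_on

    # Row select turns T3 and T4 ON; the line on the side of the ON
    # storage transistor finds a path to VSS and discharges.
    if t2_on:
        bit_bar = 0
    if t1_on:
        bit = 0
    return bit, bit_bar


lines_for_one = read_four_transistor_cell(1)
lines_for_zero = read_four_transistor_cell(0)
```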

Fig. shows Read operation in the memory cell and in the sense amplifier

Six Transistor Static RAM cell:

Fig. six transistor static RAM cell with sense amplifier

• The figure shows the 6T SRAM cell, obtained by adapting the dynamic cell and modifying it to form a static memory cell.
• It includes two additional transistors per stored bit and is therefore called a six-transistor cell. Transistors T5 and T6 act as access switches for the memory element, which is formed by connecting two inverters back to back (i.e., the output of each is connected to the input of the other).
• As in the 4T dynamic RAM, the information is stored in the memory cell. The cell is connected so that it gives complementary states when the row select line is activated. When the row line is deactivated, the stored data remains in the memory cell.
The Fig. below shows the dynamic and static RAM cells together, as the sense amplifier is the same for both memory cells.

JK flip flop:
• It is a memory element and is the most widely used arrangement for a static memory element.
• Other flip-flop arrangements, such as the T and D flip-flops, can be obtained from the JK.
• The flip-flop has clocked inputs J and K along with an asynchronous clear, and has outputs Q and Q’.
• The inputs J and K are read while the clock is high, and the data is passed to the output on the falling edge of the clock.
Note: Here the JK is implemented in a master-slave configuration in order to avoid the race-around condition.
Edge-triggered circuits are conveniently designed with an ASM (algorithmic state machine) approach, which yields the design equations for a JK flip-flop, as in the following figure.

• It should be noted that the flip-flop is assumed to have an asynchronous clear (Clr) input as well as the clocked J and K inputs, that J and K are read in during the Hi level of the clock ɸ, and that the data thus read is transferred to the output on the falling edge of ɸ.
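
A behavioural sketch of this clocking scheme is given below: the master stage follows J and K while the clock is high, and the slave (output) stage updates on the falling edge. It uses the standard JK characteristic equation Q+ = J·Q’ + K’·Q. The class and helper names are hypothetical, and the model is idealised (no gate delays).

```python
class MasterSlaveJK:
    """Idealised master-slave JK flip-flop with asynchronous clear."""

    def __init__(self):
        self.master = 0
        self.q = 0
        self.prev_clk = 0

    def step(self, j, k, clk, clr=0):
        if clr:
            # Asynchronous clear overrides the clocked inputs.
            self.master = self.q = 0
        elif clk == 1:
            # Clock high: master evaluates Q+ = J.Q' + K'.Q; Q itself is
            # unchanged during this phase, which avoids race-around.
            self.master = (j & (1 - self.q)) | ((1 - k) & self.q)
        elif self.prev_clk == 1:
            # Falling edge: slave copies the master to the output.
            self.q = self.master
        self.prev_clk = clk
        return self.q


def clock_pulse(ff, j, k):
    """Apply one full clock pulse (high, then low) and return Q."""
    ff.step(j, k, clk=1)
    return ff.step(j, k, clk=0)


ff = MasterSlaveJK()
history = [
    clock_pulse(ff, 1, 0),  # set
    clock_pulse(ff, 0, 0),  # hold
    clock_pulse(ff, 1, 1),  # toggle
    clock_pulse(ff, 1, 1),  # toggle
    clock_pulse(ff, 0, 1),  # reset
]
```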

Logic gate Implementations:

➢ The implementations are based on NAND, NOR or switch logic.
➢ The expressions for A and B are readily realized in NAND or NOR logic, as shown in Figure 9.10, and it will be seen that a master/slave arrangement applies in each case.
➢ However, an initial consideration of each arrangement will reveal that, for
nMOS, the NAND arrangement is impractical, owing to the relatively large
number of gates requiring three or more inputs which will therefore be
inherently large in area and slow in performance.
➢ The obvious nMOS alternative is a NOR gate arrangement which is a practical
proposition and can be readily implemented.

Switch logic and inverter implementation:

➢ Pass transistors must not be used to drive the gates of other pass transistors.
➢ The logic 0 as well as the logic 1 transmission conditions must be deliberately satisfied. Thus, we need to implement the expressions for A’ (A bar) and B’ (B bar) as well as the expressions for A and B.
➢ The resulting arrangement is given in Figure 9.11 and is a realization of the JK flip-flop based on n-pass transistor logic and inverters only.

Figure 9.11: Switch logic implementation of JK Flip flop

D Flip-Flop Circuit:

➢ A D flip-flop is readily formed from a JK flip-flop by renaming the J input D and then replacing the connections to K by D’ (see Figure 9.10).
➢ Similarly, a T (toggle) flip-flop is formed from the JK by making J = K = E, where E is the toggle enabling input.
➢ It should also be noted that the arrangements given may be simplified by omitting the Clr input, or that a Preset input can be substituted for or added to the Clr input if required.
➢ Furthermore, the way in which clock activation takes place may be modified by reshaping the requirements in the ASM chart of Figure 9.9.
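
The two derivations can be checked directly against the JK characteristic equation Q+ = J·Q’ + K’·Q. The sketch below works at the next-state level only; the function names are hypothetical.

```python
def jk_next(q, j, k):
    """JK characteristic equation: Q+ = J.Q' + K'.Q"""
    return (j & (1 - q)) | ((1 - k) & q)


def d_next(q, d):
    """D flip-flop: J = D and K = D'."""
    return jk_next(q, j=d, k=1 - d)


def t_next(q, e):
    """T flip-flop: J = K = E (toggle enable)."""
    return jk_next(q, j=e, k=e)


# Exhaustive check over the present state and the single input.
d_table = {(q, d): d_next(q, d) for q in (0, 1) for d in (0, 1)}
t_table = {(q, e): t_next(q, e) for q in (0, 1) for e in (0, 1)}
```

With these substitutions the D version always loads its input, and the T version toggles only when E = 1.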

Module 5: Design for Testability (Part 2)


Syllabus: Design for Testability: Ad hoc testing, Scan Design, Built-in Self-Test (BIST), IDDQ Testing, Design for Manufacturability. (Text-1)
The keys to designing circuits that are testable are controllability and observability.
Controllability is the ability to set (to 1) and reset (to 0) every node internal to the circuit.
Observability is the ability to observe, either directly or indirectly, the state of any node in the circuit.
Good observability and controllability reduce the cost of manufacturing testing because they allow high fault coverage with relatively few test vectors.
There are three main approaches to what is commonly called Design for Testability (DFT). These may be categorized as follows:
▪ Ad hoc testing
▪ Scan-based approaches
▪ Built-in self-test (BIST)

Ad hoc Testing:
Ad hoc test techniques, as their name suggests, are collections of ideas aimed at reducing the combinational explosion of testing. They are only useful for small designs where scan, ATPG, and BIST are not available.
Some of the common techniques for ad hoc testing are:
✓ Partitioning large sequential circuits
✓ Adding test points
✓ Adding multiplexers
✓ Providing for easy state reset
Some examples follow. Multiplexers can be used to provide alternative signal paths during testing; in CMOS, transmission gate multiplexers provide low area and delay overhead. In a bus-oriented system, the bus itself can be used for test purposes: each register is made loadable from the bus and capable of being driven onto the bus, so that internal logic values can be enabled onto the data bus for testing.

Any design should always have a method of resetting the internal state of the chip within a single cycle or at most a few cycles. Apart from making testing easier, this also makes simulation faster, as only a few cycles are required to initialize the chip. In general, ad hoc testing techniques represent a bag of tricks.

Scan Design:
• The scan-design strategy for testing has evolved to provide observability and controllability at each register.
• In designs with scan, the registers operate in one of two modes.
• In normal mode, they behave as expected.
• In scan mode, they are connected to form a giant shift register, called a scan chain, spanning the whole chip.
• By applying N clock pulses in scan mode, all N bits of state in the system can be shifted out and N new bits of state can be shifted in. Scan mode thus gives easy observability and controllability of every register in the system.
• Modern scan is based on the use of scan registers, as shown in Fig. The scan register is a D flip-flop preceded by a multiplexer. When the SCAN signal is deasserted (0), the register behaves as a conventional register, storing the data on the D input. When SCAN is asserted (1), the data is loaded from the SI pin, which is connected in shift-register fashion to the Q output of the previous register in the scan chain.
• To load the scan chain, SCAN is asserted and 8 CLK pulses are applied to load the first two ranks of 4-bit registers with data. SCAN is then deasserted and CLK is pulsed for one cycle to operate the circuit normally with the predefined inputs. SCAN is then reasserted and CLK is pulsed eight times to read the stored data out. At the same time, the new register contents can be shifted in for the next test.
• Testing proceeds in this manner: data is serially clocked through the scan register to the right point in the circuit, a single system clock cycle is run, and the data is serially clocked out for observation. In this scheme, every input to the combinational block can be controlled and every output can be observed.

• Test generation for this type of test architecture can be highly automated.
• The prime disadvantage is the area and delay impact of the extra multiplexer in the scan register.
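
The load, capture and unload sequence can be sketched as follows. The chain length of 4, the helper names and the stand-in combinational logic (a bitwise inverter) are assumptions chosen to keep the example short; they do not correspond to the 8-bit arrangement in the figure.

```python
def scan_shift(chain, serial_in):
    """SCAN = 1: clock once per bit of serial_in.

    Returns (new_chain, shifted_out). chain[0] is fed by SI and the Q of
    the last register, chain[-1], drives SO.
    """
    chain = list(chain)
    shifted_out = []
    for bit in serial_in:
        shifted_out.append(chain[-1])   # SO observed before the clock edge
        chain = [bit] + chain[:-1]      # each register loads its predecessor
    return chain, shifted_out


def capture(chain, logic):
    """SCAN = 0: one normal clock; registers load the logic outputs."""
    return logic(chain)


def invert_all(bits):
    """Stand-in combinational block (assumed for illustration)."""
    return [1 - b for b in bits]


chain = [0, 0, 0, 0]
chain, _ = scan_shift(chain, [1, 0, 1, 1])          # load the test state
loaded = list(chain)
chain = capture(chain, invert_all)                  # one system clock cycle
chain, response = scan_shift(chain, [0, 0, 0, 0])   # unload; next pattern loads
```

Note that the first bit shifted in ends up in the last register of the chain, so the response emerges last register first.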

Parallel Scan:
Serial scan chains can become quite long, and loading and unloading them can dominate testing time. A simple solution is to split the chains into smaller segments. This can be done on a module-by-module basis or completed automatically to some specified scan length. Extending this idea to its limit gives a method called ‘Random Access Scan’.

Fig shows a two-by-two register section. Each register receives a column (column<m>) and row (row<n>) access signal along with a row data line (data<n>). A global write signal (write) is connected to all registers. By asserting the row and column access signals in conjunction with the write signal, any register can be read or written.

Built–In Self-Test (BIST):


Built-in test techniques, as their name suggests, rely on augmenting circuits with additional logic so that they can perform operations upon themselves that prove correct operation. These techniques add area to the chip for the test logic, but reduce the test time required and thus can lower the overall system cost.

The structure of BIST is shown below.


[Figure: BIST structure, with a Test Pattern generator driving the Circuit Under Test and the Output Response being observed]
• One method of testing a module is to use ‘signature analysis’ or ‘cyclic redundancy checking’. This involves using a pseudo-random sequence generator (PRSG) to produce the input signals for a section of combinational circuitry and a signature analyzer to observe the output signals.
• A PRSG of length n is constructed from a linear feedback shift register (LFSR), which in turn is made of n flip-flops connected in serial fashion.
• The XOR of particular outputs is fed back to the input of the LFSR. An n-bit LFSR cycles through 2^n – 1 states before repeating the sequence. One limitation is that it cannot generate the all-0s pattern.

• A complete feedback shift register (CFSR), shown in Fig, includes the zero state that may be required in some test situations. An n-bit LFSR is converted to an n-bit CFSR by adding an (n – 1)-input NOR gate connected to all but the last bit. When in state 0…01, the next state is 0…00.
• A signature analyzer receives successive outputs of a combinational logic block and produces a syndrome that is a function of these outputs. The syndrome is reset to 0 and then XORed with the output on each cycle.
• The syndrome is also scrambled (shifted through the LFSR) in each cycle so that a fault in one bit is unlikely to cancel itself out. At the end of a test sequence, the LFSR contains the syndrome, which is a function of all previous outputs. This can be compared with the correct syndrome (derived by running a test program on the good logic) to determine whether the circuit is good or bad.
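
The PRSG, CFSR and signature analyzer described above can be sketched for n = 3. The tap positions (XOR of the last two bits), the seed and the example output vectors are assumptions chosen for illustration; the figure in the notes may use a different tap arrangement.

```python
def lfsr_step(state):
    """3-bit LFSR: shift by one place, feeding back the XOR of the last two bits."""
    q0, q1, q2 = state
    return (q1 ^ q2, q0, q1)


def cfsr_step(state):
    """CFSR: XOR the LFSR feedback with the NOR of all bits except the last."""
    q0, q1, q2 = state
    nor = 1 if (q0 | q1) == 0 else 0
    return (q1 ^ q2 ^ nor, q0, q1)


def cycle(step, seed):
    """List the states visited from `seed` until the sequence repeats."""
    seq, s = [], seed
    while s not in seq:
        seq.append(s)
        s = step(s)
    return seq


def signature(outputs):
    """Fold a list of 3-bit output vectors into a 3-bit syndrome."""
    syn = (0, 0, 0)                      # syndrome reset to 0
    for out in outputs:
        shifted = lfsr_step(syn)         # scramble the running syndrome
        syn = tuple(s ^ o for s, o in zip(shifted, out))
    return syn


prsg_states = cycle(lfsr_step, (1, 1, 1))   # 2**3 - 1 states, never 000
cfsr_states = cycle(cfsr_step, (1, 1, 1))   # all 2**3 states

good = [(1, 0, 1), (0, 1, 1), (1, 1, 0), (0, 0, 1)]     # hypothetical outputs
faulty = [(1, 0, 1), (0, 0, 1), (1, 1, 0), (0, 0, 1)]   # one bit flipped
circuit_ok = signature(faulty) == signature(good)
```

Because the LFSR step is a one-to-one mapping, a single flipped output bit always changes the final syndrome in this model; multiple errors can still alias to the correct syndrome.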

BILBO – Built-In Logic Block Observation:


• Combining signature analysis with the scan technique gives the BILBO register.
• The 3-bit BIST register shown in Fig is a scannable, resettable register that can also serve as a pattern generator and signature analyzer.
• This structure can operate in different modes, as shown in the table below.

C[1]  C[0]  Mode
0     0     Scan
0     1     Test
1     0     Reset
1     1     Normal
• In the reset mode (10), all the flip-flops are synchronously initialized to 0. In
normal mode (11), the flip-flops behave normally with their D input and Q
output. In scan mode (00), the flip-flops are configured as a 3-bit shift register
between SI and SO. In test mode (01), the register behaves as a pseudo-random
sequence generator or signature analyzer.

• In summary, BIST is performed by first resetting the syndrome in the output register. Then both registers are placed in test mode to produce the pseudo-random inputs and calculate the syndrome. Finally, the syndrome is shifted out through the scan chain.
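
The four modes in the table can be sketched as a next-state function for a 3-bit register. The feedback taps and the way the D inputs are XORed in during test mode are assumptions made for illustration, since the figure is not reproduced here; `bilbo_step` is a hypothetical name.

```python
def bilbo_step(state, c1, c0, d=(0, 0, 0), si=0):
    """Return the next state of a 3-bit BILBO-style register.

    state and d are (q0, q1, q2) tuples; SO is taken from q2.
    """
    q0, q1, q2 = state
    if (c1, c0) == (1, 0):          # reset: synchronous clear
        return (0, 0, 0)
    if (c1, c0) == (1, 1):          # normal: load the D inputs
        return tuple(d)
    if (c1, c0) == (0, 0):          # scan: shift from SI towards SO
        return (si, q0, q1)
    # test (0, 1): LFSR shift XORed with D. With D held at 0 this is a
    # pattern generator; with D driven by logic outputs it accumulates
    # a signature.
    shifted = (q1 ^ q2, q0, q1)
    return tuple(s ^ x for s, x in zip(shifted, d))


after_reset = bilbo_step((1, 0, 1), 1, 0)
after_load = bilbo_step(after_reset, 1, 1, d=(1, 1, 1))
as_prsg = bilbo_step(after_load, 0, 1)              # D = 0: next pattern
as_signature = bilbo_step(after_reset, 0, 1, d=(0, 1, 0))
after_scan = bilbo_step(as_prsg, 0, 0, si=1)
```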

Memory BIST:
On many chips, memories account for the majority of the transistors. A robust testing methodology must be applied to provide reliable parts. In a typical MBIST scheme, multiplexers are placed on the address, data, and control inputs of the memory to allow direct access during test. During testing, a state machine uses these multiplexers to directly write a checkerboard pattern of alternating 1s and 0s. The data is read back and checked; then the inverse pattern is applied and checked in the same way. ROM testing is even simpler: the contents are read out to a signature analyzer to produce a syndrome.
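
The checkerboard procedure can be sketched against a toy memory model. The `Memory` class, its `stuck` fault-injection dictionary and the `mbist` function are hypothetical and stand in for the on-chip state machine and multiplexers.

```python
class Memory:
    """Toy bit-addressable memory; `stuck` maps (row, col) to a stuck-at value."""

    def __init__(self, rows, cols, stuck=None):
        self.rows, self.cols = rows, cols
        self.cells = [[0] * cols for _ in range(rows)]
        self.stuck = stuck or {}

    def write(self, r, c, bit):
        # A stuck cell ignores the written value.
        self.cells[r][c] = self.stuck.get((r, c), bit)

    def read(self, r, c):
        return self.cells[r][c]


def mbist(mem):
    """Write and read back a checkerboard, then its inverse.

    Returns the first failing (row, col), or None if the memory passes.
    """
    for invert in (0, 1):
        for r in range(mem.rows):
            for c in range(mem.cols):
                mem.write(r, c, (r + c + invert) % 2)
        for r in range(mem.rows):
            for c in range(mem.cols):
                if mem.read(r, c) != (r + c + invert) % 2:
                    return (r, c)
    return None


good_result = mbist(Memory(4, 4))
stuck0_result = mbist(Memory(4, 4, stuck={(1, 2): 0}))
stuck1_result = mbist(Memory(4, 4, stuck={(1, 2): 1}))
```

Applying both the pattern and its inverse is what allows a cell stuck at either value to be caught.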

IDDQ Testing (quiescent VDD supply current monitoring):
IDDQ testing is a method of testing for bridging faults.
• When a CMOS logic gate is not switching, it draws no DC current (except for leakage).
• When a bridging fault occurs, then for some combination of input conditions a measurable DC IDD will flow.
• Testing consists of applying the normal vectors, allowing the signals to settle, and then measuring IDD.
• As current measurement is slow, the tests must be run more slowly than normal (of the order of 1 ms per vector), which increases testing time.
• IDDQ testing can be performed externally to the chip by measuring the current drawn on the VDD line, or internally using specially constructed test circuits.
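
A minimal sketch of the pass/fail decision and the resulting test time is shown below. The current readings, the 50 µA limit and the vector count are made-up numbers for illustration only; the 1 ms per vector figure is the order of magnitude quoted above.

```python
def iddq_failures(currents_amps, limit_amps):
    """Return the indices of vectors whose quiescent current exceeds the limit."""
    return [i for i, amps in enumerate(currents_amps) if amps > limit_amps]


def iddq_duration(n_vectors, seconds_per_vector=1e-3):
    """Total measurement time at roughly 1 ms per vector."""
    return n_vectors * seconds_per_vector


# Hypothetical IDD readings after each vector has settled (amperes).
readings = [2e-6, 3e-6, 450e-6, 2.5e-6]
failing = iddq_failures(readings, limit_amps=50e-6)
seconds = iddq_duration(10_000)
```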

Note: (Only for reference)


• What are the different types of faults in VLSI?
• Common fault models:
  • Single stuck-at faults
  • Transistor open and short faults
  • Memory faults
  • PLA faults (stuck-at, cross-point, bridging)
  • Functional faults (processors)
  • Delay faults (transition, path)
  • Analog faults
• A bridging fault consists of two signals that are connected when they should not be.
• Bridging to VDD or VSS is equivalent to the stuck-at fault model. Traditionally, bridged signals were modeled as the logic AND or OR of the signals. If one driver dominates the other in a bridging situation, the dominant driver forces its logic value onto the other; in that case a dominant bridging fault model is used.
Design for Manufacturability
Circuits can be optimized for manufacturability to increase their yield. This can be
done in a number of different ways.
Physical: At the physical level (i.e., mask level), the yield and hence manufacturability
can be improved by reducing the effect of process defects. The design rules for
particular processes will frequently have guidelines for improving yield. The following
list is representative:
• Increase the spacing between wires where possible––this reduces the chance of a
defect causing a short circuit.
• Increase the overlap of layers around contacts and vias––this reduces the chance
that a misalignment will cause an aberration in the contact structure.
• Increase the number of vias at wire intersections beyond one if possible––this
reduces the chance of a defect causing an open circuit.
Increasingly, design tools are dealing with these kinds of optimizations automatically.

Redundancy: Redundant structures can be used to compensate for defective components on a chip. For example, memory arrays are commonly built with extra
rows. During manufacturing test, if one of the words is found to be defective, the
memory can be reconfigured to access the spare row instead. Laser-cut wires or
electrically programmable fuses can be used for configuration. Similarly, if the
memory has many banks and one or more are found to be defective, they can be
disabled, possibly even under software control.

Power: Elevated power can cause failure due to excess current in wires, which in turn
can cause metal migration failures. In addition, high-power devices raise the die
temperature, degrading device performance and, over time, causing device parameter
shifts. The method of dealing with this component of manufacturability is to minimize
power through design techniques described elsewhere in this text. In addition, a
suitable package and heat sink should be chosen to remove excess heat.

Process Spread: We have seen that process simulations can be carried out at different
process corners. Monte Carlo analysis can provide better modeling for process spread
and can help with centering a design within the process variations.

Yield Analysis: When a chip has poor yield or will be manufactured in high volume, dice that fail manufacturing test can be taken to a laboratory for yield analysis to locate the root cause of the failure. If particular structures are determined to have caused many of the failures, the layout of the structures can be redesigned. For
example, during volume production ramp-up for the Pentium microprocessor, the
silicide over long thin polysilicon lines was found to crack and raise the wire
resistance. This in turn led to slower-than-expected operation for the cracked chips.
The layout was modified to widen polysilicon wires or strap them with metal
wherever possible, boosting the yield at higher frequencies.
