Unit-III Sequential Logic Circuits
2 cascaded inverters
Static Latches and Registers
NOR-based SR flip-flop
SR Flip-Flops
• When the clock is low (CLK = 0, so CLK-bar = 1), T1 is on and T2 is off,
and the D input is sampled onto node QM.
• When the clock goes high, the master stage stops sampling the input
and goes into a hold mode.
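The sample-and-hold behavior described above can be mimicked with a small behavioral model. This is only a sketch: the class name `MasterSlaveRegister` is hypothetical, signal inversions inside the real circuit are ignored, and the slave stage is assumed to be transparent while the clock is high, which gives a positive edge-triggered register.

```python
class MasterSlaveRegister:
    """Behavioral model of a positive edge-triggered master-slave register."""

    def __init__(self):
        self.qm = 0  # master storage node QM
        self.q = 0   # register output Q

    def step(self, clk, d):
        if clk == 0:
            # Master is transparent: D is sampled onto QM; the slave holds Q.
            self.qm = d
        else:
            # Master holds QM; the slave (assumed transparent) copies QM to Q.
            self.q = self.qm
        return self.q


reg = MasterSlaveRegister()
# (clk, d) pairs: changes on D while the clock is high do not reach Q.
stimulus = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 1)]
trace = [reg.step(clk, d) for clk, d in stimulus]
```

In this trace, Q only takes the value that D had just before each rising clock edge.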
Low-Voltage Static Latches
• The scaling of supply voltages is critical for low power
operation.
• Unfortunately, certain latch structures don’t function at
reduced supply voltages.
• Scaling to low supply voltages hence requires the use of
reduced threshold devices.
• When the registers are constantly accessed, the
leakage energy is typically insignificant compared to the
switching power.
• However, with the use of conditional clocks, it is
possible that registers are idle for extended periods and
the leakage energy expended by registers can be quite
significant.
Positive edge-triggered
register based on sense-amplifier
Sense-Amplifier Based Registers
• The circuit uses a precharged front-end amplifier that
samples the differential input signal on the rising edge of
the clock signal.
• Where tc-q and tsu are the propagation delay and the set-
up time of the register, respectively.
Operation of two-phase
pipelined circuit using dynamic registers
NORA-CMOS— A Logic Style for
Pipelined Structures
• This topology has one important property:
• A C²MOS-based pipelined circuit is race-free as long as
all the logic functions F (implemented using static logic)
between the latches are noninverting.
• The only way a signal can race from stage to stage
under this condition is when the logic function F is
inverting, for example where F is replaced by a single, static CMOS
inverter.
• Logic and latch are clocked in such a way that both are
simultaneously in either evaluation, or hold (precharge)
mode.
• A block that is in evaluation during CLK = 1 is called a
CLK-module, while the inverse is called a CLK-bar-module
(CLK with an overbar).
• A NORA datapath consists of a chain of alternating CLK
and CLK-bar modules.
• While one class of modules is precharging with its output
latch in hold mode, preserving the previous output value,
the other class is evaluating.
Memory architecture
Semiconductor Memory Classification
[Table, partly recoverable: columns Read-Write Memory, Non-Volatile Read-Write Memory, and Read-Only Memory; surviving entries include DRAM, LIFO, Shift Register, and CAM.]
Memory Timing: Definitions
Memory Architecture: Decoders
[Figure: two organizations of a memory with N words of M bits each and an M-bit input-output port. Left: one select signal per word (S0 to SN-1). Right: address bits A0 to AK-1 drive a decoder that generates the word selects.]
• Intuitive architecture for N x M memory: too many select signals
(N words == N select signals)
• Decoder reduces the number of select signals: K = log2 N
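The saving from a decoder can be made concrete with a short sketch. The function names `address_bits` and `decode` are illustrative only, and the one-hot list stands in for the N word-select lines.

```python
def address_bits(n_words):
    """Number of address bits K needed to select one of n_words words."""
    # Integer form of ceil(log2(n_words)), avoiding floating-point rounding.
    return (n_words - 1).bit_length()


def decode(address, n_words):
    """Return the one-hot word-select vector S0..S(N-1) for an address."""
    if not 0 <= address < n_words:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(n_words)]


k = address_bits(1024)   # address lines needed for a 1024-word memory
selects = decode(2, 4)   # select lines of a 4-word memory at address 2
```

Only K address lines have to be routed to the memory; the decoder expands them into the N select signals next to the array.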
Contents-Addressable Memory
[Figure: CAM block diagram. I/O buffers carry Data (64 bits), Comparand, Mask, and Commands; Control Logic handles R/W and a 9-bit Address; an Address Decoder and a Priority Encoder surround a CAM Array of 2^9 words x 64 bits with 2^9 Validity Bits.]
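A content-addressable lookup built from the blocks named in the diagram (comparand, mask, validity bits, priority encoder) might behave as sketched below. The function name `cam_search` is hypothetical, and two conventions are assumptions rather than facts from the slides: a mask bit of 1 means "compare this bit", and the priority encoder reports the lowest matching address.

```python
def cam_search(words, valid, comparand, mask):
    """Return the address of the first valid word matching the comparand.

    Only bit positions set to 1 in `mask` take part in the comparison
    (assumed convention). Returns None when no valid word matches.
    """
    for address, (word, is_valid) in enumerate(zip(words, valid)):
        if is_valid and (word ^ comparand) & mask == 0:
            return address  # priority encoder: lowest address wins (assumed)
    return None


words = [0xAB00, 0x12FF, 0x1234]
valid = [True, False, True]
# Match on the upper byte only; word 1 is skipped because it is not valid.
hit = cam_search(words, valid, comparand=0x1200, mask=0xFF00)
miss = cam_search(words, valid, comparand=0xFFFF, mask=0xFFFF)
```

In hardware every word is compared in parallel; the loop here only models the result, not the timing.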
Memory Timing:
Approaches
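[Figure: ROM cell variants connecting word line WL and bit line BL, with GND or VDD references.]
[Figure: ROM array with word lines WL[0]-WL[3], VDD lines, bias voltage V_bias, and pull-down loads.]
MOS NOR ROM
[Figure: NOR ROM array with pull-up devices to VDD, word lines WL[0]-WL[3], and GND lines.]
[Figure: ROM array with word lines WL[0]-WL[3].]
[Figure: word-line and bit-line parasitic models with r_word, c_word, r_bit, c_bit, C_bit, and load C_L.]
[Figure: floating-gate transistor cross-section: source (S), drain (D), n+ source and drain regions, floating gate, control gate (G), oxide thickness t_ox, p-substrate; bias voltages shown for programming.]
[Figure: memory array layout showing the unit cell and source line (diffusion layer). Courtesy Toshiba.]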
Read-Write Memories (RAM)
STATIC (SRAM)
DYNAMIC (DRAM)
[Figure: six-transistor CMOS SRAM cell (M1-M6) with word line WL, supply VDD, storage nodes Q and Q-bar, and bit lines BL and BL-bar.]
CMOS SRAM Analysis (Read)
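[Figure: SRAM cell during a read with Q = 0 and Q-bar = 1; both bit lines, each loaded by C_bit, start at VDD; transistors M1, M4, M5, M6 and word line WL are shown.]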
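CMOS SRAM Analysis (Write)
[Figure: SRAM cell during a write, with word line WL, transistors M1, M4, M5, M6, storage nodes at 0 and 1, and bit lines driven to BL = 1 and BL-bar = 0.]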
3-Transistor DRAM Cell
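[Figure: three-transistor DRAM cell (M1-M3) with write word line WWL, read word line RWL, bit lines BL1 and BL2, storage node X, and storage capacitance CS; waveforms show X reaching VDD - VT and BL2 dropping by ΔV from VDD - VT.]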
[Figure: CAM array and SRAM array connected through hit logic, with an address decoder.]
(N)AND Decoder
NOR Decoder
Hierarchical Decoders
Multi-stage implementation improves performance
[Figure: NAND decoder using 2-input pre-decoders; pre-decoded pairs of A0, A1 and of A2, A3 (true and complemented) drive word lines WL0, WL1, and so on.]
Dynamic Decoders
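[Figure: two dynamic 2-to-4 decoders with precharge devices, clock φ, address inputs A0, A0-bar, A1, A1-bar, and word lines WL0-WL3.]
[Figure: pass-transistor column decoders: one generating select signals S0-S3 from A0 and A1, and a tree decoder controlled by A0, A0-bar, A1, A1-bar with output D.]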
• Number of devices drastically reduced
• Delay increases quadratically with the number of sections; prohibitive for large
decoders
• Solutions: buffers, progressive sizing, combination of tree and
pass-transistor approaches
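The quadratic delay growth can be illustrated with the Elmore delay of a chain of series pass transistors. This is a rough sketch: it assumes every section has the same on-resistance `r_on` and node capacitance `c_node` (both hypothetical parameters) and ignores any extra load at the output.

```python
def elmore_chain_delay(k_sections, r_on, c_node):
    """Elmore delay of k identical series RC sections.

    Node i sees i resistors between itself and the driver, so it
    contributes i * r_on * c_node. The sum is r_on * c_node * k * (k + 1) / 2.
    """
    return sum(i * r_on * c_node for i in range(1, k_sections + 1))


# Normalized units (r_on = c_node = 1): doubling the number of sections
# more than triples the delay.
delay_4 = elmore_chain_delay(4, 1.0, 1.0)
delay_8 = elmore_chain_delay(8, 1.0, 1.0)
```

Inserting a buffer every few sections breaks the chain into short pieces, which is why buffering appears among the solutions listed above.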
Decoder for circular shift-register
[Figure: chain of shift-register stages clocked by φ and its complement, each with a reset input R and VDD supply, driving word lines WL0, WL1, WL2, and so on.]
Sense Amplifiers
$$t_p = \frac{C \cdot \Delta V}{I_{av}}$$
C is large and I_av is small, so make ΔV as small as possible.
Idea: Use Sense Amplifier
[Figure: sense amplifier (s.a.) with a small transition at its input driving the output.]
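The benefit of a small bit-line swing follows directly from t_p = C·ΔV / I_av. The sketch below plugs in made-up values (1 pF bit-line capacitance, 50 µA average cell current); they are illustrative assumptions, not figures from the slides.

```python
def bitline_delay(c_bitline, delta_v, i_avg):
    """Time for an average current i_avg to move a capacitance by delta_v."""
    return c_bitline * delta_v / i_avg


C_BIT = 1e-12   # bit-line capacitance in farads (assumed value)
I_AVG = 50e-6   # average cell current in amperes (assumed value)

t_full_swing = bitline_delay(C_BIT, 2.5, I_AVG)    # wait for a 2.5 V swing
t_small_swing = bitline_delay(C_BIT, 0.25, I_AVG)  # sense after only 0.25 V
speedup = t_full_swing / t_small_swing
```

With these numbers, sensing a swing one tenth as large cuts the bit-line delay by about a factor of ten; the sense amplifier then has to restore the full logic level.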
Differential Sense Amplifier
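[Figure: differential sense amplifier with input transistors M1 and M2 driven by bit and bit-bar, load transistors M3 and M4, enable transistor M5 controlled by SE, internal node y, and output Out.]
Directly applicable to SRAMs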
Differential Sensing ― SRAM
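[Figure: (a) SRAM sensing scheme: bit lines BL and BL-bar with precharge (PC) and equalization (EQ) devices, SRAM cell i selected by WLi, and a differential sense amplifier enabled by SE producing the output. (b) Two-stage differential amplifier built from transistors M1-M5 with inputs x and x-bar, internal nodes y and y-bar, and enable SE.]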
Latch-Based Sense Amplifier (DRAM)
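[Figure: latch-based sense amplifier between BL and BL-bar with equalization signal EQ, sense enables SE and SE-bar, and supply VDD.]
[Figure: sources of power dissipation in a memory, with labeled current components n·C_DE·V_INT·f, m·C_DE·V_INT·f, C_PT·V_INT·f, m·i_act for the selected cells, and I_DCP, for the array, column decoder, and periphery between the supply and VSS.]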
Suppressing Leakage in
SRAM
[Figure: leakage suppression schemes using sleep signals, internal supply rails VDD,int and VSS,int, a low-threshold transistor, and a reduced supply VDDL.]
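[Figure: flip-flops F1 and F2 separated by combinational logic CL, with a timing diagram of clk, Q1, and D2 marking t_cd, t_ccq, t_hold, and t_skew.]
• Skew increases the effective hold time too
$$t_{cd} \ge t_{hold} - t_{ccq} + t_{skew}$$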
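The effect of skew on the hold constraint can be checked numerically. The sketch assumes the usual flip-flop hold condition t_cd ≥ t_hold - t_ccq + t_skew; the function names and the picosecond values are illustrative assumptions.

```python
def min_contamination_delay(t_hold, t_ccq, t_skew):
    """Smallest logic contamination delay that avoids a hold violation."""
    return t_hold - t_ccq + t_skew


def hold_ok(t_cd, t_hold, t_ccq, t_skew):
    """True when the logic between two flip-flops meets the hold constraint."""
    return t_cd >= min_contamination_delay(t_hold, t_ccq, t_skew)


# All times in picoseconds (assumed values).
required_with_skew = min_contamination_delay(t_hold=30, t_ccq=40, t_skew=50)
passes_without_skew = hold_ok(t_cd=25, t_hold=30, t_ccq=40, t_skew=0)
passes_with_skew = hold_ok(t_cd=25, t_hold=30, t_ccq=40, t_skew=50)
```

A path that is safe with an ideal clock can fail once skew is added, because every picosecond of skew raises the required contamination delay by the same amount.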
• Reduce clock skew
– Careful clock distribution network design
– Plenty of metal wiring resources
• Analyze clock skew
– Only budget actual, not worst case skews
– Local vs. global skew budgets
• Tolerate clock skew
– Choose circuit structures insensitive to skew
Skew Tolerance
• Flip-flops are sensitive to skew because of hard edges
– Data launches at latest rising edge of clock
– Must setup before earliest next rising edge of clock
– Overhead would shrink if we can soften edge
• Latches tolerate moderate amounts of skew
– Data can arrive anytime latch is transparent
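Skew: Latches
2-Phase Latches
[Figure: latches L1, L2, L3 clocked by φ1, φ2, φ1, with D1/Q1, D2/Q2, D3/Q3, separated by Combinational Logic 1 and Combinational Logic 2.]
$$t_{pd} \le T_c - \underbrace{2t_{pdq}}_{\text{sequencing overhead}}$$
$$t_{cd1},\, t_{cd2} \ge t_{hold} - t_{ccq} - t_{nonoverlap} + t_{skew}$$
$$t_{borrow} \le \frac{T_c}{2} - \left(t_{setup} + t_{nonoverlap} + t_{skew}\right)$$
Pulsed Latches
$$t_{pd} \le T_c - \underbrace{\max\left(t_{pdq},\; t_{pcq} + t_{setup} - t_{pw} + t_{skew}\right)}_{\text{sequencing overhead}}$$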
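The latch constraints can be evaluated with a few helper functions. This sketch assumes the standard forms t_pd ≤ T_c - 2·t_pdq and t_borrow ≤ T_c/2 - (t_setup + t_nonoverlap + t_skew) for two-phase latches, and t_pd ≤ T_c - max(t_pdq, t_pcq + t_setup - t_pw + t_skew) for pulsed latches; function names and picosecond values are illustrative.

```python
def two_phase_max_tpd(t_c, t_pdq):
    """Maximum logic delay per cycle with two-phase transparent latches."""
    return t_c - 2 * t_pdq


def two_phase_max_borrow(t_c, t_setup, t_nonoverlap, t_skew):
    """Maximum time one half-cycle can borrow from the next."""
    return t_c / 2 - (t_setup + t_nonoverlap + t_skew)


def pulsed_max_tpd(t_c, t_pdq, t_pcq, t_setup, t_pw, t_skew):
    """Maximum logic delay per cycle with pulsed latches."""
    return t_c - max(t_pdq, t_pcq + t_setup - t_pw + t_skew)


# All times in picoseconds (assumed values).
tpd_two_phase = two_phase_max_tpd(t_c=1000, t_pdq=50)
borrow = two_phase_max_borrow(t_c=1000, t_setup=40, t_nonoverlap=0, t_skew=60)
tpd_pulsed = pulsed_max_tpd(t_c=1000, t_pdq=50, t_pcq=60,
                            t_setup=40, t_pw=100, t_skew=60)
```

Skew does not appear in the two-phase propagation-delay bound at all, and in the pulsed case a wide enough pulse can hide it; this is the sense in which latches tolerate moderate skew.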
[Figure: domino gate made of a dynamic NAND (inputs A, B, clock φ) followed by a static inverter; nodes W and X are labeled.]
Clock Skew
• Skew increases sequencing overhead
– Traditional domino has hard edges
– Evaluate at latest rising edge
– Setup at latch by earliest falling edge
$$t_{pd} \le T_c - 2t_{setup} - 2t_{skew}$$
[Figure: traditional domino pipeline of dynamic and static gates clocked by clk and its complement, with a latch at the end of each half-cycle; t_setup and t_skew are marked at the latch.]
Time Borrowing
• Logic may not exactly fit half-cycle
– No flexibility to borrow time to balance logic between
half cycles
• Traditional domino sequencing overhead is about 25% of
cycle time in fast systems!
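A quick calculation shows how an overhead of this size can arise. The sketch assumes the traditional domino overhead is 2·t_setup + 2·t_skew per cycle; the numbers are invented for illustration and are not measurements from any real design.

```python
def domino_overhead_fraction(t_c, t_setup, t_skew):
    """Fraction of the cycle lost to setup time and skew at both latches."""
    return (2 * t_setup + 2 * t_skew) / t_c


# Assumed values in picoseconds: 1 ns cycle, 50 ps setup, 75 ps skew.
fraction = domino_overhead_fraction(t_c=1000, t_setup=50, t_skew=75)
```

Because the overhead is a fixed number of picoseconds, its share of the cycle grows as the clock period shrinks.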
[Figure: traditional domino half-cycles of dynamic and static gates, each ending in a latch, clocked by clk and its complement; t_setup and t_skew are marked.]
Skew-Tolerant Domino
• Use overlapping clocks to eliminate latches at phase
boundaries.
– Second phase evaluates using results of first
[Figure: dynamic and static gates clocked by overlapping φ1 and φ2, with nodes a-d and no latch at the phase boundary; waveforms of φ1, φ2, a, b, and c.]
Full Keeper
• After second phase evaluates, first phase precharges
• Input to second phase falls
– Violates monotonicity?
• But we no longer need the value
• Now the second gate has a floating output
– Need full keeper to hold it either high or low
[Figure: dynamic gate clocked by φ with nodes H and X and weak full keeper transistors.]
Time Borrowing
• Overlap can be used to
– Tolerate clock skew
– Permit time borrowing
• No sequencing overhead
$$t_{pd} = T_c$$
[Figure: overlapping clocks φ1 and φ2 with t_overlap, t_borrow, and t_skew marked; Phase 1 and Phase 2 each contain alternating dynamic and static gates.]
Multiple Phases
• With more clock phases, each phase overlaps more
– Permits more skew tolerance and time borrowing
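The growth in overlap with the number of phases can be estimated from clock geometry alone. This sketch assumes N evenly spaced phases, each high for the same fraction `duty` of the cycle, so consecutive phases overlap by duty·T_c - T_c/N; the function name and the 50% default are assumptions.

```python
def phase_overlap(t_c, n_phases, duty=0.5):
    """Time during which two consecutive clock phases are both high."""
    return duty * t_c - t_c / n_phases


# With a 1000 ps cycle and 50% duty cycle (assumed values):
overlap_2 = phase_overlap(1000, 2)  # two phases: edges meet, no overlap
overlap_4 = phase_overlap(1000, 4)  # four phases: a quarter cycle of overlap
```

Under these assumptions, a two-phase system needs a duty cycle above 50% to get any overlap, while a four-phase system has overlap available for skew tolerance and time borrowing even at 50%.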
[Figure: four overlapping clock phases φ1-φ4; Phase 1 to Phase 4 each contain dynamic and static gates clocked by the corresponding phase.]
Clock Generation
[Figure: clock generator producing phases φ1-φ4 from clk and an enable signal en.]
Timing issues
• Setup and hold time:
• Every flip-flop has restrictive time regions around the
active clock edge in which the input should not change.
• We call them restrictive because, if the input changes
in these regions, the output may not be the expected one.
• It may be derived from the old input, the new input,
or even something in between the two.
• The setup time is the interval before the clock edge during which the
data must be held stable.
• The hold time is the interval after the clock edge during which the
data must be held stable.
• Hold time can be negative, which means the data can
change slightly before the clock edge and still be
properly captured.
• Most current flip-flops have zero or negative
hold time.
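The setup and hold window can be expressed as a simple check on when the data input changes. This is a sketch with hypothetical names; times are in arbitrary units, and a negative `t_hold` moves the end of the window to before the clock edge.

```python
def violates_timing(data_change_times, clock_edge, t_setup, t_hold):
    """True if any data transition falls inside the setup/hold window."""
    window_start = clock_edge - t_setup  # data must be stable from here...
    window_end = clock_edge + t_hold     # ...until here
    return any(window_start < t < window_end for t in data_change_times)


# Clock edge at t = 100, setup time 20.
late_change = violates_timing([90], 100, t_setup=20, t_hold=10)
safe_changes = violates_timing([70, 120], 100, t_setup=20, t_hold=10)
# Negative hold time: a change at t = 95, just before the edge, is allowed.
negative_hold = violates_timing([95], 100, t_setup=20, t_hold=-10)
```

The third case shows the negative hold time described above: the stable window ends at t = 90, so a transition at t = 95 is still captured correctly.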
• To avoid setup time violations: