DDCA Ch5
Chapter 5 <1>
Chapter 5 :: Topics
• Introduction
• Arithmetic Circuits
• Number Systems
• Sequential Building Blocks
• Memory Arrays
• Logic Arrays
Introduction
• Digital building blocks:
– Gates, multiplexers, decoders, registers,
arithmetic circuits, counters, memory arrays,
logic arrays
• Building blocks demonstrate hierarchy,
modularity, and regularity:
– Hierarchy of simpler components
– Well-defined interfaces and functions
– Regular structure easily extends to different sizes
• We will use these building blocks in Chapter 7
to build a microprocessor
1-Bit Adders
Half adder (inputs A, B):
S = A ⊕ B
Cout = AB
Full adder (inputs A, B, Cin):
S = A ⊕ B ⊕ Cin
Cout = AB + ACin + BCin
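The 1-bit adder equations can be checked by enumerating every input combination. The following is a minimal Python sketch; the function names `half_adder` and `full_adder` are illustrative and do not come from the slides.

```python
def half_adder(a, b):
    """Return (S, Cout) for two input bits: S = A xor B, Cout = AB."""
    return a ^ b, a & b


def full_adder(a, b, cin):
    """Return (S, Cout) for three input bits.

    S    = A xor B xor Cin
    Cout = AB + ACin + BCin
    """
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


if __name__ == "__main__":
    # Print the full-adder truth table.
    print("A B Cin | Cout S")
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                s, cout = full_adder(a, b, cin)
                print(a, b, cin, " |", cout, s)
```

In every row, the two-bit value Cout S equals the number of inputs that are 1.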
Multibit Adders (CPAs)
• Types of carry propagate adders (CPAs):
– Ripple-carry (slow)
– Carry-lookahead (fast)
– Prefix (faster)
• Carry-lookahead and prefix adders faster for large adders
but require more hardware
[Figure: CPA symbol with N-bit inputs A and B, carry in Cin, N-bit sum S, and carry out Cout]
Ripple-Carry Adder
• Chain 1-bit adders together
• Carry ripples through entire chain
• Disadvantage: slow
[Figure: 32-bit ripple-carry adder: full adders chained from Cin to Cout through carries C0 to C30, producing S0 to S31]
Ripple-Carry Adder Delay
t_ripple = N × t_FA
where t_FA is the delay of a full adder
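A ripple-carry adder can be sketched in a few lines by chaining a 1-bit full adder from the least significant bit upward. This is an illustrative Python model, not a hardware description; `ripple_carry_add` and `ripple_delay` are hypothetical names, and the delay function simply evaluates N × t_FA in whatever unit t_FA is given.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (a & cin) | (b & cin)


def ripple_carry_add(a, b, cin=0, n=32):
    """Add two n-bit values bit by bit, passing the carry up the chain."""
    total = 0
    carry = cin
    for i in range(n):  # bit 0 first: each stage waits for the previous carry
        bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= bit << i
    return total, carry  # (n-bit sum, carry out)


def ripple_delay(n, t_fa):
    """Worst-case delay of an n-bit ripple-carry adder: N * t_FA."""
    return n * t_fa
```

For example, `ripple_carry_add(0xFFFFFFFF, 1)` makes the carry travel through all 32 stages, which is the worst case the delay formula describes.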
Building a Faster Adder
– Similar to adding by hand, column by column
– Con: Slow
• Output is not correct until the carries have rippled to the left (critical path)
• 4-bit carry-ripple adder has 4*2 = 8 gate delays
– Pro: Small
• 4-bit carry-ripple adder has just 4*5 = 20 gates
[Figure: 4-bit adder built from four full adders (FA) with inputs a3 b3 ... a0 b0 and cin, carries c3 c2 c1, and outputs cout s3 s2 s1 s0]
Efficient Lookahead
c1 = a0b0 + (a0 xor b0)c0
c2 = a1b1 + (a1 xor b1)c1
c3 = a2b2 + (a2 xor b2)c2
c4 = a3b3 + (a3 xor b3)c3
• If a0b0 = 1, then c1 = 1 (call this G: Generate)
• If a0 xor b0 = 1, then c1 = 1 if c0 = 1 (call this P: Propagate)
• Why those names? When a0b0 = 1, we should generate a 1 for c1. When a0 XOR b0 = 1, we should propagate the c0 value as the value of c1, meaning c1 should equal c0.
[Figure: 4-bit addition of A (a3 a2 a1 a0) and B (b3 b2 b1 b0) showing carries c4 c3 c2 c1 c0 and outputs cout s3 s2 s1 s0]
Efficient Lookahead
Gi = aibi (generate)
Pi = ai XOR bi (propagate)
c1 = G0 + P0c0
c2 = G1 + P1c1
c2 = G1 + P1(G0 + P0c0)
c2 = G1 + P1G0 + P1P0c0
c3 = G2 + P2c2
c3 = G2 + P2(G1 + P1G0 + P1P0c0)
c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0
CLA
• Each stage:
– HA for G and P
– Another XOR for s
– Call SPG block
• Create carry-lookahead logic from equations
• More efficient than naïve scheme, at expense of one extra gate delay
c1 = G0 + P0c0
c2 = G1 + P1G0 + P1P0c0
c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0
cout = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0
[Figure: 4-bit CLA. Inputs a3 b3 ... a0 b0 and cin feed four SPG blocks (a half-adder plus an XOR each, Stage 4 to Stage 1); their P and G outputs and c0 feed the carry-lookahead logic, which produces c1, c2, c3, and cout; the SPG blocks output s3 to s0]
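The flattened lookahead equations can be checked against ordinary addition. The sketch below is a behavioural Python model under the assumption that each sum bit is s_i = P_i xor c_i (the "another XOR for s" in the SPG block); `cla_carries` and `cla_add` are illustrative names.

```python
def cla_carries(a, b, c0=0, n=4):
    """Return [c0, c1, ..., cn], each computed directly from G, P and c0.

    c_k = G_{k-1} + P_{k-1}G_{k-2} + ... + P_{k-1}...P_1 G_0 + P_{k-1}...P_0 c0
    """
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]  # generate
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]  # propagate
    carries = [c0]
    for k in range(1, n + 1):
        ck = 0
        for j in range(k):
            term = g[j]
            for m in range(j + 1, k):  # G_j must propagate through stages j+1..k-1
                term &= p[m]
            ck |= term
        term = c0
        for m in range(k):  # c0 must propagate through stages 0..k-1
            term &= p[m]
        carries.append(ck | term)
    return carries


def cla_add(a, b, c0=0, n=4):
    """Return (sum, cout) using lookahead carries; no carry ripples."""
    carries = cla_carries(a, b, c0, n)
    total = 0
    for i in range(n):
        p_i = ((a >> i) & 1) ^ ((b >> i) & 1)
        total |= (p_i ^ carries[i]) << i
    return total, carries[n]
```

Every carry depends only on the G and P values and c0, which is why no carry has to wait for the previous stage.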
Carry-Lookahead Adder – High-Level View
[Figure: four SPG blocks with inputs a3 b3 ... a0 b0 and c0; each sends Pi and Gi to the 4-bit carry-lookahead logic and receives ci; outputs are cout and s3 s2 s1 s0]
Hierarchical Carry-Lookahead Adders
• Better solution – Rather than rippling the carries, just repeat the carry-lookahead concept
– Requires minor modification of 4-bit CLA adder to output P and G
[Figure: four 4-bit adders that use carry-lookahead internally pass P3 G3 ... P0 G0 to a second-level 4-bit carry-lookahead logic block, which returns c3 c2 c1 and outputs P, G, and cout]
Hierarchical Carry-Lookahead Adders
• Hierarchical CLA concept can be applied for larger adders
• 32-bit hierarchical CLA:
– Only about 8 gate delays (2 for SPG block, then 2 per CLA level)
– Only about 14 gates in each 4-bit CLA logic block
Q: How many gate delays for 64-bit hierarchical CLA, using 4-bit CLA logic?
A: 16 CLA-logic blocks in 1st level, 4 in 2nd, 1 in 3rd -- so still just 8 gate delays (2 for SPG, and 2+2+2 for CLA logic). CLA is a very efficient method.
[Figure: 32-bit hierarchical CLA: SPG blocks (inputs ai, bi; outputs P, G; input c) feed eight 4-bit CLA logic blocks, which feed two 4-bit CLA logic blocks, which feed one 2-bit CLA logic block]
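The delay count above follows a simple pattern: 2 gate delays for the SPG block plus 2 per level of CLA logic. The Python sketch below reproduces that count; it models only the slide's approximation, and the names `cla_levels` and `cla_gate_delays` are illustrative.

```python
def cla_levels(n_bits, block=4):
    """Number of CLA-logic levels needed when each block combines `block` inputs."""
    blocks = -(-n_bits // block)  # ceiling division: blocks in the first level
    levels = 1
    while blocks > 1:
        blocks = -(-blocks // block)  # the next level combines groups of blocks
        levels += 1
    return levels


def cla_gate_delays(n_bits, block=4, spg_delay=2, level_delay=2):
    """Approximate gate delays: SPG block plus a fixed delay per CLA level."""
    return spg_delay + level_delay * cla_levels(n_bits, block)
```

With 4-bit CLA logic, 32 bits needs levels of 8, 2, and 1 blocks, and 64 bits needs levels of 16, 4, and 1 blocks, so both come out at three levels.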
Subtracter
[Figure: subtracter symbol (N-bit inputs A and B, N-bit output Y) and its implementation built around an N-bit adder]
Comparator: Equality
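[Figure: 4-bit equality comparator symbol (inputs A and B, output Equal) and its implementation, which compares A3 with B3, A2 with B2, A1 with B1, and A0 with B0 and combines the results into Equal]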
Comparators
• N-bit equality comparator: Outputs 1 if two N-bit numbers are equal
– 4-bit equality comparator with inputs A and B
• a3 must equal b3, a2 = b2, a1 = b1, a0 = b0
– Two bits are equal if both 1, or both 0
– eq = (a3b3 + a3’b3’) * (a2b2 + a2’b2’) * (a1b1 + a1’b1’) * (a0b0 + a0’b0’)
• Note that function inside parentheses is XNOR
– eq = (a3 xnor b3) * (a2 xnor b2) * (a1 xnor b1) * (a0 xnor b0)
[Figure: 4-bit equality comparator with inputs a3 b3 a2 b2 a1 b1 a0 b0 and output eq; example 0110 = 0111 ? gives eq = 0]
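The XNOR form of the equality comparator translates directly into code. This is a small Python sketch; `xnor` and `equal4` are illustrative names.

```python
def xnor(a, b):
    """1 when both bits are equal (both 1 or both 0)."""
    return 1 - (a ^ b)


def equal4(a, b):
    """eq = (a3 xnor b3) * (a2 xnor b2) * (a1 xnor b1) * (a0 xnor b0)."""
    eq = 1
    for i in range(4):
        eq &= xnor((a >> i) & 1, (b >> i) & 1)
    return eq
```

For the example on the slide, `equal4(0b0110, 0b0111)` is 0 because the two values differ in bit 0.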
Comparator: Less Than
[Figure: an N-bit subtracter computes A - B; bit [N-1] of the result drives the A<B output]
Magnitude Comparator
[Figure: four comparator stages chained left to right for bit pairs a3 b3, a2 b2, a1 b1, a0 b0; each stage has inputs a, b, in_gt, in_eq, in_lt and outputs out_gt, out_eq, out_lt; the chain starts from Igt, Ieq, Ilt and ends at AgtB, AeqB, AltB]
Magnitude Comparator
• Example: 1011 = 1001 ?
• Final answer appears on the right
• Takes time for answer to “ripple” from left to right
• Thus called “carry-ripple style” after the carry-ripple adder
– Even though there’s no “carry” involved
[Figure: the four-stage comparator (Stage3 to Stage0) evaluating a3 b3 a2 b2 a1 b1 a0 b0 = 1 1 0 0 1 0 1 1 with Igt = 0, Ieq = 1, Ilt = 0; the a1 b1 stage finds 1 > 0, so the final outputs are AgtB = 1, AeqB = 0, AltB = 0]
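The ripple behaviour of the magnitude comparator can be modelled one stage at a time. The stage equations below are an assumption that matches the port names in the figure (the slides do not print them): a stage reports greater-than or less-than only if all higher bits were equal. The names `stage` and `compare` are illustrative.

```python
def stage(a, b, in_gt, in_eq, in_lt):
    """One comparator stage (assumed logic, consistent with the figure's ports)."""
    out_gt = in_gt | (in_eq & a & (1 - b))   # already greater, or equal so far and a > b
    out_eq = in_eq & (1 - (a ^ b))           # equal so far and a == b
    out_lt = in_lt | (in_eq & (1 - a) & b)   # already less, or equal so far and a < b
    return out_gt, out_eq, out_lt


def compare(a, b, n=4):
    """Return (AgtB, AeqB, AltB); results ripple from the msb stage to the lsb stage."""
    gt, eq, lt = 0, 1, 0  # Igt = 0, Ieq = 1, Ilt = 0
    for i in range(n - 1, -1, -1):
        gt, eq, lt = stage((a >> i) & 1, (b >> i) & 1, gt, eq, lt)
    return gt, eq, lt
```

For the example 1011 versus 1001, the two upper stages pass "equal" along, the a1 b1 stage sets greater-than, and the last stage simply forwards it.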
Arithmetic Logic Unit (ALU)
F2:0   Function
000    A & B
001    A | B
010    A + B
011    not used
100    A & ~B
101    A | ~B
110    A - B
111    SLT
[Figure: ALU symbol with N-bit inputs A and B, 3-bit control input F, and N-bit output Y]
ALU Design
[Figure: ALU implementation with N-bit inputs A and B: a 2:1 multiplexer controlled by F2, an N-bit adder with Cout and sum S, a zero-extend unit fed by bit [N-1] of S, and a 4:1 multiplexer controlled by F1:0 that drives the N-bit output Y]
Set Less Than (SLT) Example
• Configure 32-bit ALU for SLT operation: A = 25 and B = 32
– A < B, so Y should be 32-bit representation of 1 (0x00000001)
– F2:0 = 111
– F2 = 1 (adder acts as subtracter), so 25 - 32 = -7
– -7 has 1 in the most significant bit
– F1:0 = 11 multiplexer selects the zero-extended most significant bit of S, so Y = 0x00000001
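The ALU function table and the SLT example can be reproduced with a short behavioural model. This Python sketch assumes the structure suggested by the slides: F2 selects B or ~B and also acts as the adder's carry in, and F1:0 selects among AND, OR, the adder sum, and the zero-extended sign bit of the sum. The function name `alu` is illustrative.

```python
def alu(a, b, f, n=32):
    """Behavioural model of the N-bit ALU controlled by the 3-bit code F2:0."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    f2 = (f >> 2) & 1
    bb = (~b & mask) if f2 else b   # F2 = 1 inverts B ...
    s = (a + bb + f2) & mask        # ... and adds 1, so the adder computes A - B
    select = f & 0b11               # F1:0 drives the output multiplexer
    if select == 0b00:
        return a & bb
    if select == 0b01:
        return a | bb
    if select == 0b10:
        return s
    # F1:0 = 11: zero-extended most significant bit of S (SLT when F2 = 1).
    # Like the slide, this sketch ignores signed overflow.
    return (s >> (n - 1)) & 1
```

With A = 25 and B = 32, `alu(25, 32, 0b110)` gives the 32-bit pattern for -7, and `alu(25, 32, 0b111)` gives 1.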
Shifters
• Logical shifter: shifts value to left or right and fills empty spaces with 0’s
– Ex: 11001 >> 2 = 00110
– Ex: 11001 << 2 = 00100
• Arithmetic shifter: same as logical shifter, but on right shift, fills empty spaces with the old most significant bit (msb)
– Ex: 11001 >>> 2 = 11110
– Ex: 11001 <<< 2 = 00100
• Rotator: rotates bits in a circle, such that bits shifted off one end are shifted into the other end
– Ex: 11001 ROR 2 = 01110
– Ex: 11001 ROL 2 = 00111
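The three kinds of shifter differ only in what fills the vacated bit positions. The following Python sketch works on fixed-width bit patterns (5 bits by default, to match the examples); the function names are illustrative.

```python
def lsl(x, k, n=5):
    """Logical (and arithmetic) shift left: fill with 0s on the right."""
    return (x << k) & ((1 << n) - 1)


def lsr(x, k, n=5):
    """Logical shift right: fill with 0s on the left."""
    return (x & ((1 << n) - 1)) >> k


def asr(x, k, n=5):
    """Arithmetic shift right: fill with copies of the old msb."""
    mask = (1 << n) - 1
    result = (x & mask) >> k
    if (x >> (n - 1)) & 1:
        result |= mask & ~(mask >> k)  # set the k vacated upper bits
    return result


def ror(x, k, n=5):
    """Rotate right: bits leaving the right end re-enter on the left."""
    k %= n
    return ((x >> k) | (x << (n - k))) & ((1 << n) - 1)


def rol(x, k, n=5):
    """Rotate left: bits leaving the left end re-enter on the right."""
    k %= n
    return ((x << k) | (x >> (n - k))) & ((1 << n) - 1)
```

Applied to 11001 with a shift amount of 2, these functions reproduce the six example results listed for the logical shifter, arithmetic shifter, and rotator.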
Shifter Design
[Figure: 4-bit right shifter (A3:0 >> shamt1:0 = Y3:0) built from four 4:1 multiplexers, one per output bit Y3 to Y0, each with inputs 00, 01, 10, 11 and select S1:0 driven by shamt1:0]
• How to extend it to N-bit shifter?
– Need 2Nx1 MUX for each bit.
– Too expensive!
Barrel Shifter
Shifters as Multipliers, Dividers
• A << N = A × 2^N
– Example: 00001 << 2 = 00100 (1 × 2^2 = 4)
– Example: 11101 << 2 = 10100 (-3 × 2^2 = -12)
• A >>> N = A ÷ 2^N
– Example: 01000 >>> 2 = 00010 (8 ÷ 2^2 = 2)
– Example: 10000 >>> 2 = 11100 (-16 ÷ 2^2 = -4)
Registers
• N-bit register: Stores N bits, N is the width
– Common widths: 8, 16, 32
– Storing data into register: Loading
– Opposite of storing: Reading (does not alter contents)
• Basic register of Ch 3: Loaded every cycle
[Figure: 4-bit register with outputs Q3 Q2 Q1 Q0 connected to combinational logic]
Register with Parallel Load
• Add 2x1 mux to front of each flip-flop
• Register’s load input selects mux input to pass
– load=0: existing flip-flop value; load=1: new input value
[Figure: 4-bit register with parallel load: inputs I3 I2 I1 I0 and load, a 2x1 mux in front of each D flip-flop, outputs Q3 Q2 Q1 Q0; block symbol; data paths shown for load=1 and load=0]
Exercise
module register(
    input  logic       clk,
    input  logic [1:0] select,
    input  logic [3:0] In,
    output logic [3:0] Q
);
    // Module body is not given on the slide.
endmodule
Counters
• Increments on each clock edge
• Used to cycle through numbers. For example,
– 000, 001, 010, 011, 100, 101, 110, 111, 000, 001…
• Example uses:
– Digital clock displays
– Program counter: keeps track of current instruction executing
[Figure: counter symbol (inputs CLK and Reset, N-bit output Q) and implementation: an N-bit adder adds 1 to Q and feeds a resettable N-bit register clocked by CLK]
Counters and Timers
Counter with Load
• Up-counter that can be loaded with external value
– Designed using 2x1 mux. ld input selects incremented value or external value
– Load the internal register when loading external value or when counting
– Note that ld has priority over cnt
[Figure: a 4-bit 2x1 mux controlled by ld chooses between the external value L and the +1 incrementer output; the result feeds a 4-bit register with ld and clr inputs; control inputs cnt, ld, clr; outputs C and tc]
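The priority between loading and counting is easy to see in a behavioural model. The Python sketch below makes two assumptions that the slide does not spell out: clr overrides both ld and cnt, and tc (terminal count) is 1 when the count is all 1s. The class name `LoadCounter` and its method names are illustrative.

```python
class LoadCounter:
    """Up-counter with load; each call to clock() models one rising clock edge."""

    def __init__(self, width=4):
        self.mask = (1 << width) - 1
        self.count = 0

    @property
    def tc(self):
        # Assumed meaning of tc: asserted at the maximum count.
        return int(self.count == self.mask)

    def clock(self, ld=0, cnt=0, clr=0, load_value=0):
        if clr:                      # assumed: clear overrides everything
            self.count = 0
        elif ld:                     # ld has priority over cnt
            self.count = load_value & self.mask
        elif cnt:
            self.count = (self.count + 1) & self.mask
        return self.count            # otherwise the register keeps its value
```

For example, clocking with both `ld=1` and `cnt=1` loads the external value rather than incrementing.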
Exercise
module count4(
    input clk,
    input cnt,
    input set,
    input clear,
    output logic [3:0] count
);
endmodule
Timers
• Pulses output at user-specified
[Figure: 32-bit register with ld input, loaded with M under control of load, and a -1 (decrement) block]
module timer(input logic clk, load, enable, logic [31:0] M,
output logic Q, logic[31:0] downcount);
endmodule
Shift Registers
• Shift a new bit in on each clock edge
• Shift a bit out on each clock edge
• Serial-to-parallel converter: converts serial input (Sin) to
parallel output (Q0:N-1)
[Figure: shift register symbol (CLK, Sin, Sout, N-bit output Q) and implementation: a chain of flip-flops from Sin to Sout with outputs Q0 Q1 Q2 ... QN-1]
Shift Register
• Shift right
– Move each bit one position right
– Rightmost bit is “dropped” after shift right
– Assume 0 shifted into leftmost bit
– Example: A: 1001 (original), then 0100, 0010, 0001, 0000
• Implementation: Connect flip-flop output to next flip-flop’s input
[Figure: four flip-flops in a chain with input shr_in; register contents 0 1 1 0]
Shift Register
• To allow register to either shift or retain, use
2x1 muxes
– shr: “0” means retain, “1” shift
– shr_in: value to shift in
• May be 0, or 1
[Figure: (a) 4-bit shift register with a 2x1 mux in front of each flip-flop, inputs shr and shr_in, outputs Q3 Q2 Q1 Q0; (b) data paths when shr=1; (c) block symbol with shr_in, shr, and Q3 Q2 Q1 Q0]
• Left-shift register also easy to design
Shift Register with Parallel Load
• When Load = 1, acts as a normal N-bit register
• When Load = 0, acts as a shift register
• Now can act as a serial-to-parallel converter (Sin to Q0:N-1) or
a parallel-to-serial converter (D0:N-1 to Sout)
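[Figure: shift register with parallel load: each flip-flop input comes from a 2:1 mux that selects the previous stage (input 0) or the parallel input Di (input 1) under control of Load; inputs D0 D1 D2 ... DN-1, Sin, Clk; outputs Q0 Q1 Q2 ... QN-1 and Sout]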
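Both conversions can be demonstrated with a small behavioural model. In this Python sketch the list `q` holds Q0 to QN-1, with Q0 next to Sin and QN-1 driving Sout; the class name `ShiftRegister` and its method names are illustrative.

```python
class ShiftRegister:
    """N-bit shift register with parallel load; clock() models one clock edge."""

    def __init__(self, n=4):
        self.q = [0] * n  # q[0] = Q0 (next to Sin), q[-1] = QN-1 (drives Sout)

    @property
    def sout(self):
        return self.q[-1]

    def clock(self, sin=0, load=0, d=None):
        if load:
            self.q = list(d)              # Load = 1: behaves as a normal register
        else:
            self.q = [sin] + self.q[:-1]  # Load = 0: shift one position toward Sout
        return self.q
```

Shifting four bits in through `sin` and then reading `q` gives serial-to-parallel conversion; loading `d` and then reading `sout` on successive clocks gives parallel-to-serial conversion.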
Rotate Register
[Figure: register contents 1 1 0 1 before shift right]
Example: Above-Mirror Car Display
Operation:
• The type of the value is determined by the 2-bit signals a1a0 or x1x0.
• The car’s central computer can update these values at arbitrary times and
in arbitrary order. It sends the data to your system over an 8-bit bus C
after setting the 2-bit signal a1a0 and single-bit signal load.
• Depending on the value of x1x0, your system should output the
corresponding 8-bit value to the display system through an 8-bit bus D.
Example: Above-Mirror Car Display (cont’d)
How can we reduce the number of wires from car computer to your
system from 11 to 4?
Register Files
• Accessing one of many registers through decoders and muxes
– OK if just a few registers
– Ex: Earlier above-mirror display, with 4 8-bit registers: tolerable
– Problematic when many registers: too much fanout on the load and data lines, and congestion (16*32 = 512 wires into the 16x1 mux for 16 32-bit registers)
• 16 × 32 register file
– Write port: W_data (32 bits), W_addr (4-bit “address” specifies which register to write), W_en (enable (load) line: register written on next clock)
– Read port: R_data (32 bits), R_addr (4-bit address specifies which register to read), R_en (enable read)
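The register file's ports can be illustrated with a behavioural model. This Python sketch uses the port names from the slide (W_data, W_addr, W_en, R_data, R_addr, R_en); the class name `RegisterFile` is illustrative, and returning `None` when reading is disabled is an assumption used here to stand in for an undriven output.

```python
class RegisterFile:
    """16 x 32 register file with one write port and one read port."""

    def __init__(self, depth=16, width=32):
        self.mask = (1 << width) - 1
        self.regs = [0] * depth

    def clock(self, w_data, w_addr, w_en):
        """One clock edge: write W_data into register W_addr if W_en is 1."""
        if w_en:
            self.regs[w_addr] = w_data & self.mask

    def read(self, r_addr, r_en):
        """R_data: contents of register R_addr if R_en is 1, else undriven (None)."""
        return self.regs[r_addr] if r_en else None
```

Only the register selected by W_addr changes on a write, and reading does not alter any contents.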
Internal Implementation
• How to handle the large fanout problem?
• Use buffers/repeaters
Register File
• Internal design of 4x32 RF; 16x32 RF follows similarly
– Write decoder: W_addr selects which of reg0 to reg3 has its load input asserted; W_en enables the decoder
– Read decoder: R_addr selects which register's three-state driver puts its value on R_data; R_en enables the decoder
• Bus driver: q = d; boosts signal
• Three-state driver: c=1: q=d; c=0: q=Z (like no connection)
Memory Arrays
• Efficiently store large amounts of data
• 3 common types:
– Dynamic random access memory (DRAM)
– Static random access memory (SRAM)
– Read only memory (ROM)
• M-bit data value read/written at each unique N-bit address
[Figure: memory array symbol with N-bit Address input and M-bit Data]
Memory Arrays
• 2-dimensional array of bit cells
• Each bit cell stores one bit
• N address bits and M data bits:
– 2^N rows and M columns
– Depth: number of rows (number of words)
[Figure: array symbol with N-bit Address and M-bit Data; example array with a 2-bit address and 3-bit data, depth 4 and width 3]
Memory Array Example
• 2^2 × 3-bit array
• Number of words: 4
• Word size: 3 bits
• For example, the 3-bit word stored at address 10 is 100
Address  Data
11       0 1 0
10       1 0 0
01       1 1 0
00       0 1 1
(depth = 4 words, width = 3 bits)
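Reading the example array amounts to decoding the address into a one-hot row select and returning that row. The Python sketch below models this with the contents of the 2^2 × 3-bit example; the names `ARRAY`, `decode`, and `read` are illustrative.

```python
# Example contents, indexed by address: 00 -> 011, 01 -> 110, 10 -> 100, 11 -> 010
ARRAY = [0b011, 0b110, 0b100, 0b010]


def decode(address, n=2):
    """N-to-2^N decoder: exactly one output (row select) is 1."""
    return [1 if row == address else 0 for row in range(2 ** n)]


def read(address):
    """Return the word stored in the row selected by the address."""
    word = 0
    for row_select, stored in zip(decode(address), ARRAY):
        if row_select:  # only the selected row contributes to the data output
            word |= stored
    return word
```

`read(0b10)` returns `0b100`, the word the slide gives for address 10.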
Memory Arrays
[Figure: 1024-word × 32-bit array with a 10-bit Address and 32-bit Data]
Memory Array Bit Cells
[Figure: bit cell connected to a wordline and a bitline]
(a) wordline = 1: stored bit = 0 gives bitline = 0; stored bit = 1 gives bitline = 1
(b) wordline = 0: bitline = Z whether the stored bit is 0 or 1
Memory Array
• Wordline:
– like an enable
– single row in memory array read/written
– corresponds to unique address
– only one wordline HIGH at once
[Figure: a 2:4 decoder drives wordline3 to wordline0 from the 2-bit Address; bitline2, bitline1, bitline0 read the stored bits of the selected row: address 11 stores 010, 10 stores 100, 01 stores 110, 00 stores 011]
RAM: Random Access Memory
• Volatile: loses its data when power off
• Read and written quickly
• Main memory in your computer is RAM
(DRAM)
ROM: Read Only Memory
• Nonvolatile: retains data when power off
• Read quickly, but writing is impossible or
slow
• Flash memory in cameras and thumb drives
is a type of ROM
Robert Dennard, 1932 -
• Invented DRAM in
1966 at IBM
• Others were skeptical
that the idea would
work
• By the mid-1970’s
DRAM was in virtually all
computers
DRAM
• Data bits stored on capacitor
• Dynamic because the value needs to be refreshed
(rewritten) periodically and after read:
– Charge leakage from the capacitor degrades the value
– Reading destroys the stored value
[Figure: DRAM bit cell with the stored bit connected to the bitline under control of the wordline]
DRAM
[Figure: DRAM bit cells with stored bit = 1 and stored bit = 0]
SRAM
[Figure: SRAM bit cell connected to a wordline and two bitlines]
Memory Arrays Review
[Figure: a 2:4 decoder drives wordline3 to wordline0 from the 2-bit Address; bitline2, bitline1, bitline0 read the stored bits of the selected row: address 11 stores 010, 10 stores 100, 01 stores 110, 00 stores 011]
Memory Comparison
ROM: Dot Notation
[Figure: ROM array with a 2:4 decoder driving wordlines 11, 10, 01, 00 and bitlines for Data2 Data1 Data0; a dot at a wordline/bitline crossing marks a bit cell containing 1, and no dot marks a bit cell containing 0]
Fujio Masuoka, 1944 -
• Developed memories and high
speed circuits at Toshiba, 1971-1994
• Invented Flash memory as an
unauthorized project pursued during
nights and weekends in the late
1970’s
• The process of erasing the memory
reminded him of the flash of a
camera
• Toshiba slow to commercialize the
idea; Intel was first to market in
1988
• Flash has grown into a $25 billion
per year market
ROM Storage
Address  Data2 Data1 Data0
11       0     1     0
10       1     0     0
01       1     1     0
00       0     1     1
(depth = 4 words, width = 3 bits)
[Figure: the same contents drawn as a ROM array in dot notation with a 2:4 decoder]
ROM Logic
Data2 = A1 ⊕ A0
Data1 = A1’ + A0
Data0 = A1’A0’
[Figure: ROM array in dot notation with a 2:4 decoder driven by the 2-bit Address (A1 A0) and outputs Data2 Data1 Data0]
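Each ROM output column is a logic function of the address bits. The Python check below compares the stored contents (11: 010, 10: 100, 01: 110, 00: 011) with the functions Data2 = A1 xor A0, Data1 = A1' + A0, Data0 = A1'A0'; the complements are reconstructed from the stored contents, and the names `ROM` and `data_from_logic` are illustrative.

```python
# ROM contents indexed by the address A1 A0; each word is Data2 Data1 Data0.
ROM = {0b11: 0b010, 0b10: 0b100, 0b01: 0b110, 0b00: 0b011}


def data_from_logic(a1, a0):
    """Compute Data2 Data1 Data0 from the logic equations instead of a lookup."""
    data2 = a1 ^ a0              # Data2 = A1 xor A0
    data1 = (1 - a1) | a0        # Data1 = A1' + A0
    data0 = (1 - a1) & (1 - a0)  # Data0 = A1'A0'
    return (data2 << 2) | (data1 << 1) | data0


if __name__ == "__main__":
    for address, word in sorted(ROM.items()):
        a1, a0 = (address >> 1) & 1, address & 1
        print(f"{address:02b}: ROM={word:03b} logic={data_from_logic(a1, a0):03b}")
```

The lookup and the equations agree at all four addresses, which is what it means for the ROM to perform this logic.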
Example: Logic with ROMs
Implement the following logic functions using a 2^2 × 3-bit ROM:
– X = AB
– Y = A + B
– Z = AB’
[Figure: ROM array with a 2:4 decoder driven by A, B, wordlines 11, 10, 01, 00, and outputs X Y Z]
Logic with Any Memory Array
Address  bitline2 bitline1 bitline0
11       0        1        0
10       1        0        0
01       1        1        0
00       0        1        1
Data2 = A1 ⊕ A0
Data1 = A1’ + A0
Data0 = A1’A0’
[Figure: a 2:4 decoder drives wordline3 to wordline0 from the 2-bit Address; the bitlines read the stored bits of the selected row]
Logic with Memory Arrays
Implement the following logic functions using a 2^2 × 3-bit memory array:
– X = AB
– Y = A + B
– Z = AB’
A, B   X Y Z
11     1 1 0
10     0 1 1
01     0 1 0
00     0 0 0
[Figure: a 2:4 decoder driven by A, B selects wordline3 to wordline0; bitline2, bitline1, bitline0 give X, Y, Z]
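The memory contents for a set of logic functions can be generated by evaluating each function at every address. The Python sketch below does this for X = AB, Y = A + B, Z = AB' (the complement in Z is reconstructed from the stored contents); `build_memory` is an illustrative helper name.

```python
def build_memory(functions, n_inputs):
    """Return {address: word}; each word holds one output bit per function, first function as msb."""
    memory = {}
    for address in range(2 ** n_inputs):
        # Split the address into input bits, most significant first (A, then B).
        bits = [(address >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        word = 0
        for f in functions:
            word = (word << 1) | f(*bits)
        memory[address] = word
    return memory


X = lambda a, b: a & b        # X = AB
Y = lambda a, b: a | b        # Y = A + B
Z = lambda a, b: a & (1 - b)  # Z = AB'

if __name__ == "__main__":
    for address, word in sorted(build_memory([X, Y, Z], 2).items(), reverse=True):
        print(f"{address:02b}: {word:03b}")
```

The generated words match the stored bits in the solution: 110 at address 11, 011 at 10, 010 at 01, and 000 at 00.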
Logic with Memory Arrays
Called lookup tables (LUTs): look up output at each input combination (address)
Truth Table (also the contents of a 4-word x 1-bit array, with A = A1 and B = A0):
A B   Y
0 0   0
0 1   0
1 0   0
1 1   1
[Figure: a 2:4 decoder driven by A1 A0 selects one of four 1-bit cells on a single bitline; stored bits are 0, 0, 0, 1 for addresses 00, 01, 10, 11]
Multi-ported Memories
• Port: address/data pair
• 3-ported memory
– 2 read ports (A1/RD1, A2/RD2)
– 1 write port (A3/WD3, WE3 enables writing)
• Register file: small multi-ported memory
[Figure: 3-ported memory array with CLK and WE3, N-bit address inputs A1, A2, A3, M-bit read outputs RD1 and RD2, and M-bit write data input WD3]
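SystemVerilog Memory Arrays
// 256 x 3 memory module with one read/write port
module dmem(input  logic       clk, we,
            input  logic [7:0] a,
            input  logic [2:0] wd,
            output logic [2:0] rd);

  logic [2:0] RAM[255:0];  // 256 words of 3 bits

  assign rd = RAM[a];      // combinational read

  always_ff @(posedge clk) // write on the clock edge when we is set
    if (we) RAM[a] <= wd;
endmodule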
Logic Arrays
• PLAs (Programmable logic arrays)
– AND array followed by OR array
– Combinational logic only
– Fixed internal connections
• FPGAs (Field programmable gate arrays)
– Array of Logic Elements (LEs)
– Combinational and sequential logic
– Programmable internal connections
PLAs
• X = A’B’C + ABC’
• Y = AB’
[Figure: M inputs feed the AND array, which produces N implicants; the OR array combines them into P outputs. Example with inputs A, B, C, three implicants, and outputs X, Y]
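A PLA can be modelled as two tables: the AND array lists which literals each implicant needs, and the OR array lists which implicants each output sums. The Python sketch below assumes X = A'B'C + ABC' and Y = AB' (the complements for X follow from the LUT table in the LE configuration example; the one for Y is an assumption); `AND_PLANE`, `OR_PLANE`, and `pla` are illustrative names.

```python
# AND array: each implicant maps an input name to the value it requires.
AND_PLANE = [
    {"A": 0, "B": 0, "C": 1},  # A'B'C
    {"A": 1, "B": 1, "C": 0},  # ABC'
    {"A": 1, "B": 0},          # AB'
]

# OR array: each output lists the implicants (by index) that it ORs together.
OR_PLANE = {"X": [0, 1], "Y": [2]}


def pla(a, b, c):
    """Evaluate the two-level AND-OR logic for one input combination."""
    inputs = {"A": a, "B": b, "C": c}
    implicants = [
        int(all(inputs[name] == value for name, value in term.items()))
        for term in AND_PLANE
    ]
    return {
        output: int(any(implicants[i] for i in rows))
        for output, rows in OR_PLANE.items()
    }
```

Reprogramming the PLA corresponds to editing the two tables; the evaluation code stays the same.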
PLAs: Dot Notation
[Figure: the same PLA (inputs A, B, C; AND array producing three implicants; OR array producing X and Y) drawn in dot notation]
FPGA: Field Programmable Gate Array
• Composed of:
– LEs (Logic elements): perform logic
– IOEs (Input/output elements): interface with outside
world
– Programmable interconnection: connect LEs and
IOEs
– Some FPGAs include other building blocks such as
multipliers and RAMs
General FPGA Layout
CLB: Configurable Logic Block
IOB: Input-Output Block
LE: Logic Element
• Composed of:
– LUTs (lookup tables): perform combinational logic
– Flip-flops: perform sequential logic
– Multiplexers: connect LUTs and flip-flops
Altera Cyclone IV LE
• The Cyclone IV LE has:
– 1 four-input LUT
– 1 registered output
– 1 combinational output
LE Configuration Example
Show how to configure a Cyclone IV LE to perform the
following functions:
– X = A’B’C + ABC’
– Y = AB’
LE 1 (computes X): data 1 = A, data 2 = B, data 3 = C, data 4 = 0
(A)     (B)     (C)             (X)
data 1  data 2  data 3  data 4  LUT output
0       0       0       X       0
0       0       1       X       1
0       1       0       X       0
0       1       1       X       0
1       0       0       X       0
1       0       1       X       0
1       1       0       X       1
1       1       1       X       0
LE 2: computes Y
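Configuring a four-input LUT means filling in its 16 stored bits so that each address returns the desired function value. The Python sketch below generates the contents for X = A'B'C + ABC' with data 4 treated as a don't care; `lut4_config` and `lut4_read` are illustrative names and do not correspond to any vendor tool.

```python
def lut4_config(func):
    """Return the 16 stored bits of a 4-input LUT, indexed by data1 data2 data3 data4."""
    contents = []
    for index in range(16):
        d1, d2, d3, d4 = (index >> 3) & 1, (index >> 2) & 1, (index >> 1) & 1, index & 1
        contents.append(func(d1, d2, d3, d4))
    return contents


def lut4_read(contents, d1, d2, d3, d4):
    """Look up the stored bit for one input combination."""
    return contents[(d1 << 3) | (d2 << 2) | (d3 << 1) | d4]


def x_func(a, b, c, _unused):
    """X = A'B'C + ABC'; the fourth LUT input is ignored (don't care)."""
    return ((1 - a) & (1 - b) & c) | (a & b & (1 - c))


X_LUT = lut4_config(x_func)
```

Reading `X_LUT` with data 4 tied to 0 gives 0, 1, 0, 0, 0, 0, 1, 0 for ABC = 000 to 111, the LUT output column in the table.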
Exercise
Implement the function Y = JKLMPQR using Cyclone IV LEs.
Exercise
Implement the following sequential circuit using Cyclone IV LEs
FPGA Design Flow
Using a CAD tool (such as Altera’s Quartus II)
• Enter the design using schematic entry or an HDL
• Simulate the design
• Synthesize design and map it onto FPGA
• Download the configuration onto the FPGA
• Test the design