Hardware Issues
Sequential logic is a type of logic circuit whose output depends not only on the
present value of its input signals but also on the sequence of past inputs (the
input history).
Combinational and Sequential logic
Transistor:
The basic electrical component in digital systems.
[Figure: a transistor inside an IC package, showing the source, gate, oxide, channel, and drain on a silicon substrate]
CMOS transistor implementations
Complementary Metal Oxide Semiconductor
We refer to logic levels
Typically 0 is 0V, 1 is 5V
[Figure: nMOS and pMOS transistor symbols, each with source, gate, and drain; the nMOS conducts if gate=1, the pMOS conducts if gate=0]
Two basic CMOS types
nMOS conducts if gate=1
pMOS conducts if gate=0
Hence “complementary”
Basic gates
Inverter, NAND, NOR
[Figure: CMOS implementations of the inverter (F = x'), the NAND gate (F = (xy)'), and the NOR gate (F = (x+y)')]
Basic logic gates
One-input gates:
  x | Driver (F = x) | Inverter (F = x')
  0 |       0        |        1
  1 |       1        |        0
Two-input gates:
  x y | AND      | OR        | XOR         | NAND        | NOR          | XNOR
      | F = xy   | F = x+y   | F = x ⊕ y   | F = (xy)'   | F = (x+y)'   | F = (x ⊕ y)'
  0 0 |    0     |     0     |      0      |      1      |      1       |      1
  0 1 |    0     |     1     |      1      |      1      |      0       |      0
  1 0 |    0     |     1     |      1      |      1      |      0       |      0
  1 1 |    1     |     1     |      0      |      0      |      0       |      1
Combinational logic design
A) Problem description
y is 1 if a is 1, or b and c are 1.
z is 1 if b or c is 1, but not both, or if all are 1.
B) Truth table
  Inputs | Outputs
  a b c  | y z
  0 0 0  | 0 0
  0 0 1  | 0 1
  0 1 0  | 0 1
  0 1 1  | 1 0
  1 0 0  | 1 0
  1 0 1  | 1 1
  1 1 0  | 1 1
  1 1 1  | 1 1
C) Output equations
y = a'bc + ab'c' + ab'c + abc' + abc
z = a'b'c + a'bc' + ab'c + abc' + abc
D) Minimized output equations
Karnaugh map for y:
  a \ bc | 00 01 11 10
    0    |  0  0  1  0
    1    |  1  1  1  1
y = a + bc
Karnaugh map for z:
  a \ bc | 00 01 11 10
    0    |  0  1  0  1
    1    |  0  1  1  1
z = ab + b'c + bc'
E) Logic Gates
[Figure: gate-level circuit with inputs a, b, c and outputs y, z]
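The minimization step can be checked exhaustively, since there are only eight input combinations. The following is a minimal sketch in C; the function names (y_sop, z_sop, y_min, z_min, forms_agree) are illustrative and not part of the slides.

```c
#include <stdbool.h>

/* Sum-of-products forms read directly off the truth table. */
bool y_sop(bool a, bool b, bool c) {
    return (!a && b && c) || (a && !b && !c) || (a && !b && c) ||
           (a && b && !c) || (a && b && c);
}

bool z_sop(bool a, bool b, bool c) {
    return (!a && !b && c) || (!a && b && !c) || (a && !b && c) ||
           (a && b && !c) || (a && b && c);
}

/* Minimized forms taken from the Karnaugh maps. */
bool y_min(bool a, bool b, bool c) { return a || (b && c); }
bool z_min(bool a, bool b, bool c) { return (a && b) || (!b && c) || (b && !c); }

/* Returns true if each minimized form matches its sum-of-products form
   on all 8 input combinations. */
bool forms_agree(void) {
    for (int i = 0; i < 8; i++) {
        bool a = (i >> 2) & 1, b = (i >> 1) & 1, c = i & 1;
        if (y_sop(a, b, c) != y_min(a, b, c)) return false;
        if (z_sop(a, b, c) != z_min(a, b, c)) return false;
    }
    return true;
}
```

Calling forms_agree() should return true if the minimized equations are equivalent to the unminimized ones.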
Combinational components
Multiplexor: n-bit, m x 1. Inputs I(m-1) … I1 I0, select lines S(log m) … S0, n-bit output.
Decoder: log n x n. Inputs I(log n -1) … I0. With enable input e, all O's are 0 if e=0.
Adder: n-bit. Inputs A and B. With carry-in input Ci, sum = A + B + Ci.
Comparator: n-bit. Inputs A and B.
ALU: n bit, m function. Inputs A and B, select lines S(log m) … S0. May have status outputs carry, zero, etc.
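As a rough behavioral illustration of three of these components, the sketch below models them as C functions. The 8-bit width, the function names, and the one-hot decoder output are assumptions made for the example; the slides only give the block symbols.

```c
#include <assert.h>
#include <stdint.h>

/* m x 1 multiplexor: passes the selected input to the output. */
uint8_t mux(const uint8_t *inputs, unsigned m, unsigned sel) {
    assert(sel < m);
    return inputs[sel];
}

/* 3 x 8 decoder with enable: exactly one output bit is 1 when e = 1,
   and all outputs are 0 when e = 0. */
uint8_t decoder(unsigned i, int e) {
    assert(i < 8);
    return e ? (uint8_t)(1u << i) : 0;
}

/* 8-bit adder with carry-in: sum = A + B + Ci, carry-out written to *co. */
uint8_t adder(uint8_t a, uint8_t b, int ci, int *co) {
    unsigned s = (unsigned)a + (unsigned)b + (ci ? 1u : 0u);
    *co = (int)((s >> 8) & 1u);
    return (uint8_t)s;
}
```

For instance, adder(200, 100, 1, &co) overflows 8 bits, so the carry-out is set and the returned sum is the low byte of 301.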
Sequential components
Register (n-bit): inputs I (n bits), load, clear; output Q (n bits).
  Q = 0 if clear=1,
      I if load=1 and clock=1,
      Q(previous) otherwise.
Shift register (n-bit): inputs I, shift; output Q.
  Q = lsb
  - Content shifted
  - I stored in msb
Counter (n-bit): output Q (n bits).
  Q = 0 if clear=1,
      Q(prev)+1 if count=1 and clock=1.
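The update rules above can be read as "next value" functions evaluated once per clock edge. The sketch below assumes 8-bit components and a right-shifting shift register (so that the lsb is the output and I enters at the msb); the function names are illustrative.

```c
#include <stdint.h>

/* Register: value of Q after one clock edge. */
uint8_t register_step(uint8_t q_prev, uint8_t i, int clear, int load) {
    if (clear) return 0;
    if (load) return i;
    return q_prev;               /* Q(previous) otherwise */
}

/* Shift register: content after one clock edge. When shift = 1 the
   content moves one position toward the lsb and I is stored in the msb. */
uint8_t shift_step(uint8_t content, int i_bit, int shift) {
    if (!shift) return content;
    return (uint8_t)((content >> 1) | (i_bit ? 0x80u : 0u));
}

/* Shift register output: Q is the lsb of the content. */
int shift_q(uint8_t content) { return content & 1; }

/* Counter: value of Q after one clock edge; wraps around at 2^8. */
uint8_t counter_step(uint8_t q_prev, int clear, int count) {
    if (clear) return 0;
    if (count) return (uint8_t)(q_prev + 1);
    return q_prev;
}
```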
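Sequential logic design
A) Problem Description
[Figure: state diagram with states 0 through 3 and transitions labelled a=1]
C) Implementation Model
[Figure: combinational logic with input a and output x, connected to a state register with inputs I1, I0 and outputs Q1, Q0]
D) State Table (Moore-type)
Karnaugh map for I1:
  a \ Q1Q0 | 00 01 11 10
     0     |  0  0  1  1
     1     |  0  1  0  1
I1 = Q1'Q0a + Q1a' + Q1Q0'
Karnaugh map for I0:
  a \ Q1Q0 | 00 01 11 10
     0     |  0  1  1  0
     1     |  1  0  0  1
I0 = Q0a' + Q0'a
Karnaugh map for x:
  a \ Q1Q0 | 00 01 11 10
     0     |  0  0  1  0
     1     |  0  0  1  0
x = Q1Q0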
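The three equations can be simulated directly to see what the machine does. In the sketch below (type and function names are illustrative), the state advances by one, modulo 4, on each clock where a=1, holds when a=0, and the Moore output x is 1 only in state 3.

```c
#include <stdbool.h>

typedef struct { bool q1, q0; } State;

/* Next-state logic: the I1 and I0 equations from the Karnaugh maps. */
State next_state(State s, bool a) {
    State n;
    n.q1 = (!s.q1 && s.q0 && a) || (s.q1 && !a) || (s.q1 && !s.q0);
    n.q0 = (s.q0 && !a) || (!s.q0 && a);
    return n;
}

/* Moore output: x depends only on the current state. */
bool output_x(State s) { return s.q1 && s.q0; }

/* State as a number 0..3 (Q1 is the high bit). */
int state_number(State s) { return (s.q1 ? 2 : 0) + (s.q0 ? 1 : 0); }
```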
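Custom single-purpose processor basic model
[Figure: a controller and a datapath. The controller has external control inputs and external control outputs; the datapath has external data inputs and external data outputs. The two are connected by datapath control inputs and datapath control outputs.]
[Figure: a view inside the controller and datapath. The controller contains next-state and control logic and a state register; the datapath contains registers and functional units.]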
Convert algorithm to … such conversion
[Figure: (c) state diagram of the algorithm that begins "1: while (1) {" and "2: while (!go_i);", with inputs go_i, x_i, y_i and transition conditions such as !go_i and x!=y]
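State diagram templates
Assignment statement
  a = b
  next statement
  Template: a state performing a=b, followed by the next statement.
Loop statement
  while (cond) {
    loop-body-statements
  }
  next statement
  Template: state C: goes to the loop-body statements on cond and to the next statement on !cond; state J: follows the loop body.
Branch statement
  if (c1)
    c1 stmts
  else if c2
    c2 stmts
  else
    other statements
  next statement
  Template: state C: goes to the c1 stmts on c1, to the c2 stmts on !c1*c2, and to the other statements on !c1*!c2; the branches meet at state J:, followed by the next statement.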
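To make the loop template concrete, the sketch below rewrites a small while loop as an explicit state machine, one switch case per state. The loop itself (summing the integers below n) and all names are hypothetical; it is also an assumption of this sketch that the join state J: returns to the condition state C:.

```c
/* Source loop being converted (hypothetical example):
       i = 0; sum = 0;
       while (i < n) { sum += i; i++; }
       return sum;
*/
typedef enum { ST_INIT, ST_C, ST_BODY, ST_J, ST_NEXT } LoopState;

int sum_below(int n) {
    int i = 0, sum = 0;
    LoopState st = ST_INIT;
    for (;;) {                     /* each pass is one state transition */
        switch (st) {
        case ST_INIT:              /* assignment states */
            i = 0; sum = 0;
            st = ST_C;
            break;
        case ST_C:                 /* C: branch on cond / !cond */
            st = (i < n) ? ST_BODY : ST_NEXT;
            break;
        case ST_BODY:              /* loop-body statements */
            sum += i; i++;
            st = ST_J;
            break;
        case ST_J:                 /* J: join, then re-test the condition */
            st = ST_C;
            break;
        case ST_NEXT:              /* next statement after the loop */
            return sum;
        }
    }
}
```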
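Creating the datapath
Create a register for any declared variable
Create a functional unit for each operation
Replace actions/conditions with datapath configurations
[Figure: Datapath. Inputs x_i and y_i feed the x and y registers (controlled by x_sel, x_ld, y_sel, y_ld). Functional units: != (5: x!=y), < (6: x<y), subtractor (8: x-y), subtractor (7: y-x). Status outputs: x_neq_y, x_lt_y. Register d (9: d) is loaded by d_ld.]
[Figure: controller FSM with state register outputs Q3 Q2 Q1 Q0 and inputs I3 I2 I1 I0. Transitions out of 1: are !1 and 1; transitions out of 2: are !(!go_i) and !go_i. Recoverable state encodings and actions:
  0010  2-J:
  0011  3:    x = x_i becomes x_sel = 0, x_ld = 1
  0101  5:    branches on x_neq_y=1 / x_neq_y=0
  0110  6:    branches on x_lt_y=1 / x_lt_y=0
  0111  7:    y_sel = 1, y_ld = 1
  1000  8:    x_sel = 1, x_ld = 1
  1001  6-J:
  1011  9:    d_ld = 1
  1100  1-J:]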
Controller state table for the GCD example
              Inputs              |                 Outputs
Q3 Q2 Q1 Q0 x_neq_y x_lt_y go_i | I3 I2 I1 I0 x_sel y_sel x_ld y_ld d_ld
 0  0  0  1    *       *     0   |  0  0  1  0   X     X     0    0    0
 0  0  0  1    *       *     1   |  0  0  1  1   X     X     0    0    0
 0  0  1  0    *       *     *   |  0  0  0  1   X     X     0    0    0
 0  0  1  1    *       *     *   |  0  1  0  0   0     X     1    0    0
 0  1  0  0    *       *     *   |  0  1  0  1   X     0     0    1    0
 0  1  0  1    0       *     *   |  1  0  1  1   X     X     0    0    0
 0  1  0  1    1       *     *   |  0  1  1  0   X     X     0    0    0
 0  1  1  0    *       0     *   |  1  0  0  0   X     X     0    0    0
 0  1  1  0    *       1     *   |  0  1  1  1   X     X     0    0    0
 0  1  1  1    *       *     *   |  1  0  0  1   X     1     0    1    0
 1  0  0  0    *       *     *   |  1  0  0  1   1     X     1    0    0
 1  0  0  1    *       *     *   |  1  0  1  0   X     X     0    0    0
 1  0  1  0    *       *     *   |  0  1  0  1   X     X     0    0    0
 1  0  1  1    *       *     *   |  1  1  0  0   X     X     0    0    1
 1  1  0  0    *       *     *   |  0  0  0  0   X     X     0    0    0
 1  1  0  1    *       *     *   |  0  0  0  0   X     X     0    0    0
 1  1  1  0    *       *     *   |  0  0  0  0   X     X     0    0    0
 1  1  1  1    *       *     *   |  0  0  0  0   X     X     0    0    0
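The state table can be exercised with a small cycle-by-cycle model of the controller and datapath together. The sketch below is an illustration under several assumptions: 8-bit registers, positive inputs, a 0000 row that simply advances to 0001 (no such row appears in the table above), and hypothetical names (Gcd, gcd_clock, gcd_run). Select and load signals are folded into the register updates noted in the comments.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t state; uint8_t x, y, d; } Gcd;

/* One clock cycle: the controller reads the datapath status outputs,
   applies the datapath configuration of the current state, and loads
   the next state. */
void gcd_clock(Gcd *g, int go_i, uint8_t x_i, uint8_t y_i) {
    int x_neq_y = (g->x != g->y);   /* status outputs, sampled before updates */
    int x_lt_y  = (g->x <  g->y);
    uint8_t next;
    switch (g->state) {
    case 0x0: next = 0x1; break;                             /* assumed row */
    case 0x1: next = go_i ? 0x3 : 0x2; break;
    case 0x2: next = 0x1; break;
    case 0x3: g->x = x_i; next = 0x4; break;                 /* x_sel=0, x_ld=1 */
    case 0x4: g->y = y_i; next = 0x5; break;                 /* y_sel=0, y_ld=1 */
    case 0x5: next = x_neq_y ? 0x6 : 0xB; break;
    case 0x6: next = x_lt_y ? 0x7 : 0x8; break;
    case 0x7: g->y = (uint8_t)(g->y - g->x); next = 0x9; break; /* y_sel=1, y_ld=1 */
    case 0x8: g->x = (uint8_t)(g->x - g->y); next = 0x9; break; /* x_sel=1, x_ld=1 */
    case 0x9: next = 0xA; break;
    case 0xA: next = 0x5; break;
    case 0xB: g->d = g->x; next = 0xC; break;                /* d_ld=1 */
    default:  next = 0x0; break;                             /* 1100 to 1111 */
    }
    g->state = next;
}

/* Holds go_i high and clocks the model until d has been loaded. */
uint8_t gcd_run(uint8_t x_i, uint8_t y_i) {
    assert(x_i > 0 && y_i > 0);
    Gcd g = {0, 0, 0, 0};
    do {
        gcd_clock(&g, 1, x_i, y_i);
    } while (g.state != 0xC);
    return g.d;
}
```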
Completing the GCD custom single-purpose processor design
We finished the datapath
Problem Specification
A single-purpose processor that converts two 4-bit inputs, arriving one
at a time over data_in along with a rdy_in pulse, into one 8-bit output on
data_out along with a rdy_out pulse.
[Figure: Sender, Bridge, and Receiver sharing a clock. The Sender drives rdy_in and data_in(4) into the Bridge; the Bridge drives rdy_out and data_out(8) into the Receiver.]
Rather than algorithm
Cycle timing often too central to functionality
Example
Bus bridge that converts 4-bit input to 8-bit output
[Figure: Bridge FSMD with transitions on rdy_in=0 and rdy_in=1, ending in states Send8Start (data_out_ld=1, rdy_out=1) and Send8End (rdy_out=0)]
[Figure: (b) Datapath. Controller signals rdy_in, rdy_out, and clk; data_in(4) feeds registers data_hi and data_lo, loaded by data_hi_ld and data_lo_ld; both feed register data_out, loaded by data_out_ld; clk goes to all registers.]
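A behavioral model can help check the cycle timing of such a bridge. The sketch below is only one plausible reading of the figures: the state names other than Send8Start and Send8End are hypothetical, and it is an assumption that the first 4-bit input becomes the low half of data_out.

```c
#include <stdint.h>

typedef enum {
    WAIT_FIRST4, REC_FIRST4_END,      /* hypothetical state names */
    WAIT_SECOND4, REC_SECOND4_END,    /* hypothetical state names */
    SEND8_START, SEND8_END            /* named in the source figure */
} BridgeState;

typedef struct {
    BridgeState st;
    uint8_t data_lo, data_hi, data_out;
    int rdy_out;
} Bridge;

/* One clock cycle of the bridge. */
void bridge_clock(Bridge *b, int rdy_in, uint8_t data_in) {
    switch (b->st) {
    case WAIT_FIRST4:                 /* wait for the first rdy_in pulse */
        if (rdy_in) { b->data_lo = data_in & 0xF; b->st = REC_FIRST4_END; }
        break;
    case REC_FIRST4_END:              /* wait for the pulse to end */
        if (!rdy_in) b->st = WAIT_SECOND4;
        break;
    case WAIT_SECOND4:                /* wait for the second rdy_in pulse */
        if (rdy_in) { b->data_hi = data_in & 0xF; b->st = REC_SECOND4_END; }
        break;
    case REC_SECOND4_END:
        if (!rdy_in) b->st = SEND8_START;
        break;
    case SEND8_START:                 /* data_out_ld=1, rdy_out=1 */
        b->data_out = (uint8_t)((b->data_hi << 4) | b->data_lo);
        b->rdy_out = 1;
        b->st = SEND8_END;
        break;
    case SEND8_END:                   /* rdy_out=0 */
        b->rdy_out = 0;
        b->st = WAIT_FIRST4;
        break;
    }
}
```

Feeding the model the values 0x5 and 0xA, each with a one-cycle rdy_in pulse, should produce data_out = 0xA5 with rdy_out high for one cycle.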
Optimizing single-purpose processors
Optimization is the task of making design
metric values the best possible
Optimization opportunities
original program
FSMD
datapath
FSM
Optimizing the original program
Analyze program attributes and look for areas of
possible improvement
number of computations
size of variable
time and space complexity
operations used
multiplication and division are very expensive
Optimizing the original program (contd)
original program
  0: int x, y;
  1: while (1) {
  2:   while (!go_i);
  3:   x = x_i;
  4:   y = y_i;
  5:   while (x != y) {
  6:     if (x < y)
  7:       y = y - x;
         else
  8:       x = x - y;
       }
  9:   d_o = x;
     }
replace the subtraction operation(s) with modulo operation in order to speed up program
optimized program
  0: int x, y, r;
  1: while (1) {
  2:   while (!go_i);
       // x must be the larger number
  3:   if (x_i >= y_i) {
  4:     x = x_i;
  5:     y = y_i;
       }
  6:   else {
  7:     x = y_i;
  8:     y = x_i;
       }
  9:   while (y != 0) {
 10:     r = x % y;
 11:     x = y;
 12:     y = r;
       }
 13:   d_o = x;
     }
Original program: GCD(42, 8) takes 8 subtractions (9 loop-condition checks) to complete the loop.
x and y take the values (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2).
Optimized program: GCD(42, 8) takes 2 modulo operations (3 loop-condition checks) to complete the loop.
x and y take the values (42, 8), (8, 2), (2, 0).
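The comparison can be reproduced with two small functions that return the GCD and count loop-body executions. This is a sketch with the I/O ports (go_i, x_i, y_i, d_o) replaced by ordinary parameters and return values; the function names and the iters counter are illustrative. Note that the counter tracks executions of the loop body, which is one less than the number of (x, y) pairs the loop passes through.

```c
#include <assert.h>

/* Subtraction-based GCD (original program). */
int gcd_sub(int x, int y, int *iters) {
    assert(x > 0 && y > 0);
    *iters = 0;
    while (x != y) {
        if (x < y)
            y = y - x;
        else
            x = x - y;
        (*iters)++;
    }
    return x;
}

/* Modulo-based GCD (optimized program). */
int gcd_mod(int x_i, int y_i, int *iters) {
    assert(x_i > 0 && y_i > 0);
    int x, y, r;
    /* x must be the larger number */
    if (x_i >= y_i) { x = x_i; y = y_i; }
    else            { x = y_i; y = x_i; }
    *iters = 0;
    while (y != 0) {
        r = x % y;
        x = y;
        y = r;
        (*iters)++;
    }
    return x;
}
```

For GCD(42, 8), gcd_sub executes its loop body 8 times and gcd_mod executes its loop body 2 times; both return 2.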
Optimizing the FSMD
Areas of possible improvements
merge states
states with constants on transitions can be eliminated,
since the transition taken is already known
states with independent operations can be merged
separate states
states which require complex operations (a*b*c*d) can
be broken into smaller states to reduce hardware size
scheduling
Optimizing the FSMD (cont.)
original FSMD
[Figure: int x, y; states 1: (transitions !1 and 1), 2: (transitions !(!go_i) and !go_i), 2-J:, 3: x = x_i, 5: (x!=y), 6: (x<y and !(x<y)), 7: y = y - x, 8: x = x - y, 6-J:, 5-J:, 9: d_o = x, 1-J:]
optimized FSMD
[Figure: int x, y; states 2: (transitions go_i and !go_i), 3: x = x_i, y = y_i, 5:, 7: y = y - x, 8: x = x - y, 9: d_o = x]
eliminate state 1 – transitions have constant values
merge state 2 and state 2J – no loop operation in between them
merge state 5 and state 6 – transitions from state 6 can be done in state 5
eliminate state 5J and 6J – transitions from each state can be done from state 7 and state 8, respectively
eliminate state 1-J – transition from state 1-J can be done directly from state 9
Optimizing the datapath
Sharing of functional units
one-to-one mapping, as done previously, is not
necessary
if the same operation occurs in different states, those
states can share a single functional unit
Multi-functional units
ALUs support a variety of operations, so one ALU can be
shared among operations occurring in different
states
Optimizing the FSM
State minimization
task of merging equivalent states into a single state
two states are equivalent if, for all possible input combinations,
they generate the same outputs and transition to the same
next state
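The equivalence test stated above can be written as a direct comparison of two rows of a state table. The sketch below assumes a Moore machine with a single 1-bit input; the StateRow layout and the function name are hypothetical. It performs only the one-step check described here; a full minimization procedure would repeat it after merging states.

```c
#include <stdbool.h>

#define NUM_INPUTS 2   /* assumption: one 1-bit input, so two input values */

typedef struct {
    int output;             /* output generated in this state */
    int next[NUM_INPUTS];   /* next state for each input value */
} StateRow;

/* Two states are equivalent if they generate the same output and, for
   every input combination, transition to the same next state. */
bool states_equivalent(const StateRow *table, int s1, int s2) {
    if (table[s1].output != table[s2].output) return false;
    for (int in = 0; in < NUM_INPUTS; in++) {
        if (table[s1].next[in] != table[s2].next[in]) return false;
    }
    return true;
}
```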