Hardware Issues

The document discusses hardware design issues, focusing on custom single-purpose processors and the distinctions between combinational and sequential logic circuits. It explains the functionality of transistors, CMOS implementations, basic logic gates, and the design of combinational and sequential components. Additionally, it covers the process of creating a datapath and controller for a finite-state machine with an example of a greatest common divisor algorithm.

Uploaded by

kxahoi12
Hardware Design Issues

Custom Single-Purpose Processor - Hardware
Combinational and Sequential logic
A combinational circuit is a logic circuit whose output is a function of only the
present inputs.

A sequential circuit is a logic circuit whose output depends not only on the
present values of its input signals but also on the sequence of past inputs (the
input history).
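
A minimal C sketch of the distinction. The names comb_and, ToggleFF and seq_toggle are illustrative, not from the slides: the first function depends only on its present arguments, while the second keeps one stored bit, so the same input can give different outputs depending on history.

```c
#include <stdbool.h>

/* Combinational: the output is a function of the present inputs only. */
bool comb_and(bool a, bool b) {
    return a && b;
}

/* Sequential: one stored bit makes the output depend on past inputs.
   Each call models one clock tick; the stored bit flips when t is 1. */
typedef struct { bool q; } ToggleFF;

bool seq_toggle(ToggleFF *ff, bool t) {
    if (t) ff->q = !ff->q;
    return ff->q;
}
```

Calling seq_toggle twice with t=1 returns 1 and then 0, which no combinational function of t alone could do.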
Combinational and Sequential logic
Transistor:
• The basic electrical component in digital systems.
• Acts as an on/off switch.
• Voltage at “gate” controls whether current flows from source to drain.
• Don’t confuse this “gate” with a logic gate.

[Figure: transistor symbol with source, gate, and drain terminals (conducts if gate=1), and a cross-section of an IC package showing the gate over an oxide layer, with source, channel, and drain in the silicon substrate.]
CMOS transistor implementations
Complementary Metal Oxide Semiconductor:
• We refer to logic levels
  • Typically 0 is 0V, 1 is 5V
• Two basic CMOS types
  • nMOS conducts if gate=1
  • pMOS conducts if gate=0
  • Hence “complementary”
• Basic gates
  • Inverter, NAND, NOR

[Figure: nMOS symbol (conducts if gate=1) and pMOS symbol (conducts if gate=0), followed by CMOS circuits for the inverter (F = x'), the NAND gate (F = (xy)'), and the NOR gate (F = (x+y)').]
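
A switch-level sketch in C of how the three basic gates follow from the two transistor types. It assumes the standard CMOS arrangement (a pMOS pull-up network to 1 and an nMOS pull-down network to 0) and ignores voltages and timing. All function names are illustrative.

```c
#include <stdbool.h>

/* Transistors as ideal switches. */
bool nmos_on(bool gate) { return gate; }    /* conducts if gate=1 */
bool pmos_on(bool gate) { return !gate; }   /* conducts if gate=0 */

/* Output is 1 when the pull-up network conducts and the pull-down
   network does not; in these gates the two are always complementary. */
bool cmos_inverter(bool x) {
    bool pull_up = pmos_on(x);
    bool pull_down = nmos_on(x);
    return pull_up && !pull_down;            /* F = x' */
}

bool cmos_nand(bool x, bool y) {
    bool pull_up = pmos_on(x) || pmos_on(y);     /* pMOS in parallel */
    bool pull_down = nmos_on(x) && nmos_on(y);   /* nMOS in series   */
    return pull_up && !pull_down;                /* F = (xy)'        */
}

bool cmos_nor(bool x, bool y) {
    bool pull_up = pmos_on(x) && pmos_on(y);     /* pMOS in series   */
    bool pull_down = nmos_on(x) || nmos_on(y);   /* nMOS in parallel */
    return pull_up && !pull_down;                /* F = (x+y)'       */
}
```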
Basic logic gates
One-input gates:

  Driver:   F = x
  Inverter: F = x'

  x | Driver  Inverter
  0 |   0        1
  1 |   1        0

Two-input gates:

  AND:  F = xy        NAND: F = (xy)'
  OR:   F = x + y     NOR:  F = (x+y)'
  XOR:  F = x ⊕ y     XNOR: F = (x ⊕ y)'

  x y | AND  OR  XOR  NAND  NOR  XNOR
  0 0 |  0    0   0    1     1    1
  0 1 |  0    1   1    1     0    0
  1 0 |  0    1   1    1     0    0
  1 1 |  1    1   0    0     0    1
Combinational logic design
A) Problem description

  y is 1 if a is 1, or b and c are 1.
  z is 1 if b or c is 1, but not both, or if all are 1.

B) Truth table

  Inputs | Outputs
  a b c  | y z
  0 0 0  | 0 0
  0 0 1  | 0 1
  0 1 0  | 0 1
  0 1 1  | 1 0
  1 0 0  | 1 0
  1 0 1  | 1 1
  1 1 0  | 1 1
  1 1 1  | 1 1

C) Output equations

  y = a'bc + ab'c' + ab'c + abc' + abc
  z = a'b'c + a'bc' + ab'c + abc' + abc

D) Minimized output equations

  y      bc
   a   00 01 11 10
   0    0  0  1  0
   1    1  1  1  1

  y = a + bc

  z      bc
   a   00 01 11 10
   0    0  1  0  1
   1    0  1  1  1

  z = ab + b'c + bc'

E) Logic gates

[Figure: gate-level circuit with inputs a, b, c and outputs y and z implementing the minimized equations.]
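
A quick way to confirm that the minimized equations in step D match the sum-of-minterms equations in step C is to compare them for all eight input combinations. A small C sketch, with illustrative function names:

```c
#include <stdbool.h>

/* Step C: sum-of-minterms output equations. */
bool y_sop(bool a, bool b, bool c) {
    return (!a && b && c) || (a && !b && !c) || (a && !b && c) ||
           (a && b && !c) || (a && b && c);
}
bool z_sop(bool a, bool b, bool c) {
    return (!a && !b && c) || (!a && b && !c) || (a && !b && c) ||
           (a && b && !c) || (a && b && c);
}

/* Step D: minimized output equations. */
bool y_min(bool a, bool b, bool c) { return a || (b && c); }
bool z_min(bool a, bool b, bool c) { return (a && b) || (!b && c) || (b && !c); }

/* Returns the number of input rows on which the two forms disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (int row = 0; row < 8; row++) {
        bool a = (row >> 2) & 1, b = (row >> 1) & 1, c = row & 1;
        if (y_sop(a, b, c) != y_min(a, b, c)) mismatches++;
        if (z_sop(a, b, c) != z_min(a, b, c)) mismatches++;
    }
    return mismatches;
}
```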
Combinational components
n-bit, m x 1 Multiplexor
  Inputs:  I(m-1) … I1 I0 (n bits each), select S(log m) … S0
  Output:  O (n bits)
  O = I0 if S=0..00
      I1 if S=0..01
      …
      I(m-1) if S=1..11

log n x n Decoder
  Inputs:  I(log n -1) … I0
  Outputs: O(n-1) … O1 O0
  O0 = 1 if I=0..00
  O1 = 1 if I=0..01
  …
  O(n-1) = 1 if I=1..11
  With enable input e → all O’s are 0 if e=0

n-bit Adder
  Inputs:  A, B (n bits each)
  Outputs: carry, sum (n bits)
  sum = A+B (first n bits)
  carry = (n+1)’th bit of A+B
  With carry-in input Ci: sum = A + B + Ci

n-bit Comparator
  Inputs:  A, B (n bits each)
  Outputs: less, equal, greater
  less = 1 if A<B
  equal = 1 if A=B
  greater = 1 if A>B

n bit, m function ALU
  Inputs:  A, B (n bits each), select S(log m) … S0
  Output:  O (n bits)
  O = A op B, op determined by S.
  May have status outputs carry, zero, etc.
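
The component behaviors above can be sketched in C for fixed sizes. The sizes (4x1 multiplexor, 2x4 decoder, 8-bit adder and comparator) and all names are choices made for this illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* 4x1 multiplexor: O = I[S]. */
uint8_t mux4(const uint8_t in[4], unsigned s) {
    return in[s & 3u];
}

/* 2x4 decoder with enable: one-hot output, all outputs 0 if e=0. */
uint8_t decoder2x4(unsigned i, bool e) {
    return e ? (uint8_t)(1u << (i & 3u)) : 0;
}

/* 8-bit adder with carry-in: sum is the low 8 bits, carry is the 9th bit. */
typedef struct { uint8_t sum; bool carry; } AddResult;

AddResult adder8(uint8_t a, uint8_t b, bool ci) {
    unsigned full = (unsigned)a + (unsigned)b + (ci ? 1u : 0u);
    AddResult r;
    r.sum = (uint8_t)(full & 0xFFu);
    r.carry = ((full >> 8) & 1u) != 0;
    return r;
}

/* 8-bit comparator: exactly one of the three outputs is 1. */
typedef struct { bool less, equal, greater; } CmpResult;

CmpResult comparator8(uint8_t a, uint8_t b) {
    CmpResult r;
    r.less = a < b;
    r.equal = a == b;
    r.greater = a > b;
    return r;
}
```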
Sequential components
n-bit Register
  Inputs:  I (n bits), load, clear
  Output:  Q (n bits)
  Q = 0 if clear=1,
      I if load=1 and clock=1,
      Q(previous) otherwise.

n-bit Shift register
  Inputs:  I, shift
  Output:  Q
  Q = lsb
  - Content shifted
  - I stored in msb

n-bit Counter
  Inputs:  clear, count
  Output:  Q (n bits)
  Q = 0 if clear=1,
      Q(prev)+1 if count=1 and clock=1.
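
A behavioral C sketch of the three components, fixed at 8 bits. Each function models one clock edge and returns the new stored value. Treating clear as sampled on the clock edge is a simplification made here, and the function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register: clear has priority, then load, otherwise hold. */
uint8_t register_tick(uint8_t q, uint8_t i, bool load, bool clear) {
    if (clear) return 0;
    if (load) return i;
    return q;
}

/* Shift register: content shifts toward the lsb, the input bit is
   stored in the msb. The serial output is the lsb of the contents. */
uint8_t shift_tick(uint8_t q, bool i, bool shift) {
    if (!shift) return q;
    return (uint8_t)((q >> 1) | (i ? 0x80u : 0u));
}

/* Counter: clear has priority, then count up (wraps at 8 bits). */
uint8_t counter_tick(uint8_t q, bool count, bool clear) {
    if (clear) return 0;
    if (count) return (uint8_t)(q + 1u);
    return q;
}
```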
Sequential logic design
A) Problem Description

  You want to construct a clock divider. Slow down your pre-existing clock so
  that you output a 1 for every four clock cycles.

B) State Diagram

[Figure: four states 0, 1, 2, 3 in a ring. Each state stays put while a=0 and advances to the next state when a=1 (state 3 returns to state 0). Output x=0 in states 0, 1, 2 and x=1 in state 3.]

C) Implementation Model

[Figure: combinational logic with input a and current state Q1 Q0, producing output x and next state I1 I0; a state register holds Q1 Q0.]

D) State Table (Moore-type)

  Inputs   | Outputs
  Q1 Q0 a  | I1 I0 x
  0  0  0  | 0  0  0
  0  0  1  | 0  1  0
  0  1  0  | 0  1  0
  0  1  1  | 1  0  0
  1  0  0  | 1  0  0
  1  0  1  | 1  1  0
  1  1  0  | 1  1  1
  1  1  1  | 0  0  1

Given this implementation model, sequential logic design quickly reduces to
combinational logic design.
Sequential logic design (cont.)
E) Minimized Output Equations

  I1     Q1Q0
   a   00 01 11 10
   0    0  0  1  1
   1    0  1  0  1

  I1 = Q1’Q0a + Q1a’ + Q1Q0’

  I0     Q1Q0
   a   00 01 11 10
   0    0  1  1  0
   1    1  0  0  1

  I0 = Q0a’ + Q0’a

  x      Q1Q0
   a   00 01 11 10
   0    0  0  1  0
   1    0  0  1  0

  x = Q1Q0

F) Combinational Logic

[Figure: gate-level circuit with inputs a, Q1, Q0 producing x, I1, and I0 from the minimized equations.]
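
The minimized equations can be exercised directly. The C sketch below applies them once per simulated clock edge; the struct and function names are illustrative. With a held at 1, the output x should be 1 in exactly one of every four cycles.

```c
#include <stdbool.h>

typedef struct { bool q1, q0; } DividerState;

/* Moore output: depends on the state only. x = Q1Q0 */
bool divider_output(DividerState s) {
    return s.q1 && s.q0;
}

/* Next-state logic, evaluated at one clock edge:
   I1 = Q1'Q0a + Q1a' + Q1Q0'
   I0 = Q0a' + Q0'a */
DividerState divider_tick(DividerState s, bool a) {
    DividerState n;
    n.q1 = (!s.q1 && s.q0 && a) || (s.q1 && !a) || (s.q1 && !s.q0);
    n.q0 = (s.q0 && !a) || (!s.q0 && a);
    return n;
}

/* Starting from state 00 with a held at 1, counts the cycles in which x=1. */
int divider_count_ones(int cycles) {
    DividerState s = { false, false };
    int ones = 0;
    for (int i = 0; i < cycles; i++) {
        if (divider_output(s)) ones++;
        s = divider_tick(s, true);
    }
    return ones;
}
```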
Custom single-purpose processor basic model

[Figure: controller and datapath. The controller receives external control inputs and datapath control outputs, and produces external control outputs and datapath control inputs. The datapath receives external data inputs and produces external data outputs.]

[Figure: a view inside the controller and datapath. The controller contains next-state and control logic and a state register. The datapath contains registers and functional units.]
Example: greatest common divisor
• First create algorithm
• Convert algorithm to “complex” state machine
  • Known as FSMD: finite-state machine with datapath
  • Can use templates to perform such conversion

(a) black-box view

[Figure: GCD block with inputs go_i, x_i, y_i and output d_o.]

(b) desired functionality

  0: int x, y;
  1: while (1) {
  2:   while (!go_i);
  3:   x = x_i;
  4:   y = y_i;
  5:   while (x != y) {
  6:     if (x < y)
  7:       y = y - x;
         else
  8:       x = x - y;
       }
  9:   d_o = x;
     }

(c) state diagram

[Figure: FSMD with one state per numbered line and one join state (J) per loop or branch.]

  1:    1 -> 2;  !1 -> exit
  2:    !go_i -> 2-J;  !(!go_i) -> 3
  2-J:  -> 2
  3:    x = x_i;  -> 4
  4:    y = y_i;  -> 5
  5:    x!=y -> 6;  !(x!=y) -> 9
  6:    x<y -> 7;  !(x<y) -> 8
  7:    y = y - x;  -> 6-J
  8:    x = x - y;  -> 6-J
  6-J:  -> 5-J
  5-J:  -> 5
  9:    d_o = x;  -> 1-J
  1-J:  -> 1
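
The FSMD can be stepped in software with one switch case per state. This is a sketch under assumptions, not part of the original design flow: go_i is taken to be 1 already, the function returns after one pass through the outer loop instead of looping forever, and inputs must be positive (the subtraction algorithm does not terminate otherwise). The names gcd_fsmd and State are illustrative.

```c
#include <stddef.h>

/* One enumerator per FSMD state, named after the algorithm line numbers. */
typedef enum { S1, S2, S2J, S3, S4, S5, S6, S7, S8, S6J, S5J, S9, S1J } State;

/* Steps the FSMD one state per iteration and returns d_o.
   If cycles is not NULL, it receives the number of states visited. */
int gcd_fsmd(int x_i, int y_i, int *cycles) {
    State s = S1;
    int x = 0, y = 0, d_o = 0, n = 0;
    for (;;) {
        n++;
        switch (s) {
        case S1:  s = S2; break;                      /* while (1)          */
        case S2:  s = S3; break;                      /* go_i assumed 1     */
        case S2J: s = S2; break;                      /* unused when go_i=1 */
        case S3:  x = x_i; s = S4; break;
        case S4:  y = y_i; s = S5; break;
        case S5:  s = (x != y) ? S6 : S9; break;
        case S6:  s = (x < y) ? S7 : S8; break;
        case S7:  y = y - x; s = S6J; break;
        case S8:  x = x - y; s = S6J; break;
        case S6J: s = S5J; break;
        case S5J: s = S5; break;
        case S9:  d_o = x; s = S1J; break;
        case S1J: if (cycles != NULL) *cycles = n;    /* stop after one pass */
                  return d_o;
        }
    }
}
```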
State diagram templates
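Assignment statement

  a = b
  next statement

  Template: a state performing a = b, followed by the next statement.

Loop statement

  while (cond) {
    loop-body-statements
  }
  next statement

  Template: a condition state C: with two transitions. On cond, go to the
  loop-body statements, then to a join state J:, then back to C:. On !cond,
  go to the next statement.

Branch statement

  if (c1)
    c1 stmts
  else if c2
    c2 stmts
  else
    other statements
  next statement

  Template: a condition state C: with three transitions. On c1, go to the c1
  stmts; on !c1*c2, go to the c2 stmts; on !c1*!c2, go to the others. All three
  branches meet at a join state J:, which leads to the next statement.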
Creating the datapath
• Create a register for any declared variable
• Create a functional unit for each arithmetic operation
• Connect the ports, registers and functional units
  • Based on reads and writes
  • Use multiplexors for multiple sources
• Create unique identifier
  • for each datapath component control input and output

[Figure: the GCD FSMD state diagram (states 1: through 1-J:) shown next to the datapath derived from it.]

Datapath
  Inputs:       x_i, y_i
  Multiplexors: two n-bit 2x1, selected by x_sel and y_sel
  Registers:    0: x (load x_ld), 0: y (load y_ld), 9: d (load d_ld)
  Comparators:  5: x!=y (output x_neq_y), 6: x<y (output x_lt_y)
  Subtractors:  8: x-y, 7: y-x
  Output:       d_o
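
A behavioral C sketch of this datapath. The multiplexor polarity (select 0 chooses the external input, select 1 chooses the subtractor result) follows the control values used for states 3, 4, 7 and 8 in the controller. The struct and function names are illustrative, and int stands in for an n-bit value.

```c
#include <stdbool.h>

typedef struct { int x, y, d; } Datapath;   /* registers 0: x, 0: y, 9: d */
typedef struct { bool x_sel, y_sel, x_ld, y_ld, d_ld; } Control;
typedef struct { bool x_neq_y, x_lt_y; } Status;

/* Combinational status outputs from comparators 5: and 6:. */
Status datapath_status(const Datapath *dp) {
    Status s;
    s.x_neq_y = dp->x != dp->y;
    s.x_lt_y = dp->x < dp->y;
    return s;
}

/* One clock edge: every register that is enabled loads a value
   computed from the register contents before the edge. */
void datapath_tick(Datapath *dp, Control c, int x_i, int y_i) {
    int x_sub = dp->x - dp->y;              /* subtractor 8: x-y */
    int y_sub = dp->y - dp->x;              /* subtractor 7: y-x */
    int x_next = c.x_sel ? x_sub : x_i;     /* 2x1 mux in front of x */
    int y_next = c.y_sel ? y_sub : y_i;     /* 2x1 mux in front of y */
    if (c.d_ld) dp->d = dp->x;              /* d_o takes the current x */
    if (c.x_ld) dp->x = x_next;
    if (c.y_ld) dp->y = y_next;
}
```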
Creating the controller’s FSM
• Same structure as FSMD
• Replace complex actions/conditions with datapath configurations

[Figure: the FSMD state diagram, the controller FSM derived from it, and the datapath with its control signals.]

Controller (state encoding, FSMD state, action or condition)

  0000  1:
  0001  2:    go_i -> 3;  !go_i -> 2-J
  0010  2-J:
  0011  3:    x_sel = 0, x_ld = 1
  0100  4:    y_sel = 0, y_ld = 1
  0101  5:    x_neq_y -> 6;  !x_neq_y -> 9
  0110  6:    x_lt_y -> 7;  !x_lt_y -> 8
  0111  7:    y_sel = 1, y_ld = 1
  1000  8:    x_sel = 1, x_ld = 1
  1001  6-J:
  1010  5-J:
  1011  9:    d_ld = 1
  1100  1-J:
Splitting into a controller and datapath
Controller implementation model

[Figure: combinational logic with inputs go_i, x_neq_y, x_lt_y and the current state Q3 Q2 Q1 Q0 from the state register. It produces x_sel, y_sel, x_ld, y_ld, d_ld and the next state I3 I2 I1 I0.]

Controller

[Figure: controller FSM with states 0000 (1:) through 1100 (1-J:). Transitions out of state 0101 (5:) are labeled x_neq_y=1 and x_neq_y=0; transitions out of state 0110 (6:) are labeled x_lt_y=1 and x_lt_y=0.]

(b) Datapath

[Figure: datapath with inputs x_i and y_i, two n-bit 2x1 multiplexors (x_sel, y_sel), registers 0: x and 0: y (x_ld, y_ld), comparators 5: x!=y and 6: x<y (x_neq_y, x_lt_y), subtractors 8: x-y and 7: y-x, and register 9: d (d_ld) driving d_o.]
Controller state table for the GCD example
(* = don't-care input, X = don't-care output)

  Inputs                               | Outputs
  Q3 Q2 Q1 Q0  x_neq_y  x_lt_y  go_i   | I3 I2 I1 I0  x_sel  y_sel  x_ld  y_ld  d_ld
  0  0  0  0      *       *      *     | 0  0  0  1     X      X     0     0     0
  0  0  0  1      *       *      0     | 0  0  1  0     X      X     0     0     0
  0  0  0  1      *       *      1     | 0  0  1  1     X      X     0     0     0
  0  0  1  0      *       *      *     | 0  0  0  1     X      X     0     0     0
  0  0  1  1      *       *      *     | 0  1  0  0     0      X     1     0     0
  0  1  0  0      *       *      *     | 0  1  0  1     X      0     0     1     0
  0  1  0  1      0       *      *     | 1  0  1  1     X      X     0     0     0
  0  1  0  1      1       *      *     | 0  1  1  0     X      X     0     0     0
  0  1  1  0      *       0      *     | 1  0  0  0     X      X     0     0     0
  0  1  1  0      *       1      *     | 0  1  1  1     X      X     0     0     0
  0  1  1  1      *       *      *     | 1  0  0  1     X      1     0     1     0
  1  0  0  0      *       *      *     | 1  0  0  1     1      X     1     0     0
  1  0  0  1      *       *      *     | 1  0  1  0     X      X     0     0     0
  1  0  1  0      *       *      *     | 0  1  0  1     X      X     0     0     0
  1  0  1  1      *       *      *     | 1  1  0  0     X      X     0     0     1
  1  1  0  0      *       *      *     | 0  0  0  0     X      X     0     0     0
  1  1  0  1      *       *      *     | 0  0  0  0     X      X     0     0     0
  1  1  1  0      *       *      *     | 0  0  0  0     X      X     0     0     0
  1  1  1  1      *       *      *     | 0  0  0  0     X      X     0     0     0
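
The state table maps directly onto a next-state and output function. The C sketch below transcribes it row by row; driving the don't-care (X) outputs as 0 is a choice made here, and the names ControllerOut and gcd_controller are illustrative.

```c
#include <stdbool.h>

typedef struct {
    unsigned next;                        /* next state I3..I0 */
    bool x_sel, y_sel, x_ld, y_ld, d_ld;  /* don't-cares driven as 0 */
} ControllerOut;

/* Combinational next-state and control logic for the current state
   q (Q3..Q0) and the three controller inputs. */
ControllerOut gcd_controller(unsigned q, bool x_neq_y, bool x_lt_y, bool go_i) {
    ControllerOut o = { 0, false, false, false, false, false };
    switch (q & 0xFu) {
    case 0x0: o.next = 0x1; break;                            /* 1:   */
    case 0x1: o.next = go_i ? 0x3 : 0x2; break;               /* 2:   */
    case 0x2: o.next = 0x1; break;                            /* 2-J: */
    case 0x3: o.next = 0x4; o.x_ld = true; break;             /* 3: x_sel=0 */
    case 0x4: o.next = 0x5; o.y_ld = true; break;             /* 4: y_sel=0 */
    case 0x5: o.next = x_neq_y ? 0x6 : 0xB; break;            /* 5:   */
    case 0x6: o.next = x_lt_y ? 0x7 : 0x8; break;             /* 6:   */
    case 0x7: o.next = 0x9; o.y_sel = true; o.y_ld = true; break;  /* 7: */
    case 0x8: o.next = 0x9; o.x_sel = true; o.x_ld = true; break;  /* 8: */
    case 0x9: o.next = 0xA; break;                            /* 6-J: */
    case 0xA: o.next = 0x5; break;                            /* 5-J: */
    case 0xB: o.next = 0xC; o.d_ld = true; break;             /* 9:   */
    default:  o.next = 0x0; break;          /* 1-J: and unused encodings */
    }
    return o;
}
```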
Completing the GCD custom single-purpose processor design
• We finished the datapath
• We have a state table for the next state and control logic
  • All that’s left is combinational logic design
• This is not an optimized design, but we see the basic steps

[Figure: a view inside the controller and datapath. The controller contains next-state and control logic and a state register. The datapath contains registers and functional units.]
RT-level custom single-purpose processor design
• We often start with a state machine
  • Rather than algorithm
  • Cycle timing often too central to functionality
• Example
  • Bus bridge that converts 4-bit bus to 8-bit bus
  • Start with FSMD
  • Known as register-transfer (RT) level
  • Exercise: complete the design

Problem Specification

[Figure: Sender -> Bridge -> Receiver, sharing a clock. The sender drives rdy_in and data_in(4); the bridge drives rdy_out and data_out(8).]

  Bridge: A single-purpose processor that converts two 4-bit inputs, arriving
  one at a time over data_in along with a rdy_in pulse, into one 8-bit output
  on data_out along with a rdy_out pulse.

FSMD

  Bridge
  WaitFirst4:       rdy_in=0 -> WaitFirst4;  rdy_in=1 -> RecFirst4Start
  RecFirst4Start:   data_lo=data_in;  -> RecFirst4End
  RecFirst4End:     rdy_in=1 -> RecFirst4End;  rdy_in=0 -> WaitSecond4
  WaitSecond4:      rdy_in=0 -> WaitSecond4;  rdy_in=1 -> RecSecond4Start
  RecSecond4Start:  data_hi=data_in;  -> RecSecond4End
  RecSecond4End:    rdy_in=1 -> RecSecond4End;  rdy_in=0 -> Send8Start
  Send8Start:       data_out=data_hi & data_lo, rdy_out=1;  -> Send8End
  Send8End:         rdy_out=0;  -> WaitFirst4

  Inputs
    rdy_in: bit; data_in: bit[4];
  Outputs
    rdy_out: bit; data_out: bit[8]
  Variables
    data_lo, data_hi: bit[4];
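
A software sketch of the bridge FSMD, one function call per clock cycle. It assumes that `&` in Send8Start means concatenation, with data_hi as the upper four bits, and that each rdy_in pulse stays high for at least two cycles so the nibble is still valid when the receive state captures it. The type and function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    WaitFirst4, RecFirst4Start, RecFirst4End,
    WaitSecond4, RecSecond4Start, RecSecond4End,
    Send8Start, Send8End
} BridgeState;

typedef struct {
    BridgeState state;
    uint8_t data_lo, data_hi;   /* 4-bit variables */
    uint8_t data_out;           /* 8-bit output */
    bool rdy_out;
} Bridge;

/* One clock cycle: perform the current state's action, then move on. */
void bridge_tick(Bridge *b, bool rdy_in, uint8_t data_in) {
    switch (b->state) {
    case WaitFirst4:
        if (rdy_in) b->state = RecFirst4Start;
        break;
    case RecFirst4Start:
        b->data_lo = data_in & 0x0Fu;
        b->state = RecFirst4End;
        break;
    case RecFirst4End:
        if (!rdy_in) b->state = WaitSecond4;
        break;
    case WaitSecond4:
        if (rdy_in) b->state = RecSecond4Start;
        break;
    case RecSecond4Start:
        b->data_hi = data_in & 0x0Fu;
        b->state = RecSecond4End;
        break;
    case RecSecond4End:
        if (!rdy_in) b->state = Send8Start;
        break;
    case Send8Start:
        /* concatenate: data_hi is the upper nibble (assumption) */
        b->data_out = (uint8_t)((b->data_hi << 4) | b->data_lo);
        b->rdy_out = true;
        b->state = Send8End;
        break;
    case Send8End:
        b->rdy_out = false;
        b->state = WaitFirst4;
        break;
    }
}
```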
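RT-level custom single-purpose processor design (cont.)

(a) Controller

  Bridge
  WaitFirst4:       rdy_in=0 -> WaitFirst4;  rdy_in=1 -> RecFirst4Start
  RecFirst4Start:   data_lo_ld=1;  -> RecFirst4End
  RecFirst4End:     rdy_in=1 -> RecFirst4End;  rdy_in=0 -> WaitSecond4
  WaitSecond4:      rdy_in=0 -> WaitSecond4;  rdy_in=1 -> RecSecond4Start
  RecSecond4Start:  data_hi_ld=1;  -> RecSecond4End
  RecSecond4End:    rdy_in=1 -> RecSecond4End;  rdy_in=0 -> Send8Start
  Send8Start:       data_out_ld=1, rdy_out=1;  -> Send8End
  Send8End:         rdy_out=0;  -> WaitFirst4

(b) Datapath

[Figure: data_in(4) feeds registers data_hi and data_lo, loaded by data_hi_ld and data_lo_ld. Their outputs together feed the data_out register, loaded by data_out_ld. clk goes to all registers. The controller takes rdy_in and drives rdy_out.]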
Optimizing single-purpose processors
Optimization is the task of making design
metric values the best possible

Optimization opportunities
original program
FSMD
datapath
FSM
Optimizing the original program
Analyze program attributes and look for areas of
possible improvement
number of computations
size of variable
time and space complexity
operations used
multiplication and division very expensive
Optimizing the original program (contd)
original program

  0: int x, y;
  1: while (1) {
  2:   while (!go_i);
  3:   x = x_i;
  4:   y = y_i;
  5:   while (x != y) {
  6:     if (x < y)
  7:       y = y - x;
         else
  8:       x = x - y;
       }
  9:   d_o = x;
     }

  GCD(42, 8): 9 iterations to complete the loop.
  x and y values evaluated as follows: (42, 8), (34, 8), (26, 8), (18, 8),
  (10, 8), (2, 8), (2, 6), (2, 4), (2, 2).

Replace the subtraction operation(s) with the modulo operation in order to
speed up the program.

optimized program

  0: int x, y, r;
  1: while (1) {
  2:   while (!go_i);
       // x must be the larger number
  3:   if (x_i >= y_i) {
  4:     x=x_i;
  5:     y=y_i;
       }
  6:   else {
  7:     x=y_i;
  8:     y=x_i;
       }
  9:   while (y != 0) {
  10:    r = x % y;
  11:    x = y;
  12:    y = r;
       }
  13:  d_o = x;
     }

  GCD(42, 8): 3 iterations to complete the loop.
  x and y values evaluated as follows: (42, 8), (8, 2), (2, 0).
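
The two counts can be reproduced by instrumenting both loops. In the C sketch below, pairs counts every (x, y) pair the loop condition evaluates, including the final one that ends the loop, which is the counting the figures above use. The names GcdTrace, gcd_subtract and gcd_modulo are illustrative, and inputs are assumed positive.

```c
typedef struct { int gcd; int pairs; } GcdTrace;

/* Original program: repeated subtraction. */
GcdTrace gcd_subtract(int x, int y) {
    GcdTrace t = { 0, 1 };          /* the initial pair counts as one */
    while (x != y) {
        if (x < y)
            y = y - x;
        else
            x = x - y;
        t.pairs++;
    }
    t.gcd = x;
    return t;
}

/* Optimized program: modulo, with x forced to be the larger number. */
GcdTrace gcd_modulo(int x_i, int y_i) {
    int x, y, r;
    GcdTrace t = { 0, 1 };
    if (x_i >= y_i) { x = x_i; y = y_i; }
    else            { x = y_i; y = x_i; }
    while (y != 0) {
        r = x % y;
        x = y;
        y = r;
        t.pairs++;
    }
    t.gcd = x;
    return t;
}
```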
Optimizing the FSMD
Areas of possible improvements
merge states
states with constants on transitions can be eliminated,
transition taken is already known
states with independent operations can be merged
separate states
states which require complex operations (a*b*c*d) can
be broken into smaller states to reduce hardware size
scheduling
Optimizing the FSMD (cont.)
original FSMD

  int x, y;
  1:    1 -> 2;  !1 -> exit
  2:    !go_i -> 2-J;  !(!go_i) -> 3
  2-J:  -> 2
  3:    x = x_i
  4:    y = y_i
  5:    x!=y -> 6;  !(x!=y) -> 9
  6:    x<y -> 7;  !(x<y) -> 8
  7:    y = y - x
  8:    x = x - y
  6-J:
  5-J:  -> 5
  9:    d_o = x
  1-J:  -> 1

Optimizations

  eliminate state 1: transitions have constant values
  merge state 2 and state 2J: no loop operation in between them
  merge state 3 and state 4: assignment operations are independent of one another
  merge state 5 and state 6: transitions from state 6 can be done in state 5
  eliminate state 5J and 6J: transitions from each state can be done from state 7 and state 8, respectively
  eliminate state 1-J: transition from state 1-J can be done directly from state 9

optimized FSMD

  int x, y;
  2:  !go_i -> 2;  go_i -> 3
  3:  x = x_i, y = y_i;  -> 5
  5:  x<y -> 7;  x>y -> 8;  otherwise -> 9
  7:  y = y - x;  -> 5
  8:  x = x - y;  -> 5
  9:  d_o = x;  -> 2
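
A software sketch of the optimized six-state FSMD, stepped one state per loop iteration so the number of states visited can be counted. As assumptions for illustration, go_i is taken to be 1, the function returns after state 9 instead of going back to state 2, and inputs must be positive. The names OptState and gcd_fsmd_opt are illustrative.

```c
#include <stddef.h>

typedef enum { O2, O3, O5, O7, O8, O9 } OptState;

/* Returns d_o. If cycles is not NULL, it receives the number of
   states visited. */
int gcd_fsmd_opt(int x_i, int y_i, int *cycles) {
    OptState s = O2;
    int x = 0, y = 0, n = 0;
    for (;;) {
        n++;
        switch (s) {
        case O2: s = O3; break;                       /* go_i assumed 1   */
        case O3: x = x_i; y = y_i; s = O5; break;     /* merged 3 and 4   */
        case O5: s = (x < y) ? O7 : (x > y) ? O8 : O9; break;  /* merged 5, 6 */
        case O7: y = y - x; s = O5; break;            /* straight back to 5 */
        case O8: x = x - y; s = O5; break;
        case O9: if (cycles != NULL) *cycles = n;     /* d_o = x */
                 return x;
        }
    }
}
```

For equal inputs the machine visits only states 2, 3, 5 and 9.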
Optimizing the datapath
Sharing of functional units
one-to-one mapping, as done previously, is not
necessary
if same operation occurs in different states, they
can share a single functional unit
Multi-functional units
an ALU supports a variety of operations, so it can be
shared among operations occurring in different
states
Optimizing the FSM
State minimization
task of merging equivalent states into a single state
two states are equivalent if, for all possible input combinations, they
generate the same outputs and transition to the same
next state
