
ESD #2 Processors

The document discusses embedded system design and custom single-purpose processor design. It covers topics like combinational logic, sequential logic, and designing processors for specific computation tasks that are fast, small, and low power but with higher costs and longer development times.

SMJE4423 Embedded System Design

#2 Processors

Ir. Ts. Dr. Mohd Azlan Abu


MJIIT, ESE, PRA iKohza
Phone: 012-6512399
Email: [email protected]

Embedded System Design 1


Outline

• Introduction
• Combinational logic
• Sequential logic
• Custom single-purpose processor design
• RT-level custom single-purpose processor design



Introduction
• Processor
• Digital circuit that performs computation tasks
• Controller and datapath
• General-purpose: variety of computation tasks
• Single-purpose: one particular computation task
• Custom single-purpose: non-standard task
• A custom single-purpose processor may be
• Fast, small, low power
• But, high NRE (non-recurring engineering) cost, longer time-to-market, less flexible

[Figure: Digital camera chip, with blocks: lens, CCD, A2D, CCD preprocessor, Pixel coprocessor, D2A, JPEG codec, Microcontroller, Multiplier/Accum, DMA controller, Display ctrl, Memory controller, ISA bus interface, UART, LCD ctrl]


CMOS transistor on silicon
• Transistor
• The basic electrical component in digital systems
• Acts as an on/off switch
• Voltage at “gate” controls whether current flows from source to drain
• Donʼt confuse this “gate” with a logic gate

[Figure: nMOS transistor symbol with source, gate and drain (conducts if gate=1); IC package and cross-section of a transistor on the IC showing gate, oxide, source, channel, drain and silicon substrate]


CMOS transistor implementations
• Complementary Metal Oxide Semiconductor
• We refer to logic levels
• Typically 0 is 0V, 1 is 5V
• Two basic CMOS types
• nMOS conducts if gate=1
• pMOS conducts if gate=0
• Hence “complementary”
• Basic gates
• Inverter, NAND, NOR

[Figure: nMOS (conducts if gate=1) and pMOS (conducts if gate=0) transistor symbols; CMOS implementations of the inverter (F = x'), the NAND gate (F = (xy)') and the NOR gate (F = (x+y)')]
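The switch behaviour of the two transistor types can be sketched in C. This is a minimal switch-level model, not a circuit simulator: it assumes the standard static CMOS arrangement (pMOS transistors in the pull-up network, nMOS transistors in the pull-down network), and all function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* An nMOS transistor conducts if gate=1; a pMOS transistor conducts if gate=0. */
bool nmos_conducts(bool gate) { return gate; }
bool pmos_conducts(bool gate) { return !gate; }

/* In a static CMOS gate exactly one network conducts:
 * pull-up connects the output to 1, pull-down connects it to 0. */
bool cmos_output(bool pull_up, bool pull_down) {
    assert(pull_up != pull_down);
    return pull_up;
}

/* Inverter: one pMOS above, one nMOS below. F = x' */
bool cmos_inverter(bool x) {
    return cmos_output(pmos_conducts(x), nmos_conducts(x));
}

/* NAND: pMOS in parallel (either may pull up), nMOS in series. F = (xy)' */
bool cmos_nand(bool x, bool y) {
    bool pull_up = pmos_conducts(x) || pmos_conducts(y);
    bool pull_down = nmos_conducts(x) && nmos_conducts(y);
    return cmos_output(pull_up, pull_down);
}

/* NOR: pMOS in series, nMOS in parallel (either may pull down). F = (x+y)' */
bool cmos_nor(bool x, bool y) {
    bool pull_up = pmos_conducts(x) && pmos_conducts(y);
    bool pull_down = nmos_conducts(x) || nmos_conducts(y);
    return cmos_output(pull_up, pull_down);
}
```

The internal assertion documents why the two types are called complementary: for every input combination one network is on and the other is off.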


Basic logic gates

Gate      Equation        F for (x, y) = 00, 01, 10, 11
Driver    F = x           F = 0, 1 for x = 0, 1
AND       F = x y         0, 0, 0, 1
OR        F = x + y       0, 1, 1, 1
XOR       F = x ⊕ y       0, 1, 1, 0
Inverter  F = x'          F = 1, 0 for x = 0, 1
NAND      F = (x y)'      1, 1, 1, 0
NOR       F = (x + y)'    1, 0, 0, 0
XNOR      F = (x ⊕ y)'    1, 0, 0, 1


Combinational logic design
A) Problem description
y is 1 if a is 1, or b and c are 1. z is 1 if b or c is 1, but not both, or if all are 1.

B) Truth table
Inputs   Outputs
a b c    y z
0 0 0    0 0
0 0 1    0 1
0 1 0    0 1
0 1 1    1 0
1 0 0    1 0
1 0 1    1 1
1 1 0    1 1
1 1 1    1 1

C) Output equations
y = a'bc + ab'c' + ab'c + abc' + abc
z = a'b'c + a'bc' + ab'c + abc' + abc

D) Minimized output equations
K-map for y:
      bc
a     00 01 11 10
0     0  0  1  0
1     1  1  1  1
y = a + bc

K-map for z:
      bc
a     00 01 11 10
0     0  1  0  1
1     0  1  1  1
z = ab + b'c + bc'

E) Logic Gates
[Figure: gate-level circuit with inputs a, b, c and outputs y, z implementing the minimized equations]
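A quick way to gain confidence in a minimization step is to compare the minimized equations against the truth table row by row. The sketch below does that for y = a + bc and z = ab + b'c + bc'; the table constants are copied from the slide, while the function and array names are illustrative only.

```c
/* Truth table from the slide, indexed by (a << 2) | (b << 1) | c. */
static const int Y_TABLE[8] = {0, 0, 0, 1, 1, 1, 1, 1};
static const int Z_TABLE[8] = {0, 1, 1, 0, 0, 1, 1, 1};

/* Minimized output equations; a, b, c must each be 0 or 1. */
int y_min(int a, int b, int c) { return a | (b & c); }
int z_min(int a, int b, int c) { return (a & b) | (!b & c) | (b & !c); }

/* Returns the number of truth-table rows where a minimized
 * equation disagrees with the table (0 means they agree). */
int count_mismatches(void) {
    int mismatches = 0;
    for (int row = 0; row < 8; ++row) {
        int a = (row >> 2) & 1, b = (row >> 1) & 1, c = row & 1;
        if (y_min(a, b, c) != Y_TABLE[row]) ++mismatches;
        if (z_min(a, b, c) != Z_TABLE[row]) ++mismatches;
    }
    return mismatches;
}
```

With only three inputs an exhaustive check over all 8 rows is cheap; the same idea stops scaling once the input count grows, which is where CAD tools take over.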


Combinational components
I(log n -1) I0 A A B
B A B
I(m-1) I1 I0 n
… n n n n
n …
log n x n n-bit n bit,
S0 n-bit, m x 1 n-bit
Decoder Adder m function S0
… Multiplexor Comparato
ALU …
… n r
S(log m) S(log m)
n n
O(n-1) O1 O0 carry sum less equal greater
O O

O= O0 =1 if I=0..00 sum = A+B less = 1 if A<B O = A op B


I0 if S=0..00 O1 =1 if I=0..01 (first n bits) equal =1 if A=B op determined
I1 if S=0..01 … carry = (n+1)ʼth greater=1 if A>B by S.
… O(n-1) =1 if I=1..11 bit of A+B
I(m-1) if S=1..11

With enable input e With carry-in input May have status


à all Oʼs are 0 if e=0 Cià outputs carry, zero,
sum = A + B + Ci etc.

Embedded System Design 8
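The behaviour of these components can be written down as small C functions. This is a behavioural sketch under a few assumptions that are not on the slide: words are at most 32 bits wide, operands already fit in n bits, and the ALU operation set (`AluOp`) is a hypothetical example.

```c
#include <stdint.h>

/* m x 1 multiplexor: O = I[S]. The caller must keep sel below m. */
uint32_t mux(const uint32_t *inputs, unsigned sel) {
    return inputs[sel];
}

/* Decoder with enable: output bit number i is 1, all others are 0.
 * All outputs are 0 if enable is 0. Requires i < 32. */
uint32_t decoder(unsigned i, int enable) {
    return enable ? (UINT32_C(1) << i) : 0;
}

typedef struct { uint32_t sum; unsigned carry; } AdderOut;

/* n-bit adder with carry-in, for 1 <= n <= 32 and a, b < 2^n.
 * sum is the first n bits of a + b + ci, carry is the (n+1)'th bit. */
AdderOut add_n(uint32_t a, uint32_t b, unsigned ci, unsigned n) {
    uint64_t full = (uint64_t)a + b + (ci & 1u);
    AdderOut out;
    out.sum = (uint32_t)(full & ((UINT64_C(1) << n) - 1));
    out.carry = (unsigned)((full >> n) & 1u);
    return out;
}

typedef struct { int less, equal, greater; } CompareOut;

/* Comparator: exactly one of the three outputs is 1. */
CompareOut compare(uint32_t a, uint32_t b) {
    CompareOut out = { a < b, a == b, a > b };
    return out;
}

/* Hypothetical four-function ALU: op is determined by s. */
typedef enum { ALU_ADD, ALU_SUB, ALU_AND, ALU_OR } AluOp;

uint32_t alu(uint32_t a, uint32_t b, AluOp s) {
    switch (s) {
    case ALU_ADD: return a + b;
    case ALU_SUB: return a - b;
    case ALU_AND: return a & b;
    case ALU_OR:  return a | b;
    }
    return 0;
}
```

For example, `add_n(15, 1, 0, 4)` models a 4-bit adder overflowing: the sum wraps to 0 and the carry output is 1.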


Sequential components
• n-bit Register: inputs I (n bits), load, clear; output Q (n bits). Q = 0 if clear=1, I if load=1 and clock=1, Q(previous) otherwise.
• n-bit Shift register: inputs I, shift; output Q. Q = lsb. Content shifted, I stored in msb.
• n-bit Counter: output Q (n bits). Q = 0 if clear=1, Q(prev)+1 if count=1 and clock=1.
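Each sequential component can be modelled as a function that is called once per rising clock edge and returns the new stored value. The sketch below makes some simplifying assumptions that the slide does not state: clear is sampled on the clock edge like the other inputs, the counter wraps around at 2^n, and n is between 1 and 32. All names are illustrative.

```c
#include <stdint.h>

/* Mask with the low n bits set, for 1 <= n <= 32. */
uint32_t low_bits(unsigned n) {
    return (n >= 32) ? UINT32_MAX : ((UINT32_C(1) << n) - 1);
}

/* Register: 0 if clear=1, I if load=1, previous Q otherwise. */
uint32_t register_step(uint32_t q, uint32_t i, int load, int clear) {
    if (clear) return 0;
    if (load) return i;
    return q;
}

/* Shift register output: Q is the least significant bit of the content. */
unsigned shift_q(uint32_t content) {
    return content & 1u;
}

/* Shift register update: content moves one position toward the lsb
 * and the input bit is stored in the msb (bit n-1). */
uint32_t shift_step(uint32_t content, unsigned n, unsigned in_bit) {
    return (content >> 1) | ((uint32_t)(in_bit & 1u) << (n - 1));
}

/* Counter: 0 if clear=1, Q(prev)+1 if count=1, previous Q otherwise. */
uint32_t counter_step(uint32_t q, unsigned n, int count, int clear) {
    if (clear) return 0;
    if (count) return (q + 1) & low_bits(n);
    return q;
}
```

Keeping the update as a pure function of the old value and the inputs mirrors the hardware: the stored value only changes at the clock edge, and everything else is combinational.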


Sequential logic design
A) Problem Description
You want to construct a clock divider. Slow down your pre-existing clock so that you output a 1 for every four clock cycles.

B) State Diagram
Four states, 0 to 3. In each state, a=0 keeps the current state and a=1 advances to the next state (state 3 advances to state 0). x=0 in states 0, 1 and 2; x=1 in state 3.

C) Implementation Model
[Figure: combinational logic with inputs a, Q1, Q0 and outputs x, I1, I0; a state register holds Q1 Q0 and is loaded from I1 I0]

D) State Table (Moore-type)
Inputs       Outputs
Q1 Q0 a      I1 I0 x
0  0  0      0  0  0
0  0  1      0  1  0
0  1  0      0  1  0
0  1  1      1  0  0
1  0  0      1  0  0
1  0  1      1  1  0
1  1  0      1  1  1
1  1  1      0  0  1

• Given this implementation model
• Sequential logic design quickly reduces to combinational logic design


Sequential logic design (cont.)

F) Combinational Logic
[Figure: gate-level circuit with inputs a, Q1, Q0 and outputs x, I1, I0]
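The clock divider can be simulated by pairing a state register with the combinational logic. In the sketch below the next-state and output equations (I1 = Q1 xor (Q0 and a), I0 = Q0 xor a, x = Q1 and Q0) are one possible set derived here from the state table; they are not taken from the slide's gate figure, and the names are illustrative.

```c
typedef struct { int q1, q0; } DividerState;

/* State table from the slide, indexed by (Q1 << 2) | (Q0 << 1) | a.
 * Each row holds I1, I0, x. */
static const int STATE_TABLE[8][3] = {
    {0, 0, 0}, {0, 1, 0}, {0, 1, 0}, {1, 0, 0},
    {1, 0, 0}, {1, 1, 0}, {1, 1, 1}, {0, 0, 1}
};

/* Moore output: depends on the state only. */
int divider_output(DividerState s) { return s.q1 & s.q0; }

/* Next-state logic (assumed equations, checked against the table below). */
DividerState divider_next(DividerState s, int a) {
    DividerState n;
    n.q1 = s.q1 ^ (s.q0 & a);
    n.q0 = s.q0 ^ a;
    return n;
}

/* Returns 1 if the equations reproduce every row of the state table. */
int equations_match_table(void) {
    for (int row = 0; row < 8; ++row) {
        DividerState s = { (row >> 2) & 1, (row >> 1) & 1 };
        DividerState n = divider_next(s, row & 1);
        if (n.q1 != STATE_TABLE[row][0] || n.q0 != STATE_TABLE[row][1] ||
            divider_output(s) != STATE_TABLE[row][2]) {
            return 0;
        }
    }
    return 1;
}

/* Runs the divider from state 0 with a held at 1 and counts
 * the clock cycles in which x is 1. */
int count_pulses(int cycles) {
    DividerState s = {0, 0};
    int pulses = 0;
    for (int i = 0; i < cycles; ++i) {
        pulses += divider_output(s);
        s = divider_next(s, 1);
    }
    return pulses;
}
```

Over 8 clock cycles the output is 1 in exactly 2 of them, which is the divide-by-four behaviour the problem description asks for.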


Custom single-purpose processor basic model

[Figure, left: controller and datapath. Controller signals: external control inputs, external control outputs. Datapath signals: external data inputs, external data outputs. Between the two blocks: datapath control inputs and datapath control outputs.]

[Figure, right: a view inside the controller and datapath. Controller: next-state and control logic, state register. Datapath: registers, functional units.]


Example: greatest common divisor
• First create algorithm
• Convert algorithm to “complex” state machine
• Known as FSMD: finite-state machine with datapath
• Can use templates to perform such conversion

(a) black-box view
[Figure: GCD block with inputs go_i, x_i, y_i and output d_o]

(b) desired functionality (Euclidean algorithm)
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }

(c) state diagram
1:    1 → 2 (the !1 exit is never taken)
2:    !go_i → 2-J; !(!go_i) → 3
2-J:  → 2
3:    x = x_i; → 4
4:    y = y_i; → 5
5:    x!=y → 6; !(x!=y) → 9
6:    x<y → 7; !(x<y) → 8
7:    y = y - x; → 6-J
8:    x = x - y; → 6-J
6-J:  → 5-J
5-J:  → 5
9:    d_o = x; → 1-J
1-J:  → 1
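The FSMD can be executed directly as a state machine in C, with one `switch` case per state of the state diagram. This is a sketch under stated assumptions: go_i is held at 1, both inputs are positive (the subtraction loop does not terminate otherwise), the run stops once state 1-J is reached instead of looping forever, and the type and function names are invented for the example.

```c
/* One enumerator per FSMD state: 1, 2, 2-J, 3, 4, 5, 6, 7, 8, 6-J, 5-J, 9, 1-J. */
typedef enum { S1, S2, S2J, S3, S4, S5, S6, S7, S8, S6J, S5J, S9, S1J } GcdState;

typedef struct { unsigned d_o; unsigned states_visited; } GcdRun;

/* Runs one pass of the FSMD and reports the output together with
 * the number of states visited (one state per clock cycle). */
GcdRun gcd_fsmd(unsigned x_i, unsigned y_i) {
    GcdState state = S1;
    unsigned x = 0, y = 0;
    const int go_i = 1;          /* assumption: start signal already asserted */
    GcdRun run = { 0, 0 };
    for (;;) {
        run.states_visited++;
        switch (state) {
        case S1:  state = S2; break;
        case S2:  state = go_i ? S3 : S2J; break;
        case S2J: state = S2; break;
        case S3:  x = x_i; state = S4; break;
        case S4:  y = y_i; state = S5; break;
        case S5:  state = (x != y) ? S6 : S9; break;
        case S6:  state = (x < y) ? S7 : S8; break;
        case S7:  y = y - x; state = S6J; break;
        case S8:  x = x - y; state = S6J; break;
        case S6J: state = S5J; break;
        case S5J: state = S5; break;
        case S9:  run.d_o = x; state = S1J; break;
        case S1J: return run;    /* the FSMD itself would go back to state 1 */
        }
    }
}
```

Calling `gcd_fsmd(42, 8)` yields `d_o == 2`. The `states_visited` field makes the cost of the join states visible: every loop iteration passes through 5:, 6:, 7: or 8:, 6-J: and 5-J:.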


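State diagram templates

Assignment statement
a = b
next statement
Template: a state with action a = b, followed by a transition to the next statement.

Loop statement
while (cond) {
  loop-body-statements
}
next statement
Template: condition state C: with two transitions. cond → loop-body-statements → join state J: → back to C:. !cond → next statement.

Branch statement
if (c1)
  c1 stmts
else if c2
  c2 stmts
else
  other stmts
next statement
Template: condition state C: with three transitions. c1 → c1 stmts, !c1*c2 → c2 stmts, !c1*!c2 → others. All branches → join state J: → next statement.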


Creating the datapath
• Create a register for any declared variable
• Create a functional unit for each arithmetic operation
• Connect the ports, registers and functional units
• Based on reads and writes
• Use multiplexors for multiple sources
• Create unique identifier
• for each datapath component control input and output

[Figure: the GCD state diagram next to the resulting datapath. Datapath: inputs x_i, y_i; two n-bit 2x1 multiplexors selected by x_sel and y_sel feeding registers 0: x and 0: y (load inputs x_ld, y_ld); comparators 5: x!=y (output x_neq_y) and 6: x<y (output x_lt_y); subtractors 8: x-y and 7: y-x; register 9: d (load input d_ld) driving d_o]


Creating the controllerʼs FSM
• Same structure as FSMD
• Replace complex actions/conditions with datapath configurations

Controller FSM (state encoding, FSMD state, datapath configuration or condition):
0000  1:
0001  2:    go_i selects the next state (!go_i → 2-J)
0010  2-J:
0011  3:    x_sel = 0, x_ld = 1
0100  4:    y_sel = 0, y_ld = 1
0101  5:    x_neq_y → 6; !x_neq_y → 9
0110  6:    x_lt_y → 7; !x_lt_y → 8
0111  7:    y_sel = 1, y_ld = 1
1000  8:    x_sel = 1, x_ld = 1
1001  6-J:
1010  5-J:
1011  9:    d_ld = 1
1100  1-J:

[Figure: the FSMD state diagram, the controller FSM and the datapath (inputs x_i, y_i; multiplexors x_sel, y_sel; registers x, y, d with x_ld, y_ld, d_ld; comparators producing x_neq_y and x_lt_y; two subtractors; output d_o)]


Splitting into a controller and datapath


Controller state table for the GCD example
Inputs                              Outputs
Q3 Q2 Q1 Q0  x_neq_y x_lt_y go_i    I3 I2 I1 I0  x_sel y_sel x_ld y_ld d_ld
0  0  0  0   *       *      *       0  0  0  1   X     X     0    0    0
0  0  0  1   *       *      0       0  0  1  0   X     X     0    0    0
0  0  0  1   *       *      1       0  0  1  1   X     X     0    0    0
0  0  1  0   *       *      *       0  0  0  1   X     X     0    0    0
0  0  1  1   *       *      *       0  1  0  0   0     X     1    0    0
0  1  0  0   *       *      *       0  1  0  1   X     0     0    1    0
0  1  0  1   0       *      *       1  0  1  1   X     X     0    0    0
0  1  0  1   1       *      *       0  1  1  0   X     X     0    0    0
0  1  1  0   *       0      *       1  0  0  0   X     X     0    0    0
0  1  1  0   *       1      *       0  1  1  1   X     X     0    0    0
0  1  1  1   *       *      *       1  0  0  1   X     1     0    1    0
1  0  0  0   *       *      *       1  0  0  1   1     X     1    0    0
1  0  0  1   *       *      *       1  0  1  0   X     X     0    0    0
1  0  1  0   *       *      *       0  1  0  1   X     X     0    0    0
1  0  1  1   *       *      *       1  1  0  0   X     X     0    0    1
1  1  0  0   *       *      *       0  0  0  0   X     X     0    0    0
1  1  0  1   *       *      *       0  0  0  0   X     X     0    0    0
1  1  1  0   *       *      *       0  0  0  0   X     X     0    0    0
1  1  1  1   *       *      *       0  0  0  0   X     X     0    0    0
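The split into controller and datapath can be mimicked in C: the controller is a pure function of the state register and the status inputs, and the datapath updates its registers according to the control outputs. The sketch below follows the state table row by row; it assumes go_i is held at 1, drives don't-care outputs (X) as 0, requires positive inputs, and uses invented type and function names.

```c
typedef struct { unsigned next, x_sel, y_sel, x_ld, y_ld, d_ld; } Control;

/* Next-state and control logic: one case per row group of the state table.
 * q is the 4-bit state register value Q3..Q0. */
Control controller(unsigned q, int x_neq_y, int x_lt_y, int go_i) {
    Control c = { 0, 0, 0, 0, 0, 0 };   /* don't-cares are driven as 0 here */
    switch (q) {
    case 0x0: c.next = 0x1; break;                               /* 1:   */
    case 0x1: c.next = go_i ? 0x3 : 0x2; break;                  /* 2:   */
    case 0x2: c.next = 0x1; break;                               /* 2-J: */
    case 0x3: c.next = 0x4; c.x_sel = 0; c.x_ld = 1; break;      /* 3:   */
    case 0x4: c.next = 0x5; c.y_sel = 0; c.y_ld = 1; break;      /* 4:   */
    case 0x5: c.next = x_neq_y ? 0x6 : 0xB; break;               /* 5:   */
    case 0x6: c.next = x_lt_y ? 0x7 : 0x8; break;                /* 6:   */
    case 0x7: c.next = 0x9; c.y_sel = 1; c.y_ld = 1; break;      /* 7:   */
    case 0x8: c.next = 0x9; c.x_sel = 1; c.x_ld = 1; break;      /* 8:   */
    case 0x9: c.next = 0xA; break;                               /* 6-J: */
    case 0xA: c.next = 0x5; break;                               /* 5-J: */
    case 0xB: c.next = 0xC; c.d_ld = 1; break;                   /* 9:   */
    default:  c.next = 0x0; break;       /* 1-J: and unused encodings */
    }
    return c;
}

typedef struct { unsigned x, y, d; } Datapath;

/* One clock edge of the datapath: subtractors and multiplexors are
 * evaluated from the old register values, then registers are loaded. */
void datapath_clock(Datapath *dp, Control c, unsigned x_i, unsigned y_i) {
    unsigned x_minus_y = dp->x - dp->y;               /* subtractor 8: x-y */
    unsigned y_minus_x = dp->y - dp->x;               /* subtractor 7: y-x */
    unsigned x_next = c.x_sel ? x_minus_y : x_i;      /* n-bit 2x1 mux */
    unsigned y_next = c.y_sel ? y_minus_x : y_i;      /* n-bit 2x1 mux */
    if (c.d_ld) dp->d = dp->x;
    if (c.x_ld) dp->x = x_next;
    if (c.y_ld) dp->y = y_next;
}

/* Clocks controller and datapath together until d is loaded. */
unsigned gcd_processor(unsigned x_i, unsigned y_i) {
    Datapath dp = { 0, 0, 0 };
    unsigned q = 0;
    for (;;) {
        Control c = controller(q, dp.x != dp.y, dp.x < dp.y, 1);
        datapath_clock(&dp, c, x_i, y_i);
        if (c.d_ld) return dp.d;
        q = c.next;
    }
}
```

Note that the controller never sees x or y themselves, only the one-bit status signals x_neq_y and x_lt_y; that separation is what keeps its state table small.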


Completing the GCD custom single-purpose processor design
• We finished the datapath
• We have a state table for the next state and control logic
• All thatʼs left is combinational logic design
• This is not an optimized design, but we see the basic steps

[Figure: a view inside the controller and datapath. Controller: next-state and control logic, state register. Datapath: registers, functional units.]


RT-level custom single-purpose processor design
• We often start with a state machine
• Rather than algorithm
• Cycle timing often too central to functionality
• Example
• Bus bridge that converts 4-bit bus to 8-bit bus
• Start with FSMD
• Known as register-transfer (RT) level
• Exercise: complete the design

Problem Specification
A single-purpose processor that converts two 4-bit inputs, arriving one at a time over data_in along with a rdy_in pulse, into one 8-bit output on data_out along with a rdy_out pulse.
[Figure: Sender → Bridge → Receiver, with signals rdy_in, data_in(4), clock, rdy_out, data_out(8)]

FSMD (Bridge)
Inputs
rdy_in: bit; data_in: bit[4];
Outputs
rdy_out: bit; data_out: bit[8]
Variables
data_lo, data_hi: bit[4];

States:
WaitFirst4:       rdy_in=0 → WaitFirst4; rdy_in=1 → RecFirst4Start
RecFirst4Start:   data_lo=data_in; → RecFirst4End
RecFirst4End:     rdy_in=1 → RecFirst4End; rdy_in=0 → WaitSecond4
WaitSecond4:      rdy_in=0 → WaitSecond4; rdy_in=1 → RecSecond4Start
RecSecond4Start:  data_hi=data_in; → RecSecond4End
RecSecond4End:    rdy_in=1 → RecSecond4End; rdy_in=0 → Send8Start
Send8Start:       data_out=data_hi & data_lo; rdy_out=1; → Send8End
Send8End:         rdy_out=0; → WaitFirst4


RT-level custom single-purpose processor design (cont’)

(a) Controller (Bridge)
Same states and rdy_in transitions as the FSMD, with the actions replaced by datapath control signals:
RecFirst4Start:   data_lo_ld=1
RecSecond4Start:  data_hi_ld=1
Send8Start:       data_out_ld=1, rdy_out=1
Send8End:         rdy_out=0
Controller ports: rdy_in, rdy_out, clk.

(b) Datapath
[Figure: registers data_hi and data_lo loaded from data_in(4) under data_hi_ld and data_lo_ld; register data_out loaded from data_hi and data_lo under data_out_ld; clk goes to all registers]
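The bridge can be exercised cycle by cycle with a small C model. This is a sketch, not the completed design from the exercise: it assumes the "&" in data_hi & data_lo denotes concatenation (data_hi in the upper four bits), that each state's action happens on the clock edge spent in that state, and that the sender holds rdy_in and data_in for two cycles per nibble. All type and function names are illustrative.

```c
typedef enum {
    WaitFirst4, RecFirst4Start, RecFirst4End,
    WaitSecond4, RecSecond4Start, RecSecond4End,
    Send8Start, Send8End
} BridgeState;

typedef struct {
    BridgeState state;
    unsigned data_lo, data_hi;   /* 4-bit variables */
    unsigned data_out, rdy_out;  /* outputs */
} Bridge;

/* One clock edge: perform the current state's action, then move on. */
void bridge_clock(Bridge *b, unsigned rdy_in, unsigned data_in) {
    switch (b->state) {
    case WaitFirst4:
        b->state = rdy_in ? RecFirst4Start : WaitFirst4; break;
    case RecFirst4Start:
        b->data_lo = data_in & 0xFu; b->state = RecFirst4End; break;
    case RecFirst4End:
        b->state = rdy_in ? RecFirst4End : WaitSecond4; break;
    case WaitSecond4:
        b->state = rdy_in ? RecSecond4Start : WaitSecond4; break;
    case RecSecond4Start:
        b->data_hi = data_in & 0xFu; b->state = RecSecond4End; break;
    case RecSecond4End:
        b->state = rdy_in ? RecSecond4End : Send8Start; break;
    case Send8Start:
        b->data_out = (b->data_hi << 4) | b->data_lo;  /* concatenation */
        b->rdy_out = 1;
        b->state = Send8End; break;
    case Send8End:
        b->rdy_out = 0; b->state = WaitFirst4; break;
    }
}

/* Hypothetical sender: delivers the low nibble, then the high nibble,
 * and returns data_out once rdy_out is asserted. */
unsigned bridge_transfer(unsigned lo, unsigned hi) {
    Bridge b = { WaitFirst4, 0, 0, 0, 0 };
    bridge_clock(&b, 1, lo);   /* WaitFirst4 sees rdy_in=1 */
    bridge_clock(&b, 1, lo);   /* RecFirst4Start stores data_lo */
    bridge_clock(&b, 0, 0);    /* RecFirst4End sees rdy_in=0 */
    bridge_clock(&b, 1, hi);   /* WaitSecond4 sees rdy_in=1 */
    bridge_clock(&b, 1, hi);   /* RecSecond4Start stores data_hi */
    bridge_clock(&b, 0, 0);    /* RecSecond4End sees rdy_in=0 */
    bridge_clock(&b, 0, 0);    /* Send8Start drives data_out, rdy_out=1 */
    return b.rdy_out ? b.data_out : 0xFFFFFFFFu;
}
```

The cycle-by-cycle call sequence in `bridge_transfer` shows why this design starts from a state machine rather than an algorithm: correct behaviour depends on which clock cycle each signal is sampled in.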


Optimizing single-purpose processors
• Optimization is the task of making design metric values the best possible
• Optimization opportunities
• original program
• FSMD
• datapath
• FSM


Optimizing the original program
• Analyze program attributes and look for areas of possible improvement
• number of computations
• size of variable
• time and space complexity
• operations used
• multiplication and division very expensive


Optimizing the original program (contʼ)
Replace the subtraction operation(s) with modulo operation in order to speed up program.

original program
0: int x, y;
1: while (1) {
2:   while (!go_i);
3:   x = x_i;
4:   y = y_i;
5:   while (x != y) {
6:     if (x < y)
7:       y = y - x;
       else
8:       x = x - y;
     }
9:   d_o = x;
   }
GCD(42, 8): the loop passes through 9 (x, y) value pairs, i.e. 8 subtractions, before it completes: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2).

optimized program
0: int x, y, r;
1: while (1) {
2:   while (!go_i);
     // x must be the larger number
3:   if (x_i >= y_i) {
4:     x = x_i;
5:     y = y_i;
     }
6:   else {
7:     x = y_i;
8:     y = x_i;
     }
9:   while (y != 0) {
10:    r = x % y;
11:    x = y;
12:    y = r;
     }
13:  d_o = x;
   }
GCD(42, 8): the loop passes through 3 (x, y) value pairs, i.e. 2 modulo operations, before it completes: (42, 8), (8, 2), (2, 0).
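The difference between the two programs can be measured by instrumenting both loops. In this sketch the go_i handshake is left out and the two algorithms are wrapped in plain functions; the counter records executed loop bodies, so it comes out one lower than the number of (x, y) pairs listed for each program. The struct and function names are made up for the example, and the subtraction version assumes positive inputs.

```c
typedef struct { unsigned gcd; unsigned iterations; } GcdResult;

/* Original program: repeated subtraction. Requires x > 0 and y > 0. */
GcdResult gcd_subtract(unsigned x, unsigned y) {
    GcdResult result = { 0, 0 };
    while (x != y) {
        if (x < y)
            y = y - x;
        else
            x = x - y;
        result.iterations++;
    }
    result.gcd = x;
    return result;
}

/* Optimized program: modulo replaces the chains of subtractions. */
GcdResult gcd_modulo(unsigned x_i, unsigned y_i) {
    GcdResult result = { 0, 0 };
    unsigned x, y, r;
    /* x must be the larger number */
    if (x_i >= y_i) {
        x = x_i;
        y = y_i;
    } else {
        x = y_i;
        y = x_i;
    }
    while (y != 0) {
        r = x % y;
        x = y;
        y = r;
        result.iterations++;
    }
    result.gcd = x;
    return result;
}
```

For GCD(42, 8) the subtraction loop body runs 8 times and the modulo loop body runs 2 times. Whether that is a net win in hardware depends on the cost of the modulo unit, since division-like operations are expensive.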


Optimizing the FSMD
• Areas of possible improvements
• merge states
• states with constraints on transitions can be eliminated, transition taken is already known
• states with independent operations can be merged
• separate states
• states which require complex operations (a*b*c*d) can be broken into smaller states to reduce hardware size
• scheduling


Merging of tasks
• Reduced overhead of context switches,
• More global optimization of machine code,
• Reduced overhead for inter-process/task communication.

Merging of task graphs can be performed when some task Ti is the immediate predecessor of some other task Tj and Tj does not have any other immediate predecessor.


Splitting of tasks
Assumption: task T2 requires some input somewhere in its code.
• No blocking of resources while waiting for input,
• More flexibility for scheduling, possibly improved result.


Optimizing the FSMD (cont.)

original FSMD (int x, y;)
States 1:, 2:, 2-J:, 3: x = x_i, 4: y = y_i, 5:, 6:, 7: y = y - x, 8: x = x - y, 6-J:, 5-J:, 9: d_o = x, 1-J:

Optimization steps
• eliminate state 1: transitions have constant values
• merge state 2 and state 2J: no loop operation in between them
• merge state 3 and state 4: assignment operations are independent of one another
• merge state 5 and state 6: transitions from state 6 can be done in state 5
• eliminate state 5J and 6J: transitions from each state can be done from state 7 and state 8, respectively
• eliminate state 1-J: transition from state 1-J can be done directly from state 9

optimized FSMD (int x, y;)
2:  !go_i → 2; go_i → 3
3:  x = x_i, y = y_i; → 5
5:  x<y → 7; x>y → 8; otherwise → 9
7:  y = y - x; → 5
8:  x = x - y; → 5
9:  d_o = x; → 2
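The effect of these merges shows up when the optimized FSMD is run as a state machine and the visited states are counted. The sketch below assumes go_i is held at 1, positive inputs, and a run that stops after state 9 instead of returning to state 2; the names are illustrative.

```c
/* States of the optimized FSMD: 2, 3, 5, 7, 8, 9. */
typedef enum { O2, O3, O5, O7, O8, O9 } OptState;

typedef struct { unsigned d_o; unsigned states_visited; } OptRun;

/* Runs one pass of the optimized FSMD, one state per clock cycle. */
OptRun gcd_fsmd_optimized(unsigned x_i, unsigned y_i) {
    OptState state = O2;
    unsigned x = 0, y = 0;
    const int go_i = 1;          /* assumption: start signal already asserted */
    OptRun run = { 0, 0 };
    for (;;) {
        run.states_visited++;
        switch (state) {
        case O2: state = go_i ? O3 : O2; break;
        case O3: x = x_i; y = y_i; state = O5; break;    /* merged 3 and 4 */
        case O5: state = (x < y) ? O7 : (x > y) ? O8 : O9; break; /* merged 5 and 6 */
        case O7: y = y - x; state = O5; break;           /* no join states */
        case O8: x = x - y; state = O5; break;
        case O9: run.d_o = x; return run;
        }
    }
}
```

For inputs 42 and 8 this visits 20 states: two for start-up, two per subtraction for the 8 subtractions, then the final pass through states 5 and 9.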


Optimizing the datapath
• Sharing of functional units
• one-to-one mapping, as done previously, is not necessary
• if same operation occurs in different states, they can share a single functional unit
• Multi-functional units
• ALUs support a variety of operations; an ALU can be shared among operations occurring in different states
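In the GCD datapath the two subtractions, y - x in state 7 and x - y in state 8, never happen in the same state, so one subtractor with operand multiplexors could serve both. The sketch below illustrates that idea in C; reusing the x < y comparison as the multiplexor select is an assumption made for this example, as are the function names.

```c
/* A single subtractor shared by states 7 and 8.
 * Two 2x1 multiplexors, both selected by x_lt_y, choose the operands. */
unsigned shared_subtract(unsigned x, unsigned y, int x_lt_y) {
    unsigned a = x_lt_y ? y : x;   /* operand mux 1 */
    unsigned b = x_lt_y ? x : y;   /* operand mux 2 */
    return a - b;                  /* the one subtractor */
}

/* GCD loop using only the shared subtractor. Requires x > 0 and y > 0. */
unsigned gcd_shared(unsigned x, unsigned y) {
    while (x != y) {
        int x_lt_y = x < y;
        unsigned diff = shared_subtract(x, y, x_lt_y);
        if (x_lt_y)
            y = diff;   /* state 7: y = y - x */
        else
            x = diff;   /* state 8: x = x - y */
    }
    return x;
}
```

The trade is one subtractor saved against two extra multiplexors, so sharing pays off mainly when the functional unit is large compared with the multiplexors it needs.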


Optimizing the FSM
• State encoding
• task of assigning a unique bit pattern to each state in an FSM
• size of state register and combinational logic vary
• can be treated as an ordering problem
• State minimization
• task of merging equivalent states into a single state
• two states are equivalent if, for all possible input combinations, they generate the same outputs and transition to the same next state
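The equivalence test for state minimization can be written directly from the definition above. The FSM in this sketch is a small hypothetical Moore machine invented for illustration (4 states, one input bit), not one of the machines from these slides. The check is the one-step test from the definition; a full minimization procedure would typically repeat it after merging, since merging can make further states equivalent.

```c
#define NUM_STATES 4
#define NUM_INPUTS 2   /* one input bit: input value 0 or 1 */

/* Hypothetical Moore FSM: OUTPUT[s] is the output in state s,
 * NEXT[s][i] is the next state from s under input i. */
static const int OUTPUT[NUM_STATES] = {0, 1, 0, 1};
static const int NEXT[NUM_STATES][NUM_INPUTS] = {
    {0, 1},   /* state 0 */
    {2, 3},   /* state 1 */
    {0, 1},   /* state 2: behaves exactly like state 0 */
    {2, 3}    /* state 3: behaves exactly like state 1 */
};

/* Two states are equivalent if they generate the same output and,
 * for every input combination, transition to the same next state. */
int states_equivalent(int s, int t) {
    if (OUTPUT[s] != OUTPUT[t]) return 0;
    for (int i = 0; i < NUM_INPUTS; ++i) {
        if (NEXT[s][i] != NEXT[t][i]) return 0;
    }
    return 1;
}

/* Counts the pairs of distinct states that could be merged. */
int count_mergeable_pairs(void) {
    int pairs = 0;
    for (int s = 0; s < NUM_STATES; ++s) {
        for (int t = s + 1; t < NUM_STATES; ++t) {
            pairs += states_equivalent(s, t);
        }
    }
    return pairs;
}
```

In this example states 0 and 2 can be merged, as can states 1 and 3, which would shrink the state register from 2 bits to 1 bit.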


Summary

• Custom single-purpose processors
• Straightforward design techniques
• Can be built to execute algorithms
• Typically start with FSMD
• CAD tools can be of great assistance
