
LECTURE 2. From Combination Alto Processor

This document discusses increasing levels of abstraction in hardware design from behavioral descriptions down to physical implementations. It covers combinational logic design including two-level and multilevel logic minimization techniques. Sequential logic design including finite state machines and synthesis is also introduced. The document then discusses custom single-purpose processor design at the register-transfer and logic levels.
Copyright
© Attribution Non-Commercial (BY-NC)

From Combinational to Sequential Circuits to Simple Processors

What did we cover at the Friday meeting?


1. Design of SOP circuits from K-maps. Prime implicants and covering.
2. Design of POS circuits from K-maps. Prime implicates and covering.
3. Design of ESOP circuits from K-maps. Algebraic rules for AND/EXOR logic.
4. Design using NAND and NOR gates. De Morgan rules.
5. Factorization.
6. Multiplexers.
7. Iterative circuits and their types.
8. Using state machines to design one-directional iterative circuits.
9. Predicates.
10. Oracles.
11. SAT oracles.
12. Graph coloring oracles and distributed processors.
13. SEND+MORE=MONEY problem and its oracle.
14. The idea of constraint satisfaction and distributed software/hardware for it.
Direct questions to Mr Parasa and Mr Mathias Sunardi, who actively participated.

Reminder Embedded Systems

Outline
  Introduction
  Combinational logic
  Sequential logic
  FSM design
  Custom single-purpose processor design
  RT-level custom single-purpose processor design

Increasing abstraction level in design specification


Higher abstraction level is the focus of hardware/software design evolution
  Description smaller/easier to capture
    E.g., one line of sequential program code can translate to 1000 gates
  Many more possible implementations available
    (a) Like a flashlight: the higher above the ground, the more ground illuminated
      Sequential program designs may differ in performance/transistor count by orders of magnitude
      Logic-level designs may differ by only a power of 2
    (b) Design process proceeds to lower abstraction levels, narrowing in on a single implementation
[Figure: panels (a) and (b) each show a cone from "idea" down to "implementation" through the levels back-of-the-envelope, sequential program, register-transfers, logic; moving down, modeling cost increases and opportunities decrease]

What is Synthesis
Automatically converting a system's behavioral description to a structural implementation
  Complex whole formed by parts
  Structural implementation must optimize design metrics
More expensive and complex than compilers
  Cost = $100s to $10,000s
  User controls 100s of synthesis options
  Optimization critical
    Otherwise could use software
  Optimizations different for each user
  Run time = hours, days

Gajski's Y-chart
Each axis represents a type of description
  Behavioral
    Defines outputs as function of inputs
    Algorithms but no implementation
  Structural
    Implements behavior by connecting components with known behavior
  Physical
    Gives size/locations of components and wires on chip/board
Synthesis converts behavior at a given level to structure at the same level or lower
  E.g.,
    FSM -> gates, flip-flops (same level)
    FSM -> transistors (lower level)
    FSM cannot be synthesized to registers, FUs (higher level)
    FSM cannot be synthesized to processors, memories (higher level)
[Figure: Y-chart with three axes, each listed from highest to lowest abstraction level
  Behavior: sequential programs; register transfers; logic equations/FSM; transfer functions
  Structural: processors, memories; registers, FUs, MUXs; gates, flip-flops; transistors
  Physical: boards; chips; modules; cell layout]
FU = functional unit, FSM = finite state machine

Introduction
Processor
  Digital circuit that performs computation tasks
  Controller and datapath
  General-purpose: variety of computation tasks
  Single-purpose: one particular computation task
  Custom single-purpose: non-standard task
A custom single-purpose processor may be
  Fast, small, low power
  But, high NRE, longer time-to-market, less flexible
[Figure: digital camera chip with blocks lens, CCD, A2D, CCD preprocessor, Pixel coprocessor, D2A, JPEG codec, Microcontroller, Multiplier/Accum, DMA controller, Display ctrl, Memory controller, ISA bus interface, UART, LCD ctrl]

CMOS transistor on silicon


Transistor
  The basic electrical component in digital systems
  Acts as an on/off switch
  Voltage at "gate" controls whether current flows from source to drain
  Don't confuse this "gate" with a logic gate
[Figure: IC package and IC; transistor symbol with source, gate, drain (conducts if gate=1); cross-section showing source, gate, oxide, channel, and drain on a silicon substrate]

CMOS transistor implementations


Complementary Metal Oxide Semiconductor
We refer to logic levels
  Typically 0 is 0V, 1 is 5V
Two basic CMOS types
  nMOS conducts if gate=1
  pMOS conducts if gate=0
  Hence "complementary"
Basic gates
  Inverter: F = x'
  NAND gate: F = (xy)'
  NOR gate: F = (x+y)'
[Figure: nMOS and pMOS transistor symbols; transistor-level schematics of the inverter, NAND gate, and NOR gate]

Basic logic gates


Single-input gates

  x | Driver  Inverter
    | F = x   F = x'
  --+-----------------
  0 |   0        1
  1 |   1        0

Two-input gates

  x y | AND     OR       XOR        NAND      NOR        XNOR
      | F = xy  F = x+y  F = x ⊕ y  F = (xy)' F = (x+y)' F = (x ⊕ y)'
  ----+--------------------------------------------------------------
  0 0 |   0       0         0          1         1          1
  0 1 |   0       1         1          1         0          0
  1 0 |   0       1         1          1         0          0
  1 1 |   1       1         0          0         0          1
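
The truth tables above can be reproduced with a few lines of code. This is a minimal sketch, assuming bits are modeled as Python integers 0 and 1; the names `GATES` and `truth_table` are illustrative and not part of the lecture.

```python
from itertools import product

# Two-input gates from the slide, written as functions on bits 0/1.
GATES = {
    "AND":  lambda x, y: x & y,
    "OR":   lambda x, y: x | y,
    "XOR":  lambda x, y: x ^ y,
    "NAND": lambda x, y: 1 - (x & y),
    "NOR":  lambda x, y: 1 - (x | y),
    "XNOR": lambda x, y: 1 - (x ^ y),
}

def truth_table(gate):
    """Return the output column for inputs (x, y) = 00, 01, 10, 11."""
    return [gate(x, y) for x, y in product((0, 1), repeat=2)]

if __name__ == "__main__":
    for name, gate in GATES.items():
        print(f"{name:5s}", truth_table(gate))
```

Each printed row corresponds to one F column of the gate tables.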

Combinational logic design


A) Problem description
  y is 1 if a is 1, or b and c are 1.
  z is 1 if b or c is 1, but not both, or if all are 1.

B) Truth table

  Inputs | Outputs
  a b c  | y z
  -------+----
  0 0 0  | 0 0
  0 0 1  | 0 1
  0 1 0  | 0 1
  0 1 1  | 1 0
  1 0 0  | 1 0
  1 0 1  | 1 1
  1 1 0  | 1 1
  1 1 1  | 1 1

C) Output equations
  y = a'bc + ab'c' + ab'c + abc' + abc
  z = a'b'c + a'bc' + ab'c + abc' + abc

D) Minimized output equations

  y    bc
   a   00 01 11 10
   0    0  0  1  0
   1    1  1  1  1
  y = a + bc

  z    bc
   a   00 01 11 10
   0    0  1  0  1
   1    0  1  1  1
  z = ab + b'c + bc'

E) Logic gates
  [Figure: gate-level circuit with inputs a, b, c and outputs y, z]
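
One way to check a minimization like the one above is to compare the minimized equations with the problem description over all eight input combinations. The sketch below does that; the function names are illustrative, and the minimized z is taken as ab + b'c + bc' (the primes are an assumption reconstructed from the truth table).

```python
from itertools import product

def y_spec(a, b, c):
    # y is 1 if a is 1, or b and c are 1
    return int(a == 1 or (b == 1 and c == 1))

def z_spec(a, b, c):
    # z is 1 if b or c is 1 but not both, or if all are 1
    return int((b != c) or (a == 1 and b == 1 and c == 1))

def y_min(a, b, c):
    # y = a + bc
    return a | (b & c)

def z_min(a, b, c):
    # z = ab + b'c + bc'
    nb, nc = 1 - b, 1 - c
    return (a & b) | (nb & c) | (b & nc)

def equivalent(f, g, n=3):
    """True if f and g agree on every n-bit input combination."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n))

if __name__ == "__main__":
    print("y ok:", equivalent(y_spec, y_min))
    print("z ok:", equivalent(z_spec, z_min))
```

Exhaustive comparison is practical here only because there are three inputs; it grows as 2^n.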

Combinational components
n-bit, m x 1 Multiplexor
  Inputs: I(m-1) ... I1 I0 (n bits each), select S(log m) ... S0; output: O (n bits)
  O = I0 if S=0..00
      I1 if S=0..01
      ...
      I(m-1) if S=1..11

log n x n Decoder
  Inputs: I(log n - 1) ... I0; outputs: O(n-1) ... O1 O0
  O0 = 1 if I=0..00
  O1 = 1 if I=0..01
  ...
  O(n-1) = 1 if I=1..11
  With enable input e: all O's are 0 if e=0

n-bit Adder
  Inputs: A, B (n bits each); outputs: carry, sum (n bits)
  sum = A+B (first n bits)
  carry = (n+1)'th bit of A+B
  With carry-in input Ci: sum = A + B + Ci

n-bit Comparator
  Inputs: A, B (n bits each); outputs: less, equal, greater
  less = 1 if A<B
  equal = 1 if A=B
  greater = 1 if A>B

n-bit, m-function ALU
  Inputs: A, B (n bits each), select S(log m) ... S0; output: O (n bits)
  O = A op B; op determined by S
  May have status outputs carry, zero, etc.
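
The component definitions above translate almost directly into behavioral models. The following is a sketch only: values are unsigned Python integers, and the particular operation list inside `alu` is a made-up example, since the slide does not say which functions the ALU provides.

```python
def mux(inputs, s):
    """m x 1 multiplexor: O = I[s]."""
    return inputs[s]

def decoder(i, n, e=1):
    """log n x n decoder with enable: one-hot list O[0..n-1], all 0 if e=0."""
    return [int(e == 1 and k == i) for k in range(n)]

def adder(a, b, n, ci=0):
    """n-bit adder: returns (carry, sum), sum truncated to n bits."""
    total = a + b + ci
    return total >> n, total & ((1 << n) - 1)

def comparator(a, b):
    """Returns (less, equal, greater)."""
    return int(a < b), int(a == b), int(a > b)

def alu(a, b, s, n):
    """n-bit ALU; this operation table is an assumption for illustration."""
    ops = [lambda: a + b, lambda: a - b, lambda: a & b, lambda: a | b]
    return ops[s]() & ((1 << n) - 1)

if __name__ == "__main__":
    print(mux([10, 20, 30, 40], 2))
    print(decoder(2, 4))
    print(adder(9, 8, 4))
    print(comparator(3, 5))
```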

Logic synthesis
Logic-level behavior to structural implementation
  Logic equations and/or FSM to connected gates
Combinational logic synthesis
  Two-level minimization (sum of products/product of sums)
    Best possible performance
      Longest path = 2 gates (AND gate + OR gate/OR gate + AND gate)
    Minimize size
      Minimum cover
      Minimum cover that is prime
      Heuristics
  Multilevel minimization
    Trade performance for size
    Pareto-optimal solution
      Heuristics
FSM synthesis
  State minimization
  State encoding

Two-level minimization
Represent logic function as sum of products (or product of sums)
  AND gate for each product
  OR gate for each sum
Gives best possible performance
  At most 2 gate delay
Goal: minimize size
  Minimum cover
    Minimum # of AND gates (sum of products)
  Minimum cover that is prime
    Minimum # of inputs to each AND gate (sum of products)

Sum of products
  F = abc'd' + a'b'cd + a'bcd + ab'cd
Direct implementation
  [Figure: gate circuit with inputs a, b, c, d]
  4 4-input AND gates and 1 4-input OR gate
  40 transistors

Minimum cover
Minimum # of AND gates (sum of products)
Literal: variable or its complement
  a or a', b or b', etc.
Minterm: product of literals
  Each variable appears exactly once, as a literal
    abc'd', a'b'cd, a'bcd, etc.
Implicant: product of literals
  Each variable appears no more than once
    a'b'cd, a'cd, etc.
  Covers 1 or more minterms
    a'cd covers a'bcd and a'b'cd
Cover: set of implicants that covers all minterms of function
Minimum cover: cover with minimum # of implicants

Minimum cover: K-map approach


Karnaugh map (K-map)
  1 represents minterm
  Circle represents implicant
Minimum cover
  Covering all 1's with min # of circles
  Example: direct vs. min cover
    Fewer gates
      4 vs. 5
    Fewer transistors
      28 vs. 40

K-map (rows ab, columns cd)

  ab\cd  00 01 11 10
  00      0  0  1  0
  01      0  0  1  0
  11      1  0  0  0
  10      0  0  1  0

  Sum of products: each 1 is circled on its own (4 circles)
  Minimum cover: circles abc'd', a'cd (covers a'b'cd and a'bcd), ab'cd (3 circles)

Minimum cover
  F = abc'd' + a'cd + ab'cd
Minimum cover implementation
  [Figure: gate circuit with inputs a, b, c, d and output F]
  2 4-input AND gates
  1 3-input AND gate
  1 3-input OR gate
  28 transistors

Minimum cover that is prime


Minimum # of inputs to AND gates
Prime implicant
  Implicant not covered by any other implicant
  Max-sized circle in K-map
Minimum cover that is prime
  Covering with min # of prime implicants
  Min # of max-sized circles
  Example: prime cover vs. min cover
    Same # of gates
      4 vs. 4
    Fewer transistors
      26 vs. 28

K-map: minimum cover that is prime (rows ab, columns cd)

  ab\cd  00 01 11 10
  00      0  0  1  0
  01      0  0  1  0
  11      1  0  0  0
  10      0  0  1  0

  Circles: abc'd', a'cd, b'cd (b'cd wraps around between rows 00 and 10)

Minimum cover that is prime
  F = abc'd' + a'cd + b'cd
Implementation
  [Figure: gate circuit with inputs a, b, c, d and output F]
  1 4-input AND gate
  2 3-input AND gates
  1 3-input OR gate
  26 transistors
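
The three implementations of F discussed so far (direct, minimum cover, minimum cover that is prime) can be compared mechanically. The sketch below checks that they compute the same function and estimates transistor counts, assuming 2 transistors per gate input and an OR gate with one input per product term; the data layout and function names are illustrative only.

```python
from itertools import product

# Each product term maps a variable to its required value
# (1 = uncomplemented literal, 0 = complemented literal).
DIRECT = [dict(a=1, b=1, c=0, d=0), dict(a=0, b=0, c=1, d=1),
          dict(a=0, b=1, c=1, d=1), dict(a=1, b=0, c=1, d=1)]
MIN_COVER = [dict(a=1, b=1, c=0, d=0), dict(a=0, c=1, d=1),
             dict(a=1, b=0, c=1, d=1)]
PRIME_COVER = [dict(a=1, b=1, c=0, d=0), dict(a=0, c=1, d=1),
               dict(b=0, c=1, d=1)]

def evaluate(sop, assignment):
    """Evaluate a sum of products for one input assignment."""
    return int(any(all(assignment[v] == val for v, val in term.items())
                   for term in sop))

def same_function(f, g):
    """True if both sums of products agree on all 16 inputs."""
    for bits in product((0, 1), repeat=4):
        assignment = dict(zip("abcd", bits))
        if evaluate(f, assignment) != evaluate(g, assignment):
            return False
    return True

def transistors(sop):
    """Assumed cost model: 2 transistors per gate input."""
    and_inputs = sum(len(term) for term in sop)
    or_inputs = len(sop)
    return 2 * (and_inputs + or_inputs)

if __name__ == "__main__":
    for name, sop in [("direct", DIRECT), ("min cover", MIN_COVER),
                      ("prime cover", PRIME_COVER)]:
        print(name, transistors(sop), same_function(DIRECT, sop))
```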

Minimum cover: heuristics


K-maps give optimal solution every time
  Functions with > 6 inputs too complicated
  Use computer-based tabular method
    Finds all prime implicants
    Finds min cover that is prime
    Also optimal solution every time
    Problem: 2^n minterms for n inputs
      32 inputs = 4 billion minterms
      Exponential complexity
Heuristic
  Solution technique where optimal solution not guaranteed
  Hopefully comes close
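
The first step of a tabular method, finding all prime implicants, can be sketched in a few lines. The code below follows the usual Quine-McCluskey merging idea (the slides do not name a specific algorithm, so this choice is an assumption) and omits the covering step. Implicants are strings over '0', '1', and '-', where '-' marks an eliminated variable.

```python
from itertools import combinations

def combine(p, q):
    """Merge two implicants that differ in exactly one fixed bit, else None."""
    diff = [i for i, (x, y) in enumerate(zip(p, q)) if x != y]
    if len(diff) == 1 and "-" not in (p[diff[0]], q[diff[0]]):
        i = diff[0]
        return p[:i] + "-" + p[i + 1:]
    return None

def prime_implicants(minterms, n):
    """Return the set of prime implicants of an n-input function."""
    current = {format(m, f"0{n}b") for m in minterms}
    primes = set()
    while current:
        merged, used = set(), set()
        for p, q in combinations(sorted(current), 2):
            r = combine(p, q)
            if r is not None:
                merged.add(r)
                used.update((p, q))
        primes |= current - used   # anything that could not be merged is prime
        current = merged
    return primes

def to_product(implicant, names="abcd"):
    """Render '0-11' as a'cd."""
    return "".join(v + ("'" if bit == "0" else "")
                   for v, bit in zip(names, implicant) if bit != "-")

if __name__ == "__main__":
    # Minterms of F = abc'd' + a'b'cd + a'bcd + ab'cd with a as the msb
    for p in sorted(prime_implicants([12, 3, 7, 11], 4)):
        print(p, to_product(p))
```

For the example function this yields the three prime implicants used in the minimum cover that is prime. The pairwise comparison in each round is what makes the exact method expensive for large inputs.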

Heuristics: iterative improvement


Start with initial solution
  i.e., original logic equation
Repeatedly make modifications toward better solution
Common modifications
  Expand
    Replace each nonprime implicant with a prime implicant covering it
    Delete all implicants covered by new prime implicant
  Reduce
    Opposite of expand
  Reshape
    Expands one implicant while reducing another
    Maintains total # of implicants
  Irredundant
    Selects min # of implicants that cover from existing implicants
Synthesis tools differ in modifications used and the order they are used

Multilevel logic minimization


Trade performance for size
  Increase delay for lower # of gates
  Gray area represents all possible solutions
  Circle with X represents ideal solution
    Generally not possible
2-level gives best performance
  max delay = 2 gates
  Solve for smallest size
Multilevel gives pareto-optimal solution
  Minimum delay for a given size
  Minimum size for a given delay
[Figure: plot of delay versus size; the point labeled "2-level minim." marks the two-level solution]

Example of logic factorization


Minimized 2-level logic function:
  F = adef + bdef + cdef + gh
  Requires 5 gates with 18 total gate inputs
    4 ANDs and 1 OR
After algebraic manipulation:
  F = (a + b + c)def + gh
  Requires only 4 gates with 11 total gate inputs
    2 ANDs and 2 ORs
  Fewer inputs per gate
  Assume gate inputs = 2 transistors
    Reduced by 14 transistors
      36 (18 * 2) down to 22 (11 * 2)
  Sacrifices performance for size
    Inputs a, b, and c now have 3-gate delay
Iterative improvement heuristic commonly used
[Figure: the 2-level minimized circuit and the multilevel minimized circuit, each with inputs a, b, c, d, e, f, g, h]
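
The factorization example can be checked numerically: both forms must agree on all 256 input combinations, and the gate-input totals give the transistor estimates quoted on the slide. A minimal sketch, with the per-gate input lists written out by hand from the two equations:

```python
from itertools import product

def two_level(a, b, c, d, e, f, g, h):
    # F = adef + bdef + cdef + gh
    return (a & d & e & f) | (b & d & e & f) | (c & d & e & f) | (g & h)

def multilevel(a, b, c, d, e, f, g, h):
    # F = (a + b + c)def + gh
    return ((a | b | c) & d & e & f) | (g & h)

# Number of inputs of each gate in the two circuits
TWO_LEVEL_GATES = [4, 4, 4, 2, 4]   # three 4-input ANDs, one 2-input AND, one 4-input OR
MULTILEVEL_GATES = [3, 4, 2, 2]     # 3-input OR, 4-input AND, 2-input AND, 2-input OR

def transistor_estimate(gate_inputs):
    """Slide's assumption: each gate input costs 2 transistors."""
    return 2 * sum(gate_inputs)

if __name__ == "__main__":
    same = all(two_level(*v) == multilevel(*v)
               for v in product((0, 1), repeat=8))
    print("equivalent:", same)
    print(transistor_estimate(TWO_LEVEL_GATES), "->",
          transistor_estimate(MULTILEVEL_GATES))
```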

FSM synthesis
FSM to gates
State minimization
  Reduce # of states
    Identify and merge equivalent states
      Outputs, next states same for all possible inputs
    Tabular method gives exact solution
      Table of all possible state pairs
      If n states, n^2 table entries
      Thus, heuristics used with large # of states
State encoding
  Unique bit sequence for each state
  If n states, log2(n) bits
  n! possible encodings
  Thus, heuristics common

Sequential components
n-bit Register
  Inputs: I (n bits), load, clear; output: Q (n bits)
  Q = 0 if clear=1,
      I if load=1 and clock=1,
      Q(previous) otherwise.

n-bit Shift register
  Inputs: I, shift; output: Q
  Q = lsb
    Content shifted
    I stored in msb

n-bit Counter
  Inputs: count, clear; output: Q (n bits)
  Q = 0 if clear=1,
      Q(prev)+1 if count=1 and clock=1.

Reversible shifter shifts left and right

Reversible counter counts up and down


Reading (loading a new value) is an operation available in most registers (generalized registers).

Sequential logic design


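A) Problem Description
  You want to construct a clock divider. Slow down your pre-existing clock so that you output a 1 for every four clock cycles.

B) State Diagram
  States 0, 1, 2, 3. Each state stays put while a=0 and advances to the next state when a=1 (state 3 returns to state 0).
  x=0 in states 0, 1, and 2; x=1 in state 3.

C) Implementation Model
  [Figure: combinational logic with inputs a, Q1, Q0 and outputs x, I1, I0; the state register loads I1, I0 and drives Q1, Q0]

D) State Table (Moore-type)

  Inputs   | Outputs
  Q1 Q0 a  | I1 I0 x
  ---------+--------
  0  0  0  | 0  0  0
  0  0  1  | 0  1  0
  0  1  0  | 0  1  0
  0  1  1  | 1  0  0
  1  0  0  | 1  0  0
  1  0  1  | 1  1  0
  1  1  0  | 1  1  1
  1  1  1  | 0  0  1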

Given this implementation model


Sequential logic design quickly reduces to combinational logic design

Sequential logic design (cont.)


E) Minimized Output Equations

  I1    Q1Q0
   a    00 01 11 10
   0     0  0  1  1
   1     0  1  0  1
  I1 = Q1'Q0a + Q1a' + Q1Q0'

  I0    Q1Q0
   a    00 01 11 10
   0     0  1  1  0
   1     1  0  0  1
  I0 = Q0a' + Q0'a

  x     Q1Q0
   a    00 01 11 10
   0     0  0  1  0
   1     0  0  1  0
  x = Q1Q0

F) Combinational Logic
  [Figure: gate-level circuit with inputs a, Q1, Q0 and outputs x, I1, I0]
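
To see the clock divider work, the minimized equations can be wrapped in a small cycle-by-cycle simulation. This is a sketch under two assumptions: the equations are I1 = Q1'Q0a + Q1a' + Q1Q0', I0 = Q0a' + Q0'a, x = Q1Q0 (primes reconstructed from the state table), and the state register starts at 00.

```python
def next_state_and_output(q1, q0, a):
    """Combinational logic built from the minimized equations."""
    n1, n0, na = 1 - q1, 1 - q0, 1 - a
    i1 = (n1 & q0 & a) | (q1 & na) | (q1 & n0)
    i0 = (q0 & na) | (n0 & a)
    x = q1 & q0            # Moore output: depends on the state only
    return i1, i0, x

def simulate(a_values):
    """Apply one value of a per clock cycle; return the x output per cycle."""
    q1 = q0 = 0            # assumed reset state
    xs = []
    for a in a_values:
        i1, i0, x = next_state_and_output(q1, q0, a)
        xs.append(x)
        q1, q0 = i1, i0    # state register loads on the clock edge
    return xs

if __name__ == "__main__":
    print(simulate([1] * 8))
```

With a held at 1, x is 1 in one cycle out of every four, which is the divide-by-four behavior the problem asks for.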

Custom single-purpose processor basic model


[Figure, left: controller and datapath
  controller: external control inputs, external control outputs, datapath control inputs, datapath control outputs
  datapath: external data inputs, external data outputs]
[Figure, right: a view inside the controller and datapath
  controller: next-state and control logic, state register
  datapath: registers, functional units]

Example: greatest common divisor


First create algorithm
Convert algorithm to "complex" state machine
  Known as FSMD: finite-state machine with datapath
  Can use templates to perform such conversion

(a) black-box view
  Block GCD with inputs go_i, x_i, y_i and output d_o

(b) desired functionality
  0: int x, y;
  1: while (1) {
  2:   while (!go_i);
  3:   x = x_i;
  4:   y = y_i;
  5:   while (x != y) {
  6:     if (x < y)
  7:       y = y - x;
         else
  8:       x = x - y;
       }
  9:   d_o = x;
     }

(c) state diagram
  1:    if 1 go to 2:; if !1 exit
  2:    if !go_i go to 2-J:; if !(!go_i) go to 3:
  2-J:  go to 2:
  3:    x = x_i; go to 4:
  4:    y = y_i; go to 5:
  5:    if x!=y go to 6:; if !(x!=y) go to 9:
  6:    if x<y go to 7:; if !(x<y) go to 8:
  7:    y = y - x; go to 6-J:
  8:    x = x - y; go to 6-J:
  6-J:  go to 5-J:
  5-J:  go to 5:
  9:    d_o = x; go to 1-J:
  1-J:  go to 1:
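
The FSMD can be exercised in software by stepping through its states one at a time. The sketch below assumes go_i is already 1, so it starts at state 3 and stops at state 1-J; the state names follow the line numbers of the program, and the transition structure is an assumption read off the loop and branch templates. Inputs must be positive, as in the original program.

```python
def gcd_fsmd(x_i, y_i):
    """Run one GCD computation; return (d_o, list of visited states)."""
    state, x, y, d_o = "3", 0, 0, None
    trace = []
    while state != "1-J":
        trace.append(state)
        if state == "3":
            x, state = x_i, "4"
        elif state == "4":
            y, state = y_i, "5"
        elif state == "5":                    # loop condition state
            state = "6" if x != y else "9"
        elif state == "6":                    # branch condition state
            state = "7" if x < y else "8"
        elif state == "7":
            y, state = y - x, "6-J"
        elif state == "8":
            x, state = x - y, "6-J"
        elif state == "6-J":                  # join after the if/else
            state = "5-J"
        elif state == "5-J":                  # join at the end of the loop body
            state = "5"
        elif state == "9":
            d_o, state = x, "1-J"
    return d_o, trace

if __name__ == "__main__":
    result, states = gcd_fsmd(42, 8)
    print(result, len(states))
```

The length of the trace is the number of clock cycles this unoptimized FSMD would spend on the computation.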

State diagram templates


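Assignment statement
  a = b
  next statement
  Template: a state that performs a=b, followed by the state of the next statement

Loop statement
  while (cond) {
    loop-body-statements
  }
  next statement
  Template: state C: branches on cond to the loop-body-statements and on !cond to the next statement; the loop body ends in state J:, which returns to C:

Branch statement
  if (c1)
    c1 stmts
  else if c2
    c2 stmts
  else
    other stmts
  next statement
  Template: state C: branches on c1 to c1 stmts, on !c1*c2 to c2 stmts, and on !c1*!c2 to others; all branches meet in state J:, followed by the next statement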

Creating the datapath


Create a register for any declared variable
Create a functional unit for each arithmetic operation
Connect the ports, registers and functional units
  Based on reads and writes
  Use multiplexors for multiple sources
Create unique identifier
  for each datapath component control input and output

[Figure: the GCD FSMD state diagram next to the datapath
  Datapath: inputs x_i and y_i enter two n-bit 2x1 multiplexors selected by x_sel and y_sel; the multiplexors feed registers 0: x and 0: y (load inputs x_ld, y_ld); comparator != (5: x!=y) drives x_neq_y; comparator < (6: x<y) drives x_lt_y; subtractor 8: x-y and subtractor 7: y-x feed back into the multiplexors; register 9: d (load input d_ld) drives d_o]

Creating the controller's FSM

Same structure as FSMD
Replace complex actions/conditions with datapath configurations

Controller

  Code  State  Datapath configuration  Next state
  0000  1:                             2: (exit on !1)
  0001  2:                             2-J: if !go_i, 3: if !(!go_i)
  0010  2-J:                           2:
  0011  3:     x_sel = 0, x_ld = 1     4:
  0100  4:     y_sel = 0, y_ld = 1     5:
  0101  5:                             6: if x_neq_y, 9: if !x_neq_y
  0110  6:                             7: if x_lt_y, 8: if !x_lt_y
  0111  7:     y_sel = 1, y_ld = 1     6-J:
  1000  8:     x_sel = 1, x_ld = 1     6-J:
  1001  6-J:                           5-J:
  1010  5-J:                           5:
  1011  9:     d_ld = 1                1-J:
  1100  1-J:                           1:

[Figure: the FSMD state diagram, the controller state diagram with the codes above, and the datapath (multiplexors with x_sel, y_sel; registers x, y, d with x_ld, y_ld, d_ld; comparators producing x_neq_y and x_lt_y; subtractors 8: x-y and 7: y-x)]

Splitting into a controller and datapath


Controller implementation model
  Combinational logic
    Inputs: go_i, x_neq_y, x_lt_y, Q3 Q2 Q1 Q0
    Outputs: x_sel, y_sel, x_ld, y_ld, d_ld, I3 I2 I1 I0
  State register
    Loads I3 I2 I1 I0, drives Q3 Q2 Q1 Q0

(a) Controller
  [Figure: controller state diagram with state codes 0000 (1:) through 1100 (1-J:); transition conditions are written as x_neq_y=0, x_neq_y=1, x_lt_y=0, x_lt_y=1]

(b) Datapath
  [Figure: inputs x_i, y_i; n-bit 2x1 multiplexors with x_sel, y_sel; registers 0: x, 0: y, 9: d with x_ld, y_ld, d_ld; comparators != (5: x!=y) and < (6: x<y) producing x_neq_y and x_lt_y; subtractors 8: x-y and 7: y-x; output d_o]
Controller state table for the GCD example


  Inputs                             | Outputs
  Q3 Q2 Q1 Q0  x_neq_y x_lt_y go_i   | I3 I2 I1 I0  x_sel y_sel x_ld y_ld d_ld
  -----------------------------------+----------------------------------------
  0  0  0  0      *      *     *     | 0  0  0  1     X     X    0    0    0
  0  0  0  1      *      *     0     | 0  0  1  0     X     X    0    0    0
  0  0  0  1      *      *     1     | 0  0  1  1     X     X    0    0    0
  0  0  1  0      *      *     *     | 0  0  0  1     X     X    0    0    0
  0  0  1  1      *      *     *     | 0  1  0  0     0     X    1    0    0
  0  1  0  0      *      *     *     | 0  1  0  1     X     0    0    1    0
  0  1  0  1      0      *     *     | 1  0  1  1     X     X    0    0    0
  0  1  0  1      1      *     *     | 0  1  1  0     X     X    0    0    0
  0  1  1  0      *      0     *     | 1  0  0  0     X     X    0    0    0
  0  1  1  0      *      1     *     | 0  1  1  1     X     X    0    0    0
  0  1  1  1      *      *     *     | 1  0  0  1     X     1    0    1    0
  1  0  0  0      *      *     *     | 1  0  0  1     1     X    1    0    0
  1  0  0  1      *      *     *     | 1  0  1  0     X     X    0    0    0
  1  0  1  0      *      *     *     | 0  1  0  1     X     X    0    0    0
  1  0  1  1      *      *     *     | 1  1  0  0     X     X    0    0    1
  1  1  0  0      *      *     *     | 0  0  0  0     X     X    0    0    0
  1  1  0  1      *      *     *     | 0  0  0  0     X     X    0    0    0
  1  1  1  0      *      *     *     | 0  0  0  0     X     X    0    0    0
  1  1  1  1      *      *     *     | 0  0  0  0     X     X    0    0    0

  * = input value does not matter; X = output is a don't-care
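
The state table can be turned into a small executable model of the controller driving the datapath. The sketch below encodes next-state and control outputs per state code, assumes go_i is held at 1, and treats don't-care select values as 0; `NEXT`, `CONTROL`, and `run_gcd` are illustrative names, and unused codes 1101 to 1111 fall back to 0000 as in the table.

```python
# Next-state logic: state code -> function of the status inputs
NEXT = {
    0b0000: lambda s: 0b0001,
    0b0001: lambda s: 0b0011 if s["go_i"] else 0b0010,
    0b0010: lambda s: 0b0001,
    0b0011: lambda s: 0b0100,
    0b0100: lambda s: 0b0101,
    0b0101: lambda s: 0b0110 if s["x_neq_y"] else 0b1011,
    0b0110: lambda s: 0b0111 if s["x_lt_y"] else 0b1000,
    0b0111: lambda s: 0b1001,
    0b1000: lambda s: 0b1001,
    0b1001: lambda s: 0b1010,
    0b1010: lambda s: 0b0101,
    0b1011: lambda s: 0b1100,
    0b1100: lambda s: 0b0000,
}

# Control outputs that are asserted in each state (all others are 0)
CONTROL = {
    0b0011: dict(x_sel=0, x_ld=1),
    0b0100: dict(y_sel=0, y_ld=1),
    0b0111: dict(y_sel=1, y_ld=1),
    0b1000: dict(x_sel=1, x_ld=1),
    0b1011: dict(d_ld=1),
}

def run_gcd(x_i, y_i, max_cycles=10_000):
    """Clock the controller and datapath until state 1100 (1-J) is reached."""
    q, x, y, d = 0, 0, 0, None
    for _ in range(max_cycles):
        # Datapath status signals seen by the controller this cycle
        status = dict(x_neq_y=int(x != y), x_lt_y=int(x < y), go_i=1)
        ctrl = CONTROL.get(q, {})
        nxt = NEXT.get(q, lambda s: 0)(status)
        # Registers load on the clock edge, using the multiplexor selects
        new_x = (x - y if ctrl.get("x_sel") else x_i) if ctrl.get("x_ld") else x
        new_y = (y - x if ctrl.get("y_sel") else y_i) if ctrl.get("y_ld") else y
        if ctrl.get("d_ld"):
            d = x
        x, y, q = new_x, new_y, nxt
        if q == 0b1100:
            return d
    raise RuntimeError("no result within max_cycles")

if __name__ == "__main__":
    print(run_gcd(42, 8))
```

Keeping `NEXT` and `CONTROL` as separate tables mirrors the split between next-state logic and control logic in the implementation model.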

Completing the GCD custom single-purpose processor design


We finished the datapath
We have a state table for the next state and control logic
  All that's left is combinational logic design
This is not an optimized design, but we see the basic steps
You may be asked in homework, exams, or projects to optimize the design with respect to a metric such as area, speed, power, or testability
[Figure: a view inside the controller and datapath; controller: next-state and control logic, state register; datapath: registers, functional units]

RT-level custom single-purpose processor design Example Bus Bridge


We often start with a state machine
  Rather than algorithm
  Cycle timing often too central to functionality
Example
  Bus bridge that converts 4-bit bus to 8-bit bus
  Start with FSMD
  Known as register-transfer (RT) level
  Exercise: complete the design

Problem Specification
  Bridge: a single-purpose processor that converts two 4-bit inputs, arriving one at a time over data_in along with a rdy_in pulse, into one 8-bit output on data_out along with a rdy_out pulse.
  [Figure: Sender drives rdy_in and data_in(4) into the Bridge; the Bridge drives rdy_out and data_out(8) into the Receiver; clock input]

FSMD (Bridge)
  Inputs: rdy_in: bit; data_in: bit[4];
  Outputs: rdy_out: bit; data_out: bit[8]
  Variables: data_lo, data_hi: bit[4];

  State            Action                                    Next state
  WaitFirst4                                                 WaitFirst4 if rdy_in=0, RecFirst4Start if rdy_in=1
  RecFirst4Start   data_lo=data_in                           RecFirst4End
  RecFirst4End                                               RecFirst4End if rdy_in=1, WaitSecond4 if rdy_in=0
  WaitSecond4                                                WaitSecond4 if rdy_in=0, RecSecond4Start if rdy_in=1
  RecSecond4Start  data_hi=data_in                           RecSecond4End
  RecSecond4End                                              RecSecond4End if rdy_in=1, Send8Start if rdy_in=0
  Send8Start       data_out=data_hi & data_lo, rdy_out=1     Send8End
  Send8End         rdy_out=0                                 WaitFirst4
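
The bridge FSMD is small enough to simulate directly. The sketch below assumes that each state's action takes effect during the cycle spent in that state, that "&" in data_hi & data_lo means concatenation (data_hi in the upper four bits), and that the sender holds data_in stable for at least one cycle after raising rdy_in. The function name and sample format are illustrative.

```python
def bridge(samples):
    """samples: (rdy_in, data_in) per clock; returns (rdy_out, data_out) per clock."""
    state = "WaitFirst4"
    data_lo = data_hi = data_out = rdy_out = 0
    outputs = []
    for rdy_in, data_in in samples:
        if state == "WaitFirst4":
            state = "RecFirst4Start" if rdy_in else "WaitFirst4"
        elif state == "RecFirst4Start":
            data_lo, state = data_in & 0xF, "RecFirst4End"
        elif state == "RecFirst4End":
            state = "RecFirst4End" if rdy_in else "WaitSecond4"
        elif state == "WaitSecond4":
            state = "RecSecond4Start" if rdy_in else "WaitSecond4"
        elif state == "RecSecond4Start":
            data_hi, state = data_in & 0xF, "RecSecond4End"
        elif state == "RecSecond4End":
            state = "RecSecond4End" if rdy_in else "Send8Start"
        elif state == "Send8Start":
            # concatenate the two nibbles and pulse rdy_out
            data_out, rdy_out, state = (data_hi << 4) | data_lo, 1, "Send8End"
        elif state == "Send8End":
            rdy_out, state = 0, "WaitFirst4"
        outputs.append((rdy_out, data_out))
    return outputs

if __name__ == "__main__":
    stimulus = [(1, 0xA), (1, 0xA), (1, 0xA), (0, 0),
                (1, 0x5), (1, 0x5), (1, 0x5), (0, 0), (0, 0), (0, 0)]
    for cycle, out in enumerate(bridge(stimulus)):
        print(cycle, out)
```

In this stimulus the low nibble 0xA arrives first and the high nibble 0x5 second, so the bridge emits 0x5A with a one-cycle rdy_out pulse.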

RT-level custom single-purpose processor design (cont)


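(a) Controller (Bridge)

  State            Control outputs              Next state
  WaitFirst4                                    WaitFirst4 if rdy_in=0, RecFirst4Start if rdy_in=1
  RecFirst4Start   data_lo_ld=1                 RecFirst4End
  RecFirst4End                                  RecFirst4End if rdy_in=1, WaitSecond4 if rdy_in=0
  WaitSecond4                                   WaitSecond4 if rdy_in=0, RecSecond4Start if rdy_in=1
  RecSecond4Start  data_hi_ld=1                 RecSecond4End
  RecSecond4End                                 RecSecond4End if rdy_in=1, Send8Start if rdy_in=0
  Send8Start       data_out_ld=1, rdy_out=1     Send8End
  Send8End         rdy_out=0                    WaitFirst4

(b) Datapath
  [Figure: data_in(4) feeds registers data_lo (load data_lo_ld) and data_hi (load data_hi_ld); both feed register data_out (load data_out_ld), which drives data_out; clk goes to all registers; rdy_in enters and rdy_out leaves the controller]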

Optimizing single-purpose processors


Optimization is the task of making design metric values the best possible
Optimization opportunities
  original program
  FSMD
  datapath
  FSM

Optimizing the original program


Analyze program attributes and look for areas of possible improvement
  number of computations
  size of variable
  time and space complexity
  operations used
    multiplication and division very expensive

Optimizing the original program (cont)


original program
  0: int x, y;
  1: while (1) {
  2:   while (!go_i);
  3:   x = x_i;
  4:   y = y_i;
  5:   while (x != y) {
  6:     if (x < y)
  7:       y = y - x;
         else
  8:       x = x - y;
       }
  9:   d_o = x;
     }
  GCD(42, 8): 9 iterations to complete the loop
  x and y values evaluated as follows: (42, 8), (34, 8), (26, 8), (18, 8), (10, 8), (2, 8), (2, 6), (2, 4), (2, 2)

replace the subtraction operation(s) with modulo operation in order to speed up program

optimized program
   0: int x, y, r;
   1: while (1) {
   2:   while (!go_i);
        // x must be the larger number
   3:   if (x_i >= y_i) {
   4:     x = x_i;
   5:     y = y_i;
        }
   6:   else {
   7:     x = y_i;
   8:     y = x_i;
        }
   9:   while (y != 0) {
  10:     r = x % y;
  11:     x = y;
  12:     y = r;
        }
  13:   d_o = x;
      }
  GCD(42, 8): 3 iterations to complete the loop
  x and y values evaluated as follows: (42, 8), (8, 2), (2, 0)
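
The iteration counts quoted for the two programs can be reproduced by recording the (x, y) pair at every evaluation of the loop condition. This is a sketch with illustrative function names; "iterations" is taken here to mean the number of loop-condition evaluations, which is an assumption that matches the 9 and 3 on the slide.

```python
def gcd_subtract(x, y):
    """Original program: returns (gcd, (x, y) at each loop test)."""
    seen = [(x, y)]
    while x != y:
        if x < y:
            y = y - x
        else:
            x = x - y
        seen.append((x, y))
    return x, seen

def gcd_modulo(x_i, y_i):
    """Optimized program: x must start as the larger number."""
    x, y = (x_i, y_i) if x_i >= y_i else (y_i, x_i)
    seen = [(x, y)]
    while y != 0:
        x, y = y, x % y
        seen.append((x, y))
    return x, seen

if __name__ == "__main__":
    for f in (gcd_subtract, gcd_modulo):
        result, seen = f(42, 8)
        print(f.__name__, result, len(seen), seen)
```

The modulo version needs fewer loop tests, but in hardware the `%` operation requires a divider, which is the tradeoff the slide on expensive operations points at.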

Optimizing the FSMD


Areas of possible improvements
  merge states
    states with constants on transitions can be eliminated, transition taken is already known
    states with independent operations can be merged
  separate states
    states which require complex operations (a*b*c*d) can be broken into smaller states to reduce hardware size
  scheduling

Optimizing the FSMD (cont.)


original FSMD
  int x, y;
  [Figure: the GCD FSMD with states 1:, 2:, 2-J:, 3:, 4:, 5:, 6:, 7:, 8:, 6-J:, 5-J:, 9:, 1-J:]

Optimization steps
  eliminate state 1: transitions have constant values
  merge state 2 and state 2J: no loop operation in between them
  merge state 3 and state 4: assignment operations are independent of one another
  merge state 5 and state 6: transitions from state 6 can be done in state 5
  eliminate state 5J and 6J: transitions from each state can be done from state 7 and state 8, respectively
  eliminate state 1-J: transition from state 1-J can be done directly from state 9

optimized FSMD
  int x, y;
  2:  stay in 2: if !go_i; go to 3: if go_i
  3:  x = x_i, y = y_i; go to 5:
  5:  go to 7: if x<y; go to 8: if x>y; go to 9: if !(x!=y)
  7:  y = y - x; go to 5:
  8:  x = x - y; go to 5:
  9:  d_o = x; go to 2:

Optimizing the datapath


Sharing of functional units
  one-to-one mapping, as done previously, is not necessary
  if the same operation occurs in different states, they can share a single functional unit
Multi-functional units
  ALUs support a variety of operations; an ALU can be shared among operations occurring in different states

Optimizing the FSM


State encoding
  task of assigning a unique bit pattern to each state in an FSM
  size of state register and combinational logic vary
  can be treated as an ordering problem
State minimization
  task of merging equivalent states into a single state
    two states are equivalent if, for all possible input combinations, they generate the same outputs and transition to the same next state

Technology mapping
Library of gates available for implementation
  Simple
    only 2-input AND, OR gates
  Complex
    various-input AND, OR, NAND, NOR, etc. gates
    Efficiently implemented meta-gates (e.g., AND-OR-INVERT, MUX)
Final structure consists of specified library's components only
If technology mapping integrated with logic synthesis
  More efficient circuit
  More complex problem
  Heuristics required

Complexity impact on user


As complexity grows, heuristics used
Heuristics differ tremendously among synthesis tools
  Computationally expensive
    Higher quality results
    Variable optimization effort settings
    Long run times (hours, days)
    Requires huge amounts of memory
    Typically needs to run on servers, workstations
  Fast heuristics
    Lower quality results
    Shorter run times (minutes, hours)
    Smaller amount of memory required
    Could run on PC
Super-linear-time (e.g., n^3) heuristics usually used
  User can partition large systems to reduce run times/size
  100^3 > 50^3 + 50^3 (1,000,000 > 250,000)

Integrating logic design and physical design


Past
  Gate delay much greater than wire delay
  Thus, performance evaluated as # of levels of gates only
Today
  Gate delay shrinking as feature size shrinking
  Wire delay increasing
    Performance evaluation needs wire length
  Transistor placement (needed for wire length) is the domain of physical design
  Thus, simultaneous logic synthesis and physical design required for efficient circuits
[Figure: delay versus reduced feature size, with one curve for wire and one for transistor]

Embedded Systems Case Study


Elevator Controller

Elevator System
CRC cards are a well-known method for analyzing a system and developing an architecture.
CRC
  Classes: logical groupings of data and functionality
  Responsibilities: describe what the class does
  Collaborators: other classes with which a given class works
Elevator Control Classes
  Elevator car, Passenger, Floor control, Car control, Car sensors, etc.
Architectural Classes
  Car state, Floor control reader, Car control reader, Car control sender, Scheduler
[Figure: elevator system with F floors and N hoistways]

Physical Interfaces

Architecture
Computation and I/O occur at:
  Floor control panels/displays
  Elevator cars
  System controller
Panels Controller
  read buttons and send events to system controller
Car Controller
  read sensor inputs and send to system controller

System Controller
  Must take inputs from many sources
  Must control cars to hard real-time deadlines
  User interface, scheduling are soft deadlines
Testing
  Build an elevator simulator using SystemC, Verilog, VHDL and FPGA
    Simulate multiple elevators
    Simulate real-time control demands

Homework 2
The simplest possible custom single-purpose processor
Design a processor to multiply two numbers. The initial data are in registers/counters A and B. The result should be in register/counter C. You have only reversible counters (with reading) to be used in the data path. The counters perform the following operations:
  Add one
  Subtract one
  Read new value

Invent the algorithm for multiplication.
Use the minimum number of counters.
Design the reversible counter by hand using logic gates and D FFs.
Design the control unit.
Design the data path.
Draw the timing diagram of the whole system.
You can use VHDL or Verilog to help you, but I need your design by hand.

Summary
Custom single-purpose processors
  Straightforward design techniques
  Can be built to execute algorithms
  Typically start with FSMD
  CAD tools can be of great assistance

Questions to Exams (1)


1. What are the main methods of combinational logic design?
2. What is a Mealy FSM (finite state machine)?
3. What is a Moore state machine?
4. Think about a robot controller as a sequential logic circuit. What are the blocks and their roles?
5. Role of abstraction in FSM design. Give examples.
6. Explain the concepts from Gajski's chart in a custom single-purpose processor design.
7. RT-level custom single-purpose processor design. Explain briefly all design stages from the bottom of the design hierarchy (layout) to the top (system design of a GCD processor as an example).
8. List and explain logic gates.
9. List and explain combinational blocks.
10. List and explain sequential blocks.
11. List and explain sensors to be used with embedded systems of FSM type.
12. List and explain actuators to be used with such embedded systems.

Questions to Exams (2)


1. What are the main synthesis processes and CAD tools in combinational logic design?
2. What are the methods to solve the covering problem?
3. Explain the concept of search and give examples.
4. Explain the concept of heuristic in search and give examples.
5. SOP minimization can be very useful. Also ESOP. Explain design tradeoffs and Pareto optimization on one practical example.
6. Explain in detail, on an example, the basic synthesis method for a Mealy FSM from specification to a circuit built from D-type flip-flops (FFs) and logic gates.
7. Explain and illustrate how D, T and JK flip-flops work.
8. What is the difference between
     Register with enable
     Register without enable
     Reversible register
9. Draw the schematic of the FSMD.
10. Explain the GCD algorithm of Euclid on examples.
11. Without looking at the slides, convert the GCD algorithm to an FSMD.
12. How can we optimize GCD?
13. Apply these ideas to a Least Common Multiple algorithm and FSMD for two numbers.

Questions to Exams (3)


1. The role of GO-TO commands in FSMD design. Are they good or bad? Give examples. The role of structured design of FSMD.
2. How is the data path created from the FSMD? This is one of the main topics for this whole class. You have to know it well.
3. How is the CU (Control Unit) created from the FSMD? This is one of the main topics for this whole class. You have to know it well.
4. Compare state graph, state transition table and flow-chart. Why do we need all of them?
5. In this class we are not optimizing combinational logic or FSMs too much. But if you have taken the ECE 572 or ECE 573 classes you know many methods to optimize on these levels. Can you give practical examples of these optimizations in GCD or another similar system?
6. Complete the Bus bridge FSMD that converts 4-bit bus to 8-bit bus and is given in these slides.
7. Discuss optimizing the single-purpose processors. Give examples. Explain levels of optimization, such as the original program, the FSMD, the data path, the CU, the register, the combinational logic, and finally the technology mapping.
8. Design the complete elevator system for a villa of a crazy millionaire artist from Hollywood. Cost does not count. You have to amaze his guests.

EECE 353-1 Real-Time Systems
T. John Koo
Embedded Computing Systems Laboratory, Institute for Software Integrated Systems
Department of Electrical Engineering and Computer Science, Vanderbilt University
5306 Stevenson Center
January 16, 2006

Sources

Slides from S. Mohammadi Vahid, Siamak Mohammadi Givargis and Marwedel

