Computer Architecture Introduction
The fastest, smallest, and most expensive memory per bit sits at the top of the hierarchy;
the slowest, largest, and cheapest per bit sits at the bottom.
The hierarchy creates the illusion that main memory is nearly as fast as the top of the hierarchy and nearly as big and
cheap as the bottom.
Dependability via Redundancy
Redundant components - can take over when a failure occurs and help to detect failures.
Abstraction levels
Layout/silicon level
Abstraction Hierarchy
Circuit level
CISC characteristics
many multicycle operations
microcode for multi-cycle operations
register-memory and memory-memory
many modes
many formats and lengths
few registers
(d) All operands are registers and, like the stack architecture (a), can be transferred to
memory only via separate instructions: push or pop for (a) and load or store for (d).
0, 1, 2, 3 address machines
The code sequence for C = A+B for four classes of instruction sets.
The Add instruction has implicit operands for stack and accumulator architectures, and
explicit operands for register architectures. A, B, and C are in memory.
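The four code sequences can be sketched as tiny interpreters. This is a minimal illustration, not a real ISA: the mnemonics, the register names R1-R3, and the helper functions are assumptions made for the example.

```python
def run_stack(mem):
    """(a) Stack: Add takes both operands implicitly from the top of the stack."""
    stack = []
    for op, *arg in [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]:
        if op == "push":
            stack.append(mem[arg[0]])
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        else:  # pop: move the top of the stack to memory
            mem[arg[0]] = stack.pop()
    return mem["C"]


def run_accumulator(mem):
    """(b) Accumulator: one operand of Add is implicitly the accumulator."""
    acc = 0
    for op, addr in [("load", "A"), ("add", "B"), ("store", "C")]:
        if op == "load":
            acc = mem[addr]
        elif op == "add":
            acc += mem[addr]
        else:  # store
            mem[addr] = acc
    return mem["C"]


def run_register(mem, program):
    """(c), (d): all operands are explicit; names starting with 'R' are registers."""
    regs = {}

    def value(name):
        return regs[name] if name.startswith("R") else mem[name]

    for op, dst, *src in program:
        if op == "load":
            regs[dst] = mem[src[0]]
        elif op == "add":
            regs[dst] = value(src[0]) + value(src[1])
        else:  # store: dst names the source register, src[0] the memory location
            mem[src[0]] = regs[dst]
    return mem["C"]


# (c) register-memory: Add may take one operand directly from memory.
REGISTER_MEMORY = [("load", "R1", "A"), ("add", "R3", "R1", "B"), ("store", "R3", "C")]
# (d) load-store: Add works on registers only.
LOAD_STORE = [("load", "R1", "A"), ("load", "R2", "B"),
              ("add", "R3", "R1", "R2"), ("store", "R3", "C")]
```

Each function receives a dictionary standing in for memory, for example `{"A": 2, "B": 3, "C": 0}`, and returns the value stored in C.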
Princeton/Harvard architectures
Princeton
-Single storage for instructions and data
Harvard
-Separate storage for instructions and data
←
Assignment.
=, ≠
Tests for equality and inequality.
||
Bit string concatenation.
X ← Y
Data transfer of contents of regY to regX
X ← 0
Clears regX
X ← Y + Z
Adds contents of regY with regZ, load into regX
X ← Y ∨ Z
Ors contents of regY with regZ, load into regX
DR ← M[MAR]
Load into DR the contents of memory pointed to by MAR
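The register transfers listed above can be mimicked in software. The sketch below is only an illustration: the register contents, the memory contents, and the function name `rtl_demo` are assumptions, and each transfer is executed one after another.

```python
def rtl_demo():
    # Hypothetical register and memory contents.
    regs = {"X": 7, "Y": 0b1100, "Z": 0b1010, "MAR": 2, "DR": 0}
    M = [10, 20, 30, 40]
    results = {}

    regs["X"] = regs["Y"]                 # X <- Y        (data transfer)
    results["X <- Y"] = regs["X"]

    regs["X"] = 0                         # X <- 0        (clear)
    results["X <- 0"] = regs["X"]

    regs["X"] = regs["Y"] + regs["Z"]     # X <- Y + Z    (addition)
    results["X <- Y + Z"] = regs["X"]

    regs["X"] = regs["Y"] | regs["Z"]     # X <- Y v Z    (bitwise OR)
    results["X <- Y v Z"] = regs["X"]

    regs["DR"] = M[regs["MAR"]]           # DR <- M[MAR]  (memory read)
    results["DR <- M[MAR]"] = regs["DR"]
    return results
```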
R1 >> R1
R2 << R1
X ← Y, A ← B
(cond) A ← B
S0: A ← B
P (ab) R2 R3
State diagram
ASM chart
Conditional output
Ring counter
One-Hot Implementation: Reset 0001
One-hot: on reset, one flip-flop is set to 1 instead of all flip-flops being reset to 0.
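A one-hot ring counter can be sketched as follows. The function name and the 4-bit default width are assumptions for illustration; the reset value 0001 matches the one given above.

```python
def ring_counter(n_states=4, steps=6):
    """Return the successive one-hot states of an n_states-bit ring counter."""
    mask = (1 << n_states) - 1
    state = 1                      # reset value 0001: exactly one flip-flop is set
    sequence = []
    for _ in range(steps):
        sequence.append(format(state, f"0{n_states}b"))
        # Rotate left by one position; the most significant bit wraps around.
        state = ((state << 1) | (state >> (n_states - 1))) & mask
    return sequence
```

With the defaults, the single 1 walks through 0001, 0010, 0100, 1000 and then wraps back to 0001, so each state is identified by one flip-flop.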
High-Level Synthesis (HLS) Synthesis of Digital Architectures
HLS: the translation process from a behavioral description to a structural description
Behavioral specification
HLS
Tree-height reduction
Time and space reduction
b = 3 * x;  ->  t = x << 1; b = x + t;
multiplication replaced by shift and addition
Operator strength reduction
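The equivalence behind this strength reduction can be checked directly. The two function names below are hypothetical and only serve the comparison.

```python
def times3_multiply(x):
    # Original form: one multiplication.
    return 3 * x


def times3_shift_add(x):
    # Reduced form: a shift by one (2 * x) followed by an addition.
    t = x << 1
    return x + t
```

In hardware, a shift by a constant is only wiring, so the reduced form needs one adder and no multiplier.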
Sequential Ops
Parallel scheduling
Chaining
Multi-cycle unit
Pipelined unit
As-Soon-As-Possible
As-Late-As-Possible
No chaining
Two chained additions
A multicycle multiplication
Chained and multicycle operations - time balancing along clock periods
EXAMPLE 1
DFG
program fragment
repeat
xl = x + dx;
ul = u - (3 * x * u * dx) - (3 * y * dx);
yl = y + u * dx;
c = xl < a;
x = xl; u = ul; y = yl;
until (c);
v10: xl = x + dx
v1-v7: ul = u - (3 * x * u * dx) - (3 * y * dx)
v8, v9: yl = y + u * dx
v11: c = xl < a
Example 1 -- specification
SCHEDULING AND BINDING.
- an operation can be scheduled when all its predecessors have been scheduled.
- very simple: the DFG need only be traversed from input(s) to output(s).
ASAP Scheduling
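ASAP scheduling can be sketched on the DFG of Example 1. The notes only give the grouping v1-v7, v8-v9, v10, v11, so the split into individual operations below is an assumption, as are unit-delay operations and the names `EXAMPLE1_PREDS` and `asap_schedule`.

```python
# Predecessors of each operation (assumed decomposition of Example 1).
EXAMPLE1_PREDS = {
    "v1": [],            # 3 * x
    "v2": [],            # u * dx
    "v3": ["v1", "v2"],  # v1 * v2
    "v4": ["v3"],        # u - v3
    "v5": ["v4", "v7"],  # v4 - v7  (ul)
    "v6": [],            # 3 * y
    "v7": ["v6"],        # v6 * dx
    "v8": [],            # u * dx
    "v9": ["v8"],        # y + v8   (yl)
    "v10": [],           # x + dx   (xl)
    "v11": ["v10"],      # v10 < a  (c)
}


def asap_schedule(preds):
    """Place every operation one step after its latest predecessor."""
    step = {}

    def visit(v):
        if v not in step:
            step[v] = 1 + max((visit(p) for p in preds[v]), default=0)
        return step[v]

    for v in preds:
        visit(v)
    return step
```

Operations without predecessors land in step 1, and the longest chain (v1 or v2, then v3, v4, v5) fixes the schedule length at 4 steps under these assumptions.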
Critical-path list scheduling
List scheduling
Op 3 higher priority than Op 1
- Extra criterion used for scheduling
o Critical-path list scheduling
- sorting criterion - the length of the longest path from operation to output.
- good results in practice
Since operation 3 has a higher priority than operation 1, it is scheduled first
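Critical-path list scheduling under a resource limit can be sketched as below. The DFG is an assumed decomposition of Example 1, and the operation kinds ("mul", "alu"), the limit of two units per kind, unit-delay operations, and all names are assumptions for illustration.

```python
PREDS = {
    "v1": [], "v2": [], "v3": ["v1", "v2"], "v4": ["v3"], "v5": ["v4", "v7"],
    "v6": [], "v7": ["v6"], "v8": [], "v9": ["v8"], "v10": [], "v11": ["v10"],
}
KIND = {"v1": "mul", "v2": "mul", "v3": "mul", "v6": "mul", "v7": "mul", "v8": "mul",
        "v4": "alu", "v5": "alu", "v9": "alu", "v10": "alu", "v11": "alu"}


def list_schedule(preds, kind, limits):
    succs = {v: [] for v in preds}
    for v, ps in preds.items():
        for p in ps:
            succs[p].append(v)

    def path_length(v):
        # Sorting criterion: longest path from the operation to an output.
        return 1 + max((path_length(s) for s in succs[v]), default=0)

    priority = {v: path_length(v) for v in preds}
    start, step = {}, 0
    while len(start) < len(preds):
        step += 1
        # Ready: unscheduled operations whose predecessors ran in earlier steps.
        ready = [v for v in preds if v not in start
                 and all(p in start and start[p] < step for p in preds[v])]
        ready.sort(key=lambda v: -priority[v])    # most critical first
        used = {}
        for v in ready:
            k = kind[v]
            if used.get(k, 0) < limits[k]:        # a free unit of this kind exists
                start[v] = step
                used[k] = used.get(k, 0) + 1
    return start
```

Calling `list_schedule(PREDS, KIND, {"mul": 2, "alu": 2})` defers the less critical multiplications (v6, v8) when both multipliers are busy with v1 and v2, which lie on the longest path.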
As late as possible (ALAP)
- scheduling performed from output(s) to input(s).
Freedom-based scheduling
- compute both ASAP and ALAP schedules
- the difference in scheduling position gives the freedom or mobility of an operation
(operations in the critical path have mobility zero).
- take advantage of mobility to find a good position within scheduling range.
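The mobility computation can be sketched by running both schedules on the same graph. The DFG is an assumed decomposition of Example 1 with unit-delay operations; the ALAP schedule is bounded by the ASAP latency, and the function name is hypothetical.

```python
DFG = {
    "v1": [], "v2": [], "v3": ["v1", "v2"], "v4": ["v3"], "v5": ["v4", "v7"],
    "v6": [], "v7": ["v6"], "v8": [], "v9": ["v8"], "v10": [], "v11": ["v10"],
}


def schedule_ranges(preds):
    """Return {operation: (asap_step, alap_step, mobility)}."""
    succs = {v: [] for v in preds}
    for v, ps in preds.items():
        for p in ps:
            succs[p].append(v)

    asap, alap = {}, {}

    def early(v):
        # ASAP: one step after the latest predecessor.
        if v not in asap:
            asap[v] = 1 + max((early(p) for p in preds[v]), default=0)
        return asap[v]

    latency = max(early(v) for v in preds)

    def late(v):
        # ALAP: one step before the earliest successor; outputs go last.
        if v not in alap:
            alap[v] = min((late(s) for s in succs[v]), default=latency + 1) - 1
        return alap[v]

    return {v: (asap[v], late(v), late(v) - asap[v]) for v in preds}
```

Under these assumptions v1 to v5 form the critical path and get mobility zero, while v8 to v11 can each move by two steps.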
Example 1:
ASAP Schedule
Functional Units
Registers
BUSes
CONTROL UNIT SYNTHESIS
Reg.Files
MUXs
DATAPATH
Laboratory
Basys2 board
Xilinx Spartan-3E FPGA and Atmel AT90USB2 USB controller
Programming circuits
Flash select - Mode Jumper (JP3) - ROM.
Rising-edge detector
Debouncing a switch
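Both circuits can be modeled on a list of input samples, one per clock cycle. This is a behavioral sketch only: the function names and the three-sample stability window are assumptions, and a real design would size the window from the clock rate and the switch bounce time.

```python
def rising_edges(samples):
    """Emit a one-cycle tick whenever the input goes from 0 to 1."""
    previous = 0
    ticks = []
    for level in samples:
        ticks.append(1 if (level == 1 and previous == 0) else 0)
        previous = level          # models the delay flip-flop
    return ticks


def debounce(samples, n_stable=3):
    """Change the output only after n_stable consecutive differing samples."""
    output, count, result = 0, 0, []
    for s in samples:
        if s == output:
            count = 0             # input agrees with output: restart the count
        else:
            count += 1
            if count == n_stable:
                output, count = s, 0
        result.append(output)
    return result
```

Feeding the debounced signal into `rising_edges` yields a single tick per key press even when the raw switch input bounces.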
Imax/Iavg luminosity
Figure: block diagram with keypad, switches (SW), 32-bit register (bits 31-16 and 15-0), frequency divider (Div Freq), 2-to-4 decoder (DEC), hex-to-7-segment converter, and 7-segment display
RS232 standard
PC -- HyperTerminal
RS232 - MAX232 voltage converter
TTL: logic 1 = +5 V; RS232: logic 1 = from -3 V to -15 V
ASCII codes
Data format. Error detection: parity, overrun (receiver not emptied in time), framing (stop bit)
UART receiver
The ASMD chart for the receiver
D_BIT - the number of data bits
SB_TICK - the number of ticks needed for the stop bits, which is 16, 24, and 32 for 1, 1.5, and 2 stop bits, respectively
The s_tick signal (RXx16) is the enable tick from the baud rate generator; there are 16 ticks in a bit interval.
The FSM stays in the same state unless the s_tick signal is asserted.
The s counter -- keeps track of the number of sampling ticks and counts to
o 7 in the start state,
o 15 in the data state, and
o SB_TICK in the stop state.
The n counter -- keeps track of the number of data bits received in the data state.
The retrieved bits are shifted into and reassembled in the b register.
Status signal rx_done_tick -- asserted for one clock cycle after the receiving process is completed.
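The receiver described above can be sketched as a software model that is stepped once per sampling tick. This is an assumption-laden illustration, not the chart itself: the loop only models cycles in which the sampling tick is asserted, the stop state leaves when s reaches SB_TICK - 1, rx_done_tick is modeled by appending the byte to a list, and the helper `uart_waveform` is hypothetical.

```python
def uart_waveform(byte, d_bit=8, stop_bits=1, ticks_per_bit=16):
    """Oversampled line levels: start bit 0, data bits LSB first, stop bit(s) 1."""
    bits = [0] + [(byte >> i) & 1 for i in range(d_bit)] + [1] * stop_bits
    return [bit for bit in bits for _ in range(ticks_per_bit)]


def uart_receive(rx_samples, d_bit=8, sb_tick=16):
    state, s, n, b = "idle", 0, 0, 0
    received = []
    for rx in rx_samples:                 # one iteration per sampling tick
        if state == "idle":
            if rx == 0:                   # falling edge: start bit begins
                state, s = "start", 0
        elif state == "start":
            if s == 7:                    # middle of the start bit
                state, s, n = "data", 0, 0
            else:
                s += 1
        elif state == "data":
            if s == 15:                   # middle of the current data bit
                s = 0
                b = (rx << (d_bit - 1)) | (b >> 1)   # shift in, LSB arrives first
                if n == d_bit - 1:
                    state = "stop"
                else:
                    n += 1
            else:
                s += 1
        else:                             # stop state
            if s == sb_tick - 1:
                state = "idle"
                received.append(b)        # stands in for rx_done_tick
            else:
                s += 1
    return received
```

A line that idles at 1 and then carries `uart_waveform(0x41)` is decoded back to the single byte 0x41; sampling at counts 7 and 15 places every sample near the middle of its bit interval.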
FSM partitioning
ASM chart
One-Hot circuit
Talker FSM
Listener FSM
ASM charts of the talker and listener of the four-phase handshaking protocol.
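The four-phase handshake between the two FSMs can be sketched as a lockstep simulation. The state names, the shared `bus` variable, and the function name are assumptions; each FSM sees the other's output as it was at the start of the cycle, which models registered outputs.

```python
def four_phase_handshake(messages):
    req = ack = 0
    bus = None
    pending = list(messages)
    received, trace = [], []
    talker, listener = "idle", "wait_req"
    while pending or talker != "idle" or listener != "wait_req":
        req_seen, ack_seen = req, ack     # values sampled at the clock edge

        # Talker FSM
        if talker == "idle":
            if pending:
                bus = pending.pop(0)      # place data, then raise req (phase 1)
                req = 1
                talker = "wait_ack_high"
        elif talker == "wait_ack_high":
            if ack_seen == 1:
                req = 0                   # phase 3: withdraw the request
                talker = "wait_ack_low"
        elif talker == "wait_ack_low":
            if ack_seen == 0:
                talker = "idle"

        # Listener FSM
        if listener == "wait_req":
            if req_seen == 1:
                received.append(bus)      # phase 2: take data, raise ack
                ack = 1
                listener = "wait_req_low"
        elif listener == "wait_req_low":
            if req_seen == 0:
                ack = 0                   # phase 4: lower ack
                listener = "wait_req"

        trace.append((req, ack))
    return received, trace
```

For a single message the (req, ack) trace begins (1, 0), (1, 1), (0, 1), (0, 0), which are the four phases of the protocol.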