RISC, CISC, and ISA Variations: Prof. Hakim Weatherspoon CS 3410, Spring 2015
Prelim today
Starts at 7:30pm sharp
Go to location based on netid
[a-g]* → MRS146: Morrison Hall 146
[h-l]* → RRB125: Riley-Robb Hall 125
[m-n]* → RRB105: Riley-Robb Hall 105
[o-s]* → MVRG71: M Van Rensselaer Hall G71
[t-z]* → MVRG73: M Van Rensselaer Hall G73
Announcements
Prelim1 today:
• Time: We will start at 7:30pm sharp, so come early
• Location: on previous slide
• Closed Book
• No electronic devices or outside material
• Practice prelims are online in CMS
Material covered: everything up to the end of this week
• Everything up to and including data hazards
• Appendix B (logic, gates, FSMs, memory, ALUs)
• Chapter 4 (pipelined [and non-pipelined] MIPS processor with hazards)
• Chapter 2 (Numbers / Arithmetic, simple MIPS instructions)
• Chapter 1 (Performance)
• HW1, Lab0, Lab1, Lab2, C-Lab0, C-Lab1
Big Picture: Where are we now?
[Figure: five-stage pipelined MIPS datapath. Stages: Instruction Fetch, Instruction Decode, Execute, Memory, Write-Back, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers. Components shown: PC with +4, instruction memory, register file, immediate extend, ALU, jump/branch target computation, data memory, control, forward unit, and hazard detection.]
Big Picture: Where are we going?
C
    int x = 10;
    x = 2 * x + 15;
compiler
MIPS assembly (r0 = 0)
    addi r5, r0, 10     # r5 = r0 + 10
    muli r5, r5, 2      # r5 = r5 << 1, i.e. r5 = r5 * 2
    addi r5, r5, 15     # r5 = r5 + 15
assembler
machine code
    00100000000001010000000000001010    # op = addi, r0, r5, 10
    00000000000001010010100001000000    # op = r-type, r5, r5, shamt=1, func=sll
    00100000101001010000000000001111    # op = addi, r5, r5, 15
CPU
Circuits
Gates
Transistors
Silicon
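As a rough cross-check of the three machine-code words above, the field packing can be sketched in Python. The helper names `encode_i` and `encode_r` are invented for this illustration; the field widths and the values used (opcode 8 for addi, funct 0 for sll) are the standard MIPS32 ones, and the middle instruction is encoded as the shift that the slide's machine code shows.

```python
def encode_i(op, rs, rt, imm):
    """Pack an I-type word: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_r(rs, rt, rd, shamt, funct):
    """Pack an R-type word: op=0 (6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


ADDI = 0b001000       # opcode for addi
SLL_FUNCT = 0b000000  # funct code for sll

program = [
    encode_i(ADDI, 0, 5, 10),         # addi r5, r0, 10
    encode_r(0, 5, 5, 1, SLL_FUNCT),  # sll  r5, r5, 1   (r5 = r5 * 2)
    encode_i(ADDI, 5, 5, 15),         # addi r5, r5, 15
]

# Render each 32-bit word as a binary string, as on the slide.
words = [format(w, "032b") for w in program]
for w in words:
    print(w)
```

The printed strings can be compared bit for bit against the machine-code lines on the slide.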
Big Picture: Where are we going?
High Level Languages
    C
        int x = 10;
        x = 2 * x + 15;
compiler
MIPS assembly
    addi r5, r0, 10
    muli r5, r5, 2
    addi r5, r5, 15
assembler
machine code
    00100000000001010000000000001010
    00000000000001010010100001000000
    00100000101001010000000000001111
Instruction Set Architecture (ISA)
CPU
Circuits
Gates
Transistors
Silicon
Goals for Today
Instruction Set Architectures
• ISA Variations, and CISC vs RISC
Next Time
• Program Structure and Calling Conventions
Next Goal
Is MIPS the only possible instruction set
architecture (ISA)?
What are the alternatives?
Instruction Set Architecture Variations
ISA defines the permissible instructions
• MIPS: load/store, arithmetic, control flow, …
• ARMv7: similar to MIPS, but more shift, memory, &
conditional ops
• ARMv8 (64-bit): even closer to MIPS, no conditional ops
• VAX: arithmetic on memory or registers, strings, polynomial
evaluation, stacks/queues, …
• Cray: vector operations, …
• x86: a little of everything
Brief Historical Perspective on ISAs
Accumulators
• Early stored-program computers had one register!
• Extended Accumulator (e.g. Intel 8086, the processor for IBM PCs)
– One operand may be in memory (like previous accumulators).
– Or, all the operands may be registers (like MIPS).
Brief Historical Perspective on ISAs
Next step, more registers…
• General-purpose registers
– Registers can be used for any purpose
– E.g. MIPS, ARM, x86
• Register-memory architectures
– One operand may be in memory (e.g. accumulators)
– E.g. x86 (i.e. 80386 processors)
• Register-register architectures (aka load-store)
– All operands must be in registers
– E.g. MIPS, ARM
Takeaway
The number of available registers greatly influenced the instruction set architecture (ISA)
Machine          Num General Purpose Registers   Architectural Style              Year
EDSAC            1                               Accumulator                      1949
IBM 701          1                               Accumulator                      1953
CDC 6600         8                               Load-Store                       1963
IBM 360          16                              Register-Memory                  1964
DEC PDP-8        1                               Accumulator                      1965
DEC PDP-11       8                               Register-Memory                  1970
Intel 8008       1                               Accumulator                      1972
Motorola 6800    2                               Accumulator                      1974
DEC VAX          16                              Register-Memory, Memory-Memory   1977
Intel 8086       1                               Extended Accumulator             1978
Motorola 68000   16                              Register-Memory                  1980
Intel 80386      8                               Register-Memory                  1985
ARM              16                              Load-Store                       1985
MIPS             32                              Load-Store                       1985
HP PA-RISC       32                              Load-Store                       1986
SPARC            32                              Load-Store                       1987
PowerPC          32                              Load-Store                       1992
DEC Alpha        32                              Load-Store                       1992
HP/Intel IA-64   128                             Load-Store                       2001
AMD64 (EM64T)    16                              Register-Memory                  2003
Next Goal
How to compute with limited resources?
Complex Instruction Set Computers (CISC)
E.g. VAX
• Like x86, arithmetic on memory or registers, but also on strings, polynomial evaluation, stacks/queues, …
Takeaway
The number of available registers greatly
influenced the instruction set architecture (ISA)
John Cocke
• IBM 801, 1980 (started in 1975)
• Name 801 came from the bldg that housed the project
• Idea: Possible to make a very small and very fast core
• Influences: Known as “the father of RISC
Architecture”. Turing Award Recipient and National
Medal of Science.
Reduced Instruction Set Computer (RISC)
Dave Patterson
• RISC Project, 1982
• UC Berkeley
• RISC-I: ½ transistors & 3x faster
• Influences: Sun SPARC, namesake of industry
John L. Hennessy
• MIPS, 1981
• Stanford
• Simple pipelining, keep it full
• Influences: MIPS computer system, PlayStation, Nintendo
Reduced Instruction Set Computer (RISC)
MIPS Design Principles
Smaller is faster
• Small register file
I-type
    op       rs       rt       immediate
    6 bits   5 bits   5 bits   16 bits
R-type
    opx      op       rs       rd       opx      rt
    4 bits   8 bits   4 bits   4 bits   8 bits   4 bits
I-type
    opx      op       rs       rd       immediate
    4 bits   8 bits   4 bits   4 bits   12 bits
J-type
    opx      op       immediate (target address)
    4 bits   4 bits   24 bits
ARMv7 Conditional Instructions
while (i != j) {
    if (i > j)
        i -= j;
    else
        j -= i;
}
In MIPS, performance will be slow if code has a lot of branches
Loop: BEQ Ri, Rj, End    // exit when equal; if "NE" (not equal), stay in loop
      SLT Rd, Rj, Ri     // Rd = 1 if "GT" (i > j)
      BEQ Rd, R0, Else   // not "GT": go to the else case
      SUB Ri, Ri, Rj     // if "GT" (greater than), i = i-j;
      J Loop
Else: SUB Rj, Rj, Ri     // if "LT" (less than), j = j-i;
      J Loop
End:
ARMv7 Conditional Instructions
while (i != j) {
    if (i > j)
        i -= j;
    else
        j -= i;
}
In ARM, can avoid delay due to branches with conditional instructions
LOOP: CMP Ri, Rj         // set condition "NE" if (i != j),
                         // "GT" if (i > j),
                         // or "LT" if (i < j)
      SUBGT Ri, Ri, Rj   // if "GT" (greater than), i = i-j;
      SUBLT Rj, Rj, Ri   // if "LT" (less than), j = j-i;
      BNE LOOP           // if "NE" (not equal), then loop
[Figure: condition flags (=, ≠, <, >) shown after each instruction]
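To make the comparison concrete, here is a small Python model of the two loops that counts how many branch or jump instructions each style executes. It is a sketch under assumptions, not a cycle-accurate simulation: the function names are made up, inputs are assumed to be positive integers, and the second conditional subtract is modelled as executing only when i < j, so both versions match the C loop.

```python
def subtract_loop_with_branches(i, j):
    """MIPS-style loop: every decision is a branch or jump."""
    branches = 0
    while True:
        branches += 1      # loop-exit test (conditional branch)
        if i == j:
            break
        branches += 1      # branch on the result of the i > j comparison
        if i > j:
            i -= j
        else:
            j -= i
        branches += 1      # jump back to the top of the loop
    return i, branches


def subtract_loop_with_predication(i, j):
    """ARM-style loop: one compare, two conditional subtracts, one branch."""
    branches = 0
    while True:
        ne, gt, lt = i != j, i > j, i < j   # flags set once by the compare
        if gt:
            i -= j         # conditional subtract, executes only if "GT"
        if lt:
            j -= i         # conditional subtract, executes only if "LT"
        branches += 1      # the single loop-back branch
        if not ne:
            break
    return i, branches


print(subtract_loop_with_branches(12, 18))     # (6, 7)
print(subtract_loop_with_predication(12, 18))  # (6, 3)
```

Both functions return the same result; the predicated version simply reaches it with fewer branch instructions, which is the point the slide makes about avoiding branch delays.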
ARMv7: Other Cool operations
Shift one register (e.g. Rc) any amount
Add to another register (e.g. Rb)
Store result in a different register (e.g. Ra)
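The three steps above amount to a single shift-and-add, roughly what an ARM instruction such as `ADD Ra, Rb, Rc, LSL #n` computes. A minimal Python sketch, assuming 32-bit registers and a left shift (the function name `add_shifted` is invented for this example):

```python
MASK32 = 0xFFFFFFFF  # registers are assumed to be 32 bits wide


def add_shifted(rb, rc, shift):
    """Return Rb + (Rc << shift), wrapped to 32 bits."""
    shifted = (rc << shift) & MASK32   # shift one register any amount
    return (rb + shifted) & MASK32     # add to another register


ra = add_shifted(1, 3, 4)   # 1 + (3 << 4)
print(ra)                   # 49
```

On MIPS the same computation takes two instructions, a shift followed by an add.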
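[Figure: state machine. Registers hold the current state; combinational logic computes the output and the next state from the input and the current state.]
[Figure: 4-bit adder built from four 1-bit adders, with inputs A and B, carry-in Cin, sum outputs S, and carry-out Cout.]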
Critical Path
How long does it take to compute a result?
• Speed of a circuit is affected by the number of gates in series (on
the critical path or the deepest level of logic)
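[Figure: 4-bit adder with inputs A and B, carry-in Cin, sum outputs S, and carry-out Cout. The carry enters at t = 0 and reaches successive stages at t = 2, 4, and 6, with the final result at t = 8.]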
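The timing in the figure can be reproduced with a short calculation. This sketch assumes, as the t values suggest, that the carry needs 2 gate delays to pass through each 1-bit adder; the function name and the `carry_delay` parameter are illustrative only.

```python
def carry_arrival_times(n_bits, carry_delay=2):
    """Time at which the carry reaches each stage boundary of a ripple adder.

    Index 0 is Cin (available at t = 0); index n_bits is Cout.
    """
    return [stage * carry_delay for stage in range(n_bits + 1)]


times = carry_arrival_times(4)
critical_path = times[-1]   # the last carry out is the slowest signal

print(times)          # [0, 2, 4, 6, 8]
print(critical_path)  # 8
```

Because the gates sit in series along the carry chain, the delay grows linearly with the number of bits.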
Example: Mealy Machine
[Figure: Mealy machine. Inputs a and b and the current state s feed combinational logic that produces the output z and the next state s', which is stored in a D flip-flop.]
z = b + a + s + abs
s’ = ab + bs + as + abs
Strategy:
(1) Draw a state diagram (e.g. Mealy Machine)
(2) Write output and next-state tables
(3) Encode states, inputs, and outputs as bits
(4) Determine logic equations for next state and outputs
Endianness
Endianness: Ordering of bytes within a memory word
Little Endian = least significant part first (MIPS, x86)
address           1000   1001   1002   1003
as 4 bytes        0x78   0x56   0x34   0x12
as 2 halfwords    0x5678        0x1234
as 1 word         0x12345678
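The same layout can be checked with Python's standard library. This is only an illustration of little-endian byte order, using the word 0x12345678 from the table; the variable names are arbitrary.

```python
import struct

word = 0x12345678

# Lay the word out in memory least significant byte first (little endian).
raw = word.to_bytes(4, "little")

as_bytes = [hex(b) for b in raw]
as_halfwords = [hex(h) for h in struct.unpack("<2H", raw)]  # two 16-bit values
as_word = hex(struct.unpack("<I", raw)[0])                  # one 32-bit value

print(as_bytes)      # ['0x78', '0x56', '0x34', '0x12']
print(as_halfwords)  # ['0x5678', '0x1234']
print(as_word)       # 0x12345678
```

Reading the same four bytes with `">I"` (big endian) instead would give 0x78563412.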
[Figure: pipelined datapath (instruction memory, register file with ports A, B, and D, data memory) with sub r6,r4,r1, a NOP, and lw r4, 20(r8) in flight.]
lw r4, 20(r8)    IF  ID  Ex  M   W
or r6, r3, r4        IF  ID  ID  Ex  M  W    (stalls one cycle in ID)
load-use stall
DELAY SLOT!
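The condition that triggers a load-use stall can be written as a one-line check. The sketch below is a simplified model, not the actual control logic from the datapath figure: the function and parameter names are hypothetical, and registers are identified by number.

```python
def needs_load_use_stall(ex_is_load, ex_dest, id_sources):
    """Return True if the instruction in decode must wait one cycle.

    ex_is_load: the instruction one stage ahead (in Execute) is a load
    ex_dest:    register number that load will write
    id_sources: register numbers read by the instruction in Decode
    """
    # r0 is hardwired to zero, so a load into r0 never creates a hazard.
    return ex_is_load and ex_dest != 0 and ex_dest in id_sources


# lw r4, 20(r8) followed by or r6, r3, r4: the or reads r4 too early.
print(needs_load_use_stall(True, 4, (3, 4)))   # True
# sub r6, r4, r1 is not a load, so forwarding alone is enough.
print(needs_load_use_stall(False, 6, (3, 6)))  # False
```

After the one-cycle stall, the loaded value can be forwarded to the dependent instruction's Execute stage.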
Quiz
Stall + Forwarding from M/W → ID/Ex (W → Ex)
5 Hazards
Questions?