RISC, CISC, and ISA Variations: Prof. Hakim Weatherspoon CS 3410, Spring 2015

RISC, CISC, and ISA Variations

Prof. Hakim Weatherspoon


CS 3410, Spring 2015
Computer Science
Cornell University
See P&H Chapter 2.16 – 2.18, and 2.21
Announcements
There is a Lab Section this week, C-Lab2

Project1 (PA1) is due next Monday, March 9th

Prelim today
Starts at 7:30pm sharp
Go to location based on netid
[a-g]* → MRS146: Morrison Hall 146
[h-l]* → RRB125: Riley-Robb Hall 125
[m-n]* → RRB105: Riley-Robb Hall 105
[o-s]* → MVRG71: M Van Rensselaer Hall G71
[t-z]* → MVRG73: M Van Rensselaer Hall G73
Announcements

Prelim1 today:
• Time: We will start at 7:30pm sharp, so come early
• Location: on previous slide
• Closed Book
• No electronic devices or outside material
• Practice prelims are online in CMS
Material covered: everything up to the end of this week
• Everything up to and including data hazards
• Appendix B (logic, gates, FSMs, memory, ALUs)
• Chapter 4 (pipelined [and non-pipelined] MIPS processor with hazards)
• Chapter 2 (Numbers / Arithmetic, simple MIPS instructions)
• Chapter 1 (Performance)
• HW1, Lab0, Lab1, Lab2, C-Lab0, C-Lab1
Big Picture: Where are we now?
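[Figure: five-stage pipelined MIPS datapath. Stages: Instruction Fetch, Instruction Decode, Execute, Memory, Write-Back, separated by the IF/ID, ID/EX, EX/MEM, and MEM/WB pipeline registers. Labeled blocks: PC, +4, instruction memory, register file, immediate extend, ALU, data memory (addr, din, dout), control, forward unit, hazard detection, and compute jump/branch targets (new pc).]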
Big Picture: Where are we going?
C
    int x = 10;
    x = 2 * x + 15;
compiler
MIPS assembly (r0 = 0)
    addi r5, r0, 10     # r5 = r0 + 10
    muli r5, r5, 2      # r5 = r5 << 1, i.e. r5 = r5 * 2
    addi r5, r5, 15     # r5 = r5 + 15
assembler
machine code
    00100000000001010000000000001010    op = addi, r0, r5, 10
    00000000000001010010100001000000    op = r-type, r5, r5, shamt = 1, func = sll
    00100000101001010000000000001111    op = addi, r5, r5, 15
CPU
Circuits
Gates
Transistors
Silicon
Big Picture: Where are we going?
High Level Languages
C
    int x = 10;
    x = 2 * x + 15;
compiler
MIPS assembly
    addi r5, r0, 10
    muli r5, r5, 2
    addi r5, r5, 15
assembler
machine code
    00100000000001010000000000001010
    00000000000001010010100001000000
    00100000101001010000000000001111
Instruction Set Architecture (ISA)
CPU
Circuits
Gates
Transistors
Silicon
Goals for Today
Instruction Set Architectures
• ISA Variations, and CISC vs RISC

Next Time
• Program Structure and Calling Conventions
Next Goal
Is MIPS the only possible instruction set
architecture (ISA)?
What are the alternatives?
Instruction Set Architecture Variations
ISA defines the permissible instructions
• MIPS: load/store, arithmetic, control flow, …
• ARMv7: similar to MIPS, but more shift, memory, &
conditional ops
• ARMv8 (64-bit): even closer to MIPS, no conditional ops
• VAX: arithmetic on memory or registers, strings, polynomial
evaluation, stacks/queues, …
• Cray: vector operations, …
• x86: a little of everything
Brief Historical Perspective on ISAs
Accumulators
• Early stored-program computers had one register!
– EDSAC (Electronic Delay Storage Automatic Calculator) in 1949
– Intel 8008 in 1972 was an accumulator
• One register is two registers short of a MIPS instruction!
• Requires a memory-based operand-addressing mode
– Example instruction: add 200
  Add the accumulator to the word in memory at address 200
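The `add 200` example can be made concrete with a minimal Python sketch of a one-register machine. The class name `AccumulatorMachine`, the `load` and `store` operations, and the dictionary-backed memory are illustrative assumptions, not details of EDSAC or the 8008. The sketch also assumes the sum is written back into the accumulator.

```python
class AccumulatorMachine:
    """Toy one-register machine: each instruction names a single memory address."""

    def __init__(self, memory=None):
        self.acc = 0                    # the only register (the accumulator)
        self.mem = dict(memory or {})   # word-addressed memory; missing words read as 0

    def load(self, addr):
        # ACC = Mem[addr]
        self.acc = self.mem.get(addr, 0)

    def add(self, addr):
        # "add 200": ACC = ACC + Mem[200] (assumed: the sum stays in the accumulator)
        self.acc += self.mem.get(addr, 0)

    def store(self, addr):
        # Mem[addr] = ACC
        self.mem[addr] = self.acc


m = AccumulatorMachine({100: 7, 200: 35})
m.load(100)
m.add(200)
m.store(300)
print(m.acc, m.mem[300])
```

Every arithmetic instruction has the accumulator as an implicit operand and destination, which is why the instruction only needs to name one memory address.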
Brief Historical Perspective on ISAs
Next step, more registers…
• Dedicated registers
– E.g. indices for array references in data transfer instructions,
separate accumulators for multiply or divide instructions,
top-of-stack pointer.

Intel 8086: an “extended accumulator” processor, used in IBM PCs

• Extended Accumulator
– One operand may be in memory (like previous accumulators).
– Or, all the operands may be registers (like MIPS).
Brief Historical Perspective on ISAs
Next step, more registers…
• General-purpose registers
– Registers can be used for any purpose
– E.g. MIPS, ARM, x86

• Register-memory architectures
– One operand may be in memory (e.g. accumulators)
– E.g. x86 (i.e. 80386 processors)
• Register-register architectures (aka load-store)
– All operands must be in registers
– E.g. MIPS, ARM
Takeaway
The number of available registers greatly influenced the instruction set architecture (ISA)

Machine           Num General Purpose Registers   Architectural Style              Year
EDSAC             1                               Accumulator                      1949
IBM 701           1                               Accumulator                      1953
CDC 6600          8                               Load-Store                       1963
IBM 360           16                              Register-Memory                  1964
DEC PDP-8         1                               Accumulator                      1965
DEC PDP-11        8                               Register-Memory                  1970
Intel 8008        1                               Accumulator                      1972
Motorola 6800     2                               Accumulator                      1974
DEC VAX           16                              Register-Memory, Memory-Memory   1977
Intel 8086        1                               Extended Accumulator             1978
Motorola 68000    16                              Register-Memory                  1980
Intel 80386       8                               Register-Memory                  1985
ARM               16                              Load-Store                       1985
MIPS              32                              Load-Store                       1985
HP PA-RISC        32                              Load-Store                       1986
SPARC             32                              Load-Store                       1987
PowerPC           32                              Load-Store                       1992
DEC Alpha         32                              Load-Store                       1992
HP/Intel IA-64    128                             Load-Store                       2001
AMD64 (EMT64)     16                              Register-Memory                  2003
Next Goal
How to compute with limited resources?

i.e. how do you design your ISA if you have limited resources?
People programmed in assembly and machine code!
• Needed as many addressing modes as possible
• Memory was (and still is) slow

CPUs had relatively few registers
• Registers were more “expensive” than external memory
• A large number of registers requires many bits to index

Memories were small
• Encouraged highly encoded microcodes as instructions
• Variable-length instructions, load/store, conditions, etc.
Complex Instruction Set Computers (CISC)
People programmed in assembly and machine code!
E.g. x86
• > 1000 instructions!
– 1 to 15 bytes each
– E.g. dozens of add instructions
• operands in dedicated registers, general purpose registers, memory, on stack, …
– can be 1, 2, 4, 8 bytes, signed or unsigned
• 10s of addressing modes
– e.g. Mem[segment + reg + reg*scale + offset]

E.g. VAX
• Like x86, arithmetic on memory or registers, but also on strings, polynomial evaluation, stacks/queues, …
Takeaway
The number of available registers greatly influenced the instruction set architecture (ISA)

Complex Instruction Set Computers were very complex
• Necessary to reduce the number of instructions required to fit a program into memory.
• However, this also greatly increased the complexity of the ISA.
Next Goal
How do we reduce the complexity of the ISA while
maintaining or increasing performance?
Reduced Instruction Set Computer (RISC)

John Cocke
• IBM 801, 1980 (started in 1975)
• Name 801 came from the bldg that housed the project
• Idea: Possible to make a very small and very fast core
• Influences: Known as “the father of RISC
Architecture”. Turing Award Recipient and National
Medal of Science.
Reduced Instruction Set Computer (RISC)
Dave Patterson
• RISC Project, 1982
• UC Berkeley
• RISC-I: ½ transistors & 3x faster
• Influences: Sun SPARC, namesake of industry

John L. Hennessy
• MIPS, 1981
• Stanford
• Simple pipelining, keep full
• Influences: MIPS computer system, PlayStation, Nintendo
Reduced Instruction Set Computer (RISC)
MIPS Design Principles

Simplicity favors regularity
• 32 bit instructions

Smaller is faster
• Small register file

Make the common case fast
• Include support for constants

Good design demands good compromises
• Support for different types of interpretations/classes
Reduced Instruction Set Computer
MIPS = Reduced Instruction Set Computer (RISC)
• ≈ 200 instructions, 32 bits each, 3 formats
• all operands in registers
– almost all are 32 bits each
• ≈ 1 addressing mode: Mem[reg + imm]

x86 = Complex Instruction Set Computer (CISC)
• > 1000 instructions, 1 to 15 bytes each
• operands in dedicated registers, general purpose registers, memory, on stack, …
– can be 1, 2, 4, 8 bytes, signed or unsigned
• 10s of addressing modes
– e.g. Mem[segment + reg + reg*scale + offset]
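The two addressing-mode expressions above can be compared side by side. The following is a minimal sketch, assuming 32-bit addresses that wrap around. The function names, the register dictionaries, and the argument order are hypothetical and chosen only for illustration.

```python
def mips_effective_address(regs, base, imm):
    """Mem[reg + imm]: one base register plus a signed 16-bit immediate."""
    if not -(1 << 15) <= imm < (1 << 15):
        raise ValueError("immediate does not fit in 16 signed bits")
    return (regs[base] + imm) & 0xFFFFFFFF


def x86_style_effective_address(segment_base, regs, base, index, scale, offset):
    """Mem[segment + reg + reg*scale + offset]; x86 allows scale 1, 2, 4, or 8."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (segment_base + regs[base] + regs[index] * scale + offset) & 0xFFFFFFFF


# MIPS-style: lw r4, 20(r8) with r8 = 0x1000
mips_addr = mips_effective_address({"r8": 0x1000}, "r8", 20)

# x86-style: element 3 of an array of 4-byte items, 8 bytes into a structure
x86_addr = x86_style_effective_address(0, {"ebx": 0x2000, "esi": 3}, "ebx", "esi", 4, 8)

print(hex(mips_addr), hex(x86_addr))
```

On a MIPS-like machine the scaled-index case would take separate shift and add instructions before the load, which is the trade-off the comparison is pointing at.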
RISC vs CISC
RISC Philosophy              CISC Rebuttal
Regularity & simplicity      Compilers can be smart
Leaner means faster          Transistors are plentiful
Optimize the common case     Legacy is important
                             Code size counts
                             Micro-code!

Energy efficiency            Desktops/Servers
Embedded Systems
Phones/Tablets

ARMDroid vs WinTel
• Android OS on ARM processor
• Windows OS on Intel (x86) processor
Takeaway
The number of available registers greatly influenced the instruction set architecture (ISA)

Complex Instruction Set Computers were very complex
- Necessary to reduce the number of instructions required to fit a program into memory.
- However, this also greatly increased the complexity of the ISA.

Back in the day… CISC was necessary because everybody programmed in assembly and machine code! Today, CISC ISAs are still dominant due to the prevalence of x86 ISA processors. However, RISC ISAs such as ARM have an ever-increasing market share (of our everyday life!).
ARM borrows a bit from both RISC and CISC.
Next Goal
How do MIPS and ARM compare to each other?
MIPS instruction formats
All MIPS instructions are 32 bits long and have 3 formats

R-type   op       rs       rt       rd       shamt    func
         6 bits   5 bits   5 bits   5 bits   5 bits   6 bits

I-type   op       rs       rt       immediate
         6 bits   5 bits   5 bits   16 bits

J-type   op       immediate (target address)
         6 bits   26 bits
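The field widths above are enough to pack an instruction into a 32-bit word. The sketch below is one way to do that in Python. The helper names `encode_r`, `encode_i`, and `encode_j` are made up for illustration, and the `addi` opcode (001000) and `sll` function code (000000) are read off the machine code shown earlier in these slides.

```python
def encode_r(op, rs, rt, rd, shamt, func):
    """R-type: op(6) rs(5) rt(5) rd(5) shamt(5) func(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | func


def encode_i(op, rs, rt, imm):
    """I-type: op(6) rs(5) rt(5) immediate(16); the immediate is truncated to 16 bits."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op, target):
    """J-type: op(6) target(26)."""
    return (op << 26) | (target & 0x3FFFFFF)


ADDI = 0b001000      # opcode taken from the machine code earlier in the slides
SLL_FUNC = 0b000000  # func field of the r-type shift in the same example

words = [
    encode_i(ADDI, 0, 5, 10),            # addi r5, r0, 10
    encode_r(0, 0, 5, 5, 1, SLL_FUNC),   # sll  r5, r5, 1
    encode_i(ADDI, 5, 5, 15),            # addi r5, r5, 15
]
for w in words:
    print(format(w, "032b"))
```

The three printed words can be checked bit for bit against the machine code on the "Where are we going?" slide.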
ARMv7 instruction formats
All ARMv7 instructions are 32 bits long and have 3 formats

R-type   opx      op       rs       rd       opx      rt
         4 bits   8 bits   4 bits   4 bits   8 bits   4 bits

I-type   opx      op       rs       rd       immediate
         4 bits   8 bits   4 bits   4 bits   12 bits

J-type   opx      op       immediate (target address)
         4 bits   4 bits   24 bits
ARMv7 Conditional Instructions
while (i != j) {
    if (i > j)
        i -= j;
    else
        j -= i;
}

In MIPS, performance will be slow if code has a lot of branches

Loop: BEQ Ri, Rj, End    // if (i == j), leave the loop
      SLT Rd, Rj, Ri     // Rd = 1 if (i > j), else 0
      BEQ Rd, R0, Else   // if not (i > j), go to Else
      SUB Ri, Ri, Rj     // i = i-j;
      J   Loop
Else: SUB Rj, Rj, Ri     // j = j-i;
      J   Loop
End:
ARMv7 Conditional Instructions
while (i != j) {
    if (i > j)
        i -= j;
    else
        j -= i;
}

In ARM, can avoid delay due to branches with conditional instructions

LOOP: CMP   Ri, Rj       // set condition "NE" if (i != j),
                         //               "GT" if (i > j),
                         //            or "LT" if (i < j)
      SUBGT Ri, Ri, Rj   // if "GT" (greater than), i = i-j;
      SUBLT Rj, Rj, Ri   // if "LT" (less than), j = j-i;
      BNE   LOOP         // if "NE" (not equal), then loop
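The difference between the branching loop and the conditional-instruction loop can be sketched by counting control-flow instructions. This is a rough Python model under stated assumptions: inputs are positive integers, the else-case is predicated on "LT", the conditional subtracts do not update the flags, and the function names and counting scheme are invented for illustration.

```python
def gcd_branching(i, j):
    """MIPS-style loop: two conditional branches and one jump per subtraction."""
    control = 0
    while True:
        control += 1          # BEQ Ri, Rj, End
        if i == j:
            return i, control
        control += 1          # conditional branch to Else (after SLT)
        if i > j:
            i -= j            # SUB Ri, Ri, Rj
        else:
            j -= i            # SUB Rj, Rj, Ri
        control += 1          # J Loop


def gcd_predicated(i, j):
    """ARM-style loop: one compare, two predicated subtracts, one branch per pass."""
    control = 0
    while True:
        gt, lt, ne = i > j, i < j, i != j   # CMP Ri, Rj sets the condition flags once
        if gt:
            i -= j            # SUBGT Ri, Ri, Rj
        if lt:
            j -= i            # SUBLT Rj, Rj, Ri
        control += 1          # BNE LOOP, decided by the flags from CMP
        if not ne:
            return i, control


print(gcd_branching(12, 18), gcd_predicated(12, 18))
```

Both functions return the same result; the predicated version simply executes fewer branches and jumps along the way, which is the point the slide makes about pipelines and branches.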
ARMv7: Other Cool operations
Shift one register (e.g. Rc) any amount
Add to another register (e.g. Rb)
Store result in a different register (e.g. Ra)

ADD Ra, Rb, Rc, LSL #4

Ra = Rb + (Rc << 4)
Ra = Rb + Rc × 16
ARMv7 Instruction Set Architecture
All ARMv7 instructions are 32 bits long and have 3 formats
Reduced Instruction Set Computer (RISC) properties
• Only Load/Store instructions access memory
• Instructions operate on operands in processor registers
• 16 registers

Complex Instruction Set Computer (CISC) properties
• Autoincrement, autodecrement, PC-relative addressing
• Conditional execution
• Multiple words can be accessed from memory with a single instruction (SIMD: single instr multiple data)
ARMv8 (64-bit) Instruction Set Architecture
All ARMv8 instructions are 32 bits long (registers and addresses are 64 bits) and have 3 formats
Reduced Instruction Set Computer (RISC) properties
• Only Load/Store instructions access memory
• Instructions operate on operands in processor registers
• 32 registers, and register 31 (XZR) always reads as 0

NO MORE Complex Instruction Set Computer (CISC) properties
• NO Conditional execution
• NO Multiple words can be accessed from memory with a single instruction (SIMD: single instr multiple data)
Instruction Set Architecture Variations
ISA defines the permissible instructions
• MIPS: load/store, arithmetic, control flow, …
• ARMv7: similar to MIPS, but more shift, memory, &
conditional ops
• ARMv8 (64-bit): even closer to MIPS, no conditional ops
• VAX: arithmetic on memory or registers, strings, polynomial
evaluation, stacks/queues, …
• Cray: vector operations, …
• x86: a little of everything
Next time
How do we coordinate use of registers?
Calling Conventions!

PA1 due next Tuesday


Prelim 1 Review Questions
Prelim 1
Prelim today
Starts at 7:30pm sharp
Go to location based on netid
[a-g]* → MRS146: Morrison Hall 146
[h-l]* → RRB125: Riley-Robb Hall 125
[m-n]* → RRB105: Riley-Robb Hall 105
[o-s]* → MVRG71: M Van Rensselaer Hall G71
[t-z]* → MVRG73: M Van Rensselaer Hall G73
Prelim 1

Time: We will start at 7:30pm sharp, so come early
Location: See previous slide
Closed Book
• No electronic devices or outside material
Material covered: everything up to the end of last week
• Everything up to and including data hazards
• Appendix B (logic, gates, FSMs, memory, ALUs)
• Chapter 4 (pipelined [and non-pipelined] MIPS processor with hazards)
• Chapter 2 (Numbers / Arithmetic, simple MIPS instructions)
• Chapter 1 (Performance)
Mealy Machine
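General Case: Mealy Machine
[Figure: registers hold the current state. Combinational logic takes the current state and the input, and produces the output and the next state, which feeds back into the registers.]

Outputs and next state depend on both current state and input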
Moore Machine
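Special Case: Moore Machine
[Figure: registers hold the current state. One block of combinational logic computes the output from the current state alone; a second block computes the next state from the current state and the input.]

Outputs depend only on current state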
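The Mealy/Moore distinction can be seen in a small Python sketch. Both machines below detect two consecutive 1s on the input. The task, the function names, and the state encodings are illustrative assumptions that do not come from the slides.

```python
def run_mealy(inputs):
    """Mealy: the output is a function of the current state AND the current input."""
    state = 0                # 0: previous input was 0, 1: previous input was 1
    outputs = []
    for x in inputs:
        outputs.append(1 if (state == 1 and x == 1) else 0)
        state = x            # the next state also depends on state and input
    return outputs


def run_moore(inputs):
    """Moore: the output is a function of the current state only."""
    state = 0                # number of consecutive 1s seen so far, capped at 2
    outputs = []
    for x in inputs:
        outputs.append(1 if state == 2 else 0)
        state = 0 if x == 0 else min(state + 1, 2)
    return outputs


bits = [0, 1, 1, 1, 0]
print(run_mealy(bits))
print(run_moore(bits))
```

In this sketch the Moore machine needs one more state and reports the match one step later, because its output cannot react to the input until that input has been absorbed into the state.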


Critical Path
How long does it take to compute a result?

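[Figure: 4-bit ripple-carry adder made of four 1-bit full adders, each with inputs A and B and sum output S; the carry ripples from Cin to Cout.]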
Critical Path
How long does it take to compute a result?
• Speed of a circuit is affected by the number of gates in series (on
the critical path or the deepest level of logic)
[Figure: the same 4-bit ripple-carry adder, annotated with carry arrival times t = 0, 2, 4, 6, and 8 along the carry chain from Cin to Cout.]
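A short Python sketch can make the critical-path argument concrete by tracking when each carry becomes valid in a ripple-carry adder. It assumes each full adder adds two gate delays to the carry chain (an assumption read from the t labels in the figure); the function name and the LSB-first bit lists are hypothetical.

```python
def ripple_carry_add(a_bits, b_bits, cin=0, carry_delay=2):
    """Add two equal-length bit lists (least significant bit first).

    Returns (sum_bits, cout, carry_times), where carry_times[k] is the time at
    which the carry into bit k is valid, assuming carry_delay gate delays per stage.
    """
    carry, t = cin, 0
    sum_bits, carry_times = [], [0]
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (a & carry) | (b & carry)
        t += carry_delay             # each stage waits for the previous carry
        carry_times.append(t)
    return sum_bits, carry, carry_times


# 15 + 1 on a 4-bit adder: the carry has to ripple through every stage
sums, cout, times = ripple_carry_add([1, 1, 1, 1], [1, 0, 0, 0])
print(sums, cout, times)
```

The last entry of `carry_times` grows linearly with the number of bits, so under this delay model a 32-bit ripple-carry adder would have a carry chain eight times longer than the 4-bit one.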
Example: Mealy Machine
[Figure: a D flip-flop holds the current state s. Combinational logic takes the inputs a and b and the current state s, and produces the output z and the next state s'.]

$z = \bar{a}b\bar{s} + a\bar{b}\bar{s} + \bar{a}\bar{b}s + abs$
$s' = ab\bar{s} + \bar{a}bs + a\bar{b}s + abs$

Strategy:
(1) Draw a state diagram (e.g. Mealy Machine)
(2) Write output and next-state tables
(3) Encode states, inputs, and outputs as bits
(4) Determine logic equations for next state and outputs
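Step (4) can be checked mechanically. The sketch below assumes the example machine is a serial adder: z is the sum bit and s is the carry kept in the flip-flop, with complemented literals wherever the slide's overbars would appear. Under that assumption each equation is a four-term sum of products, which Python can evaluate directly. The function names are illustrative.

```python
def serial_adder_step(a, b, s):
    """One clock cycle of the assumed Mealy machine: returns (z, next_s)."""
    na, nb, ns = 1 - a, 1 - b, 1 - s      # complemented inputs and state
    z = (na & b & ns) | (a & nb & ns) | (na & nb & s) | (a & b & s)
    s_next = (a & b & ns) | (na & b & s) | (a & nb & s) | (a & b & s)
    return z, s_next


def serial_add(x, y, width=8):
    """Feed x and y in one bit per cycle, least significant bit first."""
    s, result = 0, 0
    for k in range(width):
        z, s = serial_adder_step((x >> k) & 1, (y >> k) & 1, s)
        result |= z << k
    return result


print(serial_add(13, 29))
```

The sum-of-products form for z is the same function as a XOR b XOR s, and the one for s' is the majority of a, b, and s.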
Endianness
Endianness: Ordering of bytes within a memory word
Little Endian = least significant part first (MIPS, x86)
address          1000    1001    1002    1003
as 4 bytes       0x78    0x56    0x34    0x12
as 2 halfwords   0x5678          0x1234
as 1 word        0x12345678

Big Endian = most significant part first (MIPS, networks)
address          1000    1001    1002    1003
as 4 bytes       0x12    0x34    0x56    0x78
as 2 halfwords   0x1234          0x5678
as 1 word        0x12345678
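The two tables can be reproduced with Python's standard `struct` module, where `<` selects little-endian and `>` selects big-endian byte order. This is only a quick check of the byte layouts, not a model of any particular machine.

```python
import struct

word = 0x12345678

little = struct.pack("<I", word)   # 4 bytes, least significant byte first
big = struct.pack(">I", word)      # 4 bytes, most significant byte first

print([hex(b) for b in little])
print([hex(b) for b in big])

# Reading the same bytes back as two halfwords, in the matching byte order
little_halves = struct.unpack("<2H", little)
big_halves = struct.unpack(">2H", big)
print([hex(h) for h in little_halves], [hex(h) for h in big_halves])
```

Reading little-endian bytes with a big-endian format (or the reverse) yields a byte-swapped value, which is the usual symptom of an endianness mismatch.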
Memory Layout
Examples (big/little endian):
# r5 contains 5 (0x00000005)

SB r5, 2(r0)
LB r6, 2(r0)
# R[r6] = 0x00000005

SW r5, 8(r0)
LB r7, 8(r0)
LB r8, 11(r0)
# R[r7] = 0x00000000
# R[r8] = 0x00000005

Memory after both stores (the values shown correspond to big endian):
address       contents
0x00000002    0x05
0x00000008    0x00
0x00000009    0x00
0x0000000a    0x00
0x0000000b    0x05
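The store and load sequence can be replayed on a toy byte-addressed memory. In this sketch the class name `ByteMemory`, its method names, and the dictionary storage are assumptions made for illustration; the `byteorder` argument selects how SW lays out the four bytes.

```python
class ByteMemory:
    """Toy byte-addressed memory with selectable byte order for word stores."""

    def __init__(self, byteorder="big"):
        self.byteorder = byteorder   # "big" or "little"
        self.cells = {}              # address -> byte; untouched bytes read as 0

    def sb(self, value, addr):
        # SB: store the low 8 bits of the register
        self.cells[addr] = value & 0xFF

    def sw(self, value, addr):
        # SW: store 4 bytes, ordered according to the chosen endianness
        for k, byte in enumerate((value & 0xFFFFFFFF).to_bytes(4, self.byteorder)):
            self.cells[addr + k] = byte

    def lb(self, addr):
        # LB: load one byte and sign-extend it
        byte = self.cells.get(addr, 0)
        return byte - 0x100 if byte & 0x80 else byte


r5 = 5
big = ByteMemory("big")
big.sb(r5, 2)
big.sw(r5, 8)
print(big.lb(2), big.lb(8), big.lb(11))

little = ByteMemory("little")
little.sw(r5, 8)
print(little.lb(8), little.lb(11))
```

With big-endian order the value 5 lands in the byte at address 11; with little-endian order it lands at address 8, so the two LB results swap.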
Forwarding Datapath 1

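[Figure: pipelined datapath (inst mem, register file outputs A and B, ALU result D, data mem) with a forwarding path into the Ex stage.]

add r3, r1, r2    IF  ID  Ex  M   W
sub r5, r3, r1        IF  ID  Ex  M   W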
Forwarding Datapath 2

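[Figure: pipelined datapath (inst mem, register file outputs A and B, ALU result D, data mem) with a second forwarding path into the Ex stage.]

add r3, r1, r2    IF  ID  Ex  M   W
sub r5, r3, r1        IF  ID  Ex  M   W
or r6, r3, r4             IF  ID  Ex  M   W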
Register File Bypass

[Figure: pipelined datapath (inst mem, register file outputs A and B, ALU result D, data mem) with register file bypass.]

add r3, r1, r2    IF  ID  Ex  M   W
sub r5, r3, r1        IF  ID  Ex  M   W
or r6, r3, r4             IF  ID  Ex  M   W
add r6, r3, r8                IF  ID  Ex  M   W
Memory Load Data Hazard

[Figure: pipelined datapath during the stall, with sub r6,r4,r1 held in ID, a NOP in Ex, and lw r4, 20(r8) in M.]

lw r4, 20(r8)    IF  ID  Ex  M   W
or r6, r3, r4        IF  ID  ID  Ex  M   W    (Stall: ID repeats for one cycle)

load-use stall
DELAY SLOT!
Quiz

add r3, r1, r2
nand r5, r3, r4
add r2, r6, r3
lw r6, 24(r3)
sw r6, 12(r2)
Quiz
add r3, r1, r2
nand r5, r3, r4    Forwarding from Ex/M → ID/Ex (M→Ex)
add r2, r6, r3     Forwarding from M/W → ID/Ex (W→Ex)
lw r6, 24(r3)      RegisterFile (RF) Bypass
sw r6, 12(r2)      Forwarding from M/W → ID/Ex (W→Ex)
                   Stall + Forwarding from M/W → ID/Ex (W→Ex)

5 Hazards
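The quiz answers follow a simple rule based on how far back the producing instruction is. The sketch below encodes that rule in Python. It classifies hazards statically from program order, so it ignores the way a stall shifts the timing of later instructions. The tuple layout `(text, dest, sources)` and the function name are assumptions for illustration.

```python
def classify_hazards(program):
    """program: list of (text, dest_register_or_None, [source_registers])."""
    hazards = []
    for i, (text, _dest, sources) in enumerate(program):
        for src in sources:
            for distance in (1, 2, 3):          # look back up to three instructions
                if i - distance < 0:
                    break
                p_text, p_dest, _ = program[i - distance]
                if p_dest != src:
                    continue
                if distance == 1:
                    # a load's value is not ready in time for the next instruction
                    kind = ("stall + W->Ex forwarding" if p_text.startswith("lw")
                            else "M->Ex forwarding")
                elif distance == 2:
                    kind = "W->Ex forwarding"
                else:
                    kind = "register file bypass"
                hazards.append((p_text, text, src, kind))
                break                            # only the nearest producer matters
    return hazards


quiz = [
    ("add r3, r1, r2", "r3", ["r1", "r2"]),
    ("nand r5, r3, r4", "r5", ["r3", "r4"]),
    ("add r2, r6, r3", "r2", ["r6", "r3"]),
    ("lw r6, 24(r3)", "r6", ["r3"]),
    ("sw r6, 12(r2)", None, ["r6", "r2"]),
]
found = classify_hazards(quiz)
for producer, consumer, reg, kind in found:
    print(f"{reg}: {producer} -> {consumer}: {kind}")
```

Run on the quiz sequence, this reports the same five hazards listed above.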
Questions?
