
Computer Architecture ISA


ELT3047 Computer Architecture

Lesson 3: ISA design principles

Hoang Gia Hung


Faculty of Electronics and Telecommunications
University of Engineering and Technology, VNU Hanoi
Last lecture review
❑ Various measures for computer performance
➢ Execution time: the best performance measure for designers
➢ MIPS/MFLOPS: easy to understand but has many drawbacks
➢ Benchmarks: use real applications – best performance measure for users

❑ Factors affecting execution time


➢ Instruction count
➢ CPI
➢ Clock cycle time (rate)
➢ Power is a limiting factor (the power wall)

❑ Amdahl’s law of diminishing returns


➢ Speeding up one part of a system improves overall performance by less than
that part's own speedup: the gain is limited by how much of the time the part is used.
❑ Today’s lecture: ISA design principles
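The diminishing-returns statement above can be checked numerically. The following is a minimal sketch of the usual form of Amdahl's law, speedup = 1 / ((1 - f) + f / s); the function name, parameter names and the sample numbers are illustrative assumptions, not taken from the slides.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when `fraction_enhanced` of the execution time
    is made `speedup_enhanced` times faster (names are hypothetical)."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be in [0, 1]")
    if speedup_enhanced <= 0.0:
        raise ValueError("speedup_enhanced must be positive")
    unaffected = 1.0 - fraction_enhanced
    return 1.0 / (unaffected + fraction_enhanced / speedup_enhanced)


# Illustrative numbers: half of the run time is made 10x faster.
overall = amdahl_speedup(0.5, 10.0)
print(f"overall speedup: {overall:.3f}")
```

Even an arbitrarily large speedup of that half can never push the overall speedup to 2, which is the diminishing-returns effect.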
Recap: Instruction & Instruction Set
❑ Instructions are the fundamental operations that the CPU can execute.
➢ Analogy to human sentences: operations (verbs) applied to operands (objects)
➢ Instruction set: the repertoire of instructions, like the vocabulary of the
computer language.
➢ Operands may be implicit or explicit. Example: in C = A + B, the operator is +
and the operands are A, B and C (C receives the result).

❑ Stored program
➢ A program is written as a sequence of instructions,
which are stored in memory, together
with data, as binary bits.
➢ Instructions are automatically fetched, decoded,
and executed one by one.

❑ Registers: a small amount of fast storage built directly inside the
processor with dedicated HW to hold the data that instructions operate on.
➢ Registers are the fastest storage available to the processor
➢ Why is having registers a good idea? ← programs exhibit data locality.
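The stored-program idea above (instructions and data share one memory and are fetched, decoded and executed one by one) can be sketched as a toy interpreter. This is a simplification under stated assumptions: instructions are kept as Python tuples rather than binary bits, and the four opcodes (LOAD, ADD, STORE, HALT) are hypothetical, not a real ISA.

```python
def run(memory, pc=0):
    """Fetch, decode and execute instructions from `memory` until HALT."""
    regs = [0] * 4                    # a tiny register file
    while True:
        instr = memory[pc]            # fetch the instruction the PC points to
        pc += 1                       # PC now points to the next instruction
        op, *args = instr             # decode into opcode and operands
        if op == "LOAD":              # execute
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "ADD":
            rd, rs1, rs2 = args
            regs[rd] = regs[rs1] + regs[rs2]
        elif op == "STORE":
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {op}")


# Instructions (addresses 0-4) and data (addresses 5-7) live in the same memory.
program = [
    ("LOAD", 0, 5),      # R0 <- Mem[5]  (A)
    ("LOAD", 1, 6),      # R1 <- Mem[6]  (B)
    ("ADD", 2, 0, 1),    # R2 <- R0 + R1
    ("STORE", 2, 7),     # Mem[7] <- R2  (C)
    ("HALT",),
    2,                   # address 5: A
    3,                   # address 6: B
    0,                   # address 7: C
]
result = run(list(program))
print(result[7])
```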
Recap: Instruction Set Architecture
❑ ISA: the programmer's view of hardware, i.e. the things needed to make
machine code work correctly.
➢ An abstraction layer that hides the complexity of CPU
implementation from programmers.
➢ Software (above the ISA): to be translated to the instruction set
➢ Hardware (below the ISA): implementation of the instruction set
❑ Allows computer designers to think about functions independently from
the hardware that performs them.
✓ SW for the Intel 80386 can still run on an Intel Rocket Lake.

❑ Different ISAs → different processors


Processor Design Levels
❑ Architecture (ISA): programmer/compiler view
➢ “functional appearance to its immediate user/system programmer”
➢ Data storage, addressing modes, instruction set, instruction formats &
encodings.

❑ µ-architecture: processor designer view
➢ “logical structure or organization that performs the architecture”
➢ Pipelining, functional units, caches, physical registers

❑ VLSI realization (chip): chip designer view
➢ “physical structure that embodies the µ-architecture”
➢ Gates, cells, transistors, wires

❑ Three distinct levels
➢ Processors with an identical ISA may differ in organization: Intel vs AMD
➢ Processors with an identical ISA and identical organization may still
differ in realization: Intel Core i9-11900K vs Intel Core i5-11600K
The 5 Aspects in ISA Design

1. Data Storage

2. Memory Addressing Modes

3. Operations in the Instruction Set

4. Encoding the Instruction Set

5. The role of compilers


ISA Design Principles
❑ Designing an ISA is hard:
➢ What types of storage? How much?
➢ How many instructions? What are they?
➢ How to encode instructions? To minimize code size or to make hardware
implementation simple?
➢ How to future-proof?

❑ Design principles:
1. Simplicity favors regularity
2. Make the common case fast
3. Smaller is faster
4. Good design demands good compromises

❑ The quantitative methodology


➢ Take a set of benchmark programs expected to run on the system
➢ Implement the benchmark programs with different ISA configurations
➢ Pick the best one
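The three steps of the quantitative methodology can be sketched as a small selection routine. It assumes execution time = instruction count × CPI × clock cycle time (the factors listed in the last-lecture review); the configuration names and all numbers below are hypothetical placeholders, not measurements.

```python
def execution_time(instruction_count, cpi, clock_period_ns):
    """Execution time in ns for one benchmark on one ISA configuration."""
    return instruction_count * cpi * clock_period_ns


def pick_best(candidates):
    """Return the name of the configuration with the lowest execution time."""
    return min(candidates, key=lambda name: execution_time(**candidates[name]))


# Hypothetical results of running the same benchmark on two ISA configurations.
candidates = {
    "config_a": {"instruction_count": 1_000_000, "cpi": 2.0, "clock_period_ns": 1.0},
    "config_b": {"instruction_count": 1_300_000, "cpi": 1.2, "clock_period_ns": 1.0},
}

best = pick_best(candidates)
for name, params in candidates.items():
    print(name, execution_time(**params), "ns")
print("best:", best)
```

With these made-up numbers the configuration that executes more instructions still wins because its CPI is lower, which is why the comparison has to be made on execution time rather than on any single factor.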
CISC vs RISC: the famous ISA battle
❑ Two major design philosophies for ISA:
➢ Complex Instruction Set Computer (CISC)
➢ Reduced Instruction Set Computer (RISC)

CISC | RISC
Many instructions and addressing modes | Few instructions and addressing modes
Single instruction performs complex operation | Simple instructions, combined by SW to perform complex operations
Smaller program size | Larger program size
Complex implementation | Easier to build/optimize hardware
Examples: Intel, AMD, Cyrix | Examples: RISC-V, Sun SPARC, HP PA-RISC, IBM PowerPC

❑ This course’s case study: RISC-V


Aspect #1 – Data Storage

❑ Storage Architecture
❑ General Purpose Register Architecture

Storage Architecture: Definition
❑ For a processor, storage architecture is concerned with:
➢ Where do we store the operands so that the computation can be
performed?
➢ Where do we store the computation result afterwards?
➢ How do we specify the operands?

❑ Common storage architectures


➢ Stack: usually implemented as a register file to store all operands &
results; all operands are implicitly on top of the stack.
➢ Accumulator (1-operand machine): a special register (the accumulator)
stores the result of a calculation, while also acting as an implicit operand.
➢ General-purpose register architecture: uses only explicit operands; all
registers can be used for any purpose.
➢ Memory: all operands & results are placed in memory.
Popular storage architectures
[Figure: instruction sequences for C = A + B under each storage architecture]
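To make the comparison concrete, here is a sketch of how C = A + B might be carried out under four storage styles (stack, accumulator, register-memory, load-store). Each function mimics the instruction sequence in its comments; the mnemonics are generic textbook-style pseudo-assembly and the Python function names are hypothetical.

```python
def stack_machine(mem):
    stack = []
    stack.append(mem["A"])                    # push A
    stack.append(mem["B"])                    # push B
    stack.append(stack.pop() + stack.pop())   # add  (operands implicit: top of stack)
    mem["C"] = stack.pop()                    # pop C
    return mem


def accumulator_machine(mem):
    acc = mem["A"]                            # load A   (acc is the implicit operand)
    acc = acc + mem["B"]                      # add B
    mem["C"] = acc                            # store C
    return mem


def register_memory_machine(mem):
    r1 = mem["A"]                             # load R1, A
    r3 = r1 + mem["B"]                        # add R3, R1, B  (one operand in memory)
    mem["C"] = r3                             # store R3, C
    return mem


def load_store_machine(mem):
    r1 = mem["A"]                             # load R1, A
    r2 = mem["B"]                             # load R2, B
    r3 = r1 + r2                              # add R3, R1, R2 (registers only)
    mem["C"] = r3                             # store R3, C
    return mem


machines = [stack_machine, accumulator_machine,
            register_memory_machine, load_store_machine]
results = [m({"A": 2, "B": 3, "C": 0})["C"] for m in machines]
print(results)
```

All four compute the same result; what differs is how many operands each instruction names explicitly and whether arithmetic may touch memory directly.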
Real-life implementation
❑ Stack architecture: a legacy from the “adding” machine days
➢ Top portion of the stack inside CPU; the rest in memory.
Examples: some technical handheld calculators, Z4 (by Konrad Zuse).

❑ Accumulator architecture:
➢ One operand is implicitly in the accumulator.
Examples: IBM 701, DEC PDP-8.

❑ General-purpose register architecture:


➢ Register-memory architecture: one operand in memory. Examples:
Motorola 68000, Intel 80386.
➢ Register-register (or load-store) architecture: both operands in registers.
Examples: RISC-V, DEC Alpha.

❑ Memory-memory architecture:
➢ All operands in memory. Example: DEC VAX.
Storage Architecture: GPR Architecture
❑ For modern processors (after 1980):
➢ General-Purpose Register (GPR) is the most common choice for storage
design.
➢ RISC computers typically use a Register-Register (Load/Store) design
E.g. RISC-V, ARM
➢ CISC computers use a mixture of Register-Register and Register-Memory
E.g. IA-32.

❑ Reasons for GPR architecture dominance


➢ Registers are much faster than memory
➢ Registers are more efficient for a compiler to use
➢ Design principle #1: simplicity favors regularity.

❑ Design question
➢ How many GPRs should be sufficient?
➢ Design principle #3: smaller is faster.
Aspect #2 – Memory Addressing Mode

❑ Memory Locations and Addresses


❑ Addressing Modes

Memory Address and Content
❑ Memory is viewed as a large, single-dimension array.
➢ Each element of the array must be indexed, i.e. has a specific address.
RISC-V: byte addressed.
➢ Given a 𝑘-bit address → the address space is of size 2^𝑘.
➢ Each memory transfer consists of one word of 𝑛 bits → requires alignment.

❑ Registers hold temporary values (operands)


➢ Each register is referred to by a number/name (RISC-V example next slide)
Example: int A, B, C; C = A + B;

(Byte) Address | Data
0 |
4 | B
8 | C
12 | A
16 |
... | ...

load: R0 ← Mem[12]
load: R1 ← Mem[4]
R2 ← R0 + R1
store: Mem[8] ← R2

Memory: up to 2^32 bytes organized as 2^30 4-byte words → addresses of
consecutive 32-bit words differ by 4.
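The byte-addressing and alignment points of this slide can be sketched with a byte array standing in for memory. The helper names are hypothetical, the layout (B at 4, C at 8, A at 12) follows the slide's example, and the strict alignment check and little-endian byte order are assumptions made for this sketch.

```python
WORD_BYTES = 4  # 32-bit words


def word_to_byte_address(word_index):
    """Consecutive words are WORD_BYTES apart in a byte-addressed memory."""
    return word_index * WORD_BYTES


def is_aligned(byte_address, size=WORD_BYTES):
    return byte_address % size == 0


def load_word(memory, byte_address):
    if not is_aligned(byte_address):
        raise ValueError(f"misaligned word access at {byte_address}")
    return int.from_bytes(memory[byte_address:byte_address + WORD_BYTES], "little")


def store_word(memory, byte_address, value):
    if not is_aligned(byte_address):
        raise ValueError(f"misaligned word access at {byte_address}")
    memory[byte_address:byte_address + WORD_BYTES] = (
        (value & 0xFFFFFFFF).to_bytes(WORD_BYTES, "little"))


memory = bytearray(20)                 # byte addresses 0..19
store_word(memory, 12, 7)              # A = 7 (sample value)
store_word(memory, 4, 5)               # B = 5 (sample value)

r0 = load_word(memory, 12)             # R0 <- Mem[12]
r1 = load_word(memory, 4)              # R1 <- Mem[4]
r2 = r0 + r1                           # R2 <- R0 + R1
store_word(memory, 8, r2)              # Mem[8] <- R2
print(load_word(memory, 8))
```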
RISC-V general registers
[Figure: table of the RISC-V general registers, showing the register numbers the
hardware understands alongside the human-friendly symbolic names used in
assembly code]
More on RISC-V registers
❑ There are other registers!
➢ Not accessible to user (no name/number).

❑ PC: Program counter


➢ holds the address of the next instruction to be fetched from memory

[Figure: the CPU's Program Counter (PC) holds 0x0010 and drives the address bus
to memory (RAM), which stores Instruction 1 to Instruction 4. Instruction 2, at
memory address 0x0010, is returned over the data bus into the Instruction
Register (IR).]


Memory Content: Endianness
❑ Endianness:
➢ The relative ordering of the bytes in a multiple-byte word stored in memory
Note: endianness does NOT refer to bit order within each byte in memory.

Big-endian: most significant byte stored at the lowest address.
Examples: IBM 360, Motorola 68000, SPARC.
Little-endian: least significant byte stored at the lowest address.
Examples: Intel 80x86, DEC Alpha, RISC-V.

Example: how are 16 consecutive bytes 0x0, 0x1, …, 0xE, 0xF stored in a
32-bit wide memory starting from address 0?

Word address | Big-endian (MSB → LSB) | Little-endian (MSB → LSB)
3 | C D E F | F E D C
2 | 8 9 A B | B A 9 8
1 | 4 5 6 7 | 7 6 5 4
0 | 0 1 2 3 | 3 2 1 0
Byte address (within word) | 0 1 2 3 | 3 2 1 0
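The same 16-byte example can be reproduced with Python's built-in `int.from_bytes` and `int.to_bytes`. This is only an illustration of byte ordering; the variable and function names are arbitrary.

```python
# 16 consecutive bytes 0x0 .. 0xF placed at byte addresses 0 .. 15.
memory = bytes(range(16))


def read_word(memory, word_address, byteorder):
    """Interpret the 4 bytes of one 32-bit word in the given byte order."""
    start = word_address * 4
    return int.from_bytes(memory[start:start + 4], byteorder)


for word_address in range(4):
    big = read_word(memory, word_address, "big")
    little = read_word(memory, word_address, "little")
    print(f"word {word_address}: big-endian 0x{big:08X}, little-endian 0x{little:08X}")

# Going the other way: storing the same value gives different byte sequences.
value = 0x0A0B0C0D
big_bytes = value.to_bytes(4, "big")        # 0A at the lowest address
little_bytes = value.to_bytes(4, "little")  # 0D at the lowest address
```

The bytes in memory are identical in both cases; only the significance assigned to each byte address changes.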
Addressing Modes
Addressing mode | Example | Meaning
Register | Add R4,R3 | R4 ← R4+R3
Immediate | Add R4,#3 | R4 ← R4+3
Displacement | Add R4,100(R1) | R4 ← R4+Mem[100+R1]
Register indirect | Add R4,(R1) | R4 ← R4+Mem[R1]
Indexed / Base | Add R3,(R1+R2) | R3 ← R3+Mem[R1+R2]
Direct or absolute | Add R1,(1001) | R1 ← R1+Mem[1001]
Memory indirect | Add R1,@(R3) | R1 ← R1+Mem[Mem[R3]]
Auto-increment | Add R1,(R2)+ | R1 ← R1+Mem[R2]; R2 ← R2+d
Auto-decrement | Add R1,–(R2) | R2 ← R2-d; R1 ← R1+Mem[R2]
Scaled | Add R1,100(R2)[R3] | R1 ← R1+Mem[100+R2+R3*d]

Note: we will mainly work with the first 3 modes in this course.

❑ Addressing modes = ways to obtain an operand of an instruction.
➢ How architectures specify the address of a constant, a register, or a
location in memory.
➢ Which modes should we use? Apply the quantitative method.
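For the modes in the table that access memory, the difference lies in how the effective address is formed. The sketch below computes that address for a few of them; the function, its keyword parameters and the register contents are hypothetical, and `d` is taken to be the operand size in bytes (an assumption, since the table does not define it).

```python
def effective_address(mode, regs, base=None, index=None, displacement=0, d=4):
    """Return the memory address an operand is fetched from for `mode`."""
    if mode == "displacement":         # 100(R1)      -> 100 + R1
        return displacement + regs[base]
    if mode == "register_indirect":    # (R1)         -> R1
        return regs[base]
    if mode == "indexed":              # (R1+R2)      -> R1 + R2
        return regs[base] + regs[index]
    if mode == "direct":               # (1001)       -> 1001
        return displacement
    if mode == "scaled":               # 100(R2)[R3]  -> 100 + R2 + R3*d
        return displacement + regs[base] + regs[index] * d
    raise ValueError(f"unsupported mode: {mode}")


regs = {"R1": 200, "R2": 8, "R3": 3}   # sample register contents

print(effective_address("displacement", regs, base="R1", displacement=100))
print(effective_address("register_indirect", regs, base="R1"))
print(effective_address("indexed", regs, base="R1", index="R2"))
print(effective_address("direct", regs, displacement=1001))
print(effective_address("scaled", regs, base="R2", index="R3", displacement=100))
```

Register and immediate modes are left out because they do not form a memory address: the operand comes from a register or from the instruction itself.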
Addressing modes example
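A[0] = h + A[2];

lw x5, 8(x6)      # x5 ← Mem[x6 + 8] = A[2] (base address in x6, offset 8)
add x5, x5, x7    # x5 ← x5 + h
sw x5, 0(x6)      # Mem[x6 + 0] = A[0] ← x5

Registers: x5 = temporary, x6 = 4n (base address of A), x7 = h
Memory (byte addresses): A[0] at 4n … 4n+3, A[1] at 4n+4 … 4n+7, A[2] at 4n+8 … 4n+11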
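The lw/add/sw sequence for A[0] = h + A[2] can be traced with a small simulation. The Python helpers below only mimic the displacement addressing of the three instructions; the base address (16), the array contents and the value of h are made-up sample values, and 32-bit overflow in `add` is ignored for brevity.

```python
def lw(regs, memory, rd, offset, base):
    """rd <- 32-bit word at Mem[regs[base] + offset] (little-endian)."""
    addr = regs[base] + offset
    regs[rd] = int.from_bytes(memory[addr:addr + 4], "little", signed=True)


def sw(regs, memory, rs, offset, base):
    """Mem[regs[base] + offset] <- low 32 bits of rs."""
    addr = regs[base] + offset
    memory[addr:addr + 4] = (regs[rs] & 0xFFFFFFFF).to_bytes(4, "little")


def add(regs, rd, rs1, rs2):
    regs[rd] = regs[rs1] + regs[rs2]   # overflow wraparound not modelled


BASE = 16                              # plays the role of 4n in the figure
memory = bytearray(32)
for i, value in enumerate([10, 20, 30]):          # A[0], A[1], A[2]
    memory[BASE + 4 * i:BASE + 4 * i + 4] = value.to_bytes(4, "little")

regs = {"x5": 0, "x6": BASE, "x7": 5}  # x6 = base address of A, x7 = h

lw(regs, memory, "x5", 8, "x6")        # lw  x5, 8(x6)   -> x5 = A[2]
add(regs, "x5", "x5", "x7")            # add x5, x5, x7  -> x5 = A[2] + h
sw(regs, memory, "x5", 0, "x6")        # sw  x5, 0(x6)   -> A[0] = x5

a0 = int.from_bytes(memory[BASE:BASE + 4], "little")
print(a0)
```

The offset 8 is 2 × 4 bytes: A[2] sits two words past the base address because memory is byte addressed.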
Aspect #3 – Operations in the Instruction Set
❑ Standard Operations in an Instruction Set
❑ Frequently Used Instructions

Standard Operations
❑ RISC: the compiler turns complex operations into sequences of basic
operations
➢ Which operations should our ISA support?
➢ Can be determined by the quantitative method.
Frequently Used Instructions
❑ Design principle #2: “Make the common case fast”

❖ Average of five SPECint92 programs


Aspect #4 – Encoding the Instruction Set
❑ Instruction format
▪ Instruction fields
▪ Instruction length

❑ Instruction encoding alternatives

Instruction Encoding: Overview
❑ Encoding = how instructions are represented in binary format for
execution by the processor.
➢ Affects: Code size, performance, hardware complexity

❑ Things to be decided:
➢ Number of registers
➢ Number of addressing modes
➢ Instruction length
➢ Number of operands in an instruction

❑ Different competing forces:


➢ Have many registers and addressing modes
➢ Reduce code size
➢ Have instruction length that makes hardware implementation simpler.
Instruction fields
❑ An instruction consists of at least one of the following fields
➢ opcode: unique code to specify the desired operation
➢ operands: additional information needed for the operation

❑ The operation designates the type and size of the operands


➢ Most accessed data type and size by SPEC: integer (byte, half-word, word,
double word), floating point (byte, word, double word).
Instruction length
❑ Design choices for instruction length: variable-length, fixed-length, or hybrid.
Instruction encoding alternatives
❑ Variable-length instructions.
➢ Support any number of operands → enables smallest code size, because
unused fields need not be included
➢ Require multi-step fetch and decode → worst performance.

❑ Fixed-length instructions.
➢ Fixed number of operands, with addressing modes (if options exist) specified
as part of the opcode → largest code size.
➢ Allow for easy fetch and decode + simplify pipelining and parallelism → best
performance.

❑ Hybrid instructions.
➢ Have multiple formats: fixed-length plus one or two variable-length
instructions.
➢ Reduce the variability in size and work of the variable-length architecture
while reducing the code size of the fixed-length counterpart.
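As an illustration of the fixed-length alternative, the sketch below packs one instruction into a single 32-bit word with fields at fixed bit positions, so every field can be extracted with a shift and a mask. The field layout is the RISC-V R-type format as defined in the RISC-V specification (not covered in this lecture); the Python function names are hypothetical.

```python
def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack six fields into a 32-bit word: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    fields = [(funct7, 7), (rs2, 5), (rs1, 5), (funct3, 3), (rd, 5), (opcode, 7)]
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def decode_r_type(word):
    """Every field sits at a fixed position, so decoding is shift-and-mask."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "funct7": (word >> 25) & 0x7F,
    }


ADD_OPCODE = 0b0110011   # opcode shared by RISC-V register-register ALU operations

# add x5, x5, x7
add_x5_x5_x7 = encode_r_type(funct7=0, rs2=7, rs1=5, funct3=0, rd=5,
                             opcode=ADD_OPCODE)
print(f"0x{add_x5_x5_x7:08X}")
print(decode_r_type(add_x5_x5_x7))
```

The 5-bit register fields are what limit this format to 32 registers, which is one instance of the trade-off between number of registers and instruction length.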
Aspect #5 – The role of compilers

❑ The role of compilers


❑ ISA factors that affect compiler performance

The role of compilers
❑ To translate programs written in HLL to a target instruction set.
➢ Significantly affects the performance of a computer.
➢ Goals: correctness, speed of the compiled code, then: fast compilation,
debugging support, and interoperability among languages.
ISA factors that affect compiler performance
❑ Register allocation
➢ Optimizing passes must use registers to achieve best performance.
➢ Register allocation algorithms require at least 16 (preferably 32) GPRs.

❑ Regularity, a.k.a. "law of least astonishment"


➢ All addressing modes apply to all data transfer instructions (i.e. addressing
modes & data transfer operations are orthogonal).
➢ Simplify code generation ← design principle #1: “simplicity favors regularity”

❑ Instruction simplicity
➢ Special features that “match” a language construct (e.g. FOR and CASE
statements) or a kernel function often make the compiler's job harder.
➢ “Provide primitives, not solutions”: the compiler works best with a minimalist
instruction set.
Summary
❑ ISA design is hard
➢ Adhere to the 4 qualitative principles
➢ Apply the quantitative method

❑ Five aspects of ISA design


➢ Data storage choices: GPR (load/store, register-memory), stack,
memory-memory, accumulator.
➢ Common addressing modes: displacement, immediate, register indirect
➢ Most important operations are simple instructions (96% of the instructions
executed) → make the common case fast.
➢ Instruction encoding: performance vs code size trade-off (fixed- vs
variable-length)
➢ To support compiler performance: at least 16 (preferably 32) GPRs,
aim for a minimalist instruction set, & ensure all addressing modes apply to
all data transfer instructions.

❑ Next lecture: case study for RISC-V ISA.
