
ELT3047 Computer Architecture

Lesson 2: ISA design principles

Hoang Gia Hung


Faculty of Electronics and Telecommunications
University of Engineering and Technology, VNU Hanoi
Last lecture review
❑ Course overview
➢ You’ll learn the HW/SW interface in a modern (=Von Neumann) computer

❑ Von Neumann model


➢ Instructions are stored in a linear memory → can be modified like data.
➢ Instructions are fetched and executed in CPU sequentially.

❑ Computer architecture
➢ Represented in abstract layers to manage complexity
➢ ISA = what the computer does; Organization = how the ISA is implemented;
Realization = implementation of the ISA on specific integrated circuits.

❑ Computer performance
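➢ Performance is the reciprocal of CPU time: Performance = 1 / CPU time
➢ CPU time follows the classical CPU performance equation: CPU time = instruction count × CPI × clock cycle time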

❑ Today’s lecture: ISA design principles


Recap: Instruction Set Architecture
Software (to be translated to the instruction set)
ISA: the programmer’s view of hardware, i.e. the things needed to make machine code work correctly. It is an abstraction layer that hides the complexity of the CPU implementation from programmers.
Hardware (implementation of the instruction set)

❑ Allows computer designers to think about functions independently from the hardware that performs them.
✓ SW for the Intel 80386 can still run on an Intel Rocket Lake.

❑ Different ISA’s → different processors


Recap: Processor Design Levels
❑ Architecture (ISA): programmer/compiler view
➢ “functional appearance to its immediate user/system programmer”
➢ Data storage, addressing mode, instruction set, instruction formats &
encodings.

❑ µ-architecture: processor designer view


➢ “logical structure or organization that performs the architecture”
➢ Pipelining, functional units, caches, physical registers

❑ VLSI Realization (chip): chip designer view


➢ “physical structure that embodies the µ-architecture”
➢ Gates, cells, transistors, wires

❑ Three distinct levels
➢ Processors with identical ISA may differ in organization: Intel vs AMD
➢ Processors with identical ISA and identical organization may still differ in realization: Intel Core i9-11900K vs Intel Core i5-11600K
Recap: Instruction & Instruction Set
❑ Instructions are the fundamental operations that the CPU can execute.
➢ Analogy to human sentence: operations (verbs) applied to operands (objects)
➢ Instruction set: the repertoire of instructions, like the vocabulary of the computer language.
➢ Operands may be implicit or explicit. Example: in C = A + B, the operands are A, B and C, and the operator is +.

❑ Stored program
➢ A program is written as a sequence of instructions,
which are stored in a memory, in conjunction
with data, as binary bits.
➢ Instructions are automatically fetched, decoded,
and executed one by one.

❑ Registers: small amount of fast memory built directly inside the processor by dedicated HW to store instructions’ data.
➢ Registers hold the fastest data available to the processor
➢ Why is having registers a good idea? ← programs exhibit data locality.
The 5 Aspects in ISA Design

1. Data Storage

2. Memory Addressing Modes

3. Operations in the Instruction Set

4. Encoding the Instruction Set

5. The role of compilers


ISA Design Principles
❑ Designing an ISA is hard:
➢ What types of storage? How much?
➢ How many instructions? What are they?
➢ How to encode instructions? To minimize code size or to make hardware
implementation simple?
➢ How to future-proof?

❑ Design principles:
1. Simplicity favors regularity
2. Make the common case fast
3. Smaller is faster
4. Good design demands good compromises

❑ The quantitative methodology


➢ Take a set of benchmark programs expected to run on the system
➢ Implement the benchmark programs with different ISA configurations
➢ Pick the best one
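
The three steps of the quantitative methodology can be sketched as a small selection loop. This is only an illustration: the configuration names, the benchmark names, the timing numbers, and the choice of "lowest total time" as the selection criterion are all assumptions made for the example, not data or rules from the lecture.

```python
def total_time(times):
    """Total execution time of one ISA configuration over all benchmarks."""
    return sum(times.values())


def pick_best_configuration(times_by_config):
    """Return the name of the configuration with the lowest total benchmark time."""
    return min(times_by_config, key=lambda name: total_time(times_by_config[name]))


# Hypothetical execution times in seconds (made-up numbers, not measurements).
times_by_config = {
    "8 registers": {"compiler": 12.0, "editor": 5.0, "graphics": 9.0},
    "16 registers": {"compiler": 10.0, "editor": 4.0, "graphics": 8.0},
    "32 registers": {"compiler": 9.0, "editor": 4.0, "graphics": 7.5},
}

best = pick_best_configuration(times_by_config)
print(best)
```

In practice the criterion would weigh more than raw time (code size, hardware cost), which is where design principle #4 comes in.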
Benchmarks

❑ Benchmarking: using real applications to measure performance


➢ Supposedly typical of actual workload.
➢ Representatives of expected classes of applications (compilers, editors,
scientific applications, graphics, ...) ← make common case fast.
➢ Focus on reproducibility: must provide every detail that another experimenter would need to duplicate the results.

❑ SPEC (Standard Performance Evaluation Corporation)


➢ Funded and supported by a number of computer vendors
➢ Companies have agreed on a set of real programs and inputs
➢ Various benchmarks for CPU performance, graphics, high-performance computing, client-server models, file systems, Web servers, etc.
➢ Valuable indicator of performance (and compiler technology)
SPEC CPU Benchmark
❑ Measure elapsed time to execute a selection of programs (with negligible I/O), normalized relative to a reference machine
➢ Summarize as the geometric mean of the performance ratios: $\sqrt[n]{\prod_{i=1}^{n} \text{Perf. ratio}_i}$
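
A minimal sketch of the summary computation, assuming each performance ratio is the reference machine's time divided by the measured time. The function names and the timing values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def performance_ratio(reference_time, measured_time):
    """Ratio > 1 means the measured machine is faster than the reference machine."""
    return reference_time / measured_time


def geometric_mean(ratios):
    """n-th root of the product of n ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical (reference time, measured time) pairs in seconds.
timings = [(100.0, 50.0), (90.0, 10.0), (40.0, 80.0)]
ratios = [performance_ratio(ref, measured) for ref, measured in timings]  # 2.0, 9.0, 0.5

summary = geometric_mean(ratios)  # cube root of 2.0 * 9.0 * 0.5 = 9.0
print(round(summary, 3))
```

The geometric mean is used because the ranking of two machines does not change when a different reference machine is chosen.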
CISC vs RISC: the famous ISA battle
❑ Two major design philosophies for ISA:
➢ Complex instruction set computer (CISC)
➢ Reduced Instruction Set Computer (RISC)

CISC | RISC
Many instructions and addressing modes | Few instructions and addressing modes
Single instruction performs complex operation | Simple instructions, combined by SW to perform complex operations
Smaller program size | Larger program size
Complex implementation | Easier to build/optimize hardware
Intel, AMD, Cyrix | RISC-V, Sun SPARC, HP PA-RISC, IBM PowerPC

❖ No new general-purpose CISC ISA in 30 years!

❑ This course’s focus: RISC


Aspect #1 – Data Storage

❑ Storage Architecture
❑ General Purpose Register Architecture

Storage Architecture: Definition
❑ For a processor, storage architecture is concerned with:
➢ Where do we store the operands so that the computation can be
performed?
➢ Where do we store the computation result afterwards?
➢ How do we specify the operands?

❑ Common storage architectures


➢ Stack: usually implemented as a register file to store all operands & results; all operands are implicitly on top of the stack.
➢ Accumulator (1-operand machine): a special register (the accumulator) stores the result of a calculation, while also acting as an implicit operand.
➢ General-purpose register architecture: uses only explicit operands; all registers are good for all purposes.
➢ Memory: all operands & results are placed in memory.
Popular storage architectures
❑ Running example for comparing the architectures: C = A + B
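
The statement C = A + B can be traced through each storage architecture with a toy model. This is a sketch under simplifying assumptions: memory is a Python dict keyed by variable name rather than by address, and the pseudo-instruction names in the comments (Push, Load, Add, Store) are generic textbook mnemonics, not the syntax of any particular machine.

```python
def run_stack(memory):
    """Stack machine: operands are implicitly on top of the stack."""
    stack = []
    stack.append(memory["A"])                 # Push A
    stack.append(memory["B"])                 # Push B
    stack.append(stack.pop() + stack.pop())   # Add (pops two values, pushes the sum)
    memory["C"] = stack.pop()                 # Pop C
    return memory["C"]


def run_accumulator(memory):
    """Accumulator machine: one operand and the result are implicitly in acc."""
    acc = memory["A"]          # Load A
    acc = acc + memory["B"]    # Add B
    memory["C"] = acc          # Store C
    return memory["C"]


def run_register_memory(memory):
    """Register-memory machine: one operand may come straight from memory."""
    r1 = memory["A"]           # Load R1, A
    r1 = r1 + memory["B"]      # Add R1, B
    memory["C"] = r1           # Store R1, C
    return memory["C"]


def run_load_store(memory):
    """Register-register (load-store) machine: arithmetic uses registers only."""
    regs = [0, 0, 0, 0]
    regs[1] = memory["A"]            # Load R1, A
    regs[2] = memory["B"]            # Load R2, B
    regs[3] = regs[1] + regs[2]      # Add R3, R1, R2
    memory["C"] = regs[3]            # Store R3, C
    return memory["C"]


results = [run({"A": 3, "B": 4, "C": 0})
           for run in (run_stack, run_accumulator, run_register_memory, run_load_store)]
print(results)
```

All four variants produce the same result; they differ in how many instructions are needed and in how many operands each instruction names explicitly.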
Real-life implementation
❑ Stack architecture: a legacy from the “adding” machine days
➢ Top portion of the stack inside CPU; the rest in memory.
Examples: some technical handheld calculators, Z4 (by Konrad Zuse).

❑ Accumulator architecture:
➢ One operand is implicitly in the accumulator.
Examples: IBM 701, DEC PDP-8.

❑ General-purpose register architecture:


➢ Register-memory architecture: one operand in memory. Examples:
Motorola 68000, Intel 80386.
➢ Register-register (or load-store) architecture: both operands in registers.
Examples: RISC-V, DEC Alpha.

❑ Memory-memory architecture:
➢ All operands in memory. Example: DEC VAX.
Storage Architecture: GPR Architecture
❑ For modern processors (after 1980):
➢ General-Purpose Register (GPR) is the most common choice for storage
design.
➢ RISC computers typically use a Register-Register (Load/Store) design
E.g. RISC-V, ARM
➢ CISC computers use a mixture of Register-Register and Register-Memory
E.g. IA-32.

❑ Reasons for GPR architecture dominance


➢ Registers are much faster than memory
➢ Registers are more efficient for a compiler to use
➢ Design principle #1: simplicity favors regularity.

❑ Design question
➢ How many GPRs should be sufficient?
➢ Design principle #3: smaller is faster.
Aspect #2 – Memory Addressing Mode

❑ Memory Locations and Addresses


❑ Addressing Modes

General memory organization
❑ Memory is viewed as a large, single-dimension array of memory locations.
❑ Each location of the memory has an address, which is an index into the array.
➢ Given a k-bit address, the address space is of size 2^k.
➢ Each location/address contains 1 byte → we can access a single byte (byte addressing)
➢ Each memory transfer usually consists of one word of 4 bytes → requires word addressing
➢ Word address = address of the first/last byte of the chunk.
➢ Consecutive word addresses are 4 bytes apart.
➢ Words are aligned in memory if they begin at a byte address that is a multiple of 4.
[Figure: memory drawn as a column of byte addresses 0, 1, 2, ..., 11, ..., each holding 8 bits, with one aligned word and one mis-aligned word marked.]
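
The address arithmetic above can be written out in a few lines. This is a minimal sketch assuming 4-byte words as on the slide; the helper names are hypothetical.

```python
WORD_SIZE = 4  # bytes per word, as assumed on the slide


def address_space_size(k):
    """Number of distinct addresses reachable with a k-bit address."""
    return 2 ** k


def is_word_aligned(byte_address):
    """A word is aligned if it begins at a byte address that is a multiple of 4."""
    return byte_address % WORD_SIZE == 0


def word_number(byte_address):
    """Index of the word that contains the given byte address."""
    return byte_address // WORD_SIZE


print(address_space_size(32))                    # bytes addressable with 32-bit addresses
print(is_word_aligned(8), is_word_aligned(6))
print(word_number(8))
```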
Memory organization: endianness
❑ Endianness:
➢ The relative ordering of the bytes in a multiple-byte word stored in memory
Note: endianness does NOT refer to bit order within each byte in memory.

Big-endian: most significant byte stored in lowest address. Byte addresses from MSB to LSB: 0 1 2 3.
Little-endian: least significant byte stored in lowest address. Byte addresses from MSB to LSB: 3 2 1 0.

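Example: How is the data block 0x0123456789ABCDEF stored in a 32-bit wide memory, starting continuously from the address 0?
[Figure: two memory maps, one per byte ordering. Word 0 (address 0) holds the bytes 01 23 45 67 and word 1 (address 4) holds the bytes 89 AB CD EF; the two maps differ only in which byte of each word sits at the lowest address.]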
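
The byte placement for this example can be checked with a short sketch. It assumes the data block is stored as two consecutive 32-bit words, 0x01234567 at address 0 and 0x89ABCDEF at address 4; the helper name `store_word` is hypothetical.

```python
def store_word(memory, address, value, byteorder):
    """Write a 32-bit value into memory at the given byte address."""
    memory[address:address + 4] = value.to_bytes(4, byteorder)


big = bytearray(8)      # byte-addressed memory, big-endian layout
little = bytearray(8)   # byte-addressed memory, little-endian layout

for address, word in ((0, 0x01234567), (4, 0x89ABCDEF)):
    store_word(big, address, word, "big")
    store_word(little, address, word, "little")

# Print the byte stored at each address 0..7 for both orderings.
for address in range(8):
    print(address, format(big[address], "02X"), format(little[address], "02X"))
```

In the big-endian memory, address 0 holds 0x01 (the most significant byte of word 0); in the little-endian memory, address 0 holds 0x67 (the least significant byte).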
Memory content transfer
❑ CPU loads data from memory to registers and stores register
contents to locations in memory.
➢ Variables stored in memory must be referenced by memory addresses
➢ Each register is referred to by a number/name (next slide)
Example: int A, B, C; C = A + B;

Address | Data
0 |
4 | B
8 | C
12 | A
16 |
... | ...

load: R0 ← Mem[12]
load: R1 ← Mem[4]
R2 ← R0 + R1
store: Mem[8] ← R2

Memory: up to 2^32 bytes organized as 2^30 4-byte (32-bit) words; addresses of consecutive words differ by 4.
RISC-V general registers
ABI name (human-friendly symbolic name in assembly code) | Register number (what the hardware understands) | Usage
zero | x0 | Constant value 0
ra | x1 | Return address
sp | x2 | Stack pointer
gp | x3 | Global pointer
tp | x4 | Thread pointer
t0-2 | x5-7 | Temporaries
s0/fp | x8 | Saved register / Frame pointer
s1 | x9 | Saved register
a0-1 | x10-11 | Function arguments / return values
a2-7 | x12-17 | Function arguments
s2-11 | x18-27 | Saved registers
t3-6 | x28-31 | Temporaries
More on RISC-V registers
❑ There are other registers!
➢ Not accessible to user (no name/number).

❑ PC: Program counter


➢ holds the address of the next instruction to be fetched from memory

[Figure: the Program Counter (PC) in the CPU holds the memory address 0x0010 and drives the address bus to memory (RAM), which contains Instructions 1 to 4; Instruction 2 is returned over the data bus into the Instruction Register (IR).]

Addressing Modes
Addressing mode | Example | Meaning
Register | Add R4,R3 | R4 ← R4+R3
Immediate | Add R4,#3 | R4 ← R4+3
Displacement | Add R4,100(R1) | R4 ← R4+Mem[100+R1]
Register indirect | Add R4,(R1) | R4 ← R4+Mem[R1]
Indexed / Base | Add R3,(R1+R2) | R3 ← R3+Mem[R1+R2]
Direct or absolute | Add R1,(1001) | R1 ← R1+Mem[1001]
Memory indirect | Add R1,@(R3) | R1 ← R1+Mem[Mem[R3]]
Auto-increment | Add R1,(R2)+ | R1 ← R1+Mem[R2]; R2 ← R2+d
Auto-decrement | Add R1,–(R2) | R2 ← R2-d; R1 ← R1+Mem[R2]
Scaled | Add R1,100(R2)[R3] | R1 ← R1+Mem[100+R2+R3*d]

Will mainly work with the first 3 modes in this course.

❑ Addressing modes = ways to obtain an operand of an instruction.
➢ How an architecture associates variables with registers or allocates them to locations in memory.
➢ Which modes should we use? Apply the quantitative method.
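
How each mode locates its operand can be sketched as a single lookup function. This is a simplified model, not any real machine: registers and memory are Python dicts, memory is indexed by address with one value per address, and the function and parameter names are hypothetical. Only the modes without side effects are modeled.

```python
def fetch_operand(mode, regs, mem, base=None, index=None, const=0):
    """Return the source operand selected by the given addressing mode."""
    if mode == "register":
        return regs[base]                       # operand is the register itself
    if mode == "immediate":
        return const                            # operand is encoded in the instruction
    if mode == "displacement":
        return mem[const + regs[base]]          # Mem[const + R]
    if mode == "register_indirect":
        return mem[regs[base]]                  # Mem[R]
    if mode == "indexed":
        return mem[regs[base] + regs[index]]    # Mem[R1 + R2]
    if mode == "direct":
        return mem[const]                       # Mem[const]
    if mode == "memory_indirect":
        return mem[mem[regs[base]]]             # Mem[Mem[R]]
    raise ValueError("unknown addressing mode: " + mode)


regs = {"R1": 100, "R2": 4, "R3": 7}
mem = {100: 200, 104: 11, 200: 42, 1001: 5}

print(fetch_operand("register", regs, mem, base="R3"))
print(fetch_operand("immediate", regs, mem, const=3))
print(fetch_operand("displacement", regs, mem, base="R1", const=4))
print(fetch_operand("memory_indirect", regs, mem, base="R1"))
```

Note that displacement with a constant of 0 behaves like register indirect, which is one reason a small set of modes can cover most cases.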
Addressing modes example
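A[0] = h + A[2];

lw x5, 8(x6)      # x5 ← Mem[x6 + 8] = A[2] (base address in x6, offset 8)
add x5, x5, x7    # x5 ← A[2] + h
sw x5, 0(x6)      # Mem[x6 + 0] = A[0] ← x5

Registers: x5 = temporary, x6 = 4n (base address of A), x7 = h
Memory: A[0] occupies bytes 4n to 4n+3, A[1] bytes 4n+4 to 4n+7, A[2] bytes 4n+8 to 4n+11.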
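
The three-instruction sequence for A[0] = h + A[2] can be replayed on a toy byte-addressed memory. This sketch assumes little-endian 32-bit words and uses made-up values (base address 16 in place of 4n, h = 12, A[2] = 30); the helper names are hypothetical, and 32-bit overflow wrap-around is not modeled.

```python
def load_word(memory, address):
    """Read a signed 32-bit little-endian word starting at a byte address."""
    return int.from_bytes(memory[address:address + 4], "little", signed=True)


def store_word(memory, address, value):
    """Write a signed 32-bit little-endian word starting at a byte address."""
    memory[address:address + 4] = value.to_bytes(4, "little", signed=True)


memory = bytearray(32)
base = 16                            # plays the role of 4n, the address of A[0]
store_word(memory, base + 8, 30)     # A[2] = 30 (made-up value)

x = {5: 0, 6: base, 7: 12}           # x6 = base address of A, x7 = h = 12

x[5] = load_word(memory, x[6] + 8)   # lw  x5, 8(x6)
x[5] = x[5] + x[7]                   # add x5, x5, x7
store_word(memory, x[6] + 0, x[5])   # sw  x5, 0(x6)

print(load_word(memory, base))       # value now stored in A[0]
```

The offset 8 comes from the element index times the word size: A[2] starts 2 × 4 bytes after A[0].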
Aspect #3 – Operations in the Instruction Set
❑ Standard Operations in an Instruction Set
❑ Frequently Used Instructions

Standard Operations
❑ RISC: complex operations are translated by the compiler into sequences of basic operations

➢ Which one should our ISA support?


▪ Design principle #2: “Make the common case fast”
▪ Determined by quantitative method.
Frequently Used Instructions
❑ Average of five SPECint92 programs
Aspect #4 – Encoding the Instruction Set
❑ Instruction format
▪ Instruction fields
▪ Instruction length

❑ Instruction encoding alternatives

Instruction Encoding: Overview
❑ Encoding = how instructions are represented in binary format for
execution by the processor.
➢ Affects: Code size, performance, hardware complexity

❑ Things to be decided:
➢ Number of registers
➢ Number of addressing modes
➢ Instruction length
➢ Number of operands in an instruction

❑ Different competing forces:


➢ Have many registers and addressing modes
➢ Reduce code size
➢ Have instruction length that makes hardware implementation simpler.
Instruction fields
❑ An instruction consists of at least one of the following fields
➢ opcode: unique code to specify the desired operation
➢ operands: additional information needed for the operation

❑ The operation designates the type and size of the operands


➢ Most accessed data type and size by SPEC: integer (byte, half-word, word,
double word), floating point (byte, word, double word).
Instruction length
❑ Design choices for instruction length:
Instruction encoding alternatives
❑ Variable-length instructions.
➢ Support any number of operands → enables smallest code size, because
unused fields need not be included
➢ Require multi-step fetch and decode → worst performance.

❑ Fixed-length instructions.
➢ Fixed number of operands, with addressing modes (if options exist) specified
as part of the opcode → largest code size.
➢ Allow for easy fetch and decode + simplify pipelining and parallelism → best
performance.

❑ Hybrid instructions.
➢ Have multiple formats: fixed-length plus one or two variable-length instructions.
➢ Reduce the variability in size and work of the variable-length architecture while reducing the code size of the fixed-length counterpart.
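
As an illustration of a fixed-length encoding, the sketch below packs one register-register instruction into a single 32-bit word using the RISC-V base R-type field layout (funct7, rs2, rs1, funct3, rd, opcode). The function name is hypothetical, and the example instruction add x5, x5, x7 is chosen only for illustration.

```python
def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack the six R-type fields into one 32-bit instruction word.

    Bit positions: funct7 [31:25], rs2 [24:20], rs1 [19:15],
    funct3 [14:12], rd [11:7], opcode [6:0].
    """
    return ((funct7 << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | (rd << 7) | opcode)


OP_OPCODE = 0b0110011  # opcode shared by register-register integer operations

# add x5, x5, x7: rd = 5, rs1 = 5, rs2 = 7, funct3 = 0, funct7 = 0
word = encode_r_type(funct7=0, rs2=7, rs1=5, funct3=0, rd=5, opcode=OP_OPCODE)
print(format(word, "08X"))
```

Because every field sits at a fixed bit position, the hardware can extract the register numbers before it has finished decoding the opcode, which is part of what makes fetch and decode simple.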
Aspect #5 – The role of compilers

❑ The role of compilers


❑ ISA factors that affect compiler performance

The role of compilers
❑ To translate programs written in HLL to a target instruction set.
➢ Significantly affects the performance of a computer.
➢ Goals: correctness, speed of the compiled code, then: fast compilation,
debugging support, and interoperability among languages.
ISA factors that affect compiler performance
❑ Register allocation
➢ Optimizing passes must use registers to achieve best performance.
➢ Register allocation algorithms require at least 16 (preferably 32) GPR’s.

❑ Regularity, a.k.a. "law of least astonishment"


➢ All addressing modes apply to all data transfer instructions (i.e. addressing
modes & data transfer operations are orthogonal).
➢ Simplify code generation ← design principle #1: “simplicity favors regularity”

❑ Instruction simplicity
➢ Special features that “match” a language construct (e.g. FOR and CASE
statements) or a kernel function often make the compiler work more.
➢ “Provide primitives, not solutions” – compiler works best with a minimalist
instruction set.
Summary
❑ ISA design is hard
➢ Adhere to the 4 qualitative principles
➢ Apply the quantitative method

❑ Five aspects of ISA design


➢ Data Storage choices: GPR (load/store, register-memory), Stack, Memory-memory, Accumulator.
➢ Common addressing modes: displacement, immediate, register indirect
➢ Most important operations are simple instructions (96% of the instructions
executed) → make the common case fast.
➢ Instruction encoding: performance vs code size trade-off (fixed- vs
variable-length)
➢ To support the compiler performance: at least 16 (preferably 32) GPR’s,
aim for a minimalist instruction set, & ensure all addressing modes apply to
all data transfer instructions.

❑ Next lecture: case study for RISC-V ISA.
