Week 9
ARM Pipeline
• Pipelining is a key feature of ARM processors;
its depth differs between cores
• 3-stage pipeline – ARM7TDMI and earlier
• 5-stage pipeline – ARM8, ARM9TDMI
• 6-stage pipeline – ARM10TDMI
• 8-stage pipeline – ARM11
3-stage Pipeline
• Classical fetch-decode-execute pipeline
• First stage reads an instruction from memory and
increments the value in the instruction address
register
• Next stage decodes instruction and prepares control
signals to execute it
• Third stage does the actual work: it reads operands
from the register file, performs ALU operations, and
writes back the modified register values
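The overlap of the three stages can be sketched as a small timing model. This is a minimal illustration that assumes an ideal pipeline with no stalls; the names `pipeline_schedule` and `total_cycles` and the four-instruction program are made up for the example.

```python
STAGES = ["Fetch", "Decode", "Execute"]

def pipeline_schedule(instructions):
    """Return {cycle: {stage: instruction}} for an ideal, stall-free pipeline."""
    schedule = {}
    for i, instr in enumerate(instructions):
        for s, stage in enumerate(STAGES):
            cycle = i + s + 1  # instruction i occupies stage s in cycle i + s + 1
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule

def total_cycles(n_instructions, n_stages=len(STAGES)):
    """The first instruction needs n_stages cycles; each later one adds one cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + n_instructions - 1

# Hypothetical program; only the ordering matters for this model.
program = ["ADD", "SUB", "MOV", "CMP"]
sched = pipeline_schedule(program)
for cycle in sorted(sched):
    row = "  ".join(f"{st}:{sched[cycle].get(st, '-'):<4}" for st in STAGES)
    print(f"cycle {cycle}: {row}")
print("total cycles:", total_cycles(len(program)))
```

Once the pipeline is full (cycle 3 here), one instruction completes per cycle, which is where the throughput gain comes from.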
5-stage Pipeline
• In the 3-stage pipeline, every data transfer instruction causes a
pipeline stall, because the next instruction cannot be fetched while
memory is being read/written
• Instruction and data memories are separated, which removes this stall
• Register read step is moved to the decode stage
• Execute stage is split into three: performing arithmetic
computations, memory access, and writing the result back to the register file
• The stages are better balanced, which reduces CPI (average number of
clock cycles per instruction)
• However, data must be forwarded between pipeline stages to
resolve data dependencies between instructions without stalling
the pipeline
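The effect on CPI, and the kind of dependency that forwarding has to handle, can be sketched with a toy model. Everything numeric here is an assumption for illustration: the workload (100 instructions, 30 of them loads/stores) and the one-cycle stall per memory operation are not figures from any ARM core, and branches and load-use interlocks are ignored.

```python
def total_cycles(n_instr, n_stages, n_mem_ops=0, stall_per_mem_op=0):
    """Pipeline fill + one cycle per instruction + stall cycles for memory operations."""
    if n_instr == 0:
        return 0
    return (n_stages - 1) + n_instr + n_mem_ops * stall_per_mem_op

def cpi(n_instr, n_stages, n_mem_ops=0, stall_per_mem_op=0):
    """Average clock cycles per instruction under the model above."""
    return total_cycles(n_instr, n_stages, n_mem_ops, stall_per_mem_op) / n_instr

def raw_dependencies(program):
    """Indices i where instruction i reads the register written by instruction i-1."""
    hits = []
    for i in range(1, len(program)):
        prev_dest = program[i - 1][0]
        sources = program[i][1:]
        if prev_dest in sources:
            hits.append(i)
    return hits

# Assumed workload: shared memory port stalls the 3-stage pipeline once per load/store.
three_stage = cpi(100, n_stages=3, n_mem_ops=30, stall_per_mem_op=1)
# Separate instruction and data memories: no such stall in this model.
five_stage = cpi(100, n_stages=5)
print(f"3-stage CPI: {three_stage:.2f}, 5-stage CPI: {five_stage:.2f}")

# Instructions as (dest, src1, src2): ADD r1,r2,r3 ; SUB r4,r1,r5 ; ORR r6,r7,r8
program = [("r1", "r2", "r3"), ("r4", "r1", "r5"), ("r6", "r7", "r8")]
print("forwarding needed before instruction index:", raw_dependencies(program))
```

In the second part, SUB reads r1 immediately after ADD writes it, so without a forwarding path the SUB would have to wait until the ADD result reaches the register file.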
6-stage Pipeline
• In the ARM10 core, instruction decode is split into
two pipeline stages – decode and register
• Decode stage performs the decode operation
• Register stage reads the registers to be used
• A separate adder is introduced in the execution unit
to handle multiply-accumulate
instructions
• Both instruction and data buses are 64 bits wide
8-stage Pipeline
• Two new features are introduced in the ARM11 core
• Shift operation has been moved into its own
pipeline stage
• Both instruction and data accesses are
distributed across two pipeline stages
• Execution unit is split into three different
pipelines that can operate concurrently and
can also commit instructions out of order
Instruction Set Architecture
• Typical RISC architecture with several enhancements to
improve performance further
• The RISC features are as follows
– Large uniform register file with 16 general purpose registers
– Load/store architecture. The instructions that process data operate
only on registers and are separate from instructions that access
memory
– Simple addressing modes
– Uniform and fixed-length instruction fields. All ARM instructions
are 32 bits long and most of them have a regular three-operand
encoding
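Because the fields sit at fixed bit positions, a data-processing instruction can be pulled apart with shifts and masks. The sketch below assumes the classic 32-bit ARM data-processing layout (condition in bits 31:28, opcode in 24:21, Rn in 19:16, Rd in 15:12, second operand in 11:0); the function name `decode_data_processing` is made up for the example.

```python
def decode_data_processing(word):
    """Split a 32-bit ARM data-processing instruction into its fixed fields."""
    return {
        "cond": (word >> 28) & 0xF,       # condition code
        "immediate": (word >> 25) & 0x1,  # 1 if operand2 is an immediate
        "opcode": (word >> 21) & 0xF,     # ALU operation, e.g. 0b0100 = ADD
        "set_flags": (word >> 20) & 0x1,  # S bit: update condition flags
        "rn": (word >> 16) & 0xF,         # first operand register
        "rd": (word >> 12) & 0xF,         # destination register
        "operand2": word & 0xFFF,         # shifter operand (register or immediate)
    }

# 0xE0810002 encodes ADD R0, R1, R2 (always executed, flags not updated).
fields = decode_data_processing(0xE0810002)
print(fields)
```

The three operands (Rd, Rn, and the second operand) are all visible directly in the word, which is what the regular three-operand encoding refers to.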
Improved Features
• Each instruction controls the ALU and shifter, making the
instructions more powerful
• Auto-increment and auto-decrement addressing modes
supported
• Multiple load/store instructions that can load/store up to 16
registers at once
• Conditional execution of instructions introduced. Instruction
opcode is preceded by a 4-bit condition code. For the instruction
to execute, the condition must be met. This eliminates many short
branches and the pipeline stalls they cause
• Arithmetic instructions update the status bits only when requested (S suffix)
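The condition check can be sketched as a lookup from the 4-bit condition field to a test on the N, Z, C, V flags. This is an illustrative model, not hardware behaviour in full detail: the encoding 0b1111 is deliberately left out because its meaning differs between architecture versions, and the names `CONDITIONS` and `condition_passed` are made up for the example.

```python
# Condition field (bits 31:28) -> (mnemonic, test on the N, Z, C, V flags)
CONDITIONS = {
    0b0000: ("EQ", lambda n, z, c, v: z),
    0b0001: ("NE", lambda n, z, c, v: not z),
    0b0010: ("CS", lambda n, z, c, v: c),
    0b0011: ("CC", lambda n, z, c, v: not c),
    0b0100: ("MI", lambda n, z, c, v: n),
    0b0101: ("PL", lambda n, z, c, v: not n),
    0b0110: ("VS", lambda n, z, c, v: v),
    0b0111: ("VC", lambda n, z, c, v: not v),
    0b1000: ("HI", lambda n, z, c, v: c and not z),
    0b1001: ("LS", lambda n, z, c, v: (not c) or z),
    0b1010: ("GE", lambda n, z, c, v: n == v),
    0b1011: ("LT", lambda n, z, c, v: n != v),
    0b1100: ("GT", lambda n, z, c, v: (not z) and n == v),
    0b1101: ("LE", lambda n, z, c, v: z or n != v),
    0b1110: ("AL", lambda n, z, c, v: True),
}

def condition_passed(instruction_word, n, z, c, v):
    """Return True if the instruction's condition field is satisfied by the flags."""
    cond = (instruction_word >> 28) & 0xF
    if cond not in CONDITIONS:
        raise ValueError(f"condition field {cond:#06b} is not modelled here")
    _, test = CONDITIONS[cond]
    return bool(test(bool(n), bool(z), bool(c), bool(v)))

# 0x00810002 is ADDEQ R0, R1, R2: it executes only when the Z flag is set.
print(condition_passed(0x00810002, n=0, z=1, c=0, v=0))
print(condition_passed(0x00810002, n=0, z=0, c=0, v=0))
```

An instruction whose condition fails simply passes through the pipeline without effect, so a short "if" body needs no branch around it.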
Registers
• 16 general purpose registers R0-R15 in user mode
• R15 is the program counter, but can also be manipulated as a general
purpose register
• R13 is conventionally used as the stack pointer. The ARM instruction set has
no dedicated PUSH/POP instructions; load/store multiple instructions are used instead
• R14 is called the link register. When a procedure call is made, the return
address is automatically placed into this register rather than on the stack. A return
from the procedure can be implemented by copying R14 to R15
• Current Program Status Register (CPSR) contains four 1-bit condition flags
– negative, zero, carry, overflow
• Saved Program Status Register (SPSR) stores a copy of CPSR in some
modes of operation
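The call and return mechanism through R14 and R15 can be modelled in a few lines. This is a simplified sketch: it treats the register file as a plain list, takes the address of the calling instruction as an explicit argument, and ignores the pipeline read-ahead of the PC; the function names and addresses are made up for the example.

```python
LR, PC = 14, 15          # link register and program counter
INSTRUCTION_SIZE = 4     # every ARM instruction is 32 bits (4 bytes) long

def branch_with_link(regs, call_address, target):
    """Model a procedure call: save the return address in R14, then jump."""
    regs[LR] = call_address + INSTRUCTION_SIZE  # instruction following the call
    regs[PC] = target

def return_from_procedure(regs):
    """Model the return: copy R14 into R15."""
    regs[PC] = regs[LR]

regs = [0] * 16
branch_with_link(regs, call_address=0x8000, target=0x9000)
print(hex(regs[LR]), hex(regs[PC]))
return_from_procedure(regs)
print(hex(regs[PC]))
```

A nested call would overwrite R14, so in this model (as on the real processor) a procedure that calls another one has to save R14 first, typically on the stack addressed by R13.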
Modes of Operation
• An ARM processor operates in one of six operating
modes
– User mode
• used to run application code
• control bits of the CPSR cannot be written
• mode can only be changed via exception generation
– Fast interrupt processing mode (FIQ)
• Supports high speed interrupt handling
• Generally used for a single critical interrupt source
– Normal interrupt processing mode (IRQ)
• supports all other interrupt sources in a system
Modes of Operation (contd.)
• Supervisor mode (SVC)
– entered when the processor encounters a software
interrupt instruction
– used for OS services
– on reset, the ARM enters this mode
• Undefined instruction mode (UNDEF)
– entered when the fetched instruction is neither a valid ARM
instruction nor a recognized coprocessor instruction
• Abort mode
– entered in response to memory fault
ARM Registers in Different Modes
CPSR Register
Bit(s):  31  30  29  28  27-8    7  6  5  4-0
Field:   N   Z   C   V   Unused  I  F  T  Mode
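The layout above can be decoded with shifts and masks. In this sketch the meanings of I, F and T (IRQ disable, FIQ disable, Thumb state) and the 5-bit mode encodings come from the ARM architecture documentation rather than from the slide itself, and only the six modes listed earlier are named; `decode_cpsr` and the sample value are made up for the example.

```python
# 5-bit mode field (bits 4:0) for the six modes listed earlier
MODE_NAMES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into condition flags, control bits and mode."""
    mode_bits = cpsr & 0x1F
    return {
        "N": bool((cpsr >> 31) & 1),  # negative
        "Z": bool((cpsr >> 30) & 1),  # zero
        "C": bool((cpsr >> 29) & 1),  # carry
        "V": bool((cpsr >> 28) & 1),  # overflow
        "I": bool((cpsr >> 7) & 1),   # IRQ disabled when set
        "F": bool((cpsr >> 6) & 1),   # FIQ disabled when set
        "T": bool((cpsr >> 5) & 1),   # Thumb state when set
        "mode": MODE_NAMES.get(mode_bits, f"other ({mode_bits:#07b})"),
    }

# Sample value: Z and C set, both interrupts masked, Supervisor mode.
status = decode_cpsr(0x600000D3)
print(status)
```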