ARM Processor
Dr. B. Thiyaneswaran
Associate Professor,
Department of ECE,
Sona College of Technology
ARM Basics
• The ARM processor core originated at Acorn Computers.
• ARM originally stood for Acorn RISC Machine.
• ARM Ltd designs the ARM range of RISC processor cores.
• It licenses ARM core designs to semiconductor partners, who fabricate and sell the chips to their customers.
(ARM does not fabricate silicon itself.)
Leading provider of 32-bit embedded RISC
microprocessors.
• 75% of market.
• High performance
• Low power consumption
• Low system cost
Solutions for,
• ARMv2: ARM2, the first commercial chip. Included 32-bit result multiply instructions and coprocessor support.
• ARMv2a: ARM3 chip with on-chip cache. Added an atomic load-and-store (SWP) instruction and coprocessor 15 for cache management.
• ARMv3: ARM6. 32-bit addressing and virtual memory support.
ARM Processor Core
Current low-end ARM core for applications such as digital mobile phones.
TDMI
T: Thumb, the 16-bit instruction set
D: on-chip Debug support, enabling the processor to halt in response to a debug request
M: enhanced Multiplier, yielding a full 64-bit result at high performance
I: EmbeddedICE hardware
Von Neumann architecture
3-stage pipeline
ARM Core Diagram
ARM Inheritance
Used features of RISC:
• A load and store architecture.
• Fixed 32 bit instructions.
• 3-address instruction format.
Unused features:
• Register windows.
• Delayed branches.
• Single-cycle execution of all instructions.
ARM Programmer Model
• Visible registers.
• Invisible registers -> not significant to the programmer.
R0-R12: may be designated as 'scratch pad' registers. These are the registers into which data and addresses are loaded.
R15: acts as the program counter (PC).
Current Program Status Register (CPSR)
Operating Modes
Modes:
1. User -> Unprivileged mode.
2. FIQ (Fast Interrupt Request) -> Entered on a high-priority interrupt.
3. IRQ (Interrupt Request) -> Entered on a low-priority interrupt.
4. Supervisor -> Entered on reset and when a SWI (software interrupt) instruction is executed.
5. Abort -> Used to handle memory access violations.
6. Undef -> Used to handle undefined instructions.
7. System -> Privileged mode that uses the same registers as User mode.
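The current mode is held in the low five bits of the CPSR. As a minimal sketch in Python, assuming the standard mode encodings of the classic ARM architecture (the names `MODE_BITS`, `decode_mode` and `is_privileged` are illustrative helpers, not part of any ARM tool), the mode can be read back like this:

```python
# Mode field (CPSR bits 4..0) encodings for the classic ARM architecture.
MODE_BITS = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undef",
    0b11111: "System",
}

def decode_mode(cpsr: int) -> str:
    """Return the operating mode named by the low five bits of a CPSR value."""
    return MODE_BITS.get(cpsr & 0x1F, "Reserved")

def is_privileged(cpsr: int) -> bool:
    """Every mode except User is privileged."""
    return decode_mode(cpsr) not in ("User", "Reserved")

print(decode_mode(0x600000D3))  # low five bits are 0b10011 -> Supervisor
```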
Saved Program Status Register (SPSR)
• There are 5 SPSRs.
• Each one corresponds to an exception mode of operation (FIQ, IRQ, Supervisor, Abort, Undef).
• When an exception (for example, an interrupt) occurs, the current CPSR value is saved into the SPSR of the corresponding mode.
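The save-and-restore role of the SPSR can be sketched as a small Python model. This is a simplified illustration, not a full description of exception entry: the class `Core` and its method names are hypothetical, and the model leaves out the return address saved in the banked link register and the jump to the exception vector.

```python
# Exception modes and their CPSR mode-field encodings (bits 4..0).
EXCEPTION_MODES = {
    "FIQ": 0b10001,
    "IRQ": 0b10010,
    "Supervisor": 0b10011,
    "Abort": 0b10111,
    "Undef": 0b11011,
}

class Core:
    """Toy model holding one CPSR and one SPSR per exception mode."""

    def __init__(self):
        self.cpsr = 0b10000                       # User mode, flags clear
        self.spsr = {mode: 0 for mode in EXCEPTION_MODES}

    def enter_exception(self, mode: str) -> None:
        self.spsr[mode] = self.cpsr               # save the interrupted state
        self.cpsr = (self.cpsr & ~0x1F) | EXCEPTION_MODES[mode]
        self.cpsr |= 0x80                         # I bit set: IRQs disabled

    def return_from_exception(self, mode: str) -> None:
        self.cpsr = self.spsr[mode]               # restore the saved state

core = Core()
core.cpsr = 0x60000010                            # User mode with Z and C set
core.enter_exception("IRQ")
print(hex(core.spsr["IRQ"]))                      # 0x60000010
core.return_from_exception("IRQ")
```

Because each exception mode has its own SPSR, an FIQ that arrives while an IRQ handler is running does not overwrite the state saved for the IRQ.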
ARM Memory system
• 32 bit memory.
• 8 bit / 16 bit /32 bit access flexibility.
• Little Endian & Big Endian format.
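The difference between the two byte orders, and between byte, halfword and word accesses, can be illustrated with a short Python sketch. The 4-byte `memory` contents and the helper `read` are made up for the example.

```python
# Four consecutive bytes of memory, lowest address first (example values).
memory = bytes([0x78, 0x56, 0x34, 0x12])

def read(memory: bytes, address: int, size: int, byteorder: str) -> int:
    """Read `size` bytes (1, 2 or 4) starting at `address` in the given byte order."""
    return int.from_bytes(memory[address:address + size], byteorder)

# Little endian: the byte at the lowest address is the least significant.
print(hex(read(memory, 0, 1, "little")))  # 0x78
print(hex(read(memory, 0, 2, "little")))  # 0x5678
print(hex(read(memory, 0, 4, "little")))  # 0x12345678

# Big endian: the byte at the lowest address is the most significant.
print(hex(read(memory, 0, 4, "big")))     # 0x78563412
```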
Load and Store architecture
• Unlike CISC, direct memory-to-memory operations are not allowed.
• For addition, subtraction, or any other operation, the operands and the result must be in registers.
• Data in memory must first be loaded into registers (LOAD). -> Registers to ALU -> ALU to registers.
• Register data is then stored back to memory (STORE).
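The load, operate, store sequence can be sketched in Python as follows. The addresses, data values and helper names (`ldr`, `add`, `str_`) are illustrative assumptions that only loosely mirror the ARM LDR, ADD and STR instructions.

```python
# Toy machine: word-addressed memory as a dict, 16 registers as a list.
memory = {0x100: 7, 0x104: 5, 0x108: 0}
regs = [0] * 16

def ldr(rd: int, address: int) -> None:
    """LOAD: memory -> register."""
    regs[rd] = memory[address]

def add(rd: int, rn: int, rm: int) -> None:
    """The ALU only ever sees register operands and writes a register."""
    regs[rd] = (regs[rn] + regs[rm]) & 0xFFFFFFFF

def str_(rd: int, address: int) -> None:
    """STORE: register -> memory."""
    memory[address] = regs[rd]

# memory[0x108] = memory[0x100] + memory[0x104] takes four steps:
ldr(1, 0x100)
ldr(2, 0x104)
add(0, 1, 2)
str_(0, 0x108)
print(memory[0x108])  # 12
```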
Data Processing instruction:
Uses registers for its operand values. Results are also stored in registers (never directly in memory).
Data Transfer instruction:
Moves data between memory and registers. Even a memory-to-memory copy goes through a register: Location-1 -> Register, Register -> Location-2.
Control Flow Instruction:
Causes execution to jump to a different address.
3 Stage ARM Organization
instruction
i            Fetch    Decode   Execute
i+1                   Fetch    Decode   Execute
i+2                            Fetch    Decode   Execute
cycle        t        t+1      t+2      t+3      t+4
FETCH:
Instruction is fetched from memory and placed in the
instruction pipeline.
DECODE:
The instruction is decoded and the data path control
signals prepared for the next cycle. In this stage the
instruction 'owns' the decode logic but not the data path.
EXECUTE:
The instruction 'owns' the data path. The register bank is
read, an operand shifted, the ALU result generated and
written back into a destination register.
Note: at any one time, three different instructions may each occupy one of these stages, so the hardware in each stage has to be capable of independent operation.
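The overlap between the three stages can be reproduced with a small scheduling sketch in Python. It assumes an ideal pipeline with no stalls, branches or multi-cycle instructions, and the function name `pipeline_schedule` is a hypothetical helper.

```python
STAGES = ["Fetch", "Decode", "Execute"]

def pipeline_schedule(num_instructions: int, stages=STAGES) -> dict:
    """Return {cycle: {stage: instruction index}} for an ideal pipeline."""
    schedule = {}
    for instr in range(num_instructions):
        # Instruction `instr` enters the first stage in cycle `instr`
        # and moves one stage forward on every following cycle.
        for offset, stage in enumerate(stages):
            schedule.setdefault(instr + offset, {})[stage] = instr
    return schedule

for cycle, busy in sorted(pipeline_schedule(3).items()):
    occupied = ", ".join(f"{stage}: i+{n}" for stage, n in busy.items())
    print(f"t+{cycle} -> {occupied}")
```

In cycle t+2 all three stages are busy with different instructions, which is why each stage needs its own independent hardware.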
Multi-cycle instructions in the 3-stage pipeline
5-stage pipeline
To increase the clock frequency (Fclk):
• The logic in each pipeline stage has to be simplified.
• The number of pipeline stages has to be increased.
Remember the five stages:
• Fetch.
• Decode.
• Execute.
• Buffer / Data.
• Write Back.
5 stage Pipeline
Fetch:
The instruction is fetched from memory and placed in the instruction
pipeline.
Decode:
The instruction is decoded and register operands are read from the register file. The register bank has 3 read ports, so most ARM instructions can source all their operands in one cycle.
Execute:
An operand is shifted and the ALU result is generated. If the instruction is a load or store, the memory address is computed in the ALU.
Buffer/data:
Data memory is accessed if required. Otherwise the ALU result is simply
buffered for one clock cycle to give the same pipeline flow for all instructions.
Write-back:
The results generated by the instruction are written back to the register file,
including any data loaded from memory.
Comparative Clock analysis of 3 & 5 stage Pipeline
[Figure: clock-cycle timing of the 3-stage and 5-stage pipelines, shown for 3 instructions]
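As a rough sketch of the cycle counts involved, an ideal k-stage pipeline needs k cycles to complete the first instruction and one more cycle for each instruction after it. The Python function below assumes no stalls or branches, and its name `total_cycles` is illustrative.

```python
def total_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed to complete the instructions in an ideal pipeline."""
    if num_instructions < 1:
        return 0
    return num_stages + (num_instructions - 1)

for stages in (3, 5):
    print(f"{stages}-stage: {total_cycles(3, stages)} cycles for 3 instructions")
# 3-stage: 5 cycles for 3 instructions
# 5-stage: 7 cycles for 3 instructions
```

The 5-stage pipeline takes more cycles to fill, but each cycle is shorter because the simpler stages allow a higher Fclk, so the comparison depends on the actual clock rates as well as the cycle counts.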
Analysis
• Branching instructions.
Data Processing Instruction
• All operands are 32 bits wide and come from
registers or are specified as literals in the instruction
itself.
• The result, if there is one, is 32 bits wide and is
placed in a register. (There is an exception here: long
multiply instructions produce a 64-bit result).
• Each of the operand registers and the result register
are independently specified in the instruction. That
is, the ARM uses a '3-address' format for these
instructions.
Ex: ADD R0, R1, R2 ; R0 <- R1 + R2
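The effect of a 3-address data processing instruction can be modelled in a few lines of Python. This is only a sketch: the helpers `add` and `sub` are hypothetical names, the register file is a plain list, and condition flags are ignored.

```python
MASK32 = 0xFFFFFFFF  # results are truncated to 32 bits

def add(regs: list, rd: int, rn: int, rm: int) -> None:
    """ADD Rd, Rn, Rm : Rd <- Rn + Rm (source registers are left unchanged)."""
    regs[rd] = (regs[rn] + regs[rm]) & MASK32

def sub(regs: list, rd: int, rn: int, rm: int) -> None:
    """SUB Rd, Rn, Rm : Rd <- Rn - Rm, wrapping modulo 2**32."""
    regs[rd] = (regs[rn] - regs[rm]) & MASK32

regs = [0] * 16
regs[1], regs[2] = 0xFFFFFFFF, 2
add(regs, 0, 1, 2)       # ADD R0, R1, R2
print(hex(regs[0]))      # 0x1 (the carry out of bit 31 is dropped)
sub(regs, 3, 2, 1)       # SUB R3, R2, R1
print(hex(regs[3]))      # 0x3
```

Because the destination is named separately from the two sources, R1 and R2 still hold their original values after both instructions.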
Arithmetic Instructions
Comparison instructions
Immediate operands
SMULL R0, R1, R2, R3 ; signed: R1:R0 <- R2 * R3 (R0 <- lower 32 bits, R1 <- higher 32 bits)
UMULL R0, R1, R2, R3 ; unsigned: R1:R0 <- R2 * R3 (R0 <- lower 32 bits, R1 <- higher 32 bits)
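The 64-bit results of the long multiply instructions can be checked with a short Python sketch. It follows the ARM operand order RdLo, RdHi, Rm, Rs, so each function returns the pair (low word, high word). The function names are illustrative, and the same bit pattern is deliberately fed to both to show where signed and unsigned results differ.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def umull(rm: int, rs: int) -> tuple:
    """Unsigned long multiply: return (RdLo, RdHi)."""
    product = (rm & MASK32) * (rs & MASK32)
    return product & MASK32, product >> 32

def smull(rm: int, rs: int) -> tuple:
    """Signed long multiply: return (RdLo, RdHi)."""
    product = (to_signed32(rm) * to_signed32(rs)) & MASK64
    return product & MASK32, product >> 32

lo, hi = umull(0xFFFFFFFF, 2)   # 4294967295 * 2
print(hex(lo), hex(hi))         # 0xfffffffe 0x1
lo, hi = smull(0xFFFFFFFF, 2)   # -1 * 2
print(hex(lo), hex(hi))         # 0xfffffffe 0xffffffff
```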