1.ARM Architecture, Instruction

The document compares RISC (Reduced Instruction Set Computer) and CISC (Complex Instruction Set Computer) architectures, highlighting their differences in instruction complexity, execution speed, and hardware requirements. It also details the ARM architecture, emphasizing its RISC design, low power consumption, and widespread use in various devices. Additionally, the document outlines ARM's operating modes, registers, and exception handling mechanisms.

Uploaded by

myamritatempmail

ARM Processors
CISC Vs RISC

RISC - Reduced Instruction Set Computer - uses simple instructions, each performing one low-level
operation that can complete within a single clock cycle; complex operations are built from several
such instructions.
CISC - Complex Instruction Set Computer - a single instruction can perform several low-level
operations (such as a load from memory, an arithmetic operation, and a memory store) or carry out
multi-step processes and complex addressing modes.
CISC Vs RISC

1. RISC: Stands for Reduced Instruction Set Computer.
   CISC: Stands for Complex Instruction Set Computer.
2. RISC: Simple instructions taking about one clock cycle each. The average clock cycles per instruction (CPI) is 1.5.
   CISC: Complex instructions that take multiple clock cycles to execute. The average CPI is in the range of 2 to 15.
3. RISC: Performance is optimized with more focus on software.
   CISC: Performance is optimized with more focus on hardware.
4. RISC: No memory unit (microprogram store); separate hardware implements the instructions.
   CISC: Has a memory unit (microprogram store) to implement complex instructions.
5. RISC: Hard-wired control unit.
   CISC: Microprogrammed control unit.
6. RISC: The instruction set is reduced, i.e. it has only a few instructions. Many of these instructions are very primitive.
   CISC: The instruction set has a variety of different instructions that can be used for complex operations.
7. RISC: Few, simple addressing modes.
   CISC: Many different addressing modes, so higher-level programming language statements can be represented more efficiently.
8. RISC: Complex addressing modes are synthesized in software.
   CISC: Complex addressing modes are already supported.
9. RISC: Multiple register sets are present.
   CISC: Only a single register set.
10. RISC: Highly pipelined.
    CISC: Normally not pipelined, or less pipelined.
11. RISC: The complexity lies with the compiler that translates the program.
    CISC: The complexity lies in the microprogram.
12. RISC: Execution time is very low.
    CISC: Execution time is very high.
13. RISC: Code expansion can be a problem.
    CISC: Code expansion is not a problem.
14. RISC: Decoding of instructions is simple.
    CISC: Decoding of instructions is complex.
15. RISC: Does not require external memory for calculations.
    CISC: Requires external memory for calculations.
16. RISC: The most common RISC microprocessors are Alpha, ARC, ARM, AVR, MIPS, PA-RISC, PIC, Power Architecture, and SPARC.
    CISC: Examples of CISC processors are the System/360, VAX, PDP-11, Motorola 68000 family, and AMD and Intel x86 CPUs.
17. RISC: Used in high-end applications such as video processing, telecommunications and image processing.
    CISC: Used in low-end applications such as security systems, home automation, etc.
RISC

– Instructions - RISC processors have a reduced number of instruction classes.
These classes provide simple operations that can each execute in a single cycle.
– Pipelines - The processing of instructions is broken down into smaller units that
can be executed in parallel by pipelines.
– Registers - RISC machines have a large general-purpose register set. Any
register can contain either data or an address.
– Load-store architecture - The processor operates on data held in registers.
Separate load and store instructions transfer data between the register bank
and external memory.
ARM

– ARM was developed during 1983 – 1985 by Acorn Computers Ltd.
– Before 1990, ARM stood for Acorn RISC Machine; later, Advanced RISC Machine
– Most licensed and widespread processor core
– Used in PDAs, mobile phones, digital TVs, cameras
– ARM7 – iPod - von Neumann architecture
– ARM9 – PSP, Sony Ericsson, BenQ - Harvard architecture
– ARM11 – Apple iPhone, Nokia N93, N800
– 75% of 32-bit processors are ARM
– Low power, low cost, tiny
ARM Processor

– 32 Bit architecture
– Uses RISC architecture
– Has large uniform register file
– Load - store architecture
– Uniform and fixed length instructions – 32 bit
– Good speed / power consumption ratio
– High code density
ARM Processor - Products (figure)
ARM Architecture: ARM core dataflow model (figure)
ARM Architecture.

– Typical RISC architecture:
  – Large uniform register file
  – Load/store architecture
  – Simple addressing modes
  – Uniform and fixed-length instruction fields
– Enhancements:
  – Each instruction controls the ALU and shifter
  – Auto-increment and auto-decrement addressing modes
  – Multiple Load/Store
  – Conditional execution
– Results:
  – High performance
  – Low code size
  – Low power consumption
  – Low silicon area
Pipeline Organization

– Increases speed – most instructions executed in a single cycle
– Versions:
  – 3-stage (ARM7TDMI and earlier)
  – 5-stage (ARM8, ARM9TDMI)
  – 6-stage (ARM10TDMI)
– 3-stage pipeline: Fetch – Decode – Execute
  – Three-cycle latency, one instruction per cycle throughput
Pipeline Organization

instruction
  i        Fetch    Decode   Execute
  i+1               Fetch    Decode   Execute
  i+2                        Fetch    Decode   Execute
cycle      t        t+1      t+2      t+3      t+4
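The timing in the diagram above can be reproduced with a small model. This is only a sketch of an ideal 3-stage pipeline with no stalls; the names `pipeline_schedule` and `total_cycles` are made up for illustration and are not part of any ARM tool.

```python
STAGES = ("Fetch", "Decode", "Execute")

def pipeline_schedule(n_instructions):
    """Return {instruction index: {stage: cycle offset}} for an ideal pipeline.

    Assumes no stalls: instruction i enters Fetch in cycle t+i and
    advances one stage per cycle.
    """
    return {
        i: {stage: i + s for s, stage in enumerate(STAGES)}
        for i in range(n_instructions)
    }

def total_cycles(n_instructions):
    """Cycles needed to complete n instructions: fill latency plus one per extra instruction."""
    if n_instructions == 0:
        return 0
    return len(STAGES) + (n_instructions - 1)

for i, stages in pipeline_schedule(3).items():
    print(f"i+{i}:", ", ".join(f"{name}@t+{cycle}" for name, cycle in stages.items()))
print("cycles for 3 instructions:", total_cycles(3))
```

One instruction alone takes three cycles (the latency), while each further instruction adds only one cycle (the throughput).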
Operating Modes
– Seven operating modes:
  – User
  – Privileged:
    – System (version 4 and above)
    – Exception modes:
      – FIQ
      – IRQ
      – Abort
      – Undefined
      – Supervisor
Operating Modes

– User mode is the usual ARM program execution state, and is used for
executing most application programs.
– Fast Interrupt (FIQ) mode supports a data transfer or channel process.
– Interrupt (IRQ) mode is used for general-purpose interrupt handling.
– Supervisor mode is a protected mode for the operating system.
– Abort mode is entered after a Data Abort or an instruction Prefetch Abort.
– System mode is a privileged user mode for the operating system.
– Undefined mode is entered when an undefined instruction is executed.
Operating Modes

User mode:
– Normal program execution mode
– System resources unavailable
– Mode changed by exception only

Exception modes:
– Entered upon exception
– Full access to system resources
– Mode changed freely
Exceptions

Exception               Mode         Priority   IV Address
Reset                   Supervisor   1          0x00000000
Undefined instruction   Undefined    6          0x00000004
Software interrupt      Supervisor   6          0x00000008
Prefetch Abort          Abort        5          0x0000000C
Data Abort              Abort        2          0x00000010
Interrupt               IRQ          4          0x00000018
Fast interrupt          FIQ          3          0x0000001C
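As a rough illustration of how the table is used, the sketch below stores the rows in a dictionary and picks the exception to service when several are pending. The values come from the table above; the helper name `select_exception` is hypothetical.

```python
# name -> (mode entered, priority (1 = highest), vector address), from the table above
EXCEPTIONS = {
    "Reset": ("Supervisor", 1, 0x00000000),
    "Undefined instruction": ("Undefined", 6, 0x00000004),
    "Software interrupt": ("Supervisor", 6, 0x00000008),
    "Prefetch Abort": ("Abort", 5, 0x0000000C),
    "Data Abort": ("Abort", 2, 0x00000010),
    "Interrupt": ("IRQ", 4, 0x00000018),
    "Fast interrupt": ("FIQ", 3, 0x0000001C),
}

def select_exception(pending):
    """Return the pending exception with the numerically lowest priority value."""
    return min(pending, key=lambda name: EXCEPTIONS[name][1])

winner = select_exception(["Interrupt", "Fast interrupt", "Data Abort"])
mode, priority, vector = EXCEPTIONS[winner]
print(f"{winner}: enter {mode} mode, PC = 0x{vector:08X}")
```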
ARM Registers

– 31 general-purpose 32-bit registers
  – 16 visible, R0 – R15
  – Others speed up the exception process
– Special roles:
  – Hardware
    – R14 – Link Register (LR): optionally holds return address for branch instructions
    – R15 – Program Counter (PC)
  – Software
    – R13 - Stack Pointer (SP)
– Current Program Status Register (CPSR)
– Saved Program Status Register (SPSR)
– On exception, entering mode mod:
  – (PC + 4) → LR
  – CPSR → SPSR_mod
  – PC → IV address
  – R13, R14 replaced by R13_mod, R14_mod
  – In case of FIQ mode R8 – R12 also replaced
The Program Status Registers (CPSR and SPSRs)

  Bit:    31  30  29  28   27 ... 8   7   6   5   4 ... 0
  Field:  N   Z   C   V               I   F   T   Mode

* Condition Code Flags: copies of the ALU status flags (latched if the
  instruction has the "S" bit set).
  N = Negative result from ALU flag.
  Z = Zero result from ALU flag.
  C = ALU operation Carried out
  V = ALU operation overflowed
* Interrupt Disable bits.
  I = 1, disables the IRQ.
  F = 1, disables the FIQ.
* T Bit (Architecture v4T only)
  T = 0, Processor in ARM state
  T = 1, Processor in Thumb state
* Mode Bits
  M[4:0] define the processor mode.
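To make the layout concrete, here is a minimal decoder sketch. The bit positions (N, Z, C, V in bits 31 to 28; I, F, T in bits 7, 6, 5) and the mode encodings are the standard ARM v4T values, which the slide does not list explicitly; `decode_cpsr` is a made-up helper name.

```python
# M[4:0] encodings of the seven modes (standard ARM values, not given on the slide)
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}

def decode_cpsr(cpsr):
    """Split a 32-bit CPSR/SPSR value into its flag, control and mode fields."""
    return {
        "N": (cpsr >> 31) & 1,
        "Z": (cpsr >> 30) & 1,
        "C": (cpsr >> 29) & 1,
        "V": (cpsr >> 28) & 1,
        "I": (cpsr >> 7) & 1,
        "F": (cpsr >> 6) & 1,
        "T": (cpsr >> 5) & 1,
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }

# Example value: Z and C set, IRQ and FIQ disabled, ARM state, Supervisor mode
print(decode_cpsr(0x600000D3))
```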
Conditional Execution

– Most instruction sets only allow branches to be executed conditionally.
– However, by reusing the condition evaluation hardware, ARM effectively increases the number of
instructions.
  – All instructions contain a condition field which determines whether the CPU will execute them.
  – Non-executed instructions soak up 1 cycle.
    – Still have to complete cycle so as to allow fetching and decoding of following instructions.
– This removes the need for many branches, which stall the pipeline (3 cycles to refill).
  – Allows very dense in-line code, without branches.
  – The time penalty of not executing several conditional instructions is frequently less than the
  overhead of the branch or subroutine call that would otherwise be needed.
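The trade-off can be put into numbers with a deliberately simplified cost model. It assumes each skipped conditional instruction costs 1 cycle and that branching around the block costs only the 3-cycle refill quoted above; real cores have further effects that this sketch ignores, and the function names are illustrative.

```python
REFILL_CYCLES = 3  # pipeline refill after a taken branch, as quoted above

def cost_conditional(n_skipped):
    """Cycles spent on n conditional instructions whose condition fails (1 each)."""
    return n_skipped

def cost_branch_over(refill=REFILL_CYCLES):
    """Simplified cost of branching around the block: only the refill penalty."""
    return refill

for n in range(1, 5):
    cond, branch = cost_conditional(n), cost_branch_over()
    verdict = "conditional" if cond < branch else "tie" if cond == branch else "branch"
    print(f"{n} skipped instruction(s): conditional={cond}, branch={branch} -> {verdict}")
```

Under these assumptions, conditional execution wins for short sequences of one or two instructions, which is the common case for small if/else bodies.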
The Condition Field

Bits [31:28] of every instruction hold the condition field (Cond).

0000 = EQ - Z set (equal)
0001 = NE - Z clear (not equal)
0010 = HS / CS - C set (unsigned higher or same)
0011 = LO / CC - C clear (unsigned lower)
0100 = MI - N set (negative)
0101 = PL - N clear (positive or zero)
0110 = VS - V set (overflow)
0111 = VC - V clear (no overflow)
1000 = HI - C set and Z clear (unsigned higher)
1001 = LS - C clear or Z set (unsigned lower or same)
1010 = GE - N set and V set, or N clear and V clear (> or =)
1011 = LT - N set and V clear, or N clear and V set (<)
1100 = GT - Z clear, and either N set and V set, or N clear and V clear (>)
1101 = LE - Z set, or N set and V clear, or N clear and V set (< or =)
1110 = AL - always
1111 = NV - reserved.
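The condition encodings can be checked with a short evaluator. This is a sketch, not ARM's implementation: `condition_passed` is a hypothetical helper that takes the 4-bit condition field and the four flags as booleans, and writes "N and V equal" as `n == v`.

```python
def condition_passed(cond, n, z, c, v):
    """Return True if an instruction with condition field `cond` would execute."""
    checks = {
        0x0: z,                   # EQ
        0x1: not z,               # NE
        0x2: c,                   # HS / CS
        0x3: not c,               # LO / CC
        0x4: n,                   # MI
        0x5: not n,               # PL
        0x6: v,                   # VS
        0x7: not v,               # VC
        0x8: c and not z,         # HI
        0x9: (not c) or z,        # LS
        0xA: n == v,              # GE
        0xB: n != v,              # LT
        0xC: (not z) and n == v,  # GT
        0xD: z or n != v,         # LE
        0xE: True,                # AL
    }
    if cond not in checks:
        raise ValueError("condition field 0b1111 (NV) is reserved")
    return bool(checks[cond])

# Flags after comparing equal values: Z and C set, N and V clear
print(condition_passed(0x0, n=False, z=True, c=True, v=False))  # EQ
print(condition_passed(0xC, n=False, z=True, c=True, v=False))  # GT
```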
Operating modes of ARM7
➢ Context switching

➢ Shadow Registers

➢ SPSR – Saved Program Status Register


ARM Registers
System & User   FIQ        Supervisor   Abort      IRQ        Undefined
R0              R0         R0           R0         R0         R0
R1              R1         R1           R1         R1         R1
R2              R2         R2           R2         R2         R2
R3              R3         R3           R3         R3         R3
R4              R4         R4           R4         R4         R4
R5              R5         R5           R5         R5         R5
R6              R6         R6           R6         R6         R6
R7              R7         R7           R7         R7         R7
R8              R8_fiq     R8           R8         R8         R8
R9              R9_fiq     R9           R9         R9         R9
R10             R10_fiq    R10          R10        R10        R10
R11             R11_fiq    R11          R11        R11        R11
R12             R12_fiq    R12          R12        R12        R12
R13             R13_fiq    R13_svc      R13_abt    R13_irq    R13_und
R14             R14_fiq    R14_svc      R14_abt    R14_irq    R14_und
R15 (PC)        R15 (PC)   R15 (PC)     R15 (PC)   R15 (PC)   R15 (PC)
CPSR            CPSR       CPSR         CPSR       CPSR       CPSR
-               SPSR_fiq   SPSR_svc     SPSR_abt   SPSR_irq   SPSR_und
Condition Flags

Negative (N='1')
  Logical instruction:    No meaning
  Arithmetic instruction: Bit 31 of the result has been set. Indicates a negative number in signed operations.

Zero (Z='1')
  Logical instruction:    Result is all zeroes
  Arithmetic instruction: Result of operation was zero

Carry (C='1')
  Logical instruction:    After Shift operation, '1' was left in carry flag
  Arithmetic instruction: Result was greater than 32 bits

Overflow (V='1')
  Logical instruction:    No meaning
  Arithmetic instruction: Result was greater than 31 bits. Indicates a possible corruption of the sign bit in signed numbers.
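For the arithmetic column, the flag rules can be tried out on a 32-bit addition. The sketch below models what an ADDS would produce; `add_with_flags` is an illustrative name, and the overflow test (both operands differ in sign from the result) is the usual way to detect signed overflow rather than something the slide spells out.

```python
MASK32 = 0xFFFFFFFF

def add_with_flags(a, b):
    """Add two 32-bit values and return (result, N, Z, C, V)."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # may need 33 bits
    result = full & MASK32
    n = result >> 31             # bit 31 of the result
    z = int(result == 0)
    c = int(full > MASK32)       # result was greater than 32 bits
    # signed overflow: both operands have a sign bit that differs from the result's
    v = (((a ^ result) & (b ^ result)) >> 31) & 1
    return result, n, z, c, v

print(add_with_flags(0x7FFFFFFF, 1))  # largest positive value plus one
print(add_with_flags(0xFFFFFFFF, 1))  # wraps around to zero
```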
The Program Counter (R15)

– When the processor is executing in ARM state:
  – All instructions are 32 bits in length
  – All instructions must be word aligned
  – Therefore the PC value is stored in bits [31:2] with bits [1:0] equal to zero (as instructions
  cannot be halfword or byte aligned).
– R14 is used as the subroutine link register (LR) and stores the return address, calculated
from the PC, when Branch with Link operations are performed.
– Thus to return from a linked branch:
  – MOV r15,r14
  or
  – MOV pc,lr
Exception Handling and the Vector Table
– When an exception occurs, the core:
  – Copies CPSR into SPSR_<mode>
  – Sets appropriate CPSR bits:
    – If core implements ARM Architecture 4T and is currently in Thumb state, then
    ARM state is entered.
    – Mode field bits
    – Interrupt disable flags if appropriate.
  – Maps in appropriate banked registers
  – Stores the “return address” in LR_<mode>
  – Sets PC to vector address
– To return, exception handler needs to:
  – Restore CPSR from SPSR_<mode>
  – Restore PC from LR_<mode>

Vector table:
0x00000000   Reset
0x00000004   Undefined Instruction
0x00000008   Software Interrupt
0x0000000C   Prefetch Abort
0x00000010   Data Abort
0x00000014   Reserved
0x00000018   IRQ
0x0000001C   FIQ
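The entry and return steps listed above can be walked through with a toy model. It is a sketch under assumptions: the `Core` class is invented for illustration, the mode encodings and the I/F/T bit positions are the standard ARM values, the caller supplies the return address, and any adjustment of LR that a real handler makes before returning is left out.

```python
MODE_BITS = {"FIQ": 0b10001, "IRQ": 0b10010, "Supervisor": 0b10011,
             "Abort": 0b10111, "Undefined": 0b11011}
I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5

class Core:
    """Toy model holding only the state touched by exception entry and return."""

    def __init__(self, pc, cpsr):
        self.pc = pc
        self.cpsr = cpsr
        self.spsr = {}  # SPSR_<mode>
        self.lr = {}    # LR_<mode> (banked R14)

    def enter_exception(self, mode, vector, return_address, disable_fiq=False):
        self.spsr[mode] = self.cpsr                    # copy CPSR into SPSR_<mode>
        cpsr = (self.cpsr & ~0x1F) | MODE_BITS[mode]   # set the mode field
        cpsr &= ~T_BIT                                 # enter ARM state
        cpsr |= I_BIT                                  # disable IRQ
        if disable_fiq:
            cpsr |= F_BIT                              # disable FIQ where appropriate
        self.cpsr = cpsr
        self.lr[mode] = return_address                 # store the return address
        self.pc = vector                               # jump to the vector

    def return_from_exception(self, mode):
        self.cpsr = self.spsr[mode]                    # restore CPSR
        self.pc = self.lr[mode]                        # restore PC

core = Core(pc=0x8000, cpsr=0x10)                      # User mode, ARM state
core.enter_exception("IRQ", vector=0x18, return_address=0x8004)
print(hex(core.pc), hex(core.cpsr))
core.return_from_exception("IRQ")
print(hex(core.pc), hex(core.cpsr))
```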
ARM Instruction Set Format
Bit positions: 31-28 | 27-16 | 15-8 | 7-0

Instruction type                                 Encoding
Data processing / PSR Transfer                   Cond 0 0 I Opcode S Rn Rd Operand2
Multiply                                         Cond 0 0 0 0 0 0 A S Rd Rn Rs 1 0 0 1 Rm
Long Multiply (v3M / v4 only)                    Cond 0 0 0 0 1 U A S RdHi RdLo Rs 1 0 0 1 Rm
Swap                                             Cond 0 0 0 1 0 B 0 0 Rn Rd 0 0 0 0 1 0 0 1 Rm
Load/Store Byte/Word                             Cond 0 1 I P U B W L Rn Rd Offset
Load/Store Multiple                              Cond 1 0 0 P U S W L Rn Register List
Halfword transfer: Immediate offset (v4 only)    Cond 0 0 0 P U 1 W L Rn Rd Offset1 1 S H 1 Offset2
Halfword transfer: Register offset (v4 only)     Cond 0 0 0 P U 0 W L Rn Rd 0 0 0 0 1 S H 1 Rm
Branch                                           Cond 1 0 1 L Offset
Branch Exchange (v4T only)                       Cond 0 0 0 1 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 1 Rn
Coprocessor data transfer                        Cond 1 1 0 P U N W L Rn CRd CPNum Offset
Coprocessor data operation                       Cond 1 1 1 0 Op1 CRn CRd CPNum Op2 0 CRm
Coprocessor register transfer                    Cond 1 1 1 0 Op1 L CRn Rd CPNum Op2 1 CRm
Software interrupt                               Cond 1 1 1 1 SWI Number
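As an example of reading these formats, the sketch below pulls the fields out of a data processing instruction word. The field positions (Cond 31:28, I 25, Opcode 24:21, S 20, Rn 19:16, Rd 15:12, Operand2 11:0) and the opcode mnemonics are the standard ARM encoding, which the table only outlines; `decode_data_processing` is a hypothetical helper and does not separate out multiply, swap or halfword encodings that share bits 27:26 = 00.

```python
# Opcode field values 0..15 in the standard ARM data processing encoding
OPCODES = ["AND", "EOR", "SUB", "RSB", "ADD", "ADC", "SBC", "RSC",
           "TST", "TEQ", "CMP", "CMN", "ORR", "MOV", "BIC", "MVN"]

def decode_data_processing(word):
    """Extract the fields of a 32-bit data processing instruction word."""
    if (word >> 26) & 0b11 != 0b00:
        raise ValueError("bits 27:26 must be 00 for data processing")
    return {
        "cond": (word >> 28) & 0xF,
        "I": (word >> 25) & 1,
        "opcode": OPCODES[(word >> 21) & 0xF],
        "S": (word >> 20) & 1,
        "Rn": (word >> 16) & 0xF,
        "Rd": (word >> 12) & 0xF,
        "operand2": word & 0xFFF,
    }

# 0xE0810002 is the encoding of ADD R0, R1, R2 with condition AL
print(decode_data_processing(0xE0810002))
```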
Instruction Set

– Two instruction sets:
  – ARM
    – Standard 32-bit instruction set
  – THUMB
    – 16-bit compressed form
    – Code density better than most CISC
    – Dynamic decompression in pipeline
– Features:
  – Load/Store architecture
  – 3-address data processing instructions
  – Conditional execution
  – Load/Store multiple registers
  – Shift & ALU operation in single clock cycle
ARM Instruction Set

– Conditional execution:
  – Each data processing instruction prefixed by condition code
  – Result – smooth flow of instructions through pipeline
  – 16 condition codes:

    EQ   equal
    NE   not equal
    CS   unsigned higher or same
    CC   unsigned lower
    MI   negative
    PL   positive or zero
    VS   overflow
    VC   no overflow
    HI   unsigned higher
    LS   unsigned lower or same
    GE   signed greater than or equal
    LT   signed less than
    GT   signed greater than
    LE   signed less than or equal
    AL   always
    NV   special purpose


ARM Instruction Set

ARM instruction set:
– Data processing instructions
– Data transfer instructions
– Block transfer instructions
– Branching instructions
– Multiply instructions
– Software interrupt instructions
ARM Instruction set

1. Data processing Instructions

2. Branch Instructions

3. Load store Instructions

4. Software Interrupt Instructions

5. Program Status Register Instructions


ARM Instruction set Cntd.

1. Data processing Instructions

i. Move
ii. Arithmetic
iii. Logical
iv. Comparison
v. Multiply
Move instruction:

Syntax:

<instruction> {condition} {S} Rd,N

S- Set Condition Code


Rd- Destination Register
N - Register or Immediate value
Move instruction Cntd:

Instructions using this syntax:

MOV - moves N into Rd

MVN - moves the bitwise NOT of N into Rd
Move Instruction

MOV R1, #0x77; R1 = 0x77

MVN R1, R2 ; R1 = ~R2


Move instructions Cntd:

MOV R0, R1 ; R0 = R1

MVN R0, R1 ; R0 = ~R1
Move instructions Cntd:

MOV R0, #7 ; R0 = 7

MOV R0, R1, LSL #2 ; R0 = R1 << 2 (R1 shifted left by 2 bits, i.e. R1 * 4)
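The effect of the move instructions, including the shifted second operand, can be modelled on 32-bit values. This is a sketch only; `mov`, `mvn` and `lsl` are illustrative Python functions, not an ARM API, and flag updates from the S suffix are not modelled.

```python
MASK32 = 0xFFFFFFFF

def lsl(value, amount):
    """Logical shift left, keeping the low 32 bits."""
    return (value << amount) & MASK32

def mov(n):
    """MOV Rd, N: Rd receives N."""
    return n & MASK32

def mvn(n):
    """MVN Rd, N: Rd receives the bitwise NOT of N."""
    return ~n & MASK32

r1 = 3
print(mov(7))                 # MOV R0, #7
print(mov(lsl(r1, 2)))        # MOV R0, R1, LSL #2
print(hex(mvn(r1)))           # MVN R0, R1
```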


Eg:

ADD R0, R1, R2 ; R0 = R1 + R2 (z = x + y)

SUB R0, R1, R2 ; R0 = R1 - R2

RSB R0, R1, R2 ; R0 = R2 - R1

MLA R0, R1, R2, R3 ; R0 = (R1 * R2) + R3

ADD R0, R1, #0x77 ; R0 = R1 + 0x77
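The arithmetic examples can be checked against a small model that wraps results to 32 bits, as the registers do. The function names mirror the mnemonics but are otherwise hypothetical; condition flags are not modelled here.

```python
MASK32 = 0xFFFFFFFF

def add(rn, op2):
    return (rn + op2) & MASK32        # ADD Rd, Rn, op2

def sub(rn, op2):
    return (rn - op2) & MASK32        # SUB Rd, Rn, op2

def rsb(rn, op2):
    return (op2 - rn) & MASK32        # RSB Rd, Rn, op2 (reverse subtract)

def mla(rm, rs, rn):
    return (rm * rs + rn) & MASK32    # MLA Rd, Rm, Rs, Rn

r1, r2, r3 = 5, 2, 10
print(add(r1, r2), sub(r1, r2), hex(rsb(r1, r2)), mla(r1, r2, r3), hex(add(r1, 0x77)))
```

Note that RSB with R1 = 5 and R2 = 2 gives 2 - 5, which appears as 0xFFFFFFFD in a 32-bit register (the two's complement form of -3).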


Multiply Instructions

– Integer multiplication (32-bit result)
– Long integer multiplication (64-bit result)
– Built in Multiply Accumulate Unit (MAC)
– Multiply and accumulate instructions add product to running total
Multiply Instructions

– Instructions:
  MUL     Multiply                        32-bit result
  MLA     Multiply accumulate             32-bit result
  UMULL   Unsigned multiply               64-bit result
  UMLAL   Unsigned multiply accumulate    64-bit result
  SMULL   Signed multiply                 64-bit result
  SMLAL   Signed multiply accumulate      64-bit result
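The difference between the 32-bit and 64-bit result forms, and between signed and unsigned, shows up clearly in a model. This sketch returns the 64-bit results as a (RdHi, RdLo) pair; the helper names are illustrative, and timing and flag behaviour are not modelled.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def split64(value):
    """Return (RdHi, RdLo) for a 64-bit value."""
    value &= MASK64
    return value >> 32, value & MASK32

def mul(rm, rs):
    return (rm * rs) & MASK32                        # low 32 bits only

def umull(rm, rs):
    return split64((rm & MASK32) * (rs & MASK32))    # unsigned 64-bit product

def smull(rm, rs):
    return split64(to_signed32(rm) * to_signed32(rs))  # signed 64-bit product

def umlal(rd_hi, rd_lo, rm, rs):
    acc = (rd_hi << 32) | rd_lo                      # running 64-bit total
    return split64(acc + (rm & MASK32) * (rs & MASK32))

print([hex(x) for x in umull(0xFFFFFFFF, 2)])
print([hex(x) for x in smull(0xFFFFFFFF, 2)])        # 0xFFFFFFFF is -1 when signed
```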


Eg:

AND R0, R1, R2; R0 = R1 & R2

ORR R0, R1, R2; R0 = R1 | R2

EOR R0, R1, R2; R0 = R1 ^ R2

BIC R0, R1, R2; R0 = R1 & (~R2)
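A common use of these instructions is manipulating individual bits. The sketch below mirrors the four operations on 32-bit values; the function names and the sample bit pattern are made up for illustration.

```python
MASK32 = 0xFFFFFFFF

def and_(rn, op2):
    return rn & op2 & MASK32       # AND: keep only bits set in both

def orr(rn, op2):
    return (rn | op2) & MASK32     # ORR: set the bits given in op2

def eor(rn, op2):
    return (rn ^ op2) & MASK32     # EOR: toggle the bits given in op2

def bic(rn, op2):
    return rn & ~op2 & MASK32      # BIC: clear the bits given in op2

flags = 0b1010_1100
print(bin(and_(flags, 0b0000_1111)))
print(bin(orr(flags, 0b0000_0001)))
print(bin(eor(flags, 0b1111_1111)))
print(bin(bic(flags, 0b0000_1111)))
```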


ALP to add first 5 natural numbers. Store sum in register
ALP to add first 10 odd numbers. Store sum in register.
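One possible approach to both exercises is a count-down loop that accumulates into a register. The sketch below models that loop in Python, with a candidate ARM instruction for each step in the comments; the register assignments and loop structure are assumptions, not a prescribed solution, and the assembly in the comments has not been assembled or run.

```python
MASK32 = 0xFFFFFFFF

def sum_first_naturals(count):
    """Model of a count-down loop: adds count, count-1, ..., 1 into R0."""
    if count < 1:
        raise ValueError("count must be at least 1")
    r0 = 0                           #       MOV  R0, #0       ; sum
    r1 = count                       #       MOV  R1, #count   ; counter
    while True:
        r0 = (r0 + r1) & MASK32      # loop  ADD  R0, R0, R1
        r1 = (r1 - 1) & MASK32       #       SUBS R1, R1, #1   ; sets Z when R1 reaches 0
        if r1 == 0:                  #       BNE  loop
            break
    return r0

def sum_first_odds(count):
    """Model of a loop that adds 1, 3, 5, ... for `count` terms into R0."""
    if count < 1:
        raise ValueError("count must be at least 1")
    r0 = 0                           #       MOV  R0, #0       ; sum
    r1 = 1                           #       MOV  R1, #1       ; current odd number
    r2 = count                       #       MOV  R2, #count   ; counter
    while True:
        r0 = (r0 + r1) & MASK32      # loop  ADD  R0, R0, R1
        r1 = (r1 + 2) & MASK32       #       ADD  R1, R1, #2
        r2 = (r2 - 1) & MASK32       #       SUBS R2, R2, #1
        if r2 == 0:                  #       BNE  loop
            break
    return r0

print(sum_first_naturals(5), sum_first_odds(10))
```

The results can be cross-checked against the closed forms n(n+1)/2 and n squared.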
