Lecture5 ARM

The document describes ARM multiply instructions. It includes: 1) Integer and long integer multiplication instructions that produce 32-bit and 64-bit results respectively. 2) Multiply accumulate instructions that add the product to a running total. 3) The 64-bit multiplication instructions are UMULL, UMLAL, SMULL, SMLAL and store the least and most significant 32 bits of the 64-bit result in separate registers.
Multiply Instructions

• Integer multiplication (32-bit result)
• Long integer multiplication (64-bit result)
• Built-in multiply-accumulate (MAC) unit
• Multiply-accumulate instructions add the product to a running total

39v10 The ARM Architecture TM
Multiply Instructions

• Instructions:
MUL    Multiply                       32-bit result
MLA    Multiply-accumulate            32-bit result
UMULL  Unsigned multiply              64-bit result
UMLAL  Unsigned multiply-accumulate   64-bit result
SMULL  Signed multiply                64-bit result
SMLAL  Signed multiply-accumulate     64-bit result

Multiply instructions

• There are some important differences from the other arithmetic instructions:
  • Immediate second operands are not supported
  • The result register must not be the same as the first source register

Multiplication

• MUL R0, R1, R2 ; R0 = (R1 x R2)[31:0]

• Features:
  • The second operand can’t be an immediate
  • The result register must be different from the first operand
  • The cycle count depends on the core type
  • If the S bit is set, the C flag is meaningless
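The truncation to bits [31:0] can be checked with a small Python model. This is only a sketch of the arithmetic, not of any particular core; the helper name `mul` and the constant `MASK32` are made up for the illustration.

```python
MASK32 = 0xFFFFFFFF  # a 32-bit register holds values 0 .. 2**32 - 1


def mul(rm, rs):
    """Model of MUL Rd, Rm, Rs: keep only bits [31:0] of the product."""
    return (rm * rs) & MASK32


# Small products fit in 32 bits and are unchanged.
small = mul(6, 7)                 # 42

# 0x10000 * 0x10000 = 2**32, whose low 32 bits are all zero.
overflowed = mul(0x10000, 0x10000)

# 0xFFFFFFFF is -1 when read as signed; the low 32 bits of the product
# are the same for the signed and the unsigned interpretation.
minus_two = mul(0xFFFFFFFF, 2)    # 0xFFFFFFFE, i.e. -2 as signed
```

Because the low word is identical for signed and unsigned operands, a single MUL serves both cases; only the long multiplies need separate signed and unsigned forms.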

Multiplication

• Multiply-accumulate (2D array indexing)
MLA R4, R3, R2, R1 @ R4 = R3 x R2 + R1
• Multiplying by a constant can often be implemented more efficiently using a shifted register operand
MOV R1, #35
MUL R2, R0, R1
or
ADD R0, R0, R0, LSL #2 @ R0’ = 5 x R0
RSB R2, R0, R0, LSL #3 @ R2 = 7 x R0’
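A quick way to convince yourself that the ADD/RSB pair really multiplies by 35 is to model both sequences in Python. The sketch below assumes plain 32-bit wrap-around arithmetic; the function names and the row-major layout in `element_index` are illustrative assumptions, not taken from the slides.

```python
MASK32 = 0xFFFFFFFF


def mla(rm, rs, rn):
    """Model of MLA Rd, Rm, Rs, Rn: Rd = (Rm * Rs + Rn)[31:0]."""
    return (rm * rs + rn) & MASK32


def times35_mul(r0):
    """MOV R1, #35 followed by MUL R2, R0, R1."""
    return (r0 * 35) & MASK32


def times35_shift_add(r0):
    """The two-instruction shifted-operand version."""
    r0 = (r0 + (r0 << 2)) & MASK32   # ADD R0, R0, R0, LSL #2  -> 5 x R0
    r2 = ((r0 << 3) - r0) & MASK32   # RSB R2, R0, R0, LSL #3  -> 7 x (5 x R0)
    return r2


def element_index(row, ncols, col):
    """2D array indexing with one MLA (row-major layout assumed)."""
    return mla(row, ncols, col)


samples = [0, 1, 3, 1000, 0x7FFFFFFF, 0xFFFFFFFF]
same = all(times35_mul(x) == times35_shift_add(x) for x in samples)
```

The two versions agree even when the product overflows, because both wrap modulo 2^32. For a 10-column array, `element_index(2, 10, 3)` gives element number 23.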
• The following instructions produce the full 64-bit result:
<mul>{<cond>}{S} RdLo, RdHi, Rm, Rs

• The 64-bit multiply types are
UMULL, UMLAL, SMULL, SMLAL

• Example:
UMULL r5, r6, r3, r9 ; multiplies the values of r3 and r9 and
                     ; stores the 64-bit result: the least
                     ; significant 32 bits are stored in r5 and
                     ; the most significant 32 bits in r6.
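The four long multiplies differ only in how the operands are interpreted (signed or unsigned) and in whether the old 64-bit value is added in. A minimal Python sketch of that behaviour follows; the helper names are invented for the illustration, and each function returns the pair (low word, high word).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def split64(value):
    """Wrap to 64 bits and return (low word, high word)."""
    value &= MASK64
    return value & MASK32, value >> 32


def umull(rm, rs):
    return split64((rm & MASK32) * (rs & MASK32))


def smull(rm, rs):
    return split64(to_signed32(rm) * to_signed32(rs))


def umlal(lo, hi, rm, rs):
    return split64(((hi << 32) | lo) + (rm & MASK32) * (rs & MASK32))


def smlal(lo, hi, rm, rs):
    return split64(((hi << 32) | lo) + to_signed32(rm) * to_signed32(rs))


# The same bit patterns give different 64-bit results:
unsigned_product = umull(0xFFFFFFFF, 0xFFFFFFFF)  # (2**32 - 1) squared
signed_product = smull(0xFFFFFFFF, 0xFFFFFFFF)    # (-1) * (-1) = 1
```

Here `unsigned_product` is `(0x00000001, 0xFFFFFFFE)` while `signed_product` is `(0x00000001, 0x00000000)`: the low words match and only the high words differ.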

Summary - Multiply instructions

Opcode [23:21]  Mnemonic  Meaning                                   Effect
000             MUL       Multiply (32-bit result)                  Rd := (Rm * Rs) [31:0]
001             MLA       Multiply-accumulate (32-bit result)       Rd := (Rm * Rs + Rn) [31:0]
100             UMULL     Unsigned multiply long                    RdHi:RdLo := Rm * Rs
101             UMLAL     Unsigned multiply-accumulate long         RdHi:RdLo += Rm * Rs
110             SMULL     Signed multiply long                      RdHi:RdLo := Rm * Rs
111             SMLAL     Signed multiply-accumulate long           RdHi:RdLo += Rm * Rs

Bits:   31-28  27-24  23-21  20  19-16    15-12    11-8  7-4   3-0
Field:  cond   0000   mul    S   Rd/RdHi  Rn/RdLo  Rs    1001  Rm
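The encoding row can be turned into a few shifts and ORs. The sketch below packs the fields exactly as laid out in the table; the function name, argument order and the default condition value 0b1110 (the "always" condition) are choices made for this illustration.

```python
MUL_OPCODES = {
    "MUL": 0b000, "MLA": 0b001,
    "UMULL": 0b100, "UMLAL": 0b101,
    "SMULL": 0b110, "SMLAL": 0b111,
}


def encode_multiply(mnemonic, rd_or_rdhi, rn_or_rdlo, rs, rm, s=0, cond=0b1110):
    """Pack the multiply fields into a 32-bit instruction word."""
    return (
        (cond << 28)                      # bits 31-28: condition
        | (MUL_OPCODES[mnemonic] << 21)   # bits 23-21: mul opcode (27-24 are 0000)
        | (s << 20)                       # bit 20: S
        | (rd_or_rdhi << 16)              # bits 19-16: Rd / RdHi
        | (rn_or_rdlo << 12)              # bits 15-12: Rn / RdLo
        | (rs << 8)                       # bits 11-8: Rs
        | (0b1001 << 4)                   # bits 7-4: fixed 1001
        | rm                              # bits 3-0: Rm
    )


# MUL R0, R1, R2: Rd = R0, Rm = R1, Rs = R2 (the Rn field is unused, 0 here)
mul_word = encode_multiply("MUL", 0, 0, 2, 1)

# MLA R4, R3, R2, R1: Rd = R4, Rm = R3, Rs = R2, Rn = R1
mla_word = encode_multiply("MLA", 4, 1, 2, 3)
```

With these inputs `mul_word` is `0xE0000291` and `mla_word` is `0xE0241293`.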

Flow control instructions

• Determine the instruction to be executed next
(Figure label: pc-relative offset)

Branch instruction

B label

label: …
• Conditional branches
MOV R0, #0
loop: …
ADD R0, R0, #1
CMP R0, #10
BNE loop
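The loop above runs its body ten times: CMP sets the Z flag only when R0 equals 10, and BNE branches back while Z is clear. A small Python model of that control flow is sketched below; the function name and the `limit` parameter are illustrative only.

```python
def run_counting_loop(limit=10):
    """Model of the MOV / ADD / CMP / BNE loop; returns (R0, iterations)."""
    r0 = 0                         # MOV R0, #0
    iterations = 0
    while True:
        iterations += 1            # loop: ... (loop body runs here)
        r0 += 1                    # ADD R0, R0, #1
        z_flag = (r0 - limit) == 0 # CMP R0, #10 sets Z when R0 == 10
        if z_flag:                 # BNE loop falls through once Z is set
            break
    return r0, iterations


final_r0, count = run_counting_loop()
```

After the call both `final_r0` and `count` are 10.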

Branch and link

• The BL instruction saves the return address in R14 (lr)
BL sub @ call sub
CMP R1, #5 @ return to here
MOVEQ R1, #0

sub: … @ sub entry point

MOV PC, LR @ return
Branch conditions

Branches

Conditional execution

CMP R0, #5
BEQ bypass @ if (R0 != 5) {
ADD R1, R1, R0 @ R1 = R1 + R0 - R2
SUB R1, R1, R2 @ }
bypass: …
Smaller and faster:
CMP R0, #5
ADDNE R1, R1, R0
SUBNE R1, R1, R2
Rule of thumb: if the conditional sequence is three instructions or fewer, it is better to use conditional execution than a branch.
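Both versions compute the same result; the second simply skips the branch. The Python sketch below models each sequence so the equivalence can be checked. The function names are invented for the illustration, and timing is not modelled.

```python
MASK32 = 0xFFFFFFFF


def with_branch(r0, r1, r2):
    """CMP / BEQ bypass / ADD / SUB."""
    if r0 == 5:                    # BEQ bypass: skip both instructions
        return r1
    r1 = (r1 + r0) & MASK32        # ADD R1, R1, R0
    r1 = (r1 - r2) & MASK32        # SUB R1, R1, R2
    return r1


def with_conditional_execution(r0, r1, r2):
    """CMP / ADDNE / SUBNE."""
    ne = r0 != 5                   # CMP R0, #5: NE holds when Z is clear
    if ne:
        r1 = (r1 + r0) & MASK32    # ADDNE R1, R1, R0 (does not change flags)
    if ne:
        r1 = (r1 - r2) & MASK32    # SUBNE R1, R1, R2
    return r1


cases = [(5, 7, 1), (3, 7, 1), (0, 0, 1), (6, 0xFFFFFFFF, 2)]
equivalent = all(
    with_branch(*c) == with_conditional_execution(*c) for c in cases
)
```

ADDNE has no S suffix, so it leaves the flags from CMP intact and SUBNE sees the same condition.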
Data Transfer Instructions

Register-indirect addressing

Addressing Modes in ARM

Load and store instructions have three primary addressing modes:
• offset
• pre-indexed
• post-indexed
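The three modes differ in which address is accessed and whether the base register is written back. A minimal Python sketch, assuming memory is a dictionary keyed by byte address and using invented function names; each function returns (loaded value, new base register value).

```python
def ldr_offset(mem, base, offset):
    """LDR Rd, [Rn, #offset]: access base + offset, base unchanged."""
    return mem[base + offset], base


def ldr_pre_indexed(mem, base, offset):
    """LDR Rd, [Rn, #offset]!: access base + offset, then base = base + offset."""
    address = base + offset
    return mem[address], address


def ldr_post_indexed(mem, base, offset):
    """LDR Rd, [Rn], #offset: access base, then base = base + offset."""
    return mem[base], base + offset


memory = {0x100: 11, 0x104: 22}    # two words of example data

offset_result = ldr_offset(memory, 0x100, 4)
pre_result = ldr_pre_indexed(memory, 0x100, 4)
post_result = ldr_post_indexed(memory, 0x100, 4)
```

With this data the offset form yields `(22, 0x100)`, the pre-indexed form `(22, 0x104)` and the post-indexed form `(11, 0x104)`. The post-indexed form is the natural fit for walking an array one element at a time.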

One-dimensional array
Example:

Another Approach:

Modifying the Status Registers

• Only indirectly
• MRS moves contents from CPSR/SPSR to selected GPR
• MSR moves contents from selected GPR to CPSR/SPSR
• Only in privileged modes

(Figure: registers R0 to R15, CPSR and SPSR, linked by MRS and MSR.)
PSR Transfer Instructions

Bit:    31 30 29 28 | 27 | 24 | 23 ... 8  | 7 6 5 | 4 ... 0
Field:  N  Z  C  V  | Q  | J  | Undefined | I F T | mode
Byte fields: f = bits 31-24, s = bits 23-16, x = bits 15-8, c = bits 7-0

• MRS and MSR allow the contents of CPSR / SPSR to be transferred to / from a general-purpose register.
• Syntax:
  • MRS{<cond>} Rd,<psr> ; Rd = <psr>
  • MSR{<cond>} <psr[_fields]>,Rm ; <psr[_fields]> = Rm
where
  • <psr> = CPSR or SPSR
  • [_fields] = any combination of ‘fsxc’
• Also an immediate form
  • MSR{<cond>} <psr_fields>,#Immediate
• In User Mode, all bits can be read but only the condition flags (_f) can be written.
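The field letters select which bytes of the status register MSR is allowed to change. The Python sketch below models that masking; the names `FIELD_MASKS`, `mrs` and `msr` are invented for the illustration, and privilege checks (such as the User Mode restriction) are deliberately left out.

```python
FIELD_MASKS = {
    "f": 0xFF000000,   # flags byte, bits 31-24
    "s": 0x00FF0000,   # status byte, bits 23-16
    "x": 0x0000FF00,   # extension byte, bits 15-8
    "c": 0x000000FF,   # control byte, bits 7-0
}


def mrs(psr):
    """MRS Rd, <psr>: copy the whole status register into Rd."""
    return psr & 0xFFFFFFFF


def msr(psr, rm, fields):
    """MSR <psr_fields>, Rm: replace only the selected bytes of psr."""
    mask = 0
    for letter in fields:
        mask |= FIELD_MASKS[letter]
    return (psr & ~mask & 0xFFFFFFFF) | (rm & mask)


old_psr = 0x600000D3               # example value: Z and C set, control byte 0xD3
new_value = 0x80000010

flags_only = msr(old_psr, new_value, "f")     # like MSR CPSR_f, Rm
control_only = msr(old_psr, new_value, "c")   # like MSR CPSR_c, Rm
```

Here `flags_only` is `0x800000D3` (only the top byte changed) and `control_only` is `0x60000010` (only the bottom byte changed).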

Software Interrupt

Bits:   31-28  27-24   23-0
Field:  Cond   Opcode  Ordinal

• SWI instruction
  • Forces the CPU into supervisor mode
  • Usage: SWI #n
  • Maximum 2^24 calls
  • Suitable for running privileged code and making OS calls
Software Interrupt (SWI)

Bits:   31-28            27-24    23-0
Field:  Cond             1 1 1 1  SWI number (ignored by processor)
        (condition field)

• Causes an exception trap to the SWI hardware vector
• The SWI handler can examine the SWI number to decide what operation has been requested.
• By using the SWI mechanism, an operating system can implement a set of privileged operations which applications running in user mode can request.
• Syntax:
  • SWI{<cond>} <SWI number>
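Since the processor ignores the low 24 bits, the handler has to pull the SWI number out of the instruction word itself. The Python sketch below shows the packing and the extraction; the function names are invented for the illustration, and the default condition value 0b1110 (always) is an assumption of the example.

```python
def encode_swi(number, cond=0b1110):
    """Build the instruction word for SWI{<cond>} <number>."""
    if not 0 <= number < (1 << 24):
        raise ValueError("SWI number must fit in 24 bits")
    return (cond << 28) | (0b1111 << 24) | number


def swi_number(instruction):
    """What a handler does: mask off the top 8 bits to get the SWI number."""
    return instruction & 0x00FFFFFF


word = encode_swi(0x11)            # SWI 0x11
requested = swi_number(word)       # recovers 0x11
max_calls = 1 << 24                # 24-bit field: 16777216 distinct numbers
```

With these inputs `word` is `0xEF000011` and `requested` is `0x11`.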
