Module-5

The document introduces the THUMB instruction set, which encodes a subset of ARM instructions into a 16-bit format for improved performance in memory-constrained systems. It highlights the advantages of THUMB in terms of code density and memory usage, as well as the limitations in register access and the need for ARM-Thumb interworking. Additionally, it covers various instruction types, including data processing, stack operations, and software interrupts, along with examples of efficient C programming practices.

Introduction to the THUMB instruction set

MODULE: 5
2TE
12/6/2023 6:56 AM

Introduction
• Thumb encodes a subset of the 32-bit ARM instructions into a 16-bit instruction set space.
• Thumb has higher performance than ARM on a processor with a 16-bit data bus, but lower performance than ARM on a 32-bit data bus, so use Thumb for memory-constrained systems.
• Thumb has higher code density than ARM: the same executable program takes up less space in memory.
• For memory-constrained embedded systems, for example mobile phones and PDAs, code density is very important.
• Cost pressures also limit memory size, width, and speed.
• On average, a Thumb implementation of the same code takes up around 30% less memory than the equivalent ARM implementation.

Thumb Register Usage


• In Thumb state, we do not have direct access to all registers. Only the low registers r0 to r7 are fully accessible, as shown in Table 4.2.
• The higher registers r8 to r12 are only accessible with MOV, ADD, or CMP instructions.
• CMP and all the data processing instructions that operate on low registers update the condition flags in the cpsr.

ARM-Thumb Interworking
• ARM-Thumb interworking is the name given to the method of linking ARM and Thumb code together for both assembly and C/C++.
• It handles the transition between the two states.
• Extra code, called a veneer, is sometimes needed to carry out the transition.
• ATPCS (the ARM-Thumb Procedure Call Standard) defines the ARM and Thumb procedure call standards.
• Syntax:
  BX Rm
  BLX Rm | label
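
As a rough illustration of the state change, BX can be modeled in C: bit 0 of Rm selects the state (1 for Thumb, 0 for ARM) and the remaining bits give the branch target. The bx_result type and the bx function are hypothetical names used only for this sketch; they are not part of any toolchain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two pieces of state that BX changes. */
typedef struct {
    uint32_t pc;     /* branch target with bit 0 cleared */
    bool     thumb;  /* T bit in the cpsr: true means Thumb state */
} bx_result;

/* Model of BX Rm: bit 0 of Rm selects the state, the rest is the target. */
bx_result bx(uint32_t rm)
{
    bx_result r;
    r.thumb = (rm & 1u) != 0u;
    r.pc    = rm & ~UINT32_C(1);
    return r;
}
```

For example, bx(0x8001) models a branch to address 0x8000 in Thumb state, while bx(0x8000) models a branch to the same address in ARM state.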

Other Branch Instructions


• There are two variations of the standard branch instruction, B.
• The first is similar to the ARM version and is conditionally executed; the branch range is limited to a signed 8-bit immediate, or −256 to +254 bytes.
• The second version removes the conditional part of the instruction and expands the effective branch range to a signed 11-bit immediate, or −2048 to +2046 bytes.
• The conditional branch instruction is the only conditionally executed instruction in Thumb state.
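
The branch ranges quoted above follow from the width of the immediate. A minimal sketch, assuming the signed immediate counts 16-bit halfwords and is doubled to give a byte offset; the function names are hypothetical:

```c
/* Byte offset range of a Thumb branch whose signed immediate has `bits` bits.
 * The immediate counts halfwords, so it is doubled to obtain bytes. */
int thumb_branch_min_bytes(int bits)
{
    return -(1 << (bits - 1)) * 2;
}

int thumb_branch_max_bytes(int bits)
{
    return ((1 << (bits - 1)) - 1) * 2;
}
```

With these definitions, an 8-bit immediate gives −256 to +254 bytes and an 11-bit immediate gives −2048 to +2046 bytes, matching the figures in the text.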

Data processing instructions


• The data processing instructions manipulate data within registers.
• They include move instructions, arithmetic instructions, shifts, logical instructions, comparison instructions, and multiply instructions.
• The Thumb data processing instructions are a subset of the ARM data processing instructions.

Stack Instructions
• The Thumb stack operations are different from the equivalent ARM instructions because they use the more traditional POP and PUSH concept.
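
The PUSH and POP concept can be sketched in C. This is only a model, assuming a full descending stack in which the stack pointer is implicit and always refers to the most recently pushed word; the stack_model type, its size, and the function names are hypothetical.

```c
#include <stdint.h>

#define STACK_WORDS 16   /* hypothetical stack size for the model */

typedef struct {
    uint32_t mem[STACK_WORDS];
    int      sp;   /* index of the most recently pushed word (STACK_WORDS when empty) */
} stack_model;

void stack_init(stack_model *s)
{
    s->sp = STACK_WORDS;
}

/* PUSH: decrement the stack pointer, then store. */
void push(stack_model *s, uint32_t value)
{
    s->mem[--s->sp] = value;
}

/* POP: load, then increment the stack pointer. */
uint32_t pop(stack_model *s)
{
    return s->mem[s->sp++];
}
```

Values come back in the reverse of the order in which they were pushed, and the stack pointer returns to its starting value once every push has been matched by a pop.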

Software Interrupt Instruction


• Similar to the ARM equivalent, the Thumb software interrupt (SWI) instruction causes a software interrupt exception.
• If any interrupt or exception flag is raised in Thumb state, the processor automatically reverts to ARM state to handle the exception.
Efficient C Programming
Basic C Data Types
Local Variable Types
[Listing: data packet checksum routine and its compiler output]
• In the checksum_v1 example, the compiler inserts an extra AND instruction to reduce i to the range 0 to 255 before the comparison with 64. This instruction disappears in the checksum_v2 example.
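
The two routines can be sketched as follows. This is an illustrative reconstruction, not the original listing: it assumes checksum_v1 declares the loop counter i as an 8-bit type (which is why the compiler has to keep it within 0 to 255) and checksum_v2 declares it as unsigned int. The array length of 64 comes from the comparison mentioned above.

```c
/* Assumed shape of checksum_v1: an 8-bit loop counter. On a 32-bit
 * register machine the compiler must mask i back to 0..255 after i++. */
int checksum_v1(const int *data)
{
    unsigned char i;
    int sum = 0;

    for (i = 0; i < 64; i++) {
        sum += data[i];
    }
    return sum;
}

/* Assumed shape of checksum_v2: a register-width loop counter,
 * so no masking is needed after the increment. */
int checksum_v2(const int *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 0; i < 64; i++) {
        sum += data[i];
    }
    return sum;
}
```

Both functions return the same result; only the code generated for the loop counter differs.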
[Listing: data packet checksum routine and its compiler output]
• The checksum_v4 code fixes all the problems discussed previously. It uses int type local variables to avoid unnecessary casts. It increments the pointer data instead of using an index offset data[i].
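
A sketch of what checksum_v4 might look like, based only on the description above. The short element type and the 64-element length are assumptions; the original listing is not reproduced here.

```c
/* Assumed shape of checksum_v4: int locals, pointer increment,
 * and a single narrowing cast when the function returns. */
short checksum_v4(const short *data)
{
    unsigned int i;
    int sum = 0;             /* int accumulator: no cast inside the loop */

    for (i = 0; i < 64; i++) {
        sum += *(data++);    /* advance the pointer instead of indexing data[i] */
    }
    return (short)sum;       /* narrow once, on exit */
}
```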
[Listing: data packet checksum routine and its compiler output]

• Three instructions have been removed from the inner loop, saving three cycles per loop compared to checksum_v3.
Function Argument Types

[Listing: gcc compiler output]

[Listing: ARM C compiler output]


Signed versus Unsigned Types
• If your code uses addition, subtraction, and multiplication, then there
is no performance difference between signed and unsigned
operations. However, there is a difference when it comes to division.
• Consider the following short example that averages two integers:
[Listing: C routine to compute the average of two integers, and its compiler output]

• In C on an ARM target, a divide by two is not a right shift if a+b is negative. For example, -3>>1 = -2 but -3/2 = -1. Hence, if a+b is signed, the compiler inserts an extra ADD instruction, that is, it adds one to the sum before shifting right if the sum is negative.
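
The difference can be sketched in C. The function names below are hypothetical, and halve_signed only imitates what the compiler does for the signed case; it assumes that right-shifting a negative int is an arithmetic shift, which is implementation-defined in C but is the behavior of ARM compilers and of gcc.

```c
/* Signed average: the division must round toward zero. */
int average_signed(int a, int b)
{
    return (a + b) / 2;
}

/* Unsigned average: the division is a plain logical shift right. */
unsigned int average_unsigned(unsigned int a, unsigned int b)
{
    return (a + b) / 2;
}

/* Imitation of the compiler's signed divide by two:
 * add one to a negative sum, then shift right by one. */
int halve_signed(int sum)
{
    if (sum < 0) {
        sum += 1;
    }
    return sum >> 1;   /* assumes an arithmetic shift for negative values */
}
```

For a sum of -3, halve_signed adds one to get -2 and shifts to get -1, which agrees with -3/2.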
C Looping Structures
Loops with a fixed number of iterations:

The example below shows how the compiler treats a loop with an incrementing count, i++, together with the compiler output.

[Listings: data packet checksum routines and their compiler output]

• The SUBS and BNE instructions implement the loop. Our checksum example now has the minimum number of four instructions per loop. This is much better than six for checksum_v1.
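
A SUBS and BNE pair corresponds to a loop that counts down to zero: the subtract sets the condition flags itself, so no separate compare is needed before the branch. A minimal sketch of such a loop, with a hypothetical function name and the 64-element length assumed from the earlier examples:

```c
/* Count-down loop: i runs from 64 to 1 and the loop ends when i reaches zero. */
int checksum_countdown(const int *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 64; i != 0; i--) {
        sum += *(data++);
    }
    return sum;
}
```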
Loops using a variable number of iterations:

[Listing: data packet checksum routine and its compiler output]

• Notice that the compiler checks that N is nonzero on entry to the function. Often this check is unnecessary since you know that the array won't be empty. In this case a do-while loop gives better performance.
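
The two loop forms can be compared in a short sketch. The function names are hypothetical; the do-while version is only valid when the caller guarantees that N is nonzero.

```c
/* for loop: N is tested before the first iteration, so N == 0 is safe. */
int checksum_for(const int *data, unsigned int N)
{
    int sum = 0;

    for (; N != 0; N--) {
        sum += *(data++);
    }
    return sum;
}

/* do-while loop: no test on entry. The caller must guarantee N != 0. */
int checksum_do_while(const int *data, unsigned int N)
{
    int sum = 0;

    do {
        sum += *(data++);
    } while (--N != 0);
    return sum;
}
```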
[Listing: data packet checksum routine and its compiler output]

Loop unrolling:
• We saw that each loop iteration costs two instructions in addition to the body of the loop: a subtract to decrement the loop count and a conditional branch. We call these instructions the loop overhead.
• On ARM7 or ARM9 processors the subtract takes one cycle and the branch three cycles, giving an overhead of four cycles per loop.
• You can save some of these cycles by unrolling a loop, that is, repeating the loop body several times and reducing the number of loop iterations by the same proportion.
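
A sketch of a checksum loop unrolled four times, together with the overhead arithmetic from the bullets above. The function names are hypothetical, and the unrolled loop assumes N is a nonzero multiple of four.

```c
/* Loop body repeated four times per pass.
 * Assumes N is nonzero and a multiple of four. */
int checksum_unrolled(const int *data, unsigned int N)
{
    int sum = 0;

    do {
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        N -= 4;
    } while (N != 0);
    return sum;
}

/* Loop overhead in cycles: four per pass (one for the subtract,
 * three for the branch), with N / unroll passes in total. */
unsigned int loop_overhead_cycles(unsigned int N, unsigned int unroll)
{
    return 4u * (N / unroll);
}
```

For 64 elements the overhead falls from 256 cycles without unrolling to 64 cycles when the loop is unrolled four times.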
[Listing: data packet checksum routine and its compiler output]
