Module-5

The document introduces the THUMB instruction set, which encodes a subset of ARM instructions into a 16-bit format for improved performance in memory-constrained systems. It highlights the advantages of THUMB in terms of code density and memory usage, as well as the limitations in register access and the need for ARM-Thumb interworking. Additionally, it covers various instruction types, including data processing, stack operations, and software interrupts, along with examples of efficient C programming practices.
Introduction to the THUMB instruction set

MODULE: 5
2TE
12/6/2023 6:56 AM

Introduction
• Thumb encodes a subset of the 32-bit ARM instructions into a 16-bit instruction set space.
• Thumb performs better than ARM on a processor with a 16-bit data bus but worse than ARM on a 32-bit data bus, so use Thumb for memory-constrained systems.
• Thumb has higher code density than ARM, meaning the same executable program takes up less space in memory.
• For memory-constrained embedded systems such as mobile phones and PDAs, code density is very important.
• Cost pressures also limit memory size, width, and speed.
• On average, a Thumb implementation of the same code takes up around 30% less memory than the equivalent ARM implementation.

Thumb Register Usage


• In Thumb state, we do not have direct access to all registers. Only the low registers r0 to r7 are fully accessible, as shown in Table 4.2.
• The higher registers r8 to r12 are only accessible with MOV, ADD, or CMP instructions.
• CMP and all the data processing instructions that operate on low registers update the condition flags in the cpsr.

ARM-Thumb Interworking
• ARM-Thumb interworking is the name given to the method of linking ARM and Thumb code together for both assembly and C/C++.
• It handles the transition between the two states.
• Extra code, called a veneer, is sometimes needed to carry out the transition.
• ATPCS defines the ARM and Thumb procedure call standards.
• Syntax: BX Rm
          BLX Rm | label

Other Branch Instructions


• There are two variations of the standard branch instruction, or B.
• The first is similar to the ARM version and is conditionally executed; the branch range is limited to a signed 8-bit immediate, or −256 to +254 bytes.
• The second version removes the conditional part of the instruction and expands the effective branch range to a signed 11-bit immediate, or −2048 to +2046 bytes.
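The byte ranges quoted above follow from the immediate width: Thumb instructions are halfword aligned, so the signed immediate counts halfwords and is doubled to give a byte offset. A minimal sketch of that arithmetic in C (the function names are hypothetical, chosen only for illustration):

```c
#include <stdint.h>

/* Hypothetical helpers: smallest and largest byte offset reachable with a
 * signed immediate of `bits` bits that counts 2-byte halfwords. */
int32_t thumb_branch_min_bytes(unsigned bits)
{
    int32_t min_halfwords = -((int32_t)1 << (bits - 1)); /* e.g. -128 for 8 bits */
    return min_halfwords * 2;                            /* halfwords to bytes */
}

int32_t thumb_branch_max_bytes(unsigned bits)
{
    int32_t max_halfwords = ((int32_t)1 << (bits - 1)) - 1; /* e.g. 127 for 8 bits */
    return max_halfwords * 2;                               /* halfwords to bytes */
}
```

Calling the helpers with 8 gives −256 and +254 for the conditional branch, and with 11 gives −2048 and +2046 for the unconditional branch.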


• The conditional branch instruction is the only conditionally executed instruction in Thumb state.

Data processing instructions


• The data processing instructions manipulate data within registers.
• They include move instructions, arithmetic instructions, shifts, logical instructions, comparison instructions, and multiply instructions.
• The Thumb data processing instructions are a subset of the ARM data processing instructions.

Stack Instructions
• The Thumb stack operations are different from the equivalent ARM instructions because they use the more traditional POP and PUSH concept.

Software Interrupt Instruction


• Similar to the ARM equivalent, the Thumb software interrupt (SWI) instruction causes a software interrupt exception.
• If any interrupt or exception flag is raised in Thumb state, the processor automatically reverts to ARM state to handle the exception.
Efficient C Programming
Basic C Data Types
Local Variable Types
Data packet checksum routine: Compiler Output:
• In the checksum_v1 example, the compiler inserts an extra AND instruction to reduce i to the range 0 to 255 before the comparison with 64. This instruction disappears in the checksum_v2 example.
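The listings for this slide did not survive extraction. The sketch below shows what the two routines could plausibly look like, assuming a packet of 64 int values. The names checksum_v1 and checksum_v2, the 0 to 255 range, and the comparison with 64 come from the text; the signatures and the exact counter types are assumptions.

```c
/* Assumed shape of checksum_v1: the loop counter is an 8-bit type, so the
 * compiler must keep i in the range 0 to 255 after each increment. */
int checksum_v1(const int *data)
{
    unsigned char i; /* narrow counter (assumption) */
    int sum = 0;

    for (i = 0; i < 64; i++) {
        sum += data[i];
    }
    return sum;
}

/* Assumed shape of checksum_v2: the counter matches the 32-bit register
 * width, so no range-reducing instruction is needed. */
int checksum_v2(const int *data)
{
    unsigned int i; /* register-width counter (assumption) */
    int sum = 0;

    for (i = 0; i < 64; i++) {
        sum += data[i];
    }
    return sum;
}
```

Both versions return the same result; only the generated code differs.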
Data packet checksum routine: Compiler Output:
• The checksum_v4 code fixes all the problems discussed previously. It uses int type local variables to avoid unnecessary casts. It increments the pointer data instead of using an index offset data[i].
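As the original listings are missing, here is a hedged sketch of the two versions being compared, assuming the packet holds 64 16-bit values. The names checksum_v3 and checksum_v4 come from the text; the short element type and the signatures are assumptions.

```c
/* Assumed shape of checksum_v3: a short accumulator forces a narrowing
 * cast on every iteration, and data[i] needs an index calculation. */
short checksum_v3(const short *data)
{
    unsigned int i;
    short sum = 0;

    for (i = 0; i < 64; i++) {
        sum = (short)(sum + data[i]); /* cast repeated inside the loop */
    }
    return sum;
}

/* Assumed shape of checksum_v4: an int accumulator, a single cast on
 * return, and a pointer that is incremented instead of indexed. */
short checksum_v4(const short *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 0; i < 64; i++) {
        sum += *(data++); /* pointer increment, no index offset */
    }
    return (short)sum; /* one narrowing cast at the end */
}
```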
Data packet checksum routine: Compiler Output:

• Three instructions have been removed from the inside loop, saving three cycles per loop compared to checksum_v3.
Function Argument Types

GCC compiler output:

ARM C compiler output:


Signed versus Unsigned Types
• If your code uses addition, subtraction, and multiplication, then there
is no performance difference between signed and unsigned
operations. However, there is a difference when it comes to division.
• Consider the following short example that averages two integers:
C routine to compute average of two integers: Compiler output:

• In C on an ARM target, a divide by two is not a right shift if a+b is negative. For example, -3>>1 = -2 but -3/2 = -1. Hence, if the a+b value is signed, the compiler inserts an extra ADD instruction: it adds one to the sum before shifting right if the sum is negative.
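The following sketch illustrates the point in C. The function names are hypothetical, and the original listing may differ. div2_by_shift mimics the correction the compiler applies for signed values; it assumes that right-shifting a negative int is an arithmetic shift, which is implementation-defined in C but is the behaviour of the common ARM compilers.

```c
/* Signed average: the division rounds toward zero, so a plain shift is
 * not enough when a + b is negative. */
int average_signed(int a, int b)
{
    return (a + b) / 2;
}

/* Unsigned average: the division can be a single logical shift right. */
unsigned int average_unsigned(unsigned int a, unsigned int b)
{
    return (a + b) / 2;
}

/* What the signed divide by two costs: add one if the value is negative,
 * then shift (assumes an arithmetic right shift on negative values). */
int div2_by_shift(int x)
{
    return (x + (x < 0)) >> 1;
}
```

For example, div2_by_shift(-3) adds one to get -2 and shifts to -1, matching -3/2.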
C Looping Structures
Loops with a fixed number of iterations:

The example below shows how the compiler treats a loop with an incrementing count, i++, together with the compiler output.
Data packet checksum routine: Compiler Output:
Data packet checksum routine: Compiler Output:

• The SUBS and BNE instructions implement the loop. Our checksum
example now has the minimum number of four instructions per loop.
This is much better than six for checksum_v1 and eight for
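A sketch of the two loop forms for a fixed count of 64, using hypothetical function names since the original listings are missing. Counting down to zero lets the decrement itself set the flags, so a single subtract and a conditional branch are enough to close the loop, with no separate compare.

```c
/* Incrementing count: each iteration needs an add, a compare against 64,
 * and a conditional branch. */
int checksum_count_up(const int *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 0; i < 64; i++) {
        sum += *(data++);
    }
    return sum;
}

/* Decrementing count: the test against zero comes for free with the
 * subtract, which is the pattern SUBS plus BNE implements. */
int checksum_count_down(const int *data)
{
    unsigned int i;
    int sum = 0;

    for (i = 64; i != 0; i--) {
        sum += *(data++);
    }
    return sum;
}
```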
Loops using a variable number of iterations:

Data packet checksum routine: Compiler Output:

• Notice that the compiler checks that N is nonzero on entry to the function. Often this check is unnecessary since you know that the array won't be empty. In this case a do-while loop gives better performance and
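A hedged sketch of the two variants for a packet of N values (the function names are hypothetical). The for loop must handle N equal to zero, so it tests N before the first iteration; the do-while version skips that test and is only valid when the caller guarantees N is nonzero.

```c
/* For loop: safe for N == 0, at the cost of a check on entry. */
int checksum_for(const int *data, unsigned int N)
{
    int sum = 0;

    for (; N != 0; N--) {
        sum += *(data++);
    }
    return sum;
}

/* Do-while loop: no entry check. Assumes N != 0; calling it with
 * N == 0 would read past the packet. */
int checksum_do_while(const int *data, unsigned int N)
{
    int sum = 0;

    do {
        sum += *(data++);
    } while (--N != 0);
    return sum;
}
```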
Data packet checksum routine: Compiler Output:
Loop unrolling:
• We saw that each loop iteration costs two instructions in addition to
the body of the loop: a subtract to decrement the loop count and a
conditional branch. We call these instructions the loop overhead.
• On ARM7 or ARM9 processors the subtract takes one cycle and the
branch three cycles, giving an overhead of four cycles per loop.
• You can save some of these cycles by unrolling a loop, that is, repeating the loop body several times and reducing the number of loop iterations by the same proportion.
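A minimal sketch of unrolling by four, under the assumption that N is a nonzero multiple of four (the function name is hypothetical). With the figures above, the four-cycle overhead is now paid once per four elements, so it drops from four cycles to one cycle per element.

```c
/* Checksum with the loop body repeated four times.
 * Assumes N != 0 and N % 4 == 0; other values are not handled. */
int checksum_unrolled(const int *data, unsigned int N)
{
    int sum = 0;

    do {
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        sum += *(data++);
        N -= 4; /* one subtract and one branch per four elements */
    } while (N != 0);
    return sum;
}
```

Unrolling trades code size for speed, so it works against the code density goal discussed for Thumb and is best kept for loops that matter to overall performance.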
Data packet checksum routine: Compiler Output:
