Unit-3 Thumb Instruction Set

Embedded System: Thumb Instruction Set

Uploaded by

vickyqsc
Copyright
© All Rights Reserved

UNIT-III: Thumb Instruction Set

1. Introduction:
Thumb encodes a subset of the 32-bit ARM instructions into a 16-bit instruction set
space. Since Thumb has higher performance than ARM on a processor with a 16-bit
data bus, but lower performance than ARM on a 32-bit data bus, use Thumb for
memory-constrained systems.

Thumb has higher code density (the space taken up in memory by an executable
program) than ARM. For memory-constrained embedded systems, for example,
mobile phones and PDAs, code density is very important. Cost pressures also limit
memory size, width, and speed.

On average, a Thumb implementation of the same code takes up around 30% less
memory than the equivalent ARM implementation.

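Let us take an example.

A program in ARM code:

; ARM Divide
; IN: r0 (value), r1 (divisor)
; OUT: r2 (MODulus), r3 (DIVide)
        MOV   r3, #0          ; quotient = 0
loop
        SUBS  r0, r0, r1      ; value = value - divisor, update flags
        ADDGE r3, r3, #1      ; if result >= 0, quotient = quotient + 1
        BGE   loop            ; if result >= 0, repeat
        ADD   r2, r0, r1      ; remainder = negative result + divisor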

Explanation of the ARM code:

This ARM assembly code performs integer division by repeated subtraction and
calculates both the quotient and the remainder.

Step-by-step explanation:

Input: r0 contains the dividend (value), and r1 contains the divisor.
Output: r3 contains the quotient (DIVide), and r2 contains the remainder (MODulus).

1) MOV r3, #0: This instruction initializes register r3 to 0, which will hold the quotient
(DIVide).

2) loop: This is a label marking the start of the loop. It is not an instruction.

3) SUBS r0, r0, r1: This instruction subtracts the value in register r1 (divisor) from the
value in register r0 (dividend) and stores the result back in r0. The S suffix updates
the condition flags (N, Z, C, and V) in the cpsr based on the result of the subtraction.

4) ADDGE r3, r3, #1: This instruction adds 1 to the value in r3 (quotient) if the
previous subtraction did not result in a negative value. The GE suffix stands for
"greater than or equal". The condition is true when the N flag equals the V flag,
which for a signed subtraction means the result is greater than or equal to zero.
ADDGE has no S suffix, so it leaves the flags unchanged for the BGE that follows.

5) BGE loop: This is a conditional branch instruction that branches back to the loop
label if the previous subtraction (SUBS) resulted in a value greater than or equal to
zero, indicating that another iteration of the division process is needed.

6) ADD r2, r0, r1: When the loop exits, r0 is negative because the divisor has been
subtracted one time too many. Adding the divisor (r1) back to r0 gives the remainder
(MODulus), which is stored in r2.

The loop continues to subtract the divisor from the dividend until the result is
negative, incrementing the quotient for each subtraction that leaves a non-negative
result. Once the loop finishes, the final remainder is calculated by adding the divisor
back to the last negative result in r0. The quotient is stored in r3, and the remainder
is stored in r2.
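
The register-level behaviour described above can be checked with a small model. The following is only a sketch in Python, not ARM code: the function name arm_divide is hypothetical, Python integers stand in for the 32-bit registers, and it assumes a non-negative value and a positive divisor so that no overflow occurs.

```python
def arm_divide(value, divisor):
    """Model of the ARM divide loop; returns (modulus, quotient)."""
    r0, r1, r3 = value, divisor, 0   # MOV r3, #0
    while True:
        r0 -= r1                     # SUBS r0, r0, r1
        if r0 >= 0:                  # GE condition holds
            r3 += 1                  # ADDGE r3, r3, #1
            continue                 # BGE loop (taken)
        break                        # BGE loop (not taken)
    r2 = r0 + r1                     # ADD r2, r0, r1
    return r2, r3


print(arm_divide(7, 2))  # (1, 3): 7 = 3 * 2 + 1
```

Tracing 7 divided by 2 by hand gives r0 = 5, 3, 1, -1, so the loop exits with r3 = 3 and the final ADD restores the remainder 1.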

Calculation of Code Density:

Code density describes how much memory the instructions of a program occupy: the
fewer bytes needed to do the same work, the higher the code density. It is a property
of the program's size, not of how many of its instructions happen to execute. To
compare the two versions of the divide routine, we therefore count the instructions in
each version and multiply by the instruction size.

Breakdown of the instructions:

MOV r3, #0 (1 instruction)
loop (a label; it marks an address and is not an instruction)
SUBS r0, r0, r1 (1 instruction)
ADDGE r3, r3, #1 (1 instruction, takes effect only if the GE condition is met)
BGE loop (1 instruction, branch taken only if the GE condition is met)
ADD r2, r0, r1 (1 instruction)

Total instructions: 5

MOV, SUBS, and ADD always execute. ADDGE and BGE are fetched on every pass
through the loop but take effect only while the result of the subtraction is
non-negative. All five instructions occupy memory whether or not their condition is
met, so all five count towards the size of the program.

Calculation of Memory Usage:

To calculate the memory usage of the program, we count the memory required to
store its instructions. This routine uses no data in memory: all of its values are held
in registers.

Breakdown of the memory usage:

Instructions:

MOV r3, #0: 1 word (every ARM instruction is 32 bits)
loop: 0 words (a label occupies no memory)
SUBS r0, r0, r1: 1 word
ADDGE r3, r3, #1: 1 word
BGE loop: 1 word
ADD r2, r0, r1: 1 word
Total instruction memory: 5 words

Register values:

r0, r1, r2, r3: These are registers inside the processor core. They are not part of
program memory, so they add nothing to the memory usage.

Total memory usage: 5 instructions * 4 bytes = 20 bytes

So, the total memory usage of the ARM program is 20 bytes.

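A program in Thumb code:

; THUMB Divide
; IN: r0 (value), r1 (divisor)
; OUT: r2 (MODulus), r3 (DIVide)
        MOV   r3, #0          ; quotient = 0
loop
        ADD   r3, #1          ; quotient = quotient + 1 (unconditional)
        SUB   r0, r1          ; value = value - divisor, update flags
        BGE   loop            ; if result >= 0, repeat
        SUB   r3, #1          ; undo the one extra increment
        ADD   r2, r0, r1      ; remainder = negative result + divisor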

Explanation:

This Thumb assembly code performs integer division with remainder. Thumb is a
subset of the ARM instruction set, designed to be more compact, making it suitable for
memory-constrained systems. Step-by-step breakdown of the code:

1) MOV r3, #0: This instruction initializes register r3 to 0, which will hold the quotient
(DIVide).

2) loop: This is a label marking the start of the loop. It is not an instruction.

3) ADD r3, #1: This instruction increments r3 by 1 in each iteration of the loop.
Thumb has no conditionally executed ADD (there is no ADDGE), so the increment
happens on every pass, including the last one. This is a workaround for the limited
capabilities of Thumb instructions.

4) SUB r0, r1: This instruction subtracts the value in register r1 (divisor) from the
value in register r0 (dividend) and stores the result back in r0. Thumb data processing
instructions on low registers always update the condition flags, so no S suffix is
written.

5) BGE loop: This is a conditional branch instruction that branches back to the loop
label if the previous subtraction (SUB) resulted in a value greater than or equal to
zero, indicating that another iteration of the division process is needed.

6) SUB r3, #1: When the loop exits, r3 is one greater than the quotient, because
ADD r3, #1 also ran on the final pass in which the subtraction went negative.
Decrementing r3 by 1 gives the correct quotient.

7) ADD r2, r0, r1: This instruction calculates the correct remainder by adding the
divisor (r1) back to the last negative result in r0, and stores it in r2.

This code provides a basic implementation of integer division with remainder using
simple arithmetic operations and a loop in Thumb assembly language.
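
The Thumb loop can be modelled in the same spirit. This is a sketch in Python rather than Thumb code: the name thumb_divide is hypothetical, Python integers stand in for the 32-bit registers, and it assumes that the ADD inside the loop increments r3 by 1 on every pass, as the explanation above describes, with a non-negative value and a positive divisor.

```python
def thumb_divide(value, divisor):
    """Model of the Thumb divide loop; returns (modulus, quotient)."""
    r0, r1, r3 = value, divisor, 0   # MOV r3, #0
    while True:
        r3 += 1                      # ADD r3, #1 (unconditional)
        r0 -= r1                     # SUB r0, r1 (updates the flags)
        if r0 < 0:                   # BGE loop not taken
            break
    r3 -= 1                          # SUB r3, #1 (undo the extra count)
    r2 = r0 + r1                     # ADD r2, r0, r1
    return r2, r3


# Compare against Python's built-in divmod for a few inputs.
for value, divisor in [(7, 2), (9, 3), (0, 4), (100, 7)]:
    quotient, modulus = divmod(value, divisor)
    assert thumb_divide(value, divisor) == (modulus, quotient)
```

The extra SUB after the loop is the price of not having a conditional ADD: one more instruction, but each instruction is half the size.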

Calculation of Code Density:

As before, code density is about the memory the instructions occupy, so we count
the instructions in the routine:

MOV r3, #0 : 1 instruction
loop : a label, not an instruction
ADD r3, #1 : 1 instruction
SUB r0, r1 : 1 instruction
BGE loop : 1 instruction (the only conditional instruction)
SUB r3, #1 : 1 instruction
ADD r2, r0, r1 : 1 instruction

Total instructions: 6

The Thumb routine needs one instruction more than a version with a conditional
ADD would, because the loop counts one time too many and the count must be
corrected afterwards. Whether this makes the code denser or not depends on the size
of each instruction, which is calculated next.

Calculation of Memory Usage:

Breakdown of the memory usage:

Instructions:

MOV r3, #0: 2 bytes (every instruction in this routine is a 16-bit Thumb instruction)
loop: 0 bytes (a label occupies no memory)
ADD r3, #1: 2 bytes
SUB r0, r1: 2 bytes
BGE loop: 2 bytes
SUB r3, #1: 2 bytes
ADD r2, r0, r1: 2 bytes
Total instruction memory: 12 bytes

Register values:

r0, r1, r2, r3: These registers are still 32 bits wide in Thumb state; only the
instruction encoding shrinks to 16 bits. Registers are inside the processor core and
are not part of program memory, so they add nothing to the memory usage.

Total memory usage: 6 instructions * 2 bytes = 12 bytes

So, the total memory usage of the Thumb program is 12 bytes.

Even though the Thumb implementation uses more instructions, the overall
memory footprint is reduced.
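
The size comparison can be written out as a short calculation. This sketch counts instruction bytes only, assuming 4 bytes per ARM instruction and 2 bytes per Thumb instruction, and treating labels as occupying no memory; the variable names are illustrative.

```python
ARM_BYTES_PER_INSTRUCTION = 4     # 32-bit encoding
THUMB_BYTES_PER_INSTRUCTION = 2   # 16-bit encoding

# Instructions of each divide routine (the loop label is not an instruction).
arm_instructions = ["MOV", "SUBS", "ADDGE", "BGE", "ADD"]
thumb_instructions = ["MOV", "ADD", "SUB", "BGE", "SUB", "ADD"]

arm_bytes = len(arm_instructions) * ARM_BYTES_PER_INSTRUCTION
thumb_bytes = len(thumb_instructions) * THUMB_BYTES_PER_INSTRUCTION
saving = 1 - thumb_bytes / arm_bytes

print(arm_bytes, thumb_bytes)   # 20 12
print(f"{saving:.0%}")          # 40%
```

For this small routine the saving is larger than the roughly 30% average quoted earlier; real programs mix in cases where Thumb needs noticeably more instructions than ARM.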

Each Thumb instruction is related to a 32-bit ARM instruction. Figure 1 shows a simple
Thumb ADD instruction being decoded into an equivalent ARM ADD instruction.

Table provides a complete list of Thumb instructions available in the THUMBv2
architecture used in the ARMv5TE architecture. Only the branch relative instruction
can be conditionally executed. The limited space available in 16 bits causes the barrel
shift operations ASR, LSL, LSR, and ROR to be separate instructions in the Thumb ISA.

2. Thumb Register Usage:
In Thumb state, we do not have direct access to all registers. Only the low registers r0
to r7 are fully accessible, as shown in Table. The higher registers r8 to r12 are only
accessible with MOV, ADD, or CMP instructions. CMP and all the data processing
instructions that operate on low registers update the condition flags in the CPSR.

Notice from the Thumb instruction set list and from the Thumb register usage table
that there is no direct access to the CPSR or SPSR. In other words, there
are no MSR- and MRS-equivalent Thumb instructions.
To alter the CPSR or SPSR, we must switch into ARM state to use MSR and MRS.
Similarly, there are no coprocessor instructions in Thumb state. We need to be in ARM
state to access the coprocessor for configuring cache and memory management.

{ In ARM state (as opposed to THUMB state), the MRS (Move to Register from Status) and
MSR (Move to Status from Register) instructions are used to transfer data between the ARM
core's general-purpose registers and the Current Program Status Register (CPSR) or the SPSR
(Saved Program Status Register) in a privileged mode. These instructions are used for tasks
like saving and restoring the program status, accessing flags, and controlling program
execution.}

{Banked registers are registers that have a separate physical copy in particular processor
modes. In the exception modes (IRQ, Supervisor, Abort, and Undefined), r13 (sp) and r14 (lr)
are banked; in FIQ mode, r8 to r14 are banked. Registers r0 to r7 and the pc (r15) are never
banked. Banking belongs to the processor mode, not to the instruction set, so it applies in
both ARM and Thumb state.

Banked registers serve several purposes:

1) Return addresses: r14 (lr) is used to store the return address when a subroutine is called,
allowing the program to return to the correct location after the subroutine completes.

2) Interrupt handling: When an exception is taken, the processor automatically saves the
return address in the banked lr (r14) of the new mode to ensure that the program can return
to the correct point after handling the interrupt.

3) Fast Interrupt Requests (FIQ): r8 to r14 are banked in FIQ mode, meaning that when the
processor enters FIQ mode, a separate set of r8 to r14 becomes visible. The previous values
are left untouched in their own bank, so the FIQ handler does not need to save and restore
them.

4) Other modes: Each exception mode has its own banked sp (r13), so each mode can keep its
own stack.

Overall, banked registers play a crucial role in managing program flow, handling interrupts,
and ensuring that the processor can switch between different modes of operation
efficiently.}

Features of THUMB instructions:


3. ARM-THUMB INTERWORKING:

ARM-Thumb interworking is the name given to the method of linking ARM and Thumb
code together for both assembly and C/C++.

It handles the transition between the two states.


Extra code, called a veneer, is sometimes needed to carry out the transition.
ATPCS defines the ARM and Thumb procedure call standards.

To call a Thumb routine from an ARM routine, the core has to change state. This state
change is shown in the T bit of the CPSR. The BX and BLX branch instructions cause a
switch between ARM and Thumb state while branching to a routine. The BX lr
instruction returns from a routine, also with a state switch if necessary.

The BLX instruction was introduced in ARMv5T. On ARMv4T cores the linker uses a
veneer to switch state on a subroutine call. Instead of calling the routine directly, the
linker calls the veneer, which switches to Thumb state using the BX instruction.

There are two versions of the BX or BLX instructions: an ARM instruction and a Thumb
equivalent. The ARM BX instruction enters Thumb state only if bit 0 of the address in
Rn is set to binary 1; otherwise it enters ARM state. The Thumb BX instruction does
the same.

Example:
This example shows a small code fragment that uses both the ARM and Thumb versions of the BX
instruction. You can see that the branch address into Thumb has the lowest bit set. This sets the T bit in
the CPSR to Thumb state.

The return address is not automatically preserved by the BX instruction. Rather the code sets the
return address explicitly using a MOV instruction prior to the branch:

; ARM code
        CODE32                      ; word aligned
        LDR   r0, =thumbCode+1      ; +1 to enter Thumb state
        MOV   lr, pc                ; set the return address
        BX    r0                    ; branch to Thumb code & state
        ; continue here

; Thumb code
        CODE16                      ; halfword aligned
thumbCode
        ADD   r1, #1
        BX    lr                    ; return to ARM code & state

Explanation of the example:

The ARM code does the following:

1) Loads the address of the thumbCode label into register r0, with bit 0 set (the +1)
to request a switch to Thumb state.
2) Moves the value of the program counter (pc) into the link register (lr), setting the
return address. In ARM state, reading the pc gives the address of the current
instruction plus 8, which is the instruction after the BX.
3) Branches to the address in register r0, which is the thumbCode label, and enters
Thumb state.

In the thumbCode section:

1) Adds 1 to the value in register r1.
2) Branches back to the address in the link register (lr). Bit 0 of that address is 0,
so the core returns to the ARM code in ARM state.

Overall, this code demonstrates a transition from ARM to Thumb state and back, with
a simple addition operation performed in Thumb state.

Ques: Why is +1 used in the instruction LDR r0, =thumbCode+1 ; +1 to enter Thumb state?

In ARM assembly, when we load an address using LDR with the = syntax, the assembler
loads the value of the expression into the register as a literal. The expression
thumbCode+1 is the address of the thumbCode label with bit 0 set to 1.

The +1 does not move the branch target. Thumb instructions are halfword aligned, so
bit 0 of a real instruction address is always 0 and is free to carry extra
information. The BX instruction makes use of this: it copies bit 0 of r0 into the
T bit of the cpsr and branches to the address with bit 0 cleared. With
=thumbCode+1, bit 0 is 1, so the processor enters Thumb state and starts executing
at the thumbCode label itself. Without the +1, the core would stay in ARM state and
try to decode the 16-bit Thumb instructions as 32-bit ARM instructions.
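
The BX rule stated earlier (bit 0 of the address in Rn selects the state) can be illustrated with a few lines of Python. This is a simplified model, not processor documentation: the function name bx and the address 0x8000 are hypothetical.

```python
def bx(rn):
    """Model of BX Rn: returns (branch target, True if Thumb state)."""
    thumb_state = bool(rn & 1)    # bit 0 is copied into the T bit
    target = rn & 0xFFFFFFFE      # bit 0 is cleared to form the new pc
    return target, thumb_state


thumb_code = 0x8000               # hypothetical address of thumbCode

print(bx(thumb_code + 1))         # (32768, True): enter Thumb at 0x8000
print(bx(thumb_code))             # (32768, False): same target, ARM state
```

Both calls branch to the same address; only the state differs, which is why the +1 is described as selecting Thumb state rather than changing the destination.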

In the next example, replacing the BX instruction with BLX simplifies the call to
a Thumb routine, since BLX sets the return address in the link register lr itself:

CODE32
LDR r0, =thumbRoutine+1 ; enter Thumb state
BLX r0 ; jump to Thumb code
…………. ; continue here

CODE16
thumbRoutine
ADD r1, #1
BX r14 ; return to ARM code and state
4. Other Branch Instructions
There are two variations of the standard branch instruction, or B.

The first is similar to the ARM version and is conditionally executed; the branch range is limited to a
signed 8-bit immediate, or −256 to +254 bytes.
The second version removes the conditional part of the instruction and expands the effective branch
range to a signed 11-bit immediate, or −2048 to +2046 bytes.

The conditional branch instruction is the only conditionally executed instruction in Thumb state.

The BL instruction is not conditionally executed and has an approximate range of +/−4 MB. This range
is possible because BL (and BLX) instructions are translated into a pair of 16-bit Thumb instructions.

The first instruction in the pair holds the high part of the branch offset, and the second the low part.
These instructions must be used as a pair.
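
The branch ranges quoted above follow from the width of the immediate field. The sketch below assumes that the signed immediate counts halfwords (it is multiplied by 2, since Thumb instructions are 2 bytes long) and that the BL pair carries 11 bits in each half, 22 bits in total; the function name branch_range is hypothetical.

```python
def branch_range(immediate_bits):
    """Byte range reachable by a signed halfword offset of the given width."""
    lowest = -(1 << (immediate_bits - 1)) * 2
    highest = ((1 << (immediate_bits - 1)) - 1) * 2
    return lowest, highest


print(branch_range(8))    # (-256, 254)         conditional B
print(branch_range(11))   # (-2048, 2046)       unconditional B
print(branch_range(22))   # (-4194304, 4194302) BL pair, about +/-4 MB
```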
The code here shows the various instructions used to return from a BL subroutine call:

MOV pc, lr
BX lr
POP {pc}

To return, we set the pc to the value in lr. POP is a stack instruction; it is described under
Stack Instructions.

5. Data Processing Instructions:
The data processing instructions manipulate data within registers. They include move instructions,
arithmetic instructions, shifts, logical instructions, comparison instructions, and multiply instructions.
The Thumb data processing instructions are a subset of the ARM data processing instructions.
These instructions follow the same style as the equivalent ARM instructions. Most Thumb data
processing instructions operate on low registers and update the CPSR. The exceptions are MOV, ADD,
and CMP, which can also operate on the higher registers r8–r14 and the pc. These instructions, except
for CMP, do not update the condition flags in the CPSR when using the higher registers. The CMP
instruction, however, always updates the CPSR.

Example-1:

This example shows a simple Thumb ADD instruction. It takes two low registers r1 and r2 and adds
them together. The result is then placed into register r0, overwriting the original contents. The cpsr is
also updated.

PRE  cpsr = nzcvIFT_SVC
     r1 = 0x80000000
     r2 = 0x10000000

     ADD r0, r1, r2

POST r0 = 0x90000000
     cpsr = NzcvIFT_SVC
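
The flag change from nzcv to Nzcv in Example-1 can be reproduced with a small model of a 32-bit addition. This is a sketch under the usual definitions of the four flags; the function name add_with_flags is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add_with_flags(a, b):
    """32-bit add; returns (result, flags) with flags as a dict of 0/1."""
    full = a + b
    result = full & MASK32
    flags = {
        "N": result >> 31,                              # sign bit of result
        "Z": int(result == 0),                          # result is zero
        "C": int(full > MASK32),                        # unsigned carry out
        "V": (((a ^ result) & (b ^ result)) >> 31) & 1  # signed overflow
    }
    return result, flags


result, flags = add_with_flags(0x80000000, 0x10000000)
print(hex(result))   # 0x90000000
print(flags)         # {'N': 1, 'Z': 0, 'C': 0, 'V': 0}
```

Only N is set: bit 31 of the result is 1, there is no carry out of bit 31, and the operands have different signs, so signed overflow is impossible.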

Example-2:

Thumb deviates from the ARM style in that the barrel shift operations (ASR, LSL, LSR, and ROR) are
separate instructions. This example shows the logical left shift (LSL) instruction to multiply register r2
by 2.

PRE r2 = 0x00000002
r4 = 0x00000001
LSL r2, r4
POST r2 = 0x00000004
r4 = 0x00000001

Single-Register Load-Store Instructions:
The Thumb instruction set supports loading and storing registers, or LDR and STR. These instructions
use two pre-indexed addressing modes: offset by register and offset by immediate.

You can see the different addressing modes in Table 4.3. The offset by register uses a base register Rn
plus the register offset Rm. The second uses the same base register Rn plus a 5-bit immediate or a
value dependent on the data size. The 5-bit offset encoded in the instruction is multiplied by one for
byte accesses, two for 16-bit accesses, and four for 32-bit accesses.
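
The scaling of the 5-bit immediate can be made concrete with a short calculation. This sketch assumes an unsigned 5-bit field (0 to 31), as the paragraph above describes; the function names are hypothetical.

```python
MAX_IMM5 = (1 << 5) - 1   # largest value of a 5-bit field: 31


def largest_offset(access_bytes):
    """Largest byte offset reachable for a given access size (1, 2, or 4)."""
    return MAX_IMM5 * access_bytes


def effective_address(base, imm5, access_bytes):
    """Address accessed by [Rn, #imm5 * size] without changing Rn."""
    return base + imm5 * access_bytes


print(largest_offset(1), largest_offset(2), largest_offset(4))  # 31 62 124
print(hex(effective_address(0x9000, 1, 4)))                     # 0x9004
```

Scaling by the access size lets the same five bits reach further for word accesses, at the cost of only addressing naturally aligned offsets.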

Example: This example shows two Thumb instructions that use a preindex addressing mode. Both use
the same pre-condition.
Both instructions carry out the same operation. The only difference is the second LDR uses a fixed
offset, whereas the first one depends on the value in register r4.

Multiple-Register Load-Store Instructions:

The Thumb versions of the load-store multiple instructions are reduced forms of the ARM load-store
multiple instructions. They only support the increment after (IA) addressing mode.

Here N is the number of registers in the list of registers. You can see that these instructions always
update the base register Rn after execution. The base register and list of registers are limited to the
low registers r0 to r7.

Example:

This example saves registers r1 to r3 to the three consecutive words at memory addresses 0x9000,
0x9004, and 0x9008. It also updates the base register r4, which ends at 0x900c. Note that the update
character ‘!’ is mandatory rather than optional, unlike with the ARM instruction set.

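
The increment-after behaviour of this example can be modelled as follows. This is a sketch only: the function name stmia is hypothetical, the register values 1, 2, and 3 are made up for illustration, and r4 is assumed to start at 0x9000.

```python
def stmia(memory, regs, base, reg_list):
    """Model of STMIA base!, {reg_list}: store words, then update the base."""
    address = regs[base]
    for reg in sorted(reg_list):      # lowest register to lowest address
        memory[address] = regs[reg]
        address += 4                  # increment after each store
    regs[base] = address              # base register is always written back


memory = {}
regs = {1: 0x1, 2: 0x2, 3: 0x3, 4: 0x9000}   # keys are register numbers
stmia(memory, regs, base=4, reg_list=[1, 2, 3])

print({hex(a): v for a, v in memory.items()})
# {'0x9000': 1, '0x9004': 2, '0x9008': 3}
print(hex(regs[4]))   # 0x900c
```
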
Stack Instructions:
The Thumb stack operations are different from the equivalent ARM instructions because they use the
more traditional POP and PUSH concept.

The interesting point to note is that there is no stack pointer in the instruction. This is because the
stack pointer is fixed as register r13 in Thumb operations and sp is automatically updated. The list of
registers is limited to the low registers r0 to r7.

The PUSH register list also can include the link register lr; similarly the POP register list can include the
pc. This provides support for subroutine entry and exit, as shown in Example below.

The stack instructions only support full descending stack operations.

Example:

In this example we use the POP and PUSH instructions. The subroutine ThumbRoutine is called using a
branch with link (BL) instruction.

The link register lr is pushed onto the stack with register r1. Upon return, register r1 is popped off the
stack, as well as the return address being loaded into the pc. This returns from the subroutine.
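
The entry and exit sequence described above can be modelled with a full descending stack. This sketch is not Thumb code: the functions push and pop, the stack address 0x80000, and the register values are all hypothetical, and state switching on return is ignored.

```python
def push(memory, regs, names):
    """Model of PUSH {names}: sp is decremented before each store."""
    for name in reversed(names):          # highest register goes in first
        regs["sp"] -= 4
        memory[regs["sp"]] = regs[name]


def pop(memory, regs, names):
    """Model of POP {names}: sp is incremented after each load."""
    for name in names:                    # lowest register comes out first
        regs[name] = memory[regs["sp"]]
        regs["sp"] += 4


memory = {}
regs = {"r1": 2, "lr": 0x8004, "pc": 0, "sp": 0x80000}

push(memory, regs, ["r1", "lr"])   # subroutine entry: PUSH {r1, lr}
regs["r1"] = 99                    # the routine is free to change r1
pop(memory, regs, ["r1", "pc"])    # subroutine exit:  POP {r1, pc}

print(regs["r1"], hex(regs["pc"]), hex(regs["sp"]))   # 2 0x8004 0x80000
```

Popping the saved lr value straight into the pc restores r1 and returns in a single instruction, which is why PUSH accepts lr and POP accepts pc.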

Software Interrupt Instruction:


A software interrupt instruction (SWI) causes a software interrupt exception, which provides a
mechanism for applications to call operating system routines.

When the processor executes an SWI instruction, it sets the program counter pc to the offset 0x8 in the
vector table. The instruction also forces the processor mode to SVC, which allows an operating system
routine to be called in a privileged mode.

Each SWI instruction has an associated SWI number, which is used to represent a particular function
call or feature.

Example:

Here we have a simple example of an SWI call with SWI number 0x123456, used by ARM toolkits as a
debugging SWI. Typically the SWI instruction is executed in user mode.

Since SWI instructions are used to call operating system routines, you need some form of parameter
passing. This is achieved using registers. In this example, register r0 is used to pass the parameter
0x12. The return values are also passed back via registers.

Code called the SWI handler is required to process the SWI call. The handler obtains the SWI number
using the address of the executed instruction, which is calculated from the link register lr.

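The SWI number is determined by

SWI_Number = <SWI instruction> AND NOT(0xff000000)

Here the SWI instruction is the actual 32-bit SWI instruction executed by the processor. The top 8 bits
of the instruction hold the condition and opcode fields, so a BIC instruction can clear them and leave
the 24-bit SWI number.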

Similar to the ARM equivalent, the Thumb software interrupt (SWI) instruction causes a software
interrupt exception. If any interrupt or exception is raised in Thumb state, the processor
automatically reverts to ARM state to handle the exception.

The Thumb SWI instruction has the same effect and nearly the same syntax as the ARM equivalent. It
differs in that the SWI number is limited to the range 0 to 255 and it is not conditionally executed.
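
The way a handler recovers the SWI number from the instruction word can be sketched in Python. The sketch assumes the standard encodings, a 24-bit number in the low bits of the 32-bit ARM instruction and an 8-bit number in the low bits of the 16-bit Thumb instruction; the function names and the Thumb example value 0xDF45 are illustrative.

```python
def arm_swi_number(instruction):
    """Clear the top 8 bits of a 32-bit ARM SWI instruction (like BIC)."""
    return instruction & ~0xFF000000 & 0xFFFFFFFF


def thumb_swi_number(instruction):
    """Keep the low 8 bits of a 16-bit Thumb SWI instruction."""
    return instruction & 0xFF


print(hex(arm_swi_number(0xEF123456)))   # 0x123456
print(hex(thumb_swi_number(0xDF45)))     # 0x45
```

The narrower field is what limits the Thumb SWI number to the range 0 to 255.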

Example:

This example shows the execution of a Thumb SWI instruction. Note that the processor goes from
Thumb state to ARM state after execution.
