Embedded Systems UEC513 D MST

The document outlines a course on Embedded Systems (UEC513) taught by Dr. Debabrata Ghosh, focusing on the design and programming of ARM processors and their interfacing with peripherals. It covers various architectures, classifications, and applications of embedded systems, as well as the constraints and characteristics of different processor types. The course aims to equip students with the skills to develop embedded systems and includes a syllabus with specific learning objectives and recommended textbooks.

Uploaded by

Naman Sangar

Embedded Systems

(UEC513)

By:
Dr. Debabrata Ghosh
Assistant Professor, ECED

• Use this PPT sensibly. It is for reference only and is by no means comprehensive or
self-sufficient for your exams. Use the course syllabus along with the textbooks for
comprehensive knowledge.
UEC513 : EMBEDDED SYSTEMS
L T P Cr
3 0 2 4.0
Course Objective: The objective of this course is to equip students with the fundamental
knowledge and skills needed to design basic embedded systems. It covers the architecture and
programming of the ARM processor and its interfacing with peripheral devices.
Syllabus
Introduction to Embedded Systems: Definition, Embedded Systems Vs General Computing
Systems, Classification of Embedded Systems, Major application areas. General purpose processor
architecture and organization, Von-Neumann and Harvard architectures, CISC and RISC architectures,
Big and Little endian processors, Processor design trade-offs, Processor cores: soft and hard.
Introduction to ARM Processor: The ARM design philosophy, ARM core data flow model,
Architecture, Register set, ARM7TDMI Interface signals, General Purpose Input Output Registers,
Memory Interface, Bus Cycle types, Pipeline, ARM processors family, Operational Modes, Instruction
Format, Data forwarding.
Programming based on ARM7TDMI: ARM Instruction set, condition codes, Addressing modes,
Interrupts, Exceptions and Vector Table. Assembly Language Programming, Thumb state, Thumb
Programmers model, Thumb Applications, ARM coprocessor interface and Instructions.
ARM Tools and Interfacing of Peripherals: ARM Development Environment, Arm Procedure Call
Standard (APCS), Example C/C++ programs, Embedded software development, Image structure,
linker inputs and outputs, Protocols (I2C, SPI), Memory Protection Unit (MPU). Physical Vs Virtual
Memory, Paging, Segmentation. The Advanced Microcontroller Bus Architecture (AMBA), DMA,
Peripherals, Interfacing of peripherals with ARM.
Course Learning Objectives (CLO)
The students will be able to:
1. Explain an embedded system and its processor architecture, and distinguish it from a
general computing system.
2. Describe the ARM processor's internal architecture, its assembly instructions and their format, and
develop an ARM processor-based assembly language program for a given statement.
3. Describe how Thumb mode operations are designed and how various coprocessors are
interfaced in an embedded system.
4. Interface various hardware peripherals in embedded systems.
5. Recognize issues to be handled in any processor software tool chain for embedded
system development, especially using C/C++.

Text Books
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Naraig Manjikian, “Computer
Organization and Embedded Systems”, Sixth Edition, McGraw Hill, 2012.
2. Steve Furber, “ARM System-on-Chip Architecture”, Second Edition, Pearson, 2013.

Reference Books
1. Stephen Welsh, Peter Knaggs, “ARM: Assembly Language Programming”, Bournemouth
University Publication, 2003.
2. Andrew N. Sloss, Dominic Symes, Chris Wright, “ARM System Developer's Guide:
Designing and Optimizing System Software”, Elsevier Publication.
Role of Address, Data and Control Buses
Address Bus:
• Used by the processor to identify any I/O device or memory
location.
• Each assigned address must be unique.
• The processor sends the address on the address lines and a
decoding circuit selects the corresponding device.
• More address lines mean that more devices can be
interfaced with the CPU.
2^m = n
m: number of address lines
n: number of memory locations that can be addressed
▪ The address bus is unidirectional
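As a quick check of the relation between m address lines and n addressable locations (n = 2^m), the sketch below computes it in both directions. The function names are illustrative only and are not part of the course material.

```python
def addressable_locations(address_lines: int) -> int:
    """Number of distinct locations n selectable with m address lines (n = 2**m)."""
    if address_lines < 0:
        raise ValueError("number of address lines cannot be negative")
    return 2 ** address_lines


def address_lines_needed(locations: int) -> int:
    """Smallest m such that 2**m >= locations."""
    if locations < 1:
        raise ValueError("need at least one location")
    return (locations - 1).bit_length()


if __name__ == "__main__":
    for m in (8, 16, 32):
        print(f"{m} address lines -> {addressable_locations(m)} locations")
```

For example, 16 address lines give 65536 locations, and a 32-bit address bus such as that of the ARM7TDMI gives 2^32 locations.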
Data Bus:
• Data is sent or received over the data bus.
Its width indicates the data handling capacity of the CPU.
• A wider data bus means a more expensive CPU and
computer, but a higher data handling capacity
• The data bus is bidirectional
Control bus:
• Used to decide the direction of data transfer, either send or
receive.
Block diagram of a general purpose microprocessor
Embedded System Definitions
1. An embedded system is a system that has software embedded into
computer hardware, which makes it dedicated to an application(s),
a specific part of an application or product, or part of a larger
system.
2. An embedded system is one that has dedicated-purpose software
embedded in computer hardware.
3. It is a dedicated computer-based system for an application(s) or
product. It may be an independent system or a part of a larger system.
Its software is usually embedded in ROM (Read Only Memory) or
flash.
4. It is any device that includes a programmable computer but is not
itself intended to be a general purpose computer.
5. Embedded Systems are the electronic systems that contain a
microprocessor or a microcontroller, but we do not think of them as
computers– the computer is hidden or embedded in the system.
Applications of Embedded System
• 1. Consumer Electronics: Camcorders, Cameras.
• 2. Household appliances: Washing machine, Refrigerator.
• 3. Automotive industry: Anti-lock braking system (ABS), engine control.
• 4. Home automation & security systems: Air conditioners, sprinklers, fire
alarms.
• 5. Telecom: Cellular phones, telephone switches
• 6. Computer peripherals: Printers, scanners.
• 7. Computer networking systems: Network routers and switches.
• 8. Healthcare: EEG, ECG machines.
• 9. Banking & Retail: Automatic teller machines, point of sales.
• 10. Card Readers: Barcode, smart card readers.

Embedded systems vs general
computing system
1. The computing revolution began with general-purpose computing, which is
not sufficient for embedded computing (which requires meeting
computational deadlines, power-efficiency targets and limited
memory availability)
2. Personal computer: built around a general-purpose processor,
supports multiple peripherals, runs many applications
3. DVD player: an embedded system built specifically for decoding digital
video and generating video output. It is not possible to change its OS,
to change its embedded software to make it work as a TV,
or to install printer software on it
4. The demarcation is shrinking in certain embedded applications, such as the
smartphone. A smartphone is not meant for any specific application
or dedicated purpose (so, in that sense, it is not an embedded system), yet its
OS cannot be altered by the end user (so, in that sense, it is an embedded
system)
Classification of Embedded systems
Based on generation:
Order in which embedded systems evolved
• First generation: built around 8-bit microprocessors and 4-bit
microcontrollers, simple hardware circuits, assembly firmware. Ex:
telephone keypad
• Second generation: built around 16-bit microprocessors and 8 or
16-bit microcontrollers, complex and powerful instruction set,
embedded OS. Ex: SCADA
• Third generation: built around 32-bit processors and 16-bit
microcontrollers, application and domain specific processors,
complex and powerful instruction set, instruction pipeline, embedded
OS
• Fourth generation: SoC, reconfigurable processors, high
performance, tight integration, miniaturization, real time embedded
OS. Ex: smart phone
Classification of Embedded systems
Based on complexity & performance:
• Small scale: built around low performance and low cost 8
or 16-bit microprocessors/microcontrollers, simple
application, not time critical, may or may not contain OS,
Ex. electronic toy
• Medium scale: built around medium performance, low
cost 16 or 32-bit microprocessors/microcontrollers,
complex hardware and firmware, usually contain
embedded OS, Ex. Automated Teller Machine (ATM)
• Large scale: highly complex hardware and firmware, high
performance, 32 or 64-bit RISC processors, PLDs, may
contain coprocessors, contains RTOS, Ex. Autonomous
Vehicle Control System
Classification of Embedded systems
Based on functional requirements:
• Real-time embedded system: strictly time specific; used in
defence, medical and healthcare applications. Ex: traffic control
   – Soft real-time: deadline is not strictly followed. Ex: weather
     monitoring system, TV remote, washing machine
   – Hard real-time: deadline is strictly followed. Ex: air traffic control
     systems, weapon defence systems, alarm for gas leakage,
     pacemaker
• Stand alone embedded systems: independent systems.
Ex: MP3 player, microwave oven
• Networked embedded system: connected to a network.
Ex: Home security, ATM
• Mobile embedded system: small, easy to use, portable.
Ex: MP3 player, mobile phone, digital camera
Classification of Embedded systems
Based on triggering:
• Reactive embedded systems can be classified by how they are
triggered.
   – Event triggered: Event-triggered systems respond to
     external stimuli or events. They execute tasks when specific
     events occur. Ex: automotive airbag system, activated by
     the crash sensor's signal.
   – Time triggered: Time-triggered systems operate
     based on predefined schedules or time intervals. Tasks are
     scheduled and executed at specific times, ensuring
     deterministic behavior. Ex: automated irrigation system,
     watering plants at specific intervals.
Constraints of an Embedded system

⮚ Design
⮚ Available system-memory
⮚ Available processor speed
⮚ Limited power dissipation when running the system
continuously in cycles of the system start, wait for event,
wake-up and run, sleep and stop
⮚ Performance
⮚ Size
⮚ Non-recurring design cost, and manufacturing costs

Classifications of computer
architecture
On the basis of hardware:
⮚ Von Neumann Architecture
⮚ Harvard Architecture

On the basis of software:


⮚ RISC
⮚ CISC
Von Neumann architecture

The design of a Von Neumann architecture machine is simpler than that of
a Harvard architecture machine, which is also a stored-program system but has
one dedicated set of address and data buses for reading data from and writing
data to memory, and another set of address and data buses for instruction
fetching.
Harvard architecture

The Harvard architecture is a computer architecture with
physically separate storage and signal pathways (address and
data buses) for instructions and data.
Von-Neumann vs Harvard Architecture

Von-Neumann                                              | Harvard
⮚ The data and program are stored in the same memory.    | ⮚ The data and program memories are separate.
⮚ Shares a single common bus for data as well as         | ⮚ Has separate buses for data and instruction
  instruction fetching.                                   |   fetching.
⮚ Pipelining is complex. Low performance as compared     | ⮚ Easier to pipeline, so high performance.
  to Harvard.                                             |
⮚ Slow                                                   | ⮚ Faster
⮚ Cheaper                                                | ⮚ More costly
⮚ Efficient utilization of memory space                  | ⮚ Wastage of memory
⮚ The data and instructions cannot be fetched at the     | ⮚ The data and instructions can be fetched at the
  same time.                                              |   same time.
Processor architecture: CISC vs RISC
Multiply two numbers in memory: source code a = a*b (where a, b are memory locations,
a = 2:3, b = 5:2)

CISC:
MULT 2:3, 5:2

RISC:
LOAD A, 2:3
LOAD B, 5:2
PROD A, B
STORE 2:3, A

CISC
1) CISC instructions operate on memory and registers
2) Generally multiple cycle instructions, variable length instructions
3) More addressing modes
4) Few registers
5) Instruction set large and complex
6) Decoding complex
7) Pipeline harder and poor
8) Small program for a specific task
9) More emphasis on hardware
10) Use of microprogrammed control unit
11) E.g. Intel, AMD x86

RISC
1) RISC instructions operate mainly on registers
2) Generally single cycle instructions, fixed length instructions
3) Few addressing modes
4) Lots of registers
5) Instruction set reduced and simple
6) Decoding simple
7) Pipeline easier and excellent
8) Large program for a specific task
9) More emphasis on software
10) Use of hardwired control unit
11) E.g. ARM, DLX, PIC
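The a = a*b example can be mimicked with a toy model. This is only a sketch: the memory is assumed to be a dictionary keyed by the "row:column" locations used on the slide, and the function names are made up for illustration.

```python
# Toy model (assumption): memory is a dict keyed by "row:column" strings,
# matching the 2:3 and 5:2 locations used in the slide example.

def cisc_mult(memory, dst, src):
    """CISC-style MULT: a single instruction reads both operands from memory,
    multiplies them and writes the product back to memory."""
    memory[dst] = memory[dst] * memory[src]
    return 1  # instructions executed


def risc_mult(memory, dst, src):
    """RISC-style sequence: only LOAD and STORE touch memory,
    PROD works purely on registers A and B."""
    regs = {}
    regs["A"] = memory[dst]              # LOAD A, 2:3
    regs["B"] = memory[src]              # LOAD B, 5:2
    regs["A"] = regs["A"] * regs["B"]    # PROD A, B
    memory[dst] = regs["A"]              # STORE 2:3, A
    return 4  # instructions executed


if __name__ == "__main__":
    mem_cisc = {"2:3": 6, "5:2": 7}
    mem_risc = dict(mem_cisc)
    print(cisc_mult(mem_cisc, "2:3", "5:2"), mem_cisc["2:3"])
    print(risc_mult(mem_risc, "2:3", "5:2"), mem_risc["2:3"])
```

Both versions leave the same product in memory. The RISC version needs more instructions (a larger program), but each one is simple and touches either memory or registers, never both.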
Processor cores: soft and hard
❖ Design approaches for implementing processor cores in IC
❖ Hard core: not reconfigurable, components are fixed, integrated into chip
during manufacturing
❖ Soft core: implemented using configurable hardware through the use of
programmable logic device, such as FPGA

Soft core and hard core tradeoff:


Environment

❖ Physical (actual hardware) and virtual (emulation of hardware)


❖ Hard core solution: lack of complete system adaptability limits
accomplishment in virtual environment
❖ Soft core solution: entire system can be simulated and verified in virtual
environment
Processor cores: soft and hard
Soft core and hard core tradeoff:

Visibility of signal behaviors


❖ Hardware debug and visibility of signals critical for diagnosis
❖ Hard core solution: impossible to monitor signal transitions
❖ Soft core solution: System level debugging tool acts as virtual logic
analyser displaying any signal in the circuit

Design flexibility
❖ How easily development platform can be expanded: 1) adding IP
(Intellectual Property) blocks or 2) use parallel, serial interfaces
❖ Hard core solution: use of standard interfaces
❖ Soft core solution: use of digital IP cores
Processor cores: soft and hard
Soft core and hard core tradeoff:

Cost
❖ Hard core solution: cost effective
❖ Soft core solution: expensive system
Difference in price between microcontroller and FPGA
❖ Low price of microcontroller: Advanced technology and high volume
production
❖ High price of FPGA: not so advanced technology and less volume
production

Power consumption
Design system for high energy efficiency
❖ Hard core solution: power saving modes, very little power consumption
❖ Soft core solution: FPGA less power efficient
Endianness
Introduction to ARM
⮚ Key component in embedded systems.

⮚ ARM cores are used in mobile phones, handheld organizers (PDA),
portable consumer devices, automobile industry, networking, security
systems.

⮚ Originally Acorn RISC Machine, but now called Advanced RISC
Machines.

⮚ Development started in 1985.

⮚ Over 1 billion ARM processors were sold by 2001. ARM7TDMI was the
most successful ARM core.
Introduction
ARM7TDMI

T: supports both ARM (32-bit) and Thumb (16-bit) instruction sets

D: contains Debug extensions. The debug extensions provide the
mechanism by which normal operation of the processor can be suspended
for debug, including the input signal ports to trigger this behavior.

M: Enhanced (relative to earlier ARM cores) 32x8 Multiplier block.

I: EmbeddedICE macrocell. The EmbeddedICE macrocell consists of
on-chip logic to support debug operations.
Introduction
⮚ 32-bit processor
⮚ 32-bit ALU
⮚ 32-bit data bus
⮚ 32-bit instructions
⮚ 32-bit address bus
⮚ Von-Neumann model
⮚ 3-stage pipeline (Fetch, Decode, Execute)
⮚ 37 registers, 32 bits each
⮚ Load-store model
⮚ 7 operating modes
⮚ 7 interrupts/exceptions
⮚ 7 addressing modes
⮚ 3 data formats (8, 16, 32-bit)

Mode              Mode identifier
User              usr
Fast interrupt    fiq
Interrupt         irq
Supervisor        svc
Abort             abt
System            sys
Undefined         und

Interrupts
1. Reset
2. Undefined
3. Prefetch Abort
4. Data Abort
5. S/W Interrupt
6. IRQ
7. FIQ
Features of ARM Processors
⮚ 32 bit RISC processor.

⮚ High Code density.( Less memory)

⮚ Hardware Debug Technology.

⮚ Load store architecture.

⮚ Mostly Single Cycle Execution except variable cycle execution for certain instructions.

⮚ Inline barrel shifter.

⮚ Thumb 16 bit instruction set.

⮚ Conditional execution: An instruction is only executed when a specific condition has
been satisfied.
Features Continued
⮚ Enhanced Instructions: DSP

⮚ Large 16 x 32 register file.

⮚ Uniform and Fixed op code width of 32 bits to ease decoding and pipelining.

⮚ Powerful indexed addressing modes.

⮚ Simple, but fast, 2-priority-level interrupt subsystem with switched register
banks.

⮚ Good ratio of speed (a few MHz to GHz) to power consumption

⮚ Based on Von Neumann architecture or Harvard architecture


Registers of ARM
Register Example: User to IRQ Mode

Registers in use (User mode)    Registers in use (IRQ mode)
r0 to r12                       r0 to r12
r13 (sp)                        r13_irq
r14 (lr)                        r14_irq
r15 (pc)                        r15 (pc)
cpsr                            cpsr
                                spsr_irq

On the exception:
⮚ Return address calculated from User mode PC value and stored in IRQ mode LR
⮚ User mode CPSR copied to IRQ mode SPSR


Register Example: User to FIQ Mode

Registers in use (User mode)    Registers in use (FIQ mode)
r0 to r7                        r0 to r7
r8 to r12                       r8_fiq to r12_fiq
r13 (sp)                        r13_fiq
r14 (lr)                        r14_fiq
r15 (pc)                        r15 (pc)
cpsr                            cpsr
                                spsr_fiq

On the exception:
⮚ Return address calculated from User mode PC value and stored in FIQ mode LR
⮚ User mode CPSR copied to FIQ mode SPSR


Registers of ARM
⮚ The ARM has total of 37 registers.
⮚ 1 dedicated program counter
⮚ 1 dedicated current program status register
⮚ 5 dedicated saved program status registers
⮚ 30 general purpose registers

⮚ In all ARM processors, the following registers are available and
accessible in any processor mode:
⮚ 13 general-purpose registers R0-R12.
⮚ One Stack Pointer (SP), R13.
⮚ One Link Register (LR), R14.
⮚ One Program Counter (PC), R15.
⮚ CPSR
Registers of ARM
⮚ There is a standard set of eight general purpose registers that are always
available (R0–R7) no matter which mode the processor is in.

⮚ These registers are truly general purpose, with no special uses being placed
on them by the processor's architecture.

⮚ A few registers (R8–R12) are common to all processor modes with the
exception of the fiq mode.

⮚ When the processor is in the fast interrupt mode these registers are replaced
with a different set of registers (R8_fiq – R12_fiq)
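The banking scheme and the total of 37 registers can be checked with a small model. This is a sketch under the assumption of a classic core such as the ARM7TDMI; the table and function names are illustrative, not an official API.

```python
# Registers that each mode replaces with its own banked copy.
# User and System modes share the same register bank.
BANKED = {
    "usr": [],
    "sys": [],
    "fiq": ["r8", "r9", "r10", "r11", "r12", "r13", "r14"],
    "irq": ["r13", "r14"],
    "svc": ["r13", "r14"],
    "abt": ["r13", "r14"],
    "und": ["r13", "r14"],
}


def physical_register(mode: str, name: str) -> str:
    """Map an architectural name (r0..r15, cpsr, spsr) to the physical
    register that is actually accessed in the given mode."""
    if name == "spsr":
        if mode in ("usr", "sys"):
            raise ValueError("User and System modes have no SPSR")
        return f"spsr_{mode}"
    if name in BANKED[mode]:
        return f"{name}_{mode}"
    return name


def all_physical_registers() -> set:
    """Collect every distinct physical register reachable from any mode."""
    names = [f"r{i}" for i in range(16)] + ["cpsr"]
    regs = set()
    for mode in BANKED:
        for name in names:
            regs.add(physical_register(mode, name))
        if mode not in ("usr", "sys"):
            regs.add(physical_register(mode, "spsr"))
    return regs


if __name__ == "__main__":
    print(len(all_physical_registers()))
    print(physical_register("fiq", "r8"), physical_register("irq", "r8"))
```

The count breaks down as 16 registers visible in User mode (r0 to r15), 1 CPSR, 7 banked FIQ registers, 2 banked registers for each of the other four exception modes, and 5 SPSRs.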
Registers of ARM
⮚ The general purpose registers can be used to handle 8-bit bytes,
16-bit half words, or 32-bit words.

⮚ When we use a 32-bit register in a byte instruction only the
least significant 8 bits are used.

⮚ In a half word instruction only the least significant 16 bits are
used.

⮚ The remaining registers (R13 – R15) are special purpose
registers and have very specific roles.
Registers of ARM
⮚ R13 is also known as the Stack pointer, while R14 is known
as the Link Register, and R15 is the program counter.

⮚ The “user” (usr) and “System” (sys) modes share the same
registers.

⮚ There are also one or two status registers depending on which
mode the processor is in.

⮚ The Current Program Status Register (CPSR) holds information
about the current status of the processor (including its current
mode)
Registers of ARM
⮚ In the exception modes there is an additional Saved Program Status Register
(SPSR) which holds information on the processor's state before the system
changed into this mode, i.e. the processor status just before an exception.
Stack pointer, SP or R13
⮚ Register R13 is used as a stack pointer and is also known as the SP register.

⮚ Each exception mode has its own version of R13, which points to a stack
dedicated to that exception mode.

⮚ The stack is typically used to store temporary values.


The link register, LR or r14
⮚ Register R14 is also known as the Link register or LR

⮚ It is used to hold the return address of a subroutine.

⮚ When an exception occurs, the exception mode’s version of R14 is set to the
address after the instruction which has just been completed.

⮚ The SPSR is a copy of the CPSR just before the exception occurred.
The Program Counter, PC or r15
⮚ Register r15 holds the Program Counter known as the PC.

⮚ It is used to identify which instruction is to be performed next.

⮚ As the PC holds the address of the next instruction it is often referred to as
an instruction pointer.
Current Program Status Register
(CPSR)
⮚ Current program status register (CPSR) contains the current status of the
processor.

⮚ This includes various condition flags, interrupt status, processor mode,
and other status and control information.

⮚ The exception modes also have a saved program status register (SPSR),
that is used to preserve the value of CPSR when the associated exception
occurs.

⮚ Because the User and System modes are not exception modes, there is no
SPSR available.
Bit pattern Current Program Status
Register (CPSR)
Current Program Status Register (CPSR)

⮚ The processor's status is split into two distinct parts: the condition
flags and the system control field.

⮚ Each processor mode is either privileged or non-privileged.

⮚ A privileged mode allows full read-write access to the CPSR.

⮚ A non-privileged mode allows only read access to the control fields but
allows read-write access to the condition flags.

⮚ Any bit not currently used is reserved for future use and should be
zero.

⮚ The I and F bits are interrupt disable bits: when set, they disable
interrupts (I) or fast interrupts (F).
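To make the CPSR layout concrete, the following sketch splits a 32-bit CPSR value into its condition flags (bits 31 to 28) and its control field (I, F, T and the five mode bits). The bit positions and mode encodings are the standard ones for classic ARM cores; the function and table names are illustrative.

```python
# Mode field encodings, CPSR bits [4:0].
MODE_NAMES = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abt",
    0b11011: "und",
    0b11111: "sys",
}


def decode_cpsr(cpsr: int) -> dict:
    """Split a 32-bit CPSR value into condition flags and control fields."""
    return {
        "N": (cpsr >> 31) & 1,  # negative
        "Z": (cpsr >> 30) & 1,  # zero
        "C": (cpsr >> 29) & 1,  # carry
        "V": (cpsr >> 28) & 1,  # overflow
        "I": (cpsr >> 7) & 1,   # 1 = IRQ disabled
        "F": (cpsr >> 6) & 1,   # 1 = FIQ disabled
        "T": (cpsr >> 5) & 1,   # 1 = Thumb state
        "mode": MODE_NAMES.get(cpsr & 0x1F, "reserved"),
    }


if __name__ == "__main__":
    print(decode_cpsr(0x600000D3))
```

For instance, 0x600000D3 decodes to Z = 1 and C = 1, both interrupts disabled, ARM state, Supervisor mode.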
Modes of ARM Processor
ARM has 7 modes

⮚ Six of these are privileged (allow full read-write access to the CPSR):
1. Abort (entered after a failed attempt to access memory, also used for memory protection)
2. Fast Interrupt request (fast interrupt for high speed)
3. Interrupt request (used for general purpose interrupt handling)
4. Supervisor (entered after reset, used for OS kernel operation)
5. System (special version of user mode)
6. Undefined (entered on an undefined instruction, supports software emulation of hardware
coprocessors)

⮚ One is non-privileged (allows read access to the control field and read-write access to the
condition flags):
⮚ User mode (normal programs and applications)
Modes of ARM Processor
[Figure: diagram of the seven modes: USER, SYS, SVC (SWI/RESET), IRQ, FIQ, ABT and UND; USER mode enters a privileged mode through a system call]
Internal Architecture of ARM

• Arrows represent flow of data, lines represent buses
• Load-store architecture: only load and store instructions can directly
access memory
• Register file: a storage bank made up of 32-bit registers
• MAC: Multiply-accumulate unit
Exceptions
⮚ ARM supports seven types of exception, and provides
privileged processor modes to handle each type.
Exception Handling
and the Vector Table
• When an exception occurs, the core:
– Copies CPSR into SPSR_<mode>
– Sets appropriate CPSR bits:
   ● State: if the core implements ARM Architecture 4T and is currently in
     Thumb state, then ARM state is entered
   ● Mode field bits
   ● Interrupt disable flags, if appropriate
– Maps in appropriate banked registers
– Stores the “return address” in LR_<mode>
– Sets PC to vector address

• To return, the exception handler needs to:
– Restore CPSR from SPSR_<mode>
– Restore PC from LR_<mode>
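The entry and return steps can be traced with a small state model. This is a simplified sketch: it assumes the standard low vector addresses, takes the return address as an argument instead of computing the per-exception PC offset, and treats Reset like the other exceptions even though a real core leaves LR and SPSR undefined after reset. All names are illustrative.

```python
# Vector address and target mode bits for each exception.
EXCEPTIONS = {
    "reset":          (0x00, 0b10011),  # Supervisor
    "undefined":      (0x04, 0b11011),  # Undefined
    "swi":            (0x08, 0b10011),  # Supervisor
    "prefetch_abort": (0x0C, 0b10111),  # Abort
    "data_abort":     (0x10, 0b10111),  # Abort
    "irq":            (0x18, 0b10010),  # IRQ
    "fiq":            (0x1C, 0b10001),  # FIQ
}
MODE_SUFFIX = {0b10011: "svc", 0b11011: "und", 0b10111: "abt",
               0b10010: "irq", 0b10001: "fiq"}
I_BIT, F_BIT, T_BIT = 1 << 7, 1 << 6, 1 << 5


def enter_exception(state: dict, kind: str, return_address: int) -> str:
    """Apply the exception entry steps to `state`; returns the mode suffix."""
    vector, mode = EXCEPTIONS[kind]
    suffix = MODE_SUFFIX[mode]
    state[f"spsr_{suffix}"] = state["cpsr"]      # copy CPSR into SPSR_<mode>
    cpsr = (state["cpsr"] & ~0x1F) | mode        # set the mode field bits
    cpsr &= ~T_BIT                               # enter ARM state
    cpsr |= I_BIT                                # disable IRQ
    if kind in ("reset", "fiq"):
        cpsr |= F_BIT                            # these also disable FIQ
    state["cpsr"] = cpsr & 0xFFFFFFFF
    state[f"lr_{suffix}"] = return_address       # store return address in LR_<mode>
    state["pc"] = vector                         # set PC to the vector address
    return suffix


def return_from_exception(state: dict, suffix: str) -> None:
    """Restore CPSR from SPSR_<mode> and PC from LR_<mode>."""
    state["cpsr"] = state[f"spsr_{suffix}"]
    state["pc"] = state[f"lr_{suffix}"]


if __name__ == "__main__":
    cpu = {"cpsr": 0x60000010, "pc": 0x8000}
    mode = enter_exception(cpu, "irq", 0x8004)
    print(mode, hex(cpu["cpsr"]), hex(cpu["pc"]))
    return_from_exception(cpu, mode)
    print(hex(cpu["cpsr"]), hex(cpu["pc"]))
```

Using a separate `lr_<mode>` and `spsr_<mode>` entry per mode mirrors the banked registers: taking an IRQ does not overwrite the User mode link register.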
Exception Handling
and the Vector Table
Sequence of Execution of Exception
⮚ When an exception occurs, the processor halts execution after
the current instruction.

⮚ The state of the processor is preserved in the Saved Program
Status Register (SPSR) so that the original program can be
resumed when the exception routine has completed.

⮚ The address of the instruction the processor was just about to
execute is placed into the Link register of the appropriate
processor mode.

⮚ The processor is now ready to begin execution of the exception
handler.
Exceptions Execution
The exception handlers are located at pre-defined locations known as
exception vectors. It is the responsibility of the operating system to provide
suitable exception handling.
Exceptions
⮚ When an exception occurs, some of the standard registers are replaced with
registers specific to the exception mode.

⮚ All exception modes have their own Stack Pointer (SP) and Link (LR) registers.

⮚ The fast interrupt mode has more registers (r8_fiq – r12_fiq) for fast interrupt
processing.
Exceptions
The seven exceptions are

⮚ Reset occurs when the reset pin is held low. This normally happens when the system is first turned
on or when the reset button is pressed.

⮚ Software Interrupt is generally used to allow user mode to transition to a
privileged mode, typically to request services from the operating system.

⮚ The user program executes a software interrupt (SWI) instruction with an
argument which identifies the function the user wishes to perform.

⮚ Undefined Instruction occurs when an attempt is made to execute an undefined
instruction. This normally happens when there is a logical error in the program and
the processor starts to execute data rather than instructions.
Exceptions
⮚ Prefetch Abort occurs when the processor attempts to fetch an instruction from memory that does not
exist or the processor has executed the breakpoint (BKPT) instruction.
⮚ Data Abort occurs when attempting to access a word on a non-word-aligned
boundary. The lower two bits of a memory address must be zero when accessing a word.
⮚ Interrupt occurs when an external device asserts the IRQ (Interrupt) pin on the
processor. This can be used by an external device to request attention from the
processor.
⮚ Fast Interrupt occurs when an external device asserts the FIQ (fast interrupt) pin.
This is designed to support data transfer and has sufficient private registers to remove
the need for register saving in such applications. A fast interrupt cannot be
interrupted.
Mode selection using CPSR
Architecture Revisions

⮚ ISA : Instruction Set Architecture


⮚ Nomenclature
⮚ ARM {x}{y}{z}-{T}{D}{M}{I}{E}{J}{F}{S}
⮚ x: family, y: memory management/protection unit,
⮚ z: cache
⮚ T: Thumb
⮚ D: Debugger (on-chip debug support)
⮚ M: Extended multiplier (includes multiply instructions)
⮚ I: EmbeddedICE macrocell (allows breakpoints and watchpoints to be set)
⮚ E: Enhanced instructions (for DSP)
⮚ J: Java acceleration by Jazelle (for Java code)
⮚ F: Vector floating point unit
⮚ S: Synthesizable version (core is provided as source code which can be
modified and used by EDA tools)
ARM Cores
ARM Processor Family
ARM Processor Family

https://en.wikipedia.org/wiki/List_of_ARM_microarchitectures
Instruction Set
The ARM Instruction set can be divided into six broad classes of
instruction

Data Movement

Arithmetic

Memory Access

Logical and bit manipulation

Flow Control

System Control/ Privileged


Instruction Mnemonic
Condition code (cc) Mnemonic

ARM instructions
Type of operation:

Arithmetic

Branch

Load and Store

Logical

Move
Arithmetic Instructions
ADD Add

ADC Add with carry

SUB Subtract

SBC Subtract with carry

RSB Reverse subtract

RSC Reverse subtract with carry

MUL Multiply

MLA Multiply and accumulate

UMULL Multiply - unsigned long

UMLAL Multiply and accumulate - unsigned long

SMULL Multiply - signed long

SMLAL Multiply and accumulate - signed long

CMP Compare

CMN Compare negative


Branch Instructions:

B Branch
BL Branch with link
Load and Store Instructions
LDR Load word
LDRB Load byte
LDRSB Load signed byte
LDRH Load half word
LDRSH Load signed half word
LDM Load multiple
LDM sp! Pop
STR Store word
STRB Store byte
STRH Store half word
STM Store multiple
STM sp! Push
Logical Instructions:

AND AND
EOR Exclusive OR
ORR OR
BIC Bit clear
TST Test
TEQ Test equivalence
Move Instructions
MOV Move
MVN Move negated (bitwise NOT of the operand)
SWP Swap
SWPB Swap byte
MRS Move program status register to register
MSR Move register to program status register
Arithmetic Instruction
Add
Syntax: ADD{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Adds the value in Rn to Operand2 and places the sum in Rd.
Condition flags: If S is specified then all flags are updated according to the result.
Examples:
ADD R7, R4, #99 ;adds 99 to the value in R4 and places the sum in R7
ADD R1, R2, R3 ;adds the value in R3 to the value in R2 and places the sum in R1
Add with carry
Syntax: ADC{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Adds the value in Rn to Operand2 and adds another 1 if the carry flag is
set (C = 1). The sum is placed in Rd.
Condition flags: If S is specified then all flags are updated according to the
result.
Examples:
ADC R7, R4, #99 ;adds 99 to the value in R4 and adds another 1 if the carry flag
is set. Places the sum in R7
ADC R1, R2, R3 ;adds the value in R3 to the value in R2 and adds 1 if the carry
flag is set. Places the sum in R1
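A common use of ADC is multi-word addition: ADDS adds the low words and sets the carry flag, then ADC adds the high words plus that carry. The sketch below models this in Python with 32-bit masking. The helper names and the register assignment in the comments are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def adds(rn: int, op2: int):
    """Model of ADDS: 32-bit add, returns (result, carry_out)."""
    total = (rn & MASK32) + (op2 & MASK32)
    return total & MASK32, total >> 32


def adc(rn: int, op2: int, carry: int):
    """Model of ADC: adds rn, op2 and the incoming carry flag."""
    total = (rn & MASK32) + (op2 & MASK32) + carry
    return total & MASK32, total >> 32


def add64(a_lo: int, a_hi: int, b_lo: int, b_hi: int):
    """64-bit addition built from two 32-bit steps."""
    lo, carry = adds(a_lo, b_lo)     # ADDS R4, R0, R2  (low words, sets C)
    hi, _ = adc(a_hi, b_hi, carry)   # ADC  R5, R1, R3  (high words + C)
    return lo, hi


if __name__ == "__main__":
    lo, hi = add64(0xFFFFFFFF, 0x0, 0x1, 0x0)
    print(hex(lo), hex(hi))
```

Adding 1 to 0x00000000FFFFFFFF overflows the low word, so the carry propagates into the high word.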
Subtract
Syntax: SUB{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Subtracts Operand2 from the value in Rn and places the difference in
Rd.
Condition flags: If S is specified then all flags are updated according to the
result.
Examples:

SUB R7, R4, #99 ;subtracts 99 from the value in R4 and places the result in R7
SUB R1, R2, R3 ;subtracts the value in R3 from the value in R2 and places the
difference in R1.
SUB Rd, Rn, Rm
Rm = 0x00110011
Rn = 0xFFFFFFFF
Rd = Rn - Rm = 0xFFEEFFEE
Subtract with carry

Syntax: SBC{cond}{S} Rd, Rn, Operand2


Elements inside curly brackets are optional.
Usage: Subtracts Operand2 from the value in Rn and subtracts another 1 if the
carry flag is clear (C=0 indicates borrow in subtraction). Places the difference in
Rd.
Condition flags: If S is specified then all flags are updated according to the result.
Examples:

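SBC R7, R4, #99 ;subtracts 99 from the value in R4, subtracts another 1 if the carry flag is clear, and places the result in R7
SBC R1, R2, R3 ;subtracts the value in R3 from the value in R2, subtracts another 1 if the carry flag is clear, and places the difference in R1
SBC Rd, Rn, Rm
Rm = 0x00110011
Rn = 0xFFFFFFFF
C=0
Rd = Rn - Rm - 1 = 0xFFEEFFED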
SUB vs SBC

C=0
SUB vs SBC

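MOV R0, #0x12
MOV R1, #0x11
SUB R2, R0, R1 ;R2 = 0x12 - 0x11 = 0x01
CMP R0, R1 ;updates the flags: if R0 >= R1 (unsigned), C = 1 (no borrow)
SBC R3, R0, R1 ;R3 = R0 - R1 - NOT(C) = 0x01, since C = 1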
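The difference between SUB and SBC can be reproduced with a small model of the carry flag. In ARM subtraction C = 1 means "no borrow", so SBC subtracts an extra 1 only when C = 0. The helper names below are illustrative.

```python
MASK32 = 0xFFFFFFFF


def subs(rn: int, op2: int):
    """Model of SUBS/CMP: returns (result, C), where C = 1 means no borrow
    (rn >= op2 when both are treated as unsigned 32-bit values)."""
    rn &= MASK32
    op2 &= MASK32
    return (rn - op2) & MASK32, 1 if rn >= op2 else 0


def sbc(rn: int, op2: int, carry: int) -> int:
    """Model of SBC: rn - op2 - NOT(carry)."""
    return (rn - op2 - (1 - carry)) & MASK32


if __name__ == "__main__":
    r0, r1 = 0x12, 0x11
    r2, _ = subs(r0, r1)       # SUB R2, R0, R1
    _, c = subs(r0, r1)        # CMP R0, R1 (only the flags are kept)
    r3 = sbc(r0, r1, c)        # SBC R3, R0, R1
    print(hex(r2), c, hex(r3))
    print(hex(sbc(0xFFFFFFFF, 0x00110011, 0)))
```

With C = 1 after the CMP, SBC gives the same result as SUB. With C = 0, as in the 0xFFFFFFFF - 0x00110011 example, the result is one less.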
Reverse subtract
Syntax: RSB{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Subtracts the value in Rn from Operand2 and places the difference in
Rd.
Condition flags: If S is specified then all flags are updated according to the
result.
Examples:

RSB R7, R4, #99 ;subtracts the value in R4 from 99 and places the result in R7
RSB R1, R2, R3 ;subtracts the value in R2 from the value in R3 and places the
difference in R1
Reverse subtract with carry
Syntax: RSC{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Subtracts the value in Rn from Operand2 and subtracts another 1 if the
carry flag is clear. Places the difference in Rd.
Condition flags: If S is specified then all flags are updated according to the
result.
Examples:

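RSC R7, R4, #99 ;subtracts the value in R4 from 99, subtracts another 1 if the carry flag is clear, and places the result in R7
RSC R1, R2, R3 ;subtracts the value in R2 from the value in R3, subtracts another 1 if the carry flag is clear, and places the difference in R1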
Summary

ADD
ADC
SUB
SBC
RSB
RSC
Multiply
Syntax: MUL{cond}{S} Rd, Rm, Rs
Elements inside curly brackets are optional.
Usage: Multiplies the values in registers Rm and Rs and places the least
significant 32 bits of the product in register Rd.
Condition flags: If S is specified then the N and Z flags are updated according to
the result, the V flag is not affected (because only the lower 32 bits of the result are kept,
irrespective of whether the operands are signed or unsigned) and the C flag is unpredictable
for the ARM7 and earlier processors.
Example:
MUL R5, R3, R9 ;multiplies the values in R3 and R9 and places the result in R5
Multiply
Multiply and accumulate
Syntax: MLA{cond}{S} Rd, Rm, Rs, Rn

Elements inside curly brackets are optional.

Usage: Adds the value in Rn to the product of the values in Rm and Rs and places the least
significant 32 bits of the result in register Rd.

Condition flags: If S is specified then the N and Z flags are updated according to the result,
the V flag is not affected (because only the lower 32 bits of the result are kept, irrespective of
whether the operands are signed or unsigned) and the C flag is unpredictable for the ARM7 and earlier processors.

Example:

MLA R5, R3, R9, R5 ;multiplies the values in R3 and R9, adds the product to the value in R5 and
places the result in R5
Multiply - unsigned long
Syntax: UMULL{cond}{S} RdLo, RdHi, Rm, Rs
Elements inside curly brackets are optional.
Usage: Multiplies the values (as unsigned integers) in registers Rm and
Rs and places the least significant 32 bits of the product in register RdLo
and the most significant 32 bits of the product in register RdHi.
Condition flags: If S is specified then the N and Z flags are updated
according to the result and the V and C flags are unpredictable for the
ARM7 and earlier processors.
Example:
UMULL R6, R5, R3, R9 ;multiplies the values in R3 and R9 and places the low
word of the result in R6 and the high word in R5
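A worked case makes the difference between MUL and UMULL concrete. The operand values below are chosen purely for illustration:

```asm
        MVN     R3, #0          ; R3 = 0xFFFFFFFF
        MOV     R9, #2          ; R9 = 2
        UMULL   R6, R5, R3, R9  ; 0xFFFFFFFF * 2 = 0x1FFFFFFFE
                                ; R6 (RdLo) = 0xFFFFFFFE, R5 (RdHi) = 0x00000001
        MUL     R7, R3, R9      ; MUL keeps only the low word: R7 = 0xFFFFFFFE
```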
Multiply and accumulate - unsigned long
Syntax: UMLAL{cond}{S} RdLo, RdHi, Rm, Rs
Elements inside curly brackets are optional.
Usage: Multiplies the values (as unsigned integers) in registers Rm and Rs and
adds the 64 bit product to the unsigned 64 bit value in registers RdLo (least
significant 32 bits) and RdHi (most significant 32 bits).
Condition flags: If S is specified then the N and Z flags are updated according to
the result and the V and C flags are unpredictable for the ARM7 and earlier
processors.
Example:
UMLAL R6, R5, R3, R9 ;multiplies the values in R3 and R9 and adds the product to
the 64-bit value held in R5 (high word) and R6 (low word)
Multiply - signed long
Syntax: SMULL{cond}{S} RdLo, RdHi, Rm, Rs
Elements inside curly brackets are optional.
Usage: Multiplies the values (as two's complement signed integers) in registers
Rm and Rs and places the least significant 32 bits of the product in register
RdLo and the most significant 32 bits of the product in register RdHi.
Condition flags: If S is specified then the N and Z flags are updated according to
the result and the V and C flags are unpredictable for the ARM7 and earlier
processors.
Example:
SMULL R6, R5, R3, R9 ;multiplies the values in R3 and R9 and places the low word
of the result in R6 and the high word in R5
Multiply and accumulate - signed long
Syntax: SMLAL{cond}{S} RdLo, RdHi, Rm, Rs
Elements inside curly brackets are optional.
Usage: Multiplies the values (as two's complement signed integers) in registers
Rm and Rs and adds the 64 bit product to the two's complement signed 64 bit
value in registers RdLo (least significant 32 bits) and RdHi (most significant 32
bits).
Condition flags: If S is specified then the N and Z flags are updated according to
the result and the V and C flags are unpredictable for the ARM7 and earlier
processors.
Example:
SMLAL R6, R5, R3, R9 ;multiplies the values in R3 and R9 and adds the product to
the 64-bit value held in R5 (high word) and R6 (low word)
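The long multiply-accumulate instructions are typically used inside a loop that builds up a sum of products. A minimal sketch, assuming R0 and R1 point to two arrays of signed words and R2 holds the element count (greater than zero); these register roles are illustrative only:

```asm
; acc = sum of a[i] * b[i], kept as a signed 64-bit value in (R5:R4)
        MOV     R4, #0          ; accumulator low word  (RdLo)
        MOV     R5, #0          ; accumulator high word (RdHi)
LOOP    LDR     R6, [R0], #4    ; R6 = a[i], then advance the pointer
        LDR     R7, [R1], #4    ; R7 = b[i], then advance the pointer
        SMLAL   R4, R5, R6, R7  ; (R5:R4) = (R5:R4) + R6 * R7
        SUBS    R2, R2, #1      ; one element done
        BNE     LOOP            ; repeat until the count reaches zero
```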
Summary
• MUL
• MLA
• UMULL
• UMLAL
• SMULL
• SMLAL
Compare

Syntax: CMP Rn, Operand2

Usage: Subtracts Operand2 from the value in Rn and updates the flags
accordingly. The result is discarded.
Condition flags: All flags are updated according to the result.
Examples:
CMP R1, #9 ;set the flags as if 9 was subtracted from the value in R1.
CMP R6, R2 ;set the flags for the result of (R6 - R2) but discard the
result
Compare negative
Syntax: CMN Rn, Operand2

Usage: Add Operand2 to the value in Rn and updates the flags accordingly.
The result is discarded.
Condition flags: All flags are updated according to the result.
Examples:

CMN R1, #9 ;set the flags as if 9 was added to the value in R1.
CMN R6, R2 ;set the flags for the result of (R6 + R2) but discard the result
Comparisons
• The only effect of the comparisons is to

– UPDATE THE CONDITION FLAGS.


⮚Thus no need to set S bit.

• Operations are:

⮚CMP operand1 - operand2, but result not written

⮚CMN operand1 + operand2, but result not written
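Because a comparison only sets flags, it is usually followed by a conditionally executed instruction. A short sketch with illustrative register choices:

```asm
; R0 = larger of R0 and R1 (unsigned), without a branch
        CMP     R0, R1          ; flags from R0 - R1, result discarded
        MOVLO   R0, R1          ; if R0 < R1 (unsigned), take R1 instead

; CMN compares against a negative constant
        CMN     R2, #1          ; flags from R2 + 1, so Z = 1 when R2 = -1
        MOVEQ   R3, #1          ; executed only when R2 = -1
```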


Branch
Syntax: B{cond} label

Usage: Reloads the program counter with the memory address given by
label. The label identifies the instruction to branch to in the assembly
language program.

Condition flags: Flags are not affected.

Examples:
BNE loop ;branch to loop if the Z flag is clear (not equal)
B display ;branch unconditionally to display
Branch with link
Syntax: BL{cond} label

Usage: Reloads the program counter with the memory address given
by label. The label identifies the instruction to branch to in the
assembly language program. The memory address of the next
instruction after the BL instruction is copied to register r14, the link
register.

Condition flags: Flags are not affected.

Examples:
BLCS LABEL
BL display
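The link register is what makes BL usable as a subroutine call: the callee returns by copying LR back into PC. A minimal sketch; the label twice is hypothetical and SWI &11 is used as the program exit, as in the other examples in these slides:

```asm
        MOV     R0, #5
        BL      twice           ; LR = address of the next instruction
        SWI     &11             ; execution resumes here with R0 = 10

twice   ADD     R0, R0, R0      ; R0 = 2 * R0
        MOV     PC, LR          ; return to the caller
```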
Load and Store Instructions
LDR Load word
LDRB Load byte
LDRSB Load signed byte
LDRH Load half word
LDRSH Load signed half word
LDM Load multiple
LDM sp! Pop
STR Store word
STRB Store byte
STRH Store half word
STM Store multiple
STM sp! Push
Load word
Syntax: LDR{cond} Rd, address mode

Elements inside curly brackets are optional.

Usage: Loads register Rd with 4 bytes from a location in memory with address
determined by address mode. Ensure that the address is divisible by 4.

Condition flags: Flags are not affected.

Examples:
LDR r7, [r3] ;load r7 with the value in memory location with address given by r3

LDR r1, [r2], #4 ;load r1 with the value in memory location with address given by r2, then add 4 to
r2. (Post Index Register Relative)

LDR r1, [r2,#4] ;load r1 with the value in memory location with address given by r2+4; r2 is
unchanged. (Normal Register Relative)

LDR r1, [r2,#4]! ;load r1 with the value in memory location with address given by r2+4, then write
r2+4 back to r2. (Pre Index Register Relative)

LDR r1, [r2,r3] ;load r1 with the value in memory location with address given by r2+r3; r2 is
unchanged. (Base Indexed)
Load byte
Syntax: LDRB{cond} Rd, address mode

Elements inside curly brackets are optional.

Usage: Loads the least significant byte of register Rd with 1 byte from a
location in memory with address determined by address mode. The top 24 bits
of Rd are cleared.

Condition flags: Flags are not affected.

Example:
LDRB r7, [r3] ;load r7 with the byte in memory location
with address given by r3 and clear the top 24 bits of r7
Load signed byte
Syntax: LDRSB{cond} Rd, address mode

Elements inside curly brackets are optional.

Usage: Loads the least significant byte of register Rd with 1 byte from a
memory location with address determined by address mode. The most
significant bit of the loaded byte (the sign bit) is extended across the
top 24 bits of Rd.

Condition flags: Flags are not affected.

Example:
LDRSB r7, [r3] ;load r7 with the byte in memory location with address
given by r3 and extend the sign bit to 32 bits
Load half word
Syntax: LDRH{cond} Rd, address mode
Elements inside curly brackets are optional.
Usage: Loads the bottom 16 bits of register Rd with 2 bytes from a
location in memory with address determined by address mode. The top
16 bits of Rd are cleared.
Condition flags: Flags are not affected.
Example:
LDRH r7, [r3] ;load r7 with two bytes in memory location with address
given by r3 and clear the top 16 bits of r7
Load signed half word
Syntax: LDRSH{cond} Rd, address mode

Elements inside curly brackets are optional.

Usage: Loads the bottom 16 bits of register Rd with 2 bytes from a location
in memory with address determined by address mode. The most significant bit
of the loaded bytes (the sign bit) is extended across the top 16 bits of Rd.

Condition flags: Flags are not affected.

Example:
LDRSH r7, [r3] ;load r7 with two bytes in memory location with
address given by r3 and extend the sign bit to 32 bits
Load multiple
Syntax: LDM{cond}mode Rn{!}, reglist

Elements inside curly brackets are optional.

Usage: Loads the registers specified in reglist from memory, starting at the
address given by the base register, Rn. Ensure that the address is divisible
by 4. One of four modes must be specified. The base register, Rn, is updated
to the final address if there is an exclamation mark, !, after Rn.

Condition flags: Flags are not affected.

Example1:
LDMIA r7!, {r3-r5, r9, r11} ;load registers r3, r4, r5, r9 and r11 with values
starting from memory location with address given by r7. After each
transfer increment r7 by 4 and update r7 at the end of the instruction.
1/1/2024 LEC/UEC-405/July-Dec 2017 119
Store word
Syntax: STR{cond} Rd, address mode
Elements inside curly brackets are optional.
Usage: Stores the value in register Rd into 4 bytes of memory with
address determined by address mode. Ensure that the address is
divisible by 4.

Condition flags: Flags are not affected.


Examples:
STR r7, [r3] ;store value in r7 into memory location with address
given by r3
STR r1, [r2], #4 ;store value in r1 into memory location with
address given by r2 then add 4 to r2
Store byte
Syntax: STRB{cond} Rd, address mode
Elements inside curly brackets are optional.
Usage: Stores the least significant byte of register Rd into the
memory location with address determined by address mode.
Condition flags: Flags are not affected.
Example:
STRB r7, [r3] ;store least significant byte of r7 into memory
location with address given by r3
Store half word
Syntax: STRH{cond} Rd, address mode
Elements inside curly brackets are optional.

Usage: Stores the bottom 16 bits of register Rd into two locations of the
memory with address determined by address mode.

Condition flags: Flags are not affected.

Example:

STRH r7, [r3] ;store least significant 16 bits of r7 into memory location
with address given by r3
Store multiple
Syntax: STM{cond}mode Rn{!}, reglist
Elements inside curly brackets are optional.
Usage: Stores the values in registers specified by reglist to memory locations
starting with address given by the base register, Rn. Ensure that the address is
divisible by 4. One of four modes must be specified. The base register, Rn, is
updated to the final address if there is an exclamation mark, !, after Rn.

Condition flags: Flags are not affected.


Example:
STMDA r7!, {r3-r5, r9, r11} ;stores values in registers r3, r4, r5, r9 and r11 to
memory locations starting from address given by r7. After each transfer
decrement r7 by 4 and update r7 at the end of the instruction.
Stack Operation
Push
Syntax: STM{cond}mode sp!, reglist
Elements inside curly brackets are optional.
Usage: Stores the values in registers specified by reglist to the
stack. One of four modes must be specified; FD, FA, ED or EA.
Condition flags: Flags are not affected.
Example:
STMEA sp!, {r3-r5, lr} ;store values in registers r3, r4, r5 and value
in link register to the stack. The stack is 'empty ascending'.
Pop
Syntax: LDM{cond}mode sp!, reglist

Elements inside curly brackets are optional.

Usage: Loads the registers specified in reglist with values from the
stack. One of four modes must be specified; FD, FA, ED or EA.

Condition flags: Flags are not affected.

Example:

LDMFD sp!, {r3-r5, pc} ;load registers r3, r4, r5 and program
counter with values from the stack. The stack is 'full descending'.
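Push and pop are most often seen as a matched pair at the entry and exit of a subroutine that itself calls another routine. A minimal sketch, assuming sp already points to a valid full-descending stack; the labels func and helper are hypothetical:

```asm
func    STMFD   sp!, {r4, r5, lr}   ; push work registers and the return address
        MOV     r4, r0              ; r4 and r5 can now be used freely
        ADD     r5, r4, #1
        BL      helper              ; overwrites lr, but the old lr is safe on the stack
        ADD     r0, r0, r5
        LDMFD   sp!, {r4, r5, pc}   ; pop the registers; loading pc performs the return

helper  ADD     r0, r0, r0          ; leaf routine: no stack use needed
        MOV     pc, lr
```

The same mode suffix (here FD) must be used for the push and the matching pop, otherwise the registers are restored from the wrong addresses.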
Programming Examples

Write a program to find factorial of a number.

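LDR R1, Value1
LDR R2, Value2
LOOP MUL R0, R1, R2 ;R0 = R1 * R2
MOV R1, R0 ;copy the running product back into R1
SUB R2, R2, #0x01 ;decrement the multiplier
CMP R2, #0x01
BNE LOOP ;repeat until the multiplier reaches 1
SWI &11 ;all done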
Example: Find the larger of two numbers.

LDR R1, Value1
LDR R2, Value2
CMP R1, R2
BHI Done ;R1 is larger (unsigned)
MOV R0, R2 ;otherwise take R2
SWI &11
Done MOV R0, R1
SWI &11
Almost all instructions can be conditionally executed.

The mnemonic for an instruction that is to be conditionally executed includes
one of the following condition codes:

Condition code   Flags for execution    Meaning
EQ               Z set                  Equal
NE               Z clear                Not equal
CS or HS         C set                  Higher or same (unsigned >=)
CC or LO         C clear                Lower (unsigned <)
MI               N set                  Negative
PL               N clear                Positive or zero
VS               V set                  Overflow
VC               V clear                No overflow
HI               C set and Z clear      Higher (unsigned >)
LS               C clear or Z set       Lower or same (unsigned <=)
GE               N and V the same       Signed >=
LT               N and V different      Signed <
GT               Z clear, and N = V     Signed >
LE               Z set or N ≠ V         Signed <=
AL               Any                    Always
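Conditional execution often removes the need for short forward branches. The classic illustration is the subtractive greatest-common-divisor loop, sketched here under the assumption that R0 and R1 hold two non-zero unsigned values:

```asm
; On exit R0 = R1 = gcd of the two starting values
gcd     CMP     R0, R1          ; set the flags once per iteration
        SUBHI   R0, R0, R1      ; if R0 > R1: R0 = R0 - R1
        SUBLO   R1, R1, R0      ; if R0 < R1: R1 = R1 - R0
        BNE     gcd             ; loop until the two values are equal
```

Neither SUB carries the S suffix, so all three conditional instructions test the flags produced by the single CMP.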
One’s complement

LDR R1, value

MVN R1, R1

SWI &11
32 Bit Addition: Direct
LDR R1, value1

LDR R2, value2

ADD R1, R1, R2

SWI &11
32 Bit addition: Indirect
LDR R0, Value1 ;Value1 holds the address of the first operand

LDR R1, [R0]

ADD R0, R0, #0x04

LDR R2, [R0]

ADD R1, R1, R2

SWI &11
64 bit addition
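LDR R1, value1 ;low word of the first number
LDR R2, value2 ;high word of the first number
LDR R3, value3 ;low word of the second number
LDR R4, value4 ;high word of the second number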
ADDS R5, R1, R3
ADC R6, R2, R4
SWI &11
To find the number of positive and negative numbers out of 10 32-bit numbers

Six variants of the same program are given. They share the skeleton below and
differ only in the lines that test the sign and update the counts. R4 counts
positive values and R5 counts negative values; both are assumed to be zero on
entry, since the listings do not initialise them.

AREA PROGRAM, CODE, READONLY
ENTRY
MAIN
MOV R9, #10 ;number of values to examine
LDR R0, VALUE ;R0 = address of the first value
LABEL
LDR R1, [R0], #4 ;fetch the next value and advance the pointer
;sign test: one of the variants below goes here
SUBS R9, R9, #1
BNE LABEL
AREA PROGRAM, DATA, READONLY
VALUE DCD &00001000
END

Variant 1: mask the sign bit
LDR R2, =&80000000
ANDS R3, R1, R2
ADDEQ R4, R4, #1
ADDNE R5, R5, #1

Variant 2: SUBS (or MOVS R2, R1) with PL/MI
SUBS R2, R1, #00
ADDPL R4, R4, #1
ADDMI R5, R5, #1

Variant 3: SUBS with GT/LT (zero is counted as neither)
SUBS R2, R1, #00
ADDGT R4, R4, #1
ADDLT R5, R5, #1

Variant 4: CMP with PL/MI
CMP R1, #00
ADDPL R4, R4, #1
ADDMI R5, R5, #1

Variant 5: CMP with GT/LT (zero is counted as neither)
CMP R1, #00
ADDGT R4, R4, #1
ADDLT R5, R5, #1

Variant 6: shift the sign bit into the carry flag
LSLS R2, R1, #1
ADDCC R4, R4, #1
ADDCS R5, R5, #1
Example: Block Copy
– Copy a block of memory, which is an exact multiple of 12 words long, from
the location pointed to by r12 to the location pointed to by r13. r14 points
to the end of the block to be copied.
[Figure: memory map showing r12, r14 and r13, with an arrow marking increasing memory addresses]
; r12 points to the start of the source data
; r14 points to the end of the source data
; r13 points to the start of the destination data
loop LDMIA r12!, {r0-r11} ; load 48 bytes
STMIA r13!, {r0-r11} ; and store them
CMP r12, r14 ; check for the end
BNE loop ; and loop until done
Logical Instructions:
AND AND
EOR Exclusive OR
ORR OR
BIC Bit clear
TST Test
TEQ Test equivalence
AND
Syntax: AND{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Performs a bit by bit logical AND of Operand2 with the value in
Rn and places the result in Rd.
Condition flags: If S is specified then the N and Z flags are updated
according to the result, the C flag is updated if Operand2 was calculated
using a shift and the V flag is not affected.
Examples:
AND r7, r4, #0xFF ;ANDs 0x000000FF with r4 and places the result in r7
AND r1, r2, r3 ;ANDs r3 with r2 and places the result in r1
Exclusive OR
Syntax: EOR{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Performs a bit by bit exclusive OR of Operand2 with the value in Rn and
places the result in Rd.

Condition flags: If S is specified then the N and Z flags are updated according to
the result, the C flag is updated if Operand2 was calculated using a shift and the
V flag is not affected.

Examples:
EOR r7, r4, #0xFF ;Exclusive ORs 0x000000FF with r4 and places result in r7
EOR r1, r2, r3 ;Exclusive ORs r3 with r2 and places the result in r1
OR
Syntax: ORR{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.
Usage: Performs a bit by bit logical OR of Operand2 with the value in Rn and
places the result in Rd.
Condition flags: If S is specified then the N and Z flags are updated according to
the result, the C flag is updated if Operand2 was calculated using a shift and the
V flag is not affected.
Examples:
ORR r7, r4, #0xFF ;ORs 0x000000FF with r4 and places the result in r7
ORR r1, r2, r3 ;ORs r3 with r2 and places the result in r1
Bit clear
Syntax: BIC{cond}{S} Rd, Rn, Operand2
Elements inside curly brackets are optional.

Usage: Performs a logical AND of the bits in Rn with the complement of the
bits in Operand2 and places the result in Rd. In effect a 1 in Operand2 turns
the corresponding bit of the result into a 0 - hence bit clear.

Condition flags: If S is specified then the N and Z flags are updated
according to the result, the C flag is updated if Operand2 was calculated
using a shift and the V flag is not affected.
Examples:
BIC r7, r4, #0xFF ;clears the lowest 8 bits of the value in r4 and places the
result in r7
BIC r1, r2, r3 ;clears the bits of r2 that correspond to 1's in r3 and places
the result in r1
Test
Syntax: TST{cond} Rn, Operand2
Elements inside curly brackets are optional.

Usage: Performs a bit by bit logical AND of Operand2 with the value in
Rn and updates the flags accordingly. The result is discarded.
Condition flags: The N and Z flags are updated according to the result,
the C flag is updated if Operand2 was calculated using a shift/rotate and
the V flag is not affected.
Same as ANDS operation (but, result is discarded)
Shift and rotate operations applied to Operand2 affect the carry flag: for a
left shift the last bit shifted out of the top is copied to the carry flag, and
for a right shift the last bit shifted out of the bottom.
Examples:
TST r4, #0xFF ;AND 0x000000FF with r4 and sets the flags accordingly
TST r2, r3 ;AND r3 with r2 and sets the flags accordingly
Test equivalence
Syntax: TEQ{cond} Rn, Operand2

Elements inside curly brackets are optional.


Usage: Performs a bit by bit exclusive OR of Operand2 with the value in
Rn and updates the flags accordingly. The result is discarded.
Condition flags: The N and Z flags are updated according to the result,
the C flag is updated if Operand2 was calculated using a shift/rotate and
the V flag is not affected.
Same as EORS operation (but, result is discarded)
Examples:
TEQ r4, #0xFF ;exclusive OR 0x000000FF with r4 and sets the flags
TEQ r2, r3 ;exclusive OR r3 with r2 and sets the flags accordingly
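The logical instructions are the usual tools for setting, clearing, toggling and testing individual bits. A short worked sequence, with values chosen only for illustration:

```asm
        MOV     r0, #0x0F       ; r0 = 0x0F
        ORR     r0, r0, #0x80   ; set bit 7          -> r0 = 0x8F
        BIC     r0, r0, #0x01   ; clear bit 0        -> r0 = 0x8E
        EOR     r0, r0, #0x06   ; toggle bits 1 and 2 -> r0 = 0x88
        TST     r0, #0x08       ; is bit 3 set? AND result is non-zero, so Z = 0
        MOVNE   r1, #1          ; executed, because bit 3 is set
        TEQ     r0, #0x88       ; does r0 equal 0x88? EOR result is zero, so Z = 1
        MOVEQ   r2, #1          ; executed, because the values match
```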
Move Instructions
MOV Move
MVN Move and negate
SWP Swap
SWPB Swap byte
MRS Move program status register to register
MSR Move register to program status register
Move
Syntax: MOV{cond}{S} Rd, Operand2
Elements inside curly brackets are optional.

Usage: Copies the value of Operand2 into Rd.


Condition flags: If S is specified then the N and Z flags are updated
according to the result, the C flag is updated if Operand2 was
calculated using a shift/rotate and the V flag is not affected.
Examples:
MOV r7, #0xFF ;copy 0x000000FF into register r7
MOV r7, r4 ;copy the value in register r4 into register r7
Move and negate
Syntax: MVN{cond}{S} Rd, Operand2

Elements inside curly brackets are optional.


Usage: Takes the value of Operand2, performs a bitwise logical NOT operation on
the value and places the result into Rd.
Condition flags: If S is specified then the N and Z flags are updated according to
the result, the C flag is updated if Operand2 was calculated using a shift/rotate
and the V flag is not affected.
Examples:
MVN r7, #0xFF ;copy 0xFFFFFF00 into register r7
MVN r7, r4 ;invert the value in r4 and place the result into r7
Swap
Syntax: SWP{cond} Rd, Rm, [Rn]
Elements inside curly brackets are optional.
Usage: Data from memory address given by value in Rn is loaded into Rd. Data
in register Rm is stored at memory location with address given by value in Rn.
Ensure that the memory address is divisible by 4.
Condition flags: Flags are not affected.
Examples:

SWP r1, r1, [r9] ;swap the data in register r1 with the data held in memory at
address given by value in r9.
SWP r6, r8, [r2] ;load r6 with data from memory at address given by r2 and then
store data in r8 at the same memory address.
Swap byte
Syntax: SWPB{cond} Rd, Rm, [Rn]
Elements inside curly brackets are optional.
Usage: Byte from memory address given by value in Rn is loaded into least
significant byte of Rd and the top 24 bits of Rd are cleared. Least significant
byte in register Rm is stored at memory location with address given by value
in Rn.
Condition flags: Flags are not affected.
Example:

SWPB r1, r1, [r9] ;swap least significant byte in register r1 with the byte held
in memory at address given by value in r9. Clear top 24 bits of r1.
Move program status register to register
Syntax: MRS{cond} Rd, psr
Elements inside curly brackets are optional.
Usage: Moves the contents of the current program status register (CPSR) or
the saved program status register (SPSR) into register Rd.
Condition flags: Flags are not affected.
Examples:

MRS r1, CPSR ;move the value in the CPSR into register r1
MRS r5, SPSR ;move the value in the SPSR into register r5
Move register to program status register
Syntax: MSR{cond} <psr>_<fields>, Rm

Elements inside curly brackets are optional.

Usage: Moves the contents of register Rm into the current program status register
(CPSR) or the saved program status register (SPSR). One or more fields must be
specified; these are the control field, c, the extension field, x, the status field, s and the
flags field, f. The source register can be replaced by an immediate value formed from 8
bits rotated by an even number of bits.

Condition flags: The flags are updated if CPSR_f is specified.

Examples:

MSR CPSR_f, r1 ;update the flags using the value in register r1

MSR SPSR_c, #0x7A ;move the immediate value, #0x7A, into the control field of the
saved program status register
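MRS and MSR are normally used together as a read-modify-write sequence so that only the intended bits change. A minimal sketch that disables IRQ interrupts by setting the I bit (bit 7) of the CPSR; it assumes the processor is in a privileged mode, since user mode cannot alter the control field:

```asm
        MRS     r0, CPSR        ; read the current status register
        ORR     r0, r0, #0x80   ; set bit 7 (I), leave every other bit alone
        MSR     CPSR_c, r0      ; write back only the control field
```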
Format of CPSR or SPSR


A register shifted by an immediate value

The value in the register is first shifted by a numeric constant before being
applied to the remainder of the instruction.

There are five different types of shift available.

LSL Logical shift left Shift in range 0 to 31

LSR Logical shift right Shift in range 1 to 32

ASR Arithmetic shift right Shift in range 1 to 32

ROR Rotate right Shift in range 1 to 31

RRX Extended rotate right Shift by 1 bit only


Rotate and Shift Instructions

• MOV R2, R1, LSL #2


• MOV R2, R1, LSL R2
• LSL R5,#2
• LSL R6,R5,#2
• MOV R2, R1, LSR #2
• MOV R2, R1, LSR R2
• LSR R5,#2
• LSR R6,R5,#2
• MOV R2, R1, ASR #2
• MOV R2, R1, ASR R2
• ASR R5,#2
• ASR R6,R5,#2
• MOV R7, R1, ROR #2
• MOV R7, R1, ROR R2
• ROR R1, #2
• ROR R6,R5,#2
MOV R5, R1, RRX
RRX R7, R5
Key points
Shift/rotation amount encoded in 5 bits (should ideally be 0-31)

LSL Logical shift left Shift in range 0 to 31

LSR Logical shift right Shift in range 1 to 32

ASR Arithmetic shift right Shift in range 1 to 32

ROR Rotate right Shift in range 1 to 31 (what about shift 0?)

RRX Extended rotate right Shift by 1 bit only


Logical shift left (LSL)
The value in the register is shifted to the left by a specified number, n,
of bits and the right hand n bits are set to 0.

Example:
Execute instruction MOV r3, r2, LSL #2
Before r2 holds the value:
1000 1101 1000 1001 0011 0100 0010 0001
After r3 holds the value:
0011 0110 0010 0100 1101 0000 1000 0100
Logical shift right (LSR)
The value in the register is shifted to the right by a specified
number, n, of bits and the left hand n bits are set to 0.

Example:
Execute instruction MOV r3, r2, LSR #2

Before r2 holds the value:


1000 1101 1000 1001 0011 0100 0010 0001

After r3 holds the value:


0010 0011 0110 0010 0100 1101 0000 1000
Arithmetic shift right (ASR)

The value in the register is shifted to the right by a specified number, n, of bits and the left hand n bits are
set to the value of the most significant bit (sign bit) before the shift.

This preserves the sign of a two's complement number.

Example:

Execute instruction MOV r3, r2, ASR #2

Before r2 holds the value: 1000 1101 1000 1001 0011 0100 0010 0001

After r3 holds the value: 1110 0011 0110 0010 0100 1101 0000 1000

or

Before r2 holds the value:

0000 1101 1000 1001 0011 0100 0010 0001

After r3 holds the value:

0000 0011 0110 0010 0100 1101 0000 1000


Extended rotate (RRX)
The value in the register is rotated to the right by one bit and
the most significant bit is set to the value of the carry
flag before the shift.
Example:
Execute instruction MOV r3, r2, RRX
Before r2 holds the value:
1000 1101 1000 1001 0011 0100 0010 0001
And the carry flag is set.
After r3 holds the value:
1100 0110 1100 0100 1001 1010 0001 0000
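RRX is mainly useful for shifting values wider than 32 bits, because it carries a bit from one word into the next. A minimal sketch, assuming a 64-bit value held in R1:R0 (high word in R1):

```asm
; Logical shift right by one bit of the 64-bit value (R1:R0)
        MOVS    R1, R1, LSR #1  ; high word; its old bit 0 drops into C
        MOV     R0, R0, RRX     ; low word; C enters at bit 31
```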
Addressing modes

1. IMMEDIATE
MOV R0, #0x25
ADD R0, R1, #0x25

2. REGISTER
MOV R0, R1
ADD R0, R1, R2

3. DIRECT
LDR R0, VALUE
STR R0, VALUE
Addressing modes
4. INDIRECT
LDR R0, [R1]
STR R0, [R1]

5. REGISTER RELATIVE
NORMAL: LDR R0, [R1, #0x04]
PRE INDEX: LDR R0, [R1, #0x04]!
POST INDEX: LDR R0, [R1], #0x04

6. BASE INDEXED

NORMAL: LDR R0, [R1, R2]


PRE INDEX: LDR R0, [R1, R2]!
POST INDEX: LDR R0, [R1], R2

7. BASE WITH SCALED INDEX


NORMAL: LDR R0,[R1, R2, LSL #4]
PRE INDEX: LDR R0,[R1, R2, LSL #4]!
POST INDEX: LDR R0,[R1], R2, LSL #4
Example
Pre-indexing with write back

⮚ LDR R0, [R1, #4]!

• Before instruction execution


⮚ R0 = 0x00000000, R1=0x00009000
⮚ Mem32[0x00009000] = 0x01010101
⮚ Mem32[0x00009004] = 0x02020202

• After instruction execution


⮚ R0 = 0x02020202
⮚ R1 = 0x00009004
Addressing modes
Load and store instructions can use a number of
different addressing modes.

Zero offset LDR r7, [r3] ;load r7 with the value in memory location with address given by r3

Pre-indexed offset
Post-indexed offset
Pre-index Examples:
An immediate value in the range -4095 to +4095

LDR r7, [r3, #4] ;load r7 with the value in memory location with address
given by 4 added to value in r3

LDR r7, [r3, #4]! ;load r7 with the value in memory location with address
given by 4 added to value in r3 and update r3

STR r5, [r2, -r7] ;store value in r5 at memory location with address given
by value in r7 subtracted from the value in r2

STR r5, [r2, r7, LSL #3] ;store value in r5 at memory location with address
given by value in r7 left shifted by three bits added to the value in r2
Post-index Examples:
An immediate value in the range -4095 to +4095

LDR r7, [r3], #4 ;load r7 with the value in memory location with address
given by r3 and then add 4 to the value in r3

STR r5, [r2], -r7 ;store value in r5 at memory location with address given
by value in r2 and then subtract the value in r7 from the value in r2

STR r5, [r2], r7, LSL #3 ;store value in r5 at memory location with
address given by value in r2 and then add the value in r7 left shifted by
three bits to the value in r2
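Post-indexing suits sequential walks through an array, while a scaled register offset suits access by index. A short sketch; the register roles (r1 = array address, r2 = element count greater than zero, r4 = base address, r5 = index) are assumptions for illustration:

```asm
; Sum an array of words using post-indexed loads
        MOV     r0, #0          ; running sum
sum     LDR     r3, [r1], #4    ; r3 = word at r1, then r1 = r1 + 4
        ADD     r0, r0, r3
        SUBS    r2, r2, #1
        BNE     sum

; Fetch element i of a word array using a scaled index
        LDR     r6, [r4, r5, LSL #2]    ; r6 = word at r4 + 4 * r5; r4 is unchanged
```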
Second operand
Operand2 occurs in the syntax of a number of ARM instructions.

Any of the following can be used for Operand2

▪ A numeric constant (known as an 'immediate value')

▪ A register

▪ A register shifted by an immediate value

▪ A register shifted by another register


A numeric constant
A numeric constant (known as an 'immediate value').
An immediate value is identified by a hash symbol, #.
The constant must correspond to an 8-bit pattern rotated by an even number
of bits within a 32-bit word.
Hence the following are all allowed:
ADD r2, r4, #255
MOV r6, #0xFF000000
SUB r7, r3, #0x3FC00
The following are not allowed:
ADD r2, r4, #257
MOV r6, #0xFF800000
SUB r7, r3, #0x1FE00
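When a constant cannot be encoded, it can be built from two encodable pieces or loaded through the assembler's LDR Rd, =constant pseudo-instruction. The sketch below reworks the disallowed examples above:

```asm
        MOV     r6, #0xFF000000         ; 0xFF rotated right by 8: encodable
        ORR     r6, r6, #0x00800000     ; 0x80 rotated right by 16: r6 = 0xFF800000

        ADD     r2, r4, #256            ; 257 split into 256 + 1
        ADD     r2, r2, #1              ; r2 = r4 + 257

        LDR     r7, =0x1FE00            ; pseudo-instruction: the assembler places
                                        ; the constant in memory and loads it
```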
A register

Any register from r0 to r15.

Examples:
ADD r2, r4, r9
MOV r6, r0
SUB r7, r3, r14

Note that using the program counter, r15, can give unexpected results.
A register shifted by an immediate value
Example:
MOV r6, r0, LSL #5 ;logical shift left the value in r0 by 5 bits
and place it in r6
ADD r4, r7, r2, ASR #8 ;arithmetic shift right the value in r2
by 8 bits and add it to the value in r7. Place the sum in r4.
SUB r3, r9, r12, RRX ;extended rotate right the value in r12
by one bit and subtract it from the value in r9. Place the
difference in r3.
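A shifted second operand allows multiplication by many small constants in one or two instructions, without using MUL. A brief sketch with r1 as the illustrative input:

```asm
        ADD     r0, r1, r1, LSL #2      ; r0 = r1 + 4*r1 = 5 * r1
        RSB     r2, r1, r1, LSL #3      ; r2 = 8*r1 - r1 = 7 * r1
        MOV     r3, r0, LSL #1          ; r3 = 2 * (5 * r1) = 10 * r1
```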
A register shifted by another register

MOV r6, r0, LSL r3 ;logical shift left the value in r0 by the value in r3
and place it in r6

ADD r4, r7, r2, ASR r8 ;arithmetic shift right the value in r2 by the value
in r8 and add it to the value in r7. Place the sum in r4.

SUB r3, r9, r12, ROR r1 ;rotate right the value in r12 by the value in r1
and subtract it from the value in r9. Place the difference in r3.
Conditional Execution
Branch instructions
• All ARM processors support a branch instruction that
allows a conditional branch forwards or backwards up to
32MB.

• As the PC is one of the general-purpose registers (R15), a branch or jump
can also be generated by writing a value to R15.

• A subroutine call can be performed by a variant of the standard branch
instruction.
• As well as allowing a branch forward or backward up to 32MB, the Branch
with Link (BL) instruction preserves the address of the instruction after the
branch (the return address) in the LR (R14).
Data Processing Instructions
• Most data-processing instructions take two source operands,
though Move and Move Not take only one.

• The compare and test instructions only update the condition flags.

• Other data-processing instructions store a result to a register and
optionally update the condition flags as well.
Data Processing Instructions
• Of the two source operands, one is always a register.

• The other is called a shifter operand and is either an immediate value or a register.

• If the second operand is a register value, it can have a shift applied to it.

• CMP, CMN, TST and TEQ always update the condition code flags.

• The assembler automatically sets the S bit in the instruction for them.

• The remaining instructions update the flags if an S is appended to the instruction
mnemonic (which sets the S bit in the instruction).
Programming Examples
A) Shift left by 2 bits

LDR R1, Value
MOV R1, R1, LSL #0x02
SWI &11

Value DCD &00000005 ; DCD: Define constant double
Result: R1 = 0x00000014
B) Shift right by number of bits stored in register R2.

        LDR R1, Value1          ; R1 = 0x00000005
        LDR R2, Value2          ; R2 = shift count (3)
        MOV R1, R1, LSR R2      ; R1 = R1 >> R2 (logical shift right)
        SWI &11
Value1  DCD &00000005
Value2  DCD &00000003
Result: R1 = 0x00000000
D) Arithmetic shift right by the number of bits stored in register R2
        LDR R1, Value1          ; R1 = 0x00000005
        LDR R2, Value2          ; R2 = shift count (2)
        MOV R1, R1, ASR R2      ; R1 = R1 >> R2, sign bit preserved
        SWI &11
Value1  DCD &00000005
Value2  DCD &00000002
Result: R1 = 0x00000001
A) Direct Method

        LDR R1, value1          ; R1 = first operand
        LDR R2, value2          ; R2 = second operand
        ADD R1, R1, R2          ; R1 = R1 + R2
        SWI &11
value1  DCD &10000100
value2  DCD &00000002
Result: R1 = 0x10000102
B) Indirect Method

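        LDR R2, =value1         ; R2 = address of value1
        LDR R4, =value2         ; R4 = address of value2
        LDR R1, [R2]            ; R1 = word pointed to by R2
        LDR R3, [R4]            ; R3 = word pointed to by R4
        ADD R0, R1, R3          ; R0 = 0x10000102
        SWI &11
value1  DCD &10000100
value2  DCD &00000002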
In general, -n = NOT (n - 1). This means that to load a negative number with
MVN, you subtract one from its positive value and use that as the operand.
For example, to load the number -128 you would do an MVN of 127.
Examples:
MVN R0, #127 ;R0 = NOT 127 = -128
MVNS R0, R0 ;Invert all bits of R0, setting flags
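The identity -n = NOT (n - 1) is easy to confirm numerically. The sketch below is a small Python check under the assumption of 32-bit two's complement registers; `mvn` and `to_signed` are illustrative helper names, not ARM or library functions.

```python
MASK = 0xFFFFFFFF


def mvn(value):
    """Model MVN: bitwise NOT of a 32-bit value."""
    return ~value & MASK


def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


# MVN of 127 yields the bit pattern for -128
print(hex(mvn(127)), to_signed(mvn(127)))

# The rule holds for every n checked here
print(all(to_signed(mvn(n - 1)) == -n for n in range(1, 1000)))
```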
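Block copy without LDM and STM
          LDR R0, =src          ; R0 = source address
          LDR R1, =dst          ; R1 = destination address
          MOV R2, #20           ; R2 = number of words to copy
wordcopy  LDR R3, [R0], #4      ; load a word, then advance the source pointer
          STR R3, [R1], #4      ; store it, then advance the destination pointer
          SUBS R2, R2, #1       ; one word fewer to go
          BNE wordcopy
          SWI &11
src       DCD 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4
dst       DCD 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0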
Programming Examples

One’s complement

        LDR R1, value           ; load the operand
        MVN R1, R1              ; R1 = NOT R1 (one's complement)
        SWI &11
16 Bit Addition: Direct
        LDR R1, value1          ; first operand
        LDR R2, value2          ; second operand
        ADD R1, R1, R2          ; R1 = value1 + value2
        SWI &11
16 Bit addition: Indirect
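        LDR R0, =Value1         ; R0 = address of the first operand
        LDR R1, [R0]            ; R1 = first operand
        ADD R0, R0, #0x04       ; step to the next word
        LDR R2, [R0]            ; R2 = second operand
        ADD R1, R1, R2          ; R1 = sum
        SWI &11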
Find the larger of two numbers.
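        ; R0 is assumed to already hold the address of the result location
        LDR R1, Value1
        LDR R2, Value2
        CMP R1, R2
        BHI Done                ; unsigned compare: branch if R1 > R2
        STR R2, [R0]            ; R2 is the larger (or the two are equal)
        SWI &11
Done    STR R1, [R0]            ; R1 is the larger
        SWI &11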
64 bit addition
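        LDR R1, value1          ; low word of the first operand
        LDR R2, value2          ; high word of the first operand
        LDR R3, value3          ; low word of the second operand
        LDR R4, value4          ; high word of the second operand
        ADDS R5, R1, R3         ; add the low words and set the carry flag
        ADC R6, R2, R4          ; add the high words plus the carry
        SWI &11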
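The ADDS/ADC pair can be modelled in a few lines to show where the carry goes. This is a hedged Python sketch of the idea; `add64` is a made-up helper, and each argument is assumed to be one 32-bit half of a 64-bit operand.

```python
MASK = 0xFFFFFFFF


def add64(lo1, hi1, lo2, hi2):
    """Add two 64-bit numbers held as (low, high) 32-bit halves."""
    total_lo = lo1 + lo2
    carry = total_lo >> 32              # what ADDS leaves in the C flag
    lo = total_lo & MASK                # ADDS R5, R1, R3
    hi = (hi1 + hi2 + carry) & MASK     # ADC  R6, R2, R4
    return lo, hi


# 0x00000000_FFFFFFFF + 0x00000000_00000001: the carry ripples into the high word
lo, hi = add64(0xFFFFFFFF, 0x00000000, 0x00000001, 0x00000000)
print(hex(hi), hex(lo))
```

Using a plain ADD instead of ADDS for the low words would leave the carry flag unchanged, and the high word would miss the ripple.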
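Write a program for multiplication of numbers by repetitive addition.
Alternate program
        LDR R1, Value1          ; multiplier, used as the loop count (assumed non-zero)
        LDR R2, Value2          ; multiplicand
        MOV R3, #0x00           ; running sum
LOOP    ADD R3, R3, R2          ; sum = sum + multiplicand
        SUBS R1, R1, #0x01
        BNE LOOP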
Write a program for division of numbers by repetitive subtraction.
Write a program to find the factorial of a number.
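Write a program to count how many bytes in a given set are equal to the value 0xAC.
        LDR R0, =Value          ; R0 = address of the first byte
        MOV R3, #0x08           ; R3 = number of bytes to examine
        MOV R2, #0x00           ; R2 = match count
LOOP    LDRB R1, [R0], #1       ; load a byte, then advance the pointer
        CMP R1, #0xAC
        ADDEQ R2, R2, #0x01     ; count it only if it equals 0xAC
        SUBS R3, R3, #0x01
        BNE LOOP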
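Write a program in ARM assembly language to count the
number of 1’s and 0’s in a given word and verify the result.
        MOV R1, #0x03           ; word to examine
        MOV R2, #32             ; number of bits
        MOV R3, #0x00           ; count of 0's
        MOV R4, #0x00           ; count of 1's
NEXT    MOVS R1, R1, RRX        ; rotate right by one; bit 0 goes to the carry flag
        ADDCC R3, R3, #0x01     ; carry clear: the bit was 0
        ADDCS R4, R4, #0x01     ; carry set: the bit was 1
        SUBS R2, R2, #0x01      ; SUBS, so that BNE tests the counter
        BNE NEXT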
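To verify the result by hand, the loop can be traced in a few lines of Python. This is only a sketch of the RRX loop, assuming the counter is decremented with a flag-setting SUBS (which leaves C = 1 because no borrow occurs); `count_bits` is an illustrative name.

```python
MASK = 0xFFFFFFFF


def count_bits(word):
    """Trace the 32-iteration RRX loop and return (ones, zeros)."""
    r1 = word & MASK
    carry = 0                       # assumed initial state of the C flag
    ones = zeros = 0
    for _ in range(32):
        # MOVS R1, R1, RRX: old carry enters bit 31, bit 0 becomes the new carry
        new_carry = r1 & 1
        r1 = (carry << 31) | (r1 >> 1)
        carry = new_carry
        if carry:
            ones += 1               # ADDCS
        else:
            zeros += 1              # ADDCC
        carry = 1                   # SUBS on a non-zero counter leaves C = 1
    return ones, zeros


print(count_bits(0x03))
```

The carry bits rotated in at bit 31 need more than 32 steps to reach bit 0, so they never disturb the count of the original word.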
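Write a program in ARM assembly language to perform
multiplication of numbers by repetitive addition.
        LDR R1, Value1          ; multiplier, used as the loop count (assumed non-zero)
        LDR R2, Value2          ; multiplicand
        MOV R3, #0x00           ; running sum
LOOP    ADD R3, R3, R2          ; sum = sum + multiplicand
        SUBS R1, R1, #0x01
        BNE LOOP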
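Write a program in ARM assembly language to copy consecutive
words from source to destination in memory using
A) Multiple register transfer instructions
B) Load and store instructions in a loop

        ; Method A: multiple register transfer
        LDR R9, =Value1         ; R9 = source address
        LDR R10, =Value2        ; R10 = destination address
        LDMIA R9!, {R0-R3}      ; load four words, R9 advances by 16
        STMIA R10!, {R0-R3}     ; store four words, R10 advances by 16
        SWI &11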
ARM Instruction Set Format
Data processing instructions
• The ARM data processing instructions are used to modify data values in registers.
The operations that are supported include arithmetic and bit-wise logical
combinations of 32-bit data types. One operand may be shifted or rotated en
route to the ALU, allowing, for example, shift and add in a single instruction.
ADD{cond}{S} Rd, Rn, Operand2
Data processing instructions
The Condition Field
The condition field (Cond) occupies bits 31–28 of the instruction.

0000 = EQ - Z set (equal)
0001 = NE - Z clear (not equal)
0010 = HS / CS - C set (unsigned higher or same)
0011 = LO / CC - C clear (unsigned lower)
0100 = MI - N set (negative)
0101 = PL - N clear (positive or zero)
0110 = VS - V set (overflow)
0111 = VC - V clear (no overflow)
1000 = HI - C set and Z clear (unsigned higher)
1001 = LS - C clear or Z set (unsigned lower or same)
1010 = GE - N set and V set, or N clear and V clear (> or =)
1011 = LT - N set and V clear, or N clear and V set (<)
1100 = GT - Z clear, and either N set and V set, or N clear and V clear (>)
1101 = LE - Z set, or N set and V clear, or N clear and V set (< or =)
1110 = AL - always
1111 = NV - reserved
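The condition codes reduce to simple tests on the N, Z, C and V flags. The sketch below expresses them as a Python lookup table; it is an illustrative model (the names `CONDITIONS` and `condition_passes` are invented here), with the signed conditions written in the compact form N == V or N != V.

```python
# Each entry maps a condition mnemonic to a test on the flags (n, z, c, v)
CONDITIONS = {
    "EQ": lambda n, z, c, v: z == 1,
    "NE": lambda n, z, c, v: z == 0,
    "HS": lambda n, z, c, v: c == 1,             # also written CS
    "LO": lambda n, z, c, v: c == 0,             # also written CC
    "MI": lambda n, z, c, v: n == 1,
    "PL": lambda n, z, c, v: n == 0,
    "VS": lambda n, z, c, v: v == 1,
    "VC": lambda n, z, c, v: v == 0,
    "HI": lambda n, z, c, v: c == 1 and z == 0,
    "LS": lambda n, z, c, v: c == 0 or z == 1,
    "GE": lambda n, z, c, v: n == v,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
    "AL": lambda n, z, c, v: True,
}


def condition_passes(cond, n, z, c, v):
    """Return True if an instruction with this condition would execute."""
    return CONDITIONS[cond](n, z, c, v)


# Flags after CMP with Rn = 5 and Operand2 = 9: N=1, Z=0, C=0, V=0
flags = (1, 0, 0, 0)
print([name for name in CONDITIONS if condition_passes(name, *flags)])
```

"N set and V set, or N clear and V clear" is the same statement as N == V, which is why GE and GT share that test.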
Data processing instructions
ADD{cond}{S} Rd, Rn, Operand2
ADD R0, R1, R2 ;Rd = R0, Rn = R1, Rm = R2
MOV R0, R1 ;Rd = R0, Rm = R1
Data processing instructions
MOV with LSL #imm
Data processing instructions
MOV with LSL reg

1110 0 1101 0 0010 0000 0001


Example: Data processing instructions
Data processing instructions with
immediate value as second operand

• Note that Operand2 is only 12 bits. Read as a plain number, that would not
give a huge range: 0–4095.
• But ARM doesn't use the 12-bit immediate value as a 12-bit number. Instead,
it is an 8-bit number (bits 7–0) with a 4-bit rotation (bits 11–8).
• The 4-bit rotation value has only 16 possible settings, so the 8-bit value
cannot be rotated to every position in the 32-bit word. The rotation value is
therefore multiplied by two, so that it represents the even rotation amounts
from 0 to 30.
• To form the constant for the data processing instruction, the 8-bit
immediate value is extended with zeroes to 32 bits, then rotated right by
that number of places.
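The rotation scheme is easier to follow with concrete numbers. Below is a minimal Python sketch of decoding and encoding the 12-bit immediate field as described above; the function names are hypothetical, and the encoder simply tries all 16 rotations and returns the first that fits.

```python
MASK = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` places."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK


def decode_imm12(field):
    """Expand a 12-bit field (4-bit rotation, 8-bit immediate) to 32 bits."""
    rot = (field >> 8) & 0xF
    imm8 = field & 0xFF
    return ror32(imm8, 2 * rot)


def encode_imm12(constant):
    """Return a 12-bit field for `constant`, or None if it cannot be encoded."""
    for rot in range(16):
        # Undo a right rotation of 2*rot by rotating left the same amount
        imm8 = ror32(constant, 32 - 2 * rot)
        if imm8 <= 0xFF:
            return (rot << 8) | imm8
    return None


print(hex(decode_imm12(0x4FF)))      # rotation 4, immediate 0xFF
print(hex(encode_imm12(0xFF000000)))
print(encode_imm12(0x9F683D41))      # too many significant bits to encode
```

When no rotation fits, the constant has to come from memory instead, which is what the `LDR Rd, =constant` form is for.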
Examples (4-bit rotation field, 8-bit immediate → 32-bit constant):
E FF → 0xFF rotated right by 28 = 0x00000FF0
4 FF → 0xFF rotated right by 8 = 0xFF000000
2 FF → 0xFF rotated right by 4 = 0xF000000F
Data processing instructions

LDR r5, =0x9F683D41 ;load the value 0x9F683D41 into register r5


Branch instructions
• Branch : B{<cond>} label
• Branch with Link : BL{<cond>} sub_routine_label

Encoding (bit positions):
31–28 Cond: condition field
27–25 1 0 1
24 L: link bit (0 = Branch, 1 = Branch with link)
23–0 Offset

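• The offset for branch instructions is calculated by the assembler as the
difference between the target address and the address of the branch
instruction, minus 8 (to allow for the pipeline):
Offset = Target address - (branch instruction address + 8)
• By the time the branch executes, two further instructions have already
been fetched, so PC is 8 bytes ahead of the branch:

Branch instruction    IF   ID   EXE
Branch successor 1         IF   ID   EXE
Branch successor 2              IF   ID   EXE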


• The two LSBs of an instruction address are always 0 (instructions are
word-aligned), so there is no need to store the two LSBs of the offset. The
24-bit offset field is therefore effectively treated as a 26-bit byte offset.
• During encoding, the 26-bit byte offset is right-shifted by 2 bits and the
remaining 24 bits are stored in the instruction.
• During decoding, the two LSBs are recovered by left-shifting the offset
field by 2 bits.
• This gives a range of ± 32 Mbytes (± 2^25 bytes, with 1 bit for the sign).
During decoding, the processor takes the 24-bit signed offset value,
left-shifts it by 2 bits, sign-extends it to 32 bits, and adds this to PC,
which at that point holds the branch instruction address + 8.
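The encode and decode steps for the branch offset can be traced with a short model. The following Python sketch is an illustration under the rules stated above (byte offset relative to branch address + 8, stored as a 24-bit signed word offset); the function names are assumptions, not part of any tool.

```python
MASK = 0xFFFFFFFF


def encode_branch_offset(branch_addr, target_addr):
    """Return the 24-bit offset field for a branch at branch_addr."""
    byte_offset = target_addr - (branch_addr + 8)   # PC reads as branch + 8
    if byte_offset % 4 != 0:
        raise ValueError("target must be word-aligned relative to the branch")
    word_offset = byte_offset >> 2                  # drop the two zero LSBs
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("target is outside the +/- 32 Mbyte range")
    return word_offset & 0xFFFFFF                   # keep 24 bits


def decode_branch_target(branch_addr, field):
    """Recover the target address from the 24-bit offset field."""
    if field & 0x800000:                            # sign-extend from bit 23
        field -= 1 << 24
    return (branch_addr + 8 + (field << 2)) & MASK


# A branch to itself needs an offset of -8 bytes, i.e. -2 words
print(hex(encode_branch_offset(0x8000, 0x8000)))
print(hex(decode_branch_target(0x8000, 0x000002)))
```

The first call gives the field 0xFFFFFE, which is why an unconditional branch-to-self assembles to 0xEAFFFFFE.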
Multiply
• MUL{cond}{S} Rd, Rm, Rs
• MLA{cond}{S} Rd, Rm, Rs, Rn
Load/Store
• LDR|STR{<cond>}{B} Rd, addressing1
• LDR{<cond>}SB|H|SH Rd, addressing2
• LDR R0, [R1] ;load R0 from the address held in R1
• LDR R0, [R1, R2, LSL #2] ;load R0 from the address R1 + (R2 << 2)
Load/Store Multiple registers
• Syntax: <LDM|STM>{<cond>}<addressing mode> Rn{!},<registers>{ˆ}
• LDMIA r0!, {r1-r3}
• STMIB r0!, {r1-r3}