WEEK 1-2013
Introduction/The ARM7TDMI
Programmer's Model
ELEC2142: Embedded Systems Design
Programmer's Model
From the programmer's point of view, what matters
is a model of the processor.
How is the processor controlled?
What features are visible from a high level, such as where
data can be stored?
How does the processor respond if an invalid instruction is
submitted to it?
A model that describes how the processor is controlled,
where data can be stored, and how the processor responds
if an invalid instruction is given to it is known as the
Programmer's Model.
MEMORY SYSTEM
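The computer memory hierarchy consists of
several levels, each characterized by its
size, speed, and cost.
[Figure: memory hierarchy]
        Register   Cache   2nd Cache   Main Memory   Hard disk (Virtual Memory)
Speed:  Fastest  ------------------------------------------------>  Slowest
Size:   Smallest ------------------------------------------------>  Biggest
Cost:   Highest  ------------------------------------------------>  Lowest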
The registers and main memory are seen by the
programmer.
MEMORY SYSTEM
The caches are managed automatically by the
hardware and effectively invisible to the
application.
Virtual memory is handled by the operating system.
Level                        Speed                  Size
Registers                    A few ns               128 bytes
On-chip Cache                Ten ns                 8-32 Kbytes
2nd Cache                    A few tens of ns       Hundreds of Kbytes
Main Memory                  100 ns                 Megabytes
Virtual memory (Hard disk)   Tens of milliseconds   100 Gbytes
Fast memory is more expensive per bit than slow
memory
MEMORY SYSTEM
Memory can be viewed as a group of storage
elements that hold data, where each element has a
fixed number of bits and an address.
The universally adopted width of each memory
element is 8 bits (one byte).
THE PROCESSOR (ARM7TDMI)
ARM7TDMI stands for:
ARM 7
T - Thumb
D - On-chip Debug
M - Multiplier
I - Embedded ICE (In-Circuit Emulation)
DATA TYPES
Data in digital systems is represented by binary
digits called bits.
Bits can be imagined as either ON or OFF (ONE or ZERO).
Byte - eight bits grouped together (in most systems)
Halfword - two bytes or 16 bits
Word - four bytes or 32 bits (in ARM cores)
Data and instructions are described by their length:
32-bit instruction (ARM instructions)
16-bit instruction (Thumb instructions)
8-bit data
16-bit data
DATA TYPES
Halfword reads and writes must be aligned
to two-byte boundaries (the memory address
must be even).
Word reads and writes must be aligned to
four-byte boundaries (the memory address
must end in 0, 4, 8, or C in hexadecimal).
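The alignment rules amount to a remainder test on the address. The following is a minimal sketch in Python (a simulation, not ARM code); the function name is_aligned is made up for illustration.

```python
def is_aligned(address: int, size_in_bytes: int) -> bool:
    """Return True if address is a multiple of the access size (2 or 4 bytes)."""
    return address % size_in_bytes == 0

# Halfword (2-byte) accesses need an even address.
print(is_aligned(0x1002, 2))   # True
print(is_aligned(0x1003, 2))   # False

# Word (4-byte) accesses need an address ending in 0, 4, 8 or C (hex).
print(is_aligned(0x100C, 4))   # True
print(is_aligned(0x100A, 4))   # False
```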
PROCESSOR MODES
Version 4T (ARM7TDMI) cores support seven
processor modes: User, FIQ, IRQ , Supervisor,
Abort, Undef, and System.
Mode              Description                                         Class
Supervisor (SVC)  Entered on reset and when a Software Interrupt      Privileged, exception mode
                  (SWI) instruction is executed
FIQ               Entered when a high-priority (fast) interrupt is    Privileged, exception mode
                  raised
IRQ               Entered when a low-priority (normal) interrupt is   Privileged, exception mode
                  raised
Abort             Used to handle memory access violations             Privileged, exception mode
Undef             Used to handle undefined instructions               Privileged, exception mode
System            Privileged mode using the same registers as User    Privileged
                  mode
User              Mode under which most applications/OS tasks run     Unprivileged
PROCESSOR MODES
The processor operates mostly in User Mode and
most applications are executed in this mode.
The mode of the processor can be changed by
software, but most of the time the change is due to
external conditions or exceptions.
External conditions include, for example, a signal
arriving at a cell phone or a user pressing a key.
These external events are seen as interrupts.
Interrupts can be of high (FIQ) or low (IRQ) priority.
PROCESSOR MODES
Supervisor mode allows the processor to access
protected resources
Abort mode allows the processor to recover from
exceptions such as a memory access to an address
that does not physically exist
When the processor finds an instruction in the
pipeline that it does not recognize, it deals with
it by going into Undefined mode.
Undefined mode can be used to handle floating-point
operations in software when a floating-point unit
(hardware) does not exist.
REGISTERS
Storage units in the datapath of the processor
The ARM7TDMI processor has a total of 37
registers
30 general-purpose registers
6 status registers
A program counter
The registers in the ARM7TDMI are 32 bits wide.
The general-purpose registers are named r0, r1, ..., r14.
REGISTERS
At any time and in any given mode, the programmer
sees
a bank of 15 general-purpose registers (r0..r14)
1 program counter (PC or r15)
one or two status registers
The register banks for the modes are arranged in a
partially overlapping manner.
In User/System modes, one sees a bank of registers
r0..r14, the PC, and the Current Program Status Register (CPSR).
When the processor switches to, say, Abort mode, it
swaps general registers r13 and r14 for a different r13 and
r14.
In Abort mode, the programmer also sees the SPSR_Abort
status register in addition to the CPSR.
REGISTERS
Mode
User/System  Supervisor  Abort        Undefined    Interrupt  Fast Interrupt
R0, A1       R0          R0           R0           R0         R0
R1, A2       R1          R1           R1           R1         R1
R2, A3       R2          R2           R2           R2         R2
R3, A4       R3          R3           R3           R3         R3
R4, V1       R4          R4           R4           R4         R4
R5, V2       R5          R5           R5           R5         R5
R6, V3       R6          R6           R6           R6         R6
R7, V4       R7          R7           R7           R7         R7
R8, V5       R8          R8           R8           R8         R8 (FIQ)
R9, V6       R9          R9           R9           R9         R9 (FIQ)
R10, V7      R10         R10          R10          R10        R10 (FIQ)
R11, fp      R11         R11          R11          R11        R11 (FIQ)
R12, ip      R12         R12          R12          R12        R12 (FIQ)
R13, sp      R13 (SVC)   R13 (Abort)  R13 (Undef)  R13 (IRQ)  R13 (FIQ)
R14, lr      R14 (SVC)   R14 (Abort)  R14 (Undef)  R14 (IRQ)  R14 (FIQ)
REGISTERS
The program counter is seen by all modes.
Mode
User/System  Supervisor  Abort         Undefined     Interrupt   Fast Interrupt
R15 (PC)     R15 (PC)    R15 (PC)      R15 (PC)      R15 (PC)    R15 (PC)
The Current Program Status Register is seen by all
modes.
Supervisor, Abort, Undefined, Interrupt, and Fast
Interrupt modes each have their own Saved Program
Status Register (SPSR) in addition to access to the
CPSR, which is common to all modes.
Mode
User/System  Supervisor  Abort         Undefined     Interrupt   Fast Interrupt
CPSR         CPSR        CPSR          CPSR          CPSR        CPSR
-            SPSR (SVC)  SPSR (Abort)  SPSR (Undef)  SPSR (IRQ)  SPSR (FIQ)
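The register tables can be cross-checked against the stated total of 37 registers. The sketch below is Python used as a bookkeeping aid, not ARM code; the mode abbreviations and the physical_register helper are hypothetical names chosen for this illustration.

```python
# One entry per column of the register tables (User and System share a bank).
MODES = ["usr/sys", "svc", "abt", "und", "irq", "fiq"]

def physical_register(mode: str, n: int) -> str:
    """Name the physical register behind rN (0..14) in the given mode."""
    if mode == "fiq" and 8 <= n <= 14:
        return f"r{n}_fiq"            # FIQ has its own r8-r14
    if mode in ("svc", "abt", "und", "irq") and n in (13, 14):
        return f"r{n}_{mode}"         # other exception modes have their own r13, r14
    return f"r{n}"                    # everything else is shared with User/System

general = {physical_register(m, n) for m in MODES for n in range(15)}
status = {"CPSR"} | {f"SPSR_{m}" for m in MODES if m != "usr/sys"}

print(len(general))                     # 30 general-purpose registers
print(len(status))                      # 6 status registers
print(len(general) + len(status) + 1)   # 37, counting the program counter
```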
REGISTERS
Although most registers can be used for any
purpose, a few registers are normally reserved
for special use.
Register r13 (the Stack Pointer or SP) holds the
address of the stack in memory, and each mode has
its own stack pointer.
Register r14 (the Link Register or LR) is used as a
subroutine return address link register. It holds
the address to which the processor needs to return
after it jumps to a subroutine.
Register r15 (the Program Counter) holds
the address of the instruction that is to be fetched.
REGISTERS
The ARM7TDMI is a pipelined architecture; that is,
while one instruction is being fetched, another is
being decoded, and yet another is being
executed.
Address  Memory bytes  Instruction  Stage    Activity
PC       81 20 A0 E1   E1A02081     FETCH    Fetched from memory
PC-4     80 10 A0 E1   E1A01080     DECODE   Decoding the instruction in the control unit
PC-8     11 00 A0 E3   E3A00011     EXECUTE  Executing the instruction in the datapath
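The relationship between the PC and the three pipeline stages can be sketched as a lookup. This is a Python illustration only; it assumes the short program used in the assembly examples later in these notes is loaded at address 0x00, and the name pipeline_snapshot is hypothetical.

```python
# Assumed layout: the example program placed at address 0x00, 4 bytes per instruction.
program = {
    0x00: "MOV r0, #0x11",
    0x04: "MOV r1, r0, LSL #1",
    0x08: "MOV r2, r1, LSL #1",
    0x0C: "B stop",
}

def pipeline_snapshot(pc: int) -> dict:
    """Show which instruction occupies each stage when the PC holds pc."""
    return {
        "fetch": program.get(pc),        # being fetched at PC
        "decode": program.get(pc - 4),   # fetched one step earlier
        "execute": program.get(pc - 8),  # fetched two steps earlier
    }

snap = pipeline_snapshot(0x08)
print(snap["fetch"])    # MOV r2, r1, LSL #1
print(snap["decode"])   # MOV r1, r0, LSL #1
print(snap["execute"])  # MOV r0, #0x11
```

The instruction being executed therefore always sits 8 bytes behind the PC, which matters whenever an instruction reads the PC.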
Program Status Register
The Current Program Status Register (CPSR)
contains the condition flags, the interrupt disable flags, the
current mode, and the state of the processor.
Each privileged mode except System mode has a
Saved Program Status Register (SPSR).
The SPSR holds the value of the CPSR from the moment
an exception occurs, which allows programs to
recover from the exception.
Program Status Register
Both the CPSR and the SPSR have the following format:
Bit:    31  30  29  28 | 27 ... 8                      | 7  6  5 | 4 ... 0
Field:  N   Z   C   V  | Do not modify / Read as Zero  | I  F  T | M M M M M
Bits [31:28] - condition flags
N - Negative
Z - Zero
C - Carry
V - Overflow
Bit 7 (I) - IRQ (disables IRQ if it is set)
Bit 6 (F) - FIQ (disables FIQ if it is set)
Bit 5 (T) - set to 0 for ARM code (otherwise Thumb code)
Bits [4:0] - current mode
Program Status Register
Bit [7:0] of PSRs are called the control bits
Bit [4:0] of PSRs are called the Mode bits
The Mode bits
PSR[4:0] Mode
10000 User mode
10001 FIQ mode
10010 IRQ mode
10011 Supervisor mode
10111 Abort mode
11011 Undefined mode
11111 System mode
If a mode bit pattern that is not valid is requested, the
result is unpredictable.
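The PSR layout and the mode bit table can be combined into a small decoder. This is a minimal Python sketch for checking values by hand; decode_psr is a hypothetical helper, and the sample value 0x600000D3 is made up for the example.

```python
MODE_NAMES = {
    0b10000: "User", 0b10001: "FIQ", 0b10010: "IRQ", 0b10011: "Supervisor",
    0b10111: "Abort", 0b11011: "Undefined", 0b11111: "System",
}

def decode_psr(value: int) -> dict:
    """Split a 32-bit PSR value into its flag, control, and mode fields."""
    return {
        "N": (value >> 31) & 1,
        "Z": (value >> 30) & 1,
        "C": (value >> 29) & 1,
        "V": (value >> 28) & 1,
        "I": (value >> 7) & 1,    # 1 means IRQ is disabled
        "F": (value >> 6) & 1,    # 1 means FIQ is disabled
        "T": (value >> 5) & 1,    # 0 means ARM code, 1 means Thumb code
        "mode": MODE_NAMES.get(value & 0x1F, "invalid"),
    }

fields = decode_psr(0x600000D3)
print(fields["Z"], fields["C"])   # 1 1
print(fields["I"], fields["F"])   # 1 1
print(fields["mode"])             # Supervisor
```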
THE VECTOR TABLE
The vector table (exception vector table) lists the
memory address locations that hold the
information necessary to handle exceptions.
The Exception Vector Table
Exception Type                                  Mode    Vector Address
Reset                                           SVC     0x00000000
Undefined instructions                          UNDEF   0x00000004
Software Interrupt (SWI)                        SVC     0x00000008
Prefetch abort (instruction fetch memory abort) ABORT   0x0000000C
Data abort (data access memory abort)           ABORT   0x00000010
IRQ (interrupt)                                 IRQ     0x00000018
FIQ (fast interrupt)                            FIQ     0x0000001C
THE VECTOR TABLE
For example, when a fast interrupt comes along,
the processor changes the program counter to
0x1C and begins fetching instructions from this
address.
Before changing the program counter to 0x1C,
however, the current state is saved by copying the PC to
r14_FIQ and the CPSR into SPSR_FIQ.
The processor operating mode is changed to the
appropriate exception mode.
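The exception entry sequence can be sketched as a few dictionary updates. This Python model follows the simplified description in these notes: it copies the PC unchanged and keeps the mode as a separate string, which are simplifying assumptions of the sketch rather than a statement of what the hardware stores. The names take_exception and the state keys are hypothetical.

```python
VECTORS = {
    "Reset": 0x00, "Undefined": 0x04, "SWI": 0x08, "Prefetch abort": 0x0C,
    "Data abort": 0x10, "IRQ": 0x18, "FIQ": 0x1C,
}
EXCEPTION_MODE = {
    "Reset": "SVC", "Undefined": "UNDEF", "SWI": "SVC", "Prefetch abort": "ABORT",
    "Data abort": "ABORT", "IRQ": "IRQ", "FIQ": "FIQ",
}

def take_exception(state: dict, exception: str) -> dict:
    """Return the processor state after entering the given exception."""
    mode = EXCEPTION_MODE[exception]
    new_state = dict(state)
    new_state[f"r14_{mode}"] = state["pc"]      # save the return information
    new_state[f"SPSR_{mode}"] = state["cpsr"]   # save the status register
    new_state["mode"] = mode                    # switch to the exception mode
    new_state["pc"] = VECTORS[exception]        # fetch from the vector address
    return new_state

before = {"pc": 0x8000, "cpsr": 0x10, "mode": "User"}
after = take_exception(before, "FIQ")
print(hex(after["pc"]), hex(after["r14_FIQ"]), hex(after["SPSR_FIQ"]), after["mode"])
```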
THE VECTOR TABLE
The instruction at the location defined by the
exception vector table will usually contain a branch
to the exception handler.
The processor will then start fetching instructions
from the exception handler.
The exception handler will use r13 (of the exception mode),
which is normally initialized to point to a dedicated
stack in memory, to save r0-r3 (a1-a4) for use as
work registers.
THE VECTOR TABLE
By restoring the user registers (a1-a4), PC, and
CPSR, a return to the user program will be achieved.
The addresses specified by the exception vector table
are reserved for exceptions and should not be used
for storing data or instructions.
Load-Store Architecture
The ARM processor is based on a load-store architecture.
This means the instruction set only processes values stored
in registers (or specified within the instruction itself) and
places the result back into a register.
The only instructions allowed to access memory are those
that copy memory values into registers (load operations)
and those that copy register values into memory
(store operations).
ARM does not support memory-to-memory operations.
Load-Store Architecture
All ARM instructions can be categorized as one of the following:
Data processing instructions - use and change only
register values, e.g. ADD, MOV
Data transfer instructions - transfer data from register to memory
(store) or from memory to register (load),
e.g. STR, LDR
Control flow instructions - cause execution to
jump to a different address, e.g. B, BL
ARM INSTRUCTION SET
Instructions are 32 bits wide (and should be aligned to 4-byte
boundaries in memory)
The load-store architecture
3-address data processing instructions (two source
operand registers and a result register)
Conditional execution of every instruction
Multiple load and store instructions
ARM INSTRUCTION SET
General shift (rotate) operation and ALU
operation in a single instruction that executes in a
single cycle
Open instruction set extension through the
coprocessor instruction set, including adding new
registers and data types to the programmer's
model
A 16-bit representation of the instruction set in the
Thumb architecture
INPUT/OUTPUT SYSTEM
Input/output peripherals are handled by the ARM
as memory-mapped devices, with interrupt
support.
This means the internal registers of the peripheral
devices are addressable within the ARM's memory
map.
Peripherals may attract the attention of the
processor by raising an interrupt request.
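Memory-mapped I/O means that an ordinary store can reach a device instead of memory, depending only on the address. The Python sketch below models that idea; the Bus class, the DEVICE_DATA address 0x40000000, and the pretend device are all hypothetical and do not describe any particular chip.

```python
DEVICE_DATA = 0x4000_0000   # hypothetical address of a peripheral data register

class Bus:
    """Routes each access either to RAM or to the pretend peripheral."""
    def __init__(self):
        self.ram = {}
        self.sent = []                       # bytes received by the pretend device

    def store(self, address: int, value: int) -> None:
        if address == DEVICE_DATA:
            self.sent.append(value & 0xFF)   # the device reacts to the write
        else:
            self.ram[address] = value        # ordinary memory keeps the value

    def load(self, address: int) -> int:
        return self.ram.get(address, 0)

bus = Bus()
bus.store(0x0000_1000, 0x12345678)   # lands in memory
bus.store(DEVICE_DATA, ord("A"))     # same kind of store, but lands in the device
print(hex(bus.load(0x0000_1000)), bus.sent)
```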
ASSEMBLY PROGRAM EXAMPLE
Consists of ARM instructions, directives, and
comments
        AREA Prog1, CODE, READONLY      ; directive
        ENTRY                           ; directive
        MOV r0, #0x11                   ; load initial value
        MOV r1, r0, LSL #1              ; shift 1 bit to the left
        MOV r2, r1, LSL #1              ; shift 1 bit to the left
stop    B stop                          ; "stop" is a label
        END                             ; directive
ASSEMBLY PROGRAM EXAMPLE
AREA - a new assembly section (block) is to be
created
Prog1 - section (block) name
CODE - specifies the type of section (instructions or
data)
READONLY - the section (block) is to be read-only
ENTRY - tells the assembler that the instruction
code starts next
END - tells the assembler that there are no further
instructions
ASSEMBLY PROGRAM EXAMPLE
After being assembled and converted to machine
code, the program changes the registers as follows:
Instruction            r0           r1           r2
MOV r0, #0x11          0x00000011   0x00000000   0x00000000
MOV r1, r0, LSL #1     0x00000011   0x00000022   0x00000000
MOV r2, r1, LSL #1     0x00000011   0x00000022   0x00000044
(stop) B stop
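The register values for this program can be reproduced with a few lines of Python. This is a sketch that mimics the three MOV instructions with 32-bit arithmetic; the helper name lsl is chosen for the illustration.

```python
MASK32 = 0xFFFFFFFF   # registers are 32 bits wide

def lsl(value: int, amount: int) -> int:
    """Logical shift left, discarding bits shifted past bit 31."""
    return (value << amount) & MASK32

r0 = 0x11          # MOV r0, #0x11
r1 = lsl(r0, 1)    # MOV r1, r0, LSL #1
r2 = lsl(r1, 1)    # MOV r2, r1, LSL #1
print(hex(r0), hex(r1), hex(r2))   # 0x11 0x22 0x44
```

Each left shift by one position doubles the value, which is why r1 and r2 hold 2 and 4 times the initial value.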
ASSEMBLY PROGRAM EXAMPLE
MOV r0, #0x11
The immediate value 0x11 goes onto the B bus (from the decode
stage) and through the barrel shifter (no shift occurs in this
case), then through the 32-bit ALU and the ALU bus
to r0.
ASSEMBLY PROGRAM EXAMPLE
MOV r1, r0, LSL #1
r0 goes from the register bank onto the B bus and
through the barrel shifter (a shift occurs in this
case, by one position to the left), then through the
32-bit ALU and the ALU bus to r1.
ASSEMBLY PROGRAM EXAMPLE
MOV r2, r1, LSL #1
r1 goes from the register bank onto the B bus and
through the barrel shifter (a shift occurs in this
case, by one position to the left), then through the
32-bit ALU and the ALU bus to r2.
ASSEMBLY PROGRAM EXAMPLE
B stop
0xFFFFFE goes onto the B bus and through the
barrel shifter (shifted left by 2 positions),
and the PC value goes onto the A bus; both then pass through the
32-bit ALU, and the result goes over the ALU bus to the address
register.
The assembler uses the PC value to create an address
(that replaces the label).
ASSEMBLY PROGRAM EXAMPLE
How?
When the instruction (B stop) is executed (at
0x0000000C), the processor is fetching the
instruction at address 0x00000014, which is the
current value of the PC.
For the processor to fetch the next instruction from
the memory location labeled stop (which is at
0x0000000C), the PC must be changed to
0x0000000C.
This is an effective offset of -8.
ASSEMBLY PROGRAM EXAMPLE
How?
PC (new value) = PC (at the moment) + effective
offset
[Figure: bit pattern for the branch instruction]
The processor shifts the offset provided in the
instruction pattern left by two bits, effectively
multiplying it by four. Therefore, the offset in the
instruction pattern of B should be -2 to produce the
required -8 offset.
The 24-bit two's complement representation of -2 is 0xFFFFFE.
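The offset arithmetic can be checked numerically. The following Python sketch computes the 24-bit offset field for a branch and then reverses the calculation as the processor would; the function names branch_offset_field and branch_target are hypothetical.

```python
def branch_offset_field(branch_address: int, target_address: int) -> int:
    """24-bit offset field the assembler would place in a B instruction."""
    pc = branch_address + 8              # PC is two instructions ahead
    byte_offset = target_address - pc    # effective offset in bytes
    word_offset = byte_offset >> 2       # stored divided by four
    return word_offset & 0xFFFFFF        # keep 24 bits (two's complement)

def branch_target(branch_address: int, field: int) -> int:
    """Reverse the calculation: sign-extend, shift left by 2, add to the PC."""
    if field & 0x800000:                 # bit 23 set means a negative offset
        field -= 1 << 24
    return branch_address + 8 + (field << 2)

field = branch_offset_field(0x0C, 0x0C)  # "stop B stop" located at 0x0C
print(hex(field))                        # 0xfffffe
print(hex(branch_target(0x0C, field)))   # 0xc
```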
ASSEMBLY PROGRAM EXAMPLE
Factorial calculation (n!)
n! = n(n-1)...(1)
Flowchart:
1. Get the value of n
2. Copy n to n_factorial
3. n = n - 1
4. Is n = 0? Yes: stop. No: n_factorial = n_factorial * n, then go back to step 3.
        AREA Prog2, CODE, READONLY
        ENTRY
        MOV r6, #10
        MOV r4, r6
loop    SUBS r4, r4, #1
        MULNE r6, r6, r4
        BNE loop
stop    B stop
        END
ASSEMBLY PROGRAM EXAMPLE
Note:
Conditional execution - the multiplication
instruction is executed only if the subtraction
instruction before it does not result in zero.
Setting flags - the suffix S on the SUB
instruction directs the processor to update the flags in
the CPSR.
Change-of-flow instructions - a branch loads a
new address, called the branch target, into the program
counter, and execution resumes from the
new address.
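The interplay of SUBS, MULNE, and BNE can be traced with a short simulation. This Python sketch mirrors the factorial program one instruction at a time; the function name run_prog2 and the explicit z_flag variable are illustrative, and the sketch assumes n is at least 1.

```python
MASK32 = 0xFFFFFFFF

def run_prog2(n: int) -> int:
    """Mimic the factorial loop; returns the final value of r6 (assumes n >= 1)."""
    r6 = n                                # MOV r6, #n
    r4 = r6                               # MOV r4, r6
    while True:
        r4 = (r4 - 1) & MASK32            # SUBS r4, r4, #1
        z_flag = (r4 == 0)                # the S suffix updates the Z flag
        if not z_flag:                    # MULNE runs only when Z is clear
            r6 = (r6 * r4) & MASK32
        if z_flag:                        # BNE is not taken once Z is set
            break
    return r6

print(run_prog2(10))   # 3628800
```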
ASSEMBLY PROGRAM EXAMPLE
        AREA Prog2, CODE, READONLY
        ENTRY
        MOV r6, #10
        MOV r4, r6
loop    SUBS r4, r4, #1         ; start of Loop 2
        MULNE r6, r6, r4
        BNE loop                ; end of Loop 2
stop    B stop                  ; Loop 1
        END
ASSEMBLY PROGRAM EXAMPLE
Shuffle data around - swap the contents of two
registers
        AREA Prog1, CODE, READONLY
        ENTRY
        LDR r0, =0xF631024C     ; load some data
        LDR r1, =0x17539ABD     ; load some data
        EOR r0, r0, r1          ; r0 = r0 XOR r1
        EOR r1, r0, r1          ; r1 = r0 XOR r1
        EOR r0, r0, r1          ; r0 = r0 XOR r1
stop    B stop
        END
ASSEMBLY PROGRAM EXAMPLE
Register contents after each instruction:
Instruction             r0           r1
LDR r0, =0xF631024C     0xF631024C   0x00000000
LDR r1, =0x17539ABD     0xF631024C   0x17539ABD
EOR r0, r0, r1          0xE16298F1   0x17539ABD
EOR r1, r0, r1          0xE16298F1   0xF631024C
EOR r0, r0, r1          0x17539ABD   0xF631024C
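The three-step exclusive-OR swap can be verified with the same constants. This is a Python sketch in which each line stands for one EOR instruction; xor_swap is a hypothetical helper name.

```python
def xor_swap(r0: int, r1: int) -> tuple:
    """Swap two register values using three exclusive-OR steps."""
    r0 = r0 ^ r1    # EOR r0, r0, r1
    r1 = r0 ^ r1    # EOR r1, r0, r1  (r1 now holds the original r0)
    r0 = r0 ^ r1    # EOR r0, r0, r1  (r0 now holds the original r1)
    return r0, r1

print(hex(0xF631024C ^ 0x17539ABD))    # 0xe16298f1, the intermediate r0
r0, r1 = xor_swap(0xF631024C, 0x17539ABD)
print(hex(r0), hex(r1))                # 0x17539abd 0xf631024c
```

No third register is needed, which is the point of the technique.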
ASSEMBLY PROGRAM EXAMPLE
LDR is normally used to load data from memory into a
register.
        LDR r0, =0xF631024C
        LDR r1, =0x17539ABD
Here, it is used to load a large constant into a register.
Written this way, these are not real ARM instructions. They are called
pseudo-instructions, which we put in the code to
make things easier for us, the programmers.
ARM Tools
High-level languages are easier to use because they
contain near-English descriptions. Examples: C, C++
High-level languages are translated to the
instruction set of the microprocessor (assembly
language) using a compiler.
Assembly language (the instruction set) is
translated further into machine code (bit patterns)
using an assembler.
ARM Tools
Strictly speaking, the output of the assembler is
object files, which contain debugging and
relocation information and are used to build a
larger executable file.
After assembly, a linker is used to combine the
object files into an executable program.
The executable files will normally be run by the
hardware in the final embedded application.
We can also use a debugger to run executable files.
ARM Tools
With the debugger, one can
access registers on the chip
view memory
set and clear breakpoints and watchpoints
view code in both high-level languages and assembly