Chapter 2 - Architecture of ARM Processor

Chapter 2

Architecture of ARM
Processor
Chessda Uttraphan, PhD
Chapter Outline
• Processor’s Core Architecture
• Programmer’s Model
• Memory Organization and Addressing
• Instruction Set Architecture (ISA)

© C. Uttraphan
Processor’s Core
Architecture

Figure: Simplified ARM processor core


Processor’s Core Architecture
(Cont..)

• ARM core contains functional units (datapath components) connected
by data, address and control buses.
• The arrows represent the flow of data, the lines represent the buses,
and the boxes represent either an operation unit or a storage area.
• Data enters the processor core through the data bus. The data may be
an instruction to execute or a data item.
• An ARM core implemented with a Von Neumann architecture shares one bus
between data items and instructions. In contrast, an ARM core implemented
with a Harvard architecture uses two separate buses.
• The instruction decoder translates instructions before they are
executed. Each instruction executed belongs to a particular instruction
set.
• In this course we will use the ARM Cortex-M3 (ARMv7-M, Harvard
architecture) as a reference.
Processor’s Core Architecture
(Cont..)
Figure: Von Neumann architecture (processor core connected to a single program and data memory over a shared address bus) versus Harvard architecture (processor core connected to a program memory and a data memory over separate address buses)
Processor’s Core Architecture
(Cont..)

• The ARM processor, like all RISC processors, uses a load-store
architecture. This means it has two instruction types for transferring
data in and out of the processor: Load instructions copy data from
memory to registers in the core, and conversely the Store instructions
copy data from registers to memory.
• There are no data processing instructions that directly manipulate data
in memory. Thus, data processing is carried out solely in registers.
• Data items are placed in the register file—a storage bank made up of 32-
bit registers. Since the ARM core is a 32-bit processor, most instructions
treat the registers as holding signed or unsigned 32-bit values.
• ARM instructions typically have two source registers, Rn and Rm, and a
single result or destination register, Rd. Source operands are read from
the register file using the internal buses A and B, respectively.
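
The load-store idea can be sketched with a toy model. This is only an illustration in Python, not ARM code: the names regs, memory, ldr, str_ and add are made up for the sketch, and the addresses and values are arbitrary. The point it shows is that memory is touched only by the load and store helpers, while arithmetic works on registers alone.

```python
# Toy model of a load-store machine (illustrative names, not a real API).
MASK32 = 0xFFFFFFFF

regs = [0] * 16                              # R0..R15 as unsigned 32-bit values
memory = {0x100: 7, 0x104: 5, 0x108: 0}      # arbitrary word-sized toy memory


def ldr(rd, address):
    """Load: copy a word from memory into register rd."""
    regs[rd] = memory[address] & MASK32


def str_(rd, address):
    """Store: copy register rd into memory."""
    memory[address] = regs[rd] & MASK32


def add(rd, rn, rm):
    """Data processing: Rd = Rn + Rm, using registers only."""
    regs[rd] = (regs[rn] + regs[rm]) & MASK32


# memory[0x108] = memory[0x100] + memory[0x104] needs four steps:
ldr(0, 0x100)      # R0 <- memory
ldr(1, 0x104)      # R1 <- memory
add(2, 0, 1)       # R2 <- R0 + R1 (no memory access)
str_(2, 0x108)     # memory <- R2
print(hex(memory[0x108]))
```

Adding two values held in memory therefore takes two loads, one data processing step and one store.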

Processor’s Core Architecture
(Cont..)

• The ALU (arithmetic logic unit) or Mul (multiply-accumulate unit) takes
the register values Rn and Rm from the A and B buses and computes a
result.
• Data processing instructions write the result in Rd directly to the register
file.
• Load and store instructions use the ALU to generate an address to be
held in the address register and broadcast on the Address bus.
• One important feature of the ARM is that register Rm alternatively can
be preprocessed in the barrel shifter before it enters the ALU. Together
the barrel shifter and ALU can calculate a wide range of expressions and
addresses.
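
As a rough illustration of the barrel shifter feeding the ALU, the Python sketch below models an instruction of the form ADD Rd, Rn, Rm, LSL #n. The function names lsl and add_shifted are invented for this sketch, and only the logical shift left is modelled.

```python
MASK32 = 0xFFFFFFFF


def lsl(value, amount):
    """Logical shift left, keeping only the low 32 bits."""
    return (value << amount) & MASK32


def add_shifted(rn, rm, shift):
    """Rd = Rn + (Rm LSL #shift): Rm passes through the shifter, then the ALU adds."""
    return (rn + lsl(rm, shift)) & MASK32


r1 = 7
r0 = add_shifted(r1, r1, 2)   # R1 + R1*4, i.e. R1 multiplied by 5
print(r0)
```

Because the shift happens on the way into the ALU, a multiplication by 5 is done here with a single add-with-shift step.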

Processor’s Core Architecture
(Cont..)

• After passing through the functional units, the result in Rd is written
back to the register file using the Result bus.
• For load and store instructions the incrementor updates the address
register before the core reads or writes the next register value from or to
the next sequential memory location. The processor continues executing
instructions until an exception or interrupt changes the normal
execution flow.

Programmer’s model

• A programmer’s model shows what the CPU has available
to a programmer for the execution of programs (or
sequences of code). It covers the CPU resources for
execution of the CPU's instruction set.
• The model does not show details of hardware, such as how
the CPU's electronic circuitry works, how buses transport
data or the I/O peripherals available. In other words, the
programmers model would not cover functions that cannot
be observed by CPU instructions.

Programmer’s model
(Cont..)
General-purpose registers (32 bits each)
R0 – R7: Low registers
R8 – R12: High registers
R13 (MSP): Main Stack Pointer (MSP)
R13 (PSP): Process Stack Pointer (PSP)
R14 (LR): Link Register
R15 (PC): Program Counter

Special registers
xPSR: Program status register
PRIMASK, FAULTMASK, BASEPRI: Interrupt mask registers
CONTROL: Control register
Programmer’s model
(Cont..)

• To perform data processing and controls, a number of
registers are required inside the processor core.
• If data from memory are to be processed, they have to be
loaded from the memory to a register in the register bank,
processed inside the processor, and then written back to the
memory if needed.
• General-purpose registers hold either data or an address. The
register bank contains sixteen 32-bit registers. Most of them
are general-purpose registers, but some have special uses.

Programmer’s model
(Cont..)

R0 – R12
• Registers R0 to R12 are for general uses.

R13 – Stack Pointer (SP)


• R13 is the stack pointer. It is used for accessing the stack
memory via PUSH and POP operations.
• MSP – Main stack pointer – The default stack pointer, used by
the operating system (OS) kernel and exception handlers
• PSP – Process stack pointer – Used by user application code

Programmer’s model
(Cont..)

R14 – Link Register (LR)


• The Link Register is used for storing the return address of a
subroutine or function call.
• At the end of the subroutine or function, the return address
stored in LR is loaded into the program counter so that the
execution of the calling program can be resumed.

R15 – Program Counter (PC)


• Stores the memory address of the next instruction to be
executed.
• In ARM (v7) processors, some instructions take up one half
word (2 bytes) and some take one word (4 bytes). Hence
incrementing the PC actually adds 2 or 4 to its value as
memory addresses are given in bytes.
Programmer’s model
(Cont..)

Special Register

xPSR - Provide arithmetic and logic processing flags, execution status,
and current executing interrupt number
PRIMASK - Disable all interrupts except the non-maskable interrupt (NMI)
and hard fault
FAULTMASK - Disable all interrupts except the NMI
BASEPRI - Disable all interrupts of specific priority level or lower priority
level
CONTROL - Define privileged status and stack pointer selection

Programmer’s model
(Cont..)

Operation Modes of the Cortex-M3


• The Cortex-M3 processor has two modes and two privilege levels.
• The operation modes (thread mode and handler mode) determine whether the
processor is running a normal program or running an exception handler like an
interrupt handler or system exception handler
• The privilege levels (privileged level and user level) provide a mechanism for
safeguarding memory accesses to critical regions as well as providing a basic
security model.
• When the processor is running a main program (thread mode), it can be either in
a privileged state or a user state, but exception handlers can only be in a
privileged state.

Programmer’s model
(Cont..)
Operation Modes of the Cortex-M3

• When the processor exits reset, it is in
thread mode, with privileged access rights.
In the privileged state, a program has
access to all memory ranges and can use
all supported instructions.
• Software in the privileged access level can
switch the program into the user access
level using the control register.
• When an exception takes place, the
processor will always switch back to the
privileged state and return to the previous
state when exiting the exception handler.
• A user program cannot change back to the privileged state by writing to the control
register. It has to go through an exception handler that programs the control register
to switch the processor back into the privileged access level when returning to
thread mode. The separation of privilege and user levels improves system reliability
by preventing system configuration registers from being accessed or changed by
some untrusted programs.
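
The mode and privilege transitions described above can be summarised as a small state model. The Python sketch below is a simplification under stated assumptions: the class CortexM3Model and its method names are hypothetical, the control register is reduced to a single "thread mode is privileged" setting, and nested exceptions are ignored.

```python
class CortexM3Model:
    """Simplified model of Cortex-M3 operation modes and privilege levels."""

    def __init__(self):
        # State after reset: thread mode with privileged access.
        self.mode = "thread"
        self.thread_privileged = True

    @property
    def privileged(self):
        # Handler mode is always privileged; thread mode depends on the control setting.
        return self.mode == "handler" or self.thread_privileged

    def write_control(self, thread_privileged):
        # A write from user (unprivileged) code has no effect in this model.
        if self.privileged:
            self.thread_privileged = thread_privileged

    def exception_entry(self):
        self.mode = "handler"

    def exception_return(self):
        self.mode = "thread"


cpu = CortexM3Model()
cpu.write_control(False)      # privileged software drops to user level
cpu.write_control(True)       # ignored: user code cannot raise its own privilege
print(cpu.privileged)         # still at user level
cpu.exception_entry()         # handler mode, privileged
cpu.write_control(True)       # handler reprograms the control setting
cpu.exception_return()        # back in thread mode, now privileged again
print(cpu.privileged)
```

The only path from user level back to privileged thread mode in this model goes through an exception handler, matching the description above.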
Programmer’s model
(Cont..)
Condition Flags
• Condition flags are updated by comparisons and the result of
ALU operations that specify the S instruction suffix. For
example, if a SUBS subtract instruction results in a register
value of zero, then the Z flag in the xPSR is set. This particular
subtract instruction specifically updates the xPSR.

Program status register in Cortex-M3

Programmer’s model
(Cont..)

The N-Flag
This flag is useful when checking for a negative result.

Example: Adding -1 to -2 in 32-bit

FFFFFFFF
+ FFFFFFFE

FFFFFFFD

MSB = 1, indicates the result is negative

Programmer’s model
(Cont..)

The V-Flag
When performing an operation like addition or subtraction, the V flag is
calculated as an exclusive OR of the carry bit going into the most significant
bit of the result with the carry bit coming out of the most significant bit,
so that it accurately indicates a signed overflow. Overflow occurs if
the result of an add, subtract, or compare is greater than or equal to 2^31, or
less than –2^31.
Example:
A1234567
+ B0000000
151234567

Both operands are negative (MSB = 1), but the 32-bit result 51234567 is positive (MSB = 0), so V = 1.
Programmer’s model
(Cont..)

The C-Flag
The C flag is set if an arithmetic operation produces a carry, as in the example
below.

Example:
FFFFFFFF
+ 00000001
100000000

carry

Programmer’s model
(Cont..)

The Z-Flag
The Z flag is set if an arithmetic operation (or another operation,
depending on the instruction used) produces a zero result.

Example:
FFFFFFFF
+ 00000001
100000000

32-bit 0
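
The four flag examples can be checked with a short Python sketch. The function add_with_flags is an illustrative helper, not a processor API. It assumes an addition of two values of the given width and computes V from the operand and result signs, which is equivalent to the carry-in XOR carry-out rule given for the V flag.

```python
def add_with_flags(a, b, bits=32):
    """Add two values as a `bits`-wide adder would and return (result, flags)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    full = a + b                      # unbounded sum, may exceed `bits` bits
    result = full & mask              # what fits in the register
    flags = {
        "N": int(bool(result & sign)),              # MSB of the result
        "Z": int(result == 0),                      # result is zero
        "C": int(full > mask),                      # carry out of the MSB
        # Signed overflow: operands share a sign that differs from the result's sign.
        "V": int(bool(~(a ^ b) & (a ^ result) & sign)),
    }
    return result, flags


for a, b in [(0xFFFFFFFF, 0xFFFFFFFE),    # -1 + -2
             (0xA1234567, 0xB0000000),    # signed overflow
             (0xFFFFFFFF, 0x00000001)]:   # carry out, zero result
    result, flags = add_with_flags(a, b)
    print(f"{a:08X} + {b:08X} = {result:08X}", flags)
```

Passing bits=8 gives the same behaviour for an 8-bit adder.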

Memory Organization

• Memory can be conceptually viewed as contiguous storage elements
that hold data, each element holding a fixed number of bits and having
an address.
• The typical analogy for memory is a very long string of mailboxes, where
data (your letter) is stored in a box with a specific number on it.
• While there are some digital signal processors that use memory widths
of 16 bits, the system that is nearly universally adopted these days has
the width of each element as 8 bits, or a byte long.
• Therefore, we always refer to memory as being so many megabytes
(abbreviated MB, representing 2^20 or approximately 10^6 bytes),
gigabytes (abbreviated GB, representing 2^30 or approximately 10^9
bytes), or even terabytes (abbreviated TB, representing 2^40 or
approximately 10^12 bytes).

Memory Organization
(Cont..)

The ARM processor core


Memory Organization
(Cont..)

• The ARM core uses 32-bit addresses, meaning that you could address bytes
in memory from address 0 to 2^32 – 1, or 4,294,967,295 (0xFFFFFFFF),
which is considered to be 4 GB of memory space.
• A memory map is a structure of data (which usually resides in memory
itself) that indicates how memory is laid out.
• Normally not all addresses are used, and much of the memory map
contains areas dedicated to specific functions, some of which we’ll
examine further in later chapters. While the memory layout is defined
by an SoC’s implementation, it is not part of the processor core.
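
The relation between address width and memory space can be worked out in a few lines of Python. The helper name address_space is made up for this sketch, and it assumes byte addressing with every address line used.

```python
def address_space(address_bits):
    """Return (first address, last address, size in bytes) for a byte-addressed bus."""
    size = 1 << address_bits          # 2^address_bits distinct byte addresses
    return 0, size - 1, size


first, last, size = address_space(32)
print(hex(first), hex(last), size // 2**30, "GB")
```

For 32 address bits this reproduces the range 0 to 0xFFFFFFFF and the 4 GB figure quoted above.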

Memory Organization
(Cont..)

Example of ARM Cortex-M3 memory map


Addressing: Loads and
Stores
• Now that we have some idea of how memory is described in the
system, the next step is to consider getting data out of memory and into
a register, and vice versa (Addressing modes).
• Recall that RISC architectures are considered to be load/store
architectures, meaning that data in external memory must be brought
into the processor using an instruction.
• Operations that take a value in memory, multiply it by a coefficient, add
it to another register, and then store the result back to memory with
only a single instruction do not exist.
• For hardware designers, this is considered to be a very good thing, since
some older architectures had so many options and modes for loading
and storing data that it became nearly impossible to build the
processors without introducing errors in the logic.

Addressing: Loads and Stores
(Cont..)

• Load instructions take a single value from memory and write it to a
general-purpose register. Store instructions read a value from a general-
purpose register and store it to memory.
• Load and store instructions have a single instruction format:
LDR|STR{<size>}{<cond>} <Rd>, <addressing_mode>
• where <size> is an optional size such as byte or halfword (word is the
default size) and <cond> is an optional condition. We will learn this in detail
in the next chapter.
• The addressing modes allowed are actually quite flexible, as we’ll see in
the next section, and they have two things in common: a base register
and an (optional) offset.

Addressing: Loads and Stores
(Cont..)

• For example, the instruction
LDR R9, [R12, R8, LSL #2]
would have a base register of R12 and an offset value created by shifting
register R8 left by two bits.
• The LSL is a logical shift left by a certain number of bits. The offset is
added to the base register to create the effective address for the load in
this case.
• The term effective address is often used to describe the final address
created from values in the various registers, with offsets and/or shifts.
• For example, in the instruction above, if the base register R12 contained
the value 0x4000 and we added register R8, the offset, which contained
0x20, to it, we would have an effective address of 0x4080 (remember
the offset is shifted).
• We will discuss the addressing modes in detail in Chapter 3.
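
The effective address arithmetic for LDR R9, [R12, R8, LSL #2] can be checked numerically. This is a minimal Python sketch: effective_address is an illustrative helper name, and it models only a register offset with an optional logical shift left.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base, offset, lsl=0):
    """Effective address = base + (offset shifted left by `lsl` bits), kept to 32 bits."""
    return (base + ((offset << lsl) & MASK32)) & MASK32


r12 = 0x4000      # base register
r8 = 0x20         # offset register
ea = effective_address(r12, r8, lsl=2)
print(hex(ea))    # 0x20 shifted left by 2 is 0x80, so this prints 0x4080
```

Shifting the offset left by two bits multiplies it by 4, which is why this form is convenient for indexing an array of 32-bit words.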
Instruction Set Architecture
(ISA)
• The instruction set, also called ISA (instruction set architecture), is part
of a computer that pertains to programming, which is basically machine
language.
• All processors are programmed with a set of instructions, which are
unique patterns of bits, or 1’s and 0’s. Each set is unique to that
particular processor.
• These instructions might tell the processor to add two numbers
together, move data from one place to another, or sit quietly until
something wakes it up, like a key being pressed.
• A processor from Intel, such as the Core i5, has a set of bit patterns that
are completely different from a SPARC processor or an ARM processor.
However, all instruction sets have some common operations, and
learning one instruction set will help you understand nearly any of
them.

Instruction Set Architecture
(ISA)
• The instructions themselves can be of different lengths, depending on
the processor architecture: 8, 16, or 32 bits long, or even a combination
of these.
• For our studies, the instructions are either 16 or 32 bits long; although,
much later on, we’ll examine how the ARM processors can use some
shorter, 16-bit instructions in combination with the 32-bit instructions.
• Reading and writing a string of 1’s and 0’s can give you a headache
rather quickly, so to aid in programming, a particular bit pattern is
mapped onto an instruction name, or a mnemonic

Instruction Set Architecture
(ISA)
• Hence, instead of reading
2805
F101010A
F04F0208
• the programmer can read
CMP R0,#5
ADD R1,#10
MOV R2,#8
Instruction Set Architecture
(ISA)
• Consider the bit pattern for the instruction above:
MOV R2, #8
• The pattern is the hex number 0xF04F0208. From the figure below, we can
see that the ARM processor expects parts of our instruction in certain
fields. The number 8, for example, would be placed in the field called
8_bit_immediate, and the instruction itself, moving a number into a
register, is encoded in the field called opcode.
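
To make the idea of instruction fields concrete, the Python sketch below pulls a few fields out of 0xF04F0208. The bit positions are an assumption based on the 32-bit Thumb-2 MOV (immediate) encoding as commonly documented (first halfword 11110 i 0 opcode S 1111, second halfword 0 imm3 Rd imm8) and should be checked against the ARMv7-M Architecture Reference Manual. The function name decode_mov_imm_t2 is made up for this sketch.

```python
def decode_mov_imm_t2(word):
    """Split a 32-bit Thumb-2 MOV (immediate) pattern into its fields (assumed layout)."""
    hw1 = (word >> 16) & 0xFFFF       # first halfword, e.g. 0xF04F
    hw2 = word & 0xFFFF               # second halfword, e.g. 0x0208
    return {
        "i": (hw1 >> 10) & 0x1,       # top bit of the immediate
        "opcode": (hw1 >> 5) & 0xF,   # operation selector within this group
        "S": (hw1 >> 4) & 0x1,        # 1 if the instruction updates the flags
        "imm3": (hw2 >> 12) & 0x7,    # middle bits of the immediate
        "Rd": (hw2 >> 8) & 0xF,       # destination register number
        "imm8": hw2 & 0xFF,           # the 8-bit immediate field
    }


fields = decode_mov_imm_t2(0xF04F0208)
print(fields)
```

For this pattern the destination field holds 2 (register R2) and the 8-bit immediate field holds 8, matching MOV R2, #8.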

Instruction Set Architecture
(ISA)
How does the microprocessor execute the instructions?

Program (instruction : machine code)
CMP R0,#5 : 2805
ADD R1,#10 : F101010A
MOV R2,#8 : F04F0208

Memory (address : content)
20000000 : 0x28
20000001 : 0x05
20000002 : 0xF1
20000003 : 0x01
20000004 : 0x01
20000005 : 0x0A
20000006 : 0xF0
20000007 : 0x4F
20000008 : 0x02
20000009 : 0x08

Registers
PC = 0x20000002
R0 = 0x00000005
R1 = 0x00000003
R2 = 0x00000002

When the microprocessor executes the CMP instruction, the PC will point to the next instruction to
be executed.
Instruction Set Architecture
(ISA)
How does the microprocessor execute the instructions?

Program (instruction : machine code)
CMP R0,#5 : 2805
ADD R1,#10 : F101010A
MOV R2,#8 : F04F0208

Memory contents at 20000000 to 20000009 are unchanged from the previous slide.

Registers
PC = 0x20000006
R0 = 0x00000005
R1 = 0x0000000D
R2 = 0x00000002

When the microprocessor executes the ADD instruction, the PC will point to the next instruction to
be executed.
Instruction Set Architecture
(ISA)
How does the microprocessor execute the instructions?

Program (instruction : machine code)
CMP R0,#5 : 2805
ADD R1,#10 : F101010A
MOV R2,#8 : F04F0208

Memory contents at 20000000 to 20000009 are unchanged from the previous slide.

Registers
PC = 0x2000000A
R0 = 0x00000005
R1 = 0x0000000D
R2 = 0x00000008

When the microprocessor executes the MOV instruction, the PC will point to the next instruction to
be executed.
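
The PC values in the three steps can be reproduced with a short Python sketch. It relies on the Thumb-2 rule that a halfword whose top five bits are 11101, 11110 or 11111 starts a 32-bit instruction, and any other halfword is a complete 16-bit instruction. The helper name thumb2_length and the list layout are choices made for this sketch, and the model ignores pipeline effects on the value read from the PC.

```python
def thumb2_length(first_halfword):
    """Return the instruction length in bytes, judged from its first halfword."""
    top5 = first_halfword >> 11
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


# CMP R0,#5 ; ADD R1,#10 ; MOV R2,#8 as a stream of halfwords
halfwords = [0x2805, 0xF101, 0x010A, 0xF04F, 0x0208]

pc = 0x20000000
index = 0
trace = []
while index < len(halfwords):
    size = thumb2_length(halfwords[index])
    pc += size                 # the PC moves past the instruction being executed
    index += size // 2
    trace.append(pc)

print([hex(value) for value in trace])
```

The trace gives 0x20000002, 0x20000006 and 0x2000000A, the same PC values shown after CMP, ADD and MOV.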
Instruction Set Architecture
(ISA)
Instruction pipeline
• Instruction pipelining is a technique used in the design of modern
microprocessors, microcontrollers and CPUs to increase their
instruction throughput (the number of instructions that can be
executed in a unit of time).
• The ARM Cortex-M3 processor has a three-stage pipeline. The
pipeline stages are instruction fetch, instruction decode, and
instruction execution.
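
How a three-stage pipeline overlaps instructions can be sketched as a simple schedule. This Python model is idealised: it assumes every stage takes exactly one clock cycle and that there are no stalls or branches. The function name pipeline_schedule is made up for the sketch.

```python
STAGES = ["fetch", "decode", "execute"]


def pipeline_schedule(instructions):
    """Return one dict per clock cycle, mapping each stage to the instruction in it."""
    total_cycles = len(instructions) + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            index = cycle - depth          # instruction that entered `depth` cycles ago
            in_range = 0 <= index < len(instructions)
            row[stage] = instructions[index] if in_range else None
        schedule.append(row)
    return schedule


program = ["CMP R0,#5", "ADD R1,#10", "MOV R2,#8"]
for cycle, row in enumerate(pipeline_schedule(program), start=1):
    print(cycle, row)
```

In this idealised model three instructions finish in five cycles instead of nine, and in the third cycle all three stages are busy at once.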

Instruction Set Architecture
(ISA)
Instruction pipeline

Figure: Instruction pipeline stages against the clock
Instruction Set Architecture
(ISA)
The Thumb-2 Technology
• The original ARM instructions are 32 bits wide, and they were the first to be
used on older architectures such as the ARM7TDMI, ARM9, ARM10,
and ARM11.
For a list of ARM architectures:
https://en.wikipedia.org/wiki/ARM_architecture
• Thumb instructions (Thumb-1), which are a subset of ARM instructions,
also work on 32-bit data; however, they are 16 bits wide.
• Thumb-2 is a superset of Thumb instructions, including new 32-bit
instructions for more complex operations. In other words, Thumb-2 is a
combination of both 16-bit and 32-bit instructions.
• Generally, it is left to the compiler or assembler to choose the optimal
size, but a programmer can force the issue if necessary. Some cores,
such as the Cortex-M3 and M4, only execute Thumb-2 instructions.
Instruction Set Architecture
(ISA)
The Thumb-2 Technology
• The Thumb-2 technology extended the Thumb Instruction Set
Architecture (ISA) into a highly efficient and powerful instruction set
that delivers significant benefits in terms of ease of use, code size, and
performance

Instruction Set Architecture
(ISA)

Figure: Binary upwards compatibility from the ARMv6-M architecture to the ARMv7-M architecture
Exercise

1. What is the size of the memory for the microprocessor if it has 24-bit
address lines (bus)? Furthermore, give the starting address and the last
address of the memory.
2. List the operation modes of the ARM Cortex-M3.
3. What is the function of register R13? Register R14? Register R15?
4. On an ARM Cortex-M3, in any given mode, how many registers does a
programmer see at one time?
5. Which bits of the ARM Cortex-M3 status registers contain the status
flags?
6. How many stages does the ARM Cortex-M3 pipeline have? Name them.
7. Suppose that the Program Counter, register R15, contained the hex value
0x8000. From what address would a Cortex-M3 fetch an instruction?
Assume that all instructions are 32 bits wide.
8. What is the size of the ARM Cortex-M3’s address bus?
9. What is the Thumb instruction set? What is the difference between Thumb-1
and Thumb-2?
Exercise

10. Explain the load-store architecture in a RISC processor.


11. For an 8-bit microprocessor, determine the value of the N, Z, V, C flag
after the microprocessor executes the following arithmetic operations.
a. 127 + 2 b. 10 – 10 c. 100 – 50
d. 90 + 20 e. -10 + 20 f. -100 – 50
