Cortex-M0+ CPU Core

Uploaded by

Harsh Gupta
Cortex-M0+ CPU Core

1
Overview
 Cortex-M0+ Processor Core Registers

 Memory System and Addressing

 Thumb Instruction Set

 References
 ARMv6-M Architecture Reference Manual (ARM DDI 0419C)

2
Microcontroller vs. Microprocessor
 Both have a CPU core to execute instructions
 Microcontroller has peripherals for embedded interfacing and control
   Analog
   Non-logic level signals
   Timing
   Clock generators
   Communications
     point to point
     network
   Reliability and safety

3
Cortex-M0+ Core

4
Architectures and Memory Speed
 Load/Store Architecture
 Developed to simplify CPU design and improve performance
 Memory wall: CPUs keep getting faster than memory
 Memory accesses slow down CPU, limit compiler optimizations
 Change instruction set to make most instructions independent of memory
 Data processing instructions can access registers only
1. Load data into the registers
2. Process the data
3. Store results back into memory
 More effective when more registers are available

 Register/Memory Architecture
 Data processing instructions can access memory or registers
 Memory wall is not very high at lower CPU speeds (e.g. under 50 MHz)

5
ARM Processor Core Registers
 R0-R12 - General purpose, for data processing
 SP - Stack pointer (R13)
 Can refer to one of two SPs
 Main Stack Pointer (MSP)
 Process Stack Pointer (PSP)
 Uses MSP initially, and in Handler mode
 In Thread mode, can select either MSP or PSP using SPSEL flag in CONTROL register.
 LR - Link Register (R14)
 Holds return address when called with Branch with Link instruction (BL)
 PC - program counter (R15)

6
Operating Modes
[State diagram: Reset enters Thread mode (MSP or PSP). Starting exception processing switches to Handler mode (MSP). When exception processing completes, the core returns to Thread mode.]
 Which SP is active depends on operating mode, and SPSEL (CONTROL register bit 1)
 SPSEL == 0: MSP
 SPSEL == 1: PSP
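The selection rule above can be sketched in Python. This is an illustration only, not from the slides; the function name and the mode strings are hypothetical.

```python
def active_stack_pointer(mode: str, spsel: int) -> str:
    """Return which stack pointer R13 refers to: 'MSP' or 'PSP'.

    mode  -- 'thread' or 'handler' (hypothetical labels for this sketch)
    spsel -- CONTROL register bit 1 (0 or 1)
    """
    if mode == "handler":
        # Handler mode always uses the main stack pointer.
        return "MSP"
    if mode == "thread":
        # Thread mode follows SPSEL: 0 selects MSP, 1 selects PSP.
        return "PSP" if spsel == 1 else "MSP"
    raise ValueError("mode must be 'thread' or 'handler'")


if __name__ == "__main__":
    for mode in ("thread", "handler"):
        for spsel in (0, 1):
            print(mode, spsel, active_stack_pointer(mode, spsel))
```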

7
ARM Program Status Register

 Three views of same register


 Application PSR (APSR)
 Condition code flag bits Negative, Zero, oVerflow, Carry used for
conditional branches, extended precision math, error detection
 Interrupt PSR (IPSR)
 Holds exception number of currently executing ISR
 Execution PSR (EPSR)
 Thumb state

8
ARM Processor Core Registers
 PRIMASK - Exception mask register
 Bit 0: PM Flag
 Set to 1 to prevent activation of all exceptions with configurable priority
 Access using CPS, MSR and MRS instructions
 Use to prevent data race conditions with code needing atomicity

 CONTROL
 Bit 1: SPSEL flag
 Selects SP when in thread mode: MSP (0) or PSP (1)
 Bit 0: nPRIV flag
 Defines whether thread mode is privileged (0) or unprivileged (1)
 With OS environment,
 Threads use PSP
 OS and exception handlers (ISRs) use MSP

9
Memory Maps For Cortex M0+ and MCU
KL25Z128VLK4
 16 KB SRAM
   SRAM_U (3/4): 0x2000_0000 to 0x2000_2FFF
   SRAM_L (1/4): 0x1FFF_F000 to 0x1FFF_FFFF
 128 KB Flash: 0x0000_0000 to 0x0001_FFFF
 Some RAM is located in Code segment, allowing code to run from RAM to allow flash reprogramming or for better speed on faster systems

10
Endianness
 For a multi-byte value, in what order are the bytes stored?
 Register (32 bits): bits 31-24 = B3 (MSB), bits 23-16 = B2, bits 15-8 = B1, bits 7-0 = B0 (LSB)
 Little-Endian: Start with least-significant byte
   Address A: B0 (LSB)
   Address A+1: B1
   Address A+2: B2
   Address A+3: B3 (MSB)
 Big-Endian: Start with most-significant byte
   Address A: B3 (MSB)
   Address A+1: B2
   Address A+2: B1
   Address A+3: B0 (LSB)
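
The two byte orders can be checked with a short Python sketch (not part of the original slides). The helper name store_word is hypothetical, and the value 0xB3B2B1B0 is chosen so that each byte matches its label B3..B0.

```python
def store_word(value: int, byteorder: str) -> bytes:
    """Return the 4 bytes of a 32-bit word as they would sit at addresses A..A+3."""
    return value.to_bytes(4, byteorder)


WORD = 0xB3B2B1B0  # byte B3 is the MSB, byte B0 is the LSB

little = store_word(WORD, "little")  # address A holds B0
big = store_word(WORD, "big")        # address A holds B3

if __name__ == "__main__":
    print("little:", [hex(b) for b in little])
    print("big:   ", [hex(b) for b in big])
```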

11
ARMv6-M Endianness

 Instructions are always little-endian

 Loads and stores to Private Peripheral Bus are always little-endian

 Data: Depends on implementation, or from reset configuration


 Kinetis processors are little-endian

12
Different Instruction Sets for Different Design Spaces?

 ARM instructions optimized for resource-rich high-performance computing systems
 Deeply pipelined processor, high clock rate, wide (e.g. 32-bit) memory bus
 https://en.wikipedia.org/wiki/ARM_Cortex-M#Instruction_sets
 Low-end embedded computing systems are different
 Slower clock rates, shallow pipelines
 Different cost factors – e.g. code size matters much more
 Bit and byte operations critical

13
The Memory Wall

 It has been easier to speed up the CPU than the memory


 Facts of life
 Off-chip memory is slower than on-chip memory. May not want
to put all memory on-chip, even if possible.
 Flash is slower to read or write than RAM.
 Fast RAM is more expensive than slow RAM. Same for flash.
 Design for high-performance CPUs
 Use caches (small fast RAM) to make main memory (large slow RAM, flash) look faster at a low cost.
 Put cache(s) on chip if possible.
 Increase bandwidth by widening memory bus, improving protocol, reducing overhead, split transactions, using page mode, etc.
 Design for low-performance CPUs
 Put memory on-chip with CPU. RAM, flash ROM
 Increase flash ROM bandwidth by widening memory bus, adding prefetch buffer, branch target buffer, etc.
 Add cache
 Change instruction set size to reduce instruction bandwidth needed
[Figure: design options arranged from Low Performance to High Performance; callout: Double flash bus width]
14
ARM and Thumb Instructions

 Thumb technology reduces program memory size and bandwidth requirements
 Thumb provides a subset of ARM 32-bit instructions re-encoded into usually fewer bits (most 16 bits, some 32 bits)
   Not all 32-bit instructions available
   Most 16-bit instructions can only access low registers (R0-R7), but a few can access high registers (R8-R15)
 1995: Thumb-1 instruction set
   16-bit instructions
 2003: Thumb-2 instruction set
   Adds some 32-bit instructions
   Improves speed with little memory overhead
 CPU operating state
   CPU decodes instructions based on whether in Thumb state or ARM state - controlled by T bit
   Thumb state is selected by an odd branch target address (LSB = 1); that bit sets the T bit and is not kept in the program counter
   Cortex-M0+ only uses Thumb instructions, is always in Thumb state
   See ARMv6-M Architecture Reference Manual for specifics per instruction (Section A6.7)
15
Cortex-M Instruction Groups
Each group below lists: group, instruction size in bits, then support on M0,M0+,M1 / M3 / M4 / M7 / M23 / M33,M35P.

Thumb-1, 16 bits: Yes / Yes / Yes / Yes / Yes / Yes
ADC, ADD, ADR, AND, ASR, B, BIC, BKPT, BLX, BX, CMN, CMP, CPS, EOR, LDM, LDR, LDRB, LDRH, LDRSB, LDRSH, LSL, LSR, MOV, MUL, MVN, NOP, ORR, POP, PUSH, REV, REV16, REVSH, ROR, RSB, SBC, SEV, STM, STR, STRB, STRH, SUB, SVC, SXTB, SXTH, TST, UXTB, UXTH, WFE, WFI, YIELD

Thumb-1, 16 bits: No / Yes / Yes / Yes / Yes / Yes
CBNZ, CBZ

Thumb-1, 16 bits: No / Yes / Yes / Yes / No / Yes
IT

Thumb-2, 32 bits: Yes / Yes / Yes / Yes / Yes / Yes
BL, DMB, DSB, ISB, MRS, MSR

Thumb-2, 32 bits: No / Yes / Yes / Yes / Yes / Yes
SDIV, UDIV

Thumb-2, 32 bits: No / Yes / Yes / Yes / No / Yes
ADC, ADD, ADR, AND, ASR, B, BFC, BFI, BIC, CDP, CLREX, CLZ, CMN, CMP, DBG, EOR, LDC, LDM, LDR, LDRB, LDRBT, LDRD, LDREX, LDREXB, LDREXH, LDRH, LDRHT, LDRSB, LDRSBT, LDRSH, LDRSHT, LDRT, LSL, LSR, MCR, MCRR, MLA, MLS, MOV, MOVT, MRC, MRRC, MUL, MVN, NOP, ORN, ORR, PLD, PLDW, PLI, POP, PUSH, RBIT, REV, REV16, REVSH, ROR, RRX, RSB, SBC, SBFX, SEV, SMLAL, SMULL, SSAT, STC, STM, STR, STRB, STRBT, STRD, STREX, STREXB, STREXH, STRH, STRHT, STRT, SUB, SXTB, SXTH, TBB, TBH, TEQ, TST, UBFX, UMLAL, UMULL, USAT, UXTB, UXTH, WFE, WFI, YIELD

DSP, 32 bits: No / No / Yes / Yes / No / Optional
PKH, QADD, QADD16, QADD8, QASX, QDADD, QDSUB, QSAX, QSUB, QSUB16, QSUB8, SADD16, SADD8, SASX, SEL, SHADD16, SHADD8, SHASX, SHSAX, SHSUB16, SHSUB8, SMLABB, SMLABT, SMLATB, SMLATT, SMLAD, SMLALBB, SMLALBT, SMLALTB, SMLALTT, SMLALD, SMLAWB, SMLAWT, SMLSD, SMLSLD, SMMLA, SMMLS, SMMUL, SMUAD, SMULBB, SMULBT, SMULTT, SMULTB, SMULWT, SMULWB, SMUSD, SSAT16, SSAX, SSUB16, SSUB8, SXTAB, SXTAB16, SXTAH, SXTB16, UADD16, UADD8, UASX, UHADD16, UHADD8, UHASX, UHSAX, UHSUB16, UHSUB8, UMAAL, UQADD16, UQADD8, UQASX, UQSAX, UQSUB16, UQSUB8, USAD8, USADA8, USAT16, USAX, USUB16, USUB8, UXTAB, UXTAB16, UXTAH, UXTB16

SP Float, 32 bits: No / No / Optional / Optional / No / Optional
VABS, VADD, VCMP, VCMPE, VCVT, VCVTR, VDIV, VLDM, VLDR, VMLA, VMLS, VMOV, VMRS, VMSR, VMUL, VNEG, VNMLA, VNMLS, VNMUL, VPOP, VPUSH, VSQRT, VSTM, VSTR, VSUB

DP Float, 32 bits: No / No / No / Optional / No / No
VCVTA, VCVTM, VCVTN, VCVTP, VMAXNM, VMINNM, VRINTA, VRINTM, VRINTN, VRINTP, VRINTR, VRINTX, VRINTZ, VSEL

TrustZone, 16 bits: No / No / No / No / Optional / Optional
BLXNS, BXNS

TrustZone, 32 bits: No / No / No / No / Optional / Optional
SG, TT, TTT, TTA, TTAT

Co-processor, 16 bits: No / No / No / No / No / Optional
CDP, CDP2, MCR, MCR2, MCRR, MCRR2, MRC, MRC2, MRRC, MRRC2

https://en.wikipedia.org/wiki/ARM_Cortex-M#Instruction_sets
16
Reference for ARM Instruction Set Architecture

 ARMv6-M Architecture Reference Manual, Chapter A5. The Thumb Instruction Set Encoding
 16- or 32-bit instruction?
 Bits [15:11]
 0b11101, 0b11110, 0b11111: 32-bit instruction. Page A5-91
 Else 16-bit instruction. Page A5-84
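
The length rule can be written as a small Python check. This is only a sketch of the rule stated above; the function name is hypothetical and the sample halfwords are illustrative.

```python
def is_32bit_thumb(first_halfword: int) -> bool:
    """Return True if this halfword is the first half of a 32-bit Thumb instruction.

    The decision depends only on bits [15:11] of the first halfword.
    """
    top5 = (first_halfword >> 11) & 0x1F
    return top5 in (0b11101, 0b11110, 0b11111)


if __name__ == "__main__":
    # 0xF000 is the first halfword of a BL encoding; 0x1888 encodes ADDS r0, r1, r2.
    for halfword in (0xF000, 0x1888, 0xE7FE):
        size = 32 if is_32bit_thumb(halfword) else 16
        print(hex(halfword), "->", size, "bit instruction")
```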

17
Example Instruction Encoding: ADC (register)

 Page A6-106 of ARMv6-M Architecture Reference Manual

18
Example Instruction Encoding: ADD (register)

 Page A6-109 of ARMv6-M Architecture Reference Manual

19
Assembler Instruction Format
 <operation> <operand1> <operand2> <operand3>
 There may be fewer operands
 First operand is typically destination (<Rd>)
 Other operands are sources (<Rn>, <Rm>)

 Examples
 ADDS <Rd>, <Rn>, <Rm>
 Add registers: <Rd> = <Rn> + <Rm>
 ANDS <Rdn>, <Rm>
 Bitwise AND, update condition flags: <Rdn> = <Rdn> & <Rm>
 CMP <Rn>, <Rm>
 Compare: Set condition flags based on result of computing <Rn> - <Rm>

20
Where Can the Operands Be Located?
 In a general-purpose register R
 Destination: Rd
 Source: Rm, Rn
 Both source and destination: Rdn
 Target: Rt
 Source for shift amount: Rs

 An immediate value encoded in instruction word

 In a condition code flag

 In memory
 Only for load, store, push and pop instructions

21
Update Condition Codes in APSR?

 “S” suffix indicates the instruction updates APSR


 ADD vs. ADDS
 ADC vs. ADCS
 SUB vs. SUBS
 MOV vs. MOVS

22
AAPCS Register Use Conventions

 Arm Architecture Procedure Call Standard (AAPCS): Makes it easier to create modular and isolated yet composable code
 Preserved (“variable”) registers are expected to have their original values upon returning from a called subroutine
   r4-r8, r10-r11
   Must be saved, restored by callee-procedure if it will modify them. Calling subroutine expects these to retain their value.
 Scratch registers are not expected to be preserved upon returning from a called subroutine
   r0-r3
   Don’t need to be saved. May be used for arguments, results, or temporary values.

23
Instruction Set Summary
Instruction Type           Instructions
Move                       MOV
Load/Store                 LDR, LDRB, LDRH, LDRSH, LDRSB, LDM, STR, STRB, STRH, STM
Add, Subtract, Multiply    ADD, ADDS, ADCS, ADR, SUB, SUBS, SBCS, RSBS, MULS
Compare                    CMP, CMN
Logical                    ANDS, EORS, ORRS, BICS, MVNS, TST
Shift and Rotate           LSLS, LSRS, ASRS, RORS
Stack                      PUSH, POP
Branch                     B, BL, B{cond}, BX, BLX
Extend                     SXTH, SXTB, UXTH, UXTB
Reverse                    REV, REV16, REVSH
Processor State            SVC, CPSID, CPSIE, BKPT
No Operation               NOP
Hint                       SEV, WFE, WFI, YIELD
Barriers                   DMB, DSB, ISB

24
Load and Store Register Instructions

 ARM is a load/store architecture, so must process data in registers (not memory)
 LDR: load register with word (32 bits) from memory
   LDR <Rt>, source address
 STR: store register contents (32 bits) to memory
   STR <Rt>, destination address
 Source and destination addresses are specified using available addressing modes
 Offset Addressing mode: [<Rn>, <offset>] accesses address <Rn>+<offset>
   Base Register <Rn> can be R0-R7, SP or PC
   <offset> is added or subtracted from base register to create effective address
     Can be an immediate constant
     Can be another register, used as index <Rm>
 Auto-update: Can write effective address back to base register
   Pre-indexing: use effective address to access memory, then update base register
   Post-indexing: use base register to access memory, then update base register
   Note: auto-update is a general ARM feature; the ARMv6-M LDR/STR encodings provide offset addressing only, with no write-back

25
Loading/Storing Smaller Data Sizes
Signed Unsigned
Byte LDRSB LDRB
Half-word LDRSH LDRH

 Some load and store instructions can handle half-word (16 bits) and byte (8 bits)
 Store just writes to half-word or byte
 STRH, STRB
 Loading a byte or half-word requires padding or extension: What do we put in the upper bits of the register?
 Example: How do we extend 0x80 into a full word?
 Unsigned? Then 0x80 = 128, so zero-pad to extend to word 0x0000_0080 = 128
 Signed? Then 0x80 = -128, so sign-extend to word 0xFFFF_FF80 = -128

26
In-Register Size Extension
Signed Unsigned
Byte SXTB UXTB
Half-word SXTH UXTH

 Can also extend byte or half-word already in a register


 Signed or unsigned (zero-pad)
 How do we extend 0x80 into a full word?
 Unsigned? Then 0x80 = 128, so zero-pad to extend to word 0x0000_0080 = 128
 Signed? Then 0x80 = -128, so sign-extend to word 0xFFFF_FF80 = -128
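
The 0x80 example can be reproduced with a short Python sketch of zero extension (as done by LDRB, LDRH, UXTB, UXTH) and sign extension (LDRSB, LDRSH, SXTB, SXTH). The helper names are hypothetical and not from the slides.

```python
def zero_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits; upper bits of the 32-bit word become 0."""
    return value & ((1 << bits) - 1)


def sign_extend(value: int, bits: int) -> int:
    """Copy bit (bits-1) into all upper bits of the 32-bit word."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return ((value ^ sign) - sign) & 0xFFFFFFFF


def as_signed32(word: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed number."""
    return word - (1 << 32) if word & 0x80000000 else word


if __name__ == "__main__":
    print(hex(zero_extend(0x80, 8)), as_signed32(zero_extend(0x80, 8)))
    print(hex(sign_extend(0x80, 8)), as_signed32(sign_extend(0x80, 8)))
```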

27
Load/Store Multiple
 LDM/LDMIA: load multiple registers starting from [base register], update base register afterwards
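 LDM <Rn>!,<registers> - base register is updated (used when <Rn> is not in <registers>)
 LDM <Rn>,<registers> - base register is not updated (used when <Rn> is in <registers>, since it is loaded from memory)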

 STM/STMIA: store multiple registers starting at [base register], update base register after
 STM <Rn>!, <registers>

 LDMIA and STMIA are pseudo-instructions, translated by assembler

28
Load Literal Value into Register

 Assembly pseudo-instruction: LDR <rd>, =value
   Assembler generates code to load <rd> with value
 Assembler selects best approach depending on value
   Load immediate
     MOV instruction provides 8-bit unsigned immediate operand (0-255)
   Load and shift immediate values
     Can use MOV, shift, rotate, sign extend instructions
   Load from literal pool
     1. Place value as a 32-bit literal in the program’s literal pool (table of literal values to be loaded into registers)
     2. Use instruction LDR <rd>, [pc,#offset] where offset indicates position of literal relative to program counter value
 Example formats for literal values (depends on compiler and toolchain used)
   Decimal: 3909
   Hexadecimal: 0xa7ee
   Character: ‘A’
   String: “44??”
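
For the literal pool case, the address read by LDR <rd>, [pc,#offset] can be sketched as follows. This assumes the 16-bit Thumb encoding, where the offset is an 8-bit field scaled by 4 and the PC value is the instruction address plus 4, rounded down to a word boundary. The function name is hypothetical.

```python
def literal_address(instr_address: int, imm8: int) -> int:
    """Address of the literal loaded by a 16-bit LDR <rd>, [pc, #imm8*4]."""
    if not 0 <= imm8 <= 255:
        raise ValueError("imm8 must fit in 8 bits")
    pc = instr_address + 4      # PC reads as the instruction address + 4
    base = pc & ~0x3            # word-align the PC value
    return base + (imm8 << 2)   # offset is always a multiple of 4


if __name__ == "__main__":
    # The same literal can be reached from a word-aligned or halfword-aligned instruction.
    print(hex(literal_address(0x100, 1)))
    print(hex(literal_address(0x102, 1)))
```

The largest reachable offset in this encoding is 255 * 4 = 1020 bytes, which is why the literal pool has to sit close to the code that uses it.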

29
Move (Pseudo-)Instructions
 Copy data from one register to another without updating condition flags
   MOV <Rd>, <Rm>
 Copy data from one register to another and update condition flags
   MOVS <Rd>, <Rm>
 Copy immediate literal value (0-255) into register and update condition flags
   MOVS <Rd>, #<imm8>
 Assembler translates pseudo-instructions into equivalent instructions (shifts, rotates)

30
Stack Operations
 Push some or all of registers (R0-R7, LR) to stack
 PUSH {<registers>}
 Decrements SP by 4 bytes for each register saved
 Pushing LR saves return address
 PUSH {r1, r2, LR}
 Largest register number pushed first (to largest address)

 Pop some or all of registers (R0-R7, PC) from stack


 POP {<registers>}
 Increments SP by 4 bytes for each register restored
 If PC is popped, then execution will branch to new PC value after this POP instruction (e.g. return address)
 POP {r5, r6, r7}
 Smallest register number popped first (from smallest address)
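
The ordering rules for PUSH and POP can be illustrated with a small Python model of a descending stack. This is a sketch, not a cycle-accurate model: memory is a dict keyed by address, the register values are made up, and the function names are hypothetical.

```python
def reg_number(name: str) -> int:
    """Map a register name such as 'r5', 'lr' or 'pc' to its number."""
    special = {"sp": 13, "lr": 14, "pc": 15}
    return special[name] if name in special else int(name[1:])


def push(regs: dict, memory: dict, reg_list: list) -> None:
    """PUSH {reg_list}: SP drops by 4 per register; lowest register lands at lowest address."""
    ordered = sorted(reg_list, key=reg_number)
    regs["sp"] -= 4 * len(ordered)
    address = regs["sp"]
    for name in ordered:
        memory[address] = regs[name]
        address += 4


def pop(regs: dict, memory: dict, reg_list: list) -> None:
    """POP {reg_list}: lowest register is read from lowest address; SP rises by 4 per register."""
    address = regs["sp"]
    for name in sorted(reg_list, key=reg_number):
        value = memory[address]
        # Popping PC branches to the value; bit 0 only marks Thumb state.
        regs[name] = value & ~1 if name == "pc" else value
        address += 4
    regs["sp"] = address


if __name__ == "__main__":
    regs = {"r1": 0x11, "r2": 0x22, "lr": 0x0801, "pc": 0, "sp": 0x20003000}
    memory = {}
    push(regs, memory, ["r1", "r2", "lr"])
    print(hex(regs["sp"]), {hex(a): hex(v) for a, v in sorted(memory.items())})
    pop(regs, memory, ["r1", "r2", "pc"])
    print(hex(regs["sp"]), hex(regs["pc"]))
```

Pushing LR on entry and popping the same slot into PC on exit is the usual way a subroutine returns after calling other subroutines.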

31
Add Instructions
 Add registers, update condition flags
 ADDS <Rd>,<Rn>,<Rm>

 Add registers and carry bit, update condition flags


 ADCS <Rdn>,<Rm>

 Add registers
 ADD <Rdn>,<Rm>

 Add immediate value to register


 ADDS <Rd>,<Rn>,#<imm3>
 ADDS <Rdn>,#<imm8>

32
Add Instructions with Stack Pointer
 Add SP and immediate value
 ADD <Rd>,SP,#<imm8>
 ADD SP,SP,#<imm7>

 Add SP value to register


 ADD <Rdm>, SP, <Rdm>
 ADD SP,<Rm>

33
Address to Register Pseudo-Instruction
 Add immediate value to PC, write result in register
 ADR <Rd>,<label>

 How is this used?


 Enables storage of constant data near program counter
 First, load register R2 with address of const_data
 ADR R2, const_data
 Second, load const_data into R2
 LDR R2, [R2]

 <label> must be word-aligned and within 1020 bytes after the current (word-aligned) PC value

34
Subtract
 Subtract immediate from register, update condition flags
 SUBS <Rd>,<Rn>,#<imm3>
 SUBS <Rdn>,#<imm8>

 Subtract registers, update condition flags


 SUBS <Rd>,<Rn>,<Rm>

 Subtract registers with carry, update condition flags


 SBCS <Rdn>,<Rm>

 Subtract immediate from SP


 SUB SP,SP,#<imm7>

35
Multiply
 Multiply source registers, save lower word of result in destination register, update condition flags
 MULS <Rdm>, <Rn>, <Rdm>
 <Rdm> = <Rdm> * <Rn>

 Works for signed or unsigned operands: the lower word of the product is the same in both cases
 Note: upper word of result is truncated
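
The truncation can be seen in a short Python sketch of MULS. The function name is hypothetical; the sketch returns the low 32 bits of the product together with the N and Z flags.

```python
MASK32 = 0xFFFFFFFF


def muls(rdm: int, rn: int):
    """Return (low 32 bits of rdm * rn, N flag, Z flag)."""
    result = (rdm * rn) & MASK32   # upper word of the 64-bit product is dropped
    n_flag = result >> 31          # N = bit 31 of the result
    z_flag = int(result == 0)      # Z = 1 when the truncated result is zero
    return result, n_flag, z_flag


if __name__ == "__main__":
    # 0x10000 * 0x10000 = 2**32, which does not fit in 32 bits.
    print(muls(0x10000, 0x10000))
    # 0xFFFFFFFF is -1 as a signed word, so the low word equals -2.
    print(hex(muls(0xFFFFFFFF, 2)[0]))
```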

36
Logical Operations
 Bitwise AND registers, update condition flags
 ANDS <Rdn>,<Rm>
 Bitwise OR registers, update condition flags
 ORRS <Rdn>,<Rm>
 Bitwise Exclusive OR registers, update condition flags
 EORS <Rdn>,<Rm>
 Bitwise AND register and complement of second register, update condition flags
 BICS <Rdn>,<Rm>
 Move inverse of register value to destination, update condition flags
 MVNS <Rd>,<Rm>
 Update condition flags by ANDing two registers, discarding result
 TST <Rn>, <Rm>

37
Compare
 Compare - subtracts second value from first, discards result, updates APSR
 CMP <Rn>,#<imm8>
 CMP <Rn>,<Rm>

 Compare negative - adds two values, updates APSR, discards result


 CMN <Rn>,<Rm>
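
A Python sketch of how CMP and CMN set the N, Z, C and V flags, modelling subtraction as addition of the bitwise complement plus one. The function names are hypothetical; only the flags are returned because the result is discarded.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(x: int, y: int, carry_in: int):
    """Add two 32-bit values plus carry-in; return (result, flags dict)."""
    unsigned_sum = x + y + carry_in
    result = unsigned_sum & MASK32
    flags = {
        "N": result >> 31,                                # sign bit of result
        "Z": int(result == 0),                            # result is zero
        "C": int(unsigned_sum > MASK32),                  # carry out of bit 31
        "V": (((x ^ result) & (y ^ result)) >> 31) & 1,   # signed overflow
    }
    return result, flags


def cmp_flags(rn: int, operand2: int) -> dict:
    """CMP: compute rn - operand2 as rn + NOT(operand2) + 1, keep only the flags."""
    return add_with_carry(rn, ~operand2 & MASK32, 1)[1]


def cmn_flags(rn: int, rm: int) -> dict:
    """CMN: compute rn + rm, keep only the flags."""
    return add_with_carry(rn, rm, 0)[1]


if __name__ == "__main__":
    print(cmp_flags(5, 5))   # equal operands
    print(cmp_flags(3, 5))   # first operand smaller: borrow, so C is clear
```

With this model the carry flag after a subtraction is 1 when no borrow occurred, which matches the "carry = not-borrow" convention.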

38
Shift and Rotate
 Common features
 All of these instructions update APSR condition flags
 Shift/rotate amount (in number of bits) specified by last operand
 Logical shift left - shifts in zeroes on right
 LSLS <Rd>,<Rm>,#<imm5>
 LSLS <Rdn>,<Rm>
 Logical shift right - shifts in zeroes on left
 LSRS <Rd>,<Rm>,#<imm5>
 LSRS <Rdn>,<Rm>
 Arithmetic shift right - shifts in copies of sign bit on left (to maintain arithmetic sign)
 ASRS <Rd>,<Rm>,#<imm5>
 Rotate right
 RORS <Rdn>,<Rm>
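
The four operations can be sketched in Python on 32-bit values. This sketch assumes a shift amount from 1 to 31 and returns the result together with the carry flag (the last bit shifted out); the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def lsls(value: int, amount: int):
    """Logical shift left: zeroes enter on the right."""
    carry = (value >> (32 - amount)) & 1
    return (value << amount) & MASK32, carry


def lsrs(value: int, amount: int):
    """Logical shift right: zeroes enter on the left."""
    carry = (value >> (amount - 1)) & 1
    return value >> amount, carry


def asrs(value: int, amount: int):
    """Arithmetic shift right: copies of the sign bit enter on the left."""
    signed = value - (1 << 32) if value & 0x80000000 else value
    carry = (signed >> (amount - 1)) & 1
    return (signed >> amount) & MASK32, carry


def rors(value: int, amount: int):
    """Rotate right: bits leaving on the right re-enter on the left."""
    result = ((value >> amount) | (value << (32 - amount))) & MASK32
    return result, result >> 31


if __name__ == "__main__":
    for op in (lsls, lsrs, asrs, rors):
        result, carry = op(0x80000001, 1)
        print(op.__name__, hex(result), carry)
```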

39
Reversing Bytes

 REV - reverse all bytes in word
 REV <Rd>,<Rm>
 REV16 - reverse bytes in both half-words
 REV16 <Rd>,<Rm>
 REVSH - reverse bytes in low half-word (signed) and sign-extend
 REVSH <Rd>,<Rm>

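The three byte-reversal operations can be sketched in Python on 32-bit values. The function names simply mirror the mnemonics; the code is an illustration, not taken from the slides.

```python
def rev(x: int) -> int:
    """Reverse all four bytes of a 32-bit word."""
    return int.from_bytes(x.to_bytes(4, "little"), "big")


def rev16(x: int) -> int:
    """Reverse the two bytes inside each half-word independently."""
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)


def revsh(x: int) -> int:
    """Reverse the bytes of the low half-word, then sign-extend to 32 bits."""
    half = ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)
    return (half | 0xFFFF0000) if (half & 0x8000) else half


if __name__ == "__main__":
    print(hex(rev(0x11223344)))
    print(hex(rev16(0x11223344)))
    print(hex(revsh(0x00001280)))
```

These operations are typically used to convert data between little-endian and big-endian byte order.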
40
Changing Program Flow - Branches

 Unconditional Branches
   B <label>
   Target address must be within 2 KB of branch instruction (-2048 B to +2046 B)

 Conditional Branches
   B<cond> <label>
   <cond> is condition - see next page
   B<cond> target address must be within 256 B of branch instruction (-256 B to +254 B)

41
Condition Codes

 Append to branch instruction (B) to make a conditional branch

 Full ARM instructions (not Thumb or Thumb-2) support conditional execution of arbitrary instructions

 Note: Carry bit = not-borrow for compares and subtractions
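
The standard ARM condition codes can be written as tests on the APSR flags. The Python sketch below is an illustration; the table and function names are hypothetical, and flags are passed as a dict with keys N, Z, C, V.

```python
# Each condition suffix maps to a test on the N, Z, C, V flags.
CONDITIONS = {
    "EQ": lambda f: f["Z"] == 1,                      # equal
    "NE": lambda f: f["Z"] == 0,                      # not equal
    "CS": lambda f: f["C"] == 1,                      # carry set / unsigned higher or same
    "CC": lambda f: f["C"] == 0,                      # carry clear / unsigned lower
    "MI": lambda f: f["N"] == 1,                      # negative
    "PL": lambda f: f["N"] == 0,                      # positive or zero
    "VS": lambda f: f["V"] == 1,                      # overflow
    "VC": lambda f: f["V"] == 0,                      # no overflow
    "HI": lambda f: f["C"] == 1 and f["Z"] == 0,      # unsigned higher
    "LS": lambda f: f["C"] == 0 or f["Z"] == 1,       # unsigned lower or same
    "GE": lambda f: f["N"] == f["V"],                 # signed greater or equal
    "LT": lambda f: f["N"] != f["V"],                 # signed less than
    "GT": lambda f: f["Z"] == 0 and f["N"] == f["V"], # signed greater than
    "LE": lambda f: f["Z"] == 1 or f["N"] != f["V"],  # signed less or equal
}


def condition_passed(cond: str, flags: dict) -> bool:
    """Return True if a B<cond> branch would be taken for the given flags."""
    return CONDITIONS[cond](flags)


if __name__ == "__main__":
    after_cmp_3_5 = {"N": 1, "Z": 0, "C": 0, "V": 0}  # flags left by CMP with 3 and 5
    print([c for c in CONDITIONS if condition_passed(c, after_cmp_3_5)])
```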

42
Changing Program Flow - Subroutines

 Call
   BL <label> - branch with link
     Call subroutine at <label>
     PC-relative, range limited to PC+/-16MB
     Save return address in LR
   BLX <Rd> - branch with link and exchange
     Call subroutine at address in register Rd (exchange Rd with PC)
     Supports full 4GB address range
     LSB of target address must be set to 1 to ensure continued execution in Thumb state
     Save return address in LR
 Return
   BX <Rd> - branch and exchange
     Branch to address specified by <Rd>
     LSB of target address must be set to 1 to ensure continued execution in Thumb state
     Supports full 4 GB address space
   BX LR - Return from subroutine
   POP {PC}

43
Special Register Instructions
 Move to Register from Special Register
   MRS <Rd>, <spec_reg>

 Move to Special Register from Register
   MSR <spec_reg>, <Rn>

 Change Processor State - Modify PRIMASK register
   CPSIE i - Interrupt enable (clears PRIMASK)
   CPSID i - Interrupt disable (sets PRIMASK)

44
Other

 No Operation - does nothing! Used for delays, or to align instruction to word address
 NOP

 Breakpoint - causes hard fault or debug halt - used to implement software breakpoints
 BKPT #<imm8>

 Wait for interrupt - Pause program, enter low-power state until a WFI wake-up event occurs (e.g. an interrupt)
 WFI

 Supervisor call generates SVC exception (#11), same as software interrupt


 SVC #<imm>

45
