STG Captronic 01 - CM3-Introduction
V7-M ARCHITECTURE
V7-A/R/M profiles
[Figure: ARM architecture roadmap, with complexity, number of gates, performance and power consumption increasing from V4T (ARM7TDMI) through V5TE (ARM926) and V6(Z) (ARM1136, ARM1176) to V7-R (Cortex-R4, R5, R7) and V7-A (Cortex-A5/A8/A9MP/A15MP)]
Around 2005, ARM identified three markets requiring different core features:
Application
Real-Time
Microcontroller
V7-M ARCHITECTURE
Thumb v2
[Figure: instruction sets and extensions supported by the V4T, V5TE, V6(Z), V7-A, V7-R, V7-M and V7-EM architectures]
ARM
Thumb v1
Thumb v2 (a superset of Thumb v1)
Thumb-2 EE (V7-A)
Jazelle (V7-A) option
TrustZone (V7-A only)
NEON (V7-A) option
SIMD using the regular R0-R12 registers
VFPv3 option (single and double precision)
FPv4-SP option (single precision)
ARM decided not to maintain this compatibility (support for the ARM instruction set) in V7-M, in order to:
Simplify the design of the cores
Allow determinism
The exception mechanism of the V7-M architecture defines a nesting scheme
that makes it possible to calculate the worst-case latency accurately
A key factor is that the application level is consistent across all profiles
V7-M ARCHITECTURE
Code Size and Performance
Relative Dhrystone performance and code size for ARM, Thumb and Thumb-2
V7-M ARCHITECTURE
SIMD instructions
The Cortex-M4 implements the V7-EM architecture
Compared to Cortex-M3, DSP-oriented instructions are supported
Example of a SIMD instruction
SMLADX Rd,Rn,Rm,Ra
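[Figure: SMLADX data flow. Rn and Rm are each split into two signed 16-bit halves (bits 31:16 and 15:0). The halves are multiplied crosswise (Rn[15:0] * Rm[31:16] and Rn[31:16] * Rm[15:0]), both products are added to the 32-bit accumulator Ra, the result is written to Rd, and the Q flag is set if the accumulation overflows]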
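The data flow of SMLADX can be checked on a host machine with a small C model. This is only a sketch: the function names `sext16` and `smladx_model` are hypothetical, the Q flag is represented by a plain `int`, and the model follows the architectural description of the instruction rather than any code from this course.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of a word to a signed 32-bit value. */
static int32_t sext16(uint32_t half)
{
    half &= 0xFFFFu;
    return (half & 0x8000u) ? (int32_t)half - 0x10000 : (int32_t)half;
}

/* Host-side model of SMLADX Rd,Rn,Rm,Ra (hypothetical helper).
   Returns the bits written to Rd. *q_flag is set to 1 when the 32-bit
   accumulation overflows and is left unchanged otherwise (Q is sticky). */
static uint32_t smladx_model(uint32_t rn, uint32_t rm, int32_t ra, int *q_flag)
{
    assert(q_flag != NULL);

    /* The X suffix exchanges the halves of Rm before multiplying. */
    int64_t p1 = (int64_t)sext16(rn) * sext16(rm >> 16);   /* Rn low  * Rm high */
    int64_t p2 = (int64_t)sext16(rn >> 16) * sext16(rm);   /* Rn high * Rm low  */
    int64_t sum = p1 + p2 + ra;

    if (sum > INT32_MAX || sum < INT32_MIN) {
        *q_flag = 1;
    }
    return (uint32_t)sum;   /* keep the low 32 bits, as the register would */
}
```

For example, with Rn = 0x00020003, Rm = 0x00040005 and Ra = 10, the model computes 3*4 + 2*5 + 10 = 32 and leaves the Q flag untouched.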
ARM and its partners have developed a library that is provided as source code
It is called CMSIS 2.0 (Cortex Microcontroller Software Interface Standard)
Standard mathematical functions are supported
FFT, FIR, biquad
Complex arithmetic
Sin, Cos
Matrix calculation
Statistics functions
For instance, let us compare the performance of a 16-bit complex FFT, 1024 samples
with and without DSP instructions
Harvard Architecture
The whole code, including boot code and exception handlers, can be developed in C
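As an illustration of boot and exception code written entirely in C, the sketch below builds a V7-M vector table as an ordinary C array. The names `vector_t`, `vector_table`, `stack` and `app_main` are hypothetical, and the stack size is arbitrary. On a real target the table would also need to be placed at the vector base address by the linker script, which is toolchain specific and not shown here.

```c
#include <stdint.h>

typedef void (*handler_t)(void);

/* One vector table slot: a handler address, or the initial stack pointer (slot 0). */
typedef union {
    handler_t handler;
    void     *stack_top;
} vector_t;

/* Hypothetical 1 KB stack; real projects usually take this from the linker script. */
static uint32_t stack[256];

static int app_main(void) { return 0; }          /* placeholder application entry */

static void Reset_Handler(void)                  /* runs first after reset */
{
    (void)app_main();
    for (;;) { }                                 /* never return from reset */
}

static void Default_Handler(void) { for (;;) { } }
static void SysTick_Handler(void) { }

/* The first 16 entries are defined by the architecture; unused slots stay zero. */
static const vector_t vector_table[16] = {
    [0]  = { .stack_top = &stack[256] },         /* initial main stack pointer */
    [1]  = { .handler = Reset_Handler },
    [2]  = { .handler = Default_Handler },       /* NMI */
    [3]  = { .handler = Default_Handler },       /* HardFault */
    [4]  = { .handler = Default_Handler },       /* MemManage */
    [5]  = { .handler = Default_Handler },       /* BusFault */
    [6]  = { .handler = Default_Handler },       /* UsageFault */
    [11] = { .handler = Default_Handler },       /* SVCall */
    [12] = { .handler = Default_Handler },       /* Debug monitor */
    [14] = { .handler = Default_Handler },       /* PendSV */
    [15] = { .handler = SysTick_Handler },
};
```

Because the hardware loads the stack pointer from slot 0 before fetching the reset vector, `Reset_Handler` can be a normal C function with no assembly prologue.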
BLOCK DIAGRAM
Harvard Architecture and Bus Matrix
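[Figure: Cortex-M3 CPU. The Cortex-M3 core (3-stage pipeline) has separate instruction (I) and data (D) interfaces to a bus matrix, which drives the I-code bus, the D-code bus and the system bus (all AHB-lite). The PPB (APB) and the emulation probe connection are also shown]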
BLOCK DIAGRAM
Core peripherals
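[Figure: Cortex-M3 CPU core peripherals. SysTick, MPU and SCB are reached through the PPB; the bus matrix drives the I-code bus, the D-code bus and the system bus (all AHB-lite), next to the emulation probe connection]
[Figure: 3-stage pipeline. Fetch; Decode (instruction decode and register read, branch); Execute (multiply and divide, shift and ALU, branch, write-back)]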
BLOCK DIAGRAM
3-Stage pipeline: Data Path
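[Figure: data path. Instruction side: address incrementer, address register, HADDR_I, HDATA_I, instruction decode, exception vector. Data side: address register, HADDR_D, HRDATA_D, byte select and sign extend. Execution: register bank R0-R15 feeding operand buses A and B, MUL/DIV unit, barrel shifter, ALU, xPSR, writeback]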
The Cortex-M3 supports the following sleep modes reducing power consumption:
Sleep Mode
Internal Clock gating
The Cortex-M3 main clock is stopped
Deep Sleep Mode
Indicated to an external power management unit
Effects are implementation defined
• Some SoC power domains could be turned off, for example
BLOCK DIAGRAM
Power Management: Sleep Modes
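The NVIC controls the low power modes:
[Figure: power domains. The power management unit, the WIC and the stimulus peripheral are always powered; the NVIC and the Cortex-M3 core can be powered down, with clamps on their outputs and RAM retention]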
Sleep-on-exit Mode:
When SLEEPONEXIT bit of the System Control Register (SCR) is set, the processor
completes the execution of an exception handler and immediately enters sleep mode
[Figure: timeline from the IRQ to the return from exception]
WFE has the same effect as WFI, but the Cortex-M3 also wakes up on a pulse on the event
input pin (what is connected there is implementation defined)
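The sleep behaviour described above is selected through bits of the System Control Register. The sketch below shows one way to set them from C. The helper names are hypothetical; the register address and the SLEEPDEEP bit position come from the V7-M architecture rather than from these slides. The register is passed as a pointer so that the code can also be exercised on a host machine.

```c
#include <stdint.h>

#define SCB_SCR_ADDR     0xE000ED10u   /* System Control Register on V7-M */
#define SCR_SLEEPONEXIT  (1u << 1)     /* sleep when returning to Thread mode */
#define SCR_SLEEPDEEP    (1u << 2)     /* request deep sleep instead of sleep */

/* Hypothetical helper: choose between sleep (0) and deep sleep (non-zero). */
static void select_deep_sleep(volatile uint32_t *scr, int deep)
{
    if (deep) {
        *scr |= SCR_SLEEPDEEP;
    } else {
        *scr &= ~SCR_SLEEPDEEP;
    }
}

/* Hypothetical helper: enable or disable sleep-on-exit. */
static void set_sleep_on_exit(volatile uint32_t *scr, int enable)
{
    if (enable) {
        *scr |= SCR_SLEEPONEXIT;
    } else {
        *scr &= ~SCR_SLEEPONEXIT;
    }
}

/* On a target, the pointer would be (volatile uint32_t *)SCB_SCR_ADDR and the
   core would then execute WFI or WFE, typically through a compiler intrinsic. */
```

With sleep-on-exit enabled, an interrupt-driven application needs no idle loop: the core goes back to sleep as soon as the last active handler returns.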
BLOCK DIAGRAM
Debug Features
[Figure: Cortex-M3 CPU debug blocks. SysTick, MPU, DWT, ITM and the TPIU (trace port, serial wire or multi-pin) sit on the PPB; the FPB sits on the code path; the bus matrix drives the I-code bus, the D-code bus and the system bus (all AHB-lite), next to the emulation probe connection]
Runtime code can be instrumented in order to track what happens in the core
The ITM unit contains memory-mapped FIFOs whose contents are exported to the
outside world through the TPIU
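Instrumenting code through the ITM typically comes down to writing to a stimulus port once it reports that it is ready. The sketch below is a host-testable model under stated assumptions: `itm_view_t` is a hypothetical struct that groups three registers for convenience and does not reproduce the real memory layout (the target addresses in the comments come from the V7-M architecture), and `max_spins` is an added parameter that bounds the wait.

```c
#include <stdint.h>

/* Hypothetical grouping of the registers used here (not the real layout). */
typedef struct {
    volatile uint32_t stim0;  /* stimulus port 0, 0xE0000000 on target */
    volatile uint32_t ter;    /* trace enable register, 0xE0000E00 */
    volatile uint32_t tcr;    /* trace control register, 0xE0000E80 */
} itm_view_t;

#define ITM_TCR_ITMENA  (1u << 0)   /* ITM globally enabled */
#define ITM_TER_PORT0   (1u << 0)   /* stimulus port 0 enabled */

/* Send one character through stimulus port 0.
   Returns 1 if the character was written, 0 if the ITM is disabled or the
   port FIFO did not become ready within max_spins polls. */
static int itm_put_char(itm_view_t *itm, char c, unsigned max_spins)
{
    if (!(itm->tcr & ITM_TCR_ITMENA) || !(itm->ter & ITM_TER_PORT0)) {
        return 0;                    /* tracing off: skip at low cost */
    }
    while (itm->stim0 == 0u) {       /* a port reads as non-zero when it can accept data */
        if (max_spins-- == 0u) {
            return 0;
        }
    }
    /* On target a byte-wide store is normally used so that a 1-byte packet is emitted. */
    itm->stim0 = (uint8_t)c;
    return 1;
}
```

Checking the enable bits first means the instrumentation stays in the code and costs only a few cycles when no trace tool is connected.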
BLOCK DIAGRAM
Debug Features
[Figure: debug connections. The host system connects over USB or Ethernet to an SWJ probe, which reaches the Cortex-M3 through either JTAG (5 wires) or Serial Wire (2 wires); alternatively, an SW-only probe uses the 2-wire Serial Wire link]
BLOCK DIAGRAM
Debug Features: TPIU
Non-invasive debug (with ETM) can be done through either the Serial Wire Viewer or the
parallel trace port
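[Figure: trace connections. Serial Wire Viewer: the TPIU drives the single SWO pin to an SWV port analyzer, connected to the host system over USB or Ethernet. Parallel trace: the TPIU drives 4 TraceData pins to a trace port analyzer, connected to the host system over USB or Ethernet]
[Figure: Cortex-M3 CPU debug access. SW/JTAG pins, SWJ-DP, AHB-AP, bus matrix, FPB, APB, I-code bus, D-code bus and system bus (AHB-lite)]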
NVIC: Nested Vectored Interrupt Controller
MPU: Memory Protection Unit
AHB: Advanced High-performance Bus
APB: Advanced Peripheral Bus
DWT: Data Watchpoint and Trace
ITM: Instrumentation Trace Macrocell
ETM: Embedded Trace Macrocell
FPB: Flash Patch and Breakpoint
SW: Serial Wire
SWJ-DP: Serial Wire JTAG Debug Port
AHB-AP: AHB Access Port
TPIU: Trace Port Interface Unit
BLOCK DIAGRAM
Implementation options
Memory-mapped registers
As seen before, the core peripherals are configured through memory-mapped
registers
They are accessed by the load and store instructions
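Since these registers are reached with ordinary loads and stores, C code can describe a peripheral as a struct of `volatile` fields and access it through a pointer. The sketch below uses the SysTick timer as an example. The struct and function names are hypothetical; the register offsets, bit positions and base address follow the V7-M architecture and are not taken from these slides.

```c
#include <stdint.h>

/* SysTick register block; on target it starts at 0xE000E010. */
typedef struct {
    volatile uint32_t ctrl;    /* control and status */
    volatile uint32_t load;    /* reload value (24 bits) */
    volatile uint32_t val;     /* current value */
    volatile uint32_t calib;   /* calibration value */
} systick_regs_t;

#define SYSTICK_BASE_ADDR       0xE000E010u
#define SYSTICK_CTRL_ENABLE     (1u << 0)   /* counter enable */
#define SYSTICK_CTRL_TICKINT    (1u << 1)   /* raise the exception at zero */
#define SYSTICK_CTRL_CLKSOURCE  (1u << 2)   /* use the processor clock */

/* Hypothetical helper: start a periodic tick every `ticks` clock cycles.
   Returns 0 on success, -1 if the period does not fit the 24-bit counter. */
static int systick_start(systick_regs_t *st, uint32_t ticks)
{
    if (ticks == 0u || ticks > 0x1000000u) {
        return -1;
    }
    st->load = ticks - 1u;     /* each field access compiles to a plain store */
    st->val  = 0u;             /* any write clears the current value */
    st->ctrl = SYSTICK_CTRL_CLKSOURCE | SYSTICK_CTRL_TICKINT | SYSTICK_CTRL_ENABLE;
    return 0;
}

/* On a target: systick_start((systick_regs_t *)SYSTICK_BASE_ADDR, 72000u); */
```

The `volatile` qualifier prevents the compiler from removing or reordering these accesses, which matters because each one has a side effect in hardware.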
PRIVILEGE LEVELS
The idea is to protect specific address ranges (data, I/O registers, etc.) against
non-privileged accesses (by placing the core at the non-privileged level)
Note: The System Control Block registers are all privileged registers
PRIVILEGE LEVELS
System / User Land
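[Figure: execution level over time t. Kernel code and kernel data belong to the System level, application code and application data to the user level; the core is at the System level after reset and on each interrupt]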
DOCUMENTATION
Software specifications
V7-M
AAPCS (ARM Architecture Procedure Call Standard)
Hardware specification
AMBA v3 (Advanced Microcontroller Bus Architecture)
Cortex-M3 Technical Reference Manual (TRM), one per core version
Debug specification
CoreSight