Application Note: Bare-Metal Boot Code For Armv8-A Processors
Non-Confidential
http://www.arm.com
The following section describes the typographical conventions and how to give feedback:
Typographical conventions
The following typographical conventions are used:
monospace denotes text that can be entered at the keyboard, such as
commands, file and program names, and source code.
monospace denotes a permitted abbreviation for a command or option. The
underlined text can be entered instead of the full command or option
name.
monospace italic
denotes arguments to commands and functions where the argument
is to be replaced by a specific value.
monospace bold
denotes language keywords when used outside example code.
italic highlights important notes, introduces special terminology, denotes
internal cross-references, and citations.
bold highlights interface elements, such as menu names. Also used for
emphasis in descriptive lists, where appropriate, and for ARM®
processor signal names.
Feedback on documentation
If you have comments on the documentation, e-mail [email protected]. Give:
• The title.
• If viewing a PDF version of a document, the page numbers to which your comments
apply.
Other information
• ARM Information Center, http://infocenter.arm.com/help/index.jsp.
• ARM® Cortex™-A Series Programmer’s Guide for ARMv7-A (ARM DEN 0013).
PL Privilege Level.
SoC System on Chip.
SP Stack Pointer.
This chapter describes the purpose and scope of this application note.
It contains the following topics:
• Document purpose.
• Initializing registers.
• AArch64.
This application note provides boot code examples for each Execution state.
For boot code examples applicable to ARMv7-A processors, see the ARM® Cortex™-A
Series Programmer’s Guide for ARMv7-A.
Example 4-1 shows a typical vector table that is used for reset and other exceptions.
Example 4-1 Typical vector table
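.balign 0x20
vector_table_base_address:
B reset_handler        // 0x00: Reset.
B undefined_handler    // 0x04: Undefined Instruction.
B svc_handler          // 0x08: Supervisor Call.
B prefetch_handler     // 0x0C: Prefetch Abort.
B data_handler         // 0x10: Data Abort.
NOP                    // 0x14: Reserved (Hyp Trap entry in the Hyp vector table).
B IRQ_handler          // 0x18: IRQ interrupt.
                       // 0x1C: FIQ interrupt. The FIQ handler can start here.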
The vector entries in the four tables might be different. For details, see the section,
Exception vectors and the exception base address, in the ARM® Architecture Reference
Manual ARMv8, for ARMv8-A architecture profile.
You must initialize all four vector tables and program the vector table base address
registers before using the tables. The base address of each vector table must be
32-byte aligned.
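As a rough illustration of the alignment rule, the following C sketch checks that a candidate base address has bits [4:0] clear and computes the address of each entry. The enum and function names are hypothetical and are not part of the original examples. The slot order follows the eight-entry layout of Example 4-1, with entries 4 bytes apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the eight AArch32 vector slots (4 bytes each). */
enum a32_vector_slot {
    A32_VEC_RESET = 0,
    A32_VEC_UNDEF,
    A32_VEC_SVC,
    A32_VEC_PREFETCH_ABORT,
    A32_VEC_DATA_ABORT,
    A32_VEC_RESERVED,
    A32_VEC_IRQ,
    A32_VEC_FIQ
};

/* A vector table base is 32-byte aligned when bits [4:0] are zero. */
static bool a32_vector_base_is_aligned(uint32_t base)
{
    return (base & 0x1Fu) == 0;
}

/* Address of the entry for a given slot in a table at 'base'. */
static uint32_t a32_vector_address(uint32_t base, enum a32_vector_slot slot)
{
    assert(a32_vector_base_is_aligned(base));
    return base + 4u * (uint32_t)slot;
}
```

For example, with a table at 0x8000 the IRQ entry is at 0x8018, which matches the seventh line of Example 4-1.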
Example 4-2 shows you how to initialize VBAR and MVBAR after reset.
CPSIE aif // Clear CPSR.A, CPSR.I, and CPSR.F to unmask asynchronous aborts, IRQs, and FIQs.
Example 4-4 shows you how to initialize general-purpose registers after reset. Because
there are banked general-purpose registers for different modes in AArch32, the example
code changes to different modes and initializes them all.
Example 4-4 General-purpose registers initialization
MOV R0, #0
MOV R1, #0
MOV R2, #0
MOV R3, #0
MOV R4, #0
MOV R5, #0
MOV R6, #0
MOV R7, #0
MOV R8, #0
MOV R9, #0
MOV R10, #0
MOV R11, #0
MOV R12, #0
MOV R13, #0
MOV R14, #0
MOV R8, #0
MOV R9, #0
MOV R10, #0
MOV R11, #0
MOV R12, #0
MOV R14, #0
MOV R13, #0
MOV R14, #0
MOV R13, #0 // System and User modes reuse the same banking
MOV R13, #0
MOV R14, #0
MOV R13, #0
MOV R14, #0
MOV R13, #0
MOV R14, #0
MCR P15, 0, R1, C1, C0, 2 // CPACR full access to cp11 and cp10.
MOV R1, #0
MOV R2, #0
VMOV.F64 D1, D0
VMOV.F64 D2, D0
VMOV.F64 D3, D0
VMOV.F64 D4, D0
VMOV.F64 D5, D0
VMOV.F64 D6, D0
VMOV.F64 D7, D0
VMOV.F64 D8, D0
VMOV.F64 D9, D0
VMOV.F64 D10, D0
VMOV.F64 D11, D0
VMOV.F64 D12, D0
VMOV.F64 D13, D0
VMOV.F64 D14, D0
VMOV.F64 D15, D0
VMOV.F64 D16, D0
VMOV.F64 D17, D0
VMOV.F64 D18, D0
VMOV.F64 D19, D0
VMOV.F64 D20, D0
VMOV.F64 D21, D0
VMOV.F64 D22, D0
VMOV.F64 D23, D0
VMOV.F64 D24, D0
VMOV.F64 D25, D0
VMOV.F64 D26, D0
VMOV.F64 D27, D0
VMOV.F64 D28, D0
VMOV.F64 D29, D0
VMOV.F64 D30, D0
VMOV.F64 D31, D0
MOV R0, #0
MSR SPSR, R0
MSR SPSR_svc, R0
MSR SPSR_und, R0
MSR SPSR_hyp, R0
MSR SPSR_abt, R0
MSR SPSR_irq, R0
MSR SPSR_fiq, R0
// Initialize ELR_hyp.
MOV R0, #0
MSR ELR_hyp, R0
// Disable L1 Caches.
MCR P15, 2, R0, C0, C0, 0 // CSSELR Cache Size Selection Register.
way_loop:
set_loop:
// Initialize TTBCR.
// Initialize DACR.
MCR P15, 0, R1, C3, C0, 0 // Accesses are checked against the
// Initialize SCTLR.AFE.
BIC R1, R1, #(0x1 <<29) // Set AFE to 0 and disable Access Flag.
// Initialize TTBR0.
// executable.
loop:
SUBS R3, #1
BNE loop
loop2:
SUBS R3, #1
BNE loop2
MOV R1, #0
loop:
SUBS R4, #1
BNE loop
Example 4-11 creates the translation table as a section at compile time. This method is
fast for simulations, because the table does not have to be built by store instructions at
run time. It is written in GNU assembler syntax. The code that initializes the translation
table control registers in Example 4-10 is still required.
.word \low
.word \high
.endm
.endm
.endm
.endm
.align 12
ttb0_base:
TABLE_ENTRY level2_pagetable, 0
.align 12
level2_pagetable:
.rept 0x200
.endr
ORR R1, R1, #(0x1 << 12) // The I bit (instruction cache).
DSB
ISB
MOV R0, #(0xF << 20) // Enable CP10 & CP11 function
MCR P15, 0, R0, C1, C0, 2 // Write the Coprocessor Access Control Register (CPACR).
VMSR FPEXC, R1
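The values written by the two register writes above can be sketched in C. This is an illustration only. The macro and function names are hypothetical, and the sketch assumes the architectural field positions CPACR.cp10 (bits [21:20]), CPACR.cp11 (bits [23:22]), and FPEXC.EN (bit 30). The original listing does not show the value loaded into R1 before the VMSR, so the FPEXC value here is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define CPACR_CP10_SHIFT 20u          /* CPACR.cp10, bits [21:20]          */
#define CPACR_CP11_SHIFT 22u          /* CPACR.cp11, bits [23:22]          */
#define CP_FULL_ACCESS   0x3u         /* 0b11: full access                 */
#define FPEXC_EN         (1u << 30)   /* FPEXC.EN (assumed value for R1)   */

/* Return 'cpacr' with full access granted to CP10 and CP11. */
static uint32_t cpacr_enable_fp(uint32_t cpacr)
{
    return cpacr | (CP_FULL_ACCESS << CPACR_CP10_SHIFT)
                 | (CP_FULL_ACCESS << CPACR_CP11_SHIFT);
}
```

Starting from zero, cpacr_enable_fp() yields the same constant as the `#(0xF << 20)` immediate in the listing.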
4.4.2 Enabling access to the NEON and FP functionality in the Non-secure world
Access to NEON technology and FP functionality from the Non-secure world is disabled
after reset. If software requires access to the NEON and FP registers in the Non-secure
world, the Non-secure Access Control Register (NSACR) must be initialized in EL3.
Example 4-14 shows you how to configure the NSACR after reset.
Example 4-14 NSACR configuration
MOV R1, #(0x3 << 10) // Enable Non-secure access to CP10 & CP11.
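The constant used in Example 4-14 can be modelled in C as follows. This is a sketch with hypothetical names, assuming the architectural positions NSACR.cp10 (bit 10) and NSACR.cp11 (bit 11). On the target, the result would still have to be written to NSACR from Secure state.

```c
#include <assert.h>
#include <stdint.h>

#define NSACR_CP10 (1u << 10)   /* Non-secure access to CP10 */
#define NSACR_CP11 (1u << 11)   /* Non-secure access to CP11 */

/* Return 'nsacr' with Non-secure access to CP10 and CP11 enabled. */
static uint32_t nsacr_allow_ns_fp(uint32_t nsacr)
{
    return nsacr | NSACR_CP10 | NSACR_CP11;
}
```

The OR keeps any other NSACR bits that were already set, which is why a read-modify-write sequence is usually preferred over writing the constant alone.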
4.4.3 Enabling access to the NEON and FP functionality in Non-secure EL1 and EL0
Access to the NEON and FP functionality from Non-secure EL1 or EL0 can be trapped to
Hypervisor mode. The trap must be disabled if a program must access NEON and FP
functionality in Non-secure EL1 or EL0. The trap is disabled by default after core
reset, so this step might be unnecessary.
Example 4-15 shows you how to disable trapping of accesses to NEON technology and FP
functionality from Non-secure EL1 or EL0 by programming the Hyp Architectural Feature
Trap Register (HCPTR).
Note
The HCPTR register can be accessed in EL2 and EL3 (NS=1).
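As an illustration of what disabling the trap means at the bit level, the following C sketch clears the two trap bits. The names are hypothetical, and the sketch assumes the architectural positions HCPTR.TCP10 (bit 10) and HCPTR.TCP11 (bit 11). All other bits are passed through unchanged.

```c
#include <assert.h>
#include <stdint.h>

#define HCPTR_TCP10 (1u << 10)   /* Trap accesses to CP10 to Hyp mode */
#define HCPTR_TCP11 (1u << 11)   /* Trap accesses to CP11 to Hyp mode */

/* Return 'hcptr' with the NEON and FP traps disabled. */
static uint32_t hcptr_disable_fp_trap(uint32_t hcptr)
{
    return hcptr & ~(HCPTR_TCP10 | HCPTR_TCP11);
}
```

If the two bits are already clear, as they are expected to be after reset, the function returns its input unchanged.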
For details, see the section, Security state, in the ARM® Architecture Reference Manual
ARMv8, for ARMv8-A architecture profile.
The following sections describe how to change between these modes when a processor
runs in AArch32:
• Changing between User, System, FIQ, IRQ, Supervisor, Abort, Undefined modes.
4.5.1 Changing between User, System, FIQ, IRQ, Supervisor, Abort, Undefined modes
When booting in AArch32 state, processors enter Secure Supervisor mode after reset.
Normally, processors change mode by taking or returning from exceptions. In a bare-metal
test, it is simpler to change mode by writing the CPSR.M bits directly.
Example 4-16 shows you how to change from a non-User mode to other modes.
Example 4-16 Mode change
CPS #Mode_FIQ
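The effect of a mode change on CPSR.M can be sketched in C. The enum and function names are hypothetical. The encodings are the architectural CPSR.M[4:0] values, and the sketch assumes that the Mode_FIQ symbol in Example 4-16 is defined as 0x11.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CPSR.M[4:0] encodings for the AArch32 modes. */
enum cpsr_mode {
    MODE_USR = 0x10, MODE_FIQ = 0x11, MODE_IRQ = 0x12,
    MODE_SVC = 0x13, MODE_MON = 0x16, MODE_ABT = 0x17,
    MODE_HYP = 0x1A, MODE_UND = 0x1B, MODE_SYS = 0x1F
};

#define CPSR_M_MASK 0x1Fu

/* Return 'cpsr' with only the mode field replaced, as CPS #mode does. */
static uint32_t cpsr_with_mode(uint32_t cpsr, enum cpsr_mode mode)
{
    return (cpsr & ~CPSR_M_MASK) | (uint32_t)mode;
}
```

For example, a CPSR of 0x1D3 (Supervisor mode with A, I, and F masked) becomes 0x1D1 after a change to FIQ mode. The mask bits are untouched.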
Example 4-17 shows you how to change from User mode to Supervisor mode.
Example 4-17 Mode switch from User mode to Supervisor mode
// When processors are in User mode, use SVC to change from User mode
// to SVC mode. Make sure that VBAR is initialized before executing SVC.
SVC #0
Example 4-18 shows you how to use the SMC instruction to enter Monitor mode.
SMC #0
To switch from the Secure world to the Non-secure world, the processor must set
SCR.NS to 1 in Monitor mode. After that, the processor returns to Non-secure world with
an exception return.
Example 4-19 shows you how to switch to Non-secure Supervisor mode when the
processor is in Monitor mode.
Example 4-19 Switch from Secure world to Non-secure world
// (SCR).
MOV R0, #0
// Exception return.
ERET
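The SCR update that precedes the exception return can be modelled in C as follows. This is a minimal sketch with hypothetical names, assuming the architectural positions SCR.NS (bit 0) and SCR.HCE (bit 8). The SPSR and return address that the monitor code must also prepare before ERET are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCR_NS  (1u << 0)   /* Non-secure bit                    */
#define SCR_HCE (1u << 8)   /* Hypervisor Call (HVC) enable bit  */

/* SCR value for a return to the Non-secure world. */
static uint32_t scr_enter_non_secure(uint32_t scr, bool enable_hvc)
{
    scr |= SCR_NS;
    if (enable_hvc)
        scr |= SCR_HCE;
    return scr;
}

/* SCR value for a return to the Secure world: clear NS only. */
static uint32_t scr_enter_secure(uint32_t scr)
{
    return scr & ~SCR_NS;
}
```

Both helpers leave unrelated SCR bits alone, so they correspond to a read-modify-write of the register in Monitor mode.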
To switch from the Non-secure world to the Secure world, the processor performs the
following steps:
1. Enter Monitor mode.
Example 4-20 shows you how to clear the SCR.NS bit when the processor is in Monitor
mode.
// is in Monitor mode.
ORR R1, R1, #(1 << 8) // Set SCR.HCE (bit 8) and enable HVC.
MOV R0, #0
ERET
Example 4-22 shows you how to enter Hypervisor mode from any of the Non-secure
System, FIQ, IRQ, Supervisor, Abort, or Undefined modes.
Example 4-22 Enter Hypervisor mode
HVC #0
Reset vector
In AArch64, the processor starts execution from an IMPLEMENTATION-DEFINED
address, which is set by the hardware input pins RVBARADDR and can be read from
the RVBAR_EL3 register. You must place boot code at this address.
Vector table
Each exception level has a dedicated vector table, whose base address is held in:
• VBAR_EL3.
• VBAR_EL2.
• VBAR_EL1.
The vector table in AArch64 is different from that in AArch32. The vector table in AArch64
contains 16 entries. Each entry is 128 bytes in size and contains at most 32 instructions.
Vector tables must be placed at a 2KB-aligned address. The addresses are specified by
initializing the VBAR_ELn registers.
For more details about the vector table, see the section, Exception vectors, in the ARM®
Architecture Reference Manual ARMv8, for ARMv8-A architecture profile.
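The layout arithmetic can be sketched in C: 16 entries of 0x80 bytes give a 0x800-byte table, which is why the base must be 2KB aligned. The enum and function names below are hypothetical. The entry order (four groups of Synchronous, IRQ, FIQ, and SError) is the architectural one.

```c
#include <assert.h>
#include <stdint.h>

#define A64_VECTOR_ENTRY_SIZE 0x80u
#define A64_VECTOR_TABLE_SIZE (16u * A64_VECTOR_ENTRY_SIZE)

/* Where the exception was taken from. */
enum a64_vector_group {
    A64_CURRENT_EL_SP0 = 0,
    A64_CURRENT_EL_SPX,
    A64_LOWER_EL_AARCH64,
    A64_LOWER_EL_AARCH32
};

/* Kind of exception within a group. */
enum a64_vector_type {
    A64_SYNCHRONOUS = 0, A64_IRQ, A64_FIQ, A64_SERROR
};

/* Address of one vector entry for a table based at 'vbar'. */
static uint64_t a64_vector_entry(uint64_t vbar,
                                 enum a64_vector_group group,
                                 enum a64_vector_type type)
{
    assert((vbar & (A64_VECTOR_TABLE_SIZE - 1u)) == 0); /* 2KB aligned */
    return vbar + (4u * (uint64_t)group + (uint64_t)type)
                  * A64_VECTOR_ENTRY_SIZE;
}
```

For a table at 0x80000, an IRQ taken from the current exception level with SP_ELx selected vectors to 0x80280.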
The following figure shows you how the vector table is structured.
// Initialize VBAR_EL3.
MSR VBAR_EL3, X1
MSR VBAR_EL2, X1
MSR VBAR_EL1, X1
.balign 0x800
Vector_table_el3:
.balign 0x80
.balign 0x80
.balign 0x80
.balign 0x80
// current SP.
.balign 0x80
.balign 0x80
// current SP.
.balign 0x80
.balign 0x80
.balign 0x80
.balign 0x80
.balign 0x80
.balign 0x80
.balign 0x80
.balign 0x80
MSR SCR_EL3, X0
To route an asynchronous exception to EL2 rather than EL3, you must set
HCR_EL2.{AMO,FMO,IMO} and clear SCR_EL3.{EA,IRQ,FIQ}.
Example 5-4 shows you how to route SError, IRQ and FIQ to EL2.
Example 5-4 SError, IRQ and FIQ routing enablement in EL2
MSR HCR_EL2, X0
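The two register updates described above can be sketched in C. The names are hypothetical, and the sketch assumes the architectural bit positions HCR_EL2.FMO (bit 3), HCR_EL2.IMO (bit 4), HCR_EL2.AMO (bit 5), SCR_EL3.IRQ (bit 1), SCR_EL3.FIQ (bit 2), and SCR_EL3.EA (bit 3).

```c
#include <assert.h>
#include <stdint.h>

#define HCR_EL2_FMO (1ull << 3)   /* Route physical FIQ to EL2    */
#define HCR_EL2_IMO (1ull << 4)   /* Route physical IRQ to EL2    */
#define HCR_EL2_AMO (1ull << 5)   /* Route physical SError to EL2 */

#define SCR_EL3_IRQ (1ull << 1)   /* Route IRQ to EL3             */
#define SCR_EL3_FIQ (1ull << 2)   /* Route FIQ to EL3             */
#define SCR_EL3_EA  (1ull << 3)   /* Route SError to EL3          */

/* HCR_EL2 value that routes SError, IRQ, and FIQ to EL2. */
static uint64_t hcr_el2_route_async_to_el2(uint64_t hcr)
{
    return hcr | HCR_EL2_AMO | HCR_EL2_IMO | HCR_EL2_FMO;
}

/* SCR_EL3 value that stops EL3 from claiming those exceptions. */
static uint64_t scr_el3_release_async(uint64_t scr)
{
    return scr & ~(SCR_EL3_EA | SCR_EL3_IRQ | SCR_EL3_FIQ);
}
```

Both conditions are needed: if any of the SCR_EL3 routing bits stays set, the corresponding exception is still taken to EL3.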
For more details about enabling asynchronous exceptions, see the section,
Asynchronous exception types, routing, masking and priorities, in the ARM® Architecture
Reference Manual ARMv8, for ARMv8-A architecture profile.
Example 5-6 shows you how to initialize general-purpose registers after reset.
Example 5-6 Register bank initialization
MOV SP, X1
MSR SCTLR_EL2, X1
MSR SCTLR_EL1, X1
This example does not cover all system registers that need initialization. In principle,
you must initialize all system registers that do not have architecturally defined reset
values. However, some registers can have IMPLEMENTATION-DEFINED reset values,
depending on the implementation of a particular processor. For details, see the section,
General system control registers, in the ARM® Architecture Reference Manual ARMv8, for
ARMv8-A architecture profile and the TRM of the relevant processor.
// Disable L1 Caches
// Calculate the cache size first and loop through each set +
// way.
way_loop:
set_loop:
For details, see the section, The AArch64 Virtual Memory System Architecture, in the
ARM® Architecture Reference Manual ARMv8, for ARMv8-A architecture profile.
Example 5-11 and Example 5-12 build an EL3 translation table with a 4KB granule size
covering 4GB memory space:
• 0-1GB memory is configured as Normal cacheable memory.
• 1-4GB memory is configured as Device-nGnRnE memory.
The translation table contains 512 level 2 blocks of 2MB each and three level 1 blocks of
1GB each.
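The shape of this table can be sketched on a host in C. This is an illustration under stated assumptions, not the code of Example 5-11 or Example 5-12. The names are hypothetical, the physical address of the level 2 table is passed in as a parameter, and AttrIdx 1 is assumed to select the Normal cacheable attributes in MAIR_EL3. Shareability, access permission, and execute-never bits are left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2MB     (1ull << 21)
#define SZ_1GB     (1ull << 30)
#define DESC_BLOCK 0x1ull                  /* bits [1:0] = 0b01 */
#define DESC_TABLE 0x3ull                  /* bits [1:0] = 0b11 */
#define DESC_AF    (1ull << 10)            /* Access flag       */
#define ATTRIDX(n) ((uint64_t)(n) << 2)    /* AttrIndx[2:0]     */

#define ATTR_DEVICE 0u   /* AttrIdx=000 Device-nGnRnE              */
#define ATTR_NORMAL 1u   /* assumed index for Normal cacheable     */

static uint64_t level1_table[4];     /* 4 x 1GB = 4GB              */
static uint64_t level2_table[512];   /* 512 x 2MB = first 1GB      */

static void build_tables(uint64_t level2_phys_addr)
{
    unsigned i;

    /* Entry 0 of level 1 points to the level 2 table. */
    level1_table[0] = level2_phys_addr | DESC_TABLE;

    /* 0-1GB: 512 Normal memory blocks of 2MB each. */
    for (i = 0; i < 512; i++)
        level2_table[i] = (i * SZ_2MB) | DESC_AF
                        | ATTRIDX(ATTR_NORMAL) | DESC_BLOCK;

    /* 1-4GB: three Device memory blocks of 1GB each. */
    for (i = 1; i < 4; i++)
        level1_table[i] = (i * SZ_1GB) | DESC_AF
                        | ATTRIDX(ATTR_DEVICE) | DESC_BLOCK;
}
```

Calling build_tables() with the 4KB-aligned address of the level 2 table fills both arrays with flat (virtual equals physical) mappings for the 4GB space.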
Example 5-11 first initializes translation table control registers, and then uses looped store
instructions to build a translation table, which is easier to port.
Example 5-11 Build translation tables using looped store instructions
// Inner-shareable.
MSR TTBR0_EL3, X0
// instructions.
// AttrIdx=000 Device-nGnRnE.
loop:
BNE loop
.word \low
.word \high
.endm
.endm
.endm
.endm
ttb0_base:
TABLE_ENTRY level2_pagetable, 0
level2_pagetable:
.rept 0x200
MSR S3_1_C15_C2_1, X0 // Write CPUECTLR_EL1 (encoding used by Cortex-A53 and Cortex-A57).
ORR X0, X0, #(0x1 << 12) // The I bit (instruction cache).
MSR SCTLR_EL3, X0
DSB SY
ISB
MSR CPACR_EL1, X1
ISB
MSR SCR_EL3, X0
ERET
el2_entry:
MSR HCR_EL2, X0
ERET
el1_entry:
MSR SPSR_EL1, X0
ERET
el0_entry:
MSR HCR_EL2, X0
ERET
el1_entry: