Booting An Intel System Architecture: November 2015
Booting An Intel System Architecture: November 2015
net/publication/295010710
CITATIONS READS
0 6,117
1 author:
Nikola Zlatanov
Applied Materials
44 PUBLICATIONS 8 CITATIONS
SEE PROFILE
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Nikola Zlatanov on 18 February 2016.
When external power is first applied to a platform, the hardware platform must carry out a number of
tasks before the processor can be brought out of reset. The first task is for the power supply to be
allowed to settle down to its nominal state; once the primary power supply settles, there are usually a
number of derived voltage levels needed on the platform. For example, on the Intel architecture
reference platform the input supply consists of a 12-volt source, but the platform and processor require a
number of different voltage rails such as 1.5 V, 3.3 V, 5 V, and 12 V. The platform and processor also
require that the voltages are provided in a particular sequence. This process is known as power
sequencing. The power is sequenced by controlling analog switches (typically field effect transistors).
The sequence is driven by often driven by a complex program logic device (CPLD). The platform clocks
are also derived from a small number of input clock and oscillator sources. The devices use phase
locked loop circuitry to generate the derived clocks used for the platform. These clocks also take time to
converge. When all these steps have occurred, the power sequencing CPLD de-asserts the reset line to
the processor. Figure 1 shows an overview of the platform blocks described. Depending on integration of
silicon features, some of this logic may be on chip and controlled by microcontroller firmware, which
starts prior to the main processor.
Mode Selection
IA-32 supports three operating modes and one quasi-operating mode:
• Protected mode is considered the native operating mode of the processor. It provides a rich set of
architectural features, flexibility, high performance and backward compatibility to the existing software
base.
• Real-address mode or “real mode” is operating mode provides the programming environment of the
Intel 8086 processor, with a few extensions (such as the ability to switch to protected or system
management mode). Whenever a reset or a power on happens, the system transitions back to Real-
address mode.
• System management mode (SMM) is a standard architectural feature in all IA-32 processors,
beginning with the Intel386 SL processor. This mode provides an operating system or executive with a
transparent mechanism for implementing power management and OEM differentiation features. SMM is
entered through activation of an external system interrupt pin (SMI#), which generates a system
management interrupt (SMI). In SMM, the processor switches to a separate address space while
saving the context of the currently running program or task. SMM-specific code may then be executed
transparently. Upon returning from SMM, the processor is placed back into its state prior to the SMI.
• Normally the system firmware creates an SMI handler, which may periodically take over the system
from the host OS. There are normally legitimate workarounds that are executed in the SMI handler and
handling and logging-off errors that may happen at the system level. As this presents a potential
security issue, there is also a lock bit that resists tampering with this mechanism.
• Real-time OS vendors often recommend disabling this feature because it has a potential of subverting
the nature of the OS environment. If this happens, then the additional work of the SMI handler would
either need to be incorporated into the RTOS for that platform or else the potential exists of missing
something important in the way of error response or workarounds. If the SMI handler can work with the
RTOS development, there are some advantages to the feature.
• Virtual-8086 mode is a quasi-operating mode supported by the processor in protected mode.
This mode allows the processor execute 8086 software in a protected, multitasking environment.
Intel® 64 architecture supports all operating modes of IA-32 architecture and IA-32e modes:
• IA-32e mode—in IA-32e mode, the processor supports two sub-modes: compatibility mode and 64-bit
mode.
• 64-bit mode provides 64-bit linear addressing and support for physical address space larger than 64
GB.
• Compatibility mode allows most legacy protected-mode applications to run unchanged.
Refer to the Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3A section titled
“Mode Switching” for more details.
When the processor is first powered on, it will be in a special mode similar to real mode, but with the
top 12 address lines being asserted high. This aliasing allows the boot code to be accessed directly from
NVRAM (physical address 0xFFFxxxxx).
Upon execution of the first long jump, these 12 address lines will be driven according to instructions
by firmware. If one of the protected modes is not entered before the first long jump, the processor will
enter real mode, with only 1 MB of addressability. In order for real mode to work without memory, the
chipset needs to be able to alias a range of memory below 1 MB to an equivalent range just below 4 GB
to continue to access NVRAM. Certain chipsets do not have this aliasing and may require a switch to into
a normal operating mode before performing the first long jump. The processor also invalidates the
internal caches and translation lookaside buffers (TLBs).
The processor continues to boot in real mode today, for no particular technical requirement. While
some speculate it is to ensure that the platform can boot legacy code (example DOS OS) written many
years ago, it is more an issue of introducing and needing to validate the change into a broad ecosystem
of players and developers, and backward compatibility issues it would create in test and manufacturing
environments and other natural upgrade hurdles that will continue to keep the boot mode and Intel reset
vector “real.”
The first power-on mode is actually a special subset of the real mode. The top 12 address lines are
held high, thus allowing aliasing, where the processor can execute code from the nonvolatile storage
(such as flash) located within lowest one megabyte as if from the top of memory. Normal operation of the
firmware (such as BIOS) is to switch modes to flat protected mode as early in the boot sequence as is
possible. Once the processor is running in protected mode, it is usually not necessary to switch back to
real mode unless executing a legacy option ROM, which makes certain legacy software interrupt calls.
The flat mode runs 32-bit code and the physical address are mapped on to one with the logical
addresses (paging off). The Interrupt Descriptor Table is used for interrupt handling. This is the
recommended mode for all BIOS/boot loaders to operate in. The segmented protected mode is not used
for the initialization code as part of the BIOS sequence.
Intel produces BIOS specifications or BIOS writer’s guides that go into some details about chip-
specific and technology-specific initialization sequences. These documents hold fine-grain details of
every bit that needs to be set, but not necessarily the high level sequence in which to set them. The
following flows for early initialization and advanced initialization outline that from a hardware architecture
point of view.
Early Initialization
The early phase of the BIOS/bootloader will do the minimum to get the memory and processor cores
initialized.
In a system BIOS constructed using the Unified Extensible Firmware Interface (UEFI) 2.0 framework,
the Security (SEC) and the pre-EFI initialization (PEI) phases are normally synonymous with “early
initialization.” It doesn’t matter if legacy or UEFI BIOS is used; from a hardware point of view, the early
initialization sequence is the same for a given system.
Single-Threaded Operation
In a multi-core system, the bootstrap processor is the CPU core/thread that is chosen to boot the
normally single-threaded system firmware. At RESET, all of the processors race for a semaphore flag bit
in the chipset. The first finds it clear and in the process of reading it sets the flag; the other processors
find the flag set and enter a WAIT for SIPI or halt state. The first processor initializes main memory and
the application processors (APs) and continues with the rest of boot. A multiprocessor (MP) system does
not truly enter MP operation
6 until the OS takes over. While it is possible to do a limited amount of parallel processing during the
UEFI boot phase, such as during memory initialization with multiple socket designs, any true
multithreading activity would require changes to be made to the DXE phase of the UEFI solutions to
allow for this. In order to have broad adoption, some obvious benefits would need to arise.
CPU Initialization
This consists of simple configuring of processor and machine registers, loading a microcode update, and
enabling the Local APIC.
Microcode Update. Microcode is a hardware layer of instructions involved in the implementation of the
machine-defined architecture. It is most prevalent in CISC-based processors. Microcode is developed by
the CPU7 vendor and incorporated into an internal CPU ROM during manufacture. Since the infamous
“Pentium flaw,” Intel processor architecture allows that microcode to be updated in the field either
through a BIOS update or via an OS update of “configuration data.”
Today an Intel processor must have the latest microcode update to be considered a warranted CPU.
Intel provides microcode updates that are written to the writable microcode store in the CPU. The
updates are encrypted and signed by Intel such that only the processor that the microcode update was
designed for can authenticate and load that update. On socketed systems, the BIOS may have to carry
many flavors of microcode update depending on the number of processor steppings supported. It is
important to load the microcode updates early in the boot sequence to limit the exposure of the system to
any known errata in the silicon. It is important to note that the microcode update may need to be
reapplied to the CPU after certain reset events in the boot sequence.
The BSP microcode update must be loaded before No-Eviction mode is entered.
Local APIC. The Local APIC must be enabled to handle any interrupts that occur early in post, before
enabling protected mode. For more information on the Intel APIC Architecture and Local APICs, please
see the Intel 64 and IA-32 Intel Architecture Software Developer’s Manual, Volume 3A: System
Programming Guide, Chapter 8.
Switch to Protected Mode. Before the processor can be switched to protected mode, the software
initialization code must load a minimum number of protected mode data structures and code modules
into memory to support reliable operation of the processor in protected mode. These data structures
include the following:
An IDT
A GDT
A TSS
(Optional) An LDT
If paging is to be used, at least one page directory and one page table
A code segment that contains the code to be executed when the processor switches to protected mode
One or more code modules that contain the necessary interrupt and exception handlers
Initialization code must also initialize the following system registers before the processor can be
switched to protected mode:
The GDTR
Optionally the IDTR. This register can also be initialized immediately after switching to protected mode,
prior to enabling interrupts.
Control registers CR1 through CR4
The memory type range registers (MTRRs)
With these data structures, code modules, and system registers initialized, the processor can be
switched to protected mode by 8 loading control register CR0 with a value that sets the PE flag (bit 0).
From this point onward, it is likely that the system will not enter 16b Real mode again, Legacy Option
ROMs and Legacy OS/BIOS interface notwithstanding, until the next hardware reset is experienced.
More details about protected mode and real mode switching can be found in the Intel 64 and IA-32
Intel Architecture Software Developer’s Manual, Volume 3A: System Programming Guide, Chapter 9.
Cache as RAM and No Eviction Mode. Since no DRAM is available, the code initially operates in a
stackless environment. Most modern processors have an internal cache that can be configured as RAM
(Cache as RAM or CAR) to provide a software stack. Developers must write extremely tight code when
using CAR because an eviction would be unacceptable to the system at this point in the boot sequence;
there is no memory to maintain coherency with at this time. There is a special mode for processors to
operate in cache as RAM called “no evict mode” (NEM) where a cache line miss in the processor will not
cause an eviction. Developing code with an available software stack is much easier, and initialization
code often performs the minimal setup to use a stack even prior to DRAM initialization.
Processor Speed Correction. The processor may boot into a slower mode than it is capable of
performing for various reasons. It may be considered less risky to run in a slower mode, or it may be
done to save additional power. It will be necessary for the BIOS to initialize the speed step technology of
the processor and may need to force the speed to something appropriate for a faster boot. This
additional optimization is optional; the OS will likely have the drivers to deal with this parameter when it
loads.
Memory Configuration
The initialization of the memory controller varies slightly depending on the DRAM technology and the
capabilities of the memory controller itself. The information on the DRAM controller is proprietary for
SOC devices, and in such cases the initialization memory reference code (MRC) is typically supplied by
the SOC vendor. Developers have to contact Intel to request access to the low level information
required. Without being given the MRC, developers can understand that for a given DRAM technology,
they must follow a JEDEC initialization sequence. It is likely a given that the code will be single point of
entry and single point of exit code that has multiple boot paths contained within it. It will be 32-bit
protected mode code. The settings for various bit fields like buffer strengths and loading for a given
number of banks of memory is something that is chipset-specific. The dynamic nature of the memory
tests that are run (to ensure proper timings have been applied for a given memory configuration) is an
additional complexity that would prove difficult to replicate without a proper code from the memory
controller vendor. Workarounds for errata and interactions with other subsystems such as the
Manageability Engine are not something that can be re-invented, 9 but can be reverse-engineered. The
latter is of course heavily frowned upon.
There is a very wide range of DRAM configuration parameters, such as number of ranks, 8-bit or 16-
bit addresses, overall memory size, and constellation, soldered down or add-in module (DIMM)
configurations, page closing policy, and power management. Given that most embedded systems
populate soldered down DRAM on the board, the firmware may not need to discover the configuration at
boot time. These configurations are known as memory-down. The firmware is specifically built for the
target configuration. This is the case for the Intel reference platform from the Embedded Computing
Group. At current DRAM speeds, the wires between the memory controllers behave like transmission
lines; the SOC may provide automatic calibration and runtime control of resistive compensation
(RCOMP) and delay locked look (DLL) capabilities. These capabilities allow the memory controller to
change elements such as the drive strength to ensure error-free operation over time and temperature
variations.
If the platform supports add-in-modules for memory, there are a number of standardized form factors
for such memory. The small outline dual in-line memory module (SODIMM) is one such form factor often
found in embedded systems. The DIMMs provide a serial EPROM. The serial EPROM devices contain
the DRAM configuration data. The data is known as serial presence detect data (SPD data). The
firmware reads the SDP data to identify the device configuration and subsequently configures the device.
The serial EPROM is connected via SMBUS, thus the device must be available in this early initialization
phase, so the software can establish the memory devices on board. It is possible for memory-down
motherboards to also incorporate serial presence detect EEPROMs to allow for multiple and updatable
memory configurations to be handled efficiently by a single BIOS algorithm. It is also possible to provide
a hard-coded table in one of the MRC files to allow for an EEPROM-less design. In order to derive that
table for SPD, please see Appendix A.
Post-Memory
Once the memory controller has been initialized, a number of subsequent cleanup events take place.
Memory Testing
The memory testing is now part of the MRC, but it is possible to add more tests should the design merit
it. BIOS vendors typically provide some kind of memory test on a cold boot. Writing custom firmware
requires the authors to choose a balance between thoroughness and speed, as highly embedded/mobile
devices require extremely fast boot times and memory testing can take up considerable time.
If testing is warranted for a design, testing the memory directly following initialization is the
time to do it. The system is idle, the subsystems are not actively accessing memory, and the OS
has not taken over the host side of the platform. Memory errors manifest themselves in random
ways, sometimes inconsistently. Several hardware features 10 can assist in this testing both during
boot and during runtime. These features have traditionally been thought of as high-end or server
features, but over time they have moved into the client and embedded markets.
One of the most common is ECC. Some embedded devices use error correction codes (ECC)
memory, which may need extra initialization. After power up, the state of the correction codes may not
reflect the contents and all memory must be written to; writing to memory ensures that the ECC bits are
valid and sets the ECC bits to the appropriate contents. For security purposes, the memory may need to
be zeroed out manually by the BIOS or in some cases a memory controller may incorporate the feature
into hardware to save time.
Depending on the source of the reset and security requirements, the system may not execute a
memory wipe or ECC initialization. On a warm reset sequence, memory context can be maintained.
If there were any memory timing changes or other configuration changes that require a reset to take
effect, this is normally the time to execute a warm reset. That warm reset would start the early
initialization phase over again; registers would need to be restored that are affected.
Shadowing
From the reset vector, execution starts off executing directly from the nonvolatile flash storage (NVRAM).
This operating mode is known as execute in place (XIP). The read performance of nonvolatile storage is
much slower than the read performance of DRAM. The performance of the code running from flash is
much lower than if it executed from RAM, so most early firmware will copy from the slower nonvolatile
storage into RAM. The firmware starts to run the RAM copy of the firmware. This process is sometimes
known as shadowing. Shadowing involves having the same contents in RAM and flash; with a change in
the address decoders the RAM copy is logically in front of the flash copy and the program starts to
execute from RAM. On other embedded systems, the chip selects ranges that are managed to allow the
change from flash to RAM execution. Most computing systems run as little as possible in place.
However, some embedded platforms constrained (in terms of RAM) execute all the application in place.
This is generally an option on very small embedded devices. The Intel architecture platforms generally
do no execute in place for anything but the very initial boot steps before memory has been configured.
The firmware is often compressed instead of a simple copy. This allows reduction of the NVRAM
requirements for the firmware. Clearly the processor cannot execute a compressed image in place. On
Intel architecture platforms the copy of the firmware is usually located below 1 MB.
There is a tradeoff between the size of data to be shadowed and the act of decompression. The
decompression algorithm may take much longer to load and execute than it would be for the image to
remain uncompressed. Prefetchers in the processor, if enabled, may also speed up execution in place
and some SOCs have internal NVRAM cache buffers to assist in pipelining the data from the flash to the
processor.
Figure 3 shows the memory map at initialization in real mode. Real mode has an accessibility limit of
1MB.
Exit from No-Eviction Mode and Transfer to DRAM
Before memory was initialized, the data and code stacks were held in the processor cache. With memory
now initialized that special and temporary caching mode must be exited and the cache flushed. The
stack will be transferred to a new location in system main memory and cache reconfigured as part of AP
initialization.
The stack must be set up before jumping into the shadowed portion of the BIOS that now is in memory.
A memory location must be chosen for stack space. The stack will count down so the top of the stack
must be entered and enough memory must be allocated for the maximum stack.
If the system is in real mode, then SS:SP must be set with the appropriate values. If protected mode is
used, which is likely the case following MRC execution, then SS:ESP must be set to the correct memory
location.
Transfer to DRAM
This is where the code makes the jump into memory. As mentioned before, if a memory test has not
been performed up until this point, the jump could very well be to garbage. System failures indicated by a
POST code between “end of memory initialization” and the first following POST code almost always
indicates a catastrophic memory initialization problem. If this is a new design, then chances are this is in
the hardware and requires step-by-step debug.
Reset Vector
EIP (last 16 bytes of
0xFFFFFFF0 memory) 4 GB – 16 bytes
Unaddressable
memory in real
mode
0xFFFFF 1MB
Reset Aliased Reset vector read
from 0xffff:ffff0
BIOS/Firmware aliased from
0xFFFF0
0xF0000 960KB
Extended System
BIOS
896KB
Expansion area
(maps ROMs for old
peripheral cards)
768 KB
Legacy video card
memory
640 KB
Accessible RAM
memory
0x0000,0000
PAM0 0F0000–0FFFFFh
PAM1 0C0000–0C7FFFh
PAM2 0C8000–0CFFFFh
PAM3 0D0000–0D7FFFh
PAM4 0D8000–0DFFFFh
PAM5 0E0000–0E7FFFh
PAM6 0E8000–0EFFFFh
Consult the chipset datasheet for details on the memory redirection feature controls applicable to the
target platform.
Threads and cores on the same package are detectable by executing the CPUID instruction.
See the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A for details on the
information available with the CPUID instruction on various processor families.
Detection of additional packages must be done “blindly.” If a design must accommodate more than one
physical package, the BSP needs to wait a certain amount of time for all potential APs in the system to
“log in.” Once a timeout occurs or the maximum expected number of processors “log in,” it can be
assumed that there are no more processors in the system.
AP Wakeup State
Upon receipt of the SIPI, the AP starts executing the code pointed to by the SIPI message. As opposed
to the BSP, when the AP starts code execution it is in real mode. This requires that the location of the
code that the AP starts executing is located below 1 MB.
Caching Considerations
Because of the different types of processor combinations and different attributes of shared processing
registers between threads, care must be taken to ensure that the caching layout of all processors in the
entire system remain consistent such that there are no caching conflicts.
The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A section “MTRR
Considerations in MP Systems” outlines a safe mechanism for changing the cache configuration in all
systems that contain more than one processor. It is recommended that this be used for any system with
more than one processor present.
AP Idle State
Behavior of APs during firmware initialization is dependent on the firmware implementation, but is most
commonly restricted to short durations of initialization followed by entering a halt state with a HLT
instruction, awaiting direction from the BSP for another operation.
Once the firmware is ready to attempt to boot an OS, all AP processors must be placed back in their
power-on state (WAIT for SIPI), which can be accomplished by the BSP sending an INIT ASSERT IPI
followed by an INIT DEASSERT IPI to all APs in the system (all except self).
Advanced Initialization
The advanced device initialization follows the early initialization and basically ensures that the DRAM is
initialized. This second stage is focused on device-specific initialization. In a UEFI-based BIOS solution,
advanced initialization tasks are also known as Dynamic Execute Environment (DXE) and Boot Device
Selection (BDS) phases. The following devices must be initialized in order to enable an embedded
system. Not all are applicable to all embedded systems, but the list is prescriptive for most and is
particular to an SOC based on Intel architecture.
General purpose I/O (GPIO)
Interrupt controller
Timers
Cache initialization (could also be done during early initialization)
Serial ports, Console In/Out
Clocking and overclocking
PCI bus initialization
Graphics (optional)
USB
SATA
Interrupt Controllers
Intel architecture has several different methods of interrupt handling. The following or a combination of
the following can be used to handle interrupts:
Programmable Interrupt Controller (PIC) or 8259
Local Advanced Programmable Interrupt Controller (APIC)
Input/Output Advanced Programmable Interrupt Controller (IOxAPIC)
Messaged Signaled Interrupt (MSI)
Programmable Interrupt Controller (PIC)
When the PIC is the only interrupt device enabled, it is referred to as PIC Mode. This is the simplest
mode where the PIC handles all the interrupts. All APIC components are bypassed and the system
operates in single-thread mode using LINT0.
The BIOS must set the IRQs per board configuration for all onboard, integrated, and add-in PCI devices.
The PIC contains two cascaded 8259s with fifteen available IRQs. IRQ2 is not available since it is used
to connect the 8259s. On mainstream components, there are 8 PIRQ pins supported in PCH, named
PIRQ[A# :H#], that route PCI interrupts to IRQs of the 8259 PIC. PIRQ[A#:D#] routing is controlled by
PIRQ Routing Registers 60h–63h (D31:F0:Reg 60- 63h). The PIRQ[E# : H#] routing is controlled by
PIRQ Routing Registers 68h–6Bh (D31:F0:Reg 68 – 6Bh). See Figure 4.
The PCH also connects the 8 PIRQ[A# : H#] to 8 individual IOxAPIC input pins, as shown in Table 2.
PIRQ# Pin Interrupt Router Register for PIC Connected to IOxAPIC Pin
Exceptions
Exceptions are routines that run to handle error conditions. Examples include page fault and general
protection fault. At a minimum, placeholders (dummy functions) should be used for each exception
handler. Otherwise the system could exhibit unwanted behavior if an exception is encountered that isn’t
handled.
Timers
There are a variety of timers that can be employed on today’s Intel Architecture system.
While MTRRs are programmed by the BIOS, Page Attribute Tables (PATs) are used primarily with the
OS to control caching down to the page level.
Serial Ports
An RS-232 serial port or UART 16550 is initialized for either runtime or debug solutions. Unlike USB
ports, which require considerable initialization and a large software stack, serial ports have a minimal
register-level interface. A serial port can be enabled very early in POST to provide serial output support.
PCI device discovery applies to all the newer (non-legacy) interfaces such as PCI Express (PCIe) root
ports, USB controllers, SATA controllers, audio controllers, LAN controllers, and various add-in devices.
These newer interfaces all comply with the PCI specification.
Refer to the PCI Specification for more details. A list of all the applicable specifications is in the
References section.
It is interesting to note that in the UEFI system, the DXE phase does not execute the majority of
drivers, but it is the BDS Phase which executes most of the required drives in UEFI to allow the system
to boot.
Graphics Initialization
If the platform has a head, then the video BIOS or Graphics Output Protocol UEFI driver is normally the
first option ROM to be executed in the string. Once the main console out is up and running, the console
in can be configured.
Input Devices
Refer to the board schematics to determine which I/O devices are in the system. Typically a system will
contain one or more of the following devices.
USB Initialization
The USB controller supports both EHCI and now XHCI. To enable the host controller for standard PCI
resources is relatively easy. It is possible to not enable USB until the OS drivers take over and have a
very well-functioning system. If pre-OS support for EHCI or XHCI is required, then the tasks associated
with the USB subsystem become substantially more complex. Legacy USB requires an SMI handler be
used to trap port 60/64 accesses to I/O space and convert these to the proper keyboard or mouse
commands. This pre-OS USB support is required if booting to USB is preferred.
SATA Initialization
A SATA controller supports the ATA/IDE programming interface as well as the Advanced Host Controller
Interface (AHCI) (not available on all SKUs). In the following discussion, the term “ATA-IDE Mode” refers
to the ATA/IDE programming interface that uses standard task file I/O registers or PCI IDE Bus Master
I/O block registers. The term “AHCI Mode” refers to the AHCI programming interface that uses memory-
mapped register/buffer space and a command-list-based model.
A separate document, RS – Intel® I/O Controller Hub 6 (ICH6) Serial ATA Advanced Host Controller
Interface (SATA-AHCI) Hardware Programming Specification (HPS) contains details on SATA software
configuration and considerations.
IDE Mode
IDE mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6] to 00. In this mode
the SATA controller is set up to use the ATA/IDE programming interface. In this mode the 6/4
SATA ports are controlled by two SATA functions. One function routes up to four SATA ports, D31:F2,
and the other routes up to two SATA ports, D31:F5 (Desktop SKUs only). In IDE mode, the Sub Class
Code, D31:F2:Reg 0Ah and D31:F5:Reg 0Ah will be set to 01h. This mode may also be referred to as
compatibility mode as it does not have any special OS driver requirements.
AHCI Mode
AHCI mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6], to 01h. In this mode, the
SATA controller is set up to use the AHCI programming interface. The six SATA ports are controlled by a
single SATA function, D31:F2. In AHCI mode the Sub Class Code, D31:F2:Reg 0Ah, will be set to 06h.
This mode does require specific OS driver support.
RAID Mode
RAID mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6] to 10b. In this mode, the
SATA controller is set up to use the AHCI programming interface. The 6/4 SATA ports are controlled by
a single SATA function, D31:F2. In RAID mode, the Sub Class Code, D31:F2:Reg 0Ah, will be set to
04h. This mode does require specific OS driver support.
In order for the RAID option ROM to access all 6/4 SATA ports, the RAID option ROM enables and
uses the AHCI programming interface by setting the AE bit, ABAR + 04h[31]. One consequence is that
all register settings applicable to AHCI mode set by the BIOS have to be set in RAID as well. The other
consequence is that the BIOS are required to provide AHCI support to ATAPI SATA devices, which the
RAID option ROM does not handle.
PCH supports stable image-compatible ID. When the alternative ID enable, D31 :F2 :Reg 9Ch is not
set, the PCH SATA controller will report Device ID as 2822h for a desktop SKU.
Enable Ports
It has been observed that some SATA drives will not start spin-up until the SATA port is enabled by the
controller. In order to reduce drive detection time, and hence the total boot time, system BIOS should
enable the SATA port early during POST (for example, immediately after memory initialization) by setting
the Port x Enable (PxE) bits of the Port Control and Status register, D31:F2:Reg 92h and D31:F5:Reg
92h (refer Error! Reference source not found. step Error! Reference source not found.
requirement), to initiate spin-up of such drive(s).
Memory Map
In addition to defining the caching behavior of different regions of memory for consumption by the OS, it
is also firmware’s responsibility to provide a “map” of the system memory to the OS so that it knows what
regions are actually available for its consumption. The most widely used mechanism for a boot loader or
an OS to determine the system memory map is to use real mode interrupt service 15h, function E8h,
sub-function 20h (INT15/E820), which firmware must implement.
Region Types
There are several general types of memory regions that are described by this interface:
Memory (1) – General DRAM available for OS consumption.
Reserved (2) – DRAM address not for OS consumption.
ACPI Reclaim (3) – Memory that contains all ACPI tables to which firmware does not require runtime
access. See the applicable ACPI specification for details.
ACPI NVS (4) – Memory that contains all ACPI tables to which firmware requires runtime access. See
the applicable ACPI specification for details.
ROM (5) – Memory that decodes to nonvolatile storage (for example, flash).
IOAPIC (6) – Memory that is decoded by IOAPICs in the system (must also be uncached).
LAPIC (7) – Memory that is decoded by local APICs in the system (must also be uncached).
Region Locations
The following regions are typically reserved in a system memory map:
00000000-0009FFFF – Memory
000A0000-000FFFFF – Reserved
00100000-???????? – Memory (The ???????? indicates that the top of memory changes based on
“reserved” items listed below and any other design-based reserved regions.)
TSEG – Reserved
Graphics Stolen Memory – Reserved
FEC00000-FEC01000* – IOAPIC
FEE00000-FEE01000* – LAPIC
* Mr. Nikola Zlatanov spent over 20 years working in the Capital Semiconductor Equipment Industry. His work at Gasonics, Novellus,
Lam and KLA-Tencor involved progressing electrical engineering and management roles in disruptive technologies. Nikola received his
Undergraduate degree in Electrical Engineering and Computer Systems from Technical University, Sofia, Bulgaria and completed a
Graduate Program in Engineering Management at Santa Clara University. He is currently consulting for Fortune 500 companies as well
as Startup ventures in Silicon Valley, California.
View publication stats