Quick Boot
A Guide for Embedded Firmware Developers
2nd edition
Pete Dice
ISBN 978-1-5015-1538-5
e-ISBN (PDF) 978-1-5015-0681-9
e-ISBN (EPUB) 978-1-5015-0672-7
www.degruyter.com
Acknowledgments
The studies, data, results, and guidelines compiled in the book are the result of many
talented engineers at Intel who have a strong passion for BIOS and firmware. The
contributions they have made and the time they have spent, much of it outside their
normal duties, deserve to be acknowledged.
For their significant contributions to this book, including analyses, case studies, and written
content, I’d like to thank these talented engineers:
– Jim Pelner—who crafted the original white paper that echoes the main themes of
this book and for contributing to several chapters early on.
– Jaben Carsey—who wrote the shell chapter in the book above and beyond his
many contributions to the UEFI shells in general.
– Sam Fleming—who created Appendix A and has been one of my mentors in BIOS
from the beginning.
– Mike Rothman, Anton Cheng, Linda Weyhing, Rob Gough, Siddharth Shah, and
Chee Keong Sim—for their exquisite multiyear collaboration around the fast boot
concept and multiple case studies over the years.
– BIOS vendor Insyde Software—for providing feedback and volunteering to write
the foreword.
Thanks to my program manager, Stuart Douglas, for getting me through the writing
phase and then on to the finish line (are we there yet?).
Reviewer comments and suggestions were extremely valuable for both editions
of this work. I deeply appreciate those who took the time to provide indispensable
feedback, including Drew Jensen, Mark Doran, Jeff Griffen, John Mitkowski, and Dong
Wei, as well as, at my publisher, Jeff Pepper, Megan Lester, and Mark Watanabe, and
Angie MacAllister for her work on fixing the art and tables.
I would also like to acknowledge my peers in the BIOS/FW engineering and ar-
chitecture teams within the computer industry for their drive to make this technology
an ever more valuable (and less obtrusive) part of people’s everyday lives. Lastly, I
want to thank my wife, Anita, for her patience and everything she’s done to allow me
time to complete this.
Contents
Chapter 1: System Firmware’s Missing Link 1
Start by Gathering Data 1
Initialization Roles and Responsibilities 3
System Firmware 3
OS Loader 4
Operating System 4
Legacy BIOS Interface, UEFI, and the Conversion 4
Tiano Benefits 5
Previous UEFI Challenges 6
Persistence of Change 7
The Next Generation 7
Commercial BIOS Business 8
Award 8
General Software 8
Phoenix Technologies Limited 8
American Megatrends Inc. 9
Insyde Software 9
ByoSoft 9
Value of BIOS 9
Proprietary Solutions 10
Making a Decision on Boot Firmware 10
Consider Using a BIOS Vendor 11
Consider Open-Source Alternatives 12
Consider Creating Something from Scratch 13
Consider a Native Boot Loader for Intel® Architecture 13
Just Add Silicon Initialization 14
Summary 14
Security 177
Intel® Trusted Execution Technology (Intel TXT) 177
TPM Present Detect and Early Start 177
Operating System Interactions 178
Compatibility Support Module and Legacy Option ROMs 178
OS Loader 178
Legacy OS Interface 179
Reducing Replication of Enumeration Between Firmware and OS 179
Other Factors Affecting Boot Speed 180
No Duplication in Hardware Enumeration within UEFI 180
Minimize Occurrences of Hardware Resets 180
Intel Architecture Coding Efficiency 180
Network Boot Feature 180
Value-Add, But Complex Features 181
Tools and the User Effect 181
Human Developer’s Resistance to Change 181
Summary 182
Index 255
Foreword from the First Edition
How do you explain what BIOS is? I generally explain it as the code that runs when
you first turn on the computer. It creates a level playing field so that the operating
system has a known state to start from. If the other person has some programming
knowledge, he or she generally says something like, “Oh. You’re one of those guys!”
Let’s face it. BIOS isn’t sexy. The hardware engineers will always blame the BIOS en-
gineers if the system fails to POST. It’s generally up to BIOS engineers to prove it isn’t
their code that is the problem.
When I first started as a lowly BIOS Engineer II, the BIOS codebase was pure x86
assembly code—thousands of files across almost as many directories with lots of cryp-
tic comments like, “I don’t know why this is here, but it breaks if I remove or modify
it! Beware!” It took 45 minutes to do a clean compile. Comments would commonly
refer to specifications that no longer existed. To say a BIOS is filled with some secret,
arcane algorithms is like saying driving a Formula 1 car is just like driving on the free-
way, only faster! There are no college courses that teach BIOS programming. There
are no trade schools to go to. A few software and electronic engineers will be able to
make it as BIOS engineers because it takes a bit of both to be successful.
This book is the first one I’m aware of that attempts to shine light onto the esoteric field of BIOS
engineering. A field that makes everything from the big-iron servers to the lowly smartphone
turn on. This book has combined two fundamental concepts. What you need to know to make a
BIOS that works and what you need to know to make a BIOS that works fast! It wasn’t that long
ago that a POST in under ten seconds was considered pretty fast. Today’s standard is now under
two seconds. There are topics outlined in this book that will help get you to that sub-2-second
goal. I am currently working on a quasi-embedded system that is in the sub-1-second range with
full measured boot using these concepts!
This book has something for the recent college graduate as well as the seasoned BIOS engi-
neer. There are nuggets of tribal knowledge scattered throughout. Help yourself become better
acquainted with the BIOS industry and read it.
–Kelly Steele,
Former BIOS Architect, Insyde Software, Inc.,
Now at Intel Corporation
Chapter 1
System Firmware’s Missing Link
Hardware: the parts of a computer that can be kicked.
—Jeff Pesis
Booting an Intel architecture platform should be easy. Anyone who thinks that writ-
ing an all-purpose Intel® architecture Basic Input Output System (BIOS) and/or an
operating system (OS) boot loader from scratch is easy has yet to try it. The complexity
and sheer number of documents and undocumented details about the motherboard
and hardware components, operating system requirements, industry standards and
exceptions, silicon-specific eccentricities beyond the standards, little-known tribal
knowledge, basic configuration, compiler nuances, linker details, and variety of ap-
plicable debug tools are enormous. While it can be difficult, it doesn’t have to be.
This book is designed to give a background in the basic architecture and details
of a typical boot sequence to the beginner firmware developer. Various specifications
provide the basics of both the code bases and the standards. While a summary is pro-
vided in the chapters below, there is no substitute for reading and comprehending
the specifications and developer’s manuals first-hand. This book also provides in-
sights into optimization techniques to the more advanced developers. With the back-
ground information, the required specifications on hand, and diligence, many devel-
opers can create quality boot solutions. Even if they choose not to create, but to
purchase the solution from a vendor, the right information about boot options makes
the decision making easier.
DOI 10.1515/9781501506819-001
available. For application specifications, there could be dozens for a given mar-
ket segment. If it is a new or emerging type of system, there will be no mature
standard and you will be chasing a moving target. Obtaining the list of industry
standards that apply is a daunting task and may require some registering and
joining to gain access to specifications and/or forums to get your questions an-
swered. Some older specifications are not published and not available today on
the Internet.
– There are many long-in-the-tooth legacy devices that may need to be initialized,
and finding documentation on them is challenging. They may exist only in the
dusty drawers of a senior engineer who retired five years ago.
– In some cases, nondisclosure agreements (NDAs) must be signed with the various
silicon, BIOS, or motherboard vendors. The NDAs can take precious time to ob-
tain and will require some level of legal advice.
– UEFI provides a handy API for interfacing to the OS. It has a modular framework
and is a viable starting place supporting many industry standards such as ACPI
and PCI.
– Until now, no single reference manual has documented the required steps
needed to boot an Intel architecture system in one place. Nor has anyone detailed
the order of initialization to get someone started.
Those who have been exposed to system firmware at a coding level and the inner
workings of the black art that is system BIOS understand that it is difficult to explain
everything it does, how it does it, or why it should be done in exactly that way. Not
many people in the world would have all the answers to every step in the sequence.
Most people who work in the code will want to know just enough to make necessary
changes and press on with development or call up their BIOS vendor.
Fortunately, there are many options available when choosing a firmware solu-
tion, which we will examine in the next few pages. The more you know about the
options, the key players, their history, and the variables, the better decision you can
make for your design. The decision will be a combination of arbitrary thought, low-
hanging fruit, economies of scale, technical politics, and, of course, money vs. time.
Intel has created an open-source-based system, known as Intel® Boot Loader De-
velopment Kit (Intel® BLDK), which provides a turnkey solution without a huge learn-
ing curve. Developers can go to www.intel.com and download Intel BLDK for various
embedded platforms.
The Intel® Quark processor also has a UEFI implementation that is entirely open
source and is built using the UEFI framework. It can be found online by searching
for Galileo UEFI firmware.
Initialization Roles and Responsibilities
System Firmware
The system firmware is a layer between the hardware and the OS that maintains data
for the OS about the hardware. As part of the power on self-test (POST), the system
firmware begins to execute out of flash to initialize all the necessary silicon compo-
nents, including the CPU itself and the memory subsystem. Once main memory is in-
itialized, the system firmware is shadowed from ROM into the RAM and the initializa-
tion continues. As part of the advanced initialization stages, the system firmware
creates tables of hardware information in main memory for the operating system to
utilize during its installation, loading, and runtime execution. During POST,
hardware workarounds are often implemented to avoid changing silicon or hardware
during later design phases. There may be an element of the system firmware that re-
mains active during latter stages to allow for responses to various operating system
function calls. The system firmware is customized for the specific hardware needs of
the platform and perhaps for a given application. The last thing the system firmware
does is hand off control to the OS loader.
System firmware can be constructed in a proprietary legacy code base and/or in
a Unified Extensible Firmware Interface (UEFI) framework. Legacy BIOS incorporates
a legacy OS interface and follows legacy software interrupt protocols that have been
evolving organically since the IBM PC (circa 1981). UEFI is a specification detailing an
interface that helps hand off control of the system for the preboot environment—that
is, after the system is powered on, but before the operating system starts—to an oper-
ating system, such as Microsoft Windows or Linux. UEFI provides an interface be-
tween operating systems and platform firmware at boot time, and supports an archi-
tecture-independent mechanism for initializing add-in cards (option ROM). We will
dig into the story of how and why legacy BIOS has been converted to UEFI in a
minute. A key takeaway is that the system initialization code is either going to be
UEFI-based, or legacy-based, or try to support elements of both depending on the operating
system requirements.
OS Loader
The OS loader does exactly what its name implies. It is customized with knowledge
about the specific operating system, how it is ordered, and what blocks of the OS to
pull from the OS storage location. A long list of OS loaders is available in the market
today for Linux. We will examine a few of these in later chapters. Microsoft Windows
and real-time custom operating systems also have their own flavors. It is possible that
the OS loader may be enabled or configured to extend the platform initialization be-
yond the system firmware’s scope to allow for more boot options than was originally
planned.
It is possible, depending on the system architecture and the firmware solutions
that you work on, that the OS loader and the system firmware are part of the same
binary image located in the same component on the system, or it may be separated.
Operating System
The operating system completes, or in some cases repeats, the initialization of the
hardware as it loads and executes the software kernel, services, and device drivers. It
potentially loads the human/machine interface (HMI) and finally begins to run the
applications. Depending on the application, there are various ways that the OS can
be initiated. We will dig into this more in future chapters.
Care should be taken when combining elements of various components, as proprietary
and public licenses may prohibit linking the objects together in various ways.
Tiano Benefits
There are some clear advantages to UEFI: faster boot, modularity, DOS replace-
ment, scalability, security, overcoming the limitations of legacy PC AT interfaces,
and longevity.
Previous UEFI Challenges

There have been some challenges early in the adoption of UEFI. While these
challenges have been discussed and dealt with at an industry level, some still
see the need to bring up these points of debate. As embedded architecture
moves forward into new segments, it is vital that open and honest dialogue can occur
about the firmware solutions that exist and their pros and cons. As these perceptions
may still exist, let’s review the points and discuss how they have been eliminated or
minimized over time.
– Maturity: Like any new code base, UEFI source initially lacked 20 years of in-
dustry validation and experience. It did not consider every known add-in card
workaround or industry standard. It was not ported to many environments. It
was not yet validated over the long term across multiple generations of prod-
ucts (and developers). Over the past decades, this has changed. The solution is
now well validated by many industry teams and groups. Workarounds have
been included, the industry standards have been adhered to, and new environ-
ments have been adapted.
– Some tend to adapt to any new technology slowly. Despite the benefits of a new
standard, a new code base takes time to adopt and adapt to, in part due to NIH
(not invented here) syndrome. In other cases, it has been a matter of having to
support a code-base change while maintaining the legacy one. It takes several
cycles to convert before the technology is broadly embraced.
– One common belief of early adopters has been that handwritten assembly code
would always outperform C code. Intel and other compiler maintainers have
advanced their craft, and these compilers can now surpass handwritten assembly,
putting the ASM-only ideology to rest.
– The original Tiano code base was not constructed like a traditional system BIOS
in the core-to-platform-to-motherboard architecture. Instead, it was constructed
as an OS might be, with a core-to-bus-to-device-driver model, which
proved difficult to adapt to some new segments. The newer version of the code
base, EDK II, has evolved to facilitate development with input from major BIOS
vendors and OEMs.
Persistence of Change
In the end, what started out as an idea exclusive to Itanium® began to see early UEFI
projects in mainstream computing. UEFI appeared first in the mobile-computing segment,
where it was designed into many laptop computers starting around 2002. It then
spread to adjacent desktop segments, and is now implemented in a
broad range of products from high-end servers to embedded devices. And not just
Intel Architecture, but ARM architecture as well.
As with any first-generation product, changes and improvements to the design were
made to meet industry needs. Working together within the UEFI Forum, where
most major players in the computing business collaborate on next-generation
firmware architectures and implementations of the UEFI open-source code base, the industry
has produced EDK II. It has taken many years to work through and prioritize some
of the improvements and changes required to help the industry to evolve and remain
vibrant. Major computing manufacturing companies and BIOS vendors are ready to
ship products on this new code base, which promises more flexibility and streamlined
features, including GCC (GNU Compiler Collection) compatibility.
Also, while being deep in complexity, the documentation of the newer versions
of the code is unsurpassed within the industry. The API today is more robust and us-
able than in previous generations and more easily adapted to new and upcoming op-
erating systems.
This has been the history of the major conversion, and it covers some of the reasons
behind choosing either a standard UEFI implementation or a different firmware technology
to start on for development. Strategically, EDK II easily makes the best long-term so-
lution. Let’s look at the state of the commercial BIOS business that has emerged from
the transition.
Award
Award BIOS was started in Taiwan and with its unique per-unit license quickly took
advantage of local tax loopholes to gain the edge at local motherboard vendors. The
simplicity and affinity of the Award code base have kept the product entrenched at
various motherboard manufacturers years after Phoenix discontinued the
product.
General Software
Formed in 1989 by former Windows NT architect Steve Jones, General Software cre-
ated unique and dynamic solutions for the embedded segment. General Software has
in the past been one of the major players in the embedded space but did not penetrate
much into the mainstream markets.
American Megatrends Inc.

AMI was founded in 1985 by Subramanian Shankar and has created a large variety of
solutions, including BIOS, software diagnostics, RAID technology, and hardware
platforms. Beyond having a broad base of products, AMI is the only consistently pri-
vately owned commercial BIOS company. Its products today include the AMI 8 legacy
core and the AMI Aptio core (a UEFI base first demonstrated in 2004) and AMI-Diags,
all focused on system firmware.
Insyde Software
Insyde is a Taiwanese BIOS vendor formed in 1998, brought forth from the ashes of
SystemSoft. Insyde was the first to launch a UEFI solution in the BIOS industry,
InsydeH2O. They have expanded to include offerings in multiple segments and are
today the only vendor in the Itanium segment. Besides UEFI Framework-based BIOS,
Insyde also offers UEFI applications and keyboard firmware.
ByoSoft
In early 2006, Nanjing ByoSoft Co., Ltd. (ByoSoft), was established in China. In 2008,
ByoSoft became one of the four independent BIOS vendors in the world providing
professional UEFI BIOS products and service and the only one based in mainland
China. While they are the new kid on the block, they have many long-time BIOS engi-
neers working for them and major companies signed up.
Value of BIOS
The bill of material (BOM) impact of a system BIOS from segment to segment can differ
greatly depending on a variety of factors. The business model can be based on a
combination of non-recurring engineering (NRE) and/or royalty per motherboard,
depending on:
– Original innovation
– Porting cost
– Source level access
– Support need
– Expected volume
– Customization requirements
– Vendor/supplier history
This is not unlike many other free-market dynamics. If volume pricing can apply,
royalties per board could be a relatively low percentage of the BOM cost.
When volumes do not apply, then royalties per board can rise to affect the BOM.
Embedded systems customers often must pay a much higher cost per board because
of the diverse nature of the business segments, limited volume, and high-touch model
for adapting the mainstream products for custom applications.
Proprietary Solutions
Beyond the four commercial BIOS companies mentioned above, it is possible that
many name-brand computer manufacturers have teams that can and/or do write their
own proprietary BIOS. Over the years, many have created their own BIOS, starting
with IBM and the IBM PC. In some cases, separate business units within very large
companies have even created their own solutions.
– At IBM, a team of developers created a boot firmware for its new desktop machine
in August, 1981, which became known as a BIOS. Today, IBM has a choice of who
they use for which product, internal or external.
– In 1982, Compaq wrote the first BIOS specification by reverse-engineering the
IBM BIOS: one team wrote the specification and handed it to another team,
which in turn wrote the Compaq portable PC firmware from scratch.
– Today, HP does their own BIOS and utilizes BIOS vendors depending on the prod-
uct lines.
– While Intel currently employs the maintainers of the UEFI solutions at
www.tianocore.org, it does not produce commercially available BIOS, not count-
ing Intel-branded motherboards.
– Other large computer and motherboard manufacturers around the world have the
capability to develop their own solutions, and often opt to employ their favorite
BIOS vendors for core and tool maintenance.
– Apart from front-end system firmware, server manufacturers in particular have
extensive value-add firmware-based solutions for baseboard management con-
trollers (BMCs), which are embedded controllers that control the back-end sub-
system to enhance a server board’s ability to be remotely managed and for in-
creased fault tolerance.
firmware. Depending on the level of experience and number of designs they have to
support, they may decide to implement a commercial BIOS product or they may try to
create their own.
When a BIOS vendor is not an option, they must roll up their sleeves and search
for alternatives. In the education arena, software engineering, computer engineering,
and electrical engineering students all learn a certain level of low-level firmware cod-
ing as part of just a few of their classes, but most curriculums don’t include a class
that takes the student through the full experience of system firmware development.
Except for some graduate level projects, most students do not get to develop their own
solutions from scratch.
There are three basic options: use a BIOS vendor, reuse or borrow from open source,
or build from scratch.
Consider Using a BIOS Vendor

Talking to a BIOS vendor is a great idea when the situation demands product-ready
solutions and the return on investment merits it. To get starter code from a BIOS ven-
dor normally requires various levels of licenses and agreements for evaluation, pro-
duction, and follow-on support. The business and legal negotiations can take time,
so if you want to implement this, you should start early. A commercial BIOS comes
with a varying amount of nonrecurring engineering (NRE) and/or royalties per unit
or subscription costs. If you choose this route, there is a very high likelihood that you
are getting exactly what you need to get started and get to a successful production
cycle. You can outsource the duties and responsibilities entirely if you choose to.
First-generation products often have hiccups and, even if you are not inclined to take
the easy way out on a regular basis, you should consider what BIOS vendors can offer.
Many successful and established computer OEM development teams employ BIOS
vendors to provide a base level of software core competency, basic OS support, tools,
and on-call support while the in-house developers focus on the finer details and
ensure that the job is done on schedule. BIOS vendors may offer starter kits with a lesser
number of features and limited support, which smaller companies or individuals can
take advantage of if they do not have the time to dive deep into the firmware realm.
As everyone’s concept of what constitutes cheap versus expensive varies, product
teams should weigh their options and the return on investment levels and make the
right decision for them for a given project. The next project or another parallel project
in the pipeline may require another solution with entirely different requirement sets.
Scalability may be something that internal teams cannot meet on their own due to a
tight production cycle.
Some say that BIOS vendors (and BIOS) are becoming obsolete, especially considering
silicon vendors’ support for boot loaders, but this is not true. People said the same
thing when Tiano originally came out 10 years ago—“say goodbye to BIOS.” Some
thought it would come true when Linux BIOS started, but they too have been
disappointed. If the Linux multiverse is anything to go by, commercial distribution houses
such as Red Hat, offering commercially prepackaged products and support, continue
to thrive even when parallel royalty-free and subscription-free alternatives exist on the
same kernel with many of the same products. Why? Because Linux isn’t free—people
must roll up their sleeves, do the work upfront, and continue to maintain it. The
same thing can be said about system firmware. Commercial BIOS vendors like Insyde,
Phoenix BIOS, AMI, ByoSoft, and others will continue to provide turn-key product
quality and services to the computing industry in the future, regardless of the codebase
being used. They provide value-added products, plain and simple.
Consider Open-Source Alternatives

For those who choose to take the plunge to create their own solution in whole or in
part, free and open-source alternatives can be downloaded from the Web that offer a
starting point and/or reference. Two of the well-known open-source products availa-
ble to the market are Coreboot and Tiano.
Tiano
The Tiano core uses a BSD license and provides a flexible framework. Developers
normally must get certain initialization code from the silicon vendor individually, or
they must reverse-engineer code that already exists on other platforms. Tiano, by
default, also lacks the needed legacy interfaces to allow many older operating systems
or PCI device option ROMs to be used. Developers can choose to create a Compatibility
Support Module for legacy operating systems based on the IBM BIOS specification.
Tiano does have ample documentation, as well as the UEFI API, which replaces the
legacy interface, and the UEFI drivers, which replace legacy option ROMs, making it
overall more robust.
Coreboot
Formerly LinuxBIOS, Coreboot provides source code under a GPL license. It has
grown and evolved since starting out in the Cluster Research Lab at the Advanced
Computing Laboratory at Los Alamos National Laboratory. It got a lift from Google over
the last five years as a few of the project leads joined the company.
Uboot
Also known as Das U-Boot, the Universal Boot Loader, U-Boot is maintained by DENX
Software Engineering and is distributed under a GPL license. U-Boot is broadly used
in embedded devices.
For more information about these and other alternatives, like Aboot, please ex-
plore the web.
Consider Creating Something from Scratch

Regardless of the challenges, it is possible to start coding from 0s and 1s, in assembly,
in C, in Perl, and so on. The language doesn’t matter. You would have to take on the
tasks of initialization one at a time, and would likely end up with a tightly bundled RTOS
or native-mode application. It is possible to boot Intel architecture completely from
scratch. You can also use the open sources as reference code (subject to their licenses). It
may take an NDA with a few companies to get the necessary details of the secret-sauce
bits to toggle on more advanced machines. Other options, and the benefits of some of
them, have already been outlined above. Starting fresh is not the best option once you
step back and look at the alternatives and trade-offs. But there are more options out
there. . . .
Consider a Native Boot Loader for Intel® Architecture

When market needs precipitated a native boot loader, Intel created the Intel Boot
Loader Development Kit (Intel BLDK) to provide a way to bootstrap firmware developers
new to Intel architecture. It was a good experiment and provided support for a few
Atom-based embedded platforms. Providing a combination of source, binary, tools, and
documentation, BLDK allows embedded firmware developers to not only debug and
boot their platform, but customize and optimize it for their basic production needs. It is
designed to do the basic silicon initialization required to bring the processor out of Re-
set, enable the system’s main memory, enable the device path to the target OS storage
device, fetch the initial sector of the OS, and hand control to the OS. It provides a great
reference for people new to the firmware and BIOS industry.
It is for system firmware developers working on platforms for embedded devices
based on Intel® Atom™ Processors. Students can gain an insight into what happens
before the OS takes over and in the background while the OS runs. System firmware
and hardware designers can grasp the level of work required to perform Intel archi-
tecture initialization.
Intel BLDK lacks extended features that would allow the user to run many
standard off-the-shelf operating systems. As BLDK is an extendable kit, system developers
are free to make their own additions and modifications to take advantage of all the
latest and greatest technologies coming from Intel. However, it doesn’t have enough
platform support to be considered universal.
Just Add Silicon Initialization

The Intel® Firmware Support Package (FSP) provides chipset and processor initializa-
tion in a format that can easily be incorporated into many existing boot loaders. FSP
will perform all the base initialization steps for Intel silicon, including initialization
of the CPU, memory controller, chipset and certain bus interfaces, if necessary. The
initialization will enable enough hardware to allow the boot loader to load an operat-
ing system. FSP is not a stand-alone boot loader as it does not initialize non-Intel
components, conduct broad bus enumeration, discover devices in the system, or sup-
port all industry standard initialization. FSP must be integrated into a host boot
loader, such as those open-source solutions mentioned above, to carry out full boot
loader functions.
Summary
Booting Intel architecture should be easy. While open source provides many
advances, in case you need to know the nuts and bolts, this book provides that
background. Most of the basics in the system firmware space are not taught in
college; they are learned through on-the-job training.
Options are available when choosing a firmware solution as a starting point. The
more that you know about them, as well as about the key players, their history, and
the variables, the better decision you can make for your current and future designs.
The decision will be a combination of arbitrary thought, low-hanging fruit, economies
of scale, technical politics, and, of course, money versus time.
In the following chapters, we describe details of the typical Intel architecture boot
flow and then detail how to port and debug an Intel architecture motherboard and
add custom initialization for your own design. We will examine different OS loader
support for typical use cases. Hopefully this gives you an appreciation of the scope
involved and gives you some ideas.
As the title implies, this book is a supplement for embedded developers. We will
use Intel BLDK as an example, but the following chapters apply to all initialization
solutions. The Intel® Galileo board also has full UEFI source, which is available on
GitHub if developers want to play with these concepts and develop their own boot-
loaders.
Chapter 2
Intel Architecture Basics
Architecture is a visual art, and the buildings speak for themselves.
—Julia Morgan
Intel Architecture has been evolving since before the PC AT computer. It has ridden
along with and contributed to many industry standards or become a de facto standard
over the years. What used to be the IBM or a “clone,” or IBM-compatible, is now
simply a PC motherboard most likely with Intel’s CPU and chipsets.
Most people didn’t really begin to understand this until the Intel® Pentium®
Processor with MMX technology and the bunny-suited fab workers dancing to Wild
Cherry’s Play That Funky Music commercials. The accompanying five-tone melody
reminds you from across the room, in a crowded sports bar, whose chips are in the
computer they just ran an ad for, even when you cannot see the TV. And then there
were the Blue Man Group commercials for the Intel® Pentium® III processors. Which
did it? I’m not sure…you pick. Likely it was all the stickers on the machines that
gave it away.
To understand the why and how of the current designs, you should go back and
study history a little, maybe not quite as far back as the systems based on the AT bus
(also known as the Industry Standard Architecture, or ISA bus), but it would help. We
then can understand how the different advancements in bus technology and capabil-
ities of add-in cards, from ISA to PCI to PCI-X to PCIe, have progressed and how pop-
ular and mature functions have been integrated into the Intel chipsets. Today most of
the Intel architecture revolves around PCI devices in the chipset. You can specifically
look at graphics and how it has individually adapted over the years to take advantage
of the location in the system and proximity to CPU, memory, and I/O. Each step along
the path of this evolution has gone toward increasing bandwidth and reducing bot-
tlenecks, putting the next key technology in the best possible position to show the
extensibility, modularity and speed of the platform. This is a universal truth in build-
ing the better mouse trap year after year.
In communicating with software and the evolution of the platform, there are
multiple angles to consider: the BIOS or firmware, the operating system, the appli-
cations, and how these interact with each other. The hardware interfaces are built
into the BIOS, and the OS kernel, and the device drivers. The applications and soft-
ware interfaces can change dramatically, but one cannot talk about hardware ar-
chitecture without also talking about the instruction set architecture, (ISA), not to
be confused with the older platform bus of choice. While the rest of the platform
has had the benefit of fundamental revision (gradually leaving legacy behind),
BIOS has grown through accretion on the same architecture. Table 2.1 illustrates
some example advances.
Table 2.1: Example Advances
The CPU
For most basic computing, the processor is the engine that makes everything work.
The addition of hyperthreading, multithreading, and multiple cores complicates the
programming and the debug scene for parallel or concurrent computing, but hasn’t
made that big of a difference to the BIOS or bootloader, which is normally, but not
necessarily, single threaded. After 2010, processors have multiple processor cores,
integrated memory controllers, and a variety of interconnect combinations, includ-
ing PCIe, DMI, Intel® QuickPath Interconnect (Intel® QPI), and ESI. The integration
of the “uncore,” or system agent, which includes the memory controller and
graphics interface (if not a graphics engine itself), presents challenges when dealing
with multisocketed designs and making sure the performance software stacks
are made Non-Uniform Memory Access (NUMA) aware. But the BIOS and bootloader
aren’t much affected beyond making sure that the processor booting the system is
close to the memory where the pieces got shadowed.
All these technologies are enabled by system firmware and add real value to perfor-
mance and platform capabilities.
One of the keys to the many flavors and brands of Intel processors is that they
have been produced with a common Instruction Set Architecture (ISA), Itanium being
an exception to a unified ISA. Today the cores for Intel Xeon, Intel Core, Intel Atom,
and Intel Quark are the same root instruction set with some minor additions every
generation. The key decision that remains to be made by system firmware is whether
you will run in 32-bit or 64-bit mode, which is done with a compiler switch if the code
is correctly designed.
The front side bus (FSB), which had been between the CPU and the North Bridge, was
a proprietary parallel bus that normally linked to the PCI host controller in the north
bridge, or the memory controller hub (MCH), as it is sometimes referred to in docu-
mentation. With the integration of the north bridge into the processor on the Nehalem
processor, the front side bus is no longer visible externally, and its functionality has
been replaced with an invisible fabric internal to the chip. Instead of the FSB, the
connection between the CPU and the rest of the system is now one of a few serial
interfaces: Intel QPI for interprocessor connections and links to IOH components,
and the DMI or ESI link, or similar, to PCH components.
It has had a string of names over the years, but the primary interface to the processor
on older chipsets contains the PCI host controller (PCI Bus0, Dev0, Func0), the
memory controller, a graphics port, and/or integrated graphics.
As part of the North Bridge, the PCI Host Controller converts FSB signals to the
appropriate PCI bus traffic. It has been the home of the PCI configuration space,
which controls the memory controller. For that reason, people seem to consider them
one and the same, which we can grant if it is known that the memory controller hub
(MCH) serves other important purposes.
Another part of the North Bridge, the Memory Controller is kind of self-explana-
tory. Memory has been evolving steadily every product generation or so. The general
rule is: the faster the access times and the larger the memory the better. There are
limitations as far as address lines of the memory controller, speed of the bus, number
of memory channels, the memory technology, and the thermal envelope, especially
in larger sizes and densities, which tend to limit the size available to the system.
The days of Fast Page, EDO, BEDO, and SDRAM are long over. RDRAM is dead,
and for good reason (electrically way too sensitive and an expensive nightmare to
debug). These days a form of Dual Data Rate (DDR) memory is the de jure standard.
In the past, there have been memory translation hubs (MTHs) and Memory Repeater
Hubs (MRHs), but these are no longer on the Intel roadmaps. Other not-quite-straight-
memory devices do exist, including the fully buffered DIMMs and other devices that
allow for a memory riser scenario. We may still be waiting for a nonvolatile form of
main memory, such as phase change memory (PCM), to come along and remove the
need for reinitialization of this key component on every boot; 3D XPoint is not claimed
to be PCM, and has not replaced DDR.
The Graphics (GFX) engine is normally located as close to the physical memory
and processor as the architecture allows for maximum graphics performance. In the
old days, cards had their own private memory devices for rendering locally. While
that is still the case for add-in cards, any integrated graphics today utilizes a piece of
main system memory for its local memory. The killer add-in graphic cards for PCI have
been replaced, first with the Accelerated Graphics Port (AGP), and now that has been
replaced over time by the PCI Express Graphics (PEG) port as the pole sitter of add-in
devices. On some embedded and server designs, the PEG port can be used as a x16
PCIe channel for increased I/O capacity.
The link between the north and south bridges has been updated from the Hub link to
the DMI link. Now with up to 2 GB/s concurrent bandwidth, DMI provides up to 4x
faster I/O bandwidth compared with the previous Intel proprietary Hub link I/O in-
terface. A similar enterprise south bridge interconnect (ESI) is available that supports
speeds of 2.5 Gb/s and connects to an Intel ICH south bridge or can be configured as a x4
PCIe Gen 1 port. In the past, these links have been referred to as virtual bridges be-
cause they are transparent to any PCI transactions that may be sent across them. With
PCI bus 0 devices both in the north and south complexes, there is no change in PCI
bus numbers when you cross the border between chips. So, the software doesn’t need
to know that it is dealing with a two-chip or a one-chip (SoC) solution. The link not
only handles PCI transactions, but also several proprietary chip-to-chip messages, or
die-to-die messages if on a SoC.
The South Bridge, Also Known as the PIIX, I/O Controller Hub (ICH), I/O Hub (IOH),
Enterprise South Bridge (ESB), and Platform Controller Hub (PCH)
The south bridge in the past has taken on various forms, including the PIIX, the ICH,
the ESB (enterprise south bridge), the SCH (system controller hub), the IOH (I/O con-
troller hub), and now the PCH. All basically equate to the main I/O channels to the
outside world. If you want full documentation on the older corners of the newest in-
tegrated component, you may want to go all the way back to the PIIX documentation,
which can still be downloaded from www.intel.com. While you may be able to avoid
the terror of threading through fifteen years of change to a particular interface to build
a holistic I/O “driver” for your firmware, you may need to make changes to some very
old code and wonder “why did they do this; it doesn’t say it in the datasheet.”
This is where legacy gets to some very deep layers in the code. As the parts have mor-
phed over time, the code to support them has as well.
Formally known as the 82371FB, the PIIX PCI/ISA/IDE Xcelerator is a multifunction
PCI device implementing a PCI-to-ISA bridge function and a PCI IDE function. As a
PCI-to-ISA bridge, the PIIX integrates many common I/O functions found in ISA-
based PC systems—a seven-channel DMA controller, two 8259 interrupt controllers,
an 8254 timer/counter, and power management support. In addition to compatible
transfers, each DMA channel supports type F transfers. Programmable chip select de-
coding is provided for the BIOS chip, real time clock, and keyboard controller.
Edge/Level interrupts and interrupt steering are supported for PCI plug and play com-
patibility. The PIIX supports two IDE connectors for up to four IDE devices providing
an interface for IDE hard disks and CD ROMs. The PIIX provides motherboard plug
and play compatibility. PIIX implements two steerable DMA channels (including type
F transfers) and up to two steerable interrupt lines. Even with the oldest device you’d
realistically want to examine, that’s a lot of programming in the BIOS to get the I/O
setup to run “anything.” Perhaps the coolest BIOS-centric piece it had was the SMM
interrupt controller, used back then for advanced power management (APM).
The PIIX generations of south bridges continued to expand the list of integrated
devices and expansion buses:
– The PIIX3 added IOAPIC, USB UHCI, PCI 2.1, and upgraded DMA and IDE capa-
bilities. There was a PIIX2 in between which supported PIO Mode 4, but it didn’t
have the legs of the PIIX3 and was soon overtaken.
– The PIIX4 added ACPI and DMA to the IDE controllers. Prior to this, the CPU was
involved in all PIO mode transfers. Another important advancement was the ad-
dition of the Real-Time Clock and an SMBus controller to talk to things like thermal
chips, embedded controllers, and so on. There was an enhanced version of the
PIIX4 where something kind of important was fixed. And there was a mobile ver-
sion of the component where extra power management features were added for
helping keep your laps cooler in a remarkably deep green way.
– PIIX5 never left the starting line.
– PIIX6 was gearing up for adding IEEE1394, but was shelved for bigger and better
platform architecture advances.
With the first ICH, ICH0, and its successors, Intel added many basics, each with
system firmware impact:
– DMI to PCI bridge—external PCI bus.
– Serial ATA (SATA) for hard drives and SSDs.
– USB 1.1, USB 2.0, and soon USB 3.0—up to 14 ports and 2 integrated hubs.
– Intel® High Definition Audio (more bits, less noise).
– Serial Peripheral Interface (SPI) to connect to the NOR-based NVRAM.
– Low Pin Count (LPC) Interface to get to the Super IO or Embedded Controllers and
replace the ISA bus from a software point of view.
– PCI Express Root Ports to get to PCIe bus(es). These can be configured in a variety
of different combinations from x1 to x16 lane widths.
– 1-Gigabit Ethernet (GbE) controller and now 10-Gigabit Ethernet (10GbE) functionality.
– High Precision Event Timers (HPETs), which offer very high granularity event
capability.
– Power management capabilities—most of the clocks and control of the platform
power rails run through the PCH.
– General purpose I/O and assorted native functionality—pins can be configured
for either input, output, or some other native function for the platform. The
datasheet will detail which of the 70-some GPIOs do which function.
– Direct Memory Access (DMA).
– Real Time Clock (RTC).
– Total Cost of Ownership features (TCO).
– Manageability Engine (ME), which provides many backside capabilities and can
continue to run even when the front side system is asleep. This is like baseboard
management controllers (BMCs) on larger server designs.
With each subsequent ICH or PCH generation, each I/O subsystem has changed
slightly, and a major I/O interface or other key feature has been added. Register
changes may have taken place and depending on the I/O needs of the platforms, the
system firmware will have to consider these requirements. It is best to review the lat-
est datasheets at developer.intel.com to understand how the systems have expanded.
In recent years, the system on a chip, a combination of north bridge, south bridge,
and processor, has been in vogue. While the names of a device’s internal intercon-
nects have changed and we talk about fabrics and other nonsensical abstractions for
silicon, the same principles spelled out for CPU, memory, and IO still apply, as do the
standards contained therein.
This entire process is repeated until we have partial cache enabled or we finally have
memory ready to shadow our remaining BIOS code and data into. There are ways
around some of the delays that may be incurred on the way to and from the CPU and
the SPI NVRAM chip such as PCI Delayed transactions, pipelining, or prefetching. But
this isn’t the fast-boot chapter.
The Intel architecture full I/O range is 0–64 KB and is intended for register mapping
of hardware devices. Just as legacy hardware and software was handled years ago—
and for the universal good—you have to keep maintaining backward compatibility if
you want to ensure the hardware and software from years ago still works with modern
machines. The PCIe specification retains I/O Space for compatibility with legacy de-
vices that require it, because existing I/O device drivers must keep working
with no modifications.
Besides the simple fixed legacy I/O ranges (see Figure 2.4), there are Base Address
Registers per device that are enumerated by the BIOS and/or the operating system to
suit their idea of perfection. PCI-to-PCI (P2P) bridges also require a 4-KB minimum
I/O granularity between them.
In contrast, the term Memory-Mapped I/O has nothing to do with actual I/O
space. It is memory space used by hardware (usually, register space) that is accessed
from a configurable base address register. While this mechanism is similar to that of
I/O access from a high level, the transactions are routed very differently and are fun-
damentally different beasts to the platform with different rules.
Summary
Intel architecture has grown to become the industry standard. Understanding how it
has been developed, what the basic subsystems entail, and how they interact and
advance the system helps to provide a foundation for how or why systems operate the
way they do.
Chapter 3
System Firmware Terms and Concepts
For anyone new to Intel® architecture, the concepts behind three-letter acronyms can
be a bit overwhelming. This chapter explains numerous concepts that should help set
up the basic terminology used in future chapters. Many concepts are introduced, and
it is best to refer to this chapter as you progress through the book.
Memory Types
Traditionally, there is ROM and RAM. We will define them for the purist, but today
most people get the concepts.
ROM. Read-only memory (ROM) is locked memory that is not updatable without an
external ROM burner. It never loses its contents once it is programmed. Logistically it
makes development and bug fixes in the field much costlier. It is hard to find true
ROMs today, as flash technology has provided programmability during runtime. Of
course, you can have things like EEPROMs, which have ROM in the name and are
programmable, but see NVRAM below.
It is possible for silicon to have embedded ROM inside of it to enable must-have
interfaces, such as a NAND controller interface, if your required boot firmware solu-
tion is sitting behind it.
RAM. Random access memory (RAM) does not retain its contents once power is
removed. This type of memory is also referred to as volatile. There are many types
of RAM, such as static RAM (SRAM) and dynamic RAM (DRAM). In the context of
this book, system memory and sometimes memory refer to any of the available
types of RAM.
DOI 10.1515/9781501506819-003
NVRAM. Flash technologies have improved since the time of the dinosaur PC AT. In-
stead of ROMs, which store things like system BIOS, or RAM, which loses its mind
when you power down, most systems today contain a piece of programmable and
nonvolatile RAM known as NVRAM. If you have a flash upgrade utility for a device, it
is likely to be nonvolatile RAM, not formal ROM.
Processor Cache
Processors have local memory on them. While not normally thought about as memory,
it is akin to “short-term memory” for the system. During runtime, the processor man-
ages the cache per ranges that the system BIOS or operating system configure. Caching
of information provides the best access times you can get. Cache used to be disabled by
default, but these days, cache is enabled when the processor is powered.
During early system BIOS phases, the cache can be configured to provide a small
stack space for firmware to execute as soon as possible. The cache must be set up to
avoid evictions and is then torn down after main memory is up, though more advanced
algorithms can extend its use for a short time. More on this “cache as RAM” technique later.
System Memory
When you buy memory for a computer or other expandable device, you probably think
of modular DIMMs. There have been many technology changes over the years,
from EDO to BEDO to SDRAM to RDRAM and back to SDRAM in the form of DDR,
DDR2, DDR3, and, coming soon, DDR4. In the future, there is a roadmap of memory
that will make today’s best seem like EDO or ROM. On a scale of fastest to slowest in
the system, main system memory is in the middle.
Access time to this memory is typically faster than from NVRAM (disk- or SPI-
based), but slower than CPU cache. It is sized for the platform market to be ideal for
execution of operating system and applications during runtime. Memory DIMMs re-
quire a memory controller, formerly in the north bridge of the chipset, now inte-
grated into the processor. Memory initialization can take from several millisec-
onds to several seconds, depending on the transition state of the system (coming
from off or a sleep state). During the normal boot flow, once main memory is initial-
ized by the system BIOS, the shadowing of the “rest of the BIOS” can take place.
The memory is divided into a memory map for various usages by different subsys-
tems; not all memory is used by or available to the operating system. Eventually, the
OS is loaded into main memory and the system is then “booted” from a BIOS point
of view. Of course, the OS may need to load the drivers and applications … but that
is its own book.
In a BIOS context, the 512 bytes of CMOS is RTC battery-backed SRAM that is in the
south bridge of the chipset. Although it is not NVRAM, it is used to store the setup
menu and other user-dependent or board data that needs to be retained across power
state transitions (when the system is off). Depending on the system BIOS implemen-
tation, CMOS may not be used except for clock data storage. Over time, the setup
menu data became too complex, and the bitmap of 512 bytes got too tight. Tiano-
based implementations started using flash memory storage, which is not limited to SRAM sizes.
A small nonvolatile RAM (NVRAM) chip is used for storing BIOS code data. Flash
Memory is used to store the BIOS or boot loader. One advantage of flash memory is
that it can be written to as the system powers up. This allows either the BIOS or boot
loader to be stored in flash, as well as any data that the BIOS or boot loader may need
during the boot process. Flash memory may also be used to store an OS as well, but
depending on the technology of flash used, it may be cost prohibitive to do so. The
packages that the flash parts have come in over the years include the firmware
hub (FWH) and the SPI flash device.
Figure 3.2: System Memory Map for the Intel® IO Controller Hub, TOM is variable
The memory map contains specific regions used for specific purposes.
The Legacy Address Range is used for multiple things in legacy systems. The Interrupt
Vector Table (IVT) is located at physical address 0. Interrupts were discussed in depth
in Chapter 2. The video BIOS gets loaded at 0xC000:0 and the video buffer resides in
the 0xA000 segment. The 0xC800:0 through 0xD000 segments can be used for op-
tion ROMs. The 0xE000 and 0xF000 segments are typically reserved for any runtime
BIOS or boot loader code or data.
The Main Memory Address Range, referred to earlier as system memory, is the
memory available to the OS. Parts of the Main Memory Address Range may not be
available for the OS to use, such as Stolen Memory or ACPI tables. This will be dis-
cussed in more detail in Chapter 8, which describes OS handoff structures.
The PCI Memory Address Range is used during PCI enumeration to assign any re-
quested memory-mapped input/output (MMIO). This memory is used to access regis-
ters in a specific PCI device.
For systems that have 4 GB or greater of system memory, the PCI Memory Address
Range still resides just below 4 GB. In order to avoid losing access to the system
memory in the location of the PCI MMIO, the memory is remapped just above 4 GB.
Figure 3.3 illustrates how memory remapping works.
Splash Screen
Today, most system firmware hides the diagnostic information behind a bitmap with
the company logo or any other image that the PC vendor deems suitable. These bit-
maps are also known as splash screens—screens that keep the user’s eyes occupied
until the OS can load. They serve no other purpose normally than a bit of market-
ing/brainwashing. It is common for BIOS or boot loaders to use splash screens to hide
the boot process from users and to give them something to look at.
They are popular for embedded systems where the user is accustomed to some
type of feedback as soon as they turn on the system. A set-top box is an example of
this; if users do not get visual feedback almost immediately, they would assume that
something was wrong.
Figure 3.3: System Memory Map for the Intel® 4 Series Chipset Family
Display Messages
The diagnostic information hidden by splash screens can prove very useful in deter-
mining the state of the machine, should there be problems caused by an initialization
issue, such as the inability to boot the PC to an OS.
These status and error messages are communicated to the user in different ways.
The most obvious is by printing messages on the screen. Typically, the BIOS has an
option for turning off the splash screen in order to display these diagnostic messages.
Most people have seen the memory test performed by the BIOS in much older sys-
tems, where a counter runs up to the size of memory in the system before the BIOS
prints that the memory test passed.
Beep Codes
There are times when a hardware failure occurs before the graphics device is initialized so
that messages printed to the display device are useless. In this case, the BIOS or boot loader
displays beep codes in order to help the user determine what has gone wrong. Of course,
beep codes are not at all obvious but can be referenced in the BIOS documentation.
POST Codes
Offset  Contents
0x000   Boot code
0x1B8   Optional signature
0x1BC   Reserved
0x1BE   Partition table
0x1FE   0xAA55
Real Mode
Real Mode is 16-bit code created to work with 16-bit registers. Real Mode allows the
accessing of only 1 MB of memory. Memory is accessed in the following format: seg-
ment : offset. The physical address is calculated by shifting the segment left by four
bits and adding the offset to it. Figure 3.4 shows an example of calculating the phys-
ical address.
Segment 0xF000
Offset 0x5432
Physical address = 0xF0000 + 0x5432
Physical address = 0xF5432
Figure 3.4: Example Physical Address Calculation in Real Mode
Protected Mode
Protected mode was introduced to address memory above 1 MB. Protected mode also
allows 32-bit code to execute. Protected mode uses the segment register content as
selectors or pointers into descriptor tables. Descriptors provide 24-bit base addresses
with a physical memory size of up to 16 MB, support for virtual memory management
on a segment swapping basis, and several protection mechanisms. The descriptors
referred to are part of the Interrupt Descriptor Table (IDT) and Global Descriptor Ta-
bles (GDT). They are beyond the scope of this book. For more details on the GDT/IDT
refer to the Intel® 64 and IA-32 Architectures Software Developer’s Manual online.
Logical Addressing
The segment selector identifies the segment to be accessed, and the offset identifies
the offset in that segment. The logical address is formed by adding the base of the
segment selector to the offset. The processor translates the logical address to a phys-
ical address, making the conversion transparent to software.
Flat Protected Mode
The preferred mode for system firmware is flat protected mode. This mode allows ad-
dressing memory above 1 MB, but does not require a logical-to-physical conversion.
The GDT is set up so that the memory maps 1:1, meaning that the logical and physical
addresses are identical.
Reset Vector
When an Intel architecture boot-strap processor (BSP) powers on, the first address
fetched and executed is at physical address 0xFFFFFFF0, also known as the reset
vector. This accesses the ROM or flash device 16 bytes (0x10) below the top of the
ROM. The boot loader must always contain a jump to the initialization code in these
top 16 bytes.
The I/OxAPIC
The I/OxAPIC is contained in the south bridge, or ICH. It expands the number of IRQs
available and allows an interrupt priority scheme that is independent of the interrupt
number. For example, interrupt 9 can have a higher priority than interrupt 4.
Each IRQ has an associated redirection table entry that can be enabled or disa-
bled and selects the IDT vector for the associated IRQ. The I/O APIC is only available
when running in protected mode.
The Local APIC
The local APIC is contained inside the processor and controls the interrupt delivery
to the processor. Each local APIC contains its own set of associated registers, as well
as a Local Vector Table (LVT). The LVT specifies the way the interrupts are delivered
to each processor core.
Summary
Now that the basic terminology is clear, we can discuss more advanced items.
Chapter 4
Silicon-Specific Initialization
Since when has the world of computer software design been about what people want? This is a
simple question of evolution. The day is quickly coming when every knee will bow down to a
silicon fist, and you will all beg your binary gods for mercy.
—Bill Gates
So if developers strive to be binary gods, as Bill Gates puts it, then they really
need to know what they are doing when it comes to silicon initialization. Hope-
fully they have more than a binary to start from. They are going to need code and
proper documentation.
Initializing silicon should be easy: “It’s a few rdmsrs and a wrmsr for the
processor. It is a 0xCFC here and a 0xCF8 there with the standard config space for a PCI
device, a peek and a poke to some special-sauce registers. Then just glob on the ex-
pansion ROMs or blobs for the add-in devices and voila you are done, right?” Such
would be the case in a perfect world.
While silicon initialization is a process in which the BIOS or system firmware
must set a few bits and enable the system in the correct sequence, the nuances of the
silicon and the recent changes must be updated every design cycle.
hardware design guidelines, it can be a ferocious nightmare where the board sputters
and grinds and makes magic blue smoke. As the first chapter indicated, this is where
it is vital to have the right level of data about the component.
Chipsets
For a chipset, the firmware developer has to consider several features:
– Flash programming (NOR, NAND, or integrated)
– Reset controls
– I/O and MMIO base address locations and ranges
– Standard PCI header details
– Timers and clock routing
– General purpose I/Os and bells and whistles.
– Thermal management
– Power management, APM, ACPI tables and ASL code support
– Interrupts
– Bus support per major interface (DMI, PCIe, audio, Ethernet, SMBus, USB, SATA).
The chipset comprises a series of integrated PCI devices connected to a fabric, back-
bone, or bus, which connects to an interface (such as DMI) to the processor. Each
device/function on the component will get PCI BIOS standard initialization, but also
some amount of proprietary bit twiddling. Most of these bits will be detailed in the
datasheet, but it is best to get the BIOS specification for the particular chipset, which
may require a special NDA with the manufacturer. There are also industry standards
to follow, depending on the type of controller (SATA, USB, and so on). There may be
exceptions to industry standards per component. In the end, we need to keep it all
straight and avoid conflicts in memory ranges, I/O ranges, and IRQ and interrupt
settings so that all of the components come alive and start talking to one another
before the OS takes over with its own set of device drivers, settings, industry
standards and exceptions, and additional capabilities.
Processors
For a processor, there is much more than a few rdmsrs and wrmsrs. Today we have
no fewer than:
– CPUID and branding strings
– More than 100 model-specific registers, some specific to package, core, or thread.
– Bus-to-core ratios
– Overclocking controls
– Turbo mode
Simple Bits
Simple bits are typically programmed once, early in initialization, and never looked
at again. It is possible that the bits are also locked when set to avoid tampering by
malware during runtime.
Custom Routines
Custom routines are just that: custom. It is up to the developer and/or the designer to
make the calls as to what needs to happen with initialization of a device beyond the
standard PCI base address registers, IRQ routing, and so forth. A custom routine can
provide the flexibility to do what is needed apart from the industry standards, such
as PCI, USB, SATA, or others. Custom routines mean that they could be one-off imple-
mentations for specific applications and will need to be redone for the next compo-
nent design, such as ASICs. Often, custom routines provide the best efficiency in boot
speeds overall, as standard implementations typically mean slowing down to detect
and meet any unusual scenarios. The algorithms and bit settings could be entirely
memory mapped and have absolutely no standard to program to, such as PCI or ACPI.
Most OSes, though, do not match up easily to that lack of standards; custom OS-level
drivers would also be needed, and that OS driver may be very different from the ini-
tialization drivers in a UEFI system firmware.
An interesting example is USB initialization. Per the specification, one needs to
use USB protocols to interact with each controller and identify whether any devices
are attached to those ports that need further initialization. One alternative mechanism
is to access this information through memory-mapped I/O (MMIO) instead. Such an
enhancement would need silicon changes to create the
mappings in memory space, but could achieve potentially a much faster and leaner
initialization mechanism. Alternatively, beyond the USB specification requirements,
studying the actual timing of the silicon and shortening the “standard” delays down
to what is actually required by the component can yield great benefits. Results may
vary by the controller and device manufacturers, but the potential time savings are
dramatic. More on this in Chapter 12.
Embedded controllers are custom programmable hardware that can interface
with and extend the abilities of the system, as well as provide a back-end solution to
some interesting design problems. These controllers come with their own firmware
and control interfaces to the system BIOS, besides embedding keyboard controllers
and other super IO functionality.
Field-Programmable Gate Arrays (FPGAs) are examples that provide fixed func-
tionality until they get reprogrammed. Their sizes can vary and their applicability de-
pends on the market segment in which they are found. Like CMOS, some need battery
backup to maintain their nonvolatile state. The usage can follow stand-
ard programming needs, like PCI or USB, ACPI, and so on, or it can be completely
custom or need no additional programming at all. FPGAs are normally used to pro-
vide feature augmentation to a design or early development phases of new silicon.
Option ROMs
Formerly ISA-expansion ROMs, PCI-option ROMs such as video BIOS and now UEFI
drivers such as graphics output protocols provide another mechanism for taking
things one step beyond industry standards. These objects do so, though, in a very
standard way that can be implemented easily in binary, in source, or in a mixture of
the two. Expansion ROM code extends the BIOS capabilities beyond what the standard
CPU and chipset programming requirements provide.
They can be used for add-in cards or with integrated components. Option ROMs
get loaded and executed by the system firmware during initialization and, if needed,
shadowed during runtime. Newer capabilities of UEFI option ROMs offer the develop-
ers a driver model where the ROM can be loaded but not executed unless it is needed
to be enabled via the boot path. There are some other advantages to UEFI option
ROMs:
– Prior to EFI capabilities, all legacy option ROMs were located below 1 MB, between
0xC0000 and 0xFFFFF, and carried around a good deal of 16-bit code.
– Newer option ROMs can be called by various names, including DXE drivers in the
Tiano realm and can be relocated above 1 MB, which eliminates the crunch on
size requirements of most option ROMs and alleviates the expansion limitation of
larger server systems.
Devices such as LAN, SCSI, SATA, RAID, and video often have option ROMs that
initialize key low-level requirements of proprietary designs. It is possible to embed or
integrate the code into the main BIOS ROMs. Sometimes the intellectual property of
the various silicon integrated into the design may not allow access to that code or
knowledge of the exact registers. There is a positive aspect to binary code: As a developer, you don’t
have to fix what cannot be fixed. And as the black box, a legacy option ROM binary
gives an excellent chance to innovate along alternative lines when the opportunity
presents itself.
Summary
This chapter contained a high-level overview of some of the programming require-
ments of basic components. For complete details on a particular chip, go to the silicon
provider and obtain the required documentation, such as the data sheet and program-
mer’s reference manual.
Chapter 5
Industry Standard Initialization
Beware of geeks bearing formulas.
—Warren Buffett
As the previous chapter began to highlight, the industry standards that can be ap-
plied to today’s computing systems are vast and varied. Understanding their impact
on systems and devices is key to creating a robust and stable pre-OS firmware. By
some counts, up to 70 specifications can apply:
1. UEFI
a. UEFI Specification Unified Extensible Firmware Interface Specification v2.7
b. Platform Initialization Specification v1.2 (for UEFI)
2. PCI
a. PCI Express Base Specification
b. PCI Local Bus Specification
c. PCI-to-PCI Bridge Architecture Specification
d. PCI Hot-Plug Specification
e. PCI Bus Power Management Interface Specification
f. PCI BIOS Specification
g. PCI Firmware Specification
h. PCI Standard Hot-Plug Controller and Subsystem Specification
i. PCI-X Addendum to the PCI Local Bus Specification
j. PCI Express to PCI/PCI-X Bridge Specification
3. IDE
a. ATA/ATAPI 6
b. ATA/ATAPI 7
c. Programming Interface for Bus Master IDE Controller
d. PCI IDE Controller Specification
e. Serial ATA: High Speed Serialized AT Attachment
f. Serial ATA International Organization: Serial ATA
g. Serial ATA Advanced Host Controller Interface (AHCI)
h. ATAPI Removable Media Device BIOS Specification
i. El Torito Bootable CD-ROM Format Specification
4. USB
a. Universal Serial Bus Specification
b. Universal Host Controller Interface (UHCI) Design Guide
c. Universal Serial Bus (USB) Device Class Definition for Human Interface De-
vices (HID)
d. Universal Serial Bus (USB) HID Points of Sale Usage Tables Version 1.01
DOI 10.1515/9781501506819-005
Let’s take a high-level look at the four most important industry specifications beyond
UEFI from a system firmware and BIOS perspective:
1. PCI
2. SATA (formerly known as ATA, IDE)
3. USB
4. ACPI
PCI
PCI is the single strongest hardware standard of the Intel-architecture-based plat-
form. All the key I/O controllers on the platform, including the embedded memory
controllers, system agents, and the graphics ports on Intel CPUs, are either PCI or PCI
Express (PCIe) devices.
The PCI standard is divided into several generations of key specifications. The
latter generations are, for the most part, backward compatible with previous genera-
tions. If you join the PCI Special Interest Group (SIG) at www.pcisig.com, you will
have access to all the latest PCI industry standard specifications. The following para-
graphs highlight some of the PCI pre-OS key points, but are by no means exhaustive
(please download and read/memorize these specifications). Also, it should be well
understood that exceptions will occur for specific devices, as with any standard.
Understanding the PCI Local Bus Specification is the first step to understanding
the bus transactions and protocols necessary to communicate with hardware or soft-
ware driver designers and help debug platform-level issues. There are several PCI
standard configuration header registers listed in the specification that are key to PCI
enumeration, which must be completed before a common off-the-shelf OS can load
properly on the platform. Several devices are then PCI enumerated again by the OS.
Some real-time operating systems may have enough a priori knowledge of the PCI
subsystem to hard code the needed values in order to boot and load drivers and may
not require full PCI enumeration. While some off-the-shelf operating systems repeat
some of the PCI enumeration during OS load time, without a basic level of PCI enu-
meration, none of the devices on the Intel architecture platform will function, includ-
ing the OS storage device. Full enumeration on today’s components takes only about
20 milliseconds (ms), and the flexibility it provides outweighs hardcoding of values
per board/system.
Besides the main PCI Local specification, there are also PCI Bridge specifications,
PCI Hot Plug specifications, and PCI Power Management specifications that play roles
in pre-OS PCI programming and compatibility. Up to a point in PCI history, most of
the pre-OS required details are outlined in the PCI BIOS Specification 2.1.
Here are some of the basic requirements from the PCI-SIG specifications for pre-
OS firmware and BIOS.
Note: Again, this laundry list does not replace the PCI specification. Please get the specifications and
read copiously! Read The Friendly Manual!
For dynamic PCI enumeration, the pre-OS firmware must scan the PCI bus for devices
and bridges. To do this, the BIOS must start looking for devices at the host to PCI
bridge (PCI Bus 0, PCI Device 0, PCI function 0), and start scanning or “walking
through” the devices, methodically looking for PCI devices and PCI bridges. When
function 0 is not present in the system, it is assumed that the functions 1 through 7 of
that device are not incorporated. For each PCI device/function found, the BIOS must:
– Examine the header type register to determine if the device is a PCI- to-PCI bridge.
The Header Type field at register offset 0x0A will indicate type 0 for devices and
type 1 for bridges.
– For Type 0 headers:
o Assign Standard Base Address Registers
Prefetchable Memory
Non-Prefetchable Memory
Legacy I/O ranges to device
o Enable Bus Mastering, memory, and I/O ranges as needed in the
configuration register
o Program Master Latency Timer Register
o Program Cache Line size
o Program Min Grant and Max Latency values (not used for PCI Ex-
press)
o Program Subsystem Vendor and Device IDs
o Assign PCI IRQs and INTA, INTB, INTC, or INTD lines. Care should
be taken to read the schematics for interrupt routing on down PCI devices and PCI slots.
While there are up to 256 buses on PCI, the minimum granularity of the I/O base and
limit registers of 4 KB really limit that to approximately 16 bridges possible (assuming
each has some amount of I/O required).
When configuring the PCI devices of the chipset and CPU, there will be several
private registers above 0x3F in PCI and in the PCI Express ranges that will need to be
programmed for specific purposes. Also, in the chipset, memory-mapped I/O config-
uration spaces are mapped by the Root Complex Base Address register at Bus 0, De-
vice 31, Function 0, Offset 0xF0. It specifies the physical address of the Chipset Configu-
ration space. This is also written as “RCBA + xxxxh,” where xxxxh is the offset of a
particular register location in the Chipset Configuration space.
For static PCI enumeration, a designer can define a specific mapping and hard-
code the standard registers for a particular system. But that is not the right way. While
some believe that this saves time in walking the buses dynamically, the entire bus
scan should take on the order of 20 ms. The benefits of doing it dynamically will likely
outweigh the return of saving 20 ms and hand-coding the map statically for every
closed-box configuration. If there are any expansion slots on PCIe buses on the sys-
tem, then static enumeration is really not an option.
PCI BIOS
PCI BIOS Specification v2.1 spells out PCI BIOS calls via legacy software interrupt Int
1Ah; however, legacy calls are fast becoming obsolete in the modern world of UEFI.
PCI BIOS (Int 1Ah) function calls are listed in Table 5.1.
Table 5.1: PCI BIOS (Int 1Ah) function calls

Function                   AH    AL
PCI_FUNCTION_ID            B1h
PCI_BIOS_PRESENT                 01h
FIND_PCI_DEVICE                  02h
FIND_PCI_CLASS_CODE              03h
GENERATE_SPECIAL_CYCLE           06h
READ_CONFIG_BYTE                 08h
READ_CONFIG_WORD                 09h
READ_CONFIG_DWORD                0Ah
WRITE_CONFIG_BYTE                0Bh
WRITE_CONFIG_WORD                0Ch
WRITE_CONFIG_DWORD               0Dh
GET_IRQ_ROUTING_OPTIONS          0Eh
SET_PCI_IRQ                      0Fh
Most operating systems today can access PCI devices directly with their own class
driver. As the PCI bus is completely discoverable (unless hidden by the BIOS for a
board-specific reason), then through that class driver they can access PCI devices,
class codes, as well as read and write configuration registers as needed.
The more useful calls are Get_IRQ_Routing_Options and Set_PCI_IRQ, which
some operating systems still use.
$PIR Table
Per Microsoft’s PCI IRQ Routing Table Specification 1.0, the PCI IRQ Routing Table is
in system memory from F0000h to FFFFFh on a 16-byte boundary. It begins with a
signature of $PIR, and the resulting table output is compatible with the format of the
buffer returned by the Int 1Ah Get_IRQ_Routing_Options call above.
The PCI IRQ Routing Table ($PIR) has the header structure shown in Table 5.3.
Table 5.3: Header Structure of the PCI IRQ Routing Table ($PIR)

Signature - The signature for this table is the ASCII string “$PIR”. Byte 0
is 24h, byte 1 is 50h, byte 2 is 49h, and byte 3 is 52h.
Version - The version consists of a Minor version byte followed by a Major
version byte. Has to be 0x0100 (version 1.0).
Table Size - Holds the size of the PCI IRQ Routing Table in bytes. The header
is 32 bytes and each slot entry 16 bytes, so if there were five slot entries
in the table, this value would be 32 + (16 × 5) = 112.
PCI Interrupt Router’s Bus
PCI Interrupt Router’s DevFunc - The Device is in the upper five bits, the
Function in the lower three.
PCI Exclusive IRQs - This is an IRQ bitmap that indicates
which IRQs are devoted exclusively to PCI usage. For
example, if IRQ n is devoted exclusively to PCI and cannot
be assigned to an ISA device, then bit n of this 16-bit field
should be set to 1. If there are no IRQs devoted exclusively to
PCI, then this value should be 0.
Compatible PCI Interrupt Router - This field contains the
Vendor ID (bytes 0 and 1) and Device ID (bytes 2 and 3)
of a compatible PCI Interrupt Router, or zero (0) if there is
none. A compatible PCI Interrupt Router is one that uses the
same method for mapping PIRQn# links to IRQs, and uses
the same method for controlling the edge/level triggering
of IRQs. This field allows an operating system to load an
existing IRQ driver on a new PCI chipset without updating
any drivers and without any user interaction.
Miniport Data - This DWORD is passed directly to the IRQ
Miniport’s Initialize() function. If an IRQ Miniport does not
need any additional information, this field should be set to
zero (0).
Reserved (Zero)
Checksum - This byte should be set such that the sum of
all of the bytes in the PCI IRQ Routing Table, including the
checksum and all of the slot entries, modulo 256, is zero.
First of N Slot Entries - Each slot entry is 16 bytes long and
describes how a slot’s PCI interrupt pins are wire-OR’d to
other slot interrupt pins and to the chipset’s IRQ pins.
Nth Slot Entry
(An example table of slot entries, mapping the INTA#–INTD# pins of the embedded
devices and slots to the chipset’s link values, followed here in the original.)
The $PIR table is not the only way to describe the IRQ routing to the operating system.
Microsoft Windows 2000 and later versions of Windows use ACPI tables and methods
to determine the IRQ mapping supported per PCI device. The ACPI FADT table
indicates which mode is in use.
PCI Recommendation
As the PCI specifications have progressed, the PCI Express Specification and the PCI
Firmware Specification have become supersets of the previous PCI Local Bus and
PCI BIOS specifications. The PCI Express Specification is similar to and backward
compatible with the PCI Local Bus Specification. For this reason, developers should
implement section 7 of the latest PCIe Specification. Several implementation notes
are also included in the PCIe specification to avoid common pitfalls.
USB
The pre-OS firmware has several options for USB support:
1. Supply PCI resources for onboard USB controllers and wait for the OS drivers to
load and enumerate the USB devices before they are usable. The USB controllers
or devices may also be armed to wake the system via ACPI.
2. Add a USB stack in the firmware to enumerate the USB bus and allow for limited
functionality pre-OS for such things as HID devices (keyboard/mouse) or storage
devices (boot to USB).
3. Add an SMI handler to allow for OS Bootloader and OS runtime support on OSes
that do not have a native USB driver stack. This is also known as “legacy USB
support.”
These are the basics that have existed since USB started in the 1990s. Adding this USB
pre-OS support to a firmware stack is a substantial effort (multiple man-months). The
additional boot time involved can also be prohibitive: several hundred milliseconds
to several seconds, depending on the number of USB ports/devices/usages enabled.
In addition to the USB pre-OS support listed above, there are a few other areas of
USB that can be explored for additional value to the platform:
1. Pre-boot authentication (PBA) may also require USB support, and that is an addi-
tional concern that must be worked out with the PBA vendors and the OEM of the
system.
2. Trusted USB, where there is an established root of trust from the boot vector and
where a USB device is not allowed to function unless it is authenticated and/or
secured in some manner by either the hardened BIOS firmware and/or the oper-
ating system.
PBA devices and Trusted USB share some common and interesting traits, but we will
focus on the non-secure USB enumeration and initialization in this book.
There are different versions of USB controllers implemented in the Intel chipsets
and systems on chips. Here we will focus on the PCH.
The simplest form of USB support is basic PCI support with no additional USB net-
work initialization. During the PCI enumeration in the BIOS flow, the USB controllers’
BARs are assigned, IRQs are provided, and memory and I/O space may be enabled.
Additionally, the system firmware can also hide controllers, disable ports, and arm
for ACPI wake events.
It is important to note a quirk here: the BIOS cannot configure the Intel USB
controllers on the PCH to provide UHCI support only. This configuration is
prevented per the PCI Specification requirements; that is, PCI devices must always
implement function 0. Therefore, UHCI host controller support must always be
accompanied by support from at least one EHCI host controller (D29/D26 Function #0).
To ensure that a disabled USB function cannot initiate transactions for USB trans-
fers or be the target of memory or I/O requests, the system BIOS must also ensure that
the controller’s memory and I/O control bits are disabled prior to hiding the function,
and that USB functionality is disabled in the controller’s registers:
– Clear the Run/Stop bit and verify that the HCHalted bit becomes set (EHCI and
UHCI)
– Set the Interrupt disable bit, PCI Cfg offset 04h[10] (EHCI and UHCI)
– Clear Asynchronous schedule enable bit, USB20 MEM_BASE 20h[5], and Periodic
schedule enable bit, USB20 MEM_BASE 20h[4]
– Wake capabilities (UHCI, GPE0_EN, and EHCI, D26/D29:F0:62h)
Once set, the device will be disabled until the next platform reset occurs. The policy
to disable the EHCI functionality is not a dynamic one. This restriction also applies to
subsequent warm boots.
The EHCI host controllers may generate the wake from the internally routed PME sig-
nal. An enable/disable bit (PME_B0_EN) is provided to support the resume events
from Bus#0 using the PME signal. To support EHCI host controller wake in an
ACPI environment, a _PRW ACPI package describing the “Wake from USB
2.0” functionality should also be present under PCI Bus#0 for each of the EHCI host
controllers.
Device (USB7) {
    Name (_ADR, 0x001D0000)
    Name (_PRW, Package (2) {0x0D, 0x03})
}
Device (USB8) {
    Name (_ADR, 0x001A0000)
    Name (_PRW, Package (2) {0x0D, 0x03})
}
USB Enumeration
The system BIOS initializes the EHCI hardware in two phases, PCI enumeration as
outlined above and EHCI initialization outlined next.
EHCI initialization places the EHCI controller in a fully functional state. While the
operating system USB driver will repeat these steps for its own stack, the BIOS is
required to perform them for any pre-OS USB functional support.
1. Program Port Routing and ModPhy settings.
2. Perform a Host Controller (HC) Reset.
3. Delay required (X ms).
4. For EHCI, detect and initialize any companion controllers (UHCI).
5. Check number of ports, scan connection status.
a. If devices/hubs are connected, do a root port reset.
b. Root port reset is driven for a required time. This short duration is needed to
suppress any possible resume signals from the devices during initialization.
Per the EHCI version, the BIOS needs to track duration and clear the bit at
the proper time. (XHCI version will clear the bit by itself and the BIOS only
needs to poll.)
6. The BIOS must poll for port enable bit to be set. While the specification says 2 ms,
the completion will happen sooner than that for Intel PCHs.
7. Perform speed detection.
8. Perform a reset; recovery timing is 10 ms.
9. EHCI version – proceed to get a descriptor to know what is connected to root port.
Two potential answers:
– Hub – if a hub, assign it an address and wait 10 ms per the specification (this
may not be needed for Intel components). Read the hub’s required power-on
delay, wait for that delay, and then configure each of its ports
– Root port 0 index – identifies as a hub. Then configure the USB network downstream
It is recommended that the BIOS enable all the root ports in parallel, to avoid the
otherwise additive serial time delays, and then go deeper into the hub/device layer as
required for the platform. This is in some ways a platform policy decision.
If the BIOS finds a USB device, it gets the ID from the descriptor and then sets the
address. Depending on the device type, it may need to get a further descriptor and
configure the device as needed, then determine which interface to query next.
These steps are repeated for each root port.
SATA
Intel components have supported ATA controllers in one form or another since the
early 1990s chipsets. These have converted over time from Parallel ATA to Serial ATA.
Today most Intel platforms support only SATA controllers, up to a maximum of six
SATA ports. It is still possible to find PATA-supported chips in the market, either as
integrated controllers or as discrete PCI devices. Intel datasheets have complete
details on the number and types of controllers, the number of ports, the available
configurations and SATA-generation support per component, as well as the complete
set of registers.
SATA controllers support the ATA/IDE programming interface and may support the
Advanced Host Controller Interface (AHCI), which is not available on all chips. In this section,
the term “ATA-IDE Mode” refers to the ATA/IDE programming interface that uses
standard task file I/O registers or PCI IDE Bus Master I/O block registers. The term
“AHCI Mode” refers to the AHCI programming interface that uses memory-mapped
register/buffer space and a command-list-based model.
The system BIOS must program the SATA controller mode prior to beginning any
other initialization steps or attempting to communicate with the drives. The SATA con-
troller mode is set on mainstream PCHs by programming the SATA Mode Select (SMS)
field of the Port Mapping register (Device 31: Function 2: offset 90h, bits [7:6]).
Not every mode is supported on every component. Depending on which version of the
component (mobile or desktop, RAID or non-RAID), the allowed configurations vary.
RAID and AHCI modes require specific OS driver support and are identical except
for differences in PI and CC.SCC values. IDE mode does not have any special OS re-
quirements and is sometimes termed compatible mode. In addition to the three oper-
ation modes above, software can choose to operate SATA ports under a single con-
troller mode or dual controller mode.
Software, typically the BIOS, decides up front which controller mode the system
should be operating in before handing over the control to the OS.
If the system BIOS is enabling AHCI Mode or RAID Mode, then it must disable the
second SATA controller on the part (Device 31, Function 5) by setting the SAD2 bit,
RCBA + 3418h[25]. System BIOS must ensure that memory space, I/O space, and inter-
rupts for this device are also disabled prior to disabling the device in PCI configura-
tion space.
AHCI Mode
AHCI mode is selected by programming the SMS field, D31:F2:90h[7:6], to 01b. In this
mode, the SATA controller is set up to use the AHCI programming interface. The six
SATA ports are controlled by a single SATA function, D31:F2. In AHCI mode, the Sub
Class Code, D31:F2:0Ah, will be set to 06h. This mode does require specific OS driver
support.
RAID Mode
RAID mode is enabled only on certain SKUs of the Intel components and requires an
additional option ROM, available from Intel.
When supported, RAID mode is selected by programming the SMS field,
D31:F2:90h[7:6] to 10b. In this mode, the SATA controller is set up to use the AHCI
programming interface. The SATA ports are controlled by a single SATA function,
D31:F2. In RAID mode, the Sub Class Code, D31:F2:0Ah, will be set to 04h. This mode
does require specific OS driver support.
In order for the RAID option ROM to access all six (or four, depending on SKU) SATA
ports, the RAID option ROM enables and uses the AHCI programming interface by
setting the AE bit, ABAR + 04h[31]. One consequence is that all register settings
applicable to AHCI mode set by
the BIOS must be set in RAID as well. The other consequence is that the BIOS is re-
quired to provide AHCI support to ATAPI SATA devices, which the RAID option ROM
does not handle.
The PCH supports a stable-image-compatible ID. When the Alternative ID Enable bit,
D31:F2:9Ch[7], is not set, the PCH SATA controller will report a Device ID of 2822h
for the desktop SKU.
SATA drives cannot start to spin up or become data-ready until the SATA port
is enabled by the controller. In order to reduce drive detection time, and hence the
total boot time, the system BIOS should enable the SATA ports early during POST (for
example, immediately before memory initialization) by setting the Port x Enable (PxE)
bits of the Port Control and Status registers, D31:F2:92h and D31:F5:92h, to initiate
spin-up of such drive(s).
In IDE mode, the system BIOS must program D31:F2:92h[3:0] to 0Fh and
D31:F5:92h[1:0] to 3 for a SKU with six ports to enable all ports. If no drive is present,
the system BIOS may then disable the ports.
In AHCI and RAID modes, the system BIOS must program D31:F2:92h[5:0] to 3Fh
for six ports and 0Fh for four ports. In AHCI-enabled systems, the PCS register must
always be set this way. The status of the port is controlled through AHCI memory
space.
If Staggered Spin-Up support is desired on the platform due to system power
load concerns, the BIOS should enable one port at a time, then poll the Port Present
bit of the PCS register and the Drive Ready bit of the task file status register before
enabling the next port.
The ATA/ATAPI-7 specification recommends a 31-second timeout before assum-
ing a device is not functioning properly. Intel recommends implementing a ten-sec-
ond timeout and if the device fails to respond within the first ten seconds, then the
system BIOS should reinitiate the detection sequence. If the device fails to respond
within an additional ten seconds, the system BIOS should reinitiate the detection se-
quence again. If the device fails to respond within ten more seconds, the system BIOS
can assume the device is not functioning properly.
It is recommended that software or system BIOS clear Serial ATA Error Register
(PxSERR) after port reset happens by writing 1 to each implemented bit location.
When the SATA controller is configured to operate in RAID or AHCI mode, the system
BIOS must initialize the following memory-mapped AHCI registers specified by ABAR,
D31:F2:24h:
– CAP register, Host Capabilities register (ABAR + 00h).
o Set SSS (ABAR + 00h[27]) to indicate that the SATA controller supports stag-
gered spin-up on its ports, for use in balancing power spikes.
– Clear ABAR + A0h bits [2] and [0] to 0b
– Clear ABAR + 24h bit [1] to 0b
– PxCMD register (ABAR + 118h, 198h, 218h, 298h, 318h, 398h).
After the BIOS issues the initial write to the AHCI Ports Implemented (PI) register,
ABAR + 0Ch (after any PLTRST#), the BIOS is required to issue two reads to the AHCI
Ports Implemented (PI) register.
Some of the bits in these registers are platform specific and must be programmed
in accordance with the requirements of the platform. The details regarding how these
registers must be programmed can be found in the Serial ATA Advanced Host Con-
troller Interface (AHCI) specification.
Some of the bits in these registers are implemented as read/write-once (R/WO). It
is a requirement that the system BIOS programs each bit at least once, even if the
default setting of the bit is the desired value (that is, system BIOS must write 0 to a
bit, even if the hardware reset value is 0). Doing so will ensure that the bit is un-
changeable by software that is not system BIOS. Please refer to the PCH EDS to deter-
mine which bits are R/WO.
Registers containing multiple R/WO bits must be programmed in a single atomic
write. Failure to do so will result in nonprogrammable bits.
When the SATA controller is initialized in RAID mode, the system BIOS needs to ini-
tialize the SATA controller and attached SATA devices as stated in the previous sec-
tions, with the following exceptions:
– The system BIOS does not need to initialize DMA mode for hard disk drives
(HDDs) discovered behind the SATA controller when in RAID mode. The RAID
option ROM will provide the initialization.
– SATA HDDs discovered by the system BIOS behind the SATA controller in RAID
mode must not be added to the hard drive count at 40h:75h. The RAID option
ROM will enumerate those drives and update the 40h:75h value accordingly. Up-
dating the drive count at 40h:75h by the BIOS will make it appear that more drives
are attached to the system than are actually available.
– System BIOS must not install INT 13h support for the SATA HDD devices discov-
ered, nor may it treat such devices as BAIDs (BIOS-aware IPL devices), as the
RAID option ROM implements the necessary INT 13h support for them.
– ATAPI devices attached to the SATA controller must be under the full control of
the system BIOS and treated as BAIDs, just as they are in non-RAID mode.
– The system BIOS must load the RAID option ROM when the controller’s SCC
(D31:F2:0Ah) returns 04h and the VendorID/DeviceID match those of the PCH
RAID SKU (refer to the PCH EDS or EDS specification update for DID/VID information).
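The 40h:75h byte mentioned above is the BIOS data area hard-drive count. A small C sketch, operating on a stand-in buffer rather than real low memory (the helper name is invented), shows the update the RAID option ROM performs for each drive it enumerates:

```c
#include <assert.h>
#include <stdint.h>

/* The hard-drive count byte sits in the BIOS data area at 40h:75h,
 * i.e. linear address 0x475. Here `lowmem` is a stand-in buffer for
 * the first megabyte of physical memory, and the helper name is
 * invented. In RAID mode the system BIOS leaves the byte alone and
 * the RAID option ROM increments it per enumerated drive. */
#define BDA_HDD_COUNT 0x475u

static void orom_register_drive(uint8_t *lowmem)
{
    lowmem[BDA_HDD_COUNT]++;    /* one more INT 13h hard drive */
}
```

If both the BIOS and the option ROM bumped this byte, INT 13h callers would see phantom drives, which is why the responsibility belongs to the option ROM alone.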
Initialization
RAID mode initialization occurs as follows:
– System BIOS configures the SATA controller for RAID mode, initializes IO BARs
and ABAR, and assigns an IRQ to the controller. System BIOS optionally sets up
an ISR for the SATA controller and enables interrupts for the SATA controller.
– System BIOS loads the RAID option ROM (OROM).
Int13h Drive Access to an ATA Device. System BIOS control flow: if the system BIOS
uses the interrupt mechanism, then the system BIOS ISR gets control.
– If this is an unexpected interrupt, the system BIOS assumes this is a RAID request.
– The system BIOS clears the Interrupt status in the SATA/RAID controller.
– The system BIOS does not mask the IRQ in the PIC.
– The system BIOS sets byte 40h:8Eh to a nonzero byte value.
– The system BIOS issues an EOI and IRET.
If the system BIOS does not use the interrupt mechanism, then the system BIOS does
not receive this INT13h request; therefore, it does nothing.
Int13h Drive Access to a SATA ATAPI Device. System BIOS control flow: system
BIOS can choose any method to access the ATAPI drive (interrupt or polling).
– System BIOS completes the INT13h service request.
– Contact your Intel Representative to get the RAID option ROM.
System BIOS must execute several other part-specific steps as part of system BIOS
initialization of the SATA controller on both cold boot (G3/S5) and S3/S4 resume
path. The bit settings are required for basic operation. Please refer to the PCH
datasheet for the SATA initialization settings and the actual register indexes/values
to be programmed.
If an external SATA port is implemented in the system, there are additional steps that
need to be followed.
– Follow the steps in 13.1.6 for additional programming of the external SATA port.
– Enable the port through the corresponding bits (example: D31:F2:92[5:0]).
– Put the port into Listen Mode (refer to the AHCI specification for Listen Mode
information), which achieves similar power savings as when the port is in the
SATA Slumber power state.
To ensure a satisfactory user experience and to provide the RAID option ROM imple-
mentation with a “well-known” framework under which it can operate, the system
BIOS and option ROM need to be implemented in accordance with the following spec-
ifications:
Advanced Configuration and Power Interface (ACPI)
ACPI Tables
There are ACPI tables to describe every aspect of the system hardware and its pre-
ferred capabilities to an ACPI-aware OS. The ACPI tables that matter are:
– Root System Description Pointer (RSDP). The main ACPI structure that points to
the other root tables. The RSDP is located either in the E0000h–FFFFFh region in
legacy BIOS or elsewhere as specified in the UEFI system table.
– System Description Table Header. Common structure that is at the top of every
table, except FACS.
– Root System Description Table (RSDT). A 32-bit table that is becoming obsolete and
may no longer be used in modern systems. It still can be used to point to many of
the other tables listed below, but the system may lack certain feature compatibility.
Older ACPI-aware operating systems may require this for functionality.
– Extended System Description Table (XSDT). Replaces RSDT and supports both
32/64-bit systems. It points to all other tables (see below).
– Fixed ACPI Description Table (FADT). Provides fixed addresses for key ACPI hard-
ware registers, including the GPE block, PM block, and ACPI timer. It also provides
the SMI command port address (and the value written to it) used to enable ACPI
mode, as well as the port address and value used to reset the system.
The FADT was previously known as FACP and is pointed to by the XSDT, and in turn
the FADT points to the DSDT and FACS tables.
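As a sketch of how an OS loader or test tool walks this chain, the following C scans a memory range for the "RSD PTR " signature on a 16-byte boundary and validates the ACPI 1.0 checksum. The buffer stands in for the E0000h–FFFFFh region, and the structure covers only the ACPI 1.0 fields:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal ACPI 1.0 RSDP layout (the first 20 bytes, which the
 * checksum covers). ACPI 2.0+ appends XSDT fields after these. */
typedef struct {
    char     Signature[8];   /* "RSD PTR " */
    uint8_t  Checksum;       /* bytes 0..19 must sum to 0 (mod 256) */
    char     OemId[6];
    uint8_t  Revision;
    uint32_t RsdtAddress;
} __attribute__((packed)) rsdp_t;

/* Scan a memory region for the RSDP. On legacy BIOS the region is
 * E0000h-FFFFFh; here we scan a caller-supplied buffer standing in
 * for that range. The signature sits on a 16-byte boundary. */
static const rsdp_t *find_rsdp(const uint8_t *mem, size_t len)
{
    for (size_t off = 0; off + sizeof(rsdp_t) <= len; off += 16) {
        const rsdp_t *p = (const rsdp_t *)(mem + off);
        if (memcmp(p->Signature, "RSD PTR ", 8) != 0)
            continue;
        uint8_t sum = 0;
        for (size_t i = 0; i < 20; i++)
            sum = (uint8_t)(sum + mem[off + i]);
        if (sum == 0)
            return p;        /* signature and checksum both good */
    }
    return NULL;
}
```

On a UEFI system the scan is unnecessary; the RSDP is published in the UEFI configuration table instead.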
There are many server and NUMA system tables that provide the operating system
with enhanced details of the capabilities of dual and multisocketed systems. These
tables include the System Locality Distance Information Table (SLIT) and the System
Resource Affinity Table (SRAT).
There are some newer tables defined in the ACPI 5.0 specification that do not deal
with Intel architecture (such as GTDT) and can be excluded from the firmware image.
Most of the ACPI tables do not change for the life of the platform and can be con-
sidered static. There may be some tables that are ported from board to board (like PCI
IRQ routing _PRT). Some tables (such as SSDT) can be dynamically created, loaded,
or enabled, depending on the hardware discovered during the system boot and de-
pending on the operating system and/or the usage model requirements. For dynamic
SSDTs, “Load” and “Load Table” can be used to update tables during OS load or even
runtime before being interpreted by the OS. This allows for hot plugging of docking
stations, socketed processors, proprietary daughter cards, and so on, where modular
add-in tables make a single firmware image scalable. This SSDT table technique also
can be used for dynamically adding instrumentation for debug and performance
measurements.
ACPI Namespace
Summary
This section covered basic programming for PCI, SATA, USB, and ACPI standards as
they apply to the BIOS. Please check on the latest standards through the standards
bodies as they are constantly changing. More complete and lower level programming
details are available in the respective industry specifications. More complete details
about certain characteristics of IO BIOS programming also can be obtained through
BIOS specifications, which may be available online.
In the BIOS, standard enumeration algorithms can eliminate much extra work
between revisions of boards or systems. These standards-based routines can also be
tuned per board or per silicon component, as we will see later, for best fit and performance.
Chapter 6
System Firmware Debug Techniques
… Bloody instructions, which, being taught, return to plague the inventor.
Hardware Capabilities
Typical Intel architecture platforms provide a common set of hardware features that
may be employed by anybody debugging firmware. Without these hardware features,
low-level debug of Intel architecture platforms would be virtually impossible.
No two hardware features are equal, and they all serve a slightly different pur-
pose. The reader should be aware of what to expect from these hardware features.
DOI 10.1515/9781501506819-006
POST Codes
One of the oldest and most rudimentary ways of telling a user what’s happening on the
target platform is to give status visually through on-board components. Many Intel
architecture platforms incorporate seven-segment displays that show hexadecimal
status codes sent by the firmware. These displays are typically driven by an agent that
captures I/O writes to ports 0x80–0x83 and shows the values on the displays.
The amount of information that can be conveyed through hexadecimal number
displays is rather limited. The most prevalent use of these codes is to indicate “I got
here” to the user. A system crash or hang can sometimes be debugged by using the
last POST code as an indication of “this is the last known good point” and under-
standing what is being done immediately after that point. If you have the capability
of run control over a target, it is also possible to capture a sequence of POST codes to
illustrate the logic flow of the firmware, which can allow for POST codes to be used
for more than one purpose.
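Emitting a POST code is a single byte write to port 0x80. The sketch below switches between real hardware and a host-side simulation that keeps a small ring of recent codes, which supports the "sequence of POST codes" technique described above; the macro names, ring depth, and code values are invented:

```c
#include <assert.h>
#include <stdint.h>

/* On real hardware this is one I/O write: outb(code, 0x80). For a
 * host-side sketch we capture the codes in a small ring so the flow
 * of checkpoints can be inspected after a hang is reproduced under
 * simulation. Port number, depth, and codes are illustrative only. */
#define POST_PORT       0x80
#define POST_LOG_DEPTH  8

static uint8_t  post_log[POST_LOG_DEPTH];
static unsigned post_head;

static void post_code(uint8_t code)
{
#ifdef REAL_HARDWARE
    __asm__ volatile ("outb %0, %1" :: "a"(code), "Nd"(POST_PORT));
#else
    post_log[post_head++ % POST_LOG_DEPTH] = code;  /* simulated 0x80 */
#endif
}

/* "Last known good point": the most recent code written. */
static uint8_t last_post_code(void)
{
    return post_log[(post_head - 1) % POST_LOG_DEPTH];
}
```

After a hang, `last_post_code()` plays the role of the value frozen on the seven-segment display.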
BIOS companies typically have a list of standard architectural POST codes com-
mon across all platforms. This list is usually documented fairly extensively for cus-
tomer consumption. If it’s not, what good are the POST codes unless you have the
entire firmware source code base?
Sometimes used in addition to POST codes, audio beep codes are used to give the user
an auditory clue of the state of the target, in applications where visual indicators are
not available (such as a motherboard in a closed PC case).
As these are more applicable to consumers of end-products and are not really
valuable to firmware engineers, they won’t be discussed further.
Serial Port
UARTs are still a prized component of hardware designs. While seven-segment
displays allow POST codes to be displayed to the user in a cost-effective manner,
UARTs providing text-driven output give an infinite number of degrees of freedom in
what can be communicated.
Rather than simple hexadecimal data, full strings can be output to textually de-
scribe what’s going on. Debug information can be displayed, if desired. Steps in com-
plex calibration sequences can be shown to assist in debug of hardware or the firm-
ware algorithms themselves.
Of course, UARTs require that an external cable and host PC be connected in or-
der to run a terminal program to view the serial output. But that’s not too much to ask
for the flexibility a UART gives you, including in some cases JTAG level access.
Interactive Shells
If you have a UART available, one obvious extension of the console allowed with a
UART is the capability of bidirectional traffic (that is, debug shell). This shell can be
used for system diagnostics and probing during development. Many commercially
available firmware stacks support interactive debug shells.
A firmware developer’s most prized tool, the ITP (in-target probe) is the most useful
tool in debugging firmware. An ITP is a piece of hardware connected to both a host and a target system,
allowing the host system to have execution control over the target from the beginning
of power-on through the life of the target boot.
Basically, just about anything you can do by executing firmware you can also do with
an ITP. In fact, ITP hardware and ITP scripts are crucial to bringing new processors
and chipsets online in a rapid fashion. You can’t find an Intel firmware engineer who
doesn’t have at least one ITP in his or her possession.
There are several Intel-compatible ITP devices and associated software suites on
the market. Other architectures/vendors have in-circuit emulators (ICE) that do
similar jobs. Regardless of the vendor, make sure you have one.
Console Input/Output
You laugh at your friends who debug with printf, right? That’s not so funny in the
firmware world. If you don’t have access to hardware that allows run control, you
need to have as much information as possible at your fingertips. Debug messages can
serve as the base of that information.
Abstraction
Now, also consider instead writing an access method to perform that transaction to
be used as follows:
This access method could be easily modified to not only perform the MMIO write, but
display debug information on a console:
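The original code figures are not reproduced in this text. A hedged sketch of what such an access method might look like in C follows; the function names are placeholders, and the console output here is plain printf rather than a firmware console:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Plain access method: every MMIO write in the firmware funnels
 * through one function instead of a raw pointer dereference. */
static inline void mmio_write32(volatile uint32_t *addr, uint32_t value)
{
    *addr = value;
}

/* Instrumented variant: the same transaction plus a console trace.
 * In a simulation build the hardware access could be compiled out,
 * leaving only the trace output. */
static int mmio_trace = 1;

static void mmio_write32_dbg(volatile uint32_t *addr, uint32_t value)
{
    if (mmio_trace)
        printf("MMIO WR32 %p <- 0x%08x\n", (void *)addr, (unsigned)value);
    *addr = value;
}
```

Because every access goes through one choke point, the trace (or a host-side stub) can be switched on without touching any caller.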
This is a rather simple example, but it can apply to any kind of hardware access. This
instrumentation allows enhanced visibility into the operation of the firmware, as well
as the capability to simulate algorithms on other platforms (that is, have the console
output in the API, but not the hardware access). Personally, I do this all the time.
Disable Optimization
One of the most forgotten debug techniques is to disable optimization in the compiler.
Why is this so important? Whether you are looking at high-level source code in an ITP
or straight assembly, what you see in the code window may not be what you expect.
Take the following example, for instance.
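The example from the original text is not reproduced here; the following stand-in (an invented function) shows the kind of code where optimization confuses debugging:

```c
#include <assert.h>
#include <stdint.h>

/* With optimization enabled, `status` and `i` may live only in
 * registers or be folded away entirely, so a source-level breakpoint
 * here lands somewhere unexpected and the debugger shows the variable
 * as <optimized out>. Built with -O0 (or the toolchain equivalent),
 * source and machine code stay in step. */
static uint32_t poll_ready(volatile const uint32_t *reg, int max_tries)
{
    uint32_t status = 0;        /* candidate for elimination at -O2 */
    for (int i = 0; i < max_tries; i++) {
        status = *reg;          /* the volatile read itself survives */
        if (status & 1u)
            break;
    }
    return status;
}
```

Disabling optimization for the module under debug (and re-enabling it afterward) is usually cheaper than reverse-engineering what the optimizer did.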
4. The power sequencing or input clocking may not be correct, and the silicon may
be in an indeterminate state. If this is the case, there is some other programmable
firmware that will likely need to be updated to take this into account.
5. Try the code on a known-good system. If the parts are interchangeable, you can
modularly replace parts until you find one that isn’t working.
6. Check your schematics and make sure all the parts of the subsystem giving you
fits are correct. A “blue wire” may be able to help.
7. If this is a brand new motherboard design, have a hardware person check the
Intel design guide and make sure there is nothing obviously wrong. If shortcuts
were made, then suspect the motherboard will need to be respun.
8. Check for critical signal integrity with someone who knows what an “eye dia-
gram” is. This would require a motherboard change to fix the routing/layout.
9. If all of the above has failed, it could be a silicon issue, as on a brand-new prepro-
duction A0 stepping component; get a silicon design engineer to assist the debug
at the register level with the specifications open as you trace through the code line
by line, or call your vendor and report a potential issue.
Most of the time, the BIOS will be blamed for bad hardware. Even if there is a BIOS
workaround that can fix the hardware stability problem, it could still be looked at as
a BIOS problem…this comes with the territory.
For option ROMs, there are a signature and entry/exit points. You don’t get much else
unless the vendor supplies the code to you under NDA. During the execution, there
Debugging a library function is like the black box option ROM above. The key differ-
ence is, with a library, there should be a well-defined API that provides you with more
data than an option ROM. With that data, you may be able to create a temporary work-
around for that initialization sequence to ensure that the issue is truly in the library.
You can then either bring that issue back to the vendor, or you can write your own
code and replace the library. It may be the case that the library is in a binary format
for production, but that the vendor would be willing to supply a debug version of the
source to help you debug the scenario.
It should be noted, however, that for an industry standards library, the code
should have been tested sufficiently at the vendor such that any issues being found
now are a result of a change from the standard specification or something unique to
the hardware that the library is trying to initialize. Before contacting the vendor, it
would be a good idea to run through the “unstable hardware” checks to make sure
nothing is wrong with the hardware itself.
Legacy operating systems dating from the beginning of the original IBM PC to current
day utilize real mode interrupts to request information from the firmware or tell the
firmware to do something. These real mode interrupts serve a wide range of functions
including:
– Video Services (INT 0x10)
– System Services (INT 0x15)
– I/O Services (various)
As with just about everything in life, there are both good and bad things about real
mode interrupts. Let’s look at the good things first:
– Once a real mode interrupt is defined, its meaning rarely changes. This means
that once firmware supports a real mode interrupt, it rarely has to change.
– The state of the system is well known at the invocation of a real mode interrupt.
And the bad:
– No two operating systems utilize the exact same set of real mode interrupts (if
they use them at all), and there is no documentation on which services are required for
any given operating system.
So, how are you supposed to know what to implement if you were to design your firm-
ware from scratch? Debug.
Debugging Methods
Although the number of real mode interrupt debug methods is vast, this section out-
lines a few hints that may help the reader in this complex area.
– IVT Hardware Breakpoints. To determine when a real mode interrupt is called,
use hardware breakpoints on the Interrupt Vector Table entries for the real mode
interrupts in question. Provided the operating system has not overridden the de-
bug registers, you’ll get a hit every time a specific interrupt is called.
– Common Real Mode Interrupt Handler. If you have the flexibility to do so, have
all the real mode interrupts utilize one common handler, which takes an interrupt
vector. This allows one hardware breakpoint to be used to trace all real mode in-
terrupt calls.
– Console Output. If there’s a console available that’s accessible in real mode, use
it. Your best friend printf may actually come to the rescue a few times.
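For the IVT technique, the address to watch is easy to compute: each real mode vector occupies four bytes at linear address 4 * vector, offset word first, then segment word. A small C sketch, using a supplied buffer in place of physical memory:

```c
#include <assert.h>
#include <stdint.h>

/* The real-mode IVT lives at linear address 0: four bytes per vector,
 * offset first, then segment. To watch INT 13h, point a hardware
 * breakpoint (DR0-DR3) at linear 4 * 0x13. Here `ivt` is a supplied
 * buffer standing in for physical memory. */
static uint32_t ivt_entry_addr(uint8_t vector)
{
    return 4u * vector;                  /* linear address of the entry */
}

static uint32_t ivt_handler_linear(const uint8_t *ivt, uint8_t vector)
{
    uint32_t e   = ivt_entry_addr(vector);
    uint16_t off = (uint16_t)(ivt[e]     | (ivt[e + 1] << 8));
    uint16_t seg = (uint16_t)(ivt[e + 2] | (ivt[e + 3] << 8));
    return (uint32_t)seg * 16u + off;    /* seg:off -> linear address */
}
```

A data breakpoint on `ivt_entry_addr(0x13)` fires when the vector is read (the interrupt dispatched) or written (the handler hooked), which covers both debugging cases above.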
Regardless of the situation, SMM (System Management Mode) is one of the most
difficult parts of firmware code to debug.
Debugging Methods
As the intent of SMM is to be as invisible as possible to the system during runtime,
hardware should not have its state modified by SMM code. This means that the con-
cept of a console is probably off limits for debug messages.
Industry Specifications
Pitfalls
Just because you have a good understanding of a hardware debugger and some
knowledge into industry specifications that are supported doesn’t mean that debug-
ging the platform after handover to the OS will be a breeze. There are several pitfalls
that are very common.
Since there is overlap in the communicated data, you cannot assume that all or any
of these are used by the operating system. If you’re implementing only one of those
methods, your interrupt information may not be consumed by the operating system.
If you’re implementing all of them, you don’t always know which method(s) the op-
erating system is consuming. Therefore, in supporting all these methods, you need to
ensure that the information communicated by all of them is consistent. Also, make
no assumptions about which method is being used.
Disappearing Breakpoints
Hardware debuggers utilize the debug registers (DR0–DR7) in the processor to control
all hardware breakpoints. Since these are public processor registers, anybody can mod-
ify them. That includes a boot loader or an operating system. It may seem logical that
you should be able to set hardware breakpoints on memory accesses to debug boot
loader and operating system use of firmware tables. However, if a boot loader or op-
erating system chooses to overwrite the DR0–DR7 registers while the target is running,
the breakpoints will simply disappear. Therefore, care must be taken when at-
tempting to debug firmware table usage.
Summary
If you understand the hardware capabilities at your disposal, have the appropriate
tools, and understand the applicable specifications, you should be able to apply the
techniques described in this chapter to make your way through debugging your firm-
ware from the reset vector through the entire boot process.
Chapter 7
Shells and Native Applications
I was like a boy playing on the sea-shore, and diverting myself now and then finding a
smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undis-
covered before me.
—Sir Isaac Newton
A shell is a very convenient place to run applications, and it allows both developers
and users great access to all the hardware in a platform, and all the shells/applica-
tions they are sitting upon. Many shells have significantly lower overhead to them
than a modern operating system. These two features combine to make them an excel-
lent place to develop and test new hardware and low-level drivers for that hardware,
as well as a place to run diagnostics.
The common features that most shells share are the ability to run external exe-
cutable images, the ability to be automated, file system access, environment access,
and system configuration.
In Figure 7.1 you can see how a shell command is translated from a human read-
able string down to a hardware command in the EFI environment.
The running of external shell applications is critical since that is the method most
used to add a new feature into the shell. This new feature can be anything from the
simple printing of a string to the console for user input all the way up to a complex
program that runs the entire platform. This is the only way to add a completely new
feature to the shell without editing the shell’s own code. This is also commonly used
to perform proprietary tasks, such as manufacturing configuration, where the execut-
able is not found on the system afterwards.
Automation is accomplished through script files. These are a set of sequential
commands that can be started automatically upon entering the shell so that they hap-
pen whenever the shell is launched. Some shells allow a user to abort the running of
these automatically started scripts.
DOI 10.1515/9781501506819-007
These two sets of extension abilities also can be combined. It is possible for a
script file to call some of the extended executables, some commands internal to the
shell, and even a different script file.
The features that make the UEFI Shell 2.0 unique are the same features that make it
especially useful for firmware. This means specifically that the shell can be
heavily configured such that the platform can have a shell with a reduced feature set
and a similarly reduced footprint size. At least 64 combinations of size and feature set
are available in the UEFI Shell, with more available via extensions. This allows for the
UEFI Shell to vary in size from ~300 to almost 1000 KB.
The other effect that this reduction in the size of the feature set can have is secu-
rity. If the end user of the platform is not expected to use the shell, it is possible to
restrict the features available to eliminate some risk that they can harm the system,
but still leave enough features that the limited shell could be used to initiate a plat-
form debug session. It is even possible to have a limited built-in-shell (with a corre-
spondingly small binary image footprint) launch a large and feature-rich shell from a
peripheral storage device, such as USB, DVD, or even over the network.
When in early testing of a new platform, a common use of the shell is as a boot
target, normally before the hardware can boot to a full modern operating system. This
allows for lots of hardware testing via the internal commands and custom-designed
shell applications. Since custom applications have access to the full system, they can
easily test memory, run new assembly instructions, test a new peripheral media de-
vice, or simply examine the contents of the ACPI table.
Since the EFI and UEFI shells have built-in commands to examine memory, ex-
amine drive contents, verify device configuration, use the network, and output logs
of what was found, much early testing can be accomplished in this environment.
When this is combined with the ability of a shell to run itself with minimal features
from the underlying system, it is a clear advantage to use the shell to test and debug
new hardware of unknown quality.
The logical continuation of this is that in a system where the hardware is expected
to be of high quality, but the side effect of the usage model dictates that testing be
done still (such as a manufacturing line), it makes a lot of sense to first boot to the
shell to do some level of testing and then “continue” the boot onto the operating sys-
tem. This is easily done from any EFI or UEFI Shell since in both of these cases the
operating system loader is just another UEFI application that can be launched from
the shell. To do this automatically, you would configure the startup.nsh script file for
the UEFI Shell and have that script file do a series of tests and, if they all pass, then
launch the operating system loader application. There are features that directly sup-
port this type of behavior with the ability to get and set environment variables and
use conditional expressions right inside the script files.
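A minimal startup.nsh along those lines might look like the following sketch. The test and loader names are placeholders, and the exact conditional syntax should be checked against the UEFI Shell 2.0 specification:

```
# startup.nsh - sketch only; memtest.efi and osloader.efi are
# placeholder names for a manufacturing test and an OS loader
fs0:
memtest.efi
if %lasterror% == 0 then
    fs0:\efi\boot\osloader.efi
endif
```

If the test fails, the script falls through without launching the loader, leaving the operator at the shell prompt to investigate.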
Pre-OS Shells
Both the EFI Shell and the UEFI Shell are UEFI applications. This means that they will
run on any current UEFI platform (the EFI Shell will run on older implementations of
the UEFI and EFI specifications). These applications have very low requirements from
the system. They both need memory, SimpleTextInput, SimpleTextOutput, Device-
PathToText, and UnicodeCollation, and will use lots more (for example, Simple-
FileSystem or BlockIO) if they are present. See the documentation for each UEFI ap-
plication to verify exactly what protocols are required for that application.
The EFI Shell is a nonstandard shell. That means there is no predefined behavior
for a given command and no predefined set of commands that must be
present. In practice, there is a de facto standard for both, due to the preva-
lence of a single implementation of the EFI Shell dominating the marketplace. The
EFI Shell provides all the standard features already discussed above, but unlike the
UEFI Shell there are only two different versions of the EFI Shell available for customizing
the size requirement. This means that fine-tuning of the binary size cannot occur.
The UEFI Shell 2.0 is the successor to the EFI Shell. This is a standards-based
version of a UEFI-compliant shell. It has all the commands the EFI Shell had, but
many of these commands have been extended and enhanced and some new com-
mands have also been added to the UEFI Shell. This means that all script files that
were written for the EFI Shell will work on the UEFI Shell, but this is not always the
case in reverse. It is possible (and recommended) to write the scripts using the new
features if the target shell will always be the UEFI Shell, as they greatly simplify and
enhance the capabilities of the script files.
Two changes in the UEFI Shell are very significant for script files. The first is the
concept of levels and profiles for the shell. These are sets of commands that can be
queried for availability before they are called. This is important because if you can’t
call the shell command GetMtc to get the monotonic tic count or DrvDiag to initiate a
driver diagnostic then you may want to verify that it is present in the shell before you
call it. With the old EFI Shells you couldn’t test for this condition and had to rely on
the platform to have the correct version of the shell, but in the newer UEFI Shells, you
can query for the presence of the Driver1 profile and thereby know that the DrvDiag
command is required to be present. The second change directly affecting script files
is the concept of Standard Format Output, or -sfo. This is a feature, and a parameter,
present on some shell commands (like the ls command) that produce a lot of columnar
output. By specifying -sfo, the output is emitted in a specific format defined in the
specification, so output from different implementations can be parsed the same way. This
output is comma delimited and can be redirected to a file and then easily parsed with
the parse shell command, which did not exist in the EFI Shell. These two changes
mean that a script file written only for the UEFI Shell can have a
lot more logic in it and can use multiple methods for getting information, falling back
to a second shell command if the first is not present, while the same script file,
written in a cross-version manner, would have to do a lot more work.
The file system in the UEFI environment is accessed via DevicePaths. These are
not especially easy or fun for a human to read and not especially easy to write. For
example:
PciRoot(0x0)/Pci(0x1D,0x3)/USB(0x1,0x0)
is the DevicePath for a USB hard disk in the system. When the shell runs, it creates a
human readable map name for this called FS0: and a consistent map name (stays the
same across multiple platforms) called f17b0:. These are much easier to use. For ex-
ample, echo hello world > fs0:\hi.txt makes a lot of sense and is easily understood by
almost any user. However, the UEFI application must interpret the full device path
from its command line and try to find the file system represented on that device path
and find the required file at the end. This is a compound problem since there are usu-
ally multiple file systems and the effort to decode each one will repeat all the work
each time the decoding must take place.
The shell environment features things like path, which is a list of places to search
for a specified file by default, a series of aliases so that ls and dir are the same com-
mand, and environment variables so that %foo% can be automatically replaced by a
predetermined (and configurable) value. There are also useful functions for finding
files using wild card characters (? and *), finding the names of devices, and getting
additional information on files. Configurable elements of these environment features
can be changed and configured on the command line of the shell.
A complicating factor is that the method used to access these features from the EFI
Shell differs from the method used from the UEFI Shell 2.0. This means you may have
to do extra work to support both types of UEFI Shells (or use the UDK 2010 Shell Library).
EFI_STATUS EFIAPI
UefiMain (IN EFI_HANDLE ImageHandle, IN EFI_SYSTEM_TABLE *SystemTable)
The same done as a UEFI Shell application would replace the call to
OutputString with a call to ShellPrintEx(-1, -1, L"Hello World").
As you can see the application in its simplest form is actually quite similar in the
UEFI application and in the UEFI Shell application forms. The difference becomes
more apparent as we look at the code used to open a file.
The UEFI Shell application opening a file named file.txt:
The UEFI application opening a file named file.txt, where for simplicity we are not even
trying to search a path for the file (pseudocode):
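The code figures themselves are not reproduced in this text; the contrast they draw can be sketched in pseudocode. ShellOpenFileByName is the real UDK shell library call; the plain-UEFI side is abbreviated:

```
/* UEFI Shell application: one shell-library call does the work. */
Status = ShellOpenFileByName(L"file.txt", &FileHandle,
                             EFI_FILE_MODE_READ, 0);

/* Plain UEFI application: walk the protocol database by hand. */
LocateHandleBuffer(ByProtocol, &gEfiSimpleFileSystemProtocolGuid,
                   NULL, &Count, &Handles);
for each handle in Handles:
    HandleProtocol(handle, &gEfiSimpleFileSystemProtocolGuid, &SimpleFs);
    SimpleFs->OpenVolume(SimpleFs, &Root);
    Status = Root->Open(Root, &FileHandle, L"file.txt",
                        EFI_FILE_MODE_READ, 0);
    if not EFI_ERROR(Status): break;
```

The shell version hides the handle walk, the volume open, and the path search behind one library call, which is precisely the convenience the surrounding text describes.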
The point here is that the shell does not do anything that any other application (or
driver for that matter) cannot do. It just makes the repeating of these tasks very easy
to do. Don’t think that you can’t do what the shell does. Think how much you don’t
need to do that the shell already is doing for you.
EFI/UEFI Script File
A script file is always a text file, either Unicode or ASCII, and because of this it can be
changed quite easily with no special programs. A small sample shell script is:
echo "1 - START"
GOTO label1
echo "1 - NO"
FOR %a IN a b c
echo "1 - NO"
:label1
echo "1 - NO"
ENDFOR
echo "1 - END"
This script is a simple functional test of the goto script-only command. The significant
limitation of script files is that they cannot do anything that is not already done in
either a shell command or an existing shell application. This means that script files
are good for repetitive tasks, but they have their limitations. For example, outputting
the current memory information to a log file and then comparing it with a known
good version of the file is an excellent task for completion by a script file. On the other
hand, opening a network connection and sending some information to a remote plat-
form is not already encapsulated into a shell command and, assuming there is no
special application, would be better done with an application and not a script.
The power behind applications is that they can open and interact with any proto-
col and any handle that is present in the system. An application has all of the same
privileges and rights as a UEFI driver—this means that the application can do pretty
much anything. A script, on the other hand, cannot open any protocols; it can interact
only with shell commands and applications.
The power behind script files is that they can do large amounts of repetitive work
with less overhead than required for an application and that they can fully exercise
shell commands and applications with much less development overhead than an ap-
plication. Script files require no compiler or any development tools at all. They can
even be edited via the edit command in the shell and then rerun without using any
other software.
For example, the following loop will do some desired action 300 times in a row:
echo "2 - START"
FOR %a RUN (1 300)
echo "some desired action"
ENDFOR
echo "2 - END"
This same script could be made more generic by modifying it to use the first parameter
to the script itself as the termination condition. This means that the script will run
some arbitrary number of times that is controllable at the time the script is launched.
echo "2 - START"
FOR %a RUN (1 %1)
echo "some desired action"
ENDFOR
echo "2 - END"
Note: Per the UEFI Shell 2.0 specification, the distribution is always a minimum of two files.
Adding a shell application is almost as much overhead in terms of coding work. Although
the work to build the library is reduced, more work is required to distribute the
separate files that are not part of the shell binary itself.
By default, the shell adds the root of each drive, as well as the \efi\tools and
\efi\boot directories, to its search path. This means that any tool that wants to appear to be an internal
command should reside in one of those three locations.
Note: For automatic help system integration, there should be a help file with the same name as the
application file, with its contents in .man file format, just like the shell internal commands
(except that their information is stored via HII).
Once an internal command (in a profile) or an external application has started, both
have the same privileges and almost the same access. A few actions can only be accessed
via the UefiShellCommandLib. These actions are centered on internal shell
functionality that a few commands need to manipulate:
– Shell script interaction
– Shell alias interaction
– Shell map interaction
Note: Linking a shell application with the library will work, but by definition the library functions
completely only when also linked to the shell core, that is, only for internal shell commands.
The help system built into the UEFI Shell 2.0 core automatically parses a file when
someone uses the help command (for example, help <Your_App_Name>). This file
must be in the same location as the application file. The Unicode text file must have
a file format similar to this:
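The original example file is a minimal sketch in the manpage-style macro format defined by the UEFI Shell 2.0 specification; the command name MyApp and all of the text below are placeholders:

```
.TH MyApp 0 "Displays a friendly greeting."
.SH NAME
Displays a friendly greeting.
.SH SYNOPSIS
MyApp [-v]
.SH DESCRIPTION
NOTES:
  1. MyApp prints a greeting to standard output.
.SH RETURNVALUES
SHELL_SUCCESS  The action was completed as requested.
```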
Once you have this file in place, there is little distinction to a normal user of the shell
between your command and an internal command. There are two differences the user
can notice:
– Internal commands take precedence over applications with the same name. Spec-
ify the file name with the .efi extension to override this behavior. This can also
happen with script files; use .nsh to override in this case.
– Internal commands are listed via help command. External applications are not.
Remote Control of the UEFI Shell
Remote control of a system running a shell requires a second
computer to be the host for the testing. A very common technique for remotely debugging
a platform involves connecting both a hardware debugger and a serial port
between a host and the platform-under-development (PUD). Then the engineer doing
the debugging remotely (via the modern remote desktop connection of your choice)
controls the host computer and then controls the PUD via the hardware debugger and
the serial connection. The biggest challenge in this scenario is changing the copy of
an application the PUD has access to. I have found that there are KVMs that have a
built-in USB switching capability and using that in conjunction with a standard USB
drive allows for the UEFI application under debug to be updated without anyone hav-
ing to physically change anything.
Obviously, if having the PUD physically right next to you is possible, that may be
faster and easier, although for some systems the power, sharing, or noise require-
ments may make that prohibitive to do.
A tester can utilize StdIn redirection to control the input to the application and use StdOut
redirection, combined with script files, to automatically verify the output.
The shell command memmap can display memory information about the
platform. The actual map is, in this case, not the information that is of interest.
The interesting information is the other part: how many bytes of memory are
allocated for each type (that is, BootServiceData). The memmap command is run (recording
the data) before running an application or loading and unloading a
driver, then run again afterward, and the end results are compared with the first run.
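A sketch of that technique as a UEFI shell script; fs0:, myapp.efi, and the log file names are placeholders, and comp is the shell's file-compare command:

```
echo "recording baseline"
memmap > fs0:\before.log
myapp.efi
memmap > fs0:\after.log
comp fs0:\before.log fs0:\after.log
```

If the application leaked pool or page allocations, the per-type byte counts in the two logs will differ.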
The commands Connect, Disconnect, and Map are required for setup and config-
uration of drivers for devices that are disk drives. Many commands can be used for
write, read, and verify on a disk drive. A simple version would be to echo a text file to
the new drive and then use the Compare command to verify it.
The simplest example is that a user can run the exit command and return control to
the caller (of the shell). This could be done so that the next application in the boot list
is called, to un-nest from one instance of the shell to another, or to return control
directly to the boot manager or setup screen.
The next simplest of these is when the OS Loader application is run. At some point
during the OS Loader application, the ExitBootServices API is called (via the system
table), causing the shell to end indirectly. In this case, all memory in the system not
marked as Runtime is allowed to be overwritten, including the shell. The goal is that
the OS will take over memory management and thus the boot service memory is no
longer required.
The next possible method is that a shell application (or UEFI application) is the
end-goal of the platform. This means there is no operating system at all and that the
platform will forever (until powered off) just run the application. The best example of
this scenario is how some Hewlett-Packard printers use an application to do all the
work of handling print jobs, controlling the queue, monitoring ink, and so on. This
embedded platform has no operating system whatsoever. All it has is an application
that runs forever.
The most complex is somewhat of a hybrid of the previous two types. This is an
application that handles everything like the preceding example, but uses the Exit-
BootServices API as in the second example. This does allow the application to directly
access hardware, but it also requires that this new application have drivers to handle
this interaction. This distinction between this type of application and a true operating
system may be more in the mind of the creator than anything else. It could be easily
used for a hardware test where the boot time drivers need to stop interacting with the
hardware, but where there is no real replacement of the features provided by an op-
erating system. I do not know of any examples of this type of application.
Summary
A shell is a very convenient place to run applications and provides great access to all
of the hardware. New UEFI shell interfaces and scripts provide a convenient, cheap,
and flexible solution for simple diagnostic and manufacturing applications, and
even simple user applications. The UEFI shell basics are downloadable as open
source from www.Tianocore.org and can be quickly implemented on top of a UEFI
firmware solution.
For more in-depth information, please see the book Harnessing the UEFI Shell.
Chapter 8
Loading an Operating System
A fanatic is one who sticks to his guns whether they’re loaded or not.
—Franklin P. Jones
When loading the OS, there are many ways to bootstrap the system and jump into the
OS. For Intel architecture devices, there are two sets of OS interfaces we need to contend
with: EFI and legacy. There are a number of second-stage boot loaders for
each interface from Microsoft and Linux, the major operating system camps. An RTOS and
other proprietary operating systems will have either the EFI flavor, the legacy flavor, or both; or
support neither.
There is a third OS interface, the null option: when there is a blind handoff, there
is no interface at all and at the end of the BIOS, it just grabs a specific address in a
piece of NVRAM, loads it, and jumps to it, and the OS never looks back.
Before we get to the details, let's back up and look into the theory of operation of the
Boot Device Selection (BDS) phase. In other BIOS implementations, it is known as the boot
services. How does one get to the disk we want?
The Bus
From the CPU and chipset, there is likely a bus or two that must be traversed, such as
PCI, AHCI, USB, or SD. We must communicate on that bus in order to know whether
the OS is out there on a device somewhere, and that the next-stage boot loader is res-
ident and ready to be loaded and executed. Fortunately, in the EFI case, there is often
a bus driver that has been constructed. That bus driver would know the protocols for
reads, writes, and so on. This is fundamental regardless of whether you are searching
for bootable partitions, locating and identifying the boot target’s .efi file, or, in the
case of a closed box, just reading a sector and blindly handing off control to it.
DOI 10.1515/9781501506819-008
The Device
Once you can communicate across a particular bus to talk to the device, there may be
multiple boot partitions or devices to choose from. For this a boot priority list is a
requirement, unless you have only one specific target in mind. While you may be able
to hard-code this for a closed box, for recovery cases, it is ideal to have a list of remov-
able media that reflect a higher priority than the default storage. Alternate input
methods may be used to drive the boot flow to a particular device (such as the use of
jumpers or switches on GPIOs). The boot list can be used for giving priority to recov-
ery, validation, or manufacturing OS over the production boot image.
The File System
Which file system is being used? There have been new file systems coming out about
every year since 1964. The two we are going to talk about are FAT and EXT.
Microsoft provides a license for FAT to UEFI developers through the UEFI forum.
Those using an EFI solution should receive that as part of the EDK. FAT stands for
file allocation table, and the format has been around for many years in different bit-wise
fashions: FAT12, FAT16, and FAT32, before NTFS took over. While FAT12 is reserved for floppies,
FAT32 handles up to 2 TB of data.
Linux developed an alternative known as EXT, which is gaining favor in open-
source communities. There are variations of both FAT and EXT; developers have to
look at the requirements to determine which makes sense for them.
Booting via the Legacy OS Interface
Booting a legacy OS (non-EFI) consists of locating the right master boot record (MBR)
on one of a potentially large number of bootable media.
On each disk, an MBR may be sought by looking at LBA sector 0 for the signature
0xAA55 in the last two of the 512 bytes. In the master boot record, there are four 16-byte
partition table entries specifying whether there are any partitions and, if so, which are
active.
The BIOS can choose any of the active primary partitions in the table for the drive
and discover whether any of them are bootable. This bootable partition would then
be listed in the boot target list, which can then be prioritized.
There are 440 bytes of executable code in the winning MBR, which is loaded at
0x7C00. Depending on the next step, that code can load the volume boot record (VBR)
code from the active partition into location 0x7C00 again and execute it. At some point
down the chain a real OS loader is loaded and jumped to, while the processor
is still in 16-bit real mode, and the OS loader starts executing.
The legacy OS loader relies on legacy BIOS services, including but not limited to:
– Int 13h for disk including RAID option ROMs
– Int 16h for PS/2 keyboard
– Int 10h for video output from video BIOS
– BIOS’s SMI handler providing legacy USB keyboard support via port 60/64
emulation.
Depending on the desired features enabled by the boot loader, the OS needs different
tables. The following is a list of those tables:
– Memory Map (INT15h / Function E820h)
– Programmable Interrupt Routing ($PIR)
– Multi-Processor Specification (_MP_)
– Minimal Boot Loader for Intel® Architecture
– Advanced Configuration and Power Interface (ACPI)
ACPI tables are needed only if those features are enabled by the boot loader and re-
quired by the OS. Most modern operating systems, including RTOS, are ACPI aware.
The _MP_ table is needed if there is more than one Intel processing agent (more
than one thread or core) in the system. Details on the _MP_ table may be found in the
Multi-Processor Specification.
The $PIR table and interrupt-based memory map are almost always needed. De-
tails on the $PIR table may be found in the $PIR Specification. The memory map is
discussed in more detail in the following sections.
The OS loader is often operating-system-specific; however, in most operating systems
its main function is to load and execute the operating system kernel, which continues
startup and eventually loads an application. Examples for Linux include
U-Boot, GRUB, and Syslinux. Examples for Windows CE are Eboot and ROMboot.
Another term used in the past for the OS loader was the initial program
loader, or IPL. While the term predates most of the current mechanisms, it can pop up
in certain crowds or segments of the industry. The job is the same as that of a BIOS, a boot
loader, or an OS loader. The names are just changed to protect the innocent.
By default, UEFI firmware will search the path /EFI/BOOT/ on a FAT GPT partition
for an executable second-stage bootloader. Each FAT GPT partition will be checked
in its discovered order for an appropriate boot executable. On a 64-bit (x64) machine,
the name of the default bootloader is bootx64.efi; on a 32-bit (IA-32) machine it is
bootia32.efi (bootia64.efi is the corresponding name for Itanium).
Operating system installation programs will typically rewrite the NVRAM variables (NvVars)
that control the boot process, thus installing a new path to the operating system's standard
bootloader. For example, the new path might point to a GRUB image on the EFI system
partition for a Linux system, or to \EFI\Microsoft\Boot\bootmgfw.efi for a Windows
system (which in turn loads C:\Windows\System32\winload.efi).
A second-stage bootloader, as shown in Figure 8.1, will then hand off control from
the second stage to the runtime system by first locating the bootloader configuration
file, if any, and then executing commands contained in that configuration file in the
following order:
1. Locate a kernel image
2. Locate any ancillary boot data, such as an init RAM disk
3. Optional: convert an EFI memory map to E820h tables, for systems that do not
support UEFI Runtime Services
4. Call ExitBootServices()
5. Execute the kernel with parameters given in the boot loader configuration file
Booting via the EFI Interface
Modern UEFI systems, in conjunction with modern versions of the Linux kernel, can
directly execute a Linux kernel without need of a second-stage bootloader such as Grub.
Since the Linux kernel now can get the EFI memory map via a call to UEFI
Runtime Services, the necessity of using a second-stage bootloader to convert an EFI
memory map to E820h/E801h tables is mitigated.
This extension to the Linux kernel is fully backward compatible, meaning it is
possible to add the EFI_STUB kernel configuration option to your kernel configuration
and seamlessly use it to execute the kernel directly from a UEFI firmware,
or to execute the kernel as specified above from a standard second-stage boot loader.
Operating system kernels such as those used in Windows and Linux systems assume
and require the second-stage boot loader to have called ExitBootServices() before
handing control over.
In order to support calling UEFI services from a virtual-memory-enabled kernel,
however, the kernel and the UEFI callback mechanism must agree on a
new address structure. Recall that the UEFI runtime subsystem operates in protected,
unpaged mode, whereas Windows and Linux are fully fledged paged OS environments
whose kernels have switched execution over to paged mode; therefore, when a kernel
makes a runtime call to UEFI, one of two things must logically happen.
1. Each callback to UEFI could switch the processor back to UEFI’s default mode of
protected unpaged mode. This is undesirable, however, for a number of reasons:
– Slow to switch modes back and forth
– Complex to enable/disable paging between the two modes
– Considered unsafe security-wise as the UEFI callback has unfettered access
to memory.
2. The UEFI callback must now itself operate in paged mode, replacing function and
data references/pointers to physical addresses with paged references. Example:
data at 0x12345678 physical may now be mapped by Linux to 0xD2345678, and
the UEFI system should reference the new address, not the old.
In order to achieve this agreement between a paging kernel and the UEFI Runtime
Services, the UEFI standard defines the function SetVirtualAddressMap(). After map-
ping UEFI runtime data/code to a set of virtual addresses, the OS kernel will make
this callback to EFI, before making any further UEFI Runtime Service calls.
Neither Option
This option should be reserved for an RTOS or modified Linux kernel. The OS loader
(if there is one) must not rely on any BIOS boot services or legacy BIOS calls. The OS
must not rely on any BIOS legacy or runtime services. The system should be a closed
box with a single permanent boot path. This is an ideal case if this is an SPI-based
(flash-based) operating system, such as a microkernel or free RTOS.
In a UEFI-based solution, a developer may choose to have the OS as a DXE pay-
load where the operating system is loaded and jumped to as part of the initial BIOS
driver load.
In a legacy-based solution, it is possible to do something similar with a DMA of
the OS kernel in a known memory location and a jump to start the kernel.
While this may prove limiting to the boot solution on the platform, there is an
alternative OS-loading model via this solution where the UEFI-based firmware is bi-
furcated and DXE and later is placed on removable media. A jumper or other physical
access is likely required to trigger the alternate boot path, but this is important if the
solution is going to go beyond the simple prototype and into high volume manufac-
turing. The blind handoff to disk means reading the first sector of a target device and
jumping to a known (good) address. It is the preference of RTOS vendors who don’t
call back into system firmware during runtime and have not yet implemented either
a legacy OS interface or an EFI interface.
Summary
Loading an operating system after the platform is initialized can be done in a variety
of ways. Known and tested standards are the fastest and easiest methods to provide
for a flexible solution. More deeply embedded solutions have several other options
but must be fully thought through beyond the lab environment if the system is meant
for bigger, better, and broader things.
Choose wisely and let the circumstances dictate the development (and boot)
path; with UEFI, one doesn’t need to be married to a single boot solution.
Chapter 9
The Intel® Architecture Boot Flow
There is no one giant step that does it. It’s a lot of little steps.
—Peter A. Cohen
A lot of little steps is exactly the idea of walking the path of the Intel architecture boot
flow. The bare minimum firmware requirements for making an Intel architecture plat-
form operational and for booting an OS are presented here in a given order. Design or
market segment-based requirements might add, delete, or reorder many of the items
presented in this chapter; however, for the vast majority of system designs, these
steps in this order are sufficient for a full or cold boot from a state where the power is
off to the handoff to the operating system. Depending on the architecture of the BIOS,
there may be multiple software phases to jump through with different sets of rules,
but the sequence for actually touching the hardware is, at least in the early phases,
very much the same.
DOI 10.1515/9781501506819-009
As part of the Intel architecture, a variety of subsystems may begin prior to the main
host system starting.
The Intel Management Engine (ME), available on some mainstream, desktop, and
server-derived chips, is one such component. While the main system firmware does
not initialize these devices, there is likely to be some level of interaction that must
be taken into account in the settings of the firmware, or in the descriptors of the flash
component, for the ME to start up and enable the clocks correctly. The main system
firmware also has the potential to make calls to and be called from the ME.
Another example is the microengines, which are part of telecommunications seg-
ment components in the embedded-market segments. These microengines have their
own firmware that starts independently of the system BIOS, but the host system BIOS
has to make allowances for this in the ACPI memory map to allow for proper interac-
tion between host drivers and the microengine subsystem.
Hardware Power Sequences (The Pre-Pre-Boot)
Once the processor reset line has been de-asserted, newer processors may automatically
load a microcode update as part of their secure startup sequence. Any correctable
errata are fixed before any executable code runs, which helps ensure the security of the
system. After the patch is loaded, the processor then begins fetching instructions. The
location of these initial processor instructions is known as the reset vector. The reset
vector may contain instructions or a pointer to the actual starting instruction sequence
in the flash memory. The location of the vector is architecture-specific and
usually in a fixed location, depending on the processor. The initial address must be a
physical address, as the MMU (if it exists) has not yet been enabled. For Intel architecture,
the first instructions are fetched from 0xFFFFFFF0. Only 16 bytes are left to
the top of memory, so these 16 bytes must contain a far jump to the remainder of the
initialization code.
This code is always written in assembly at this point as (to date) there is no soft-
ware stack or cache as RAM available.
Because the processor cache is not enabled by default, it is not uncommon to
flush the cache in this step with a WBINVD instruction. The WBINVD instruction is not
needed on newer processors, but it doesn't hurt anything.
Mode Selection
– System management mode (SMM)—normally the system firmware creates an SMI
handler, which may periodically take over the system from the host OS. Workarounds
are normally executed in the SMI handler, as are the handling and logging of errors
that may happen at the system level. As this presents a potential security issue, there is
also a lock bit that resists tampering with this mechanism.
Real-time OS vendors often recommend disabling this feature because it has
a potential of subverting the nature of the OS environment. If this happens, then
the additional work of the SMI handler would either need to be incorporated into
the RTOS for that platform or else the potential exists of missing something im-
portant in the way of error response or workarounds. If the SMI handler can work
with the RTOS development, there are some additional advantages to the feature.
– Virtual-8086 mode is a quasi-operating mode supported by the processor in pro-
tected mode. This mode allows the processor to execute 8086 software in a pro-
tected, multitasking environment.
Intel® 64 architecture supports all operating modes of IA-32 architecture and
IA-32e modes.
– IA-32e mode—in this mode, the processor supports two submodes: compatibility
mode and 64-bit mode. 64-bit mode provides 64-bit linear addressing and sup-
port for physical address space larger than 64 GB. Lastly, Compatibility mode al-
lows most legacy protected-mode applications to run unchanged.
Figure 9.2 shows how the processor moves between operating modes.
Figure 9.2: Mode Switching, per the Intel® 64 and IA-32 Architectures Software Developer's
Manual, Volume 3A
Refer to the Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume
3A section titled “Mode Switching” for more details.
When the processor is first powered on, it will be in a special mode similar to real
mode, but with the top 12 address lines being asserted high. This aliasing allows the
boot code to be accessed directly from NVRAM (physical address 0xFFFxxxxx).
Upon execution of the first long jump, these 12 address lines will be driven ac-
cording to instructions by firmware. If one of the protected modes is not entered be-
fore the first long jump, the processor will enter real mode, with only 1 MB of address-
ability. In order for real mode to work without memory, the chipset needs to be able
to alias a range of memory below 1 MB to an equivalent range just below 4 GB to con-
tinue to access NVRAM. Certain chipsets do not have this aliasing and may require a
switch into a normal operating mode before performing the first long jump. The pro-
cessor also invalidates the internal caches and translation lookaside buffers (TLBs).
The processor continues to boot in real mode today for no particular technical
reason. While some speculate it is to ensure that the platform can boot legacy
code (for example, a DOS OS) written many years ago, it is more an issue of introducing,
and needing to validate, such a change across a broad ecosystem of players and developers.
The backward compatibility issues it would create in test and manufacturing
environments, and other natural upgrade hurdles, will continue to keep the boot
mode and Intel reset vector "real" until a high return-on-investment feature requires
it to be otherwise.
The first power-on mode is a special subset of the real mode. The top 12 address
lines are held high, allowing aliasing, where the processor can execute code from the
nonvolatile storage (such as flash) located within the lowest one megabyte as if from
the top of memory. Normal operation of the firmware (such as BIOS) is to switch
modes to flat-protected mode as early in the boot sequence as is possible. Once the
processor is running in protected mode, it usually is not necessary to switch back to
real mode unless executing a legacy option ROM that makes certain legacy software
interrupt calls. Flat protected mode runs 32-bit code, and physical addresses are
mapped one-to-one with logical addresses (paging off). The Interrupt Descriptor
Table is used for interrupt handling. This is the recommended mode for all BIOS/boot
loaders to operate in. Segmented protected mode is not used for the initialization
code as part of the BIOS sequence.
Intel produces BIOS specifications or BIOS writer’s guides that go into some de-
tails about chip-specific and technology-specific initialization sequences. These doc-
uments hold fine-grain details of every bit that needs to be set, but not necessarily the
high-level sequence in which to set them. The following flows for early initialization
and advanced initialization outline that from a hardware architecture point of view.
Early Initialization
The early phase of the BIOS/bootloader will do the minimum to get the memory and
processor cores initialized.
In a UEFI-based system BIOS, the Security (SEC) and Pre-EFI Initialization
(PEI) phases are normally synonymous with "early initialization." Whether a
legacy or UEFI BIOS is used, from a hardware point of view the early initialization
sequence is the same for a given system.
Single-Threaded Operation
In a multicore system, the bootstrap processor is the CPU core/thread that is chosen
to boot the normally single-threaded system firmware. At RESET, all of the processors
race for a semaphore flag bit in the chipset. The first finds it clear and in the process
of reading it sets the flag; the other processors find the flag set and enter a WAIT for
SIPI or halt state. The first processor initializes main memory and the application pro-
cessors (APs) and continues with the rest of boot. A multiprocessor (MP) system does
not truly enter MP operation until the OS takes over. While it is possible to do a limited
amount of parallel processing during the UEFI boot phase, such as during memory
initialization with multiple socket designs, any true multithreading activity would re-
quire changes to be made to the DXE phase of the UEFI solutions to allow for this. In
order to have broad adoption, some obvious benefits would need to arise.
The early initialization phase readies the bootstrap processor (BSP) and I/O periph-
erals’ base address registers that are needed to configure the memory controller. The
device-specific portion of an Intel architecture memory map is highly configurable.
Most devices are seen and accessed via a logical PCI bus hierarchy, although a small
number may be memory-mapped devices that have part-specific access mechanisms.
Device control registers are mapped to a predefined I/O or MMIO space and can be set
up before the memory map is configured. This allows the early initial firmware to con-
figure the memory map of the device needed to set up DRAM. Before DRAM can be
configured, the firmware must establish the exact configuration of the DRAM on the
board. The Intel architecture reference platform memory map is described in more
detail in Figure 9.3. SOC devices based on other processor architectures typically pro-
vide a static address map for all internal peripherals, with external devices connected
via a bus interface. The bus-based devices are mapped to a memory range within the
SOC address space. These SOC devices usually provide a configurable chip select register,
which can be set to specify the base address and size of the memory range enabled
by the chip select. SOCs based on Intel architecture primarily use the logical PCI
infrastructure for internal and external devices.
The location of the device in the host memory address space is defined by the PCI base
address register (BAR) for each of the devices. The device initialization typically ena-
bles all the BAR registers for the devices required as part of the system boot path. The
BIOS will assign all devices in the system a PCI base address by writing the appropri-
ate BAR registers during PCI enumeration. Long before full PCI enumeration, the BIOS
must enable PCIe BAR and the PCH RCBA BAR for memory, I/O, and MMIO interac-
tions during the early phase of boot.
Depending on the chipset, there are prefetchers that can be enabled at this point
to speed up the data transfer from the flash device. There may also be DMI link set-
tings that must be tuned to optimal performance.
CPU Initialization
This consists of simple configuring of processor and machine registers, loading a mi-
crocode update, and enabling the Local APIC.
Local APIC. The Local APIC must be enabled to handle any interrupts that occur early
in POST, before enabling protected mode. For more information on the Intel APIC
architecture and Local APICs, please see the Intel 64 and IA-32 Architectures Software
Developer's Manual, Volume 3A: System Programming Guide, Chapter 8.
Switch to Protected Mode. Before the processor can be switched to protected mode,
the software initialization code must load a minimum number of protected mode data
structures and code modules into memory to support reliable operation of the proces-
sor in protected mode. These data structures include the following:
– An IDT
– A GDT
– A TSS
– (Optional) An LDT
– If paging is to be used, at least one page directory and one page table
– A code segment that contains the code to be executed when the processor
switches to protected mode
– One or more code modules that contain the necessary interrupt and exception
handlers
Initialization code must also initialize the following system registers before the pro-
cessor can be switched to protected mode:
– The GDTR
– Optionally the IDTR. This register can also be initialized immediately after switch-
ing to protected mode, prior to enabling interrupts.
With these data structures, code modules, and system registers initialized, the pro-
cessor can be switched to protected mode by loading control register CR0 with a value
that sets the PE flag (bit 0).
From this point onward, it is likely that the system will not enter 16-bit real mode again, Legacy Option ROMs and the Legacy OS/BIOS interface notwithstanding, until the next hardware reset.
More details about protected mode and real mode switching can be found in the
Intel 64 and IA-32 Intel Architecture Software Developer’s Manual, Volume 3A: Sys-
tem Programming Guide, Chapter 9.
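The actual mode switch is done in assembly, but the CR0 manipulation reduces to setting one bit. A sketch in C of the bit math only (the real sequence is a MOV to CR0 followed immediately by a far jump to flush the prefetch queue):

```c
#include <stdint.h>

#define CR0_PE (1u << 0) /* Protection Enable flag, CR0 bit 0 */

/* Pure bit-math sketch of entering protected mode: return the CR0
 * value with the PE flag set. Register access is abstracted away. */
uint32_t cr0_enable_protected_mode(uint32_t cr0)
{
    return cr0 | CR0_PE;
}
```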
Cache as RAM and No Eviction Mode. Since no DRAM is available, the code initially
operates in a stackless environment. Most modern processors have an internal cache
that can be configured as RAM (Cache as RAM, or CAR) to provide a software stack.
Developers must write extremely tight code when using CAR because an eviction
would be paradoxical to the system at this point in the boot sequence; there is no
memory to maintain coherency with at this time. There is a special mode for proces-
sors to operate in cache as RAM called “no evict mode” (NEM), where a cache line
miss in the processor will not cause an eviction. Developing code with an available
software stack is much easier, and initialization code often performs the minimal
setup to use a stack even prior to DRAM initialization.
Processor Speed Correction. The processor may boot into a slower mode than it can
perform for various reasons. It may be considered less risky to run in a slower mode,
or it may be done to save additional power. It will be necessary for the BIOS to initial-
ize the speed step technology of the processor and may need to force the speed to
something appropriate for a faster boot. This additional optimization is optional; the
OS will likely have the drivers to deal with this parameter when it loads.
Memory Configuration
The initialization of the memory controller varies slightly, depending on the DRAM
technology and the capabilities of the memory controller itself. The information on
the DRAM controller is proprietary for SOC devices, and in such cases the initialization memory reference code (MRC) is typically supplied by the SOC vendor. Developers should contact Intel to request access to the low-level information required. Even without the MRC, developers should understand that for a given DRAM technology they must follow the JEDEC initialization sequence. The code will likely have a single entry point and a single exit point, with multiple boot paths contained within it, and will likely be 32-bit protected mode code. The
settings for various bit fields, like buffer strengths and loading for a given number of banks of memory, are chipset-specific. The dynamic nature of the
memory tests that are run (to ensure proper timings have been applied for a given
memory configuration) is an additional complexity that would prove difficult to replicate without proper code from the memory controller vendor. Workarounds for
errata and interactions with other subsystems, such as the Manageability Engine, are
not something that can be reinvented, but can be reverse-engineered. The latter is, of
course, heavily frowned upon.
There is a very wide range of DRAM configuration parameters, such as the number of ranks, 8-bit or 16-bit addresses, overall memory size and constellation, soldered-down or add-in module (DIMM) configurations, page-closing policy, and power management.
Given that most embedded systems populate soldered down DRAM on the board, the
firmware may not need to discover the configuration at boot time. These configurations
are known as memory-down. The firmware is specifically built for the target configura-
tion. This is the case for the Intel reference platform from the Embedded Computing
Group. At current DRAM speeds, the wires between the memory controller and the DRAM devices behave like transmission lines; the SOC may provide automatic calibration and runtime control of resistive compensation (RCOMP) and delay-locked loop (DLL) capabilities. These capabilities allow the memory controller to change elements such as the drive strength to ensure error-free operation over time and temperature variations.
If the platform supports add-in-modules for memory, there are a number of stand-
ardized form factors for such memory. The small outline dual in-line memory module
(SODIMM) is one such form factor often found in embedded systems. The DIMMs pro-
vide a serial EPROM. The serial EPROM devices contain the DRAM configuration data.
The data is known as serial presence detect data (SPD data). The firmware reads the
SPD data to identify the device configuration and subsequently configures the device.
The serial EPROM is connected via SMBus; the bus must therefore be available in this early initialization phase so that the software can identify the memory devices on board.
It is possible for memory-down motherboards to also incorporate serial presence de-
tect EEPROMs to allow for multiple and updatable memory configurations to be han-
dled efficiently by a single BIOS algorithm. It is also possible to provide a hard-coded
table in one of the MRC files to allow for an EEPROM-less design. In order to derive
that table for SPD, please see Appendix A.
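As a sketch of SPD consumption, the JEDEC SPD layout puts the DRAM device type in byte 2 (0x0B for DDR3, 0x0C for DDR4); a minimal decoder, with the SMBus read itself abstracted into a byte buffer:

```c
#include <stdint.h>

/* Sketch of decoding the DRAM device type from JEDEC SPD data. Byte 2
 * of the SPD EEPROM identifies the memory technology; the values below
 * come from the JEDEC SPD annexes. Further fields (density, timings,
 * organization) follow in later bytes and are technology-specific. */
enum dram_type { DRAM_UNKNOWN, DRAM_DDR2, DRAM_DDR3, DRAM_DDR4 };

enum dram_type spd_dram_type(const uint8_t *spd)
{
    switch (spd[2]) {
    case 0x08: return DRAM_DDR2;
    case 0x0B: return DRAM_DDR3;
    case 0x0C: return DRAM_DDR4;
    default:   return DRAM_UNKNOWN;
    }
}
```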
Post-Memory
Once the memory controller has been initialized, a number of subsequent cleanup
events take place.
Memory Testing
The memory testing is now part of the MRC, but it is possible to add more tests should
the design merit it. BIOS vendors typically provide some kind of memory test on a
cold boot. Writing custom firmware requires the authors to choose a balance between
thoroughness and speed, as embedded/mobile devices require extremely fast
boot times and memory testing can take up considerable time.
If testing is warranted for a design, testing the memory directly following initial-
ization is the time to do it. The system is idle, the subsystems are not actively access-
ing memory, and the OS has not taken over the host side of the platform. Memory
errors manifest themselves in random ways, sometimes inconsistently. Several hard-
ware features can assist in this testing both during boot and during runtime. These
features have traditionally been thought of as high-end or server features, but over
time they have moved into the client and embedded markets.
One of the most common is ECC. Some embedded devices use error correction
codes (ECC) memory, which may need extra initialization. After power-up, the correction codes may not match the memory contents, so all memory must be written; writing every location sets the ECC bits to valid, consistent values. For security purposes, the memory may need to be zeroed out
manually by the BIOS or in some cases a memory controller may incorporate the fea-
ture into hardware to save time.
Depending on the source of the reset and security requirements, the system may
not execute a memory wipe or ECC initialization. On a warm reset sequence, memory
context can be maintained.
If there were any memory timing changes or other configuration changes that re-
quire a reset to take effect, this is normally the time to execute a warm reset. That
warm reset would start the early initialization phase over again; affected registers
would need to be restored.
Shadowing
From the reset vector, execution starts off directly from the nonvolatile flash storage
(NVRAM). This operating mode is known as execute in place (XIP). The read perfor-
mance of nonvolatile storage is much slower than the read performance of DRAM. The
performance of the code running from flash is much lower than if it executed from
RAM, so most early firmware will copy from the slower nonvolatile storage into RAM.
The firmware starts to run the RAM copy of the firmware. This process is sometimes
known as shadowing. Shadowing involves having the same contents in RAM and
flash; with a change in the address decoders the RAM copy is logically in front of the
flash copy and the program starts to execute from RAM. On other embedded systems,
the chip select ranges are managed to allow the change from flash to RAM execution.
Most computing systems run as little as possible in place. However, some constrained
(in terms of RAM) embedded platforms execute all the application in place. This is
generally an option on very small embedded devices. Larger systems with main
memory generally do not execute in place for anything but the very initial boot steps
before memory has been configured. The firmware is often compressed instead of a
simple copy. This allows reduction of the NVRAM requirements for the firmware.
However, the processor cannot execute a compressed image in place.
On Intel architecture platforms, the shadowing of the firmware is usually located
below 1 MB.
There is a tradeoff between the size of the data to be shadowed and the cost of decompression. Loading and decompressing the image may take longer than it would to copy the image uncompressed. Prefetchers in the processor,
if enabled, may also speed up execution in place, and some SOCs have internal
NVRAM cache buffers to assist in pipelining the data from the flash to the processor.
Figure 9.3 shows the memory map at initialization in real mode. Real mode has
an accessibility limit of 1 MB.
Before memory was initialized, the data and code stacks were held in the processor
cache. With memory now initialized, that special, temporary caching mode must be exited and the cache flushed. The stack will be transferred to a new location in system main memory, and the cache will be reconfigured as part of AP initialization.
The stack must be set up before jumping into the shadowed portion of the BIOS
that now is in memory. A memory location must be chosen for stack space. The stack
will count down, so the top of the stack must be set, and enough memory must be allocated for the maximum stack depth.
If protected mode is used, which is likely following MRC execution, then SS:ESP
must be set to the correct memory location.
Transfer to DRAM
This is where the code makes the jump into memory. As mentioned before, if a
memory test has not been performed up until this point, the jump could very well be
to garbage. A system failure with a POST code between "end of memory initialization" and the next POST code almost always indicates a catastrophic memory initialization problem. On a new design, chances are the problem is in the hardware and requires step-by-step debug.
For legacy option ROMs and BIOS memory ranges, Intel chipsets usually come with
memory aliasing capabilities that allow reads and writes to sections of memory below
1 MB to be either routed to or from DRAM or nonvolatile storage located just under 4
GB. The registers that control this aliasing are typically referred to as programmable
attribute maps (PAMs). Manipulation of these registers may be required before, dur-
ing, and after firmware shadowing. The control over the redirection of memory access
varies from chipset to chipset. For example, some chipsets allow control over reads
and writes, while others only allow control over reads.
For shadowing, if PAM registers remain at default values (all 0s), all FWH ac-
cesses to the E and F segments (E_0000–F_FFFFh) will be directed downstream to-
ward the flash component. This will function to boot the system, but is very slow.
Shadowing, as we know, improves boot speed. One method of shadowing the E and
F segments (E_0000–F_FFFFh) of the BIOS is to utilize the PAM registers. This can be
done by changing the enables (HIENABLE[], LOENABLE[]) to 10b (write only). This will direct reads to the flash device and writes to memory. By reading and then writing the same address, the data is shadowed into memory. Once the BIOS code has been shadowed into memory, the enables can be changed to 01b (read only), so memory reads
are directed to memory. This also prevents accidental overwriting of the image in
memory. See the example in Table 9.1.
Table 9.1: PAM registers and the legacy memory ranges they control

PAM0  F0000–FFFFFh
PAM1  C0000–C7FFFh
PAM2  C8000–CFFFFh
PAM3  D0000–D7FFFh
PAM4  D8000–DFFFFh
PAM5  E0000–E7FFFh
PAM6  E8000–EFFFFh
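The read-then-write copy loop described above can be modeled as a toy decode window; the pam_window structure and the flash/dram arrays are illustrative stand-ins for the real address decoders, not chipset registers:

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model of a PAM-controlled decode window. enable uses the encoding
 * from the text: 10b = write only (reads go to flash, writes to DRAM);
 * 01b = read only (reads come from DRAM). */
struct pam_window {
    const uint8_t *flash; /* stand-in for the flash decode target */
    uint8_t *dram;        /* stand-in for the DRAM decode target */
    unsigned enable;      /* 2-bit HIENABLE/LOENABLE field */
};

uint8_t pam_read(const struct pam_window *w, size_t off)
{
    return (w->enable == 0x1) ? w->dram[off] : w->flash[off];
}

void pam_write(struct pam_window *w, size_t off, uint8_t v)
{
    if (w->enable == 0x2) /* write only: writes land in DRAM */
        w->dram[off] = v;
}

/* Shadow: read-then-write every byte, then flip the window to read only
 * so subsequent reads hit the RAM copy and it cannot be overwritten. */
void pam_shadow(struct pam_window *w, size_t len)
{
    w->enable = 0x2;
    for (size_t i = 0; i < len; i++)
        pam_write(w, i, pam_read(w, i));
    w->enable = 0x1;
}
```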
Consult the chipset datasheet for details on the memory redirection feature controls
applicable to the target platform.
AP Initialization
Even in SOCs, there is the likelihood of having multiple CPU cores, which from the point of view of system initialization are one bootstrap processor (BSP) plus application processors (APs). While the BSP starts and initializes the system, the APs must also be initialized with features identical to those enabled on the BSP. Prior to memory, the APs are left uninitialized. After memory is
started, the remaining processors are initialized and left in a WAIT for SIPI state. To
do this, the system firmware must:
1. Find microcode and then copy it to memory.
2. Find the CPU code in SPI and copy to memory. This is an important step to avoid
execution-in-place penalties for the remainder of the boot sequence.
3. Send Startup IPIs to all processors.
4. Disable all NEM settings, if this has not already been done.
5. Load microcode updates on all processors.
6. Enable Cache On for all processors.
Partial details of these sequences are in the Software Developer's Manual; fuller details can be found in the BIOS Writer's Guide for that particular processor or in CPU reference code obtained from Intel.
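A Startup IPI (SIPI) carries an 8-bit vector, and the AP begins executing at physical address vector << 12, which is why the startup code must sit on a 4 KB boundary below 1 MB. A sketch of the vector encoding:

```c
#include <stdint.h>

/* Compute the SIPI vector for a given AP startup address. Returns 0 on
 * success and -1 if the address cannot be encoded: it must be 4 KB
 * aligned and below 1 MB, since the AP starts in real mode at the
 * physical address vector << 12. */
int sipi_vector_for(uint32_t startup_addr, uint8_t *vector)
{
    if (startup_addr >= 0x100000u || (startup_addr & 0xFFFu))
        return -1;
    *vector = (uint8_t)(startup_addr >> 12);
    return 0;
}
```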
From a UEFI perspective the AP initialization may either be part of the PEI or the
DXE phase of the boot flow or in the early or advanced initialization. At the time of
this printing, the location of the AP initialization code may be considered product
dependent.
Threads and cores on the same package are detectable by executing the CPUID in-
struction.
See the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume
2A for details on the information available with the CPUID instruction on various pro-
cessor families.
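As a sketch of the CPUID decode, leaf 01h reports the maximum number of addressable logical processor IDs per package in EBX bits 23:16 (meaningful when the HTT flag, EDX bit 28, is set); the CPUID execution itself is abstracted away here:

```c
#include <stdint.h>

/* Decode the "maximum addressable logical processor IDs per package"
 * field from CPUID leaf 01h register values. ebx and edx are the values
 * returned in those registers; when the HTT flag (EDX bit 28) is clear,
 * the package has a single logical processor. */
unsigned logical_ids_per_package(uint32_t ebx, uint32_t edx)
{
    if (!(edx & (1u << 28)))
        return 1;
    return (ebx >> 16) & 0xFFu;
}
```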
Detection of additional packages must be done “blindly.” If a design must accom-
modate more than one physical package, the BSP needs to wait a certain amount of
time for all potential APs in the system to “log in.” Once a timeout occurs or the max-
imum expected number of processors “log in,” it can be assumed that there are no
more processors in the system.
AP Wakeup State
Upon receipt of the SIPI, the AP starts executing the code pointed to by the SIPI mes-
sage. As opposed to the BSP, when the AP starts code execution it is in real mode.
This requires that the code the AP starts executing be located below 1 MB.
Caching Considerations
Because of the different types of processor combinations and different attributes of
shared processing registers between threads, care must be taken to ensure that the
caching layout of all processors in the entire system remains consistent, such that there are no caching conflicts.
The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A
section “MTRR Considerations in MP Systems” outlines a safe mechanism for chang-
ing the cache configuration in all systems that contain more than one processor. It is
recommended that this be used for any system with more than one processor present.
AP Idle State
Behavior of APs during firmware initialization depends on the firmware implementa-
tion, but is most commonly restricted to short durations of initialization, followed by
entering a halt state with a HLT instruction, awaiting direction from the BSP for an-
other operation.
Once the firmware is ready to attempt to boot an OS, all AP processors must be
placed back in their power-on state (WAIT for SIPI), which can be accomplished by
the BSP sending an INIT ASSERT IPI followed by an INIT DEASSERT IPI to all APs in
the system (all except self). See the Intel® 64 and IA-32 Architectures Software Devel-
oper’s Manual, Volume 3A for details on the INIT IPI, and the MultiProcessor Specifi-
cation 1.4 for details on BIOS AP requirements.
Advanced Initialization
The advanced device initialization follows the early initialization, which has already ensured that the DRAM is initialized. This second stage is focused on device-specific
initialization. In a UEFI-based BIOS solution, advanced initialization tasks are also
known as the Driver Execution Environment (DXE) and Boot Device Selection (BDS)
phases. The following devices must be initialized in order to enable an embedded
system. Not all are applicable to all embedded systems, but the list is prescriptive for
most and is particular to an SOC based on Intel architecture.
– General purpose I/O (GPIO)
– Interrupt controller
– Timers
– Cache initialization (could also be done during early initialization)
– Serial ports, console in/out
– Clocking and overclocking
– PCI bus initialization
– Graphics (optional)
– USB
– SATA
GPIOs are key to the extensibility of the platform. As the name implies, GPIOs can be
configured for either input or output, but also can be configured for a native function-
ality. Depending on weak or strong pull-up or pull-down resistors, some GPIOs also
can act like strapping pins, which are sampled at RESET by the chipset and can have
a second meaning during boot and runtime. GPIOs also may act like sideband signals
to allow for system wakes. GPIO 27 is used for this on most mainstream platforms.
System-on-chip devices are designed to be used in a large number of configurations, often having more capabilities than can be exposed on the I/O pins concurrently, because several functions are multiplexed onto a particular I/O pin. The configuration of the pins must be set before use. Each pin is configured either to provide a specific native function or to serve as a general-purpose I/O pin. I/O
pins on the device are used to control logic or behavior on the device. General purpose
I/O pins can be configured as input or output pins. GPIO control registers provide
status and control.
System firmware developers must work through between 64 and 256 GPIOs and
their individual options with the board designer (per platform) to ensure that this fea-
ture is properly enabled.
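A sketch of per-pin configuration using an ICH-style USE_SEL/IO_SEL layout (one bit per pin); the exact register names, offsets, and polarities vary by chipset, so treat this layout as an assumption to be checked against the datasheet:

```c
#include <stdint.h>

/* Toy model of a 32-pin GPIO bank with ICH-style select registers:
 * USE_SEL bit = 1 selects GPIO (0 = native function), IO_SEL bit = 1
 * selects input (0 = output). Register access is plain struct fields. */
struct gpio_bank {
    uint32_t use_sel; /* 1 = pin is GPIO, 0 = native function */
    uint32_t io_sel;  /* 1 = input, 0 = output */
};

void gpio_set_input(struct gpio_bank *b, unsigned pin)
{
    b->use_sel |= 1u << pin;
    b->io_sel  |= 1u << pin;
}

void gpio_set_native(struct gpio_bank *b, unsigned pin)
{
    b->use_sel &= ~(1u << pin);
}
```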
Interrupt Controllers
Intel architecture has several different methods of interrupt handling. The following
or a combination of the following can be used to handle interrupts:
– Programmable Interrupt Controller (PIC) or 8259
– Local Advanced Programmable Interrupt Controller (APIC)
– Input/Output Advanced Programmable Interrupt Controller (IOxAPIC)
– Message Signaled Interrupt (MSI)
[Table: PIRQ# pin, interrupt router register for PIC, and connected IOxAPIC pin]
The IVT is the Interrupt Vector Table, located at memory address 0h and containing 256 interrupt vectors. The IVT is used in real mode. Each vector entry is 32 bits and
consists of the CS:IP for the interrupt vector. Refer to the Intel® 64 and IA-32 Architec-
tures Software Developer’s Manual, Volume 3A section titled “Exception and Inter-
rupt Reference” for a list of real-mode interrupts and exceptions.
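A sketch of an IVT lookup: vector n occupies four bytes at address n*4, stored as IP then CS, and the handler's physical address is CS*16 + IP:

```c
#include <stdint.h>

/* Resolve a real-mode interrupt vector to a physical handler address.
 * ivt points at a copy of the table that starts at address 0; each
 * 4-byte entry holds the 16-bit IP (offset) followed by the 16-bit CS
 * (segment), both little-endian. */
uint32_t ivt_handler_addr(const uint8_t *ivt, uint8_t vector)
{
    uint16_t ip = (uint16_t)(ivt[vector * 4]     | (ivt[vector * 4 + 1] << 8));
    uint16_t cs = (uint16_t)(ivt[vector * 4 + 2] | (ivt[vector * 4 + 3] << 8));
    return ((uint32_t)cs << 4) + ip; /* CS*16 + IP */
}
```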
The IDT is the Interrupt Descriptor Table and contains the exceptions and interrupts
in protected mode. There are also 256 interrupt vectors, and the exceptions and inter-
rupts are defined in the same locations as the IVT. Refer to the Intel® 64 and IA-32
Architectures Software Developer’s Manual, Volume 3A for a detailed description of
the IDT.
Exceptions
Exceptions are routines that run to handle error conditions. Examples include page
fault and general protection fault. At a minimum, placeholders (dummy functions)
should be used for each exception handler. Otherwise the system could exhibit un-
wanted behavior if it encounters an exception that isn’t handled.
Timers
There are a variety of timers that can be employed on today's Intel architecture systems.
Note: For debugging any type of firmware on Intel architecture chipsets that implement a TCO Watchdog Timer, the timer should be disabled by firmware as soon as possible coming out of reset. On chipsets that power on with this timer enabled, halting the system for debug before disabling it will result in system resets, which prevents firmware debug. The OS will
re-enable the Watch Dog Timer if it so desires. Consult the chipset datasheet for details on the specific
implementation of the TCO Watch Dog Timer. Refer to the Chipset BIOS Writer’s Guide for more details.
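As a sketch of the watchdog disable, ICH-family chipsets halt the TCO timer by setting the TCO_TMR_HLT bit (bit 11) in TCO1_CNT; the bit position should be confirmed against the target chipset's datasheet, and the register access is reduced to plain bit math here:

```c
#include <stdint.h>

/* TCO_TMR_HLT, bit 11 of TCO1_CNT on ICH-family chipsets (assumption:
 * verify against the target chipset datasheet). Setting it halts the
 * TCO watchdog so a debugger can hold the system without triggering
 * watchdog resets. */
#define TCO_TMR_HLT (1u << 11)

uint16_t tco_halt_timer(uint16_t tco1_cnt)
{
    return (uint16_t)(tco1_cnt | TCO_TMR_HLT);
}
```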
Memory regions that must have different caching behaviors applied will vary from
design to design. In the absence of detailed caching requirements for a platform, the
following guidelines provide a “safe” caching environment for typical systems:
1. Default Cache Rule – Uncached.
2. 00000000–0009FFFF – Write Back.
3. 000A0000–000BFFFF – Write Combined or Uncached.
4. 000C0000–000FFFFF – Write Back or Write Protect.
5. 00100000–TopOfMemory – Write Back.
6. TSEG – Cached on newer processors.
7. Graphics Memory – Write Combined or Uncached.
8. Hardware Memory-Mapped I/O – Uncached.
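These guidelines can be encoded as a simple address-to-cache-type lookup; real firmware programs such ranges into MTRRs, and where the guidelines offer two options (write combined or uncached, write back or write protect), this sketch picks one:

```c
#include <stdint.h>

/* Sketch of the "safe" default caching guidelines as a lookup by
 * physical address. top_of_memory is the end of usable DRAM; anything
 * above it (MMIO, APICs) falls through to uncached. */
enum cache_type { UC, WB, WC, WP };

enum cache_type default_cache_type(uint64_t addr, uint64_t top_of_memory)
{
    if (addr < 0x000A0000ull) return WB; /* low DRAM */
    if (addr < 0x000C0000ull) return WC; /* legacy VGA frame buffer */
    if (addr < 0x00100000ull) return WB; /* shadowed BIOS region */
    if (addr < top_of_memory) return WB; /* main DRAM */
    return UC;                           /* MMIO and everything else */
}
```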
While MTRRs are programmed by the BIOS, Page Attribute Tables (PATs) are used primarily by the OS to control caching down to the page level.
The Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A,
Chapter 10, “Memory Cache Control” contains all the details on configuring caching
for all memory regions.
See the appropriate CPU BIOS Writer’s Guide for caching control guidelines spe-
cific to the processor.
Serial Ports
An RS-232 serial port or UART 16550 is initialized for either runtime or debug solu-
tions. Unlike USB ports, which require considerable initialization and a large software
stack, serial ports have a minimal register-level interface. A serial port can be enabled
very early in POST to provide serial output support.
Depending on the clocking solution of the platform, the BIOS may have to enable
the clocking of the system. It is possible that a subsystem such as the Manageability
Engine, or the baseboard management controller (BMC) in server platforms, has this responsibility.
It is possible that beyond the basic clock programming, there may be expanded
configuration options for overclocking such as:
– Enable/disable clock output enable based on enumeration.
– Adjust clock spread settings. Enable/disable and adjust amount. Note: settings
are provided as fixed register values determined from expected usages.
– Underclock CPU for adaptive clocking support. If done directly, the BIOS must
perform adjustment with ramp algorithm.
– Lock out clock registers prior to transitioning to host OS.
Peripheral Component Interconnect (PCI) device enumeration is a generic term that refers to
detecting and assigning resources to PCI-compliant devices in the system. The dis-
covery process assigns the resources needed by each device, including the following:
– Memory, prefetchable memory, I/O space
– Memory mapped I/O (MMIO) space
– IRQ assignment
– Expansion ROM detection and execution
PCI device discovery applies to all the newer (nonlegacy) interfaces such as PCI Ex-
press (PCIe) root ports, USB controllers, SATA controllers, audio controllers, LAN con-
trollers, and various add-in devices. These newer interfaces all comply with the PCI
specification.
Refer to the PCI Specification for more details. A list of all the applicable specifi-
cations is in the References section.
It is interesting to note that in a UEFI system, the DXE phase does not execute the majority of drivers; it is the BDS phase that executes most of the required drivers to allow the system to boot.
Graphics Initialization
If the platform has a head, then the video BIOS or Graphics Output Protocol UEFI
driver is normally the first option ROM to be executed in the string. Once the main
console out is up and running, the console in can be configured.
Input Devices
Refer to the board schematics to determine which I/O devices are in the system. Typ-
ically, a system will contain one or more of the following devices.
Legacy-Free Systems
Legacy-free systems use USB as the input device. If pre-OS keyboard support is re-
quired, then the legacy keyboard interfaces must be trapped. Refer to the IOH/ICH
BIOS Specification for more details on legacy-free systems.
USB Initialization
The USB controller supports EHCI and now xHCI. Enabling the host controller with standard PCI resources is relatively easy. It is possible to leave USB disabled until the OS drivers take over and still have a very well-functioning system. If pre-OS support for EHCI or xHCI is required, then the tasks associated with the USB subsystem become substantially more complex. Legacy USB requires that an SMI handler be used to trap port 60h/64h accesses to I/O space and convert these to the proper keyboard or mouse commands. This pre-OS USB support is required if booting from USB is desired.
SATA Initialization
A SATA controller supports the ATA/IDE programming interface, as well as the Ad-
vanced Host Controller Interface (AHCI, not available on all SKUs). In the following
discussion, the term “ATA-IDE Mode” refers to the ATA/IDE programming interface
that uses standard task file I/O registers or PCI IDE Bus Master I/O block registers. The
term “AHCI Mode” refers to the AHCI programming interface that uses memory-
mapped register/buffer space and a command-list-based model.
A separate document, RS – Intel® I/O Controller Hub 6 (ICH6) Serial ATA Ad-
vanced Host Controller Interface (SATA-AHCI) Hardware Programming Specification
(HPS), details SATA software configuration and considerations.
The general guidelines for initializing the SATA controller during POST and S3 re-
sume are described in the following sections. Upon resuming from S3, System BIOS
is responsible for restoring all registers it initialized during POST.
IDE Mode
IDE mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6], to 00b. In this
mode, the SATA controller is set up to use the ATA/IDE programming interface. In
this mode, the 6/4 SATA ports are controlled by two SATA functions. One function
routes up to four SATA ports, D31:F2, and the other routes up to two SATA ports,
D31:F5 (Desktop SKUs only). In IDE mode, the Sub Class Code, D31:F2:Reg 0Ah and
D31:F5:Reg 0Ah will be set to 01h. This mode may also be referred to as compatibility
mode as it does not have any special OS driver requirements.
AHCI Mode
AHCI mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6], to 01b. In
this mode, the SATA controller is set up to use the AHCI programming interface. The
six SATA ports are controlled by a single SATA function, D31:F2. In AHCI mode the
Sub Class Code, D31:F2:Reg 0Ah, will be set to 06h. This mode does require specific
OS driver support.
RAID Mode
RAID mode is selected by programming the SMS field, D31:F2:Reg 90h[7:6] to 10b. In
this mode, the SATA controller is set up to use the AHCI programming interface. The
6/4 SATA ports are controlled by a single SATA function, D31:F2. In RAID mode, the
Sub Class Code, D31:F2:Reg 0Ah, will be set to 04h. This mode does require specific
OS driver support.
In order for the RAID option ROM to access all 6/4 SATA ports, the RAID option
ROM enables and uses the AHCI programming interface by setting the AE bit, ABAR
04h[31]. One consequence is that all register settings applicable to AHCI mode set by
the BIOS must be set in RAID as well. The other consequence is that the BIOS is re-
quired to provide AHCI support to ATAPI SATA devices, which the RAID option ROM
does not handle.
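The three modes map the SMS field onto a programming interface and a PCI Sub Class Code; a sketch of that mapping as described above:

```c
#include <stdint.h>

/* Map the SMS field (D31:F2:Reg 90h[7:6]) to the resulting PCI Sub
 * Class Code: 00b = IDE (01h), 01b = AHCI (06h), 10b = RAID (04h). */
enum sata_mode { SATA_IDE = 0x0, SATA_AHCI = 0x1, SATA_RAID = 0x2 };

uint8_t sata_subclass_for_mode(enum sata_mode m)
{
    switch (m) {
    case SATA_IDE:  return 0x01; /* IDE: compatibility, no special driver */
    case SATA_AHCI: return 0x06; /* AHCI: requires OS driver support */
    case SATA_RAID: return 0x04; /* RAID: requires OS driver support */
    }
    return 0x01; /* default to compatibility mode */
}
```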
The PCH supports a stable-image-compatible ID. When the alternative ID enable, D31:F2:Reg 9Ch[7], is not set, the PCH SATA controller will report a Device ID of 2822h for a desktop SKU.
Enable Ports
It has been observed that some SATA drives will not start spin-up until the SATA port
is enabled by the controller. In order to reduce drive detection time, and hence the
total boot time, system BIOS should enable the SATA port early during POST (for ex-
ample, immediately after memory initialization) by setting the Port x Enable (PxE)
bits of the Port Control and Status register, D31:F2:Reg 92h and D31:F5:Reg 92h, to initiate spin-up of such drives.
Memory Map
In addition to defining the caching behavior of different regions of memory for con-
sumption by the OS, it is also firmware’s responsibility to provide a “map” of the
system memory to the OS so that it knows what regions are actually available for its
consumption.
The most widely used mechanism for a legacy boot loader or a legacy OS to de-
termine the system memory map is to use real-mode interrupt service 15h, function
E8h, sub-function 20h (INT15/E820), which firmware must implement (see http://www.uruk.org/orig-grub/mem64mb.html).
Region Types
There are several general types of memory regions described by the legacy interface:
– Memory (1) – General DRAM available for OS consumption.
– Reserved (2) – DRAM address not for OS consumption.
– ACPI Reclaim (3) – Memory that contains all ACPI tables to which firmware does
not require runtime access. See the applicable ACPI specification for details.
– ACPI NVS (4) – Memory that contains all ACPI tables to which firmware requires
runtime access. See the applicable ACPI specification for details.
– ROM (5) – Memory that decodes to nonvolatile storage (for example, flash).
– IOAPIC (6) – Memory decoded by IOAPICs in the system (must also be uncached).
– LAPIC (7) – Memory decoded by local APICs in the system (must also be un-
cached).
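A sketch of the 20-byte descriptor returned by INT15/E820 and the region type codes above, with a small helper that sums the DRAM usable by the OS:

```c
#include <stdint.h>

/* Region type codes as listed in the legacy INT15/E820 interface. */
enum e820_type {
    E820_RAM = 1, E820_RESERVED = 2, E820_ACPI_RECLAIM = 3,
    E820_ACPI_NVS = 4, E820_ROM = 5, E820_IOAPIC = 6, E820_LAPIC = 7
};

/* The 20-byte descriptor: 64-bit base, 64-bit length, 32-bit type. */
struct e820_entry {
    uint64_t base;
    uint64_t length;
    uint32_t type;
};

/* Sum the general DRAM available for OS consumption across a map. */
uint64_t e820_usable_bytes(const struct e820_entry *map, int n)
{
    uint64_t total = 0;
    for (int i = 0; i < n; i++)
        if (map[i].type == E820_RAM)
            total += map[i].length;
    return total;
}
```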
Region Locations
Loading the OS
Following the memory map configuration, a boot device is selected from a prioritized
list of potential bootable partitions. The UEFI “Load Image” command or Int 19h is
used to call the OS loader, which in turn loads the OS. The details are covered in the
previous chapter.
Summary
The boot sequence outlined in this chapter provides general guidance that will work
for most Intel architecture platforms; however, on any given platform there may be a
reason to change the sequence, depending on the hardware requirements. Additional
secure boot implementations may also be incorporated to check for signatures of bi-
naries prior to execution. Such security additions may be added to future online arti-
cles to support Quick Boot.
As always, please refer and defer to the chipset and CPU specifications for your
target platform.
Chapter 10
Bootstrapping Embedded
Capitalists are no more capable of self-sacrifice than a man is capable of lifting himself up by his
own bootstraps.
—Vladimir Lenin
Bootstrapping a computing device is the act of bringing the system up without any manual entry. It may be an antiquated term, but most operating systems are not native software on Intel® architecture (where execution starts at 0xFFFFFFF0) and cannot bootstrap themselves. BIOS and boot loaders have one main task: the bare minimum required to initialize enough hardware, and then get out of the way for the more robust operating system to take over. In some cases, the BIOS or bootloader should
query the user or system for preferences, perform scans for bootable partitions, enu-
merate expansion buses, or run diagnostics in the case of a unit failure. These are
usages where the OS seems unreachable or something seems incorrect with the hard-
ware, but people typically just want to get to the OS to run their application or answer
their email or phone or play the latest YouTube† video from the Internet; they do not
want to wait for a BIOS to boot.
As a system designer, you can look at the Apple iPad† for the latest industry
benchmark in bootstrapping. As with most embedded computing devices, it doesn’t
boot, it turns on. While the comparison may not be fair in that the iPad isn’t actually
booting most of the time, it is just coming out of a lower power state; the end user
doesn’t care. Consumers will be expecting that level of responsiveness going forward.
One could easily argue that mobile phones really boot once, then have a short turn
on/off time similar to that of a PC’s standby sleep mode. But until the likes of Win-
dows† and Linux decide to bootstrap themselves natively from a true OFF without the
need for a BIOS or bootloader, we as system developers will have to adapt tried and
true BIST and POST to a new world order, in innovative ways, or have our products
stay on the shelf at the store and then get moved directly to a museum, still in the
original packaging. We need to understand how to optimize the boot times for perfor-
mance; this is especially true for the embedded-market space, where the system
needs to effectively turn on, not boot.
But first, we need to put things in perspective.
DOI 10.1515/9781501506819-010
This is not an all-inclusive list. Let's look at each item individually, as it can be applied per design, and then look at a better architectural method.
One of the first considerations when looking at a system BIOS and the corresponding
requirements is whether you can limit the number of variables associated with what
the user can do with or to the system.
You’re pirates. Hang the code, and hang the rules. They’re more like guidelines anyway.
—Elizabeth to crew of the Black Pearl
in Pirates of the Caribbean
In some cases, it is determining what you don’t need to do in the BIOS space ver-
sus the operating system space. This can be more important for boot time reductions
than what you need to do during the boot flow. Example: If the OS kernel or drivers
are going to repeat the bus/device enumeration of the entire PCI subsystem, for SATA,
or for USB hubs/devices, then you only need to do what you must to get the OS loaded
and executing and skip the rest. Too often, the standard PC BIOS handles many inane
corner cases that can be found in one or two devices on the market for the sake of
ultimate backward compatibility, all the way back to DOS. A standard PC BIOS fills
out every line of every legacy data table in memory regardless of whether the OS or
application needs the data. It has been considered a real added value that someone
in the universe who buys this motherboard may install some legacy application, and
we may want to make sure it is a very robust solution. Some devices will need to com-
prehend all the baggage because there will continue to be that part of the market. In
all cases for fast boot, however, developers are encouraged to reduce, reduce, reduce.
For instance, since a user can connect anything from a record player to a RAID
chassis via USB, users might think they can boot from a USB-connected device.
Though this is physically possible, it is within the purview of the platform design to
enable or disable this feature.
A good general performance optimization statement would be: If you can put off
doing something in BIOS that the OS can do, then put it off! Look at the whole boot
chain between bootloader, potential second-stage agents, the OS, and shutdown. You
need to be examining the concept of moving the finish line closer to the ball, under-
standing there are tradeoffs between firmware and OS initialization. This is advanced
because you need to have control and insight into each of the phases of the boot flows
for a platform.
Ask the second part of the question: why is it here? If it is for a keyboard or mouse
input during runtime, or to a USB stick to copy files locally to and from the system
during runtime, then we probably will not need to fully enumerate the bus during
boot time or incorporate a full USB firmware stack.
In Example 1 below, the decision was made to not support booting from USB me-
dia and to not support the user interrupting the boot process via keyboard/mouse.
This means that during the DXE/BDS phase, the BIOS can avoid initializing the USB
infrastructure to get keystrokes and save 0.5 seconds of boot time.
It should be noted that PCI resources were assigned to the USB controllers on the
platform, but by eliminating BIOS USB enumeration we saved 0.5 seconds. Upon launching the platform OS, the OS could still interact with plugged-in USB devices
without a problem because the OS drivers will normally reset the USB controllers and
re-enumerate the USB bus/devices to suit its own needs. No wonder the OS takes so
long to boot … what else may the OS reinitialize and re-enumerate during a cold or
warm boot?
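The decision above can be modeled as a policy gate over optional boot steps. A minimal sketch follows; the step names and costs are hypothetical, loosely scaled to the 0.5-second USB figure quoted in the text:

```python
# Sketch: platform policy gates optional init steps. Step costs (ms) are
# illustrative, not measured values.
BOOT_STEPS = [
    # (step, cost in ms, policy flags that require it — None means always run)
    ("memory_init",     250, None),
    ("sata_init",       120, None),
    ("usb_enumeration", 500, ("usb_boot", "usb_input")),  # skippable
    ("load_os",         300, None),
]

def boot_time_ms(policy):
    total = 0
    for name, cost, gates in BOOT_STEPS:
        # Run the step if it is mandatory, or if any policy flag needs it.
        if gates is None or any(policy[g] for g in gates):
            total += cost
    return total

print(boot_time_ms({"usb_boot": False, "usb_input": False}))  # USB skipped
print(boot_time_ms({"usb_boot": True,  "usb_input": False}))  # USB required
```

The point of the sketch is that the saving comes from the policy decision, not from micro-optimizing the USB stack itself.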
Platform policy ultimately affects how an engineer responds to the remaining
questions.
Example 1
The overall performance numbers used in this example are measured in microsec-
onds and the total boot time is described in seconds. Total boot time is measured as
the time between the CPU coming out of reset and the handoff to the next boot agent.
It should also be noted that this proof of concept was intended to emulate real-world
expectations of a system BIOS, meaning nothing was done to achieve results that
could not reasonably be expected in a mass-market product design. The steps that
were taken for this effort should be easily portable to other designs and should largely
be codebase-independent.
Table 10.1 lists the performance numbers achieved while maintaining broad mar-
ket requirements.
Example 2
Measurements were taken starting from a configuration based on the original Intel
Atom processor. The performance numbers achieved are listed in Table 10.2.
– Intel Atom Processor Z530/Z510 (C0 stepping)
– Intel® SCH US15W chipset (D1 stepping)
– 512 MB DDR2 memory at 400 MHz
– 2 MB flash
While a different combination of techniques was used between these two teams in
these examples, it should be noted that the second trial was more focused on the em-
bedded market scenarios and striving to simply reduce boot times as much as possi-
ble without regard to the broad market. The difference, while noticeable, would not
be that great.
Example 1 Details
Some of the details are listed in Table 10.3.
Admittedly, broad marketing requirements are not the first thing that comes to mind when a developer sits down to optimize firmware for performance; however, the
reality is that marketing requirements form the practical limits for how the technical
solution can be adjusted.
Answering some basic questions can help you make decisions that will set outer
bounds and define the performance characteristics of the system. Since this section
details the engineering responses to marketing requirements, it does not provide a
vast array of code optimization tricks. Unless the code has a serious set of implementation bugs, the majority of boot speed improvements can be achieved from the following guidelines; the tricks spelled out here are codebase-independent.
How does the user need to use the platform? Is it a closed-box system? Is it more of a
traditional desktop PC? Is it a server-based system with some unique add-ons? How
the platform is thought of will ultimately affect what users expect. Making conscious
design choices to either enable or limit some of these expectations is where the plat-
form policies can greatly affect the resulting performance characteristics.
Changes in the BIOS codebase that avoided the unnecessary creation of certain tables saved roughly
400 ms in the boot time.
Are the target operating systems UEFI-compliant or not? If all the OS targets are UEFI-
compliant, then the platform can save roughly 0.5 second in initialization of the video
option ROM. In this case, there were two operating systems that needed to be booted
on the same motherboard. We had conflicting requirements where one was UEFI-
compliant and one was not. There are a variety of tricks that could have been achieved
by the platform BIOS when booting the UEFI-compliant OS, but for purposes of keeping fair measurement numbers, the overall boot speed numbers reflect the overhead of supporting legacy operating systems as well (the compatibility support module [CSM] was executed).
Trick: To save an additional 0.5 second or more of boot time when booting a UEFI-compliant OS, the
BDS could analyze the target BOOT#### variable to determine if the target was associated with an
OS loader—thus it is a UEFI target. The platform in this case at least has the option to avoid some of
the overhead associated with the legacy compatibility support infrastructure.
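The trick above can be sketched as a simple check on the boot target. The device path format here is a simplified text form (not the binary UEFI encoding), and the helper names are hypothetical:

```python
# Hedged sketch of the BDS trick described above: inspect the boot target's
# device path and skip legacy (CSM) setup when it names a UEFI OS loader.
def is_uefi_target(device_path: str) -> bool:
    """A target that ends in an EFI application is a UEFI boot target."""
    return device_path.lower().endswith(".efi")

def select_boot_services(device_path):
    if is_uefi_target(device_path):
        return ["native_uefi_drivers"]          # skip the CSM, save ~0.5 s
    return ["native_uefi_drivers", "csm"]       # a legacy OS needs the CSM

print(select_boot_services(r"HD(Part3,Sig00110011)\EFI\Boot\OSLoader.efi"))
```

A real implementation would parse the BOOT#### variable's packed device path rather than a string, but the decision logic is the same.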
One reason why launching legacy option ROMs is fraught with peril for boot perfor-
mance is that there are no rules associated with what a legacy option
ROM will do while it has control of the system. In some cases, the option ROM
may be rather innocuous regarding boot performance, but in other instances that is
not the case. For example, the legacy option ROM could attempt to interact with the
user during launch. This normally involves advertising a hot key or two for the user
to press, which would delay the BIOS in finishing its job for however long the option
ROM pauses waiting for a keystroke.
Trick: For this particular situation, we avoided launching all the drivers in a particular BIOS and in-
stead opted to launch only the drivers necessary for reaching the boot target itself. Since the device
we were booting from was a SATA device for which the BIOS had a native UEFI driver, there was no
need to launch an option ROM. This action alone saved approximately three seconds on the platform.
More details associated with this trick and others are in the Additional Details section.
This is a crucial element for many platforms, especially from a marketing point of
view. The display of the splash screen itself typically does not take that much time.
Usually, initializing the video device to enable such a display takes a sizable amount
of time. On the proof-of-concept platform, it would typically take 300 ms. An im-
portant question is how long does marketing want the logo to be displayed? The an-
swer to this question will focus on what is most important for the OEM delivering the
platform. Sometimes speed is paramount (as it was with this proof of concept), and
the splash screen can be eliminated completely. Other times, the display of the logo
is deemed much more important and all things stop while the logo is displayed. An
engineer’s hands are usually tied by the decisions of the marketing infrastructure.
One could leverage the UEFI event services to take advantage of the marketing-
driven delay to accomplish other things, which effectively makes some of the initial-
ization parallel.
In the proof-of-concept platform description, one element was a bit unusual. There
was a performance and a standard configuration associated with the drive attached
to the system. Though it may not be obvious, the choice of boot media can be a sig-
nificant element in the boot time when you consider that some drives require 1 to 5
seconds (or much more) to spin up. The characteristics of the boot media are very
important since, regardless of whatever else you might do to optimize the boot pro-
cess, the platform still has to read from the boot media, and there are some inherent
tasks associated with doing that. Spin-up delays are among those tasks that are una-
voidable in today’s rotating magnetic media.
For the proof of concept, the boot media of choice was one that incurs no spin-up
penalty; thus, a solid-state drive (SSD) was chosen. This saved about two seconds
from the boot time.
How a platform handles a BIOS update or recovery can affect the performance of
a platform. Since this task can be accomplished in many ways, this may inevitably be
one of those mechanisms with lots of platform variability. There are a few very com-
mon cases on how a BIOS update is achieved from a user’s perspective:
1. A user executes an OS application, which he or she likely downloaded from the
OEM’s Web site. This will eventually cause the machine to reboot.
2. A user downloads a special file from an OEM’s website and puts it on a USB don-
gle and reboots the platform with the USB dongle connected.
3. A user receives or creates a CD or flash drive with a special file and reboots the
platform to launch the BIOS update utility contained within that special file.
These user scenarios usually resolve into the BIOS, during the initialization caused
by the reboot, reading the update/recovery file from a particular location. Where
that update/recovery file is stored and when it is processed is really what affects
performance.
Frequently during recovery, one cannot presume that the target OS is working. For a
reasonable platform design, someone would need to design a means by which to up-
date or recover the BIOS without the assistance of the OS. This would lead to user
scenarios 2 or 3 listed above.
The question engineers should ask themselves is, how do you notify the BIOS that
the platform is in recovery mode? Depending on what the platform policy prescribes,
this method can vary greatly. One option is to always probe a given set of possible
data repositories (such as USB media, a CD, or maybe even the network) for recovery
content. The act of always probing is typically a time-consuming effort and not con-
ducive to quick boot times.
There is definitely the option of having a platform-specific action that is easy and
quick to probe that “turns on” the recovery mode. How to turn on the recovery mode
(if such a concept exists for the platform) is very specific to the platform. Examples of
this are holding down a particular key (maybe associated with a GPIO), flipping a
switch (equivalent of moving a jumper) that can be probed, and so on. These methods are highly preferable since they allow a platform to run without much burden (no extensive probing for update/recovery).
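A strap or GPIO check of the kind described is effectively free compared with probing media. A minimal sketch, with a hypothetical pin name:

```python
# Sketch: a cheap recovery-mode check (one strap/GPIO read) instead of
# probing every possible recovery medium. The pin name is hypothetical.
def recovery_requested(read_gpio) -> bool:
    # A single register read; negligible cost on the normal boot path.
    return read_gpio("RECOVERY_STRAP") == 1

# Normal boot: the strap reads 0, so all recovery probing is skipped.
print(recovery_requested(lambda pin: 0))
```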
Normally the overall goal is to boot the target OS as quickly as possible and the only
expected user interaction is with the OS. That being said, the main reason people to-
day interact with the BIOS is to launch the BIOS setup. Admittedly, some settings
within this environment are unique and cannot be properly configured outside of the
BIOS. However, at least one major OEM (if not more) has chosen to ship millions of
UEFI-based units without exposing what is considered a BIOS setup. It might be rea-
sonable to presume for some platforms that the established factory default settings
are sufficient and require no user adjustments. Most OEMs do not go this route. How-
ever, it is certainly possible for an OEM to expose applets within the OS to provide
some of the configurability that would have otherwise been exposed in the pre-OS
phase.
With the advent of UEFI 2.1 (and more specifically the Human Interface Infra-
structure [HII] content in that specification), it became possible for configuration data
in the BIOS to be exposed to the OS. In this way, many of the BIOS settings can have
methods exposed and configured in what are not traditional (pre-OS) ways.
If it is deemed unnecessary to interact with the BIOS, there is very little reason
(except as noted in prior sections) for the BIOS to probe for a hot key. This only takes
time from a platform boot without being a useful feature of the platform.
A Note of Caution
When trying to optimize the settings for hardware and OS interaction features, such
as power management flows, which are enabled and controlled via a combination of
hardware and software, it is important not to oversimplify the firmware setting. Often,
the tradeoffs and ramifications are going to be beyond the simple boot flow to get to
the OS. For these settings, extreme care should be taken to understand the down-
stream usage model and workloads that are going to be exercising these features. Ex-
periment with these settings, but do not be surprised if the resulting data does not
support your original hypothesis.
Additional Details
When it comes time to address some codebase issues, the marketing requirements
clearly define the problem space an engineer has to design around. With that infor-
mation, there are several methods that can help that are fairly typical of a UEFI-based
platform. This is not intended to indicate these are the only methods, but they are the
ones most any UEFI codebase can exercise.
If we refer again to the boot times for our proof of concept, it should be noted that the BDS phase was where the majority of time was reduced. Most of the reduction had to do with optimizations as well as some of the design choices that were made and the phase of initialization where that activity often takes place.
At its simplest, the BDS phase is the means by which the BIOS completes any
required hardware initialization so that it can launch the boot target. At its most com-
plex, you can add a series of platform-specific, extensive, value-added hardware ini-
tialization not required for launching the boot target.
Acpi(PNP0A03,0)/Pci(1F|1)/Ata(Primary,Master)/
HD(Part3,Sig00110011)/"\EFI\Boot"/"OSLoader.efi"
Given a boot target like the device path above, the BDS can connect only the devices directly associated with the boot target. Figure 10.3 shows an example of that logic.
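That connect-only-what-is-needed logic can be sketched as follows, using the simplified text form of the device path above (the extra USB host controller is a hypothetical example of a device that gets skipped):

```python
# Sketch: connect only the controllers named in the boot target's device
# path; everything else (and its option ROMs) is left alone at boot.
target = ("Acpi(PNP0A03,0)/Pci(1F|1)/Ata(Primary,Master)/"
          "HD(Part3,Sig00110011)")

all_controllers = [
    "Acpi(PNP0A03,0)", "Pci(1F|1)", "Pci(1D|0)",   # Pci(1D|0): e.g. a USB hc
    "Ata(Primary,Master)", "HD(Part3,Sig00110011)",
]

needed = target.split("/")               # nodes on the boot path, in order
connected = [c for c in all_controllers if c in needed]
skipped   = [c for c in all_controllers if c not in needed]

print(connected)   # only the boot path is connected
print(skipped)     # deferred until the OS (or a later BDS pass) needs them
```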
Example 2 Details
Table 10.4 depicts the results of the boot time investigation, followed by in- depth
discussion of each change.
Turning off debugging removes the serial debugging information and turns on C compiler optimizations. Because framework BIOS is almost entirely written in C (unlike earlier BIOS versions, which were written in assembly language), enabling compiler optimizations is particularly useful.
EFI BIOS makes extensive use of spewing debugging information over a serial
port to aid development and debug of the BIOS. While this debugging information is
valuable in a development environment, it can be eliminated in a production envi-
ronment. In addition to the boot time improvements realized, there are also IP con-
siderations to shipping a BIOS developed in a higher-level language that has debug-
ging information enabled.
Likewise, compiler optimizations must be disabled when source-level debug of the C language BIOS code is desired. In production, however, source-level debug is not needed, so the optimizations can remain enabled.
This step represents modifying the BIOS build process to use a 1 MB flash part instead of the original 2 MB flash part. This includes decreasing the number of flash blocks used in the build. The framework BIOS uses a flash file system to store PEI
and DXE modules as well as other entities. Decreasing the number of blocks in this
file system improves the access time each time a module is accessed.
Many of the PEI modules in framework BIOS must be executed prior to platform
memory being initialized. Once memory is initialized, the remaining portions of the
BIOS are copied to system memory and executed out of memory. Because the frame-
work BIOS uses cache-as-RAM (CAR) for pre-memory storage and stack, it runs all the
PEI modules in place directly from the flash part without caching. It is possible, how-
ever, on the Intel® Atom™ processor to simultaneously enable CAR plus one 64 KB
region of normally cached address space. The BIOS must be arranged in such a way
to take full advantage of this one prememory cacheable region. This means having a
separate flash file system used exclusively by the PEI modules that are run prior to
memory initialization and placing that file system in the region that will be cached.
By employing this technique to cache the portion of the flash part that includes the
PEI modules executing prior to initialization of memory, performance is increased.
For this effort, the 64 KB region was unable to cover all the PEI modules. Through
further reduction in size of PEI, more improvement is possible.
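The constraint in that last point is a simple fit check. The module names and sizes below are hypothetical, chosen only to show how the 64 KB window can come up short:

```python
# Does the pre-memory PEI flash file system fit in the single 64 KB
# cacheable window available alongside cache-as-RAM? Sizes are made up.
CACHEABLE_WINDOW = 64 * 1024

pei_modules = {             # name -> size in bytes (illustrative)
    "SecCore":     8 * 1024,
    "PeiCore":    24 * 1024,
    "MemoryInit": 40 * 1024,
}

total = sum(pei_modules.values())
fits = total <= CACHEABLE_WINDOW
# If it doesn't fit, either shrink PEI or accept uncached execution
# for the overflow modules.
print(total, fits)
```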
Intel SpeedStep® technology is a power savings technology used to limit the processor
speed based on current software demand on the system. When the system is in heavy
use, the processor is run at full speed. At idle and near idle, the system is run at slower
speed “steps.”
The BIOS role in initialization of Intel SpeedStep technology includes detection
of Intel SpeedStep technology capabilities, initialization of the processor speed for all
processor threads, and advertisement of Intel SpeedStep capabilities for the operat-
ing system. The above “initialization of all processor threads” is typically to the
“power on” speed, which is normally equal to the lowest supported speed. This is to
ensure increased stability during the boot process. The operating system will then
enable the faster steps.
To increase boot speed, the BIOS, instead of enabling the “power on” feature of
Intel SpeedStep® technology, can enable the highest speed setting. This not only in-
creases the speed of the processor during a BIOS post, but also increases the speed of
loading the operating system.
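The change reduces to picking the top of the P-state table instead of the bottom. A minimal sketch, with an illustrative frequency table (real values come from the processor's capability enumeration):

```python
# Sketch: instead of the "power on" (lowest) speed, program the highest
# supported P-state before POST. The frequency table is illustrative.
p_states_mhz = [600, 800, 1100, 1600]   # lowest ... highest supported

def boot_frequency(fast_boot: bool) -> int:
    # Traditional BIOS: lowest step for stability; fast boot: highest step.
    return max(p_states_mhz) if fast_boot else min(p_states_mhz)

print(boot_frequency(False), boot_frequency(True))
```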
The framework BIOS BDS phase normally looks for potential boot devices from hard
drives, CD-ROM drives, floppy drives, and network. On the investigation platform, the
requirements were for boot from hard disk. This optimization removes the BDS checks
for boot devices on CD-ROM, floppy, and network since they are not supported on this
platform. If the operating system is being loaded from flash instead of hard disk, the
hard disk would be replaced with an optimized flash boot.
Noticeable boot time improvement was realized by using the highest memory speed
supported on the platform. On platforms featuring the Intel Atom processor, this is
determined by a set of jumpers on the board. On other platforms, this may be a setting
in BIOS setup.
Initialization of the PS/2 keyboard and mouse in the BIOS takes a considerable
amount of time due to the specification of these devices. This eliminates the possibil-
ity of interacting with the BIOS during the BIOS post and operating system loader. If
BIOS setup has been removed as discussed below, user input is not needed during the
BIOS. On a fielded embedded device, keyboard and mouse are typically not needed
and therefore do not need to be initialized. During device development and debug
this feature might be better left on until the device is operational.
During the boot process, the BIOS provides an opportunity for the user to hit a hot key
that terminates the boot process and instead displays a menu used to modify various
platform settings. This includes settings such as boot order, disabling various proces-
sor or chipset features, and modifying media parameters. On an embedded device,
BIOS setup (and any similar settings provided by an operating system loader) is more
of a liability since it gives the end user access to BIOS features that are potentially
untested on the device. It is better to have a set of setup options that may be chosen
at BIOS build time. Removal of the BIOS setup also saves significant BIOS post time.
The video option ROM on platforms featuring the Intel Atom processor and many
other newer platforms is quite slow due to the many different video interfaces that
must be supported and the display detection algorithms that must be employed. Re-
placement of a highly optimized DXE video driver in place of the video option ROM
saves significant boot time. The speed of this optimized DXE video driver is highly
dependent on the exact display chosen and video features required by the platform
prior to installation of the operating system graphics driver. On the investigation plat-
form, a fixed splash screen was displayed. It was unchanged during the boot process
until the operating system graphics driver initialized the display. To achieve this, the
capability to display text was removed from the DXE graphics driver. As a result, none
of the normal BIOS or operating system initialization messages is displayed. This
yields additional performance and a cleaner boot appearance.
Since USB-boot and a USB input device like a keyboard/mouse were not requirements
on the investigation platform, initialization of USB was completely removed from
boot flow. USB is still available from the operating system once the driver is loaded
but not during the BIOS or OS loader.
Divide Long Lead Pieces into Functional Blocks and Distribute Across the Boot Flow
While BIOS today is not multithreaded, there are some things that can still be done in
parallel. Not waiting around for long timeouts or scans and breaking up the activities
between multiple functions across the boot flow are two good ways to eliminate idle
delay times. Example: command the hard drives to spin up early in the boot flow. It should be noted that even solid-state drives have a minimum firmware readiness time before data can be retrieved. This can be mitigated by warming up the drives early. Similarly, newer LCD displays may have a minimum time required to turn on the backlight, or may perform a costly 900 ms reset on each power cycle.
Keeping the CPU fully occupied as well as any DMA engine during OS loading or
BIOS shadow is another method of running some of the longer lead activity in paral-
lel. One can start to load the data from the storage to memory at the same time you
are executing other items. Bottom line: don’t stand around whistling while you wait
for hardware or timeouts when you can be doing real work.
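The overlap idea can be sketched with two timelines. The delays below are illustrative stand-ins for drive spin-up and other initialization work, not measured values:

```python
# Sketch of overlapping a long-lead operation (drive spin-up) with other
# init work instead of serializing the waits. Delays are illustrative.
import threading, time

def spin_up_drive(done: threading.Event):
    time.sleep(0.10)          # stand-in for drive readiness time
    done.set()

drive_ready = threading.Event()
start = time.monotonic()
threading.Thread(target=spin_up_drive, args=(drive_ready,)).start()

time.sleep(0.08)              # other init runs while the drive warms up
drive_ready.wait()            # only the remainder of the spin-up is paid
elapsed = time.monotonic() - start
print(f"total {elapsed:.2f}s instead of 0.18s, because the waits overlapped")
```

Firmware is single-threaded, so in practice this means issuing the spin-up command early, doing other work, then polling for readiness just before it is needed.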
Summary
This chapter covers several potential optimizations that can be implemented on a
broad array of embedded platforms, depending on the policy decisions and require-
ments of the platform.
It is important to note that some latent firmware bugs were found that only became apparent when optimizations were enabled. For this reason, rigorous validation is recommended after enabling these and other optimizations.
In creating new paths for optimizations, it is important to get as much data as you
can from experts on a particular component or subsystem. Often the details will not
be written down in specifications.
None of us got where we are solely by pulling ourselves up by our bootstraps. We got here be-
cause somebody—a parent, a teacher, an Ivy League crony or a few nuns—bent down and helped
us pick up our boots.
—Thurgood Marshall
Chapter 11
Intel’s Fast Boot Technology
A little simplification would be the first step toward rational living, I think.
—Eleanor Roosevelt
One of the key objectives of computing platforms is responsiveness. BIOS boot time
is a key factor in responsiveness that OEM manufacturers, end users, and OS vendors
ask firmware developers to deliver. Here the focus is on system startup time and re-
sume times.
Traditional Intel architecture BIOS has labored through the years to design itself to boot on any configuration that the platform can discover; each time it boots, it works to rediscover any change to the machine configuration that may have happened while it was off. This provides the most robust, flexible experience possible for dynamic open boxes with a single binary. However, this has also
resulted in a bloated boot time upwards of 30 seconds for the BIOS alone. As we have
covered in other chapters, when properly tuned and equipped, a closed box consumer
electronics or embedded computing device powering on can easily boot in under two
seconds. This normally requires a customized hard-coded solution or policy-based
decisions, which can cost several months of optimizations to an embedded design.
By following the methods contained in this chapter, even open-box PCs can be as fast
as embedded devices and consumer electronics, less than two seconds. And the ben-
efit to embedded designs is that they need not spend weeks and months of analysis
per design to get the desired effect on boot speed.
DOI 10.1515/9781501506819-011
1 S.C. Seow, Designing and Engineering Time: The Psychology of Time Perception in Software, Addi-
son-Wesley Professional, 2008, Chapter 4.
The Human Factor
The next set of similarities was determining, without a stopwatch, a delta between
times. Variations may not be noticeable to the untimed eye.
Seow suggested a basic “20 percent rule” for perceptible differences. Miller’s data
was a bit more refined (as Figure 11.3 shows):
– 75% of people cannot detect a change of ±8 percent between 2 and 4 seconds
– From 0.6 to 0.8 seconds, there was 10 percent variation
– From 6 to 30 seconds, a 20–30 percent variation
For typical system boot times where the BIOS executes in 6 seconds to 15 seconds, the
times would have to be improved between 1.2 and 4.5 seconds for someone to take
notice and appreciate the work of a developer trying to make improvements at the
millisecond level.
It should be noted that many little changes can add up to a lot. Even in the subsecond range, Miller's results imply that a difference of approximately 80 milliseconds is needed before users perceive an improvement.
Responsiveness
Looking across the system, in order to achieve what your brain thinks is the right tim-
ing, many things must be aligned for speed and performance, not just during runtime,
but during startup and resume operations, as well as during sleep and shutdown.
Every millisecond wasted burns milliwatts, costing time and energy.
We need responsiveness across all levels of the platform, as shown in Figure 11.4:
– Low latency hardware. Today’s silicon and hardware provide much greater speed
than they are normally tuned for. While people think of a PC as booting in 45
seconds or more, the same hardware, properly tuned, can execute a viable boot
path in less than 1 second.
– Along these lines, power sequencing of the various power rails on the platform is
an area that needs to be examined. Between 300 and 700 ms can be absorbed by
just the power sequencing. This is before the CPU comes out of reset and the sys-
tem firmware starts operating.
The (Green) Machine Factor
– Fast boot BIOS and firmware. System BIOS and embedded microcontroller firm-
ware components are moving forward in ways that they didn’t in years past, or
didn’t need to. Silicon vendors often provide sample boot code that works to en-
able the platform through initialization. What is needed is more responsive sili-
con reference code.
– Operating system. Intel has been working with OS vendors to optimize the user
experience. For developers, it is easy to see where a very modular and diverse mix
of components can work if architected correctly, but many solutions are not fully
optimized. Linux teams in the Intel Open Source Technology Center (OTC) are en-
gaged to speed up the experience from kernel load and driver startup times. And
other operating systems are not being ignored.
– Driver optimizations. Intel OS drivers are measured using various tools, depending
on the OS. Intel is reducing the load and execution times for all of our drivers. We
are continually investigating any optimizations possible to improve the responsive-
ness of our ingredients, including the graphics and wireless device drivers.
– Middleware. This is a topic that you will have to consider: the applications that use
it and how it can be altered to take advantage of offload engines on the system.
– Applications. Like other companies, Intel is investing in its application store, as
well as working with others in the industry to streamline their applications with
various tools from the Intel tool suites. There are also a variety of applications
that assist in the debug and monitoring of the system, which we provide to all of
our customers.
Depending on the use of solid-state drives, there are fundamental things that can be
done differently at the OS and application levels that can have a profound impact on
application responsiveness. This is an area we need to work on, this is where our cus-
tomers work, and this is where it will count the most. The user experience is here, if
we did the rest of our jobs correctly.
And let’s not forget that responsiveness doesn’t end at the platform level; it ex-
tends into the local network and into the ubiquitous cloud.
low data amounts and infrequent access, and can stand to wait several seconds or
minutes or hours for a system to resume from slumber and complete a task.
Real-time systems allow priorities to be set on devices as well as execution
threads, so designers can determine ahead of time whether anything in the system
must wait even a millisecond. The more responsive the system is, the more
flexibility the real-time system designer has.
The faster the response times can be, the deeper the sleep state can be and the
less power is required to keep the system active over time. Example: If the system can
be made to boot in less than two seconds from an OFF (S4) state, where power is nor-
mally 2 mW, then why put the system into S3, where the resume time is about 1 second
and the power is several hundred mW? Obviously, it depends on the resume times,
power requirements, and usage model tradeoffs. But the faster response times from
lower power states can prove extremely beneficial. Another example is Enhanced In-
tel SpeedStep® Technology, where the CPU can dynamically enter a lower power
state and run at a lower frequency until the performance is required. Average power
is a key factor when working on more ecologically sensitive, green infrastructures,
from servers to sensors. Responsiveness capabilities provide an ability to lower power
overall and create more easily sustainable technology.
These logging routines will dump data into a reserved memory location from a cold
boot for retrieval later.
One way to overcome the software logging issue is to have the hardware instrumented
with a logic analyzer. Depending on the motherboard layout and test points available,
you should be able to probe the motherboard for different signals that will respond to
the initialization as it happens. If no known good test point exists, a GPIO can be set
up as the start and end point. Sending the I/O commands takes some time, so it is not
ideal.
Using hardware measuring techniques brings further complications. It is likely
that the hardware power sequencing takes upwards of 100 ms alone to execute before
the processor is taken out of reset and the BIOS or bootloader code can begin to exe-
cute. From a user's perspective, the clock starts when they push the button, and
their eyes expect a response within a few seconds. So from the BIOS point of
view, this hardware power sequencing is an unavoidable handicap.
With the addition of any large number of experimental test points, it is possible
to incur an observer effect or to change the boot environment, slowing it down with
the extra cycles being added. Example: if you turn on debug mode, or if you do an
excessive number of I/Os, the performance can be heavily affected by up to 30 percent
in some cases. Be aware of this effect and don’t chase ghosts. This concept is outlined
in Mytkowicz et al. (2008).
160 Chapter 11: Intel’s Fast Boot Technology
Once we have the data, then the fun truly begins. A quick Pareto chart
summarizing the timing data per block may help developers focus on the top
20 percent of the longest-latency items, which may account for up to 80 percent
of the boot time. These items can be reduced first; then dig into the shorter
portions. When attacking this problem, it is a good idea to step back and look
at the bigger picture before charging ahead and tackling it feature by feature.
In the ACPI system state description, the system starts up from G3, G2/S5, or
G1/S4, and ends in the G0/S0 working state. Orthogonal to the ACPI Global and
Sleep states, Intel has defined Fast Boot states that can be thought of as:
– B0. First boot, in order to robustly boot on any configuration, firmware dynami-
cally scans all enumerable buses and configures all applicable peripherals re-
quired to boot the OS.
– B1. Full boot, similar to first boot, whenever a configuration change is noted.
– BN. Typical subsequent boot, which reuses data from previous scans to initialize
the platform. This results in a sub-two-second user experience.
Some environments may not be suitable for such a Fast Boot scenario: IT activity,
development activity, or a lab environment may not provide enough flexibility. For
those cases, the atypical full boot path or first boot paths continue to be supported.
Decisions can be made automatically or be triggered by the user or another external event.
In general, the B-states can be aligned to the UEFI-defined boot modes listed in Table 11.1.
Table 11.1: B-states and their corresponding UEFI boot modes
B0: BOOT_WITH_DEFAULT_SETTINGS
B1: BOOT_WITH_FULL_CONFIGURATION
B(n): BOOT_WITH_MINIMAL_CONFIGURATION
The following list shows a high-level breakdown of the two-second budget. It does
not assume the scripted boot method, which may be faster in certain scenarios:
1. SEC/PEI phase budget – 500 ms, where:
– Memory is configured, initialized, and scrubbed (if needed).
– BIOS has been shadowed to memory and is currently executing from
memory.
– ME-BIOS DID handshake is done and no exception or fallback request was
raised.
– Long latency hardware (such as HDD, eDP panel) has been powered up.
– CPU is completely patch-updated, including the second CPU patch microcode
update.
Fallback Mechanisms
There are several events that could cause a full boot to transpire following the next
system reset, including hardware, BIOS, or software level changes.
– Certain hardware changes that are relatively easily detected by BIOS include
CPU, memory, display, input, boot target, and RTC battery errors.
– BIOS setting changes that will cause an event may include changes to the boot
target and console input/output changes.
– Software changes include UEFI updates.
Not every exception event is likely to trigger a full boot on the next boot sequence or
involve interrupting the current boot with a system reset. Table 11.2 depicts the differ-
ent types of exceptions that could occur.
[Table 11.2 column headings: Exception Type | Current Boot Flow | Requires RESET | Next Boot Flow]
A Type 1 Exception can be completely handled within an EFI module, and the
remainder of the BIOS boot flow will continue with the
BOOT_WITH_MINIMAL_CONFIGURATION flag. Optionally, the exception can be logged
for diagnostic purposes.
A Type 2 Exception occurs if a BIOS module encounters an issue that prevents
the rest of the BIOS from continuing in the Fast Boot framework. The remainder
of the boot sequence may execute more slowly than anticipated, but no reset is
needed. The boot will continue, and it is recommended that the event be logged.
Baseline Assumptions for Enabling Intel Fast Boot 163
A Type 3 Exception requires an interruption of the current boot flow with a
system reset. The next boot will be a full boot (without the
BOOT_WITH_MINIMAL_CONFIGURATION flag) to handle the exception. Unless the error
was generated during memory initialization, it is unlikely that a full MRC
training would be required on top of the full boot.
A Type 4 Exception is similar to Type 3, except that after the system reset, the
BIOS is in Full Boot mode, and Memory Initialization needs to perform full training.
Table 11.3 lists a series of exceptions and the probable type casting. Several of
these can be different, depending on the policy decisions of the designer.
2. No UEFI shell or serial redirection debug console, as any user interaction or con-
figuration change will negate Fast Boot times.
3. UEFI-only boot. A legacy boot may be necessary for older OS loaders and operating
systems, but this will inevitably delay boot times as well as open the system
to potential rootkit attacks.
4. Setup Menu or other user entry (hot keys) during boot is not required. When it is
entered, boot speed is no longer a key requirement.
5. When referring to sub-two-second timing the start and finish lines are described
in the following manner:
– The Start Line is when the processor comes out of reset and starts to fetch
code from the SPI. While the starting line could instead be drawn at the power
button, depending on the scope of the parameters, the BIOS is not responsible
for the approximately 300 ms of power sequencing between the button press and
the CPU coming out of reset.
– The Finish Line is when the UEFI BIOS calls LoadImage() for the OS loader
image. On loading of the image, the system enters a handoff phase in which
responsibility shifts largely from the BIOS to the OS loader.
Summary
This chapter introduced a very powerful Fast Boot mechanism that can be applied to
UEFI solutions. While this mechanism provides dynamic and reliable time savings
over a monotonously long, slow boot, other optimizations must still be performed
for optimum results. As we have seen in other chapters, hardware selection, OS
loader optimizations, and OS interface come into play. Due diligence of the development and
design teams is vital to a successful Fast Boot user experience.
Developers should read Chapter 12, and then reread Chapters 10, 11, and 12. Then
ask yourself and others the right questions, including: “What will it take to do this?”
Chapter 12
Collaborative Roles in Quick Boot
Every sin is the result of a collaboration.
—Lucius Annaeus Seneca
Collaboration between hardware, firmware, and software is vital to achieve a fast boot
experience on the system. If the hardware chosen is incapable of achieving a sub-
second initialization time, or if the software is not optimized to take advantage of a
Fast Boot time, then the investment spent in firmware optimization with either cus-
tom activity or implementing a systematic Fast Boot architecture is wasted. Below
are several techniques that can improve boot times: picking the right device,
optimizing the way that device is initialized, or doing the minimum needed so
that a future driver can take full advantage of a particular subsystem once its
driver is loaded.
Power Sequencing
If measuring time from the power button, then the motherboard hardware power se-
quencing can provide a source of delay. If simplified power plans merge the Manage-
ability Engine’s power plane with other devices, then an additional 70 ms may be
required.
If you’re using PCI or PCIe add-in cards on a desktop system, then the PC AT power
supply (silver box) will have an additional 100 ms delay for the PCI add-in card
onboard firmware to execute. If you’re using a system with an embedded controller
(mobile client or an all-in-one desktop), then there can be an additional power button
debounce in embedded controller firmware that can be as much as 100 ms.
Flash Subsystem
The system should use the highest speed SPI flash components supported by the chipset.
DOI 10.1515/9781501506819-012
When selecting an SPI component, note that a slower 33 MHz SPI chip can slow the
system boot time by 50 percent as compared with a 50 MHz SPI component. Single-
byte reads versus multiple-byte reads can also affect performance. Designers
should select at least a 50 MHz component. Intel series 6 and 7 chipsets support 50 MHz DOFR
(Dual Output Fast Read). There are also components that will support Quad Fast
Read. As the SPI interface is normally a bottleneck on boot speed, this small change
can mean a lot to overall performance, and it is a relatively small difference on the
bill of material.
Besides configuring for faster read access and enabling prefetch, further
optimizations can be done to reduce flash accesses. For example, BIOS setup data
could be a few kilobytes in size; each time a setup option is referenced, it
could cost about 1 ms on a 33 MHz SPI, and there could be several references.
Optimization can also be done by reinstalling the read-only variable PPI with a
new version that keeps a copy of the setup data in CAR/memory, so the setup data
is read only once from the SPI. It can also be cached in memory for S3 needs, to
prevent unnecessary SPI access during S3 resume.
It is possible to enable the buffers on the components to prefetch data from the flash
device. If the chipset is capable, set up the SPI Prefetch as early as the SEC phase. It
is recommended to do performance profiling to compare prefetch-enabled boot time
of each of the individual UEFI modules to determine potential impact. The time when
firmware volumes are shadowed can be particularly interesting to examine.
BIOS can minimize the number of flash writes by combining several writes into one.
Any content in SPI flash should be read at most once during each boot and then stored
in memory/variable for future reference. Refer to PCH BIOS Writer’s Guide for the op-
timal SPI prefetch setting. The PCH prefetching algorithm is optimized for serial ac-
cess in the hardware; however, if the firmware is not laid out in a sequential nature,
the prefetch feature may need to be turned off if the read addresses are random (see
also “EDK II Performance Optimization Guide – Section 8.3”).
SPI flash access latency can be greatly improved with hardware and firmware co-
design. Table 12.1 presents some guidelines that, when followed, can provide im-
provements of several seconds off a boot.
Table 12.1: Improving SPI Flash Access Latency with Hardware and Firmware Co-Design
Interface and device accesses can be time consuming, either due to the nature of the
interfaces and/or devices, or the necessity of issuing multiple accesses to the partic-
ular controller, bus, and device.
DMI Optimizations
If the PCH and CPU are both capable of flexible interface timings, then faster DMI
settings minimize I/O latency to PCH and onboard devices. The BIOS should be us-
ing the highest DMI link speed, as the DMI default link speed may not be optimal.
The BIOS should enable Gen2 (or highest) DMI link speed as early as possible (in
the SEC phase), if this does not conflict with hardware design constraints. The rea-
son for this quirk is predictable survivability: the default value for DMI link speed
is 2.5 GT/s (Gen 1). A faster DMI link helps in I/O configuration speed by 6 to 14
percent. Thus, it should be trained to 5 GT/s (Gen 2 speed), at an early SEC phase.
There may be reasons why the link should not run at top speed all the time; for
example, a BIOS setup option may control the DMI link speed, yet that option may
only be readable later in the boot, forcing the link to train back down.
Processor Optimizations
The following section describes processor optimizations.
Starting with Sandy Bridge CPUs, which launched in 2011, the CPU frequency at reset
is limited to the lowest frequency supported by the processor. To enable CPU perfor-
mance state (P-state) transitioning, a list of registers must be configured in the correct
order. For the Intel Fast Boot, the following is expected:
1. Every Full Boot shall save the necessary register settings via the UEFI
Variable services, as nonvolatile content.
2. On Fast Boot, the necessary registers needed for enabling P-state will be restored
as soon as possible. By the time that DXE phase is entered, the boot processor
shall be in P0 with turbo enabled (if applicable).
Precise time budgeting has been done for the following sequence of events— from
platform reset to initial BIOS code fetch at the reset vector (memory address). The key
steps within this sequence are:
1. CPU timestamp counter zeroed and counting started at maximum nonturbo fre-
quency.
2. DMI initialization completed.
3. Soft strap and dynamic fusing of CPU completed.
4. CPU patch microcode update read from SPI flash (only once).
5. All logical processors (APs) within the CPU package patch-updated (using the
cached copy of the CPU microcode update).
6. BSP (bootstrap processor) starts fetching the BIOS at reset vector for execution.
In addition to patch-updating, the BIOS needs to replicate memory range and other
CPU settings for all APs (application processors) in a multicore, multithreaded CPU.
The best optimized algorithm may be CPU specific but, in general, the following
guidelines apply:
1. Parallelize microcode updating, MTRR, and other operations in logical core.
2. Minimize synchronization overhead using the best method for the particular CPU
microarchitecture.
Main Memory Subsystem 169
All BIOS code must be executed in cache-enabled state. The data (stack) must be in
cache. These apply to all phases of the BIOS, regardless of memory availability. Un-
less an intentional cache flush operation is done (for security or other reasons), sec-
ond access to the same SPI address should be hitting the CPU cache (see also “EDK II
Performance Optimization Guide – Section 8.2,” but included here for completeness).
When looking at memory configuration, the higher frequency memory will provide
faster boots. Like the processor, the simpler the physical makeup of the memory, the
faster the boot will be. Fewer banks of memory will boot faster than a greater
number of banks. If the memory's overall size is smaller, then the runtime
performance will be limited. Balance a smaller number of banks with
higher-density memory technology to allow for a small but agile memory footprint.
Since 2010, with the Intel® Core™ series CPU, fast memory initialization flow has
been available for typical boot. To accomplish this, for the first boot with a new
memory stick and/or a new processor, there is an involved and intensive (time-con-
suming) training algorithm executed to tune the DDR3 parameters. If the same CPU
and DIMM combination are booted subsequently, major portions of the memory train-
ing can be bypassed.
In fast memory initialization, the MRC is expected to support three primary flows:
1. Full (slow) Memory Initialization. Create memory timing point when CPU and
memory are new and never seen before.
2. Fast Memory Initialization. If the CPU and DIMM have not changed since the
previous boot, this flow is used to restore previous settings.
3. Warm Reset. Power was not removed from DIMM. This flow is used during plat-
form reset and S3 resume.
The three flows can be used in conjunction with the Fast Boot states; however, they
may operate independently of the main Fast Boot UEFI flag setting.
On some SKUs of memory controllers offered by Intel, the hardware can be set to zero
memory for security or ECC requirements. This is not part of the two-second BIOS boot
time budgeting. Traditionally, a complete software-based memory overwrite is a very
time-consuming process, adding seconds to every boot.
Starting with the Sandy Bridge generation CPU, new CPU instructions have been
added to speed up string operations. For memory operations, such as clearing
large buffers, using optimized instructions helps. For more information, please
see EDK II Performance Optimization Guide – Section 8.5.
The PCH SMBus controller has one data/address lane and a clock at 100 kHz. There are
three ways to read data:
– SMBus Byte Read: A single SMBus byte read is 39 bits long, and thus at
minimum one SMBus byte read would take 0.39 ms.
– SMBus Word Read: An SMBus word read is 48 bits, hence 0.48 ms for 2 bytes, or
0.24 ms/byte. Word reads are about 40 percent more efficient than byte reads, but
the bytes we need to read are not always sequential in nature on full boots.
– I2C Read: Where I2C is an alternate mode of operation supported by the PCH
SMBus controller.
With the MRC code today on fast boots, we do not read all the SPD bytes all the time;
we read only the serial number of the DIMMs, unless the DIMMs change. The serial
number read can be performed with sequential reads. Experiments show that using the
I2C read saves a few milliseconds, which count overall.
Minimize BIOS Shadowing Size, Dual DXE Paths for Fast Path versus Full Boot
UEFI BIOS is packaged into multiple firmware volumes. The Fast Boot is enhanced
when there are several DXE firmware volumes instead of one monolithic one. That
means the DXE phase of the BIOS should be split into two or more DXE firmware vol-
umes; for example, a fast one and a slow one (a full one). The fast-boot-capable DXE
firmware volume contains the minimum subset of modules needed for the IFB typical
boot, and the other DXE firmware volume contains the rest of the modules needed for
full boot.
This requirement affects only the firmware volumes that have to be decom-
pressed from SPI flash into memory before execution. To optimize decompression
speed, the BIOS needs to avoid decompressing unnecessary modules that will not be
executed.
This may be done at the DXE driver boundary; however, there is no restriction
preventing module owners from creating a smaller fast boot module and a separate
full boot module for the two different volumes.
There are Mini PCIe enumeration needs for detections that ultimately lead to
function-disabling of a particular PCIe port. These include a PCIe port without
a card detected, and the NAND-over-PCIe initialization flow. All of these must
be combined and done in a single Mini PCIe enumeration.
Manageability Engine
The Manageability Engine (ME) is a security processor subsystem and offload engine
inside the PCH. There are two SKUs of the firmware that runs on the device: 1.5 MB
and 5.0 MB SKUs. The 5.0 MB SKU is associated with advanced security features, such
as Intel® Active Management Technology. The 5.0 MB firmware requires a setup
option ROM called the Manageability Engine BIOS Extension (MEBx), which until
2011 ran on every boot and took time. There are also ME/BIOS interactions during
boot, regardless of the firmware SKU.
Eliminating MEBx
Starting with 2012 platform controller hubs, the 5.0 MB ME firmware eliminates the
need to execute MEBx on Fast Boots. Instead of running MEBx on every boot, MEBx is
run only as needed within the framework on a typical boot, per the UEFI flag.
In addition to the 2012 MEBx optimization, starting in 2012 platforms, during normal
boot there are only two architecturally defined sync-points between ME and BIOS re-
maining:
1. Device Initialization Done (DID). This happens as soon as memory is available for
ME use following MRC initialization. The time is estimated to be between 50 ms
and 120 ms after TSC starts, depending on the MRC and CPU requirements.
2. End of POST (EOP). This happens before the BIOS processes the boot list (at
the end of the DXE phase). It is estimated to be 700 ms after TSC starts.
All other ME-BIOS communication will happen asynchronously outside of these two
sync-points (that is, no waiting for the other execution engine). The MEBx (ME BIOS
extension) module is not expected to be called in a typical BIOS boot. If it is needed,
it can be called via the exception handling methodology defined in Intel Fast Boot
framework.
Hardware Asset Reporting for Intel® Active Management Technology (Intel AMT)
Within the Fast Boot framework, SMBIOS, PCI Asset, and ASF tables are always up-
dated and passed to the ME Intel AMT firmware (5MB in size) regardless of boot mode.
For the media table, the BIOS will enumerate all storage controllers and attached
devices during full boot and upon request by the Intel AMT firmware. Upon detecting
an Intel AMT firmware request, BIOS will enumerate all media (except USB) devices
to generate and pass the media table. The heuristic for how frequently Intel AMT
will request this is the responsibility of the Intel AMT design.
Graphics Subsystem
The following section describes the graphics subsystem.
When looking at video and graphics devices, the panel timings were mentioned
above. The controller timing and speed are also important to boot speeds—the faster
the better. The timing numbers can be modified, if required, using a BMP utility on
the UEFI GOP driver to help achieve this. A UEFI Graphics Output Protocol driver will
provide faster boot speeds than a legacy video BIOS. Finally, a single graphics solu-
tion will be faster to boot than a multiple display/controller configuration.
For operating systems that support a CSM-free boot, the GOP driver will be loaded by
BIOS instead of CSM legacy video option ROM. This eliminates the time spent on cre-
ating legacy VGA mode display services (INT 10). Benefits can be seen in Figure 12.1
in microseconds based on different ports/monitor selection.
Panel Specification
If you are using an Embedded DisplayPort (eDP) panel with the panel startup time
set per the industry specification, then 250 ms is required during boot just to
reset power on the panel. This is a ridiculously long time to wait for hardware,
as an entire PC motherboard takes about that long to come up through power
sequencing. If the timing is adjusted to what the hardware actually requires to
cycle power, then eDP may prove adequate.
Like the disk drives, the panel now must be started early to parallelize the delays in
the hardware during boot. A PEI module to power up the eDP panel is needed if the
target display panel has a noticeable delay in the power-up sequence. For example,
if a panel requires 300 ms to power up, a PEI module to enable (power up) the eDP
port needs to be executed at least 300 ms before the video module is reached in BDS
phase of BIOS.
Storage Subsystems
The following section describes storage subsystems.
Spinning Media
For spinning media storage devices, the spin-up time for a hard drive is 2 seconds
minimum. Even if the BIOS starts the initialization a short time after it gets
control of the system, the drive may not be ready to provide an OS.
The Intel PCH integrated SATA controller supports Native Command Queuing (NCQ)
in native AHCI mode operation. Unless required by the target operating system,
the BIOS should access the storage subsystem in the most efficient way possible.
For example, in all Windows operating systems since Windows XP, AHCI mode should
be the default SATA storage mode.
Generally, in client platforms, the disk feature of power-up in standby (PUIS) is
disabled. The hard disk will automatically spin up once it receives a COMRESET,
which is sent when the BIOS enables the related SATA port. Spinning up the hard
drive as early as possible in the PEI phase is achieved by enabling the ports
right after setting the SATA DFT.
While SATA SSD NAND drives do not literally spin up, the wear-leveling algo-
rithms and databases required to track the bits take hundreds of milliseconds before
data is ready to be fetched (including identifying drive data). While this can be miti-
gated with SSD firmware enhancements or controller augmentation to store such
data, numbers in the range of 800 ms are probable with the latest SATA3 SSDs at the
time of this writing.
The Intel RST UEFI driver is required to allow for SSD caching of a slower hard drive.
The SSD performance far outweighs the HDD in both read/write and spin-up readi-
ness. This newly optimized UEFI driver is needed to support the CSM-free Class Two
and Class Three UEFI boot mechanism. Elimination of CSM is the primary time saving;
however, the optimizations made to the new UEFI driver over the legacy Intel RST
option ROM are dramatic. As with the MEBx UEFI driver, the Intel RST driver will fol-
low the recommendation of the UEFI flag for Fast Boot. One of the fallback conditions
for Fast Boot must also be that for any drive configuration change, the BIOS must
inform the UEFI option ROMs via the UEFI boot mode flag.
In SDR0 (Single Disk RAID 0) Intel RST caching mode, the Intel RST driver does
not wait for HDD to spin up. It needs to allow data access (read) to OS boot loader as
soon as cached data in SSD are available.
The fastest HDD (at the writing of this chapter) takes about 1.4 to 3 seconds from
power-up to data availability. That is far slower than the 800 ms power-up to data
availability on the Intel SSD (X25-M).
The Intel integrated USB host controller and integrated hubs have much smaller
latencies than the generic host controller and hub settings spelled out in the
USB specifications. The BIOS can be optimized for the integrated components (as
they are always present in the platform) by replacing the default USB
specification timing with the Intel PCH USB timing, as published in the PCH USB
BIOS writer's guide.
For example, the minimum time needed to enumerate all the USB ports on the PCH
(as if they are empty) is approximately 450 ms. Using the Intel PCH
silicon-specific timing guidelines can cut that down by more than half.
Power Management
The following section describes power management.
On several buses in the platform, there is a recommendation for active state power
management (ASPM). The ASPM setting is important in extending battery life during
runtime; however, there is nonzero latency in ASPM state transition. The impact can
be seen in Table 12.2 for both boot and resume times.
The Intel DMI bus supports a PCI ASPM lower link power scheme. To eliminate
potential side effects, enabling DMI ASPM should be done at the last possible
moment in the BIOS initialization flow; in fact, it can be done after POST is
completed.
To delay the setting of the DMI ASPM link states (L0s/L1) to the last possible mo-
ment in the boot, there are three possible options:
1. At ExitBootServices()
2. In ACPI
3. One-shot SMI timer, heuristically determined by experiment, say 8 seconds after
ExitBootServices(), to cover the OS booting period
Option 1 should be selected if we are interested only in BIOS boot time improvement.
Options 2 and 3 could be explored for OS boot time improvement. If we aim to improve the
OS load time as well, we could use the SMI timer for the S4/S5 path, and use the ACPI
method _BFS to set the DMI ASPM for the S3 path, assuming an ACPI compliant OS.
Security
Security at a high level is often a tradeoff versus boot speeds and responsiveness.
Trusted Platform Modules and measured boots will add noticeable time to a boot flow.
Single-threaded boot ROMs, hardware roots of trust, and subsequent daisy-chaining
of authentication take a very long time if not architected for speed (and
security). You need to look at the platform requirements carefully and balance
security against responsiveness. There are some things we can do to mitigate the
security impact on platform boot times.
Intel Trusted Execution Technology includes additional BIOS binary modules that
execute to assist in authentication of subsequent code execution and provide a secure
environment for that activity. It takes time to execute these modules, and it takes a
small amount of time to do authentication prior to executing code in these environ-
ments. Other authentication schemes have similar setup and execution penalties.
Trusted platform modules hash and save results as variables during a secure or meas-
ured boot. The delay associated with a TPM can range from 300 ms to upwards of 1
second, depending on the TPM vendor, the TPM firmware revision, and the size of the
BIOS firmware volumes being hashed. There are several techniques that can save time
when using a TPM:
1. Use the fastest SPI flash part available.
2. Use the fastest TPM within the budget.
3. Where possible, execute longer latency TPM commands in parallel with other
BIOS code. TPM_Startup and TPM_ContSelfTest commands are usually the slow-
est commands. This will allow for continued BIOS execution and access to the
TPM while the diagnostics are completed. Specifically:
– Finish measurement of last FV in SEC/PEI before executing TPM_ContSelfTest
in PEI.
– Delay checking for TPM_ContSelfTest until the next TPM command in DXE,
and delay the next TPM command if possible. Interrupting SelfTest in some
cases causes additional delay.
4. Measure only what is needed. Do not measure free space or boot block if it cannot
be modified.
5. If TPM supports legacy, turn off I/O port 0x4E/0x4F.
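The benefit of overlapping long-latency TPM commands with other BIOS work (technique 3 above) can be illustrated with a toy timeline calculation; the millisecond figures below are hypothetical, not measured TPM latencies.

```python
def tpm_selftest_boot_cost(selftest_ms, other_bios_work_ms, overlapped):
    """Time the TPM self-test adds to the boot path.

    Sequential: block on TPM_ContSelfTest before running the rest of
    the BIOS.  Overlapped: issue the command, keep executing BIOS code,
    and only wait if the self-test is still running when the next TPM
    command is reached.
    """
    if overlapped:
        # Only the portion of the self-test that outlasts the useful
        # BIOS work actually delays the boot.
        return max(selftest_ms, other_bios_work_ms)
    return selftest_ms + other_bios_work_ms

sequential = tpm_selftest_boot_cost(400, 350, overlapped=False)  # 750 ms
overlapped = tpm_selftest_boot_cost(400, 350, overlapped=True)   # 400 ms
```

With a hypothetical 400 ms self-test and 350 ms of unrelated PEI/DXE work, overlapping hides almost the entire self-test behind work the BIOS had to do anyway.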
178 Chapter 12: Collaborative Roles in Quick Boot
In a UEFI BIOS, a Class 3 UEFI solution will normally be more than 100 ms faster than
a legacy OS-supported solution; the difference is the CSM execution time (without additional
delay due to legacy option ROMs). Again, this is a tradeoff between compatibility
with older operating systems and boot speed. Setup menu options can disable
the CSM if it is not required.
OS Loader
If the OS is being examined, then OS loader times can also be improved by looking
at the OS image size. Limiting the OS requirements for pre-OS keyboard support can speed up
boot by tens to hundreds of milliseconds. Loading the user interface sooner in the
kernel's boot flow will make a noticeable difference to the end user. Device driver
load and start times and the usage of services can be streamlined to positively affect boot
performance.
During runtime, the UEFI capabilities are very limited, and not all the UEFI drivers
that were used to boot the platform are available for the OS to call. Once
ExitBootServices() is called by the OS loader and it assumes control of the platform, much
information is lost.
The OS loader can collect quite a bit of data about the platform, above and beyond
the typical ACPI table standard set of information, by accessing the BIOS through UEFI
function calls. Before exiting boot services, the OS loader can both get data from and give
data directly to the BIOS.
An example of this OS-level interaction is setting the graphics resolution
of the splash screen, via a hint during OS loading, so that it matches the resolution the OS will use.
Operating System Interactions 179
Legacy OS Interface
Windows 7 and other legacy operating systems that require a CSM in the BIOS to provide
Int 10h (and other legacy software interrupts) boot hundreds of milliseconds
to several seconds slower due to the nature of their boot flow. Serially initializing each
and every legacy option ROM is just one reason why a legacy boot may be many seconds
slower than UEFI boot flows. If the OS is not optimized for boot during kernel
and driver loading, then any reasonable amount of BIOS improvement is going to be
lost anyway (even a zero-second boot is too long if the OS takes more than ten seconds
to boot).
The OS often repeats enumeration of buses in the post-boot space that the BIOS firm-
ware has performed in the pre-boot. Ideally this would be a source of timing savings.
However, upon further inspection, there are multiple reasons for this replication, in-
cluding but not limited to the following:
1. Firmware may not have done a complete job of enumerating the entire enumera-
ble subsystem, expecting software to repeat the enumeration when the OS level
drivers load. This may be due to the BIOS not requiring use of that portion of the
system in the pre-boot space.
2. Virtualization: the firmware can perform a full enumeration of a bus, then expose
a different set or a subset of hardware to the operating system through virtualiza-
tion technology.
3. The firmware may not have performed the enumeration accurately.
4. The initial enumeration may not work well with the kernel or device driver stack
designed by the operating system developers.
At the end of the day, the BIOS must enumerate only enough of the design to boot
the operating system. Assuming the operating system has the proper enumeration
support for the system hardware, the enumeration will be repeated, and in a more
complete manner than in the BIOS. Standard enumerable bus architectures
allow for this replication, and the system may require it. Examples include PCI and
USB enumeration. The whole USB network under a port may not need to be enumerated
five-plus hubs deep by the BIOS. The BIOS really needs to initialize all the hardware that cannot
be enumerated through industry standards (such as I2C). The coordination can be
made tighter in an embedded design where an RTOS and a custom firmware have
minimum overlap in enumeration.
While replication of enumeration may be required between BIOS and OS, it is not required
within the UEFI domain itself. If necessary, the BIOS can pass information between
modules via UEFI variables or HOBs. For example, we can use a HOB to pass
CPU BIST information from SEC to PEI, and memory information from the MRC to the
SMBIOS module. It is recommended that we not access the same hardware I/O twice
unless the data is expected to change.
Most hardware has a long power reset sequence. Question whether a hardware reset
is necessary, or whether the condition can be handled in software without reinitializing hardware.
CPU initialization, memory initialization, or ME initialization may each
require an extra system or CPU reset, which adds time, as the boot is partly replicated
for that given boot. Fast Boot eliminates most possibilities of system resets.
A network boot (booting to an OS image over a LAN) takes several seconds to negotiate
with the DHCP server for an IP address. Fast Boot is not really an option here.
Other Factors Affecting Boot Speed 181
Complexity and robust feature sets will likely result in flexible, but slower, boot performance
than a simple configuration. RAID is a feature that adds a lot of value, but it
can decrease boot speed due to an option ROM execution requirement. Here
UEFI drivers can help with some of the boot speed, but cannot completely compensate
for the tradeoffs.
Tools being used to measure speed can produce an observer effect if not properly implemented.
Using file I/O, serial output, POST codes, or other slow recording
mechanisms can add to a boot flow; and the more precise the data collection is, the
greater the effect. Methods vary broadly per tool, but the best tools use
memory to store the data during the boot flow and then read it off the platform afterwards.
For a complete picture of the boot flow (into the OS level), the best tools are
from OS vendors that have incorporated the Firmware Performance Data Table
(FPDT), where the BIOS reports the data into a memory location inside the ACPI tables.
Runtime tools can read the data after the fact.
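As a sketch of how such a runtime tool might decode the data, the following parses the Firmware Basic Boot Performance Record following my reading of the ACPI FPDT definition (a Type 2, 48-byte record holding five 64-bit nanosecond timestamps); the sample bytes here are synthetic, not captured from real hardware.

```python
import struct

# Record header (Type, Length, Revision), 4 reserved bytes, then five
# UINT64 timestamps in nanoseconds, all little-endian.
FBPT_RECORD = struct.Struct("<HBB4x5Q")

def parse_basic_boot_record(raw: bytes) -> dict:
    (rec_type, length, revision,
     reset_end, loader_load, loader_start,
     ebs_entry, ebs_exit) = FBPT_RECORD.unpack_from(raw)
    return {
        "reset_end_ns": reset_end,
        "os_loader_load_image_start_ns": loader_load,
        "os_loader_start_image_start_ns": loader_start,
        "exit_boot_services_entry_ns": ebs_entry,
        "exit_boot_services_exit_ns": ebs_exit,
    }

# Synthetic record: firmware hands off to the OS loader ~1.8 s after reset
# and leaves boot services at ~2.1 s.
sample = FBPT_RECORD.pack(2, 48, 2,
                          1_000_000, 1_800_000_000, 1_900_000_000,
                          2_000_000_000, 2_100_000_000)
times = parse_basic_boot_record(sample)
```

The difference between `exit_boot_services_exit_ns` and `reset_end_ns` is the firmware's share of the visible boot time.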
As Confucius said, "Only the wisest and stupidest of men never change." The developer's
attitude toward the challenge of boot speeds can have a huge impact on the
results. "It's only a few milliseconds" can add up quickly. "S3 is fast enough" will
leave many milliwatts and milliseconds on the table. "It's a systemic problem, what
can I do?" will leave the problem to others to solve, if they choose to. "Even if the BIOS
disappeared entirely, the OS is still too slow" can no longer be said.
Intel architecture platforms have not all been optimized with this technology to
date. Customers need to work with their independent BIOS vendors to see if the capa-
bility has been included with their BIOS release to achieve Fast Boot optimization.
Motherboards can be developed to encompass the right sequencing with the right
parts. Tradeoffs can be worked through for the right reasons. Tools can be obtained
and code modules instrumented properly. And with the right approach, developers
can streamline the entire boot path into something truly responsive.
Summary
When combined with a systematic Fast Boot framework and policy decisions around
major subsystems, hardware selection and initialization nuances complete the picture
of a quick boot solution.
The list discussed is far from complete, focusing on today’s Intel® Core™ based
platforms. Similar activities are feasible on any platform with due diligence and time.
Chapter 13
Legal Decisions
I go to jail one time for no driver license.
—Bob Marley
When creating new works of firmware, it is important to consider both the incoming
license terms (if not starting from scratch) and the distribution license terms and conditions.
Examples include proprietary, BSD, and GPL licenses. If creating a derivative work, it
is important to consider all of this before starting so that time isn't wasted
developing something that someone else may later claim to be his or her private intellectual
property (IP), or public IP, based on association with the original license of
the code used. As the work progresses, there cannot be ambiguity about the path or strategy
used; code hygiene and discipline are key.
Often it is difficult for new software to be considered an island, and when it is not
original, it is important to consider how and if the new software is statically or dy-
namically linked to older code in the final design. Beyond the technical challenge of
combining old and new code, legally it can be very confusing. Developers are highly
encouraged to get professional legal assistance from a quality source that has experi-
ence in software and patents.
Of course, if someone is in the business for a long time and the organization or
team is steeped in either proprietary or general public license (GPL) ideology, then it
is likely that team members will already know the rules; or at least what the rules
were. New team members should be walked through the company norms. DO NOT
ASSUME that the smart new people (and the not-so-new smart people) you have hired will
just know.
Also, as the law is somewhat fluid in nature, even if you are an old hand, it is very
important to get fresh advice from a professional, because the terms can be nuanced.
The nuances change from year to year without a great deal of advertisement to the
broader community. The company may change their policies, etc. The following is an
example of some of the basics, at the writing of this book. This is not a ruling by a
judge, and this was not written by a lawyer.
Proprietary License
Under a proprietary license, the distribution and reusability rules are defined by the
originators, but developers have to be very careful in how they define things. Terms
can vary broadly. The people you license from will also be potentially interested in
your licensing terms, and it may take each party’s lawyers many months to walk
DOI 10.1515/9781501506819-013
through everything and agree, should the need arise. This puts any development per-
formed during the negotiation timeframe at risk of being unusable.
Many name-brand software packages come with forms of a proprietary license.
While brand-name companies make money directly or indirectly with proprietary licenses,
some forms of the proprietary license are in fact freeware (no cost).
BSD License
The following are the four key clauses of the original (four-clause) BSD license:
1. Redistributions of source code must retain the above copyright notice, this list of
conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or other
materials provided with the distribution.
3. All advertising materials mentioning features or use of this software must display
the following acknowledgement: This product includes software developed by
the <organization>.
4. Neither the name of the <organization> nor the names of its contributors may be
used to endorse or promote products derived from this software without specific
prior written permission.
The four-clause BSD license was not GPL-compatible, and the mandatory product
message advertising the BSD developers was not wanted by product developers
downstream.
Three-Clause BSD
The newer BSD license (three clauses) is compatible with the GPL. The third clause appears
below for comparison to the fourth clause above.
Lesser GPL (LGPL) 185
* Neither the name of the <organization> nor the names of its contributors
may be used to endorse or promote products derived from this software with-
out specific prior written permission.
In addition to the three-clause version, there is also a simplified BSD license (two
clauses). It is likewise GPL-compatible; the third clause above
was considered optional and dropped. The two clauses are as follows:
1. Redistributions of source code must retain the above copyright notice, this list of
conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or other
materials provided with the distribution.
One key benefit of the BSD license is that companies that create their own proprietary
software can take BSD-licensed code as a starting point, create derivative works, or augment
their existing code. Companies that create their own firmware or software regularly
take the BSD-licensed drivers created by silicon vendors and weave their goodness
into these "free" offerings. Intel® BLDK also uses a form of this license for the
reused Tiano-based source code that it releases.
Conclusion
Avoid cross-contamination of your products. From a legal point of view, it is important
not to link BSD code (like that in Intel BLDK or Tiano) or any proprietary code
(like a commercial BIOS) to GPL code (like Linux). It may help to keep GPL code firewalled
and separated as much as possible, either physically or logically, from other
code whose license types are such that one entity does not cross-pollinate with the
other. Execution during one phase should know nothing about the next phase except
where to jump to once that black-box code is loaded. In this manner, there is no linking
performed. It should be noted that some legal experts believe that linking is irrelevant
and that it is the actual intent that matters.
– Maintainers of non-GPL code would frown upon it being linked at this point to any GPL v3
code.
– LGPL libraries, from reading through several references, appear not to have the
linking problem that GPL code has.
– There are open-source tools that allow creation of a firmware image that concatenates
the blocks into an FWH or SPI image such that it can be flashed or
upgraded.
When looking at the debate between proprietary, BSD, and GPL licenses, it can be
equated roughly to an in-depth political, ideological discussion. The rules can be
drastically different. The viewpoints are as diverse as they would be between passion-
ate and insightful people debating the merits and demerits of capitalism, socialism,
and communism (respectively). And comparing the three, the analogy can play out
along these lines. But when going from the theoretical to the implementation, the dif-
ferences can make or break your solution.
Appendix A
Generating Serial Presence Detection Data for Down
Memory Configurations
I have no data yet. It is a capital mistake to theorize before one has data. Insensibly one begins
to twist facts to suit theories, instead of theories to suit facts.
—Sherlock Holmes, in Arthur Conan Doyle's "A Scandal in Bohemia"
This appendix provides guidance on how to create the SPD data required for motherboard
designs that have the DDR3 SDRAM chips soldered directly onto the motherboard.
Modular DIMMs normally have an SPD EEPROM that contains this information,
which is used by the BIOS to correctly configure the system during memory
reference code (MRC) execution. Without this data, the system will not boot, or at best it will try to
boot with slow settings.
Since memory-down designs do not include DIMMs, they do not have ready-made
SPD data on an EEPROM for the BIOS to read, and the board designer must assist BIOS
or firmware developers to create the data. The data can then be put into an onboard
EEPROM off the SMBus (as a physical DIMM would have), or the data can be hard-
coded into the BIOS or placed into a table (or tables) where the BIOS can access the
data during memory initialization. There are tradeoffs between these methods:
– SPD EEPROM on motherboard.
Pros: a single BIOS works for any memory configuration.
Cons: cost of the EEPROM; need to program the SPD data during manufacturing;
minor delay during initialization to fetch data across the slower SMBus (several ms).
– Hard-coding inline in the memory code.
Pros: no EEPROM cost; no programming of SPD on the line; no added delay during
initialization.
Cons: BIOS has to change for every memory configuration; complexity during
manufacturing to ensure the correct BIOS per memory configuration.
– Tables with an optional hardware strap; point memory init code to read from a
data file instead.
Pros: no EEPROM; no programming on the line; no SMBus read delays; BIOS can
read the strap to know which memory configuration exists; possible to upgrade
with a binary modification and a rebuild.
Cons: takes up a hardware strap; BIOS will change when new configurations are
designed.
DOI 10.1515/9781501506819-014
190 Quick Boot: A Guide for Embedded Firmware Developers
Hardcoding is not recommended for any solution. Developers and designers should
agree to include tables in the BIOS for the various memory configurations (mimicking SPD
data), or to populate an actual SPD EEPROM on the board.
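In outline, the table-plus-strap approach could look like the sketch below. The strap value, configuration labels, and SPD bytes are purely illustrative; a real table carries the full SPD block, and the strap would be sampled from a GPIO or similar pin.

```python
# First few SPD bytes for each supported memory-down configuration
# (illustrative values only; byte 2 = 0x0B marks DDR3, byte 4 encodes
# density and bank count).
SPD_TABLES = {
    0: bytes([0x11, 0x11, 0x0B, 0x03, 0x03]),  # hypothetical 2Gb config
    1: bytes([0x11, 0x11, 0x0B, 0x03, 0x04]),  # hypothetical 4Gb config
}

def read_board_strap() -> int:
    """Stand-in for sampling the board's configuration strap (GPIO)."""
    return 0

def spd_for_current_board() -> bytes:
    """Return the SPD block matching the strapped configuration."""
    return SPD_TABLES[read_board_strap()]
```

The MRC then consumes this block exactly as it would data read from a physical SPD EEPROM.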
The following table shows the full list of the DDR3 SPD data that needs to be cal-
culated for a memory-down solution. It assumes no SPD onboard. Each field can be
calculated by analyzing the board’s topology and/or by using the datasheet for the
SDRAM components used on the board. A “Typical Value” with its associated defini-
tion is also provided.
The memory channels in use must be identified, and the number of "DIMM equivalents"
must be calculated by carefully noting which chip select pins on the DRAM
controller are routed to the SDRAM chips.
Each “DIMM Equivalent” will require its own block of SPD data. The specific chip
select pins, CS#[7:0] (the actual number of chip select pins can vary depending on the
specific memory controller being used), have to be carefully analyzed to see which
are being used by the SDRAM chips. It’s possible for designs to support single-rank,
dual-rank, and quad-rank DIMM equivalents.
As an example, Figure A.1 shows an abstract view of DIMMs located on a single
memory channel. Since each DIMM has only two CS# pins routed to the connector,
this design supports two dual-rank DIMMs. If all four chip select pins were routed to
a single DIMM, then the design would support a single, quad-rank DIMM.
Memory-down designs don’t use DIMMs; the SDRAM chips are soldered directly onto
the motherboard. The chip select signals will be routed to the SDRAM chips in the
same manner, however. In Figure A.2 below, on channel 0, only CS#[0] is connected
to all eight SDRAM chips. Channel 0 is supporting only one single-rank DIMM equiv-
alent. Channel 1 is also supporting a single- rank DIMM equivalent.
Analyzing the Design’s Memory Architecture 193
ECC Calculation
The data signals on the schematics must be analyzed to determine whether the down-on-board
memory is implementing ECC. If the ECC data signals DDR_CB[7:0]
(DDR "check bits," although different chipsets might use different nomenclature) are
being used as data signals on one of the SDRAM chips, then the memory subsystem
on the board is implementing ECC support, and some of the SPD fields must be set to
indicate that the DIMM supports ECC.
The DRAM width can be determined by carefully analyzing the number of data sig-
nals, DDR_DQ[63:0], that are routed into each SDRAM chip. It will either be four,
eight, sixteen, or thirty-two. This will indicate the DRAM width for the DRAM chip in
this DIMM equivalent.
The DRAM width is part of SPD field 7, and often there are slight timing differ-
ences in the datasheet depending on the width of the specific DRAM being used.
The vendor and exact part number of the SDRAM chips used in the design are extracted
from the schematics. The full datasheet must be obtained in order to calculate all of
the needed SPD fields.
Figure A.2 shows a typical memory-down implementation. Sixteen SDRAM chips are
split evenly across two of three available channels. All the SDRAM chips on a given
channel are connected to a single chip select signal. Since each SDRAM chip is con-
nected to eight data signals, this design consists of two single-rank DIMM equivalents
using ×8 width MT41J256M8HX-187-EITF SDRAM devices. Since the ECC data lines are
not being used (not shown, but they would have required the use of an additional ×8
SDRAM chip per channel), both DIMM equivalents are non-ECC.
Analyzing the schematics provides the SDRAM width and rank information (SPD
field #7). All the other information required in the SPD data block will have to be ex-
tracted from the MT41J256M8HX-187-EITF datasheet. This process is described in the
next section.
See the “References” section at the end of this appendix for links to the exact docu-
ments used.
Byte 0: Number of Bytes Used / Number of Bytes in SPD Device / CRC Coverage
The least significant nibble of this byte describes the total number of bytes used by
the module manufacturer for the SPD data and any (optional) specific supplier information.
The byte count includes the fields for all required and optional data. Bits 6-4
describe the total size of the serial memory used to hold the Serial Presence Detect
data. Bit 7 indicates whether the unique module identifier (found in bytes 117-125) is
covered by the CRC encoded in bytes 126 and 127. SPD EEPROMs typically contain
256 bytes; down-on-board memory designs should use 0x11 for this field (256-byte
device, 128 bytes used, CRC covering bytes 0-125).
Bit 7:
0 = CRC covers bytes 0-125
1 = CRC covers bytes 0-116
Bits [6:4], SPD bytes total:
000 = Undefined
001 = 256
All others reserved
Bits [3:0], SPD bytes used:
0000 = Undefined
0001 = 128
0010 = 176
0011 = 256
All others reserved
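A decode of byte 0 following the nibble and bit layout described above (a sketch; consult the JEDEC DDR3 SPD specification for the authoritative encodings):

```python
def decode_spd_byte0(b0: int):
    """Split SPD byte 0 into (bytes used, total device bytes, CRC coverage)."""
    used_nibble = b0 & 0x0F          # bits [3:0]: bytes used
    total_code = (b0 >> 4) & 0x7     # bits [6:4]: device size
    crc_bit = (b0 >> 7) & 0x1        # bit 7: CRC coverage
    bytes_used = {0x1: 128, 0x2: 176, 0x3: 256}.get(used_nibble)
    bytes_total = {0x1: 256}.get(total_code)
    crc_coverage = "0-116" if crc_bit else "0-125"
    return bytes_used, bytes_total, crc_coverage

decode_spd_byte0(0x11)  # (128, 256, '0-125'), the memory-down value above
```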
Byte 1: SPD Revision
This byte describes the compatibility level of the encoding of the bytes contained in
the SPD EEPROM, and the current collection of valid defined bytes. Software should
examine the upper nibble (Encoding Level) to determine whether it can correctly interpret
the contents of the module SPD. The lower nibble (Additions Level) can optionally be
used to determine which additional bytes or attribute bits have been defined; however,
since any undefined additional byte must be encoded as 0x00 and any undefined attribute
bit must be defined as 0, software can safely detect additional bytes and use
safe defaults if a zero encoding is read for these bytes.
[Table: SPD revision encodings from the JEDEC specification; the upper nibble is the
Encoding Level and the lower nibble the Additions Level (for example, 0x11 = Revision 1.1).]
The Additions Level is never reduced even after an increment of the Encoding Level.
For example, if the current SPD revision level were 1.2 and a change in Encoding Level
were approved, the next revision level would be 2.2. If additions to revision 2.2 were
approved, the next revision would be 2.3. Changes in the Encoding Level are ex-
tremely rare, however, since they can create incompatibilities with older systems.
The exceptions to the above rule are the SPD revision levels used during develop-
ment prior to the Revision 1.0 release. Revisions 0.0 through 0.9 are used to indicate
sequential pre-production SPD revision levels; however, the first production release
will be Revision 1.0.
Byte 2: Key Byte / DRAM Device Type
0x00 = Reserved
0x01 = Standard FPM DRAM
0x02 = EDO
0x03 = Pipelined Nibble
0x04 = SDRAM
0x05 = ROM
0x06 = DDR SGRAM
0x07 = DDR SDRAM
0x08 = DDR2 SDRAM
0x09 = DDR2 SDRAM FB-DIMM
0x0A = DDR2 SDRAM FB-DIMM PROBE
0x0B = DDR3 SDRAM
0x0C-0xFF = Reserved
Byte 3: Module Type
This field is typically set to 0x02 for unbuffered DIMMs or 0x03 when using SO-DIMMs.
If the specific chipset/CPU supports ONLY SO-DIMMs, use 0x03 so as not to
confuse the BIOS.
Bits [3:0], Module Type:
0000 = Undefined
0001 = RDIMM
0010 = UDIMM
0011 = SO-DIMM
0100 = Micro-DIMM
0101 = Mini-RDIMM
0110 = Mini-UDIMM
0111 = Mini-CDIMM
1000 = 72b-SO-UDIMM
1001 = 72b-SO-RDIMM
1010 = 72b-SO-CDIMM
1011 = LRDIMM
1100 = 16b-SO-DIMM
1101 = 32b-SO-DIMM
All others reserved
(Nominal module widths for each type are defined in the JEDEC specification.)
Definitions:
RDIMM: Registered Dual In-Line Memory Module
LRDIMM: Load Reduction Dual In-Line Memory Module
UDIMM: Unbuffered Dual In-Line Memory Module
SO-DIMM: Unbuffered 64-bit Small Outline Dual In-Line Memory Module
Micro-DIMM: Micro Dual In-Line Memory Module
Mini-RDIMM: Mini Registered Dual In-Line Memory Module
Mini-UDIMM: Mini Unbuffered Dual In-Line Memory Module
Mini-CDIMM: Clocked 72-bit Mini Dual In-Line Memory Module
72b-SO-UDIMM: Unbuffered 72-bit Small Outline Dual In-Line Memory Module
72b-SO-RDIMM: Registered 72-bit Small Outline Dual In-Line Memory Module
72b-SO-CDIMM: Clocked 72-bit Small Outline Dual In-Line Memory Module
16b-SO-DIMM: Unbuffered 16-bit Small Outline Dual In-Line Memory Module
32b-SO-DIMM: Unbuffered 32-bit Small Outline Dual In-Line Memory Module
SPD Field #4: "SDRAM Density and Banks" Example from Micron MT41J256M8 Datasheet
This field is typically extracted directly from the SDRAM datasheet. Often the bank
bits are specified instead of the number of banks (that is, three bank bits would pro-
vide eight banks, total). Likewise, in the odd situation where the DRAM density (or
capacity) is not obviously spelled out in the datasheet, it can be calculated by multi-
plying the full address range by the DRAM width:
[Table 2, "Addressing," from the Micron MT41J256M8 datasheet: row, column, and bank address ranges.]
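The density calculation described above, applied to the example part; the address-bit counts here (15 row bits A[14:0], 10 column bits A[9:0], 3 bank bits BA[2:0], x8 width) are read from the MT41J256M8 datasheet and should be verified against the exact part in your design:

```python
def dram_density_bits(row_bits, col_bits, bank_bits, width):
    """Device density = full address range multiplied by the DRAM width."""
    return (2 ** row_bits) * (2 ** col_bits) * (2 ** bank_bits) * width

bits = dram_density_bits(15, 10, 3, 8)
gigabits = bits / 2**30   # 2.0, i.e., a 2Gb (256 Meg x 8) device
```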
SPD Field #5: “SDRAM Rows & Columns” Example from Micron MT41J256M8 Datasheet
Rows and columns are the number of address signals that are active when the
RAS# and CAS# signals strobe respectively. The row and column values required
by the SPD data are extracted directly from the SDRAM datasheet as shown above.
Here, A[9:0] consists of ten individual address lines, not nine. Note also that the column and
row fields required in the SPD data are not nibble aligned; this is a common mistake
with this field.
Calculating Specific SPD Data Based on SDRAM Datasheet 201
Below is an example of a 1.5V part; this will be modified per the components used in
the design.
The nominal Voltage parameter is extracted directly from the SDRAM datasheet, usu-
ally on the front page.
SPD Field #7: “Ranks & Device DQ Count” Example from Micron MT41J256M8 Datasheet
The SDRAM device width also can be easily discerned by looking at the schematics
and noting the number of data lines used by each SDRAM device. SDRAM devices use
four, eight, sixteen, or thirty-two data signals. The Number of Ranks is trickier to
calculate. This parameter is not associated with the SDRAM chips, but relates to the
DIMM itself: it is the number of rank signals, or chip selects, used in the
DIMM equivalent. The proper value to use must be extracted from the schematics.
Note the specific chip select pins being used on a given channel. If only CS0 is routed
to the SDRAM chips, then the down-on-board memory solution is single rank (one rank).
If both CS0 and CS1 are routed to all of the SDRAM chips, the down-on-board memory
solution is dual rank (two ranks). Some server chipsets also support quad-rank
DIMMs.
In extreme cases, it might be necessary to analyze the chipset’s datasheet and
design guide to discern exactly which DIMM equivalent(s) are being used in the down-
on-board memory design.
As an example, let's assume that an Intel® 5100 chipset-based design has SDRAM
chips connected using all four rank signals (CS[3:0]#). This implementation could either
be a single quad-rank DIMM equivalent or two dual-rank DIMM equivalents, as
shown in the two figures below, taken from the Intel® Xeon® Processor
5000 Sequence with Intel® 5100 Memory Controller Hub Chipset for Communications,
Embedded, and Storage Applications Platform Design Guide (PDG), April 2009, Revision
2.3, Document Number 352108-2.3.
All other memory interface signals have to be analyzed. For example, if the design
is routing a SINGLE clock to all of the SDRAM chips (DCLKP/N [0]), then the imple-
mentation is similar to that in Figure 37; it’s a single quad-rank DIMM implementa-
tion. However, if half of the SDRAM chips are connected to DCLKP/N[0], and the other
half are connected to DCLKP/N[1], then the design is implementing two DIMM equiv-
alents of dual-rank DIMMs.
Figure 37 Configuration 1.1 - Clock and Control Signal DIMM Pin Mapping (One DIMM per Channel -
32 GB Mode, Quad-rank with S3 Support)
Figure 37 from Intel® 5100 Platform Design Guide: single quad-rank DIMM equivalent
Figure 38 Configuration 2.1 - Clock and Control Signal DIMM Pin Mapping
(Two DIMMs Per Channel - 32 GB Mode)
Figure 38 from Intel® 5100 Platform Design Guide: two dual-rank DIMM equivalents
This field is set to 0x0B (72 bits) if the design is implementing ECC. Otherwise, set
this field to 0x03.
Examples:
– 64-bit primary bus, no parity or ECC (64 bits total width): xxx 000 011
– 64-bit primary bus, with 8 bit ECC (72 bits total width): xxx 001 011
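The two examples can be reproduced with a small encoder following the bit packing described above (a sketch; bits [2:0] select the primary bus width, bits [4:3] the extension):

```python
def spd_bus_width_byte(primary_width: int, ecc: bool) -> int:
    """Encode the module memory bus width SPD byte."""
    width_code = {8: 0b000, 16: 0b001, 32: 0b010, 64: 0b011}[primary_width]
    ext_code = 0b01 if ecc else 0b00   # 8-bit ECC extension when set
    return (ext_code << 3) | width_code

spd_bus_width_byte(64, ecc=False)  # 0x03: 64 bits total
spd_bus_width_byte(64, ecc=True)   # 0x0B: 72 bits total
```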
Setting this value to 0x52 (2.5 ps) works well for implementing the other timing fields
based on this value. Most DDR3 DIMMs use 0x52 for this field.
Most DDR3 DIMMs' SPD data use 0x01 (0.125 ns) for this value.
To simplify BIOS implementation, DIMMs associated with a given key byte value may differ
in MTB value only by a factor of two. For DDR3 modules, the defined MTB values are
0.125 ns and 0.0625 ns.
Per JEDEC Specification, Byte 12: SDRAM Minimum Cycle Time (tCKmin)
This byte defines the minimum cycle time for the SDRAM module, in Medium Timebase
(MTB) units. This number applies to all applicable components on the module. This byte
applies to SDRAM and support components as well as the overall capability of the DIMM.
This value comes from the DDR3 SDRAM and support component datasheets.
Bits [7:0]: Minimum SDRAM Cycle Time (tCKmin), in MTB units
If tCKmin cannot be divided evenly by the MTB, this byte must be rounded up to the
next larger integer and the Fine Offset for tCKmin (SPD byte 34) used for correction to
get the actual value.
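The divide-and-round-up rule can be expressed directly; this sketch assumes the common DDR3 MTB of 0.125 ns, and the tCK values are illustrative speed-grade figures rather than data from the example part:

```python
import math

MTB_NS = 0.125  # common DDR3 Medium Timebase (1/8 ns)

def mtb_units(t_ns: float) -> int:
    """Round a timing parameter up to whole MTB units; when the division
    is not exact, the remainder is expressed via the fine-offset bytes."""
    return math.ceil(round(t_ns / MTB_NS, 6))

mtb_units(1.25)    # DDR3-1600 tCKmin -> 10 = 0x0A
mtb_units(1.875)   # DDR3-1066 tCKmin -> 15 = 0x0F
```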
[Table: AC timing parameters from the Micron MT41J256M8 datasheet — tRCD, tRP,
tRC, and tRAS minimums, the tCK (AVG) range allowed for each CL/CWL combination,
and the supported CL settings. The numeric values are given in the datasheet.]
Notes:
1. tREFI depends on TOPER.
2. The CL and CWL settings result in tCK requirements. When selecting tCK, both the CL and
CWL requirement settings need to be fulfilled.
3. Reserved settings are not allowed.
SPD Field #12: “Cycle Time (tCKmin)” Example from Micron MT41J256M8 Datasheet
The supported CAS latencies can be found in the SDRAM datasheet as shown above.
Each supported CAS latency must have its bit set in fields 14 and 15.
SPD Field #14 and #15: “CAS Latencies Supported” Definition from JEDEC DDR3 SPD Specification
[Table repeated from the Micron MT41J256M8 datasheet: the same AC timing parameters,
here highlighting the supported CL and CWL settings used to fill SPD fields 14 and 15.]
Notes:
1. tREFI depends on TOPER.
2. The CL and CWL settings result in tCK requirements. When selecting tCK, both the CL and
CWL requirement settings need to be fulfilled.
3. Reserved settings are not allowed.
SPD Field #14 and #15: “CAS Latencies Supported” Example from Micron MT41J256M8 Datasheet
This field must also be extracted from the datasheet and divided by the Medium Time-
base Divisor. Although the SPD specification calls out this abbreviation as tAA, I’ve
found that it is usually abbreviated tCL or just “CL” in the various SDRAM datasheets
(as shown above).
In our example:
1. The tCL value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 13.1 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 13.1 ns ⁄ 0.125 ns = 104.8, rounded up to 105 = 0×69.
Bits 7–0: Minimum CAS Latency Time (tAAmin), in MTB units.
If tAAmin cannot be divided evenly by the MTB, this byte must be rounded up to the
next larger integer and the Fine Offset for tAAmin (SPD byte 35) used for correction to
get the actual value.
Notes:
1. See the Fine Offset for tAAmin (SPD byte 35).
2. Refer to the device datasheet for downbin support details.
SPD Field #16: “Minimum CAS Latency Time (tAAmin)” Definition from JEDEC DDR3 SPD Specification
SPD Field #16: “Minimum CAS Latency Time (tAAmin)” Example from Micron MT41J256M8 Datasheet
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor.
In our example:
1. The tWR value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 15 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 15 ns / 0.125 ns = 120 = 0×78.
From the JEDEC Specification: Byte 17: Minimum Write Recovery Time (tWRmin)
This byte defines the minimum SDRAM write recovery time in Medium Timebase
(MTB) units. This value comes from the DDR3 SDRAM datasheet.
Bits 7–0: Minimum Write Recovery Time (tWR), in MTB units.
Example:
Steps:
1. The BIOS first determines the common operating frequency of all modules in
the system, ensuring that the corresponding value of tCK (tCKactual) falls
between tCKmin (Bytes 12 and 34) and tCKmax. If tCKactual is not a JEDEC
standard value, the next smaller standard tCKmin value is used for calculat-
ing Write Recovery.
2. The BIOS then calculates the “desired” Write Recovery: WRdesired =
ceiling(tWRmin / tCKactual), where tWRmin is defined in Byte 17. The ceiling
function requires that the quotient always be rounded up.
3. The BIOS then determines the “actual” Write Recovery: WRactual =
max(WRdesired, min WR), where min WR is the lowest Write Recovery supported
by the DDR3 SDRAM.
Note that not all WR values supported by DDR3 SDRAMs are sequential, so
the next higher supported WR value must be used in some cases. Usage
example for DDR3-1333G operating at DDR3-1333:
tCKactual = 1.5 ns
WRdesired = 15 / 1.5 = 10
WRactual = max(10, 10) = 10
SPD Field #17: “Minimum Write Recovery Time (tWRmin)” Example from Micron MT41J256M8
Datasheet
This field must also be extracted from the datasheet and divided by the Medium Time-
base Divisor.
In our example:
1. The tRCD value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 13.1 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 13.1 ns ⁄ 0.125 ns = 104.8, rounded up to 105 = 0×69.
Bits 7–0: Minimum RAS# to CAS# Delay (tRCD), in MTB units.
If tRCDmin cannot be divided evenly by the MTB, this byte must be rounded up to the
next larger integer and the Fine Offset for tRCDmin (SPD byte 36) used for correction
to get the actual value.
Notes:
1. See the Fine Offset for tRCDmin (SPD byte 36).
2. Refer to the device datasheet for downbin support details.
SPD Field #18: “Minimum RAS# to CAS# Delay Time (tRCDmin)” Definition from JEDEC DDR3 SPD Specification
[Table: speed grade, data rate (MT/s), and target tRCD/tRP/CL values (ns) from the Micron MT41J256M8 datasheet; the numeric values and backward-compatibility notes were lost in extraction.]
SPD Field 0x13: Min. Row Active to Row Active Delay (tRRDmin)
This field must also be extracted from the datasheet and divided by the Medium Time-
base Divisor. In our example:
1. The tRRD value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 7.5 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 7.5 ns ⁄ 0.125 ns = 60 = 0×3C.
0×13 Min. Row Active to Row Active Delay (tRRDmin) 0×3C 7.5 ns
Byte 19: Minimum Row Active to Row Active Delay Time (tRRDmin)
This byte defines the minimum SDRAM Row Active to Row Active Delay Time in Me-
dium Timebase units. This value comes from the DDR3 SDRAM datasheet. The value
of this number may be dependent on the SDRAM page size; please refer to the DDR3
SDRAM datasheet section on Addressing to determine the page size for these devices.
Controller designers must also note that at some frequencies, a minimum number of
clocks may be required, resulting in a larger tRRDmin value than indicated in the SPD.
For example, tRRDmin for DDR3-800 must be 4 clocks.
Bits 7–0: Minimum Row Active to Row Active Delay Time (tRRDmin), in MTB units.
SPD Field #19: “Minimum Row Active to Row Active Delay Time (tRRDmin)” Example from Micron
MT41J256M8 Datasheet
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor.
In our example:
1. The tRP value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 13.125 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 13.125 ns ⁄ 0.125 ns = 105 = 0×69.
Bits 7–0: Minimum Row Precharge Delay Time (tRPmin), in MTB units.
If tRPmin cannot be divided evenly by the MTB, this byte must be rounded up to the
next larger integer and the Fine Offset for tRPmin (SPD byte 37) used for correction to
get the actual value.
Notes:
1. See the Fine Offset for tRPmin (SPD byte 37).
2. Refer to the device datasheet for downbin support details.
SPD Field #20: “Minimum Row Precharge Delay Time (tRPmin)” Micron MT41J256M8 Datasheet
This field is the Most Significant nibble (4 bits) for both the Active to Precharge Delay
(tRASmin) and Active to Active/Refresh delay (tRCmin). See the next two sections for
how to determine the tRASmin and tRCmin values and put the upper nibble of those
results into this field.
SPD Field #21: “Upper Nibbles for tRAS and tRC” Definition from JEDEC DDR3 SPD Specification
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor. In our example:
1. The tRAS value found in the Micron MT41J256M8-187E datasheet for our specific
SDRAM chip gives 37.5 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 37.5 ns / 0.125 ns = 300 = 0×012C.
4. The LSB, 0×2C, goes into SPD field #22 (0×16).
5. The lower nibble of the MSB, 0×01, goes into bits [3:0] of SPD field #21 (0×15).
Byte 22: Minimum Active to Precharge Delay Time (tRASmin), Least Significant Byte
The lower nibble of Byte 21 and the contents of Byte 22 combined create a 12-bit value
that defines the minimum SDRAM Active to Precharge Delay Time in Medium Time-
base (MTB) units. The most significant bit is Bit 3 of Byte 21, and the least significant
bit is Bit 0 of Byte 22. This value comes from the DDR3 SDRAM datasheet.
SPD Field #22: “Minimum Active to Precharge Delay Time (tRASmin)” Definition from JEDEC DDR3
SPD Specification
SPD Field #22: “Minimum Active to Precharge Delay Time (tRASmin)” Example from Micron
MT41J256M8 Datasheet
SPD Field 0×17: Min. Active to Active Refresh Delay (tRCmin) LSB
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor. In our example:
1. The tRC value found in the Micron MT41J256M8-187E datasheet for our specific
SDRAM chip gives 50.625 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 50.625 ns ⁄ 0.125 ns = 405 = 0×0195.
4. The LSB, 0×95, goes into SPD field #23 (0×17).
5. The lower nibble of the MSB, 0×01, goes into bits [7:4] of SPD field #21 (0×15).
0×17 Min. Active to Active Refresh Delay (tRCmin) LSB 0×95 50.625 ns
Notes:
1. See the Fine Offset for tRCmin (SPD byte 38).
2. Refer to the device datasheet for downbin support details.
SPD Field #23: “Minimum Active to Active/Refresh Delay Time (tRCmin)” Definition from JEDEC DDR3
SPD Specification
SPD Field #23: “Minimum Active to Active/Refresh Delay Time (tRCmin)” Example from Micron
MT41J256M8 Datasheet
SPD Field 0×18 and 0×19: Min. Refresh Recovery Delay (tRFCmin)
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor. In our example:
1. The tRFC value found in the Micron MT41J256M8-187E datasheet for our specific
SDRAM chip gives 160 ns. Note that the MT41J256M8-187E is a 2 Gb part.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 160 ns ⁄ 0.125 ns = 1280 = 0×0500.
4. The LSB, 0×00, goes into SPD field #24 (0×18).
5. The MSB, 0×05, goes into SPD field #25 (0×19).
Byte 24: Minimum Refresh Recovery Delay Time (tRFCmin), Least Significant Byte
Byte 25: Minimum Refresh Recovery Delay Time (tRFCmin), Most Significant Byte
The contents of Byte 24 and the contents of Byte 25 combined create a 16-bit value that
defines the minimum SDRAM Refresh Recovery Time Delay in Medium Timebase
(MTB) units. The most significant bit is Bit 7 of Byte 25, and the least significant bit is
Bit 0 of Byte 24. These values come from the DDR3 SDRAM datasheet.
SPD Field #24 and #25: “Minimum Refresh Recovery Delay Time (tRFCmin), LSB”
Definition from JEDEC DDR3 SPD Specification
SPD Field #24 and #25: “Minimum Refresh Recovery Delay Time (tRFCmin), LSB”
Micron MT41J256M8 Datasheet
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor. In our example:
1. The tWTR value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 7.5 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 7.5 ns / 0.125 ns = 60 = 0×3C.
Byte 26: Minimum Internal Write to Read Command Delay Time (tWTRmin)
This byte defines the minimum SDRAM Internal Write to Read Delay Time in Medium
Timebase (MTB) units. This value comes from the DDR3 SDRAM datasheet. The value
of this number may be dependent on the SDRAM page size; please refer to the DDR3
SDRAM datasheet section on Addressing to determine the page size for these devices.
Controller designers must also note that at some frequencies, a minimum number of
clocks may be required, resulting in a larger tWTRmin value than indicated in the
SPD. For example, tWTRmin for DDR3-800 must be 4 clocks.
Bits 7–0: Internal Write to Read Delay Time (tWTR), in MTB units.
SPD Field #26: “Minimum Internal Write to Read Command Delay Time (tWTRmin)” Definition from
JEDEC DDR3 SPD Specification
SPD Field #26: “Minimum Internal Write to Read Command Delay Time (tWTRmin)” Example from Micron MT41J256M8 Datasheet
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor.
In our example:
1. The tRTP value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 7.5 ns.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 7.5 ns ⁄ 0.125 ns = 60 = 0×3C.
Byte 27: Minimum Internal Read to Precharge Command Delay Time (tRTPmin)
This byte defines the minimum SDRAM Internal Read to Precharge Delay Time in Me-
dium Timebase (MTB) units. This value comes from the DDR3 SDRAM datasheet. The
value of this number may depend on the SDRAM page size; please refer to the DDR3
SDRAM datasheet section on Addressing to determine the page size for these devices.
Controller designers must also note that at some frequencies, a minimum number of
clocks may be required, resulting in a larger tRTPmin value than indicated in the SPD.
For example, tRTPmin for DDR3-800 must be 4 clocks.
Bits 7–0: Internal Read to Precharge Delay Time (tRTP), in MTB units.
SPD Field #27: “Minimum Internal Read to Precharge Command Delay Time (tRTPmin)” Definition
from JEDEC DDR3 SPD Specification
SPD Field #27: “Minimum Internal Read to Precharge Command Delay Time (tRTPmin)” Example from
Micron MT41J256M8 Datasheet
This is the upper nibble for the tFAW SDRAM timing parameter. See the next section for
how to calculate this value, and put the upper nibble into SPD Field #28, bits [3:0].
SPD Field #28: “Upper Nibble for tFAW” Definition from JEDEC DDR3 SPD Specification
SPD Field 0×1D: Min. Four Activate Window Delay (tFAWmin) LSB
This is another field that must be extracted from the datasheet and divided by the
Medium Timebase Divisor. In our example:
1. The tFAW value found in the Micron MT41J256M8 datasheet for our specific SDRAM
chip gives 37.5 ns. Note that in the case of this SDRAM part, the number varies,
depending on the part’s page size. In the case of the MT41J256M8, the page size
is 1K as shown.
2. This number must be divided by the Medium Timebase Divisor, the value in 0×0B.
In our example this is 0.125 ns.
3. 37.5 ns / 0.125 ns = 300 = 0×12C.
4. The upper nibble, 0×1, is put into SPD field #28, bits [3:0].
5. The LSB, 0×2C, is put into SPD field #29.
Byte 29: Minimum Four Activate Window Delay Time (tFAWmin), Least Significant Byte
The lower nibble of Byte 28 and the contents of Byte 29 combined create a 12-bit value
that defines the minimum SDRAM Four Activate Window Delay Time in Medium
Timebase (MTB) units. This value comes from the DDR3 SDRAM datasheet. The value
of this number may depend on the SDRAM page size; please refer to the DDR3 SDRAM
datasheet section on Addressing to determine the page size for these devices.
Examples:
SPD Field #29: “Minimum Four Activate Window Delay Time (tFAWmin) LSB” Definition from JEDEC
DDR3 SPD Specification
Table 2: Addressing
SPD Field #29: “Minimum Four Activate Window Delay Time (tFAWmin) LSB” Example from Micron
MT41J256M8 Datasheet
These three bits need to be set if the SDRAM device supports the respective feature.
Included above is an example from the MT41J256M8 datasheet that shows it supports
all three features.
0×1E SDRAM Optional Features RZQ/6, RZQ/7, and DLL-Off Mode Support
SPD Field #30: “SDRAM Optional Features” Definition from JEDEC DDR3 SPD Specification
These bits need to be set if the SDRAM component supports the specific thermal
property. As shown above, the MT41J256M8 supports the extended temperature range
and extended temperature refresh rate, along with the ability to do Auto Self Refresh
(SPD field #31 = 0×07).
0×1F SDRAM Thermal and Refresh Options and Auto Self Refresh (ASR) 0×07 Extended Temp. Ranges, Extended Temp. Refresh Rate, and ASR
SPD Field #30: “SDRAM Optional Features” Example from Micron MT41J256M8 Datasheet
SPD Field #31: “SDRAM Thermal and Refresh Options” Definition from
JEDEC DDR3 SPD Specification
If the memory-down-on-board design has a dedicated thermal sensor for the SDRAM
components, then this field should be set to 0×80, otherwise to 0×00.
SPD Field #31: “SDRAM Thermal and Refresh Options” Example from Micron MT41J256M8 Datasheet
SPD Field #32: “Module Thermal Sensor” Definition from JEDEC DDR3 SPD Specification
SPD Field #33: “SDRAM Device Type” Definition from JEDEC DDR3 SPD Specification
For JEDEC SPD Specification v1.0, these fields are reserved and should be set to 0×00.
For JEDEC SPD Specification v1.1, some of these fields are defined.
Byte 34: Fine Offset for SDRAM Minimum Cycle Time (tCKmin)
This byte modifies the calculation of SPD Byte 12 (MTB units) with a fine correction
using FTB units. The value of tCKmin comes from the SDRAM datasheet. This value is
a Two’s Complement multiplier for FTB units, ranging from +127 to -128. Examples:
See SPD byte 12. For Two’s Complement encoding, see Relating the MTB and FTB.
Byte 35: Fine Offset for Minimum CAS Latency Time (tAAmin)
This byte modifies the calculation of SPD Byte 16 (MTB units) with a fine correction
using FTB units. The value of tAAmin comes from the SDRAM datasheet. This value
is a Two’s Complement multiplier for FTB units, ranging from +127 to -128. Examples:
See SPD Byte 16. For Two’s Complement encoding, see Relating the MTB and FTB.
Byte 36: Fine Offset for Minimum RAS# to CAS# Delay Time (tRCDmin)
This byte modifies the calculation of SPD Byte 18 (MTB units) with a fine correction
using FTB units. The value of tRCDmin comes from the SDRAM datasheet. This value
is a Two’s Complement multiplier for FTB units, ranging from +127 to -128. Examples:
See SPD byte 18. For Two's Complement encoding, see Relating the MTB and FTB.
Byte 37: Fine Offset for Minimum Row Precharge Delay Time (tRPmin)
This byte modifies the calculation of SPD Byte 20 (MTB units) with a fine correction
using FTB units. The value of tRPmin comes from the SDRAM datasheet. This value is
a Two's Complement multiplier for FTB units, ranging from +127 to -128. Examples:
See SPD byte 20. For Two's Complement encoding, see Relating the MTB and FTB.
Byte 38: Fine Offset for Minimum Active to Active/Refresh Delay Time (tRCmin)
This byte modifies the calculation of SPD Bytes 21 and 23 (MTB units) with a fine
correction using FTB units. The value of tRCmin comes from the SDRAM datasheet.
This value is a Two's Complement multiplier for FTB units, ranging from +127 to -128.
Examples: See SPD bytes 21 and 23. For Two's Complement encoding, see Relating
the MTB and FTB.
SPD Field #60: “Module Nominal Height” Definition from JEDEC DDR3 SPD Specification
Each DIMM equivalent consists of a single rank of ×8 SDRAM devices, no ECC. This
corresponds to RAW Card Version A, ×64, which would make SPD field #62=0×00.
The various Reference Raw Card Enumerators are described in this document:
JEDEC Standard No. 21C: 240-Pin PC3-6400/PC3-8500/PC3-10600/PC3-12800 DDR3
SDRAM Unbuffered DIMM Design Specification, Revision 1.01, November 2009,
http://www.jedec.org/ (requires account).
Module-Specific Section: Bytes 60–116
0×3E (Unbuffered): Reference Raw Card Used 0×00 Raw Card A, ×64
SPD Field #62: “(Unbuffered): Reference Card Used” Definition from JEDEC DDR3 SPD Specification
SPD Field 0×3F: Unbuff Addr. Mapping from Edge Connector to DRAM
If the motherboard has mirrored the address lines, set to 1. The typical values assume
the board address mapping is standard.
Bits 7–1: Reserved.
Bit 0: 0 = standard; 1 = mirrored.
The definition of standard and mirrored address connection mapping is detailed be-
low; highlighted rows in the table indicate which signals change between mappings.
[Table: edge-connector-to-SDRAM mapping of the address and bank-address signals (A0–A15 including A10/AP, A12/BC#, and BA0–BA2) for the standard and mirrored wirings, from the JEDEC DDR3 SPD Specification; the per-signal mapping was lost in extraction.]
SPD Field #63: “Address Mapping from Edge Connector to DRAM” Definition from JEDEC DDR3 SPD
Specification
0×75–0×76 Module Manufacturer ID Code 0×2C Micron Technology
Byte 118: Last Nonzero Byte, Module Manufacturer. Byte 117, bit 7: Odd Parity for
bits 6–0. Byte 117, bits 6–0: Number of Continuation Codes, Module Manufacturer.
SPD Field #117 and #118: “Module Manufacture ID Code” Definition from
JEDEC DDR3 SPD Specification
The current week number should be inserted into the lower byte. The last two digits
of the year should be inserted into the upper byte. Make sure to use Binary Coded
Decimal for the two numbers.
Since there isn’t a module (DIMM) physically present, this field is a “don’t care.” How-
ever, it should be programmed so that BIOS algorithms can tell whether “the DIMM
has changed,” which is a trigger for fast or slow boot paths.
The CRC bytes must be calculated for bytes 0–125 using the formula above. Note that
CRC needs to be checked for bytes 0–116 if bit 7 in SPD field #0 was set to 1b (which I
don’t recommend). This site is very useful for calculating the CRC-16 values:
http://www.lammertbies.nl/comm/info/crc-calculation.html.
The result is shown in the CRC-16 field. Make sure to remove all spaces between
the bytes, and make sure to select the “Hex” option, not “ASCII.”
This 2-byte field contains the calculated CRC for previous bytes in the SPD. The fol-
lowing algorithm and data structures (shown in C) are to be followed in calculating
and checking the code. Bit 7 of byte 0 indicates which bytes are covered by the CRC.
SPD Field #126 and #127: “CRC Bytes” Definition from JEDEC DDR3 SPD Specification
Since there isn’t a physical DIMM, I tend to use the SDRAM component name in lieu
of the Module Part number. Also, this site is very useful for converting ASCII strings
to hex values: http://easycalculation.com/ascii-hex.php.
Make sure to use the “Equivalent Hex Value” result.
Use the same value used for fields 0×75–0×76 (the SDRAM component manufacturer
ID extracted from the JEP-106 JEDEC specification).
Last Nonzero Byte, DRAM Manufacturer; Odd Parity bit; and Number of Continuation
Codes, DRAM Manufacturer — encoded the same way as the module manufacturer ID
fields above.
MT41J256M8 Datasheet
http://download.micron.com/pdf/datasheets/dram/ddr3/2Gb_DDR3_SDRAM.pdf
All data from the JEDEC DDR3 SPD Specification is Copyright © JEDEC. All Rights
Reserved. Reproduced with permission.
Information on JEDEC standards is current as of the date of this publication. For
the most up-to-date version of JEDEC standards, visit www.jedec.org.
Index
A Architecture boot flow 107, 112, 118, 120
ABAR 64, 66, 68, 131 ASPM. see active state power management
Abstraction, debugging and 86 (ASPM)
AC 16, 51, 136, 214, 219, 231, 236 ATA/IDE programming interface 58, 176
AC operating conditions 214, 229, 236 ASR (auto self refresh) 191, 238
Accelerated graphics port (AGP) 20, 23 ATA/IDE programming interface 62, 130
Access PCI devices 54 Auto precharge 215, 231, 237
ACPI. see Advanced Configuration and Power AVG 208, 219, 225, 231, 234, 237
Interface (ACPI) 2, 21, 28, 46, 50, 59, Award BIOS 9
69, 101, 146, 160, 176
ACPI Specification 50, 58, 71, 82, 132 B
ACPI namespace 58, 71 BA 200, 237, 248
ACPI tables 36, 69, 86, 101, 132, 158, 178 Bank bits 199
181 Banks 35, 169, 190, 198, 237
Active management technology (AMT) 171 BAR. register (BAR) 26, 46, 53, 59, 67, 112
Active state power management (ASPM) im- Baseboard management controllers (BMCs)
pact 176 10, 22, 128
Active/Refresh delay time 226, 242 Basic Input Output System (BIOS) 38, 61,
Activate Window Delay Time 191, 235, 237 143, 150, 167, 172, 179, 189
Activate windows 215, 231, 236 BCD (Binary Coded Decimal) 250
Activate-to-precharge 208, 215, 224, 231 BDS phase 128, 145, 172, 174
Activate-to-Activate 208, 215, 221, 227, 233 BGRT (Boot Graphics Resource Table) 70
ADDR 76, 214, 231 Bill of material (BOM) 9, 10, 166
ADDR setup 214, 219, 231, 236 Binary libraries, debugging 78
Address timings 214, 218, 231, 236 Binary coded decimal (BCD) 250
Advanced Configuration and Power Interface BIOS. see Basic Input Output System (BIOS)
(ACPI) 28, 46, 50, 69, 101 BIOS and bootloaders 17, 135
Advanced Host Controller Interface (AHCI) BIOS boot specification 50, 69
49, 62, 66, 99, 130 BIOS vendors 2, 7, 11, 117
AGP. see accelerated graphics port (AGP) BLDK. see Intel® Boot Loader Development
AGPset 17, 19 Kit (Intel® BLDK)
AHCI. see Advanced Host Controller Interface BMCs. see baseboard management control-
(AHCI) lers (BMCs)
AHCI mode, in SATA controller initialization BN 160, 161
62, 66, 130, 161, 174 Boot 135, 153, 160, 169, 171
American Megatrends Inc (AMI) 9 – cold 68, 107, 117, 159
AMI. see American Megatrends Inc (AMI) – faster 5, 115, 169
AMT. see active management technology – first 86, 160, 169
(AMT) – optimized 138, 146
Application processor (AP) initialization Boot code 38, 111
112, 120, 168, 248 Boot devices selection (BDS) phase 132,
– idle state 121 136, 148
– wakeup state 121 Boot firmware 10, 11, 13
APICs (Advanced Programmable) 40, 58, Boot flow 137, 144, 158, 163, 177, 178, 181
123 Boot graphics table (BGRT) 70
APM (advanced power management) 21, 44 Boot loader 14, 34, 38, 83, 101, 125
256 Index
Enterprise south bridge interconnect (ESI) Full boot 162, 168, 170
17, 20 FV. see firmware volume (FV)
Error correction codes (ECC) memory 131 FWH. see firmware hubs (FWH)
ESB. see enterprise south bridge (ESB)
ESI. see enterprise south bridge intercon- G
nect (ESI) GDT. see Global Descriptor Tables (GDT)
Exceptions 44, 51, 125, 161, 172 GFX 20, 24, 25
– probable type casting and 163 Global Descriptor Tables (GDT) 39, 40, 114
– execute 3, 103, 117, 128, 159, 177 GPIO. see general purpose I/O (GPIO)
Execute in place (XIP) 23, 117 GPL. see general public license (GPL)
ExitBootServices() 104, 176 GPL code 186, 187
EXT files 100 Graphics output protocol (GOP) 47, 162, 173
– support for CSM-free operating systems
F 173
FACS (firmware ACPI control structure) 69, – graphics subsystem 173
70
Factors affecting boot speed 180, 181 H
FADT (fixed ACPT description table) 69 Hardware and firmware co-design 167
Fallback mechanisms 162 Hardware debuggers 82, 83, 96
Fast boot 160, 164, 170, 175, 180 Hardware platforms 9, 107
Fast boot time 117, 164, 165 Hardware power sequences (pre-pre-boot)
FAT 50, 100 107, 109, 111
Fh DMA controller 27 Hardware programming specifications 130
Field name 190, 201, 204, 249, 253 Hardware resets 75, 180
– typ 196, 200, 241, 251 Harnessing the UEFI Shell 5
Field Programmable Grid Arrays (FPGAs) 47 Hex 190, 197, 241, 249, 252
File allocation table (FAT) in OS loading 100 HID (Human interface devices) 49, 71
File system, in OS loading 88, 100, 149 High precision event timer (HPET) 49, 71
Firmware ACPI control structure (facs) 69 HII. see human interface infrastructure (HII)
Firmware co-design 167 HOB 178, 180
Firmware hubs (FWH) 16, 34, 187 HMI. see human/machine interface (HMI)
Firmware performance data table (FPDT) 70, Host controller (HC) 19, 49, 60, 61, 129, 175
181 Host reset vector, starting at 109
Firmware support package (FSP) 14 Host/target debugging techniques 73
Firmware volume (FV) 147, 166, 171, 177 HPET. see high precision event timer (HPET)
Fixed ACPT description table (FADT) 69 22, 126, 158
Flash 31, 33, 35, 117, 147, 166 HPS (Hardware Programming Specifications)
Flash memory 34, 109 130
Flash size 135, 148 Hub link 20
Flash subsystem 165, 167 Human interface devices (HID) 49, 71
FPDT. see firmware performance data table Human/machine interface (HMI) 4
(FPDT)
FPGAs. see Field Programmable Grid Arrays I
(FPGAs) IA-32 architectures software developers 45,
Framework BIOS 148, 149 50, 110, 120, 125, 127
Front side bus (FSB) 17, 19, 25 IA-32 Intel architecture software developer
FSP (firmware support package) 14 114, 115
FTB (fine timebase) 205, 211, 216, 220, 242 IA-32e mode 110
FTB units 208, 211, 217, 226, 241 IBVs (independent BIOS vendors) 8, 9, 181
Index 259
ICH. see I/O controller hub (ICH) 22, 125, Intel® Virtualization Technology (Intel® VT)
130 45
IDE mode, in SATA controller initialization 63, 65, 130
IDE, native 65
IDF (Intel Developer’s Forum) 138
IDT. see Interrupt Descriptor Table (IDT) 39, 111, 114, 125
Independent BIOS vendors (IBVs) 9, 181
Industry specifications 28, 45, 68, 82, 174
Industry Standard Architecture (ISA) bus 15
Industry standard initialization 49, 52, 56, 58, 62, 64
Industry standards 1, 2, 6, 15, 44, 46, 49, 67, 179
Initialization 2, 14, 45, 47, 59, 61, 149, 189
Initialization code 12, 40, 109, 111, 114
– interrupt routing options 55
– IRQ routing with ACPI methods 54, 55
Initial program loader (IPL) 102
Instruction set architecture (ISA) 15, 19, 40
Intel® AMT 172
Intel® AMT firmware 172
Intel® Architecture 13, 17, 21, 28, 108, 109
– boot flow 14, 107
– firmware 76, 80
– memory map 112
– platforms 1, 52, 73, 107, 118, 133, 181
– reference platform 107, 112
– systems 2, 126
Intel® architecture basics 15, 18, 20, 24, 28
Intel® architecture boot flow 14, 107
– advanced initialization 13
– Intel® Atom 19, 150, 151
– Intel® Core 19
– Intel® Developer’s Forum (IDF) 138
Intel® BLDK 13, 186
Intel® boot loader development kit 13
Intel® design guide 78, 79
Intel® fast boot 168, 172
– timing results 164
Intel® PCH 61, 171, 174, 176
Intel® processors 19, 25, 114, 120, 158
Intel® reference platform 116
Intel® RST driver 175
Intel® SpeedStep® technology 136, 148, 150
Intel® trusted execution technology (Intel TXT) 45, 177
Intel® Xeon 19
Intel®-architecture based platform 51
Intel®-compatible ITP devices 76
Intel® fast boot technology 153, 158, 160, 164
Interrupt controllers 21, 27, 122
Interrupt pins 55
Interrupt routers 55
Interrupt vectors 81, 125
Interrupt handling 111, 123, 125
Interrupt service routines (ISR) real mode 126
INTIN 124
IO, memory mapped 45, 46
I/O advanced programmable interrupt controller (I/OxAPIC) 125
I/O controller hub (ICH) 21, 130
I/O hub (IOH) 19, 21
IOH/ICH 126
I/O ranges 26, 82
I/OxAPIC 21, 132
IOxAPICs (Input/Output Advanced) 58, 70, 123
ISA. see instruction set architecture (ISA)
ISA bus. see Industry Standard Architecture (ISA) bus
ISR. see interrupt service routines (ISR)
In-target probe (ITP) 75, 76, 82
Interrupt Vector Table (IVT) 36, 81, 125

J
JEDEC DDR3 SPD Specification 223, 227, 233, 238, 240
JEDEC specification 196, 200, 205, 209, 213
JEP 249, 253

K
KB page 215, 219, 231, 236
KB page size 215, 219, 231, 235
Kernel 12, 102, 137, 165, 178
Key byte 196, 197
– key byte table 196
Keyboard/mouse 59, 137, 148, 150

L
LAPIC. see local advanced programmable interrupt controller (LAPIC) 123, 127, 132
260 Index
Latencies supported 190, 209
Least significant byte 209, 223, 235, 249, 252
Legacy Address Range 36
Legacy BIOS interface 4, 7
Legacy OS interface 12, 28, 129, 132
Legacy option ROMs 5, 12, 47, 111, 119, 141, 178
Legacy-free systems 129
Lesser GPL (LGPL) 185
Linux kernel, direct execution of 115
Loader 4, 100, 104, 164, 178
Loading 12, 99, 150, 164, 178
Local APIC 40, 113, 125, 132
– CPU initialization and 70
– timer 127
Logical addressing 39
LPC memory read cycle 23
LSB 190, 193, 223, 228, 234, 237
Local vector table (LVT) 41, 125

M
MADT (Multiple APIC description table) 70
Main Memory Address Range 3
Manageability engine BIOS extension (MEBx) 171
Manageability engine (ME) 22, 116, 128, 165, 171
Master boot record (MBR) 100
Max Min Max Units notes 208, 221, 224, 227
MB flash 138, 149
MBR. see master boot record (MBR)
Manageability engine BIOS extension (MEBx) 161, 171, 172
Media, bootable 100, 102
Media table 172
Medium timebase divisor 206, 213, 220, 225, 228, 230, 232
Memory 23, 25, 33, 117, 132, 169
– accessing 33, 117
– boot service 97
– caching control 127
– configuration 115, 169, 189
– configuration complexity 169
– controller 14, 17, 20, 34, 112, 116
– initialization 34, 64, 112, 149, 159, 163, 169, 180, 189
– initialization, fast 169
– location 38, 104, 118, 125, 181
Memory controller hub (MCH) 203
Memory configurations 115, 169, 189, 190
Memory module, bit Small Outline 198
Memory map 17, 25, 34, 40, 101, 112, 118, 131
– Intel architecture system and 191
– region locations in 38, 104, 118, 125, 184
– region 127, 132, 159
Memory range 112, 113
Memory reference code (MRC) 189
Memory Repeater Hubs (MRHs) 20
Memory test 38, 116, 119
Memory translation hubs (MTHs) 20
Memory types 31, 33
Memory type range registers (MTRRs) 115, 127, 168
Message signaled interrupt (MSI) 123, 127
Microcode update 45, 109, 113
Micro engines 108
Micron MT41J256M8 datasheet 209, 213, 217, 230, 234
Minimal boot loader for Intel 101
Minimum RAS 216, 242
Minimum refresh recovery delay time 228, 230
MO (module outline) 197, 243
Mode Register Set 215, 219, 231, 237
Model-specific registers (MSRs) 44, 45
Module 161, 171, 197, 202, 239, 243, 250
– manufacturer 195, 249, 253
– maximum thickness 243, 244
– nominal height 242, 243
– part number 251, 252
– revision code 252
– serial number 250
– thermal sensor 191, 239
MP Initialization protocol algorithm 121
MRC. see memory reference code (MRC) 115, 169, 172, 180, 189
MRHs. see Memory Repeater Hubs (MRHs) 20
MSB 191, 223, 225, 228
MSI. see message signaled interrupt (MSI) 123, 125
MSRs (model-specific registers) 44, 45
MTB (medium timebase) 206, 211, 216, 220, 226, 242
PIT. see programmable interrupt timer (PIT) 41, 126
Platform 22, 74, 97, 107, 140, 150, 178
– boot 144
– boot times 177
– initialization specification 49
– interrupt routing 82
– memory speed 136, 148, 149
– reset 168, 169
Platform policy 135, 137, 140, 144, 147
Platform-under-development (PUD) 96
Port x Enable (PxE) 51, 64, 131
Ports 20, 60, 64, 68, 74, 131
Ports implemented (PI) 63, 66
Post codes 38, 74, 77, 82, 118, 18
Post memory manager specification 50, 69
Power 31, 60, 91, 158, 174, 229
– management 27, 44, 116, 176
– sequencing 78, 107, 156, 165, 174
pre-OS shells 87
pre-OS user interaction, need for bootstrapping and 160
Pre-pre-boot 59
Pre-charge command period 208, 219, 221, 227, 231, 236
Pre-charge command delay time 223, 225, 232
Processor 107, 110, 114, 120, 149
– bootstrap 112, 168
– cache 33, 109, 118
– logical 120, 168
Programmable interrupt routing 101
Programmable interrupt timer (PIT) 41, 126
Programming 17, 21, 45, 53, 63, 130, 189
Protected mode 39, 41, 109, 114, 118, 125
PUD (platform-under-development) 96
PxE (Port x Enable) 51, 64, 131

Q
Quick Boot 165, 168, 170, 176, 178, 180

R
RAID 47, 63, 66, 131, 136, 181
RAID mode, in SATA controller initialization 63, 65, 130
RAID option ROM 64, 67, 68, 101, 131
RAID storage 161, 171
RAM. see random access memory (RAM) 3, 31, 32, 109, 115, 118, 136
Ranks & Device DQ count 190, 202
Read command delay time 230, 232
Real mode 39, 70, 81, 109, 115, 121, 125
Real mode interrupt 81
Real mode interrupts 80, 81
Reduction in boot times 138, 139
Reference code, initialization memory 15
Reference raw card revision 245, 247
Refresh 208, 210, 221, 227, 229
– options 191, 238, 240
– recovery delay 191, 228
Registers 53, 60, 66, 68, 112, 117, 119
Reserved 27, 132, 190, 197, 241, 247
Reserved ns 208, 210, 222, 225, 227
Reserved settings 209, 222, 225, 227
Reset 61, 107, 126, 137, 159, 164, 229
Reset vector 40, 83, 109, 117, 168
Read-only memory (ROM) 3, 31, 34, 40, 47, 107, 111, 132, 141, 161
Root ports 62, 128
Root system description table (RSDT) 69
Row active 190, 217, 220
Row address bits 200
RSDT (Root System Description Table) 69
RTC. see real time clock (RTC) 21, 34, 41, 126
RZQ 191, 237, 238

S
SATA (Serial ATA) 16, 22, 44, 49, 51, 62, 64
SATA controller 62, 64, 67, 130
– controller mode 63, 130
– initialization 62, 65, 130
SATA mode select (SMS) 63, 130
SATA ports 62, 65, 130, 175
SBST (smart battery table) 70
SCH. see system controller hub (SCH)
Schematics 52, 78, 193, 202
Script 75, 87, 91, 98
– files 85, 87, 91, 94, 97
SDRAM 16, 34, 196, 230, 234, 239
– chips 192, 202, 211, 215, 218, 220
– datasheet 194, 199, 201, 207, 211, 241
– density and banks 198, 199
– device type 191, 240, 241
– devices 200, 237, 241, 244
– thermal and refresh options 191, 238, 240
Secondary system description 70
Second-stage bootloader 102, 103
– executable 102
Security 6, 86, 109, 112, 169, 177
Segments 9, 36, 39, 102, 119
Serial output support 127
Seven-segment 38, 74, 82
Shadowing 34, 117, 119
Shell 85, 88, 90, 93, 95, 97
– application 88, 91, 93, 97
– commands 85, 88, 91
– core 92, 94
Shells and native applications 85, 88, 92, 94, 96, 98
Silicon 31, 43, 46, 77, 79, 81
SIPI (Startup Inter-Processor Interrupt) 112, 121, 122
Smart battery table (SBST) 70
SMBus byte read 170
SMI (system management interrupt) 109
SMM (System Management Mode) 45, 81, 109
SMS (SATA Mode Select) 63, 130
SMS field 63, 130, 131
SOC devices 112, 115
SOCs 17, 21, 113, 116, 120
SO-DIMM 190, 197
South bridge 20, 23, 34, 40
SPD (Serial Presence Detect) 51, 116, 189, 195, 201, 251
SPD byte 208, 212, 216, 220, 242
SPD byte, calculation of 241, 242
SPD data 116, 189, 192, 195, 200
SPD data based on SDRAM datasheet 194, 197, 201, 205, 209
SPD EEPROM 189, 194, 196
SPD field 193, 223, 228, 234, 238, 244, 250
Speed bins 208, 210, 221, 224, 227, 230, 233
SPI (Serial Peripheral Interface) 16, 22, 34, 120, 164, 169
SPI flash 166, 171
Splash screens 36, 142, 178
SRAT (System Resource Affinity Table) 70
SSDT (Secondary System Description) 70, 71
Standards 1, 22, 28, 46, 71, 191, 247
Storage applications platform design 203
Supported CL settings 208, 222, 227
Supported CWL settings 209, 222, 227
Switch 100, 104, 109, 111, 114, 143
System controller hub 21
System, operating 3, 80, 83, 99, 102, 140, 179
System BIOS 32, 60, 65, 130
System boot times 155, 164
System firmware debug techniques 73, 76, 80, 82
System firmware’s missing link 1, 4, 6, 10, 14
System management interrupt (SMI) 109
System management mode 45, 81, 109
System memory 5, 31, 34, 36, 56, 67, 131, 149
System memory map 35, 37, 131
System resource affinity table (SRAT) 70

T
tAAmin 211, 213, 241
Tables 56, 69, 101, 139, 219, 231, 236
tCK selection 209, 222, 225, 227
tCK requirements 209, 222, 225, 227
tCKmin 207, 209, 214, 241
Testing 86, 96, 117
tFAW value 234
tFAWmin 191, 234, 236
Thickness, baseline 243, 244
Timebase 213, 218, 228, 233, 236
TOPER 209, 222, 227
TPM 177, 178
tRAS 208, 210, 222, 226, 231
tRCD 208, 214, 216, 219, 221, 231, 233
tRCDmin 208, 210, 215, 217, 222
tREFI 242
tREFI 208, 210, 222, 227
tREFIns 208, 210, 222, 224, 227
tRPmin 217, 220, 242
tRTP 215, 219, 230, 233, 237
Trusted platform modules 177
TSC 158, 172
Two-byte field 249, 252
Typ 190, 201, 204, 249, 252