Introduction to Intel® Architecture
The Basics
Todd Langley, Hardware Engineer
Rob Kowalczyk, Technical Market Engineer
Intel Corporation
January 2009
2 321087
Executive Summary
Intel® architecture is a powerful computing architecture built from a small
set of fundamental ingredients, each with a specific function. The basic
workings of these ingredients may not be intuitive to those who have never
designed with Intel® architecture. The goal of this paper is to describe
the basic operation and platform function of the ingredients used in three
classes of Intel® architecture platforms: those featuring the Intel® Atom™
processor, the Intel® Core™ 2 Duo processor, and the Intel® Core™ i7
processor. The paper walks through a processing core’s communication with
memory and I/O, the path of interaction between different types of I/O and
memory, and a high-level description of how the CPU gets and uses data.
The various system components are described along with the services they
provide. A brief introduction to critical support components (clocks,
voltage regulators, super I/O) is given. The paper also explains the common
terms used when describing Intel® architecture designs and operation. The
final section highlights the design aids and collateral that Intel provides
to help customers create successful products.
Contents
What is Intel® Architecture?
Conclusion
What is Intel® Architecture?
Though the prevalent personal computer architecture is Intel® architecture,
there are non-PC designs and designers that use Intel® architecture to create
compelling products outside the PC realm. For the designer who has never been
exposed to Intel® architecture, there can be concerns about how the
architecture works and perceptions regarding its complexity. The goal of this
article is to educate someone who has never been exposed to the workings of
Intel® architecture and to give initial guidance on how Intel® architecture
works and what the system components are.
The key to the longevity of Intel® architecture is that every newly introduced
product remains backward compatible with all previous Intel® architecture
CPUs. This compatibility has allowed software and software tools to be
reused across generations, and has allowed customers to reuse and build upon
prior Intel® architecture software and hardware investments in a
cost-efficient manner. To take full advantage of new capabilities, software
may need to be updated, but existing software will still work.
Looking at the history of Intel® architecture products, you see that they
begin with numeric designations, starting with the 4004. In 1993 a switch was
made to using names for new products, such as Pentium®. A complete list
showing the history of Intel® architecture CPUs is given in Reference 1, and
a more in-depth discussion of the differences is in the Intel® 64 and IA-32
Architectures Software Developer’s Manual, Vol. 1 (Reference 2). Much of the
evolution of Intel® architecture products over the years has been aimed at
increasing performance, efficiency, and platform capabilities. Some of the
ways to accomplish this have been to increase the width of internal and
external data paths and to increase the internal and external clock
frequencies (the document in Reference 2 describes the other advances in
greater detail). Other advances in Intel® architecture have come as products
have integrated various discrete components that are needed to make up an
Intel® architecture system. The other evolution of all Intel products has
been the advances in the silicon processes. This article provides an overview
of the internal and external interfaces in an Intel® architecture system and
describes the components that make up three of the popular types of Intel®
architecture systems: the Intel® Core™ i7 processors, which are the latest
high-performance products; the Intel® Core™ 2 Duo processors, the mainstream
desktop and mobile platform; and the Intel® Atom™ products, primarily for
ultra-mobile applications.
Reference 1: http://www.intel.com/pressroom/kits/quickreffam.htm
Reference 2: http://download.intel.com/design/processor/manuals/253665.pdf
Figure 1. System with Intel® Core™ 2 Duo Processor
[Block diagram: CPU, Memory, System Peripherals, BIOS, and High Speed I/O,
with a High Speed I/O Controller (IOH) linked to the CPU by Intel® QPI and to
a Legacy and Lower Speed I/O Controller (ICH) by DMI.]
Figure 3. System with Intel® Atom™ Processor
For all cases, the basic flow of information in an Intel® architecture system
utilizes the memory and I/O controller. During the power-on or reset of the
system, the BIOS firmware configures the memory and I/O controllers as to
where in the CPU’s memory map they will reside. In the Intel® architecture
system, the CPU can control the flow of data, or I/O devices can directly
transfer data to and from system memory, or in some cases directly between
I/O devices. The CPU can use data that external devices have placed in
memory, or it can itself move data between I/O and memory. The CPU can also
access I/O directly without using memory. The CPU performs both memory and
I/O operations, and the two use separate address spaces throughout the
Intel® architecture system.
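The split between memory and I/O addressing can be illustrated with a toy
decoder. This is only a sketch: the window names, base addresses, and sizes
are hypothetical values of the kind a BIOS might program, not values taken
from any Intel® chipset. The 64 KB size of the I/O port space is a property
of the x86 instruction set and is not stated in this paper.

```python
from dataclasses import dataclass

# x86 I/O port addresses are 16 bits wide, giving 64 KB of port space that
# is separate from the memory address space (not stated in this paper).
IO_SPACE_SIZE = 0x10000


@dataclass(frozen=True)
class Window:
    """One region of the memory map claimed by a device or by DRAM."""
    name: str
    base: int
    size: int

    def contains(self, addr: int) -> bool:
        return self.base <= addr < self.base + self.size


# Hypothetical memory-map windows, for illustration only.
MEMORY_WINDOWS = [
    Window("system DRAM", 0x0000_0000, 0x8000_0000),
    Window("PCIe device window", 0xE000_0000, 0x1000_0000),
    Window("BIOS flash", 0xFFF0_0000, 0x0010_0000),
]


def decode_memory(addr: int) -> str:
    """Return the name of the window that claims a memory address."""
    for window in MEMORY_WINDOWS:
        if window.contains(addr):
            return window.name
    return "unclaimed"


def is_valid_io_port(port: int) -> bool:
    """I/O accesses use their own address space, checked separately."""
    return 0 <= port < IO_SPACE_SIZE
```

The same numeric value can therefore mean two different things: as a memory
address it is routed by the memory map, while as an I/O port it is looked up
in the separate port space.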
achieve maximum performance. The pipelines get data and code from their
respective caches. There are multiple levels of cache throughout Intel®
architecture CPUs. The caches closest to the execution units have the lowest
level number (L1). Cache memory is SRAM that is part of the CPU silicon and
is designed for high-speed, low-latency access. Cache memory is organized to
have the full data width of the CPU pipeline. A common term in Intel®
architecture discussions is the cache line. This refers to the full width of
the cache and is what the CPU uses as its minimum memory access width.
The L2 cache is used to feed the L1 caches. The L2 cache is several times
larger than the L1 caches, but takes more time to access. The CPU tries to
make optimum use of the pipelines and the caches to maximize performance.
The way that software is written can have a dramatic effect on the
efficiency of cache and pipeline usage.
Intel has tools to aid in optimizing software for maximum performance. The
micro-architecture of the CPU dictates the way the cache and pipelines work
together. The three different Intel® architecture products highlighted in this
document each have differences in their micro-architectures while running
the same instruction set architecture. The details of these micro-architectures
are covered in product-specific documents. The CPU uses the front side bus
(FSB) to transfer code and data to and from other components, primarily the
MCH. The name front side bus came about when the original L2 cache was added
to the CPUs. The L2 cache memories were not integrated in the CPU silicon, so
they had their own bus, called the back side bus. The front side bus went to
memory and I/O off the CPU package.
In the current implementation, the timing control for data being read and
written on the FSB uses source-synchronous clocking. In this approach,
dedicated strobe signals are sent synchronously with the address and data
from the originating device. This makes the high clock speeds of the FSB
feasible (up to 400 MHz). The strobes run at multiples of the base clock
frequency, which allows the address to be double pumped (double clocked) and
the data to be quad pumped. The FSB speed is commonly referenced by the
throughput rate and not the base clock rate due to the quad pumping of data
(e.g., 1333 Mega Transfers per second (MT/s) is based on a 333 MHz base
clock).
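The relationship between base clock and quoted FSB speed can be worked through
in a few lines. This is a minimal sketch, assuming the double-pumped address
and quad-pumped data described above and the 64-bit data width stated for the
FSB in this paper. The function name and the peak-bandwidth figure are
illustrative additions, and the peak figure ignores all protocol overhead.

```python
def fsb_rates(base_clock_mhz: float, data_width_bits: int = 64):
    """Return (address MT/s, data MT/s, peak data MB/s) for an FSB base clock.

    Address strobes run at 2x the base clock and data strobes at 4x.
    Peak bandwidth is the data transfer rate times the bus width in bytes.
    """
    address_mt_s = base_clock_mhz * 2
    data_mt_s = base_clock_mhz * 4
    peak_mb_s = data_mt_s * data_width_bits / 8
    return address_mt_s, data_mt_s, peak_mb_s


# A base clock of one third of 1 GHz (about 333 MHz) gives the 1333 MT/s
# figure quoted in the text.
address_rate, data_rate, peak = fsb_rates(1000 / 3)
print(f"{data_rate:.0f} MT/s, about {peak / 1000:.1f} GB/s peak")
```

This is why a bus marketed as "1333" does not have a 1333 MHz clock: the
quoted number counts data transfers, four of which occur per base clock cycle.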
The FSB uses dedicated address, data, strobe, and request pins. The data
width is 64 bits and the number of address pins varies from 32 to 40
depending on the class of CPU. The FSB request pins are used to signify the
different phases of a given read or write operation. The request pins are
also used to signify cache coherency status for a given FSB transaction. One
of the operations that occurs on the FSB is maintaining the coherency of the
contents of the CPU cache and system memory. These coherency operations occur
automatically as software executes and data is moved to and from memory
by the CPU or by an I/O device. The bus is pipelined and supports split
transactions. This improves command bandwidth by allowing new requests
to be issued before previous transactions complete.
Memory Controller
The central hub for the data traffic in an Intel® architecture system is the
Memory Controller Hub (MCH). Until the new Intel® Atom™ and Intel® Core™
i7 architectures were developed, the MCH had been a discrete component.
The Intel® Core™ 2 Duo architecture uses a discrete MCH which will be
described first. Figure 6 shows the MCH for a system featuring the Intel®
Core™ 2 Duo processor. The MCH facilitates the transfer of data to and from
all the interfaces. When the BIOS configures the MCH, it defines the base
address locations for all the interfaces. The BIOS relays the configuration
information to the operating system so it knows the capabilities and locations
of the hardware that is in its system.
There are many different models of MCH currently offered by Intel. The
feature differences between MCHs include the number and type of memory
channels, the number of PCIe* lanes supported, internal 2D/3D graphics, and
single or multiple CPU support (uni-processor (UP), dual-processor (DP), or
multi-processor (MP)). Intel validates specific CPUs with specific MCHs to
provide a well-balanced platform. Not all CPU FSB speeds are compatible with
all MCHs.
[Figure 6: Intel® MCH block diagram. Blocks: Front Side Bus Unit, Memory
Controller, PCI Express*, DMI, iGFX, and Data Steering and Arbitration.]
The CPU connects to the MCH through the FSB which was described earlier.
The FSB unit in the MCH is responsible for the CPU cache coherency. If data
at the address requested is not in the CPU cache, or the data in memory is
newer, the memory controller is told to retrieve the data at that address.
Data transfers between the CPU and memory are always 64 bits, the full
width of the L2 cache on the CPU. If only a byte of data is requested, the full
64 bits is retrieved but the CPU will only use 8 of those bits. The memory
controller is configurable by the BIOS to support multiple speeds and sizes of
memory. Once the memory controller has been configured, it handles the
refreshing of the DRAM. The specific type, size, and speed of memory that is
supported varies by the model of MCH.
The Direct Media Interface (DMI) in the MCH is a dedicated serial link to
the I/O Controller Hub (ICH). The DMI link is actually four serial links,
each with dedicated transmit and receive pins. These serial links are
referred to as “lanes” and all use differential signaling. So the DMI is
4 lanes x transmit and receive (2) x differential signaling (2) = 16 pins.
The DMI usage will be described in a separate section. DMI supports
signaling at 2.5 GT/s.
The PCI Express* (PCIe*) interface is the highest-bandwidth I/O interface in
the Intel® architecture system. The number of PCIe lanes varies with the MCH
used, but is usually a multiple of 8. A common width for PCIe is 16 lanes,
as this is the maximum width for discrete PCIe graphics cards. The PCIe
interface uses the same differential signaling that the DMI does, but PCIe
supports higher transfer rates. The original PCIe specification defines a
data rate of 2.5 Gb/s per lane (this is what DMI uses). The second generation
of PCIe is now available and doubles the data rate to 5 Gb/s.
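The lane arithmetic for DMI and PCIe can be sketched as follows. Pins are
counted as lanes x 2 directions x 2 wires per differential pair. The usable
bandwidth figure assumes 8b/10b line encoding, which first- and
second-generation PCIe use but which this paper does not mention. Applying
the same encoding to DMI is an assumption here, and the function names are
illustrative.

```python
def link_pins(lanes: int) -> int:
    """Signal pins for a serial link: each lane has a transmit and a
    receive differential pair, and each pair uses two pins."""
    directions = 2
    wires_per_pair = 2
    return lanes * directions * wires_per_pair


def link_bandwidth_mb_s(lanes: int, gt_per_s: float,
                        payload_bits: int = 8, symbol_bits: int = 10) -> float:
    """Usable bandwidth per direction in MB/s.

    Assumes 8b/10b encoding by default: every 10 bits on the wire carry
    8 bits of payload.
    """
    raw_mbit_s = lanes * gt_per_s * 1000          # raw line rate, Mbit/s
    payload_mbit_s = raw_mbit_s * payload_bits / symbol_bits
    return payload_mbit_s / 8                     # bits to bytes


print(link_pins(4), link_bandwidth_mb_s(4, 2.5))    # four-lane DMI link
print(link_pins(16), link_bandwidth_mb_s(16, 5.0))  # x16 second-generation PCIe
```

Under these assumptions a four-lane link at 2.5 GT/s moves roughly 1 GB/s in
each direction, and a 16-lane second-generation PCIe link roughly 8 GB/s in
each direction.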
Many of the MCH versions also have internal graphics controllers. The details
of the graphics controllers won’t be covered in this document, but the basic
capabilities are 2D and 3D acceleration. The types of display interfaces
directly supported by MCHs vary by model.
I/O Controller
The I/O Controller Hub (ICH) provides extensive I/O support, support for
legacy peripherals dating back to the 1980s, and integrated support for key
platform management functions such as power sequencing and ACPI power
management, fan speed control, and reset timing. It will be seen that these
latter functions are critical to system operation and are often overlooked by
designers.
[Figure: Intel® ICH internals. Blocks: DMI, SATA, RTC, USB, Timers /
Interrupts, Internal Buses, PCIe*, GPIO, SPI, IDE, LPC, Power Management,
LAN, Audio.]
As the ICH is used to control the reset sequence, and often the power
sequencing of the other system components, it has power supplies that are
required to turn on before the rest of the system. Also, the Real Time Clock
(RTC) needs to have its 32.768 kHz oscillator running beforehand for the ICH
to sequence properly. The ICH communicates with the IOH/MCH during reset and
power cycling events to try to make these events “safer”. For example, a
warning message will be sent over DMI before a reset to allow SMBus or memory
transactions to complete before the reset.
obsolete and SPI is expected to be the standard interface for BIOS flash in
the future. The ICH is always a master on the SPI interface.
• Low Pin Count Interface (LPC): This interface replaces the ISA bus
originally developed by IBM in the early 1980s, but uses only 7 signals
plus a clock. It can be used to connect to a variety of low-speed devices
that don’t require the bandwidth of PCI or PCI Express*. This interface
is typically used to interface with Super I/O devices, which contain
many interfaces such as a floppy drive controller, PS/2 keyboard/mouse
controls, and serial ports.
• JTAG: Boundary scan allows testing of the PCB after assembly.
Support Peripherals
The ICH integrates numerous support peripherals that replace many external
components.
• Real Time Clock (RTC): The RTC is compatible with the Motorola
MC146818A*. It contains 256 bytes of RAM that can be maintained
with a 3V battery. 242 bytes are available for use while the remaining
14 bytes are dedicated to the clock function. The RTC supports generating
wake events up to 30 days in the future. An external 32.768 kHz
crystal is required for operation.
• High Precision Event Timers: These are high-resolution timers which
can be used to generate periodic or one-shot interrupts. There are 8
comparators which share a common counter that is clocked from a
14.31818 MHz source.
• Advanced Programmable Interrupt Controller (APIC): This is a more
modern interrupt controller than the 82C59 (see below). It supports
multiprocessor/multi-core interrupt management and allows interrupts
to be directed to a specific processor. The I/O APIC in the ICH can
support up to 24 interrupt vectors and can work in conjunction with
I/O APICs in other devices (such as the IOH) to help eliminate the
need for multiple devices to share interrupts.
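The High Precision Event Timer description above implies some simple
arithmetic: a comparator fires when the shared counter reaches the value
loaded into it. The sketch below uses the 14.31818 MHz clock from the text.
The 64-bit counter width and all function names are assumptions made for
illustration, and no real register interface is modeled.

```python
HPET_CLOCK_HZ = 14_318_180  # 14.31818 MHz source clock, from the text


def tick_period_ns() -> float:
    """Duration of one counter tick in nanoseconds."""
    return 1e9 / HPET_CLOCK_HZ


def ticks_for_interval(seconds: float) -> int:
    """Number of counter ticks closest to the requested interval."""
    return round(seconds * HPET_CLOCK_HZ)


def next_comparator_value(counter: int, seconds: float,
                          counter_bits: int = 64) -> int:
    """Value to load into a comparator so that it matches `seconds` from
    now. The counter width is an assumption; the sum wraps around the same
    way a free-running hardware counter would."""
    return (counter + ticks_for_interval(seconds)) % (1 << counter_bits)


print(f"tick = {tick_period_ns():.2f} ns")
print(f"1 ms = {ticks_for_interval(0.001)} ticks")
```

For a periodic interrupt, software or hardware would add the same tick count
to the comparator again after each match. A one-shot interrupt simply leaves
the comparator alone once it has fired.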
Compatibility Peripherals
The ICH contains peripherals that date back to the earliest IBM* PCs which
used the Industry Standard Architecture (ISA) bus. The ISA bus has been
replaced by the Low Pin-Count (LPC) bus in modern systems, but the
peripherals that were once discrete components are now integrated into the
ICH. One key strength of Intel® architecture is maintaining backward
compatibility while continuing to innovate.
The ICH contains two 82C37 DMA controllers, two ISA-compatible 82C59
interrupt controllers, and three 82C54 programmable interval timer
equivalents.
The 82C37 DMA controllers should not be confused with the DMA engines
found in some MCHs. These DMA controllers are tied to the ISA/LPC bus and
are used mostly for transfers to and from slow devices such as floppy disk
controllers.
The ISA compatible 82C59 interrupt controllers have been largely supplanted
by the Advanced Programmable Interrupt Controller (APIC) since it offers
support for more than 15 interrupt sources and supports multi-core/multi-
processor systems. However, the 82C59 controllers are still used by some
older operating systems which run only on uni-processor (single CPU)
systems.
The ICH provides numerous I/O interfaces including 32-bit PCI, USB, SATA,
extra PCIe* lanes, general purpose I/O pins, and SMBus. The ICH has two
interfaces for connection of the BIOS device (usually Flash EPROM). One is
the Low Pin Count (LPC) bus; the second is a Serial Peripheral Interface
(SPI). The LPC is a 33 MHz, 4-bit bus that can be used for numerous
low-speed devices. Besides the BIOS flash, the LPC can be used for other
devices such as a microcontroller, a security device, or a Super I/O. A Super
I/O is a device that integrates much of the legacy I/O, such as serial ports,
a parallel printer port, a floppy drive controller, keyboard and mouse
interfaces, and others, into a single device. The ICH also has timers and
interrupt controllers. Some versions of the ICH have other features like
audio and Ethernet controllers. The ICH has state machines that respond to
external signals to control power and reset. Many of the internal addresses
in the ICH are fixed. This allows the BIOS and applications to always find
the minimum system components needed to start up an Intel® architecture
system.
What is a BIOS?
The Intel® architecture system relies on firmware that is always located at
the CPU reset vector (which is FFFFFFF0h but appears on the FSB as FFFFFFC0h
because a full cache line is read). This firmware is known as the Basic
Input/Output System (BIOS). The BIOS controls the activity of the Intel®
architecture hardware until the operating system takes over. One job of the
BIOS is to configure registers throughout the Intel® architecture components,
setting up the devices for the particulars of the system hardware into which
the Intel®
architecture is designed. In a typical PC design, some of the hardware is
fixed by the motherboard design, but other hardware aspects vary based on
what the end user plugs into the motherboard. As the BIOS executes, after
the initial configuration is done, it determines the type and amount of
memory and then goes through a discovery phase. Once all the devices and
hardware are configured, the BIOS turns over control of the system to an
operating system.
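The remark that the reset vector appears on the bus at a different address
follows from cache-line alignment. The sketch below assumes a 64-byte cache
line, which this paper does not state explicitly but which is consistent with
the reset vector FFFFFFF0h being fetched as part of the line that starts at
FFFFFFC0h. The names are illustrative.

```python
CACHE_LINE_BYTES = 64        # assumed line size, see the note above
RESET_VECTOR = 0xFFFFFFF0    # first instruction fetch after reset


def cache_line_base(addr: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Address of the first byte of the cache line that contains addr."""
    return addr & ~(line_bytes - 1)


def offset_in_line(addr: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Byte offset of addr within its cache line."""
    return addr & (line_bytes - 1)


print(f"{cache_line_base(RESET_VECTOR):08X}h")   # line requested on the bus
print(offset_in_line(RESET_VECTOR))              # where the vector sits in it
```

The reset vector sits 16 bytes below the 4 GB boundary, which is why the
firmware device has to be mapped at the very top of the 32-bit address space.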
[Figure: link to the IOH. Control/Data Link: 20 differential pairs.
Forwarded Clock: 1 differential pair.]
The IOH
The Input/Output Hub or IOH is a new system component introduced with the
Intel® Core™ i7 processor. The Intel® X58 Express Chipset is the first
component with the IOH designation and replaces the MCH in some respects.
The IOH primarily serves as a switch between the processor’s Intel®
QuickPath Interconnect and several PCI Express* ports. It maintains proper
ordering of transactions and provides for fair arbitration between ports. The
DMI link is reserved for communication to the Intel® ICH.
There are different IOHs supporting a number of different PCI Express* and
platform configurations.
Figure 11. Intel® SCH Internals
[Block diagram of the Intel® SCH. Blocks: FSB (CMOS), Memory Controller,
Graphics, SDVO / LVDS, IDE, USB, PCIe*, SPI, LPC, SDIO / MMC, HD Audio, LAN,
RTC, Timers / Interrupts, GPIO, Power Management, Internal Buses.]
The I/O and peripheral interface differences between the ICH and SCH are
listed below.
• SDIO Secure Digital Input / Output – Usually used for media cards.
• MMC MultiMediaCard – Usually used for media cards.
• SDVO Serial Digital Video Out – display interface.
• LVDS Low Voltage Differential Signaling – flat panel display interface.
The Intel® Atom™ processor design is optimized for very low power
consumption. The voltage levels are lower and the speed of the FSB is lower
than the Intel® Core™ 2 Duo. The lower speed FSB allows CMOS drivers to be
used which draw less power than the GTL drivers. Another capability of the
Intel® Atom™ CPU is to dynamically reduce on chip cache size to save power.
The SCH has many advanced power management capabilities to enable the
lowest possible platform power consumption.
Specification Updates contain lists of errata and the most recent changes
to the other documents. Specification updates should always be consulted for
the latest available information.
Conclusion
The Intel® architecture products offer a great span of features, performance,
and power levels. The ability to reuse software across generations and
product families is a great benefit and gives designs the ability to scale
performance and features without new software re-writes. The basic building
blocks of an Intel® architecture system are highly integrated and make
system design as straightforward as possible. Intel provides all of the
resources necessary to design leading edge products, so go to www.intel.com
and start creating.
References
Intel® CPU History http://www.intel.com/pressroom/kits/quickreffam.htm
Acronyms
CPU   Central Processing Unit
DMI   Direct Media Interface
FSB   Front Side Bus
ICH   Input / Output Controller Hub
IOH   Input / Output Hub
MCH   Memory Controller Hub
SCH   System Controller Hub
QPI   Intel® QuickPath Interconnect
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS.
NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL
PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS
AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY
WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO
SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO
FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT,
COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use
in medical, life saving, or life sustaining applications.
Intel may make changes to specifications and product descriptions at any time, without notice.
This paper is for informational purposes only. THIS DOCUMENT IS PROVIDED "AS IS" WITH NO
WARRANTIES WHATSOEVER, INCLUDING ANY WARRANTY OF MERCHANTABILITY,
NONINFRINGEMENT, FITNESS FOR ANY PARTICULAR PURPOSE, OR ANY WARRANTY OTHERWISE
ARISING OUT OF ANY PROPOSAL, SPECIFICATION OR SAMPLE. Intel disclaims all liability, including
liability for infringement of any proprietary rights, relating to use of information in this specification.
No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted
herein.
BunnyPeople, Celeron, Celeron Inside, Centrino, Centrino logo, Core Inside, Dialogic, FlashFile, i960,
InstantIP, Intel, Intel logo, Intel386, Intel486, Intel740, IntelDX2, IntelDX4, IntelSX2, Intel Core,
Intel Inside, Intel Inside logo, Intel. Leap ahead., Intel. Leap ahead. logo, Intel NetBurst, Intel
NetMerge, Intel NetStructure, Intel SingleDriver, Intel SpeedStep, Intel StrataFlash, Intel Viiv, Intel
vPro, Intel XScale, IPLink, Itanium, Itanium Inside, MCS, MMX, Oplus, OverDrive, PDCharm, Pentium,
Pentium Inside, skoool, Sound Mark, The Journey Inside, VTune, Xeon, Xeon Inside, Intel Core, and
Intel Atom are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the
United States and other countries.