PCI 4th Edition Book Errata
Errata History For PCI System Architecture, 4th Edition
Please note that the change history table below was started on 3/12/01.
Changes made prior to that date are not reflected in the table but are contained
in this document.
5/7/02 519 Important Descriptions of bits 10:9 both say “does not
implement” and should say “does implement.”
9/2/03 387 Important Replaced the figure.
9/2/03 413 Important Replaced the figure and rewrote the text on
line five.
10/12/2010 407 Important Row for Offset 82 should read “Length of reserved read-only
VPD area below including Checksum byte below.”
10/12/2010 407 Important Add a 1 to all Offset numbers listed in this table starting
with Offset 80 to 84
1
PCI System Architecture
[X:Y],
where “X” is the most-significant bit and “Y” is the least-significant bit of the field. As
an example, the PCI address/data bus consists of AD[31:0], where AD[31] is the most-
significant and AD[0] the least-significant bit of the field.
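The [X:Y] notation maps directly onto a shift-and-mask operation. The following is a minimal sketch in Python; the helper name `extract_field` is hypothetical and is used only for illustration.

```python
def extract_field(value: int, msb: int, lsb: int) -> int:
    """Return bits [msb:lsb] of value, right-aligned."""
    if msb < lsb or lsb < 0:
        raise ValueError("expected msb >= lsb >= 0")
    width = msb - lsb + 1
    # Shift the field down to bit 0, then mask off everything above it.
    return (value >> lsb) & ((1 << width) - 1)


if __name__ == "__main__":
    ad = 0x12345678  # an arbitrary value standing in for AD[31:0]
    print(hex(extract_field(ad, 31, 24)))  # most-significant byte of the field
    print(extract_field(ad, 0, 0))         # AD[0], the least-significant bit
```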
Mailing Address
MindShare, Inc.
4285 Slash Pine Drive
Colorado Springs, CO 80908
6
Chapter 3: Intro to Reflected-Wave Switching
RST#/REQ64# Timing
The assertion and deassertion of RST# is asynchronous to the PCI clock signal. If
desired, a synchronous reset may be implemented, however. RST# must remain asserted
for a minimum of 1ms after the power has stabilized. RST# must remain asserted for a
minimum of 100 microseconds after the CLK has stabilized. When RST# is asserted, all
devices must float their output drivers within a maximum of 40ns.
During assertion of RST#, the system board reset logic must assert REQ64# for a mini-
mum of 10 clock cycles. REQ64# may remain asserted for a maximum of 50ns after
RST# is deasserted. For a discussion of REQ64# assertion during reset, refer to “64-bit
Cards in 32-bit Add-in Connectors” on page 266.
29
4 The Signal Groups
The Previous Chapter
The previous chapter provided an introduction to reflected-wave switching.
This Chapter
This chapter divides the PCI bus signals into functional groups and describes the func-
tion of each signal.
Introduction
This chapter introduces the signals utilized to interface a PCI-compliant device to the
PCI bus. Figure 4-1 on page 32 and Figure 4-2 on page 33 illustrate the required and
optional signals for master and target PCI devices, respectively. A PCI device that can
act as the initiator or target of a transaction would obviously have to incorporate both
initiator and target-related signals. In actuality, there is no such thing as a device that is
purely a bus master and never a target. At a minimum, a device must act as the target of
configuration reads and writes.
Each of the signal groupings is described in the following sections. It should be noted
that some of the optional signals are not optional for certain types of PCI agents. The
sections that follow identify the circumstances where signals must be implemented.
31
Chapter 4: The Signal Groups
software. For this reason, it must monitor PERR# during write data phases to
determine if the target has detected a data parity error. The action taken by an
initiator when a parity error is detected is design-dependent. It may perform
retries with the target or may choose to terminate the transaction and generate
an interrupt to invoke its device-specific interrupt handler. If the initiator
reports the failure to software, it must also set the Master Data Parity Error bit
in its PCI configuration Status register. PERR# is only driven by one device at
a time.
A detailed discussion of data parity error detection and handling may be found
in the chapter entitled “Error Detection and Handling” on page 199.
System Error
The System Error signal, SERR#, may be pulsed by any PCI device to report
address parity errors, data parity errors during a Special Cycle, and critical
errors other than parity. SERR# is required on all add-in PCI cards that perform
address parity checking or report other serious errors using SERR#. This signal
is considered a “last-recourse” for reporting serious errors. Non-catastrophic
and correctable errors should be signaled in some other way. In a PC-compati-
ble machine, SERR# typically causes an NMI to the system processor (although
the designer is not constrained to have it generate an NMI). In a PowerPC,
PReP-compliant platform, assertion of SERR# is reported to the host processor
via assertion of TEA# or MC# and causes a machine check interrupt. This is the
functional equivalent of NMI in the Intel world. If the designer of a PCI device
does not want an NMI to be initiated, some means other than SERR# should be
used to flag an error condition (such as setting a bit in the device's Status register
and generating an interrupt request). SERR# is a PCI clock-synchronous signal
and is an open-drain signal. It may be driven by more than one PCI agent at a
time. When asserted, the device drives it low for one clock and then tri-states its
output driver. The keeper resistor on SERR# is responsible for returning it to the
deasserted state (this takes two to three clock periods).
45
Chapter 6: Master and Target Latency
PCI bus masters should always use burst transfers to transfer blocks of data
between themselves and a target PCI device (some poorly-designed masters use
a series of single-data phase transactions to transfer a block of data). The trans-
fer may consist of anywhere from one to an unlimited number of bytes. A bus
master that has requested and has been granted the use of the bus (its GNT# is
asserted by the arbiter) cannot begin a transaction until the current bus master
completes its transaction-in-progress. If the current master were permitted to
own the bus until its entire transfer were completed, it would be possible for the
current bus master to starve other bus masters from using the bus for extended
periods of time. The extensive delay incurred could cause other bus masters
(and/or the application programs they serve) to experience poor performance
or even to malfunction (buffer overflows or starvation may be experienced).
As an example, a bus master could have a buffer-full condition and be requesting
the use of the bus in order to off-load its buffer contents to system memory. If it
experiences an extended delay (latency) in acquiring the bus to begin the trans-
fer, it may experience a data overrun condition as it receives more data from its
associated device (such as a network) to be placed into its buffer.
In order to ensure that the designers of bus masters are dealing with a predict-
able and manageable amount of bus latency, the PCI specification defines two
mechanisms:
75
Chapter 6: Master and Target Latency
Figure 6-2: Longest Legal Deassertion of IRDY# In Any Data Phase Is 7 Clocks
[Timing diagram. Signals shown: CLK, FRAME#, AD[31:0] (Address, then Data), C/BE#[3:0] (Config Read command, then Byte Enables), IRDY#, TRDY#, DEVSEL#, GNT#.]
77
Chapter 6: Master and Target Latency
tion when the master repeats the request. Odds are its inability to transfer the
first data item expeditiously is due to a temporary condition (such as a tempo-
rary logic busy condition) and that it will be prepared to transfer the data
within 16 clocks from the assertion of FRAME# the next time that the transac-
tion is repeated.
• If it’s a read transaction, issue the retry after memorizing the address and
command from the address phase, and the byte enables from the first data
phase. It then starts reading the requested data from the slow medium (e.g.,
from an ISA target).
• If it’s a write transaction, issue a retry after memorizing the address and
command from the address phase and the byte enables and write data from
the first data phase. The target then initiates the write to the slow medium.
Two Exceptions To First Data Phase Rule. There are only two excep-
tions:
83
PCI System Architecture
Devices are not required to honor this time limit during Trhfa (i.e., for a fixed
number of clock cycles after RST# is deasserted). For more information, refer to
“Initialization Time vs. Run Time.”
Note that bridges do not have to adhere to the maximum completion time
96
Chapter 7: The Commands
METHOD 1. In a single processor system (see Figure 7-1 on page 103), the interrupt
controller asserts INTR to the x86 processor. In this case, the processor
responds with an Interrupt Acknowledge transaction. This section
describes that transaction.
METHOD 2. In a multi-processor system, interrupts can be delivered to the array
of processors over the APIC (Advanced Programmable Interrupt Controller)
bus in the form of message packets. For more information, refer to the
MindShare book entitled Pentium Processor System Architecture (published
by Addison-Wesley).
METHOD 3. In a system that supports Message Signaled Interrupts, interrupts
can be delivered to the host/PCI bridge in the form of memory writes. For
more information, refer to “Message Signaled Interrupts (MSI)” on
page 252.
In response to an interrupt request delivered over the INTR signal line, an Intel
x86 processor issues two Interrupt Acknowledge transactions (note that the P6
family processors issue only one) to read the interrupt vector from the interrupt
controller. The interrupt vector tells the processor which interrupt service
routine to execute.
Background
In an Intel x86-based system, the processor is usually the device that services
interrupt requests received from subsystems that require servicing. In a PC-
compatible system, the subsystem requiring service issues a request by assert-
ing one of the system interrupt request signals, IRQ0 through IRQ15. When the
IRQ is detected by the interrupt controller within the South Bridge (see Figure
7-1 on page 103), it asserts INTR to the host processor. Assuming that the host
processor is enabled to recognize interrupt requests (the Interrupt Flag bit in the
EFLAGS register is set to one), the processor responds by requesting the inter-
rupt vector from the interrupt controller. This is accomplished by the processor
performing the following sequence:
101
Chapter 7: The Commands
DEVSEL# to force a wait state into the first data phase. This is necessary to
permit the bridge sufficient time to turn off its AD bus output drivers before
the target (the interrupt controller) begins to drive the requested interrupt
vector back to the bridge on the AD bus. Clock two is referred to as the
turnaround cycle.
CLOCK 3. The target (the South Bridge) has completed decoding the address
phase information (address and command) and asserts DEVSEL# to claim
the transaction.
The host/PCI bridge samples DEVSEL# still deasserted on the rising-edge
of clock three, indicating that the target has not yet claimed the transaction.
As a result, the data phase is extended by an extra clock (clock three), a wait
state tagged onto the data phase.
During the wait state (clock three) the target then drives the vector onto
the data path(s) indicated by the byte enable settings on the C/BE bus (just
BE0# asserted in an x86 environment, but a different processor type might
ask for a 32-bit vector) and asserts TRDY# to indicate the presence of the
requested vector.
The byte enables are a duplicate of the byte enables asserted by the host
processor during its second interrupt acknowledge bus cycle.
CLOCK 4. The host/PCI bridge samples DEVSEL# asserted on the rising-edge
of clock four, indicating that the target has claimed the transaction.
The host/PCI bridge also samples IRDY# and TRDY# asserted on the
rising-edge of clock four, indicating that the data is present (TRDY# asserted).
It reads the vector from the AD bus.
The target samples IRDY# asserted and FRAME# deasserted on the
rising-edge of clock four, indicating that the initiator is ready to complete the final
data phase (in fact, the only one) of the transaction.
Since the one and only data phase completed on the rising-edge of clock
four, the initiator ceases to drive the byte enables and deasserts the IRDY#
signal to return the bus to the idle state.
The target deasserts TRDY# and DEVSEL# and ceases to drive the interrupt
vector.
CLOCK 5. The bus returns to the idle state (FRAME# and IRDY# both deasserted)
on the rising-edge of clock five.
The host/PCI bridge passes the vector back to the processor which then reads
the vector off its data bus and terminates the Interrupt Acknowledge transac-
tion.
105
PCI System Architecture
STEP 1. The processor has two buffers in main memory that occupy adjacent
memory regions.
STEP 2. The processor writes data into the first memory buffer and then
instructs a bus master beyond a PCI-to-PCI bridge to read and process the
data.
STEP 3. The bus master starts its memory read using one of the bulk memory
read commands, thus giving the bridge permission to prefetch ahead of the
master while reading from main memory. The bridge ends up prefetching
past the end of the first memory buffer into the second one, but the bus
master only actually reads the data from the first buffer area.
STEP 4. The bridge does not discard the unused data that was prefetched from
the second buffer.
STEP 5. The processor writes data into the second memory buffer and then
instructs a bus master (the same master or a different one) beyond the same
PCI-to-PCI bridge to read and process the data.
STEP 6. The bus master starts its memory read at the start address of the second
buffer. The bridge delivers the data that it prefetched from the beginning of
the second buffer earlier. This is stale data and doesn’t reflect the latest data writ-
ten into the second memory buffer.
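The sequence above can be reproduced with a toy model. The sketch below is not real bridge logic: the class `PrefetchingBridge`, its `read_ahead` depth, and the eight-location memory are all hypothetical and chosen only to make the stale read visible.

```python
class PrefetchingBridge:
    """Toy read-ahead bridge; a hypothetical model, not real bridge logic."""

    def __init__(self, memory, read_ahead, discard_unused):
        self.memory = memory                  # list standing in for main memory
        self.read_ahead = read_ahead          # extra locations fetched on a miss
        self.discard_unused = discard_unused  # True models the required behavior
        self.buffer = {}                      # address -> prefetched value

    def read(self, start, count):
        data = []
        for addr in range(start, start + count):
            if addr not in self.buffer:
                # Miss: fetch this location plus read_ahead more from memory.
                end = min(addr + 1 + self.read_ahead, len(self.memory))
                for a in range(addr, end):
                    self.buffer[a] = self.memory[a]
            data.append(self.buffer.pop(addr))
        if self.discard_unused:
            self.buffer.clear()  # drop whatever the master did not consume
        return data


def run_scenario(discard_unused):
    memory = [0] * 8             # two adjacent buffers: addresses 0..3 and 4..7
    bridge = PrefetchingBridge(memory, read_ahead=4, discard_unused=discard_unused)
    memory[0:4] = [1, 1, 1, 1]   # processor fills the first buffer
    bridge.read(0, 4)            # master reads it; prefetch spills into address 4
    memory[4:8] = [2, 2, 2, 2]   # processor fills the second buffer
    return bridge.read(4, 4)     # master reads the second buffer


if __name__ == "__main__":
    print("prefetched data kept:     ", run_scenario(discard_unused=False))
    print("prefetched data discarded:", run_scenario(discard_unused=True))
```

When the unused prefetched data is kept, the first location of the second buffer comes back with its old contents; discarding the data after the first read removes the problem.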
116
Chapter 9: Write Transfers
[Timing diagram (clocks 1 through 9) showing the Address Phase and Data Phase, with the TRDY#, DEVSEL#, and GNT# signals. Annotation: the target asserts TRDY# to indicate it's ready to accept data and asserts DEVSEL# to claim the transaction.]
137
Chapter 9: Write Transfers
[Timing diagram showing FRAME#, TRDY#, DEVSEL#, and GNT#. Annotations: targets sample address and command; on decode, DEVSEL# asserted; wait states inserted by target; data transfers; initiator deasserts FRAME#, indicating it's ready to complete the last data phase.]
141
PCI System Architecture
176
Chapter 12: Early Transaction End
CASE 1. The initiator starts a single data phase transaction and aborts it due to
DEVSEL# not detected. This case is illustrated in Figure 12-2 on page 178.
CASE 2. The initiator starts a multi-data phase transaction and aborts it due to
DEVSEL# not detected. This case is illustrated in Figure 12-3 on page 179.
CLOCK 1. The initiator starts the transaction at the start of clock one by asserting
FRAME# and driving the address onto the AD bus and the command
type onto the Command/Byte Enable lines.
CLOCK 2. Because it is not yet ready to transfer the first data item, the initiator
doesn’t assert IRDY# in clock two.
CLOCK 3. During clock three, the master asserts IRDY# to indicate its readiness
to transfer the first (and only) data item.
At the same time, it deasserts FRAME#, indicating to the target that this is
the final data phase.
On the rising-edge of clock three, the master samples DEVSEL# deasserted,
indicating that the transaction has not been claimed by any target with a
Fast PCI decoder.
CLOCK 4. On the rising-edge of clock four, the master samples DEVSEL# deasserted
again, indicating that the transaction has not been claimed by any
target with a Medium PCI decoder.
CLOCK 5. On the rising-edge of clock five, the master samples DEVSEL# deasserted
again, indicating that the transaction has not been claimed by any
target with a Slow PCI decoder.
CLOCK 6. The initiator then samples DEVSEL# a final time on the rising-edge of
clock six to determine if the subtractive decoder in the ISA bridge has
claimed the transaction. In the example, the transaction has not been
claimed by any target, so the initiator must Master Abort the transaction
177
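The DEVSEL# sampling sequence described above amounts to a small lookup: the clock edge on which DEVSEL# is first sampled asserted identifies the decoder class, and no assertion by clock six means Master Abort. A minimal sketch follows; the function name `classify_claim` and the dictionary input format are assumptions made for illustration.

```python
# Rising-edge clock number -> decoder class that would have claimed by then,
# following the clock-by-clock description in the text.
DECODER_BY_SAMPLE_CLOCK = {3: "fast", 4: "medium", 5: "slow", 6: "subtractive"}


def classify_claim(devsel_samples):
    """devsel_samples maps a clock number to True if DEVSEL# was sampled
    asserted on that rising edge. Returns the decoder class or 'master abort'."""
    for clock in (3, 4, 5, 6):
        if devsel_samples.get(clock, False):
            return DECODER_BY_SAMPLE_CLOCK[clock]
    return "master abort"


if __name__ == "__main__":
    print(classify_claim({3: False, 4: True}))                      # claimed on clock four
    print(classify_claim({3: False, 4: False, 5: False, 6: False})) # nobody claimed
```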
PCI System Architecture
memory transaction starting at the first dword of the next cache line. This per-
mits the snooper (the host/PCI bridge) to snoop the next line address in the pro-
cessor’s L1 and L2 caches.
Assuming that the master decides to resume the transfer, after keeping its REQ#
deasserted for two PCI clocks, the master should then reassert its REQ# and re-
arbitrate for bus ownership. When it has successfully re-acquired bus owner-
184
Chapter 12: Early Transaction End
As long as the arbiter leaves the GNT# on the package, the master-capable
devices within the package may take turns using the bus to initiate transactions
and REQ# does not have to be deasserted between the transaction attempts
(even if an access attempt is terminated with a Retry or a Disconnect). Before the
package re-attempts a transaction that received a Retry or a Disconnect, how-
ever, it must deassert REQ# for two clocks (one of which is the idle clock
between transactions).
197
13 Error Detection
and Handling
Prior To This Chapter
The previous chapter described the early termination of a transaction before all
of the intended data has been transferred between the master and the target.
This included descriptions of Master Abort, the preemption of a master, Target
Retry, Target Disconnect, and Target Abort.
In This Chapter
The PCI bus architecture provides two error reporting mechanisms: one for
reporting data parity errors and the other for reporting more serious system
errors. This chapter provides a discussion of error detection, reporting and han-
dling using these two mechanisms.
199
Chapter 13: Error Detection and Handling
CLOCK 6. The initiator and target sample IRDY# asserted but TRDY# still deasserted
on the rising-edge of clock six, causing a second wait state to be
inserted in the third data phase during clock six.
The initiator samples PERR# to see if the second data item was received
correctly by the target.
The target’s parity latches PAR on clock six and compares the third data
phase’s actual parity to the expected parity. In the event of an error, the target
asserts PERR# during clock six. If the target performs this early parity
check and asserts PERR#, it must keep PERR# asserted until two clocks
after completion of the data phase. In the event that the target asserts
PERR# early in this manner, the spec has added a rule that the target
must eventually assert TRDY# to complete the data phase. It is not permitted
to end the data phase with a Retry, Disconnect without data, or a Target
Abort.
During the third data phase, the target re-asserts TRDY# during clock six
to indicate that it is ready to complete the data phase.
CLOCK 7. The final data phase completes on the rising-edge of clock seven
when IRDY# and TRDY# are sampled asserted. The target latches the final
data item from the bus at that point.
CLOCK 8. At the latest (if early parity check wasn’t performed), the target must
sample PAR one clock afterwards, on clock eight, and, in the event of an
error, must assert PERR# during clock eight.
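The comparison of "actual parity to the expected parity" can be sketched as follows. This example assumes PCI's even-parity rule (PAR is chosen so that the number of ones across AD[31:0], C/BE#[3:0], and PAR is even), which comes from the PCI specification rather than from this page. The function names are hypothetical.

```python
def compute_par(ad: int, cbe: int) -> int:
    """Return the PAR value that gives even parity over AD[31:0] and C/BE#[3:0]."""
    bits = (ad & 0xFFFFFFFF) | ((cbe & 0xF) << 32)
    # PAR is 1 exactly when the 36 covered bits contain an odd number of ones.
    return bin(bits).count("1") & 1


def parity_error(ad: int, cbe: int, par: int) -> bool:
    """True if the sampled PAR disagrees with the parity of the latched data,
    which is the condition under which a target would assert PERR#."""
    return compute_par(ad, cbe) != (par & 1)


if __name__ == "__main__":
    ad, cbe = 0x0000_00FF, 0x0
    par = compute_par(ad, cbe)
    print(parity_error(ad, cbe, par))      # data and PAR agree
    print(parity_error(ad ^ 0x1, cbe, par))  # one flipped data bit is detected
```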
207
Chapter 15: The 64-bit PCI Extension
[Timing diagram showing FRAME#, REQ64#, AD[31:0] (Address, Data, Data, Data), IRDY#, TRDY#, DEVSEL#, ACK64#, and GNT#. Annotations: targets begin address decode; target keeps TRDY# deasserted to enforce the turn-around cycle; target decodes address and asserts DEVSEL# and ACK64#; initiator samples ACK64# asserted, indicating that a 64-bit target is responding; wait states; data transfers; initiator deasserts FRAME# and REQ64# to signal ready to complete the last data phase; initiator deasserts IRDY#, returning the bus to the idle state.]
275
Chapter 15: The 64-bit PCI Extension
[Timing diagram showing AD[31:0] (Low Address, High Address, Data, Data), C/BE#[3:0] (Dual Address Cycle command, normal command, Byte Enables, Byte Enables), IRDY#, TRDY#, DEVSEL#, REQ64#, ACK64#, and GNT#. Annotations: initiator outputs the Dual Address Command; initiator outputs upper 32 bits of address; initiator outputs bus command; wait states; data transfers; initiator is not performing a 64-bit transfer; target doesn't assert ACK64# because it's a 32-bit target.]
291
PCI System Architecture
[Timing diagram showing FRAME#, AD[31:0] (Low Address, High Address, Data, Data), C/BE#[3:0] (Dual Address Cycle command, normal command, Byte Enables, Byte Enables), AD[63:32] (High Address, Data, Data), C/BE#[7:4] (normal command, Byte Enables, Byte Enables), IRDY#, TRDY#, DEVSEL#, REQ64#, ACK64#, and GNT#. Annotations: initiator outputs lower 32 bits of address; initiator outputs upper 32 bits of address; initiator outputs the Dual Address Command; initiator outputs memory read command; wait states; data transfers; initiator is performing a 64-bit transfer; target can perform 64-bit data transfers.]
294
Chapter 15: The 64-bit PCI Extension
64-bit Parity
297
Chapter 18: Configuration Transactions
If a target is accessed during initialization time, it is allowed to do any of the
following:
• Ignore the request (unless it is a device necessary to boot the OS).
• Claim the access and hold in wait states until it can complete the request,
not to exceed the end of initialization time.
• Claim the access and terminate with Retry.
Run time follows initialization time. Devices must comply with the latency rules
(see “Preventing Target From Monopolizing Bus”) during run time.
As mentioned earlier in this book, Intel x86 and PowerPC processors (as two
example processor families) do not possess the ability to perform configuration
read and write transactions. They use memory and IO (IO is only in the x86
case) read and write transactions to communicate with external devices. This
means that the host/PCI bridge must be designed to recognize certain IO or
memory accesses initiated by the processor as requests to perform configuration
accesses.
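As one illustration of how an IO access can stand in for a configuration access, the sketch below builds the 32-bit value that software writes to an address port under the layout commonly used on x86 platforms (Configuration Mechanism #1). That layout is an assumption brought in from the PCI specification, not something stated in this passage, and the function name is hypothetical.

```python
def config_address(bus: int, device: int, function: int, register: int) -> int:
    """Build an address-port value, assuming the Configuration Mechanism #1
    layout: bit 31 enable, [23:16] bus, [15:11] device, [10:8] function,
    [7:2] dword number, [1:0] zero."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus, device, or function out of range")
    if not (0 <= register < 256) or register % 4 != 0:
        raise ValueError("register must be a dword-aligned offset below 256")
    return 0x80000000 | (bus << 16) | (device << 11) | (function << 8) | register


if __name__ == "__main__":
    # Bus 0, device 2, function 0, configuration offset 10h
    print(hex(config_address(0, 2, 0, 0x10)))
```

The bridge would treat a subsequent access to the companion data port as a request to run the configuration transaction on the PCI bus.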
321
Chapter 18: Configuration Transactions
However, the bridge doesn’t assert FRAME# yet. It delays a sufficient number
of clocks to let the bits on the upper AD lines propagate through the resistors to
the IDSEL pins at the devices and settle to the correct state and then asserts
FRAME#. No devices will pay any attention to the transaction until FRAME# is
asserted.
As the data phase is entered, the PCI devices are performing the address decode
to determine which of them is the target of the transaction (00b on AD[1:0] indi-
cates it is for one of them). Devices that sampled their IDSEL inputs deasserted
at the end of the address phase ignore the transaction. When a device detects its
IDSEL pin was asserted at the end of the address phase, it must determine
whether or not to claim the transaction. How it does this depends on whether it
is a single- or multi-function device:
The spec states that a single-function device can either:
• decode the function number and only assert DEVSEL# for function
zero,
• or may respond to all function numbers by asserting DEVSEL#.
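The two options open to a single-function device can be sketched as a claim decision. The sketch assumes the function number is carried on AD[10:8] during the address phase, as defined by the PCI specification; the function and parameter names are hypothetical.

```python
def single_function_claims(idsel: bool, ad: int, decodes_function_number: bool) -> bool:
    """Return True if a single-function device asserts DEVSEL#.

    idsel: IDSEL sampled asserted at the end of the address phase.
    ad: value latched from AD[31:0] during the address phase.
    decodes_function_number: which of the two permitted behaviors the
        device implements.
    """
    if not idsel or (ad & 0b11) != 0b00:
        return False  # not selected, or AD[1:0] is not 00b
    if decodes_function_number:
        # Option 1: respond only when function zero is addressed.
        return ((ad >> 8) & 0b111) == 0
    # Option 2: respond regardless of the function number.
    return True


if __name__ == "__main__":
    access_to_function_3 = 3 << 8
    print(single_function_claims(True, access_to_function_3, True))
    print(single_function_claims(True, access_to_function_3, False))
```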
341
Chapter 19: Configuration Registers
Bit Function
transactions.
Default setting: zero.
15:10 Reserved
Status Register
Always mandatory. The Status register tracks the function’s status as a PCI entity.
A function must implement the bits that relate to its functionality. This register
can be read from, but writes are handled as follows. On a write, a bit that is cur-
rently set to one can be cleared to zero by writing a one to it. Software cannot set
a bit that is currently zero to a one. This method was chosen to simplify the pro-
grammer’s job. After reading the Status and ascertaining the error bits that are
set, the programmer clears the bits by writing the value that was read back to
the register.
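The write behavior described here is usually called write-one-to-clear. A minimal sketch, treating every bit of a 16-bit register as clearable for simplicity (an actual Status register also contains read-only bits, which this model ignores); the function name is hypothetical:

```python
def write_status(current: int, written: int) -> int:
    """Model a write to a write-one-to-clear register: each one bit written
    clears the corresponding bit; zero bits written leave it unchanged."""
    return current & ~written & 0xFFFF


if __name__ == "__main__":
    status = 0x8100                         # two error bits currently set
    status = write_status(status, status)   # write back the value that was read
    print(hex(status))                      # every bit that was set is now clear
```

Writing back the value that was read clears exactly the bits that were seen set, and a bit that became set after the read is left alone because a zero was written to it.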
Table 19-22 on page 372 describes the Status register bits and Figure 19-4 on
page 372 illustrates its bit assignment. The spec has made the following
changes to the Status register:
• A bit that was formerly reserved is now referred to as the Capabilities List bit.
• The UDF bit and feature have been deleted and the bit is now reserved.
• One bit has been renamed as the Master Data Parity Error bit.
371
PCI System Architecture
This gives the configuration software the flexibility to map the device’s register
set into memory space and, if an IO Base Address Register is also provided, into
IO space as well. If both are implemented, the device driver associated with the
device can then choose whether to communicate with its device’s register set
through memory or IO space.
Memory Base Address Register
This section provides a detailed description of the bit fields within a Memory
Base Address Register. The section entitled “Determining Block Size and
Assigning Address Range” on page 384 describes how the register is probed to
determine its existence, the size of the memory associated with the decoder, and
the assignment of the base address to the decoder.
Decoder Width Field. In a Memory Base Address Register (see Figure 19-7
on page 382), bits [2:1] define whether the decoder is 32- or 64-bits wide:
• 00b = it’s a 32-bit register. The configuration software therefore will write a
32-bit start memory address into it specifying any address in the first 4GB of
memory address space.
• 10b = it’s a 64-bit register. The configuration software therefore writes a 64-bit
start memory address into it that specifies a start address in a 2^64 memory
address space. This means that the device supports the Dual-Address
Command (DAC) that is used to address memory above the 4GB address
boundary. It also means that this Base Address Register consumes two
dwords of the configuration Header space. The first dword is used to set
the lower 32-bits of the start address and the second dword is completely
read/writable and is used to specify the upper 32-bits of the start address.
Please note that the spec no longer permits the 01b pattern that indicates the
device’s memory must be mapped into the first 1MB of memory space. This pattern
is now reserved.
Prefetchable Attribute Bit. Bit three defines the block of memory as
Prefetchable or not. A block of memory space may be marked as Prefetchable
only if it can guarantee that:
• there are no side effects from reads (e.g., the read doesn’t alter the contents
of the location or alter the state of the device in some manner). It’s permissi-
ble for a bridge that resides between a master and a memory target to
prefetch read data from memory that has this characteristic. If the master
doesn’t end up asking for all of the data that the bridge read into a read-
ahead buffer, the bridge must discard the data (see “Bridges Must Discard
Prefetched Data Not Consumed By Master” on page 116). The data remains
unchanged in the target’s memory locations.
380
PCI System Architecture
Bits 63:32  Upper 32 bits of Base Address
Bits 31:4   Lower part of Base Address
Bit 3       Prefetchable
Bits 2:1    Type
            00 - 32-bit decoder. Locate anywhere in lower 4GB
            01 - locate below 1MB (reserved in 2.2 spec)
            10 - 64-bit decoder. Locate anywhere in 2^64 memory
                 space (implies this register is 64-bits wide and
                 consumes next dword of config space as well
                 as this one).
            11 - reserved
Bit 0       Memory space indicator (0)
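The bit fields of a Memory Base Address Register can be pulled apart as sketched below. The function name and the returned dictionary keys are hypothetical; the field positions follow the layout described above.

```python
DECODER_TYPES = {
    0b00: "32-bit",
    0b01: "reserved (formerly: locate below 1MB)",
    0b10: "64-bit",
    0b11: "reserved",
}


def decode_memory_bar(low: int, high: int = 0) -> dict:
    """Split a Memory BAR into its fields. `high` is the next dword, which
    only contributes to the base address when the Type field is 10b."""
    if low & 0x1:
        raise ValueError("bit 0 is one: this is an IO BAR, not a memory BAR")
    type_field = (low >> 1) & 0b11
    base = low & 0xFFFFFFF0          # bits [31:4]
    if type_field == 0b10:
        base |= (high & 0xFFFFFFFF) << 32
    return {
        "type": DECODER_TYPES[type_field],
        "prefetchable": bool(low & 0x8),  # bit 3
        "base": base,
    }


if __name__ == "__main__":
    print(decode_memory_bar(0xE0000008))
    print(decode_memory_bar(0xFE00000C, high=0x1))
```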
IO Base Address Register
Introduction. This section provides a detailed description of the bit fields
within an IO Base Address Register. The section entitled “Determining Block
Size and Assigning Address Range” on page 384 describes:
• how the register is probed to determine its existence,
• how to determine the size of the IO register set associated with the decoder
and therefore the amount of IO space that must be assigned to it, and
• how to assign the base address to the decoder.
Description. Refer to Figure 19-8 on page 383. Bit zero returns a one, indicat-
ing that this is an IO, rather than a memory, decoder. Bit one is reserved and
must always return zero. Bits [31:2] form the Base Address field, which is used to:
• determine the size of the IO block required and
• set its start address.
The specification requires that a device that maps its control register set into IO
space must not request more than 256 locations per IO Base Address Register.
PC-Compatible IO Decoder. The upper bits of the IO BAR may be
hardwired to zero when a device is designed specifically for a PC-compatible,
x86-based machine (because Intel x86 processors can generate only a limited
range of IO addresses). The device must still perform a full decode of
the IO address, however.
382
Chapter 19: Configuration Registers
The programmer then writes a 32-bit base IO address into the register. How-
ever, only bits [31:8] are writable. The decoder accepts bits [31:8] and assumes
that bits [7:0] of the assigned base address are zero. This means that the base
address is divisible by 256, the size of the requested IO range.
A very large memory decoder would not permit any bits to be written and
would have a 10b in the decoder Type field, indicating that this is a 64-bit mem-
ory decoder consuming two dwords of configuration space (this one and the
one immediately following it). If this is the case, the programmer then writes all
ones into the high dword of the BAR register to determine how big a memory
space the decoder requires.
The largest IO decoder would permit bits [31:8] to be written. The binary-
weighted value of bit 8 is 256 and this is therefore the largest range that a PCI IO
decoder can request.
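The size calculation implied here can be sketched as follows: after all ones are written to an IO Base Address Register, the least-significant address bit that reads back as one gives the size of the requested range. The function names are hypothetical, and the readback values are made-up examples.

```python
def io_bar_size(readback: int) -> int:
    """Return the IO range size implied by the value read back after writing
    all ones to an IO BAR (0 if no address bits are writable)."""
    address_bits = readback & 0xFFFFFFFC  # ignore bit 0 (IO indicator) and bit 1
    # Isolate the least-significant set bit; its binary weight is the size.
    return address_bits & -address_bits


def is_valid_io_base(base: int, size: int) -> bool:
    """A base address must be a multiple of the decoder's size."""
    return size > 0 and base % size == 0


if __name__ == "__main__":
    size = io_bar_size(0xFFFFFF01)   # only bits [31:8] accepted the ones
    print(size)
    print(is_valid_io_base(0xE000, size), is_valid_io_base(0xE080, size))
```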
385
Chapter 19: Configuration Registers
• In the Base Address field (bits [31:11]), bit 17 is the least-significant bit that
the programmer was able to set to one. It has a binary-weighted value of
128K, indicating that the ROM decoder requires 128KB of memory space.
The programmer then writes a 32-bit start address into the register to assign
the ROM start address on a 128K address boundary.
The spec recommends that the designer of the Expansion ROM Base Address
Register should request a memory block slightly larger than that required by
the current revision ROM to be installed. This permits the installation of subse-
quent ROM revisions that occupy more space without requiring a redesign of
the logic associated with the device's Expansion ROM Base Address Register.
The specification sets a limit of 16MB as the maximum expansion ROM size.
The Memory Space bit in the Command register has precedence over the
Expansion ROM Enable bit. The device's expansion ROM should respond to
memory accesses only if both its Memory Space bit (in its Command register)
and the Expansion ROM Enable bit (in its expansion ROM register) are set
to one.
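The enable condition can be sketched as a single predicate. The sketch assumes the Memory Space bit is bit 1 of the Command register (per the PCI specification) and that the Expansion ROM Enable bit is bit zero of the Expansion ROM Base Address Register, with the base address in bits [31:11]; the function name and `rom_size` parameter are hypothetical.

```python
def rom_responds(address: int, command: int, rom_bar: int, rom_size: int) -> bool:
    """Return True if the expansion ROM decoder should claim a memory access."""
    memory_space_enabled = bool(command & 0x2)   # Command register, bit 1 (assumed)
    rom_enabled = bool(rom_bar & 0x1)            # Expansion ROM Enable bit
    base = rom_bar & 0xFFFFF800                  # Base Address field, bits [31:11]
    in_range = base <= address < base + rom_size
    # Both enables are required; the Memory Space bit takes precedence.
    return memory_space_enabled and rom_enabled and in_range


if __name__ == "__main__":
    rom_bar = 0x000C0000 | 0x1   # base C0000h with the ROM enable bit set
    print(rom_responds(0x000C0010, command=0x2, rom_bar=rom_bar, rom_size=128 * 1024))
    print(rom_responds(0x000C0010, command=0x0, rom_bar=rom_bar, rom_size=128 * 1024))
```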
In order to minimize the number of address decoders that a device must imple-
ment, one address decoder can be shared between the Expansion ROM Base
Address Register and one of the device’s Memory Base Address Registers. The
two Base Address Registers must be able to hold different values at the same
time, but the address decoder will not decode ROM accesses unless the Expan-
sion ROM Enable bit is set in the Expansion ROM Base Address Register.
387
PCI System Architecture
The Max_Lat register value indicates how often the master would like to have
access to the bus (i.e., receive its GNT# from the arbiter). The value hardwired
into this register is used by the configuration software to determine the priority-
level (and possibly the arbitration scheme the arbiter uses) the bus arbiter
assigns to the master (assuming that the arbiter is programmable). Please note
that if the arbiter is not programmable, the configuration software shouldn’t
waste any time reading this register.
The spec indicates that the value hardwired into this register should assume
that the device doesn’t insert any wait states into data phases.
New Capabilities
390
Chapter 19: Configuration Registers
ID Description
00h Reserved.
01h PCI Power Management Interface. Refer to “Power Management” on
page 479.
02h AGP. Refer to “AGP Capability” on page 394. Also refer to the MindShare
book entitled AGP System Architecture (published by Addison-Wesley).
03h VPD. Refer to “Vital Product Data (VPD) Capability” on page 397.
04h Slot Identification. This capability identifies a bridge that provides
external expansion capabilities (i.e., an expansion chassis containing add-
in card slots). Full documentation of this feature can be found in the revi-
sion 1.1 PCI-to-PCI Bridge Architecture Specification. For a detailed descrip-
tion, refer to “Introduction To Chassis/Slot Numbering Registers” on
page 566 and “Chassis and Slot Number Assignment” on page 594.
05h Message Signaled Interrupts. Refer to “Message Signaled Interrupts
(MSI)” on page 252.
06h CompactPCI Hot Swap. Refer to “CompactPCI and PMC” on page 699.
7-255d Reserved in 2.2 spec, but ID 07h was subsequently assigned to PCI-X
devices.
393
Chapter 20: Expansion ROMs
addition to setting this bit to one, the programmer must also set the Memory
Space bit in the function’s configuration Command register to a one. The func-
tion’s ROM address decoder is then enabled and the ROM (if present) can be
accessed. The maximum ROM decoder size permitted by the specification is
16MB, dictating that bits [31:24] must be hardwired to zero.
The programmer then performs a read from the first two locations of the ROM
and checks for a return value of AA55h. If this pattern is not received, the ROM
is not present. The programmer disables the ROM address decoder (by clearing
bit zero of the Expansion ROM Base Address Register to zero). If AA55h is
received, the ROM exists and a device driver code image must be copied into
main memory and its initialization code must be executed. This topic is covered
in the sections that follow.
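The signature check described above might be sketched as follows. The function names are hypothetical, the two bytes are assumed to have already been read from the first two ROM locations, and the 16-bit value is assumed to be formed with the first location as the low byte.

```python
ROM_SIGNATURE = 0xAA55   # value expected from the first two ROM locations
ROM_ENABLE = 0x1         # bit zero of the Expansion ROM Base Address Register


def rom_present(first_two_bytes: bytes) -> bool:
    """True if the first two ROM bytes form AA55h (first location = low byte)."""
    if len(first_two_bytes) < 2:
        return False
    return int.from_bytes(first_two_bytes[:2], "little") == ROM_SIGNATURE


def probe_expansion_rom(rom_bar: int, first_two_bytes: bytes):
    """Return (present, updated_bar).

    If the signature is missing, the ROM address decoder is disabled by
    clearing bit zero of the register value. If it is present, the register
    value is returned unchanged so the image can be copied to main memory.
    """
    present = rom_present(first_two_bytes)
    if not present:
        rom_bar &= ~ROM_ENABLE
    return present, rom_bar
```

A floating bus typically returns all ones, so `probe_expansion_rom(0xFE000001, b"\xff\xff")` reports no ROM and hands back the register value with the enable bit cleared.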
413
Chapter 22: Hot-Plug PCI
Async Notice of Slot Status Change
Input: Logical Slot ID
Return: none
This is the only primitive (defined by the spec) that is issued to the
Hot-Plug Service by the Hot-Plug System Driver. It is sent when the
Driver detects an unsolicited change in the state of a slot. Examples
would be a run-time power fault or card installed in a previously-empty
slot with no warning.
1. When RST# is asserted, a device must tri-state all of its bus outputs and
float its open-drain outputs within 40ns.
2. The device’s PCI target and bus master state machines must be held in their
reset state as long as RST# remains asserted.
3. When RST# is deasserted, the device’s PCI target and bus master state
machines must remain in the Idle state until the device is addressed in a PCI
transaction.
4. To avoid “misinterpreting” the contents of the bus if RST# is removed in the
midst of a transaction currently in progress, a device should not leave the
reset state until RST# has been deasserted AND the bus is idle (FRAME#
and IRDY# sampled deasserted).
5. The device must not depend on the PCI CLK signal exhibiting any particu-
lar characteristics prior to the deassertion of RST#.
6. The PCI spec requires that the operating frequency of a 66MHz bus may not
be changed without asserting RST#.
7. The spec dictates that the Hot-Plug System Driver must be prepared to
wait at least […] PCI CLKs after RST# has been deasserted before it succeeds
in accessing the card or completes a request from the Hot-Plug Service to
turn the card on.
475
PCI System Architecture
Installing New Devices Still Too Hard. Microsoft Windows 95 and the
Plug and Play initiative have provided an architecture that allows the user to
install hardware more easily; however, the task is still not an easy operation for
the end user. To install a new device, the user must turn off the computer and
open the box. The OnNow PC must be easily extensible by the end user, and
any device the user adds must become available without requiring a reboot or
restart.
System PM States
Table 23-2 on page 484 defines the possible states of the overall system with ref-
erence to power consumption. The “Working”, “Sleep”, and “Soft Off” states
are defined in the OnNow Design Initiative documents.
Power State    Description
484
Chapter 23: Power Management
Bit(s) Description
8:6 Aux_Current field. For a function that supports generation of PME# from
the D3cold state, this field reports the current demand made upon the
3.3Vaux power source (see “3.3Vaux” on page 534) by the function’s logic
that retains the PME context information. This information is used by soft-
ware to determine how many functions can simultaneously be enabled for
PME generation (based on the total amount of current each draws from
the system 3.3Vaux power source and the power sourcing capability of the
power source).
• If the function does not support PME# generation from within the
D3cold PM state, then this field is not implemented and always returns
zero when read.
• If the function implements the Data register (see “Data Register” on
page 524), this field is not implemented and always returns zero when
read. The Data register then takes precedence over this field in report-
ing the 3.3Vaux current requirements for the function.
• If the function supports PME# generation from the D3cold state and
does not implement the Data register, then the Aux_Current field
reports the 3.3Vaux current requirements for the function. It is
encoded as follows:
Bits [8:6]  Max Current Required
111 375mA
110 320mA
101 270mA
100 220mA
011 160mA
010 100mA
001 55mA
000 0mA
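The encoding above can be decoded with a small lookup. This is a sketch under stated assumptions: `reg_value` is taken to be the 16-bit register value that carries the Aux_Current field in bits 8:6, and both function names are hypothetical. The second helper illustrates the software use described above, checking whether a set of functions fits within the 3.3Vaux supply.

```python
# Aux_Current encoding (bits 8:6) to maximum 3.3Vaux current in mA,
# taken from the table above.
AUX_CURRENT_MA = {
    0b111: 375,
    0b110: 320,
    0b101: 270,
    0b100: 220,
    0b011: 160,
    0b010: 100,
    0b001: 55,
    0b000: 0,
}


def aux_current_ma(reg_value: int) -> int:
    """Extract bits 8:6 from the register value and return the current in mA."""
    field = (reg_value >> 6) & 0b111
    return AUX_CURRENT_MA[field]


def pme_budget_ok(register_values, supply_ma: int) -> bool:
    """True if the summed Aux_Current demand fits within the supply (in mA)."""
    return sum(aux_current_ma(v) for v in register_values) <= supply_ma
```

A field value of zero is ambiguous on its own: it is also what a function returns when it does not support PME# from D3cold or when it implements the Data register, so software would consult those conditions before trusting the decoded value.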
519
Chapter 24: PCI-to-PCI Bridge
As an example, assume that the configuration software has written the follow-
ing values to the Memory Base and Limit registers:
• The upper three digits of the Memory Base register contain 555h.
• The upper three digits of the Memory Limit register contain 678h.
15 4 3 0
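As a worked illustration of the example values, the sketch below computes the address window a bridge would decode. It assumes that bits 15:4 of each register supply address bits 31:20, that the lower 20 address bits are taken as 00000h for the base and FFFFFh for the limit, and that the low four register bits read as zero. The function name is hypothetical.

```python
def bridge_memory_window(base_reg: int, limit_reg: int):
    """Return (start, end) of the memory window defined by the two registers.

    base_reg and limit_reg are 16-bit register values whose upper three hex
    digits (bits 15:4) hold address bits 31:20.
    """
    start = ((base_reg >> 4) & 0xFFF) << 20              # lower 20 bits = 00000h
    end = (((limit_reg >> 4) & 0xFFF) << 20) | 0xFFFFF   # lower 20 bits = FFFFFh
    return start, end


# Upper three digits 555h and 678h, low digit assumed to be 0.
start, end = bridge_memory_window(0x5550, 0x6780)
print(f"{start:08X}h through {end:08X}h")
```

With the values from the example, the window runs from 55500000h through 678FFFFFh.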
591
Chapter 24: PCI-to-PCI Bridge
STEP 1. Identify the VGA display device to be utilized during the boot process.
This is accomplished by first scanning the standard expansion bus (e.g.,
ISA, EISA or Micro Channel). If the display device is found on the expan-
sion bus, that display is used for the boot process and the initialization
sequence has completed. Do not perform the steps that follow. If the VGA
device is not found on the expansion bus, scan the PCI bus(es), starting at
the bus with the largest bus number. If it is found on a PCI bus, save the bus
number for use during the remaining steps in the initialization sequence.
STEP 2. Set the IO Space and Memory Space enable bits in the device’s Com-
mand register so it can respond to VGA accesses.
STEP 3. Starting at the PCI bus number the boot display device is on, scan the
PCI bus hierarchy upstream (towards bus zero). In each PCI-to-PCI bridge
detected, set the VGA Enable bit in its bridge control register.
STEP 4. Starting at the PCI bus the boot display device is on, scan the PCI buses
downstream (all buses subordinate to this bus) looking for GFXs. Abort the
scan when the first instance of a GFX is found. Set the GFX device’s IO
Space bit and clear the VGA Palette Snoop Enable bits in its Command reg-
ister.
STEP 5. Scan back upstream from the bus the GFX is on towards the bus the
boot display device resides on. At each PCI-to-PCI bridge encountered, set
the VGA Palette Snoop Enable bit in the bridge’s Command register.
STEP 6. Finally, set the VGA Palette Snoop Enable bit in the boot display
device’s Command register.
615
Chapter 25: Transaction Ordering & Deadlocks
This example highlights that the device’s master and target state machines are
dependent on each other, thereby defining it (according to the 2.2 spec’s
definitions) as a bridge and not as a simple device.
The simple device must wait until it completes the memory write transaction on
the PCI bus (the target memory asserts TRDY#, or signals a Master Abort or a
Target Abort) before proceeding internally (in the example, updating the status
register, assuming that the write doesn’t receive a Master Abort or Target Abort).
Simple devices do not support exclusive (i.e., locked) accesses (only bridges do)
and do not use the LOCK# signal either as a master or as a target. Refer to
“Locking” on page 683 for a discussion of the use of LOCK# in bridge devices.
Scenario One
STEP 1. Two devices, referred to as A and B, simultaneously start arbitrating for
bus ownership to attempt IO writes to each other.
STEP 2. Device A is granted the bus first and initiates its IO write to device B
(device B is the target of the transaction). Device B decodes the address/
command and asserts DEVSEL#.
STEP 3. Assume that, when acting as a target, device B always terminates trans-
actions that target it with Retry until its master state machine completes its
outstanding requests (in this case, an IO write).
STEP 4. Device B is then granted the bus and initiates its IO write to device A.
STEP 5. If device A responds in the same manner that device B did (i.e., with a
Retry), the system will deadlock.
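The scenario can be illustrated with a toy model. This is a sketch, not a model of real bus timing: the `Device` class, the `retry_while_pending` flag and the alternating grant order are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    write_pending: bool = True         # master has an outstanding IO write
    retry_while_pending: bool = True   # target policy described in the steps above

    def respond_as_target(self) -> str:
        """Retry every access while this device's own master write is pending."""
        if self.retry_while_pending and self.write_pending:
            return "retry"
        return "accept"


def run(a: Device, b: Device, max_rounds: int = 10):
    """Grant the bus to A then B each round.

    Returns the round in which both writes have completed, or None if no
    forward progress is made within max_rounds (the deadlock case).
    """
    for rnd in range(1, max_rounds + 1):
        for master, target in ((a, b), (b, a)):
            if master.write_pending and target.respond_as_target() == "accept":
                master.write_pending = False
        if not (a.write_pending or b.write_pending):
            return rnd
    return None
```

When both devices use the retry policy, `run(Device("A"), Device("B"))` never completes and returns `None`. If device A accepts accesses while its own write is pending, both writes finish in the second round.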
Scenario Two
As described in a later section (“Bridges: Ordering Rules and Deadlocks” on
page 652), in certain cases a bridge is required to flush its posting buffer as a
master before it completes a transaction as a target. As described in the follow-
ing sequence, this can result in a deadlock:
651
PCI System Architecture
Introduction
When a bridge accepts a memory write into its posted memory write buffer, the
master that initiated the write to memory considers the write completed and
can initiate additional operations (i.e., PCI reads and writes) before the target
memory location actually receives the write data. Any of these subsequent
operations may end up completing before the memory write is finally consum-
mated. The possible result: a read the programmer intended to occur after the
write may happen before the data is actually written.
In order to prevent this from causing problems, many of the PCI ordering rules
require that a bridge’s posted memory write buffers be flushed before permit-
ting subsequently-issued transactions to proceed. These same buffer flushing
rules, however, can cause deadlocks. The remainder of the PCI transaction
ordering rules prevent the system buses from deadlocking when posting buffers
must be flushed.
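The hazard and the flush-based remedy can be sketched with a toy bridge model. The class and method names are hypothetical, and memory is reduced to a dictionary. The model shows only the ordering issue, not the deadlock-avoidance rules.

```python
from collections import deque


class PostingBridge:
    """Toy bridge with a posted memory write buffer (FIFO)."""

    def __init__(self):
        self.memory = {}        # target memory on the destination bus
        self.posted = deque()   # writes accepted but not yet delivered

    def post_write(self, addr, data):
        """Accept the write; the initiating master now considers it complete."""
        self.posted.append((addr, data))

    def flush(self):
        """Deliver all posted writes to memory in the order they were accepted."""
        while self.posted:
            addr, data = self.posted.popleft()
            self.memory[addr] = data

    def read(self, addr, flush_first=True):
        """Read memory, optionally flushing the posted write buffer first."""
        if flush_first:
            self.flush()
        return self.memory.get(addr, 0)


bridge = PostingBridge()
bridge.memory[0x1000] = 0x11
bridge.post_write(0x1000, 0x22)
stale = bridge.read(0x1000, flush_first=False)  # read passes the posted write
fresh = bridge.read(0x1000)                     # buffer flushed before the read
```

Here `stale` holds the old value 11h because the read completed before the posted write was delivered, while `fresh` holds 22h.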
652
PCI System Architecture
Posted Memory Write    Delayed Request    Delayed Completion
Rule 3—Ensures DWR Not Done Until All Posted Writes Done
A DWR that has just been latched may not be performed on the destination bus
before a previously-latched PMW is performed on the destination bus. Since the
DWR’s write was initiated after the PMW data was written to the bridge, it must
be written to the target on the destination bus after the previously-latched PMW
data. This ensures strong write ordering. In the Producer/Consumer example,
662
27 Locking
The Previous Chapter
The previous chapter introduced the PCI BIOS specification, revision 2.1, dated
August 26, 1994.
This Chapter
This chapter provides a detailed description of the PCI locking mechanism that
permits an EISA bridge to lock main memory or the host/PCI bridge to lock an
EISA memory target.
683
PCI System Architecture
System Card
General
As mentioned earlier, the system card contains:
Although it isn’t a rule, the CompactPCI specification recommends that the sys-
tem slot be physically placed on either end of the bus segment (rather than in
another position in the segment). All testing and verification performed by the
PICMG assumes this configuration. Other configurations are permitted but
must be verified by testing to validate that the electrical specification is met.
The slot rail for the system slot must be red in color, providing a vivid visual cue
to the user that it is the system slot.
710
Chapter 28: CompactPCI and PMC
Peripheral Cards
Peripheral cards can only be installed in peripheral card slots. A peripheral card
may act as a simple PCI target or may also have PCI bus master capability.
711
Chapter 28: CompactPCI and PMC
STEP 1. Using the console, the user informs the OS that a new card will be
installed in a slot.
STEP 2. As the card is inserted, it encounters staged power and ground pins
(see “Electrical Insertion/Removal Occurs In Stages” on page 752). When it
is fully-seated, the card has been automatically powered up.
STEP 3. The end user informs the system that the card is present.
STEP 4. The hot-swap handler informs the OS that a new card is present and
the OS performs bus enumeration (see “Introduction” on page 309) to
determine the resources required by the new card.
STEP 5. The OS programs the device’s PCI configuration registers to assign
resources to it.
STEP 6. The OS then loads the appropriate device driver and calls its initializa-
tion code.
STEP 7. The device driver’s initialization code finishes the setup of the card
(see “Step 7: OS Loads and Call Drivers’ Initialization Code” on page 243)
and brings it on-line.
Removing a Card. This involves the same steps as installing a card, only in
reverse order:
STEP 1. Using the console, the user warns the OS that a card will be removed.
STEP 2. The OS commands the device’s driver to quiesce. In other words, the
driver must stop using the card. In addition, either the driver or the soft-
ware utility would clear the device’s PCI configuration Command register
(see “Command Register” on page 368) to disable the device.
STEP 3. The OS removes the driver from memory.
STEP 4. The OS deallocates the resources that were assigned to the functions on
the card.
STEP 5. The OS informs the end user (via the console) that the card can be
removed.
STEP 6. The end user removes the card (see “Card Removal Sequence” on
page 752).
753
Access Latency. The amount of time that expires from the moment a bus master
requests the use of the PCI bus until it completes the first data transfer of the
transaction.
AD Bus. The PCI address/data bus carries address information during the
address phase of a transaction and data during each data phase.
Address Ordering. During PCI burst memory transfers, the initiator must indi-
cate whether the addressing sequence will be sequential (also referred to as lin-
ear) or will use cacheline wrap ordering of addresses. The initiator uses the state
of AD[1:0] to indicate the addressing order. During I/O accesses, there is no
explicit or implicit address ordering. It is the responsibility of the programmer
to understand the I/O addressing characteristic of the target device.
Address Phase. During the first clock period of a PCI transaction, the initiator
outputs the start address and the PCI command. This period is referred to as the
address phase of the transaction. When 64-bit addressing is used, there are two
address phases.
Agents. Each PCI device, whether a bus master (initiator) or a target, is referred
to as a PCI agent.
Arbiter. The arbiter is the device that evaluates the pending requests for access
to the bus and grants the bus to a bus master based on a system-specific algo-
rithm.
Arbitration Latency. The period of time from the bus master’s assertion of
REQ# until the bus arbiter asserts the bus master’s GNT#. This period is a func-
tion of the arbitration algorithm, the master’s priority and system utilization.
Atomic Operation. A series of two or more accesses to a device by the same ini-
tiator without intervening accesses by other bus masters.
Base Address Registers. Device configuration registers that define the start
address, length and type of memory space required by a device. The type of
space required will be either memory or I/O. The value written to this register
during device configuration will program its memory or I/O address decoder
to detect accesses within the indicated range.
763