Pcie Aer
298 • Enable PCI Express Advanced Error Reporting in the Kernel
[Figure 1: PCI Express Port Topology]

[Figure 2: AER Root Port Service Driver]

Once the PCI Express AER service driver is loaded, it claims all AERrs service devices in a system device hierarchy, as shown in Figure 2. For each AERrs service device, the advanced error reporting service driver configures its service device to generate an interrupt when an error is detected [3].

[Figure 3: PCI Express Error Reporting procedures]

• Gathers the comprehensive error information if errors occurred.

To support traditional error handling, PCI Express provides baseline error reporting, which defines the basic error reporting mechanism. All PCI Express devices have to implement this baseline capability and must map required PCI Express error support to the PCI-related error registers, which include enabling error reporting
2007 Linux Symposium, Volume Two • 299
and setting status bits that can be read by PCI-compliant software. But the baseline error reporting doesn’t define how platforms notify system software about the errors.

PCI Express errors consist of two types, correctable errors and uncorrectable errors. Correctable errors include those error conditions where the PCI Express protocol can recover without any loss of information. A correctable error, if one occurs, can be corrected by the hardware without requiring any software intervention. Although the hardware can correct and reduce correctable errors, they may still have an impact on system performance.

Uncorrectable errors are those error conditions that impact functionality of the interface. To provide more robust error handling to system software, PCI Express further classifies uncorrectable errors as fatal and non-fatal. Fatal errors might cause the corresponding PCI Express links and hardware to become unreliable. System software needs to reset the links and corresponding devices in a hierarchy where a fatal error occurred. Non-fatal errors don’t cause the PCI Express link to become unreliable, but might cause transaction failure. System software needs to coordinate with a device agent, which generates a non-fatal error, to retry any failed transactions.

PCI Express AER provides a more reliable error reporting infrastructure. Besides the baseline error reporting, PCI Express AER defines more fine-grained error types and provides log capability. Devices have a header log register to capture the header for the TLP corresponding to a detected error.

Correctable errors consist of receiver errors, bad TLP, bad DLLP, REPLAY_NUM rollover, and replay timer time-out. When a correctable error occurs, the corresponding bit within the Advanced Correctable Error Status register is set. These bits are automatically set by hardware and are cleared by software writing a “1” to the bit position. In addition, through the Advanced Correctable Error Mask register (which has a bitmap similar to that of the Advanced Correctable Error Status register), a specific correctable error can be masked so that it is not reported to the root port. Although masked errors are not reported, the corresponding bit in the Advanced Correctable Error Status register will still be set.

Uncorrectable errors consist of Training Errors, Data Link Protocol Errors, Poisoned TLP Errors, Flow Control Protocol Errors, Completion Time-out Errors, Completer Abort Errors, Unexpected Completion Errors, Receiver Overflow Errors, Malformed TLPs, ECRC Errors, and Unsupported Request Errors. When an uncorrectable error occurs, the corresponding bit within the Advanced Uncorrectable Error Status register is set automatically by hardware and is cleared by software writing a “1” to the bit position. Advanced error handling permits software to select the severity of each error within the Advanced Uncorrectable Error Severity register. This gives software the opportunity to treat errors as fatal or non-fatal, according to the severity associated with a given application. Software could use the Advanced Uncorrectable Mask register to mask specific errors.

2.2.2 PCI Express AER Driver Designed To Handle PCI Express Errors

Before kernel 2.6.18, the Linux kernel had no root port AER service driver. Usually, the BIOS provides a basic error mechanism, but it cannot coordinate the corresponding devices to get more detailed error information and perform recovery actions. As a result, the AER driver has been developed to support PCI Express AER enabling for the Linux kernel.

2.2.2.1 AER Initialization Procedures

When a machine is booting, the system allocates interrupt vector(s) for every PCI Express root port. To service the PCI Express AER interrupt at a PCI Express root port, the PCI Express AER driver registers its interrupt service handler with the Linux kernel. Once a PCI Express root port receives an error reported from the downstream device, that PCI Express root port sends an interrupt to the CPU, from which the Linux kernel will call the PCI Express AER interrupt service handler.

Most of the AER processing work should be done in process context. The PCI Express AER driver creates one worker per PCI Express AER root port virtual device. Depending on where an AER interrupt occurs in a system hierarchy, the corresponding worker will be scheduled.

Most BIOS vendors provide a non-standard error processing mechanism. To avoid conflict with BIOS while handling PCI Express errors, the PCI Express AER
driver must request the BIOS for ownership of the PCI Express AER via the ACPI _OSC method, as specified in PCI Express Specification and ACPI Specification. If the BIOS doesn’t support the ACPI _OSC method, or the ACPI _OSC method returns errors, the PCI Express AER driver’s probe function will fail (refer to Section 3). Once the PCI Express AER driver takes over, the BIOS must stop its activities on PCI Express error processing.

2.2.2.2 Handle PCI Express Correctable Errors

[Figure 4: Procedure to Process Correctable Errors. Diagram labels: AER Driver; Root Complex; 1) Get Source ID/Error Type, Clear Root Status; 2) Get Detailed Error Type, Clear Correctable Error Status]
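The two steps labelled in Figure 4, together with the write-1-to-clear and mask behaviour of the correctable error registers, can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions: `struct sim`, `rw1c_reg`, `agent_detects`, and `handle_correctable` are hypothetical names, the bit positions are assumed to follow the PCI Express register layout, and none of this is the AER driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions; check them against the PCI Express specification. */
#define ROOT_STATUS_COR_RCV  (1u << 0)  /* root port saw a correctable error message */
#define COR_STATUS_RCVR_ERR  (1u << 0)  /* Receiver Error */
#define COR_STATUS_BAD_TLP   (1u << 6)  /* Bad TLP */

/* Hypothetical model of a write-1-to-clear (RW1C) status register. */
struct rw1c_reg { uint32_t value; };

/* Hardware side: latch error bits. */
static void rw1c_hw_set(struct rw1c_reg *r, uint32_t bits) { r->value |= bits; }

/* Software side: writing 1 clears a bit, writing 0 leaves it unchanged. */
static void rw1c_sw_write(struct rw1c_reg *r, uint32_t bits) { r->value &= ~bits; }

/* A 16-bit source ID split into bus, device, and function numbers. */
struct bdf { unsigned bus, dev, fn; };

static struct bdf decode_source_id(uint16_t id)
{
    struct bdf b;
    b.bus = (id >> 8) & 0xff;
    b.dev = (id >> 3) & 0x1f;
    b.fn  = id & 0x7;
    return b;
}

/* Simulated state: one root port and one agent device below it. */
struct sim {
    struct rw1c_reg root_status;     /* root error status */
    uint16_t        cor_source_id;   /* ID of the agent that reported the error */
    struct rw1c_reg dev_cor_status;  /* agent's correctable error status */
    uint32_t        dev_cor_mask;    /* agent's correctable error mask */
    uint16_t        dev_id;          /* agent's own bus/device/function ID */
};

/* The agent detects an error: the status bit is always set, but a masked
 * error is not reported to the root port. */
static void agent_detects(struct sim *s, uint32_t bit)
{
    rw1c_hw_set(&s->dev_cor_status, bit);
    if (s->dev_cor_mask & bit)
        return;
    s->cor_source_id = s->dev_id;
    rw1c_hw_set(&s->root_status, ROOT_STATUS_COR_RCV);
}

/* Returns the agent's detailed status bits, or 0 if nothing was reported. */
static uint32_t handle_correctable(struct sim *s, struct bdf *who)
{
    uint32_t root, detail;

    assert(s && who);

    /* Step 1: get source ID and error type, then clear the root status. */
    root = s->root_status.value;
    if (!(root & ROOT_STATUS_COR_RCV))
        return 0;
    *who = decode_source_id(s->cor_source_id);
    rw1c_sw_write(&s->root_status, root);

    /* Step 2: get the detailed error type, then clear the agent's status. */
    detail = s->dev_cor_status.value;
    rw1c_sw_write(&s->dev_cor_status, detail);
    return detail;
}
```

With `dev_id` set to 0x0020, `decode_source_id` yields bus 00h, device 04h, function 0, which is how a 16-bit ID maps onto bus, device, and function fields in console output.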
agent device. Figure 4 illustrates the procedure to process correctable errors. Last but not least, the details

+------ PCI-Express Device Error ------+
Error Severity : Corrected
PCIE Bus Error type : Physical Layer
Receiver Error : Multiple
Receiver ID : 0020
VendorID=8086h, DeviceID=3597h, Bus=00h, Device=04h, Function=00h

The Requester ID is the ID of the device which reports the error. Based on such information, an administrator could find the bad device easily.

2.2.2.3 Handle PCI Express Non-Fatal Errors

If an agent device reports non-fatal errors, the PCI Express AER driver uses the same mechanism as described in Section 2.2.2 to obtain more details about an error from an agent device and output error information to the system console. Figure 5 illustrates the procedure to process non-fatal errors.

[Figure 5: Procedures to Process Non-Fatal Errors. Diagram label: 2) Get Detailed Error Type and Log]

The first two steps are like the ones to process correctable errors. During Step 2, the AER driver needs to retrieve the packet header log from the agent if the error is TLP-related.

Below is an example of non-fatal error output to the system console.

+------ PCI-Express Device Error ------+
Error Severity : Uncorrected (Non-Fatal)
PCIE Bus Error type : Transaction Layer
Completion Timeout : Multiple
Requester ID : 0018
VendorID=8086h, DeviceID=3596h, Bus=00h, Device=03h, Function=00h

Unlike correctable errors, non-fatal errors might cause
some transaction failures. To help an agent device driver to retry any failed transactions, the PCI Express AER driver must perform a non-fatal error recovery procedure, which depends on where a non-fatal error occurs in a system hierarchy. As illustrated in Figure 6, for example, there are two PCI Express switches. If end-point device E2 reports a non-fatal error, the PCI Express AER driver will try to perform an error recovery procedure only on this device. Other devices won’t take part in this error recovery procedure. If downstream port P1 of switch 1 reports a non-fatal error, the PCI Express AER driver will perform the error recovery procedure on all devices under port P1, including all ports of switch 2 and end points E1 and E2.

[Figure 6: Non-Fatal Error Recovery Example. Diagram labels: Root Complex; Root Port; Switch 1 (Up Port; Down Port: P1); Switch 2 (Up Port; two Down Ports); End Point: E1; End Point: E2]

To take part in the error recovery procedure, specific device drivers need to implement error callbacks as described in Section 4.1.

When an uncorrectable non-fatal error happens, the AER error recovery procedure first calls the error_detected routine of all relevant drivers, in depth-first order, to notify them that their devices have run into errors. In the error_detected callback, the driver shouldn’t operate the devices, i.e., it must not perform any I/O on the devices. Typically, error_detected might cancel all pending requests or put the requests into a queue.

If the return values from all relevant error_detected routines are PCI_ERS_RESULT_CAN_RECOVER, the AER recovery procedure calls all resume callbacks of the relevant drivers. In the resume functions, drivers could resume operations to the devices.

If an error_detected callback returns PCI_ERS_RESULT_NEED_RESET, the recovery procedure will call all slot_reset callbacks of relevant drivers. If all slot_reset functions return PCI_ERS_RESULT_CAN_RECOVER, the resume callback will be called to finish the recovery. Currently, some device drivers provide err_handler callbacks, for example, Intel’s E100 and E1000 network card drivers and IBM’s POWER RAID driver.

The PCI Express AER driver outputs some information about non-fatal error recovery steps and results. Below is an example.

+------ PCI-Express Device Error ------+
Error Severity : Uncorrected (Non-Fatal)
PCIE Bus Error type : Transaction Layer
Unsupported Request : First
Requester ID : 0500
VendorID=14e4h, DeviceID=1659h, Bus=05h, Device=00h, Function=00h
TLB Header:
04000001 0020060f 05010008 00000000
Broadcast error_detected message
Broadcast slot_reset message
Broadcast resume message
tg3: eth3: Link is down.
AER driver successfully recovered

2.2.2.4 Handle PCI Express Fatal Errors

When processing fatal errors, the PCI Express AER driver also collects detailed error information from the reporter in the same manner as described in Sections 2.2.2.2 and 2.2.2.3. Below is an example of fatal error output to the system console:

+------ PCI-Express Device Error ------+
Error Severity : Uncorrected (Fatal)
PCIE Bus Error type : Transaction Layer
Unsupported Request : First
Requester ID : 0200
VendorID=8086h, DeviceID=0329h, Bus=02h, Device=00h, Function=00h
TLB Header:
04000001 00180003 02040000 00020400

When performing the error recovery procedure, the major difference between non-fatal and fatal is whether
the PCI Express link will be reset. If the return values from all relevant error_detected routines are PCI_ERS_RESULT_CAN_RECOVER, the AER recovery procedure resets the PCI Express link based on whether the agent is a bridge. Figure 7 illustrates an example.

[Figure 7: Reset PCI Express Link Example. Diagram labels: Root Complex; Root Port: P0; Switch (Up Port: P1; Down Port: P2; Down Port: P3); End Point: E1]

In Figure 7, if root port P0 (a kind of bridge) reports a fatal error to itself, the PCI Express AER driver chooses to reset the upstream link between root port P0 and upstream port P1. If end-point device E1 reports a fatal error, the PCI Express AER driver chooses to reset the upstream link of E1, i.e., the link between P2 and E1.

The reset is executed by the port. If the agent is a port, the port will execute the reset. If the agent is an end-point device, for example, E1 in Figure 7, the port of the upstream link of E1, i.e., port P2, will execute the reset.

The reset method depends on the port type. As for root port and downstream port, the PCI Express Specification defines an approach to reset their downstream link. In Figure 7, if port P0, P2, P3, and end point E1 report fatal errors, the method defined in PCI Express Specification will be used. The PCI Express AER driver implements the standard method as the default reset function.

There is no standard way to reset the downstream link under the upstream port because different switches might implement different reset approaches. To facilitate the link reset approach, the PCI Express AER driver adds reset_link, a new function pointer, in the data structure pcie_port_service_driver.

struct pcie_port_service_driver {
...
/* ... driver specific */
pci_ers_result_t (*reset_link) (struct pci_dev *dev);
...
};

If a port uses a vendor-specific approach to reset link, its AER port service driver has to provide a reset_link function. If a root port driver or downstream port service driver doesn’t provide a reset_link function, the default reset_link function will be called. If an upstream port service driver doesn’t implement a reset_link function, the error recovery will fail.

Below is the system console output example printed by the PCI Express AER driver when doing fatal error recovery.

+------ PCI-Express Device Error ------+
Error Severity : Uncorrected (Fatal)
PCIE Bus Error type : (Unaccessible)
Unaccessible Received : First
Unregistered Agent ID : 0500
Broadcast error_detected message
Complete link reset at Root[0000:00:04.0]
Broadcast slot_reset message
Broadcast resume message
tg3: eth3: Link is down.
AER driver successfully recovered

2.3 Including PCI Express Advanced Error Reporting Driver Into the Kernel

The PCI Express AER Root driver is a Root Port service driver attached to the PCI Express Port Bus driver. Its service must be registered with the PCI Express Port Bus driver and users are required to include the PCI Express Port Bus driver in the kernel [5]. Once the kernel configuration option CONFIG_PCIEPORTBUS is included, the PCI Express AER Root driver is automatically included as a kernel driver by default (CONFIG_PCIEAER = Y).
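The recovery sequence of Sections 2.2.2.3 and 2.2.2.4, including the rule for choosing a reset_link routine, can be summarized as a small user-space model. Everything here is a hypothetical stand-in for illustration (`sim_driver`, `sim_port`, `recover`, the `ERS_*` values, and the `trace` log); it does not use the kernel's types. The fatal path assumes that a link reset is followed by slot_reset and then resume, mirroring the order in the sample console output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local result codes, ordered from best to worst so that a maximum can be taken.
 * ERS_DISCONNECT is this sketch's "recovery failed" value. */
enum ers_result { ERS_CAN_RECOVER, ERS_NEED_RESET, ERS_DISCONNECT };
enum port_type  { ROOT_PORT, DOWNSTREAM_PORT, UPSTREAM_PORT };

/* Hypothetical stand-in for a device driver's error callbacks. */
struct sim_driver {
    enum ers_result (*error_detected)(void);
    enum ers_result (*slot_reset)(void);
};

/* Hypothetical stand-in for the port that executes the link reset. */
struct sim_port {
    enum port_type type;
    enum ers_result (*reset_link)(void);  /* NULL if the service driver has none */
};

static char trace[128];  /* records which recovery steps ran */

static void note(const char *step)
{
    strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
}

static enum ers_result say_can_recover(void) { return ERS_CAN_RECOVER; }
static enum ers_result say_need_reset(void)  { return ERS_NEED_RESET; }

/* Standard reset method, modelled as always succeeding. */
static enum ers_result default_reset_link(void) { return ERS_CAN_RECOVER; }

/* Root and downstream ports fall back to the default; upstream ports cannot. */
static enum ers_result do_reset_link(const struct sim_port *p)
{
    if (p->reset_link)
        return p->reset_link();
    if (p->type == UPSTREAM_PORT)
        return ERS_DISCONNECT;
    return default_reset_link();
}

/* Call one callback on every driver and return the worst result. */
static enum ers_result broadcast(struct sim_driver *drv, size_t n, int use_slot_reset)
{
    enum ers_result worst = ERS_CAN_RECOVER;
    for (size_t i = 0; i < n; i++) {
        enum ers_result r = use_slot_reset ? drv[i].slot_reset()
                                           : drv[i].error_detected();
        if (r > worst)
            worst = r;
    }
    return worst;
}

/* Returns 1 if recovery completes, 0 if it fails. */
static int recover(struct sim_driver *drv, size_t n, int fatal,
                   const struct sim_port *port)
{
    enum ers_result r;

    assert(drv && port);
    trace[0] = '\0';

    note("error_detected;");
    r = broadcast(drv, n, 0);
    if (r == ERS_DISCONNECT)
        return 0;

    if (fatal) {
        /* Only fatal errors reset the link. */
        if (do_reset_link(port) != ERS_CAN_RECOVER)
            return 0;
        note("reset_link;");
        r = ERS_NEED_RESET;  /* assumption: slot_reset follows a link reset */
    }

    if (r == ERS_NEED_RESET) {
        note("slot_reset;");
        if (broadcast(drv, n, 1) != ERS_CAN_RECOVER)
            return 0;
    }

    note("resume;");  /* the drivers' resume callbacks would run here */
    return 1;
}
```

Calling `recover` with `fatal` set and an upstream port whose `reset_link` is NULL returns 0, matching the statement that recovery fails when an upstream port service driver provides no reset_link function.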
3 Impact to PCI Express BIOS Vendor

control method _OSC. The PCI Express AER driver provides a current workaround for the lack of ACPI BIOS _OSC support by implementing a boot parameter, forceload=y/n. When the kernel boots with parameter aerdriver.forceload=y, the PCI Express AER driver still binds to all root ports which implement the AER capability.

4 Impact to PCI Express Device Driver

4.2 Device driver helper functions

To communicate with device AER capabilities, drivers need to access AER registers in configuration space. It’s easy to write incorrect code because they must access and change individual bits of the registers. To facilitate driver programming and reduce coding errors, the AER driver provides a couple of helper functions which could be used by device drivers.
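To show the kind of register bookkeeping that such helpers hide, the following user-space sketch walks a simulated 4 KB configuration space to find the AER extended capability and then splits the pending uncorrectable errors into fatal and non-fatal using the severity register. These are not the AER driver's helper functions: `cfg_read32`, `cfg_write32`, `find_aer_cap`, and `aer_classify_uncorrectable` are hypothetical names, and the offsets and bit positions are assumed to follow the PCI Express register layout.

```c
#include <assert.h>
#include <stdint.h>

#define CFG_SPACE_SIZE      4096
#define EXT_CAP_START       0x100    /* first extended capability header */
#define EXT_CAP_ID_AER      0x0001u  /* assumed AER capability ID */
#define AER_UNCOR_STATUS    0x04     /* offsets within the AER capability */
#define AER_UNCOR_SEVERITY  0x0c
#define UNC_COMP_TIMEOUT    (1u << 14)
#define UNC_UNSUP_REQUEST   (1u << 20)

/* Read a little-endian 32-bit register from the simulated config space. */
static uint32_t cfg_read32(const uint8_t *cfg, unsigned off)
{
    assert(off + 4 <= CFG_SPACE_SIZE && (off & 3) == 0);
    return (uint32_t)cfg[off] | ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) | ((uint32_t)cfg[off + 3] << 24);
}

/* Write a little-endian 32-bit register; used to build test images. */
static void cfg_write32(uint8_t *cfg, unsigned off, uint32_t v)
{
    assert(off + 4 <= CFG_SPACE_SIZE && (off & 3) == 0);
    cfg[off]     = (uint8_t)v;
    cfg[off + 1] = (uint8_t)(v >> 8);
    cfg[off + 2] = (uint8_t)(v >> 16);
    cfg[off + 3] = (uint8_t)(v >> 24);
}

/* Walk the extended capability list; return the AER offset or 0 if absent.
 * Header layout assumed: bits 15:0 ID, 19:16 version, 31:20 next offset. */
static unsigned find_aer_cap(const uint8_t *cfg)
{
    unsigned off = EXT_CAP_START;
    int guard = (CFG_SPACE_SIZE - EXT_CAP_START) / 8;  /* stops looping lists */

    while (off >= EXT_CAP_START && guard-- > 0) {
        uint32_t hdr = cfg_read32(cfg, off);
        if (hdr == 0 || hdr == 0xffffffffu)
            break;
        if ((hdr & 0xffffu) == EXT_CAP_ID_AER)
            return off;
        off = (hdr >> 20) & 0xffcu;
    }
    return 0;
}

struct unc_summary {
    uint32_t fatal;     /* pending errors whose severity bit is 1 */
    uint32_t nonfatal;  /* pending errors whose severity bit is 0 */
};

/* Returns 0 on success, -1 if the device has no AER capability. */
static int aer_classify_uncorrectable(const uint8_t *cfg, struct unc_summary *out)
{
    unsigned aer = find_aer_cap(cfg);
    uint32_t status, severity;

    if (!aer)
        return -1;
    status   = cfg_read32(cfg, aer + AER_UNCOR_STATUS);
    severity = cfg_read32(cfg, aer + AER_UNCOR_SEVERITY);
    out->fatal    = status & severity;
    out->nonfatal = status & ~severity;
    return 0;
}
```

Working on a byte buffer keeps the sketch independent of any kernel interface; a real driver would read the same registers through the kernel's configuration space accessors instead.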
5 Conclusion
6 Acknowledgement
Legal Statement