


Austrochip 2008 Johannes Kepler Universität Linz

Design and Simulation of a PCI Express based Embedded System

Faraj Nassar (1), Jan Haase (1), Christoph Grimm (1), Herbert Nachtnebel (1), Majid Ghameshlu (2)
(1) Institute of Computer Technology, Vienna University of Technology
(2) Siemens IT Solutions and Services PSE, Siemens AG Austria
{nassar, haase, grimm, nachtnebel}@ict.tuwien.ac.at, [email protected]

Abstract

In this paper, a brief introduction to the theory of the PCI Express (PCIe) bus system is given. In addition, the capabilities of this bus system are demonstrated by designing and simulating a PCIe based embedded data communication system. This system utilizes the Xilinx Microblaze soft processor core, the Xilinx PCIe core, and the Philips PX1011A physical layer. A basic, efficient, and simplified On-chip Peripheral Bus (OPB) to PCIe Bridge is developed from scratch to bridge the Microblaze and the PCIe protocol layers. Data communication between the designed PCIe-based intelligent Endpoint device in the PCIe topology and the system memory, as well as the Central Processing Unit (CPU), through the Root Complex, is simulated.

Keywords: IP, Microblaze, PCI, PCIe Core, PCIe Endpoint, On Chip Peripheral Bus (OPB), OPB IPIF, OPB to PCIe Bridge, Philips PX1011A PHY.
1 Introduction

The platform of today's PCs consists of many local buses with different requirements, allowing different devices to communicate with each other. Many modern electronic devices demand a high bandwidth, even higher than what existing input and output (IO) bus systems, such as the Peripheral Component Interconnect (PCI), can deliver. These bus systems are reaching their practical limits and face serious shortcomings that prevent them from providing the bandwidth and features needed by the electronics industry, which keeps asking for increased bandwidth as well as simple electrical connectivity. All these factors together have motivated the engineering of a new IO bus system, the so-called Peripheral Component Interconnect Express (PCIe), which has been adopted as a general purpose IO device interconnect in different applications, such as desktop, server, mobile, workstation, computing, and communication platforms.

The first generation of IO buses was introduced in the 1980s, including the Industry Standard Architecture (ISA) bus, which enables a bandwidth of 16.7 Mbytes/s, sufficient at that time. Extended ISA (EISA) and the Video Electronics Standards Association (VESA) bus are other buses of this generation.

In the 1990s, the second generation of IO buses started with different approaches. In 1993 the PCI 33 MHz bus was released. At that time, a 32-bit version of this bus was enough to deliver a bandwidth of 133 Mbytes/s, which met the bandwidth requirements of the available IO peripherals. A 64-bit version of this PCI bus delivers a bandwidth of 266 Mbytes/s [1]. However, due to the increase in processor speeds and the bandwidth needs of newer IO technologies, the PCI bus frequency was increased in 1995 from 33 to 66 MHz, raising the bandwidth from 133 Mbytes/s to 266 Mbytes/s for a 32-bit PCI and from 266 Mbytes/s to 533 Mbytes/s for a 64-bit PCI, correspondingly [2].

Several practical limitations of the PCI 66 MHz bus, and the emergence of new high end system technologies that kept asking for higher bandwidths, led in 1999 to the release of a new derivation of the PCI called the PCI-X bus. The PCI-X bus has frequencies of 66 and 133 MHz and enables a bandwidth of up to 1.066 Gbytes/s. These frequencies were increased to 266 and 533 MHz in the first quarter of 2002, raising the provided bandwidth to up to 4 Gbytes/s [2]. Another bus system of the second generation is the Accelerated Graphics Port (AGP). However, in order to meet the higher bandwidth requirements and to satisfy bandwidth hungry devices, a new bus system was still needed.

The third and latest generation IO bus system is PCIe, which was released in the second quarter of 2002. It evolved from the PCI and overcame its limitations. PCIe began shipping in standard desktop PCs in 2004. A x1 PCIe bus theoretically provides a bandwidth of 500 Mbytes/s, a x16 PCIe can provide up to 8 Gbytes/s, and a x32 provides 16 Gbytes/s [2].

In this paper, the capabilities of this PCIe bus system are demonstrated by designing and simulating a PCIe based embedded system for customer reference. In this system, an intelligent Endpoint device employing this technology is able to write a double word (DW) of 32 bits to a location within the system memory and read this data back. This system also enables data communication between the CPU, through the Root Complex, and this Endpoint device.

2 PCI Express

2.1 PCI Express Introduction

Unlike the parallel PCI bus, the PCIe bus is serial.




The PCIe link implements a high performance, high speed, point-to-point, dual simplex, low-pin-count and differential signalling link for interconnecting devices. The PCIe link, shown in Figure 1, implements the physical connection between two devices in the PCIe topology. A PCIe interconnect is constructed of either a x1, x2, x4, x8, x12, x16, or x32 point-to-point link. A x1 link has 1 lane, i.e. 1 differential signal pair in each direction, transmitter and receiver, for a total of 4 signals. Correspondingly, a x32 link has 32 lanes or 32 signal pairs in each direction, for a total of 128 signals [2].

PCIe employs a packet-based communication protocol with split transactions. Communication in this bus system includes the transmission and reception of packets called Transaction Layer Packets (TLPs). The transactions supported by the PCIe protocol can be grouped into four categories: Memory, IO, Configuration, and Message transactions [2].

2.2 PCI Express Topology

The PCIe topology shown in Figure 1 contains different components: a Root Complex, PCIe switches, PCIe Endpoints, and optional PCIe to PCI bridges. The Root Complex connects the CPU and the memory to the PCIe fabric. Its main purpose is to generate transaction and configuration requests on behalf of the CPU. PCIe implements a switch-based topology in order to interconnect multiple devices, as shown in Figure 1.

[Figure 1. PCI Express Topology: the Root Complex connects the CPU (over the Front Side Bus, FSB) and the memory to the PCIe fabric; a x1 PCIe link reaches a graphics device, and switches fan out to x1 and x2 PCIe Endpoints (EP) and to a PCIe-PCI Bridge with PCI slots.]
The PCIe Endpoint (EP) is a device which can be a requester that originates a PCIe transaction or a completer that responds to a PCIe transaction addressed to it. PCIe Endpoints are peripheral devices such as Ethernet, USB or graphics devices. In order to connect PCI devices to the PCIe fabric, a PCIe to PCI Bridge must be used.

2.3 PCI Express Architecture

PCIe has a layered architecture, as depicted in Figure 2. It consists of the Transaction Layer, the Data Link Layer and the Physical Layer. On top of these three layers resides the Software Layer, or device core. Each of these layers is further divided into two sections: transmitter and receiver. The transmitter is responsible for processing the Transaction Layer Packets requested by the device core before they are transmitted across the PCIe link. The receiver processes the incoming TLPs before sending them to the device core.

To demonstrate the functionality of the PCI Express protocol, and for the purpose of this paper, 32-bit addressable memory write/read and Completion with Data (CPLD) TLPs will be considered. The memory write TLP is a posted transaction: the requester transmits a request TLP to the completer, which does not return a completion TLP back to the requester. This is unlike the memory read TLP, where the completer is expected to return a completion TLP to the requester. The completer returns either a CPLD, if it is able to provide the requested data, or a Completion without data (CPL), if it fails to obtain the requested data.

Figure 2 also shows the assembly and disassembly of a PCIe TLP and illustrates the contribution of each layer to this TLP. The device core sends to the Transaction Layer the information required to assemble the TLP. This information contains the Header (HDR) and the Data Payload, if one exists. The main functionality of the Transaction Layer is the generation of TLPs to be transmitted across the PCIe link and the reception of TLPs arriving from the PCIe link. This layer appends a 32-bit End to End Cyclic Redundancy Check (ECRC) to the TLP to be transmitted. These 32 bits are stripped out by the same layer at the receiver side.

The Data Link Layer (DLL) is responsible for ensuring reliable data transport on the PCIe link. The TLP received from the Transaction Layer is concatenated with a 12-bit sequence ID and a 32-bit Link CRC (LCRC), as shown in Figure 2 [4]. These added bits are stripped from the incoming TLP by the same layer in the receiving device before it is transferred to the Transaction Layer.

The physical layer of a PCIe device is responsible for driving and receiving the Low Voltage Differential Signals (LVDS) at a high speed rate of 2.5 Gbps in each direction. It interfaces the device to the PCIe fabric. Such an interface is scalable to deliver a higher bandwidth.




[Figure 2. PCI Express Architecture and Transaction Layer Packet (TLP) Assembly/Disassembly: at each of two linked devices, the device core exchanges the Header (HDR) and Data with the Transaction Layer, which appends the 1-DW ECRC; the Data Link Layer prepends the 12-bit sequence number and appends the 1-DW LCRC; the Physical Layer adds the 1-byte Start and End framing symbols.]

The TLPs are transferred to this layer for transmission across the link. This layer also receives the incoming TLPs from the link and sends them to the Data Link Layer. The physical layer appends 8-bit Start and End framing characters to the packet before it is transmitted. The physical layer of the receiving device in turn strips out these characters after recognizing the start and end of the received packet, and then forwards it to the Data Link Layer. In addition, the physical layer of the transmitter issues Physical Layer Packets (PLPs), which are terminated at the physical layer of the receiver. Such PLPs are used during the Link Training and Initialization process. In this process the link is automatically configured and initialized for normal operation; no software is involved. During this process the following features are defined: link width, data rate of the link, polarity inversion, lane reversal, bit/symbol lock per lane, and lane-to-lane deskew (in case of a multi-lane link) [2].
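To make the layering of Figure 2 concrete, the following Verilog sketch shows how each layer grows a TLP of N DWs on its way to the link. It is illustrative only and not taken from the paper's design: the CRC generators are omitted, and the FBh/FDh bytes stand for the K27.7 (STP) and K29.7 (END) symbols, which become K-codes after 8b/10b encoding.

    // Illustrative sketch of the packet growth in Figure 2, for a TLP
    // of N DWs (header plus any payload).
    module tlp_framing_sketch #(parameter N = 4) (
        input  wire [32*N-1:0] tlp,    // raw TLP from the device core side
        input  wire [31:0]     ecrc,   // CRC-32 values; generators omitted
        input  wire [31:0]     lcrc,
        input  wire [11:0]     seq     // DLL sequence number
    );
        // Transaction Layer: optional 32-bit ECRC appended to the TLP
        wire [32*(N+1)-1:0]  tl_pkt  = {tlp, ecrc};
        // Data Link Layer: 4 reserved bits and the 12-bit sequence number
        // in front, 32-bit LCRC behind
        wire [32*(N+1)+47:0] dll_pkt = {4'b0000, seq, tl_pkt, lcrc};
        // Physical Layer: one-byte Start (STP) and End framing symbols
        // around the whole packet
        wire [32*(N+1)+63:0] phy_pkt = {8'hFB, dll_pkt, 8'hFD};
    endmodule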
3 PCI Express Endpoint Design

3.1 Design Overview

In this paper, the x1 PCIe Endpoint is considered. In Figure 1, the Endpoint is an intelligent device which acts as a target for downstream TLPs from the CPU, through the Root Complex, and as an initiator of upstream TLPs to the CPU. This Endpoint generates or responds to Memory Write/Read transactions.

When the Endpoint acts as a receiver, the CPU issues a store register command to a memory mapped location in the Endpoint. This is done by having the Root Complex generate a Memory Write TLP with the required memory mapped address in the Endpoint, the payload size (one DW in this design), byte enables and other Header contents. This TLP moves downstream through the PCIe fabric to the Endpoint. Routing of the TLP in this case is based on the address within its Header. The transaction terminates when the Endpoint receives the TLP and writes the data to the targeted local register.

To read this data back, the CPU issues a load register command from the same memory mapped location in the Endpoint. This is done by having the Root Complex generate a Memory Read TLP with the same memory mapped address and other Header contents. This TLP moves downstream through the PCIe fabric to the Endpoint; again, routing is based on the address within the Header. Once the Endpoint receives this Memory Read TLP, it generates a Completion with Data (CPLD) TLP. The Header of this CPLD TLP includes the ID number of the Root Complex, which is used to route this TLP upstream through the fabric to the Root Complex, which in turn updates the targeted CPU register and terminates the transaction.

The other way around is to have the Endpoint act as a bus master and initiate a Memory Write TLP to write 1 DW to a location within the system memory. This TLP is routed upstream toward the Root Complex, which in turn writes the data to the targeted location in the system memory. If the Endpoint wants to read back the data it has written, it generates a Memory Read TLP with the same address. This is steered to the Root Complex, which in turn accesses the system memory, gets the required data and generates a Completion with Data TLP. This CPLD TLP is routed downstream to the Endpoint through the PCIe fabric. The Endpoint receives this TLP, updates its local register and terminates the transaction. Figure 3 shows the layered structure of the PCIe Endpoint device.
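Since these scenarios exchange only a few TLP types, their 3-DW headers are worth making concrete. The following Verilog functions are a hedged sketch of how such headers can be packed according to the header format of the PCIe Base Specification [4]; they are our illustration, not the paper's source, and details such as tag handling are assumptions.

    // Packing the 3-DW headers used in this design, per PCIe 1.1 [4].
    // Header DW0 layout: [30:29] Fmt, [28:24] Type, [22:20] TC,
    // [15] TD, [14] EP, [13:12] Attr, [9:0] Length (in DWs).
    module tlp_header_sketch;
        // 32-bit addressable Memory Write, 1-DW payload
        // (Fmt = 10b: 3-DW header with data; Type = 00000b: memory)
        function [95:0] mwr32_hdr (input [15:0] req_id,  // requester ID
                                   input [7:0]  tag,
                                   input [31:0] addr);   // DW-aligned address
            mwr32_hdr = {{1'b0, 2'b10, 5'b00000, 1'b0, 3'b000, 4'b0000,
                          1'b0, 1'b0, 2'b00, 2'b00, 10'd1},
                         {req_id, tag, 4'b0000, 4'b1111}, // last/first DW byte enables
                         {addr[31:2], 2'b00}};
        endfunction

        // Completion with Data, 1 DW (Fmt = 10b, Type = 01010b);
        // status 000b = successful completion, byte count 4
        function [95:0] cpld_hdr (input [15:0] cpl_id,   // the Endpoint's ID
                                  input [15:0] req_id,   // copied from the
                                  input [7:0]  tag,      //   read request
                                  input [6:0]  lower_addr);
            cpld_hdr = {{1'b0, 2'b10, 5'b01010, 1'b0, 3'b000, 4'b0000,
                         1'b0, 1'b0, 2'b00, 2'b00, 10'd1},
                        {cpl_id, 3'b000, 1'b0, 12'd4},
                        {req_id, tag, 1'b0, lower_addr}};
        endfunction
    endmodule

A Memory Read request header differs from the write header only in DW0 (Fmt = 00b, a 3-DW header without data), since the data travels in the returned completion instead.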
There are two different solutions for the physical layer (PHY). In the first solution, this layer can be integrated with the other layers in the same chip. Doing so increases the complexity of this chip but provides a higher integration level. This integrated solution has one key advantage when designing with an FPGA: it uses a smaller number of IO pins, which enables easier timing closure. An example of this integrated solution is offered by Xilinx in their newly introduced Virtex-5 PCIe Endpoint block [5].




[Figure 3. Endpoint Design: the Endpoint device stacks the Application Layer (a Microblaze based system acting as the device core) on top of the protocol layers (Transaction and Data Link Layers in the Xilinx PCIe PIPE Endpoint 1-lane core) and the Physical Layer (Philips PHY), which connects to the PCI Express fabric.]

Unlike the first solution, which is quite expensive, the second solution offers a low cost way of implementing the PCIe Endpoint. In this solution, the physical layer resides in one chip, and the other layers are designed in another chip. In this two-chip solution, a smaller FPGA with an external PHY can be used. This PHY supports x1 PCIe designs. Since the practical bandwidth provided by x1 PCIe is 2.0 Gbps, an internal interface of 8 bits running at 250 MHz, or of 16 bits running at 125 MHz, is required. This solution has the disadvantage of a higher number of IO pins.

In this paper, the Xilinx Spartan-3 FPGA and Philips PX1011A two-chip solution was used for designing the PCIe Endpoint [6]. This solution was chosen due to a management issue at the department where this work was conducted.
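The interface widths quoted above follow from the line rate by simple arithmetic, assuming the 8b/10b encoding used by PCIe 1.x (the coding step is our addition; the text states only the resulting figures):

    2.5 Gbps x 8/10 = 2.0 Gbps = 250 Mbytes/s per direction
    250 Mbytes/s = 8 bits x 250 MHz = 16 bits x 125 MHz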

3.2 Protocol and Application Layers

The protocol layers, comprising the logical sub-layer of the physical layer, the data link layer and the transaction layer, are implemented using the Xilinx PCI Express Physical Interface for PCI Express (PIPE) Endpoint 1-Lane IP core [7].

A Microblaze based system was built to implement the Application Layer of the designed PCIe Endpoint. In this system, the PCIe core is attached as a slave to the processor, which in turn accesses the configuration space of this core, reading from and writing to this space. In the Application Layer, the Microblaze is responsible for sending the required Header and data payload to the transaction layer. When a TLP is received by the PCIe Endpoint, the Header and the payload, if one exists, are forwarded to the Microblaze for further processing. The Microblaze also controls the transmitting and receiving of TLPs.

3.3 Complete Design

Figure 4 shows the complete designed PCIe Endpoint. This system embeds the Xilinx Microblaze, which implements a 32-bit Reduced Instruction Set Computer (RISC) and operates at a frequency of 50 MHz. Having the Microblaze as a soft core processor enables the design of a unique and customized PCIe peripheral device connected to it as a slave [8].

[Figure 4. Complete PCIe Endpoint Device: on the Xilinx Spartan-3 FPGA of the Spartan-3 PCI Express Starter Kit, the MicroBlaze (50 MHz) reaches instruction and data BRAM over ILMB/DLMB controllers and the OPB_PCIe_Bridge (OPB IPIF plus USER LOGIC) over the 32-bit OPB; the bridge drives the Xilinx PCIe PIPE 1-Lane core (62.5 MHz, 32 bits), which connects to the external Philips PHY over the 8-bit, 250 MHz PXPIPE interface and from there to the 2.5 Gbps serial PCI Express link. BRAM: Block Random Access Memory, LMB: Local Memory Bus, DLMB: Data LMB, ILMB: Instruction LMB, IPIF: Intellectual Property Interface, FPGA: Field Programmable Gate Array, PIPE: Physical Interface for PCI Express, PXPIPE: Philips PHY Specification PIPE.]

The Microblaze has different bus interfaces connecting it with different peripherals. For example, the Local Memory Bus (LMB) allows communication between the processor and the Block Random Access Memory (BRAM), which is loaded with the application program to be executed by the Microblaze. This program is written in C, using a special library provided by Xilinx, and is compiled into the executable and linkable format (ELF). The Microblaze has a Harvard structure, in which the BRAM consists of two sections, data and instructions. These sections are accessed by the processor through memory controllers over the Local Memory Bus.

The Xilinx On-Chip Peripheral Bus (OPB), which implements the IBM CoreConnect On-Chip Peripheral Bus, has two separate 32-bit paths for data and address [8]. This bus is used to connect peripherals to the Microblaze, which masters the bus.




The PCIe core cannot be directly connected to the OPB as a slave, because its interfaces are incompatible with the OPB protocol. To solve this compatibility issue, a bridge was developed to bridge the OPB and the PCIe core. This bridge interfaces the OPB, with its standard protocol, through the OPB Intellectual Property Interface (OPB IPIF) on one side, and the PCIe core through the USER LOGIC module on the other side. The internal structure of this USER LOGIC module is shown in Figure 5. This module implements the logic needed to transmit/receive TLPs across the PCIe link and to access the configuration space of the PCIe core. The PCIe core transaction interfaces are synchronized to a 62.5 MHz clock generated by the core, as indicated in Figure 5.

[Figure 5. USER LOGIC Internal Structure: behind the IP Interconnect (IPIC), a software accessible register bank feeds a PCIe transmission state machine on the transmit transaction interface, is filled by a PCIe receiving state machine from the receive transaction interface, and is read and written by a READ/WRITE state machine that accesses the PCIe configuration space over the configuration interface.]
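As an illustration of the software accessible register bank in Figure 5, the following Verilog sketch uses the IPIC-style signal names of the Xilinx OPB IPIF (Bus2IP_*/IP2Bus_*). It is our reconstruction under assumptions, not the paper's RTL: the register roles (a control register whose bit 0 drives compl_gen, plus header registers consumed by the transmit state machine) are inferred from the text.

    // Hedged sketch of the USER LOGIC register bank behind the IPIC.
    module user_logic_regs (
        input  wire        Bus2IP_Clk,
        input  wire        Bus2IP_Reset,
        input  wire [31:0] Bus2IP_Data,
        input  wire [3:0]  Bus2IP_WrCE,    // one chip enable per register
        input  wire [3:0]  Bus2IP_RdCE,
        output reg  [31:0] IP2Bus_Data,
        output wire        IP2Bus_WrAck,
        output wire        IP2Bus_RdAck,
        output wire        compl_gen,      // to the transmission state machine
        output reg  [31:0] tlp_hdr0, tlp_hdr1, tlp_hdr2
    );
        reg [31:0] ctrl;
        assign compl_gen    = ctrl[0];       // assumed control bit
        assign IP2Bus_WrAck = |Bus2IP_WrCE;  // single-cycle acknowledges
        assign IP2Bus_RdAck = |Bus2IP_RdCE;

        always @(posedge Bus2IP_Clk) begin   // register writes from the OPB
            if (Bus2IP_Reset) begin
                ctrl <= 32'b0; tlp_hdr0 <= 32'b0;
                tlp_hdr1 <= 32'b0; tlp_hdr2 <= 32'b0;
            end else begin
                if (Bus2IP_WrCE[0]) ctrl     <= Bus2IP_Data;
                if (Bus2IP_WrCE[1]) tlp_hdr0 <= Bus2IP_Data;
                if (Bus2IP_WrCE[2]) tlp_hdr1 <= Bus2IP_Data;
                if (Bus2IP_WrCE[3]) tlp_hdr2 <= Bus2IP_Data;
            end
        end

        always @(*) begin                    // read mux over the chip enables
            case (1'b1)
                Bus2IP_RdCE[0]: IP2Bus_Data = ctrl;
                Bus2IP_RdCE[1]: IP2Bus_Data = tlp_hdr0;
                Bus2IP_RdCE[2]: IP2Bus_Data = tlp_hdr1;
                Bus2IP_RdCE[3]: IP2Bus_Data = tlp_hdr2;
                default:        IP2Bus_Data = 32'b0;
            endcase
        end
    endmodule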

The PCIe core interfaces with the Philips PHY using the Philips PHY Specification Physical Interface for PCI Express (PXPIPE), defined by Philips Semiconductors, which implements an extended version of the Physical Interface for PCI Express (PIPE) defined by Intel. PXPIPE is a 250 MHz source synchronous interface, which provides two clocks, one for transmission and the other for reception. Also depicted in the figure are the interfaces of the Philips PHY to the PCIe link, namely the low voltage differential signals (LVDS), driven at a high data rate of 2.5 Gbps.

4 PCI Express based System Simulation

The designed PCIe Endpoint was integrated in a top-level Testbench, written in Verilog HDL, to simulate its functionality. Figure 6 shows the top level of this Testbench and depicts its hierarchy. In the top level, named boardx01 (indicating a x1 PCIe design), the PCIe downstream port model, the Philips PHY and the Design Under Test (DUT) are instantiated.

[Figure 6. PCIe Testbench Top-level: within boardx01, a test program (.v) drives the PCIe downstream port model, which connects over the PCIe link to the Philips PHY PX1011A and, over the PXPIPE interface, to the Design Under Test (DUT) executing the application program (.elf); output logs are written to .txt files.]

The PX1011A behavioural model is a packaged model, which can be simulated in ModelSim or other standard Hardware Description Language (HDL) simulators. The IP Model Packager from Cadence was used to generate this model. It can be integrated into any simulator that supports either IEEE standard 1499, the Open Model Interface, or IEEE standard 1364, the Verilog PLI 1.0 (Programming Language Interface). This model was provided by NXP Semiconductors.
In a PCIe Testbench, a simulation model is needed to implement the functionality of the Root Complex and the PCIe switch in the PCIe topology shown in Figure 1. The Xilinx PCIe downstream port simulation model [9], offered by Xilinx when generating the core and written in Verilog HDL, was used for this purpose in the PCIe based system. The main functionality of this model is to generate downstream TLPs from the CPU to the PCIe Endpoint and to receive upstream TLPs from the PCIe Endpoint to the CPU. In addition, this model initializes the PCIe core's configuration registers, verifies the transmission and reception of TLPs by generating TLP logs, and provides a kind of Test Programming Interface (TPI). For the purpose of functional simulation, the model implements an output logging mechanism: three text files are generated when running a defined task. One of the files summarizes the received TLPs, another shows the transmitted TLPs, and the third includes error messages, in case any errors are detected.
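A test case then reduces to a short Verilog test program executed through this TPI. The fragment below is purely illustrative: the tpi_* task names, their behaviour and the BAR0 constant are our hypothetical stand-ins for what such an interface offers, not the actual tasks of the Xilinx model.

    module test_program_sketch;
        // Stand-ins for the model's TPI tasks (names hypothetical). In
        // the real Testbench these tasks emit TLPs on the downstream
        // link and feed the .txt log files described above.
        task tpi_memory_write_32(input [31:0] addr, input [31:0] data);
            $display("MWr32 to %h, data %h", addr, data);
        endtask
        task tpi_memory_read_32(input [31:0] addr);
            $display("MRd32 from %h; expect a CPLD from the Endpoint", addr);
        endtask

        localparam BAR0 = 32'h8000_0000;  // assumed Endpoint base address

        initial begin : write_then_read_back
            tpi_memory_write_32(BAR0, 32'hCAFE_F00D); // store to the Endpoint
            tpi_memory_read_32 (BAR0);                // load the data back
        end
    endmodule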
Several test cases were conducted to simulate the functionality of the designed Endpoint device. Figure 7 shows the simulation results for the case in which the Endpoint responds with a CPLD TLP to a memory read TLP issued by the CPU; refer to the design overview in Section 3 for more details. The waveform reflects the PCIe transmission state machine.




In this state machine, for the case of transmitting a CPLD TLP, the following sequence of events is performed on the PCIe Transmit Transaction interfaces, as shown in Figure 7. First, after receiving an active high control signal (compl_gen) from the processor requesting the generation of a CPLD, the machine activates, at the next positive edge of the transaction clock (trn_clk), the active low start of frame signal (trn_tsof_n) and the active low transmit source ready signal (trn_tsrc_rdy_n) to indicate the availability of valid data from the user logic application, and then presents the first DW of the TLP on the 32-bit transaction data signal (trn_td). Note that, when transmitting, the Endpoint is enabled as a master through the signal master_enable. Second, at the next clock cycle, the state machine deactivates trn_tsof_n and presents the remaining DWs of the TLP on trn_td; the PCIe core keeps trn_tdst_rdy_n activated. Third, the state machine keeps trn_tsrc_rdy_n activated and activates the active low end of frame signal (trn_teof_n) together with the last DW of data. It also activates the signal cpld_transmitted to indicate that a CPLD TLP has been transmitted. Finally, at the next clock cycle, the state machine deactivates trn_tsrc_rdy_n to indicate the end of the valid data transfer on trn_td.
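The following Verilog sketch condenses this sequence into a four-state machine for a CPLD TLP of four DWs (three header DWs plus one data DW) on the core's 32-bit transaction interface. The signal names follow the paper; the tlp_dw* inputs and the reset are our assumptions, and throttling by trn_tdst_rdy_n is handled by simply waiting in each state.

    module cpld_tx_fsm (
        input  wire        trn_clk,
        input  wire        rst_n,            // assumed active low reset
        input  wire        compl_gen,        // request from the processor
        input  wire        trn_tdst_rdy_n,   // destination ready from the core
        input  wire [31:0] tlp_dw0, tlp_dw1, tlp_dw2, tlp_dw3, // assumed buffer
        output reg         trn_tsof_n,       // start of frame, active low
        output reg         trn_teof_n,       // end of frame, active low
        output reg         trn_tsrc_rdy_n,   // source ready, active low
        output reg  [31:0] trn_td,           // 32-bit transaction data
        output reg         cpld_transmitted
    );
        localparam IDLE = 2'd0, DW1 = 2'd1, DW2 = 2'd2, LAST = 2'd3;
        reg [1:0] state;

        always @(posedge trn_clk or negedge rst_n) begin
            if (!rst_n) begin
                state <= IDLE; trn_tsof_n <= 1'b1; trn_teof_n <= 1'b1;
                trn_tsrc_rdy_n <= 1'b1; trn_td <= 32'b0; cpld_transmitted <= 1'b0;
            end else begin
                case (state)
                    IDLE: begin                       // wait for a request
                        trn_teof_n       <= 1'b1;
                        cpld_transmitted <= 1'b0;
                        if (compl_gen) begin          // first: SOF + first header DW
                            trn_tsof_n     <= 1'b0;
                            trn_tsrc_rdy_n <= 1'b0;
                            trn_td         <= tlp_dw0;
                            state          <= DW1;
                        end else
                            trn_tsrc_rdy_n <= 1'b1;
                    end
                    DW1: if (!trn_tdst_rdy_n) begin   // second: drop SOF, next DW
                        trn_tsof_n <= 1'b1;
                        trn_td     <= tlp_dw1;
                        state      <= DW2;
                    end
                    DW2: if (!trn_tdst_rdy_n) begin   // third header DW
                        trn_td <= tlp_dw2;
                        state  <= LAST;
                    end
                    LAST: if (!trn_tdst_rdy_n) begin  // third/finally: last DW + EOF
                        trn_td           <= tlp_dw3;
                        trn_teof_n       <= 1'b0;
                        cpld_transmitted <= 1'b1;
                        state            <= IDLE;     // deassertions happen in IDLE
                    end
                endcase
            end
        end
    endmodule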
5 Conclusion

Within this paper, the capabilities of the PCIe bus protocol were demonstrated. In a platform based on the PCIe topology, an Endpoint device was designed. This Endpoint embeds the Microblaze soft core from Xilinx, which is bridged to the PCIe protocol layers implemented by the PCIe core, to serve the data communication between this intelligent Endpoint and the CPU/system memory through the Root Complex.

A basic and simplified OPB to PCIe Bridge was developed to bridge the Microblaze and the PCIe protocol layers. The PCIe core was generated, configured and customized using the Xilinx CORE Generator. A packaged simulation model, provided by NXP Semiconductors, was used to simulate the functionality of the PCIe physical layer. This model interfaces with the simulation tool using the Verilog HDL Programming Language Interface (PLI).

In a modified version of a PCIe Testbench provided by Xilinx, and with the help of the simulation tool ModelSim, the functionality of the designed Endpoint was simulated and verified. In addition, the designed Endpoint system was prepared to be implemented in the Xilinx Spartan-3 FPGA located on the Xilinx PCIe Spartan-3 Starter Kit.

It can also be concluded that working with PCIe requires knowledge of the PCIe protocol, because most of the available PCIe IP cores do not provide compatible interfaces that would allow them to be connected directly to the processor in question. Therefore, in most cases, an effort must be made to develop a bridge that allows an easy connection of the PCIe peripheral to the processor. Furthermore, the functionality of the designed Endpoint can be more elaborate than this simple data transfer task. One can further extend the capabilities of this Endpoint by reconfiguring the PCIe core to include IO mapped space.

Acknowledgment

The results in this paper are from work conducted as a Master thesis in the field of Microelectronics, in cooperation with the Institute of Computer Technology at the Vienna University of Technology and the CES Design Services business unit of Siemens IT Solutions and Services PSE, Austria. I am very grateful to those who supported and helped me during this work.

References

[1] Don Anderson and Tom Shanley, "PCI System Architecture", MindShare, Inc., 1999.
[2] Don Anderson, Ravi Budruk, and Tom Shanley, "PCI Express System Architecture", MindShare, Inc., 2004.
[3] Ajay V. Bhatt, "Creating a PCI Express™ Interconnect", Technology and Research Labs, Intel Corporation, 2002.
[4] "PCI Express™ Base Specification", Revision 1.1, March 28, 2005.
[5] "Virtex-5 Integrated Endpoint Block for PCI Express Designs", Xilinx User Guide UG197 (v1.1), March 20, 2007.
[6] Koninklijke Philips Electronics N.V., "NXP x1 PHY single-lane transceiver PX1011A (I)", September 2006.
[7] "LogiCORE™ PCI Express PIPE Endpoint 1-Lane v1.5", Xilinx User Guide UG167, September 21, 2006.
[8] "MicroBlaze Processor Reference Guide", Xilinx User Guide UG081 (v6.3), August 29, 2006.
[9] "LogiCORE™ PCI Express® Endpoint Block Plus v1.2", Xilinx User Guide UG341, February 15, 2007.

[Figure 7. PCIe Endpoint's Completion with Data (CPLD) Transaction Layer Packet (TLP): simulation waveform of the transmit sequence described above.]

