Multiple FPGAs Based Prototyping and Debugging With Complete Design Flow

The document discusses debugging techniques for large designs implemented across multiple FPGAs. It proposes using an external memory to store signal data from multiple partitions to overcome the limited internal memory of FPGAs. It also addresses the need to trace thousands of signals across multiple partitions during debugging.


Conference Paper · December 2016


DOI: 10.1109/IDT.2016.7843035


Multiple FPGAs based prototyping and debugging
with complete design flow
Muhammad Moazam Azeem, Roselyne Chotin-Avot, Umer Farooq, Maminionja Ravoson, Habib Mehrez
Sorbonne Universités, UPMC University Paris 06, LIP6, Paris, France
Email: {muhammad-moazam.azeem}@lip6.fr

Abstract—Multiple FPGA-based prototyping plays an important role in the design and verification process due to its low cost and high execution speed. However, there is a need to optimize the configuration flow of multiple FPGA-based prototyping. In this paper, we address the partitioning of large designs and propose a debugging methodology for these partitioned designs using the SignalTap II embedded logic analyzer from Altera's Quartus tool. Usually the SignalTap II tool is used to debug a design implemented on a single FPGA; this logic analyzer debugs an FPGA device by probing the states of internal signals without using external debug equipment. However, we use the SignalTap II logic analyzer for large designs on multiple FPGAs, and we facilitate the debugging methodology for thousands of signals under consideration. We propose the debugging of large designs after partitioning by developing techniques to trace the required signals under test through multiple FPGAs without using FPGA internal memory. We have also generated various large benchmarks and tested them for multiple FPGA-based prototyping.

978-1-5090-4900-4/16/$31.00 ©2016 IEEE

I. INTRODUCTION

The advancement in processing technology has tremendously increased the computation capability of digital circuits, which in turn has made their design process costly in terms of both time and money. The design of a recent digital system takes about two to three years to roll out the first prototype and requires millions of dollars [1]. Furthermore, disparaging processing technology metrics are aggravating reliability issues in the final rolled-out design. The design process of a digital circuit normally comprises multiple important steps, among which design verification is important as it takes around 70% of total design time and 80% of total design cost. In order to minimize the effect of verification time and cost, different pre-silicon design verification techniques are exercised. These techniques primarily include simulation, emulation, and FPGA-based prototyping. Among these techniques, FPGA-based prototyping has become popular as it is an economical method to validate the functionality of an ASIC design, providing rapid time to market [2]. In the near future, FPGA-based prototyping will become important for many applications including IoT products, wireless sensor networks, and cloud computing. Despite all its benefits, designers are still reluctant to use FPGA-based physical prototyping as they understand that it does not support very large scale designs [3]. In fact, the capacity for FPGA-based designs is limited to a few million ASIC gates, and it also takes months to have a working FPGA-based prototype. There is a large gap between ASICs and FPGAs and design complexity is also increasing, so large complex designs can be prototyped using multiple FPGAs. In order to do so, a large design is partitioned across multiple FPGAs to meet the desired logic capacity [4]. Therefore, multiple FPGA-based prototyping is increasingly in demand for industrial needs. Synopsys HAPS is an example of an FPGA-based prototyping system that has been designed to fulfill industrial needs [5].

Multiple FPGA-based prototyping follows a configuration flow with several steps: partitioning, synthesis, inter-FPGA routing, intra-FPGA P&R, and debugging. We generated very large scale benchmarks using the Design Space eXploration (DSX) tool developed at the LIP6 laboratory [6]. We used the Synopsys Certify tool for partitioning large designs. In multiple FPGA-based prototyping, partitioning is the process of dividing a large complex design into multiple parts depending on the number of available FPGAs. In this way, each part after partitioning can fit the available logic capacity of the target device, i.e. the FPGA. Certify provides two types of partitioning: manual, when the user assigns logic elements to physical devices, and automatic, when the assignment is done by the tool. The optimized partitioning process must reduce the number of signals between FPGAs to increase the frequency of the whole design. Partitioning is then followed by synthesis and inter-FPGA routing, and finally placement and routing is done using Quartus tools as we target Altera FPGAs. Once placement and routing are done, the next step is in-circuit verification [7]. We will use the SignalTap II Embedded Logic Analyzer from Altera for debugging multiple FPGAs.

Moreover, for multiple FPGA-based prototyping, FPGAs are either mounted on a single board having physical connections among them [8], or multiple FPGA boards are connected together, communicating either through cabling or through HSTC connectors. The communication among multiple FPGAs can be either point-to-point or point-to-multipoint depending upon the available physical tracks between FPGAs. The number of FPGAs depends upon the complexity of the design and may vary from a few FPGAs to many [9], [10]. We used four DE3 boards by Altera that have built-in HSTC connectors for inter-FPGA communication. We will use Altera tools for synthesis, P&R and debugging.

Furthermore, there are also limitations on debugging visibility for a multiple FPGA-based prototype. When the specified condition or set of conditions, called the trigger, is reached, SignalTap II stops and displays the data. Normally,
by defining the trigger conditions in the logic analyzer, design accuracy is achieved and the ability to isolate errors in the design is improved. SignalTap II requires neither changes to the internal design files nor external probes to capture the state of internal nodes, and no extra I/O pins for the design under test are required. The signal data are stored in FPGA memory until they are analyzed. SignalTap II from Altera is an integrated logic analyzer that is normally used for debugging a single FPGA. However, using SignalTap II to debug a multiple FPGA-based prototype is a challenging scientific problem: the large design is partitioned, and there is a need to record and trace the required signals after partitioning for debugging. In this paper, we address how to perform debugging for large designs after partitioning for a multi-FPGA prototyping system. First, we propose to use an external memory to save the information of thousands of signals to avoid the limited FPGA memory. We can save the multiple states of thousands of signals of multiple partitions which will be running on multiple FPGAs of the existing prototype. We have taken this memory limitation into account and propose to use external memory for multiple FPGA-based prototyping of very large scale designs [11]. Second, as the large design is partitioned and each partition contains thousands of signals, there is a need to keep a record of all signals. For example, if there are 60 FPGAs and the large design is partitioned into multiple parts, we do not know in which partition the specific signals to debug are placed. As the debugging is performed at run time just after the bitstreams are loaded to multiple FPGAs, we need to trace the required signals under debugging to observe the states of signals according to trigger conditions. We have developed techniques to trace the signals under debugging through multiple partitions running on multiple FPGAs. Third, we have tested the developed prototype for multiple FPGAs after generating various large benchmarks using the DSX tool and facilitated the debugging of large and complex designs after partitioning.

The rest of the paper is organized as follows. Section II presents the generation of large benchmarks for multi-FPGA prototyping. Section III gives a brief overview of the design flow including partitioning, synthesis, inter-FPGA routing, intra-FPGA P&R and debugging. The debugging methodology is described in Section IV. The experimental setup is presented in Section V, which includes the system builder for multi-FPGA prototyping. Various benchmarks are tested on the multiple FPGA-based prototype in this section and the debugging for large designs is also presented. Section VI presents the conclusion and future work.

II. BENCHMARK GENERATION

Complex and large scale benchmarks are an elementary requirement for multiple FPGA-based prototyping. We generated various benchmarks using the DSX tool that uses SoCLiB developed at LIP6 [6]. This section describes the architecture of a benchmark, which is in fact a 2D mesh NoC generated using SoCLiB [12]. Many NoCs of various sizes can be generated based on this pattern depending on the required design size for multiple FPGA-based prototyping. The architecture of the large generated multiprocessor benchmark is shown in Fig. 1; it consists of clusters, which are the basic building blocks of the 2D mesh. The cluster consists of RAM, DMA and four processors including a data and instruction cache. The 2D mesh is a grid of 𝑀 × 𝑀 clusters and each cluster is further connected to the four nearest clusters. The size of the cluster depends on the grid, which may suffer from latency that increases with the size of the grid. It is thus possible to integrate more cores on a single chip while controlling the increase of latency. Furthermore, in order to have more complex and larger benchmarks, we can move from mono-cluster to multi-cluster benchmarks where intra-cluster communication is done through a VCI network and inter-cluster communication is done through the DSPIN (NoC) architecture. We have generated multi-cluster benchmarks which are realistic, and based on this architecture we generated large 2D mesh NoCs that we will use in Section V.

Fig. 1. 2D mesh NoC

III. DESIGN FLOW

In this section we describe the design flow that we use for our multiple FPGA-based prototyping. Fig. 2 describes the design flow, where partitioning at RTL level is done before synthesis, which is then followed by inter-FPGA routing and multiplexing. The bitstreams for multiple FPGAs are generated after intra-FPGA placement and routing. When the bitstreams are downloaded to the FPGAs, in-circuit verification is the next step to validate the design functionality. We provide brief details of each step in the following subsections.

A. Partitioning

The benchmarks generated by the DSX tool are large and have to be partitioned before synthesis in order to avoid long synthesis time and to map them onto multiple FPGAs. We have used the Synopsys Certify partitioning tool, which takes the description of the FPGAs and performs fast synthesis for the partitioned design as shown in Fig. 2. Certify generates the required partitions for multi-FPGA prototyping with minimum cut nets among different partitions. Certify also generates the trace assignment file that provides the information for communication between FPGAs. This trace assignment file is finally given to the inter-FPGA routing tool.
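To make the notion of cut nets concrete, the following minimal Python sketch counts the nets that cross FPGA boundaries for a given cell-to-FPGA assignment. It is only an illustration: the netlist representation, the function names count_cut_nets and inter_fpga_demand, and the toy data are assumptions made here, and the sketch does not reproduce the Certify partitioning algorithm or the format of its trace assignment file.

```python
from collections import defaultdict
from itertools import combinations


def count_cut_nets(nets, assignment):
    """Return {net: sorted list of FPGAs} for every net spanning more than one FPGA.

    nets       -- hypothetical netlist: net name -> list of cell names it connects
    assignment -- hypothetical partition: cell name -> FPGA index
    """
    cut = {}
    for net, cells in nets.items():
        fpgas = sorted({assignment[cell] for cell in cells})
        if len(fpgas) > 1:  # the net leaves its FPGA, so it is a cut net
            cut[net] = fpgas
    return cut


def inter_fpga_demand(cut):
    """Count how many cut nets need a connection between each pair of FPGAs."""
    demand = defaultdict(int)
    for fpgas in cut.values():
        for pair in combinations(fpgas, 2):
            demand[pair] += 1
    return dict(demand)


# Toy example (assumed data): four cells placed on two FPGAs.
nets = {
    "n1": ["a", "b"],
    "n2": ["b", "c"],
    "n3": ["c", "d"],
    "n4": ["a", "d"],
}
assignment = {"a": 0, "b": 0, "c": 1, "d": 1}

cut = count_cut_nets(nets, assignment)
print(cut)                     # nets n2 and n4 cross from FPGA 0 to FPGA 1
print(inter_fpga_demand(cut))  # two signals compete for the FPGA 0 / FPGA 1 tracks
```

On this toy netlist, two of the four nets are cut, both between FPGA 0 and FPGA 1. A partitioner tries to keep this count low because every cut net competes for the limited physical tracks between FPGAs.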
Fig. 2. Design Flow

B. Parallel Synthesis

Parallel synthesis is also possible: we generate the partitions with the Certify tool and propose to launch parallel synthesis on a server via a script in order to save synthesis time for large designs. To further optimize synthesis time, we can divide each partition to permit parallel sub-synthesis. For example, Fig. 3 shows that we can divide each RTL FPGAi of Fig. 2 into three further sub-circuits, and parallel synthesis can be performed on these three sub-circuits to reduce synthesis time. The synthesized sub-circuits are then merged to obtain the complete synthesized HDL FPGAi.

Fig. 3. Parallel Synthesis of Design

C. Inter FPGA Routing

Due to the increase in design size, the ratio of design logic to FPGA I/Os is increasing. This also increases the number of cut nets obtained after partitioning. These cut nets far outnumber the FPGA I/Os, and the inter-FPGA tracks are very limited [13]. The cut nets are therefore routed using time division multiplexing [14]. The main objective of the inter-FPGA routing tool is to provide the shortest path between the source and destination FPGA while minimizing the multiplexing ratio. The multiplexing ratio has a significant impact on the global frequency of the whole multi-FPGA system. Another objective of the routing tool is to minimize the routing hops to avoid extra delays. This routing tool is developed internally at the LIP6 laboratory [15], [16].

D. Intra-FPGA Placement and Routing

Once the routing tool has done inter-FPGA routing, the next step is to perform intra-FPGA placement and routing. The whole design has to be placed and routed; as we already have multiple partitions, we can use vendor-specific tools for intra-FPGA placement and routing [17]. As we are using Altera boards with four FPGAs, we will use the Quartus tool by Altera to perform intra-FPGA placement and routing [18]. Bitstreams are generated, which are then loaded to each FPGA.

E. In-Circuit Verification

Once the bitstreams are loaded to each FPGA device, the next step is in-circuit verification as shown in Fig. 2. This verification is possible by using debugging tools, which are software tools provided by various CAD tool providers. The debugging tool captures and displays the signals of circuits which are designed for implementation on FPGA devices. This makes it possible to observe the behavior of internal signals while the design is running on the FPGA device. By defining the trigger conditions, the ability to isolate errors in the logic design is dramatically improved. SignalTap II is a debugging tool for a single FPGA provided by Altera, but we will use this tool for debugging multiple FPGAs. More details on the debugging methodology are given in the next section.

IV. DEBUGGING METHODOLOGY

The debugging of a design is possible using an internal or external logic analyzer. Tektronix [19], for example, provides external logic analyzers. An Integrated Logic Analyzer is provided by Xilinx [20]. We will use the SignalTap II Logic Analyzer by Altera [21] for our prototype as we are using FPGA boards by Altera. In fact, SignalTap II is normally used for a single FPGA device, but we address debugging for multiple FPGA prototyping, which is a challenging scientific problem. We address two main problems: the limited FPGA memory, as SignalTap II uses internal FPGA memory, and signal tracing after partitioning. First, to avoid FPGA memory limitations, we propose to connect DDR3 to the FPGA board or use external system memory to save information of signals manually. We can save the multiple states of thousands of signals of multiple partitions to external memory from the multiple FPGAs of the existing prototype. Secondly, we address the debugging problem after partitioning for the multiple FPGA prototype. As we are doing partitioning for large designs we
have to keep a record of and trace the signals after partitioning. Each partition contains thousands of signals and there is a need to trace the required signals. The large design is partitioned into multiple parts and then each partition is placed on one device among multiple devices. We have developed techniques to trace the signals under debugging through multiple partitions running on multiple FPGAs. Our algorithm uses the record of existing partitions and traces the required signals under debugging; we will provide more details in the next section. Moreover, as the debugging is performed at run time just after the bitstreams are loaded to multiple FPGAs, we need to record the multiple states of the required signals under debugging according to trigger conditions. We have tested the developed prototype for multiple FPGAs using various large benchmarks that we generated using the DSX tool and provide the solution for debugging multiple FPGAs after partitioning.

Fig. 4. Task Flow of Logic Analyzer (flowchart steps: create new project or open existing project; add SignalTap II Logic Analyzer to design instance; configure SignalTap II Logic Analyzer; define triggers; compile design; program target device or devices; run SignalTap II Logic Analyzer; view, analyze, and use captured data; adjust options or triggers and continue debugging until functionality is satisfied or the bug is fixed)

We briefly explain the classical debugging flow of SignalTap II from Altera for a single FPGA; we will then use this classical debugging flow for the debugging of multiple FPGAs. We start with a new project, and the input for the Quartus tool can be in various formats including VHDL, Verilog, or VQM. To debug the ASIC design several tasks are performed; Fig. 4 shows the complete task flow of SignalTap II by Altera including the configuration and debug analysis [18]. The first step is to add a .stp file to the design. If we want to view multiple clock domains simultaneously, additional instances of the logic analyzer are added to the design. After the addition of the .stp file to the design, the logic analyzer is configured in order to monitor the required signals. The signals can be added manually to the design, or it is possible to use a plug-in, the Nios II embedded processor plug-in for example, to add all required signals quickly for a specific design. It is also possible to change buffer parameters; the way of capturing and storing data and the specific memory type can also be selected. After the design is compiled with Quartus, and before synthesis, placement and routing, and programming of the FPGA, the signals are selected in the .stp file using the Node Finder. After the selection of signals, the trigger conditions are set and the design is compiled again. The logic analyzer captures data continuously while it is running. The trigger conditions inform the logic analyzer when it has to stop capturing data. The trigger conditions vary from simple to complex groups of multiple conditions. As we need to adjust the trigger conditions frequently, we can use the incremental compilation feature of the logic analyzer with Quartus II incremental compilation, which reduces the compilation time. The logic analyzer is connected to the FPGA device through a JTAG connection. The data is captured through JTAG and it can be read and analyzed through the .stp file. The captured data can also be saved in other formats that can later be used for debugging analysis. Furthermore, the complete design-to-debug flow from Altera is provided in Fig. 5; it starts from synthesis, continues to programming the device and finally ends with capturing and sampling the data. We will use the same flow for multiple FPGAs. As each FPGA is programmed independently, we will use our tool to trace the specific signals after partitioning. Once the signals are traced and we have the information of the FPGA where the signals are placed, we can open the SignalTap II window for that specific device to start debugging. This will reduce the debugging complexity for multiple FPGAs.

Fig. 5. Design to Debug Flow (SignalTap II flow diagram with three stages. Setup Your Design: your HDL design, synthesis, place and route, program device under test. Setup SignalTap II Embedded Logic Analyzer: use Node Finder to select signals; set up signals, trigger conditions and trigger levels; synthesis, place and route. Capture Samples and Analyze: program device under test, view samples in Quartus II software, identify source of problem)

V. EXPERIMENTAL SETUP AND RESULTS

The DE3 System Builder is really helpful for quickly creating multiple projects for a large and complex design for multiple FPGAs, as shown in Fig. 6. It also provides error-checking rules to avoid common mistakes. These mistakes may include wrong pin assignments, which may damage the board. It may also avoid malfunctioning of the board due to wrong device connections. We have used four Stratix III FPGA DE3 boards of Altera by Terasic as shown in Fig. 7. These boards can
be connected in a stack, and they are connected through built-in HSTC connectors as shown in Fig. 8. Each FPGA board can communicate with the other boards using these four HSTC connectors. For our prototype, the DE3 System Builder is used to create projects for four FPGA boards connected in a stack. In the DE3 System Builder, we select the FPGA types and we can change the I/O groups to HSTC connectors for inter-FPGA communication. We then perform the configuration of our system by establishing the connections among the 4 FPGA boards. Once the configuration is complete, the DE3 System Builder generates four Quartus II projects. We can add the files of our partitioned custom logic design to each project. We build and compile each project, which also includes a .stp file, to generate a .sof file. In the .stp file, signals and trigger conditions are defined, and using the JTAG chain the target device is selected for configuration using each .stp file as shown in Fig. 9. Each FPGA is then programmed independently, and each SignalTap II logic analyzer runs independently as well through the JTAG chain. SignalTap II captures data at the defined trigger points through JTAG and the data are further analyzed.

Fig. 6. DE3 System Builder

Fig. 7. Stack of Altera's FPGA boards

Fig. 8. Multiple FPGA devices (JTAG chain: a communication cable connects Stratix FPGA1, FPGA2, FPGA3 and FPGA4, each with its own SignalTap II file, STP1 to STP4)

As we are dealing with multiple FPGAs, we used the Certify tool for partitioning, and each part is programmed to a specific device through JTAG. We cannot store all signals and their states in internal FPGA memory, which is limited, so we propose to store the signals using external system memory. We can store thousands of signals in external system memory, and the next step is to trace the specific signals in multiple partitions. It is very difficult to keep a record of and trace thousands of signals after partitioning. We have a record of all signals before partitioning and we trace the required signals after partitioning among all existing partitions of the large scale design. We run the scripts and our algorithm traces and follows the required signals through the partitions. In fact, the names of the signals that we need to debug are given as input to the script; the algorithm then passes through all the generated partitions and returns the details of the partition file where the signals are placed. The algorithm also provides the source, intermediate and final device destination information to know where the signal is finally placed. We can then add these signals to the SignalTap II logic analyzer for run-time debugging by specifying the trigger conditions. The state of the signals can easily be analyzed using the SignalTap II waveform as shown in Fig. 10. We can have multiple waveform windows running at the same time in the SignalTap II logic analyzer depending on the complexity of the design.

Fig. 9. Trigger conditions by SignalTap II Logic Analyzer

Moreover, the SignalTap II logic analyzer provides support for multiple clock domains as well. This feature allows a unique logic analyzer to be opened for each clock domain. Moreover,
the same SignalTap II settings can be applied to a group of signals using each instance. We have generated many benchmarks, as shown in Table I; it is clear that the numbers of Adaptive Look-Up Tables (ALUTs) and registers vary from a few thousand to hundreds of thousands. We have done the partitioning of all these benchmarks and tested them on our multiple FPGA-based prototyping platform. The complete design flow for our prototype was already explained in Section III. We have used Quartus II for intra-FPGA placement and routing; furthermore, with the help of SignalTap II and signal tracing, we have debugged all these large scale benchmarks. This technique really facilitated the debugging for multi-FPGA prototyping.

TABLE I
BENCHMARK DETAILS

Sr. No   Benchmark Name   No. of ALUTs   No. of Registers
1        CPU2x2x1         3088           1795
2        CPU2x2x2         92162          70270
3        CPU2x2x3         98520          74861
4        CPU2x2x4         99740          78690
5        CPU2x2x5         112909         83741
6        CPU2x2x6         120811         88237
7        CPU2x2x7         128019         92713
8        CPU2x2x8         132821         97014
9        CPU4x4x1         341685         272504

Fig. 10. Waveforms by SignalTap II Logic Analyzer

VI. CONCLUSION

We have developed a prototyping platform for multiple FPGAs for large complex SoC designs. The main objective of the prototype is to achieve quick set-up time for the design under consideration while ensuring high speed at lower cost. We also explored the complete configuration design flow: we used the Certify tool for partitioning, Quartus II for synthesis, and SignalTap II for debugging, which we further optimized for large scale ASIC designs. Our proposed design flow divides a large design into multiple parts, which can result in lower synthesis time as we also performed parallel synthesis. We also generated large benchmarks using the DSX tool and tested them on the multi-FPGA prototyping platform. We have developed techniques that optimized the debugging methodology of large designs for multiple FPGA-based systems by storing the signals in external memory and by tracing the required signals under consideration. This work is still in progress and the next step is to have a complete open source academic design flow with an automatic partitioning tool dealing with large and complex designs. Furthermore, we will automate the multiple FPGA-based debugging methodology dealing with millions of signals.

VII. ACKNOWLEDGMENTS

This research work is done through the financial support of Systematic and Bpifrance. We would also like to thank Zied Marrakchi and Hayder Mrabet from Flexras Technologies for their suggestions and guidance to enhance the quality of the work.

REFERENCES

[1] M. Santarini, “Asic prototyping: Make versus buy,” EDN, vol. 11, 2005.
[2] Q. Tang, “Phd thesis: Methodology of multi-fpga prototyping platform generation,” Master’s thesis, Pierre and Marie Curie University, 2015.
[3] S2C, “http://www.s2cinc.com/assets/files,” April 2015.
[4] Synopsys, “http://www.synopsys.com/Prototyping/FPGABasedPrototyping/Pages/Certify.aspx,” September 2013.
[5] Synopsys HAPS, “http://www.synopsys.com/Prototyping/FPGABasedPrototyping/Pages/HAPS.aspx,” July 2013.
[6] M. Turki, H. Mehrez, Z. Marrakchi, and M. Abid, “Towards synthetic benchmarks generator for cad tool evaluation,” in Ph.D. Research in Microelectronics and Electronics (PRIME), 2012 8th Conference on, June 2012, pp. 1–4.
[7] Microsemi, “http://www.microsemi.com/document-portal/doc.”
[8] M. Inagi, Y. Takashima, and Y. Nakamura, “Globally optimal time-multiplexing in inter-fpga connections for accelerating multi-fpga systems,” in Field Programmable Logic and Applications, 2009. FPL 2009. International Conference on, Aug 2009, pp. 212–217.
[9] D. Group, “http://www.dinigroup.com/.”
[10] S. Asaad, R. Bellofatto, B. Brezzo, C. Haymes, M. Kapur, B. Parker, T. Roewer, P. Saha, T. Takken, and J. Tierno, “A cycle-accurate, cycle-reproducible multi-fpga system for accelerating multi-core processor simulation,” in Proceedings of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, ser. FPGA ’12. New York, NY, USA: ACM, 2012, pp. 153–162. [Online]. Available: http://doi.acm.org/10.1145/2145694.2145720
[11] I. Kuon and J. Rose, “Measuring the gap between FPGAs and ASICs,” in Proceedings of the 2006 ACM/SIGDA 14th international symposium on Field programmable gate arrays. ACM New York, NY, USA, 2006, pp. 21–30.
[12] N. Pouillon and A. Greiner, “Soc lib project,” 2010. [Online]. Available: https://www.asim.lip6.fr/trac/dsx/
[13] G. Schelle, J. Collins, E. Schuchman, P. Wang, X. Zou, G. Chinya, R. Plate, T. Mattner, F. Olbrich, P. Hammarlund, R. Singhal, J. Brayton, S. Steibl, and H. Wang, “Intel nehalem processor core made fpga synthesizable,” in Proceedings of the 18th Annual ACM/SIGDA International Symposium on Field Programmable Gate Arrays, ser. FPGA ’10. New York, NY, USA: ACM, 2010, pp. 3–12. [Online]. Available: http://doi.acm.org/10.1145/1723112.1723116
[14] Q. Tang, H. Mehrez, and M. Tuna, “Routing algorithm for multi-fpga based systems using multi-point physical tracks,” in Rapid System Prototyping (RSP), Oct 2013, pp. 2–8.
[15] U. Farooq, R. Chotin-Avot, M. Azeem, Z. Cherif, M. Ravoson, S. Khan, and H. Mehrez, “Using timing-driven inter-fpga routing for multi-fpga prototyping exploration,” in Euromicro DSD/SEAA, Limassol, Cyprus, 2016.
[16] U. Farooq, R. Chotin-Avot, M. Azeem, M. Ravoson, and H. Mehrez, “Inter-fpga routing environment for performance exploration of multi-fpga systems,” in Rapid System Prototyping (RSP), USA, 2016.
[17] Cadence, “http://www.cadence.com/products/sd/palladium_xp_series/pages/default.aspx.”
[18] Altera, “http://www.altera.com,” 2015.
[19] Tektronix, “http://www.tek.com/.”
[20] Xilinx, “http://www.xilinx.com,” 2015.
[21] Altera, “http://scale.engin.brown.edu/classes/EN164S16/SignalTap.pdf,” January 2011.
