PyDVE - An Open-Source Python-Based Design Verification
Curtis Bucher
March 2024
Computer Engineering, Cal Poly San Luis Obispo
1 INTRODUCTION
In the digital hardware development field, ensuring the functionality of a design is crucial
before moving to fabrication. This process of functional design verification detects and rectifies any
issues prior to manufacturing, thereby reducing the risk of costly errors. However, this verification
process is both complex and resource-intensive, consuming an estimated 70% of the time and resources
in the digital design process and limiting the speed at which new products can be
brought to market [1]. As hardware complexity continues to grow exponentially, finding efficient
and cost-effective verification methods becomes increasingly important.
To address this challenge, there is a broad effort to develop tools and frameworks that
streamline the verification process. Broadly, these systems fall into two categories: simulation and
emulation. I will further distinguish between tools available from the traditional “Big 3”
EDA vendors (Cadence, Synopsys, and Siemens) and newer, open-source tools that are beginning
to gain traction in the functional verification landscape.
This paper introduces a novel verification framework that strives to combine the speed of
traditional emulation tools with the power, ease, and flexibility of the Python ecosystem. Notably,
this framework is open-source and utilizes inexpensive off-the-shelf hardware, eliminating the high
costs associated with traditional emulation platforms.
The paper also explores the advantages of this framework, highlighting its flexibility and
accessibility compared to conventional emulators. Additionally, potential performance
improvements over existing simulation tools are investigated.
2 BACKGROUND
Broadly, functional design verification deals with two questions: are two models of the
design under verification equivalent, and does a model of the design operate correctly [2]? This is
analogous to writing unit tests in software engineering. To answer these questions for a hardware
design, it is necessary to replicate the behavior of the design quickly and accurately, to see how it
behaves under different test stimuli. To reproduce the behavior of the hardware model, two main
methods are used: simulation and emulation. Each of these methods has different tools available
from the “Big 3” vendors, as well as open-source alternatives. We will explore the benefits and
drawbacks of each method in depth.
These models are described using a traditional hardware description language (HDL) such as
SystemVerilog or VHDL. Additionally, module testbenches, used to drive input stimuli to the
design and verify its output accuracy, are traditionally written in SystemVerilog, often utilizing
Universal Verification Methodology (UVM) [1]. However, the increasing success of open-
source verification tools has introduced new languages for testbench design, with Python emerging
as a powerful, open-source alternative to traditional SystemVerilog testbench design, which we'll
delve into further.
Simulation
Simulation is the most common form of design verification [2]. High performance
simulators are available from most of the major EDA vendors, including Siemens ModelSim,
Synopsys VCS, and Xilinx Xsim [3] [4] [5]. These programs run on a traditional computer to
simulate the behavior of RTL hardware models. The objective of simulation is to enable quick and
flexible development of testbenches.
Emulation
Emulation, on the other hand, utilizes a dedicated hardware emulator to mimic the behavior
of the RTL model without the need for compilation and execution on a conventional computer.
Although mapping an RTL design onto emulation hardware requires significant testbench overhead,
the hardware enables individual test vectors to run thousands of times faster, a desirable tradeoff for
larger designs with complex testbenches [2].
In most setups, a host computer manages the testbench and generates the test vectors to
keep the emulator occupied. Emulation tools and the necessary hardware are available from major
EDA vendors like Cadence Palladium Z1, Synopsys ZeBu, Siemens Veloce, and Xilinx
Alveo [6] [4] [3] [5]. However, these software tools and accompanying emulation hardware are often
prohibitively expensive, posing a barrier to entry for verification engineers seeking to leverage the
benefits of emulation.
Python
Python, as an open-source, general-purpose programming language, has gained traction in
testbench design alongside the increasing availability of open-source design verification tools, which
we'll explore later in more detail. Python testbenches offer advantages due to the language's
flexibility and ease of use compared to traditional SystemVerilog and UVM testbenches. Python is
familiar to a larger pool of engineers, and its ecosystem provides access to numerous packages for
developing rich, powerful testbenches across a variety of domains [7]. In terms of language
complexity, Python is notably simpler, with only 23 keywords compared to SystemVerilog's 221
keywords [8].
Figure 1: http://www.fivecomputers.com/language-specification-length.html
3 RELATED WORK
Due to the expensive, closed-source nature of the verification tooling from the big vendors,
and the success of the open-source software ecosystem, there has been a substantial effort in recent
years to develop a suite of open-source tools for digital hardware design, including a host of open-
source Verilog simulators and emulators. Some of these open-source simulation tools have
Python front-ends available, exposing the verification engineer to the flexibility and power of the
Python ecosystem.
Icarus Verilog
Icarus Verilog is a free and open-source Verilog compiler, intended to generate high
performance code for back-end tools [9]. Although in development since 2000, it still exhibits
notable limitations compared to other simulators. In particular, Icarus Verilog lacks support for
SystemVerilog or UVM testbenches, which are the modern standard for complex testbench design,
limiting its utility within the industry.
Cocotb
Cocotb is another free and open-source digital logic verification framework, notable because
cocotb testbenches are written using the open-source Python programming language, rather than
SystemVerilog. Cocotb is not another simulator, but rather a Python frontend that supports
many existing simulators, including open-source solutions like Icarus Verilog, as well as closed-
source alternatives like Synopsys VCS and Siemens ModelSim. According to the documentation,
cocotb encourages the same philosophy of design reuse and randomized testing as UVM but
implemented in Python [7].
Figure 2: Sample cocotb code [8]
Verilator
Verilator claims to be the fastest Verilog/SystemVerilog simulator available, outperforming
many closed-source commercial simulators and running 200-1000x faster than interpreted
simulators such as Icarus Verilog when multithreaded [10]. It achieves this by
compiling the Verilog or SystemVerilog module into an optimized, multithreaded model of the
design, which is then wrapped in a C++ or SystemC wrapper and compiled using traditional
compilers like GCC [10].
Despite its impressive performance and support for both SystemVerilog and Verilog,
Verilator lacks many features of traditional simulators and is primarily intended for high-speed
simulation and integration of SystemVerilog models with C code [10]. Furthermore, it does not
natively support Python testbenches and is not a supported backend of cocotb, limiting verification
engineers to Verilog and SystemVerilog testbenches.
4 DEVELOPMENT
PyDVE
Considering the popularity of Python-based simulation frameworks, and the remarkable
speedup of emulation offerings from traditional vendors, there is a notable lack of solutions for
running Python-based testbenches with emulation. PyDVE was developed to bridge this gap as
an open-source framework that drives an emulated design from Python testbenches. The goal of
PyDVE is to combine the flexibility, ease of use, and extensibility of the Python ecosystem with the
speed benefits of a traditional emulation system.
              SYSTEMVERILOG                             PYTHON
SIMULATION    Traditional Simulators, Icarus Verilog    Cocotb
EMULATION     Traditional Emulators, FireSim            PyDVE
Table 2: The missing python-based emulation tool...
Architecture
Since PyDVE relies on emulation to replicate the behavior of the design under test (DUT),
additional hardware is necessary for its operation. To ensure affordability and accessibility, a
hardware architecture utilizing readily available components was chosen. Specifically, the Kria
KV260 development board by Xilinx, priced under $400, was selected. This board features the
Zynq UltraScale+ MPSoC, integrating an ARM processor tightly coupled with a Zynq FPGA.
The choice of the Kria board is rooted in its support for the PYNQ framework, an open-source
project developed by AMD, which provides a Python software interface for interacting with FPGA
designs [14]. This compatibility ensures seamless integration with the PyDVE package, facilitating
efficient testing processes.
Figure 5: PyDVE Hardware and Software Architecture
The system consists of a host computer connected to the Kria board via either a high-speed
network or a serial connection, enabling multiple Kria boards to connect simultaneously for
concurrent testing and a greater test throughput.
On the host CPU, engineers access the PyDVE Python package, facilitating the transmission
of the DUT to the development board and the transmission of transaction-level stimulus to the
emulator board.
Meanwhile, the Kria Board's embedded CPU operates a server, which awaits commands
from the host computer. This server manages interactions with the FPGA, including loading the
DUT, providing stimulus to the design, and retrieving outputs from the design.
The FPGA assumes responsibility for emulating the DUT by loading it into its
programmable logic and communicating with the embedded server via high-speed AXI GPIO. Its
flexibility allows it to emulate multiple DUTs simultaneously, further enhancing system speed.
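The division of labor among the host, the embedded server, and the FPGA can be illustrated with a minimal sketch. The wire format is not specified here, so the newline-delimited JSON messages, the command names load and stimulus, and the file name adder.bit are assumptions made purely for illustration. A software adder stands in for the FPGA, and a local socket pair stands in for the network link.

```python
import json
import socket
import threading


def fake_fpga_eval(inputs):
    """Stand-in for the FPGA: an 8-bit adder evaluated in software."""
    return {"sum": (inputs["a"] + inputs["b"]) & 0xFF}


def serve(conn):
    """Hypothetical embedded-server loop: one JSON transaction per line."""
    with conn, conn.makefile("r") as reader, conn.makefile("w") as writer:
        for line in reader:
            request = json.loads(line)
            if request["cmd"] == "load":
                # A real server would program the FPGA with the bitstream here.
                reply = {"ok": True, "loaded": request["bitstream"]}
            elif request["cmd"] == "stimulus":
                # A real server would drive RTL signals and read back outputs.
                reply = {"ok": True, "outputs": fake_fpga_eval(request["inputs"])}
            else:
                reply = {"ok": False, "error": "unknown command"}
            writer.write(json.dumps(reply) + "\n")
            writer.flush()


def transact(reader, writer, request):
    """Host side: send one transaction and wait for the matching reply."""
    writer.write(json.dumps(request) + "\n")
    writer.flush()
    return json.loads(reader.readline())


def run_demo():
    host_sock, board_sock = socket.socketpair()
    server = threading.Thread(target=serve, args=(board_sock,), daemon=True)
    server.start()
    with host_sock, host_sock.makefile("r") as reader, host_sock.makefile("w") as writer:
        loaded = transact(reader, writer, {"cmd": "load", "bitstream": "adder.bit"})
        result = transact(reader, writer,
                          {"cmd": "stimulus", "inputs": {"a": 200, "b": 100}})
    server.join(timeout=1)
    return loaded, result
```

Calling run_demo() performs one load transaction followed by one stimulus transaction. Keeping each exchange at the transaction level, rather than sending individual signal changes, is what limits the number of network round trips per test vector.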
Design Decisions
Several hardware and software architectures were explored before arriving at the final
architecture depicted above. One key consideration was determining where to run the testbench—
on the host CPU or the embedded CPU on the Kria board. Ultimately, running the testbench on
the host system was chosen to reduce demands on the embedded hardware, potentially improving
system speed and lowering costs. While this approach introduces some network delay, it was
deemed a necessary tradeoff to achieve the desired performance. Additionally, utilizing a
synthesizable testbench directly on the FPGA was considered but dismissed due to concerns about
restricting the system's flexibility by requiring testbenches to be written entirely in a limited subset of
synthesizable Python.
Verification Process
The PyDVE framework is presented to the verification engineer as a Python package,
featuring a class called PyDVE for interacting with the DUT. This framework enables engineers to
write verification testbenches within traditional Python software testing frameworks like unittest
or pytest, interacting with the DUT as if it were a standard Python class.
Synthesizing the Design – Currently, no open-source tools exist for generating bitstreams
for Xilinx FPGAs from HDL code. Hence, engineers are responsible for initially synthesizing their
design using traditional synthesis tools like Xilinx Vivado. Once synthesized, the DUT is exported
as a bitstream. Future work on the project will focus on integrating existing synthesis tools within
the framework.
Loading the Design – Next, engineers provide the PyDVE testing framework with the DUT,
represented by the exported files containing the synthesized design and necessary
metadata. These files are loaded onto the FPGA through the PyDVE package's sources decorator,
which facilitates the transmission of the bitstream over the network connection to the embedded
server for loading onto the FPGA.
In the code snippet above, testAdd() is an individual test vector within the
TestArithmetic testbench, responsible for testing that an ALU correctly performed an addition
operation. Within this test, self.dut is an object of type pydve.pydve. The self.dut object
exposes all the input and output ports of the device under test. Input ports are driven by the
verification engineer through the @= and <<= operators, which assign values to the input ports
combinationally and sequentially, respectively. This can be seen on lines 18-20. Whenever the input ports of the
self.dut are assigned to, the values are collected and driven to the embedded server in the form of
a high-level transaction, which is then driven to the FPGA as individual RTL signals. Every time the
DUT is updated on the FPGA, the resultant output signals are read by the embedded server as RTL
signals, which are then collected into a high-level transaction and sent back to the host computer.
The verification engineer can then check the validity of these output signals within the testbench, by
accessing the output ports of the self.dut object. This can be seen on lines 23-24.
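The port-assignment behavior described above can be approximated in plain Python. The sketch below is neither the PyDVE implementation nor the original listing, so the line numbers cited above do not apply to it. Port, MockAdder, tick(), the single-argument form of sources, and the file name alu.bit are hypothetical stand-ins, and a software adder replaces the FPGA-hosted DUT. It only shows how @= and <<= can be overloaded so that a unittest test case reads like the one described.

```python
import unittest


class Port:
    """Hypothetical stand-in for a DUT port (not the real PyDVE class)."""

    def __init__(self):
        self.value = 0
        self.pending = None  # value waiting for the next clock edge

    def __imatmul__(self, new_value):
        # `port @= x`: combinational assignment, visible immediately.
        self.value = int(new_value)
        return self

    def __ilshift__(self, new_value):
        # `port <<= x`: sequential assignment, applied on the next tick().
        self.pending = int(new_value)
        return self


class MockAdder:
    """Software model of an 8-bit adder standing in for the emulated DUT."""

    def __init__(self, bitstream):
        self.bitstream = bitstream  # a real framework would send this to the board
        self.a = Port()
        self.b = Port()

    def tick(self):
        # Assumed clock edge: latch every pending sequential assignment.
        for port in (self.a, self.b):
            if port.pending is not None:
                port.value, port.pending = port.pending, None

    @property
    def sum(self):
        return (self.a.value + self.b.value) & 0xFF


def sources(bitstream):
    """Hypothetical class decorator: gives each test a fresh DUT as self.dut."""

    def decorate(test_class):
        original_set_up = test_class.setUp

        def set_up(self):
            self.dut = MockAdder(bitstream)
            original_set_up(self)

        test_class.setUp = set_up
        return test_class

    return decorate


@sources("alu.bit")
class TestArithmetic(unittest.TestCase):
    def testAdd(self):
        self.dut.a @= 3
        self.dut.b @= 4
        self.assertEqual(self.dut.sum, 7)

    def testRegisteredAdd(self):
        self.dut.a <<= 250
        self.dut.b <<= 10
        self.assertEqual(self.dut.sum, 0)  # nothing latched yet
        self.dut.tick()
        self.assertEqual(self.dut.sum, 4)  # (250 + 10) mod 256
```

Saved as a module, the test case runs under python -m unittest like any other software test, which is the workflow the framework aims for.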
By abstracting the complexity of interacting with the FPGA and DUT into a Python class,
PyDVE enables the rapid development of Python testbenches, leveraging the language's extensive
ecosystem for powerful testing capabilities. Examples in the source code demonstrate the use of
Python testbenches, including leveraging the numpy package for checking the performance of a
matrix multiplier module and utilizing the hypothesis package for generating constrained random
stimulus.
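As a rough illustration of constrained random checking against a golden model, the sketch below uses only the standard library: random stands in for hypothesis, and a plain-Python matrix product stands in for numpy. The function names, the 8-bit operand constraint, and the 16-bit accumulator width are assumptions chosen for the example, and mock_dut_matmul is a software placeholder for the emulated multiplier.

```python
import random


def golden_matmul(a, b, width=16):
    """Reference model: matrix product with each entry wrapped to `width` bits."""
    mask = (1 << width) - 1
    return [[sum(x * y for x, y in zip(row, col)) & mask for col in zip(*b)]
            for row in a]


def mock_dut_matmul(a, b, width=16):
    """Hypothetical DUT stand-in: multiply-accumulate with a `width`-bit accumulator."""
    mask = (1 << width) - 1
    result = []
    for row in a:
        out_row = []
        for j in range(len(b[0])):
            acc = 0
            for k, x in enumerate(row):
                acc = (acc + x * b[k][j]) & mask
            out_row.append(acc)
        result.append(out_row)
    return result


def random_matrix(rng, rows, cols, data_bits):
    """Constrained stimulus: every element fits in `data_bits` unsigned bits."""
    return [[rng.randrange(1 << data_bits) for _ in range(cols)]
            for _ in range(rows)]


def count_mismatches(num_vectors, seed=0, size=3, data_bits=8):
    """Drive random vectors through the DUT stand-in and compare to the golden model."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    mismatches = 0
    for _ in range(num_vectors):
        a = random_matrix(rng, size, size, data_bits)
        b = random_matrix(rng, size, size, data_bits)
        if mock_dut_matmul(a, b) != golden_matmul(a, b):
            mismatches += 1
    return mismatches
```

In a real testbench the call to mock_dut_matmul would be replaced by driving the DUT's ports and reading back its outputs; the golden model and the stimulus generator would stay on the host.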
5 EVALUATION
To assess the performance of the system, PyDVE was compared against traditional simulation
tools, as well as open-source alternatives, using either Python or SystemVerilog testbenches
depending on the tool. These evaluations were conducted on a matrix multiplier module obtained
from cocotb's source code as a motivating example, varying the number of test vectors from 100
to 100,000 cases. This approach provides insights into the runtime of each tool and how their
performance scales with longer testbenches. The following graphs depict the total testbench runtime
and test vector throughput for PyDVE and various simulation tools. Notably, the data do not include
synthesis times for the modules from Vivado. Detailed data and methodology can be found in
tables 3 and 4 in the appendix.
[Plot: total testbench time in seconds (log scale, 0.01 to 10,000) versus number of test vectors (100 to 100,000) for Icarus Verilog, CocoTB, PyDVE 7 DUV, and Vivado Xsim.]
Figure 8: Comparison of Total Testbench Time for Different Simulators. See Table 3 in Appendix.
[Plot: test vector throughput (log scale, 1 to 1000) versus number of test vectors (100 to 100,000) for Icarus Verilog, CocoTB, PyDVE 7 DUV, and Vivado Xsim.]
Figure 9: Comparison of Test Vector Throughput for Different Simulators. See Table 4 in Appendix.
Figure 10: PyDVE Runtimes Vs. Number of Devices Under Test. See Table 5 in Appendix.
6 CONCLUSION
While PyDVE did not immediately surpass other simulation methods as anticipated, I remain
optimistic about its potential based on the promising results observed. As highlighted in the
evaluation section, several bottlenecks significantly hinder the overall system speed, particularly
within the embedded server on the Kria platform. The server, developed in Python for rapid
prototyping, may not be the most efficient option. I believe that transitioning to a low-level language
tailored to the hardware constraints of the Kria platform could better leverage the FPGA's
capabilities and enhance test vector throughput. Moreover, while supporting off-the-shelf hardware
is essential, there's an opportunity to design specialized hardware for the framework, unrestricted by
the limitations of the Kria platform.
7 REFERENCES
[1] A. Molina, "Functional Verification: Approaches and Challenges," Latin American Applied
Research, vol. 37, no. 1, 2007.
[2] Synopsys Inc, "Functional Verification 2003: Technology, Tools and Methodology," IEEE
Xplore, 2003.
[3] Siemens, "Verification and Validation," 2024. [Online]. Available:
https://eda.sw.siemens.com/en-US/ic/verification-and-validation/. [Accessed 11 03 2024].
[4] Synopsys, "Verification Family," 2024. [Online]. Available:
https://www.synopsys.com/verification.html. [Accessed 11 03 2024].
[5] Advanced Micro Devices, "Vivado Overview," 2024. [Online]. Available:
https://www.xilinx.com/products/design-tools/vivado.html. [Accessed 11 03 2024].
[6] Cadence, "Cadence Verification," 2024. [Online]. Available:
https://www.cadence.com/en_US/home/tools/system-design-and-verification.html.
[Accessed 11 03 2024].
[7] Cocotb, "Cocotb Documentation," 2023. [Online]. Available:
https://docs.cocotb.org/en/stable/. [Accessed 09 03 2024].
[8] B. Rosser, "Cocotb: A Python-based digital logic verification framework," University of
Pennsylvania.
[9] S. Williams, "Icarus Verilog," 2019. [Online]. Available: https://steveicarus.github.io/iverilog/.
[Accessed 09 03 2024].
[10] Veripool, "Welcome to Verilator," Veripool, 2024. [Online]. Available:
https://veripool.org/verilator/. [Accessed 09 03 2024].
[11] W. Snyder, "Verilator: Your Big 4th Simulator: 2019 Intro and Roadmap," Chips Alliance,
2019.
[12] S. Jiang, C. Torng and C. Batten, "An Open-Source Python-Based Hardware Generation,
Simulation, and Verification Framework," in Workshop on Open-Source EDA Technology, Ithaca,
NY, 2018.
[13] S. Karandikar, H. Mao, D. Kim, D. Biancolin, A. Amid, D. Lee, N. Pemberton, E. Amaro, C.
Schmidt, A. Chopra, Q. Huang, K. Kovacs, B. Nikolic, R. Katz, J. Bachrach and K. Asanovic,
"FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud,"
in ACM/IEEE International Symposium on Computer Architecture, Berkeley, CA, 2018.
[14] Advanced Micro Devices Inc, "PYNQ: Python productivity for Adaptive Computing
platforms," 2022. [Online]. Available: https://pynq.readthedocs.io/en/latest/#. [Accessed 11
03 2024].
8 APPENDIX
Performance Comparison (Figure 8,9)
Performance tests in Tables 3 and 4 were run on an Intel Core i7 with 32 GB RAM.
Vivado synthesis for 7 DUTs took 24 minutes.
Num DUVs    Testbench Duration (s)    Total Time (s)    Overhead (s)
1           34.5515757                50.1891054        15.63752971
2           18.2105112                32.5236315        14.31312038
3           12.1188674                30.2632           18.14433262
4           10.0755533                28.2098582        18.13430492
5           8.71449667                24.6855627        15.97106604
6           8.11689833                25.3971535        17.28025513
7           7.3822885                 27.9552509        20.57296242
Table 5: PyDVE Timing Breakdown (s) per Number of DUVs.
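The scaling trend in Table 5 can be summarized with a short calculation. The sketch below copies the table's values and derives two quantities the table does not list: the speedup of the testbench duration and of the total time, each relative to a single DUV. It also recomputes the overhead column as total time minus testbench duration, a relationship inferred from the tabulated numbers. The names TABLE_5 and summarize are illustrative.

```python
# (num_duvs, testbench_duration_s, total_time_s) copied from Table 5.
TABLE_5 = [
    (1, 34.5515757, 50.1891054),
    (2, 18.2105112, 32.5236315),
    (3, 12.1188674, 30.2632),
    (4, 10.0755533, 28.2098582),
    (5, 8.71449667, 24.6855627),
    (6, 8.11689833, 25.3971535),
    (7, 7.3822885, 27.9552509),
]


def summarize(rows):
    """Derive overhead and speedups relative to the first (single-DUV) row."""
    base_duration, base_total = rows[0][1], rows[0][2]
    summary = []
    for num_duvs, duration, total in rows:
        summary.append({
            "num_duvs": num_duvs,
            "overhead_s": total - duration,
            "testbench_speedup": base_duration / duration,
            "total_speedup": base_total / total,
        })
    return summary


for row in summarize(TABLE_5):
    print(f"{row['num_duvs']} DUVs: overhead {row['overhead_s']:.2f} s, "
          f"testbench speedup {row['testbench_speedup']:.2f}x, "
          f"total speedup {row['total_speedup']:.2f}x")
```

On these numbers the testbench duration shrinks by roughly 4.7x from one to seven DUVs, while the overhead stays between about 14 and 21 seconds, so the total time is lowest at five DUVs rather than seven.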