PYNQ Productivity With Python

FPGA

Uploaded by

Dr. Dipti Khurge

FPGA Programming and Architecture

UNIT – IV
Dr. D.S.Khurge
FPGA-based Programming using C/C++
Languages Used to Program FPGAs

• It might seem that FPGAs primarily lie in the domain of chip designers, rather than
engineers who specialize in software development.
• After all, most HDLs used to write FPGA code are lower-level languages that
hardware engineers are likely more familiar with than software engineers.
• But some HDLs are more similar to common software languages than you might
think.
• When we use the word “programming” in regard to FPGAs, it’s not exactly the
same as creating software, due to the way the program is set up and how it is
executed.
• But using this term does encompass the idea that writing and executing FPGA
code is similar in process to creating a software algorithm.
• The old way of thinking was that FPGAs could only be programmed by hardware
engineers designing at the circuit level. Today, that's no longer the case.
• With the help of unified software platforms, software developers can use their preferred
languages to program FPGAs without being well versed in HDLs.
• That takes the stress out of having to pivot to a new programming language, and it can
help software developers focus on concepts rather than hardware.
• These platforms work by essentially translating higher-level languages to lower-level ones
so that an FPGA can execute the desired function.
• Languages that can be used with unified software platforms to program FPGAs include:

•AI frameworks like TensorFlow and PyTorch - With Vitis AI, AI scientists can now directly take their trained deep
learning models from TensorFlow or PyTorch and compile them for FPGA acceleration. This not only eliminates the
need for low-level hardware programming, but also achieves fast compilation times of minutes, matching the
typical software compilation experience on CPUs and GPUs.
•C and C++ - Thanks to high-level synthesis (HLS), C-based languages can now be used for FPGA design.
Specifically, the AMD Vivado™ HLS compiler provides a programming environment that shares key technology
with both standard and specialized processors for the optimization of C and C++ programs. This allows software
engineers to optimize code without having to deal with the roadblock of limited memory space or computational
resources.
•Python - Designers can use the Python language and libraries to create high-performance applications and
program FPGAs with PYNQ—an open-source project from AMD that makes it easier to use AMD platforms.
There are also a number of mainstream HDLs that are primarily used in FPGA programming today. Here’s a
brief rundown on their names and main attributes:
•Lucid - This language was made specifically for FPGAs and overcomes some of the pitfalls of older
languages, such as Verilog.
•VHDL - An acronym for VHSIC (Very High Speed Integrated Circuit) Hardware Description Language, this
language first appeared in the 1980s and was based on Ada and Pascal.
•Verilog - One of the earliest HDLs, Verilog today is used mainly for test analysis and verification. Its syntax
was based on C.
How to Program an FPGA

• FPGAs are programmable logic devices that can be configured to perform specific functions.
FPGA-based programming using C/C++ involves using high-level programming languages like
C/C++ to describe the functionality of an FPGA.
• One common method of FPGA-based programming using C/C++ is through the use of High-Level
Synthesis (HLS) tools. These tools allow developers to write C/C++ code that can be synthesized
into hardware configurations for FPGAs.
• HLS tools often come with pre-built libraries and templates that simplify the process of designing
hardware. These libraries and templates can be used to implement functions like digital signal
processing, machine learning, and image processing on an FPGA.
• The output of an HLS tool is typically a Hardware Description Language (HDL) file that describes
the hardware configuration. This file can then be used to program an FPGA.
• Another approach to FPGA-based programming using C/C++ is to use an embedded processor
that runs C/C++ code on an FPGA. In this approach, the FPGA is used to accelerate certain parts of
the code, while the rest runs on the embedded processor. This approach allows for a higher level
of abstraction in the design process, as developers can write C/C++ code that runs on a processor
and use software tools to manage the hardware configuration.
• Overall, FPGA-based programming using C/C++ is a powerful technique that can be used to create
custom hardware configurations that are optimized for specific applications. However, it requires
specialized knowledge of FPGA architectures, hardware design, and HDLs.
Make FPGA Programming Easy with the Vitis™ Unified Software Platform

• The Vitis™ Unified Software Platform is a cutting-edge application that streamlines the FPGA programming
process for software engineers, data scientists, and AI developers.
• It includes an expansive open-source library optimized for AMD FPGA and ACAP hardware platforms, and a core
development kit that allows you to seamlessly build accelerated applications without extensive hardware
experience.
• Vitis™ also includes the Vitis Model Composer, which offers a toolbox within MATLAB® and Simulink®. It
streamlines the process of designing and testing new applications.

How to Get Started with Vitis Software for Application Acceleration


• Vitis™ helps you design accelerators for data and compute-intensive applications at the edge, on-premise, or in
the cloud in a four-step process:
1. Identify the performance-critical portions of your application that demand acceleration.
2. Design accelerators using Vitis Accelerated libraries, or develop your own in C, C++, OpenCL, or RTL.
3. Build, analyze, and debug to verify functional correctness and ensure performance goals are met.
4. Deploy accelerated applications on AMD platforms at the edge, on-premise, or in the cloud.
Introduction to PYNQ Z2:

S. D. Nagrale
Asst. Prof.
E&TC Dept, PCCOE Pune
Outlines
➢ Introduction to PYNQ Z2
➢ PYNQ-Z2 Setup Guide
➢ Creating Custom PYNQ Overlay with VIVADO HLS, IP
Integrator and Jupyter(Python)
➢ C/RTL Co-Simulation in Vitis HLS
➢ Programming FPGA using Verilog/VHDL
➢ FPGA Implementation of Neural Network
Introduction to PYNQ Z2
An FPGA isn't something we can use as easily as a Raspberry Pi, micro:bit, or similar board.

But there is an option to use Xilinx Zynq based chips that contain an FPGA. Xilinx created an
architecture called ZYNQ.

ZYNQ combines an FPGA with ARM cores and I/O into one product. The ARM part is called Processing
System (PS) while the FPGA is called Programmable Logic (PL).
Introduction to PYNQ Z2
PYNQ is a hardware-software stack that allows using an FPGA via Python and Jupyter
notebooks running on the chip itself.

It doesn't replace Verilog/VHDL, it doesn't allow you to create designs for the FPGA
but it allows interfacing and using designs made by hardware engineers.
Introduction to PYNQ Z2
A PYNQ overlay is a hardware design for the FPGA - it implements
the logic on the FPGA - written in Verilog/VHDL and built with Vivado.

An Overlay is a class of Programmable Logic design.

Programmable Logic designs are usually highly optimized for a specific task.

Overlays, however, are designed to be configurable and reusable for a broad set of applications.

A PYNQ overlay will have a Python interface, allowing a software programmer to use it like any other Python package.
Introduction to PYNQ Z2

Hardware engineers can make various overlays, and software engineers can use them in their applications.

In between, Linux kernel drivers and system libraries connect them together.
Introduction to PYNQ Z2
What is PYNQ?

PYNQ (Python productivity for Zynq) is an open-source project from Xilinx® that makes it easy to design embedded systems with Xilinx Zynq® Systems on Chips (SoCs).
Introduction to PYNQ Z2
What is PYNQ?
Using the Python language and libraries,
designers can exploit the benefits of
programmable logic and microprocessors in
Zynq to build more capable embedded
systems.
Introduction to PYNQ Z2
What is PYNQ?

PYNQ users can now create high-performance embedded applications with: parallel hardware execution, high frame-rate video processing, hardware-accelerated algorithms, real-time signal processing, high-bandwidth IO, and low-latency control.

PYNQ utilizes the best advantages of Zynq and Python.

It has been widely used for machine learning research and prototyping.
Introduction to PYNQ Z2
What is PYNQ?

The PYNQ-Z2 board integrates Ethernet, HDMI input/output, mic input, audio output, an Arduino interface, a Raspberry Pi interface, 2 Pmod ports, user LEDs, push-buttons and switches.

It is designed to be easily extensible with Pmod, Arduino, and Raspberry Pi peripherals, as well as general-purpose GPIO pins.
PYNQ-Z2 Setup Guide
Introduction to PYNQ Z2

The Jupyter Notebook is an open source web application that you can use to create and
share documents that contain live code, equations, visualizations, and text.
PYNQ-Z2 Setup Guide

1. Browse to http://192.168.2.99
2. Password: xilinx
PYNQ-Z2
The PYNQ-Z2 board has the following features:

Zynq XC7Z020-1CLG400C
512 MB DDR3 memory
Gigabit Ethernet
USB 2.0
MicroSD
UART
ADAU1761 audio codec with 3.5 mm HP/mic and line-in jacks
2x HDMI (can be used as input or output)
4 push-buttons
2 slide switches
4 LEDs
2 RGB LEDs
2x Pmod ports
1x Arduino header
1x Raspberry Pi header
Base Overlay
The purpose of the base overlay design is to allow PYNQ to use peripherals on a board out of the box.

The design includes hardware IP to control peripherals on the target board, and connects these IP blocks to the Zynq PS.

If a base overlay is available for a board, peripherals can be used from the Python environment immediately after the system boots.

Board peripherals typically include GPIO devices (LEDs, switches, buttons), video, audio, and other custom interfaces.
Base Overlay
Loading an Overlay

from pynq.overlays.base import BaseOverlay

base_overlay = BaseOverlay("base.bit")

help(base_overlay.leds)
Base Overlay
Loading an Overlay

Example 1: LED toggle

base_overlay.leds[0].toggle()
Programming onboard peripherals

Controlling an LED

Now we can create an instance of each of these classes and use their methods to
manipulate them.

Let’s start by instantiating a single LED and turning it on and off.

from pynq.overlays.base import BaseOverlay
from pynq.lib import LED, Switch, Button

base = BaseOverlay("base.bit")
led0 = base.leds[0]

led0.on()

Check the board and confirm the LED is on.

led0.off()
Programming onboard peripherals

Ex2: Let's then toggle led0 using the sleep() method from the time package to see
the LED flashing.

import time

led0 = base.leds[0]

for i in range(20):
    led0.toggle()
    time.sleep(1)  # 1 second
Programming onboard peripherals

Example: Controlling all the LEDs, switches and buttons

The example below creates 3 separate lists, called leds, switches and buttons.

# Set the number of each IO component
MAX_LEDS = 4
MAX_SWITCHES = 2
MAX_BUTTONS = 4

# Create lists for each of the IO component groups
leds = [base.leds[index] for index in range(MAX_LEDS)]
switches = [base.switches[index] for index in range(MAX_SWITCHES)]
buttons = [base.buttons[index] for index in range(MAX_BUTTONS)]
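Since the board API only runs on the PYNQ-Z2 itself, the polling pattern these lists enable can be sketched on a PC with minimal stand-in classes. The `LED` and `Button` classes below are hypothetical mocks of the pynq.lib interfaces (`on()`/`off()` and `read()`), not the real library:

```python
class LED:
    """Minimal stand-in for pynq.lib.LED (hypothetical mock for PC-side testing)."""
    def __init__(self):
        self.state = 0
    def on(self):
        self.state = 1
    def off(self):
        self.state = 0

class Button:
    """Minimal stand-in for pynq.lib.Button; read() returns 0 or 1."""
    def __init__(self, value=0):
        self.value = value
    def read(self):
        return self.value

leds = [LED() for _ in range(4)]
buttons = [Button() for _ in range(4)]

buttons[2].value = 1  # pretend button 2 is pressed

# The polling pattern used on the board: mirror each button onto its LED.
for i in range(len(buttons)):
    if buttons[i].read():
        leds[i].on()
    else:
        leds[i].off()

print([led.state for led in leds])  # [0, 0, 1, 0]
```

On the real board the loop body is identical; only the objects come from `base.leds` and `base.buttons` instead of the mocks.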
Image Processing with PYNQ, using PYNQ libraries such as SciPy and OpenCV
• Over the last 20 years, FPGAs have moved from glue logic through to computing
platforms.
• They effectively provide a reconfigurable hardware platform for implementing logic
and algorithms.
• Being fine-grained hardware, FPGAs are able to exploit the parallelism inherent within
a hardware design while at the same time maintaining the reconfigurability and
programmability of software.
• This has led to FPGAs being used as a platform for accelerating computationally
intensive tasks.
• This is particularly seen in the field of image processing, where the FPGA-based
acceleration of imaging algorithms has become mainstream.
• This is even more so within an embedded environment, where the power and
computational resources of conventional processors are not up to the task of
managing the data throughput and computational requirements of real-time imaging
applications.
Image Processing Using FPGAs

• Field programmable gate arrays (FPGAs) are increasingly being used for the implementation of image
processing applications.
• This is especially the case for real-time embedded applications, where latency and power are important
considerations.
• An FPGA embedded in a smart camera is able to perform much of the image processing directly as the image is
streamed from the sensor, with the camera providing a processed output data stream, rather than a sequence of
images.
• The parallelism of hardware is able to exploit the spatial (data level) and temporal (task level) parallelism
implicit within many image processing tasks.
• Unfortunately, simply porting a software algorithm onto an FPGA often gives disappointing results, because
many image processing algorithms have been optimised for a serial processor.

• It is usually necessary to transform the algorithm to efficiently exploit the parallelism and resources available
on an FPGA. This can lead to novel algorithms and hardware computational architectures, both at the image
processing operation level and also at the application level.
Image Processing Using FPGAs

• Image processing is the gateway to numerous applications such as face recognition,
driverless vehicles, and vehicle and object identification.
• With this increase in applications, the hardware must improve to reduce latency, and that
need is filled by Field Programmable Gate Arrays (FPGAs).
Programming an FPGA to accelerate complex algorithms is difficult, with one of four approaches
commonly used [1]:

• custom hardware design of the algorithm using a hardware description language, optimised for
performance and resources;
• implementing the algorithm by instantiating a set of application-specific intellectual property cores
(from a library);
• using high-level synthesis to convert a C-based representation of the algorithm to synthesisable
hardware; or
• mapping the algorithm onto a parallel set of programmable soft-core processors.


• PYNQ is a popular platform for developing embedded systems and
accelerating applications with programmable logic.
• It provides an easy-to-use and intuitive interface that allows developers to
use Python libraries for high-performance computing on FPGA devices.
• In this case, we will explore image processing with PYNQ using two
commonly used libraries, SciPy and OpenCV.

• To get started, you will need to have a PYNQ board and Jupyter Notebook
installed. The first step is to install the required libraries using the following
commands in a terminal window:

sudo apt-get install libatlas-base-dev
pip3 install opencv-python
pip3 install numpy
pip3 install scipy
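Before involving the FPGA at all, the kind of per-pixel operation these libraries provide can be sketched in plain NumPy. The image below is synthetic, and the OpenCV equivalents mentioned in the comments are only pointers, not calls this sketch makes:

```python
import numpy as np

# A tiny synthetic "image": 4x4 RGB, left half black, right half white.
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[:, 2:, :] = 255

# Luminance-weighted grayscale conversion (the same 0.299/0.587/0.114
# weighting used by cv2.cvtColor for color-to-gray, up to channel order).
weights = np.array([0.299, 0.587, 0.114])
gray = (rgb @ weights).astype(np.uint8)

# Simple binary threshold, the software equivalent of cv2.threshold.
binary = (gray > 127).astype(np.uint8)

print(binary.tolist())  # each row: [0, 0, 1, 1]
```

On PYNQ, exactly this kind of pixel-wise kernel is what a hardware overlay would accelerate, while the surrounding Python stays unchanged.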
PYNQ is a highly useful platform for image processing for a number of reasons:

1. High-performance computing: PYNQ is built on top of Xilinx Zynq FPGA SoCs, which allows for high-performance
computing and processing of images. The FPGA fabric provides hardware acceleration for image processing
algorithms, leading to faster execution times and lower power consumption compared to traditional software
implementations.
2. Easy to use: PYNQ provides a Python-based programming model, which is easy to learn and use for developers
who are familiar with Python. This makes it accessible to a wider range of developers and researchers, who can
use the familiar Python libraries to develop image processing applications.
3. Pre-built libraries: PYNQ comes with pre-built Python libraries for popular image processing libraries such as
OpenCV, Scikit-image, and PIL, which makes it easy to develop image processing applications without having to
write low-level code.
4. FPGA programmability: PYNQ enables developers to use FPGA programmability to optimize their image
processing algorithms. By implementing the algorithm in hardware, it is possible to achieve real-time processing
of high-definition video streams, which would not be possible with traditional software implementations.
5. Customizable: PYNQ allows developers to create custom overlays and IP blocks that can be used to accelerate
image processing algorithms. This provides a highly customizable platform for image processing, where
developers can tailor the platform to meet the specific requirements of their application.
PYNQ is a powerful platform for image processing, which combines the flexibility of software-based approaches
with the performance and power efficiency of hardware-based approaches. It provides an easy-to-use,
customizable, and high-performance platform for developing image processing applications.
• In PYNQ, an overlay is a hardware design that implements one or more IP blocks in the FPGA fabric.
The IP blocks can be customized for a specific application, and the overlay provides an interface
between the hardware and software components of the system.
• The overlay is typically built using high-level synthesis (HLS) tools, which allow developers to write
hardware descriptions using C/C++ or other high-level programming languages. This allows
developers to create custom IP blocks without having to write low-level hardware descriptions using
Verilog or VHDL.
• The overlay is loaded onto the FPGA fabric at runtime and can be dynamically reconfigured to support
different applications or algorithms. This allows developers to create highly customizable systems that
can be tailored to meet the specific requirements of their application.
• In PYNQ, overlays are loaded using the PYNQ Overlay class. The Overlay class provides a high-level
interface for loading and interacting with overlays. Once an overlay is loaded, developers can access
its IP blocks using Python code, which makes it easy to integrate hardware and software components.
• Overlays are a key feature of PYNQ, and they provide a powerful mechanism for accelerating
computationally intensive algorithms using the FPGA fabric.
• By leveraging the flexibility and performance of FPGAs, developers can create highly efficient and
customizable systems for a wide range of applications, including image and signal processing, machine
learning, and more.
Here are the steps to perform image processing using PYNQ:

1. Set up the PYNQ environment: To get started, you will need to set up your PYNQ board and install any required
libraries. You will also need to have Jupyter Notebook installed.
2. Load the overlay: The first step in image processing with PYNQ is to load the overlay. The overlay is a hardware
design that implements the image processing algorithm in the FPGA fabric. PYNQ comes with pre-built overlays
that you can use or you can create your own overlay.
3. Allocate memory for the input and output buffers: Once the overlay is loaded, you need to allocate memory for
the input and output buffers. This memory will be used to transfer the image data between the software and
hardware components.
4. Load the image: Next, you need to load the image that you want to process. PYNQ supports a variety of image
formats, including JPEG and PNG.
5. Convert the image: Depending on the image processing algorithm that you want to implement, you may need
to convert the image to a different format. For example, if you are performing edge detection, you may need to
convert the image to grayscale.
6. Copy the image to the input buffer: Once the image is loaded and converted, you need to copy it to the input
buffer. This will transfer the image data to the FPGA fabric for processing.
7. Process the image: The next step is to perform the image processing algorithm on the
image. This is done by invoking the hardware IP block that is included in the overlay.
8. Copy the processed image to the output buffer: Once the image processing is complete,
you need to copy the processed image from the FPGA fabric to the output buffer.
9. Display the image: Finally, you need to display the processed image on the screen. This can
be done using a variety of display technologies, such as HDMI or VGA.
• Overall, PYNQ provides a powerful platform for image processing, which combines the
performance and power efficiency of FPGA-based approaches with the flexibility and
ease-of-use of software-based approaches. By following these steps, you can get started with
image processing using PYNQ and start exploring the many possibilities of this powerful
platform.
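The buffer flow in steps 3-8 above can be modeled in plain NumPy, with `np.zeros` standing in for `pynq.allocate` and a pixel invert standing in for the hardware IP. Both stand-ins are assumptions for illustration; on a real board you would drive the overlay's DMA/IP instead:

```python
import numpy as np

H, W = 4, 4

# Step 3: allocate input/output buffers. On a real board, pynq.allocate
# returns physically contiguous buffers the FPGA can DMA from and to.
in_buf = np.zeros((H, W), dtype=np.uint8)
out_buf = np.zeros((H, W), dtype=np.uint8)

# Steps 4-5: load and convert the image (here, a synthetic grayscale ramp).
image = np.arange(H * W, dtype=np.uint8).reshape(H, W)

# Step 6: copy the image into the input buffer.
in_buf[:] = image

# Step 7: invoke the processing IP. A pixel invert is modeled here; on the
# board this would be a transfer to the overlay's IP block instead.
out_buf[:] = 255 - in_buf

# Step 8: the result is now in the output buffer, ready to display (step 9).
print(out_buf[0].tolist())  # [255, 254, 253, 252]
```

The point of the two-buffer structure is that software only ever touches the buffers, never the IP's internals, which is what lets the same Python code work whether the processing runs in software or in the FPGA fabric.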
• PYNQ provides a variety of pre-built Python libraries for popular image processing libraries such as
Scikit-image, OpenCV, and PIL. These libraries can be easily imported into your Python code and
used to develop image processing applications on the PYNQ platform.

1.Scikit-image: Scikit-image is a popular library for image processing in Python, which provides a
wide range of image processing algorithms, including filtering, segmentation, feature extraction,
and more. Scikit-image can be used on PYNQ to perform a variety of image processing tasks, such
as object recognition, edge detection, and image segmentation.
2.OpenCV: OpenCV is a widely used open-source library for computer vision and image processing,
which provides a wide range of image processing algorithms, including filtering, feature detection,
object recognition, and more. OpenCV can be used on PYNQ to perform a variety of image
processing tasks, such as face detection, image segmentation, and object tracking.
3.PIL: PIL (Python Imaging Library) is a library for opening, manipulating, and saving many different
image file formats. PIL can be used on PYNQ to perform simple image processing tasks such as
resizing, cropping, and rotating images.
• All of these libraries can be used on PYNQ to perform a wide range of image processing tasks, and
they can be easily integrated with custom hardware IP blocks or overlays to create highly efficient
and customizable image processing systems. With these libraries, developers can take advantage
of the powerful FPGA fabric on the PYNQ platform to accelerate computationally intensive image
processing algorithms and achieve real-time processing of high-definition video streams.
• PYNQ is well-suited for a wide range of applications in image processing, particularly those that require high
performance and low latency. Some specific examples of image processing applications that can benefit from
PYNQ include:
1. Object recognition and detection: PYNQ can be used to accelerate object recognition and detection
algorithms, such as the YOLO (You Only Look Once) algorithm, which is commonly used in autonomous driving,
robotics, and surveillance systems.
2. Image segmentation: PYNQ can be used to accelerate image segmentation algorithms, which are commonly
used in medical imaging, satellite imagery, and industrial inspection applications.
3. Real-time video processing: PYNQ can be used to perform real-time video processing tasks, such as video
stabilization, motion detection, and face recognition.
4. Edge detection: PYNQ can be used to accelerate edge detection algorithms, which are commonly used in
computer vision and image processing applications, such as object recognition, image segmentation, and
feature extraction.
5. Deep learning-based image processing: PYNQ can be used to accelerate deep learning-based image processing
algorithms, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), which are
commonly used in applications such as image classification, object detection, and image synthesis.
• Overall, PYNQ provides a powerful platform for image processing, which combines the performance and
power efficiency of FPGA-based approaches with the flexibility and ease-of-use of software-based approaches.
By leveraging the power of FPGA fabric and pre-built Python libraries for image processing, developers can
create highly efficient and customizable image processing systems for a wide range of applications.
Installing TensorFlow on PYNQ
• TensorFlow is an open-source software library for machine learning and
artificial intelligence. Developed by the Google Brain team, TensorFlow provides
a flexible and powerful platform for building and deploying machine learning
models at scale.
• TensorFlow supports a wide range of machine learning algorithms, including
deep neural networks, decision trees, and clustering algorithms, and it can be
used for a wide range of applications, including image recognition, natural
language processing, and predictive analytics.
• One of the key features of TensorFlow is its ability to scale across multiple CPUs
or GPUs, allowing developers to train and deploy models on large datasets and
complex architectures.
• TensorFlow also provides a high-level API for building and training machine
learning models, making it easier for developers to get started with machine
learning without needing to have a deep understanding of the underlying
algorithms.
Installing TensorFlow on PYNQ
• Installing TensorFlow on PYNQ involves several steps, including installing the dependencies,
downloading and building the TensorFlow source code, and setting up the necessary environment
variables. Here are the general steps to install TensorFlow on PYNQ:
1.Install dependencies: TensorFlow requires several dependencies, including Python 3.x, NumPy, Bazel,
and more. You can install these dependencies using the package manager of your choice, such as apt-
get or pip.
2.Download and build TensorFlow: Once you have installed the dependencies, you can download the
TensorFlow source code from the TensorFlow website. You can then build TensorFlow using Bazel,
which is a build system that can handle complex dependencies and build configurations.
3.Set up environment variables: Once TensorFlow is built, you will need to set up the necessary
environment variables to use it. This may involve adding the TensorFlow libraries to your Python path,
setting up the LD_LIBRARY_PATH variable, and more.
4.Verify the installation: Once you have completed the installation, you can verify that TensorFlow is
working correctly by running a simple TensorFlow script. For example, you can run a script that
creates a simple neural network and trains it on a dataset.
• Note that installing TensorFlow on PYNQ can be challenging, particularly if you are not familiar with
building software from source code. However, there are several resources available online that can
help you with the installation process, including tutorials, forums, and documentation. Additionally,
some pre-built versions of TensorFlow may be available for PYNQ, which can simplify the installation
process.
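Step 4 (verifying the installation) can be scripted so that it degrades gracefully when TensorFlow is absent; everything here is standard library except the version print:

```python
import importlib.util

def tensorflow_available() -> bool:
    """Return True if TensorFlow can be imported in this environment."""
    return importlib.util.find_spec("tensorflow") is not None

if tensorflow_available():
    import tensorflow as tf
    print("TensorFlow", tf.__version__, "is installed")
else:
    print("TensorFlow is not installed; finish the build steps above first")
```

Running this on the board after the build confirms both that the package is on the Python path and that its native libraries load, which is where most from-source installs fail.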
Machine Learning with PYNQ
• PYNQ provides a powerful platform for machine learning, combining the flexibility and
ease-of-use of Python with the high-performance and low-latency capabilities of FPGA-based
acceleration.
• By leveraging the FPGA fabric and pre-built Python libraries for machine learning, developers
can create highly efficient and customizable machine learning systems for a wide range of
applications.
Here are some ways that PYNQ can be used for machine learning:

• Accelerating machine learning algorithms: PYNQ can be used to accelerate machine learning
algorithms, such as neural networks and decision trees, by offloading the computation to
the FPGA fabric. This can result in significantly faster performance and lower power
consumption compared to running the algorithms on a CPU or GPU.

• Real-time machine learning: PYNQ can be used to perform real-time machine learning tasks,
such as object detection and speech recognition, with low latency and high accuracy. By
leveraging the FPGA fabric, PYNQ can perform the necessary computations in real-time,
making it ideal for applications that require fast response times.
• Customizable machine learning systems: PYNQ provides a flexible platform for building and customizing
machine learning systems. By using Python, developers can easily modify and extend the pre-built libraries
to meet their specific needs, and by using the FPGA fabric, they can accelerate the computations and
optimize the performance of their models.

• High-performance machine learning training: PYNQ can be used to train machine learning models using
large datasets and complex architectures. By leveraging the FPGA fabric, PYNQ can speed up the training
process, enabling developers to train more complex models in less time.

Overall, PYNQ provides a powerful platform for machine learning, with the flexibility and ease-of-use of
Python and the high-performance and low-latency capabilities of FPGA-based acceleration. By combining
these features, developers can create highly efficient and customizable machine learning systems for a wide
range of applications.
• PYNQ supports a range of features that make it well-suited for machine learning applications. Here are some of
the key qualities of PYNQ that support machine learning:
1. FPGA-based acceleration: PYNQ leverages the power of FPGA-based acceleration to accelerate machine
learning computations. This allows developers to achieve faster performance and lower power consumption
compared to running the computations on a CPU or GPU.
2. Low-latency and high-throughput: The FPGA fabric in PYNQ provides low-latency and high-throughput
capabilities, making it ideal for real-time machine learning applications, such as object detection and speech
recognition.
3. Python programming environment: PYNQ provides a Python programming environment, which is a popular
language for machine learning. This allows developers to easily use pre-built machine learning libraries, such as
TensorFlow and PyTorch, and to create custom models using familiar Python programming constructs.
4. High-level APIs: PYNQ provides high-level APIs for machine learning, which simplify the process of building and
training machine learning models. This makes it easier for developers to get started with machine learning and
to build sophisticated models without needing a deep understanding of the underlying algorithms.
5. Customizable hardware: PYNQ allows developers to customize the hardware architecture to match the specific
needs of their machine learning applications. This can include customizing the FPGA fabric to optimize
performance, or adding custom hardware accelerators to perform specialized computations.
Machine Learning on Xilinx FPGAs

• Machine learning on Xilinx FPGAs has become increasingly popular in recent years due to
their low-latency and high-throughput capabilities, which are essential for many real-time
applications.
• Xilinx has developed a range of tools and platforms that make it easier for developers to
implement machine learning on FPGAs, including the following:
1.Vitis AI: Vitis AI is a comprehensive development platform for machine learning on Xilinx
FPGAs. It includes a set of pre-built libraries and APIs for machine learning, as well as
tools for building, training, and deploying machine learning models on Xilinx FPGAs.
2.PYNQ: PYNQ is an open-source project that provides a Python programming environment
for Xilinx FPGAs. It supports popular machine learning libraries, such as TensorFlow and
PyTorch, and provides tools for building custom hardware accelerators.

• Overall, the FPGA-based acceleration, low-latency and high-throughput capabilities,
Python programming environment, high-level APIs, and customizable hardware make
PYNQ well-suited for a wide range of machine learning applications.
Neural Network Implementation on PYNQ
How Neural Networks Work.
A simple neural network includes an input layer, an output (or
target) layer and, in between, a hidden layer. The layers are
connected via nodes, and these connections form a “network” –
the neural network – of interconnected nodes. A node is
patterned after a neuron in a human brain.
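The input–hidden–output structure described above can be sketched as a minimal NumPy forward pass. The layer sizes, random weights, and sigmoid activation below are illustrative choices, not taken from the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# A tiny network: 3 inputs -> 4 hidden nodes -> 2 outputs.
W1 = rng.standard_normal((3, 4))   # input-to-hidden connections
b1 = np.zeros(4)
W2 = rng.standard_normal((4, 2))   # hidden-to-output connections
b2 = np.zeros(2)

def forward(x):
    hidden = sigmoid(x @ W1 + b1)       # hidden-layer node activations
    return sigmoid(hidden @ W2 + b2)    # output-layer activations

y = forward(np.array([0.5, -1.0, 2.0]))
print(y.shape)  # (2,)
```

Each weight in `W1` and `W2` plays the role of one connection ("synapse") between nodes; it is exactly these multiplies and adds that an FPGA implementation maps onto hardware.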

How do you describe a neural network?

A neural network is a method in artificial intelligence that
teaches computers to process data in a way that is inspired by the
human brain. It is a type of machine learning process, called
deep learning, that uses interconnected nodes or neurons in a
layered structure that resembles the human brain.
How a neural network gets converted to a circuit

• A neural network can be converted into a circuit through a process called
"neuromorphic computing". Neuromorphic computing is a type of computing
that uses the principles of neuroscience to develop computing architectures
and systems.
• One way to implement a neural network as a circuit is to use a hardware
description language (HDL) such as Verilog or VHDL to describe the circuit.
HDLs allow designers to specify the behavior of digital circuits at a low level of
abstraction.
• The circuit can then be implemented on an FPGA (Field Programmable Gate
Array) or an ASIC (Application-Specific Integrated Circuit). An FPGA is a
programmable chip that can be configured to implement any digital circuit,
while an ASIC is a chip that is designed to implement a specific function.
The process of converting a neural network into a circuit involves several steps:

1. Mapping the neural network onto the hardware : This step involves mapping the neurons and synapses of the neural
network onto the hardware resources of the FPGA or ASIC. This involves assigning hardware resources such as logic
gates, flip-flops, and memory cells to each neuron and synapse.

2. Quantizing the weights and activations : Neural networks typically use floating-point arithmetic to represent the weights
and activations. However, hardware implementations typically use fixed-point arithmetic to reduce the size and complexity
of the circuit. This step involves quantizing the weights and activations to fixed-point numbers.

3. Optimizing the circuit : The circuit can be optimized to reduce its size and improve its performance. This involves
techniques such as pruning, quantization, and compression.

4. Generating the HDL code : Once the circuit has been mapped, quantized, and optimized, the HDL code can be generated
automatically using tools such as high-level synthesis (HLS) or register transfer level (RTL) synthesis.

5. Simulating and verifying the circuit : The circuit can be simulated using a hardware simulator to ensure that it behaves
correctly. This step involves verifying that the circuit produces the correct outputs for a given set of inputs.

6. Implementing the circuit : Finally, the HDL code can be compiled and loaded onto an FPGA or ASIC to implement the
neural network circuit in hardware.
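Step 2's fixed-point quantization can be illustrated in NumPy. The sketch below converts floating-point weights to signed 8-bit fixed-point values with 6 fractional bits; the bit widths are illustrative assumptions, not a prescribed format:

```python
import numpy as np

FRAC_BITS = 6             # fractional bits (illustrative choice)
SCALE = 1 << FRAC_BITS    # 2**6 = 64

def quantize(w):
    """Convert float weights to signed 8-bit fixed-point integers."""
    q = np.round(w * SCALE).astype(np.int64)
    return np.clip(q, -128, 127).astype(np.int8)  # saturate to int8 range

def dequantize(q):
    """Recover approximate float values from the fixed-point integers."""
    return q.astype(np.float64) / SCALE

weights = np.array([0.75, -0.5, 0.123, -1.9])
q = quantize(weights)      # values as they might be stored in FPGA block RAM
print(q)
print(dequantize(q))       # close to, but not exactly, the original weights
```

The small round-trip error visible in `dequantize(q)` is the accuracy cost that the circuit pays in exchange for much smaller multipliers and memories.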
Implementing a neural network on an FPGA (Field Programmable Gate Array) has become an increasingly popular
approach due to the high computational performance and low power consumption of FPGAs. Here are the general
steps to implement a neural network on an FPGA:

1.Select an FPGA and a neural network model: The first step is to select an FPGA that is suitable for your application
and a neural network model that you want to implement. This involves considering the size and complexity of the
neural network, the performance requirements, and the available resources on the FPGA.

2.Map the neural network to the FPGA: The next step is to map the neural network onto the hardware resources of
the FPGA. This involves assigning hardware resources such as logic gates, memory cells, and arithmetic units to
each neuron and synapse of the neural network.

3.Quantize the weights and activations: Most neural network models use floating-point arithmetic to represent the
weights and activations. However, FPGAs typically use fixed-point arithmetic to reduce the size and complexity of
the circuit. Therefore, the weights and activations must be quantized to fixed-point numbers.

4.Design the hardware circuit: After mapping the neural network onto the FPGA and quantizing the weights and
activations, the hardware circuit must be designed. This involves designing the logic circuits, memory circuits,
and arithmetic units that make up the neural network, typically using a hardware description language (HDL)
such as Verilog or VHDL.

5. Optimize the hardware circuit: The hardware circuit can be optimized to reduce its size and improve its
performance. This involves techniques such as pruning, quantization, and compression.

6. Simulate and verify the hardware circuit: The hardware circuit can be simulated using a hardware simulator to
ensure that it behaves correctly. This step involves verifying that the circuit produces the correct outputs for a
given set of inputs.

7. Implement the hardware circuit on the FPGA: Finally, the HDL code for the hardware circuit can be compiled
and loaded onto the FPGA to implement the neural network in hardware.

Once the neural network has been implemented on the FPGA, it can be used for a wide range of applications,
such as image processing, speech recognition, and autonomous driving.
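The pruning mentioned in the optimization step can be sketched as magnitude pruning in NumPy: weights whose absolute value falls below a threshold are zeroed, so the corresponding synapses need no hardware at all. The threshold and weight values are illustrative assumptions:

```python
import numpy as np

def magnitude_prune(weights, threshold=0.1):
    """Zero out small-magnitude weights; return the pruned copy and its sparsity."""
    pruned = np.where(np.abs(weights) < threshold, 0.0, weights)
    sparsity = np.mean(pruned == 0.0)   # fraction of synapses removed
    return pruned, sparsity

w = np.array([[0.02, -0.5, 0.07],
              [0.9, -0.03, 0.4]])
pruned, sparsity = magnitude_prune(w)
print(pruned)
print(f"sparsity: {sparsity:.2f}")   # half of the synapses are removed here
```

On an FPGA, every pruned weight is a multiplier and a wire that never has to be built, which is why pruning directly shrinks circuit area.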
PYNQ (Python Productivity for Zynq) is an open-source framework that allows users to program Xilinx Zynq devices using
Python. Xilinx Zynq devices are System-on-Chip (SoC) devices that combine an FPGA with a processor. Implementing a
neural network on PYNQ involves the following steps:

1. Create a neural network model: First, you need to create a neural network model using a deep learning framework such
as Keras or PyTorch. You can create and train the neural network on a host machine using Python.

2. Deploy the neural network on the PYNQ device: Once the neural network has been created and trained, you can deploy
it on the PYNQ device. This involves transferring the model weights and architecture to the PYNQ device and using the
PYNQ framework to run the model.

3. Use the neural network on the PYNQ device: After deploying the neural network on the PYNQ device, you can use it for
inference. This involves providing input data to the neural network and getting the output predictions. You can use the
PYNQ framework to interface with the hardware and input/output devices.

To deploy a neural network on PYNQ, you can use a PYNQ-supported deep learning framework such as TensorFlow or
PyTorch. These frameworks have specific implementations that allow them to run on the PYNQ device.

For example, to run a neural network using TensorFlow on PYNQ, you can use the TensorFlow library in Python to define
and train your model. Then, you can use the TensorFlow Lite library to optimize the model for inference on an ARM-
based processor. Finally, you can use the PYNQ framework to deploy the optimized model on the PYNQ device and
perform inference.
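What an int8-optimized model (the kind TensorFlow Lite-style quantization produces) actually computes on the device can be mimicked in NumPy: the matrix multiply runs entirely on 8-bit integers with a wide accumulator, and only a final rescale returns to floating point. The scales and tensor values below are illustrative assumptions:

```python
import numpy as np

# Illustrative per-tensor quantization scales chosen for this sketch.
x_scale, w_scale = 0.05, 0.02

x_q = np.array([10, -4, 25], dtype=np.int8)                  # quantized input
W_q = np.array([[3, -7], [12, 5], [-1, 9]], dtype=np.int8)   # quantized weights

# Integer matmul with a 32-bit accumulator, as fixed-point hardware would use.
acc = x_q.astype(np.int32) @ W_q.astype(np.int32)

# Rescale the integer accumulator back to real-valued outputs.
y = acc.astype(np.float64) * (x_scale * w_scale)
print(acc, y)
```

All of the heavy arithmetic happens in the integer domain, which is what lets an ARM core or FPGA fabric run the model with small, cheap multipliers.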
Creating a Custom PYNQ Overlay on Xilinx Vivado
• In PYNQ, an overlay is a hardware design that is loaded onto the FPGA fabric of the Zynq SoC. Overlays can be used to
add new functionality to the PYNQ board that is not provided by the default hardware configuration. Here are some key
points about overlays in PYNQ:
• Overlays are FPGA designs: An overlay in PYNQ is essentially an FPGA design that provides additional functionality to the
PYNQ board. The overlay can be designed using hardware description languages such as Verilog or VHDL, or using high-
level synthesis tools such as Vivado HLS.
• Overlays are loaded onto the FPGA: To use an overlay in PYNQ, you need to load the overlay onto the FPGA fabric of the
Zynq SoC. This is done using the PYNQ Overlay class, which provides an API for loading and accessing the overlay.
• Overlays can be designed for specific applications: Overlays can be designed to provide custom hardware functionality
for specific applications. For example, an overlay could be designed to accelerate image processing algorithms or
perform real-time signal processing.
• Overlays can be developed using the PYNQ framework: The PYNQ framework provides a set of tools and libraries for
developing overlays. This includes the PYNQ Overlay class, which provides an API for accessing the overlay, and the
PYNQ Overlay Generator, which provides a graphical user interface for designing overlays.
• Overlays can be shared and reused: Overlays can be shared and reused within the PYNQ community. This makes it easy
for developers to build on top of existing designs and create new applications using the PYNQ platform.
• Overall, overlays provide a powerful way to add custom hardware functionality to the PYNQ platform. With overlays,
developers can create custom hardware designs that are tailored to specific applications and easily share their designs
with the PYNQ community.
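Loading an overlay with the PYNQ Overlay class looks like the sketch below. It assumes you are running on a PYNQ board whose image ships the stock `base.bit` overlay, so the `pynq` import is kept inside the function rather than at module level:

```python
def load_and_inspect(bitfile="base.bit"):
    """Load a bitstream onto the Zynq FPGA fabric and list its IP cores.

    Runs only on a PYNQ board; `base.bit` is the base overlay shipped
    with standard PYNQ board images.
    """
    from pynq import Overlay  # available on PYNQ board images

    overlay = Overlay(bitfile)    # downloads the bitstream to the FPGA
    # ip_dict maps the IP core names in the design to their driver info.
    for name in overlay.ip_dict:
        print(name)
    return overlay
```

On a board, `overlay = load_and_inspect()` would print the IP blocks in the base design, and attributes of the returned object give Python access to each core.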
Custom Overlays

• A custom overlay in PYNQ is an FPGA design that has been created by a user to provide
specific hardware functionality that is not available in the default hardware configuration of
the PYNQ board.
• A custom overlay is designed using hardware description languages such as Verilog or VHDL,
or using high-level synthesis tools such as Vivado HLS. The design is then compiled and
synthesized into a bitstream that can be loaded onto the FPGA fabric of the Zynq SoC using
the PYNQ Overlay class.
• Custom overlays can be used to accelerate specific algorithms, perform real-time signal
processing, or add custom hardware interfaces to the PYNQ board. For example, a custom
overlay could be designed to perform high-speed image processing or implement a custom
communication protocol.
• Custom overlays provide a powerful way for users to extend the functionality of the PYNQ
platform to meet their specific hardware requirements. With custom overlays, users can
create hardware designs that are tailored to their applications and easily share their designs
with the PYNQ community.
In Xilinx Vivado, an overlay is a hardware design that is implemented on the programmable logic fabric of a Xilinx FPGA
device. An overlay can be created using Vivado IP Integrator, which is a graphical tool that allows users to create
hardware designs using pre-built IP cores.
Here are some key points about overlays in Vivado:
1.Overlays are implemented on the FPGA: An overlay in Vivado is a hardware design that is implemented on the
programmable logic fabric of a Xilinx FPGA device. The overlay can be designed using hardware description languages
such as Verilog or VHDL, or using Vivado IP Integrator.
2.Overlays can be used to add functionality: Overlays can be used to add new functionality to an FPGA-based system.
For example, an overlay can be designed to perform image processing, signal processing, or implement a custom
communication protocol.
3.Overlays can be created using Vivado IP Integrator: Vivado IP Integrator is a graphical tool that allows users to create
hardware designs using pre-built IP cores. Using IP Integrator, users can create overlays by adding IP cores to the design
and connecting them together.
4.Overlays can be packaged as IP: Once an overlay is created, it can be packaged as IP and reused in other designs. This
makes it easy to share and reuse overlays across different projects.
5.Overlays can be customized: Overlays can be customized to meet specific hardware requirements. Users can modify
the design by adding or removing IP cores, changing the configuration of IP cores, or modifying the interconnect
between IP cores.
Overall, overlays in Vivado provide a powerful way to add custom hardware functionality to FPGA-based systems. With
overlays, users can create custom hardware designs that are tailored to their specific applications and easily share their
designs with others.
THANK YOU
