FPGA BASED MODEL TO IMPLEMENT HANDWRITTEN DIGIT RECOGNITION
Submitted by
JAYASHREE N 111721104052
MADHUMITHA G 111721104074
MAHITHA M V 111721104075
BONAFIDE CERTIFICATE
Certified that this project report titled "FPGA BASED MODEL TO IMPLEMENT HANDWRITTEN DIGIT RECOGNITION" is the bonafide work of JAYASHREE N (111721104052), MADHUMITHA G (111721104074), and MAHITHA M V (111721104075), who carried out the work under my supervision.
SIGNATURE SIGNATURE
Kavaraipettai-601206. Kavaraipettai-601206.
ACKNOWLEDGEMENT
We would like to express our heartfelt thanks to the Almighty and our beloved parents for their blessings and wishes, which helped us complete this project successfully.
We would like to express our sincere and heartfelt gratitude to our Principal
Dr. K. A. Mohamed Junaid M.E., Ph.D., for fostering an excellent climate
to excel.
We are extremely thankful to Dr. T. Suresh M.E., Ph.D., Professor and Head,
Department of Electronics and Communication Engineering, for having
permitted us to carry out this project effectively.
We convey our sincere thanks to our mentor and skillful, efficient supervisor, Mrs. Pavaiyarkarasi R, M.E., (Ph.D.), Associate Professor, for her extremely valuable guidance throughout the course of this project.
We are grateful to our Project Coordinators and all the department staff
members for their intense support.
ABSTRACT
TABLE OF CONTENTS
ACKNOWLEDGEMENT iii
ABSTRACT iv
LIST OF ABBREVIATIONS ix
1 INTRODUCTION 1
1.1 OVERVIEW 1
1.2 OBJECTIVES 3
2 EXISTING SYSTEM 4
2.2 DISADVANTAGES 5
3 LITERATURE REVIEW 6
3.1 FPGA-BASED CONVOLUTIONAL NEURAL NETWORK FOR IMAGE PROCESSING AND RECOGNITION 6
3.4 EFFICIENT FPGA IMPLEMENTATION OF DEEP NEURAL NETWORK FOR REAL-TIME IMAGE CLASSIFICATION 9
4 PROPOSED SYSTEM 11
4.2 ADVANTAGES 12
4.8 WORKING OVERVIEW 30
4.8.1 Working Description
4.8.2 Working Overview
6.1 CONCLUSION 36
REFERENCES 38
LIST OF FIGURES
LIST OF ABBREVIATIONS
ABBREVIATION EXPANSION
AI Artificial Intelligence
FIFO First-In-First-Out
IP Internet Protocol
CHAPTER 1
INTRODUCTION
1.1 OVERVIEW
The convolutional layers will extract features from the input images by applying
filters, allowing the network to focus on essential characteristics such as edges and
shapes. Pooling layers will reduce the dimensionality of the feature maps, retaining
the most critical information while minimizing computational load. Finally, fully
connected layers will perform classification based on the extracted features,
producing the final output. The implementation of CNNs on FPGAs not only
promises higher processing speed but also improved energy efficiency. Hardware
implementations can consume significantly less power than their software
counterparts, making them suitable for embedded systems where power consumption
is a critical concern. This aspect is particularly relevant for applications like mobile
devices and IoT solutions.
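For reference, a minimal software sketch of such a network is shown below, written with the TensorFlow/Keras tools described later in this report; the filter counts and layer sizes are illustrative assumptions rather than the final configuration deployed on the FPGA.

# Minimal software reference model: convolution -> pooling -> fully connected.
# The filter counts and layer sizes are illustrative assumptions.
import tensorflow as tf

def build_reference_cnn(num_classes: int = 10) -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),                  # MNIST-sized grayscale input
        tf.keras.layers.Conv2D(8, 3, activation="relu"),    # extract edges and shapes
        tf.keras.layers.MaxPooling2D(2),                    # shrink the feature maps
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),  # classification
    ])
    return model

model = build_reference_cnn()
model.summary()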
The project will also include a thorough evaluation of the FPGA implementation
against traditional CPU and GPU approaches. Key performance metrics will include
processing time, recognition accuracy, and resource utilization. We anticipate that the
hardware implementation will demonstrate superior speed and efficiency, validating
the benefits of using FPGAs for real-time character recognition tasks. To further
enhance the robustness of the system, we will employ data augmentation techniques
during the training phase. This will involve creating variations of the training dataset,
allowing the neural network to generalize better to unseen handwriting styles and
improving its accuracy in diverse environments. The ability to adapt to various input
scenarios will be crucial for deploying the system in real-world applications.
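A brief sketch of this augmentation step is shown below, assuming the Keras preprocessing layers; the rotation, shift, and zoom factors are placeholder values chosen only for illustration.

# Illustrative augmentation pipeline; the rotation, shift, and zoom factors
# are placeholder values, not tuned settings.
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomRotation(0.05),          # small random rotations
    tf.keras.layers.RandomTranslation(0.1, 0.1),   # shifts along both axes
    tf.keras.layers.RandomZoom(0.1),               # mild scale changes
])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0   # add channel dim, normalize
augmented_batch = augment(x_train[:32], training=True)    # new variations of one batch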
applications in numerous sectors. As technology continues to advance, hardware-based solutions like this will be critical in meeting the growing demands of real-time data processing.
1.2 OBJECTIVES
The main objective of this project is to design and implement an efficient, real-
time alphanumeric character detection system using a neural network, optimized for
FPGA hardware. The goal is to achieve high accuracy in character recognition while
maximizing processing speed and minimizing resource utilization on the FPGA. The
neural network is trained on the MNIST dataset and implemented on an FPGA using a hardware-software co-design approach. The system leverages the FPGA's parallel processing capabilities to achieve real-time performance, recognizing handwritten digits at a rate of [insert rate] digits per second.
CHAPTER 2
EXISTING SYSTEM
2.2 DISADVANTAGES
CHAPTER 3
LITERATURE REVIEW
The FPGA’s finite resources present a constraint, particularly when
accommodating models with high parameter counts or complex architectures that
demand extensive computational power. Another challenge is the restricted on-chip
memory of FPGAs, which limits the maximum input image size and model
complexity that can be processed efficiently, making it difficult to work with high-
resolution images or extensive feature maps within a single processing cycle.
Consequently, while the FPGA-based CNN implementation offers promising
performance for specific applications, its applicability is hindered by these
constraints, necessitating future advancements in FPGA technology and architecture
for broader applicability in diverse and more complex neural network models.
This paper by K. S. Patil, N. Gupta, and M. I. Rashid delves into the hardware
acceleration of neural networks specifically tailored for character recognition tasks
using FPGA technology. The proposed system leverages an FPGA to execute a
Convolutional Neural Network (CNN) designed to recognize alphanumeric
characters in real-time from image data, a setup that proves advantageous for edge
applications where computational resources and power are limited. Through
comparative analysis, the authors demonstrate that their FPGA-based system
outperforms traditional GPU and CPU counterparts in both speed and power
efficiency, showcasing its potential for embedded and portable devices where low
power consumption is essential. The design’s compact footprint and energy-saving
features make it a viable option for battery-operated and mobile platforms, further
extending its suitability for real-time applications that demand responsiveness
without a power trade-off.
However, the authors note that the FPGA-based implementation has some
limitations. A primary drawback is its reliance on pre-trained models, as the FPGA’s
architecture lacks the flexibility for on-device training, limiting its adaptability in
dynamic or continuously learning environments. Additionally, the system struggles
to handle large-scale datasets due to the inherent memory and resource constraints
of FPGAs, which restrict its generalizability for more complex, real-world scenarios
that involve vast and diverse data sources. The FPGA’s limited logic blocks and DSP
(Digital Signal Processing) units also cap the depth of the neural networks that can
be deployed on the device, constraining the model’s complexity and potentially
limiting its accuracy and robustness for more challenging recognition tasks.
However, Najafi, Beigi, and Wong highlight certain drawbacks to this
approach. One major limitation is the need for extensive off-chip training, as the
model requires training on a separate, high-performance platform before
deployment, adding to the complexity and time required for system setup.
Additionally, the system is primarily tailored for detecting printed alphanumeric
characters and struggles with cursive or stylized fonts, which restricts its
applicability in broader OCR contexts, such as handwritten or decorative text
recognition. The authors also note that the high cost of FPGA development tools and
the specialized expertise required for designing FPGA-based systems present
barriers for smaller research teams and companies, potentially limiting access to this
technology.
The FPGA’s parallel processing capabilities are fully utilized in this design,
allowing for rapid data throughput and making it effective for real-time tasks in
embedded and edge computing contexts. Verma, Balaji, and Kaushik also discuss
several limitations associated with this approach. The use of fixed-point arithmetic,
while resource-efficient, introduces a trade-off between precision and resource
usage, which can impact accuracy, particularly in more complex recognition
scenarios. Furthermore, deep neural networks, even when optimized for FPGA
constraints, demand substantial logic and memory resources, restricting the number
of networks that can run simultaneously on a single FPGA device. Another challenge
is that hardware-based implementations lack flexibility; once deployed, modifying
or fine-tuning the DNN architecture becomes challenging, making iterative
improvements or adjustments less feasible.
CHAPTER 4
PROPOSED SYSTEM
The proposed system will utilize a CNN architecture, which is well-suited for
image classification tasks. The network will consist of multiple convolutional layers,
pooling layers, and fully connected layers, designed to efficiently extract and classify
features from input images. The neural network will be trained on the MNIST
dataset, a benchmark dataset containing 70,000 images of handwritten digits (0-9).
The CNN will be optimized to reduce the complexity of the model while maintaining
high accuracy. FPGAs offer significant advantages for neural network
implementations, including parallel processing and low power consumption. The
neural network will be implemented on an FPGA using a hardware-software co-
design approach. The hardware component will handle computationally intensive
tasks such as convolution and matrix multiplication, while the software will manage
data flow and control.
4.2 ADVANTAGES
4.3 HARDWARE REQUIREMENTS
• Vivado
• Convolutional Neural Network
The system's performance will be evaluated across several key criteria to ensure
its suitability for real-time, low-power applications. Recognition accuracy will be
assessed using the MNIST test dataset, ensuring that the system achieves a high
level of precision in digit recognition. Latency will be measured to determine the
time required for each recognition task, with a target of achieving real-time
performance to facilitate seamless, responsive operations. Additionally, the
system’s throughput will be evaluated by measuring its capacity to recognize
digits at a specific rate, aiming to reach [insert rate] digits per second.
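The sketch below shows how these metrics could be gathered for the software reference model on a host machine; the saved-model path and the number of timed inferences are assumptions, and the corresponding FPGA figures would be measured on the board itself.

# Host-side evaluation sketch: accuracy on the MNIST test set, plus average
# latency and throughput of single-image inference. The model path and the
# number of timed runs are assumptions.
import time
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("mnist_cnn.h5")   # trained model (assumed path)
_, (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_test = x_test[..., None].astype("float32") / 255.0

pred = np.argmax(model.predict(x_test, verbose=0), axis=1)
accuracy = float(np.mean(pred == y_test))

n = 1000
start = time.perf_counter()
for i in range(n):
    model.predict(x_test[i:i + 1], verbose=0)        # one digit at a time
elapsed = time.perf_counter() - start
print(f"accuracy={accuracy:.4f}  latency={elapsed / n * 1e3:.2f} ms  "
      f"throughput={n / elapsed:.1f} digits/s")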
In summary, the proposed system will integrate a CNN optimized for image
classification tasks with an FPGA-based hardware platform, combining the
strengths of deep learning with the efficiency and performance of reconfigurable
hardware. By utilizing hardware-software co-design, quantization, and
parallelism, the system will achieve high-speed, low-power performance while
maintaining a high level of accuracy on the MNIST dataset. The flexibility of
FPGA implementation also makes the system adaptable for other neural network
architectures and applications, providing a robust platform for real-time AI
processing.
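As an illustration of the quantization step mentioned above, the sketch below maps floating-point weights to an 8-bit signed fixed-point (Q1.7) representation; the chosen word length and fraction width are assumptions, not the exact format used in the hardware design.

# Post-training quantization sketch: map float weights to 8-bit signed fixed
# point (Q1.7). The 8-bit width and 7 fractional bits are assumptions.
import numpy as np

def to_fixed_point(weights, frac_bits=7, total_bits=8):
    scale = 2 ** frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(weights * scale), qmin, qmax).astype(np.int8)
    return q, scale

def from_fixed_point(q, scale):
    return q.astype(np.float32) / scale

w = np.random.uniform(-1, 1, size=(3, 3)).astype(np.float32)   # example kernel
q, scale = to_fixed_point(w)
print("max quantization error:", np.max(np.abs(w - from_fixed_point(q, scale))))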
4.5 BLOCK DIAGRAM
The Fig 4.1 represents the block diagram, which illustrates a high-level flow of
deploying a machine learning model, likely a neural network, on an FPGA platform
for efficient processing. The process starts with the Data Interface, where input data,
such as images or signals, is received through communication protocols like AXI
(Advanced eXtensible Interface) or GPIO (General-Purpose Input/Output). This data
is then managed by a DMA (Direct Memory Access) Controller, which efficiently
transfers data between memory and the processing blocks without relying on the CPU,
optimizing throughput. The Image Processing stage performs initial pre-processing
on the input data—such as resizing, normalization, or grayscale conversion—
preparing it for feature extraction. Following this, the Feature Extraction block
identifies crucial characteristics in the data, using methods like edge detection or
segmentation, to highlight the relevant information needed for classification.
After feature extraction, the data may undergo Fixed Point/Floating Point
Conversion to ensure compatibility with the FPGA’s hardware requirements; fixed-
point representation is often preferred for resource and power efficiency in embedded
systems. The Activation Functions block then introduces non-linearity into the model
by applying functions like Sigmoid or ReLU (Rectified Linear Unit), often
implemented using lookup tables (LUTs) to reduce computation time. The Neuron
Design block encompasses the core neural network operations, where weighted sums,
biases, and activations simulate neuron behaviour. These operations are tailored to
FPGA hardware to ensure optimal performance. The Classification stage, often
implemented with methods like HARDMAX, determines the final output class of the
input data based on the processed features.
Fig 4.1 Block Diagram
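The sketch below models, in software, how the lookup-table activation and HARDMAX stages of the diagram behave before they are mapped to hardware; the 256-entry table and the [-8, 8) input range are assumptions made for illustration.

# Software model of a LUT-based sigmoid and a HARDMAX stage. The 256-entry
# table and the [-8, 8) input range are assumptions.
import numpy as np

LUT_SIZE = 256
X_MIN, X_MAX = -8.0, 8.0
lut_inputs = np.linspace(X_MIN, X_MAX, LUT_SIZE, endpoint=False)
SIGMOID_LUT = 1.0 / (1.0 + np.exp(-lut_inputs))        # precomputed once

def sigmoid_lut(x):
    idx = ((x - X_MIN) / (X_MAX - X_MIN) * LUT_SIZE).astype(int)
    idx = np.clip(idx, 0, LUT_SIZE - 1)
    return SIGMOID_LUT[idx]                             # one table read per value

def hardmax(scores):
    return int(np.argmax(scores))                       # winner-take-all class index

scores = sigmoid_lut(np.array([-2.0, 0.5, 3.1, 0.0]))
print("predicted class:", hardmax(scores))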
4.6 HARDWARE DESCRIPTION
The ARM Cortex-A9 cores serve as the processing system (PS), handling
general-purpose computation, operating system management, and communication
with external interfaces, such as Ethernet and USB. Simultaneously, the FPGA fabric
(programmable logic, PL) is used to offload compute-intensive tasks such as parallel
data processing, custom peripheral implementation, or hardware acceleration of
specific algorithms. This hardware-software co-design capability makes the
ZedBoard highly versatile for embedded systems, allowing users to optimize
performance for specific tasks while maintaining the flexibility of software
programmability.
The ZedBoard comes with 512 MB of DDR3 memory for running large applications and multitasking, along with 256 Mb of Quad-SPI Flash for storing
programs and data. The HDMI output port enables multimedia applications by
providing a high-resolution display interface, making the ZedBoard suitable for
video processing projects or graphical user interfaces. Additionally, the board
supports Gigabit Ethernet, which is essential for network-based applications like IoT
devices, remote monitoring, or streaming data over networks. It also features a USB
2.0 OTG port for connecting peripherals and a USB-UART for serial
communication, which is useful for debugging or interfacing with external systems.
One of the standout features of the ZedBoard is its rich set of expansion options. It
includes an FMC (FPGA Mezzanine Card) connector for high-speed I/O expansion
and five Pmod connectors for interfacing with Digilent’s Pmod peripheral modules,
such as sensors, actuators, or communication modules.
With its 85,000 logic cells, 53,200 LUTs, and 220 DSP slices, the ZedBoard
provides ample resources for implementing sophisticated digital designs. The
inclusion of 560 KB of Block RAM (BRAM) ensures that intermediate data can be
stored and accessed quickly during real-time operations. Additionally, the SD card
slot enables storage expansion or booting operating systems directly from the card,
further enhancing the board’s flexibility.
In summary, the Digilent ZedBoard is a comprehensive and powerful
development platform, ideal for both education and professional-level projects in
embedded systems, hardware-software co-design, and FPGA-based applications. Its
combination of high-performance processing, programmable logic, and extensive
I/O and expansion options make it a valuable tool for engineers, researchers, and
hobbyists alike.
The programmable logic (PL) section, derived from FPGA technology, enables
developers to implement custom hardware for specific tasks, such as signal
processing, image processing, encryption, and hardware acceleration. With up to
85,000 logic cells, 220 DSP slices, and extensive I/O capabilities, the Zynq-7000 PL
can perform tasks that would otherwise require external co-processors or specialized
ASICs. This allows users to offload computationally intensive functions to the PL,
while the PS manages the overall control flow, resulting in faster, low-latency
processing and improved system efficiency.
One of the key advantages of the Zynq-7000 is its seamless integration between
the PS and PL through high-bandwidth AXI interfaces. This enables fast data
exchange between the ARM processors and the programmable logic, allowing for
tight coupling of hardware and software. The architecture is highly scalable,
allowing users to balance workloads between the PS and PL depending on the
application requirements. The Zynq-7000 is widely used in industries such as
telecommunications, automotive, industrial automation, medical devices, and
aerospace.
Its ability to run real-time operating systems, coupled with the flexibility of
FPGA reconfiguration, makes it suitable for applications that require adaptable
hardware and high processing power. In summary, the Xilinx Zynq-7000 All
Programmable SoC offers a powerful combination of processor and FPGA fabric on
a single chip, enabling efficient hardware-software co-design and accelerating
compute-intensive applications.
Fig 4.2 ZedBoard Zynq-7000
4.6.1.3 Specifications
4.7.1 Vivado
Vivado is a comprehensive design suite from Xilinx (now part of AMD) that provides
an integrated environment for the development, simulation, synthesis, and
implementation of FPGA (Field Programmable Gate Array) and SoC (System on
Chip) designs. Built to handle complex digital designs, Vivado is known for its high-
performance capabilities, offering designers advanced tools for creating, debugging,
and optimizing FPGA and SoC-based projects. Its flexibility and wide range of
features make it ideal for hardware developers and digital designers who need precise
control and efficiency in the design process.
One of the core features of Vivado is its use of High-Level Synthesis (HLS),
which allows designers to write algorithms in higher-level languages like C or C++
and convert them directly into HDL (Hardware Description Language) code. This
feature significantly simplifies and accelerates the design process by allowing
developers to work at a more abstract level, thus reducing the manual effort involved
in coding complex algorithms in Verilog or VHDL. The HLS also enables a more
straightforward transition for software engineers working on FPGA projects, as they
can work in familiar programming languages.
Vivado provides a powerful synthesis engine that helps optimize designs for
area, speed, and power. Its synthesis process includes logic optimization and
technology mapping, which converts the HDL code into gate-level netlists specific
to the target FPGA. This stage is crucial for achieving high performance and efficient
utilization of FPGA resources. Vivado's Place and Route (P&R) tools further refine
the design, physically mapping the logical circuits onto the FPGA’s configurable
logic blocks, which ultimately helps achieve the desired timing and performance
constraints.
4.7.1.2 Description
Vivado provides extensive simulation and verification capabilities at multiple levels of abstraction, from RTL (Register Transfer Level) to gate-level, enabling designers to verify their
logic and identify any errors early in the process. The debugging tools, such as
Vivado Logic Analyzer and Vivado Integrated Debug Environment, offer real-time
analysis and visibility into the internal signals of the FPGA. These tools allow users
to probe and monitor signals during operation, facilitating efficient debugging and
troubleshooting.
4.7.2 Convolutional Neural Network
At the core of CNNs is the convolutional layer, which applies a set of filters
(or kernels) to the input data to extract features. Each filter is a small matrix that
slides over the input image, performing an element-wise multiplication followed by
a summation to produce a single value in the output feature map. This operation
helps the network learn patterns such as edges, textures, and shapes. Multiple filters
can be applied in parallel, enabling the network to learn a variety of features at
different levels of abstraction. Another key component of CNNs is the activation
function, typically the Rectified Linear Unit (ReLU), which introduces non-linearity
into the model. ReLU allows the network to capture complex patterns by
transforming the linear outputs of the convolutional layer into non-linear outputs,
helping the model to learn intricate relationships in the data.
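A direct NumPy model of this sliding-window operation, together with the ReLU activation, is sketched below; it assumes a single-channel input, a stride of one, and no padding.

# NumPy model of the convolution described above: the kernel slides over the
# image and each output value is an element-wise multiply-and-sum.
import numpy as np

def conv2d_valid(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow), dtype=np.float32)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)      # introduces non-linearity

image = np.random.rand(28, 28).astype(np.float32)
edge_kernel = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=np.float32)
feature_map = relu(conv2d_valid(image, edge_kernel))
print(feature_map.shape)           # (26, 26) for a 3x3 kernel, stride 1, no padding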
Stacked convolutional and pooling layers allow the network to build increasingly abstract features as the data passes through successive layers. Lower layers may capture
simple patterns, while higher layers can capture complex structures or semantic
information. This hierarchical learning approach is one of the reasons CNNs have
been so successful in computer vision tasks. The final layers of a CNN typically
include fully connected layers (FC layers) that connect every neuron from the
previous layer to every neuron in the current layer.
Training CNNs involves using a labeled dataset and optimizing the network's
parameters through a process called backpropagation. The network's predictions are
compared against the actual labels using a loss function (such as categorical cross-
entropy for classification tasks). The gradients of the loss with respect to the
network's weights are calculated, and optimization algorithms like Stochastic
Gradient Descent (SGD) or Adam are used to update the weights iteratively,
minimizing the loss.
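A minimal training sketch along these lines is shown below, using sparse categorical cross-entropy and the Adam optimizer on MNIST; the small network, the epoch count, and the batch size are illustrative assumptions.

# Training sketch: cross-entropy loss with the Adam optimizer on MNIST.
# The network, epoch count, and batch size are illustrative, not tuned values.
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0
x_test = x_test[..., None].astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",   # integer labels 0-9
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=128,
          validation_data=(x_test, y_test))
model.save("mnist_cnn.h5")                              # assumed file name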
Through their layered architecture, CNNs learn to extract and represent features
at multiple levels of abstraction, leading to their impressive performance across
diverse applications. The combination of convolutional and pooling layers allows
CNNs to efficiently capture spatial relationships in data, making them a dominant
approach in modern artificial intelligence applications.
4.7.3 TensorFlow
One of the key features of TensorFlow is its ability to perform automatic
differentiation, which facilitates the training of machine learning models through
backpropagation. This feature allows developers to define complex models without
manually computing gradients, significantly reducing the effort required for model
training.
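The short sketch below shows this mechanism with tf.GradientTape on a toy linear model; the variables and loss are chosen only for illustration.

# Automatic differentiation sketch: TensorFlow records operations on a tape
# and computes gradients without any manual derivative code.
import tensorflow as tf

w = tf.Variable(2.0)
b = tf.Variable(0.5)
x, y_true = tf.constant(3.0), tf.constant(10.0)

with tf.GradientTape() as tape:
    y_pred = w * x + b                    # simple linear model
    loss = tf.square(y_true - y_pred)     # squared-error loss

dw, db = tape.gradient(loss, [w, b])      # gradients via backpropagation
print(dw.numpy(), db.numpy())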
TensorFlow also includes TensorBoard, a companion tool that provides interactive visualization of various metrics, including loss, accuracy, and training progress,
allowing users to track their models over time and identify areas for improvement.
The TensorFlow ecosystem is further enriched by a vibrant community and a wealth
of resources, including tutorials, documentation, and pre-trained models available
through the TensorFlow Hub.
4.7.4 NumPy
4.7.5 Visual Studio Code
Visual Studio Code (VS Code) is a free, open-source code editor developed by
Microsoft, widely recognized for its versatility, performance, and extensive feature
set, making it one of the most popular development environments for programmers.
Released in 2015, VS Code is lightweight yet powerful, offering support for a wide
range of programming languages, frameworks, and development workflows.
One of VS Code’s standout features is its extensibility. The editor has a rich
ecosystem of extensions available through its marketplace, allowing users to add
functionality specific to their needs. These extensions range from language support
(e.g., Python, JavaScript, C++, Java) to tools for debugging, linting, testing, and even
integrating with cloud services. Developers can customize their environment to
match their development stack, ensuring that VS Code remains adaptable for various
projects, whether they involve web development, data science, or systems
programming. VS Code offers integrated debugging support, which allows
developers to set breakpoints, inspect variables, and step through code execution
without leaving the editor. This streamlines the development process by providing
real-time feedback and reducing the need for external debugging tools. VS Code also
includes a terminal within the editor, enabling developers to execute commands, run
scripts, or use version control systems like Git directly from the workspace.
Built-in IntelliSense provides context-aware code completion and suggestions, helping developers write code faster and with fewer errors. This feature, combined with code snippets, allows for
rapid development and increases productivity, especially in large projects or complex
codebases.
4.7.6 Block Memory Generator
With the Block Memory Generator (BMG), memory can be configured as true dual-port RAM, allowing
two simultaneous independent accesses, either for reading or writing, from different
addresses, thus providing significant flexibility in memory access patterns.
Alternatively, it can be set as single-port RAM, optimized for high-speed sequential
data access. Additionally, ROM configurations allow read-only data storage, ideal
for constant data that doesn’t change during operation. One of the key advantages of
using the Block Memory Generator is its ability to handle large amounts of memory
without consuming excessive FPGA logic resources. The tool optimizes memory
placement and utilization on the FPGA, ensuring that the design is efficient and
meets timing constraints.
4.8 WORKING
The process begins with Image Acquisition, where handwritten character images
are obtained through cameras or scanning devices. These images are converted to
grayscale or binary format to simplify processing. Preprocessing follows, which is
essential for accurate recognition and involves tasks like noise reduction,
normalization, resizing, and binarization. Morphological operations may also be
applied to enhance the character’s features, with the FPGA enabling real-time
preprocessing due to its capacity for high-throughput data handling. In Feature
Extraction, the critical features of the character are identified; common methods
include edge detection, gradient calculation, or more advanced techniques like
Histogram of Oriented Gradients (HOG).
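A compact sketch of the preprocessing stage described above is given below, using TensorFlow image utilities; the 28x28 target size and the 0.5 binarization threshold are assumptions.

# Preprocessing sketch: grayscale conversion, resizing, normalization, and
# binarization. The 28x28 target size and the 0.5 threshold are assumptions.
import numpy as np
import tensorflow as tf

def preprocess(rgb_image):
    img = tf.image.rgb_to_grayscale(rgb_image)        # H x W x 1 grayscale
    img = tf.image.resize(img, (28, 28))              # match the network input
    img = tf.cast(img, tf.float32) / 255.0            # normalize to [0, 1]
    binary = tf.cast(img > 0.5, tf.float32)           # simple binarization
    return binary.numpy()

camera_frame = np.random.randint(0, 256, size=(480, 640, 3), dtype=np.uint8)
digit = preprocess(camera_frame)
print(digit.shape)                                    # (28, 28, 1)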
The feature extraction stage simplifies the input by transforming the image data into a
manageable feature set, making it ideal for the FPGA’s parallel processing
capabilities. Classification (Pattern Recognition) is the project’s core, where the
extracted features are used to recognize the character. Classification may be
performed using machine learning algorithms such as Convolutional Neural
Networks (CNNs), which are well-suited for image recognition tasks.
Implementing CNNs in hardware on the FPGA enables direct feature
classification into corresponding character classes. Alternatively, algorithms like
Support Vector Machines (SVM) may be used based on the complexity and accuracy
demands. FPGAs significantly accelerate the recognition process through hardware-
accelerated parallelism, supporting real-time performance. Post-processing and
Output involve displaying or saving the recognized character, with optional
refinements like context-based correction or multiple classifier integration to
enhance accuracy. This output can be displayed on an interface or applied in real-
time tasks, such as text transcription. For machine learning-based methods like
CNNs, the Training phase is crucial. Models are trained off-device on powerful
computers using frameworks like TensorFlow or PyTorch and then converted into an
FPGA-compatible format.
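One possible form of this conversion step is sketched below: the trained weights are quantized and written out as plain-text hexadecimal values that a memory-initialization flow could read. The saved-model path, the Q1.7 format, and the one-value-per-line layout are assumptions rather than the project's actual tool flow.

# Off-device conversion sketch: quantize trained Keras weights to 8-bit fixed
# point (Q1.7) and dump them as hex text for FPGA memory initialization.
# The model path, number format, and file layout are assumptions.
import numpy as np
import tensorflow as tf

def export_weights_hex(weights, path, frac_bits=7):
    q = np.clip(np.round(weights.flatten() * (1 << frac_bits)), -128, 127)
    q = q.astype(np.int8)
    with open(path, "w") as f:
        for value in q:
            f.write(f"{value & 0xFF:02x}\n")   # two's complement, one byte per line

model = tf.keras.models.load_model("mnist_cnn.h5")     # assumed trained model
for i, layer in enumerate(model.layers):
    for j, w in enumerate(layer.get_weights()):
        export_weights_hex(w, f"layer{i}_param{j}.mem")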
FPGAs offer low latency, meaning there is minimal delay between input (the
scanned or captured image) and output (the recognized character). The hardware-
based nature of FPGAs ensures that data flows smoothly through each stage—from
image acquisition and preprocessing to feature extraction and classification—
without the delays typical of software-based processing.
CHAPTER 5
The system’s strong performance in terms of both speed and accuracy makes
it suitable for a wide range of real-time applications. These applications include
automated text transcription, where the system can convert handwritten notes into
digital text instantaneously; digital form processing, where the system can
automatically recognize and categorize handwritten entries; and various embedded
systems that require rapid and accurate data processing in real time.
Furthermore, the low power consumption of the FPGA design is a major
advantage, especially for applications in resource-constrained environments. In
contrast to traditional CPU or GPU implementations, which can be power-hungry
and require more cooling, FPGA-based systems are inherently more energy-efficient
due to their hardware-centric processing. This makes them ideal for deployment in
mobile devices and IoT applications where power efficiency is critical for prolonged
operation. For example, an FPGA-based character recognition system could be
embedded in wearable devices for educational or medical purposes, where real-time
handwritten data recognition is required but where battery life must be preserved.
The scalability of the FPGA design adds another layer of flexibility to the system,
enabling it to be tailored to meet different levels of processing power depending on
the application requirements.
For instance, the FPGA can be reconfigured to handle more complex models
or larger datasets if required by future applications, providing a future-proof solution
that can adapt to evolving needs. This adaptability ensures that the system can be
scaled up for high-volume data processing or scaled down for more minimal,
targeted tasks, allowing it to serve a broad spectrum of use cases. Overall, the project
showcases the immense potential of FPGAs in accelerating complex machine
learning tasks while maintaining real-time performance and energy efficiency. By
demonstrating that FPGAs can support sophisticated algorithms like CNNs with high
accuracy and low power consumption, this project highlights FPGAs as a viable
solution for real-time, embedded AI applications. The system’s success underscores
the relevance of FPGA-based designs in modern applications requiring fast, efficient,
and accurate character recognition, positioning FPGAs as an optimal choice for
future advancements in the field.
Fig 5.1 Output
CHAPTER 6
CONCLUSION & FUTURE SCOPE
6.1 CONCLUSION
One of the key achievements of this project is its real-time processing ability.
Unlike traditional CPU or GPU-based systems that can introduce delays due to
sequential processing or bottlenecks in memory access, an FPGA can perform
multiple operations simultaneously. This parallelism reduces the overall computation
time, allowing for instantaneous recognition of handwritten characters, which is
crucial for applications such as digitizing handwritten documents, form processing,
and real-time transcription in embedded systems. This low-latency response is
particularly valuable in scenarios where real-time feedback is critical, such as live
text input systems, smart devices, or automotive applications. The project's use of
machine learning models, such as Convolutional Neural Networks (CNNs),
highlights the adaptability of FPGAs in handling modern deep learning workloads.
6.2 FUTURE SCOPE
REFERENCES
[10] K. N. Shanmuganathan, P. K. Mahadeva Prasanna, “FPGA Based Reconfigurable
Architecture for Character Recognition Using Neural Networks,” IEEE International
Symposium on VLSI Design and Test (VDAT), 2014.