Using Advanced FPGA SoC Technologies for the Design of Industrial Control Applications

Christoforos Economakos†, George Kiokes‡ and George Economakos



Technological Educational Institution of Sterea Ellada, Department of Automation Engineering
GR-34400 Psahna, Evia, Greece, e-mail: [email protected]

Hellenic Air Force Academy, Dekeleia, 1010, GR-13671, Attica, Athens, Greece, e-mail: [email protected]

National Technical University of Athens, School of Electrical and Computer Engineering
Heroon Polytechniou 9, GR-15780 Zografou, Athens, Greece, e-mail: [email protected]

Abstract: Modern industrial control systems must offer performance, flexibility and reliability. At the same time, they need to reach the market as early as possible and at low cost. Finally, they need to operate as embedded devices with a low power budget. On top of that, the algorithms that they implement are getting ever more sophisticated, advanced and demanding. To cope with all these diverse requirements, control system designers are moving quickly to the digital hardware design field and specifically to FPGAs, System-on-Chip architectures and productivity improving methodologies like High-Level Synthesis, which uses C/C++ as an abstract hardware description language. In this paper, using these tools, the implementation of 3 control algorithms is shown: the classical PID algorithm, a Fuzzy Logic Controller (FLC) and an Adaptive or Tuning Fuzzy Logic Controller (TFLC). The novelty of the proposed approach is that, through specific coding and compiler directives, the C/C++ input descriptions are automatically implemented as advanced multicore architectures (the 3 most advanced of them are put to extensive experimentation and compared), which execute up to 500K algorithm iterations in less than 1 sec, taking advantage of an embedded ARM family microcontroller and common memory blocks found in the underlying FPGA implementation device. This is a substantial performance improvement and a high productivity boost, with very promising future extension capabilities.

Keywords: High-Level Synthesis; Digital Control; Multicore Architectures; FPGAs; SoC

This research has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the operational program “Education and Lifelong Learning” of the National Strategic Reference Framework (NSRF) - Research Funding Program: ARCHIMEDES III: Investing in knowledge society through the European Social Fund.

Authorized licensed use limited to: Consortium - Algeria (CERIST). Downloaded on December 04,2022 at 20:14:16 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

Modern industrial control systems need to comply with different requirements to make a high and fast market impact. From the designer's point of view, all requirements can be summarized into two key factors: improve quality (in terms of performance, resource usage, power dissipation, etc.) and reduce time-to-market.

The first step in achieving these goals is the adoption of digital over analog control methodologies, accompanied by efficient development environments [1]. Digital control can be performed with common microcontrollers, Digital Signal Processor (DSP) controllers, or Field Programmable Gate Arrays (FPGAs). While DSPs generally include special purpose computational hardware to improve performance, like floating point coprocessors or multiply and accumulate ALUs, and fewer peripherals than common microcontrollers, they can be considered to belong to the same family of development platforms. With this family, applications are written most of the time in C/C++ and pass through a number of powerful tools like cross-compilers, linkers, debuggers and simulators, to meet design constraints.

Recently, the technological advances in FPGA devices, offering hundreds of GFLOPs with maximum power efficiency, have established a second powerful development family. FPGAs have been proposed as an implementation platform between hardware and software. They consist of specially designed hardware modules connected with efficient circuit switching interconnections, offering hardware-like performance and software-like flexible, dynamic reconfiguration. FPGA programming is based on Hardware Description Languages (HDLs) like VHDL or Verilog. HDL programming however requires domain specific knowledge and can therefore keep non-expert designers away and have a negative impact on productivity.

To improve designer productivity and reduce time-to-market, modern design techniques like High-Level Synthesis (HLS), Electronic System Level (ESL) design or, in simpler terms, C based hardware design can be adopted. HLS, ESL and C based hardware design [2] all more or less involve the automatic translation of untimed C/C++ algorithmic descriptions into Register-Transfer Level (RTL) HDL architectural descriptions, ready for FPGA implementation. As a research topic HLS started more than 30 years ago, and can be divided into three generations [3]. The latest, third generation, starting in 2000 and lasting up to now, is more mature, starts from system level languages and mainly C/C++ (so the term C based design has prevailed), offers a different design paradigm separated from RTL and HDLs and, based on recent advances in FPGA technology, delivers highly improved quality of results.

Digital industrial control methodologies and implementation technologies, like microcontrollers, DSPs and FPGAs, have been gaining wider and wider acceptance during the last years. FPGAs especially have introduced a variety of well established and efficient hardware design solutions into the industrial control arena. These include HDLs [4], C based design and
HLS [5], Programmable Logic Controller (PLC) code to HDL translators [6], SoC [7] and MultiProcessor SoC (MPSoC) [8] architectures, hardware/software codesign [9] and run-time reconfiguration [10]. Also, FPGAs have been used for the implementation of different controller types [11]. However, even though the opportunities seem to be more than what designers need, in each case a cost-benefit approach should be followed and Pareto optimal solutions should be investigated, based on the constraints imposed by the specific industrial environment.

Following these past approaches, this paper presents a performance improving C based tool flow applied to a scalable, multicore FPGA-based System-on-Chip (SoC) architecture for digital controllers. This architecture combines a common RISC microcontroller and a number of special purpose coprocessors on the same FPGA device. The microcontroller is used for general purpose chores like communication with common devices (VGA, HDMI, TFT or LCD displays, buttons and switches, external memory cards, UART, ethernet, bluetooth, GSM), for which a lot of work is available (drivers, applications), either public or non-public domain. The microcontroller can even host Linux flavors, improving the usability and flexibility of the resulting device. On the other hand, the special purpose coprocessors are connected to the microcontroller bus with different architectural options, which are selected with proper coding guidelines and compiler directives. Coprocessors are used to implement demanding control applications, like the classical Proportional-Integral-Derivative (PID) algorithm, a Fuzzy Logic Controller (FLC) and an Adaptive or Tuning Fuzzy Logic Controller (TFLC). Without loss of generality, the implementation presented in this paper as well as the corresponding tool flow are those offered by Xilinx, because of their maturity at the current moment. However, other FPGA vendors are also preparing comparable solutions, so the corresponding reference architecture will probably be universally supported in the near future.

The advantages and novelties presented in this paper are: i) demanding control applications are performance enhanced by designing a special purpose hardware coprocessor, handling aggressive application and technology constraints, ii) fixed and floating point calculations are supported, through vendor supplied and optimized arithmetic IP cores [12], improving quality of results without special and time consuming designer effort, iii) the resulting embedded device offers advanced and flexible integration options, taking advantage of common peripherals connected to a RISC microcontroller (ARM, PowerPC, MicroBlaze) and Linux, and finally, iv) the whole design (both hardware and software) is done in C/C++, improving designer productivity and avoiding HDLs and other time consuming and error introducing procedures, without loss of performance.

These advantages are presented in the rest of this paper and justified with a set of experimental results, with 3 advanced microcontroller/coprocessor connection architectures within the overall proposed SoC architecture. With these experiments it is shown that the proposed environment and the corresponding tool flow form an efficient rapid prototyping development platform for digital control applications, reaching performance of 500K iterations in less than 1 sec, meeting modern design constraints and requirements, and offering promising future extension capabilities.

II. PROPOSED METHODOLOGY

The design methodology proposed in this paper consists of two steps: first, the generation of a high performance hardware accelerator and second, the integration of the generated accelerator with a reference SoC architecture. In both steps, the underlying technologies aim at improving designer productivity, while maintaining quality of results.

The proposed tool flow (without loss of generality and due to maturity) is based on Xilinx Vivado Design Suite (or, for short, Vivado), which supports SoC architectures of a single or dual core embedded processor (ARM, PowerPC, MicroBlaze), playing the role of the AXI bus [13] master. Two Vivado tools are involved for hardware and software specification and design, IP Integrator (IPI) and Xilinx Software Development Kit (SDK). As their names imply, IPI is a GUI enriched hardware specification tool, with links to low level implementation tools for full, top-down hardware design, and SDK is an Eclipse based software development environment. For the proposed methodology, that is the design of high performance hardware accelerators, another tool is involved, Xilinx Vivado HLS. This tool is used for the design of an AXI slave, which implements the functionality of the accelerator and is connected to the SoC architecture through simple or streaming AXI interfaces (depending on throughput and latency requirements) as well as common on-chip memory blocks. Vivado HLS accepts untimed C/C++ algorithmic descriptions and, based on the dependencies described with the language constructs used (assignments, loops, conditionals) and user specified constraints and preferences (through appropriate compiler directives), generates optimized, technology aware RTL descriptions. The novel approach of this flow is that it accepts only C/C++ input and produces optimized final implementation bitfiles.

A simple run through the tool flow starts from Vivado HLS, where the C/C++ input specification of the hardware accelerator is given, and through simulation, HLS and verification, an optimized hardware component is generated. This component is then packed into a special type of IP, an IP-XACT, suitable for IPI, together with the selected AXI interfaces. Next, IPI is called to specify the SoC architecture, using an embedded processor, the high performance IP-XACT core and generic, widely used components taken from an IP catalog. The whole system specification is synthesized into an FPGA specific netlist and then mapped into a feasible bitfile. Next, this bitfile is passed to SDK, where low level software components (OS, drivers, libraries) are automatically generated and application software components are developed. Finally, the generated application Executable and Linkable Format (ELF) file is sent back to IPI, where an implementation bitfile is generated, containing both hardware and software components, ready to program the FPGA device. A more detailed presentation of the above mentioned tools is out of the scope of this publication, as are the optimizations and flow iterations supported to satisfy design constraints. These can be found in the accompanying user manuals.

III. PROPOSED ARCHITECTURES

While the material presented in the previous section is a reference to widely used software packages, this section presents novel work, aiming at the efficient and more cost
effective use of these packages. Specifically, 4 different SoC architectures are presented along with the language constructs and required directives. In this way, a clear link between specification and implementation in the Vivado HLS environment is given. The 4 architectures, given in figures 1-4, experiment with different datapath optimizations, memory hierarchies and interconnection opportunities. This way, the full potential of modern HLS environments is exploited and improved quality of results is achieved, along with improved productivity and reduced time-to-market.

Fig. 1. Bus based architecture.

The first SoC architecture, called BUS, is given in figure 1. An embedded processor is connected to on chip and off chip memories and the generated hardware accelerator, through the processor bus (for the reference tool flow, the AXI bus). For an FPGA device, the most widely used on chip memory is Block RAM (BRAM), which is implemented by the FPGA vendors as specific components installed within the FPGA fabric. The most widely used off chip memory is dynamic or DDR memory, connected to the processor bus through a DDR controller IP. DDR is slow but can be very large in size and contain large data sets, while BRAM is fast but limited in size. Also, parallel access to both memory types is limited by the number of controller inputs, which is most of the time two (dual port memories). Execution in the architecture of figure 1 starts with input data read from an external device (a data file in a connected filing system) and stored in DDR. Next, all or parts of the input data are moved to BRAM, to speed up execution. Then, the embedded processor reads data from BRAM and sends them through the bus to the hardware accelerator, using a set of memory mapped registers. The accelerator performs datapath operations (all the required algorithmic operations) and returns results through the same route, that is the processor bus. This architecture does not require any specific language constructs, provided the initial C/C++ algorithmic description contains legal code for the HLS tool. It needs however a set of directives, to generate appropriate hardware and software descriptions of the bus ports and their access methods, like the following, where x and y are the input and output of top level function foo.

    set_directive_interface -mode ap_ctrl_hs "foo"
    set_directive_interface -mode ap_none "foo" x
    set_directive_interface -mode ap_vld "foo" y
    set_directive_resource -core AXI4LiteS "foo" return
    set_directive_resource -core AXI4LiteS "foo" x
    set_directive_resource -core AXI4LiteS "foo" y

The architecture of figure 1 is straightforward and can be used for code that works with small amounts of data. Its drawback is the close dependence on the processor bus, which is its performance limiting factor.

Fig. 2. BRAM architecture.

The second SoC architecture, called BRAM, is given in figure 2. The difference from the previous architecture of figure 1 is that the hardware accelerator communicates directly with the BRAM memory. Execution this time starts with the embedded processor initializing BRAM and then sending an initialization signal to the accelerator through the bus. This is their only communication and the accelerator can work at full speed, without time consuming bus transactions. Again, no specific algorithmic constructs are required, only the following directive for the memory connected port (together with the previously presented bus connection directives).

    set_directive_interface -mode ap_memory "foo" mem

Fig. 3. BRAM/LUT architecture.

The third SoC architecture, called LUT, is given in figure 3. In this architecture, the hardware accelerator is equipped with a local memory, which is implemented as a register file using Look-Up Table (LUT) function generators, found in large amounts in FPGA devices. The advantage of LUT memory is that each component can be accessed independently of the others, so a multi port memory architecture is generated. This architecture has a more complex controller than dual port BRAM memory, but offers fast and parallel access. Execution starts with the embedded processor initializing BRAM and then, the accelerator copies BRAM contents into the local LUT memory (a local variable). After that, any access command to BRAM memory is changed to the appropriate LUT memory access command and the accelerator can work at very high speed, limited only by the dependencies found within the datapath. Both bus and BRAM bottlenecks are avoided. The disadvantage of this third architecture is that both memory and datapath are implemented with LUTs, which may be a limited resource in small and medium range FPGA devices. The algorithmic constructs required are the definition and initialization of a local variable, for LUT memory, like the following lines, where bram is the BRAM memory variable and lut is the LUT memory variable.

    int main(int bram[BUFSIZE]) {
      int i;
      int lut[BUFSIZE];              /* local LUT memory */
      for (i = 0; i < BUFSIZE; i++)  /* copy BRAM contents */
        lut[i] = bram[i];
    }

After these code modifications, any access command to the BRAM memory variable should be changed to the appropriate LUT memory access command.

Regarding directives, a number of preferences can be imposed in order to exploit the full potential of this architecture. First, a directive for the BRAM memory connected port is required. Also, a directive called ARRAY_PARTITION in Vivado HLS is used to map the local memory (lut variable) into LUT and not BRAM resources. After that, all loops that
access LUT memory can be fully unrolled. This way all code that accesses local memory can be executed in parallel, limited only by data dependencies of the datapath. Also, the LUT memory initialization loop can be unrolled two times, because this is the maximum parallelization that can be achieved with a dual port BRAM memory. Finally, function calls within the datapath can be either inlined or pipelined, and the best result is chosen for implementation. Inlining increases the code fragments that can be executed in parallel, at the cost of more complicated control hardware. Pipelining improves execution time by overlapping consecutive function calls. For each algorithmic description, either inlining or pipelining may give better results, so a trial and error approach is required.

Fig. 4. BRAM/LUT multicore architecture.
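The LUT architecture preferences discussed above can be pictured in source form. The following is a minimal sketch, not the authors' code: it assumes the pragma spelling that Vivado HLS accepts as an alternative to Tcl directives, and the function body, the `datapath` arithmetic, the `float` element type and the `BUFSIZE` value are hypothetical placeholders chosen only to make the example self-contained. An ordinary C compiler ignores the pragmas, so the function can also be exercised in software.

```c
#define BUFSIZE 8   /* hypothetical buffer length, kept small for the sketch */

/* Hypothetical per-element datapath function; a candidate for
   pipelining (shown here) or, alternatively, inlining. */
static float datapath(float v)
{
#pragma HLS PIPELINE
    return 0.5f * v + 1.0f;
}

/* Hypothetical top-level function: bram models the BRAM-connected port,
   lut is the local copy that the tool is asked to build from registers. */
float foo(const float bram[BUFSIZE])
{
    float lut[BUFSIZE];
#pragma HLS ARRAY_PARTITION variable=lut complete dim=1
    float acc = 0.0f;
    int i;

initialization:
    for (i = 0; i < BUFSIZE; i++) {
        /* two copies per cycle at most, matching a dual port BRAM */
#pragma HLS UNROLL factor=2
        lut[i] = bram[i];
    }

computation:
    for (i = 0; i < BUFSIZE; i++) {
        /* fully unrolled: every lut element is reachable in parallel */
#pragma HLS UNROLL
        acc += datapath(lut[i]);
    }
    return acc;
}
```

Calling `foo` on the values 0 to 7 in plain software returns 22.0, which gives a reference result to compare against when checking the synthesized accelerator.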
In brief, a representative example of the directives of the third architecture is given below, where bram is the BRAM memory variable and lut is the LUT memory variable, initialization and computation are labels of a LUT initialization loop and a heavy computational loop respectively, and datapath is a computationally intensive function call.

    set_directive_interface -mode ap_memory "foo" bram
    set_directive_array_partition -type complete "foo" lut
    set_directive_unroll -factor 2 "foo/initialization"
    set_directive_unroll "foo/computation"
    set_directive_pipeline "datapath"

Finally, the fourth SoC architecture, called LUT/i (explained later), is given in figure 4. The accelerator is the same as the one found in the third architecture; however, more than one accelerator is used, specifically i of them. This way, provided the algorithmic description contains computations that can be executed in parallel using the local LUT based memory copy, maximum parallelism can be achieved and performance can satisfy demanding, high throughput applications. For this architecture no extra algorithmic constructs or tool directives are required. The only difference from the third architecture is that the LUT memory in each accelerator holds a part of the BRAM memory, whose length is determined by the number of accelerators used.

IV. EXPERIMENTAL RESULTS

The presented design methodology and corresponding tool flow have been tested with a number of control algorithms. For each algorithm, a number of FPGA implementations have been generated and performance and hardware usage measurements have been taken. Details about all experimental setups and all measurements are given below.

Specifically, 3 control algorithms have been implemented, starting from C behavioral specifications: the classical PID algorithm, an FLC algorithm used in the looper control of a rolling mill reported in [14] and the adaptive (TFLC) algorithm found in the same publication (shown in figure 5). All implementations used single precision floating point calculations natively supported by the latest versions of Vivado HLS and linked to vendor supplied, optimized implementations [12]. The PID algorithm requires 3 floating point multiplications and 5 floating point additions, the FLC algorithm requires 26 floating point multiplications, 10 floating point divisions and 22 floating point additions and the TFLC algorithm requires 106 floating point multiplications, 57 floating point divisions and 71 floating point additions. For each algorithm, a C description was written upon which algorithmic and architectural optimizations were applied, resulting in deep optimization and design space exploration.

Fig. 5. The structure of the looper control system.

FPGA implementations were based on the latest development in FPGA technology, Xilinx's 7 series All Programmable System-on-Chip, offering breakout performance, capacity, and system integration, while optimizing price/performance/watt, and specifically the Zynq XC7Z020 device (53200 Look-Up Table generators - LUTs, 106400 D-type Flip-Flops - DFFs and 220 special purpose DSP blocks), found in the Zedboard evaluation board. Of all the architectures presented in the previous section, implementations were taken for the 3 most advanced, the BRAM, LUT and the LUT/i with i=4. These were selected because the first architecture, the BUS architecture, has been shown to perform much worse than the others [15] (more than 50X performance overhead, due to the bus bottleneck). For all 3 architectures and all 3 algorithms resource usage, performance for 1024-524288 or 1K-500K
iterations and power measurements were taken.

TABLE I. BRAM ARCHITECTURE IMPLEMENTATION WITH THE ZYNQ ZEDBOARD (AREA).

Algorithm   LUTs              DFFs              DSPs
PID         1287 (2.42%)      841 (0.79%)       8 (3.64%)
FLC         13161 (24.74%)    6649 (6.25%)      21 (9.55%)
TFLC        53119 (99.85%)    30823 (28.97%)    168 (76.36%)

TABLE II. LUT ARCHITECTURE IMPLEMENTATION WITH THE ZYNQ ZEDBOARD (AREA).

Algorithm   LUTs              DFFs              DSPs
PID         1895 (3.56%)      1244 (1.17%)      10 (4.55%)
FLC         13715 (25.78%)    7798 (7.33%)      24 (10.91%)
TFLC        52936 (99.50%)    31311 (29.43%)    168 (76.36%)

Fig. 6. Performance of the PID algorithm.
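The operation count quoted in this section for the PID algorithm (3 floating point multiplications and 5 floating point additions) is matched, for example, by a positional discrete PID with the sampling period folded into the gains. The sketch below is an illustration under that assumption only; the paper does not list its C source, and the names `pid_state` and `pid_step` are hypothetical.

```c
/* Controller state carried from one iteration to the next. */
typedef struct {
    float kp, ki, kd;   /* gains; sampling period assumed folded in */
    float integral;     /* running sum of the error */
    float prev_error;   /* error seen in the previous iteration */
} pid_state;

/* One PID iteration using 3 multiplications and 5 additions/subtractions. */
float pid_step(pid_state *s, float setpoint, float measurement)
{
    float error = setpoint - measurement;        /* addition 1 */
    float derivative = error - s->prev_error;    /* addition 2 */

    s->integral = s->integral + error;           /* addition 3 */
    s->prev_error = error;

    return s->kp * error                         /* multiplication 1 */
         + s->ki * s->integral                   /* multiplication 2, addition 4 */
         + s->kd * derivative;                   /* multiplication 3, addition 5 */
}
```

With gains (2, 0.5, 0.25), a setpoint of 1 and measurements 0 and then 0.5, two successive calls return 2.75 and 1.625. A function of this shape is what an HLS tool would take as the untimed algorithmic description of the accelerator.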


For resource usage, tables I and II show the resources required for architectures BRAM and LUT respectively (LUT/4 is the same as LUT with all numbers multiplied by 4, because 4 instances are utilized). For both architectures resources are similar. The LUT architecture requires more elements in small designs (PID and FLC), where the extra hardware for the lut memory and its controller is comparable with the hardware for algorithm computations. For the largest design, TFLC, the LUT architecture, offering advanced parallelization opportunities and thus more hardware sharing options, requires fewer LUTs than the BRAM architecture, overcoming the lut memory overheads. In all cases the required resources are less than those available in the FPGA device used.
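The percentages in tables I and II follow directly from the XC7Z020 capacities quoted earlier (53200 LUTs, 106400 DFFs, 220 DSP blocks). A small sketch of that arithmetic is given below; the helper names `utilization_x100` and `fits` are hypothetical, and the multiplication by the instance count simply applies the "numbers multiplied by 4" rule stated for LUT/4.

```c
/* Capacity of the Zynq XC7Z020 as quoted in the text. */
enum { XC7Z020_LUTS = 53200, XC7Z020_DFFS = 106400, XC7Z020_DSPS = 220 };

/* Utilization in hundredths of a percent, rounded to nearest
   (for example 242 stands for 2.42%). */
long utilization_x100(long used, long capacity)
{
    return (used * 10000L + capacity / 2) / capacity;
}

/* Returns 1 when `instances` copies of a core that needs `used`
   units of a resource fit within `capacity`, 0 otherwise. */
int fits(long used, int instances, long capacity)
{
    return used * instances <= capacity;
}
```

For the PID row of table I, `utilization_x100(1287, XC7Z020_LUTS)` gives 242, that is 2.42%. For the TFLC LUT core, `fits(52936, 1, XC7Z020_LUTS)` holds while `fits(52936, 4, XC7Z020_LUTS)` does not, which agrees with the use of a larger device for the TFLC LUT/4 build.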
For performance, tables III-V show execution times in msec for 1K-500K iterations. As expected, more demanding algorithms take more time. Also, the LUT architecture is always better than the BRAM, due to the use of local memory, and LUT/4 is almost 4 times better than LUT, utilizing 4 LUT cores in parallel. In the case of the TFLC algorithm, where only 1 LUT architecture core fits in the Zynq XC7Z020 device, the XC7Z100 can be used instead to build the LUT/4 architecture, giving the reported performance numbers. The same results are given in graphical form in figures 6-8.

Fig. 7. Performance of the FLC algorithm.
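The relations stated in this paragraph can be checked against the reported execution times with two one-line formulas: speedup is the ratio of two execution times, and throughput is the iteration count divided by the execution time. The sketch below shows that arithmetic; the function names are hypothetical and the inputs are meant to be taken from tables III-V.

```c
/* Speedup of a candidate architecture over a baseline,
   given both execution times in milliseconds. */
double speedup(double baseline_ms, double candidate_ms)
{
    return baseline_ms / candidate_ms;
}

/* Throughput in iterations per second for a run of `iterations`
   iterations that took `time_ms` milliseconds. */
double iterations_per_second(long iterations, double time_ms)
{
    return (double)iterations * 1000.0 / time_ms;
}
```

For the PID algorithm at 524288 iterations (table V), `speedup(99.61, 24.90)` is about 4.0 for LUT/4 over LUT and `speedup(136.31, 99.61)` is about 1.37 for LUT over BRAM, while `iterations_per_second(524288, 24.90)` comes to roughly 21 million iterations per second.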
Finally, the power consumption of both architectures is near 1.5W, which is much less than what traditional CPUs and DSPs need. Overall, the proposed methodology offers solutions that can perform 500K iteration calculations in less than 1 sec, at about 1.5W and with reasonable resources. The LUT and LUT/4 architectures, by a clever use of local memory, offer the best solutions and are very promising for the future.
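To make the LUT/4 idea concrete, the following is a plain C model of how a shared buffer could be divided among i cores, each copying only its own part into local memory before computing. It is a software illustration under stated assumptions, not the generated hardware: the names `slice`, `core_slice` and `run_core`, the equal-sized split, the 256-word local memory and the doubling "computation" are all hypothetical.

```c
#define NCORES 4          /* i = 4, as in the experiments */
#define LOCAL_WORDS 256   /* hypothetical size of each core's local memory */

/* Half-open index range [begin, end) of the shared buffer given to one core. */
typedef struct { int begin, end; } slice;

/* Split `bufsize` words into `ncores` contiguous, near-equal parts. */
slice core_slice(int core, int ncores, int bufsize)
{
    slice s;
    int chunk = (bufsize + ncores - 1) / ncores;   /* ceiling division */

    s.begin = core * chunk;
    s.end = s.begin + chunk;
    if (s.begin > bufsize) s.begin = bufsize;
    if (s.end > bufsize) s.end = bufsize;
    return s;
}

/* Stand-in for one accelerator: copy the slice into local memory,
   then work only on the local copy. Returns -1 if the slice is too large. */
long run_core(const int *bram, slice s)
{
    int lut[LOCAL_WORDS];
    long acc = 0;
    int n = s.end - s.begin;
    int i;

    if (n > LOCAL_WORDS) return -1;
    for (i = 0; i < n; i++) lut[i] = bram[s.begin + i];   /* local copy */
    for (i = 0; i < n; i++) acc += 2L * lut[i];           /* placeholder work */
    return acc;
}
```

With a 1024-word buffer and 4 cores, each core receives 256 words, and summing the four partial results reproduces the single-core result. In hardware the four calls would run concurrently, which is where the near 4X improvement would come from.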

Fig. 8. Performance of the TFLC algorithm.

TABLE III. ZEDBOARD PERFORMANCE MEASUREMENTS (1/3). EXECUTION TIME (MS).

Iterations  1024                    2048                    4096                    8192
Algorithm   BRAM    LUT     LUT/4   BRAM    LUT     LUT/4   BRAM    LUT     LUT/4   BRAM    LUT     LUT/4
PID         0.27    0.19    0.05    0.53    0.39    0.10    1.06    0.78    0.19    2.13    1.56    0.39
FLC         1.41    1.34    0.34    2.83    2.68    0.67    5.65    5.37    1.34    11.30   10.73   2.68
TFLC        1.98    1.83    0.46    3.95    3.67    0.92    7.91    7.33    1.83    15.81   14.67   3.67

TABLE IV. ZEDBOARD PERFORMANCE MEASUREMENTS (2/3). EXECUTION TIME (MS).

Iterations  16384                   32768                   65536
Algorithm   BRAM    LUT     LUT/4   BRAM    LUT     LUT/4   BRAM    LUT     LUT/4
PID         4.26    3.11    0.78    8.52    6.23    1.56    17.04   12.45   3.11
FLC         22.61   21.46   5.37    45.22   42.93   10.73   90.44   85.85   21.46
TFLC        31.62   29.33   7.33    63.24   58.66   14.67   126.48  117.31  29.33

TABLE V. ZEDBOARD PERFORMANCE MEASUREMENTS (3/3). EXECUTION TIME (MS).

Iterations  131072                  262144                  524288
Algorithm   BRAM    LUT     LUT/4   BRAM    LUT     LUT/4   BRAM    LUT     LUT/4
PID         34.08   24.90   6.23    68.16   49.81   12.45   136.31  99.61   24.90
FLC         180.88  171.71  42.93   361.76  343.41  85.85   723.52  686.82  171.71
TFLC        252.97  234.62  58.66   505.94  469.24  117.31  1011.88 938.48  234.62

V. CONCLUSIONS

In this paper, a design environment and the corresponding tool flow have been presented, which utilize C based hardware design for the development of digital control applications.
Specifically, the presented tool flow supports C input algorithmic descriptions that pass through HLS and FPGA implementation tools to implement the selected algorithms as multicore, embedded designs, offering performance improvements and hardware utilization efficiency. Overall, the proposed methodology and underlying tool flow support a novel high productivity prototyping platform for digital control applications, offering performance, resource and power improvements compared to other implementation architectures. The use of local memory, through coding styles and compiler directives, offers the overall best solutions, showing that C based design can be tuned to offer both quality-of-results and reduced time-to-market.

REFERENCES

[1] E. Monmasson, L. Idkhajine, M. N. Cirstea, I. Bahri, A. Tisan, and M. W. Naouar, “FPGAs in industrial control applications,” IEEE Transactions on Industrial Informatics, vol. 7, no. 2, pp. 224–243, 2011.
[2] P. Coussy and A. Morawiec, High-level Synthesis: From Algorithm to Digital Circuit. Springer-Verlag, 2008.
[3] G. Martin and G. Smith, “High-level synthesis: Past, present, and future,” IEEE Design and Test of Computers, vol. 26, no. 4, pp. 18–25, 2009.
[4] S. Ghosh, R. K. Barai, S. Bhattarcharya, P. Bhattacharyya, S. Rudra, A. Dutta, and R. Pyne, “An FPGA based implementation of a flexible digital PID controller for a motion control system,” in International Conference on Computer Communication and Informatics. IEEE, 2013.
[5] D. Navarro, O. Lucia, L. A. Barragan, I. Urriza, and O. Jimenez, “High-level synthesis for accelerating the FPGA implementation of computationally-demanding control algorithms for power converters,” IEEE Transactions on Industrial Informatics, vol. 9, no. 3, pp. 1371–1379, 2013.
[6] S. Subbaraman, M. M. Patil, and P. S. Nilkund, “Novel integrated development environment for implementing PLC on FPGA by converting ladder diagram to synthesizable VHDL code,” in 11th International Conference on Control Automation Robotics and Vision. IEEE, 2010, pp. 1791–1795.
[7] A. Ben Said, M.and Hemdani, M. W. Naouar, E. Monmasson, and I. Slama-Belkhodja, “Standard FPGA-based or full cSoC controllers for three-phase PWM boost rectifier, two viable solutions,” in 15th International Power Electronics and Motion Control Conference. IEEE, 2012.
[8] S. Ben Othman, A. K. Ben Salem, H. Abdelkrim, and S. Ben Saoud, “MPSoC design approach of FPGA-based controller for induction motor drive,” in International Conference on Industrial Technology. IEEE, 2012, pp. 134–139.
[9] E. Monmasson, I. Bahri, L. Idkhajine, A. Maalouf, and W. M. Naouar, “Recent advancements in FPGA-based controllers for AC drives applications,” in 13th International Conference on Optimization of Electrical and Electronic Equipment. IEEE, 2012, pp. 8–15.
[10] M. W. Naouar, E. Monmasson, A. A. Naassani, and I. Slama-Belkhodja, “FPGA-based dynamic reconfiguration of sliding mode current controllers for synchronous machines,” IEEE Transactions on Industrial Informatics, vol. 9, no. 3, pp. 1262–1271, 2013.
[11] E. Monmasson and M. N. Cirstea, “FPGA design methodology for industrial control systems - a review,” IEEE Transactions on Industrial Electronics, vol. 54, no. 4, pp. 1824–1842, 2007.
[12] D. Bagni and D. Mackay, “Floating-point PID controller design with Vivado HLS and system generator for DSP,” Xilinx Application Note XAPP1163, 2013.
[13] ARM Ltd., AMBA AXI and ACE Protocol Specification, 2013.
[14] F. Janabi-Sharifi and J. Fan, “A learning fuzzy system for looper control in rolling mills,” in International Conference on Systems, Man, and Cybernetics. IEEE, 2000, pp. 3722–3727.
[15] C. Economakos, M. Tzamtzi, M. Skarpetis, and G. Economakos, “Performance improvements in a modern hardware design environment for control applications,” in International Conference on Industrial Technology. IEEE, 2015.
