ABSTRACT With the recent advances in hardware technologies like advanced CPUs and GPUs and the wide availability of open-source libraries, machine learning has penetrated various domains, including Electronic Design Automation (EDA). EDA consists of multiple stages, from high-level synthesis and logic synthesis to placement and routing. Traditionally, estimating resources and area from one level of design abstraction to the next uses mathematical, statistical, and analytical approaches. However, as the technology node shrinks and the number of cells inside the chip grows, these traditional estimation methods fail to correlate with the actual post-route values. Machine-learning (ML) based methodologies pave a strong path towards accurately estimating post-route values. In this paper, we present a comprehensive survey of the existing literature on ML applications in EDA, emphasizing FPGA design automation tools. We discuss how ML is applied at different stages to predict congestion, power, performance, and area (PPA) for both High-Level Synthesis (HLS) and Register Transfer Level (RTL)-based FPGA designs, its application to design space exploration, and its application to Computer-Aided Design (CAD) tool parameter settings to optimize timing and area requirements. Reinforcement learning is widely applied in both the FPGA and ASIC physical design flows, a topic also discussed in this paper. We further discuss various ML models, such as classical regression and classification, convolutional neural networks, reinforcement learning, and graph convolutional networks, and their application in EDA.
I. INTRODUCTION
As the technology node size decreases due to Moore's Law [1] and Dennard scaling [2], more and more transistors are fitted per unit area inside the integrated circuit (IC) than ever before. With the decrease in technology size, the tools used to design the ICs are becoming more complex. Electronic Design Automation (EDA) is the branch of science and technology that deals with the tools used to design integrated circuits. IC design is a highly complex process consisting of multiple stages, with an average turnaround time ranging from a few days to a few years. While designing an IC, it is always a good practice to estimate the power, performance, area, congestion, and wirelength of the design down the flow at an early stage. Traditionally, this is done using statistical, mathematical, or analytical methods [3]–[5]. Although the analytical and mathematical models produce fairly accurate estimates, these methods are computationally expensive and consume a lot of time (a few hours) to perform basic estimations like congestion [3], [4] and wirelength [6]. These methods generally solve every problem from scratch and do not utilize any prior experience or knowledge.

On the other side of the technology world, another domain known as Machine Learning (ML) is emerging rapidly and has shown its application in almost every aspect of human life. ML has seen applications in various domains like image recognition [7], [8], natural language processing [9]–[11], and audio and signal processing [12]–[14]. This is because the hardware used for training the models, like GPUs with multiple processors fabricated inside one die, has progressed significantly, along with the easy availability of open-source training frameworks like TensorFlow, scikit-learn, and PyTorch [15]–[17]. ML has also penetrated the EDA domain in the last five to six years. Problems that were previously solved using analytical methods [3]–[6], [18] are now reframed as ML problems. The issues that took a few
minutes to hours using analytical or mathematical methods can now be solved using ML-based pre-trained models within a few seconds. Machine learning has found its application in almost all stages of IC design, which include timing, power, and area estimation; routing congestion estimation, using both classical ML and deep learning methods; design space exploration; CAD tool parameter tuning; wirelength (WL) estimation; and lithographic hotspot detection. For this paper, we have surveyed the literature in the ML application domain in EDA tool design and optimization, focusing on FPGA EDA tools. We have clustered the applications into the following five categories:
i. Power, performance and area prediction in RTL and HLS designs
ii. Design Space Exploration (DSE)
iii. CAD tool optimization using ML
iv. Congestion estimation in HLS and RTL designs
v. ML in FPGA physical design flow

FIGURE 1. ML in EDA: the application areas covered in this survey (PPA estimation, recommender systems, placement, and routing) and the associated techniques (analytical and regression-based models, transfer learning, cross-platform ML, RL, and ML combined with meta-heuristics).

In Fig. 1, we have shown how ML is applied at various stages of the FPGA EDA flow. The figure also shows the various subcategories and ML algorithms applied at each flow stage. We have discussed both analytical and mathematical PPA estimators [19]–[21] and ML-based estimators [22]–[26], where the design entry is either HLS or RTL input. The prediction models use traditional ML methods like regression and classification as well as advanced methods like deep learning and graph convolution [26], [27] to estimate the PPA of a circuit. For autotuning and recommendation of CAD tool parameters [28]–[30], Bayesian methods are used to select the hyperparameters of the CAD tools. Yanghua et al. [31] perform feature selection to reduce the number of FPGA CAD parameters to consider from 80 to 8 features without compromising the quality of results. To reduce the parameter-tuning effort, Kwon et al. [32] propose a CAD tool parameter recommender system that involves learning a collaborative prediction model through tensor decomposition and regression. ML-based design space exploration of HLS designs is discussed in [33]–[36]. [30] uses Bayesian optimization to perform DSE to optimize an HLS-based CNN architecture, while Carloni [35] uses transfer learning to optimize the designs during DSE. Congestion estimation is another problem that machine learning can solve accurately and quickly. In [37], [38], the congestion estimation of RTL designs on FPGAs uses classical regression-based ML models. [39] predicts the routing congestion of HLS designs using ML, while [40], [41] use convolutional neural network (CNN) methods to forecast the congestion on post-placed images of logic netlists.

As shown in Fig. 1, ML can be applied at different stages of the logical and physical design flow, from quality of results (QoR) estimation to hyperparameter selection and design space exploration. In this paper, we comprehensively review how ML is applied in different stages of the FPGA physical design flow, along with some insights into future developments and open research areas. The rest of the paper is organized as follows: in Section II, we discuss the other works in which the application of ML in EDA tool development for both ASICs and FPGAs has been studied; in Section III, we discuss the various ML models applied in EDA tool design, which include classical regression and classification, CNN, reinforcement learning, transfer learning, and graph convolutional networks (GCN). From Section IV to Section VIII, we discuss in detail the five categories in which ML is applied for FPGA CAD. Finally, Section IX summarizes the paper's contents and looks into future trends and directions.
II. RELATED SURVEYS IN ML BASED EDA TOOL DEVELOPMENT
There are a few survey papers in which different works in the ML application domain in EDA have been studied [42]–[48]. Most of these works are either too generic for ASICs [42], [45] or too specific to a particular domain [43], [44], [48]. These papers discuss how ML has been applied to predict or improve the CAD tools' performance for designing ASICs. Discussion of ML-based FPGA CAD tools has been neglected in survey papers. [42] is a comprehensive survey in which the authors have performed a detailed analysis of how ML can be applied to various stages of the ASIC design flow. They discussed ML applications for ASIC physical design, power delivery network and IR drop analysis, lithography and mask production, analog IC design, device sizing automation, and verification and testing. As we can see, in this work all the predictions are targeted at the nanometer-scale ASIC physical design flow. They have minimal discussion on the FPGA front, and that too only on the HLS part; nothing has been discussed about prediction in the various stages of FPGA physical design. In [43], the authors exclusively surveyed the application of graph neural networks (GNNs) to the ASIC physical design flow. The paper [43] discusses primarily the theoretical details of GNNs, and the discussion on applying GNNs to EDA has been minimal. In another work, [44], the authors discussed explicitly the application of ML in analog circuit design. They mentioned the ANN-based analog IC design process, the hybrid analog IC design automation process, and the application of ANN and RL in analog IC manufacturing. In [45], the authors briefly discuss the application of ML in the placement and routing of ASIC designs. There is no mention of FPGAs, HLS, or recent methods like CNN, GNN, and RL. They only discuss wirelength estimation in placement and Design Rules Check (DRC) hotspot detection in post-routed circuits. Papers [46] and [49] discuss hotspot and DRC violation prediction using ML in ASIC circuits and static timing analysis using ML. These two papers are more roadmap papers than survey papers; they discuss in what different domains ML can be applied to future IC designs. Similar to [46], [47] is a roadmap that discusses the performance limitations of traditional compute and storage systems and the systems and infrastructure considerations for performing machine learning at scale. This paper also briefly discusses how ML can be applied to solve functional verification and debug problems. Even though [48] is not a proper survey paper, the authors discuss how RL can be applied to solve various circuit design problems like DRC violations, layout, and routing issues. This work focuses deeply on using Reinforcement Learning (RL) to solve post-route violation issues.
As the above discussion shows, most existing surveys are highly focused on IC development or EDA tools for ASICs. Even though there are many recent works in which ML has been applied to solve various FPGA physical design problems, their discussion has been limited to the "Related Works" section of those papers only. No comprehensive survey is available that discusses the existing work on ML-based FPGA CAD tool design. To address these loopholes, we created this detailed survey in which we exclusively review various works in which ML has been applied to multiple FPGA CAD tool design stages. We discuss both ML-based HLS tools and ML-based RTL tools for FPGAs. This paper surveys the recent advancements in GNN and RL, which have been widely applied to the synthesis, placement, and routing of FPGA circuits.

III. BASICS OF MACHINE LEARNING
ML algorithms are broadly divided into two classes:
(i) Supervised Learning
(ii) Unsupervised Learning
Most of the ML models used in EDA fall under supervised learning, including regression, classification, Convolutional Neural Networks, and Graph Convolutional Networks. The CAD tool optimization and Design Space Exploration tools use Bayesian learning and the Gaussian process to set the hyperparameters of the CAD tools. In Table 1, we have mapped the different EDA applications to the ML models used. In this section, we discuss the ML algorithms mentioned in column 1 of Table 1 in detail.
(i) Linear Regression: Linear regression is a method to find the relationship between two continuous variables, one independent and the other dependent. Linear regression tries to create a statistical relationship that may not always be deterministic; it tries to obtain a straight line that shows the relationship between the independent and the dependent variable. For most linear regression models, the error is measured using the least squared error. A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept, both of which are determined during model training.
(ii) Random Forest Regression (RF): Random forest is a supervised learning method that uses ensemble methods for learning. Ensemble learning is a technique that combines multiple weak learners to create a strong learner. In random forests, these weak learners are decision trees. The outputs from the individual learners are averaged to generate the final output. Random forest belongs to the bagging class of learners, where, during training, data samples are selected at random with replacement. Bagging makes each model run independently and then aggregates the outputs at the end without preference for any model. Because of the bagging method, the chances of overfitting are lower in a random forest. Random forest is a very fast training method because each tree can be trained in parallel, and the inputs and outputs of each tree are not related to one another. To summarize, the Random Forest algorithm merges the output of multiple decision trees to generate the final output.
(iii) Multivariate Adaptive Regression Splines (MARS): Multivariate adaptive regression splines [65] is an algorithm that automatically creates a piecewise linear model, building a nonlinear model by combining multiple small linear functions known as steps. In MARS, non-linearity is introduced by using step functions; there are no polynomial terms in the MARS equation. Equation 1 shows the linear stepwise equation:

yi = β0 + β1C1(xi) + β2C2(xi) + · · · + βdCd(xi) + ϵi    (1)

MARS is an adaptive procedure for regression and is well-suited for high-dimensional problems (i.e., many inputs). The bends in the step functions are known as "knots". If there are multiple knots present in the MARS equation, it may lead to overfitting.
(iv) Multi Layer Perceptron based Regression (MLP): MLP-based regression models comprise multiple perceptrons known as neurons. This type of model falls into the feedforward class of artificial neural networks. Generally, Artificial Neural Networks (ANNs) are used for classification, but they can be used for regression, too. MLP-based models are trained using the backpropagation algorithm. Each network consists of an input layer of neurons, a few hidden layers, and an output layer. The output of each neuron is linear; to make the output non-linear and emulate real-world behavior, activation functions like "tanh", "ReLU" and "sigmoid" are added.
(v) Gradient Boost Regression (XGB): Similar to random forest regression, gradient boost regression is also an ensemble class of ML model. Boosting is a method of converting weak learners into strong learners. Unlike random forest regression, where we use fully grown trees, in XGB the trees are weak learners; hence, shallow trees can also be used for fast learning. In boosting, each new tree is fit on a modified version of the original data set. XGB uses multiple trees in sequence, each of which fixes the errors of the previous tree. XGB uses a gradient descent algorithm on the loss function to minimize the error generated by the previous trees. The XGB method keeps adding trees one after another iteratively until there is no improvement in the loss. The loss function used is either the sum of squared errors (SSE) or the mean square error (MSE) [66]. (A minimal code sketch of the regression models in items (i)–(v) is shown after this list.)
(vi) Graph Convolution Network (GCN): With the wide application of deep learning, neural networks are an effective and efficient model for tasks like classification and regression. However, neural networks like ANNs and CNNs only take vectors or tensors as input data, which makes it difficult to work on graphs. Defferrard et al. generalize convolutional neural networks from regular grids (e.g., images) to general graphs via graph convolutional filters [67]. Later, Kipf and Welling simplify the convolutional operator and propose a graph convolutional network [68]. Graph learning techniques can be categorized into transductive and inductive settings. Under the transductive setting, the embedding of each node is directly optimized, and thus the training process requires seeing all the nodes; the original GCN [67] is one kind of transductive approach. The inductive approaches learn a general rule from the training graphs through sampling and aggregation of structural information and node attributes, and the learned model can be applied to unseen data.
(vii) Reinforcement Learning (RL): Reinforcement learning is an ML algorithm in which the machine tries to learn by itself. The machine repeatedly uses trial and error to learn from its own experience. To program the machine, the programmer sets rewards for correct actions and penalizes the machine for wrong actions. The goal of the machine is to maximize the rewards. Unlike supervised learning, the training data of reinforcement learning have no labels. By leveraging the power of search and many trials, reinforcement learning is currently the most effective way to hint at a machine's creativity [69], [70].
(viii) Bayesian Optimization: Bayesian optimization is an iterative method for optimizing expensive black-box functions: it builds a probabilistic surrogate model of the objective (typically a Gaussian process) from the evaluations made so far and uses an acquisition function to choose the next candidate point to evaluate.
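To make the classical regression models described above concrete, the following minimal sketch fits several of them with scikit-learn on a synthetic QoR-estimation task. The feature names, the synthetic target, and the dataset sizes are invented for illustration only and are not taken from any of the surveyed works.

```python
# Minimal, hypothetical sketch: fitting the regression models described above
# on a synthetic QoR-estimation task (invented netlist features -> delay).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(0)
n = 500
X = rng.uniform(0.0, 1.0, size=(n, 4))   # e.g., LUTs, FFs, nets, avg fanout (invented)
y = 1.0 + 3.0 * X[:, 0] + 2.0 * X[:, 1] ** 2 + X[:, 2] * X[:, 3] + rng.normal(0.0, 0.05, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest (bagging)": RandomForestRegressor(n_estimators=200, random_state=0),
    "Gradient Boosting (boosting)": GradientBoostingRegressor(n_estimators=200, random_state=0),
    "MLP (tanh activation)": MLPRegressor(hidden_layer_sizes=(32, 32), activation="tanh",
                                          max_iter=2000, random_state=0),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)                                        # supervised training
    mape = mean_absolute_percentage_error(y_te, model.predict(X_te))
    print(f"{name:30s} MAPE = {mape:.3f}")
```

In the surveyed papers, the feature vectors would instead come from synthesis or placement reports, and the targets from post-route results; the workflow, however, follows the same fit-then-predict pattern shown here.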
both time and energy for the designers. Since global-routing-based congestion estimates are not accurate for smaller technology nodes, researchers have developed ML-based congestion estimation methods for FPGAs and ASICs. They have used traditional regression models on RTL netlists [37], [38] and HLS designs [39]. Other researchers have addressed the FPGA congestion prediction problem using Generative Adversarial Networks (GANs) and CNNs [27], [41]. In these models, the authors apply image-based GAN and CNN algorithms to detect congestion hotspots on 2D images of post-placed netlists. This section discusses the ML-based works to predict congestion on FPGA netlists. In Fig. 2, we have shown a conventional FPGA/ASIC design flow starting from the RTL code to the routing congestion map level. On the right side, we have shown how ML can be applied right after the placement stage to get a quick estimate of the post-route congestion value. The applied ML model can be a traditional regression/classification method or an image-based model like a CNN or GAN (generative adversarial network).

FIGURE 2. Conventional FPGA/ASIC design flow (RTL code → logic synthesis → mapping → placement → routing); features extracted after placement feed an AI-based prediction model (ML/GAN/DNN) for post-route congestion.
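As a purely illustrative companion to the flow in Fig. 2, the sketch below shows one plausible way to rasterize post-placement data into per-tile feature channels that either a per-tile regressor or an image-based CNN/GAN model could consume. The grid size, the two channels, and the toy placement records are assumptions, not details of any surveyed work.

```python
# Minimal, hypothetical sketch: rasterizing post-placement data into a 2D
# feature map for an image-based congestion model (or a per-tile regressor).
# Grid size, channels, and the toy records below are assumptions.
import numpy as np

GRID = 64  # tiles per side of the (assumed) FPGA fabric

def placement_to_feature_map(cells, nets):
    """cells: list of (x, y, num_pins) with x, y in [0, 1); nets: lists of (x, y) pins.
    Returns a (GRID, GRID, 2) array: pin density and net bounding-box overlap per tile."""
    fmap = np.zeros((GRID, GRID, 2), dtype=np.float32)
    for x, y, num_pins in cells:
        fmap[int(y * GRID), int(x * GRID), 0] += num_pins        # channel 0: pin density
    for pins in nets:
        xs = [p[0] for p in pins]
        ys = [p[1] for p in pins]
        x0, x1 = int(min(xs) * GRID), int(max(xs) * GRID)
        y0, y1 = int(min(ys) * GRID), int(max(ys) * GRID)
        fmap[y0:y1 + 1, x0:x1 + 1, 1] += 1.0                      # channel 1: bbox overlap count
    return fmap

# Toy example: two placed cells and one two-pin net (normalized coordinates)
cells = [(0.10, 0.20, 4), (0.60, 0.70, 6)]
nets = [[(0.10, 0.20), (0.60, 0.70)]]
fmap = placement_to_feature_map(cells, nets)
print(fmap.shape, fmap[:, :, 0].sum(), fmap[:, :, 1].max())
```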
the instructions on the IR code to the actual layout after routing. The back tracing is done in the following way:
Congested CLB → gate-level netlist → RTL code → IR code → HLS code
Once the source of the instruction for the congested CLB is determined in the HLS code, features are extracted to create a regression model. The regression model is later used to predict the post-route congestion from the high-level C/C++ and LLVM IR codes.

C. ROUTING CONGESTION PREDICTION USING GAN BASED METHOD
The works described in [41] and [40] addressed the routing congestion estimation problem as a conditional generative adversarial network (CGAN) problem. In both works, the authors transfer the routing congestion problem in large-scale FPGAs to an image-to-image problem and then use a conditional GAN to solve it. In [41], the authors proposed a new placement-based routing congestion prediction approach for large-scale FPGA designs. The proposed approach adopts a new CGAN model, namely pix2pixHD, which performs high-definition (HD) image translations for large high-resolution images. With HD image translation, this approach can achieve high prediction accuracy for large FPGA designs while relying on well-engineered features that encode the placement and connectivity information for large-scale designs. The main contributions of [40] include selecting features only from the post-placed netlist image, estimating the routing channel utilization by forecasting the full congestion heat map instead of hotspots only, and integrating with the placement tool to estimate routing congestion on the fly during placement. Both tools can be easily integrated with open-source academic placers like UTPlaceF and GPlace [74], [75] and achieve very high accuracy.

D. SUMMARY OF ROUTING CONGESTION WORK AND FUTURE DIRECTIONS
The application of ML to solve congestion estimation can be broadly classified into two classes: (a) using traditional regression methods and (b) using image-based GAN methods. Also, the design entry to the tool can be either RTL-based or HLS-based. In Table 2, we have summarized all the ML-based congestion estimation tools for FPGAs and the models used for prediction. In most cases, the performance of a post-route congestion estimation tool is measured in terms of mean absolute percentage error (MAPE). In row 4 of Table 2, we present the average error reported in each of the six works. Since post-route congestion is estimated after the placement stage of the FPGA physical design process, for RTL-based designs, features are extracted from the post-placement netlists. Even for GAN-based designs, placement netlists are converted to images, and features are extracted from those images. For example, in [41], encoded feature maps in high-definition images are created for pin density and vertical and horizontal routing congestion from post-placed netlists. Similarly, in [40], different color schemes are used in post-placement images to represent routing channels, CLB spots, multipliers, memories, IO blocks, and connectivity matrices. Row 6 in Table 2 represents the feature sources, while row 7 represents the datasets. Although all the work discussed in this section can accurately estimate the routing congestion, almost all the estimation is done only at the CLB level. No work estimates routing congestion for DSPs or BRAMs, even though they consume a significant amount of resources in modern designs. Also, none of the work is properly integrated with open-source or commercial implementation tools. Easy integration of the prediction models with existing physical implementation tools is still an open research direction.

V. ML BASED DESIGN SPACE EXPLORATION OF HLS DESIGNS
In High-Level Synthesis, there are various methods that the designer can use to optimize their design in terms of area, latency, and timing. The process by which the designer achieves the best set of designs in terms of area and latency by changing the parameters of the HLS tool is called "Design Space Exploration" (DSE). In HLS designs, there are three different mechanisms, known as "knobs", which the designer can use to optimize their designs [76]. The three knobs of HLS designs are:
1) Pragma-based local synthesis
2) Global Synthesis Options
3) Functional Unit Constraints
The main challenge HLS users face is setting these knobs to obtain a design with the desired characteristics, considering that the total number of knob settings is extremely large. There are both traditional optimization methods, like simulated annealing based, genetic algorithm based, ant colony optimization and gradient descent [33], [77], [78] based methods, and machine learning based methods to perform DSE [33]–[36], [79], [80]. Some of the works combine both traditional methods and ML-based methods [33]. Thus, the goal of
TABLE 3. Summary of ML-based design space exploration works for HLS designs.
Paper | ML Model | QoR Metrics | Average ADRS | Dataset | Max. Design Space
[36] | TED+RF | ADRS | 3.38 | Custom DFT design | 119
[35] | Transfer Learning | ADRS | 4.27 | Spector OpenCL | 1173
[53] | Transfer Learning | ADRS | 0.28 | Machsuite | 21952
[82] | Linear Regression | ADRS | 27.47 | S2Cbench | NA
[33] | Decision Tree | Run Time/Dominance | 0.90^a | S2Cbench | NA
[83] | XGB | Prediction Accuracy | 1.95^b | CNN architectures | 744
[80] | XGB | ADRS | 0.161 | S2Cbench/Chstone | 120,000
[34] | Decision Tree | ADRS | 3.03 | S2Cbench | NA
[79] | RF | Run Time | 0.0155 | Custom OpenCL BM | 1527
[81] | Analytical | Run Time | 5.04^c | Xilinx Vitis Library | NA
[26] | GNN | Resource/Latency | 2.54x^d / 6x^c | Machsuite/Poly/Chstone | 4700 (synth) / 800 (real)
[20] | Analytical | Run Time | 215.12^c | Polybench | 95
Note: ^a Average Dominance; ^b Average MAPE; ^c Average Speedup; ^d Average DSP Usage; NA: Not Available
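Several rows of the table above report quality as ADRS (Average Distance to Reference Set), which is discussed further in Section V-D. The sketch below shows one common way ADRS is computed; the per-objective normalization and the toy Pareto fronts are assumptions for illustration, not values from any surveyed work.

```python
# Minimal, hypothetical sketch of ADRS: average, over reference Pareto points,
# of the minimum normalized distance to the approximate Pareto set (0 is ideal).
import numpy as np

def adrs(reference, approximate):
    ref = np.asarray(reference, dtype=float)
    app = np.asarray(approximate, dtype=float)
    total = 0.0
    for r in ref:
        # relative deviation per objective; take the worst objective as the distance
        dist = np.max(np.abs(app - r) / np.maximum(r, 1e-12), axis=1)
        total += dist.min()
    return total / len(ref)

reference   = [(100, 20), (150, 15), (220, 10)]   # (area, latency) from exhaustive search
approximate = [(105, 20), (230, 10)]              # front returned by the explorer
print(f"ADRS = {adrs(reference, approximate):.4f}")
```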
this work are the initial and final temperature, the descent rate, the exit condition in simulated annealing, and, in the GA case, the number of parent pairs and the mutation and crossover rate. The authors also proposed a combined SA, GA, and ACO concurrent multi-heuristic design space explorer. In [80], Goswami et al. combined heuristics-based simulated annealing and an ML regression model to design a fast design space explorer for real designs. The initial part of the DSE algorithm runs on logic synthesis-based results, and once a sufficient amount of data (design points) has been generated, they create an ML model and switch to fast ML-based predictive DSE. They used this work to run DSE on different CNN architectures [83]. In [26], the authors created a fast DSE by applying GNN on LLVM IR graphs, which predicts a design's resource and latency requirements.

D. SUMMARY AND FUTURE DIRECTIONS FOR ML BASED DSE
One of the figures of merit for measuring the performance of a DSE tool is the "Average Distance to Reference Set" (ADRS). This parameter measures how close the predicted DSE points are to the exhaustive search-based DSE points. If the value of ADRS is close to zero, it is a very good design space explorer; the larger this value is, the worse the performance of the explorer. An explorer aims to minimize the ADRS or the runtime. This is done by either minimizing the design space [35], [79] or by applying transfer learning from a different set of designs [35], [53] or from another platform like ASIC [82]. Most of the papers [34]–[36], [53], [80], [82] measure the performance of their DSE tool in terms of ADRS. In Table 3, column 3, we show which QoR measures are used to evaluate the performance of the DSE tools as discussed in the papers, while in column 4, we report the average ADRS values over the various designs used in each paper. For papers where QoR is measured using some other parameters, like runtime, resource or latency requirements, the average of those values is reported. In columns 5 and 6, the benchmarks' names and the design space's maximum size are shown, respectively. Although a few standard benchmarks like Chstone [84], Machsuite [85], Polybench [86] and S2Cbench [87] are available, most of the researchers try to generate their own dataset [35], [36], [79], [81], [83]. Even if they use the existing benchmarks, they generate their own design versions. Hence, a fair comparison of the works is difficult to perform. Generating datasets to train ML models for PPA estimation or DSE is an extremely time-consuming process, which can take weeks or months. To address this issue, the benchmarks [88], [89] and [90] have been created, specifically curated for ML training. These are large datasets similar to computer vision datasets like Imagenet [91] or Cifar10 [92], which are off-the-shelf datasets that ML-based EDA researchers can directly use for training purposes. Another new direction of research is to eliminate the synthesis tools from the loop by using methods like LLVM [93] or MLIR [94], which was done in [26] and [80].

VI. MACHINE LEARNING IN FPGA CAD TOOL PARAMETER SELECTION
As the technology node decreases, the CAD tools used to design the ICs are also becoming very complex. Many parameters are involved in EDA tools, which results in a huge design space. The runtime complexity of the CAD tools is also very large, and designs take multiple weeks to synthesize at each design point. Modern CAD tools used for logic synthesis and physical design have hundreds of parameters, which are set to meet the various timing and area requirements of the designs. The design space of these tools is humongous, and manually selecting the tool parameters would take ages and may not always generate the optimum design in terms of power, performance, and area. To recommend tool parameters and minimize the design space in EDA tools, researchers have used Bayesian optimization [29], [30], [95], classification [28], [31], [54], principal component analysis for design space pruning [95], and tensor decomposition and regression [32]. Researchers minimized the tools' runtime and generated area- and timing-efficient designs using these methods. This section discusses seven recent works in which ML is applied to minimize the search space in EDA tools and generate optimized designs.
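Since Bayesian optimization is the workhorse of several of the works in this section [29], [30], the sketch below shows how such a parameter-tuning loop could be wired up with the scikit-optimize library. The two CAD parameters, the cost weighting, and the run_flow() stub are hypothetical and are not taken from any surveyed tool.

```python
# Minimal, hypothetical sketch: Bayesian optimization over two made-up CAD tool
# parameters. run_flow() stands in for a real synthesis + place-and-route run
# that returns (delay_ns, area_luts); here it is a cheap analytic stand-in.
from skopt import gp_minimize
from skopt.space import Integer, Categorical

space = [
    Integer(1, 8, name="placer_effort"),                           # invented parameter
    Categorical(["timing", "area", "balanced"], name="strategy"),  # invented parameter
]

def run_flow(placer_effort, strategy):
    # Stand-in for invoking the CAD flow; replace with a real tool run.
    delay = 10.0 / placer_effort + (0.0 if strategy == "timing" else 1.5)
    area = 1000 + 50 * placer_effort - (100 if strategy == "area" else 0)
    return delay, area

def objective(params):
    placer_effort, strategy = params
    delay, area = run_flow(placer_effort, strategy)
    return delay + 0.001 * area        # single scalar cost (weights are arbitrary)

result = gp_minimize(objective, space, n_calls=25, random_state=0)
print("best parameters:", result.x, "best cost:", round(result.fun, 3))
```

In a real setting, each objective evaluation is an hours-long tool run, which is exactly why the Gaussian-process surrogate and acquisition function are attractive: they try to reach a good parameter assignment in as few flow runs as possible.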
FIGURE 4. High-level diagram showing how ML is used to recommend CAD tool parameters: the RTL code, the tunable tool parameters (e.g., WLDrivenBlockPlacement, ExtraNetDelay_low, RuntimeOptimized), and the desired area/timing/power specs feed an ML-based recommender (Bayesian optimization, regression, classification, or active learning), which outputs the recommended parameters.

In Fig. 4, we have shown a high-level diagram of how ML can recommend CAD tool parameters for an FPGA design. The input to the recommender
system is generally the RTL code, a set of logic synthesis and physical design parameters that need to be tuned, and the desired specs in terms of area and performance. The recommender system considers the specs and the RTL code, and using ML, it suggests the best set of tool parameters to be used. All the works discussed in this section use a variant of this flow.

A. RECOMMENDER SYSTEM TO SUGGEST PARAMETERS TO MEET TIMING
The works discussed in [28], [31], [55] propose methods to suggest CAD tool parameters to maximize total negative slack (TNS). InTime [28] is a plugin for FPGA CAD tools that can automatically select tool parameter assignments for each design by using machine learning heuristics and cheap cloud computing resources. While modern CAD tools have hundreds of parameters, InTime uses only 25 parameters to suggest the best timing solutions. InTime is an iterative algorithm organized as a series of concurrent CAD runs. Each round, which consists of multiple concurrent runs, is an opportunity to generate candidate CAD parameter combinations and acquire data for analysis. Within each round, InTime uses a supervised learning approach to train classifiers that evaluate the effectiveness of a given combination of CAD parameter selections toward increasing timing slack.
[31] is an extension of InTime [28]. In this paper, the authors proposed a method to select the best CAD tool parameters using a classification approach. They minimized the search space using the Principal Component Analysis method. To train the model, they used off-the-shelf ML models like Logistic Regression (glm), Bagging (treebag), Random Forest (rf), Support Vector Machine (svmRadial) and Neural Network (nnet). Here also, the authors minimized the clock period using the suggested CAD parameters.
Similar to [28] and [31], LAMBDA [55] also tries to maximize TNS. In [28] and [31], parameters of a single stage are tuned, i.e., only for logic synthesis, placement or routing. But in [55], parameters of multiple stages are tuned simultaneously. They combine the features from multiple stages to predict the post-route QoR of the designs. They addressed the problem as a regression problem and used gradient boost regression to solve it.

B. GENERALIZED PPA RECOMMENDER SYSTEM
There are a few other works [29], [30], [32], [54] which are much more generalized than those discussed in Section VI-A. These works suggest parameters not only to optimize timing but also to optimize other metrics like power and area. [29] proposed a method that automates the flow selection in IC design using the Bayesian optimization method. Their optimization function optimizes a cost function consisting of power, performance, and area values. They automate the flow selection at the logic synthesis and the place and route stage. They have used Gaussian process regression as the surrogate function for the Bayesian model. Using Bayesian optimization, the authors tuned six parameters, four of which apply during the logic synthesis stage and two during the place and route stage. Similar to [29], [32] also suggests tool parameters both at the logic synthesis and the place and route stage, using a two-step process. In the first stage, an ML model is trained offline. They have used macros and small
partitions of large-scale, high-performance server processor chips for training. Once the model is trained, in the second stage, whenever a new macro comes, the tool takes as input the macro name (whether it is a previously seen or a new macro), the set of cost functions to be optimized, or some baseline synthesis results for the new macro, and recommends the set of tool parameters that meets the objective function. Although the paper presents a nice idea for a recommendation system, nothing has been mentioned about the ML model or feature sets. Also, the paper does not discuss the accuracy of the resulting QoR of the recommendation system.
[54] proposes a totally new approach compared to the already discussed ones. Instead of suggesting hyperparameters for CAD tools, this work suggests which will be the best tool to use to meet the specs of a particular design. They address the problem as a binary classification problem; for placement purposes, they selected two popular academic placers, GPlace3.0 [75] and UTPlaceF [74]. This work does not discuss the quality of the generated placed circuits in terms of timing, area, or power.

C. SUMMARY AND FUTURE DIRECTIONS
In Table 4, we have summarized the work done in the ML-based CAD tool parameter optimization domain. Except for [29], all the other works are classification or regression based. In the classification works discussed in [28], [31], the authors predict whether the QoR (area/timing, etc.) will be met or not using a certain set of tool parameters. Similarly, in [54], a binary classification method is used to suggest which of the two academic placers is suitable to place and route a certain design. Most of the works [28], [29], [31] work only at one stage of the physical design flow, i.e., either at synthesis, placement or routing; they cannot optimize parameters in multiple stages in parallel. This issue is addressed in LAMBDA [55], where they try to simultaneously autotune the hyperparameters in multiple stages, from logic synthesis to routing. Bayesian optimization is a popular hyperparameter optimization method which is widely used in ML frameworks to select optimal parameters. More research can be done on using Bayesian optimization methods for automatically recommending parameters at different stages of the EDA flow.

TABLE 4. Summary of CAD tool autotuning works
ML Method | Stage | Recommendation | References
Bayesian | LS, PR | Tool Parameters | [29], [30]
Classification | LS, PR | CAD flow | [54]
Classification | LS, PR | Tool Parameters | [28], [31]
Regression | LS, PR concurrent | Tool Parameters | [32], [55]
Note: LS: Logic Synthesis; PR: Place and Route

VII. MACHINE LEARNING TO PREDICT POWER, PERFORMANCE AND AREA ESTIMATION IN FPGA DESIGNS
Power, performance, and area estimation are crucial in any VLSI physical design flow. The earlier we predict the post-route QoR of a design, the better, so that the designer can go back and set the tool parameters to meet the specs. A few recent works use analytical or mathematical models or machine learning models to predict the post-route QoR of both HLS and RTL designs. Works like COMBA [20], Lin-Analyzer [19] and Aladdin [21] use analytical methods to estimate the post-route PPA of HLS designs, while "Fast and Accurate" [22], Pyramid [25], XPPE [23], HLSPredict [24] and Powergear [63] use ML-based methods to estimate the post-route QoR of HLS designs. In [51] and [50], ML has been used to predict timing in RTL-based FPGA designs, while in [62] and [63], GNN is used to predict power in FPGA designs.

A. ANALYTICAL METHODS FOR PPA ESTIMATION OF HLS DESIGNS
Aladdin [21], COMBA [20], and Lin-Analyzer [19] are three recent works that estimate power, performance, or area for HLS designs, either individually or together, using analytical and mathematical approaches without the need of any ML models. Aladdin [21] is a pre-RTL power-performance simulator designed to enable rapid design space exploration of accelerator-centric systems. This framework takes high-level language descriptions of algorithms as inputs and uses dynamic data dependence graphs (DDDG) to represent an accelerator without generating RTL. Starting with an unconstrained program DDDG, which corresponds to an initial representation of accelerator hardware, Aladdin applies optimizations and constraints to the graph to create a realistic model of accelerator activity. Aladdin generates different optimized DDDG graphs and applies various mathematical calculations to estimate the cycle-wise power, timing, and area requirements.
COMBA [20] is an analytical engine that is used to suggest pragmas during DSE. To generate the pragma recommendations, COMBA uses a database called the recursive data collector (RDC) and a metric-guided design space exploration (MGDSE) algorithm. The input to the tool is high-level HLS code written in C/C++. COMBA converts the HLS code into LLVM IR [73] code and the corresponding control and dataflow graphs (CDFG). By running analysis on the LLVM IR code and the CDFGs, they create a mathematical model to estimate an FPGA design's latency and resource requirement. Based on the estimated resource and latency requirement, the MGDSE suggests pragmas for the next iteration. One major drawback of this work is that it compares its estimates against the estimation done by Xilinx Vivado HLS after the C-synthesis stage. Since the values reported by Vivado HLS are highly inaccurate [22], [25], [96], the estimation of COMBA is also inaccurate. However, their latency estimation is very good as compared to post-HLS estimation by the Vivado
TABLE 5. Summary of Machine Learning based Power, Performance and Area prediction tools
traces for model training: sub-traces are epochs of workload execution time in the form of CPU performance-counter measurements for the host, and FPGA cycle counts for the target. The authors perform a detailed analysis of a design running on an x86 CPU by identifying the CPU microarchitectural subsystems and correlating them with the post-route performance and power when the same design runs on the FPGA.

E. SUMMARY AND FUTURE DIRECTIONS
In Table 5, we have summarized the works done to predict FPGA designs' PPA using analytical methods and ML-based models. Most of the work discussed in Table 5 predicts the post-route behavior of HLS designs. Models like XPPE, "Fast and Accurate" and Pyramid [22], [23], [25] rely on post-C-synthesis log files to extract features; hence they are slower as compared to LLVM-based models like [20], [26], [27], [96]. Most of the works discussed here use HLS as their design entry because of the fast synthesis and implementation time and the large availability of benchmarks. The RTL-based works shown in Table 5 [21], [51], [62] use features from synthesized gate-level netlists to predict a design's power and timing requirements. While most of the works discussed here predict either resource, timing or latency, the work in [96], [98] is a comprehensive one that can predict all four post-route QoR metrics of a design, viz., latency, resource requirement, timing, and power. There is no work in which more advanced ML models like BERT, Transformers, or autoencoders are used to enhance and predict the performance of EDA tools; using these modern techniques, post-route PPA estimations could be made faster and more accurate simultaneously. In a very recent work [99] (August 2023), a Large Language Model (LLM) has been used to generate Verilog code based on a design description in plain English; these codes can be synthesized with both FPGA and ASIC design tools. In another work, the authors show, for the first time, how to generate hardware security properties automatically using LLMs. They created their own BERT model, Hardware security-BERT, which can read SoC design documentation and generate pertinent hardware security-relevant properties [100]. Even though these two works are not directly applied to predicting PPA in FPGA CAD tools, they are promising works that show how the technology is advancing towards LLMs and other language models.
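A practical point implicit in the summary above is that a PPA predictor is only useful if it generalizes to designs it has never seen during training. A hedged sketch of a grouped (leave-designs-out) evaluation using scikit-learn is shown below; the features, groups, and targets are synthetic placeholders and not taken from any surveyed dataset.

```python
# Minimal, hypothetical sketch: grouped cross-validation for a PPA predictor so
# that every sample of a benchmark design is held out together. Features,
# groups, and the target below are synthetic placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GroupKFold
from sklearn.metrics import mean_absolute_percentage_error

rng = np.random.default_rng(1)
n_designs, samples_per_design = 8, 40
X = rng.uniform(0.0, 1.0, size=(n_designs * samples_per_design, 5))   # e.g., report features
groups = np.repeat(np.arange(n_designs), samples_per_design)          # one group per design
y = 5.0 + X @ np.array([4.0, 1.0, 0.5, 2.0, 0.1]) + 0.3 * groups \
    + rng.normal(0.0, 0.05, len(groups))

errors = []
for tr, te in GroupKFold(n_splits=4).split(X, y, groups):
    model = GradientBoostingRegressor(random_state=0).fit(X[tr], y[tr])
    errors.append(mean_absolute_percentage_error(y[te], model.predict(X[te])))
print("per-fold MAPE on unseen designs:", [round(e, 3) for e in errors])
```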
guide the RL agent based on four resource types in place of moves. [59] is quite similar to [56], [60] and [57]. Here also, the authors applied RL to optimize the moves of the SA-based placer in VTR [101], [125]. But the details of the moves are not discussed in the work, nor are the results shown very prominently.
In [60], Malappa et al. applied RL in the detailed placement stage. They addressed the detailed placement problem as a two-stage hierarchical problem: (i) coarse-grained refinement and (ii) fine-grained refinement. They applied RL in Stage 1 to select the optimal sliding window size and the order in which the sliding windows must be rearranged. In Stage 2, Satisfiability Modulo Theories (SMT) are applied for fine-grained refinement. This work is purely RL-based, and they did not combine it with any meta-heuristics-based methods like SA or the Genetic Algorithm. In [120], the authors addressed the placement problem as a CNN problem based on electrostatic density. The authors proposed a CNN-inspired analytical global placement algorithm for large-scale FPGAs. A novel density framework was constructed to remedy the high computation time by casting the 2D electrostatic-based density constraints into a CNN. This is the first and one of the only known works where CNN has been directly applied to the placement problem.

C. MACHINE LEARNING IN FPGA ROUTING
Although ML has been widely applied to predict congestion, routing violations, or DRC violations in routing, as discussed in Section IV, there are very few works where ML is directly applied to guide the routing algorithm. In one such work [61], the authors applied RL to optimize the routing algorithm and compared it against the conventional negotiated congestion routing method used in the PathFinder [111] routing algorithm. They used a cost function similar to [111] and optimized it using RL. However, the paper does not discuss the details of the RL algorithm and how the state table is created. This is the only work where the authors tried to apply ML in routing instead of merely predicting congestion or routing violations [37], [38], [41], [71].

D. SUMMARY AND FUTURE DIRECTION
In Table 6, we have summarized the application of ML in different stages of the FPGA physical design flow. Reinforcement Learning is very similar to human learning: just like humans learn from their mistakes and retrain themselves, RL does so as well. Hence, RL is very suitable for physical synthesis optimization in EDA. Based on different reward functions, which are generally the cost functions of heuristic algorithms, the RL agent guides the optimizer. Hence, it is widely used in placement [56], [57], [59], [60], although its application in routing and synthesis is limited. Future works in this direction may include generative AI like GANs, which would generate the placed circuit based on a description of requirements like timing and area. Just like GAN is used to create virtual congestion maps in [40], [41], GAN could also be used for routing on the placed circuit in the future. In Table 6, we classify the three stages of physical design and the ML algorithms applied in each stage.

TABLE 6. Summary of ML based Physical Design Tools
Stage | ML Algorithm | Papers
Logic Synthesis | RL | [121], [123], [124]
Logic Synthesis | RL+GCN | [123]
Logic Synthesis | Regression | [50]
Placement | RL | [56]–[59]
Placement | CNN | [120]
Routing | RL+GCN | [61]

IX. CONCLUSION
ML is a growing field in the current technology domain, with many applications in computer vision, image processing, audio and video processing, NLP, etc. This field is now also being applied in the EDA domain. CAD-based IC design technology has been around since the 1980s, and a vast amount of data is available at various stages of the physical design flow for various technology nodes. If we can utilize the data from past IC designs for ASICs and FPGAs, we can build great ML-based tools, resulting in fast IC design tools. Another challenge of ML-based IC design tools is the availability of specialists: the designer must have exceptionally good knowledge of both the VLSI design flow and ML technologies. As discussed in this paper, almost all the available ML algorithms are now being applied in EDA tool design. However, most of the work discussed in this paper is still nascent in academic labs, and making it commercially available with vendor-supplied tools may take some time. Two recent tools, AMD Xilinx Vitis AI [126] and Synopsys DSO [127], use ML as part of their implementation algorithms.

Acronyms
ADRS : Average Distance to Reference Set
ANN : Artificial Neural Network
CAD : Computer-Aided Design
CDFG : Control and Dataflow Graph
CNN : Convolutional Neural Network
DRC : Design Rules Check
DSE : Design Space Exploration
EDA : Electronic Design Automation
GA : Genetic Algorithm
GAN : Generative Adversarial Network
GCN : Graph Convolutional Network
HLS : High Level Synthesis
HPWL : Half Perimeter Wirelength
IC : Integrated Circuit
IR : Intermediate Representation
MAPE : Mean Absolute Percentage Error
MARS : Multivariate Adaptive Regression Splines
ML : Machine Learning
MLP : Multi Layer Perceptron
PPA : Power Performance and Area
RL : Reinforcement Learning
RTL : Register Transfer Level
SA : Simulated Annealing
TNS : Total Negative Slack
XGB : Gradient Boost Regression

REFERENCES
[1] G. E. Moore, "Cramming more components onto integrated circuits, reprinted from Electronics, volume 38, number 8, April 19, 1965, pp. 114 ff.," IEEE Solid-State Circuits Society Newsletter, vol. 11, no. 3, pp. 33–35, 2006.
[2] R. H. Dennard, F. H. Gaensslen, H. Yu, V. L. Rideout, E. Bassous, and A. R. LeBlanc, "Design of ion-implanted MOSFET's with very small physical dimensions," IEEE Journal of Solid-State Circuits, pp. 1–10, 2015.
[3] P. Kannan, S. Balachandran, and D. Bhatia, "On metrics for comparing interconnect estimation methods for FPGAs," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 12, no. 4, pp. 381–385, April 2004.
[4] S. Balachandran and D. Bhatia, "A priori wirelength and interconnect estimation based on circuit characteristics," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1054–1065, 2005.
[5] P. Kannan, S. Balachandran, and D. Bhatia, "fGREP - fast generic routing demand estimation for placed FPGA circuits," in Field-Programmable Logic and Applications, G. Brebner and R. Woods, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2001, pp. 37–47.
[6] A. Caldwell, A. Kahng, S. Mantik, I. Markov, and A. Zelikovsky, "On wirelength estimations for row-based placement," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 18, no. 9, pp. 1265–1278, 1999.
[7] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the 25th International Conference on Neural Information Processing Systems - Volume 1, ser. NIPS'12. Red Hook, NY, USA: Curran Associates Inc., 2012, p. 1097–1105.
[8] S. Liu and W. Deng, "Very deep convolutional neural network based image classification using small training sample size," in 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR), 2015, pp. 730–734.
[9] T. P. Nagarhalli, V. Vaze, and N. K. Rana, "Impact of machine learning in natural language processing: A review," in 2021 Third International Conference on Intelligent Communication Technologies and Virtual Mobile Networks (ICICV), 2021, pp. 1529–1534.
[10] A. Le Glaz, Y. Haralambous, D.-H. Kim-Dufor, P. Lenca, R. Billot, T. C. Ryan, J. Marsh, J. DeVylder, M. Walter, S. Berrouiguet, and C. Lemey, "Machine learning and natural language processing in mental health: Systematic review," J Med Internet Res, vol. 23, no. 5, p. e15708, May 2021.
[11] E. Mankolli and V. Guliashki, "Machine learning and natural language processing: Review of models and optimization problems," in ICT Innovations 2020. Machine Learning and Applications, V. Dimitrova and I. Dimitrovski, Eds. Springer International Publishing, 2020, pp. 71–86.
[12] J. Long, X. Wang, W. Zhou, J. Zhang, D. Dai, and G. Zhu, "A comprehensive review of signal processing and machine learning technologies for UHF PD detection and diagnosis (I): Preprocessing and localization approaches," IEEE Access, vol. 9, pp. 69 876–69 904, 2021.
[13] Rahul, "Review of signal processing techniques and machine learning algorithms for power quality analysis," Advanced Theory and Simulations, vol. 3, no. 10, p. 2000118, 2020.
[14] X. Dong, D. Thanou, L. Toni, M. Bronstein, and P. Frossard, "Graph signal processing for machine learning: A review and new perspectives," IEEE Signal Processing Magazine, vol. 37, no. 6, pp. 117–127, 2020.
[15] (2019) scikit-learn, ML in Python. [Online]. Available: https://scikit-learn.org/
[16] (2021) An end-to-end open source machine learning platform. [Online]. Available: https://www.tensorflow.org/
[17] (2021) PyTorch. [Online]. Available: https://www.pytorch.org/
[18] X. Yang, R. Kastner, and M. Sarrafzadeh, "Congestion estimation during top-down placement," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 21, no. 1, pp. 72–80, 2002.
[19] G. Zhong, A. Prakash, Y. Liang, T. Mitra, and S. Niar, "Lin-Analyzer: A high-level performance analysis tool for FPGA-based accelerators," in 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC), 2016, pp. 1–6.
[20] J. Zhao, L. Feng, S. Sinha, W. Zhang, Y. Liang, and B. He, "COMBA: A comprehensive model-based analysis framework for high level synthesis of real applications," in 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2017, pp. 430–437.
[21] Y. S. Shao, B. Reagen, G.-Y. Wei, and D. Brooks, "Aladdin: A pre-RTL, power-performance accelerator simulator enabling large design space exploration of customized architectures," in Proceeding of the 41st Annual International Symposium on Computer Architecture, ser. ISCA '14. IEEE Press, 2014, p. 97–108.
[22] S. Dai, Y. Zhou, H. Zhang, E. Ustun, E. F. Young, and Z. Zhang, "Fast and accurate estimation of quality of results in high-level synthesis with machine learning," in 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2018, pp. 129–132.
[23] H. M. Makrani, H. Sayadi, T. Mohsenin, S. Rafatirad, A. Sasan, and H. Homayoun, "XPPE: Cross-platform performance estimation of hardware accelerators using machine learning," in Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019, p. 727–732.
[24] K. O'Neal, M. Liu, H. Tang, A. Kalantar, K. DeRenard, and P. Brisk, "HLSPredict: Cross platform performance prediction for FPGA high-level synthesis," in 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2018, pp. 1–8.
[25] H. Mohammadi Makrani, F. Farahmand, H. Sayadi, S. Bondi, S. M. Pudukotai Dinakarrao, H. Homayoun, and S. Rafatirad, "Pyramid: Machine learning framework to estimate the optimal timing and resource usage of a high-level synthesis design," in 2019 29th International Conference on Field Programmable Logic and Applications (FPL), 2019, pp. 397–403.
[26] N. Wu, Y. Xie, and C. Hao, "IronMan: GNN-assisted design space exploration in high-level synthesis via reinforcement learning," in Proceedings of the 2021 on Great Lakes Symposium on VLSI, 2021, p. 39–44.
[27] E. Ustun, C. Deng, D. Pal, Z. Li, and Z. Zhang, "Accurate operation delay prediction for FPGA HLS using graph neural networks," in Proceedings of the 39th International Conference on Computer-Aided Design, 2020, pp. 1–9.
[28] N. Kapre, H. Ng, K. Teo, and J. Naude, "InTime: A machine learning approach for efficient selection of FPGA CAD tool parameters," in Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, ser. FPGA '15. New York, NY, USA: ACM, 2015, pp. 23–26.
[29] Y. Ma, Z. Yu, and B. Yu, "CAD tool design space exploration via Bayesian optimization," CoRR, vol. abs/1912.06460, 2019.
[30] B. Reagen, J. M. Hernández-Lobato, R. Adolf, M. Gelbart, P. Whatmough, G.-Y. Wei, and D. Brooks, "A case for efficient accelerator design space exploration via Bayesian optimization," in 2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), 2017, pp. 1–6.
[31] Q. Yanghua, H. Ng, and N. Kapre, "Boosting convergence of timing closure using feature selection in a learning-driven approach," in 2016 26th International Conference on Field Programmable Logic and Applications (FPL), 2016, pp. 1–9.
[32] J. Kwon, M. M. Ziegler, and L. P. Carloni, "A learning-based recommender system for autotuning design flows of industrial high-performance processors," in 2019 56th ACM/IEEE Design Automation Conference (DAC), 2019, pp. 1–6.
[33] A. Mahapatra and B. C. Schafer, "Machine-learning based simulated annealer method for high level synthesis design space exploration," in Proceedings of the 2014 Electronic System Level Synthesis Conference (ESLsyn), 2014, pp. 1–6.
[34] Z. Wang and B. C. Schafer, "Machine learning to set meta-heuristic specific parameters for high-level synthesis design space exploration," in 2020 57th ACM/IEEE Design Automation Conference (DAC), 2020, pp. 1–6.
[35] J. Kwon and L. P. Carloni, "Transfer learning for design-space exploration with high-level synthesis," in Proceedings of the 2020
[36] H.-Y. Liu and L. P. Carloni, “On learning-based methods for design-space exploration with high-level synthesis,” in 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC), 2013, pp. 1–7.
[37] D. Maarouf, A. Alhyari, Z. Abuowaimer, T. Martin, A. Gunter, G. Grewal, S. Areibi, and A. Vannelli, “Machine-learning based congestion estimation for modern FPGAs,” in 2018 28th International Conference on Field Programmable Logic and Applications (FPL), Aug 2018, pp. 427–4277.
[38] C. Pui, G. Chen, Y. Ma, E. F. Y. Young, and B. Yu, “Clock-aware ultrascale FPGA placement with machine learning routability prediction: (invited paper),” in 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), Nov 2017, pp. 929–936.
[39] J. Zhao, T. Liang, S. Sinha, and W. Zhang, “Machine learning based routing congestion prediction in fpga high-level synthesis,” pp. 1130–1135, 2019.
[40] C. Yu and Z. Zhang, “Painting on placement: Forecasting routing congestion using conditional generative adversarial nets,” in Proceedings of the 56th Annual Design Automation Conference 2019, 2019, pp. 26–31.
[41] M. B. Alawieh, W. Li, Y. Lin, L. Singhal, M. A. Iyer, and D. Z. Pan, “High-definition routing congestion prediction for large-scale fpgas,” in 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC), 2020, pp. 26–31.
[42] G. Huang, J. Hu, Y. He, J. Liu, M. Ma, Z. Shen, J. Wu, Y. Xu, H. Zhang, K. Zhong, X. Ning, Y. Ma, H. Yang, B. Yu, H. Yang, and Y. Wang, “Machine learning for electronic design automation: A survey,” ACM Trans. Des. Autom. Electron. Syst., pp. 1–46, 2021.
[43] D. S. Lopera, L. Servadei, G. N. Kiprit, S. Hazra, R. Wille, and W. Ecker, “A survey of graph neural networks for electronic design automation,” in 2021 ACM/IEEE 3rd Workshop on Machine Learning for CAD (MLCAD), 2021, pp. 1–6.
[44] R. Mina, C. Jabbour, and G. E. Sakr, “A review of machine learning techniques in analog integrated circuit design automation,” Electronics, vol. 11, no. 3, pp. 1–20, 2022.
[45] V. Hamolia and V. Melnyk, “A survey of machine learning methods and applications in electronic design automation,” in 2021 11th International Conference on Advanced Computer Information Technologies (ACIT), 2021, pp. 757–760.
[46] A. B. Kahng, “Machine learning applications in physical design: Recent results and directions,” in Proceedings of the 2018 International Symposium on Physical Design, 2018, p. 68–73.
[47] M. Pandey, “Machine learning and systems for building the next generation of eda tools,” in 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), 2018, pp. 411–415.
[48] H. Ren, B. Khailany, M. Fojtik, and Y. Zhang, “Machine learning and algorithms: Let us team up for eda,” IEEE Design & Test, vol. 40, no. 1, pp. 70–76, 2023.
[49] A. B. Kahng, “Machine learning for cad/eda: The road ahead,” IEEE Design & Test, vol. 40, no. 1, pp. 8–16, 2023.
[50] H. Hu, J. Hu, F. Zhang, B. Tian, and I. Bustany, “Machine-learning based delay prediction for fpga technology mapping,” in Proceedings of the 24th ACM/IEEE Workshop on System Level Interconnect Pathfinding, ser. SLIP ’22, 2023.
[51] T. Martin, G. Grewal, and S. Areibi, “A machine learning approach to predict timing delays during fpga placement,” in 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2021, pp. 124–127.
[52] G. Singh, D. Diamantopoulos, J. Gómez-Luna, S. Stuijk, H. Corporaal, and O. Mutlu, “Leaper: Fast and accurate fpga-based system performance prediction via transfer learning,” in 2022 IEEE 40th International Conference on Computer Design (ICCD), 2022, pp. 499–508.
[53] L. Ferretti, J. Kwon, G. Ansaloni, G. D. Guglielmo, L. P. Carloni, and L. Pozzi, “Leveraging prior knowledge for effective design-space exploration in high-level synthesis,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 11, pp. 3736–3747, 2020.
[54] A. Al-hyari, Z. Abuowaimer, D. Maarouf, S. Areibi, and G. Grewal, “An effective fpga placement flow selection framework using machine learning,” in 2018 30th International Conference on Microelectronics (ICM), 2018, pp. 164–167.
[55] E. Ustun, S. Xiang, J. Gui, C. Yu, and Z. Zhang, “Lamda: Learning-assisted multi-stage autotuning for fpga design closure,” in 2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2019, pp. 74–77.
[56] M. A. Elgammal, K. E. Murray, and V. Betz, “Rlplace: Using reinforcement learning and smart perturbations to optimize fpga placement,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 41, no. 8, pp. 2532–2545, 2022.
[57] M. A. Elgammal, K. E. Murray, and V. Betz, “Learn to place: Fpga placement using reinforcement learning and directed moves,” in 2020 International Conference on Field-Programmable Technology (ICFPT), 2020, pp. 85–93.
[58] K. E. Murray and V. Betz, “Adaptive fpga placement optimization via reinforcement learning,” in 2019 ACM/IEEE 1st Workshop on Machine Learning for CAD (MLCAD), 2019, pp. 1–6.
[59] J. Zhang, F. Deng, and X. Yang, “Fpga placement optimization with deep reinforcement learning,” in 2021 2nd International Conference on Computer Engineering and Intelligent Control (ICCEIC), 2021, pp. 73–76.
[60] U. Mallappa, S. Pratty, and D. Brown, “Rlplace: Deep rl guided heuristics for detailed placement optimization,” in 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2022, pp. 120–123.
[61] U. Farooq, N. Ul Hasan, I. Baig, and M. Zghaibeh, “Efficient fpga routing using reinforcement learning,” in 2021 12th International Conference on Information and Communication Systems (ICICS), 2021, pp. 106–111.
[62] Y. Zhang, H. Ren, and B. Khailany, “Grannite: Graph neural network inference for transferable power estimation,” in 2020 57th ACM/IEEE Design Automation Conference (DAC), 2020, pp. 1–6.
[63] Z. Lin, Z. Yuan, J. Zhao, W. Zhang, H. Wang, and Y. Tian, “Powergear: Early-stage power estimation in fpga hls via heterogeneous edge-centric gnns,” in Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, ser. DATE ’22, 2022, p. 1341–1346.
[64] N. Wu, H. Yang, Y. Xie, P. Li, and C. Hao, “High-level synthesis performance prediction using gnns: Benchmarking, modeling, and advancing,” in Proceedings of the 59th ACM/IEEE Design Automation Conference, 2022, p. 49–54.
[65] J. H. Friedman, “Multivariate adaptive regression splines,” The Annals of Statistics, pp. 1–67, 1991.
[66] J. Elith, J. R. Leathwick, and T. Hastie, “A working guide to boosted regression trees,” Journal of Animal Ecology, pp. 802–813, Jul. 2008.
[67] M. Defferrard, X. Bresson, and P. Vandergheynst, “Convolutional neural networks on graphs with fast localized spectral filtering,” CoRR, vol. abs/1606.09375, 2016. [Online]. Available: http://arxiv.org/abs/1606.09375
[68] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” CoRR, vol. abs/1609.02907, p. 101–108, 2016.
[69] K. Arulkumaran, M. P. Deisenroth, M. Brundage, and A. A. Bharath, “Deep reinforcement learning: A brief survey,” IEEE Signal Processing Magazine, pp. 26–38, Nov. 2017.
[70] L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: A survey,” Journal of Artificial Intelligence Research, vol. 4, no. 1, p. 237–285, May 1996.
[71] P. Goswami and D. Bhatia, “Congestion prediction in fpga using regression based learning methods,” Electronics, vol. 10, no. 16, 2021.
[72] S. Yang, A. Gayasen, C. Mulpuri, S. Reddy, and R. Aggarwal, “Routability-driven FPGA Placement Contest,” in Proceedings of the 2016 International Symposium on Physical Design, ser. ISPD ’16. New York, NY, USA: ACM, 2016, pp. 139–143.
[73] C. Lattner and V. Adve, “Llvm: A compilation framework for lifelong program analysis & transformation,” in 2004 Proceedings of the International Symposium on Code Generation and Optimization, 2004, pp. 75–86.
[74] W. Li, S. Dhar, and D. Z. Pan, “Utplacef: A routability-driven fpga placer with physical and congestion aware packing,” in 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2016, pp. 1–7.
[75] Z. Abuowaimer, D. Maarouf, T. Martin, J. Foxcroft, G. Grewal, S. Areibi, and A. Vannelli, “Gplace3.0: Routability-driven analytic placer for ultrascale fpga architectures,” ACM Trans. Des. Autom. Electron. Syst., vol. 23, no. 5, pp. 1–33, Oct. 2018.
[76] B. C. Schafer and Z. Wang, “High-level synthesis design space exploration: Past, present and future,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2628–2639, 2020.
[77] B. C. Schafer, “Parallel high-level synthesis design space exploration for behavioral ips of exact latencies,” ACM Trans. Des. Autom. Electron. Syst., vol. 22, no. 4, pp. 1–20, May 2017.
[78] B. Carrion Schafer, “Probabilistic multiknob high-level synthesis design space exploration acceleration,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 35, no. 3, pp. 394–406, 2016.
[79] P. Meng, A. Althoff, Q. Gautier, and R. Kastner, “Adaptive threshold non-pareto elimination: Re-thinking machine learning for system level design space exploration on fpgas,” in 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), 2016, pp. 918–923.
[80] P. Goswami, B. C. Schafer, and D. Bhatia, “Machine learning based fast and accurate high level synthesis design space exploration: From graph to synthesis,” Integration, vol. 88, pp. 116–124, 2023.
[81] A. Sohrabizadeh, C. H. Yu, M. Gao, and J. Cong, “Autodse: Enabling software programmers to design efficient fpga accelerators,” in The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, ser. FPGA ’21. New York, NY, USA: Association for Computing Machinery, 2021, p. 147. [Online]. Available: https://doi.org/10.1145/3431920.3439464
[82] S. Liu, F. C. Lau, and B. C. Schafer, “Accelerating fpga prototyping through predictive model-based hls design space exploration,” in Proceedings of the 56th Annual Design Automation Conference 2019, 2019, pp. 1–6.
[83] P. Goswami, M. Shahshahani, and D. Bhatia, “Robust estimation of fpga resources and performance from cnn models,” in 2022 35th International Conference on VLSI Design and 2022 21st International Conference on Embedded Systems (VLSID), 2022, pp. 144–149.
[84] Y. Hara, H. Tomiyama, S. Honda, H. Takada, and K. Ishii, “Chstone: A benchmark program suite for practical c-based high-level synthesis,” in 2008 IEEE International Symposium on Circuits and Systems, 2008, pp. 1192–1195.
[85] B. Reagen, R. Adolf, Y. S. Shao, G. Wei, and D. Brooks, “Machsuite: Benchmarks for accelerator design and customized architectures,” in 2014 IEEE International Symposium on Workload Characterization (IISWC), 2014, pp. 1–6.
[86] L.-N. Pouchet, “Polybench benchmarks,” https://web.cse.ohio-state.edu/~pouchet.2/software/polybench/, 2020.
[87] B. C. Schafer and A. Mahapatra, “S2cbench: Synthesizable systemc benchmark suite for high-level synthesis,” IEEE Embedded Systems Letters, pp. 53–56, 2014.
[88] P. Goswami, M. Shahshahani, and D. Bhatia, “Mlsbench: A synthesizable dataset of hls designs to support ml based design flows,” in The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020, pp. 1–6.
[89] P. Goswami et al., “Mlsbench: A benchmark set for machine learning based fpga hls design flows,” in 2022 IEEE 13th Latin America Symposium on Circuits and System (LASCAS), 2022, pp. 1–4.
[90] Y. Zhou, U. Gupta, S. Dai, R. Zhao, N. Srivastava, H. Jin, J. Featherston, Y.-H. Lai, G. Liu, G. A. Velasquez, W. Wang, and Z. Zhang, “Rosetta: A realistic high-level synthesis benchmark suite for software programmable fpgas,” in Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, ser. FPGA ’18, 2018, p. 269–278.
[91] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
[92] A. Krizhevsky, V. Nair, and G. Hinton, “The cifar-10 dataset,” University of Toronto, 2009. [Online]. Available: https://www.cs.toronto.edu/~kriz/cifar.html
[93] (2019) LLVM compiler. [Online]. Available: https://www.llvm.org
[94] “Multi-level intermediate representation overview,” https://mlir.llvm.org/, 2020.
[95] N. Kapre, B. Chandrashekaran, H. Ng, and K. Teo, “Driving timing convergence of fpga designs through machine learning and cloud computing,” in 2015 IEEE 23rd Annual International Symposium on Field-Programmable Custom Computing Machines, 2015, pp. 119–126.
[96] P. Goswami and D. Bhatia, “Predicting post-route quality of results estimates for hls designs using machine learning,” in 2022 23rd International Symposium on Quality Electronic Design (ISQED), 2022, pp. 45–50.
[97] Z. Lin, J. Zhao, S. Sinha, and W. Zhang, “Hl-pow: A learning-based power modeling framework for high-level synthesis,” in 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC), 2020, pp. 574–580.
[98] P. Goswami, “Machine learning based prediction in fpga cad,” Ph.D. dissertation, University of Texas at Dallas, May 2022.
[99] S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, and S. Garg, “Verigen: A large language model for verilog code generation,” 2023.
[100] X. Meng, A. Srivastava, A. Arunachalam, A. Ray, P. H. Silva, R. Psiakis, Y. Makris, and K. Basu, “Unlocking hardware security assurance: The potential of llms,” 2023.
[101] V. Betz and J. Rose, “Vpr: A new packing, placement and routing tool for fpga research,” in International Conference on Field-Programmable Logic and Applications, 1997, pp. 213–222.
[102] J. Kleinhans, G. Sigl, F. Johannes, and K. Antreich, “Gordian: Vlsi placement by quadratic programming and slicing optimization,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 10, no. 3, pp. 356–365, 1991.
[103] W. Wang, Q. Meng, and Z. Zhang, “A survey of fpga placement algorithm research,” in 2017 7th IEEE International Conference on Electronics Information and Emergency Communication (ICEIEC), 2017, pp. 498–502.
[104] S.-C. Chen and Y.-W. Chang, “Fpga placement and routing,” in 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2017, pp. 914–921.
[105] G. Sergey, Z. Daniil, and C. Rustam, “Simulated annealing based placement optimization for reconfigurable systems-on-chip,” in 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), 2019, pp. 1597–1600.
[106] J. Yuan, J. Chen, L. Wang, X. Zhou, Y. Xia, and J. Hu, “Arbsa: Adaptive range-based simulated annealing for fpga placement,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 12, pp. 2330–2342, 2019.
[107] P. Goswami and D. Bhatia, “Floorplanning of partially reconfigurable design on heterogeneous fpga (abstract only),” in Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, ser. FPGA ’16, 2016, pp. 275–275.
[108] W. Li, Y. Lin, and D. Z. Pan, “elfplace: Electrostatics-based placement for large-scale heterogeneous fpgas,” in 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2019, pp. 1–8.
[109] C.-W. Pui, G. Chen, W.-K. Chow, K.-C. Lam, J. Kuang, P. Tu, H. Zhang, E. F. Y. Young, and B. Yu, “Ripplefpga: A routability-driven placement for large-scale heterogeneous fpgas,” in 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2016, pp. 1–8.
[110] T. Liang, G. Chen, J. Zhao, S. Sinha, and W. Zhang, “Amf-placer: High-performance analytical mixed-size placer for fpga,” in 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2021, pp. 1–9.
[111] L. McMurchie and C. Ebeling, “Pathfinder: A negotiation-based performance-driven router for fpgas,” in Proceedings of the 1995 ACM Third International Symposium on Field-Programmable Gate Arrays, 1995, p. 111–117.
[112] J. Wang, J. Mai, Z. Di, and Y. Lin, “A robust fpga router with concurrent intra-clb rerouting,” in Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023, p. 529–534.
[113] K. E. Murray, S. Zhong, and V. Betz, “Air: A fast but lazy timing-driven fpga router,” in 2020 25th Asia and South Pacific Design Automation Conference (ASP-DAC), 2020, pp. 338–344.
[114] M. Shen and G. Luo, “Corolla: Gpu-accelerated fpga routing based on subgraph dynamic expansion,” in Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017, p. 105–114.
[115] Y. Lin, S. Dhar, W. Li, H. Ren, B. Khailany, and D. Z. Pan, “Dreamplace: Deep learning toolkit-enabled gpu acceleration for modern vlsi placement,” in 2019 56th ACM/IEEE Design Automation Conference (DAC), 2019, pp. 1–6.
[116] A. Ludwin, V. Betz, and K. Padalia, “High-quality, deterministic parallel placement for fpgas on commodity hardware,” in Proceedings of the 16th International ACM/SIGDA Symposium on Field Programmable Gate Arrays, ser. FPGA ’08, 2008, p. 14–23.
[117] C. Fobel, G. Grewal, and D. Stacey, “A scalable, serially-equivalent, high-quality parallel placement methodology suitable for modern multicore and gpu architectures,” in 2014 24th International Conference on Field Programmable Logic and Applications (FPL), 2014, pp. 1–8.
[118] M. An, J. G. Steffan, and V. Betz, “Speeding up fpga placement: Parallel algorithms and methods,” in 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines, 2014.