
Neural Network based ECG Anomaly Detection on FPGA and Trade-off Analysis

Matthias Wess, Sai Manoj P. D., and Axel Jantsch


Institute of Computer Technology
TU Wien, Austria
Email: [email protected], {sai.dinakarrao,axel.jantsch}@tuwien.ac.at

Abstract—This paper presents FPGA-based ECG arrhythmia detection using an Artificial Neural Network (ANN). The objective is to implement a neural network based machine learning algorithm on FPGA to detect anomalies in ECG signals with better performance and accuracy compared to statistical methods. An implementation with Principal Component Analysis (PCA) for feature reduction and a multi-layer perceptron (MLP) for classification proved superior to other algorithms. For the implementation on FPGA, the effects of several parameters and simplifications on performance, accuracy and power consumption were studied. Piecewise linear approximation of the activation functions and a fixed point implementation were effective methods to reduce the amount of required resources. The resulting neural network, with twelve inputs and six neurons in the hidden layer, achieved, in spite of the simplifications, the same overall accuracy as simulations with floating point number representation. An accuracy of 99.82% was achieved on average for the MIT-BIH database.

I. INTRODUCTION

An Electrocardiogram (ECG) tracks the electrical activity of the heart over time. It represents the physiological state of the heart and is therefore the most important method for diagnosing heart diseases [1]. Some of the major challenges in detecting anomalies in ECG signals are: noise from measuring electrodes, loose contacts and mechanical disturbances; symptoms of anomalies might not show up all the time; and taking measurements over a long time may not be feasible.

Advances in the efficiency of artificial neural networks (ANN) have led to their adoption in various applications. Several different ANN models, such as multi-layer perceptrons (MLP) [2], [3], [4], modular neural networks (MNN) [5], general feed-forward neural networks (GFFNN) [5], [6], radial basis function neural networks (RBFNN) and probabilistic neural networks (PNN) [4], have been implemented for ECG classification. In addition to the neural network itself, the optimization of preprocessing and feature selection holds the most potential for improvement. In [2], a combination of fuzzy C-means clustering and principal component analysis (PCA) was successfully implemented, with a significant improvement in accuracy and dimensionality reduction of the input vectors.

For neural network based ECG anomaly detection on FPGA, a low number of input and hidden-layer neurons is crucial for an effective implementation. Several successful implementations exist, including optimized designs like [7]. As full implementations of ANNs are computationally intensive, an optimized approach is needed for hardware implementation and is presented in this paper. It is of paramount importance to understand the impact of the ANN parameters while designing the system.

This paper proposes the optimization of a neural network with piecewise linearly approximated transfer functions and fixed point arithmetic. For the selection of the most suitable fixed point precision, an extensive trade-off study was performed, discussing the merits and demerits of high/low precision data types and detailing their influence on accuracy and required resources. The paper provides information on how to effectively reduce the amount of resources required to implement an ANN on FPGA for the task of ECG anomaly detection.

II. IMPLEMENTATION

For an FPGA implementation of an ANN, an architecture was developed which allows for a fast development process and high flexibility. Employing High Level Synthesis (HLS) offers the flexibility to study the effects of data type precision and of the size of the neural network.

A. System Architecture

Before presenting the system architecture, we describe the basic steps and workflow involved in ECG anomaly detection. It comprises three phases: feature extraction; principal component analysis (PCA), which optimizes the input data for processing; and an ANN for anomaly detection. Feature extraction involves the detection of turning points in the ECG signal. To reduce the computational cost, the extracted feature set is reduced to a lower dimension using PCA, and this data is provided to a multi-layer perceptron (MLP) for anomaly detection. An MLP is a fully connected ANN with at least one hidden layer, in which each node is connected to every node in the next and previous layers. The hidden layers enable the MLP to perform a nonlinear mapping between an input vector and an output vector [8]. The whole data flow is presented in Figure 1.

Fig. 1. Data flow for ECG anomaly detection

As shown in Figure 2, the data is preprocessed in Matlab on the host PC and then transferred to the FPGA, which implements the MLP. The feature vectors are transferred via Ethernet to the Zynq Processing System. Training and testing of the neural network is entirely controlled by the integrated ARM processor of the Zynq. Classification is performed by the hardware implementation of the MLP. To ensure fast data transfer between the ARM processor and the MLP, Block RAM of

978-1-4673-6853-7/17/$31.00 ©2017 IEEE


the FPGA is used. Input and output of the entire process is controlled by a MATLAB GUI.

Fig. 2. Hardware/software architecture: the feature vector is computed in MATLAB and transmitted via Ethernet to the Zynq FPGA, which implements the MLP and performs its training

Fig. 3. For every heartbeat, 181 samples around the R-peak and the RR-interval lengths are used

B. Dataset

For verification of the algorithm, the MIT-BIH arrhythmia database [9] was used. The database contains forty-eight 30-minute ambulatory ECG recordings, which also include less common but clinically significant arrhythmias. The database is therefore suitable to evaluate the performance and accuracy of the developed hardware for a wide spectrum of heart diseases [10]. For classification, a variety of feature vectors have been compared in terms of best classification accuracy. As illustrated in Figure 3, 181 samples around the R-peak were used. Additionally, for every heartbeat, the RR-intervals to the preceding and the succeeding beats were derived from the MIT-BIH arrhythmia database annotation files. To obtain the feature vector, the selected samples are reduced with Principal Component Analysis (PCA), a statistical procedure that converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables. Implementing fuzzy clustering as in [2] did not prove efficient, as in this work the heartbeats were classified into only two classes.

Fig. 4. The feature vector consists of n principal components and the RR-interval lengths to the preceding and succeeding R-peaks

C. Training Algorithm

In general, there exists no training algorithm that fits every task. Problems of backpropagation techniques are slow convergence and the chance for the algorithm to terminate in a local minimum. Paulin et al. [11] compared the performance of training algorithms for feed-forward artificial neural networks for the classification of breast cancer. Levenberg-Marquardt (LM) performs slightly better than resilient backpropagation (RPROP) [12] and Conjugate Gradient (CG) in terms of accuracy of diagnosis. In [13], a comparison of RPROP, CG and LM is performed for the tasks of stream-flow forecasting and determination of lateral stress in cohesionless soils. In the two case studies, RPROP outperforms the Levenberg-Marquardt algorithm in terms of accuracy during the testing phase. Even though the two studied tasks differ from ECG anomaly detection, the study demonstrates that RPROP offers better generalization ability. Therefore, RPROP was chosen as the neural network training algorithm for this paper.

D. Activation Functions

For artificial neural networks, several activation functions are available. Choosing the best fitting activation function secures the best result for the problem. In classification problems, the sigmoid activation function for hidden layer neurons and the softmax function [14] for output neurons are commonly used. These functions also meet all the requirements for classifying heartbeats.

a) Hidden Layer: In neural networks, the hidden layer activation function is one of the most frequently evaluated functions, and therefore a reduction of its computation time was desired for the hardware implementation. As shown in [15], Piecewise Linear Approximation (PLA) performs better than exchanging the desired tansig activation function for logsig or ramp characteristics. Considering that the final implementation was planned with fixed point arithmetic, the piecewise linear approximation was performed with gradients that reduce the required multiplications to simple shift operations [16]. While previous works used a maximum of seven ranges, we propose the following symmetrical function to reduce the error of the approximated function. This leads to a better approximation of the function and a better overall performance of the neural network compared to other proposed implementations. The piecewise linearly approximated hyperbolic tangent function (PLAtanh) is defined in (1), with the borders given in (2):

PLAtanh(x) =
    1                     for x >= a
    x/4096 + 0.9986377    for a >= x > b
    x/32   + 0.905        for b >= x > c
    x/8    + 0.715625     for c >= x > d
    x/4    + 0.53125      for d >= x > e
    x/2    + 0.25         for e >= x > f
    x                     for f >= x > g
    x/2    - 0.25         for g >= x > h
    x/4    - 0.53125      for h >= x > i
    x/8    - 0.715625     for i >= x > j
    x/32   - 0.905        for j >= x > k
    x/4096 - 0.9986377    for k >= x > l
    -1                    for l >= x                      (1)

a = 5.5799959,  b = 3.02,  c = 2.02,  d = 1.475,
e = 1.125,      f = 0.5,   g = -0.5,  h = -1.125,        (2)
i = -1.475,     j = -2.02, k = -3.02, l = -5.5799959

Due to the selected restrictions on the gradients of the linear segments, the biggest error occurs at x = 0.5 and is approximately 0.03788.
Fig. 5. Mean squared error over the training epochs for different activation function combinations (PLAtanh/NPLAtanh, tanh/Ntanh, PLAtanh/softmax, tanh/softmax)

Fig. 6. Accuracy with 16-bit data (8 fraction bits) for record 104

Fig. 7. Accuracy for 8 input neurons for record 104 of the MIT-BIH database

b) Output Layer: To normalize the output layer results, the hyperbolic tangent function is used with

Ntanh(x) = (tanh(x) + 1) / 2                              (3)

An implementation in this form not only allows reuse of the already implemented simplified hyperbolic tangent function, but also reduces complexity. To evaluate the performance of this activation function in comparison to the exact hyperbolic tangent function, several neural networks were trained once with exact functions and once with piecewise linear approximations, in both hidden and output layers. As a typical example, Figure 5 shows the mean squared error for one specific record of the database for different activation functions. The error was logged after every training iteration with the RPROP algorithm. The figure shows that using the piecewise linearly approximated activation function for hidden layer neurons does not worsen the results in comparison to the exact implementation. Replacing the softmax function with Ntanh, given in (3), as the activation function for output layer neurons leads to faster convergence during the training phase and to a better fit, due to the small number of output layer neurons. This difference occurs because in backpropagation the influence of all output signals on the results of the softmax function increases the complexity of the algorithm, which was therefore simplified in our application. A fixed point implementation of PLANtanh reduces the number of flip-flops by 90% and the number of LUTs by 80% compared to a floating point implementation. Moreover, the latency was reduced from 31 to 2 clock cycles (Table I). Consequently, we used PLAtanh and Ntanh as activation functions in our experiments for the hidden layers and the output layer, respectively.

TABLE I. Comparison of PLANtanh 32-bit fixed point / tanh floating point implementation

Implementation          DSPs  Flip-Flops   LUTs  Latency
tanh (floating point)     18        1916   3697       31
PLAtanh (fixed point)      0         183    705        2
Reduction               100%       90.5%  80.9%    93.5%

III. EXPERIMENTAL RESULTS

An ANN with twelve input neurons and one hidden layer with six nodes, with PLAtanh as the hidden-layer activation function and Ntanh for the output layer, has been implemented on FPGA using the Vivado HLS tool. With a 24-bit data size, this approach achieves 99.82% accuracy. To analyze the trade-off between data type precision, size, latency and accuracy of the neural network, several differently configured MLPs have been implemented.

A. Classification Accuracy

To optimize the hardware implementation of the Artificial Neural Network, an extensive analysis of the fixed point implementation has been performed by varying the number of input neurons, the number of hidden layer neurons, and the bit width of the fixed point data type. The number of fraction bits for the fixed point data was set to half of the bit width of the selected data type. Figures 6 and 7 illustrate the number of false classifications for record 104 of the MIT-BIH database as functions of three parameters. This particular record was selected because it represents a task of typical complexity and allows a good demonstration of how classification accuracy is influenced by parameter changes. However, for some records 100% classification accuracy is not achievable. When the fixed point implementation of the neural network is trained directly, the training algorithm adapts the weights and biases to the selected precision. Training the neural network in a floating point implementation and transferring the weights and biases onto a fixed point implementation afterwards, by contrast, increases false classifications, since the weights and biases are trained for floating point precision. Figure 6 shows that, independent of the number of inputs, more than three hidden layer neurons do not increase accuracy. For input neurons this boundary lies between ten and thirteen, depending on the number of hidden layer neurons. For Figure 7 the number of input neurons was set to eight. It shows that for the selected number of inputs, a precision of seven fraction bits and eight neurons in the hidden layer already achieve 100% accuracy. For a higher number of inputs, the required number of hidden layer neurons and the bit width decrease, while the neural network still achieves the same classification accuracy. It can be concluded that for every number of hidden layer neurons there are certain thresholds for minimum precision and amount
of inputs. Above these thresholds, increasing the number of input neurons improves the accuracy more than increasing the precision of the data type. For the selected record, the smallest neural network with 100% accuracy consists of eight input neurons and eight hidden layer neurons with a 14-bit fixed point data width. The optimal configuration for the classification of the entire database has to be determined separately.

B. FPGA Performance

Compared to accuracy, the amount of required resources depends more strongly on an optimal implementation during High Level Synthesis. Table II shows the synthesis results for four differently configured neural networks, each implemented with three different data types. In comparison to a floating point implementation, the proposed optimizations lead to a significant reduction of required resources and latency. For an ANN with one input, one hidden and one output layer, it can be noted that for bit widths of 12 and 16, Vivado HLS assigns one DSP to each input and hidden neuron; for a 24-bit width, two DSPs are required for every input and hidden-layer neuron. When the number of hidden layer neurons is increased from four to six, Vivado HLS changes the routing and usage of the DSPs, so the number of used flip-flops and LUTs is reduced. As a drawback, the maximum frequency of the module decreases by 20%. In general, the required resources increase with the number of input and hidden layer neurons.

TABLE II. Required resources and accuracy for record 104

floating point
Input/Hidden  DSPs    FFs   LUTs  Latency  Accuracy
8/6             42   9295  15163     1208    99.81%

12-bit fixed point
Input/Hidden  DSPs    FFs   LUTs  Latency  Accuracy
8/4             12   1729   3945       62    98.56%
8/6             14   1551   1958       85    99.14%
10/4            14   1977   4399       71    99.28%
10/6            16   1613   1963       97    99.64%

16-bit fixed point
Input/Hidden  DSPs    FFs   LUTs  Latency  Accuracy
8/4             12   1801   3653       60    98.87%
8/6             14   1561   1703       83    99.37%
10/4            14   2017   4045       68    99.19%
10/6            16   1646   1708       95    99.77%

24-bit fixed point
Input/Hidden  DSPs    FFs   LUTs  Latency  Accuracy
8/4             24   2056   3910       63    99.05%
8/6             28   1772   1895       87    99.59%
10/4            28   2280   4366       71    99.28%
10/6            32   1926   1932       99      100%

In summary, the number of input and hidden layer neurons, together with the precision of the selected data type, influences the ECG classification accuracy of the neural networks. While increasing the data width does not significantly increase the amount of required resources, a higher number of input and hidden layer neurons requires more LUTs and FIFOs in the FPGA implementation. With respect to the previously simulated accuracies, it can be concluded that the number of input and hidden layer neurons should be increased only after increasing the data precision. Considering these findings, an ANN with twelve inputs, six hidden layer neurons and a 24-bit data width was implemented, classifying the entire MIT-BIH database with 99.82% accuracy.

IV. CONCLUSION

This paper presents a study of the impact of the NN architecture, the activation functions, and the fixed point precision on accuracy, performance, and area usage for FPGA implementations of an ECG anomaly detection algorithm. Piecewise linear approximation of the activation functions was found to be an effective approach to reduce required resources and latency. The final implementation of a 12-6-2 24-bit multi-layer perceptron classifies the entire MIT-BIH database with 99.82% accuracy. Accuracy can be improved by increasing the number of inputs, the number of hidden layer neurons and the fixed point precision. All implementations were performed with High Level Synthesis, leaving potential for manual optimization.

REFERENCES

[1] M. S. Thaler, The Only EKG Book You'll Ever Need. Lippincott Williams & Wilkins, 2010.
[2] R. Ceylan and Y. Özbay, "Comparison of FCM, PCA and WT techniques for classification ECG arrhythmias using artificial neural network," Expert Systems with Applications, vol. 33, no. 2, pp. 286-295, 2007.
[3] V. Dubey and V. Richariya, "A neural network approach for ECG classification," International Journal of Emerging Technology & Advanced Engineering, vol. 3, 2013.
[4] A. E. Zadeh, A. Khazaee, and V. Ranaee, "Classification of the electrocardiogram signals using supervised classifiers and efficient features," Computer Methods and Programs in Biomedicine, vol. 99, no. 2, pp. 179-194, 2010.
[5] S. M. Jadhav, S. L. Nalbalwar, and A. A. Ghatol, "ECG arrhythmia classification using modular neural network model," in IEEE EMBS Conference on Biomedical Engineering and Sciences, 2010.
[6] S. M. Jadhav, S. L. Nalbalwar, and A. A. Ghatol, "Generalized feedforward neural network based cardiac arrhythmia classification from ECG signal data," in Advanced Information Management and Service (IMS), 2010 6th International Conference on. IEEE, 2010, pp. 351-356.
[7] Y. Sun and A. C. Cheng, "Machine learning on-a-chip: A high-performance low-power reusable neuron architecture for artificial neural networks in ECG classifications," Computers in Biology and Medicine, vol. 42, no. 7, pp. 751-757, 2012.
[8] M. W. Gardner and S. Dorling, "Artificial neural networks (the multilayer perceptron): a review of applications in the atmospheric sciences," Atmospheric Environment, vol. 32, no. 14, pp. 2627-2636, 1998.
[9] R. Mark and G. Moody, "MIT-BIH arrhythmia database directory," 1988.
[10] G. B. Moody and R. G. Mark, "The impact of the MIT-BIH arrhythmia database," IEEE Engineering in Medicine and Biology Magazine, vol. 20, no. 3, pp. 45-50, 2001.
[11] F. Paulin and A. Santhakumaran, "Classification of breast cancer by comparing back propagation training algorithms," International Journal on Computer Science and Engineering, vol. 3, no. 1, pp. 327-332, 2011.
[12] M. Riedmiller and H. Braun, "A direct adaptive method for faster backpropagation learning: The RPROP algorithm," in IEEE International Conference on Neural Networks, 1993, pp. 586-591.
[13] Ö. Kişi, "Comparison of three back-propagation training algorithms for two case studies," Indian Journal of Engineering & Materials Sciences, vol. 12, no. 5, pp. 434-442, 2005.
[14] W. Duch and N. Jankowski, "Transfer functions: hidden possibilities for better neural networks," in ESANN, 2001, pp. 81-94.
[15] H. Amin, K. M. Curtis, and B. R. Hayes-Gill, "Piecewise linear approximation applied to nonlinear function of a neural network," IEE Proceedings - Circuits, Devices and Systems, vol. 144, no. 6, pp. 313-317, 1997.
[16] H. Hikawa, "A digital hardware pulse-mode neuron with piecewise linear activation function," IEEE Transactions on Neural Networks, vol. 14, no. 5, pp. 1028-1037, 2003.
