Hardware Implementation of Real Time ECG Analysis Algorithms

This thesis presents the hardware implementation of real-time ECG analysis algorithms, focusing on QRS peak detection using FPGA technology. It highlights the challenges of ECG signal contamination by noise and the need for efficient hardware solutions for real-time analysis. The research demonstrates successful implementations of two QRS detectors on a Xilinx FPGA board, achieving over 90% accuracy with data from the MIT-BIH database.

HARDWARE IMPLEMENTATION OF REAL TIME ECG ANALYSIS

ALGORITHMS

A THESIS SUBMITTED TO THE GRADUATE DIVISION OF THE

UNIVERSITY OF HAWAI'I IN PARTIAL FULFILLMENT OF THE
REQUIREMENTS FOR THE DEGREE OF

MASTER OF SCIENCE

ELECTRICAL ENGINEERING

MAY 2008

Ashish Shukla

Thesis Committee:

Luca Macchiarulo, Chairperson

Olga Boric-Lubecke
Victor Lubecke
We certify that we have read this thesis and that, in our opinion, it is

satisfactory in scope and quality as a thesis for the degree of Master of

Science in Electrical Engineering.

THESIS COMMITTEE

To my mom and dad,

You're the best.

ACKNOWLEDGMENTS

I would like to thank my guide Dr. Macchiarulo for his patience and his numerous

suggestions that helped me to be on course during the entire time of this research. I would

like to thank my family and all my friends for being supportive and understanding and

showing faith in me at all times.

ABSTRACT

ECG is considered to be the standard for heart rate monitoring and for the diagnosis of

various cardiac ailments. A QRS complex is the most striking feature in the ECG signal.

The morphology, duration and height of the QRS complex give a significant amount of

information to the physicians for assessing the state of the heart in different conditions. But

the ECG signals are easily contaminated with noise and artifacts which make it difficult

to analyze them with the naked eye. There are a number of software algorithms available

for extracting different features from an ECG signal. However, with the miniaturization of

devices there is a need for dedicated hardware that can carry out the analysis of

the ECG signal efficiently in real time. ASIC and microcontroller based designs are

expensive and require considerable time to implement. FPGAs support rapid

prototyping and can be reprogrammed very quickly. They are inexpensive, and their easy

testability allows new designs to be implemented and verified faster.

Moreover, FPGA based designs can act as a bridge in migrating from a

software algorithm to a dedicated hardware design based on ASICs.

In this research two QRS peak detector designs have been implemented on a Xilinx

FPGA board. The detectors were tested with five data records obtained from the MIT-

BIH database and have accuracy in excess of 90%. With the successful implementation

of these designs on the FPGA, we present yet another option for the research community

to explore and develop efficient hardware designs for complex biomedical

instrumentation applications.

TABLE OF CONTENTS

ACKNOWLEDGMENTS .......... IV
ABSTRACT .......... V
LIST OF TABLES .......... VIII
LIST OF FIGURES .......... IX
CHAPTER ONE
Introduction .......... 1
CHAPTER TWO
Electrical Conduction in Heart .......... 3
2.1. Cardiac Action Potential .......... 4
2.2. Electrical Conduction in Heart .......... 8
2.3. ECG wave during Electrical Conduction .......... 9
2.4. ECG Electrodes .......... 12
CHAPTER THREE
Software QRS Detection Algorithms .......... 17
3.1. Algorithm approaches and state of the art .......... 18
3.2. A word on Data Records .......... 25
CHAPTER FOUR
Hardware Implementations .......... 27
4.1. Data Processing .......... 27
4.2. System Generator Blocks .......... 28
4.3. Design 1: Threshold based on previously detected peak value .......... 29
4.3.1. Section 1: Pre-Processing Stage .......... 30
4.3.2. Section 2: Peak Detection stage .......... 40
[Link] Finite State Machine .......... 40
[Link] Memory Module .......... 43
4.3.2.3 Window Module .......... 45
4.4. Design 2: Peak detection with threshold based on median of last eight peaks .......... 50
4.4.1. Finite State Machines .......... 51
CHAPTER FIVE
Xilinx Tools: System Generator and Spartan 3E Starter Kit .......... 59
5.1. DSP Design using Xilinx System Generator .......... 59
5.1.1. Basic Design Approach .......... 61
5.1.2. Black Box implementation .......... 64
5.2. Xilinx Spartan-3E Starter Kit .......... 67
CHAPTER SIX
Results .......... 70
6.1. Result Interpretation .......... 70
6.2. Peak Detection Results for Design 1 .......... 73
6.2.1. Correct detection within ± 5 samples = 25 ms .......... 74
6.2.2. Correct detection within ± 4 samples = 20 ms .......... 75
6.3. Peak Detection Results for Design 2 .......... 75
6.3.1. Correct detection within ± 5 samples = 25 ms .......... 76
6.3.2. Correct detection within ± 4 samples = 20 ms .......... 77
6.4. Correct detection within ± 10 samples for Design 1 .......... 78
6.5. Correct detection within ± 10 samples for Design 2 .......... 80
6.6. Resource Utilization .......... 81
6.6.1. Device Resource Utilization for design 1 .......... 81
6.6.2. Device Resource Utilization for design 2 .......... 82
6.7. Time for analysis .......... 83
CHAPTER SEVEN
Conclusion and Suggested Future Modifications .......... 84
Appendix A: VHDL Modules .......... 86
Appendix B: Synthesis Report .......... 112
Bibliography .......... 114

LIST OF TABLES

Table 6.1: Correct detection within ± 5 samples = 25 ms .......... 74

Table 6.2: Correct detection within ± 4 samples = 20 ms .......... 75

Table 6.3: Correct detection within ± 5 samples = 25 ms .......... 76

Table 6.4: Correct detection within ± 4 samples = 20 ms .......... 78

Table 6.5: Correct detection within ± 10 samples for design 1 .......... 80

Table 6.6: Correct detection within ± 10 samples for design 2 .......... 80

Table 6.7: Device resource utilization for the first design .......... 82

Table 6.8: Device resource utilization for the second design .......... 82

LIST OF FIGURES

Figure 2.1: Cardiac muscle and nerve Cells .......... 3

Figure 2.2: Cardiac cell depolarization .......... 4

Figure 2.3: Cardiac depolarization-repolarization curve .......... 6

Figure 2.4: Elements involved in cardiac electrical conduction .......... 8

Figure 2.5: Electrical conduction stages and generation of QRS complex .......... 9

Figure 2.6: P-Q-R-S-T waveforms and wave intervals .......... 10

Figure 2.7: Various structures classified as QRS complex .......... 11

Figure 2.8: Placement of ECG electrodes .......... 13

Figure 2.9: Measurement using Leads I, II and III and AVL, AVF and AVR .......... 14

Figure 2.10: ECG waveform contaminated with noise and artifacts .......... 15

Figure 3.1: Basic structure for QRS detection Algorithms .......... 17

Figure 3.2: A general outline of algorithm by Tompkins and Hamilton .......... 21

Figure 3.3: Two threshold policy for peak detection in QRS complex .......... 22

Figure 3.4: Fiducial mark in bandpass signal .......... 24

Figure 4.1: Basic System Generator® Blocks .......... 29

Figure 4.2: General outline of Hardware implementation in design 1 .......... 30

Figure 4.3: Lowpass filter implementation in System Generator® .......... 31

Figure 4.4: Lowpass filter output .......... 32

Figure 4.5: Highpass filter implementation in System Generator® .......... 33

Figure 4.6: Highpass filter output .......... 34

Figure 4.7: Derivative filter implementation in System Generator® .......... 35

Figure 4.8: Derivative filter output .......... 36

Figure 4.9: Input squaring stage implementation in System Generator® .......... 37

Figure 4.10: Input squaring stage output .......... 37

Figure 4.11: Moving window integrator implementation in System Generator® .......... 39

Figure 4.12: Moving window integrator output .......... 40

Figure 4.13: Delay in signal due to the various filtering stages .......... 41

Figure 4.14: General design flow in hardware implementation .......... 41

Figure 4.15: Finite State Machine implementation in design 1 .......... 42

Figure 4.16: General outline of memory module implementation .......... 45

Figure 4.17: Output of memory module (QRS complex) .......... 46

Figure 4.18: Window module implementation in System Generator® .......... 47

Figure 4.19: Output of window module (peak position in QRS complex) .......... 48

Figure 4.20: Complete implementation of design 1 in System Generator® Environment .......... 49

Figure 4.21: Sample output from design 1 .......... 49

Figure 4.22: JTAG block for co-simulation using FPGA .......... 50

Figure 4.23: Software and Hardware co-simulation output .......... 51

Figure 4.24: Software simulation output using floating and fixed point arithmetic .......... 52

Figure 4.25: General outline of Hardware implementation for design 2 .......... 53

Figure 4.26: Finite State Machine implementation for design 2 .......... 54

Figure 4.27: Deletion operation on the array .......... 56

Figure 4.28: Addition operation on the array .......... 57

Figure 4.29: Complete implementation of design 2 in System Generator® environment .......... 58

Figure 4.30: Sample output of design 2 .......... 59

Figure 5.1: Pipeline registers to break long paths .......... 63

Figure 5.2: Delay element in feedback loop .......... 64

Figure 5.3: VHDL module in System Generator® Environment (Black Box) .......... 65

Figure 5.4: Delay and Assert blocks and Black Box modules in Feedback Paths .......... 66

Figure 5.5: Xilinx Spartan 3E® FPGA board .......... 67

Figure 6.1: Truncated portion of rec#203 with detected peaks .......... 77

Figure 6.2: Truncated portion of rec#108 with annotations .......... 79

CHAPTER ONE

Introduction

A QRS complex is the most striking feature in an ECG signal. Due to its unique shape

and magnitude as compared to the other waveforms in the ECG signal, it becomes the

most important candidate and the basis for carrying out automated analysis of ECG

signals. The morphology, location and amplitude of the QRS complex in an ECG signal

give significant information about the complete state of the heart, and thus the ECG is a

widely accepted standard for diagnosing different types of arrhythmias and other

complications arising in the heart [1] [2]. However, the ECG is very sensitive to any

small movement occurring in the body, especially in close proximity of its leads. Thus,

the signals get contaminated easily by noise due to external disturbances introduced in

them, particularly due to electrical signal associated with the movement of muscles in the

body or power lines within the room where the ECG measurement is being carried out.

The movement of muscles could be due to a normal random movement of limbs or it

could be an inherent problem in the body, as in patients suffering from

Parkinson's disease. These disturbances are very difficult to avoid during ECG

measurements [2] [3]. The introduction of noise in the ECG signal, in quite a few cases,

makes it too difficult to analyze with the naked eye. Hence, it becomes all the more

important to find ways to remove these unwanted signals from the ECG so that the

physicians can spend more time diagnosing and treating the patients for their ailments

rather than deciphering these difficult signals. A number of signal processing and feature

extraction algorithms have been developed and implemented in software, over the last 40

years. The advancement of technology has led to miniaturization of devices. With the

advent of more and more portable and implantable devices, there is an urgent need for

development of dedicated hardware for carrying out the feature extraction process in real

time. In the present research, one of the most widely acknowledged software QRS detection

algorithms is implemented on a Xilinx FPGA board. The authors of the original paper [6]

suggested three variations of the algorithm, of which two were implemented separately in

hardware. FPGAs have an added advantage over their hardware counterparts (ASICs,

microcontroller chips, etc.) in that they support rapid prototyping, have a low time to

market and are inexpensive for low volume products. Moreover, the fact that they

provide a fast and efficient testability option makes them a highly suitable candidate for

implementing new QRS detection algorithms. It is shown in the result section that the

design implementation led to an underutilization of the resources available on a

medium size FPGA device, which could be used to implement additional functionality in

the design at a later stage. The main purpose of the design was to find the peak in a QRS

signal; however if there is a need, the same design can also be used as the base for a more

complex system for analyzing the QRS morphology and the rate of occurrence of the

QRS signal. The FPGA based QRS detectors identify the QRS complexes with an

accuracy of more than 90 percent when tested with five data records of the MIT-BIH

Arrhythmia database. The following chapters describe the conduction theory behind the

ECG signals, previous feature extraction algorithms and the state of the art, hardware

implementation of the algorithm and finally, the results obtained after testing the design.

CHAPTER TWO

Electrical Conduction in Heart

An electrocardiograph (ECG) is a device that records the electrical

activity in the heart. It is considered to be the gold standard for diagnosis of cardiac

arrhythmias and various other ailments relating to the heart [2]. The cardiac muscles

consist of around 300 trillion cells, which are electrically charged at rest, as shown below

in figure 2.1. (All the pictures in this chapter, unless otherwise stated, are courtesy of

the public domain website [Link] and are included with the permission of the

authors and publishers of the website).

[Figure 2.1 image label: cardiac muscle cells (300 trillion)]

Figure 2.1: Cardiac muscle and nerve Cells

The cardiac muscle cells are charged negative relative to the outside. This is also known

as the resting potential of the cell. When these cells are electrically stimulated, they

depolarize (i.e. their resting potential changes from negative to positive) and contract. As

the impulse spreads through the various regions in the heart, the electric field changes

continuously both in size and direction within the heart. This is shown below in figure

2.2.

Cells at resting potential

Change in polarity during contraction

Figure 2.2: Cardiac cell depolarization

The ECG graphically records the average of trillions of these microscopic electric signals

from within the heart [2] [3]. The next section gives an overview of the Electrochemical

Mechanism taking place in the heart that forms the basis of its electrical conduction.

2.1. Cardiac Action Potential

For all the nerve and muscle cells in the body, their membrane potential is typically

maintained between -60 and -90 mV, with the inside of the cell negative relative to the

outside. The medium both inside and outside the cell is largely composed of water

containing various ions. The ions particularly involved with the electrical response of the

nerve and muscle are sodium and potassium, Na+ and K+ respectively. The concentration

of these ions inside and outside the cell causes the electrochemical driving force across the

membrane and also dictates the electric potential of the cell. In the resting state, the

concentration of the Na+ ions is higher outside the cell membrane and low inside.

Contrary to this, the concentration of K+ ions is higher inside the cell and low outside.

Equilibrium potential or Nernst potential is the potential of the membrane at which the

concentration across the membrane, of a particular ion, is in equilibrium; in other words,

the potential across the membrane exactly balances the diffusive tendencies of the ions

across the cell such that the net current across the membrane is 0. The active transport of

the Na+ and K+ ions into and out of the cell is accomplished through a number of sodium-

potassium pumps scattered all along the membrane. These pumps transfer two K+ ions

into the cell for every three Na+ ions moved out. Hence, the concentration of Na+ is higher

outside and that of K+ is higher inside the cell. However, the resting cell membrane is 75

times more permeable to the K+ ions than to the Na+ ions. This difference in permeability

of the two ions shifts the resting potential closer to the equilibrium potential of potassium

(E_K = -80 mV) than to that of sodium (E_Na = +60 mV). The energy that maintains this equilibrium

(resting potential) is derived from the metabolic process of the living cells, wherein the

cells use oxygen and produce carbon dioxide and heat [2] [3] [4].
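
The relationship described above can be illustrated numerically. The following is a minimal sketch, not part of the original thesis: it evaluates the Nernst equation for each ion and the Goldman-Hodgkin-Katz voltage equation for the resting potential, using the 75:1 permeability ratio quoted in the text. The ion concentrations and the body temperature are assumed textbook-style values chosen only for illustration, so the printed potentials will not match the -80 mV and +60 mV figures above exactly.

```python
import math

R = 8.314462618      # gas constant, J/(mol*K)
F = 96485.33212      # Faraday constant, C/mol
T_BODY = 310.15      # 37 degrees C in kelvin (assumed body temperature)


def nernst_mv(conc_out, conc_in, valence=1, temp_k=T_BODY):
    """Equilibrium (Nernst) potential of a single ion, in millivolts."""
    return 1000.0 * (R * temp_k) / (valence * F) * math.log(conc_out / conc_in)


def resting_potential_mv(p_k, p_na, k_out, k_in, na_out, na_in, temp_k=T_BODY):
    """Goldman-Hodgkin-Katz voltage for K+ and Na+ only, in millivolts."""
    numerator = p_k * k_out + p_na * na_out
    denominator = p_k * k_in + p_na * na_in
    return 1000.0 * (R * temp_k) / F * math.log(numerator / denominator)


# Illustrative concentrations in mmol/L (assumed, not taken from the thesis).
K_OUT, K_IN = 4.0, 140.0
NA_OUT, NA_IN = 145.0, 10.0

e_k = nernst_mv(K_OUT, K_IN)
e_na = nernst_mv(NA_OUT, NA_IN)
# Permeability ratio of 75:1 for K+ versus Na+, as quoted in the text.
v_rest = resting_potential_mv(75.0, 1.0, K_OUT, K_IN, NA_OUT, NA_IN)

print(f"E_K = {e_k:.1f} mV, E_Na = {e_na:.1f} mV, V_rest = {v_rest:.1f} mV")
```

With these assumed inputs the computed resting potential lies between the two equilibrium potentials and much closer to that of potassium, which is the qualitative point made in the paragraph above.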

When an electric impulse is applied to the cells as discussed in the previous section, it

initiates a series of events that leads to a propagating action potential. The action

potential is the propagating change in conductivity and potential across the cell's

membrane and essentially involves its depolarization followed by a repolarization

(where the membrane reverts back to its resting potential) [2] [4].


Figure 2.3: Cardiac depolarization-repolarization curve

The complete process can be explained in five phases, numbered 0-4 as shown in figure

2.3 above.

• Phase 4: This phase corresponds to the resting potential of the membrane. This is the

state the cell remains in, unless excited by an external electrical stimulus (from an

adjacent cell). Certain cardiac cells have the ability to undergo spontaneous

depolarization without the generation of action potential from the nearby cells. This

phenomenon is called the cardiac muscle automaticity and is demonstrated primarily

in the sinoatrial (SA) node in the heart, which starts the whole process as will be

discussed in the next section [2] [3].

• Phase 0: The membrane undergoes a rapid depolarization wherein the Na+ gates in the

cell membrane open up, leading to a fast influx of the sodium ions into the cell. The

slope corresponding to this phase leads to a maximum voltage in the curve shown

above. The ability of the cell to open the fast Na+ channels during this phase is entirely

dependent on the membrane potential at the moment the stimulus is applied. If the

membrane potential is maintained at a lower value (around -85mV), all the Na+

channels are closed and when an excitation is applied, all the channels will open up

leading to a higher value of Vmax. However, if the membrane is maintained at higher

value (more positive as compared to the one above), then not all the Na+ channels will

open and this will give a lower value of Vmax. This is the reason, if cell membranes

become too positive, the cell may not be excitable and the conduction through the

heart may be delayed, increasing the risk of arrhythmias. It has to be noted that the

concentration of the Na+ and K+ ions does not change much during the depolarization

state. Instead, the permeability of sodium increases, pushing the membrane potential

of the cell closer to the equilibrium potential of sodium (+55mV) [2] [3].

• Phase 1 starts with the deactivation of the sodium channels and subsequently, the

activation of a few potassium channels in the membrane. The 'notch' like shape is

created due to the movement of some potassium ions outside the cell. This changes

the membrane potential by a small amount. It has to be noted though, that this process

does not contribute to repolarization [2] [3] [4].

• Phase 2 is horizontal (a plateau) because a balance is reached between the inward

and the outward currents across the membrane. This balance is created by the inflow

of calcium ions into the cell. The balance between the K+ and Ca2+ currents is sustained

for some period of time until repolarization starts [2] [3].

• Phase 3 marks the repolarization phase, where the channels corresponding to the

calcium ions close. However, the potassium channels remain open and the ions keep

on moving out of the cell. This results in a changing membrane potential towards the

equilibrium potential of potassium and that continues until a potential of -80 to -85

mV is reached which marks the resting potential of the cell. This resting potential is

maintained as the potassium channels close after the above mentioned voltage range

is reached and the cell waits for the next cycle [2] [3].

2.2. Electrical Conduction in Heart

The sinoatrial (SA) node, or sinus node, is the impulse-generating (pacemaker) tissue

located in the right atrium of the heart (see figure 2.4 below). Although all the cardiac cells

have an ability to generate electrical impulses that can trigger cardiac contraction, the SA

node initiates it, because it generates impulses at a slightly faster rate than the other areas

with pacemaker abilities. Cells in the SA node naturally discharge, creating action potentials

in the heart about 60-100 times every minute [2].

SA node

AV node

Left and right bundle branch

Figure 2.4: Elements involved in cardiac electrical conduction

The impulse signals arising from the SA node first depolarize and contract the atria. From

there electrical signals travel to the AV node. After a delay, the stimulus travels from the

AV node to the His bundle and subsequently to the left and right bundle branches, which

eventually end in a dense network of Purkinje fibers. These are specialized fibers

located in the inner ventricular region of the heart. They conduct the electrical impulse through

the ventricles, allowing them to contract in a coordinated manner [2] [3]. These

processes eventually get recorded by the ECG as a waveform, shown in figure 2.5 below.

[Figure labels: QRS complex; T wave; Atrial activation; Ventricular [Link]; Recovery wave]

Figure 2.5: Electrical conduction stages and generation of QRS complex

2.3. ECG wave during Electrical Conduction

Figure 2.6 below shows the various regions seen in the ECG waveform during the

conduction process in the heart:

[Figure labels: QRS complex; PR interval; QT interval]

Figure 2.6: P-Q-R-S-T waveforms and wave intervals (Courtesy: [Link])

• SA node: P wave: The electrical stimulation is spontaneously generated at the SA

node. This impulse propagates through the right atrium, then through the left atrium via

Bachmann's bundle, and finally stimulates the muscle cells in the atrial region to

contract. This conduction of the impulse through the entire atrial region is seen in the

ECG as the P wave. The relationship between the P wave and the QRS complex helps

distinguish between various arrhythmias [2] [3].

• AV node/Bundles: PR interval: The AV node introduces a delay critical to the

conduction system. Without this delay, the atria and the ventricles would contract at the

same time and the blood would not flow effectively from the atria to the ventricular

region. The PR interval is measured from the beginning of the P wave to the

beginning of the QRS complex and is usually about 120 to 200 ms long. The length

of the PR interval helps in diagnosing any heart blocks in the patients [2] [3].

• Purkinje Fibers/Ventricular Myocardium: QRS Complex: The QRS complex is a

structure in the ECG wave which corresponds to the ventricular depolarization. As

the ventricular region contains more muscle than the atrial region, the QRS complex tends to

be larger than the P wave. At the far end of the AV node, there is the His bundle that

splits into two branches (left and right) which in turn connect to the left and the right

ventricle. These two bundle branches, at the very end, taper out to form the Purkinje

fibers which stimulate the individual ventricular muscle cells to contract. The His

bundle coordinates the depolarization of the ventricles by increasing the conduction

velocity. Hence, the QRS complex has a "spiked" shape. Normally, the QRS

complexes are 60 to 100 ms in duration. However, any abnormality can prolong the

conduction. Moreover, not all the QRS complexes contain a Q, R

and S wave [2] [3].


Figure 2.7: Various structures classified as QRS complex (Courtesy:

[Link])

Any combination of these waves can be considered as QRS complexes as shown in

figure 2.7 above. In the figure, lower case and capital letters are used relative to the

size of wave in the complex. The duration, amplitude and morphology of the QRS

complex are useful in diagnosing cardiac arrhythmias and other disease states [2].

• T wave: The T wave represents the repolarization of the ventricles. It is produced

approximately 80 ms after the occurrence of the S wave in the QRS complex, which

is also a typical duration for the ST interval. The interval from the start of the QRS

complex to the apex of the T wave is known as the absolute refractory period. After

the T wave, there is a period of calm for some time before the next cycle of

conduction begins [2] [3].
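
The interval ranges quoted in the list above lend themselves to a simple check. The following is a minimal sketch, not taken from the thesis, that converts a duration measured in samples to milliseconds and compares it with the PR (120 to 200 ms) and QRS (60 to 100 ms) ranges given above. The function names are hypothetical, and the 200 Hz sampling rate is an assumption, chosen because it is consistent with the "5 samples = 25 ms" tolerance listed in the table of contents. It is an illustration only and not a diagnostic tool.

```python
# Normal ranges in milliseconds, as quoted in the text above.
NORMAL_RANGES_MS = {
    "PR": (120.0, 200.0),
    "QRS": (60.0, 100.0),
}

FS_HZ = 200.0  # assumed sampling rate (5 samples = 25 ms)


def samples_to_ms(n_samples, fs_hz=FS_HZ):
    """Convert a duration in samples to milliseconds."""
    return 1000.0 * n_samples / fs_hz


def classify_interval(name, duration_ms):
    """Return 'short', 'normal' or 'prolonged' for a named interval."""
    low, high = NORMAL_RANGES_MS[name]
    if duration_ms < low:
        return "short"
    if duration_ms > high:
        return "prolonged"
    return "normal"


# Hypothetical measurements: a PR interval of 32 samples and a QRS of 26 samples.
pr_ms = samples_to_ms(32)
qrs_ms = samples_to_ms(26)
print("PR", pr_ms, "ms:", classify_interval("PR", pr_ms))
print("QRS", qrs_ms, "ms:", classify_interval("QRS", qrs_ms))
```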

This section described the entire conduction process of the heart along with its

interpretation based on the ECG waveform. The next section briefly describes the various

leads in a 12-lead ECG along with their placement on the body.

2.4. ECG Electrodes

The electric signals in the heart can be measured using electrodes in the ECG. In total 12

leads are calculated using 10 electrodes in the ECG. The positions where these electrodes

are connected to the body are shown below in figure 2.8. The ten electrodes are:

The extremity electrodes:

• L, Left Arm

• R, Right Arm

• N, Neutral, on the right leg

• F, Foot, on the left leg


Figure 2.8: Placement of ECG electrodes

Chest Electrodes:

• V1 through V6 are connected as shown in the figure above.

Using these 10 electrodes, the 12 leads can be derived as follows:

Extremity leads (See figure 2.9):

• I-From right to left arm

• II-From right arm to left leg

• III-From left arm to left leg

[Figure panels: lead connections for measurements; schematic]

Figure 2.9: Measurement using Leads I, II and III and AVL, AVF and AVR

Another set of extremity leads are (see figure 2.9):

• AVL, Augmented Voltage, pointing to Left Arm

• AVR, Augmented Voltage, pointing to Right Arm

• AVF, Augmented Voltage, pointing to Feet

And finally, there are precordial or chest leads V1 through V6, whose position is the

same as that of the electrodes shown in figure 2.8. The placement of the leads across the

body defines how the waveforms look on the ECG depending on the combination of

leads selected. For example, the QRS complex should be negative in lead V1 and positive

in V6. The QRS waveforms show a gradual transition from negative to positive between

leads V2, V3 and V4. They are called the equiphasic or transition leads. Thus, the

waveform looks different in different lead combinations [2] [3].
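
The derivation of the six extremity leads from three limb electrodes can be sketched in a few lines. The example below is an illustration, not code from the thesis. It uses the standard definitions of the bipolar leads (I, II, III) and of the augmented leads, and the electrode potentials are hypothetical values in millivolts. The function name and the dictionary keys are assumptions made for this sketch.

```python
def limb_leads(ra, la, ll):
    """Derive the six extremity leads from right arm (ra), left arm (la)
    and left leg (ll) electrode potentials, all in the same unit."""
    return {
        "I": la - ra,                    # from right arm to left arm
        "II": ll - ra,                   # from right arm to left leg
        "III": ll - la,                  # from left arm to left leg
        "AVR": ra - (la + ll) / 2.0,     # augmented, pointing to right arm
        "AVL": la - (ra + ll) / 2.0,     # augmented, pointing to left arm
        "AVF": ll - (ra + la) / 2.0,     # augmented, pointing to the feet
    }


# Hypothetical electrode potentials in mV at one instant.
leads = limb_leads(ra=-0.2, la=0.3, ll=0.7)
for name, value in leads.items():
    print(f"{name}: {value:+.2f} mV")
```

Two consistency checks follow directly from these definitions: lead II equals the sum of leads I and III, and the three augmented leads sum to zero. Both can be used to sanity-check recorded data.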

The ECG is a highly sensitive and accurate device. However, its high sensitivity

proves to be its downside as well. It is sensitive to the slightest external disturbance. These

disturbances could be due to the electric signals generated with movement of the muscles

in the body or due to power lines. These disturbances are easily picked up by the ECG as

noise, which contaminates the waveform, as shown below in figure 2.10:

[Figure 2.10 image: ECG traces for leads MLII and V1; grid intervals: 0.2 sec, 0.5 mV]

Figure 2.10: ECG waveform contaminated with noise and artifacts

The presence of noise in the signal makes it difficult for the physicians to

diagnose the ailment correctly. It is impossible for a living body to stay in a motionless

state for a long time. Similarly, it is difficult to get rid of all the power lines from the

room where the ECG measurement is being carried out [3]. Therefore, these disturbances

can be minimized up to a certain extent but one cannot avoid their presence in the ECG

waveform. There has to be a way to remove these disturbances from the ECG waveforms

so that the physicians spend less time reading the waveform and more time treating the patient

for the ailments. This is where the signal processing and feature extraction algorithms

come in. The QRS complex is one of the most distinct features in an ECG waveform and

as already discussed, its morphology, width and amplitude play a very important role in

the diagnosis of various cardiac ailments. There are plenty of software algorithms

available which carry out this process. Some of these algorithms are discussed next.

CHAPTER THREE

Software QRS Detection Algorithms

Software QRS detection has been a topic of research for the past 40 years now. However,

within the last decade or so, many new approaches have been proposed. These

approaches vary from the use of Artificial Neural Networks, genetic algorithms, wavelet

transforms, filter banks to heuristic methods mostly based on non-linear transformations

[4]. In the early years, when research on automated QRS detection

picked up momentum, an algorithmic structure was developed which became very

popular and is shared by a number of other algorithms. It is shown in figure 3.1

below:

[Block diagram: ECG input x(n) -> Linear Filtering -> Nonlinear Filtering (Preprocessing Stage) -> Peak Detection Logic -> Decision (Decision Stage)]

Figure 3.1: Basic structure for QRS detection Algorithms [1]

This structure consisted of two main stages, the preprocessing stage, consisting of linear

and non-linear filters and the decision stage consisting of peak detection and the decision

logic stages [4].

Most of the algorithms differ from each other in the way the preprocessing is carried out

on the ECG signal. The decision stage is mainly based on heuristics and is dependent on

the output of the preprocessing stage [4].

Typical frequency components for QRS complex range from 10 to 25 Hz. Therefore, all

QRS algorithms use a filter stage prior to the actual detection process in order to suppress

the other unwanted signals and artifacts in the waveform such as P and T-waves, baseline

drifts and noise. The attenuation of P and T waves and baseline drift requires a highpass

filter; a lowpass filter is needed for suppressing noise. The combination of a lowpass and

a highpass filter results in a bandpass filter with cut-off frequencies at about 10 and 25 Hz

[4]. Most of the algorithms use separate lowpass and highpass filtering stages and a few

of them only use the highpass filter. The filtered signal is then used in feature extraction

process in which the occurrence of a QRS complex is established by comparing it to

adaptive threshold values. Most of the algorithms use additional decision rules in order to

decrease the number of false detections in testing [4].

3.1. Algorithm approaches and state of the art

Before deciding on the algorithm to implement in hardware a number of different

approaches were looked at. It was decided to go ahead with an algorithm that was tested

on a standard database and had a good accuracy.

As discussed earlier there were different approaches adopted in various algorithms. The

first one used the Artificial Neural Networks approach as suggested by Vijaya et al [8].

The design consisted of 10 processing elements in the input, 3 in the hidden layer and 1

in the output layer. Only one element was required in the output as the design just

classified the signal being recognized as a QRS or a non-QRS waveform. The algorithm

was trained with the back propagation algorithm on more than a hundred ECGs selected

from CSE Dataset-3. The authors claimed to have tested the trained ANNs on all the data

recordings of CSE dataset-3 and reported an accuracy of around 99.11% in detecting the QRS

complex correctly. Moreover, the testing was done on the data received from all

the 12 leads of the ECG. The algorithm was implemented in C language [8].
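
To make the 10-3-1 layout concrete, here is a minimal forward-pass sketch. It is not the implementation from [8]: the logistic activation, the interpretation of the 10 inputs as consecutive signal samples, the random (untrained) weights and the 0.5 decision threshold are all assumptions made purely for illustration.

```python
import math
import random


def sigmoid(v):
    """Logistic activation (an assumption; the text does not name one)."""
    return 1.0 / (1.0 + math.exp(-v))


def forward(window, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through a 10-3-1 network; returns a score in (0, 1)."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, window)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Untrained, randomly initialised weights: they show the shapes only.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(10)] for _ in range(3)]
b_hidden = [0.0] * 3
w_out = [rng.uniform(-1, 1) for _ in range(3)]
b_out = 0.0

window = [0.0, 0.1, 0.4, 0.9, 1.0, 0.6, 0.2, 0.0, -0.1, 0.0]  # hypothetical samples
score = forward(window, w_hidden, b_hidden, w_out, b_out)
is_qrs = score > 0.5  # hypothetical decision threshold
```

A single output element is enough because the network only has to separate QRS from non-QRS windows.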

For the second algorithm, yet another Artificial Neural Networks approach was suggested

by Xue et al [11]. The researchers suggested an adaptive matched filtering algorithm

based on artificial neural networks. In this approach an ANN adaptive whitening filter

was used to model the lower frequencies of the ECG, which are inherently nonlinear. The

residual signal which mainly contained the higher frequency QRS complex was then

passed through a linear matched filter to detect the location of QRS complex. In the

algorithm they claimed to have developed a matched filter template from the detected

QRS complex so that it could be customized to individual records. The ANN consisted of

6 processing units in the input, 3 in the hidden layer and 1 in the output layer and the

learning rate was set to 0.3. The authors report correct detection of about 99.5% for two

of the most contaminated data records, 105 and 108, from the MIT-BIH database [11].

To date, not many algorithms have been implemented in hardware. Thus, it was decided

to first implement a very simple, yet accurate algorithm in hardware, to see if it works

properly or not. The idea was that, if the first algorithm worked, even more complicated

algorithms could be implemented, to obtain better results. Hence, for the first design, one

of three approaches suggested by Tompkins and Hamilton [6] was implemented in the

hardware. Their algorithm is based on digital filtering approach. The researchers

proposed a real time detection algorithm which uses digital filters, with a few

modifications, from an earlier research by Pan and Tompkins [5]. The algorithm

consisted of linear filters connected one after the other in a sequence from a lowpass,

highpass, and derivative filter to a moving window integrator. The non-linear part was a

signal amplitude squaring block. Adaptive thresholds and blanking were used as a part of

the decision rule. The main advantage of this algorithm was the use of integer arithmetic

in carrying out the processing. The coefficients of the filters used in the algorithm were

all integers and mostly powers of 2. As stated earlier, the researchers suggested three

variations in the algorithm based on the complexity involved in calculating the adaptive

threshold value for the next detection. In the first variation, the threshold was calculated

solely based on the last detected peak. The second variation used the mean of a certain

number of previously detected peaks in setting up the threshold value. And finally, in the

third variation, the threshold was calculated from the median of the eight previously

detected peaks. Only the first and the last approaches were used in this

research for hardware implementation. Moreover, it was decided to use this algorithm

because the researchers tested their results with the entire data set in the MIT-BIH

Arrhythmia database and reported the results for correct detection of QRS complexes

with accuracy in excess of 99%. A general outline of this algorithm is shown in figure 3.2

below:

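[Block diagram: the preprocessor (filtering) outputs a time averaged signal and a bandpass (BP) signal. In the decision stage, a peak detector on the time averaged signal provides the peak height and location, a fiducial mark locator on the BP signal provides the fiducial mark, and the resulting event vector is passed to the decision rules, which report QRS detections.]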

Figure 3.2: A general outline of algorithm by Tompkins and Hamilton

One important aspect of this algorithm is that it locates the QRS complex and its peak in

the Bandpass signal rather than in the original signal. In order to get the result in the

original signal a fixed delay encountered by the signal from the original to the bandpass

stage has to be subtracted from the peak position in the bandpass signal.

The algorithm is divided into two stages. The preprocessor stage consists of four filters.

The ECG signal is input to a low pass filter. The output of the lowpass filter becomes the

input to the highpass filter. Together, they form a bandpass filter. The output of the

bandpass filter is given as an input to derivative stage, which calculates the slope of the

signal. The output of the derivative filter is squared and presented as an input to the time

averaged integrator. The first two filters are IIR filters and the last two (the derivative and

the moving window integrator) are FIR structures. The time averaged signal is obtained

from the average of the last 32 sample values. The transfer functions, difference

equations and the implementation of these filters with their output are presented in detail

in the next chapter. A large waveform is produced as the output of the time averaged

filter corresponding to the QRS complex as shown in figure 3.3 below. Although it looks

easy to locate the peak in this signal even with simple algorithms that use just one

threshold value, there are times when, due to the contamination of the signal with noise,

these algorithms end up detecting two peaks instead of one, as shown in figure 3.3 below:

[Figure: time averaged waveform with the two thresholds marked; Th1, and Th2 = 30% of peak]

Figure 3.3: Two threshold policy for peak detection in QRS complex

In order to avoid this, the researchers suggested a two threshold policy for carrying out

peak detection in the time averaged signal. In the first variation for the algorithm, the first

threshold is a small percentage of the previously detected peak value. As soon as a

sample value greater than Th1 is encountered, all the samples following it are compared

with each other to locate the maximum among them. After a certain point in time the

signal has a downward slope [6]. Once, after this downward transition, a sample is

encountered whose value is less than 30% of the maximum value detected earlier, peak

detection is reported. Thus, the peak is detected when the signal is downward sloping.

This 30% of the maximum value is referred to as the second threshold, Th2.
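
The two threshold policy can be sketched in software as follows. This is a minimal illustration rather than the code of [6]: the function name, the choice of 30% for the unspecified "small percentage" that sets the next Th1, and the toy signal are assumptions.

```python
def detect_peaks(signal, th1, th1_fraction=0.3, th2_fraction=0.3):
    """Return (peak_index, detection_index) pairs from a time averaged signal.

    th1 is the initial first threshold. th1_fraction is an assumed value for
    the "small percentage" of the last peak used as the next Th1 (variation 1).
    """
    events = []
    tracking = False
    peak_val = 0
    peak_idx = 0
    for i, sample in enumerate(signal):
        if not tracking:
            if sample > th1:                      # crossed Th1: start tracking
                tracking = True
                peak_val, peak_idx = sample, i
        elif sample > peak_val:                   # new running maximum
            peak_val, peak_idx = sample, i
        elif sample < th2_fraction * peak_val:    # fell below Th2: report peak
            events.append((peak_idx, i))
            tracking = False
            th1 = th1_fraction * peak_val         # next Th1 from the last peak
    return events


# Toy signal: the first hump has a small dip (7 then 8) that a single
# threshold placed near the top could split into two peaks.
toy = [0, 1, 5, 9, 7, 8, 4, 2, 0, 0, 1, 6, 10, 2, 1, 0]
events = detect_peaks(toy, th1=3)
```

Each hump yields exactly one event, and the event is only reported on the falling slope, which is why the detection index trails the peak index.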

The approach for the second variation is identical except that the first threshold

Th1 is calculated based on the median of the eight previously detected peaks, instead of just

the last peak as discussed for variation 1. Even though it seems like a simple process in

software, it involves quite a few steps in hardware. To calculate the median of any

eight numbers, they must first be sorted in the array. Moreover,

with every newly detected peak, the oldest number has to be deleted from the array,

the new number has to be added to the list in its place, and the list has to be

sorted again. This involves a sequence of steps which have to be carried out to achieve the

desired result. The hardware design is described in the next chapter.
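
The bookkeeping described above (drop the oldest peak, insert the newest, keep the list sorted) can be sketched as below. The class name is hypothetical, and taking the integer mean of the two middle values as the median of eight numbers is an assumption; the hardware design may resolve the even-length case differently.

```python
from bisect import insort
from collections import deque


class PeakMedian:
    """Running median over the most recent `size` detected peak values."""

    def __init__(self, size=8):
        self.size = size
        self.history = deque()   # peaks in arrival order
        self.ordered = []        # the same peaks, kept sorted

    def add(self, peak):
        if len(self.history) == self.size:
            oldest = self.history.popleft()   # delete the oldest peak
            self.ordered.remove(oldest)
        self.history.append(peak)
        insort(self.ordered, peak)            # insert so the list stays sorted

    def median(self):
        n = len(self.ordered)
        mid = n // 2
        if n % 2:
            return self.ordered[mid]
        # Even count: integer mean of the two middle values (assumption).
        return (self.ordered[mid - 1] + self.ordered[mid]) // 2


tracker = PeakMedian()
for value in (10, 20, 30, 40, 50, 60, 70, 80):
    tracker.add(value)
first_median = tracker.median()
tracker.add(90)                # evicts 10
second_median = tracker.median()
```

Inserting into an already sorted list avoids a full re-sort on every new peak, which is the same saving a hardware insertion sorter would aim for.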

A detected peak in the time averaged window defines an event. As already stated earlier,

the peak detection algorithm does not establish that a valid peak has occurred until the

middle of the falling slope when the level drops below a third of the distance from the

maximal value to the base point. Since the time between the middle of the falling and

rising slope is equal to the size of the window of the time averaging signal, ideally the

fiducial mark representing the time of occurrence of peak in the QRS signal i.e. the R

wave is located with a fixed delay of one window's width, in the band pass signal, back

in time from the point of peak detection in the time averaged signal as shown in figure

3.4 below.

[Figure: a) BP signal with the fiducial point marked; b) time averaged signal]

Figure 3.4: Fiducial mark in bandpass signal; BP signal (top), moving window output (bottom)

In order to find a more consistent location, the fiducial point is set to the location of the

largest peak in the bandpass signal in an interval from 250 to 150 ms before the detection

point, i.e. it is required to go back 50 samples and look within the next 20 samples for a maximum value [6]. This

becomes the window for peak detection and the 20 samples represent the QRS complex

itself. When implemented in hardware, additional delays are introduced in the design

hence this range was modified a bit, as discussed in the next chapter.
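
A minimal sketch of this search, assuming the 200 Hz sample rate used throughout (so 250 ms is 50 samples and the window is 20 samples long). The function name and the test signal are hypothetical, and the sketch uses the software values from [6] rather than the modified hardware range.

```python
def fiducial_index(bandpass, detect_idx, back=50, width=20):
    """Locate the fiducial point in the bandpass signal.

    Goes `back` samples before the detection index and returns the index of
    the largest value among the next `width` samples.
    """
    start = max(detect_idx - back, 0)
    stop = min(start + width, len(bandpass))
    window = bandpass[start:stop]
    # Offset of the first maximum inside the window.
    offset = max(range(len(window)), key=window.__getitem__)
    return start + offset


# Hypothetical bandpass signal: an R wave at sample 37 and a larger,
# unrelated excursion at sample 60 that lies outside the search window.
bp = [0] * 100
bp[37] = 9
bp[60] = 20
fiducial = fiducial_index(bp, detect_idx=80)
```

Restricting the search to the window keeps a later, larger excursion from being mistaken for the R wave of the current event.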

Also, once a peak is detected in the time averaged signal, any peak detected within the

next 45 samples, corresponding to 225 ms, is discarded, as it is highly improbable to have

two QRS complexes so close to each other. This blanking phase of 45 samples is helpful

in reducing the number of false detections in the algorithm [6].
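
The blanking rule is simple enough to state in a few lines. In this sketch the function name is hypothetical, and measuring the gap from the last accepted peak (rather than from the last candidate) is an assumption.

```python
def apply_blanking(peak_indices, blank=45):
    """Discard peaks that fall within `blank` samples of the last accepted peak.

    At 200 Hz, 45 samples correspond to 225 ms.
    """
    accepted = []
    for idx in peak_indices:
        if not accepted or idx - accepted[-1] >= blank:
            accepted.append(idx)
    return accepted


# Hypothetical candidate peak positions (sample numbers).
kept = apply_blanking([100, 130, 145, 300, 344, 345])
```

In the example, the candidates at 130 and 344 are dropped because they are only 30 and 44 samples after an accepted peak.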

This chapter briefly introduced three of the most accurate QRS detection algorithms

available to date. The next chapter presents an in-depth discussion on the hardware

implementation of the last algorithm discussed in this section, by Tompkins and Hamilton

[6].

3.2. A word on Data Records

The design implemented in hardware was tested using five data records obtained from the

MIT-BIH Arrhythmia database maintained at the PhysioNet website. These five records

were specifically chosen from a set of 48 data records stored in PhysioBank. Due to time

constraints it was difficult to obtain all the records from the database and test them

individually. Thus, a selection was made based on looking at the results reported by some

of the authors in the papers referred. Experimental results reported in the literature show

that most of the algorithms had good detection accuracy for record 100. The other

four records, namely 105, 108, 203 and 222, produced a fair amount of error as compared to

the rest of the records in the database. In fact, for their algorithm, Tompkins and

Hamilton reported an average error of more than 3 percent for these four records. A brief

discussion on the five chosen records is given below:

Record 100: This ECG recording belongs to a male subject 69 years of age. Data from

ECG leads MLII and V5 is stored in the database. The record has 2272 annotations

classified as beat annotations. The signal is clean in both the leads.

Record 105: This record belongs to a female subject 79 years of age. Data is obtained

from leads MLII and V1 of the ECG. The record has 2572 annotations classified as beat

annotations. The record is contaminated with high grade noise and artifacts in both the

lead signals.

Record 108: This record belongs to a female subject 87 years of age. Data is available from

leads MLII and V1 of the ECG. It has 1774 classified beat annotations. It represents a

borderline case for first degree AV block. The lower channel exhibits considerable

noise and baseline shifts. This is one of the most difficult records to comprehend overall,

especially due to high P-waves encountered in the signal.

Record 203: Record for male subject 49 years of age. 2980 beat annotations overall. Data

presented for MLII and V1 leads. It is again a very difficult signal for analysis. It has

considerable noise in both channels, including muscle artifact and baseline shifts and

QRS morphology changes are seen due to axis shift, in one of the channels.

Record 222: This record belongs to a female subject 84 years of age and has 2482 beat

annotations. Several intervals of high-frequency noise/artifact are seen in both the

channels. Data from ECG leads MLII and V1 are available.

As can be concluded from the above discussion, there was quite a variety in the records

selected for testing. Except for record 100 most of the other records were difficult to

analyze. However, it is always a good idea to use more data records for testing

purposes as some algorithms perform better on difficult data but report a high count of

false detections for easier data and vice versa.

CHAPTER FOUR

Hardware Implementations

For the present research, a very well established algorithm on QRS Peak detection,

proposed by Tompkins and Hamilton [6] is implemented in hardware. The main

advantage of this design is that, even though it consists of a number of filtering stages, all

the filters have integer coefficients and hence can be efficiently implemented in hardware

without any loss in accuracy. As discussed in the previous chapter, Tompkins and

Hamilton suggested three possible approaches for peak detection in ECG signals. The

three methods differ from each other based on the way the threshold value is calculated

for the next cycle of detection. In this research two of these three approaches were used

to create hardware models, which are discussed in the following sections. In the first

method the threshold is calculated solely based on the last peak value. The second

method uses a more complex technique where threshold value for the next detection is

calculated from the median of the last eight detected peaks. Moreover, this method is

much more complicated to implement in hardware than the first one and it almost doubles

the resource utilization as compared to the first method. However, it leads to a more

accurate detection, as will be discussed in the results section. In general, the

implementation of the second method is an extension of the first method by adding a few

more design blocks.

4.1. Data Processing:

Fixed point arithmetic was used for implementing the algorithm in hardware. We

obtained 5 ECG data recordings (Rec#100, 105, 108, 203 and 222) from the MIT-BIH

Arrhythmia database available at the Physionet website [7]. These data records were

originally sampled at 360 Hz when downloaded. The algorithm being used here required

the data to be processed at 200 Hz. Hence, it was re-sampled at 200 Hz in MATLAB®. In

order to avoid working with fractional numbers, the data samples were multiplied by a

factor of 10^3 and rounded off to the nearest integer value. On evaluating the original data

and experimenting with different number of bits for representing the data, it was

concluded that there was not much loss in accuracy if 40 bits were used to represent each

data signal. Moreover, using 40 bits all throughout the design ensured that we did not

exceed this range, even after the data samples were processed through the different

filtering stages in the design. When 20 bits were used instead, data was no longer

accurate and in certain parts of the implementation the sample values were out of range.

If the data is incorrect in one stage then all the following stages introduce more and more

errors in the output. However, it would be interesting to see the results of using 32 bits

instead of 40 bits to represent the data samples. Generally, bit widths that are powers of 2

are accepted as a standard everywhere, hence the results corresponding to 32 bits would be

worth examining.
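
The scaling and word-length reasoning in this section can be sketched as follows. This is an illustration only: the function names, the scale factor of 1000 and the example sample values are assumptions, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for whatever rounding MATLAB® applied.

```python
def to_integer_samples(samples_mv, scale=1000):
    """Scale fractional samples and round them to integers."""
    return [round(v * scale) for v in samples_mv]


def fits_signed(value, bits):
    """True if `value` is representable as a signed two's-complement word."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return low <= value <= high


scaled = to_integer_samples([-0.145, 0.33, 1.2])   # hypothetical samples in mV

# A hypothetical intermediate value of 1500, once squared, no longer fits
# in a 20-bit word but is far inside the 32-bit and 40-bit ranges.
squared = 1500 ** 2
fits_20 = fits_signed(squared, 20)
fits_32 = fits_signed(squared, 32)
fits_40 = fits_signed(squared, 40)
```

A check of this kind, applied to the largest value seen at each stage, is one way to decide whether a narrower word such as 32 bits would be safe.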

4.2. System Generator Blocks

For implementing the design in hardware a Xilinx DSP tool namely, System Generator®

was used. In this section, the basic System Generator® blocks that were used in

implementing the design are shown below in figure 4.1.

• Constant: Xilinx Constant Block

• Negate: Xilinx Negate Block

• Delay: Xilinx Delay Block

• Single Port RAM: Xilinx Single Port Random Access Memory (RAM)

• Counter: Xilinx Up-Down Counter

• Register: Xilinx Register Block

• Relational: Xilinx relational operator

• Scale: Scales input by a power of 2 by adjusting the binary point

• AddSub: Xilinx Adder Subtractor

• CMult: Xilinx Constant Multiplier

Figure 4.1: Basic System Generator® Blocks

A detailed discussion on the tool is presented in chapter 6. Readers are encouraged to

refer to the user manual [12] for the tool in order to build a design using these blocks from

scratch.

4.3. Design 1: Threshold based on previously detected peak value

The design can be divided into two main sections. The first section is the pre-processing

stage. This is where all the filters are implemented. The second section is the peak

detection stage where a finite state machine controls the actual detection process. The

general outline of the design is shown in figure 4.2 below:

[Block diagram: the ECG input passes through the pre-processing filters into the peak detection stage, where the main finite state machine works with the memory stage and the window stage to output the peak position]

Figure 4.2: General outline of hardware implementation for design 1

The Pre-Processing and the Peak-detection stages are discussed next:

4.3.1. Section 1: Pre-Processing Stage

As discussed earlier, this section consists of a number of filtering stages. This is a critical

part of the design as it removes the unwanted noise signal picked up by the ECG due to

interference with the power lines within the room where the recording was carried out.

To overcome the effect of these unwanted signals which could otherwise lead to false

peak detection, the ECG signal is passed through a number of filtering stages. All the

filters in the design were implemented using System Generator®, a DSP design tool by

Xilinx Corp. The various filtering stages involved in the design are discussed next:

a) Low Pass Filter:

The second order low pass filter used in the design has a transfer function given by:

$$H(z) = \frac{(1 - z^{-6})^2}{(1 - z^{-1})^2}$$

And its corresponding difference equation is:

$$y(nT) = 2y(nT - T) - y(nT - 2T) + x(nT) - 2x(nT - 6T) + x(nT - 12T)$$

Where T is the sampling period and 'n' is any arbitrary integer. The cutoff frequency of

the filter is about 11 Hz and it has a gain of 36. The filter introduces a delay of 6 samples

in the stages following it. The low-pass filter removes all the high frequency spikes or

noise from the ECG signal.
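
As a software cross-check of the low pass difference equation, the recursion can be written directly in integer arithmetic. This is a minimal sketch, not the System Generator® design; the function name is hypothetical and samples before the start of the record are assumed to be zero.

```python
def lowpass(x):
    """y(n) = 2y(n-1) - y(n-2) + x(n) - 2x(n-6) + x(n-12), zero initial state."""
    y = []
    for n in range(len(x)):
        def xd(k):
            # Input delayed by k samples; zero before the record starts.
            return x[n - k] if n - k >= 0 else 0
        y1 = y[n - 1] if n >= 1 else 0
        y2 = y[n - 2] if n >= 2 else 0
        y.append(2 * y1 - y2 + xd(0) - 2 * xd(6) + xd(12))
    return y


# Impulse response: a triangle of 11 integer taps.
impulse_response = lowpass([1] + [0] * 19)
```

The impulse response is the triangle 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1 followed by zeros, so the recursive form behaves like an 11-tap integer FIR filter whose taps sum to 36.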

This filter implemented as a direct form I structure using System Generator® is shown

below in figure 4.3:-


Figure 4.3: Lowpass filter implementation in System Generator®

The truncated output for this filter for one of the data records tested with this algorithm is

as shown in figure 4.4 below:-

[Figure: two panels plotting sample value against sample number; top: input ECG signal, bottom: LP output]

Figure 4.4: Input ECG signal (Top) and Lowpass filter output (Bottom)

The figure at the top is the contaminated ECG signal which is fed to the Low-Pass filter

as an input. Once it passes through the Low-Pass filter, it has a much smoother transition

without any high frequency spikes, as can be seen in figure 4.4.

b) High Pass Filter:

ECG signals do not remain at a constant D.C. level at all times. Sometimes they are

raised to a higher or reduced to a lower D.C level as shown in figure 4.4b above. The

high pass filter comes in after the low pass filtering stage to remove the low frequency or

D.C. offset signal and set it to a zero level. The difference equation and transfer function

of the high pass filter are given below:-

$$y(nT) = y(nT - T) - \frac{x(nT)}{32} + x(nT - 16T) - x(nT - 17T) + \frac{x(nT - 32T)}{32}$$

$$H(z) = \frac{-\frac{1}{32} + z^{-16} - z^{-17} + \frac{1}{32} z^{-32}}{1 - z^{-1}}$$

The filter has a cutoff frequency of 5 Hz, gain of 32 and when implemented, introduces a

delay of 16 samples for the input to the next stage. The direct form I implementation of

the filter with these specifications is shown below in figure 4.5:-


Figure 4.5 : Highpass filter implementation in System Generator®
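
The high pass stage can be cross-checked in software in the same way. The sketch below is an assumption-laden illustration, not the hardware design: it takes the x(nT)/32 term with a negative sign, which is what makes the response to a constant input settle to zero, and it assumes zero input before the start of the record. The function name is hypothetical.

```python
def highpass(x):
    """y(n) = y(n-1) - x(n)/32 + x(n-16) - x(n-17) + x(n-32)/32."""
    y = []
    for n in range(len(x)):
        def xd(k):
            # Input delayed by k samples; zero before the record starts.
            return x[n - k] if n - k >= 0 else 0
        y_prev = y[n - 1] if n >= 1 else 0
        y.append(y_prev - xd(0) / 32 + xd(16) - xd(17) + xd(32) / 32)
    return y


impulse_response = highpass([1] + [0] * 39)
dc_response = highpass([32] * 100)   # constant (D.C.) input
```

The impulse response is a delayed unit sample at position 16 minus a 32-sample moving average, so a constant offset is removed once 32 samples have been seen. In hardware the division by 32 reduces to moving the binary point, which is what the Scale block does.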

The output of this filter is as shown below in figure 4.6:-

[Figure: two panels plotting sample value against sample number; top: LP output, bottom: HP/BP output]

Figure 4.6: Highpass filter input (Top), Highpass filter output (Bottom)

It is clear that the two filters act as a bandpass filter for the 5-10 Hz range and eliminate

the DC and the high frequency noise components while keeping the relevant features of

the signal intact, though with some shape distortion.

c) Differentiator filter:

The derivative filter is used to find the slope information in the ECG signal. This

technique of finding the slope is very popular among all the ECG analysis algorithms.

The reason is that the QRS complex tends to have larger variation in its slope than all the

other features in the waveform. This provides an easier discrimination among them. The

difference equation and transfer function for this filter are given below:

$$y(nT) = \frac{2x(nT) + x(nT - T) - x(nT - 3T) - 2x(nT - 4T)}{8}$$

$$H(z) = \frac{2 + z^{-1} - z^{-3} - 2z^{-4}}{8}$$

This filter introduces an overall delay of two samples for the stage following it.
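
A direct software sketch of the derivative filter is given below, assuming zero input before the start of the record; the function name and the test inputs are hypothetical.

```python
def derivative(x):
    """y(n) = (2x(n) + x(n-1) - x(n-3) - 2x(n-4)) / 8."""
    y = []
    for n in range(len(x)):
        def xd(k):
            # Input delayed by k samples; zero before the record starts.
            return x[n - k] if n - k >= 0 else 0
        y.append((2 * xd(0) + xd(1) - xd(3) - 2 * xd(4)) / 8)
    return y


ramp_out = derivative(list(range(10)))   # input with a constant slope of 1
flat_out = derivative([5] * 10)          # constant input, slope 0
```

Once the four-sample delay line has filled, a constant input gives zero and a unit-slope ramp gives a constant 1.25, so the output tracks the slope of the input up to a fixed scale factor.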

The direct form I implementation of this filter using System Generator® is shown below

in figure 4.7:-

Figure 4.7: Derivative filter implementation in System Generator®

The output of this filter is shown below in figure 4.8 :-

[Figure: two panels plotting sample value against sample number; top: BP output, bottom: differentiator output]

Figure 4.8: Derivative filter input (Top), Derivative filter output (Bottom)

d) Squaring stage:

This stage gets its input from the differentiator filter. The hardware essentially consists of

a multiplier, which accepts a 40 bit input and generates a 40 bit output as the square of

the input. The delay introduced in this stage is more or less dependent on the time

required to calculate the square of a 40 bit number. This multiplier, when implemented,

was fairly large and had an internal pipeline depth of 7. In order to achieve timing closure in the

design, 3 additional pipeline registers were added to it. The Squaring stage as

implemented in System Generator® is shown below in figure 4.9:


Figure 4.9: Input squaring stage implementation in System Generator®

The output of this stage is as shown below in figure 4.10:-

[Figure: two panels plotting sample value against sample number; top: differentiator output, bottom: squaring stage output]

Figure 4.10: Squaring stage input (Top), Squaring stage output (Bottom)

e) Time Averaged filter:

This is an FIR filter, which takes the previous 32 samples from the squaring stage as its input

and generates one output sample corresponding to their average. It can be implemented in

two ways in hardware. The first, more general method is an adder chain.

This implementation requires the use of 32 registers and a total of 31 adders distributed

among various stages in the filter. This implementation requires huge resources

(especially with thirty one 40 bit adders) as a lot of slices are used up to carry out this

implementation. Moreover, it introduces a huge delay in the overall timing performance

of the design. Hence, it was decided not to use this approach. Instead, a different,

resource friendly approach was developed. The Xilinx FPGA device in the Spartan 3E

Starter Kit® (XC3S500 fg 320) comes with 20 on-board Block RAMs. The previous 32

samples as obtained from the squaring stage were stored in a block RAM of size

32x40 bits. In every clock cycle the 32nd location was read as the output of the RAM. This

approach actually reduces the resource utilization by a huge amount for two reasons.

First, the adder tree is replaced by just two adders and second, the individually

addressable registers are substituted by a more compact RAM device. The overall timing

of the design is also improved. The Hardware implementation for this stage is shown

below in figure 4.11 :


Figure 4.11: Moving window integrator implementation in System Generator®

Once the RAM is set up, the design of the FIR filter becomes quite intuitive. Whenever a

new sample arrives at the input of the RAM, the oldest sample stored in it is removed.

This can be achieved by subtracting the oldest value from the current value in the register. The

new sample is then added to the value stored in the register and the process is repeated in a similar fashion. In this way

the oldest value is removed from the memory and at the same time the new value is

added to the memory at the position where the oldest value was stored. Thus, each data

value remains in the RAM for 32 clock cycles and then eventually gets overwritten by a

new value. In the end, the value stored in the register is divided by 32 to get a signal as

shown below in figure 4.12. The position of peak in the signal is clear as shown in the

figure. However, for certain records, a number of connected and sometimes multiple

peaks are seen in the output.
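
The RAM-plus-register scheme described above amounts to a running sum over a circular buffer. The sketch below models it in software; the class name is hypothetical, the list stands in for the block RAM, and integer division by 32 models the final scaling for the non-negative values produced by the squaring stage.

```python
class MovingWindowIntegrator:
    """Running average over the last `width` samples using one running sum."""

    def __init__(self, width=32):
        self.width = width
        self.buffer = [0] * width   # stands in for the 32-word block RAM
        self.addr = 0               # location holding the oldest sample
        self.total = 0              # running sum register

    def step(self, sample):
        oldest = self.buffer[self.addr]
        self.total += sample - oldest        # subtract oldest, add newest
        self.buffer[self.addr] = sample      # overwrite the oldest sample
        self.addr = (self.addr + 1) % self.width
        return self.total // self.width      # divide by 32


mwi = MovingWindowIntegrator()
rising = [mwi.step(64) for _ in range(40)]    # hypothetical constant input
falling = [mwi.step(0) for _ in range(32)]
```

Only one subtraction and one addition are needed per sample regardless of the window length, which is the saving over the 31-adder chain.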

[Figure: two panels plotting sample value against sample number; top: squaring stage output, bottom: time averaged output]

Figure 4.12: Moving window integrator input (Top), Moving window integrator output

(Bottom)

This stage marks the end of the pre-processing stage. The delays encountered by the ECG

signal when passing through the various filtering stages are shown below in figure 4.13.

Figure 4.13: Delay in signal due to the various filtering stages

Figure 4.14 below outlines the stages that follow the preprocessing stage. The output of
the preprocessing stage is sent to the FSM, which controls the other modules and finds
the peak in the time averaged signal shown above. It then locates the peak in the band
pass signal based on the peak in the time averaged signal.

[Block diagram: preprocessing stage (low-pass, high-pass, derivative, squaring, time averaging) feeding the FSM (Prefetch, Load1, Load2, Peak1, Peak2), which works with the window stage (finds the largest among the stored samples) and the memory stage (RAM)]

Figure 4.14: General design flow in hardware implementation

4.3.2. Section 2: Peak Detection stage

This stage consists of three main modules, i.e. the finite state machine, the memory
module and the window module. These modules are discussed next:

4.3.2.1 Finite State Machine:

The finite state machine is the main module which controls the peak detection process. It
consists of five states, as shown below in figure 4.15:

[State diagram transitions: Prefetch holds while Counter < 400 and exits when Counter >= 400; Sample < th1 / Sample > th1; Sample >= th2 / Sample < th2; Peak(i)-Peak(i-1) < 45 / Peak(i)-Peak(i-1) >= 45]

Figure 4.15: Finite State Machine implementation in design 1

The five states of the FSM are discussed next:

1. Prefetch: This state sets up the initial threshold value. Once the state machine is
started, it remains in this state for the first 2 seconds. The data input to the design
is sampled at 200 Hz, so the first 400 ECG samples are read and the threshold is
initialized to 30% of the maximum value among these 400 samples. This threshold value
acts as a starting point for the algorithm and is updated later on, every time a peak is
detected in the time averaged signal.

2. Load1: This state marks the beginning of the actual peak detection process. Once the
initial threshold value is set up in the Prefetch state, control is never returned to
it. In other words, after the first 400 samples are read, the state machine never
re-enters the Prefetch state and restarts the search for the next peak position from the
Load1 state. In this state the input sample is compared to the threshold value. If the
sample value is greater than the threshold, the state machine moves on to the next
state; otherwise it remains in the current state. The idea is to look only at data
samples that lie in the QRS region, so this state discards samples that lie outside it.

3. Load2: Once all the data samples with values below the first threshold are discarded,
only sample values that lie in the QRS region remain. In this state, the maximum value
among these samples is located. The state also watches for a sample whose value is less
than 30% of the maximum value found so far. As soon as such a sample is encountered on
the falling slope of the waveform, the maximum value is declared as the peak, as
explained in the previous chapter. This 30% level becomes the second threshold, or Th2.

The peak detected in the above state belongs to the time averaged signal. Based on the
delays introduced in the signal by the various filtering and subsequent processing
stages, the actual peak in the band pass signal is located by going back 60 samples in
the band pass signal from the moment the peak is found in the time averaged signal, and
then finding the maximum value in the next 30 samples encountered thereafter (please
refer to section 3.1, figure 3.4 for an explanation of this part). This is implemented
by the memory and the window modules, which are discussed later in the section. The
delay is not a constant value as expected in software implementations; it depends on the
way the hardware is implemented and on the tool used for the implementation.

4. Peak1: The two threshold method eliminates most of the double peak detections.
However, in extreme cases even this approach sometimes fails. Thus, an additional
condition was added to the peak detection process: any peak detected within 45 samples
of the previously detected peak is rejected. This 225 ms blanking period (corresponding
to 45 samples) eliminates the possibility of false detections such as multiple
triggering on the same QRS complex during this time interval. It is also a fair
adjustment to the algorithm, as QRS complexes cannot physiologically occur more closely
than this [6]. Thus, in this state the current peak position is compared with the last
peak position; if the difference between the two is greater than 45 samples, control is
sent to the next state, otherwise to the Load1 state.

5. Peak2: At this stage the output from the window module is already available. The
window module locates the peak in the band pass signal. To get the exact position in
real time, the output of the window module, which is a number between 1 and 30, is added
to the current counter value, and the total delay up to this final state is then
subtracted from that number. This gives the exact position of the peak in the band pass
signal in real time. Control is sent back to the Load1 state once the output is
calculated, and the process is repeated.
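The five states above can be summarized in a short software sketch. This is a simplified illustration, not the hardware design: it works on the time averaged signal only, the function name `detect_peaks` and its parameters are hypothetical, and the rule used to update the first threshold after each peak (30% of the last peak) is an assumption, since the text only says the threshold is updated.

```python
def detect_peaks(avg, fs=200, frac=0.3, blank=45):
    """Two-threshold peak search on the time averaged signal (illustrative)."""
    prefetch = 2 * fs                    # Prefetch: first 2 s = 400 samples at 200 Hz
    th1 = frac * max(avg[:prefetch])     # initial threshold: 30% of the maximum
    state = "LOAD1"
    peaks, last = [], None
    best_val, best_pos = 0, 0
    for n in range(prefetch, len(avg)):
        x = avg[n]
        if state == "LOAD1":
            if x > th1:                  # sample lies in a candidate QRS region
                state, best_val, best_pos = "LOAD2", x, n
        elif state == "LOAD2":
            if x > best_val:             # track the running maximum
                best_val, best_pos = x, n
            elif x < frac * best_val:    # fell below Th2 on the falling slope
                # Peak1: reject detections inside the 45-sample blanking period
                if last is None or best_pos - last > blank:
                    peaks.append(best_pos)   # Peak2 would map this to the band pass signal
                    last = best_pos
                    th1 = frac * best_val    # assumed threshold update rule
                state = "LOAD1"
    return peaks
```

With a 200 Hz input, `blank=45` corresponds to the 225 ms blanking period described for the Peak1 state.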

4.3.2.2 Memory Module

This module was implemented using the Block RAM resources available on the Spartan kit.
Because of the way the design was implemented, there was a delay of around 60 samples
between the signal at the peak detection stage and the same signal at the band pass
stage. Thus, the memory module has a size of 60x40 bits. Please note that the memory is
filled from the 0th location all the way down to the 59th location. Hence, the oldest
value is stored at the 0th position and the most recent value at the 59th position, as
shown in figure 4.16 below:

[Diagram: band pass (BP) signal written into RAM locations 0 to 59; the oldest 30 samples are marked]

Figure 4.16: General outline of memory module implementation

The band pass samples are continuously written to the RAM. It was implemented as a cyclic
structure, such that the oldest value is overwritten once the RAM is full, and this
process repeats indefinitely. A signal sent from the Load2 state of the FSM marks the
beginning of the read cycle in the RAM. Once this happens, the oldest 30 sample values
are sent to the window module, as indicated in figure 4.16 above. This module is very
important as it connects the time averaged output to the band pass signal, which appears
at the very beginning of the design. A sample output of the memory module is shown below
in figure 4.17. It can be seen that only the QRS complex remains and all the other parts
of the signal are removed completely. This feature of the design could also be used for
determining the heart rate or the time between two QRS complexes. Also, it should be
noted that the peak of the QRS complex lies within this region.


Figure 4.17: Output of memory module (QRS complex)
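The cyclic behaviour of the memory module can be sketched as follows. This is a minimal software model under stated assumptions: the class name `BandPassMemory` and its methods are hypothetical, and a moving write pointer marks the oldest entry instead of fixing it at location 0 as in the figure.

```python
class BandPassMemory:
    """Cyclic store of the last 60 band pass samples (illustrative model)."""

    def __init__(self, depth=60, window=30):
        self.depth = depth
        self.window = window
        self.ram = [0] * depth
        self.wr = 0  # next write address; also the oldest sample once the RAM is full

    def write(self, sample):
        self.ram[self.wr] = sample            # overwrite the oldest value
        self.wr = (self.wr + 1) % self.depth  # wrap around at the end of the RAM

    def read_oldest(self):
        # Read the oldest `window` samples, starting at the oldest address
        return [self.ram[(self.wr + k) % self.depth] for k in range(self.window)]
```

Calling `read_oldest()` corresponds to the read cycle triggered from the Load2 state, which hands 30 samples to the window module.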

4.3.2.3 Window Module

The window module gets its input from the memory module discussed previously. The 30
samples input to the window module are compared with one another. Only the position of
the largest sample is of interest, not the value at that position. Hence, this position
(a number between 1 and 30) is generated as the output of this block and becomes the
input to the next state. Apart from the position of the largest sample, another signal
indicating that the output is ready is generated and sent to the FSM. Together with the
memory module, this module takes a maximum of 30 clock cycles to execute, as the worst
case for the number of comparisons is 30. The window module was written in VHDL and
imported into the System Generator® environment. Figure 4.18 below shows its internal
implementation.


Figure 4.18: Window module implementation in System Generator®
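The comparison performed by the window module, together with the position arithmetic of the Peak2 state, can be sketched in a few lines. The function names are hypothetical, the tie-breaking rule (keep the first maximum) is an assumption, and the delay value used in any call is a placeholder, since the text does not state the total delay.

```python
def window_position(samples):
    """Return the 1-based position of the largest sample in the window."""
    best_pos, best_val = 1, samples[0]
    for pos, value in enumerate(samples[1:], start=2):
        if value > best_val:  # strict comparison keeps the first maximum on ties
            best_pos, best_val = pos, value
    return best_pos


def peak_position(counter, window_out, total_delay):
    """Peak2 step: counter value plus window output minus the total delay."""
    return counter + window_out - total_delay
```

For example, `peak_position(1000, 12, 75)` uses a made-up delay of 75 samples to show how the three quantities combine.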

A sample output from the window module is shown below in figure 4.19.

[Plot titled "Window Output"; x-axis: Sample Number]

Figure 4.19: Output of window module (Peak position in QRS complex)

It should be noted that the number of samples used in the two modules discussed above

depends completely on the way the design is implemented in the System Generator®

environment.

This section gave a detailed overview of all the different modules in the design. The
complete design, with all the individual modules connected together in the System
Generator® environment, is shown below in figure 4.20:


Figure 4.20: Complete implementation of design 1 in System Generator® environment

A sample output from the above design is shown below in figure 4.21:

[Plot of the band pass signal with detected peak positions; legend: BP Signal, Peak Position; x-axis: Sample Number]

Figure 4.21: Sample output of design 1

The design was later synthesized and set up for hardware co-simulation. To do this,
System Generator® automatically starts the Xilinx synthesis tool, ISE, to synthesize the
design modules. Once the synthesis is complete, System Generator® attaches a JTAG
interface to the design. The advantage of hardware co-simulation is that the design can
be downloaded to the FPGA board through the JTAG port and then tested by passing data
from the Simulink environment. When the co-simulation process is executed, a JTAG block
is generated, as shown below in figure 4.22:

Figure 4.22: JTAG block for co-simulation using FPGA

This JTAG block can be run separately by directly connecting the board via the USB port,
or can be run in coordination with the simulation as shown in figure 4.22 above. A
comparison of the hardware and the simulation output is shown below in figure 4.23. The
difference between the two is a maximum of 1 sample.

Figure 4.23: Software and Hardware co-simulation output

As stated at the beginning of this chapter, before carrying out the hardware
implementation, the algorithm was first written in software. Later, a hardware model was
developed in System Generator® as discussed above, and VHDL code for the complete design
was obtained. This design was then simulated in Modelsim and the output was compared
with the software simulation output. There was a difference of at most 2-3 samples
between the software and the hardware implementations. This difference is expected,
because full precision floating point numbers were used in the software implementation,
while 40 bit fixed point numbers were used for the hardware. A sample output
illustrating this observation is shown in figure 4.24 below:


Figure 4.24: Software simulation output using floating point arithmetic (Top) and fixed

point arithmetic (Bottom)
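The source of such small discrepancies can be illustrated with a quantization sketch. The text states only that 40 bit fixed point numbers were used; the position of the binary point (16 fractional bits here), the saturation behaviour and the function name `to_fixed` are assumptions made purely for illustration.

```python
def to_fixed(x, total_bits=40, frac_bits=16):
    """Quantize x to a signed fixed-point value with saturation (illustrative)."""
    scale = 1 << frac_bits                       # weight of one integer step
    lo = -(1 << (total_bits - 1))                # most negative stored integer
    hi = (1 << (total_bits - 1)) - 1             # most positive stored integer
    raw = max(lo, min(hi, round(x * scale)))     # integer a register would hold
    return raw / scale                           # real value that integer represents
```

A value such as 0.1 cannot be represented exactly in this format, so filter coefficients and intermediate results differ slightly from their floating point counterparts, and these differences can shift a detected peak by a few samples.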

4.4. Design 2: Peak detection with threshold based on median of last eight peaks.

The second design is an extension of the first one. Everything remains the same except
that the first threshold value is calculated from the eight previously detected peaks.
In order to calculate the median of these 8 numbers, they have to be stored in an array
or memory. Moreover, the array has to be updated every time a new peak is detected, with
the oldest peak in the array being deleted. The array then has to be sorted, and finally
the median can be calculated. Hence, the array has to be processed in a particular
sequence of steps. This is best implemented as a second finite state machine that
carries out these operations one after the other and then sends the median output back
to the main FSM. A general outline of this design is shown below in figure 4.25:

[Block diagram: input and preprocessing filters feeding the Main Finite State Machine, which exchanges signals with the Median Finite State Machine and produces the output peak position; supporting blocks: Memory Stage, Window Stage and Median RAM Stage]

Figure 4.25: General outline of Hardware implementation in design 2

This design adds functionality to the original design already discussed as design 1.
Therefore, only the modifications to the first design and the additional modules added
to this design are discussed in the following sections:

4.4.1. Finite State Machines:

This design involves two finite state machines working in tandem. The first state machine
carries out the peak detection process as discussed in the previous section, whereas the
second calculates the median of the last 8 peaks detected by the first state machine.
One of the states in each of the state machines activates the other state machine. The
two state machines are shown below in figure 4.26 and are discussed next:

[State diagrams: Main State Machine and Median State Machine]

Figure 4.26: Finite State Machine implementation for design 2

• Main State Machine:

The main finite state machine is exactly the same as before, but with an additional
state, Peak3. In this state, a signal is sent to activate the second state machine.
Control remains in this state until the median is ready from the second state machine.
This median peak value is used by the Load1 state to calculate the first threshold value.

• Median State Machine:

In the current design the median of the 8 previously detected peaks is calculated. To do
that, the 8 elements are first arranged in ascending order. Since 8 is an even number,
the median is the average of the fourth and the fifth elements. Every time a new peak is
added to the array, the oldest number in the array is deleted and the array is
rearranged. A state machine is implemented to achieve this in hardware. There are four
processes involved: deletion, addition, arrangement and the median calculation. Overall,
in the worst case the median calculation process takes 16 clock cycles. The states of
this machine are explained next:

1. Do-nothing: In this state, the control waits for activation from the main state
machine. Once the second FSM is activated, it remains in this state until at least 8
peaks have been detected by the design. As soon as the first eight peaks are detected,
control is sent to the deletion state. This state also carries out the array set-up
operation. As stated earlier, the median calculation involves deleting the oldest peak
value, adding the newly detected peak value and then sorting the elements of the array.
For this purpose a block RAM of size 8x40 bits is used. The RAM stores each newly
detected peak in the order in which it is received. The output of this RAM block, called
"Median-RAM" in the design, is used to carry out the deletion operation. Because the RAM
only allows synchronous operation, sorting directly in it could take a lot of time.
Hence, an array is constructed from 8 registers of 40 bits each, and all the operations
(addition, deletion and sorting) are performed on this array. Thus, two copies of the 8
peak values are maintained at any given time: the array holds them in ascending order,
and the RAM holds them in the order in which they were received. Please note that the
array is sorted in ascending order right from the beginning, i.e. as soon as the filling
of the array starts. To do this, each new peak value is compared with the ones already
stored in the array. The largest number is shifted to the bottom of the array and the
smallest to the top. However, until all 8 locations are filled no output is generated by
the state machine, and hence the Median state machine remains in the Do-nothing state.
When the 8th peak value is detected and added to the array, the state machine moves on
to the second state.

2. Deletion: When a new peak value is detected, the Median-RAM block sends the

output, containing the oldest value stored in it, to the Median state machine. This

value is compared with the values stored in the array. There will definitely be one

element in the array which will be equal to this number and it will be removed. Once

the value is deleted from the array, all the elements of the array are shifted up such

that there is a vacant space at the last position in the array. A zero will be stored in

that vacant space as shown in the figure 4.27 below:

[Diagram: the value 10 received from the Median-RAM block is matched against the array and removed; a 0 is stored in the last position]

Figure 4.27: Deletion operation on the array

The new value is then added to the Median-RAM block at the location where the deleted
value was stored. It should be noted that the array now has only 7 elements (the 8th
element is 0, which is not considered), and control is sent to the addition state.
Moreover, the elements of the array remain sorted in ascending order. The worst case
delay of this state is 8 clock cycles; the actual delay depends on the location at which
the number to be deleted is stored in the array.

3. Addition: The deletion operation results in a vacant space at the bottom of the array.
The new peak value is then added to the array at the appropriate location. To do this,
the new value is compared one by one with the elements of the sorted array. If it is
smaller than the first element in the array, all the numbers in the array are shifted
down. If it is greater than the 7th element of the array, the new value is stored at the
8th position. When its value lies between two numbers of the array, the values greater
than it are shifted down and the smaller values stay at the same location. This is shown
in figure 4.28 below.

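Array before addition (as recovered from the figure): 10, 23, 39, 44, 63, 67, 0
Input: 15
Array after addition (as recovered from the figure): 10, 15, 23, 39, 44, 63, 67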

Figure 4.28: Addition operation on the array

The worst case delay of this state is 7 clock cycles, which occurs when the new value is
larger than all the values already stored in the array. Once the addition operation is
complete, control is sent to the median calculation state.

4. Median: After the addition operation, all 8 elements of the array are sorted in
ascending order. This state takes the elements in the 4th and 5th locations and
calculates their average. This becomes the final output, i.e. the median of the 8
elements. This output is sent back to the Peak3 state of the main state machine. In
addition, a signal indicating the completion of the median calculation is sent to the
Peak3 state so that the next peak detection cycle can begin in the main state machine.

The worst case delay of the complete median calculation process is 16 clock cycles. The
main state machine uses the median value to set up the first threshold value for the
next cycle of peak detection. The complete design implemented in System Generator® is
shown below in figure 4.29:

Figure 4.29: Complete implementation of design 2 in System Generator® environment
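The deletion, addition and median steps of the Median state machine can be sketched in software as follows. This is an illustrative model only: the class name `MedianOfLastEight` is hypothetical, a `deque` stands in for the Median-RAM, a Python list stands in for the register array, and integer division is assumed for the average of the two middle elements.

```python
from collections import deque


class MedianOfLastEight:
    """Median of the last eight peaks, mirroring the steps described above."""

    def __init__(self, size=8):
        self.size = size
        self.fifo = deque()  # arrival order (stands in for the Median-RAM)
        self.array = []      # ascending order (stands in for the register array)

    def update(self, peak):
        if len(self.fifo) == self.size:
            oldest = self.fifo.popleft()
            self.array.remove(oldest)    # Deletion: drop one matching entry
        self.fifo.append(peak)
        pos = 0
        while pos < len(self.array) and self.array[pos] <= peak:
            pos += 1                     # Addition: scan for the insertion point
        self.array.insert(pos, peak)     # larger values shift down by one
        if len(self.array) < self.size:
            return None                  # Do-nothing: fewer than eight peaks so far
        mid = self.size // 2
        return (self.array[mid - 1] + self.array[mid]) // 2  # mean of 4th and 5th
```

Because the array stays sorted between updates, only one deletion and one insertion are needed per new peak, which is what keeps the worst case bounded in the hardware version.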

The output from the hardware implementation of this design and the simulation

output are shown below in figure 4.30.

[Plot of the band pass signal with detected peak positions; legend: Peak Position, BP Signal; x-axis: Sample Number]

Figure 4.30: Sample output of design 2

It can be seen that the outputs of the two designs are similar for the most part.
However, the actual difference appears when the total numbers of true and false
detections are compared: the second design outperforms the first one, as will be
discussed in the results section.

CHAPTER FIVE

Xilinx Tools: System Generator and Spartan 3E Starter Kit

Xilinx Inc. is one of the leading programmable logic device providers in the world.
Xilinx provides a number of tools to program its devices. One of them is ISE, which can
be used to process the design based on the device specification at various stages in the
project lifetime. System Generator® is a fairly new tool introduced by Xilinx, which is
dedicated to Digital Signal Processing design. The FPGA board on which we implemented
the design is the Spartan 3E fg320, 4C starter kit. The following sections provide a
brief introduction to these Xilinx products.

5.1. DSP Design using Xilinx System Generator:

Chapter four gave a brief overview of the basic blocks used in System Generator®. One of
the main tasks in this research was to understand this tool and use it to implement our
design. As with all new tools, there are some problems here and there, and it takes time
before all the bugs are fixed. The only documentation available is the one provided by
Xilinx. The tool has so many features that even these 900-odd pages could not address
all the sections properly. Different applications have different requirements, and
having just one help manual proves to be a major problem. Moreover, not many people have
used this tool so far, and hence it is difficult to find help elsewhere. This section
throws light on a few important steps involved in building a design in System
Generator®. Note, however, that this thesis is not dedicated to this aspect completely.
Hence, for a clear understanding of the tool, the user manual [12] which comes with the
tool should be consulted.

For the present research Xilinx System Generator v10 was used for implementing the
design. This was the latest version of the tool available at the time, and at the time
of writing it supported only MATLAB® R2006b and Xilinx ISE 9.1. No other MATLAB®
versions were supported by the tool as yet. The installation procedure is not discussed
in this thesis, and it is assumed that the tool is already installed properly.

There are many advantages to using this tool, especially in implementing complex signal
processing algorithms like this one. Some of these advantages are discussed below:

• User friendly GUI environment: This tool is used in the Simulink environment and

hence has a very user friendly appearance.

• No experience with Hardware Description Language necessary: System Generator®

has a number of blocks corresponding to the various components that are required for

implementing any hardware design. The use of this tool does not require a prior

knowledge of any of the hardware description languages like VHDL or Verilog. All

the components in the tool are represented as blocks which are fairly easy to relate to,

as a whole. For description of any of the blocks and what they mean in the hardware,

a brief summary is provided, which can be accessed by right clicking on the block in

the Simulink panel.

• No interfacing required for testing the design.

• Automatic synthesis, Place and route.

• Comes with a built-in user friendly static timing analysis tool.

• Portability: VHDL modules can be imported in the System Generator® environment

as black boxes. Similarly, VHDL modules can be generated directly here and

simulated in the Modelsim® environment.

5.1.1. Basic Design Approach:

Designing a filter with System Generator® is very easy, as it just requires connecting
different Xilinx blocks in the Simulink workspace. There are a number of different
structures possible for implementing a filter (Direct form I, II, parallel etc.). The
choice among them is left to the designer's discretion. In general, for any design, the
designer has to compromise between area and speed depending on the requirement, and the
same idea holds here. Once the design is ready, it can be simulated to see if it
performs as expected. If the simulation results are acceptable, the next step is to
synthesize the design for use with the hardware. This is the most critical step, as the
timing constraints have to be met in order for the design to work properly. For the
current design, the Spartan xc3s500e-4fg320 device was used. The maximum clock frequency
supported by this device is 50MHz, so a clock period of 20ns was used as the default
timing constraint during synthesis. In most cases, when a large data width is used (40
bits in the current design) and when the filters have feedback paths (e.g. IIR filters),
achieving timing closure is a big problem. It is easy to end up with many levels of
logic and negative worst case slack values for these designs. To overcome this problem,
the design should be modified slightly by introducing pipeline registers wherever a path
is long and has a large delay, as shown in figure 5.1 below:


Figure 5.1: Pipeline registers to break long paths

In the above figure, the design was modified slightly by splitting the delay in the
forward path between the adder block and the registers. The overall delay required by
the filter remains the same. However, the long path is now divided into smaller ones, so
that the setup and hold violations for the flip-flops reported by the synthesis tool can
be easily overcome in this case.
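The effect of a pipeline register on slack can be illustrated with simple arithmetic. The sketch below is deliberately simplified: it treats slack as the clock period minus the slowest register-to-register delay and ignores setup time and routing, and all path delays are hypothetical numbers, not values from the thesis.

```python
def worst_slack(period_ns, stage_delays_ns):
    """Worst-case slack: clock period minus the slowest register-to-register delay."""
    return period_ns - max(stage_delays_ns)


PERIOD_NS = 1e3 / 50       # 50 MHz clock gives a 20 ns period
unpipelined = [26.0]       # hypothetical single long combinational path
pipelined = [14.0, 12.0]   # the same logic split by one pipeline register

slack_before = worst_slack(PERIOD_NS, unpipelined)  # negative: constraint violated
slack_after = worst_slack(PERIOD_NS, pipelined)     # positive: constraint met
```

Splitting the path does not shorten the total logic delay; it only ensures that no single stage exceeds the clock period.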

Another important feature of System Generator® is its built-in timing analyzer program.
In most cases the timing analyzer is able to identify the exact path which has the
maximum delay and the corresponding slack value. Registers need to be introduced in
these paths to break the long combinational paths, as discussed earlier. It is also
important to keep an eye on the output every time the design is modified, and to absorb
the added delay within the same branch of the filter by reducing the delay in the other
components belonging to that branch. The above figure depicts this: as the register is
introduced in the branch, the latency of the delay component is reduced by 1. This
method is very effective in that the transfer function of the filter is unchanged.

There are certain cases in which the timing analyzer shows a positive slack for all the
paths in the design but the timing constraint is still not met. In these special cases,
it is important to follow the slowest path as identified in the synthesis report
generated by ISE. This part is a bit tricky to understand, as everything seems to be
working perfectly; however, an error is reported when the hardware in the loop
simulation is carried out.

Before simulating the design, it must be ensured that there are no unconnected
components in the design. This is very important, as neither System Generator® nor
MATLAB® reports an error for that. Even if it seems as if the simulation has started,
the MATLAB® screen just displays a line reading "caught unknown exception". This message
simply indicates that MATLAB® has crashed, and the program has to be started again.

Whenever feedback paths are to be implemented in the design, it is very important to
introduce a delay element in the feedback path, or else an error will be reported
(algebraic loop in the design). The delay element essentially breaks this loop, as shown
in figure 5.2 below:

Figure 5.2: Delay element in feedback loop
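The role of the delay element can be seen in a small software analogue. The first-order recursion and the coefficient below are hypothetical and chosen only for illustration: because the output depends on the previous output held in a register, each sample can be computed in order, whereas a feedback path with no delay would make the output depend on itself within the same sample.

```python
def first_order_feedback(x, a=0.5):
    """Compute y[n] = x[n] + a * y[n-1]; the register holds y[n-1]."""
    reg = 0.0                      # delay element in the feedback path
    y = []
    for sample in x:
        out = sample + a * reg     # uses only the previous output
        y.append(out)
        reg = out                  # register is updated for the next sample
    return y
```

Feeding an impulse, for example `first_order_feedback([1, 0, 0, 0])`, shows each output being formed from the stored previous value.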

5.1.2. Black Box implementation:

A VHDL module can be introduced in the System Generator® workspace as a black box. The
Black Box component imports the VHDL module and generates a corresponding MATLAB®
function file. This file contains all the information regarding the ports, the data
rates to be used at different ports, generic values and the name of the module imported.
Figure 5.3 below shows a sample VHDL module as imported in the System Generator®
workspace:

Figure 5.3 : VHDL module in System Generator® Environment (Black Box)

There are two ways of importing modules into the design, depending on the level of
abstraction needed by the designer. If there is more than one module in the design, the
top level module can be imported as a black box. Once the MATLAB® function file is
generated for this module, the names of all the other modules in the hierarchy can be
added to this file. When the simulation is carried out, the files are read directly by
System Generator®. The other way of introducing black boxes in the design is to import
each module separately. This approach is more complicated but gives the added advantage
that signals corresponding to internal modules can be observed directly. This approach
was used in the present research. As discussed in the previous section, care must be
taken when implementing feedback loops. This rule remains the same even in the case of
multiple black boxes. It is always important to keep in mind the data rate corresponding
to the feedback signals. Whenever the output signal of a module is fed back as an input
to the same module, the data rate needs to be specified explicitly for this signal. This
can be done by introducing an Assert block in the loop along with the delay block, as
shown in figure 5.4 below. The Assert block does not cost anything in hardware, but is
very effective in assigning both the data rate and the data type for a signal.


Figure 5.4: Delay and Assert blocks and Black Box modules in Feedback Paths

Before importing a VHDL module into the System Generator® environment, it is good
practice to simulate and test the design first in a VHDL simulator. This gives much
better control over the internal signals in the design and is also helpful in debugging
it. In most cases the simulation output from System Generator® matches that of a VHDL
simulator. The output might change slightly if the model is imported into System
Generator® using the second approach discussed above, as additional delays have to be
introduced for feedback paths, but overall it should be very close to the VHDL simulator
output.

Once the simulation works, timing constraints met and all the bugs in the design are

fixed, it can be downloaded to the FPGA. As mentioned earlier, one of the main

advantages of System Generator® is that it can be used to test the design once it is

downloaded to the hardware. This is known as hardware-in-the-loop simulation. When

the co-simulation option is selected in System Generator®, a JTAG co-simulation block is

generated, which can be directly added to the design. Then, with just the click of a

button, the FPGA can be programmed via USB and a hardware co-simulation can be

carried out. This is depicted in Figure 4.19 in Chapter 4.

This feature helps in testing the design without actually having to connect the FPGA to

an oscilloscope with cables etc. The simulation and actual hardware output can be

compared to check for accuracy of the design.

This section gave a brief overview of implementing a design in System Generator®

environment. However, for exact details about the components and the tool in general,

the accompanying user manual [12] for the tool should be consulted.

5.2. Xilinx Spartan-3E Starter Kit

The hardware design for this research was implemented on a Xilinx Spartan 3E FPGA

board.

Figure 5.5: Xilinx Spartan 3E® FPGA board

Some key components and features of the board, in relation to the design implemented

are discussed in this section [9] [10]:

• Xilinx XC3S500 fg 320 speed grade 4C FPGA device

• Maximum clock frequency 50 MHz

• Support for 232 I/O pins (input/output blocks for connection pads)

• Support for 20 Block RAMs

These are organized as configurable, synchronous 18Kbit blocks. They can be used to

store large amounts of data more efficiently (in one chunk) as compared to distributed

RAMs [9] [10].

• 20 Dedicated Multipliers

Support for embedded multipliers which accept two 18 bit words as inputs and generate

36 bit output. Cascading multipliers support more than 3 operands and more than 18 bits

for each input. Multipliers are placed in the design in either of two ways: as

asynchronous primitives called MULT18X18 or as synchronous primitives called

MULT18X18S with a register at the output. The latter version was used in this design [9]

[10].

• 9312 four-input LUTs

LUTs or the look up tables are also called function generators. They support 4 bit logical

inputs and one output. They can be combined using MUX to support even more inputs

and are thus responsible for implementing the logic of a design. Moreover, they can also

function as a distributed RAM, ROM or as a 16 bit shift register [9] [10].

• 4656 Slices

Xilinx follows a resource hierarchy that starts from the lowest element, the logic

cell, and goes all the way up to RAMs and ROMs. In this hierarchy, slices are supersets

composed of two LUTs, two storage elements, wide-function multiplexers, carry

logic and arithmetic gates. The number of slices is generally one half of the total number

of LUTs. Slices provide the logic, arithmetic and ROM functions of the

design [9] [10].
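The behaviour of a four-input LUT described in the list above can be illustrated with a short software model. This is only a sketch: the names `make_lut4` and `eval_lut4` are hypothetical and do not correspond to any Xilinx tool or primitive. The LUT is modelled as a 16-bit truth table indexed by its four inputs, and the slice count is checked against the LUT count quoted above.

```python
def make_lut4(func):
    """Build the 16-bit truth table of a 4-input Boolean function.

    Bit `addr` of the result holds the output for the input combination
    whose binary encoding is `addr` (input 0 is the least significant bit).
    """
    table = 0
    for addr in range(16):
        bits = [(addr >> k) & 1 for k in range(4)]
        if func(*bits):
            table |= 1 << addr
    return table


def eval_lut4(table, a, b, c, d):
    """Look up the output of the LUT for inputs a, b, c, d (each 0 or 1)."""
    addr = a | (b << 1) | (c << 2) | (d << 3)
    return (table >> addr) & 1


# A 4-input AND and a 4-input XOR (parity) each fit in one LUT.
and4 = make_lut4(lambda a, b, c, d: a and b and c and d)
xor4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)

# Figures from the list above: two LUTs per slice.
total_luts = 9312
slices = total_luts // 2
```

Because the table has exactly 16 entries, the same storage can equally be read as a 16x1 distributed RAM or ROM, which is the dual use mentioned in the LUT description.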

Only the basic terminology and features of the Xilinx Spartan 3E Starter Kit were discussed

in this section. The device supports many more components that were not covered here;

the reader is encouraged to consult the Xilinx user manual [9] for the

device.

CHAPTER SIX

Results

The two peak detectors were implemented successfully in hardware, and the results of

their testing with five data records obtained from the Physionet database [7] are

discussed in the following sections. Also discussed is the resource utilization of the

FPGA board after their implementation. It is important for the reader to understand the

various terms that are used in the following sections in order to interpret the results

properly. The next section explains these terms and the other necessary information

briefly. For a brief discussion on the records please refer to section 3.2.

6.1. Result Interpretation

It has been discussed in the previous chapter that the data records used for testing the

design were from the MIT-BIH Arrhythmia database, which is maintained at the

Physionet website. All the data records from this database are completely annotated and

are sampled at 360 Hz. The design implemented in this research required the data to be

sampled at 200 Hz. Therefore, both the data records and their corresponding annotations

were re-sampled at 200 Hz. The resulting data was rounded off to the nearest integer and

presented to the design as input. The annotation files corresponding to the data records

classified the annotations into two types: beat and non-beat annotations. The non-beat

annotations were removed from the annotation files as the aim of this research was to

find the peak in the QRS complex. A beat is classified as a normal sinus beat if it has all

the five components, i.e. P, Q, R, S and T, in it. However, even if any of these features are

missing from the wave, they are still classified as beats, as discussed in chapter 2. A

general trend observed in a majority of annotation files was that the beat annotations were

marked within the QRS complex and their position was approximately ±4 samples from

the position of the peak in the QRS complex. As there was no better reference to compare

our results than the annotations, it was decided to go ahead with the following policy for

reporting the results. The peak detector was set up in such a way that it located the peak

in a QRS complex seen in the bandpass signal within ± 1 sample. Testing was carried out

to see if the annotations were within a certain number of samples from the detected peak

position to classify them as correctly detected peaks. The design was therefore tested in

two variations. In the first variation, if the difference between the identified peak and the

annotation was within ± 5 samples, it was reported as a true peak, with an absolute

accuracy of 25 ms. In the second variation, the difference was set to ± 4 samples for

identifying the peak as true, with an absolute accuracy of 20 ms. This policy was

consistent with all data records but one. It was observed that for record 108 the difference

in between annotations and the actual peak in the original signal was not at all close to

± 4 samples as was the case with other records. As this record is among the most

difficult ones to analyze in the whole database, the location of beat annotations was set a

bit off by the annotators themselves. The difference in the peak position in the signal and

the annotation was variable and in most cases much greater than ± 4 samples. Hence this

record was treated separately with an increased sample range (±10 samples) for

accuracy. In order to accommodate the results for all the records on a common plane, the

above mentioned sample range was used to classify whether the detected peak was a true

peak or not for all the five records. This assumption holds well, given the fact that a QRS

complex lasts between 60 and 100 ms, which is equivalent to 20 samples at the

upper end (100 ms at 200 Hz). Thus, if the difference between a detected peak and the corresponding

annotation was within ±10 samples, it shows that the peak is located in the correct

feature of the waveform. This idea was used in reporting the results in the last part of this

chapter. It should be noted that even though it was stated earlier that the result was tested

to see if it was accurate within ± 10 samples from the annotation, it does not mean that an

equal window of 10 samples was used on either side of the annotation. Instead, the

annotations were observed carefully and then the window to the right of the annotation

was set to 8 samples and the one on the left was set to 12 samples. In the initial part of

testing, the results obtained for record 108 were not used in determining the accuracy of

the detector. This issue is dealt with in detail in the last section of this chapter. Another

important fact about the design is that the peak in the QRS signal is obtained once the

signal is passed through the High-Pass filter. Hence, this represents the peak position in

the Band-Pass signal and not in the original signal. In order to get the peak in the original

signal, the fixed delay encountered by the signal in moving through the different stages of

the design to reach the Band-Pass output has to be subtracted from it. This delay is a

constant and has a value of 26 samples. Finally, it should be noted that the results

involving record 108 are treated separately. For the cases where the results are

reported for this record, a delay of 21 samples was used instead of 26, the reasons for

this being already mentioned above. Thus, these are not used in the calculation of

accuracy for these two designs and are marked with an * in the tables. This issue is

clarified in the final section of the testing results.
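The data preparation described above (re-sampling from 360 Hz to 200 Hz, rounding to integers, rescaling the annotation positions, and removing the fixed 26-sample delay) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in this research: linear interpolation is assumed because the resampling method is not named in the text, and the function names are hypothetical.

```python
FS_IN = 360          # MIT-BIH sampling rate in Hz (from the text)
FS_OUT = 200         # sampling rate required by the design in Hz (from the text)
BANDPASS_DELAY = 26  # fixed delay up to the Band-Pass output, in samples


def resample_linear(samples, fs_in=FS_IN, fs_out=FS_OUT):
    """Resample by linear interpolation and round to the nearest integer.

    Linear interpolation is an assumption made for this sketch.
    """
    if not samples:
        return []
    n_out = (len(samples) - 1) * fs_out // fs_in + 1
    out = []
    for k in range(n_out):
        pos = k * fs_in / fs_out      # position on the input sample axis
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(int(round(samples[i] + frac * (nxt - samples[i]))))
    return out


def rescale_annotation(index, fs_in=FS_IN, fs_out=FS_OUT):
    """Map an annotation sample index from the input rate to the output rate."""
    return int(round(index * fs_out / fs_in))


def to_original_position(bandpass_peak, delay=BANDPASS_DELAY):
    """Convert a peak position in the Band-Pass signal to the original signal."""
    return bandpass_peak - delay


ramp = [18 * n for n in range(10)]   # synthetic test signal, not ECG data
resampled = resample_linear(ramp)
```

For record 108 the same conversion would be called with `delay=21`, the value quoted above for that record.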

The last section of this chapter discusses the resource utilization in the Xilinx FPGA

board that was used for carrying out the implementation. This section details the number

of slices, Block RAMs, multipliers, registers etc. that were used up for the design.

The various terms that are used in the following sections are discussed next:

False Positive (FP): It represents the position reported as a peak by the detector when in

reality there is none. It generally happens due to high noise level in the ECG signal. The

detector identifies the noise spike as a QRS complex and finds the peak corresponding to

it.

False Negative (FN): It represents the peak missed by the detector, when there is actually

one present. It could happen, if the threshold is set to be of too high a value as compared

to the strength of the following QRS signal, such that the detector completely misses it.

True Peak (True): It represents a peak correctly identified by the detector when compared

against the annotations.

Failed Detections (FD): It is the sum of the total false positives and negatives

encountered during testing. This number is also a measure of accuracy of the detector.

The lower the number of failed detections, the higher the accuracy.
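The terms defined above can be made concrete with a small scoring routine. The sketch below is an assumption about how such a comparison could be done, not the script used to produce the tables in this chapter; the name `score_detections` and the one-to-one greedy matching are hypothetical. A detection counts as a true peak when it falls within `left` samples before or `right` samples after an annotation, which also covers the asymmetric window (12 samples on the left, 8 on the right) described for record 108.

```python
def score_detections(detected, annotations, left=5, right=5):
    """Count true peaks, false positives (FP), false negatives (FN) and
    failed detections (FD = FP + FN).

    Each annotation and each detection is matched at most once, which is
    an assumption of this sketch.
    """
    detected = sorted(detected)
    annotations = sorted(annotations)
    true_peaks = 0
    false_pos = 0
    i = j = 0
    while i < len(detected) and j < len(annotations):
        d, a = detected[i], annotations[j]
        if d < a - left:
            false_pos += 1   # detection with no annotation nearby
            i += 1
        elif d > a + right:
            j += 1           # annotation has no detection; counted as FN below
        else:
            true_peaks += 1  # detection lies inside the acceptance window
            i += 1
            j += 1
    false_pos += len(detected) - i            # detections after the last annotation
    false_neg = len(annotations) - true_peaks
    return {"true": true_peaks, "fp": false_pos, "fn": false_neg,
            "fd": false_pos + false_neg}


# Synthetic positions (in samples), purely for illustration.
result = score_detections([102, 310, 498, 600], [100, 300, 500, 700])
```

With the synthetic positions above, two detections fall inside the ± 5 sample window, the detections at 310 and 600 are false positives, and the annotations at 300 and 700 are false negatives.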

6.2. Peak Detection Results for Design 1:

The parameters fixed for carrying out testing with design 1 are given below:

Threshold 1 = 1/8th of the previous peak value

Threshold 2 = 1/3rd of the maximum sample value

Memory size = 63x40 bits

Window = 30 samples

Delay before peak position output in main module = 70

6.2.1. Correct detection within ± 5 samples = 25 ms

Table 6.1 below shows the testing results for design 1. Testing was carried out on

the five data records obtained from the MIT-BIH Arrhythmia database.

Record #   Total Beats   False Positives (FP)   False Negatives (FN)   True Beats   Failed Detections (FD)   % Failed Detections
100        2272          24                     28                     2241         52                       2.29
105        2571          129                    83                     2485         212                      8.25
108*       1762          293                    277                    1482         570                      32.3
203        2979          188                    162                    2814         350                      11.75
222        2482          167                    59                     2420         226                      9.11
Total      10310         508                    332                    9960         840                      xxx

% Failed Detections = (Total FD / Total Beats) x 100 = 8.14%
% Correct Detections = 100 - % Failed Detections = 91.85%

Table 6.1: Correct detection within ± 5 samples
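The two summary figures under the table follow directly from the totals row. As a quick check, the sketch below recomputes them from the Total Beats and Failed Detections totals of Table 6.1; the function name `detection_rates` is hypothetical.

```python
def detection_rates(total_beats, total_fd):
    """Return (% failed detections, % correct detections)."""
    failed = 100.0 * total_fd / total_beats
    return failed, 100.0 - failed


# Totals row of Table 6.1 (record 108 excluded, as marked with *).
failed_pct, correct_pct = detection_rates(total_beats=10310, total_fd=840)
print(f"failed: {failed_pct:.3f}%, correct: {correct_pct:.3f}%")
```

The computed values agree with the 8.14% and 91.85% reported in the table to within rounding in the second decimal place.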

The design detected almost 92% of the peaks correctly with an accuracy of 25 ms. The majority of

false detections were reported for record 203, which is severely contaminated by noise

due to muscle and axis movement and is difficult to comprehend correctly as will be

shown in the next section. Apart from that, most of the other records had false detections

below 10%. The major disadvantage of this design is that the

threshold for the next detection is based only on the previously detected peak. Thus, if that

peak happens to be very large and the following peak is

small, there is a good chance that the latter will be missed, resulting in an increased false negative count. This problem is

partially addressed by design 2, wherein the threshold is calculated based on the median

of the eight previously detected peaks. Still, this design gives a good result in testing,

as shown in the table above.
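The difference between the two threshold rules can be illustrated with a short sketch. The factors 1/8 and 11/64 are taken from the parameter lists in sections 6.2 and 6.3; everything else is an assumption made for illustration: the function names are hypothetical, integer division stands in for the hardware arithmetic, and the median of an even number of values is taken as the mean of the two middle values, which the text does not specify.

```python
def threshold_design1(previous_peak):
    """Design 1: 1/8 of the previously detected peak value."""
    return previous_peak // 8


def threshold_design2(peak_history):
    """Design 2: 11/64 of the median of the last eight detected peak values.

    Requires at least one stored peak value.
    """
    recent = sorted(peak_history[-8:])
    n = len(recent)
    mid = n // 2
    if n % 2:
        median = recent[mid]
    else:
        median = (recent[mid - 1] + recent[mid]) // 2  # assumed convention
    return (11 * median) // 64


# Seven ordinary peaks followed by one unusually large peak (synthetic values).
history = [1000] * 7 + [8000]
next_peak = 900

t1 = threshold_design1(history[-1])
t2 = threshold_design2(history)
missed_by_design1 = next_peak <= t1
missed_by_design2 = next_peak <= t2
```

In this synthetic case the single large peak raises the design 1 threshold above the next beat, while the median used by design 2 is unaffected by the outlier.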

6.2.2. Correct detection within ± 4 samples = 20 ms

As expected the percentage of false detections increases slightly as the accepted range for

correct detection is reduced. Just like before, the majority of false detections are reported

for record 203 due to the reasons stated earlier. Table 6.2 shows the testing results for the

detector with an accuracy of 20 ms.

Record #   Total Beats   False Positives (FP)   False Negatives (FN)   True Beats   Failed Detections (FD)   % Failed Detections
100        2272          26                     30                     2239         56                       2.46
105        2571          129                    83                     2485         212                      8.25
203        2979          214                    188                    2788         412                      13.83
222        2482          169                    61                     2418         230                      9.27
Total      10310         538                    362                    9930         900                      xxx

% Failed Detections = (Total FD / Total Beats) x 100 = 8.73%
% Correct Detections = 100 - % Failed Detections = 91.27%

Table 6.2: Correct detection within ± 4 samples

6.3. Peak Detection Results for Design 2

The various parameters for detector were varied within a certain range by observing the

output to get the minimum number of false detections. These parameters were later fixed

to carry out testing with the five data records. The fixed parameters of the design during

testing are given below:

Threshold Th1 = 11/64 = 0.1718 of the median of the last 8 detected peak values

Threshold Th2 = V. of the maximum sample value

Memory Size = 63x40 bits

Window Size = 30 samples

Peak rejection within 50 samples = 250 ms

Delay in the main module for final peak reporting = 76

6.3.1 Correct detection within ± 5 samples = 25 ms

Table 6.3 below shows the results for the peak detector with an accuracy of 25 ms. The

detector performs very well and detects almost 94% of the peaks correctly over the span

of four records. There were a total of 579 failed detections out of a total of 10310

beats, as indicated in the table below.

Record #   Total Beats   False Positives (FP)   False Negatives (FN)   True Beats   Failed Detections (FD)   % Failed Detections
100        2272          22                     25                     2244         47                       2.5
105        2571          110                    68                     2500         178                      6.9
108*       1762          154                    138                    1651         292                      13.13
203        2979          113                    158                    2816         271                      9.09
222        2482          41                     42                     2434         83                       3.34
Total      10310         293                    298                    9967         579                      xxx

% Failed Detections = (Total FD / Total Beats) x 100 = 5.63%
% Correct Detections = 100 - % Failed Detections = 94.39%

Table 6.3: Correct detection within ± 5 samples

It can be seen that the detections seem to be satisfactory for all the records except for

record 203 which has almost 9% false detections, which is equivalent to almost 300 false

detections. This record is pretty difficult to analyze as both the channels of the ECG are

contaminated with noise, as mentioned in section 3.2. Moreover, there are QRS

morphology changes in the signal due to axis shifts [7]. All these effects make this record

very difficult to comprehend and hence a lot of false detections are reported for the same.

A truncated part of this record is shown in figure 6.1 below. It can be seen that even the

noise signals are identified as QRS complexes due to their high slope values, which leads

to an increased number of false positives in the detection.

[Plot: Peak Position vs. Sample Number, showing the BP Signal and Peak Position traces with a False Positive marked; x-axis: Sample Number]

Figure 6.1: Truncated portion of rec#203 with detected peaks

6.3.2. Correct Detection within ± 4 samples = 20 ms

Table 6.4 shows the results of testing of this detector, with the 4 data records, for an

accuracy of 20 ms. It is intuitive to see that, with a reduction in the range for a correct

detection, the number of false detections increases. The overall percentage of correct

detections goes down by less than 1% as compared to the previous result. However, the

results are still very accurate. And due to already stated reasons, testing on record 203 yet

again gives the maximum number of failed detections as shown in the table below:

Record #   Total Beats   False Positives (FP)   False Negatives (FN)   True Beats   Failed Detections (FD)   % Failed Detections
100        2272          29                     26                     2234         55                       2.5
105        2571          113                    71                     2497         184                      7.1
203        2979          117                    162                    2812         279                      9.36
222        2482          77                     81                     2398         158                      6.37
Total      10310         338                    346                    9941         676                      xxx

% Failed Detections = (Total FD / Total Beats) x 100 = 6.63%
% Correct Detections = 100 - % Failed Detections = 93.36%

Table 6.4: Correct detection within ± 4 samples

6.4. Correct Detection within ± 10 samples for design 1

This additional section was introduced in the results, in order to report the results for all

the data records within the same sample range. As discussed earlier, for the calculation of

accuracy in the results stated above, the numbers obtained for record 108 were not used

in the calculations. A difference of about ± 3 samples from the actual peak position was

assumed for the annotation in reporting the results in the previous section, which worked

out well for a majority of data records used for testing the design (rec#100, 105, 203 and

222). However, for rec#108 the margin for this difference was greater than the rest of the

records. The difference was around 7 to 10 samples for a number of cases, an example of

which is shown below in figure 6.2.

[Plot: Annotation vs. Original Signal, showing the Original Signal trace with the annotation positions marked; x-axis: Sample Number]

Figure 6.2: Truncated portion of rec#108 with annotations

A delay of around 21 samples was used for reporting the results in tables for this record

(table 6.1, table 6.3), and therefore it was not used in any of the calculations. There were

no results reported for an accuracy of ± 1 samples for both the designs with this record, as

a very high count of FPs and FNs was seen at this accuracy. Hence, the sample range for

reporting the peak was increased. The results shown in the following table take into

account this increased sample range and are accurate in the sense that the detected peak is

at least located within the region for which a QRS complex exists, i.e. between 60 and 100 ms,

or 20 samples at the maximum. There is a difference in this table as compared to the ones

discussed so far. For this specific table, results for rec#108 have been used to calculate the

correctly detected peaks and also for determining the accuracy of this method. Table

6.5 below shows the results for all the records using method 1:

Record #   Total Beats   False Positives (FP)   False Negatives (FN)   True Beats   Failed Detections (FD)   % Failed Detections
100        2272          13                     12                     2253         25                       1.1
105        2571          89                     43                     2525         132                      5.13
108        1762          141                    62                     1647         203                      11.52
203        2979          147                    120                    2855         267                      8.96
222        2482          142                    34                     2445         176                      7.09
Total      12072         532                    271                    11725        803                      xxx

% Failed Detections = (Total FD / Total Beats) x 100 = 6.658%
% Correct Detections = 100 - % Failed Detections = 93.34%

Table 6.5: Correct detection within ± 10 samples for design 1

As expected, the accuracy for design 1 jumps from around 92% to almost 93.5%. Also,

this table takes into account the numbers obtained for rec#108 for the calculation. Except

for rec#108, all the other records show an error below 10%, which is still a good

detection rate, given that most of the records used for testing are difficult to analyze.

6.5. Correct Detection within ± 10 samples for design 2

Record #   Total Beats   False Positives (FP)   False Negatives (FN)   True Beats   Failed Detections (FD)   % Failed Detections
100        2272          1                      1                      2269         2                        0
105        2571          85                     30                     2538         115                      4.4
108        1762          111                    32                     1727         143                      8.12
203        2979          70                     34                     2959         104                      3.49
222        2482          11                     15                     2464         26                       1.05
Total      12072         278                    112                    11682        390                      xxx

% Failed Detections = (Total FD / Total Beats) x 100 = 3.23%
% Correct Detections = 100 - % Failed Detections = 96.76%

Table 6.6: Correct detection within ± 10 samples for design 2

As seen from Table 6.6 above, the accuracy of correct detections rises to almost 97%

from the previously reported 94%. Again, as with the other results reported earlier in this

section, the peaks detected by the design are accurate within ±1 sample from the peak in

the Bandpass signal. The low count of FPs and FNs clearly indicates that this design is

actually finding the peak within the QRS complex waveform range.

Due to time constraints we could only test five data records from the MIT-BIH database

with the two detectors. However, in order to establish these as one of the better hardware

detectors, they should be tested with other standard data records. Moreover, records with

accurate annotation files should be used for carrying out the testing on these designs to

further qualify their performance as good detectors.

6.6. Resource Utilization:

Before implementing any design in the hardware it is essential for a designer to keep in

mind both the resource requirement and the available resources. If the resources available

are not enough, then, if possible, the design should be modified accordingly. Two designs

were implemented on the Xilinx Spartan FPGA board as a part of this research. The

second design is an extension of the first one and it required more resources than the first

one. Only a brief summary of device utilization is given here. For the exact details, please

refer to the appendix section.

6.6.1 Device Resource Utilization for the first design:

Table 6.7 shows the total board occupation summary for the implementation of the first

design, on the Xilinx Spartan 3E Starter Kit.

Number of occupied Slices                  2432 out of 4656    52%
Number of Slice Flip Flops                 2830 out of 9312    30%
Number of 4 input LUTs                     2524 out of 9312    27%
Number of bonded IOBs                      1 out of 232        1%
Number of BRAMs                            6 out of 20         30%
Number of Multipliers or MULT18X18SIOs     13 out of 20        65%

Table 6.7: Device resource utilization for the first design

The device utilization report for this design clearly shows that there are still around 50%

free resources in the board. These resources can be utilized at a later time if any

additional required features are added, to further improve the performance of the detector

and/or to enhance the functionality (e.g. arrhythmia position).

6.6.2 Device Resource Utilization for the second design:

Table 6.8 shown below, summarizes the total resource occupation for the implementation

of the second design on the Xilinx Spartan Starter Kit:

Number of occupied Slices                  3547 out of 4656    76%
Number of Slice Flip Flops                 3573 out of 9312    38%
Number of 4 input LUTs                     5376 out of 9312    54%
Number of bonded IOBs                      1 out of 232        1%
Number of BRAMs                            6 out of 20         30%
Number of Multipliers or MULT18X18SIOs     9 out of 20         45%

Table 6.8: Device resource utilization for the second design

Overall utilization of the device with the second design was around 76%. A total of 6

block RAMs were used and the number of multipliers used was just 9. One of the

contributing factors for such low resource utilization is the way some of the filters were

implemented. Especially, the implementation of the time averaged filter in the pre-

processing stage, using a block RAM, reduced the slice occupation by a huge amount.

This design used more resources in the FPGA board as compared to the first one because

of the additional feature added to it. For the implementation of median calculation stage,

2 additional block RAMs were used to accommodate 8x40 bits. Also, an additional 320

Flip Flops were used for the creation of the array for storing the 8 numbers.

The tables above clearly show that there is still some space available in the device; hence,

any additional features which could improve the detection process might be added to the

design.

6.7. Time for analysis

One of the main advantages of using the FPGAs is that they allow fast and efficient

testing of new detection algorithms. The FPGA board used for implementing the designs

had a maximum frequency of 50MHz. A test was carried out to compare the time taken

by the FPGA and the computer simulation to carry out the analysis of a record in one

pass. Please note that the two tests were independent of each other. The computer

simulation took about 8 minutes and 30 seconds for the complete analysis. It took 4

minutes and 43 seconds for the FPGA to analyze the same record, which is almost half

the time required for computer simulation. The test was carried out for just one record;

however these results should be similar for the other records as well. This is because the

total number of data samples is the same for all the records. Such a system has an

advantage that, apart from real time analysis, it can also be used by the physicians to go

through a patient's previously stored ECG records quickly and proceed with treatment

accordingly.
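The comparison of the two run times quoted above reduces to simple arithmetic, sketched here for reference. Only the two durations come from the text; the variable and function names are illustrative.

```python
def to_seconds(minutes, seconds):
    """Convert a duration given in minutes and seconds to seconds."""
    return 60 * minutes + seconds


simulation_s = to_seconds(8, 30)   # computer simulation time quoted in the text
fpga_s = to_seconds(4, 43)         # FPGA analysis time quoted in the text

ratio = fpga_s / simulation_s      # FPGA time as a fraction of simulation time
speedup = simulation_s / fpga_s    # how many times faster the FPGA run was

print(f"FPGA time is {ratio:.1%} of the simulation time ({speedup:.2f}x faster)")
```

The ratio of roughly 0.55 is what the text summarizes as "almost half the time".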

CHAPTER SEVEN

Conclusion and Suggested Future Modifications

The main aim of this research was to implement a popular software algorithm on QRS

peak detection in hardware. The design had two variations which were implemented as

two separate models. Each of the two designs was implemented successfully on a Xilinx

FPGA board. As there is still some space available on the board after the implementation

of the design, it could be efficiently utilized to introduce additional decision rules in

qualifying a feature as a QRS complex with a higher accuracy. Although the main focus

of the research was to find the position of the R wave, these designs can be used for

additional purposes too. For example, apart from finding the peak, the design locates the

QRS complex in the ECG waveform. As discussed in the first chapter, the physicians

may use the duration and morphology of the QRS complex to diagnose various

arrhythmia conditions. Even though these attributes were not categorically stated in the

results, they were successfully obtained by the peak detector. Therefore, this detector can

be utilized for a number of applications apart from the one which just requires the peak

position.

In order to make the detector more powerful in carrying out the detection, it is necessary

that the number of false detections be minimized. One of the ways of doing this is to add a

more adaptive technique for predicting the beat rate change in the ECG signal. This can be

done by incorporating an RR interval predictor and a search back option in the design in

parallel with the adaptive threshold predictor already used in the design. The search back

technique runs in parallel with the normal detection. If there is no peak detection for a

long interval, then it gets activated and searches for the peak in the missed samples by

using a reduced threshold value. The search back option along with the threshold

predictor might be able to eliminate some of the false detections in highly

noise-contaminated signals. Also, a T-wave discriminator can be added to the design. Both

these approaches are discussed in a previous paper by the same authors [5]. However,

incorporating these features might require additional resources; hence, the

modifications should be carried out accordingly. The two hardware peak detectors

implemented have more than 90% correct detections for the tests carried out on 4 data

records from a standard database. The accuracy shoots to around 97% for an increased

sample interval for all the five data records for a design based on design 2. However, in

order to further establish them as good detectors, tests have to be carried out with records

from other databases having accurate annotation files. With the successful

implementation of the current design it can be concluded that FPGA boards can be effectively

used in implementing complex designs like an ECG analysis system. FPGAs have an

advantage over their ASIC counterparts in that they can be re-programmed easily and

are cheaper. Their fast and efficient testability options should be utilized in the future to

implement new detection algorithms.
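The search back idea suggested above can be sketched in software as follows. This is a minimal illustration of the general technique and not part of the implemented designs: the function name, the gap limit `max_gap`, the reduction `factor` and the `guard` margin around existing peaks are all hypothetical parameters chosen for the example, and only one missed peak is recovered per gap.

```python
def search_back(signal, peaks, threshold, max_gap, factor=0.5, guard=3):
    """Re-examine long gaps between detected peaks with a reduced threshold.

    signal    : list of filtered sample values
    peaks     : sample indices already reported by the normal detection
    threshold : threshold used by the normal detection
    max_gap   : gap length (in samples) that triggers the search back
    factor    : fraction of the threshold used during the search back
    guard     : samples skipped next to each existing peak
    """
    filled = []
    previous = None
    for peak in sorted(peaks):
        if previous is not None and peak - previous > max_gap:
            lo, hi = previous + guard, peak - guard
            if lo < hi:
                # Strongest sample in the span that the normal detection skipped.
                best = max(range(lo, hi), key=lambda k: signal[k])
                if signal[best] > factor * threshold:
                    filled.append(best)
        filled.append(peak)
        previous = peak
    return filled


# Synthetic signal: two strong beats and one weak beat between them.
sig = [0] * 60
sig[5], sig[27], sig[50] = 100, 40, 100
recovered = search_back(sig, [5, 50], threshold=60, max_gap=30)
```

In a hardware version the gap limit would normally come from the RR interval predictor mentioned above rather than from a fixed constant.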

APPENDIX A

VHDL MODULES

This Appendix contains the synthesizable VHDL code written for the two designs. There

are two modules which are common to the two designs, one is the memory module and

the other is the window module. The VHDL code corresponding to these two modules is

presented only with the first design.

A.1. Design 1:

1. Main Module (Containing the Finite State Machine):

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

--Entity
entity ecgnew is
  generic(width39:integer:=39;
          width19:integer:=19;
          back:integer:=85);
  port(ipt:in std_logic_vector(39 downto 0);
       fromwin1:in std_logic_vector(4 downto 0);
       peakd:out std_logic_vector(19 downto 0);
       clk:in std_logic;ce:in std_logic:='1';
       --from win Module
       ready:in std_logic;
       rmem:out std_logic);
end entity ecgnew;

--Architecture
architecture behavior of ecgnew is
  type state is (prefetch,load1,load2,peak1,peak2);
  type inter is array(0 to 1) of std_logic_vector(19 downto 0);
  signal reg2:inter;
  signal c_state,n_state:state;
  signal counter:integer:=0;
  signal mx:std_logic_vector(39 downto 0):=X"0000000000";
  signal th:std_logic_vector(39 downto 0);
  signal peakv:std_logic_vector(19 downto 0);
  signal iw,pw,tw,rw:std_logic;
  signal ivalue:std_logic_vector(39 downto 0);

begin
  logic:process(clk) is
    variable iptest:std_logic_vector(39 downto 0);
  begin
    if clk'event and clk = '1' then
      if ce='1' then
        if ipt /= X"0000000000" then
          counter<=counter+1;
          iptest:=ipt;
          if counter<=400 then
            if unsigned(iptest) > unsigned(mx) then
              mx<=iptest;
            end if;
          else
            c_state<=n_state;
          end if;

        end if;
      else
        c_state<=prefetch;
      end if;
    end if;
  end process logic;

  --Threshold register
  process(clk,tw) is
    variable temp3i,temp3:std_logic_vector(41 downto 0);
    variable thi1:std_logic_vector(43 downto 0);
    variable thi:std_logic_vector(43 downto 0);
  begin
    if clk'event and clk ='1' then
      if counter <=400 then
        temp3i:=std_logic_vector((to_unsigned(3,2))*(unsigned(mx)));
        temp3:=std_logic_vector((unsigned(temp3i))/(to_unsigned(4,3)));
        --40 bits
        th<=std_logic_vector(resize(unsigned(temp3),40));--40 bits
      else
        if tw= '1' then
          thi1:=std_logic_vector(to_unsigned(1,4)*unsigned(ivalue));
          thi:=std_logic_vector(unsigned(thi1)/to_unsigned(8,7));
          th<=std_logic_vector(resize(unsigned(thi),40));
        end if;
      end if;
    end if;
  end process;

  -- Peak Position Register

  process(iw,clk) is
  begin
    if clk'event and clk = '1' then
      if iw='1' then
        ivalue<=ipt;
      end if;
    end if;
  end process;

  --Peak Value Register

  process(pw,clk) is
  begin
    if clk'event and clk = '1' then
      if pw ='1' then
        peakv<=std_logic_vector(to_unsigned(counter,20));
      end if;
    end if;
  end process;

  --Reg2 Storage Register

  process(rw,clk) is
  begin
    if clk'event and clk = '1' then
      if rw='1' then
        reg2(1)<=reg2(0);
        reg2(0)<=peakv;
      end if;
    end if;
  end process;

  --Combinational Part of FSM begins here

  detection:process(c_state,ipt,ready,fromwin1,th,reg2) is
    variable temp1:std_logic_vector(39 downto 0);
  begin
    peakd<=X"00000";
    case c_state is
      when prefetch =>
        rmem<='0';
        iw<='0';
        rw<='0';
        pw<='0';
        tw<='0';
        if counter > 400 then
          n_state<=Load1;
        else
          n_state<=prefetch;
        end if;

      when Load1=>
        temp1:=ipt;
        iw<='0';
        rw<='0';
        pw<='0';
        tw<='0';

        if unsigned(temp1) > unsigned(th) then --both 40 bits

          iw<='1';
          pw<='1';
          n_state<=Load2;

        else
          n_state<=Load1;
        end if;
        rmem<='0';

      when Load2 =>

        temp1:=ipt;
        rmem<='0';
        iw<='0';
        pw<='0';
        rw<='0';
        tw<='0';
        if unsigned(temp1) > unsigned(ivalue) then
          pw<='1';
          iw<='1';
          n_state<=Load2;
        end if;
        if to_unsigned(3,3)*unsigned(temp1) < resize(unsigned(ivalue),43) then
          rw<='1';
          n_state<=peak1;
        else
          n_state<=Load2;
        end if;

      when peak1=>
        rmem<='0';
        iw<='0';
        pw<='0';
        rw<='0';
        tw<='0';
        if (std_logic_vector(unsigned(reg2(1))-unsigned(reg2(0))))>
           std_logic_vector(to_unsigned(30,20)) then
          n_state<=peak2;
          rmem<='1';
        else
          n_state<=Load1;
        end if;

      when peak2=>
        rmem<='1';
        tw<='1';
        rw<='0';
        iw<='0';
        pw<='0';
        if ready = '1' then
          rmem<='0';
          peakd<=std_logic_vector(to_unsigned(counter,20) +
                 resize(unsigned(fromwin1),20)-to_unsigned(back,20));
          n_state<=Load1;
        else
          n_state<=Peak2;
        end if;

    end case;
  end process;
end behavior;

2. Memory Module

library ieee;
use ieee.std_logic_II64.a1l;
use [Link] std.a1l;
use ieee.suUogic_signed.a1l;
--Entity
entity mem is
  generic(back:integer:=60;wsize:integer:=30);
  port(in1:in std_logic_vector(39 downto 0);
       --from the ECG Module
       r:in std_logic;
       --To Window Module
       out1:out std_logic_vector(39 downto 0);
       --Original Clock
       clk:in std_logic;
       ce:in std_logic;
       --From the HP Module, it's always = 1
       w:in std_logic:='1'
       );
end entity mem;

--Architecture
architecture behavior of mem is
  type mem1 is array(0 to back-1) of std_logic_vector(39 downto 0);
  signal inter1:mem1;
  signal counter1,counter2,counter3:integer:=0;
begin
store:process(clk,r,in1) is
begin
  if clk'event and clk = '1' then
    if r = '1' then
      --read first
      if counter3<wsize then
        counter3<=counter3+1;
        out1<=inter1(counter2);
        if counter2<back-1 then
          counter2<=counter2+1;
        else
          counter2<=0;
        end if;
        inter1(counter1)<=in1;
        if counter1<back-1 then
          counter1<=counter1+1;
        else
          counter1<=0;
        end if;
      else
        --Keep on writing
        counter3<=0;
        out1<=X"0000000000";
        inter1(counter1)<=in1;
        if counter1<back-1 then
          counter1<=counter1+1;
        else
          counter1<=0;
        end if;
      end if;

    else
      --write only
      out1<=X"0000000000";
      counter3<=0;
      inter1(counter1)<=in1;
      if counter1<back-1 then
        counter1<=counter1+1;
        counter2<=counter1+1;
      else
        counter1<=0;
        counter2<=counter1;
      end if;
    end if;
  end if;
end process store;

end architecture behavior;
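
The memory module above behaves like a circular buffer that always holds the most recent `back` samples and, on request, replays the oldest `wsize` of them to the window module. The following Python sketch illustrates that idea only; the class name `DelayBuffer` and its methods are hypothetical, and the clock-by-clock interleaving of reads and writes in the VHDL is replaced by a simple snapshot.

```python
from collections import deque


class DelayBuffer:
    """Snapshot model of a circular sample buffer (hypothetical helper)."""

    def __init__(self, back=60, wsize=30):
        # The buffer starts filled with zeros, like an uninitialised RAM model.
        self.buf = deque([0] * back, maxlen=back)
        self.wsize = wsize

    def write(self, sample):
        # Appending to a full deque silently discards the oldest sample.
        self.buf.append(sample)

    def oldest_window(self):
        # The oldest wsize samples, in arrival order.
        return list(self.buf)[: self.wsize]
```

With `back=6` and `wsize=3`, writing the values 1 through 8 leaves 3 to 8 in the buffer, and `oldest_window()` returns the first three of those.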

3. Window Module

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

--Entity
entity win is
  generic(wsize:integer:=30);
  Port ( clk : in STD_LOGIC;
         ce : in STD_LOGIC;
         en : in std_logic;
         --from the mem module
         winin1 : in STD_LOGIC_VECTOR (39 downto 0);--from mem module
         winout : out STD_LOGIC_VECTOR (4 downto 0);--to ecg
         wready : out STD_LOGIC);--largest number ready, dependent on window size
end win;

--Architecture
architecture Behavioral of win is
  signal counter:integer:=0;
  signal pos:integer:=0;
  signal inter1:std_logic_vector(39 downto 0):=X"0000000000";
begin
process(clk,en) is
begin
  if clk'event and clk = '1' then
    if en = '1' then
      if counter<wsize-1 then
        if (abs(signed(winin1))>abs(signed(inter1))) then
          inter1<=winin1;
          pos<=counter;
        end if;
        counter<=counter+1;
        wready<='0';
      else
        wready<='1';
        counter<=0;
        winout<=std_logic_vector(to_unsigned(pos,5));
        inter1<=X"0000000000";
      end if;
    else
      wready<='0';
      winout<="00000";
      inter1<=X"0000000000";
    end if;
  end if;
end process;
end Behavioral;
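
The window module above reports the position of the sample with the largest absolute value seen while `en` is high. A minimal Python sketch of that search follows; the function name `window_peak_offset` is hypothetical, and the sketch simply scans whatever list it is given, whereas the hardware compares `wsize-1` samples before asserting `wready`.

```python
def window_peak_offset(window):
    """Index of the first sample with the largest absolute value."""
    best, pos = 0, 0
    for i, s in enumerate(window):
        # Strict comparison, as in the listing: ties keep the earlier index.
        if abs(s) > abs(best):
            best, pos = s, i
    return pos
```

The strict comparison means that for `[1, -7, 3, 7, 2]` the reported offset is 1, not 3.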

A.2. Design 2:

1. Main Module (Main Finite State Machine)

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
--use ieee.std_logic_signed.all;

--Entity
entity ecgnew_med is
  generic(width39:integer:=39;
          width19:integer:=19;
          back:integer:=76);
  port(ipt:in std_logic_vector(39 downto 0);
       --from win module
       fromwin1:in std_logic_vector(4 downto 0);
       peakd:out std_logic_vector(19 downto 0);
       clk:in std_logic;ce:in std_logic:='1';
       --from win Module
       ready:in std_logic;
       --to mem module
       rmem:out std_logic;
       --from median module
       mready:in std_logic;
       med:in std_logic_vector(39 downto 0);
       --to median and medRAM module
       w_med:out std_logic;
       to_med:out std_logic_vector(39 downto 0);
       --to median module, enables the second FSM
       med_en:out std_logic
       --to medRAM module
       --w_medRAM:out std_logic
       );
end entity ecgnew_med;

--Architecture
architecture behavior of ecgnew_med is
  type state is (prefetch,load1,load2,peak1,peak2,peak3);
  type inter is array(0 to 1) of std_logic_vector(19 downto 0);
  signal reg2:inter;
  signal c_state,n_state:state;
  signal counter:integer:=0;
  --constant back:integer:=45;
  signal mx:std_logic_vector(39 downto 0):=X"0000000000";
  signal th:std_logic_vector(39 downto 0);
  signal peakv:std_logic_vector(19 downto 0);
  signal cw,iw,pw,tw,rw,mw,med_th:std_logic;
  signal ivalue:std_logic_vector(39 downto 0);
  --signal test:std_logic;
  signal count2:integer:=0;
begin
logic:process(clk) is
  variable iptest:std_logic_vector(39 downto 0);
begin
  if clk'event and clk = '1' then
    if ce = '1' then
      if ipt /= X"0000000000" then
        counter<=counter+1;
        iptest:=ipt;
        if counter<=400 then
          if unsigned(iptest) > unsigned(mx) then
            mx<=iptest;
          end if;
        else
          c_state<=n_state;
        end if;
      end if;
    else
    end if;
  end if;

end process logic;

--Threshold register
process(mx,clk,tw,med,med_th) is
  variable temp3i,temp3i1,temp3:std_logic_vector(39 downto 0);
  variable thi2,thi3,thi4,thi5,thi6,thi7,thi8,thi9,thi16,thi15,thi12:std_logic_vector(39
  downto 0);
begin
  if clk'event and clk = '1' then
    if counter <= 400 then
      temp3i1:=std_logic_vector(unsigned(mx)+(unsigned(mx)));
      temp3i:=std_logic_vector(unsigned(mx)+unsigned(temp3i1));
      th<=std_logic_vector((unsigned(temp3i))/(to_unsigned(4,3)));
    else
      if tw = '1' then
        if med_th = '0' then
          thi2:=std_logic_vector(unsigned(ivalue)+unsigned(ivalue));
          thi3:=std_logic_vector(unsigned(ivalue)+unsigned(ivalue));
          thi4:=std_logic_vector(unsigned(thi2)+unsigned(thi3));
          thi5:=std_logic_vector(unsigned(ivalue)+unsigned(ivalue));
          thi6:=std_logic_vector(unsigned(thi4)+unsigned(thi5));
          thi7:=std_logic_vector(unsigned(thi6)+unsigned(ivalue));
          thi8:=std_logic_vector(unsigned(ivalue)+unsigned(ivalue));
          thi9:=std_logic_vector(unsigned(thi7)+unsigned(thi8));
          th<=std_logic_vector(unsigned(thi9)/to_unsigned(64,7));
        else
          th<=med;
        end if;
      end if;
    end if;
  end if;
end process;

-- Peak Value Register

process(iw,clk) is
begin
  if clk'event and clk = '1' then
    if iw = '1' then
      ivalue<=ipt;
    end if;
  end if;
end process;

-- Peak Position Register
process(pw,clk) is
begin
  if clk'event and clk = '1' then
    if pw = '1' then
      peakv<=std_logic_vector(to_unsigned(counter,20));
    end if;
  end if;
end process;

-- Reg2 Storage Register

process(rw,clk) is
begin
  if clk'event and clk = '1' then
    if rw = '1' then
      reg2(1)<=reg2(0);
      reg2(0)<=peakv;
    end if;
  end if;
end process;

-- Sending values for median calculation to the Second FSM

medi:process(clk,mw) is
begin
  if clk'event and clk = '1' then
    if mw = '1' then
      w_med<='1';
      med_en<='1';
      to_med<=ivalue;
    else
      w_med<='0';
      med_en<='0';
      to_med<=X"0000000000";
    end if;
  end if;
end process medi;

-- Combinational Part of FSM begins here

detection:process(c_state,ipt,ready,fromwin1,th,reg2,mready,count2) is
  variable temp1:std_logic_vector(39 downto 0);
begin
  peakd<=X"00000";
  case c_state is
    when prefetch =>
      rmem<='0';
      mw<='0';
      iw<='0';
      rw<='0';
      pw<='0';
      tw<='0';
      cw<='0';
      med_th<='0';
      if counter > 400 then
        n_state<=Load1;
      else
        n_state<=prefetch;
      end if;

    when Load1 =>

      mw<='0';
      temp1:=ipt;
      iw<='0';
      rw<='0';
      pw<='0';
      tw<='0';
      med_th<='0';
      cw<='0';
      if unsigned(temp1) > unsigned(th) then --both 40 bits
        iw<='1';
        pw<='1';
        n_state<=Load2;
      else
        n_state<=Load1;
      end if;
      rmem<='0';

    when Load2 =>

      mw<='0';
      temp1:=ipt;
      rmem<='0';
      iw<='0';
      pw<='0';
      rw<='0';
      tw<='0';
      med_th<='0';
      cw<='0';
      if unsigned(temp1) > unsigned(ivalue) then
        pw<='1';
        iw<='1';
        n_state<=Load2;
      end if;
      if to_unsigned(2,3)*unsigned(temp1) < resize(unsigned(ivalue),43)
      then
        rw<='1';
        n_state<=peak1;
      else
        n_state<=Load2;
      end if;

    when peak1 =>

      mw<='0';
      rmem<='0';
      iw<='0';
      pw<='0';
      rw<='0';
      tw<='0';
      cw<='0';
      med_th<='0';
      if (std_logic_vector(unsigned(reg2(1))-unsigned(reg2(0)))) >
         std_logic_vector(to_unsigned(45,20)) then
        n_state<=peak2;
        rmem<='1';
        mw<='0';
      else
        n_state<=Load1;
      end if;

    when peak2 =>
      mw<='1';
      rmem<='1';
      tw<='0';
      rw<='0';
      iw<='0';
      pw<='0';
      med_th<='0';
      cw<='0';
      if ready = '1' then
        rmem<='0';
        peakd<=std_logic_vector(to_unsigned(counter,20)+
               resize(unsigned(fromwin1),20)-to_unsigned(back,20));
        n_state<=peak3;
        cw<='1';
      else
        mw<='0';
        n_state<=Peak2;
      end if;

    when peak3 =>
      mw<='0';
      rmem<='0';
      tw<='0';
      rw<='0';
      iw<='0';
      pw<='0';
      med_th<='0';
      cw<='0';
      if count2 > 8 then
        if mready = '1' then
          n_state<=Load1;
          med_th<='1';
          tw<='1';
        else
          n_state<=Peak3;
        end if;
      else
        tw<='1';
        n_state<=Load1;
      end if;

  end case;
end process;
--Second counter
second:process(clk,cw) is
begin
  if clk'event and clk = '1' then
    if cw = '1' then
      count2<=count2+1;
    end if;
  end if;
end process second;
end behavior;
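
The threshold register of the main module above builds its scale factors from repeated additions followed by an integer division. As read from the listing, the start-up threshold is 3/4 of the maximum found during the first 400 samples, and each later update is either 9/64 of the stored peak value or, once the median module reports ready, the median itself. The Python sketch below restates that arithmetic; the function names are hypothetical and the 40-bit wrap-around of the hardware adders is ignored.

```python
def initial_threshold(mx):
    """Start-up threshold: (mx + 2*mx) / 4 with integer division."""
    return (3 * mx) // 4


def updated_threshold(peak_value, median=None):
    """Threshold after a detected peak.

    With no median available the listing accumulates 9 * ivalue and
    divides by 64; otherwise the median is used directly.
    """
    if median is not None:
        return median
    return (9 * peak_value) // 64
```

For instance, a start-up maximum of 400 gives a threshold of 300, and a peak value of 640 gives 90 until a median becomes available.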

2. Median Calculator Module (Second Finite State Machine)

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

--Entity
entity median1 is
  generic(last:integer:=8);
  Port ( clk : in STD_LOGIC;
         ce : in STD_LOGIC;
         --From medram module
         fromram : in STD_LOGIC_VECTOR (39 downto 0);
         --From ecgnew module
         median_en : in std_logic;
         w : in std_logic;
         in1 : in STD_LOGIC_VECTOR (39 downto 0);
         --To ecgnew module
         med_ready : out std_logic;
         medout : out STD_LOGIC_VECTOR (39 downto 0));
end median1;

--Architecture
architecture Behavioral of median1 is
  --5 states for the state machine
  type state is (donothing,checkstatus,addition,deletion,mediancal);
  signal c_state,n_state:state;
  --Counter signal
  signal counter:integer:=0;
  --Process enable signals
  signal status1:std_logic_vector(2 downto 0):="111";--replacement for enable signals
  --process completion signals
  signal status2:std_logic_vector(2 downto 0):="111";
  --40-bit registers for storing data
  type mem is array(0 to last) of std_logic_vector(39 downto 0);
  signal list:mem;
  signal x:std_logic;
  signal count1,count2:integer:=0;
  signal count3:integer:=1;
  signal iptest:std_logic_vector(39 downto 0);
begin
--Process for adding data to the list begins here
adding:process(clk,iptest,status1,fromram) is
  variable temp:integer;
  variable temp1:std_logic_vector(39 downto 0);
  variable inter:std_logic_vector(39 downto 0);
begin
  if clk'event and clk = '1' then
    case status1 is

      when "010" =>--initialize the array to 0

        med_ready<='0';
        status2<="111";--dont care
        for i in 0 to last loop
          list(i)<=X"0000000000";
        end loop;

      when "000" =>--Addition after deletion

        --Compare the incoming data with the ones already in the list
        --insert the data at an appropriate position in the list
        --send a signal to the FSM that the process has finished completely
        med_ready<='0';
        status2<="111";--dont care
        if iptest > list(count1) then
          if iptest <= list(count3) then
            case count1 is
              when 0 =>
                for i in last downto 1 loop
                  list(i)<=list(i-1);
                end loop;
                list(1)<=iptest;
                status2<="100";
                count1<=0;
                count3<=1;
              when 1 =>
                for i in last downto 2 loop
                  list(i)<=list(i-1);
                end loop;
                list(2)<=iptest;
                --process completed
                status2<="100";
                count1<=0;
                count3<=1;
              when 2 =>
                for i in last downto 3 loop
                  list(i)<=list(i-1);
                end loop;
                list(3)<=iptest;
                --process completed
                status2<="100";
                count1<=0;
                count3<=1;
              when 3 =>
                for i in last downto 4 loop
                  list(i)<=list(i-1);
                end loop;
                list(4)<=iptest;
                --process completed
                status2<="100";
                count1<=0;
                count3<=1;
              when 4 =>
                for i in last downto 5 loop
                  list(i)<=list(i-1);
                end loop;
                list(5)<=iptest;
                --process completed
                status2<="100";
                count1<=0;
                count3<=1;
              when 5 =>
                for i in last downto 6 loop
                  list(i)<=list(i-1);
                end loop;
                list(6)<=iptest;
                --process completed
                status2<="100";
                count1<=0;
                count3<=1;
              when 6 =>
                list(last)<=list(7);
                list(7)<=iptest;
                --process completed
                status2<="100";
                count1<=0;
                count3<=1;
              when others =>
                null;
            end case;

          else
            if count1 < last-2 then
              count1<=count1+1;
              count3<=count3+1;
              --process incomplete
              status2<="000";
            else
              list(last)<=iptest;
              count1<=0;
              count3<=1;
              status2<="100";
            end if;
          end if;
        else
          --The number is smaller than or equal to the number in list(0)
          for i in last downto 1 loop
            list(i)<=list(i-1);
          end loop;
          list(0)<=iptest;
          status2<="100";
          --no need to set counter as it is already = 0
        end if;

      --Deletion Process
      when "001" =>--delete enable
        med_ready<='0';
        status2<="111";--dont care
        if fromram = list(count2) then
          case count2 is
            when 0 =>
              for i in 0 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=X"0000000000";
              status2<="101";
              count2<=0;

            when 1 =>
              for i in 1 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=X"0000000000";
              --Process completed
              status2<="101";
              count2<=0;

            when 2 =>
              for i in 2 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=X"0000000000";
              --Process completed
              status2<="101";
              count2<=0;

            when 3 =>
              for i in 3 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=X"0000000000";
              --Process completed
              status2<="101";
              count2<=0;

            when 4 =>
              for i in 4 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=X"0000000000";
              --Process completed
              status2<="101";
              count2<=0;

            when 5 =>
              for i in 5 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=X"0000000000";
              --Process completed
              status2<="101";
              count2<=0;

            when 6 =>
              for i in 6 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=X"0000000000";
              --Process completed
              status2<="101";
              count2<=0;

            when 7 =>
              list(7)<=list(last);
              list(last)<=X"0000000000";
              --Process completed
              status2<="101";
              count2<=0;

            when 8 =>
              list(last)<=X"0000000000";
              count2<=0;
              status2<="101";

            when others =>
              null;
          end case;
        else
          --Process incomplete
          if count2 < last then
            count2<=count2+1;
            status2<="001";
          --if count2 >= last it should leave this branch
          else
            --number not found, move ahead
            status2<="101";
            count2<=0;
          end if;
        end if;
      --Process Median calculator
      when "100" =>
        status2<="111";
        inter:=std_logic_vector((unsigned(list(4))+unsigned(list(5)))/to_unsigned(2,2));
        med_ready<='1';
        medout<=inter;

      when "011" =>--filling up the array

        status2<="111";
        med_ready<='0';
        if iptest>=list(count1) then
          if iptest<=list(count3) then
            case count1 is

              when 0 =>
                --insert and move elements down
                list(0)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when 1 =>
                list(0)<=list(1);
                list(1)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when 2 =>
                for i in 0 to 2 loop
                  list(i)<=list(i+1);
                end loop;
                list(2)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when 3 =>
                for i in 0 to 3 loop
                  list(i)<=list(i+1);
                end loop;
                list(3)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when 4 =>
                for i in 0 to 4 loop
                  list(i)<=list(i+1);
                end loop;
                list(4)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when 5 =>
                for i in 0 to 5 loop
                  list(i)<=list(i+1);
                end loop;
                list(5)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when 6 =>
                for i in 0 to 6 loop
                  list(i)<=list(i+1);
                end loop;
                list(6)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when 7 =>
                for i in 0 to 7 loop
                  list(i)<=list(i+1);
                end loop;
                list(7)<=iptest;
                status2<="110";
                count1<=0;
                count3<=1;
              when others =>
                null;

            end case;
          --iptest > 7th number in the array, shift up
          else
            if count1 < last-1 then
              count1<=count1+1;
              count3<=count3+1;
              status2<="011";
            else
              for i in 0 to last-1 loop
                list(i)<=list(i+1);
              end loop;
              list(last)<=iptest;
              status2<="110";
              count1<=0;
              count3<=1;
            end if;
          end if;
        --iptest < list(0), shift down
        else
          for i in last-1 downto 1 loop
            list(i+1)<=list(i);
          end loop;
          list(0)<=iptest;
          status2<="110";
          count1<=0;
          count3<=1;
        end if;
      when others =>
        med_ready<='0';
        status2<="111";--dont care
    end case;
  end if;
end process adding;
--Process to add data to list ends here

--Synchronous Portion of the FSM

initial:process(clk) is
begin
  if clk'event and clk = '1' then
    if ce = '1' then
      -- now activate the FSM if median_en = '1'
      --counter increments only when this is asserted
      if median_en = '1' then

        if w = '1' then
          x<='1';
          counter<=counter+1;
          iptest<=in1;
          c_state<=n_state;
        end if;
      else
        x<='0';
        c_state<=n_state;
      end if;
    else
      c_state<=donothing;

    end if;
  end if;
end process initial;

--Combinatorial part of the FSM begins here

combinatorial:process(c_state,w,status2,counter,x) is
begin
  case c_state is
    when donothing =>
      status1<="111";--all deactivated

      --just do this once

      if counter = 0 then
        status1<="010";--initialize the array
        n_state<=donothing;

      else
        if x = '1' then
          n_state<=checkstatus;
        else
          n_state<=donothing;
        end if;
      end if;
    --normal process continues from here

    when checkstatus =>

      status1<="111";

      if counter <= 8 then
        status1<="011"; --filling up the array
        n_state<=addition;

      else
        --Activates the deletion process
        status1<="001";
        n_state<=deletion;
      end if;

    when addition =>

      status1<="111";--default value of status1
      --Go to median calculator state
      if status2 = "100" then
        status1<="100";--calculate the median
        n_state<=mediancal;
      elsif status2 = "110" then--filling up done
        n_state<=donothing;
      else--both add_dis and fill_up are not complete
        --continue with the same value of status1
        status1<=status2;
        n_state<=addition;
      end if;

    when deletion =>

      --Deactivate all other processes
      status1<="111";
      --remove the data from the list
      --Go to data addition state
      if status2 = "101" then
        status1<="000";
        n_state<=addition;
      else
        --Continue being in the same state
        status1<=status2;
        n_state<=deletion;
      end if;

    when mediancal =>

      status1<="111";
      --check if it works or not else introduce a status2 value
      --for disabling the median calculation process
      n_state<=donothing;
  end case;
end process combinatorial;
end Behavioral;

3. MedRAM Module (Memory module for storing the last 8 peaks)

--This is the RAM part of the median module

--Generates an activation signal and outputs the
--signal value that is to be deleted in the FSM module
library ieee;
use ieee.std_logic_1164.all;
use [Link] [Link];
use ieee.std_logic_unsigned.all;
--Entity
entity medram is
  port(
       clk:in std_logic;ce:in std_logic;
       --From ecgnew module
       in1:in std_logic_vector(39 downto 0);
       w:in std_logic;
       --To FSM of median module
       delout:out std_logic_vector(39 downto 0)

       );
end entity medram;

--Architecture
architecture behavior of medram is
  type mem is array(0 to 8) of std_logic_vector(39 downto 0);
  signal inter:mem;
  signal r:std_logic;
  signal count,counter:integer:=0;
begin
writing:process(in1,w,clk)
  variable x: std_logic:='0';
begin
  if clk'event and clk = '1' then
    if w = '1' then
      inter(counter)<=in1;
      if counter < 8 then
        counter<=counter+1;
        if x = '1' then
          count<=count+1;
          r<='1';
        else
          r<='0';
        end if;
      elsif counter = 8 then
        x:='1';
        count<=0;
        r<='1';
        counter<=0;
      else
        r<='0';
        counter<=0;
      end if;
    end if;
  end if;
end process writing;

reading:process(r,clk) is
begin
  if clk'event and clk = '1' then
    if r = '1' then
      delout<=inter(count);
    else
      delout<=X"0000000000";
    end if;
  end if;
end process reading;

end architecture behavior;
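
Taken together, the median calculator and MedRAM modules above maintain a sorted list of recent peak values: MedRAM remembers arrival order so the oldest value can be deleted, and the median FSM inserts each new value in sorted position and averages the entries at indices 4 and 5. The Python sketch below models that behaviour in software. It is an illustration under assumptions rather than a translation of the VHDL: the class name `SlidingMedian` is hypothetical, the depth of 9 is taken from the `0 to 8` array bounds, and the exact clock cycle at which the first median appears is simplified.

```python
import bisect
from collections import deque


class SlidingMedian:
    """Software model of a sorted-list sliding median (hypothetical helper)."""

    def __init__(self, depth=9):
        # depth must be at least 6 because indices 4 and 5 are read below.
        self.depth = depth
        self.history = deque()  # arrival order, the role MedRAM plays
        self.ordered = []       # ascending order, the role of `list`

    def push(self, value):
        """Add a peak value; return the median once the list is full."""
        if len(self.history) == self.depth:
            # Deletion step: drop the oldest value from the sorted list.
            oldest = self.history.popleft()
            del self.ordered[bisect.bisect_left(self.ordered, oldest)]
        # Addition step: insert the new value in sorted position.
        self.history.append(value)
        bisect.insort(self.ordered, value)
        if len(self.ordered) < self.depth:
            return None  # still filling up
        # Same indices as the listing: (list(4) + list(5)) / 2.
        return (self.ordered[4] + self.ordered[5]) // 2
```

Pushing the nine values 5, 1, 9, 3, 7, 2, 8, 4, 6 yields no output for the first eight and a median estimate on the ninth; each further value replaces the oldest one.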

APPENDIX B

SYNTHESIS REPORTS

This appendix contains the main portions of the synthesis reports for the two designs. The complete synthesis files are very lengthy, so only their main portions are shown in this section.

B.1. Synthesis Report for Design 1:

Logic Utilization:
Number of Slice Flip Flops: 2,830 out of 9,312 30%
Number of 4 input LUTs: 2,524 out of 9,312 27%
Logic Distribution:
Number of occupied Slices: 2,432 out of 4,656 52%
Number of Slices containing only related logic: 2,432 out of 2,432 100%
Number of Slices containing unrelated logic: 0 out of 2,432 0%
Total Number of 4 input LUTs: 3,034 out of 9,312 32%
Number used as logic: 2,524
Number used as a route-thru: 257
Number used as Shift registers: 253
Number of bonded IOBs: 1 out of 232 1%
Number of Block RAMs: 6 out of 20 30%
Number of GCLKs: 2 out of 24 8%
Number of BSCANs: 1 out of 1 100%
Number of MULT18X18SIOs: 13 out of 20 65%
Number of RPM macros: 2
Total equivalent gate count for design: 457,118
Additional JTAG gate count for IOBs: 48

Timing Summary:

Speed Grade: -4
Minimum period: 17.758ns (Maximum Frequency: 56.313MHz)
Minimum input arrival time before clock: 4.756ns
Maximum output required time after clock: 8.240ns
Maximum combinational path delay: No path found

Timing constraint: TS_clk_1b4f7cfa = PERIOD TIMEGRP "clk_1b4f7cfa" 20 nS HIGH 10 nS
Clock period: 17.758ns (frequency: 56.313MHz)
Total number of paths / destination ports: 13446796 / 3650
Number of failed paths / ports: 0 (0.00%) / 0 (0.00%)
Slack: 2.242ns

B.2. Synthesis Report for Design 2:

Logic Utilization:
Number of Slice Flip Flops: 3,608 out of 9,312 38%
Number of 4 input LUTs: 4,606 out of 9,312 49%
Logic Distribution:
Number of occupied Slices: 3,640 out of 4,656 78%
Number of Slices containing only related logic: 3,640 out of 3,640 100%
Number of Slices containing unrelated logic: 0 out of 3,640 0%
Total Number of 4 input LUTs: 5,379 out of 9,312 57%
Number used as logic: 4,606
Number used as a route-thru: 440
Number used for Dual Port RAMs: 80
(Two LUTs used per Dual Port RAM)
Number used as Shift registers: 253
Number of bonded IOBs: 1 out of 232 1%
Number of Block RAMs: 6 out of 20 30%
Number of GCLKs: 2 out of 24 8%
Number of BSCANs: 1 out of 1 100%
Number of MULT18X18SIOs: 9 out of 20 45%
Number of RPM macros: 2
Total equivalent gate count for design: 484,989
Additional JTAG gate count for IOBs: 48

Timing Summary:

Speed Grade: -4
Minimum period: 17.727ns (Maximum Frequency: 56.412MHz)
Minimum input arrival time before clock: 4.756ns
Maximum output required time after clock: 8.100ns
Maximum combinational path delay: No path found

Timing constraint: TS_clk_1b4f7cfa = PERIOD TIMEGRP "clk_1b4f7cfa" 20 nS HIGH 10 nS
Clock period: 17.727ns (frequency: 56.412MHz)
Total number of paths / destination ports: 21952776 / 5517
Number of failed paths / ports: 0 (0.00%) / 0 (0.00%)
Slack: 2.273ns
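
The headline figures in the two reports are related by simple arithmetic: the maximum frequency is the reciprocal of the minimum period, the slack is the 20 ns period constraint minus the minimum period, and each utilization percentage is the used count divided by the available count. The Python sketch below reproduces those relations so the reported numbers can be sanity-checked. The function names are hypothetical, and small last-digit differences from the tool output are expected because the tool works from unrounded values.

```python
def max_frequency_mhz(min_period_ns):
    """Maximum clock frequency in MHz for a minimum period in ns."""
    return 1000.0 / min_period_ns


def slack_ns(constraint_ns, min_period_ns):
    """Timing slack: period constraint minus achieved minimum period."""
    return constraint_ns - min_period_ns


def utilization_percent(used, available):
    """Resource utilization as a percentage."""
    return 100.0 * used / available


# Design 1 and Design 2 figures taken from the reports above.
for name, period, slices in (("Design 1", 17.758, 2432), ("Design 2", 17.727, 3640)):
    print(name,
          round(max_frequency_mhz(period), 2), "MHz,",
          round(slack_ns(20.0, period), 3), "ns slack,",
          int(utilization_percent(slices, 4656)), "% of slices")
```

Both designs meet the 20 ns constraint with a little over 2 ns of slack; Design 2 pays for the median logic mainly in occupied slices.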

BIBLIOGRAPHY

[1]. Bert-Uwe Kohler, Karsten Hennig, Reinhold Orglmeister, Principles of software QRS detection, IEEE Engineering in Medicine and Biology, January 2002.

[2]. [Link]

[3]. [Link]

[4]. Andre Vander Vorst, Arye Rosen, Youji Kotsuka, RF/Microwave Interaction with Biological Tissues (IEEE Press, John Wiley and Sons Inc. 2006).

[5]. Jiapu Pan and Willis J. Tompkins, A Real-Time QRS Detection Algorithm, IEEE Transactions on Biomedical Engineering, Vol. BME-32, No. 3, March 1985.

[6]. Patrick S. Hamilton and Willis J. Tompkins, Quantitative Investigation of QRS Detection Rules using the MIT/BIH Arrhythmia Database, IEEE Transactions on Biomedical Engineering, Vol. BME-33, No. 12, December 1986.

[7]. [Link].

[8]. [Link], Vinod Kumar and H.K. Verma, ANN-based QRS-complex analysis of ECG, Journal of Medical Engineering and Technology, Vol. 22, No. 4, July/August 1998.

[9]. Xilinx Spartan-3E starter kit user guide, March 9, 2006.

[10]. Xilinx Spartan-3E FPGA family functional description, August 24, 2004.

[11]. Qiuzhen Xue, Yu Hen Hu, Willis J. Tompkins, Neural Network based adaptive matched filtering for QRS detection, IEEE Transactions on Biomedical Engineering, Vol. 39, No. 4, April 1992.

[12]. Xilinx System Generator for DSP V9.1.01, user's guide, March 2007.

[13]. Xilinx ISE 8.1 Synthesis tutorial for Spartan 3E boards.

[14]. Altium VHDL synthesis reference, June 10th 2005.

[15]. Xilinx coding styling guidelines.

[16]. Wansuree Massagram, Olga Boric-Lubecke, Luca Macchiarulo and Mingqi Chen, Heart Rate Variability monitoring and Assessment System on Chip, Proceedings of 2005 IEEE Engineering in Medicine and Biology, 27th Annual Conference, September 2005.

[17]. Ashish Shukla and Luca Macchiarulo, FPGA based ECG Analysis System, IASTED conference on Biomedical Engineering, February 13th 2008. Accepted for publication.
