DSH - L3 - Inputs and Outputs
Healthcare environment
Messy Data
31/10/2023 | 3
Issues that can make data ‘messy’
•Errors and gaps in data, missing data, noise, artefacts
[Figure: magnitude responses |H(f)| of theoretical ‘brickwall’ filters — low-pass and high-pass (cut-off frequency fs), band-pass and band-stop (cut-offs fs1, fs2); the band-stop filter is also called a ”notch filter” if fs1 ≈ fs2 (e.g. 50 Hz)]
FIR and IIR filters
• Finite Impulse Response (FIR) filter: the response is a function of
current and past inputs only. After the filter has stopped receiving
inputs it will eventually stop producing outputs: h(n) is nonzero
only over a finite range
• Infinite Impulse Response (IIR) filter: the response also depends on
past outputs (feedback), so h(n) can remain nonzero indefinitely
[Figure: an example original signal and its FIR-filtered output, and the same original signal with its IIR-filtered output (amplitude −10…10, time 0…10)]
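As a sketch of the FIR/IIR contrast above, the following compares both filter types on a made-up noisy sine wave (sampling rate, cut-off, filter orders and the test signal are all illustration choices, not from the lecture):

```python
# Sketch (assumed data): low-pass filter a noisy 1 Hz sine with an FIR and
# an IIR filter, loosely mirroring the original/filtered plots above.
import numpy as np
from scipy import signal

fs = 100.0                        # assumed sampling rate, Hz
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
x = 5 * np.sin(2 * np.pi * 1.0 * t) + rng.normal(0, 1.5, t.size)  # signal + noise

# FIR: 31-tap low-pass, cut-off 5 Hz; h(n) is nonzero on a finite range only
h = signal.firwin(numtaps=31, cutoff=5.0, fs=fs)
y_fir = signal.lfilter(h, 1.0, x)

# IIR: 4th-order Butterworth low-pass with the same cut-off
b, a = signal.butter(N=4, Wn=5.0, fs=fs)
y_iir = signal.lfilter(b, a, x)

# both outputs keep the 1 Hz component while the noise is attenuated
print(np.std(x), np.std(y_fir), np.std(y_iir))
```

Both filters reduce the signal's standard deviation because the wide-band noise outside the passband is suppressed.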
Frequency representations
time domain: system output = impulse response ∗ input (convolution)
  ↕ Fourier / inverse Fourier transforms
frequency domain: frequency output = frequency response × frequency input (multiplication)

also:
time domain: system output = impulse response × input (multiplication)
  ↕ Fourier / inverse Fourier transforms
frequency domain: frequency output = frequency response ∗ frequency input (convolution)
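The duality above can be checked numerically with the DFT — multiplying two spectra corresponds to circularly convolving the time-domain sequences (the random test signals are made up for the check):

```python
# Small numerical check of the convolution theorem with the DFT:
# multiplication in the frequency domain = circular convolution in time.
import numpy as np

rng = np.random.default_rng(1)
M = 64
x = rng.normal(size=M)            # input signal (made-up test data)
h = rng.normal(size=M)            # impulse response of equal length

# frequency domain: multiply the spectra, then transform back
y_freq = np.real(np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)))

# time domain: the same output computed directly as a circular convolution
y_time = np.array([sum(h[k] * x[(i - k) % M] for k in range(M))
                   for i in range(M)])

print(np.allclose(y_freq, y_time))   # → True
```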
Instead of the on/off, or 1/0, behaviour we saw in the theoretical ‘brickwall’ filter pictures, in reality we have a more graded response.
Response of a linear time-invariant filter to a sinusoidal input
• the response to a sinusoidal input is a sinusoidal output with the
same frequency, multiplied by a complex factor H(f) (which can
be obtained from the impulse response h(n))
• H(f) (or H(ω)) is called the frequency response; it is characterised
by
• a magnitude response |H(f)|: the ratio of output amplitude to input
amplitude as a function of frequency
• and a phase response φ(f): the phase of the output (in degrees or
radians); it can be considered the delay of the output signal as a
function of frequency. The phase of the input is defined as 0.
Magnitude and phase response of example filters
h(n) = [0.2 0.2 0.2 0.2 0.2] h(n)=[0.2 -0.2 0.2 -0.2 0.2]
note, here
- the frequency axis is in radians/sample (normalised representation):
the value 1 equals half the sampling rate, the maximum up to which we
should evaluate the filter’s behaviour
- the magnitude axis is in dB: |H(f)| in dB = 20 log10(Amp_out(f) / Amp_in(f))
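The two example impulse responses above can be evaluated with `scipy.signal.freqz` (the 512-point frequency grid is an arbitrary choice):

```python
# Sketch: frequency responses of the two 5-tap example filters above.
import numpy as np
from scipy import signal

h_lp = np.array([0.2, 0.2, 0.2, 0.2, 0.2])     # moving average: low-pass
h_hp = np.array([0.2, -0.2, 0.2, -0.2, 0.2])   # alternating signs: high-pass

w, H_lp = signal.freqz(h_lp, worN=512)         # w runs from 0 to pi rad/sample
w, H_hp = signal.freqz(h_hp, worN=512)         # (pi = half the sampling rate)

mag_lp_db = 20 * np.log10(np.maximum(np.abs(H_lp), 1e-12))  # |H(f)| in dB
mag_hp_db = 20 * np.log10(np.maximum(np.abs(H_hp), 1e-12))

# the moving average passes DC and attenuates high frequencies;
# the alternating-sign filter does the opposite
print(mag_lp_db[0], mag_lp_db[-1], mag_hp_db[0], mag_hp_db[-1])
```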
Different magnitude frequency responses of theoretical ’brickwall’ filters
[Figure: |H(f)| of low-pass and high-pass filters (cut-off fs) and of band-pass and band-stop filters (cut-offs fs1, fs2); the band-stop filter is also called a ”notch filter” if fs1 ≈ fs2]
Magnitude frequency response of a more realistic filter
• Difficult to design: very steep (relative to fs) transition bands, especially with FIR filters
• Always check the frequency responses after the design!
• Note: all filters have a processing delay
• linear-phase FIR: half the filter length
• other filters: depends on the phase response, but usually of the order of 50% of the filter length
FIR vs IIR filters
FIR filters can always be designed to have a linear phase shift (often
important for biomedical signal interpretation, as it keeps the time
relationships between different waves intact)
FIR filters are inherently stable
coefficient accuracy problems that may occur for sharp-cut-off IIR filters
are less prominent for FIR filters
FIR filters can be realised efficiently in hardware
FIR filters often require a much higher order than IIR filters to achieve a
given level of performance
10th-order FIR filter vs 5th-order IIR (Butterworth) filter
[Figure: amplitude responses (0–1) and phase responses (−4…4) of both filters over normalised frequency 0–0.5]
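The linear-phase delay mentioned above can be demonstrated directly: a symmetric FIR filter of length N delays the signal by (N−1)/2 samples. The filter length and cut-off below are illustration values:

```python
# Sketch: a linear-phase FIR filter of length N = 21 delays its input by
# (N - 1) / 2 = 10 samples; visible as the peak of the impulse response.
import numpy as np
from scipy import signal

N = 21
h = signal.firwin(numtaps=N, cutoff=0.2)       # symmetric taps -> linear phase

# pass a unit impulse through the filter; the output peaks at the group delay
impulse = np.zeros(60)
impulse[0] = 1.0
y = signal.lfilter(h, 1.0, impulse)
print(int(np.argmax(y)))                       # → 10, i.e. (21 - 1) / 2 samples
```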
[Figure: ECG segment (sampling rate 150 Hz, approx. 18.6–19.5 s) processed with the filters from the previous slide — blue: original ECG, green: FIR output, red: IIR output]
More advanced filtering
•optimal filtering: obtain best possible filter, usually
using assumptions about stationarity and a priori
knowledge
•time-varying (adaptive) filtering: obtain filters that aim
to achieve optimal filtering in changing environments
and in environments in which the signal properties are
not known beforehand
Optimal filtering
• problem of overlap of signals and noise in the frequency domain
[Figure: power spectra P(f) of signal and noise — left: non-overlapping, with the |H(f)| of an ideal filter; right: overlapping, with a less-than-ideal filter]
left: when noise and real signal have clearly different frequency contents,
designing a filter is easy (in this case low-pass)
right: when noise and real signal frequency contents overlap, choosing
a filter is more difficult: either you have to keep some noise or remove some
signal content
Different criteria for ‘optimal filter’
• minimization of the mean squared error (MSE): min E[e²]
• least squares error criterion (LSE): min Σn en²
• maximization of the signal-to-noise ratio: max SNR
Assumption: the recorded signal x is the sum of the clean signal s and noise n: x(t) = s(t) + n(t).
PSDs = power spectral density of s, PSDn = power spectral density of n,
PSDx = power spectral density of the total recorded data x (= s + n), so PSDx(f) = PSDs(f) + PSDn(f).
The optimal filter follows as Hopt(f) = PSDs(f) / PSDx(f).
[Figure: example over 0–100 Hz — top: PSDs(f) and PSDx(f) = PSDs(f) + PSDn(f); bottom: the resulting Hopt(f), ranging between 0 and 1]
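A minimal sketch of the optimal (Wiener) filter Hopt(f) = PSDs(f)/PSDx(f), in an idealised setting where the clean signal and noise are known so their spectra can be computed directly (in practice the PSDs must be estimated; the 5 Hz test signal is made up):

```python
# Sketch: frequency-domain optimal filter Hopt(f) = PSDs(f) / PSDx(f),
# applied to a known signal + noise mixture (oracle spectra, for illustration).
import numpy as np

rng = np.random.default_rng(2)
n_samp, fs = 4000, 100.0
t = np.arange(n_samp) / fs
s = np.sin(2 * np.pi * 5.0 * t)                # clean signal at 5 Hz (assumed)
n = rng.normal(0.0, 1.0, n_samp)               # wide-band noise
x = s + n                                      # recorded data x = s + n

S = np.abs(np.fft.rfft(s)) ** 2                # power spectrum of s
Npow = np.abs(np.fft.rfft(n)) ** 2             # power spectrum of n
H_opt = S / (S + Npow)                         # = PSDs / PSDx, between 0 and 1

s_hat = np.fft.irfft(np.fft.rfft(x) * H_opt, n_samp)

# the filtered estimate is much closer to s than the raw recording
print(np.mean((x - s) ** 2), np.mean((s_hat - s) ** 2))
```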
Adaptive filtering
• optimal filters: stationary signals with known frequency contents. In
reality this information is often not available and the signals are not
stationary.
• processing non-stationary signals requires a filter that 'learns' signal
properties and adjusts itself continuously to perform optimally under
changing circumstances: adaptive filters.
• many similarities with adaptive algorithms in control theory
• typically, adaptive filters perform poorly in the initial phase as they will
have to 'learn' the knowledge that is not a priori available
General Structure of an Adaptive Filter
• Three main parts:
1. performance index/criterion: preferably using something 'easy' like
minimising squared errors
[Figure: adaptive noise cancellation — the recorded signal x = s + n (ECG + noise) and a reference noise input nref; the adaptive filter produces an approximated noise, and the output is ECG + noise − approximated noise. See https://www.electronicdesign.com/industrial-automation/article/21805808/whats-the-difference-between-passive-and-active-noise-cancellation]
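The noise-cancellation idea above can be sketched with a standard LMS adaptive filter (not necessarily the lecture's exact scheme; all signals and parameters below are made up): a reference noise nref drives an adaptive FIR filter whose output approximates the noise in x = s + n, and subtracting it leaves an estimate of the clean signal.

```python
# Sketch: LMS adaptive noise cancellation with a made-up 'ECG-like' signal.
import numpy as np

rng = np.random.default_rng(3)
n_samp, order, mu = 5000, 8, 0.01
t = np.arange(n_samp)
s = np.sin(2 * np.pi * t / 50)                        # clean 'ECG-like' signal
n_ref = rng.normal(size=n_samp)                       # reference noise input
n = np.convolve(n_ref, [0.6, 0.3, 0.1])[:n_samp]      # noise reaching the sensor
x = s + n                                             # recorded signal

w = np.zeros(order)                                   # adaptive filter weights
e = np.zeros(n_samp)                                  # error output, ~ s
for i in range(order - 1, n_samp):
    u = n_ref[i - order + 1:i + 1][::-1]              # latest reference samples
    y = w @ u                                         # approximated noise
    e[i] = x[i] - y                                   # recorded - approx. noise
    w += 2 * mu * e[i] * u                            # LMS weight update

# after the initial 'learning' phase the error signal tracks s much
# better than the raw recording does
print(np.mean((x - s) ** 2), np.mean((e[2000:] - s[2000:]) ** 2))
```

Note how the filter performs poorly at first and improves as it learns, matching the remark above about the initial phase.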
Artefact detection and rejection
Artefacts - a division based on their origin
Biological artefacts
•Cardiac signal on EEG
•Muscle activity in EEG
•Eye movement in EEG
•Breathing
•Sweating
(these can appear as rhythmical slow waves in the EEG)
Smith, van Gils, Prior. Neurophysiological Monitoring During Intensive Care and
Surgery, 2006
Dealing with artefacts
• Be very conservative with ’cleaning up’ recordings by
ruthlessly removing artefacts from the data; rather, carefully
annotate them instead.
• The idea is that many artefacts cause changes that are large in
comparison with the biosignal under study.
• Problems:
• how to set those thresholds? and
• certainly not all artefacts cause high amplitude changes
• other indicators (than large amplitudes) of artefacts
[Figure: simple artefact indicators on a signal segment (amplitude in µV versus time): the maximum Smax, minimum Smin, their difference Sdif, and the maximum slope = abs(dV/dt)max, where dt is the sample interval]
Determination of thresholds
• Too tight a threshold would result in rejection of data that has no serious
artefacts; too loose a threshold would allow the acceptance of 'bad' data.
[Figure: outlier detection on slopes — the per-segment Slopei = max |ΔV/Δt| is compared against the distribution (mean ± s) over N segments; example trace: ±50 µV, 1 second, patient 32, time 16:47:52]
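The indicators above can be sketched as simple per-segment amplitude and slope flags; the data, the injected artefact and the mean + 3·sd threshold rule are all made-up illustration choices:

```python
# Sketch: flag one-second segments whose amplitude range (Sdif) or maximum
# per-sample slope exceeds a data-derived threshold (assumed mean + 3 sd rule).
import numpy as np

rng = np.random.default_rng(4)
fs = 100                                        # assumed sampling rate
sig = rng.normal(0, 10, 30 * fs)                # 30 s of 'EEG-like' data (µV)
sig[1200:1250] += 80.0                          # inject a high-amplitude artefact

segs = sig.reshape(-1, fs)                      # 1-second segments
s_max = segs.max(axis=1)
s_min = segs.min(axis=1)
s_dif = s_max - s_min                           # Sdif from the figure
slope = np.abs(np.diff(segs, axis=1)).max(axis=1)   # max |dV| per sample step

# flag segments whose range or slope exceeds mean + 3 sd over all segments
flags = ((s_dif > s_dif.mean() + 3 * s_dif.std()) |
         (slope > slope.mean() + 3 * slope.std()))
print(np.flatnonzero(flags))                    # injected artefact is in segment 12
```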
Other variances: recording sites, devices, staff
• Different hospitals have different equipment (e.g. imaging
scanners, lab analysis devices, patient monitors).
These all have different properties with respect to measurement
parameters and quality of data.
• Missing at Random (MAR): the probability of an instance being missing may depend on known values but not
on the missing value itself. Sensor example: for a temperature sensor, the fact that a value is missing
doesn’t depend on the temperature, but might depend on some other factor, for example the battery
charge of the thermometer. Survey example: whether or not someone answers a question in a
survey (e.g. about age) doesn’t depend on the answer itself, but may depend on the answer to another
question (e.g. gender). In this case, the missing data may actually contain potentially valuable information.
• Not Missing at Random (NMAR): the probability of an instance being missing could depend on the value of the
variable itself. Sensor example: artefact rejection may take place on the basis of ‘unusual’ values that actually
come from a physiological process. Survey example: whether or not someone self-measures something
may depend on the actual values.
In this case, the missing data very likely contains potentially valuable information.
Missing data in 10 years’ weight data
[Figure: ten years (1.8.90–1.8.99) of self-monitored weight (kg, roughly 80–85) with gaps in the record, e.g. “…to check my weight during Christmas” (MAR? or NMAR?)]
Lappalainen, Raimo, Piia Pulkkinen, Mark van Gils, Juha Pärkkä, and Ilkka Korhonen. 2005. “Long-term
Self-monitoring of Weight: A Case Study.” Cognitive Behaviour Therapy 34 (2): 108–14.
https://doi.org/10.1080/16506070510008452.
Non-adherence* and drop-outs:
• Drop-outs in drug trials are usually indicating
something ‘bad’. High drop-out rates make the
study less convincing.
• What is best depends on the application (how much data you have to spare, is interpolation
reasonable)
• Imputation should not be done blindly, only if the missing data are known to be MCAR
• If they are not MCAR, the missing data may have a message to tell and you should be
careful with automatically ’fixing them’
Another issue: imbalanced data
• Related to Missing Data is the fact that we quite often have an imbalance
in the data sets. (In the general population, people with a disease are,
luckily, less prevalent than healthy people)
• So, quite often we have a situation where class A has, e.g. 995 cases and
class B only 5 cases to work with.
• This requires special thinking on how to best develop classifiers – either A)
‘do something with the data to make it more balanced’ or B) interpret the
results carefully and use appropriate analysis methods
• ‘Do something with the data’ could be: oversampling (of the disease
cases), undersampling (of the healthy cases), weighting of class
importances, generating synthetic data, ….
• We will come back to this issue in more detail in Lecture 7
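As a small sketch of one of these options — naive random oversampling of the minority class — using the 995-vs-5 example above (the feature values and class labels are made up):

```python
# Sketch: random oversampling of the minority class with replacement,
# turning a 995-vs-5 class split into a balanced 995-vs-995 set.
import numpy as np

rng = np.random.default_rng(5)
X = rng.normal(size=(1000, 4))                  # made-up feature matrix
y = np.array([0] * 995 + [1] * 5)               # 995 healthy vs 5 disease cases

minority = np.flatnonzero(y == 1)
extra = rng.choice(minority, size=990, replace=True)   # resample with replacement
X_bal = np.vstack([X, X[extra]])
y_bal = np.concatenate([y, y[extra]])

print(np.bincount(y), np.bincount(y_bal))       # class counts before and after
```

Note that duplicated minority cases carry no new information; results on such data must still be interpreted carefully, as the slide says.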
Complex Targets
Performance Assessment as basis for development
‘Straightforward measures’: specificity, …
[Figure: encoder–decoder residual CNN — a 64x64 input passes through ResEle blocks (residual elements with batch norm and ReLU) and convolutions with filter shapes [In, Out, Filter size, Filter size] (e.g. [32,32,3,3], [N,32,3,3], [128,64,3,3]), 2x2 pooling stages down to 8x8, deconvolutions back up to 64x64, and final 3x3 and 1x1 convolutions to a 64x64xK output]