DSH - L3 - Inputs and Outputs


BBT.HTI.509/NBE-E4080 Decision Support in Healthcare
Lecture 3 – Considerations regarding inputs and outputs
Mark van Gils


[Diagram: healthcare-specific data → healthcare AI/ML/signal processing → healthcare-specific targets, all within the healthcare environment]

Messy Data
Issues that can make data ‘messy’
• Errors and gaps in data, missing data, noise, artefacts
• Generalisation and harmonisation issues: recordings at different locations, from different devices with different properties, operated by different users, …
• Lack of good-quality annotations that tell what was happening when the data was collected
  (engineer view vs. nurse view – signals and annotations)

Selected annotations, translated:
• Child brought into the O.R., cries and shouts, plans to escape
• Patient being moved on bed, BIS recording started at this moment
• Problems in getting entropy data …, claims there is some artefact …
• Patient squeezed hand

Full set of annotations made by a nurse during an operation:
11:47:03 maanantai 06.10.2003
12:02:32 lapsi saliin
12:06:30 OOAS 5, itkee ja huutaa ja meinaa lähteä karkuun
12:08:03 anestesia alkaa....., lapsi pyörii
12:08:50 OOAS 2
12:09:31 tipan laitto, OOAS 1
12:09:56 Fentanyl 25 mikrog i.v., esmeron 6 mg i.v.
12:10:50 intubaatio, OOAS 0
12:11:29 putken kiinnitys
12:12:28 potilasta liikutetaan sängyllä, Bis tuli mukaan tässä vaiheessa
12:13:39 potilasta liikutetaan
12:14:41 Pesu
12:23:26 fentanyl 15 mikrog i.v.
12:23:48 Toimenpide alkaa
12:24:03 Demographic data: 4v, 13kg
12:29:40 entropian datan saamisen kanssa ongelmia..., väittää että artefactaa...
12:31:44 Per-dafalgan 250 mg i.v.
12:42:15 Zinacef 400 mg i.v.
12:42:48 katetrin laitto
12:42:57 toimenpide loppui
12:43:04 anestesia loppui
12:45:13 extubaatio
12:45:45 ketorin 25 mg i.v.
12:47:33 heräämöön
12:47:23 heräämössä, OOAS 2, Steward 1, Alderete 6
12:51:03 OOAS 3, Steward 1, Alderete 6
….
13:47:48 OOAS 5, Steward 6, Alderete 10

Some of the annotations are quantified observations, like OOAS (observer’s assessment of alertness and sedation), an observational measure of sedation from 5 (awake) to 0 (deep anaesthesia). But much of it is free text that is not so easy to handle: it requires manual work or, increasingly, natural language processing software.
Other issues with input data
• Obtaining data may be
  • expensive (equipment, tests, staff)
  • ‘taking forever’ (for technical, physiological, bureaucratic reasons)
  • inconvenient to patients, uninteresting to subjects [drop-outs, low adherence]
• Class imbalance: rare diseases/pathologies/events usually have few data examples, making development of classifiers sometimes difficult
Noise and artefacts
Merriam-Webster:
• noise: irrelevant or meaningless data or output occurring along with desired information
• artefact (see artifact)
• a product of artificial character (as in a scientific test) due usually to extraneous (as human) agency
• an electrocardiographic and electroencephalographic wave that arises from sources other than the heart or
brain
• a defect in an image (such as a digital photograph) that appears as a result of the technology and methods
used to create and process the image

In healthcare data analysis practice
• noise – tends to be more continuous, obscures the signal
• artefact – often narrower location in time (‘events’), can be mistaken for part of the signal; more heterogeneous/complex/unexpected/weird/interesting
In practice, the difference between noise and artefacts is not strict.
Noise reduction – filters
First ‘weapon of choice’: signal processing → filters, implemented in hardware or software
• Linear Time-Invariant (LTI) filters: used to remove low or high frequencies, or frequency bands in which noise occurs
• Optimal filters: can deal with situations where the frequency contents of noise and signal overlap
• Adaptive filters: optimal filters that adapt to changes in the properties of noise and signal
Time domain → Fourier Transform → Frequency domain
The ‘Fourier transform’ is computed with, e.g., the Discrete Fourier Transform (DFT) or the Fast Fourier Transform (FFT).
Power Spectral Density (PSD) ~ |FFT|²

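As an illustration of going from the time domain to the frequency domain, here is a minimal Python sketch (numpy and scipy assumed available); the 5 Hz test signal, sampling rate and noise level are made up for illustration only:

import numpy as np
from scipy import signal

fs = 200.0                                # sampling rate (Hz), illustrative
t = np.arange(0, 10, 1 / fs)              # 10 s of data
x = np.sin(2 * np.pi * 5 * t) + 0.5 * np.random.randn(t.size)  # 5 Hz sine + noise

# DFT via the FFT; keep only the non-negative frequencies
X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(x.size, d=1 / fs)

# "PSD ~ |FFT|^2" (up to scaling); Welch's method gives a smoother estimate
psd_raw = np.abs(X) ** 2 / (fs * x.size)
f_welch, psd_welch = signal.welch(x, fs=fs, nperseg=512)

print(freqs[np.argmax(psd_raw)], f_welch[np.argmax(psd_welch)])  # both peak near 5 Hz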


|H(f)| = magnitude frequency response of a filter: the factor by which the signal at a given input frequency is multiplied to form the output. If it is 1, the filter passes everything through ‘as is’ at that frequency; if it is 0, it suppresses the signal completely at that frequency.

|H ()|
|H(f)| |H ()|
|H(f)|
1 1

0 0
 
s f s
fs f
fs
low -pass filter high-pass filter

|H(f)|
|H ()|
1
|H(f)|
|H ()|
1

0 0
 s1
fs1  s2 
fs2  s1
fs1  s2 
fs2 f
f
band-pass filter band-stop filter
(notch filter if  s1  s2)
also called ”notch filter” if fs1fs2 (e.g. 50Hz)
FIR and IIR filters
• Finite Impulse Response (FIR) filter: the response is a function of current and past inputs only. After the filter has stopped receiving inputs it will eventually stop producing outputs: h(n) is non-zero in a finite range only.
• Infinite Impulse Response (IIR) filter: the response is a function of both current and past inputs and past outputs. Even after the filter has stopped receiving inputs it will still generate new outputs due to the ‘re-use’ of past outputs (the impulse response h(n) is non-zero for all positive n, an infinite range, and may thus go on ‘infinitely’).
FIR and IIR filters, mathematical general form
y(n) is the output of the filter in response to an input signal x(n):

y(n) = \sum_{l=0}^{L} A_l \, x(n-l) + \sum_{k=1}^{K} B_k \, y(n-k)

The first sum runs over current and past signal samples, the second sum over past filter outputs. An FIR filter uses only the first sum (all B_k = 0); an IIR filter uses both.
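A small sketch of how this difference equation maps onto scipy.signal.lfilter; the coefficient values are arbitrary examples, and note that scipy's a-vector uses the opposite sign convention for the feedback coefficients B_k:

import numpy as np
from scipy import signal

x = np.random.randn(100)                   # example input signal

# FIR: output depends on current and past inputs only (all B_k = 0)
A = np.ones(5) / 5                         # 5-point moving average, the A_l coefficients
y_fir = signal.lfilter(A, [1.0], x)        # lfilter(b, a, x); a = [1] means no feedback

# IIR: output also re-uses past outputs (feedback, the B_k coefficients)
b, a = [0.2], [1.0, -0.8]                  # y(n) = 0.2*x(n) + 0.8*y(n-1)
y_iir = signal.lfilter(b, a, x)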
[Figures: an example original signal and the corresponding filtered signal for an FIR filter (top pair) and an IIR filter (bottom pair).]
Frequency representations
• Time domain: system output = impulse response ∗ input (convolution)
  ↕ Fourier / inverse Fourier transforms
• Frequency domain: output spectrum = frequency response × input spectrum (multiplication)

Also:
• Time domain: output = impulse response × input (multiplication)
  ↕ Fourier / inverse Fourier transforms
• Frequency domain: output spectrum = frequency response ∗ input spectrum (convolution)
Instead of the on/off, or 1/0, behaviour we saw in the theoretical ‘brickwall’ filter pictures, in reality we have a more graded response.
Response of a linear time-invariant filter to a sinusoidal input
• The response to a sinusoidal input is a sinusoidal output with the same frequency, but multiplied by a complex factor H(f) (which can be obtained from the impulse response h(n))
• H(f) (or H(ω)) is called the frequency response; it is characterized by
  • a magnitude response |H(f)|: the ratio of output amplitude to input amplitude as a function of frequency
  • a phase response φ(f): the phase of the output (in degrees or radians); it can be considered the delay of the output signal as a function of frequency. The phase of the input is defined as 0.
Magnitude and phase response of example filters
h(n) = [0.2 0.2 0.2 0.2 0.2] and h(n) = [0.2 -0.2 0.2 -0.2 0.2]

Note: here
• the frequency axis is in units of π radians/sample – a normalised representation in which the value 1 equals half the sampling rate (the maximum up to which we need to evaluate the filter’s behaviour)
• the magnitude axis is in dB: |H(f)| in dB = 20 log10(Amp_out(f)/Amp_in(f))
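Plots like these can be reproduced with scipy.signal.freqz; a minimal sketch using the two impulse responses above (matplotlib assumed available):

import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

h_avg = [0.2, 0.2, 0.2, 0.2, 0.2]        # averaging filter -> low-pass behaviour
h_alt = [0.2, -0.2, 0.2, -0.2, 0.2]      # alternating signs -> high-pass behaviour

for h, name in [(h_avg, "h(n) = [0.2 0.2 0.2 0.2 0.2]"),
                (h_alt, "h(n) = [0.2 -0.2 0.2 -0.2 0.2]")]:
    w, H = signal.freqz(h, worN=512)     # w in rad/sample, from 0 to pi
    plt.plot(w / np.pi, 20 * np.log10(np.abs(H) + 1e-12), label=name)

plt.xlabel("normalised frequency (x pi rad/sample)")  # 1.0 = half the sampling rate
plt.ylabel("|H(f)| (dB)")
plt.legend()
plt.show()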
[Figure repeated from earlier: the different magnitude frequency responses of theoretical ’brickwall’ filters – low-pass, high-pass, band-pass and band-stop (a ”notch filter” if fs1 ≈ fs2).]
Magnitude frequency response of a more realistic filter
No ’brickwall’ shape, but a transition band, and stop- and passbands that do not have exactly 0 or 1 magnitude response.
Design of filters
For example, with the Matlab filter design toolbox, or scipy.signal in Python.
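A minimal design sketch with scipy.signal; the filter types, orders and cut-off frequency below are illustrative choices rather than a recommendation, and the response is checked afterwards as advised in the tips that follow:

from scipy import signal

fs = 150.0                                   # sampling rate (Hz), illustrative
# 4th-order Butterworth low-pass IIR, 30 Hz cut-off, in numerically robust SOS form
sos = signal.butter(4, 30, btype="lowpass", fs=fs, output="sos")

# 51-tap linear-phase FIR low-pass with the same cut-off
fir = signal.firwin(51, 30, fs=fs)

# Always check the frequency response after the design
w_iir, h_iir = signal.sosfreqz(sos, worN=1024, fs=fs)
w_fir, h_fir = signal.freqz(fir, worN=1024, fs=fs)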
Filter design tips
• In filter design you can choose the filter parameters:
  • filter order N = how many coefficients ⇒ how many computational operations
    • Note: in standard IIR filters the number of coefficients = 2 × filter order (A and B coefficients)
  • Higher N:
    + better approximation of the ideal filter in the frequency domain (steepness of the transition band, ripple in the passband, attenuation in the stopband)
    − longer time delay (especially in FIR filters); more ripple for transients and impulse-like shapes in the signal in the time domain; more nonlinear phase response in IIR filters
  • other parameters (not available in all design methods and filter types): pass- and stopband edge frequencies, ripple in the passband, attenuation in the stopband
  • design is always a compromise
• Difficult design: very steep (relative to fs) transition bands, especially with FIR filters
• Always check the frequency responses after the design!
• Note: all filters have a processing delay
  • Linear-phase FIR: half of the filter length
  • Other filters: depends on the phase response, but usually of the order of 50% of the filter length
FIR vs IIR filters
• FIR filters can always be designed to have a linear phase shift (often important for biomedical signal interpretation, as it keeps the time relationships between different waves intact)
• FIR filters are inherently stable
• Coefficient accuracy problems that may occur for sharp IIR cut-off filters can be made less prominent with FIR filters
• FIR filters can be realised efficiently in hardware
• FIR filters often require a much higher order than IIR filters to achieve a given level of performance
10th-order FIR filter vs. 5th-order IIR (Butterworth) filter
[Figure: amplitude and phase responses of the two filters over normalised frequency 0–0.5.]
• FIR (10th order): wider transition band, linear phase
• IIR (5th order, Butterworth): steeper amplitude response, nonlinear phase, greater delay
[Figure: an EKG segment filtered with the IIR and FIR filters from the previous slide (sampling rate 150 Hz); blue – original ECG, green – FIR output, red – IIR output.]
More advanced filtering
• Optimal filtering: obtain the best possible filter, usually using assumptions about stationarity and a priori knowledge
• Time-varying (adaptive) filtering: obtain filters that aim to achieve optimal filtering in changing environments and in environments in which the signal properties are not known beforehand
Optimal filtering
• Problem: overlap of signal and noise in the frequency domain
[Figure: left – non-overlapping signal and noise spectra with an ideal filter (|H| = 1 over the signal band, 0 elsewhere), giving a pure signal at the filter output; right – overlapping signal and noise spectra with a less-than-ideal filter, giving signal and (reduced) noise at the output.]
Left: when the noise and the real signal have clearly different frequency contents, designing a filter is easy (in this case a low-pass filter).
Right: when the noise and real-signal frequency contents overlap, choosing a filter is more difficult: either you have to keep some noise or remove some signal content.
Different criteria for an ‘optimal filter’
• minimization of the mean squared error (MSE): min E{e²}
• least squares error criterion (LSE): min Σ_n e_n²
• maximization of the signal-to-noise ratio: max SNR
• MSE uses a stochastic point of view; LSE regards the time series as deterministic and uses the specific sequences of the record
Optimal Filter in MSE sense – Wiener filter
• If signal and noise are uncorrelated and either the signal or the noise has zero mean, the frequency response of the Wiener filter is:

H_opt(f) = PSD_s(f) / (PSD_s(f) + PSD_n(f)) = PSD_s(f) / PSD_x(f)

Assumption: the recorded signal x is the sum of the clean signal s and noise n, i.e. x(t) = s(t) + n(t).
PSD_s = power spectral density of s, PSD_n = power spectral density of n, PSD_x = power spectral density of the total recorded data x (= s + n).

Problem: this requires a priori knowledge of the PSD of both the input signal and the desired signal, which is often not available.
[Figure: example PSDs – top: PSD_s(f) and PSD_n(f); middle: PSD_s(f) and PSD_x(f) = PSD_s(f) + PSD_n(f); bottom: the resulting Wiener filter H_opt(f) = PSD_s(f) / PSD_x(f), which takes values between 0 and 1 across frequency.]
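A sketch of the Wiener filter under the stated assumptions (x = s + n, uncorrelated); here the clean signal and the noise are simulated so that their PSDs are known, which is precisely the a priori knowledge that is usually missing in practice:

import numpy as np
from scipy import signal

fs = 200.0
t = np.arange(0, 10, 1 / fs)
s = np.sin(2 * np.pi * 3 * t)                 # 'clean' signal (known here, unknown in practice)
n = 0.8 * np.random.randn(t.size)             # noise
x = s + n                                     # recorded data

# estimate the PSDs on a common frequency grid
f, psd_s = signal.welch(s, fs=fs, nperseg=256)
_, psd_n = signal.welch(n, fs=fs, nperseg=256)

H_opt = psd_s / (psd_s + psd_n)               # H_opt(f) = PSD_s / (PSD_s + PSD_n)

# apply the filter by multiplying in the frequency domain
X = np.fft.rfft(x)
freqs = np.fft.rfftfreq(x.size, d=1 / fs)
H_interp = np.interp(freqs, f, H_opt)         # match the PSD grid to the FFT grid
s_hat = np.fft.irfft(X * H_interp, n=x.size)  # rough estimate of the clean signal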
Adaptive filtering
• optimal filters: stationary signals with known frequency contents. In
reality this information is often not available and the signals are not
stationary.
• processing non-stationary signals requires a filter that 'learns' signal
properties and adjusts itself continuously to perform optimally under
changing circumstances: adaptive filters.
• many similarities with adaptive algorithms in control theory
• typically, adaptive filters perform poorly in the initial phase as they will
have to 'learn' the knowledge that is not a priori available
General Structure of an Adaptive Filter
Three main parts:
1. performance index/criterion: preferably something ‘easy’ like minimising squared errors
2. adaptive algorithm that changes the parameters
   • non-recursive: e.g. solve the exact least squares problem on recorded data in a given time window, or
   • recursive: update after every sample, e.g. gradient methods
3. structure of the filter itself
Adaptive filter with reference input: approximate the noise using a second (noisy) input channel and subtract it from the recorded data.
[Figure: recorded data x = s + n and a noise reference n_ref feed the adaptive filter, which outputs the signal estimate s_approx. See https://www.electronicdesign.com/industrial-automation/article/21805808/whats-the-difference-between-passive-and-active-noise-cancellation]
[Figure: ECG example – recorded signal = ECG + noise; similar noise recorded from another sensor; the approximated noise is subtracted: ECG + noise − approximated noise.]
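A minimal sketch of this reference-input idea using a plain LMS adaptive filter; all signals, the filter length and the step size are made up, and practical implementations often use variants such as normalised LMS or RLS:

import numpy as np

rng = np.random.default_rng(0)
N, M, mu = 5000, 16, 0.01                 # samples, filter taps, LMS step size

t = np.arange(N)
s = np.sin(2 * np.pi * 0.01 * t)          # desired signal
n_ref = rng.standard_normal(N)            # reference channel: correlated with the noise
n = np.convolve(n_ref, [0.6, 0.3, 0.1])[:N]   # noise as it appears in the recording
x = s + n                                 # recorded data

w = np.zeros(M)                           # adaptive filter weights
s_approx = np.zeros(N)
for k in range(M, N):
    u = n_ref[k - M + 1:k + 1][::-1]      # current and M-1 past reference samples
    n_hat = w @ u                         # current noise estimate
    e = x[k] - n_hat                      # error = signal estimate (filter output)
    w += mu * e * u                       # LMS weight update
    s_approx[k] = e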
Artefact detection and rejection
• Artefacts are considerably more difficult than noise, due to their wide range of unexpected effects
• They often require advanced signal processing / AI / ML by themselves to be dealt with
• Sometimes what looks like a physiological signal may be an artefact, and the opposite is also true
Artefacts – a division based on their origin
• Physical artefacts – originating ’outside’ the patient (e.g. cable movements, electrosurgical equipment, bad sensor connections)
• Biological artefacts – originating within the patient (e.g. ECG activity seen in the EEG signal, spontaneous EEG in an EP signal, muscle activity, eye movements (EOG), body movements, ...)
Physical/Technical Artefacts
Usual suspects: mains interference, cable movements, vehicle movement, …
[Figure: generating ‘step counts’, detected by a smartwatch, by knitting.]
Biological artefacts
• Cardiac signal on EEG
• Muscle activity in EEG
• Eye movement in EEG
• Breathing
• Sweating
[Figure: rhythmical slow waves in EEG. Source: Smith, van Gils, Prior. Neurophysiological Monitoring During Intensive Care and Surgery, 2006.]
Dealing with artefacts
• Be very conservative with ’cleaning up’ recordings by ruthlessly removing artefacts from the data. Rather, carefully annotate them instead.
• General philosophy: minimise the occurrence of artefacts/noise by using a good recording set-up and thinking ahead, rather than using signal processing methods to clean up the signal afterwards.
Artefact Detection
• Virtually all commercially available equipment has built-in artefact detection features that reject data if the amplitude of the input signal exceeds a certain preset value, the artefact detection threshold.
• The idea is that many artefacts cause changes that are large in comparison with the biosignal under study.
• Problems:
  • how to set those thresholds? and
  • certainly not all artefacts cause high-amplitude changes
• Other indicators of artefacts (than large amplitudes):
  • unusually small amplitudes
  • large slopes in the signal
  • presence of spikes (short periods in which a very high/low slope occurs, after which the signal returns to its original value)
• Typical parameters to check in a certain window: Max (S_max), Min (S_min), Difference (S_dif), Slope
[Figure: a signal segment (µV vs. time) annotated with S_max, S_min, S_dif and the slope = abs(dV/dt)_max, where dt is the sample interval.]
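A sketch of such window-based checks; the helper functions, the fake data and the thresholds below are invented for illustration, since in practice the limits come from clinical knowledge or from the statistics of artefact-free control data, as discussed next:

import numpy as np

def window_features(seg, dt):
    """Max, min, peak-to-peak difference and max absolute slope of one window."""
    smax, smin = seg.max(), seg.min()
    sdif = smax - smin
    slope = np.abs(np.diff(seg)).max() / dt      # abs(dV/dt)_max
    return smax, smin, sdif, slope

def is_artefact(seg, dt, limits):
    smax, smin, sdif, slope = window_features(seg, dt)
    return (smax > limits["max"] or smin < limits["min"]
            or sdif > limits["dif"] or slope > limits["slope"])

fs = 250.0                                        # sampling rate (Hz), illustrative
x = np.random.randn(2500) * 10                    # stand-in for EEG-like data in microvolts
limits = {"max": 100, "min": -100, "dif": 150, "slope": 5000}  # illustrative thresholds

win = int(fs)                                     # 1-second windows
flags = [is_artefact(x[i:i + win], 1 / fs, limits)
         for i in range(0, x.size - win, win)]    # True = window flagged as artefact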
Determination of thresholds
• Too tight a threshold would result in rejection of data that has no serious artefacts; too loose a threshold would allow the acceptance of 'bad' data.
• Sometimes clinical knowledge of the 'limits' of physiological systems, or use of physics to detect 'impossible' situations, can be employed, but this often only gives an impression of the 'loose side' of the limits.
• Setting of thresholds can only be done after an evaluation of the range of values of reliable, normal signals (without serious artefacts).
[Figure: clinicians’ opinions of ‘workable regions’ of signals recorded in the ICU. On the horizontal axes are the measured values; on the vertical axes are the clinicians’ opinions of whether that value is ‘low’, ‘high’ or ‘normal’: 0 is normal, +1 is the highest possible ‘sensible’ value, −1 the lowest possible sensible value. This provides some guidelines on where to put thresholds for artefact detection. Signals: SAP (Systemic Arterial Pressure) and PAP (Pulmonary Arterial Pressure), with ‘s’, ‘m’, ‘d’ indicating systolic, mean and diastolic pressure; HR (heart rate); CVPm (Central Venous Pressure); Tp (peripheral temperature); Tc (core temperature); SVRi (Systemic Vascular Resistance index); PCWP (Pulmonary Capillary Wedge Pressure).]
[Figure: a Gaussian distribution vs. a distribution disturbed by artifacts.]
Estimating thresholds (assuming a Gaussian distribution of data)
• By estimating the mean, μ, and standard deviation, σ, of the slopes and amplitudes in a set of (artefact-free) control data one can get an idea of the distribution.
• In a Gaussian distribution 99.7% of the samples are supposed to be situated between μ − 3σ and μ + 3σ, so one can hypothesise that in situations without artefacts, taking these values as thresholds, nearly all 'good' data will be accepted.
• In this approach one still has to choose the width of the interval of acceptable values; (μ − 3σ, μ + 3σ) or (μ − 4σ, μ + 4σ) are chosen most often in practice.
Slope detection on the basis of statistical outliers
[Figure: outlier detection on the distribution of window-wise maximum slopes Slope_1 … Slope_N, where Slope_i = max ΔV/Δt in window i; example signal segment (V in µV vs. time, 1-second window, patient 32, time 16:47:52).]
Other variances: recording sites, devices, staff
• Different hospitals have different equipment (e.g. imaging scanners, lab analysis devices, patient monitors). These all have different properties with respect to measurement parameters and quality of data.
• Different staff within the same hospital have different skills, experience, dedication and availability to carry out the recordings. This leads to varying quality of data over time.
• As a consequence, we may end up with features that classify very well to a specific hospital or even to individual staff members. This is rarely the aim of a study, though.
Missing, incomplete data
• In multi-variate/multi-modal data sets
• In time series
• Most classifiers don’t work correctly when data input vectors have empty elements or NaNs – what to do?
  • Only use complete cases?
  • Impute missing data? With means, interpolations, k-means, single/multiple imputation, …
  • Use a separate category for missing data?
  • Find a compromise? E.g. only use cases that have at least, say, 70% of all elements present, and impute the rest
Nice overviews of approaches can be found at https://www.kdnuggets.com/2020/09/missing-value-imputation-review.html and https://towardsdatascience.com/how-to-handle-missing-data-8646b18db0d4
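A sketch of some of the strategies listed above using pandas and scikit-learn; the toy table and its columns are invented:

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({"hr": [72, np.nan, 88, 90],
                   "sbp": [120, 135, np.nan, 110],
                   "temp": [36.8, 37.1, 36.5, np.nan]})

complete = df.dropna()                      # strategy 1: only use complete cases

imputer = SimpleImputer(strategy="mean")    # strategy 2: impute missing values by column means
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)

enough = df.dropna(thresh=2)                # compromise: keep rows with at least 2 of the
                                            # 3 elements present, then impute the rest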
Not all missing data are equal
• Missing Completely at Random (MCAR): The probability of an instance being missing does not depend on
known values or the missing value itself. A data table was printed with no missing values and someone
accidentally dropped some ink on it so that some cells are no longer readable. Or a questionnaire might have
been lost in the post. Here, we could assume that the missing values follow the same distribution as the known
values.

• Missing at Random (MAR): The probability of an instance being missing may depend on known values but not on the missing value itself. Sensor example: for a temperature sensor, the fact that a value is missing doesn’t depend on the temperature, but might depend on some other factor, for example the battery charge of the thermometer. Survey example: whether or not someone answers a question – e.g. about age – in a survey doesn’t depend on the answer itself, but may depend on the answer to another question, e.g. gender. In this case, the missing data may actually contain potentially valuable information.

• Not Missing at Random (NMAR): The probability of an instance being missing could depend on the value of the
variable itself. Sensor example: artefact rejection may take place on the basis of ‘unusual’ values – that actually
may come from a physiological process. Survey example: Whether or not someone self-measures something
may depend on the actual values.
In this case, the missing data very likely contains potentially valuable information.
Missing data in 10 years’ weight data
[Figure: 10 years of self-monitored weight (kg, roughly 75–85 kg) plotted against date from 1.8.1989 to 1.8.1999, with gaps. Green = the scale was broken (MCAR). Orange = “I am not going to check my weight during Christmas” (MAR? or NMAR?).]
Lappalainen, Raimo, Piia Pulkkinen, Mark van Gils, Juha Pärkkä, and Ilkka Korhonen. 2005. “Long-term Self-monitoring of Weight: A Case Study.” Cognitive Behaviour Therapy 34 (2): 108–14. https://doi.org/10.1080/16506070510008452.
Non-adherence* and drop-outs
• Drop-outs in drug trials usually indicate something ‘bad’. High drop-out rates make the study less convincing.
• In eHealth settings this is more subtle. High drop-out rates may be an important finding. Usage metrics and determinants of attrition should be highlighted, measured, analyzed, and discussed.
*Adherence = adhering to an agreed plan, e.g. measuring weight every day, walking >7000 steps every day, etc. If you don’t do this in the long run you are ‘non-adhering’. A ‘drop-out’ stops completely.
Eysenbach, G., 2005. The law of attrition. J. Med. Internet Res. 7
Frequency analysis with missing data
• Standard frequency analysis (the Fourier transform) assumes evenly sampled data
• If data is missing, we can impute, but that will affect the frequency contents (typically towards lower frequencies)
• One option is to fit sinusoids as well as possible to the measured data, to recreate an estimate of the Fourier series
• This is known as the Lomb-Scargle periodogram, or Least Squares Spectral Analysis. See e.g. https://en.wikipedia.org/wiki/Least-squares_spectral_analysis or Ruf, T. 1999. “The Lomb-Scargle Periodogram in Biological Rhythm Research: Analysis of Incomplete and Unequally Spaced Time-Series.” Biological Rhythm Research 30 (2): 178–201. https://doi.org/10.1076/brhm.30.2.178.1422.
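A minimal sketch with scipy.signal.lombscargle on an unevenly sampled, partly missing series; the signal and the missingness pattern are simulated:

import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(1)
t_full = np.arange(0, 100, 0.5)
keep = rng.random(t_full.size) > 0.4          # ~40% of the samples are 'missing'
t = t_full[keep]                              # unevenly spaced measurement times
y = np.sin(2 * np.pi * 0.1 * t) + 0.3 * rng.standard_normal(t.size)

freqs = np.linspace(0.01, 0.5, 500)           # frequencies of interest (Hz)
pgram = lombscargle(t, y - y.mean(), 2 * np.pi * freqs)  # expects angular frequencies

print(freqs[np.argmax(pgram)])                # peak near 0.1 Hz despite the gaps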
Dealing with Missing data
• Many strategies:
  • only use complete cases and discard the rest;
  • impute by means, interpolation, etc.
• What is best depends on the application (how much data you have to spare, whether interpolation is reasonable)
• Nice overviews of approaches can be found at https://www.kdnuggets.com/2020/09/missing-value-imputation-review.html and https://towardsdatascience.com/how-to-handle-missing-data-8646b18db0d4
• Imputation should not be done blindly, but only if the missing data are known to be MCAR
• If they are not MCAR, the missing data may have a message to tell and you should be careful with automatically ’fixing’ them
Another issue: imbalanced data
• Related to missing data is the fact that we quite often have an imbalance in the data sets. (In the general population, people with a disease are, luckily, less prevalent than healthy people.)
• So, quite often we have a situation where class A has, e.g., 995 cases and class B only 5 cases to work with.
• This requires special thinking on how to best develop classifiers: either A) ‘do something with the data to make it more balanced’, or B) interpret the results carefully and use appropriate analysis methods.
• ‘Do something with the data’ could be: oversampling (of the disease cases), undersampling (of the healthy cases), weighting of importances, generating synthetic data, …
• We will come back to this issue in more detail in Lecture 7.
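A small sketch of two of the options mentioned above (random oversampling of the minority class, and class weighting in the model) using scikit-learn on invented data; Lecture 7 discusses this properly:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

rng = np.random.default_rng(2)
X_a, X_b = rng.normal(0, 1, (995, 5)), rng.normal(1, 1, (5, 5))   # 995 vs 5 cases
y_a, y_b = np.zeros(995), np.ones(5)

# option A: random oversampling of the minority class
X_b_up, y_b_up = resample(X_b, y_b, n_samples=995, replace=True, random_state=0)
X_bal = np.vstack([X_a, X_b_up])
y_bal = np.concatenate([y_a, y_b_up])

# option B: keep the data as-is but weight the classes in the model
clf = LogisticRegression(class_weight="balanced").fit(np.vstack([X_a, X_b]),
                                                      np.concatenate([y_a, y_b]))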
Complex Targets

Performance Assessment as basis for development
[Figure: example deep convolutional encoder–decoder network architecture (64×64 input, residual elements, convolutions, 2×2 pooling, deconvolutions, batch norm, ReLU).]
‘Straightforward measures’:
• Classification performance: sensitivity, specificity, accuracy, error rate, positive/negative prediction value, precision, recall, F1-statistic, ROC AUC, survival curves, …
• Correlations and concordances
• RMSE, cross-entropy
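Several of these measures are available directly in scikit-learn; a small sketch on invented labels and classifier scores:

import numpy as np
from sklearn.metrics import (confusion_matrix, accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0, 0, 1])
y_score = np.array([0.1, 0.4, 0.8, 0.3, 0.9, 0.2, 0.7, 0.6, 0.1, 0.8])
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)            # = recall
specificity = tn / (tn + fp)
ppv = precision_score(y_true, y_pred)   # positive prediction value
acc = accuracy_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)    # ROC AUC uses the scores, not the hard labels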
‘Straightforward measures’ that are not that straightforward in practice
• Lack of gold standards: for many diseases and diagnoses we don’t know what the ’right answer’ is, or it may become available only after a long time
• Inter-expert variability: who is right – clinician A or clinician B (or C)?
• Classifier performance vs. clinical usefulness / cost effectiveness
Complications in performance assessment
In many biomedical problems we have some serious complications when trying to measure the exact performance of a system, due to
• lack of an accepted ’gold standard’ against which we can verify our system
• often subjective (non-numerical) scales being used as the measure of patient state
Lack of a gold standard
• Many interpretations of patient data are done using expertise of a given
observer (clinician). One clinician will often make subtly different
decisions (based on experience, different ’culture’ etc) than another.
• In other cases there may not even be a generally accepted quantity that
would represent a certain concept (e.g., ’depth of anaesthesia’, ’pain’ ,
'decline of memory' etc)
• For those cases typically grading scales exist that are used as
guidelines for quantifying the phenomenon at hand
Ordinal scales
• Many ’patient states’ are assessed by mapping observations to a subjective scale which is ordinal, but which cannot be used to perform simple performance measures on.

Observer’s assessment of alertness/sedation (OAA/S), e.g. during operations:
5 – Replies to spoken commands, eyes open, awake
4 – Sedated, replies to spoken commands, mild hypnosis
3 – Ceases to reply to loud commands, eyelid reflex present
2 – No reply to spoken commands, no eyelid reflex
1 – No reaction to TOF (train-of-four) stimulation (50 mA) with movement
0 – No reaction to tetanic stimulation with movement

Ramsay scale for level of sedation, e.g. in the ICU (Ramsay sedation scale):
1 – Anxious or restless or both
2 – Cooperative, orientated and tranquil
3 – Responding to commands
4 – Brisk response to stimulus
5 – Sluggish response to stimulus
6 – No response to stimulus

Alderete scale to assess recovery after an operation (score per item):
A. Activity: ability to move all four extremities = 2; ability to move two extremities = 1; unable to move any extremity = 0
B. Respiration: ability to deep breathe and cough = 2; respiratory effort limited and dyspnea present = 1; no spontaneous respiratory effort evident = 0
C. Circulation: systolic arterial pressure within ±20 mmHg of the pre-sedation level = 2; SAP ±20 to 50 mmHg of the pre-sedation level = 1; systolic arterial pressure ±50 mmHg or more of the pre-sedation level = 0
D. Level of consciousness: full alertness with ability to answer questions = 2; patient can be aroused by verbal stimuli = 1; verbal stimuli fail to elicit responses = 0
E. Temperature: between 35.6 and 37.5 = 2; between 35 and 35.6 = 1; less than 35 or greater than 37.5 = 0
The total score needs to be at least 8 before the patient can be discharged.
Measures of agreement for ordinal scales
• From the previous slides we can notice that the levels are ordinal (ordered in a sequence; level N is less sedated than level N+1), but we cannot say that a patient at OAA/S level 4 is 'two times more sedated' than one at level 2, or that the difference between levels 4 and 3 is equal to the difference between levels 2 and 1.
• In such cases measures like linear correlation coefficients, mean squared error, etc. are not very appropriate for measuring performance.
• To measure, e.g., the performance of monitors in such a case we use statistical methods that quantify concordance. For example, we can measure how consistently a monitor follows the OAA/S scale: when the OAA/S level decreases the monitor should follow in the same direction (i.e., decrease); if the OAA/S values are stable the monitor output should not change either; and if the OAA/S level goes up, the monitor output should also increase. The key point here is consistency.
• Measures exist to quantify such agreement. A commonly used measure in patient monitor evaluation is the prediction probability, pk (Smith, Dutton, Smith. Anesthesiology, 1996, 84(1):38-51).
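The prediction probability pk itself is not part of standard Python libraries; as a sketch of the same idea of rank-based concordance, Kendall's tau (and Spearman's rho) from scipy can be computed on invented OAA/S levels and monitor outputs – related measures of directional consistency, not identical to pk:

import numpy as np
from scipy.stats import kendalltau, spearmanr

ooas = np.array([5, 5, 4, 3, 2, 2, 1, 0, 0, 1])                # clinician's ordinal scale
monitor = np.array([92, 88, 75, 60, 41, 45, 30, 18, 22, 35])   # monitor index values

tau, p_tau = kendalltau(ooas, monitor)    # rank-based concordance (handles ties)
rho, p_rho = spearmanr(ooas, monitor)     # rank correlation, an alternative view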
