Hemmerling 2020

This paper aims to use speech signal processing and machine learning methods to analyze voice recordings of Parkinson's disease patients at different times after taking medication. The goal is to estimate clinical assessment scores and predict how symptoms may change within 3 hours to help monitor patients and treatment effectiveness over time.


ARTICLE IN PRESS

Prediction and Estimation of Parkinson’s Disease Severity Based on Voice Signal

*Daria Hemmerling and †,‡Magdalena Wojcik-Pedziwiatr, *†‡Krakow, Poland

Summary: This paper presents the possibilities of using speech signal processing, analysis and regression methods to assess the neurological state of Parkinson’s disease patients up to 3 hours after taking medication which alleviates symptoms of the disease. The obtained results were used to create a system whose goal was the prognosis of values of selected acoustic parameters, based on which it will be possible to further estimate a unified Parkinson’s disease rating scale score. For the experiment, we used the recordings of the vowel /a/ of 27 patients who were recorded 5 times each at a certain time after levodopa intake. The speech signal was parameterized: the acoustic parameters describing the signal were extracted and constituted input vectors to machine learning regression methods to search for characteristic diagnostic symptoms enabling automatic monitoring of the course of Parkinson’s disease. The results of the acoustic analysis were correlated with the clinical description, and disease severity was assessed using the unified Parkinson’s disease rating scale. As a result, it was possible to create software which will support the work of the clinician in the field of therapy monitoring and provide a quantitative assessment of treatment results and a forecast of the effects of the therapy in short-term monitoring.
Keywords: Signal processing−Voice−Parkinson’s disease−Prediction−Estimation of UPDRS score−Regression.

Accepted for publication June 8, 2020.
From the *AGH University of Science and Technology, Department of Measurement and Electronics, Krakow, Poland; †Department of Neurology, The John Paul II Hospital, Krakow, Poland; and the ‡Department of Neurology, Andrzej Frycz Modrzewski Krakow University, Krakow, Poland.
Address correspondence and reprint requests to Daria Hemmerling, AGH University of Science and Technology, Department of Measurement and Electronics, Krakow, Poland. E-mail: [email protected]
Journal of Voice, Vol. &&, No. &&, pp. &&−&&
0892-1997
© 2020 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
https://doi.org/10.1016/j.jvoice.2020.06.004

INTRODUCTION
In 1817, James Parkinson described the changes in the voice caused by Parkinson’s disease for the first time [1]. Although such symptoms were already observed in the last century, increased interest in this topic appeared quite recently. Parkinson’s disease (PD) affects over 10 million people globally and is the second most prevalent neurodegenerative disease in the world, caused by loss of dopaminergic neurons of the substantia nigra. The treatment of PD patients who suffer from fluctuations in motor state and dyskinesias using oral therapy is often challenging. Therefore, the development of software to describe the symptoms of PD is very desirable. The application of computational tools which analyze the patient’s state noninvasively, without an attending physician at the time of the study, might help to monitor the patient more often and, as a result, better adapt the individual treatment plan [2].
The goal of this study was to determine whether speech analysis might be useful to track the motor symptoms of patients with PD during levodopa treatment. To accomplish this goal, acoustic analysis of voice recordings from PD patients was carried out to assess which parameters of speech correlate with the motor state and might be useful to monitor disease symptoms. The voice recordings were acquired in the off state of the patients and 30, 60, 120 and 180 minutes post drug intake. The off state occurs when the symptoms of PD recur; for example, it is a state of very difficult movement, recurrence of stiffness, tremor, and bradykinesia. This condition may occur when there is a low level of dopaminergic drugs in the blood, that is, before taking the next dose. In the first stage of this work, we focused on estimation of the UPDRS score, part III (the Unified Parkinson’s Disease Rating Scale, motor part) based on a parameterized voice signal. Afterwards, based on the first four recordings (off state, 30, 60 and 120 minutes post medication), we estimated the values of vectors representing different voice parameters and, based on the results, we estimated the UPDRS score, part III (UPDRS-III). The results of the conducted research make it possible to observe the patient and assess the severity of PD at a given moment on a scale understood by doctors by using only data obtained in a noninvasive way from the voice and, additionally, provide the possibility of predicting the severity of the disease within 3 hours of consuming drugs. The block diagram of the conducted research is presented in Figure 1.

COURSE OF PARKINSON’S DISEASE
Currently, many drugs are used to relieve symptoms of PD. The gold standard of treatment and the most effective drug is levodopa, which is a precursor to dopamine. Levodopa alleviates PD symptoms such as tremor, rigidity, bradykinesia, gait and posture disturbances, and improves the speech impairments. Levodopa is most effective at the initial stage of the disease, the so-called honeymoon, when the administration of an adjustable dosage decreases the parkinsonian symptoms in a continuous way. The omission of a single levodopa dose does not automatically result in a worsening in the motor state of the patient. With the progression of the disease, the action time of levodopa shortens and the appearance of motor fluctuation is observed. During the

development of the disease, the effect of levodopa changes in direct proportion to the given dose, because the progressive degeneration of the substantia nigra neurons prevents continuous supplementation of dopamine in the striatum. The action of levodopa in Parkinson’s disease patients changes over time, but this phenomenon is not completely understood yet. The variability of levodopa action reflects the changes in levodopa pharmacodynamics related to different factors including age, gender, body weight, creatinine clearance, levodopa dose, route of administration, and probably also factors connected with disease duration and severity. There are some pharmacokinetic/pharmacodynamic models which describe the relation between these different factors and clinical response to levodopa [3]. Additionally, in early Parkinson’s disease, levodopa which is given to the patient is metabolized to dopamine and stored in the nigrostriatal dopaminergic terminals, after which it is released to the synaptic clefts little by little, so we do not observe motor fluctuations clinically. The storage mechanism acts as a buffer preventing significant fluctuations of dopamine level in the blood. Together with the development of neurodegeneration and gradual loss of dopaminergic neurons, the re-uptake of dopamine into the presynaptic terminals is decreased. This is the reason why we clinically observe the relation between the duration of the disease and the duration of the response to levodopa. So, in advanced disease the response to levodopa shortens, reflecting to a greater or lesser degree the pharmacokinetics of the drug. Furthermore, due to the progressive degeneration of dopaminergic neurons, the metabolism of levodopa into dopamine changes. This means that depending on the administration time and the levodopa dose, the effect might be different. Changing metabolism results in the accumulation of the delivered and unabsorbed dose of the drug. This leads to the unpredictable emergence of changes in motor state at different times of the day. The patient may not feel the effects of the disease (on state), and then after a while his or her condition worsens: stiffness, tremor or speech disturbances appear (off state). Some patients may exhibit hyperexcitability after exposure to the drug, resulting in uncoordinated, involuntary limb movements or whole body movements, known as dyskinesias. Drug doses initially established cannot remain constant and must change over time as the disease progresses and the dopamine deficiency increases, and must be adapted to the patient’s current needs [4]. Over time, the action period of levodopa shortens, resulting in a temporary worsening of motor state called the wearing-off phenomenon. At this time, levodopa becomes increasingly less effective and the re-emergence of motor or nonmotor symptoms occurs before the next dose is administered [5]. Therefore, the patient should take medication regularly and be in constant contact with the treating physician.
Despite the effectiveness of levodopa, one of the disadvantages is the short half-life of the drug. This means that the patient needs frequent dosing, on average 4−6 times a day. In many cases, the patients may need additional doses. The maximum concentration of levodopa in the plasma occurs 60−90 minutes after dose application. That moment also may coincide with peak-dose side effects (dyskinesias), which are more likely when an excessive dose of levodopa, not well-suited to the patient’s state, is administered. When the dose period is coming to an end, the symptoms of PD return. The wearing-off effect as well as the tendency towards the occurrence of dyskinesias become more pronounced as the disease progresses and are more and more difficult to treat [5]. Variations in the progression of PD are shown in Figure 2.
The duration of levodopa action shows a certain variability from one patient to another depending on different factors including disease duration and severity [6]. To adjust individual treatment and make it more effective, the patient should be observed as often as possible when the drug starts and stops acting.

FIGURE 1. Block diagram of conducted research. Abbreviations: MLR, multiple linear regression; RF, random forest; SVR, support vector regression.

The patient takes the medication to alleviate the disease symptoms every 3 hours. To observe which symptoms disappear and which get worse after determining a specific dose of medication, the patient should be observed during the time period between doses of medication. In normal conditions, the patients meet the doctor every 3−6 months. Considering the several months which pass between visits and the few minutes available to summarize how the patient feels, and also bearing in mind that the patient suffers from neurological disease and in most cases is an elderly person, it is impossible to share all the relevant information about all of the symptoms and other complications of the disease. When the patient is in hospital, it is not

FIGURE 2. Variations in the progression of PD. (A) Early stage of PD, (B) Advanced stage of PD.

possible for the doctor to carry out individual monitoring of the patient even every 60 minutes. During a visit to the clinic, the doctor spends up to 30 minutes with the patient to determine further treatment. Software that provides the UPDRS-III automatically can deliver more results and gather them more often. It enables the physician to tailor an individual treatment plan, one which is more specific and adapted to the patient’s current condition.

RELATED WORK
In recent years, researchers have focused mostly on developing algorithms to predict PD onset [7−10]. Letter et al. [13] presented an assessment of vital capacity, sustained vowel phonation and phonation quotient for PD patients who were treated with levodopa. These parameters improved significantly following the administration of the drug. Similar observations were made by Skodda et al. [14], who noted a decrease of fundamental frequency variability in the course of reading a text when levodopa was administered. Less attention has been paid to prediction of the severity of PD in order to monitor the patients’ treatment. The most commonly used rating tool to follow the severity and progression of PD is the Unified Parkinson’s Disease Rating Scale. Tsanas et al. [16] presented a prediction of UPDRS-III using approx. 6000 recordings acquired from 42 PD patients. The UPDRS-III ranged from 6 to 92. The speech tasks included the sustained phonation of the vowel /a/. A generalized random forest algorithm was implemented to select features with maximum clinical information and a random forest regression was performed to estimate PD severity. The smallest root mean square error between the predicted and ground truth value was 1.62 for males and 1.72 for females. A study by Sakar et al. [17] presented the application of the UPDRS scale as an index of disease progression to create a system for binary classification of healthy and PD patients based on speech signal analysis. The paper [27] presented research on a method to analyze PD progression from speech using the GMM-UBM algorithm. Speech recordings from 62 PD patients were analyzed in 4 different sessions acquired over 4 years. Based on the results, it was possible to track disease progression with a Pearson’s correlation of up to 0.60 with respect to MDS-UPDRS-III labels. Another paper [28] presented the estimation of UPDRS score using Hubness-Aware Feedforward Neural Networks. The mean absolute error achieved after 10-fold cross validation for the UPDRS motor part with error correction was 7.22 points. A paper by Nilashi and Ibrahim [30] describes the use of the adaptive neuro-fuzzy inference system (ANFIS), Expectation Maximization (EM), principal component analysis (PCA) and support vector regression (SVR) for prediction of PD progression. The lowest mean absolute error was achieved for the EM-PCA-SVR algorithm and amounted to 0.4721. The database was downloaded from the Data Mining Repository of the University of California, Irvine (UCI). The authors of ref. [32] demonstrated the use of singular value decomposition (SVD) and ensembles of the Adaptive Neuro-Fuzzy Inference System to predict the UPDRS value, also emphasizing the UPDRS motor part. The dataset was also downloaded from UCI and included recordings of 42 people. The minimal mean absolute error was achieved at the level of 0.480 for the EM-SVD-ANFIS ensemble method.

SPEECH DISORDERS IN PARKINSON’S DISEASE
Speech disorders in patients with PD are mainly caused by deficits of larynx function, impaired performance of facial muscles, decreased vital capacity of the lungs and decreased speech drive [7]. Such changes lead to numerous abnormalities in voice and speech, including volume reduction, limited voice modulation (monotonous speech), difficulty with

volume changes, reduction in vocal fold tension, hoarse tone, inadequate articulation resulting in slurred speech as well as change in speech pace [7−9]. These impairments are called hypokinetic dysarthria. The speech is characterized by phonation, articulation and prosody dysfunctions, which arise as a result of damage to the centers and nerve pathways responsible for innervation of the speech organs [10]. Changes in articulation are caused by reduced amplitude and motion speed of the lips, jaws and tongue. This leads to reduced accentuation and inaccurate articulation of consonants, up to babbling. Prosody is a speech property that concerns the intonation, volume, accent and duration of the phoneme [11]. Abnormalities in prosody are manifested by speaking with short, accelerated phrases, monotony and limited speech volume, change of speech rate, pauses, difficulty in expressing emotions through speech, and repetition of sounds or syllables [12].

THE UPDRS SCALE
The most common scale to assess the severity of PD is the Unified Parkinson’s Disease Rating Scale. This is a reliable tool for monitoring the symptoms of the disease during symptomatic treatment [15]. The scale consists of 4 parts. Part I concerns intellectual state and mood disorders (4 issues), Part II describes everyday activities (13 issues), Part III assesses motor functions (27 issues), and the last part assesses treatment complications (11 issues) [16]. Each of the issues may receive from 0 (no symptoms) to 4 points (significant symptoms). The total number of points is the sum of each of the parts and can reach a maximum of 220. A higher UPDRS score indicates a more advanced stage of the disease. The effect of PD on speech is included in Part III of the UPDRS scale (UPDRS-III), and studies analyzing patients’ speech are most often limited to the score obtained only from this part. In this study, the number of points can range from 0 to 108 (27 issues × 4 = 108).

MATERIAL
Participants in the study were recruited from a group of patients with a diagnosis of PD undergoing treatment at the Neurology Outpatient Clinic and Department of Neurology at John Paul II Hospital in Krakow, Poland. The diagnosis of PD was made according to the Movement Disorders Society Clinical Diagnostic Criteria for Parkinson’s Disease. The patients’ ages ranged from 50 to 78 years (mean 65 ± 7.9) and disease duration ranged from 2 to 14 years (mean 8.4 ± 3.9). The native language of all the patients was Polish. The database used for the purposes of this research contains voice recordings of 27 patients with PD. Each of the PD patients was recorded 5 times at different time periods: in the off state (more than 3 hours after taking the last levodopa dose and when the patient reported symptoms of the disease that had earlier been mitigated by previously acting drugs), as well as 30, 60, 120, and 180 minutes after taking levodopa medication. Each of the recordings was registered with a sampling frequency of 44.1 kHz and a resolution of 16 bits. The recordings were made in noise-controlled conditions in a soundproof booth. All of the patients were diagnosed and labeled by expert neurologists. After each recording, the neurologist who supervised the experiment completed the UPDRS-III scale. That scale was considered the reference for the patient’s motor state at the time when the recording was made. The UPDRS-III determined by the physician was between a minimum of 2 points and a maximum of 61 points (mean 22.49 ± 13.60 points). All of the patients were asked to pronounce sustained vowels: /a/, /e/, /i/, /o/ and /u/. Figure 3 shows the UPDRS-III scoring changes for different moments in time for 7 patients.

FIGURE 3. UPDRS-III designated at different time periods: off state, 30-, 60-, 120-, 180-minutes post medication for 7 patients.
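
The score ranges quoted in the description of the UPDRS scale follow directly from the number of issues per part and the 0 to 4 points available per issue. The short sketch below only reproduces that arithmetic; the dictionary and function names are illustrative choices and are not taken from the paper.

```python
# Number of issues per UPDRS part, as listed in the description of the scale.
ITEMS_PER_PART = {"I": 4, "II": 13, "III": 27, "IV": 11}
MAX_POINTS_PER_ITEM = 4  # each issue is scored from 0 (no symptoms) to 4


def max_score(parts):
    """Maximum attainable score over the given UPDRS parts."""
    return sum(ITEMS_PER_PART[part] for part in parts) * MAX_POINTS_PER_ITEM


total_max = max_score(ITEMS_PER_PART)  # all four parts: 55 issues * 4
motor_max = max_score(["III"])         # motor part only: 27 issues * 4

print(total_max, motor_max)  # prints: 220 108
```

The physician-rated UPDRS-III values reported for this cohort (2 to 61 points) therefore cover roughly the lower half of the 0 to 108 range of the motor part.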

METHODS

Feature extraction
The main goal of this part of the experiments was to extract speech parameters that would reveal speech impairments caused by PD. The initial step was parameterization of the voice signal in order to describe the acoustic signal by capturing as much information as possible about the patient’s health state at a given time. Phonatory and articulatory deficits caused by PD can be analyzed with sustained phonation or continuous speech signals. Most of the works available in the literature were carried out with the use of sustained phonation because this speech task is easy to perform for older people and is repeatable. What is more, sustained phonations provide information about the phonatory (vibration of vocal folds) and articulatory (resonances in the vocal tract) processes of speech production. To measure phonatory and articulatory deficits, several parameters can be used. For phonatory analysis, we implemented fundamental frequency, jitter and shimmer coefficients, energy, the 0th, 1st, 2nd and 3rd spectral moments, power spectral moment, kurtosis and curvature. Further details of the algorithms used to calculate these parameters can be found in refs. [19,20]. For further analysis, we also evaluated 13 mel-cepstral coefficients (MFCC), the ratio between the positive and negative amplitudes of the signal, 13 perceptual linear prediction (PLP) coefficients, 13 delta PLP cepstral coefficients and 13 delta-delta PLP cepstral coefficients [18,19,20]. The MFCC coefficients for men with different UPDRS scores are presented in Figure 4. Based on Figure 4, noticeable changes are visible for the first two cepstral coefficients, where the higher the UPDRS score, the higher the MFCC coefficient value. These features are widely used in classification of different diseases (including laryngeal as well as neurological diseases) based on voice signal [22−26]. The filter banks applied to calculate these coefficients are designed based on the human auditory system and how human beings hear. PD causes a limitation in range and energy during articulatory movements due to hypokinetic dysarthria. As articulatory features, we calculated the 1st, 2nd and 3rd formant frequencies, which provide information about any instabilities or abnormalities in the shape of the vocal tract as well as about the position of the tongue [21,29,31].
To evaluate the strength of association of the acoustic parameters with motor UPDRS scores, we applied two correlation methods, Pearson (r) and Spearman (ρ). Based on the parameter r obtained via the Pearson correlation method, the strength of the linear relationship between the distinguished groups of acoustic parameters and the UPDRS-III scale was determined. To detect nonlinear (monotonic) relationships, the Spearman correlation was used. Based on this, a ρ parameter was calculated that describes the strength of the nonlinear relationship between parameters. We also computed p-values (at the 95% confidence level) for the null hypothesis that each acoustic parameter is uncorrelated with motor-UPDRS. The results of the correlation tests for each of the vowels and the distinguished groups of parameters are given in Table 1.
We analyzed the association strength of the groups of acoustic parameters with the UPDRS scale. However, the ultimate aim of this study is to combine the acoustic description of voice signals to estimate motor-UPDRS so that the absolute difference between the estimated UPDRS and the UPDRS indicated by the doctor is minimized. We analyzed the correlation between the groups of parameters concerning phonatory features, articulatory features, the group of MFCC and PLP coefficients, and all the parameters together. In Table 1, the Spearman correlation shows slightly higher correlation results in comparison to Pearson results. The vowel /a/ shows the highest correlation in all of the groups. The highest r was equal to 0.61 when all the parameters were analyzed (vowel /u/). The highest ρ was equal to 0.69 for all the parameters and the vowel /a/. Based on the highest correlation results, we chose the group of all the parameters for further computation.

Estimation of motor-UPDRS score
The first task of this research was the creation of a computational system to estimate the severity of PD symptoms as a value of the UPDRS-III scale using only speech signals and prior knowledge of patients. The experiment was conducted following 10-fold cross-validation which took into consideration recordings of 5 Polish vowels pronounced in a sustained manner. This means that we processed 80% of the data to train the algorithm, 10% to validate it, and 10% to test it. The training and testing groups were created randomly; the recordings of the same patient were never assigned to the training and testing groups at the same time. The regression task concerned multiple linear regression (MLR), SVR and Random Forest Regression (RF) [33]. MLR is used to examine the linear relationship between several independent variables and a dependent variable. SVR uses the same basic idea as the support vector machine (SVM), a classification algorithm, but applies it to predict values rather than a class. SVR acknowledges the presence of nonlinearity in the data and provides a proficient prediction model. In this examination, we used ε-SVR. ε-SVR uses a cost function that ignores all training data which are appropriately close to the correct answer, that is, closer than the set value ε [35]. RF was implemented because it can handle thousands of input variables without variable deletion and is known as an effective method for estimating missing data. The idea behind RF, which is another algorithm used in this research, is to combine multiple decision trees and merge their predictions together to determine the final output. First, the algorithm creates a bag of samples by random sampling from the training set and then creates a tree-based ranker for each bag of data. In the last step RF ensembles were applied to the full forest of trees [34]. To find the best hyperparameters of the algorithms used, we defined a grid of hyperparameter ranges and randomly sampled from the grid, performing 10-fold cross validation with each

FIGURE 4. MFCC coefficients for men (A) UPDRS-III = 12, (B) UPDRS-III = 37, (C) UPDRS-III = 50.
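
The requirement stated in the estimation procedure, that recordings of the same patient never fall into the training and testing groups at the same time, can be illustrated with a patient-wise fold assignment. The following is a minimal sketch in plain Python and not the authors' implementation: the function name `patient_wise_folds`, the seed, and the layout of 27 patients with 5 recordings each for a single vowel are assumptions made for illustration, and the separate validation subset described in the text is left out.

```python
import random
from collections import defaultdict


def patient_wise_folds(patient_ids, n_folds=10, seed=0):
    """Split recording indices into folds so that all recordings of one
    patient end up in the same fold."""
    by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        by_patient[pid].append(idx)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)  # random assignment of patients
    folds = [[] for _ in range(n_folds)]
    for i, pid in enumerate(patients):
        folds[i % n_folds].extend(by_patient[pid])
    return folds


# Hypothetical layout: 27 patients, 5 recordings each (one vowel).
patient_ids = [p for p in range(27) for _ in range(5)]
folds = patient_wise_folds(patient_ids)

for test_idx in folds:
    test_set = set(test_idx)
    train_idx = [i for i in range(len(patient_ids)) if i not in test_set]
    train_patients = {patient_ids[i] for i in train_idx}
    test_patients = {patient_ids[i] for i in test_idx}
    # No patient contributes recordings to both groups.
    assert train_patients.isdisjoint(test_patients)
```

Splitting by patient rather than by recording matters here because the five recordings of one speaker are strongly related; a recording-level split would let the model see the test speaker during training.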

combination of values. That step allowed us to narrow down the sought range of values for each hyperparameter. In the next stage, we implemented a grid search again, this time evaluating every combination of the settings we had identified.
For evaluating regression algorithms, we used metrics such as root mean square error (RMSE), mean absolute error (MAE), standard deviation (STD) of MAE and R² to calculate the errors of the predictions of UPDRS and the

TABLE 1.
The Results of the Pearson (r) and Spearman (ρ) Correlation Tests for the Analyzed Groups of Acoustic Parameters and Scoring on the UPDRS-III Scale. All Results Were Statistically Significantly Correlated (p Value < 0.05) with UPDRS-III. All Recordings of Vowels Were Used to Generate These Results

Groups of     Phonatory Features   Articulatory Features   MFCC+PLP        All Parameters
Parameters    r       ρ            r       ρ               r       ρ       r       ρ
/a/           0.48    0.54         0.44    0.48            0.51    0.67    0.61    0.69
/e/           0.41    0.41         0.39    0.41            0.44    0.49    0.51    0.52
/i/           0.35    0.39         0.37    0.37            0.53    0.59    0.61    0.67
/o/           0.31    0.36         0.39    0.42            0.53    0.59    0.61    0.68
/u/           0.41    0.41         0.42    0.47            0.58    0.61    0.62    0.63
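
The two coefficients reported in Table 1 can be illustrated with a small, dependency-free sketch. It is an assumption-laden illustration rather than the procedure used in the study: the helper names are hypothetical, the data are made up, and the p-value computation mentioned in the text is omitted. Spearman's coefficient is obtained here as the Pearson coefficient of the ranks, with tied values sharing their average rank.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient r (strength of linear association)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up illustration data (not from the study): a monotonic but
# nonlinear relation between one acoustic feature and a score.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [1.0, 4.0, 9.0, 16.0, 25.0]
r = pearson_r(feature, score)
rho = spearman_rho(feature, score)
```

For such monotonic but curved data, ρ reaches 1 while r stays below 1, which is the kind of gap between the two columns that a rank-based coefficient is meant to expose.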

accuracy of the methods. The formulas for these metrics are presented as follows:

\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \qquad (1) \]

\[ \mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \qquad (2) \]

\[ \mathrm{STD} = \sqrt{\frac{\sum_{k=1}^{10}\left(\mathrm{MAE}_k - \widetilde{\mathrm{MAE}}\right)^2}{k}} \qquad (3) \]

\[ R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \tilde{y}\right)^2} \qquad (4) \]

where y_i are the predicted values of UPDRS-III, ŷ_i is the reference value of UPDRS-III, ỹ is the mean value, n is the number of observations in the sample, and k is the fold in the cross validation.
When analyzing the results for the regression algorithms and vowels, we expect the RMSE, MAE and STD to be the lowest and R² to be the highest (in a perfect match between the predicted values and the reference, this should be equal to 1). This means that the algorithm estimated the UPDRS-III points with the smallest error and with the best fit of the estimation results to the reference values. Based on Table 2, it can be seen that the Random Forest Regressor algorithm estimates the UPDRS-III points with the lowest RMSE and MAE errors for all the vowels, whereas R² shows the highest performance. The highest errors (RMSE, MAE) and lowest R² results were obtained for the MLR method including all the results for all the vowels. Among the analyzed vowels, the lowest RMSE and MAE errors were computed for the vowel /a/ (RMSE=2.6975, MAE=1.8530, STD=2.2404 and R²=0.9612). The vowels /e/ and /o/ show slightly worse results, RMSE=3.3909 for the vowel /o/ and RMSE=3.7014 for the vowel /e/. The MAE for these vowels is approximately 2.5 points. To validate the algorithms, we performed 10-fold cross-validation. Based on this, we computed the standard deviation of the MAE to determine the distribution of this error. The STD is around 3 points for the vowels /a/, /e/ and /o/. For the rest of the vowels, the standard deviation is higher. The R² coefficient shows the highest results for the

TABLE 2.
The Results of RMSE, MAE, STD and R² of the UPDRS-III Prediction Based on MLR, SVR and RF Algorithms

Metrics   Regression   /a/      /e/      /i/      /o/      /u/
RMSE      MLR          7.3728   6.8246   7.2474   7.7407   7.1860
          SVR          6.1043   6.3341   6.7279   6.3501   6.7089
          RF           2.6975   3.7014   6.9336   3.3909   4.7092
MAE       MLR          5.5288   4.5798   4.7953   5.3513   4.8203
          SVR          4.7354   5.2432   5.1824   4.9599   5.3153
          RF           1.8530   2.5361   4.6994   2.4700   3.3756
STD       MLR          6.3393   5.8754   5.7654   6.5733   6.1655
          SVR          5.7170   5.9130   6.1041   5.5720   6.1146
          RF           2.2404   3.2051   5.3334   3.0374   4.2738
R²        MLR          0.7754   0.8019   0.7408   0.6740   0.7135
          SVR          0.8328   0.9341   0.7868   0.8246   0.8067
          RF           0.9612   0.9279   0.8117   0.9386   0.8899
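
The metrics reported in Table 2 (RMSE, MAE, the STD of the fold-wise MAE, and R²) can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the sample values are made up, the STD is taken as the population standard deviation over the folds (dividing by the number of folds), and R² uses the mean of the reference values in its denominator. The last lines average the RF row of MAE values from Table 2.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(reference, predicted)) / n)


def mae(reference, predicted):
    """Mean absolute error between reference and predicted values."""
    n = len(reference)
    return sum(abs(t - p) for t, p in zip(reference, predicted)) / n


def r2(reference, predicted):
    """Coefficient of determination (1 means a perfect match)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((t - p) ** 2 for t, p in zip(reference, predicted))
    ss_tot = sum((t - mean_ref) ** 2 for t in reference)
    return 1 - ss_res / ss_tot


def std_of_fold_mae(fold_maes):
    """Spread of the MAE across cross-validation folds (assumed population STD)."""
    mean_mae = sum(fold_maes) / len(fold_maes)
    return math.sqrt(sum((m - mean_mae) ** 2 for m in fold_maes) / len(fold_maes))


# Made-up UPDRS-III values for illustration only.
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
print(mae(reference, predicted))   # 2.5
print(r2(reference, predicted))    # approximately 0.948

# MAE of the RF regressor per vowel (/a/, /e/, /i/, /o/, /u/), from Table 2.
rf_mae_per_vowel = [1.8530, 2.5361, 4.6994, 2.4700, 3.3756]
mean_rf_mae = sum(rf_mae_per_vowel) / len(rf_mae_per_vowel)
print(round(mean_rf_mae, 4))       # 2.9868
```

The averaged RF value agrees with the mean MAE of 2.9868 that the text reports for the RF algorithm.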

vowel /a/ and is equal to 0.9612, for the vowel /o/ it is equal to 0.9386 and for the vowel /e/ it is equal to 0.9279. The mean MAE for the RF algorithm is equal to 2.9868.

The scatter plots obtained with the RF algorithm for all the analyzed vowels are shown in Figure 5. We fitted a first-order polynomial (a straight line) to the predicted and reference values of UPDRS-III. In the case of a perfect match, the points follow a 45-degree straight line, which means that the regression results are identical to the reference values.

Prediction of neurological state in patients after drug consumption
The concept of prediction is understood as the process of predicting the value of a dependent variable for the determined values of an independent variable in the future. The result of the prediction process is a forecast expressed by a numerical result. In this study, the prediction consisted in determining the UPDRS-III score 180 minutes after consumption of the drug (levodopa) based on voice signal analysis of patients with PD recorded at 0, 30, 60 and 120 minutes after drug administration. For this purpose, we analyzed the possibility of forecasting the severity of Parkinson’s symptoms using only the voice signal. This task consisted in predicting the value of acoustic parameters describing the speech signal and estimating the patient’s condition expressed in the UPDRS-III scale. Because the UPDRS scale includes the assessment of the patient’s speech in only one of its parts, the task of estimating the patient’s future condition is extremely difficult. The above idea is based on the assumption that there is a sufficiently strong correlation between the impairment of patients’ speech expressed on the UPDRS scale and the assessment of the patients’ condition.

RESULTS
The task of predicting the patient’s condition expressed in the UPDRS-III scale was divided into two parts. The first part concerned the prediction of the values of acoustic parameters (a vector of acoustic parameters) within 3 hours of drug consumption, while the second one involved assigning a UPDRS-III score to the designated vector. The basis for the prediction was the results of speech signal analysis recorded at four specified time intervals. A diagram of the prediction process carried out in this research is shown in Figure 1.

Prediction of acoustic parameters was carried out using MLR, SVR and RF. The input data consisted of the results of acoustic analysis for each parameter for 4 voice recordings of patients with PD. The analysis was conducted using the vowels /a/, /e/, /i/, /o/ and /u/. The first recording was carried out when the patient was in the off state, the next one 30 minutes after consumption of the medication given by the doctor, then 60 and 120 minutes after the first measurement. At the learning stage, the output of the regression algorithms was the results of an acoustic analysis based on voice recordings made 180 minutes after taking the medication. For each acoustic parameter, we trained the regression algorithms and calculated the MAE. The smaller the MAE, the better the model. Due to the limited number of patients participating in this study (n=27), we used 5-fold cross-validation. Table 3 presents the mean absolute error [%] and its standard deviation for each analyzed vowel between predicted acoustic parameter values and the reference acoustic parameter values calculated from the voice signal acquired 180 minutes after taking the medication to alleviate the effects of the disease.

Table 3 shows the MAE between the predicted and reference values of acoustic parameters. Based on the calculated results, the vowels /a/, /e/ and /o/ show the lowest prediction error (≤0.09%).

The final stage of this research was to assign scores on the UPDRS-III scale to the predicted vector of speech signal parameters. For this purpose, we applied the RF algorithm as it showed the best prediction results. The results of the prediction of UPDRS-III 180 minutes after taking the medication based on a previously recorded voice signal are presented in Table 4.

The results in Table 4 show that the estimated UPDRS-III values closest to the reference values belong to the vowel /a/. The best performance of an algorithm was indicated by Random Forest Regressor for all the vowels. The lowest mean absolute error was equal to 4.4613 ± 4.7070, the lowest RMSE was equal to 5.5236 and the highest R2 value was equal to 0.8442. The mean MAE of all the vowels was equal to 4.9569 and its mean STD was equal to 5.6155.

When comparing the results shown in Table 2 with the results shown in Table 4, additional errors arise when we first predict the acoustic parameters and then estimate the UPDRS-III. Nevertheless, the reported MAE results are around 5 points, which is the margin of error in the determination of the UPDRS-III by different doctors (4−5 points)36.

The charts in Figure 6 present the scores for 10 patients for each vowel and a reference value (result determined by doctors). It can be seen that the vowel /i/ shows the biggest differences between the reference UPDRS-III score and the predicted values. The lowest differences are achieved for the vowels /a/, /e/ and /o/.

DISCUSSION
The UPDRS scale allows assessment of the severity of symptoms and the severity of Parkinson’s disease. This study focused on assessing patients over a total period of about 4 hours, taking into account the off condition and a period of up to 3 hours after consuming medications that alleviate the effects of the disease. Based on emerging changes in patient status, it was possible to validate and implement possible modification of drug doses and adjustment to the current level of advancement of the disease and worsening of symptoms. It is not possible for a doctor to

FIGURE 5. Scatter plots of predicted and reference values of the UPDRS-III scale obtained with the RF algorithm for the vowels /a/, /e/, /i/, /o/ and /u/ (pred: UPDRS-III predicted value; ref: UPDRS-III reference value).
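
The first-order fit drawn through the scatter plots of Figure 5 can be illustrated with the short sketch below. It is a plain least-squares line fit written for illustration only; the function name `fit_line` and the scores are hypothetical, and the 45-degree reading assumes both axes use the same scale.

```python
def fit_line(reference, predicted):
    """Least-squares fit of predicted = slope * reference + intercept."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(predicted) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(reference, predicted))
    sxx = sum((x - mean_x) ** 2 for x in reference)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scores: a perfect match lies on the 45-degree line
# (slope 1, intercept 0).
perfect = fit_line([10, 20, 30], [10, 20, 30])
# Predictions pulled toward the mean give a slope below 1.
compressed = fit_line([10, 20, 30], [12, 20, 28])
```

The further the fitted slope departs from 1 and the intercept from 0, the further the predictions are from the reference scores.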

spend a period of about 3−4 hours individually with a patient to be able to accurately assess the patient’s condition over a given time period. However, this period allows an objective assessment of disease severity and emerging symptoms. A tool supporting the doctor in assessing the patient’s UPDRS-III score at a given moment can facilitate and document in detail the process of monitoring patients. Having such an application that can determine the

TABLE 3.
The Mean Absolute Error [%] and Its Standard Deviation [%] for Each Analyzed Vowel Between Predicted Acoustic Parameters’ Values and the Reference Acoustic Parameters’ Values Calculated from the Voice Signal Acquired 180 min After Taking the Medication

                            /a/   /e/   /i/   /o/   /u/
Mean absolute error [%]     0.08  0.09  0.10  0.09  0.11
STD of prognosis error [%]  0.05  0.02  0.01  0.01  0.02
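
The errors in Table 3 were obtained with 5-fold cross-validation on 27 patients. A minimal sketch of such a split is given below; it assumes the split is made per patient with a seeded shuffle, which the paper does not state, and the function name `k_fold_indices` is hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Split sample indices into k folds; return (train, test) index pairs."""
    indices = list(range(n_samples))
    # Assumption: patients are shuffled once with a fixed seed.
    random.Random(seed).shuffle(indices)
    # Taking every k-th index keeps the fold sizes within 1 of each other.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(27, 5)  # 27 patients, 5 folds
fold_sizes = [len(test) for _, test in splits]
```

With 27 patients the folds hold 6, 6, 5, 5 and 5 patients, so each model is trained on 21 or 22 patients and tested on the rest.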

UPDRS-III automatically based on voice analysis and provide this information to physicians quickly is more than desirable. The voice analysis must be performed at certain times in relation to the medication time, as we have reported in this article. In this way, the software can remind the individual user to perform a voice recording at different time periods, which is necessary for analysis and prediction of the UPDRS-III score. Home assessment has a further advantage, namely that the patient's state at the visit is sometimes different than his state at home. Stress connected to the visit in the doctor's office may influence the symptoms of PD in a double-sided manner. Some patients present with a reduction of parkinsonian symptoms and tend to display more dyskinesias, while others manifest more severe parkinsonian symptoms. So, the assessment of a patient during a visit does not always reflect his real condition. Home evaluation using computational software may give more truthful results.

FIGURE 6. Prediction of UPDRS-III scoring in 180 minutes based on recordings of extended phonation vowels for 10 patients.

Based on the research methodology presented, it is possible to automate the assessment of the patient’s condition on the UPDRS-III scale and reduce examination time. As a result of the implementation of the proposed procedure, a difference of 4−5 points (MAE) of UPDRS-III was obtained in the prediction of the patient’s condition. The MAE results for assessing the UPDRS-III score based on the speech signal were in the range of 1.85 for the vowel /a/ to 4.70 for the vowel /i/. These results are very close to the results presented by Tsanas et al.16, where the authors showed an MAE of 1.6 points for males and 1.7 points for females. The MAE results from our experiment to estimate the UPDRS-motor score were lower than those presented in ref.28 (MAE=7.22). The efficient UPDRS-III estimation of this study results from two factors: (i) acoustic parameters were highly correlated with the severity of PD changes, (ii) the RF algorithm outperformed other machine learning

TABLE 4.
The Results of RMSE, MAE, STD and R2 Calculated Based on Predicted Acoustic Vectors Describing the Acoustic Signals 180 min After Taking the Medication and Used to Estimate the UPDRS-III Scale for Vowels /a/, /e/, /i/, /o/ and /u/

Metrics  Regression  /a/     /e/     /i/     /o/     /u/
RMSE     MLR         7.7528  7.7281  7.5422  8.2333  7.2610
         SVR         6.9445  6.3401  6.9189  8.2388  7.4646
         RF          5.5236  6.3331  7.6623  6.4326  6.9832
MAE      MLR         5.9682  6.0150  5.4157  5.8567  5.3715
         SVR         5.4930  5.2852  5.3814  6.5589  6.1805
         RF          4.4619  4.8974  5.7353  4.6594  5.0305
STD      MLR         7.0725  7.1810  6.6147  7.2573  6.5471
         SVR         6.1844  5.9869  6.0871  7.5163  7.1402
         RF          4.7070  5.4945  6.3358  5.3089  6.2912
R2       MLR         0.6432  0.6330  0.6930  0.6590  0.7176
         SVR         0.7410  0.7664  0.7267  0.6042  0.6348
         RF          0.8442  0.7769  0.6738  0.7885  0.7084
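
The cost of predicting the acoustic parameters before estimating UPDRS-III can be read off by comparing the RF MAE rows of Table 2 and Table 4. The sketch below only repeats that arithmetic on the tabulated values; the variable names are illustrative.

```python
vowels = ["/a/", "/e/", "/i/", "/o/", "/u/"]
# RF MAE when UPDRS-III is estimated from measured acoustic parameters (Table 2).
mae_estimation = [1.8530, 2.5361, 4.6994, 2.4700, 3.3756]
# RF MAE when the acoustic parameters were themselves predicted (Table 4).
mae_prediction = [4.4619, 4.8974, 5.7353, 4.6594, 5.0305]

# Mean MAE over the five vowels for each table.
mean_estimation = round(sum(mae_estimation) / len(mae_estimation), 4)
mean_prediction = round(sum(mae_prediction) / len(mae_prediction), 4)

# Per-vowel increase in MAE introduced by the extra prediction step.
increase = {
    v: round(p - e, 4)
    for v, e, p in zip(vowels, mae_estimation, mae_prediction)
}
```

The means reproduce the values quoted in the text (2.9868 and 4.9569), and the per-vowel increase ranges from about 1.04 points for /i/ to about 2.61 points for /a/.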

regression algorithms. Previous studies included only the estimation of UPDRS and UPDRS-III based on speech signal including different speech tasks. In this study, we undertook the challenge to predict the severity of the patient’s disease changes based on speech signal in short-term monitoring. The results obtained should be considered promising, especially because their error values to replicate the UPDRS score are lower than inter-rater variability scores appointed by the physicians (4−5 points36). The MAE of the prediction of UPDRS-III score is just 1 point higher than the inter-rater variability differences between the physicians. A tool that will not require medical intervention, allowing for a short-term (3 hours) analysis of changes in the neurological state and the patient’s response to medication would allow for a personalized diagnosis created in a much shorter time.

Acknowledgments
The authors would like to express their appreciation to the Head of the Department of Neurology at John Paul II Hospital in Krakow, Dr Michal Michalski, who made it possible for us to carry out this study. This work was funded by the Ministry of Science and Higher Education in Poland under the Diamond Grant program, decision number 0136/DIA/2013/42 (AGH 68.68.120.364).

SUPPLEMENTARY DATA
Supplementary data related to this article can be found online at https://doi.org/10.1016/j.jvoice.2020.06.004

REFERENCES
1. Parkinson J. An essay on the shaking palsy. J Neuropsychiatry Clin Neurosci. 2002;14:223–236.
2. Maillet A, Krainik A, Debu B, et al. Levodopa effects on hand and speech movements in patients with Parkinson's disease: a FMRI study. PLoS One. 2012;71:e46541.
3. Marsot A, Guilhaumou R, Azulay JP. Levodopa in Parkinson's disease: a review of population pharmacokinetics/pharmacodynamics analysis. J Pharm Pharm Sci. 2017;20:226–238.
4. Fahn S. Parkinson study group, does levodopa slow or hasten the rate of progression of Parkinson's disease. J Neurol. 2005;2524.iv37-iv42
5. Svenningsson P, et al. Eltoprazine counteracts l-DOPA-induced dyskinesias in Parkinson's disease: a dose-finding study. Brain. 2015;138:963–973.
6. Marsot A, Guilhaumou R, Azulay JP. Levodopa in Parkinson's disease: a review of population pharmacokinetics/pharmacodynamics analysis. J Pharm Pharm Sci. 2017;20:226–238.
7. Cernak M, Orozco JR, Rudzicz F. Characterisation of voice quality of Parkinson's disease using differential phonological posterior features. Comput Speech & Lang. 2017.
8. Ramig LO, Fox C, Sapir CS. Speech disorders in parkinson’s disease and the effects of pharmacological, surgical and speech treatment with emphasis on Lee Silverman voice treatment (LSVT). Handb Clin Neurol. 2007;83:385–399.
9. Logemann JA, Fisher HB, Boshes B. Frequency and cooccurrence of vocal tract dysfunctions in the speech of a large sample of parkinson patients. J Speech Hear Disord. 1978;43:47–57.
10. Darley FL, Aronson AE, Brown JR. Differential diagnostic patterns of dysarthria. J Speech Lang Hear Res. 1969;12:246–269.
11. Skodda S, Visser W, Schlegel U. Articulation in Parkinson’s disease. J Voice. 2011;25:467–472.
12. Rusz J, Cmejla R, Ruzickova HE. Quantitative acoustic measurements for characterization of speech and voice disorders in early untreated Parkinson's disease. J Acoust Soc Am. 2011;129:350–367.
13. De M, Santens P, De Bodt M. The effect of Levodopa on respiration and word intelligibility in people with advanced Parkinson’s disease. Clin Neurol Neurosurg. 2007;109:495–500.
14. Skodda S, Grönheit W, Schlegel U. Intonation and speech rate in Parkinson’s disease: general and dynamic aspects and responsiveness to levodopa admission. J Voice. 2011;25:e199–e205.
15. Goetz CG, Tilley BC, Shaftman SR. Movement disorder society sponsored revision of the unified parkinson’s disease rating scale (MDS-UPDRS): Scale presentation and clinimetric testing results. Mov Disord. 2008;23:2129–2170.
16. Tsanas A, Little MA, McSharry PE. Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average parkinson’s disease symptom severity. J Royal Soc Interface. 2010;8:842–855.
17. Sakar BE, Serbes G, Sakar CO. Analyzing the effectiveness of vocal features in early telediagnosis of Parkinson’s disease. PloS One. 2017;12.
18. Sakar CO, Serbes G, Gunduz A. A comparative analysis of speech signal processing algorithms for Parkinson's disease classification and the use of the tunable q-factor wavelet transform. Appl Soft Comput. 2019;74:255–263.
19. Panek D, Skalski A, Gajda J. Quantification of linear and nonlinear acoustic analysis applied to voice pathology detection. Information Technologies in Biomedicine. 4. Springer; 2014:355–364.
20. Panek D, Skalski A, Gajda J. Acoustic analysis assessment in speech pathology detection. Int J Appl Math Comput Sci. 2015;25:631–643.
21. Ladefoged P, Harshman R, Goldstein L. Generating vocal tract shapes from formant frequencies. J Acoust Soc Am. 1978;64:1027–1035.
22. Orozco-Arroyave JR, Arias-Londoño JD, Vargas-Bonilla JF, et al. Perceptual analysis of speech signals from people with Parkinson's disease. IWINAC 2013, Part 1, LNCS 7930. Berlin Heidelberg: Springer-Verlag; 2013:201–211.
23. Benba A, Jilbab A, Hammouch A. Voice analysis for detecting persons with Parkinson's disease using MFCC and VQ. Recent Adv Electr Eng Comput Sci. 2014:96–100.
24. Benba A, Jilbab A, Hammouch A, Sandabad S. Voice-prints analysis using MFCC and SVM for detecting patients with Parkinson's disease. IEEE 1st International Conference on Electrical and Information Technologies ICEIT 2015. 2015:300–304.
25. Benba A, Jilbab A, Hammouch A. Discriminating between patients with Parkinson's and neurological diseases using cepstral analysis. IEEE Trans Neural Syst Rehabil Eng. 2016;24:1100–1108.
26. Godino-Llorente JI, Gomez-Vilda P, Blanco-Velasco M. Dimensionality reduction of a pathological voice quality assessment system based on gaussian mixture models and short-term cepstral parameters. IEEE Trans Biomed Eng. 2006;53:1943–1995.
27. Arias-Vergara T, Vásquez-Correa JC, Orozco-Arroyave JR, Vargas-Bonilla JF, Nath E. Parkinson’s disease progression assessment from speech using GMM-UBM. INTERSPEECH 2016. 2016:1933–1937.
28. Buza K, Varga NA. ParkinsoNET: estimation of UPDRS score using hubness-aware feedforward neural networks. Appl Artif Intell. 2016;30:541–555.
29. Nilashi M, Ibrahim O, Samad S. An analytical method for measuring the Parkinson's disease progression: a case on a Parkinson's telemonitoring dataset. Measurement. 2019;136:545–557.
30. Nilashi M, Ibrahim O, Ahani A. Accuracy improvement for predicting Parkinson's disease progression. Sci Rep. 2016:34181.
31. Sapir S, Ramig LO, Spielman JL. Formant centralization ratio (FCR): a proposal for a new acoustic measure of dysarthric speech. J Speech Lang Hear Res. 2010;53:1–20.
32. Nilashi M, Ibrahim O, Samad S, et al. An analytical method for measuring the Parkinson's disease progression: a case on a Parkinson's telemonitoring dataset. Measurement. 2019;136:545–557.

33. Breiman L. Random forests. Machine Learning. 2001;45:5–32.
34. Zlotnik A, Montero JM, San-Segundo R, Gallardo-Antolín A. Random forest-based prediction of parkinson’s disease progression using acoustic, ASR and intelligibility features. 16th Annual Conference of the International Speech Communication Association. 2015.
35. Smola AJ, Scholkopf B. A tutorial on support vector regression. Stat Comput. 2004;14:199–222.
36. Post B, Merkus MP, de Bie RM. Unified Parkinson's disease rating scale motor examination: are ratings of nurses, residents in neurology, and movement disorders specialists interchangeable. Mov Disord. 2005:1577–1584.
