1 Introduction

Postural instability is one of the symptoms of Parkinson’s Disease (PD) that significantly affects the quality of life and, more importantly, the safety of people suffering from this neurodegenerative disease, because it increases the risk of falls and injuries during daily activities (Shoneburg et al. 2013; Fasano et al. 2017). It worsens with the progression of the disease and generates a significant cost to the healthcare system (Wielinski et al. 2005; Sparrow et al. 2016).

Several clinical scales and tests are employed to assess postural instability and balance dysfunctions, including the Timed Up and Go (TUG) test (Shumway-Cook et al. 2000; Nocera et al. 2013); the Postural Instability and Gait Difficulty (PIGD) (Jankovic et al. 1990), a subscale of the Unified Parkinson’s Disease Rating Scale (UPDRS) (Goetz et al. 2008); and the Berg balance scale (Berg et al., 1989; Błaszczyk et al. 2007).

The combination of multiple balance tests can provide a better assessment of postural instability (Jacobs et al. 2006, 2016) than a single test such as the TUG or the pull test defined in the UPDRS (Jacobs et al. 2006; Munhoz et al. 2004; Pérez-Sánchez et al., 2019). Furthermore, deficits in postural stability can be highlighted by performing concurrent cognitive tasks or secondary motor tasks during steady standing conditions (Morris et al. 2000; Marchese et al. 2003; Cheng et al., 2018; Sarasso et al. 2021).

Several recent studies have shown the strong correlation between postural sway and balance dysfunctions during the standing stance in PD subjects (Frenklach et al. 2009; Ozinga et al. 2015; Mancini et al. 2011; Curtze et al. 2016). Postural sway is the continuous movement of the Center of Mass (CoM) of the body activated by the vestibular, somatosensory, and visual systems to maintain a balanced posture. In PD, it is also recognized that dopaminergic treatments influence postural sway (Mancini et al. 2011; Menant et al. 2011; Workman et al., 2019), so frequent monitoring of this physical quantity is desirable to control and limit the negative side effect on the stability that could determine the subject’s fall. Postural (or body) sway is generally estimated by static or dynamic posturography (Johnson et al. 2013), by quantifying the displacements of the Center of Pressure (CoP) on a force platform under the feet, both in steady stance conditions and in the presence of external perturbations. CoP and CoM are strongly correlated: CoP represents the necessary reaction force to maintain a posture balanced in the presence of postural sways, which can be observed through the movement of CoM (Richmond et al. 2021). In any case, even if CoP movement is only indirectly related to postural sway and instability, in clinical settings it has traditionally been preferred over CoM estimation due to the difficulty of measuring the body CoM out of laboratory environments (Hasan et al., 1996; Leach et al. 2014; Clark et al. 2018).

Recently, several non-contact approaches based on low-cost optical tracking devices such as RGB-Depth cameras (Microsoft® Kinect 2 SDK; Intel® Developer Zone) have been proposed for body movement analysis (Han et al. 2013; Ferraris et al. 2018; Clark et al. 2019; Puh et al. 2019). They show complementary features with respect to on-body inertial sensors, including simpler setups, higher usability, less invasiveness, and suitability for integration with gesture-based human-machine interfaces. On the other hand, they are less ubiquitous, prone to self-occlusions, and raise privacy concerns. This last drawback can be avoided by using only the depth or the skeleton information provided by the device.

When continuous daily monitoring of the motor status is not required, and spot evaluations are preferable, these devices successfully provide a non-invasive alternative to on-body inertial sensors in home monitoring of people with PD (Ozinga et al. 2015; Mancini et al. 2011; Rovini et al., 2019; Silva de Lima et al. 2020; Sica et al. 2021). In particular, the Microsoft Kinect® v1 has been used to assess movements in PD subjects (Galna et al. 2014), postural sway (Yeung et al. 2014), and balance (Yang et al. 2014), while the more recent Microsoft Kinect® v2 has been used to assess balance dysfunctions (Eltoukhy et al. 2018), posture and postural stability (Clark et al. 2015; Grooten et al. 2018), postural sway (Mishra et al. 2017), and upper limb functions (Cai et al. 2019), to evaluate clinical motor functions (Otte et al. 2016; Clark et al. 2019) in different fields of application, and for rehabilitation purposes (Garcia-Agundez et al. 2019). In the context of neurodegenerative and neurological diseases, Microsoft Kinect v2 has been successfully employed for several clinical evaluations: the Timed Up and Go (TUG) test (Kähär et al. 2017; Tan et al. 2019), automatic recognition of different categories of PD subjects (Rocha et al. 2015; Dranca et al. 2018), automatic classification of gait patterns and disorders (Li et al. 2018), assessment of neurological rehabilitation (Knippenberg et al. 2017), and assessment of postural stability and lower limb impairments (Ferraris et al. 2019).

In the context of PD, a frequent assessment of postural instability as a predictor of the risk of falls is important, making a home solution for its characterization advantageous for PD subjects.

This paper addresses this demand by presenting a prototype of a home-based monitoring system developed for the automated assessment of postural instability of PD subjects. The proposed system is based on low-cost RGB-Depth cameras, and it intends to fulfill specific requirements: suitability to be self-managed without safety risks by users with motor impairment, support for objective and daily-based spot evaluations compliant with standard clinical scales for postural stability assessment, in particular with PIGD subscale (Jankovic et al. 1990; van der Heeden et al. 2016). With respect to solutions based on wearable inertial sensors (Rovini et al. 2019; Channa et al. 2020), our proposal shows a simpler setup and maintenance, no invasiveness and support for gestural human-machine interfaces.

The assessment is based on the kinematic analysis of the body’s Center of Mass (CoM) while the user performs specific motor tasks. To our knowledge, this is the first time that estimates of CoM obtained by RGB-Depth cameras are used to characterize the postural instability in PD. The proposed tasks are designed on the basis of standard clinical scales adopted for postural stability assessment, which have been modified by the inclusion of concurrent motor tasks (dual-task condition). This extension produces different types of postural stress with the aim of emphasizing the balance dysfunctions, thus providing a more comprehensive assessment of instability (Jacobs et al. 2006a).

The automated assessment of postural stability takes place by a machine learning approach. During the task execution, a set of kinematic parameters estimated from CoM is used as input to supervised classifiers for the assessment of the user’s performance. An experimental campaign was conducted on a cohort of PD subjects to collect clinical trials and the corresponding CoM parameters used for the training of the classifiers. The consistency of the automated assessments with respect to the clinical ones has also been verified.

This paper extends our previous preliminary work on postural stability (Ferraris et al. 2021) in the following aspects: two more tasks have been designed and added for a more comprehensive analysis and assessment of postural stability; correspondingly, more PD subjects have been included in the experiment; two more classifiers have been trained and used for the automated assessments.

The remaining part of the paper is organized as follows. Section 2 presents the methodological framework: the characterization of postural instability, the tasks designed to assess postural stability and its characterization by CoM parameters, and the automatic assessment of postural instability by supervised classifiers. Section 3 presents the system (hardware and software components) and the human-machine interface for its management. Section 4 presents the experimental framework: participant selection, experimental data acquisition, and the statistical analysis performed on the data. Section 5 presents the experimental results on the correlation between clinical scores and CoM parameters, and on the accuracy of the classifiers. The paper ends with Sect. 6, where the results are discussed and the conclusions are presented.

2 Methodological framework

2.1 Characterization of postural stability

PD subjects can be classified into two motor subtypes: those with postural instability and gait difficulty (PIGD subtype) and those in which tremor is the dominant symptom (TD subtype). The classification is based on the PIGD score as the average of some UPDRS items (Jankovic et al. 1990). Compared with non-PIGD participants, PIGD participants were significantly more likely to suffer multiple falls (Pelicioni et al. 2019), supporting the use of PIGD sub-score both in selecting PD participants and for comparing clinical and automated assessment.

2.2 Tasks for postural stability assessment

The design of the tasks for the automated assessment of postural stability fulfills several requirements: compliance with standard clinical scales, suitability for self-management by motor-impaired users, safety of the performer during the task execution in a typical home environment, and highlighting and enforcement of the strong correlation between postural sway and balance dysfunctions. Three tasks have been designed as derived directly from the Berg balance scale: Up-Stand-Down (USD), Tandem-Standing (TS), Reaching-Standing (RS). During each evaluation session, the user is invited to perform the following tasks:

  • Up-Stand-Down (USD): get up from a chair, stand without support for 1 min and sit down.

  • Tandem-Standing (TS): stand for 1 min with feet spaced one step ahead.

  • Reaching-Standing (RS): stand for 1 min with arms outstretched.

As an example, the schema of the RS task is shown in [Fig. 1]. Only a few Berg scale items were considered in the task design, excluding those that pose a safety risk in self-managed use. The one-minute phase of the three tasks is split into two 30-second sub-phases: the first one consists of a balance test in a single-task condition (ST), while in the second one, the user is invited to read some words displayed on the system screen. This last sub-phase consists of a concurrent cognitive task, intended to emphasize stability dysfunctions in a dual-task condition (DT) (Morris et al. 2000; Cheng et al., 2018; Sarasso et al. 2021). Users perform the tasks in front of the RGB-Depth camera, about 3 m away, starting from a sitting position (in the USD task) or a standing position (in the TS and RS tasks): specific constraints have been considered to ensure optimal body tracking with the proposed solution. In addition, the user manages each task by gestures and visual feedback through the Graphical User Interface (GUI) of the system. Details on system constraints and interaction management are described in Section 3. The user is asked to start and end each task by performing three rapid movements with one of the arms: these movements make it possible to detect the evaluable part of the performance automatically.

Fig. 1

Schema of the Reaching-Standing (RS) task: arms stretched forward, standing for 30 s (ST phase), standing for 30 s (DT phase), arms stretched at the side. T_RS is the total duration of the task

The task duration is split into two consecutive sub-phases (ST and DT) analyzed separately. The system automatically assesses the subject’s stability during the execution of the three tasks by ranking the performance into a three-level ordinal scale, whose scores are: poor, medium, and good. Details on the kinematic characterization and the automated performance assessment into three levels are given in the following sections. The system collects data and videos of each performance both for the automated assessment and for the possible verification by remote supervisors of its correct execution.

2.3 Stability characterization by CoM parameters

In this work, we make use of the strong correlation existing between the CoM sway and the balance dysfunctions (Frenklach et al. 2009; Mancini et al. 2011; Curtze et al. 2016) to automatically characterize and quantify postural stability in PD. The assessment is based on the kinematic parameters of the user’s CoM during the standing phase of the motor tasks designed.

CoM sway has already been analyzed by optical RGB-Depth devices in the context of biomechanical analysis of healthy subjects (Yeung et al. 2014; Mishra et al. 2017). We adopt a similar approach based on Microsoft Kinect v.2, where the skeletal model provided by the Software Development Kit (SDK) is used to evaluate the three-dimensional (3D) position of the body CoM in real-time. The SDK of the device provides a skeletal model consisting of twenty-five 3D joints that correspond roughly to anatomical points of the human body [Fig. 2]. A simpler model than those commonly used with gold standard systems (Devetak et al. 2019) has been considered to evaluate the Center of Mass of the body (CoMBody) where only some joints of the skeletal model are used. In particular, CoMBody is calculated as indicated in Eq. 1, i.e., as the weighted average of the center of mass (CoMi) of six body segments (head, trunk, legs, and arms) whose lengths are evaluated from the corresponding joint segments of the skeletal model, as shown in [Table 1]. The anatomic weights (wi) have been set according to standard anthropometric tables related to Dempster’s studies in 1955 (Clauser et al. 1971).

$$\text{CoM}_{\text{Body}} = \frac{1}{6}\sum_{i=1}^{6} \text{CoM}_i \cdot w_i \tag{1}$$

Table 1 Segments, joints and anatomic weights used for the CoMBody evaluation
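
As an illustration of Eq. 1, the following Python sketch evaluates CoMBody from the centers of mass of six body segments. It is a minimal sketch rather than the system's implementation: the function name `com_body`, the tuple representation of 3D points, and the ordering of segments are assumptions made here, and the weights must be supplied by the caller (the values of Table 1 are not reproduced). The 1/6 factor is kept exactly as written in Eq. 1.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]  # (x, y, z) position in metres


def com_body(segment_coms: Sequence[Vec3], weights: Sequence[float]) -> Vec3:
    """Evaluate Eq. 1 literally: (1/6) * sum_i CoM_i * w_i over six segments.

    `segment_coms` and `weights` are hypothetical inputs: one CoM_i and one
    anatomic weight w_i per body segment, given in the same order.
    """
    if len(segment_coms) != 6 or len(weights) != 6:
        raise ValueError("Eq. 1 is defined over exactly six body segments")
    return tuple(
        sum(com[axis] * w for com, w in zip(segment_coms, weights)) / 6
        for axis in range(3)
    )
```

With unit weights the function reduces to the plain average of the six segment positions, which is a convenient sanity check before plugging in anthropometric weights.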

In the post-processing phase, a low-pass Butterworth filter (second-order, 8 Hz cut-off frequency) was applied to the 3D joints to minimize the effects of high-frequency noise in the data due to motion artifacts. An example of CoMBody position for standing stance is shown in [Fig. 3]. The figure shows the position of the CoMBody and the CoMi for each body segment used in the calculation.
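
The filtering step can be sketched as follows. This is an illustrative pure-Python biquad, assuming the standard bilinear-transform design of a second-order Butterworth low-pass filter, a sampling rate of 30 Hz (the nominal frame rate of the sensor), and a single forward pass over one coordinate of one joint; whether the original processing used a forward-only or a zero-phase (forward-backward) pass is not stated in the text. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def butter2_lowpass_coeffs(cutoff_hz: float, fs_hz: float) -> Tuple[tuple, tuple]:
    """Second-order Butterworth low-pass coefficients (bilinear transform)."""
    if not 0.0 < cutoff_hz < fs_hz / 2.0:
        raise ValueError("cut-off must lie between 0 and the Nyquist frequency")
    k = math.tan(math.pi * cutoff_hz / fs_hz)  # pre-warped cut-off
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (1.0, 2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def lowpass(samples: Sequence[float], cutoff_hz: float = 8.0,
            fs_hz: float = 30.0) -> List[float]:
    """Filter one coordinate of one joint with a single forward pass."""
    if not samples:
        return []
    (b0, b1, b2), (_, a1, a2) = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    # Initialise the filter memory with the first sample so that a joint
    # that is standing still does not produce a start-up transient.
    x1 = x2 = y1 = y2 = samples[0]
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

Each of the three coordinates of each joint would be filtered independently before the segment centers of mass are computed.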

Fig. 2

Skeletal model and 3D position of joints in standing stance position

Fig. 3

Example of 3D position of CoMBody (magenta) in standing stance on the 3D reconstruction of the body (point cloud). The position of each CoMi is also displayed: head (blue), trunk (green), arms (light blue) and legs (red)

Postural parameters were obtained as in (Mancini et al. 2011) from CoMBody sway along the medio-lateral (ML) and antero-posterior (AP) directions, which are the orthogonal components of CoMBody in the horizontal body plane. They were evaluated both for single-task (ST) and dual-task (DT) conditions as in (Ferraris et al. 2019). [Table 2] shows the postural parameters considered for the three tasks. The maximum sway range is calculated relative to the starting position; the total sway length is the total distance covered by CoMBody; the speed is the maximum CoMBody velocity; the sway area is the smallest area that contains the CoMBody trajectory. Except for the sway area, the postural parameters are calculated along the AP and ML directions.

Table 2 Postural parameters estimated from CoMBody for USD, TS, RS tasks
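
The four families of parameters described above can be sketched in a few lines of Python. The sketch rests on several assumptions that are not taken from the paper: the sway range is read as the largest displacement from the first sample, the speed is approximated by finite differences at a nominal 30 Hz, and the "smallest area that contains the trajectory" is interpreted as the area of the convex hull of the (ML, AP) points. All function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]  # (ML, AP) position of CoMBody in metres


def sway_range(x: Sequence[float]) -> float:
    """Largest displacement from the starting position along one direction."""
    return max(abs(v - x[0]) for v in x)


def sway_length(x: Sequence[float]) -> float:
    """Total distance covered along one direction."""
    return sum(abs(b - a) for a, b in zip(x, x[1:]))


def max_speed(x: Sequence[float], fs_hz: float = 30.0) -> float:
    """Maximum finite-difference speed along one direction."""
    return max((abs(b - a) for a, b in zip(x, x[1:])), default=0.0) * fs_hz


def convex_hull_area(points: Iterable[Point]) -> float:
    """Area of the convex hull (monotone chain plus shoelace formula)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half_hull(seq: Iterable[Point]) -> List[Point]:
        hull: List[Point] = []
        for p in seq:
            # Drop points that would make a clockwise or straight turn.
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    hull = lower[:-1] + upper[:-1]
    twice_area = sum(
        hull[i][0] * hull[(i + 1) % len(hull)][1]
        - hull[(i + 1) % len(hull)][0] * hull[i][1]
        for i in range(len(hull))
    )
    return abs(twice_area) / 2.0
```

The first three functions would be applied separately to the AP and ML components, while `convex_hull_area` takes the two components together.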

2.4 Statistical analysis for discriminant parameter selection

In our previous work, the accuracy of the CoM parameters measured by the system was verified by comparison with an optoelectronic system, according to a standard biomechanical setup (Ferraris et al. 2019). CoM trajectories were acquired simultaneously by the RGB-D sensor and the optoelectronic system by applying passive markers on the body (Davis et al. 1991). Then, the average Pearson’s correlation coefficient (ρavg) between CoM trajectories was computed. This analysis showed a good correlation between measurements, both for AP (ρavg: 0.84 ± 0.11, p-value: 3.18 × 10⁻³) and ML (ρavg: 0.90 ± 0.09, p-value: 8.94 × 10⁻³) components: since postural parameters depend on CoM trajectories, we expect the agreement to be also reflected on CoM parameters. In (Ferraris et al. 2021), the agreement between trajectories has been confirmed by a new experimental campaign (ρavg: 0.83 ± 0.12, p-value: 6.42 × 10⁻³), as well as the ability of CoM parameters to discriminate between parkinsonians and healthy controls and between single-task and dual-task phases for the USD task. Good accuracy of the automated assessments requires that the supervised classifiers designed for the tasks use the most discriminative parameters as input for their predictions. This parameter selection approach reduces possible overfitting problems due to the limited sample size of the cohorts involved in the experiment. Therefore, we require that the most discriminant parameters of the three tasks are those showing both a good Spearman’s correlation with the clinical PIGD scores and a good discriminant power in differentiating healthy subjects from PD ones.
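
For readers who want to reproduce the correlation step, Spearman's coefficient can be computed as the Pearson coefficient of the ranks. The following sketch uses only the Python standard library; the function names are hypothetical, tied values receive their average rank, and no p-value is computed.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient of two equally long samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation between a CoM parameter and a score."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting `x` would hold the values of one CoM parameter across subjects and `y` the corresponding PIGD scores.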

2.5 Automatic assessment of postural instability by machine learning

According to our previous experience (Ferraris et al., 2021), Support Vector Machine (SVM) multiclass classifiers (Vapnik 1999) have been considered for the automatic assessment. The experiment described in Sect. 4 was designed to create a training dataset for the classifiers. Two groups, one consisting of PD subjects and one of healthy subjects, undertook a series of evaluation sessions. Every subject was asked to perform the USD, TS, and RS tasks in sequence, while at the same time, her/his performance was evaluated by the system and characterized by CoM parameters. Before every session, PD subjects were assessed according to the PIGD subscale by an expert neurologist. Three SVM classifiers, one for each task, have been trained on the pairs “CoM parameter vector – normalized average PIGD score”. The PIGD scores were normalized and quantized into three levels, whose values (1, 2, and 3) correspond to three classes of stability: poor, medium, and good. The thresholds used to quantize the PIGD scores took into account the class imbalance problem, aiming at an approximately equal sample size distribution among the three classes.
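
One simple way to obtain approximately balanced classes is to place the two thresholds at the tertiles of the observed scores. The sketch below illustrates this idea only; the actual thresholds used in the study are not reported in this section, the function names are hypothetical, and the mapping assumes that a higher PIGD score means worse stability, so that the highest scores map to class 1 (poor).

```python
from typing import List, Sequence, Tuple


def tertile_thresholds(scores: Sequence[float]) -> Tuple[float, float]:
    """Cut points that split the sorted scores into three similar-sized groups."""
    ordered = sorted(scores)
    n = len(ordered)
    if n < 3:
        raise ValueError("at least three scores are needed")
    return ordered[n // 3], ordered[(2 * n) // 3]


def stability_class(score: float, low_cut: float, high_cut: float) -> int:
    """Map a normalized PIGD score to 3 (good), 2 (medium) or 1 (poor)."""
    if score < low_cut:
        return 3  # assumption: low PIGD score = good stability
    if score < high_cut:
        return 2
    return 1


def quantize(scores: Sequence[float]) -> List[int]:
    """Quantize all scores with thresholds derived from the same sample."""
    low_cut, high_cut = tertile_thresholds(scores)
    return [stability_class(s, low_cut, high_cut) for s in scores]
```

With many tied scores the three groups can still end up unbalanced, which is one reason why thresholds may need manual adjustment in practice.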

The accuracy and consistency of each classifier have been evaluated by applying the leave-one-out cross-validation method for multiclass problems (Sokolova et al., 2009).
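
The leave-one-out procedure itself is independent of the classifier. The sketch below shows the loop in plain Python with a nearest-centroid classifier standing in for the SVM, purely so that the example runs without external libraries; the stand-in classifier and all names are assumptions made here and not part of the proposed method.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Predictor = Callable[[Vector], int]


def nearest_centroid_fit(X: List[Vector], y: List[int]) -> Predictor:
    """Stand-in classifier: assign a sample to the closest class centroid."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for d, v in enumerate(xi):
            acc[d] += v
        counts[yi] = counts.get(yi, 0) + 1
    centroids = {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

    def predict(x: Vector) -> int:
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return predict


def leave_one_out_accuracy(
    X: Sequence[Vector],
    y: Sequence[int],
    fit: Callable[[List[Vector], List[int]], Predictor] = nearest_centroid_fit,
) -> float:
    """Train on all samples but one, test on the held-out one, and repeat."""
    X, y = list(X), list(y)
    correct = 0
    for i in range(len(X)):
        predict = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict(X[i]) == y[i]
    return correct / len(X)
```

Any function with the same `fit` signature, for example a wrapper around an SVM implementation, could be passed in place of `nearest_centroid_fit`.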

To further validate the classifier performance and, in general, the methodology, the CoM parameters of the PD subjects evaluated during the DT phase of each performance have been input to the trained classifiers. The expected evaluation results should indicate a worsening of stability. This expectation is in accordance with the clinical results concerning postural stability of PD subjects under DT conditions (Morris et al. 2000).

3 System description

3.1 System hardware and software

The hardware component of the acquisition system is made of an RGB-Depth optical sensor that is connected via a USB port to a processing unit, for example a mini-PC (Intel® NUC i7 series). The processing unit is connected to a monitor or TV screen, via VGA or HDMI connection, to provide the user with information and visual feedback of body movements, and display the Graphical User Interface (GUI) that allows for the natural interaction with the system [Fig. 4]. The RGB-Depth sensor used is Microsoft Kinect v.2, since this system is part of a wider solution already developed that aims to remotely monitor the neuro-motor status of people with PD.

Microsoft Kinect v.2 is a long-range camera that generates synchronized color and depth video streams. Its operative features and time-of-flight technology allow the device to reliably perceive objects at a distance (depth) of up to 8 m. Nevertheless, when people detection and tracking are required, it is safer to limit the maximum distance to 4.5 m to ensure tracking accuracy. Video streams are generated at about 30 frames/second (FPS), with a resolution of 1920 × 1080 pixels for the color stream and 512 × 424 pixels for the depth stream. The availability of depth information allows for three-dimensional scene reconstruction by standard transformation algorithms. The RGB-Depth sensor can be mounted on a tripod or placed on a flat surface to ensure stability and correct orientation during the acquisition, at a height that guarantees the optimal tracking of people at about 3 m from the device. This distance has been defined considering some works concerning the accuracy of the depth sensor. According to (Lachat et al. 2015), the best performance is in the range [1.5 − 4.5 m] from the camera, where depth map errors are in the range [-0.5 mm − 0.5 mm]; the worst performance is for distances less than 0.8 m (the minimum operating distance) or greater than 5 m. These results agree with (Wang et al. 2015), which reports the maximum depth accuracy along the central cone of the RGB-D sensor. Conversely, lateral and vertical scattering of light pulses deteriorates the depth map away from the central cone, thereby reducing overall accuracy. The average accuracy in the central cone is between [2 − 4 mm] for distances up to 3.5 m from the sensor. For distances greater than 4.0 m, the average error becomes greater than 4 mm and gets worse exponentially as the distance from the sensor increases. In addition, a frontal view has been preferred over other viewing angles to ensure the optimal viewpoint for body detection, as indicated in (Gianaria et al., 2019).

Furthermore, to ensure optimal body-tracking, some constraints have been set to avoid external interferences as much as possible. In fact, some elements can influence the RGB-D sensor performance and affect the tracking accuracy of the skeletal model, making the joints noisy and unreliable. These include, for example, the presence on the scene of light sources entering the camera or reflective surfaces that can interfere with the infrared pulses emitted by the device and compromise the correct depth map estimation and, consequently, the 3D reconstruction of the skeletal model. Another source of depth map artifacts is commonly due to clothing: using loose or too dark clothing and reflective objects (such as belts, bracelets, necklaces) can generate discontinuities in the depth map, causing the incorrect positioning of the skeletal model joints. These factors have been considered in defining the system configuration and experimental protocol for the supervised environment. Furthermore, they will become strict constraints in a future home-based and unsupervised experimental protocol, requiring participants to be adequately instructed on these aspects and comply with the established requirements during the acquisitions.

The system software, which runs on the processing unit, has been developed for real-time data acquisition and processing, a crucial requirement to guarantee the interaction with the system. It consists of dedicated C++ and MATLAB scripts that access and analyze information from the RGB-Depth sensor through the SDK, the middle-layer software made available by the device manufacturer.

Fig. 4

Example of setup for home monitoring: the RGB-Depth sensor and the TV screen to display GUIs and to provide visual feedback

3.2 Natural interaction: the human-machine interaction and the graphical user interface

The system has been designed to be easy-to-use and self-manageable as much as possible: these requirements are crucial for technological solutions dedicated to elderly and pathological subjects since they should respond to more complex and challenging needs than those for young and healthy users (Rot et al., 2017). For example, in telemedicine applications, people need to effectively use the technological solutions, especially at home and independently as much as possible, in order to follow-up progress or decline out of health facilities (Klaassen et al. 2016).

This opportunity is closely linked to how well the user can use the technology and how well the technology is suited to the user’s needs and the context of use: in general, the user experience makes it possible to evaluate elements such as usefulness, usability, and acceptability of an application or technology, and the satisfaction of the user while using it, taking into account the user’s skills and health conditions (Bajenaru et al. 2020).

One of the challenges in developing technological solutions for real-world applications is acceptability: choosing interaction models and user interfaces suited to the target users’ needs and features may help avoid distrust in using such applications (Rashidi et al., 2013). To this end, the synergy between the human-machine interaction (HMI) model and the graphical user interface (GUI) plays a relevant role in a technological solution.

The main goal of the HMI is to improve the usability of a technological solution by facilitating how the user interacts with the system; the primary goal of the GUI is to guide the user in using the system with clear and straightforward information, thus avoiding user mistakes and misunderstanding. Several studies proposed age-centered guidelines to design and develop HMI and GUI (Sharma et al. 2016; Vines et al. 2015) since, for example, psychomotor and cognitive skills worsen with age.

Older and pathological subjects often show resistance to traditional HMI modalities: the use of keyboard and mouse, for example, is often problematic due to the deterioration of motor control and coordination. In recent years, new natural forms of interaction that exploit the movements of the body, speech, and gaze have aroused great interest (Dias et al. 2012; Hsiao et al. 2017). Another element that characterizes aging is the impairment of several sensory systems, including visual function, which causes more difficulties in reading and perceiving information from user interfaces: GUI design should consider this element, which is particularly relevant when the interaction occurs away from the monitor, as in our solution. Specific guidelines suggest that GUIs use plain fonts and appropriate sizes to ensure high readability. Moreover, color contrast, such as black characters on a light background, helps improve readability (Boll et al., 2015). Also, due to the lower ability to avoid distractors, it is preferable to limit the information displayed on the GUI and to present the relevant information in the center, as these users often suffer from visual field width deficits (Sharma et al. 2016). Another critical point is the arrangement of the interaction objects and menus within the GUI: a good structure facilitates the users’ interaction. It is equally important to keep this structure consistent throughout the application’s GUI to avoid confusing the user with continuous changes in the graphic layout (Boll et al., 2015; Sharma et al. 2016).

To address these issues, we equipped the developed solution with an HMI model that allows for natural interaction through simple gestures or actions with parts of the body, such as raising arms or moving legs: this choice limits the use of traditional interaction devices such as keyboard and mouse. Therefore, the skeletal model provided by the camera SDK is exploited both for the movement analysis and for the interaction with the system through dedicated GUIs.

The HMI model includes two GUI types, “interactive GUIs” and “execution GUIs”, that try to satisfy the requirements previously mentioned. In addition, regardless of its type, each GUI guides the user through textual messages, video, and audio suggestions that indicate the sequence of steps to complete all the test session phases.

The interactive GUIs are used to make selections or activate management operations. These GUIs are designed using augmented reality (AR) and interactive objects, providing real-time visual feedback during the interaction [Fig. 5]. In the interactive GUIs, a few interactive objects appear when necessary and are arranged so that they can be reached easily without complex movements: to this end, the interactive objects are automatically displayed according to the position of specific skeletal model joints. The interactive GUIs are also used to correct the user position: the system analyzes the skeletal model, evaluates the CoM position, and warns the subject to move right, left, forward, or backward if the current position does not conform to pre-established intervals. Currently, the frontal distance range from the camera is between [2.5 − 3.2 m], while the left-right range is between [-0.8 m − 0.8 m] from the origin of the reference system. Furthermore, the size and font of the interactive objects can be customized to increase visibility, considering that the interaction occurs at about 3 m away from the system monitor.
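
The position check can be illustrated with a short Python sketch that uses the intervals reported above. The axis convention is an assumption made only for this example: `com_z_m` is the frontal distance from the camera and `com_x_m` is the lateral offset, taken as positive on the user's right-hand side. The function name and the hint strings are hypothetical.

```python
from typing import List, Tuple


def position_hints(
    com_x_m: float,
    com_z_m: float,
    lateral_range: Tuple[float, float] = (-0.8, 0.8),
    frontal_range: Tuple[float, float] = (2.5, 3.2),
) -> List[str]:
    """Return the corrections needed to bring the CoM inside both intervals."""
    hints = []
    if com_z_m < frontal_range[0]:
        hints.append("move backward")  # too close to the camera
    elif com_z_m > frontal_range[1]:
        hints.append("move forward")  # too far from the camera
    if com_x_m > lateral_range[1]:
        hints.append("move left")  # assumption: positive x = user's right
    elif com_x_m < lateral_range[0]:
        hints.append("move right")
    return hints
```

An empty list means that the user is correctly positioned and the task can start.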

The execution GUIs are displayed while the user is performing the proposed tasks. In this case, only rough visual feedback of body movements is shown, through the depth map and 2D joints, to prevent the user from self-influencing during the performance [Fig. 6]: the execution GUIs do not require interaction, thus allowing users to focus only on the task execution.

Both types of GUIs maintain a consistent structure. The larger central area, where the attention is most concentrated (Sharma et al. 2016), is where the user is displayed and the interaction takes place. User instructions are shown in the left area: this layout, which users are generally accustomed to, partly follows web applications that commonly show the main menus on the left (Boll et al., 2015). Although the structure of the GUI is simple, it still aims to improve the user experience: many of the design guidelines indicated in (Rot et al., 2017) have been considered, particularly regarding the categories of visualization, navigation, communication, support, and personalization. This approach makes us confident about a subsequent experimental campaign in which we will analyze the user experience and collect feedback on usability and acceptability through dedicated questionnaires.

Fig. 5

Example of the interactive GUI. On the left, the message suggests the action to be performed. In this case, the interaction occurs by moving the arm/hand on the red start button. Clear visual feedback of the movement and of the scene is provided based on the color stream and the involved joint

Fig. 6

Example of the execution GUI. On the left, the message suggests the action to be performed. In this case, a rough visual feedback of the movement and the scene is provided based on depth video stream and some joints of the skeletal model

4 Experimental framework

4.1 Participants

A group of fourteen subjects with PD was involved in the experimental study at the Department of Neurology and Neurorehabilitation of the Istituto Auxologico Italiano (San Giuseppe Hospital). The study refers to data collected between late 2019 and early 2020. Subjects were recruited according to the UK Parkinson’s Disease Society Brain Bank Clinical Diagnostic standards (Hughes et al. 1992), by adopting specific inclusion criteria: no history of neurosurgical procedures or injuries to lower limbs; minimal tremor (severity ≤ 1); no cognitive impairment (Mini–Mental State Examination Score ≥ 27/30).

The participants were also enrolled as belonging to the PIGD motor subtype of PD, according to their PIGD sub-score as the average of some UPDRS items (i.e., arising from chair, gait, posture, and postural stability tasks). The ratio of mean tremor score to mean PIGD score was calculated to determine the PD subtype: ratio scores ≤ 1.0 identified the PIGD subtype (Jankovic et al. 1990). Compared with other non-PIGD PD subtypes, PIGD subjects show a significantly higher probability of falling (Pelicioni et al. 2019). All PD participants were assessed in their “on” status, thus during their best motor performances when the effects of levodopa treatment on motor symptoms are still present.
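
The subtype rule can be written compactly. The following sketch only encodes the criterion stated above (ratio of mean tremor score to mean PIGD score ≤ 1.0); the function name is hypothetical, the specific UPDRS items are left to the caller, and the handling of a zero mean PIGD score, for which the ratio is undefined, is an assumption of this example.

```python
from statistics import mean
from typing import Sequence


def is_pigd_subtype(
    tremor_items: Sequence[float],
    pigd_items: Sequence[float],
    cutoff: float = 1.0,
) -> bool:
    """True when mean tremor score / mean PIGD score is at most `cutoff`."""
    mean_pigd = mean(pigd_items)
    if mean_pigd == 0:
        # Assumption: refuse to classify rather than divide by zero.
        raise ValueError("ratio is undefined when the mean PIGD score is zero")
    return mean(tremor_items) / mean_pigd <= cutoff
```

For instance, a mean tremor score of 0.5 and a mean PIGD score of 1.5 give a ratio of about 0.33, which falls in the PIGD subtype.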

As required by the experimental procedure, the PIGD sub-score of the PD subjects was further assessed by an expert neurologist before each instrumental session. The characteristics of the group of PD subjects are: average Hoehn and Yahr score = 2.4 (range: 1–3); average age = 67.8 years (range: 54–75); average disease duration = 6.7 years (range: 3–9). A group of fourteen volunteers made up the age-matched control group (CG): the inclusion criteria were the absence of any neurological, motor, or cognitive disorders and no episode of previous falls.

4.2 Experimental procedure and data acquisition

A technician instructed the participants on using the system in a laboratory setting; then, each subject interacted autonomously with the system through the HMI under the technician’s supervision, in view of the future use of the system in the home setting. All PD and CG participants performed the tasks proposed by the system through the interactive and execution GUIs as defined in the study protocol and under the same operative conditions. Informed consent was obtained according to the Declaration of Helsinki (2008) before participating in the study.

5 Experimental results

5.1 Correlation between CoM parameters and PIGD scores

Before performing the statistical analysis, the trials were analyzed to verify the stability of the skeletal model joints involved in estimating CoM. In particular, the mean and standard deviation of the joint positions were used to identify any body-tracking problems. None of the analyzed trials revealed critical tracking issues or anomalous jittering of joints, so all of them were considered for the statistical analysis.
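A check of this kind could look like the following minimal sketch. The joint names, the sample coordinates, and the 0.05 m threshold are assumptions made for illustration; the text does not report the criterion actually applied.

```python
from statistics import mean, pstdev


def joint_stats(positions):
    """Per-axis mean and standard deviation of one joint.

    positions: list of (x, y, z) tuples in metres, one per frame.
    """
    axes = list(zip(*positions))
    return [mean(a) for a in axes], [pstdev(a) for a in axes]


def flag_unstable_joints(trial, max_std_m=0.05):
    """Names of joints whose spread on any axis exceeds max_std_m.

    trial: dict mapping joint name -> list of (x, y, z) tuples.
    max_std_m is a hypothetical threshold, not a value from the paper.
    """
    flagged = []
    for name, positions in trial.items():
        _, stds = joint_stats(positions)
        if max(stds) > max_std_m:
            flagged.append(name)
    return flagged


# Hypothetical trial: a steady joint and a jittering one.
trial = {
    "spine_base": [(0.00, 0.90, 2.00), (0.01, 0.90, 2.00), (0.00, 0.91, 2.01)],
    "ankle_left": [(0.10, 0.05, 2.00), (0.30, 0.05, 2.00), (0.10, 0.05, 2.00)],
}
print(flag_unstable_joints(trial))
```

In a quiet-standing task the joints are expected to stay close to their mean position, so a large standard deviation is a plausible indicator of tracking jitter rather than real movement.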

The statistical analysis results indicate that CoM parameters are able to discriminate PD from CG subjects, as confirmed by the Mann-Whitney U Test for the USD, TS and RS tasks (Tables 3, 4 and 5, respectively). Furthermore, Spearman’s correlation values indicate that many CoM parameters are significant and, in general, show a moderate to strong correlation with the PIGD scores (|ρ| > 0.4, p-value < 0.05).
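For readers unfamiliar with the test, a minimal sketch of a two-sided Mann-Whitney U test follows. It uses the normal approximation with tie correction and no continuity correction, which is an assumption and only approximate for small groups; the sample values are hypothetical, and a statistics package would normally be preferred in practice.

```python
import math
from collections import Counter


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1))))
    if sigma == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical sway values for one CoM parameter (illustration only).
pd_values = [12.1, 13.4, 11.8, 14.0, 12.9]
cg_values = [8.2, 9.1, 7.9, 8.8, 9.5]
u_stat, p_value = mann_whitney_u(pd_values, cg_values)
print(f"U = {u_stat}, p = {p_value:.4f}")
```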

A threshold of |ρ| > 0.5 on Spearman’s correlation values was set to select, from the set of CoM parameters initially considered (Table 2), those which best correlate with PIGD scores. Then, only the subsets of parameters shown in Tables 3, 4 and 5 were considered for the training phase of the respective task classifiers.
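The selection step can be sketched as follows, computing Spearman's ρ as the Pearson correlation of average ranks and keeping the parameters whose absolute value exceeds the threshold. The parameter names and values are hypothetical, and the significance check (p-value) mentioned in the text is omitted from this sketch.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)


def select_parameters(params, scores, threshold=0.5):
    """Keep parameters with |rho| above the threshold (0.5 in the text)."""
    selected = {}
    for name, values in params.items():
        rho = spearman_rho(values, scores)
        if abs(rho) > threshold:
            selected[name] = rho
    return selected


# Hypothetical clinical scores and CoM parameters (illustration only).
pigd_scores = [1, 2, 2, 3, 4, 5]
params = {
    "sway_path": [10, 12, 13, 15, 18, 21],
    "unrelated": [5, 1, 6, 2, 4, 3],
}
selected = select_parameters(params, pigd_scores)
print(sorted(selected))
```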

As can be seen, all the selected parameters are able to discriminate PD from CG subjects, albeit with different discriminatory power for the different tasks. In particular, the Area parameter is the least discriminant in all the tasks, although with acceptable values in the differences between PD and CG mean values and in the corresponding p-values. Concerning the correlation (Spearman’s coefficient ρ) between PIGD scores and CoM parameters, this is generally lower for the USD task than for TS and RS tasks: this may be due to the less challenging motor aspects of the USD task, and the consequent reduced impact on the CoM sway.

For the TS task, the correlation coefficients of the ML parameters are generally higher than the others: this can be explained by the higher instability in the medio-lateral plane due to the tandem position of the feet. The high values for the AP coefficients in the RS task can be explained with similar arguments: the forward outstretching of the arms creates higher instability in the antero-posterior plane.

Table 3 Discriminant power and correlation of parameters for USD task (postural parameters evaluated for ST phase)
Table 4 Discriminant power and correlation of parameters for TS task (postural parameters evaluated for ST phase)
Table 5 Discriminant power and correlation of parameters for RS task (postural parameters evaluated for ST phase)

5.2 Accuracy of the automated assessment

Classifier performance is evaluated by applying the leave-one-out cross-validation method, and it is expressed in terms of classification accuracy. In Table 6, two types of classification accuracy are presented, where the subjects are classified according to their CoM parameters during the ST phase. First, the results for the binary classification problem are presented, where the subjects are classified as belonging to PD and CG groups; second, the results for the multiclass classification problem are presented, where the PD subjects are classified into the three classes of increasing severity of postural instability. In this second case, the per-class accuracy is used, where the classification accuracies are averaged over the classes (Sokolova et al., 2009).
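The evaluation scheme can be sketched as below. A nearest-centroid classifier is used only as a dependency-free stand-in for the SVM classifiers of the paper, and the averaged per-class accuracy follows one common reading of the measure (the mean over classes of (TP + TN) / N); both choices, the function names, and the feature values are assumptions.

```python
def centroid_fit(X, y):
    """Stand-in classifier (NOT the paper's SVM): one centroid per class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, v in enumerate(features):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def centroid_predict(model, features):
    """Label of the closest centroid (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(model, key=lambda lab: dist2(model[lab]))


def leave_one_out_predictions(X, y, fit, predict):
    """Train on all samples but one, predict the held-out one, repeat."""
    preds = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(model, X[i]))
    return preds


def overall_accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def average_per_class_accuracy(y_true, y_pred):
    """Mean over classes of the one-vs-rest accuracy (TP + TN) / N."""
    classes = sorted(set(y_true))
    n = len(y_true)
    per_class = [
        sum((t == c) == (p == c) for t, p in zip(y_true, y_pred)) / n
        for c in classes
    ]
    return sum(per_class) / len(classes)


# Hypothetical two-feature samples for three severity classes.
X = [[1.0, 1.1], [1.2, 0.9], [0.9, 1.0],
     [3.0, 3.1], [3.2, 2.9], [2.9, 3.0],
     [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
preds = leave_one_out_predictions(X, y, centroid_fit, centroid_predict)
print(overall_accuracy(y, preds), average_per_class_accuracy(y, preds))
```

Leave-one-out is a natural choice for small cohorts such as the one described here, because each model is trained on all the remaining subjects.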

The accuracy of the binary classifiers indicates that the TS task has the best classification accuracy, while the USD task has the worst. The same happens for multiclass classifiers: this could be explained by observing that the TS task is the most challenging for postural stability, both for PD and CG subjects, so the CoM sway tends to be more evident for both types of subjects. Arguments similar to those used above to explain the correlation results also apply in this case. Nevertheless, the accuracy obtained for the binary classification is very high for all three tasks. As expected, the accuracies of the multiclass classifiers are lower than the corresponding accuracies of binary classifiers: classification is generally more difficult when the number of classes increases and the training data remains the same.

Table 6 Accuracy of SVM classifiers for the three tasks (USD, RS and TS) using parameters of ST phase

The results on the classification accuracy suggest that the system can be successfully used for the automatic assessment of postural stability of PD subjects at home.

Fig. 7

Radar graphs of the differences between the average values of CoM parameters in ST and DT phases, for the three tasks (USD, TS and RS) and for CG and PD groups

The greater postural instability in the DT phase is highlighted in Fig. 7 by the differences in the average values of the CoM parameters estimated in ST and DT phases. The radar graphs show the differences for the three tasks (USD, RS, and TS) and for CG and PD groups: difference values, which refer to different physical quantities, are represented in the [0–1] range to be easily comparable. The DT condition clearly increases the average values of the postural parameters (all the differences are positive), indicating a worsening of stability compared with the ST condition. Although this happens for both PD and CG groups, it is more evident for the PD group, as indicated by the PD radar graphs, which are wider than the CG ones for all the tasks. These results demonstrate that CoM parameters successfully detect the expected worsening of the stability occurring in PD subjects under DT conditions. Furthermore, they indicate that CoM parameters are more sensitive in discriminating the worsening in postural stability of PD subjects than CG ones.
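One possible way to bring such differences into a common [0–1] range is sketched below. The text does not specify the normalization actually used, so scaling each parameter by its largest difference across groups is an assumption, as are the parameter names and values.

```python
def st_dt_differences(st_means, dt_means):
    """DT minus ST average value for each parameter."""
    return {name: dt_means[name] - st_means[name] for name in st_means}


def scale_to_unit_range(diffs_by_group):
    """Divide each parameter's differences by its largest magnitude.

    diffs_by_group: dict group -> dict parameter -> difference.
    With all differences positive (as reported in the text), the
    scaled values fall in [0, 1] and are comparable across parameters.
    """
    params = list(next(iter(diffs_by_group.values())))
    scaled = {group: {} for group in diffs_by_group}
    for p in params:
        peak = max(abs(d[p]) for d in diffs_by_group.values())
        for group, d in diffs_by_group.items():
            scaled[group][p] = d[p] / peak if peak else 0.0
    return scaled


# Hypothetical average values (illustration only, not study data).
pd_st = {"path_mm": 120.0, "area_mm2": 40.0}
pd_dt = {"path_mm": 180.0, "area_mm2": 70.0}
cg_st = {"path_mm": 80.0, "area_mm2": 25.0}
cg_dt = {"path_mm": 95.0, "area_mm2": 31.0}

diffs = {
    "PD": st_dt_differences(pd_st, pd_dt),
    "CG": st_dt_differences(cg_st, cg_dt),
}
scaled = scale_to_unit_range(diffs)
print(scaled)
```

With this scaling, a wider radar polygon for one group directly reflects a larger ST-to-DT change relative to the other group, whatever the physical unit of each parameter.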

The performance of the classifiers in assessing the PD subjects during the DT phase has also been evaluated. The worsening in the stability by switching from ST to DT condition has been successfully detected by the USD, RS, and TS classifiers as a shift to a worse stability class in the majority of the evaluated subjects (86%), with a minority (14%) remaining in the same stability class.

6 Conclusion

The objective and daily assessment of postural stability is highly desirable in PD because it could be a good index of the risk of falling and the consequent injuries. In this paper, a vision system for the automatic assessment of postural stability in a home environment is presented. The system is based on a low-cost RGB-Depth device, whose tracking capabilities have been exploited both to characterize the movements of PD subjects during balance tasks for the assessment of postural stability, and to build a gestural HMI suitable for people with motor impairment. The natural interaction with the system is based on gestures and actions to be performed with simple body movements thanks to customizable GUIs that guide the users during the overall test session: this makes the solution easy to use, intuitive, and suitable for home use without requiring particular technical skills. Specific constraints have been considered in the definition of the system setup and experimental protocol in order to avoid external interferences due to light sources, reflective surfaces, clothing, and worn reflective objects. These elements could cause discontinuities and artifacts in the depth map and, consequently, an erroneous positioning of skeleton model joints and estimation of CoM location that could affect postural stability analysis.

Postural stability is assessed by means of kinematic parameters that characterize the CoM sways measured by the system during the execution of specifically designed balance tasks. In particular, the three proposed tasks (named USD, RS, and TS) have been derived directly from the Berg balance scale, a clinical scale commonly used for postural stability assessment. The HMI of the system combined with the task design makes them suitable for self-management by people with motor impairment, also ensuring the safety of the user when performing the tasks in a typical home environment.

The evaluation of postural stability by these multiple balance tasks aims to stimulate the subject with different types of postural stress, providing a more extensive assessment of anomalies and balance dysfunctions. Furthermore, considering that the dual-task condition is typical in daily activities, a concurrent cognitive task has been introduced to force the onset of instability and to evaluate the responsiveness of the system to the rapid deterioration of postural stability. Postural parameters estimated from the CoM sway of the body have been shown to be strongly correlated with postural instability. A machine learning approach, based on SVM supervised classifiers, allows the automated scoring of the user performance through the postural parameters evaluated by the system during the execution of each task. The classifiers have been trained in an experimental campaign, where two groups of PD and CG subjects have been assessed simultaneously by the system and an expert neurologist. The results show the discriminant power of the selected parameters, which distinguish the PD group from the CG one. Furthermore, the selected parameters show good correlation with the assigned standard clinical scores. The good accuracy of the classifiers in assigning a score to the performance of PD subjects on three increasing severity classes is promising for the in-field use of the system. Nevertheless, caution is needed because of the limited sample size of the PD group.

Particularly interesting is the analysis of the DT condition: postural instability increases during the execution of a concurrent task, both for PD and CG groups, but the difference between the two groups becomes wider compared to the single-task condition.

These results demonstrate that CoM parameters successfully detect the expected worsening of the stability that occurs in PD subjects in the DT condition. CoM parameters are also more sensitive in discriminating the worsening of the postural stability in PD subjects than in CG subjects. The sensitivity of the parameters to the DT condition will be further investigated in future work. Certainly, further investigations are necessary to confirm these preliminary results: for example, the PD sample size should be increased; the three individual assessments of the tasks should be integrated into a single postural stability index; the analysis of PD subjects in DT condition should be further explored through different types of concurrent cognitive or secondary motor tasks. Nevertheless, the results are encouraging, particularly in the perspective of the home monitoring of postural stability, which impacts the quality of life and the safety of people with PD. In addition, the results suggest that supervised classifiers can be used for the automatic assessment of the subject’s performance, for predicting a potential risk of falls, and for the automatic detection of balance alterations.

As a final note, although Microsoft Kinect® v1 and v2 have been declared discontinued, other commercial alternatives are now available: Microsoft Kinect Azure®, Orbbec Astra®, and the Intel RealSense® D400 series. Combined with new body-tracking algorithms (e.g., Nuitrack® software, Openpose®), these devices should make it possible to achieve results similar to those presented here (Cao et al. 2021). In particular, the Microsoft Kinect Azure®, which replaced Kinect v2, seems to ensure high performance and accuracy, as shown by some recent studies on the onboard sensors and the new body-tracking algorithm (Tölgyessy et al. 2021a,b).