DISPEL: A Python Framework For Developing Measures From Digital Health Technologies
Abstract—Goal: This paper introduces DISPEL, a Python framework to facilitate development of sensor-derived measures (SDMs) from data collected with digital health technologies in the context of therapeutic development for neurodegenerative diseases. Methods: Modularity, integrability and flexibility were achieved by adopting an object-oriented architecture for data modelling and SDM extraction, which also allowed standardizing SDM generation, naming, storage, and documentation. Additionally, a functionality was designed to implement systematic flagging of missing data and unexpected user behaviors, both frequent in unsupervised monitoring. Results: DISPEL is available under the MIT license. It already supports formats from different data providers and allows traceable end-to-end processing from raw data collected with wearables and smartphones to structured SDM datasets. Novel and literature-based signal processing approaches currently allow extraction of SDMs from 16 structured tests (including six questionnaires), assessing overall disability and quality of life, and measuring performance outcomes of cognition, manual dexterity, and mobility. Conclusion: DISPEL supports SDM development for clinical trials by providing a production-grade Python framework and a large set of already implemented SDMs. While the framework has already been refined based on clinical trial data, users are advised to validate the provided algorithms ad hoc in their specific context of use.
Index Terms—Signal processing, digital biomarkers, digital health technology, python, gait, balance, cognition, drawing, smartphone, wearable sensing, inertial sensor.
Impact Statement—DISPEL offers a framework to extract sensor-derived measures from digital health technologies with traceability from source to outcome. Its open-source availability aims at supporting multi-disciplinary collaborations to accelerate and scale development and adoption of digital biomarkers/endpoints.
Manuscript received 11 December 2023; revised 23 February 2024 and 6 May 2024; accepted 6 May 2024. Date of publication 17 May 2024; date of current version 20 June 2024. The work of S. Del Din was supported by the Innovative Medicines Initiative 2 Joint Undertaking (JU) through Mobilise-D Project under Grant 820820, in part by the National Institute for Health Research (NIHR) Newcastle Biomedical Research Centre (BRC) based at The Newcastle upon Tyne Hospital NHS Foundation Trust, Newcastle University and the Cumbria, Northumberland and Tyne and Wear (CNTW) NHS Foundation Trust and in part by the NIHR/Wellcome Trust Clinical Research Facility (CRF) infrastructure at Newcastle upon Tyne Hospitals NHS Foundation Trust. The work of S. Del Din and C. Hinchliffe was supported by the Innovative Medicines Initiative 2 Joint Undertaking (IMI2 JU) through Project IDEA-FAST under Grant 853981. This work was supported by Biogen Digital Health. The JU was supported by the European Union’s Horizon 2020 Research and Innovation Program and the European Federation of Pharmaceutical Industries and Associations (EFPIA). The review of this letter was arranged by Editor Martina Mancini. (Corresponding author: C. Mazzà.)
A. Scotland, G. Cosne, A. Juraver, A. Karatsidis, J. Penalver-Andres, E. Bartholomé, C. M. Kanzler, C. Mazzà, D. Roggen, and S. Belachew are with Biogen, Cambridge, MA 02142 USA (e-mail: [email protected]; gautier.cosne@biogen.com; [email protected]; [email protected]; [email protected]; emmanuel.bartholome@biogen.com; [email protected]; [email protected]; [email protected]; [email protected]).
C. Hinchliffe and S. Del Din are with the Translational and Clinical Research Institute, The Catalyst, NE4 5TG Newcastle upon Tyne, U.K. (e-mail: [email protected]; silvia.del-din@newcastle.ac.uk).
Digital Object Identifier 10.1109/OJEMB.2024.3402531
I. INTRODUCTION
REDUCING duration, size and costs of clinical trials is a compelling need in the development of therapeutic solutions for neurodegenerative diseases. Digital health technologies (DHTs) and sensor-derived measures (SDMs) of motor and cognitive function are promising solutions to this problem. Their development and validation lifecycle, however, is complex and it might take years before adequate technology readiness levels are reached through cross-sectional and longitudinal randomized controlled trials [1], [2], [3]. Tools to shorten this cycle are hence highly needed, including suitable computational frameworks for SDM extraction. Available open-source libraries for the development of SDMs, in fact, are often limited to very specific applications or device sensor locations [4], [5], [6]. Even recent efforts towards a code library for integration of diverse biomarkers [7] or for effective standardization [8], [9] do not seem to have yet achieved the needed level of traceability and generalizability.
This paper presents a Python framework, named DISPEL, designed to facilitate development and validation of SDMs within the context of therapeutic development applied to neurodegenerative diseases. Specifically, to facilitate the fulfilment of evolving, and potentially uncharted, regulatory requirements for DHTs and for digital biomarker and endpoint development and acceptance [3], the following objectives were identified: a) ingest data collected with different DHTs and experimental protocols; b) offer a traceable process for the extraction of SDMs from raw data; c) manage the complexity associated with the data that are typically recorded in clinical trials and real-world scenarios; d) flag unexpected user behaviors potentially affecting the SDM values; e) enable community contributions
© 2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Fig. 1. Core functionalities of DISPEL (white blocks) and the modules implementing them (orange text). The orange rectangular boxes represent the different scenarios in which a user could interact with DISPEL and the type of data and modules they would need to implement and provide as an input in each case. The grey box in the middle of the chart details the content of the data model. This consists of a reading (including data and metadata) structured in a series of levels (Level_i) that may contain different sensor data (RawDataSet), custom-defined windows of analysis (LevelEpoch) and calculated measures (MeasureSet). The model also includes the flags generated if technical issues or user-related deviations in the test executions are detected. Excerpts of a data trace graph and of the SDM tables obtained as output are also illustrated, together with an example of automatically standardized naming of measures and flags.
in the form of data formats and signal processing modules for extracting new SDMs.
II. MATERIALS AND METHODS
DISPEL is made of building blocks developed according to good software development practices such as test-driven development and peer review. These blocks served as a foundation for development and made it possible to turn prototype algorithms into production-grade software. Robust testing was achieved by regression testing of SDMs, peer review and, where feasible, enhancing unit tests with synthetic data or controlled experiments with expected outcomes.
DISPEL was designed to support custom and third-party data formats. Ad-hoc input/output (io) modules and guidance for io implementations facilitate reading from different sources. This is expected to encourage integrating new data types and/or code modules, irrespective of device type and location.
Modularity, integrability and flexibility were adopted to enable processing of data from structured performance outcome assessments/tests (e.g., six-minute walk, finger drawing, etc.). These tests can entail different experimental protocols, typically composed of sub-tasks, which in DISPEL are structured in so-called levels. For example, if, in a smartphone-based drawing task, the subject is asked to draw four different shapes twice with both hands, each attempt is a level and the shape type and hand side are modalities. This data modelling approach allows for high flexibility and simplifies coding when processing multiple raw data sets and extracting multiple SDMs at a time. Moreover, it enables extracting SDMs only for specific tasks and sub-tasks and processing only specific levels, optimizing computational resources and management of incomplete datasets.
Transparency was ensured by adopting an object-oriented design, where each DataSet and MeasureSet object contains attributes indicating source and definition (e.g., units), while each processing step class is linked to its input and output data. The object-oriented design also enabled standardization in SDM computation, naming and storage. Automatically generated SDM names directly link to the task, level, and modality they refer to, while automatically generated documentation summarizes specific computation settings. Information needed to build data trace graphs is generated for all transformation and extraction steps.
Measure extraction functionalities are organized in modules per test and measures are wrapped in a class (measure collection) to simplify handling multiple data collections and aggregations (e.g., multiple measures from the same test, average over different levels, etc.). Wide usability of the library was pursued by enabling measure collections to be exported into a Python dictionary, a JSON file, or a CSV file. Quality monitoring is facilitated by flags, which can be linked during the processing steps to specific measures, levels or readings. These flags are automatically stored in the data model and allow tracking of technical (e.g., unstable sampling frequency) or user-related factors (e.g., wrong handling of a smartphone).
496 IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY, VOL. 5, 2024
TABLE I
TESTS, MODALITIES, USER-RELATED FLAGS AND MEASURES
III. RESULTS
DISPEL is freely accessible at https://github.com/newcastleuniversity/DISPEL/. Fig. 1 summarizes the implementation of the various modules and the library outputs, together with use cases representing encouraged contributions from the DHT community.
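As an illustration of the data model described in Section II, the following minimal sketch shows how a reading can be organized into levels with modalities, how a measure name can be derived from task, modalities and a short measure name, and how flags can travel with the exported records. It is a self-contained toy model and not DISPEL's actual API: the class names (Reading, Level, Measure, Flag), the naming pattern, the flag identifier and all values are hypothetical and chosen only for this example.

```python
import csv
import io
import json
from dataclasses import dataclass, field


@dataclass
class Flag:
    """Hypothetical quality flag attached to a measure or a level."""
    flag_id: str
    reason: str
    kind: str  # "technical" or "behavioral"


@dataclass
class Measure:
    name: str
    value: float
    unit: str
    flags: list = field(default_factory=list)


@dataclass
class Level:
    """One sub-task attempt, described by its modalities."""
    level_id: str
    modalities: dict
    measures: list = field(default_factory=list)
    flags: list = field(default_factory=list)


@dataclass
class Reading:
    task: str
    levels: list = field(default_factory=list)

    def add_measure(self, level, short_name, value, unit):
        # Assumed naming pattern: task, then modalities, then measure name.
        parts = [self.task, *level.modalities.values(), short_name]
        measure = Measure("-".join(parts), value, unit)
        level.measures.append(measure)
        return measure

    def to_records(self):
        # One flat record per measure; level flags are inherited by its measures.
        return [
            {
                "measure": m.name,
                "value": m.value,
                "unit": m.unit,
                "level": lv.level_id,
                "flags": ";".join(f.flag_id for f in (*lv.flags, *m.flags)),
            }
            for lv in self.levels
            for m in lv.measures
        ]


def records_to_csv(records):
    """Serialize a non-empty list of records to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Usage: one level of a drawing task (square shape, right hand).
reading = Reading(task="draw")
level = Level("square-right-1", {"shape": "square", "hand": "right"})
reading.levels.append(level)
reading.add_measure(level, "duration", 4.2, "s")  # made-up value
level.flags.append(
    Flag("unstable_sampling", "sampling frequency drifted", "technical")
)

records = reading.to_records()
print(json.dumps(records, indent=2))  # JSON export
print(records_to_csv(records))        # CSV export
```

Keeping the flags in the same records as the measures means that a downstream analysis can filter or stratify on data quality without going back to the raw data, which is the purpose the flagging mechanism serves in the text above.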
DISPEL ingests data from multiple providers. Originally designed to process data from three mobile data collection applications (Ad Scientiam, Biogen Digital Health Konectom, and SensorLog - .json), it currently also reads data from a wearable sensor provider (APDM Wearable Technologies - .h5) and in the Mobilise-D format [10].
Implemented algorithms, partly taken from the literature and partly newly developed, extract SDMs from 16 structured wearable- or smartphone-based tests (including 6 questionnaires), assessing overall disability and quality of life, and measuring performance outcomes in the domains of cognition, manual dexterity and mobility (see Table I, which also exemplifies currently flagged user behaviors).
IV. DISCUSSION
This paper aimed to provide the DHT community with a tool to accelerate the development of SDMs for clinical trials. To this end, DISPEL offers a production-grade Python framework, which already extracts a large set of SDMs from sixteen assessments/tests.
Compared to other available tools, one of DISPEL’s strengths is its generalizability to different providers and DHTs. This is also accompanied by a unique way of simultaneously warranting high transparency, modularity and flexibility. DISPEL is also novel in the way it leverages the advantages of an object-oriented codebase to deal with the complexity of data from real-life unsupervised settings. In fact, its modular and customizable pipeline of processing steps allows for programming simultaneous SDM computations and handling of problematic steps (e.g., data missingness or signal corruption). Furthermore, it includes an innovative solution for flagging technical and user-related factors affecting SDM values. All these attributes of DISPEL are essential when operating in the highly regulated intended use of drug development clinical trials. DISPEL is also expected to facilitate reproducibility and transferability of results across the community.
DISPEL includes both published and unpublished measure extraction algorithms. While these have been extensively tested and optimized for end-use based on data from clinical trials in neurodegenerative disorders [11], [12], [13], their analytical and clinical validation [1], [2], [3] should be the focus of further dedicated efforts.
V. CONCLUSION
DISPEL offers a Python framework to generate SDMs from various DHT data providers and is readily extendable to incorporate additional input and processing modules. We hope to see DISPEL becoming a widely adopted and community-driven framework, to accelerate digital biomarker/endpoint development and to support collaborative efforts between industry, academia, and regulators to drive DHT literacy and adoption.
AUTHOR CONTRIBUTIONS STATEMENT
AS conceived the original design of DISPEL and coordinated its development. AS, GC, AJ, AK, and JPA implemented the library and the documentation of the code. CH and SDD supported the integration of additional I/O modules and contributed to the library documentation. CMK, CM, DR, EB, and SB supervised the scientific developments related to the definition of sensor-derived measures of mobility, dexterity, and cognition. CM produced the initial draft of the manuscript. GC and JPA designed the figures and tables. All co-authors critically reviewed the initial draft and contributed to the development of the final manuscript.
CONFLICT OF INTEREST STATEMENT
AS, GC, AJ, AK, JPA, EB, CMK, CM, DR, and SB were employees of Biogen at the time of developing this paper and might hold shares of the company. SDD reports consultancy activity with Hoffmann-La Roche Ltd. outside of this study.
ACKNOWLEDGMENT
The contributions of Ali Neishabouri, Oussama Tchita, Clément Dulong, Loïc Carment, Angéline Plaud, and Kevin Bouaou to the original design and development of the library are gratefully acknowledged. Biogen authors played a role in both the design and implementation of the library and in the preparation of this manuscript.
REFERENCES
[1] J. C. Goldsack et al., “Verification, analytical validation, and clinical validation (V3): The foundation of determining fit-for-purpose for biometric monitoring technologies (BioMeTs),” npj Digit. Med., vol. 3, no. 1, 2020, Art. no. 55.
[2] L. Rochester et al., “A roadmap to inform development, validation and approval of digital mobility outcomes: The mobilise-D approach,” Digit. Biomarkers, vol. 4, no. 1, pp. 13–27, 2020.
[3] Center for Drug Evaluation and Research, “Digital health technologies for remote data acquisition in clinical investigations,” U.S. Food and Drug Administration, 2023. Accessed: Feb. 7, 2024. [Online]. Available: https://www.fda.gov/regulatory-information/search-fda-guidance-documents/digital-health-technologies-remote-data-acquisition-clinical-investigations
[4] B. Bent, M. Henriquez, and J. P. Dunn, “Cgmquantify: Python and R software packages for comprehensive analysis of interstitial glucose and glycemic variability from continuous glucose monitor data,” IEEE Open J. Eng. Med. Biol., vol. 2, pp. 263–266, 2021.
[5] M. Czech and S. Patel, “GaitPy: An open-source python package for gait analysis using an accelerometer on the lower back,” J. Open Source Softw., vol. 4, no. 11, 2019, Art. no. 1778.
[6] D. Makowski et al., “NeuroKit2: A python toolbox for neurophysiological signal processing,” Behav. Res. Methods, vol. 53, no. 8, pp. 1689–1696, 2021.
[7] B. Bent et al., “The digital biomarker discovery pipeline: An open-source software platform for the development of digital biomarkers using mHealth and wearables data,” J. Clin. Transl. Sci., vol. 5, no. 1, 2021, Art. no. e19.
[8] A. Küderle et al., “tpcp: Tiny pipelines for complex problems - A set of framework independent helpers for algorithms development and evaluation,” J. Open Source Softw., vol. 8, no. 2, 2023, Art. no. 4953.
[9] L. Adamowicz et al., “SciKit digital health: Python package for streamlined wearable inertial sensor data processing,” JMIR mHealth uHealth, vol. 10, no. 4, 2022, Art. no. e36762.
[10] L. Palmerini et al., “Mobility recorded by wearable devices and gold standards: The Mobilise-D procedure for data standardization,” Sci. Data, vol. 10, no. 12, 2023, Art. no. 38.
[11] “Validation of DigiCog and konectom tools to support digitalized clinical assessment in multiple sclerosis (DigiToms) ‘ClinicalTrials.gov,’ clinicaltrials.gov,” 2020. Accessed: Feb. 12, 2024. [Online]. Available: https://clinicaltrials.gov/study/NCT04756700
[12] “A study to assess the clinical validity of konectom™ in adults living with neuromuscular disorders (DigiNOA) ‘ClinicalTrials.gov,’ clinicaltrials.gov,” 2022. Accessed: Feb. 12, 2024. [Online]. Available: https://clinicaltrials.gov/study/NCT05109637
[13] “Wearable assessments in the clinic and home in PD (WATCH-PD) ‘ClinicalTrials.gov,’ clinicaltrials.gov,” 2019. Accessed: Feb. 12, 2024. [Online]. Available: https://clinicaltrials.gov/study/NCT03681015