Showing 1–43 of 43 results for author: Dobson, R

  1. arXiv:2408.17181  [pdf, other]

    cs.CL

    Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study

    Authors: Shubham Agarwal, Thomas Searle, Mart Ratas, Anthony Shek, James Teo, Richard Dobson

    Abstract: Electronic Health Records are large repositories of valuable clinical data, with a significant portion stored in unstructured text format. This textual data includes clinical events (e.g., disorders, symptoms, findings, medications and procedures) in context that if extracted accurately at scale can unlock valuable downstream applications such as disease prediction. Using an existing Named Entity…

    Submitted 30 August, 2024; originally announced August 2024.

  2. arXiv:2407.08442  [pdf, other]

    cs.LG cs.AI

    How Deep is your Guess? A Fresh Perspective on Deep Learning for Medical Time-Series Imputation

    Authors: Linglong Qian, Tao Wang, Jun Wang, Hugh Logan Ellis, Robin Mitra, Richard Dobson, Zina Ibrahim

    Abstract: We introduce a novel classification framework for time-series imputation using deep learning, with a particular focus on clinical data. By identifying conceptual gaps in the literature and existing reviews, we devise a taxonomy grounded on the inductive bias of neural imputation frameworks, resulting in a classification of existing deep imputation strategies based on their suitability for specific…

    Submitted 11 July, 2024; originally announced July 2024.

  3. arXiv:2406.07497  [pdf]

    cs.SD eess.AS

    A pilot protocol and cohort for the investigation of non-pathological variability in speech

    Authors: Nicholas Cummins, Lauren L. White, Zahia Rahman, Catriona Lucas, Tian Pan, Ewan Carr, Faith Matcham, Johnny Downs, Richard J. Dobson, Judith Dineley

    Abstract: Background: Speech-based biomarkers have potential as a means for regular, objective assessment of symptom severity, remotely and in-clinic in combination with advanced analytical models. However, the complex nature of speech and the often subtle changes associated with health mean that findings are highly dependent on methodological and cohort choices. These are often not reported adequately in st…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 29 pages. Pre peer review

    ACM Class: J.3

  4. arXiv:2405.17508  [pdf, other]

    cs.LG stat.ML

    Unveiling the Secrets: How Masking Strategies Shape Time Series Imputation

    Authors: Linglong Qian, Zina Ibrahim, Wenjie Du, Yiyuan Yang, Richard JB Dobson

    Abstract: In this study, we explore the impact of different masking strategies on time series imputation models. We evaluate the effects of pre-masking versus in-mini-batch masking, normalization timing, and the choice between augmenting and overlaying artificial missingness. Using three diverse datasets, we benchmark eleven imputation models with different missing rates. Our results demonstrate that maskin…

    Submitted 26 May, 2024; originally announced May 2024.

  5. arXiv:2404.11212  [pdf]

    stat.AP

    Deciphering seasonal depression variations and interplays between weather changes, physical activity, and depression severity in real-world settings: Learnings from RADAR-MDD longitudinal mobile health study

    Authors: Yuezhou Zhang, Amos A. Folarin, Yatharth Ranjan, Nicholas Cummins, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Shaoxiong Sun, Srinivasan Vairavan, Faith Matcham, Carolin Oetzmann, Sara Siddi, Femke Lamers, Sara Simblett, Til Wykes, David C. Mohr, Josep Maria Haro, Brenda W. J. H. Penninx, Vaibhav A. Narayan, Matthew Hotopf, Richard J. B. Dobson, Abhishek Pratap, RADAR-CNS consortium

    Abstract: Prior research has shown that changes in seasons and weather can have a significant impact on depression severity. However, findings are inconsistent across populations, and the interplay between weather, behavior, and depression has not been fully quantified. This study analyzed real-world data from 428 participants (a subset; 68.7% of the cohort) in the RADAR-MDD longitudinal mobile health study…

    Submitted 17 April, 2024; originally announced April 2024.

  6. arXiv:2401.02258  [pdf, other]

    cs.LG cs.AI

    Uncertainty-Aware Deep Attention Recurrent Neural Network for Heterogeneous Time Series Imputation

    Authors: Linglong Qian, Zina Ibrahim, Richard Dobson

    Abstract: Missingness is ubiquitous in multivariate time series and poses an obstacle to reliable downstream analysis. Although recurrent network imputation achieved the SOTA, existing models do not scale to deep architectures that can potentially alleviate issues arising in complex data. Moreover, imputation carries the risk of biased estimations of the ground truth. Yet, confidence in the imputed values i…

    Submitted 4 January, 2024; originally announced January 2024.

  7. arXiv:2312.16713  [pdf, other]

    cs.LG cs.AI

    Knowledge Enhanced Conditional Imputation for Healthcare Time-series

    Authors: Linglong Qian, Zina Ibrahim, Hugh Logan Ellis, Ao Zhang, Yuezhou Zhang, Tao Wang, Richard Dobson

    Abstract: This study presents a novel approach to addressing the challenge of missing data in multivariate time series, with a particular focus on the complexities of healthcare data. Our Conditional Self-Attention Imputation (CSAI) model, grounded in a transformer-based framework, introduces a conditional hidden state initialization tailored to the intricacies of medical time series data. This methodology…

    Submitted 4 January, 2024; v1 submitted 27 December, 2023; originally announced December 2023.

  8. arXiv:2312.02953  [pdf]

    stat.AP q-bio.QM

    Longitudinal Assessment of Seasonal Impacts and Depression Associations on Circadian Rhythm Using Multimodal Wearable Sensing

    Authors: Yuezhou Zhang, Amos A Folarin, Shaoxiong Sun, Nicholas Cummins, Yatharth Ranjan, Zulqarnain Rashid, Callum Stewart, Pauline Conde, Heet Sankesara, Petroula Laiou, Faith Matcham, Katie M White, Carolin Oetzmann, Femke Lamers, Sara Siddi, Sara Simblett, Srinivasan Vairavan, Inez Myin-Germeys, David C. Mohr, Til Wykes, Josep Maria Haro, Peter Annas, Brenda WJH Penninx, Vaibhav A Narayan, Matthew Hotopf, et al. (2 additional authors not shown)

    Abstract: Objective: This study aimed to explore the associations between depression severity and wearable-measured circadian rhythms, accounting for seasonal impacts and quantifying seasonal changes in circadian rhythms. Materials and Methods: Data used in this study came from a large longitudinal mobile health study. Depression severity (measured biweekly using the 8-item Patient Health Questionnaire [PHQ-…

    Submitted 5 December, 2023; originally announced December 2023.

  9. arXiv:2310.04468  [pdf, other]

    cs.CL cs.AI

    Validating transformers for redaction of text from electronic health records in real-world healthcare

    Authors: Zeljko Kraljevic, Anthony Shek, Joshua Au Yeung, Ewart Jonathan Sheldon, Mohammad Al-Agil, Haris Shuaib, Xi Bai, Kawsar Noor, Anoop D. Shah, Richard Dobson, James Teo

    Abstract: Protecting patient privacy in healthcare records is a top priority, and redaction is a commonly used method for obscuring directly identifiable information in text. Rule-based methods have been widely used, but their precision is often low causing over-redaction of text and frequently not being adaptable enough for non-standardised or unconventional structures of personal health information. Deep…

    Submitted 5 October, 2023; originally announced October 2023.

  10. arXiv:2308.11773  [pdf]

    cs.CL cs.CY cs.SD eess.AS q-bio.QM

    Identifying depression-related topics in smartphone-collected free-response speech recordings using an automatic speech recognition system and a deep learning topic model

    Authors: Yuezhou Zhang, Amos A Folarin, Judith Dineley, Pauline Conde, Valeria de Angel, Shaoxiong Sun, Yatharth Ranjan, Zulqarnain Rashid, Callum Stewart, Petroula Laiou, Heet Sankesara, Linglong Qian, Faith Matcham, Katie M White, Carolin Oetzmann, Femke Lamers, Sara Siddi, Sara Simblett, Björn W. Schuller, Srinivasan Vairavan, Til Wykes, Josep Maria Haro, Brenda WJH Penninx, Vaibhav A Narayan, Matthew Hotopf, et al. (3 additional authors not shown)

    Abstract: Language use has been shown to correlate with depression, but large-scale validation is needed. Traditional methods like clinic studies are expensive. So, natural language processing has been employed on social media to predict depression, but limitations remain: lack of validated labels, biased user samples, and no context. Our study identified 29 topics in 3919 smartphone-collected speech recordi…

    Submitted 5 September, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

  11. arXiv:2308.02043  [pdf]

    cs.CY cs.AI

    Disease Insight through Digital Biomarkers Developed by Remotely Collected Wearables and Smartphone Data

    Authors: Zulqarnain Rashid, Amos A Folarin, Yatharth Ranjan, Pauline Conde, Heet Sankesara, Yuezhou Zhang, Shaoxiong Sun, Callum Stewart, Petroula Laiou, Richard JB Dobson

    Abstract: Digital Biomarkers and remote patient monitoring can provide valuable and timely insights into how a patient is coping with their condition (disease progression, treatment response, etc.), complementing treatment in traditional healthcare settings. Smartphones with embedded and connected sensors have immense potential for improving healthcare through various apps and mHealth (mobile health) platfor…

    Submitted 3 August, 2023; originally announced August 2023.

  12. arXiv:2306.10119  [pdf, other]

    astro-ph.HE astro-ph.SR

    Early Spectroscopy and Dense Circumstellar Medium Interaction in SN 2023ixf

    Authors: K. Azalee Bostroem, Jeniveve Pearson, Manisha Shrestha, David J. Sand, Stefano Valenti, Saurabh W. Jha, Jennifer E. Andrews, Nathan Smith, Giacomo Terreran, Elizabeth Green, Yize Dong, Michael Lundquist, Joshua Haislip, Emily T. Hoang, Griffin Hosseinzadeh, Daryl Janzen, Jacob E. Jencson, Vladimir Kouprianov, Emmy Paraskeva, Nicolas E. Meza Retamal, Daniel E. Reichart, Iair Arcavi, Alceste Z. Bonanos, Michael W. Coughlin, Ross Dobson, et al. (31 additional authors not shown)

    Abstract: We present the optical spectroscopic evolution of SN 2023ixf seen in sub-night cadence spectra from 1.18 to 14 days after explosion. We identify high-ionization emission features, signatures of interaction with material surrounding the progenitor star, that fade over the first 7 days, with rapid evolution between spectra observed within the same night. We compare the emission lines present and the…

    Submitted 12 December, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Published in ApJL

    Journal ref: The Astrophysical Journal Letters, Volume 956, Issue 1, id.L5, 17 pp., Oct 2023

  13. Towards robust paralinguistic assessment for real-world mobile health (mHealth) monitoring: an initial study of reverberation effects on speech

    Authors: Judith Dineley, Ewan Carr, Faith Matcham, Johnny Downs, Richard Dobson, Thomas F Quatieri, Nicholas Cummins

    Abstract: Speech is promising as an objective, convenient tool to monitor health remotely over time using mobile devices. Numerous paralinguistic features have been demonstrated to contain salient information related to an individual's health. However, mobile device specification and acoustic environments vary widely, risking the reliability of the extracted features. In an initial step towards quantifying…

    Submitted 31 May, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at Interspeech 2023

    Journal ref: Proc. INTERSPEECH 2023, 2373-2377

  14. arXiv:2212.10540  [pdf]

    q-bio.QM

    Challenges in Using mHealth Data From Smartphones and Wearable Devices to Predict Depression Symptom Severity: Retrospective Analysis

    Authors: Shaoxiong Sun, Amos A. Folarin, Yuezhou Zhang, Nicholas Cummins, Rafael Garcia-Dias, Callum Stewart, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Petroula Laiou, Heet Sankesara, Faith Matcham, Daniel Leightley, Katie M. White, Carolin Oetzmann, Alina Ivan, Femke Lamers, Sara Siddi, Sara Simblett, Raluca Nica, Aki Rintala, David C. Mohr, Inez Myin-Germeys, Til Wykes, Josep Maria Haro, et al. (6 additional authors not shown)

    Abstract: A number of challenges exist for the analysis of mHealth data: maintaining participant engagement over extended time periods and therefore understanding what constitutes an acceptable threshold of missing data; distinguishing between the cross-sectional and longitudinal relationships for different features to determine their utility in tracking within-individual longitudinal variation or screening…

    Submitted 14 August, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  15. arXiv:2212.08072  [pdf]

    cs.CL cs.AI cs.LG

    Foresight -- Generative Pretrained Transformer (GPT) for Modelling of Patient Timelines using EHRs

    Authors: Zeljko Kraljevic, Dan Bean, Anthony Shek, Rebecca Bendayan, Harry Hemingway, Joshua Au Yeung, Alexander Deng, Alfie Baston, Jack Ross, Esther Idowu, James T Teo, Richard J Dobson

    Abstract: Background: Electronic Health Records hold detailed longitudinal information about each patient's health status and general clinical history, a large portion of which is stored within the unstructured text. Existing approaches focus mostly on structured data and a subset of single-domain outcomes. We explore how temporal modelling of patients from free text and structured data, using deep generati…

    Submitted 24 January, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

  16. Discharge Summary Hospital Course Summarisation of In Patient Electronic Health Record Text with Clinical Concept Guided Deep Pre-Trained Transformer Models

    Authors: Thomas Searle, Zina Ibrahim, James Teo, Richard Dobson

    Abstract: Brief Hospital Course (BHC) summaries are succinct summaries of an entire hospital encounter, embedded within discharge summaries, written by senior clinicians responsible for the overall care of a patient. Methods to automatically produce summaries from inpatient documentation would be invaluable in reducing clinician manual burden of summarising documents under high time-pressure to admit and di…

    Submitted 10 April, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

  17. arXiv:2204.09594  [pdf]

    cs.CL cs.LG

    Predicting Clinical Intent from Free Text Electronic Health Records

    Authors: Kawsar Noor, Katherine Smith, Julia Bennett, Jade OConnell, Jessica Fisk, Monika Hunt, Gary Philippo, Teresa Xu, Simon Knight, Luis Romao, Richard JB Dobson, Wai Keong Wong

    Abstract: After a patient consultation, a clinician determines the steps in the management of the patient. A clinician may for example request to see the patient again or refer them to a specialist. Whilst most clinicians will record their intent as "next steps" in the patient's clinical notes, in some cases the clinician may forget to indicate their intent as an order or request, e.g. failure to place the…

    Submitted 25 March, 2022; originally announced April 2022.

  18. arXiv:2201.12644  [pdf]

    q-bio.QM

    Associations between depression symptom severity and daily-life gait characteristics derived from long-term acceleration signals in real-world settings

    Authors: Yuezhou Zhang, Amos A Folarin, Shaoxiong Sun, Nicholas Cummins, Srinivasan Vairavan, Linglong Qian, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Petroula Laiou, Heet Sankesara, Faith Matcham, Katie M White, Carolin Oetzmann, Alina Ivan, Femke Lamers, Sara Siddi, Sara Simblett, Aki Rintala, David C Mohr, Inez Myin-Germeys, Til Wykes, Josep Maria Haro, Brenda WJH Penninx, et al. (5 additional authors not shown)

    Abstract: Gait is an essential manifestation of depression. Laboratory gait characteristics have been found to be closely associated with depression. However, the gait characteristics of daily walking in real-world scenarios and their relationships with depression are yet to be fully explored. This study aimed to explore associations between depression symptom severity and daily-life gait characteristics de…

    Submitted 29 January, 2022; originally announced January 2022.

  19. arXiv:2112.11903  [pdf]

    q-bio.QM

    The utility of wearable devices in assessing ambulatory impairments of people with multiple sclerosis in free-living conditions

    Authors: Shaoxiong Sun, Amos A Folarin, Yuezhou Zhang, Nicholas Cummins, Shuo Liu, Callum Stewart, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Petroula Laiou, Heet Sankesara, Gloria Dalla Costa, Letizia Leocani, Per Soelberg Sørensen, Melinda Magyari, Ana Isabel Guerrero, Ana Zabalza, Srinivasan Vairavan, Raquel Bailon, Sara Simblett, Inez Myin-Germeys, Aki Rintala, Til Wykes, Vaibhav A Narayan, Matthew Hotopf, et al. (3 additional authors not shown)

    Abstract: Multiple sclerosis (MS) is a progressive inflammatory and neurodegenerative disease of the central nervous system affecting over 2.5 million people globally. In-clinic six-minute walk test (6MWT) is a widely used objective measure to evaluate the progression of MS. Yet, it has limitations such as the need for a clinical visit and a proper walkway. The widespread use of wearable devices capable of…

    Submitted 22 December, 2021; originally announced December 2021.

  20. arXiv:2108.06835  [pdf]

    cs.IR

    Deployment of a Free-Text Analytics Platform at a UK National Health Service Research Hospital: CogStack at University College London Hospitals

    Authors: Kawsar Noor, Lukasz Roguski, Alex Handy, Roman Klapaukh, Amos Folarin, Luis Romao, Joshua Matteson, Nathan Lea, Leilei Zhu, Wai Keong Wong, Anoop Shah, Richard J Dobson

    Abstract: As more healthcare organisations transition to using electronic health record (EHR) systems it is important for these organisations to maximise the secondary use of their data to support service improvement and clinical research. These organisations will find it challenging to have systems which can mine information from the unstructured data fields in the record (clinical notes, letters etc) and…

    Submitted 15 August, 2021; originally announced August 2021.

  21. arXiv:2107.03134  [pdf, other]

    cs.CL

    MedGPT: Medical Concept Prediction from Clinical Narratives

    Authors: Zeljko Kraljevic, Anthony Shek, Daniel Bean, Rebecca Bendayan, James Teo, Richard Dobson

    Abstract: The data available in Electronic Health Records (EHRs) provides the opportunity to transform care, and the best way to provide better care for one patient is through learning from the data available on all other patients. Temporal modelling of a patient's medical history, which takes into account the sequence of past events, can be used to predict future events such as a diagnosis of a new disorde…

    Submitted 7 July, 2021; originally announced July 2021.

    Comments: 6 pages, 2 figures, 3 tables

  22. Estimating Redundancy in Clinical Text

    Authors: Thomas Searle, Zina Ibrahim, James Teo, Richard JB Dobson

    Abstract: The current mode of use of Electronic Health Record (EHR) elicits text redundancy. Clinicians often populate new documents by duplicating existing notes, then updating accordingly. Data duplication can lead to a propagation of errors, inconsistencies and misreporting of care. Therefore, quantifying information redundancy can play an essential role in evaluating innovations that operate on clinical…

    Submitted 26 October, 2021; v1 submitted 25 May, 2021; originally announced May 2021.

    Journal ref: JBI v124 (2021)

  23. arXiv:2104.12407  [pdf]

    stat.ML cs.LG

    Predicting Depressive Symptom Severity through Individuals' Nearby Bluetooth Devices Count Data Collected by Mobile Phones: A Preliminary Longitudinal Study

    Authors: Yuezhou Zhang, Amos A Folarin, Shaoxiong Sun, Nicholas Cummins, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Petroula Laiou, Faith Matcham, Carolin Oetzmann, Femke Lamers, Sara Siddi, Sara Simblett, Aki Rintala, David C Mohr, Inez Myin-Germeys, Til Wykes, Josep Maria Haro, Brenda WJH Pennix, Vaibhav A Narayan, Peter Annas, Matthew Hotopf, Richard JB Dobson

    Abstract: The Bluetooth sensor embedded in mobile phones provides an unobtrusive, continuous, and cost-efficient means to capture individuals' proximity information, such as the nearby Bluetooth devices count (NBDC). The continuous NBDC data can partially reflect individuals' behaviors and status, such as social connections and interactions, working status, mobility, and social isolation and loneliness, whi…

    Submitted 26 April, 2021; originally announced April 2021.

  24. arXiv:2104.09263  [pdf, other]

    eess.SP cs.HC cs.LG

    Fitbeat: COVID-19 Estimation based on Wristband Heart Rate

    Authors: Shuo Liu, Jing Han, Estela Laporta Puyal, Spyridon Kontaxis, Shaoxiong Sun, Patrick Locatelli, Judith Dineley, Florian B. Pokorny, Gloria Dalla Costa, Letizia Leocan, Ana Isabel Guerrero, Carlos Nos, Ana Zabalza, Per Soelberg Sørensen, Mathias Buron, Melinda Magyari, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Amos A Folarin, Richard JB Dobson, Raquel Bailón, Srinivasan Vairavan, Nicholas Cummins, et al. (4 additional authors not shown)

    Abstract: This study investigates the potential of deep learning methods to identify individuals with suspected COVID-19 infection using remotely collected heart-rate data. The study utilises data from the ongoing EU IMI RADAR-CNS research project that is investigating the feasibility of wearable devices and smart phones to monitor individuals with multiple sclerosis (MS), depression or epilepsy. As part of…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: 34 pages, 4 figures

  25. Remote smartphone-based speech collection: acceptance and barriers in individuals with major depressive disorder

    Authors: Judith Dineley, Grace Lavelle, Daniel Leightley, Faith Matcham, Sara Siddi, Maria Teresa Peñarrubia-María, Katie M. White, Alina Ivan, Carolin Oetzmann, Sara Simblett, Erin Dawe-Lane, Stuart Bruce, Daniel Stahl, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Amos A. Folarin, Josep Maria Haro, Til Wykes, Richard J. B. Dobson, Vaibhav A. Narayan, Matthew Hotopf, Björn W. Schuller, Nicholas Cummins, The RADAR-CNS Consortium

    Abstract: The ease of in-the-wild speech recording using smartphones has sparked considerable interest in the combined application of speech, remote measurement technology (RMT) and advanced analytics as a research and healthcare tool. For this to be realised, the acceptability of remote speech collection to the user must be established, in addition to feasibility from an analytical perspective. To understa…

    Submitted 30 August, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

    Comments: Accepted to Interspeech 2021. Formatting changes + minor language edits

    ACM Class: H.1.2

    Journal ref: Proc. Interspeech 2021, pp. 631-635

  26. arXiv:2011.09361  [pdf, other]

    cs.LG cs.CY

    A Knowledge Distillation Ensemble Framework for Predicting Short and Long-term Hospitalisation Outcomes from Electronic Health Records Data

    Authors: Zina M Ibrahim, Daniel Bean, Thomas Searle, Honghan Wu, Anthony Shek, Zeljko Kraljevic, James Galloway, Sam Norton, James T Teo, Richard JB Dobson

    Abstract: The ability to perform accurate prognosis of patients is crucial for proactive clinical decision making, informed resource management and personalised care. Existing outcome prediction models suffer from a low recall of infrequent positive outcomes. We present a highly-scalable and robust machine learning framework to automatically predict adversity represented by mortality and ICU admission from…

    Submitted 11 June, 2021; v1 submitted 18 November, 2020; originally announced November 2020.

    Comments: 14 pages

  27. arXiv:2010.01165  [pdf, other]

    cs.CL cs.AI cs.LG

    Multi-domain Clinical Natural Language Processing with MedCAT: the Medical Concept Annotation Toolkit

    Authors: Zeljko Kraljevic, Thomas Searle, Anthony Shek, Lukasz Roguski, Kawsar Noor, Daniel Bean, Aurelie Mascio, Leilei Zhu, Amos A Folarin, Angus Roberts, Rebecca Bendayan, Mark P Richardson, Robert Stewart, Anoop D Shah, Wai Keong Wong, Zina Ibrahim, James T Teo, Richard JB Dobson

    Abstract: Electronic health records (EHR) contain large volumes of unstructured text, requiring the application of Information Extraction (IE) technologies to enable clinical analysis. We present the open-source Medical Concept Annotation Toolkit (MedCAT) that provides: a) a novel self-supervised machine learning algorithm for extracting concepts using any concept vocabulary including UMLS/SNOMED-CT; b) a f…

    Submitted 25 March, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

    Comments: Preprint: 27 Pages, 3 Figures

  28. arXiv:2009.12983  [pdf]

    stat.AP q-bio.QM

    The Relationship between Major Depression Symptom Severity and Sleep Collected Using a Wristband Wearable Device: Multi-centre Longitudinal Observational Study

    Authors: Yuezhou Zhang, Amos A Folarin, Shaoxiong Sun, Nicholas Cummins, Rebecca Bendayan, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Petroula Laiou, Faith Matcham, Katie White, Femke Lamers, Sara Siddi, Sara Simblett, Inez Myin-Germeys, Aki Rintala, Til Wykes, Josep Maria Haro, Brenda WJH Pennix, Vaibhav A Narayan, Matthew Hotopf, Richard JB Dobson

    Abstract: Research in mental health has implicated sleep pathologies with depression. However, the gold standard for sleep assessment, polysomnography, is not suitable for long-term, continuous, monitoring of daily sleep, and methods such as sleep diaries rely on subjective recall, which is qualitative and inaccurate. Wearable devices, on the other hand, provide a low-cost and convenient means to monitor sl…

    Submitted 27 September, 2020; originally announced September 2020.

  29. arXiv:2009.09648  [pdf]

    physics.soc-ph cs.SI q-bio.QM

    Measuring the effect of Non-Pharmaceutical Interventions (NPIs) on mobility during the COVID-19 pandemic using global mobility data

    Authors: Berber T Snoeijer, Mariska Burger, Shaoxiong Sun, Richard JB Dobson, Amos A Folarin

    Abstract: The implementation of governmental Non-Pharmaceutical Interventions (NPIs) has been the primary means of controlling the spread of the COVID-19 disease. The intended effect of these NPIs has been to reduce mobility. A strong reduction in mobility is believed to have a positive effect on the reduction of COVID-19 transmission by limiting the opportunity for the virus to spread in the population. Du…

    Submitted 21 September, 2020; originally announced September 2020.

    Comments: 16 pages, 6 figures

  30. arXiv:2006.07358  [pdf, ps, other]

    cs.LG cs.CL cs.SD eess.AS stat.ML

    Comparing Natural Language Processing Techniques for Alzheimer's Dementia Prediction in Spontaneous Speech

    Authors: Thomas Searle, Zina Ibrahim, Richard Dobson

    Abstract: Alzheimer's Dementia (AD) is an incurable, debilitating, and progressive neurodegenerative condition that affects cognitive function. Early diagnosis is important as therapeutics can delay progression and give those diagnosed vital time. Developing models that analyse spontaneous speech could eventually provide an efficient diagnostic modality for earlier diagnosis of AD. The Alzheimer's Dementia…

    Submitted 23 September, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: Submitted to INTERSPEECH 2020: Alzheimer's Dementia Recognition through Spontaneous Speech The ADReSS Challenge Workshop

    Journal ref: Interspeech 2020

  31. Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset

    Authors: Thomas Searle, Zina Ibrahim, Richard JB Dobson

    Abstract: Clinical coding is currently a labour-intensive, error-prone, but critical administrative process whereby hospital patient episodes are manually assigned codes by qualified staff from large, standardised taxonomic hierarchies of codes. Automating clinical coding has a long history in NLP research and has recently seen novel developments setting new state of the art results. A popular dataset used…

    Submitted 12 June, 2020; originally announced June 2020.

    Journal ref: ACL 2020

  32. arXiv:2005.06624  [pdf, other]

    cs.CL cs.LG

    Comparative Analysis of Text Classification Approaches in Electronic Health Records

    Authors: Aurelie Mascio, Zeljko Kraljevic, Daniel Bean, Richard Dobson, Robert Stewart, Rebecca Bendayan, Angus Roberts

    Abstract: Text classification tasks which aim at harvesting and/or organizing information from electronic health records are pivotal to support clinical and translational research. However these present specific challenges compared to other classification tasks, notably due to the particular nature of the medical lexicon and language used in clinical records. Recent advances in embedding methods have shown…

    Submitted 8 May, 2020; originally announced May 2020.

  33. arXiv:2004.14331  [pdf]

    q-bio.QM cs.HC

    Using smartphones and wearable devices to monitor behavioural changes during COVID-19

    Authors: Shaoxiong Sun, Amos Folarin, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Nicholas Cummins, Faith Matcham, Gloria Dalla Costa, Sara Simblett, Letizia Leocani, Per Soelberg Sørensen, Mathias Buron, Ana Isabel Guerrero, Ana Zabalza, Brenda WJH Penninx, Femke Lamers, Sara Siddi, Josep Maria Haro, Inez Myin-Germeys, Aki Rintala, Til Wykes, Vaibhav A. Narayan, Giancarlo Comi, Matthew Hotopf, et al. (1 additional author not shown)

    Abstract: We aimed to explore the utility of the recently developed open-source mobile health platform RADAR-base as a toolbox to rapidly test the effect and response to NPIs aimed at limiting the spread of COVID-19. We analysed data extracted from smartphone and wearable devices and managed by the RADAR-base from 1062 participants recruited in Italy, Spain, Denmark, the UK, and the Netherlands. We derived…

    Submitted 22 July, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

  34. Modeling Rare Interactions in Time Series Data Through Qualitative Change: Application to Outcome Prediction in Intensive Care Units

    Authors: Zina Ibrahim, Honghan Wu, Richard Dobson

    Abstract: Many areas of research are characterised by the deluge of large-scale highly-dimensional time-series data. However, using the data available for prediction and decision making is hampered by the current lag in our ability to uncover and quantify true interactions that explain the outcomes. We are interested in areas such as intensive care medicine, which are characterised by i) continuous monitorin…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: 8 pages, 3 figures. Accepted for publication in the European Conference of Artificial Intelligence (ECAI 2020)

    ACM Class: F.4.1; K.3.2

    Journal ref: European Conference on Artificial Intelligence 325(2020) 1826-1833

  35. arXiv:2002.08901  [pdf]

    cs.CL cs.LG

    Identifying physical health comorbidities in a cohort of individuals with severe mental illness: An application of SemEHR

    Authors: Rebecca Bendayan, Honghan Wu, Zeljko Kraljevic, Robert Stewart, Tom Searle, Jaya Chaturvedi, Jayati Das-Munshi, Zina Ibrahim, Aurelie Mascio, Angus Roberts, Daniel Bean, Richard Dobson

    Abstract: Multimorbidity research in mental health services requires data from physical health conditions which is traditionally limited in mental health care electronic health records. In this study, we aimed to extract data from physical health conditions from clinical notes using SemEHR. Data was extracted from Clinical Record Interactive Search (CRIS) system at South London and Maudsley Biomedical Resea…

    Submitted 7 February, 2020; originally announced February 2020.

    Comments: 4 pages, 2 tables

  36. arXiv:2002.04176  [pdf, other]

    stat.AP q-bio.NC

    Personalized acute stress classification from physiological signals with neural processes

    Authors: Callum L. Stewart, Amos Folarin, Richard Dobson

    Abstract: Objective: A person's affective state has known relationships to physiological processes which can be measured by wearable sensors. However, while there are general trends, those relationships can be person-specific. This work proposes using neural processes as a way to address individual differences. Methods: Stress classifiers built from classic machine learning models and from neural processes…

    Submitted 10 February, 2020; originally announced February 2020.

    Comments: 16 pages (inc. references), 5 figures, 3 tables

  37. The side effect profile of Clozapine in real world data of three large mental hospitals

    Authors: Ehtesham Iqbal, Risha Govind, Alvin Romero, Olubanke Dzahini, Matthew Broadbent, Robert Stewart, Tanya Smith, Chi-Hun Kim, Nomi Werbeloff, Richard Dobson, Zina Ibrahim

    Abstract: Objective: Mining the data contained within Electronic Health Records (EHRs) can potentially generate a greater understanding of medication effects in the real world, complementing what we know from randomised controlled trials (RCTs). We propose a text mining approach to detect adverse events and medication episodes from the clinical text to enhance our understanding of adverse effects related to Cl…

    Submitted 27 January, 2020; originally announced January 2020.

  38. arXiv:1912.10166  [pdf]

    cs.CL cs.LG stat.ML

    MedCAT -- Medical Concept Annotation Tool

    Authors: Zeljko Kraljevic, Daniel Bean, Aurelie Mascio, Lukasz Roguski, Amos Folarin, Angus Roberts, Rebecca Bendayan, Richard Dobson

    Abstract: Biomedical documents such as Electronic Health Records (EHRs) contain a large amount of information in an unstructured format. The data in EHRs is a hugely valuable resource documenting clinical narratives and decisions, but whilst the text can be easily understood by human doctors it is challenging to use in research and clinical applications. To uncover the potential of biomedical documents we n…

    Submitted 18 December, 2019; originally announced December 2019.

    Comments: Preprint, 25 pages, 5 figures and 4 tables

  39. arXiv:1912.00672  [pdf]

    q-bio.QM cs.LG stat.ML

    On Classifying Sepsis Heterogeneity in the ICU: Insight Using Machine Learning

    Authors: Zina Ibrahim, Honghan Wu, Ahmed Hamoud, Lukas Stappen, Richard Dobson, Andrea Agarossi

    Abstract: Current machine learning models aiming to predict sepsis from Electronic Health Records (EHR) do not account for the heterogeneity of the condition, despite its emerging importance in prognosis and treatment. This work demonstrates the added value of stratifying the types of organ dysfunction observed in patients who develop sepsis in the ICU in improving the ability to recognise patients at risk…

    Submitted 3 December, 2019; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: 3 figures and 2 tables. Accepted for publication at the Journal of the American Medical Informatics Association

    Journal ref: Journal of the American Medical Informatics Association 27 (2020) 437-443

  40. arXiv:1907.07322  [pdf, other]

    cs.HC cs.CL cs.LG

    MedCATTrainer: A Biomedical Free Text Annotation Interface with Active Learning and Research Use Case Specific Customisation

    Authors: Thomas Searle, Zeljko Kraljevic, Rebecca Bendayan, Daniel Bean, Richard Dobson

    Abstract: We present MedCATTrainer, an interface for building, improving and customising a given Named Entity Recognition and Linking (NER+L) model for biomedical domain text. NER+L is often used as a first step in deriving value from clinical text. Collecting labelled data for training models is difficult due to the need for specialist domain knowledge. MedCATTrainer offers an interactive web-interface to i…

    Submitted 16 July, 2019; originally announced July 2019.

    Journal ref: EMNLP/IJCNLP 2019

  41. arXiv:1903.03995  [pdf]

    cs.CL cs.AI

    Efficiently Reusing Natural Language Processing Models for Phenotype-Mention Identification in Free-text Electronic Medical Records: Methodology Study

    Authors: Honghan Wu, Karen Hodgson, Sue Dyson, Katherine I. Morley, Zina M. Ibrahim, Ehtesham Iqbal, Robert Stewart, Richard JB Dobson, Cathie Sudlow

    Abstract: Background: Many efforts have been put into the use of automated approaches, such as natural language processing (NLP), to mine or extract data from free-text medical records to construct comprehensive patient profiles for delivering better health-care. Reusing NLP models in new settings, however, remains cumbersome - requiring validation and/or retraining on new data iteratively to achieve conver…

    Submitted 23 October, 2019; v1 submitted 10 March, 2019; originally announced March 2019.

  42. arXiv:1811.11005  [pdf, other]

    cs.CL cs.LG stat.ML

    Application of Clinical Concept Embeddings for Heart Failure Prediction in UK EHR data

    Authors: Spiros Denaxas, Pontus Stenetorp, Sebastian Riedel, Maria Pikoula, Richard Dobson, Harry Hemingway

    Abstract: Electronic health records (EHR) are increasingly being used for constructing disease risk prediction models. Feature engineering in EHR data however is challenging due to their highly dimensional and heterogeneous nature. Low-dimensional representations of EHR data can potentially mitigate these challenges. In this paper, we use global vectors (GloVe) to learn word embeddings for diagnoses and pro…

    Submitted 28 November, 2018; v1 submitted 23 November, 2018; originally announced November 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

    Report number: ML4H/2018/37

  43. arXiv:1609.00061  [pdf, other]

    math.NA math.CT

    Pixel Arrays: A fast and elementary method for solving nonlinear systems

    Authors: David I. Spivak, Magdalen R. C. Dobson, Sapna Kumari, Lawrence Wu

    Abstract: We present a new method, called the pixel array method, for approximating all solutions in a bounding box for an arbitrary nonlinear system of relations. In contrast with other solvers, our approach requires that the user must specify which variables are to be exposed, and which are to be left latent. The entire solution set is then obtained---in terms of these exposed variables---by performing a…

    Submitted 14 May, 2017; v1 submitted 31 August, 2016; originally announced September 2016.

    Comments: 22 pages

    MSC Class: 65H10; 18-04