Accepted for/Published in: JMIR AI

Date Submitted: Nov 8, 2023
Date Accepted: Jun 29, 2024

The final, peer-reviewed published version of this preprint can be found here:

Evaluation of Generative Language Models in Personalizing Medical Information: Instrument Validation Study

Spina A, Andalib S, Flores D, Vermani R, Halaseh FF, Nelson AM

JMIR AI 2024;3:e54371

DOI: 10.2196/54371

PMID: 39137416

PMCID: 11350306

Bridging Health Literacy Gaps with AI: Evaluation of Generative Language Models in Personalizing Medical Information

  • Aidin Spina
  • Saman Andalib
  • Daniel Flores
  • Rishi Vermani
  • Faris F. Halaseh
  • Ariana M. Nelson

ABSTRACT

Background:

AI-driven generative language models (GLMs) have enormous potential in medicine. One application of GLMs that could be implemented quickly and have substantial impact is addressing low health literacy (LHL).

Objective:

The goal of this study is to evaluate the potential of GLMs to tailor the complexity of medical information to a patient-specific input education level, which is crucial if they are to serve as a tool in addressing LHL.

Methods:

Input templates related to three prevalent chronic diseases (type II diabetes, chronic obstructive pulmonary disease [COPD], and hypertension) were designed. Each clinical vignette was adjusted for hypothetical patient education levels to evaluate output personalization. To assess the success of a GLM in tailoring output writing, the readability of pre- and post-transformation outputs was quantified using the Flesch Reading Ease Score (FKRE) and the Flesch-Kincaid Grade Level (FKGL).
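Both readability metrics named above are standard formulas over the average number of words per sentence and syllables per word. The following is a minimal Python sketch of how they can be computed, not the scoring tool used in the study (which is not identified here). The function names `count_syllables` and `readability` are illustrative, and the syllable counter is a crude vowel-group heuristic, so scores from this sketch may differ from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for a text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard published coefficients for both formulas.
    reading_ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade_level = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return reading_ease, grade_level


if __name__ == "__main__":
    plain = "High blood pressure can hurt your heart. Take your pills each day."
    dense = "Hypertension necessitates pharmacological intervention."
    for sample in (plain, dense):
        ease, grade = readability(sample)
        print(f"FKRE={ease:.1f}  FKGL={grade:.1f}  {sample}")
```

A higher Reading Ease score indicates easier text, while a higher Grade Level indicates harder text, so the two metrics move in opposite directions as complexity increases.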

Results:

Responses (n=90) were generated using the GLM program across three clinical vignettes. When aggregated by education level (5th grade, 8th grade, high school, associate’s, bachelor’s, and doctorate, respectively), mean FKRE values were 62.6 (SD=3.26), 54.6 (3.73), 46.2 (7.19), 42.1 (5.02), 38.0 (7.93), and 23.9 (3.53). For the three lowest education levels, the output FKRE means did not show concordance with the prespecified input education level. However, output FKRE means did show concordance for the three highest education levels. The respective FKGL means were 8.32 (SD=0.642), 9.70 (0.727), 11.2 (1.17), 11.6 (0.841), 12.4 (1.19), and 14.6 (0.550). The differences in mean FKRE and mean FKGL across input education levels were statistically significant.
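One way to read the concordance result is to place each reported mean FKRE into the commonly cited Flesch Reading Ease interpretation bands. The sketch below does this for the six means reported above. The band table and the names `FLESCH_BANDS` and `flesch_band` are assumptions introduced for illustration; the abstract does not state the exact concordance criterion the authors applied.

```python
# Mean FKRE per input education level, as reported in the Results above.
REPORTED_FKRE = {
    "5th grade": 62.6,
    "8th grade": 54.6,
    "high school": 46.2,
    "associate's": 42.1,
    "bachelor's": 38.0,
    "doctorate": 23.9,
}

# Commonly cited Flesch Reading Ease bands (assumption: not taken from the study).
# Each entry is (lower bound, description, approximate school level).
FLESCH_BANDS = [
    (90.0, "very easy", "5th grade"),
    (80.0, "easy", "6th grade"),
    (70.0, "fairly easy", "7th grade"),
    (60.0, "standard", "8th-9th grade"),
    (50.0, "fairly difficult", "10th-12th grade"),
    (30.0, "difficult", "college"),
    (float("-inf"), "very difficult", "college graduate"),
]


def flesch_band(score: float) -> tuple[str, str]:
    """Return (description, approximate school level) for a Reading Ease score."""
    for lower_bound, description, level in FLESCH_BANDS:
        if score >= lower_bound:
            return description, level
    raise ValueError("score is not a number")


if __name__ == "__main__":
    for requested, mean_score in REPORTED_FKRE.items():
        description, level = flesch_band(mean_score)
        print(f"requested {requested:<12} mean FKRE {mean_score:>5} -> {description} ({level})")
```

Under these assumed bands, the outputs requested at the 5th grade, 8th grade, and high school levels score in harder bands than requested, while the associate's, bachelor's, and doctorate outputs fall in the college and college graduate bands, which is consistent with the pattern described above.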

Conclusions:

GLMs can change the structure and readability of medical text outputs according to specified education levels but are more effective for higher than for lower input education levels. GLMs are also able to categorize the input education designation into three broad tiers of output readability: Easy (5th and 8th grade), Medium (high school, associate’s, and bachelor’s degree), and Difficult (doctorate). Future research must establish how GLMs can reliably personalize medical texts for patients with lower education levels to improve healthcare literacy.



© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.