
Exploring Large Language Model For Next Generation of Artificial Intelligence in Ophthalmology


TYPE Review
PUBLISHED 23 November 2023
DOI 10.3389/fmed.2023.1291404

Exploring large language model for next generation of artificial intelligence in ophthalmology

Kai Jin 1†, Lu Yuan 2†, Hongkang Wu 1, Andrzej Grzybowski 3 and Juan Ye 1*

1 Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
2 Department of Ophthalmology, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China
3 Institute for Research in Ophthalmology, Foundation for Ophthalmology Development, Poznan, Poland

OPEN ACCESS

EDITED BY
Haoyu Chen, The Chinese University of Hong Kong, China

REVIEWED BY
Jo-Hsuan "Sandy" Wu, University of California, San Diego, United States
Jana Lipkova, University of California, Irvine, United States

*CORRESPONDENCE
Juan Ye
[email protected]

† These authors have contributed equally to this work

RECEIVED 09 September 2023
ACCEPTED 20 October 2023
PUBLISHED 23 November 2023

CITATION
Jin K, Yuan L, Wu H, Grzybowski A and Ye J (2023) Exploring large language model for next generation of artificial intelligence in ophthalmology. Front. Med. 10:1291404. doi: 10.3389/fmed.2023.1291404

COPYRIGHT
© 2023 Jin, Yuan, Wu, Grzybowski and Ye. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

In recent years, ophthalmology has advanced significantly, thanks to rapid progress in artificial intelligence (AI) technologies. Large language models (LLMs) like ChatGPT have emerged as powerful tools for natural language processing. This paper finally includes 108 studies, and explores LLMs’ potential in the next generation of AI in ophthalmology. The results encompass a diverse range of studies in the field of ophthalmology, highlighting the versatile applications of LLMs. Subfields encompass general ophthalmology, retinal diseases, anterior segment diseases, glaucoma, and ophthalmic plastics. Results show LLMs’ competence in generating informative and contextually relevant responses, potentially reducing diagnostic errors and improving patient outcomes. Overall, this study highlights LLMs’ promising role in shaping AI’s future in ophthalmology. By leveraging AI, ophthalmologists can access a wealth of information, enhance diagnostic accuracy, and provide better patient care. Despite challenges, continued AI advancements and ongoing research will pave the way for the next generation of AI-assisted ophthalmic practices.

KEYWORDS
artificial intelligence, large language model, ChatGPT, ophthalmology, diagnostic accuracy and efficacy

Introduction
The history of artificial intelligence (AI) in medicine dates back to the 1950s when
researchers began to explore the use of computers to analyze medical data and make
diagnostic decisions. However, past methods had limitations in accuracy and speed and
still could not analyze unstructured medical data (1). Natural Language Processing (NLP)
is a subfield of AI that focuses on enabling computers to understand, interpret, and
generate human language. It involves the development of algorithms and models that can
process and analyze unstructured text data. Large Language Models (LLMs) refer to
advanced artificial intelligence models, such as GPT-3 (Generative Pre-trained
Transformer 3), that are built on transformer architecture. The transformer architecture
is a deep learning model that efficiently captures context and dependencies in sequential
data, making it a fundamental choice for natural language processing tasks and beyond.
These models are trained on massive amounts of text data from the internet, enabling
them to generate human-like text and perform a wide range of NLP tasks with remarkable
accuracy and versatility. ChatGPT builds on the capabilities of large language models to generate coherent and contextually relevant responses, making it well-suited for chatbot applications. It is designed to generate human-like responses to a wide range of prompts and questions and may enhance healthcare delivery and patients’ quality of life (Figure 1) (2). The use of LLMs in healthcare offers several potential benefits.
ChatGPT and LLMs can be applied in various ways. They can serve as clinical documentation aids, helping with administrative tasks such as clinic scheduling, medical coding for billing, and generating preauthorization letters (3). LLMs can also be used as summarization tools, improving communication with patients and assisting in clinical trials. They can make processes such as curriculum design, testing of knowledge base, and continuing medical education more dynamic (4). LLMs can reduce the burden of administrative tasks for healthcare professionals, save time, and improve efficiency. They also have the potential to provide valuable clinical insights and support decision-making (5). This capability may help ophthalmologists by enabling evidence-based decision-making and revolutionizing various aspects of eye care and research.

Abbreviations: AI, Artificial Intelligence; NLP, Natural Language Processing; LLM, Large Language Model; GPT, Generative Pre-trained Transformer; EHR, Electronic Health Records; eAMD, exudative Age-related Macular Degeneration; DR, Diabetic Retinopathy; OCT, Optical Coherence Tomography; BERT, Bidirectional Encoder Representations from Transformers.

Frontiers in Medicine 01 frontiersin.org

Jin et al. 10.3389/fmed.2023.1291404

FIGURE 1
Workflow of large language model (LLM) for artificial intelligence (AI) in ophthalmology. Text (symptoms, medical history, etc.) and images (optical coherence tomography, fundus fluorescein angiography, etc.) are encoded and fed into a model that has been trained on a large amount of data, which can decode the relevant information required. LLM applications include automated question-answering, diagnosis, information screening, summarization, image analysis, and predictive modeling.

Method of literature search
For this review, we followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.

Study selection and search strategy
We conducted a comprehensive literature search following the PRISMA guidelines. Searches were performed on PubMed and Google Scholar databases, spanning from January 2016 to June 2023. Keywords were selected from two distinct categories: ophthalmology-related terms (ophthalmology, eye diseases, eye disorders) and large language model-related terms (large language models, ChatGPT, natural language processing, chatbots). The search strategy involved the use of the following keywords: (“Ophthalmology” OR “Eye Diseases” OR “Eye Disorders”) AND (“Large Language Models” OR “ChatGPT” OR “Natural Language Processing” OR “Chatbots”). The terms from each category were cross-referenced independently with terms from the other category.

Inclusion and exclusion criteria
We established specific inclusion criteria for article selection. The publication period considered research from January 2016 to June 2023 to ensure the inclusion of up-to-date findings. Initially, 6,130 articles were identified through titles and abstracts. We prioritized research quality and the application of Large Language Models (LLMs) in our selection process. Additionally, articles published prior to 2016 were included for historical context, as were those pertinent to closely related topics.
Studies meeting any of the following criteria were excluded: (1) duplicate literature previously included in the review, (2) irrelevant topics, where the article is unrelated to ophthalmology or the application of the large language model, (3) conference abstracts, and (4) non-original research, such as editorials, case reports or commentaries.
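The Boolean search string given under "Study selection and search strategy" can be assembled from the two keyword categories. The following is a minimal sketch, not the authors' actual tooling: the function names are hypothetical, straight quotes are assumed in place of the typographic quotes printed in the text, and the January 2016 to June 2023 date filter is not encoded because its syntax depends on the database.

```python
from itertools import product

# Keyword categories as listed in the search strategy.
OPHTHALMOLOGY_TERMS = ["Ophthalmology", "Eye Diseases", "Eye Disorders"]
LLM_TERMS = ["Large Language Models", "ChatGPT",
             "Natural Language Processing", "Chatbots"]


def or_group(terms):
    """Quote each term, join with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(*groups):
    """AND together one OR-group per keyword category (hypothetical helper)."""
    return " AND ".join(or_group(g) for g in groups)


query = build_query(OPHTHALMOLOGY_TERMS, LLM_TERMS)

# Cross-referencing each term of one category with each term of the other.
pairs = list(product(OPHTHALMOLOGY_TERMS, LLM_TERMS))

print(query)
print(len(pairs), "term pairs")
```

Keeping the categories as separate lists makes it straightforward to add a synonym to one category without rewriting the whole string.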


Language considerations
A comprehensive review was conducted primarily on English-language articles, totaling 6,130 papers. Furthermore, we evaluated 14 papers predominantly published in Chinese. For articles in languages such as French, Spanish, and German, we assessed their abstracts. This multilingual approach allowed us to comprehensively evaluate the literature. The primary inclusion criterion required research to specifically address the application of AI in ophthalmology and demonstrate a certain level of perceived quality.

Data extraction and analysis
Following a rigorous selection process, relevant data were extracted and analyzed from the selected articles. Key themes, trends, advancements, and challenges related to the utilization of LLMs in ophthalmology were systematically synthesized.
In accordance with the PRISMA guidelines, this review adhered to a structured and rigorous approach, encompassing a comprehensive literature search, meticulous inclusion criteria, language considerations, and thorough data extraction (Figure 2). A total of 108 articles were independently screened for eligibility by two reviewers (Kai Jin and Lu Yuan), including assessments of titles and abstracts, followed by full-text review. Any disagreements were resolved through discussion with a third author (Juan Ye). Ultimately, 108 studies were included in the review.

Results
We finally included 108 studies. The results (Table 1) encompass a diverse range of studies in the field of ophthalmology, highlighting the versatile applications of LLMs. The results reflect a wide spectrum of LLM applications and subfields of interest in ophthalmology. They showcase the versatility of LLMs in addressing various aspects of automated question-answering (55 studies), diagnosis (5 studies), information screening (27 studies), summarization (5 studies), image analysis (5 studies), and predictive modeling (11 studies). Subfields encompass general ophthalmology (38 studies), retinal diseases (32 studies), anterior segment diseases (27 studies), glaucoma (6 studies), and ophthalmic plastics (5 studies) (Figure 3).
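The per-application and per-subfield counts reported in the Results can be checked against the stated total of 108 studies. The sketch below assumes each study is counted once per grouping, which the text does not state explicitly but which the sums are consistent with; the `shares` helper is a hypothetical name.

```python
# Counts as reported in the Results section.
applications = {
    "automated question-answering": 55,
    "diagnosis": 5,
    "information screening": 27,
    "summarization": 5,
    "image analysis": 5,
    "predictive modeling": 11,
}
subfields = {
    "general ophthalmology": 38,
    "retinal diseases": 32,
    "anterior segment diseases": 27,
    "glaucoma": 6,
    "ophthalmic plastics": 5,
}
TOTAL_INCLUDED = 108


def shares(counts):
    """Return each category's percentage of the grouping total (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


for label, counts in (("applications", applications), ("subfields", subfields)):
    print(label, "sum to", sum(counts.values()), "of", TOTAL_INCLUDED)
    print(shares(counts))
```

Both groupings sum to 108, and automated question-answering accounts for roughly half of the included studies.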

FIGURE 2
PRISMA 2020 flow diagram for this systematic review.


TABLE 1 Summary of representative current studies using LLM in ophthalmology.

Reference | Year | Publication | Subspeciality | Aim | Application | Approaches
Lin et al. (6) | 2023 | Eye | General ophthalmology | To compare the performance on a practice ophthalmology written examination | Automated question-answering | GPT-3.5, GPT-4
Antaki et al. (7) | 2023 | Ophthalmology Science | General ophthalmology | To evaluate the performance on ophthalmology questions | Automated question-answering | ChatGPT
Cai et al. (8) | 2023 | American Journal of Ophthalmology | General ophthalmology | To compare the performance on ophthalmology board-style questions | Automated question-answering | Bing Chat, ChatGPT 3.5, and ChatGPT 4.0
Mihalache et al. (9) | 2023 | JAMA Ophthalmology | General ophthalmology | To assess the performance on board certification exam in ophthalmology | Automated question-answering | ChatGPT
Bernstein et al. (10) | 2023 | JAMA Network Open | General ophthalmology | To generate ophthalmology advice | Automated question-answering | ChatGPT version 3.5
Ali et al. (11) | 2023 | Ophthalmic Plast Reconstr Surg | Lacrimal drainage disorders | To respond to lacrimal drainage disorders | Automated question-answering | ChatGPT
Tsui et al. (12) | 2023 | Eye | Posterior vitreous detachment, retinal tear and detachment, ocular surface disease, exudative age-related macular degeneration (eAMD), and post-intravitreal injection pain and redness | To respond to common ocular symptoms | Automated question-answering | ChatGPT
Potapenko et al. (13) | 2023 | Acta Ophthalmologica | Retinal diseases | To evaluate accuracy on patient information | Automated question-answering | ChatGPT
Momenaei et al. (14) | 2023 | Ophthalmology Retina | Retinal diseases | To evaluate the appropriateness and readability of the medical knowledge | Automated question-answering | ChatGPT-4
Waisberg et al. (15) | 2023 | Irish Journal of Medical Science | Anterior ischemic optic neuropathy | Fundus image analysis | Image analysis | GPT-4
Hu et al. (16) | 2022 | Transl Vis Sci Technol. | Glaucoma | To predict glaucoma progression requiring surgery | Predictive modeling | Pre-trained Transformers
Lee et al. (17) | 2023 | Ophthalmic Res | General ophthalmology | To assign procedural codes based on the surgical report | Predictive modeling | Bidirectional Encoder Representations from Transformers (BERT)
Liu et al. (18) | 2023 | AMIA | Retinal vascular disease | To provide a diagnosis based on FFA reports | Summarization | GPT3.5-Turbo
Yu et al. (19) | 2022 | BMC Medical Informatics and Decision Making | Diabetic retinopathy | To identify diabetic retinopathy-related clinical concepts and their attributes | Information screening | NLP (Extraction, Named entity recognition), DL, Pre-trained Transformers
Valentín-Bravo et al. (20) | 2023 | Arch Soc Esp Oftalmol. | Vitreoretinal disease | To write a scientific article | Information screening | ChatGPT, DALL-E 2
Singh et al. (4) | 2023 | Clin Exp Ophthalmol. | Dry eye disease | To conduct a literature review | Information screening | ChatGPT
Singh et al. (21) | 2023 | Seminars in Ophthalmology | Cornea, retina, glaucoma, pediatric ophthalmology, neuroophthalmology, and ophthalmic plastics surgery | To construct ophthalmic discharge summaries and operative notes | Information screening | ChatGPT
Rasmussen et al. (22) | 2023 | Graefe’s archive for clinical and experimental ophthalmology | Vernal keratoconjunctivitis | To provide responses to patient and parent questions | Automated question-answering | ChatGPT
Lim et al. (23) | 2023 | Ebiomedicine | Myopia | To deliver accurate responses to common myopia-related query | Automated question-answering | ChatGPT-3.5, ChatGPT-4.0, and Google Bard
Waisberg et al. (24) | 2023 | Annals of Biomedical Engineering | General ophthalmology | To write ophthalmic operative notes | Information screening | GPT-4


FIGURE 3
Major applications of LLM in Ophthalmology. The patient’s information like symptoms, medical history and other health-related details are inputted
into the LLM, which outputs valuable clinical insights to the physician and helps him or her make decisions.

General ophthalmology
The application of LLMs in ophthalmology is a rapidly growing field with promising potential, encompassing various aspects of patient care and clinical workflows. LLMs can analyze general ophthalmology patient data and medical records to recommend personalized diagnosis and treatment plans for individuals with specific eye conditions. Chatbots integrated with electronic health record (EHR) systems can access patient information to provide context-aware responses and support clinical decision-making.
The majority of current NLP applications in ophthalmology focus on extracting specific text, such as visual acuity, from free-text notes for the purposes of quantitative analysis (25). NLP also offers opportunities to develop search engines for data within free-text notes, clean notes, perform automated question-answering, and translate ophthalmology notes for other specialties or for patients. Low vision rehabilitation improves quality of life for visually impaired patients, and free-text progress notes within the EHR, processed with NLP, provide valuable information relevant to predicting patients’ visual prognosis (26). NLP with unstructured clinician notes supports low vision and blind rehabilitation for war veterans with traumatic brain injury based on veterans’ needs rather than system-level factors (27, 28). This suggests that AI with NLP may be particularly important for the performance of predictive models in ophthalmology. Given the potential of LLMs in healthcare and the increasing reliance of patients on online information, it is important to evaluate the quality of chatbot-generated advice and compare it with human-written advice from ophthalmologists. In one such comparison, a panel of ophthalmologists had a 61.3% accuracy in distinguishing between chatbot and human responses (10).
As chatbot technology is continually evolving, there are additional applications in general ophthalmology. One study evaluated the ability of ChatGPT to respond to ocular symptoms by scripting 10 prompts reflective of common patient messages relating to various ocular conditions (12). These conditions included posterior vitreous detachment, retinal tear and detachment, ocular surface disease, exudative age-related macular degeneration (eAMD), and post-intravitreal injection pain and redness. The abilities of ChatGPT in constructing discharge summaries and operative notes were evaluated in a study by Singh et al. (21). The study found that ChatGPT was able to construct ophthalmic discharge summaries and operative notes in a matter of seconds, with tailored responses based on the quality of inputs given. However, there were some limitations, such as the presence of generic text and factual inaccuracies in some responses. The authors suggest that ChatGPT can be utilized to minimize the time spent on discharge summaries and improve patient care, but it should be used with caution and human verification.
Another study aimed to assess the performance of an AI chatbot, ChatGPT, in answering practice questions for ophthalmology board certification examinations (9). ChatGPT correctly answered 46.4% of the questions, with the best performance in the category of general medicine (79%) and the poorest in retina and vitreous (0%). ChatGPT provided explanations and additional insight for 63% of questions but selected the same multiple-choice response as the most common answer provided by ophthalmology trainees only 44% of the time. Other researchers compared the performance of several generative AI models on ophthalmology board-style questions (6–8), including Bing Chat (Microsoft) and ChatGPT 3.5 and 4.0 (OpenAI). Performance was compared with that of human respondents. Results showed that ChatGPT-4.0 and Bing Chat performed comparably to human respondents.
Existing electronic differential diagnosis support tools, like the Isabel Pro Differential Diagnosis Generator, have limitations in terms of structured input and context-specific language processing. In one study, ChatGPT identified the correct diagnosis in 9 out of 10 cases and had the correct diagnosis listed in all 10 of its lists of differentials (29). Isabel, on the other hand, identified only 1 out of 10 provisional diagnoses correctly, but included the correct diagnosis in 7 out of 10 of its differential diagnosis lists. The median position of the correct diagnosis in the ranked differential lists was 1.0 for ChatGPT versus 5.5 for Isabel.
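The comparison between ChatGPT and Isabel rests on three summary statistics for ranked differential lists: how often the correct diagnosis is ranked first, how often it appears anywhere in the list, and its median position. A minimal sketch of how such statistics could be computed follows. The per-case ranks are hypothetical and are not data from the cited study (29), and computing the median only over cases where the diagnosis was listed is an assumption, since the review does not say how missing diagnoses were handled.

```python
from statistics import median


def summarize_ranks(ranks):
    """Summarize ranked differential lists.

    ranks: one entry per case, giving the position (1 = top) of the correct
    diagnosis in the list, or None if the correct diagnosis was not listed.
    """
    n = len(ranks)
    listed = [r for r in ranks if r is not None]
    return {
        "top1_rate": sum(1 for r in listed if r == 1) / n,
        "inclusion_rate": len(listed) / n,
        # Assumption: median is taken over listed cases only.
        "median_rank_when_listed": median(listed) if listed else None,
    }


# Hypothetical ranks for five cases, for illustration only.
example_ranks = [1, 1, 3, None, 2]
summary = summarize_ranks(example_ranks)
print(summary)
```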


Retinal diseases
Some studies evaluate the accuracy of an AI-based chatbot in providing patient information on common retinal diseases, including AMD, diabetic retinopathy (DR), retinal vein occlusion, retinal artery occlusion, and central serous chorioretinopathy.
In healthcare settings, when patients provide information about their medical history, symptoms, or other health-related details, there is the potential for miscommunication or misalignment between the patient’s perspective and the physician’s understanding of the situation. Traditional methods of obtaining patient information may lead to dissatisfaction if the information obtained misaligns with the physician’s information (30). ChatGPT can improve patient satisfaction in terms of information provision by providing accurate and well-formulated responses to various topics, including common retinal diseases (13). This accessibility can be particularly beneficial when ophthalmologists are not readily available. Among retinal diseases, DR is a leading cause of blindness in adults, and there is increasing interest in developing AI technologies to detect DR using EHRs. Most AI-based DR diagnoses are focused on medical images, but there is limited research exploring the lesion-related information captured in the free-text image reports. In the study by Yu et al. (19), two state-of-the-art transformer-based NLP models, BERT and RoBERTa, were examined and compared with a recurrent neural network implemented using long short-term memory (LSTM) to extract DR-related concepts from clinical narratives. The results show that for concept extraction, the BERT model pretrained with the MIMIC III dataset outperformed other models, achieving the highest performance with F1-scores of 0.9503 and 0.9645 for strict and lenient evaluations, respectively. The findings of this study could have a significant impact on the development of clinical decision support systems for DR diagnoses.

Anterior segment disease
Anterior segment vision-threatening disease included the diagnosis of corneal ulcer, iridocyclitis, hyphema, anterior scleritis, or scleritis with corneal involvement. Patients with anterior segment diseases present a diagnostic challenge for many primary care physicians. One group of researchers developed a decision support tool to predict vision-threatening anterior segment disease using primary clinical notes based on NLP (31). The ultimate prediction model exhibited an area under the curve (AUC) of 0.72, with a 95% confidence interval ranging from 0.67 to 0.77. Using a threshold that achieved a sensitivity of 90%, the model demonstrated a specificity of 30%, a positive predictive value of 5.8%, and a high negative predictive value of 99%. One study evaluated the accuracy of responses provided by ChatGPT to patient and parent questions on vernal keratoconjunctivitis (VKC), a complex and recurring disease primarily affecting children (22). The researchers formulated questions in four categories and assessed the chatbot’s responses for information accuracy. The chatbot was found to provide both relevant and inaccurate statements. Inaccurate statements were particularly observed regarding treatment and potential side effects of medications. A comparative analysis of the performance of three LLMs, namely ChatGPT-3.5, ChatGPT-4.0, and Google Bard, was conducted in delivering accurate and comprehensive responses to common myopia-related queries. ChatGPT-4.0 demonstrated the highest accuracy, with 80.6% of responses rated as ‘good’, compared to 61.3% in ChatGPT-3.5 and 54.8% in Google Bard (23).

Glaucoma
Previous studies have developed predictive models for glaucoma progression, but uncertainty remains on how to integrate the information in free-text clinical notes, which contain valuable clinical information (32). Some studies aim to predict glaucoma progression requiring surgery using deep learning approaches on EHRs and natural language processing of clinical free-text notes. Sunil et al. present an artificial intelligence approach to predict near-term glaucoma progression using clinical free-text notes and data from electronic health records (33). The authors developed models that combined structured data and text inputs to predict whether a glaucoma patient would require surgery within the following year. The model incorporating both structured clinical features and free-text features achieved the highest performance with an AUC of 0.899 and an F1 score of 0.745. Another study aims to fill the gap by developing a deep learning predictive model for glaucoma progression using both structured clinical data and natural language processing of clinical free-text notes from EHRs. The combination model showed the best AUC (0.731), followed by the text model (0.697) and the structured model (0.658) (34). Hu et al. (16) explored the use of transformer-based language models, specifically Bidirectional Encoder Representations from Transformers (BERT), to predict glaucoma progression requiring surgery using clinical free-text notes from EHRs. The results showed that the BERT models outperformed an ophthalmologist’s review of clinical notes in predicting glaucoma progression. Michelle et al. (35) utilized an automated pipeline for data extraction from EHRs to evaluate the real-world outcomes of glaucoma surgeries and found that tube shunt surgery had a higher risk of failure (Baerveldt: Hazard Ratio (HR) 1.44, 95% CI 1.02 to 2.02; Ahmed: HR 2.01, 95% CI 1.28 to 3.17).

Ophthalmic plastics
In the study conducted by Ali et al. (11), ChatGPT’s performance in providing information about primary acquired nasolacrimal duct obstruction and congenital nasolacrimal duct obstruction was evaluated. ChatGPT was also tested on the history and effectiveness of dacryocystorhinostomy surgery. Agreement among the three observers was high (95%) in grading the responses. The responses of ChatGPT were graded as correct for only 40% of the prompts, partially correct in 35%, and outright factually incorrect in 25%. Hence, some degree of factual inaccuracy was present in 60% of the responses when the partially correct responses are included.

Discussion
The newer generation of GPT models, exemplified by GPT-3 and beyond, differs from its predecessors through significantly larger model sizes, improved performance on various language tasks, enhanced few-shot learning abilities, and increased versatility, while also necessitating more substantial computational resources and raising ethical considerations.
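The screening figures reported for the anterior segment decision support tool (31), a sensitivity of 90% and specificity of 30% alongside a positive predictive value of 5.8% and a negative predictive value of 99%, can look contradictory until disease prevalence is taken into account. The sketch below applies Bayes' rule to show how the predictive values follow from sensitivity, specificity, and prevalence. The prevalence is not given in this review; the 4.5% used here is an assumed value chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported; prevalence is an ASSUMPTION.
ASSUMED_PREVALENCE = 0.045
ppv, npv = predictive_values(0.90, 0.30, ASSUMED_PREVALENCE)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under this assumed prevalence the PPV comes out near 6% and the NPV above 98%, in line with the reported values. When a condition is rare, even a sensitive test flags mostly false positives, while a negative result remains highly reassuring.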


Strengths Furthermore, there may be resistance or skepticism among some


healthcare professionals towards adopting AI-driven technologies like
AI technology, such as online chat-based AI language models, LLMs. It will be crucial to address these concerns, provide proper
has the potential to assist clinical workflows and augment patient training, and foster a collaborative environment where human experts
education and communication about common ophthalmology and AI work together synergistically (45).
diseases prevention queries (Table 1). GPT’s medical subspecialty The interpretability and explainability of the decisions made by
capabilities have improved significantly from GPT-3 to GPT-4. these models are another challenge. As they are often considered
Both LLMs struggled with image-based and higher-order “black boxes,” “understanding the reasoning behind their
ophthalmology questions, perhaps reflecting the importance of recommendations can be difficult,” leading to potential mistrust from
visual analysis in ophthalmology. Given the ongoing advances in clinicians and patients (46). Developing methods to make the models
computer vision, it may be possible to address this limitation in more transparent and explainable will be essential for their widespread
future LLMs. There is room for improvement in medical acceptance and adoption (47).
conversational agents, as all models exhibited instances of Lastly, the rapidly evolving nature of AI and language model
hallucination, incorrect justification, or non-logical reasoning (36). technologies demands continuous updates and improvements. Staying
Although ChatGPT 4.0 has demonstrated remarkable capabilities up-to-date with the latest advancements and incorporating new
in a variety of domains, the presence of these errors raises concerns knowledge into the models is essential to maintain their accuracy and
about the reliability of the system, especially in critical clinical relevance in the ever-changing field of ophthalmology.
decision making. While LLMs like ChatGPT offer tremendous potential in
Ophthalmologists are starting to use ChatGPT to help with ophthalmology, addressing the challenges of AI hallucination and
paperwork such as scientific articles, discharge summaries and misinformation is paramount. It is essential to consider the broader
operative notes (15, 24, 37). The scientific accuracy and reliability on societal implications, including patient trust, medical liability, ethical
certain topics were not sufficient to automatically generate concerns, scientific integrity, health disparities, and regulatory
scientifically rigorous articles. This was also objected to by some oversight when integrating AI into ophthalmic practices. Responsible
ophthalmologists (38). Firstly, operative notes are not general AI implementation and continuous monitoring are essential to
descriptions of surgical procedures and a specific patient has its own harness the benefits of AI while minimizing potential risks. One
unique characteristics. Secondly, operative notes are legal documents concern in the use of LLMs for medical applications is the lack of
and the surgeon is responsible for the accuracy and completeness of reproducibility, as these generative models may not consistently
the notes. Thirdly, there is no evidence that GPT-4 can accurately provide the same answers, potentially impacting the reliability of their
capture the unique aspects of individual cases in the real world, such
as intraoperative complications. Finally, the writing of operative notes
requires a degree of clinical decision-making and clinical judgment
that cannot be automated.

In a recent development, ChatGPT has emerged as an author or
co-author of scientific papers in the field of ophthalmology (39, 40).
This innovative inclusion has sparked discussions and garnered
attention from the scientific community. The presence of ChatGPT as
an author in scientific research reflects the evolving landscape of
artificial intelligence’s involvement in various domains, including
ophthalmology, opening avenues for new perspectives and
collaborative contributions.

Challenges

Despite the promising future, integrating LLMs into
ophthalmology also poses several challenges that need to be addressed.
Firstly, ensuring patient data privacy and maintaining the security of
sensitive medical information will be critical (41). These models
require vast amounts of data to achieve their potential, but
data-sharing must be conducted responsibly and in compliance with strict
ethical and legal guidelines (42, 43).

Another significant challenge is the potential for bias in the data
used to train these language models (44). If the data used for training
is not diverse enough, the models may exhibit biases that can lead to
inaccurate or unfair recommendations, particularly when dealing with
underrepresented populations. Efforts must be made to identify and
mitigate these biases to ensure equitable and reliable outcomes for
all patients.

outputs in clinical settings. Addressing these challenges will
be essential to fully realize the potential benefits of large language
models in ophthalmology and to ensure their responsible and ethical
implementation in patient care (48).

Future perspectives

The future perspectives of LLMs in ophthalmology hold
tremendous promise for transforming the landscape of eye care and
research (49). These advanced language models, powered by AI and
NLP, are poised to revolutionize how ophthalmologists diagnose, treat,
and manage various eye conditions. LLMs can be integrated with
image analysis techniques to create multimodal AI systems. These
systems can process both textual and visual information, enhancing
their capabilities in ophthalmology. For instance, LLMs can analyze
textual patient records and medical literature, while image analysis
algorithms can interpret medical images such as fundus photographs.
Through their ability to analyze vast amounts of medical literature,
patient data, and diagnostic images, these models can provide more
accurate and timely diagnoses, personalized treatment plans, and even
predict disease progression. The combination of LLMs and image
analysis can lead to more efficient and accurate decision-making in
ophthalmic practice. Additionally, LLMs can be used as tools to
support communication and knowledge exchange in the following
ways. While LLMs themselves do not directly facilitate communication
like human interaction, their capabilities can enhance and streamline
information exchange and knowledge sharing among eye care
professionals worldwide. As research and development in this field
continue to progress, we can expect these language models to become


indispensable tools that enhance efficiency, accessibility, and
ultimately improve patient outcomes in ophthalmology.

Limitations

This review acknowledges several potential limitations that may
have affected the comprehensiveness and potential bias of the
literature search and selection process. These limitations include
publication bias, language bias due to the focus on English-language
studies, potential database selection bias, the possibility of excluding
relevant studies due to search term restrictions, the limited date
range, and the predefined exclusion criteria that may have omitted
relevant research. The review also recognizes the potential for missed
references and acknowledges the subjectivity in reviewer bias, which
could impact study inclusion. Moreover, the review underscores the
importance of addressing these limitations to ensure a more
comprehensive and balanced assessment of the field of AI in
ophthalmology. Despite these potential constraints, the review
provides valuable insights into the applications and challenges of AI
in ophthalmology, but readers should consider these limitations
when interpreting the findings and drawing conclusions from
the review.

Author contributions

KJ: Conceptualization, Data curation, Funding acquisition,
Investigation, Methodology, Writing – original draft, Writing – review
& editing. LY: Data curation, Formal analysis, Investigation,
Methodology, Writing – original draft. HW: Formal analysis, Software,
Validation, Writing – review & editing. AG: Supervision, Validation,
Visualization, Writing – review & editing. JY: Conceptualization,
Funding acquisition, Project administration, Resources, Supervision,
Validation, Visualization, Writing – review & editing.

Funding

The author(s) declare financial support was received for the
research, authorship, and/or publication of this article. This work has
been financially supported by Natural Science Foundation of China
(grant number 82201195), and Clinical Medical Research Center for
Eye Diseases of Zhejiang Province (grant number 2021E50007).

Acknowledgments

Thanks to all the peer reviewers and editors for their opinions
and suggestions.

Conflict of interest

The authors declare that the research was conducted in the
absence of any commercial or financial relationships that could
be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors
and do not necessarily represent those of their affiliated organizations,
or those of the publisher, the editors and the reviewers. Any product
that may be evaluated in this article, or claim that may be made by its
manufacturer, is not guaranteed or endorsed by the publisher.

References
1. Jin K, Ye J. Artificial intelligence and deep learning in ophthalmology: current status and future perspectives. Adv Ophthalmol Pract Res. (2022) 2:100078. doi: 10.1016/j.aopr.2022.100078
2. Will ChatGPT transform healthcare? Nat Med. (2023) 29:505–6. doi: 10.1038/s41591-023-02289-5
3. Sharma P, Parasa S. ChatGPT and large language models in gastroenterology. Nat Rev Gastroenterol Hepatol. (2023) 20:481–2. doi: 10.1038/s41575-023-00799-8
4. Singhal K, Azizi S. Large language models encode clinical knowledge. Nature. (2023) 620:172–80. doi: 10.1038/s41586-023-06291-2
5. Arora A, Arora A. The promise of large language models in health care. Lancet (London, England). (2023) 401:641. doi: 10.1016/S0140-6736(23)00216-7
6. Lin JC, Younessi DN, Kurapati SS, Tang OY, Scott IU. Comparison of GPT-3.5, GPT-4, and human user performance on a practice ophthalmology written examination. Eye (London, England). (2023). doi: 10.1038/s41433-023-02564-2
7. Antaki F, Touma S, Milad D, El-Khoury J, Duval R. Evaluating the performance of ChatGPT in ophthalmology: an analysis of its successes and shortcomings. Ophthalmol Sci. (2023) 3:100324. doi: 10.1016/j.xops.2023.100324
8. Cai LZ, Shaheen A, Jin A, Fukui R, Yi JS, Yannuzzi N, et al. Performance of generative large language models on ophthalmology board style questions. Am J Ophthalmol. (2023) 254:141–9. doi: 10.1016/j.ajo.2023.05.024
9. Mihalache A, Popovic MM, Muni RH. Performance of an artificial intelligence Chatbot in ophthalmic knowledge assessment. JAMA Ophthal. (2023) 141:589–97. doi: 10.1001/jamaophthalmol.2023.1144
10. Bernstein IA, Zhang YV, Govil D, Majid I, Chang RT, Sun Y, et al. Comparison of ophthalmologist and large language model Chatbot responses to online patient eye care questions. JAMA Netw Open. (2023) 6:e2330320. doi: 10.1001/jamanetworkopen.2023.30320
11. Ali MJ. ChatGPT and lacrimal drainage disorders: performance and scope of improvement. Ophthal Plast Reconstr Surg. (2023) 39:221–5. doi: 10.1097/IOP.0000000000002418
12. Tsui JC, Wong MB, Kim BJ, Maguire AM, Scoles D. Appropriateness of ophthalmic symptoms triage by a popular online artificial intelligence chatbot. Eye (Lond). (2023). doi: 10.1038/s41433-023-02556-2
13. Potapenko I, Boberg-Ans LC, Stormly HM. Artificial intelligence-based chatbot patient information on common retinal diseases using ChatGPT. Acta Ophthalmol. (2023). doi: 10.1111/aos.15661
14. Momenaei B, Wakabayashi T, Shahlaee A, Durrani AF, Pandit SA, Wang K, et al. Appropriateness and readability of chatgpt-4-generated responses for surgical treatment of retinal diseases. Ophthalmol Retina. (2023) 7:862–8.
15. Waisberg E, Ong J, Masalkhi M, Kamran SA, Zaman N, Sarker P, et al. GPT-4: a new era of artificial intelligence in medicine. Ir J Med Sci. (2023). doi: 10.1007/s11845-023-03377-8
16. Hu W, Wang SY. Predicting Glaucoma progression requiring surgery using clinical free-text notes and transfer learning with transformers. Transl Vis Sci Technol. (2022) 11:37. doi: 10.1167/tvst.11.3.37
17. Lee YM, Bacchi S, Macri C, Tan Y, Casson R, Chan WO. Ophthalmology operation note encoding with open-source machine learning and natural language processing. Ophthalmic Res. (2023) 66:928–39.
18. Liu X, Wu J, Shao A, Shen W, Ye P, Wang Y, et al. Transforming retinal vascular disease classification: a comprehensive analysis of chatgpt’s performance and inference abilities on non-english clinical environment. medRxiv (2023). doi: 10.1101/2023.06.28.23291931
19. Yu Z, Yang X, Sweeting GL, Ma Y, Stolte SE, Fang R, et al. Identify diabetic retinopathy-related clinical concepts and their attributes using transformer-based natural language processing methods. BMC Med Inform Decis Mak. (2022) 22:255. doi: 10.1186/s12911-022-01996-2
20. Valentín-Bravo FJ, Mateos-Álvarez E, Usategui-Martín R, Andrés-Iglesias C, Pastor-Jimeno JC, Pastor-Idoate S. Artificial intelligence and new language models in ophthalmology: complications of the use of silicone oil in vitreoretinal surgery. Arch Soc Esp Oftalmol (Engl Ed). (2023) 98:298–303.
21. Singh S, Djalilian A, Ali MJ. ChatGPT and ophthalmology: exploring its potential with discharge summaries and operative notes. Semin Ophthalmol. (2023) 38:503–7. doi: 10.1080/08820538.2023.2209166
22. Rasmussen MLR, Larsen AC, Subhi Y, Potapenko I. Artificial intelligence-based ChatGPT chatbot responses for patient and parent questions on vernal keratoconjunctivitis. Graefes Arch Clin Exp Ophthalmol. (2023) 261:3041–3. doi: 10.1007/s00417-023-06078-1
23. Lim ZW, Pushpanathan K, Yew SME, Lai Y, Sun CH, Lam JSH, et al. Benchmarking large language models’ performances for myopia care: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google bard. EBioMedicine. (2023) 95:104770. doi: 10.1016/j.ebiom.2023.104770
24. Waisberg E, Ong J, Masalkhi M, Kamran SA, Zaman N, Sarker P, et al. GPT-4 and ophthalmology operative notes. Ann Biomed Eng. (2023). doi: 10.1007/s10439-023-03263-5
25. Chen JS, Baxter SL. Applications of natural language processing in ophthalmology: present and future. Front Med. (2022) 9:906554. doi: 10.3389/fmed.2022.1078403
26. Gui H, Tseng B, Hu W, Wang SY. Looking for low vision: predicting visual prognosis by fusing structured and free-text data from electronic health records. Int J Med Inform. (2022) 159:104678. doi: 10.1016/j.ijmedinf.2021.104678
27. Winkler SL, Finch D, Llanos I, Delikat J, Marszalek J, Rice C, et al. Retrospective analysis of vision rehabilitation for veterans with traumatic brain injury-related vision dysfunction. Mil Med. (2023) 188:e2982–6. doi: 10.1093/milmed/usad120
28. Winkler SL, Finch D, Wang X, Toyinbo P, Marszalek J, Rakoczy CM, et al. Veterans with traumatic brain injury-related ocular injury and vision dysfunction: recommendations for rehabilitation. Optom Vis Sci. (2022) 99:9–17. doi: 10.1097/OPX.0000000000001828
29. Balas M, Ing EB. Conversational ai models for ophthalmic diagnosis: comparison of chatgpt and the isabel pro differential diagnosis generator. JFO Open Ophthalmol. (2023) 1:100005. doi: 10.1016/j.jfop.2023.100005
30. Visser M, Deliens L, Houttekier D. Physician-related barriers to communication and patient- and family-centred decision-making towards the end of life in intensive care: a systematic review. Crit Care. (2014) 18:604. doi: 10.1186/s13054-014-0604-z
31. Singh K, Thibodeau A, Niziol LM, Nakai TK, Bixler JE, Khan M, et al. Development and validation of a model to predict anterior segment vision-threatening eye disease using primary care clinical notes. Cornea. (2022) 41:974–80. doi: 10.1097/ICO.0000000000002877
32. Salazar H, Misra V, Swaminathan SS. Artificial intelligence and complex statistical modeling in glaucoma diagnosis and management. Curr Opin Ophthalmol. (2021) 32:105–17. doi: 10.1097/ICU.0000000000000741
33. Jalamangala Shivananjaiah SK, Kumari S, Majid I, Wang SY. Predicting near-term glaucoma progression: an artificial intelligence approach using clinical free-text notes and data from electronic health records. Front Med. (2023) 10:1157016. doi: 10.3389/fmed.2023.1157016
34. Wang SY, Tseng B, Hernandez-Boussard T. Deep learning approaches for predicting Glaucoma progression using electronic health records and natural language processing. Ophthalmol Sci. (2022) 2:100127. doi: 10.1016/j.xops.2022.100127
35. Sun MT, Singh K, Wang SY. Real-world outcomes of Glaucoma filtration surgery using electronic health records: an informatics study. J Glaucoma. (2022) 31:847–53. doi: 10.1097/IJG.0000000000002122
36. Azamfirei R, Kudchadkar SR, Fackler J. Large language models and the perils of their hallucinations. Crit Care. (2023) 27:120. doi: 10.1186/s13054-023-04393-x
37. Asensio-Sánchez VM. Artificial intelligence and new language models in ophthalmology: complications of the use of silicone oil in vitreoretinal surgery. Arch Soc Esp Oftalmol (Engl Ed). (2023) 98:298–303.
38. Lawson MLA. Artificial intelligence in surgical documentation: a critical review of the role of large language models. Ann Biomed Eng. (2023). doi: 10.1007/s10439-023-03282-2
39. Salimi A, Saheb H. Large language models in ophthalmology scientific writing: ethical considerations blurred lines or not at all? Am J Ophthalmol. (2023) 254:177–81. doi: 10.1016/j.ajo.2023.06.004
40. Ali MJ, Singh S. ChatGPT and scientific abstract writing: pitfalls and caution. Graefes Arch Clin Exp Ophthalmol. (2023) 261:3205–6. doi: 10.1007/s00417-023-06123-z
41. Abdullah YI, Schuman JS, Shabsigh R, Caplan A, Al-Aswad LA. Ethics of artificial intelligence in medicine and ophthalmology. Asia-Pacific J. Ophthalmol. (Phila Pa). (2021) 10:289–98. doi: 10.1097/APO.0000000000000397
42. Shen Y, Heacock L, Elias J. ChatGPT and other large language models are double-edged swords. Radiology. (2023) 307:e230163. doi: 10.1148/radiol.230163
43. Tom E, Keane PA, Blazes M, Pasquale LR, Chiang MF, Lee AY, et al. Protecting data privacy in the age of AI-enabled ophthalmology. Transl Vis Sci Technol. (2020) 9:36. doi: 10.1167/tvst.9.2.36
44. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med. (2018) 178:1544–7. doi: 10.1001/jamainternmed.2018.3763
45. Dow ER, Keenan TDL, Lad EM, Lee AY, Lee CS, Loewenstein A, et al. From data to deployment: the collaborative community on ophthalmic imaging roadmap for artificial intelligence in age-related macular degeneration. Ophthalmology. (2022) 129:e43–59. doi: 10.1016/j.ophtha.2022.01.002
46. González-Gonzalo C, Thee EF, Klaver CCW, Lee AY, Schlingemann RO, Tufail A, et al. Trustworthy AI: closing the gap between development and integration of AI systems in ophthalmic practice. Prog Retin Eye Res. (2022) 90:101034. doi: 10.1016/j.preteyeres.2021.101034
47. Tools such as ChatGPT threaten transparent science; here are our ground rules for their use. Nature. (2023) 613:612. doi: 10.1038/d41586-023-00191-1
48. Chou YB, Kale AU, Lanzetta P, Aslam T, Barratt J, Danese C, et al. Current status and practical considerations of artificial intelligence use in screening and diagnosing retinal diseases: vision academy retinal expert consensus. Curr Opin Ophthalmol. (2023) 34:403–13. doi: 10.1097/ICU.0000000000000979
49. Li JO, Liu H, Ting DSJ, Jeon S, Chan RVP, Kim JE, et al. Digital technology, tele-medicine and artificial intelligence in ophthalmology: a global perspective. Prog Retin Eye Res. (2021) 82:100900. doi: 10.1016/j.preteyeres.2020.100900
