Exploring Large Language Model For Next Generation of Artificial Intelligence in Ophthalmology
Kai Jin 1†, Lu Yuan 2†, Hongkang Wu 1, Andrzej Grzybowski 3 and Juan Ye 1*
1 Eye Center, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
2 Department of Ophthalmology, The Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, Hangzhou, China
3 Institute for Research in Ophthalmology, Foundation for Ophthalmology Development, Poznan, Poland
REVIEWED BY
Jo-Hsuan "Sandy" Wu, University of California, San Diego, United States
Jana Lipkova, University of California, Irvine, United States
*CORRESPONDENCE
Juan Ye
[email protected]
Introduction
The history of artificial intelligence (AI) in medicine dates back to the 1950s when
researchers began to explore the use of computers to analyze medical data and make
diagnostic decisions. However, past methods had limitations in accuracy and speed and
still could not analyze unstructured medical data (1). Natural Language Processing (NLP)
is a subfield of AI that focuses on enabling computers to understand, interpret, and
generate human language. It involves the development of algorithms and models that can
process and analyze unstructured text data. Large Language Models (LLMs) are
advanced AI models, such as GPT-3 (Generative Pre-trained Transformer 3), that are
built on the transformer architecture. The transformer is a deep learning architecture
that efficiently captures context and dependencies in sequential data, making it a
fundamental choice for natural language processing tasks and beyond.
These models are trained on massive amounts of text data from the internet, enabling
them to generate human-like text and perform a wide range of NLP tasks with remarkable
accuracy and versatility. ChatGPT builds on the capabilities of large language models to generate coherent and contextually relevant responses, making it well-suited for chatbot applications. It is designed to generate human-like responses to a wide range of prompts and questions and may enhance healthcare delivery and patients’ quality of life (Figure 1) (2). The use of LLMs in healthcare offers several potential benefits.
ChatGPT and LLMs can be applied in various ways. They can serve as clinical documentation aids, helping with administrative tasks such as clinic scheduling, medical coding for billing, and generating preauthorization letters (3). LLMs can also be used as summarization tools, improving communication with patients and assisting in clinical trials. They can make processes such as curriculum design, testing of knowledge base, and continuing medical education more dynamic (4). LLMs can reduce the burden of administrative tasks for healthcare professionals, save time, and improve efficiency. They also have the potential to provide valuable clinical insights and support decision-making (5). This capability may help ophthalmologists by enabling evidence-based decision-making and may transform various aspects of eye care and research.

FIGURE 1
Workflow of large language model (LLM) for artificial intelligence (AI) in ophthalmology. Text (symptoms, medical history, etc.) and images (optical coherence tomography, fundus fluorescein angiography, etc.) are encoded and fed into a model that has been trained on a large amount of data, which can decode the relevant information required. LLM applications include automated question-answering, diagnosis, information screening, summarization, image analysis, and predictive modeling.

Abbreviations: AI, Artificial Intelligence; NLP, Natural Language Processing; LLM, Large Language Model; GPT, Generative Pre-trained Transformer; EHR, Electronic Health Records; eAMD, exudative Age-related Macular Degeneration; DR, Diabetic Retinopathy; OCT, Optical Coherence Tomography; BERT, Bidirectional Encoder Representations from Transformers.

Method of literature search
For this review, we followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.

Study selection and search strategy
We conducted a comprehensive literature search following the PRISMA guidelines. Searches were performed on PubMed and Google Scholar databases, spanning from January 2016 to June 2023. Keywords were selected from two distinct categories: ophthalmology-related terms (ophthalmology, eye diseases, eye disorders) and large language model-related terms (large language models, ChatGPT, natural language processing, chatbots). The search strategy involved the use of the following keywords: (“Ophthalmology” OR “Eye Diseases” OR “Eye Disorders”) AND (“Large Language Models” OR “ChatGPT” OR “Natural Language Processing” OR “Chatbots”). The terms from each category were cross-referenced independently with terms from the other category.

Inclusion and exclusion criteria
We established specific inclusion criteria for article selection. The publication period considered research from January 2016 to June 2023 to ensure the inclusion of up-to-date findings. Initially, 6,130 articles were identified through titles and abstracts. We prioritized research quality and the application of Large Language Models (LLMs) in our selection process. Additionally, articles published prior to 2016 were included for historical context, as were those pertinent to closely related topics.
Studies meeting the following criteria were excluded: (1) duplicate literature previously included in the review, (2) irrelevant topics, where the article is unrelated to ophthalmology or the application of the large language model, (3) conference abstracts, and (4) non-original research, such as editorials, case reports or commentaries.
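As an illustration of the search strategy described under "Study selection and search strategy", the sketch below assembles the Boolean search string from the two keyword categories and enumerates the pairwise cross-references between them. It is a minimal sketch only: the function names (`or_group`, `build_query`) and variable names are hypothetical, straight quotation marks are used in place of the typographic ones, and the review does not state how the string was actually entered into PubMed or Google Scholar.

```python
from itertools import product

# Keyword categories as listed in the review's search strategy.
OPHTHALMOLOGY_TERMS = ["Ophthalmology", "Eye Diseases", "Eye Disorders"]
LLM_TERMS = [
    "Large Language Models",
    "ChatGPT",
    "Natural Language Processing",
    "Chatbots",
]


def or_group(terms):
    """Quote each term, join the terms with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(group_a, group_b):
    """Combine two OR-groups with AND (hypothetical helper, not from the review)."""
    return f"{or_group(group_a)} AND {or_group(group_b)}"


# Full Boolean search string.
query = build_query(OPHTHALMOLOGY_TERMS, LLM_TERMS)

# Pairwise cross-referencing: every ophthalmology term with every LLM term.
pairs = list(product(OPHTHALMOLOGY_TERMS, LLM_TERMS))

if __name__ == "__main__":
    print(query)
    for ophthalmology_term, llm_term in pairs:
        print(f'"{ophthalmology_term}" AND "{llm_term}"')
```

Keeping the two categories as separate lists makes it easy to add or drop a keyword and regenerate both the combined string and the individual term pairs.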
Language considerations
A comprehensive review was conducted primarily on English-language articles, totaling 6,130 papers. Furthermore, we evaluated 14 papers predominantly published in Chinese. For articles in languages such as French, Spanish, and German, we assessed their abstracts. This multilingual approach allowed us to comprehensively evaluate the literature. The primary inclusion criterion required research to specifically address the application of AI in ophthalmology and demonstrate a certain level of perceived quality.

Data extraction and analysis
Following a rigorous selection process, relevant data were extracted and analyzed from the selected articles. Key themes, trends, advancements, and challenges related to the utilization of LLMs in ophthalmology were systematically synthesized.
In accordance with the PRISMA guidelines, this review adhered to a structured and rigorous approach, encompassing a comprehensive literature search, meticulous inclusion criteria, language considerations, and thorough data extraction (Figure 2). A total of 108 articles were independently screened for eligibility by two reviewers (Kai Jin and Lu Yuan), including assessments of titles and abstracts, followed by full-text review. Any disagreements were resolved through discussion with a third author (Juan Ye). Ultimately, 108 studies were included in the review.

Results
We finally included 108 studies. The results (Table 1) encompass a diverse range of studies in the field of ophthalmology and reflect a wide spectrum of LLM applications and subfields of interest. By application, the studies cover automated question-answering (55 studies), diagnosis (5 studies), information screening (27 studies), summarization (5 studies), image analysis (5 studies), and predictive modeling (11 studies). Subfields encompass general ophthalmology (38 studies), retinal diseases (32 studies), anterior segment diseases (27 studies), glaucoma (6 studies), and ophthalmic plastics (5 studies) (Figure 3).
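The per-category counts reported in the Results can be checked against the 108 included studies with a few lines of arithmetic. The sketch below is illustrative only: the dictionary and function names are hypothetical, and it assumes that each study is assigned to exactly one application and exactly one subfield, which the text does not state explicitly but which is consistent with both breakdowns summing to 108.

```python
TOTAL_INCLUDED = 108

# Counts by application, as reported in the Results.
APPLICATION_COUNTS = {
    "automated question-answering": 55,
    "diagnosis": 5,
    "information screening": 27,
    "summarization": 5,
    "image analysis": 5,
    "predictive modeling": 11,
}

# Counts by ophthalmic subfield, as reported in the Results.
SUBFIELD_COUNTS = {
    "general ophthalmology": 38,
    "retinal diseases": 32,
    "anterior segment diseases": 27,
    "glaucoma": 6,
    "ophthalmic plastics": 5,
}


def percentage_shares(counts, total):
    """Return each category's share of the total, in percent, rounded to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


application_total = sum(APPLICATION_COUNTS.values())
subfield_total = sum(SUBFIELD_COUNTS.values())
application_shares = percentage_shares(APPLICATION_COUNTS, TOTAL_INCLUDED)

if __name__ == "__main__":
    print("applications:", application_total, "subfields:", subfield_total)
    for name, share in application_shares.items():
        print(f"{name}: {share}%")
```

Both breakdowns add up to the 108 included studies, and automated question-answering alone accounts for roughly half of them.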
FIGURE 2
PRISMA 2020 flow diagram for this systematic review.
| Study | Year | Journal | Subfield | Aim | Application | Model |
| --- | --- | --- | --- | --- | --- | --- |
| Antaki et al. (7) | 2023 | Ophthalmology Science | General ophthalmology | To evaluate the performance on ophthalmology questions | Automated question-answering | ChatGPT |
| Cai et al. (8) | 2023 | American Journal of Ophthalmology | General ophthalmology | To compare the performance on ophthalmology board-style questions | Automated question-answering | Bing Chat, ChatGPT 3.5, and ChatGPT 4.0 |
| Mihalache et al. (9) | 2023 | JAMA Ophthalmology | General ophthalmology | To assess the performance on board certification exam in ophthalmology | Automated question-answering | ChatGPT |
| Bernstein et al. (10) | 2023 | JAMA Network Open | General ophthalmology | To generate ophthalmology advice | Automated question-answering | ChatGPT version 3.5 |
| Ali et al. (11) | 2023 | Ophthalmic Plast Reconstr Surg | Lacrimal drainage disorders | To respond to questions on lacrimal drainage disorders | Automated question-answering | ChatGPT |
| Tsui et al. (12) | 2023 | Eye | Posterior vitreous detachment, retinal tear and detachment, ocular surface disease, exudative age-related macular degeneration (eAMD), and post-intravitreal injection pain and redness | To respond to common ocular symptoms | Automated question-answering | ChatGPT |
| Potapenko et al. (13) | 2023 | Acta Ophthalmologica | Retinal diseases | To evaluate accuracy on patient information | Automated question-answering | ChatGPT |
| Momenaei et al. (14) | 2023 | Ophthalmology Retina | Retinal diseases | To evaluate the appropriateness and readability of the medical knowledge | Automated question-answering | ChatGPT-4 |
| Waisberg et al. (15) | 2023 | Irish Journal of Medical Science | Anterior ischemic optic neuropathy | Fundus image analysis | Image analysis | GPT-4 |
| Hu et al. (16) | 2022 | Transl Vis Sci Technol. | Glaucoma | To predict glaucoma progression requiring surgery | Predictive modeling | Pre-trained Transformers |
| Lee et al. (17) | 2023 | Ophthalmic Res | General ophthalmology | To assign procedural codes based on the surgical report | Predictive modeling | Bidirectional Encoder Representations from Transformers (BERT) |
| Liu et al. (18) | 2023 | AMIA | Retinal vascular disease | To provide a diagnosis based on FFA reports | Summarization | GPT3.5-Turbo |
| Yu et al. (19) | 2022 | BMC Medical Informatics and Decision Making | Diabetic retinopathy | To identify diabetic retinopathy-related clinical concepts and their attributes | Information screening | NLP (extraction, named entity recognition), DL, Pre-trained Transformers |
| Valentín-Bravo et al. (20) | 2023 | Arch Soc Esp Oftalmol. | Vitreoretinal disease | To write a scientific article | Information screening | ChatGPT, DALL-E 2 |
| Singh et al. (4) | 2023 | Clin Exp Ophthalmol. | Dry eye disease | To conduct a literature review | Information screening | ChatGPT |
| Singh et al. (21) | 2023 | Seminars in Ophthalmology | Cornea, retina, glaucoma, pediatric ophthalmology, neuroophthalmology, and ophthalmic plastics surgery | To construct ophthalmic discharge summaries and operative notes | Information screening | ChatGPT |
| Rasmussen et al. (22) | 2023 | Graefe’s archive for clinical and experimental ophthalmology | Vernal keratoconjunctivitis | To provide responses to patient and parent questions | Automated question-answering | ChatGPT |
| Lim et al. (23) | 2023 | EBioMedicine | Myopia | To deliver accurate responses to common myopia-related queries | Automated question-answering | ChatGPT-3.5, ChatGPT-4.0, and Google Bard |
| Waisberg et al. (24) | 2023 | Annals of Biomedical Engineering | General ophthalmology | To write ophthalmic operative notes | Information screening | GPT-4 |
FIGURE 3
Major applications of LLM in Ophthalmology. The patient’s information like symptoms, medical history and other health-related details are inputted
into the LLM, which outputs valuable clinical insights to the physician and helps him or her make decisions.
indispensable tools that enhance efficiency, accessibility, and ultimately improve patient outcomes in ophthalmology.

Limitations
This review acknowledges several potential limitations that may have affected the comprehensiveness and potential bias of the literature search and selection process. These limitations include publication bias, language bias due to the focus on English-language studies, potential database selection bias, the possibility of excluding relevant studies due to search term restrictions, the limited date range, and the predefined exclusion criteria that may have omitted relevant research. The review also recognizes the potential for missed references and acknowledges the subjectivity in reviewer bias, which could impact study inclusion. Moreover, the review underscores the importance of addressing these limitations to ensure a more comprehensive and balanced assessment of the field of AI in ophthalmology. Despite these potential constraints, the review provides valuable insights into the applications and challenges of AI in ophthalmology, but readers should consider these limitations when interpreting the findings and drawing conclusions from the review.

Author contributions
KJ: Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Writing – original draft, Writing – review & editing. LY: Data curation, Formal analysis, Investigation, Methodology, Writing – original draft. HW: Formal analysis, Software, Validation, Writing – review & editing. AG: Supervision, Validation, Visualization, Writing – review & editing. JY: Conceptualization, Funding acquisition, Project administration, Resources, Supervision, Validation, Visualization, Writing – review & editing.

Funding
The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work has been financially supported by Natural Science Foundation of China (grant number 82201195), and Clinical Medical Research Center for Eye Diseases of Zhejiang Province (grant number 2021E50007).

Acknowledgments
Thanks to all the peer reviewers and editors for their opinions and suggestions.

Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References
1. Jin K, Ye J. Artificial intelligence and deep learning in ophthalmology: current status and future perspectives. Adv Ophthalmol Pract Res. (2022) 2:100078. doi: 10.1016/j.aopr.2022.100078
2. Will ChatGPT transform healthcare? Nat Med. (2023) 29:505–6. doi: 10.1038/s41591-023-02289-5
3. Sharma P, Parasa S. ChatGPT and large language models in gastroenterology. Nat Rev Gastroenterol Hepatol. (2023) 20:481–2. doi: 10.1038/s41575-023-00799-8
4. Singhal K, Azizi S. Large language models encode clinical knowledge. Nature. (2023) 620:172–80. doi: 10.1038/s41586-023-06291-2
5. Arora A, Arora A. The promise of large language models in health care. Lancet (London, England). (2023) 401:641. doi: 10.1016/S0140-6736(23)00216-7
6. Lin JC, Younessi DN, Kurapati SS, Tang OY, Scott IU. Comparison of GPT-3.5, GPT-4, and human user performance on a practice ophthalmology written examination. Eye (London, England). (2023). doi: 10.1038/s41433-023-02564-2
7. Antaki F, Touma S, Milad D, El-Khoury J, Duval R. Evaluating the performance of ChatGPT in ophthalmology: an analysis of its successes and shortcomings. Ophthalmol Sci. (2023) 3:100324. doi: 10.1016/j.xops.2023.100324
8. Cai LZ, Shaheen A, Jin A, Fukui R, Yi JS, Yannuzzi N, et al. Performance of generative large language models on ophthalmology board style questions. Am J Ophthalmol. (2023) 254:141–9. doi: 10.1016/j.ajo.2023.05.024
9. Mihalache A, Popovic MM, Muni RH. Performance of an artificial intelligence Chatbot in ophthalmic knowledge assessment. JAMA Ophthal. (2023) 141:589–97. doi: 10.1001/jamaophthalmol.2023.1144
10. Bernstein IA, Zhang YV, Govil D, Majid I, Chang RT, Sun Y, et al. Comparison of ophthalmologist and large language model Chatbot responses to online patient eye care questions. JAMA Netw Open. (2023) 6:e2330320. doi: 10.1001/jamanetworkopen.2023.30320
11. Ali MJ. ChatGPT and lacrimal drainage disorders: performance and scope of improvement. Ophthal Plast Reconstr Surg. (2023) 39:221–5. doi: 10.1097/IOP.0000000000002418
12. Tsui JC, Wong MB, Kim BJ, Maguire AM, Scoles D. Appropriateness of ophthalmic symptoms triage by a popular online artificial intelligence chatbot. Eye (Lond). (2023). doi: 10.1038/s41433-023-02556-2
13. Potapenko I, Boberg-Ans LC, Stormly HM. Artificial intelligence-based chatbot patient information on common retinal diseases using ChatGPT. Acta Ophthalmol. (2023). doi: 10.1111/aos.15661
14. Momenaei B, Wakabayashi T, Shahlaee A, Durrani AF, Pandit SA, Wang K, et al. Appropriateness and readability of chatgpt-4-generated responses for surgical treatment of retinal diseases. Ophthalmol Retina. (2023) 7:862–8.
15. Waisberg E, Ong J, Masalkhi M, Kamran SA, Zaman N, Sarker P, et al. GPT-4: a new era of artificial intelligence in medicine. Ir J Med Sci. (2023). doi: 10.1007/s11845-023-03377-8
16. Hu W, Wang SY. Predicting Glaucoma progression requiring surgery using clinical free-text notes and transfer learning with transformers. Transl Vis Sci Technol. (2022) 11:37. doi: 10.1167/tvst.11.3.37
17. Lee YM, Bacchi S, Macri C, Tan Y, Casson R, Chan WO. Ophthalmology operation note encoding with open-source machine learning and natural language processing. Ophthalmic Res. (2023) 66:928–39.
18. Liu X, Wu J, Shao A, Shen W, Ye P, Wang Y, et al. Transforming retinal vascular disease classification: a comprehensive analysis of chatgpt’s performance and inference abilities on non-english clinical environment. medRxiv (2023). doi: 10.1101/2023.06.28.23291931
19. Yu Z, Yang X, Sweeting GL, Ma Y, Stolte SE, Fang R, et al. Identify diabetic retinopathy-related clinical concepts and their attributes using transformer-based natural language processing methods. BMC Med Inform Decis Mak. (2022) 22:255. doi: 10.1186/s12911-022-01996-2
20. Valentín-Bravo FJ, Mateos-Álvarez E, Usategui-Martín R, Andrés-Iglesias C, Pastor-Jimeno JC, Pastor-Idoate S. Artificial intelligence and new language models in ophthalmology: complications of the use of silicone oil in vitreoretinal surgery. Arch Soc Esp Oftalmol (Engl Ed). (2023) 98:298–303.
21. Singh S, Djalilian A, Ali MJ. ChatGPT and ophthalmology: exploring its potential with discharge summaries and operative notes. Semin Ophthalmol. (2023) 38:503–7. doi: 10.1080/08820538.2023.2209166
22. Rasmussen MLR, Larsen AC, Subhi Y, Potapenko I. Artificial intelligence-based ChatGPT chatbot responses for patient and parent questions on vernal keratoconjunctivitis. Graefes Arch Clin Exp Ophthalmol. (2023) 261:3041–3. doi: 10.1007/s00417-023-06078-1
23. Lim ZW, Pushpanathan K, Yew SME, Lai Y, Sun CH, Lam JSH, et al. Benchmarking large language models’ performances for myopia care: a comparative analysis of ChatGPT-3.5, ChatGPT-4.0, and Google bard. EBioMedicine. (2023) 95:104770. doi: 10.1016/j.ebiom.2023.104770
24. Waisberg E, Ong J, Masalkhi M, Kamran SA, Zaman N, Sarker P, et al. GPT-4 and ophthalmology operative notes. Ann Biomed Eng. (2023). doi: 10.1007/s10439-023-03263-5
25. Chen JS, Baxter SL. Applications of natural language processing in ophthalmology: present and future. Front Med. (2022) 9:906554. doi: 10.3389/fmed.2022.1078403
26. Gui H, Tseng B, Hu W, Wang SY. Looking for low vision: predicting visual prognosis by fusing structured and free-text data from electronic health records. Int J Med Inform. (2022) 159:104678. doi: 10.1016/j.ijmedinf.2021.104678
27. Winkler SL, Finch D, Llanos I, Delikat J, Marszalek J, Rice C, et al. Retrospective analysis of vision rehabilitation for veterans with traumatic brain injury-related vision dysfunction. Mil Med. (2023) 188:e2982–6. doi: 10.1093/milmed/usad120
28. Winkler SL, Finch D, Wang X, Toyinbo P, Marszalek J, Rakoczy CM, et al. Veterans with traumatic brain injury-related ocular injury and vision dysfunction: recommendations for rehabilitation. Optom Vis Sci. (2022) 99:9–17. doi: 10.1097/OPX.0000000000001828
29. Balas M, Ing EB. Conversational ai models for ophthalmic diagnosis: comparison of chatgpt and the isabel pro differential diagnosis generator. JFO Open Ophthalmol. (2023) 1:100005. doi: 10.1016/j.jfop.2023.100005
30. Visser M, Deliens L, Houttekier D. Physician-related barriers to communication and patient- and family-centred decision-making towards the end of life in intensive care: a systematic review. Crit Care. (2014) 18:604. doi: 10.1186/s13054-014-0604-z
31. Singh K, Thibodeau A, Niziol LM, Nakai TK, Bixler JE, Khan M, et al. Development and validation of a model to predict anterior segment vision-threatening eye disease using primary care clinical notes. Cornea. (2022) 41:974–80. doi: 10.1097/ICO.0000000000002877
32. Salazar H, Misra V, Swaminathan SS. Artificial intelligence and complex statistical modeling in glaucoma diagnosis and management. Curr Opin Ophthalmol. (2021) 32:105–17. doi: 10.1097/ICU.0000000000000741
33. Jalamangala Shivananjaiah SK, Kumari S, Majid I, Wang SY. Predicting near-term glaucoma progression: an artificial intelligence approach using clinical free-text notes and data from electronic health records. Front Med. (2023) 10:1157016. doi: 10.3389/fmed.2023.1157016
34. Wang SY, Tseng B, Hernandez-Boussard T. Deep learning approaches for predicting Glaucoma progression using electronic health records and natural language processing. Ophthalmol Sci. (2022) 2:100127. doi: 10.1016/j.xops.2022.100127
35. Sun MT, Singh K, Wang SY. Real-world outcomes of Glaucoma filtration surgery using electronic health records: an informatics study. J Glaucoma. (2022) 31:847–53. doi: 10.1097/IJG.0000000000002122
36. Azamfirei R, Kudchadkar SR, Fackler J. Large language models and the perils of their hallucinations. Crit Care. (2023) 27:120. doi: 10.1186/s13054-023-04393-x
37. Asensio-Sánchez VM. Artificial intelligence and new language models in ophthalmology: complications of the use of silicone oil in vitreoretinal surgery. Arch Soc Esp Oftalmol (Engl Ed). (2023) 98:298–303.
38. Lawson MLA. Artificial intelligence in surgical documentation: a critical review of the role of large language models. Ann Biomed Eng. (2023). doi: 10.1007/s10439-023-03282-2
39. Salimi A, Saheb H. Large language models in ophthalmology scientific writing: ethical considerations blurred lines or not at all? Am J Ophthalmol. (2023) 254:177–81. doi: 10.1016/j.ajo.2023.06.004
40. Ali MJ, Singh S. ChatGPT and scientific abstract writing: pitfalls and caution. Graefes Arch Clin Exp Ophthalmol. (2023) 261:3205–6. doi: 10.1007/s00417-023-06123-z
41. Abdullah YI, Schuman JS, Shabsigh R, Caplan A, Al-Aswad LA. Ethics of artificial intelligence in medicine and ophthalmology. Asia-Pacific J. Ophthalmol. (Phila Pa). (2021) 10:289–98. doi: 10.1097/APO.0000000000000397
42. Shen Y, Heacock L, Elias J. ChatGPT and other large language models are double-edged swords. Radiology. (2023) 307:e230163. doi: 10.1148/radiol.230163
43. Tom E, Keane PA, Blazes M, Pasquale LR, Chiang MF, Lee AY, et al. Protecting data privacy in the age of AI-enabled ophthalmology. Transl Vis Sci Technol. (2020) 9:36. doi: 10.1167/tvst.9.2.36
44. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med. (2018) 178:1544–7. doi: 10.1001/jamainternmed.2018.3763
45. Dow ER, Keenan TDL, Lad EM, Lee AY, Lee CS, Loewenstein A, et al. From data to deployment: the collaborative community on ophthalmic imaging roadmap for artificial intelligence in age-related macular degeneration. Ophthalmology. (2022) 129:e43–59. doi: 10.1016/j.ophtha.2022.01.002
46. González-Gonzalo C, Thee EF, Klaver CCW, Lee AY, Schlingemann RO, Tufail A, et al. Trustworthy AI: closing the gap between development and integration of AI systems in ophthalmic practice. Prog Retin Eye Res. (2022) 90:101034. doi: 10.1016/j.preteyeres.2021.101034
47. Tools such as ChatGPT threaten transparent science; here are our ground rules for their use. Nature. (2023) 613:612. doi: 10.1038/d41586-023-00191-1
48. Chou YB, Kale AU, Lanzetta P, Aslam T, Barratt J, Danese C, et al. Current status and practical considerations of artificial intelligence use in screening and diagnosing retinal diseases: vision academy retinal expert consensus. Curr Opin Ophthalmol. (2023) 34:403–13. doi: 10.1097/ICU.0000000000000979
49. Li JO, Liu H, Ting DSJ, Jeon S, Chan RVP, Kim JE, et al. Digital technology, telemedicine and artificial intelligence in ophthalmology: a global perspective. Prog Retin Eye Res. (2021) 82:100900. doi: 10.1016/j.preteyeres.2020.100900