REVIEW ARTICLE

Big Data in Ophthalmology


Ching-Yu Cheng, MD, PhD*†, Zhi Da Soh, MPH*, Shivani Majithia, OD*, Sahil Thakur, MS*, Tyler Hyungtaek Rim, MD, PhD*†, Yih Chung Tham, PhD*†, and Tien Yin Wong, MD, PhD*†

Abstract: Big data is the fuel of mankind’s fourth industrial revolution. Coupled with new technology such as artificial intelligence and deep learning, the potential of big data is poised to be harnessed to its fullest in the years to come. In ophthalmology, given the data-intensive nature of this specialty, big data will similarly play an important role. Electronic medical records, administrative and health insurance databases, mega national biobanks, crowd source data from mobile applications and social media, and international epidemiology consortia are emerging forms of “big data” in ophthalmology. In this review, we discuss the characteristics of big data, its potential applications in ophthalmology, and the challenges in leveraging and using these data. Importantly, in the next phase of work, it will be pertinent to further translate “big data” findings into real-world applications, to improve quality of eye care, and cost-effectiveness and efficiency of health services in ophthalmology.

Key Words: big data, artificial intelligence, internet of things, ophthalmology, data science

(Asia Pac J Ophthalmol (Phila) 2020;9:291–298)

From the *Singapore Eye Research Institute, Singapore National Eye Centre, Singapore; and †Ophthalmology & Visual Sciences Academic Clinical Program (Eye ACP), Duke-NUS Medical School, Singapore.
Submitted April 21, 2020; accepted June 3, 2020.
C-Y.C. and Z.D.S. contributed equally to this study.
Conflicts of Interest: The authors report no conflicts of interest.
Financial support: This study was supported by National Medical Research Council (NMRC), Singapore (NMRC/CIRG/1442/2016 and NMRC/CSA-SI/0012/2017). The sponsor or funding organization had no role in the design or conduct of this research.
Correspondence: Prof Ching-Yu Cheng, The Academia, 20, College Road, Discovery Tower level 6, Singapore 169856. E-mail: [email protected].
Copyright © 2020 Asia-Pacific Academy of Ophthalmology. Published by Wolters Kluwer Health, Inc. on behalf of the Asia-Pacific Academy of Ophthalmology. This is an open access article distributed under the terms of the Creative Commons Attribution-Non Commercial-No Derivatives License 4.0 (CC BY-NC-ND), where it is permissible to download and share the work provided it is properly cited. The work cannot be changed in any way or used commercially without permission from the journal.
ISSN: 2162-0989
DOI: 10.1097/APO.0000000000000304

The fourth industrial revolution has dawned upon us, driven largely by mankind’s ability to harness enormous interconnected mobile, wireless, and digital devices (referred to as internet-of-things or IoT), and the computational prowess to expertly learn, think, and act like humans in real-time (referred to as artificial intelligence or AI).1 Once thought of as a distant fantasy, this IoT and AI revolution is not only a present reality, but one that is already intertwined in every facet of our daily lives, and it will only play a larger role in the decades to come.2

The fuel of this revolution is data, or rather, “big data.”3 Although a universal definition of big data remains elusive, it generally represents the rapid aggregation of a large amount of diverse and constantly changing attributes that are too complex or “big” to be handled by traditional means.3 Thus, big data encompasses more than just a large “volume” of data.

WHAT IS BIG DATA?

There is a misconception of what “big data” is. Beyond a simply large volume of data, big data has been further characterized by its variety (how diverse), velocity (how fast), veracity (how accurate), and value (how useful). Volume is the sheer size of a database involved in big data, which is often measured in petabytes or exabytes.4 Variety refers to the diversity of data collected, which has been largely expanded by advancements in omics technologies and the emergence of electronic medical records in the last decade.4 This has created a plethora of data from different sources (eg, biobank, administrative databases, and so on), in different formats (ie, unstructured vs structured), and for different purposes (eg, administration, clinical care, research, and so on). Velocity is defined by the speed of incoming data, and the burgeoning use of wearable computing devices has rendered data velocity near or at real time.4 Veracity represents the accuracy and quality of data curated; data should ideally be clean, complete, consistent, current, and compliant.3,4 Lastly, value indicates the usefulness of collected data in studying changes in clinical outcomes, behavioral modification, improvement in workflow, and monetization potential.3

APPLICATIONS OF BIG DATA IN MEDICINE AND HEALTH CARE

There are significant benefits that big data can bring to biomedical research, clinical medicine, and health care. Big data could potentially transform the way health care is perceived, practiced, and delivered.5

First, big data has the potential to enhance research and redefine the boundaries of traditional research methodology. Big data can generate greater scientific insights into hypotheses that would otherwise be unanswered or answered inadequately by traditional “small data” studies such as randomized clinical trials and even observational studies.6 Big data also allows for real-time assessment of interventions in real-world settings, and provides greater statistical power to detect novel or subtle associations.7 In addition, big data allows for more in-depth understanding of disease pathogenesis,5 such as in genetics and proteomics studies. An example can be seen in cancer research, where big data has provided the platform to integrate multiple large-scale omics datasets to form molecular meta-networks.8 This aggregation better facilitates the identification of key regulatory and dysregulatory elements in different cancers, thereby providing

Cheng et al. Asia-Pacific Journal of Ophthalmology, Volume 9, Number 4, July/August 2020

greater insights into tumor development and modifiers of treatment response.9,10 Furthermore, big data mitigates common issues in “small data” studies, such as selection bias and limited generalizability.7,11

Second, big data can improve clinical care through the development of sophisticated algorithms that provide physicians with a holistic representation of an individual’s health status.3 Big data is the foundation of the practice of precision medicine,3 improves prediction of health outcomes and disease progression for timely interventions,5 and aids in standardizing care by providing diagnostic and/or therapeutic recommendations based on aggregated inputs from fellow physicians and other resources.6 For example, in cardiology, big data has been utilized to develop software for imaging, risk profiling, genetic assessment, detecting anomalies on electrocardiograms, and predicting cardiovascular events.12 In fact, cardiology had the second-highest number of Food and Drug Administration (FDA)-approved algorithms in medicine as of 2019, with automation ranging from early detection of atrial fibrillation to quantification of coronary artery calcification.13

Finally, big data may be utilized to evaluate the effectiveness of public health policies, which include the provision of health care services and resources.5,14 Furthermore, IoT and digital devices are now delivering health information directly to individuals, which empowers them to play a more active role not only in managing their health, but also in altering the way in which health care services are sought and delivered.6 Importantly, the cost of adopting big data is becoming increasingly affordable.15

CHALLENGES OF BIG DATA

Despite its potential, big data is not without limitations and implementation challenges (Fig. 1), and this is especially so if the appropriate infrastructures and management systems are not first established.15,16

Big data is inherently messy, which raises concerns regarding its data quality.15 For example, electronic medical records (EMRs) are not intended for research purposes, and incomplete records are common.17 This could come either in the form of incomplete recording by medical staff or loss to follow-up by patients. Data may also be recorded in an unstructured format (eg, free text), and could be missed during data extraction.17 Also, EMRs often rely on diagnostic or billing codes such as the International Classification of Diseases (ICD) for administrative purposes, which may give rise to misclassification, miscoding, and/or inadequate representation of conditions.18 Taken together, the large, diverse, and rapid influx of data will result in ambiguities that limit its application.15 Furthermore, the real-world implementation of big data is further confounded by issues of privacy and security.16

ANALYSIS AND MANAGEMENT OF BIG DATA

The basic principles of data analysis are similar in traditional epidemiology and big data.19 For example, commonly used statistical techniques in epidemiological studies, such as the receiver-operating characteristic curve, remain a major metric in deep-learning research that uses big data.20

Deep-learning is the unique subset of machine-learning that is germane to the fourth industrial revolution.21 In deep-learning, computers are developed with algorithms that utilize a cascade of multilayered artificial neural networks that automatically extract, transform, and recognize the intricate structures within inputted data (eg, fundus photographs).22 Each layer within an artificial neural network produces an output that is used as input for processing in the next layer, with the final layer producing a diagnostic output (eg, presence of disease in fundus photographs). This process is refined repeatedly (ie, back-propagation) until the diagnostic output matches a reference standard (ie, ground truth).22 Therefore, big data is required to develop and improve the accuracy of these algorithms.23 This consequently brings about new statistical challenges, and greater scrutiny of statistical analysis and interpretation is vital.24

The sheer size of big data leads to increased statistical power and precision. This in turn leads to remarkably small P values obtained in hypothesis testing, even when the observed

FIGURE 1. Limitation and implementation challenges of big data.


differences may be clinically inconsequential.25 In addition, undesirable practices, such as performing multiple tests to arrive at a “reportable” outcome that is statistically significant, may be easier to achieve in big data. These practices, also known as “p-hacking,” increase the risk of false-positive results and of findings that are not reproducible.26 These highly precise but biased results are not only misleading, but may also lead to dangerous clinical practice and loss of trust in big data analysis.24 Thus, it remains imperative to understand the limitations of P values, the differences between statistical and clinical significance, and the need for robust validation through peer review.

Likewise, issues such as confounding, biases, and reverse causation are not remedied by simply using big data, and must be similarly addressed with appropriate study design and/or statistical adjustments.24 For example, detection bias may arise in the interpretation of EMR and/or administrative data analysis. Detection bias occurs when an exposure is erroneously associated with an outcome through increased surveillance, screening, or testing.27 For example, the association between diabetes mellitus and eye diseases, such as glaucoma and cataract, is likely overestimated due to increased referrals and routine eye screening for diabetic patients.28

The biases associated with EMR and/or administrative data analysis may be ascertained and mitigated statistically. First, selection bias can be controlled by using propensity score adjustment/matching to account for systematic differences in health system “exposure” versus “non-exposure” groups.29–31 Propensity scores may be used to create inverse probability weights to balance observed differences in the 2 populations with the goal of mimicking a scenario where individuals would be randomized to be included versus excluded from the EMR/administrative data. The effectiveness of inverse probability weighting in achieving representativeness is still debatable, but it nonetheless provides an avenue to address factors that are associated with inclusion or exclusion in the dataset.32

Secondly, post-stratification adjustment can be used to standardize crude estimates according to variables implicated in the selection bias.33 In the context of EMR/administrative data, these variables might include demographic factors such as sex, race/ethnicity, and sociodemographic factors, and various systemic risk factors or comorbid conditions. Since inclusion is non-random, controlling for the number of health encounters also accounts for systematic differences between those who regularly or irregularly visit their health care provider.33 Therefore, including the number of health encounters as a propensity score matching variable makes it possible to measure the effect size more accurately.28,29,31

TYPES OF BIG DATA AND THEIR APPLICATIONS IN OPHTHALMOLOGY

Big data is recognized as an integral component of modern medicine, including ophthalmology.34 In fact, ophthalmology is one of the most data-driven specialties in medicine, with data ranging from numerical values (eg, intraocular pressure, spherical equivalent, and so on), 2-dimensional images (eg, fundus photographs), 3-dimensional scans (eg, optical coherence tomography), to clinical and surgical records (Fig. 2).35 These data are often unstructured, and could not be adequately curated until now.36

Electronic Medical Records and Data Registry

Big data in health care is driven largely by the advent of information technologies, which allow for traditional medical records to be digitized into electronic format (EMR) and combined with auxiliary test results.37 This results in a “one-stop” data portal that allows physicians to visualize and better understand the pattern-of-care provided, and for administrators to identify gaps in service provision. In addition, advanced digital technologies such as the IoT further allow for fully automated, real-time data linkage between different EMR systems,38 which in turn enables the creation of sophisticated clinical data registries such as SMEYEDAT, IRIS, Fight Retinal Blindness! (FRB!), and SOURCE.

The Smart Eye Database (SMEYEDAT) is a web-based ophthalmologic data warehouse that aggregates EMR and diagnostic images from a university eye hospital in near real-time to facilitate easier and faster identification of patients with specific conditions.39

IRIS is a cloud-based ophthalmic data registry that was developed by the American Academy of Ophthalmology in 2014 to drive improvements in the provision of eye care services, to promote population health through adequate eye care coverage,40 and to pioneer evidence-based scientific knowledge derived from clinical data registries.41 IRIS aggregates clinical data automatically to provide real-time analysis of 15 quality-control measures and 22 outcome measures from >60 million patients. This allows physicians to compare the effectiveness of different treatment options in real-world settings, the coverage of eye care provision, and the impact of rare diseases and disease comorbidities, and to better detect subtle clinical associations.41 For example, Cantrell et al42 utilized the IRIS registry to observe the pattern-of-care rendered by fellow physicians in newly diagnosed cases of macular edema, and reported among other findings that the majority of cases did not receive anti-VEGF treatment within the first 28 days, and bevacizumab was preferred in treated cases. This study highlights the potential of big data in informing physicians of real-world practice to monitor and standardize the quality of care. However, identification of cases was limited by the sole usage of ICD coding, which is affected by misclassification, miscoding, and incompleteness.42

The “FRB!” is an ophthalmic data registry that contains longitudinal data on the effects of switching between different treatment modalities in neovascular age-related macular degeneration.43 Furthermore, FRB! tracks data across Europe, the Middle East, and Asia, which provides a platform for global comparison of treatment outcomes in the future.44

The “Sight Outcomes Research Collaborative” (SOURCE) is an ophthalmic data repository that improved the efficiency of identifying ocular diseases from EMR data.45 Instead of relying heavily on structured data (eg, ICD billing codes) to search for ocular diseases, SOURCE incorporates an algorithm that bases its search on both structured and unstructured (eg, free text from medical reports) EMR data. For example, Stein et al utilized SOURCE to search for exfoliation syndrome, and reported a positive predictive value (PPV) of 95% and negative predictive value (NPV) of 100%. Furthermore, 60% of cases would have been missed if their repository had relied solely on billing codes.45 This study illustrates the importance of developing appropriate tools (eg, free text analysis) to harness the potential of big data.46
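The gain from searching free text alongside billing codes can be illustrated with a toy calculation of PPV and NPV. The following is a minimal Python sketch under stated assumptions and is not the SOURCE algorithm: the `Record` class, the placeholder code `CODE_X`, the keyword list, and all five records are hypothetical and invented purely for illustration, and the numbers it produces have no relation to the figures reported by Stein et al.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One toy patient record; every value below is invented for illustration."""
    billing_codes: set = field(default_factory=set)  # structured data
    free_text: str = ""                              # unstructured clinical note
    has_disease: bool = False                        # reference standard (e.g., chart review)


TARGET_CODE = "CODE_X"        # hypothetical placeholder, not a real ICD code
KEYWORDS = ("exfoliation",)   # hypothetical free-text search terms


def flag_structured(record):
    """Case-finding rule that looks at billing codes only."""
    return TARGET_CODE in record.billing_codes


def flag_combined(record):
    """Case-finding rule that also searches the free-text note."""
    note = record.free_text.lower()
    return flag_structured(record) or any(k in note for k in KEYWORDS)


def predictive_values(records, flag):
    """Return (PPV, NPV, number of missed cases) for a case-finding rule."""
    tp = fp = tn = fn = 0
    for r in records:
        flagged = flag(r)
        if flagged and r.has_disease:
            tp += 1          # flagged and truly diseased
        elif flagged:
            fp += 1          # flagged but not diseased
        elif r.has_disease:
            fn += 1          # diseased but missed by the rule
        else:
            tn += 1          # correctly left unflagged
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv, fn


RECORDS = [
    Record({"CODE_X"}, "Exfoliation material on the lens capsule.", True),
    Record(set(), "Pseudoexfoliation noted in the right eye.", True),
    Record(set(), "Findings consistent with exfoliation syndrome.", True),
    Record(set(), "Normal anterior segment.", False),
    Record({"CODE_Y"}, "Cataract, otherwise unremarkable.", False),
]

if __name__ == "__main__":
    rules = [("codes only", flag_structured), ("codes + free text", flag_combined)]
    for name, rule in rules:
        ppv, npv, missed = predictive_values(RECORDS, rule)
        print(f"{name}: PPV={ppv:.2f}, NPV={npv:.2f}, missed cases={missed}")
```

In this toy data the code-only rule misses two of the three true cases, which lowers its NPV, while the combined rule finds all three. A plain substring match is a deliberate simplification: it would also flag a note reading “no exfoliation”, so practical free-text analysis would need at least negation handling.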


FIGURE 2. Characteristics of big data and its sources in ophthalmology. EMR indicates electronic medical records.

Big data from EMR has also been utilized in designing new models of care in ophthalmology. In the United Kingdom, EMR data has been utilized to assess the viability of a virtual eye clinic in identifying cases of unstable glaucoma that required closer observation.47 This study aggregated the mean deviation score from 473,252 Humphrey Visual Field tests to establish a range of expected values, which was used as reference for assessing Humphrey Visual Field measurements conducted in a virtual eye clinic and in normal clinic settings, and its effectiveness in identifying unstable glaucoma cases. This initiative highlights the potential application of big data in designing innovative models of care that bridge the “supply-demand” gap in eye care services.48,49

Administrative and Health Insurance Database

In Asia, South Korea and Taiwan have national health insurance programs that cover the majority of their citizens, and data from these databases have been actively utilized in research.50,51 In South Korea, a mandatory health insurance program was started in 1977, and extended to cover the entire nation in 1989.52 Currently, >97% of the population are covered by the Korean National Health Insurance Service.53 In 2015, a database was established to include 2% of Korean National Health Insurance Service data (1 million people) and other cohorts of elderly persons that provided de-identified claims, health screening, and mortality data.53,54 In Taiwan, a similar database named the National Health Insurance Research Database was set up for research purposes.55

These insurance databases are crucial in providing the data needed to study research hypotheses that are difficult to address in traditional research methodologies. For example, insurance databases provide the necessary sample size needed to study rare diseases, and for the development of deep-learning algorithms.56 In addition, these databases have been used to identify trends in surgery57 and the safety profiles of ophthalmic drugs.31,58

Epidemiology Consortia

Research consortia or networks are collaborative initiatives that represent a significant change in the way clinical and epidemiological research is conducted.59 These consortia bring together researchers across multiple domains and countries, and provide a shared platform for capacity building, research collaboration, aggregation and validation of results, and technology transfer.59,60

In epidemiology research, consortia are valuable resources that further provide a “bird’s-eye” view of the burden of diseases and its impact in a particular geographical region. This in turn enables researchers to come together to study pertinent issues that are answered inadequately by individual groups.61 For example, preventable vision impairment continues to be a major global health issue, and yet the reasons for it remain inconclusive.62,63

In ophthalmology, epidemiology consortia include the European Eye Epidemiology (E3) consortium,61 the Asian Eye Epidemiology Consortium (AEEC),64 and the Visual Loss Expert Group (VLEG).65 The E3 comprises 29 study groups (eg, Rotterdam Study, Gutenberg Health Study) from 12 European countries, whereas AEEC comprises 40 population-based studies in Asia (eg, Beijing Eye Study, Singapore Epidemiology of Eye Diseases Study) from 9 Asian countries. Using data from the AEEC, researchers were able to provide a more accurate and precise estimate of geographic atrophy in Asians, a relatively rare blinding disease in Asia.66 The VLEG is made up of an


international team of 78 ophthalmologists, optometrists, and epidemiologists, and was formed by the Global Burden of Disease in 2007 to measure comparable estimates of burden of disease, injuries, and risk factors due to vision impairment.65 Other consortia include The Meta-Analysis for Eye Disease study group and the International Eye Disease Consortium, which have reported on the global prevalence of diabetic retinopathy and retinal vein occlusion.67,68

These consortia provide a crucial source of big data for regional disease surveillance and study of disease pathogenesis, and provide a “bigger voice” in advocating eye care, providing recommendations for public health policies, and advancing ophthalmic discoveries.69 Taken together, consortia are greater than the sum of their parts, and they epitomize the T.E.A.M. acronym: Together, Everyone Achieves More.

Biobanks

The idea of collecting and storing human specimens has been around for over 100 years.70 Handling and storing biospecimens have evolved from storage in a few freezers to large repositories with computerized databases, robotics processing samples at a rapid pace, and the launch of virtual biobanks.71 Biobanks can include samples from different epidemiologic study areas such as population studies, clinical trials, and diagnostic studies.71 An example of a large population biobank is the UK Biobank, which is a community-based prospective cohort study.72

The UK Biobank was established with the objective of investigating the effects of genetic, lifestyle, and environmental risk factors associated with a wide range of major diseases, enrolling 500,000 participants aged between 40 and 69 years from 2006 to 2010.73 These participants have also consented to long-term follow-up and extensive testing, including a large assortment of physical measurements and biospecimen (blood, urine, and saliva samples) collection, and genotyping has been included for all participants. Additional testing, such as an enhanced ophthalmic examination including visual acuity, auto refraction, and retinal image data, has also been collected on 100,000 participants.74,75 The UK Biobank is a rich resource as it has the ability to follow up on the overall health of these participants by linkage to their health records for a more comprehensive patient profile.73 In addition, due to its open-access nature, the UK Biobank will allow researchers to study a wide range of complex diseases, which will eventually lead to improvements in prevention, diagnosis, and treatment of an array of diseases.73

Crowd Source Data

The advent of digital health has given rise to a new source of crowd source data, which include a wide range of interconnected data generated from social media, mobile applications, sensor devices, and wearable computing devices.76 Crowd source data may improve and simplify recruitment of research participants,77 and allow traditional methods of health assessment to be integrated with real-time patient data (eg, blood glucose level, physical activities) to gain further insights into the social determinants of health in relation to health outcomes.78

In addition, crowd source data can provide a crucial early source of surveillance information in a pandemic. For example, Sun et al79 utilized information from a combination of sources, ranging from health care-oriented social media to mainstream news media, to identify health seeking behavior of individuals with symptoms of COVID-19 in China. This information provided health care workers with nationwide patient-level data, and aided their understanding of the outbreak progression.

Crowd source data are also increasingly utilized in ophthalmic research. For example, Plano is a smartphone application that has been developed to aid in myopia prevention.80 In addition, sensor devices are currently adopted in myopia research to study the effect of outdoor light intensities on myopia progression,81 and tracking of physical activities through wearable devices has been suggested as a way to track activities of daily living after cataract surgery.82

Big data has also been utilized in dry eye research through the use of crowd source data. The DryEyeRhythm (Ohako Inc, Tokyo, Japan) is a smartphone application that was developed to assess the potential of crowd source data in identifying the characteristics and risk factors associated with diagnosed or undiagnosed dry eye.83 It does so through a mobile application platform that collects patient-specific information, such as demographic characteristics, medical history, lifestyle, subjective symptoms, and disease-specific symptoms measured on the Ocular Surface Disease Index and Zung Self-rating Depression Scale. Importantly, results from this study highlighted the potential of crowd source data in managing ocular disease. Crowd source data will likely proliferate in the coming years with increased adoption of these smart devices.77

Ocular Image Database

Furthermore, a few large-scale image datasets have been made public for ophthalmic research.84–86 For example, as mentioned above, the UK Biobank has a huge collection of retinal photographs and optical coherence tomography (OCT) scans made available for ophthalmic research.87 In addition, Kaggle is a major source of image datasets, having organized hundreds of machine learning competitions, including one on diabetic retinopathy,84 since its inception. These ocular datasets are crucial for developing deep learning algorithms for screening and diagnosis of ocular diseases and prediction of nonocular clinical outcomes (see more details below). Moreover, this form of data will likely proliferate in the years to come, and when properly curated, will be an important data source for performing retrospective clinical validation in AI medical devices.

APPLICATIONS OF DEEP LEARNING ON BIG DATA IN OPHTHALMOLOGY

Big data is particularly useful and widely adopted in the development of predictive algorithms through deep-learning.88 In ophthalmology, deep-learning has been applied predominantly on ocular images to identify and classify ocular diseases such as diabetic retinopathy and retinopathy of prematurity.88 Over the last few years, two successful algorithms have been approved for screening diabetic retinopathy.89,90 For example, SELENA+ is a deep-learning algorithm that was authorized to screen for diabetic retinopathy in Singapore.91 This algorithm was developed with data from the Singapore National Diabetic Retinopathy Screening Programme (SiDRP) and 10 multi-ethnic cohort studies, comprising 494,661 fundus images.90 Deep-learning has also been adopted in neuro-ophthalmology studies, where an algorithm was developed with 14,341 fundus images from 11 countries to


differentiate optic discs with papilledema from normal optic discs and optic discs with nonpapilledema abnormalities.92 In addition, deep learning has been successfully applied to 3-dimensional OCT scans. De Fauw et al93 developed a triage algorithm based on 14,884 OCT scans, and reported that its performance in making referral recommendations (ie, urgent, semi-urgent, routine, observation) was as good as or better than that of eye care practitioners over a range of sight-threatening retinal diseases. Remarkably, this algorithm was developed without an excessively large data set, and was able to maintain its accuracy when analyzing images from different OCT machines.

In addition to screening for and detecting ocular disease, deep learning has been applied to retinal photography to predict cardiovascular risk factors and anemia. Poplin et al94 developed a deep-learning algorithm that was able to predict cardiovascular risk factors, the majority of which were not thought to be quantifiable in retinal images. These factors, ranging from age and smoking status to systolic blood pressure, are core components of cardiovascular risk calculators, suggesting that cardiovascular risk may be assessed directly from retinal photographs.94 Mitani et al95 further reported that retinal photographs could be used to predict hemoglobin concentration, which is the most reliable indicator of anemia. This represents a major advancement, as anemia is often treatable but continues to be a major source of poor health because of low detection rates arising from the invasiveness and cost of current screening modalities.96

FURTHER OUTLOOK AND CONCLUSIONS

In ophthalmology, big data was first utilized decades ago in the form of large cohorts in epidemiology studies.63 These cohorts were originally set up for eye disease surveillance, and were instrumental in characterizing the burden of eye diseases and the inefficiencies and/or deficiencies in eye care delivery.97 Over time, these cohorts evolved in tandem with advancements in diagnostic tools and omics technologies, and are now a crucial driver of “analytic driven” medicine that strives for a preventive, predictive, personalized, precise, and participatory model of care.63,98 Over the last few years, research into big data has focused more on its real-world application than on proof of concept. This includes the development of management systems that efficiently handle its complexity,39,45 and predictive algorithms that translate its potential into clinical solutions.89,90

Big data is poised to grow at an exponential rate in the years to come.38 Therefore, it is imperative to build an incisive and enabling environment that builds upon existing infrastructure, and to promote a multidisciplinary collaborative effort to handle its evolving complexity. First, it remains crucial to develop the hardware and software required to adequately handle, share, analyze, and safeguard data that are growing at a rapid rate. Second, it is necessary to scale up the training of technical specialists who are skilled in handling the challenges of big data, so as to arrive at meaningful and actionable information. Furthermore, it is vital that medical personnel, policy makers, and the general public are equipped with the knowledge needed to interpret findings from big data analysis, understand what it can and cannot address, and recognize poor analytic practices and fictitious claims. Lastly, as big data continues to evolve in size and complexity, an integrated environment that promotes interdisciplinary research collaboration must be established so as to translate its potential into real-world advantages. In conclusion, the transformative potential of big data in health care and ophthalmology is massive, and it will continue to grow and exert an even larger influence in the years ahead. Therefore, it is prudent not only to take stock of where we currently are in this big data revolution, but also to chart a path ahead so as to fully harness its potential.

REFERENCES
1. World Economic Forum. The fourth industrial revolution: what it means, how to respond. Available at: https://www.weforum.org/agenda/2016/01/the-fourth-industrial-revolution-what-it-means-and-how-to-respond/. Published 2016. Accessed March 26, 2020.
2. Pereira F, Machado P, Costa E, Cardoso A. Progress in Artificial Intelligence: 17th Portuguese Conference on Artificial Intelligence, EPIA 2015, Coimbra, Portugal, September 8-11, 2015. Proceedings. Vol 9273. 1st ed. Cham: Springer International Publishing; 2015.
3. Panesar A. Machine Learning and AI for Healthcare: Big Data for Improved Health Outcomes. 1st ed. Berkeley, CA: Apress; 2019.
4. Baro E, Degoul S, Beuscart R, Chazard E. Toward a literature-driven definition of big data in healthcare. Biomed Res Int. 2015;2015:639021–639029.
5. Shilo S, Rossman H, Segal E. Axes of a revolution: challenges and promises of big data in healthcare. Nat Med. 2020;26:29–38.
6. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309:1351–1352.
7. Boland MV. Big data, big challenges. Ophthalmology. 2016;123:7–8.
8. Brown JAL, Ni Chonghaile T, Matchett KB, Lynam-Lennon N, Kiely PA. Big data-led cancer research, application, and insights. Cancer Res. 2016;76:6167–6170.
9. Aviner R, Shenoy A, Elroy-Stein O, Geiger T. Uncovering hidden layers of cell cycle regulation through integrative multi-omic analysis. PLoS Genet. 2015;11:e1005554.
10. Yugi K, Kubota H, Hatano A, Kuroda S. Trans-omics: how to reconstruct biochemical networks across multiple ‘omic’ layers. Trends Biotechnol. 2016;34:276–290.
11. Clark A, Ng JQ, Morlet N, Semmens JB. Big data and ophthalmic research. Surv Ophthalmol. 2016;61:443–465.
12. Cuocolo R, Perillo T, De Rosa E, Ugga L, Petretta M. Current applications of big data and machine learning in cardiology. J Geriatr Cardiol. 2019;16:601–607.
13. Meskó B. FDA approvals for smart algorithms in medicine in one giant infographic. The Medical Futurist. 2019. Available at: https://medicalfuturist.com/fda-approvals-for-algorithms-in-medicine/. Accessed March 30, 2020.
14. Einav L, Finkelstein A, Mullainathan S, Obermeyer Z. Predictive modeling of U.S. health care spending in late life. Science. 2018;360:1462–1465.
15. Rough K, Thompson JT. When does size matter? Promises, pitfalls, and appropriate interpretation of “big” medical records data. Ophthalmology. 2018;125:1136–1138.
16. Househ M, Kushniruk AW, Borycki EM. Big Data, Big Challenges: A Healthcare Perspective: Background, Issues, Solutions and Research Directions. Cham, Switzerland: Springer International Publishing; 2019.
17. Bowman S. Impact of electronic health record systems on information integrity: quality and safety implications. Perspect Health Inf Manag. 2013;10:1c.


18. Coleman AL, Morgenstern H. Use of insurance claims databases to evaluate the outcomes of ophthalmic surgery. Surv Ophthalmol. 1997;42:271–278.
19. Yu M, Tham YC, Rim TH, Ting DS, Wong TY, Cheng CY. Reporting on deep learning algorithms in health care. Lancet Digit Health. 2019;1:e328–e329.
20. Gonçalves L, Subtil A, Oliveira MR, Bermudez P. ROC curve estimation: an overview. REVSTAT–Statistical Journal. 2014;12:1–20.
21. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–444.
22. Shen D, Wu G, Suk HI. Deep learning in medical image analysis. Annu Rev Biomed Eng. 2017;19:221–248.
23. Rahimy E. Deep learning applications in ophthalmology. Curr Opin Ophthalmol. 2018;29:254–260.
24. Ehrenstein V, Nielsen H, Pedersen AB, Johnsen SP, Pedersen L. Clinical epidemiology in the era of big data: new opportunities, familiar challenges. Clin Epidemiol. 2017;9:245–250.
25. Madondo SM. The American Statistical Association (ASA) Statement of 2016 on statistical significance and P-value: a critical thought. Science Journal of Applied Mathematics and Statistics. 2017;5:41.
26. Nuzzo R. Scientific method: statistical errors. Nature. 2014;506:150.
27. Haut ER, Pronovost PJ. Surveillance bias in outcomes reporting. JAMA. 2011;305:2462–2463.
28. Rim TH, Lee SY, Bae HW, Seong GJ, Kim SS, Kim CY. Increased risk of open-angle glaucoma among patients with diabetes mellitus: a 10-year follow-up nationwide cohort study. Acta Ophthalmol. 2018;96:e1025–e1030.
29. Rim TH, Kim HK, Kim JW, Lee JS, Kim DW, Kim SS. A nationwide cohort study on the association between past physical activity and neovascular age-related macular degeneration in an East Asian population. JAMA Ophthalmol. 2018;136:132–139.
30. Rim TH, Kim HS, Kwak J, Lee JS, Kim DW, Kim SS. Association of corticosteroid use with incidence of central serous chorioretinopathy in South Korea. JAMA Ophthalmol. 2018;136:1164–1169.
31. Rim TH, Yoo TK, Kwak J, et al. Long-term regular use of low-dose aspirin and neovascular age-related macular degeneration: national sample cohort 2010-2015. Ophthalmology. 2019;126:274–282.
32. Lonjon G, Porcher R, Ergina P, Fouet M, Boutron I. Potential pitfalls of reporting and bias in observational studies with propensity score analysis assessing a surgical procedure. Ann Surg. 2017;265:901–909.
33. Bower JK, Patel S, Rudy JE, Felix AS. Addressing bias in electronic health record-based surveillance of cardiovascular disease risk: finding the signal through the noise. Curr Epidemiol Rep. 2017;4:346–352.
34. Amirian P, Lang T, van Loggerenberg F. Big Data in Healthcare: Extracting Knowledge from Point-of-Care Machines. 1st ed. Cham, Switzerland: Springer International Publishing; 2017.
35. Matossian C. Big data analysis can benefit ophthalmic practice and bump up the bottom line. Available at: https://www.healio.com/ophthalmology/practice-management/news/print/ocular-surgery-news/%7B62e8b2a4-57f5-4843-a33b-1f94c045016f%7D/big-data-analysis-can-benefit-ophthalmic-practice-and-bump-up-the-bottom-line. Published 2017. Accessed January 4, 2020.
36. Bote-Curiel L, Muñoz-Romero S, Gerrero-Curieses A, Rojo-Álvarez JL. Deep learning and big data in healthcare: a double review for critical beginners. Appl Sci. 2019;9:2331.
37. Sheikh A, Wright A, Bates D, Cresswell K. Key advances in clinical informatics: transforming health care through health information technology. Elsevier; 2017.
38. Dash S, Shakyawar SK, Sharma M, Kaushik S. Big data in healthcare: management, analysis and future prospects. J Big Data. 2019;6:1–25.
39. Kortüm KU, Müller M, Kern C, et al. Using electronic health records to build an ophthalmologic data warehouse and visualize patients’ data. Am J Ophthalmol. 2017;178:84–93.
40. Chiang MF, Sommer A, Rich WL, Lum F, Parke DW 2nd. The 2016 American Academy of Ophthalmology IRIS® Registry (Intelligent Research in Sight) Database: Characteristics and Methods. Ophthalmology. 2018;125:1143–1148.
41. Parke DW, Rich WL, Sommer A, Lum F. The American Academy of Ophthalmology’s IRIS® Registry (Intelligent Research in Sight Clinical Data): a look back and a look to the future. Ophthalmology. 2017;124:1572–1574.
42. Cantrell RA, Lum F, Chia Y, et al. Treatment patterns for diabetic macular edema: an intelligent research in sight (IRIS) registry analysis. Ophthalmology. 2020;127:427–429.
43. Gillies MC, Campain A, Barthelmes D, et al. Long-term outcomes of treatment of neovascular age-related macular degeneration: data from an observational study. Ophthalmology. 2015;122:1837–1845.
44. Mantel I, Gillies MC, Souied EH. Switching between ranibizumab and aflibercept for the treatment of neovascular age-related macular degeneration. Surv Ophthalmol. 2018;63:638–645.
45. Stein JD, Rahman M, Andrews C, et al. Evaluation of an algorithm for identifying ocular conditions in electronic health record data. JAMA Ophthalmol. 2019;137:491–497.
46. Matheny ME, Whicher D, Thadaney Israni S. Artificial intelligence in health care: a report from the National Academy of Medicine. JAMA. 2019;323:509.
47. Jones L, Bryan SR, Miranda MA, Crabb DP, Kotecha A. Example of monitoring measurements in a virtual eye clinic using ‘big data’. Br J Ophthalmol. 2018;102:911–915.
48. Ministry of Health (MOH) Singapore. Transforming our healthcare system to meet evolving needs. 2020. Available at: https://www.moh.gov.sg/docs/librariesprovider5/cos2020/cos-2020—transforming-our-healthcare-system-to-meet-evolving-needs.pdf. Accessed March 4, 2020.
49. Bigus JP, Campbell M, Carmeli B, et al. Information technology for healthcare transformation. IBM J Res Dev. 2011;55:6:1–6:14.
50. Cheng TM. Taiwan’s new national health insurance program: genesis and experience so far. Health Aff (Millwood). 2003;22:61–76.
51. Song SO, Jung CH, Song YD, et al. Background and data configuration process of a nationwide population-based study using the Korean national health insurance system. Diabetes Metab J. 2014;38:395–403.
52. Song YJ. The South Korean health care system. JMAJ. 2009;52:206–209.
53. Lee J, Lee JS, Park SH, Shin SA, Kim K. Cohort profile: the national health insurance service–national sample cohort (NHIS-NSC), South Korea. Int J Epidemiol. 2017;46:e15.
54. Kim YI, Kim YY, Yoon JL, et al. Cohort Profile: National health insurance service-senior (NHIS-senior) cohort in Korea. BMJ Open. 2019;9:e024344.
55. Lin LY, Warren-Gash C, Smeeth L, Chen PC. Data resource profile: the National Health Insurance Research Database (NHIRD). Epidemiol Health. 2018;40:e2018062.
56. Wang HH, Wang YH, Liang CW, Li YC. Assessment of deep learning using nonimaging information and sequential medical records to develop a prediction model for nonmelanoma skin cancer. JAMA Dermatol. 2019;155:1277–1283.


57. Lee CS, Rim THT, Kwon HJ, Yi JH, Lee SC. Partial lamellar sclerouvectomy of ciliary body tumors in a Korean population. Am J Ophthalmol. 2013;156:36–42.
58. Rim TH, Lee CS, Lee SC, Kim SS. Intravitreal ranibizumab therapy for neovascular age-related macular degeneration and the risk of stroke. Retina. 2016;36:2166–2174.
59. Dockrell HM. Presidential address: the role of research networks in tackling major challenges in international health. Int Health. 2010;2:181–185.
60. Puljak L, Vari SG. Significance of research networking for enhancing collaboration and research productivity. Croat Med J. 2014;55:181–183.
61. Delcourt C, Korobelnik JF, Buitendijk GH, et al. Ophthalmic epidemiology in Europe: the “European Eye Epidemiology” (E3) consortium. Eur J Epidemiol. 2016;31:197–210.
62. Flaxman SR, Bourne RR, Resnikoff S, et al. Global causes of blindness and distance vision impairment 1990-2020: a systematic review and meta-analysis. Lancet Glob Health. 2017;5:e1221–e1234.
63. Wong TY, Hyman L. Population-Based Studies in Ophthalmology. Am J Ophthalmol. 2008;146:656–663.
64. Tham YC, Tao Y, Zhang L, et al. Is kidney function associated with primary open-angle glaucoma? Findings from the Asian Eye Epidemiology Consortium. Br J Ophthalmol. 2020;bjophthalmol-2019-314890.
65. Bourne R, Price H, Taylor H, et al. New systematic review methodology for visual impairment and blindness for the 2010 Global Burden of Disease Study. Ophthalmic Epidemiol. 2013;20:33–39.
66. Hyungtaek Rim T, Ryo K, Tham YC, et al. Prevalence and pattern of geographic atrophy in Asia: the Asian Eye Epidemiology Consortium. 2020. doi:10.1016/j.ophtha.2020.04.019.
67. Rogers S, McIntosh RL, Cheung N, et al. The prevalence of retinal vein occlusion: pooled data from population studies from the United States, Europe, Asia, and Australia. Ophthalmology. 2010;117:313–319.
68. Yau JW, Rogers SL, Kawasaki R, et al. Global prevalence and major risk factors of diabetic retinopathy. Diabetes Care. 2012;35:556–564.
69. Bourne RRA, Stevens GA, White RA, et al. Causes of vision loss worldwide, 1990–2010: a systematic analysis. Lancet Glob Health. 2013;1:e339–e349.
70. Eiseman E, Haga SB. Handbook of Human Tissue Sources. Santa Monica, CA: Rand; 1999.
71. De Souza YG, Greenspan JS. Biobanking past, present and future: responsibilities and benefits. AIDS. 2013;27:303–312.
72. Ollier W, Sprosen T, Peakman T. UK Biobank: from concept to reality. Pharmacogenomics. 2005;6:639–646.
73. UK Biobank. About UK Biobank. Available at: https://www.ukbiobank.ac.uk/about-biobank-uk/. Published 2019. Accessed March 30, 2020.
74. Allen N, Sudlow C, Downey P, Peakman T, Danesh J. UK Biobank: current status and what it means for epidemiology. Health Policy and Technology. 2012;1:123–126.
75. Collins R. What makes UK Biobank special? Lancet. 2012;379:1173–1174.
76. Shapiro MJD, Wald J, Mon D. Patient-Generated Health Data: White paper. RTI International; 2012. Available at: http://www.rti.org/pubs/patientgeneratedhealthdata.pdf.
77. Dimitrov DV. Medical internet of things and big data in healthcare. Healthc Inform Res. 2016;22:156–163.
78. Roski J, Bo-Linn GW, Andrews TA. Creating value in health care through big data: opportunities and policy implications. Health Aff (Millwood). 2014;33:1115–1122.
79. Sun K, Chen J, Viboud C. Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study. Lancet Digit Health. 2020;2:e201–e208.
80. Dirani M, Crowston JG, Wong TY. From reading books to increased smart device screen time. Br J Ophthalmol. 2019;103:1–2.
81. Wu PC, Chen CT, Lin KK, et al. Myopia prevention and outdoor light intensity in a school-based cluster randomized trial. Ophthalmology. 2018;125:1239–1250.
82. Coleman AL. How big data informs us about cataract surgery: The LXXII Edward Jackson Memorial Lecture. Am J Ophthalmol. 2015;160:1091–1103.
83. Inomata T, Iwagami M, Nakamura M, et al. Characteristics and risk factors associated with diagnosed and undiagnosed symptomatic dry eye using a smartphone application. JAMA Ophthalmol. 2019;138:58–68.
84. Graham B. Kaggle Diabetic Retinopathy Detection Competition Report. University of Warwick; 2015.
85. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172:1122–1131.
86. Porwal P, Pachade S, Kamble R, et al. Indian diabetic retinopathy image dataset (IDRiD): a database for diabetic retinopathy screening research. Data. 2018;3:25.
87. UK Biobank. About UK Biobank. Available at: https://www.ukbiobank.ac.uk/about-biobank-uk. 2014.
88. Ting DSW, Pasquale LR, Peng L, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2019;103:167–175.
89. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit Med. 2018;1:39.
90. Ting DSW, Cheung CY-L, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017;318:2211–2223.
91. Goh T. An A.I. for the eye: New tech cuts time for spotting signs of diabetic eye disease. Available at: https://www.straitstimes.com/singapore/health/an-ai-for-the-eye. Published 2019. Accessed March 4, 2020.
92. Milea D, Najjar RP, Zhubo J, et al. Artificial intelligence to detect papilledema from ocular fundus photographs. N Engl J Med. 2020;382:1687–1695.
93. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24:1342–1350.
94. Poplin R, Varadarajan AV, Blumer K, et al. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nat Biomed Eng. 2018;2:158–164.
95. Mitani A, Huang A, Venugopalan S, et al. Detection of anaemia from retinal fundus images via deep learning. Nat Biomed Eng. 2020;18–27.
96. Milman N. Anemia—still a major health problem in many parts of the world! Ann Hematol. 2011;90:369–377.
97. Pearce N. Traditional epidemiology, modern epidemiology, and public health. Am J Public Health. 1996;86:678–683.
98. Alonso SG, de la Torre Díez I, Zapiraín BG. Predictive, personalized, preventive and participatory (4P) medicine applied to telemedicine and eHealth in the literature. J Med Syst. 2019;43:140.
