Automated Clinical Diagnosis: The Role of Content in Various Sections of A Clinical Document
Vivek Datla, Sadid A. Hasan, Ashequl Qadir, Kathy Lee, Yuan Ling, Joey Liu, and Oladimeji Farri
Artificial Intelligence Laboratory, Philips Research North America, Cambridge, MA
Email: {firstname.lastname,kathy.lee 1,dimeji.farri}@philips.com
Abstract—Clinical diagnosis is a critical aspect of patient care that is typically driven by expert medical knowledge and intuition. An automated system for clinical diagnosis could reduce the cognitive burden of clinicians during patient care and medical education. In this paper, we describe a Knowledge Graph (KG)-based clinical diagnosis system that leverages publicly available knowledge sources to infer possible diagnoses from free-text clinical narratives. We experiment with the content in various sections of a clinical document within the electronic health record (EHR) to investigate the contribution of each section to the performance of automated diagnosis systems. Evaluation on the MIMIC-III dataset demonstrates that the content of the "history of present illness" and "past medical history" sections can play a greater role in clinical diagnosis inference than other sections and all sections combined. Comparison with a state-of-the-art deep learning-based clinical diagnosis system confirms the effectiveness of our system.

Keywords-clinical diagnosis; knowledge graph; electronic health record;

I. INTRODUCTION

Clinical diagnosis is a critical and non-trivial aspect of patient care. Intuition based on past professional experiences and knowledge gained from formal medical training typically drives the clinician's ability to make a diagnosis [1]. Although mimicking the intuition of clinicians can be very challenging, an automated system designed for clinical diagnosis can support expert reasoning based on available knowledge sources, especially when trying to resolve complicated clinical scenarios. Such a system could significantly reduce the cognitive burden of clinicians during patient care so they could be better informed and adequately engage their patients towards achieving desired health outcomes [2], [3].

Available text-based knowledge sources for medicine include scientific publications and textbooks. However, a significant proportion of these sources are proprietary and require formal and commercial agreements for widespread use in automated systems for clinical decision support. Instead, we use Wikipedia as our knowledge source, given that it is publicly available and that both medical and colloquial usages of medical terms are represented, a feature that may help build a robust computational model for automated diagnostic inferencing. Furthermore, Wikipedia is used by several researchers in the field of natural language processing (NLP) as a rich multilingual knowledge base useful in various tasks including question answering (QA) and automated reasoning [4], [5], [6].

In this paper, we leverage articles under the "Clinical Medicine" category of Wikipedia to build a knowledge-driven clinical diagnosis system. We use a Knowledge Graph (KG)-based approach to accomplish this goal. Our system takes the free-text description of a medical problem (a clinical narrative) as input and provides the most likely diagnoses. We convert the link structure in Wikipedia into a knowledge graph where the nodes represent Wikipedia pages, hyperlinked concepts, and redirect pages, while the edges represent the relationships between them. We develop a query system on the knowledge graph that utilizes the content of Wikipedia as well as the link structure to identify the most probable diagnoses. The structure is used to determine relationships among diseases and symptoms, while the content of Wikipedia pages is used to rank their strength of association. Based on the strength of association, the system generates a ranked list of diagnoses for a given medical problem.

Our experiments on MIMIC (Medical Information Mart for Intensive Care)-III [7] discharge summaries demonstrate that identifying relevant sections in these documents can lead to a substantial gain in performance in inferring the most probable diagnoses. We observe that providing content from all sections of the document to the system works poorly compared to using specific sections of the document. We also compare the KG-based system's performance to the state-of-the-art Condensed Memory Networks (C-MemNN)-based clinical diagnosis system [8], also trained on the MIMIC-III dataset. Evaluation results reveal that our system performs better in some of the experiments.

Given the increasing interest in artificial intelligence (AI) and clinical decision support systems within the machine learning and health informatics communities, our work helps identify the most appropriate information within electronic clinical documents to drive automated diagnostic solutions towards optimal accuracy, leading to better-informed clinical decisions. Researchers in these communities can utilize the findings of our paper to improve the quality of training data when developing AI models to address complex reasoning tasks in patient care.

The main contributions of this paper can be summarized
III. INFERRING CLINICAL DIAGNOSIS WITH KNOWLEDGE GRAPH

In this work, we introduce a novel hybrid approach to address the clinical diagnostic inferencing problem. Figure 1 shows the overall architecture of our system. We first build a structured knowledge graph (KG) using contents from Wikipedia that are relevant for this problem. Given a clinical narrative, we then identify the patient's symptoms in the narrative using an information extraction engine. The extracted symptoms are used to query the knowledge graph for predicting a set of diagnoses for the given narrative. The following sections discuss the details of the knowledge graph construction and our method for predicting diagnoses from a clinical narrative.

A. Knowledge Graph Construction

For constructing our knowledge graph, we used Wikipedia as our knowledge source. We collected all documents under the clinical medicine category in Wikipedia. This category served as the root node of our knowledge graph. The subcategories and any page in Wikipedia under this root category became the initial children nodes in our graph. The nodes representing the subcategories might not have had any content text, whereas the nodes representing the pages had their content text. We further expanded the nodes recursively up to a depth of 10 using breadth-first search, extracting all subcategories and pages and mining a total of 188,139 Wikipedia pages from 17,121 categories. These pages and categories were then added to the graph. Some of the categories were verified by domain experts as unrelated to clinical medicine, so we pruned our graph at these categories. Furthermore, we created an edge for any hyperlink associated with a term in any of the retrieved pages. These edges connected the page node that contained the term with the page node that was the hyperlink destination. The resulting knowledge graph (KG) contained a total of 381,964 nodes and 1,906,302 edges. The constructed knowledge graph is represented in the 4th component in Figure 1.

B. Inferencing Diagnosis

1) Symptom Extraction from Clinical Narratives: The diagnosis inferencing process begins with a clinical narrative that describes the symptoms and any demographic information of a patient. The clinical narrative is written in unstructured text, so these concepts need to be identified and extracted before we can query the knowledge graph. For example, the clinical narrative may contain sentences such as "A 5-year-old boy has fever, cough, drooling, stridor, and dysphagia with voice change." We use a hybrid clinical NLP engine [32] to first identify and extract the symptoms (fever, cough, drooling, stridor, dysphagia with voice change) and the demographic information (5-year-old boy). We also use the NLP engine to normalize the symptoms so that they can be mapped to their corresponding Wikipedia pages. The NLP engine uses medical ontologies such as SNOMED [33], UMLS [34], and RadLex [35] for normalization. Components 1-2 in Figure 1 represent these steps.

2) Querying the Knowledge Graph: Next, we query the knowledge graph with the extracted symptoms to predict a set of diagnoses. However, not all symptoms contribute equally to the prediction process. For example, a symptom such as "fever" is very common and can occur with many diseases, whereas "stridor" is more uniquely associated with respiratory diseases. So, from the list of extracted symptoms, we need to identify the symptoms that are the most distinctive for determining potential diagnoses and weigh them accordingly. For each extracted symptom, we query the PubMed corpus of the 2014 TREC CDS track using Elasticsearch (https://www.elastic.co/products/elasticsearch) to retrieve its term frequency in the corpus, and use the inverse of the term frequency as its weight. This means that symptoms that are relatively rare in the PubMed corpus are assigned higher weights than symptoms that are more frequent. Component 3 in Figure 1 represents this step. The calculated weights are used to activate these nodes in the knowledge graph at a later step.

3) Building the Solution Space: Our next step is to create a solution space within the knowledge graph that consists of nodes representing symptoms, leading to nodes representing candidate diagnoses. We conduct this in two stages: a) building a bare-bones subspace, and b) expanding the subspace to have a connected path between any two nodes.

a) Building the initial subspace: The solution space initially contains only the nodes representing the input query symptoms. We further include all of the immediate neighbors of the input symptom nodes in the initial solution space. This process gives us several trees which may or may not be connected (a scattered forest). If the trees are connected, i.e., if there is an existing path between any two nodes of the graph, then we identify it as a connected forest. This is represented in component 5 of Figure 1. If there is no connected forest, then we perform the next step.

b) Expanding the initial subspace: If the resulting forest is not connected, then we expand the subspace. For this, we identify nodes that share common entities such as diseases, medications, procedures, and symptoms. We use a greedy approach to minimize the number of new nodes added to the solution space by expanding the nodes that would provide the maximum connectivity with a minimal number of nodes added. To implement the greedy approach, we identify two nodes in the knowledge graph that have the smallest number of children and share at least one child between them. This common child acts as a path between the two unconnected trees, making them a single connected graph. We repeat the process until the whole graph becomes connected.
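The greedy expansion step above can be sketched as follows. This is a minimal, self-contained illustration on toy data; the helper names (`components`, `expand_until_connected`) and the component-by-component pairing are our own simplifications, not the paper's implementation.

```python
def components(nodes, edges):
    """Connected components of the (undirected) solution subgraph."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for n in nodes:
        if n in seen:
            continue
        stack, comp = [n], set()
        while stack:
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            seen.add(cur)
            stack.extend(adj[cur] - comp)
        comps.append(comp)
    return comps

def expand_until_connected(nodes, edges, kg_children):
    """Greedily add shared children from the full KG until one component remains."""
    nodes, edges = set(nodes), set(edges)
    while len(components(nodes, edges)) > 1:
        comps = components(nodes, edges)
        best = None
        for i, ca in enumerate(comps):        # pick one node from each of
            for cb in comps[i + 1:]:          # two disconnected trees
                for a in ca:
                    for b in cb:
                        shared = set(kg_children.get(a, ())) & set(kg_children.get(b, ()))
                        if not shared:
                            continue
                        # prefer the pair with the fewest children overall
                        cost = len(kg_children.get(a, ())) + len(kg_children.get(b, ()))
                        cand = (cost, a, b, min(shared))
                        if best is None or cand < best:
                            best = cand
        if best is None:
            break  # no connecting child exists anywhere in the KG
        _, a, b, child = best
        nodes.add(child)                      # the shared child bridges the trees
        edges.update({(a, child), (b, child)})
    return nodes, edges

# Toy KG: the two symptom trees are disconnected but share the child "influenza".
KG_CHILDREN = {"fever": ["influenza", "sepsis"], "cough": ["influenza"]}
sub_nodes, sub_edges = expand_until_connected({"fever", "cough"}, set(), KG_CHILDREN)
```

Choosing the pair with the fewest children mirrors the paper's preference for low-fanout nodes, which keeps very common symptoms from flooding the subspace with unrelated diagnoses.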
Figure 1. Overall architecture of the proposed KG-based clinical diagnosis inference system (components 1-10).
The solution space can grow exponentially in size if the expanded node is a very common symptom. Also, expanding a very common symptom adds many unrelated diagnoses and procedures to our solution space. By following the expansion strategy mentioned above, we avoid the risk of adding unrelated nodes to our solution space. Component 6 in Figure 1 represents this process.

4) Activating Nodes in the Solution Space: Next, we start activating nodes in the solution space to find a set of probable diagnoses. We start with the weights of the input query symptoms. These weights are then spread across the knowledge graph with an activation module and a control module.

The activation module takes the weight of a symptom node and propagates the weight to its immediate neighbors. The propagated weight is dampened by a factor, so that the weight propagated from a node weakens (i.e., lessens) as the propagation moves farther away from the initial symptom. All the nodes keep propagating the activation to their neighbors except to their parent node. The motivation for performing this step is to accumulate the weights from symptoms onto the nodes containing diagnoses.

The control module is responsible for stopping an activation if the propagated weight is below a certain threshold. We use a very small value, 0.001, as our threshold, below which the activation does not contribute much to the accumulation process. The control module also makes sure that there is no cyclic propagation of weights and keeps track of the nodes to which the current node has passed the activation.

The end result of this stage is a weighted graph, where each node is weighted based on the accumulation of the proportion of weights propagated from the symptoms. Components 7-8 in Figure 1 represent this stage.

5) Identifying and Ranking Diagnoses: Since our ultimate goal is to infer a set of diagnoses, any node in the weighted graph that is not a disease/syndrome node is filtered out from our solution space. The remaining nodes form the set of possible diagnoses for the input symptoms retrieved from the clinical narrative.
1007
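The activation and control modules described above can be sketched as follows on a toy solution space. The damping factor value (0.5) is our assumption (the paper only fixes the 0.001 cutoff), and the parent-exclusion rule here is a simplified stand-in for the paper's full cycle bookkeeping.

```python
DAMPING = 0.5      # assumed value; the paper does not specify the factor
THRESHOLD = 0.001  # control module: activations below this are dropped

def activate(adj, seed_weights):
    """Accumulate damped symptom weights on every node of the solution space.

    adj: {node: [neighbors]} (undirected adjacency of the solution space).
    seed_weights: {symptom: weight}, e.g. inverse PubMed term frequency.
    """
    accumulated = {n: 0.0 for n in adj}
    for seed, weight in seed_weights.items():
        frontier = [(seed, None, weight)]      # (node, parent, incoming weight)
        while frontier:
            nxt = []
            for node, parent, w in frontier:
                if w < THRESHOLD:
                    continue                   # cutoff also bounds any cycles
                accumulated[node] += w
                for nb in adj.get(node, ()):
                    if nb != parent:           # never send back to the parent
                        nxt.append((nb, node, w * DAMPING))
            frontier = nxt
    return accumulated

ADJ = {"fever": ["influenza"],
       "influenza": ["fever", "pneumonia"],
       "pneumonia": ["influenza"]}
acc = activate(ADJ, {"fever": 1.0})  # fever 1.0 -> influenza 0.5 -> pneumonia 0.25
```

In the full system the seeds are the weighted symptom nodes from component 3, and the accumulated weights feed the ranking stage.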
For each disease node, we check the signs and symptoms mentioned in Wikipedia for that disease, and score the node based on the overlap between the symptoms in the clinical narrative and the content of that disease/syndrome Wikipedia page. The diseases are then re-ranked based on the overlap score, and they form the candidate set of diagnoses for the symptoms.

As the final ranking step, if the demographic information of the patient is retrieved from the clinical narrative, then we mine the epidemiology of the disease mentioned in Wikipedia to identify the prevalence of the current diagnosis in that age group. For example, if the disease is very prevalent in children but not in adults and the patient is mentioned as an adult, then the rank of the disease is pushed lower than the adult diseases in the list. Once the re-ranking process based on the epidemiology information is completed, we get our final ranked list of diagnoses inferred for the given clinical narrative. These final steps are represented in components 9-10 in Figure 1.

IV. EXPERIMENTS AND EVALUATION

A. Dataset

Figure 2. Distribution of diagnoses in MIMIC-III [8]

We evaluate the system on discharge notes in the MIMIC-III database [36]. MIMIC-III contains physiological signals and various measurements captured from patient monitors, and comprehensive clinical data obtained from hospital medical information systems for over 58K hospital admissions. We use the note events table from MIMIC-III v1.3, which contains the free-text clinical notes for patients. We use 'discharge summaries' instead of 'admission notes,' as the former contain the actual ground truth along with the free text. Table I shows an example discharge note used in this paper. The diagnoses present in the MIMIC-III notes are very specific and are not evenly distributed, as shown in Figure 2. Many diseases appear very few times in the corpus.

For the experiments, we used a subset of 14K notes. We processed the notes to extract the medical concepts based on SNOMED [37] using our hybrid clinical NLP engine [32]. For example, after processing a discharge note (e.g., in Table I), we get the concepts shown in Table II.

There are 4,186 unique diagnoses in the MIMIC-III discharge notes. However, many diagnoses (labels) occur in only a single note. The 50 most common labels cover 97% of the notes, and the 100 most common labels cover 99.97%. We present experiments for both the 50 and 100 most common labels.

B. Comparison of Sections of EHR as Clinical Narrative

We conducted extensive experiments to understand the role of the content in a particular section of a clinical note in inferring the correct diagnoses. Given the unstructured free text in each section of the medical note as input, we measure the accuracy of the system in identifying the diagnoses. In this study, we consider the following sections of a MIMIC-III note: social history (i.e., behavioral information such as smoking, drinking, diabetes, etc.), chief complaint (i.e., symptoms such as chest pain, headache, dizziness, etc.), history of present illness, past medical history, brief hospital course (i.e., information about procedures and medications provided during the hospital stay), and discharge medications.

We also compare our system to a state-of-the-art clinical diagnosis inference system on the MIMIC-III dataset, which uses Condensed Memory Neural Networks (C-MemNNs) [8] to formulate the task as a multiclass-multilabel classification problem. Due to the large number of diagnoses (class labels) in the dataset, the C-MemNN model simplifies the task by considering the most frequent N diagnoses for training. We adopt similar settings for our experiments.

C. Metrics

We use precision and recall to evaluate our systems. For a meaningful comparison, we consider two variations of these metrics: 1) strict (exact word match with the ground truth diagnosis), and 2) relaxed (allowing paraphrases and disease synonyms based on the human disease network [38]).

Recall that our knowledge graph is built using the medical concepts in Wikipedia, where the clinical concepts are mostly standardized and may differ from the abbreviated/colloquial usage of medical terms in a clinical note. For example, a MIMIC note may refer to "diabetes mellitus type 2" as "dm type 2", "diabetes type 2", "db 2", or "diabetes 2". Since our approach considers diagnosis concepts based on the Wikipedia page titles, a strict measure of precision and recall based on exact word overlap with the ground truth diagnosis may be insufficient to measure the effectiveness of our systems. Hence, we introduced the relaxed alternatives of the metrics.

The precision at 5, represented as P@5, is the ratio of correct diagnoses over the top five predictions. It should be noted that a MIMIC chart note can have many diagnoses, hence making this a very strict measure.
1008
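The overlap-based scoring used to re-rank candidate disease nodes (Section III-B.5) can be sketched as below. The page texts are toy stand-ins for the "Signs and symptoms" content of real Wikipedia pages, and the function name is our own.

```python
def rank_diagnoses(candidates, narrative_symptoms, signs_text):
    """Sort candidate diseases by overlap with the narrative's symptoms."""
    scored = []
    for disease in candidates:
        text = signs_text.get(disease, "").lower()
        # count how many narrative symptoms appear in the page's symptom text
        overlap = sum(1 for s in narrative_symptoms if s.lower() in text)
        scored.append((overlap, disease))
    scored.sort(key=lambda t: (-t[0], t[1]))   # ties broken alphabetically
    return [disease for _, disease in scored]

SIGNS = {
    "Epiglottitis": "fever, sore throat, stridor, drooling, dysphagia",
    "Common cold": "cough, sore throat, runny nose, fever",
}
ranked = rank_diagnoses(["Common cold", "Epiglottitis"],
                        ["fever", "stridor", "drooling"], SIGNS)
```

The epidemiology-based demotion step would then follow, pushing diseases whose prevalent age group conflicts with the patient's demographics further down this list.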
Table I
EXAMPLE OF A DISCHARGE NOTE IN MIMIC-III

Discharge Note:
CHIEF COMPLAINT: Chest pain.
HISTORY OF PRESENT ILLNESS: 41-year-old female
Diagnosis:
coronary artery disease, saphenous vein graft, myocardial infarction, stenting, coronary artery, coronary artery anastomosis.

Table II
MEDICAL CONCEPTS EXTRACTED USING A HYBRID CLINICAL NLP ENGINE [32]

coronary artery bypass, chest pain, discomfort, cold, pain
The recall at 5, represented as R@5, measures how well we cover all the possible diagnoses in the MIMIC note. In the relaxed setting of a measure, we consider a predicted diagnosis correct even when the ground truth diagnosis and the predicted diagnosis are synonyms or paraphrases of each other.

D. Results and Discussion

Table III shows the results of our experiments with various sections of the medical chart note. From these results, we can understand the role of content in inferring the correct diagnoses.

We can see that the individual sections perform better than the combined sections (All). This can be attributed to the generality of some of the sections in the MIMIC notes, where the procedures/medications apply to many diseases. Specifically, the brief hospital course section has many procedures that are common among several diseases, which may have led to its lower scores. On the other hand, the discharge medications section only covers the pain medications and may not be representative of the surgery or complications the patient had due to pre-existing chronic conditions.

Further analysis shows that social history has a higher score in the relaxed measure of precision when we consider the top 100 classes. However, this can be an aberration, as people with a social history of alcohol and smoking had higher chances of having diabetes, hypertension, and other lifestyle diseases that are among the most common diseases in the MIMIC notes. Not surprisingly, the results show that history of present illness and past medical history have the most relevant information for identifying the diagnoses.

We also compare our results with the C-MemNN [8] model. In their paper, the authors report results using three metrics: P@5, Area Under the Curve (AUC), and Hamming loss. AUC and Hamming loss are not appropriate metrics for our experimental settings, so we use precision- and recall-based metrics for this comparison. Results show that our systems have lower strict precision scores for the "top-50 classes" experiments. However, when we consider the top 100 classes, the All sections variant performs better than the C-MemNN system, which also uses all sections (except the diagnosis section) as the input of their model. Considering the relaxed precision metric, we find that the proposed KG-based system can perform better than the C-MemNN model with the selective use of content from various sections.

From our experiments, it is clear that the sections do not contribute equally to clinical diagnosis inference. Hence, it might be difficult for a machine learning system to learn the complex relationships among the medical concepts and the diagnoses present in a full clinical note. For the MIMIC-III dataset, our experiments suggest that training on the past medical history and history of present illness sections could help a machine learning system improve the accuracy of clinical diagnosis inference compared to considering the full clinical note.
Table III
EVALUATION RESULTS (All = combination of all considered sections; P = precision; R = recall; s = strict; r = relaxed; top scores are boldfaced).
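The strict and relaxed P@5/R@5 metrics of Section IV-C can be sketched as follows. The synonym table stands in for the human disease network [38]; its entries are illustrative only.

```python
SYNONYMS = {  # toy stand-in for the human disease network [38]
    "diabetes mellitus type 2": {"dm type 2", "diabetes type 2", "diabetes 2"},
}

def match(pred, truth, relaxed):
    """Strict: exact match. Relaxed: also accept listed synonyms/paraphrases."""
    if pred == truth:
        return True
    return relaxed and (pred in SYNONYMS.get(truth, set())
                        or truth in SYNONYMS.get(pred, set()))

def p_at_5(predicted, gold, relaxed=False):
    """Fraction of the top five predictions that match some gold diagnosis."""
    top5 = predicted[:5]
    return sum(1 for p in top5 if any(match(p, g, relaxed) for g in gold)) / 5

def r_at_5(predicted, gold, relaxed=False):
    """Fraction of gold diagnoses covered by the top five predictions."""
    top5 = predicted[:5]
    return sum(1 for g in gold if any(match(p, g, relaxed) for p in top5)) / len(gold)

preds = ["dm type 2", "hypertension", "asthma", "copd", "anemia"]
gold = ["diabetes mellitus type 2", "hypertension"]
# strict P@5 counts only "hypertension"; relaxed P@5 also credits "dm type 2"
```

As the paper notes, a note with many gold diagnoses makes strict P@5 a very conservative measure, since most of the top five slots cannot all be exact-match hits.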
VI. CONCLUSION

In this paper, we described our Knowledge Graph (KG)-based clinical diagnosis inference system. We conducted extensive experiments on the MIMIC-III benchmark dataset considering various sections of a clinical note. Results demonstrated that the content of the history of present illness and past medical history sections can contribute the most to clinical diagnosis inference compared to all sections combined. Furthermore, we showed that the proposed KG-based system can perform well in comparison to the state-of-the-art C-MemNN model under a relaxed precision metric.

In future work, we plan to improve the current KG-based diagnosis inference system by adding more properties (e.g., relationships among the clinical concepts) to the edges of the knowledge graph. We would also like to utilize the findings of this study to improve the training data sets for machine learning models that infer clinical diagnoses from free-text narratives.

REFERENCES

[1] G. Norman, M. Young, and L. Brooks, "Non-analytical models of clinical reasoning: the role of experience," Medical Education, vol. 41, no. 12, pp. 1140–1145, 2007. [Online]. Available: http://dx.doi.org/10.1111/j.1365-2923.2007.02914.x

[5] D. Milne and I. H. Witten, "An open-source toolkit for mining Wikipedia," Artificial Intelligence, vol. 194, pp. 222–239, 2013.

[6] B. Katz, G. Marton, G. C. Borchardt, A. Brownell, S. Felshin, D. Loreto, J. Louis-Rosenberg, B. Lu, F. Mora, S. Stiller et al., "External knowledge sources for question answering," in TREC, 2005.

[7] A. E. Johnson, T. J. Pollard, L. Shen, L.-w. H. Lehman, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L. A. Celi, and R. G. Mark, "MIMIC-III, a freely accessible critical care database," Scientific Data, vol. 3, 2016.

[8] A. Prakash, S. Zhao, S. A. Hasan, V. Datla, K. Lee, A. Qadir, J. Liu, and O. Farri, "Condensed memory networks for clinical diagnostic inferencing," AAAI, 2016.

[9] Z. C. Lipton, D. C. Kale, C. Elkan, and R. Wetzell, "Learning to diagnose with LSTM recurrent neural networks," arXiv preprint arXiv:1511.03677, 2015.

[10] E. Choi, M. T. Bahadori, and J. Sun, "Doctor AI: Predicting clinical events via recurrent neural networks," arXiv preprint arXiv:1511.05942, 2015.

[11] E. Choi, M. T. Bahadori, J. Sun, J. Kulas, A. Schuetz, and W. Stewart, "RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism," in Advances in Neural Information Processing Systems, 2016, pp. 3504–3512.
[12] M. S. Simpson, E. M. Voorhees, and W. Hersh, "Overview of the TREC 2014 clinical decision support track," DTIC Document, Tech. Rep., 2014.

[13] K. Roberts, M. S. Simpson, E. Voorhees, and W. R. Hersh, "Overview of the TREC 2015 clinical decision support track," in TREC, 2015.

[14] S. A. Hasan, S. Zhao, V. Datla, J. Liu, K. Lee, A. Qadir, A. Prakash, and O. Farri, "Clinical question answering using key-value memory networks and knowledge graph," in TREC, 2016.

[15] S. A. Hasan, Y. Ling, J. Liu, and O. Farri, "Using neural embeddings for diagnostic inferencing in clinical question answering," 2015.

[16] T. R. Goodwin and S. M. Harabagiu, "Medical question answering for clinical decision support," in Proceedings of the 25th ACM International Conference on Information and Knowledge Management. ACM, 2016, pp. 297–306.

[17] Y. Ling, Y. An, and S. A. Hasan, "Improving clinical diagnosis inference through integration of structured and unstructured knowledge," in Proceedings of the 1st EACL Workshop on Sense, Concept and Entity Representations and their Applications (SENSE), 2017.

[18] Y. Ling, Y. An, M. Liu, S. A. Hasan, Y. Fan, and X. Hu, "Integrating extra knowledge into word embedding models for biomedical NLP tasks," in Proceedings of the 30th International Joint Conference on Neural Networks (IJCNN), 2017.

[19] Z. Zheng and X. Wan, "Graph-based multi-modality learning for clinical decision support," in Proceedings of the 25th ACM International Conference on Information and Knowledge Management. ACM, 2016, pp. 1945–1948.

[20] S. Balaneshin-kordan and A. Kotov, "Optimization method for weighting explicit and latent concepts in clinical decision support queries," in Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval. ACM, 2016, pp. 241–250.

[21] L. Shi, S. Li, X. Yang, J. Qi, G. Pan, and B. Zhou, "Semantic health knowledge graph: Semantic integration of heterogeneous medical knowledge and services."

[22] S. Geng and Q. Zhang, "Clinical diagnosis expert system based on dynamic uncertain causality graph," in Information Technology and Artificial Intelligence Conference (ITAIC), 2014 IEEE 7th Joint International. IEEE, 2014, pp. 233–237.

[23] D. Ferrucci, A. Levas, S. Bagchi, D. Gondek, and E. T. Mueller, "Watson: beyond Jeopardy!" Artificial Intelligence, vol. 199, pp. 93–105, 2013.

[24] A. Lally, S. Bagchi, M. A. Barborak, D. W. Buchanan, J. Chu-Carroll, D. A. Ferrucci, M. R. Glass, A. Kalyanpur, E. T. Mueller, J. W. Murdock et al., "WatsonPaths: scenario-based question answering and inference over unstructured information," Yorktown Heights: IBM Research, 2014.

[25] K. Bollacker, C. Evans, P. Paritosh, T. Sturge, and J. Taylor, "Freebase: a collaboratively created graph database for structuring human knowledge," in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 2008, pp. 1247–1250.

[26] F. M. Suchanek, G. Kasneci, and G. Weikum, "YAGO: a core of semantic knowledge," in Proceedings of the 16th International Conference on World Wide Web. ACM, 2007, pp. 697–706.

[27] S. Auer, C. Bizer, G. Kobilarov, J. Lehmann, R. Cyganiak, and Z. Ives, "DBpedia: A nucleus for a web of open data," in The Semantic Web. Springer, 2007, pp. 722–735.

[28] S. Yang, Y. Xie, Y. Wu, T. Wu, H. Sun, J. Wu, and X. Yan, "SLQ: a user-friendly graph querying system," in Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. ACM, 2014, pp. 893–896.

[29] S. Jonnalagadda, T. Cohen, S. Wu, and G. Gonzalez, "Enhancing clinical concept extraction with distributional semantics," Journal of Biomedical Informatics, vol. 45, no. 1, pp. 129–140, 2012.

[30] Y. Li, S. Lipsky Gorman, and N. Elhadad, "Section classification in clinical notes using supervised hidden Markov model," in Proceedings of the 1st ACM International Health Informatics Symposium. ACM, 2010, pp. 744–750.

[31] R. Pivovarov and N. Elhadad, "A hybrid knowledge-based and data-driven approach to identifying semantically similar concepts," Journal of Biomedical Informatics, vol. 45, no. 3, pp. 471–481, 2012.

[32] S. A. Hasan, X. Zhu, Y. Dong, J. Liu, and O. Farri, "A hybrid approach to clinical question answering," in Proceedings of The Twenty-Third Text REtrieval Conference, TREC 2014, Gaithersburg, Maryland, USA, November 19-21, 2014.

[33] K. A. Spackman, K. E. Campbell, and R. A. Côté, "SNOMED RT: a reference terminology for health care," in Proceedings of the AMIA Annual Fall Symposium. American Medical Informatics Association, 1997, p. 640.

[34] O. Bodenreider, "The Unified Medical Language System (UMLS): integrating biomedical terminology," Nucleic Acids Research, vol. 32, no. suppl 1, pp. D267–D270, 2004.

[35] C. P. Langlotz, "RadLex: a new method for indexing online educational materials," 2006.

[36] A. E. Johnson, T. J. Pollard, L. Shen, L.-w. H. Lehman, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L. A. Celi, and R. G. Mark, "MIMIC-III, a freely accessible critical care database," Scientific Data, vol. 3, 2016.

[37] M. Q. Stearns, C. Price, K. A. Spackman, and A. Y. Wang, "SNOMED Clinical Terms: overview of the development process and project status," in Proceedings of the AMIA Symposium. American Medical Informatics Association, 2001, p. 662.

[38] L. M. Schriml, C. Arze, S. Nadendla, Y.-W. W. Chang, M. Mazaitis, V. Felix, G. Feng, and W. A. Kibbe, "Disease Ontology: a backbone for disease semantic integration," Nucleic Acids Research, vol. 40, no. D1, pp. D940–D946, 2012.