Ncomms 12474
Received 24 Jan 2016 | Accepted 6 Jul 2016 | Published 16 Aug 2016 DOI: 10.1038/ncomms12474 OPEN
Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is
indispensable for its diagnosis. However, human evaluation of pathology slides cannot
accurately predict patients’ prognoses. In this study, we obtain 2,186 haematoxylin and eosin
stained histopathology whole-slide images of lung adenocarcinoma and squamous cell
carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from
Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features
and use regularized machine-learning methods to select the top features and to distinguish
shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P < 0.003)
or squamous cell carcinoma (P = 0.023) in the TCGA data set. We validate the survival
prediction framework with the TMA cohort (P < 0.036 for both tumour types). Our results
suggest that automatically derived image features can predict the prognosis of lung
cancer patients and thereby contribute to precision oncology. Our methods are extensible to
histopathology images of other organs.
1 Biomedical Informatics Program, Stanford University, 1265 Welch Road, MSOB, X-215, MC 5479, Stanford 94305-5479, California, USA. 2 Department of
Genetics, Stanford University, 300 Pasteur Dr, M-344, Stanford 94305-5120, California, USA. 3 Department of Computer Science, Stanford University, 353
Serra Mall, Stanford 94305-9025, California, USA. 4 Department of Pathology, Stanford University, 300 Pasteur Dr, L235, Stanford 94305, California, USA.
* These authors contributed equally to this work. Correspondence and requests for materials should be addressed to D.L.R. (email: [email protected]) or
to M.S. (email: [email protected]).
Lung cancer is the most prevalent cancer and the leading cause of cancer-related deaths worldwide, resulting in more than 1.4 million deaths annually1,2. Evaluation of the microscopic histopathology slides by experienced pathologists is indispensable to establishing the diagnosis3–5 and defines the types and subtypes of lung cancers, including the two major types of non-small cell lung cancer: adenocarcinoma and squamous cell carcinoma6–8. The distinction of squamous cell carcinoma from adenocarcinoma is important for chemotherapeutic selection, because certain antineoplastic agents are contraindicated for squamous cell carcinoma patients because of decreased efficacy9 or increased toxicity10. In addition, more adenocarcinoma patients possess genetic aberrations with available targeted therapy, such as EGFR mutations and ALK rearrangements11–13. Certain histological features, such as pathology grade, have been associated with survival outcomes in some studies14,15. Prompt and meticulous inspection of tumour histomorphology is critical to patient care, and determination of relevant prognostic markers is the key to personalized cancer management. For example, patients with poorer prognoses may benefit from closer follow-up, more aggressive forms of treatment, and advance care planning16,17.

Currently, lung cancer samples are manually evaluated for their histological features by light microscopy. However, qualitative evaluation of well-established histopathology patterns alone (such as the classification of tumour grades) is insufficient for predicting the survival outcomes of patients with lung adenocarcinoma or lung squamous cell carcinoma18,19, and even the best-characterized histopathology features only achieve modest agreements among experienced pathologists. As an illustration, the inter-observer agreement for features that define the types of non-small cell lung cancer is moderate (κ = 0.48–0.64)20, and the diagnostic agreement for classifying adenocarcinomas and squamous carcinomas is also relatively low (κ = 0.41–0.46 among community pathologists, κ = 0.64–0.69 among pulmonary pathology experts and κ = 0.55–0.59 among all pathologists under study)21. Poorer tumour differentiation and poorer slide quality were associated with lower diagnostic agreement21. Several recent studies have attempted to define additional visual features for prognostic prediction for patients with lung adenocarcinoma4,22,23 or lung squamous cell carcinoma24,25. However, there is still considerable room for improvement for the inter-rater agreements of these features26–28. Subjective or erroneous evaluation of histopathology images may lead to poor therapeutic choice, which results in decreased survival and loss of quality of life in numerous patients29.

Computerized image processing technology has been shown to improve efficiency, accuracy and consistency in histopathology evaluations, and can provide decision support to ensure diagnostic consistency30. Automated histopathological analysis systems also have been proven to be valuable in prognostic determinations of various malignancies, including breast cancer31, neuroblastoma32, lymphoma33 and pre-cancerous lesions in the esophagus34. Automated systems can identify candidate regions that require further diagnostic assessment and propose novel image features useful for prognosis. Current clinical practice could thus benefit greatly from the development and incorporation of such systems into clinical care31,32. With the recent availability of digital whole-slide images30, there is now an opportunity for systematic analysis of the microscopic morphology of lung cancer cells, whose structural diversity had previously posed a great challenge for automated analysis35,36. In particular, there is the possibility of identifying previously unrecognized image features that correlate with patients’ prognoses, and potentially guide treatment decisions31.

In this study, we aim to improve the prognostic prediction of lung adenocarcinoma and squamous cell carcinoma patients through objective features distilled from histopathology images. We design a fully automated informatics pipeline to extract objective quantitative image features, assess the diagnostic utility of the feature sets, build classifiers to distinguish lung cancers with different survival outcomes, discover novel image features that predict patient prognosis and validate the results in an independent data set. Our methods may ultimately provide prognostic information for the patients, and contribute to precision medicine of lung cancer.

Results
Patient characteristics and fully automated image features. We obtained 2,186 haematoxylin and eosin (H&E) stained whole-slide histopathology images from The Cancer Genome Atlas (TCGA)37,38, encompassing lung adenocarcinoma and lung squamous cell carcinoma as well as adjacent benign tissue. All images captured at ×40 magnification were tiled with open microscopy environment tools39. To target regions with pathological changes, our automated pipeline skipped regions with relatively sparse cellularity such as alveolar spaces and selected the 10 densest tiles per image for further analysis. We also acquired 294 tissue microarray images from the Stanford Tissue Microarray (TMA) Database40, with one representative histopathology image selected by pathologists for each of the 227 lung adenocarcinoma and 67 lung squamous cell carcinoma patients. Patient characteristics of both the TCGA and TMA cohorts are summarized in Tables 1 and 2, respectively.

To extract objective morphological information from thousands of images, we built a fully automated image-segmentation pipeline to identify the tumour nuclei and tumour cytoplasm from the histopathology images using the Otsu method41 (see Methods for details), and extracted quantitative features from the identified tumour nuclei and cytoplasm (Supplementary Fig. 1). Our fully automated pipeline reliably identified most tumour cells and tumour nuclei, and the results were consistent across different slides and images from different batches (Supplementary Fig. 2). A total of 9,879 quantitative features were extracted from each image tile with CellProfiler42,43. Types of image features included cell size, shape, distribution of pixel intensity in the cells and nuclei, as well as texture of the cells and nuclei. Supplementary Table 1 provides a list of feature categories included in this study.

Image features accurately identify tumour parts. To determine if the quantitative image features were biologically relevant, we first examined if they could distinguish malignancy from normal adjacent tissue (inflammation, atelectasis or lymphocytic infiltration in the absence of tumour cells) for the TCGA cohort. We used seven classifiers: naive Bayes, support vector machines (SVM) with Gaussian kernel, SVM with linear kernel, SVM with polynomial kernel, bagging for classification trees, random forest utilizing conditional inference trees44 and Breiman’s random forest45. The TCGA data set was randomly partitioned into distinct training and test sets, with models built and optimized through the training data and classification performance evaluated through the test set. This process was repeated 20 times to ensure the robustness of our classifiers. Our classifiers achieved an average area under the receiver operating characteristic curve (AUC) of 0.81 (best classifiers: SVM with Gaussian kernel, random forest utilizing conditional inference trees, and Breiman’s random forest (AUC = 0.85). The performance of these three classifiers did not differ significantly (analysis of variance (ANOVA) test P value = 0.8514)) in distinguishing between adenocarcinoma and adjacent dense benign tissue when using the top 80 quantitative features (Fig. 1a and Supplementary Table 2). When classifying
Table 1 | Patient characteristics of TCGA cohort. Table 2 | Patient characteristics of the TMA cohort.
distinguish between benign and malignant lesions. For instance, Haralick texture features such as sum entropy and difference variance were among the top features in both classification tasks.

The relevance of our quantitative image features for diagnostic classification was also validated in the TMA data set. Utilizing the same informatics pipeline on these samples, most of the classifiers achieved AUC around 0.78 (SVM with Gaussian kernel has the highest AUC of 0.85; the performance of the top three classifiers did not differ significantly (ANOVA test P value = 0.13)), indicating the robustness of our informatics method (Fig. 2b and Supplementary Table 3). The slightly higher AUC in the TMA samples relative to the TCGA samples may be due to the manual selection of representative views by the pathologist, whereas the entire slide was used for the TCGA samples. The top quantitative features included texture features in the tumour nucleus and cytoplasm, and radial distribution of pixel intensity.

Image features predict stage I adenocarcinoma survival. We next investigated the prognostic values of our quantitative feature sets. Stage I adenocarcinoma patients are known to have diverse survival outcomes (Fig. 3a). In the TCGA cohort, more than 50% of the stage I adenocarcinoma patients died within 5 years after the initial diagnosis, whereas ~15% of the patients survived for more than 10 years. A number of studies aimed to distinguish patients with different survival outcomes with additional visual patterns22,23. However, non-systematic errors may take place using these subjective assessments, and these visual evaluations are hard to standardize26–28. It is thus difficult for human evaluators to predict survival outcomes based purely on the H&E stained microscopic slides4,18. Although higher tumour grade is thought to be associated with poorer survival outcomes14, this association is weak in patients with stage I lung adenocarcinoma in both TCGA and TMA data sets (log-rank test P value > 0.05; Fig. 3b).

With an aim to provide better prognostic prediction with the H&E slides, we investigated whether our quantitative features could predict survival in stage I patients. We built elastic net-Cox proportional hazards models46 to select the most informative quantitative image features and calculated survival indices derived from H&E stained microscopic pathology images (see Methods). Patients were categorized into longer-term or shorter-term survivors based on their survival indices. Our model successfully distinguished shorter-term survivors from longer-term survivors in the test set (log-rank test P value = 0.0023; Fig. 3c). Among the 60 image features
[Figure 1 plots, panels a and b: ROC curves, sensitivity versus 1 − specificity. Legend entries recoverable from the extraction: (a) Bagging (AUC=0.83); (b) Bagging (AUC=0.87), Naive bayes (AUC=0.77).]
Figure 1 | Quantitative image features accurately distinguished malignancies from adjacent dense normal tissues. (a) ROC curves for classifying lung
adenocarcinoma versus adjacent dense normal tissues in the TCGA test set. Classifiers with 80 features attained average AUC of 0.81. (b) ROC curves for
classifying lung squamous cell carcinoma from adjacent dense normal tissues in the TCGA test set. Classifiers with 80 features attained average AUC of
0.85. The performance of different classifiers is shown. CIT, conditional inference trees; ROC, receiver operator characteristics.
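The AUC values reported for Figure 1 summarize how well classifier scores rank malignant tiles above benign ones. As a minimal illustrative sketch (not the authors' implementation, which used existing classifier packages), the AUC can be computed directly as the fraction of positive/negative pairs that are ranked correctly, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration only.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative example
    (ties count as 0.5). labels: 1 = positive, 0 = negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = malignant tile, 0 = adjacent benign tile.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of 9 pairs are ranked correctly
```

The pairwise form is quadratic in the number of examples; it is chosen here for clarity rather than speed.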
[Figure 2 plots, panels a and b: ROC curves, sensitivity versus 1 − specificity. Legend entries recoverable from the extraction: (a) Bagging (AUC=0.74), Naive bayes (AUC=0.63), Random forest (AUC=0.75), Random forest with CITs (AUC=0.73), SVMs with gaussian kernel (AUC=0.75), SVMs with linear kernel (AUC=0.70), SVMs with polynomial kernel (AUC=0.74); (b) Bagging (AUC=0.75), Naive bayes (AUC=0.73).]
Figure 2 | Quantitative image features successfully distinguished histopathology images of lung adenocarcinoma from those of lung squamous
cell carcinoma. (a) ROC curves for classifying the two malignancies in the TCGA test set. Most classifiers achieved AUC > 0.7. (b) ROC curves for
classifying the two malignancies in the TMA test set. Most classifiers achieved AUC > 0.75, indicating that our informatics pipeline was successfully
validated in the independent TMA data set. The performance of different classifiers is shown. CIT, conditional inference trees; ROC, receiver operator
characteristics.
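For the tumour-type classification summarized in Figure 2, the Methods state that prediction results for image tiles of the same patient were aggregated, but the aggregation rule is not given in this text. The sketch below assumes the simplest option, averaging tile-level probabilities per patient and thresholding at 0.5; the function name, the threshold and the toy data are all hypothetical.

```python
from collections import defaultdict


def aggregate_by_patient(tile_predictions):
    """Average tile-level probabilities for each patient.

    tile_predictions: iterable of (patient_id, probability) pairs, where the
    probability is the classifier's score for one class (assumed here to be
    adenocarcinoma). Averaging is an assumption, not the documented rule.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for patient_id, prob in tile_predictions:
        sums[patient_id] += prob
        counts[patient_id] += 1
    return {pid: sums[pid] / counts[pid] for pid in sums}


# Toy data (hypothetical): several tiles per patient.
tiles = [("P1", 0.9), ("P1", 0.7), ("P1", 0.8), ("P2", 0.2), ("P2", 0.4)]
patient_scores = aggregate_by_patient(tiles)
patient_labels = {
    pid: "adenocarcinoma" if score >= 0.5 else "squamous cell carcinoma"
    for pid, score in patient_scores.items()
}
```

Other rules, such as majority voting over tiles, would fit the same description equally well.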
selected by our methods, the top features that facilitated classification of survival outcomes included texture of the nuclei, Zernike shape decomposition of the nuclei, and Zernike shape decomposition of the cytoplasm (Supplementary Data 1).

Our approach for survival prediction was validated with images from an independent data set (the Stanford TMA database). The same image processing workflow with elastic net-Cox proportional hazards model selected a similar set of features, which also successfully distinguished longer-term survivors from shorter-term survivors in the stage I adenocarcinoma cohort (log-rank test P value = 0.028; Fig. 3d). The patients in different survival groups did not have significantly different treatments (χ²-test P value > 0.9 for neoadjuvant chemotherapy, radiation therapy and targeted molecular therapy).

Figure 3e,f show some examples of histopathology images from stage I lung adenocarcinoma patients with the same pathology grade, but with different survival outcomes. The differences in tumour cell morphology between the two histopathology images were not easily identified by visual inspection, but could be distinguished based on our quantitative image features. These quantitative features proved to be useful in predicting survival outcomes of stage I adenocarcinoma patients.

Image features predict squamous cell carcinoma survival. Stage and grade alone only have limited predictive values in stratifying survival outcomes in patients with squamous cell carcinoma (log-rank test P value > 0.2; Fig. 4a,b)19. To validate the generalizability of our survival prediction method to other lung cancers, we utilized a similar informatics workflow incorporating image features and tumour stage to build prediction models in squamous cell carcinoma based on our quantitative image features. Our elastic net models selected 15 features and classified patients into different survival groups (log-rank test P value = 0.023; Fig. 4c). Features most indicative of survival outcomes included Zernike shape in the tumour nuclei and cytoplasm (Supplementary Data 2).

Our prognostic methodology for squamous cell carcinoma was also confirmed in the independent Stanford TMA cohort. Elastic net-Cox proportional hazards model successfully distinguished longer-term survivors from shorter-term survivors with lung squamous cell carcinoma (log-rank test P value = 0.035; Fig. 4d). The patients in different survival groups did not have significantly different treatments (χ²-test P value > 0.71 for neoadjuvant chemotherapy, radiation therapy and targeted molecular
[Figure 3 plots, panels a to f. (a,b) Kaplan–Meier curves, probability of survival versus months, stratified by tumour stage and by tumour grade (panel b annotations: P = 0.06 and P = 0.0502). (c,d) Kaplan–Meier curves for the predicted prognostic groups, longer-term versus shorter-term survivors (annotations: P = 0.0023 and P = 0.028). (e,f) Sample histopathology images, scale bars 200 µm.]
Figure 3 | Quantitative image features predicted the survival outcomes of stage I lung adenocarcinoma patients. (a) Kaplan–Meier curves of lung adenocarcinoma patients stratified by tumour stage. Patients with higher stages tended to have worse prognosis (log-rank test P value < 0.001 in TCGA data set, log-rank test P = 0.0068 in TMA data set). However, the survival outcomes varied widely (left: TCGA data set, right: TMA data set). (b) Kaplan–Meier curves of stage I lung adenocarcinoma patients stratified by tumour grade. Tumour grade did not significantly correlate with survival (left: TCGA data set, log-rank test P value = 0.06; right: TMA data set, log-rank test P value = 0.0502). (c) Kaplan–Meier curves of stage I lung adenocarcinoma patients stratified using quantitative image features. Image features predicted the survival outcomes. Elastic net-Cox proportional hazards model categorized patients into two prognostic groups, with a statistically significant difference in their survival outcomes in the TCGA test set (log-rank test P value = 0.0023). (d) The same classification workflow was validated in the TMA data set, with comparable prediction performance (log-rank test P value = 0.028). (e) Sample image of stage I adenocarcinoma with long survival. This patient suffered from stage IB, grade 3 lung adenocarcinoma, and survived more than 99 months after diagnosis. Our classifier correctly predicted the patient as a long survivor. (f) Sample image of stage I adenocarcinoma with short survival. This patient suffered from stage IB, grade 3 lung adenocarcinoma, and survived less than 12 months after diagnosis. Our classifier correctly predicted the patient as a short survivor.
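The P values quoted for the Kaplan–Meier comparisons in Figure 3 come from log-rank tests. As a minimal sketch of how a two-group log-rank test works (an illustration under stated assumptions, not the authors' code, which is not shown in this text), the statistic compares the observed number of deaths in one group with the number expected if both groups shared the same hazard. The function name and the toy survival times are hypothetical; events are coded 1 for an observed death and 0 for right-censoring.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up times; events_*: 1 = death observed, 0 = right-censored.
    Returns (chi-square statistic, P value on 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        # Deaths occurring exactly at time t.
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance of deaths in group A
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy data (hypothetical): survival in months, all deaths observed.
chi2, p_value = logrank_test([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
```

In practice an established survival-analysis package would normally be used; the explicit loop is shown only to make the observed-versus-expected logic visible.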
therapy). Similarly, Zernike shape, texture and radial distribution of intensity were among the top prediction features. Figure 4e,f shows examples of histopathology images from squamous cell carcinoma patients with the same pathology stage and grade, but with different survival outcomes. As with lung adenocarcinoma, the visual features associated with survival outcomes of lung squamous carcinoma were not well established24,25, but our methodology could quantify some of the pathology patterns predictive of patient survival.

Discussion
To our knowledge, this is the first study to predict the prognoses of lung cancer patients by quantitative histopathology features extracted from whole-slide pathology images. In this study, we designed an automated workflow that identified thousands of objective features from the images, and built and evaluated machine-learning classifiers to predict the survival outcomes of lung cancer patients. We also validated our methodology using histopathology images from an independent tissue microarray database.

Previously, the vast amount of information contained in whole-slide pathology images has posed a great computational challenge to researchers. The huge dimension of the original images made them extremely difficult to manipulate, and informatics workflows requiring manual tumour tissue segmentation were not feasible for millions of image tiles. As such, previous
[Figure 4 plots, panels a to f. (a,b) Kaplan–Meier curves, probability of survival versus months, stratified by tumour stage and by tumour grade. (c,d) Kaplan–Meier curves for the predicted prognostic groups (annotations: P = 0.023 and P = 0.035). (e,f) Sample histopathology images, scale bars 200 µm.]
Figure 4 | Quantitative image features predicted the survival outcomes of lung squamous cell carcinoma patients. (a) Kaplan–Meier curves of lung squamous cell carcinoma patients stratified by tumour stage. Although patients with higher stages generally have worse outcomes, the trend was not statistically significant (left: TCGA data set, log-rank test P value = 0.216; right: TMA data set, log-rank test P value = 0.388). (b) Kaplan–Meier curves of stage I lung squamous cell carcinoma patients stratified by tumour grade. Tumour grade did not significantly correlate with survival (left: TCGA data set, log-rank test P value = 0.847; right: TMA data set, log-rank test P value = 0.964). (c) Kaplan–Meier curves of lung squamous cell carcinoma patients stratified using quantitative image features. The image features predicted the survival outcomes. Elastic net-Cox proportional hazards model categorized patients into two prognostic groups, with a statistically significant difference in their survival in the TCGA test set (log-rank test P value = 0.023). (d) The same classification workflow was validated in the TMA data set, with comparable prediction performance (log-rank test P value = 0.035). (e) Sample image of lung squamous cell carcinoma in a patient with long survival. This patient suffered from stage I, grade 1 lung squamous cell carcinoma, and survived more than 70 months after diagnosis. Our classifier correctly predicted the patient as a long survivor. (f) Sample image of squamous cell carcinoma in a patient with short survival. This patient suffered from stage I, grade 1 lung squamous cell carcinoma, and only survived 12.4 months after diagnosis. Our classifier correctly predicted the patient as a short survivor.
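The two prognostic groups in Figure 4c,d are obtained by thresholding each patient's survival index at the median index of the training set and applying the same threshold to the test set (see Methods). A minimal sketch of that split follows. It assumes that a higher survival index means higher predicted risk and that patients exactly at the threshold fall into the longer-term group; neither convention is stated in this text, and the function names and numbers are hypothetical.

```python
from statistics import median


def fit_threshold(train_indices):
    """Median survival index of the training set, used as the cut-off."""
    return median(train_indices)


def assign_groups(indices, threshold):
    """Label each patient by comparing the survival index with the threshold.

    Assumption: a higher index means higher risk, so indices above the
    threshold are labelled shorter-term survivors.
    """
    return ["shorter-term" if x > threshold else "longer-term" for x in indices]


# Toy survival indices (hypothetical values).
train_indices = [-1.2, -0.3, 0.1, 0.8, 1.5]
test_indices = [-0.5, 0.1, 0.9]

threshold = fit_threshold(train_indices)               # median of training set
test_groups = assign_groups(test_indices, threshold)   # same cut-off on test set
```

Fixing the cut-off on the training set keeps the test set untouched until evaluation, which is what allows the log-rank comparison on the test set to serve as validation.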
investigators have only focused on selected representative views in tissue microarrays rather than whole slides31,47. An advantage of our approach is that no additional human effort is needed in our informatics workflow other than the diagnostic labels and survival information for the training data. This makes it scalable to the large amount of information contained in whole-slide pathology images. To our knowledge, this is the first study to show the utility of fully automated quantitative image features extracted from whole-slide histopathology images to predict patient survival. As such, it could provide rapid and objective survival prediction for numerous patients.

An important component of our image processing technique is the selection of the densest image tiles, as they generally contain the most cells per image. Since normal lung is composed predominantly of alveolar structures that are relatively sparse in cells, the densest image tiles typically show pathological changes, such as tumour, lymphocytic infiltration, inflammation or atelectasis, which are tissue regions where image feature extraction is expected to be biologically informative. We further established an automated pipeline to identify tumour-like cells and extract 9,879 features directly from the images. These features capture both the local anatomical structure (for example, shape of the cell nuclei) and more global patterns (for example, texture) of the tumour cell and tumour nuclei. As a benchmark for the utility of our objective features, machine-learning models with selected features successfully identified images with tumour cells and classified tumour types, showing that our image features could recapture the important image labels annotated by trained pathologists.

Patients with lung adenocarcinoma or squamous cell carcinoma are known to have very diverse survival outcomes. Even patients with the same stage and pathology grade can have very different survival times18,19. Indeed, patients with stage I lung adenocarcinoma exhibit a broad survival range, and clinical stage only weakly predicted the survival outcomes of lung squamous cell carcinoma patients. Historically, with the exception of pathological stage, the examination of H&E stained microscopic slides has provided limited information on patients’ prognoses. Currently, morphological assessment of subtypes of well-differentiated adenocarcinoma or squamous cell carcinoma in combination with molecular testing yields some useful prognostic information4,48–50. In this study, we demonstrated that the extracted quantitative morphological features in the H&E stained slides, including Zernike shape features, predict patient survival. These quantitative image features are generally difficult to spot by manual inspection, but computerized methods can efficiently and effectively identify such features. Since H&E stained images are routinely prepared and reviewed in current clinical practice, our classifiers could be efficiently applied to routine practice.

We validated our informatics framework for survival prediction by an independent TMA data set, demonstrating the generalizability of our approach. We leveraged elastic net-Cox proportional hazards models, which are computationally efficient, and are capable of reducing the number of parameters in the models effectively and handling right-censored survival data. This method is well-suited for analysing the large amounts of data and the large number of features in our analysis. Accurate prognostic prediction generated by our models can guide clinical decision making and enhance precision medicine.

We also investigated the top features associated with prognosis in lung adenocarcinoma and squamous cell carcinoma. In the adenocarcinoma group, the primary prognostic features that distinguished longer-term survivors from shorter-term survivors included Zernike shape features of the nuclei and cytoplasm and nuclei texture features. For each tumour cell, Zernike shape features of the nucleus were generated first by identifying the circle of the smallest diameter that covers the tumour nucleus, setting all pixels within the tumour nucleus to one and background to zero, and then decomposing the resulting binary image into Zernike polynomials, where the coefficients serve as features. Texture features quantified the correlations between nearby pixels within the regions of interest. This showed that nuanced patterns of nuclear shape are important determinants of patient prognosis. In the squamous cell carcinoma group, the most important features also included Zernike shape features of the nuclei. This showed that both local anatomical structures (for example, shape of cell nuclei and cytoplasm) and global patterns of the tumour cell nucleus (for example, texture of the nuclei) are associated with survival outcomes.

Machine-learning techniques have previously been shown to be useful in predicting patient prognosis in several cancers and pre-cancerous lesions31,32,34,51. For instance, researchers have developed computerized morphometry to distinguish different grades of epithelial dysplasia in Barrett’s esophagus34, and other groups of investigators associated features in the stromal components with the prognosis of breast cancer31. In this study, we demonstrated that through incorporating multiple image databases, selecting the most informative features and optimizing classifiers, we are able to predict the prognosis for a cancer with diverse histopathology patterns. Our machine-learning models were trained and tested on images contributed by more than 20 medical centres, which reduces the systematic bias of any single image source. Our results also showed that the classification performance is not very sensitive to the choice of machine-learning models.

One limitation of this study is that cases submitted for TCGA and TMA databases might be biased in terms of having mostly images in which the morphological patterns of disease are definitive, which could be different from what pathologists encounter in their day-to-day practice. For instance, pathologists reviewed many slides and microscopic views, and only uploaded the most representative views to the TMA database. Although histopathology images with typical pathological patterns might be helpful in generating machine-learning models, how these diagnostic models perform in actual clinical settings remains to be explored. In addition, certain semi-quantitative pattern assessments of adenocarcinoma, such as acinar or papillary, were not available in either database. Future research could integrate quantitative image features along with a richer set of qualitative and semi-quantitative annotations. In addition, as the universal standard for digitalizing histopathology images is not yet established, retraining of prediction models is required for data sets with different levels of magnification. Another limitation is that this study only focused on H&E stained images. The clinical utility of integrating quantitative features from immunochemical stained images or molecular data remains to be established.

In summary, we demonstrate that histopathology image classifiers based on quantitative features can successfully predict survival outcomes of lung adenocarcinoma and lung squamous cell carcinoma patients. This capability is superior to the current practice utilized by pathologists who assess the images in terms of tumour grade and stage. Investigating the objective features associated with survival also provides insights for histopathology studies. Similar approaches may be applied to the pathology of other organs. Our methods could facilitate prognostic prediction based on the routinely collected H&E stained histopathology slides, thereby contributing to precision oncology and enhancing quality of care.

Methods
Histopathology image sources. A total of 2,186 whole-slide H&E stained histopathology images were obtained from TCGA37,38, which included samples from 515 lung adenocarcinoma patients and 502 lung squamous cell carcinoma
patients. All images were included for image processing and analysis. All tumour samples were gathered by surgical excision.
Lymph nodes were assessed by pathology evaluation. R-status and adjuvant/neoadjuvant treatment status were determined by
reviewing the clinical notes. For every image, the associated pathology report and clinical variables, such as demographic
and survival information, were also acquired from the source database.
The whole-slide images with 40× magnification were tiled into overlapping 1,000 × 1,000 pixel tiles using bftools in the
Open Microscopy Environment39, which generated more than 10 million image tiles in total. To reduce computational time,
only the 10 densest images of each image series were selected, as they contained more cells for further investigations.
For each image tile, the image density was calculated as the percentage of non-white pixels in that tile, where a pixel
counted as non-white if all of its red, green, and blue values were below 200 in the 24-bit RGB colour space.
To ensure the extensibility of the developed methods, tissue microarray (TMA) images from Stanford Department of Pathology40
were acquired and processed as an external validation set. A total of 227 lung adenocarcinoma and 67 lung squamous cell
carcinoma patients were included in this cohort, and one representative H&E stained histopathology image per patient was
selected by pathologists. All images from TMA were included for further image processing.
Informed consent of the TCGA and TMA participants was obtained by the TCGA consortium37,38 and TMA investigators40,
respectively. All images were publicly available for research purposes, and did not require institutional review board approval.
Two automated classification tasks were designed to evaluate the utility of the extracted features: (1) to classify images
of malignancy from images of adjacent benign tissues; and (2) to distinguish lung adenocarcinoma from lung squamous cell
carcinoma. The inputs to the classification algorithms were the quantitative features extracted from the images as described
in the previous section, and the outputs were the predicted diagnostic groups. For tumour-type classification, the prediction
results for image tiles of the same patient were aggregated.
Machine-learning methods for prognosis prediction. Elastic net-Cox proportional hazards models (R package ‘glmnet’) were
built to calculate the survival index of each patient46. The models were trained and the features were selected on the
training set. Regularization parameters were selected by 10-fold cross-validation on the training set. Elastic net-Cox
proportional hazards models were built with the selected parameters, and survival indices for each patient were calculated
to determine the threshold for survival group classification. The distribution of survival indices on the training examples
was examined, and the median index in the training set was selected to divide patients into good and poor prognostic groups.
The same threshold was used to classify patients in the test set into two predicted survival groups. We further performed
sensitivity analysis on the number of discretized prognostic groups, and the results from three prognostic groups (divided
by the first and second tertile of the survival indices in the training set) did not differ much from the two-group model
(Supplementary Fig. 3).
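The median-threshold rule described under "Machine-learning methods for prognosis prediction" can be illustrated with a minimal Python sketch. It is not the authors' code (the study used the R package ‘glmnet’ to compute survival indices), and it covers only the grouping step. The function names, the example numbers, and the assumption that a higher survival index means higher predicted risk are all hypothetical and introduced here for illustration.

```python
from statistics import median


def fit_threshold(train_indices):
    """Return the median survival index of the training set as the cut-off."""
    if not train_indices:
        raise ValueError("training set is empty")
    return median(train_indices)


def assign_groups(indices, threshold):
    """Label each patient as 'good' or 'poor' prognosis.

    Assumption (not stated in the text): a higher survival index means
    higher predicted risk, so values above the threshold are 'poor'.
    """
    return ["poor" if value > threshold else "good" for value in indices]


# Hypothetical survival indices; in the study these come from the fitted
# elastic net-Cox model, which is not reproduced here.
train_indices = [0.2, -1.1, 0.9, 1.5, -0.3]
test_indices = [1.0, -0.5, 0.2]

# The cut-off is learned on the training set only ...
cutoff = fit_threshold(train_indices)
# ... and then reused unchanged on the test set.
test_groups = assign_groups(test_indices, cutoff)
print(cutoff, test_groups)
```

Deriving the cut-off from the training set alone and reusing it unchanged on the test set keeps the test patients from influencing how the groups are defined, which is the point of the procedure described above.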
12. Yu, K. H. & Snyder, M. Omics profiling in precision oncology. Mol. Cell. Proteomics 20, O116.059253 (2016).
13. Snyder, M. Genomics and Personalized Medicine: What Everyone Needs to Know (Oxford University Press, 2016).
14. Harpole, Jr. D. H., Herndon, 2nd J. E., Wolfe, W. G., Iglehart, J. D. & Marks, J. R. A prognostic model of recurrence and death in stage I non-small cell lung cancer utilizing presentation, histopathology, and oncoprotein expression. Cancer Res. 55, 51–56 (1995).
15. Yoshizawa, A. et al. Impact of proposed IASLC/ATS/ERS classification of lung adenocarcinoma: prognostic subgroups and implications for further revision of staging based on analysis of 514 stage I cases. Mod. Pathol. 24, 653–664 (2011).
16. Franklin, W. A. Diagnosis of lung cancer: pathology of invasive and preinvasive neoplasia. Chest 117, 80S–89S (2000).
17. Kerr, K. M. Personalized medicine for lung cancer: new challenges for pathology. Histopathology 60, 531–546 (2012).
18. Beer, D. G. et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat. Med. 8, 816–824 (2002).
19. Inamura, K. et al. Two subclasses of lung squamous cell carcinoma with different gene expression profiles and prognosis identified by hierarchical clustering and non-negative matrix factorization. Oncogene 24, 7105–7113 (2005).
20. Stang, A. et al. Diagnostic agreement in the histopathological evaluation of lung cancer tissue in a population-based case-control study. Lung Cancer 52, 29–36 (2006).
21. Grilley-Olson, J. E. et al. Validation of interobserver agreement in lung cancer assessment: hematoxylin-eosin diagnostic reproducibility for non-small cell lung cancer: the 2004 World Health Organization classification and therapeutically relevant subsets. Arch. Pathol. Lab. Med. 137, 32–40 (2013).
22. Warth, A. et al. The novel histologic International Association for the Study of Lung Cancer/American Thoracic Society/European Respiratory Society classification system of lung adenocarcinoma is a stage-independent predictor of survival. J. Clin. Oncol. 30, 1438–1446 (2012).
23. Tsao, M. S. et al. Subtype classification of lung adenocarcinoma predicts benefit from adjuvant chemotherapy in patients undergoing complete resection. J. Clin. Oncol. 33, 3439–3446 (2015).
24. Weichert, W. et al. Proposal of a prognostically relevant grading scheme for pulmonary squamous cell carcinoma. Eur. Respir. J. 47, 938–946 (2015).
25. Kadota, K. et al. Comprehensive pathological analyses in lung squamous cell carcinoma: single cell invasion, nuclear diameter, and tumour budding are independent prognostic factors for worse outcomes. J. Thorac. Oncol. 9, 1126–1139 (2014).
26. Warth, A. et al. Interobserver variability in the application of the novel IASLC/ATS/ERS classification for pulmonary adenocarcinomas. Eur. Respir. J. 40, 1221–1227 (2012).
27. Thunnissen, E. et al. Reproducibility of histopathological subtypes and invasion in pulmonary adenocarcinoma. An international interobserver study. Mod. Pathol. 25, 1574–1583 (2012).
28. Warth, A. et al. Training increases concordance in classifying pulmonary adenocarcinomas according to the novel IASLC/ATS/ERS classification. Virchows Arch. 461, 185–193 (2012).
29. Raab, S. S. et al. Clinical impact and frequency of anatomic pathology errors in cancer diagnoses. Cancer 104, 2205–2213 (2005).
30. Hipp, J. et al. Computer aided diagnostic tools aim to empower rather than replace pathologists: lessons learned from computational chess. J. Pathol. Inform. 2, 25 (2011).
31. Beck, A. H. et al. Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Sci. Transl. Med. 3, 108ra113 (2011).
32. Sertel, O. et al. Computer-aided prognosis of neuroblastoma on whole-slide images: classification of stromal development. Pattern Recognit. 42, 1093–1103 (2009).
33. Sertel, O. et al. Histopathological image analysis using model-based intermediate representations and color texture: follicular lymphoma grading. J. Signal Process. Syst. 55, 169–183 (2009).
34. Sabo, E. et al. Computerized morphometry as an aid in determining the grade of dysplasia and progression to adenocarcinoma in Barrett’s esophagus. Lab. Invest. 86, 1261–1271 (2006).
35. Churg, A. The fine structure of large cell undifferentiated carcinoma of the lung. Evidence for its relation to squamous cell carcinomas and adenocarcinomas. Hum. Pathol. 9, 143–156 (1978).
36. Yamada, E. et al. Tumour-size-based morphological features of metastatic lymph node tumors from primary lung adenocarcinoma. Pathol. Int. 64, 591–600 (2014).
37. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).
38. Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 489, 519–525 (2012).
39. Linkert, M. et al. Metadata matters: access to image data in the real world. J. Cell Biol. 189, 777–782 (2010).
40. Marinelli, R. J. et al. The Stanford Tissue Microarray database. Nucleic Acids Res. 36, D871–D877 (2008).
41. Otsu, N. A threshold selection method from gray-level histograms. Automatica 11, 23–27 (1975).
42. Carpenter, A. E. et al. CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biol. 7, R100 (2006).
43. Kamentsky, L. et al. Improved structure, function and compatibility for CellProfiler: modular high-throughput image analysis software. Bioinformatics 27, 1179–1180 (2011).
44. Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T. & Zeileis, A. Conditional variable importance for random forests. BMC Bioinformatics 9, 307 (2008).
45. Liaw, A. & Wiener, M. Classification and Regression by randomForest. R News 2, 18–22 (2002).
46. Simon, N., Friedman, J., Hastie, T. & Tibshirani, R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J. Stat. Softw. 39, 1–13 (2011).
47. Fuchs, T. J. & Buhmann, J. M. Computational pathology: challenges and promises for tissue analysis. Comput. Med. Imag. Grap. 35, 515–530 (2011).
48. Coate, L. E., John, T., Tsao, M. S. & Shepherd, F. A. Molecular predictive and prognostic markers in non-small-cell lung cancer. Lancet Oncol. 10, 1001–1010 (2009).
49. Dubinski, W., Leighl, N. B., Tsao, M. S. & Hwang, D. M. Ancillary testing in lung cancer diagnosis. Pulm. Med. 2012, 249082 (2012).
50. Feng, J. et al. FoxQ1 overexpression influences poor prognosis in non-small cell lung cancer, associates with the phenomenon of EMT. PLoS One 7, e39937 (2012).
51. Samsi, S., Lozanski, G., Shana’ah, A., Krishanmurthy, A. K. & Gurcan, M. N. Detection of follicles from IHC-stained slides of follicular lymphoma using iterative watershed. IEEE Trans. Biomed. Eng. 57, 2609–2612 (2010).
52. Friedman, N., Geiger, D. & Goldszmidt, M. Bayesian network classifiers. Mach. Learn. 29, 131–163 (1997).
53. Cortes, C. & Vapnik, V. Support-vector networks. Mach. Learn. 20, 273–297 (1995).
54. Sing, T., Sander, O., Beerenwinkel, N. & Lengauer, T. ROCR: visualizing classifier performance in R. Bioinformatics 21, 7881 (2005).
55. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer New York, 2009).
Acknowledgements
We thank Dr Matt van de Rijn for his valuable advice on the study design and interpretation of results, Dr Viswam S. Nair
for acquiring images from the Stanford Tissue Microarray Database and Dr David Paik, Dr Jocelyn Barker, Vanessa Sochat and
Rebecca Sawyer for their suggestions on the image processing pipeline, and Drs Stephen Bach and Theodoros Rekatsinas for
their suggestions on the writing of the manuscript. We also thank Chung Yu Wang, A. Montana Scher, Jaeho Shin and Andrej
Krevl for their assistance on the computation framework. K.-H.Y. is a Winston Chen Stanford Graduate Fellow and Howard Hughes
Medical Institute International Student Research fellow. This work was supported in part by grants from National Cancer
Institute, National Institutes of Health, grant numbers U01CA142555, 5P50HG00773502, and 5U24CA160036-05.
Author contributions
K.-H.Y. conceived, designed, performed the analyses, interpreted the results and wrote the manuscript. G.J.B., R.B.A., C.R.,
D.L.R. and M.S. interpreted the results and edited the manuscript. C.Z. edited the manuscript. R.B.A., C.R., D.L.R. and M.S.
supervised the work.
Additional information
Supplementary Information accompanies this paper at http://www.nature.com/naturecommunications
Competing financial interests: The authors declare no competing financial interests.
Reprints and permission information is available online at http://npg.nature.com/reprintsandpermissions/
How to cite this article: Yu, K.-H. et al. Predicting non-small cell lung cancer prognosis by fully automated microscopic
pathology image features. Nat. Commun. 7:12474 doi: 10.1038/ncomms12474 (2016).
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party
material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit
line; if the material is not included under the Creative Commons license, users will need to obtain permission from the
license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
© The Author(s) 2016