
Rivenson et al. Light: Science & Applications (2019) 8:23
https://doi.org/10.1038/s41377-019-0129-y
Official journal of the CIOMP 2047-7538
www.nature.com/lsa

ARTICLE Open Access

PhaseStain: the digital staining of label-free quantitative phase microscopy images using deep learning

Yair Rivenson1,2,3, Tairan Liu1,2,3, Zhensong Wei1,2,3, Yibo Zhang1,2,3, Kevin de Haan1,2,3 and Aydogan Ozcan1,2,3,4
Abstract
Using a deep neural network, we demonstrate a digital staining technique, which we term PhaseStain, to transform
the quantitative phase images (QPI) of label-free tissue sections into images that are equivalent to the brightfield
microscopy images of the same samples that are histologically stained. Through pairs of image data (QPI and the
corresponding brightfield images, acquired after staining), we train a generative adversarial network and demonstrate
the effectiveness of this virtual-staining approach using sections of human skin, kidney, and liver tissue, matching the
brightfield microscopy images of the same samples stained with Hematoxylin and Eosin, Jones’ stain, and Masson’s
trichrome stain, respectively. This digital-staining framework may further strengthen various uses of label-free QPI
techniques in pathology applications and biomedical research in general, by eliminating the need for histological
staining, reducing sample preparation related costs and saving time. Our results provide a powerful example of some
of the unique opportunities created by data-driven image transformations enabled by deep learning.

Introduction
Quantitative phase imaging (QPI) is a rapidly emerging field, with a history of several decades in development1,2. QPI is a label-free imaging technique, which generates a quantitative image of the optical-path-delay through the specimen. Other than being label-free, QPI utilizes low-intensity illumination, while still allowing for a rapid imaging time, which reduces phototoxicity in comparison to, e.g., commonly used fluorescence imaging modalities. QPI can be performed on multiple platforms and devices3–7, from ultra-portable instruments all the way to custom-engineered systems integrated with standard microscopes, with different methods of extracting the quantitative phase information. QPI has also been recently used for the investigation of label-free thin tissue sections2,8, which can be considered a weakly scattering phase object, having limited amplitude contrast modulation under brightfield illumination.

Although QPI techniques result in quantitative contrast maps of label-free objects, the current clinical and research gold standard is still mostly based on the brightfield imaging of histologically labeled samples. The staining process dyes the specimen with colorimetric markers, revealing the cellular and subcellular morphological information of the sample under brightfield microscopy. As an alternative, QPI has been demonstrated for the inference of local scattering coefficients of tissue samples8,9; for this information to be adopted as a diagnostic tool, some of the obstacles include the requirement of retraining experts and competing with a growing number of machine learning-based image analysis software10,11, which utilizes vast amounts of stained tissue images to perform, e.g., automated diagnosis, image segmentation, or classification, among other tasks.

Correspondence: Aydogan Ozcan ([email protected])
1 Electrical and Computer Engineering Department, University of California, Los Angeles, CA 90095, USA
2 Bioengineering Department, University of California, Los Angeles, CA 90095, USA
Full list of author information is available at the end of the article.
These authors contributed equally: Yair Rivenson, Tairan Liu, Zhensong Wei

© The Author(s) 2019
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third-party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

[Figure 1: top, the standard histological workflow (slide preparation, histological staining, brightfield microscope, stained brightfield image); bottom, the PhaseStain workflow (unstained slide, quantitative phase acquired at 550 nm, deep convolutional neural network, network output).]
Fig. 1 PhaseStain workflow. A quantitative phase image of a label-free specimen is virtually stained by a deep neural network, bypassing the standard histological staining procedure that is used as part of clinical pathology

One possible way to bridge the gap between QPI and standard image-based diagnostic modalities is to perform digital (i.e., virtual) staining of the phase images of label-free samples to match the images of histologically stained samples. One previously used method for the digital staining of tissue sections involves the acquisition of multimodal, nonlinear microscopy images of the samples, while applying staining reagents as part of the sample preparation, followed by a linear approximation of the absorption process to produce a pseudo-Hematoxylin and Eosin (H&E) image of the tissue section under investigation12–14.

As an alternative to model-based approximations, deep learning has recently been successful in various computational tasks based on a data-driven approach, solving inverse problems in optics, such as super-resolution15–17, holographic image reconstruction and phase recovery18–21, tomography22, Fourier ptychographic microscopy23, localization microscopy24–26, and ultrashort pulse reconstruction27. Recently, the application of deep learning for the virtual staining of autofluorescence images of nonstained tissue samples has also been demonstrated28. Following the success of these previous results, here we demonstrate that deep learning can be used for the digital staining of label-free thin tissue sections using their quantitative phase images. For this image transformation between the phase image of a label-free sample and its stained brightfield image, which we term PhaseStain, we used a deep neural network trained using the generative adversarial network (GAN) framework29. Conceptually, PhaseStain (see Fig. 1) provides an image that is the digital equivalent of a brightfield image of the same sample after the histological staining process; stated differently, it transforms the phase image of a weakly scattering object (e.g., a label-free thin tissue section, which exhibits low amplitude modulation under visible light) into amplitude object information, presenting the same color features that are observed under a brightfield microscope after the histological staining process.

We experimentally demonstrated the success of our PhaseStain approach using label-free sections of human skin, kidney, and liver tissue that were imaged by a holographic microscope, matching the brightfield microscopy images of the same tissue sections stained with H&E, Jones' stain, and Masson's trichrome stain, respectively.

The deep learning-based virtual staining of label-free tissue samples using quantitative phase images provides another important example of the unique opportunities enabled by data-driven image transformations. We believe that the PhaseStain framework will be instrumental for the QPI community to further strengthen various uses of label-free QPI techniques30–34 for clinical applications and biomedical research, helping to eliminate the need for histological staining, and reduce sample preparation associated time, labor, and costs.

Results
We trained three deep neural network models, which correspond to the three different combinations of tissue and stain types, i.e., H&E for skin tissue, Jones' stain for kidney tissue, and Masson's trichrome for liver tissue.

[Figure 2 panels: QPI of a label-free skin tissue section, the trained network output (digital H&E staining), and the brightfield image of the histologically H&E-stained skin tissue (20×/0.75 NA); scale bars 100 µm and 50 µm.]
Fig. 2 Virtual H&E staining of label-free skin tissue using the PhaseStain framework. Top: QPI of a label-free skin tissue section and the resulting network output. Bottom: zoom-in image of a region of interest and its comparison to the histochemically stained gold standard brightfield image

[Figure 3 panels: QPI of label-free tissue sections, the network output (digital staining), and the brightfield images of the histologically stained tissue (20×/0.75 NA), for kidney (Jones' stain) and liver (Masson's trichrome); scale bars 50 µm.]
Fig. 3 PhaseStain-based virtual staining of label-free kidney tissue (Jones' stain) and liver tissue (Masson's trichrome)

Following the training phase, these three trained deep networks were blindly tested on holographically reconstructed quantitative phase images (see the Methods section) that were not part of the network's training set. Figure 2 shows our results for the virtual H&E staining of a phase image of a label-free skin tissue section, which confirms discohesive tumor cells lining papillary structures with dense fibrous cores. Additional results for the virtual staining of quantitative phase images of label-free tissue sections are illustrated in Fig. 3, for kidney (digital Jones' staining) and liver (digital Masson's trichrome staining). These virtually stained quantitative phase images show sheets of clear tumor cells arranged in small nests with a delicate capillary bed for the kidney tissue section, and a virtual trichrome stain highlighting the normal liver architecture without significant fibrosis or inflammation, for the liver tissue section.

These deep learning-based virtual-staining results presented in Figs. 2 and 3 visually demonstrate the high-fidelity performance of the GAN-based staining framework. To further shed light on this comparison between the PhaseStain results and the corresponding brightfield images of the histologically stained tissue samples, we quantified the structural similarity (SSIM) index of these two sets of images using:

$$\mathrm{SSIM}(U_1,U_2)=\frac{1}{3}\sum_{i=1,2,3}\frac{\left(2\mu_{1,i}\mu_{2,i}+c_1\right)\left(2\sigma_{1,2,i}+c_2\right)}{\left(\mu_{1,i}^{2}+\mu_{2,i}^{2}+c_1\right)\left(\sigma_{1,i}^{2}+\sigma_{2,i}^{2}+c_2\right)} \qquad (1)$$

where U1 and U2 are the PhaseStain output and the corresponding brightfield reference image, respectively, μk,i and σk,i are the mean and the standard deviation of each image Uk (k = 1, 2), respectively, and index i refers to the RGB channels of the images. The cross-variance between the i-th image channels is denoted with σ1,2,i, and c1, c2 are stabilization constants used to prevent division by a small denominator. The result of this analysis revealed that the SSIM was 0.8113, 0.8141, and 0.8905 for the virtual-staining results corresponding to the skin, kidney, and liver tissue samples, respectively, where the analysis was performed on ~10-megapixel images, corresponding to a field-of-view (FOV) of ~1.47 mm² for each sample.
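As a concrete reference for Eq. (1), the channel-averaged SSIM used above can be evaluated with a short NumPy routine. The sketch below assumes global per-channel statistics, exactly as written in Eq. (1), and uses the common 8-bit stabilization constants; the exact values of c1 and c2 are not listed in the text, so they are an assumption here.

```python
import numpy as np

def channel_averaged_ssim(u1, u2, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Eq. (1): SSIM averaged over the three RGB channels.

    u1, u2 : (H, W, 3) arrays, e.g., PhaseStain output and the brightfield
    reference. The constants c1, c2 follow the common 8-bit convention,
    which is an assumption rather than a value taken from the paper.
    """
    scores = []
    for i in range(3):
        a = u1[..., i].astype(np.float64)
        b = u2[..., i].astype(np.float64)
        mu1, mu2 = a.mean(), b.mean()            # per-channel means
        var1, var2 = a.var(), b.var()            # per-channel variances
        cov = ((a - mu1) * (b - mu2)).mean()     # cross-variance sigma_{1,2,i}
        num = (2 * mu1 * mu2 + c1) * (2 * cov + c2)
        den = (mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2)
        scores.append(num / den)
    return float(np.mean(scores))                # average over the RGB channels
```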
Next, to evaluate the sensitivity of the network output to phase noise in our measurements, we performed a numerical experiment on the quantitative phase image of a label-free skin tissue, where we added noise in the following manner:

$$\tilde{\phi}(m,n)=\phi(m,n)+\delta\phi(m,n)=\phi(m,n)+\beta\, r(m,n)\ast\frac{1}{2\pi L^{2}}\exp\!\left(-\frac{(m^{2}+n^{2})\Delta^{2}}{2(L\Delta)^{2}}\right) \qquad (2)$$

where ϕ̃ is the resulting noisy phase distribution (i.e., the image under test), ϕ is the original phase image of the skin tissue sample, r is drawn from a normal distribution N(0, 1), β is the perturbation coefficient, L is the Gaussian filter size/width, Δ is the pixel size, and ∗ denotes a 2D convolution that spatially smoothens the random noise into isotropic patches, as shown in Fig. 4. We chose these parameters such that the overall phase signal-to-noise ratio (SNR) is statistically identical for all the cases and made sure that no phase wrapping occurs. We then used ten random realizations of this noisy phase image for four combinations of (β, L) values to generate ϕ̃, which was used as the input to our trained deep neural network.

The deep network inference fidelity for these noisy phase inputs is reported in Fig. 4, which reveals that it is indeed sensitive to local phase variations and the related noise, and that its inference performance improves as we spatially extend the filter size, L (while the SNR remains fixed). In other words, the PhaseStain network output is more impacted by small-scale variations, corresponding to, e.g., the information encoded in the morphology of the edges or the refractive index discontinuities (or sharp gradients) of the sample. We also found that for a kernel size of LΔ ~ 3 µm, the SSIM remains unchanged (~0.8) across a wide range of perturbation coefficients, β. This result implies that the network is less sensitive to sample preparation imperfections, such as height variations and wrinkles in the thin tissue section, which naturally occur during the preparation of the tissue section.
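The numerical perturbation of Eq. (2) can be reproduced with a few lines of Python; the sketch below uses scipy.ndimage.gaussian_filter as the Gaussian kernel of width L (in pixels), which is an implementation choice on our part rather than the authors' exact code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def perturb_phase(phi, beta, L, rng=None):
    """Eq. (2): add spatially smoothed Gaussian noise to a phase image.

    phi  : 2D quantitative phase image (radians)
    beta : perturbation coefficient
    L    : Gaussian kernel width in pixels (physical width is L times the pixel size)
    """
    rng = np.random.default_rng() if rng is None else rng
    r = rng.standard_normal(phi.shape)        # r(m, n) drawn from N(0, 1)
    smoothed = gaussian_filter(r, sigma=L)    # convolution with the normalized Gaussian kernel
    return phi + beta * smoothed              # noisy phase; phase wrapping is assumed not to occur
```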
Discussion
The training process of a PhaseStain network needs to be performed only once, following which the newly acquired quantitative phase images of various samples are blindly fed to the pretrained deep network to output a digitally stained image for each label-free sample, corresponding to the image of the same sample FOV, as it would have been imaged with a brightfield microscope, following the histological staining process. In terms of the computation speed, the virtual staining using PhaseStain takes 0.617 s on average, using a standard desktop computer equipped with a dual-GPU, for an FOV of ~0.45 mm², corresponding to ~3.22 megapixels (see the implementation details in the Methods section). This fast inference time, even with relatively modest computers, means that the PhaseStain network can be easily integrated with a QPI-based whole slide scanner, since the network can output virtually stained images in small patches while the tissue is still being scanned by an automated microscope, to simultaneously create label-free QPI and digitally stained whole slide images of the samples.

The proposed technology has the potential to save time, labor, and costs, by presenting an alternative to the standard histological staining workflow used in clinical pathology. As an example, one of the most common staining procedures (i.e., the H&E stain) takes on average ~45 min and costs approximately $2–5, while the Masson's trichrome staining procedure takes ~2–3 h, with costs that range between $16 and $35, and often requires monitoring of the process by an expert, which is typically conducted by periodically examining the specimen under a microscope.

[Figure 4 panels: a, additive phase noise, noisy QPI input, and network output for a label-free skin tissue section, for LΔ = 0.373 µm (top row) and LΔ ~ 3 µm (second row); b, average SSIM plotted against log2 L.]
Fig. 4 PhaseStain results for noisy phase input images (ground truth shown in Fig. 2). a Top row: LΔ~0.373 µm; second row: LΔ~3 µm. b Analysis of the impact of phase noise on the inference quality of PhaseStain (quantified using SSIM), as a function of the Gaussian filter length, L (see Eq. (2))

In addition to saving time and costs, by circumventing the staining procedure, the tissue constituents would not be altered; this means that the unlabeled tissue sections can be preserved for later analysis, such as matrix-assisted laser desorption ionization by the microsectioning of specific areas35 for molecular analysis, or the micromarking of subregions that can be labeled with specific immunofluorescence tags or tested for personalized therapeutic strategies and drugs36,37.

While in this study we trained three different neural network models to obtain optimal results for specific tissue and stain combinations, this does not pose a practical limitation for PhaseStain, since we can also train a more general digital staining model for a specific stain type (H&E, Jones' stain, etc.) using multiple tissue types stained with it, at the cost of increasing the network size as well as the training and inference times19. Additionally, from the clinical diagnostics perspective, the tissue type under investigation and the stain needed for its clinical examination are both known a priori, and therefore the selection of the correct neural network for each sample to be examined is straightforward to implement.

It is important to note that, in addition to the lensfree holographic microscope (see the Methods section) that we used in this work, the PhaseStain framework can also be applied to virtually stain the resulting images of various other QPI techniques, regardless of the imaging configuration, specific hardware, or phase recovery method2,6,7,38–41 that is employed.

One of the disadvantages of coherent imaging systems is "coherence-related image artifacts", such as speckle noise, or dust and other particles creating holographic interference fringes, which do not appear in the incoherent brightfield microscopy images of the same samples. In Fig. 5, we demonstrate the image distortions that, for example, out-of-focus particles create on the PhaseStain output image. To reduce such distortions in the network output images, the coherence-related image artifacts resulting from out-of-focus particles can be digitally removed by using a recently introduced deep learning-based hologram reconstruction method, which learns, through data, to attack or eliminate twin-image artifacts as well as the interference fringes resulting from out-of-focus or undesired objects19,20.

[Figure 5 panels: QPI of a label-free liver tissue section, the trained network output (digital Masson's trichrome staining), and the brightfield image (20×/0.75 NA) of the histologically stained liver tissue; scale bars 50 µm and 10 µm.]
Fig. 5 The impact of holographic fringes resulting from out-of-focus particles on the deep neural network's digital staining performance. Top row: QPI of a label-free liver tissue section and the resulting network output. Bottom row: zoom-in image of a region of interest where the coherence-related artifact partially degrades the virtual staining performance

While in this manuscript we demonstrated the applicability of the PhaseStain approach to fixed paraffin-embedded tissue specimens, our approach should also be applicable to frozen tissue sections, involving other tissue fixation methods as well (following a similar training process as detailed in the Methods section). Moreover, while our method was demonstrated for thin tissue sections, QPI has been shown to be valuable for imaging cells and smear samples (such as blood and Pap smears)2,41, and the PhaseStain technique would also be applicable to digitally stain these types of specimens.

To summarize, our presented results demonstrate some of the emerging opportunities created by deep learning for label-free quantitative phase imaging. The phase information resulting from various coherent imaging techniques can be used to generate a virtually stained image, translating the phase images of weakly scattering objects such as thin tissue sections into images that are equivalent to the brightfield images of the same samples, after the histological labeling. The PhaseStain framework, in addition to saving time and costs associated with the labeling process, has the potential to further strengthen the use of label-free QPI techniques in the clinical diagnostics workflow, while also preserving tissues for, e.g., subsequent molecular and genetic analysis.

Materials and methods
Sample preparation and imaging
All the samples that were used in this study were obtained from the Translational Pathology Core Laboratory (TPCL) and prepared by the Histology Lab at UCLA. They were obtained after the de-identification of the patient-related information and prepared from existing specimens. Therefore, this work did not interfere with standard practices of care or sample collection procedures.

Following formalin fixation and paraffin embedding, the tissue block is sectioned using a microtome into ~2–4 µm thick sections. This step is only needed for the training phase, where the transformation from a phase image into a brightfield image needs to be statistically learned. These tissue sections are then deparaffinized using Xylene and mounted on a standard glass slide using Cytoseal™ (Thermo-Fisher Scientific, Waltham, MA, USA), followed by sealing of the specimen with a coverslip. In the learning/training process, this sealing step presents several advantages: protecting the sample during the imaging and sample handling processes, and reducing artifacts such as sample thickness variations.

Following the sample preparation, the specimen was imaged using an on-chip holographic microscope to generate a quantitative phase image (detailed in the next subsection). Following the QPI process, the label-free specimen slide was put into Xylene for ~48 h, until the coverslip could be removed without introducing distortions to the tissue. Once the coverslip was removed, the slide was dipped multiple times in absolute alcohol and 95% alcohol, and then washed in D.I. water for ~1 min. Following this step, the tissue slides were stained with H&E (skin tissue), Jones' stain (kidney tissue), and Masson's trichrome (liver tissue) and then coverslipped. These tissue samples were then imaged using a brightfield automated slide scanner microscope (Aperio AT, Leica Biosystems) with a 20×/0.75NA objective (Plan Apo), equipped with a 2× magnification adapter, which results in an effective pixel size of ~0.25 µm.
Quantitative phase imaging
Lensfree imaging setup
The quantitative phase images of label-free tissue samples were acquired using an in-line lensfree holography setup41. A light source (WhiteLase Micro, NKT Photonics, Denmark) with a center wavelength at 550 nm and a spectral bandwidth of ~2.5 nm was used as the illumination source. The uncollimated light emitted from a single-mode fiber was used for creating a quasi-plane-wave that illuminated the sample. The sample was placed between the light source and the CMOS image sensor chip (IMX 081, Sony Corp., Minato, Tokyo, Japan, pixel size of 1.12 μm) with a source-to-sample distance (z1) of 5–10 cm and a sample-to-sensor distance (z2) of 1–2 mm. This on-chip lensfree holographic microscope has a submicron resolution with an effective pixel size of 0.37 µm, covering a sample FOV of ~20 mm² (which accounts for the entire active area of the sensor). The positioning stage (MAX606, Thorlabs Inc., Newton, NJ, USA), which held the CMOS sensor, enabled the 3D translation of the imager chip for performing pixel super-resolution (PSR)5,41,42 and multiheight-based iterative phase recovery41,43. All imaging hardware was controlled automatically by LabVIEW (National Instruments Corp., Austin, TX, USA).

Pixel super-resolution (PSR) technique
To synthesize a high-resolution hologram (with a pixel size of ~0.37 μm) using only the G1 channel of the Bayer pattern (R, G1, G2, and B), a shift-and-add based PSR algorithm was applied42,44. The translation stage that holds the image sensor was programmed to laterally shift on a 6 × 6 grid with a subpixel spacing at each sample-to-sensor distance. A low-resolution hologram was recorded at each position and the lateral shifts were precisely estimated using a shift estimation algorithm41. This step results in six nonoverlapping panels that were each padded to a size of 4096 × 4096 pixels and individually phase-recovered, which is detailed next.
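As an illustration of the shift-and-add principle described above (not the authors' implementation), a minimal sketch is given below: each low-resolution frame is placed onto an upsampled grid at its estimated offset, and overlapping contributions are averaged. The upsampling factor of 6 and the frame/shift variable names are assumptions made only for this sketch.

```python
import numpy as np

def shift_and_add_psr(frames, shifts, factor=6):
    """Minimal shift-and-add pixel super-resolution sketch.

    frames : list of 2D low-resolution holograms (e.g., the G1 Bayer channel)
    shifts : list of (dy, dx) sub-pixel lateral shifts, in low-resolution pixel units
    factor : upsampling factor of the high-resolution grid
    """
    h, w = frames[0].shape
    acc = np.zeros((h * factor, w * factor), dtype=np.float64)
    cnt = np.zeros_like(acc)
    for frame, (dy, dx) in zip(frames, shifts):
        # round each sub-pixel shift to the nearest high-resolution grid offset
        oy = int(round(dy * factor)) % factor
        ox = int(round(dx * factor)) % factor
        acc[oy::factor, ox::factor] += frame
        cnt[oy::factor, ox::factor] += 1
    return acc / np.maximum(cnt, 1)   # average overlapping samples on the fine grid
```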
Multiheight phase recovery
Lensfree in-line holograms at eight sample-to-sensor distances were captured. The axial scanning step size was chosen to be 15 μm. Accurate z-steps were obtained by applying a holographic autofocusing algorithm based on the edge sparsity criterion ("Tamura of the gradient", i.e., ToG)45. A zero-phase was assigned to the object intensity measurement as an initial phase guess, to start the iterations. An iterative multiheight phase recovery algorithm46 was then used by propagating the complex field back and forth between each height using the transfer function of free space47. During this iterative process, the phase was kept unchanged at each axial plane, while the amplitude was updated by using the square root of the object intensity measurement. One iteration was defined as propagating the hologram from the eighth height (farthest from the sensor chip) to the first height (nearest to the sensor) and then back-propagating the complex field to the eighth height. Typically, after 10–30 iterations, the phase is retrieved. For the final step of the reconstruction, the complex wave defined by the converged amplitude and phase at a given hologram plane was propagated to the object plane47, from which the phase component of the sample was extracted.
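The iterative update loop described above can be summarized in a short sketch: the field is propagated between hologram planes with the free-space transfer function (angular spectrum method), the propagated phase is kept, and the amplitude is replaced by the square root of the measured intensity at each height. The function names, the sign convention of the propagation distance, and the default parameters (550 nm wavelength, 0.37 µm pixel size from the text) are our assumptions for this sketch, not the authors' code.

```python
import numpy as np

def angular_spectrum_propagate(field, dz, wavelength, dx):
    """Propagate a 2D complex field by a distance dz using the free-space transfer function."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)
    fy = np.fft.fftfreq(ny, d=dx)
    fxx, fyy = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength ** 2 - fxx ** 2 - fyy ** 2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.where(arg > 0, np.exp(1j * kz * dz), 0.0)   # drop evanescent components
    return np.fft.ifft2(np.fft.fft2(field) * H)

def multiheight_phase_recovery(intensities, z, wavelength=550e-9, dx=0.37e-6, n_iter=20):
    """Sketch of the iterative multiheight phase recovery loop.

    intensities : list of measured hologram intensities, ordered from the nearest
                  (index 0) to the farthest (last) sample-to-sensor distance
    z           : the corresponding sample-to-sensor distances in meters
    """
    current = len(z) - 1
    field = np.sqrt(intensities[current]).astype(complex)   # zero-phase initial guess
    for _ in range(n_iter):
        # one iteration: farthest height -> nearest height -> back to the farthest
        plan = list(range(len(z) - 2, -1, -1)) + list(range(1, len(z)))
        for nxt in plan:
            field = angular_spectrum_propagate(field, z[nxt] - z[current], wavelength, dx)
            field = np.sqrt(intensities[nxt]) * np.exp(1j * np.angle(field))  # amplitude update
            current = nxt
    return field  # the converged field is finally propagated to the object plane to extract the phase
```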
Data preprocessing and image registration
An important step in our training process is to perform an accurate image registration between the two imaging modalities (QPI and brightfield), which involves both global matching and local alignment steps. Since the network aims to learn the transformation from a label-free phase-retrieved image to a histologically stained brightfield image, it is crucial to accurately align the FOVs for each input and target image pair in the dataset. We perform this cross-modality alignment procedure in four steps; steps 1, 2, and 4 are done in MATLAB (The MathWorks Inc., Natick, MA, USA) and step 3 involves TensorFlow.

The first step is to find a roughly matched FOV between QPI and the corresponding brightfield image. This is done by first bicubic downsampling of the whole slide image (WSI) (~60 k × 60 k pixels) to match the pixel size of the phase-retrieved image. Then, each 4096 × 4096-pixel phase image was cropped by 256 pixels on each side (resulting in an image with 3584 × 3584 pixels) to remove the padding that is used for the image reconstruction process. Following this step, both the brightfield and the corresponding phase images are edge extracted using the Canny method48, which uses a double threshold to detect strong and weak edges on the gradient of the image. Then, a correlation score matrix is calculated by correlating the 3584 × 3584-pixel phase edge image with each same-sized patch extracted from the brightfield edge image.

[Figure 6: generator network (a U-net style architecture mapping the QPI of a label-free tissue section to a virtually stained tissue section, with convolutional down/up blocks from N × N × 64 to N/16 × N/16 × 512 and back to N × N × 32, pointwise additions, concatenation skip connections, stride-2 average pooling, and 2× upsampling) and discriminator network (convolutional blocks from 256 × 256 × 64 down to 8 × 8 × 2048, average pooling, and fully connected layers producing a probability output).]
Fig. 6 Architecture of the generator and discriminator networks within the GAN framework

The image with the highest correlation score indicates a match between the two images, and the corresponding brightfield image is cropped out from the WSI. Following this initial matching procedure, the quantitative phase image and the brightfield microscope images are coarsely matched.
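To make this coarse FOV-matching step concrete, the sketch below (our illustration, not the authors' MATLAB code) extracts Canny edges from both modalities and slides the phase edge map over the brightfield whole-slide edge map to locate the best-correlated patch; the scikit-image and OpenCV routines used here are assumptions standing in for the corresponding MATLAB functions.

```python
import numpy as np
import cv2
from skimage.feature import canny

def coarse_fov_match(phase_img, brightfield_wsi):
    """Locate the brightfield patch that best matches the phase image (step 1 sketch).

    phase_img       : 2D phase-retrieved image (already matched in pixel size)
    brightfield_wsi : 2D grayscale whole slide image at the same pixel size
    Returns the (row, col) of the best-matching patch and the cropped patch itself.
    """
    # double-threshold Canny edge maps, converted to float32 for template matching
    phase_edges = canny(phase_img).astype(np.float32)
    wsi_edges = canny(brightfield_wsi).astype(np.float32)
    # normalized cross-correlation of the phase edge map against every WSI position
    scores = cv2.matchTemplate(wsi_edges, phase_edges, cv2.TM_CCORR_NORMED)
    _, _, _, max_loc = cv2.minMaxLoc(scores)
    col, row = max_loc
    h, w = phase_edges.shape
    return (row, col), brightfield_wsi[row:row + h, col:col + w]
```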
The second step is used to correct for potential rotations between these coarsely matched image pairs, which might be caused by a slight mismatch in the sample placement during the two image acquisition experiments (which are performed on different imaging systems, holographic vs. brightfield). This intensity-based registration step correlates the spatial patterns between the two images; the phase image, converted to an unsigned integer format, and the luminance component of the brightfield image were used for this multimodal registration framework implemented in MATLAB. The result of this digital procedure is an affine transformation matrix, which is applied to the brightfield microscope image patch to match it with the quantitative phase image of the same sample. Following this registration step, the phase image and the corresponding brightfield image are globally aligned. A further crop of 64 pixels on each side of the aligned image pairs is used to accommodate a possible rotation angle correction.

The third step involves the training of a separate neural network that roughly learns the transformation from quantitative phase images into stained brightfield images, which can help the distortion correction between the two image modalities in the fourth/final step. In other words, to make the local registration tractable, we first train a deep network with the globally registered images, to reduce the entropy between the images acquired with the two imaging modalities (i.e., QPI vs. the brightfield image of the stained tissue). This neural network has the same structure as the network that was used for the final training process (see the next subsection on the GAN architecture and its training), with the input and target images obtained from the second registration step discussed earlier. Since the image pairs are not well aligned yet, the training is stopped early, at only ~2000 iterations, to avoid a structural change at the output being learnt by the network. The output and target images of the network are then used as the registration pairs in the fourth step, which is an elastic image registration algorithm used to correct for local feature registration16.

GAN architecture and training
The GAN architecture that we used for PhaseStain is detailed in Fig. 6 and Supplementary Table 1.

Table 1 Training details for the virtual staining of different tissue types using PhaseStain. Following the training, the blind inference takes ~0.617 s for an FOV of ~0.45 mm², corresponding to ~3.22 megapixels (see the Discussion section)

Tissue type | # of iterations | # of patches (256 × 256 pixels) | Training time (h) | # of epochs
Liver       | 7500            | 2500 training / 625 validation  | 11.076            | 25
Skin        | 11,000          | 2500 training / 625 validation  | 11.188            | 18
Kidney      | 13,600          | 2312 training / 578 validation  | 13.173            | 39

Following the registration of the label-free quantitative phase images to the brightfield images of the histologically stained tissue sections, these accurately aligned fields-of-view were partitioned into overlapping patches of 256 × 256 pixels, which were then used to train the GAN model. The GAN is composed of two deep neural networks, a generator and a discriminator. The discriminator network's loss function is given by:

$$\ell_{\mathrm{discriminator}} = D\left(G\left(x_{\mathrm{input}}\right)\right)^{2} + \left(1 - D\left(z_{\mathrm{label}}\right)\right)^{2} \qquad (3)$$

where D(.) and G(.) refer to the discriminator and generator network operators, respectively, x_input denotes the input to the generator, which is the label-free quantitative phase image, and z_label denotes the brightfield image of the histologically stained tissue. The generator network, G, tries to generate an output image with the same statistical features as z_label, while the discriminator, D, attempts to distinguish between the target and the generator output images. The ideal outcome (or state of equilibrium) will be when the generator's output and target images share an identical statistical distribution, in which case D(G(x_input)) should converge to 0.5. For the generator deep network, we defined the loss function as:

$$\ell_{\mathrm{generator}} = L_{1}\left\{z_{\mathrm{label}}, G\left(x_{\mathrm{input}}\right)\right\} + \lambda \times \mathrm{TV}\left\{G\left(x_{\mathrm{input}}\right)\right\} + \alpha \times \left(1 - D\left(G\left(x_{\mathrm{input}}\right)\right)\right)^{2} \qquad (4)$$

where the L1{.} term refers to the absolute pixel-by-pixel difference between the generator output image and its target, TV{.} stands for the total variation regularization that is applied to the generator output, and the last term reflects a penalty related to the discriminator network prediction of the generator output. The regularization parameters (λ, α) were set to 0.02 and 2000 so that the total variation loss term, λ × TV{G(x_input)}, was ~2% of the L1 loss term, and the discriminator loss term, α × (1 − D(G(x_input)))², was ~98% of the total generator loss, ℓgenerator.
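A compact sketch of the two loss terms in Eqs. (3) and (4) is given below using TensorFlow ops (our illustration; the paper's original code used TensorFlow 1.7, and the exact reductions and normalizations are assumptions):

```python
import tensorflow as tf

def discriminator_loss(d_of_generated, d_of_label):
    """Eq. (3): push D(G(x_input)) toward 0 and D(z_label) toward 1."""
    return tf.reduce_mean(tf.square(d_of_generated)) + \
           tf.reduce_mean(tf.square(1.0 - d_of_label))

def generator_loss(generated, label, d_of_generated, lam=0.02, alpha=2000.0):
    """Eq. (4): L1 fidelity + total-variation regularization + adversarial penalty."""
    l1_term = tf.reduce_mean(tf.abs(label - generated))
    tv_term = tf.reduce_mean(tf.image.total_variation(generated))
    adv_term = tf.reduce_mean(tf.square(1.0 - d_of_generated))
    return l1_term + lam * tv_term + alpha * adv_term
```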
For the generator deep neural network, we adapted the U-net architecture49, which consists of a downsampling and an upsampling path, with each path containing four blocks forming four distinct levels (see Fig. 6 and Supplementary Table 1). In the downsampling path, each residual block consists of three convolutional layers and three leaky rectified linear (LReLU) units used as the activation function, which is defined as:

$$\mathrm{LReLU}(x) = \begin{cases} x & \text{for } x > 0 \\ 0.1x & \text{otherwise} \end{cases} \qquad (5)$$

At the output of each block, the number of channels is increased by 2-fold (except for the first block, which increases from 1 input channel to 64 channels). The blocks are connected by an average-pooling layer of stride two that downsamples the output of the previous block by a factor of two in both the horizontal and vertical dimensions (as shown in Fig. 6 and Supplementary Table 1).

In the upsampling path, each block also consists of three convolutional layers and three LReLU activation functions, which decrease the number of channels at its output by fourfold. The blocks are connected by a bilinear upsampling layer that upsamples the size of the output from the previous block by a factor of two in both lateral dimensions. A concatenation function with the corresponding feature map from the downsampling path of the same level is used to increase the number of channels from the output of the previous block by two. The two paths are connected in the first level of the network by a convolutional layer, which maintains the number of feature maps from the output of the last residual block in the downsampling path (see Fig. 6 and Supplementary Table 1). The last layer is a convolutional layer that maps the output of the upsampling path into 3 channels of the YCbCr color map.
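As an illustrative sketch of one level of the paths just described (not the authors' exact layer definitions), a downsampling block and an upsampling block can be written with Keras layers as follows; the 3 × 3 kernels and the LReLU slope of 0.1 follow the text, while the padding, layer ordering, and the omission of the residual (pointwise-add) connection are simplifications we assume here:

```python
import tensorflow as tf
from tensorflow.keras import layers

def down_block(x, out_channels):
    """Three 3x3 convolutions with LReLU(0.1); channel count set by out_channels.
    Between blocks, a 2x2 average pooling with stride 2 halves the lateral size."""
    for _ in range(3):
        x = layers.Conv2D(out_channels, 3, padding="same")(x)
        x = layers.LeakyReLU(0.1)(x)
    return x

def up_block(x, skip, out_channels):
    """Bilinear 2x upsampling, concatenation with the matching skip connection,
    then three 3x3 convolutions with LReLU(0.1)."""
    x = layers.UpSampling2D(size=2, interpolation="bilinear")(x)
    x = layers.Concatenate()([x, skip])
    for _ in range(3):
        x = layers.Conv2D(out_channels, 3, padding="same")(x)
        x = layers.LeakyReLU(0.1)(x)
    return x

# example usage: feats = down_block(tf.keras.Input(shape=(256, 256, 1)), 64)
```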
The discriminator network consists of one convolutional layer, five discriminator blocks, an average-pooling layer, and two fully connected layers. The first convolutional layer receives 3 channels (YCbCr color map) from either the generator output or the target and increases the number of channels to 64. The discriminator blocks consist of two convolutional layers, with the first layer maintaining the size of the feature map and the number of channels, while the second layer increases the number of channels by twofold and decreases the size of the feature map by fourfold. The average-pooling layer has a filter size of 8 × 8, which results in a matrix with a size of (B, 2048), where B refers to the batch size.

[Figure 7: a, validation L1-loss (a.u.) versus the number of iterations; b, generator loss (a.u.) versus the number of iterations.]
Fig. 7 PhaseStain convergence plots for the validation set of the digital H&E staining of the skin tissue. a L1-loss with respect to the number of iterations. b Generator loss, ℓgenerator, with respect to the number of iterations

The output of this average-pooling layer is then fed into two fully connected layers, with the first layer maintaining the size of the feature map, while the second layer decreases the output channel to 1, resulting in an output size of (B, 1). The output of this fully connected layer goes through a sigmoid function, indicating the probability that the three-channel discriminator input is drawn from a histologically stained brightfield image. For the discriminator network, all the convolutional layers and the fully connected layers are connected by LReLU nonlinear activation functions.

Throughout the training, the convolution filter size was set to be 3 × 3. For the patch generation, we applied data augmentation by using 50% patch overlap for the liver and skin tissue images, and 25% patch overlap for the kidney tissue images (see Table 1). The learnable parameters, including the filters, weights, and biases in the convolutional layers and the fully connected layers, are updated using an adaptive moment estimation (Adam) optimizer with a learning rate of 1 × 10−4 for the generator network and 1 × 10−5 for the discriminator network.

For each iteration of the discriminator, there were v iterations of the generator network; for the liver and skin tissue training, v = max(5, floor(7 − w/2)), where we increased w by 1 for every 500 iterations (w was initialized as 0). For the kidney tissue training, we used v = max(4, floor(6 − w/2)), where we increased w by 1 for every 400 iterations. This helped us to train the discriminator not to overfit to the target brightfield images. We used a batch size of ten for the training of the liver and skin tissue sections, and five for the kidney tissue sections. All the convolutional kernel entries are initialized using a truncated normal distribution. All the network bias terms are initialized to zero. The network's training stopped when the validation set's L1-loss did not decrease for 4000 iterations. A typical convergence plot of our training is shown in Fig. 7.
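The adaptive update schedule described above can be expressed as a small helper; the function below is simply our restatement of the rules v = max(5, floor(7 − w/2)) for liver/skin and v = max(4, floor(6 − w/2)) for kidney, with w incremented every 500 or 400 iterations, respectively:

```python
import math

def generator_steps_per_discriminator_step(iteration, tissue="skin"):
    """Number of generator iterations v to run per discriminator iteration."""
    if tissue in ("liver", "skin"):
        w = iteration // 500                     # w grows by 1 every 500 iterations
        return max(5, math.floor(7 - w / 2))
    if tissue == "kidney":
        w = iteration // 400                     # w grows by 1 every 400 iterations
        return max(4, math.floor(6 - w / 2))
    raise ValueError("unknown tissue type")
```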
Implementation details
The number of image patches that were used for training, the number of epochs, and the training schedules are shown in Table 1. The network was implemented using Python version 3.5.0, with TensorFlow framework version 1.7.0. We implemented the software on a desktop computer with a Core i7-7700K CPU @ 4.2 GHz (Intel Corp., Santa Clara, CA, USA) and 64 GB of RAM, running the Windows 10 operating system (Microsoft Corp., Redmond, WA, USA). Following the training for each tissue section, the corresponding network was tested with 4 image patches of 1792 × 1792 pixels with an overlap of ~7%. The outputs of the network were then stitched to form the final network output image of 3456 × 3456 pixels (FOV ~1.7 mm²), as shown in, e.g., Fig. 2. The network training and testing were performed using dual GeForce GTX 1080Ti GPUs (NVidia Corp., Santa Clara, CA, USA).
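The 2 × 2 tiling and ~7% overlap described above imply a simple stitching step at inference time; the sketch below blends four 1792 × 1792 network outputs into a single 3456 × 3456 image by averaging the overlapping strips. The averaging-based blending is our assumption, since the exact stitching rule is not specified in the text.

```python
import numpy as np

def stitch_2x2(patches, patch=1792, out=3456):
    """Stitch four (patch x patch x 3) network outputs into an (out x out x 3) image.

    patches : list of 4 arrays ordered [top-left, top-right, bottom-left, bottom-right]
    The overlap is 2 * patch - out = 128 pixels (~7%); overlapping regions are averaged.
    """
    step = out - patch                            # stride between patch origins
    acc = np.zeros((out, out, 3), dtype=np.float64)
    cnt = np.zeros((out, out, 1), dtype=np.float64)
    origins = [(0, 0), (0, step), (step, 0), (step, step)]
    for p, (r, c) in zip(patches, origins):
        acc[r:r + patch, c:c + patch] += p
        cnt[r:r + patch, c:c + patch] += 1
    return acc / cnt                              # average in the overlapping strips
```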
Acknowledgments
The Ozcan Research Group at UCLA acknowledges the support of the NSF Engineering Research Center (ERC, PATHS-UP), the Army Research Office (ARO; W911NF-13-1-0419 and W911NF-13-1-0197), the ARO Life Sciences Division, the National Science Foundation (NSF) CBET Division Biophotonics Program, the NSF Emerging Frontiers in Research and Innovation (EFRI) Award, the NSF INSPIRE Award, the NSF Partnerships for Innovation: Building Innovation Capacity (PFI:BIC) Program, the National Institutes of Health (NIH, R21EB023115), the Howard Hughes Medical Institute (HHMI), the Vodafone Americas Foundation, the Mary Kay Foundation, and the Steven & Alexandra Cohen Foundation. The authors also acknowledge the Translational Pathology Core Laboratory (TPCL) and the Histology Lab at UCLA for their assistance with the sample preparation and staining, as well as Prof. W. Dean Wallace of Pathology and Laboratory Medicine at UCLA's David Geffen School of Medicine for the image evaluations.

Author details
1 Electrical and Computer Engineering Department, University of California, Los Angeles, CA 90095, USA. 2 Bioengineering Department, University of California, Los Angeles, CA 90095, USA. 3 California NanoSystems Institute (CNSI), University of California, Los Angeles, CA 90095, USA. 4 Department of Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA

Conflict of interest
A.O., Y.R., and Z.W. have a patent application on the invention reported in this manuscript.

Supplementary information is available for this paper at https://doi.org/10.1038/s41377-019-0129-y.

Received: 23 July 2018 Revised: 5 January 2019 Accepted: 11 January 2019

References
1. Cuche, E., Marquet, P. & Depeursinge, C. Simultaneous amplitude-contrast and quantitative phase-contrast microscopy by numerical reconstruction of Fresnel off-axis holograms. Appl. Opt. 38, 6994–7001 (1999).
2. Popescu, G. Quantitative Phase Imaging of Cells and Tissues (McGraw-Hill, New York, 2011).
3. Shaked, N. T., Rinehart, M. T. & Wax, A. Dual-interference-channel quantitative-phase microscopy of live cell dynamics. Opt. Lett. 34, 767–769 (2009).
4. Wang, Z. et al. Spatial light interference microscopy (SLIM). Opt. Express 19, 1016–1026 (2011).
5. Greenbaum, A. et al. Imaging without lenses: achievements and remaining challenges of wide-field on-chip microscopy. Nat. Methods 9, 889–895 (2012).
6. Zheng, G. A., Horstmeyer, R. & Yang, C. Wide-field, high-resolution Fourier ptychographic microscopy. Nat. Photonics 7, 739–745 (2013).
7. Tian, L. & Waller, L. Quantitative differential phase contrast imaging in an LED array microscope. Opt. Express 23, 11394–11403 (2015).
8. Wang, Z., Tangella, K., Balla, A. & Popescu, G. Tissue refractive index as marker of disease. J. Biomed. Opt. 16, 116017 (2011).
9. Wang, Z., Ding, H. F. & Popescu, G. Scattering-phase theorem. Opt. Lett. 36, 1215–1217 (2011).
10. Liu, Y. et al. Detecting cancer metastases on gigapixel pathology images. arXiv: 1703.02442 (2017).
11. Litjens, G. et al. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017).
12. Tao, Y. K. et al. Assessment of breast pathologies using nonlinear microscopy. Proc. Natl Acad. Sci. USA 111, 15304–15309 (2014).
13. Giacomelli, M. G. et al. Virtual hematoxylin and eosin transillumination microscopy using epi-fluorescence imaging. PLoS ONE 11, e0159337 (2016).
14. Orringer, D. A. et al. Rapid intraoperative histology of unprocessed surgical specimens via fibre-laser-based stimulated Raman scattering microscopy. Nat. Biomed. Eng. 1, 0027 (2017).
15. Rivenson, Y. et al. Deep learning microscopy. Optica 4, 1437–1443 (2017).
16. Rivenson, Y. et al. Deep learning enhanced mobile-phone microscopy. ACS Photon. 5, 2354–2364 (2018). https://doi.org/10.1021/acsphotonics.8b00146
17. Wang, H. et al. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat. Methods 16, 103–110 (2019).
18. Sinha, A., Lee, J., Li, S. & Barbastathis, G. Lensless computational imaging through deep learning. Optica 4, 1117–1125 (2017).
19. Rivenson, Y., Zhang, Y. B., Günaydin, H., Teng, D. & Ozcan, A. Phase recovery and holographic image reconstruction using deep learning in neural networks. Light Sci. Appl. 7, e17141 (2018).
20. Wu, Y. C. et al. Extended depth-of-field in holographic imaging using deep-learning-based autofocusing and phase recovery. Optica 5, 704–710 (2018).
21. Jo, Y. et al. Quantitative phase imaging and artificial intelligence: a review. arXiv: 1806.03982 (2018).
22. Kamilov, U. et al. Learning approach to optical tomography. Optica 2, 517–522 (2015).
23. Nguyen, T., Xue, Y. J., Li, Y. Z., Tian, L. & Nehmetallah, G. Deep learning approach to Fourier ptychographic microscopy. arXiv: 1805.00334 (2018).
24. Boyd, N., Jonas, E., Babcock, H. P. & Recht, B. DeepLoco: fast 3D localization microscopy using neural networks. bioRxiv: 267096 (2018). https://doi.org/10.1101/267096
25. Nehme, E., Weiss, L. E., Michaeli, T. & Shechtman, Y. Deep-STORM: super-resolution single-molecule microscopy by deep learning. Optica 5, 458–464 (2018).
26. Ouyang, W., Aristov, A., Lelek, M., Hao, X. & Zimmer, C. Deep learning massively accelerates super-resolution localization microscopy. Nat. Biotechnol. 36, 460–468 (2018). https://doi.org/10.1038/nbt.4106
27. Zahavy, T. et al. Deep learning reconstruction of ultrashort pulses. Optica 5, 666–673 (2018).
28. Rivenson, Y. et al. Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning. Nat. Biomed. Eng. (in the press).
29. Goodfellow, I. J. et al. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, pp. 2672–2680 (MIT Press, Cambridge, MA, 2014).
30. Park, H. S., Rinehart, M. T., Walzer, K. A., Chi, J. T. A. & Wax, A. Automated detection of P. falciparum using machine learning algorithms with quantitative phase images of unstained cells. PLoS ONE 11, e0163045 (2016).
31. Chen, C. L. et al. Deep learning in label-free cell classification. Sci. Rep. 6, 21471 (2016).
32. Roitshtain, D. et al. Quantitative phase microscopy spatial signatures of cancer cells. Cytometry A 91, 482–493 (2017).
33. Jo, Y. et al. Holographic deep learning for rapid optical screening of anthrax spores. Sci. Adv. 3, e1700606 (2017). https://doi.org/10.1101/109108
34. Javidi, B. et al. Sickle cell disease diagnosis based on spatio-temporal cell dynamics analysis using 3D printed shearing digital holographic microscopy. Opt. Express 26, 13614–13627 (2018).
35. Tata, A. et al. Wide-field tissue polarimetry allows efficient localized mass spectrometry imaging of biological tissues. Chem. Sci. 7, 2162–2169 (2016).
36. Cree, I. A. et al. Guidance for laboratories performing molecular pathology for cancer patients. J. Clin. Pathol. 67, 923–931 (2014).
37. Patel, P. G. et al. Preparation of formalin-fixed paraffin-embedded tissue cores for both RNA and DNA extraction. J. Vis. Exp. 2016, e54299 (2016). https://doi.org/10.3791/54299
38. Ikeda, T., Popescu, G., Dasari, R. R. & Feld, M. S. Hilbert phase microscopy for investigating fast dynamics in transparent systems. Opt. Lett. 30, 1165–1167 (2005).
39. Shaked, N. T., Zhu, Y. Z., Badie, N., Bursac, N. & Wax, A. Reflective interferometric chamber for quantitative phase imaging of biological sample dynamics. J. Biomed. Opt. 15, 030503 (2010).
40. Watanabe, E., Hoshiba, T. & Javidi, B. High-precision microscopic phase imaging without phase unwrapping for cancer cell identification. Opt. Lett. 38, 1319–1321 (2013).
41. Greenbaum, A. et al. Wide-field computational imaging of pathology slides using lens-free on-chip microscopy. Sci. Transl. Med. 6, 267ra175 (2014).
42. Bishara, W., Su, T. W., Coskun, A. F. & Ozcan, A. Lensfree on-chip microscopy over a wide field-of-view using pixel super-resolution. Opt. Express 18, 11181–11191 (2010).
43. Luo, W., Zhang, Y. B., Göröcs, Z., Feizi, A. & Ozcan, A. Propagation phasor approach for holographic image reconstruction. Sci. Rep. 6, 22738 (2016).
44. Farsiu, S., Robinson, M. D., Elad, M. & Milanfar, P. Fast and robust multiframe super resolution. IEEE Trans. Image Process. 13, 1327–1344 (2004).
45. Zhang, Y. B., Wang, H. D., Wu, Y. C., Tamamitsu, M. & Ozcan, A. Edge sparsity criterion for robust holographic autofocusing. Opt. Lett. 42, 3824–3827 (2017).
46. Greenbaum, A. & Ozcan, A. Maskless imaging of dense samples using pixel super-resolution based multi-height lensfree on-chip microscopy. Opt. Express 20, 3129–3143 (2012).
47. Goodman, J. W. Introduction to Fourier Optics (Roberts and Company Publishers, Englewood, 2005).
48. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8, 679–698 (1986).
49. Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W. M. & Frangi, A. F. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241 (Springer, Cham, 2015). https://doi.org/10.1007/978-3-319-24574-4_28
