PhaseStain: The Digital Staining of Label-Free Quantitative Phase Microscopy Images Using Deep Learning
Light: Science & Applications (2019) 8:23 | https://doi.org/10.1038/s41377-019-0129-y
Abstract
Using a deep neural network, we demonstrate a digital staining technique, which we term PhaseStain, to transform
the quantitative phase images (QPI) of label-free tissue sections into images that are equivalent to the brightfield
microscopy images of the same samples that are histologically stained. Through pairs of image data (QPI and the
corresponding brightfield images, acquired after staining), we train a generative adversarial network and demonstrate
the effectiveness of this virtual-staining approach using sections of human skin, kidney, and liver tissue, matching the
brightfield microscopy images of the same samples stained with Hematoxylin and Eosin, Jones’ stain, and Masson’s
trichrome stain, respectively. This digital-staining framework may further strengthen various uses of label-free QPI
techniques in pathology applications and biomedical research in general, by eliminating the need for histological
staining, reducing sample preparation related costs and saving time. Our results provide a powerful example of some
of the unique opportunities created by data-driven image transformations enabled by deep learning.
Fig. 1 PhaseStain workflow. A quantitative phase image of a label-free specimen is virtually stained by a deep neural network, bypassing the
standard histological staining procedure that is used as part of clinical pathology
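To make the workflow in Fig. 1 concrete, the following is a minimal sketch of the inference step only: a reconstructed quantitative phase image goes into a trained generator network and a three-channel, brightfield-like image comes out. The model file name, the loading call, and the output scaling are hypothetical placeholders rather than the authors' released software.

```python
# Illustrative PhaseStain-style inference: a phase image in, a virtually stained image out.
# "phasestain_generator.h5" is a hypothetical file name for a trained generator model.
import numpy as np
import tensorflow as tf

def virtually_stain(phase_image: np.ndarray,
                    model_path: str = "phasestain_generator.h5") -> np.ndarray:
    """Map a reconstructed quantitative phase image (H x W, float) to a
    brightfield-like color image using a trained generator network."""
    generator = tf.keras.models.load_model(model_path, compile=False)
    x = phase_image.astype(np.float32)[None, :, :, None]  # add batch and channel axes
    y = generator.predict(x)[0]                           # (H, W, 3) network output
    y = np.clip(y, 0.0, 1.0)                              # assume the output is scaled to [0, 1]
    return (y * 255).astype(np.uint8)
```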
(i.e., virtual) staining of the phase images of label-free samples to match the images of histologically stained samples. One previously used method for the digital staining of tissue sections involves the acquisition of multimodal, nonlinear microscopy images of the samples, while applying staining reagents as part of the sample preparation, followed by a linear approximation of the absorption process to produce a pseudo-Hematoxylin and Eosin (H&E) image of the tissue section under investigation12–14.

As an alternative to model-based approximations, deep learning has recently been successful in various computational tasks based on a data-driven approach, solving inverse problems in optics such as super-resolution15–17, holographic image reconstruction and phase recovery18–21, tomography22, Fourier ptychographic microscopy23, localization microscopy24–26, and ultrashort pulse reconstruction27. Recently, the application of deep learning for the virtual staining of autofluorescence images of nonstained tissue samples has also been demonstrated28. Following the success of these previous results, here we demonstrate that deep learning can be used for the digital staining of label-free thin tissue sections using their quantitative phase images. For this image transformation between the phase image of a label-free sample and its stained brightfield image, which we term PhaseStain, we used a deep neural network trained using the generative adversarial network (GAN) framework29. Conceptually, PhaseStain (see Fig. 1) provides an image that is the digital equivalent of a brightfield image of the same sample after the histological staining process; stated differently, it transforms the phase image of a weakly scattering object (e.g., a label-free thin tissue section, which exhibits low amplitude modulation under visible light) into amplitude object information, presenting the same color features that are observed under a brightfield microscope after the histological staining process.

We experimentally demonstrated the success of our PhaseStain approach using label-free sections of human skin, kidney, and liver tissue that were imaged by a holographic microscope, matching the brightfield microscopy images of the same tissue sections stained with H&E, Jones’ stain, and Masson’s trichrome stain, respectively.

The deep learning-based virtual staining of label-free tissue samples using quantitative phase images provides another important example of the unique opportunities enabled by data-driven image transformations. We believe that the PhaseStain framework will be instrumental for the QPI community in further strengthening various uses of label-free QPI techniques30–34 for clinical applications and biomedical research, helping to eliminate the need for histological staining and to reduce sample preparation-associated time, labor, and costs.

Results
We trained three deep neural network models, which correspond to the three different combinations of tissue and stain types, i.e., H&E for skin tissue, Jones’ stain for kidney tissue, and Masson’s trichrome for liver tissue. Following the training phase, these three trained deep networks were blindly tested on holographically reconstructed quantitative phase images (see the Methods section) that were not part of the network’s training set. Figure 2 shows our results for the virtual H&E staining of a phase image of a label-free skin tissue section, which confirms discohesive tumor cells lining papillary structures with dense fibrous cores.

Fig. 2 Virtual H&E staining of label-free skin tissue using the PhaseStain framework. Top: QPI of a label-free skin tissue section and the resulting network output. Bottom: zoom-in image of a region of interest and its comparison to the histochemically stained gold standard brightfield image

Fig. 3 PhaseStain-based virtual staining of label-free kidney tissue (Jones’ stain) and liver tissue (Masson’s trichrome)

Additional results for the virtual staining of quantitative phase images of label-free tissue sections are illustrated in Fig. 3, for kidney (digital Jones’ staining) and liver (digital Masson’s trichrome staining). These virtually stained quantitative phase images show sheets of clear tumor cells arranged in small nests with a delicate capillary bed for the kidney tissue section, and a virtual trichrome stain highlighting the
normal liver architecture without significant fibrosis or inflammation, for the liver tissue section.

These deep learning-based virtual-staining results presented in Figs. 2 and 3 visually demonstrate the high-fidelity performance of the GAN-based staining framework. To further shed light on this comparison between the PhaseStain results and the corresponding brightfield images of the histologically stained tissue samples, we quantified the structural similarity (SSIM) index of these two sets of images using:

$$\mathrm{SSIM}(U_1,U_2) = \frac{1}{3}\sum_{i=1,2,3}\frac{\left(2\,\mu_{1,i}\,\mu_{2,i}+c_1\right)\left(2\,\sigma_{1,2,i}+c_2\right)}{\left(\mu_{1,i}^{2}+\mu_{2,i}^{2}+c_1\right)\left(\sigma_{1,i}^{2}+\sigma_{2,i}^{2}+c_2\right)} \tag{1}$$

The deep network inference fidelity for these noisy phase inputs is reported in Fig. 4, which reveals that it is indeed sensitive to local phase variations and the related noise, and it improves its inference performance as we spatially extend the filter size, L (while the SNR remains fixed). In other words, the PhaseStain network output is more impacted by small-scale variations, corresponding to, e.g., the information encoded in the morphology of the edges or the refractive index discontinuities (or sharp gradients) of the sample. We also found that for a kernel size of LΔ ~ 3 µm, the SSIM remains unchanged (~0.8) across a wide range of perturbation coefficients, β. This result implies that the network is less sensitive to sample
Fig. 4 PhaseStain results for noisy phase input images (ground truth shown in Fig. 2). a Top row: LΔ~0.373 µm; second row: LΔ~3 µm.
b Analysis of the impact of phase noise on the inference quality of PhaseStain (quantified using SSIM), as a function of the Gaussian filter length,
L (see Eq. (2))
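As a concrete reading of Eq. (1), the short sketch below computes the three-channel SSIM from global per-channel means, variances, and covariance. The constants c1 and c2 follow the common SSIM convention for images scaled to [0, 1]; their values are an assumption, not numbers stated in the text.

```python
# Sketch of the three-channel SSIM of Eq. (1), computed from global image statistics.
# U1, U2: float images of shape (H, W, 3) scaled to [0, 1]; c1, c2 follow the usual
# SSIM convention (assumption).
import numpy as np

def ssim_three_channel(U1: np.ndarray, U2: np.ndarray,
                       c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> float:
    per_channel = []
    for i in range(3):
        a, b = U1[..., i].ravel(), U2[..., i].ravel()
        mu1, mu2 = a.mean(), b.mean()
        var1, var2 = a.var(), b.var()
        cov12 = ((a - mu1) * (b - mu2)).mean()
        num = (2 * mu1 * mu2 + c1) * (2 * cov12 + c2)
        den = (mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2)
        per_channel.append(num / den)
    return float(np.mean(per_channel))  # the (1/3) * sum over i = 1, 2, 3 in Eq. (1)
```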
or the micromarking of subregions that can be labeled with specific immunofluorescence tags or tested for personalized therapeutic strategies and drugs36,37.

While in this study we trained three different neural network models to obtain optimal results for specific tissue and stain combinations, this does not pose a practical limitation for PhaseStain, since we can also train a more general digital staining model for a specific stain type (H&E, Jones’ stain, etc.) using multiple tissue types stained with it, at the cost of increasing the network size as well as the training and inference times19. Additionally, from the clinical diagnostics perspective, the tissue type under investigation and the stain needed for its clinical examination are both known a priori, and therefore the selection of the correct neural network for each sample to be examined is straightforward to implement.

It is important to note that, in addition to the lensfree holographic microscope (see the Methods section) that we used in this work, the PhaseStain framework can also be applied to virtually stain the resulting images of various other QPI techniques, regardless of the specific imaging configuration, hardware, or phase recovery method2,6,7,38–41 that is employed.

One of the disadvantages of coherent imaging systems is “coherence-related image artifacts”, such as speckle noise or holographic interference fringes created by dust or other particles, which do not appear in the incoherent brightfield microscopy images of the same samples. In Fig. 5, we demonstrate the image distortions that, for example, out-of-focus particles create on the PhaseStain output image. To reduce such distortions in the network output images, the coherence-related image artifacts resulting from out-of-focus particles can be digitally removed by using a recently introduced deep learning-based hologram reconstruction method, which learns, through data, to suppress or eliminate twin-image artifacts as well as the interference fringes resulting from out-of-focus or undesired objects19,20.
Fig. 5 The impact of holographic fringes resulting from out-of-focus particles on the deep neural network’s digital staining performance.
Top row: QPI of a label-free liver tissue section and the resulting network output. Bottom row: zoom-in image of a region of interest where the
coherence-related artifact partially degrades the virtual staining performance
subsection). Following the QPI process, the label-free specimen slide was put into xylene for ~48 h, until the coverslip could be removed without introducing distortions to the tissue. Once the coverslip was removed, the slide was dipped multiple times in absolute alcohol and 95% alcohol, and then washed in D.I. water for ~1 min. Following this step, the tissue slides were stained with H&E (skin tissue), Jones’ stain (kidney tissue), and Masson’s trichrome (liver tissue) and then coverslipped. These tissue samples were then imaged using a brightfield automated slide scanner microscope (Aperio AT, Leica Biosystems) with a 20×/0.75NA objective (Plan Apo), equipped with a 2× magnification adapter, which results in an effective pixel size of ~0.25 µm.

Quantitative phase imaging

Lensfree imaging setup
The quantitative phase images of label-free tissue samples were acquired using an in-line lensfree holography setup41. A light source (WhiteLase Micro, NKT Photonics, Denmark) with a center wavelength of 550 nm and a spectral bandwidth of ~2.5 nm was used as the illumination source. The uncollimated light emitted from a single-mode fiber was used to create a quasi-plane-wave that illuminated the sample. The sample was placed between the light source and the CMOS image sensor chip (IMX 081, Sony Corp., Minato, Tokyo, Japan; pixel size of 1.12 μm), with a source-to-sample distance (z1) of 5–10 cm and a sample-to-sensor distance (z2) of 1–2 mm. This on-chip lensfree holographic microscope has submicron resolution with an effective pixel size of 0.37 µm, covering a sample FOV of ~20 mm2 (which accounts for the entire active area of the sensor). The positioning stage (MAX606, Thorlabs Inc., Newton, NJ, USA), which held the CMOS sensor, enabled 3D translation of the imager chip for performing pixel super-resolution (PSR)5,41,42 and multiheight-based iterative phase recovery41,43. All imaging hardware was controlled automatically by LabVIEW (National Instruments Corp., Austin, TX, USA).

Pixel super-resolution (PSR) technique
To synthesize a high-resolution hologram (with a pixel size of ~0.37 μm) using only the G1 channel of the Bayer pattern (R, G1, G2, and B), a shift-and-add based PSR algorithm was applied42,44. The translation stage that holds the image sensor was programmed to laterally shift on a 6 × 6 grid with subpixel spacing at each sample-to-sensor distance. A low-resolution hologram was recorded at each position, and the lateral shifts were precisely estimated using a shift estimation algorithm41. This step results in six nonoverlapping panels that were each padded to a size of 4096 × 4096 pixels and individually phase-recovered, as detailed next.

Multiheight phase recovery
Lensfree in-line holograms at eight sample-to-sensor distances were captured, with an axial scanning step size of 15 μm. Accurate z-steps were obtained by applying a holographic autofocusing algorithm based on the edge sparsity criterion (“Tamura of the gradient”, i.e., ToG)45. A zero phase was assigned to the object intensity measurement as an initial guess to start the iterations. An iterative multiheight phase recovery algorithm46 was then used, propagating the complex field back and forth between each height using the transfer function of free space47. During this iterative process, the phase was kept unchanged at each axial plane, while the amplitude was updated using the square root of the object intensity measurement. One iteration was defined as propagating the hologram from the eighth height (farthest from the sensor chip) to the first height (nearest to the sensor) and then back-propagating the complex field to the eighth height. Typically, after 10–30 iterations, the phase is retrieved. For the final step of the reconstruction, the complex wave defined by the converged amplitude and phase at a given hologram plane was propagated to the object plane47, from which the phase component of the sample was extracted.
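A minimal NumPy sketch of this multiheight procedure is given below: the complex field is propagated between the measurement planes with the angular spectrum (free-space transfer function) method, and at each plane the amplitude is replaced by the square root of the measured intensity while the current phase is kept. The wavelength, pixel size, and iteration count follow the numbers quoted above, but the initialization choice and the helper names are illustrative assumptions, not the authors' reconstruction code.

```python
# Sketch of angular-spectrum propagation and iterative multiheight phase recovery.
import numpy as np

def angular_spectrum_propagate(field, dz, wavelength, pixel_size):
    """Propagate a complex field by dz (meters) using the free-space transfer function."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * dz) * (arg > 0)          # evanescent components suppressed
    return np.fft.ifft2(np.fft.fft2(field) * H)

def multiheight_phase_recovery(intensities, z_positions, wavelength=550e-9,
                               pixel_size=0.37e-6, n_iter=20):
    """intensities: measured hologram intensities ordered from the nearest to the
    farthest sample-to-sensor distance; z_positions: the corresponding distances (m)."""
    n = len(z_positions)
    # Initial guess (one reasonable choice): amplitude of the farthest plane, zero phase.
    field = np.sqrt(intensities[-1]).astype(np.complex128)
    current = n - 1
    for _ in range(n_iter):
        # One iteration: sweep from the farthest plane to the nearest and back,
        # enforcing the measured amplitude while keeping the current phase at each plane.
        for k in list(range(n - 1, -1, -1)) + list(range(n)):
            field = angular_spectrum_propagate(field, z_positions[k] - z_positions[current],
                                               wavelength, pixel_size)
            field = np.sqrt(intensities[k]) * np.exp(1j * np.angle(field))
            current = k
    return field  # finally, back-propagate to the object plane to extract the sample phase
```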
Data preprocessing and image registration
An important step in our training process is to perform an accurate image registration between the two imaging modalities (QPI and brightfield), which involves both global matching and local alignment steps. Since the network aims to learn the transformation from a label-free phase-retrieved image to a histologically stained brightfield image, it is crucial to accurately align the FOVs of each input and target image pair in the dataset. We perform this cross-modality alignment procedure in four steps; steps 1, 2, and 4 are done in MATLAB (The MathWorks Inc., Natick, MA, USA), and step 3 involves TensorFlow.

The first step is to find a roughly matched FOV between QPI and the corresponding brightfield image. This is done by first bicubically downsampling the whole-slide image (WSI) (~60k × 60k pixels) to match the pixel size of the phase-retrieved image. Then, each 4096 × 4096-pixel phase image was cropped by 256 pixels on each side (resulting in an image with 3584 × 3584 pixels) to remove the padding that is used for the image reconstruction process. Following this step, both the brightfield and the corresponding phase images are edge-extracted using the Canny method48, which uses a double threshold to detect strong and weak edges on the gradient of the image. Then, a correlation score matrix is calculated by correlating the resulting phase edge image against each same-sized 3584 × 3584-pixel patch extracted from the brightfield edge image. The patch with the highest correlation score indicates a match between the two images, and the corresponding brightfield image is cropped out from the WSI. Following this initial matching procedure, the quantitative phase image and the brightfield microscope images are coarsely matched.

The second step is used to correct for potential rotations between these coarsely matched image pairs, which might be caused by a slight mismatch in the sample placement during the two image acquisition experiments (which are performed on different imaging systems, holographic vs. brightfield). This intensity-based registration step correlates the spatial patterns between the two images; the phase image, converted to an unsigned integer format, and the luminance component of the brightfield image were used in this multimodal registration framework implemented in MATLAB. The result of this digital procedure is an affine transformation matrix, which is applied to the brightfield microscope image patch to match it with the quantitative phase image of the same sample. Following this registration step, the phase image and the corresponding brightfield image are globally aligned. A further crop of 64 pixels on each side of the aligned image pairs is used to accommodate a possible rotation angle correction.

The third step involves the training of a separate neural network that roughly learns the transformation from quantitative phase images into stained brightfield images, which helps the distortion correction between the two image modalities in the fourth/final step. In other words, to make the local registration tractable, we first train a deep network with the globally registered images, to reduce the entropy between the images acquired with the two imaging modalities (i.e., QPI vs. the brightfield image of the stained tissue). This neural network has the same structure as the network that was used for the final training process (see the next subsection on the GAN architecture and its training), with the input and target images obtained from the second registration step discussed earlier. Since the image pairs are not yet well aligned, the training is stopped early, at only ~2000 iterations, to prevent the network from learning a structural change at its output. The output and target images of this network are then used as the registration pairs in the fourth step, which applies an elastic image registration algorithm to correct for local feature registration16.
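As an illustration of the first (global matching) step, the sketch below downsamples the brightfield whole-slide image toward the phase image's pixel size, extracts Canny edge maps from both modalities, and locates the best-matching position by cross-correlating the edge maps. scikit-image and SciPy are used here as stand-ins for the MATLAB routines described above, so the specific function choices are assumptions rather than the authors' pipeline.

```python
# Sketch of the coarse FOV matching (step 1): edge extraction plus a correlation search.
# phase_img and wsi_img are 2-D grayscale float arrays (the WSI is much larger).
import numpy as np
from scipy.signal import fftconvolve
from skimage.feature import canny
from skimage.transform import rescale

def coarse_match(phase_img, wsi_img, phase_pixel_size=0.37, wsi_pixel_size=0.25):
    """Return the WSI crop whose edge map best correlates with the phase image's edge map."""
    # Bring the whole-slide image to (approximately) the phase image's pixel size.
    wsi_small = rescale(wsi_img, wsi_pixel_size / phase_pixel_size, anti_aliasing=True)
    # Edge maps for both modalities (Canny applies double thresholding internally).
    e_phase = canny(phase_img).astype(np.float32)
    e_wsi = canny(wsi_small).astype(np.float32)
    # Correlation score matrix: slide the phase edge map over the WSI edge map.
    score = fftconvolve(e_wsi, e_phase[::-1, ::-1], mode="valid")
    row, col = np.unravel_index(np.argmax(score), score.shape)
    h, w = e_phase.shape
    return wsi_small[row:row + h, col:col + w]  # coarsely matched brightfield FOV
```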
GAN architecture and training
The GAN architecture that we used for PhaseStain is detailed in Fig. 6 and Supplementary Table 1. Following the registration of the label-free quantitative phase images to the brightfield images of the histologically stained tissue sections, these accurately aligned fields-of-view were partitioned into overlapping patches of 256 × 256 pixels, which were then used to train the GAN model. The GAN is composed of two deep neural networks, a generator and a discriminator.

Fig. 6 Architecture of the generator and discriminator networks within the GAN framework

The discriminator network’s loss function is given by:

$$\ell_{\mathrm{discriminator}} = D\left(G(x_{\mathrm{input}})\right)^{2} + \left(1 - D(z_{\mathrm{label}})\right)^{2} \tag{3}$$

where D(.) and G(.) refer to the discriminator and generator network operators, respectively, x_input denotes the input to the generator, which is the label-free quantitative phase image, and z_label denotes the brightfield image of the histologically stained tissue. The generator network, G, tries to generate an output image with the same statistical features as z_label, while the discriminator, D, attempts to distinguish between the target and the generator output images. The ideal outcome (or state of equilibrium) will be when the generator’s output and target images share an identical statistical distribution, in which case D(G(x_input)) should converge to 0.5. For the generator deep network, we defined the loss function as:

$$\ell_{\mathrm{generator}} = L_1\left\{z_{\mathrm{label}},\, G(x_{\mathrm{input}})\right\} + \lambda \times \mathrm{TV}\left\{G(x_{\mathrm{input}})\right\} + \alpha \times \left(1 - D\left(G(x_{\mathrm{input}})\right)\right)^{2} \tag{4}$$

where the L1{.} term refers to the absolute pixel-by-pixel difference between the generator output image and its target, TV{.} stands for the total variation regularization that is applied to the generator output, and the last term reflects a penalty related to the discriminator network’s prediction of the generator output. The regularization parameters (λ, α) were set to 0.02 and 2000 so that the total variation loss term, λ × TV{G(x_input)}, was ~2% of the L1 loss term, and the discriminator loss term, α × (1 − D(G(x_input)))², was ~98% of the total generator loss, ℓ_generator.
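As a concrete illustration of Eqs. (3) and (4), the short sketch below writes both loss terms with TensorFlow 2 ops (the implementation reported here used TensorFlow 1.7, so this is a re-expression, not the authors' code). The tensor shapes and output scaling are assumptions; the weights λ = 0.02 and α = 2000 follow the values quoted above.

```python
# Sketch of the PhaseStain loss terms of Eqs. (3) and (4) in TensorFlow 2.
# Assumed shapes: g_out/target are (B, H, W, 3) image batches; d_of_g/d_of_label are (B, 1).
import tensorflow as tf

LAMBDA = 0.02   # weight of the total variation term
ALPHA = 2000.0  # weight of the adversarial term

def discriminator_loss(d_of_g, d_of_label):
    # Eq. (3): D should output 0 for generated images and 1 for histologically stained targets.
    return tf.reduce_mean(tf.square(d_of_g) + tf.square(1.0 - d_of_label))

def generator_loss(d_of_g, g_out, target):
    # Eq. (4): L1 fidelity + total variation regularization + adversarial penalty.
    l1 = tf.reduce_mean(tf.abs(target - g_out))
    tv = tf.reduce_mean(tf.image.total_variation(g_out))
    adv = tf.reduce_mean(tf.square(1.0 - d_of_g))
    return l1 + LAMBDA * tv + ALPHA * adv
```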
For the generator deep neural network, we adapted the U-net architecture49, which consists of a downsampling and an upsampling path, with each path containing four blocks forming four distinct levels (see Fig. 6 and Supplementary Table 1). In the downsampling path, each residual block consists of three convolutional layers and three leaky rectified linear (LReLU) units used as an activation function, which is defined as:

$$\mathrm{LReLU}(x) = \begin{cases} x & \text{for } x > 0 \\ 0.1x & \text{otherwise} \end{cases} \tag{5}$$

At the output of each block, the number of channels is increased by 2-fold (except for the first block, which increases from 1 input channel to 64 channels). The blocks are connected by an average-pooling layer of stride two that downsamples the output of the previous block by a factor of two in both the horizontal and vertical dimensions (as shown in Fig. 6 and Supplementary Table 1).

In the upsampling path, each block also consists of three convolutional layers and three LReLU activation functions, which decrease the number of channels at its output by fourfold. The blocks are connected by a bilinear upsampling layer that upsamples the size of the output from the previous block by a factor of two in both lateral dimensions. A concatenation function with the corresponding feature map from the downsampling path of the same level is used to increase the number of channels from the output of the previous block by two. The two paths are connected in the first level of the network by a convolutional layer, which maintains the number of the feature maps from the output of the last residual block in the downsampling path (see Fig. 6 and Supplementary Table 1). The last layer is a convolutional layer that maps the output of the upsampling path into 3 channels of the YCbCr color map.

The discriminator network consists of one convolutional layer, five discriminator blocks, an average-pooling layer, and two fully connected layers. The first convolutional layer receives 3 channels (YCbCr color map) from either the generator output or the target and increases the number of channels to 64. The discriminator blocks consist of two convolutional layers, with the first layer maintaining the size of the feature map and the number of channels, while the second layer increases the number of channels by twofold and decreases the size of the feature map by fourfold. The average-pooling layer has a filter size of 8 × 8, which results in a matrix with a size of (B, 2048), where B refers to the batch size. The output of this average-pooling layer is then fed into two fully connected layers, with the first layer maintaining the size of the feature map, while the second layer decreases the output channels to 1, resulting in an output size of (B, 1). The output of this fully connected layer goes through a sigmoid function, indicating the probability that the three-channel discriminator input is drawn from a histologically stained brightfield image. For the discriminator network, all the convolutional layers and the fully connected layers are connected by LReLU nonlinear activation functions.

Throughout the training, the convolution filter size was set to 3 × 3. For the patch generation, we applied data augmentation by using 50% patch overlap for the liver and skin tissue images, and 25% patch overlap for the kidney tissue images (see Table 1). The learnable parameters, including the filters, weights, and biases in the convolutional layers and the fully connected layers, are updated using the adaptive moment estimation (Adam) optimizer with a learning rate of 1 × 10−4 for the generator network and 1 × 10−5 for the discriminator network.

For each iteration of the discriminator, there were v iterations of the generator network; for the liver and skin tissue training, v = max(5, floor(7 − w/2)), where we increased w by 1 for every 500 iterations (w was initialized as 0). For the kidney tissue training, we used v = max(4, floor(6 − w/2)), where we increased w by 1 for every 400 iterations. This helped us train the discriminator without overfitting to the target brightfield images. We used a batch size of ten for the training of the liver and skin tissue sections, and five for the kidney tissue sections. All the convolutional kernel entries are initialized using a truncated normal distribution, and all the network bias terms are initialized to zero. The network’s training stopped when the validation set’s L1 loss did not decrease after 4000 iterations. A typical convergence plot of our training is shown in Fig. 7.

Fig. 7 PhaseStain convergence plots for the validation set of the digital H&E staining of the skin tissue. a L1 loss with respect to the number of iterations. b Generator loss, ℓ_generator, with respect to the number of iterations
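The alternating update schedule described above can be summarized in the following sketch, which assumes that the generator and discriminator models and the loss functions of Eqs. (3) and (4) are supplied by the caller. It applies the v = max(5, floor(7 − w/2)) rule used for the skin and liver models, with separate Adam optimizers at the learning rates quoted above; the data pipeline is a placeholder, and the exact counter that drives w is an assumption.

```python
# Sketch of the alternating GAN training schedule (skin/liver setting): v generator
# updates per discriminator update, with v shrinking as w grows every 500 iterations.
import math
import tensorflow as tf

def train_phasestain(G, D, generator_loss, discriminator_loss, dataset, total_iterations=20000):
    """dataset is assumed to yield (phase_patch, stained_patch) batches."""
    opt_g = tf.keras.optimizers.Adam(learning_rate=1e-4)  # generator
    opt_d = tf.keras.optimizers.Adam(learning_rate=1e-5)  # discriminator
    w = 0
    for it, (phase, stained) in enumerate(dataset.take(total_iterations)):
        if it > 0 and it % 500 == 0:
            w += 1                                   # w increases every 500 iterations
        v = max(5, math.floor(7 - w / 2))            # generator updates for this round
        for _ in range(v):
            with tf.GradientTape() as tape:
                g_out = G(phase, training=True)
                g_loss = generator_loss(D(g_out, training=False), g_out, stained)
            grads = tape.gradient(g_loss, G.trainable_variables)
            opt_g.apply_gradients(zip(grads, G.trainable_variables))
        with tf.GradientTape() as tape:
            d_loss = discriminator_loss(D(G(phase, training=False), training=True),
                                        D(stained, training=True))
        grads = tape.gradient(d_loss, D.trainable_variables)
        opt_d.apply_gradients(zip(grads, D.trainable_variables))
```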
Implementation details
The number of image patches that were used for training, the number of epochs, and the training schedules are shown in Table 1.

Table 1 Training details for the virtual staining of different tissue types using PhaseStain. Following the training, the blind inference takes ~0.617 s for an FOV of ~0.45 mm2, corresponding to ~3.22 megapixels (see the Discussion section)
Tissue type | # of iterations | # of patches (256 × 256 pixels) | Training time (h) | # of epochs

The network was implemented using Python version 3.5.0 with the TensorFlow framework, version 1.7.0. We implemented the software on a desktop computer with a Core i7-7700K CPU @ 4.2 GHz (Intel Corp., Santa Clara, CA, USA) and 64 GB of RAM, running the Windows 10 operating system (Microsoft Corp., Redmond, WA, USA). Following the training for each tissue section, the corresponding network was tested with 4 image patches of 1792 × 1792 pixels with an overlap of ~7%. The outputs of the network were then stitched to form the final network output image of 3456 × 3456 pixels (FOV ~1.7 mm2), as shown in, e.g., Fig. 2. The network training and testing were performed using dual GeForce GTX 1080Ti GPUs (NVidia Corp., Santa Clara, CA, USA).

Acknowledgments
The Ozcan Research Group at UCLA acknowledges the support of the NSF Engineering Research Center (ERC, PATHS-UP), the Army Research Office (ARO; W911NF-13-1-0419 and W911NF-13-1-0197), the ARO Life Sciences Division, the National Science Foundation (NSF) CBET Division Biophotonics Program, the NSF Emerging Frontiers in Research and Innovation (EFRI) Award, the NSF INSPIRE Award, the NSF Partnerships for Innovation: Building Innovation Capacity (PFI:BIC) Program, the National Institutes of Health (NIH, R21EB023115), the Howard Hughes Medical Institute (HHMI), the Vodafone Americas Foundation, the Mary Kay Foundation, and the Steven & Alexandra Cohen Foundation. The authors also acknowledge the Translational Pathology Core Laboratory (TPCL) and the Histology Lab at UCLA for their assistance with the sample preparation and staining, as well as Prof. W. Dean Wallace of Pathology and Laboratory Medicine at UCLA’s David Geffen School of Medicine for image evaluations.

Author details
1 Electrical and Computer Engineering Department, University of California, Los Angeles, CA 90095, USA. 2 Bioengineering Department, University of California, Los Angeles, CA 90095, USA. 3 California NanoSystems Institute (CNSI), University of California, Los Angeles, CA 90095, USA. 4 Department of Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
Conflict of interest
A.O., Y.R., and Z.W. have a patent application on the invention reported in this manuscript.

Supplementary information is available for this paper at https://doi.org/10.1038/s41377-019-0129-y.

Received: 23 July 2018 Revised: 5 January 2019 Accepted: 11 January 2019

References
1. Cuche, E., Marquet, P. & Depeursinge, C. Simultaneous amplitude-contrast and quantitative phase-contrast microscopy by numerical reconstruction of Fresnel off-axis holograms. Appl. Opt. 38, 6994–7001 (1999).
2. Popescu, G. Quantitative Phase Imaging of Cells and Tissues. (McGraw-Hill, New York, 2011).
3. Shaked, N. T., Rinehart, M. T. & Wax, A. Dual-interference-channel quantitative-phase microscopy of live cell dynamics. Opt. Lett. 34, 767–769 (2009).
4. Wang, Z. et al. Spatial light interference microscopy (SLIM). Opt. Express 19, 1016–1026 (2011).
5. Greenbaum, A. et al. Imaging without lenses: achievements and remaining challenges of wide-field on-chip microscopy. Nat. Methods 9, 889–895 (2012).
6. Zheng, G. A., Horstmeyer, R. & Yang, C. Wide-field, high-resolution Fourier ptychographic microscopy. Nat. Photonics 7, 739–745 (2013).
7. Tian, L. & Waller, L. Quantitative differential phase contrast imaging in an LED array microscope. Opt. Express 23, 11394–11403 (2015).
8. Wang, Z., Tangella, K., Balla, A. & Popescu, G. Tissue refractive index as marker of disease. J. Biomed. Opt. 16, 116017 (2011).
9. Wang, Z., Ding, H. F. & Popescu, G. Scattering-phase theorem. Opt. Lett. 36, 1215–1217 (2011).
10. Liu, Y. et al. Detecting cancer metastases on gigapixel pathology images. arXiv: 1703.02442 (2017).
11. Litjens, G. et al. A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017).
12. Tao, Y. K. et al. Assessment of breast pathologies using nonlinear microscopy. Proc. Natl Acad. Sci. USA 111, 15304–15309 (2014).
13. Giacomelli, M. G. et al. Virtual hematoxylin and eosin transillumination microscopy using epi-fluorescence imaging. PLoS ONE 11, e0159337 (2016).
14. Orringer, D. A. et al. Rapid intraoperative histology of unprocessed surgical specimens via fibre-laser-based stimulated Raman scattering microscopy. Nat. Biomed. Eng. 1, 0027 (2017).
15. Rivenson, Y. et al. Deep learning microscopy. Optica 4, 1437–1443 (2017).
16. Rivenson, Y. et al. Deep learning enhanced mobile-phone microscopy. ACS Photon. 5, 2354–2364, https://doi.org/10.1021/acsphotonics.8b00146 (2018).
17. Wang, H. et al. Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat. Methods 16, 103–110 (2019).
18. Sinha, A., Lee, J., Li, S. & Barbastathis, G. Lensless computational imaging through deep learning. Optica 4, 1117–1125 (2017).
19. Rivenson, Y., Zhang, Y. B., Günaydin, H., Teng, D. & Ozcan, A. Phase recovery and holographic image reconstruction using deep learning in neural networks. Light Sci. Appl. 7, e17141 (2018).
20. Wu, Y. C. et al. Extended depth-of-field in holographic imaging using deep-learning-based autofocusing and phase recovery. Optica 5, 704–710 (2018).
21. Jo, Y. et al. Quantitative phase imaging and artificial intelligence: a review. arXiv: 1806.03982 (2018).
22. Kamilov, U. et al. Learning approach to optical tomography. Optica 2, 517–522 (2015).
23. Nguyen, T., Xue, Y. J., Li, Y. Z., Tian, L. & Nehmetallah, G. Deep learning approach to Fourier ptychographic microscopy. arXiv: 1805.00334 (2018).
24. Boyd, N., Jonas, E., Babcock, H. P. & Recht, B. DeepLoco: fast 3D localization microscopy using neural networks. bioRxiv: 267096 (2018). https://doi.org/10.1101/267096.
25. Nehme, E., Weiss, L. E., Michaeli, T. & Shechtman, Y. Deep-STORM: super-resolution single-molecule microscopy by deep learning. Optica 5, 458–464 (2018).
26. Ouyang, W., Aristov, A., Lelek, M., Hao, X. & Zimmer, C. Deep learning massively accelerates super-resolution localization microscopy. Nat. Biotechnol. 36, 460–468, https://doi.org/10.1038/nbt.4106 (2018).
27. Zahavy, T. et al. Deep learning reconstruction of ultrashort pulses. Optica 5, 666–673 (2018).
28. Rivenson, Y. et al. Virtual histological staining of unlabelled tissue-autofluorescence images via deep learning. Nat. Biomed. Eng. (in the press).
29. Goodfellow, I. J. et al. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems (MIT Press, Cambridge, MA, 2014) pp. 2672–2680.
30. Park, H. S., Rinehart, M. T., Walzer, K. A., Chi, J. T. A. & Wax, A. Automated detection of P. falciparum using machine learning algorithms with quantitative phase images of unstained cells. PLoS ONE 11, e0163045 (2016).
31. Chen, C. L. et al. Deep learning in label-free cell classification. Sci. Rep. 6, 21471 (2016).
32. Roitshtain, D. et al. Quantitative phase microscopy spatial signatures of cancer cells. Cytometry A 91, 482–493 (2017).
33. Jo, Y. et al. Holographic deep learning for rapid optical screening of anthrax spores. Sci. Adv. 3, e1700606, https://doi.org/10.1101/109108 (2017).
34. Javidi, B. et al. Sickle cell disease diagnosis based on spatio-temporal cell dynamics analysis using 3D printed shearing digital holographic microscopy. Opt. Express 26, 13614–13627 (2018).
35. Tata, A. et al. Wide-field tissue polarimetry allows efficient localized mass spectrometry imaging of biological tissues. Chem. Sci. 7, 2162–2169 (2016).
36. Cree, I. A. et al. Guidance for laboratories performing molecular pathology for cancer patients. J. Clin. Pathol. 67, 923–931 (2014).
37. Patel, P. G. et al. Preparation of formalin-fixed paraffin-embedded tissue cores for both RNA and DNA extraction. J. Vis. Exp. 2016, e54299, https://doi.org/10.3791/54299 (2016).
38. Ikeda, T., Popescu, G., Dasari, R. R. & Feld, M. S. Hilbert phase microscopy for investigating fast dynamics in transparent systems. Opt. Lett. 30, 1165–1167 (2005).
39. Shaked, N. T., Zhu, Y. Z., Badie, N., Bursac, N. & Wax, A. Reflective interferometric chamber for quantitative phase imaging of biological sample dynamics. J. Biomed. Opt. 15, 030503 (2010).
40. Watanabe, E., Hoshiba, T. & Javidi, B. High-precision microscopic phase imaging without phase unwrapping for cancer cell identification. Opt. Lett. 38, 1319–1321 (2013).
41. Greenbaum, A. et al. Wide-field computational imaging of pathology slides using lens-free on-chip microscopy. Sci. Transl. Med. 6, 267ra175 (2014).
42. Bishara, W., Su, T. W., Coskun, A. F. & Ozcan, A. Lensfree on-chip microscopy over a wide field-of-view using pixel super-resolution. Opt. Express 18, 11181–11191 (2010).
43. Luo, W., Zhang, Y. B., Göröcs, Z., Feizi, A. & Ozcan, A. Propagation phasor approach for holographic image reconstruction. Sci. Rep. 6, 22738 (2016).
44. Farsiu, S., Robinson, M. D., Elad, M. & Milanfar, P. Fast and robust multiframe super resolution. IEEE Trans. Image Process. 13, 1327–1344 (2004).
45. Zhang, Y. B., Wang, H. D., Wu, Y. C., Tamamitsu, M. & Ozcan, A. Edge sparsity criterion for robust holographic autofocusing. Opt. Lett. 42, 3824–3827 (2017).
46. Greenbaum, A. & Ozcan, A. Maskless imaging of dense samples using pixel super-resolution based multi-height lensfree on-chip microscopy. Opt. Express 20, 3129–3143 (2012).
47. Goodman, J. W. Introduction to Fourier Optics. (Roberts and Company Publishers, Englewood, 2005).
48. Canny, J. A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8, 679–698 (1986).
49. Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W. M. & Frangi, A. F. (eds). Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015 (Springer, Cham, 2015) pp. 234–241, https://doi.org/10.1007/978-3-319-24574-4_28.