
Received September 11, 2019, accepted September 26, 2019, date of publication October 1, 2019, date of current version October 16, 2019.


Digital Object Identifier 10.1109/ACCESS.2019.2944958

Organ at Risk Segmentation in Head and Neck CT Images Using a Two-Stage Segmentation Framework Based on 3D U-Net

YUEYUE WANG, LIANG ZHAO, MANNING WANG, AND ZHIJIAN SONG
Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, Shanghai 200433, China
Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai 200032, China
Corresponding authors: Manning Wang ([email protected]) and Zhijian Song ([email protected])

ABSTRACT Accurate segmentation of organs at risk (OARs) plays a critical role in the treatment planning
of image-guided radiotherapy of head and neck cancer. This segmentation task is challenging for both
humans and automated algorithms because of the relatively large number of OARs to be segmented, the large
variability in size and morphology across different OARs, and the low contrast between some OARs and the
background. In this study, we propose a two-stage segmentation framework based on 3D U-Net. In this
framework, the segmentation of each OAR is decomposed into two subtasks: locating a bounding box of the
OAR and segmenting the OAR from a small volume within the bounding box, and each subtask is fulfilled by
a dedicated 3D U-Net. The decomposition makes each subtask much easier so that it can be better completed.
We evaluated the proposed method and compared it to state-of-the-art methods using the Medical Image
Computing and Computer-Assisted Intervention 2015 Challenge dataset. In terms of the boundary-based
metric 95% Hausdorff distance, the proposed method ranked first for seven of nine OARs and ranked second
for the other two OARs. In terms of the area-based metric dice similarity coefficient, the proposed method ranked
first for five of nine OARs and ranked second for the other three OARs with a small difference from the
method that ranked first.

INDEX TERMS 3D U-Net, CT images, head and neck, organ at risk segmentation.

I. INTRODUCTION
Head and neck (HaN) cancer is one of the most common cancers, with more than half a million cases worldwide per year [1]. Image-guided radiation therapy (IGRT), including intensity-modulated radiation therapy (IMRT) and volumetric modulated arc therapy, is a state-of-the-art treatment option because of its highly conformal dose delivery [2]–[4]. The key to the success of IGRT is patient-specific treatment planning, in which medical images are used to make a radiation plan that concentrates the radiation dose on the target volume while minimizing the dose to the surrounding organs at risk (OARs). Therefore, it is essential to segment the OARs in the treatment planning images, which usually include HaN computed tomography (CT) images. In current clinical practice, OARs are usually delineated manually, but the complexity and variability of OAR morphology in HaN CT images make this an inaccurate and very time-consuming task [5], [6]. It may take a radiologist three hours to segment all OARs for treatment planning [5]. Some treatment planning systems have an automatic segmentation function, such as atlas-based segmentation [7], but the segmentation results have not met clinical needs. Intensive labor is still needed to manually adjust the segmentation results before they are applicable to treatment planning, and the time needed for this adjustment is comparable to manual segmentation from scratch [6]. Therefore, there is a great demand for a rapid, accurate, and automatic OAR segmentation method to reduce radiologist labor in HaN treatment planning.

Medical image segmentation is an area of intense research, and many methods for segmenting different targets from medical images of different modalities have been proposed. Some of these methods have also been applied to OAR segmentation, but unfortunately, the current results are far from satisfactory. A Head and Neck Auto Segmentation Challenge was held in conjunction with the Medical Image Computing and Computer-Assisted Intervention (MICCAI) conference in 2015 (referred to as the ‘‘MICCAI 2015 Challenge’’ from here on), which provided a public dataset for OAR segmentation in HaN CT images [8].

The associate editor coordinating the review of this manuscript and approving it for publication was Junxiu Liu.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/

Six teams participated in this challenge and completed the task using different segmentation methods, including the statistical shape model, the active appearance model, the multiatlas-based segmentation method, and a semiautomatic segmentation method [8], but their segmentation results were not satisfactory to radiologists. The challenges of OAR segmentation in HaN CT images include: (i) the complexity and variability of the OARs are high, and it is difficult to incorporate prior information into shape models to support the segmentation of new images; (ii) the sizes of the OARs vary widely, and most segmentation methods obtain accurate results for bigger OARs but inaccurate results for smaller OARs; and (iii) the contrast of soft tissues is poor in CT images, which makes it difficult to segment some OARs, such as the brainstem.

Although the contrast between bone and soft tissues is relatively high in CT images, the characteristics of the HaN OAR segmentation task, including the large number of OARs to be segmented, the great variety in size and morphology of different OARs, and the low contrast between some OARs and their background, make it difficult for simple segmentation methods, such as thresholding, edge detection, and region growing, to succeed. Many methods that have been successfully used in other medical image segmentation tasks, such as the 3D level set [9] and atlas-based techniques [10], have also been applied in this field, but the results are not satisfactory.

Several approaches have been developed to incorporate prior knowledge, which often represents the results of gold standard segmentation of some subjects, to help segment new subjects, and these approaches have also been used in HaN OAR segmentation. For example, the method proposed in [11] built a statistical shape model of the OARs and deforms the model to fit the image to achieve segmentation. A multiatlas approach [12] registered the segmented images to the target image and then fused the labels of the segmented images to obtain a segmentation result for the target image. Another approach is to train a classifier with prior segmented images and transform the segmentation task into a classification task [13]. In the MICCAI 2015 Challenge, most teams adopted approaches such as the statistical shape model, the active appearance model, and the multiatlas-based method to utilize prior knowledge. This challenge provided a unified evaluation framework for different methods of OAR segmentation.

In recent years, deep learning methods, especially the convolutional neural network (CNN), have demonstrated excellent performance in medical image segmentation tasks [14]–[19], and CNNs have also been applied to OAR segmentation in HaN CT images [20]–[23]. The first study [20] using deep learning methods proposed a 2D CNN for OAR segmentation from in-house HaN CT images, but it only obtained a slight improvement for the right submandibular gland and right optic nerve, and its performance for the other OARs was similar to that of traditional methods. In [21], an interleaved 3D CNN method was proposed to jointly segment the optic nerve and chiasm. It used an atlas-based method to locate a bounding box enclosing the target OAR and then performed segmentation in a small target volume. Zhu et al. [22] proposed AnatomyNet, an end-to-end and atlas-free three-dimensional squeeze-and-excitation U-Net (3D SE U-Net), for fast and fully automated whole-volume HaN anatomical segmentation. Tong et al. [23] proposed a fully convolutional neural network with a shape representation model for multi-organ segmentation for HaN cancer radiotherapy. However, these existing deep-learning-based methods generally produced accurate segmentation maps for large organs, while the accuracy for small OARs was often sacrificed.

To separate the segmentation of large and small OARs, we adopt a two-stage framework for OAR localization and segmentation. Recently, two-stage frameworks and U-Net have shown outstanding performance in various medical image computing tasks [24]–[30]. In this study, we propose a two-stage framework that decomposes OAR segmentation into two relatively simpler tasks and completes each task with a dedicated 3D U-Net. The first task is to locate the target OAR with a bounding box, and the second task is to segment the target OAR within the bounding box. Decomposing the task in this way makes it simpler than directly segmenting the OARs from the entire volume and improves the segmentation performance. Experiments using the MICCAI 2015 Challenge data showed that the proposed method achieved the highest dice similarity coefficient (DSC) for six of the nine OARs and the second highest DSC for the other three OARs. In addition, the proposed method achieved the smallest 95% Hausdorff distance (95HD) for seven of the nine OARs with a significant benefit and the second smallest 95HD for the other two OARs.

II. MATERIALS AND METHODS
A. THE MICCAI 2015 CHALLENGE DATASET
In this study, we evaluated the proposed OAR segmentation framework and compared it to other methods using the PDDCA dataset, which is publicly available at http://www.imagenglab.com/newsite/pddca/. This dataset was provided by Dr. Gregory C. Sharp and was used in the Head and Neck Auto-Segmentation Challenge 2015, a satellite event at the MICCAI 2015 conference. The current version (v 1.4.1) of the PDDCA dataset consists of 25 training images, 8 additional training images, and 15 testing images. The original images are from the RTOG 0522 clinical trial [18], which provides 111 HaN CT images for treatment planning. The subset was chosen to ensure that the image quality is adequate and that the target OARs have minimal overlap with the tumors. Each image consists of a series of axial slices with 512 × 512 voxels on each slice, and the number of slices varies from 76 to 263. The in-plane spacing is between 0.76 mm × 0.76 mm and 1.27 mm × 1.27 mm, and the inter-plane spacing is between 1.25 and 3.00 mm.

In this dataset, nine anatomical structures, namely, the brainstem, optic nerve left, optic nerve right, chiasm, parotid left, parotid right, mandible, submandibular left, and submandibular right, were used as segmentation targets.


FIGURE 1. Examples of OARs of a patient in different slices of a CT scan; the OARs are manually annotated and shown in different colors.

Examples of the OARs of a patient in different slices of a CT scan are shown in Fig. 1. All nine of these structures are important OARs in HaN radiotherapy [19], and they were manually segmented by experts to provide high quality and consistency. The masks for most of these structures are provided in all 33 training images, except that the mandible and the left and right submandibular glands are only segmented in 25, 26, and 21 training images, respectively. The masks for all nine structures are provided in the 15 testing images and used as the gold standard for evaluation.

B. OVERVIEW OF THE TWO-STAGE SEGMENTATION FRAMEWORK
The proposed two-stage segmentation framework and its training and testing flowcharts are illustrated in Fig. 2. The framework consists of two 3D U-Nets. The original images and masks are first cropped to a volume with a consistent size of 384 × 384 × 224 for further processing.

The first 3D U-Net, denoted as LocNet, is used to coarsely locate the target structure with a bounding box. The cropped images and masks are first downsampled to a resolution of 96 × 96 × 56 voxels and used for training LocNet. LocNet outputs a 0–1 classification for each voxel, indicating whether the voxel falls in the bounding box. A post-processing step is used to generate a bounding box of size (h/4) × (w/4) × (k/4) from the output of LocNet, and the bounding box is transferred back to the coordinate frame of the cropped volume. Then, the bounding box is applied to the cropped volume to obtain a smaller volume of size h × w × k, which is the target volume. One LocNet is trained for each target structure, which requires a bounding box of a specific size.

The second 3D U-Net, denoted as SegNet, is used to segment the target structure from the target volume obtained in the previous step. The target volume has a size of h × w × k, which is much smaller than the 384 × 384 × 224 cropped volume, and only one structure is segmented from it. These two characteristics make the segmentation performed by SegNet much easier. The output of SegNet is a mask volume with each voxel being 0 or 1, indicating background and target voxels, respectively.

LocNet and SegNet are trained separately; one LocNet and one SegNet are trained for each of the nine structures. In Sections II-C and II-D, we introduce the preprocessing needed to prepare the training and testing data for the two 3D U-Nets and the concrete training and testing procedures.
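As a minimal sketch of how the two stages fit together at test time for one OAR, the code below assumes trained Keras models loc_net and seg_net with single-channel sigmoid outputs, uses simple stride-4 downsampling in place of whatever resampling is actually applied, and relies on a helper locate_box that turns the LocNet output into a bounding-box corner (one possible implementation is sketched at the end of Section II-D). All of these names are illustrative and not part of any released code.

```python
import numpy as np

def segment_oar(cropped_volume, loc_net, seg_net, box_size):
    """Two-stage inference for one OAR on a 384 x 384 x 224 cropped volume."""
    h, w, k = box_size                                   # bounding box size from Table 1

    # Stage 1: coarse localization on a volume downsampled by a factor of 4.
    small = cropped_volume[::4, ::4, ::4]                # (96, 96, 56); stride-4 stands in for resampling
    loc_out = loc_net.predict(small[None, ..., None])[0, ..., 0] > 0.5
    corner = locate_box(loc_out, (h // 4, w // 4, k // 4)) * 4
    # Keep the box inside the cropped volume.
    corner = np.minimum(corner, np.array(cropped_volume.shape) - np.array(box_size))

    # Stage 2: fine segmentation of the h x w x k target volume inside the box.
    i, j, l = corner
    target = cropped_volume[i:i + h, j:j + w, l:l + k]
    mask = seg_net.predict(target[None, ..., None])[0, ..., 0] > 0.5

    # Paste the binary mask back into the coordinate frame of the cropped volume.
    full = np.zeros(cropped_volume.shape, dtype=bool)
    full[i:i + h, j:j + w, l:l + k] = mask
    return full
```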


FIGURE 2. Flowchart of the proposed segmentation framework.

C. PREPROCESSING
1) INTERPOLATING AND CROPPING THE ORIGINAL IMAGES
The original images have different in-plane and inter-plane resolutions, which increases the variance of the shape and size of each structure and potentially increases the difficulty of segmenting them. Therefore, we resampled all the images into isotropic volumes with the same spatial resolution of 1 mm × 1 mm × 1 mm using bi-cubic interpolation. After interpolation, the in-plane size of all training and testing images was between 389 × 389 and 650 × 650 voxels, and the number of slices was between 226 and 416.

Because the input size of SegNet and LocNet needs to be a multiple of eight, we need to crop the isotropic volumes after interpolation. Considering the sizes of the isotropic volumes in this dataset and the requirement that the size in each direction should be a multiple of eight, we automatically cropped the images into a 384 × 384 × 224 volume. However, we did not manually crop the training and testing images to place the target structures at the center of the cropped volume. Instead, we divided the nine target structures into two groups and adopted a consistent cropping strategy for each group. The first group consisted of the brainstem, optic chiasm, and optic nerves (both left and right), and the second group consisted of the mandible, parotid glands (both left and right), and submandibular glands (both left and right). The X, Y, and Z axes of the coordinate frame of the original images corresponded to the left-right, anterior-posterior, and superior-inferior directions of the human body. We positioned the 384 × 384 × 224 cropping window on the original images with margins on both sides of the cropping window along each axis. The margins may differ between images because of differences in image size. Nevertheless, for all target structures, the ratio between the left and right margins along the X-axis was 0.5 to 0.5; the ratio between the anterior and posterior margins along the Y-axis was 0.3 to 0.7 and 0.2 to 0.8 for the structures in the first and second groups, respectively; and the ratio between the superior and inferior margins along the Z-axis was 0.9 to 0.1 and 0.7 to 0.3 for the structures in the first and second groups, respectively. Each target structure is not necessarily located at the center of the cropped volume, because a dedicated network will be used to locate it. For the structures in each group, cropping is performed automatically on both the training and testing images with the same parameters.
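As a concrete illustration of this preprocessing, the sketch below resamples a CT volume to 1 mm isotropic voxels and applies the ratio-based cropping. Scipy's cubic spline interpolation (order=3) stands in for the bi-cubic interpolation mentioned above, the array axes are assumed to be ordered (X, Y, Z) as described, and the zero-padding of volumes smaller than the cropping window is an added assumption.

```python
import numpy as np
from scipy.ndimage import zoom

CROP_SIZE = (384, 384, 224)   # a multiple of eight in every direction

def resample_isotropic(volume, spacing_mm):
    """Resample to 1 mm x 1 mm x 1 mm voxels with cubic spline interpolation."""
    return zoom(volume, np.asarray(spacing_mm, dtype=float), order=3)

def crop_volume(volume, start_ratios):
    """Crop to CROP_SIZE, splitting the margin along each axis by start_ratios.

    First group (brainstem, chiasm, optic nerves): (0.5, 0.3, 0.9);
    second group (mandible, parotid and submandibular glands): (0.5, 0.2, 0.7).
    """
    out = np.zeros(CROP_SIZE, dtype=volume.dtype)
    starts, stops = [], []
    for size, crop, ratio in zip(volume.shape, CROP_SIZE, start_ratios):
        margin = max(size - crop, 0)                  # voxels to discard along this axis
        starts.append(int(round(margin * ratio)))     # share of the margin placed before the window
        stops.append(starts[-1] + min(crop, size))
    region = volume[starts[0]:stops[0], starts[1]:stops[1], starts[2]:stops[2]]
    out[:region.shape[0], :region.shape[1], :region.shape[2]] = region
    return out
```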
2) DETERMINING THE SIZE OF THE BOUNDING BOX FOR EACH STRUCTURE
In the 384 × 384 × 224 cropped volume, we first located a bounding box enclosing the target structure and called the volume data within the bounding box the target volume. We needed to determine the size of the bounding box for each structure before locating it. Because the target volume is the input of SegNet, its size in each direction should also be a multiple of eight. In this study, we determined the size of the target volume for each structure by considering the size of the structure in the training dataset (Table 1).

TABLE 1. Size of the bounding box for each target structure.
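The paper derives the sizes in Table 1 from the structure sizes in the training set but does not give an explicit rule; the sketch below is one plausible recipe, assuming the largest extent of the structure over all training masks plus a safety margin, rounded up to the next multiple of eight (the stated constraint on SegNet's input). The margin value is an assumption.

```python
import numpy as np

def bounding_box_size(structure_masks, margin=8):
    """structure_masks: binary 3D arrays of one OAR over the training set."""
    max_extent = np.zeros(3, dtype=int)
    for mask in structure_masks:
        coords = np.argwhere(mask)
        if coords.size == 0:
            continue
        extent = coords.max(axis=0) - coords.min(axis=0) + 1
        max_extent = np.maximum(max_extent, extent)
    padded = max_extent + margin                       # leave room around the structure
    return tuple(int(-(-s // 8) * 8) for s in padded)  # round up to a multiple of eight
```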
D. TWO-STAGE 3D U-Net SEGMENTATION FRAMEWORK
In this study, we concatenated two 3D U-Nets to segment a target structure: the first 3D U-Net locates a relatively small target volume that encloses the target structure, and the second 3D U-Net segments the target structure from that target volume. The first and second networks are called LocNet and SegNet, respectively. As shown in Fig. 3, LocNet and SegNet have the same network structure, consisting of an analysis path and a synthesis path. In the analysis path, each layer contains two 3 × 3 × 3 convolutions, each followed by batch normalization (BN) and a rectified linear unit (ReLU), and then a 2 × 2 × 2 max pooling with a stride of two in each dimension. In the synthesis path, each layer consists of a 2 × 2 × 2 up-convolution with a stride of two in each dimension, followed by two 3 × 3 × 3 convolutions, each followed by BN and a ReLU. Shortcut connections from layers of equal resolution in the analysis path provide essential high-resolution features for the synthesis path. In the final layer, a 1 × 1 × 1 convolution is used to reduce the number of output channels to a 0–1 classification. In total, each network has 17 convolutional layers.
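The sketch below shows one way to assemble such a network in Keras, the framework used in Section III-B. The number of resolution levels and the channel widths are assumptions, since the paper only states that each network has 17 convolutional layers in total, and the single-channel sigmoid output trained with binary cross-entropy is one plausible reading of the ''0–1 classification'' and the cross-entropy loss described in Section III-B.

```python
from tensorflow.keras import layers, models, optimizers

def conv_block(x, filters):
    # Two 3x3x3 convolutions, each followed by batch normalization and ReLU.
    for _ in range(2):
        x = layers.Conv3D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    return x

def build_unet(input_shape, base_filters=16):
    inputs = layers.Input(shape=input_shape + (1,))
    skips, x = [], inputs
    # Analysis path: conv block, then 2x2x2 max pooling with stride 2.
    for level in range(3):
        x = conv_block(x, base_filters * 2 ** level)
        skips.append(x)
        x = layers.MaxPooling3D(pool_size=2)(x)
    x = conv_block(x, base_filters * 8)                      # bottom of the U
    # Synthesis path: 2x2x2 up-convolution, concatenation with the skip, conv block.
    for level in reversed(range(3)):
        x = layers.Conv3DTranspose(base_filters * 2 ** level, 2, strides=2, padding="same")(x)
        x = layers.concatenate([x, skips[level]])
        x = conv_block(x, base_filters * 2 ** level)
    outputs = layers.Conv3D(1, 1, activation="sigmoid")(x)   # 0-1 voxel classification
    model = models.Model(inputs, outputs)
    # Training setup from Section III-B: cross entropy minimized with Adam defaults.
    model.compile(optimizer=optimizers.Adam(), loss="binary_crossentropy")
    return model

loc_net = build_unet((96, 96, 56))   # LocNet input; SegNet uses the target-volume size
```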


FIGURE 3. Structures of the two-stage 3D U-Net frameworks.

The 384 × 384 × 224 cropped volume is first downsampled to a size of 96 × 96 × 56 and then input into LocNet. The output of LocNet is a 96 × 96 × 56 binary volume, from which we locate the bounding box. The cropped volume is downsampled by a factor of four; thus, the bounding boxes that we want to locate in the 96 × 96 × 56 output volume also shrink by a factor of four. For example, the bounding box of the mandible is 144 × 144 × 112 voxels, so we need to locate a bounding box of size 36 × 36 × 28 in the 96 × 96 × 56 output volume. It is very unlikely that all the voxels with a value of 1 fall within a cuboid of the expected size. Here, we used a sliding-window technique to locate the expected bounding box. We slid a cuboid of the expected size over the output volume and regarded the location at which the cuboid encloses the maximum number of voxels with value 1 as the true location of the bounding box. When multiple locations have the same maximum number, the average location is used.
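A possible implementation of this sliding-cuboid search is sketched below. It counts the foreground voxels inside every placement of the expected cuboid using a 3D integral image and averages tied maxima, as described above. Treating the LocNet output as a NumPy array and returning the box corner in the downsampled frame are assumptions about the surrounding code.

```python
import numpy as np

def locate_box(loc_out, box_size):
    """Return the corner of the cuboid placement enclosing the most foreground voxels.

    loc_out: binary (96, 96, 56) LocNet output; box_size: expected box size in that frame
    (i.e. the full-resolution bounding box divided by the downsampling factor of four).
    """
    bh, bw, bk = box_size
    a = loc_out.astype(np.int64)
    # Integral image: S[i, j, k] = sum of a[:i, :j, :k].
    S = np.pad(a, ((1, 0), (1, 0), (1, 0))).cumsum(0).cumsum(1).cumsum(2)
    H, W, K = a.shape
    i0, j0, k0 = (np.arange(H - bh + 1), np.arange(W - bw + 1), np.arange(K - bk + 1))
    I0, J0, K0 = np.meshgrid(i0, j0, k0, indexing="ij")
    I1, J1, K1 = I0 + bh, J0 + bw, K0 + bk
    # Inclusion-exclusion gives the foreground count inside every window placement.
    counts = (S[I1, J1, K1] - S[I0, J1, K1] - S[I1, J0, K1] - S[I1, J1, K0]
              + S[I0, J0, K1] + S[I0, J1, K0] + S[I1, J0, K0] - S[I0, J0, K0])
    best = np.argwhere(counts == counts.max())
    # When several placements tie, use the average location.
    return best.mean(axis=0).round().astype(int)
```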
III. EXPERIMENTS AND RESULTS
According to the regulations of the Head and Neck Auto-Segmentation Challenge 2015, we used all 33 training images in the dataset to train LocNet and SegNet and tested the segmentation framework on the 15 testing images. Four metrics were calculated to evaluate the performance of the proposed segmentation framework. We compared the proposed method with several state-of-the-art methods, including both traditional and artificial-intelligence-based approaches. Finally, we demonstrated the efficiency of the network by comparing the proposed method with two traditional approaches used to segment 3D medical images with a deep learning framework.

A. EVALUATION METRICS
We used four evaluation metrics in this study.

1. Dice similarity coefficient (DSC). The DSC measures the degree of overlap between the segmentation result and the gold standard and is defined as follows:

\mathrm{DSC} = \frac{2|A \cap B|}{|A| + |B|} \qquad (1)

where A and B represent the voxel set of the segmentation result and the voxel set of the gold standard, respectively.

2. The 95% Hausdorff distance (95HD). Before defining the 95HD, we first need to define the Hausdorff distance, which is usually used to measure the deviation between the contours of two areas. Given two point sets X and Y, and d(x, y) measuring the Euclidean distance between two points x ∈ X and y ∈ Y, the directed Hausdorff distance is defined as follows:

\vec{d}_H(X, Y) = \max_{x \in X} \min_{y \in Y} d(x, y) \qquad (2)

The directed Hausdorff distance \vec{d}_H(X, Y) measures the largest distance from a point in X to its nearest neighbor in Y, and this distance is sensitive to large segmentation errors in a very small region. To eliminate this sensitivity, a 95% Hausdorff distance is calculated as the 95th percentile of the distances, denoted as \vec{d}_{H,95}(X, Y). In this study, we used the 95HD, which is calculated as follows:

\mathrm{95HD} = \bigl(\vec{d}_{H,95}(X, Y) + \vec{d}_{H,95}(Y, X)\bigr) / 2 \qquad (3)

3. Positive predictive value (PPV). The PPV is the proportion of the correctly segmented volume within the entire volume of the segmentation result:

\mathrm{PPV} = \frac{|A \cap B|}{|A|} \qquad (4)

4. Sensitivity (SEN). The SEN is the proportion of the correctly segmented volume within the entire volume of the gold standard:

\mathrm{SEN} = \frac{|A \cap B|}{|B|} \qquad (5)
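The four metrics can be computed directly from the binary masks. The sketch below follows Eqs. (1)–(5), extracting the mask surfaces by binary erosion and using a k-d tree for the nearest-neighbour distances in the 95HD; the default 1 mm isotropic spacing and the surface-voxel definition of the point sets X and Y are assumptions.

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import cKDTree

def surface_voxels(mask):
    # Boundary voxels of a binary mask: the mask minus its erosion.
    mask = mask.astype(bool)
    return np.argwhere(mask & ~ndimage.binary_erosion(mask))

def dsc(pred, gold):                     # Eq. (1)
    inter = np.logical_and(pred, gold).sum()
    return 2.0 * inter / (pred.sum() + gold.sum())

def ppv(pred, gold):                     # Eq. (4)
    return np.logical_and(pred, gold).sum() / pred.sum()

def sen(pred, gold):                     # Eq. (5)
    return np.logical_and(pred, gold).sum() / gold.sum()

def hd95(pred, gold, spacing=(1.0, 1.0, 1.0)):
    # Symmetric 95% Hausdorff distance between the two contours, Eqs. (2)-(3).
    x = surface_voxels(pred) * np.asarray(spacing)
    y = surface_voxels(gold) * np.asarray(spacing)
    d_xy = cKDTree(y).query(x)[0]        # distance from each point of X to Y
    d_yx = cKDTree(x).query(y)[0]        # distance from each point of Y to X
    return (np.percentile(d_xy, 95) + np.percentile(d_yx, 95)) / 2.0
```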


TABLE 2. Average (± standard deviation) performance of our method with and without interpolation (INT) for each structure.

B. EXPERIMENTAL SETTINGS
The proposed networks were implemented in Python using the Keras package [31], and the experiments were performed on a computer with a single GPU (NVIDIA GTX 1080 Ti) and a Linux Ubuntu 14.04 LTS 64-bit operating system.

We trained one LocNet and one SegNet for each of the nine OARs. The size of the training images for LocNet was 96 × 96 × 56, and the size of the training images for SegNet was determined by the size of the bounding box for each structure except the mandible (Table 1). The size of the bounding box for the mandible was 144 × 144 × 112, but its target volume was further downsampled to 144 × 144 × 56 because of the memory limitation. The mini-batch size in each epoch was 1. The cross-entropy loss of the logistic regression was adopted as the loss function for both LocNet and SegNet and was minimized by the Adam optimizer with the recommended parameters, and training was terminated after 200 iterations over the training images. For each OAR, we trained one LocNet and one SegNet; thus, we trained 18 networks for all nine OARs in this dataset, which took approximately 30 hours. In the testing stage, the segmentation of one OAR in one image took approximately 6 seconds, of which approximately 2 seconds was spent on the network processing of the image and approximately 4 seconds on the postprocessing of the output of LocNet. For some structures, the output of SegNet contained several small isolated regions that did not belong to the target structure. We adopted a simple postprocessing technique in which we deleted isolated regions whose volume was less than 10% of the total segmentation result.
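A sketch of this postprocessing step is given below, using scipy's connected-component labelling; the default 6-connectivity of ndimage.label is an assumption, as the connectivity used is not specified.

```python
import numpy as np
from scipy import ndimage

def remove_small_islands(mask, fraction=0.10):
    """Drop connected components smaller than `fraction` of the total segmented volume."""
    mask = mask.astype(bool)
    labels, n = ndimage.label(mask)
    if n == 0:
        return mask
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n + 1))   # voxels per component
    keep_labels = 1 + np.flatnonzero(sizes >= fraction * mask.sum())
    return np.isin(labels, keep_labels)
```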
C. SEGMENTATION RESULTS OF THE PROPOSED METHOD
We evaluated the performance of our method using the two-stage segmentation framework and the interpolated isotropic images. In addition, we tested the proposed segmentation framework on the original images without interpolation. Without interpolation, the original images were not cropped; they were directly downsampled to a size of 128 × 128 × 64 as the input to LocNet. The size of the bounding box for some structures was different from that used for the interpolated images, but the same size was used across all images. The processing after obtaining the target volume was the same as for the interpolated images. In each experiment, the DSC, 95HD, PPV, and SEN were calculated for each OAR, and the results are listed in Table 2, in which the OARs are ordered by decreasing volume. The proposed method demonstrated good segmentation accuracy for large OARs, and its performance decreased with decreasing OAR volume when the volume-related metrics, including DSC, PPV, and SEN, were considered. However, this observation did not hold for the contour-based metric, the 95HD. Overall, the mean and standard deviation of the 95HDs for all the OARs were small, indicating that the segmentation method found the correct contour in most areas for each structure. The difference in performance reflected by the volume-based and contour-based metrics arises because similar levels of error on the contour can result in large errors for small structures and small errors for large structures when computing volume-based metrics, since volume-based metrics use the volume of the structure as the denominator.

For most OARs, the segmentation results with interpolation were superior to those without interpolation, especially when the volume of the OAR was small. One possible reason for the decreased accuracy with interpolation in some cases is that the interpolated images have a lower in-plane resolution than the original images. We interpolated the original images to 1 mm resolution in each dimension because of the memory limitation. Using a higher resolution for the interpolated images may further improve the accuracy of segmentation, not only for large OARs but also for small OARs.

Fig. 4 illustrates the segmentation results for subject 0522c0857 with and without interpolation. The mandible was segmented more accurately without interpolation, while the submandibular gland, optic nerve, and chiasm were segmented better with interpolation. To illustrate the overall performance of the proposed method, we show the segmentation results for subjects 0522c576, 0522c0667, and 0522c0857 with interpolation in Fig. 5.


FIGURE 4. Segmentation results for subject 0522c0857. The first and second rows show the segmentation results
without and with interpolation, respectively. From left to right: the 85th, 92nd, 102nd, 112th, 118th, and 120th slices of
the axial view. The gold standard results are depicted in green, and our results are depicted in red.

FIGURE 5. Segmentation results for subjects 0522c576 (row a), 0522c0667 (row b), and 0522c0857 (row c). The gold
standard results are depicted in green, and our results are depicted in red.

FIGURE 6. Example of typical good (upper row) and bad (lower row) segmentation results for the nine subjects (left to right:
mandible, left parotid gland, right parotid gland, brainstem, left submandibular gland, right submandibular gland, left optic
nerve, right optic nerve, and chiasm). The gold standard results are depicted in green, and our results are depicted in red.

In addition, for each of the nine OARs, we chose one good segmentation result and one bad segmentation result and show the corresponding slices in Fig. 6. We believe the bad segmentation results may be caused by the following three reasons. 1) Lack of training data: with only 33 training images, there were not enough data for network training, and the network could easily overfit. 2) Low contrast: the boundaries of some OARs are unclear and are difficult to segment even for experienced radiologists, let alone neural networks. 3) Inaccurate localization: if an OAR cannot be localized accurately from the whole volume by LocNet, it cannot be segmented accurately by SegNet.


TABLE 3. Average (± standard deviation) DSCs for the competing methods.

TABLE 4. Average (± standard deviation) 95HDs for the competing methods (unit: mm).

D. COMPARISON OF ACCURACY AGAINST STATE-OF-THE-ART METHODS
It is difficult to compare different methods of OAR segmentation in HaN CT images because of the differences in the datasets, OARs, and evaluation metrics used in different studies, whereas the MICCAI 2015 Challenge provides a unified evaluation framework. We therefore first compare the proposed method (with interpolation) with the four methods that ranked top in the challenge. Among these four methods, UC [32] provided DSCs for all nine OARs but no 95HDs, IM [33] provided DSCs and 95HDs for three OARs, and UB [11] and VU [34] provided DSCs and 95HDs for all nine OARs. Tables 3 and 4 list the DSCs and 95HDs for our method and the four competing methods, respectively. As shown in Table 4, our method outperforms the competing methods in terms of the 95HD by a large margin for seven of the nine OARs. In terms of the DSC, our method ranks first for five of the nine OARs and second for the other three OARs.

In addition to the four methods that can be directly compared, some other studies have used different datasets or the same dataset with a different training-testing grouping scheme. The first study [20] using deep learning methods to segment OARs in HaN CT images provided DSCs for 13 OARs. Eight of these OARs were used in the MICCAI 2015 Challenge (except the brainstem). We cannot directly compare the results of our method with those of [20] because that study used a different set of data. Nevertheless, the DSC of our method is higher than that of [20] for seven OARs (6.0% higher on average). Additionally, that method needs a doctor to determine the approximate location of each OAR to be segmented. In [13], a hierarchical vertex regression-based segmentation method was proposed, and the DSCs for the brainstem, mandible, and parotid gland were 0.9±0.04, 0.94±0.01, and 0.84±0.06, respectively. However, this method was only evaluated by two-fold cross validation on the 33 training images and not on the 15 testing images, and segmentation results for the other structures were not provided. In [21], an interleaved 3D CNN method was proposed to jointly segment the optic nerve and chiasm. The DSCs for the left optic nerve, right optic nerve, and chiasm were 0.72±0.08, 0.70±0.09, and 0.58±0.17, respectively. This method was designed to segment small targets and was not applied to the other OARs in the MICCAI 2015 Challenge dataset. Furthermore, it utilized a joint segmentation scheme, whereas we segmented each OAR separately. In [22], the authors used an end-to-end and atlas-free three-dimensional squeeze-and-excitation U-Net (3D SE U-Net) for fast and fully automated whole-volume HaN anatomical segmentation. The DSCs for the brainstem, chiasm, mandible, optic nerve left, optic nerve right, parotid left, parotid right, submandibular left, and submandibular right were 0.867, 0.532, 0.925, 0.721, 0.706, 0.881, 0.874, 0.814, and 0.813, respectively. Although they used four datasets to train and test their model, our method still obtains similar DSCs to theirs. We tried to use SE blocks in our SegNet, but they failed to improve the segmentation accuracy; the results are listed in the supplemental file.


TABLE 5. Runtimes of the different methods.

E. COMPARISON OF RUNTIMES AGAINST STATE-OF-THE-ART METHODS
Runtime comparison is difficult because the code of the competing methods is not available, and we cannot run all the methods on the same computer. Nevertheless, in Table 5 we list the runtimes of VU, IM, UB, and Ibragimov [20] for segmenting all nine OARs of one subject, as given in the original papers. Segmenting all nine OARs with our method required approximately 108 s on average.

F. ROLE OF TARGET LOCALIZATION
To show the superiority of the proposed target localization network, we compare our method with the following three baseline methods.

1) JOINT LOCALIZATION
In the localization stage, we trained one network for one structure, so we trained nine networks for the nine structures. To demonstrate the superiority of one localization network per structure, we trained a joint localization network for all nine structures. The joint localization network has the same structure as the LocNet proposed in this paper, except that it outputs the locations of all nine structures. The output of the joint localization network was processed by the same post-processing method as for LocNet, and SegNet was used to segment each structure after joint localization.

2) DOWNSAMPLING
Because of the memory limit, it is challenging to place the entire 3D image volume into the GPU for training and testing. One solution is to downsample the original image to a manageable size. In the downsampling strategy used in this paper, we downsampled the training and testing images to a size of 96 × 96 × 56. Then, we used nine 3D U-Nets to separately segment the nine target structures from the downsampled image.

3) SLIDING-WINDOW
Another way to address the memory limit, adopted in many previous studies, is the sliding window [35], [36], which crops the original images into small blocks and performs the segmentation block by block. In the sliding-window strategy used in this paper, we cropped the original volume data into non-overlapping blocks of size 64 × 64 × 64. Of all the blocks, only a very small proportion contained the target structure, and thus we could not use all the blocks for training. Therefore, we kept all the blocks that contained the target structure and randomly chose the same number of blocks without the target structure for training SegNet. In the testing stage, we slid a window of size 64 × 64 × 64 over the whole volume with some overlap between neighboring windows and adopted maximum voting for each voxel to obtain the final segmentation result.
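For illustration, the sketch below shows how such a sliding-window prediction can be assembled at test time. The stride of 32 voxels (the text only states that neighboring windows overlap) and the interpretation of ''maximum voting'' as a per-voxel majority vote are assumptions, and model stands for a 3D U-Net trained on 64-voxel blocks.

```python
import numpy as np

def sliding_window_predict(volume, model, window=64, stride=32):
    """Block-wise prediction over the whole volume with per-voxel majority voting."""
    votes = np.zeros(volume.shape, dtype=np.int32)
    hits = np.zeros(volume.shape, dtype=np.int32)

    def starts(size):
        # Window start positions; the last window is shifted so it touches the border.
        last = max(size - window, 0)
        positions = list(range(0, last + 1, stride))
        if positions[-1] != last:
            positions.append(last)
        return positions

    for z in starts(volume.shape[0]):
        for y in starts(volume.shape[1]):
            for x in starts(volume.shape[2]):
                block = volume[z:z + window, y:y + window, x:x + window]
                pred = model.predict(block[None, ..., None])[0, ..., 0] > 0.5
                votes[z:z + window, y:y + window, x:x + window] += pred
                hits[z:z + window, y:y + window, x:x + window] += 1

    return votes * 2 > hits   # foreground where more than half of the windows agree
```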
The DSCs and 95HDs for the three baselines and the proposed method with interpolation are listed in Table 6. As shown in Table 6, our method outperforms the other methods in both DSC and 95HD. Although the joint localization method reduces the number of LocNets, it requires a single network to localize nine different structures, which is difficult for one model. Because of the localization errors of the joint localization method, its segmentation accuracy for the target structures is slightly lower than that of the proposed method. For the downsampling method, it is very difficult to distinguish small structures, such as the optic nerve and chiasm, in the downsampled images, and thus their segmentation accuracy was very low. For the sliding-window method, several parameters, such as the window size, step size, and ratio between the positive and negative samples for training, may influence the final result. We experimented with several combinations of parameters and kept the best one, but we cannot guarantee that the reported accuracy is the best possible result. The 95HDs for the sliding-window strategy were very large for most OARs because this method segments out some false-positive voxels far from the true target OAR. Some postprocessing strategies may improve the accuracy of these two strategies, but the improvement is limited.

As shown in Table 7, we also compared the average training and testing times of the four methods for segmenting one structure. For the training time, our method needs to train two networks for each structure, so it takes longer than joint localization and downsampling. For the testing time, our method needs to locate and segment one structure using two networks, between which some processing is needed to obtain the bounding box. Therefore, the testing time of the proposed method is longer than that of downsampling. The training and testing times of the sliding-window strategy are much longer than those of the proposed method.

IV. DISCUSSION
In this study, we proposed a new framework for the automatic segmentation of OARs in HaN CT images and evaluated its performance on the MICCAI 2015 Challenge dataset. In contrast to previous methods based on deep neural networks, the proposed framework decomposes the segmentation into two simpler tasks, locating a bounding box and segmenting a small volume within the bounding box, and trains a 3D U-Net for each task. The proposed two-stage framework easily achieves a large field of view with a small memory footprint. If a small structure is to be segmented in a large image, a 3D U-Net can produce positive values at irrelevant parts of the large image, since the field of view will be too small, which prevents it from recognizing that the given location is irrelevant.


TABLE 6. Average (± standard deviation) DSCs and 95HDs for different strategies of handling large volumes.

TABLE 7. Runtimes of the proposed method and the downsampling and sliding-window strategies.

Using a LocNet provides this large field of view with a small memory footprint. Experiments using the MICCAI 2015 Challenge dataset showed that the proposed method significantly outperformed the state-of-the-art methods.

Many methods have been used for the segmentation of OARs in HaN CT images, but the results have been relatively poor compared to other medical image segmentation tasks. The difficulty comes from the characteristics of the OAR segmentation task, such as the large variability in shape and size across different target structures and the poor contrast between some structures and their background. Deep neural networks have become the best choice for most image processing tasks and often outperform traditional methods by a large margin in medical image segmentation applications [14]–[19]. However, existing studies on the application of deep neural networks to OAR segmentation in HaN CT images demonstrate performance similar to that of traditional methods. One of the major obstacles to using deep neural networks in medical image segmentation has been the contradiction between large, high-resolution images and limited memory. Previously, this problem was addressed by using downsampling or sliding-window strategies, but our experiments show that the performance of both strategies is very poor. In a recent study [21], a multiatlas-based segmentation method was first used to roughly locate the region of interest, and a 3D CNN then segmented only a small volume within the region. This method achieved high segmentation accuracy on three small structures and showed that decomposing the localization and segmentation tasks is helpful.

In this study, we utilized 3D U-Net for both the localization and segmentation tasks. The decomposition made each of the two tasks much easier, and the deep neural network could be properly trained for its specific task. The results showed that the trained LocNet could locate the bounding box containing the target structure in all cases. After the bounding box was accurately located, training the SegNet to segment one structure with a similar shape and appearance across different subjects became much easier than training a network to segment multiple structures with different shapes and appearances from the original images. This strategy of decomposing a medical image segmentation task into two tasks, i.e., locating a bounding box and segmenting within the bounding box, has also been used in other applications where multiple structures are to be segmented.

In this study, 3D U-Nets were used for both the locating and segmentation tasks, and many other network structures could replace the 3D U-Net for one or both tasks. We did not attempt to test different network structures in this study, but experimenting with more network architectures is a potential research direction for the future. For some of the outputs of SegNet, there were several small isolated regions that did not belong to the target structure. The simple postprocessing adopted in this study only slightly improved the final results, and a more sophisticated postprocessing method may further improve the accuracy. In addition, the number of subjects in the MICCAI 2015 Challenge dataset is not very large, which may limit the performance of the deep learning network. In the future, it will be necessary to verify whether the segmentation accuracy of the method can be further improved by training on more data. Moreover, testing whether the method is suitable for clinical use and whether it can help improve treatment planning workflows is also important.

V. CONCLUSION
In this study, we proposed a two-stage segmentation framework based on 3D U-Net for the automatic segmentation of OARs in HaN CT images. The framework decomposes the original segmentation task into two easier subtasks: locating a bounding box of the target structure and segmenting the target structure in a small volume within the bounding box.


One U-Net is trained for each task, and the decomposition allows the two tasks to be completed more accurately and quickly. Experiments using the MICCAI 2015 Challenge dataset show that the proposed method significantly outperforms the state-of-the-art methods.

ACKNOWLEDGMENT
(Yueyue Wang and Liang Zhao are co-first authors.)

REFERENCES
[1] C. Fitzmaurice, C. Allen, R. M. Barber, L. Barregard, Z. A. Bhutta, H. Brenner, and T. Fleming, ‘‘Global, regional, and national cancer incidence, mortality, years of life lost, years lived with disability, and disability-adjusted life-years for 32 cancer groups, 1990 to 2015: A systematic analysis for the global burden of disease study,’’ JAMA Oncol., vol. 3, no. 4, pp. 524–548, 2017.
[2] E. K. Hansen, M. K. Bucci, J. M. Quivey, V. Weinberg, and P. Xia, ‘‘Repeat CT imaging and replanning during the course of IMRT for head-and-neck cancer,’’ Int. J. Radiat. Oncol. Biol. Phys., vol. 64, pp. 355–362, Feb. 2006.
[3] W. F. A. R. Verbakel, J. P. Cuijpers, D. Hoffmans, M. Bieker, B. J. Slotman, and S. Senan, ‘‘Volumetric intensity-modulated arc therapy vs. conventional IMRT in head-and-neck cancer: A comparative planning and dosimetric study,’’ Int. J. Radiat. Oncol. Biol. Phys., vol. 74, no. 1, pp. 252–259, 2009.
[4] L. Zhao, Q. Wan, Y. Zhou, X. Deng, C. Xie, and S. Wu, ‘‘Erratum to ‘The role of replanning in fractionated intensity modulated radiotherapy for nasopharyngeal carcinoma,’’’ Radiotherapy Oncol., vol. 99, no. 2, p. 256, 2011.
[5] P. M. Harari, S. Song, and W. A. Tomé, ‘‘Emphasizing conformal avoidance versus target definition for IMRT planning in head-and-neck cancer,’’ Int. J. Radiat. Oncol. Biol. Phys., vol. 77, no. 3, pp. 950–958, 2010.
[6] M. La Macchia, F. Fellin, M. Amichetti, M. Cianchetti, S. Gianolini, V. Paola, A. J. Lomax, and L. Widesott, ‘‘Systematic evaluation of three different commercial software solutions for automatic segmentation for adaptive therapy in head-and-neck, prostate and pleural cancer,’’ Radiat. Oncol., vol. 7, Sep. 2012, Art. no. 160.
[7] G. Sharp, K. D. Fritscher, V. Pekar, M. Peroni, N. Shusharina, H. Veeraraghavan, and J. Yang, ‘‘Vision 20/20: Perspectives on automated image segmentation for radiotherapy,’’ Med. Phys., vol. 41, no. 5, 2017, Art. no. 050902.
[8] P. F. Raudaschl et al., ‘‘Evaluation of segmentation methods on head and neck CT: Auto-segmentation challenge 2015,’’ Med. Phys., vol. 44, pp. 2020–2036, Jun. 2017.
[9] E. Street, L. Hadjiiski, B. Sahiner, S. Gujar, M. Ibrahim, S. K. Mukherji, and H.-P. Chan, ‘‘Automated volume analysis of head and neck lesions on CT scans using 3D level set segmentation,’’ Med. Phys., vol. 34, no. 11, pp. 4399–4408, 2007.
[10] X. Han, M. S. Hoogeman, P. C. Levendag, L. S. Hibbard, D. N. Teguh, P. Voet, A. C. Cowen, and T. K. Wolf, ‘‘Atlas-based auto-segmentation of head and neck CT images,’’ in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., in Lecture Notes in Computer Science, vol. 5242, 2008, pp. 434–441.
[11] R. Mannion-Haworth, M. Bowes, A. Ashman, G. Guillard, A. Brett, and G. Vincent, ‘‘Fully automatic segmentation of head and neck organs using active appearance models,’’ Tech. Rep., 2016.
[12] M. Peroni, ‘‘Methods and algorithms for image guided adaptive radio- and hadron-therapy,’’ Tech. Rep., 2011, p. 152.
[13] Z. Wang, L. Wei, L. Wang, Y. Gao, W. Chen, and D. Shen, ‘‘Hierarchical vertex regression-based segmentation of head and neck CT images for radiotherapy planning,’’ IEEE Trans. Image Process., vol. 27, no. 2, pp. 923–937, Feb. 2018.
[14] F. Milletari, S.-A. Ahmadi, C. Kroll, A. Plate, V. Rozanski, J. Maiostre, J. Levin, O. Dietrich, B. Ertl-Wagner, K. Bötzel, and N. Navab, ‘‘Hough-CNN: Deep learning for segmentation of deep brain regions in MRI and ultrasound,’’ Comput. Vis. Image Understand., vol. 164, pp. 92–102, Nov. 2017.
[15] K. H. Cha, L. Hadjiiski, R. K. Samala, H.-P. Chan, E. M. Caoili, and R. H. Cohan, ‘‘Urinary bladder segmentation in CT urography using deep-learning convolutional neural network and level sets,’’ Med. Phys., vol. 43, no. 4, pp. 1882–1896, 2016.
[16] P. Hu, F. Wu, J. Peng, Y. Bao, F. Chen, and D. Kong, ‘‘Automatic abdominal multi-organ segmentation using deep convolutional neural network and time-implicit level sets,’’ Int. J. Comput. Assist. Radiol. Surg., vol. 12, no. 3, pp. 399–411, 2017.
[17] J. Cai, L. Lu, Z. Zhang, F. Xing, L. Yang, and Q. Yin, ‘‘Pancreas segmentation in MRI using graph-based decision fusion on convolutional neural networks,’’ in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., in Lecture Notes in Computer Science, 2016, pp. 442–450.
[18] F. Milletari, N. Navab, and S.-A. Ahmadi, ‘‘V-Net: Fully convolutional neural networks for volumetric medical image segmentation,’’ in Proc. 4th Int. Conf. 3D Vis., Oct. 2016, pp. 565–571.
[19] Q. Zhu, B. Du, B. Turkbey, P. L. Choyke, and P. Yan, ‘‘Deeply-supervised CNN for prostate segmentation,’’ in Proc. Int. Joint Conf. Neural Netw., May 2017, pp. 178–184.
[20] B. Ibragimov and L. Xing, ‘‘Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks,’’ Med. Phys., vol. 44, no. 2, pp. 547–557, 2017.
[21] X. Ren, L. Xiang, D. Nie, Y. Shao, H. Zhang, D. Shen, and Q. Wang, ‘‘Interleaved 3D-CNNs for joint segmentation of small-volume structures in head and neck CT images,’’ Med. Phys., vol. 45, no. 5, pp. 2063–2075, 2018.
[22] W. Zhu, Y. Huang, L. Zeng, X. Chen, Y. Liu, Z. Qian, N. Du, W. Fan, and X. Xie, ‘‘AnatomyNet: Deep learning for fast and fully automated whole-volume segmentation of head and neck anatomy,’’ Med. Phys., vol. 46, no. 2, pp. 576–589, 2019.
[23] N. Tong, S. Gou, S. Yang, D. Ruan, and K. Sheng, ‘‘Fully automatic multi-organ segmentation for head and neck cancer radiotherapy using shape representation model constrained fully convolutional neural networks,’’ Med. Phys., vol. 45, no. 10, pp. 4558–4567, 2018.
[24] C. Wang, T. Macgillivray, G. Macnaught, G. Yang, and D. Newby, ‘‘A two-stage 3D Unet framework for multi-class segmentation on full resolution image,’’ 2018, arXiv:1804.04341. [Online]. Available: https://arxiv.org/abs/1804.04341
[25] H. Chen, W. Lu, M. Chen, L. Zhou, R. Timmerman, D. Tu, L. Nedzi, Z. Wardak, S. Jiang, X. Zhen, and X. Gu, ‘‘A recursive ensemble organ segmentation (REOS) framework: Application in brain radiotherapy,’’ Phys. Med. Biol., vol. 64, no. 2, 2019, Art. no. 025015.
[26] K. He, X. Cao, Y. Shi, D. Nie, Y. Gao, and D. Shen, ‘‘Pelvic organ segmentation using distinctive curve guided fully convolutional networks,’’ IEEE Trans. Med. Imag., vol. 38, no. 2, pp. 585–595, Feb. 2019.
[27] A. Balagopal, S. Kazemifar, D. Nguyen, M.-H. Lin, R. Hannan, A. Owrangi, and S. Jiang, ‘‘Fully automated organ segmentation in male pelvic CT images,’’ Phys. Med. Biol., vol. 63, no. 24, 2018, Art. no. 245015.
[28] H. Fashandi, G. Kuling, Y. Lu, H. Wu, and A. L. Martel, ‘‘An investigation of the effect of fat suppression and dimensionality on the accuracy of breast MRI segmentation using U-nets,’’ Med. Phys., vol. 46, no. 3, pp. 1230–1244, Mar. 2019.
[29] M. U. Dalmış, G. Litjens, K. Holland, A. Setio, R. Mann, N. Karssemeijer, and A. Gubern-Mérida, ‘‘Using deep learning to segment breast and fibroglandular tissue in MRI volumes,’’ Med. Phys., vol. 44, no. 2, pp. 533–546, 2017.
[30] Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger, ‘‘3D U-Net: Learning dense volumetric segmentation from sparse annotation,’’ in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent., 2016, pp. 424–432.
[31] F. Chollet, ‘‘Keras,’’ Tech. Rep., 2015.
[32] T. Albrecht, T. Gass, C. Langguth, and M. Lüthi, ‘‘Multi atlas segmentation with active shape model refinement for multi-organ segmentation in head and neck cancer radiotherapy planning,’’ Tech. Rep., 2015, pp. 1–6.
[33] M. O. Arteaga, D. C. Peña, and G. C. Dominguez, ‘‘Head and neck auto segmentation challenge based on non-local generative models,’’ MIDAS J., to be published.
[34] A. Chen and B. Dawant, ‘‘A multi-atlas approach for the automatic segmentation of multiple structures in head and neck CT images,’’ MIDAS J., to be published.
[35] L. Yu, X. Yang, H. Chen, J. Qin, and P. A. Heng, ‘‘Volumetric convnets with mixed residual connections for automated prostate segmentation from 3D MR images,’’ in Proc. 31st AAAI Conf. Artif. Intell., 2017, pp. 66–72.
[36] O. Ronneberger, P. Fischer, and T. Brox, ‘‘U-Net: Convolutional networks for biomedical image segmentation,’’ in Proc. Med. Image Comput. Comput.-Assist. Intervent. (MICCAI), 2015.


YUEYUE WANG received the bachelor's degree in communication engineering from the Ocean University of China, in 2017. She is currently pursuing the Ph.D. degree with the Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, and the Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai, China. Her research interests include medical image segmentation and medical image classification.

LIANG ZHAO received the bachelor's degree in physics from Nankai University, Tianjin, China, in 2005, and the M.Sc. degree in medical physics from the University of Surrey, U.K., in 2007. He is currently pursuing the Ph.D. degree with the Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, and the Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention, Shanghai, China. His research interests include medical image processing/analysis and edge computing.

MANNING WANG received the B.S. and M.S. degrees in power electronics and power transmission from Shanghai Jiaotong University, Shanghai, China, in 1999 and 2002, respectively, and the Ph.D. degree in biomedical engineering from Fudan University, Shanghai, in 2011. He is currently a Professor of biomedical engineering with the School of Basic Medical Science, Fudan University, where he is also the Deputy Director of the Digital Medical Research Center and the Shanghai Key Laboratory of Medical Imaging Computing and Computer Assisted Intervention (MICCAI). His research interests include medical image processing, image-guided intervention, and computer vision.

ZHIJIAN SONG received the B.S. degree from the Shandong University of Technology, Shandong, China, in 1982, the M.S. degree from the Jiangsu University of Technology, Jiangsu, China, in 1991, and the Ph.D. degree in biomedical engineering from Xi'an Jiaotong University, Xi'an, China, in 1994. He is currently a Professor with the School of Basic Medical Science, Fudan University, Shanghai, where he is also the Director of the Digital Medical Research Center and the Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention (MICCAI). His research interests include medical image processing, image-guided intervention, and the application of virtual and augmented reality technologies in medicine.
