Hetero-Modal Variational Encoder-Decoder for Joint Modality Completion and Segmentation
Reuben Dorent, Samuel Joutard, Marc Modat, Sébastien Ourselin and Tom
Vercauteren
1 Introduction
2 Method
2.1 Multi-modal Variational Auto-Encoders (MVAE)
The MVAE [13] aims at identifying a model in which the n modalities x = (x1, ..., xn)
are conditionally independent given a hidden latent variable z. We consider the
directed latent-variable model parameterised by θ (typically the weights of a
decoding network fθ (·) going from the latent space to the image space):
\[ p_\theta(z, x_1, \ldots, x_n) = p(z) \prod_{i=1}^{n} p_\theta(x_i \mid z) \tag{1} \]
where p(z) is a prior on the latent space, which we classically choose as a standard
normal distribution z ∼ N (0, I). The goal is then to maximise the marginal log-
likelihood L(x; θ) = log pθ(x1, ..., xn) with respect to θ. However, the integral
\[ p_\theta(x_1, \ldots, x_n) = \int p_\theta(x \mid z)\, p(z)\, dz \]
is computationally intractable. [5] proposed to optimise, with respect to (φ, θ), the evidence lower bound (ELBO):
\[ \mathcal{L}(x; \theta, \phi) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big[q_\phi(z \mid x)\,\|\,p(z)\big] \tag{2} \]
where the variational posterior qφ(z|x) is modelled as a Gaussian after an encoding of x into a mean and a diagonal covariance by a neural network, hφ(x) = (µφ(x), Σφ(x)):
\[ q_\phi(z \mid x) = \mathcal{N}\big(z;\, \mu_\phi(x), \Sigma_\phi(x)\big) \tag{3} \]
The KL divergence between the two Gaussians qφ(z|x) and p(z) can be computed in closed form from their means and covariances. In contrast, estimating E_{qφ(z|x)}[log pθ(x|z)] is done by sampling the hidden variable z according to the Gaussian qφ(·|x) and then decoding it as fθ(z) in image space to evaluate pθ(x|z). To make sampling from z|x amenable to back-propagation, the reparametrisation trick is used [5]: z = µφ(x) + Σφ(x)^{1/2} ε, where ε ∼ N(0, I).
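Concretely, both ingredients take only a few lines. The following is a minimal PyTorch-style sketch, not the paper's implementation; variable names and shapes are illustrative assumptions, with `mu` and `logvar` standing for the outputs of the encoding network hφ(x):

```python
# Minimal sketch: reparametrisation trick and closed-form KL to N(0, I)
# for a diagonal Gaussian posterior.
import torch

def reparametrise(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Draw z = mu + sigma * eps, eps ~ N(0, I), keeping gradients w.r.t. mu and logvar."""
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """KL[N(mu, diag(exp(logvar))) || N(0, I)], summed over the latent dimensions."""
    return 0.5 * torch.sum(torch.exp(logvar) + mu ** 2 - 1.0 - logvar, dim=-1)
```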
Wu et al. [13] extended this variational formulation to a multi-modal setting. The authors remarked that
\[ p_\theta(z \mid x) \propto p(z) \prod_{i=1}^{n} \frac{p_\theta(z \mid x_i)}{p(z)} \]
This expression shows that pθ(z|x) can be decomposed into n modality-specific terms. For this reason, the authors approximate each pθ(z|xi)/p(z) with a modality-specific variational posterior qφi(z|xi). Similarly to (3), qφi(z|xi) is modelled as a Gaussian distribution after an encoding of xi into a mean and a diagonal covariance by a neural network, hφi(xi) = (µφi(xi), Σφi(xi)), such that qφi(z|xi) = N(z; µφi(xi), Σφi(xi)).
Finally, [1] demonstrates that
\[ q_\phi(z \mid x) \propto p(z) \prod_{i=1}^{n} q_{\phi_i}(z \mid x_i) \]
is Gaussian with mean µφ and covariance Σφ defined by:
\[ \Sigma_\phi = \Big(I + \sum_i \Sigma_{\phi_i}^{-1}\Big)^{-1} \quad \text{and} \quad \mu_\phi = \Sigma_\phi \Big(\sum_i \Sigma_{\phi_i}^{-1}\, \mu_{\phi_i}\Big) \tag{4} \]
This formulation allows for encoding each modality independently and fusing the encodings with a closed-form formula.
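As an illustration, here is a minimal sketch of the fusion (4) for diagonal covariances stored as element-wise variance tensors; the function and variable names are our own, not from the paper. The N(0, I) prior contributes the identity term to the precision and nothing to the mean:

```python
# Sketch of the closed-form product-of-Gaussians fusion of (4),
# assuming diagonal covariances represented as variance tensors.
import torch

def fuse_gaussians(means, variances):
    """Fuse per-modality posteriors N(mu_i, Sigma_i) with the N(0, I) prior.

    Sigma_phi = (I + sum_i Sigma_i^{-1})^{-1}
    mu_phi    = Sigma_phi (sum_i Sigma_i^{-1} mu_i)
    """
    precision = 1.0 + sum(1.0 / v for v in variances)  # prior precision I plus the experts
    fused_var = 1.0 / precision
    fused_mu = fused_var * sum(m / v for m, v in zip(means, variances))
    return fused_mu, fused_var
```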
However, starting from this well-posed multi-modal extension of the ELBO, [13] resorts to an ad hoc training sampling procedure: at each training iteration, the extreme cases (a single modality and the full set of modalities) and random modality subsets are used concurrently. This option is highly memory-consuming, unsuitable for 3D images, and not adapted to clinical scenarios in which some imaging subsets are more frequent than others. The next section proposes to include this prior information in our principled training procedure via ancestral sampling.
Fig. 1. MVAE architecture. Each imaging modality is encoded independently, the mean
and covariance of each q(z|xi ) are fused using the closed-form formula (4). A sample z
is randomly drawn and is decoded into imaging modalities and the segmentation map.
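To make the data flow of Fig. 1 explicit, the following schematic forward pass reuses the `fuse_gaussians` and `reparametrise` sketches above; the encoder/decoder module dictionaries are hypothetical placeholders for the per-modality networks:

```python
import torch

def hvae_forward(inputs, encoders, decoders, seg_decoder):
    """inputs: dict modality name -> image tensor, containing only the observed modalities."""
    means, variances = [], []
    for name, x in inputs.items():
        mu, var = encoders[name](x)              # h_{phi_i}(x_i)
        means.append(mu)
        variances.append(var)
    mu, var = fuse_gaussians(means, variances)   # closed-form fusion (4)
    z = reparametrise(mu, torch.log(var))        # z ~ N(mu_phi, Sigma_phi)
    recons = {name: dec(z) for name, dec in decoders.items()}
    return recons, seg_decoder(z), mu, var
```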
We choose qφπ(z|xπ) as Gaussian. Given the convexity of the KL divergence and the fact that Σ_{π∈P} απ = 1, we obtain:
\[ \mathrm{KL}\big[q_\phi(z \mid x)\,\|\,p(z)\big] \leq \sum_{\pi} \alpha_\pi\, \mathrm{KL}\big[q_{\phi_\pi}(z \mid x_\pi)\,\|\,p(z)\big] \]
The single Gaussian prior model for p(z) promotes consistency of the embedding z across the subsets of modalities π (qφπ(z|xπ)) and, in turn, across the full set of modalities (qφ(z|x)). In our optimisation procedure, at each iteration we propose to randomly draw a subset π with probability απ as the model input and to optimise the corresponding term of the bound.
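For illustration, drawing the training subset amounts to one categorical sample per iteration. The subsets and probabilities below are hypothetical examples; in practice the α values would reflect how frequently each imaging subset occurs clinically:

```python
# Sketch of ancestral sampling of the observed modality subset pi.
import random

MODALITIES = ("T1", "T1c", "T2", "FLAIR")

# Hypothetical subset distribution; the alphas must sum to 1.
SUBSETS = [("T1",), ("T1", "T1c"), ("FLAIR",), MODALITIES]
ALPHA = [0.2, 0.3, 0.2, 0.3]

def draw_subset():
    """Draw one subset pi with probability alpha_pi for the current iteration."""
    return random.choices(SUBSETS, weights=ALPHA, k=1)[0]
```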
[Fig. 2 diagram: multi-scale encoder-decoder with feature channels 8, 16, 32 and 64. Legend: product of Gaussians (4); concatenation; 1×1×1 convolutions (1 output channel per imaging modality, 4 for the segmentation); 2×2×2 max-pooling; hidden variable; 3×3×3 convolutions with instance normalisation and leaky ReLU; trilinear ×2 upsampling.]
Fig. 2. Our 3D variational encoder-decoder (U-HVED). Only two encoders and one decoder are shown. The product of Gaussians is defined in (4).
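A hedged sketch of the convolutional block suggested by the figure legend is given below: two 3×3×3 convolutions with instance normalisation and leaky ReLU, followed by 2×2×2 max-pooling, with the 8-16-32-64 channel progression from the figure. The exact arrangement is our assumption, not the paper's code:

```python
# Sketch of one modality-specific 3D encoder following the Fig. 2 legend.
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3x3 convolutions, each with instance normalisation and leaky ReLU."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm3d(out_ch),
        nn.LeakyReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.InstanceNorm3d(out_ch),
        nn.LeakyReLU(inplace=True),
    )

class Encoder(nn.Module):
    """One modality-specific encoder: 8 -> 16 -> 32 -> 64 feature channels."""
    def __init__(self, in_ch: int = 1):
        super().__init__()
        blocks, prev = [], in_ch
        for c in (8, 16, 32, 64):
            blocks += [conv_block(prev, c), nn.MaxPool3d(2)]
            prev = c
        self.body = nn.Sequential(*blocks)

    def forward(self, x):
        return self.body(x)
```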
Choice of the losses. The reconstruction loss follows from pθ(xi|z). For the segmentation, we use the sum of the cross-entropy loss Lcross and the Dice loss Ldice [4]. For the image reconstruction loss, we use the classic L2 loss. Additionally, given a drawn subset π, our loss includes the closed-form KL divergence between the Gaussians qφ(z|xπ) and p(z). For weighting the regularisation losses (KL divergence and reconstruction loss), we performed a grid search over the weights {0, 0.1, 1}. Finally, the loss associated with maximising the ELBO (5) combines the segmentation, reconstruction, and KL terms for the drawn subset π.
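A sketch of how these terms could be assembled for one iteration is shown below; the function names and the Dice implementation are illustrative, not the paper's code, and `w_rec` / `w_kl` stand for the grid-searched weights:

```python
# Sketch of the combined loss for a drawn subset pi: L2 reconstruction,
# cross-entropy + Dice segmentation, and the closed-form KL term.
import torch
import torch.nn.functional as F

def dice_loss(logits, target_onehot, eps=1e-5):
    """Soft multi-class Dice loss over 3D volumes of shape (B, C, D, H, W)."""
    probs = torch.softmax(logits, dim=1)
    inter = torch.sum(probs * target_onehot, dim=(2, 3, 4))
    denom = torch.sum(probs + target_onehot, dim=(2, 3, 4))
    return 1.0 - torch.mean((2.0 * inter + eps) / (denom + eps))

def total_loss(recons, images, seg_logits, seg_labels, seg_onehot, kl,
               w_rec=0.1, w_kl=0.1):
    """recons/images: lists of reconstructed and ground-truth modality volumes."""
    l_rec = sum(F.mse_loss(r, x) for r, x in zip(recons, images))
    l_seg = F.cross_entropy(seg_logits, seg_labels) + dice_loss(seg_logits, seg_onehot)
    return l_seg + w_rec * l_rec + w_kl * kl.mean()
```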
Fig. 3. Example of FLAIR and T1 completion and tumour segmentation given a subset
of modalities as input. Green: edema; Red: non-enhancing core; Blue: enhancing core.
modalities are encoded as in U-HVED and the skip-connections are the first and second moments of the modality-specific feature maps, as in HeMIS. U-HeMIS has only one decoder, used for tumour segmentation. The third approach, Single, is the "brute-force" method in which, for each possible subset of modalities, we train a U-Net network with the observed modalities concatenated as input. The encoder and decoder are those of our model. Given the 3-fold validation, we consequently trained 45 Single networks.
Table 1. Comparison of the different models (Dice %) for the different combinations of available modalities. Modalities that are present are denoted by •, missing ones by ◦. ∗ denotes a statistically significant improvement according to a Wilcoxon test (p < 0.05).
References
1. Cao, Y., Fleet, D.J.: Generalized product of experts for automatic and principled
fusion of Gaussian Process Predictions. CoRR abs/1410.7827 (2014)
2. Gibson, E., Li, W., et al.: NiftyNet: a deep-learning platform for medical imaging. Computer Methods and Programs in Biomedicine 158, 113–122 (2018)
3. Havaei, M., Guizard, N., Chapados, N., Bengio, Y.: HeMIS: Hetero-modal image segmentation. In: MICCAI 2016. pp. 469–477. Springer, Cham (2016)
4. Isensee, F., et al.: No new-net. In: Brainlesion: Glioma, Multiple Sclerosis, Stroke
and Traumatic Brain Injuries. pp. 234–244. Springer, Cham (2019)
5. Kingma, D.P., Welling, M.: Auto-encoding variational bayes. In: ICLR (2014)
6. Li, R., Zhang, W., Suk, H.I., Wang, L., Li, J., Shen, D., Ji, S.: Deep learning based
imaging data completion for improved brain disease diagnosis. In: MICCAI 2014.
pp. 305–312. Springer, Cham (2014)
7. Menze, B.H., et al.: The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Transactions on Medical Imaging 34, 1993–2024 (2015)
8. Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: Fully convolutional neural networks for volumetric medical image segmentation. In: International Conference on 3D Vision (3DV). pp. 565–571 (2016)
9. Myronenko, A.: 3D MRI brain tumor segmentation using autoencoder regularization. In: Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. pp. 311–320. Springer, Cham (2019)
10. Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed-
ical image segmentation. In: MICCAI 2015. pp. 234–241. Springer, Cham (2015)
11. Sønderby, C.K., Raiko, T., Maaløe, L., Sønderby, S.K., Winther, O.: Ladder variational autoencoders. In: NeurIPS. pp. 3738–3746 (2016)
12. Varsavsky, T., et al.: PIMMS: permutation invariant multi-modal segmentation.
In: DLMIA 2018, MICCAI 2018. pp. 201–209 (2018)
13. Wu, M., Goodman, N.: Multimodal generative models for scalable weakly-supervised learning. In: NeurIPS. pp. 5580–5590 (2018)
14. Zhao, S., Song, J., Ermon, S.: Learning hierarchical features from deep generative
models. In: ICML. pp. 4091–4099 (2017)