0% found this document useful (0 votes)

36 views

Visualizing Hierarchies in scRNA-seq Data Using A Density Tree-Biased Autoencoder

This document presents a new method called DTAE (Density Tree-biased AutoEncoder) for visualizing hierarchies in single-cell RNA sequencing (scRNA-seq) data. DTAE first builds a density tree from the data to capture its hierarchical structure, then trains an autoencoder to embed the data in 2D while preserving the density tree structure. The authors introduce the density tree and DTAE, compare it to related visualization methods, and demonstrate its ability to qualitatively and quantitatively represent real and synthetic scRNA-seq data hierarchies.

Uploaded by

Romina Turco

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

36 views

Visualizing Hierarchies in scRNA-seq Data Using A Density Tree-Biased Autoencoder

Uploaded by

Romina Turco

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 18

Visualizing hierarchies in scRNA-seq data using a density tree-biased

autoencoder
Quentin Garrido 1,5 * Sebastian Damrich 1 Alexander Jäger 1 Dario Cerletti 2,3
Manfred Claassen 4 Laurent Najman 5 Fred A. Hamprecht 1
arXiv:2102.05892v3 [q-bio.QM] 22 Apr 2022

1
HCI/IWR, Heidelberg University, Germany
2
Institute of Molecular Systems Biology, ETH Zürich, Switzerland
3
Institute of Microbiology, ETH Zürich, Switzerland
4
Internal Medicine I, University Hospital Tübingen, University of Tübingen, Germany
5
Université Gustave Eiffel, CNRS, LIGM, F-77454 Marne-la-Vallée, France

Abstract tion. In particular, this permits studying the cell develop-

ment through time more precisely.
Motivation: Single cell RNA sequencing (scRNA-seq) Waddington’s popular metaphor likens the development
allows studying the development of cells in unprecedented of cells to marbles rolling down a landscape (Waddington,
detail. Given that many cellular differentiation processes 1957). While cells are all grouped at the top of the hill when
are hierarchical, their scRNA-seq data is expected to they are not yet differentiated (e.g., stem cells), as they start
be approximately tree-shaped in gene expression space. rolling down, they can take multiple paths and end up in
Inference and representation of this tree-structure in two distinct differentiated states, or cell fates.
dimensions is highly desirable for biological interpretation However, for every cell, hundreds or thousands of ex-
and exploratory analysis. pressed genes are recorded, and this data is noisy. To sum-
Results: Our two contributions are an approach for iden- marize such high-dimensional data, it is useful to visualize
tifying a meaningful tree structure from high-dimensional it in two or three dimensions.
scRNA-seq data, and a visualization method respecting the Our goal, then, is to identify the hierarchical (tree) struc-
tree-structure. We extract the tree structure by means of a ture of the scRNA-seq data and subsequently reduce its
density based maximum spanning tree on a vector quan- dimensionality while preserving the extracted hierarchical
tization of the data and show that it captures biological properties. We address this in two steps, illustrated in fig-
information well. We then introduce DTAE, a tree-biased ure 1.
autoencoder that emphasizes the tree structure of the data First, we cluster the scRNA-seq data in high-dimensional
in low dimensional space. We compare to other dimension space to obtain a more concise and robust representa-
reduction methods and demonstrate the success of our tion. Then, we capture the hierarchical structure as a min-
method both qualitatively and quantitatively on real and imum spanning tree (MST) on our cluster centers, with
toy data. edge weights reflecting the data density in high-dimensional
Availability: Our implementation relying on PyTorch space. We dub the resulting tree “density tree”.
(Paszke et al., 2019) and Higra (Perret et al., 2019) is Second, we embed the data to low dimension with an au-
available at https://github.com/hci-unihd/DTAE. toencoder, a type of artificial neural network. In addition to
the usual aim of reconstructing its input, we bias the autoen-
coder to also reproduce the density tree in low-dimensional
space. As a result, the hierarchical properties of the data are
1 Introduction emphasized in our visualization.

Single-cell RNA sequencing (scRNA-seq) data allows ana-

lyzing gene expression profiles at the single-cell level, thus 2 Related Work
granting insights into cell behavior at unparalleled resolu-
There are various methods for visualizing scRNA-seq data
* quentin.garrido[at]edu.esiee.fr and trajectory inference, and many of them have been re-

1
Figure 1: Schematic method overview. a) High-dimensional data. b) Proposed density tree. After computing the k-means
centroids on the data, we build a tree based on the data density between pairs of centroids. c) DTAE. An autoencoder is used
to learn a representation of our data. This embedding is regularized by the previously computed tree in order to preserve
its hierarchical structure in low-dimensional space. d) The final DTAE embedding. After training of the autoencoder, the
bottleneck layer visualizes the data in low dimension and respects the density structure.

viewed for instance in Saelens et al. (2019). We therefore to MONOCLE 2.

mention only some exemplary approaches here. Visualization only. Most visualization methods do not
Graph only. SCORPIUS (Cannoodt et al., 2016) was provide a graph representation of the data. PHATE (Moon
one of the first such methods. It is limited to linear topolo- et al., 2019) is a recent approach which computes diffusion
gies rather than trees. More versatile methods include probabilities on the data before applying multidimensional
SLINGSHOT (Street et al., 2018) and SPADE (Bendall scaling.
et al., 2011; Qiu et al., 2011). In contrast to our work, The general purpose dimension reduction methods
these three methods only provide a graph summary of the t-SNE (Maaten and Hinton, 2008), UMAP (McInnes et al.,
data, but not a 2D scatter plot. Similar to our density tree, 2020; Becht et al., 2019) and ForceAtlas2 (Jacomy et al.,
SLINGSHOT and SPADE determine the hierarchical struc- 2014) are popular for visualizing scRNA-seq data. They
ture of the dataset as a MST on cluster centers. However, aim to layout the kNN graph structure of the data with t-
SLINGSHOT does not consider density. SPADE addresses SNE focusing more on discrete clusters and ForceAtlas2
the data density only by downsampling dense regions to better representing the continuous structure (Böhm et al.,
equalize the data density. In particular, it does not inform 2020). This often works well, but lacks the focus on hierar-
the MST by the actual data density, which can be problem- chies that our method provides. While the continuous focus
atic, as illustrated in figure 2. In contrast, we induce our of ForceAtlas2 seems apt to show differentiation processes
density tree to have edges in high-density regions. PAGA in scRNA-seq datasets, we find that without a specific tree-
(Wolf et al., 2019) produces primarily a graph summary prior the biologically interesting branching events are often
of the data. It first clusters the k nearest neighbor (kNN) poorly resolved.
graph of the data by modularity, and then places edges be- Like DTAE, several recent methods for visualizing
tween clusters of high connectivity. Optionally, a layout of scRNA-seq data rely on neural networks. We describe them
the PAGA graph can serve as initialization to other meth- in the following. Many approaches are extensions of the au-
ods, such as UMAP. In our method, we connect clusters de- toencoder (AE) (Rumelhart et al., 1985), a network which
pending on the data density between two cluster centroids. encodes the data to a lower dimensional latent space from
Moreover, our proposed visualization is directly optimized which it tries to decode the input. A prominent member of
to respect the density tree, while PAGA injects graph infor- this family is DCA (Eraslan et al., 2019) which replaces the
mation only at the initialization of the visualization. usual reconstruction loss by a count-based ZINB loss and
Graph and visualization. MONOCLE 2 (Qiu et al., aims at denoising scRNA-seq data. Its extension scDeep-
2017) is more similar to our method, as it provides both a Cluster (Tian et al., 2019) jointly trains a clustering model
visualization and a hierarchical graph structure on a vector in latent space. SAUCIE (Amodio et al., 2019) is another
quantization of the data. The tree in MONOCLE 2 is in- popular AE method and addresses multiple tasks includ-
ferred in conjunction with the embedding, while we learn ing batch effect removal and clustering, for which it uses
it as a first step in high-dimensional space and consider the a binary hidden layer. In order to exploit more relational
data density explicitly. As a result, our density tree depends information, scGAE (Luo et al., 2021) uses a graph AE
only on the biological data but not the embedding initializa- based on the kNN graph and achieves good visualization
tion or dimension. MONOCLE 2 is conceptually promis- results both for clustered and continuous scRNA-seq data,
ing, but empirically found to be often inferior to other meth- but without our inductive prior of a hierarchical embedding
ods, confer (Moon et al., 2019). Hence, we did not compare or our explicit focus on data density. Topological autoen-

2
coders (Moor et al., 2020) are conceptually closest to our Algorithm 1 Density tree generation
idea of retaining topological properties during dimension Require: High-dimensional data X ∈ Rn×d
reduction. They compute the MST on all points, which pro- Require: Number of k-means centroids k
duces less stable results than our density-based approach on procedure G ENERATE T REE(X, k)
cluster centroids. C ← K M EANS(X, k) . O(nkdt) with t the number
Variational autoencoders (VAEs) (Kingma and Welling, of iterations
2013), a generative AE version, have also been explored. G = (C, E) the complete graph on our centroids
A popular VAE for scRNA-seq data is scVI (Lopez et al., for {i, j} a two-element subset of {1, . . . , k} do .
2018), which explicitly models batch effects and library O(k 2 )
sizes. Instead, scVAE (Grønbech et al., 2020) investigates di,j = 0
likelihood functions suitable for scRNA-seq data and pro- end for
poses a clustering model in latent space. DR-A (Lin et al., for i = 1, . . . , |X| do . O(nk)
2020) apply adverserial training instead of the variational a ← arg min ||xi − cj ||2 . Nearest centroid
objective. Finally, scvis is a VAE tailored to visualization j=1,...,k
(Ding et al., 2018) and uses a t-SNE-like regularization term b ← arg min ||xi − cj ||2 . Second nearest
in the latent space. j=1,...,k
j6=a
Ivis (Szubert et al., 2019) employs a triplet loss function centroid
and a siamese neural network instead of an AE to preserve da,b = da,b + 1 . Increase nearest centroids’
the nearest neighbor relations in the visualization. edge strength
Both scDeepCluster and scVAE shape the latent space end for
into disconnected clusters, which is orthogonal to our goal T ← M AX S PANNING T REE(G, d) . O(k 2 log k)
of illustrating continuous developmental hierarchies. scVI, return T, d . Retains the density tree and the edge
scGAE and scDeepCluster work with a latent space dimen- strengths
sion larger than two and thus require an additional dimen- end procedure
sion reduction, typically with t-SNE, to visualize the data.
Neither of the pure visualization methods aims to bring
out the hierarchical properties often present in scRNA-seq our original data well. But it is crucial that the extracted
dataset. In particular, they do not use the data density to tree follows the dense regions of the data if we want to vi-
infer lineages. None of them provide a graph summary of sualize developmental trajectories of differentiating cells: a
the data. Our contribution, however, is to supply the user trajectory is plausible if we observe intermediate cell states
with a tree-shaped graph summarizing the hierarchies along and unlikely if there are jumps in the development. By
dense lineages in the data as well as a 2D embedding that preferring tree edges in high density regions of the data,
respects this tree shape. we ensure that the computed spanning tree is biologically
plausible. Following this rationale, we build the maximum
spanning tree on the complete graph over centroids whose
3 Methods edge weights are given by the density of the data along each
edge instead of the minimum spanning tree on Euclidean
3.1 Approximating the High-dimensional
distance. This results in a tree that (we believe) captures
scRNA-seq Data with a Tree Waddington’s hypothesis better than merely considering cu-
To summarize the high-dimensional data in terms of a tree, mulative differences in expression levels.
the minimum spanning tree (MST) on the Euclidean dis- To estimate the support that a data sample provides for an
tances is an obvious choice. This route is followed by Moor edge, we follow Martinetz and Schulten (1994). Consider
et al. (2020) who reproduce the MST obtained on their the complete graph G = (C, E) such that C = {c1 , . . . , ck }
high-dimensional data in their low-dimensional embedding. is the set of centroids. In the spirit of Hebbian learning,
However, scRNA-seq data can be noisy, and a MST built on that is, emphasizing connections that appear frequently, we
all of our data is very sensitive to noise. Therefore, we first count, for each edge, how often its incident vertices are the
run k-means clustering on the original data, yielding more two closest centroids to any given datum.
robust centroids for the MST construction and also reducing As pointed out by Martinetz and Schulten (1994) this
downstream complexity. amounts to an empirical estimate of the integral of the den-
A problem with the Euclidean MST, illustrated in fig- sity of observations across the second-order Voronoı̈ region
ure 2, is that two centroids can be close in Euclidean space (defined as the set of points having a particular set of 2 cen-
without having many data points between them. In such a troids as its 2 nearest centroids) associated with this pair of
case, a Euclidean MST would not capture the skeleton of cluster centers. Finally, we compute the maximum span-

3
Figure 2: (left, middle) Comparison of the tree built on k-means centroids using Euclidean distance or density weights.
The data was generated using the PHATE library (Moon et al., 2019), with 3 branches in 2D. Original data points are
transparently overlayed to better visualize their density. While the tree based on the Euclidean distance places connections
between centroids that are close but have only few data points between them (see red ellipse), our tree based on the data
density instead includes those edges that lie in high density regions (see pink ellipse). (right) Complete graph over centroids
and its Hebbian edge weights. Null-weight edges, that is edges not supported by data, are omitted for clarity.

ning tree over these Hebbian edge weights. Our strategy for
building the tree is summarized in algorithm 1.
1 X
Our data-density based tree follows the true shape of the Lrec = MSE(X, g(f (X))) = ||xi − g(f (xi ))||22 .
N
data more closely than a MST based on the Euclidean dis- xi ∈X
tance weights, as illustrated in figure 2. We claim this in- (1)
dicates it being a better choice for capturing developmental This term is the typical loss function for an autoencoder and
trajectories. Having extracted the tree shape in high dimen- ensures that the embedding is as faithful to the original data
sions, our goal is to reproduce this tree as closely as possible as possible, forcing it to extract the most salient data fea-
in our embedding. tures.

3.2.2 Push-Pull Loss

3.2 Density-Tree biased Autoencoder The main loss term that biases the DTAE towards the den-
(DTAE) sity tree is the push-pull loss. It trains the encoder to embed
the data points such that the high-dimensional data density,
We use an autoencoder to faithfully embed the high- and, in particular, the density tree, are reproduced in low-
dimensional scRNA-seq data in a low-dimensional space, dimensional-space.
and bias it such that the topology inferred in high- We find a centroid in embedding space by averaging
dimensional space is respected. An autoencoder is an the embeddings of all points assigned to the corresponding
artificial neural network consisting of two concatenated k-means cluster in high-dimensional space. In this way, we
subnetworks, the encoder f , which maps the input to can easily relate the centroids in high and low dimension,
lower-dimensional space, also called embedding space, and and will simply speak of centroids when the ambient space
the decoder g, which tries to reconstruct the input from is clear from the context.
the lower-dimensional embedding. It can be seen as a To reproduce the density structure in low-dimensional
non-linear generalization of PCA. We visualize the low- space, we want that the closest two high-dimensional cen-
dimensional embeddings hi = f (xi ) and hence choose troids to a point xi ∈ X correspond to the two low-
their dimension to be 2. dimensional centroids that are closest to its embedding
The autoencoder is trained by minimizing the following hi = f (xi ). We denote the latter centroids by ci,1 and ci,2 ,
loss terms, including new ones that bias the autoencoder to and low-dimensional centroids that actually correspond to
also adhere to the tree structure. the closest high-dimensional centroids by c0i,1 and c0i,2 . As
long as c0i,1 , c0i,2 differ from ci,1 and ci,2 , the encoder places
hi next to different centroids than in high-dimensional
3.2.1 Reconstruction Loss space. To improve this, we want to move c0i,1 , c0i,2 and hi
towards each other while separating ci,1 and ci,2 from hi .
The first term of the loss is the reconstruction loss, defined The following preliminary version of our push-pull loss im-
as plements this:

4
geodesic distance dgeo (ca , cb ) with ca , cb ∈ C is defined
2 as the number of edges in the shortest path between ca
L̃push (hi ) = − (||hi − ci,1 ||2 + ||hi − ci,2 ||2 ) (2)
2 and cb in the density tree. Centroids at the end of differ-
L̃pull (hi ) = ||hi − c0i,1 ||2 + ||hi − c0i,2 ||2 (3) ent branches in the density have a higher geodesic distance
1 X than centroids nearby on the same branch. By weighing
L̃push-pull = L̃push (f (xi )) + L̃pull (f (xi )). (4) the push-pull loss contribution of an embedded point by
N
xi ∈X the geodesic distance between its two currently closest cen-
The push loss decreases as hi and the currently closest centroids, we focus the push-pull loss on embeddings which
troids, ci,1 and ci,2 , are placed further apart from each other, erroneously lie between different branches.
while the pull loss decreases when hi gets closer to the The geodesic distances can be computed quickly in
correct centroids, c0i,1 and c0i,2 . Indeed, the push-pull loss O(k 2 ) via breadth first search, and this only has to be done
term is minimized if and only if each embedding hi lies in once before training the autoencoder.
the second-order Voronoı̈ region of those low-dimensional The final version of our push-pull loss becomes
centroids whose high-dimensional counterparts contain the
data point xi in their second-order Voronoı̈ region. In other 1 X
Lpush-pull = dgeo (ci,1 , ci,2 )
words, the loss is zero precisely when we are reproducing N
xi ∈X
the edge densities from high dimension in low dimension.
Note that we let the gradient flow through both the in- · (Lpush (f (xi )) + Lpull (f (xi ))) .
dividual embeddings and through the centroids, which are (8)
means of embeddings themselves.
This naı̈ve formulation of the push-pull loss has the Note, that the normalized push-pull loss in equation (7) and
drawback that it can become very small if all embeddings the geodesically reweighted push-pull loss in (8) both also
are nearly collapsed into a single point, which is undesir- get minimized if and only if the closest centroids in em-
able for visualization. Therefore, we normalize the contri- bedding space correspond to the closest centroids in high-
bution of every embedding hi by the distance between the dimensional space.
two correct centroids in embedding space. This prevents the
collapsing of embeddings, and also ensures that each data- 3.2.3 Compactness loss
point xi contributes equally, regardless of how far apart c0i,1
and c0i,2 are. The push-pull loss thus becomes The push-pull loss replicates the empirical high-
!2 dimensional data density in embedding space by moving
||hi − ci,1 ||2 + ||hi − ci,2 ||2 the embeddings into the correct second-order Voronoı̈
Lpush (hi ) = − (5)
||c0i,1 − c0i,2 ||2 region, which can be large or unbounded. For optimal
!2 visibility of the tree structure, an embedding should not
||hi − c0i,1 ||2 + ||hi − c0i,2 ||2 only be in the correct second-order Voronoı̈ region, but lie
Lpull (hi ) = (6)
||c0i,1 − c0i,2 ||2 compactly around the line between its two centroids. To
achieve this, we add the compactness loss, which is just
1 X
Lpush-pull = Lpush (f (xi )) + Lpull (f (xi )). (7) another instance of the pull loss
N
xi ∈X
!2
So far, we only used the density information from high- 1 X ||hi − c0i,1 ||2 + ||hi − c0i,2 ||2
Lcomp = (9)
dimensional space for the embedding, but not the extracted N ||c0i,1 − c0i,2 ||2
xi ∈X
density tree itself. The push-pull loss in equation (7) is ag-
1 X
nostic to the positions of the involved centroids within the = Lpull (f (xi )), (10)
density tree, only their Euclidean distance to the embed- N
xi ∈X
ding hi matters. In contrast, the hierarchical structure is
important for the biological interpretation of the data: it is The compactness loss is minimized if the embedding hi is
much less important if an embedding is placed close to two exactly between the correct centroids c0i,1 and c0i,2 and has
centroids that are on the same branch of the density tree elliptic contour lines with foci at the centroids.
than it is if the embedding is placed between two different
branches. In the first case, cells are just not ordered cor- 3.2.4 Cosine loss
rectly within a trajectory, while in the second case we get
false evidence for an altogether different pathway. Since the encoder is a powerful non-linear map, it can
We tackle this problem by reweighing the push-pull loss introduce artifactual curves in the low-dimensional tree
with the geodesic distance along the density tree. The branches. However, especially tight turns can impede the

5
visual clarity of the embedding. As a remedy, we pro- 3.3 Training procedure
pose an optional additional loss term that tends to straighten
branches. Firstly, we compute the k-means centroids, the edge densi-
ties, the density tree, and geodesic distances. This has to be
Centroids at which the embedding should be straight are
done only once as an initialization step. Secondly, we pre-
the ones within a branch, but not at a branching event of
train the autoencoder with only the reconstruction loss via
the density tree. The former can easily be identified as the
stochastic gradient descent on minibatches. This provides a
centroids of degree 2.
warm start for finetuning the autoencoder with all losses in
Let c be a centroid in embedding space of degree 2 with the third step.
its two neighboring centroids nc,1 and nc,2 . The branch is During finetuning, all embedding points are needed to
straight at c if the two vectors c − nc,1 and nc,2 − c are par- compute the centroids in embedding space. Therefore, we
allel or, equivalently, if their cosine is maximal. Denoting perform full-batch gradient descent during finetuning. For
by C2 = {c ∈ C | deg(c) = 2} the set of all centroids algorithmic details regarding the training procedure, confer
of degree 2, considered in embedding space, we define the to supplementary algorithm S1.
cosine loss as We always used k = 50 centroids for k-means clus-
tering in our experiments. This number needs to be
1 X (c − nc,1 ) · (nc,2 − c)
Lcosine = 1 − . (11) high enough so that the tree yields a skeleton of the
|C2 | ||c − nc,1 ||2 ||nc,2 − c||2 data, but not so high that the density loses its mean-
c∈C2
ing. k = 50 is a default value that works well in
Essentially, it measures the cosine of the angles along the a variety of scenarios. Our autoencoder always has a
tree branch and becomes minimal if all these angles are zero bottleneck dimension of 2 for visualization. In the ex-
and the branches straight. periments, we used layers of the following dimensions
d(input dimension), 2048, 256, 32, 2, 32, 256, 2048, d. This
A generalization of this criterion that deals with noisy
results in symmetrical encoders and decoders with four lay-
edges in the density tree is discussed in section B of the
ers. While not necessary in our experiments, if a lighter
appendix.
network is desired, we recommend applying PCA first to
reduce the number of input dimensions, or to filter out more
genes during the preprocessing. We omitted hidden layers
3.2.5 Complete loss function of dimension larger than the input. We use fully connected
layers and ReLU activations after every layer but the last
Combining the four loss terms of the preceding sections, we
encoder and decoder layer and employ the Adam (Kingma
arrive at our final loss
and Ba, 2017) optimizer with learning rate 2 × 10−4 for
pretraining and 1 × 10−3 for finetuning unless stated oth-
L = λrec Lrec + λpush-pull Lpush-pull + λcomp Lcomp + λcos Lcos . erwise. We used a batch size of 256 for pretraining in all
(12) experiments.
The relative importance of the loss terms, especially of
Lcomp and Lcos , which control finer aspects of the visualiza-
tion, might depend on the use-case. In practice, we found 4 Results
λrec = λpush-pull = λcomp = 1 and λcos = 50 to work well.
This configuration reduces the number of weights to adjust In this section, we show the performance of our method on
from four to one. toy and real scRNA-seq datasets and compare it to a vanilla
An ablation study of the different losses’ contribution is autoencoder, as well as to the popular non-parametric meth-
available in section C of the appendix. Its main conclusion ods PCA, Force Atlas 2, UMAP and PHATE and to the most
is that while the push-pull loss and reconstruction loss are prevalent neural network-based approaches, SAUCIE, DCA
sufficient to obtain satisfactory results, the addition of the and scVI. For all network-based approaches, we choose a
compactness and cosine loss helps to improve the visualiza- bottleneck of dimension 2 to directly use them for visual-
tions further and facilitates reproducibility. Empirically, we ization.
found that adding the compactness loss without the cosine
loss sometimes leads to discontinuous embeddings. The
two loss terms should therefore be added or omitted jointly.
4.1 PHATE generated data
If the default loss weights are not satisfactory, we recom- We applied our method to an artificial dataset created with
mend adjusting the cosine loss weight first. To understand the library published alongside Moon et al. (2019), to
how changing the loss parameters may affect the results, demonstrate its functionality in a controlled setting. We
please refer to the qualitative results in the ablation study. generated a toy dataset whose skeleton is a tree with one

6
Figure 3: Results obtained using data generated by the PHATE library. Branches are coloured by groundtruth labels.

backbone branch and 9 branches emanating from the back- tree structure of the data, especially for the endocrine sub-
bone, consisting in total of 10,000 points in 100 dimensions. types. The visualized hierarchy is biologically plausible,
We pretrained for 150 epochs with a learning rate of with a particularly clear depiction of the α-, β- and ε-cell
10−3 and finetuned for another 150 epochs with a learning branches and a visible, albeit too strong, separation of the
rate of 10−2 . δ-cells. This is in agreement with the results from Bastidas-
Figure 3 shows the visualization results. The finetun- Ponce et al. (2019). UMAP also performs very well and
ing significantly improves the results of the pretrained au- attaches the δ-cells to the main trajectory. However, the
toencoder, whose visualisation collapses the grey and green α- and β-cell branches are not as prominent as in DTAE.
branch onto the blue branch. All methods other than DCA, PHATE does not manage to separate the δ- and ε-cells dis-
scVI and PCA achieve satisfactory results that make the true cernibly from the other endocrine subtypes. As on toy data
tree structure of the data evident. While PHATE, UMAP in figure 3, it produces overly crisp branches for the α- and
and Force Atlas 2 produce overly crisp branches compared β-cells. PCA mostly overlays all endocrine subtypes. All
to the PCA result, the reconstruction loss of our autoen- methods but the vanilla autoencoder show a clear branch
coder guards us from collapsing the branches into lines. with tip and acinar cells and one via EP and Fev+ cells
PHATE appears to overlap the cyan and yellow branches to the endocrine subtypes, but only DTAE, DCA, SAUCIE
near the backbone, and UMAP introduces artificially curved and scVI manage to also hint at the more generic trunk and
branches. scVI collapses the green and brown as well as the multipotent cells from which these two major branches em-
pink and cyan branches together, giving hard to interpret vi- anate. However, SAUCIE, DCA and scVI fail to produce a
sualizations. The results on this toy dataset demonstrate that meaningful separation between the α- and β-cell branches.
our method can embed high-dimensional hierarchical data The ductal and Ngn3 low EP cells overlap in all methods.
into 2D and emphasize its tree-structure while avoiding to It is worth noting that the autoencoder alone was not able
collapse too much information compared to state-of-the-art to visualize meaningful hierarchical properties of the data.
methods. In our method, all branches are easily visible. However, the density tree-biased finetuning in DTAE made
this structure evident, highlighting the benefits of our ap-
proach.
4.2 Endocrine pancreatic cell data
In figure 4, we overlay DTAE’s embedding with a pruned
We evaluated our method on the data from Bastidas-Ponce version of the density tree and see that the visualization
et al. (2019). It represents endocrine pancreatic cells at closely follows the tree structure around the differentiated
different stages of their development and consists of gene endocrine cells. This combined representation of low-
expression information for 36351 cells and 3999 genes. dimensional embedding and overlaid density tree further
Preprocessing information can be found in Bastidas-Ponce facilitates the identification of branching events, most no-
et al. (2019). We pretrained for 300 epochs and used 250 tably for the α- and β-cells, and shows the full power of
epochs for finetuning. our method. It also provides an explanation for the appar-
Figure 4 and supplementary figure S5 depicts visual- ent separation of the δ-cells. Since there are relatively few
izations of the embryonic pancreas development with dif- δ-cells, they are not represented by a distinct k-means cen-
ferent methods. Our method can faithfully reproduce the troid.

7
Figure 4: Pruned density tree superimposed over embeddings of the endocrine pancreatic cell dataset, colored by cell sub-
types. We use finer labels for the endocrine cells. Darker edges represent denser edges. Only edges with more than 100
points contributing to them are plotted here.

Our method places more k-means centroids in the dense method. For instance, together with the density tree, we can
region in the lower right part of DTAE’s panel in figure 4 identify the ε-cells as a separate branch and find the location
than is appropriate to capture the trajectories, resulting in of the branching event into different endocrine subtypes in
many small branches. Fortunately, this does not result in the UMAP embedding.
an exaggerated tree-shaped visualization that follows every
spurious branch, which we hypothesize is thanks to the suc-
cessful interplay between the tree bias and the reconstruc- 4.3 T-cell infection data
tion aim of the autoencoder: If the biological signal encoded
in the gene expressions can be reconstructed by the decoder We further applied our method to T-cell data of a chronic
from an embedding with enhanced hierarchical structure, and an acute infection, which was shared with us by the au-
the tree-bias shapes the visualization accordingly. Con- thors of Cerletti et al. (2020). The data was preprocessed
versely, an inappropriate tree-shape is prevented if it would using the method described in Zheng et al. (2017), for more
impair the reconstruction. Overall, the density tree recovers details confer Cerletti et al. (2020). It contains gene expres-
the pathways identified in Bastidas-Ponce et al. (2019) to a sion information for 19029 cells and 4999 genes. While
large extent. Only the trajectory from multipotent via tip to we used the combined dataset to fit all dimension reduc-
acinar cells includes an unexpected detour via the trunk and tion methods, we only visualize the 13707 cells of the
ductal cells, which the autoencoder mends by placing the chronic infection for which we have phenotype annotations
tip next to the multipotent cells. from Cerletti et al. (2020) allowing us to judge visualization
quality from a biological viewpoint. We pretrained for 600
The density tree also provides useful information in con- epochs and used 250 epochs for finetuning.
junction with other dimension reduction methods. In fig- Figure 5 and supplementary figure S6 demonstrate that
ure 4, we overlay their visualizations with the pruned den- our method makes the tree structure of the data clearly vis-
sity tree by computing the centroids in the respective em- ible. The visualized hierarchy is also biologically signif-
bedding spaces according to the k-means cluster assign- icant: The two branches on the right correspond to the
ments. The density tree can help to find branching events memory-like and terminally exhausted phenotypic states,
and gain insights into the hierarchical structure of the data which are identified as the main terminal fates of the differ-
that is visualized with an existing dimension reduction entiation process in Cerletti et al. (2020). Furthermore, the

8
Figure 5: Pruned density tree superimposed over embeddings of the chronic part of the T-cell data, colored by phenotypes.
Darker edges represent denser edges. Only edges with more than 100 points contributing to them are plotted here.

purple branch at the bottom contains the proliferating cells. event towards memory-like and terminally exhausted cells.
Since the cell cycle affects cell transcription significantly, PCA exhibits only the coarsest structure and fails to sepa-
those cells are expected to be distinct from the rest. rate the later states visibly. The biological structure is de-
It is encouraging that DTAE makes the expected bio- cently preserved in the UMAP visualization, but the hierar-
logical structure apparent even without relying on known chy is less apparent than in DTAE. SAUCIE, scVI and Force
marker genes or differential cell expression, which were Atlas 2 produce results that are very similar to PCA, with
used to obtain the phenotypic annotations in Cerletti et al. later states that are hard to distinguish. DCA produces re-
(2020). sults that are very similar to the vanilla autoencoder, where
Interestingly, our method places the branching event to- even though the later states are visible, there is a significant
wards the memory-like cells in the vicinity of the exhausted amount of noise in the embedding, making the analysis dif-
cells, as does UMAP, while Cerletti et al. (2020) recog- ficult. Overall, our method outperforms the other visualiza-
nized a trajectory directly from the early stage cells to the tion methods on this dataset.
memory-like fate. The exact location of a branching event In figure 5, we have overlaid our embedding with a
in a cell differentiation process is difficult to determine pre- pruned version of the density tree and see that DTAE’s vi-
cisely. We conjecture that fitting the dimensionality reduc- sualization indeed closely follows the tree structure. It is
tion methods on the gene expression measurements of cells noteworthy that even the circular behavior of proliferation
from an acute infection in addition to those from the chronic cells is accurately captured by a self-overlaid branch, al-
infection analyzed in Cerletti et al. (2020) provided addi- though our tree-based method is not directly designed to
tional evidence for the trajectory via exhausted cells to the extract circular structure.
memory-like fate. Unfortunately, an in-depth investigation Figure 5 also shows the other dimension reduction meth-
of this phenomenon is beyond the scope of this methodolog- ods in conjunction with the pruned density tree. Reassur-
ical paper. ingly, we find that all methods embed the tree in a plausi-
The competing methods expose the tree-structure of the ble way, i.e., without many self-intersections or oscillating
data less obviously than DTAE. The finetuning significantly branches. This is evidence that our density tree indeed cap-
improves the results from the autoencoder, which shows tures a meaningful tree structure of the data. As for the
no discernible hierarchical structure. PHATE separates the endocrine pancreas dataset, the density tree can enhance hi-
early cells, proliferating cells and the rest. But its layout erarchical structure in visualizations of existing dimension
is very tight around the biologically interesting branching reduction methods. It, for example, clarifies in the UMAP

9
Type of metric Local Global Voronoi
Euclidean Geodesic All
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
DTAE (Ours) 93.75 48.70 85.51 72.91 82.39 87.19 98.24 94.21 82.86
AE 74.83 70.96 87.41 77.20 70.16 73.23 89.83 58.43 75.26
PHATE 84.76 73.48 45.43 46.04 74.15 78.45 85.27 44.04 66.45
UMAP 78.88 87.75 53.42 54.31 79.40 80.12 83.31 55.94 71.64
SAUCIE 89.99 67.43 82.22 78.50 84.03 85.41 96.43 78.58 82.83
DCA 49.79 64.37 76.54 90.95 40.40 65.92 63.26 49.33 62.57
scVI 74.80 54.30 87.82 67.68 75.45 82.75 86.42 57.77 73.37
Force Atlas 2 72.88 72.23 37.28 48.06 35.67 76.65 77.27 43.27 57.91
PCA 60.40 40.78 73.42 66.02 96.44 96.40 80.76 56.82 71.38

Table 1: Relative quantitative performances averaged over all studied datasets. For each metric, we give the best performing
method a value of 100 and scale other results proportionally. The metrics are described in section 4.4 and higher values
indicate better performance. The rightmost column contains the average relative performance over all metrics. DTAE and
SAUCIE have the best performance overall, with DTAE excelling in Voronoı̈ metrics and ARI.

plot that the pathway towards the terminally exhausted cells ond order Voronoı̈ diagram with k = 50, there is a bias to-
is via the exhausted and effector like cells and not directly wards DTAE since we optimize this criterion. For local and
via the proliferating cells. Voronoı̈ diagram based metrics, we have to adjust a param-
eter k (either for k-means clustering or for a k-NN graph).
We vary the value of k between 10 and 100 with a step of
4.4 Quantitative analysis 10 and report the area under the curve.

The purpose of a visualization method is to make the most We report results aggregated on all three datasets in ta-
salient, qualitative properties of a dataset visible. Neverthe- ble 1 and full results are available in supplementary ta-
less, a quantitative evaluation can support the comparison ble S5. This aggregation makes it easier to deduce general
of visualization methods and provide evidence that the data patterns of performance among multiple datasets. From the
and its visualization are structurally similar. Unfortunately, results on all datasets, we can clearly see that DTAE out-
there is to our knowledge no consensus as to which metric performs other methods on Voronoı̈ diagram based metrics,
aligns with practitioners’ notion of a useful visualization. in part due to the bias towards them for k = 50. On lo-
Hence, any single metric cannot validate the quality of a cal metrics, DTAE achieves the best performance on ARI,
method. This is why it is important to use multiple metrics, followed closely by SAUCIE. However, for k-NN preserva-
so that one can hope for a more reliable result. tion UMAP performs better than other methods by a signif-
We selected eight different metrics, some of which icant margin which is consistent with the criterion it opti-
have been employed to judge visualization methods be- mizes (Damrich and Hamprecht, 2021). For euclidean dis-
fore (Moon et al., 2019; Kobak and Berens, 2019; Becht tance preservation, autoencoder based methods perform the
et al., 2019). The first group of metric considers the local best, with no clear winner overall. For geodesic distance
structure. We compute the Adjusted Rand Index (ARI) be- preservation, PCA performs the best, even though it pro-
tween a k-means clustering in high and low dimension and duced poor visualizations. This is in line with previous
the number of correct neighbors in the k-NN graph in high findings (Kobak and Berens, 2019). Most other methods
and low dimension. The next category are global metrics, obtained very similar performance on this metric, making
which rely on distance preservation. Euclidean distances it hard to conclude that any method performs better than
are computed in low dimension and euclidean or geodesic another.
distances are computed in high dimension. Then correla- In order to more easily compare methods, aggregated
tions are computed between those distances. Finally, we performances over all metrics are reported in the rightmost
use Voronoı̈ diagram based metrics. First or second order column of table 1. This aggregation makes it easier to evalu-
Voronoı̈ diagrams on the k-means centroids are computed ate the overall performance of a method when using a wide
using the k-means assignments to obtain the seeds in low- variety of criteria. We chose the arithmetic mean to com-
dimensional space. Then the ratio of points placed in the bine the results for simplicity’s sake. From this, we can see
correct Voronoı̈ region is computed. When using the sec- that DTAE and SAUCIE perform significantly better than

10
structure of the dataset, a more general dimension reduction
method might be preferable for initial data exploration.

5.2 Neural network limitations

Artificial neural networks are powerful non-linear func-
tions that can produce impressive results. Unfortunately,
they require the choice of a number of hyperparameters,
such as the dimension of the hidden layers and the learning
rate, making them less end-user friendly than their classical
counterparts.

Figure 6: Failure case: Highly clustered data violates our 6 Conclusion

underlying assumption of a tree structure. Dentate gyrus
data from Hochgerner et al. (2018) with clusters colored by We have introduced a new way of capturing the hierarchical
groundtruth cluster assignments. properties of scRNA-seq data of a developing cell popula-
tion with a density based minimum spanning tree. This tree
is a hierarchical representation of the data that places edges
other methods, with DTAE surpassing SAUCIE by a small in high density regions and thus captures biologically plau-
margin. However, from a qualitative point of view, DTAE sible trajectories. The density tree can be used to inform
produced superior visualizations compared to SAUCIE, as any dimension reduction method about the hierarchical na-
discussed previously. ture of the data.
Overall, DTAE produced excellent results both from a Moreover, we used the density tree to bias an autoen-
quantitative and qualitative point of view, highlighting its coder and were thus able to produce promising visualiza-
usefulness as a visualization method for tree-shaped data. tions exhibiting clearly visible tree-structure both on syn-
thetic and real world scRNA-seq data of developing cell
populations.
5 Limitations
5.1 Hierarchy assumption Funding
Our method is tailored to Waddington’s hierarchical struc- Supported, in part, by Informatics for Life funded by the
ture assumption of developmental cell populations, in Klaus Tschira Foundation.
which the highest data density is along the developmen-
tal trajectory. It produces convincing results in this setting
as shown above. However, if the assumption is violated,
for instance because the dataset contains multiple separate References
developmental hierarchies or a mixture of hierarchies and
distinct clusters of fully differentiated cell fates, the den- Amodio, M. et al (2019). Exploring single-cell data with deep multitasking neural
networks. Nature Methods, 16(11), 1139–1145.
sity tree cannot possibly be a faithful representation of the
dataset. Indeed, in such a case, our method yields a poor Bastidas-Ponce, A. et al (2019). Comprehensive single cell mRNA profiling re-
result. As an example, confer figure 6 with visualizations veals a detailed roadmap for pancreatic endocrinogenesis. Development, 146(12),
dev173849.
of the dentate gyrus dataset from Hochgerner et al. (2018),
preprocessed according to Zheng et al. (2017). This dataset Becht, E. et al (2019). Dimensionality reduction for visualizing single-cell data using
consists of a mostly linear cell trajectory and several dis- UMAP. Nature Biotechnology, 37(1), 38–44. Number: 1 Publisher: Nature
Publishing Group.
tinct clusters of differentiated cells, and consequently does
not meet our model’s assumption. Indeed, DTAE manages Bendall, S.C. et al (2011). Single-cell mass cytometry of differential immune and
to only extract some linear structures, but overall fails on drug responses across a human hematopoietic continuum. Science, 332(6030),
687–696.
this dataset, similarly to PHATE. UMAP seems to produce
the most useful visualization here. Böhm, J.N. et al (2020). Attraction-repulsion spectrum in neighbor embeddings.
One could adapt our method by extracting a forest of arXiv preprint arXiv:2007.08902.

disconnected density trees by cutting edges below a den- Cannoodt, R. et al (2016). SCORPIUS improves trajectory inference and identifies
sity threshold. However, if little is known a priori about the novel modules in dendritic cell development. preprint, Bioinformatics.

11
Cerletti, D. et al (2020). Fate trajectories of CD8 + T cells in chronic LCMV infection. Szubert, B. et al (2019). Structure-preserving visualisation of high dimensional
preprint, Immunology. single-cell datasets. Scientific reports, 9(1), 1–10.

Damrich, S. and Hamprecht, F.A. (2021). On UMAP’s true loss function. Tian, T. et al (2019). Clustering single-cell rna-seq data with a model-based deep
arXiv:2103.14608 [cs, stat]. arXiv: 2103.14608. learning approach. Nature Machine Intelligence, 1(4), 191–198.

Ding, J. et al (2018). Interpretable dimensionality reduction of single cell transcrip- Waddington, C.H. (1957). The strategy of the genes : a discussion of some aspects
tome data with deep generative models. Nature communications, 9(1), 1–13. of theoretical biology. Routledge Library Editions: 20th Century Science. Rout-
ledge.
Eraslan, G. et al (2019). Single-cell rna-seq denoising using a deep count autoen-
coder. Nature communications, 10(1), 1–14. Wolf, F.A. et al (2019). PAGA: graph abstraction reconciles clustering with trajectory
inference through a topology preserving map of single cells. Genome Biology,
Grønbech, C.H. et al (2020). scvae: Variational auto-encoders for single-cell gene 20(1), 59.
expression data. Bioinformatics, 36(16), 4415–4422.
Zheng, G.X.Y. et al (2017). Massively parallel digital transcriptional profiling of
Hochgerner, H. et al (2018). Conserved properties of dentate gyrus neurogenesis single cells. Nature Communications, 8(1).
across postnatal development revealed by single-cell RNA sequencing. Nature
Neuroscience, 21(2), 290–299.

Jacomy, M. et al (2014). Forceatlas2, a continuous graph layout algorithm for handy

network visualization designed for the gephi software. PloS one, 9(6), e98679.

Kingma, D.P. and Ba, J. (2017). Adam: A method for stochastic optimization.

Kingma, D.P. and Welling, M. (2013). Auto-encoding variational bayes. arXiv

preprint arXiv:1312.6114.

Kobak, D. and Berens, P. (2019). The art of using t-SNE for single-cell transcrip-
tomics. Nature Communications, 10(1), 5416.

Lin, E. et al (2020). A deep adversarial variational autoencoder model for dimension-

ality reduction in single-cell rna sequencing analysis. BMC bioinformatics, 21(1),
1–11.

Lopez, R. et al (2018). Deep generative modeling for single-cell transcriptomics.

Nature methods, 15(12), 1053–1058.

Luo, Z. et al (2021). scgae: topology-preserving dimensionality reduction for single-

cell rna-seq data using graph autoencoder. bioRxiv.

Maaten, L.v.d. and Hinton, G. (2008). Visualizing Data using t-SNE. Journal of
Machine Learning Research, 9(86), 2579–2605.

Martinetz, T. and Schulten, K. (1994). Topology representing networks. Neural

Networks, 7(3), 507–522.

McInnes, L. et al (2020). UMAP: Uniform Manifold Approximation and Projection

for Dimension Reduction. arXiv:1802.03426 [cs, stat]. arXiv: 1802.03426.

Moon, K.R. et al (2019). Visualizing structure and transitions in high-dimensional

biological data. Nature Biotechnology, 37(12), 1482–1492.

Moor, M. et al (2020). Topological Autoencoders. arXiv:1906.00722 [cs, math, stat].

arXiv: 1906.00722.

Paszke, A. et al (2019). PyTorch: An Imperative Style, High-Performance Deep

Learning Library. arXiv:1912.01703 [cs, stat]. arXiv: 1912.01703.

Perret, B. et al (2019). Higra: Hierarchical Graph Analysis. SoftwareX, 10, 100335.

Qiu, P. et al (2011). Extracting a cellular hierarchy from high-dimensional cytometry

data with spade. Nature biotechnology, 29(10), 886–891.

Qiu, X. et al (2017). Reversed graph embedding resolves complex single-cell trajec-

tories. Nature Methods, 14(10), 979–982.

Rumelhart, D.E. et al (1985). Learning internal representations by error propagation.

Technical report, California Univ San Diego La Jolla Inst for Cognitive Science.

Saelens, W. et al (2019). A comparison of single-cell trajectory inference methods.

Nature Biotechnology, 37(5), 547–554.

Street, K. et al (2018). Slingshot: cell lineage and pseudotime inference for single-
cell transcriptomics. BMC Genomics, 19(1), 477.

12
A Training loop algorithm
deg(v, t) = min n
n=1...|Γ(v)|
Algorithm S1 Training loop Pn
i=1 WΓ(v)i t
Require: Autoencoder (g ◦ f )θ s.t. P|Γ(v)| ≥
100
j=1 WΓ(v)j
Require: Pretraining epochs np , batch size b and learning
rate αp We can clearly see that when t = 100 we obtain the
Require: Finetuning epochs nf and learning rate αf classical definition of degree. As this generalization has not
Require: Weight parameters for the loss improved the visualization quality drastically, we opted for
λrec ,λpush-pull ,λcomp ,λcos the simpler version of the cosine loss in the main paper.
1: T, C, C2 , dgeo ← I NITIALIZATION (X)
2: #P retraining
3: for t = 0, 1, . . . , np do C Ablation study
4: for i = 0, 1, . . . , np /b do
5: Sample a minibatch m from X In order to better visualize the contributions of each ele-
6: m̂ ← g(f (m)) ment of our method, we conducted an ablation study of the
7: L ← Lrec different loss parameters and evaluated their impact both
8: θt+1 ← θt − αp ∇L qualitatively and quantitatively.
9: end for
10: end for C.1 Loss parameters
11: #F inetuning
12: for t = np , . . . , np + nf do The first phenomenon that is studied is the influence of
13: h ← f (X) dropping loss terms entirely. The reconstruction loss is al-
14: X̂ ← g(h) ways kept since it is necessary for the embeddings to con-
15: L ← λrec Lrec + λpush-pull Lpush-pull + λcomp Lcomp + tain salient information about the data. Not all combinations
λcos Lcos of loss parameters will be studied, but only those that should
16: θt+1 ← θt − αf ∇L be interesting (for example, using only the cosine loss does
17: end for not make much sense, so it is not an interesting scenario).
We will not study the influence of the weights for every
loss since the default weights of 1 lead to good performance
and this configuration significantly reduces the dimension
of the hyperparameter space. All experiments are described
B Cosine loss generalization in table S1.
The performance will be evaluated both qualitatively and
The definition of a vertex’ degree in a graph as the number quantitatively on all three discussed datasets to demonstrate
of incident edges to it is not perfect, as it does not take into as clearly as possible the impact of every loss term.
account the noisiness of the graph. On real datasets, we may
have stray clusters which lead to noisy edges in the density Experiment Lrec Lpush-pull Lcomp Lcos (weight)
graph. These usually manifest as edges with only one point A X
contributing to them in high dimension. This leads to ver- B X X
tices with an effective degree of 2 that have a higher degree C X X
due to these noisy edges, and are thus ignored by the cosine D X X X
loss. E X X X(50)
To remedy this, we introduce a different definition of de- F X X X X(50)
gree. We consider a threshold t ∈ [0, 100] and define the
degree of a vertex as the smallest number of incident edges Table S1: List of loss parameters for our ablations.
that account for t% of all points contributing to the vertex’s
incident edges. As t gets closer to a hundred, we converge As can be seen in figures S1,S2 and S3, the compactness
to the original definition of degree. loss alone is not sufficient to obtain a good representation
More formally put, consider a weighted graph since it has no repulsive force. The reconstruction loss helps
G = (V, E, W ) and a function Γ that returns incident to avoid a total collapse but is not sufficient to prevent a
edges to a given vertex sorted by their weights. This partial collapse, as visible in the endocrine pancreas and the
alternative definition of a vertex’s degree is then: T-cell datasets. While the push-pull loss already gives good

13
Type of metric Local Global Voronoi
Euclidean Geodesic
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
λpp = 0, λcomp = 0, λcos =0 34.62 19.09 0.66 0.66 0.56 0.56 70.01 38.98
λpp = 0, λcomp = 1, λcos =0 38.54 23.20 0.80 0.79 0.74 0.73 70.44 35.24
λpp = 1, λcomp = 0, λcos =0 48.19 25.05 0.78 0.76 0.72 0.70 79.68 56.02
λpp = 1, λcomp = 1, λcos =0 48.70 26.40 0.81 0.78 0.74 0.72 79.26 55.93
λpp = 1, λcomp = 0, λcos = 50 45.54 22.71 0.80 0.77 0.74 0.72 78.94 52.97
λpp = 1, λcomp = 1, λcos = 50 46.00 24.60 0.81 0.80 0.75 0.74 78.85 53.77
(a) PHATE generated dataset.
Euclidean Geodesic
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
λpp = 0, λcomp = 0, λcos =0 34.52 4.17 0.81 0.85 0.57 0.62 65.37 30.83
λpp = 0, λcomp = 1, λcos =0 24.88 2.32 0.65 0.69 0.53 0.59 46.30 11.11
λpp = 1, λcomp = 0, λcos =0 43.07 3.07 0.77 0.79 0.71 0.77 73.79 50.68
λpp = 1, λcomp = 1, λcos =0 44.69 2.93 0.73 0.79 0.66 0.75 72.95 46.75
λpp = 1, λcomp = 0, λcos = 50 35.64 2.77 0.73 0.75 0.70 0.75 68.44 38.82
λpp = 1, λcomp = 1, λcos = 50 39.79 2.85 0.71 0.74 0.71 0.78 69.24 38.04
(b) Endocrine pancreas dataset.
Euclidean Geodesic
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
λpp = 0, λcomp = 0, λcos =0 29.24 2.20 0.40 0.33 0.40 0.42 35.17 4.50
λpp = 0, λcomp = 1, λcos =0 40.65 1.24 0.15 0.17 0.20 0.20 28.75 2.75
λpp = 1, λcomp = 0, λcos =0 29.72 1.28 0.42 0.26 0.36 0.39 47.19 18.63
λpp = 1, λcomp = 1, λcos =0 45.55 1.15 0.38 0.23 0.35 0.38 37.71 16.29
λpp = 1, λcomp = 0, λcos = 50 29.24 1.23 0.37 0.24 0.42 0.44 44.73 12.16
λpp = 1, λcomp = 1, λcos = 50 37.25 1.15 0.31 0.19 0.40 0.41 38.41 12.85
(c) T-cells dataset.

Table S2: Quantitative results in different scenarios for DTAE’s loss weights.

Figure S1: Results of the ablations on the PHATE generated Figure S2: Results of the ablations on the T-cell dataset,
dataset, colored by groundtruth clusters. colored by phenotypes.

results when used alone, since the tree structure is visible, loss, however, this combination can lead to sparse repre-
adding the compactness loss yields embeddings in which sentation due to the fact that seeds of second order Voronoı̈
the points lie compactly along the tree. Without the cosine cells do not necessarily lie in their cell. This means that

14
Figure S3: Results of the ablations on the endocrine pan-
creas dataset, colored by cell types. Figure S4: Results obtained on the chronic infection subset
of the T-cell dataset when varying the cosine loss weight,
colored by phenotypes.
Rel. Perf.
λpp = 0, λcomp = 0, λcos =0 81.27 C.2 Cosine loss weight
λpp = 0, λcomp = 1, λcos =0 67.60
λpp = 1, λcomp = 0, λcos =0 92.04 A parameter that is interesting to study in more detail is the
λpp = 1, λcomp = 1, λcos =0 90.66 cosine loss weight. While a lot of the other losses have a sig-
λpp = 1, λcomp = 0, λcos = 50 87.24 nificant impact on the embeddings, the cosine loss is mostly
λpp = 1, λcomp = 1, λcos = 50 86.99 cosmetic, and it is important to understand its behavior for
low and high weights. The cosine loss weight will only be
Table S3: Relative performance out of a hundred over all studied on the T-cell dataset, since it is enough to demon-
datasets and metrics. strate its impact on quantitative and qualitative results.
As can be seen in figure S4 the cosine loss straightens
the branches for every weight as intended. However, with
higher weights, it also has a density regularizing effect. As
points will not necessarily be spread out along the line be- its weight increases, we obtain a more homogeneous and
tween two centroids but only lie inside the intersection of less clumped point cloud. While there is no clear explana-
the line between the two centroids and their second order tion for this behavior, a hypothesis is that the higher weight
Voronoı̈ cell, which may be much smaller than the full line means that this criterion will be optimized with higher pri-
between the centroids. Using only the push-pull and co- ority during the finetuning. Since the pretraining produces
sine loss can lead to satisfying results, but the embedding dense embedding and this cosine loss has no incentive to
is more spread out than with the compactness loss. Adding produce sparse embeddings, this denser structure is kept
the cosine loss makes all the results cleaner and helps with during training. On the contrary, the push-pull loss can have
the density of the point cloud. This effect is discussed in the a sparsifying effect, since the seeds of second order Voronoı̈
next section. cells do not necessarily lie in their cells. When the cosine
loss weight is smaller, this loss is optimized with higher pri-
From a quantitative point of view, adding all of these ority, which would lead to the sparser embeddings. All of
losses leads to worse performances than just using the push- this is intimately linked to the dynamics of neural network
pull loss alone. Since the compactness and cosine losses training and not only to minimizers of each criterion, mak-
are designed with visualization in mind, they can alter the ing a precise study of this process highly complex.
fidelity of the embedding. For example, making the points From a quantitative point of view, a slight decrease in
tighter along the density tree will lead to pairwise distances performance is visible in table S4 for all metrics except
that are preserved more poorly, which is an effect that we for the preservation of geodesic distances or of first order
indeed observe in the global metrics in table S2. Voronoı̈ diagrams. As a result, the overall performance de-
Nonetheless, when looking at aggregated performances in creases noticeably when increasing the cosine loss weight,
table S3 we can see that all experiments except when us- see the rightmost column in table S4.
ing the compactness loss alone still perform comparatively. This again illustrates the trade-offs between quantitative
As such, the increase in qualitative performance stemming and qualitative performance, where even though a method
from the addition of losses is not done at the expense of the performs slightly worse quantitatively, it might still produce
preservation of the data’s intrinsic structure. In particular, results that are easier to interpret for humans.
the push-pull loss alone drastically improves the visualiza-
tion not only qualitatively, but also quantitatively.

15
Type of metric Local Global Voronoi
Euclidean Geodesic All
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
λcos =1 45.83 1.15 0.37 0.24 0.38 0.39 37.30 16.36 95.68
λcos =2 44.77 1.11 0.35 0.24 0.40 0.40 37.90 16.39 95.34
λcos =5 45.22 1.09 0.37 0.20 0.40 0.45 37.38 14.12 93.30
λcos = 10 44.29 1.12 0.35 0.18 0.42 0.46 36.40 13.66 91.83
λcos = 15 43.50 1.14 0.29 0.15 0.44 0.46 38.78 14.59 90.28
λcos = 20 43.08 1.17 0.33 0.17 0.42 0.46 37.43 14.41 91.74
λcos = 50 37.25 1.15 0.31 0.19 0.40 0.41 38.41 12.85 87.50

Table S4: Quantitative results on the T-cells dataset when varying the cosine loss weight. The weights for the push-pull
and compactness losses are set to one. The rightmost column contains the average performance over all metrics for a given
method.

16
D High resolution results

Figure S5: Results obtained on the endocrine pancreatic cell dataset, colored by cell types.

Figure S6: Results obtained on the chronic infection subset of the T-cell dataset, colored by phenotypes.

17
E Complete quantitative results

Type of metric Local Global Voronoi

Euclidean Geodesic
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
DTAE (Ours) 46.00 24.60 0.81 0.80 0.75 0.74 78.85 53.77
AE 34.67 19.18 0.63 0.64 0.55 0.54 70.56 38.64
PHATE 51.33 60.44 0.50 0.46 0.54 0.52 71.12 30.40
UMAP 55.13 67.92 0.53 0.48 0.54 0.51 75.18 46.19
SAUCIE 56.62 37.61 0.81 0.79 0.75 0.73 82.98 65.07
DCA 40.41 21.29 0.64 0.64 0.59 0.59 73.83 43.17
scVI 36.51 19.68 0.69 0.68 0.67 0.67 69.94 36.98
Force Atlas 2 51.85 64.38 0.59 0.55 0.56 0.54 75.79 46.44
PCA 39.87 22.36 0.77 0.74 0.67 0.66 73.14 42.53
(a) PHATE generated dataset.
Euclidean Geodesic
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
DTAE (Ours) 39.79 2.85 0.71 0.74 0.71 0.78 69.24 38.04
AE 34.12 4.24 0.81 0.84 0.58 0.62 65.12 30.53
PHATE 30.92 3.70 0.64 0.65 0.71 0.78 57.22 22.27
UMAP 30.41 4.67 0.57 0.58 0.79 0.82 57.79 21.21
SAUCIE 38.94 3.69 0.81 0.81 0.71 0.73 69.46 37.93
DCA 30.93 5.01 0.41 0.78 0.37 0.63 64.29 27.98
scVI 32.99 3.80 0.67 0.68 0.65 0.67 61.29 27.52
Force Atlas 2 25.11 3.96 0.28 0.62 0.21 0.74 46.24 13.08
PCA 21.84 1.94 0.69 0.67 0.87 0.87 48.65 15.50
(b) Endocrine pancreas dataset.
Euclidean Geodesic
Metric ARI k-NN 1st order 2nd order
Pearson Spearman Pearson Spearman
DTAE (Ours) 37.25 1.15 0.31 0.19 0.40 0.41 38.41 12.85
AE 28.87 2.17 0.38 0.32 0.43 0.43 34.84 4.58
PHATE 32.00 1.25 -0.02 0.02 0.42 0.43 33.70 3.45
UMAP 23.41 1.52 0.11 0.21 0.46 0.44 29.24 5.28
SAUCIE 26.86 1.59 0.21 0.25 0.43 0.42 34.30 4.63
DCA 0.10 1.34 0.45 0.62 0.00 0.26 3.17 1.04
scVI 28.69 1.26 0.43 0.23 0.38 0.46 33.31 5.67
Force Atlas 2 23.83 0.93 0.02 0.01 0.05 0.41 28.39 3.09
PCA 20.82 1.10 0.18 0.16 0.61 0.57 32.30 8.27
(c) T-cells dataset.

Table S5: Full Quantitative results on all studied datasets. Metrics are described in section 4.4 and higher values indicate
better performance.

Max A. Little Machine Learning For Signal Processing Data Science Algorithms and Computational Statistics Oxford University Press USA 2019
100% (1)
Max A. Little Machine Learning For Signal Processing Data Science Algorithms and Computational Statistics Oxford University Press USA 2019
378 pages
T-Visne: Interactive Assessment and Interpretation of T-Sne Projections
No ratings yet
T-Visne: Interactive Assessment and Interpretation of T-Sne Projections
18 pages
Evaluating_Representation_Learning_and_Graph_Layout_Methods_for_Visualization
No ratings yet
Evaluating_Representation_Learning_and_Graph_Layout_Methods_for_Visualization
10 pages
scRNAseq_clustering_Asa_Bjorklund_2021
No ratings yet
scRNAseq_clustering_Asa_Bjorklund_2021
53 pages
Visualizing Data Using t-SNE: Laurens Van Der Maaten
No ratings yet
Visualizing Data Using t-SNE: Laurens Van Der Maaten
27 pages
Graph-Based Clustering and Data Visualization Algorithms
No ratings yet
Graph-Based Clustering and Data Visualization Algorithms
1 page
vandermaaten14a
No ratings yet
vandermaaten14a
25 pages
1 s2.0 S1566253523002257 Main
No ratings yet
1 s2.0 S1566253523002257 Main
44 pages
Nonlinear Dimensionality Reduction Techniques: A Data Structure Preservation Approach Lespinats
No ratings yet
Nonlinear Dimensionality Reduction Techniques: A Data Structure Preservation Approach Lespinats
79 pages
Biochem Molecular Bio Educ - 2023 - Garma - Demystifying dimensionality reduction techniques in the omics era A
No ratings yet
Biochem Molecular Bio Educ - 2023 - Garma - Demystifying dimensionality reduction techniques in the omics era A
14 pages
Dyngem: Deep Embedding Method For Dynamic Graphs
No ratings yet
Dyngem: Deep Embedding Method For Dynamic Graphs
8 pages
NIHMS1510930 Supplement Supplementary Materials
No ratings yet
NIHMS1510930 Supplement Supplementary Materials
14 pages
Representation Learning On Graphs: Methods and Applications
No ratings yet
Representation Learning On Graphs: Methods and Applications
23 pages
A Survey of Graph Neural Networks in Various Learning Paradigms Methods, Applications, and Challenges
No ratings yet
A Survey of Graph Neural Networks in Various Learning Paradigms Methods, Applications, and Challenges
70 pages
Science 2006 Cottrell 454 5
No ratings yet
Science 2006 Cottrell 454 5
3 pages
The Graph Neural Network Model
No ratings yet
The Graph Neural Network Model
20 pages
Week 8 DS Practical (1)
No ratings yet
Week 8 DS Practical (1)
13 pages
Disease Prediction Using Graph Convolutional Networks: Application To Autism Spectrum Disorder and Alzheimer's Disease
No ratings yet
Disease Prediction Using Graph Convolutional Networks: Application To Autism Spectrum Disorder and Alzheimer's Disease
21 pages
Bbac 625
No ratings yet
Bbac 625
12 pages
Unsuper
No ratings yet
Unsuper
15 pages
scRNA-seq Analysis JY Chen 09-13-2019 Share
No ratings yet
scRNA-seq Analysis JY Chen 09-13-2019 Share
27 pages
GNN-Foundations-Frontiers-and-Applications-chapter2
No ratings yet
GNN-Foundations-Frontiers-and-Applications-chapter2
10 pages
T Sne Implementation R Python
No ratings yet
T Sne Implementation R Python
19 pages
stuff-in-the-news-02
No ratings yet
stuff-in-the-news-02
12 pages
Nonlinear Dimensionality Reduction Techniques: A Data Structure Preservation Approach Lespinats pdf download
No ratings yet
Nonlinear Dimensionality Reduction Techniques: A Data Structure Preservation Approach Lespinats pdf download
47 pages
Nonlinear Dimensionality Reduction Techniques: A Data Structure Preservation Approach Lespinats download
100% (2)
Nonlinear Dimensionality Reduction Techniques: A Data Structure Preservation Approach Lespinats download
80 pages
2019-Adversarially Regularized Graph Autoencoder For Graph Embedding
No ratings yet
2019-Adversarially Regularized Graph Autoencoder For Graph Embedding
7 pages
IN4089 - Lecture 05 - Graphs and Dimensionality Reduction-Pdfjam
No ratings yet
IN4089 - Lecture 05 - Graphs and Dimensionality Reduction-Pdfjam
13 pages
Deformable Structural Models Steven Bergner Gruvi Lab Simon Fraser University Stephan Al-Zubi, Klaus Tönnies Department of Simulation and Graphics Otto-Von-Guericke University Magdeburg, Germany
No ratings yet
Deformable Structural Models Steven Bergner Gruvi Lab Simon Fraser University Stephan Al-Zubi, Klaus Tönnies Department of Simulation and Graphics Otto-Von-Guericke University Magdeburg, Germany
4 pages
A Survey on Network Embedding
No ratings yet
A Survey on Network Embedding
21 pages
A Comprehensive Survey On Deep Graph Representation Learning
No ratings yet
A Comprehensive Survey On Deep Graph Representation Learning
85 pages
Scanpy For Analysis of Large-Scale Single-Cell Gene Expression Data
No ratings yet
Scanpy For Analysis of Large-Scale Single-Cell Gene Expression Data
7 pages
Unsupervised Deep Learning for Structured Shape Matching
No ratings yet
Unsupervised Deep Learning for Structured Shape Matching
14 pages
Asap Poster Sibdays
No ratings yet
Asap Poster Sibdays
1 page
The art of using t-SNE for single cell transcriptomics
No ratings yet
The art of using t-SNE for single cell transcriptomics
14 pages
Yang 20 A
No ratings yet
Yang 20 A
16 pages
GNNS
No ratings yet
GNNS
7 pages
Learning Context-Aware Distributed Gene Representa
No ratings yet
Learning Context-Aware Distributed Gene Representa
12 pages
Mixing Autoencoder With Classifier: Conceptual Data Visualization
No ratings yet
Mixing Autoencoder With Classifier: Conceptual Data Visualization
6 pages
Tree of Latent Mixtures For Bayesian Modelling and Classification of High Dimensional Data
No ratings yet
Tree of Latent Mixtures For Bayesian Modelling and Classification of High Dimensional Data
8 pages
Lecture Dimensionality Reduction
No ratings yet
Lecture Dimensionality Reduction
34 pages
Author's Accepted Manuscript: Pattern Recognition
No ratings yet
Author's Accepted Manuscript: Pattern Recognition
41 pages
Learning Methods
No ratings yet
Learning Methods
70 pages
Bbab 340
No ratings yet
Bbab 340
16 pages
4 Clustering
No ratings yet
4 Clustering
21 pages
A Comprehensive Survey On Graph Neural Networks
No ratings yet
A Comprehensive Survey On Graph Neural Networks
22 pages
Gcn
No ratings yet
Gcn
23 pages
Level Set Trees For Applied Statistics
100% (1)
Level Set Trees For Applied Statistics
124 pages
A_Survey_on_Network_Embedding
No ratings yet
A_Survey_on_Network_Embedding
20 pages
Machine Learning On Graphs: A Model and Comprehensive Taxonomy
No ratings yet
Machine Learning On Graphs: A Model and Comprehensive Taxonomy
49 pages
GNN in Histopathology
No ratings yet
GNN in Histopathology
22 pages
Original GNN
No ratings yet
Original GNN
22 pages
Coarse-to-Fine - A Hierarchical Diffusion Model For Molecule Generation in 3D
No ratings yet
Coarse-to-Fine - A Hierarchical Diffusion Model For Molecule Generation in 3D
23 pages
1 s2.0 S0031320317303497 Main
No ratings yet
1 s2.0 S0031320317303497 Main
14 pages
Structured Class-Labels in Random Forests For Semantic Image Labelling
No ratings yet
Structured Class-Labels in Random Forests For Semantic Image Labelling
8 pages
Tucker Et Al 2012 Bioinformatics Tools in Predictive Ecology Applications To Fisheries
No ratings yet
Tucker Et Al 2012 Bioinformatics Tools in Predictive Ecology Applications To Fisheries
12 pages
Mapper DR
No ratings yet
Mapper DR
10 pages
1-s2.0-S0950705121002240-main-1
No ratings yet
1-s2.0-S0950705121002240-main-1
14 pages
MasterThesisDoc
No ratings yet
MasterThesisDoc
55 pages
UNIT 2-2
No ratings yet
UNIT 2-2
33 pages
Introduction to Bioinformatics, Sequence and Genome Analysis
From Everand
Introduction to Bioinformatics, Sequence and Genome Analysis
Jerry H. Swift
No ratings yet
Applied Computing and Informatics: Harleen Kaur, Vinita Kumari
No ratings yet
Applied Computing and Informatics: Harleen Kaur, Vinita Kumari
6 pages
Mini Project
No ratings yet
Mini Project
19 pages
A Novel Approach On Tamil Text Classification Using C Final Modified For Uploading
No ratings yet
A Novel Approach On Tamil Text Classification Using C Final Modified For Uploading
6 pages
Prabhat Report11
No ratings yet
Prabhat Report11
70 pages
Scikit - Notes ML
100% (2)
Scikit - Notes ML
12 pages
Download Big Data, IoT, and Machine Learning: Tools and Applications (Internet of Everything (IoE)) 1st Edition Rashmi Agrawal (Editor) ebook All Chapters PDF
100% (1)
Download Big Data, IoT, and Machine Learning: Tools and Applications (Internet of Everything (IoE)) 1st Edition Rashmi Agrawal (Editor) ebook All Chapters PDF
62 pages
Brain-Based Computer Interfaces in Virtual Reality
No ratings yet
Brain-Based Computer Interfaces in Virtual Reality
6 pages
ML Assignment (22BCE8086) 2
No ratings yet
ML Assignment (22BCE8086) 2
19 pages
Co Po Format (Machine Learning, It-7th Sem)
No ratings yet
Co Po Format (Machine Learning, It-7th Sem)
9 pages
Data Science Ai
No ratings yet
Data Science Ai
27 pages
PCA
100% (1)
PCA
33 pages
Machine Learning-1
100% (1)
Machine Learning-1
9 pages
DM_merged
No ratings yet
DM_merged
169 pages
Autoencoders and Their Applications in Machine Learning
No ratings yet
Autoencoders and Their Applications in Machine Learning
52 pages
Rotating Machinery and Signal Processing.
No ratings yet
Rotating Machinery and Signal Processing.
142 pages
Machine Learning for Physics and Astronomy 10th Edition Viviana Acquaviva - Download the ebook and start exploring right away
100% (2)
Machine Learning for Physics and Astronomy 10th Edition Viviana Acquaviva - Download the ebook and start exploring right away
58 pages
PCA and LDA Assignment
No ratings yet
PCA and LDA Assignment
5 pages
20 Questions On Feature Engineering and Eda
No ratings yet
20 Questions On Feature Engineering and Eda
9 pages
Machine Learning: III B. Tech I Semester Regular/Supplementary Examinations, December - 2023
No ratings yet
Machine Learning: III B. Tech I Semester Regular/Supplementary Examinations, December - 2023
8 pages
Machine Learning (ML) Solved MCQs [Set-2] McqMate.com
No ratings yet
Machine Learning (ML) Solved MCQs [Set-2] McqMate.com
6 pages
A Step-By-Step Explanation of Principal Component Analysis (PCA) - Built in
No ratings yet
A Step-By-Step Explanation of Principal Component Analysis (PCA) - Built in
8 pages
CLASS 10 AI Chapter 2
No ratings yet
CLASS 10 AI Chapter 2
18 pages
Data Mining Disease Diagnosis Presentation
No ratings yet
Data Mining Disease Diagnosis Presentation
35 pages
Advances in Knowledge Discovery and Data Mining Part II 14th edition by Mohammed Zaki , Jeffrey Xu Yu, Ravindran, Vikram Pudi ISBN 3642136710 978-3642136719 - The full ebook with complete content is ready for download
100% (7)
Advances in Knowledge Discovery and Data Mining Part II 14th edition by Mohammed Zaki , Jeffrey Xu Yu, Ravindran, Vikram Pudi ISBN 3642136710 978-3642136719 - The full ebook with complete content is ready for download
86 pages
Chapter 3
No ratings yet
Chapter 3
31 pages
U20cs604 Machine Learning Unit III
No ratings yet
U20cs604 Machine Learning Unit III
23 pages
Assignment 1 - LP1
No ratings yet
Assignment 1 - LP1
14 pages
Presentation 33360 Content Document 20250319044717PM
No ratings yet
Presentation 33360 Content Document 20250319044717PM
126 pages
HyperSpecTral Image Classification
No ratings yet
HyperSpecTral Image Classification
17 pages

Visualizing Hierarchies in scRNA-seq Data Using A Density Tree-Biased Autoencoder

Uploaded by

Visualizing Hierarchies in scRNA-seq Data Using A Density Tree-Biased Autoencoder

Uploaded by

Visualizing hierarchies in scRNA-seq data using a density tree-biased

Abstract tion. In particular, this permits studying the cell develop-

Single-cell RNA sequencing (scRNA-seq) data allows ana-

viewed for instance in Saelens et al. (2019). We therefore to MONOCLE 2.

3.2.2 Push-Pull Loss

5.2 Neural network limitations

Figure 6: Failure case: Highly clustered data violates our 6 Conclusion

Jacomy, M. et al (2014). Forceatlas2, a continuous graph layout algorithm for handy

Kingma, D.P. and Welling, M. (2013). Auto-encoding variational bayes. arXiv

Lin, E. et al (2020). A deep adversarial variational autoencoder model for dimension-

Lopez, R. et al (2018). Deep generative modeling for single-cell transcriptomics.

Luo, Z. et al (2021). scgae: topology-preserving dimensionality reduction for single-

Martinetz, T. and Schulten, K. (1994). Topology representing networks. Neural

McInnes, L. et al (2020). UMAP: Uniform Manifold Approximation and Projection

Moon, K.R. et al (2019). Visualizing structure and transitions in high-dimensional

Moor, M. et al (2020). Topological Autoencoders. arXiv:1906.00722 [cs, math, stat].

Paszke, A. et al (2019). PyTorch: An Imperative Style, High-Performance Deep

Perret, B. et al (2019). Higra: Hierarchical Graph Analysis. SoftwareX, 10, 100335.

Qiu, P. et al (2011). Extracting a cellular hierarchy from high-dimensional cytometry

Qiu, X. et al (2017). Reversed graph embedding resolves complex single-cell trajec-

Rumelhart, D.E. et al (1985). Learning internal representations by error propagation.

Saelens, W. et al (2019). A comparison of single-cell trajectory inference methods.

Type of metric Local Global Voronoi

You might also like