Yusu Wang
Halıcıoğlu Data Science Institute
University of California, San Diego
La Jolla, California, USA 92093
ii Computational Topology for Data Analysis
This material has been / will be published by Cambridge University Press as Computational
Topology for Data Analysis by Tamal Dey and Yusu Wang. This pre-publication version is free
to view and download for personal use only. Not for re-distribution, re-sale, or use in derivative
works.
Contents
1 Basics  3
1.1 Topological space  3
1.2 Metric space topology  6
1.3 Maps, homeomorphisms, and homotopies  9
1.4 Manifolds  13
1.4.1 Smooth manifolds  15
1.5 Functions on smooth manifolds  16
1.5.1 Gradients and critical points  16
1.5.2 Morse functions and Morse Lemma  18
1.5.3 Connection to topology  19
1.6 Notes and Exercises  21
3 Topological Persistence  51
3.1 Filtrations and persistence  52
3.1.1 Space filtration  52
3.1.2 Simplicial filtrations and persistence  54
3.2 Persistence  58
3.2.1 Persistence diagram  59
3.3 Persistence algorithm  65
3.3.1 Matrix reduction algorithm  68
3.3.2 Efficient implementation  72
3.4 Persistence modules  75
3.5 Persistence for PL-functions  79
3.5.1 PL-functions and critical points  80
3.5.2 Lower star filtration and its persistent homology  84
3.5.3 Persistence algorithm for 0-th persistent homology  86
3.6 Notes and Exercises  89
4 General Persistence  93
4.1 Stability of towers  94
4.2 Computing persistence of simplicial towers  97
4.2.1 Annotations  97
4.2.2 Algorithm  98
4.2.3 Elementary inclusion  99
4.2.4 Elementary collapse  100
4.3 Persistence for zigzag filtration  103
4.3.1 Approach  106
4.3.2 Zigzag persistence algorithm  108
4.4 Persistence for zigzag towers  110
4.5 Levelset zigzag persistence  114
4.5.1 Simplicial levelset zigzag filtration  115
4.5.2 Barcode for levelset zigzag filtration  116
4.5.3 Correspondence to sublevel set persistence  117
4.5.4 Correspondence to extended persistence  118
4.6 Notes and Exercises  119
Preface

In recent years, the area of topological data analysis (TDA) has emerged as a viable tool for analyzing data in applied areas of science and engineering. The area started in the 1990s, when computational geometers took an interest in studying the algorithmic aspects of the classical subject of algebraic topology. Computational geometry flourished in the '80s and '90s by addressing various practical problems and enriching the area of discrete geometry in the process. A handful of computational geometers felt that, analogous to this development, computational topology had the potential to address shape and data analysis while drawing upon, and perhaps further developing, topology in the discrete context; see e.g. [26, 117, 120, 188, 292]. The area gained momentum with the introduction of persistent homology in the early 2000s, followed by a series of mathematical and algorithmic developments on the topic. The book by Edelsbrunner and Harer [149] presents these fundamental developments quite nicely. Since then, the area has grown both in its methodology and in its applicability. One consequence of this growth has been the development of various algorithms intertwined with the discovery of various mathematical structures in the context of processing data. The purpose of this book is to capture these algorithmic developments together with the associated mathematical guarantees. It is appropriate to mention that there is an emerging sub-area of TDA which centers more around statistical aspects. This book does not deal with those developments, though we mention some of them in the last chapter, where we describe recent results connecting TDA and machine learning.
We have 13 chapters in the book, listed in the table of contents. After developing the basics of topological spaces, simplicial complexes, homology groups, and persistent homology in the first three chapters, the book is devoted to presenting algorithms and associated mathematical structures in various contexts of topological data analysis. These chapters present material mostly not covered in any book on the market. To elaborate on this claim, we briefly give an overview of the topics covered by the present book. The fourth chapter presents generalizations of the persistence algorithm to extended settings, such as simplicial maps (instead of inclusions) and zigzag sequences with both inclusions and simplicial maps. Chapter 5 covers algorithms for computing optimal generators for both persistent and non-persistent homology. Chapter 6 focuses on algorithms that infer homological information from point cloud data. Chapter 7 presents algorithms and structural results for Reeb graphs. Chapter 8 considers general graphs, including directed ones. Chapter 9 focuses on various recent results on characterizing nerves of covers, including the well-known Mapper and its multiscale version. Chapter 10 is devoted to the important concept of discrete Morse theory, its connection to persistent homology, and its applications to graph reconstruction. Chapters 11 and 12 introduce multiparameter persistence. Standard persistence is defined over a 1-parameter index set such as Z or R. Extending this index set to a poset such as Z^d or R^d, we get d-parameter or multiparameter persistence. Chapter 11 focuses on computing indecomposables for multiparameter persistence, which generalize bars in the 1-parameter case. Chapter 12 focuses on various definitions of distances among multiparameter persistence modules and their computation. Finally, we conclude with Chapter 13, which presents some recent developments incorporating persistence into the machine learning (ML) framework.
This book is intended for an audience of researchers and teachers in computer science and mathematics. Graduate students in both fields will benefit from learning the new material in topological data analysis. Because of its topics, the book serves as a bridge between mathematics and computer science. Students in computer science will learn mathematics in topology that they are usually not familiar with. Similarly, students in mathematics will learn about designing algorithms based on mathematical structures. The book can be used for a graduate course in topological data analysis. In particular, it can be part of a curriculum in data science, which has been or is being adopted in many universities. We include exercises in each chapter to facilitate teaching and learning.
There are currently a few books on computational topology and topological data analysis on the market, to which our book is complementary. The material covered in this book is predominantly new and has not been covered in any of the previous books. The book by Edelsbrunner and Harer [149] mainly focuses on early developments in persistent homology and does not cover the material in Chapters 4 to 13 of this book. The recent book of Boissonnat et al. [39] focuses mainly on reconstruction, inference, and Delaunay meshes. Other than Chapter 6, which focuses on point cloud data and inference of topological properties, and Chapters 1-3, which focus on preliminaries about topological persistence, there is hardly any overlap. The book by Oudot [249] mainly focuses on algebraic structures of persistence modules and inference results. Again, other than the preliminary Chapters 1-3 and Chapter 6, there is hardly any overlap. Finally, unlike ours, the books by Tierny [286] and by Rabadán and Blumberg [260] mainly focus on applying TDA to specific domains, scientific visualization and genomics respectively.
This book, like any other, was not created in isolation. Help from various corners contributed to its creation. It was seeded by the class notes that we developed for our introductory course on Computational Topology and Data Analysis, which we taught at the Ohio State University. During this teaching, feedback from students gave us the hint that a book covering the increasingly diversified repertoire of topological data analysis was necessary at this point. We thank all those students who had to bear with the initial disarray that was part of freshly gathering coherent material on a new subject. This book would not have been possible without our own involvement with TDA, which was mostly supported by grants from the National Science Foundation (NSF). Many of our PhD students worked on these projects, which helped us consolidate our focus on TDA. In particular, Tao Hou, Ryan Slechta, Cheng Xin, and Soham Mukherjee gave their comments on drafts of some of the chapters. We thank all of them. We thank everyone from the TGDA@OSU group for creating one of the best environments for carrying out research in applied and computational topology. Our special thanks go to Facundo Mémoli, who has been a great colleague (collaborating with us on several topics) as well as a wonderful friend at OSU. We also acknowledge the support of the department of CSE at the Ohio State University, where a large amount of the contents of this book was planned and written. The finishing came to fruition after we moved to our current institutions.
Finally, it is our pleasure to acknowledge the support of our families, who kept us motivated and engaged throughout the marathon of writing this book, especially during the last stretch overlapping the 2020-2021 Coronavirus pandemic. Tamal recalls his daughter Soumi and son Sounak continuously asking him about the progress of the book. His wife Kajari extended all the help necessary to make space for the extra time needed for the book. Despite suffering from reduced attention to family matters, all of them graciously offered their unwavering support and understanding. Tamal dedicates this book to his family and to his late parents Gopal Dey and Hasi Dey, without whose encouragement and love he would not have been in a position to take up this project. Yusu thanks her husband Mikhail Belkin for his never-ending support and encouragement throughout writing this book and beyond. Their two children, Alexander and Julia, contributed in their typical ways by making every day delightful and unpredictable for her. Without their support and love, she would not have been able to finish this book. Finally, Yusu dedicates this book to her parents Qingfen Wang and Jinlong Huang, who always gave her space to grow and encouraged her to do her best in life, as well as to her great aunt Zhige Zhao and great uncle Humin Wang, who kindly took her under their care when she was 13. She can never repay their kindness.
Prelude
We make sense of the world around us primarily by understanding and studying the "shape" of the objects that we encounter, in real life or in a digital environment. Geometry offers a common language that we usually use to model and describe shapes. For example, familiar descriptors from this language, such as distances, coordinates, and angles, help us provide detailed information about a shape of interest. Not surprisingly, humankind has used geometry for thousands of years to describe objects in its surroundings.
However, there are many situations where detailed geometric information is not needed and may even obscure the real, useful structure that is not so explicit. A notable example is the Seven Bridges of Königsberg problem: in the city of Königsberg, the river Pregel separated the city into four regions, connected by seven bridges as shown in Figure 1. The question is to find a walk through the city that crosses each bridge exactly once. The story goes that the mathematician Leonhard Euler observed that factors such as the precise shape of these regions and the exact path taken are not important. What is important is the connectivity among the different regions of the city as connected by the bridges. In particular, the problem can be modeled abstractly using a graph with four nodes, representing the four regions of the city of Königsberg, and seven edges representing the bridges connecting them. The problem then reduces to what is now known as finding an Euler tour (or Eulerian cycle) in this graph, which can be easily solved.

Figure 1: "Map of Königsberg in Euler's time showing the actual layout of the seven bridges, highlighting the river Pregel and the bridges" by Bogdan Giuşcă, taken from the Wikipedia page for the Seven Bridges of Königsberg, licensed under CC BY-SA 3.0.
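Euler's observation can be checked mechanically: a connected multigraph admits a walk crossing every edge exactly once only if at most two of its vertices have odd degree. A small sketch in Python (the region labels A-D are our own, not from the original map):

```python
from collections import Counter

# The four land regions as nodes A-D (labels are ours), and the
# seven bridges of Koenigsberg as edges of a multigraph.
bridges = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

degree = Counter()
for u, v in bridges:
    degree[u] += 1
    degree[v] += 1

# Euler's criterion: a walk crossing every edge exactly once exists
# only if at most two vertices have odd degree.
odd = [v for v, d in degree.items() if d % 2 == 1]
print(dict(degree))  # every region touches an odd number of bridges
print(len(odd))      # 4 odd-degree vertices, so no such walk exists
```

All four regions touch an odd number of bridges, so the desired walk cannot exist, which is exactly Euler's negative answer.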
For another example, consider animation in computer graphics, where one wants to develop software that can continuously deform one object into another (in the sense that one can stretch and change the shape, but cannot break it or add to it). Can we continuously deform a frog into a prince this way (yes, according to Disney movies)? Is it possible to continuously deform a tea cup into a bunny? It turns out the latter is not possible.

In these examples, the core structure of interest behind the input object or space is characterized by the way the space is connected, and the detailed geometric information may not matter. In general, topology intuitively models and studies properties that remain invariant as long as the connectivity of the space does not change. As a result, topological language and concepts can provide
powerful tools to characterize, identify, and process essential features of both spaces and functions defined on them. However, to bring topological methods to the realm of practical applications, we need not only new ideas to make topological concepts and the resulting structures more suitable for modern data analysis tasks, but also algorithms to compute these structures efficiently. In the past two decades, the field of applied and computational topology has developed rapidly, producing many fundamental results and algorithms that have advanced both fronts. This progress has further fueled the significant growth of topological data analysis (TDA), which has already found applications in various domains such as computer graphics, visualization, materials science, computational biology, and neuroscience.
Figure 2: Examples of the use of topological ideas in data analysis. (a) A persistence-based clustering strategy: the persistence diagram of a density field estimated from an input noisy point cloud (shown in the top row) is used to help group points into clusters (bottom row). Reprinted by permission from Springer Nature: Discrete & Computational Geometry, "Analysis of scalar fields over point cloud data", Frédéric Chazal et al. [87], © 2011. (b) Using persistence diagram summaries to represent and cluster neuron cells based on their tree morphology; image taken from [206], licensed by Kanari et al. (2018) under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). (c) Using an optimal persistent 1-cycle corresponding to a bar (red) in the persistence barcode, defects in diseased eyes are localized; image taken from [128]. (d) Topological landscape (left) of a 3D volumetric Silicium data set; a volume rendering of the Silicium data set is on the right. Note that it is hard to see all the structures forming the lattice of the crystal in the volume rendering, while the topological landscape view shows clearly that most of them have high function values and are of similar sizes; image taken from [299], reprinted by permission from IEEE: Gunther Weber et al. (2007). (e) The Mapper structure behind a high-dimensional cell gene expression data set can show not only the clusters of different tumor or normal cells, but also their connections; image taken from [244], reprinted by permission from Monica Nicolau et al. (2011, fig. 3). (f) Using a discrete-Morse-based graph skeleton reconstruction algorithm to help reconstruct road networks from satellite images even with little labelled training data; image taken from [139].
1 Basics
a rigorous way to state that the domain can be deformed into a shape without ever colliding with
itself.
Perhaps it is more intuitive to understand the concept of topology in the presence of a metric, because then we can use metric balls, such as Euclidean balls in a Euclidean space, to define neighborhoods – the open sets. Topological spaces provide a way to abstract out this idea without a metric or point coordinates, so they are more general than metric spaces. In place of a metric, we encode the connectivity of a point set by supplying a list of all of the open sets. This list is called a system of subsets of the point set. The point set and its system together describe a topological space.
Definition 1.1 (Topological space). A topological space is a point set T endowed with a system of subsets T, which is a set of subsets of T that satisfies the following conditions.

• ∅, T ∈ T.
• The union of any collection of sets in T belongs to T.
• The intersection of any finite collection of sets in T belongs to T.
• T2 = {{(u, z), u}, {(v, z), v}, {(w, z), w}, {(u, z), (v, z), (w, z), z}}.

Then the collection T of all unions of elements in T1 ∪ T2, together with ∅, is a topology because it satisfies all three axioms. All open sets of T are generated by unions of elements in B = T1 ∪ T2, and there is no smaller set with this property. Such a set B is called a basis of T. We will see in the next chapter (Section 2.1) that these are open stars of all vertices and edges.
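For a finite system of subsets, the axioms of Definition 1.1 can be verified exhaustively. The sketch below is our own illustration (the three-point ground set is not one of the book's examples, and `is_topology` is a hypothetical helper); for a finite system, closure under pairwise unions and intersections implies closure under arbitrary ones:

```python
from itertools import combinations

def is_topology(ground, system):
    """Check Definition 1.1 for a finite system of subsets.
    `ground` is a set; `system` is a set of frozensets."""
    ground = frozenset(ground)
    # Axiom 1: the empty set and the whole set are in the system.
    if frozenset() not in system or ground not in system:
        return False
    # Axioms 2 and 3: closed under unions and intersections
    # (pairwise closure suffices for a finite system, by induction).
    for a, b in combinations(system, 2):
        if a | b not in system or a & b not in system:
            return False
    return True

T = {1, 2, 3}
opens = {frozenset(), frozenset({1}), frozenset({2, 3}), frozenset(T)}
print(is_topology(T, opens))                     # True
print(is_topology(T, opens | {frozenset({2})}))  # False: {1} ∪ {2} is missing
```

The second call fails because adding {2} forces the union {1, 2} to be open as well, which it is not in this system.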
Figure 1.1: Example 1.3: (a) a graph as a topological space, with stars of the vertices and edges as open sets; (b) a closed cover with three elements; (c) an open cover with four elements.
Definition 1.2 (Closure; Closed sets). A set Q is closed if its complement T \ Q is open. The
closure Cl Q of a set Q ⊆ T is the smallest closed set containing Q.
In Example 1.1, the set {3, 5, 7} is closed because its complement {0, 1} in T is open. The closure of the open set {0} is {0, 3, 7} because it is the smallest closed set (the complement of the open set {1, 5}) containing 0. In Example 1.2, all sets are both open and closed. In Example 1.3, the set {u, z, (u, z)} is closed, but the set {z, (u, z)} is neither open nor closed. Interestingly, observe that {z} is closed. The closure of the open set {u, (u, z)} is {u, z, (u, z)}. In all examples, the sets ∅ and T are both open and closed.
Definition 1.3. Given a topological space (T, T ), the interior Int A of a subset A ⊆ T is the union
of all open subsets of A. The boundary of A is Bd A = Cl A \ Int A.
The interior of the set {3, 5, 7} in Example 1.1 is {5} and its boundary is {3, 7}.
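In a finite topological space, Cl, Int, and Bd can be computed directly from the list of open sets, following Definitions 1.2 and 1.3. The tiny topology below is our own illustration, not Example 1.1:

```python
def closure(Q, ground, opens):
    """Smallest closed set containing Q (Definition 1.2): intersect
    all closed sets (complements of open sets) that contain Q."""
    closed_sets = [ground - U for U in opens]
    return frozenset.intersection(*[C for C in closed_sets if Q <= C])

def interior(Q, opens):
    """Union of all open subsets of Q (Definition 1.3)."""
    return frozenset().union(*(U for U in opens if U <= Q))

def boundary(Q, ground, opens):
    """Bd Q = Cl Q minus Int Q."""
    return closure(Q, ground, opens) - interior(Q, opens)

# A tiny illustrative topology on T = {1, 2, 3} (our own assumption).
T = frozenset({1, 2, 3})
opens = [frozenset(), frozenset({1}), frozenset({1, 2}), T]
Q = frozenset({1, 2})
print(sorted(closure(Q, T, opens)))   # [1, 2, 3]: only T is a closed superset
print(sorted(boundary(Q, T, opens)))  # [3]
```

Here the closed sets are T, {2, 3}, {3}, and ∅; the only one containing Q = {1, 2} is T itself, so Cl Q = T, Int Q = Q (Q is open), and Bd Q = {3}.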
Definition 1.4 (Subspace topology). For every point set U ⊆ T, the topology T induces a subspace
topology on U, namely the system of open subsets U = {P ∩ U : P ∈ T }. The point set U endowed
with the system U is said to be a topological subspace of T.
In Example 1.1, consider the subset U = {1, 5, 7}. It has the subspace topology
In Example 1.3, the subset U = {u, (u, z), (v, z)} has the subspace topology
{∅, {u, (u, z)}, {(u, z)}, {(v, z)}, {(u, z), (v, z)}, {u, (u, z), (v, z)}}.
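Computing the induced system {P ∩ U : P ∈ T} of Definition 1.4 is a one-line set comprehension. The three-point topology below is an illustrative assumption, not one of the book's examples:

```python
def subspace_topology(U, opens):
    """Definition 1.4: intersect every open set of T with U."""
    return {P & U for P in opens}

# Illustrative topology on {a, b, c} (our own, not from the book).
opens = {frozenset(), frozenset({"a"}), frozenset({"a", "b"}),
         frozenset({"a", "b", "c"})}
U = frozenset({"b", "c"})
induced = subspace_topology(U, opens)
print(induced == {frozenset(), frozenset({"b"}), frozenset({"b", "c"})})  # True
```

Note that {b} is open in the subspace even though it is not open in the ambient space, mirroring how relative openness differs from openness.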
Definition 1.5 (Connected). A topological space (T, T) is disconnected if there are two disjoint non-empty open sets U, V ∈ T so that T = U ∪ V. A topological space is connected if it is not disconnected.
The topological space in Example 1.1 is connected. However, the topological subspace (Def-
inition 1.4) induced by the subset {0, 1, 5} is disconnected because it can be obtained as the union
of two disjoint open sets {0, 1} and {5}. The topological space in Example 1.3 is also connected,
but the subspace induced by the subset {(u, z), (v, z), (w, z)} is disconnected.
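Definition 1.5 suggests a brute-force check for small finite spaces: search for two disjoint non-empty open sets whose union is the whole ground set. A sketch (both example topologies below are our own illustrations):

```python
from itertools import combinations

def is_connected(ground, opens):
    """Definition 1.5: disconnected iff two disjoint non-empty
    open sets cover the whole space."""
    nonempty = [U for U in opens if U]
    return not any(not (A & B) and (A | B) == ground
                   for A, B in combinations(nonempty, 2))

ground = frozenset({1, 2, 3})
opens_conn = [frozenset(), frozenset({1}), frozenset({1, 2}), ground]
opens_disc = [frozenset(), frozenset({1}), frozenset({2, 3}), ground]
print(is_connected(ground, opens_conn))  # True
print(is_connected(ground, opens_disc))  # False: {1} and {2, 3} split it
```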
Definition 1.6 (Cover; Compact). An open (closed) cover of a topological space (T, T) is a collection C of open (closed) sets so that T = ⋃_{c∈C} c. The topological space (T, T) is called compact if every open cover C of it has a finite subcover, that is, there exists a finite C′ ⊆ C such that T = ⋃_{c∈C′} c.

In Figure 1.1(b), the cover {{u, z, (u, z)}, {v, z, (v, z)}, {w, z, (w, z)}} is a closed cover, whereas the cover {{u, (u, z)}, {v, (v, z)}, {w, (w, z)}, {z, (u, z), (v, z), (w, z)}} in Figure 1.1(c) is an open cover. Any topological space with a finite point set T is compact because all of its covers are finite. Thus, all topological spaces in the examples discussed are compact. We will see examples of non-compact topological spaces where the ground set is infinite.
In the above examples, the ground set T is finite. In general, it can be infinite, and a topology may have uncountably many open sets containing uncountably many points.
Next, we introduce the concept of quotient topology. Given a space (T, T ) and an equivalence
relation ∼ on elements in T, one can define a topology induced by the original topology T on the
quotient set T/ ∼ whose elements are equivalence classes [x] for every point x ∈ T.
Definition 1.7 (Quotient topology). Given a topological space (T, T) and an equivalence relation ∼ defined on the set T, the quotient space (S, S) induced by ∼ is defined by the set S = T/∼ and the quotient topology S, where

S := {U ⊆ S | {x ∈ T : [x] ∈ U} ∈ T}.
We will see the use of quotient topology in Chapter 7 when we study Reeb graphs.
Infinite topological spaces may seem baffling from a computational point of view, because they may have uncountably many open sets containing uncountably many points. The easiest way to define such a topological space is to inherit the open sets from a metric space. A topology on a metric space excludes information that is not topologically essential. For instance, the act of stretching a rubber sheet changes the distances between points and thereby changes the metric, but it does not change the open sets or the topology of the rubber sheet. In the next section, we construct such a topology on a metric space and examine it through the concept of limit points.
It can be shown that the three axioms above imply that d(p, q) ≥ 0 for every pair p, q ∈ T. In a metric space T, the open metric ball with center c and radius r is defined to be the point set Bo(c, r) = {p ∈ T : d(p, c) < r}. Metric balls define a topology on a metric space.
Definition 1.9 (Metric space topology). Given a metric space T, all metric balls {Bo(c, r) | c ∈ T and 0 < r ≤ ∞} and their unions, constituting the open sets, define a topology on T.
All definitions for general topological spaces apply to metric spaces with the topology defined above. However, we give alternative definitions using the concept of limit points, which may be more intuitive.
As we mentioned already, the heart of topology is the question of what it means for a set of points to be connected. After all, two distinct points cannot be adjacent to each other; they can only be connected to one another by passing through uncountably many intermediate points. The idea of limit points helps express this concept more concretely, specifically in the case of metric spaces.
We use the notation d(·, ·) also to express the minimum distance between point sets P, Q ⊆ T: d(P, Q) = inf_{p∈P, q∈Q} d(p, q).
Definition 1.10 (Limit point). Let Q ⊆ T be a point set. A point p ∈ T is a limit point of Q, also known as an accumulation point of Q, if for every real number ε > 0, however tiny, Q contains a point q ≠ p such that d(p, q) < ε.
In other words, there is an infinite sequence of points in Q that gets successively closer to p—without actually being p—and gets arbitrarily close. Stated succinctly, d(p, Q \ {p}) = 0. Observe that it does not matter whether p ∈ Q or not.
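The condition d(p, Q \ {p}) = 0 is easy to illustrate numerically: for Q = {1/n : n ≥ 1} ⊂ R, the point p = 0 is a limit point even though 0 ∉ Q. A sketch using a finite sample of Q:

```python
def dist_to_set(p, Q):
    """Distance from p to Q minus {p}, for a finite sample Q."""
    return min(abs(p - q) for q in Q if q != p)

# Q samples {1/n : n >= 1}; p = 0 is a limit point although 0 is not in Q.
Q = [1 / n for n in range(1, 10001)]
for eps in (1e-1, 1e-2, 1e-3):
    # Definition 1.10: some q in Q satisfies 0 < |q - 0| < eps.
    assert any(0 < q < eps for q in Q)
print(dist_to_set(0, Q))  # 1/10000 for this sample; tends to 0 as Q grows
```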
To see the parallel between the definitions given in this subsection and those given before, it is instructive to define limit points also for general topological spaces. In particular, a point p ∈ T is a limit point of a set Q ⊆ T if every open set containing p intersects Q \ {p}.
We can also distinguish between closed and open point sets using the concept of limit points.
Informally, a triangle in the plane is closed if it contains all the points on its edges, and open if it
excludes all the points on its edges, as illustrated in Figure 1.3. The idea can be formally extended
to any point set.
8 Computational Topology for Data Analysis
Figure 1.2: The point set at left is disconnected; it can be partitioned into two connected subsets
shaded differently. The point set at right is connected. The black point at the center is a limit point
of the points shaded lightly.
Figure 1.3: Closed, open, and relatively open point sets in the plane. Dashed edges and open
circles indicate points missing from the point set.
Definition 1.12 (Closure; Closed; Open). The closure of a point set Q ⊆ T, denoted Cl Q, is the
set containing every point in Q and every limit point of Q. A point set Q is closed if Q = Cl Q,
i.e. Q contains all its limit points. The complement of a point set Q is T \ Q. A point set Q is open
if its complement is closed, i.e. T \ Q = Cl (T \ Q).
For example, consider the open interval (0, 1) ⊂ R, which contains every r ∈ R with 0 < r < 1. Let [0, 1] denote the closed interval (0, 1) ∪ {0} ∪ {1}. The numbers 0 and 1 are both limit points of the open interval, so Cl (0, 1) = [0, 1] = Cl [0, 1]. Therefore, [0, 1] is closed and (0, 1) is not. The numbers 0 and 1 are also limit points of the complement of the closed interval, R \ [0, 1], so (0, 1) is open, but [0, 1] is not.
The definition of an open set of course depends on the space being considered. A triangle τ that is missing the points on its edges is open in the two-dimensional Euclidean space aff τ. However, it is not open in the Euclidean space R3: every point in τ is a limit point of R3 \ τ, because we can find sequences of points that approach τ from the side. In recognition of this caveat, a simplex σ ⊂ Rd is said to be relatively open if it is open relative to its affine hull. Figure 1.3 illustrates this in the metric space R2.
We can also define the interior and boundary of a set using the notion of limit points. Informally, the boundary of a point set Q is the set of points where Q meets its complement T \ Q. The interior of Q contains all the other points of Q.
Definition 1.13 (Boundary; Interior). The boundary of a point set Q in a metric space T, denoted
Bd Q, is the intersection of the closures of Q and its complement; i.e. Bd Q = Cl Q ∩ Cl (T \ Q).
The interior of Q, denoted Int Q, is Q \ Bd Q = Q \ Cl (T \ Q).
For example, Bd [0, 1] = {0, 1} = Bd (0, 1) and Int [0, 1] = (0, 1) = Int (0, 1). The boundary of a triangle (closed or open) in the Euclidean plane is the union of the triangle's three edges, and its interior is an open triangle, as illustrated in Figure 1.3. The terms boundary and interior have a subtlety similar to that of open sets: the boundary of a triangle embedded in R3 is the whole triangle, and its interior is the empty set. However, relative to its affine hull, its interior and boundary are defined exactly as in the case of triangles embedded in the Euclidean plane. Interested readers can draw the analogy between this observation and the definition of the interior and boundary of a manifold that appears later in Definition 1.23.
We have seen a definition of compactness of a point set in a topological space (Definition 1.6).
We define it differently here for the metric space. It can be shown that the two definitions are
equivalent.
Definition 1.14 (Bounded; Compact). The diameter of a point set Q is sup_{p,q∈Q} d(p, q). The set Q is bounded if its diameter is finite, and unbounded otherwise. A point set Q in a metric space is compact if it is closed and bounded.
In the Euclidean space Rd we can use the standard Euclidean distance as the choice of metric.
On the surface of the coffee mug, we could choose the Euclidean distance too; alternatively, we
could choose the geodesic distance, namely the length of the shortest path from p to q on the
mug’s surface.
Example 1.4 (Euclidean ball). In Rd, the Euclidean d-ball with center c and radius r, denoted B(c, r), is the point set B(c, r) = {p ∈ Rd : d(p, c) ≤ r}. A 1-ball is an edge, and a 2-ball is called a disk. A unit ball is a ball with radius 1. The boundary of the d-ball is called the Euclidean (d − 1)-sphere and is denoted S(c, r) = {p ∈ Rd : d(p, c) = r}. The name expresses the fact that we consider it a (d − 1)-dimensional point set—to be precise, a (d − 1)-dimensional manifold—even though it is embedded in d-dimensional space. For example, a circle is a 1-sphere, and a layman's "sphere" in R3 is a 2-sphere. If we remove the boundary from a ball, we have the open Euclidean d-ball Bo(c, r) = {p ∈ Rd : d(p, c) < r}.
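The three point sets of Example 1.4 differ only in how they treat the boundary, which translates directly into three predicates (a sketch; `math.dist` computes the Euclidean distance between coordinate tuples):

```python
import math

def in_closed_ball(p, c, r):          # B(c, r) = {p : d(p, c) <= r}
    return math.dist(p, c) <= r

def in_open_ball(p, c, r):            # Bo(c, r) = {p : d(p, c) < r}
    return math.dist(p, c) < r

def on_sphere(p, c, r, tol=1e-12):    # S(c, r) = {p : d(p, c) = r}
    return abs(math.dist(p, c) - r) <= tol

# A point of the unit circle lies in the closed 2-ball and on the
# 1-sphere, but not in the open 2-ball.
p, c = (1.0, 0.0), (0.0, 0.0)
print(in_closed_ball(p, c, 1.0), in_open_ball(p, c, 1.0), on_sphere(p, c, 1.0))
# True False True
```

The tolerance in `on_sphere` is a floating-point concession; mathematically the sphere is an exact equality set.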
The topological spaces that are subspaces of a metric space such as Rd inherit their topology as a subspace topology. Examples of topological subspaces are the Euclidean d-ball Bd, the Euclidean d-sphere Sd, the open Euclidean d-ball Bd_o, and the Euclidean halfball Hd, where

Bd = {x ∈ Rd : ‖x‖ ≤ 1},
Sd = {x ∈ Rd+1 : ‖x‖ = 1},
Bd_o = {x ∈ Rd : ‖x‖ < 1},
Hd = {x ∈ Rd : ‖x‖ < 1 and x_d ≥ 0}.
gluing it because they are connected in the same way. They have the same topology. This notion of topological equivalence can be formalized via functions that send the points of one space to points of the other while preserving the connectivity.

This preservation of connectivity is achieved by preserving the open sets. A function from one space to another that preserves the open sets is called a continuous function or a map. Continuity alone is too weak to define topological equivalence, because a continuous function can send many points to a single point in the target space, or send no points to a given point in the target space. If the former does not happen, that is, when the function is injective, we call it an embedding of the domain into the target space. True equivalence is given by a homeomorphism, a bijective function from one space to another that is continuous and has a continuous inverse. This ensures that open sets are preserved in both directions.
Definition 1.15 (Continuous function; Map). A function f : T → U from the topological space
T to another topological space U is continuous if for every open set Q ⊆ U, f −1 (Q) is open.
Continuous functions are also called maps.
A topological space can be embedded into a Euclidean space by assigning coordinates to its
points so that the assignment is continuous and injective. For example, drawing a triangle on a
paper is an embedding of S1 into R2 . There are topological spaces that cannot be embedded into a
Euclidean space, or even into a metric space—these spaces cannot be represented by any metric.
Next we define homeomorphisms, which connect two spaces that have essentially the same topology.
For maps between compact spaces, there is a weaker condition to be verified for homeomor-
phism because of the following property.
Proposition 1.1. If T and U are compact metric spaces, every bijective map from T to U has a
continuous inverse.
One can take advantage of this fact to prove that certain functions are homeomorphisms by
showing continuity only in the forward direction. When two topological spaces are subspaces of
the same larger space, a notion of similarity called isotopy exists which is stronger than homeo-
morphism. If two subspaces are isotopic, one can be continuously deformed to the other while
keeping the deforming subspace homeomorphic to its original form all the time. For example, a
solid cube can be continuously deformed into a ball in this manner.
Computational Topology for Data Analysis 11
Figure 1.4: Each point set in this figure is homeomorphic to the point set above or below it, but
not to any of the others. Open circles indicate points missing from the point set, as do the dashed
edges in the point sets second from the right.
Figure 1.5: Two tori knotted differently, one triangulated and the other not. Both are homeomor-
phic to the standard unknotted torus on the left, but not isotopic to it.
For example, consider the map ξ : Bdo × [0, 1] → Rd given by ξ(x, t) = x/(1 − t‖x‖), which sends the open d-ball Bdo to itself if t = 0, and onto the Euclidean space Rd if t = 1. The parameter t plays the role of time, that is, ξ(Bdo , t) deforms continuously from a ball at time zero to Rd at time one. Thus, there is an isotopy between the open d-ball and Rd .
Every ambient isotopy becomes an isotopy if its domain is restricted from Rd × [0, 1] to
T × [0, 1]. It is known that if there is an isotopy between two subspaces, then there exists an
ambient isotopy between them. Hence, the two notions are equivalent.
There is another notion of similarity among topological spaces that is weaker than homeomorphism, called homotopy equivalence. It relates spaces that can be continuously deformed into one another even when the deformation does not preserve homeomorphism. For example, a ball can shrink to a point, which is not homeomorphic to it because a bijective function from an infinite point set to a single point cannot exist. However, homotopy preserves some form of connectivity, such as the number of connected components, holes, and/or voids. This is why a coffee cup is homotopy equivalent to a circle, but not to a ball or a point.
To get to homotopy equivalence, we first need the concept of homotopies, which are isotopies
sans the homeomorphism.
For example, let g : B3 → R3 be the identity map on a unit ball and h : B3 → R3 be the map
sending every point in the ball to the origin. The fact that g and h are homotopic is demonstrated
by the homotopy H(x, t) = (1 − t) · g(x). Observe that H(B3 , t) deforms continuously a ball at time
zero to a point at time one. A key property of a homotopy is that, as H is continuous, at every
time t the map H(·, t) remains continuous.
For developing more intuition, consider two maps that are not homotopic. Let g : S1 → S1 be the identity map from the circle to itself, and let h : S1 → S1 map every point on the circle to a single point p ∈ S1 . Although it may seem that we can contract a circle to a point, that view is misleading because a homotopy H between g and h would have to map every point on the circle at every time to a point on the circle. The contraction of the circle to a point is possible only if we break the continuity, say by cutting or gluing the circle somewhere.
Observe that a homeomorphism relates two topological spaces T and U whereas a homotopy or an isotopy (which is a special kind of homotopy) relates two maps, thereby indirectly establishing a relationship between two subspaces g(X) ⊆ U and h(X) ⊆ U. That relationship is not necessarily an equivalence, but the following is.
Definition 1.20 (Homotopy equivalent). Two topological spaces T and U are homotopy equivalent
if there exist maps g : T → U and h : U → T such that h ◦ g is homotopic to the identity map
ιT : T → T and g ◦ h is homotopic to the identity map ιU : U → U.
Homotopy equivalence is indeed an equivalence relation, that is, if A, B and B, C are pairs of homotopy equivalent spaces, then so is the pair A, C. Homeomorphic spaces necessarily have the same dimension, though homotopy equivalent spaces may have different dimensions.
Figure 1.6: All three of the topological spaces are homotopy equivalent, because they are all deformation retracts of the leftmost space.
To gain more
intuition about homotopy equivalent spaces, we show why a 2-ball is homotopy equivalent to a
single point p. Consider a map h : B2 → {p} and a map g : {p} → B2 where g(p) is any point q
in B2 . Observe that h ◦ g is the identity map on {p}, which is trivially homotopic to itself. In the
other direction, g ◦ h : B2 → B2 sends every point in B2 to q. A homotopy between g ◦ h and the
identity map idB2 is given by the map H(x, t) = (1 − t)q + tx.
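The straight-line homotopy above is easy to check numerically. A minimal sketch (our code; the point q of B2 is fixed arbitrarily):

```python
def H(x, t, q=(0.25, 0.25)):
    """H(x, t) = (1 - t) q + t x: deforms the constant map with value q
    at t = 0 into the identity map on the unit disk B^2 at t = 1."""
    return tuple((1 - t) * qi + t * xi for qi, xi in zip(q, x))

x = (0.5, -0.5)                      # any point of the unit disk B^2
assert H(x, 0.0) == (0.25, 0.25)     # time 0: the constant map g∘h
assert H(x, 1.0) == x                # time 1: the identity map
mid = H(x, 0.5)                      # convexity of B^2 keeps intermediate
assert mid[0] ** 2 + mid[1] ** 2 <= 1.0   # images inside the disk
```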
A useful intuition for understanding the definition of homotopy equivalent spaces can be derived from the fact that two spaces T and U are homotopy equivalent if and only if there exists a third space X such that both T and U are deformation retracts of X; see Figure 1.6.
Definition 1.21 (Deformation retract). Let T be a topological space, and let U ⊂ T be a subspace.
A retraction r of T to U is a map from T to U such that r(x) = x for every x ∈ U. The space U is
a deformation retract of T if the identity map on T can be continuously deformed to a retraction
with no motion of the points already in U: specifically, there is a homotopy called deformation
retraction R : T × [0, 1] → T such that R(·, 0) is the identity map on T, R(·, 1) is a retraction of T
to U, and R(x, t) = x for every x ∈ U and every t ∈ [0, 1].
Fact 1.1. If U is a deformation retract of T, then T and U are homotopy equivalent.
For example, any point on a line segment (open or closed) is a deformation retract of the
line segment and is homotopy equivalent to it. The letter M is a deformation retract of the letter
W, and also of a 1-ball. Moreover, as we said before, two spaces are homotopy equivalent if they are deformation retracts of a common space. The symbols ∅, ∞, and (viewed as
one-dimensional point sets) are deformation retracts of a double doughnut—a doughnut with
two holes. Therefore, they are homotopy equivalent to each other, though none of them is a
deformation retract of any of the others because one is not a subspace of the other. They are not
homotopy equivalent to A, X, O, ⊕, , }, a ball, nor a coffee cup.
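A deformation retraction can also be written down explicitly. As a sketch (our illustration, not from the book), the straight-line map R(x, t) = (1 − t) x + t x/‖x‖ deformation-retracts Rk \ {o} onto the unit sphere Sk−1: it is the identity at time zero, lands on the sphere at time one, and never moves points already on the sphere.

```python
import math

def R(x, t):
    """R(x, t) = (1 - t) x + t x/||x||: straight-line deformation retraction
    of R^k minus the origin onto the unit sphere S^(k-1)."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm == 0:
        raise ValueError("the origin is not in the domain")
    return tuple((1 - t) * c + t * c / norm for c in x)

x = (3.0, 4.0)                                     # a point of R^2 \ {o}
assert R(x, 0.0) == x                              # R(., 0) is the identity
assert abs(math.hypot(*R(x, 1.0)) - 1.0) < 1e-12   # R(., 1) lands on S^1
s = (0.6, 0.8)                                     # a point already on S^1
assert max(abs(a - b) for a, b in zip(R(s, 0.5), s)) < 1e-12  # s never moves
```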
1.4 Manifolds
A manifold is a topological space that is locally connected in a particular way. A 1-manifold has local connectivity resembling a segment. A 2-manifold (with boundary) has local connectivity resembling a complete or partial disc. In layman's terms, a 2-manifold has the structure of a piece of paper or rubber sheet, possibly with the boundaries glued together to form a closed surface—a category that includes disks, spheres, tori, and Möbius bands.
Definition 1.22 (Manifold). A topological space M is an m-manifold, or simply manifold, if every point x ∈ M has a neighborhood homeomorphic to Bmo or Hm . The dimension of M is m.
Every manifold can be partitioned into boundary and interior points. Observe that these words
mean very different things for a manifold than they do for a metric space or topological space.
Definition 1.23 (Boundary; Interior). The interior Int M of an m-manifold M is the set of points in M that have a neighborhood homeomorphic to Bmo . The boundary Bd M of M is the set of points M \ Int M. The boundary Bd M, if not empty, consists of the points that have a neighborhood homeomorphic to Hm . If Bd M is the empty set, we say that M is without boundary.
A single point, a 0-ball, is a 0-manifold without boundary according to this definition. The
closed disk B2 is a 2-manifold whose interior is the open disk B2o and whose boundary is the circle
S1 . The open disk B2o is a 2-manifold whose interior is B2o and whose boundary is the empty set.
This highlights an important difference between Definitions 1.13 and 1.23 of “boundary”: when
B2o is viewed as a point set in the space R2 , its boundary is S1 according to Definition 1.13; but
viewed as a manifold, its boundary is empty according to Definition 1.23. The boundary of a
manifold is always included in the manifold.
The open disk B2o , the Euclidean space R2 , the sphere S2 , and the torus are all connected 2-
manifolds without boundary. The first two are homeomorphic to each other, but the last two are
not. The sphere and the torus in R3 are compact (bounded and closed with respect to R3 ) whereas
B2o and R2 are not.
A d-manifold, d ≥ 2, can have orientations whose formal definition we skip here. Informally,
we say that a 2-manifold M is non-orientable if, starting from a point p, one can walk on one
side of M and end up on the opposite side of M upon returning to p. Otherwise, M is orientable.
Spheres and balls are orientable, whereas the Möbius band in Figure 1.7 (a) is a non-orientable
2-manifold with boundary.
Figure 1.7: (a) A Möbius band. (b) Removal of the red and green loops opens up the torus into a
topological disk. (c) A double torus: every surface without boundary in R3 resembles a sphere or
a conjunction of one or more tori. (d) Double torus knotted.
The genus of an orientable compact 2-manifold without boundary is g if 2g is the maximum number of loops that can be removed from the surface without disconnecting it; here the loops are permitted to intersect each other. For example, the sphere has
genus zero as every loop cuts it into two discs. The torus has genus one: a circular cut around
its neck and a second circular cut around its circumference, illustrated in Figure 1.7(b), allow it
to unfold into a topological disk. A third loop would cut it into two pieces. Figure 1.7(c) and (d)
each shows a 2-manifold without boundary of genus 2. Although a high-genus surface can have
a very complex shape, all compact 2-manifolds in R3 that have the same genus and no boundary
are homeomorphic to each other.
Consider a differentiable map φ : U → Rd defined on an open set U ⊆ Rk . The map φ is regular if its Jacobian has rank k at every point in U. The map φ is C i -continuous if the ith-order partial derivatives of φ are continuous.
The reader may be familiar with parametric surfaces, for which U is a 2-dimensional param-
eter space and its image φ(U) in d-dimensional space is a parametric surface. Unfortunately, a
single parametric surface cannot easily represent a manifold with a complicated topology. How-
ever, for a manifold to be smooth, it suffices that each point on the manifold has a neighborhood
that looks like a smooth parametric surface.
Definition 1.24 (Smooth embedded manifold). For any i > 0, an m-manifold M without boundary
embedded in Rd is C i -smooth if for every point p ∈ M, there exists an open set U p ⊂ Rm , a
neighborhood W p ⊂ Rd of p, and a map φ p : U p → W p ∩ M such that (i) φ p is C i -continuous, (ii)
φ p is a homeomorphism, and (iii) φ p is regular. If m = 2, we call M a C i -smooth surface.
The first condition says that each map is continuously differentiable at least i times. The
second condition requires each map to be bijective, ruling out “wrinkles” where multiple points
in U map to a single point in W. The third condition prohibits any map from having a directional
derivative of zero at any point in any direction. The first and third conditions together enforce
smoothness, and imply that there is a well-defined tangent m-flat at each point in M. The three
conditions together imply that the maps φ p defined in the neighborhood of each point p ∈ M
overlap smoothly. There are two extremes of smoothness. We say that M is C ∞ -smooth if for
every point p ∈ M, the partial derivatives of φ p of all orders are continuous. On the other hand,
M is nonsmooth if M is an m-manifold (therefore C 0 -smooth) but not C 1 -smooth.
Figure 1.8: (a) The graph of a function f : R2 → R. (b) The graph of a function f : R → R with
critical points marked.
1.5 Functions on smooth manifolds
In previous sections, we introduced topological spaces, including the special case of (smooth)
manifolds. Very often, a space can be equipped with continuous functions defined on it. In this
section, we focus on real-valued functions of the form f : X → R defined on a topological space
X, also called scalar functions; see Figure 1.8 (a) for the graph of a function f : R2 → R. Scalar
functions appear commonly in practice to describe the space or data of interest (e.g., the elevation function defined on the surface of the earth). We are interested in the topological structures behind
scalar functions. In this section, we limit our discussion to nicely behaved scalar functions (called
Morse functions) defined on smooth manifolds. Their topological structures are characterized
by the so-called critical points which we will introduce below. Later in the book we will also
discuss scalar functions on simplicial complex domains, as well as more complex maps defined
on a space X, e.g., a multivariate function f : X → Rd .
1.5.1 Gradients and critical points
The gradient vector of f at x ∈ Rd intuitively captures the direction of steepest increase of function
f . More precisely, we have:
Definition 1.25 (Gradient for functions on Rd ). Given a smooth function f : Rd → R, the gradient
vector field ∇ f : Rd → Rd is defined as follows: for any x ∈ Rd ,
∇ f (x) = ( ∂ f /∂x1 (x), ∂ f /∂x2 (x), · · · , ∂ f /∂xd (x) )T ,          (1.3)
where (x1 , x2 , . . . , xd ) represents an orthonormal coordinate system for Rd . The vector ∇ f (x) ∈ Rd
is called the gradient vector of f at x. A point x ∈ Rd is a critical point if ∇ f (x) = [0 0 · · · 0]T ;
otherwise, x is regular.
Observe that for any v ∈ Rd , the directional derivative satisfies Dv f (x) = ⟨∇ f (x), v⟩. It then follows that ∇ f (x) ∈ Rd points along the unit vector v for which Dv f (x) is maximized among the directional derivatives in all unit directions around x; and its magnitude ‖∇ f (x)‖ equals the
value of this maximum directional derivative. The critical points of f are those points where
the directional derivative vanishes in all directions – locally, the rate of change for f is zero no
matter which direction one deviates from x. See Figure 1.9 for the three types of critical points,
minimum, saddle point, and maximum, for a generic smooth function f : R2 → R.
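The three types of critical points in Figure 1.9 can be explored numerically. The sketch below (our code, not the book's) approximates ∇ f and the Hessian of a function on R2 by central differences; counting negative Hessian eigenvalues distinguishes a minimum (0 negative), a saddle (1), and a maximum (2).

```python
import math

def grad(f, x, h=1e-5):
    """Central-difference approximation of the gradient of f : R^d -> R."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def morse_index_2d(f, x, h=1e-4):
    """Number of negative eigenvalues of the finite-difference Hessian of
    f : R^2 -> R at a critical point x: 0 = minimum, 1 = saddle, 2 = maximum."""
    def d2(i, j):
        vals = []
        for si, sj in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
            y = list(x)
            y[i] += si * h
            y[j] += sj * h
            vals.append(f(y))
        return (vals[0] - vals[1] - vals[2] + vals[3]) / (4 * h * h)
    a, b, c = d2(0, 0), d2(0, 1), d2(1, 1)
    disc = math.sqrt(((a - c) / 2) ** 2 + b * b)   # eigenvalues of [[a, b], [b, c]]
    eigs = [(a + c) / 2 - disc, (a + c) / 2 + disc]
    return sum(1 for e in eigs if e < 0)

saddle = lambda x: x[0] ** 2 - x[1] ** 2
assert max(abs(g) for g in grad(saddle, [0.0, 0.0])) < 1e-6   # origin is critical
assert morse_index_2d(saddle, [0.0, 0.0]) == 1                # saddle
assert morse_index_2d(lambda x: x[0] ** 2 + x[1] ** 2, [0.0, 0.0]) == 0  # minimum
assert morse_index_2d(lambda x: -x[0] ** 2 - x[1] ** 2, [0.0, 0.0]) == 2 # maximum
```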
Finally, we can extend the above definitions of gradients and critical points to a smooth function f : M → R defined on a smooth Riemannian m-manifold M. Here, a Riemannian manifold is a manifold equipped with a Riemannian metric, which is a smoothly varying inner product defined on the tangent spaces. This allows lengths to be measured, which in turn lets us define gradients. At a point x ∈ M, denote the tangent space of M at x by TM x , which is the m-dimensional vector space consisting of all tangent vectors of M at x. For example, TM x is an m-dimensional linear space Rm for an m-dimensional manifold M embedded in the Euclidean space Rd , with Riemannian metric (inner product in the tangent space) induced from Rd .
The gradient ∇ f is a vector field on M, that is, ∇ f : M → TM maps every point x ∈ M to
a vector ∇ f (x) ∈ TM x in the tangent space of M at x. Similar to the case for a function defined
on Rd , the gradient vector field ∇ f satisfies that for any x ∈ M and v ∈ TM x , ⟨∇ f (x), v⟩ gives
rise to the directional derivative Dv f (x) of f in direction v, and ∇ f (x) still specifies the direction
of steepest increase of f along all directions in TM x with its magnitude being the maximum rate
of change. More formally, we have the following definition, analogous to Definition 1.25 for the
case of a smooth function on Rd .
Definition 1.26 (Gradient vector field; Critical points). Given a smooth function f : M → R defined on a smooth m-dimensional Riemannian manifold M, the gradient vector field ∇ f : M → TM is the vector field satisfying ⟨∇ f (x), v⟩ = Dv f (x) for every x ∈ M and v ∈ TM x . A point x ∈ M is a critical point if ∇ f (x) = 0; otherwise, x is regular.
Figure 1.9: Top row: The graph of the function around non-degenerate critical points for a smooth
function on R2 , and a degenerate critical point, called “monkey saddle”. For example, for an
index-0 critical point p, its local neighborhood can be written as f (x) = f (p) + x1² + x2² , making
p a local minimum. Bottom row: the local (closed) neighborhood of the corresponding critical
point in the domain R2 , where the dark blue colored regions are the portion of neighborhood of p
whose function value is at most f (p).
Figure 1.10: (a) The height function defined on a torus with critical points u, v, w, and z. (b) – (f):
Passing through an index-k critical point is the same as attaching a k-cell from the homotopy point
of view. For example, M≤a+ε for a = f (v) (as shown in (d)) is homotopy equivalent to attaching a
1-cell (shown in (c)) to M≤a−ε (shown in (b)) for an infinitesimal positive ε.
Let p be a critical point of f with index k and α = f (p), and choose a sufficiently small ε > 0 such that there is no critical point of f other than p contained in the interval-level set f −1 [α − ε, α + ε]. Then the sublevel set M≤α+ε has the same homotopy type as M≤α−ε with a k-cell attached to its boundary Bd M≤α−ε .
Finally, we state the well-known Morse inequalities, connecting critical points with the so-
called Betti numbers of the domain which we will define formally in Section 2.5. In particular,
fixing a field coefficient, the i-th Betti number is the rank of the so-called i-th (singular) homology
group of a topological space X.
Theorem 1.5 (Morse inequalities). Let f be a Morse function on a smooth compact d-manifold
M. For 0 ≤ i ≤ d, let ci denote the number of critical points of f with index i, and βi be the i-th
Betti number of M. We then have:
• ci ≥ βi for all i ≥ 0; and Σdi=0 (−1)i ci = Σdi=0 (−1)i βi . (weak Morse inequalities)
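As a quick sanity check (our arithmetic; the critical-point counts are read off the height function on the torus in Figure 1.10, and the torus Betti numbers are those defined in Section 2.5):

```python
# Critical points of the torus height function of Figure 1.10:
# one minimum u (index 0), two saddles v, w (index 1), one maximum z (index 2).
c = [1, 2, 1]
# Betti numbers of the torus.
beta = [1, 2, 1]

# Weak Morse inequalities: c_i >= beta_i for every i.
assert all(ci >= bi for ci, bi in zip(c, beta))
# The alternating sums agree (both equal the Euler characteristic, 0 here).
euler_c = sum((-1) ** i * ci for i, ci in enumerate(c))
euler_b = sum((-1) ** i * bi for i, bi in enumerate(beta))
assert euler_c == euler_b == 0
```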
Exercises
1. A space is called Hausdorff if every two distinct points have disjoint open sets containing them.
(a) Give an example of a space that is not Hausdorff.
(b) Give an example of a space that is Hausdorff.
(c) Show the above examples on the same ground set T.
2. In every space T, the point sets ∅ and T are both closed and open.
(a) Give an example of a space that has more than two sets that are both closed and open,
and list all of those sets.
(b) Explain the relationship between the idea of connectedness and the number of sets
that are both closed and open.
3. A topological space T is called path connected if any two points x, y ∈ T can be joined by a path, i.e., there exists a continuous map f : [0, 1] → T of the segment [0, 1] ⊂ R into T so that f (0) = x and f (1) = y. Prove that a path connected space is also connected but the converse may not be true; however, if T is finite, then the two notions are equivalent.
4. Prove that for every subset X of a metric space, Cl Cl X = Cl X. In other words, augmenting
a set with its limit points does not give it more limit points.
6. Prove that the metric is a continuous function on the Cartesian space T × T of a metric space
T.
7. Give an example of a bijective function that is continuous, but its inverse is not. In light of
Proposition 1.1, the spaces need to be non-compact.
8. A space is called normal if it is Hausdorff and for any two disjoint closed sets X and Y,
there are disjoint open sets U X ⊃ X and UY ⊃ Y. Show that any metric space is normal.
Show the same for any compact space.
10. (a) Construct an explicit deformation retraction of Rk \ {o} onto Sk−1 where o denotes the
origin. Also, show Rk ∪ {∞} is homeomorphic to Sk .
(b) Show that any d-dimensional finite convex polytope is homeomorphic to the d-dimensional
unit ball Bd .
11. Deduce that homeomorphism is an equivalence relation. Show that the relation of homo-
topy among maps is an equivalence relation.
12. Consider the function f : R3 → R defined as f (x1 , x2 , x3 ) = 3x1² + 3x2² − 9x3² . Show that the origin (0, 0, 0) is a critical point of f . Give the index of this critical point. Let S denote the unit sphere centered at the origin. Show that f −1 (−∞, 0] ∩ S is homotopy equivalent to two points, whereas f −1 [0, ∞) ∩ S is homotopy equivalent to S1 , the unit 1-sphere (i.e., circle).
Chapter 2
This chapter introduces two very basic tools on which topological data analysis (TDA) is built.
One is simplicial complexes and the other is homology groups. Data supplied as a discrete set
of points do not have an interesting topology. Usually, we construct a scaffold on top of it which
is commonly taken as a simplicial complex. It consists of vertices at the data points, edges con-
necting them, triangles, tetrahedra and their higher dimensional analogues that establish higher
order connectivity. Section 2.1 formalizes this construction. There are different kinds of simplicial complexes. Some are easier to compute but take more space; others are sparser but take more time to compute. Section 2.2 presents an important construction called the nerve
and a complex called the Čech complex which is defined on this construction. This section also
presents a commonly used complex in topological data analysis called the Vietoris-Rips complex
that interleaves with the Čech complexes in terms of containment. In Section 2.3, we introduce
some of the complexes which are sparser in size than the Vietoris-Rips or Čech complexes.
The homology groups of a simplicial complex, the second topic of this chapter, are the essential algebraic structures with which TDA analyzes data. Homology groups of a topological space
capture the space of cycles up to the ones called boundaries that bound “higher dimensional” sub-
sets. For simplicity, we introduce the concept in the context of simplicial complexes instead of
topological spaces. This is called simplicial homology. The essential entities for defining the ho-
mology groups are chains, cycles, and boundaries which we cover in Section 2.4. For simplicity
and also for the relevance in TDA, we define these structures under Z2 -additions.
Section 2.5 defines the simplicial homology group of a simplicial complex as the quotient
space of the cycles with respect to the boundaries. Some concepts related to homology groups, such as induced homology under a map, singular homology groups for general topological spaces, relative homology groups of a complex with respect to a subcomplex, and the dual concept of homology groups called cohomology groups, are also introduced in this section.
Definition 2.1 (Simplex). For k ≥ 0, a k-simplex σ in a Euclidean space Rm is the convex hull1
of a set P of k + 1 affinely independent points in Rm . In particular, a 0-simplex is a vertex, a
1-simplex is an edge, a 2-simplex is a triangle, and a 3-simplex is a tetrahedron. A k-simplex is
said to have dimension k. For 0 ≤ k0 ≤ k, a k0 -face (or, simply a face) of σ is a k0 -simplex that
is the convex hull of a nonempty subset of P. Faces of σ come in all dimensions from zero (σ’s
vertices) to k; and σ is a face of σ. A proper face of σ is a simplex that is the convex hull of a
proper subset of P; i.e. any face except σ. The (k − 1)-faces of σ are called facets of σ; σ has
k + 1 facets.
In Figure 2.1(left), triangle abc is a 2-simplex which has three vertices as 0-faces and three
edges as 1-faces. These are proper faces out of which edges are its facets. Similarly, a tetra-
hedron has four 0-faces (vertices), six 1-faces (edges), four 2-faces (triangles), and one 3-face
(tetrahedron itself) out of which vertices, edges, triangles are proper. The triangles are facets.
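The face counts above are easy to reproduce for an abstract simplex by enumerating vertex subsets (a short sketch of our own, not from the book):

```python
from itertools import combinations

def faces(simplex):
    """All nonempty faces of an abstract k-simplex, grouped by dimension."""
    verts = sorted(simplex)
    k = len(verts) - 1                   # dimension of the simplex
    return {d: [frozenset(c) for c in combinations(verts, d + 1)]
            for d in range(k + 1)}

tet = faces({'a', 'b', 'c', 'd'})        # a 3-simplex (tetrahedron)
# four vertices, six edges, four triangles, and the tetrahedron itself
assert [len(tet[d]) for d in range(4)] == [4, 6, 4, 1]
assert len(tet[2]) == 4                  # its facets are the four triangles
```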
Definition 2.2 (Geometric simplicial complex). A geometric simplicial complex K, also known as a triangulation, is a set containing finitely2 many simplices that satisfies the following two restrictions.
• K contains every face of every simplex in K.
• For any two simplices σ, τ ∈ K, their intersection σ ∩ τ is either empty or a face of both σ and τ.
The dimension k of K is the maximum dimension of any simplex in K, which is why we also refer to it as a simplicial k-complex.
The above definition of simplicial complexes is very geometric, which is why they are referred to as geometric simplicial complexes. Figure 2.1 shows such a geometric simplicial 2-complex in
R2 (left) and another in R3 (right). There is a parallel notion of simplicial complexes that is devoid
of geometry.
Definition 2.3 (Abstract simplex and simplicial complex). A collection K of non-empty subsets
of a given set V(K) is an abstract simplicial complex if every element σ ∈ K has all of its non-
empty subsets σ0 ⊆ σ also in K. Each such element σ with |σ| = k + 1 is called a k-simplex (or
simply a simplex). Each subset σ0 ⊆ σ with |σ0 | = k0 + 1 is called a k0 -face (or, simply a face)
of σ and σ with |σ| = k + 1 is called a k-coface (or, simply a coface) of σ0 . Sometimes, σ0 is
also called a face of σ with co-dimension k − k0 . Also, a (k − 1)-face ((k + 1)-coface resp.) of
a k-simplex is called its facet (cofacet resp.). The elements of V(K) are the vertices of K. Each
k-simplex in K is said to have dimension k. We also say K is a simplicial k-complex if the top
dimension of any simplex in K is k.
Remark 2.1. The collection K can possibly be empty, in which case V(K) is empty, though a non-empty K cannot have the empty set as one of its elements by definition.
1 The convex hull of a set of given points p0 , . . . , pk in Rm is the set of all points x ∈ Rm that are convex combinations of the given points, i.e., x = Σki=0 αi pi for αi ≥ 0 and Σαi = 1.
2 Topologists usually define complexes so they have countable cardinality. We restrict complexes to finite cardinality here.
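The closure condition of Definition 2.3 is mechanical to check for a finite collection of vertex sets. A minimal sketch (our code; the toy complexes are ours):

```python
from itertools import combinations

def is_simplicial_complex(K):
    """Check Definition 2.3: every nonempty subset of each simplex is in K,
    and the empty set itself is not a member."""
    K = {frozenset(s) for s in K}
    if frozenset() in K:
        return False
    for sigma in K:
        for k in range(1, len(sigma)):
            for tau in combinations(sigma, k):
                if frozenset(tau) not in K:
                    return False
    return True

closed = [{'a'}, {'b'}, {'c'}, {'a', 'b'}, {'b', 'c'}, {'a', 'c'}, {'a', 'b', 'c'}]
assert is_simplicial_complex(closed)
# The triangle {a, b, c} is present but its edge {a, c} is missing.
assert not is_simplicial_complex(
    [{'a'}, {'b'}, {'c'}, {'a', 'b'}, {'b', 'c'}, {'a', 'b', 'c'}])
```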
Figure 2.1: (left) A simplicial complex with six vertices, eight edges, and one triangle, (right) A
simplicial 2-complex triangulating a 2-manifold in R3 .
Stars and links. Given a simplex τ ∈ K, its star in K is the set of simplices that have τ as a face,
denoted by St(τ) = {σ ∈ K | τ ⊆ σ} (recall that τ ⊆ σ means that τ is a face of σ). Generally, the
star is not closed under face relation and hence is not a simplicial complex. We can make it so by
adding all missing faces. The result is the closed star St̄(τ) = St(τ) ∪ {σ0 ∈ K | σ0 ⊂ σ for some σ ∈ St(τ)}, which is also the smallest subcomplex that contains the star. The link of τ consists of the set of simplices in the closed star that are disjoint from τ, that is, Lk(τ) = {σ ∈ St̄(τ) | σ ∩ τ = ∅}.
Intuitively, we can think of the star (resp. the closed star) of a vertex as an open (resp. closed)
neighborhood around it, and the link as the boundary of that neighborhood.
In Figure 2.1(left), we have
• St(a) = {{a}, {a, b}, {a, d}, {a, f }, {a, b, d}}, St̄(a) = St(a) ∪ {{b}, {d}, { f }, {b, d}}
• St({a, b}) = {{a, b}, {a, b, d}}, St̄({a, b}) = St({a, b}) ∪ {{a}, {b}, {d}, {a, d}, {b, d}}
• Lk(a) = {{b}, {d}, { f }, {b, d}}, Lk( f ) = {{a}, {d}}, Lk({a, b}) = {{d}}.
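These operations are direct set computations on an abstract complex. The sketch below (our code) reproduces the values above for the complex of Figure 2.1(left); the exact edges among c, d, e are our guess from the figure and do not affect the assertions:

```python
def star(K, tau):
    """St(tau): all simplices of K having tau as a face."""
    tau = frozenset(tau)
    return {s for s in K if tau <= s}

def closed_star(K, tau):
    """Smallest subcomplex containing St(tau): add every proper face in K."""
    st = star(K, tau)
    return st | {s for s in K if any(s < sigma for sigma in st)}

def link(K, tau):
    """Lk(tau): simplices of the closed star disjoint from tau."""
    tau = frozenset(tau)
    return {s for s in closed_star(K, tau) if not (s & tau)}

# Six vertices, eight edges, one triangle, as in Figure 2.1 (left).
K = {frozenset(s) for s in
     [{'a'}, {'b'}, {'c'}, {'d'}, {'e'}, {'f'},
      {'a', 'b'}, {'a', 'd'}, {'a', 'f'}, {'b', 'd'}, {'d', 'f'},
      {'c', 'd'}, {'d', 'e'}, {'c', 'e'},
      {'a', 'b', 'd'}]}

assert star(K, {'a'}) == {frozenset(s) for s in
                          [{'a'}, {'a', 'b'}, {'a', 'd'}, {'a', 'f'}, {'a', 'b', 'd'}]}
assert link(K, {'a'}) == {frozenset(s) for s in [{'b'}, {'d'}, {'f'}, {'b', 'd'}]}
assert link(K, {'f'}) == {frozenset({'a'}), frozenset({'d'})}
assert link(K, {'a', 'b'}) == {frozenset({'d'})}
```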
Simplicial map. Corresponding to the continuous functions (maps) between topological spaces,
we have a notion called simplicial map between simplicial complexes.
Definition 2.6 (Simplicial map). A map f : K1 → K2 is called simplicial if for every simplex
{v0 , . . . , vk } ∈ K1 , we have the simplex { f (v0 ), . . . , f (vk )} in K2 .
A simplicial map is called a vertex map if the domain and codomain of f are only vertex sets
V(K1 ) and V(K2 ) respectively. Every simplicial map is associated with a vertex map. However, a
vertex map f : V(K1 ) → V(K2 ) does not necessarily extend to a simplicial map from K1 to K2 .
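Whether a vertex map extends to a simplicial map is a finite check: the image of every simplex of K1 must be a simplex of K2. A sketch with two toy complexes of our own making:

```python
def extends_to_simplicial_map(f, K1, K2):
    """A vertex map f : V(K1) -> V(K2) extends to a simplicial map iff the
    image of every simplex of K1 is a simplex of K2 (Definition 2.6)."""
    K2 = {frozenset(s) for s in K2}
    return all(frozenset(f[v] for v in sigma) in K2 for sigma in K1)

# Hypothetical complexes: K1 is the path u-v-w, K2 an edge xy plus a vertex z.
K1 = [{'u'}, {'v'}, {'w'}, {'u', 'v'}, {'v', 'w'}]
K2 = [{'x'}, {'y'}, {'z'}, {'x', 'y'}]

collapse = {'u': 'x', 'v': 'y', 'w': 'x'}   # folds the path onto the edge xy
assert extends_to_simplicial_map(collapse, K1, K2)

bad = {'u': 'x', 'v': 'x', 'w': 'z'}        # sends {v, w} to {x, z}: not in K2
assert not extends_to_simplicial_map(bad, K1, K2)
```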
Fact 2.1. Every continuous function f : |K1 | → |K2 | can be approximated closely by a simplicial map g on appropriate subdivisions of K1 and K2 . The approximation being ‘close’ means that, for every point x ∈ |K1 |, there is a simplex in K2 whose geometric realization contains both f (x) and g(x).
Contiguous maps play an important role in topological analysis. We use a result involving
contiguous maps and homology groups. We defer stating it till Section 2.5 where we introduce
homology groups.
Figure 2.2: Examples of two spaces (left), open covers of them (middle), and their nerves (right).
(Top) the intersections of covers are contractible, (bottom) the intersections of covers are not
necessarily contractible.
Taking U to be a cover of a topological space in the above definition, one gets a nerve of a
cover. Figure 2.2 shows two topological spaces, their covers, and corresponding nerves.
One important result involving nerves is the so-called Nerve Theorem, which has different forms depending on the type of topological spaces and covers. Adapting to our need, we state
it for metric spaces (Definition 1.8) which are a special type of topological spaces as we have
observed in Chapter 1.
Theorem 2.1 (Nerve Theorem [45, 300]). Given a finite cover U (open or closed) of a metric
space M, the underlying space |N(U)| is homotopy equivalent to M if every non-empty intersection
∩ki=0 Uαi of cover elements is homotopy equivalent to a point, that is, contractible.
The cover in the top row of Figure 2.2 satisfies the property of the above theorem and its nerve
is homotopy equivalent to M whereas the same is not true for the cover shown in the bottom row.
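For a finite cover given as finite sets, the nerve can be computed by brute force. A sketch (our code; the “arcs” below are a discretized stand-in for a cover of a circle like the one in Figure 2.2): three consecutive arcs overlap pairwise but share no common point, so the nerve is the boundary of a triangle—homotopy equivalent to the circle, as the Nerve Theorem predicts.

```python
from itertools import combinations

def nerve(cover):
    """Nerve of a finite cover (a list of finite sets): one vertex per cover
    element, one simplex per subcollection with nonempty common intersection."""
    n = len(cover)
    N = set()
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            if set.intersection(*(cover[i] for i in idx)):
                N.add(frozenset(idx))
    return N

# Twelve sample points on a circle covered by three overlapping arcs.
arcs = [set(range(0, 5)), set(range(4, 9)), set(range(8, 12)) | {0}]
N = nerve(arcs)
assert all(frozenset(e) in N for e in [{0, 1}, {1, 2}, {0, 2}])  # three edges
assert frozenset({0, 1, 2}) not in N   # empty triple intersection: no triangle
```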
Given a finite subset P of a metric space (M, d), we can build an abstract simplicial complex called the Čech complex with vertices in P using the concept of nerve.
Definition 2.9 (Čech complex). Let (M, d) be a metric space and P be a finite subset of M. Given a real r > 0, the Čech complex Cr (P) is defined to be the nerve of the set of metric balls {B(pi , r) | pi ∈ P}, where B(pi , r) = {x ∈ M | d(pi , x) ≤ r}.
Figure 2.3: (left) Čech complex Cr (P), (right) Rips complex VRr (P).
Observe that if M is Euclidean, the balls considered for the Čech complex are convex and hence all of their intersections are contractible. By Theorem 2.1, the Čech complex in this case is homotopy equivalent to the union of the balls. The Čech complex is related to another complex called the Vietoris-Rips complex which is often used in topological data analysis.
Definition 2.10 (Vietoris-Rips complex). Let (P, d) be a finite metric space. Given a real r > 0, the
Vietoris-Rips (Rips in short) complex is the abstract simplicial complex VRr (P) where a simplex
σ ∈ VRr (P) if and only if d(p, q) ≤ 2r for every pair of vertices of σ.
Notice that the 1-skeleton of VRr (P) determines all of its simplices. It is the completion (in
terms of simplices) of its 1-skeleton; see Figure 2.3. Also, we observe the following fact.
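Because the Rips complex is determined by its 1-skeleton, it is simple to construct: build the graph on P whose edges join points at distance at most 2r, then take all cliques up to a chosen dimension. A sketch (our code, with a hypothetical point set):

```python
from itertools import combinations

def rips(points, dist, r, max_dim=3):
    """Vietoris-Rips complex VR_r(P): a simplex for every subset of points
    with pairwise distance at most 2r (clique completion of the 1-skeleton)."""
    n = len(points)
    K = {frozenset([i]) for i in range(n)}
    for k in range(2, max_dim + 2):
        for idx in combinations(range(n), k):
            if all(dist(points[i], points[j]) <= 2 * r
                   for i, j in combinations(idx, 2)):
                K.add(frozenset(idx))
    return K

euclid = lambda p, q: ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
P = [(0, 0), (1, 0), (0, 1), (5, 5)]
K = rips(P, euclid, r=0.75)           # 2r = 1.5 >= sqrt(2): the triangle appears
assert frozenset({0, 1, 2}) in K
assert all(s == frozenset({3}) for s in K if 3 in s)   # (5, 5) stays isolated
```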
Fact 2.2. Let P be a finite subset of a metric space (M, d) where M satisfies the property that, for
any real r > 0 and two points p, q ∈ M with d(p, q) ≤ 2r, the metric balls B(p, r) and B(q, r) have
non-empty intersection. Then, the 1-skeletons of VRr (P) and Cr (P) coincide.
Notice that if M is Euclidean, it satisfies the condition stated in the above fact and hence for
finite point sets in any Euclidean space, Čech and Rips complexes defined with Euclidean balls
share the same 1-skeleton. However, for a general finite metric space (P, d), it may happen that
for some p, q ∈ P, one has d(p, q) ≤ 2r and B(p, r) ∩ B(q, r) = ∅.
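Because the Rips complex is the clique completion of its 1-skeleton, it can be computed by checking pairwise distances only. A minimal sketch of ours (Euclidean distance, simplices up to dimension 2; names are illustrative):

```python
from itertools import combinations

def rips_complex(points, r, max_dim=2):
    """Vietoris-Rips complex VR_r(P): a simplex is included whenever
    all pairwise distances among its vertices are at most 2r."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    n = len(points)
    simplices = [frozenset([i]) for i in range(n)]
    for k in range(2, max_dim + 2):          # k vertices -> (k-1)-simplex
        for idx in combinations(range(n), k):
            if all(dist(points[i], points[j]) <= 2 * r
                   for i, j in combinations(idx, 2)):
                simplices.append(frozenset(idx))
    return simplices

P = [(0, 0), (1, 0), (0, 1)]
S = rips_complex(P, 0.5)    # the two edges of length 1 appear at 2r = 1
```

At r = 0.5 the hypotenuse (length √2) and the triangle are absent; raising r to 0.75 admits all edges and hence, by clique completion, the triangle as well.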
An easy but important observation is that the Rips and Čech complexes interleave.
Proposition 2.2. Let P be a finite subset of a metric space (M, d). Then,
C_r(P) ⊆ VR_r(P) ⊆ C_{2r}(P).
Proof. The first inclusion is obvious because if there is a point x in the intersection ∩_{i=1}^{k} B(p_i, r), the distances d(p_i, p_j) for every pair (i, j), 1 ≤ i, j ≤ k, are at most 2r. It follows that every simplex {p_1, . . . , p_k} ∈ C_r(P) is also in VR_r(P).
To prove the second inclusion, consider a simplex {p_1, . . . , p_k} ∈ VR_r(P). Since by definition of the Rips complex d(p_i, p_1) ≤ 2r for every p_i, i = 1, . . . , k, we have p_1 ∈ ∩_{i=1}^{k} B(p_i, 2r), so this intersection is non-empty. Then, by definition, {p_1, . . . , p_k} is also a simplex in C_{2r}(P).
Figure 2.4: Every triangle in a Delaunay complex has an empty open circumdisk.
Definition 2.11 (Delaunay simplex; Complex). In the context of a finite point set P ⊂ R^d, a k-simplex σ is Delaunay if its vertices are in P and there is an open d-ball whose boundary contains its vertices and which is empty, that is, contains no point of P. Note that any number of points in P can lie on the boundary of this ball. But, for simplicity, we will assume that only the vertices of σ are on the boundary of its empty ball. A Delaunay complex of P, denoted Del P, is a (geometric) simplicial complex with vertices in P in which every simplex is Delaunay and |Del P| coincides with the convex hull of P, as illustrated in Figure 2.4.
Fact 2.3. Every non-degenerate point set (no d + 2 points are co-spherical) admits a unique
Delaunay complex.
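The empty-circumdisk condition can be checked directly by brute force in the plane. The following sketch is ours and is for illustration only (real implementations use incremental or flip-based algorithms); it enumerates the Delaunay triangles of a small point set in generic position:

```python
from itertools import combinations

def circumcircle(a, b, c):
    """Center and squared radius of the circle through three 2D points,
    or None for (near-)collinear triples."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), (ux - ax) ** 2 + (uy - ay) ** 2

def delaunay_triangles(P):
    """All Delaunay 2-simplices of a planar point set: triples whose
    open circumdisk contains no other point of P."""
    tris = []
    for i, j, k in combinations(range(len(P)), 3):
        cc = circumcircle(P[i], P[j], P[k])
        if cc is None:
            continue
        (ux, uy), r2 = cc
        if all((P[m][0] - ux) ** 2 + (P[m][1] - uy) ** 2 >= r2 - 1e-9
               for m in range(len(P)) if m not in (i, j, k)):
            tris.append((i, j, k))
    return tris

# A triangle with one interior point: the interior point joins each
# edge, while the outer triangle itself fails the empty-disk test.
P4 = [(0, 0), (2, 0), (1, 2), (1, 0.8)]
tris = delaunay_triangles(P4)
```

Here the big triangle (0, 1, 2) is rejected because point 3 lies inside its circumdisk, leaving the three Delaunay triangles that tessellate the convex hull.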
Delaunay complexes are dual to the famous Voronoi diagrams defined below.
Definition 2.12 (Voronoi diagram). Given a finite point set P ⊂ R^d in generic position, the Voronoi diagram Vor(P) of P is the tessellation of the embedding space R^d into convex cells V_p for every p ∈ P where
V_p = {x ∈ R^d | d(x, p) ≤ d(x, q) for all q ∈ P}.
Fact 2.4. For P ⊂ Rd , Del (P) is the nerve of the set of Voronoi cells {V p } p∈P which is a closed
cover of Rd .
The above fact actually provides a duality between the Delaunay complex and the Voronoi diagram. It is expressed by the duality among their faces. Specifically, a Delaunay k-simplex in Del (P) is
dual to a Voronoi (d − k)-face in Vor (P). The Voronoi diagram dual to the Delaunay complex in
Figure 2.4 is shown in Figure 2.5.
The following optimality properties make Delaunay complexes useful for applications.
Fact 2.5. A triangulation of a point set P ⊂ Rd is a geometric simplicial complex whose vertex
set is P and whose simplices tessellate the convex hull of P. Among all triangulations of a point
set P ⊂ R^d, Del P achieves the following optimality criteria:
3. For a simplex in Del P, let its min-ball be the smallest ball that contains the simplex in it.
In all dimensions, Del P minimizes the largest min-ball.
1-skeletons of Delaunay complexes in R^2 are plane graphs and hence Delaunay complexes in R^2 have size Θ(n) for n points. They can be computed in Θ(n log n) time. In R^3, their size grows to Θ(n^2) and they can be computed in Θ(n^2) time. In R^d, d ≥ 3, Delaunay complexes have size Θ(n^⌈d/2⌉) and can be computed in optimal time Θ(n^⌈d/2⌉) [92].
Alpha complex. Alpha complexes are subcomplexes of the Delaunay complexes which are
parameterized by a real α ≥ 0. For a given point set P and α ≥ 0, an alpha complex consists
of all simplices in Del (P) that have a circumscribing ball of radius at most α. It can also be
described alternatively as a nerve. For each point p ∈ P, let B(p, α) denote a closed ball of radius
α centered at p. Consider the closed set D_p^α defined as follows:
D_p^α = B(p, α) ∩ V_p.
The alpha complex Del_α(P) is the nerve of the closed sets {D_p^α}_{p∈P}. Another interpretation for
alpha complex stems from its relation to the Voronoi diagram of the point set P. Alpha complex
contains a k-simplex σ = {p0 , . . . , pk } if and only if ∪ p∈P B(p, α) meets the intersection of Voronoi
cells V_{p_0} ∩ V_{p_1} ∩ · · · ∩ V_{p_k}. Figure 2.5 shows an alpha complex for the point set in Figure 2.4 for a given α. The Voronoi diagram is shown with the dotted segments.
Figure 2.5: Alpha complex of the point set in Figure 2.4 for an α indicated in the figure. The Voronoi diagram of the point set is shown with dotted edges. The triangles and edges in the complex are shown with solid edges; they form a subcomplex of the Delaunay complex.
Definition 2.13 (Weak witness). Let P be a point set with a real-valued function on pairs d : P × P → R and let Q ⊆ P be a finite subset. A simplex σ = {q_1, . . . , q_k} with q_i ∈ Q is weakly witnessed by x ∈ P \ Q if d(q, x) ≤ d(p, x) for every q ∈ {q_1, . . . , q_k} and p ∈ Q \ {q_1, . . . , q_k}.
We now define the witness complex using the notion of weak witnesses.
Definition 2.14 (Witness complex). Let P, Q be point sets as in Definition 2.13. The witness
complex W(Q, P) is defined as the collection of all simplices whose all faces are weakly witnessed
by a point in P \ Q.
Observe that a simplex which is weakly witnessed may not have all its faces weakly witnessed (Exercise 7). This is why the definition above imposes the condition on all faces so that the result is a simplicial complex.
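The following sketch of ours implements Definitions 2.13 and 2.14 directly (function names are illustrative). It also demonstrates the caveat just mentioned: on a line, the edge between two far-apart landmarks is weakly witnessed (vacuously, since no landmark lies outside it), yet one of its vertices is not, so the edge is excluded from the complex:

```python
from itertools import combinations

def weakly_witnessed(sigma, witnesses, Q, d):
    """True if some witness x has every landmark of sigma at least as
    close as every landmark outside sigma (Definition 2.13)."""
    for x in witnesses:
        far = max(d(q, x) for q in sigma)
        if all(d(p, x) >= far for p in Q if p not in sigma):
            return True
    return False

def witness_complex(Q, W, d, max_dim=2):
    """W(Q, P): simplices on the landmark set Q all of whose faces are
    weakly witnessed by some point of the witness set W = P \\ Q."""
    simplices = []
    for k in range(1, max_dim + 2):
        for sigma in combinations(Q, k):
            if all(weakly_witnessed(tau, W, Q, d)
                   for j in range(1, k + 1)
                   for tau in combinations(sigma, j)):
                simplices.append(sigma)
    return simplices

# Landmarks 0 and 10 on a line, a single witness at 1: the edge
# (0, 10) is vacuously weakly witnessed, but the vertex (10,) is not,
# so only the vertex (0,) survives.
S = witness_complex([0.0, 10.0], [1.0], lambda a, b: abs(a - b))
```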
When P = Rd equipped with Euclidean distance and Q is a finite subset of it, we have the
notion of strong witness.
Definition 2.15 (Strong witness). Let Q ⊂ R^d be a finite set. A simplex σ = {q_1, . . . , q_k} with q_i ∈ Q is strongly witnessed by x ∈ R^d if d(q, x) ≤ d(p, x) for every q ∈ {q_1, . . . , q_k} and p ∈ Q \ {q_1, . . . , q_k} and, additionally, d(q_1, x) = · · · = d(q_k, x).
Figure 2.6: A witness complex constructed out of the points in Figure 2.4 where landmarks are
the black dots and the witness points are the hollow dots. The witnesses for five edges and the
triangle are the centers of the six circles; e.g., the triangle q1 q2 q3 and the edge q1 q3 are weakly
witnessed by the points p1 and p2 respectively.
Proposition 2.3. A simplex σ is strongly witnessed if and only if every face τ ≤ σ is weakly
witnessed.
Furthermore, when Q ⊂ Rd , we have some connections of the witness complex to the Delau-
nay complex. By definition, we know the following:
Fact 2.6. Let Q be a finite subset of R^d. Then a simplex σ is in the Delaunay triangulation Del Q if and only if σ is strongly witnessed by a point in R^d.
By combining the above fact with the observation that every simplex in a witness complex is strongly witnessed (Proposition 2.3), we have the following result which was observed by de Silva [113].
Proposition 2.4. Let P ⊂ R^d be a (possibly infinite) point set and Q ⊆ P a finite subset. Then W(Q, P) ⊆ Del Q.
One important implication of the above observation is that the witness complexes for point samples in a Euclidean space are embedded in that space.
The concept of the witness complex has a parallel in the concept of the restricted Delaunay
complex. When the set P in Proposition 2.4 is not necessarily a finite subset, but only a subset X of
Rd , and Q is finite, we can relate W(Q, P) to the restricted Delaunay complex Del|X Q defined as
the collection of Delaunay simplices in Del Q whose Voronoi duals have non-empty intersection
with X.
Proposition 2.5.
Figure 2.7: A graph induced complex shown with bold vertices, edges, and a shaded triangle on the left. The input graph within the shaded triangle is shown on the right. The 3-clique with three different colors (shown inside the shaded triangle on the right) causes the shaded triangle on the left to be in the graph induced complex.
Input graph G(P). The input point set P can be a finite sample of a subset X of a Euclidean space, such as a manifold or a compact subset. In this case, we may consider the input graph G(P) to be the neighborhood graph G^α(P) := (P, E) where there is an edge {p, q} ∈ E if and only if d(p, q) ≤ α. The intuition is that if P is a sufficiently dense sample of X, then G^α(P) captures the local neighborhoods of the points in X. Figure 2.7 shows a graph induced complex for point data in the plane with a neighborhood graph where d is the Euclidean metric. To emphasize the dependence on α we use the notation G^α(P, Q, d) := G(G^α(P), Q, d).
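The neighborhood graph itself is a one-liner; here is a minimal sketch of ours (any metric d can be passed in, and the helper name is illustrative):

```python
from itertools import combinations

def neighborhood_graph(P, alpha, d):
    """G_alpha(P): vertices are the indices of P; {i, j} is an edge
    if and only if d(P[i], P[j]) <= alpha."""
    return {frozenset({i, j})
            for i, j in combinations(range(len(P)), 2)
            if d(P[i], P[j]) <= alpha}

def euclid(p, q):
    """Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

P = [(0, 0), (1, 0), (3, 0)]
E = neighborhood_graph(P, 1.5, euclid)
# Only the pair at distance 1 is connected; (1,0)-(3,0) is at distance 2.
```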
Subsample Q. Of course, the ability of capturing the topology of the sampled space after sub-
sampling with Q depends on the quality of Q. We quantify this quality with a parameter δ > 0.
Definition 2.17. A subset Q ⊆ P is called a δ-sample of a metric space (P, d) if the following condition holds:
• for every point p ∈ P, there is a point q ∈ Q so that d(p, q) ≤ δ.
The subset Q is called δ-sparse if the following condition holds:
• for every pair of distinct points q, q′ ∈ Q, d(q, q′) ≥ δ.
The first condition ensures Q to be a good sample of P with respect to the parameter δ and the second condition enforces that the points in Q cannot be too close relative to the distance δ.
Metric d. The metric d assumed in the metric space (P, d) will be of two types in our discussion below: (i) the Euclidean metric denoted d_E, (ii) the graph metric d_G derived from the input graph G(P) where d_G(p, q) is the shortest path distance between p and q in the graph G(P) assuming its edges have non-negative weights such as their Euclidean lengths.
We state two inference results involving the GIC below. The first result is about reconstructing a surface from its sample. The other result is about inferring the one-dimensional homology group from a sample. We introduce the homology groups in the next section. The reader can skip this result and come back to it after consulting the relevant definitions later. Also, for details, we refer to [124]. In the following theorem, ρ denotes the 'reach' of the manifold, an intrinsic feature w.r.t. which the sampling needs to be dense. We define it more precisely in Definition 6.8 of Chapter 6.
In the next theorem, d_G is the graph metric where the input graph is G^α(P) for some α ≥ 0, constructed with the Euclidean metric that the input P is equipped with.
Instead of stating other homology inference results precisely, we give some empirical results
involving homology groups just to emphasize the advantage of GICs over other complexes in this
respect. Again, the readers unfamiliar with the homology groups can consult the next section.
An empirical example. When equipped with an appropriate metric, the GIC can decipher the
topology from data. It retains the simplicity of the Rips complex as well as the sparsity of the
witness complex. It does not build a Rips complex on the subsample and thus is sparser than
the Rips complex with the same set of vertices. This fact makes a real difference in practice as
experiments show.
Figure 2.8 shows experimental results on two data sets, 40,000 sample points from a Klein
bottle in R4 and 15,000 sample points from the so-called primary circle of natural image data
considered in R25 . The graphs connecting any two points within α = 0.05 unit distance for
Klein bottle and α = 0.6 unit distance for the primary circle were taken as input for the graph
induced complexes. The 2-skeletons of the Rips complexes for these α parameters have 608,200 and 1,329,672,867 simplices respectively. These sizes are too large to carry out fast computations.
Figure 2.8: Comparison results for the Klein bottle in R^4 (top row) and the primary circle in R^25 (bottom row). The estimated β_1 computed from the three complexes is shown on the left, and their sizes are shown on a log scale on the right; images taken from [124].
For comparisons, we constructed the graph induced complex, a sparsified version of the Rips complex (Section 6.2), and the witness complex on the same subsample determined by a parameter δ. The parameter δ is also used in the graph induced complex and the witness complex. The edges in the Rips complex built on the same subsample were of lengths at most α + 2δ. One of the main uses of the sparse complexes in TDA is to infer homology groups (covered in the next section) from samples. To compare the GIC with the sparse Rips and witness complexes, we varied δ and observed the rank of the one-dimensional homology group (β_1). As evident from the plots, the graph induced complex captured β_1 correctly for a significantly wider range of δ (left plots) while its size remained comparable to that of the witness complex (right plots). In some cases, the graph induced complex could capture the correct β_1 with a remarkably small number of simplices. For example, it had β_1 = 2 for the Klein bottle when there were 278 simplices for δ = 0.7 and 154 simplices for δ = 1.0. In both cases the Rips and witness complexes had the wrong β_1 while the Rips complex had a much larger size (log_e-scale plot) and the witness complex had comparable size. This illustrates why the graph induced complex can be a better choice than the Rips and witness complexes.
Constructing a GIC. One may wonder how to efficiently construct the graph induced com-
plexes in practice. Experiments show that the following procedure runs quite efficiently in prac-
tice. It takes advantage of computing nearest neighbors within a range and, more importantly,
computing cliques only in a sparsified graph.
Let the ball B(q, δ) in metric d be called the δ-cover for the point q. A graph induced complex G^α(P, Q, d) where Q is a δ-sparse δ-sample can be built easily by identifying δ-covers with a rather standard greedy (farthest point) iterative algorithm. Let Q_i = {q_1, . . . , q_i} be the point set sampled so far from P. We maintain the invariants that (i) Q_i is δ-sparse and (ii) every point p ∈ P that is in the union of δ-covers ∪_{q∈Q_i} B(q, δ) has its closest point ν(p) ∈ argmin_{q∈Q_i} d(p, q) in Q_i identified. To augment Q_i to Q_{i+1} = Q_i ∪ {q_{i+1}}, we choose a point q_{i+1} ∈ P that is outside the δ-covers ∪_{q∈Q_i} B(q, δ). Certainly, q_{i+1} is at least δ units away from all points in Q_i, thus satisfying the first invariant. For the second invariant, we check every point p in the δ-cover of q_{i+1} and update ν(p) to q_{i+1} if the distance d(p, q_{i+1}) is smaller than the distance d(p, ν(p)). At the end, we obtain a sample Q ⊆ P whose δ-covers cover the entire point set P; thus Q is a δ-sample of (P, d) which is also δ-sparse due to the invariants maintained. Next, we construct the simplices of G^α(P, Q, d). This needs identifying cliques in G^α(P) that have vertices with different closest points in Q. We delete every edge pp′ from G^α(P) where ν(p) = ν(p′). Then, we determine every clique {p_1, . . . , p_k} in the remaining sparsified graph and include the simplex {ν(p_1), . . . , ν(p_k)} in G^α(P, Q, d). The main saving here is that many cliques of the original graph are removed before it is processed for clique computation.
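The procedure above can be sketched as follows. This is our own illustrative code, not the book's implementation: it restricts the clique search to edges and triangles (simplices up to dimension 2), and all names are ours.

```python
def build_gic(P, edges, delta, d):
    """Greedy delta-sparse delta-sample Q of (P, d) with closest-point
    map nu, then simplices of the GIC from cliques of the sparsified
    graph whose vertices have distinct closest sample points."""
    nu, Q = {}, []
    for p in P:
        if p in nu:                   # p already inside some delta-cover
            continue
        Q.append(p)                   # p becomes the next sample q_{i+1}
        for x in P:                   # update nu inside the new delta-cover
            if d(x, p) <= delta and (x not in nu or d(x, p) < d(x, nu[x])):
                nu[x] = p
    # Sparsify: delete edges whose endpoints share the same closest point.
    sparse = [(p, q) for p, q in edges if nu[p] != nu[q]]
    adj = {p: set() for p in P}
    for p, q in sparse:
        adj[p].add(q)
        adj[q].add(p)
    simplices = {frozenset([q]) for q in Q}
    for p, q in sparse:               # edges and triangles (3-cliques)
        simplices.add(frozenset({nu[p], nu[q]}))
        for r in adj[p] & adj[q]:
            if len({nu[p], nu[q], nu[r]}) == 3:
                simplices.add(frozenset({nu[p], nu[q], nu[r]}))
    return Q, simplices

# Four points on a line with the neighborhood graph for alpha = 0.6:
# the sample is {0.0, 1.0} and the GIC is a single edge between them.
P = [0.0, 0.4, 1.0, 1.4]
edges = [(0.0, 0.4), (0.4, 1.0), (1.0, 1.4)]
Q, S = build_gic(P, edges, 0.5, lambda a, b: abs(a, -b) if False else abs(a - b))
```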
Next, we focus on the second topic of this chapter, namely homology groups. They are algebraic structures that quantify topological features of a space. They do not capture all topological aspects of a space in the sense that two spaces with the same homology groups may not be topologically equivalent. However, two spaces that are topologically equivalent must have isomorphic homology groups. It turns out that the homology groups are computationally tractable in many cases, thus making them more attractive in topological data analysis. Before we introduce their definition and variants in Section 2.5, we need the important notions of chains, cycles, and boundaries given in the following section.
Definition 2.18 (Group; Homomorphism; Isomorphism). A set G together with a binary operation '+' is a group if it satisfies the following properties: (i) for every a, b ∈ G, a + b ∈ G, (ii) for every a, b, c ∈ G, (a + b) + c = a + (b + c), (iii) there is an identity element denoted 0 in G so that a + 0 = a for every a ∈ G, and (iv) for every a ∈ G there is an inverse −a ∈ G so that a + (−a) = 0.
• (r + r′) · x = r · x + r′ · x
• r · (x + x′) = r · x + r · x′
• 1 · x = x
• (r · r′) · x = r · (r′ · x)
Essentially, in an R-module, elements can be added and multiplied with coefficients in R.
However, if R is taken as a field k, each non-zero element acquires a multiplicative inverse and
we get a vector space.
Definition 2.24 (Vector space). An R-module V is called a vector space if R is a field. A set of
elements {g1 , . . . , gk } is said to generate the vector space V if every element a ∈ V can be written
as a = α1 g1 + . . . + αk gk for some α1 , . . . , αk ∈ R. The set {g1 , . . . , gk } is called a basis of V if
every a ∈ V can be written in the above way uniquely. All bases of V have the same cardinality
which is called the dimension of V. We say a set {g1 , . . . , gm } ⊆ V is independent if the equation
α1 g1 + . . . + αm gm = 0 can only be satisfied by setting αi = 0 for i = 1, . . . , m.
Fact 2.7. A basis of a vector space is a generating set of minimal cardinality and an independent
set of maximal cardinality.
2.4.2 Chains
Let K be a simplicial k-complex with m_p number of p-simplices, 0 ≤ p ≤ k. A p-chain c in K is a formal sum of p-simplices added with some coefficients, that is, c = Σ_{i=1}^{m_p} α_i σ_i where σ_i are the p-simplices and α_i are the coefficients. Two p-chains c = Σ_i α_i σ_i and c′ = Σ_i α′_i σ_i can be added to obtain another p-chain

c + c′ = Σ_{i=1}^{m_p} (α_i + α′_i) σ_i.
In general, coefficients can come from a ring R with its associated additions making the chains
constituting an R-module. For example, these additions can be integer additions where the coef-
ficients are integers; e.g., from two 1-chains (edges) we get
(2e1 + 3e2 + 5e3 ) + (e1 + 7e2 + 6e4 ) = 3e1 + 10e2 + 5e3 + 6e4 .
Notice that while writing a chain, we only write the simplices that have non-zero coefficient in
the chain. We follow this convention all along. In our case, we will focus on the cases where the
coefficients come from a field k. In particular, we will mostly be interested in k = Z2 . This means
that the coefficients come from the field Z2 whose elements can only be 0 or 1 with the modulo-2
additions 0 + 0 = 0, 0 + 1 = 1, and 1 + 1 = 0. This gives us Z2 -additions of chains, for example,
we have
(e1 + e3 + e4 ) + (e1 + e2 + e3 ) = e2 + e4 .
Observe that p-chains with Z2 -coefficients can be treated as sets: the chain e1 + e3 + e4 is the
set {e1 , e3 , e4 }, and Z2 -addition between two chains is simply the symmetric difference between
the corresponding sets.
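This set view makes Z2-chain addition a one-liner. A sketch of ours, reproducing the addition above:

```python
def add_chains(c1, c2):
    """Z2-addition of chains represented as sets of simplices: a
    simplex appearing in both chains cancels since 1 + 1 = 0, which
    is exactly the symmetric difference of the sets."""
    return c1 ^ c2

c = {"e1", "e3", "e4"}
cp = {"e1", "e2", "e3"}
assert add_chains(c, cp) == {"e2", "e4"}   # matches the example above
assert add_chains(c, c) == set()           # c + c = 0
```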
From now on, unless specified otherwise, we will consider all chain additions to be Z2 -
additions. One should keep in mind that one can have parallel concepts for coefficients and
additions coming from integers, reals, rationals, fields, and other rings. Under Z2 -additions, we
have

c + c = Σ_{i=1}^{m_p} 0 · σ_i = 0.
Below, we show addition of chains shown in Figure 2.9:
0-chain: ({b} + {d}) + ({d} + {e}) = {b} + {e} (left)
1-chain: ({a, b} + {b, d}) + ({b, c} + {b, d}) = {a, b} + {b, c} (left)
2-chain: ({a, b, c} + {b, c, e}) + ({b, c, e}) = {a, b, c} (right)
Figure 2.9: The two complexes, (left) and (right), on the vertices a, b, c, d, e referred to in the chain examples.
The p-chains with the Z2-additions form a group where the identity is the chain 0 = Σ_{i=1}^{m_p} 0 · σ_i,
and the inverse of a chain c is c itself since c + c = 0. This group, called the p-th chain group, is
denoted C p (K). We also drop the complex K and use the notation C p when K is clear from the
context. We do the same for other structures that we define afterward.
For a p-simplex σ = {v_0, . . . , v_p}, its boundary is the (p − 1)-chain
∂_p σ = Σ_{i=0}^{p} {v_0, . . . , v̂_i, . . . , v_p},
where v̂_i indicates that the vertex v_i is omitted. Informally, we can view ∂_p as a map that sends a p-simplex σ to the (p − 1)-chain that has non-zero coefficients only on σ's (p − 1)-faces, also referred to as σ's boundary. At this point, it is instructive to note that the boundary of a vertex is empty, that is, ∂_0 σ = ∅. Extending ∂_p to a p-chain, we obtain a homomorphism ∂_p : C_p → C_{p−1} called the boundary operator that produces a (p − 1)-chain when applied to a p-chain:
∂_p c = Σ_{i=1}^{m_p} α_i (∂_p σ_i) for a p-chain c = Σ_{i=1}^{m_p} α_i σ_i ∈ C_p.
Again, we note the special case of p = 0 when we get ∂_0 c = ∅. The chain group C_{−1} has only a single element, which is its identity 0. On the other end, we also assume that if K is a k-complex, then C_p is 0 for p > k.
Consider the complex in Figure 2.9(right). For the 2-chain abc + bcd we get
∂_2(abc + bcd) = (ab + bc + ca) + (bc + cd + db) = ab + ca + cd + db.
It means that from the two triangles sharing the edge bc, the boundary operator returns the four
boundary edges that are not shared. Similarly, one can check that the boundary of the 2-chain consisting of all three triangles in Figure 2.9(right) contains all 7 edges:
∂_2(abc + bcd + bce) = ab + bc + ca + be + ce + bd + dc.
In particular, the edge bc does not get cancelled because an odd number of triangles (all three) adjoin it.
One important property of the boundary operator is that, applying it twice produces an empty
chain.
Proposition 2.8. For p > 0 and any p-chain c, ∂ p−1 ◦ ∂ p (c) = 0.
Proof. Observe that ∂_0 is a zero map by definition. Also, for a k-complex, ∂_p operates on a zero element for p > k by definition. Then, it is sufficient to show that, for 1 ≤ p ≤ k, ∂_{p−1} ◦ ∂_p(σ) = 0 for a p-simplex σ. Observe that ∂_p σ is the set of all (p − 1)-faces of σ and every (p − 2)-face of σ is contained in exactly two (p − 1)-faces. Thus, ∂_{p−1}(∂_p σ) = 0.
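With the set representation of Z2-chains, the boundary operator and Proposition 2.8 can be checked directly. A sketch of ours, reproducing the abc + bcd computation above:

```python
from itertools import combinations

def boundary_simplex(sigma):
    """Z2-boundary of a simplex (a frozenset of vertices): the set of
    its codimension-one faces; the boundary of a vertex is empty."""
    if len(sigma) == 1:
        return set()
    return {frozenset(t)
            for t in combinations(sorted(sigma), len(sigma) - 1)}

def boundary_chain(chain):
    """Extend the boundary linearly over Z2: symmetric difference of
    the individual simplex boundaries, so shared faces cancel."""
    out = set()
    for sigma in chain:
        out ^= boundary_simplex(sigma)
    return out

abc, bcd = frozenset("abc"), frozenset("bcd")
b = boundary_chain({abc, bcd})
# The shared edge bc cancels, leaving the four unshared edges.
assert b == {frozenset("ab"), frozenset("ac"),
             frozenset("bd"), frozenset("cd")}
assert boundary_chain(b) == set()   # Proposition 2.8: boundary twice is 0
```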
Extending the boundary operator to the chains groups, we obtain the following sequence of
homomorphisms satisfying Proposition 2.8 for a simplicial k-complex; such a sequence is also
called a chain complex:
0 = C_{k+1} --∂_{k+1}--> C_k --∂_k--> C_{k−1} --∂_{k−1}--> C_{k−2} --> · · · --> C_1 --∂_1--> C_0 --∂_0--> C_{−1} = 0.    (2.1)
Fact 2.8.
1. For p ≥ −1, C_p is a vector space because the coefficients are drawn from the field Z2: it has a basis so that every element can be expressed uniquely as a sum of the elements in the basis.
2. There is a basis for C_p in which every p-simplex forms a basis element because any p-chain is a unique subset of the p-simplices. The dimension of C_p is therefore m_p, the number of p-simplices. When p = −1 or p ≥ k + 1, C_p is trivial with dimension 0. In Figure 2.9(right), {abc, bcd, bce} is a basis for C_2 and so is {abc, (abc + bcd), bce}.
Figure 2.10: Each individual red, blue, green cycle is not a boundary because they do not bound any 2-chain. However, the sum of the two red cycles, and the sum of the two blue cycles, each form a boundary cycle because they bound 2-chains consisting of reddish and bluish triangles respectively.
1. C_0 = Z_0 and B_k = 0.
2. For p ≥ 0, B_p ⊆ Z_p ⊆ C_p.
2.5 Homology
The homology groups classify the cycles in a cycle group by putting together those cycles in the same class that differ by a boundary. From a group-theoretic point of view, this is done by taking the quotient of the cycle group with the boundary group, which is allowed since the boundary group is a subgroup of the cycle group.
Definition 2.26 (Homology group). For p ≥ 0, the p-th homology group is the quotient group
H p = Z p /B p . Since we use a field, namely Z2 , for coefficients, H p is a vector space and its
dimension is called the p-th Betti number, denoted by β p :
β p := dim H p .
Figure 2.11: Complex K of a tetrahedron: (a) Vertices, (b) spanning tree of the 1-skeleton, (c)
1-skeleton, (d) 2-skeleton of K.
Example. Consider the boundary complex K of a tetrahedron which consists of four triangles, six edges, and four vertices. Consider the 0-skeleton K^0 of K which consists of the four vertices only. The classes of all four vertices are necessary to generate H_0(K^0). Therefore, these four classes form a basis of H_0(K^0). However, one can verify that H_0(K^1) for the 1-skeleton K^1 is generated by the class of any one of the four vertices because all four vertices belong to the same class when we consider K^1. This exemplifies the fact that the rank of H_0(K) captures the number of connected components of a complex K.
The 1-skeleton K^1 of the tetrahedron is a graph with four vertices and six edges. Consider a spanning tree with any vertex and the three edges adjoining it as in Figure 2.11(b). There is no 1-cycle in this configuration. However, each of the other three edges creates a new 1-cycle which is not a boundary because there is no triangle in K^1. These three cycles c_1, c_2, c_3 as indicated in Figure 2.11(c) form their own classes in H_1(K^1). Observe that the 1-cycle at the base can be written as a combination of the other three and thus all classes in H_1(K^1) can be generated by only three classes [c_1], [c_2], [c_3] and no fewer. Hence, these three classes form a basis of H_1(K^1). To develop more intuition, consider a simplicial surface M without boundary embedded in R^3. If the surface has genus g, that is, g tunnels and handles in the complement space, then H_1(M) has dimension 2g (Exercise 4).
The sum of the four triangles in K makes a 2-cycle c because its boundary is 0. Since K does not have any 3-simplex (the solid tetrahedron is not part of the complex), this 2-cycle cannot be added to any 2-boundary other than 0 to form its class. Therefore, the homology class of c is c itself, [c] = {c}. There is no other non-trivial 2-cycle in K. Therefore, H_2(K) is generated by [c] alone. Its dimension is only one. If the solid tetrahedron is included in the complex, c becomes a boundary element, and hence [c] = [0]. In that case, H_2(K) = 0. Intuitively, one may think of H_2(K) as capturing the voids in a complex K embedded in R^3. (Convince yourself that H_1(K) = 0 no matter whether the tetrahedron belongs to K or not.)
Fact 2.10. For p ≥ 0,
1. H_p is a vector space (when defined over Z2),
2. H_p may not be a vector space when defined over Z, the integer coefficients; in this case, there could be torsion subgroups,
3. the Betti number β_p = dim H_p is given by β_p = dim Z_p − dim B_p,
4. there are exactly 2^{β_p} homology classes in H_p when defined with Z2 coefficients.
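Item 3 gives a direct way to compute Betti numbers: reduce the boundary matrices over Z2 and count ranks, using β_p = (m_p − rank ∂_p) − rank ∂_{p+1}. A self-contained sketch of ours (simplices as sorted vertex tuples; names are illustrative), applied to the tetrahedron example above:

```python
from itertools import combinations

def z2_rank(rows):
    """Rank over Z2 of a matrix whose rows are encoded as int bitmasks,
    by Gaussian elimination: pick a nonzero row, use its lowest set bit
    as the pivot, and clear that bit from every other row."""
    rank = 0
    rows = [r for r in rows if r]
    while rows:
        r = rows.pop()
        rank += 1
        low = r & -r
        rows = [x ^ r if x & low else x for x in rows]
        rows = [x for x in rows if x]
    return rank

def betti_numbers(simplices, top_dim):
    """beta_p = (m_p - rank d_p) - rank d_{p+1} over Z2 coefficients."""
    by_dim = {p: [s for s in simplices if len(s) == p + 1]
              for p in range(top_dim + 1)}
    def rank_d(p):                    # rank of the boundary map d_p
        if p <= 0 or p > top_dim or not by_dim[p]:
            return 0
        index = {f: i for i, f in enumerate(by_dim[p - 1])}
        rows = []
        for s in by_dim[p]:
            bits = 0
            for f in combinations(s, p):             # (p-1)-faces of s
                bits |= 1 << index[f]
            rows.append(bits)
        return z2_rank(rows)
    return [len(by_dim[p]) - rank_d(p) - rank_d(p + 1)
            for p in range(top_dim + 1)]

# Boundary complex of the tetrahedron: all proper faces of {0,1,2,3}.
K = [s for k in (1, 2, 3) for s in combinations(range(4), k)]
print(betti_numbers(K, 2))    # [1, 0, 1]: one component, no tunnel, one void
```

Adding the solid tetrahedron as a 3-simplex fills the void: the same routine then reports β_2 = 0, matching the discussion above.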
Figure 2.12: Induced homology by simplicial map: the simplicial map f obtained by the vertex map a → e, b → e, c → g, d → g induces a map at the homology level f_∗ : H_1(K_1) → H_1(K_2) which takes the only non-trivial class, created by the empty triangle abc, to zero even though H_1(K_1) ≅ H_1(K_2). Another simplicial map K_2 → K_3 destroys the single homology class born by the empty triangle egh in K_2.
Definition 2.27 (Chain map). Let f : K_1 → K_2 be a simplicial map. The chain map f_# : C_p(K_1) → C_p(K_2) corresponding to f is defined as follows. If c = Σ_i α_i σ_i is a p-chain, then f_#(c) = Σ_i α_i τ_i where
τ_i = f(σ_i) if f(σ_i) is a p-simplex in K_2, and τ_i = 0 otherwise.
For example, in Figure 2.12, the 1-cycle bc+cd+db in K1 is mapped to the 1-chain eg+eg = 0
by the chain map f# .
Proposition 2.9. Let f : K_1 → K_2 be a simplicial map. Let ∂_p^{K_1} and ∂_p^{K_2} denote the boundary homomorphisms in dimension p ≥ 0. Then, the induced chain maps commute with the boundary homomorphisms, that is, f_# ◦ ∂_p^{K_1} = ∂_p^{K_2} ◦ f_#.
The statement in the above proposition can also be represented with the following diagram,
which we say commutes since starting from the top left corner, one reaches to the same chain at
the lower right corner using both paths–first going right and then down, or first going down and
then right (see Definition 3.15 in the next chapter).
                 f_#
  C_p(K_1) ---------------> C_p(K_2)            (2.2)
      |                         |
  ∂_p^{K_1}                 ∂_p^{K_2}
      v                         v
 C_{p−1}(K_1) ------------> C_{p−1}(K_2)
                 f_#
For example, in Figure 2.12, we have f_#(c = ab + bd + da) = 0 and ∂_p^{K_1}(c) = 0. Therefore,
∂_p^{K_2}( f_#(c)) = ∂_p^{K_2}(0) = 0 = f_#(0) = f_#(∂_p^{K_1}(c)).
Since B_p(K_1) ⊆ Z_p(K_1), we have that f_#(B_p(K_1)) ⊆ f_#(Z_p(K_1)). Thus, the induced map in the quotient space, namely,
f̄_# : Z_p(K_1)/B_p(K_1) → f_#(Z_p(K_1))/ f_#(B_p(K_1)),
is well defined. Furthermore, by the commutativity of Diagram (2.2), f_#(Z_p(K_1)) ⊆ Z_p(K_2) and f_#(B_p(K_1)) ⊆ B_p(K_2), which gives an induced homomorphism in the homology groups:
f_∗ : H_p(K_1) → H_p(K_2), [c] ↦ [ f_#(c)].
Fact 2.11. For two contiguous maps f1 : K1 → K2 and f2 : K1 → K2 , the induced maps
f1 ∗ : H p (K1 ) → H p (K2 ) and f2 ∗ : H p (K1 ) → H p (K2 ) are equal.
Fact 2.12. H_p(K, K_0) ≅ H_p(K^∗) for all p > 0 and β_0(H_0(K, K_0)) = β_0(H_0(K^∗)) − 1.
For example, consider K to be an edge {a, b, ab} with K_0 = {a, b} as in Figure 2.13(left). The 1-chain ab is a relative 1-cycle because ∂_1(ab) = a + b ∈ C_0(K_0) and hence ∂_1^{K,K_0}([ab]) is 0 in C_0(K, K_0). This is indicated by the presence of the loop in the coned space.
Figure 2.13: Illustration for relative homology: the subcomplex K_0 consists of (left) vertices a and b, (right) vertices a, b, c, and the edge ab; the coned complex K^∗ is indicated with a coning from a dummy vertex x.
Now, consider K to be a triangle {a, b, c, ab, ac, bc, abc} with K_0 = {a, b, c, ab} as in Figure 2.13(right). The 1-chains bc and ac both are relative 1-cycles because ∂_1(bc) = b + c ∈ C_0(K_0) and hence ∂_1^{K,K_0}([bc]) is 0 in C_0(K, K_0); similarly, ∂_1^{K,K_0}([ac]) = 0. The 1-chain ab is of course a relative 1-cycle because it is already 0 as a relative chain. Therefore, the relative 1-cycle group Z_1(K, K_0) has a basis {[bc], [ac]}. The relative 1-boundary group B_1(K, K_0) is given by ∂_2^{K,K_0}(abc) = [ab] + [bc] + [ac] = [bc] + [ac]. The relative homology group H_1(K, K_0) has one non-trivial class, namely the class of either [bc] or [ac] but not both because [bc] + [ac] is a relative boundary.
Definition 2.28 (Singular simplex). A singular p-simplex for a topological space X is defined as
a map σ : ∆ p → X.
Notice that the map σ need not be injective and thus ∆ p may be ‘squashed’ arbitrarily in its
image. Nevertheless, we can still have a notion of the chains, boundaries, and cycles which are
the main ingredients for defining a homology group called the singular homology of X.
The boundary of a singular p-simplex σ is given by ∂σ = τ_0 + τ_1 + . . . + τ_p where τ_i : (∂∆_p)_i → X is the restriction of the map σ to the i-th facet (∂∆_p)_i of ∆_p.
A p-chain is a sum of singular p-simplices with coefficients from integers, reals, or some appropriate rings. As before, under our assumption of Z2 coefficients, a singular p-chain is given by Σ_i α_i σ_i where α_i = 0 or 1. The boundary of a singular p-chain is defined the same way as we did for simplicial chains, the only difference being that we have to accommodate infinite chains:
∂(c_p = σ_1 + σ_2 + . . .) = ∂σ_1 + ∂σ_2 + . . .
We get the usual chain complex with ∂_{p−1} ◦ ∂_p = 0 for all p > 0,
· · · --∂_{p+1}--> C_p --∂_p--> C_{p−1} --∂_{p−1}--> · · ·
and can define the cycle and boundary groups as Z p = ker ∂ p and B p = im ∂ p+1 . We have the
singular homology defined as the quotient group H p = Z p /B p .
A useful fact is that singular and simplicial homology coincide when both are well defined.
Theorem 2.10. Let X be a topological space with a triangulation K, that is, the underlying space |K| is homeomorphic to X. Then H_p(K) ≅ H_p(X) for any p ≥ 0.
Note that the above theorem also implies that different triangulations of the same topological
space give rise to isomorphic simplicial homology.
2.5.4 Cohomology
There is a dual concept to homology called cohomology. Although cohomology can be defined with coefficients in rings as in the case of homology groups, we will mainly focus on defining it over a field, in which case it becomes a vector space.
A vector space V defined over a field k admits a dual vector space V^∗ whose elements are linear functions φ : V → k. These linear functions themselves can be added and multiplied over k, forming the dual vector space V^∗. The homology group H_p(K) as we defined in Definition 2.26 over the field Z2 is a vector space and hence admits a dual vector space which is usually denoted as Hom(H_p(K), Z2). The p-th cohomology group, denoted H^p(K), is not defined as this dual space; however, over the coefficient field Z2 one has that H^p(K) is isomorphic to Hom(H_p(K), Z2), and H^p(K) is also defined with spaces of linear maps.
Also, verify that φ(c + c0 ) = φ(c) + φ(c0 ) satisfying the property of group homomorphism. For a
chain c, the particular cochain that assigns 1 to a simplex if and only if it has a non-zero coefficient
in c, is called its dual cochain c∗ . The p-cochains form a cochain group C p dual to C p where the
addition is defined by (φ + φ0 )(c) = φ(c) + φ0 (c) by taking Z2 -addition on the right. We can also
define a scalar multiplication (αφ)(c) = αφ(c) by using the Z2 -multiplication. This makes C p a
vector space.
Similar to boundaries of chains, we have the notion of coboundaries of cochains δ^p : C^p → C^{p+1}. Specifically, for a p-cochain φ, its (p + 1)-coboundary is the homomorphism δφ : C p+1 → Z2 defined as δφ(c) = φ(∂c) for any (p + 1)-chain c. Therefore, the coboundary operator δ takes a p-cochain and produces a (p + 1)-cochain, giving the following sequence for a simplicial k-complex:

0 = C^{−1} −δ^{−1}→ C^0 −δ^0→ C^1 −δ^1→ · · · −δ^{k−1}→ C^k −δ^k→ C^{k+1} = 0
The set of p-coboundaries forms the coboundary group (a vector space) B^p , where the group addition and scalar multiplication are inherited from C^p .
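Over Z2 , a cochain is determined by its support, so the evaluation δφ(c) = φ(∂c) can be carried out mechanically on small complexes. Below is a minimal Python sketch (the helper names are ours, not notation from the text); simplices are written as sorted tuples of vertex ids.

```python
def boundary(simplex):
    """The (p-1)-faces of a p-simplex given as a sorted tuple of vertex ids."""
    return [simplex[:k] + simplex[k + 1:] for k in range(len(simplex))]

def coboundary(phi_support, cofaces):
    """Support of delta(phi) over Z2: the (p+1)-simplices tau with
    phi(boundary of tau) = 1, i.e., those having an odd number of faces
    in the support of phi."""
    return {tau for tau in cofaces
            if sum(face in phi_support for face in boundary(tau)) % 2 == 1}
```

For instance, on the triangle (0, 1, 2), the 1-cochain supported on the single edge (0, 1) has a non-zero coboundary, while the one supported on (0, 1) and (0, 2) is a cocycle.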
Figure 2.14: Illustration for cohomology: (i) and (iii) the 1-cochain with support on the solid thick edges is a 1-cocycle which is not a 1-coboundary, so it constitutes a non-trivial class in H^1 ; the 1-cochain with support on the dashed edges constitutes a cohomologous class. (ii) The 1-cochain with support on the solid thick edges is a 1-cocycle which is also a 1-coboundary and hence belongs to a trivial class.
Now we come to cocycles, the dual notion to cycles. A p-cochain φ is called a p-cocycle if its coboundary δφ is the zero homomorphism. The set of p-cocycles forms a group Z^p (a vector space) where again the addition and scalar multiplication are induced by those in C^p .
Similar to the boundary operator ∂, the coboundary operator δ satisfies δ^p ◦ δ^{p−1} = 0. Consequently B^p ⊆ Z^p , and the p-th cohomology group is defined as the quotient H^p (K) = Z^p /B^p .
Example. Consider the three complexes in Figure 2.14. In the following discussion, for convenience, we refer to the set of p-simplices on which a p-cochain evaluates to 1 as the support of the cochain. The 1-cochain φ with support on the edge ac in Figure 2.14(i) is a cocycle because δ^1 φ = 0, as there is no triangle and hence no non-zero 2-cochain. It is also not a coboundary because there is no 0-cochain φ′ (an assignment of 0s and 1s to the vertices) such that simultaneously

δ^0 φ′ (ac) = φ′ (a + c) = 1 = φ(ac),
δ^0 φ′ (ab) = φ′ (a + b) = 0 = φ(ab),
δ^0 φ′ (bc) = φ′ (b + c) = 0 = φ(bc).
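The non-existence of such a φ′ can also be confirmed by brute force, since there are only 2^3 candidate 0-cochains on the three vertices. A small sketch (with our own helper names):

```python
from itertools import product

def is_coboundary(target, vertices):
    """Is the 1-cochain `target` (a map edge -> Z2) the 0-coboundary of some
    0-cochain?  Checks all 2^|vertices| assignments of 0/1 to the vertices."""
    for bits in product([0, 1], repeat=len(vertices)):
        phi0 = dict(zip(vertices, bits))
        # delta0 phi0 (uv) = phi0(u) + phi0(v) over Z2
        if all((phi0[u] + phi0[v]) % 2 == val for (u, v), val in target.items()):
            return True
    return False
```

The cochain with support {ac} on the hollow triangle is not a coboundary, while the one with support {ab, ac} discussed next for Figure 2.14(ii) is (take the 0-cochain assigning 1 to vertex a).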
The 1-cochain φ with support on the edges ab and ac in Figure 2.14(ii) is a 1-cocycle because δ^1 φ(abc) = φ(ab + ac + bc) = 0. Notice that now a cochain with support only on the single edge ac cannot be a cocycle because of the presence of the triangle abc. The 1-cochain φ is also a 1-coboundary because the 0-cochain assigning 1 to the vertex a (and 0 elsewhere) produces φ as its coboundary.
Similarly, verify that the 1-cochain φ with support on the edges cd and ce in Figure 2.14(iii) is a cocycle but not a coboundary. Thus, the class [φ] is non-trivial in the 1-dimensional cohomology H^1 . Any other non-trivial class is cohomologous to it. For example, the class [φ′ ] where φ′ has support on the edges bf and bg is cohomologous to [φ]. This follows from the fact that [φ] + [φ′ ] = [φ + φ′ ] = [0], because φ + φ′ is a 1-coboundary, obtained as the coboundary of the 0-cochain assigning 1 to the vertices a, b, and c.
Similar to the homology groups, a simplicial map f : K1 → K2 also induces a homomorphism f ∗ between the two cohomology groups, but in the opposite direction. To see this, consider the chain map f# induced by f (Definition 2.27). Then, a cochain map f # : C^p (K2 ) → C^p (K1 ) is defined as f # (φ)(c) = φ( f# (c)). The cochain map f # in turn defines the induced homomorphism between the respective cohomology groups. We will use the following fact in Section 4.2.1.
Exercises
1. Suppose we have a collection of sets U = {Uα }α∈A where there exists an element U ∈ U
that contains all other elements in U. Show that the nerve complex N(U) is contractible to
a point.
2. Given a parameter α and a set of points P ⊂ Rd , show that the alpha complex Del α (P) is contained in the intersection of the Delaunay complex and the Čech complex at scale α; that is, Del α (P) ⊆ Del (P) ∩ Cα (P).
3. Let K be the simplicial complex of a tetrahedron. Write a basis for the chain groups C1 ,
C2 , boundary groups B1 , B2 , and cycle groups Z1 , Z2 . Write the boundary matrix repre-
senting the boundary operator ∂2 with rows and columns representing bases of C1 and C2
respectively.
4. Let K be a triangulation of an orientable surface without boundary that has genus g. Prove
that β1 (K) = 2g.
6. We state the nerve theorem (Theorem 2.1) for covers where either all cover elements are
closed or all cover elements are open. Show that the theorem does not hold if we mix open
and closed elements in the cover.
7. Give an example where a simplex which is weakly witnessed may not have all its faces
weakly witnessed. Show that (i) W(Q, P0 ) ⊆ W(Q, P) for P0 ⊆ P, (ii) W(Q0 , P) may not be
a subcomplex of W(Q, P) where Q0 ⊆ Q.
8. Consider Definition 2.16 for Graph induced complex. Let VR(G) be the clique complex
given by the input graph G(P). Assume that the map ν : P → 2Q sends every point to a
singleton under input metric d. Then, ν : P → ν(P) is a well defined vertex map. Prove that
the vertex map ν : P → Q extends to a simplicial map ν̄ : VR(G) → G(G(P), Q, d). Also,
show that every simplicial complex K(Q) with the vertex set Q for which ν̄ : VR(G) →
K(Q) becomes simplicial must contain G(G(P), Q, d).
10. Consider a complex K = {a, b, c, ab, bc, ca, abc}. Enumerate all elements in the 1-chain,
1-cycle, 1-boundary groups defined on K under Z2 coefficient. Do the same for cochains,
cocycles, and coboundaries.
12. Prove that ∂ p−1 ◦ ∂ p = 0 for relative chain groups and also δ p ◦ δ p−1 = 0 for cochain groups.
Chapter 3
Topological Persistence
Suppose we have a point cloud data P sampled from a 3D model. A quantified summary of the
topological features of the model that can be computed from this sampled representation helps in
further processing such as shape analysis in geometric modeling. Persistent homology offers this
avenue as Figure 3.1 illustrates. For further explanation, consider P sampled from a curve in R2 as
in Figure 3.3. Our goal is to get the information that the sampled space had two loops, one bigger
and more prominent than the other. The notion of persistence captures this information. Consider
the distance function r : R2 → R defined over R2 where r(x) equals d(x, P), that is, the minimum
distance of x to the points in P. Now let us look at the sublevel sets of r, that is, r−1 [−∞, a]
for some a ∈ R+ ∪ {0}. These sublevel sets are unions of closed balls of radius a centered at the points. We can observe from Figure 3.3 that if we increase a starting from zero, we come across different holes surrounded by the union of these balls which ultimately get filled up at different times. However, the two holes corresponding to the original two loops persist longer than the others. We can abstract out this observation by looking at how long a feature (homological class) survives when we scan over the increasing sublevel sets. This weeds out the ‘false’ features (noise) from the true ones. The notion of persistent homology formalizes and discretizes this idea – it takes a function defined on a topological space (simplicial complex) and quantifies the changes in homology classes as the sublevel sets (subcomplexes) grow with increasing value of the function.

Figure 3.1: Persistence barcodes computed from a point cloud data. The barcode on the right shows a single long bar for H0 signifying one connected component, eight long bars for H1 signifying eight fundamental classes, two for each of the four ‘through holes’, and a single long bar for H2 signifying the connected closed surface; picture taken from [135].
There are two predominant scenarios where persistence appears though in slightly different
contexts. One is when the function is defined on a topological space which requires considering
singular homology groups of the sublevel sets. The other is when the function is defined on a
simplicial complex and the sequence of sublevel sets are implicitly given by a nested sequence
of subcomplexes called a filtration. This involves simplicial homology. Section 3.1 introduces persistence in both of these contexts, though we focus mainly on the simplicial setting, which is most commonly used for computational purposes.
The birth and death of homological classes give rise to intervals during which a class remains alive. These intervals, together called a barcode, summarize the topological persistence of a filtration; see e.g. Figure 3.1. An equivalent notion, called the persistence diagram, plots the intervals as
points in the extended plane R̄2 := (R ∪ {±∞})2 ; specifically, the birth and death constitute the x-
and y-coordinates of a point. The stability of the persistence diagrams against the perturbation of
the functions that generate the filtrations is an important result. It makes topological persistence
robust against noise. When filtrations are given without any explicit mention of a function, we
can still talk about the stability of the persistence diagrams with respect to the so-called interleav-
ing distance between the induced persistence modules. Sections 3.2 and 3.4 are devoted to these
concepts.
The algorithms that compute the persistence diagram from a given filtration are presented
in Section 3.3. First, we introduce it assuming that the input is presented combinatorially with
simplices added one at a time in a filtration. The algorithm pairs simplices, one creating and the
other destroying an interval. Then, this pairing is translated into matrix operations assuming that
the input is a boundary matrix representing the filtration. A more efficient version of the algorithm
is obtained by some simple but effective modification.
Finally, we consider the case of a piecewise linear (PL) function on a simplicial complex
and derive a filtration out of it from which the actual persistence of the input PL function can be
computed. This is presented in Section 3.5.
3.1 Filtrations and persistence

3.1.1 Space filtration

Let T be a topological space and let f : T → R be a function on it. For a ∈ R, the sublevel set Ta := f −1 (−∞, a] grows as a increases; in particular,

Ta ⊆ Tb for a ≤ b.
Now consider a sequence of reals a1 ≤ a2 ≤ · · · ≤ an which are often chosen to be critical values where the homology groups of the sublevel sets change, as illustrated in Figure 3.2. Considering the sublevel sets at these values and a dummy value a0 = −∞ with Ta0 = ∅, we obtain a nested sequence of subspaces, called a (space) filtration:

∅ = Ta0 ⊆ Ta1 ⊆ · · · ⊆ Tan .

Figure 3.2 shows an example of the inclusions of the sublevel sets. The inclusions in a filtration
induce linear maps in the singular homology groups of the subspaces involved. So, if ι : Tai →
Ta j , i ≤ j, denotes the inclusion map x 7→ x, we have an induced homomorphism
h_p^{i,j} = ι∗ : H p (Tai ) → H p (Ta j )   (3.2)
Figure 3.2: Persistence of a function on a topological space that has five critical values: (a) Ta1 :
only a new class in H0 is created, (b) Ta2 : two new independent classes in H1 are created, (c) Ta3 :
one of the two classes in H1 dies, (d) Ta4 : the single remaining class in H1 dies, (e) Ta5 : a new
class in H2 is created.
It is worthwhile to mention that writing a group as 0 means that it is a trivial group containing only the identity element 0. The homomorphism h_p^{i,j} sends the homology classes of the sublevel set Tai to those of the sublevel set Ta j . Some of these classes may die (become trivial) while the others survive. The image im h_p^{i,j} contains this information.
The inclusions of sublevel sets give rise to persistence also in the context of point clouds, a
common input form in data analysis.
Point cloud. For a point set P in a metric space (M, d), define the distance function f : M → R, x 7→ d(x, P) := minq∈P d(x, q). Observe that the sublevel sets f −1 (−∞, a] are unions of closed metric balls of radius a centered at the points of P. Now we have exactly the same setting as described for general topological spaces above, where T is replaced with M and the sublevel sets Ta by unions of metric balls that grow with increasing value of a. Figure 3.3 illustrates an example where M is the Euclidean plane R2 .
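The distance function and its sublevel sets are straightforward to evaluate pointwise. A minimal Python sketch (the function names are ours):

```python
from math import dist  # Euclidean distance, Python 3.8+

def dist_to_sample(x, P):
    """f(x) = d(x, P) = min over p in P of d(x, p)."""
    return min(dist(x, p) for p in P)

def in_sublevel(x, P, a):
    """Does x lie in the sublevel set f^{-1}(-inf, a], i.e., in the union of
    closed balls of radius a centered at the points of P?"""
    return dist_to_sample(x, P) <= a
```

For example, with samples at (0, 0) and (2, 0), the point (1, 0) enters the sublevel set exactly when a reaches 1.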
Figure 3.3: Noisy sample of a curve with two loops and the growing sublevel sets of the distance function to the sample points: the larger loop, appearing as the bigger hole in the complement of the union of balls, persists longer than the corresponding hole for the smaller loop, while other spurious holes persist for even shorter times.
3.1.2 Simplicial filtrations and persistence

A filtration of a simplicial complex K is a nested sequence of subcomplexes

F : ∅ = K0 ⊆ K1 ⊆ · · · ⊆ Kn = K
Figure 3.4: Persistence of the piecewise linear version of the function on a triangulation of the
topological space considered in Figure 3.2.
F : ∅ = K0 ,→ K1 ,→ · · · ,→ Kn = K.
F is called simplex-wise if, for every i ∈ [1, n], Ki \ Ki−1 is either empty or a single simplex. Notice that the possibility of the difference being empty allows two consecutive complexes to be the same.
Figure 3.5: Čech complexes of the growing union of balls considered in Figure 3.3. Homology classes in H1 are born and die as the union grows. The two most prominent holes appear as the two most persistent homology classes in H1 . Other classes appear and disappear quickly, with relatively much shorter persistence.
∅ = K0 ,→ K1 ,→ · · · ,→ Kn = K.
Vertex function. A vertex function f : V(K) → R is defined on the vertex set V(K) of the
complex K. We can construct a filtration F from such a function.
Lower/upper stars. Recall that in Section 2.1 we have defined the star and link of a vertex
v ∈ K which intuitively captures the concept of local neighborhood of v in K. We infuse the
information about a vertex function f into these structures. First, we fix a total order on vertices
V = {v1 , . . . , vn } of K so that their f -values are in non-decreasing order, that is, f (v1 ) ≤ f (v2 ) ≤
· · · ≤ f (vn ). The lower-star of a vertex v ∈ V, denoted by Lst(v), is the set of simplices in St(v)
whose vertices except v appear before v in this order. The closed lower-star Lst(v) is the closure
of Lst(v), i.e., it consists of the simplices in Lst(v) and their faces. The lower-link Llk(v) is the set of simplices in the closed lower star Lst(v) that are disjoint from v. Symmetrically, we can define the upper star Ust(v), closed
upper star Ust(v), and upper link Ulk(v), spanned by vertices in the star of v which appear after v
in the chosen order.
One gets a filtration using the lower stars of the vertices: K f (vi ) in the following filtration
denotes all simplices in K spanned by vertices in {v1 , . . . , vi }. Let v0 denote a dummy vertex with
f (v0 ) = −∞.
∅ = K f (v0 ) ⊆ K f (v1 ) ⊆ K f (v2 ) ⊆ · · · ⊆ K f (vn ) = K
Observe that K f (vi ) \ K f (vi−1 ) = Lst(vi ) for i ∈ [1, n] in the above filtration, that is, each time we add the lower star of the next vertex in the filtration. This filtration, called the lower star filtration of f , is studied in more detail in Section 3.5. Figure 3.6 shows a lower star filtration. A lower star filtration can be made simplex-wise by adding the simplices of a lower star in any order that puts a simplex after all of its faces.
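This construction is easy to carry out directly: order the simplices by the f -value of their highest vertex, breaking ties by dimension so that faces precede cofaces (compare Fact 3.1). A hedged Python sketch, assuming distinct f -values on the vertices; simplices are tuples of vertex ids and the names are ours:

```python
def lower_star(v, K, f):
    """Simplices of St(v) all of whose vertices precede v in the f-order
    (with v itself included); assumes distinct f-values."""
    return [s for s in K if v in s and all(f[u] <= f[v] for u in s)]

def lower_star_filtration(K, f):
    """A simplex-wise lower star filtration: each simplex enters with its
    highest vertex; sorting by (max f-value, dimension) puts every simplex
    after all of its faces."""
    return sorted(K, key=lambda s: (max(f[v] for v in s), len(s)))
```

On a filled triangle with vertex heights 0, 1, 2, the filtration adds the two lower vertices, the edge between them, then the top vertex together with its lower star.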
Alternatively, we may consider the vertices in non-increasing order of f values and ob-
tain an upper star filtration. For this we take K f (vi ) to be all simplices spanned by vertices in
{vi , vi+1 , . . . , vn }. Assuming a dummy vertex vn+1 with f (vn+1 ) = ∞, one gets a filtration

∅ = K f (vn+1 ) ⊆ K f (vn ) ⊆ · · · ⊆ K f (v1 ) = K.
Observe that K f (vi ) \ K f (vi+1 ) = Ust(vi ) for i ∈ [1, n] in the above filtration, that is, each time we add the upper star of the next vertex in the filtration. This filtration, called the upper star filtration of f , is in some sense a symmetric version of the lower star filtration, though the two may provide different persistence pairs. An upper star filtration can also be made simplex-wise by adding the simplices of an upper star in any order that puts a simplex after all of its faces. In this book, by default, we will assume that the function values along a filtration are non-decreasing. This means that we consider only lower star filtrations by default.
Vertex functions are closely related to the so-called piecewise linear functions (PL-functions). A vertex function f : V(K) → R defines a piecewise linear function (PL-function) on the underlying
space |K| of K, which is obtained by linearly interpolating f over all simplices. On the other hand, the restriction of a PL-function to the vertices trivially provides a vertex function.

Figure 3.6: The sequence shows a lower-star filtration of K induced by a vertex function, here a ‘height function’ that records the vertical height of a vertex, increasing from bottom to top.
Fact 3.1.

• A simplex-wise lower star filtration for f is also a filtration for the simplex-wise monotone function f¯ : K → R where f¯(σ) = maxv∈σ f (v).

• Similarly, a simplex-wise upper star filtration for f is also a filtration for the simplex-wise monotone function f¯(σ) = maxv∈σ (− f (v)).
When a filtration F is given without any explicit input function, we say F is induced by the simplex-wise monotone function f where every simplex σ ∈ (Ki \ Ki−1 ), for Ki ≠ Ki−1 , is given the value f (σ) = i.
Naturally, every simplicial filtration gives rise to a sequence of homomorphisms h_p^{i,j} as in Equation (3.2), induced by inclusions, again forming a homology module

0 = H p (K0 ) → H p (K1 ) → · · · → H p (Ki ) −h_p^{i,j}→ H p (K j ) → · · · → H p (Kn ) = H p (K).
3.2 Persistence
In both cases of space and simplicial filtrations F, we arrive at a homology module:

H p F : 0 = H p (X0 ) → H p (X1 ) → · · · → H p (Xi ) −h_p^{i,j}→ H p (X j ) → · · · → H p (Xn ) = H p (X)   (3.3)
Definition 3.4 (Persistent Betti number). The p-th persistent homology groups are the images of the homomorphisms: H_p^{i,j} = im h_p^{i,j} , for 0 ≤ i ≤ j ≤ n. The p-th persistent Betti numbers are the dimensions β_p^{i,j} = dim H_p^{i,j} of the vector spaces H_p^{i,j} .
The p-th persistent homology groups contain the important information of when a homology class is born and when it dies. The issue of birth and death becomes subtle because, when a new class is born, many other classes that are sums of this new class with existing classes are also born. Similarly, when a class ceases to exist, many other classes may cease to exist along with it. Therefore, we need a mechanism to pair births and deaths canonically. Figure 3.7 illustrates the birth and death of a class, though the pairing of birth and death events is more complicated, as stated in Fact 3.3.
Observe that the non-trivial elements of the p-th persistent homology group H_p^{i,j} consist of classes that survive from Xi to X j , that is, the classes which do not get ‘quotiented out’ by the boundaries in X j . So, one can observe:

Fact 3.2. H_p^{i,j} = Z p (Xi )/(B p (X j ) ∩ Z p (Xi )) and β_p^{i,j} = dim H_p^{i,j} .
Notice that Z p (Xi ) is a subgroup of Z p (X j ) because Xi ⊆ X j and hence the above quotient is
well defined. We now formally state when a class is born or dies.
Definition 3.5 (Birth and death). A non-trivial p-th homology class ξ ∈ H p (Xa ) is born at Xi , i ≤ a, if ξ ∈ H_p^{i,a} but ξ ∉ H_p^{i−1,a} . Similarly, a non-trivial p-th homology class ξ ∈ H p (Xa ) dies entering X j , a < j, if h_p^{a,j−1} (ξ) is not zero (non-trivial) but h_p^{a,j} (ξ) = 0.
Observe that not all classes that are born at Xi necessarily die entering some X j though more
than one such may do so.
Fact 3.3. Let [c] ∈ H p (X j−1 ) be a p-th homology class that dies entering X j . Then, it is born at Xi if and only if there exists a sequence i1 ≤ i2 ≤ · · · ≤ ik = i for some k ≥ 1 so that (i) 0 ≠ [ci` ] ∈ H p (X j−1 ) is born at Xi` for every ` ∈ {1, . . . , k} and (ii) [c] = [ci1 ] + · · · + [cik ].
One may interpret the above fact as follows. When a class dies, it may be thought of as a merge of several classes, among which the youngest one [cik ] determines the birth point. This viewpoint is particularly helpful for pairing simplices in the persistence algorithm PairPersistence presented later.
Figure 3.7: A simplistic view of birth and death of classes: A class [c] is born at Xi since it is not
in the image of H p (Xi−1 ). It dies entering X j since this is the first time its image becomes trivial.
Notice that each Xi , i = 0, . . . , n, is associated with a value of the function f that induces
F. For a space filtration, we say f (Xi ) = ai where Xi = Tai . For a simplicial filtration, we say
f (Xi ) = ai where ai = f (σ) for any σ ∈ Xi when the filtration function (Definition 3.3) is simplex-
wise monotone. When it is a vertex function f , then we extend f to a simplex-wise monotone
function as stated in Fact 3.1.
For 0 < i < j ≤ n + 1, the number of independent classes that are born at Xi and die entering X j is captured by the multiplicity

µ_p^{i,j} = (β_p^{i,j−1} − β_p^{i,j} ) − (β_p^{i−1,j−1} − β_p^{i−1,j} ).

The first difference on the RHS counts the number of independent classes that are born at or before Xi and die entering X j . The second difference counts the number of independent classes that are born at or before Xi−1 and die entering X j . The difference of the two differences thus counts the number of independent classes that are born at Xi and die entering X j . When j = n + 1, µ_p^{i,n+1} counts the number of independent classes that are born at Xi and die entering Xn+1 . They remain alive till the end in the original filtration without extension, or we say that they never die. To emphasize that classes which exist in Xn actually never die, we equate n + 1 with ∞ and take an+1 = a∞ = ∞. Observe that, with this assumption, we have β_p^{i,n+1} = β_p^{i,∞} = 0 for every i ≤ n.
Remark 3.1. The p-th homology classes in H p (X j−1 ) that are born at Xi and die entering X j may not form a vector space. Hence, we cannot talk about its dimension. In fact, the definition of µ_p^{i,j} , in some sense, compensates for this limitation. This definition involves alternating sums of dimensions (β_p^{i,j} ’s) of vector spaces. The dimensions appearing with negative signs lead to this anomaly. However, one can express µ_p^{i,j} as the dimension of a vector space which is a quotient of a subspace; see [18] for details.
Definition 3.7 (Class persistence). For µ_p^{i,j} ≠ 0, the persistence Pers ([c]) of a class [c] that is born at Xi and dies at X j is defined as Pers ([c]) = a j − ai . When j = n + 1 = ∞, Pers ([c]) equals an+1 − ai = ∞.
Notice that the values ai can be taken as the index i when no explicit function is given (Definition 3.3). In that case, the persistence of a class, sometimes referred to as index persistence, is j − i.
Definition 3.8 (Persistence diagram). The persistence diagram Dgm p (F f ) (also written Dgm p f ) of a filtration F f induced by a function f is obtained by drawing a point (ai , a j ) with non-zero multiplicity µ_p^{i,j} , i < j, on the extended plane R̄2 := (R ∪ {±∞})2 , where the points on the diagonal ∆ := {(x, x)} are added with infinite multiplicity.
The addition of the diagonal is a technical necessity for results that we will see afterward. A class born at ai that never dies is represented as a point (ai , an+1 ) = (ai , ∞) (point v in Figure 3.8) – we call such points in the persistence diagram essential persistent points, and their corresponding homology classes essential homology classes. Distinct classes may have the same coordinates because they may be born and die at the same time. This happens only when we allow multiple homology classes to be created or destroyed at the same function value or filtration point. In general, this also opens up the possibility of creating infinitely many birth-death pairs even if the filtration is finite. To avoid such pathological cases, we always assume that the linear maps in the homology modules have finite rank, a condition known as q-tameness in the literature [80].
There is also an alternative representation of persistence called barcode where each birth-
death pair (ai , a j ) is represented by a line segment [ai , a j ) called a bar which is open on the right.
The open end signifies that the class dying entering X j does not exist in X j . Points at infinity
such as (ai , ∞) are represented with a ray [ai , ∞) giving an infinite bar. See Figure 3.8(right).
Figure 3.9 shows typical persistence diagrams and barcodes (ignoring the types of end points) for
p = 0, 1.
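Given birth-death index pairs and the function values ai attached to a filtration, producing the bars is mechanical. A small sketch (the names are ours; pairs of zero persistence are dropped, as they correspond to diagonal points):

```python
from math import inf

def barcode(pairs, essential, values):
    """Bars [a_i, a_j) from (birth index, death index) pairs, plus an
    infinite bar [a_i, inf) for each essential (never-dying) birth index."""
    bars = [(values[i], values[j]) for (i, j) in pairs if values[i] < values[j]]
    bars += [(values[i], inf) for i in essential]
    return sorted(bars)
```

Each returned tuple (b, d) stands for the half-open bar [b, d); an essential class yields (b, inf), the infinite bar.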
Fact 3.4.

1. If a class has persistence s, then the point representing it lies at a Euclidean distance s/√2 from the diagonal ∆ (the distances between t, t̄ and r, r̄ in Figure 3.8).

2. For sublevel set filtrations, all points (ai , a j ) representing a class have ai ≤ a j , so they lie on or above the diagonal.

3. If mi denotes the multiplicity of an essential point (ai , ∞) in Dgm p (F), where F is a filtration of X = Xn , one has Σi mi = dim H p (X), the p-th Betti number of X.

Figure 3.8: (left) A persistence diagram with non-diagonal points only in the positive quadrant; (right) the corresponding barcode.

Figure 3.9: Typical persistence diagrams and the corresponding barcodes for an image data; red and blue correspond to the 0-th and 1-st persistence diagrams respectively. The bars are sorted in increasing order of their birth time from bottom to top.
Here is one important fact relating persistent Betti numbers and persistence diagrams.
Theorem 3.1. For every pair of indices 0 ≤ k ≤ ` ≤ n and every p, the p-th persistent Betti number satisfies β_p^{k,`} = Σ_{i≤k} Σ_{j>`} µ_p^{i,j} .
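Theorem 3.1 can be checked directly on small examples once the birth-death index pairs are available. The following is a minimal Python sketch of the matrix reduction over Z2 presented in Section 3.3, applied to a simplex-wise filtration given as a list of simplices (sorted tuples of vertex ids); the function names and data layout are our own.

```python
def reduce_persistence(filtration):
    """Standard persistence column reduction over Z2.  Returns the finite
    (birth index, death index) pairs and the unpaired (essential) indices."""
    index = {s: i for i, s in enumerate(filtration)}
    cols = []      # reduced boundary columns, as sets of row indices
    low_inv = {}   # lowest row index -> index of the column owning it
    pairs = []
    for j, simplex in enumerate(filtration):
        # boundary of the simplex, written in row indices of earlier simplices
        col = set() if len(simplex) == 1 else \
            {index[simplex[:k] + simplex[k + 1:]] for k in range(len(simplex))}
        while col and max(col) in low_inv:
            col ^= cols[low_inv[max(col)]]   # Z2 column addition
        cols.append(col)
        if col:                              # simplex j is negative: it kills
            low_inv[max(col)] = j            # the class created by simplex max(col)
            pairs.append((max(col), j))
    paired = {i for pr in pairs for i in pr}
    return pairs, sorted(set(range(len(filtration))) - paired)
```

On a simplex-wise filtration of a filled triangle, the reduction pairs the second and third vertices with the first two edges, the last edge with the 2-simplex, and leaves the first vertex as the single essential class in H0 .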
Figure 3.10: Two persistence diagrams and their bottleneck distance, which is half of the side length of the squares representing the optimal bijection.
Let Dgm p (F f ) and Dgm p (Fg ) be two persistence diagrams for two functions f and g. We want to consider bijections between points from Dgm p (F f ) and Dgm p (Fg ). However, the two diagrams may have different numbers of off-diagonal points. Recall that persistence diagrams include the points on the diagonal ∆, each with infinite multiplicity. This addition allows us to borrow points from the diagonal when necessary to define the bijections. Note that we are considering only filtrations of finite complexes, which also makes each homology group finite.
Definition 3.9 (Bottleneck distance). Let Π = {π : Dgm p (F f ) → Dgm p (Fg )} denote the set of all bijections. Consider the distance between two points x = (x1 , x2 ) and y = (y1 , y2 ) in the L∞ -norm, kx − yk∞ = max{|x1 − y1 |, |x2 − y2 |}, with the convention that ∞ − ∞ = 0. The bottleneck distance between the two diagrams is:

db (Dgm p (F f ), Dgm p (Fg )) := inf_{π∈Π} sup_{x∈Dgm p (F f )} kx − π(x)k∞ .
Fact 3.5. db is a metric on the space of persistence diagrams. Clearly, db (X, Y) = 0 if and only if
X = Y. Moreover, db (X, Y) = db (Y, X) and db (X, Y) ≤ db (X, Z) + db (Z, Y).
There is a caveat for the above fact. If db is taken as a distance on the space of homology
modules H p F instead of the persistence diagrams Dgm p (F) they generate, that is, if we define
db (H p F f , H p Fg ) := db (Dgm p (F f ), Dgm p (Fg )), then it may not be a metric. The first axiom for
metric becomes false if the homology modules are allowed to have classes created and destroyed
at the same function values. These classes of zero persistence generate points on the diagonal ∆
in the diagram. Since points on the diagonal have infinite multiplicity, two modules differing in
the number of such classes of zero persistence may have diagrams with zero bottleneck distance.
If we allow such cases, db becomes a pseudometric on the space of homology modules meaning
that it satisfies all axioms of a metric except the first one.
The following theorems originally proved in [102] and further detailed in [149] quantify the
notion of the stability of the persistence diagram. There are two versions, one involves simplicial
filtrations and the other involves space filtrations. For two functions, f, g : X → R, the infinity
norm is defined as k f − gk∞ := sup x∈X | f (x) − g(x)|.
Theorem 3.2 (Stability for simplicial filtrations). Let f, g : K → R be two simplex-wise monotone functions giving rise to two simplicial filtrations F f and Fg . Then, for every p ≥ 0,

db (Dgm p (F f ), Dgm p (Fg )) ≤ k f − gk∞ .
For the second version of the stability theorem, we require that the functions referred to in the theorem are ‘nice’ in the sense that they are tame. A function f : X → R is tame if the homology groups of its sublevel sets have finite rank and these ranks change only at finitely many values, called critical values.
Theorem 3.3 (Stability for space filtrations). Let X be a triangulable space and f, g : X → R be two tame functions giving rise to two space filtrations F f and Fg where the values for sublevel sets include the critical values. Then, for every p ≥ 0,

db (Dgm p (F f ), Dgm p (Fg )) ≤ k f − gk∞ .
There is another distance called q-Wasserstein distance with which persistence diagrams are
also compared.
Definition 3.10 (Wasserstein distance). Let Π be the set of bijections as defined in Definition 3.9. For any p ≥ 0, q ≥ 1, the q-Wasserstein distance is defined as

dW,q (Dgm p (F f ), Dgm p (Fg )) = inf_{π∈Π} [ Σ_{x∈Dgm p (F f )} kx − π(x)k^q_∞ ]^{1/q} .
The distance dW,q also is a metric on the space of persistence diagrams just like the bottleneck
distance. It also enjoys a stability property though it is not as strong as in Theorem 3.3.
Fact 3.6. Let f, g : X → R be two Lipschitz functions defined on a triangulable compact metric space X. Then, there exist constants C and k depending on X and the Lipschitz constants of f and g so that for every p ≥ 0 and q ≥ k,

dW,q (Dgm p (F f ), Dgm p (Fg )) ≤ C · k f − gk∞^{1−k/q} .
The above result was recently improved [278] by considering the Lq -distance between functions defined on a common domain X:

k f − gkq = ( Σ_{x∈X} | f (x) − g(x)|^q )^{1/q} .
Theorem 3.4 (Stability for Wasserstein distance). Let f, g : K → R be two simplex-wise monotone functions on a simplicial complex K. Then, one has

dW,q (Dgm p (F f ), Dgm p (Fg )) ≤ k f − gkq .
Bottleneck distances can be computed using perfect matchings in bipartite graphs. Computing Wasserstein distances is more difficult; it can be done using an algorithm for minimum weight perfect matching in weighted bipartite graphs. We leave it as an exercise (Exercise 5).
Then, the bottleneck distance we want to compute must be L∞ distance max{|xa − xb |, |ya − yb |}
for two points a ∈ Ã and b ∈ B̃. We do a binary search on all such possible O(n2 ) distances where
|Ã| = | B̃| = n. Let δ0 , δ1 , · · · , δn0 be the sorted sequence of these distances in a non-decreasing
order.
Given a δ = δi ≥ 0 where i is the median of the index in the binary search interval [`, u],
we construct a bipartite graph G = (Ã ∪ B̃, E) where an edge e = (a, b){a∈Ã,b∈B̃} is in E if and
only if either both a ∈ Ā and b ∈ B̄ (weight(e) = 0) or ka − bk∞ ≤ δ (weight(e) = ka − bk∞ ).
A complete matching in G is a set of n edges such that every vertex in Ã and B̃ is incident to exactly one edge in the set. To determine if G has a complete matching, one can use the O(n^2.5) algorithm of Hopcroft and Karp [198] for complete matching in a bipartite graph. However, exploiting the geometric embedding of the points in the persistence diagrams, we can apply an O(n^1.5)-time algorithm of Efrat et al. [154] for the purpose. If such an algorithm affirms that a complete matching exists, we do the following: if ℓ = u we output δ; otherwise we set u = i and repeat. If no matching exists, we set ℓ = i and repeat. Observe that a matching has to exist for some value of δ, in particular for δ_{n′}, and thus the binary search always succeeds. Algorithm 1: Bottleneck lays out the pseudocode for this matching. The algorithm runs in O(n^1.5 log n) time, accounting for the O(log n) probes of the binary search, each applying the O(n^1.5)-time matching algorithm. However, to achieve this complexity, we have to avoid sorting the n′ = O(n²) values, which would take O(n² log n) time. Again, using the geometric embedding of the points, one can perform the binary probes without incurring the cost of sorting. For details and an efficient implementation of this algorithm see [208].
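The feasibility-probe scheme above can be sketched in a few lines. The Python sketch below is illustrative only: it replaces the O(n^1.5) geometric matching of Efrat et al. with a simple augmenting-path matching and sorts all candidate distances up front, so it is slower than the algorithm described here but follows the same binary-search logic. The function name and the representation of diagrams as lists of (birth, death) pairs are our own choices.

```python
def bottleneck_distance(A, B):
    # A, B: persistence diagrams as lists of (birth, death) points.
    # Augment each diagram with diagonal projections of the other:
    # Ã = A ∪ proj(B), B̃ = B ∪ proj(A), so |Ã| = |B̃| = n.
    proj = lambda p: ((p[0] + p[1]) / 2.0,) * 2
    At = A + [proj(b) for b in B]       # entries with index >= len(A) are diagonal
    Bt = B + [proj(a) for a in A]       # entries with index >= len(B) are diagonal
    n = len(At)
    if n == 0:
        return 0.0

    def cost(i, j):
        if i >= len(A) and j >= len(B): # diagonal point to diagonal point: free
            return 0.0
        a, b = At[i], Bt[j]
        return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

    # candidate values delta_0 <= delta_1 <= ...; the answer is one of them
    cands = sorted({cost(i, j) for i in range(n) for j in range(n)})

    def feasible(delta):
        # is there a complete matching using only edges of cost <= delta?
        match = [-1] * n                # match[j]: vertex of Ã matched to j in B̃
        def augment(i, seen):
            for j in range(n):
                if j not in seen and cost(i, j) <= delta:
                    seen.add(j)
                    if match[j] == -1 or augment(match[j], seen):
                        match[j] = i
                        return True
            return False
        return all(augment(i, set()) for i in range(n))

    lo, hi = 0, len(cands) - 1          # binary search on the candidate distances
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(cands[mid]):
            hi = mid
        else:
            lo = mid + 1
    return cands[lo]
```

For example, for the diagrams A = {(0, 10)} and B = {(1, 9)}, the optimal matching pairs the two points directly at L∞ cost 1, which is cheaper than sending either point to the diagonal.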
∅ = K_0 ↪ K_1 ↪ K_2 ↪ · · · ↪ K_n = K

When a p-simplex σ_j = K_j \ K_{j−1} is added, one of the following two events happens at the homology level:

1. A new p-cycle c together with its class [c] is born (created). In this case we call σ_j a positive simplex (also called a creator).

2. An existing (p − 1)-cycle c along with its class [c] dies (is destroyed). In this case we call σ_j a negative simplex (also called a destructor).
[Figure: the filtration K_1(v_1, −), K_2(v_2, −), K_3(v_3, −), K_4(v_4, −), K_5(v_3, e_5), K_6(v_2, e_6), K_7(v_4, e_7), K_8(e_8, −), K_9(e_9, −), K_10(e_9, t_10), K_11(e_8, t_11) built from vertices v_1, . . . , v_4, edges e_5, . . . , e_9, and triangles t_10, t_11.]
Figure 3.11: Red simplices are positive and blue ones are negative. The simplices are indexed to coincide with their order in the filtration. The pair (·, ·) in each subcomplex K_i(·, ·) shows the pairing between a positive and a negative simplex; a missing second component indicates a positive simplex that is not yet paired.
To elaborate on the above two changes, consider the example depicted in Figure 3.11. When one moves from K_7 to K_8, a non-boundary loop, the 1-cycle (e_5 + e_6 + e_7 + e_8), is created by adding edge e_8. Strictly speaking, a positive p-simplex σ_j may create more than one p-cycle. Only one of them can be taken as newly independent; the others are linear combinations of it with cycles already existing in K_{j−1}. From K_8 to K_9, the introduction of edge e_9 creates two non-boundary loops, (e_5 + e_6 + e_9) and (e_7 + e_8 + e_9). But each of them is the linear combination of the other with the existing loop (e_5 + e_6 + e_7 + e_8). Notice that there is no canonical way to choose an independent one. However, the creation of a loop is reflected in the increase of the rank of H_1. In general, the Betti number β_p increases by 1 for a positive simplex. For a negative simplex, we get the opposite effect: β_{p−1} decreases by 1, signifying the death of a cycle. However, unlike for positive simplices, the destroyed cycle is determined uniquely up to homology: it is the equivalence class carried by the boundary of σ_j. For example, in Figure 3.11, the loop (e_7 + e_8 + e_9) gets destroyed by triangle t_10 when we go from K_9 to K_10.
Pairing. We already saw that the destruction of a class is uniquely paired with the creation of a class through the ‘youngest first’ rule; see the discussion after Fact 3.3. By Fact 3.7, this means that each negative simplex is paired uniquely with a positive simplex. The goal of the persistence algorithm is to find these pairs.
Consider the birth and death of classes by the addition of simplices in a filtration. When a p-simplex σ_j is added, if its boundary c = ∂σ_j is not already a boundary, we check whether σ_j destroys the class [c]. The cycle c was created when the youngest (p − 1)-simplex in it, say σ_i, was added. Note that a simplex is younger if it comes later in the filtration. If σ_i, a positive (p − 1)-simplex, has already been paired with a p-simplex σ′_j, then a class also created by σ_i got destroyed when σ′_j appeared. We can take the (p − 1)-cycle representing this destroyed class and add it to ∂σ_j. The addition provides a cycle that existed before σ_i. We update c to be this new cycle, look for the youngest (p − 1)-simplex σ_i in c, and continue the process till we find one that is unpaired, or the cycle c becomes empty. In the latter case, we discover that c = ∂σ_j was a boundary cycle already and thus σ_j creates a new class in H_p(K_j). In the other case, we discover that σ_j is a negative p-simplex which destroys a class created by σ_i. We pair σ_j with σ_i. Indeed, one can show that the above algorithm produces the persistence pairs according to Definition 3.11 below, that is, their function values lead to the persistence diagram (Definition 3.8). We give a proof for a matrix version of the algorithm later (Theorem 3.6).
Definition 3.11 (Persistence pairs). Given a simplex-wise filtration F : K_0 ↪ K_1 ↪ · · · ↪ K_n, for 0 < i < j ≤ n, we say a p-simplex σ_i = K_i \ K_{i−1} and a (p + 1)-simplex σ_j = K_j \ K_{j−1} form a persistence pair (σ_i, σ_j) if and only if µ_p^{i,j} > 0.
The full algorithm is presented in Algorithm 2: PairPersistence, which takes as input a sequence of simplices σ_1, σ_2, · · · , σ_n ordered according to the filtration of a complex whose persistence diagram is to be computed. It assumes that the complex is represented combinatorially with adjacency structures among its simplices.
Algorithm 2 PairPersistence(σ_1, σ_2, · · · , σ_n)
Input:
  An ordered sequence of simplices forming a filtration of a complex
Output:
  Determine if each simplex is ‘positive’ or ‘negative’ and generate the persistence pairs
 1: for j = 1 to n do
 2:   c := ∂_p σ_j
 3:   σ_i := the youngest positive (p − 1)-simplex in c
 4:   while σ_i is paired and c is not empty do
 5:     Let c′ be the cycle destroyed by the simplex paired with σ_i  /* computed previously in step 10 */
 6:     c := c′ + c  /* this addition may cancel simplices */
 7:     Update σ_i to be the youngest positive (p − 1)-simplex in c
 8:   end while
 9:   if c is not empty then
10:     σ_j is a negative p-simplex; generate pair (σ_i, σ_j); associate c with σ_j as destroyed
11:   else
12:     σ_j is a positive p-simplex  /* σ_j may get paired later */
13:   end if
14: end for
Let us again consider the example in Figure 3.11 and see how the algorithm PairPersistence works. From K_7 to K_8, e_8 is added. Its boundary is c = (v_2 + v_4). The vertex v_4 is the youngest positive vertex in c, but it is paired with e_7 in K_7. Thus, c is updated to (v_3 + v_4 + v_4 + v_2) = (v_3 + v_2). The vertex v_3 becomes the youngest positive one, but it is paired with e_5. So, c is updated to (v_1 + v_2). The vertex v_2 becomes the youngest positive one, but it is paired with e_6. So, c is updated to be empty. Hence e_8 is a positive edge. Now we examine the addition of the triangle t_11 from K_10 to K_11. The boundary of t_11 is c = (e_5 + e_6 + e_9). The youngest positive edge e_9 is paired with t_10. Thus, c is updated to (e_5 + e_6 + e_7 + e_8) by adding the cycle destroyed by t_10. Since e_8 is the youngest positive edge that is not yet paired, t_11 finds e_8 as its paired positive edge. Observe that we finally obtain the loop that is destroyed by adding the negative triangle: for t_11, this is the loop (e_5 + e_6 + e_7 + e_8).
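The run just traced can be reproduced programmatically. The sketch below is an illustrative Python rendering of Algorithm 2; the name pair_persistence and the representation of simplices as frozensets of vertex labels are our own choices. It exploits the fact that the youngest simplex in a cycle is always positive, so taking the maximum index in the chain suffices.

```python
def pair_persistence(filtration):
    # filtration: list of simplices (frozensets of vertex labels) ordered so
    # that every face of a simplex appears before the simplex itself.
    index = {s: i for i, s in enumerate(filtration)}
    paired = {}     # index of positive simplex -> index of its negative partner
    destroyed = {}  # index of negative simplex -> cycle (set of indices) it destroys
    positive = set()
    for j, sigma in enumerate(filtration):
        # boundary chain of sigma over Z2, as a set of face indices
        c = set() if len(sigma) == 1 else {index[sigma - {v}] for v in sigma}
        while c:
            i = max(c)                 # youngest (necessarily positive) simplex in c
            if i not in paired:
                break
            c ^= destroyed[paired[i]]  # add, mod 2, the previously destroyed cycle
        if c:
            paired[max(c)] = j         # sigma_j is negative: pair it
            destroyed[j] = c
        else:
            positive.add(j)            # sigma_j is positive (its boundary was a boundary)
    return paired, positive
```

On the filtration of Figure 3.11, encoding the vertices v_1, . . . , v_4 first and then e_5 = {v_1, v_3}, e_6 = {v_1, v_2}, e_7 = {v_3, v_4}, e_8 = {v_2, v_4}, e_9 = {v_2, v_3}, t_10 = {v_2, v_3, v_4}, t_11 = {v_1, v_2, v_3} (0-based indices 0–10), this returns exactly the pairs (v_3, e_5), (v_2, e_6), (v_4, e_7), (e_9, t_10), (e_8, t_11); the positive set records the simplices flagged positive at insertion time.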
D_p[i, j] = 1 if σ_i ∈ ∂_p σ_j, and D_p[i, j] = 0 otherwise.
• One can combine all boundary matrices into a single matrix D that represents the linear map ∂ = ⊕_p ∂_p : ⊕_p C_p → ⊕_p C_{p−1}, that is, the transformation of a basis of all chain groups together to a basis of itself, but with a shift to one lower dimension.
D[i, j] = 1 if σ_i ∈ ∂_∗ σ_j, and D[i, j] = 0 otherwise.
Given any matrix A, let row_A[i] and col_A[j] denote the ith row and jth column of A, respectively. We abuse the notation slightly to let col_A[j] also denote the chain {σ_i | A[i, j] = 1}, that is, the collection of simplices corresponding to the 1’s in the column col_A[j].

Definition 3.13 (Reduced matrix). Let low_A[j] denote the row index of the last 1 in the jth column of A, which we call the low-row index of the column j. It is undefined for empty columns (marked with −1 in Algorithm 3). The matrix A is reduced (or is in reduced form) if low_A[j] ≠ low_A[j′] for any j ≠ j′; that is, no two columns share the same low-row index.

Fact 3.8. Given a matrix A in reduced form, the non-zero columns of A are linearly independent over Z_2.
Let Γ_a^b := {col_R̃[k] | k ∈ [1, b] and 1 ≤ low_R̃[k] ≤ a}. Using the facts that all non-zero columns in R̃ with index at most b form a basis for B_{p−1}^b, and that the low-row index of every non-zero column is unique, one can show that rank(Z_{p−1}^a ∩ B_{p−1}^b) = |Γ_a^b|. Now consider the set of all non-zero columns in R̃ with index at most b that are not in Γ_a^b, denoted Γ̂_a^b. Note that |Γ̂_a^b| = rank(R̃_{a+1}^b) = rank(B_{p−1}^b) − |Γ_a^b|; hence

rank(Z_{p−1}^a ∩ B_{p−1}^b) = rank(B_{p−1}^b) − rank(R̃_{a+1}^b).

Combining the above with Proposition 3.7, Eqn. (3.6) and Eqn. (3.7), we thus have that:

µ_{p−1}^{i,j} = ( rank(Z_{p−1}^i) − rank(Z_{p−1}^i ∩ B_{p−1}^{j−1}) ) − ( rank(Z_{p−1}^i) − rank(Z_{p−1}^i ∩ B_{p−1}^j) )
  − ( rank(Z_{p−1}^{i−1}) − rank(Z_{p−1}^{i−1} ∩ B_{p−1}^{j−1}) ) + ( rank(Z_{p−1}^{i−1}) − rank(Z_{p−1}^{i−1} ∩ B_{p−1}^j) )
= rank(Z_{p−1}^i ∩ B_{p−1}^j) − rank(Z_{p−1}^i ∩ B_{p−1}^{j−1}) + rank(Z_{p−1}^{i−1} ∩ B_{p−1}^{j−1}) − rank(Z_{p−1}^{i−1} ∩ B_{p−1}^j)
= − rank(R̃_{i+1}^j) + rank(R̃_{i+1}^{j−1}) − rank(R̃_i^{j−1}) + rank(R̃_i^j) = r_R̃(i, j) = r_R(i, j) = r_D(i, j).
Algorithm 3 MatPersistence(D)
Input:
  Boundary matrix D of a complex with columns and rows ordered by a given filtration
Output:
  Reduced matrix with each column j either being empty or having a unique low_D[j] entry
1: for j = 1 → |col_D| do
2:   while ∃ j′ < j s.t. low_D[j′] == low_D[j] and low_D[j] ≠ −1 do
3:     col_D[j] := col_D[j] + col_D[j′]
4:   end while
5:   if low_D[j] ≠ −1 then
6:     i := low_D[j]  /* generate pair (σ_i, σ_j) */
7:   end if
8: end for
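A sparse version of Algorithm 3 is short to write down. In the illustrative Python sketch below (our own naming, not from the book), each column is stored as the set of its non-zero row indices, so low_D[j] is simply the maximum of the set and a left-to-right column addition is a symmetric difference (addition over Z_2):

```python
def mat_persistence(columns):
    # columns[j]: set of row indices holding a 1 in column j of the
    # filtration-ordered boundary matrix D; reduced in place.
    low_inv = {}   # low-row index -> index of the column with that low
    pairs = {}     # persistence pairs: low_R[j] -> j
    for j in range(len(columns)):
        # resolve conflicts: an earlier column shares our low-row index
        while columns[j] and max(columns[j]) in low_inv:
            columns[j] ^= columns[low_inv[max(columns[j])]]  # Z2 column addition
        if columns[j]:
            i = max(columns[j])   # low_D[j] after reduction
            low_inv[i] = j
            pairs[i] = j          # generate pair (sigma_i, sigma_j)
    return pairs
```

Running it on the combined boundary matrix of the filtration in Figure 3.11 (vertex columns empty, then the five edge columns, then the two triangle columns) yields the same pairs as the cycle-based description of Algorithm 2, with the columns of the positive simplices e_8 and e_9 reduced to zero.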
Matrix reduction algorithm. Notice that there are possibly many choices of R and V forming a reduced-form decomposition of a fixed D. Theorem 3.6 implies that the persistent pairing is independent of the particular contents of R and V as long as R is reduced and V is upper triangular. Thus, if we reduce a given filtered boundary matrix D to a reduced form R using only left-to-right column additions, the pairs (low_R[j], j) read off from R are the persistence pairs.
[Figure: four snapshots of a 6 × 4 matrix during reduction, from the initial matrix D to the fully reduced matrix.]
Figure 3.12: Matrix reduction for a 6 × 4 matrix D: the low entries of the columns are shaded to point out the conflicts. (a) low_D[1] conflicts with low_D[2] and col_D[1] is added to col_D[2], (b) low_D[2] conflicts with low_D[3], (c) low_D[3] conflicts with low_D[4], (d) the addition of col_D[3] to col_D[4] zeroes out the entire column col_D[4].
Fact 3.9. The number of unpaired p-simplices in a simplex-wise filtration of a simplicial complex
K equals its p-th Betti number β p (K).
We already mentioned that the input boundary matrix D should respect the filtration order, that is, the row and column indices of D correspond to the indices of the simplices in the input filtration. Observe that we can consider a slightly different filtration without changing the persistence pairs. We can arrange all p-simplices, for every p ≥ 0, together in the filtration without changing their relative orders as follows, where σ_j^i denotes the jth i-simplex among all i-simplices in the original filtration:

(σ_1^0, σ_2^0, . . . , σ_{n_0}^0), . . . , (σ_1^p, σ_2^p, . . . , σ_{n_p}^p), . . . , (σ_1^d, σ_2^d, . . . , σ_{n_d}^d)    (3.8)

This means columns and rows of p-simplices in D become adjacent while retaining their relative ordering from the original matrix. Observe that, by this rearrangement, all columns that are added to a column j in the original D still remain to the left of j in their newly assigned indices. In other words, processing the rearranged matrix D can be thought of as processing each individual p-boundary matrix D_p = [∂_p] separately, where the column and row indices respect the relative orders of the p- and (p − 1)-simplices in the original filtration.
Complexity of MatPersistence. Let the filtration F on which the boundary matrix D is based insert n simplices. This means that D has at most n rows and columns. Then, the outer for loop is executed at most O(n) times. Within this for loop, steps 5–7 take only O(1) time. The complexity is determined by the while loop (steps 2–4). We argue that this loop iterates at most O(n) times. This follows from the fact that each column addition in step 3 decreases low_D[j] by at least one, and over the entire algorithm it cannot decrease by more than the length of the column, which is O(n). Each column addition in step 3 takes at most O(n) time, giving a total time of O(n²) for the while loop. Accounting for the outer for loop, we get a complexity of O(n³) for MatPersistence.
One can implement the above matrix reduction algorithm with a more efficient data structure, noting that most of the entries in the input matrix D are empty. A linked list representing the non-zero entries in the columns of D is more space-efficient. Edelsbrunner and Harer [149] present a clever implementation of MatPersistence using such a sparse matrix representation. For every column j, the algorithm executes O(j − i) column additions of O(j − i) length each, incurring a cost of O((j − i)²), where i = 1 if σ_j is positive and i is the index of the simplex σ_i with which σ_j pairs in case σ_j is negative. Therefore, the total time complexity becomes O(Σ_{j∈[1,n]} (j − i)²). Here, we assume that the dimension of the complex K is a constant.
It is worth noting that the matrix reduction algorithm is essentially a version of the classical Gaussian elimination method with a given column order and a specific choice of row pivots. In this respect, persistence of a given filtration can be computed by the PLU factorization of a matrix, for which Bunch and Hopcroft [57] give an O(M(n))-time algorithm, where M(n) is the time to multiply two n × n matrices. It is known that M(n) = O(n^ω), where ω ∈ [2, 2.373) is called the exponent for matrix multiplication.
Chen and Kerber [95] observed the following simple fact. If we process the input filtration backward in dimension, that is, process the boundary matrices D_p, p = d, . . . , 1 in decreasing order of dimension, then a persistence pair (σ_{p−1}, σ_p) is detected from D_p before processing the column for σ_{p−1} in D_{p−1}. Fortunately, we already know that σ_{p−1} has to be a positive simplex, because otherwise it could not pair with a negative simplex σ_p. So, we can simply ignore the column of σ_{p−1} while processing D_{p−1}. We call this clearing the column of σ_{p−1}. In practice, this saves a considerable amount of computation in cases where many positive simplices occur, such as in Rips filtrations. Algorithm 4: ClearPersistence implements this idea.
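The clearing scheme can be sketched compactly if each boundary matrix D_p is stored sparsely, every column given as the set of its non-zero row indices, with row and column indices taken from one global filtration order. The following Python sketch is illustrative only (the names and data layout are our own choices): matrices are processed from dimension d down to 1, and any column whose simplex was already paired as a birth in the dimension above is skipped.

```python
def clear_persistence(boundary_by_dim):
    # boundary_by_dim[p]: dict mapping the global filtration index of each
    # p-simplex to the set of indices of its (p-1)-faces (a column of D_p)
    pairs = {}
    cleared = set()          # birth simplices whose columns can be skipped
    for p in sorted(boundary_by_dim, reverse=True):   # D_d first, down to D_1
        cols = boundary_by_dim[p]
        low_inv = {}         # low-row index -> column index, within D_p
        for j in sorted(cols):
            if j in cleared: # clearing: sigma_j already paired as a birth,
                continue     # so its column would reduce to zero anyway
            c = cols[j]
            while c and max(c) in low_inv:
                c ^= cols[low_inv[max(c)]]            # Z2 column addition
            if c:
                i = max(c)
                low_inv[i] = j
                pairs[i] = j    # pair (sigma_i, sigma_j)
                cleared.add(i)  # skip sigma_i's column when processing D_{p-1}
    return pairs
```

On the filtration of Figure 3.11 the triangles are processed first, pairing e_9 and e_8; their columns in D_1 are then never reduced at all, which is exactly the saving the clearing provides.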
We cannot take advantage of the clearing for the last dimension in the filtration. If d is the
highest dimension of the simplices in the input filtration, the matrix Dd has to be processed for all
columns because the pairings for the positive d-simplices are not available.
[Figure: four snapshots (a)–(d) of the reduction of the twisted matrix D∗; rows and columns are marked with the indices of the original matrix in reversed order.]
Figure 3.13: Matrix reduction with the twisted matrix D∗ of the matrix D in Figure 3.12, obtained by transposing D and then reversing the order of its rows and columns; the conflicts in low_D[·] are resolved to obtain the intermediate matrices shown in (a) through (d); the last transformation from (c) to (d) assumes that all conflict resolutions from columns 3 through 1 are completed. Observe that every column-row pair corresponds to a row-column pair in the original matrix. Also, all columns that are zeroed out here correspond to all rows in the original that did not get paired with any column, meaning that they are either negative simplices or positive simplices not paired with any negative simplex.
If the number of d-simplices is large compared to the number of simplices of lower dimensions, the incurred cost of processing their columns can still be high. For example, in a Rips filtration restricted up to a certain dimension d, the number of d-simplices usually becomes much larger than the number of, say, 1-simplices. In those cases, the clearing can be more cost-effective if it can be applied forward.
In this respect, the following observation becomes helpful. Let D∗p denote the anti-transpose
of the matrix D p , defined by the transpose of D p with the columns and rows being ordered in
Algorithm 4 ClearPersistence(D_1, D_2, . . . , D_d)
Input:
  Boundary matrices ordered by the dimension of the boundary operators, with columns ordered by filtration
Output:
  Reduced matrices with each column for a negative simplex having a unique low entry
 1: MatPersistence(D_d)
 2: for i = (d − 1) → 1 do
 3:   for j = 1 → |col_{D_i}| do
 4:     if σ_j is not paired while processing D_{i+1} then
 5:       /* column j is not processed if σ_j is already paired */
 6:       while ∃ j′ < j s.t. low_{D_i}[j] ≠ −1 and low_{D_i}[j′] == low_{D_i}[j] do
 7:         col_{D_i}[j] := col_{D_i}[j] + col_{D_i}[j′]
 8:       end while
 9:       if low_{D_i}[j] ≠ −1 then
10:         k := low_{D_i}[j]  /* generate pair (σ_k, σ_j) */
11:       end if
12:     end if
13:   end for
14: end for
reverse. This means that if D p has row and column indices 1, . . . , m and 1, . . . , n respectively,
then D∗p (i, j) = D p (n + 1 − j, m + 1 − i). Call it the twisted matrix of D p . Figure 3.13 shows the
twisted matrix D∗ of the matrix D in Figure 3.12 where the rows and columns are marked with
the indices of the original matrix. The following proposition guarantees that we can compute the
persistence pairs in D p from the matrix D∗p .
Proposition 3.8. (σ p−1 , σ p ) is a persistence pair computed from D p if and only if (σ p , σ p−1 ) is
computed as a pair from D∗p by MatPersistence(D∗p ).
Proof. Let the indices of σ_{p−1} and σ_p in D_p be i and j respectively. Then, by Theorem 3.6, one has low_R[j] = i, where R is the reduced matrix obtained from D_p by left-to-right column additions. Consider bottom-to-top row additions in D_p, each of which takes a row and adds it to a row above it. Similar to low_A[j] for a matrix A, let lft_A[i] denote the column index of the leftmost 1 in the row i of A. Call A left-reduced if every non-zero row i has a unique lft_A[i]. In the rest of the proof, for simplicity, we use the row and column indices of D_p also for D∗_p; that is, by an index pair (j, i) in D∗_p, we actually mean the pair (n + 1 − j, m + 1 − i).

First, observe that each bottom-to-top row addition in D_p is equivalent to a left-to-right column addition in D∗_p. Also, a matrix reduced by left-to-right column additions in D∗_p corresponds to a left-reduced matrix obtained by the corresponding bottom-to-top row additions in D_p. So, if S denotes the reduced matrix obtained from D∗_p by left-to-right column additions and L denotes the left-reduced matrix obtained from D_p by bottom-to-top row additions, then low_S[i] = j if and only if lft_L[i] = j. Furthermore, MatPersistence(D∗_p) computes the pair (j, i) (hence (σ_p, σ_{p−1})) if and only if low_S[i] = j.
Therefore, to prove the proposition, it is sufficient to argue that low_R[j] = i if and only if lft_L[i] = j. By Proposition 3.5, low_R[j] = i if and only if r_{D_p}(i, j) as defined in Eqn. (3.5) equals 1. Therefore, it is sufficient to show that lft_L[i] = j if and only if r_{D_p}(i, j) = 1.

The above claim can be proved exactly the same way as Proposition 3.5 is proved in [106], replacing the role of low_R[j] with lft_L[i]. Observe that bottom-to-top row additions do not change the ranks of the lower-left minors. Hence, r_{D_p} = r_L. Therefore, it is sufficient to show that lft_L[i] = j if and only if r_L(i, j) = 1. Assume first lft_L[i] = j. The rows of L_i^j (see the definitions above Eqn. (3.5)) are linearly independent and hence rank(L_i^j) − rank(L_{i+1}^j) = 1. Now delete the last column in L_i^j, which leaves the top row with only zeroes. This implies that rank(L_i^{j−1}) − rank(L_{i+1}^{j−1}) = 0. This gives r_L(i, j) = 1 as needed. Next, assume that lft_L[i] ≠ j. Consider L_i^j and L_i^{j−1}. If lft_L[i] > j, the top row in both matrices is zero. Therefore, rank(L_i^j) − rank(L_{i+1}^j) = 0 and also rank(L_i^{j−1}) = rank(L_{i+1}^{j−1}), giving r_L(i, j) = 0 as required. If lft_L[i] < j, the top rows in both matrices are non-zero, giving rank(L_i^j) − rank(L_{i+1}^j) = 1 and rank(L_i^{j−1}) − rank(L_{i+1}^{j−1}) = 1, giving again r_L(i, j) = 0 as required.
To apply clearing, we process D∗_{p+1} after D∗_p by calling ClearPersistence(D∗_d, · · · , D∗_2, D∗_1), because if we get a pair (σ_{p+1}, σ_p) while processing D∗_p, we already know that σ_{p+1} is a negative simplex and its column in D∗_{p+1} cannot contain a defined low entry. This means that the column of σ_{p+1} in D∗_{p+1} can be zeroed out and hence can be ignored. Now, the only boundary matrix that needs to be processed without any clearing is D∗_1. So, depending on whether D_d or D_1 is larger, one can choose to process the filtration in increasing or decreasing dimension respectively.
Definition 3.14 (Persistence module). A persistence module over a poset A ⊆ R is a collection V = {V_a}_{a∈A} of vector spaces together with linear maps v_{a,a′} : V_a → V_{a′} so that v_{a,a} = id and v_{a′,a″} ◦ v_{a,a′} = v_{a,a″} for all a, a′, a″ ∈ A with a ≤ a′ ≤ a″. Sometimes we write V = {V_a → V_{a′}}_{a≤a′}, with the maps v_{a,a′} understood, to denote this collection.
Remark 3.3. A persistence module defined over a subposet A of R can be ‘extended’ into a module over R. For this, for any a < a′ in A where the open interval (a, a′) is not in A and for any a ≤ b < b′ < a′, assume that v_{b,b′} is an isomorphism, and set lim_{a→−∞} V_a = 0 if it is not given.
Our goal is to define a distance between two persistence modules with respect to which we
would bound the distance between their persistence diagrams. Given two persistence modules
defined over R, we define a distance between them by identifying maps between constituent vector
spaces of the modules.
We will come across a structural property involving maps called commutative diagrams quite
often in this and following chapters.
Definition 3.15 (Commutative diagram). A commutative diagram is a collection of maps A_i → B_i where any two compositions of maps beginning and ending in the same sets result in equal maps. Formally, whenever we have two sequences in the collection of the form

A = U_1 →^{f_1} U_2 → · · · →^{f_m} U_{m+1} = B
A = V_1 →^{g_1} V_2 → · · · →^{g_n} V_{n+1} = B,

we have f_m ◦ · · · ◦ f_1 = g_n ◦ · · · ◦ g_1. Commutative diagrams are usually formed by commutative triangles and squares.
Definition 3.16 (ε-interleaving). Let U and V be two persistence modules over the index set R. We say U and V are ε-interleaved if there exist two families of maps ϕ_a : U_a → V_{a+ε} and ψ_a : V_a → U_{a+ε} satisfying the following two conditions:

1. v_{a+ε,a′+ε} ◦ ϕ_a = ϕ_{a′} ◦ u_{a,a′} and u_{a+ε,a′+ε} ◦ ψ_a = ψ_{a′} ◦ v_{a,a′} [rectangular commutativity]

2. ψ_{a+ε} ◦ ϕ_a = u_{a,a+2ε} and ϕ_{a+ε} ◦ ψ_a = v_{a,a+2ε} [triangular commutativity]

[Diagram: the modules U : · · · → U_a → U_{a+ε} → U_{a+2ε} → · · · and V : · · · → V_a → V_{a+ε} → V_{a+2ε} → · · · connected by the interleaving maps ϕ and ψ.]

Some of the relevant maps for an interleaving between two modules are shown above, whereas the two parallelograms and the two triangles below depict the rectangular and the triangular commutativities respectively.

[Diagrams: two parallelograms formed by u_{a,a′}, v_{a+ε,a′+ε} and the maps ϕ_a, ϕ_{a′} (and symmetrically with ψ), and two triangles realizing ψ_{a+ε} ◦ ϕ_a = u_{a,a+2ε} and ϕ_{a+ε} ◦ ψ_a = v_{a,a+2ε}.]
Definition 3.17 (Interleaving distance). Given two persistence modules U and V, their interleaving distance is defined as

d_I(U, V) = inf{ε | U and V are ε-interleaved}.
Observe that, when ε = 0, Definition 3.16 implies that the maps ϕa : Ua → Va and ψa : Va →
Ua are isomorphisms. In that case, we get the following diagrams where each vertical map is an
isomorphism and each square commutes. We get two isomorphic persistence modules.
Definition 3.18 (Isomorphic persistence modules). We say two persistence modules U and V indexed over an index set A ⊆ R are isomorphic if the following two conditions hold (illustrated by the diagram above):

1. U_a ≅ V_a for every a ∈ A, and

2. the isomorphisms commute with the maps of the modules, that is, every square formed by the isomorphisms and the maps u_{a,a′}, v_{a,a′} commutes.
Fact 3.10. If two persistence modules arising from two filtrations F f and Fg are isomorphic, the
persistence diagrams Dgm p (F f ) and Dgm p (Fg ) are identical.
We have seen earlier that filtrations give rise to homology modules and hence to persistence modules. Just like for persistence modules, we can define an interleaving distance between two filtrations too. In the following definition, ι_{a,a′} denotes the inclusion map from X_a to X_{a′} and also from Y_a to Y_{a′} for a′ ≥ a. For simplicial filtrations, we need contiguity of simplicial maps to assert equality of maps at the homology level, whereas for space filtrations, we need homotopy of continuous maps to assert equality at the homology level. These maps are between the filtrations, not internal maps within the filtrations, which are still inclusions. In the next chapter, we go from inclusions to simplicial maps as internal maps (see Definition 4.2).
Definition 3.19 (ε-interleaving). We say two simplicial (space resp.) filtrations X and Y defined over R are ε-interleaved if there exist two families of simplicial (continuous resp.) maps ϕ_a : X_a → Y_{a+ε} and ψ_a : Y_a → X_{a+ε} satisfying the following two conditions:

1. ι_{a+ε,a′+ε} ◦ ϕ_a is contiguous (homotopic resp.) to ϕ_{a′} ◦ ι_{a,a′} and ι_{a+ε,a′+ε} ◦ ψ_a is contiguous (homotopic resp.) to ψ_{a′} ◦ ι_{a,a′} [rectangular commutativity]

2. ψ_{a+ε} ◦ ϕ_a is contiguous (homotopic resp.) to ι_{a,a+2ε} and ϕ_{a+ε} ◦ ψ_a is contiguous (homotopic resp.) to ι_{a,a+2ε} [triangular commutativity]
Similar to persistence modules, we can define the interleaving distance between two filtrations:

d_I(X, Y) = inf{ε | X and Y are ε-interleaved}

Two ε-interleaved filtrations give rise to ε-interleaved persistence modules at the homology level. Since contiguous simplicial (homotopic continuous resp.) maps become equal at the homology level, we obtain the following inequality.

Proposition 3.9. d_I(H_p X, H_p Y) ≤ d_I(X, Y) for any p ≥ 0, where H_p X and H_p Y denote the persistence modules of X and Y respectively at the homology level.
Now we relate the interleaving distance between two persistence modules to the persistence diagrams they define. For this, we consider a special type of persistence module called an interval module. Below, we use the standard convention that an open end of an interval is denoted by a parenthesis ‘(’ or ‘)’ and a closed end by a square bracket ‘[’ or ‘]’.
Definition 3.20 (Interval module). Given an index set A ⊆ R and a pair of indices b, d ∈ A, b ≤ d, four types of interval modules, denoted I[b, d), I(b, d], I[b, d], I(b, d) respectively, are special persistence modules {V_a → V_{a′}}_{a,a′∈A} with maps v_{a,a′} defined as:

• (closed-open) I[b, d): (i) V_a = Z_2 for all a ∈ [b, d) and V_a = 0 otherwise; (ii) v_{a,a′} is the identity map for b ≤ a ≤ a′ < d and the zero map otherwise.

• (open-closed) I(b, d]: (i) V_a = Z_2 for all a ∈ (b, d] and V_a = 0 otherwise; (ii) v_{a,a′} is the identity map for b < a ≤ a′ ≤ d and the zero map otherwise.

• (closed-closed) I[b, d]: (i) V_a = Z_2 for all a ∈ [b, d] and V_a = 0 otherwise; (ii) v_{a,a′} is the identity map for b ≤ a ≤ a′ ≤ d and the zero map otherwise.

• (open-open) I(b, d): (i) V_a = Z_2 for all a ∈ (b, d) and V_a = 0 otherwise; (ii) v_{a,a′} is the identity map for b < a ≤ a′ < d and the zero map otherwise.
In general, we denote the four types of interval modules by I⟨b, d⟩ when we are oblivious to the particular type. The two end points b, d signify the birth and the death points of the interval, in analogy to the bars we have seen for persistence diagrams. This is why sometimes we also write I⟨b, d⟩ = ⟨b, d⟩. Gabriel [163] showed that a persistence module decomposes uniquely into interval modules when the index set is finite. This condition can be relaxed further as stated in the proposition below. A persistence module U for which each of the vector spaces U_a, a ∈ A ⊆ R has finite dimension is called a pointwise finite dimensional (p.f.d. in short) persistence module. A persistence module for which the connecting linear maps have finite rank is called q-tame. The results below are part of a more general theory called quiver theory.
Proposition 3.10.

• Any p.f.d. persistence module decomposes uniquely into interval modules, that is, U ≅ ⊕_{j∈J} I⟨b_j, d_j⟩ [111, 298].

• Any q-tame persistence module decomposes uniquely into interval modules [80].
The birth and death points of the interval modules into which a given persistence module U decomposes (Proposition 3.10) can be plotted as points in R². This defines a persistence diagram Dgm U for a persistence module U. We aim to relate the interleaving distance between persistence modules and the bottleneck distance between their persistence diagrams thus defined.

For the index set A = R, Chazal et al. [77] showed that the bottleneck distance between the persistence diagrams of two p.f.d. modules is bounded from above by their interleaving distance. The result also holds for q-tame modules. It is proved in [23, 220] that the two distances are indeed equal.

Theorem 3.11. Given two q-tame persistence modules defined over the totally ordered index set R, d_I(U, V) = d_b(Dgm U, Dgm V).
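As a sanity check of Theorem 3.11 in the simplest case, consider two modules that are each a single interval I[b, d). Each diagram has one off-diagonal point, so the bottleneck distance, and hence the interleaving distance, is the smaller of matching the two points to each other or matching each to the diagonal. A minimal sketch (the function name is our own, and we assume half-open intervals over R):

```python
def interval_interleaving(b1, d1, b2, d2):
    # d_I(I[b1,d1), I[b2,d2)) via the isometry theorem:
    # either match (b1,d1) to (b2,d2) at L-infinity cost,
    # or match each point to the diagonal at half its persistence
    return min(max(abs(b1 - b2), abs(d1 - d2)),
               max((d1 - b1) / 2.0, (d2 - b2) / 2.0))
```

For instance, I[0, 2) and I[0.5, 2.5) are 0.5-interleaved by shifting, while for I[0, 1) and I[10, 10.5) it is cheaper to let both intervals die against the diagonal.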
Remark 3.4. The isometry theorem stated for the index set R does not apply directly to persistence modules that are not defined over the index set R. In this case, to define the interleaving distance, we can extend the module to be indexed over R as described in Remark 3.3. For example, consider a persistence module H_p F obtained from a filtration F defined on a finite index set A or when A = Z. Observe that all interval modules for H_p F (without extension) are of closed-closed type [b, d] for some b, d ∈ A. This brings out a subtlety. The intervals of the form [b, d] where b = d are mapped to the diagonal ∆ in the persistence diagram. These points get ignored while computing the bottleneck distance, as both diagrams have the diagonal points with infinite multiplicity. In fact, the isometry theorem (Theorem 3.11) does not hold if this is not taken care of. To address the issue, for persistence modules H_p F generated by a finite filtration F, we map each interval [b, d] in the decomposition of H_p F to a point (b, d + 1) in Dgm_p(F) (Definition 3.8). This aligns with the observation that, after the extension over the index set R, the interval [b, d] indeed stretches to [b, d + 1).
In Section 3.5.1, we describe critical points of such PL-functions. The restriction of f to the vertex set of K is a vertex function f_V : V(K) → R, which naturally induces a simplicial filtration (the lower-star filtration). In Section 3.5.2, we relate the space filtration of the PL-function f : |K| → R with the simplicial filtration induced by f_V, which in turn allows us to apply the output of the persistence algorithm run on F_{f_V} to the space filtration. Finally, in Section 3.5.3, we present a simple algorithm to compute the 0-th persistence diagram induced by a PL-function (thus also by a vertex function).
PL-critical points. For a Morse function f defined on a smooth d-manifold M, the Morse Lemma (see Proposition 1.2) suggests that the index of a critical point p is completely determined by its local neighborhood within the sub-level set M_{≤f(p)}. For PL-functions, this is captured by the lower-star and lower-link. We define the PL-critical points of PL-functions using homology groups. However, as the neighborhood of a point is not necessarily a topological ball, we now need to consider both the lower and upper links. In this context, it is more convenient to use the p-th reduced Betti number β̃_p(X) of a space/complex X.

Definition 3.22 (Reduced Betti number). β̃_p(X) = β_p(X) for p > 0. For p = 0, β̃_0(X) = β_0(X) − 1 and β̃_{−1}(X) = 0 if X is not empty; otherwise, β̃_0(X) = 0 and β̃_{−1}(X) = 1.

Definition 3.23 (PL-critical points). Given a PL-function f : |K| → R, we say that a vertex v ∈ K is a regular vertex or point if β̃_p(Llk(v)) = 0 and β̃_p(Ulk(v)) = 0 for every p ≥ −1. Otherwise, it is a PL-critical (or simply critical) vertex or point. Furthermore, we say that v has lower-link-index p if β̃_{p−1}(Llk(v)) > 0. Similarly, v has upper-link-index p if β̃_{p−1}(Ulk(v)) > 0.

The function value of a critical point is a critical value of f.
Discussions of PL-critical points. Some examples of PL-critical points are given in Figure
3.14. As mentioned above, in the smooth case for a Morse function defined on an m-manifold M,
the type of a non-degenerate critical point v is completely determined by its local neighborhood lower
than f(v) (as the portion higher than f(v) is its complement w.r.t. an m-ball). This is no longer
the case for the PL case, as we see in Figure 3.14. We also note that a PL-critical point could
have multiple lower-link-indices and upper-link-indices. Nevertheless, as we will see later (e.g.,
Theorem 3.13), these PL-critical points are related to the change of homology groups within the
sublevel-sets or superlevel-sets, somewhat analogous to the smooth setting.
We note that other notions of “critical values” exist in the literature. In particular,
the concept of homological critical values is introduced in [102] for a function f : M → R defined
on a manifold M.
Figure 3.14: The point p is a regular point in (a). The point p is PL-critical in (b), (c) and
(d). Light-blue shaded triangles are in the lower-star, light-pink ones are in the upper-star, while
light-yellow shaded ones are in neither. In (b), note that edge e is not in Llk(p); here p has
lower-link-index 1 as β̃_0(Llk(p)) = 1. In (c), the point p has upper-link-index 2. In (d), the point p has
lower-link-index 1 and upper-link-index 2.
Two choices of “sublevel sets". Consider a PL-function f : |K| → R. Its sublevel set at a is
given by
|K|a := {x ∈ |K| | f (x) ≤ a},
which gives rise to a space filtration over |K| as a increases. Let us call it a space sublevel set.
On the other hand, given a ∈ R, we can also consider the subcomplex K_a spanned by all
vertices of K whose function value is at most a; that is,

K_a := { σ ∈ K | f(v) ≤ a for every vertex v of σ }.

We refer to K_a as the simplicial sublevel set w.r.t. f : |K| → R (or w.r.t. the vertex function f|_{V(K)} :
V(K) → R). Assume the vertices v_1, . . . , v_n ∈ V(K) are ordered so that f(v_1) ≤ f(v_2) ≤ · · · ≤ f(v_n). It
is easy to see that K_a = K_{f(v_i)} if a ∈ [f(v_i), f(v_{i+1})).
simplex-wise monotonic function f¯ introduced in Fact 3.1. These two “types” of sublevel sets
relate to each other via the following result.
Figure 3.15: Consider the simplex σ = {p_0, p_1, p_2, p_3}, where τ_I = {p_0, p_1} and τ_O = {p_2, p_3}.
The shaded region equals |σ| ∩ |K|_a. This shaded region is the union of a set of segments pz
which are disjoint in their interiors. The map µ deformation retracts each segment pz to the point
p ∈ |τ_I| ⊆ |K_a|.
Theorem 3.12. Given a PL-function f : |K| → R, for any a ∈ R, the space and simplicial sublevel
sets have isomorphic homology groups; that is, H_∗(K_a) ≅ H_∗(|K|_a).
Furthermore, the following diagram commutes for any a ≤ b, where the horizontal homomorphisms are
induced by natural inclusions and the vertical ones are the isomorphisms above:

    H_∗(K_a) ------> H_∗(K_b)
       |                |
       ≅                ≅
       ↓                ↓
    H_∗(|K|_a) ----> H_∗(|K|_b)
Proof. If a < f(v_1), then K_a = ∅ and |K|_a = ∅. If a ≥ f(v_n), then K_a = K and |K|_a = |K|. Thus
the theorem holds in both cases. Now assume a ∈ [f(v_i), f(v_{i+1})) for some i ∈ [1, n). In this
case, K_a = K_{f(v_i)} = ⋃_{j≤i} Lst(v_j). It follows that |K_a| ⊆ |K|_a. We now show that there is a
continuous map µ : [0, 1] × |K|_a → |K|_a that continuously deforms the identity map on |K|_a
to a retraction from |K|_a to |K_a|. In other words, µ is a deformation retraction from |K|_a onto |K_a|;
thus |K_a| ,→ |K|_a induces an isomorphism at the homology level. This will then establish the first
part of the theorem.
For any point x ∈ |K_a|, we set µ(t, x) = x for all t ∈ [0, 1]. Now the set of points in
A := |K|_a \ |K_a| forms a set of “partial simplices”: in particular, since f is a PL-function, there is a
set C of simplices in K such that A = ⋃_{σ∈C} (interior(σ) ∩ |K|_a), where interior(σ) denotes the set
of points in |σ| that are not in any proper face of σ. We construct the map µ on A by constructing
its restriction to each simplex σ ∈ C.
Specifically, consider σ = {p_0, . . . , p_d} ∈ C. Let τ_I = {p_0, . . . , p_s} be the maximal face of
σ contained in |K|_a, and τ_O = {p_{s+1}, . . . , p_d} is then the face outside |K|_a spanned by the vertices
of σ not in |K|_a. See Figure 3.15. On the other hand, we can write the underlying space |σ|
as |σ| = ⋃_{p∈|τ_I|} ⋃_{q∈|τ_O|} pq, where pq denotes the convex combination of p and q (the line segment
from p to q). Furthermore, pq ∩ |K|_a = pz with f(z) = a, as f is a PL-function. For any point
x ∈ pz, we simply set µ(t, x) = (1 − t)x + tp. This map is well-defined as all segments pq, p ∈ |τ_I| and
q ∈ |τ_O|, are disjoint in their interiors. Since f is piecewise linear on σ, the map µ as constructed
is continuous. Also, µ(0, ·) is the identity on |K|_a, and µ(1, ·) : |K|_a → |K_a| is a retraction. Thus µ
is a deformation retraction and by Fact 1.1, |K|_a and |K_a| are homotopy equivalent, implying that
H_∗(|K|_a) ≅ H_∗(|K_a|). The first part of the theorem then follows.
Furthermore, given that µ is a deformation retraction, the natural inclusion |K_a| ⊆ |K|_a induces
an isomorphism at the homology level. The second part of the theorem follows from this,
combined with the naturality of the isomorphism H_∗(K_a) ≅ H_∗(|K_a|).
We note that we can also inspect the superlevel sets for the underlying space |K| and for the
simplicial setting in a symmetric manner. A result analogous to the above theorem also holds for
the superlevel sets.
Relation to PL-critical points. Similar to critical points for smooth functions, the homology
group of the sublevel sets can only change at the PL-critical points. For simplicity, in what follows
we set Ki := K f (vi ) , for any i ∈ [1, n]. Observe that for any a ∈ R, if complex Ka is non-empty,
then it equals Ki for some i; in particular, Ka = Ki where a ∈ [ f (vi ), f (vi+1 )).
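Computing K_a (and hence each K_i) from the vertex values is a one-line filter: a simplex belongs to K_a exactly when all of its vertices have value at most a. A minimal sketch, with our own data layout (simplices as vertex tuples, f as a dict):

```python
def simplicial_sublevel(simplices, f, a):
    """Return the subcomplex K_a: simplices all of whose vertices v have f[v] <= a.

    simplices: iterable of simplices, each a tuple of vertex ids.
    f: dict mapping vertex id -> function value.
    """
    return [s for s in simplices if max(f[v] for v in s) <= a]

# A triangle with one high vertex: K_a at a = 1.0 keeps only the low edge.
K = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
f = {0: 0.0, 1: 1.0, 2: 2.0}
```

Note that, as the text observes, K_a is constant on [f(v_i), f(v_{i+1})): for example K at a = 1.5 equals K at a = 1.0 here.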
Theorem 3.13. Given a PL-function f : |K| → R with vertices ordered as above, if the vertex v_r is
not PL-critical with respect to its lower link, i.e., β̃_p(Llk(v_r)) = 0 for all p ≥ −1, then the inclusion
K_{r−1} ,→ K_r induces an isomorphism H_p(K_{r−1}) ≅ H_p(K_r) for every p ≥ 0.

Proof. Let A = Lst(v_r) be the closed lower-star of v_r, and B = K_{r−1}. Set U = A ∪ B and
V = A ∩ B; it is easy to see that U = K_r, while V = Llk(v_r). Furthermore, by the definition of
lower-stars and lower-links over a simplicial complex, A = Lst(v_r) equals the cone of v_r over
Llk(v_r). It follows that A has trivial reduced homology in all dimensions. Now consider the
following (Mayer–Vietoris) exact sequence:

· · · −→ H̃_p(V) −→ H̃_p(A) ⊕ H̃_p(B) −→^{φ} H̃_p(U) −→ H̃_{p−1}(V) −→ · · ·   (3.9)
Corollary 3.14. Given a PL-function f : |K| → R defined on a finite simplicial complex K, let
[a, b] ⊂ R be an interval that does not contain any PL-critical value of f.
(1) Then the inclusion map K_a ,→ K_b induces an isomorphism between the simplicial homology
groups; that is, H_p(K_a) ≅ H_p(K_b) for any dimension p ≥ 0.
(2) This also implies that |K|_a ,→ |K|_b induces an isomorphism between the singular homology
groups; that is, H_p(|K|_a) ≅ H_p(|K|_b) for any dimension p ≥ 0.
Again, a version of the above Corollary also holds for superlevel sets.
As K_i := ⋃_{j≤i} Lst(v_j) is the union of the lower-stars of v_1, . . . , v_i, we call the filtration in Eqn.
(3.10) the lower star filtration for f; see also Section 3.1.2 and Figure 3.6. The two homology
modules H_p F_f and H_p F̂_f can be shown to be isomorphic due to Theorem 3.12, and thus they
produce identical persistence diagrams (Fact 3.10).
Corollary 3.15. The homology module H_p F_f is isomorphic to the homology module H_p F̂_f for
every p ≥ 0. This implies that the two persistence modules have the same persistence diagrams.
Intuitively, the lower-star filtration of the simplicial complex K can be thought of as the discrete
version of the sublevel set filtration of the space |K| w.r.t. the PL-function f. By Corollary
3.15, the lower star simplicial filtration F_f and the sublevel set space filtration F̂_f have identical
persistence diagrams. We refer to this common persistence diagram as the persistence diagram of
the PL-function f, denoted by Dgm f.
For a space filtration induced by a Morse function defined on a Riemannian manifold, the
birth- and death-coordinates of the points in the persistence diagrams correspond to critical values
of this Morse function. A similar result holds in the PL-case. In particular, using Corollary 3.14,
one can prove that, for a PL-function f, the persistence pairings for F_f occur only between PL-critical
points. That is:
Fact 3.11. Given a PL-function f : |K| → R and its associated filtration F_f, let µ_{f,p}^{i,j} denote the
corresponding p-th persistence pairing function w.r.t. F_f. If µ_{f,p}^{i,j} ≠ 0, then the vertices v_i and v_j must
be PL-critical.
However, not all PL-critical points necessarily appear in persistence pairings w.r.t. the lower
star filtration F f .
• for any simplex σ, its faces appear earlier than it in the total ordering of simplices.
With this total ordering of simplices, the induced simplex-wise filtration becomes

F_s : ∅ = L_0 ⊆ L_1 ⊆ · · · ⊆ L_m = K, where L_j := {σ_1, . . . , σ_j}.

Note that K_i = L_{I_i}; thus F_f is a subsequence of the simplex-wise filtration F_s. The construction
of F_s from F_f is not necessarily unique. We can simply choose σ_{I_{j−1}+1}, . . . , σ_{I_j} to be the set of
simplices in Lst(v_j) sorted by their dimension. We now construct the following map π : [0, m] →
[0, n] as π(j) = k if j ∈ [I_{k−1} + 1, I_k]; that is, π(j) = k means that the simplex σ_j is in the lower-star of
the vertex v_k.
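The construction of F_s and the map π can be sketched as follows; the data layout (lower stars given as sets of vertex tuples, in the order v_1, . . . , v_n) is our own choice:

```python
def build_simplexwise(lower_stars):
    """Build a simplex-wise order from the lower stars of vertices v_1, ..., v_n.

    lower_stars: list where lower_stars[k] is the set of simplices
    (tuples of vertex ids) in Lst(v_{k+1}).
    Returns (order, pi, I): order[j] is the (j+1)-st simplex in F_s,
    pi[j] = k means that simplex lies in Lst(v_k) (1-based), and
    I[k] is the index I_k with K_k = L_{I_k}.
    """
    order, pi, I = [], [], [0]
    for k, lst in enumerate(lower_stars, start=1):
        # Sorting a lower star by dimension puts faces before cofaces.
        for s in sorted(lst, key=len):
            order.append(s)
            pi.append(k)
        I.append(len(order))
    return order, pi, I

# Lower stars of f(v1) < f(v2) on a single edge {1, 2}:
order, pi, I = build_simplexwise([{(1,)}, {(2,), (1, 2)}])
```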
We run the persistence algorithm MatPersistence (Algorithm 3) on the simplex-wise filtration
F_s. Let µ_{s,p}^{i,j} denote the persistence pairing function w.r.t. F_s. Many of the pairings are between
two simplices within the same lower-star of a vertex and are not interesting. Instead, we aim to
compute the persistence diagram Dgm f for the filtration F_f, which captures only the non-local
pairings where the birth and death are from different K_i's. The following theorem specifies how we
can compute the persistence diagram Dgm f for the filtration F_f from the output of the persistence
algorithm with the simplex-wise filtration F_s as input.
Theorem 3.16 (Computation of Dgm f in the PL-case). Given a PL-function f : |K| → R, let
µ_{s,p}^{i,j} denote the p-dimensional persistence pairing function w.r.t. the simplex-wise filtration F_s as
described above. We can compute the persistence pairing function µ_{f,p}^{i,j} w.r.t. F_f as follows:

µ_{f,p}^{i,j} := Σ_{b∈(I_{i−1},I_i], d∈(I_{j−1},I_j]} µ_{s,p}^{b,d} for any i < j ≤ n;  and  µ_{f,p}^{i,∞} := Σ_{b∈(I_{i−1},I_i]} µ_{s,p}^{b,∞} for any i ≤ n.

If µ_{f,p}^{i,j} ≠ 0, we refer to (v_i, v_j) as a persistence pair w.r.t. f and we add the corresponding
persistent point (f(v_i), f(v_j)), with multiplicity µ_{f,p}^{i,j}, to the persistence diagram Dgm f. The
persistence of the pair (v_i, v_j) is |f(v_i) − f(v_j)|.
Figure 3.16: An example with µ_{f,p}^{i,j} = 2: two lowest 1's of the reduced matrix lie in the rows of
the block of Lst(v_i) and the columns of the block of Lst(v_j).
Remark 3.5. As an example, see Figure 3.16, which shows the reduced matrix after running
Algorithm 3:MatPersistence on the filtered boundary matrix D for F_s, where ‘1’ indicates the
lowest ‘1’ in the shaded columns. Only columns corresponding to p-simplices are shown. We
have µ_{f,p}^{i,j} = 2. One can take an alternate view of the persistence pairs given by µ_{f,p}^{i,j} as follows:
for each persistence index pair (i, j) ∈ Dgm(F_s) (i.e., µ_{s,p}^{i,j} > 0 w.r.t. F_s), one has a persistence
pair (v_{π(i)}, v_{π(j)}) for F_f if and only if π(i) ≠ π(j). In other words, all local pairs (i, j) ∈ Dgm(F_s)
with π(i) = π(j), signifying that σ_i and σ_j are from the lower-star of the same vertex, are ignored
for the persistence diagram Dgm(F_f).
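This alternate view amounts to a simple post-processing step over the simplex-level pairs: relabel each pair through π and drop the local ones. A sketch with our own names and data layout:

```python
from collections import Counter

def vertex_pairs(simplex_pairs, pi):
    """Aggregate simplex-level persistence pairs into vertex-level pairs.

    simplex_pairs: iterable of (b, d) index pairs output for F_s
    (1-based simplex indices; d may be None for essential classes).
    pi: mapping simplex index -> vertex index (the map pi above).
    Returns a Counter giving the multiplicity mu_f^{i,j} of each pair (i, j).
    """
    mu = Counter()
    for b, d in simplex_pairs:
        i = pi[b]
        j = pi[d] if d is not None else None
        if i != j:  # discard local pairs: birth and death in the same lower star
            mu[(i, j)] += 1
    return mu

# Two simplex pairs land in the vertex pair (1, 3); the pair (4, 5) is
# local (pi[4] == pi[5]) and is dropped, mirroring Figure 3.16.
pi = {1: 1, 2: 1, 3: 3, 4: 3, 5: 3}
mu = vertex_pairs([(1, 3), (2, 4), (4, 5)], pi)
```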
Proof (of Theorem 3.16). Recall that µ_{f,p}^{i,j} and µ_{s,p}^{i,j} are the persistence pairing functions
induced by the filtrations F_f and F_s, respectively. Similarly, we use β_{f,p}^{i,j} and β_{s,p}^{i,j} to denote the
persistent Betti numbers induced by the filtrations F_f and F_s, respectively. In what follows, we
prove that, for any dimension p ≥ 0 and i, j ∈ [1, n], µ_{f,p}^{i,j} can be computed as
stated in the theorem. The case j = ∞ can be handled in a similar manner and is left as an
exercise.
For any i ∈ [1, n], let I_i be as defined in Eqn. (3.12). Given the relation of F_f and F_s, it follows
that for any i′, j′ ∈ [1, n], we have K_{i′} = L_{I_{i′}} and K_{j′} = L_{I_{j′}}, and thus β_{f,p}^{i′,j′} = β_{s,p}^{I_{i′},I_{j′}} as F_f is a
subsequence of F_s.
Now fix the dimension p ≥ 0 and, for simplicity, omit p from all subscripts. Given any
i, j ∈ [1, n], we have:

µ_f^{i,j} = (β_f^{i,j−1} − β_f^{i,j}) − (β_f^{i−1,j−1} − β_f^{i−1,j})
        = (β_s^{I_i,I_{j−1}} − β_s^{I_i,I_j}) − (β_s^{I_{i−1},I_{j−1}} − β_s^{I_{i−1},I_j}) = µ_s^{I_i,I_j}.

Hence we aim to show that µ_s^{I_i,I_j} = Σ_{b∈(I_{i−1},I_i], d∈(I_{j−1},I_j]} µ_s^{b,d}, which then proves the theorem. To this end,
note that by Theorem 3.1, we have the following:

β_s^{I_i,I_{j−1}} − β_s^{I_i,I_j} = Σ_{b≤I_i, d>I_{j−1}} µ_s^{b,d} − Σ_{b≤I_i, d>I_j} µ_s^{b,d} = Σ_{b≤I_i, d∈(I_{j−1},I_j]} µ_s^{b,d};

β_s^{I_{i−1},I_{j−1}} − β_s^{I_{i−1},I_j} = Σ_{b≤I_{i−1}, d>I_{j−1}} µ_s^{b,d} − Σ_{b≤I_{i−1}, d>I_j} µ_s^{b,d} = Σ_{b≤I_{i−1}, d∈(I_{j−1},I_j]} µ_s^{b,d};

⇒ µ_s^{I_i,I_j} = Σ_{b≤I_i, d∈(I_{j−1},I_j]} µ_s^{b,d} − Σ_{b≤I_{i−1}, d∈(I_{j−1},I_j]} µ_s^{b,d} = Σ_{b∈(I_{i−1},I_i], d∈(I_{j−1},I_j]} µ_s^{b,d}.
An implication of the above result is that any simplex-wise filtration F s obtained from the
lower star filtration F f produces the same pairing between critical points and the same persistence
diagram.
The 0-th persistence diagram Dgm_0 f for a PL-function f : |K| → R can be computed efficiently, in O(n log n + mα(n))
time, where n and m are the number of vertices and edges in K, respectively.
Indeed, first observe that we only need the 1-skeleton of K to compute Dgm0 f . So, in what
follows, assume that K contains only vertices V and edges E. Assume that all vertices in V are
sorted in non-decreasing order of their f -values. As before, let Ki be the union of lower-stars of
all vertices v j where j ≤ i. Since we are only interested in the 0-th homology, we only need to
track the 0-th homology group of Ki , which essentially embodies the information about connected
components.
Assume we are at vertex v_j and consider Lst(v_j). There are three cases: (Case-1) Lst(v_j) contains
no edge, so v_j starts a new connected component in K_j; (Case-2) all edges in Lst(v_j) connect v_j to
a single connected component of K_{j−1}; and (Case-3) the edges in Lst(v_j) connect v_j to r ≥ 2
distinct components C_1, . . . , C_r of K_{j−1}, merging them into a single component

C′ = C_1 ∪ C_2 ∪ · · · ∪ C_r ∪ Lst(v_j).
Proposition 3.17. Suppose Case-3 happens, where edges in Lst(v_j) merge components C_1, . . . , C_r
in K_{j−1}. Let v_{k_i} be the global minimum of the component C_i for i ∈ [1, r]. Assume w.l.o.g. that
f(v_{k_1}) ≤ f(v_{k_2}) ≤ · · · ≤ f(v_{k_r}). Then the vertex v_j participates in exactly r − 1 persistence
pairings (v_{k_2}, v_j), . . . , (v_{k_r}, v_j) for the 0-dimensional persistence diagram Dgm_0 f, corresponding to
the points (f(v_{k_2}), f(v_j)), . . . , (f(v_{k_r}), f(v_j)) in Dgm_0 f.
Intuitively, when Case-3 happens, consider the set of 0-cycles c_2 = v_{k_2} + v_{k_1}, c_3 = v_{k_3} +
v_{k_1}, . . . , c_r = v_{k_r} + v_{k_1}. On one hand, it is easy to see that their corresponding homology classes
[c_i] are independent within H_0(K_{j−1}). Furthermore, each c_i is created upon entering K_{k_i}, for
i ∈ [2, r]. On the other hand, the homology classes [c_2], . . . , [c_r] become trivial in H_0(K_j) (thus
they are destroyed upon entering K_j). Hence µ_0^{k_i,j} > 0 for i ∈ [2, r], corresponding to the persistence
pairings (v_{k_2}, v_j), . . . , (v_{k_r}, v_j). Furthermore, consider any 0-cycle c_1 = v_{k_1} + c where c is a 0-chain
from K_{k_1−1}. The class [c_1] is created at K_{k_1} yet remains non-trivial at K_j. Hence there is no
persistence pairing (v_{k_1}, v_j).
Based on Proposition 3.17, we can compute the persistence pairings for the 0-dimensional
persistent homology without the matrix reduction algorithm. We only need to maintain connected
component information for each K_i, and potentially merge multiple components. We also need
to be able to query the membership of a given vertex u in the components of the current
sublevel set. Such operations can be implemented by a standard union-find data structure.
Specifically, a union-find data structure is a standard data structure that maintains dynamic
disjoint sets [109]. Given a set of elements U called the universe, this data structure typically
supports the following three operations to maintain a collection S of disjoint subsets of U, where each
subset also maintains a representative element: (1) MakeSet(x) creates a new set {x} and
adds it to S; (2) FindSet(x) returns the representative of the set in S containing x; and (3)
Union(x, y) merges the sets in S containing x and y, respectively, into a single one if they are
different.
We now present Algorithm 5:ZeroPerDg. Here the universe U is the set of all vertices V of
K. Note that each vertex v is also associated with its function value f(v). In this algorithm, we
assume that the representative of a set C is its minimum, i.e., the vertex with the smallest
f-value, and the query RepSet(v) returns the representative of the set containing the vertex v. We
assume that this query takes the same time as FindSet(v). Given a disjoint set C, we also use
RepSet(C) to denote the representative (minimum) of this set. One can view a disjoint set C in
the collection S as the maximal set of elements sharing the same representative.
Let n and m denote the number of vertices and edges in K, respectively. Sorting all vertices
in V takes O(n log n) time. There are O(n + m) MakeSet, FindSet, Union and RepSet
operations. Using the standard union-find data structure, the total time for all these operations
is O((n + m)α(n)), where α(n) is the inverse Ackermann function that grows extremely slowly with
n [109]. Hence the total time complexity of Algorithm ZeroPerDg is O(n log n + mα(n)).
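A minimal sketch of the whole computation follows; this is our own simplified rendering of the union-find idea behind ZeroPerDg, not the book's Algorithm 5 verbatim (in particular, it keeps the component minimum as the union-find root and skips zero-persistence local pairs):

```python
def zero_persistence(f, edges):
    """0-th persistence pairs of the lower-star filtration of a vertex function.

    f: list of vertex values, f[v] for v = 0, ..., n-1.
    edges: list of (u, v) vertex pairs (the 1-skeleton of K).
    Returns (pairs, essential): finite (birth, death) pairs as f-values,
    plus one essential class (birth, inf) per connected component.
    """
    n = len(f)
    parent = list(range(n))

    def find(x):  # FindSet with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    # An edge (u, v) enters the filtration at value max(f[u], f[v]).
    for u, v in sorted(edges, key=lambda e: max(f[e[0]], f[e[1]])):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # the edge creates a 1-cycle; irrelevant in dimension 0
        if f[ru] > f[rv]:
            ru, rv = rv, ru  # make ru the older (smaller-f) representative
        birth, death = f[rv], max(f[u], f[v])
        if birth < death:  # skip zero-persistence (local) pairs
            pairs.append((birth, death))
        parent[rv] = ru  # Union: the older minimum survives as root
    essential = [(f[v], float("inf")) for v in range(n) if find(v) == v]
    return pairs, essential
```

The invariant is the one stated above: the root of each disjoint set is always the vertex with the smallest f-value in its component, so no separate RepSet bookkeeping is needed.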
Note that lines 18–20 of the algorithm ZeroPerDg inspect all disjoint sets after processing all
vertices and their lower-stars; each such disjoint set corresponds to a connected component in
K. Hence each of them generates an essential pair in the 0-th persistence diagram.
Theorem 3.18. Given a PL-function f : |K| → R, the 0-dimensional persistence diagram Dgm0 f
for the lower-star filtration of f can be computed by the algorithm ZeroPerDg in O(n log n +
mα(n)) time, where n and m are the number of vertices and edges in K respectively.
Connection to minimum spanning tree. If we view the 1-skeleton of K as a graph G = (V, E),
then ZeroPerDg(K, f) essentially computes a minimum spanning forest of G with the following
edge weights: for every edge e = (u, v), we set its weight w(e) = max{f(u), f(v)}. Then, we can
obtain the persistence pairs output by ZeroPerDg by running the well-known Kruskal's algorithm on
the weighted graph G. When we come across an edge e = (u, v) that joins two disjoint components
in this algorithm, we determine the two minimum vertices ℓ_1, ℓ_2 in these two components and pair
e with the one among ℓ_1, ℓ_2 that has the larger f-value. After generating all such vertex-edge pairs
(u, e), we convert them to vertex-vertex pairs (u, v) where e ∈ Lst(v). We throw away any pair of
the form (u, u) because it signifies a local pair.
Graph filtration. The algorithm ZeroPerDg can be easily adapted to compute persistence for
a given filtration of a graph. In this case, we process the vertices and edges in their order in the
filtration and maintain connected components using a union-find data structure as in ZeroPerDg.
For each edge e = (u, v), we check whether it connects two disconnected components represented by
vertices ℓ_1 and ℓ_2 (line 11) and, if so, e is paired with the younger vertex among ℓ_1 and ℓ_2 (line
13). We output all vertex-edge pairs thus computed. The vertices and edges that remain unpaired
provide the infinite bars in the 0-th and 1-st persistence diagrams, respectively. The algorithm runs in O(nα(n))
time if the graph has n vertices and edges in total. The O(n log n) term in the complexity is
eliminated because the sorted order of the vertices is implicitly given by the input filtration.
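The graph-filtration variant can be sketched the same way; the input convention here (vertices as integers, edges as vertex tuples, given in filtration order) is our own:

```python
def graph_filtration_persistence(filtration):
    """0-/1-dimensional persistence of a filtered graph.

    filtration: list of simplices in filtration order; vertices are ints,
    edges are (u, v) tuples. Returns (pairs, inf0, inf1): vertex-edge pairs
    as (birth_index, death_index), unpaired vertex indices (infinite 0-bars),
    and unpaired edge indices (infinite 1-bars).
    """
    parent, birth = {}, {}

    def find(x):  # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs, inf1 = [], []
    for idx, s in enumerate(filtration):
        if not isinstance(s, tuple):      # a vertex starts a new component
            parent[s] = s
            birth[s] = idx
        else:                             # an edge (u, v)
            ru, rv = find(s[0]), find(s[1])
            if ru == rv:
                inf1.append(idx)          # edge creates a cycle: infinite 1-bar
            else:
                if birth[ru] > birth[rv]:
                    ru, rv = rv, ru
                pairs.append((birth[rv], idx))  # the younger component dies
                parent[rv] = ru
    inf0 = [birth[v] for v in parent if find(v) == v]
    return pairs, inf0, inf1
```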
For efficient implementation, clearing and compression strategies as described in Section 3.3.2
were presented by Chen and Kerber [95]. We have given a proof based on matrix reduction that the
same persistence pairs can be computed by considering the anti-transpose of the boundary matrix.
This is termed the cohomology algorithm, first introduced in [115]. The name is justified by
the fact that, considering cohomology groups and the resulting persistence module that reverses
the arrows (Fact 2.14), we obtain the same barcode. The anti-transpose of the boundary matrix
indeed represents the coboundary matrix filtered in reverse. These tricks are further used by Bauer
for processing Rips filtrations efficiently in the Ripser software [19]; see also [304]. Boissonnat et
al. [41, 42] have suggested a technique to reduce the size of a given filtration using the strong collapse
of Barmak and Minian [17]. The collapse of the complex can be achieved efficiently through
simple manipulations of the boundary matrix.
The concept of bottleneck distance for persistence diagrams was first proposed by Cohen-Steiner
et al. [102], who also showed the stability of such diagrams in terms of the bottleneck distance
with respect to the infinity norm of the difference between the functions generating them. This
result was extended to the Wasserstein distance, though in a weaker form, in [104], and was recently
improved [278]. The more general concept of interleaving distance between persistence modules
and the stability of persistence diagrams with respect to it was presented by Chazal et al. [77].
The fact that the bottleneck distance between persistence diagrams is not only bounded from above
by the interleaving distance but is indeed equal to it was shown by Lesnick [220], and was further
studied by Bauer and Lesnick [23] later. Also, see [54] for more generalizations at the algebraic level.
The use of reduced Betti numbers of the lower-link of a vertex to quantify its criticality was
originally introduced in [150] for a PL-function defined on a triangulation of a d-manifold. Our
PL-criticality considers both the lower-link and the upper-link for more general simplicial complexes.
As far as we know, the relations between such PL-critical points and the homology groups of sublevel
sets in the PL-setting have not been stated explicitly elsewhere in the literature. The concept of
homological critical values was first introduced in [102], and the more general concept of “levelset
critical values” (and levelset tame functions) was originally introduced in [63].
The idea of using union-find data structure to compute the 0-th persistent homology group was
already introduced in the original persistence algorithm paper [152]. In this chapter, we present a
modification for the PL-function setting.
Exercises
1. Let K be a p-complex with every (p − 1)-simplex incident to exactly two p-simplices. Let
M be a boundary matrix of the boundary operator ∂ p for K. We run a different version of
the persistence algorithm on M. We scan its columns from left to right as before, but we
add the current column to its right to resolve conflict, i.e., for each i = 1, · · · , n in this order
if there exists j > i so that low M [i] = low M [ j], then add col M [i] to col M [ j]. Show that:
2. For a given matrix with binary entries, a valid column operation is one that adds a column
to a column on its right (Z_2-addition). Similarly, a valid row operation is one that adds a row
to a row above it. Show that there exists a set of valid column and row operations
that leaves every row and column either empty or with a single non-zero entry.
7. Let mq and βq be the number of q-simplices and qth Betti number of a simplicial complex
of dimension p. Using pairing in persistence, show that
8. Let F be a filtration where every p-simplex appears only after all (p − 1)-simplices, as in
Eqn. (3.8). Let F′ be a modified filtration of F defined as follows: for every p ≥ 0, all p-simplices
in F′ are ordered in non-decreasing order of their persistence values in F, where
unpaired p-simplices are assumed to have persistence value ∞. Show that the persistence pairing remains
the same for F and F′.
Describe the relation between their corresponding persistence diagrams Dgm(F) and Dgm(F′).
10. Give an example of a piecewise linear function f : |K| → R where a vertex v_i is a PL-critical
point, but H_∗(K_{i−1}) ≅ H_∗(K_i) as induced by inclusion.
11. Let f : V(K) → R be a vertex function defined on the vertex set V(K) of a complex K.
Consider g = h ◦ f + a, where h : R → R is a monotone function and a ∈ R is a real value.
Consider the lower-star filtrations F_f and F_g induced by the induced PL-functions f, g : |K| →
R as in Eqn. (3.10). Describe the relation between their corresponding persistence diagrams
Dgm(F_f) and Dgm(F_g).
• Show that dI (P f , Pg ) ≤ δ.
13. For a PL-function f : |K| → R, we know how to produce a simplex-wise filtration F so that
the barcode for f can be read from the barcode of F. Design an algorithm to do the reverse,
that is, given a filtration F on a complex K, produce a filtration G of a simplicial complex
K 0 so that G is indeed a simplex-wise filtration of a PL function g : |K 0 | → R where bars
for F can be obtained from those for G. (Hint: use barycentric subdivision of K).
15. Consider two persistence modules U and V as shown below, and a sequence of linear
maps f_i : U_i → V_i so that all squares commute:

U : U_1 ---> U_2 ---> U_3 ---> · · · ---> U_m
     |f_1     |f_2     |f_3               |f_m
     ↓        ↓        ↓                  ↓
V : V_1 ---> V_2 ---> V_3 ---> · · · ---> V_m

Consider the sequence ker F : ker f_1 → ker f_2 → · · · → ker f_m,
where the maps are induced from the module U. Prove that ker F is a persistence module.
Show the same for the sequences
General Persistence
We have considered filtrations so far for defining persistence and its stability. In a filtration, the
connecting maps between consecutive spaces or complexes are inclusions. Assuming a discrete
subset of reals, A : a_0 ≤ a_1 ≤ · · · ≤ a_n, as an index set, we write a filtration as
X_{a_0} ,→ X_{a_1} ,→ · · · ,→ X_{a_n}.
A more generalized scenario occurs when the inclusions are replaced with continuous maps for
space filtrations and simplicial maps for simplicial filtrations: xi j : Xai → Xa j . In that case, we
call the sequence a space and a simplicial tower respectively:
X : X_{a_0} −−x_{01}−→ X_{a_1} −−x_{12}−→ · · · −−x_{(n−1)n}−→ X_{a_n}.   (4.1)
Considering the homology group of each space (complex resp.) in the sequence, we obtain a
sequence of vector spaces connected with linear maps, which we have seen before. Specifically,
we obtain the following tower of vector spaces:
H_p X : H_p(X_{a_0}) −−x_{01∗}−→ H_p(X_{a_1}) −−x_{12∗}−→ · · · −−x_{(n−1)n∗}−→ H_p(X_{a_n}).
In the above sequence each linear map xi j∗ is the homomorphism induced by the map xi j . We
have already seen that persistent homology of such a sequence of vector spaces and linear maps
are well defined. However, since the linear maps here are not induced by inclusions, the original
persistence algorithm as described in the previous chapter does not work. In Section 4.2, we
describe a new algorithm to compute the persistence diagram of simplicial towers. Next, we
generalize a filtration by allowing the inclusion maps to be directed either way giving rise to what
is called a zigzag filtration:
F : Xa0 ↔ Xa1 ↔ · · · ↔ Xan . (4.2)
where each bidirectional arrow ‘↔’ is either a forward or a backward inclusion map. In Sec-
tion 4.3, we present an algorithm to compute the persistence of a zigzag filtration. A juxtaposition
of a zigzag filtration with a tower provides a further generalization referred to as a zigzag tower.
Section 4.4 presents an approach for computing the persistence of such a tower.
Before presenting the algorithms, we generalize the notion of stability for towers. We have
seen such a notion in Section 3.4 for persistence modules arising out of filtrations. Here, we adapt
it to a tower.
Definition 4.1 (Tower). A tower indexed in an ordered set A ⊆ R is any collection T = {T_a}_{a∈A} of
objects T_a, a ∈ A, together with maps t_{a,a′} : T_a → T_{a′} so that t_{a,a} = id and t_{a′,a″} ◦ t_{a,a′} = t_{a,a″} for
all a ≤ a′ ≤ a″. Sometimes we write T = {T_a −−t_{a,a′}−→ T_{a′}}_{a≤a′} to denote the collection together with the maps.
We say that the tower T has resolution r if a ≥ r for every a ∈ A.
When T is a collection of topological spaces connected with continuous maps, we call it a
space tower. When it is a collection of simplicial complexes connected with simplicial maps, we
call it a simplicial tower, and when it is a collection of vector spaces connected with linear maps,
we call it a vector space tower.
Remark 4.1. As we have already seen, in practice it may happen that a tower needs to be defined
over a discrete set, or more generally an index set A that is only a subposet of R. In such a case,
one can ‘embed’ A into R and convert the input to a tower according to Definition 4.1 by assuming
that, for any a < a′ ∈ A with (a, a′) ∩ A = ∅ and for any a ≤ b < b′ < a′, the map t_{b,b′} is an isomorphism.
Definition 4.2 (Interleaving of simplicial (space) towers). Let X = {X_a −−x_{a,a′}−→ X_{a′}}_{a≤a′} and
Y = {Y_a −−y_{a,a′}−→ Y_{a′}}_{a≤a′} be two towers of simplicial complexes (spaces resp.) indexed in R. For any real
ε ≥ 0, we say that they are ε-interleaved if for every a one can find simplicial maps (continuous
maps resp.) ϕ_a : X_a → Y_{a+ε} and ψ_a : Y_a → X_{a+ε} so that
(i) for all a ∈ R, ψ_{a+ε} ◦ ϕ_a and x_{a,a+2ε} are contiguous (homotopic resp.),
(ii) for all a ∈ R, ϕa+ε ◦ ψa and ya,a+2ε are contiguous (homotopic resp.),
(iii) for all a0 ≥ a, ϕa0 ◦ xa,a0 and ya+ε,a0 +ε ◦ ϕa are contiguous (homotopic resp.),
(iv) for all a0 ≥ a, xa+ε,a0 +ε ◦ ψa and ψa0 ◦ ya,a0 are contiguous (homotopic resp.).
These four conditions are summarized by requiring that the four diagrams below commute up
to contiguity (homotopy resp.):

X_a ---x_{a,a+2ε}---> X_{a+2ε}                X_{a+ε}
   \                ↗                        ↗       \
   ϕ_a         ψ_{a+ε}                    ψ_a         ϕ_{a+ε}        (4.3)
     ↘        /                           /             ↘
      Y_{a+ε}                           Y_a ---y_{a,a+2ε}---> Y_{a+2ε}
X_a -------x_{a,a′}-------> X_{a′}            X_{a+ε} --x_{a+ε,a′+ε}--> X_{a′+ε}
   \                          \                 ↗                       ↗
   ϕ_a                       ϕ_{a′}          ψ_a                     ψ_{a′}
     ↘                         ↘             /                       /
     Y_{a+ε} --y_{a+ε,a′+ε}--> Y_{a′+ε}    Y_a --------y_{a,a′}------> Y_{a′}
If we replace the operator ‘+’ by the multiplication ‘·’ with respect to the indices in the above
definition, then we say that X and Y are multiplicatively ε-interleaved. By interleaving we will
mean additive interleaving by default and use the term multiplicative interleaving where necessary
to signify that the shift is multiplicative rather than additive.
Definition 4.3 (Interleaving distance between simplicial (space) towers). The interleaving dis-
tance between two simplicial (space) towers X and Y is:
Similar to the simplicial (space) towers, we can define interleaving of vector space towers.
But, in that case, we replace contiguity (homotopy) with equality in conditions (i) through (iv).
Definition 4.4 (Interleaving of vector space towers). Let U = {U_a −−u_{a,a′}−→ U_{a′}}_{a≤a′} and V = {V_a −−v_{a,a′}−→
V_{a′}}_{a≤a′} be two vector space towers indexed in R. For any real ε ≥ 0, we say that they are
ε-interleaved if for each a ∈ R one can find linear maps ϕ_a : U_a → V_{a+ε} and ψ_a : V_a → U_{a+ε} so
that
(i) for all a ∈ R, ψ_{a+ε} ◦ ϕ_a = u_{a,a+2ε},
xa,a0 ya,a0
Suppose that we have two simplicial (space) towers X = {Xa → Xa0 } and Y = {Ya →
Ya0 }. Consider the two vector space towers also called homology towers obtained by taking the
homology groups of the complexes (spaces), that is,
x(a,a0 )∗ y(a,a0 )∗
VX = {H p (Xa ) → H p (Xa0 )} and VY = {H p (Ya ) → H p (Ya0 )}.
The following should be obvious because simplicial (continuous resp.) maps become linear maps
and contiguous (homotopic resp.) maps become equal at the homology level.
Interleaving between Čech and Rips filtrations. We show an example where we can use the
stability result in Corollary 4.4. Let P ⊆ M be a finite subset of a metric space (M, d). Consider
the Rips and Čech filtrations

R : {VR^ε(P) ,→ VR^{ε′}(P)}_{0<ε≤ε′≤ε_0} and C : {C^ε(P) ,→ C^{ε′}(P)}_{0<ε≤ε′≤ε_0}.

Figure 4.1 illustrates that Čech and Rips complexes are multiplicatively 2-interleaved. Then,
according to Corollary 4.4, the persistence diagrams Dgm_log C and Dgm_log R have bottleneck
distance at most log 2 = 1.
Computational Topology for Data Analysis 97
4.2.1 Annotations
We maintain a consistent cohomology basis using a notion called annotations [60], which are binary vectors assigned to simplices. These annotations are updated as we move forward through the sequence given by the tower. This implicitly maintains a cohomology basis in the reverse direction, where the birth and death of cohomology classes coincide with the death and birth, respectively, of homology classes.
Definition 4.6 (Annotation). Given a simplicial complex K, let K(p) denote the set of p-simplices in K. An annotation for K(p) is an assignment a : K(p) → Z2^g of a binary vector aσ = a(σ) of length g to each p-simplex σ ∈ K. The binary vector aσ is called the annotation of σ. Each entry '0' or '1' of aσ is called an element of it. Annotations for simplices provide an annotation for every p-chain cp: acp = Σσ∈cp aσ.
An annotation a : K(p) → Z2^g is valid if the following two conditions are satisfied:

1. g = rank Hp(K), and

2. two p-cycles z1 and z2 have az1 = az2 if and only if their homology classes are identical, i.e., [z1] = [z2].
Proposition 4.5. The following two statements are equivalent.

1. An annotation a : K(p) → Z2^g is valid.

2. The cochains {φi}, i = 1, . . . , g, given by φi(σ) = aσ[i] for every σ ∈ K(p) are cocycles whose cohomology classes {[φi]}, i = 1, . . . , g, constitute a basis of H^p(K).
In light of the above result, an annotation is simply one way to represent a cohomology ba-
sis. However, by representing the corresponding basis as an explicit vector associated with each
simplex, it localizes the basis to each simplex. As a result, we can update the cohomology basis
locally by changing the annotations locally (see Proposition 4.8). This point of view also helps to
reveal how we can process elementary collapses, which are neither inclusions nor deletions, by
transferring annotations (see Proposition 4.9).
4.2.2 Algorithm

Consider the persistence module Hp K induced by a simplicial tower K : {Ki → Ki+1}, where every fi : Ki → Ki+1 is a so-called elementary simplicial map which we will introduce shortly:

Hp K : Hp(K0) → Hp(K1) → Hp(K2) · · · → Hp(Kn), with maps f0∗, f1∗, . . . , fn−1∗.

Instead of tracking a consistent homology basis for the module Hp K, we track a cohomology basis in the module H^p K, where the homomorphisms run in the reverse direction:

H^p K : H^p(K0) ← H^p(K1) ← H^p(K2) · · · ← H^p(Kn), with maps f0^∗, f1^∗, . . . , fn−1^∗.
As we move from left to right in the above sequence, the annotations implicitly maintain a coho-
mology basis whose elements are also time stamped to signify when a basis element is born or
dies. We keep in mind that the birth and death of a cohomology basis element coincides with the
death and birth of a homology basis element because the two modules run in opposite directions.
To jump start the algorithm, we need annotations for simplices in K0 at the beginning whose
non-zero elements are timestamped with 0. This can be achieved by considering an arbitrary filtra-
tion of K0 and then applying the generic algorithm as we describe for inclusions in Section 4.2.3.
The first vertex in this filtration gets the annotation of [1].
Before describing the algorithm, we observe a simple fact: simplicial maps can be decomposed into elementary maps, which lets us design simpler atomic steps for the algorithm.

Definition 4.7 (Elementary simplicial map). A simplicial map f : K → K′ is called elementary if it is of one of the following two types:

• f is injective, and K′ has at most one more simplex than K. In this case, f is called an elementary inclusion.

• f is not injective but is surjective, and the vertex map fV is injective everywhere except on a pair {u, v} ⊆ V(K). In this case, f is called an elementary collapse. An elementary collapse maps a pair of vertices into a single vertex, and is injective on every other vertex.
Proposition 4.6. If f : K → K′ is a simplicial map, then there are elementary simplicial maps fi,

K = K0 ──f0──▶ K1 ──f1──▶ K2 · · · ──fn−1──▶ Kn = K′,

so that f = fn−1 ◦ fn−2 ◦ · · · ◦ f0.
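Proposition 4.6 can be exercised constructively. The sketch below is our own illustration (the names `decompose` and `apply_vertex_map` are not from the text, and it assumes a vertex map that only merges vertices into fixed ones): it emits elementary collapses, one merged vertex at a time, followed by single-simplex elementary inclusions.

```python
def apply_vertex_map(K, w, u):
    """Image of complex K under the vertex map sending w to u (identity elsewhere)."""
    return {frozenset(u if x == w else x for x in s) for s in K}

def decompose(K, vmap, Kprime):
    """Decompose the simplicial map induced by vmap : V(K) -> V(K') into
    elementary collapses followed by elementary inclusions (Proposition 4.6).
    Assumes every non-fixed vertex is sent to a fixed vertex of vmap."""
    steps, cur = [], set(K)
    for w in sorted(v for v in vmap if vmap[v] != v):
        steps.append(('collapse', (vmap[w], w)))     # pair (u, v) collapsed to u
        cur = apply_vertex_map(cur, w, vmap[w])
    for s in sorted(set(Kprime) - cur, key=len):     # faces before cofaces
        steps.append(('include', s))
        cur.add(s)
    assert cur == set(Kprime)                        # sanity: we reached K'
    return steps
```

For instance, collapsing v into u on a hollow triangle and then including a new vertex z yields one elementary collapse followed by one elementary inclusion.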
In view of Proposition 4.6, it is sufficient to show how one can design the persistence algorithm for an elementary simplicial map. At this point, we make a change in Definition 4.7 of elementary simplicial maps that eases further discussions. We let fV be the identity (which is an injective map) everywhere except possibly on a pair of vertices {u, v} ⊆ V(K), which fV maps to one of these two vertices, say u, in K′. This change can be implemented by renaming the vertices in K′ that are mapped onto injectively.
Figure 4.2: Case (i) of inclusion: the boundary ∂uv = u + v of the edge uv has annotation 1 + 1 = 0.
After its addition, every edge gains an element in its annotation which is 0 for all except the edge
uv. Case (ii) of inclusion: the boundary of the top triangle has annotation 01. It is added to the
annotation of uv which is the only edge having the second element 1. Consequently the second
element is zeroed out for every edge, and is then deleted.
Case (i): If a∂σ is a zero vector, the class [∂σ] is trivial in H p−1 (Ki ). This means that σ creates a
p-cycle in Ki+1 and by duality a p-cocycle is killed while going left from Ki+1 to Ki . In this case
we augment the annotations for all p-simplices by one element with a time stamp i + 1; that is, the annotation [b1, b2, · · · , bg] for every p-simplex τ is updated to [b1, b2, · · · , bg, bg+1], with bg+1 time-stamped i + 1. We set bg+1 = 0 for τ ≠ σ and bg+1 = 1 for τ = σ. The elements bi of aσ are set to zero for 1 ≤ i ≤ g. The annotations for all other simplices remain unchanged. See Figure 4.2(a).
Case (ii): If a∂σ is not a zero vector, the class of the (p − 1)-cycle ∂σ is nontrivial in H p−1 (Ki ).
Therefore, σ kills the class of this (p − 1)-cycle and a corresponding class of (p − 1)-cocycles
is born in the reverse direction. We simulate it by forcing a∂σ to be zero which affects other
annotations as well. Let i1 < i2 < · · · < ik be the set of indices in non-decreasing order so that
bi1 , bi2 , · · · , bik are all of the nonzero elements in a∂σ = [b1 , b2 , · · · , bik , · · · , bg ]. Recall that φ j
denotes the (p − 1)-cocycle given by its evaluation φ j (σ0 ) = aσ0 [ j] for every (p − 1)-simplex
σ0 ∈ Ki (Proposition 4.5). With this notation, the cocycle φ = φi1 + φi2 + · · · + φik is born
after deleting σ in the reverse direction. This cocycle does not exist after time ik in the reverse
direction. In other words, the cohomology class [φ], which is born on leaving time i + 1, is killed at
time ik . This pairing matches that of the standard persistence algorithm where the youngest basis
element is chosen to be paired among all those ones whose combination is killed. We add the
vector a∂σ to the annotation of every (p − 1)-simplex whose ik -th element is nonzero. This zeroes
out the ik -th element of the annotation for every (p − 1)-simplex and at the same time updates
other elements so that a valid annotation according to Proposition 4.5 is maintained. We simply
delete the ik-th element from the annotation of every (p − 1)-simplex. See Figure 4.2(b). We further set the annotation aσ of σ to be a zero-vector of length s, where s is the length of the annotations of the other p-simplices at this point.
Algorithm 6 Annot(K)

Input:
  K: input complex
Output:
  Annotation for every simplex in K

1: Let m := |K(0)|
2: For every vertex vi ∈ K(0), assign an m-vector a(vi) where a(vi)[j] = 1 iff j = i
3: for p = 1 → d do
4:   for all simplices σ ∈ K(p) do
5:     Let the annotation of every p-simplex be a vector of length g so far
6:     if a(∂σ) ≠ 0 then
7:       assign a(σ) to be a 0-vector of size g
8:       pick any non-zero entry bu in a(∂σ)
9:       add a(∂σ) to the annotation of every (p − 1)-simplex σ′ s.t. a(σ′)[u] = 1
10:      delete the u-th entry from the annotation of every (p − 1)-simplex
11:    else
12:      extend a(τ) for every p-simplex τ added so far by appending a 0 bit
13:      create the vector a(σ) of length g + 1 with only the last bit being 1
14:    end if
15:  end for
16: end for
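Algorithm 6 transcribes almost directly into executable form. The following is a minimal sketch under our own data conventions (simplices are frozensets, annotations are Z2 bit lists, and `annot` is our name for the routine); time stamps are omitted for brevity, and line 8's "pick any non-zero entry" is resolved by taking the last one.

```python
def annot(simplices):
    """Compute a valid annotation for every simplex of a complex (Algorithm 6).
    `simplices`: iterable of vertex tuples; all faces must be present."""
    K = {frozenset(s) for s in simplices}
    by_dim = {}
    for s in K:
        by_dim.setdefault(len(s) - 1, []).append(s)
    for p in by_dim:
        by_dim[p].sort(key=lambda s: tuple(sorted(s)))
    verts = by_dim[0]
    m = len(verts)
    a = {v: [1 if j == i else 0 for j in range(m)] for i, v in enumerate(verts)}
    glen = {0: m}                              # current annotation length per dimension
    for p in range(1, max(by_dim) + 1):
        glen[p] = 0
        for sigma in by_dim.get(p, []):
            bd = [0] * glen[p - 1]             # annotation of the boundary chain, mod 2
            for v in sigma:
                bd = [(x + y) % 2 for x, y in zip(bd, a[sigma - {v}])]
            if any(bd):                        # sigma kills a (p-1)-cycle class
                a[sigma] = [0] * glen[p]
                u = max(i for i, b in enumerate(bd) if b)   # "pick any": take the last
                for tau in by_dim[p - 1]:
                    if a[tau][u] == 1:
                        a[tau] = [(x + y) % 2 for x, y in zip(a[tau], bd)]
                for tau in by_dim[p - 1]:      # drop the zeroed-out entry everywhere
                    del a[tau][u]
                glen[p - 1] -= 1
            else:                              # sigma creates a p-cycle class
                for tau in by_dim[p]:
                    if tau in a:
                        a[tau].append(0)
                a[sigma] = [0] * glen[p] + [1]
                glen[p] += 1
    return a

# A hollow triangle (a circle) has one component and one loop.
circle = [('u',), ('v',), ('w',), ('u', 'v'), ('v', 'w'), ('u', 'w')]
a = annot(circle)
assert all(a[frozenset(x)] == [1] for x in [('u',), ('v',), ('w',)])
```

Summing the annotations of the three edges of the loop gives [1], certifying a nontrivial class in H1; inserting the filled triangle makes all edge annotations empty, as the loop becomes a boundary.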
Definition 4.8 (Link condition). A vertex pair (u, v) in a simplicial complex Ki satisfies the link
condition if the edge uv ∈ Ki and Lk u ∩ Lk v = Lk uv. An elementary collapse fi : Ki → Ki+1
satisfies the link condition if the vertex pair on which fi is not injective satisfies the link condition.
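The link condition is easy to test mechanically. Below is a minimal sketch with our own helper names, where the link of a simplex σ consists of all simplices disjoint from σ whose union with σ lies in the complex:

```python
def link(K, s):
    """Link of simplex s in complex K: simplices disjoint from s whose union with s is in K."""
    s = frozenset(s)
    return {t for t in K if not (t & s) and (t | s) in K}

def satisfies_link_condition(K, u, v):
    """Link condition for the vertex pair (u, v): uv in K and Lk u ∩ Lk v = Lk uv."""
    uv = frozenset({u, v})
    return uv in K and link(K, {u}) & link(K, {v}) == link(K, uv)
```

On the hollow triangle on {u, v, w} the pair (u, v) fails the condition (Lk u ∩ Lk v = {w} while Lk uv = ∅); inserting the filled triangle uvw makes it pass.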
Figure 4.3: Annotation updates for elementary collapse: inclusion of a triangle so as to satisfy the
link condition (upper row), annotation transfer and actual collapse (lower row); annotation 11 of
the vanishing edge uv is added to all edges (cofacets) adjoining u.
Proposition 4.7 ([12]). If an elementary collapse fi : Ki → Ki+1 satisfies the link condition, then the underlying spaces |Ki| and |Ki+1| remain homotopy equivalent. Hence, the induced homomorphisms fi∗ : Hp(Ki) → Hp(Ki+1) and fi^∗ : H^p(Ki) ← H^p(Ki+1) are isomorphisms.
If an elementary collapse satisfies the link condition, we can perform the collapse knowing
that the cohomology does not change. Otherwise, we know that the cohomology is affected by the
collapse and it should be reflected in our updates for annotations.
The diagram below provides a precise means to carry out the change in cohomology. Let S be the minimal set of simplices, ordered in non-decreasing order of their dimensions, whose addition to Ki makes (u, v) satisfy the link condition. One can describe a construction of S recursively as follows. In dimension 1, if the edge (u, v) is missing, it is added to S. Recursively assume that S has all of the necessary p-simplices. Then, all missing (p + 1)-simplices adjoining the edge (u, v) whose boundaries are already present are added to S. For each simplex σ ∈ S, we modify the annotations of every simplex as we would have done if σ were to be inserted. Thereafter, we carry out the rest of the elementary collapse. In essence, we implicitly obtain an intermediate complex K̂i = Ki ∪ S for which the following diagram commutes:

         fi
    Ki ──────▶ Ki+1
     │           ▲
   j │          ╱ fi′
     ▼         ╱
     K̂i ──────╱

Here, fi′ is induced by the same vertex map that induces fi, and j is an inclusion. This means that the persistence of fi is identical to that of fi′ ◦ j, which justifies our action of elementary inclusions followed by the actual collapse.
We remark that this is the only place where we may implicitly insert a simplex σ in the current approach. The number of such simplices σ is usually much smaller than the number of simplices that one may need for the coning strategy detailed in Section 4.4 to process simplicial towers.
After constructing K̂i with annotations, we transfer annotations to prepare for the collapse.
This step locally changes the annotations for simplices containing the vertices u and/or v. The
following definition facilitates the description.
Definition 4.9 (Vanishing; Mirror simplices). For the elementary collapse fi′ : K̂i → Ki+1, a simplex σ ∈ K̂i is called vanishing if the cardinality of fi′(σ) is one less than that of σ. Two simplices σ and σ′ are called mirror partners if one contains u and the other v, and they share the rest of their vertices. In Figure 4.3 (lower row), the vanishing simplices are {uv, uvw} and the mirror partners are {u, v} and {uw, vw}.
In an elementary collapse that sends (u, v) to u, all vanishing simplices need to be deleted, and
all simplices containing v need to be pulled to corresponding ones containing the vertex u (which
are their mirror partners). We update the annotations in such a way that the annotations of all
vanishing simplices become zero, and those of both mirror partners become the same. Once this
is achieved, the collapse is implemented by simply deleting the vanishing simplices and replacing
v with u in all simplices containing v (effectively this identifies mirror partners) without changing
their annotations. The following proposition provides the justification behind the specific update
operations that we perform.
Proposition 4.8. Let K be a simplicial complex and a : K(p) → Z2^g a valid annotation. Let σ ∈ K(p) be any p-simplex and τ any of its (p − 1)-faces. Then, adding aσ to the annotation of every cofacet of τ, including σ, produces a valid annotation for K(p). Furthermore, the cohomology basis corresponding to the annotations (Proposition 4.5) remains unchanged by this modification.
Consider now the elementary collapse fi′ : K̂i → Ki+1 that sends (u, v) to u. We update the annotations for simplices in K̂i as follows. First, note that the vanishing simplices are exactly those simplices containing the edge {u, v}. For every p-simplex containing {u, v}, i.e., a vanishing simplex, exactly two of its (p − 1)-faces are mirror simplices, and all remaining (p − 1)-faces are vanishing simplices. Let σ be a vanishing p-simplex and τ be its (p − 1)-face that is a mirror simplex containing u. We add aσ to the annotations of all cofacets (cofaces of codimension 1) of τ, including σ. This implements the annotation transfer for σ. By Proposition 4.8, the new annotation generated by this process corresponds to the old cohomology basis for K̂i. This new annotation has aσ = 0 since aσ + aσ = 0. See the lower row of Figure 4.3. We perform the above operation for each vanishing simplex. It turns out that, by using the relations between vanishing simplices and mirror simplices, each mirror simplex eventually acquires an annotation identical to that of its partner. Specifically, we have the following observation.
Proposition 4.9. After all possible annotation transfers involved in a collapse, (i) each vanishing
simplex has a zero annotation; and (ii) each mirror simplex τ has the same annotation as its
mirror partner simplex τ0 .
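The annotation transfer of Proposition 4.8 can be exercised on a toy complex. Everything below is our own illustration, not the example of Figure 4.3: the complex, its hand-checked valid annotation, and the helper names are assumptions. After one transfer per vanishing simplex, the vanishing edge carries the zero annotation and the mirror partners agree, as Proposition 4.9 predicts.

```python
def transfer(ann, K, u, v):
    """Annotation transfer preparing the elementary collapse (u, v) -> u on K-hat.
    For each vanishing simplex sigma (containing both u and v), add a_sigma to
    all cofacets of its mirror face tau = sigma - {v} (Proposition 4.8)."""
    for sigma in sorted((s for s in K if u in s and v in s), key=len):
        tau, a_sigma = sigma - {v}, ann[sigma]
        for cof in K:
            if tau < cof and len(cof) == len(tau) + 1:
                ann[cof] = [(x + y) % 2 for x, y in zip(ann[cof], a_sigma)]
    return ann

def fs(*xs):
    return frozenset(xs)

# A complex with H1 of rank 1: two filled triangles uvw and uvx sharing the edge uv,
# plus an unfilled loop u-w-y-x-u through an extra vertex y.
K = {fs('u'), fs('v'), fs('w'), fs('x'), fs('y'),
     fs('u', 'v'), fs('u', 'w'), fs('v', 'w'), fs('u', 'x'), fs('v', 'x'),
     fs('w', 'y'), fs('x', 'y'),
     fs('u', 'v', 'w'), fs('u', 'v', 'x')}
# A valid annotation (checked by hand): vertices get [1] (one component), edges carry a
# 1-bit cocycle vanishing on both triangle boundaries, triangles get the empty vector.
ann = {s: [1] for s in K if len(s) == 1}
ann.update({s: [] for s in K if len(s) == 3})
bits = {('u', 'v'): 1, ('u', 'w'): 1, ('v', 'w'): 0, ('u', 'x'): 1,
        ('v', 'x'): 0, ('w', 'y'): 1, ('x', 'y'): 0}
ann.update({fs(*e): [b] for e, b in bits.items()})

transfer(ann, K, 'u', 'v')
assert ann[fs('u', 'v')] == [0]                  # vanishing edge zeroed out
assert ann[fs('u', 'w')] == ann[fs('v', 'w')]    # mirror partners agree
assert ann[fs('u', 'x')] == ann[fs('v', 'x')]
```

The processing order (vanishing simplices by increasing dimension) is our own choice; for this example the higher-dimensional transfers are no-ops since the triangle annotations are empty.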
Subsequent to the annotation transfer, the annotation of K̂i is ready for the actual collapse since each pair of mirror simplices that are collapsed to a single simplex acquires an identical annotation and
each vanishing simplex acquires the zero annotation. Furthermore, Proposition 4.8 tells us that the cohomology basis does not change under annotation transfer, which aligns with the fact that fi′^∗ : H^p(Ki+1) → H^p(K̂i) is indeed an isomorphism. Accordingly, no time stamp changes after the annotation transfer and the actual collapse. Propositions 5.2 and 5.3 in [122] provide formal statements justifying the algorithm for annotation updates.
The persistence diagram of a given simplicial tower K can be retrieved easily from the anno-
tation algorithm. Each time during an elementary operation either we add a new element into the
annotation of all p-simplices for some p ≥ 0 or delete an element from the annotations of all of
them. During the deletion, we add the point (bar) (a, b) into Dgm p K where b is the current time
of deletion (death) and a is the time stamp of the element when it was added (birth).
Figure 4.4: The zigzag filtration K0 ,→ K1 ←- K2 ,→ K3 ←- K4 has four intervals (bars) for one
dimensional homology H1 , namely [0, 4], [1, 1], [3, 4], and [4, 4].
For p ≥ 0, considering the p-th homology groups with coefficients in a field k (which is Z2 here), we obtain a sequence of vector spaces connected by forward or backward linear maps, called a zigzag persistence module:
Hp F : Hp(X0) ↔ Hp(X1) ↔ · · · ↔ Hp(Xn−1) ↔ Hp(Xn), with maps ϕ0, ϕ1, . . . , ϕn−1,   (4.5)
where the map ϕi : H p (Xi ) ↔ H p (Xi+1 ) can either be forward or backward and is induced by the
inclusion.
In the non-zigzag case, when the index set for Hp F is finite, Proposition 3.10 says that Hp F is a direct sum of interval modules. In the zigzag case, a similar statement holds due to quiver theory [163].
Definition 4.10 (Quiver). A quiver Q = (N, E) is a directed graph which can be finite or infinite.
A representation V(Q) of Q is an assignment of a vector space Vi to every node Ni ∈ N and a linear
map vi j : Vi → V j for every directed edge (Ni , N j ) ∈ E. Figure 4.5 illustrates representations of
two quivers.
A zigzag persistence module is a special type of quiver representation where the graph is finite
and linear shaped, also known as An -type (see Figure 4.5(bottom)), where every node has at most
two directed edges incident to it. Such a quiver representation has an interval decomposition
though we need to define the intervals afresh to take into account the fact that arrows can be
bidirectional.
Definition 4.11 (Interval module). An interval module I[b,d], also called an interval or a bar, over an index set {0, 1, . . . , n} with field k is a sequence of vector spaces

I[b,d] : I0 ↔ I1 · · · ↔ In

where Ij = k for b ≤ j ≤ d and Ij = 0 otherwise, with the maps k ← k and k → k being identities.
Remark 4.2. Notice that unlike the bars that we defined in Chapter 3 for non-zigzag filtration,
here the bars are closed on both ends. However, we will see that we can designate them to be of
four types similar to what we have seen for the persistence modules for non-zigzag persistence.
Theorem 4.10 ([13, 265, 163]). Every quiver representation V(Q) for an An-type quiver Q has an interval decomposition, that is, V(Q) ≅ ⊕i I[bi,di]. Furthermore, this decomposition is unique up to isomorphism and permutation of the intervals.
Types of bars. A bar [b, d] for a zigzag persistence module H p F can be of four types depending
on the direction of the arrow between Xb−1 and Xb and the arrow between Xd and Xd+1 in F. They
are:
open-closed [b, d]: Xb−1 ←- Xb · · · Xd ←- Xd+1 : b > 0 and the inclusion Xb−1 ←- Xb is a
backward arrow; and d < n with the inclusion Xd ←- Xd+1 being a backward arrow;
open-open [b, d]: Xb−1 ←- Xb · · · Xd ,→ Xd+1 : b > 0 and the inclusion Xb−1 ←- Xb is a back-
ward arrow; and either d = n or the inclusion Xd ,→ Xd+1 is a forward arrow.
With the four types of bars, when we compute the bottleneck distance between persistence
diagrams for two zigzag persistence modules, we consider matching between bars of similar
types. That is, db (Dgm p (F1 ), Dgm p (F2 )) is computed with the understanding that only similar
types of bars are compared while matching the bars and the points on the diagonal are assumed to
have any type. We face a difficulty in defining an interleaving distance between zigzag modules because of the zigzag nature of the arrows. However, one can define such an interleaving
distance by mapping the module to a 2-parameter persistence module. See the notes in Chapter 12
for more details.
4.3.1 Approach
We briefly describe an overview of our approach for computing zigzag persistent intervals for a
simplicial zigzag filtration:
F : ∅ = K0 ↔ K1 ↔ · · · ↔ Kn−1 ↔ Kn . (4.6)
We assume that the filtration is simplex-wise, which means that Ki and Ki+1 differ by exactly one simplex σi, and that it begins with the empty complex. We have seen similar conditions before for the non-zigzag case in Section 3.1.2. This is not a serious restriction: we can expand an inclusion of a set of simplices into a series of single-simplex inclusions using any order that puts a simplex after all of its faces, and we can always pad an empty complex at the beginning with the first inclusion being forward.
The method we describe is derived from maintaining a consistent basis with a set of representative cycles over the intervals, which we define now: for an interval [b, d] of Hp F, a collection of p-cycles {ck ⊆ Kk | k ∈ [b, d]} is a set of representative cycles for [b, d] if the conditions below hold. These cycles generate an interval module in a straightforward way by associating a cycle to a homology class at each position.
1. For b > 0, [cb ] is not in the image of ϕb−1 if Kb−1 ↔ Kb is a forward inclusion, or [cb ] is
the non-zero class mapped to 0 by ϕb−1 otherwise.
2. For d < n, [cd ] is not in the image of ϕd if Kd ↔ Kd+1 is a backward inclusion, or [cd ] is
the non-zero class mapped to 0 by ϕd otherwise.
3. For each i ∈ [b, d − 1], [ci ] ↔ [ci+1 ] by ϕi , that is, either [ci ] 7→ [ci+1 ] or [ci ] ←[ [ci+1 ] by ϕi .
The interval module induced by the representative p-cycles is a zigzag persistence module I :
I0 ↔ I1 · · · ↔ In such that Ii equals the 1-dimensional vector space generated by [ci ] ∈ H p (Ki )
for i ∈ [b, d] and equals 0 otherwise.
The following theorem justifies the definition of representative cycles, which says that repre-
sentative cycles always produce an interval decomposition of a zigzag module and vice versa:
We now present an abstract algorithm based on an approach in [228] which helps us design
a concrete algorithm later. Given a filtration F : ∅ = K0 ↔ · · · ↔ Kn starting with an empty
complex, first let Dgm p (F0 ) = ∅. The algorithm then iterates for i ← 0, . . . , n − 1. At the
beginning of the i-th iteration, inductively assume that the intervals and their representative cycles
for H p Fi have already been computed. The aim of the i-th iteration is to compute these for H p Fi+1 .
Let Dgmp(Fi) = {[bα, dα] | α ∈ Ai} be indexed by a set Ai, and let {c^α_k ⊆ Kk | k ∈ [bα, dα]} be a
set of representative p-cycles for each [bα, dα]. For ease of presentation, we also let c^α_k = 0 for each α ∈ Ai and each k ∈ [0, i] not in [bα, dα]. We call intervals of Dgmp(Fi) ending with i
as surviving intervals at index i. Each non-surviving interval of Dgm p (Fi ) is directly included
in Dgm p (Fi+1 ) and its representative cycles stay the same. For surviving intervals of Dgm p (Fi ),
the i-th iteration proceeds with the following cases determined by the types of the linear maps
ϕi : H p (Ki ) ↔ H p (Ki+1 ).
ϕi is an isomorphism: In this case, no intervals are created and none cease to persist. Each surviving interval [bα, dα] in Dgmp(Fi) now corresponds to an interval [bα, i + 1] in Dgmp(Fi+1). The representative cycles for [bα, i + 1] are set by the following rule:

Trivial setting rule of representative cycles: For each j with bα ≤ j ≤ i, the representative cycle for [bα, i + 1] at index j stays the same. The representative cycle for [bα, i + 1] at i + 1 is set to a cycle c^α_{i+1} ⊆ Ki+1 such that [c^α_i] ↔ [c^α_{i+1}] by ϕi.
ϕi points forward and is injective: A new interval [i + 1, i + 1] is added to Dgm p (Fi+1 ) and its
representative cycle at i + 1 is set to a p-cycle in Ki+1 containing σi . All surviving intervals
of Dgm p (Fi ) persist to index i+1 and their representative cycles are set by the trivial setting
rule.
ϕi points backward and is surjective: A new interval [i + 1, i + 1] is added to Dgm p (Fi+1 ) and
its representative cycle at i + 1 is set to a p-cycle homologous to ∂(σi ) in Ki+1 . All surviving
intervals of Dgm p (Fi ) persist to index i + 1 and their representative cycles are set by the
trivial setting rule.
ϕi points forward and is surjective: A surviving interval of Dgm p (Fi ) does not persist to i + 1.
Let Bi ⊆ Ai consist of the indices of all surviving intervals. We have that {[c^α_i] | α ∈ Bi} forms a basis of Hp(Ki). Suppose that ϕi([c^{α1}_i] + · · · + [c^{αℓ}_i]) = 0, where α1, . . . , αℓ ∈ Bi. We can rearrange the indices such that bα1 < bα2 < · · · < bαℓ and α1 < α2 < · · · < αℓ. Let λ be α1 if the arrow Kbα−1 ↔ Kbα points backward for every α ∈ {α1, . . . , αℓ}, and otherwise let λ be the largest α ∈ {α1, . . . , αℓ} such that Kbα−1 ↔ Kbα points forward. Then, [bλ, i] forms an interval of Dgmp(Fi+1). For each k ∈ [bλ, i], let zk = c^{α1}_k + · · · + c^{αℓ}_k; then {zk | k ∈ [bλ, i]} is a set of representative cycles for [bλ, i]. All the other surviving intervals of Dgmp(Fi) persist to i + 1 and their representative cycles are set by the trivial setting rule.
ϕi points backward and is injective: A surviving interval of Dgm p (Fi ) does not persist to i + 1.
Let Bi ⊆ Ai consist of the indices of all surviving intervals, and let c^{α1}_i, . . . , c^{αℓ}_i be the cycles in {c^α_i | α ∈ Bi} containing σi. We can rearrange the indices such that bα1 < bα2 < · · · < bαℓ and α1 < α2 < · · · < αℓ. Let λ be α1 if the arrow Kbα−1 ↔ Kbα points forward for every α ∈ {α1, . . . , αℓ}, and otherwise let λ be the largest α ∈ {α1, . . . , αℓ} such that Kbα−1 ↔ Kbα points backward. Then, [bλ, i] forms an interval of Dgmp(Fi+1) and the representative cycles for [bλ, i] stay the same. For each α ∈ {α1, . . . , αℓ} not equal to λ, let zk = c^α_k + c^λ_k for each k such that bα ≤ k ≤ i, and let zi+1 = zi; then {zk | k ∈ [bα, i + 1]} is a set of representative cycles for [bα, i + 1]. For the other surviving intervals, the setting of representative cycles follows the trivial setting rule.
Remark 4.3. Note that in the above algorithm, there is no canonical choice for the representative
classes. However, all choices produce the same intervals.
2. The columns of Zp with negative birth timestamps form a basis of Bp(Ki). Moreover, for each column Zp[j] of Zp with a negative birth timestamp, one has that Zp[j] = ∂(Cp+1[j]).

3. For columns of Zp with non-negative birth timestamps, their birth timestamps bijectively map to the starting indices of the intervals of Dgmp(Fi) ending with i. Moreover, for each column Zp[j] of Zp such that bp[j] is non-negative, one has that Zp[j] is a representative cycle at index i for the interval [bp[j], i].
Zigzag algorithm.
For each i ← 0, . . . , n − 1, the algorithm does the following:
– Death: Let J consist of the indices in I whose corresponding columns in Zp−1 have non-negative birth timestamps. If ϕ_{bp−1[α]−1} points backward for all α ∈ J, let λ be the smallest index in J; otherwise, let λ be the largest α in J such that ϕ_{bp−1[α]−1} points forward. Then, do the following:

1. Output the (p − 1)-th interval [bp−1[λ], i].
– Birth: First, the boundaries in Z p−1 need to be updated so that they form a basis of
B p−1 (Ki+1 ):
1. while there are two columns Z p−1 [α], Z p−1 [β] with negative birth timestamps s.t.
C p [α], C p [β] contain σi do
2. if pivot(Z p−1 [α]) > pivot(Z p−1 [β]) then
3. Z p−1 [α] ← Z p−1 [α] + Z p−1 [β]
4. C p [α] ← C p [α] + C p [β]
5. else
6. Z p−1 [β] ← Z p−1 [α] + Z p−1 [β]
At the end of the algorithm, for each p and each column Zp[α] of Zp with a non-negative birth timestamp, output the p-th interval [bp[α], n]. Notice that, while outputting the bars, the algorithm can easily determine the types of the bars by looking at the relevant arrows as described before.
K : K0 ↔ K1 ↔ K2 ↔ · · · ↔ Kn, with maps f0, f1, f2, . . . , fn−1.   (4.7)
Recall that by Proposition 4.6 each map fi : Ki → Ki+1 can be decomposed into elementary
inclusions and elementary collapses. So, without loss of generality, we assume that every fi is
either an elementary inclusion or an elementary collapse.
Figure 4.6: Elementary collapse (u, v) → u: the cone u ∗ St v adds edges uw, uv, ux, triangles
uwx, uvx, uvw, and the tetrahedron uvwx.
First, we propose a simulation of an elementary collapse with a coning strategy that only
requires additions of simplices.
Let f : K → K′ be an elementary collapse. Assume that the induced vertex map collapses vertices u, v ∈ K to u ∈ K′, and is the identity on all other vertices. For a subcomplex X ⊆ K, define the cone u ∗ X to be the complex ∪σ∈X {σ ∪ {u}}. Consider the augmented complex

K̂ := K ∪ (u ∗ St v).

In other words, for every simplex {u0, . . . , ud} ∈ St v of K, we add the simplex {u0, . . . , ud} ∪ {u} to K̂ if it is not already in it. See Figure 4.6. Notice that K′ is a subcomplex of K̂ in this example, which we observe is true in general.
Claim 4.1. K′ ⊆ K̂.
Now consider the inclusions ι : K ↪ K̂ and ι′ : K′ ↪ K̂. These inclusions, along with the elementary collapse, constitute a diagram in Figure 4.6 which does not necessarily commute. Nevertheless, it commutes at the homology level, which is precisely stated below.

Proposition 4.12. In the zigzag module Hp(K) ──ι∗──▶ Hp(K̂) ◀──ι′∗── Hp(K′) induced by the inclusions ι and ι′, the linear map ι′∗ is an isomorphism and f∗ : Hp(K) → Hp(K′) equals (ι′∗)⁻¹ ◦ ι∗.
Proof. We use the notion of contiguous maps, which induce equal maps at the homology level. Recall that two maps f1 : K1 → K2 and f2 : K1 → K2 are contiguous if for every simplex σ ∈ K1, f1(σ) ∪ f2(σ) is a simplex in K2. We observe that the simplicial maps ι′ ◦ f and ι are contiguous, and that ι′ induces an isomorphism at the homology level, that is, ι′∗ : Hp(K′) → Hp(K̂) is an isomorphism.

Since ι is contiguous to ι′ ◦ f, we have ι∗ = (ι′ ◦ f)∗ = ι′∗ ◦ f∗. Since ι′∗ is an isomorphism, (ι′∗)⁻¹ exists and is an isomorphism. It then follows that f∗ = (ι′∗)⁻¹ ◦ ι∗.
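Claim 4.1 and the contiguity used in the proof can be checked mechanically on a small instance. The sketch below uses our own helper names and our own example complex (the closure of the edge uv and the filled triangle vwx); it builds K̂ = K ∪ u ∗ St v, the image K′ of the collapse, and verifies both K′ ⊆ K̂ and that f(σ) ∪ σ ∈ K̂ for every σ ∈ K.

```python
from itertools import combinations

def faces(s):
    """All nonempty faces of a simplex, itself included."""
    return {frozenset(c) for k in range(1, len(s) + 1) for c in combinations(s, k)}

def closure(top):
    return {f for s in top for f in faces(frozenset(s))}

def closed_star(K, v):
    """Closed star of vertex v: all cofaces of v in K and their faces."""
    return {f for s in K if v in s for f in faces(s)}

def collapse_image(K, u, v):
    """Image of K under the elementary collapse sending v to u."""
    return {frozenset(u if x == v else x for x in s) for s in K}

K = closure([('u', 'v'), ('v', 'w', 'x')])
Khat = K | {s | {'u'} for s in closed_star(K, 'v')}   # K-hat = K ∪ u * St v
Kprime = collapse_image(K, 'u', 'v')

assert Kprime <= Khat                                 # Claim 4.1: K' ⊆ K-hat
# Contiguity of iota' ∘ f and iota: f(sigma) ∪ sigma is a simplex of K-hat
assert all(collapse_image({s}, 'u', 'v').pop() | s in Khat for s in K)
```

In this instance the coning adds, among others, the triangle uwx and the tetrahedron uvwx, mirroring the kind of additions shown in Figure 4.6.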
¹Note that here we only iterate over columns Cp[β] for which Zp−1[β] is a boundary.
Proposition 4.12 allows us to simulate the persistence of a simplicial tower with only inclusion-induced homomorphisms, which, in turn, allows us to consider a simplicial zigzag filtration. More specifically, the simplicial tower in Eqn. (4.7) generates the zigzag persistence module

Hp(K0) ↔ Hp(K1) ↔ Hp(K2) ↔ · · · ↔ Hp(Kn), with induced homomorphisms f0∗, f1∗, . . . , fn−1∗.   (4.8)
With our observation that every map fi∗ can be simulated with an inclusion-induced map, our goal is to replace the original simplicial tower in Eqn. (4.7) with a zigzag filtration so that we can take advantage of the algorithm in Section 4.3. In view of Proposition 4.12, the two diagrams shown in Figure 4.7 commute; the one on the left corresponds to a forward collapse fi : Ki → Ki+1 and the one on the right to a backward collapse fi : Ki ← Ki+1.
Forward collapse fi : Ki → Ki+1:

Hp(Ki) ──fi∗──▶ Hp(Ki+1) ◀──=── Hp(Ki+1)
   =                ≅                =
Hp(Ki) ──ιi∗──▶ Hp(K̂i)   ◀──≅── Hp(Ki+1)

Backward collapse fi : Ki ← Ki+1:

Hp(Ki) ──=──▶ Hp(Ki)     ◀──fi∗── Hp(Ki+1)
   =              ≅                   =
Hp(Ki) ──≅──▶ Hp(K̂i+1)  ◀──ιi∗── Hp(Ki+1)
Figure 4.7: Top modules induced from an elementary collapse are isomorphic to the modules
induced by inclusions at the bottom.
Observe that, if fi is an inclusion instead of a collapse, we can still construct similar com-
muting diagrams. In that case, we simply take K̂i = Ki+1 when fi is a forward inclusion and take
K̂i+1 = Ki when fi is a backward inclusion.
Now, we can expand each fi ∗ of the persistence module in Eqn. (4.8) by juxtaposing it with
an equality as in the top modules shown in Figure 4.7. Then, this expanded module becomes
isomorphic to the modules induced by inclusions at the bottom of the commuting diagrams.
In general, we first consider the expansion of the module in Eqn. (4.8) to the following module in Eqn. (4.9), where Si = Ki+1, gi = fi, and hi is the equality when fi is forward, and Si = Ki, gi is the equality and hi = fi when fi is backward:

Hp(K0) ──g0──▶ Hp(S0) ◀──h0── Hp(K1) ──g1──▶ Hp(S1) ◀──h1── Hp(K2) ──g2──▶ · · · ◀──hn−1── Hp(Kn)   (4.9)
Using Figure 4.7, a module isomorphic to the module in Eqn. (4.9) can be constructed as given in Eqn. (4.10), where Ti = K̂i when fi is forward and Ti = K̂i+1 when fi is backward; all maps are induced by inclusions:

Hp(K0) ──▶ Hp(T0) ◀── Hp(K1) ──▶ Hp(T1) ◀── Hp(K2) ──▶ · · · ◀── Hp(Kn)   (4.10)
The two persistence modules in Eqn. (4.9) and in Eqn. (4.10) are isomorphic because all vertical
maps in the diagram below are isomorphisms and all squares commute (Figure 4.7).
In view of the module in Eqn. (4.10), we convert the tower K in Eqn. (4.7) to the zigzag
filtration below where T i = K̂i when fi is forward and T i = K̂i+1 when fi is backward:
F : K0 ,→ T 0 ←- K1 ,→ T 1 ←- K2 ,→ · · · ←- Kn (4.11)
Hp(K0) ──g0──▶ Hp(S0) ◀──h0── Hp(K1) ──g1──▶ Hp(S1) ◀──h1── Hp(K2) ──▶ · · · ◀── Hp(Kn)
   =               ≅               =               ≅               =                  =
Hp(K0) ──────▶ Hp(T0) ◀────── Hp(K1) ──────▶ Hp(T1) ◀────── Hp(K2) ──▶ · · · ◀── Hp(Kn)
The zigzag filtration above is simplex-wise but does not begin with an empty complex. We can
expand K0 simplex-wise to convert the filtration to a simplex-wise filtration that begins with an
empty complex. Then, we can apply the zigzag algorithm in Section 4.3.2 to compute the barcode.
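The whole conversion can be sketched end to end. In the sketch below (our own minimal conventions, not code from the text: a tower is a starting complex plus a list of elementary operations), each step contributes Ti and Ki+1, and every consecutive pair in the output is related by inclusion as in Eqn. (4.11).

```python
from itertools import combinations

def faces(s):
    return {frozenset(c) for k in range(1, len(s) + 1) for c in combinations(s, k)}

def tower_to_zigzag(K0, steps):
    """Convert a tower of elementary inclusions/collapses into the zigzag
    filtration K0 -> T0 <- K1 -> T1 <- ... of Eqn. (4.11)."""
    K, seq = set(K0), [set(K0)]
    for op, arg in steps:
        if op == 'include':
            K = K | faces(frozenset(arg))
            seq += [set(K), set(K)]                 # T_i = K_{i+1}
        elif op == 'collapse':
            u, v = arg
            star_v = {f for s in K if v in s for f in faces(s)}
            T = K | {s | {u} for s in star_v}       # K-hat = K ∪ u * St v
            K = {frozenset(u if x == v else x for x in s) for s in K}
            seq += [T, set(K)]                      # K_{i+1} ⊆ K-hat by Claim 4.1
    return seq

# Example: fill a hollow triangle, then collapse (u, v) -> u.
K0 = faces(frozenset(('u', 'v'))) | faces(frozenset(('v', 'w'))) | faces(frozenset(('u', 'w')))
seq = tower_to_zigzag(K0, [('include', ('u', 'v', 'w')), ('collapse', ('u', 'v'))])
# Every consecutive pair is related by inclusion: K_i ⊆ T_i ⊇ K_{i+1}
assert all(seq[2 * i] <= seq[2 * i + 1] >= seq[2 * i + 2] for i in range(len(seq) // 2))
```

To match the setup of Section 4.3 exactly, one would further expand K0 simplex-wise from the empty complex and split each multi-simplex inclusion into single-simplex ones.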
Theorem 4.13. The persistence diagram of K can be derived from that of the filtration F.
Example 4.1. Consider the tower in Eqn. (4.12) where each map is an elementary collapse and
the persistence module induced by it in Eqn. (4.13). This module can be expanded and its isomor-
phic module is shown at the bottom of the commuting diagram in Figure 4.9.
K0 ──f0──▶ K1 ◀──f1── K2 ──f2──▶ · · · ──fn−1──▶ Kn   (4.12)

Hp(K0) ──f0∗──▶ Hp(K1) ◀──f1∗── Hp(K2) ──f2∗──▶ · · · ──fn−1∗──▶ Hp(Kn)   (4.13)
We obtain the following zigzag filtration that corresponds to the module at the bottom of the
diagram in Figure 4.9. Hence, we can compute the barcode for the input tower in Eqn. (4.12)
from this zigzag filtration.
H_p(K_0) →^{f_{0∗}} H_p(K_1) ←^{=} H_p(K_1) →^{=} H_p(K_1) ←^{f_{1∗}} H_p(K_2) → · · · ← H_p(K_n)
    =                  ≅               =               ≅                =                    =
H_p(K_0) →^{i_{0∗}} H_p(K̂_0) ←     H_p(K_1) →^{i_{1∗}} H_p(K̂_2) ←^{i_{2∗}} H_p(K_2) → · · · ← H_p(K_n)

Figure 4.9: Commuting diagram for the module in Eqn. (4.13) and its isomorphic module.
Remark 4.4. Notice that, when f_i is an inclusion, we can skip introducing the middle column in Figure 4.8, which translates into eliminating some of the inclusions in the sequence in Eqn. (4.11). We introduced these extraneous inclusions only to make the expanded module generic in the sense that its inclusions alternate directions.
Definition 4.14 (Critical, regular value). An open interval I ⊆ R is called a regular interval if there exists a topological space Y and a homeomorphism Φ : Y × I → X_I so that f ◦ Φ is the projection onto I, and Φ extends to a continuous function Φ̄ : Y × Ī → X_Ī, where Ī is the closure of I. We assume that f is of Morse type [63], meaning that each levelset X_{=s} has finitely generated homology groups and there are finitely many values, called critical values, a_0 = −∞ < a_1 < · · · < a_n < a_{n+1} = +∞, so that each interval (a_i, a_{i+1}) is a maximal regular interval. A value s ∈ (a_i, a_{i+1}) is then called a regular value.
The original construction [63] of level set (henceforth written levelset) zigzag persistence picks regular values s_0, s_1, . . . , s_n so that each s_i ∈ (a_i, a_{i+1}). Then, the levelset zigzag filtration of f is defined as follows:

X_{=s_0} ↪ X_{[s_0,s_1]} ↩ X_{=s_1} ↪ X_{[s_1,s_2]} ↩ · · · ↪ X_{[s_{n−1},s_n]} ↩ X_{=s_n}.    (4.14)
This construction relies on a choice of regular values and there is no canonical choice. As we
work on simplicial complexes, different regular values can result in different complexes in the
filtration. Therefore, we adopt the following alternative definition of a levelset zigzag filtration X,
which does not rely on a choice of regular values:
X : X_{(a_0,a_2)} ↩ · · · ↪ X_{(a_{i−1},a_{i+1})} ↩ X_{(a_i,a_{i+1})} ↪ X_{(a_i,a_{i+2})} ↩ · · · ↪ X_{(a_{n−1},a_{n+1})}.    (4.15)
A space of the type X_{(a_{i−1},a_{i+1})} contains a critical value a_i and hence is called a critical space. For a similar reason, a space of the type X_{(a_i,a_{i+1})} is called a regular space; it does not contain any critical value. Considering the homology groups of the spaces, we get the zigzag persistence module:
H p X : H p (X(a0 ,a2 ) ) ← · · · → H p (X(ai−1 ,ai+1 ) ) ← H p (X(ai ,ai+1 ) ) → H p (X(ai ,ai+2 ) ) ← · · · → H p (X(an−1 ,an+1 ) ).
Note that X(ai ,ai+1 ) deformation retracts to X=si and X(ai−1 ,ai+1 ) deformation retracts to X[si−1 ,si ] ,
so the zigzag modules induced by the two diagrams are isomorphic, i.e., equivalent at the persis-
tent homology level. See Figure 4.10 for an example of a levelset zigzag filtration.
Figure 4.10: A torus with four critical values. The real-valued function is the height function over the horizontal line. The first several subspaces in the levelset zigzag diagram are given; the remaining ones are symmetric. An empty dot indicates that the point is not included.
Generation of barcode for levelset zigzag. The interval decomposition of the module H_p X gives the barcode for the zigzag persistence. However, an endpoint of a bar may be the index of either a critical or a regular space. If it belongs to a critical space X_{(a_{i−1},a_{i+1})}, we map it to the critical value a_i. Otherwise, if it belongs to a regular space X_{(a_i,a_{i+1})}, we map it to the regular value s_i. After this conversion, the bars still do not end solely in critical values, so we modify the endpoints further. In keeping with the understanding that the levelset homology classes do not change across the regular spaces, we convert an endpoint s_i to an adjacent critical value and make the bar (interval module) open at that critical value. Precisely, we modify the bars as (i) [a_i, a_j] ⇔ [a_i, a_j], (ii) [a_i, s_j] ⇔ [a_i, a_{j+1}), (iii) [s_i, a_j] ⇔ (a_i, a_j], (iv) [s_i, s_j] ⇔ (a_i, a_{j+1}). As in the case of a standard zigzag filtration, the intervals in (i)-(iv) are referred to as closed-closed, closed-open, open-closed, and open-open bars respectively. Our goal is to compute these four types of bars for a PL-function where the space X is the underlying space of a simplicial complex K.
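The endpoint rules (i)-(iv) can be transcribed directly; a minimal sketch with our own encoding (an endpoint is ('a', i) for a critical value a_i or ('s', i) for a regular value s_i, and the result marks each critical-value index as closed or open):

```python
def convert_bar(left, right):
    """Map a bar with endpoints at critical (a_i) or regular (s_i)
    values to a bar with critical-value endpoints, per rules (i)-(iv).
    Returns ((index, closed?), (index, closed?))."""
    (lt, i), (rt, j) = left, right
    # left endpoint: a_i stays closed at a_i; s_i becomes open at a_i
    b = (i, True) if lt == 'a' else (i, False)
    # right endpoint: a_j stays closed at a_j; s_j becomes open at a_{j+1}
    d = (j, True) if rt == 'a' else (j + 1, False)
    return b, d
```

For instance, [a_1, s_3] converts to [a_1, a_4) and [s_2, s_5] converts to (a_2, a_6), matching rules (ii) and (iv).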
A complex of the form K(i,i+1) in the filtration is called a regular complex and a complex of the
form K(i,i+2) is called a critical complex. Note that while we can expect the space and simplicial
Definition 4.15. A complex K is called compatible with the levelsets of a PL-function f : |K| → R if for every simplex σ of K with convex hull |σ|, the function values of the points in |σ| include at most one critical value of f .
Given a PL-function f on a complex K, one can make K compatible with the levelsets of f
by subdividing K with barycentric subdivisions; see e.g. [103].
Proposition 4.14. Let K be compatible with the levelsets of f , and let X = |K|; one has that
X(ai ,a j ) deformation retracts to K(i, j) for any two critical values ai < a j . Therefore, the zigzag
modules induced by the space and the simplicial levelset zigzag filtrations are isomorphic.
Our goal is to compute the four types of bars for the zigzag filtration X from its simplicial version K. For this, we expand K into a simplex-wise filtration F. First, F starts and ends with the same original complexes as K. Second, whenever an inclusion in K is expanded so that one simplex is added at a time, the additions follow the order of the simplices' function values. Formally, for the inclusion K(i,i+1) ↪ K(i,i+2) in K, let u_1 = v_{i+1}, u_2, . . . , u_k be all the vertices with function values in [a_{i+1}, a_{i+2}) such that f(u_1) < f(u_2) < · · · < f(u_k); then, the lower stars of u_1, . . . , u_k are added in sequence in F. Note that for each u_j ∈ {u_1, . . . , u_k}, we do not restrict how the simplices in the lower star of u_j are added. For the inclusion K(i−1,i+1) ↩ K(i,i+1) in K, everything is reversed, i.e., vertices are ordered by decreasing function values and upper stars are added. With this expansion, the zigzag filtration K in Eqn. (4.17) is converted to a filtration F shown below, where a dashed arrow indicates insertions of one or more simplices and a solid arrow indicates a single simplex insertion. In particular, we indicate that the backward inclusion K(i−1,i+1) ↩ K(i,i+1) is expanded into a simplex-wise filtration.
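The vertex-ordered lower-star insertion just described can be sketched as follows (the encoding is ours; within a lower star we simply put faces before cofaces, which is one valid choice since the order there is unrestricted):

```python
def lowerstar_order(simplices, f):
    """Insertion order for expanding a forward inclusion: each simplex
    (a frozenset of vertices) belongs to the lower star of its
    highest-valued vertex; lower stars are visited by increasing
    function value, and within one lower star faces precede cofaces.
    'f' maps vertices to their (distinct) function values."""
    return sorted(simplices, key=lambda s: (max(f[v] for v in s), len(s)))
```

For a backward inclusion the same idea applies with decreasing function values and upper stars, as stated above.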
After expanding all forward and backward inclusions to make them simplex-wise, we obtain a
zigzag filtration whose complexes can be indexed by 0, 1, . . . , n as we assume next. To apply the algorithm in Section 4.3.2, we need the input zigzag filtration to begin with an empty complex. The filtration F constructed by expanding K has the non-empty first complex K(0,2). So, as before, we expand K(0,2) simplex-wise so that F begins with an empty complex. We assume below that this is the case for F.
The bars in the barcode for F do not necessarily coincide with the four types of bars for K
with endpoints only in critical values. However, we can read the bars for K from the bars of F.
First, assume that F is indexed as
F : ∅ = K0 ↔ K1 ↔ · · · ↔ Kn−1 ↔ Kn .
This means that a complex K_j, j > 0, falls into one of four categories: (i) it is a complex in the expansion of the backward inclusion K(i−1,i+1) ↩ K(i,i+1); (ii) it is a complex in the expansion of the forward inclusion K(i,i+1) ↪ K(i,i+2); (iii) it is a regular complex K(i,i+1) for some i > 0; (iv) it is a critical complex K(i−1,i+1) for some i > 0. The types of complexes where the endpoints of a bar [b, d] for F are located determine the bars for K, and hence for X, which can be of four types: closed-closed [a_i, a_j], closed-open [a_i, a_j), open-closed (a_i, a_j], and open-open (a_i, a_j).
Let [b, d] be a bar for F. If both K_b and K_d appear in the expansion of a forward inclusion K(i,i+1) ↪ K(i,i+2), we ignore the bar because it is an artificial bar created by expanding the filtration K into the filtration F. Similarly, we ignore the bar if both K_b and K_d appear in the expansion of a backward inclusion K(i−1,i+1) ↩ K(i,i+1). We explain the other cases below.
(Case 1.) K_b is either a regular complex K(i,i+1) or in the expansion of K(i−1,i+1) ↩ K(i,i+1): the complex K_b is a subcomplex of the critical complex K(i−1,i+1), which stands for the critical value a_i. So, the end b is mapped to a_i and made open because the class for the bar [b, d] does not exist in K(i−1,i+1).
(Case 2.) K_b is either the critical complex K(i,i+2) or in the expansion of K(i,i+1) ↪ K(i,i+2): the complex is a subcomplex of the critical complex K(i,i+2), which stands for the critical value a_{i+1}. So, the end b is mapped to a_{i+1} and made closed because the class for [b, d] is alive in K(i,i+2).
(Case 3.) K_d is either the critical complex K(i−1,i+1) or in the expansion of the backward inclusion K(i−1,i+1) ↩ K(i,i+1): the complex is a subcomplex of the critical complex K(i−1,i+1), which stands for the critical value a_i. So, the end d is mapped to a_i and made closed because the class for the bar [b, d] exists in K(i−1,i+1).
(Case 4.) K_d is either the regular complex K(i,i+1) or in the expansion of K(i,i+1) ↪ K(i,i+2): the complex is a subcomplex of the critical complex K(i,i+2), which stands for the critical value a_{i+1}. So, the end d is mapped to a_{i+1} and made open because the class for [b, d] is not alive in K(i,i+2).
Then, considering f to be a PL-function on X = |K|, we have already seen in Section 3.5 that X
can be converted to a simplicial filtration K shown below where K[0,i] = {σ ∈ K | f (σ) ≤ ai }. This
filtration can further be converted into a simplex-wise filtration which can be used for computing
Dgm p (K) for p ≥ 0.
The bars for this case have the form [ai , a j ) where a j can be an+1 = ∞. Each such bar is closed at
the left endpoint because the homology class being born exists at K[0,i] . However, it is open at the
right endpoint because it does not exist at K[0, j] .
One can see that there are two types of bars in the sublevel set persistence: bars of the type [a_i, a_j), j ≤ n, which are bounded on the right, and bars of the type [a_i, ∞) = [a_i, a_{n+1}), which are unbounded on the right. The unbounded bars are the infinite bars we introduced in Section 3.2.1. They correspond to the essential homology classes since H_p(K) ≅ ⊕_i [a_i, ∞). The work of [59, 63] implies that both types of bars of the standard persistence can be recovered from those of the levelset zigzag persistence, as the theorem below states:
Theorem 4.15. Let K and K0 denote the filtrations for the sublevel sets and level sets respectively
induced by a continuous function f on a topological space with critical values a0 , a1 , · · · , an+1
where a0 = −∞ and an+1 = ∞. For every p ≥ 0,
1. [a_i, a_j), j ≠ n + 1, is a bar for Dgm_p(K) iff it is so for Dgm_p(K0),
2. [ai , an+1 ) is a bar for Dgm p (K) iff either [ai , a j ] is a closed-closed bar for Dgm p (K0 ) for
some a j > ai , or (a j , ai ) is an open-open bar for Dgm p−1 (K0 ) for some a j < ai .
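Theorem 4.15 translates into a small post-processing step over the levelset zigzag barcodes; a sketch with our own bar encoding (a bar is (type, i, j) with type in {'co','oc','cc','oo'} and endpoints a_i < a_j, and index n+1 stands for +∞):

```python
def sublevel_bars_from_levelset(bars_p, bars_pm1, n):
    """Recover the degree-p sublevel set barcode from the levelset
    zigzag barcodes in degrees p (bars_p) and p-1 (bars_pm1).
    Returns bars as (birth index, death index) pairs."""
    # part 1: finite closed-open bars carry over unchanged
    out = [(i, j) for t, i, j in bars_p if t == 'co' and j != n + 1]
    # part 2: closed-closed [a_i, a_j] in degree p gives [a_i, inf)
    out += [(i, n + 1) for t, i, j in bars_p if t == 'cc']
    # part 2: open-open (a_i, a_j) in degree p-1 gives [a_j, inf)
    out += [(j, n + 1) for t, i, j in bars_pm1 if t == 'oo']
    return sorted(out)
```

The open-closed bars of degree p and the remaining bar types play no role in the degree-p sublevel set barcode, consistent with the theorem.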
The decomposition of the persistence module H_p E arising out of E provides the bars in Dgm_p(E). For the first part of the sequence, the endpoints of the bars are designated with the respective function values a_i as before. For the second part, the birth or death point of a bar is designated as a_{n+i} if its class is born in (K[0,n], K[i,n]) or dies entering (K[0,n], K[i,n]) respectively, for 0 ≤ i ≤ n.
We leave the proof of the following theorem as an exercise; see also [63].
Theorem 4.16. Let K and E denote the simplicial levelset zigzag filtration and the extended
filtration of a PL-function f : |K| → R. Then, for every p ≥ 0,
1. [ai , a j ) is a bar for Dgm p (K) iff it is a bar for Dgm p (E),
2. (ai , a j ] is a bar for Dgm p (K) iff [an+ j , an+i ) is a bar for Dgm p+1 (E),
3. [ai , a j ] is a bar for Dgm p (K) iff [ai , an+ j ) is a bar for Dgm p (E),
4. (ai , a j ) is a bar for Dgm p (K) iff [a j , an+i ) is a bar for Dgm p+1 (E).
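Theorem 4.16 is a direct dictionary between the two barcodes; transcribed literally (the encoding is ours: a levelset bar has a type in {'co','oc','cc','oo'}, endpoints a_i, a_j, degree p, and the returned triple is the degree and the [b, d) index pair of the extended bar, with indices n+i encoding the relative part):

```python
def extended_bar(bar_type, i, j, p, n):
    """Map a levelset zigzag bar to the corresponding extended
    persistence bar per Theorem 4.16, items 1-4."""
    if bar_type == 'co':                     # [a_i, a_j) in degree p
        return (p, i, j)
    if bar_type == 'oc':                     # (a_i, a_j] in degree p
        return (p + 1, n + j, n + i)         # -> [a_{n+j}, a_{n+i})
    if bar_type == 'cc':                     # [a_i, a_j] in degree p
        return (p, i, n + j)                 # -> [a_i, a_{n+j})
    if bar_type == 'oo':                     # (a_i, a_j) in degree p
        return (p + 1, j, n + i)             # -> [a_j, a_{n+i})
    raise ValueError(bar_type)
```

Inverting this map recovers the four types of levelset bars from the extended persistence barcode, which is the content of the theorem.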
Clearly, for two persistence modules H p E and H p E0 arising out of two extended filtrations E
and E0 , the stability of persistence diagrams holds, that is, db (Dgm p E, Dgm p E0 ) = dI (H p E, H p E0 )
(Theorem 3.11).
Given a real-valued function f : X → R on a topological space X, the level sets at the critical and intermediate values give rise to a levelset zigzag filtration as shown in Section 4.5. Carlsson, de Silva, and Morozov [63] introduced this setup and observed the decomposition of the zigzag module into interval modules with open or closed ends. The four types of bars arising out of this zigzag module give more information than the standard sublevel set persistence, which only outputs closed-open and infinite bars. It was observed in [59] that the open-open and closed-closed bars indeed capture the infinite bars of the sublevel set persistence with an appropriate dimension shift; Theorem 4.15 summarizes this connection. The extended persistence, originally proposed for surfaces [5] and later extended to general filtrations [103], also computes all four types of bars, but they are described differently, using persistence diagrams rather than open and closed ends.
Exercises
1. Show that the inequality in Proposition 4.1 cannot be improved to equality by giving a
counterexample.
6. For computing the persistence of a simplicial tower, we checked the link condition in all
dimensions. Argue that it is sufficient to check the condition only for three relevant dimen-
sions.
Complete the above approach with a proof of correctness into an algorithm that computes
the annotation for edges in O(gn) time if K has n simplices.
8. Do we get the same barcode if we run the zigzag persistence algorithm given in Sec-
tion 4.3.1 and the standard persistence algorithm on a non-zigzag filtration? If so, prove it.
If not, show the difference and suggest a modification to the zigzag persistence algorithm so that both outputs become the same.
9. Suppose that a persistence module {V_i →^{f_i} V_{i+1}} is presented with the linear maps f_i as matrices whose columns and rows correspond to fixed bases of V_i and V_{i+1} respectively. Design an
algorithm to compute the barcode for the input module. Do the same when the input module
is a zigzag tower.
10. ([127]) We have seen that for graphs a near-linear time algorithm exists for computing non-
zigzag persistence. Design a near-linear time algorithm for computing zigzag persistence
for graphs.
(a) Design an algorithm to compute the barcode of −f from a levelset zigzag filtration of f .
(b) Show that f and − f produce the same closed-closed and open-open bars for the lev-
elset zigzag filtration.
(c) In general, given a zigzag filtration F, consider the filtration F0 = −F in opposite
direction from right to left. What is the relation between the barcodes of these two
filtrations?
12. We computed persistence of zigzag towers by first converting them into zigzag filtrations and then using the algorithm in Section 4.3 to compute the bars. Design an algorithm that skips the intermediate conversion to a filtration.
13. Design an algorithm for computing the extended persistence from a given PL-function on
an input simplicial complex.
So far we have focused mainly on the ranks of the homology groups. However, the homology generators, that is, the cycles whose classes constitute the elements of the homology groups, carry information about the space. Computing just some generating cycles (a cycle basis) typically can be done by the standard algorithms for computing homology groups, such as the persistence algorithms. In practice, however, we may sometimes be interested in generating cycles that have some optimal property; see Figure 5.1.

Figure 5.1: A double torus has first homology group of rank four, meaning that classes of four representative cycles generate H1; (left) a non-optimal cycle basis, (right) an optimal cycle basis.
In particular, if the space has a metric associated with it, one may associate a measure with the cycles that differentiates them in terms of their 'size'. For example, if K is a simplicial complex embedded in Rd, the measure of a 1-cycle can be its length. Then, we can ask to compute a set of 1-cycles whose classes generate H1(K) and whose total length is minimum among all such sets of cycles. Typically, the locality of these cycles captures interesting geometric features of the space |K|. Some applications may benefit from computing such cycles respecting the geometry. For example, in computer graphics, a surface is often cut along a set of cycles to make it flat for parameterization. The classes of these cycles constitute a basis of the first homology group. In general, a shortest (optimal) cycle basis is desired because it produces a good parameterization for graphics rendering. Figure 5.2 shows examples of such cycles for three kinds of input, where a shortest (optimal) cycle basis has been computed with an algorithm that we describe in this chapter. The algorithm works for simplicial complexes, though we can apply it to point cloud data as well after computing an appropriate complex, such as a Čech or Rips complex, on top of the input points.
It turns out that, for p > 1, the problem of computing an optimal homology basis for the p-th homology group H_p is NP-hard [94]. However, the problem is polynomial-time solvable for p = 1 [136]. A greedy algorithm, originally devised for computing an optimal H1-basis for surfaces [156], extends to general simplicial complexes, as described in Section 5.1.
There is another notion of optimality, namely the localization of homology classes. In this problem, given a p-cycle c, we want to compute an optimal p-cycle c∗ in the same homology class as c, that is, with [c] = [c∗]. This problem is NP-hard even for p = 1 [73]. Interestingly, there are some special cases for which an integer program formulated for the problem can be solved with a linear program [126]. This is the topic of Section 5.2.
The two versions mentioned above do not consider the persistence framework. We may ask what the optimal cycles for persistent homology classes are. Toward formulating the problem precisely, we define a persistent cycle for a given bar in the barcode of a filtration: this is a cycle whose class is created at the birth point and becomes a boundary at the death point of the bar. Among all persistent cycles for a given bar, we want to compute an optimal one. The problem in general is NP-hard, but one can devise polynomial-time algorithms for some special cases, such as filtrations of what we call weak pseudomanifolds [129]. Section 5.3 describes these algorithms.
Figure 5.2: Computed shortest basis cycles (left) on a triangular mesh of Botijo, a well known
surface model in computer graphics, (middle) on a point cloud data sampling the surface of Bud-
dha, another well known surface model in computer graphics, (right) on an isosurface generated
from a volume data in visualization.
Observe that an optimal generator may not have the minimal number of cycles whose classes generate the homology group: since we allow zero weights, an optimal generator may contain extra cycles with zero weight. This prompts us to define the following.
We observe that optimal H_p(K)-generators with positively weighted cycles are necessarily cycle bases. Notice that, to generate H_p(K), the number of cycles in any H_p(K)-generator has to be at least β_p(K) = dim H_p(K). On the other hand, an optimal H_p(K)-generator with positively weighted cycles cannot have more than β_p cycles: otherwise, the generator would contain a cycle whose class is a linear combination of the classes of the other cycles in the generator, and omitting this cycle would still generate H_p(K) while decreasing the weight of the generator. In dimension 1, similar reasoning shows that each cycle in an H1(K)-cycle basis necessarily contains a simple cycle, and these simple cycles together form a cycle basis (Exercise 1). A 1-cycle is simple if it has a single connected component (viewed as a graph) and every vertex has exactly two incident edges.
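The simplicity test in this definition is easy to state in code; a small sketch (the edge-list encoding and helper name are ours):

```python
from collections import defaultdict

def is_simple(cycle_edges):
    """Check that a 1-cycle, viewed as a graph given by its list of
    edges (u, v), is connected and every vertex has degree exactly 2."""
    deg, adj = defaultdict(int), defaultdict(set)
    for u, v in cycle_edges:
        deg[u] += 1; deg[v] += 1
        adj[u].add(v); adj[v].add(u)
    if any(d != 2 for d in deg.values()):
        return False
    start = next(iter(adj))                # depth-first search for connectivity
    seen, stack = {start}, [start]
    while stack:
        for y in adj[stack.pop()]:
            if y not in seen:
                seen.add(y); stack.append(y)
    return len(seen) == len(adj)
```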
Fact 5.1.
(ii) Every cycle ci in an H1 (K)-basis has a simple cycle c0i ⊆ ci so that {c0i }i form an H1 (K)-basis.
We now focus on computing an optimal H_p(K)-basis, a problem also known as the optimal homology basis problem, or OHBP for short. One may observe that Definition 5.3 formulates OHBP as a weighted ℓ1-optimization over representatives of bases. This allows different types of optimality to be achieved by choosing different weights. For example, assume that the simplicial complex K of dimension p or greater is embedded in Rd, where d ≥ p + 1. Let the Euclidean p-dimensional volumes of the p-simplices be their weights. This specializes OHBP to the Euclidean ℓ1-optimization problem. The resulting optimal H_p(K)-basis has the smallest p-dimensional volume amongst all such bases. If the weights are taken to be unit, the resulting optimal solution has the smallest number of p-simplices amongst all H_p(K)-bases.
Algorithm 7 GreedyBasis(C)
Input:
A set of p-cycles C in a complex
Output:
A maximal set of cycles from C whose classes are independent and total weight is minimum
1: Sort the cycles from C in non-decreasing order of their weights; that is, C = {c1 , . . . , cn }
implies w(ci ) ≤ w(c j ) for i ≤ j
2: Let B := {c1 }
3: for i = 2 to n do
4: if [ci ] is independent w.r.t. B then
5: B := B ∪ {ci }
6: end if
7: end for
8: if [c1 ] is trivial (a boundary) then output B \ {c1 }, else output B
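A sketch of Algorithm 7 when homology classes are given as Z2 annotation vectors (this input encoding is our assumption; independence is tested here by plain Gaussian elimination, whereas a more efficient batched test appears later in the chapter; trivial classes reduce to zero and are skipped outright, which has the same effect as step 8):

```python
def greedy_basis(cycles):
    """cycles: list of (weight, annotation) pairs, annotation a tuple
    of 0/1 encoding the homology class over Z2. Returns a minimum-weight
    set of cycles with independent, non-trivial classes."""
    basis, pivots = [], {}              # pivots: leading index -> reduced vector
    for w, a in sorted(cycles, key=lambda p: p[0]):
        v = list(a)
        for j, u in pivots.items():     # reduce against chosen classes (Z2)
            if v[j]:
                v = [x ^ y for x, y in zip(v, u)]
        lead = next((j for j, x in enumerate(v) if x), None)
        if lead is not None:            # class independent of basis so far
            pivots[lead] = v
            basis.append((w, a))
    return basis
```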
Proposition 5.1. Suppose that C, the input to the algorithm GreedyBasis, contains an optimal
H p (K)-basis. Then, the output of GreedyBasis is an optimal H p (K)-basis.
Proof. Let C contain an optimal H_p(K)-basis C∗ = {c∗1, . . . , c∗g}, sorted according to the appearance of its cycles in the ordered sequence C = {c1, . . . , cn}. Let C′ = {c′1, . . . , c′g′} be the output of GreedyBasis, again sorted according to the appearance of the cycles in C. By Definition 5.3, g, the cardinality of C∗, equals the dimension of H_p(K), and hence g′ ≤ g because g + 1 or more classes cannot be independent in H_p(K).
Among all optimal H_p(K)-bases that C contains, take C∗ to be the lexicographically smallest; that is, there is no other sorted C̃∗ = {c̃∗1, . . . , c̃∗g} such that for some j ≥ 1, c̃∗1 = c∗1, . . . , c̃∗j−1 = c∗j−1, and c̃∗j = ck, c∗j = cℓ with k < ℓ.
First, we show that C′ is a prefix of C∗. If not, there is a least index j ≥ 1 so that c∗j ≠ c′j.
Since the classes of the cycles in C∗ form a basis for H p (K), and C0 cannot contain any trivial cycle
(ensured by step 8), the class [c0j ] can be written as a linear combination of the classes of the cycles
in C∗ . Consider the class [c∗k ] in this linear combination with the largest index k. It is not possible
that c∗k appears before c0j in the order. This is because then [c0j ] will be a linear combination of the
classes of the cycles appearing before c0j in C0 which is impossible by the construction of C0 . So,
assume that c∗k appears after c0j . Then, consider the sorted sequence of cycles C̃∗ constructed by
replacing c∗k in C∗ with c′j. First, notice that C̃∗ is lexicographically smaller than C∗, and it is also an H_p(K)-basis, contradicting the fact that C∗ is the lexicographically smallest optimal cycle basis. That C̃∗ is an H_p(K)-cycle basis follows from the observation that [c′j] is independent of the classes of the cycles in C∗ \ {c∗k}, because [c′j] is a linear combination of classes that necessarily includes [c∗k].
Now, to complete the proof, we note that g′ = g. If not, then g′ < g and C′ is a proper prefix of C∗. But then one can add c∗_{g′+1} from C∗ to C′, where [c∗_{g′+1}] is independent of all classes of the cycles already in C′. This means that the algorithm GreedyBasis could not have stopped without enlarging C′, a contradiction.
The above proposition says that GreedyBasis can compute an optimal cycle basis if its input set C contains one. We show next that such an input (i.e., a set of 1-cycles containing an optimal H1(K)-basis) can be computed for H1(K) in O(n² log n) time, where the 2-skeleton of K has n simplices.
Specifically, given a simplicial complex K, notice that H1 (K) is completely determined by
the 2-skeleton of K and hence without loss of generality we can assume K to be a 2-complex.
Algorithm 8: Generator computes a set C of 1-cycles from such a complex which includes an optimal basis.
Algorithm 8 Generator(K)
Input:
A 2-complex K
Output:
A set of 1-cycles containing an optimal H1 (K)-basis
1: Let K 1 be the 1-skeleton of K with vertex set V and edge set E
2: C := {∅}
3: for all v ∈ V do
4: compute a shortest path tree T v rooted at v in K 1 = (V, E)
5: for all e = (u, w) ∈ E \ T v s.t. u, w ∈ T v do
6: Compute cycle ce = πu,w ∪ {e} where πu,w is the unique path connecting u and w in T v
7: C := C ∪ {ce }
8: end for
9: end for
10: Output C
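A sketch of Algorithm 8 using Dijkstra shortest-path trees (the graph encoding, helper names, and the symmetric 'weight' callback are ours; cycles are returned as vertex walks pi_{v,u} followed by the reversed pi_{v,w}, closing through the non-tree edge e):

```python
import heapq

def generator(vertices, edges, weight):
    """For every vertex v, build a shortest-path tree T_v and, for each
    non-tree edge e = (u, w) with both endpoints in T_v, record the
    candidate cycle formed by the two tree paths plus e."""
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)

    def tree_path(parent, t):          # root ... t along the tree
        path = [t]
        while parent[t] is not None:
            t = parent[t]
            path.append(t)
        return path[::-1]

    cycles = []
    for v in vertices:
        parent, dist = {v: None}, {v: 0.0}
        heap = [(0.0, v)]
        while heap:                    # Dijkstra rooted at v
            d, x = heapq.heappop(heap)
            if d > dist[x]:
                continue               # stale heap entry
            for y in adj[x]:
                nd = d + weight((x, y))
                if nd < dist.get(y, float("inf")):
                    dist[y], parent[y] = nd, x
                    heapq.heappush(heap, (nd, y))
        tree = {frozenset((x, p)) for x, p in parent.items() if p is not None}
        for u, w in edges:
            if u in parent and w in parent and frozenset((u, w)) not in tree:
                cycles.append(tree_path(parent, u) + tree_path(parent, w)[::-1])
    return cycles
```

The guard `u in parent and w in parent` handles the case that the 1-skeleton is disconnected, which the proof below also notes.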
Proposition 5.2. Generator(K) computes an H1(K)-generator C, together with the weights of its cycles, in O(n² log n) time for a 2-complex K with n vertices and edges. Furthermore, the set C contains an optimal basis, and |C| = O(n²).
Proof. We prove that any cycle c in an optimal H1-basis C∗ that is not computed by Generator can be replaced by a cycle computed by Generator while keeping C∗ optimal. This proves the claim that the output of Generator contains an optimal basis (and thus C is necessarily an H1-generator).
First, assume that C∗ consists of simple cycles, because otherwise we can choose such cycles from the cycles of C∗ by Fact 5.1(ii). So, assume that c ∈ C∗ is simple. Let v be any vertex in c. There exists at least one edge e in c which is not in the shortest path tree T_v. Let e = {u, w}. Consider the shortest paths π_{v,u} and π_{v,w} in T_v from the root v to the vertices u and w respectively. Notice that even though K¹ may be disconnected, the vertices u, w are necessarily in T_v. Also, let π′_{v,u} and π′_{v,w} be the paths from v to u and w respectively in the cycle c. If π_{v,u} = π′_{v,u} and π_{v,w} = π′_{v,w}, we have c = c_e computed by Generator. So, assume that at least one path does not satisfy this condition, say π_{v,u} ≠ π′_{v,u}. See Figure 5.3.
Consider the two cycles c₁ and c₂, where c₁ consists of the paths π′_{v,w}, π_{v,u}, and e; c₂ consists of the paths π_{v,u} and π′_{v,u}. Observe that c = c₁ + c₂. Also, w(c₁) ≤ w(c) and w(c₂) ≤ w(c). If both [c₁] and [c₂] were dependent on the classes of the cycles in C∗ \ {c}, then [c] would be dependent on them as well, contradicting that C∗ is an H1(K)-basis.
Figure 5.3: Tree T v and the paths πv,u , πv,w , π0v,u , π0v,w .
Algorithm 9 OptGen(K)
Input:
A 2-complex K
Output:
An optimal H1 (K)-basis
1: C := Generator(K)
2: Output C∗ := GreedyBasis(C)
Algorithm 10 AnnotEdge(K)
Input:
A simplicial 2-complex K
Output:
Annotations for edges in K
1: Let K 1 be the 1-skeleton of K with edge set E
2: Compute a spanning forest T of K 1 ; m = |E| − |T |
3: For every edge e ∈ E ∩ T , assign an m-vector a(e) where a(e) = 0
4: Index remaining edges in E \ T as e1 , . . . , em
5: For every edge ei , assign a(ei )[ j] = 1 iff j = i
6: for all triangle t ∈ K do
7: if a(∂t) ≠ 0 then
8: pick any non-zero entry bu in a(∂t)
9: add a(∂t) to every edge e s.t. a(e)[u] = 1
10: delete u-th entry from annotation of every edge
11: end if
12: end for
Figure 5.4: (left) A non-trivial cycle in a double torus; (right) an optimal cycle in the class of the cycle on the left.
e and hence per cycle ce ∈ C giving a time complexity of O(gn2 ) in total for the entire set C.
Next, we describe an efficient way of determining the independence of cycles, as needed in step 4 of GreedyBasis. Independence of the class [c_e] with respect to all classes already chosen by GreedyBasis is tested in a batch mode. One could do it edge by edge, incurring more cost; we use a divide-and-conquer strategy instead.
Let c_{e_1}, c_{e_2}, . . . , c_{e_k} be the sorted order of cycles in C computed by Generator. We construct a matrix A whose ith column is the vector a(c_{e_i}) and compute the first g columns that are independent, called the earliest basis of A. Since there are k cycles in C, the matrix A is g × k. We use the following iterative method, based on blocks, to compute the set J of indices of the columns that define the earliest basis. We partition A from left to right into submatrices A = [A1 | A2 | · · · ], where each submatrix Ai contains g columns, with the possible exception of the last submatrix, which contains at most g columns. Initially, we set J to be the empty set. We then iterate over the submatrices Ai by increasing index, that is, as they are ordered from left to right. At each iteration we compute the earliest basis of the matrix [A_J | A_i], where A_J is the submatrix whose column indices are in J. We then set J to be the indices from the resulting earliest basis, increase i, and go to the next iteration. At each iteration we need to compute the earliest basis of a matrix with g rows and at most |J| + g ≤ 2g columns. Thus, each iteration takes O(g^ω) time, and there are at most O(k/g) = O(n²/g) iterations. Summing over all iterations, this gives a time complexity of O(n² g^{ω−1}).
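The blocked scan can be sketched over GF(2) as follows (plain Gaussian elimination stands in for the fast matrix multiplication that gives the O(g^ω) per-block cost; the row-list encoding is ours):

```python
def earliest_basis_cols(A, g):
    """Return the column indices of the earliest basis of the GF(2)
    matrix A (a list of 0/1 rows), scanning blocks of g columns left
    to right and carrying the current earliest basis J along."""
    m, k = len(A), len(A[0])
    J = []
    for start in range(0, k, g):
        cand = J + list(range(start, min(start + g, k)))
        M = [[A[r][c] for c in cand] for r in range(m)]
        keep, row = [], 0
        for ci in range(len(cand)):        # Gaussian elimination over GF(2)
            piv = next((r for r in range(row, m) if M[r][ci]), None)
            if piv is None:
                continue                   # column dependent on kept ones
            M[row], M[piv] = M[piv], M[row]
            for r in range(m):
                if r != row and M[r][ci]:
                    M[r] = [a ^ b for a, b in zip(M[r], M[row])]
            keep.append(cand[ci])
            row += 1
        J = keep
        if len(J) == g:                    # a full basis cannot get earlier
            break
    return J
```

The carried set J stays correct block after block by the greedy (matroid) property of the earliest basis.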
Theorem 5.3. Given a simplicial 2-complex K with n simplices, an optimal H1 (K)-basis can be
computed in O(nω + n2 gω−1 ) time.
Proof. An H1-generator C containing an optimal (cycle) basis can be computed in O(n² log n) time by Proposition 5.2. One can compute an optimal H1-basis from C with GreedyBasis by Proposition 5.1. However, instead of running GreedyBasis directly, we apply the divide-and-conquer technique outlined above to compute the cycles output by GreedyBasis, which takes O(n^ω + n² g^{ω−1}) time. Retaining only the dominating terms, we obtain the claimed complexity for the entire algorithm.
5.2 Localization
In this section we consider a different optimization problem. Here we are given a p-cycle c in an input complex with non-negative weights on p-simplices, and our goal is to compute a cycle c∗ that is of optimal (minimal) weight in the homology class [c]; see Figure 5.4. We extend this localization problem from cycles to chains. For this, we first extend the concept of homologous cycles from Section 2.5 to chains in the straightforward way: two p-chains c, c′ ∈ C_p are called homologous if and only if they differ by a boundary, that is, c ∈ c′ + B_p. We ask to compute a chain of minimal weight which is homologous to a given chain.
Definition 5.4. Let w : K(p) → R≥0 be a non-negative weight function defined on the set of p-simplices in a simplicial complex K. We extend w to the chain group C_p by defining w(c) = Σ_i |c_i| w(σ_i) where c = Σ_i c_i σ_i.
Definition 5.5 (OHCP). Given a non-negative weight function w : K(p) → R≥0 defined on the
set of p-simplices in a simplicial complex K and a p-chain c in C p (K), the optimal homologous
chain problem (OHCP) is to find a chain c∗ which has the minimal weight w(c∗ ) among all chains
homologous to c.
If we use Z₂ as the coefficient ring for defining homology classes, the OHCP is NP-hard. We are going to show that it becomes polynomial-time solvable if (i) the coefficient ring is chosen to be the integers Z, and (ii) the complex K is such that H_p(K) does not have torsion, which may be introduced because of using Z as the coefficient ring.
We will formulate OHCP as an integer program, which requires the chains to be represented as integer vectors. Given a p-chain x = Σ_{i=0}^{m−1} x_i σ_i with integer coefficients x_i, we use x ∈ Z^m to denote the vector formed by the coefficients x_i. Thus, x is the representation of the chain x in the elementary p-chain basis, and we will use x and x interchangeably.
Recall that for a vector x ∈ R^m, the 1-norm (or ℓ1-norm) ‖x‖_1 is Σ_i |x_i|. Let W be any real m × m diagonal matrix with diagonal entries w_i. Then the 1-norm of Wx, that is, ‖Wx‖_1, is Σ_i |w_i||x_i|. (If W is a general m × m nonsingular matrix, then ‖Wx‖_1 is called the weighted 1-norm of x.) We now state in words our approach to the optimal homologous chains and later formalize it in Eqn. (5.1). The main idea is to cast OHCP as an integer program. Unfortunately, integer programs are NP-hard in general and thus cannot be solved in polynomial time unless P = NP. Instead, we identify a class of integer programs, namely those whose constraint matrices are totally unimodular, for which linear programming relaxations give exact integral solutions. Then, we interpret total unimodularity in terms of topology. Our approach to solving OHCP can be succinctly stated by the following steps:
• write OHCP as an integer program involving 1-norm minimization, subject to linear con-
straints;
• convert the integer program into an integer linear program by converting the 1-norm cost
function to a linear one using the standard technique of introducing some extra variables
and constraints;
• find the conditions under which the constraint matrix of the integer linear program is totally
unimodular; and
• for this class of problems, relax the integer linear program to a linear program by dropping
the constraint that the variables be integral. The resulting optimal chain obtained by solving
the linear program will be an integer valued chain homologous to the given chain.
Let w be a non-negative real-valued weight function on the oriented p-simplices of K, and let W be the corresponding diagonal matrix (the i-th diagonal entry of W is w(σ_i) = w_i). With this notation, OHCP asks for a chain of minimal weighted 1-norm among all chains homologous to c:

    min ‖Wx‖_1  subject to  x = c + D_{p+1} y,  x ∈ Z^m, y ∈ Z^n.        (5.1)

The objective function ‖Wx‖_1 = Σ_i w_i |x_i| in (5.1) is not linear in the x_i because it uses the absolute values of the x_i. However, it is piecewise-linear in these variables. As a result, Eqn. (5.1) can be reformulated as an integer linear program by splitting every variable x_i into two parts x_i^+ and x_i^− [27, page 18]:
    min  Σ_i w_i (x_i^+ + x_i^−)
    subject to  x^+ − x^− = c + D_{p+1} y        (5.2)
                x^+, x^− ≥ 0
                x^+, x^− ∈ Z^m,  y ∈ Z^n.
Comparing the above formulation to the standard-form integer linear program of Eqn. (5.4) (min f^T x subject to Ax = b, x ≥ 0, x integral), we notice that the vector x in Eqn. (5.4) corresponds to [x^+, x^−, y]^T in Eqn. (5.2) above. Thus, the minimization is over x^+, x^−, and y; the coefficients of x_i^+ and x_i^− in the objective function are w_i, while the coefficients corresponding to the y_j are zero. The linear programming relaxation of this formulation simply removes the constraints that the variables be integral. The resulting linear program is:
    min  Σ_i w_i (x_i^+ + x_i^−)
    subject to  x^+ − x^− = c + D_{p+1} y
                x^+, x^− ≥ 0.
To cast the program in standard form [27], we can eliminate the free (unrestricted in sign)
variables y by replacing these by y+ − y− and imposing the non-negativity constraints on the
new variables. The resulting linear program has the same objective function and the equality constraints:
    min  Σ_i w_i (x_i^+ + x_i^−)
    subject to  x^+ − x^− = c + D_{p+1} (y^+ − y^−)        (5.3)
                x^+, x^−, y^+, y^− ≥ 0.

Writing B = D_{p+1}, the equality constraint matrix of (5.3) is [I  −I  −B  B]. This is exactly the form we want the linear program to be in, in view of Eqn. (5.4). We now prove a result about the total unimodularity of this matrix that allows us to solve the optimization by a linear program.
Theorem 5.4. Let A be an m × n totally unimodular matrix. Then the integer linear program (5.4) can be solved in time polynomial in the dimensions of A.
Proposition 5.5. If B = D_{p+1} is totally unimodular, then so is the matrix [I  −I  −B  B].
Proof. The proof uses operations that preserve the total unimodularity of a matrix. These are listed in [272, page 280]. If B is totally unimodular then so is the matrix [−B  B], since scalar multiples of columns of B are being appended on the left to get this matrix. The full matrix in question can be obtained from this one by appending columns with a single ±1 on the left, which proves the result.
As a result of Theorem 5.4 and Proposition 5.5, we have the following algorithmic result.
Theorem 5.6. If the boundary matrix D p+1 of a finite simplicial complex of dimension greater
than p is totally unimodular, the optimal homologous chain problem (5.1) for p-chains can be
solved in polynomial time.
Proof. We have seen above that a reformulation of OHCP without the integrality constraints
leads to the linear program (5.3). By Proposition 5.5, the equality constraint matrix of this linear
program is totally unimodular. Then, by Theorem 5.4, the linear program (5.3) can be solved in
polynomial time, while achieving an integral solution.
Manifolds. Our results in the next section (Section 5.2.3) are valid for any finite simplicial complex. But first we consider a simpler case: simplicial complexes that are triangulations of manifolds. We show that for finite triangulations of compact p-dimensional orientable manifolds, the top non-trivial boundary matrix D_p is totally unimodular irrespective of the orientations of its simplices. There are examples of non-orientable manifolds where total unimodularity does not hold (Exercise 7). Further examination of why total unimodularity fails in these cases leads to the results in Theorem 5.9.
Let K be a finite simplicial complex that triangulates a (p+1)-dimensional compact orientable
manifold M.
Theorem 5.7. For a finite simplicial complex triangulating a (p + 1)-dimensional compact ori-
entable manifold, D p+1 is totally unimodular irrespective of the orientations of the simplices.
As a result of the above theorem and Theorem 5.6 we have the following result.
Corollary 5.8. For a finite simplicial complex triangulating a (p + 1)-dimensional compact ori-
entable manifold, the optimal homologous chain problem can be solved for p-dimensional chains
in polynomial time.
For pure subcomplexes L′ ⊂ L of K of dimensions p and p + 1 respectively, let D^{L,L′}_{p+1} denote the matrix of the relative boundary map

    ∂^{L,L′}_{p+1} : C_{p+1}(L, L′) → C_p(L, L′),

with any zero rows removed. The zero rows correspond to p-simplices that are not faces of any of the (p + 1)-simplices of L. Then the following holds.
Theorem 5.9. D_{p+1} is totally unimodular if and only if H_p(L, L′) is torsion-free, for all pure subcomplexes L′, L of K of dimensions p and p + 1 respectively, where L′ ⊂ L.
Proof. (only if): We show that if H_p(L, L′) has torsion for some L, L′, then D_{p+1} is not totally unimodular. Let D^{L,L′}_{p+1} be the corresponding relative boundary matrix. Bring D^{L,L′}_{p+1} to the so-called Smith normal form, which is a block matrix

    [ Δ  0 ]
    [ 0  0 ]

where Δ = diag(d_1, …, d_l) is a diagonal matrix with the d_i ≥ 1 being integers. The row or column of zero matrices in the block shown above may be empty, depending on the dimensions of the matrix. This can be done, for example, by using the reduction algorithm [241, pages 55–57]. The construction of the Smith normal form implies that d_k > 1 for some 1 ≤ k ≤ l because H_p(L, L′) has torsion. Thus, the product d_1 · · · d_k is greater than 1. By a result of Smith [281] mentioned in [272, page 50], this product is the greatest common divisor of the determinants of all k × k square submatrices of D^{L,L′}_{p+1}. It follows that some square submatrix of D^{L,L′}_{p+1}, and hence of D_{p+1}, has determinant of magnitude greater than 1. Then, D_{p+1} is not totally unimodular.
(if): Assume that D_{p+1} is not totally unimodular. We show that, in that case, there exist subcomplexes L′ and L of dimensions p and (p + 1) respectively, with L′ ⊂ L, so that H_p(L, L′) has torsion. Let S be a square submatrix of D_{p+1} with |det(S)| > 1. Let L correspond to the columns of D_{p+1} that are included in S, and let B_L be the submatrix of D_{p+1} formed by these columns. This submatrix B_L may contain zero rows. Those zero rows (if any) correspond to p-simplices that are not a facet of any of the (p + 1)-simplices in L. To form S from B_L, we first discard the zero rows to form a submatrix B′_L. This is safe because det(S) ≠ 0, and so these zero rows cannot occur in S.

The rows in B′_L correspond to p-simplices that adjoin some (p + 1)-simplex in L. Let L′ correspond to the rows of B′_L which are excluded to form S. Observe that S is the relative boundary matrix D^{L,L′}_{p+1}. Consider the Smith normal form of S. This normal form is a square diagonal matrix obtained by reducing S. Since the elementary row and column operations used for this reduction preserve determinant magnitude, the determinant of the resulting diagonal matrix has magnitude greater than 1. It follows that at least one of the diagonal entries in the normal form is greater than 1. Then, by [241, page 61], H_p(L, L′) has torsion.
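The torsion test at the heart of this proof can be sketched directly: diagonalize an integer matrix by elementary row and column operations (a simplified Smith-normal-form reduction that does not enforce the divisibility chain), and check whether any diagonal entry exceeds 1 in absolute value. The function names are illustrative, and this is only a sketch for small matrices, not the book's algorithm.

```python
# Illustrative sketch: integer diagonalization by row/column operations.
# Relative homology has torsion iff some diagonal entry exceeds 1.
def integer_diagonal(M):
    A = [list(row) for row in M]
    m, n = len(A), len(A[0]) if A else 0
    diag, t = [], 0
    while t < min(m, n):
        # pick the nonzero entry of smallest magnitude as pivot
        pivot = min(((i, j) for i in range(t, m) for j in range(t, n)
                     if A[i][j] != 0),
                    key=lambda ij: abs(A[ij[0]][ij[1]]), default=None)
        if pivot is None:
            break
        i0, j0 = pivot
        A[t], A[i0] = A[i0], A[t]                 # move pivot to (t, t)
        for row in A:
            row[t], row[j0] = row[j0], row[t]
        p, clean = A[t][t], True
        for i in range(t + 1, m):                 # clear column t
            q = A[i][t] // p
            clean &= A[i][t] % p == 0
            for j in range(t, n):
                A[i][j] -= q * A[t][j]
        for j in range(t + 1, n):                 # clear row t
            q = A[t][j] // p
            clean &= A[t][j] % p == 0
            for i in range(t, m):
                A[i][j] -= q * A[i][t]
        if clean:                                 # pivot divided everything
            diag.append(abs(p))
            t += 1                                # otherwise retry: remainders are smaller

    return diag

def has_torsion(M):
    return any(d > 1 for d in integer_diagonal(M))

print(integer_diagonal([[2, 4], [6, 8]]))   # [2, 4]: torsion
print(has_torsion([[1, 1], [0, 1]]))        # False
```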
Corollary 5.10. For a simplicial complex K of dimension greater than p, there is a polynomial-time algorithm for answering the following question: Is H_p(L, L′) torsion-free for all subcomplexes L′ and L of dimensions p and (p + 1) such that L′ ⊂ L?
Proof. Seymour's decomposition theorem for totally unimodular matrices [273], [272, Theorem 19.6] yields a polynomial-time algorithm for deciding whether a matrix is totally unimodular [272, Theorem 20.3]. Applying that algorithm to the boundary matrix D_{p+1} proves the assertion.
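The polynomial-time test above is quite involved; for intuition (and for tiny instances such as Exercise 7), total unimodularity can also be checked directly from its definition: every square submatrix must have determinant in {−1, 0, 1}. The sketch below does exactly that by brute force, which is exponential and only usable for very small matrices; the function names are illustrative.

```python
# Illustrative brute-force total-unimodularity check (exponential time).
from itertools import combinations

def det_int(M):
    """Integer determinant by Laplace expansion (fine for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] *
               det_int([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_totally_unimodular(M):
    m, n = len(M), len(M[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[M[i][j] for j in cols] for i in rows]
                if det_int(sub) not in (-1, 0, 1):
                    return False
    return True

# Boundary matrix of a path graph (vertices x edges): totally unimodular.
print(is_totally_unimodular([[-1, 0], [1, -1], [0, 1]]))   # True
print(is_totally_unimodular([[1, 1], [1, -1]]))            # False (det = -2)
```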
A special case. In Section 5.2.2, we have seen the special case of compact orientable manifolds.
We saw that the top dimensional boundary matrix of a finite triangulation of such a manifold is
totally unimodular. Now we show another special case for which the boundary matrix is totally
unimodular and hence OHCP is polynomial-time solvable. This case occurs when we ask for optimal p-chains in a simplicial complex K which is embedded in R^{p+1}. In particular, OHCP can be solved by linear programming for 2-chains in 3-complexes embedded in R^3. This follows from the following result:
Theorem 5.11. Let K be a finite simplicial complex embedded in R^{p+1}. Then, H_p(L, L′) is torsion-free for all pure subcomplexes L′ and L of dimensions p and p + 1 respectively, such that L′ ⊂ L.

Corollary 5.12. Given a p-chain c in a weighted finite simplicial complex embedded in R^{p+1}, an optimal chain homologous to c can be computed by a linear program.
Depending on whether the interval is finite or not, we have two cases captured in the following
definitions.
Problem 1 (PCYC-FIN p ). Given a finite filtration F and a finite interval [b, d) ∈ Dgm p (F), this
problem asks for computing an optimal persistent p-cycle for the bar [b, d).
Problem 2 (PCYC-INF p ). Given a finite filtration F and an infinite interval [b, ∞) ∈ Dgm p (F),
this problem asks for computing an optimal persistent p-cycle for the bar [b, ∞).
When p ≥ 2, computing optimal persistent p-cycles for both finite and infinite intervals is NP-hard in general. We identify a special but important class of simplicial complexes, which we term weak (p + 1)-pseudomanifolds, whose optimal persistent p-cycles can be computed in polynomial time in certain cases.
Specifically, it turns out that if the given complex is a weak (p + 1)-pseudomanifold, the problem of computing optimal persistent p-cycles for finite intervals can be cast into a minimal cut problem (see Section 5.3.1), due to the fact that persistent cycles of such kind are null-homologous in the complex. However, when p ≥ 2 and intervals are infinite, the computation of the same becomes NP-hard. Nonetheless, for infinite intervals, if we assume that the weak (p + 1)-pseudomanifold is embedded in R^{p+1}, then the optimal persistent p-cycle problem reduces to a minimal cut problem (see Section 5.3.3) and hence belongs to P. Note that a simplicial complex that can be embedded in R^{p+1} is necessarily a weak (p + 1)-pseudomanifold. We also note that while there is an algorithm [94] in the non-persistence setting which computes an optimal p-cycle by minimal cuts (Exercise 8; that algorithm assumes the (p + 1)-complex to be embedded in R^{p+1}), the algorithm for finite intervals presented here, in contrast, does not need the embedding assumption.
Before presenting the algorithms for the cases where they run in polynomial time, we summarize the complexity results for the different cases. In order to make our statements about the hardness results precise, we let WPCYC-FIN_p denote a subproblem of PCYC-FIN_p, and let WPCYC-INF_p, WEPCYC-INF_p denote two subproblems of PCYC-INF_p, with the subproblems placing additional constraints on the given simplicial complex. Table 5.1 lists the hardness results for all problems of interest, where the column “Restriction on K” specifies the additional constraints the subproblems place on the given simplicial complex K. Note that WPCYC-INF_p being NP-hard trivially implies that PCYC-INF_p is NP-hard.
The polynomial-time algorithms for the cases listed in Table 5.1 map the problem of computing optimal persistent cycles to the classic problem of computing minimal cuts in a flow network. The only exception is PCYC-INF_1, which can be solved by computing Dijkstra's shortest paths in graphs. We do not consider this special case here; its details can be found in [128].
Figure 5.5: An example of the constructions in our algorithm showing the duality between persis-
tent cycles and cuts having finite capacity for p = 1. (a) The input weak 2-pseudomanifold K with
its dual flow network drawn in blue, where the central hollow vertex denotes the dummy vertex,
the red vertex denotes the source and the orange vertices denote the sinks. All graph edges dual
to the outer boundary 1-simplices actually connect to the dummy vertex. (b) The partial complex
K_b in the input filtration F, where the bold green 1-simplex denotes σ_b^F, which creates the green 1-cycle. (c) The partial complex K_d in F, where the 2-simplex σ_d^F creates the pink 2-chain killing the green 1-cycle. (d) The green persistent 1-cycle of the interval [b, d) is dual to a cut (S, T) having finite capacity, where S contains all the vertices inside the pink 2-chain and T contains all the other vertices. The red graph edges denote the edges across (S, T), and their dual 1-chain is the green persistent 1-cycle.
vertices in s_2 are referred to as sinks. A cut (S, T) of (G, s_1, s_2) consists of two disjoint subsets S and T of V(G) such that S ∪ T = V(G), s_1 ⊆ S, and s_2 ⊆ T. We define the set of edges across the cut (S, T) as

    E(S, T) = { e = (u, v) ∈ E(G) | u ∈ S, v ∈ T }.

The capacity of a cut (S, T) is defined as C(S, T) = Σ_{e∈E(S,T)} C(e). A minimal cut of (G, s_1, s_2) is a cut with the minimal capacity. Note that we allow parallel edges in G (see Figure 5.6) to ease the presentation. These parallel edges can be merged into one edge during computation.
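As a hedged sketch of this computation, the snippet below merges parallel edges by summing capacities and then runs a standard max-flow based minimum-cut routine on a directed version of the graph; the tiny example, the use of networkx, and all names are illustrative, not the book's implementation.

```python
# Illustrative sketch: merge parallel edges, then compute a minimum s-t cut.
from collections import defaultdict
import networkx as nx

# (u, v, capacity) with a parallel pair between "s" and "a".
edges = [("s", "a", 1.0), ("s", "a", 2.0), ("a", "t", 2.0)]

merged = defaultdict(float)
for u, v, cap in edges:                      # merge parallel edges
    merged[frozenset((u, v))] += cap

G = nx.DiGraph()
for pair, cap in merged.items():
    u, v = tuple(pair)
    G.add_edge(u, v, capacity=cap)           # undirected edge as two arcs
    G.add_edge(v, u, capacity=cap)

cut_value, (S, T) = nx.minimum_cut(G, "s", "t")
print(cut_value)   # 2.0: the edge (a, t) is the cheapest separator
print(sorted(S))   # ['a', 's']
```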
of its dual graph edge be +∞. Finally, we compute a minimal cut of this flow network and return
the p-chain dual to the edges across the minimal cut as an optimal persistent cycle of the interval.
The intuition of the above algorithm is best explained by the example illustrated in Figure 5.5, where p = 1. The key to the algorithm is the duality between persistent cycles of the input interval and cuts of the dual flow network having finite capacity. To see this duality, first consider a persistent p-cycle c of the input interval [b, d). There exists a (p + 1)-chain A in K_d created by σ_d^F whose boundary equals c, making c killed. We can let S be the set of graph vertices dual to the simplices in A and let T be the set of the remaining graph vertices; then (S, T) is a cut. Furthermore, (S, T) must have finite capacity, as the edges across it are exactly dual to the p-simplices in c, and the p-simplices in c have indices in F less than or equal to b. On the other hand, let (S, T) be a cut with finite capacity; then the (p + 1)-chain whose simplices are dual to the vertices in S is created by σ_d^F. Taking the boundary of this (p + 1)-chain, we get a p-cycle c. Because the p-simplices of c are exactly dual to the edges across (S, T) and each edge across (S, T) has finite capacity, c must reside in K_b. We only need to ensure that c contains σ_b^F in order to show that c is a persistent cycle of [b, d). In Section 5.3.2, we argue that c indeed contains σ_b^F (proof of Proposition 5.14), so c is a persistent cycle.
In the dual graph, an edge is created for each p-simplex. If a p-simplex has two (p + 1)-cofaces, we simply let its dual graph edge connect the two vertices dual to its two (p + 1)-cofaces; otherwise, its dual graph edge has to connect to the infinite vertex on one end. A problem with this construction is that some weak (p + 1)-pseudomanifolds may have p-simplices that are faces of no (p + 1)-simplex, and these p-simplices may create self-loops around the infinite vertex. To avoid self-loops, we simply ignore these p-simplices. The reason we can ignore them is that they cannot be on the boundary of a (p + 1)-chain and hence cannot be on a persistent cycle of minimal weight. Algorithmically, we ignore these p-simplices by constructing the dual graph only from what we call the (p + 1)-connected component of K containing σ_d^F.
Definition 5.9 (q-connected). Let K be a simplicial complex. For q ≥ 1, two q-simplices σ and σ′ of K are q-connected in K if there is a sequence of q-simplices of K, (σ_0, …, σ_l), such that σ_0 = σ, σ_l = σ′, and for all 0 ≤ i < l, σ_i and σ_{i+1} share a (q − 1)-face. The property of q-connectedness defines an equivalence relation on the q-simplices of K. Each set in the partition induced by the equivalence relation constitutes a q-connected component of K. We say K is q-connected if any two q-simplices of K are q-connected in K. See Figure 5.6 for an example of 1-connected components and 2-connected components.
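Definition 5.9 translates directly into a union-find computation: unite any two q-simplices that share a (q − 1)-face. The following is a minimal illustrative sketch; the function names are ours.

```python
# Illustrative sketch: q-connected components via union-find.
from itertools import combinations

def q_connected_components(simplices):
    """simplices: list of q-simplices, each a tuple of vertex ids."""
    parent = list(range(len(simplices)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    facet_owner = {}                          # (q-1)-face -> a simplex seen with it
    for idx, s in enumerate(simplices):
        for facet in combinations(sorted(s), len(s) - 1):
            if facet in facet_owner:
                union(idx, facet_owner[facet])
            else:
                facet_owner[facet] = idx

    groups = {}
    for idx in range(len(simplices)):
        groups.setdefault(find(idx), []).append(simplices[idx])
    return list(groups.values())

# Two triangles sharing edge (1, 2) form one 2-connected component;
# the triangle (4, 5, 6) is a component of its own.
comps = q_connected_components([(0, 1, 2), (1, 2, 3), (4, 5, 6)])
print(len(comps))   # 2
```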
Complexity. The time complexity of MinPersCycFin depends on the encoding scheme of the
input and the data structure used for representing a simplicial complex. For encoding the input,
we assume K and F are represented by a sequence of all the simplices of K ordered by their
indices in F, where each simplex is denoted by its set of vertices. We also assume a simple yet
reasonable simplicial complex data structure as follows: in each dimension, simplices are mapped to integral identifiers ranging from 0 to the number of simplices in that dimension minus 1; each q-simplex has an array (or linked list) storing the id's of all its (q + 1)-cofaces; and a hash map is maintained for each dimension to query the integral id of a simplex in that dimension from its spanning vertices. We further assume p to be constant. Under these assumptions, letting n be the size (number of bits) of the encoded input, there are no more than n elementary O(1) operations in lines 1 and 2, so their time complexity is O(n). It is not hard to verify that the flow network construction also takes O(n) time, so the time complexity of MinPersCycFin is determined by the minimal cut algorithm. Using the max-flow algorithm by Orlin [248], the time complexity of MinPersCycFin becomes O(n^2).
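The data structure assumed above can be sketched as follows; the class and method names are illustrative, not from the book.

```python
# Illustrative sketch of the assumed data structure: per-dimension integral
# ids, coface lists, and a per-dimension hash map from vertex sets to ids.
class SimplexStore:
    def __init__(self):
        self.ids = {}        # dim -> {frozenset(vertices): id}
        self.cofaces = {}    # dim -> {id: [ids of (dim+1)-cofaces]}

    def add(self, vertices):
        s = frozenset(vertices)
        q = len(s) - 1
        table = self.ids.setdefault(q, {})
        if s in table:
            return table[s]
        sid = len(table)                      # ids are 0, 1, 2, ... per dimension
        table[s] = sid
        self.cofaces.setdefault(q, {})[sid] = []
        if q > 0:                             # register with each facet
            for v in s:
                fid = self.add(s - {v})
                self.cofaces[q - 1][fid].append(sid)
        return sid

    def id_of(self, vertices):
        return self.ids[len(vertices) - 1][frozenset(vertices)]

store = SimplexStore()
store.add((0, 1, 2))                          # triangle plus all its faces
store.add((1, 2, 3))
eid = store.id_of((1, 2))                     # the shared edge
print(len(store.cofaces[1][eid]))             # 2: it has two triangle cofaces
```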
In the rest of this section, we first describe the subroutine DualGraphFin, then close the section
by proving the correctness of the algorithm.
Dual graph construction. We describe the subroutine DualGraphFin used in Algorithm MinPersCycFin, which returns a dual graph G and a θ denoting two bijections which we use to prove correctness. Given the input (K̃, p), DualGraphFin constructs an undirected connected graph G as follows:

• For each (p + 1)-simplex σ^{p+1} of K̃, a vertex v of G is created, and we define a bijection

    θ : {(p + 1)-simplices of K̃} → V(G) \ {v_∞}

  by letting θ(σ^{p+1}) = v. Note that in the above range notation of θ, {v_∞} may not be a subset of V(G).

• For each p-simplex σ^p of K̃, an edge e of G is created, and we define a bijection

    θ : {p-simplices of K̃} → E(G),

  using the same notation as the bijection for V(G), by letting θ(σ^p) = e.
Note that we can take the image of a subset of the domain under a function. Therefore, if (S, T) is a cut for a flow network built on G, then θ^{−1}(E(S, T)) denotes the set of p-simplices dual to the edges across the cut. Also note that since simplicial chains with Z2 coefficients can be interpreted as sets, θ^{−1}(E(S, T)) is also a p-chain.
Proof. For contradiction, suppose that s_2 is an empty set. Then v_∞ ∉ V(G) and σ_d^F is the (p + 1)-simplex of K̃ with the greatest index in F. Since v_∞ ∉ V(G), any p-simplex of K̃ must be a face of two (p + 1)-simplices of K̃, so the set of (p + 1)-simplices of K̃ forms a (p + 1)-cycle created by σ_d^F. Then σ_d^F must be a positive simplex in F, which is a contradiction.
The following two propositions specify the duality mentioned at the beginning of Section 5.3.1:
Proposition 5.14. For any cut (S , T ) of (G, s1 , s2 ) with finite capacity, the p-chain c = θ−1 (E(S , T ))
is a persistent p-cycle of [b, d) and w(c) = C(S , T ).
Proof. Let A = θ^{−1}(S); it is easy to check that c = ∂(A). The key is to show that c is created by σ_b^F, which we show now. Suppose that c is created by a p-simplex σ^p ≠ σ_b^F. Since C(S, T) is finite, we have that index(σ^p) < b. Let c′ be a persistent cycle of [b, d) with c′ = ∂(A′), where A′ is a (p + 1)-chain of K_d. Then we have c + c′ = ∂(A + A′). Since A and A′ are both created by σ_d^F, A + A′ is created by a (p + 1)-simplex with an index less than d in F. So c + c′ is a p-cycle created by σ_b^F which becomes a boundary before σ_d^F is added. This means that σ_b^F is already paired when σ_d^F is added, contradicting the fact that σ_b^F is paired with σ_d^F. Similarly, we can prove that c is not a boundary until σ_d^F is added, so c is a persistent cycle of [b, d). Since (S, T) has finite capacity, we must have

    C(S, T) = Σ_{e∈θ(c)} C(e) = Σ_{e∈θ(c)} w(θ^{−1}(e)) = w(c).
Proposition 5.15. For any persistent p-cycle c of [b, d), there exists a cut (S , T ) of (G, s1 , s2 ) such
that C(S , T ) ≤ w(c).
Proof. Let A be a (p + 1)-chain in K_d such that c = ∂(A). Note that A is created by σ_d^F, and c is the set of p-simplices which are faces of exactly one (p + 1)-simplex of A. Let c′ = c ∩ K̃ and A′ = A ∩ K̃; we claim that c′ = ∂(A′). To prove this, first let σ^p be any p-simplex of c′; then σ^p is a face of exactly one (p + 1)-simplex σ^{p+1} of A. Since σ^p ∈ K̃, it is also true that σ^{p+1} ∈ K̃, and so σ^{p+1} ∈ A′. Then σ^p is a face of exactly one (p + 1)-simplex of A′, so σ^p ∈ ∂(A′). On the other hand, let σ^p be any p-simplex of ∂(A′); then σ^p is a face of exactly one (p + 1)-simplex σ_0^{p+1} of A′. Note that σ_0^{p+1} ∈ A, and we want to prove that σ^p is a face of exactly one (p + 1)-simplex of A. Suppose that σ^p is a face of another (p + 1)-simplex σ_1^{p+1} of A; then σ_1^{p+1} ∈ K̃ because σ_0^{p+1} ∈ K̃. So we have σ_1^{p+1} ∈ A ∩ K̃ = A′, contradicting the fact that σ^p is a face of exactly one (p + 1)-simplex of A′. Then we have σ^p ∈ ∂(A). Since σ_0^{p+1} ∈ K̃, we have σ^p ∈ K̃, which means that σ^p ∈ c′.

Let S = θ(A′) and T = V(G) \ S; then (S, T) is a cut of (G, s_1, s_2) because A′ is created by σ_d^F. We claim that θ^{−1}(E(S, T)) = ∂(A′). The proof of this equality is similar to the one in the proof of Proposition 5.14. It follows that E(S, T) = θ(c′). We then have

    C(S, T) = Σ_{e∈θ(c′)} C(e) = Σ_{e∈θ(c′)} w(θ^{−1}(e)) = w(c′) ≤ w(c),

where the last inequality holds because c′ ⊆ c and the weights are non-negative.
Figure 5.6: A weak 2-pseudomanifold K̃ embedded in R^2 with three voids. Its dual graph is drawn. The complex has one 1-connected component and four 2-connected components, with the 2-simplices in 2-connected components shaded.
(G, s_1, s_2) has a cut with finite capacity by Proposition 5.15. This means that C(S*, T*) is finite. By Proposition 5.14, the chain c* = θ^{−1}(E(S*, T*)) is a persistent cycle of [b, d). Suppose that c* is not an optimal persistent cycle of [b, d), and instead let c′ be a minimal persistent cycle of [b, d). Then there exists a cut (S′, T′) such that C(S′, T′) ≤ w(c′) < w(c*) = C(S*, T*) by Propositions 5.14 and 5.15, contradicting the fact that (S*, T*) is a minimal cut.
the graph edges are dual to the p-simplices. The duality between cycles and cuts is as follows: since the ambient space R^{p+1} is contractible (homotopy equivalent to a point), every p-cycle in K̃ is the boundary of a (p + 1)-dimensional region obtained by the point-wise union of certain (p + 1)-simplices and/or voids. We can derive a cut of the dual graph by putting all vertices contained in the (p + 1)-dimensional region into one vertex set and putting the rest into the other vertex set. On the other hand, for every cut of the graph, we can take the point-wise union of all the (p + 1)-simplices and voids dual to the graph vertices in one set of the cut and derive a (p + 1)-dimensional region. The boundary of the derived (p + 1)-dimensional region is then a p-cycle in K̃. We observe that by making the source and sink dual to the two (p + 1)-simplices or voids that σ̃ adjoins, we can build a flow network where a minimal cut produces an optimal p-cycle in K̃ containing σ̃.
The efficiency of the above algorithm is in part determined by the efficiency of the dual graph construction. This step requires identifying the voids that the boundary p-simplices are incident on; see Figure 5.6 for an illustration. A straightforward approach would be to first group the boundary p-simplices into p-cycles by local geometry, and then build the nesting structure of these p-cycles to correctly reconstruct the boundaries of the voids. This approach has a quadratic worst-case complexity. To make the void boundary reconstruction faster, we assume that the simplicial complex being worked on is p-connected, so that building the nesting structure is not needed. The reconstruction then runs in almost linear time. To satisfy the p-connectedness assumption, we begin the algorithm by taking K̃ to be a p-connected subcomplex of K_b containing σ_b^F and continue only with this K̃. The computed output is still correct because a minimal cycle in K̃ is again a minimal cycle in K_b. We skip the details of constructing void boundaries, which can be done in O(n log n) time. We also skip the proof of correctness of the following theorem. Interested readers can consult [129] for details.
Theorem 5.17. Given an infinite interval [b, ∞) ∈ Dgm_p(F) for a filtration F of a weak (p + 1)-pseudomanifold K embedded in R^{p+1}, an optimal persistent cycle for [b, ∞) can be computed in O(n^2) time, where n is the number of p- and (p + 1)-simplices in K.
on surfaces utilizing the duality between minimal cuts of a surface-embedded graph and optimal homologous cycles of a dual complex. A better algorithm is proposed in [74]. Both algorithms are fixed-parameter tractable, running in time exponential in the genus of the surface. For general dimension, Borradaile et al. [44] showed that the OHCP problem in dimension p can be O(√log n)-approximated and is fixed-parameter tractable for weak (p + 1)-pseudomanifolds. The only polynomial-time exact algorithm [94] in general dimension for OHCP works for p-cycles in complexes embedded in R^{p+1}; it uses a reduction to minimal (s, t)-cuts. Interestingly, when the coefficient ring is chosen to be Z instead of Z2 for the homology groups, the problem becomes polynomial-time solvable if there is no relative torsion, as shown in [126]. The material presented in Section 5.2 is taken from this paper.
Persistence adds an extra layer of complexity to the problem of computing minimal representative cycles. Escolar and Hiraoka [157] and Obayashi [247] formulated the problem as an integer program by adapting a similar formulation for the non-persistent case. Wu et al. [302] adapted the algorithm of Busaryev et al. [60] to present an exponential-time algorithm, as well as an A* heuristic for practical use. The problem of computing optimal persistent cycles is NP-hard even for H_1 [128]. The problem becomes polynomial-time solvable in some special cases, such as computing optimal persistent 2-cycles in a 3-complex embedded in R^3 [129]. The material in Section 5.3 is taken from this source.
Exercises
1. Show that every cycle in an H_1(K)-basis contains a simple cycle, and that these simple cycles together form an H_1(K)-basis themselves.
2. Design an O(n^2 log n + n^2 g) algorithm to compute the shortest non-trivial 1-cycle in a simplicial 2-complex K with n simplices and g = β_1(K). Do the same in O(n^2 log n) time when K is a 1-complex (a graph).
3. ([130]) We have given an O(n^ω + n^2 g^{ω−1}) algorithm for computing an optimal H_1-basis for a complex with n simplices. Taking g = Ω(n), this runs in O(n^{ω+1}) worst-case time. Give an O(n^3) algorithm for the problem.
4. How can one make the algorithm in [130] more efficient for a weighted graph G with n vertices and edges? For this, show that (i) an annotation for G can be computed in O(n^2) time, (ii) this annotation can be utilized to compute the annotations for O(n^2) candidate cycles in O(n^3) time, and (iii) finally, an optimal basis can be computed in O(n^3) time by the divide-and-conquer greedy algorithm of [130], implemented more efficiently.
5. Define a minmax basis of H_p(K) as a set of cycles that generate H_p(K) whose maximum cycle weight is minimized among all such generating sets. Prove that an optimal H_p-cycle basis as defined in Definition 5.3 is also a minmax basis.
6. Prove that a simplicial p-complex embedded in R^p cannot have torsion in H_{p−1}, and hence OHCP for (p − 1)-cycles can be solved in polynomial time in this case.
7. Take an example of a triangulation of the Möbius strip and show that the constraint matrix in the integer program formulation of OHCP for it is not totally unimodular.
8. Professor Optimist claims that an optimal H_p-generator for a complex K embedded in R^{p+1} can be obtained by computing optimal persistent p-cycles for infinite bars in any filtration of K. Show that he is wrong. Give a polynomial-time algorithm for computing a non-trivial p-cycle that has the least weight in K.
9. Consider computing a persistent 1-cycle for a bar [b, d) given a filtration of an edge-weighted complex K. Let c be a cycle created by the edge e = (u, v) at the birth time b, where c is formed by the edge e and the shortest path between u and v in the 1-skeleton of the complex K_b. If [c] = 0 in K_d, prove that c is an optimal persistent cycle for the bar [b, d).
10. Give an example where the above computed cycle using shortest path at the birth time is
not a persistent cycle.
11. For a finite interval [b, d) ∈ Dgm_p(F) of a filtration F of a weak (p + 1)-pseudomanifold, one can take the two vertices of the dual edge of the creator p-simplex σ_b in the algorithm MinPersCycFin (Section 5.3.1) as source and sink respectively. Give an example showing that this does not work for computing a minimal persistent cycle for [b, d). What about taking the dual vertex of the destroyer simplex σ_d and the infinite vertex as the source and the sink respectively?
12. ([93]) For a vertex v in a complex with non-negative weights on edges, let the discrete geodesic ball B_v^r of radius r be the maximal subcomplex L ⊆ K such that the shortest path distance from v to every vertex in L is at most r. For a cycle c, let w(c) = min{ r | c ⊆ B_v^r }. Give a polynomial-time algorithm to compute an optimal H_p-cycle basis for any p ≥ 1 with these weights.
Chapter 6
In this chapter, we focus on topological analysis of point cloud data (PCD), a common type of
input data across a broad range of applications. Often, there is a hidden space of interest, and
the PCD we obtain contains only observations / samples from that hidden space. If the sample
is sufficiently dense, it should carry information about the hidden space. We are interested in
topological information in particular. However, discrete points themselves do not have interesting
topology. To impose a connectivity that mimics that of the hidden space, we construct a simplicial
complex such as the Rips or Čech complex using the points as vertices. Then, an appropriate
filtration of this complex is constructed as a proxy for a filtration of the hidden space that the PCD
presumably samples. This provides topological summaries such as the persistence diagrams induced by the
filtrations. Figure 6.1 [192] shows an example application of this approach. The PCD in this
case represents atomic configurations of silica in three different states: liquid, glass, and crystal
states. Each atomic configuration can be viewed as a set of weighted points, where each point
represents the center of an atom and its weight is the radius of the atom. The persistence diagrams
for the three states show distinctive features which can be used for further analysis of the phase
transitions. The persistence diagrams can also be viewed as a signature of the input PCD and can
be used to compare shapes (e.g. [78]) or provide other analysis.
Figure 6.1: Persistence diagrams of silica in liquid (left), glass (middle), and crystal (right) states.
Image taken from [192], reprinted by permission from Yasuaki Hiraoka et al. (2016, fig. 2).
We mainly focus on PCD consisting of a set of points P ⊆ (Z, dZ ) embedded in some metric
space Z equipped with a metric dZ . One of the most common choices for (Z, dZ ) in practice is
147
148 Computational Topology for Data Analysis
the d-dimensional Euclidean space Rd equipped with the standard L p -distance. We review the
relevant concepts of constructing Rips and Čech complexes, their filtrations, and describe the
properties of the resulting persistence diagrams in Section 6.1. In practice, the size of a filtration
can be prohibitively large. In Section 6.2, we discuss data sparsification strategies to approximate
topological summaries much more efficiently and with theoretical guarantees.
As we have mentioned, a PCD can be viewed as a window through which we can peek at topo-
logical properties of the hidden space. In particular, we can infer about the hidden homological
information using the PCD at hand if it samples the hidden space sufficiently densely. In Section
6.3, we provide such inference results for the cases when the hidden space is a manifold or is a
compact set embedded in the Euclidean space. To obtain theoretical guarantees, we also need to
introduce the language of sampling conditions to describe the quality of point samples. Finally,
in Section 6.4, we focus on the inference of scalar field topology from a set of point samples P,
as well as function values available at these samples. More precisely, we wish to estimate the
persistent homology of a real-valued function f : X → R from a set of discrete points P ⊂ X as
well as the values of f over P.
We often omit Z from the subscript when its choice is clear. As mentioned in Chapter 2.2, the
Čech complex Cr (P) is the nerve of the union of balls Pr . If the metric balls centered at points
in P in the metric space (Z, dZ ) are convex, then the Nerve Theorem (Theorem 2.1) gives the
following corollary.
Corollary 6.1. For a fixed r ≥ 0, if the metric ball BZ (x, r) is convex for every x ∈ P, then Cr (P)
is homotopy equivalent to Pr , and thus Hk (Cr (P)) ≅ Hk (Pr ) for any dimension k ≥ 0.
The above result justifies the utility of Čech complexes. For example, if P ⊆ Rd and dZ is the
standard L p -distance for p ≥ 1 (so that metric balls are convex), then the Čech complex Cr (P) is homotopy equivalent to the
union of r-radius balls centering at points in P. Later in this chapter, we will also see an example
where the points P are taken from a Riemannian manifold X equipped with the Riemannian metric
dX . When the radius r is small enough, the intrinsic metric balls also become convex. In both
cases, the resulting Čech complex captures information of the union of r-balls Pr .
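As a concrete illustration of the scale at which these complexes capture the union of balls, consider three points forming an equilateral triangle in R2. The following Python sketch (not from the book; the function and variable names are ours) compares the radius at which the triangle enters the Čech complex (the circumradius of the triangle, for an acute triangle) with the radius at which it enters the Rips complex (half the diameter), under the conventions used in this chapter: Cr (P) is the nerve of the r-balls, and VRr (P) contains a simplex once all pairwise distances are at most 2r.

```python
import math

def circumradius(a, b, c):
    """Radius of the circumscribed circle of a planar triangle (a, b, c)."""
    la = math.dist(b, c)
    lb = math.dist(a, c)
    lc = math.dist(a, b)
    s = (la + lb + lc) / 2.0
    area = math.sqrt(s * (s - la) * (s - lb) * (s - lc))  # Heron's formula
    return la * lb * lc / (4.0 * area)

# Equilateral triangle with unit side length.
P = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2.0)]

# The Rips complex VR^r (pairwise distances <= 2r) contains the
# triangle as soon as r >= 1/2, ...
rips_r = max(math.dist(p, q) for p in P for q in P) / 2.0

# ... while the Cech complex C^r needs the three r-balls to share a
# common point, i.e. r >= circumradius = 1/sqrt(3) ~ 0.577.
cech_r = circumradius(*P)
```

The gap between the two thresholds (here 1/2 versus 1/√3) is exactly the phenomenon behind the multiplicative interleaving of the two filtrations discussed below.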
In general, it is not clear at which scale (radius r) one should inspect the input PCD. Varying
the scale parameter r, we obtain a filtration of spaces P := {Pα ,→ Pα′ }α≤α′ as well as a filtered
sequence of simplicial complexes C(P) := {Cα (P) ,→ Cα′ (P)}α≤α′ . The homotopy equivalence
between Pr and Cr (P), if it holds, further induces an isomorphism between the persistence modules
obtained from these two filtrations.
Proposition 6.2 ([91]). If the metric ball B(x, r) is convex for every x ∈ P and all r ≥ 0, then the
persistence module Hk P is isomorphic to the persistence module Hk C(P). This also implies that
their corresponding persistence diagrams are identical; that is, Dgmk P = Dgmk C(P), for any
dimension k ≥ 0.
A related persistence-based topological invariant is given by the Vietoris-Rips filtration R(P) =
{VRα (P) ,→ VRα′ (P)}α≤α′ , where the Vietoris-Rips complex VRr (P) for a finite subset P ⊆ (Z, dZ )
consists of all simplices σ ⊆ P with dZ (p, q) ≤ 2r for every pair p, q ∈ σ.
Recall from Chapter 4.1 that the Čech filtration and Vietoris-Rips filtration are multiplicatively
2-interleaved, meaning that their persistence modules are log 2-interleaved at the log-scale, and
hence their persistence diagrams are within bottleneck distance log 2 of each other at the log-scale.
Finite metric spaces. The above definitions of Čech or Rips complexes assume that P is em-
bedded in an ambient metric space (Z, dZ ). It is possible that Z = P and we simply have a discrete
metric space spanned by points in P, which we denote by (P, dP ). Obviously, the construction of
Čech and Rips complexes can be extended to this case. In particular, the Čech complex CrP (P) is
now defined as

CrP (P) := {σ ⊆ P | ⋂_{p∈σ} BP (p, r) ≠ ∅},

where BP (p, r) := {q ∈ P | dP (p, q) ≤ r}. However, note that when P ⊂ Z and dP is the restriction
of the metric dZ to points in P, the Čech complex CrP (P) defined above can be different from the
Čech complex CrZ (P), as the metric balls (BP vs. BZ ) are different. In particular, in this case, we
have the following relation between the two types of Čech complexes:

CrP (P) ⊆ CrZ (P) ⊆ C2rP (P).
On the other hand, in this setting, the two Rips complexes are the same because the definition of
Rips complex involves only pairwise distance between input points, not metric balls.
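Because the Rips complex depends only on pairwise distances, it can be built from any finite metric space by a brute-force clique search. A minimal Python sketch (ours, not from the book; exponential in the worst case and meant only for tiny inputs) using the 2r diameter convention of this chapter:

```python
import math
from itertools import combinations

def rips_complex(points, r, max_dim=2):
    """All simplices of VR^r on an indexed point set, up to dimension
    max_dim, using the convention: all pairwise distances <= 2r."""
    n = len(points)

    def small_diameter(simplex):
        return all(math.dist(points[i], points[j]) <= 2 * r
                   for i, j in combinations(simplex, 2))

    simplices = []
    for k in range(1, max_dim + 2):  # a simplex on k vertices has dimension k-1
        simplices.extend(s for s in combinations(range(n), k)
                         if small_diameter(s))
    return simplices
```

For the three vertices of a right isosceles triangle with legs of length 1, the complex grows from 3 simplices (vertices only) to 5 (two edges added) to 7 (full triangle) as r increases past the corresponding half-distances.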
The persistence diagrams induced by the Čech and the Rips filtrations can be used as topologi-
cal summaries for the input PCD P. We can then for example, compare input PCDs by comparing
these persistence diagram summaries.
Definition 6.1 (Čech, Rips distance). Given two finite point sets P and Q, equipped with appro-
priate metrics, the Čech distance between them is a pseudo-distance defined as

dCech (P, Q) := db (Dgm C(P), Dgm C(Q)),

the bottleneck distance between the persistence diagrams of the two Čech filtrations; the Rips
distance dRips (P, Q) := db (Dgm R(P), Dgm R(Q)) is defined analogously.
These distances are stable with respect to the Hausdorff or the Gromov-Hausdorff distance
between P and Q, depending on whether they are embedded in a common metric space or are
viewed as two discrete metric spaces (P, dP ) and (Q, dQ ). We introduce the Hausdorff and
Gromov-Hausdorff distances now. Given a point x and a set A from a metric space (X, d), let
d(x, A) := inf a∈A d(x, a) denote the closest distance from x to any point in A.
Definition 6.2 (Hausdorff distance). Given two compact sets A, B ⊆ (Z, dZ ), the Hausdorff dis-
tance between them is defined as

dH (A, B) := max{ sup_{a∈A} dZ (a, B), sup_{b∈B} dZ (b, A) }.
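For finite point sets the two suprema in Definition 6.2 become maxima, so the Hausdorff distance can be computed directly from the definition. A short Python sketch (ours, not from the book):

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets in R^d:
    the larger of the two directed distances max_{a in A} d(a, B)
    and max_{b in B} d(b, A)."""
    d_ab = max(min(math.dist(a, b) for b in B) for a in A)
    d_ba = max(min(math.dist(a, b) for a in A) for b in B)
    return max(d_ab, d_ba)
```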
Note that the Hausdorff distance requires that the input objects be embedded in a common am-
bient space. In case they are not, we use the Gromov-Hausdorff distance, which intuitively
measures how much two input metric spaces differ from being isometric.
Definition 6.3 (Gromov-Hausdorff distance). Given two metric spaces (X, dX ) and (Y, dY ), a cor-
respondence C is a subset C ⊆ X × Y so that (i) for every x ∈ X, there exists some (x, y) ∈ C; and
(ii) for every y′ ∈ Y, there exists some (x′ , y′ ) ∈ C. The distortion induced by C is

distortC (X, Y) := (1/2) sup_{(x,y),(x′,y′)∈C} |dX (x, x′ ) − dY (y, y′ )|.
The Gromov-Hausdorff distance between (X, dX ) and (Y, dY ) is the smallest distortion possible by
any correspondence; that is,

dGH ((X, dX ), (Y, dY )) := inf_C distortC (X, Y).
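Computing dGH is hard in general, but for very small finite metric spaces one can enumerate all correspondences directly from Definition 6.3. The following brute-force Python sketch (ours, not from the book; exponential in the number of pairs and meant only for illustration) takes two distance matrices:

```python
def gromov_hausdorff(dX, dY):
    """Exact d_GH for tiny finite metric spaces given as distance
    matrices, by enumerating all correspondences C in X x Y."""
    nx, ny = len(dX), len(dY)
    pairs = [(i, j) for i in range(nx) for j in range(ny)]
    best = float('inf')
    # Each subset of X x Y is encoded as a bitmask over `pairs`.
    for mask in range(1, 1 << len(pairs)):
        C = [p for k, p in enumerate(pairs) if mask >> k & 1]
        if ({i for i, _ in C} != set(range(nx))
                or {j for _, j in C} != set(range(ny))):
            continue  # not a correspondence: some point is uncovered
        dis = max(abs(dX[i][ip] - dY[j][jp])
                  for (i, j) in C for (ip, jp) in C)
        best = min(best, dis / 2.0)  # d_GH takes half the distortion
    return best
```

For two 2-point spaces with interpoint distances 1 and 3, the best correspondence pairs the points up and the distortion is 2, giving dGH = 1.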
Theorem 6.3. Čech- and Rips-distances satisfy the following stability statements:
(1) If P and Q are embedded in a common metric space (Z, dZ ), then dCech (P, Q) ≤ dH (P, Q) and
dRips (P, Q) ≤ dH (P, Q).
(2) dCech (P, Q) ≤ 2dGH ((P, dP ), (Q, dQ )), and dRips (P, Q) ≤ dGH ((P, dP ), (Q, dQ )).
Note that the bound on dCech (P, Q) in statement (2) of the above theorem has an extra factor
of 2, which is due to the difference in metric balls – see the discussion after Eqn (6.4). We
also remark that (2) in the above theorem can be extended to so-called totally bounded metric
spaces (which are not necessarily finite) (P, dP ) and (Q, dQ ), defined as follows. First, recall that
an ε-sample (Definition 2.17) of a metric space (Z, dZ ) is a finite set S ⊆ Z so that for every
z ∈ Z, dZ (z, S ) ≤ ε. A metric space (Z, dZ ) is totally bounded if there exists a finite ε-sample for
every ε > 0. Intuitively, such a metric space can be approximated by a finite metric space for any
resolution.
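For a finite space Z, the ε-sample condition of Definition 2.17 can be checked directly. A one-line Python sketch (ours, not from the book):

```python
import math

def is_eps_sample(S, Z, eps):
    """True iff every point of the (finite) space Z is within
    distance eps of the candidate sample S."""
    return all(min(math.dist(z, s) for s in S) <= eps for z in Z)
```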
Figure 6.2: Vietoris-Rips complex: (b) at a small scale, the Rips complex of the points shown in (a)
requires the two white points; (c) the two white points become redundant at a larger scale.

The size of the Čech and Rips filtrations can become prohibitively large: at a large enough scale
r, every subset of P spans a simplex, in which case the size of the d-skeleton of Cr (P) or VRr (P)
is Θ(nd+1 ) for n = |P|.
On the other hand, as shown in Figure 6.2, as the scale r increases, certain points could
become “redundant”, e.g, having no or little contribution to the underlying space of the union of
all r-radius balls. Based on this observation, one can approximate these filtrations with sparsified
filtrations of much smaller size. In particular, as the scale r increases, the point set P with which
one constructs a complex is gradually sparsified keeping the total number of simplicies in the
complex linear in the input size of P where the dimension of the embedding space is assumed to
be fixed.
We describe two data sparsification schemes in Sections 6.2.1 and 6.2.2, respectively. We
focus on the Vietoris-Rips filtration for points in a Euclidean space Rd equipped with the standard
Euclidean distance d.
Definition 6.4 (Nets and net-tower). Given a finite set of points P ⊂ (Rd , d) and γ, γ′ ≥ 0, a
subset Q ⊆ P is a (γ, γ′ )-net of P if the following two conditions hold:

Covering condition: Q is a γ-sample for (P, d), i.e., for every p ∈ P, d(p, Q) ≤ γ.
Packing condition: Q is also γ′ -sparse, i.e., for every q ≠ q′ ∈ Q, d(q, q′ ) ≥ γ′ .
Net-tower via farthest point sampling. We now introduce a specific net-tower constructed via
the classical strategy of farthest point sampling, also called greedy permutation e.g. in [56, 70].
Given a point set P ⊂ (Rd , d), choose an arbitrary point p1 from P and set P1 = {p1 }. Pick pi
recursively as pi ∈ argmax p∈P\Pi−1 d(p, Pi−1 )¹, and set Pi = Pi−1 ∪ {pi }. Now set t pi = d(pi , Pi−1 ),
which we refer to as the exit-time of pi (with t p1 = +∞ by convention). Based on these exit-times,
we construct the following two families of sets:

Nγ := {p ∈ P | t p > γ}  (open) and  N̄γ := {p ∈ P | t p ≥ γ}  (closed).

It is easy to verify that both Nγ and N̄γ are γ-nets, and the families N = {Nγ } and N̄ = {N̄γ } are
indeed two net-towers as γ increases. As γ increases, Nγ and N̄γ can only change when γ = t p for
some p ∈ P. Hence the sequence of subsets P = Pn ⊃ Pn−1 ⊇ · · · ⊇ P2 ⊇ P1 contains all the
distinct sets in the open and closed net-towers {Nγ } and {N̄γ }.
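The greedy permutation and its exit-times can be computed in O(n²) time by maintaining, for every point, its distance to the current sample. A Python sketch (ours, not from the book; the convention t p1 = ∞ for the first point is an assumption consistent with the net-towers above):

```python
import math

def greedy_permutation(points):
    """Farthest point sampling: returns the insertion order (as indices)
    and the exit-times t_{p_i} = d(p_i, P_{i-1}); the first point is
    assigned exit-time +infinity."""
    remaining = list(range(len(points)))
    order, exit_time = [remaining.pop(0)], [math.inf]  # arbitrary first point
    dist_to_sample = [math.dist(points[i], points[order[0]])
                      for i in range(len(points))]
    while remaining:
        far = max(remaining, key=lambda i: dist_to_sample[i])
        remaining.remove(far)
        order.append(far)
        exit_time.append(dist_to_sample[far])
        for i in remaining:  # update nearest-sample distances
            dist_to_sample[i] = min(dist_to_sample[i],
                                    math.dist(points[i], points[far]))
    return order, exit_time
```

On the three points 0, 1, 10 on the line, starting from 0, the farthest point 10 is picked next (exit-time 10), then 1 (exit-time 1).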
In what follows, we discuss a sparsification strategy for the Rips filtration of P using the above
net-towers. The approach can be extended to other net-towers, such as the net-tower constructed
using the net-tree data structure of [182].
Weights, weighted distance, and sparse Rips filtration. Given the exit-times t p for all points
p ∈ P, we now associate a weight w p (α) to each point p at scale α as follows: for some constant
0 < ε < 1,

w p (α) = 0                   if α ≤ t p /ε,
w p (α) = α − t p /ε          if t p /ε < α < t p /(ε(1−ε)),
w p (α) = εα                  if α ≥ t p /(ε(1−ε)).

(The graph of w p is flat at 0 up to α = t p /ε, then climbs with slope 1, and finally follows the line εα.)
Claim 6.1. The weight function w p is a continuous, 1-Lipschitz, and non-decreasing function.
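Claim 6.1 can be checked numerically from the piecewise definition above. A Python sketch (ours, not from the book):

```python
def weight(t_p, eps, alpha):
    """The scale-dependent weight w_p(alpha) of the sparse Rips
    construction: 0, then a slope-1 ramp, then eps * alpha."""
    if alpha <= t_p / eps:
        return 0.0
    if alpha < t_p / (eps * (1 - eps)):
        return alpha - t_p / eps
    return eps * alpha
```

Sampling the function on a fine grid confirms that it is non-decreasing and 1-Lipschitz, including across the two breakpoints t p /ε and t p /(ε(1−ε)), where the three pieces agree.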
The parameter ε controls the resolution of the sparsification. The net-induced distance at
scale α between input points is defined as:

d̂α (p, q) := d(p, q) + w p (α) + wq (α).
Definition 6.5 (Sparse (Vietoris-)Rips). Given a set of points P ⊂ Rd , a constant 0 < ε < 1, and
the open net-tower {Nγ } as well as the closed net-tower {N̄γ } for P as introduced above, the open
sparse-Rips complex at scale α is defined as

Qα := {σ ⊆ Nε(1−ε)α | ∀p, q ∈ σ, d̂α (p, q) ≤ 2α};   (6.9)

the closed sparse-Rips complex Q̄α is defined analogously over the closed net:

Q̄α := {σ ⊆ N̄ε(1−ε)α | ∀p, q ∈ σ, d̂α (p, q) ≤ 2α}.   (6.10)

The sparse Rips filtration of P is S(P) := {Sα ,→ Sβ }α≤β with Sα := ⋃_{β≤α} Q̄β .
¹Note that there may be multiple points that maximize d(p, Pi−1 ), making argmax p∈P\Pi−1 d(p, Pi−1 ) a set. We can
choose pi to be any point in this set.
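Putting the exit-times, the weights, and the net-induced distance together, the edge set of the open sparse Rips complex Qα can be listed directly from Eqn (6.9). A Python sketch (ours, not from the book; it assumes exit-times are supplied externally, e.g. by farthest point sampling, with the first point given exit-time ∞):

```python
import math

def w(t_p, eps, alpha):
    """Piecewise weight w_p(alpha) as defined above."""
    if alpha <= t_p / eps:
        return 0.0
    if alpha < t_p / (eps * (1 - eps)):
        return alpha - t_p / eps
    return eps * alpha

def sparse_rips_edges(points, exit_time, eps, alpha):
    """Edges of the open sparse Rips complex Q^alpha: both endpoints
    survive in the net N_{eps(1-eps)alpha} (exit-time > eps(1-eps)alpha)
    and the net-induced distance satisfies d^_alpha(p, q) <= 2 alpha."""
    gamma = eps * (1 - eps) * alpha
    net = [i for i, t in enumerate(exit_time) if t > gamma]
    edges = []
    for a in range(len(net)):
        for b in range(a + 1, len(net)):
            p, q = net[a], net[b]
            d_hat = (math.dist(points[p], points[q])
                     + w(exit_time[p], eps, alpha)
                     + w(exit_time[q], eps, alpha))
            if d_hat <= 2 * alpha:
                edges.append((p, q))
    return edges
```

At a small scale all three points of a toy example survive in the net; at a large enough scale the point with the smallest exit-time is sparsified away and its edge disappears.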
Proof of part (i) of Theorem 6.4. To relate S(P) to R(P), we need to go through a sequence of
intermediate steps. First, we define the relaxed Rips complex at scale α as

VR̂α (P) := {σ ⊆ P | ∀p, q ∈ σ, d̂α (p, q) ≤ 2α}.

The following claim ensures that the relaxed Rips complexes form a valid filtration connected by
inclusions R̂(P) = {VR̂α (P) ,→ VR̂β (P)}α≤β , which we call the relaxed Rips filtration.
Claim 6.2. For any 0 ≤ α ≤ β and any p, q ∈ P, if d̂α (p, q) ≤ 2α then d̂β (p, q) ≤ 2β.

Proof. The weight function w p is 1-Lipschitz for any p ∈ P (Claim 6.1). Thus we have that
d̂β (p, q) = d̂α (p, q) + (w p (β) − w p (α)) + (wq (β) − wq (α)) ≤ 2α + 2(β − α) = 2β.
In what follows, we drop the argument P from notations such as in complexes VRα (P) or in
sparse Rips filtration S(P) when the point set in question is understood.
Proposition 6.5. Let C = 1/(1−ε). Then for any α ≥ 0 we have that VRα/C ⊆ VR̂α ⊆ VRα .
Next, for 0 < ε < 1/3, define the projection πα : P → Nεα by πα (p) ∈ argminq∈Nεα d(p, q); if
argminq∈Nεα d(p, q) contains more than one point, we set πα (p) to be an arbitrary one of them.
This projection is well-defined as Nεα ⊆ Nε(1−ε)α given that 0 < ε < 1/3 < 1. We need several
technical results on this projection map, which we rely on later to construct maps between
appropriate versions of Rips complexes. First, the following two results are easy to show.
Fact 6.1. For every p ∈ P, d(p, πα (p)) ≤ w p (α) − wπα (p) (α) ≤ εα.

Fact 6.2. For every pair p, q ∈ P, we have that d̂α (p, πα (q)) ≤ d̂α (p, q).
We are now ready to show that inclusion induces an isomorphism between the homology
groups of the sparse Rips complex and the relaxed Rips complex.

Proposition 6.6. For any α ≥ 0, the inclusion i : Qα ,→ VR̂α induces an isomorphism
i∗ : H∗ (Qα ) → H∗ (VR̂α ).

Proof. First, we consider the projection map πα and argue that it induces a simplicial map
πα : VR̂α → Qα which is in fact a simplicial retraction². Next, we show that the map i ◦ πα :
VR̂α → VR̂α is contiguous to the identity map id : VR̂α → VR̂α . As πα is a simplicial retraction,
it follows that i∗ is an isomorphism (Lemma 2 of [275]).
To see that πα is a simplicial map, apply Fact 6.2 twice to obtain, for every edge (p, q),

d̂α (πα (p), πα (q)) ≤ d̂α (p, πα (q)) ≤ d̂α (p, q) ≤ 2α.   (6.11)

Since both Qα and VR̂α are clique complexes, this then implies that πα is a simplicial map. Fur-
thermore, it is easy to see that it is a retraction as πα (q) = q for any q in the vertex set of Qα
(which is Nε(1−ε)α ).
Now to show that i ◦ πα is contiguous to id, we observe that for any p, q ∈ P with d̂α (p, q) ≤ 2α,
all edges among {p, q, πα (p), πα (q)} exist and thus all simplices spanned by them exist in VR̂α .
Indeed, that d̂α (πα (p), πα (q)) ≤ 2α is already shown above in Eqn (6.11). Combining Fact 6.1
with the fact that w p (α) ≤ εα, we have that

d̂α (p, πα (p)) = d(p, πα (p)) + w p (α) + wπα (p) (α) ≤ 2w p (α) ≤ 2εα < 2α.

Together with Fact 6.2, which bounds d̂α (p, πα (q)) ≤ 2α, all remaining edges are present as well.
²A simplicial retraction f : K → L is a simplicial map from K to a subcomplex L ⊆ K so that f (σ) = σ for any σ ∈ L.
The closed sparse-Rips complex Q̄α is the relaxed Rips complex over the vertex set N̄ε(1−ε)α ,
which is a superset of the vertex set of Qα . Hence the above proposition also holds for the
inclusion Qα ,→ Q̄α . It then follows that H∗ (Qα ) ≅ H∗ (Q̄α ). Finally, we show that the inclusion
also induces an isomorphism between H∗ (Q̄α ) and H∗ (Sα ), which when combined with the above
results connects Sα and VR̂α .
Proposition 6.7. For any α ≥ 0, the inclusion h : Q̄α ,→ Sα induces an isomorphism at the
homology level, that is, H∗ (Q̄α ) ≅ H∗ (Sα ) under h∗ .
Proof. Consider the sequence {Sα }α∈R . First, we discretize α to obtain distinct values α0 < α1 <
α2 < · · · < αm so that Sα0 = ∅, and the αi s are exactly the times when the combinatorial structure
of Sα changes. As Sα = ⋃_{β≤α} Q̄β , these are also exactly the moments when the combinatorial
structure of Q̄α changes. Hence we only need to prove the statement for such αi ’s, and it will then
hold for all α’s. Set λi := ε(1 − ε)αi . Note that the vertex set for Q̄αi is N̄λi by the definition of Q̄
in Eqn (6.10).
Now fix a k ≥ 0. We will show that h : Q̄αk ,→ Sαk induces an isomorphism at the homology
level. We use some intermediate complexes

T i,k := ⋃_{j=i}^{k} Q̄αj , for i ∈ [1, k].

Obviously, T 1,k = Sαk while T k,k = Q̄αk . Set hi : T i+1,k ,→ T i,k . The inclusion h : Q̄αk ,→ Sαk can
then be written as h = h1 ◦ h2 ◦ · · · ◦ hk−1 . In what follows, we prove that hi : T i+1,k ,→ T i,k induces
an isomorphism at the homology level for each i ∈ [1, k − 1], which then proves the proposition.
First, note that while T i,k is not necessarily the same as Q̄αi , they share the same vertex set.
Now, because of our choices of the αi s and λi s, the vertex set of T i+1,k , which is the vertex set of
Q̄αi+1 , namely N̄λi+1 , equals Nλi . Hence we can consider the projection παi : T i,k → T i+1,k given by
the projection of the vertex set Nλi−1 = N̄λi of T i,k to the vertex set Nλi = N̄λi+1 of T i+1,k . To prove
that hi induces an isomorphism at the homology level, by Lemma 2 of [275], it suffices to show that
(i) παi is a simplicial retraction, and (ii) hi ◦ παi is contiguous to the identity map id : T i,k → T i,k .

To prove (i), it is easy to verify that παi is a retraction. To see that παi induces a simplicial map,
we need to show that for every σ ∈ T i,k , παi (σ) ∈ T i+1,k . As παi is a retraction, we only need to
prove this for every σ ∈ T i,k \ T i+1,k . On the other hand, note that by definition, T i,k \ T i+1,k ⊆ Q̄αi .
To this end, the argument in Proposition 6.6 also shows that παi : Q̄αi → Qαi is a simplicial map,
and furthermore, h′ ◦ παi is contiguous to id′ : Q̄αi → Q̄αi , where h′ : Qαi ,→ Q̄αi . Because of
our choice of the αi s, Qαi and Q̄αi+1 have the same vertex set, which is Nλi . Furthermore, for every
edge (p, q) ∈ Qαi , we have that d̂αi (p, q) ≤ 2αi . As αi < αi+1 , it follows from Claim 6.2 that
d̂αi+1 (p, q) ≤ 2αi+1 . Hence, the edge (p, q) is in Q̄αi+1 . This implies that Qαi ⊆ Q̄αi+1 . Putting
everything together, it follows that, for every σ ∈ T i,k \ T i+1,k ⊆ Q̄αi , we have

παi (σ) ∈ Qαi ⊆ Q̄αi+1 ⊆ T i+1,k .
Now we prove (ii), that is, hi ◦ παi is contiguous to the identity map id : T i,k → T i,k . This
means that we need to show that for every σ ∈ T i,k , σ ∪ παi (σ) ∈ T i,k . Again, as παi is a simplicial
retraction, we only need to show this for σ ∈ T i,k \ T i+1,k ⊆ Q̄αi . As mentioned above, using the
same argument as in Proposition 6.6, we know that h′ ◦ παi is contiguous to the identity id′ : Q̄αi →
Q̄αi . Hence we have that for every σ ∈ Q̄αi , σ ∪ παi (σ) ∈ Q̄αi . It follows that σ ∪ παi (σ) ∈ T i,k as
Q̄αi ⊆ T i,k . This proves (ii), completing the proof of the proposition.
Combining Propositions 6.6 (as well as the discussion after this proposition) and 6.7, we have
that {Sα } and {VR̂α } induce isomorphic persistence modules. This, together with Proposition 6.5,
implies part (i) of Theorem 6.4.
Proof of part (ii) of Theorem 6.4. Let S(k) denote the set of k-simplices that ever appear in S(P),
which is also the set of k-simplices in the last complex S∞ of S(P). To bound the size of S(k) , we
charge each simplex in S(k) to its vertex with the smallest exit-time. Observe that a point p ∈ P
does not contribute any new edge to the sparse Rips complex Q̄β for β > t p /(ε(1−ε)). This means
that to bound the number of simplices charged to p, we only need to bound such simplices in Q̄αp
with α p = t p /(ε(1−ε)).

Set E(p) = {q ∈ P | (p, q) ∈ Q̄αp and t p ≤ tq }. We add p to E(p) too. We claim that
|E(p)| = O(( 1ε )d ). In particular, consider the closed net-tower {N̄γ }; recall that N̄γ is a γ-net. As
E(p) ⊆ N̄t p , the packing condition of the net implies that the closest pair in E(p) has distance
at least t p between them. On the other hand, for each (p, q) ∈ Q̄αp , we have d̂αp (p, q) ≤ 2α p ,
implying that E(p) ⊆ B(p, 2α p ). A simple packing argument then implies that the number of
points in E(p) is

O( (2α p /t p )d ) = O( (2/(ε(1−ε)))d ) = O( (1/ε)d ).

The last equality follows because ε < 1/3 and thus 1 − ε ≥ 2/3. The total number of k-simplices
charged to p is bounded by O(( 1ε )kd ), and the total number of k-simplices in S(P) is O(( 1ε )kd n),
proving part (ii) of Theorem 6.4.
Consider the following vertex map πk : Pk → Pk+1 , for any k ∈ [0, m − 1], where πk (v) is the
nearest neighbor of v ∈ Pk in Pk+1 . Define π̂k : P0 → Pk+1 as π̂k := πk ◦ · · · ◦ π0 . Based on the
fact that Pk+1 is an (αε(1 + ε)k−1 /2)-net of Pk , it can be verified that πk induces a simplicial map

πk : VRα(1+ε)k (Pk ) → VRα(1+ε)k+1 (Pk+1 ),

and that these simplicial maps, together with the maps π̂k and the inclusions jk , jk+1 , form a
diagram (Eqn. (6.14)) that commutes at the homology level. The resulting tower of complexes
connected by simplicial maps induces a discretized persistence module of the form

V : Va0 → Va0 +ε → Va0 +2ε → · · · → Va0 +mε → · · ·
It turns out that to verify the commutativity of the diagram in Eqn. (6.14), it is sufficient to
verify it for all subdiagrams of the form as in Eqn. (4.3). Furthermore, ε-weakly interleaved
persistence modules also have bounded bottleneck distances between their persistence diagrams
[77], though the distance bound is relaxed to 3ε; that is, if U and V are ε-weakly interleaved,
then db (Dgm U, Dgm V) ≤ 3ε. Analogous results hold in the multiplicative setting. Finally, using a
similar packing argument as before, one can also show that the total number of k-simplices that
ever appear in the simplicial-map based sparsification Ŝ is linear in n (assuming that k and the
dimension d are both constant). To summarize:
Theorem 6.8. Given a set of n points P ⊂ Rd , we can 3 log(1 + ε)-approximate the persistence
diagram of the discrete Rips filtration in Eqn. (6.12) by that of the filtration in Eqn. (6.13) at
the log-scale. The number of k-simplices that ever appear in the filtration in Eqn. (6.13) is
O(( 1ε )O(kd) n).
Main ingredients. Since points themselves do not have interesting topology, we first construct
a certain simplicial complex K, typically a Čech or a Vietoris-Rips complex, from P. Next, we
compute the homological information of K as a proxy for that of X. Of course, the approx-
imation becomes faithful only when the given sample P is sufficiently dense and the parameters
used for building the complexes are chosen appropriately.
To provide quantitative statements on the approximation quality of the outcome of the above
approach, we need to describe first what the quality of the input PCD P is, often referred to as
the sampling conditions. Intuitively, a better approximation in homology is achieved if the input
points P “approximates” or “samples” X better. The quality of input points is often measured by
the Hausdorff distance w.r.t. the Euclidean distances between PCD P and the hidden domain X of
interest (Definition 6.2), such as requiring that dH (P, X) ≤ ε for some ε > 0. Note that points in
P do not necessarily lie in X. The approximation guarantee for dim(H∗ (X)) relies on relating the
distance fields induced by X and by the sample P. We describe the distance field and feature sizes
of X in Section 6.3.1. We present how to infer homology for smooth manifolds and compact sets
from data in Section 6.3.2 and Section 6.3.3 respectively. In Section 6.4, we discuss inferring the
persistent homology induced by a scalar function f : X → R on X.
Given x ∈ Rd , let dX (x) := inf y∈X d(x, y) denote the distance from x to X, and let Π(x) ⊆ X
denote the set of closest points of x in X; that is,

Π(x) = {y ∈ X | d(x, y) = dX (x)}.
The medial axis of X, denoted by MX , is the closure of the set of points with more than one closest
point in X; that is,
MX = closure{x ∈ Rd | |Π(x)| ≥ 2}.
Intuitively, |Π(x)| ≥ 2 implies that the maximal Euclidean ball centered at x whose interior is free
of points in X meets X in more than one point on its boundary. Hence, MX is the closure of the
centers of such maximal empty balls.
Definition 6.8 (Local feature size and reach). For a point x ∈ X, the local feature size at x,
denoted by lfs(x), is defined as the minimum distance to the medial axis MX ; that is,
lfs(x) := d(x, MX ).
The reach of X, denoted by ρ(X), is the minimum local feature size of any point in X.
The concept has been primarily developed for the case when X is a smooth manifold embed-
ded in Rd . Indeed, the local feature size can be zero at a non-smooth point: consider a planar
polygon; its medial axis intersects its vertices, and the local feature size at a vertex is thus zero.
The reach of a smoothly embedded manifold could also be zero; see Section 1.2 of [119] for an
example. Next, we describe a "weaker" notion of feature size [89, 90], which is more suitable for
compact subsets of Rd .
Critical points of distance field. The distance function dX introduced above is not everywhere
differentiable. Its gradient is defined on Rd \{X ∪MX }. However, one can still define the following
vector which extends the notion of gradient of dX to include the medial axis MX : Given any point
x ∈ Rd \ X, there exists a unique closed ball with minimal radius that encloses Π(x) [225]. Let
c(x) denote the center of this minimal enclosing ball, and r(x) its radius. It is easy to see that for
any x ∈ Rd \ MX , this ball degenerates to a single point: c(x) is then the unique point in Π(x).
Definition 6.9 (Generalized vector field). Define the following vector field ∇d : Rd \ X → Rd
where the (generalized) gradient vector at x ∈ Rd \ X is:
x − c(x)
∇d (x) = .
dX (x)
The critical points of ∇d are points x for which ∇d (x) = 0. We also call the critical points of ∇d
the critical points of the distance function dX .
This generalized gradient field ∇d coincides with the gradient of the distance function dX for
points in Rd \ {X ∪ MX }. The distance field (distance function) and its critical points were previ-
ously studied in e.g., [177], and have played an important role in sampling theory and homology
inference. In general, a point x ∈ Rd \ X is a critical point if and only if x is contained in the
convex hull of Π(x). (The convex hull of a compact set A ⊂ Rd is the smallest convex set that
contains A.) In particular, all critical points of ∇d belong to the medial axis MX of X. For the case
where X is a finite set of points in Rd , the critical points of dX are the non-empty intersections of
the Delaunay simplices with their dual Voronoi cells (if they exist) [119].
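In one dimension the minimal ball enclosing Π(x) is just the span of the nearest points, so the generalized gradient of Definition 6.9 is easy to evaluate. A Python sketch (ours, not from the book; valid for x ∉ X only, since dX (x) = 0 on X):

```python
def grad_dist(x, X, tol=1e-9):
    """Generalized gradient of the distance function to a finite set
    X of reals, evaluated at x not in X: (x - c(x)) / d_X(x), where
    c(x) is the center of the minimal interval enclosing Pi(x)."""
    dX = min(abs(x - y) for y in X)
    Pi = [y for y in X if abs(abs(x - y) - dX) <= tol]  # nearest points
    c = (min(Pi) + max(Pi)) / 2.0  # minimal enclosing ball center in 1-D
    return (x - c) / dX
```

For X = {−1, 1}, the midpoint 0 lies in the convex hull of its two nearest points, so it is a critical point (gradient 0); away from the medial axis the gradient is the unit vector pointing away from the nearest point.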
Definition 6.10 (Weak feature size). Let C denote the set of critical points of ∇d . The weak
feature size of X, denoted by wfs(X), is the distance between X and C; that is,

wfs(X) = inf_{x∈X} inf_{c∈C} d(x, c).
Proposition 6.9. If 0 < α < α′ are such that there is no critical value of dX in the closed interval
[α, α′ ], then Xα′ deformation retracts onto Xα . In particular, this implies that H∗ (Xα ) ≅ H∗ (Xα′ ).
In the homology inference frameworks, the reach is usually used for the case when X is a
smoothly embedded manifold, while the weak feature size is used for general compact spaces.
Specifically, recall that Ar is the r-offset of A which also equals the union of balls ∪a∈A B(a, r).
The connection between the discrete samples P and the manifold X is made through the union of
balls Pα . The following result is a variant of a result by Niyogi, Smale, Weinberger [245]3 .
Proposition 6.10. Let P ⊂ Rd be a finite point set such that dH (X, P) ≤ ε, where X ⊂ Rd is a
smooth manifold with reach ρ(X). If 3ε ≤ α ≤ (3/4)√(3/5) ρ(X), then H∗ (Pα ) is isomorphic to H∗ (X).
The Čech complex Cα (P) is the nerve complex for the set of balls {B(p, α) | p ∈ P}. As
Euclidean balls are convex, the Nerve Theorem implies that Cα (P) is homotopy equivalent to Pα .
It follows that we can use the Čech complex Cα (P), for an appropriate α, to infer the homology of
X using the isomorphisms H∗ (X) ≅ H∗ (Pα ) ≅ H∗ (Cα (P)). The first isomorphism follows from
Proposition 6.10 and the second one from the homotopy equivalence between the nerve and the space.
A stronger statement in fact holds: For any α ≤ β, the following diagram commutes:

    H∗ (Pα )  --i∗-->  H∗ (Pβ )
      | h∗               | h∗            (6.16)
      v                  v
    H∗ (Cα (P)) --i∗--> H∗ (Cβ (P))
3
The result of [245] assumes that P ⊆ X, in which case it shows that Pα deformation retracts to X. In our statement
P is not necessarily from X, and the isomorphism follows from results of [245] and Fact 6.3.
Here, i∗ stands for the homomorphism induced by inclusions, and h∗ is the homomorphism in-
duced by the homotopy equivalence h : Pα → Cα (P) given by the Nerve Theorem. This leads to
the following theorem on estimating H∗ (X) from a pair of Rips complexes.
Theorem 6.11. Given a smooth manifold X embedded in Rd , let ρ(X) be its reach. Let P ⊂ Rd be
a finite sample such that dH (P, X) ≤ ε. For any 3ε ≤ α ≤ (3/16)√(3/5) ρ(X), let i∗ : H∗ (VRα ) →
H∗ (VR2α ) be the homomorphism induced by the inclusion i : VRα ,→ VR2α . We have that
H∗ (X) ≅ im (i∗ ).

Proof. As 3ε ≤ α and 4α ≤ (3/4)√(3/5) ρ(X), Proposition 6.10 and the diagram in Eqn. (6.16) give

H∗ (X) ≅ H∗ (Cα (P)) ≅ H∗ (C2α (P)) ≅ H∗ (C4α (P)),   (6.17)

where the last isomorphism is induced by inclusion. On the other hand, recall the interleaving
relation between the Čech and the Rips complexes:
H∗ (Cα (P)) → H∗ (VRα (P)) → H∗ (C2α (P)) → H∗ (VR2α (P)) → H∗ (C4α (P)).
We have H∗ (Cα (P)) ≅ H∗ (C2α (P)) ≅ H∗ (C4α (P)) by Eqn. (6.17). Thus we have

H∗ (X) ≅ im ( H∗ (Cα (P)) → H∗ (C2α (P)) ).

The theorem then follows from the second part of Fact 6.3.
H∗ (Xλ ) ≅ im ( H∗ (Cα ) → H∗ (C2α ) ) ≅ im ( H∗ (VRα ) → H∗ (VR4α ) ),   (6.18)

where the first isomorphism is given by Proposition 6.12 and the second follows from Eqn. (6.21)
and Fact 6.3. It is similar to Eqn (6.15) for the manifold case. However, we no longer have the
isomorphism between H∗ (Pα ) and H∗ (X). To overcome this difficulty, we leverage Proposition 6.9. This in turn
requires us to consider a pair of Čech complexes to infer homology of X λ , instead of a single Čech
complex as in the case of manifolds.
More specifically, suppose that the point set P satisfies dH (P, X) ≤ ε; then we have the
following nested sequence for α > ε and α′ ≥ α + 2ε:

Xα−ε ⊆ Pα ⊆ Xα+ε ⊆ Pα′ ⊆ Xα′+ε .   (6.19)

By Proposition 6.9, we know that if it also holds that α′ + ε < wfs(X), then the inclusions
between Xα−ε ⊆ Xα+ε ⊆ Xα′+ε induce isomorphisms between their homology groups, which are
also isomorphic to H∗ (Xλ ) for λ ∈ (0, wfs(X)). It then follows from the second part of Fact 6.3
that, for α, α′ ∈ (ε, wfs(X) − ε) with α′ − α ≥ 2ε, we have

H∗ (Xλ ) ≅ im ( H∗ (Pα ) → H∗ (Pα′ ) ).   (6.20)
Combining the above with the commutative diagram in Eqn. (6.16), we obtain the following
result on inferring the homology of Xλ using a pair of Čech complexes.

Proposition 6.12. Let X be a compact set in Rd and P ⊂ Rd a finite set of points with dH (X, P) < ε
for some ε < 14 wfs(X). Then, for all α, α′ ∈ (ε, wfs(X) − ε) such that α′ − α ≥ 2ε, and any λ ∈
(0, wfs(X)), we have H∗ (Xλ ) ≅ im (i∗ ), where i∗ : H∗ (Cα (P)) → H∗ (Cα′ (P)) is the homomorphism
induced by inclusion.
Finally, to perform homology inference with the Rips complexes, we again resort to the in-
terleaving relation between Čech and Rips complexes, and apply the first part of Fact 6.3 to the
following sequence:

H∗ (Cα/2 (P)) → H∗ (VRα/2 (P)) → H∗ (Cα (P)) → H∗ (C2α (P)) → H∗ (VR2α (P)) → H∗ (C4α (P)).   (6.21)
If 2ε ≤ α ≤ (wfs(X) − ε)/4, both H∗ (Cα/2 (P)) → H∗ (C4α (P)) and H∗ (Cα (P)) → H∗ (C2α (P)) have
ranks equal to dim(H∗ (Xλ )) by Proposition 6.12. Applying Fact 6.3, we then obtain the following
result.
Theorem 6.13. Let X be a compact set in Rd and P a finite point set with dH (X, P) < ε for some
ε < 19 wfs(X). Then, for all α ∈ [2ε, (wfs(X) − ε)/4] and all λ ∈ (0, wfs(X)), we have H∗ (Xλ ) ≅
im ( j∗ ), where j∗ is the homomorphism between homology groups induced by the inclusion j :
VRα/2 (P) ,→ VR2α (P).
For simplicity, we often write the filtration and the corresponding persistence module as F f =
{Fα }α∈R and H p F f = {H p (Fα )}α∈R , when the choices of maps connecting their elements are clear.

Our goal is to approximate the persistence diagram Dgm p (F f ) from the point samples P and
f̂ : P → R. Intuitively, we construct a specific Čech (or Rips) complex Cr (P), use f̂ to induce
a filtration of Cr (P), and then use its persistent homology to approximate Dgm p (F f ). More
specifically, we need to consider a nested pair filtration for either Cr (P) or VRr (P).
Nested pair filtration. Let Pα = {p ∈ P | f̂ (p) ≤ α} be the set of sample points with f̂ -value
at most α, which presumably samples the sublevel set Fα of X w.r.t. f . To estimate
the topology of Fα from the discrete sample Pα , we consider either the Čech complex Cr (Pα )
or the Rips complex VRr (Pα ). For the time being, consider VRr (Pα ). As we already saw in
previous sections, the topological information of Fα can be inferred from a pair of nested complexes
VRr (Pα ) ,→ VRr′ (Pα ) for some appropriate r < r′ . To study F f , we need to inspect Fα ,→ Fβ for
α ≤ β. To this end, fixing r and r′ , for any α ≤ β, consider the following commutative diagram
induced by inclusions:
H∗(VRr(Pα)) ──────────→ H∗(VRr(Pβ))
     │ iα∗                   │ iβ∗                          (6.23)
     ↓                       ↓
H∗(VRr′(Pα)) ──j^β_{α∗}──→ H∗(VRr′(Pβ))
Set φ^β_α : im (iα∗) → im (iβ∗) to be φ^β_α = j^β_{α∗}|im (iα∗), that is, the restriction of j^β_{α∗} to im (iα∗). This
map is well-defined as the diagram above commutes. This gives rise to a persistence module
{im (iα∗); φ^β_α}α≤β, that is, a family of totally ordered vector spaces im (iα∗) with commutative ho-
momorphisms φ^β_α between any two elements. We formalize and generalize the above construction
below.
Definition 6.11 (Nested pair filtration). A nested pair filtration is a sequence of pairs of com-
plexes {ABα = (Aα, Bα)}α∈R where (i) iα : Aα ↪ Bα is an inclusion for every α and (ii) ABα ↪ ABβ for
α ≤ β is given by Aα ↪ Aβ and j^β_α : Bα ↪ Bβ. The p-th persistence module of the filtration {ABα}α∈R
is given by the homology module {im (Hp(Aα) → Hp(Bα)); φ^β_α}α≤β where φ^β_α is the restriction
of j^β_{α∗} to im (iα∗). For simplicity, we say the module is induced by the nested pair filtration
{Aα ↪ Bα}.
The high level approach of inferring persistent homology of a scalar field f : X → R from a
set of points P equipped with fˆ : P → R involves the following steps:
164 Computational Topology for Data Analysis
Step 1. Sort all points of P in non-decreasing order of fˆ-values, P = {p1, . . . , pn}. Set αi = fˆ(pi) for
i ∈ [1, n].
Step 2. Compute the persistence diagram induced by the filtration of nested pairs {VRr(Pαi) ↪
VRr′(Pαi)}i∈[1,n] (or {Cr(Pαi) ↪ Cr′(Pαi)}i∈[1,n]) for appropriate parameters 0 < r < r′.
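For p = 0 the two steps above can be carried out with elementary tools: the inclusion VRr(Pαi) ↪ VRr′(Pαi) is surjective on vertices, so the image rank on H0 equals the number of components of the larger complex on the prefix Pαi. The sketch below is illustrative only; it tracks component counts along the sweep rather than computing the full diagram via the algorithm of [105].

```python
import math

def sublevel_component_counts(points, fhat, rprime):
    """Insert points in increasing fhat-order; after each insertion report
    the number of components of the graph connecting active points at
    Euclidean distance <= rprime.  Since the inclusion is surjective on
    vertices, this equals dim im(H_0(VR^r(P_ai)) -> H_0(VR^{r'}(P_ai)))."""
    order = sorted(range(len(points)), key=lambda i: fhat[i])
    parent = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    counts, active = [], []
    for i in order:
        parent[i] = i
        for j in active:
            if math.dist(points[i], points[j]) <= rprime:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
        active.append(i)
        counts.append(len({find(k) for k in active}))
    return counts

# A double-well function sampled along a segment: minima at x = 0.5, 1.5.
pts = [(0.05 * i, 0.0) for i in range(41)]
fh = [min(abs(x - 0.5), abs(x - 1.5)) for (x, _) in pts]
counts = sublevel_component_counts(pts, fh, 0.08)
# two components are born at the two minima and merge at the barrier value
```

The resulting count sequence rises to 2 when the second well appears and drops back to 1 once the sweep passes the barrier between the wells.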
The persistent homology (as well as persistence diagram) induced by the filtration of nested
pairs is computed via the algorithm in [105]. To obtain an approximation guarantee for the above
approach, we consider an intermediate object defined by the intrinsic Riemannian metric on the
manifold X. Indeed, note that the filtration of X w.r.t. f is intrinsic in the sense that it is indepen-
dent of how X is embedded in Rd . Hence it is more natural to approximate its persistent homology
with an object defined intrinsically for X.
Given a compact Riemannian manifold X embedded in Rd, let dX be the Riemannian
metric on X inherited from the Euclidean metric dE of Rd. Let BX(x, r) := {y ∈ X | dX(x, y) ≤ r} be
the geodesic ball on X centered at x and with radius r, and BoX (x, r) be the open geodesic ball. In
contrast, BE (x, r) (or simply B(x, r)) denotes the Euclidean ball in Rd . A ball BoX (x, r) is strongly
convex if for every pair y, y0 ∈ BX (x, r), there exists a unique minimizing geodesic between y and
y0 whose interior is contained within BoX (x, r). For details on these concepts, see [76, 164].
Definition 6.12 (Strong convexity). For x ∈ X, let ρc(x; X) denote the supremum of radii r such
that the geodesic ball BoX(x, r) is strongly convex. The strong convexity radius of (X, dX) is defined
as ρc(X) := inf x∈X ρc(x; X).
Let dX(x, P) := inf p∈P dX(x, p) denote the geodesic distance from x to the set P ⊆ X.
Definition 6.13 (ε-geodesic sample). A point set P ⊂ X is an ε-geodesic sample of (X, dX ) if for
all x ∈ X, dX (x, P) ≤ ε.
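As a toy check of this definition (our own example, not from the book): n evenly spaced points on the unit circle form a (π/n)-geodesic sample, since every point of the circle lies within half the arc spacing of some sample point.

```python
import math

def circle_geodesic(a, b):
    """Intrinsic (arc-length) distance on the unit circle between the
    points with angular coordinates a and b."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

n = 36
samples = [2 * math.pi * i / n for i in range(n)]        # the set P
dense = [2 * math.pi * k / 1000 for k in range(1000)]    # stand-in for X
worst = max(min(circle_geodesic(x, p) for p in samples) for x in dense)
# worst <= pi / n, so P is a (pi/n)-geodesic sample of the circle
```

The worst case is attained near the midpoints between consecutive samples, which is exactly the half-spacing bound π/n.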
Recall that Pα is the set of points in P with fˆ-value at most α. The union of geodesic balls
Pαδ;X = ∪p∈Pα BX(p, δ) is intuitively the "δ-thickening" of Pα within the manifold X. We use two
kinds of Čech and Rips complexes. One is defined with the metric dE of the ambient Euclidean
space, which we call the (extrinsic) Čech complex Cδ(Pα) and (extrinsic) Rips complex VRδ(Pα). The
other is the intrinsic Čech complex CδX(Pα) and intrinsic Rips complex VRδX(Pα), which are defined with
the intrinsic metric dX. Note that CδX(Pα) is the nerve complex of the union of geodesic balls
forming Pαδ;X. Also, the interleaving relation between the Čech and Rips complexes remains the
same as for general geodesic spaces; that is, CδX(Pα) ⊆ VRδX(Pα) ⊆ C^{2δ}_X(Pα) for any α and δ.
Proposition 6.14. Let X ⊂ Rd be a compact Riemannian manifold with intrinsic metric dX, and
let f : X → R be a C-Lipschitz function. Suppose P ⊂ X is an ε-geodesic sample of X, equipped
with fˆ : P → R so that fˆ = f |P. Then, for any fixed δ ≥ ε, the filtration {Fα}α and the filtration
{Pαδ;X}α are (Cδ)-interleaved w.r.t. inclusions.
The intrinsic Čech complex CδX(Pα) is the nerve complex for {BX(p, δ)}p∈Pα. Furthermore,
for δ < ρc(X), the family of geodesic balls {BX(p, δ)}p∈Pα forms a cover of the union Pαδ;X
that satisfies the condition of the Nerve Theorem (Theorem 2.1). Hence, there is a homotopy
equivalence between the nerve complex CδX(Pα) and Pαδ;X. Furthermore, using the same argument
as for showing that the diagram in Eqn. (6.16) commutes (Lemma 3.4 of [91]), one can show that the
following diagram commutes for any α ≤ β ∈ R and δ ≤ ξ < ρc(X):
H∗(Pαδ;X) ──i∗──→ H∗(Pβξ;X)
     │ h∗              │ h∗                            (6.25)
     ↓                 ↓
H∗(CδX(Pα)) ──i∗──→ H∗(CξX(Pβ))
Here the horizontal homomorphisms are induced by inclusions, and the vertical ones are isomor-
phisms induced by the homotopy equivalence between a union of geodesic balls and its nerve
complex. The above diagram leads to the following result (see Lemma 2 of [87] for details):
Corollary 6.15. Let X, f , and P be as in Proposition 6.14 (although f does not need to be C-
Lipschitz). For any δ < ρc (X), {Pαδ;X }α∈R and {CδX (Pα )}α∈R are 0-interleaved. Hence they induce
isomorphic persistence modules which have identical persistence diagrams.
Combining with Proposition 6.14, this implies that the filtration {CδX (Pα )}α and the filtration
{Fα }α are Cδ-interleaved for ε ≤ δ < ρc (X).
However, we cannot access the intrinsic metric dX of the manifold X and thus cannot directly
construct intrinsic Čech complexes. It turns out that for points that are sufficiently close, their
Euclidean distance forms a constant factor approximation of the geodesic distance between them
on X.
Proposition 6.16. Let X ⊂ Rd be an embedded Riemannian manifold with reach ρX. For any two
points x, y ∈ X with dE(x, y) ≤ ρX/2, we have that:

dE(x, y) ≤ dX(x, y) ≤ (1 + 4dE(x, y)²/(3ρX²)) · dE(x, y) ≤ (4/3) dE(x, y).
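The bound can be sanity-checked on the unit circle in R², where the reach is ρX = 1, the intrinsic distance between points at angle θ apart is the arc length dX = θ, and the extrinsic distance is the chord length dE = 2 sin(θ/2). A small numeric check (illustration only):

```python
import math

rho = 1.0   # reach of the unit circle
ok = True
theta = 0.001
while True:
    dE = 2.0 * math.sin(theta / 2.0)     # chord length
    if dE > rho / 2.0:                   # hypothesis of the proposition
        break
    dX = theta                           # arc length
    upper = (1.0 + 4.0 * dE ** 2 / (3.0 * rho ** 2)) * dE
    ok = ok and (dE <= dX <= upper <= (4.0 / 3.0) * dE + 1e-12)
    theta += 0.001
# ok stays True: the chord approximates the geodesic within a 4/3 factor
```

Note that dE ≤ ρX/2 forces the correction factor 1 + 4dE²/(3ρX²) to be at most 4/3, which is exactly the rightmost inequality.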
This implies the following nested relation between the extrinsic and intrinsic Čech complexes:

CδX(Pα) ⊆ Cδ(Pα) ⊆ C^{4δ/3}_X(Pα) ⊆ C^{4δ/3}(Pα) ⊆ C^{16δ/9}_X(Pα), for any δ < (3/8)ρX.    (6.26)
Note that a similar relation also holds between the intrinsic Čech filtration and the extrinsic Rips
complexes due to the nested relation between extrinsic Čech and Rips complexes. To infer persis-
tent homology from nested pair filtrations for complexes constructed under the Euclidean metric,
we use the following key lemma from [87], which can be thought of as a persistent version as well
as a generalization of Fact 6.3.
Proposition 6.17. Let X, f, and P be as in Proposition 6.14. Suppose that there exist ε′ ≤ ε″ ∈
[ε, ρc(X)) and two filtrations {Gα}α and {G′α}α, so that
Then the persistence module induced by the filtration {Fα}α for f and that induced by the nested
pairs of filtrations {Gα ↪ G′α}α are Cε″-interleaved, where f is C-Lipschitz.
Combining this proposition with the sequences in Eqn. (6.26), we obtain the following results
on inferring the persistent homology induced by a function f : X → R.
Theorem 6.18. Let X ⊂ Rd be a compact Riemannian manifold with intrinsic metric dX , and
f : X → R a C-Lipschitz function on X. Let ρX and ρc (X) be the reach and the strong convexity
radius of (X, dX ) respectively. Suppose P ⊂ X is an ε-geodesic sample of X, equipped with
fˆ : P → R such that fˆ = f |P . Then:
In particular, in each case above, the bottleneck distance between their respective persistence
diagrams is bounded by the stated interleaving distance between persistence modules.
Much of the material in Section 6.3 is taken from [81, 87, 91, 245]. We remark that there
have been different variations of the medial axis in the literature. We follow the notation from
[119]. We also note that there exists a robust version of the medial axis, called the λ-medial axis,
proposed in [89]. The concept of the local feature size was originally proposed in [270] in the
context of mesh generation and a different version that we describe in this chapter was introduced
in [8] in the context of curve/surface reconstruction. The local feature size has been widely used
in the field of surface reconstruction and mesh generation; see the books [98, 119]. Critical points
of the distance field were originally studied in [177]. See [89, 90, 225] for further studies as well
as the development on weak feature sizes.
In homology inference for manifolds, we note that Niyogi, Smale and Weinberger in [245]
provide two deformation retraction results from unions of balls over P to a manifold X; Proposition
3.1 holds for the case when P ⊂ X, while Proposition 7.1 holds when P is within a tubular
neighborhood of X. The latter has a much stronger requirement on the radius α. In our presentation,
Proposition 6.10 uses a corollary of Proposition 3.1 of [245] to obtain an isomorphism between
the homology groups of the union of balls and of X. This allows a better range of the parameter
α; however, we lose the deformation retraction here; see the footnote above Proposition 6.10.
Results in Section 6.4 are mostly based on the work in [87].
This chapter focuses on presenting the main framework behind homology (or persistent ho-
mology) inference from point cloud data. The current theoretical guarantees hold when input
points sample the hidden domain well within Hausdorff distance. For more general noise models
that include outliers and statistical noise, we need a more robust notion of distance field than what
we used in Section 6.3.1. To this end, an elegant concept called distance to measures (DTM)
has been proposed in [79], which has many nice properties and can lead to more robust homo-
logical inferences; see, e.g., [82]. An alternative approach using kernel-distance is proposed in
[256]. See also [56, 79, 246] for data sparsification or homology inference for points corrupted
with more general noise, and [55] for persistent homology inference under more general noise for
input scalar fields.
Exercises
1. Prove Part (i) of Theorem 6.3.
2. Prove the bound on the Rips pseudo-distance dRips (P, Q) in Part (ii) of Theorem 6.3.
3. Given two finite sets of points P, Q ⊂ Rd , let dP and dQ denote the restriction of the Eu-
clidean metric over P and Q respectively. Consider the Hausdorff distance δH = dH (P, Q)
between P and Q, as well as the Gromov-Hausdorff distance δGH = dGH ((P, dP ), (Q, dQ )).
5. Consider the greedy permutation approach introduced in Chapter 6.2, and the assignment
of exit-times for points p ∈ P. Construct the open tower {Nγ } and closed tower {N γ } as
described in the chapter. Prove that both Nγ and N γ are γ-nets for P.
6. Suppose we are given P0 ⊃ P1 sampled from a metric space (Z, d) where P1 is a γ-net of
P0. Define π : P0 → P1 as π(p) = argminq∈P1 d(p, q) (if argminq∈P1 d(p, q) contains more
than one point, then set π(p) to be any point q that minimizes d(p, q)).
(a) Prove that the vertex map π induces a simplicial map π : VRα (P0 ) → VRα+γ (P1 ).
(b) Consider the following diagram. Prove that the map j◦π is contiguous to the inclusion
map i.
VRα(P0) ──i──→ VRα+γ(P0)
      π ↘          ↗ j                              (6.27)
        VRα+γ(P1)
7. Let P be a set of points in Rd . Let d2 and d1 denote the distance metric under L2 norm
and under L1 norm respectively. Let C2 (P) and C1 (P) be the Čech filtration over P induced
by d2 and d1 respectively. Show the relation between the log-scaled version of persistence
diagrams Dgmlog C2 (P) and Dgmlog C1 (P), that is, bound db (Dgmlog C2 (P), Dgmlog C1 (P))
(see the discussion above Corollary 4.4 in Chapter 4).
8. Prove Proposition 6.14. Using the fact that Diagram 6.25 commutes, prove Corollary 6.15.
Chapter 7
Reeb Graphs
Figure 7.1: (Left). A description function based on averaging geodesic distances is shown on
different models, together with some isocontours of this function. This function is robust w.r.t.
near-isometric deformation of shapes. (Right) The Reeb graph of the descriptor function (from the
left) is used to compare different shapes. Here, given a query shape (called "key"), the most sim-
ilar shapes retrieved from a database are shown on the right. Images taken from [190], reprinted
by permission from ACM: Masaki Hilaga et al. (2001).
We define the Reeb graph and introduce some properties of it in Section 7.1. We also describe
efficient algorithms to compute it for the piecewise-linear setting in Section 7.2. For comparing
Reeb graphs, we need to define distances among them. In Section 7.3, we present two equivalent
distance measures for Reeb graphs and give a stability result for these distances w.r.t. changes
in the input function that defines the Reeb graph. In particular, we note that a Reeb graph can also
be viewed as a graph equipped with a “height” function on it which is induced by the original
function f : X → R on the input domain. This height function provides a natural metric on the
Reeb graph, rendering a view of the Reeb graph as a specific metric graph. This further leads
to a distance measure for Reeb graphs based on the Gromov-Hausdorff distance idea, which we
present in Section 7.3. An alternative way to define a distance for Reeb graphs is based on the
interleaving idea, which we also introduce in Section 7.3. It turns out that these two versions of
distances for Reeb graphs are strongly equivalent, meaning that they are within a constant factor
of each other.
[Figure: the quotient map Φ : X → Rf collapses each contour of f to a single point; here Φ(x) = Φ(y), while Φ(z) is a different point.]
regular CW complex which is a graph, and this is why it is commonly called a Reeb graph. In
particular, from now on, we tacitly assume that the input function f : X → R is levelset tame,
meaning that (i) each level set f−1(a) has a finite number of components, and each component is
path connected¹, and (ii) f is of Morse type (Definition 4.14). It is known that Morse functions
on a compact smooth manifold and PL-functions on finite simplicial complexes are both levelset
tame.
A level set may consist of several connected components, each of which is called a contour.
Intuitively, the Reeb graph Rf is obtained by collapsing contours (connected components) in each
level set f−1(a) continuously. In particular, as we vary a, Rf tracks the changes (e.g., creation,
deletion, splitting and merging) of connected components in the levelsets f−1(a), and thus is a
meaningful topological summary of f : X → R.
As the function f is constant on each contour in a levelset, f : X → R also induces a contin-
uous function f˜ : R f → R defined as f˜(z) = f (x) for any preimage x ∈ Φ−1 (z) of z. To simplify
notation, we often write f (z) instead of f˜(z) for z ∈ R f when there is no ambiguity, and use f˜
mostly to emphasize the different domains of the functions. In all illustrations of this chapter, we
plot the Reeb graph with the vertical coordinate of a point z to be the function value f (z).
Critical points. As we describe above, the Reeb graph can be viewed as the underlying space
of a 1-dimensional cell complex, where there is also a function f˜ : R f → R defined on R f . We
can further assume that the function f˜ is monotone along each 1-cell of Rf; if not, we simply
insert a new node where this condition fails, and the tameness of f : X → R guarantees that we
only need to add a finite number of nodes. Hence we can view the Reeb graph as the underlying
space of a 1-dimensional simplicial complex (graph) (V, E) associated with a function f˜ that is
monotone along each edge e ∈ E. Note that we can further insert more nodes into an edge in E,
breaking it into multiple edges; see, e.g., the augmented Reeb graph in Figure 7.4 (c). We now
continue with this general view of the Reeb graph, whose underlying space is a graph equipped
with a function f˜ that is monotone along each edge. We can then talk about the induced critical
points as in Definition 3.23. An alternative (and simpler) way to describe such critical points is
as follows: given a node x ∈ V in the vertex set V := V(Rf) of the Reeb graph Rf, let the up-degree
(resp. down-degree) of x denote the number of edges incident to x that have higher (resp. lower)
values of f˜ than x. A node is regular if both its up-degree and down-degree equal 1, and
critical otherwise. A critical point is a minimum (maximum) if it has down-degree 0 (up-degree
0), and a down-fork (up-fork) if it has down-degree (up-degree) larger than 1. A critical point can
be degenerate, having more than one type of criticality: e.g., a point with down-degree 0 and
up-degree 2 is both a minimum and an up-fork.
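This degree-based classification is straightforward to implement once the Reeb graph is given combinatorially. A small sketch (the input format, an edge list plus a dictionary of function values assumed monotone along edges, is our own convention):

```python
def classify_nodes(edges, f):
    """Label each node of a function-equipped graph by its up-/down-degree:
    minimum, maximum, down-fork, up-fork (possibly several), or regular."""
    up = {v: 0 for v in f}
    down = {v: 0 for v in f}
    for u, v in edges:
        lo, hi = (u, v) if f[u] < f[v] else (v, u)
        up[lo] += 1      # edge goes up from lo
        down[hi] += 1    # and comes down into hi
    labels = {}
    for v in f:
        tags = []
        if down[v] == 0:
            tags.append("minimum")
        if up[v] == 0:
            tags.append("maximum")
        if down[v] > 1:
            tags.append("down-fork")
        if up[v] > 1:
            tags.append("up-fork")
        labels[v] = tags or ["regular"]   # up-degree = down-degree = 1
    return labels

# A single loop: a splits upward into b and c, which rejoin at d.
f = {"a": 0, "b": 1, "c": 1, "d": 2}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
labels = classify_nodes(edges, f)
# labels["a"] -> ["minimum", "up-fork"]   (a degenerate critical point)
```

Note how the degenerate case falls out automatically: node a receives both the minimum and the up-fork tag.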
Note that because of the monotonicity of f˜ at regular points, the Reeb graph together with its
associated function is completely described, up to homeomorphisms preserving the function, by
the function values at the critical points.
Now imagine that one sweeps the domain X in increasing order of f -values, and tracks the
changes in the connected components during this process. New components appear (at down-
degree 0 nodes), existing components vanish (at up-degree 0 nodes), or components merge or
1
As introduced in Exercise 3 of Chapter 1, a topological space T is path connected if any two points x, y ∈ T can
be joined by a path, i.e., there exists a continuous map f : [0, 1] → T of the segment [0, 1] ⊂ R onto T so that f (0) = x
and f (1) = y.
split (at down/up-forks). The Reeb graph Rf encodes such changes, thereby making it a simple
but meaningful topological summary of the function f : X → R. However, it only tracks the
connected components in the levelsets, and thus cannot capture complete information about f. Never-
theless, it reflects certain aspects of both the domain X itself and the function f defined on it,
which we describe in Section 7.2.3.
Figure 7.3: Examples of the Reeb graph, the merge tree and the split tree of an input scalar field: (a) input scalar field; (b) Reeb graph; (c) merge tree and split tree.
Variants of Reeb graphs. Treating a Reeb graph as a simplicial 1-complex, we can talk about
1-cycles (loops) in it. A loop-free Reeb graph is also called a contour tree, which itself has found
many applications in computer graphics and visualization. Instead of tracking the connected
components within a levelset, one can also track them within the sublevel set while sweeping
X along increasing f -values, or track them within the superlevel set while sweeping X along
decreasing f -values. The resulting topological summaries are called the merge tree and the split
tree, respectively. See the precise definition below and examples in Figure 7.3.
Definition 7.2. Define x ∼ M y if and only if f (x) = f (y) = a and x is connected to y within the
sublevel set f −1 ((−∞, a]). Then the quotient space T M = X/ ∼ M is the merge tree w.r.t. f .
Alternatively, if we define x ∼S y if and only if f (x) = f (y) = a and x is connected to y within
the superlevel set f −1 ([a, +∞)), then the quotient space T S = X/ ∼S is the split tree w.r.t. f .
Indeed, for the levelset tame functions we consider, TM and TS are both finite trees. If Rf is
loop-free (thus a tree), then this contour tree is uniquely determined by, and can be computed from,
the merge and split trees of f.
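The merge tree can be computed by the classic union-find sweep: process vertices in increasing f-order; a component is born at a local minimum and two components merge at a saddle of the sweep. A minimal sketch for a PL-function on a graph (illustrative only; the event format is our own):

```python
def merge_events(n, edges, f):
    """Sweep the vertices of a graph in increasing f-order.  Returns
    (births, merges): a component is born at (f[v], v) when v has no
    lower neighbor yet; components merge at (f[v], v) otherwise."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    births, merges = [], []
    for v in sorted(range(n), key=lambda v: f[v]):
        parent[v] = v
        roots = {find(u) for u in adj[v] if u in parent} - {v}
        for r in roots:
            parent[r] = v          # hang every lower component below v
        if not roots:
            births.append((f[v], v))
        merges += [(f[v], v)] * (len(roots) - 1)
    return births, merges

# Path v0-v1-v2-v3-v4 with f = [1, 3, 0, 2, 4]: components born at the
# minima v2 and v0 merge at the saddle v1.
births, merges = merge_events(5, [(0, 1), (1, 2), (2, 3), (3, 4)],
                              [1, 3, 0, 2, 4])
# births -> [(0, 2), (1, 0)]; merges -> [(3, 1)]
```

Running the same sweep in decreasing f-order yields the split tree events, matching the symmetry in Definition 7.2.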
Finally, instead of real-valued functions, one can define a similar quotient space X/ ∼ for a
continuous map f : X → Z to a general metric space (e.g., Z = Rd), where ∼ is the equivalence
relation x ∼ y if and only if f (x) = f (y) = a and x is connected to y within the levelset f −1 (a). The
resulting structure is called the Reeb space. See Section 9.3 where we consider this generalization
in the context of another structure called mapper.
of K. From now on, we assume that f is generic and K = (V, E, T ) is a simplicial 2-complex
with vertex set V, edge set E and triangle set T . Let nv , ne and nt denote the size of V, E, and T ,
respectively, and set m = nv + ne + nt . We sketch algorithms to compute the Reeb graph for the
PL-function f . Sometimes, they output the so-called augmented Reeb graph, which is essentially
a refinement of the Reeb graph R f with certain additional degree-2 vertices inserted in arcs of R f .
Definition 7.3 (Augmented Reeb). Given a PL-function f : |K| → R defined on a simplicial
complex K = (V, E, T), let Rf be its Reeb graph and Φf : |K| → Rf be the associated quotient
map. The augmented Reeb graph of f : |K| → R, denoted by R̂f, is obtained by inserting each
point in Φf(V) := {Φf(v) | v ∈ V} as a graph node into Rf (if it is not already a node).
Figure 7.4: (a) A simplicial complex K. The set of 2-simplices of K includes △rpq, △rpw, △rqw,
as well as the two dark-colored triangles incident to p and to w, respectively. (b) Reeb graph of
the height function on |K|. (c) Its augmented Reeb graph.
For a PL-function, each critical point of the Reeb graph Rf (w.r.t. f˜ : Rf → R induced by f)
is necessarily the image of some vertex in K, and thus the critical points form a subset of the points in
Φf(V). The augmented Reeb graph R̂f then includes all remaining points of Φf(V) as (degree-2)
graph nodes. See Figure 7.4 for an example, where, as a convention, we plot a node Φf(v) at the
same height (function value) as v.
We now sketch the main ideas behind two algorithms that compute the Reeb graph for a
PL-function with the best time complexity, one deterministic and the other randomized.
Figure 7.5: As one sweeps past v, the combinatorial structure of the pre-image graph changes:
Ga has 3 connected components (one of which contains a single point only), while Gb has only 2
components.
A natural idea to construct the Reeb graph R f of f : |K| → R is to sweep the domain K
with increasing value of a, track the connected components in Ga during the course, and record
the changes (merging or splitting of components, or creation and removal of components) in the
resulting Reeb graph.
Furthermore, as f is a PL-function, the combinatorial structure of Ga can only change when
we sweep past a vertex v ∈ V. When that happens, only edges/triangles of K incident to v
can incur changes in Ga. See Figure 7.5. Let sv denote the total number of simplices incident
on v. It is easy to see that as one sweeps through the vertex v, only O(sv) insertions
and deletions are needed to update the pre-image graph Ga. To be able to build the Reeb graph
R f , we simply need to maintain the connectivity of Ga as we sweep. Assuming we have a data
structure to achieve this, the high level framework of the sweep algorithm is then summarized in
Algorithm 12:Reeb-SweepAlg.
Algorithm 12 Reeb-SweepAlg(K, f )
Input:
A simplicial 2-complex K and a vertex function f : V(K) → R
Output:
The Reeb graph of the PL-function induced by f
1: Sort vertices in V = {v1 , . . . , vnv } in increasing order of f -values
2: Initialize the Reeb graph R and the pre-image graph Ga to be empty
3: for i = 1 to nv do
4: LC = LowerComps(vi )
5: UpdatePreimage(vi ) \∗Update the pre-image graph Ga ∗\
6: UC = UpperComps(vi)
7: UpdateReebgraph(R, LC, UC, vi )
8: end for
9: Output R as the Reeb graph
In particular, suppose we have a data structure, denoted by DynSF, that maintains a spanning
forest of the pre-image graph at any moment. Each connected component in the pre-image graph
is associated with a certain vertex v from V, called the representative vertex of this component, which
indicates that this component is created when passing through v. We assume that the data structure
DynSF allows the following operations: First, assume that a graph node ea ∈ Wa in the pre-image
graph Ga is generated by edge e ∈ K, that is, ea is the intersection of e with the levelset f −1 (a).
• Find(e): given an edge e ∈ E, returns the representative vertex of the component in the
current pre-image graph Ga containing the node ea ∈ Wa generated by e.
• Insert(e, e0 ), Delete(e, e0 ): inserts an edge (ea , e0a ) into Ga and deletes (ea , e0a ) from Ga
respectively while still maintaining a spanning forest for Ga under these operations.
Using these operations, the pseudocode for the subroutines called in algorithm Reeb-SweepAlg
is given in Algorithms 13:LowerComps, 14:UpdatePreImage, and 15:UpdateReebGraph. (The
routine UpperComps is symmetric to LowerComps and thus omitted.) These codes assume that
edges of K not intersecting the levelsets are still kept in the pre-image graphs as isolated nodes; hence
there is no need to add or remove isolated nodes.
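As a concrete (if deliberately naive) stand-in for DynSF, the toy class below supports the three operations by storing the pre-image graph explicitly and recomputing a component on each Find; real implementations replace this O(m)-per-query search with the maximum-spanning-forest scheme described below, at O(log m) amortized per operation. Returning the smallest stored vertex label as the representative is a convention of this sketch.

```python
class NaiveDynSF:
    """Toy version of DynSF: Find / Insert / Delete on the pre-image
    graph, with Find answered by a graph search."""

    def __init__(self):
        self.rep = {}     # graph node -> its representative vertex
        self.adj = {}     # adjacency lists of the pre-image graph

    def add_node(self, e, representative):
        self.rep[e] = representative
        self.adj.setdefault(e, set())

    def insert(self, e1, e2):
        self.adj[e1].add(e2)
        self.adj[e2].add(e1)

    def delete(self, e1, e2):
        self.adj[e1].discard(e2)
        self.adj[e2].discard(e1)

    def find(self, e):
        seen, stack = {e}, [e]
        while stack:                      # explore e's component
            x = stack.pop()
            for y in self.adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return min(self.rep[x] for x in seen)

dsf = NaiveDynSF()
for node, r in [("ab", 1), ("cd", 2), ("ef", 3)]:
    dsf.add_node(node, r)
dsf.insert("ab", "cd")
# dsf.find("cd") -> 1; after dsf.delete("ab", "cd"), dsf.find("cd") -> 2
```

The interface mirrors the operations listed above, so the sweep framework can be prototyped against this class before swapping in an efficient structure.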
Algorithm 13 LowerComps(v)
Input:
a vertex v ∈ K
Output:
A list LC of connected components in the pre-image graph generated by the lower-star of v
1: LC = empty list
2: for all edges e in the lower-star of v do
3: c = DynSF.Find(e)
4: if c is not marked ‘listed’ then
5: LC.add(c) and mark c as 'listed'
6: end if
7: end for
Time complexity analysis. Suppose the input simplicial 2-complex K = (V, E, T ) has n vertices
and m simplices in total. Sorting the vertices takes O(n log n) time. Then steps 4 to 7 of the
algorithm Reeb-SweepAlg perform O(m) Find, Insert and Delete operations using
the data structure DynSF.
One could use a state-of-the-art data structure for dynamic graph connectivity as DynSF; in-
deed, this is the approach taken in [146]. However, note that this is an offline version of the
dynamic graph connectivity problem, as all insertions/deletions are known in advance and thus
can be pre-computed. To this end, we assign each edge in the pre-image graph a weight, which is
the time (f-value) at which it will be deleted from the pre-image graph Ga. We then maintain a maximum
spanning forest of Ga during the sweeping to maintain connectivity. In general, a deletion of a
maximum-spanning tree edge (u, v) can incur expensive search in the pre-image graph for a re-
placement edge (as u and v may still be connected). However, because of the specific assignment
of edge weights, this expensive search is avoided in this case. If a maximum spanning tree edge
is to be deleted, it will simply break the tree in the maximum spanning forest containing this
edge, and no replacement edge needs to be identified. One can use a standard dynamic tree data
Algorithm 14 UpdatePreImage(v)
Input:
A vertex v ∈ K
Output:
Update the pre-image graph after sweeping past v
1: for all triangles uvw incident on v do
2: \∗ w.l.o.g. assume f (u) < f (w) ∗\
3: if f (v) < f (u) then
4: DynSF.Insert(vu, vw)
5: else
6: if f (v) > f (w) then
7: DynSF.Delete(vu, vw)
8: else
9: DynSF.Delete(uv, uw)
10: DynSF.Insert(vw, uw)
11: end if
12: end if
13: end for
structure, such as the Link-Cut trees [280], to maintain the maximum spanning forest efficiently in
O(log m) amortized time for each find / insertion / deletion operation. Putting everything together,
it takes O(m log m) time to compute the Reeb graph by the sweep.
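The reason no replacement edge is ever needed is the cycle property of maximum spanning forests: when a forest edge with deletion time t is removed, every surviving edge has deletion time greater than t, so any alternative path would have made the deleted edge non-maximal and it would never have entered the forest. The simulation below (a standalone check on a random static graph, ignoring interleaved insertions) verifies this claim:

```python
import random

def max_spanning_forest(nodes, weights):
    """Kruskal's algorithm on weights = {(u, v): w}; returns forest edges."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    forest = set()
    for (u, v), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.add((u, v))
    return forest

def connected(edge_set, s, t):
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        for (u, v) in edge_set:
            y = v if u == x else (u if v == x else None)
            if y is not None and y not in seen:
                seen.add(y)
                stack.append(y)
    return t in seen

random.seed(0)
nodes = range(8)
pairs = [(u, v) for u in nodes for v in nodes if u < v]
weights = {e: t for t, e in enumerate(random.sample(pairs, 16))}
ok = True
for (u, v), _ in sorted(weights.items(), key=lambda kv: kv[1]):
    forest = max_spanning_forest(nodes, weights)
    del weights[(u, v)]          # process deletions in time order
    if (u, v) in forest:         # a forest edge was deleted ...
        # ... then no replacement can exist among surviving edges
        ok = ok and not connected(set(weights), u, v)
# ok stays True for any seed
```

Deleting the minimum-weight remaining edge either leaves the forest untouched (a non-forest edge) or splits one tree in two with nothing to search for, which is exactly what makes the amortized O(log m) bound possible.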
Theorem 7.1. Given a PL-function f : |K| → R, let m denote the total number of simplices in the
2-skeleton of K. One can compute the (augmented) Reeb graph R f of f in O(m log m) time.
models the effect of the quotient map Φ, but does so in a randomized manner so as to obtain a
good (expected) running time.
Figure 7.6: The vertices are randomly ordered. Starting from the initial simplicial complex in (a),
the algorithm performs vertex-collapses for vertices in this random order, as shown in (b)-(f).
Algorithm 16 Reeb-RandomAlg(K, f )
Input:
A simplicial 2-complex K and a vertex function f : V(K) → R
Output:
The augmented Reeb graph of the PL-function induced by f
1: Let V = {v1 , . . . , vnv } be a random permutation of vertices in V
2: Set K0 = K and f0 = f
3: for i = 1 to nv do
4: Collapse the contour of fi−1 : |Ki−1 | → R passing through (incident to) vi and obtain
complex Ki
5: fi : |Ki | → R is the PL-function on Ki induced from fi−1
6: end for
7: Output the final complex Knv as the augmented Reeb graph
Figure 7.7: The function f is the height function. (a)-(b) The contour incident to point q for the
complex in (a) is collapsed, resulting in the new complex in (b). (c) The collapse of the contour
within a single triangle incident to q. (d) An example where this triangle borders another triangle.
(e) There are two triangles incident to q that have q as the mid-vertex, and both need to
be processed. The triangle qp1p4 does not have q as mid-vertex, and it is not touched while
processing q.
collapsed simplicial complex Ki−1 whose augmented Reeb graph is the same as that of f. It
then "collapses" the contour of fi−1 passing through the vertex vi and obtains a new PL-function
fi : |Ki| → R over a further collapsed simplicial complex Ki that maintains the augmented Reeb
graph.
The key is to implement this "collapse" step (lines 4-5). To see the effect of collapsing the
contour incident to a vertex, see Figure 7.7 (a) and (b). To see how the collapse is implemented,
first consider the triangle qp1p2 incident to vertex q as in Figure 7.7 (c), and assume that q is
the mid-vertex of this triangle, that is, its height value ranks second among the three vertices of
the triangle. Intuitively, we need to map each horizontal segment (part of a contour at different
heights) to the corresponding point along the edges qp1 and qp2. If the triangle incident to q that
we are collapsing has one or more triangles sharing the edge p1p2, as shown in Figure 7.7 (d), then
each such incident triangle needs to be processed appropriately. In particular, consider one such
triangle (p1, p2, r) in Figure 7.7 (d): as q′ is sent to q, the dotted edge rq′ becomes edge rq as
shown. Thus, the triangle rp1p2 is now split into two new triangles qrp1 and qrp2. In this case, it
is easy to see that at most one of the new triangles will have q as the mid-vertex. We collapse this
triangle and continue the process until no more triangle with q as the mid-vertex is left (Figure 7.7
(b)). Triangle(s) incident to q but not having q as the mid-vertex are not processed, e.g., triangle
qp1p4 in Figure 7.7 (e). At this point, the entire contour passing through q is collapsed into a
single point, and lines 4-5 of the algorithm are executed.
After processing each vertex as described above, the algorithm Reeb-RandomAlg in the end
computes the final complex Knv in line 7. It is necessarily a simplicial 1-complex because no
vertex can be the mid-vertex of any triangle, implying that there is no triangle left. It is easy to
see that, by construction, Knv is the augmented Reeb graph w.r.t. f : |K| → R.
Time complexity. For each vertex v, the time complexity of the collapse is proportional to
the number Tv of triangles intersected by the contour Cv passing through v. In the worst case,
Tv = nt, giving rise to an O(nv nt) worst-case running time for algorithm Reeb-RandomAlg. This
worst-case time complexity turns out to be tight. However, if one processes the vertices in a
random order, then the worst-case behavior is unlikely to happen, and the expected running time
can be proven to be O(m log nv) = O(m log m). Essentially, one argues that an original triangle
from the input simplicial complex is split only O(log nv) = O(log m) expected number of times,
thus creating O(log m) expected intermediate triangles, which take O(log m) expected
time to collapse. The argument is similar in spirit to the analysis of the path length in a randomly
built binary search tree [109].
The equality β0 (X) = β0 (R f ) in the above statement follows from the fact that R f is the quo-
tient space X/ ∼ and each equivalence class itself is connected (it is a connected component in some
levelset). The relation on β1 can be proven directly, and it is also a by-product of Theorem 7.4
below (combined with Fact 7.2). The above statement also implies that if X is simply connected,
then R f is loop-free.
For the case where X is a 2-manifold, more information about X can be recovered from the
Reeb graph of a Morse function defined on it.
Theorem 7.3 ([107]). Let f : X → R be a Morse function defined on a connected and compact
2-manifold.
We now present a result that characterizes H1 (R f ) w.r.t. H1 (X) in a more precise manner,
which also generalizes Theorem 7.3.
The coset ω + H̄ p (X) for every class ω ∈ H p (X) provides an equivalence class in Ȟ p (X). We call
h a vertical homology class if h + H̄ p (X) is not 0 in Ȟ p (X); in other words, h ∉ H̄ p (X). Two
homology classes h1 and h2 are vertically homologous if h1 ∈ h2 + H̄ p (X).
Fact 7.2. By definition, rank (H p (X)) = rank (H̄ p (X)) + rank (Ȟ p (X)).
Let I be a closed interval of R. We define the height of I = [a, b] to be height(I) = |b − a|; note
that the height could be 0. Given a homology class h ∈ H p (X) and an interval I, we say that h is
supported by I if h ∈ im (i∗ ) where i∗ : H p (XI ) → H p (X) is the homomorphism induced by the
canonical inclusion XI ,→ X. In other words, XI contains a p-cycle γ from the homology class h.
We define the height of a homology class h ∈ H p (X) to be height(h) = inf { height(I) | h is supported by I }.
Isomorphism between Ȟ1 (X) and H1 (R f ). The surjection Φ : X → R f (X) induces a chain map
Φ# from the 1-dimensional singular chain group of X to the 1-dimensional singular chain group of
R f (X), which eventually induces a homomorphism Φ∗ : H1 (X) → H1 (R f (X)). For the horizontal
subgroup H̄1 (X), we have that Φ∗ (H̄1 (X)) = 0 ∈ H1 (R f (X)). Hence Φ∗ induces a well-defined
homomorphism between the quotient groups

Φ̌ : Ȟ1 (X) = H1 (X) / H̄1 (X) → H1 (R f (X)) / H̄1 (R f (X)) = H1 (R f (X)).
The right equality above follows from the fact that H̄1 (R f (X)) = 0, which holds because every level set
of R f (X) consists only of a finite set of disjoint points due to the levelset-tameness of function
f : X → R. It turns out that Φ̌ is an isomorphism. Intuitively, this is not surprising, as Φ maps
each contour in a level set to a single point, which in turn collapses every horizontal cycle.
Theorem 7.4. Given a levelset tame function f : X → R, let Φ̌ : Ȟ1 (X) → H1 (R f (X)) be
the homomorphism induced by the surjection Φ : X → R f (X) as defined above. Then the map
Φ̌ is an isomorphism. Furthermore, for any vertical homology class h ∈ Ȟ1 (X), we have that
height(h) = height(Φ̌(h)).
Persistent homology for f : R f → R. We have discussed earlier that the Reeb graph of a
levelset tame function f : X → R can be represented by a graph whose edges have monotone
function values. Then, the function f : R f → R can be treated as a PL-function on the simpli-
cial 1-complex R f . This gives rise to the standard setting where a PL-function f is defined on a
simplicial 1-complex R f whose persistence is to be computed. We can apply algorithm ZeroP-
erDg from Section 3.5.3 to compute the 0-th persistence diagram Dgm0 ( f ). For computing one
dimensional persistence diagram Dgm1 ( f ), one can modify this algorithm slightly by registering
the function values of the edges that create cycles. These are edges that connect vertices in the
same component. The function values of these edges are the birth points of the 1-cycles that never
die. This algorithm takes O(n log n + mα(n)) time, where n and m are the numbers of vertices and
edges, respectively, in R f .
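The bookkeeping described above can be sketched with a single union-find pass. The following is a hypothetical illustration (the function name and interface are ours, not the book's ZeroPerDg): edges enter at the larger endpoint value, the elder rule pairs the younger component at each merge, and an edge joining vertices already in the same component records an essential 1-dimensional bar. Zero-persistence pairs are kept for transparency.

```python
# 0-/1-dim persistence of a PL function on a graph (simplicial 1-complex).
# A vertex enters at f(v); an edge (u, v) enters at max(f(u), f(v)).

def graph_persistence(f, edges):
    """f: dict vertex -> value; edges: list of (u, v) pairs."""
    parent = {v: v for v in f}
    birth = dict(f)                      # birth value of each component root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    dgm0, dgm1 = [], []
    for u, v in sorted(edges, key=lambda e: max(f[e[0]], f[e[1]])):
        fe = max(f[u], f[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            dgm1.append((fe, float('inf')))   # cycle born at fe, never dies
            continue
        if birth[ru] > birth[rv]:             # elder rule: younger dies
            ru, rv = rv, ru
        dgm0.append((birth[rv], fe))
        parent[rv] = ru
    roots = {find(v) for v in f}              # essential 0-dim bars
    dgm0.extend((birth[r], float('inf')) for r in roots)
    return sorted(dgm0), sorted(dgm1)
```

After sorting, the main loop is a sequence of union-find operations, matching the O(n log n + mα(n)) bound stated above.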
We can also compute the levelset zigzag persistence of f (Section 4.5) using the zigzag per-
sistence algorithm in Section 4.3.2. However, taking advantage of the graph structure, one can
compute the levelset zigzag persistence for a Reeb graph with n vertices and edges in O(n log n)
time using an algorithm of [5] built on the mergeable trees data structure [169]. Only
the 0-th persistence diagram Dgm0 ( f ) is nontrivial in this case. We can read the zeroth persistence
diagram for the standard persistence using Theorem 4.15 from this level set persistence diagram.
Furthermore, for every infinite bar [ai , ∞) in the standard one dimensional persistence diagram,
we get a pairing (a j , ai ) (open-open bar) in the zeroth levelset diagram Dgm0 ( f ).
Reeb graphs can be a useful tool to compute the zeroth levelset zigzag persistence diagram
of a function on a topological space. Let f : X → R be a continuous function whose zeroth
persistence diagram we want to compute. We already observed that the function f induces a
continuous function on the Reeb graph R f . To distinguish the two domains more explicitly, we
denote the former function f X and the latter as f R . The following observation helps in computing
the zeroth levelset zigzag persistence diagram Dgm0 ( f X ), because computationally it is much harder
to process a space, say the underlying space of a simplicial complex, than a graph (simplicial
1-complex).
Proposition 7.5. Dgm0 ( f X ) = Dgm0 ( f R ) where the diagrams are for the zeroth levelset zigzag
persistence.
The result follows from the following observation. Consider the levelset zigzag filtrations F X
and FR for the two functions as in sequence (4.15).
F X : X(a0 ,a2 ) ←↩ · · · ↪ X(ai−1 ,ai+1 ) ←↩ X(ai ,ai+1 ) ↪ X(ai ,ai+2 ) ←↩ · · · ↪ X(an−1 ,an+1 )
FR : R f (a0 ,a2 ) ←↩ · · · ↪ R f (ai−1 ,ai+1 ) ←↩ R f (ai ,ai+1 ) ↪ R f (ai ,ai+2 ) ←↩ · · · ↪ R f (an−1 ,an+1 )
Using the notation Xij := X(ai ,a j ) and Rij := R f (ai ,a j ) for the interval sets, we have the following
commutative diagram between the 0-th levelset zigzag persistence modules.
All vertical maps are isomorphisms because the number of components in Xij is exactly equal to
the number of components in the quotient space Rij = Xij / ∼ which is used to define the Reeb
graph. All horizontal maps are induced by inclusions. It follows that every square in the above
diagram commutes. Therefore, the two modules are isomorphic.
Definition 7.4. Given a Reeb graph (F, f ), its ε-smoothing, denoted by Sε (F, f ), is the Reeb
graph of the function fε : Fε → R defined on the ε-thickening Fε := F × [−ε, ε] of F, where
fε (x, t) = f (x) + t for x ∈ F and t ∈ [−ε, ε]. In other words, Sε (F, f ) = Fε / ∼ fε , where ∼ fε
denotes the equivalence relation with x ∼ fε y if and only if x, y ∈ Fε are from the same contour of fε .
See Figure 7.8 for an example. As Sε (F, f ) is the quotient space Fε / ∼ fε , we use [x, t],
x ∈ F, t ∈ [−ε, ε], to denote a point in Sε (F, f ), which is the equivalence class of (x, t) ∈ Fε
under the equivalence relation ∼ fε . Also, note that there is a natural “quotiented-inclusion” map
ι : (F, f ) → Sε (F, f ) defined as ι(x) = [x, 0], for any x ∈ F.
Suppose we have two Reeb graphs (A, fa ) and (B, fb ). A map µ : (A, fa ) → (B, fb ) between
them is function-preserving if fa (x) = fb (µ(x)) for each x ∈ A. A function-preserving map µ be-
tween (A, fa ) and Sε (B, fb ) induces a function-preserving map µε between Sε (A, fa ) and S2ε (B, fb )
Figure 7.8: From left to right, we have the Reeb graph (F, f ), its ε-thickening (Fε , fε ), and the
Reeb graph Sε (F, f ) of fε : Fε → R.
as follows:
µε : Sε (A, fa ) → S2ε (B, fb ) such that [x, t] ↦ [µ(x), t].
Now consider the “quotiented-inclusion” map ι introduced earlier, and suppose we also have a
pair of function-preserving maps φ : (F, f ) → Sε (G, g) and ψ : (G, g) → Sε (F, f ). Using the
above construction, we then obtain the following maps:
Definition 7.5 (Reeb graph interleaving). A pair of continuous maps φ : (F, f ) → Sε (G, g) and
ψ : (G, g) → Sε (F, f ) are ε-interleaved if (i) both of them are function preserving, and (ii) the
following diagram commutes:
(F, f ) −−ι−→ Sε (F, f ) −−ιε−→ S2ε (F, f )
    φ ↘   ψ ↗       φε ↘   ψε ↗
(G, g) −−ι−→ Sε (G, g) −−ιε−→ S2ε (G, g)
One can recognize that the above requirements of commutativity mirror the rectangular and
triangular commutativity in case of persistence modules (Definition 3.16). It is easy to verify the
rectangular commutativity, that is, to verify that the following diagram (and its symmetric version
involving maps ψ and ψε ) commutes.
(F, f ) −−ι−→ Sε (F, f )
   φ ↓              ↓ φε
Sε (G, g) −−ιε−→ S2ε (G, g)

That is, φε ◦ ι = ιε ◦ φ.
Rectangular commutativity, however, does not embody the interaction between the maps φ and ψ.
The key technicality lies in verifying the triangular commutativity, that is, that φ and ψ make the
following diagram commute (equivalently, φε ◦ ψ = ιε ◦ ι):

              Sε (F, f )
        ψ ↗            ↘ φε
(G, g) −−ι−→ Sε (G, g) −−ιε−→ S2ε (G, g)
For sufficiently large ε, the smoothing Sε (A, fa ) of any Reeb graph becomes a single segment with monotone
function values on it. Hence one can always find maps φ and ψ that are ε-interleaved for suf-
ficiently large ε. On the other hand, if ε = 0, then this implies ψ = φ−1 . Hence the smallest
ε accommodating ε-interleaved maps indicates how far the input Reeb graphs are from being
identical. This forms the intuition behind defining the following distance between Reeb graphs.
Definition 7.6 (Interleaving distance). Given two Reeb graphs (F, f ) and (G, g), the interleaving
distance between them is defined as:
dI (F, G) = inf{ε | there exists a pair of ε-interleaved maps between (F, f ) and (G, g) }. (7.1)
Definition 7.7 (Function-induced metric). Given a path π from u to v in a Reeb graph (A, fa ), the
height of π is defined as height(π) = max x∈π fa (x) − min x∈π fa (x).
Let Π(u, v) denote the set of all paths between two points u, v ∈ A. The function-induced metric
d fa : A × A → R on A induced by fa is defined as

d fa (u, v) = min π∈Π(u,v) height(π).

In other words, d fa (u, v) is the minimum length of any closed interval I ⊂ R such that u and v
are in the same path component of fa−1 (I). It is easy to verify that, for a finite Reeb graph, the
function-induced distance d fa is indeed a proper metric on it, and hence we can view the Reeb
graph (A, fa ) as a metric space (A, d fa ). Refer to Chapter 9, Definition 9.6 for a generalized
version of this metric.
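For a finite graph with a function on its vertices and monotone edges, this metric can be computed by brute force over candidate interval endpoints drawn from the vertex values. A hypothetical sketch (our naming, not optimized):

```python
from itertools import product
from collections import deque

def height_metric(f, edges, u, v):
    """d_f(u, v): least height of an interval [a, b] such that u and v lie
    in the same path component of f^{-1}([a, b]).  For monotone edges it
    suffices to keep whole edges inside the interval and to try interval
    endpoints among the vertex values."""
    vals = sorted(set(f.values()))
    lo, hi = min(f[u], f[v]), max(f[u], f[v])
    best = float('inf')
    for a, b in product((x for x in vals if x <= lo),
                        (x for x in vals if x >= hi)):
        if b - a >= best:
            continue
        # BFS inside the subgraph induced by vertices with a <= f <= b
        keep = {w for w in f if a <= f[w] <= b}
        adj = {w: [] for w in keep}
        for x, y in edges:
            if x in keep and y in keep:
                adj[x].append(y); adj[y].append(x)
        seen, queue = {u}, deque([u])
        while queue:
            w = queue.popleft()
            for z in adj[w]:
                if z not in seen:
                    seen.add(z); queue.append(z)
        if v in seen:
            best = b - a
    return best
```

For instance, on a path a–b–c with f(a) = 0, f(b) = 2, f(c) = 1, reaching c from a forces the interval [0, 2], so d_f(a, c) = 2 even though |f(a) − f(c)| = 1.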
Definition 7.8 (Functional distortion distance). Given two Reeb graphs (F, f ) and (G, g), and a
pair of continuous maps Φ : F → G and Ψ : G → F, set

C(Φ, Ψ) := { (x, y) ∈ F × G | Φ(x) = y or x = Ψ(y) }

and

D(Φ, Ψ) := sup (x,y),(x′ ,y′ )∈C(Φ,Ψ) (1/2) | d f (x, x′ ) − dg (y, y′ ) |.
The functional distortion distance between (F, f ) and (G, g) is defined as:

dFD (F, G) := inf Φ,Ψ max { D(Φ, Ψ), ∥ f − g ◦ Φ∥∞ , ∥g − f ◦ Ψ∥∞ },

where the infimum is over pairs of continuous maps Φ : F → G and Ψ : G → F.
Note that the maps Φ and Ψ are not required to preserve function values; however, the terms
∥ f − g ◦ Φ∥∞ and ∥g − f ◦ Ψ∥∞ bound the difference in function values under the maps Φ and Ψ. If
we ignore these two terms ∥ f − g ◦ Φ∥∞ and ∥g − f ◦ Ψ∥∞ , and if we do not assume that Φ and Ψ
have to be continuous, then dFD is simply the Gromov-Hausdorff distance between the metric
spaces (F, d f ) and (G, dg ) [175]. The above definition is thus a function-adapted version of the
continuous Gromov-Hausdorff distance.²
Properties of the distances. The two distances we introduced turn out to be strongly equivalent.
Furthermore, it is known that for Reeb graphs F, G derived from two “nice” functions f, g :
X → R defined on the same domain X, both distances are stable [20, 116].
Definition 7.9 (Stable distance). Given f, g : X → R, let (F, f˜) and (G, g̃) be the Reeb graphs of f
and g, respectively. We say that a Reeb graph distance dR is stable if

dR ((F, f˜), (G, g̃)) ≤ ∥ f − g∥∞ .
Finally, it is also known that these distances are bounded from below (up to a constant factor)
by the bottleneck distance between the persistence diagrams associated to the two input Reeb
graphs. In particular, given (F, f ) (and similarly for (G, g)), consider the 0-th persistence diagram
Dgm0 ( f ) induced by the levelset zigzag filtration of f as in the previous section. We consider only
the 0-th persistent homology since each levelset f −1 (a) consists of only a finite set of points. We
have the following result (see Theorem 3.2 of [32]).
Theorem 7.7. db (Dgm0 ( f ), Dgm0 (g)) ≤ 2dI (F, G) ≤ 2dFD (F, G).
Universal Reeb graph distance. We introduced two Reeb graph distances above. There are
other possible distances for Reeb graphs, such as the edit distance originally developed for Reeb
graphs induced by functions on curves and surfaces. All these distances are stable, which is an im-
portant property to have. The following concept allows one to identify the most “discriminative”
Reeb graph distance among all stable distances.
Definition 7.10. A Reeb graph distance dU is universal if and only if (i) dU is stable; and (ii) for
any other stable Reeb graph distance dS , we have dS ≤ dU .
² It turns out that if one removes the requirement of continuity on Φ and Ψ, the resulting functional distortion
distance takes values within a constant factor of the dFD we defined for the case of Reeb graphs.
It has been shown that neither the interleaving distance nor the functional distortion distance
is universal. On the other hand, for Reeb graphs of piecewise-linear functions defined on com-
pact triangulable spaces, such a universal Reeb graph distance indeed exists. In particular, one
can construct a universal Reeb graph distance via a pullback idea to a common space; see [21].
The authors of [21] propose two further edit-like distances for Reeb graphs, both of which are
universal.
Computation. Unfortunately, except for the bottleneck distance db , the computation of any of
the distances mentioned above is at least as hard as graph isomorphism. In fact, even for merge
trees (which are a simpler variant of the Reeb graph, described in Definition 7.2 at the end of
Section 7.1), it is NP-hard to compute the interleaving distance between them [6]. But for this
special case, a fixed-parameter tractable algorithm exists [289].
The interleaving distance of merge trees was originally introduced by Morozov et al. in
[237]. The interleaving distance for the Reeb graphs is more complicated, and was introduced
by de Silva et al. [116]. There is also an equivalent cosheaf-theoretic way of defining the
interleaving distance; its description involves sheaf theory [112]. The functional distortion
distance for Reeb graphs was originally introduced in [20], and its relation to interleaving distance
was studied in [24]. The lower bound in Theorem 7.7 was proven in [32], while some weaker
bounds were given earlier in [47, 24]. An interesting distance between Reeb graphs can be defined
by mapping its levelset zigzag persistence module to a 2-parameter persistence module. See the
Notes in Chapter 12 for more details. The edit distance for Reeb graphs induced by functions
on curves or surfaces has been proposed in [158, 159]. Finally, the universality of Reeb graph
distances and universal (edit-like) distances for Reeb graphs were proposed and studied in [21].
It remains an interesting open question whether the interleaving distance (and thus functional
distortion distance) is within a constant factor of the universal Reeb graph distance.
Exercises
1. Suppose we are given a triangulation K of a 2-dimensional square. Let f : |K| → R be a
PL-function on K induced by a vertex function f : V(K) → R. Assume that all vertices
have distinct function values.
3. Recall the vertical homology group introduced in Section 7.2.3. Suppose we are given
compact spaces X ⊂ Y and a function f : Y → R; without loss of generality, denote the
restriction of f over X also by f : X → R. Prove that the inclusion induces a well-defined
homomorphism ι∗p : Ȟ p (X) → Ȟ p (Y) between the vertical homology groups Ȟ p (X) and
Ȟ p (Y) w.r.t. f .
4. Recall the concept of merge tree introduced in Definition 7.2 and Figure 7.3 (c). An alter-
native way to define interleaving distance for merge trees is as follows [237]:
First, a merge tree (T, h) can be treated as a rooted tree where the function h serves as the
height function, and the function value from the root to any leaf is monotonically decreas-
ing. We also extend the root upward to +∞. See Figure 7.9 (a). Given any point x ∈ |T |,
we can then refer to any point along the path from x to +∞ as its ancestor; in particular, we
define xa , called the a-shift of x, as the ancestor of x with function value h(x) + a.
5. Given a finite simplicial complex K, let nd denote the number of d-dimensional simplices in
K. Let f be a PL-function on K induced by f : V(K) → R, and assume that all n0 vertices
in V(K) are already sorted in non-decreasing order of f . Describe an algorithm to compute
the merge tree for K w.r.t. f , and give the time complexity of your algorithm. (Make your
algorithm as efficient as possible.)
Figure 7.9: (a) The point z is the a-shift of both x and y. (b) An example of input points sampling a
hidden graph (Q-shaped curve). (c) The r-Rips complex spanned by these points “approximates”
a thickened version of the hidden graph G ⊂ R2 . The Reeb graph for distance to a basepoint will
then aim to recover this hidden graph.
6. [Programming exercise]: Let P be a set of points in Rd . Imagine that points in P are sam-
pled around a hidden graph G ⊂ Rd ; in particular, P is an ε-sample of G. See Figure 7.9 (b)
and (c). Implement the following algorithm to compute a graph from P as an approximation
of the hidden graph G.
Step 1 : Compute the Rips complex K := VRr (P) for a parameter r. Assume K is con-
nected. (If not, perform the following for each connected component of K). Assign
the weight of each edge in the 1-skeleton K 1 of K to be its length.
Step 2 : Choose a point q ∈ P as the base point. Let f : P → R be the shortest path
distance function from any point p ∈ P to the base point q in the weighted graph K 1 .
Step 3 : Compute the Reeb graph Ĝ of the PL-function induced by f , and return Ĝ.

The returned Reeb graph Ĝ can serve as an approximation of the hidden graph G. See
[167, 88] for analysis of variants of the above procedure.
Chapter 8

Topological Analysis of Graphs
In this chapter, we present some examples of topological tools that help analyze or summarize
graphs. In the previous chapter, we discussed one specific type of graph, the Reeb graph, obtained
by quotienting a space with the connected components of levelsets of a given function. Abstractly,
a Reeb graph can also be considered as a graph equipped with a height function. In this chapter,
we focus on general graphs. Structures such as cliques in a graph correspond to simplices, as we
have seen in Vietoris-Rips complexes. They can help summarize or characterize graph data.
See Figure 8.1 for an example [262], where a directed graph is used to model the synaptic network
of neurons built by taking neurons as the vertices and the synaptic connections directed from pre-
to postsynaptic neurons as the directed edges. It is observed that there is an unusually high number
Figure 8.1: (A) shows examples of two directed cliques (simplices) formed in the synap-
tic network. (B) shows the number of p-simplices for different types of graphs, where
“Bio-M" is the synaptic network from reconstructed neurons. Note that this neuronal net-
work has far more directed cliques than other biological or random graphs. (C) shows
that the count of directed cliques further differs depending on the layers in which the neurons reside.
Image taken from [262], licensed by Michael W. Reimann et al. (2017) under CC BY 4.0
(https://creativecommons.org/licenses/by/4.0/).
of directed cliques (viewed as a simplex as we show in Section 8.3.1) in such networks, compared
to other biological networks or random graphs. Topological analysis such as the one described in
Section 8.3 can facilitate such applications.
Before considering directed graphs, we focus on topological analysis of undirected graphs in
Sections 8.1 and 8.2. We present topological approaches to summarize and compare undirected
graphs. In Section 8.3, we discuss how to obtain topological invariants for directed graphs. In
particular, we describe two ways of defining homology for directed graphs. The first approach
constructs an appropriate simplicial complex over an input directed graph and then takes the cor-
responding simplicial homology of this simplicial complex (Section 8.3.1). The second approach
considers the so-called path homology for directed graphs, which differs from the simplicial ho-
mology. It is based on constructing a specific chain complex directly from directed paths in the
input graph, and defining a homology group using the boundary operators associated with the
resulting chain complex (Section 8.3.2). It turns out that both path homology and the persistent
version of it can be computed via a matrix reduction algorithm similar to the one used in the
standard persistence algorithm for simplicial filtrations though with some key differences. We de-
scribe this algorithm in Section 8.3.3, and mention an improved algorithm for the 1-st homology.
Clique complex view. Given a graph G = (V, E), its induced clique complex, also called the
flag complex, is defined as follows.
Definition 8.1 (Clique complex). Given a graph G = (V, E), a clique simplex σ of dimension k is
a simplex whose vertex set forms a (k + 1)-clique in G.

By definition, every face of a clique simplex is also a clique simplex. Therefore, the collection of
all clique simplices forms a simplicial complex CG called the clique complex of G. In other words,
the vertices of any (k + 1)-clique in G span a k-simplex in CG .
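As a small illustration of Definition 8.1, the following brute-force enumeration (fine for small graphs; the function name is ours) lists every clique simplex up to a given dimension:

```python
from itertools import combinations

def clique_complex(vertices, edges, max_dim):
    """All clique simplices of G up to dimension max_dim: a set of k+1
    vertices spans a k-simplex iff every pair is joined by an edge."""
    E = {frozenset(e) for e in edges}
    simplices = []
    for k in range(max_dim + 1):
        for sigma in combinations(sorted(vertices), k + 1):
            if all(frozenset(p) in E for p in combinations(sigma, 2)):
                simplices.append(sigma)
    return simplices
```

On the graph with triangle abc and pendant edge cd, this returns the four vertices, the four edges, and the single 2-simplex (a, b, c).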
Fact 8.1. A positively weighted graph G = (V, E, ω) induces a metric graph (|G|, dG ).
Indeed, viewing G as a simplicial 1-complex, let |G| be the underlying space of G. For every
edge e ∈ E, consider the arclength parameterization e : [0, ω(e)] → |e|, and define dG (x, y) =
|e−1 (y) − e−1 (x)| for every pair x, y ∈ |e|. The length of any path π(u, v) between two points
u, v ∈ |G| is the sum of the lengths of the restrictions of π to edges in G. The distance dG (u, v)
between any two points u, v ∈ |G| is the minimum length of any path connecting u to v in |G|;
this defines a metric. The metric space (|G|, dG ) is the metric graph of G.
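A minimal sketch of this construction restricted to the vertex set (our naming): edge weights are read as lengths, and dG between vertices is computed by Dijkstra's algorithm.

```python
import heapq

def graph_metric(weights):
    """Restriction of the metric d_G to the vertex set of a positively
    weighted undirected graph; weights: dict {(u, v): length}."""
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    def dijkstra(src):
        dist = {src: 0.0}
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float('inf')):
                continue                      # stale heap entry
            for v, w in adj[u]:
                if d + w < dist.get(v, float('inf')):
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        return dist

    return {u: dijkstra(u) for u in adj}
```

On the unit-weight 4-cycle a–b–c–d, for example, opposite vertices are at distance 2.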
Intrinsic Čech and Vietoris-Rips filtrations. Given a metric graph (|G|, dG ), let Bo|G| (x; r) :=
{y ∈ |G| | dG (x, y) < r} denote the radius-r open metric ball centered at x ∈ |G|. Following
¹ If G is unweighted, then ω : E → R is the constant function ω(e) = 1 for every e ∈ E.
Definitions 2.9 and 2.10², the intrinsic Čech complex Cr (|G|) and intrinsic Vietoris-Rips complex
VRr (|G|) are defined as:

Cr (|G|) := { {x0 , . . . , x p } | ∩i∈[0,p] Bo|G| (xi ; r) ≠ ∅ };
VRr (|G|) := { {x0 , . . . , x p } | dG (xi , x j ) < 2r for all i, j ∈ [0, p] }.
Remark 8.1. Observe that intrinsic Čech and Vietoris-Rips complexes as defined above are in-
finite complexes because we consider all points in the underlying space. Alternatively, G =
(V, E, ω) can also be viewed as a discrete metric space (V, d̂) where d̂ : V × V → R+ ∪ {0} is the
restriction of dG to graph nodes V of G. We can thus build discrete intrinsic Čech or Vietoris-Rips
complexes spanned by only the vertices of G. If G is a complete graph, then the discrete Vietoris-Rips
complex at scale r is equivalent to the clique complex of Gr as introduced in Section 8.1.1. Most
of our discussions below have analogues in the discrete case.
We now consider the intrinsic Čech filtration C := {Cr }r∈R and intrinsic Vietoris-Rips filtra-
tion R := {VRr }r∈R , and their induced persistence modules H p C := {H p (Cr )}r∈R and H p R :=
{H p (VRr )}r∈R . We have (see [81]):
Fact 8.2. Given a finite metric graph (|G|, dG ) induced by G = (V, E, ω), the persistence modules
H p C and H p R are both q-tame (recall the definition of q-tame in Section 3.4).
Hence both the intrinsic Čech and intrinsic Vietoris-Rips filtrations induce well-defined per-
sistence diagrams, which can be used as summaries (signatures) for the input graph G = (V, E, ω).
In what follows, we present some results on the homotopy types of these simplicial complexes,
as well as their induced persistent homology.
Topology of Čech and Vietoris-Rips complexes. The intrinsic Čech and Vietoris-Rips com-
plexes induced by a metric graph may have non-trivial high-dimensional homology groups. The
following results from [2] provide a precise characterization of the homotopy types of these
complexes for a metric graph whose underlying space is a circle. Specifically, let S1 denote the
circle of unit circumference which is assumed for simplicity; the results below can be extended
to a circle of any length by appropriate scaling. Let Sd denote the d-dimensional sphere.
Theorem 8.1. Let 0 < r < 1/2. There are homotopy equivalences: for ℓ = 0, 1, . . . ,

Cr (S1 ) ≃ S2ℓ+1 if ℓ/(2(ℓ + 1)) < r ≤ (ℓ + 1)/(2(ℓ + 2)); and

VRr/2 (S1 ) ≃ S2ℓ+1 if ℓ/(2ℓ + 1) < r/2 ≤ (ℓ + 1)/(2ℓ + 3).
We remark that if one uses closed balls to define these complexes, then the statements are
similar but involve some additional technicalities; see [2].
Much less is known for more general metric graphs. Below we present two sets of results:
Theorem 8.2 characterizes the intrinsic Vietoris-Rips complexes for a certain family of metric
² Note that here we use open metric balls instead of closed metric balls to define the Čech and Rips complexes, so
that the theoretical result in Theorem 8.1 is cleaner to state.
graphs [3]; while Theorem 8.3 characterizes only the 1-st persistent homology induced by the
intrinsic Čech complexes, but for any finite metric graph [166]. Recall that H̃ p denotes the p-th
reduced homology group.
Theorem 8.2. Let G be a finite metric graph, with each edge of length one, that can be obtained
from a vertex by iteratively attaching (i) an edge along a vertex or (ii) a k-cycle graph along
a vertex or a single edge for k > 2 (see, e.g., Figure 8.2). Then we have that H̃ p (VR(G; r)) ≈
⊕_{i=1}^{n} H̃ p (VR(Cki ; r)), where ⊕ stands for the direct sum, n is the number of times operation (ii) is
performed, and Cki is a loop of ki edges (and thus of length ki ) which was attached the
i-th time that operation (ii) was performed.
Figure 8.2: A 4-cycle C4 is attached to the base graph along vertex v; while a 6-cycle C6 is
attached to the base graph along edge (u, w).
The above theorem can be relaxed to allow for different edge lengths though one needs to
define the “gluing” more carefully in that case. See [3] for details. Graphs described in Theorem
8.2 are intuitively generated by iteratively gluing a simple loop along a “short” simple path in
the existing graph. Note that the above theorem implies that the Vietoris-Rips complex of a
connected metric tree has trivial reduced homology, like a point.
Persistent homology induced by Čech complexes. Instead of a fixed scale, Theorem 8.3 below
provides a complete characterization for the 1-st persistent homology of intrinsic Čech complex
filtration of a general finite metric graph. To present the result, we recall the concept of the
shortest cycle basis (optimal basis) for H1 (G) while treating G = (V, E, ω) as a simplicial 1-
complex (Definition 5.3). Specifically, in our setting, given any 1-cycle γ = ei1 + ei2 + · · · + eis ,
define the length of γ to be length(γ) = Σ_{j=1}^{s} ω(ei j ). A cycle basis of G refers to a set of g 1-
cycles Γ = {γ1 , . . . , γg } that form a basis for the 1-dimensional cycle group Z1 (G). Notice that we
can replace H1 (G) with the cycle group Z1 (G) because the two are isomorphic in the case of graphs.
Given a cycle basis Γ, its length-sequence is the sequence of lengths of elements of the basis
in non-decreasing order. A cycle basis of G is a shortest cycle basis if its length-sequence is
lexicographically minimal among all cycle bases of G.
Theorem 8.3. Let G = (V, E, ω) be a finite graph with positive weight function ω : E → R. Let
{γ1 , . . . , γg } be a shortest cycle basis of G where g = rank (Z1 (G)), and for each i = 1, . . . , g, let
ℓi = length(γi ). Then, the 1-st persistence diagram Dgm1 C induced by the intrinsic Čech filtration
C := {Cr (|G|)}r∈R on the metric graph (|G|, dG ) consists of the following set of points on the y-axis:

Dgm1 C = { (0, ℓi /4) | 1 ≤ i ≤ g }.
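Theorem 8.3 thus reduces Dgm1 C to the length-sequence of a shortest cycle basis. The sketch below (our own, with hypothetical names, not the book's algorithm) computes that sequence for a small connected graph using Horton's candidate cycles and a greedy GF(2) independence test, then reports the points (0, ℓi/4); it assumes consistent shortest-path tie-breaking and is not optimized.

```python
import heapq

def _sssp(adj, src):
    # Dijkstra shortest-path tree rooted at src
    dist, pred = {src: 0.0}, {src: None}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v], pred[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return dist, pred

def dgm1_metric_graph(weights):
    """Points (0, l_i/4) of Dgm1, with l_i the length-sequence of a shortest
    cycle basis (Horton candidates + greedy GF(2) independence)."""
    edges = list(weights)
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    idx = {frozenset(e): i for i, e in enumerate(edges)}
    g = len(edges) - len(adj) + 1          # rank of Z_1(G), G connected

    def cycle_length(vec):                 # true weight of an edge-set vector
        return sum(weights[e] for i, e in enumerate(edges) if vec >> i & 1)

    # Horton candidates: tree paths v->x and v->y closed by the edge (x, y);
    # XOR-ing edge indicator bits cancels any shared tree edges.
    cands = set()
    for v in adj:
        dist, pred = _sssp(adj, v)
        for (x, y) in edges:
            if v in (x, y) or x not in dist or y not in dist:
                continue
            vec = 1 << idx[frozenset((x, y))]
            for end in (x, y):
                while pred[end] is not None:
                    vec ^= 1 << idx[frozenset((end, pred[end]))]
                    end = pred[end]
            if vec:
                cands.add((cycle_length(vec), vec))

    # Greedy matroid step: scan by length, keep GF(2)-independent vectors.
    lengths, pivots = [], {}
    for length, vec in sorted(cands):
        while vec:
            p = vec.bit_length() - 1
            if p not in pivots:
                pivots[p] = vec
                lengths.append(length)
                break
            vec ^= pivots[p]
        if len(lengths) == g:
            break
    return [(0.0, l / 4.0) for l in lengths]
```

For a unit-weight square with one diagonal, the shortest cycle basis consists of the two triangles of length 3, giving two diagram points (0, 3/4).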
Let G and D denote the space of finite metric graphs and the space of finite persistence diagrams,
respectively; and let 2D denote the space of all subsets of D. We define

φ : G → 2D with φ(|G|) := { Dgm( f s ) | s ∈ |G| }, where f s := dG (s, ·) is the geodesic distance
function to the base point s ∈ |G|.

In other words, φ maps a metric graph |G| to a set of (infinitely many) points φ(|G|) in the space of
persistence diagrams D. The image φ(|G|) is another graph in the space of persistence diagrams,
though this map φ is not necessarily injective.
Now let (|G1 |, dG1 ) and (|G2 |, dG2 ) denote the metric graphs induced by finite graphs G1 =
(V1 , E1 , ω1 ) and G2 = (V2 , E2 , ω2 ) with positive edge weights.
Definition 8.2 (Persistence distortion distance). Given finite metric graphs (|G1 |, dG1 ) and (|G2 |, dG2 ),
the persistence-distortion distance between them, denoted by dPD (G1 , G2 ), is the Hausdorff dis-
tance dH (φ(|G1 |), φ(|G2 |)) between the two image sets φ(|G1 |) and φ(|G2 |) in the space of persistence
diagrams (D, db ) equipped with the bottleneck distance db . In other words, setting A := φ(|G1 |)
and B := φ(|G2 |), we have

dPD (G1 , G2 ) := dH (φ(|G1 |), φ(|G2 |)) = max { max P∈A min Q∈B db (P, Q), max Q∈B min P∈A db (P, Q) }.
The persistence distortion dPD is a pseudo-metric. It can be computed in polynomial time for
finite input graphs. It is stable w.r.t. the Gromov-Hausdorff distance between the two input metric
graphs.
Figure 8.3: (a) a 3-clique and a 4-clique with source a and sink c. (b) A directed graph (left) and
its directed clique complex (right). The set of triangles in this complex is {bce, ced, ed f }. There
are no higher-dimensional simplices. Note that if the edge (b, d) were also in the directed graph in (b),
then the tetrahedron bcde would be in its corresponding directed clique complex.
Directed clique complex. A node in a directed graph is a source node if it has in-degree 0;
and it is a sink node if it has out-degree 0. A directed cycle is a sequence of directed edges
(v0 , v1 ), (v1 , v2 ), . . . , (vk , v0 ). A graph is a directed acyclic graph (DAG) if it does not contain any
directed cycle. A graph ({v1 , . . . , vk }, E′ ) is a directed k-clique if (i) there is exactly one edge
between any pair of (unordered) vertices (thus there are k(k − 1)/2 edges in E′ ), and (ii) it is a
DAG. See Figure 8.3 (a) for examples. A set of vertices {vi1 , . . . , vik } spans a directed clique in
G = (V, E⃗ ) if there is a subset of edges E′ ⊆ E⃗ such that ({vi1 , . . . , vik }, E′ ) is a directed k-clique.
It is easy to see that, given a directed clique, any subset of its vertices also forms a directed clique
(Exercise 5).
Hence a k-clique spans a (k−1)-simplex in the directed clique complex. See Figure 8.3 (b) for
a simple example. Now given a weighted directed graph G = (V, E⃗, ω), for any a ≥ 0, let Ga be
the subgraph of G spanned by all directed edges whose weight is at most a. Assuming all edges
e1 , . . . , em , m = |E⃗|, are sorted by their weights in non-decreasing order, set ai = ω(ei ). Similar
to the clique complex filtration for undirected graphs introduced in Section 8.1.1, this gives rise
to the following filtration of simplicial complexes induced by the directed clique complexes:

Ĉ(Ga1 ) ↪ Ĉ(Ga2 ) ↪ · · · ↪ Ĉ(Gam ).
One can then use the persistence diagram induced by the above filtration as a topological invariant
for the input directed graph G.
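As an aside for experimentation, the two conditions in the definition above (exactly one edge per unordered pair, and acyclicity) can be checked directly. The following Python sketch is ours rather than the book's; the helper names (`is_directed_clique`, `directed_clique_complex`) are hypothetical, and the DAG test works by repeatedly peeling off sinks.

```python
from itertools import combinations

def is_directed_clique(vertices, edges):
    """Check that `vertices` span a directed clique: exactly one directed
    edge between each unordered pair, and the spanned subgraph is a DAG."""
    sub = {(u, v) for (u, v) in edges if u in vertices and v in vertices}
    for a, b in combinations(vertices, 2):
        if ((a, b) in sub) == ((b, a) in sub):   # zero or two edges between a, b
            return False
    # DAG check by repeatedly removing sinks (out-degree 0 within the clique)
    remaining, sub2 = set(vertices), set(sub)
    while remaining:
        sinks = {v for v in remaining if not any(u == v for (u, _) in sub2)}
        if not sinks:
            return False                          # a directed cycle remains
        remaining -= sinks
        sub2 = {(u, v) for (u, v) in sub2 if u in remaining and v in remaining}
    return True

def directed_clique_complex(vertices, edges, max_dim=3):
    """All simplices (as frozensets of vertices) of the directed clique complex."""
    simplices = {frozenset([v]) for v in vertices}
    for k in range(2, max_dim + 2):               # k vertices span a (k-1)-simplex
        for vs in combinations(vertices, k):
            if is_directed_clique(vs, edges):
                simplices.add(frozenset(vs))
    return simplices
```

For the filtration, one would call `directed_clique_complex` on each subgraph $G_{a_i}$ in turn.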
Definition 8.4 (Dowker complex). Given a weighted directed graph $G = (V, \vec{E}, \omega)$ and a threshold $\delta$, the Dowker $\delta$-sink complex is the following simplicial complex:
$$D^{si}_{\delta}(G) := \{\sigma = \{v_{i_0}, \ldots, v_{i_d}\} \mid \text{there exists } v \in V \text{ so that } \omega(v_{i_j}, v) \leq \delta \text{ for all } j \in [0, d]\}. \quad (8.3)$$
In the above definition, v is called a δ-sink for the simplex σ. In the example on the right of Figure 8.3 (a), assume all edges have weight 1. If we now remove the edge (b, d), then abd is no longer a directed 3-clique in $G_{\delta=1}$. However, abd still forms a 2-simplex in the Dowker sink complex $D^{si}_1$, with sink c.
In general, as δ increases, we obtain a sequence of Dowker complexes connected by inclusions, called the Dowker sink filtration $\mathcal{D}^{si}(G) = \{D^{si}_{\delta} \hookrightarrow D^{si}_{\delta'}\}_{\delta \leq \delta'}$.
Alternatively, one can define the Dowker $\delta$-source complex in a symmetric manner:
$$D^{so}_{\delta}(G) := \{\sigma = \{v_{i_0}, \ldots, v_{i_d}\} \mid \text{there exists } v \in V \text{ so that } \omega(v, v_{i_j}) \leq \delta \text{ for all } j \in [0, d]\} \quad (8.4)$$
resulting in a Dowker source filtration $\mathcal{D}^{so}(G) = \{D^{so}_{\delta} \hookrightarrow D^{so}_{\delta'}\}_{\delta \leq \delta'}$. It turns out that by the duality theorem of Dowker [147], the two Dowker complexes have isomorphic homology groups. It can be further shown that the choice of Dowker complexes does not matter when persistent homology is considered [99].
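Membership in Eqn. (8.3) is easy to test by brute force on small graphs. The sketch below is illustrative only; the function names are hypothetical, and it follows Eqn. (8.3) literally (in particular, since self-loops are absent, a vertex is itself a simplex only if some δ-sink receives an edge from it).

```python
from itertools import combinations

def dowker_sink_simplices(vertices, weight, delta, max_dim=2):
    """Simplices of the Dowker delta-sink complex: sigma = {v_i0, ..., v_id}
    is a simplex iff some v in V receives an edge of weight <= delta from
    every vertex of sigma.  `weight` maps directed pairs (u, v) to edge
    weights; missing pairs count as infinity."""
    w = lambda u, v: weight.get((u, v), float("inf"))
    simplices = []
    for d in range(0, max_dim + 1):
        for sigma in combinations(vertices, d + 1):
            if any(all(w(u, v) <= delta for u in sigma) for v in vertices):
                simplices.append(sigma)
    return simplices

def dowker_source_simplices(vertices, weight, delta, max_dim=2):
    """Symmetric construction using edges out of a common delta-source."""
    w = lambda u, v: weight.get((u, v), float("inf"))
    return [sigma for d in range(0, max_dim + 1)
            for sigma in combinations(vertices, d + 1)
            if any(all(w(v, u) <= delta for u in sigma) for v in vertices)]
```

On the 4-clique example above with edge (b, d) removed, the triangle (a, b, d) is still produced, with c as its δ-sink.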
Theorem 8.5 (Dowker Duality). Given a directed graph $G = (V, \vec{E}, \omega)$, for any threshold $\delta \in \mathbb{R}$ and dimension $p \geq 0$, we have $H_p(D^{si}_{\delta}) \cong H_p(D^{so}_{\delta})$. Furthermore, the persistence modules induced by the Dowker sink and the Dowker source filtrations are isomorphic as well.
In contrast to the complex-based constructions above, path homology uses the directed graph to define a chain complex directly. The resulting path homology group has interesting mathematical structure behind it; e.g., there is a notion of homotopy in directed graphs under which path homology is preserved, and it accommodates the Künneth formula [186].

Note that in this chapter, we assume that a given directed graph $G = (V, \vec{E})$ does not contain self-loops (a self-loop being an edge (u, u) from u to itself). For notational simplicity, below we sometimes use the index i to refer to vertex $v_i \in V = \{v_1, \ldots, v_n\}$.
Let $k$ be a field with 0 and 1 as its additive and multiplicative identities respectively. We use $-a$ to denote the additive inverse of $a$ in $k$. An elementary $p$-path on V is an ordered sequence $v_{i_0}, v_{i_1}, \ldots, v_{i_p}$ of $p+1$ vertices of V, which we denote by $e_{v_{i_0}, v_{i_1}, \ldots, v_{i_p}}$, or just $e_{i_0, i_1, \ldots, i_p}$ for simplicity. Let $\Lambda_p = \Lambda_p(G, k)$ denote the $k$-linear space of all linear combinations of elementary $p$-paths with coefficients from $k$. The set $\{e_{i_0, \ldots, i_p} \mid i_0, \ldots, i_p \in V\}$ forms a basis for $\Lambda_p$. Each element $c$ of $\Lambda_p$ is called a $p$-path or $p$-chain, and it can be written as
$$c = \sum_{i_0, \ldots, i_p \in V} a_{i_0 \cdots i_p}\, e_{i_0 \cdots i_p}, \quad \text{where } a_{i_0 \cdots i_p} \in k.$$
Similar to the case of simplicial complexes, we can define the boundary map $\partial_p : \Lambda_p \to \Lambda_{p-1}$ as:
$$\partial_p\, e_{i_0 \cdots i_p} = \sum_{k=0}^{p} (-1)^k\, e_{i_0 \cdots \hat{i}_k \cdots i_p}, \quad \text{for any elementary $p$-path } e_{i_0 \cdots i_p},$$
where $\hat{i}_k$ means the removal of the index $i_k$. The boundary of a $p$-path $c = \sum a_{i_0 \cdots i_p}\, e_{i_0 \cdots i_p}$ is thus $\partial_p c = \sum a_{i_0 \cdots i_p}\, \partial_p e_{i_0 \cdots i_p}$. For convenience, we set $\Lambda_{-1} = 0$ and note that $\Lambda_0$ is the set of $k$-linear combinations of vertices in V. It is easy to show that $\partial_{p-1} \circ \partial_p = 0$ for any $p > 0$. In what follows, we often omit the dimension $p$ from $\partial_p$ when it is clear from the context.
Next, we restrict attention to real paths in directed graphs formed by consecutive directed edges. Specifically, given a directed graph $G = (V, \vec{E})$, call an elementary $p$-path $e_{i_0, \ldots, i_p}$ allowed if there is an edge from $i_k$ to $i_{k+1}$ for all $k \in [0, p-1]$. Define $A_p$ as the space spanned by all allowed elementary $p$-paths, that is, $A_p := \mathrm{span}\{e_{i_0 \cdots i_p} : e_{i_0 \cdots i_p} \text{ is allowed}\}$. An elementary $p$-path $i_0 \cdots i_p$ is called regular if $i_k \neq i_{k+1}$ for all $k$, and irregular otherwise. Clearly, every allowed path is regular since there is no self-loop. However, applying the boundary map $\partial$ to a regular path may create irregular paths. For example, $\partial e_{uvu} = e_{vu} - e_{uu} + e_{uv}$ contains the irregular term $e_{uu}$. To deal with this, any term containing consecutive repeated vertices is taken to be 0. Thus, for the previous example, we have $\partial e_{uvu} = e_{vu} - 0 + e_{uv} = e_{vu} + e_{uv}$. The boundary map $\partial$ on $A_p$ is now taken to be the boundary map for $\Lambda_p$ restricted to $A_p$ with this modification, where all terms with consecutive repeated vertices created by the boundary map $\partial$ are replaced with 0. For simplicity, we still use the same symbol $\partial$ for this modified boundary map on the space of allowed paths.
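The modified boundary operator is straightforward to implement by dropping irregular faces. A small illustrative sketch (our own naming, not from the text), with chains represented as dicts from vertex tuples to coefficients:

```python
def boundary(chain):
    """Regularized boundary of a path chain {vertex_tuple: coeff}; faces
    with consecutive repeated vertices (irregular paths) are set to 0."""
    out = {}
    for path, a in chain.items():
        for k in range(len(path)):
            face = path[:k] + path[k + 1:]          # remove the k-th vertex
            if any(face[i] == face[i + 1] for i in range(len(face) - 1)):
                continue                            # irregular term -> 0
            out[face] = out.get(face, 0) + (-1) ** k * a
    return {p: c for p, c in out.items() if c != 0}
```

Running it on $e_{uvu}$ reproduces the computation above: the middle face $e_{uu}$ is dropped.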
After restricting the boundary operator to the spaces of allowed paths $A_p$, the inclusion $\partial A_p \subseteq A_{p-1}$ may not hold; that is, the boundary of an allowed $p$-path is not necessarily an allowed $(p-1)$-path. To this end, we adopt a stronger notion of allowed paths: a path $c$ is $\partial$-invariant if both $c$ and $\partial c$ are allowed. Let $\Omega_p := \{c \in A_p \mid \partial c \in A_{p-1}\}$ be the space generated by all $\partial$-invariant $p$-paths. Note that $\partial \Omega_p \subseteq \Omega_{p-1}$ (as $\partial^2 = 0$). This gives rise to the following chain complex of $\partial$-invariant allowed paths:
$$\cdots \longrightarrow \Omega_p \xrightarrow{\;\partial\;} \Omega_{p-1} \xrightarrow{\;\partial\;} \cdots \longrightarrow \Omega_1 \xrightarrow{\;\partial\;} \Omega_0 \xrightarrow{\;\partial\;} 0.$$
Definition 8.5 (Path homology). The p-th cycle group is defined as Z p = ker ∂|Ω p , and elements
in Z p are called p-cycles. The p-th boundary group is defined as B p = Im ∂|Ω p+1 , with elements of
B p called p-boundary cycles (or simply p-boundaries). The p-th path homology group is defined
as H p (G, k) = Z p /B p .
[Figure 8.4: a directed graph on vertices $v_1, \ldots, v_6$, used in the examples below.]
Examples. Consider the directed graph in Figure 8.4, and assume that the coefficient field $k \neq \mathbb{Z}_2$. Examples of elementary 1-paths include $e_{12}, e_{24}, e_{13}, e_{14}$, and so on. However, $e_{13}$ and $e_{14}$ are not allowed 1-paths. More examples of allowed 1-paths include $e_{12}+e_{46}$, $e_{12}+e_{31}$, $e_{46}+e_{65}+e_{45}$ and $e_{46}+e_{65}-e_{45}$. Note that any allowed 1-path is also $\partial$-invariant (that is, $\Omega_1 = A_1$), as all 0-paths are allowed. Observe that $\partial(e_{46}+e_{65}+e_{45}) = e_6 - e_4 + e_5 - e_6 + e_5 - e_4 = 2e_5 - 2e_4$, which is not 0 (unless the coefficient field $k = \mathbb{Z}_2$). However, $\partial(e_{46}+e_{65}-e_{45}) = 0$, meaning that $e_{46}+e_{65}-e_{45} \in Z_1$. Other examples of 1-cycles include
$$e_{12}+e_{23}+e_{31}, \quad e_{24}+e_{45}-e_{23}-e_{35}, \quad \text{and} \quad e_{12}+e_{24}+e_{45}-e_{35}+e_{31} \in Z_1.$$
Examples of elementary 2-paths include $e_{123}, e_{245}, e_{256}$ and $e_{465}$. However, $e_{256}$ is not allowed. Consider the allowed 2-path $e_{245}$: its boundary $\partial e_{245} = e_{45} - e_{25} + e_{24}$ is not allowed, as $e_{25}$ is not allowed. Hence the allowed 2-path $e_{245}$ is not $\partial$-invariant; similarly, neither $e_{235}$ nor $e_{123}$ is in $\Omega_2$. It is easy to check that $e_{465} \in \Omega_2$, as $\partial e_{465} = e_{65} - e_{45} + e_{46}$. Also note that while neither $e_{235}$ nor $e_{245}$ is in $\Omega_2$, the allowed 2-path $e_{245} - e_{235}$ is $\partial$-invariant, as
$$\partial(e_{245} - e_{235}) = e_{45} - e_{25} + e_{24} - e_{35} + e_{25} - e_{23} = e_{45} + e_{24} - e_{35} - e_{23} \in A_1.$$
This example shows that elementary $\partial$-invariant $p$-paths do not necessarily form a basis for $\Omega_p$; this is rather different from the case of simplicial complexes, where the set of $p$-simplices forms a basis for the $p$-th chain group.
The above discussion also shows that $e_{46}+e_{65}-e_{45},\; e_{24}+e_{45}-e_{23}-e_{35} \in B_1$. For the example in Figure 8.4:
• $\{e_{12}+e_{23}+e_{31},\; e_{46}+e_{65}-e_{45},\; e_{24}+e_{45}-e_{23}-e_{35}\}$ is a basis for the 1-cycle group $Z_1$;
• $\{e_{46}+e_{65}-e_{45},\; e_{24}+e_{45}-e_{23}-e_{35}\}$ is a basis for the 1-boundary group $B_1$; while
• $\{e_{245}-e_{235},\; e_{465}\}$ is a basis for the space of $\partial$-invariant 2-paths $\Omega_2$.
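The membership tests in this example can be replayed mechanically: a chain lies in $\Omega_2$ iff it is supported on allowed 2-paths and its regularized boundary is supported on allowed 1-paths. A sketch, assuming the edge set of Figure 8.4 as read off from the allowed paths listed above (an assumption, since the figure itself is not reproduced here):

```python
# Edges of the Figure 8.4 graph, inferred from the allowed paths above.
E = {(1, 2), (2, 3), (3, 1), (2, 4), (4, 5), (3, 5), (4, 6), (6, 5)}

def boundary(chain):
    """Regularized boundary: faces with consecutive repeats are dropped."""
    out = {}
    for path, a in chain.items():
        for k in range(len(path)):
            face = path[:k] + path[k + 1:]
            if any(face[i] == face[i + 1] for i in range(len(face) - 1)):
                continue
            out[face] = out.get(face, 0) + (-1) ** k * a
    return {p: c for p, c in out.items() if c}

def allowed(path):
    return all((path[k], path[k + 1]) in E for k in range(len(path) - 1))

def is_partial_invariant(chain):
    """c is in Omega_p iff c and its boundary consist of allowed paths."""
    return (all(allowed(p) for p in chain)
            and all(allowed(p) for p in boundary(chain)))
```

This confirms the discussion: $e_{465}$ and $e_{245}-e_{235}$ are $\partial$-invariant, while $e_{245}$ alone is not.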
Persistent path homology for directed graphs. Given a weighted directed graph $G = (V, \vec{E}, \omega)$, let $G_a$ denote the subgraph of G containing all directed edges with weight at most $a$. This gives rise to a filtration of graphs $\mathcal{G} : \{G_a \hookrightarrow G_b\}_{a \leq b}$. Let $H_p(G_a)$ denote the $p$-th path homology group of the graph $G_a$. It can be shown [100] that the inclusion $G_a \hookrightarrow G_b$ induces a well-defined homomorphism $\xi_p^{a,b} : H_p(G_a) \to H_p(G_b)$, and the sequence $\mathcal{G} : \{G_a \hookrightarrow G_b\}_{a \leq b}$ leads to a persistence module $H_p\mathcal{G} : \{H_p(G_a) \to H_p(G_b)\}_{a \leq b}$.
Algorithm setup. Given a $p$-path τ, its allowed-time is the smallest value (weight) $a$ at which it belongs to $A_p(G_a)$; we denote it by $\mathrm{at}(\tau) = a$. Let $\mathbb{A}_p = \{\tau_1, \ldots, \tau_t\}$ denote the set of elementary allowed $p$-paths, sorted by their allowed-times in non-decreasing order. Similarly, let $\mathbb{A}_{p-1} = \{\sigma_1, \ldots, \sigma_s\}$ be the sequence of elementary allowed $(p-1)$-paths sorted by their allowed-times in non-decreasing order. Let $a_1 < a_2 < \cdots < a_{\hat{t}}$ be the sequence of distinct allowed-times of elementary $p$-paths in $\mathbb{A}_p$ in increasing order. Obviously, $\hat{t} \leq t = |\mathbb{A}_p|$. Similarly, let $b_1 < b_2 < \cdots < b_{\hat{s}}$ be the sequence of distinct allowed-times for $(p-1)$-paths in $\mathbb{A}_{p-1}$ sorted in increasing order.
Note that $\mathbb{A}_p$ (resp. $\mathbb{A}_{p-1}$) forms a basis for $A_p(G)$ (resp. $A_{p-1}(G)$). In fact, for any $i$, set $\mathbb{A}_p^{a_i} := \{\tau_j \mid \mathrm{at}(\tau_j) \leq a_i\}$. It is easy to see that $\mathbb{A}_p^{a_i}$ equals $\{\tau_1, \ldots, \tau_{\rho_i}\}$, where
$$\rho_i \in [1, t] \text{ is the largest index of any elementary $p$-path whose allowed-time is at most } a_i; \quad (8.5)$$
and $\mathbb{A}_p^{a_i}$ forms a basis for $A_p(G_{a_i})$. Note that the cardinality of $\mathbb{A}_p^{a_i} \setminus \mathbb{A}_p^{a_{i-1}}$ could be larger than 1, which is why $\rho_i$ is not necessarily equal to $i$. A symmetric statement holds for $\mathbb{A}_{p-1}^{b_j}$ and $A_{p-1}(G_{b_j})$.
From now on, we fix a dimension $p$. At a high level, the algorithm for computing the $p$-th persistent path homology has the following three steps, which look similar to the algorithm that computes standard persistent homology for simplicial complexes. However, there are key differences in the implementation of these steps.
[Figure 8.5: panel (a) shows a weighted directed graph G on vertices a, b, c, d, e; panels (b) and (c) show the boundary matrix M and its reduced form $\widehat{M}$, with rows indexed by $e_{ab}, e_{cb}, e_{ad}, e_{ed}, e_{ce}, e_{bd}, e_{cd}$ and columns by $e_{ced}, e_{abd}, e_{cbd}$.]
Figure 8.5: The input is the weighted directed graph in (a). Its 1-dimensional boundary matrix M as constructed in (Step 1) is shown in (b). Note that $\mathrm{at}(e_{cd}) = +\infty$ (so $e_{cd} \notin \mathbb{A}_1(G)$). For each edge (i.e., elementary allowed 1-path) in G, its allowed-time is simply its weight. There are only three elementary allowed 2-paths, and their allowed-times are $\mathrm{at}(e_{ced}) = 5$, $\mathrm{at}(e_{abd}) = 10$ and $\mathrm{at}(e_{cbd}) = 10$. (c) shows the reduced matrix $\widehat{M}$. From this matrix, we can deduce that the 1-st persistence diagram (for path homology) includes two points: (10, 10) and (5, 10) (generated by the second and third columns). For the first column (corresponding to $e_{ced}$), $\mathrm{at}(\mathrm{col}_{\widehat{M}}[1]) = \infty$; hence the corresponding $\gamma_1$ is not $\partial$-invariant.
Description of Step 2. We now perform the standard left-to-right matrix reduction on M, where the only allowed operation is adding a column to some column on its right. We convert M to its reduced form $\widehat{M}$ (Definition 3.13); through this process, we also update the $\gamma_i$'s accordingly so that at any moment $\partial_p \gamma_i = \mathrm{col}_{M'}[i]$, where $M'$ is the updated boundary matrix at that point. In particular, if we add column $j$ to column $i > j$, then we update $\gamma_i = \gamma_i + \gamma_j$. We note that, other than the additional maintenance of the $\gamma$'s, this reduction of M is the same as the reduction in Algorithm 3:MatPersistence given in Section 3.3. The following claim follows easily from the facts that there are only left-to-right column additions and that the allowed-times of the $\gamma_i$'s are initially sorted in non-decreasing order.

Claim 8.1. For any $i \in [1, t]$, the allowed-time of $\gamma_i$ remains the same through any sequence of left-to-right column additions.
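The left-to-right reduction of (Step 2), together with the bookkeeping of the combinations $\gamma_i$, can be sketched as follows. This is a generic sketch over the rationals, not the book's implementation: columns are sparse dicts mapping row index to coefficient, lowId is taken as the largest non-zero row index, and over a general field the column addition carries a scalar.

```python
from fractions import Fraction

def reduce_left_to_right(M):
    """Column-reduce M (a list of sparse columns {row: coeff}) using only
    additions of an earlier column into a later one, as in Step 2.
    Returns the reduced columns and, for each column i, the combination
    gamma[i] = {original column index: coefficient} summed into it."""
    cols = [{r: Fraction(v) for r, v in c.items()} for c in M]
    gamma = [{i: Fraction(1)} for i in range(len(cols))]   # gamma_i = tau_i initially
    low_of = {}                                            # lowest row -> column owning it
    for i, col in enumerate(cols):
        while col:
            low = max(col)                                 # lowId: largest non-zero row
            j = low_of.get(low)
            if j is None:
                low_of[low] = i
                break
            s = -col[low] / cols[j][low]                   # scalar cancelling the low entry
            for r, v in cols[j].items():
                col[r] = col.get(r, 0) + s * v
                if col[r] == 0:
                    del col[r]
            for k, v in gamma[j].items():                  # maintain gamma_i accordingly
                gamma[i][k] = gamma[i].get(k, 0) + s * v
    return cols, gamma
```

The invariant $\partial_p\gamma_i = \mathrm{col}_{M'}[i]$ holds throughout, since both sides receive the same update at each addition.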
Let $\Omega_p^i$ denote the space of $\partial$-invariant $p$-paths w.r.t. $G_{a_i}$; that is, $\Omega_p^i = \Omega_p(G_{a_i})$. Given a $p$-path τ, let $\mathrm{ent}(\tau)$ be its entry-time, which is the smallest value $a$ such that $\tau \in \Omega_p(G_a)$. It is easy to see that for any $p$-path τ, we have
$$\mathrm{ent}(\tau) = \max\{\mathrm{at}(\tau), \mathrm{at}(\partial\tau)\}. \quad (8.6)$$
Recall that each column vector $\mathrm{col}_{\widehat{M}}[i]$ is in fact the vector representation of a $(p-1)$-path (with respect to the basis elements $\sigma_1, \ldots, \sigma_s$ of $\mathbb{A}_{p-1}$). Also, the allowed-time of a column $\mathrm{col}_{\widehat{M}}[i]$ is given by $\mathrm{at}(\mathrm{col}_{\widehat{M}}[i]) = \mathrm{at}(\sigma_h)$, where $h = \mathrm{lowId}(\mathrm{col}_{\widehat{M}}[i])$.
Claim 8.2. Given a reduced matrix $\widehat{M}$, let $C = \sum_{i=1}^{t} c_i\, \mathrm{col}_{\widehat{M}}[i]$ be a $(p-1)$-path. Let $\mathrm{col}_{\widehat{M}}[j]$ be the column with the lowest (i.e., largest) lowId among all columns $\mathrm{col}_{\widehat{M}}[i]$ with $c_i \neq 0$, and set $h = \mathrm{lowId}(\mathrm{col}_{\widehat{M}}[j])$. It then follows that $\mathrm{at}(C) = \mathrm{at}(\sigma_h)$.
Now, for the reduced matrix $\widehat{M}$ and any $i \in [1, \hat{t}]$, we set $\rho_i$ to be the largest index $j \in [1, t]$ such that $\mathrm{at}(\gamma_j) \leq a_i$. By Claim 8.1, each $p$-path $\gamma_j$ has a fixed allowed-time associated to it, which stays invariant through the reduction process. So the quantity $\rho_i$ is well defined, consistent with what we defined earlier in Eqn. (8.5), and remains invariant through the reduction process. Now set:
$$\Gamma^i := \{\gamma_1, \ldots, \gamma_{\rho_i}\},$$
$$I_i := \{j \leq \rho_i \mid \mathrm{at}(\mathrm{col}_{\widehat{M}}[j]) \leq a_i\}, \quad \text{and}$$
$$\Sigma^i := \{\gamma_j \mid \mathrm{ent}(\gamma_j) \leq a_i\} = \{\gamma_j \mid j \in I_i\}.$$
Theorem 8.6. For any $k \in [1, \hat{t}]$, $\Gamma^k$ forms a basis for $\mathcal{A}_p^k := A_p(G_{a_k})$, while $\Sigma^k$ forms a basis for $\Omega_p^k = \Omega_p(G_{a_k})$.
Proof. That $\Gamma^k$ forms a basis for $\mathcal{A}_p^k$ follows easily from the facts that the original set $\{\tau_1, \ldots, \tau_{\rho_k}\}$ forms a basis for $\mathcal{A}_p^k$ and that left-to-right column additions maintain this. In what follows, we prove that $\Sigma^k$ forms a basis for $\Omega_p^k$. First, note that all elements of $\Sigma^k$ represent paths in $\Omega_p^k$, and they are linearly independent by construction (as their low-row indices are distinct). So we only need to show that any element of $\Omega_p^k$ can be represented as a linear combination of vectors in $\Sigma^k$. Let $\xi_k$ denote the largest index $j \in [1, s]$ such that $\mathrm{at}(\sigma_j) \leq a_k$. An equivalent formulation of $I_k$ is then $I_k = \{j \leq \rho_k \mid \mathrm{lowId}(\mathrm{col}_{\widehat{M}}[j]) \leq \xi_k\}$.
Now consider any $\gamma \in \Omega_p^k \subseteq \mathcal{A}_p^k$. As $\Gamma^k$ forms a basis for $\mathcal{A}_p^k$, we have that
$$\gamma = \sum_{i=1}^{\rho_k} c_i \gamma_i \quad \text{and} \quad \partial\gamma = \sum_{i=1}^{\rho_k} c_i\, \partial\gamma_i = \sum_{i=1}^{\rho_k} c_i\, \mathrm{col}_{\widehat{M}}[i].$$
As $\gamma \in \Omega_p^k$ and $\mathrm{ent}(\gamma) = \max\{\mathrm{at}(\gamma), \mathrm{at}(\partial\gamma)\}$ (see Eqn. (8.6)), we have $\mathrm{at}(\gamma) \leq a_k$ and $\mathrm{at}(\partial\gamma) \leq a_k$. By Claim 8.2, it follows that for any $j \in [1, \rho_k]$ with $c_j \neq 0$, its lowId satisfies $\mathrm{lowId}(\mathrm{col}_{\widehat{M}}[j]) \leq \xi_k$. Hence each such index $j$ with $c_j \neq 0$ must belong to $I_k$, and as a result, γ can be written as a linear combination of $p$-paths in $\Sigma^k$. Combined with the fact that all vectors in $\Sigma^k$ lie in $\Omega_p^k$ and are linearly independent, it follows that $\Sigma^k$ forms a basis for $\Omega_p^k$.
Proof. Let $\hat{\partial}_p$ denote the restriction of $\partial_p$ to $\Omega_p$. Recall that $Z_p = \mathrm{Ker}\,\hat{\partial}_p$, while $B_{p-1} = \mathrm{Im}\,\hat{\partial}_p$. It is easy to see that, by construction of $Z^k$, we have $Z^k \subseteq \mathrm{Span}(Z^k) \subseteq Z_p(G_{a_k})$. Since the elements of $\Gamma^k$ are linearly independent, the vectors in $Z^k$ are linearly independent. It then follows that $|Z^k| \leq \mathrm{rank}(Z_p(G_{a_k}))$, where $|Z^k|$ stands for the cardinality of $Z^k$.

Similarly, as the matrix $\widehat{M}$ is reduced, all non-zero columns of $\widehat{M}$ are linearly independent, and thus the vectors in $B^k$ are linearly independent. Furthermore, by Theorem 8.6, each vector in $B^k$ is in $B_{p-1}(G_{a_k})$ (as it is the boundary of a $p$-path from $\Omega_p^k$). Hence we have that $\mathrm{Span}(B^k) \subseteq B_{p-1}(G_{a_k})$, and $|B^k| \leq \mathrm{rank}(B_{p-1}(G_{a_k}))$.
On the other hand, let $\hat{\partial}_p|_{\Omega_p^k}$ denote the restriction of $\hat{\partial}_p$ to $\Omega_p^k \subseteq \Omega_p$. By the Rank-Nullity Theorem,
$$|\Sigma_p^k| = \mathrm{rank}(\Omega_p^k) = \mathrm{rank}(\ker(\hat{\partial}_p|_{\Omega_p^k})) + \mathrm{rank}(\mathrm{im}(\hat{\partial}_p|_{\Omega_p^k})) = \mathrm{rank}(Z_p(G_{a_k})) + \mathrm{rank}(B_{p-1}(G_{a_k})).$$
As $|\Sigma_p^k| = |Z^k| + |B^k|$, combining the above equation with the inequalities obtained in the previous paragraphs, it must be that $|Z^k| = \mathrm{rank}(Z_p(G_{a_k}))$ and $|B^k| = \mathrm{rank}(B_{p-1}(G_{a_k}))$. The claim then follows.
Description of Step 3: constructing the persistence diagram from the reduced matrix $\widehat{M}$. Given a weighted directed graph $G = (V, \vec{E}, \omega)$, for each dimension $p \geq 0$, construct the boundary matrix $M_{p+1}$ as described above in (Step 1). Perform the left-to-right column reduction on $M_{p+1}$ to obtain a reduced form $\widehat{M} = \widehat{M}_{p+1}$ as in (Step 2). The $p$-th persistence diagram $\mathrm{Dgm}_p\mathcal{G}$, where $\mathcal{G} : \{G_a \hookrightarrow G_b\}_{a \leq b}$, can then be computed as follows.

Let $\mu_p^{a,b}$ denote the persistence pairing function; that is, the persistence point $(a, b)$ is in $\mathrm{Dgm}_p\mathcal{G}$ with multiplicity $\mu_p^{a,b}$ if and only if $\mu_p^{a,b} > 0$. At the beginning, $\mu_p^{a,b}$ is initialized to 0 for all $a, b \in \mathbb{R}$. We then inspect every non-zero column $\mathrm{col}_{\widehat{M}}[i]$ and take the following actions.
• If $\mathrm{at}(\mathrm{col}_{\widehat{M}}[i]) \neq \infty$, then we increase the pairing function $\mu_p^{\mathrm{at}(\mathrm{col}_{\widehat{M}}[i]),\, \mathrm{ent}(\gamma_i)}$ by 1, where $\gamma_i$ is the allowed elementary $(p+1)$-path corresponding to this column. Observe that $\mathrm{at}(\mathrm{col}_{\widehat{M}}[i]) \leq \mathrm{ent}(\gamma_i)$, because $\mathrm{ent}(\gamma_i) = \max\{\mathrm{at}(\gamma_i), \mathrm{at}(\partial\gamma_i)\}$ and $\mathrm{col}_{\widehat{M}}[i]$ represents exactly $\partial\gamma_i$.

• Otherwise, the path $\gamma_i$ corresponding to this column is not $\partial$-invariant (i.e., not in $\Omega_{p+1}$), and we do nothing.
• Finally, consider the reduced matrix $\widehat{M}_p$ for the $p$-th boundary matrix $M_p$ as constructed in (Step 1). Recall the construction of $J_k$ from Corollary 8.7. For any $j \in J_k$ such that $j$ does not appear as the low-row index of any column in $\widehat{M}_{p+1}$, we increase the pairing function $\mu_p^{\mathrm{at}(\tau), \infty}$ by 1, where τ is the elementary $p$-path corresponding to this column.
See Figure 8.5 for an example. Let $N_p$ denote the number of allowed elementary $p$-paths in G; obviously, $N_p = O(n^{p+1})$. However, as we saw earlier, the number of rows of $M_{p+1}$ is not necessarily bounded by $N_p$; we can only bound it by the number of elementary $p$-paths in G, which we denote by $\widehat{N}_p$. If we use standard Gaussian elimination for the column reduction as in Algorithm 3:MatPersistence, then the time complexity to compute the reduced matrix $\widehat{M}_{p+1}$ is $O(\widehat{N}_p^2\, N_{p+1})$. One can further improve this using fast matrix multiplication.
We note that, due to Theorem 8.6 and Corollary 8.7, the above algorithm is rather similar to the matrix reduction for standard persistent homology induced by simplicial complexes. However, the example in Figure 8.5 illustrates the differences.
Improved computation for the 1-st persistent path homology. The time complexity can be improved for computing the 0-th and 1-st persistent path homology. In particular, the 0-th persistent path homology coincides with the 0-th persistent homology induced by the filtration of clique complexes, and thus can be computed in $O(m\,\alpha(n) + n \log n)$ time using the union-find data structure, where $n = |V|$ and $m = |\vec{E}|$.
Figure 8.6: Boundary bigon, triangle and quadrangle. Such boundary cycles generate all 1-
dimensional boundary cycles.
For the 1-dimensional case, it turns out that the boundary group has further structure. In particular, the 1-dimensional boundary group is generated only by boundary cycles of the specific forms shown in Figure 8.6: bigons, triangles and quadrangles. The 1-st persistent path homology can thus be computed more efficiently by a different algorithm (from the matrix reduction algorithm above) that enumerates a certain family of boundary cycles of small cardinality which generates the boundary group. In particular, the cardinality of this family depends on the so-called arboricity $a(G)$ of G: ignoring the directions of edges in G (i.e., viewing it as an undirected graph), its arboricity $a(G)$ is the minimum number of edge-disjoint spanning forests into which G can be decomposed [183]. By the Nash-Williams theorem, the arboricity can equivalently be written as
$$a(G) = \max_{H \subseteq G} \left\lceil \frac{|E(H)|}{|V(H)| - 1} \right\rceil, \quad (8.7)$$
where H ranges over subgraphs of G with at least two vertices.
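Eqn. (8.7) can be evaluated by brute force on small graphs, since the density maximum is attained on an induced subgraph. An illustrative (exponential-time) sketch, ours rather than the algorithm of [131]:

```python
from itertools import combinations
from math import ceil

def arboricity(vertices, edges):
    """Nash-Williams formula: max over vertex subsets S (inducing the
    densest subgraphs) of ceil(|E(S)| / (|S| - 1)).  Edge directions are
    ignored.  Exponential in |V|, so suitable only for small examples."""
    und = {frozenset(e) for e in edges}         # view G as undirected
    best = 1
    for k in range(2, len(vertices) + 1):
        for S in combinations(vertices, k):
            m = sum(1 for e in und if e <= frozenset(S))
            best = max(best, ceil(m / (k - 1)))
    return best
```

For example, the complete graph $K_5$ has arboricity $\lceil 10/4 \rceil = 3$, while any forest has arboricity 1.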
Without describing the algorithm developed in [131], we present its computational complexity
for the 1-st persistent path homology in the following theorem.
Theorem 8.8. Given a directed weighted graph $G = (V, \vec{E}, w)$ with $n = |V|$, $m = |\vec{E}|$, and $N_p = O(n^{p+1})$ the number of allowed elementary $p$-paths, assume that the time to compute the rank of an $r \times r$ matrix is $O(r^{\omega})$. Let $d_{in}(v)$ and $d_{out}(v)$ denote the in-degree and out-degree of a node $v \in V$, and let $a(G)$ be the arboricity of G. Set $K = \min\{a(G)\,m,\; \sum_{(u,v) \in \vec{E}} (d_{in}(u) + d_{out}(u))\}$. Then we can compute the $p$-th persistent path homology within the time bounds of [131], expressed in terms of K and the exponent ω.

In particular, the arboricity $a(G) = O(1)$ for planar graphs; thus it takes $O(n^{\omega})$ time to compute the 1-st persistent path homology of a planar directed graph G.
Exercises
Figure 8.7: (a) graph for Exercise 6. (b) graph for Exercise 7. Edge weights are marked.
1. Consider a metric tree $(|T|, d_T)$ induced by a positively weighted finite tree $T = (V, E, w)$. Suppose the largest edge weight is $w_0$. Consider the discrete intrinsic Čech complex $C^r(V)$ spanned by the vertices of V. That is, let $B_T(x; r) := \{y \in |T| \mid d_T(x, y) < r\}$ denote the open radius-$r$ ball around a point $x$. Then
$$C^r(V) := \{\langle v_0, \ldots, v_p \rangle \mid v_i \in V \text{ for } i \in [0, p], \text{ and } \bigcap_{i \in [0, p]} B_T(v_i; r) \neq \emptyset\}.$$
2. Consider a finite graph G = (V, E) with unit edge length, and the metric $d_G$ it induces on |G|. For a base point $v \in V$, let $f_v : |G| \to \mathbb{R}$ be the shortest-path distance function to v; that is, for any $x \in |G|$, $f_v(x) = d_G(x, v)$.
4. Given two finite metric graphs G1 = (|G1 |, dG1 ) and G2 = (|G2 |, dG2 ), pick an arbitrary point
v ∈ |G1 | and consider its associated shortest-path distance function fv : |G1 | → R to this
point; that is, fv (x) = dG1 (x, v) for any x ∈ |G1 |. For any point w ∈ |G2 |, let gw : |G2 | → R
denote the shortest-path distance function to w in |G2 | via dG2 . Let Dgm0 fv (resp. Dgm0 gw )
denote the 0-th persistence diagram induced by the superlevel set filtration of fv (resp.
of gw ). Argue that there exists some point w∗ ∈ |G2 | such that db (Dgm0 fv , Dgm0 gw∗ ) ≤
C · dGH (G1 , G2 ) for some constant C > 0, where dGH is the Gromov-Hausdorff distance.
5. Show that given a directed clique, any subset of its vertices spans a directed subgraph with a unique source and a unique sink.
6. Consider the graph in Figure 8.7 (a). Compute the 0-th and 1-st persistence diagrams for
the filtrations induced by (i) the directed clique complexes; (ii) the Dowker-sink complexes;
and (iii) the Dowker-source complexes.
7. Consider the graph in Figure 8.7 (b). Compute the 1-st persistence diagram for the filtra-
tions (i) induced by directed clique complexes; and (ii) induced by path homology.
8. Consider a pair of directed graphs G = (V, E) and G′ = (V, E′) spanned by the same set of vertices V, with E′ = E ∪ {(u, v)}; that is, G′ equals G with one additional directed edge e = (u, v). Consider path homology, and compare the 1-st cycle and boundary groups of G with those of G′.
Data can be complex both in terms of the domain it comes from and in terms of the properties or observations associated with it, which are often modeled as functions or maps. For example, we can have a set of patients, where each patient is associated with multiple biological markers, giving rise to a multivariate function from the space of patients to an image domain that may or may not be Euclidean. To this end, we need to analyze not only real-valued scalar fields, as we have done so far in the book, but also more complex maps defined on a given domain, such as multivariate, circle-valued, or sphere-valued maps.
Figure 9.1: The function values on a hand model are binned into intervals as indicated by different
colors. The mapper [277] corresponding to these intervals (cover) is shown with the graph below;
image courtesy of Facundo Mémoli and Gurjeet Singh.
One way to analyze complex maps is to use the Mapper methodology introduced by Singh et al. [277]. In particular, given a map $f : X \to Z$, the mapper $M(f, \mathcal{U})$ creates a topological metaphor for the structure behind f by pulling back a cover $\mathcal{U}$ of the space Z to a cover of X through f. This methodology can work with any (reasonably tame) continuous map between two topological spaces. It converts complex maps and covers of the target space into simplicial complexes, which are much easier to process computationally. One can view the map f and a finite cover of the space Z as the lens through which the input data X is examined. It is in some sense related to the Reeb graph, which also summarizes f but without any particular attention to a cover of the codomain. Figure 9.1 shows a mapper construction where the reader can see its similarity to the Reeb graph. The choice of different maps and covers allows the user to capture different aspects of the input data. The mapper methodology has been successfully applied to analyzing various types of data; we have shown an example in Figure 2(e) in the Prelude, and for others see e.g. [227, 244].
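The pullback-then-nerve recipe just described can be sketched in a few lines for the simplest setting of a real-valued function and an interval cover. This toy sketch is not the implementation of [277]; all names are ours, the clustering routine is supplied by the user, and `integer_components` is merely a stand-in clustering for integer-valued data.

```python
from itertools import combinations

def mapper_graph(points, f, cover, connected_components):
    """Minimal 1-dimensional mapper: pull back each cover element U of the
    codomain (a predicate on function values) to the preimage f^{-1}(U),
    split the preimage into clusters with `connected_components`, and
    connect two clusters whenever they share a data point (the 1-skeleton
    of the nerve of the pulled-back cover)."""
    nodes = []
    for U in cover:
        preimage = [x for x in points if U(f(x))]
        nodes.extend(connected_components(preimage))
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2)
             if set(nodes[i]) & set(nodes[j])]
    return nodes, edges

def integer_components(pts):
    """Stand-in clustering: maximal runs of consecutive integers."""
    comps = []
    for p in sorted(pts):
        if comps and p - comps[-1][-1] == 1:
            comps[-1].append(p)
        else:
            comps.append([p])
    return [tuple(c) for c in comps]
```

On ten points on a line with two overlapping intervals as the cover, this produces two nodes joined by one edge, a discrete caricature of the Reeb-graph-like summary in Figure 9.1.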
To understand the mapper and its multiscale version better, we first study some properties of nerves, as they are at the core of these constructions. We already know the Nerve Theorem (Theorem 2.1), which states that if every intersection of cover elements in a cover $\mathcal{U}$ is contractible, then the nerve $N(\mathcal{U})$ is homotopy equivalent to the space $X = \bigcup_{U \in \mathcal{U}} U$. However, we cannot hope for such a good cover all the time and need to investigate what happens if the cover is not good. Sections 9.1 and 9.2 are devoted to this study. Specifically, we show that if every cover element satisfies the weaker property of being merely path connected, then the nerve may not preserve homotopy, but it satisfies a surjectivity property in one-dimensional homology.
One limitation of the mapper is that it is defined with respect to a fixed cover of the target space. Naturally, the behavior of the mapper under a change of cover is of interest, because it has the potential to reveal properties of the map at different scales. Keeping this in mind, we study a multiscale version of the mapper, which we refer to as the multiscale mapper. It is capable of producing a multiscale summary in the form of a persistence diagram using covers of the codomain at different scales. In Section 9.4, we discuss the stability of the multiscale mapper under changes in the input map and/or in the tower $\mathcal{U}$ of covers. An efficient algorithm for computing the mapper and multiscale mapper for a real-valued PL function is presented in Section 9.5. In Section 9.6, we consider the more general case of a map $f : X \to Z$ where X is a simplicial complex but Z is not necessarily Euclidean. We show that we can use an even simpler combinatorial version of the multiscale mapper, which only acts on vertex sets of X with connectivity given by the 1-skeleton graph of X. The cost we pay is that the resulting persistence diagram approximates (instead of computing exactly) the persistence diagram of the standard multiscale mapper if the tower of covers of Z is "good" in a certain sense.
Maps between covers. If we have two covers $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ and $\mathcal{V} = \{V_\beta\}_{\beta \in B}$ of a space X, a map of covers from $\mathcal{U}$ to $\mathcal{V}$ is a set map $\xi : A \to B$ such that $U_\alpha \subseteq V_{\xi(\alpha)}$ for every $\alpha \in A$. We abuse notation and also write $\xi$ for the map $\mathcal{U} \to \mathcal{V}$. The following proposition connects a map between covers to a simplicial map between their nerves.
Figure 9.2: Cover maps ξ and ζ indicated by solid arrows induce simplicial maps N(ξ) and N(ζ)
whose corresponding vertex maps are indicated by dashed arrows.
Proposition 9.1. Given a map of covers $\xi : \mathcal{U} \to \mathcal{V}$, there is an induced simplicial map $N(\xi) : N(\mathcal{U}) \to N(\mathcal{V})$ given on vertices by the map ξ.

Proof. Write $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ and $\mathcal{V} = \{V_\beta\}_{\beta \in B}$. Then, for all $\alpha \in A$ we have $U_\alpha \subseteq V_{\xi(\alpha)}$. Now take any $\sigma \in N(\mathcal{U})$. We need to prove that $\xi(\sigma) \in N(\mathcal{V})$. For this, observe that
$$\bigcap_{\beta \in \xi(\sigma)} V_\beta = \bigcap_{\alpha \in \sigma} V_{\xi(\alpha)} \supseteq \bigcap_{\alpha \in \sigma} U_\alpha \neq \emptyset,$$
so the vertices of $\xi(\sigma)$ span a simplex of $N(\mathcal{V})$.
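Proposition 9.1 can be checked concretely on finite covers. A small sketch with our own helper names, where cover elements are finite sets indexed by position and a cover map is a dict on indices:

```python
from itertools import combinations

def nerve(cover, max_dim=2):
    """N(U): one d-simplex per (d+1)-element subfamily of the cover
    (elements given as Python sets) whose common intersection is
    non-empty.  Simplices are tuples of cover-element indices."""
    idx = range(len(cover))
    return [s for d in range(max_dim + 1) for s in combinations(idx, d + 1)
            if set.intersection(*(cover[i] for i in s))]

def induced_simplicial_map(cover_U, cover_V, xi, simplex):
    """Image of a simplex of N(U) under N(xi), for a cover map xi with
    U_a contained in V_{xi(a)}; by Proposition 9.1 the image vertex set
    spans a simplex of N(V)."""
    assert all(cover_U[a] <= cover_V[xi[a]] for a in simplex)
    return tuple(sorted({xi[a] for a in simplex}))
```

Note that an edge of N(U) may collapse to a vertex of N(V) when ξ sends both endpoints to the same cover element; the image is still a simplex.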
Proposition 9.2 (Induced maps are contiguous). Let $\zeta, \xi : \mathcal{U} \to \mathcal{V}$ be any two maps of covers. Then the simplicial maps $N(\zeta)$ and $N(\xi)$ are contiguous.

Proof. Write $\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ and $\mathcal{V} = \{V_\beta\}_{\beta \in B}$. Then, for all $\alpha \in A$ we have both $U_\alpha \subseteq V_{\zeta(\alpha)}$ and $U_\alpha \subseteq V_{\xi(\alpha)}$. Now take any $\sigma \in N(\mathcal{U})$. We need to prove that $\zeta(\sigma) \cup \xi(\sigma) \in N(\mathcal{V})$. For this, write
$$\bigcap_{\beta \in \zeta(\sigma) \cup \xi(\sigma)} V_\beta = \bigcap_{\alpha \in \sigma} \left( V_{\zeta(\alpha)} \cap V_{\xi(\alpha)} \right) \supseteq \bigcap_{\alpha \in \sigma} U_\alpha \neq \emptyset,$$
where the last step follows from the assumption that $\sigma \in N(\mathcal{U})$. It implies that the vertices in $\zeta(\sigma) \cup \xi(\sigma)$ span a simplex in $N(\mathcal{V})$.
In Figure 9.2, the two maps N(ξ) and N(ζ) can be verified to be contiguous (Definition 2.7).
Furthermore, contiguous maps induce identical maps at the homology level (Fact 2.11). Proposi-
tion 9.2 implies that the map H∗ (N(U)) → H∗ (N(V)) thus induced can be deemed canonical.
Maps at the homology level. Now we focus on establishing various maps at the homology level for covers and their nerves. We first establish a map $\phi_{\mathcal{U}}$ between X and the geometric realization $|N(\mathcal{U})|$ of the nerve complex $N(\mathcal{U})$. This helps us define a map $\phi_{\mathcal{U}*}$ from the singular homology groups of X to the simplicial homology groups of $N(\mathcal{U})$ (through the singular homology of $|N(\mathcal{U})|$). The Nerve Theorem (Theorem 2.1) says that if the elements of $\mathcal{U}$ intersect only in contractible spaces, then $\phi_{\mathcal{U}}$ is a homotopy equivalence, and hence $\phi_{\mathcal{U}*}$ is an isomorphism between $H_*(X)$ and $H_*(N(\mathcal{U}))$. The contractibility condition can be weakened to a homology ball condition to retain the isomorphism between the two homology groups [219]. In the absence of such conditions on the cover, simple examples show that $\phi_{\mathcal{U}*}$ may be neither a monomorphism (injection) nor an epimorphism (surjection). Figure 9.3 gives an example where $\phi_{\mathcal{U}*}$ is not surjective in $H_2$. However, for one-dimensional homology groups, the map $\phi_{\mathcal{U}*}$ is necessarily a surjection when each element of the cover $\mathcal{U}$ is path connected. We call such a cover $\mathcal{U}$ path connected. The simplicial maps arising out of cover maps between path connected covers induce a surjection between the 1-st homology groups of the two nerve complexes.
Figure 9.3: The map f : S2 ⊂ R3 → R2 takes the sphere to R2 . The pullback of the cover element
Uα makes a band surrounding the equator which causes the nerve N( f −1 U) to pinch in the middle
creating two 2-cycles. This shows that the map φU : X → N(U) may not induce a surjection in
H2 .
Blow-up space. The proof of the nerve theorem given by Hatcher in [186] uses a construction that connects the two spaces X and $|N(\mathcal{U})|$ via a blow-up space $X^{\mathcal{U}}$, built from products of cover elements of $\mathcal{U}$ with simplices of the geometric realization $|N(\mathcal{U})|$. In our case, $\mathcal{U}$ may not satisfy the contractibility condition as in that proof. Nevertheless, we use a similar construction to define three maps, $\zeta : X \to X^{\mathcal{U}}$, $\pi : X^{\mathcal{U}} \to |N(\mathcal{U})|$, and $\phi_{\mathcal{U}} : X \to |N(\mathcal{U})|$, where $\phi_{\mathcal{U}} = \pi \circ \zeta$ is referred to as the nerve map; see Figure 9.4 (left). Details of the construction of these maps follow.
Denote the elements of the cover U as Uα for α taken from some indexing set A. The vertices
of N(U) are denoted by {uα , α ∈ A}, where each uα corresponds to the cover element Uα . For each
Figure 9.4: (left) Various maps used for the blow-up space; (right) example of a blow-up space.
finite non-empty intersection $U_{\alpha_0, \ldots, \alpha_n} := \bigcap_{i=0}^{n} U_{\alpha_i}$, consider the product $U_{\alpha_0, \ldots, \alpha_n} \times \Delta^n_{\alpha_0, \ldots, \alpha_n}$, where $\Delta^n_{\alpha_0, \ldots, \alpha_n}$ denotes the $n$-dimensional simplex with vertices $u_{\alpha_0}, \ldots, u_{\alpha_n}$. Consider now the disjoint union
$$M := \bigsqcup_{\alpha_0, \ldots, \alpha_n \in A:\; U_{\alpha_0, \ldots, \alpha_n} \neq \emptyset} U_{\alpha_0, \ldots, \alpha_n} \times \Delta^n_{\alpha_0, \ldots, \alpha_n}$$
together with the following identification: each point $(x, y) \in M$, with $x \in U_{\alpha_0, \ldots, \alpha_n}$ and $y \in [\alpha_0, \ldots, \hat{\alpha}_i, \ldots, \alpha_n] \subset \Delta^n_{\alpha_0, \ldots, \alpha_n}$, is identified with the corresponding point in the product $U_{\alpha_0, \ldots, \hat{\alpha}_i, \ldots, \alpha_n} \times \Delta_{\alpha_0, \ldots, \hat{\alpha}_i, \ldots, \alpha_n}$ via the inclusion $U_{\alpha_0, \ldots, \alpha_n} \subset U_{\alpha_0, \ldots, \hat{\alpha}_i, \ldots, \alpha_n}$. Here $[\alpha_0, \ldots, \hat{\alpha}_i, \ldots, \alpha_n]$ denotes the $i$-th face of the simplex $\Delta^n_{\alpha_0, \ldots, \alpha_n}$. Denote by $\sim$ this identification and define the space $X^{\mathcal{U}} := M / \sim$. An example for the case where X is a line segment and $\mathcal{U}$ consists of only two open sets is shown in Figure 9.4 (right).
In what follows we assume that the space X is compact. The main motivation for restricting to such spaces is that they admit partitions of unity, which we use to establish further results.
Definition 9.1 (Locally finite). An open cover $\{U_\alpha,\ \alpha \in A\}$ of X is called a refinement of another open cover $\{V_\beta,\ \beta \in B\}$ of X if every element $U_\alpha \in \mathcal{U}$ is contained in some element $V_\beta \in \mathcal{V}$. Furthermore, $\mathcal{U}$ is called locally finite if every point $x \in X$ has a neighborhood that intersects only finitely many elements of $\mathcal{U}$.
Definition 9.2 (Partition of unity). A collection of real valued continuous functions {ϕα : X → [0, 1], α ∈ A} is called a partition of unity if (i) Σ_{α∈A} ϕα(x) = 1 for all x ∈ X, and (ii) for every x ∈ X, there are only finitely many α ∈ A such that ϕα(x) > 0.
If U = {Uα, α ∈ A} is any open cover of X, then a partition of unity {ϕα, α ∈ A} is subordinate to U if the support¹ supp(ϕα) of ϕα is contained in Uα for each α ∈ A.
Fact 9.1 ([258]). For any open cover U = {Uα , α ∈ A} of a compact space X, there exists a
partition of unity {ϕα , α ∈ A} subordinate to U.
We assume that X is compact and hence, for an open cover U = {Uα}α of X, we can choose a partition of unity {ϕα, α ∈ A} subordinate to U according to Fact 9.1. For each x ∈ X such that x ∈ Uα, denote by xα the corresponding copy of x residing in XU. For our choice of {ϕα, α ∈ A}, define the map ζ : X → XU as:

for any x ∈ X,   ζ(x) := Σ_{α∈A} ϕα(x) xα.

¹The support of a real-valued function is the closure of the subset of the domain on which the function is non-zero.
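To make this concrete, the following small Python sketch (illustrative only: the cover, the bump construction, and all names are ours, not from the text) builds a partition of unity subordinate to a two-element interval cover of a segment X = [0, 3] ⊂ R. The normalized values (ϕα(x))α are the barycentric coordinates of the nerve-map image φU(x) in |N(U)|.

```python
# Sketch: a partition of unity subordinate to an open interval cover of
# X = [0, 3].  The cover and all names below are illustrative, not from
# the text.

cover = {"U0": (-0.5, 1.6), "U1": (1.4, 3.5)}  # open intervals covering [0, 3]

def bump(x, interval):
    """Distance from x to the complement of the interval (0 outside it);
    continuous, and supported inside the interval as subordination requires."""
    a, b = interval
    return max(0.0, min(x - a, b - x))

def partition_of_unity(x):
    """Normalized bumps: non-negative, finitely many non-zero, summing to 1."""
    raw = {name: bump(x, iv) for name, iv in cover.items()}
    total = sum(raw.values())  # > 0 because the intervals cover X
    return {name: v / total for name, v in raw.items()}

# The coordinates (phi_alpha(x)) place phi_U(x) in the simplex of |N(U)|
# spanned by the cover elements whose bump is positive at x: both are
# positive exactly when x lies in the overlap U0 ∩ U1.
coords = partition_of_unity(1.5)
assert abs(sum(coords.values()) - 1.0) < 1e-12
assert coords["U0"] > 0 and coords["U1"] > 0  # x = 1.5 lies in the overlap
```

At a point x outside the overlap only one coordinate is non-zero, so φU(x) sits on a vertex of the nerve; as x crosses the overlap, φU(x) sweeps across the edge {u0, u1}.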
Proposition 9.3. For every 1-cycle γ in |N(U)|, there is a 1-cycle γ0 in N(U) so that [γ] = [|γ0 |].
Proposition 9.4. If U is path connected, φU∗ : H1 (X) → H1 (|N(U)|) is a surjection, where φU∗ is
the homomorphism induced by the nerve map defined in Eqn. (9.1).
Proof. Let [γ] be any class in H1 (|N(U)|). Because of Proposition 9.3, we can assume that γ = |γ0 |,
where γ0 is a 1-cycle in the 1-skeleton of N(U). We will construct a 1-cycle γU in XU so that
π(γU ) = γ. Assume first that such a γU can be constructed. Then, consider the map ζ : X → XU
in the construction of the nerve map φU where φU = π ◦ ζ. There exists a class [γX ] in H1 (X) so
that ζ∗ ([γX ]) = [γU ] because ζ∗ is an isomorphism by Fact 9.2. Then, φU∗ ([γX ]) = π∗ (ζ∗ ([γX ]))
because φU∗ = π∗ ◦ ζ∗ . It follows φU∗ ([γX ]) = π∗ ([γU ]) = [γ] showing that φU∗ is surjective.
Therefore, it remains only to show that a 1-cycle γU can be constructed given γ0 in N(U)
so that π(γU) = γ = |γ0|. Let e0, e1, . . . , er−1, er = e0 be an ordered sequence of edges on γ0. Recall the construction of the space XU. In that terminology, let ei be the edge (1-simplex) of N(U) with vertices uαi and uα(i+1) mod r. Let vi = e(i−1) mod r ∩ ei for i ∈ [0, r − 1]. The vertex vi = vαi corresponds to the cover element Uαi, where Uαi ∩ Uα(i+1) mod r ≠ ∅ for every i ∈ [0, r − 1]. Choose a point xi in the common intersection Uαi ∩ Uα(i+1) mod r for every i ∈ [0, r − 1]. Then, the edge path ẽi = ei × xi is in XU by construction. Also, letting xαi be the lift of xi in the lifted Uαi, we can choose a vertex path xαi ⇝ xα(i+1) mod r residing in the lifted Uαi, and hence in XU, because Uαi is path connected. Consider the following cycle obtained by concatenating the edge and vertex paths:

γU = ẽ0 · (xα0 ⇝ xα1) · ẽ1 · ⋯ · ẽr−1 · (xαr−1 ⇝ xα0).

By projection, we have π(ẽi) = ei for every i ∈ [0, r − 1] and π(xαi ⇝ xα(i+1) mod r) = vαi, and thus π(γU) = γ as required.
Since we are eventually interested in the simplicial homology groups of the nerves rather
than the singular homology groups of their geometric realizations, we make one more transition
using the known isomorphism between the two homology groups (Theorem 2.10). Specifically,
if ιU : H p (|N(U)|) → H p (N(U)) denotes this isomorphism, we let
φ̄U∗ : H1 (X) → H1 (N(U)) denote the composition ιU ◦ φU∗ . (9.2)
As a corollary to Proposition 9.4, we obtain:
Theorem 9.5. If U is path connected, φ̄U∗ : H1 (X) → H1 (N(U)) is a surjection.
Figure 9.5: A sequence of cover maps induces a simplicial tower and hence a persistence module: classes in H1 can only die.
From nerves to nerves. We now extend the result in Theorem 9.5 to simplicial maps between
two nerves induced by cover maps. Figure 9.5 illustrates this fact. The following proposition is
key to establishing the result.
Proposition 9.6 (Coherent partitions of unity). Suppose U = {Uα}α∈A and V = {Vβ}β∈B are open covers of a compact topological space X and θ : A → B is a map of covers. Then there exists a partition of unity {ϕα}α∈A subordinate to the cover U such that if for each β ∈ B we define

ψβ := Σ_{α∈θ⁻¹(β)} ϕα,

then the set of functions {ψβ}β∈B is a partition of unity subordinate to the cover V.
Proof. The proof closely follows that of [258, Corollary pp. 97]. Since X is compact, there exists a partition of unity {ϕα}α∈A subordinate to U. The fact that the sum in the expression of ψβ is well defined and continuous follows from the fact that the family {supp(ϕα)}α is locally finite. Let Cβ := ∪_{α∈θ⁻¹(β)} supp(ϕα). The set Cβ is closed, Cβ ⊂ Vβ, and ψβ(x) = 0 for x ∉ Cβ, so that supp(ψβ) ⊂ Cβ ⊂ Vβ. Now, to check that the family {Cβ}β∈B is locally finite, pick any point x ∈ X.
Since {supp(ϕα )}α is locally finite there is an open set O containing x such that O intersects only
finitely many elements in U. Denote these cover elements by Uα1, . . . , Uαℓ. Now, notice that if β ∈ B and β ∉ {θ(αi), i = 1, . . . , ℓ}, then O does not intersect Cβ. Then, the family {supp(ψβ)}β∈B is locally finite. It then follows that for x ∈ X one has

Σ_{β∈B} ψβ(x) = Σ_{β∈B} Σ_{α∈θ⁻¹(β)} ϕα(x) = Σ_{α∈A} ϕα(x) = 1.
We have obtained that {ψβ }β∈B is a partition of unity subordinate to V as needed by the propo-
sition.
Let U = {Uα}α∈A and V = {Vβ}β∈B be two open covers of X connected by a map of covers θ : A → B. Apply Proposition 9.6 to obtain coherent partitions of unity {ϕα}α∈A and {ψβ}β∈B subordinate to U and V, respectively. Let the nerve maps φU : X → |N(U)| and φV : X → |N(V)| be defined as in Eqn. (9.1) using these coherent partitions of unity. Let τ : N(U) → N(V) be the simplicial map induced by the cover map θ. The map τ can be extended to a (linear) continuous map τ̂ : |N(U)| → |N(V)| by assigning to each y ∈ |N(U)| the point τ̂(y) ∈ |N(V)| where

y = Σ_α tα uα  ⟹  τ̂(y) = Σ_α tα τ(uα),  with Σ_α tα = 1.
Claim 9.1. The map τ̂ satisfies the property that, for x ∈ X, τ̂(φU (x)) = φV (x).
Proof. For any point x ∈ X, one has φU(x) = Σ_{α∈A} ϕα(x) uα, where uα is the vertex corresponding to Uα ∈ U in |N(U)|. Then,

τ̂ ◦ φU(x) = τ̂( Σ_{α∈A} ϕα(x) uα ) = Σ_{α∈A} ϕα(x) τ(uα) = Σ_{α∈A} ϕα(x) vθ(α) = Σ_{β∈B} ( Σ_{α∈θ⁻¹(β)} ϕα(x) ) vβ = Σ_{β∈B} ψβ(x) vβ = φV(x).
Corollary 9.7. The induced maps of φU∗ : H p (X) → H p (|N(U)|), φV∗ : H p (X) → H p (|N(V)|), and
τ̂∗ : H p (|N(U)|) → H p (|N(V)|) commute, that is, φV∗ = τ̂∗ ◦ φU∗ .
With the fact that isomorphism between singular and simplicial homology commutes with
simplicial maps and their linear continuous extensions, Corollary 9.7 implies that:
Figure 9.6: Maps relevant for Proposition 9.8; φ̄V∗ = ιV ◦ φV∗ and φ̄U∗ = ιU ◦ φU∗. The triangular ‘roof’ and the square ‘room’ commute, and so does the entire ‘house’.
Proposition 9.8. φ̄V∗ = τ∗ ◦ φ̄U∗ where φ̄V∗ : H p (X) → H p (N(V)), φ̄U∗ : H p (X) → H p (N(U)) and
τ : N(U) → N(V) is the simplicial map induced by a cover map U → V.
Proof. Consider the diagram in Figure 9.6. The upper triangle commutes by Corollary 9.7. The bottom square commutes by the property of simplicial maps; see Theorem 34.4 in [241]. The claim in the proposition follows by combining these two commuting subdiagrams.
Proposition 9.8 extends Theorem 9.5 to the simplicial maps between two nerves.
Theorem 9.9. Let τ : N(U) → N(V) be a simplicial map induced by a cover map U → V where
both U and V are path connected. Then, τ∗ : H1 (N(U)) → H1 (N(V)) is a surjection.
Proof. By Proposition 9.8, τ∗ ◦ φ̄U∗ = φ̄V∗. By Theorem 9.5, the map φ̄V∗ is a surjection. It follows that τ∗ is a surjection.
Definition 9.3. The size s(X′) of a subset X′ of the pseudometric space (X, d) is defined to be its diameter, that is, s(X′) = sup_{x,x′∈X′} d(x, x′). The size of a class c ∈ Hp(X) is defined as s(c) = inf_{z∈c} s(z). According to Definition 5.3, a set of p-cycles z1, z2, . . . , zn of Hp(X) is called a cycle basis if the classes [z1], [z2], . . . , [zn] together form a basis of Hp(X). It is called an optimal cycle basis if Σ_{i=1}^{n} s(zi) is minimal among all cycle bases.
Lebesgue number of a cover. Our goal is to characterize the classes in the nerve of U with
respect to the sizes of their preimages in X via the map φU . The Lebesgue number of a cover U
becomes useful in this characterization. It is the largest real number λ(U) so that any subset of X
with size at most λ(U) is contained in at least one element of U. Formally, the Lebesgue number λ(U) of U is defined as:

λ(U) := sup{ δ ≥ 0 | every X′ ⊆ X with s(X′) ≤ δ is contained in some Uα ∈ U }.
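As a quick numerical illustration of this definition (our own sketch with a hypothetical cover, not from the text), for a cover of a segment by open intervals the Lebesgue number can be estimated by bisecting on λ and testing all subintervals of length λ on a grid:

```python
def lebesgue_number(cover, lo=0.0, hi=1.0, grid=2000, iters=40):
    """Numerically estimate the Lebesgue number of an open interval cover
    of X = [lo, hi]: the largest lam such that every subset of diameter
    at most lam is contained in a single cover element.  For interval
    covers it suffices to test the subintervals [x, x + lam]."""
    def fits(lam):
        step = (hi - lo - lam) / grid
        return all(
            any(a < lo + i * step and lo + i * step + lam < b for a, b in cover)
            for i in range(grid + 1)
        )
    low, high = 0.0, hi - lo
    for _ in range(iters):  # bisection on lam
        mid = (low + high) / 2
        if fits(mid):
            low = mid
        else:
            high = mid
    return low

# Two intervals overlapping in (0.4, 0.6): any subinterval of length
# < 0.2 fits inside one of them, so lambda(U) is 0.2 in this example.
print(lebesgue_number([(-0.1, 0.6), (0.4, 1.1)]))  # ~0.2
```

The estimate is accurate up to the grid resolution; the sup in the definition is approached from below by the bisection.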
As we will see below, a homology class of size no more than λ(U) cannot survive in the nerve (Proposition 9.12). Further, the homology classes whose sizes are significantly larger than the maximum size of a cover do necessarily survive, where we define the maximum size of a cover as smax(U) := max_{α∈A} s(Uα).

Theorem 9.10. Let U be a path connected cover of X and let z1, . . . , zg be an optimal cycle basis of H1(X), ordered so that s(z1) ≤ ⋯ ≤ s(zg). Then:

i. Let ℓ = g + 1 if λ(U) > s(zg). Otherwise, let ℓ ∈ [1, g] be the smallest integer so that s(zℓ) > λ(U). If ℓ ≠ 1, then we have that the class φ̄U∗[zj] = 0 for j = 1, . . . , ℓ − 1. Moreover, if ℓ ≠ g + 1, then the classes {φ̄U∗[zj]}j=ℓ,...,g generate H1(N(U)).

ii. The classes {φ̄U∗[zj]}j=ℓ′,...,g are linearly independent, where ℓ′ is the smallest index with s(zℓ′) > 4smax(U).
The result above says that only the classes of H1 (X) generated by cycles of large enough size
survive in the nerve. To prove this result, we use a map ρ that sends each 1-cycle in N(U) to
a 1-cycle in X. We define a chain map ρ : C1(N(U)) → C1(X) between one-dimensional chain groups as follows. It is sufficient to exhibit the map for an elementary chain of an edge, say
e = {uα, uα′} ∈ C1(N(U)). Since e is an edge in N(U), the two cover elements Uα and Uα′ in X have a common intersection. Let a ∈ Uα and b ∈ Uα′ be two points that are arbitrary but fixed for Uα and Uα′ respectively. Pick a path ξ(a, b) (viewed as a singular chain) in the union of Uα and Uα′, which is path connected as both Uα and Uα′ are. Then, define ρ(e) = ξ(a, b). A cycle γ, when pulled back by ρ and then pushed forward by φU, remains in the same class. The following proposition states this fact; its proof appears in [133].
Proposition 9.11. Let γ be any 1-cycle in N(U). Then, [φU (ρ(γ))] = [|γ|].
The following proposition provides a sufficient characterization of the cycles whose classes
become trivial after the push forward.
Proposition 9.12. Let z be a 1-cycle in C1 (X). Then, [φU (z)] = 0 if λ(U) > s(z).
Proof. It follows from the definition of the Lebesgue number that there exists a cover element
Uα ∈ U such that z ⊆ Uα because s(z) < λ(U). We claim that there is a homotopy that sends φU(z) to a vertex in N(U), and hence [φU(z)] is trivial.
Let x be any point in z. Recall that φU(x) = Σ_i ϕαi(x) uαi. Since Uα has a common intersection with each Uαi for which ϕαi(x) ≠ 0, we can conclude that φU(x) is contained in a simplex with the vertex uα. Continuing this argument with all points of z, we observe that φU(z) is contained in
simplices that share the vertex uα . It follows that there is a homotopy that sends φU (z) to uα , a
vertex of N(U).
Proof of (ii): For a contradiction, assume that there is a subsequence {ℓ1, . . . , ℓt} ⊆ {ℓ′, . . . , g} so that Σ_{j=1}^{t} [φU(z_{ℓj})] = 0. Let z = Σ_{j=1}^{t} φU(z_{ℓj}). Let γ be a 1-cycle in N(U) so that [z] = [|γ|], whose existence is guaranteed by Proposition 9.3. As Σ_{j=1}^{t} [φU(z_{ℓj})] = 0, there must be a 2-chain D in N(U) so that ∂D = γ. Consider a triangle t = {uα1, uα2, uα3} contributing to D. Let a′i = φU⁻¹(uαi). Since t appears in N(U), the cover elements Uα1, Uα2, Uα3 containing a′1, a′2, and a′3 respectively have a common intersection in X. This also means that each of the paths a′1 ⇝ a′2, a′2 ⇝ a′3, a′3 ⇝ a′1 has size at most 2smax(U). Then, ρ(∂t) is mapped to a 1-cycle in X of size at most 4smax(U). It follows that ρ(∂D) can be written as a linear combination of cycles of size at most 4smax(U). Since z1, . . . , zg form an optimal cycle basis of H1(X), each of the 1-cycles of size at most 4smax(U) is generated by basis elements z1, . . . , zk where s(zk) ≤ 4smax(U). Therefore, the class of z′ = φU(ρ(γ)) is generated by a linear combination of the basis elements whose preimages have size at most 4smax(U). The class [z′] is the same as the class [|γ|] by Proposition 9.11. But, by assumption, [|γ|] = [z] is generated by a linear combination of the basis elements whose sizes are larger than 4smax(U), reaching a contradiction. Hence the assumption cannot hold and (ii) is true.
each α, we can now consider the decomposition of f⁻¹(Uα) into its path connected components, and we write f⁻¹(Uα) = ∪_{i=1}^{jα} Vα,i, where jα is the number of path connected components Vα,i in f⁻¹(Uα). We write f∗U for the cover of X obtained this way from the cover U of Z and refer to it as the pullback cover of X induced by U via f. By construction, every element in this pullback cover f∗U is path connected.
Notice that there are pathological examples of f where f −1 (Uα ) may shatter into infinitely
many path components. This motivates us to consider well-behaved functions f : we require
that for every path connected open set U ⊆ Z, the preimage f −1 (U) has finitely many open path
connected components. Consequently, all nerves of pullbacks of finite covers become finite.
Figure 9.7: Mapper construction: (left) a map f : X → Z from a circle to a subset Z ⊂ R; (middle) the inverse map f⁻¹ induces a cover of the circle from a cover U of Z; (right) the nerves of the two covers of X and Z: the nerve on the left (quadrangle shaped) is the mapper induced by f and U.
Definition 9.4 (Mapper). Let X and Z be topological spaces and let f : X → Z be a well-behaved
and continuous map. Let U = {Uα }α∈A be a finite open cover of Z. The mapper arising from these
data is defined to be the nerve of the pullback cover f ∗ (U) of X; that is, M(U, f ) := N( f ∗ (U)). See
an illustration in Figure 9.7.
Notice that we define the mapper using finite covers, which allows us to extend the definitions of persistence modules and persistence diagrams from previous chapters to the case of mappers. However, in the next remark and later, we allow infinite covers for simplicity; the definition of mapper remains valid with infinite covers.
Remark 9.1. The construction of mapper is quite general if we allow the cover U to be infinite.
For example, it can encompass both the Reeb graph and merge trees: consider X a topological
space and f : X → R. Then, consider the following two options for U = {Uα }α∈A , the other
ingredient of the construction:
• Uα = (−∞, α) for α ∈ A = R. This corresponds to sublevel sets which in turn lead to merge
trees. See, for example, the construction in Figure 9.8(b).
• Uα = (α − ε, α + ε) for α ∈ A = R, for some fixed ε > 0. This corresponds to (ε-thick)
level sets, which induce a relaxed notion of Reeb graphs. See the description in “Mapper
for PCD” below and Figure 9.8(a).
In these two examples, for simplicity of presentation, the set A is allowed to have infinite cardi-
nality. Also, note one can take any open cover of R in this definition. This may give rise to other
constructions beyond merge trees or Reeb graphs. For instance, using the infinite setting for simplicity again, one may choose any point r ∈ R and let Uα = (r − α, r + α) for each α ∈ A = R>0, among other possibilities.
Mapper for PCD: Consider a finite metric space (P, dP ), that is, a point set P with distances
between every pair of points. For a real r ≥ 0, one can construct a graph Gr (P) with every
point in P as a vertex where an edge (p, p0 ) is in Gr (P) if and only if dP (p, p0 ) ≤ r. Let f :
P → R be a real-valued function on the point set P. For a set of intervals U covering R, we
can construct the mapper as follows. For every interval (a, b) ∈ U, let P(a,b) = f⁻¹((a, b)) be the set of points with function values in the range (a, b). Each such set admits a partition P(a,b) = ⊔_i P^i_(a,b) determined by the graph connectivity of Gr(P): each set P^i_(a,b) consists of the vertices of a connected component of the subgraph of Gr(P) spanned by the vertices in P(a,b). The vertex sets ∪_{(a,b)∈U} {P^i_(a,b)} thus obtained over all intervals constitute a cover f⁻¹(U) of P. The nerve of this cover is the mapper M(P, f). Here the intersection between cover elements is determined by the intersection of discrete sets.
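The construction just described is easy to implement. The sketch below is our own illustration (`mapper_pcd` and all names are hypothetical, not from the text): it builds Gr(P), splits each interval preimage into connected components of the induced subgraph, and returns the 1-skeleton of the nerve.

```python
from itertools import combinations

def mapper_pcd(points, f, dist, r, intervals):
    """Sketch of the mapper for a finite metric space (P, d_P): cover
    elements are connected components of G_r(P) restricted to each
    interval preimage f^{-1}((a, b)); two elements are joined by a nerve
    edge iff they share a point.  Illustrative code, not from the text."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if dist(points[i], points[j]) <= r:
            adj[i].add(j)
            adj[j].add(i)

    def components(vertex_set):
        """Connected components of G_r restricted to vertex_set."""
        vertex_set, comps = set(vertex_set), []
        while vertex_set:
            stack, comp = [vertex_set.pop()], set()
            while stack:
                v = stack.pop()
                comp.add(v)
                stack.extend(adj[v] & vertex_set)
                vertex_set -= adj[v]
            comps.append(frozenset(comp))
        return comps

    clusters = []
    for a, b in intervals:
        clusters.extend(components(
            [i for i in range(n) if a < f(points[i]) < b]))
    edges = [(s, t) for s, t in combinations(range(len(clusters)), 2)
             if clusters[s] & clusters[t]]  # 1-skeleton of the nerve
    return clusters, edges
```

For example, for 12 points sampled on a unit circle with f the height (y-coordinate), r = 0.6, and the three overlapping intervals (−1.6, −0.25), (−0.55, 0.55), (0.25, 1.6), one obtains four clusters joined in a 4-cycle: a discretized Reeb-graph-like nerve recovering the loop.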
Observe that, in the above construction, if one takes the intervals of U = {Ui}i∈Z where Ui = (i − ε, i + ε) for some ε ∈ (1/2, 1), causing only consecutive intervals to overlap, and to do so partially, then we get a discretized approximation of the Reeb graph of the function that f approximates on the discretized sample P. Figure 9.8 illustrates this observation. In the limit where each interval degenerates to a point, the discretized Reeb graph converges to the original Reeb graph, as shown in [133, 240].
Figure 9.8: Mapper construction for a point cloud, with a map f : P → Z from a PCD P to a subset Z ⊂ R; the graph Gr is not shown: (a) cover elements are intervals; points are colored with the interval colors, and gray points have values in two overlapping intervals; the mapper is a discretized Reeb graph; (b) the cover elements are sublevel sets; points are colored with the smallest sublevel set they belong to; the discretized Reeb graph no longer has the central loop.
Proposition 9.13. Let f : X → Z, and U and V be two covers of Z with a map of covers
ξ : U → V. Then, there is a corresponding map of covers between the respective pullback covers
of X: f ∗ (ξ) : f ∗ (U) −→ f ∗ (V).
Proof. Indeed, we only need to note that if U ⊆ V, then f −1 (U) ⊆ f −1 (V), and therefore it is clear
that each path connected component of f −1 (U) is included in exactly one path connected compo-
nent of f⁻¹(V). More precisely, let U = {Uα}α∈A and V = {Vβ}β∈B, with Uα ⊆ Vξ(α) for α ∈ A. Let Ûα,i, i ∈ {1, . . . , nα}, denote the connected components of f⁻¹(Uα) and V̂β,j, j ∈ {1, . . . , mβ}, denote the connected components of f⁻¹(Vβ). Then, the map of covers f∗(ξ) from f∗(U) to f∗(V) is given by requiring that each set Ûα,i be sent to the unique set of the form V̂ξ(α),j such that Ûα,i ⊆ V̂ξ(α),j.
Furthermore, observe that if U → V → W are three different covers of a topological space connected by maps of covers ξ : U → V and ζ : V → W, then f∗(ζ ◦ ξ) = f∗(ζ) ◦ f∗(ξ).
The above result for three covers easily extends to multiple covers and their pullbacks. The
sequence of pullbacks connected by cover maps and the corresponding sequence of nerves con-
nected by simplicial maps define multiscale mappers. Recall the definition of towers (Definition 4.1) to designate a sequence of objects connected with maps. Let U = {Ua → Ua′}r≤a≤a′, with maps u_{a,a′} : Ua → Ua′, denote a tower, where r = res(U) refers to its resolution. The objects here can be covers, simplicial complexes, or vector spaces. The notion of resolution and the variable a intuitively specify the granularity of the covers and the simplicial complexes induced by them.
The pullback property given by Proposition 9.13 makes it possible to take the pullback of a given tower of covers of a space via a given continuous function into another space, as stated in the proposition below.
In general, given a cover tower W of a space X, the nerve of each cover in W together with
simplicial maps induced by each map of W provides a simplicial tower which we denote by N(W).
Figure 9.9: Illustrating construction of multiscale mapper from a cover tower; CT and ST denote
cover and simplicial towers respectively, that is, CT(Z) = U, CT(X) = f ∗ (U), and ST(X) =
N( f ∗ (U)).
Consider for example a sequence res(U) ≤ a1 < a2 < . . . < an of n distinct real numbers.
Then, the definition of multiscale mapper MM(U, f ) gives rise to the following simplicial tower:
N( f∗(Ua1)) → N( f∗(Ua2)) → · · · → N( f∗(Uan)), (9.3)

which is a sequence of simplicial complexes connected by simplicial maps. Applying to them the homology functor Hp(·), p = 0, 1, 2, . . ., with coefficients in a field, one obtains a persistence module, a tower of vector spaces connected by linear maps:

Hp(N( f∗(Ua1))) → · · · → Hp(N( f∗(Uan))). (9.4)
Given our assumptions that the covers are finite and that the function f is well-behaved, the homology groups of all nerves have finite dimensions. Thus, we get a persistence module which is p.f.d. (see Section 3.4). Now one can summarize the persistence module induced by MM(U, f) with its persistence diagram Dgmp MM(U, f) for each dimension p ∈ N. The diagram Dgmp MM(U, f) can be viewed as a topological summary of f through the lens of U.
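To illustrate how a cover map induces the simplicial maps in such a tower, here is a minimal self-contained sketch (our own illustrative example with hypothetical clusters, not from the text): each cluster of the finer pullback cover is sent to a cluster of the coarser one containing it, and every edge of the finer nerve maps to an edge or collapses to a vertex of the coarser nerve.

```python
from itertools import combinations

def nerve_edges(cover):
    """1-skeleton of the nerve of a finite cover (a list of sets)."""
    return {frozenset((i, j)) for i, j in combinations(range(len(cover)), 2)
            if cover[i] & cover[j]}

def induced_vertex_map(fine, coarse):
    """For a cover map (each fine element contained in some coarse element),
    send vertex i of N(fine) to a coarse element containing fine[i]; this
    is the vertex map of the induced simplicial map tau."""
    return {i: next(j for j, V in enumerate(coarse) if U <= V)
            for i, U in enumerate(fine)}

# Hypothetical pullback covers of a 6-point sample at two scales: at the
# coarser scale the middle clusters merge, so classes can die but never
# be born (cf. Figure 9.5).
fine   = [frozenset({0, 1}), frozenset({1, 2}),
          frozenset({3, 4}), frozenset({4, 5})]
coarse = [frozenset({0, 1, 2}), frozenset({2, 3, 4, 5})]
tau = induced_vertex_map(fine, coarse)

# Simplicial-map check: every nerve edge maps to an edge or collapses
# to a vertex of the coarser nerve.
for e in nerve_edges(fine):
    img = frozenset(tau[i] for i in e)
    assert len(img) == 1 or img in nerve_edges(coarse)
```

Applying the homology functor to the nerves connected by such maps yields the linear maps of the persistence module in (9.4).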
Definition 9.6 (Pullback metric). Given a metric space (Z, dZ), we define the pullback metric (induced by f) as the following pseudometric d_f on X: for x, x′ ∈ X,

d_f(x, x′) := dZ( f(x), f(x′)).
Consider the Lebesgue number of the pullback covers of X. The following observation in this
respect is useful.
Proposition 9.15. Let U be a cover for the codomain Z and U′ be its restriction to f(X). Then, the pullback cover f∗U has the same Lebesgue number as that of U′; that is, λ( f∗U) = λ(U′).

Proof. First, observe that, for any path connected cover of X, a subset of X that realizes the Lebesgue number can be taken to be path connected because, if not, this subset can be connected by paths lying entirely within the cover element containing it. Let X′ ⊆ X be any path connected subset with s(X′) ≤ λ(U′). Then, f(X′) ⊆ Z has diameter at most λ(U′) by the definitions of size (Definition 9.3) and the pullback metric. Therefore, by the definition of the Lebesgue number, f(X′) is contained in a cover element U′ ∈ U′. Since X′ is path connected, a path connected component of f⁻¹(U′) contains X′. It follows that there is a cover element in f∗U that contains X′. Since X′ was chosen as an arbitrary path connected subset of size at most λ(U′), we have λ( f∗U) ≥ λ(U′). At the same time, it is straightforward from the definition of size that each cover element in f⁻¹(U′) has size at most that of U′ for any U′ ∈ U′. Combining this with the fact that U′ is the restriction of U to f(X), we have λ( f∗U) ≤ λ(U′), establishing the equality as claimed.
Given a cover U of Z, consider the mapper N( f∗U). Let z1, . . . , zg be an optimal cycle basis for H1(X), where the metric used to define optimality is the pullback metric d_f. Then, as a consequence of Theorem 9.10, we have:

Theorem 9.16. Let f : X → Z be a map from a path connected space X to a metric space Z equipped with a cover U (i and ii below) or a tower of covers {Ua} (iii below). Let U′ be the restriction of U to f(X).

i Let ℓ = g + 1 if λ(U′) > s(zg). Otherwise, let ℓ ∈ [1, g] be the smallest integer so that s(zℓ) > λ(U′). If ℓ ≠ 1, the class φU∗[zj] = 0 for j = 1, . . . , ℓ − 1. Moreover, if ℓ ≠ g + 1, the classes {φU∗[zj]}j=ℓ,...,g generate H1(N( f∗U)).

ii The classes {φU∗[zj]}j=ℓ′,...,g are linearly independent, where s(zℓ′) > 4smax(U).

iii Consider an H1-persistence module of a multiscale mapper induced by a tower of path connected covers:

H1(N( f∗Ua0)) —s1∗→ H1(N( f∗Ua1)) —s2∗→ · · · —sn∗→ H1(N( f∗Uan)).   (9.5)

Let ŝi∗ = si∗ ◦ s(i−1)∗ ◦ · · · ◦ φ̄Ua0∗. Then, the assertions in (i) and (ii) hold for H1(N( f∗Uai)) with the map ŝi∗ : H1(X) → H1(N( f∗Uai)) in place of φU∗.
9.4 Stability
To be useful in practice, the multiscale mapper should be stable against perturbations in the maps and the covers. We show that such stability is enjoyed by the multiscale mapper under some natural conditions on the tower of covers. Recall that the previous stability results for towers, as described in Section 4.1, were built on the notion of interleaving. We identify compatible notions of interleaving for cover towers as a way to measure the “closeness” between two cover towers.
Definition 9.7 (Interleaving of cover towers). Let U = {Ua } and V = {Va } be two cover towers
of a topological space X so that res(U) = res(V) = r. Given η ≥ 0, we say that U and V are
η-interleaved if one can find cover maps ζa : Ua → Va+η and ξa0 : Va0 → Ua0 +η for all a, a0 ≥ r;
see the diagram below.

    · · · ⟶ Ua ⟶ Ua+η ⟶ Ua+2η ⟶ · · ·
              ↘ ζa   ↗ ξa   ↘ ζa+η   ↗ ξa+η
    · · · ⟶ Va ⟶ Va+η ⟶ Va+2η ⟶ · · ·
Analogously, if we replace the operator ‘+’ by the multiplication ‘·’ in the above definition, then
we say that U and V are multiplicatively η-interleaved.
Proposition 9.17. (i) If U and V are (multiplicatively) η1-interleaved and V and W are (multiplicatively) η2-interleaved, then U and W are (η1 + η2)-interleaved (respectively, multiplicatively (η1·η2)-interleaved). (ii) Let f : X → Z be a continuous function and U and V be two (multiplicatively) η-interleaved towers of covers of Z. Then, f∗(U) and f∗(V) are also (multiplicatively) η-interleaved.
Note that, in the definition of interleaving cover towers, we do not explicitly require the maps to make sub-diagrams commute, unlike the interleaving between simplicial towers (Definition 4.2). However, it follows from Proposition 9.2 that interleaving cover towers leads to interleaving between the simplicial towers N(U) and N(V), as shown in the proposition below.
Proposition 9.18. Let U and V be two (multiplicatively) η-interleaved cover towers of X with
res(U) = res(V). Then, N(U) and N(V) are also (multiplicatively) η-interleaved.
Proof. We prove the proposition for additive interleaving. Replacing the ‘+’ operator with ‘·’
gives the proof for multiplicative interleaving. Let r denote the common resolution of U and V.
Write U = {Ua → Ua′}r≤a≤a′ with maps u_{a,a′} and V = {Va → Va′}r≤a≤a′ with maps v_{a,a′}, and for each a ≥ r let ζa : Ua → Va+η and ξa : Va → Ua+η be given as in Definition 9.7. To define the interleaving between the towers of
nerves arising out of covers, we consider similar diagrams to (4.3) at the level of covers involving
covers of the form Ua and Va, and apply the nerve construction. This operation yields diagrams identical to those in (4.3) where, for every a, a′ with a′ ≥ a ≥ r:

• Ka := N(Ua), La := N(Va),
• fa,a′ := N(ua,a′), ga,a′ := N(va,a′),
• ϕa := N(ζa), ψa := N(ξa).
To satisfy Definition 4.2, it remains to verify conditions (i) to (iv). We only verify (i), since the
proof of the others follows the same arguments. For this, notice that both the composite map
ξa+η ◦ ζa and ua,a+2η are maps of covers from Ua to Ua+2η . By Proposition 9.2 we then have that
N(ξa+η ◦ ζa ) and N(ua,a+2η ) = fa,a+2η are contiguous. But, by the properties of the nerve construc-
tion N(ξa+η ◦ ζa ) = N(ξa+η ) ◦ N(ζa ) = ψa+η ◦ ϕa , which completes the claim.
Combining Proposition 9.17 and Proposition 9.18, we get that multiscale mappers stay stable under cover perturbations, which is the first part of Corollary 9.19. Recall from Chapter 4
that, for a finite simplicial tower S and p ∈ N, we denote by Dgm p (S) the p-th persistence dia-
gram of the tower S with coefficients in a fixed field. Using Proposition 9.18 and Theorem 4.3, we
have a stability result for Dgm p MM(U, f ) when f is kept fixed but the cover tower U is perturbed,
which is the second part of the corollary below.
Corollary 9.19. For η ≥ 0, let U and V be two finite cover towers of Z with res(U) = res(V) >
0. Let f : X → Z be well-behaved and U and V be η-interleaved. Then, MM(U, f ) and
MM(V, f ) are η-interleaved. In particular, the bottleneck distance between the persistence di-
agrams Dgm p MM(U, f ) and Dgm p MM(V, f ) is at most η for all p ∈ N.
Definition 9.8 ((c, s)-good cover tower). Given a cover tower U = {Uε }ε≥s>0 , we say that it is
(c,s)-good if for any ε ≥ s > 0, we have that (i) smax (Uε ) ≤ ε and (ii) λ(Ucε ) ≥ ε.
As an example, consider the cover tower U = {Uε }ε≥s with Uε := {Bε/2 (z) | z ∈ Z}. It is a
(2, s)-good cover tower of the metric space (Z, dZ ).
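To see why this ball cover is (2, s)-good, both conditions of Definition 9.8 can be verified directly (a short check we add for completeness):

```latex
% (i)  s_max(U_eps) <= eps: each element is a ball of radius eps/2.
% (ii) lambda(U_{2eps}) >= eps: a set of diameter <= eps fits inside the
%      eps-ball around any of its points, and such balls form U_{2eps}.
\begin{align*}
\text{(i)}\;\;  & s_{\max}(\mathcal{U}_{\varepsilon})
                  = \sup_{z\in Z}\operatorname{diam} B_{\varepsilon/2}(z)
                  \le 2\cdot\tfrac{\varepsilon}{2} = \varepsilon;\\
\text{(ii)}\;\; & s(A)\le\varepsilon \;\Longrightarrow\;
                  A \subseteq B_{\varepsilon}(z)\in\mathcal{U}_{2\varepsilon}
                  \;\;\text{for any } z\in A,
                  \;\text{hence}\;\lambda(\mathcal{U}_{2\varepsilon})\ge\varepsilon.
\end{align*}
```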
We now characterize the persistent homology of multiscale mappers induced by (c, s)-good
cover towers. Theorem 9.20 states that the multiscale-mappers induced by any two (c, s)-good
cover towers interleave with each other, implying that their respective persistence diagrams are
also close under the bottleneck distance. From this point of view, the persistence diagrams in-
duced by any two (c, s)-good cover towers contain roughly the same information.
Theorem 9.20. Given a map f : X → Z, let U = {Uε → Uε′}ε≤ε′ and V = {Vε → Vε′}ε≤ε′ be two (c, s)-good cover towers of Z. Then the corresponding multiscale mappers MM(U, f) and MM(V, f) are multiplicatively c-interleaved.
Claim 9.2. Any two (c, s)-good cover towers U and V are multiplicatively c-interleaved.
Proof. This follows easily from the definitions of a (c, s)-good cover tower. Specifically, we first construct ζε : Uε → Vcε. For any U ∈ Uε, we have that diam(U) ≤ ε. Furthermore, since V is (c, s)-good, we have λ(Vcε) ≥ ε, so there exists V ∈ Vcε such that U ⊆ V. Set ζε(U) = V; if there are multiple choices of V, we choose an arbitrary one. We can construct ξε′ : Vε′ → Ucε′ in a symmetric manner, and the claim then follows.

This claim, combined with Propositions 9.17 and 9.18, proves the theorem.
We also need the following definition in order to state the stability results precisely.
Definition 9.9. Given a tower of covers U = {Uε} and ε0 ≥ res(U), we define the ε0-truncation of U as the tower Trε0(U) := {Uε}ε≥ε0. Observe that, by definition, res(Trε0(U)) = ε0.
Proposition 9.21. Let X be a compact topological space, (Z, dZ) be a compact path connected metric space, and f, g : X → Z be two continuous functions such that δ = max_{x∈X} dZ( f(x), g(x)) for some δ ≥ 0. Let W be any (c, s)-good cover tower of Z. Let ε0 = max(1, s). Then, the ε0-truncations of f∗(W) and g∗(W) are multiplicatively (2c max(δ, s) + c)-interleaved.
Proof. For notational convenience write η := 2c max(δ, s) + c, {Ut} = U := f∗(W), and {Vt} = V := g∗(W). With regard to satisfying Definition 4.2 for U and V, for each ε ≥ ε0 we need only exhibit maps of covers ζε : Uε → Vηε and ξε : Vε → Uηε. We first establish the following claim, where we recall that the offset O^r of a set O ⊆ Z is defined as O^r := {z ∈ Z | dZ(z, O) ≤ r}.

Claim 9.3. For any O ⊆ Z, f⁻¹(O) ⊆ g⁻¹(O^δ) and g⁻¹(O) ⊆ f⁻¹(O^δ). Indeed, if x ∈ f⁻¹(O), then dZ(g(x), O) ≤ dZ(g(x), f(x)) ≤ δ, so g(x) ∈ O^δ; the other containment is symmetric.
Now, pick any ε ≥ ε0 and any U ∈ Uε, and fix δ′ := max(δ, s). Then, there exists W ∈ Wε such that U ∈ cc( f⁻¹(W)), where cc(Y) stands for the set of path connected components of Y. Claim 9.3 implies that f⁻¹(W) ⊆ g⁻¹(W^δ′). Since W is a (c, s)-good cover tower of the connected space Z and s ≤ max(δ, s) ≤ 2δ′ + ε, there exists at least one set W′ ∈ Wc(2δ′+ε) such that W^δ′ ⊆ W′. This means that U is contained in some element of cc(g⁻¹(W′)) where W′ ∈ Wc(2δ′+ε). But, also, since c(2δ′ + ε) ≤ c(2δ′ + 1)ε for ε ≥ ε0 ≥ 1, there exists W″ ∈ Wc(2δ′+1)ε such that W′ ⊆ W″. This implies that U is contained in some element of cc(g⁻¹(W″)) where W″ ∈ Wc(2δ′+1)ε. This process, when applied to all U ∈ Uε for all ε ≥ ε0, defines a map of covers ζε : Uε → V(2cδ′+c)ε. A similar argument produces, for each ε ≥ ε0, a map of covers ξε from Vε to U(2cδ′+c)ε. So we have in fact proved that the ε0-truncations of U and V are multiplicatively η-interleaved.
Applying Proposition 9.21, Proposition 9.18, and Corollary 4.4, we get the following result,
where Dgmlog stands for the persistence diagram at the log-scale (of coordinates).
Corollary 9.22. Let W be a (c, s)-good cover tower of the compact connected metric space Z and let f, g : X → Z be any two well-behaved continuous functions such that max_{x∈X} dZ( f(x), g(x)) = δ. Then, the bottleneck distance between the persistence diagrams satisfies

db( Dgmlog MM(W, f), Dgmlog MM(W, g) ) ≤ log(2c max(s, δ) + c) + max(0, log(1/s)).
Proof. We use the notation of Proposition 9.21. Let U = f∗(W) and V = g∗(W). If max(1, s) = s,
then U and V are multiplicatively (2c max(s, δ) + c)-interleaved by Proposition 9.21 which gives a
bound on the bottleneck distance of log(2c max(s, δ) + c) between the corresponding persistence
diagrams at the log-scale by Corollary 4.4. In the case when s < 1, the bottleneck distance
remains the same only for the 1-truncations of U and V. Shifting the starting point of the two families to the left by at most s can introduce barcodes of length at most log(1/s) or stretch the existing barcodes to the left by at most log(1/s) for the respective persistence modules at the log-scale. To see this, consider the persistence module below, where ε_1 = s:
H_k(N(f^*(U_{ε_1}))) → H_k(N(f^*(U_{ε_2}))) → · · · → H_k(N(f^*(U_1))) → · · · → H_k(N(f^*(U_{ε_n})))
A homology class born at any index in the range [s, 1) either dies at or before the index 1 or is mapped to a homology class of H_k(N(f^*(U_1))). In the first case we have a barcode of length at most |log s| = log(1/s) at the log-scale. In the second case, a barcode of the persistence module

H_k(N(f^*(U_{ε_1}))) → · · · → H_k(N(f^*(U_{ε_n})))

starting at index 1 gets stretched to the left by at most |log s| = log(1/s). The same conclusion can be drawn for the persistence module induced by V. Therefore the bottleneck distance between the respective persistence diagrams at the log-scale changes by at most log(1/s).
Definition 9.10. Given a (pseudo)metric space (Y, d_Y), its intrinsic Čech complex C^r(Y) at scale r is defined as the nerve complex of the set of intrinsic r-balls {B(y; r)}_{y∈Y} defined using the (pseudo)metric d_Y.
Definition 9.11 (Intrinsic Čech filtration). The intrinsic Čech filtration of the (pseudo)metric space (Y, d_Y) is

C(Y) = {C^r(Y) ↪ C^{r′}(Y)}_{0<r≤r′<r_0}.

The intrinsic Čech filtration at resolution s is defined as C_s(Y) = {C^r(Y) ↪ C^{r′}(Y)}_{s≤r≤r′<r_0}.
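For a finite (pseudo)metric space given by its distance matrix, this definition is directly computable: a set of points spans a simplex of C^r(Y) exactly when their intrinsic r-balls have a common point, and for finite Y any witness of that intersection is itself a point of Y. The following Python sketch (not from the book; the names and the brute-force enumeration are illustrative) builds C^r(Y) up to a prescribed dimension.

```python
from itertools import combinations

def intrinsic_cech(D, r, max_dim=2):
    """Intrinsic Cech complex C^r(Y) of a finite (pseudo)metric space,
    given its pairwise distance matrix D.  A simplex {i_0, ..., i_k} is
    included iff the intrinsic balls B(i_j; r) have a common point,
    i.e. some z in Y has D[z][i_j] <= r for all j; for a finite space
    any witness of the intersection is itself a point of Y."""
    n = len(D)
    simplices = []
    for k in range(1, max_dim + 2):              # simplices on k vertices
        for s in combinations(range(n), k):
            if any(all(D[z][i] <= r for i in s) for z in range(n)):
                simplices.append(s)
    return simplices
```

For three points on a line at positions 0, 1, 3, the edge {0, 2} enters the complex at scale 1 because the middle point witnesses the ball intersection, even though the two endpoints are at distance 3.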
Recall the definition of the pseudometric d_f on X (Definition 9.6) induced from a metric on Z. Applying Definition 9.10 to the pseudometric space (X, d_f), we obtain its intrinsic Čech complex C^r(X) at scale r and then its Čech filtration C_s(X).
Theorem 9.23. Let C_s(X) be the intrinsic Čech filtration of (X, d_f) starting with resolution s. Let U = {U_ε →^{u_{ε,ε′}} U_{ε′}}_{s≤ε≤ε′} be a (c, s)-good cover tower of the compact connected metric space Z. Then the multiscale mapper MM(U, f) and C_s(X) are multiplicatively 2c-interleaved.
Corollary 9.24. Given a continuous map f : X → Z and a (c, s)-good cover tower U of Z,
let Dgmlog MM(U, f ) and Dgmlog C s denote the log-scaled persistence diagram of the persistence
modules induced by MM(U, f ) and by the intrinsic Čech filtration C s of (X, d f ) respectively. We
have that
d_b(Dgm^log MM(U, f), Dgm^log C_s) ≤ log(2c).
depends on the size of the 1-skeleton of K, which is typically orders of magnitude smaller than
the total number of simplices (such as triangles, tetrahedra, etc) in K.
Recall that K^1 denotes the 1-skeleton of a simplicial complex K; that is, K^1 contains the set of vertices and edges of K. Define f̃ : |K^1| → R to be the restriction of f to |K^1|; that is, f̃ is the PL function on |K^1| induced by the function values at vertices.
Condition 9.1 (Minimum diameter condition). For a cover tower W of a compact connected
metric space (Z, dZ ), let
κ(W) := inf{diam(W); W ∈ W ∈ W}
denote the minimum diameter of any element of any cover of the tower W. Given a simplicial
complex K with a function f : |K| → Z and a tower of covers W of the metric space Z, we say
that (K, f, W) satisfies the minimum diameter condition if diam( f (σ)) ≤ κ(W) for every simplex
σ ∈ K.
In our case, f is a PL-function, and thus satisfying the minimum diameter condition means
that for every edge e = (u, v) ∈ K 1 , | f (u) − f (v)| ≤ κ(W). In what follows we assume that K is
connected. We do not lose any generality by this assumption because the arguments below can be
applied to each connected component of K.
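Verifying the minimum diameter condition for a PL function thus reduces to two scans: compute κ(W) as the smallest diameter over all cover elements in the tower, then check that no edge image exceeds it. A minimal sketch, assuming a real-valued function and interval cover elements (all names are illustrative):

```python
def satisfies_min_diameter(edges, f, covers):
    """Condition 9.1 for a PL function on a 1-skeleton: every edge image
    must be no longer than kappa(W), the smallest diameter of any cover
    element anywhere in the tower.  `edges` is a list of vertex pairs,
    `f` maps vertices to reals, and `covers` is a list of covers, each a
    list of intervals (lo, hi)."""
    kappa = min(hi - lo for cover in covers for (lo, hi) in cover)
    return all(abs(f[u] - f[v]) <= kappa for (u, v) in edges)
```

If the condition fails, one can subdivide the offending edges (refining K without changing |K|) until every edge image fits inside κ(W).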
Definition 9.12 (Isomorphic simplicial towers). Two simplicial towers S = {S_ε →^{s_{ε,ε′}} S_{ε′}} and T = {T_ε →^{t_{ε,ε′}} T_{ε′}} are isomorphic, denoted S ≅ T, if res(S) = res(T) and there exist simplicial isomorphisms η_ε and η_{ε′} such that t_{ε,ε′} ∘ η_ε = η_{ε′} ∘ s_{ε,ε′} for all res(S) ≤ ε ≤ ε′; that is, the square formed by s_{ε,ε′}, t_{ε,ε′}, η_ε, and η_{ε′} commutes.
Our main result in this section is the following theorem, which enables us to compute the mapper, the multiscale mapper, as well as the persistence diagram for the multiscale mapper of a PL function f from its restriction f̃ to the 1-skeleton of the respective simplicial complex.
Theorem 9.25. Given a PL-function f : |K| → R and a tower of covers W of the image of f with (K, f, W) satisfying the minimum diameter condition, we have that MM(W, f) ≅ MM(W, f̃), where f̃ is the restriction of f to |K^1|.
We show in Proposition 9.26 that the two mapper outputs M(W, f ) and M(W, f˜) are identical
up to a relabeling of their vertices (hence simplicially isomorphic) for every W ∈ W. Also, since
the simplicial maps in the filtrations MM(W, f ) and MM(W, f˜) are induced by the pullback of the
same tower of covers W, they are identical again up to the same relabeling of the vertices. This
then establishes the theorem.
In what follows, for clarity of exposition, we use X and X 1 to denote the underlying space |K|
and |K 1 | of K and K 1 , respectively. Also, we do not distinguish between a simplex σ ∈ K and its
image |σ| ⊆ X and thus freely say σ ⊆ X when it actually means that |σ| ⊆ X for a simplex σ ∈ K.
Proposition 9.26. If (K, f, W) satisfies the minimum diameter condition, then for every W ∈ W,
M(W, f ) is identical to M(W, f˜) up to relabeling of the vertices.
Proposition 9.27. If (X, f, W) satisfies the minimum diameter condition, then for every W ∈ W
and every U ∈ f ∗ (W), the set U ∩ X 1 is connected.
Figure 9.10: Partial thickened edges belong to the two connected components in f −1 (W). Note
that each set in ccG ( f −1 (W)) contains only the set of vertices of a component in cc( f −1 (W)).
In what follows, as before, cc(O) for a set O denotes the set of all path connected components
of O.
Given a map f : |K| → Z defined on the underlying space |K| of a simplicial complex K, to
construct the mapper and multiscale mapper, one needs to compute the pullback cover f ∗ (W) for
a cover W of the compact metric space Z. Specifically, for any W ∈ W one needs to compute the preimage f^{-1}(W) ⊂ |K| and shatter it into connected components. Even in the setting adopted in Section 9.5, where we have a PL function f̃ : |K^1| → R defined on the 1-skeleton K^1 of K, the connected components in cc(f̃^{-1}(W)) may contain vertices, edges, and also partial edges: say for an edge e ∈ K^1, its intersection e_W = e ∩ f^{-1}(W) ⊆ e, that is, f(e_W) = f(e) ∩ W, is a partial edge. See Figure 9.10 for an example. In general, for more complex maps, σ ∩ f^{-1}(W) for a k-simplex σ may consist of partial triangles, tetrahedra, etc., which can be a nuisance for computations.
The combinatorial version of mapper and multiscale mapper sidesteps this problem by ensuring
that each connected component in the pullback f −1 (W) consists of only vertices of K. It is thus
simpler and faster to compute.
Definition 9.13 (G-induced connected component). Given a set of vertices O ⊆ V(G), the set of connected components of O induced by G, denoted ccG(O), is the partition of O into maximal subsets of vertices connected in G_O ⊆ G, the subgraph spanned by the vertices in O. We refer to each such maximal subset of vertices as a G-induced connected component of O. We define f^*_G(W), the G-induced pullback via the function f, as the collection of all G-induced connected components ccG(f^{-1}(W_α)) for all α ∈ A.
Definition 9.14. (G-induced multiscale mapper) Similar to the mapper construction, we define
Algorithm 17 MMapper(f, K, W)
Input:
  f : |K| → Z given by f_V : V(K) → Z, a cover tower W = {W_1, . . . , W_t}
Output:
  Persistence diagram Dgm_*(MM_{K^1}(W, f_V)) induced by the combinatorial MM of f w.r.t. W
1: for i = 1, . . . , t do
2:   compute V_W ⊆ V(K) where f(V_W) = f(V(K)) ∩ W and {V_W^j}_j = cc_{K^1}(V_W), ∀W ∈ W_i;
3:   compute the nerve complex N_i = N({V_W^j}_{j,W}).
4: end for
5: compute the filtration F : {N_i → N_{i+1}, i ∈ [1, t − 1]}
6: compute Dgm_*(F).
Given a map f : |K| → Z defined on the underlying space |K| of a simplicial complex K, let f_V : V(K) → Z denote the restriction of f to the vertices of K. Consider the 1-skeleton graph K^1 that provides the connectivity information for the vertices in V(K). Given any cover tower W of the metric space Z, the K^1-induced multiscale mapper MM_{K^1}(W, f_V) is called the combinatorial multiscale mapper of f w.r.t. W.
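Lines 2–3 of Algorithm 17 can be sketched for a single cover as follows, assuming a real-valued vertex function and interval cover elements; the nerve is built here only up to its 1-skeleton, and all names are illustrative:

```python
from itertools import combinations

def combinatorial_mapper(adj, fV, cover):
    """One level of Algorithm 17: the K^1-induced mapper of a vertex
    function.  `adj` is an adjacency dict of the 1-skeleton, `fV` maps
    vertices to reals, and `cover` is a list of intervals (lo, hi).
    Returns one node per G-induced connected component of each preimage
    and an edge of the nerve for every pair of components that share a
    vertex."""
    components = []                       # (cover index, frozenset of vertices)
    for idx, (lo, hi) in enumerate(cover):
        V_W = {v for v in adj if lo <= fV[v] <= hi}   # vertices mapped into W
        seen = set()
        for v in V_W:                     # DFS over the subgraph spanned by V_W
            if v in seen:
                continue
            comp, stack = set(), [v]
            while stack:
                u = stack.pop()
                if u in comp:
                    continue
                comp.add(u)
                stack.extend(w for w in adj[u] if w in V_W and w not in comp)
            seen |= comp
            components.append((idx, frozenset(comp)))
    nerve = [frozenset([i]) for i in range(len(components))]
    for i, j in combinations(range(len(components)), 2):
        if components[i][1] & components[j][1]:       # components intersect
            nerve.append(frozenset([i, j]))
    return components, nerve
```

Running this for every cover in the tower and connecting consecutive nerves by the induced simplicial maps yields the filtration of line 5.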
Theorem 9.28. Assume that (Z, dZ ) is a compact and connected metric space. Given a map
f : |K| → Z, let fV : V(K) → Z be the restriction of f to the vertex set V(K) of K.
Given a (c, s)-good cover tower W of Z such that (K, f, W) satisfies the minimum diameter condition (cf. Condition 9.1), the bottleneck distance between the persistence diagrams Dgm^log MM(W, f) and Dgm^log MM_{K^1}(W, f_V) is at most 3 log(3c) + 3 max(0, log(1/s)) for all k ∈ N.
Exercises
1. For a simplicial complex K, simplices with no cofacet are called maximal simplices. Con-
sider a closed cover of |K| with the closures of the maximal simplices as the cover elements.
Let N(K) denote the nerve of this cover. Prove that N(N(K)) is isomorphic to a subcomplex
of K.
2. ([17]) A vertex v in K is called dominated by a vertex v0 if every maximal simplex con-
taining v also contains v0 . We say K collapses strongly to a complex L if L is obtained by
a series of deletions of dominated vertices with all their incident simplices. Show that K
strongly collapses to N(N(K)).
3. We say a cover U of a metric space (Y, d) is an (α, β)-cover if α ≤ λ(U) and β ≥ s_max(U).
• Consider a δ-sample P of Y, that is, every metric ball B(y; δ), y ∈ Y, contains a point
in P. Prove that the cover U = {B(p; 2δ)} p∈P is a (δ, 4δ)-cover of Y.
4. Theorem 9.5 requires the cover elements to be path connected. Show that this condition is necessary by presenting a counterexample otherwise.
5. One may generalize Theorem 9.5 as follows: If for any k ≥ 0, t-wise intersections of cover
elements for all t > 0 have trivial reduced homology for Hk−t , then the nerve map induces a
surjection in Hk . Prove or disprove it.
• Prove that the quotient map q : X → R f is surjective and also induces a surjection
q∗ : H1 (X) → H1 (R f ).
• Call a class [c] ∈ H1(X) vertical if and only if there is no c′ ∈ C1(X) so that [c] = [c′] and f ∘ σ is constant for every σ ∈ c′. Show that q_*([c]) ≠ 0 if and only if c is vertical.
• Let z_1, . . . , z_g be an optimal cycle basis (Definition 9.3) of H1(X) defined with respect to the pseudometric d_f (Definition 9.6). Let ℓ ∈ [1, g] be the smallest integer so that s(z_ℓ) ≠ 0. Prove that if no such ℓ exists, H1(R_f) is trivial; otherwise, {[q(z_i)]}_{i=ℓ,...,g} is a basis for H1(R_f).
7. Let us endow R_f with a distance d̃_f that descends via the map q: for any equivalence classes r, r′ ∈ R_f, pick x, x′ ∈ X with r = q(x) and r′ = q(x′), then define

d̃_f(r, r′) := d_f(x, x′).
Discrete Morse theory is a combinatorial version of the classical Morse theory. Invented by Forman [161], the theory combines topology with the combinatorial structure of a cell complex. Specifically, much like the way critical points of a smooth Morse function on a manifold determine its topological entities such as homology groups and Euler characteristic, an analogous concept called critical simplices of a discrete Morse function determines similar structures for the complex it is defined on. Gradient vectors associated with smooth Morse functions give rise to integral lines and eventually the notion of stable and unstable manifolds [232]. Similarly, a discrete Morse function defines discrete gradient vectors leading to V-paths analogous to the integral lines. Using these V-paths, one can define the analogues of stable and unstable manifolds of the critical simplices.
It turns out that an acyclic pairing between simplices and their faces, in which every simplex participates in at most one pair, provides a discrete Morse function, and conversely a discrete Morse function defines such a pairing. This pairing, termed a Morse matching, is a main building block of discrete Morse theory. In this chapter, we connect this matching with the pairing obtained through the persistence algorithm. Specifically, we present an algorithm for computing a Morse matching, and hence a discrete Morse vector field, by connecting persistent pairs through V-paths. This requires an operation called critical pair cancellation which may not succeed all the time. However, for 1-complexes and simplicial 2-manifolds (pseudomanifolds), it always succeeds. Sections 10.1 and 10.2 are devoted to these results.
In Section 10.4, we apply our persistence-based discrete Morse vector field to reconstruct geometric graphs from their noisy samples. Here we show that unstable manifolds of critical edges can recover, with guarantees, a graph from density data that captures the hidden graph reasonably well. We provide two applications of this graph reconstruction algorithm, one for road network reconstruction from GPS trajectories and satellite images, and another for neuron reconstruction from images. Section 10.5 describes these applications.
The first condition says that at most one facet of a simplex σ has a function value higher than or equal to f(σ), and the second condition says that at most one cofacet of a simplex σ can have a function value lower than or equal to f(σ). By a result of Chari [75], the two conditions imply that the two sets above are disjoint; that is, if a pair (σ^{p−1}, σ^p) satisfies the first condition, there is no pair (σ^p, σ^{p+1}) satisfying the second condition, and vice versa. This means that a Morse function f induces a matching:
Definition 10.1 (Matching). A set of ordered pairs M = {(σ, τ)} is a matching in K if the following
conditions hold:
In Figure 10.1, we indicate a matching by putting an arrow from the lower dimensional sim-
plex to the higher dimensional simplex. Observe that the source of each arrow is a facet of the
target of the arrow.
Note, however, that the matching in K defined by a Morse function has an additional property of acyclicity, which we show next. First, let us define a relation σ_i ≺ σ_{i+1} if σ_{i+1} = µ(σ_i), or σ_{i+1} is a facet of σ_i but σ_i ≠ µ(σ_{i+1}).
Definition 10.2 (V-path and Morse matching). Given a matching M in K, for k > 0, a V-path π is a sequence

π : σ_0 ≺ σ_1 ≺ · · · ≺ σ_k,   (10.1)

where for 0 < i < k, σ_i ≠ µ(σ_{i−1}) implies σ_{i+1} = µ(σ_i). In other words, a V-path is an alternating
sequence of facets and cofacets thus alternating in dimensions where every consecutive pair also
alternates between matched and unmatched pairs. A V-path is cyclic if the first simplex σ0 is a
facet of the last simplex σk or σ0 = µ(σk ) and the matching M is called cyclic if there is such a
path in it. Otherwise, M is called acyclic. An acyclic matching in K is called a Morse matching.
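Acyclicity of a matching can be tested on the modified Hasse diagram: every facet relation becomes an arrow from a simplex down to its facet, except that a matched pair's arrow is reversed to point up from the facet to its cofacet; the matching is Morse exactly when this directed graph has no cycle. A Python sketch (illustrative names, not from the book) using depth-first search for cycle detection:

```python
def is_morse_matching(simplices, matching):
    """Check acyclicity of a matching via the modified Hasse diagram.

    `simplices` is a list of frozensets of vertices; `matching` maps
    each matched facet to its cofacet (mu).  Arrows go cofacet -> facet,
    except matched pairs, whose arrow is reversed facet -> cofacet; the
    matching is Morse iff the resulting directed graph is acyclic."""
    succ = {s: [] for s in simplices}
    for s in simplices:
        for t in simplices:
            if t < s and len(t) == len(s) - 1:   # t is a facet of s
                if matching.get(t) == s:
                    succ[t].append(s)            # reversed (matched) arrow
                else:
                    succ[s].append(t)            # ordinary facet arrow
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {s: WHITE for s in simplices}

    def has_cycle(s):                            # DFS looking for a back edge
        color[s] = GRAY
        for t in succ[s]:
            if color[t] == GRAY or (color[t] == WHITE and has_cycle(t)):
                return True
        color[s] = BLACK
        return False

    return not any(color[s] == WHITE and has_cycle(s) for s in simplices)
```

On the square of Figure 10.1(left), matching every vertex to the next boundary edge produces the cyclic sequence a ≺ ab ≺ b ≺ bc ≺ · · ·, which this test rejects.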
¹Forman formulated discrete Morse functions for more general cell complexes.
In Figure 10.1(left), the matching indicated by the arrows is not a Morse matching, whereas the matching in Figure 10.1(right) is a Morse matching. Observe that in a sequence like (10.1), the function values on the facets of the matched pairs strictly decrease. This observation leads to the following fact.
Fact 10.1. The matching induced by a Morse function on K is acyclic, thus is a Morse matching.
Proof. First, order those simplices which are in some pair of M. A simplex σ^{p−1} is ordered before σ^p if (σ^{p−1}, σ^p) ∈ M, and it is ordered after σ^p if it is a facet of σ^p but (σ^{p−1}, σ^p) ∉ M. Such an ordering is possible because M is acyclic. Then, simply order the rest of the simplices not in any pair of M according to their increasing dimensions. Assign the order numbers as the function values of the simplices; one can easily verify that this satisfies conditions (1) and (2) of a (discrete) Morse function on K.
Proposition 10.1. Given a Morse function f on K with its induced Morse matching M, let the c_i's and β_i's be defined as above. We have:
The weak Morse inequality can be derived from the strong Morse inequality (Exercise 7).
Definition 10.3 (DMVF). A discrete Morse vector field (DMVF) V in a simplicial complex K is a partition V = C ⊔ L ⊔ U of K, where L is the set of facets each paired with a unique cofacet in U by a Morse matching M giving µ(L) = U, and C is the set of unpaired simplices, called critical simplices. We also say that V is induced by the matching M in this case.
Figure 10.1: Two DMVFs: (left) the matching is not Morse because the sequence a ≺ ab ≺ b ≺
bc ≺ c ≺ cd ≺ d ≺ da is cyclic; (right) the matching is Morse, and there is no cyclic sequence.
We interpret each pair (σ, τ = µ(σ)) as a vector originating at σ and terminating at τ, and draw the vector as an arrow with tail in σ and head in τ; see Figures 10.1 and 10.2. The critical simplices are treated as critical points of the vector field, justifying their name. The vertex e and the edge ce in both the left and right pictures in Figure 10.1 are critical, whereas the vertex c is critical only in the right picture and the edge bf is critical only in the left picture.
In analogy to the integral lines for smooth vector fields, we define so-called critical V-paths for discrete Morse vector fields.
Observe that σ_0 and σ_k in the above definition are necessarily a p- and a (p − 1)-simplex respectively if the V-path alternates between p- and (p − 1)-simplices. The V-path corresponding to a critical V-path cannot be cyclic due to this observation. The critical triangle cda with any of its edges in Figure 10.1(left) forms a non-critical V-path, whereas the pair ce ≺ e forms a critical V-path in Figure 10.1(right).
In a critical V-path π, the pairs (σ_1, σ_2), · · · , (σ_{2i−1}, σ_{2i}), · · · , (σ_{k−2}, σ_{k−1}) are matched. We can cancel the pair of critical simplices (σ_0, σ_k) by reversing the matched pairs.
Definition 10.5 (Cancellation). Let (σ_0, σ_k) be a pair of critical simplices with a critical V-path π : σ_0 ≺ σ_1 ≺ · · · ≺ σ_{i−1} ≺ σ_i ≺ σ_{i+1} ≺ · · · ≺ σ_k. The pair (σ_0, σ_k) is cancelled if one modifies the matching by shifting the matched pairs by one position, that is, by asserting that the pairs (σ_k, σ_{k−1}), · · · , (σ_{i+1}, σ_i), · · · , (σ_1, σ_0) are matched instead – we refer to this as the (Morse) cancellation of (σ_0, σ_k). Observe that a cancellation essentially reverses the vectors in the V-path π and additionally converts the critical simplices σ_0 and σ_k to non-critical ones; see Figure 10.2. We say that the pair (σ_0, σ_k) is (Morse) cancellable if there exists a unique critical V-path between them.
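The pair-shifting of Definition 10.5 is a small dictionary update once the matching is stored facet-to-cofacet: remove the old pairs (σ_1, σ_2), . . . , (σ_{k−2}, σ_{k−1}) and install the shifted pairs (σ_1, σ_0), . . . , (σ_k, σ_{k−1}). A sketch (Python; names are illustrative, and no uniqueness check is performed):

```python
def cancel(matching, path):
    """Morse cancellation along a critical V-path
    path = [s0, s1, ..., sk] (s0 a critical cofacet, sk a critical
    facet).  The matching maps each facet to its cofacet; shifting the
    pairs by one position reverses every vector on the path and makes
    s0 and sk non-critical."""
    m = dict(matching)
    for i in range(1, len(path) - 1, 2):
        del m[path[i]]                 # drop the old pair (s_i, s_{i+1})
    for i in range(1, len(path), 2):
        m[path[i]] = path[i - 1]       # install the shifted pair (s_i, s_{i-1})
    return m
```

For the length-one critical V-path ce ≺ e of Figure 10.1(right), there are no old pairs to remove and the cancellation simply matches e with ce.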
Observe that a cancellation preserves the property of being a matching, that is, the new pairs together with the undisturbed pairs indeed form a matching. Uniqueness of the critical V-path connecting a pair of critical simplices ensures that the resulting new matching remains Morse. If there is more than one such critical V-path, the new matching may become cyclic – for example, in Figure 10.2(c), the cancellation of one critical V-path between the triangle-edge pair creates a cyclic V-path. The uniqueness of the critical V-path is sufficient to ensure that such a cyclic matching cannot be produced. In particular, we have:
Figure 10.2: Critical vertices and edges are marked red; (a) before cancellation of the edge-vertex pair (v_2, e_2); (b) after cancellation, the path from e_2 to v_2 is inverted, giving rise to a critical V-path from e_1 to v_1, making (v_1, e_1) now potentially cancellable; (c) the edge-triangle pair (e, t), if cancelled, creates a cycle as there are two V-paths between them.
Proposition 10.2. Given a Morse matching M, suppose we cancel a pair of critical simplices σ and σ′ in a DMVF V via a critical V-path to obtain a new matching M′. Then M′ remains a Morse matching if and only if this V-path is the only critical V-path connecting σ and σ′ in V (i.e., the pair (σ, σ′) is cancellable as per Definition 10.5).
Proof. First, assume that there are two V-paths π and π′ originating at σ and ending at σ′. Since π and π′ are distinct and have common simplices σ at the beginning and σ′ at the end, there are simplices τ and τ′ where the two paths differ for the first time after τ and join again for the first time at τ′. Reversing one V-path, say π, creates a V-path from τ′ to τ. This sub-path along with the V-path from τ to τ′ on π′ creates a cyclic V-path, thus proving the 'only if' part.

Next, suppose that there is only a single V-path from σ to σ′. After reversing this path, we claim that no cyclic V-path is created. For contradiction, assume that a cyclic V-path is created as the result of the reversal of π. Let the maximal sub-path of the reversed π on this cyclic path start at τ and end at τ′. We have τ ≠ τ′ because otherwise the original matching would have been cyclic in the first place. But then the cyclic V-path has a sub-path from τ′ to τ that is not in π. Since the reversed V-path π has a sub-path from τ to τ′, the original path has a sub-path from τ′ to τ. It means that the DMVF V originally had two V-paths from σ to σ′, with one of them being π while the other one contains a sub-path not in π. This contradicts the assumption that there is a single V-path from σ to σ′. Hence the assumption that a cyclic V-path is created is wrong, which completes the proof of the 'if' part.
Proposition 10.3. Let (v_1, e_1), (v_2, e_2), · · · , (v_n, e_n) be the sequence of all non-essential persistence pairs of vertices and edges sorted in increasing order of the appearance of the edges e_i in a filtration of a 1-complex K. Let V_0 be the DMVF in K with all simplices being critical. Suppose a DMVF V_{i−1} can be obtained by cancelling successively (v_1, e_1), (v_2, e_2), · · · , (v_{i−1}, e_{i−1}). Then, (v_i, e_i) can be cancelled in V_{i−1} providing a DMVF V_i for all i ≥ 1.
Proof. Inductively assume that (i) V_{i−1} is a DMVF obtained as claimed in the proposition, and (ii) any matched edge in V_{i−1} is a paired edge in a persistence pair. We argue that these two hypotheses hold for V_i, proving the claim due to hypothesis (i).

The base case for i = 1 is trivially true because V_0 is a DMVF and there is no matched edge. Inductively assume that V_{i−1} satisfies the inductive hypothesis for i > 1. Consider the persistence pair (v_i, e_i). First, we observe that a V-path e_i = e_{i_1} ≺ v_{i_1} ≺ . . . ≺ e_{i_n} ≺ v_{i_n} = v_i exists in V_{i−1}. If not, starting from the two endpoints of e_i, we follow the two V-paths and let v, v′ ≠ v_i be the first two critical vertices encountered during this construction. Without loss of generality, assume that v′ appears before v in the filtration. Then, the 0-dimensional class [v + v′] is born when v is introduced. It is destroyed by e_i. It follows that (v, e_i) is a persistence pair (Fact 3.3), contradicting that actually (v_i, e_i) is a persistence pair. For the induction, consider the V-path e_i = e_{i_1} ≺ v_{i_1} ≺ . . . ≺ e_{i_n} ≺ v_{i_n} = v_i in V_{i−1} which is cancelled to create V_i. For V_i not to be a DMVF, due to Proposition 10.2, we must have another distinct V-path from e_i to v_i in V_{i−1}, namely e_i = e_{j_1} ≺ v_{j_1} ≺ . . . ≺ e_{j_{n′}} ≺ v_{j_{n′}} = v_i. These two non-identical paths form a 1-cycle. Every edge in this cycle except possibly e_i is a matched edge in V_{i−1} and hence participates in a persistence pair by the inductive hypothesis. Then, all edges in the 1-cycle participate in some persistence pair because e_i is also such an edge by assumption. But this is impossible because in any 1-cycle at least one edge has to remain unpaired in persistence. It follows that by cancelling (v_i, e_i), we obtain a DMVF V_i satisfying inductive hypothesis (i). Also, inductive hypothesis (ii) follows because the new matched pairs in V_i involve edges that were already matched in V_{i−1} and the edge e_i, which participates in a persistence pair by assumption.
The result above holds for vertex-edge pairing in any simplicial complex. Furthermore, using
dual graphs, it can be used for edge-triangle pairing in triangulations of 2-manifolds. Given a
simplicial 2-complex K whose underlying space is a 2-manifold without boundary, consider the
dual graph (1-complex) K ∗ where each triangle t ∈ K becomes a vertex t∗ ∈ K ∗ and two vertices
t1∗ and t2∗ are joined with an edge e∗ if triangles t1 and t2 share an edge e in K.
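The dual graph can be built in one pass over the triangles by bucketing them on their (sorted) edges; a minimal Python sketch with illustrative names:

```python
from collections import defaultdict

def dual_graph(triangles):
    """Dual graph K* of a triangulated 2-manifold without boundary: one
    vertex t* per triangle t, and an edge (t1*, t2*) whenever t1 and t2
    share an edge.  `triangles` is a list of 3-tuples of vertex ids;
    dual vertices are triangle indices."""
    by_edge = defaultdict(list)
    for i, t in enumerate(triangles):
        a, b, c = sorted(t)
        for edge in ((a, b), (a, c), (b, c)):
            by_edge[edge].append(i)
    # on a closed 2-manifold every edge has exactly two incident triangles
    return [tuple(ts) for ts in by_edge.values() if len(ts) == 2]
```

On the boundary of a tetrahedron, every pair of the four triangles shares an edge, so the dual graph is the complete graph K_4 with six edges.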
The following result connects the persistence of a filtration of K and its dual graph K ∗ .
Proof. Recall Proposition 3.8. The edge-triangle persistence pairs produced by the filtered boundary matrix D_2 for a filtration of K are exactly the same as the triangle-edge persistence pairs obtained from the twisted (transposed and reversed) matrix D_2^* by left-to-right column additions. The matrix D_2^* is exactly the filtered boundary matrix of a filtration F(K^*) of K^* that reverses the subsequence of triangles and edges. Dualizing a triangle t to a vertex t^* and an edge e to an edge e^*, we can view F(K^*) as a filtration on a 1-complex (graph). Then, applying Theorem 3.6, we get that (t^*, e^*) is indeed a persistence pair for the filtration F(K^*).
We can compute a DMVF V^* for K^* by cancelling all persistence pairs as stated in Proposition 10.3. By duality, this also produces a DMVF V for the 2-manifold K. The action of cancelling a vertex-edge pair in K^* can be translated into a cancellation of an edge-triangle pair in K. Combining Propositions 10.3 and 10.4, we obtain the following result.
Theorem 10.5. Let K be a finite simplicial 2-complex whose underlying space is a 2-manifold
without boundary and F be a simplex-wise filtration of K (Definition 3.1). Starting from the trivial
DMVF where each simplex is critical, one can obtain a DMVF in K by cancelling the vertex-edge
and edge-triangle persistence pairs given by F.
In general, by duality one can apply the above theorem to cancel all persistence pairs between (d − 1)-simplices and d-simplices in a filtration of a simplicial d-complex where each (d − 1)-simplex has at most two d-simplices as cofacets. This includes simplicial d-manifolds with boundary. We call a (d − 1)-simplex a boundary simplex if it adjoins exactly one d-simplex. For this extension, one has to introduce a 'dummy' vertex in the dual graph that connects to all dual vertices of d-simplices incident to a boundary (d − 1)-simplex. We leave it as an exercise (Exercise 11).
Unfortunately, the result does not extend any further. In particular, Theorem 10.5 does not extend to arbitrary simplicial 2-complexes and hence to arbitrary simplicial complexes. The main difficulty arises because such a complex does not admit a dual graph in general. Indeed, there are counterexamples which exhibit that not every persistence pair for a filtration of a simplicial 2-complex can be cancelled leading to a DMVF. The following Dunce hat example exhibits this obstruction.
Dunce hat. Consider a 2-manifold with boundary which is a cone with apex v and the boundary
circle c. Let u be a point on c. Modify the cone by identifying the line segment uv with the
circle c. Because of the similarity, the space obtained by this identification is called the Dunce
hat. Consider a triangulation K of the Dunce hat. Notice that the Dunce hat, and hence |K|, is not a 2-manifold. The edges discretizing uv in K have three triangles incident to them. We show that there is no DMVF without any critical edge or triangle for K. The complex K is known to have
βi (K) = 0 for all i > 0 and has two or more triangles adjoining every edge in it. For any filtration
of K, there cannot be any edge or triangle that remains unpaired because otherwise that would
contradict that β_1(K) = 0 and β_2(K) = 0 (Fact 3.9 in Chapter 3). If a DMVF V could be created by cancelling persistence pairs, there would be a finite maximal V-path that cannot be extended any further. Consider such a path π starting at a simplex σ. If σ is a triangle, the edge µ^{−1}(σ) matched with it can be added before it to extend π. If σ is an edge, there is a triangle adjoining σ not in the V-path, because at least two triangles adjoin σ and the V-path starting at σ cannot be cyclic. We can add that triangle to extend π. In both cases, we contradict that π is maximal.
10.2.2 Algorithms
The above results naturally suggest an algorithm for computing a persistence-based DMVF for a simplicial 2-manifold K. We compute the persistence pairs on a chosen filtration F of K and then cancel them successively as Theorem 10.5 suggests. Both of these tasks can be combined by modifying the well-known Kruskal's algorithm for computing a minimum spanning tree of a graph.
Consider a graph G = (U, E) which can be either the 1-skeleton of a complex K or the dual graph K^* if K is a simplicial 2-manifold. Let u_1, u_2, . . . , u_k and e_1, e_2, . . . , e_ℓ be an ordered sequence of vertices and edges in G. For a minimum spanning tree, the sequence of edges is taken in non-decreasing order of their weights. Here we describe the algorithm assuming any order. Kruskal's algorithm maintains a spanning forest of the vertex set. It brings in one edge e at a time in the given order, either to join two trees in the current forest or to discover that the edge makes a cycle and hence does not belong to the spanning forest. If the two endpoints of e belong to two different trees in the forest, then it joins those two trees. Otherwise, e connects two vertices in the same tree, creating a cycle. The main computation involves determining if the two vertices of an edge belong to the same tree or not. Algorithm 18:PersDMVF does this with a union-find data structure, which maintains the set of vertices of a tree as a single set; two sets are united if an edge joins the two respective trees. This is similar to the FindSet and Union operations in the algorithm ZeroPerDg described in Section 3.5.3. All such find and union operations can be done in O(k + ℓα(ℓ)) time assuming that there are k vertices and ℓ edges in the graph, which dominates the overall complexity.
We can incorporate the persistence computation and the Morse cancellations simultaneously in the above algorithm with some simple modifications. We process the vertices and edges in the order of the input filtration. Usually, the filtration F = F_f is given by a simplex-wise monotone function f as described in Section 3.1.2. We compute the persistence Pers(e) of an edge e as Pers(e) = |f(e) − f(r)| if e pairs with the vertex r, and ∞ otherwise.
For a vertex u in the filtration F f , we do not do anything other than creating a new set con-
taining u only. When an edge e = (u, u0 ) comes in, we check if u and u0 belong to the same
tree by using the union-find data structure. If they do, the edge e is designated as a creator for
persistence and as a critical edge in DMVF that is being built on G. Otherwise, we compute
Pers (e) after finding the persistence pair for e and at the same time cancel e with its pair in
the DMVF as follows. Assume inductively that the current DMVF matches every vertex other
than the roots of the trees to one of its adjacent edge as follows. For a leaf vertex v, con-
sider the path v = v1 , e1 , . . . , ek−1 , vk = r from v to the root r which consist of matched pairs
(v1 , e1 ), . . . , (vk−1 , ek−1 ) and the critical vertex r. For the edge e = (u, u0 ), let the roots of the two
trees T u and T u0 containing u and u0 be r and r0 respectively. Assume without loss of generality
that r succeeds r0 in the input filtration. Then, e pairs with r in persistence because e joins the
two components created by r and r0 between which r comes later in the filtration. We cancel the
persistence pair (r, e) by shifting the matched pairs on the path from u to r as stated in Defini-
tion 10.5. We join the two trees T u and T u0 into one tree by calling the routine Join. The root of
the joined tree becomes r0 . Cancelling (r, e) maintains the invariant that every path from the leaf
to the root of the new tree remains a V-path. See Figure 10.3 for an illustration.
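The path reversal performed by this cancellation can be sketched in a few lines of Python, under the assumption that each tree is stored via a parent map (an illustrative representation, not the book's code):

```python
# Cancelling the pair (r, e) for e = (u, u'): reverse the matched pairs along
# the V-path from u up to the root r by re-rooting u's tree at u and hanging
# it from the other endpoint u' across the new edge e.  (Illustrative sketch.)

def cancel_by_rerooting(parent, u, u_other):
    child, x = u_other, u
    while x is not None:
        # reverse one parent pointer; after the loop, the old root r has a
        # parent, so every path to the new root remains a V-path
        parent[x], child, x = child, x, parent[x]
```

Each iteration flips one matched pair on the path, so the whole cancellation costs time proportional to the path length.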
Computational Topology for Data Analysis 245
Algorithm 18 PersDMVF(G, F f )
Input:
A graph G and a filtration F f on its n vertices and edges
Output:
A DMVF V and persistence pairs of F f which are cancelled for creating V
1: Let G = (U, E) and F be the input filtration of its n vertices and edges.
2: T := ∅; V := ∅ ⊔ ∅ ⊔ {(U ∪ E)}; Initialize U := U
3: for all i = 1, . . . , n do
4: if σi ∈ F f is a vertex u then
5: Create a tree T rooted at u; T := T ∪ {T }
6: else if σi ∈ F is an edge e = (u, u0 ) then
7: if t :=FindSet(u)= t0 :=FindSet(u0 ) then
8: designate e as creator and critical in V; Pers (e) := ∞
9: else
10: Union(t,t0 ) updating U
11: Let T u and T u0 be trees containing u and u0
12: Find V-paths πu from u to root r and πu0 from u0 to r0 in T u and T u0 respectively
13: Let r succeed r0 in F; Cancel (e, r) considering the V-path πu and update DMVF V;
Pers (e) := | f (e) − f (r)|
14: Join(T u , T u0 ) in T
15: end if
16: end if
17: end for
18: Output V and persistence pairs with persistence values
Figure 10.3: Illustration for Algorithm PersDMVF: destroyer edge e = (u, u0 ) is joining two trees
T u and T u0 with roots r and r0 respectively. The pair (r, e) is cancelled reversing the arrows on
three edges on the path from r to u0 ; edge e0 in the right picture is a creator and does not make
any change in the forest.
The costly step in algorithm PersDMVF is the cancellation step, which takes O(n) time and
thus incurs a running time of O(n²) in total. However, we observe that all matchings in the final
DMVF are made between a node v and the edge e that connects v to its parent parent(v) in the
respective rooted tree, while the root remains critical. All non-tree edges remain critical. Thus, we
can eliminate the cancellation step in PersDMVF and after computing the final forest we can
Algorithm 19 SimplePersDMVF(G, F f )
Input:
A graph G and a filtration F f on its n vertices and edges
Output:
A DMVF V and persistence pairs of F f which are cancelled for creating V
1: Let G = (U, E) and F f be the input filtration of its n vertices and edges.
2: T := ∅; V := ∅; Initialize U := U
3: for all i = 1, . . . , n do
4: if σi ∈ F f is a vertex u then
5: Create a tree T rooted at u; T := T ∪ {T }
6: else if σi ∈ F is an edge e = (u, u0 ) then
7: if t :=FindSet(u)= t0 :=FindSet(u0 ) then
8: designate e as creator and critical in V; Pers (e) := ∞
9: else
10: Union(t,t0 ) updating U
11: Let T u and T u0 be trees containing u and u0 with roots r and r0
12: Let r succeed r0 in F; Pers (e) := | f (e) − f (r)|
13: Join(T u , T u0 ) in T with edge e
14: end if
15: end if
16: end for
17: for each tree T ∈ T do
18: for each node v in T do
19: e := (v, parent(v)); V := V ⊔ {(v, e)}
20: end for
21: Put the root of T as a critical vertex in V
22: end for
23: Output V and persistence pairs with persistence values
determine all matched pairs by traversing the trees upward from the leaves to the roots while
matching a vertex with the edge visited next in this upward traversal. This matching takes O(n)
time. Accounting for the union-find operations, all other steps in PersDMVF take O(nα(n)) time
in total. The simplified version Algorithm 19:SimplePersDMVF incorporates these changes. We
have the following result.
Theorem 10.6. Given a simplicial 1-complex or a simplicial 2-manifold K with n simplices, one
can compute
1. a DMVF by cancelling all persistence pairs resulting from a given filtration of K in O(nα(n))
time;
2. a DMVF as above when the filtration is induced by a given PL-function on K in O(n log n)
time.
Proof. We argue for all statements in the theorem when K is a 1-complex. By considering the
dual graph K ∗ and combining Propositions 10.3 and 10.4, the arguments also hold
for K when it is a simplicial 2-manifold. The algorithm SimplePersDMVF outputs the same as
the algorithm PersDMVF whose correctness follows from Theorem 10.5 because it cancels the
persistence pairs exactly as the theorem dictates. The complexity analysis of the algorithm Sim-
plePersDMVF establishes the first statement. For the second statement, given the function values
at the vertices of K, we can compute a simplex-wise lower star filtration (Section 3.5) in O(n log n)
time after sorting these function values. A subsequent application of SimplePersDMVF on this
lower star filtration provides us the desired DMVF.
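To make the preceding discussion concrete, here is a Python sketch in the spirit of SimplePersDMVF. The filtration encoding, the re-rooting via parent pointers, and the helper names are our own assumptions, not the book's implementation; the sketch maintains a union-find structure alongside tree parent pointers and performs the final upward matching pass.

```python
# Sketch of SimplePersDMVF on a filtration given as a list of entries
# ('v', u, value) and ('e', u, w, value), processed in filtration order.

def simple_pers_dmvf(filtration):
    parent, birth, uf, root, value = {}, {}, {}, {}, {}
    pers, critical_edges = {}, []

    def find(v):                        # FindSet with path halving
        while uf[v] != v:
            uf[v] = uf[uf[v]]
            v = uf[v]
        return v

    for i, simplex in enumerate(filtration):
        if simplex[0] == 'v':           # a vertex starts its own tree
            _, u, f = simplex
            parent[u], birth[u] = None, i
            uf[u], root[u], value[u] = u, u, f
        else:                           # an edge e = (u, w)
            _, u, w, f = simplex
            e = (u, w)
            value[e] = f
            tu, tw = find(u), find(w)
            if tu == tw:                # e creates a cycle: critical creator
                pers[e] = float('inf')
                critical_edges.append(e)
            else:                       # e pairs with the LATER root r
                ru, rw = root[tu], root[tw]
                r, keep = (ru, rw) if birth[ru] > birth[rw] else (rw, ru)
                pers[e] = abs(f - value[r])
                # Join: re-root r's tree at its endpoint of e and hang it
                # below the other endpoint (reversing parent pointers)
                x, p = (u, w) if r == ru else (w, u)
                while x is not None:
                    parent[x], p, x = p, x, parent[x]
                uf[tu] = tw             # Union; representative keeps root
                root[tw] = keep
    # final pass: match every non-root vertex with the edge to its parent
    matching = {v: (v, parent[v]) for v in parent if parent[v] is not None}
    return matching, pers, critical_edges
```

Note the final dictionary comprehension plays the role of steps 17–22: every non-root vertex is matched with the edge toward its parent, and each root stays critical.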
We can modify SimplePersDMVF slightly to take into account a threshold δ for persistence,
that is, we can cancel pairs only with persistence up to δ. To do this, we need a slightly different
version of Proposition 10.3. The cancellation also succeeds if we cancel persistent pairs in the
order of their persistence values. The proof of Proposition 10.3 can be adapted for the following
proposition.
Proposition 10.7. Let (v1 , e1 ), (v2 , e2 ), · · · , (vn , en ) be the sequence of all non-essential persis-
tence pairs of vertices and edges sorted in non-decreasing order of their persistence for a given
filtration of K. Let V0 be the DMVF in K with all simplices being critical. Suppose DMVF Vi−1
can be obtained by cancelling successively (v1 , e1 ), (v2 , e2 ), · · · (vi−1 , ei−1 ). Then, (vi , ei ) can be
cancelled in Vi−1 providing a DMVF Vi for all i ≥ 1.
The modified algorithm proceeds as in SimplePersDMVF, but designates as critical those edges
whose persistence is more than δ. Then, before traversing the edges of the trees in the forest T
to output the vertex-edge pairs, we delete all these critical edges from T. This splits the trees in
T and creates more trees. We need to determine the roots of these trees. Observe that, had we
done the cancellations as in PersDMVF, the roots of the trees would have been the vertices that
appear the earliest in the filtration among all vertices in the respective trees. So, all trees in T
obtained after deleting all critical edges are rooted at the vertices that appear the earliest in the
filtration. Then, the steps 17 to 22 in SimplePersDMVF compute the vertex-edge matchings into
the DMVF from these rooted trees. The new algorithm called PartialPersDMVF modifies step
13 of the algorithm SimplePersDMVF as:
• delete all critical edges from T and create new rooted trees in T as described
Let Vδ denote the resulting DMVF after canceling all vertex-edge persistence pairs with per-
sistence at most δ.
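The post-processing step of PartialPersDMVF can be sketched as follows (the data representation and names are our own assumptions): delete the tree edges with persistence above δ, then re-root each resulting component at its earliest vertex in the filtration.

```python
# Delete tree edges of persistence > delta and re-root each component of
# the split forest at its earliest vertex.  Edges are keyed by sorted
# endpoint pairs (an assumed convention for this sketch).

def split_and_reroot(parent, birth, pers, delta):
    # keep only tree edges cancelled by the algorithm (persistence <= delta)
    kept = {v: p for v, p in parent.items()
            if p is not None and pers[tuple(sorted((v, p)))] <= delta}
    adj = {v: [] for v in parent}       # adjacency of the surviving forest
    for v, p in kept.items():
        adj[v].append(p)
        adj[p].append(v)
    roots, new_parent, seen = [], {}, set()
    # visiting vertices in filtration order makes each component's earliest
    # vertex its root, as argued in the text
    for v in sorted(parent, key=lambda x: birth[x]):
        if v in seen:
            continue
        roots.append(v)
        seen.add(v)
        new_parent[v] = None
        stack = [v]
        while stack:                    # orient the component toward v
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    new_parent[y] = x
                    stack.append(y)
    return roots, new_parent
```

Running the vertex-edge matching pass on the returned parent map then yields Vδ.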
Proposition 10.8. The following statements hold for the output T of the algorithm PartialPers-
DMVF w.r.t any δ ≥ 0:
(i) For each tree T i , its root ri is the only critical simplex in Vδ ∩ T i . The collection of these
roots corresponds exactly to those vertices whose persistence is bigger than δ.
(ii) Any edge with Pers (e) > δ remains critical in Vδ and cannot be contained in T.
errors. Below, we introduce some concepts necessary for transitioning to the discrete versions of
(un)stable manifolds.
Figure 10.4: (Un)stable manifolds for a smooth Morse function on left and its discrete version
(shown partially) on right: (a) t1 and t2 are maxima (critical triangles in discrete Morse), v is a
minimum, e1 and e2 are saddles (critical edges in discrete Morse). The unstable manifold of e1
flows out of it to t1 and t2 . On the other hand, its stable manifolds flow out of minima such as v
and come to it. These flows work in the opposite direction of ‘gravity’: if we put a drop of water
at x it will flow to v, and if we put it on the other side of the mountain ridge it will flow to the
other minimum; (b) the flow direction reverses from the smooth case to the discrete case.
triangle-edge paths. Notice that these mountain ridges on a triangulation of a d-manifold corre-
spond to a V-path alternating between d- and (d − 1)-dimensional simplices. Computationally,
however, vertex-edge gradient paths are simpler to handle, especially for the Morse cancellations
below. Hence in our algorithm below, we negate the density function ρ and consider the function
−ρ. The algorithm outputs a subset of the 1-unstable manifolds that are vertex-edge gradient paths
in the discrete setting as the recovered hidden graph.
With the above setup, we have an input function f : V(K) → R defined at the vertices V(K)
of a complex K whose linear extension leads to a PL function still denoted by f : |K| → R. For
computing persistence, we use the lower-star filtration F f of f and its simplex-wise version as
described in Section 3.1.2.
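For illustration, a simplex-wise lower-star order on the 1-skeleton can be generated as below. This is a simplified Python sketch: each simplex receives the maximum f-value over its vertices, vertices precede edges at equal values, and the finer tie-breaking conventions of Section 3.1.2 are glossed over.

```python
# Produce a simplex-wise order compatible with the lower-star filtration:
# every simplex appears after all of its vertices.  (Simplified sketch;
# real lower-star filtrations break remaining ties consistently.)

def lower_star_order(vertices, edges, f):
    simplices = [('v', v) for v in vertices] + [('e', e) for e in edges]
    def key(s):
        if s[0] == 'v':
            return (f[s[1]], 0)          # vertices first at equal value
        u, w = s[1]
        return (max(f[u], f[w]), 1)      # an edge enters with its max vertex
    return sorted(simplices, key=key)
```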
10.4.1 Algorithm
Intuitively, we wish to use “mountain ridges” of the density field to approximate the hidden graph
as Figure 10.6 shows. We compute these ridges as the 1-stable manifolds (“valley ridges”) of
f = −ρ, the negation of the density function. In the discrete setting, these become 1-unstable
manifolds consisting of vertex-edge gradient paths in an appropriate DMVF. We compute this
DMVF by cancelling vertex-edge persistence pairs whose persistence is at most a threshold δ.
The rationale behind this choice is that the small undulations in a 1-unstable manifold caused by
noise and discretization need to be ignored by cancellation. The procedure PartialPersDMVF
described earlier in Section 10.2.2 achieves this goal. Finally, the union of the 1-unstable mani-
folds of all remaining high-persistence critical edges is taken as the output graph Ĝ, as outlined
in Algorithm 21:CollectG. Algorithm 20:MorseRecon presents these steps.
Algorithm 20 MorseRecon(K, ρ, δ)
Input:
A 2-complex K, a vertex function ρ on K, a threshold δ
Output:
A graph
1: Let F be a simplex-wise lower star filtration of K w.r.t. f = −ρ.
2: Compute persistence Pers (e) for every edge e for the filtration F.
3: Let K 1 be the 1-skeleton of K and F 1 be F restricted to vertices and edges only
4: Let T be the forest computed by PartialPersDMVF(K 1 ,F 1 ,δ)
5: CollectG(K 1 ,T, Pers (·), δ)
Theorem 10.9. The time complexity of the algorithm MorseRecon is O(Pert(K)), where Pert(K)
is the time to compute persistence pairings for K.
We remark that, for K with n vertices and edges, collecting all 1-unstable manifolds takes
O(n) time if one avoids revisiting edges while tracing paths. This O(n) term is subsumed by
Pert(K) because there are at least n/2 such pairs.
Consider the DMVF Vδ computed by PartialPersDMVF. Notice that, Proposition 10.8(i)
implies that for each T i , any V-path of Vδ starting at a vertex or an edge in T i terminates at its
root ri . See Figure 10.3 for an example. Hence for any vertex v ∈ T i , the path π(v) computed by
CollectG is the unique V-path starting at v. This immediately leads to the following result:
Corollary 10.10. For each critical edge e = (u, v) with Pers (e) ≥ δ, π(u) ∪ π(v) ∪ {e} computed
by the algorithm CollectG is the 1-unstable manifold of e in Vδ .
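Assuming each tree is stored via parent pointers (an illustrative representation, not the book's), the path collection in CollectG can be sketched as:

```python
# For every critical edge with persistence above delta, trace the unique
# V-path from each endpoint up to the root of its tree and take the union
# of all simplices encountered as the output graph.  (Naive sketch; the
# O(n) version would avoid re-traversing already collected edges.)

def collect_g(critical_edges, pers, parent, delta):
    out_vertices, out_edges = set(), set()

    def trace(v):                       # follow the V-path v -> root
        while v is not None:
            out_vertices.add(v)
            if parent[v] is not None:
                out_edges.add(tuple(sorted((v, parent[v]))))
            v = parent[v]

    for e in critical_edges:
        if pers[e] > delta:
            trace(e[0])
            trace(e[1])
            out_edges.add(tuple(sorted(e)))   # the critical edge itself
    return out_vertices, out_edges
```

Each returned edge set π(u) ∪ π(v) ∪ {e} is the 1-unstable manifold of the critical edge e, as in the corollary.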
We remark that the noise model is rather limited – in particular, it does not allow significantly
non-uniform density distributions. However, this is the only case for which theoretical guarantees
are known at the moment for a discrete Morse based reconstruction framework. In practice, the
algorithm has often been applied to non-uniform density distributions.
Input assumption. Let ρ be an input density field which is a (β, ν, ω)-approximation of a con-
nected graph G, and δ ∈ [ν, β − ν).
Under the above input assumption, let Ĝ be the output of algorithm MorseRecon(K, ρ, δ). The
proof of the following result can be found in [138].
Proposition 10.11. (i) There is a single critical vertex left after MorseRecon returns, which is in Gω .
(ii) Every critical edge considered by CollectG forms a persistence pair with a triangle.
Theorem 10.12. Under the input assumption, the output graph satisfies Ĝ ⊆ Gω .
Proof. Recall that the output graph Ĝ consists of the union of 1-unstable manifolds of all the
edges e∗1 , . . . , e∗g with persistence larger than δ – by Propositions 10.11 (ii) and (iii), they are all
positive (paired with triangles), and contained inside Gω . Below we show that other simplices in
their 1-unstable manifolds are also contained in Gω .
Take any i ∈ [1, g] and consider e∗i = (u, v). Without loss of generality, consider the critical
V-path π : e∗i ≺ (u = u1 ) ≺ e1 ≺ u2 ≺ . . . ≺ es ≺ us+1 . By definition us+1 is a critical vertex and is
necessarily the global minimum v0 for the density field ρ, which is also contained inside Gω . We
now argue that all simplices in the path π lie inside Gω . In fact, we argue a stronger statement:
first, we say that a gradient vector (v, e) is crossing if v ∈ Gω and e ∉ Gω (i.e., e ∈ cl(Gω )). Since
v is an endpoint of e, this means that the other endpoint of e must lie in K \ Gω .
Claim 10.2. During the cancellation with threshold δ in the algorithm MorseRecon, no crossing
gradient vector is ever produced.
Proof. Suppose the claim is not true. Then, let (v, e) be the first crossing gradient vector ever
produced during the cancellation process. Since we start with a trivial discrete gradient vector
field, the creation of (v, e) can only be caused by the reversal of some gradient path π0 connecting
two critical simplices v0 and e0 while we are performing cancellation for the persistence pair
(v0 , e0 ). Obviously, Pers (e0 ) ≤ δ because otherwise cancellation would not have been performed.
On the other hand, due to our (β, ν, ω)-noise model and the choice of δ, it must be that either both
v0 , e0 ∈ Gω or both v0 , e0 ∈ K \ Gω – as otherwise, the persistence of this pair will be larger than
β − ν > δ.
Now consider the V-path π0 connecting e0 and v0 in the current discrete gradient vector field
V 0 . The path π0 begins and ends with simplices that are either both in Gω or both outside
Gω ; yet, since reversing π0 created the crossing vector (v, e), the path has simplices both inside
and outside Gω . It follows that the path π0 contains a gradient vector (v00 , e00 ) going in the
opposite direction crossing inside/outside, that is, v00 ∈ Gω and e00 ∉ Gω . In other words, it must
contain a crossing gradient vector. This however contradicts our assumption that (v, e) is the first
crossing gradient vector. Hence, the assumption is wrong and no crossing gradient vector can
ever be created.
As there is no crossing gradient vector during and after cancellation, it follows that π, which is
one piece of the 1-unstable manifold of the critical edge e∗i , has to be contained inside Gω . The
same argument works for the other piece of 1-unstable manifold of e∗i which starts from the other
endpoint of e∗i . Since this holds for any i ∈ [1, g], the theorem follows.
The previous theorem shows that Ĝ is geometrically close to G. Next we show that they are
also close in topology.
Proof. First we show that Ĝ is connected. Then, we show that Ĝ has the same first Betti number
as that of G which implies the claim as any two connected graphs in Rk with the same first
Betti number are homotopy equivalent. Suppose that Ĝ has at least two components. These two
components should come from two trees in the forest computed by PartialPersDMVF. The roots,
say r and r0 , of these two trees must reside in Gω due to Claim 10.2 and Proposition 10.11(iii).
Furthermore, the supporting complex of Gω is connected because it contains the connected graph
G. It follows that there is a path connecting r and r0 within Gω . All vertices and edges in Gω
appear earlier than the other vertices and edges in the filtration that PartialPersDMVF works on.
These two facts mean that the first edge which connects the two trees rooted at r and r0 resides
in Gω . This edge has persistence less than δ and should be included in the reconstruction
by MorseRecon. It follows that CollectG returns 1-unstable manifolds of edges ending at a
common root of the tree containing both r and r0 . In other words, Ĝ cannot have two components
as assumed.
The underlying space of ω-neighborhood Gω of G deformation retracts to G by definition.
Observe that, by our noise model, Gω is a sublevel set in the filtration that determines the per-
sistence pairs. This sublevel set being homotopy equivalent to G must contain exactly g positive
edges where g is the first Betti number of G. Each of these positive edges pairs with a triangle in
Gω . Therefore, Pers (e) > δ for each of the g positive edges in Gω . By our earlier results, these
are exactly the edges that will be considered by procedure CollectG. Our algorithm constructs Ĝ
by adding these g positive edges to the spanning tree, each of which adds a new cycle. Thus, Ĝ
has first Betti number g as well, proving the proposition.
We have already proved that Ĝ is contained in Gω . This fact along with Proposition 10.13 can
be used to argue that any deformation retraction taking (underlying space) Gω to G also takes Ĝ
to a subset G0 ⊆ G where G0 and G have the same first Betti number. In what follows, we use Gω
to denote also its underlying space.
Proof. The fact that H|Ĝ (·, ℓ) is continuous for any ℓ ∈ [0, 1] is obvious from the continuity of H.
The only thing that needs to be shown is that G0 := H|Ĝ (Ĝ, 1) has the same first Betti number as
that of G. We observe that a cycle in Ĝ created by a positive edge e, along with the paths to the
root of the spanning tree, is also non-trivial in Gω because this cycle is created by adding the
edge e during the persistence filtration and is not destroyed in Gω . Therefore, a cycle
basis for H1 (Ĝ) is also a homology basis for H1 (Gω ). Since the map H(·, 1) : Gω → G is a homo-
topy equivalence, it induces an isomorphism in the respective homology groups; in particular, a
basis in H1 (Gω ) is mapped bijectively to a basis in H1 (G). Therefore, the image G0 = H|Ĝ (Ĝ, 1)
must have a basis of cardinality g = β1 (Ĝ) = β1 (Gω ) = β1 (G) proving that β1 (G0 ) = β1 (G).
10.5 Applications
10.5.1 Road network
Robust and efficient automatic road network reconstruction from GPS traces and satellite images
is an important task in GIS data analysis and applications. The Morse-based approach can help
reconstruct the road network in both cases in a conceptually simple and clean manner. The
framework provides a meaningful and robust way to remove noise because it is based on the
concept of persistent homology. Intuitively, reconstruction of a road network from noisy data
is tantamount to reconstructing a graph from a noisy function on a 2D domain. One needs to
eliminate noise and at the same time preserve the signal. Persistent homology and discrete Morse
theory help address both of these aspects. We can simply use the graph reconstruction algorithm
detailed in the previous section for this road network recovery.
GPS trajectories. Here the input is a set of GPS traces, and the goal is to reconstruct the underlying
road network automatically from these traces. The input set of GPS traces can be converted
into a density map ρ : Ω → R defined on the planar domain Ω = [0, 1] × [0, 1]. We then use
our graph reconstruction algorithm MorseRecon to recover the “mountain ridges" of the density
field; see Figure 10.6.
In Figure 10.7, we show reconstructed road network after improving the discrete-Morse based
output graphs with an editing strategy [137]. After the automatic reconstruction, the user can
observe the missing branches and can recover them by artificially making a vertex near the tip of
each such branch a minimum. This forces a 1-unstable manifold from a saddle edge to each of
these minima. Similarly, if a distinct loop in the network is missing, the user can artificially make
a triangle in the center of the loop a maximum which forces the loop to be detected.
Figure 10.6: Road network reconstruction [295]: (left) input GPS traces; (right) terrain corre-
sponding to the graph of the density function computed from input GPS traces; black lines are
the output of algorithm MorseRecon, which captures the ’mountain ridges’ of the terrain, corre-
sponding to the reconstructed road-network. The upper right is a top view of the terrain.
Figure 10.7: Road network reconstruction with editing [137]: (left) red points (minima) are added,
red branches are newly reconstructed for the Athens map (black curves are original reconstruction,
blue curves are input GPS traces); (middle) we also add blue triangles as maxima to capture many
missing loops; (right) upper: an example to show that adding extra triangles as maxima will
capture more loops, bottom: Berlin with adding both branches and loops.
Satellite images. In this case, we combine the Morse based graph reconstruction with a neural
network framework to recover the road network from input satellite images. First, we feed the
gray scale values of the input satellite image as a density function to MorseRecon. The output
graphs from a set of images are used to train a convolutional neural network (CNN), which outputs
an image aiming to capture only the foreground (roads) in the satellite images. After training this
CNN, we feed the original satellite images to it to obtain a set of hopefully “cleaner" images.
These cleaned images are again fed to MorseRecon to output a graph which can again be used
to further train the CNN. Repeated use of this reconstruct-and-train step cleans the noise consid-
erably. In Figure 2 (f) from the Prelude chapter, we show an example of the output of this strategy.
Notice that this strategy eliminates the need for curating the satellite images manually for creating
training samples.
The discrete Morse based graph reconstruction algorithms have been applied to both fronts.
Neuron cells have tree morphology and can commonly be modeled as a rooted tree, where the root
of the tree is located in the soma (cell body) of the neuron. In Figure 10.8, we show the reconstructed
neuron morphology by applying the discrete Morse algorithm directly to an Olfactory Projection
Fibers data set (specifically, OP-2 data set) from the DIADEM challenge [259]. Specifically, the
input is an image stack acquired by 2-channel confocal microscopy method. In the approach
proposed in [294], after some preprocessing, the discrete Morse based algorithm is applied to
the 3D volumetric data to construct a graph skeleton. A tree-extraction based algorithm is then
applied to extract a tree structure from the graph output.
The discrete Morse based graph reconstruction algorithm can also be used in a more sophis-
ticated manner to handle more challenging data. Indeed, a new neural network framework is
proposed in [16] to combine the reconstructed Morse graph as topological prior with a UNet
[269] like neural network architecture for cell process segmentation from various neuroanatom-
ical image data. Intuitively, while UNet has been quite successful in image segmentation, such
approaches lack a global view (e.g., connectivity) of the structure behind the segmented signal.
Consequentially, the output can contain broken pieces for noisy images, and features such as
junction nodes in the input signal can be particularly challenging to recover. On the other hand, while
the DM-based graph reconstruction algorithm is particularly effective in capturing global graph
structures, it may produce many false positives. The framework proposed in [16], called DM++,
uses the output from discrete Morse as a separate channel of input, and co-trains it together with
the output of a specific UNet-like architecture called ALBU [61] so that these two inputs
complement each other. See Figure 10.9. In particular, the UNet output helps to remove false
positives from the discrete Morse output, while the Morse graph output helps to obtain better
connectivity.

Figure 10.9: The DM++ framework proposed in [16], which combines the DM output with a
standard neural-network based output via a Siamese neural network stack so as to use
these two inputs to augment each other and obtain a better connected final segmentation; image
courtesy of Samik Banerjee et al. (2020, fig. 2b).
proposed using the theory for detecting filamentary structures in data for cosmic webs. These
works proposed performing cancellations as long as they are permitted, acknowledging the fact that
not all cancellations in a 2- or 3-complex may be possible. Wang et al. proposed using discrete
Morse complexes to compute unstable 1-manifolds as an output for a road network from GPS
data [295]. Using unstable 1-manifolds in a discrete Morse complex defined on a triangulation
in R2 to capture the hidden road network was proposed in this paper. Ultimately, this proposed
approach was implemented with a simplified algorithm and a proof of guarantee in [138]. The
material in Section 10.4 is taken from this paper. The application to road network reconstruction
from GPS trajectories and satellite images in Section 10.5 appeared in [137] and [139] respec-
tively. The application to neuron imaging data is taken from [16, 294].
Exercises
1. A Hasse diagram of a simplicial complex K is a directed graph that has a vertex vσ for
every simplex σ in K and a directed edge from vσ to vσ0 if and only if σ0 is a cofacet of
σ. Let M be a matching in K. Modify the Hasse diagram by reversing every edge that is
directed from vσ to vσ0 for which (σ0 , σ) is in the matching M. Show that M induces a DMVF if
and only if the modified Hasse diagram does not have any directed cycle.
3. Call a V-path extendible if it can be extended by a simplex at either of its two ends.
5. Let K be a simplicial Möbius strip with all its vertices on the boundary. Design a DMVF on
K so that there is only one critical edge and only one critical vertex and no critical triangle.
6. Prove that two V-paths that meet must have a common suffix.
(a) The strong Morse inequality implies the weak Morse inequality in Proposition 10.1.
(b) A matching which is not Morse may not satisfy Morse inequalities as in Proposi-
tion 10.1 but always satisfies the equality c p − c p−1 + · · · ± c0 = β p − β p−1 + · · · ± β0
for a p-dimensional complex K.
Prove that this simple modification produces the same DMVF as the PartialPersDMVF
described in the text.
11. Let K be a simplicial d-complex that has every (d − 1)-simplex incident to at most two d-
simplices. Extend Theorem 10.5 to prove that all persistent pairs between (d − 1)-simplices
and d-simplices arising from a filtration of K can be cancelled.
12. Prove ∂ p−1 ◦ ∂ p = 0 for the boundary operator defined for chain groups of critical cells as
described for discrete Morse chain complex in the notes above.
Chapter 11
In previous chapters, we have considered filtrations that are parameterized by a single parameter
such as Z or R. Naturally, they give rise to a 1-parameter persistence module. In this chapter, we
generalize the concept and consider persistence modules that are parameterized by one or more
parameters such as Zd or Rd . They are called multiparameter persistence modules in general. Mul-
tiparameter persistence modules naturally arise from filtrations that are parameterized by multiple
values such as the one shown in Figure 11.1 over two parameters.
Figure 11.1: A bi-filtration parameterized over curvature and radius, reprinted by permission
from Springer Nature: Springer Nature, Discrete & Computational Geometry, “The Theory of
Multidimensional Persistence”, Gunnar Carlsson et al. [65], © 2009.
The classical algorithm of Edelsbrunner et al. [152] presented in Chapter 3 provides a unique
decomposition of the 1-parameter persistence module over Z implicitly generated by an input
simplicial filtration. Similarly, a multiparameter persistence module M over the grid Zd can be
implicitly given by an input multiparameter finite simplicial filtration and we look for computing a
decomposition (Definition 11.10) M ≅ ⊕i M i . The modules M i are the counterparts of bars in the
1-parameter case and are called indecomposables. These indecomposables are more complicated
and cannot be completely characterized as in the one-parameter case. Nonetheless, for finitely
generated persistence modules defined over Zd , their existence is guaranteed by the Krull-Schmidt
theorem [10]. Figure 11.2 illustrates indecomposables of some modules.
Figure 11.2: Decomposition of a finitely generated 2-parameter persistence module: (left) rect-
angle decomposable module: each indecomposable is supported by either bounded (A) or un-
bounded rectangle (B and C), D is a free module; (right) interval decomposable module: each
indecomposable is supported over a 2D interval (defined in next chapter).
An algorithm for decomposing a multiparameter persistence module can be derived from the
so-called Meataxe algorithm which applies to much more general modules than we consider in
TDA at the expense of high computational cost. Sacrificing this generality and still encompassing
a large class of modules that appear in TDA, we can design a much more efficient algorithm.
Specifically, we present an algorithm that can decompose a finitely presented module with a time
complexity that is much better than the Meataxe algorithm though we lose the generality as the
module needs to be distinctly graded as explained later.
For measuring algorithmic efficiency, it is imperative to specify how the input module is pre-
sented. Assuming an index set of size s and vector spaces of dimension O(m), a 1-parameter
persistence module can be presented by a set of matrices of dimensions O(m) × O(m) each rep-
resenting a linear map Mi → Mi+1 between two consecutive vector spaces Mi and Mi+1 . This
input format is costly as it takes O(sm2 ) space (O(m2 )-size matrix for each index) and also does
not appear to offer any benefit in time complexity for computing the bars. An alternative pre-
sentation is obtained by considering the persistence module as a graded module over a polyno-
mial ring k[t] and presenting it with the so-called generators {gi } of the module and relations
{ Σi αi gi = 0 | αi ∈ k[t] } among them. A presentation matrix encoding the relations in terms
of the generators characterizes the module completely. Then, a matrix reduction algorithm akin
to the persistence algorithm MatPersistence from Chapter 3 provides the desired decomposition¹.
Figure 11.3 illustrates the advantage of this presentation over the other costly presentation.

¹The persistence algorithm takes a filtration as input whereas here a module is presented with input matrices.

In practice, when the 1-parameter persistence module is given by an implicit simplicial filtration,
one can apply the matrix reduction algorithm directly on a boundary matrix rather than first
computing a presentation matrix from it and then decomposing it. If there are O(n) simplices
constituting the filtration, the algorithm runs in O(n3 ) time with simple matrix reductions and
in O(nω ) time with more sophisticated matrix multiplication techniques where ω < 2.373 is the
exponent for matrix multiplication.
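For reference, the core of such a matrix reduction over Z2 can be sketched in a few lines of Python (columns stored as sets of row indices; the function name and encoding are ours, not the book's MatPersistence):

```python
# Standard column reduction over Z2: repeatedly add an earlier column with
# the same lowest nonzero row index ("low") until lows become unique; each
# surviving low gives a (birth, death) persistence pair.

def reduce_columns(columns):
    columns = [set(c) for c in columns]
    low_to_col = {}                    # lowest row index -> owning column
    pairs = []
    for j, col in enumerate(columns):
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]   # Z2 column addition
        if col:
            low_to_col[max(col)] = j
            pairs.append((max(col), j))  # simplex max(col) paired with j
    return pairs
```

On the boundary matrix of a filtered triangle (vertices a, b, c, then edges ab, bc, ca, then the triangle), this yields the pairs (b, ab), (c, bc), (ca, abc), with ca left as an unpaired positive column until the triangle kills its cycle.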
Figure 11.3: Costly presentation (top) vs. graded presentation (bottom, right). The top chain can be summarized by three generators $g_1, g_2, g_3$ at grades $(0), (1), (2)$ respectively, and two relations $0 = t^2 g_1 + t g_2$, $0 = t^2 g_2 + t g_3$ at grades $(2), (3)$ respectively (Definition 11.5). The grades of the generators and relations are given by the first times they appear in the chain. Finally, this information can be summarized succinctly by the presentation matrix on the right.
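The grades in this caption determine the matrix entries directly: if generator $g_i$ at grade $a_i$ participates in a relation at grade $b_j$, the corresponding entry is the monomial $t^{b_j - a_i}$. A small sketch of this bookkeeping (our own encoding, storing each monomial by its exponent):

```python
def presentation_matrix(gen_grades, relations):
    """gen_grades: grade of each generator. relations: list of
    (grade, set of generator indices in the relation). Entry (i, j)
    is the exponent e of the monomial t^e, or None if generator i
    does not appear in relation j."""
    return [[rel_grade - g if i in gens else None
             for rel_grade, gens in relations]
            for i, g in enumerate(gen_grades)]

# Figure 11.3's data: g1, g2, g3 at grades 0, 1, 2; relations
# 0 = t^2 g1 + t g2 at grade 2 and 0 = t^2 g2 + t g3 at grade 3.
M = presentation_matrix([0, 1, 2], [(2, {0, 1}), (3, {1, 2})])
```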
The Meataxe algorithm for multiparameter persistence modules follows the costly approach
analogous to the one in the 1-parameter case that expects the presentation of each individual linear
map explicitly. In particular, it expects the input d-parameter module M over a finite subset of
Zd to be given as a large matrix in kD×D with entries in a fixed field k = Zq , where D is the sum
of dimensions of vector spaces over all points in Zd supporting M. The time complexity of the
Meataxe algorithm is $O(D^6 \log q)$ [196]. In general, $D$ might be quite large. It is not clear what the most efficient way is to transform an input that specifies generators and relations (or a simplicial filtration) into the representation matrix required by the Meataxe algorithm. A naive approach is to consider the minimal sub-grid in $\mathbb{Z}^d$ that supports the non-trivial maps. In the worst case, with $N$ being the total number of generators and relations, one has to consider $O(N^d)$ grid points in $\mathbb{Z}^d$, each with a vector space of dimension $O(N)$. Therefore, $D = O(N^{d+1})$, giving a worst-case time complexity of $O(N^{6(d+1)} \log q)$. Even allowing approximation, the algorithm runs in $O(N^{3(d+1)} \log q)$ time [197].
In this chapter, we take the alternate approach where the module is treated as a finitely presented graded module over the multivariate polynomial ring $R = k[t_1, \cdots, t_d]$ [108] and presented with a set of generators and relations graded appropriately. Given a presentation matrix encoding relations in terms of generators, our algorithm computes a diagonalization of the matrix, giving a presentation of each indecomposable into which the input module decomposes. Compared to the 1-parameter case, we have to cross two main barriers for computing the indecomposables. First, we need to allow row operations along with column operations for reducing the input matrix. In the 1-parameter case, row operations become redundant because column operations already produce the bars. Second, unlike in the 1-parameter case, we cannot allow all left-to-right column or bottom-to-top row operations for the matrix reduction because the parameter space $\mathbb{Z}^d$, $d > 1$, unlike $\mathbb{Z}$, induces only a partial order on these operations. These two difficulties are overcome by an incremental approach combined with a linearization trick. Given a presentation matrix with a total of $O(N)$ generators and relations that are graded distinctly, the algorithm runs in $O(N^{2\omega+1})$ time.
Definition 11.1 (Polynomial ring). Given a variable $t$ and a field $k$, the set of polynomials given by
$$k[t] = \{a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n \mid n \geq 0,\ a_i \in k\}$$
forms a ring with the usual polynomial addition and multiplication operations. The definition can be extended to multivariate polynomials
$$k[\mathbf{t}] = k[t_1, \cdots, t_k] = \Big\{\sum_{i_1,\cdots,i_k} a_{i_1,\cdots,i_k}\, t_1^{i_1} \cdots t_k^{i_k} \;\Big|\; i_1, \cdots, i_k \geq 0,\ a_{i_1,\cdots,i_k} \in k\Big\}.$$
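Programmatically, such a multivariate polynomial can be represented as a map from exponent tuples to coefficients, with multiplication adding exponents; a small sketch (our own illustration, not from the text):

```python
from collections import defaultdict

def poly_mul(p, q):
    """Multiply two elements of k[t1, ..., tk], each stored as a
    dict {(i1, ..., ik): coefficient}; exponents add, coefficients
    multiply, and like terms are collected."""
    out = defaultdict(int)
    for e1, a1 in p.items():
        for e2, a2 in q.items():
            out[tuple(x + y for x, y in zip(e1, e2))] += a1 * a2
    return {e: a for e, a in out.items() if a != 0}

# (1 + t1)(1 + t2) = 1 + t1 + t2 + t1*t2 in k[t1, t2]
p = {(0, 0): 1, (1, 0): 1}
q = {(0, 0): 1, (0, 1): 1}
```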
Figure 11.4: A graded 2-parameter module. All sub-diagrams of maps and compositions of maps
are commutative.
module equivalently as a collection of vector spaces $\{M_u\}_{u \in \mathbb{Z}^d}$ together with a collection of linear maps $\{t_i\bullet : M_u \to M_{u+e_i},\ \forall i, \forall u\}$ where the commutative property $(t_j\bullet) \circ (t_i\bullet) = (t_i\bullet) \circ (t_j\bullet)$ holds. The
commutative diagram in Figure 11.4 shows a graded module for d = 2, also called a bigraded
module.
Definition 11.3 (Graded module R). There is a special graded module $M$ where $M_u$ is the $k$-vector space generated by $\mathbf{t}^u = t_1^{u_1} t_2^{u_2} \cdots t_d^{u_d}$ and the ring action is given by the ring $R$. We denote it by R, not to be confused with the ring $R$ which is used to define it.
Before we introduce persistence modules as instances of graded modules, we extend the notion of simplicial filtration to the multiparameter framework.
Figure 11.5 shows an example of a 2-parameter filtration and various graded modules associated with it. The module formed by the homology groups, at the bottom right, is a persistence module. The figure also shows other graded modules formed by the chain groups (left) and boundary groups (middle).
Figure 11.5: (top) An example of a 2-parameter simplicial filtration. Each square box indicates the current (filtered) simplicial complex at the bottom-left grid point of the box. (bottom) We show different modules considering different abelian groups arising out of the complexes, with the ring actions on the arrows (see Section 11.5 for details): (a) the module of 0-th chain groups $C_0$, with maps $A = \mathrm{diag}(t_1, t_1, t_1)$ and $B = \mathrm{diag}(t_2, t_2, t_2)$; (b) the module of 0-th boundary groups $B_0$, with maps $C = \mathrm{diag}(t_1, t_1)$ and $D = \mathrm{diag}(t_2, t_2)$; (c) the module of 0-th homology groups $H_0$; it has one connected component in the 0-th homology groups at all grades except $(0, 0)$ and $(1, 1)$, and two connected components at grade $(1, 1)$.
In this exposition, we assume that all modules are finitely generated. Such modules always
admit a minimal generating set. In our example in Figure 11.5, the vertex set {vb , vr , vg } is a
minimal generating set for the module of zero-dimensional homology groups.
Definition 11.6 (Morphism). A graded module morphism, called a morphism in short, between two graded modules $M$ and $N$ is defined as an $R$-linear map $f : M \to N$ preserving grades: $f(M_u) \subseteq N_u$, $\forall u \in \mathbb{Z}^d$. Equivalently, it can also be described as a collection of linear maps $\{f_u : M_u \to N_u\}_{u \in \mathbb{Z}^d}$ that commute with the linear maps $t_i\bullet$ of $M$ and $N$.

Two graded modules $M, N$ are isomorphic if there exist two morphisms $f : M \to N$ and $g : N \to M$ such that $g \circ f$ and $f \circ g$ are identity maps. We denote it as $M \cong N$.
Definition 11.7 (Shifted module). For a graded module $M$ and some $u \in \mathbb{Z}^d$, define a shifted graded module $M^{\to u}$ by setting $(M^{\to u})_v = M_{v-u}$ for each $v$.
Definition 11.8 (Free module). We say a graded module is free if it is isomorphic to the direct sum of a collection of $R^j$, denoted as $\bigoplus_j R^j$, where each $R^j = R^{\to u_j}$ for some $u_j \in \mathbb{Z}^d$. Here R is the graded module from Definition 11.3.
Essentially, like isomorphic modules, two isomorphic morphisms can be considered the same. For two morphisms $f_1 : M^1 \to N^1$ and $f_2 : M^2 \to N^2$, there exists a canonical morphism $g : M^1 \oplus M^2 \to N^1 \oplus N^2$, $g(m_1, m_2) = (f_1(m_1), f_2(m_2))$, which is essentially uniquely determined by $f_1$ and $f_2$ and is denoted as $f_1 \oplus f_2$. A module is trivial if it has only the element $0$ at every grade. We denote a trivial morphism by $0 : 0 \to 0$. Analogous to the decomposition of a module, we can also define a decomposition of a morphism.
Definition 11.12 (Morphism decomposition). A morphism $f$ is indecomposable if $f \cong f_1 \oplus f_2 \implies f_1$ or $f_2$ is the trivial morphism $0 : 0 \to 0$. We call $f \cong \bigoplus f_i$ a decomposition of $f$. If each $f_i$ is indecomposable, we call it a total decomposition of $f$.
Like decompositions of modules, the total decomposition of a morphism is also essentially unique.
In Figure 11.6, we illustrate the presentation matrix of the persistence module H0 consisting
of zero dimensional homology groups induced by the filtration shown in Figure 11.5. We will
see later that, in this case, f equals the boundary morphism ∂1 : C1 → C0 whose columns are
edges and rows are vertices. For example, the red edge er whose grade is (1, 1) has two boundary
vertices vb , the blue vertex with grade (0, 1) and vr , the red vertex with grade (1, 0). To bring vb
to grade (1, 1), we need to multiply by the polynomial t1 . Similarly, to bring vr to grade (1, 1), we
need to multiply by t2 . The corresponding entries in the column of er are t1 and t2 respectively
indicated by shaded boxes. Actual matrices are shown later in Example 11.1.
An important property of a graded module H is that a decomposition of its presentation f
corresponds to a decomposition of $H$ itself. The decomposition of $f$ can be computed by diagonalizing its presentation matrix $[f]$. Informally, a diagonalization of a matrix $A$ is an equivalent matrix $A'$ in the following form (see the formal Definition 11.15 later):
$$A' = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & A_k \end{pmatrix}$$
All nonzero entries are in the $A_i$'s and we write $A' \cong \bigoplus A_i$. It is not hard to see that for a map $f \cong \bigoplus f_i$, there is a corresponding diagonalization $[f] \cong \bigoplus [f_i]$. With these definitions, we have the following theorem that motivates the decomposition algorithm (see proof in [142]).
Figure 11.6: The presentation matrix of the module H0 consisting of zero dimensional homology
groups for the example in Figure 11.5. The boxes in the matrix containing non-zero entries are
shaded.
Theorem 11.1. There are 1-1 correspondences between the following three structures arising from a minimal presentation map $f : F^1 \to F^0$ of a graded module $H$, and its presentation matrix $[f]$:

1. A decomposition of the graded module $H \cong \bigoplus H^i$;

2. A decomposition of the presentation map $f \cong \bigoplus f_i$;

3. A diagonalization of the presentation matrix $[f] \cong \bigoplus [f]_i$.
Remark 11.2. From Theorem 11.1, we can see that there exists an essentially unique total decomposition of a presentation map and an essentially unique total diagonalization of the presentation matrix of $H$, which correspond to an essentially unique total decomposition of $H$ (up to permutation, isomorphism, and trivial summands). In practice, we might be given a presentation which is
not necessarily minimal. One way to handle this case is to compute the minimal presentation of
the given presentation first. For 2-parameter modules, this can be done by an algorithm presented
in [222]. The other choice is to compute the decomposition of the given presentation directly,
which is sufficient to get the decomposition of the module thanks to the following proposition.
Proposition 11.2. Let f be any presentation (not necessarily minimal) of a graded module H.
The following statements hold:
1. for any decomposition of $H \cong \bigoplus H^i$, there exists a decomposition $f \cong \bigoplus f_i$ such that $\operatorname{coker} f_i = H^i$, $\forall i$;
Remark 11.3. By Remark 11.1, any presentation $f$ can be written as $f \cong f^* \oplus f'$ with $f^*$ being the minimal presentation and $\operatorname{coker} f' = 0$. Furthermore, $f'$ can be written as $f' \cong g \oplus h$ where $g$ is an identity map and $h$ is a zero map. The corresponding matrix form is $[f] \cong [f^*] \oplus [g] \oplus [h]$ with $[g]$ being an identity submatrix and $[h]$ being an empty matrix representing a collection of zero columns. Therefore, one can easily read these trivial parts from the result of matrix diagonalization: in block form, $[f]$ consists of the block $[f^*]$, an identity block for $g$, and zero columns for $h$.
An index block $B = [Row(B), Col(B)]$ consists of a set of row indices and a set of column indices with $Row(B) \subseteq Row(A)$, $Col(B) \subseteq Col(A)$. We say an index pair $(i, j)$ is in $B$ if $i \in Row(B)$ and $j \in Col(B)$, denoted as $(i, j) \in B$. We denote the block of $A$ on $B$ as the matrix restricted to the index block $B$, i.e. $[A_{i,j}]_{(i,j) \in B}$, denoted as $A|_B$. We call $B$ the index of the block $A|_B$. We abuse the notations $Row(A|_B) := Row(B)$ and $Col(A|_B) := Col(B)$. For example, the $i$th row $r_i = A_{i,*} = A|_{[\{i\},Col(A)]}$ and the $j$th column $c_j = A_{*,j} = A|_{[Row(A),\{j\}]}$ are blocks with indices $[\{i\}, Col(A)]$ and $[Row(A), \{j\}]$ respectively. Specifically, $[\emptyset, \{j\}]$ represents an index block of a single column $j$ and $[\{i\}, \emptyset]$ represents an index block of a single row $i$. We call $[\emptyset, \emptyset]$ the empty index block.
A matrix can have multiple equivalent forms representing the same morphism. We use $A \sim A'$ to denote the equivalence of matrices. One fact about equivalent matrices is that they can be obtained from one another by the row and column operations introduced later (Chapter 5 in [110]).
Note that each nonempty matrix $A$ has a trivial diagonalization with the set of index blocks being the singleton $\{(Row(A), Col(A))\}$. Guaranteed by the Krull–Schmidt theorem [10], all total diagonalizations are unique up to permutations of their rows and columns, and equivalent transformations within each block. The total diagonalization of $A$ is denoted generically as $A^*$. All total diagonalizations of $A$ have the same set of index blocks, unique up to permutations of rows and columns. See Figure 11.7 for an illustration of a diagonalized matrix.
Figure 11.7: (left) A nontrivial diagonalization where the locations of nonzero entries are patterned and the pattern for all such entries in the same block is the same. (right) The same matrix with columns and rows permuted to bring the entries of each block into adjacent locations; the three index blocks are: $((1, 4, 6), (1, 3))$, $((2, 3), (2, 4))$, and $((5), (5))$.
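A claimed diagonalization can be verified mechanically: every nonzero entry must land inside one of the index blocks. The sketch below uses the three index blocks of Figure 11.7 with a hypothetical $6 \times 5$ binary matrix of our own (the figure's actual entries are not reproduced in the text):

```python
def is_diagonalization(A, blocks):
    """A: binary matrix as a list of rows; blocks: list of
    (row_index_set, col_index_set) pairs, 1-indexed as in the
    figure. True iff every nonzero entry lies in some block."""
    return all(not entry or any(i in rs and j in cs for rs, cs in blocks)
               for i, row in enumerate(A, start=1)
               for j, entry in enumerate(row, start=1))

blocks = [({1, 4, 6}, {1, 3}), ({2, 3}, {2, 4}), ({5}, {5})]
A = [[1, 0, 1, 0, 0],     # hypothetical entries placed
     [0, 1, 0, 1, 0],     # only inside the three blocks
     [0, 1, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0]]
```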
11.3.1 Simplification
First we want to transform the diagonalization problem into an equivalent problem that involves matrices of a simpler form. The idea is to simplify the presentation matrix to have entries only in $k$, which is taken to be $\mathbb{Z}_2$. There is a correspondence between diagonalizations of the original presentation matrix and certain constrained diagonalizations of the corresponding transformed matrix.
We first make some observations about the homogeneous property of presentation maps and presentation matrices. Equivalent matrices actually represent isomorphic presentations $f' \cong f$ that admit the commutative diagram
$$\begin{array}{ccc} F^1 & \stackrel{f}{\longrightarrow} & F^0 \\ {\scriptstyle h_1}\big\downarrow & & \big\downarrow{\scriptstyle h_0} \\ F^1 & \stackrel{f'}{\longrightarrow} & F^0 \end{array}$$
where $h_1$ and $h_0$ are endomorphisms on $F^1$ and $F^0$ respectively. The endomorphisms are realized by basis changes between the corresponding presentation matrices $[f] \sim [f']$. Since all morphisms
between graded modules are required to be homogeneous (preserve grades) by definition, we can use homogeneous bases (all the basis elements chosen are homogeneous elements²) for $F^0$ and $F^1$ to represent matrices. Let $F^0 = \langle g_1, \cdots, g_\ell \rangle$ and $F^1 = \langle s_1, \cdots, s_m \rangle$ where $g_i$ and $s_i$ are homogeneous elements for every $i$. With this choice, we can consider only equivalent presentation matrices under homogeneous basis changes. Each entry $[f]_{i,j}$ is also homogeneous. That means, $[f]_{i,j} = t_1^{u_1} t_2^{u_2} \cdots t_d^{u_d}$ where $(u_1, u_2, \cdots, u_d) = gr(s_j) - gr(g_i)$. Writing $u = (u_1, u_2, \cdots, u_d)$ and $t^u = t_1^{u_1} t_2^{u_2} \cdots t_d^{u_d}$, we get $[f]_{i,j} = t^u$ where $u = gr(s_j) - gr(g_i)$ is called the grade of $[f]_{i,j}$. We call such a presentation matrix a homogeneous presentation matrix.
For example, given $F^0 = \langle g_1^{(1,1)}, g_2^{(2,2)} \rangle$, the basis change $g_2^{(2,2)} \leftarrow g_2^{(2,2)} + g_1^{(1,1)}$ is not homogeneous since $g_2^{(2,2)} + g_1^{(1,1)}$ is not a homogeneous element. However, $g_2^{(2,2)} \leftarrow g_2^{(2,2)} + t^{(1,1)} g_1^{(1,1)}$ is a homogeneous basis change with $gr(g_2^{(2,2)} + t^{(1,1)} g_1^{(1,1)}) = gr(g_2^{(2,2)}) = (2, 2)$, which results in a homogeneous presentation matrix. We record two observations:

1. each nonzero entry $[f]_{i,j}$ is homogeneous with grade $gr([f]_{i,j}) = gr(c_j) - gr(r_i)$, where $c_j$ and $r_i$ denote the $j$th column and $i$th row;
2. a nonzero entry $[f]_{i,j}$ can only be zeroed out by column operations from columns $c_k <_{gr} c_j$ or by row operations from rows $r_\ell >_{gr} r_i$.

²Recall that an element $m \in M$ is homogeneous with grade $gr(m) = u$ for some $u \in \mathbb{Z}^d$ if $m \in M_u$.
Observation (2) indicates which subset of column and row operations is sufficient to zero out the entry $[f]_{i,j}$. We restate the diagonalization problem as follows:

Given an $n \times m$ homogeneous presentation matrix $A = [f]$ consisting of entries in $k[t_1, \cdots, t_d]$ with grading on rows and columns, find a total diagonalization of $A$ under the following admissible row and column operations:

1. multiply a row or column by a nonzero element of $k$;

2. for columns with $gr(c_i) <_{gr} gr(c_j)$, add $t^{gr(c_j)-gr(c_i)} \cdot c_i$ to $c_j$; for rows with $gr(r_\ell) >_{gr} gr(r_k)$, add $t^{gr(r_\ell)-gr(r_k)} \cdot r_\ell$ to $r_k$.
The above operations realize all possible homogeneous basis changes. That means, any homogeneous presentation matrix can be realized by a combination of the above operations.
In fact, the values of nonzero entries in the matrix are redundant under the homogeneous property $gr(A_{i,j}) = gr(c_j) - gr(r_i)$ given by observation (1). So, we can further simplify the matrix by replacing all the nonzero entries with their $k$-coefficients. For example, we can replace $2 \cdot t^u$ with $2$. What really matters are the partial orders defined by the grading of rows and columns. With our assumption of $k = \mathbb{Z}_2$, all nonzero entries are replaced with $1$. Based on the above observations, we further simplify the diagonalization problem as follows.
Given a $k$-valued matrix $A$ with a partial order on its rows and columns, find a total diagonalization $A^* \sim A$ with the following admissible operations: (1) multiply a row or column by a nonzero element of $k$; (2) add column $c_i$ to column $c_j$ if $gr(c_i) <_{gr} gr(c_j)$, denoted $c_i \to c_j$, and add row $r_\ell$ to row $r_k$ if $gr(r_\ell) >_{gr} gr(r_k)$, denoted $r_\ell \to r_k$.

The assumption of $k = \mathbb{Z}_2$ allows us to ignore the first set of multiplication operations on the binary matrix obtained after transformation. We denote the sets of all admissible column and row operations as $Colop = \{(i, j) \mid c_i \to c_j \text{ is admissible}\}$ and $Rowop = \{(\ell, k) \mid r_\ell \to r_k \text{ is admissible}\}$.
Under the assumption that no two columns nor two rows have the same grades, $Colop$ and $Rowop$ are transitively closed.

Proposition 11.3. If $(i, j), (j, k) \in Colop$ (resp. $Rowop$), then $(i, k) \in Colop$ (resp. $Rowop$).
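With distinct grades, $Colop$ can be generated straight from the componentwise partial order, and the closure property of Proposition 11.3 can be checked; a sketch with hypothetical column grades (Rowop is analogous with the direction reversed):

```python
def leq(u, v):
    """Componentwise partial order on Z^d."""
    return all(a <= b for a, b in zip(u, v))

def colop(col_grades):
    """Admissible column operations (i, j): adding c_i to c_j is
    allowed when gr(c_i) <_gr gr(c_j) (grades assumed distinct)."""
    return {(i, j) for i, u in enumerate(col_grades)
                   for j, v in enumerate(col_grades)
                   if i != j and leq(u, v)}

ops = colop([(1, 1), (1, 2), (2, 2)])   # hypothetical grades
# (0, 1) and (1, 2) are admissible, hence (0, 2) is as well
```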
Given a solution of the diagonalization problem in the simplified form, one can reconstruct a solution of the original problem on the presentation matrix by reversing the above process of simplification. We will illustrate it by running the algorithm on the working example in Figure 11.5 at the end of this section. The matrix reduction we employ for diagonalization may be viewed as a generalized matrix reduction because the matrix is reduced under the constrained operations $Colop$ and $Rowop$, which might be a nontrivial subset of all basic operations.
Figure 11.8: The persistence module corresponding to the presentation matrix [∂1 ] shown in
Example 11.1. The generators are given by the three vertices with grades (0, 1), (1, 0), (1, 1) and
the relations are given by the edges with grades (1, 1), (1, 2), (2, 1).
Remark 11.4. There are two extreme but trivial cases: (i) there is no $<_{gr}$-comparable pair of rows or columns. In this case, $Colop = Rowop = \emptyset$ and the original matrix is a trivial solution. (ii) All pairs of rows and all pairs of columns are $<_{gr}$-comparable, or equivalently, both $Colop$ and $Rowop$ are totally ordered. In this case, one can apply the traditional matrix reduction algorithm to reduce the matrix to a diagonal matrix with all nonzero blocks being $1 \times 1$ blocks. This is also the case for a 1-parameter persistence module if one further applies row reduction after column reduction. Note that row reductions are not necessary for reading out persistence information because they essentially do not change the persistence information. However, in multiparameter cases, both column and row reductions are necessary to obtain a diagonalization from which the persistence information can be read. From this viewpoint, the algorithm we present can be thought of as a generalization of the traditional persistence algorithm.
Example 11.1. Consider our working example in Figure 11.5. One can see later in Section 11.5
(Case 1) that the presentation matrix of this example can be chosen to be the same as the matrix
of the boundary morphism $\partial_1 : C_1 \to C_0$. With fixed bases $C_0 = \langle v_b^{(0,1)}, v_r^{(1,0)}, v_g^{(1,1)} \rangle$ and $C_1 = \langle e_r^{(1,1)}, e_b^{(1,2)}, e_g^{(2,1)} \rangle$, this presentation matrix $[\partial_1]$ and the corresponding binary matrix $A$ can be written as follows (recall that superscripts indicate the grades):
$$[\partial_1] = \begin{array}{c|ccc} & e_r^{(1,1)} & e_b^{(1,2)} & e_g^{(2,1)} \\ \hline v_b^{(0,1)} & t^{(1,0)} & t^{(1,1)} & 0 \\ v_r^{(1,0)} & t^{(0,1)} & 0 & t^{(1,1)} \\ v_g^{(1,1)} & 0 & t^{(0,1)} & t^{(1,0)} \end{array} \;\longrightarrow\; A = \begin{array}{c|ccc} & c_1^{(1,1)} & c_2^{(1,2)} & c_3^{(2,1)} \\ \hline r_1^{(0,1)} & 1 & 1 & 0 \\ r_2^{(1,0)} & 1 & 0 & 1 \\ r_3^{(1,1)} & 0 & 1 & 1 \end{array}$$
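The homogeneity of this matrix can be checked from the grades alone: every nonzero entry is $t^u$ with $u = gr(c_j) - gr(r_i)$ componentwise nonnegative. A sketch over the example's data (our own bookkeeping):

```python
row_grades = [(0, 1), (1, 0), (1, 1)]   # v_b, v_r, v_g
col_grades = [(1, 1), (1, 2), (2, 1)]   # e_r, e_b, e_g
# positions (i, j) of the nonzero entries of [d1]
support = [(0, 0), (0, 1), (1, 0), (1, 2), (2, 1), (2, 2)]

def entry_grade(i, j):
    """Exponent u of the monomial t^u at entry (i, j)."""
    return tuple(c - r for r, c in zip(row_grades[i], col_grades[j]))

# each nonzero entry must be a genuine monomial: u >= 0 componentwise
assert all(min(entry_grade(i, j)) >= 0 for i, j in support)
```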
extend the algorithm to overcome this limitation. However, the algorithm introduced below can
still compute a correct diagonalization (not necessarily total) by applying the trick of adding small
enough perturbations to tied grades (considering Zd ⊆ Rd ) to reduce the case to the one satisfying
our assumption. Furthermore, this diagonalization in fact coincides with a total diagonalization
of some persistence module which is arbitrarily close to the original persistence module under a
well-known metric called interleaving distance which we discuss in the next chapter. In practice,
the persistence module usually arises from a simplicial filtration, as shown in our working example. The assumption of distinct grading of the columns and rows is automatically satisfied if at
most one simplex is introduced at each grade in the filtration.
Let $A$ be the presentation matrix whose total diagonalization we seek. We order the rows and columns of the matrix $A$ according to any order that extends the partial order on the grades to a total order, e.g., dictionary order. We fix the indices $Row(A) = \{1, 2, \cdots, \ell\}$ and $Col(A) = \{1, 2, \cdots, m\}$ according to this order. With this ordering, observe that, for each admissible column operation $c_i \to c_j$, we have $i < j$, and for each admissible row operation $r_l \to r_k$, we have $l > k$.
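The dictionary (lexicographic) order is one such linear extension; a quick sketch (our own check) confirming that under it every admissible column operation indeed goes from left to right:

```python
def leq(u, v):
    """Componentwise partial order on grades."""
    return all(a <= b for a, b in zip(u, v))

grades = [(2, 1), (0, 1), (1, 1), (1, 0)]        # hypothetical grades
order = sorted(range(len(grades)), key=lambda i: grades[i])  # lex order
position = {idx: p for p, idx in enumerate(order)}

# gr(c_i) <_gr gr(c_j) implies column i is placed before column j
for i, u in enumerate(grades):
    for j, v in enumerate(grades):
        if i != j and leq(u, v):
            assert position[i] < position[j]
```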
For any column $c_t$, let $A_{\leq t} := A|_C$ denote the left submatrix on $C = [Row(A), \{j \in Col(A) \mid j \leq t\}]$ and let $A_{<t}$ denote its stricter version obtained by excluding column $c_t$ from $A_{\leq t}$. Our algorithm starts with the finest decomposition that puts the free module given by each generator (row) into a separate block, and then combines blocks incrementally as we process the relations (columns). The main idea of our algorithm is presented in Algorithm 22: TotDiagonalize, which runs as follows (see Figure 11.9 for an illustration):
Figure 11.9: (left) $A$ at the beginning of iteration $t$, with $A_{<t}$ totally diagonalized with three index blocks $\mathcal{B}^{(t-1)} = \{B_1^{(t-1)}, B_2^{(t-1)}, B_3^{(t-1)}\}$. (right) A sub-column update step: $c_t|_{Row\,B_1^{(t-1)}}$ has already been reduced to zero, so $B_1^{(t)} = B_1^{(t-1)}$ is added into $\mathcal{B}^{(t)}$. White regions including $c_t|_{Row\,B_1^{(t-1)}}$ must be preserved afterward. Now, for $i = 2$, we attempt to reduce the sub-column $c_t|_{Row\,B_2^{(t-1)}}$. We extend it to the block on $T := [Row(B_2^{(t-1)}), Col(A_{\leq t}) \setminus Col(B_2^{(t-1)})]$ and try to reduce it in BlockReduce.
2. Main iteration: Process $A$ from left to right incrementally by introducing a column $c_t$ and considering the left submatrices $A_{\leq t}$ for $t = 1, 2, \cdots, m$. We update and maintain the collection of index blocks $\mathcal{B}^{(t)} \leftarrow \{B_i^{(t)}\}$ for the current submatrix $A_{\leq t}$ in each iteration by using the column and block updates stated below. Here we use the superscript $(\cdot)^{(t)}$ to emphasize the iteration $t$.
Prior preservation means that the operations together change neither $A_{<t}$ nor the other sub-columns $c_t|_{Row\,B_j^{(t-1)}}$ for every $j < i$. If such operations exist, we apply them on the current $A$ to get an equivalent matrix with the sub-column $c_t|_{Row\,B_i^{(t-1)}}$ zeroed out, and we set $B_i^{(t)} \leftarrow B_i^{(t-1)}$. Otherwise, we leave the matrix $A$ unchanged and add the column index $t$ to the indices of $B_i^{(t-1)}$, i.e., we set $B_i^{(t)} \leftarrow [Row(B_i^{(t-1)}), Col(B_i^{(t-1)}) \cup \{t\}]$. After processing every sub-column $c_t|_{Row\,B_i^{(t-1)}}$ one by one, all index blocks $B_i^{(t)}$ containing the column index $t$ are merged into one single index block. At the end of iteration $t$, we get an equivalent matrix $A$ with $A_{\leq t}$ being totally diagonalized with index blocks $\mathcal{B}^{(t)}$.
2(b). Block reduce: To update the entries of each sub-column of $c_t$ described in 2(a), we propose a block reduction algorithm, Algorithm 24: BlockReduce, to compute the correct entries. Given $T := [Row(B_i^{(t-1)}), Col(A_{\leq t}) \setminus Col(B_i^{(t-1)})]$, this routine checks whether the block $T$ can be zeroed out by some collection of admissible operations. If so, $c_t$ does not join the block $B_i^{(t)}$ and $A$ is updated with these operations.
For two index blocks $B_1, B_2$, we denote the merging $B_1 \oplus B_2$ of these two index blocks as the index block $[Row(B_1) \cup Row(B_2), Col(B_1) \cup Col(B_2)]$. In the following algorithm, we treat the given matrix $A$ as a global variable which can be visited and modified anywhere by every subroutine called. Consequently, every time we update values of $A$ by some operations, these operations are applied to the latest $A$.
The outer loop is the incremental step of the main iteration, introducing a new column $c_t$ and updating the diagonalization of $A_{\leq t}$ from the last iteration. The inner loop corresponds to block updates, which check the intersection of the current column with the rows of each previous block one by one.
Remark 11.5. The algorithm TotDiagonalize does not require the input presentation matrix to be minimal. As indicated in Remark 11.3, the trivial parts result in either identity blocks or single-column blocks like $[\emptyset, \{j\}]$. Such a single-column block corresponds to a zero morphism and
Algorithm 22 TotDiagonalize(A)
Input:
A = input matrix, treated as a global variable, whose columns and rows are totally ordered respecting some fixed partial order given by the grading.
Output:
A total diagonalization $A^*$ with index blocks $\mathcal{B}^*$

1: $\mathcal{B}^{(0)} \leftarrow \{B_i^{(0)} := [\{i\}, \emptyset] \mid i \in Row(A)\}$
2: for $t \leftarrow 1$ to $m := |Col(A)|$ do
3:   $B_0^{(t)} \leftarrow [\emptyset, \{t\}]$
4:   for each $B_i^{(t-1)} \in \mathcal{B}^{(t-1)}$ do
5:     $T := [Row(B_i^{(t-1)}), Col(A_{\leq t}) \setminus Col(B_i^{(t-1)})]$
6:     if BlockReduce(T) == false then
7:       $B_i^{(t)} \leftarrow B_i^{(t-1)} \oplus B_0^{(t)}$   \*update $B_i$ by appending $t$*\
8:     else
9:       $B_i^{(t)} \leftarrow B_i^{(t-1)}$   \*$B_i$ remains unchanged*\
10:      \*$A$ and $c_t$ are updated in BlockReduce when it returns true*\
11:    end if
12:  end for
13:  $\mathcal{B}^{(t)} \leftarrow \{B_i^{(t)}\}$ with all $B_i^{(t)}$ containing $t$ merged as one block.
14: end for
15: Return $(A, \mathcal{B}^{(m)})$
is not merged with any other blocks. Therefore, $c_j$ is a zero column. For a single-row block $[\{i\}, \emptyset]$ which is not merged with any other blocks, $r_i$ is a zero row vector. It represents a free indecomposable submodule in the total decomposition of the input persistence module.
We first prove the correctness of TotDiagonalize assuming that BlockReduce routine works
as claimed, namely, it checks if a sub-column of the current column ct can be zeroed out while
preserving the prior, that is, without changing the left submatrix from the previous iteration and
also the other sub-columns of ct that have already been zeroed out.
Proposition 11.4. At the end of each iteration $t$, $A_{\leq t}$ is a total diagonalization.

Proof. We prove it by induction on $t$. For the base case $t = 0$, it follows trivially by definition. Now assume $A^{(t-1)}$ is the matrix we get at the end of iteration $(t-1)$ with $A^{(t-1)}_{\leq t-1}$ being totally diagonalized. That means, $A^{(t-1)}_{\leq t-1} = A^*_{\leq t-1}$, where $A^{(0)} = A$ is the original given matrix. For contradiction, assume that at the end of iteration $t$, the matrix we get, $A^{(t)}$, has a left submatrix $A^{(t)}_{\leq t}$ which is not a total diagonalization. That means, some index block $B \in \mathcal{B}^{(t)}$ can be decomposed further. Observe that such a $B$ must contain $t$ because all other index blocks (not containing $t$) in $\mathcal{B}^{(t)}$ are also in $\mathcal{B}^{(t-1)}$, which cannot be decomposed further by our inductive assumption. We denote this index block containing $t$ as $B_t$. Let $A'$ be the equivalent matrix of $A^{(t)}$ such that $A'_{\leq t}$ is a total diagonalization with index blocks $\mathcal{B}'$. Let $F$ be an equivalent transformation from $A^{(t)}$ to $A'$ which decomposes $B_t$ into at least two distinct index blocks of $\mathcal{B}'$, say $B^0, B^1, \cdots$. Only one of them contains $t$, say $B^0$. Then $B^1$ consists only of indices that are from $A_{\leq t-1}$, which means $B^1$ equals some index block $B_i \in \mathcal{B}^{(t-1)}$. Therefore, the transformation $F$ gives a sequence of admissible operations which can reduce the sub-column $c_t|_{Row(B_i)}$ to zero in $A^{(t)}$. Starting with this sequence of admissible operations, we construct another sequence of admissible operations which further keeps $A^{(t)}_{\leq t-1}$ unchanged, to reach the contradiction. Note that $A^{(t)}_{\leq t-1} = A^{(t-1)}_{\leq t-1}$.
Now we design the BlockReduce subroutine as required. With the requirement of prior preservation, observe that reducing the sub-column $c_t|_{Row\,B}$ for some $B \in \mathcal{B}^{(t-1)}$ is the same as reducing $T = [Row(B), Col(A_{\leq t}) \setminus Col(B)]$, called the target block (see Figure 11.9, right). The main idea of BlockReduce is to consider a specific subset of admissible operations called independent operations. Within $A_{\leq t}$, these operations only change entries in $T$, and this change is independent of their order of application. The BlockReduce subroutine is designed to search for a sequence of admissible operations within this subset and reduce $T$ with it, if it exists. Clearly, the prior is preserved with these operations. The only thing we need to ensure is that searching within the set of independent operations is sufficient. That means, if there exists a sequence of admissible operations that can reduce $T$ to $0$ and meanwhile preserves the prior, then we can always find one such sequence with only independent operations. This is what we show next.
Consider the following matrices for each admissible operation. For each admissible column operation $c_i \to c_j$, let
$$Y_{i,j} := A \cdot [\delta_{i,j}]$$
where $[\delta_{i,j}]$ is the $m \times m$ square matrix with only one nonzero entry, at $(i, j)$. Observe that $A \cdot [\delta_{i,j}]$ is a matrix whose only nonzero column is column $j$, with entries copied from $c_i$ in $A$. Similarly, for each admissible row operation $r_l \to r_k$, let $[\delta_{k,l}]$ be the $\ell \times \ell$ matrix with its only nonzero entry at $(k, l)$, and let
$$X_{k,l} := [\delta_{k,l}] \cdot A.$$
Application of a column operation $c_i \to c_j$ can be viewed as updating $A$ to $A \cdot (I + [\delta_{i,j}]) = A + Y_{i,j}$. A similar observation holds for row operations as well. For a target block $T = [Row(B), Col(A_{\leq t}) \setminus Col(B)]$ defined on some $B \in \mathcal{B}^{(t-1)}$, we say an admissible column (row) operation $c_i \to c_j$
Figure 11.10: $[\delta_{k,l}] \cdot A \cdot [\delta_{i,j}]$ is a matrix with its only nonzero entry at $(k, j)$, a copy of $A_{l,i}$.
($r_l \to r_k$ resp.) is independent on $T$ if $i \notin Col(T)$, $j \in Col(T)$ ($l \notin Row(T)$, $k \in Row(T)$ resp.). Briefly, we just call such operations independent operations if $T$ is clear from the context.
We have two observations about independent operations that are important. The first one
follows from the definition that T = [Row(B), Col(A≤t ) \ Col(B)].
Observation 11.1. Within A≤t , an independent column or row operation only changes entries on
T.
Observation 11.2. For any independent column operation $c_i \to c_j$ and row operation $r_l \to r_k$, we have $[\delta_{k,l}] \cdot A \cdot [\delta_{i,j}] = 0$; or, equivalently, $[\delta_{k,l}] \cdot Y_{i,j} = X_{k,l} \cdot [\delta_{i,j}] = 0$.

Proof. $[\delta_{k,l}] \cdot A \cdot [\delta_{i,j}] = A_{l,i} [\delta_{k,j}]$ (see Figure 11.10 for an illustration). By the definitions of independence and $T$, we have $l \notin Row(B)$, $i \in Col(B)$. That means they are a row index and a column index from different blocks. Therefore, $A_{l,i} = 0$.
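The identity $[\delta_{k,l}] \cdot A \cdot [\delta_{i,j}] = A_{l,i}[\delta_{k,j}]$ used in this proof is easy to confirm numerically; a sketch with plain $\mathbb{Z}_2$ matrices (0-indexed, values of our own choosing):

```python
def delta(n, r, c):
    """n x n matrix with a single 1 at position (r, c)."""
    return [[1 if (a, b) == (r, c) else 0 for b in range(n)]
            for a in range(n)]

def matmul(X, Y):
    """Matrix product over Z2."""
    return [[sum(X[a][p] * Y[p][b] for p in range(len(Y))) % 2
             for b in range(len(Y[0]))] for a in range(len(X))]

A = [[0, 1, 0], [1, 0, 1], [0, 1, 1]]       # arbitrary Z2 matrix
k, l, i, j = 0, 2, 1, 2
lhs = matmul(matmul(delta(3, k, l), A), delta(3, i, j))
# lhs has a single possibly-nonzero entry, A[l][i], at (k, j)
```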
The following proposition reveals why we are after the independent operations.
Proposition 11.5. The target block $A|_T$ can be reduced to $0$ while preserving the prior if and only if $A|_T$ can be written as a linear combination of independent operations. That is,
$$A|_T = \sum_{\substack{l \notin Row(T) \\ k \in Row(T)}} \alpha_{k,l}\, X_{k,l}|_T \;+\; \sum_{\substack{i \notin Col(T) \\ j \in Col(T)}} \beta_{i,j}\, Y_{i,j}|_T. \tag{11.5}$$
The full proof can be seen in [142]. Here, we give some intuitive explanation. Reducing the target block $A|_T$ to $0$ is equivalent to finding matrices $P$ and $Q$, encoding sequences of admissible row operations and admissible column operations respectively, so that $PAQ|_T = 0$. For the $\Leftarrow$ direction, we can build $P = I + \sum \alpha_{k,l} [\delta_{k,l}]$ and $Q = I + \sum \beta_{i,j} [\delta_{i,j}]$ with the binary coefficients $\alpha_{k,l}$'s and $\beta_{i,j}$'s given in Eqn. (11.5). Then, using Observations 11.1 and 11.2, one can show that $PAQ$ indeed reduces $A|_T$ to $0$ with the prior being preserved. This proves the 'if' direction.
For the 'only if' direction, as long as we show that the existence of a transformation reducing $A|_T$ to $0$ implies the existence of one that reduces $A|_T$ to $0$ by independent operations, we are done. This is formally proved in [142].
We can view $A|_T$, $Y_{i,j}|_T$, $X_{k,l}|_T$ as binary vectors in the same $|T|$-dimensional space. Proposition 11.5 tells us that it is sufficient to check whether $A|_T$ is a linear combination of the vectors corresponding to a set of independent operations. So, we first linearize each of the matrices $Y_{i,j}|_T$'s, $X_{k,l}|_T$'s, and $A|_T$ to a column vector as described later (see Figure 11.11). Then, we check if $A|_T$ is in the span of the $Y_{i,j}|_T$'s and $X_{k,l}|_T$'s. This is done by collecting all vectors $X_{k,l}|_T$'s and $Y_{i,j}|_T$'s into a matrix $S$ called the source matrix (Figure 11.11, right) and then reducing the vector $c := A|_T$ with $S$ by a standard matrix reduction algorithm with left-to-right column additions, which we have seen before in Section 3.3.1 for computing persistence. This routine is presented in Algorithm 23: ColReduce(S, c), which reduces the column $c$ w.r.t. the input matrix $S$ by reducing the matrix $[S|c]$ altogether by MatPersistence from Section 3.3.1.

If $c = A|_T$ can be reduced to $0$, we apply the corresponding independent operations to update $A$. Observe that all column operations used in reducing $A|_T$ together only change the sub-column $c_t|_{Row(B)}$, while row operations may change $A$ to the right of column $t$. We say this procedure reduces $c$ with $S$.
Algorithm 23 ColReduce(S, c)
Input:
Source matrix S and target column c to reduce
Output:
Reduced column c with S
1: S0 ← [S|c]
2: Call MatPersistence(S0 );
3: return c along with indices of columns in S used for reduction of c
Fact 11.1. There exists a set of column operations, each adding a column only to one on its right, such that the
matrix [S|c] is reduced to [S′|0] if and only if ColReduce(S, c) returns a zero vector.
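A minimal sketch of this reduction, with columns stored as Python sets of row indices; the function and variable names are ours, and the routine mimics (rather than reproduces) the MatPersistence-style left-to-right reduction:

```python
def low(col):
    """Largest row index with a nonzero entry; -1 for a zero column."""
    return max(col) if col else -1

def col_reduce(S, c):
    """Reduce column c against matrix S over Z2 with left-to-right column
    additions, in the spirit of Algorithm 23.  Columns are sets of row
    indices.  Returns the reduced column and the sorted indices of the
    original columns of S whose sum was added to c."""
    S = [set(col) for col in S]
    used_from = [{j} for j in range(len(S))]   # originals summed into S[j]
    pivots = {}                                # lowest nonzero row -> column
    for j in range(len(S)):                    # first bring S to reduced form
        while S[j] and low(S[j]) in pivots:
            k = pivots[low(S[j])]
            S[j] ^= S[k]
            used_from[j] ^= used_from[k]
        if S[j]:
            pivots[low(S[j])] = j
    c, used = set(c), set()                    # now reduce c itself
    while c and low(c) in pivots:
        k = pivots[low(c)]
        c ^= S[k]
        used ^= used_from[k]
    return c, sorted(used)
```

For instance, `col_reduce([{0}, {0, 1}], {1})` reduces c to the zero column using both columns of S, whereas `col_reduce([{0}], {1})` leaves c unchanged.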
Fix the following linear order on the (row, column) index pairs:

    ((1, m), (2, m), . . . , (ℓ, m), (1, m − 1), (2, m − 1), . . . ).

For any index block B, let Lin(A|B ) be the vector of dimension |Col(B)| · |Row(B)| obtained by
linearizing A|B to a vector in the above linear order on the indices.
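This linearization can be sketched in Python, assuming the block is given as a dense Z2 matrix (a list of rows) with its last column listed first, rows top to bottom within each column:

```python
def lin(block):
    """Linearize a block (list of rows) in the order
    ((1, m), (2, m), ..., (l, m), (1, m-1), ...): columns from last to
    first, and within each column the rows from top to bottom."""
    if not block:
        return []
    n_rows, n_cols = len(block), len(block[0])
    return [block[i][j] for j in range(n_cols - 1, -1, -1)
                        for i in range(n_rows)]
```

For example, the 2 × 2 identity block linearizes to the vector [0, 1, 1, 0].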
Proposition 11.6. The target block on T can be reduced to zero in A while preserving the prior
if and only if BlockReduce(T ) returns true.
Time complexity. First we analyze the time complexity of TotDiagonalize assuming that the
input matrix has size ` × m. Clearly, max{`, m} = O(N) where N is the total number of generators
Figure 11.11: (top) Matrix A is linearized to the vector Lin(A) in middle; (bottom) the column
operation ci → c j is captured by Yi j whose linearization is illustrated in middle; (right) source
matrix S combining all operations (row operations not shown). In the picture, (·)T denotes trans-
posed matrices.
Algorithm 24 BlockReduce(T )
Input:
Index of target block T to be reduced; Given matrix A is assumed to be a global variable
Output:
A boolean indicating whether A|T can be reduced to 0, and the reduced block A|T if possible.
1: Compute c := Lin(A|T ) and initialize empty matrix S
2: for each admissible column operation ci → c j with i ∉ Col(T ), j ∈ Col(T ) do
3: compute Yi, j |T := (A·[δi, j ])|T and yi, j = Lin(Yi, j |T ); update S ← [S|yi, j ]
4: end for
5: for each admissible row operation rl → rk with l ∉ Row(T ), k ∈ Row(T ) do
6: compute Xk,l |T := ([δk,l ]·A)|T and xk,l := Lin(Xk,l |T ); update S ← [S|xk,l ]
7: end for
8: ColReduce (S, c) returns indices of yi, j ’s and xk,l ’s used to reduce c (if possible)
9: For every returned index of yi, j or xk,l apply ci → c j or rl → rk to transform A
10: return A|T == 0
and relations. For each of O(N) columns, we attempt to zero out every sub-column with row
indices coinciding with each block B of the previously determined O(N) blocks. Let B have N_B
rows. Then, the block T has N_B rows and O(N) columns.
To zero out a sub-column, we create a source matrix out of T which has size O(NN_B) × O(N^2)
because each of the O(N^2) possible operations is converted to a column of size O(NN_B) in
the source matrix. The source matrix S with the target vector c can be reduced with an efficient
algorithm [57, 200] in O(a + N^2 (NN_B)^{ω−1}) time, where a is the total number of nonzero elements
in [S|c] and ω ∈ [2, 2.373) is the exponent for matrix multiplication. We have a = O(NN_B · N^2) =
O(N^3 N_B). Therefore, for each block B we spend O(N^3 N_B + N^2 (NN_B)^{ω−1}) time. Then, observing
Σ_{B∈B} N_B = O(N), for each column we spend a total time of

    Σ_{B∈B} O(N^3 N_B + N^2 (NN_B)^{ω−1}) = O(N^4 + N^{ω+1} Σ_{B∈B} N_B^{ω−1}) = O(N^{2ω}).
Therefore, counting all of the O(N) columns, the total time for the decomposition is O(N^{2ω+1}).
We finish this analysis by commenting that one can build the presentation matrix from a given
simplicial filtration consisting of n simplices, leading to the following cases: (i) for 0-th homology,
the boundary matrix ∂1 can be taken as the presentation matrix, giving N = O(n) and a total time
complexity of O(n^{2ω+1}); (ii) for the 2-parameter case, N = O(n) and presentations can be computed
in O(n^3) time, giving a total time complexity of O(n^{2ω+1}); (iii) for the d-parameter case, N = O(n^{d−1})
and a presentation matrix can be computed in O(n^{d+1}) time, giving a total time complexity of
O(n^{(2ω+1)(d−1)}). We discuss the details in Section 11.5.
Figure 11.12: Diagonalizing the binary matrix given in Example 11.1. It can be viewed as
multiplying the original matrix ∂ with a left matrix U that represents the row operation and a right
matrix V that represents the column operations.
Before the first iteration, B is initialized to be B = {B1 = ({1}, ∅), B2 = ({2}, ∅), B3 =
({3}, ∅)}. In the first iteration when t = 1, we have block B0 = (∅, {1}) for column c1 . For B1 =
({1}, ∅), the target block we hope to zero out is T = ({1}, {1}). So we call BlockReduce(T ) to check
if A|T can be zeroed out and update the entries on T according to the results of BlockReduce(T ).
There is only one admissible operation from outside of T into it, namely, r3 → r1 . The target
vector c = Lin(A|T ) and the source matrix S = {Lin(([δ1,3 ]A)|T )} are constructed accordingly.
The result of ColReduce(S, c) is the same as its input, which means we cannot reduce c at all.
Therefore, BlockReduce(T ) returns false and nothing is updated in the original matrix.
It is not surprising that the matrix remains the same because the only admissible operation
that can affect T does not change any entries in T at all. So there is nothing one can do to
reduce it, which results in merging B1 ⊕ B0 = ({1}, {1}). Similarly, for B2 with T = ({2}, {1}),
the only admissible operation r3 → r2 does not change anything in T . Therefore, the matrix
does not change and B2 is merged with B1 ⊕ B0 , which results in the block ({1, 2}, {1}). For
B3 with T = ({3}, {1}), there is no admissible operation. So the matrix does not change. But
A|T = A|({3},{1}) = 0. That means BlockReduce returns true. Therefore, we do not merge B3 . In
summary, B0 , B1 , B2 are merged to be one block ({1, 2}, {1}) in the first iteration. So after the first
iteration, there are two index blocks in B(1) : ({1, 2}, {1}) and ({3}, ∅).
In the second iteration t = 2, we process the second column c2 . Now B1 = ({1, 2}, {1}), B2 =
({3}, ∅) and B0 = (∅, {2}). For the block B1 = ({1, 2}, {1}), the target block we hope to zero out is
T = ({1, 2}, {2}). There are three admissible operations from outside of T into T , r3 → r1 , r3 →
r2 , c1 → c2 . BlockReduce(T ) constructs the target vector c = Lin(A|T ) and the source matrix
S = {Lin(([δ1,3 ]A)|T ), Lin(([δ2,3 ]A)|T ), Lin((A[δ1,2 ])|T )} illustrated as follows:
        S       c
      1 0 0     0
      0 1 0     0
and returns true since A′|T = 0. Therefore, we do not merge B1 . We continue to check for
the block B2 = ({3}, ∅) and T = ({3}, {1, 2}), whether A′|T can be reduced to zero. There is no
admissible operation for this block at all. Therefore, the matrix stays the same and BlockReduce
returns false. We merge B2 ⊕ B0 = ({3}, {2}).
Continuing the process for the last column c3 in the third iteration t = 3, we see that B1 =
({1, 2}, {1}), B2 = ({3}, {2}) and B0 = (∅, {3}). For the block B1 = ({1, 2}, {1}), the target block we
hope to zero out is T = ({1, 2}, {2, 3}). There are four admissible operations from outside of T into
T , r3 → r1 , r3 → r2 , c1 → c2 , c1 → c3 . BlockReduce(T ) constructs the target vector c = Lin(A|T )
and the source matrix S = {Lin(([δ1,3 ]A)|T ), Lin(([δ2,3 ]A)|T ), Lin((A[δ1,2 ])|T ), Lin((A[δ1,3 ])|T )} il-
lustrated as follows:
          S         c
      1 0 1 0       0
      0 1 1 0       0
      1 0 0 0       0
      0 1 0 0       0
and returns true since A′|T = 0. Therefore, we do not merge B1 with any other block. We
continue to check for the block B2 = ({3}, {2}) and T = ({3}, {1, 3}), whether A′|T can be reduced
to zero. There is no admissible operation for this block at all. Therefore, the matrix stays the same
and BlockReduce returns false. We merge B2 ⊕ B0 = ({3}, {2, 3}).
Finally, the algorithm returns the matrix A′ shown above as the final result. It is the correct
total diagonalization with two index blocks in B∗_A : B1 = ({1, 2}, {1}) and B2 = ({3}, {2, 3}). An
examination of ColReduce(S, c) in all three iterations over columns reveals that the entire matrix
A is updated by operations r3 → r2 and c1 → c2 . We can further transform it back to the original
form of the presentation matrix [∂1 ]. Observe that a row addition ri ← ri + r j reverts to a basis
change in the opposite direction.
For each chain complex C· (Xu ), we have the cycle spaces Z p (Xu )’s and boundary spaces
B p (Xu )’s as kernels and images of boundary maps ∂ p ’s respectively, and the homology group
H p (Xu ) = Z p (Xu )/B p (Xu ) as the cokernel of the inclusion map B p (Xu ) ,→ Z p (Xu ). In line with
category theory, we use the notations im, ker, coker to indicate both the modules of image, kernel,
and cokernel, and the corresponding morphisms uniquely determined by their constructions³.
We obtain the following commutative diagram:
In the language of graded modules, for each p, the family of vector spaces and linear maps (in-
clusions) ({C p (Xu )}u∈Zd , {C p (Xu ) ,→ C p (Xv )}u≤v ) can be summarized as a Zd -graded R-module:
    C_p(X) := ⊕_{u∈Z^d} C_p(X_u),  with the ring action  t_i · C_p(X_u) : C_p(X_u) ,→ C_p(X_{u+e_i})  ∀i, ∀u.
That is, the ring R acts as the linear maps (inclusions) between pairs of vector spaces in C p (X· )
with comparable grades. It is not too hard to check that this C p (X· ) is indeed a graded module.
Each p-chain in a chain space C p (Xu ) is a homogeneous element with grade u.
Then we have a chain complex of graded modules (C∗ (X), ∂∗ ) where ∂∗ : C∗ (X) → C∗−1 (X)
is the boundary morphism given by ∂∗ := ⊕_{u∈Z^d} ∂∗,u with ∂∗,u : C∗ (Xu ) → C∗−1 (Xu ) being the
boundary map on C∗ (Xu ).
³ e.g., ker ∂ p denotes the inclusion of Z p into C p .
The kernel and image of a graded module morphism are also graded modules as submodules
of domain and codomain respectively whereas the cokernel is a quotient module of the codomain.
They can also be defined grade-wise in the expected way:
All the linear maps are naturally induced from the original linear maps in M and N. In our
chain complex setting, the kernel and image of the boundary morphism ∂ p : C p (X) → C p−1 (X)
are the family of cycle spaces Z p (X) and the family of boundary spaces B p−1 (X) respectively, with
linear maps induced by inclusions. Also, from the inclusion-induced morphism B p (X) ,→ Z p (X),
we have the cokernel module H p (X), consisting of the homology groups ⊕_{u∈Z^d} H p (Xu ) and linear
maps induced from the inclusion maps Xu ,→ Xv for each comparable pair u ≤ v. This H p (X)
is the persistence module M which we decompose. A classical persistence module arising from a
filtration of a simplicial complex over Z is an example of a 1-parameter persistence module where
the action t1 · Mu ⊆ Mu+e1 signifies the linear map Mu → Mv between homology groups induced
by the inclusion of the complex at u into the complex at v = u + e1 .
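To make the ring action concrete, here is a toy sketch (our own encoding, not the book's) of a 1-parameter persistence module stored as pointwise dimensions plus the inclusion-induced maps over Z2; the action of t1^(v−u) is then composition of the stored maps:

```python
def mat_mul(A, B):
    """Multiply Z2 matrices given as lists of rows."""
    return [[sum(A[r][t] * B[t][c] for t in range(len(B))) % 2
             for c in range(len(B[0]))] for r in range(len(A))]

# A toy module over grades 0, 1, 2: one bar born at 0 that dies entering
# grade 2, and one bar born at 1 that survives.
dims = [1, 2, 1]
maps = [
    [[1], [0]],      # M_0 -> M_1: the first generator persists
    [[0, 1]],        # M_1 -> M_2: the first generator dies, the second survives
]

def action(u, v):
    """Linear map M_u -> M_v: the action of t1^(v-u), by composing maps."""
    f = [[1 if r == c else 0 for c in range(dims[u])] for r in range(dims[u])]
    for w in range(u, v):
        f = mat_mul(maps[w], f)
    return f
```

Composing the maps recovers, e.g., that the generator of M_0 survives to grade 1 but maps to 0 in grade 2.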
In our case, we have a chain complex of graded modules and induced homology groups, which
can be succinctly described by the following diagram:
An assumption. We always assume that the simplicial filtration is 1-critical, which means that
each simplex has a unique earliest birth time. For the case which is not 1-critical, called multi-
critical, one may utilize the mapping telescope, a standard algebraic construction [186], which
transforms a multi-critical filtration to a 1-critical one. However, notice that this transformation
increases the input size depending on the multiplicity of the incomparable birth times of the
simplices. For 1-critical filtrations, each module C p is free. With a fixed basis for each free
module C p , a concrete matrix [∂ p ] for each boundary morphism ∂ p based on the chosen bases can
be constructed.
With this input, we discuss our strategies for different cases that depend on two parameters:
d, the number of parameters of the filtration function, and p, the dimension of the homology groups
in the persistence modules.
Note that a presentation gives an exact sequence F^1 → F^0 → H → 0. To reveal further
details of a presentation of H, we recognize that it respects the following commutative diagram,

    F^1 --f^1--> F^0 --f^0--> H,    with Y^1 = im f^1 = ker f^0 and f^0 = coker f^1,
where Y^1 ,→ F^0 is the kernel of f^0 . With this diagram being commutative, all maps in
the diagram are essentially determined by the presentation map f^1 . We call the surjective map
f^0 : F^0 → H the generating map, and Y^1 = ker f^0 the first syzygy module of H.
Justification. For p = 0, the cycle module Z0 = C0 is a free module. It is easy to check that
∂1 : C1 → C0 is a presentation of H0 since both C1 and C0 are free modules, so we have the
presentation of H0 as claimed. With the standard bases of the chain modules C p ’s, we have a
presentation matrix [∂1 ] as a valid input to our decomposition algorithm.
The 0-th homology in our working example (Figure 11.5) corresponds to this case. The
presentation matrix is the same as the matrix of boundary morphism ∂1 .
    C_{p+1} --∂_{p+1}--> C_p,    B_p = im ∂_{p+1},  Z_p = ker ∂_p,  H_p = Z_p /B_p ,

with ∂̄_{p+1} : C_{p+1} → Z_p the map induced by ∂_{p+1}.
1. Compute a basis G(Z p ) for the free module Z p where G(Z p ) is presented as a set of
generators in the basis of C p . This can be done by an algorithm in [222]. Take G(Z p )
as the row basis of the presentation matrix [∂¯ p+1 ].
2. Present im ∂ p+1 in the basis of G(Z p ) to get the presentation matrix [∂¯ p+1 ] of the
induced map as follows. Originally, im ∂ p+1 is presented in the basis of C p through
the given matrix [∂ p+1 ]. One needs to rewrite each column of [∂ p+1 ] in the basis G(Z p )
computed in the previous step. This can be done as follows. Let [G(Z p )] denote the
matrix presenting basis elements in G(Z p ) in the basis of C p . Let c be any column
vector in [∂ p+1 ]. We reduce c to zero vector by the matrix [G(Z p )] and note the
columns that are added to c. These columns provide the necessary presentation of c
in the basis G(Z p ). This reduction can be done by the persistence algorithm described
in Chapter 3.
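The rewriting in step 2 amounts to expressing a column in the basis [G(Z p )] by reduction. A sketch with our own names, assuming [G(Z p )] has been reduced so that its columns have distinct lowest nonzero rows and that c lies in its span (as it does here, since columns of [∂ p+1 ] are cycles):

```python
def low(col):
    """Largest row index with a nonzero entry; -1 for a zero column."""
    return max(col) if col else -1

def express_in_basis(G, c):
    """Write column c (a set of row indices) in the basis given by the
    columns of G over Z2.  Assumes G is reduced (distinct pivots) and
    that c is in the span of G; raises KeyError otherwise.
    Returns the indices of the basis columns whose sum equals c."""
    pivots = {low(g): j for j, g in enumerate(G)}
    c = set(c)
    coeffs = []
    while c:
        j = pivots[low(c)]          # the unique column that can kill low(c)
        c ^= G[j]
        coeffs.append(j)
    return sorted(coeffs)
```

For instance, with basis columns {0} and {0, 1}, the column {1} is their sum, while {0, 1} is just the second basis column.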
    C_{p+1} --∂̄_{p+1}--> Z_p --→ H_p --→ 0,    B_p = im ∂_{p+1},  H_p = coker ∂̄_{p+1}.
Here ∂̄ p+1 is the map induced by ∂ p+1 . With a fixed basis of Z p and the standard basis of C p+1 ,
we rewrite the presentation matrix [∂ p+1 ] to get [∂̄ p+1 ], which constitutes a valid input to our
decomposition algorithm.
Figure 11.13: An example of a filtration of a simplicial complex for d = 3 with non-free Z when
p = 1. The three cycles at gradings (0, 1, 1), (1, 0, 1), (1, 1, 0) are three generators in Z1 . However,
at grading (1, 1, 1), the earliest time these three cycles exist simultaneously, there is a relation
among these three generators.
Taking the construction of the presentation matrix into consideration, we get a time complexity
bound of O(n^{d+1}) + O(n^{(2ω+1)(d−1)}) = O(n^{(2ω+1)(d−1)}).
11.6 Invariants
For a given persistence module, it is useful to compute invariants that in some sense summarize
the information contained in it. Ideally, these invariants should characterize the input module
completely, meaning that two invariants should be equal if and only if the modules are
isomorphic. Persistence diagrams for 1-parameter tame persistence modules are such invariants.
For multiparameter persistence modules, no such complete invariants exist that are finite and
hence computable. However, we can still aim for invariants that are computable and characterize
the modules in some limited sense, meaning that these invariants remain equal for isomorphic
modules though may not differentiate non-isomorphic modules. Of course, their effectiveness in
practice is determined by their discriminative power. We present two such invariants below: the
first one, the rank invariant, was suggested in [65], whereas the second one, the graded Betti
numbers, was brought to TDA by [214] and studied further in [221].
We can retrieve the information about birth and death of generators from the multirank. For a
grade u, define its immediate predecessors Pu and immediate successors S u as:
Fact 11.4.

1. m generators are born at grade u if and only if coker(⊕_{u′∈P_u} M_{u′} → M_u) has
dimension m.

2. m generators die leaving grade u if and only if ker(M_u → ⊕_{u′∈S_u} M_{u′}) has
dimension m.
Fact 11.5. Two interval decomposable modules are isomorphic if and only if they have the same
multirank invariants.
Definition 11.20 (Free resolution). For a graded module M, a free resolution F → M is an exact
sequence

    ··· --→ F^2 --f^2--> F^1 --f^1--> F^0 --f^0--> M --→ 0

where each F^i is a free graded R-module.
Now we observe that a free resolution can be obtained as an extension of a free presentation.
Consider a free presentation of M as depicted below.
    F^1 --f^1--> F^0 --f^0--> M,    with Y^1 = im f^1 = ker f^0 and f^0 = coker f^1.
If the presentation map f^1 has a nontrivial kernel, we can find a nontrivial map f^2 : F^2 → F^1
with im f^2 = ker f^1 , which implies coker f^2 ≅ im f^1 = ker f^0 = Y^1 . Therefore, f^2 is in fact a
presentation map of the module Y^1 , the so-called first syzygy module of M (named after
Hilbert’s famous syzygy theorem [191]). We can keep doing this to get f^3 , f^4 , . . . by constructing
presentation maps on the higher order syzygy modules Y^2 , Y^3 , . . . of M, which results in the
diagram depicted below, giving a free resolution of M.
    ··· --→ F^3 --f^3--> F^2 --f^2--> F^1 --f^1--> F^0 --f^0--> M,

with Y^3 = im f^3 = ker f^2 ,  Y^2 = im f^2 = ker f^1 ,  Y^1 = im f^1 = ker f^0 , and f^0 = coker f^1 .
A free resolution is not unique. However, there exists an essentially unique minimal free resolution
in the sense that any free resolution can be obtained by summing the minimal free resolution
with a free resolution of a trivial module. Below we give a construction to build a minimal free
resolution from a minimal free presentation. The proof that it indeed creates a minimal free
resolution can be found in [50, 268].
Definition 11.21 (Graded Betti numbers). Let F^j be a free module in the minimal free resolution
of a graded module M. Let β^M_{j,u} be the multiplicity of each grade u ∈ Z^d in the multiset consisting
of the grades of homogeneous basis elements for F^j . Then, the mapping β^M_{(−,−)} : Z≥0 × Z^d → Z≥0
is an invariant called the graded Betti numbers of M.
For example, the graded Betti numbers of the persistence module for our working example in
Figure 11.5 are listed in Table 11.1.
Definition 11.22 (Persistent graded Betti numbers). Let M ≅ ⊕_i M^i be a total decomposition
of a graded module M. For each indecomposable M^i , we have the refined graded Betti numbers
β^{M^i} = {β^{M^i}_{j,u} | j ∈ N, u ∈ Z^d }. We call the set PB(M) := {β^{M^i}} the persistent graded Betti numbers
of M.
For the working example in Figure 11.5, the persistent graded Betti numbers are given in two
tables listed in Table 11.2.
Table 11.1: All the nonzero graded Betti numbers βi,u are listed in the table. Empty items are all
zeros.
One way to summarize the information of graded Betti numbers is to use the Hilbert function,
which is also called dimension function [141] defined as:
Fact 11.6. There is a relation between the graded Betti numbers and dimension function of a
persistence module as follows:
    ∀u ∈ Z^d ,   dm M(u) = Σ_{v≤u} Σ_j (−1)^j β_{j,v} .
Then for each indecomposable M i , we have the dimension function dmM i related to persistent
graded Betti numbers restricted to M i .
Definition 11.23 (Blockcode). The set of dimension functions Bdm (M) := {dmM i } is called the
blockcode of M.
For our working example, the dimension functions of indecomposable summands M 1 and M 2
are (see Figure 11.14 for the visualization):
    dm M^1(u) = 1 if u ≥ (1, 0) or u ≥ (0, 1), and 0 otherwise;
    dm M^2(u) = 1 if u = (1, 1), and 0 otherwise.                       (11.2)
    β^{M^1}    (1,0)  (0,1)  (1,1)  (2,1)  (1,2)  (2,2)  ···
    β0           1      1
    β1                         1
    β≥2

    β^{M^2}    (1,0)  (0,1)  (1,1)  (2,1)  (1,2)  (2,2)  ···
    β0                         1
    β1                                1      1
    β2                                              1
    β≥3

Table 11.2: Persistent graded Betti numbers PB(M) = {β^{M^1}, β^{M^2}}. All nonzero entries are
listed in this table; blank boxes indicate 0 entries.
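Fact 11.6 can be verified on these tables. The sketch below (our own names) recovers the dimension functions of Eqn. (11.2) from the graded Betti numbers copied from Table 11.2:

```python
# Graded Betti numbers of the two summands, copied from Table 11.2:
# betti[j] maps a grade u to the multiplicity beta_{j,u}.
betti_M1 = {0: {(1, 0): 1, (0, 1): 1}, 1: {(1, 1): 1}}
betti_M2 = {0: {(1, 1): 1}, 1: {(2, 1): 1, (1, 2): 1}, 2: {(2, 2): 1}}

def dm(betti, u):
    """Dimension function via Fact 11.6: sum_{v <= u} sum_j (-1)^j beta_{j,v}."""
    return sum((-1) ** j * mult
               for j, grades in betti.items()
               for v, mult in grades.items()
               if v[0] <= u[0] and v[1] <= u[1])
```

For instance, dm M^1 equals 1 at (1, 1) where the two generators merge, and dm M^2 vanishes again at (2, 2) where its relation at grade (2, 2) cancels the two relations at (2, 1) and (1, 2).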
Figure 11.14: (top) The 2-parameter simplicial filtration for our working example in Figure 11.5,
together with dm M^1 and dm M^2 : each colored square represents a 1-dimensional vector space k
and each white square represents a 0-dimensional vector space. In the middle picture, M^1 is
generated by v^b_{0,1} and v^r_{1,0} , drawn as a blue dot and a red dot respectively; they are
merged at grade (1, 1).
We can read off some useful information from the dimension function of each indecomposable,
taking our working example as an illustration. For dm M^1 , two connected components are born
at the two left-bottom corners of the purple region. They are merged immediately when they meet
at grade (1, 1). After that, they persist forever as one connected component. For dm M^2 , one
connected component is born at the left-bottom corner of the square green region. Later, at the
grades of the left-top corner and the right-bottom corner of the green region, it is merged with
other connected components with smaller grades of birth. Therefore, it only persists within this
green region.
In general, both persistent graded Betti numbers and blockcodes are not sufficient to classify
multiparameter persistence modules, which means they are not complete invariants. As indicated
in [64], there is no complete discrete invariant for multiparameter persistence modules. How-
ever, interestingly, these two invariants are indeed complete invariants for interval decomposable
modules like this example, which we will study in the next chapter.
module, which results in the persistent graded Betti numbers. For each indecomposable, we
apply the dimension function [141], also known as the Hilbert function in commutative algebra,
to summarize the graded Betti numbers of each indecomposable module. This constitutes a
blockcode for each indecomposable of the persistence module. The blockcode is a good vehicle
for visualizing lower dimensional persistence modules such as 2- or 3-parameter persistence
modules. For details on these invariants, see [142].
Exercises
1. Using the matrix diagonalization algorithm as described in this chapter, devise an algo-
rithm to compute a minimal presentation of a 2-parameter persistence module given by a
simplicial filtration over Z2 .
2. Give an example of a 2-parameter simplicial filtration over Z2 whose induced persistence
module has at least one indecomposable that is not free.
3. Give an example of a 2-parameter simplicial filtration over Z2 whose induced persistence
module has at least one indecomposable whose non-trivial vector spaces over the grades are not all isomorphic.
4. Give examples of 2-parameter persistence modules M with three generators and relations
that have the following properties: (i) M is indecomposable, (ii) M has two indecompos-
ables, (iii) M has three indecomposables.
5. Prove that the cycle module Z p arising from a 2-parameter simplicial filtration is always
free.
6. Design a polynomial time algorithm for computing decomposition of the persistence mod-
ule induced by a given simplicial filtration over Z2 when a simplex can be a generator at
different grades.
7. Let A be a presentation matrix with n generators and relations whose grades are distinct and
totally ordered. Design an O(n^3) time algorithm to decompose A. Interpret the type of each
indecomposable in such a case.
8. The algorithm TotDiagonalize has been written assuming that the field of the polynomial
ring is Z2 . Write it for a general finite field.
9. Give an example of two non-isomorphic 2-parameter persistence modules which have the
same rank invariant.
10. Design an efficient algorithm to compute the rank invariant of a module from the simplicial
filtration inducing it.
11. Prove that a 2-parameter persistence module M is an interval (see Section 11.6.1) if and
only if supp(M) is connected and each Mu for u ∈ supp(M) has dimension 1.
12. Suppose that a 2-parameter persistence module M is given by a presentation matrix. Design an
algorithm to determine if M is interval or not without decomposing the input matrix (hint:
consider computing graded Betti numbers from the grades of the rows and columns of the
matrix).
13. Show that for a finitely presented (finite number of generators and relations) graded module
M, there exist two interval decomposable graded modules M 1 and M 2 so that the rank
invariants (Definition 11.18) satisfy ruv (M) = ruv (M 1 ) − ruv (M 2 ) for every u, v ∈ supp(M).
Given a presentation matrix for M, compute such M 1 and M 2 efficiently.
14. Write a pseudocode for the construction of a minimal free resolution given in Section 11.6.2.
Analyze its complexity.
Chapter 12
We have seen that persistence modules are important objects of study in topological data analysis
in that they serve as an intermediate between the raw input data and the output summarization
with persistence diagrams. For the 1-parameter case, the distances between modules can be computed
from bottleneck distances between the corresponding persistence diagrams. For multiparameter
persistence modules, we already saw in Chapter 11 that the indecomposables, which are analogues
of bars in the 1-parameter case, are more complicated. So, defining distances between persistence
Figure 12.1: A 2-parameter module is sliced by lines that provide matching distance between
two modules as we explain in Section 12.3. Figure is an output of RIVET software due to [221],
courtesy of Michael Lesnick and Matthew Wright (2015, fig. 3).
modules in terms of indecomposables also becomes more complicated. However, we need a distance
or distance-like notion between persistence modules to compare the input data inducing them.
Figure 12.1 shows an output of RIVET software [221] that implemented the so-called matching
distance between 2-parameter persistence modules. In this chapter, we describe some of these
distances proposed in the literature and algorithms for computing them efficiently (polynomial
time).
The interleaving distance dI between 1-parameter persistence modules as defined in Chapter 3
provides a useful means to compare them. Fortunately, for 1-parameter persistence modules,
it can be computed exactly by computing the bottleneck distance db between their persistence
diagrams, thanks to the isometry theorem [220] (see also [23, 80]). Chapter 3 gives an
O(n^{1.5} log n) time algorithm for computing the bottleneck distance. The status, however, is not so well
settled for multiparameter persistence modules.
One of the difficulties facing the definition and computation of distances among multiparame-
ter persistence modules is the fact that their indecomposables do not have a finite characterization
as indicated in Chapter 11. This is true even for finitely generated modules, though a unique
decomposition is guaranteed by the Krull-Schmidt theorem [10]. Despite this difficulty, one can define
an interleaving distance dI for multiparameter persistence modules which can be viewed as an
extension of the interleaving distance defined for 1-parameter persistence modules. As shown by
Lesnick [220], this distance is the most fundamental one because it is the most discriminative
distance among persistence modules that is also stable with respect to functions or simplicial fil-
trations that give rise to the modules. Unfortunately, it turns out that computing dI for n-parameter
persistence modules and even approximating it within a factor less than 3 is NP-hard for n ≥ 2.
For a special case of modules called interval modules, dI can be computed in polynomial time.
In Section 12.2, we introduce the interleaving distance for multiparameter persistence modules.
We follow it with a polynomial time algorithm [141] in Section 12.4.3 which computes dI for
2-parameter interval modules.
To circumvent the problem of computing interleaving distances, several other distances have
been proposed in the literature that are computable in polynomial time and bound the interleaving
distance either from above or below, but not both in the general case. Given the NP-hardness of
approximating the interleaving distance, there cannot exist any polynomial time computable distance
that bounds dI both from above and below within a factor of 3 unless P = NP. The
matching distance dm as defined in Section 12.3 bounds dI from below, that is, dm ≤ dI , and it
can be computed in polynomial time.
Finally, in Section 12.4, we extend the definition of the bottleneck distance to multiparam-
eter persistence modules. Extending the concept from 1-parameter case, one can define db as
the supremum of the pairwise interleaving distances between indecomposables under an opti-
mal matching. Then, straightforwardly, dI ≤ db , but the reverse inequality does not hold in general. It is
known that no lower bound in terms of db for dI may exist even for a special class of 2-parameter
persistence modules called interval decomposable modules [47]. However, db can be useful as
a reasonable upper bound to dI . Unfortunately, a polynomial time algorithm for computing db
is not known for general persistence modules. For some persistence modules whose indecom-
posables have constant-size descriptions, such as block decomposable modules, one can compute db in
polynomial time simply because computing the interleaving distance between any two modules with
constant-size descriptions takes only O(1) time.
In Section 12.4, we consider a special class of persistence modules whose indecomposables
are intervals and present a polynomial time algorithm for computing db for them. These are mod-
ules whose indecomposables are supported by “stair-case” polyhedra. Our algorithm assumes
that all indecomposables are given and computes db exactly for 2-parameter interval decompos-
able modules. Although the algorithm can be extended to persistence modules with larger number
of parameters, we choose to present it only for 2-parameter case for simplicity while not losing
the essential ingredients for the general case. The indecomposables required as input can be
computed by the decomposition algorithm presented in the previous chapter (Chapter 11).
Definition 12.1 (Category). A category C is a set of objects Obj C with a set of morphisms
hom(x, y) for every pair of elements x, y ∈ Obj C where
3. for morphisms f, g, h, the compositions wherever defined are associative, that is, ( f ◦
g) ◦ h = f ◦ (g ◦ h);
All sets form a category Set with functions between them playing the role of morphisms. Topo-
logical spaces form a category Top with continuous maps between them being the morphisms.
Vector spaces form the category Vec with linear maps between them being the morphisms. A
poset P forms a category with every pair x, y ∈ P admitting at most one morphism: hom(x, y)
has one element if x ≤ y and is empty otherwise. Such a category is called a thin category in the
literature, for which the composition rules take a trivial form.
4. F preserves identity morphisms, that is, F(1 x ) = 1F(x) for every x ∈ Obj C.
One can observe that a 1-parameter persistence module is a functor from the category of the totally
ordered set Z (or R) to the category Vec. Homology groups with a field coefficient for
topological spaces provide a functor from the category Top to the category of vector spaces Vec. We
can define maps between functors themselves.
Definition 12.3 (Natural transformation). Given two functors F, G : C → D, a natural transformation η : F ⇒ G assigns to every object x ∈ Obj C a morphism ηx : F(x) → G(x) so that for every morphism ρ ∈ hom(x, y) the following diagram commutes, that is, ηy ◦ F(ρ) = G(ρ) ◦ ηx:

           F(ρ)
    F(x) -------> F(y)
     |              |
     | ηx           | ηy
     v              v
    G(x) -------> G(y)
           G(ρ)
Let k be a field, Vec be the category of vector spaces over k, and vec be the subcategory of
finite dimensional vector spaces. As usual, for simplicity, we assume k = Z2 .
Definition 12.4 (Persistence module). Let P be a poset category. A P-indexed persistence module
is a functor M : P → Vec. If M takes values in vec, we say M is pointwise finite dimensional
(p.f.d.). The P-indexed persistence modules themselves form another category where the natural
transformations between functors constitute the morphisms.
Definition 12.5 (Finite type). A P-indexed persistence module M is said to have finite type if M
is p.f.d. and all morphisms M(x ≤ y) are isomorphisms outside a finite subset of P.
Here we consider the poset category to be Rd with the standard partial order and all modules
to be of finite type. We call Rd-indexed persistence modules d-parameter modules for short.
The reader can recognize that this is a shift from our assumption in the last chapter where we
considered Zd -indexed modules. The category of d-parameter modules in this chapter is denoted
as Rd-mod. For a d-parameter module M ∈ Rd-mod, we use the notation Mx := M(x) and ρ^M_{x→y} := M(x ≤ y).
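To make the functor point of view concrete, here is a small sketch (toy data of our own, not from the text) of a 1-parameter persistence module over Z2 stored as dimensions and internal linear maps, together with a check of functoriality:

```python
def matmul2(A, B):
    """Multiply two matrices over Z2 (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) % 2
             for j in range(len(B[0]))] for i in range(len(A))]

# A toy 1-parameter persistence module on indices 0 <= 1 <= 2:
# vector spaces of dimensions 1, 2, 1 and internal maps (hypothetical data).
dims = {0: 1, 1: 2, 2: 1}
rho = {
    (0, 1): [[1], [0]],   # M(0 <= 1): k -> k^2
    (1, 2): [[1, 1]],     # M(1 <= 2): k^2 -> k
}

def rho_xy(x, y):
    """M(x <= y), obtained by composing the consecutive internal maps."""
    m = [[1 if i == j else 0 for j in range(dims[x])] for i in range(dims[x])]
    for z in range(x, y):
        m = matmul2(rho[(z, z + 1)], m)
    return m

# Functoriality: M(0 <= 2) = M(1 <= 2) composed with M(0 <= 1).
assert rho_xy(0, 2) == matmul2(rho[(1, 2)], rho[(0, 1)])
```

The same dictionary-of-maps representation extends to Zd-indexed modules, with one map per coordinate step.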
Definition 12.6 (Shift). For any δ ∈ R, we denote ~δ = (δ, · · · , δ) = δ · ~e, where ~e = e1 + e2 + · · · + ed = (1, . . . , 1)
with {ei}_{i=1}^d being the standard basis of Rd. We define a shift functor (·)→δ : Rd-mod → Rd-mod
where M→δ := (·)→δ(M) is given by M→δ(x) = M(x + ~δ) and M→δ(x ≤ y) = M(x + ~δ ≤ y + ~δ). In
other words, M→δ is the module M shifted diagonally by ~δ.
Definition 12.7 (Interleaving). For two d-parameter persistence modules M and N, and δ ≥ 0,
a δ-interleaving between M and N consists of two families of linear maps {φx : Mx → Nx+~δ}x∈Rd and
{ψx : Nx → Mx+~δ}x∈Rd satisfying the following two conditions; see Figure 12.2:

• (square commutativity) ∀x ≤ y ∈ Rd, φy ◦ ρ^M_{x→y} = ρ^N_{x+~δ→y+~δ} ◦ φx and ψy ◦ ρ^N_{x→y} = ρ^M_{x+~δ→y+~δ} ◦ ψx;

• (triangular commutativity) ∀x ∈ Rd, ψ_{x+~δ} ◦ φx = ρ^M_{x→x+2~δ} and φ_{x+~δ} ◦ ψx = ρ^N_{x→x+2~δ}.
[Figure 12.2: commutative diagrams for a δ-interleaving: (a) triangular commutativity along the diagonal from x to x + 2~δ; (b) square commutativity for x ≤ y.]
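For intuition, in the 1-parameter case the interleaving distance between two interval modules with intervals [b1, d1) and [b2, d2) admits a simple closed form: either shift each endpoint onto the corresponding one, or choose δ at least half the longer interval's length so that both modules can be matched to the zero module. A small sketch (our own illustration of this known formula, not from the text):

```python
def interleaving_1d(I, J):
    """Interleaving distance between two 1-parameter interval modules
    with intervals I = [b1, d1) and J = [b2, d2)."""
    (b1, d1), (b2, d2) = I, J
    shift = max(abs(b1 - b2), abs(d1 - d2))   # slide one interval onto the other
    kill_both = max(d1 - b1, d2 - b2) / 2     # or trivialize both modules
    return min(shift, kill_both)
```

For example, two long intervals with nearby endpoints are cheaply interleaved by shifting, while two far-apart short intervals are cheaply interleaved by trivializing both.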
Definition 12.8 (Interleaving distance). The interleaving distance between two persistence modules M and N is dI(M, N) := inf{δ ≥ 0 | M and N are δ-interleaved}.

Definition 12.9 (Matching distance). The matching distance dm(M, N) between two persistence modules M and N is defined as
dm(M, N) := sup_{ℓ∈Λ} w(ℓ) · dI(M|ℓ, N|ℓ),
where Λ denotes the set of lines in R2 with positive slope, M|ℓ and N|ℓ are the restrictions of M and N to ℓ, and w(ℓ) is a weight depending only on the slope of ℓ. The weight w(ℓ) is introduced to make the matching distance stable with respect to the interleaving distance.
Fact 12.1. Consider the duality that sends a point p = (a, b) to the line p∗ : y = ax − b and a line ℓ : y = sx − t to the point ℓ∗ = (s, t). Then:
1. The duality is an involution: (p∗)∗ = p and (ℓ∗)∗ = ℓ.
2. A point p lies on a line ℓ if and only if the point ℓ∗ lies on the line p∗.
3. If a point p is above (below) a line ℓ, then the point ℓ∗ is above (below) the line p∗.

Consider the open half-plane Ω of R2 where Ω = {(x, y) | x > 0}. Let α denote the bijective map
between Ω and the space Λ of lines with positive slope where α(p) = p∗.
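These duality properties are easy to check computationally. The sketch below (our own illustration; a line y = sx − t is represented by the pair (s, t), so both duality maps become the identity on pairs, which is exactly the involution property):

```python
def dual_of_point(p):
    """Point p = (a, b)  ->  dual line p*: y = a*x - b, stored as (a, b)."""
    return p

def dual_of_line(line):
    """Line y = s*x - t  ->  dual point l* = (s, t)."""
    return line

def side(p, line):
    """+1 if p lies above the line y = s*x - t, -1 if below, 0 if on it."""
    s, t = line
    a, b = p
    d = b - (s * a - t)
    return (d > 0) - (d < 0)
```

For instance, p = (1, 3) lies above ℓ : y = 2x − 1, and indeed ℓ∗ = (2, 1) lies above p∗ : y = x − 3, as Fact 12.1(3) predicts.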
Representation theory [65, 108, 214] tells us that the finitely generated graded modules as
defined in Chapter 11 are essentially equivalent to the persistence modules as defined in this chapter,
as long as they are of finite type (Definition 12.5). Then, if a persistence module M is a functor
on the poset P = R2 or Z2, we can talk about the grades (elements of P) of a generating set of M
and of the relations, which are combinations of generators that become zero. A mindful reader will
recognize that these are exactly the grades of the rows and columns of the presentation matrix for M
(Definition 11.14).
Given two 2-parameter persistence modules M and N, let gr(M) and gr(N) denote the grades
of all generators and relations of M and N respectively. Consider the set of lines L dual to the
points in gr(M) ∪ gr(N). These lines together create a line arrangement in Ω, which is a partition
of Ω into vertices, edges, and faces: the vertices are the points where two lines meet, the edges are
the maximal connected subsets of the lines excluding the vertices, and the faces are the maximal connected
subsets of Ω excluding the vertices and edges. Let A0 denote this initial arrangement. We refine
this arrangement further later. First, we observe an invariant property of the arrangement, for which
we need the following definition.
Definition 12.10 (Point pair type). Given two points p, q and a line `, we say (p, q) has the
following types with respect to ` : (i) Type-1 if both p and q lie above `, (ii) Type-2 if both p and
q lie below `, (iii) Type-3 if p lies above and q lies below `, and (iv) Type-4 if p lies below and q
lies above `.
The following proposition follows from Fact 12.1.
Proposition 12.2. For two points p, q ∈ gr(M) ∪ gr(N) and a face τ ∈ A0 , the type of (p, q) with
respect to the line z∗ is the same for all z ∈ τ.
Our goal is to refine A0 further to another arrangement A so that, for every face τ ∈ A, the pair of grade
points p, q that realizes dI(M|ℓ, N|ℓ) for ℓ = z∗ remains the same for all z ∈ τ. Toward that
goal, we define the push of a grade point.
Definition 12.11 (Push). For a point p = (px, py) and a line ℓ : y = sx − t, the push push(p, ℓ) is
defined as

push(p, ℓ) = (px, s·px − t) if p is below ℓ, and push(p, ℓ) = ((py + t)/s, py) if p is above ℓ.
Geometrically, push(p, ℓ) is the intersection of ℓ with the upward vertical ray originating at p in
the first case, and with the rightward horizontal ray originating at p in the second case. Figure 12.3 illustrates the
two cases.
Figure 12.3: Pushes of two points p and q to three lines. Thick segments indicate δp,q for the corresponding lines.
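A direct transcription of Definition 12.11 might look as follows (our own sketch; a line ℓ : y = sx − t with s > 0 is stored as the pair (s, t)):

```python
def push(p, line):
    """push(p, l) of Definition 12.11 for a line l: y = s*x - t, s > 0."""
    s, t = line
    px, py = p
    if py < s * px - t:                  # p strictly below l: vertical ray upward
        return (px, s * px - t)
    return ((py + t) / s, py)            # p on or above l: horizontal ray rightward

def delta_pq(p, q, line):
    """delta_{p,q}(l): Euclidean distance between the two pushes."""
    (ax, ay), (bx, by) = push(p, line), push(q, line)
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
```

Note that a point on the line is pushed to itself by either formula, so the boundary case is harmless.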
For p, q ∈ R2, let
δp,q(ℓ) = ‖push(p, ℓ) − push(q, ℓ)‖2.
Consider the equations in z obtained by equating these distances weighted by
cp,q = 1/2 if p, q ∈ gr(M) or p, q ∈ gr(N), and cp,q = 1 otherwise.
The following proposition is proved in [207].
Proposition 12.3. The set of solutions z ∈ τ for a face τ ∈ A0 such that δp,q(z∗) satisfies the above
equations is either empty, the entire face τ, the intersection of a line with τ, or the intersection of two
lines with τ.
Let A be the arrangement of Ω formed by the lines used to form A0, the lines stated in the above
proposition, and the vertical line x = 1.
Figure 12.4: Outer regions are shaded gray and their outer edges are drawn as thickened segments; the hatched region is inner.
A region is the closure of a face τ ∈ A in Ω. A region R is called inner if it is bounded and its
closure in R2 does not meet the vertical line s = 0; see Figure 12.4. All other regions are called
outer. An outer region has exactly two edges that are either unbounded or reach the vertical
line s = 0 in the limit; they are called outer edges. It turns out that sup F(z) is achieved either at
a vertex or at the limit point of an outer edge, which can be computed easily.
Theorem 12.6. The supremum supz∈R F(z) for a region R is realized either at a boundary vertex of
R or at the limit point of an outer edge. In the latter case, let p, q be the pair given by Theorem 12.5
for τ ⊆ R. If e is an outer edge and p lies above z∗ for any (and hence all, by Proposition 12.2) z ∈ τ,
then sup F restricted to e is given by

sup F|e = |px − t| if the line of e intersects the line x = 0 at t, and
sup F|e = |qx + r| if the line of e is infinite and has slope r.

The roles of p and q reverse if p lies below z∗ for any z ∈ τ.
We present the entire algorithm in Algorithm 25: MatchDist. It is known that this algorithm
runs in O(n^11) time, where n is the total number of generators and relations for the two input
modules. A more efficient algorithm approximating the matching distance is also known [209].
Algorithm 25 MatchDist(M, N)
Input:
Two modules M and N with grades of their generators and relations
Output:
Matching distance between M and N
1: Compute arrangement A as described from gr(M) ∪ gr(N);
2: Let V be the vertex set of A;
3: Compute maximum m = maxz∈V F(z∗ ) over all vertices z ∈ V;
4: for every outer region R do
5: Pick a point z ∈ R;
6: Compute the pair p, q ∈ gr(M) ∪ gr(N) that realizes dI (M|z∗ , N|z∗ );
7: if p is above z∗ then
8: if e as defined in Theorem 12.6 is infinite then
9: m := max(m, q x + r) where r is the slope of e
10: else
11: m := max(m, p x − t) where e meets line x = 0 at t
12: end if
13: else
14: reverse roles of p and q
15: end if
16: end for
17: return m
For the next definition, we call a d-parameter module M δ-trivial if ρ^M_{x→x+~δ} = 0 for all x ∈ Rd.
Definition 12.13 (Bottleneck distance). Let M = ⊕_{i=1}^m Mi and N = ⊕_{j=1}^n Nj be two persistence
modules, where the Mi and Nj are indecomposable submodules of M and N respectively. Let I =
{1, · · · , m} and J = {1, · · · , n}. We say M and N are δ-matched for δ ≥ 0 if there exists a partial matching
µ : I ↛ J so that (i) i ∈ I \ coim µ =⇒ Mi is 2δ-trivial, (ii) j ∈ J \ im µ =⇒ Nj is 2δ-trivial,
and (iii) i ∈ coim µ =⇒ Mi and Nµ(i) are δ-interleaved.
The bottleneck distance is defined as
db(M, N) := inf{δ ≥ 0 | M and N are δ-matched}.
Fact 12.2. dI ≤ db .
Definition 12.14 (Interval). An interval is a subset ∅ ≠ I ⊆ R̄d that satisfies the following:
1. if p, q ∈ I and p ≤ r ≤ q, then r ∈ I;
2. for any p, q ∈ I, there is a sequence p = p0, p1, . . . , pm = q of points of I in which every two consecutive points are comparable.
Let Ī denote the closure of an interval I in the standard topology of R̄d. The lower and upper
boundaries of I, denoted L(I) and U(I) respectively, are defined as
Let B(I) = L(I) ∪ U(I). According to this definition, R̄d is an interval with boundary B(R̄d)
consisting of all points with at least one coordinate equal to ±∞. The vertex set V(R̄d) consists of the 2^d
corner points with coordinates (±∞, · · · , ±∞).
[Figure: an interval decomposable module M = M1 ⊕ M2 ⊕ M3 with intervals IM1, IM2, IM3; the lower boundary L(IM1) and the upper boundary U(IM3) are marked.]
db(M, N) between two 1-parameter persistence modules M and N. This approach constructs
a bipartite graph G out of the intervals of M and N and their pairwise interleaving distances,
including the distances to zero modules. If these distance computations take O(C) time in total,
then the algorithm for computing db takes O(m^{5/2} log m + C) time, where M and N together have
m indecomposables altogether. Observe that the term m^{5/2} in the complexity comes from the
bipartite matching. Although this could be avoided in the 1-parameter case by taking advantage of
the two-dimensional geometry of persistence diagrams, we cannot do so here for determining a
matching among indecomposables according to Definition 12.13. Given the indecomposables (say,
computed by the algorithm in Chapter 11 or by Meataxe [251]), this approach readily extends
to d-parameter modules if one can compute the interleaving distance between any pair of
indecomposables, including the zero modules. To this end, we present an algorithm that computes
the interleaving distance between two 2-parameter interval modules Mi and Nj, with ti and tj
vertices respectively on their intervals, in O((ti + tj) log(ti + tj)) time. This gives a total time of
O(m^{5/2} log m + Σ_{i,j} (ti + tj) log(ti + tj)) = O(m^{5/2} log m + t^2 log t), where t is the total number of vertices
over all input intervals.
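The matching step of this approach can be sketched as follows, assuming the pairwise interleaving distances D[i][j] and, for each indecomposable, the smallest ε for which it is 2ε-trivial have been precomputed (hypothetical inputs). The brute-force matching test below is for illustration only; the approach in the text uses an O(m^{5/2}) bipartite matching instead:

```python
from itertools import permutations

def delta_matched(trivM, trivN, D, delta):
    """Test the delta-matching of Definition 12.13 by brute force
    (fine for a handful of indecomposables).  trivM[i] (trivN[j]) is the
    smallest eps such that M_i (N_j) is 2*eps-trivial; D[i][j] is the
    interleaving distance between M_i and N_j (assumed precomputed)."""
    m, n = len(trivM), len(trivN)
    slots = list(range(n)) + [None] * m        # None = leave M_i unmatched
    for assign in permutations(slots, m):
        ok = all((trivM[i] <= delta) if j is None else (D[i][j] <= delta)
                 for i, j in enumerate(assign))
        if ok and all(j in assign or trivN[j] <= delta for j in range(n)):
            return True
    return False

def bottleneck(trivM, trivN, D):
    """d_b as the smallest candidate delta admitting a delta-matching;
    since all constraints are thresholds, candidates are the input values."""
    cands = {0.0} | set(trivM) | set(trivN) | {d for row in D for d in row}
    for delta in sorted(cands):
        if delta_matched(trivM, trivN, D, delta):
            return delta
    return float('inf')
```

For instance, with one indecomposable on each side that are 0.2-interleaved, the bottleneck distance is 0.2 even when trivializing either summand alone would be more expensive.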
Now we focus on computing the interleaving distance between two given intervals. Given
intervals IM and IN with t vertices in total, the algorithm searches for a value δ such that there exist two families
of linear maps from M to N→δ and from N to M→δ respectively which satisfy both triangular and
square commutativity. The search is done by binary probing: for a δ chosen from a candidate
set of O(t) values, the algorithm determines the direction of the search by checking two conditions,
called trivializability and validity, on the intersections of the modules M and N.
Definition 12.18 (Intersection module). For two interval modules M and N with intervals IM and
IN respectively, let IQ = IM ∩ IN, which is a disjoint union of intervals IQi. The intersection
module Q of M and N is Q = ⊕ Qi, where Qi is the interval module with interval IQi. That is,

Qx = k if x ∈ IM ∩ IN and Qx = 0 otherwise; for x ≤ y, ρ^Q_{x→y} = 1 if x, y ∈ IM ∩ IN and ρ^Q_{x→y} = 0 otherwise.
From the definition we can see that the support of Q is supp(Q) = IM ∩ IN. We call each Qi an
intersection component of M and N. Write I := IQi and let φ : M → N be any morphism.
The following proposition says that φ is constant on I.

Proposition 12.8. For any morphism φ : M → N and any intersection component with interval I, the restriction φ|I is constant, that is, φx = φy for all x, y ∈ I.

Proof.
For any x, y ∈ I, consider a path (x = p0, p1, p2, . . . , p2m, p2m+1 = y) in I from x to y in which
consecutive points are comparable, and the commutative squares

    M_{pi} --1--> M_{pi+1}        M_{pi} <--1-- M_{pi+1}
      |              |              |              |
      | φ_{pi}       | φ_{pi+1}     | φ_{pi}       | φ_{pi+1}
      v              v              v              v
    N_{pi} --1--> N_{pi+1}        N_{pi} <--1-- N_{pi+1}

for pi ≤ pi+1 (left) and pi ≥ pi+1 (right) respectively. Observe that φ_{pi} = φ_{pi+1} in both cases due
to the commutativity. Inducting on i, we get φx = φy.
[Figure 12.6: intervals IM and IN and their intersection components IQ1, IQ2.]
[Figure 12.7: illustrations of dl, the diagonal projection πI, and trivializability on R̄2 with corners (±∞, ±∞); see the caption below.]
Figure 12.7: d = dl(x, I), y = πI(x), d′ = dl(x′, L(I)) (left); d = dl(x, I) and d′ = dl(x′, U(I)) are
defined on the left edge of B(R̄2) (middle); Q is d(M,N)- and d(N,M)-trivializable (right).
Definition 12.19 (Valid intersection). An intersection component Qi is (M, N)-valid if for each
x ∈ IQi the following two conditions hold (see Figure 12.6):
Proposition 12.9. Let {Qi } be a set of intersection components of M and N with intervals {IQi }.
Let {φ x } : M → N be the family of linear maps defined as φ x = 1 for all x ∈ IQi and φ x = 0
otherwise. Then φ is a morphism if and only if every Qi is (M, N)-valid.
Definition 12.20 (Diagonal projection and distance). Let I be an interval and x ∈ R̄d. Let ∆x =
{x + α~e | α ∈ R} denote the line, called the diagonal, with slope 1 that passes through x. We define (see
Figure 12.7)

dl(x, I) = min_{y∈∆x∩I} { d∞(x, y) := |x − y|∞ } if ∆x ∩ I ≠ ∅, and dl(x, I) = +∞ otherwise.

When ∆x ∩ I ≠ ∅, we write πI(x) for a point of ∆x ∩ I realizing this minimum and call it the diagonal projection of x onto I.
Note that for all α ∈ R we have ±∞ + α = ±∞. Therefore, for x ∈ V(R̄d), the line ∆x collapses to a
single point. In that case, dl(x, I) ≠ +∞ if and only if x ∈ I, which means πI(x) = x.
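As an illustration of Definition 12.20, the following sketch computes dl(x, I) in the hypothetical special case where I is a closed axis-aligned rectangle; general staircase intervals require walking their boundary chains instead:

```python
import math

def dl_rect(x, rect):
    """dl(x, I) for I = [a1, b1] x [a2, b2] (a simple special case of an
    interval).  Points of the diagonal through x = (x1, x2) are
    (x1 + a, x2 + a), and d_inf(x, x + a*(1,1)) = |a|."""
    (x1, x2), (a1, b1, a2, b2) = x, rect
    lo = max(a1 - x1, a2 - x2)     # feasible range of the diagonal parameter
    hi = min(b1 - x1, b2 - x2)
    if lo > hi:
        return math.inf            # the diagonal misses I
    if lo <= 0.0 <= hi:
        return 0.0                 # x itself lies in I
    return min(abs(lo), abs(hi))
```

The three return branches correspond exactly to the three situations pictured in Figure 12.7: the diagonal misses I, x already lies in I, or x projects onto the boundary of I.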
Notice that the upper and lower boundaries of an interval are themselves intervals by definition. With
this understanding, the following properties of dl follow readily from the definition.
Fact 12.3.
(i) For a point x in an interval I and δ ≥ 0, x + 2~δ ∈ I if and only if dl(x, U(I)) ≥ 2δ, and x − 2~δ ∈ I if and only if dl(x, L(I)) ≥ 2δ.
(ii) Let L = L(IM) or U(IM) and let x, x′ be two points such that πL(x), πL(x′) both exist. If x
and x′ are on the same facet or the same diagonal line, then |dl(x, L) − dl(x′, L)| ≤ d∞(x, x′).
Set V L(I) := V(I)∩L(I), EL(I) := E(I)∩L(I), VU(I) := V(I)∩U(I), and EU(I) := E(I)∩U(I).
Proposition 12.10. For an intersection component Q of M and N with interval I, the following
conditions are equivalent:
1. Q is (M, N)-valid.
Proposition 12.11. An intersection component Q is δ(M,N) -trivializable if and only if every vertex
of Q is δ(M,N) -trivializable.
Recall that for two modules to be δ-interleaved, we need two families of linear maps satisfying
both triangular commutativity and square commutativity. For a given δ, Theorem 12.14 below
provides criteria which ensure that such linear maps exist. The algorithm then verifies these criteria.
Given an interval module M and the diagonal line ∆x for any x ∈ R̄d, there is a 1-parameter
persistence module M|∆x, which is the functor M restricted to the poset ∆x viewed as a subcategory of R̄d.
We call it a 1-dimensional slice of M along ∆x. Define
δ∗ := sup_{x∈R̄d} { dI(M|∆x, N|∆x) }.
We have the following proposition and corollary from the equivalent characterizations of δ∗.
Proposition 12.12. For two interval modules M, N and any δ > δ∗ in R+, there exist two families of
linear maps φ = {φx : Mx → Nx+~δ} and ψ = {ψx : Nx → Mx+~δ} such that for each x ∈ R̄d, the
1-dimensional slices M|∆x and N|∆x are δ-interleaved by the linear maps φ|∆x and ψ|∆x.
Theorem 12.14. For two interval modules M and N, dI(M, N) ≤ δ if and only if the following
two conditions are satisfied:
(i) δ ≥ δ∗;
(ii) ∀δ′ > δ, each intersection component of M and N→δ′ is either (M, N→δ′)-valid or δ′_{(M,N→δ′)}-trivializable, and each intersection component of M→δ′ and N is either (N, M→δ′)-valid or δ′_{(N,M→δ′)}-trivializable.
‘only if’ direction: Suppose M and N are δ-interleaved. Part (i) follows directly from Corollary 12.13. For part (ii), by the definition of interleaving, for every δ′ > δ we have two families
of linear maps {φx} and {ψx} which satisfy both triangular and square commutativity. Let
φ = {φx} and ψ = {ψx} be the morphisms between the two persistence modules constituted by these
two families of linear maps. For each intersection component Q of M and N→δ′ with
interval I := IQ, consider the restriction φ|I. By Proposition 12.8, φ|I is constant, that is, φ|I ≡ 0
or φ|I ≡ 1. If φ|I ≡ 1, then by Proposition 12.9, Q is (M, N→δ′)-valid. If φ|I ≡ 0, then by the triangular commutativity of φ, we have ρ^M_{x→x+2~δ′} = ψ_{x+~δ′} ◦ φx = 0 for each point x ∈ I. That means x + 2~δ′ ∉ IM.
By Fact 12.3(i), dl(x, U(IM))/2 < δ′. Similarly, ρ^N_{x−~δ′→x+~δ′} = φx ◦ ψ_{x−~δ′} = 0 implies x − ~δ′ ∉ IN,
which is the same as saying x − 2~δ′ ∉ I_{N→δ′}. By Fact 12.3(i), dl(x, L(I_{N→δ′}))/2 < δ′. So for all x ∈ I,
we have d_triv^{(M,N→δ′)}(x) < δ′. This means Q is δ′_{(M,N→δ′)}-trivializable. A similar statement holds for the
intersection components of M→δ′ and N.
‘if’ direction: We construct two families of linear maps {φx}, {ψx} as follows: on the interval
I := IQi of each intersection component Qi of M and N→δ′, set φ|I ≡ 1 if Qi is (M, N→δ′)-valid
and φ|I ≡ 0 otherwise. Set φx = 0 for all x not in the interval of any intersection component.
Construct {ψx} similarly. Note that, by Proposition 12.9, φ := {φx} is a morphism between M
and N→δ′, and ψ := {ψx} is a morphism between N and M→δ′. Hence they satisfy square
commutativity. We show that they also satisfy triangular commutativity.

We claim that for all x ∈ IM, ρ^M_{x→x+2~δ′} = 1 implies x + ~δ′ ∈ IN, and that the similar statement holds for
IN. From the condition δ′ > δ ≥ δ∗ and Proposition 12.12, we know that there exist two
families of linear maps satisfying triangular commutativity everywhere, in particular on the pair of
1-parameter persistence modules M|∆x and N|∆x. From triangular commutativity, for every x ∈ IM with ρ^M_{x→x+2~δ′} = 1 we must have x + ~δ′ ∈ IN, since otherwise one could not construct a δ′-interleaving
between M|∆x and N|∆x. This proves the claim.

Now for each x ∈ IM with ρ^M_{x→x+2~δ′} = 1, we have dl(x, U(IM))/2 ≥ δ′ by Fact 12.3, and
x + ~δ′ ∈ IN by our claim. This implies that x ∈ IM ∩ I_{N→δ′} is a point in the interval of an intersection
component Qx of M and N→δ′ which is not δ′_{(M,N→δ′)}-trivializable. Hence Qx is (M, N→δ′)-valid by
the assumption. So, by our construction of φ on valid intersection components, φx = 1. Symmetrically,
x + ~δ′ ∈ IN ∩ I_{M→δ′} is a point in the interval of an intersection component of N
and M→δ′ which is not δ′_{(N,M→δ′)}-trivializable, since dl(x + ~δ′, L(IM))/2 ≥ δ′. So by our construction
of ψ on valid intersection components, ψ_{x+~δ′} = 1. Then we have ρ^M_{x→x+2~δ′} = ψ_{x+~δ′} ◦ φx for every nonzero linear map ρ^M_{x→x+2~δ′}. The same statement holds for any nonzero linear map ρ^N_{x→x+2~δ′}.
Note that the above proof provides a construction of the interleaving maps for any specific
δ′ for which they exist. Furthermore, the interleaving distance dI(M, N) is the infimum of all δ′ satisfying
the two conditions of the theorem; that is, dI(M, N) is the infimum of all δ′ ≥ δ∗ satisfying
condition (ii) of Theorem 12.14.
Definition 12.22 (Candidate set). For two interval modules M and N, and for each point x in
IM ∪ IN, let
D(x) = {dl(x, L(IM)), dl(x, L(IN)), dl(x, U(IM)), dl(x, U(IN))},
S = {d | d ∈ D(x) or 2d ∈ D(x) for some vertex x ∈ V(IM) ∪ V(IN)}, and
S≥δ := {d ∈ S | d ≥ δ}.
Algorithm 26 Interleaving(I M , IN )
Input:
I M and IN with t vertices in total
Output:
dI (M, N)
1: Compute the candidate set S and let ε be half of the smallest difference between any two
numbers in S. /* O(t) time */
2: Compute δ∗; let δ := δ∗. /* O(t) time */
3: Let δ∗ = δ0, δ1, · · · , δk be the numbers in S≥δ∗ in non-decreasing order. /* O(t log t) time */
4: ` := 0; u := k;
5: while ` < u /* O(log t) probes */ do
6: i := ⌊(u + `)/2⌋; δ := δi; δ′ := δ + ε;
In Algorithm 26: Interleaving, the following generic task of computing a diagonal span is performed
at several steps. Let L and U be any two chains of vertical and horizontal edges that are
both x- and y-monotone. Assume that L and U have at most t vertices. Then, for a set X of O(t)
points on L, one can compute the intersection of ∆x with U for every x ∈ X in O(t) total time. The
idea is to first find, by binary search, a point x ∈ X whose diagonal ∆x intersects U, if any such point exists. Then, for
the other points of X, traverse from x in both directions while searching for the intersections of the
diagonal lines with U in lockstep.
Now we analyze the complexity of the algorithm Interleaving. The candidate set, by definition,
has O(t) values, which can be computed in O(t) time by the diagonal span procedure.
By Proposition 12.15, δ∗ is in S and can be determined by computing the interleaving distances
dI(M|∆x, N|∆x) for modules indexed by diagonal lines passing through the O(t) vertices of IM and IN.
This can be done in O(t) time by the diagonal span procedure. Once we determine δ∗, we perform
a binary search (the while loop) with O(log t) probes for δ = dI(M, N) in the truncated set S≥δ∗
so as to satisfy the first condition of Theorem 12.14. Intersections between the two polygons IM and IN
bounded by x- and y-monotone chains can be computed in O(t) time by a simple traversal of the
boundaries. The validity and trivializability of each intersection component can be determined in
time linear in the number of its vertices by Proposition 12.10 and Proposition 12.11 respectively.
Since the total number of intersection points is O(t), the validity check takes O(t) time in total.
The check for trivializability also takes O(t) time using the diagonal span procedure. So the
total time complexity of the algorithm is O(t log t).
Proposition 12.15 below says that δ∗ is determined by a vertex in I M or IN and δ∗ ∈ S .
Proposition 12.15. (i) δ∗ = max x∈V(IM )∪V(IN ) {dI (M|∆x , N|∆x )}, (ii) δ∗ ∈ S .
The correctness of the algorithm Interleaving already follows from Theorem 12.14 as long
as the candidate set contains the distance dI (M, N). This is indeed true as shown in [141].
Theorem 12.16. dI (M, N) ∈ S .
Remark 12.1. Our main theorem and algorithm consider the persistence modules defined on R2 .
For a persistence module defined on a finite or discrete poset like Z2 , one can extend it to a persis-
tence module M on R2 to apply our theorem and algorithm. This extension is achieved by assum-
ing that all morphisms outside the given persistence module are isomorphisms and M x→−∞ = 0
if it is not given otherwise. The reader can draw the analogy between this extension and the one
we had for 1-parameter persistence modules (Remark 3.3).
The interleaving distance dI between
persistence modules is the best discriminating distance between modules having the property of
stability. It is straightforward to observe that dI ≤ db. For some special cases, results in the
reverse direction exist. Botnan and Lesnick [47] proved that, for the special class of 2-parameter
persistence modules called block decomposable modules, db ≤ (5/2) dI. The support of each indecomposable
in such modules consists of the intersection of a bounded or unbounded axis-parallel
rectangle with the upper half-plane supported by the diagonal line x1 = x2. Bjerkevik [32] improved
this result to db ≤ dI, thereby extending the isometry theorem dI = db to 2-parameter
block decomposable persistence modules.
Interestingly, a zigzag persistence module (Chapter 4) can be mapped to a block decompos-
able module [47]. Therefore, one can define an interleaving and a bottleneck distance between
two zigzag persistence modules by the same distances on their respective block decomposable
modules. Suppose that M1 and M2 denote the block decomposable modules corresponding to two
zigzag filtrations F1 and F2 respectively. Bjerkevik's result implies that db(Dgm_p(F1), Dgm_p(F2)) ≤
2 db(M1, M2) = 2 dI(M1, M2). The factor of 2 comes from the difference between how distances
to a null module are computed in the 1-parameter and 2-parameter cases. It is important to note that
the bottleneck distance db for persistence diagrams here takes the types of the bars into account, as
described in Section 4.3: while matching the bars for computing this distance, only
bars of the same type are matched.
A similar conclusion can also be derived for the bottleneck distance between the levelset per-
sistence diagrams of Reeb graphs. Mapping the 0-th levelset zigzag modules Z f , Zg of two Reeb
graphs (F, f ) and (G, g) to block decomposable modules M f and Mg respectively, one gets that
db (Dgm0 (Z f ), Dgm0 (Zg )) ≤ 2db (M f , Mg ) = 2dI (M f , Mg ). The interleaving distance dI (M f , Mg )
between block decomposable modules is bounded from above (not necessarily equal) by the in-
terleaving distance between Reeb graphs given by Definition 7.6, that is, dI (M f , Mg ) ≤ dI (F, G).
Bjerkevik also extended his result to rectangle decomposable d-parameter modules (whose indecomposables
are supported on bounded or unbounded rectangles). Specifically, he showed that
db ≤ (2d − 1) dI for rectangle decomposable d-parameter modules and db ≤ (d − 1) dI for free
d-parameter modules, and gave an example showing that the bound is tight for d = 2.
The multiparameter matching distance dm introduced in [71] provides a lower bound on the interleaving
distance [216]. This matching distance can be approximated within any error threshold by the
algorithms proposed in [29, 72], but it cannot provide an upper bound the way db does. The algorithm
for computing dm exactly as presented in Section 12.3 is taken from [207]. The complexity of
this algorithm is rather high. To address this issue, an approximation algorithm with better time
complexity has been proposed in [209], which builds on the result in [29].
For free, block, rectangle, and triangular decomposable modules, one can compute db by
computing the pairwise interleaving distances between indecomposables in constant time, because
they have descriptions of constant complexity. Due to the results mentioned earlier, dI can be
estimated within a constant or dimension-dependent factor by computing db for these modules. On
the other hand, Botnan and Lesnick [47] observed that, even for interval decomposable modules,
db cannot approximate dI within any constant factor.
Bjerkevik et al. [33] showed that computing the interleaving distance for 2-parameter interval
decomposable persistence modules as considered in this chapter is NP-hard. Worse, it cannot be
approximated within a factor of 3 in polynomial time. In this context, the fact that db does not
approximate dI within any factor for 2-parameter interval decomposable modules [47] turns out
to be a boon in disguise, because otherwise the polynomial-time algorithm for computing db
presented in Section 12.4 would yield a polynomial-time approximation of dI, contradicting the
hardness result. This algorithm is taken from [141], whose extension to multiparameter
persistence modules is available on arXiv.
Exercises
1. Show that dI and db are pseudo-metrics on the space of finitely generated multiparameter
persistence modules. Show that if the grades of generators and relations of the modules do
not coincide, both become metrics.
2. Give an example of two persistence modules M and N for which dm (M, N) = 0 but
dI (M, N) , 0.
3. Prove dI ≤ db and dm ≤ dI .
5. The algorithm MatchDist computes dm in O(n^11) time, where n is the total number of generators
and relations with which the input modules are described. Design an algorithm for
computing dm that runs in o(n^11) time.
6. Consider the matching distance dm between two interval modules. Compute dm in this case
in O(n^4) time.
Machine learning (ML) has been a prevailing technique for data analysis. Naturally, researchers
in the past few years have explored ways to combine machine learning techniques with
TDA techniques. In previous chapters we introduced various topological structures and
algorithms for computing them. In this chapter, we give two examples of combining topological
ideas with machine learning approaches. Note that this chapter is not intended to be a survey of
such TDA+ML approaches, given that this is a very active and rapidly evolving field.
We have seen that persistent homology, in some sense, encodes the “shape” of data. Thus, it
is natural to use persistent homology to map potentially complex input data (e.g., a point set or a
graph) to a feature representation (a persistence diagram). In particular, a simple persistence-based
feature vectorization and data analysis framework can be as follows: Given a collection C of
objects (e.g., a set of images, a collection of graphs, etc.), apply the persistent homology to map
each object to a persistence-diagram representation. Thus, objects in the input collection are now
mapped to a set of points in the space of persistence diagrams. Different types of input data can
all be now mapped to a common feature space: the space of persistence diagrams. Equipping this
space with appropriate metric structures, one can then carry out downstream data analysis tasks on
C in the space of persistence diagrams. In Section 13.1, we further elaborate on this framework, by
describing several methods to assign a nice metric or kernel on the space of persistence diagrams.
One way to further incorporate topological information into a machine learning framework is
by using a “topological loss function”. In particular, as topology provides a language to describe
global properties of a space, it can help the machine learning task at hand by allowing one to
inject topological constraints or priors. This usually leads to optimizing a “topological function”
over certain persistence diagrams. An example is given in Figure 13.1, taken from [199], where
a term representing the topological quality of the output segmented images is added to
the loss function to help improve the topology of the segmented foregrounds. In Section 13.2, we
give another example of using a “topological function” and describe how to address the key
challenge of differentiating such a topological loss function when it involves persistent-homology-based
information.
In this book, we have focused mainly on the mathematical structures and algorithmic/computational
aspects of TDA. However, we note that there have been important developments in
Figure 13.1: The high level neural network framework, where topological information of the
segmented image (captured via persistent homology) is used to help train the neural network for
better segmentation; reprinted by permission from Xiaoling Hu et al. (2019, fig. 2)[199].
statistical treatment of topological summaries, which is crucial for quantifying uncertainty,
noise, and convergence of topological summaries computed from sampled data. In concluding
this book, we provide a very brief description of some of these developments in Section 13.3 at
the end of this chapter. Interested readers can follow the references given within that section for
further details.
Figure 13.2: (a) A persistence diagram D and its corresponding landscape functions are in (b),
where λk := λD (k, ·) for k = 1, 2, and 3.
Definition 13.1 (Persistence landscape). Given a finite persistence diagram D = {(bi, di)}i∈[1,n]
from D, the persistence landscape w.r.t. D is the function λD : N × R → R where
λD(k, t) := kth largest value of { min(t − bi, di − t)+ }i∈[1,n], with c+ := max(c, 0).
For a fixed k, λD(k, ·) : R → R is a function on R. In particular, one can think of each persistent
point (bi, di) as giving rise to a triangle whose upper boundary is traced out by the points (t, min(t − bi, di − t)+);
see Fig. 13.2. There are n such triangles, and the function λD(k, ·) is the k-th upper envelope in the
arrangement formed by the union of these triangles, which intuitively consists of the points on the boundary
of the k-th layer of these triangles.
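Evaluating λD pointwise is straightforward; a minimal sketch follows, using the diagram of Figure 13.2 as a check:

```python
def landscape(diagram, k, t):
    """lambda_D(k, t): the k-th largest (1-indexed) of the triangle
    values min(t - b, d - t)_+ over the points (b, d) of the diagram."""
    vals = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram),
                  reverse=True)
    return vals[k - 1] if k <= len(vals) else 0.0

# The persistence diagram of Figure 13.2: {(1, 4), (2, 6), (3, 5)}.
D = [(1, 4), (2, 6), (3, 5)]
```

At t = 4, for example, the three triangles contribute the values 0, 2, and 1, so λD(1, 4) = 2 and λD(2, 4) = 1, matching the layered envelopes in Figure 13.2(b).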
The persistence landscape maps persistence diagrams into a linear function space. The
p-norm on persistence landscapes is defined as

‖λD‖p^p = Σ_{k=1}^∞ ‖λD(k, ·)‖p^p.
Note that for any k > n, λ_D(k, ·) ≡ 0. One can recognize that persistence landscapes of finite persistence diagrams lie in the so-called L^p-space L^p(N × R)¹. If p = 2, then this is a Hilbert space. Given a set of persistence diagrams, one can compute their mean or carry out other statistical analysis in L^p(N × R). For example, given a set of ℓ finite diagrams D_1, …, D_ℓ ∈ D, one can define the mean landscape λ̄ of their corresponding landscapes λ_{D_1}, …, λ_{D_ℓ} to be
λ̄(k, t) = (1/ℓ) Σ_{i=1}^ℓ λ_{D_i}(k, t).
¹For 1 ≤ p < ∞, one first sets 𝓛^p(X) := { f : X → R | ‖f‖_p < +∞ }; for example, 𝓛^2(R^d) contains the standard square-integrable functions on R^d. Then L^p(X) is defined as the quotient L^p(X) := 𝓛^p(X)/∼, where f ∼ g if ‖f − g‖_p = 0.
The following claim states that the map from the space of finite persistence diagrams D to the
space of persistence landscapes is injective, and this map is lossless in terms of the information
encoded in the persistence diagram.
Claim 13.1. Given a persistence diagram D, let λD be its persistence landscape. Then from λD
one can uniquely recover the persistence diagram D.
However, a function λ : N × R → R may not be the image of any valid persistence diagram.
For example, the mean landscape introduced above may not be the image of any persistence
diagram.
Finally, in addition to being injective, under appropriate norms, the map from persistence
diagram to persistence landscape is also stable (1-Lipschitz w.r.t. the bottleneck distance between
persistence diagrams):
Theorem 13.1. For persistence diagrams D and D′, Λ_∞(D, D′) ≤ d_B(D, D′).

Additional stability results for Λ_p are given in [51], relating it to the p-th Wasserstein distance for persistence diagrams, or to the case where the persistence diagrams are induced by tame Lipschitz functions.
Now, given a set X and a Hilbert space H of real-valued functions on X, the evaluation functional over H is a linear functional that evaluates each function f ∈ H at a point x: that is, given x, L_x : H → R is defined as L_x(f) = f(x) for any f ∈ H. The Hilbert space H is called a Reproducing Kernel Hilbert Space (RKHS) if L_x is continuous for all x ∈ X. It is known that given a positive semi-definite kernel k, there is a unique RKHS H_k such that k(x, y) = ⟨k(·, x), k(·, y)⟩_{H_k}. We call k the reproducing kernel for H_k. From now on, we simply use “kernel” to refer to a positive semi-definite kernel. See [271] for more detailed discussions of kernels, RKHS, and related concepts.
Equivalently, a kernel can be thought of as the inner product k(x, y) = ⟨Φ(x), Φ(y)⟩_H after mapping X to some Hilbert space H via a feature map Φ : X → H. With this inner product, one can further induce a pseudo-metric by:

d_k(x, y) := ‖Φ(x) − Φ(y)‖_H = √( k(x, x) − 2k(x, y) + k(y, y) ).
Many machine learning pipelines directly use kernels and their associated inner-product structure. The work of [263] constructs the following persistence scale space kernel (PSSK) by defining an explicit feature map. Let Ω = { x = (x_1, x_2) ∈ R² | x_2 ≥ x_1 } denote the subspace of R² on or above the diagonal³. Recall the L²-space L²(Ω), which is a Hilbert space.
Definition 13.3 (Persistence scale space kernel (PSSK)). Define the feature map Φ_σ : D → L²(Ω) at scale σ > 0 as follows: for a persistence diagram D ∈ D and x ∈ Ω, set:

Φ_σ(D)(x) = (1/(4πσ)) Σ_{y∈D} [ e^{−‖x−y‖²/(4σ)} − e^{−‖x−ȳ‖²/(4σ)} ],

where ȳ = (y_2, y_1) if y = (y_1, y_2) (i.e., ȳ is the reflection of y across the diagonal). This feature map induces the following persistence scale space kernel (PSSK) k_σ : D × D → R using the inner product structure on L²(Ω): given two diagrams D, E ∈ D,

k_σ(D, E) = ⟨Φ_σ(D), Φ_σ(E)⟩_{L²(Ω)} = (1/(8πσ)) Σ_{y∈D, z∈E} [ e^{−‖y−z‖²/(8σ)} − e^{−‖y−z̄‖²/(8σ)} ].    (13.2)
In other words, a persistence diagram is now mapped to a function Φ_σ(D) : Ω → R under the feature map Φ_σ. By construction, the PSS kernel is positive definite. Now consider the distance induced by the PSS kernel,

d_σ(D, E) := ‖Φ_σ(D) − Φ_σ(E)‖_{L²(Ω)}.
This distance is stable in the sense that the feature map Φσ is Lipschitz w.r.t. the 1-Wasserstein
distance:
Theorem 13.2. Given two persistence diagrams D, E ∈ D, we have

‖Φ_σ(D) − Φ_σ(E)‖_{L²(Ω)} ≤ (1/(2πσ)) · d_{W,1}(D, E).
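Eqn. (13.2) gives the kernel in closed form, so it can be evaluated directly from the two diagrams without discretizing Φ_σ. A minimal sketch (our own naming, not the authors' implementation):

```python
import numpy as np

def pssk(D, E, sigma):
    """Persistence scale space kernel k_sigma(D, E), per Eqn. (13.2).

    D, E: arrays of shape (n, 2) of (birth, death) points on or above
    the diagonal.
    """
    D, E = np.atleast_2d(D).astype(float), np.atleast_2d(E).astype(float)
    total = 0.0
    for y in D:
        for z in E:
            zbar = z[::-1]  # reflection of z across the diagonal
            total += (np.exp(-np.sum((y - z) ** 2) / (8 * sigma))
                      - np.exp(-np.sum((y - zbar) ** 2) / (8 * sigma)))
    return total / (8 * np.pi * sigma)
```

Since reflecting both arguments across the diagonal is an isometry, the kernel is symmetric: pssk(D, E, σ) = pssk(E, D, σ).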
³Often in the literature, one assumes that standard persistent homology is considered, where the birth time is smaller than or equal to the death time in the filtration. Several of the kernels introduced here, including PSSK and PWGK, assume that persistence diagrams lie in Ω.
The persistence image is a discretization of the persistence surface. Specifically, fix a grid on a rectangular region in the plane with a collection P of N rectangles (pixels). The persistence image of a persistence diagram D is I_D = { I_D[p] }_{p∈P}, which consists of N numbers (i.e., a vector in R^N), one for each pixel p in the grid P, with

I_D[p] := ∫_p ρ_D dx dy.
We remark that the weight function ω in constructing the persistence surface allows points in the persistence diagram to have different contributions to the final representation. A natural choice of ω(u) is the persistence |b − d| of the point u = (b, d).
The persistence image can be viewed as a vector in R^N. One could then compute the distance between two persistence diagrams D and E as the L²-distance ‖I_D − I_E‖₂ between their persistence images (vectors) I_D and I_E. Other L^p-norms can also be used.
Persistence images are shown to be stable w.r.t. the 1-Wasserstein distance between persistence diagrams [4]. As an example, below we state the stability result for the special case where the persistence surfaces are generated using the normalized Gaussian distribution φ_u : R² → R defined via φ_u(z) = (1/(2πσ²)) e^{−‖z−u‖₂²/(2σ²)} for any z ∈ R². See [4] for stability results for the general cases.
Theorem 13.3. Suppose persistence images are computed with the normalized Gaussian distribution with variance σ² and a weight function ω : R² → R. Then the persistence images are stable w.r.t. the 1-Wasserstein distance between persistence diagrams. More precisely, given two finite and bounded persistence diagrams D and E, we have:

‖I_D − I_E‖₁ ≤ ( √5 · |∇ω| + √(10/π) · ‖ω‖_∞ / σ ) · d_{W,1}(D, E).
Here, ∇ω stands for the gradient of ω, and |∇ω| = sup_{z∈R²} ‖∇ω(z)‖₂ is the maximum norm of the gradient vector of ω over all points in R². The same upper bound holds for ‖I_D − I_E‖₂ and ‖I_D − I_E‖_∞ as well.
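As a concrete illustration, the following sketch (our own; the per-pixel integral is approximated by the midpoint rule rather than computed exactly) builds a persistence image on a regular grid using the normalized Gaussian φ_u and the persistence weight ω(u) = |b − d| mentioned above:

```python
import numpy as np

def persistence_image(diagram, grid_x, grid_y, sigma, weight):
    """Persistence image I_D on a regular pixel grid (a sketch).

    The integral of the persistence surface over each pixel is
    approximated by the surface value at the pixel center times the
    pixel area.  `weight` maps a diagram point (b, d) to its weight.
    """
    xs = 0.5 * (grid_x[:-1] + grid_x[1:])        # pixel centers
    ys = 0.5 * (grid_y[:-1] + grid_y[1:])
    area = (grid_x[1] - grid_x[0]) * (grid_y[1] - grid_y[0])
    X, Y = np.meshgrid(xs, ys, indexing="ij")
    img = np.zeros_like(X)
    for (b, d) in diagram:
        # Normalized Gaussian phi_u centered at u = (b, d).
        g = np.exp(-((X - b) ** 2 + (Y - d) ** 2) / (2 * sigma ** 2))
        img += weight((b, d)) * g / (2 * np.pi * sigma ** 2)
    return img * area                            # one number per pixel

# The natural persistence weight mentioned above.
pers_weight = lambda u: abs(u[1] - u[0])
```

When the grid comfortably covers the support of the Gaussians, the pixel values sum to approximately the total weight of the diagram points.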
This feature map induces the following persistence weighted kernel (PWK) K_{k,ω} : D × D → R:

K_{k,ω}(D, E) = ⟨Ψ_{k,ω}(D), Ψ_{k,ω}(E)⟩_{H_k} = Σ_{x∈D, y∈E} ω(x) ω(y) k(x, y).    (13.4)
The intuition behind the above feature map is as follows: a persistence diagram D can be viewed as a discrete measure μ_D^ω := Σ_{x∈D} ω(x) δ_x, where ω : R² → R is a weight function and δ_x is the Dirac measure at x. (Similar to persistence images, the use of the weight function ω allows different points in the birth–death plane to have different influence.) The map Ψ_{k,ω}(D) is essentially the kernel mean embedding of distributions (with persistence diagrams viewed as discrete measures) into the RKHS. It is known that if the kernel k is C₀-universal, then this embedding is in fact injective [283], and hence the resulting induced distance ‖Ψ_{k,ω}(D) − Ψ_{k,ω}(E)‖_{H_k} is a proper metric (instead of a pseudo-metric).
The weighted kernel k_ω(x, y) := ω(x) ω(y) k(x, y) is still positive semi-definite for a strictly positive weight function ω : Ω → R⁺. Let H_{k_ω} denote its associated RKHS. Then the map

Ψ_{k_ω}(D) := Σ_{x∈D} ω(x) ω(·) k(·, x)

defines a valid feature map Ψ_{k_ω} : D → H_{k_ω} into the RKHS H_{k_ω}. It is shown in [215] that the inner product K_{k_ω} : D × D → R induced by this feature map equals the inner product in Eqn. (13.4):

K_{k_ω}(D, E) = ⟨Ψ_{k_ω}(D), Ψ_{k_ω}(E)⟩_{H_{k_ω}} = Σ_{x∈D, y∈E} ω(x) ω(y) k(x, y) = K_{k,ω}(D, E).    (13.5)
Persistence weighted Gaussian kernel (PWGK). There are different choices for the weight function ω and the kernel k. For example, given a persistence point x = (b, d), let pers(x) = |d − b|. Then we can set the weight function to be ω_arc(x) := arctan(C · pers(x)^p) for constants C, p > 0, and take k to be a Gaussian kernel k_G; the resulting kernel is called the persistence weighted Gaussian kernel (PWGK). Stability results of the induced distance w.r.t. both the bottleneck distance d_B and the 1-Wasserstein distance d_{W,1} on persistence diagrams are shown in [215], with bounds depending on the weight function ω and the kernel k_G. The precise statements are somewhat involved, so we omit the details here. We remark that stability w.r.t. the bottleneck distance is provided, which is usually harder to obtain than stability w.r.t. the Wasserstein distance for such vectorizations of persistence diagrams.
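With the Gaussian kernel k_G and the weight ω_arc(x) = arctan(C · pers(x)^p) of [215] (the defaults for C, p, σ and the function name here are ours), the PWGK of Eqn. (13.4) can be evaluated directly:

```python
import numpy as np

def pwgk(D, E, sigma=1.0, C=1.0, p=1.0):
    """Persistence weighted Gaussian kernel K_{k,omega}(D, E) of
    Eqn. (13.4), with the Gaussian kernel k_G and the weight
    omega_arc(x) = arctan(C * pers(x)^p) (a sketch)."""
    def w(x):
        return np.arctan(C * abs(x[1] - x[0]) ** p)
    total = 0.0
    for x in D:
        for y in E:
            kxy = np.exp(-((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)
                         / (2 * sigma ** 2))
            total += w(x) * w(y) * kxy
    return total
```

Note that K(D, D) = ‖Ψ_{k,ω}(D)‖²_{H_k} ≥ 0, as expected of a kernel evaluated on identical arguments.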
Finally, now that the persistence diagrams are embedded in an RKHS, one can directly use the associated inner product and kernel in machine learning pipelines. One can also further put another kernel on top of the RKHS representation of persistence diagrams. Indeed, the persistence weighted kernel in Eqn. (13.4) is equivalent to putting a linear kernel on the RKHS H_k. We can also consider using a non-linear kernel, say a Gaussian kernel, on the RKHS H_k, and obtain yet another kernel on persistence diagrams, called the (k, ω)-Gaussian kernel⁴:

K^G_{k,ω}(D, E) = exp( −‖Ψ_{k,ω}(D) − Ψ_{k,ω}(E)‖²_{H_k} / (2τ²) ).
Theorem 13.4. Given X and φ : X × X → R, the kernel φ is negative semi-definite if and only if e^{−tφ} is positive semi-definite for all t > 0.

In what follows, we construct the so-called Sliced Wasserstein distance d_SW for persistence diagrams, which is shown to be negative semi-definite. We then use it to construct the Sliced Wasserstein kernel following the above theorem.
Specifically, let μ, ν be two (unnormalized) non-negative measures on the real line such that the total mass μ(R) equals ν(R), and they are bounded. Let Π(μ, ν) denote the set of measures on R² with marginals μ and ν. Consider

W(μ, ν) = inf_{P∈Π(μ,ν)} ∬_{R×R} |x − y| · dP(x, y),    (13.6)

which is simply the 1-Wasserstein distance between the measures μ and ν. In the following definition, S¹ denotes the unit circle in the plane.
Definition 13.6 (Sliced Wasserstein distance). Given a unit vector θ ∈ S¹ ⊆ R², let L(θ) denote the line {λθ | λ ∈ R}, and let π_θ : R² → L(θ) be the orthogonal projection of the plane onto L(θ). Given two persistence diagrams D and E, set μ_D^θ := Σ_{p∈D} δ_{π_θ(p)} and μ_{D_∆}^θ := Σ_{p∈D} δ_{π_θ∘π_∆(p)}, where π_∆ : R² → ∆ is the orthogonal projection onto the diagonal ∆ = {(x, x) | x ∈ R}. Define μ_E^θ and μ_{E_∆}^θ in a symmetric manner. Then the Sliced Wasserstein distance between D and E is defined as:

d_SW(D, E) := (1/(2π)) ∫_{S¹} W( μ_D^θ + μ_{E_∆}^θ, μ_E^θ + μ_{D_∆}^θ ) dθ.
⁴The work of [215] also sometimes refers to K^G_{k_G, ω_arc} as the persistence weighted Gaussian kernel.
⁵In [25], the use of “positive (negative) definite kernel” matches our “positive (negative) semi-definite kernel”.
In the above definition, the sums μ_D^θ + μ_{E_∆}^θ and μ_E^θ + μ_{D_∆}^θ ensure that the resulting two measures have the same total mass.
Proposition 13.5. dS W is negative semi-definite on D where D is the space of bounded and finite
persistence diagrams.
Combining the above proposition with Theorem 13.4, we can now define the positive semi-definite Sliced Wasserstein kernel k_SW on D as:

k_SW(D, E) := e^{−d_SW(D,E)/(2σ²)}, for σ > 0.    (13.7)
The Sliced Wasserstein distance is not only stable, but also strongly equivalent to the 1-Wasserstein distance d_{W,1} on bounded persistence diagrams in the following sense:

Theorem 13.6. Let D_N be the set of bounded persistence diagrams with cardinalities at most N. For any D, E ∈ D_N, one has:

d_{W,1}(D, E) / (4N(4N − 1) + 2) ≤ d_SW(D, E) ≤ 2√2 · d_{W,1}(D, E).
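Since each projected measure in Definition 13.6 is a finite sum of |D| + |E| Dirac masses, the inner 1-Wasserstein distance on the line reduces to an L1 distance between sorted projections, and the integral over S¹ can be approximated by sampling directions — a standard approximation; this sketch and its parameter names are ours:

```python
import numpy as np

def sliced_wasserstein(D, E, n_dirs=100):
    """Approximate d_SW(D, E) by averaging over n_dirs directions.

    W(theta) is invariant under theta -> theta + pi, so averaging over
    [0, pi) equals the normalized integral (1/2pi) over the circle.
    """
    D, E = np.asarray(D, float), np.asarray(E, float)
    # Diagonal projections: both coordinates become the midpoint.
    proj_diag = lambda P: np.column_stack([(P[:, 0] + P[:, 1]) / 2] * 2)
    A = np.vstack([D, proj_diag(E)])   # supports mu_D + mu_{E_Delta}
    B = np.vstack([E, proj_diag(D)])   # supports mu_E + mu_{D_Delta}
    total = 0.0
    for th in np.pi * np.arange(n_dirs) / n_dirs:
        u = np.array([np.cos(th), np.sin(th)])
        # 1-D Wasserstein between equal-size point sets: sort and sum.
        total += np.sum(np.abs(np.sort(A @ u) - np.sort(B @ u)))
    return total / n_dirs
```

In practice (e.g. in the original sliced Wasserstein kernel work) a modest number of directions already gives a good approximation, since W(θ) varies smoothly with θ.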
This function is similar to the persistence surface used in [4]. Recall that ∆ denotes the diagonal
in the plane. Given a diagram D, let D∆ := {π∆ (u) | u ∈ D} where π∆ denotes the orthogonal
projection onto the diagonal ∆.
Definition 13.7 (Persistence Fisher (PF) kernel). Given two persistence diagrams D, E, the Fisher information metric between their corresponding persistence surfaces μ_D and μ_E is defined as:

d_FIM(D, E) := d_FIM(μ_{D∪E_∆}, μ_{E∪D_∆}) = arccos( ∫_{R²} √( μ_{D∪E_∆}(x) · μ_{E∪D_∆}(x) ) dx ).
The Persistence Fisher (PF) kernel for persistence diagrams is then defined as k_PF(D, E) := e^{−t·d_FIM(D,E)} for a scale parameter t > 0.
Proposition 13.7. The function (d_FIM − τ) is negative definite on the set of bounded and finite persistence diagrams D for any τ ≥ π/2.

By the above result and Theorem 13.4, we have that e^{−t(d_FIM−τ)} is positive definite for t > 0 and τ ≥ π/2. Furthermore, by definition, we can rewrite the Persistence Fisher kernel as:

k_PF(D, E) = e^{−t·d_FIM(D,E)} = α · e^{−t·(d_FIM(D,E)−τ)}, where τ ≥ π/2 and α = e^{−tτ} > 0.
As α > 0 is a fixed constant, it then follows that k_PF is positive definite as well.
The work of [218] provides an interesting analysis of the eigensystem of the integral operator induced by k_PF. Furthermore, both the persistence Fisher kernel and the Sliced Wasserstein kernel are infinitely divisible, which can bring computational advantages when using them in kernel machines. The PSS and PWG kernels do not have this property.
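A discretized sketch of k_PF (our own simplification): each augmented diagram is smoothed by Gaussians of bandwidth σ into a probability vector on a common grid — following the spirit of the smoothed-measure construction in [218] — and d_FIM is the arccos of the resulting Bhattacharyya coefficient:

```python
import numpy as np

def fisher_kernel(D, E, sigma=0.5, t=1.0, grid_n=50):
    """Persistence Fisher kernel k_PF = exp(-t * d_FIM) (a sketch)."""
    D, E = np.asarray(D, float), np.asarray(E, float)
    diag = lambda P: np.column_stack([(P[:, 0] + P[:, 1]) / 2] * 2)
    A = np.vstack([D, diag(E)])        # supports D together with E_Delta
    B = np.vstack([E, diag(D)])        # supports E together with D_Delta
    pts = np.vstack([A, B])
    lo, hi = pts.min() - 3 * sigma, pts.max() + 3 * sigma
    g = np.linspace(lo, hi, grid_n)
    X, Y = np.meshgrid(g, g)
    grid = np.column_stack([X.ravel(), Y.ravel()])
    def density(P):                    # normalized Gaussian mixture on grid
        d2 = ((grid[:, None, :] - P[None, :, :]) ** 2).sum(-1)
        rho = np.exp(-d2 / (2 * sigma ** 2)).sum(1)
        return rho / rho.sum()
    bc = np.sum(np.sqrt(density(A) * density(B)))   # Bhattacharyya coeff.
    return np.exp(-t * np.arccos(min(bc, 1.0)))
```

When D = E the two smoothed measures coincide, the coefficient is 1, and the kernel value is 1.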
We remark that other vectorization approaches for persistence diagrams have also been developed. Very recently, there have also been several pieces of work on learning the representation of persistence diagrams in an end-to-end manner using labelled data. We will mention some of these works in the bibliographical notes later.
See Fig. 13.3 (a) for an example, where the classification boundary S f consists of the U-shaped
curve and two closed loops.
Figure 13.3: (a) Red curve is the classification boundary S f . (b) shows the graph of the classifier
function f , with S f (the level set at value 0) marked in red. (c) Pushing the saddle q1 down to
remove this left component in S f as shown in (d). (Image taken from [96]).
The classifier may have unnecessary details that over-fit the input data, and one way to address this is via regularizing (constraining) properties of f (e.g., requiring that it be smooth). The work of [96] proposed to regularize the “topological simplicity” of a classifier. In the example of Fig. 13.3 (a), there are three components (0-th homological features) in S_f. To develop a notion of “topological complexity” of the classification boundary, it is desirable to quantify the “robustness” of these topological features. To do so, we will need to use the information of the entire classifier function f beyond just the 0-level set; see Figure 13.3 (b). Notice that, while the two small components in S_f are of similar size, intuitively it takes less perturbation of the classifier function f to remove the left component. In particular, one could push down the saddle point q1 so that this component is merged with the large component in the level set S_f, thus reducing the 0-th Betti number of S_f; see Figure 13.3 (c) and (d). The perturbation required to do so, in terms of the maximum change in function values, is less than what is required for pushing q2 or p2 to remove the right component.
Hence the “robustness” of features within the level set S_f depends on information of f beyond just S_f. To this end, one can do the following: let Dgm f be the levelset zigzag persistence diagram of f. Set

Π_{S_f} := { (b, d) ∈ Dgm f | b ≤ 0 ≤ d }.
Figure 13.4: (a) A function f : R → R. Its persistence pairings (of critical points) are marked by the dotted curves: {(x1, x6), (x2, x5), (x3, x4), …}. The corresponding persistence diagram is shown in (b). The set Π_{S_f} consists of all points within the red rectangle; that is, Π_{S_f} = {(f1, f6), (f2, f5), …}, where f_i = f(x_i) for i ∈ [1, 6]. Note that (f3, f4) is not in Π_{S_f} as the interval [f3, f4] does not contain 0. (Image taken from [96].)
See Figure 13.4 for an illustration. Intuitively, points in Π_{S_f} are those persistent features whose life-time passes through the 0-level set S_f. There is a one-to-one correspondence between the “topological features” in S_f and the points in Π_{S_f} (this can be made more precise via the persistent cycles concept introduced in Definition 5.7 of Chapter 5), and one can view a point (b, d) ∈ Π_{S_f} as the life-time of its corresponding feature in the 0-level set S_f. The robustness of the feature corresponding to a point c = (b, d) is then defined as ρ(c) = min{|b|, |d|}. Intuitively, this is the least amount of function perturbation, in terms of the L_∞ norm, needed to remove this feature from S_f (i.e., to push the persistent point c out of the set Π_{S_f}). One can then define the topological complexity (topological penalty) of the classifier f as

L_topo(f) := Σ_{c∈Π_{S_f}} ρ²(c).
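Computing L_topo from a levelset persistence diagram is then only a few lines (a sketch with our naming): keep the points of Π_{S_f}, i.e. those whose bar [b, d] straddles 0, and sum the squared robustness values.

```python
def topo_penalty(diagram):
    """Topological penalty L_topo(f) from the levelset persistence
    diagram of the classifier f (a sketch).

    Keeps the points whose bar [b, d] contains 0 (the set Pi_{S_f})
    and sums the squared robustness rho(c) = min(|b|, |d|).
    """
    pi_sf = [(b, d) for (b, d) in diagram if b <= 0 <= d]
    return sum(min(abs(b), abs(d)) ** 2 for (b, d) in pi_sf)
```

For the diagram {(−3, 2), (−1, 4), (1, 2)}, only the first two points straddle 0, giving L_topo = 2² + 1² = 5.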
In practice, suppose for example that we are in the supervised setting where we are given a set of points X_n = {x_1, …, x_n} with class labels {y_1, …, y_n}, and assume the classifier f_ω is parameterized by ω ∈ R^m. We can combine the topological penalty L_topo with any standard loss function to define a final loss function, for example,

L(f_ω, X_n) = Σ_{x_i∈X_n} ℓ(f_ω(x_i), y_i) + λ · L_topo(f_ω),    (13.8)

where the first term represents the standard loss, and ℓ(·, ·) could be the cross-entropy loss, hinge loss, and so on.
Finally, to optimize L(f_ω, X_n) w.r.t. ω (so as to learn the best classifier f_ω), we can perform (stochastic) gradient descent, and thus need to compute the gradient of L_topo(f_ω). To this end, we approximate the domain X by taking a certain simplicial complex K spanned by the samples X_n. In [96], only the 0-th topological information of the classification boundary S_f is used, hence one only needs the 1-skeleton of K; in the implementation of [96], that is simply taken to be the k-nearest neighbor graph spanned by the input samples X_n. One then uses the approach described shortly in Section 13.2.2 below to compute gradients of this loss function, which is a persistence-based topological function.
To optimize the topological function T(ω), one may need to compute the gradient of T w.r.t. the parameter ω. Applying the chain rule, this means that one needs to be able to compute ∂b_i/∂ω and ∂d_i/∂ω for certain points (b_i, d_i) in the persistence diagram. (Terms such as ∂T/∂b_i can be computed easily if the analytic form of T w.r.t. the b_i's and d_i's is given; again, consider L_topo(f) from the previous section as an example.) Intuitively, this requires the “inverse” of the map which sends f_ω to its persistence diagram Dgm f_ω. This inverse in general does not exist. However, assuming that f_ω is a PL function defined on K, it turns out one can map the b_i's and d_i's back to vertices of K, and this map is locally constant if all vertices of K have distinct function values.
More specifically, suppose Dgm f is generated by the persistent homology of the sublevel set filtration induced by f. Recall from Section 3.5.2 that, from the algorithmic point of view, the sublevel set filtration is simulated by the so-called lower-star filtration of K. Using notations from Section 3.5.2, let V_i be the first i vertices of V, sorted in non-decreasing order of their function values, and K_i = ∪_{j≤i} Lst(v_j) the set of all simplices spanned by vertices in V_i (i.e., by vertices whose function value is at most f(v_i)). The sublevel set filtration F is constructed by adding v_i and all simplices in its lower star in increasing order of i; recall Eqn. (3.10). Furthermore, recall (Theorem 3.16) that each persistent point in the diagram Dgm f is in fact of the form (b_i, d_i) = (f(v_{ℓ_i}), f(v_{r_i})) such that the pairing function μ_f^{ℓ_i,r_i} > 0, and the vertices v_{ℓ_i} and v_{r_i} are both homological critical points of the PL function f. We use the map ρ : Dgm f → V × V to denote this correspondence⁶, with ρ(b_i, d_i) = (v_{ℓ_i}, v_{r_i}). We will also abuse notation slightly and write ρ(b_i) = v_{ℓ_i} and ρ(d_i) = v_{r_i}. In other words, birth and death points in the persistence diagram Dgm f can be mapped back to unique vertices in the vertex set of K.
This gives us a map ξ : R^m → 2^{V×V} that maps any parameter ω ∈ R^m to a collection of pairs ξ(ω) := ρ(Dgm f_ω) ⊆ V × V. Assume that as the parameter ω ∈ R^m changes, the function f_ω changes continuously (w.r.t. the L_∞ norm on the function space). It then follows that its persistence diagram Dgm f_ω also changes continuously, due to the stability of persistence [102]. The image of Dgm f_ω under ρ_ω also changes, although not necessarily continuously. Nevertheless, for a PL function f_ω, this image set stays fixed (constant) within a small neighborhood of ω if f_ω is “nice”. More specifically,
Proposition 13.9. Suppose fω : |K| → R is a PL-function with distinct values on all vertices
V of K, and K is a finite simplicial complex. Then there exists a neighborhood of ω in the
parameter space such that ξ remains constant within this neighborhood; that is, the image set
ξ(ω) = ρω (Dgm fω ) remains the same for all parameters within this neighborhood.
Recall that b_i = f_ω(ρ_ω(b_i)) = f_ω(v_{ℓ_i}). It follows that, if the conditions on f_ω in Proposition 13.9 hold, then within a sufficiently small neighborhood of ω, even though b_i moves continuously, the identity of v_{ℓ_i} remains the same and b_i = f_ω(v_{ℓ_i}) as ω varies within this neighborhood. Hence we have

∂b_i/∂ω = ∂f_ω(ρ_ω(b_i))/∂ω = ∂f_ω(v_{ℓ_i})/∂ω = (∂f_ω/∂ω)(v_{ℓ_i}).

The derivative ∂d_i/∂ω can be computed in an analogous manner as (∂f_ω/∂ω)(v_{r_i}). This in turn leads to the computation of the derivative ∂T/∂ω for the persistence-based topological function T(ω).
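Once the pairing map ρ_ω has been computed (by any persistence algorithm) and is held fixed in a neighborhood of ω, assembling ∂T/∂ω is a plain chain-rule sum. A minimal sketch (the interface is ours; it assumes the partials ∂T/∂b_i, ∂T/∂d_i and the per-vertex Jacobian ∂f_ω(v)/∂ω are supplied):

```python
import numpy as np

def topo_grad(dT_db, dT_dd, pairs, dF_domega):
    """Chain-rule assembly of dT/domega (a sketch).

    dT_db, dT_dd : partials of T w.r.t. each birth / death value.
    pairs        : list of (l_i, r_i) vertex indices -- the map rho
                   applied to the diagram points, assumed locally
                   constant near omega as in Proposition 13.9.
    dF_domega    : array of shape (num_vertices, m); row v holds
                   the gradient d f_omega(v) / d omega.
    """
    grad = np.zeros(dF_domega.shape[1])
    for i, (l, r) in enumerate(pairs):
        grad += dT_db[i] * dF_domega[l] + dT_dd[i] * dF_domega[r]
    return grad
```

This is the same structure exploited by automatic-differentiation implementations: the persistence computation only has to report the vertex pairing, after which gradients flow through the ordinary function values f_ω(v).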
⁶Note that while formulated differently, this map is the same as the one used in [257].
Performing statistics on the space of persistence diagrams. One key objective in data analysis is to model and quantify variations in data, such as computing the mean or variance of a collection of data. Given the power of persistent homology in mapping an input complex object to its persistence diagram summary, it is natural to ask whether we can compute the mean / variance in the space of persistence diagrams. This question was first studied in [230], and to answer it, one needs to study the properties of the space of persistence diagrams equipped with certain metrics.
To state the results, we first need to refine the definition of Wasserstein distance of persistence
diagrams (Definition 3.10) to allow different norms for measuring the distance between two points
in the persistence diagram. The definition below assumes that we take the general view where a
persistence diagram includes infinitely many copies of the diagonal.
Definition 13.8. Let P and Q be two persistence diagrams. The (p, q)-Wasserstein distance between these two diagrams is:

d_{W,q}^p(P, Q) := inf_{Π:P→Q} ( Σ_{x∈P} ‖x − Π(x)‖_p^q )^{1/q},    (13.9)

where Π ranges over bijections from P to Q.
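For finite diagrams, the infimum in Eqn. (13.9) can be computed by augmenting each diagram with the diagonal projections of the other's points and solving an assignment problem; the extra rows and columns realize the infinitely many copies of the diagonal. A sketch, assuming SciPy is available (our naming):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def wasserstein_pq(P, Q, p=np.inf, q=2):
    """(p, q)-Wasserstein distance of Eqn. (13.9) for finite diagrams
    (a sketch using Hungarian matching)."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    n, m = len(P), len(Q)
    def pnorm(v):
        return np.max(np.abs(v)) if p == np.inf else np.sum(np.abs(v) ** p) ** (1 / p)
    def to_diag(x):                    # p-norm distance to the diagonal
        return pnorm(x - (x[0] + x[1]) / 2 * np.ones(2))
    C = np.zeros((n + m, n + m))
    for i in range(n):
        for j in range(m):
            C[i, j] = pnorm(P[i] - Q[j]) ** q
    for i in range(n):
        C[i, m:] = to_diag(P[i]) ** q  # P_i matched to the diagonal
    for j in range(m):
        C[n:, j] = to_diag(Q[j]) ** q  # Q_j matched to the diagonal
    # Diagonal-to-diagonal pairs cost nothing (already zero).
    r, c = linear_sum_assignment(C)
    return C[r, c].sum() ** (1 / q)
```

For p = ∞ and q = 1 this recovers the 1-Wasserstein distance used earlier in the chapter; the bottleneck distance would instead take the maximum matched cost.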
Now, let D_∅ denote the trivial persistence diagram, which contains only infinitely many copies of the diagonal.

Definition 13.9. Given p and q, the space of persistence diagrams D_q^p consists of all persistence diagrams within finite distance to the trivial persistence diagram D_∅; that is,

D_q^p := { P | d_{W,q}^p(P, D_∅) < ∞ }.
In what follows, for simplicity, we abuse notation slightly and let D_q^p denote the metric space (D_q^p, d_{W,q}^p) equipped with d_{W,q}^p. It is shown in [230] that D_q^∞ is a so-called Polish (i.e., complete and separable) space, on which probability measures can be defined. It is later shown that more can be said about the space D_2^2 (Theorem 2.5, [291]), which is a non-negatively curved Alexandrov space (i.e., a geodesic space with curvature bounded from below by zero).
Furthermore, in both cases, the concepts of “mean” and “variance” can be introduced using the notion of the Fréchet function. Specifically, in what follows, we use D to denote either D_q^∞ or D_2^2, with metric d_D being the corresponding metric d_{W,q}^∞ or d_{W,2}^2. We will consider probability measures defined on (D, B(D)), where B(D) is the Borel σ-algebra on D.
Definition 13.10. Given a probability distribution ρ on (D, B(D)), its Fréchet function F_ρ : D → R is defined as, for any X ∈ D,

F_ρ(X) := ∫_D d_D²(X, Y) dρ(Y).    (13.10)

The Fréchet mean set E(ρ) is the set of minimizers of F_ρ in D.
Often in the literature, one uses the Fréchet mean to refer to an element of the Fréchet mean set. Intuitively, a Fréchet mean generalizes the arithmetic mean in the sense that it minimizes the sum of the squared distances to all points in the distribution. If the input is a collection of persistence diagrams Ω = {D_1, D_2, …, D_m}, then we can talk about the mean of this collection as the mean of the discrete measure ρ_Ω = (1/m) Σ_{i=1}^m δ_{D_i} induced by them, where δ_X is the Dirac measure centered at X ∈ D.
In general, it is not clear whether a Fréchet mean even exists. However, for the space D as defined above, it is shown [230, 291] that the Fréchet mean set is not empty under mild conditions on the distribution.

Theorem 13.10. Let ρ be a probability measure on (D, B(D)) with a finite second moment, that is, F_ρ(X) < ∞ for any X ∈ D. If ρ has compact support, then E(ρ) ≠ ∅.
In the case where D = D_2^2, leveraging the properties of D_2^2, Turner et al. developed an iterative algorithm to compute a local minimum of the Fréchet function [291]. The computational question for the Fréchet mean, however, remains open. We also note that in general, the Fréchet mean is not unique. This becomes undesirable, for example, when tracking the mean of a set of varying persistence diagrams. To address this issue, a modified concept of probabilistic Fréchet mean was proposed in [239], which intuitively is a probability measure on D_2^2, and the authors show how to use this to build useful statistics on the vineyard (the time-varying persistence diagrams).
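The flavor of the iterative algorithm of [291] can be conveyed by the following much-simplified sketch (ours; it assumes SciPy is available and that all diagrams have the same number of off-diagonal points, so the diagonal can be ignored in the matching step): alternate between optimally matching the current estimate to each diagram and averaging the matched points.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def frechet_mean(diagrams, n_iter=20):
    """Local Fréchet-mean search in (D_2^2, d_{W,2}^2), in the spirit
    of [291] (a simplified sketch; same-cardinality diagrams only)."""
    mean = np.asarray(diagrams[0], float).copy()
    for _ in range(n_iter):
        acc = np.zeros_like(mean)
        for D in diagrams:
            D = np.asarray(D, float)
            # Optimal squared-L2 matching of the current mean to D.
            C = ((mean[:, None, :] - D[None, :, :]) ** 2).sum(-1)
            r, c = linear_sum_assignment(C)
            acc[r] += D[c]
        new = acc / len(diagrams)
        if np.allclose(new, mean):     # matchings have stabilized
            break
        mean = new
    return mean
```

As the text notes, this only finds a local minimum of the Fréchet function; the full algorithm of [291] also matches points to the diagonal, which this sketch omits.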
from these samples to that of X (when appropriate), whether such estimates converge, or how to compute confidence intervals (sets), and so on.
We will not review the results here, as that would require careful description of the models used; we refer the readers to the nice survey by Wasserman [297], which discusses statistical estimation for various topological objects, including (hierarchical) clustering (related to merge trees), persistence diagrams, and ridge estimation. We will just mention that, in the context of persistence diagrams and their variants (e.g., persistence landscapes), there has been work to analyze their concentration and convergence behavior as the number of samples n tends to infinity in different settings [85, 84], or to obtain confidence sets for them via bootstrapping or subsampling [34, 83, 160].
The inference and estimation of topological information has been discussed earlier in Chapter 6; however, the samples there are assumed to be deterministic. Also note that as the distribution P (from which input points are sampled) deviates further from the true distribution we are interested in, the standard construction based on the Rips or Čech complexes to approximate the sublevel sets of the distance field (recall Definition 6.7 in Section 6.3.1) is no longer appropriate. Instead, one now needs to use more robust notions of the “distance field”. To this end, an elegant concept called distance to measures (DTM) has been proposed [79], which has many nice properties and can lead to more robust topology inference; see e.g., [82]. An alternative is to use the kernel distance as proposed in [256].
Finally, we note that there also has been a line of work to study topological properties (e.g.,
Betti numbers, or the largest persistence in the persistence diagram) of random simplicial com-
plexes [35, 36, 37, 202, 203, 204]. We will not describe this interesting line of work in this book.
work [305], learning the best representation based on persistence images is formulated as an optimization problem and solved directly via (stochastic) gradient descent. The resulting learned vector representations can be simply combined with kernel-SVM for different tasks such as graph classification.
Differentiating a function involving persistence has been independently proposed and studied in several works from different communities: first in [165] for continuation of point clouds, then in [257] for continuous shape matching, and in [96] for topological regularization of classifiers. The gradient computation of a persistence-based topological function presented in Section 13.2.2 follows mostly the discussion in [257]. The topological optimization framework is rather general and powerful, and several recent works apply such ideas to different stages of machine learning applications. For example, [101, 199] used topological loss terms to help enforce a topological prior on individual input objects for deep-learning based image segmentation. The work of [101] assumed certain prior knowledge of the topology of the segmented images. Instead of assuming this prior knowledge, [199] proposed to learn to segment with correct topology by using a topological loss function to help ensure that the topology of segmented images is the same as the ground truth for labelled images. The potential applications of these ideas have been further broadened in [49], where the authors introduced and developed a topological layer for function-induced persistence and for distance-based filtration induced persistence. Such a persistence-layer idea is further developed in [211] using the persistence landscape representation of general filtrations. Instead of putting a topological constraint on individual input data points, one can also consider using it for the latent space behind the data; for example, [193, 236] applied such ideas to auto-encoders.
There has also been some recent work on using (persistent) homology to help characterize the complexity of a neural network (or its training process). For example, [264] proposed the so-called neural persistence to characterize the structural complexity of neural networks. [28, 180] proposed to measure the capacity of an architecture by the topological complexity of the classifiers it can produce. [168] proposed to study the topology of activation networks (neural networks with node activations for specific inputs), and used such patterns to help understand adversarial examples. [243] studied the change of the topology of the transformed data space across different layers of a deep neural network. While exploration in this direction is overall still at an initial stage, these are exciting ideas, and there is much potential in using topological tools to understand neural networks.
Bibliography
[1] Aaron Aadcock, Erik Carlsson, and Gunnar Carlsson. The ring of algebraic functions on
persistence barcodes. Homology, Homotopy and Applications, 18:381–402, 2016.
[2] Michal Adamaszek and Henry Adams. The Vietoris-Rips complexes of a circle. Pacific J.
Math., 290:1–40, 2017.
[3] Michal Adamaszek, Henry Adams, Ellen Gasparovic, Maria Gommel, Emilie Purvine,
Radmila Sazdanovic, Bei Wang, Yusu Wang, and Lori Ziegelmeier. On homotopy
types of Vietoris-Rips complexes of metric gluings. J. Appl. Comput. Topology, 2020.
https://doi.org/10.1007/s41468-020-00054-y.
[4] Henry Adams, Tegan Emerson, Michael Kirby, Rachel Neville, Chris Peterson, Patrick
Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, and Lori Ziegelmeier. Persis-
tence images: a stable vector representation of persistent homology. J. Machine Learning
Research, 18:218–252, 2017.
[5] Pankaj K. Agarwal, Herbert Edelsbrunner, John Harer, and Yusu Wang. Extreme elevation
on a 2-manifold. Discrete Comput. Geom., 36(4):553–572, 2006.
[6] Pankaj K. Agarwal, Kyle Fox, Abhinandan Nath, Anastasios Sidiropoulos, and Yusu Wang.
Computing the Gromov-Hausdorff distance for metric trees. ACM Trans. Algorithms,
14(2):24:1–24:20, 2018.
[7] Paul Aleksandroff. Über den allgemeinen dimensionsbegriff und seine beziehungen zur
elementaren geometrischen anschauung. Mathematische Annalen, 98:617–635, 1928.
[8] Nina Amenta, Marshall W. Bern, and David Eppstein. The crust and the beta-skeleton:
Combinatorial curve reconstruction. Graphical Models Image Processing, 60(2):125–135,
1998.
[9] Hideto Asashiba, Emerson G. Escolar, Yasuaki Hiraoka, and Hiroshi Takeuchi. Matrix
method for persistence modules on commutative ladders of finite type. J. Industrial Applied
Math., 36(1):97–130, 2019.
[10] Michael Atiyah. On the Krull-Schmidt theorem with application to sheaves. Bulletin de la
Société Mathématique de France, 84:307–317, 1956.
[11] Dominique Attali, Herbert Edelsbrunner, and Yuriy Mileyko. Weak witnesses for Delaunay
triangulations of submanifolds. In Proc. ACM Sympos. Solid Physical Model., pages 143–
150, 2007.
[12] Dominique Attali, André Lieutier, and David Salinas. Efficient data structure for represent-
ing and simplifying simplicial complexes in high dimensions. In Proc. 27th Annu. Sympos.
Comput. Geom. (SoCG), pages 501–509, 2011.
[13] Maurice Auslander. Representation theory of Artin Algebras II. Communications in Alge-
bra, 1(4):269–310, 1974.
[14] Maurice Auslander and David Buchsbaum. Groups, Rings, Modules. Dover Publications,
2014.
[15] Aravindakshan Babu. Zigzag coarsenings, mapper stability and gene-network analyses. PhD thesis, Stanford University, 2013.
[16] Samik Banerjee, Lucas Magee, Dingkang Wang, Xu Li, Bingxing Huo, Jaikishan Jayakumar, Katherine Matho, Meng-Kuan Lin, Keerthi Ram, Mohanasankar Sivaprakasam,
Josh Huang, Yusu Wang, and Partha Mitra. Semantic segmentation of microscopic
neuroanatomical data by combining topological priors with encoder-decoder deep networks. Nature Machine Intelligence, 2:585–594, 2020. Also available on bioRxiv at 2020.02.18.955237.
[17] Jonathan Ariel Barmak and Elias Gabriel Minian. Strong homotopy types, nerves and
collapses. Discrete Comput. Geom., 47(2):301–328, 2012.
[18] Saugata Basu and Negin Karisani. Efficient simplicial replacement of semi-algebraic sets
and applications. CoRR, arXiv:2009.13365, 2020.
[19] Ulrich Bauer. Ripser: efficient computation of Vietoris-Rips persistence barcodes. CoRR,
arXiv:1908.02518, 2019.
[20] Ulrich Bauer, Xiaoyin Ge, and Yusu Wang. Measuring distance between Reeb graphs. In
Proc. 30th Annu. Sympos. Comput. Geom. (SoCG), pages 464–473, 2014.
[21] Ulrich Bauer, Claudia Landi, and Facundo Mémoli. The Reeb graph edit distance is uni-
versal. In Proc. 36th Internat. Sympos. Comput. Geom. (SoCG), pages 15:1–15:16, 2020.
[22] Ulrich Bauer, Carsten Lange, and Max Wardetzky. Optimal topological simplification of
discrete functions on surfaces. Discrete Comput. Geom., 47(2):347–377, 2012.
[23] Ulrich Bauer and Michael Lesnick. Induced matchings of barcodes and the algebraic stability of persistence. In Proc. 30th Annu. Sympos. Comput. Geom. (SoCG), pages 355–364, 2014.
[24] Ulrich Bauer, Elizabeth Munch, and Yusu Wang. Strong equivalence of the interleaving
and functional distortion metrics for Reeb graphs. In Proc. 31st Annu. Sympos. Comput.
Geom. (SoCG), pages 461–475, 2015.
[25] Christian Berg, Jens P. R. Christensen, and Paul Ressel. Harmonic analysis on semigroups:
Theory of positive definite and related functions. Springer, 1984.
[26] Marshall W. Bern, David Eppstein, Pankaj K. Agarwal, Nina Amenta, L. Paul Chew,
Tamal K. Dey, David P. Dobkin, Herbert Edelsbrunner, Cindy Grimm, Leonidas J. Guibas,
John Harer, Joel Hass, Andrew Hicks, Carroll K. Johnson, Gilad Lerman, David Letscher,
Paul E. Plassmann, Eric Sedgwick, Jack Snoeyink, Jeff Weeks, Chee-Keng Yap, and Denis
Zorin. Emerging challenges in computational topology. CoRR, arXiv:cs/9909001, 1999.
[27] Dimitris Bertsimas and John N. Tsitsiklis. Introduction to Linear Optimization. Athena
Scientific, Belmont, MA, 1997.
[28] Monica Bianchini and Franco Scarselli. On the complexity of neural network classifiers:
A comparison between shallow and deep architectures. IEEE Trans. Neural Networks
Learning Sys., 25(8):1553–1565, 2014.
[29] Silvia Biasotti, Andrea Cerri, Patrizio Frosini, and Daniela Giorgi. A new algorithm for
computing the 2-dimensional matching distance between size functions. Pattern Recogni-
tion Letters, 32(14):1735–1746, 2011.
[30] Silvia Biasotti, Bianca Falcidieno, and Michela Spagnuolo. Extended Reeb graphs for sur-
face understanding and description. In Proc. 9th Internat. Conf. Discrete Geom. Computer
Imagery, pages 185–197, 2000.
[31] Silvia Biasotti, Daniela Giorgi, Michela Spagnuolo, and Bianca Falcidieno. Reeb graphs
for shape analysis and applications. Theor. Comput. Sci., 392(1-3):5–22, 2008.
[33] Håvard Bjerkevik, Magnus Botnan, and Michael Kerber. Computing the interleaving dis-
tance is NP-hard. Found. Comput. Math., 2019.
[34] Ander J. Blumberg, Itamar Gal, Michael A. Mandell, and Matthew Pancia. Robust statis-
tics, hypothesis testing, and confidence intervals for persistent homology on metric mea-
sure spaces. Found. Comput. Math., 14:745–789, 2014.
[35] Omer Bobrowski and Matthew Kahle. Topology of random geometric complexes: a survey.
J. Appl. Comput. Topology, 1(3):331–364, 2018.
[36] Omer Bobrowski, Matthew Kahle, and Primoz Skraba. Maximally persistent cycles in
random geometric complexes. Ann. Appl. Probab., 27(4):2032–2060, 2017.
[37] Omer Bobrowski and Primoz Skraba. Homological percolation and the Euler characteris-
tic. Phys. Rev. E, 101:032304, 2020.
[38] Erik Boczko, William D. Kalies, and Konstantin Mischaikow. Polygonal approximation of
flows. Topology and its Applications, 154:2501–2520, 2007.
[39] Jean-Daniel Boissonnat, Frédéric Chazal, and Mariette Yvinec. Geometric and Topological
Inference. Cambridge Texts in Applied Mathematics. Cambridge University Press, 2018.
[40] Jean-Daniel Boissonnat, Leonidas J. Guibas, and Steve Y. Oudot. Manifold reconstruction
in arbitrary dimensions using witness complexes. In Proc. 23rd Annu. Sympos. Comput.
Geom. (SoCG), pages 194–203, 2007.
[41] Jean-Daniel Boissonnat and Siddharth Pritam. Edge collapse and persistence of flag com-
plexes. In 36th Internat. Sympos. Comput. Geom., (SoCG), volume 164 of LIPIcs, pages
19:1–19:15, 2020.
[42] Jean-Daniel Boissonnat, Siddharth Pritam, and Divyansh Pareek. Strong collapse for per-
sistence. In 26th Annu. European Sympos. Algorithms, ESA, volume 112 of LIPIcs, pages
67:1–67:13, 2018.
[43] Glencora Borradaile, Erin Wolf Chambers, Kyle Fox, and Amir Nayyeri. Minimum cycle
and homology bases of surface-embedded graphs. J. Comput. Geom. (JoCG), 8(2):58–79,
2017.
[44] Glencora Borradaile, William Maxwell, and Amir Nayyeri. Minimum bounded chains
and minimum homologous chains in embedded simplicial complexes. In 36th Internat.
Sympos. Comput. Geom., (SoCG), volume 164 of LIPIcs, pages 21:1–21:15, 2020.
[45] Karol Borsuk. On the imbedding of systems of compacta in simplicial complexes. Funda-
menta Mathematicae, 35:217–234, 1948.
[46] Magnus Botnan, Justin Curry, and Elizabeth Munch. The poset interleaving distance, 2016.
[47] Magnus Botnan and Michael Lesnick. Algebraic stability of zigzag persistence modules.
Algebraic & Geometric Topology, 18:3133–3204, 2018.
[48] Stephane Bressan, Jingyan Li, Shiquan Ren, and Jie Wu. The embedded homology of
hypergraphs and applications. Asian J. Math., 23(3):479–500, 2019.
[50] Winfried Bruns and Jürgen Herzog. Cohen-Macaulay Rings. Cambridge University
Press, 1998.
[51] Peter Bubenik. Statistical topological data analysis using persistence landscapes. J. Ma-
chine Learning Research, 16(1):77–102, 2015.
[52] Peter Bubenik and Peter T. Kim. A statistical approach to persistent homology. Homology,
Homotopy, and Applications, 9(2):337–362, 2007.
[53] Peter Bubenik and Jonathan Scott. Categorification of persistent homology. Discrete Com-
put. Geom., 51(3):600–627, 2014.
[54] Peter Bubenik, Jonathan A. Scott, and Donald Stanley. Wasserstein distance for generalized
persistence modules and abelian categories. arXiv: Rings and Algebras, arxiv:1809.09654,
2018.
[55] Michaël Buchet, Frédéric Chazal, Tamal K. Dey, Fengtao Fan, Steve Y. Oudot, and Yusu
Wang. Topological analysis of scalar fields with outliers. In Proc. 31st Annu. Sympos.
Comput. Geom. (SoCG), pages 827–841, 2015.
[56] Mickaël Buchet, Frédéric Chazal, Steve Y. Oudot, and Donald Sheehy. Efficient and ro-
bust persistent homology for measures. In Proc. 26th Annu. ACM-SIAM Sympos. Discrete
Algorithms (SODA), pages 168–180, 2015.
[57] James R. Bunch and John E. Hopcroft. Triangular factorization and inversion by fast matrix
multiplication. Mathematics of Computation, 28(125):231–236, 1974.
[58] Dmitri Burago, Yuri Burago, and Sergei Ivanov. A Course in Metric Geometry, volume 33 of AMS Graduate Studies in Mathematics. American Mathematical Society, 2001.
[59] Dan Burghelea and Tamal K. Dey. Topological persistence for circle-valued maps. Discrete
Comput. Geom., 50(1):69–98, 2013.
[60] Oleksiy Busaryev, Sergio Cabello, Chao Chen, Tamal K. Dey, and Yusu Wang. Annotating
simplices with a homology basis and its applications. In Algorithm Theory - SWAT 2012 -
13th Scandinavian Sympos. Workshops, pages 189–200, 2012.
[61] Alexander Buslaev, Selim S Seferbekov, Vladimir Iglovikov, and Alexey Shvets. Fully
convolutional network for automatic road extraction from satellite imagery. In CVPR Work-
shops, pages 207–210, 2018.
[62] Gunnar Carlsson and Vin de Silva. Zigzag persistence. Found. Comput. Math., 10(4):367–405, 2010.
[63] Gunnar Carlsson, Vin de Silva, and Dmitriy Morozov. Zigzag persistent homology and
real-valued functions. In Proc. 25th Annu. Sympos. Comput. Geom. (SoCG), pages 247–256, 2009.
[64] Gunnar Carlsson, Gurjeet Singh, and Afra Zomorodian. Computing multidimensional per-
sistence. In Proc. Internat. Sympos. Algorithms Computation (ISAAC), pages 730–739.
Springer, 2009.
[65] Gunnar Carlsson and Afra Zomorodian. The theory of multidimensional persistence. Dis-
crete Comput. Geom., 42(1):71–93, 2009.
[66] Hamish Carr, Jack Snoeyink, and Ulrike Axen. Computing contour trees in all dimensions.
Comput. Geom.: Theory and Applications, 24(2):75–94, 2003.
[67] Mathieu Carrière, Frédéric Chazal, Yuichi Ike, Théo Lacombe, Martin Royer, and Yuhei
Umeda. Perslay: a neural network layer for persistence diagrams and new graph topologi-
cal signatures. In Proc. 23rd Internat. Conf. Artificial Intelligence Stat. (AISTATS), volume
108, pages 2786–2796, 2020.
[68] Mathieu Carrière, Marco Cuturi, and Steve Y. Oudot. Sliced Wasserstein kernel for persis-
tence diagrams. In Proc. Internat. Conf. Machine Learning, pages 664–673, 2017.
[69] Mathieu Carrière and Steve Oudot. Structure and stability of the one-dimensional mapper.
Found. Comput. Math., 18(6):1333–1396, 2018.
[70] Nicholas J. Cavanna, Mahmoodreza Jahanseir, and Donald R. Sheehy. A geometric per-
spective on sparse filtrations. In Proc. Canadian Conf. Comput. Geom. (CCCG), 2015.
[71] Andrea Cerri, Barbara Di Fabio, Massimo Ferri, Patrizio Frosini, and Claudia Landi. Betti
numbers in multidimensional persistent homology are stable functions. Mathematical
Methods in the Applied Sciences, 36(12):1543–1557, 2013.
[72] Andrea Cerri and Patrizio Frosini. A new approximation algorithm for the matching dis-
tance in multidimensional persistence. J. Comput. Math., pages 291–309, 2020.
[73] Erin W. Chambers, Jeff Erickson, and Amir Nayyeri. Minimum cuts and shortest homolo-
gous cycles. In Proc. 25th Annu. Sympos. Comput. Geom. (SoCG), pages 377–385, 2009.
[74] Erin W. Chambers, Jeff Erickson, and Amir Nayyeri. Homology flows, cohomology cuts.
SIAM J. Comput., 41(6):1605–1634, 2012.
[75] Manoj K. Chari. On discrete Morse functions and combinatorial decompositions. Discrete
Math., 217(1-3):101–113, 2000.
[76] Isaac Chavel. Riemannian Geometry: A Modern Introduction, 2nd Ed. Cambridge University Press, 2006.
[77] Frédéric Chazal, David Cohen-Steiner, Marc Glisse, Leonidas J. Guibas, and Steve Oudot.
Proximity of persistence modules and their diagrams. In Proc. 25th Annu. Sympos. Comput.
Geom. (SoCG), pages 237–246, 2009.
[78] Frédéric Chazal, David Cohen-Steiner, Leonidas J. Guibas, Facundo Mémoli, and Steve Y.
Oudot. Gromov-Hausdorff stable signatures for shapes using persistence. Comput. Graph-
ics Forum, 28(5):1393–1403, 2009.
[79] Frédéric Chazal, David Cohen-Steiner, and Quentin Mérigot. Geometric inference for
probability distributions. Found. Comput. Math., 11(6):733–751, 2011.
[80] Frédéric Chazal, Vin de Silva, Marc Glisse, and Steve Oudot. The structure and stability
of persistence modules. CoRR, arXiv:1207.3674, 2012.
[81] Frédéric Chazal, Vin de Silva, and Steve Oudot. Persistence stability for geometric complexes. Geometriae Dedicata, 173(1):193–214, 2014.
[82] Frédéric Chazal, Brittany Fasy, Fabrizio Lecci, Bertrand Michel, Alessandro Rinaldo, and Larry Wasserman. Robust topological inference: Distance to a measure and kernel distance. J.
Machine Learning Research, 18(159):1–40, 2018.
[83] Frédéric Chazal, Brittany Fasy, Fabrizio Lecci, Alessandro Rinaldo, Aarti Singh, and Larry
Wasserman. On the bootstrap for persistence diagrams and landscapes. Modeling Analysis
Info. Sys., 20(6):96–105, 2013. Also available at arXiv:1311.0376.
[84] Frédéric Chazal, Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, and Larry A.
Wasserman. Stochastic convergence of persistence landscapes and silhouettes. J. Comput.
Geom. (JoCG), 6(2):140–161, 2015.
[85] Frédéric Chazal, Marc Glisse, Catherine Labruère, and Bertrand Michel. Convergence
rates for persistence diagram estimation in topological data analysis. J. Machine Learning
Research, 16(110):3603–3635, 2015.
[86] Frédéric Chazal, Marc Glisse, Catherine Labruère, and Bertrand Michel. Convergence
rates for persistence diagram estimation in Topological Data Analysis. J. Machine Learn-
ing Research, 16:3603–3635, 2015.
[87] Frédéric Chazal, Leonidas J. Guibas, Steve Oudot, and Primoz Skraba. Analysis of scalar
fields over point cloud data. Discrete Comput. Geom., 46(4):743–775, 2011.
[88] Frédéric Chazal, Ruqi Huang, and Jian Sun. Gromov-Hausdorff approximation of filamen-
tary structures using Reeb-type graphs. Discrete Comput. Geom., 53:621–649, 2015.
[89] Frédéric Chazal and André Lieutier. Weak feature size and persistent homology: com-
puting homology of solids in Rn from noisy data samples. In Proc. 21st Annu. Sympos.
Comput. Geom. (SoCG), pages 255–262, 2005.
[90] Frédéric Chazal and André Lieutier. Stability and computation of topological invariants of
solids in Rn. Discrete Comput. Geom., 37(4):601–617, 2007.
[91] Frédéric Chazal and Steve Y. Oudot. Towards persistence-based reconstruction in Eu-
clidean spaces. In Proc. 24th Annu. Sympos. Comput. Geom. (SoCG), pages 232–241,
2008.
[92] Bernard Chazelle. An optimal convex hull algorithm in any fixed dimension. Discrete
Comput. Geom., 10:377–409, 1993.
[93] Chao Chen and Daniel Freedman. Measuring and computing natural generators for homology groups. Comput. Geom.: Theory and Applications, 43(2):169–181, 2010.
[94] Chao Chen and Daniel Freedman. Hardness results for homology localization. Discrete Comput. Geom., 45(3):425–448, 2011.
[95] Chao Chen and Michael Kerber. An output-sensitive algorithm for persistent homology.
Comput. Geom.: Theory and Applications, 46(4):435–447, 2013.
[96] Chao Chen, Xiuyan Ni, Qinxun Bai, and Yusu Wang. A topological regularizer for clas-
sifiers via persistent homology. In Proc. 22nd Internat. Conf. Artificial Intelligence Stat.
(AISTATS), pages 2573–2582, 2019.
[97] Siu-Wing Cheng, Tamal K. Dey, and Edgar A. Ramos. Manifold reconstruction from point
samples. In Proc. 16th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages
1018–1027, 2005.
[98] Siu-Wing Cheng, Tamal K. Dey, and Jonathan R. Shewchuk. Delaunay Mesh Generation.
CRC Press, 2012.
[99] Samir Chowdhury and Facundo Mémoli. Persistent homology of asymmetric networks:
An approach based on Dowker filtrations. CoRR, arXiv:1608.05432, 2018.
[100] Samir Chowdhury and Facundo Mémoli. Persistent path homology of directed networks.
In Proc. 29th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 1152–1169.
SIAM, 2018.
[101] James R. Clough, Ilkay Oksuz, Nicholas Byrne, Veronika A. Zimmer, Julia A. Schnabel,
and Andrew P. King. A topological loss function for deep-learning based image segmen-
tation using persistent homology. CoRR, 2019.
[102] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Stability of persistence diagrams. Discrete Comput. Geom., 37(1):103–120, 2007.
[103] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. Extending persistence using
Poincaré and Lefschetz duality. Found. Comput. Math., 9(1):79–103, 2009.
[104] David Cohen-Steiner, Herbert Edelsbrunner, John Harer, and Yuriy Mileyko. Lipschitz
functions have Lp-stable persistence. Found. Comput. Math., 10(2):127–139, 2010.
[105] David Cohen-Steiner, Herbert Edelsbrunner, John Harer, and Dmitriy Morozov. Persistent
homology for kernels, images, and cokernels. In Proc. 20th Annu. ACM-SIAM Sympos.
Discrete Algorithms (SODA), pages 1011–1020, 2009.
[106] David Cohen-Steiner, Herbert Edelsbrunner, and Dmitriy Morozov. Vines and vineyards by
updating persistence in linear time. In Proc. 22nd Annu. Sympos. Comput. Geom. (SoCG),
pages 119–126, 2006.
[107] Kree Cole-McLaughlin, Herbert Edelsbrunner, John Harer, Vijay Natarajan, and Valerio
Pascucci. Loops in Reeb graphs of 2-manifolds. Discrete Comput. Geom., 32(2):231–244,
2004.
[108] René Corbet and Michael Kerber. The representation theorem of persistence revisited and
generalized. J. Appl. Comput. Topology, 2(1):1–31, 2018.
[109] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduc-
tion to Algorithms, Third Edition. The MIT Press, 3rd edition, 2009.
[110] David A. Cox, John Little, and Donal O’Shea. Using Algebraic Geometry, volume 185.
Springer Science & Business Media, 2006.
[112] Justin Curry. Sheaves, cosheaves and applications. CoRR, arXiv:1303.3255, 2013.
[113] Vin de Silva. A weak definition of Delaunay triangulation. CoRR, arXiv:cs/0310031, 2003.
[114] Vin de Silva and Gunnar Carlsson. Topological estimation using witness complexes. In
Proc. Sympos. Point-Based Graphics., 2004.
[115] Vin de Silva, Dmitriy Morozov, and Mikael Vejdemo-Johansson. Dualities in persistent
(co)homology. Inverse Problems, 27:124003, 2011.
[116] Vin de Silva, Elizabeth Munch, and Amit Patel. Categorified Reeb graphs. Discrete Com-
put. Geom., 55(4):854–906, 2016.
[117] Cecil Jose A. Delfinado and Herbert Edelsbrunner. An incremental algorithm for Betti
numbers of simplicial complexes on the 3-sphere. Comput. Aided Geom. Design,
12(7):771–784, 1995.
[118] Olaf Delgado-Friedrichs, Vanessa Robins, and Adrian P. Sheppard. Skeletonization and
partitioning of digital images using discrete Morse theory. IEEE Trans. Pattern Anal. Ma-
chine Intelligence, 37(3):654–666, 2015.
[119] Tamal K. Dey. Curve and Surface Reconstruction: Algorithms with Mathematical Analy-
sis. Cambridge Monographs Applied Comput. Math. Cambridge University Press, 2006.
[120] Tamal K. Dey, Herbert Edelsbrunner, and Sumanta Guha. Computational topology. Ad-
vances in Discrete Comput. Geom., 1999.
[121] Tamal K. Dey, Herbert Edelsbrunner, Sumanta Guha, and Dmitry V. Nekhayev. Topology
preserving edge contraction. Publications de l’ Institut Mathematique (Beograd), 60:23–
45, 1999.
[122] Tamal K. Dey, Fengtao Fan, and Yusu Wang. Computing topological persistence for sim-
plicial maps. CoRR, arXiv:1208.5018, 2012.
[123] Tamal K. Dey, Fengtao Fan, and Yusu Wang. An efficient computation of handle and tunnel
loops via Reeb graphs. ACM Trans. Graph., 32(4):32, 2013.
[124] Tamal K. Dey, Fengtao Fan, and Yusu Wang. Graph induced complex for point data. In
Proc. 29th Annu. Sympos. Comput. Geom. (SoCG), pages 107–116, 2013.
[125] Tamal K. Dey, Fengtao Fan, and Yusu Wang. Computing topological persistence for simplicial maps. In Proc. 30th Annu. Sympos. Comput. Geom. (SoCG), pages 345–354, 2014.
[126] Tamal K. Dey, Anil N. Hirani, and Bala Krishnamoorthy. Optimal homologous cycles,
total unimodularity, and linear programming. SIAM J. Comput., 40(4):1026–1044, 2011.
[127] Tamal K. Dey and Tao Hou. Computing zigzag persistence on graphs in near-linear time.
In Proc. 37th Internat. Sympos. Comput. Geom. (SoCG), 2021.
[128] Tamal K. Dey, Tao Hou, and Sayan Mandal. Persistent 1-cycles: Definition, computation,
and its application. In Comput. Topology Image Context - 7th Internat. Workshop, pages
123–136, 2019.
[129] Tamal K. Dey, Tao Hou, and Sayan Mandal. Computing minimal persistent cycles: Poly-
nomial and hard cases. In Proc. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages
2587–2606. SIAM, 2020.
[130] Tamal K. Dey, Tianqi Li, and Yusu Wang. Efficient algorithms for computing a minimal
homology basis. In LATIN 2018: Theoretical Informatics - 13th Latin American Sympo-
sium, pages 376–398, 2018.
[131] Tamal K. Dey, Tianqi Li, and Yusu Wang. An efficient algorithm for 1-dimensional (persistent) path homology. In Proc. 36th Internat. Sympos. Comput. Geom. (SoCG), pages 36:1–36:15, 2020.
[132] Tamal K. Dey, Facundo Mémoli, and Yusu Wang. Multiscale mapper: Topological summarization via codomain covers. In Proc. 27th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 997–1013, 2016.
[133] Tamal K. Dey, Facundo Mémoli, and Yusu Wang. Topological analysis of nerves, Reeb
spaces, mappers, and multiscale mappers. In Proc. 33rd Internat. Sympos. Comput. Geom.
(SoCG), pages 36:1–36:16, 2017.
[134] Tamal K. Dey, Dayu Shi, and Yusu Wang. Comparing graphs via persistence distortion. In
Proc. 31st Annu. Sympos. Comput. Geom. (SoCG), pages 491–506, 2015.
[135] Tamal K. Dey, Dayu Shi, and Yusu Wang. SimBa: An efficient tool for approximating
Rips-filtration persistence via simplicial batch-collapse. In Proc. 24th Annu. European
Sympos. Algorithms (ESA 2016), volume 57 of LIPIcs, pages 35:1–35:16, 2016.
[136] Tamal K. Dey, Jian Sun, and Yusu Wang. Approximating loops in a shortest homology
basis from point data. In Proc. 26th Annu. Sympos. Comput. Geom. (SoCG), pages 166–
175, 2010.
[137] Tamal K. Dey, Jiayuan Wang, and Yusu Wang. Improved road network reconstruction
using discrete Morse theory. In Proc. 25th ACM SIGSPATIAL Internat. Conf. Advances in
GIS, pages 58:1–58:4, 2017.
[138] Tamal K. Dey, Jiayuan Wang, and Yusu Wang. Graph reconstruction by discrete Morse
theory. In Proc. 34th Internat. Sympos. Comput. Geom. (SoCG), pages 31:1–31:15, 2018.
[139] Tamal K. Dey, Jiayuan Wang, and Yusu Wang. Road network reconstruction from satel-
lite images with machine learning supported by topological methods. In Proc. 27th ACM
SIGSPATIAL Internat. Conf. Advances in GIS, pages 520–523, 2019.
[140] Tamal K. Dey and Yusu Wang. Reeb graphs: Approximation and persistence. Discrete
Comput. Geom., 49(1):46–73, 2013.
[141] Tamal K. Dey and Cheng Xin. Computing bottleneck distance for 2-D interval decompos-
able modules. In Proc. 34th Internat. Sympos. Comput. Geom. (SoCG), pages 32:1–32:15,
2018.
[142] Tamal K. Dey and Cheng Xin. Generalized persistence algorithm for decomposing multi-
parameter persistence modules. CoRR, arXiv:1904.03766, 2019.
[143] Barbara Di Fabio and Massimo Ferri. Comparing persistence diagrams through complex
vectors. In Vittorio Murino and Enrico Puppo, editors, Image Analysis and Processing —
ICIAP 2015, pages 294–305, 2015.
[144] Pawel Dlotko, Kathryn Hess, Ran Levi, Max Nolte, Michael Reimann, Martina Sco-
lamiero, Katharine Turner, Eilif Muller, and Henry Markram. Topological analysis of the
connectome of digital reconstructions of neural microcircuits. CoRR, arXiv:1601.01580,
2016.
[145] Harish Doraiswamy and Vijay Natarajan. Efficient output-sensitive construction of Reeb
graphs. In Proc. 19th Internat. Sympos. Algorithms Computation, pages 556–567, 2008.
[146] Harish Doraiswamy and Vijay Natarajan. Efficient algorithms for computing Reeb graphs.
Comput. Geom.: Theory and Applications, 42:606–616, 2009.
[147] Clifford H. Dowker. Homology groups of relations. Annals of Math., 56:84–95, 1952.
[148] Herbert Edelsbrunner. Geometry and Topology for Mesh Generation, volume 7 of Cam-
bridge Monographs Applied Comput. Math. Cambridge University Press, 2001.
[149] Herbert Edelsbrunner and John Harer. Computational Topology: An Introduction. Applied
Mathematics. American Mathematical Society, 2010.
[150] Herbert Edelsbrunner, John Harer, and Amit K. Patel. Reeb spaces of piecewise linear
mappings. In Proc. 24th Annu. Sympos. Comput. Geom. (SoCG), pages 242–250, 2008.
[151] Herbert Edelsbrunner, David G. Kirkpatrick, and Raimund Seidel. On the shape of a set of
points in the plane. IEEE Trans. Info. Theory, 29(4):551–558, 1983.
[152] Herbert Edelsbrunner, David Letscher, and Afra Zomorodian. Topological persistence and
simplification. Discrete Comput. Geom., 28:511–533, 2002.
[153] Herbert Edelsbrunner and Ernst P. Mücke. Three-dimensional alpha shapes. ACM Trans.
Graph., 13(1):43–72, 1994.
[154] Alon Efrat, Alon Itai, and Matthew J. Katz. Geometry helps in bottleneck matching and
related problems. Algorithmica, 31(1):1–28, 2001.
[155] David Eisenbud. The Geometry of Syzygies: A Second Course in Algebraic Geometry and
Commutative Algebra, volume 229. Springer Science & Business Media, 2005.
[156] Jeff Erickson and Kim Whittlesey. Greedy optimal homotopy and homology generators.
In Proc. 16th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 1038–1046,
2005.
[157] Emerson G. Escolar and Yasuaki Hiraoka. Optimal cycles for persistent homology via
linear programming. Optimization in the Real World, 13:79–96, 2016.
[158] Barbara Di Fabio and Claudia Landi. Reeb graphs of curves are stable under function
perturbations. Mathematical Methods in Applied Sciences, 35:1456–1471, 2012.
[159] Barbara Di Fabio and Claudia Landi. The edit distance for Reeb graphs of surfaces. Dis-
crete Comput. Geom., 55:423–461, 2016.
[160] Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman
Balakrishnan, and Aarti Singh. Confidence sets for persistence diagrams. The Annal. Stat.,
42(6):2301–2339, 2014.
[161] Robin Forman. Morse theory for cell complexes. Adv. Math., 134:90–145, 1998.
[162] Patrizio Frosini. A distance for similarity classes of submanifolds of a Euclidean space.
Bulletin of the Australian Mathematical Society, 42(3):407–415, 1990.
[164] Sylvestre Gallot, Dominique Hulin, and Jacques Lafontaine. Riemannian Geometry.
Springer-Verlag, 2nd edition, 1993.
[165] Marcio Gameiro, Yasuaki Hiraoka, and Ippei Obayashi. Continuation of point clouds via
persistence diagrams. Physica D: Nonlinear Phenomena, 334:118–132, 2016. Topology
in Dynamics, Differential Equations, and Data.
[166] Ellen Gasparovic, Maria Gommel, Emilie Purvine, Radmila Sazdanovic, Bei Wang, Yusu
Wang, and Lori Ziegelmeier. The relationship between the intrinsic Čech and persistence
distortion distances for metric graphs. J. Comput. Geom. (JoCG), 10(1), 2019. DOI: https://doi.org/10.20382/jocg.v10i1a16.
[167] Xiaoyin Ge, Issam Safa, Mikhail Belkin, and Yusu Wang. Data skeletonization via Reeb
graphs. In Proc. 25th Annu. Conf. Neural Info. Processing Sys. (NIPS), pages 837–845,
2011.
[168] Thomas Gebhart, Paul Schrater, and Alan Hylton. Characterizing the shape of activation
space in deep neural networks. CoRR, arXiv:1901.09496, 2019.
[169] Loukas Georgiadis, Robert Endre Tarjan, and Renato Fonseca F. Werneck. Design of
data structures for mergeable trees. In Proc. 17th Annu. ACM-SIAM Sympos. Discrete
Algorithms (SODA), pages 394–403, 2006.
[170] Robert Ghrist. Elementary Applied Topology. CreateSpace Independent Publishing Plat-
form, 2014.
[171] Alexander Grigor’yan, Yong Lin, Yuri Muranov, and Shing-Tung Yau. Homologies of path
complexes and digraphs. CoRR, arXiv:1207.2834, 2012.
[172] Alexander Grigor’yan, Yong Lin, Yuri Muranov, and Shing-Tung Yau. Homotopy theory
for digraphs. CoRR, arXiv:1407.0234, 2014.
[173] Alexander Grigor’yan, Yong Lin, Yuri Muranov, and Shing-Tung Yau. Cohomology of
digraphs and (undirected) graphs. Asian J. Math., 19(5):887–931, 2015.
[174] Alexander Grigor’yan, Yuri Muranov, and Shing-Tung Yau. Homologies of digraphs and
Künneth formulas. Communications in Analysis and Geometry, 25(5):969–1018, 2017.
[175] Mikhail Gromov. Groups of polynomial growth and expanding maps (with an appendix by
Jacques Tits). Publications Mathématiques de l’Institut des Hautes Études Scientifiques,
53(1):53–78, 1981.
[176] Mikhail Gromov. Hyperbolic groups. In S.M. Gersten, editor, Essays in Group Theory,
volume 8, pages 75–263. Mathematical Sciences Research Institute Publications, Springer,
1987.
[177] Karsten Grove. Critical point theory for distance functions. Proc. Sympos. Pure Math.,
54(3):357–385, 1993.
[178] Leonidas J. Guibas and Steve Y. Oudot. Reconstruction using witness complexes. Discrete Comput. Geom., 30:325–356, 2008.
[179] Victor Guillemin and Alan Pollack. Differential Topology. Prentice Hall, 1974.
[180] William H. Guss and Ruslan Salakhutdinov. On characterizing the capacity of neural net-
works using algebraic topology. CoRR, arXiv:1802.04443, 2018.
[181] Attila Gyulassy, Natallia Kotava, Mark Kim, Charles Hansen, Hans Hagen, and Valerio
Pascucci. Direct feature visualization using Morse-Smale complexes. IEEE Trans. Visual-
ization Comput. Graphics (TVCG), 18(9):1549–1562, 2012.
[182] Sariel Har-Peled and Manor Mendel. Fast construction of nets in low-dimensional metrics
and their applications. SIAM J. Comput., 35(5):1148–1184, 2006.
[183] Frank Harary. Graph Theory. Addison Wesley series in mathematics. Addison-Wesley,
1971.
[184] William Harvey, In-Hee Park, Oliver Rübel, Valerio Pascucci, Peer-Timo Bremer, Cheng-
long Li, and Yusu Wang. A collaborative visual analytics suite for protein folding research.
J. Mol. Graph. Modeling (JMGM), 53:59–71, 2014.
[185] William Harvey, Raphael Wenger, and Yusu Wang. A randomized O(m log m) time algorithm for computing Reeb graph of arbitrary simplicial complexes. In Proc. 26th Annu. ACM Sympos. Comput. Geom. (SoCG), pages 267–276, 2010.
[186] Allen Hatcher. Algebraic Topology. Cambridge University Press, Cambridge, 2002.
[187] Jean-Claude Hausmann. On the Vietoris-Rips complexes and a cohomology theory for
metric spaces. Annals Math. Studies, 138:175–188, 1995.
[188] John Hershberger and Jack Snoeyink. Computing minimum length paths of a given homo-
topy class. Comput. Geom.: Theory and Applications, 4:63–97, 1994.
[189] Franck Hétroy and Dominique Attali. Topological quadrangulations of closed triangulated
surfaces using the Reeb graph. Graph. Models, 65(1-3):131–148, 2003.
[190] Masaki Hilaga, Yoshihisa Shinagawa, Taku Kohmura, and Tosiyasu L Kunii. Topology
matching for fully automatic similarity estimation of 3D shapes. In Proc. 28th Annu. Conf.
Comput. Graphics Interactive Techniques, pages 203–212, 2001.
[191] David Hilbert. Über die Theorie der algebraischen Formen. Mathematische Annalen,
36:473–530, 1890.
[192] Yasuaki Hiraoka, Takenobu Nakamura, Akihiko Hirata, Emerson G. Escolar, Kaname Mat-
sue, and Yasumasa Nishiura. Hierarchical structures of amorphous solids characterized by
persistent homology. Proc. National Academy Sci., 113(26):7035–7040, 2016.
[193] Christoph Hofer, Roland Kwitt, Marc Niethammer, and Mandar Dixit. Connectivity-
optimized representation learning via persistent homology. In Proc. 36th Internat. Conf.
Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2751–
2760. PMLR, 2019.
[194] Christoph Hofer, Roland Kwitt, Marc Niethammer, and Andreas Uhl. Deep learning with
topological signatures. In Proc. Advances Neural Information Processing Sys., pages
1634–1644, 2017.
[195] Christoph D. Hofer, Roland Kwitt, and Marc Niethammer. Learning representations of
persistence barcodes. J. Machine Learning Research, 20(126):1–45, 2019.
[196] Derek F. Holt. The Meataxe as a tool in computational group theory. London Mathematical
Society Lecture Note Series, pages 74–81, 1998.
[197] Derek F. Holt and Sarah Rees. Testing modules for irreducibility. J. Australian Math.
Society, 57(1):1–16, 1994.
[198] John E. Hopcroft and Richard M. Karp. An n^{5/2} algorithm for maximum matchings in
bipartite graphs. SIAM J. Comput., 2(4):225–231, 1973.
[199] Xiaoling Hu, Fuxin Li, Dimitris Samaras, and Chao Chen. Topology-preserving deep
image segmentation. In Proc. 33rd Annu. Conf. Neural Info. Processing Sys. (NeurIPS),
pages 5658–5669, 2019.
[200] Oscar H. Ibarra, Shlomo Moran, and Roger Hui. A generalization of the fast LUP matrix
decomposition algorithm and applications. J. Algorithms, 3(1):45–56, 1982.
[201] Arthur F. Veinott Jr. and George B. Dantzig. Integral extreme points. SIAM Review,
10(3):371–372, 1968.
[203] Matthew Kahle. Sharp vanishing thresholds for cohomology of random flag complexes.
Annals Math., pages 1085–1107, 2014.
[204] Matthew Kahle and Elizabeth Meckes. Limit theorems for Betti numbers of random sim-
plicial complexes. Homology, Homotopy and Applications, 15(1):343–374, 2013.
[205] Sara Kališnik. Tropical coordinates on the space of persistence barcodes. Found. Comput.
Math., 19:101–129, 2019.
[206] Lida Kanari, Paweł Dłotko, Martina Scolamiero, Ran Levi, Julian Shillcock, Kathryn Hess,
and Henry Markram. A topological representation of branching neuronal morphologies.
Neuroinformatics, 16(1):3–13, 2018.
[207] Michael Kerber, Michael Lesnick, and Steve Oudot. Exact computation of the matching
distance on 2-parameter persistence modules. In Proc. 35th Internat. Sympos. Comput.
Geom. (SoCG), volume 129 of LIPIcs, pages 46:1–46:15, 2019.
[208] Michael Kerber, Dmitriy Morozov, and Arnur Nigmetov. Geometry helps to compare
persistence diagrams. J. Experimental Algo. (JEA), 22(1):1–4, 2017.
[209] Michael Kerber and Arnur Nigmetov. Efficient approximation of the matching distance for
2-parameter persistence. CoRR, arXiv:1912.05826, 2019.
[210] Michael Kerber and Hannah Schreiber. Barcodes of towers and a streaming algorithm for
persistent homology. Discrete Comput. Geom., 61(4):852–879, 2019.
[211] Kwangho Kim, Jisu Kim, Manzil Zaheer, Joon Kim, Larry Wasserman, and Frédéric
Chazal. PLLay: Efficient topological layer based on persistent landscapes. In
Proc. 33rd Annu. Conf. Advances Neural Info. Processing Sys. (NeurIPS), 2020.
[212] Woojin Kim and Facundo Mémoli. Generalized persistence diagrams for persistence mod-
ules over posets. CoRR, arXiv:1810.11517, 2018.
[213] Henry King, Kevin P. Knudson, and Neza Mramor. Generating discrete Morse functions
from point data. Exp. Math., 14(4):435–444, 2005.
[215] Genki Kusano, Kenji Fukumizu, and Yasuaki Hiraoka. Kernel method for persistence
diagrams via kernel embedding and weight factor. Journal of Machine Learning Research,
18(189):1–41, 2018.
[216] Claudia Landi. The rank invariant stability via interleavings. CoRR, arXiv:1412.3374,
2014.
[217] Janko Latschev. Vietoris-Rips complexes of metric spaces near a closed Riemannian man-
ifold. Archiv der Mathematik, 77(6):522–528, 2001.
[218] Tam Le and Makoto Yamada. Persistence Fisher kernel: A Riemannian manifold kernel
for persistence diagrams. In Proc. Advances Neural Info. Processing Sys. (NIPS), pages
10028–10039, 2018.
[219] Jean Leray. Sur la forme des espaces topologiques et sur les points fixes des représenta-
tions. J. Math. Pure Appl., 24:95–167, 1945.
[220] Michael Lesnick. The theory of the interleaving distance on multidimensional persistence
modules. Found. Comput. Math., 15(3):613–650, 2015.
[221] Michael Lesnick and Matthew Wright. Interactive visualization of 2-d persistence modules.
CoRR, arXiv:1512.00180, 2015.
[222] Michael Lesnick and Matthew Wright. Computing minimal presentations and Betti num-
bers of 2-parameter persistent homology. CoRR, arXiv:1902.05708, 2019.
[223] Thomas Lewiner, Hélio Lopes, and Geovan Tavares. Applications of Forman’s discrete
Morse theory to topology visualization and mesh compression. IEEE Trans. Vis. Comput.
Graph., 10(5):499–508, 2004.
[224] Li Li, Wei-Yi Cheng, Benjamin S. Glicksberg, Omri Gottesman, Ronald Tamler, Rong
Chen, Erwin P. Bottinger, and Joel T. Dudley. Identification of type 2 diabetes sub-
groups through topological analysis of patient similarity. Science Translational Medicine,
7(311):311ra174, 2015.
[225] André Lieutier. Any open bounded subset of R^n has the same homotopy type as its medial
axis. Computer-Aided Design, 36(11):1029–1046, 2004.
[226] Yong Lin, Linyuan Lu, and Shing-Tung Yau. Ricci curvature of graphs. Tohoku Mathe-
matical Journal, Second Series, 63(4):605–627, 2011.
[228] Clément Maria and Steve Y. Oudot. Zigzag persistence via reflections and transpositions.
In Proc. 26th Annu. ACM-SIAM Sympos. Discrete Algorithms (SODA), pages 181–199,
2015.
[229] Paolo Masulli and Alessandro E. P. Villa. The topology of the directed clique complex as a
network invariant. SpringerPlus, 5(1):388, 2016.
[230] Yuriy Mileyko, Sayan Mukherjee, and John Harer. Probability measures on the space of
persistence diagrams. Inverse Problems, 27(12):124007, 2011.
[231] Ezra Miller and Bernd Sturmfels. Combinatorial Commutative Algebra. Springer-Verlag
New York, 2004.
[232] John W. Milnor. Topology from the differentiable viewpoint. Univ. Press of Virginia, 1965.
[233] John W. Milnor. Morse Theory. Annals of Mathematics Studies. Princeton University
Press, 5th edition, 1973.
[234] Nikola Milosavljević, Dmitriy Morozov, and Primoz Skraba. Zigzag persistent homology
in matrix multiplication time. In Proc. 27th Annu. Sympos. Comput. Geom. (SoCG), pages
216–225, 2011.
[235] Konstantin Mischaikow and Vidit Nanda. Morse theory for filtrations and efficient compu-
tation of persistent homology. Discrete Comput. Geom., 50(2):330–353, 2013.
[236] Michael Moor, Max Horn, Bastian Rieck, and Karsten Borgwardt. Topological autoen-
coders. CoRR, arXiv:1906.00722, 2019.
[237] Dmitriy Morozov, Kenes Beketayev, and Gunther H. Weber. Interleaving distance between
merge trees. In Workshop on Topological Methods in Data Analysis and Visualization:
Theory, Algorithms and Applications, 2013.
[239] Elizabeth Munch, Katharine Turner, Paul Bendich, Sayan Mukherjee, Jonathan Mattingly,
and John Harer. Probabilistic Fréchet means for time varying persistence diagrams. Elec-
tron. J. Statist., 9(1):1173–1204, 2015.
[240] Elizabeth Munch and Bei Wang. Convergence between categorical representations of
Reeb space and mapper. In 32nd Internat. Sympos. Comput. Geom. (SoCG), volume 51
of LIPIcs, pages 53:1–53:16, 2016.
[242] James R. Munkres. Topology, 2nd Edition. Prentice Hall, Inc., 2000.
[243] Gregory Naitzat, Andrey Zhitnikov, and Lek-Heng Lim. Topology of deep neural networks.
J. Mach. Learn. Res., 21:184:1–184:40, 2020.
[244] Monica Nicolau, Arnold J. Levine, and Gunnar Carlsson. Topology based data analysis
identifies a subgroup of breast cancers with a unique mutational profile and excellent sur-
vival. Proc. National Acad. Sci., 108.17:7265–7270, 2011.
[245] Partha Niyogi, Stephen Smale, and Shmuel Weinberger. Finding the homology of subman-
ifolds with high confidence from random samples. Discrete Comput. Geom., 39(1-3):419–
441, 2008.
[246] Partha Niyogi, Stephen Smale, and Shmuel Weinberger. A topological view of unsuper-
vised learning from noisy data. SIAM J. Comput., 40(3):646–663, 2011.
[248] James B. Orlin. Max flows in O(nm) time, or better. In Proc. 45th Annu. ACM Sympos.
Theory Comput. (STOC), pages 765–774, 2013.
[249] Steve Oudot. Persistence Theory: From Quiver Representations to Data Analysis, volume
209. AMS Mathematical Surveys and Monographs, 2015.
[250] Deepti Pachauri, Chris Hinrichs, Moo K. Chung, Sterling C. Johnson, and Vikas Singh.
Topology-based kernels with application to inference problems in Alzheimer’s disease.
IEEE Trans. Med. Imaging, 30(10):1760–1770, 2011.
[251] Richard A. Parker. The computer calculation of modular characters (the Meataxe). Comput.
Group Theory, pages 267–274, 1984.
[252] Salman Parsa. A deterministic O(m log m) time algorithm for the Reeb graph. Discrete
Comput. Geom., 49(4):864–878, Jun 2013.
[253] Valerio Pascucci, Giorgio Scorzelli, Peer-Timo Bremer, and Ajith Mascarenhas. Robust
on-line computation of Reeb graphs: simplicity and speed. ACM Trans. Graph., 26(3):58,
2007.
[254] Amit Patel. Generalized persistence diagrams. J. Appl. Comput. Topology, 1:397–419,
2018.
[255] Giovanni Petri, Martina Scolamiero, Irene Donato, and Francesco Vaccarino. Topological
strata of weighted complex networks. PLOS ONE, 8:1–8, 06 2013.
[256] Jeff M. Phillips, Bei Wang, and Yan Zheng. Geometric inference on kernel density esti-
mates. In Lars Arge and János Pach, editors, Proc. 31st Internat. Sympos. Comput. Geom.
(SoCG), volume 34 of LIPIcs, pages 857–871, 2015.
[257] Adrien Poulenard, Primoz Skraba, and Maks Ovsjanikov. Topological function optimiza-
tion for continuous shape matching. Comput. Graphics Forum, 37(5):13–25, 2018.
[258] Victor V. Prasolov. Elements of combinatorial and differential topology, volume 74. Amer.
Math. Soc., 2006.
[260] Raúl Rabadán and Andrew J. Blumberg. Topological Data Analysis for Genomics and
Evolution: Topology in Biology. Cambridge University Press, 2019.
[261] Georges Reeb. Sur les points singuliers d’une forme de Pfaff complètement intégrable ou
d’une fonction numérique. Comptes Rendus Hebdomadaires des Séances de l’Académie
des Sciences, 222:847–849, 1946.
[262] Michael W. Reimann, Max Nolte, Martina Scolamiero, Katharine Turner, Rodrigo Perin,
Giuseppe Chindemi, Paweł Dłotko, Ran Levi, Kathryn Hess, and Henry Markram. Cliques
of neurons bound into cavities provide a missing link between structure and function. Fron-
tiers Comput. Neuroscience, 11:48, 2017.
[263] Jan Reininghaus, Stefan Huber, Ulrich Bauer, and Roland Kwitt. A stable multi-scale
kernel for topological machine learning. In Proc. Comput. Vision Pattern Recognition,
pages 4741–4748, 2015.
[264] Bastian Rieck, Matteo Togninalli, Christian Bock, Michael Moor, Max Horn, Thomas
Gumbsch, and Karsten Borgwardt. Neural persistence: A complexity measure for deep
neural networks using algebraic topology. In Proc. Internat. Conf. Learning Representa-
tions (ICLR), 2019.
[265] Claus M. Ringel and Hiroyuki Tachikawa. QF-3 rings. J. für die Reine und Angewandte
Mathematik, 272:49–72, 1975.
[266] Vanessa Robins. Towards computing homology from finite approximations. Topology
Proceedings, 24(1):503–532, 1999.
[267] Vanessa Robins, Peter J. Wood, and Adrian P. Sheppard. Theory and algorithms for con-
structing discrete Morse complexes from grayscale digital images. IEEE Trans. Pattern
Anal. Machine Intelligence, 33(8):1646–1658, 2011.
[268] Tim Römer. On minimal graded free resolutions. Illinois J. Math, 45(2):1361–1376, 2001.
[269] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks
for biomedical image segmentation. In Proc. Internat. Conf. Medical Image Comput.
Computer-Assisted Intervention, pages 234–241. Springer, 2015.
[270] Jim Ruppert. A Delaunay refinement algorithm for quality 2-dimensional mesh generation.
J. Algorithms, 18:548–585, 1995.
[271] Bernhard Schölkopf and Alexander J. Smola. Learning with Kernels: Support Vector Ma-
chines, Regularization, Optimization, and Beyond. The MIT Press, 1998.
[272] Alexander Schrijver. Theory of Linear and Integer Programming. John Wiley & Sons Ltd.,
Chichester, 1986.
[276] Yoshihisa Shinagawa, Tosiyasu L. Kunii, and Yannick L. Kergosien. Surface coding based
on Morse theory. IEEE Comput. Graph. Appl., 11(5):66–78, 1991.
[277] Gurjeet Singh, Facundo Mémoli, and Gunnar Carlsson. Topological methods for the analy-
sis of high dimensional data sets and 3D object recognition. In Proc. Eurographics Sympos.
Point-Based Graphics (2007), pages 91–100, 2007.
[278] Primoz Skraba and Katharine Turner. Wasserstein stability for persistence diagrams. CoRR,
arXiv:2006.16824, 2021.
[279] Jacek Skryzalin. Numeric invariants from multidimensional persistence. PhD thesis,
Stanford University, 2016.
[280] Daniel D. Sleator and Robert Endre Tarjan. A data structure for dynamic trees. J. Comput.
Syst. Sci., 26(3):362–391, June 1983.
[281] Henry J. S. Smith. On systems of linear indeterminate equations and congruences. Philo-
sophical Transactions of the Royal Society of London, 151:293–326, 1861.
[282] Thierry Sousbie. The persistent cosmic web and its filamentary structure - I. theory and
implementation. Monthly Notices Royal Astronomical Soc., 414(1):350–383, 2011.
[283] Bharath K. Sriperumbudur, Kenji Fukumizu, and Gert R.G. Lanckriet. Universality, char-
acteristic kernels and RKHS embedding of measures. J. Machine Learning Research,
12(70):2389–2410, 2011.
[284] Jian Sun, Maks Ovsjanikov, and Leonidas Guibas. A concise and provably informative
multi-scale signature based on heat diffusion. In Proceedings of the Symposium on Geom-
etry Processing (SGP), pages 1383–1392, Goslar, DEU, 2009. Eurographics Association.
[285] Julien Tierny. Reeb graph based 3D shape modeling and applications. PhD thesis, Uni-
versité des Sciences et Technologies de Lille, 2008.
[286] Julien Tierny. Topological Data Analysis for Scientific Visualization. Springer Internat.
Publishing, 2017.
[287] Julien Tierny, Attila Gyulassy, Eddie Simon, and Valerio Pascucci. Loop surgery for vol-
umetric meshes: Reeb graphs reduced to contour trees. IEEE Trans. Vis. Comput. Graph.,
15(6):1177–1184, 2009.
[288] Brenda Y. Torres, Jose H. M. Oliveira, Ann Thomas Tate, Poonam Rath, Katherine Cum-
nock, and David S. Schneider. Tracking resilience to infections by mapping disease space.
PLOS Biology, 14(4):1–19, 2016.
[289] Elena Farahbakhsh Touli and Yusu Wang. FPT-algorithms for computing Gromov-
Hausdorff and interleaving distances between trees. CoRR, arXiv:1811.02425, 2018. Also
in Proc. European Sympos. Algorithms (ESA), 2019.
[290] Tony Tung and Francis Schmitt. The augmented multiresolution Reeb graph approach for
content-based retrieval of 3D shapes. Internat. J. Shape Modeling, 11(1):91–120, 2005.
[291] Katharine Turner, Yuriy Mileyko, Sayan Mukherjee, and John Harer. Fréchet means for
distributions of persistence diagrams. Discrete Comput. Geom., 52(1):44–70, 2014.
[292] Gert Vegter and Chee K. Yap. Computational complexity of combinatorial surfaces. In
Proc. 6th Annu. Sympos. Comput. Geom. (SoCG), pages 102–111, 1990.
[293] Leopold Vietoris. Über den höheren Zusammenhang kompakter Räume und eine Klasse
von zusammenhangstreuen Abbildungen. Mathematische Annalen, 97:454–472, 1927.
[294] Suyi Wang, Xu Li, Partha Mitra, and Yusu Wang. Topological skeletonization and tree-
summarization of neurons using discrete Morse theory. CoRR, arXiv:1805.04997, 2018.
[295] Suyi Wang, Yusu Wang, and Yanjie Li. Efficient map reconstruction and augmentation
via topological methods. In Jie Bao, Christian Sengstock, Mohammed Eunus Ali, Yan
Huang, Michael Gertz, Matthias Renz, and Jagan Sankaranarayanan, editors, Proc. 23rd
SIGSPATIAL Internat. Conf. Advances in GIS, pages 25:1–25:10, 2015.
[296] Suyi Wang, Yusu Wang, and Rephael Wenger. The JS-graph of join and split trees. In
Proc. 30th Annu. Sympos. Comput. Geom. (SoCG), pages 539–548, 2014.
[297] Larry Wasserman. Topological data analysis. Annual Review of Statistics and Its Applica-
tion, 5(1):501–532, 2018. Available at SSRN: https://ssrn.com/abstract=3156968.
[298] Cary Webb. Decomposition of graded modules. Proc. American Math. Soc., 94(4):565–
571, 1985.
[299] Gunther Weber, Peer-Timo Bremer, and Valerio Pascucci. Topological landscapes: A ter-
rain metaphor for scientific data. IEEE Trans. Vis. Comput. Graphics, 13(6):1416–1423,
2007.
[300] André Weil. Sur les théoréms de de Rham. Commentarii Mathematici Helvetici, 26:119–
145, 1952.
[301] Zoë Wood, Hugues Hoppe, Mathieu Desbrun, and Peter Schröder. Removing excess topol-
ogy from isosurfaces. ACM Trans. Graph., 23(2):190–208, 2004.
[302] Pengxiang Wu, Chao Chen, Yusu Wang, Shaoting Zhang, Changhe Yuan, Zhen Qian, Dim-
itris N. Metaxas, and Leon Axel. Optimal topological cycles and their application in cardiac
trabeculae restoration. In Info. Processing Medical Imaging - 25th Internat. Conf., IPMI,
pages 80–92, 2017.
[303] Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhut-
dinov, and Alexander Smola. Deep sets. In Proc. Advances Neural Info. Processing Sys.,
pages 3391–3401, 2017.
[304] Simon Zhang, Mengbai Xiao, and Hao Wang. GPU-accelerated computation of Vietoris-
Rips persistence barcodes. In 36th Internat. Sympos. Comput. Geom. (SoCG), volume 164,
pages 70:1–70:17, 2020.
[305] Qi Zhao and Yusu Wang. Learning metrics for persistence-based summaries and appli-
cations for graph classification. In Proc. 33rd Annu. Conf. Neural Info. Processing Sys.
(NeurIPS), pages 9855–9866, 2019.
Index

complex
    simplicial, 24
connected, 7
connected space, 5
contiguous maps, 26
continuous function, 10
contour, 171
contour tree, 172
convex hull, 24
coset, 37
cover
    path connected, 212
    maps, 210
critical
    V-path, 240
    point, 16, 17
    point index, 19
    simplices, 239
    value, 17, 114
cut, 137
cycle group, 40
death, 58
deformation retract, 13
deformation retraction, 13
Delaunay
    complex, 29
    simplex, 29
derivative, 16
diameter of a point set, 9
dimension, 24
    of a manifold, 13
    of a simplex, 23, 24
disconnected, 7
discrete Morse
    field, 239
    function, 238
distance field, 158
DMVF, 239
Dowker complex, 198
edge, 23
elementary simplicial map, 98
embedding, 10
Euclidean
    ball, 9
    sphere, 9
exact sequence, 268
extended persistence, 118
extended plane, 52, 59
face
    of a simplex, 23, 24
facet
    of a simplex, 23, 24
field, 37
filtration, 54
    function, 57
    nested pair, 163
    simplex-wise, 54
finite type, 302
finitely generated, 265
flag complex, 193
free group, 37
free module, 267
free resolution, 291
function-induced metric, 184
functional distortion distance, 184
functor, 301
generalized vector field, 159
generating set, 265
    minimal, 266
generator, 37, 265
genus, 14
geodesic
    distance, 9
geometric realization, 25
grade, 264
graded Betti number, 292
graded module, 264
gradient, 331
    vector field, 17
    path, 249
    vector, 17, 237, 248, 249
gradient vector, 17
gradient vector field, 17
ring, 37
Rips distance, 149
sampling conditions, 34
shifted module M→u, 267
simplex, 23, 24
simplex-wise, 54
    monotone function, 56
simplicial
    map, 26
    retraction, 154
simplicial complex, 25
    abstract, 24
    geometric, 24
singular
    homology, 45
    simplex, 45
skeleton, 25
Sliced Wasserstein
    distance, 326
    kernel, 327
Smith normal form, 135
smooth
    manifold, 15
    surface, 15
sparse Rips, 152
    filtration, 152
split tree, 172
stability
    persistence diagram, 62
star, 25
strong convexity, 164
strong witness, 31
sublevel set, 19, 52
subordinate, 213
subspace topology, 5
superlevel set, 19
support, 290
surface, 14
system of subsets, 4
tame, 63
    persistence module, 78
tetrahedron, 23
topological
    space, 4
    subspace, 5
topologically equivalent, 10
topology, 4, 6
total decomposition
    module, 267
    morphism, 268
totally unimodular, 133
tower, 94
triangle, 23
triangular commutativity, 76, 303
triangulation, 24, 26
trivial module, 268
unbounded, 9
underlying space, 25
union-find, 87
unit ball, 9
upper star, 56
    filtration, 56
upper-link-index, 80
valid annotation, 97
vector space, 38, 46
vertex, 23
    function, 56
    map, 26
vertical homology, 180
Vietoris-Rips complex, 28
Voronoi diagram, 30
Wasserstein distance, 63
weak
    feature size, 160
    interleaving
        vector space towers, 157
    pseudomanifold, 137
    witness, 31
weight of cycle, 124
without boundary, 14
witness complex, 31
zigzag
    filtration, 103
    module, 104