
Algebraic Topology for Data Scientists

Michael S. Postol, Ph.D.


arXiv:2308.10825v1 [math.AT] 21 Aug 2023

August 22, 2023

The MITRE Corporation

[email protected]

All rights reserved.

The author’s affiliation with The MITRE Corporation is provided for identification purposes only and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions, or viewpoints expressed by the author.
Acknowledgements

I would like to thank the following people for their help in putting together this book in its current form.
First of all, I would like to thank my longtime collaborator, Bob Simon, for leading the development of our time
series analysis tool (TSAT). This grew out of my interest in the time series discretization algorithm developed by
Jessica Lin and Eamonn Keogh. Jessica Lin gave us a lot of advice on how to use SAX for our novel cybersecurity
applications. Candace Diaz has an algebraic topology background, and we have had many discussions on current
topics in topological data analysis. She introduced me to the Gidea-Katz work, which we applied to extending
SAX to the problem of classifying multivariate time series. Drew Wicke did a large portion of the software
development for TSAT and helped with the experiments described in our paper.
In addition to the people listed above, I would also like to thank Liz Munch, Rocio Gonzalez-Diaz, Andrew
Blumberg, Emilie Purvine, and Tegan Emerson for helpful conversations. I would like to thank Marian Gidea, Yuri
Katz, Eamonn Keogh, Jessica Lin, Ian Witten, Dmitry Cousin, Henry Adams, Liz Munch, Bob Ghrist, Vin de Silva,
Greg Friedman, and Francis Sergeraert for the use of illustrations from their works. Mary Ann Wymore helped me
with the proper formatting and with legal expertise. I apologize to anyone I may have left out of the above list.
Finally, I would like to thank my wife Nadine for her help in pulling together the permissions. I am grateful to her
for her support of this effort, and for all her support in general over the last 26 years.

Contents

1 Introduction 1
1.1 Note on the Use of This Book. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Point-Set Topology Background 5
2.1 Sets, Functions, and Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.1 Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.2 Functions and Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.3 Metric Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Topological Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Continuous Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.4 Subspaces, Product Spaces, and Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.1 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.2 Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.4.3 Quotient Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Separation Axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.6 Two Ways to be Infinite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.7 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.8 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3 Abstract Algebra Background 31
3.1 Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.2 Exact Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3 Rings and Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Vector Spaces, Modules, and Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.5 Category Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4 Traditional Homology Theory 51
4.1 Simplicial Homology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.1.1 Simplicial Complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.1.2 Homology Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.1.3 Homology with Arbitrary Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.1.4 Computability of Homology Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.1.5 Homomorphisms Induced by Simplicial Maps . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.1.6 Topological Invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.1.7 Application: Maps of Spheres, Fixed Point Theorems, and Euler Number . . . . . . . . . . . . . 67
4.2 Eilenberg-Steenrod Axioms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2.1 Long Exact Sequences and Zig-Zagging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2.2 Mayer-Vietoris Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.2.3 The Axiom List or What is a Homology Theory? . . . . . . . . . . . . . . . . . . . . . . . . 73
4.3 Singular and Cellular Homology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.3.1 Singular Homology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74


4.3.2 CW Complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.3.3 Homology of CW Complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3.4 Projective Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5 Persistent Homology 83
5.1 Definition of Persistent Homology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.2 Computational Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.3 Bar Codes, Persistence Diagrams, and Persistence Landscapes . . . . . . . . . . . . . . . . . . . . . 89
5.4 Zig-Zag Persistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.5 Distance Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.6 Sublevel Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.7 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.7.1 Review of Graph Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.7.2 Graph Distance Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
5.7.3 Simplicial Complexes from Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.8 Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.9 SAX and Multivariate Time Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
5.9.1 SAX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
5.9.2 SEQUITUR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.9.3 Representative Pattern Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
5.9.4 Converting Multivariate Time Series to the Univariate Case or Predicting Economic Collapse . 109
5.9.5 Classification of Internet of Things Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
5.9.6 More on Time Series and Topological Data Analysis . . . . . . . . . . . . . . . . . . . . . . 111
5.10 Persistence Images and Template Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
5.11 Open Source Software and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
6 Related Techniques 123
6.1 Q-Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6.1.1 Topological Representation of Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.1.2 Exterior Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.1.3 Homotopy Shomotopy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.1.4 Structures of the University of Essex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.1.5 Rules for Success in the Committee Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.2 Sensor Coverage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.3 Mapper . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.3.1 Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.3.2 Implementation for Point Cloud Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.3.3 Height Function Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
6.3.4 Application to Breast Cancer Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.4 Simplicial Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.4.1 Ordered Simplicial Complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
6.4.2 Delta Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.4.3 Definition of Simplicial Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
6.4.4 Geometric Realization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.5 UMAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.5.1 Theoretical Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
6.5.2 Computational View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.5.3 Weaknesses of UMAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
6.5.4 UMAP vs. t-SNE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7 Some Unfinished Ideas 153
7.1 Time Series of Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.2 Directed Simplicial Complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

7.3 Computer Intrusion Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154


7.4 Market Basket Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8 Cohomology 159
8.1 Introduction to Homological Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
8.1.1 Hom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.1.2 Tensor Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
8.1.3 Ext . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.1.4 Tor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
8.2 Definition of Cohomology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
8.3 Cup Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
8.4 Universal Coefficient Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
8.5 Homology and Cohomology of Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
8.6 Ring Structure of the Cohomology of a Product Space . . . . . . . . . . . . . . . . . . . . . . . . . . 183
8.7 Persistent Cohomology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
8.8 Ripser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
9 Homotopy Theory 189
9.1 The Extension and Lifting Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
9.2 The Fundamental Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
9.3 Fiber Bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
9.4 The Hopf Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
9.5 Paths and Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
9.6 Higher Homotopy Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
9.6.1 Definition of Higher Homotopy Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
9.6.2 Relative Homotopy Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
9.6.3 Boundary Operator and Induced Homomorphisms . . . . . . . . . . . . . . . . . . . . . . . 203
9.6.4 Properties of Homotopy Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
9.6.5 Homotopy Systems vs. Eilenberg-Steenrod Axioms . . . . . . . . . . . . . . . . . . . . . . . 206
9.6.6 Operation of the Fundamental Group on the Higher Homotopy Groups . . . . . . . . . . . . 206
9.7 Calculation of Homotopy Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.7.1 Products and One Point Union of Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
9.7.2 Hurewicz Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
9.7.3 Freudenthal’s Suspension Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
9.7.4 Whitehead’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
9.8 Eilenberg-MacLane Spaces and Postnikov Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
9.9 Spectral Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
9.9.1 Basic Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
9.9.2 Leray-Serre Spectral Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
9.9.3 Filtrations and Exact Couples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
9.9.4 Kenzo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
10 Obstruction Theory 227
10.1 Classical Obstruction Theory and the Extension Problem . . . . . . . . . . . . . . . . . . . . . . . . 228
10.1.1 Extension Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
10.1.2 The Obstruction Cochain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
10.1.3 The Difference Cochain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
10.1.4 Eilenberg’s Extension Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
10.1.5 Obstruction Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
10.1.6 Application to Data Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.2 Possible Application to Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.3 More on Simplicial Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
10.3.1 Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234

10.3.2 Kan Complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235


10.4 Computability of the Extension Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
11 Steenrod Squares and Reduced Powers 241
11.1 Steenrod’s Original Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
11.2 Cohomology Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
11.3 Construction of Steenrod Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
11.3.1 Cohomology of K(Z2 , 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
11.3.2 Acyclic Carriers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
11.3.3 The Cup-i Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
11.3.4 Steenrod Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
11.4 Basic Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
11.4.1 Sq 1 and Sq 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
11.4.2 The Cartan Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
11.4.3 Cartesian Products of P ∞ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
11.4.4 Adem Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
11.5 The Hopf Invariant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
11.6 The Steenrod Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
11.7 Cohomology of Eilenberg-MacLane Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
11.7.1 Bockstein Exact Couple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
11.7.2 Serre’s Exact Sequence for a Fiber Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
11.7.3 Transgression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
11.7.4 Cohomology Version of Leray-Serre Spectral Sequence . . . . . . . . . . . . . . . . . . . . . 264
11.7.5 H ∗ (Z, 2; Z) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
11.7.6 H ∗ (Z2 , n; Z2 ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
11.7.7 Additional Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
11.8 Reduced Powers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
11.9 Vector Bundles and Stiefel-Whitney Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
11.9.1 Vector Bundles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
11.9.2 New Bundles from Old Ones . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
11.9.3 Stiefel-Whitney Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
11.9.4 Grassmann Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
11.9.5 Representation of Stiefel-Whitney Classes as Steenrod Squares . . . . . . . . . . . . . . . . 272
11.10 Computer Computation of Steenrod Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
12 Homotopy Groups of Spheres 279
12.1 Stable Homotopy Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
12.2 Classes of Abelian Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
12.3 Some Techniques from the Early Years of the Subject . . . . . . . . . . . . . . . . . . . . . . . . . . 284
12.3.1 Finiteness of Homotopy Groups of Odd Dimensional Spheres . . . . . . . . . . . . . . . . . 284
12.3.2 Iterated Suspension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
12.3.3 The p-primary components of πm (S 3 ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.3.4 Pseudo-projective Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
12.3.5 The Hopf Invariant Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
12.3.6 πn+1 (S n ) and πn+2 (S n ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
12.3.7 πn+r (S n ) for 3 ≤ r ≤ 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
12.4 More Modern Techniques and Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
12.5 Table of Some Homotopy Groups of Spheres . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
13 Conclusion 293
List of Tables

3.1.1 Group Laws. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

5.9.1 Results for TDA Multivariate Case. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

6.1.1 Social Amenities at the University of Essex [9]. . . . . . . . . . . . . . . . . . . . . . . . . . . 127

12.1.1 Stable Homotopy Groups πiS for i ≤ 19 [63, 162]. . . . . . . . . . . . . . . . . . . . . . . . . 280

12.5.1 Homotopy Groups πi (S n ) for 1 ≤ n ≤ 10 and 1 ≤ i ≤ 15 [162]. . . . . . . . . . . . . . . . . . 291

List of Figures

1.0.1 Coffee Cup Turning into a Donut [98] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


1.0.2 Ring of Data Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2.1.1 Union, Intersection and Set Difference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6


2.1.2 Torus [179], Cylinder [174], and Moebius Strip [176] . . . . . . . . . . . . . . . . . . . . . . . 7
2.2.1 Closure, Interior, and Frontier (Boundary) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4.1 Folded Rectangles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.2 Klein Bottle [175] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.3 Boy’s Surface [175] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4.4 Suspension of a Circle [178] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.8.1 Topologist’s Sine Curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

4.1.1 Spaceship Earth [177] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52


4.1.2 Convex vs Non-Convex Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.3 Good and Bad Simplicial Complexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.4 Boundary Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.1.5 Homology Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.1.6 Excision Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.1.7 Barycentric Subdivision [110] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.1.8 Stereographic Projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.2.1 Zig-zag Lemma [110] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.3.1 Non-triangulable CW Complex [110] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.3.2 Torus and Klein Bottle as CW Complexes. [110] . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

5.0.1 Ring of Data Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83


5.1.1 Vietoris-Rips Complexes of a Point Cloud with ϵ Growing [129] . . . . . . . . . . . . . . . . . . . . 85
5.3.1 Bar Codes and Persistence Diagrams [31] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.3.2 Persistence Diagrams vs. Persistence Landscapes [55] . . . . . . . . . . . . . . . . . . . . . . 91
5.5.1 Example Matching of Two Persistence Diagrams [31] . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.6.1 Sublevel Sets of the Height Function of a Torus [129] . . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.6.2 Barcode and Persistence Diagram of a Function f : [0, 1] → R [31]. . . . . . . . . . . . . . . . . . . 98
5.6.3 Barcode and Persistence Diagram of a Height function of a surface in R3 [31]. . . . . . . . . . . . . 98
5.7.1 Example of a Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.7.2 Spanning Tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.7.3 Directed Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.7.4 Graph Edit Distance Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5.9.1 Piecewise Aggregate Approximation (PAA) Example [88] . . . . . . . . . . . . . . . . . . . . . . . 106
5.9.2 SAX Discretization Example: α = 3 [88] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.9.3 Grammar Rule Examples [115] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107


5.9.4 SEQUITUR Example [115] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108


5.9.5 Behavior of Three Indices Immediately Before Two Economic Crises. [55] . . . . . . . . . . . . . . 109
5.9.6 Clique Complex of Window ω10 (0) After First 4 Nonzero Edges are Added [34] . . . . . . . . . . . 111
5.9.7 Ordinal Partition Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
5.10.1 Pipeline For Persistent Images [2]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.10.2 Three Criteria for Relative Compactness on Sets of Persistence Diagrams [120]. . . . . . . . . 117
5.10.3 Tent function g(4,6),2 drawn in the birth-death plane and the birth-lifetime plane [120]. . . . . 121

6.2.1 A generator of H2 (Rs , Fs ) which is killed by the inclusion i∗ into H2 (Rw , Fw ). The fence nodes are
on top and the strip is a border of radius rf [39] . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
6.3.1 Simplicial Complex for Example 6.3.1 [146]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.3.2 Simplicial Complex for Example 6.3.2 [146]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
6.3.3 Simplicial Complex for Example 6.3.3 [146]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
6.3.4 Simplicial Complex for Breast Cancer data with filter p = 2, k = 4. [116]. . . . . . . . . . . . . . . 135
6.4.1 Face Maps of |∆2 | [51]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.4.2 |∆2 | glued into a cone [51]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
6.4.3 A Delta set with two vertices bounding two different edges (based on example from [51]). . . . . . . 137
6.4.4 The 1-simplices in |∆1 |, the 1-simplices in |∆2 |, and the 2-simplices in |∆2 | [51]. . . . . . . . . . . 139
6.4.5 Face of a singular simplex [51]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6.4.6 Degenerate singular simplex [51]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

7.3.1 Authorized user and lion impostor [126]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155


7.3.2 Example of a process graph [126]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
7.3.3 Example of a window graph [126]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
7.3.4 Confusion matrix of 10-way random forest for intrusion detection [126]. . . . . . . . . . . . . . . . 156

8.3.1 S 1 ∨ S 1 ∨ S 2 [110]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176

9.9.1 First Quadrant Spectral Sequence [92]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217


9.9.2 Kenzo the Cat [43]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
9.9.3 Maps Involved in a Reduction [139]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226

10.3.1 Realization of ∆1 × ∆1 [51]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
10.3.2 Horns on |∆2 | [51]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
10.3.3 A singular set satisfies the Kan condition [51]. . . . . . . . . . . . . . . . . . . . . . . . . . . 237

11.5.1 Attaching a 2n-cell to S n ∨ S n to form S n × S n [106]. . . . . . . . . . . . . . . . . . . . . . 256

12.1.1 2-components of πiS for i ≤ 60 [63]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
12.1.2 3-components of πiS for i ≤ 100 [63]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
12.1.3 5-components of πiS for i ≤ 1000 [63]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
Chapter 1

Introduction

This textbook will give a thorough introduction to topological data analysis (TDA), the application of algebraic topol-
ogy to data science. Algebraic topology was the topic of my Ph.D. dissertation, written 30 years ago at the University
of Minnesota [124, 125]. At the time, the subject was rather theoretical so I have always been intrigued by the idea
of having a practical application. Today, there seems to be a growing interest in using topology for data science and
machine learning. At the International Conference on Machine Learning Applications held in December 2019, there
was a special session in topological data analysis that was filled beyond capacity for the entire session.
Algebraic topology is traditionally a very specialized field of math and most mathematicians have never been
exposed to it, let alone data scientists, computer scientists, and analysts. I have three goals in writing this book. The
first is to bring people up to speed who are missing a lot of the necessary background. I will describe the topics in
point-set topology, abstract algebra, and homology theory needed for a good understanding of TDA. The second is
to explain TDA and some current applications and techniques. Finally, I would like to answer some questions about
more advanced topics such as cohomology, homotopy, obstruction theory, and Steenrod squares, and what they can
tell us about data. It is hoped that readers will have the tools to start to think about these topics and where they might
fit in.
One important question is whether TDA gives any information that more conventional techniques don’t. Is it just
a solution looking for a problem? What are the advantages? I strongly believe that TDA is a good tool to be used in
conjunction with other machine learning methods. I am hoping to make that case in this discussion.
So what is the advantage of algebraic topology? First of all it is a classifier. The idea is to classify geometric
objects. Roughly, two objects or spaces (more on that term in the next chapter) are equivalent, if it is possible to
deform one into the other without tearing or gluing. You may have heard that a topologist has a hard time distinguishing
between a donut and a coffee cup. In reality, though, it would be a hollow donut and a covered coffee cup with a hollow
handle. (See Figure 1.0.1.) Another way to picture it would be a hollow donut vs. a sphere with a hollow handle.

Figure 1.0.1: Coffee Cup Turning into a Donut [98]

Can you tell the difference between a line and a plane? One way is to remove a point. The line is now in two
pieces but the plane stays connected. But what about a plane and three dimensional space? They are actually quite
different, but how do you know? If you remove a point from either, they stay in one piece. When we get to homology,
I will show you a very easy way to tell them apart.
Now look at the data set plotted in Figure 1.0.2. It looks like a ring of points, but how does a computer know that?
As far as the computer knows, it is just a bunch of disconnected points. As you will see, persistent homology will
easily determine its shape.


Figure 1.0.2: Ring of Data Points

Another advantage is that topology is very bad at telling things apart. Think of all the coffee cups that have been
mistakenly eaten. In algebraic topology, a circle, a square, and a triangle are pretty much the same thing. What this
does for you is to assure you that the differences it detects are pretty significant and it should reduce your false positive
rate.
Topological data analysis is easier than ever before to experiment with as there is now a choice of fast and easily
accessible software.
Finally, you will learn a whole new collection of math puns to impress your friends. For example, what do you get
when you cross an elephant with an ant? The trivial elephant bundle over the ant. If you want to understand that one,
keep reading.
In chapters 2 and 3, I will give the necessary background on point-set topology and abstract algebra. If you are
already fluent in these, you can skim through these chapters or skip them entirely. Chapter 4 will deal with homology
theory in the more traditional way. Although many papers just jump to persistent homology with half page descriptions,
some exposure to more conventional homology theory will put you in a good position both to understand TDA and
possibly to extend it. In chapter 5, I will present persistent homology and some of its visualizations such as bar codes,
persistence diagrams, and persistence landscapes. I will also describe applications to data in the form of sublevel sets,
graphs, and time series. This will include some of my own work on multivariate time series and the application to
the classification of internet of things data. In chapter 6, I will discuss some applications such as Q-analysis, sensor
coverage, Ayasdi’s ”mapper” algorithm, simplicial sets, and the dimesnionality reduction algorithm, UMAP. In chapter
7, I will discuss some ideas I have had in the past but never finished on the use of traditional homology. These include
simplicial complexes derived from graphs and their potential application to change detection in a time series of graphs.
Also, I will describe an idea for using homology for market basket analysis. How do you distinguish between a grocery
store shopper who is buying for a large family and one who is single?
In the last part, I will discuss an idea I have for analyzing a cloud of data points using obstruction theory. (Don’t
worry if you don’t know what all or even most the words in this paragraph mean. I will get you there.) Obstruction
theory deals with the ”extension problem.” If you have a map from a space A to a space Y, and A is a subspace of X,
can you extend the map continuously to all of X while keeping its values on A the same as before. Obstruction theory
combines cohomology and homotopy theory. So chapter 8 will deal with cohomology. Cohomology’s strength is the
idea of a ”cup product” giving cohomology the structure of a ring. (See chapter 3 for rings.) Chapter 9 will deal with
homotopy. Homotopy theory is much easier to describe than homology but computations are extremely difficult. The
advantage is that when you can do the computations, you get a lot more information than homology theory gives you.
In Chapter 10, I will describe obstruction theory and my ideas for how to apply it to data science. Chapter 11 will deal
with Steenrod squares and reduced powers which give cohomology the structure of an algebra. (Again see chapter
3). Finally, Chapter 12 will describe homotopy groups of spheres. This is a problem that has not been completely
solved but I feel that the early work that was done in the 1950’s and 1960’s holds the key to the potential application
of obstruction theory to data science.
This is quite an ambitious list, but I hope to make it understandable even if you don’t yet have all of the background.
The idea is to get you used to the language and the concepts. TDA is a field with a lot of untapped potential so I would
like to start you thinking about all of the possibilities for its future.
One final note. Topology is not the same as topography. Topography looks at features of a geographic area while

topology deals with the classification of geometric objects. And network topology is an example of a common feature
in math: word reuse.

1.1 Note on the Use of This Book.


The last four chapters of this book contain some very specialized subjects in algebraic topology that even most
math Ph.D.’s have never seen. Reading the first 5 chapters will give you a strong background and allow you to read
most of the current papers in the subject. Chapters 6 and 7 provide some interesting special topics, and cohomology
which is covered in Chapter 8, will allow you to understand RIPSER, a very fast, state-of-the-art software package
with which you can do most computations in topological data analysis.
The last four chapters are what make this book unique. In his 1956 American Mathematical Society Colloquium
Lecture [153], Norman Steenrod (my mathematical great-grandfather and someone you will soon hear more about)
described his motivation for his construction of Steenrod squares (see Chapter 11). The idea is to encode more of the
geometry of a shape into the algebra by introducing additional structure. This allows shapes to be classified when they
can’t be classified by easier methods.
It seems to me that difficult classification problems in data science should work the same way. The additional
structure provided by homotopy groups, obstructions, and Steenrod squares should allow for the solution of harder
classification problems.
There are two issues that then need to be addressed. Are these methods tractable and are they useful? I believe that
the answer to the first question is that homotopy groups, obstructions, and Steenrod squares are tractable for a range
of problems. The question of their usefulness has yet to be answered. I feel, though, that it is important to lay out the
issues and let the reader decide if there are problems that could be addressed using these methods. There is already
some work in this direction and it could be an interesting research area for the future.
Chapter 2

Point-Set Topology Background

What is space? According to the Hitchhiker’s Guide to the Galaxy [1], ”Space is big. You just won’t believe how
vastly, hugely, mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist’s, but
that’s just peanuts compared to space.”
Actually, there are really only 2 types of spaces: topological spaces and vector spaces. Any others are just special
cases of these. The space we live in is really both, so it is a topological vector space. I will deal with topological
spaces in this chapter and postpone vector spaces to chapter 3.
Point set topology is concerned with the properties of topological spaces, sets of points with a special class of
subsets called open sets. Topological spaces correspond to geometric shapes. Topology describes the properties of
geometric shapes we are used to and gives all sorts of examples of strange shapes where things break down. Fortunately
in algebraic topology, almost everything we deal with is nicely behaved, and in TDA even more so. A big advantage in
TDA is that there are only a finite number of data points, greatly simplifying the theory. By the time someone collects
infinitely many samples, I will have plenty of time to revise this book.
Rather than write a text for the usual semester length course, I will just focus on defining the terms and giving
some basic properties. One very popular text is Munkres [111]. I will follow the book by Willard [172], the book that
I used in college, and most of the material in this chapter is taken from there. In section 2.1, I will describe the basics
of sets and functions, mainly so we can agree on notation. I will also briefly discuss commutative diagrams. At the
end, I will define one of the most familiar types of topological space: metric spaces. Section 2.2 will formally define
topological spaces. In section 2.3, I will discuss continuous functions which have a much easier description than when
you first met them in calculus. Section 2.4 will cover subspaces, product spaces and quotient spaces, all of which
will have analogues in abstract algebra as you will see in Chapter 3. Section 2.5 will present the hierarchy of spaces
defined by the separation axioms. In Section 2.6, I will do a little more set theory and discuss the difference between
countably infinite and uncountably infinite sets. I will conclude with compactness in Section 2.7 and connectedness in
Section 2.8, two ideas that will be very important in algebraic topology.

2.1 Sets, Functions, and Metric Spaces


2.1.1 Sets
A set is just a collection of objects called elements. In point-set topology, the elements are called points. If a is an
element of set A we write a ∈ A. Otherwise, we write a ∈ / A. If A and B are sets, we say that A is contained in B or
A ⊂ B if every element of A is also an element of B. In this case, A is a subset of B. If A ⊂ B and B ⊂ A then A
and B have exactly the same elements and A = B.
Write A − B for the set of all elements in A that are not in B. If B ⊂ A, then A − B is the complement of B in A.
Figure 2.1.1 illustrates some of these concepts. Here we show the union of overlapping sets A and B and have a
set C such that C ⊂ A. A picture like this is called a Venn Diagram.


Figure 2.1.1: Union, Intersection and Set Difference

If A consists of the elements 1, 2, x, cat, dog, and horse, we write A = {1, 2, x, cat, dog, horse}. A set with no
elements is the empty set and is written ∅ or {}.
A set A can be an element of another set B or even itself. A ∈ B does not mean the same thing as A ⊂ B as in
the latter case the elements of A are all elements of B while in the former case A itself is an element of B. This leads
to a question. Is there a set of all sets? If there is, this leads to Russell’s Paradox. If there was a set of all sets, it would
have to include the set of all sets that don't contain themselves as an element, i.e., Q = {A | A ∉ A}. (The vertical bar
means ”such that”.)
The famous analogy is the town with a male barber who shaves every man who doesn’t shave himself. In that case,
who shaves the barber? A more descriptive example is based on an analogy given in [148]. Suppose there is an island
where every inhabitant has a club named after them and every inhabitant is in at least one club. Not everyone, though,
is in the club named after them. Also, every club is named after someone. Call a person sociable if they are a member
of the club named after them and unsociable if they are not. Could there be a club made up of all of the unsociable
people? Suppose there were. It would have to be named after someone. If it was the Michael Postol club, I could not
be a member as I would be sociable. Then I would not be in the club named after me so I would be unsociable so then
I would have to be in the club.
For this reason we don’t talk about the set of all sets. Instead, the collection of sets form a category. Category
theory deals with collections of things like sets, topological spaces, groups, etc., and special maps between them. The
proper place to discuss this subject is in the next chapter but category theory and algebraic topology are very closely
linked. For now, think about the different types of sets we will learn about (especially in chapter 3) and the things they
have in common.
If A and B are sets, then the union A ∪ B is the set of elements that are in A or B or both. The intersection A ∩ B
is the set of elements that are in both A and B. If A ∩ B = ∅, we say that A and B are disjoint.
For example, if A = {1, 3, 5} and B = {5, 7, 9} then A ∪ B = {1, 3, 5, 7, 9} and A ∩ B = {5}.
Similar definitions apply for more than two and even infinitely many sets where the union denotes elements that
are in at least one of the sets and the intersection is the set of elements that are in all of the sets.
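If you want to experiment, Python's built-in set type implements these operations directly. A quick sketch (the particular sets are just for illustration):

```python
# Union, intersection, set difference, and disjointness with Python sets.
A = {1, 3, 5}
B = {5, 7, 9}

print(A | B)            # union: {1, 3, 5, 7, 9}
print(A & B)            # intersection: {5}
print(A - B)            # set difference: {1, 3}
print(A.isdisjoint(B))  # False, since A and B share the element 5

# Unions and intersections extend to any number of sets.
sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
print(set.union(*sets))         # elements in at least one set: {1, 2, 3, 4, 5}
print(set.intersection(*sets))  # elements in all of the sets: {3}
```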
Here are some sets that will arise frequently:

R : The set of real numbers. (2.1)
R^n : The set of ordered n-tuples of real numbers. (2.2)
Z : The set of integers {· · · , −2, −1, 0, 1, 2, · · · } (2.3)
Z_n : The set {0, 1, 2, · · · , n − 1} of integers modulo n. (2.4)
Q : The set of rational numbers. (2.5)

2.1.2 Functions and Relations


A method of building a new set from an old one is to form the (Cartesian) product.

Definition 2.1.1. If X1 and X2 are two sets then the Cartesian product (or simply the product) of X1 and X2 , written
X1 × X2 , is the set of all ordered pairs (x1 , x2 ) such that x1 ∈ X1 and x2 ∈ X2 .

When the sets are geometric shapes we can think of the Cartesian product as the shape resulting from putting a copy
of one set at every point of the other set. For example, R^n × R = R^{n+1} . As another example, let I be the interval [0, 1]
and S^n be the hollow n-sphere, defined to be the set of all points (x1 , · · · , xn+1 ) such that x1^2 + · · · + xn+1^2 = 1.
So S^1 is a circle and S^2 is the surface of a sphere. Then S^1 × I is a cylinder and S^1 × S^1 is a torus (a hollow donut).
A Moebius strip looks like a cylinder held up close but has a twist in it. Later we will look at Cartesian products with
a ”twist”. These are called fiber bundles. See Figure 2.1.2 for some pictures.
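For finite sets, the Cartesian product can be enumerated directly. Here is a small sketch using Python's itertools.product (the two particular sets are chosen arbitrarily for illustration):

```python
from itertools import product

X1 = {'a', 'b'}
X2 = {0, 1, 2}

# X1 × X2: all ordered pairs (x1, x2) with x1 in X1 and x2 in X2.
prod = set(product(X1, X2))
print(len(prod))         # 6, since a 2-element set times a 3-element set has 2 * 3 pairs
print(('a', 2) in prod)  # True
print((2, 'a') in prod)  # False: the pairs are ordered
```

The count mirrors the geometry: putting a copy of X2 at every point of X1 gives |X1| · |X2| points, just as R^n × R has one more dimension than R^n.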

(a) Torus and Cylinder (b) Moebius Strip

Figure 2.1.2: Torus [179], Cylinder [174], and Moebius Strip [176]

Definition 2.1.2. A function f from a set X to a set Y written f : X → Y is a subset of X × Y with the property that
for each x ∈ X there is one and only one y ∈ Y such that (x, y) ∈ f . In this case, we write f (x) = y.

In the above definition, X is the domain of f and Y is the range of f . The image f (X) of f is defined to
be the set of y ∈ Y such that f (x) = y for some x ∈ X. If X = Y there is a special function f : X → X called the
identity function. In this case, f (x) = x for all x ∈ X.
If f : X → Y and g : Y → Z then there is a function gf : X → Z called the composition of g with f and defined
by gf (x) = g(f (x)). We can also write it as the diagram X → Y → Z, where the first arrow is f and the second is g.
For example, if f (x) = x2 + 1 and g(x) = sin(x), then gf (x) = sin(x2 + 1).
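Composition is just as easy to express in code. A quick sketch of this example (the helper compose is my own name, not a standard library function):

```python
import math

def compose(g, f):
    """Return the composition gf, i.e. the function x -> g(f(x))."""
    return lambda x: g(f(x))

f = lambda x: x ** 2 + 1
g = math.sin

gf = compose(g, f)
print(gf(2) == math.sin(5))  # True: gf(2) = sin(2**2 + 1) = sin(5)
```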
If f : X → Y , we would like a way to undo f . To do this, we need two things to happen. First of all, for every
y ∈ Y , we need at least one x ∈ X that maps to it. The second condition is that for every y ∈ Y , we need at most
one x ∈ X that maps to it. The first condition is called onto and the second is called one-to-one.

Definition 2.1.3. A function f is one-to-one or injective or an injection if for x1 , x2 ∈ X, x1 ≠ x2 implies
f (x1 ) ≠ f (x2 ). A function f is onto or surjective or a surjection if for any y ∈ Y , there is an x ∈ X such that
f (x) = y. In this case, f (X) = Y , i.e. the image of f is all of Y .
8 CHAPTER 2. POINT-SET TOPOLOGY BACKGROUND

Example 2.1.1. The function f (x) = x2 is not one-to-one since f (−1) = 1 = f (1). It is also not onto if Y = R,
since there is no real value of x with x2 = −1. But g(x) = x3 is both one-to-one and onto.
Definition 2.1.4. A function f : X → Y has a (two-sided) inverse g : Y → X if gf = 1X and f g = 1Y where 1X
is the identity function on X. In this case, we write g = f^{-1} . So f^{-1} undoes f . By our remarks above, a function
has an inverse if and only if it is one-to-one and onto.
It is possible for a function to be one-to-one, but not onto. Then it has an inverse function whose domain is the
original function's image. An example of this is the function f (x) = e^x . This is one-to-one, but its image is R+ , the
positive real numbers, as opposed to all of R. This function has an inverse f^{-1}(x) = ln(x), which is only defined on
the image R+ of f .
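We can spot-check numerically that e^x and ln(x) undo each other on the appropriate domains. A sketch (math.isclose guards against floating-point rounding; the sample points are arbitrary):

```python
import math

f = math.exp        # one-to-one, with image the positive reals R+
f_inv = math.log    # defined only on R+

# f_inv(f(x)) = x for every real x ...
for x in [-3.0, 0.0, 1.7]:
    assert math.isclose(f_inv(f(x)), x, abs_tol=1e-12)

# ... and f(f_inv(y)) = y for every positive y.
for y in [0.5, 1.0, 42.0]:
    assert math.isclose(f(f_inv(y)), y)

# f is not onto all of R: no real x has e**x = -1, and math.log(-1.0)
# raises a ValueError accordingly.
```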
Definition 2.1.5. Let X ⊂ Y . Then f : X → Y is an inclusion if f (x) = x for all x ∈ X. Then f is obviously
one-to-one, but f is only onto if X = Y .
Closely related to functions are relations. A relation is any subset of A × A. So a function f : A → A is a special
type of relation in which the subset cannot contain both (a1 , b1 ) and (a1 , b2 ) for a1 , b1 , b2 ∈ A with b1 ≠ b2 , as this would say that
f (a1 ) = b1 and f (a1 ) = b2 . There are two types of relations that will be important in what follows. These are
equivalence relations and partial orders.
Definition 2.1.6. We will define a particular relation between elements in A writing a ∼ b if (a, b) is in the relation.
Then ∼ is an equivalence relation if the following three properties hold for all a, b, c ∈ A:
1. Reflexive: a ∼ a
2. Symmetric: If a ∼ b then b ∼ a
3. Transitive: If a ∼ b and b ∼ c then a ∼ c.
In this case, if a ∼ b, we say that a is equivalent to b.
Equality between real numbers is an example of an equivalence relation. Congruence of triangles is another. We
will meet many more.
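On a finite set, the three properties can be checked exhaustively. A sketch (the helper is_equivalence and the sample relation, congruence modulo 3, are my own illustration):

```python
def is_equivalence(relation, elements):
    """Check reflexivity, symmetry, and transitivity of a finite relation,
    represented as a set of ordered pairs."""
    reflexive = all((a, a) in relation for a in elements)
    symmetric = all((b, a) in relation for (a, b) in relation)
    transitive = all((a, d) in relation
                     for (a, b) in relation
                     for (c, d) in relation if b == c)
    return reflexive and symmetric and transitive

elements = range(6)
# a ~ b iff a and b leave the same remainder on division by 3.
mod3 = {(a, b) for a in elements for b in elements if a % 3 == b % 3}
print(is_equivalence(mod3, elements))  # True

# Strict "less than" fails reflexivity, so it is not an equivalence relation.
less = {(a, b) for a in elements for b in elements if a < b}
print(is_equivalence(less, elements))  # False
```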
Definition 2.1.7. We will define a particular relation between elements in A writing a ≤ b if (a, b) is in the relation.
Then ≤ is a partial order if the following three properties hold for all a, b, c ∈ A:
1. Reflexive: a ≤ a
2. Antisymmetric: If a ≤ b and b ≤ a, then a = b
3. Transitive: If a ≤ b and b ≤ c then a ≤ c.
Real numbers obviously form a partial order where ≤ has the usual meaning. Note that we didn’t require that
a ≤ b or b ≤ a for any a, b ∈ A. If this holds, then we have a linear order. The real numbers have a linear order.
An example of a partial order that is not a linear order is the following: Consider subsets of a set X. We say that
for A, B ⊂ X, we have A ≤ B if A ⊂ B. You can easily check that all of the properties hold, but it is possible that
neither A ⊂ B nor B ⊂ A is true. In this case, we say that A and B are incomparable.
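We can verify the subset partial order, and its failure to be linear, on a small example. A sketch (the helper powerset is my own; Python's frozenset comparison operator <= is exactly "is a subset of"):

```python
from itertools import combinations

def powerset(X):
    """All subsets of a finite set X, as frozensets."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1)
            for c in combinations(X, r)]

subsets = powerset({1, 2, 3})

assert all(A <= A for A in subsets)                                # reflexive
assert all(A == B for A in subsets for B in subsets
           if A <= B and B <= A)                                   # antisymmetric
assert all(A <= C for A in subsets for B in subsets for C in subsets
           if A <= B and B <= C)                                   # transitive

# But the order is not linear: {1} and {2} are incomparable.
A, B = frozenset({1}), frozenset({2})
print(A <= B or B <= A)  # False
```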
To conclude this section, I will introduce the idea of a commutative diagram. If you are an algebraic topologist,
you learn to love them.
Consider the following diagram, with horizontal maps g1 : A → B, g2 : B → C, g3 : D → E, g4 : E → F, vertical maps f1 : A → D, f2 : B → E, f3 : C → F, and a diagonal map h : B → F:

A --g1--> B --g2--> C
|         |   \     |
f1        f2    h   f3
v         v      \  v
D --g3--> E --g4--> F

The vertices represent sets and the arrows are functions between them. As before, following two arrows in suc-
cession represents function compostiton. The idea is that no matter what path you take, as long as you are following
arrows, you get to the same place. So let a ∈ A. Then we can get to F in several ways, and g4 g3 f1 (a), g4 f2 g1 (a),
f3 g2 g1 (a), and hg1 (a) are all the same element of F .
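When the sets are finite and the maps are concrete, commutativity can be tested pointwise. A toy sketch (all the maps are made up for illustration; only the pattern of compositions matches the diagram above):

```python
# Toy maps chosen so the diagram commutes: every straight arrow doubles its
# input, and the diagonal h equals the composite of two arrows, so h(x) = 4x.
g1 = g2 = g3 = g4 = f1 = f2 = f3 = (lambda x: 2 * x)
h = lambda x: 4 * x  # B -> F, equal to both f3 . g2 and g4 . f2

for a in range(5):  # take A = {0, 1, 2, 3, 4}
    paths = [g4(g3(f1(a))), g4(f2(g1(a))), f3(g2(g1(a))), h(g1(a))]
    assert len(set(paths)) == 1  # all four paths land on the same element of F
print("the diagram commutes on A = {0, 1, 2, 3, 4}")
```

Every path from A to F doubles its input three times, so each composite is multiplication by 8, and the diagram commutes.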

2.1.3 Metric Spaces


In the next section I will finally introduce topological spaces, but here we will start with the most famous special
case. A metric space is basically a space with the notion of a distance between two points.
Definition 2.1.8. A metric space is a set M together with a function ρ : M × M → R such that we have the following
for x, y, z ∈ M :
1. ρ(x, y) ≥ 0 and ρ(x, y) = 0 if and only if x = y.
2. ρ(x, y) = ρ(y, x)
3. Triangle Inequality: ρ(x, y) + ρ(y, z) ≥ ρ(x, z).
So the first property says that the distance between two points is a positive number unless they are the same point,
in which case it is 0. The second property is that the distance from x to y is the same as the distance from y to x. The
triangle inequality says that the sum of the lengths of two sides of a triangle is greater than or equal to the length of the
third side. They are equal if the three points lie on the same line. An equivalent statement is that the shortest distance
between two points is a straight line.
The function ρ is called a distance or equivalently, a metric.
Example 2.1.2. M is the real numbers where ρ(x, y) = |x − y| for x, y ∈ R.
Example 2.1.3. M is R^n . For x = (x1 , x2 , · · · , xn ), y = (y1 , y2 , · · · , yn ) ∈ R^n , we have ρ(x, y) = √( Σ_{i=1}^{n} (xi − yi)^2 ).
This is the usual Euclidean metric.
Example 2.1.4. M is R^2 . For x = (x1 , x2 ), y = (y1 , y2 ), we have ρ(x, y) = |x1 − y1 | + |x2 − y2 |. This is called the
taxi cab or Manhattan distance.
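Both of these metrics are a few lines of Python, and the axioms of Definition 2.1.8 can be spot-checked on sample points. A sketch (the function names and sample points are my own):

```python
import math

def euclidean(x, y):
    """The usual Euclidean metric on R^n."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def manhattan(x, y):
    """The taxi cab (Manhattan) metric on R^n."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

p, q, r = (0.0, 0.0), (3.0, 4.0), (6.0, 0.0)
print(euclidean(p, q))   # 5.0
print(manhattan(p, q))   # 7.0

# Spot-check the three metric axioms on these points.
for d in (euclidean, manhattan):
    assert d(p, p) == 0 and d(p, q) > 0       # property 1
    assert d(p, q) == d(q, p)                 # property 2
    assert d(p, q) + d(q, r) >= d(p, r)       # triangle inequality
```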
Example 2.1.5. Let L^p for p a positive integer be the set of real-valued functions f of a single real variable such that
∫_R |f |^p < ∞. Then for f, g ∈ L^p we have the L^p distance ρ(f, g) = ( ∫_R |f − g|^p )^{1/p} .
The next two examples will serve as opposite extremes when we discuss topological spaces.
Example 2.1.6. The discrete metric space: Let M be any set. For x, y ∈ M we let ρ(x, y) = 0 if x = y, and let
ρ(x, y) = 1 if x ≠ y.
You should check that this is an actual metric space. If you lived in this world, everyone would be one unit tall,
your commute to work would be one unit, the distance to the sun (assuming you had one) would be one unit, etc. Sort
of convenient but as we will see, everything would fall apart and there would be no path from anywhere to anywhere
else.
The opposite extreme is the trivial pseudometric space in which ρ(x, y) = 0 for any x and y. This is a pseudometric
rather than a metric as property one is violated but properties 2 and 3 still hold.
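The discrete metric is equally short, and on any finite sample of points the axioms can be verified exhaustively. A sketch (the sample points are arbitrary):

```python
from itertools import product

def discrete(x, y):
    """The discrete metric: distance 0 to yourself, distance 1 to everything else."""
    return 0 if x == y else 1

points = ['home', 'work', 'sun']
for x, y, z in product(points, repeat=3):
    assert (discrete(x, y) == 0) == (x == y)                  # property 1
    assert discrete(x, y) == discrete(y, x)                   # property 2
    assert discrete(x, y) + discrete(y, z) >= discrete(x, z)  # triangle inequality
print("the discrete metric satisfies all three axioms on this sample")
```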
Finally, given a metric, we have the idea of a ball. This is the set of points whose distance from a center point is
less than (or, for a closed ball, at most) a particular value called the radius.
Definition 2.1.9. Let M be a metric space and x ∈ M . The open ball of radius δ centered at x written Bδ (x) is the
set of y ∈ M such that ρ(x, y) < δ. If instead, we have ρ(x, y) ≤ δ, the set is called a closed ball and we will write it
as B̄δ (x).
Note that if M = R, then Bδ (x) is the open interval (x − δ, x + δ). If M = R^2 , then Bδ (x) is the open disk centered
at x of radius δ (without the boundary circle).
The open balls in a metric space will be an example of the more general concept of a base for a topology. We will
explore this further in the next section.

2.2 Topological Spaces


We are now ready to define a topological space. This is a set with a special class of subsets called open sets.

Definition 2.2.1. A topology on a set X is a collection τ of subsets of X called the open sets satisfying:

1. Any union (finite or infinite) of members of τ belongs to τ .

2. Any finite intersection of members of τ belongs to τ .

3. ∅ and X belong to τ .

The set X together with the topology τ is called a topological space. In algebraic topology, the word space will always
mean a topological space unless it is explicitly stated otherwise.

Definition 2.2.2. Given two topologies τ1 and τ2 on the same set X, we say that τ1 is weaker (smaller, coarser) than
τ2 or equivalently that τ2 is stronger (larger, finer) than τ1 if τ1 ⊂ τ2 .

Example 2.2.1. Given a nonempty set X, the weakest topology is the trivial topology in which τ1 = {∅, X}. At the
other extreme is the discrete topology in which τ2 consists of every subset of X. Obviously τ1 ⊂ τ2 . In fact, the
discrete topology is the strongest possible topology on X. You should check that both of these satisfy Definition 2.2.1.
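For a finite set, Definition 2.2.1 can be checked by brute force. A sketch (the checker is_topology is my own; for a finite collection of sets it suffices to check pairwise unions and intersections, since closure under pairwise operations gives closure under all finite, and here all, ones):

```python
from itertools import combinations

def is_topology(tau, X):
    """Brute-force check of the open-set axioms on a finite set X."""
    tau = {frozenset(s) for s in tau}
    if frozenset() not in tau or frozenset(X) not in tau:
        return False
    return all(a | b in tau and a & b in tau
               for a, b in combinations(tau, 2))

X = {1, 2, 3}
trivial = [set(), X]
discrete = [set(s) for s in ([], [1], [2], [3], [1, 2], [1, 3], [2, 3], [1, 2, 3])]
broken = [set(), {1}, {2}, X]  # missing the union {1} ∪ {2} = {1, 2}

print(is_topology(trivial, X))   # True
print(is_topology(discrete, X))  # True
print(is_topology(broken, X))    # False
```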

Example 2.2.2. Let M be a metric space. Then the metric topology or usual topology consists of sets G which are
defined to be open if and only if for any x ∈ G, there exists a real number δ > 0 such that Bδ (x) ⊂ G.

In other words, G is open if for any point in G, we can draw an open ball around it that stays in G. For example,
the open interval (0, 1) is open. Let x = .1. Then if we let δ = .05, then x − δ = .05, and x + δ = .15, so the open
ball is (.05, .15) and that is contained in (0, 1). It should be clear that we can do something similar no matter how
close we get to 0 or 1. So (0, 1) is open.
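The argument that (0, 1) is open can be made computational: for each x in (0, 1), any radius smaller than the distance to the nearest endpoint works. A sketch (the sample points are arbitrary; half the distance is used to stay safely clear of floating-point rounding):

```python
# For each sample x in (0, 1), the open ball of radius delta is the interval
# (x - delta, x + delta), and with delta half the distance to the nearest
# endpoint, that interval lies strictly inside (0, 1).
for x in [0.001, 0.1, 0.5, 0.9, 0.999]:
    delta = min(x, 1 - x) / 2
    assert delta > 0
    assert 0 < x - delta and x + delta < 1
print("each sample point has an open ball around it inside (0, 1)")
```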
We can also see why the union of open sets can be infinite but we needed to specify a finite intersection. Consider
the collection of open intervals (−1/n, 1/n), for n = 1, 2, · · · . These are all open, but

⋂_{n=1}^{∞} (−1/n, 1/n) = {0}

The set {0} is not open since any open interval containing 0 will contain points other than 0.

Definition 2.2.3. If X is a topological space, then a subset F is closed if the complement of F in X is open.

For the rest of this section, you should convince yourself of the truth of the results by drawing pictures. The first
one is analogous to the definition of open set and can be shown by looking at complements.

Theorem 2.2.1. Let X be a topological space and let F be the collection of closed subsets of X. Then the following
hold:

1. Any intersection (finite or infinite) of members of F belongs to F.

2. Any finite union of members of F belongs to F.

3. ∅ and X belong to F.

On R, closed intervals are closed sets as their complements are open. For intervals, we can make an open interval
closed by adding in its boundary. This motivates the following:

Definition 2.2.4. If X is a topological space, and E ⊂ X, then the closure of E written Ē is the intersection of all
closed subsets of X which contain E. By the previous theorem, Ē must also be closed.

Theorem 2.2.2. If A ⊂ B then Ā ⊂ B̄.


Theorem 2.2.3. Closures of subsets in a topological space have the following properties:
1. E ⊂ Ē
2. The closure of Ē is Ē itself; taking the closure twice adds nothing new.
3. The closure of A ∪ B is Ā ∪ B̄.
4. The closure of ∅ is ∅.
5. E is closed in X if and only if Ē = E.
Note that the closure of the intersection is not necessarily the intersection of the closure. Let A be the rationals in
R and B be the irrationals in R. It turns out that R = Ā = B̄, so Ā ∩ B̄ = R, but A ∩ B = ∅, so the closure of A ∩ B is ∅.
You can think of closure as putting in the rest of the boundary of a set. The opposite of that is the interior operation.
Definition 2.2.5. If X is a topological space, and E ⊂ X, then the interior of E written E ◦ is the union of all open
subsets of X which are contained in E. By the definition of an open set, E ◦ must also be open.
Theorem 2.2.4. If A ⊂ B then A◦ ⊂ B ◦ .
Theorem 2.2.5. Interiors of subsets in a topological space have the following properties:
1. E ⊃ E ◦
2. E ◦◦ = E ◦
3. (A ∩ B)◦ = A◦ ∩ B ◦
4. X ◦ = X
5. G is open in X if and only if G = G◦ .
Again let A be the rationals and B be the irrationals. Then A ∪ B = R, and R◦ = R. But A◦ = B ◦ = ∅. So
(A ∪ B)◦ ≠ A◦ ∪ B ◦ .
I alluded to a boundary that is the difference between an open and a closed set. I will now make this precise:
Definition 2.2.6. If X is a topological space, and E ⊂ X, then the boundary or frontier of E, written Bd(E), is the
intersection of Ē with the closure of X − E.
Theorem 2.2.6. For any subset E of a topological space X:
1. Ē = E ∪ Bd(E)
2. E ◦ = E − Bd(E)
3. X = E ◦ ∪ Bd(E) ∪ (X − E)◦
To help you picture these results, consider Figure 2.2.1. The set E consists of the small region including the solid
but not the broken line. The closure Ē includes the broken line as well. The interior E ◦ is everything on the inside
but does not include even the solid line. The frontier or boundary Bd(E) is the broken and solid lines together. You
should convince yourself of the last few theorems at least in this case.
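On a finite topological space, closures, interiors, and boundaries can be computed directly from the list of open sets. A sketch (the helper functions and the small example topology are my own; the boundary is computed as the intersection of the closures of E and of its complement):

```python
def interior(E, tau):
    """Union of all open sets contained in E."""
    out = frozenset()
    for U in tau:
        if U <= E:
            out |= U
    return out

def closure(E, tau, X):
    """Intersection of all closed sets (complements of open sets) containing E."""
    out = frozenset(X)
    for U in tau:
        F = X - U  # F is closed
        if E <= F:
            out &= F
    return out

def boundary(E, tau, X):
    return closure(E, tau, X) & closure(X - E, tau, X)

X = frozenset({1, 2, 3})
tau = [frozenset(), frozenset({1}), frozenset({1, 2}), X]  # a topology on X

E = frozenset({2})
print(interior(E, tau))     # empty: no nonempty open set fits inside {2}
print(closure(E, tau, X))   # {2, 3}: the smallest closed set containing {2}
print(boundary(E, tau, X))  # {2, 3}

# Theorem 2.2.6: the closure of E is E together with its boundary,
# and the interior is E with the boundary removed.
assert closure(E, tau, X) == E | boundary(E, tau, X)
assert interior(E, tau) == E - boundary(E, tau, X)
```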
In the remainder of this section, I will briefly discuss neighborhoods and bases. These generalize the open balls in
a metric space.
Definition 2.2.7. If X is a topological space, and x ∈ X, then a neighborhood of x is a set U which contains an open
set V containing x. So U is a neighborhood if and only if x ∈ U ◦ . The collection Ux of all neighborhoods of x is
called a neighborhood system at x.

Figure 2.2.1: Closure, Interior, and Frontier(Boundary)

A neighborhood system can be used to define a topology on X.


Theorem 2.2.7. The neighborhood system Ux at x in a topological space X has the following properties:
1. If U ∈ Ux , then x ∈ U .
2. If U, V ∈ Ux , then U ∩ V ∈ Ux .
3. If U ∈ Ux , then there is a V ∈ Ux such that U ∈ Uy for each y ∈ V.
4. If U ∈ Ux and U ⊂ V then V ∈ Ux .
5. A subset G of X is open if and only if G contains a neighborhood of each of its points.
Since any set containing a neighborhood is also a neighborhood, we will limit ourselves to a smaller collection
called a neighborhood base.
Definition 2.2.8. If X is a topological space, and x ∈ X, then a neighborhood base at x is a subcollection Bx taken
from the neighborhood system Ux having the property that each U ∈ Ux contains some V ∈ Bx . An element of a
neighborhood base is a basic neighborhood.
Example 2.2.3. For any topological space, the open neighborhoods of x form a neighborhood base since if U is a
neighborhood of x, U ◦ must also be one.
Example 2.2.4. For x in a metric space, the open balls Bδ (x) form a neighborhood base at x.
Example 2.2.5. If X has the discrete topology, the set {x} is a neighborhood base at x since every subset of X
containing x is open and thus a neighborhood of x. If X has the trivial topology, the only neighborhood base at any
point is {X} as X is the only nonempty open set and thus the only neighborhood of any point x.
Theorem 2.2.8. The neighborhood base Bx at x in a topological space X has the following properties:
1. If V ∈ Bx , then x ∈ V .
2. If U, V ∈ Bx , then there is a W ∈ Bx such that W ⊂ U ∩ V .

3. If V ∈ Bx then there is some V0 ∈ Bx such that if y ∈ V0 , then there is some W ∈ By with W ⊂ V .

4. A subset G of X is open if and only if G contains a basic neighborhood of each of its points.

By property 4, a subset G of a metric space M is open if and only if for any x ∈ G, there is a δ > 0 such that the
open ball Bδ (x) ⊂ G. This is the usual definition of an open set that you may have seen.
The next theorem characterizes some of the concepts we have seen in terms of neighborhood bases.

Theorem 2.2.9. Let X be a topological space and suppose we have fixed a neighborhood base at each point x ∈ X.
Then we have:

1. G ⊂ X is open if and only if G contains a basic neighborhood of each of its points.

2. F ⊂ X is closed if and only if each point x ∉ F has a basic neighborhood disjoint from F .

3. Ē is the set of points x in X such that each basic neighborhood of x meets E.

4. E ◦ is the set of points x in X such that some basic neighborhood of x is contained in E.

5. Bd(E) is the set of points x in X such that every basic neighborhood of x meets both E and X − E.

A more global version of the above is a base for a topology.

Definition 2.2.9. If X is a topological space with topology τ , a base for τ is a collection B ⊂ τ , such that τ is
produced by taking all possible unions of elements in B.
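On a finite set this definition can be carried out literally: take all unions of subcollections of B. A minimal Python sketch (the encoding of open sets as frozensets is my own choice, not from the text; note that whether the result is closed under intersection depends on B actually being a base):

```python
from itertools import chain, combinations

def topology_from_base(base):
    """Generate a topology by taking all possible unions of elements of the base."""
    base = list(base)
    tau = set()
    for r in range(len(base) + 1):
        for combo in combinations(base, r):
            # the empty subcollection contributes the empty set
            tau.add(frozenset().union(*combo))
    return tau

# Base on X = {1, 2, 3}: the sets {1} and {2, 3}
base = [frozenset({1}), frozenset({2, 3})]
print(sorted(tuple(sorted(G)) for G in topology_from_base(base)))
# [(), (1,), (1, 2, 3), (2, 3)]
```

Since {1} ∩ {2, 3} = ∅ is a (trivial) union of base elements, this particular B really is a base, and the four sets printed form a topology.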

Example 2.2.6. For a metric space X, the open balls Bδ (x) for δ > 0 and x ∈ X form a base for the usual topology.

The connection with neighborhood bases is that for each x ∈ X, the sets in B which contain x form a neighborhood
base at x.
Finally, you may see a topology defined in terms of a subbase.

Definition 2.2.10. If X is a topological space with topology τ , a subbase for τ is a collection C ⊂ τ , such that the set
of all finite intersections of elements in C is a base for τ .

2.3 Continuous Functions


As we will also see in Chapter 3, for each type of set we describe, we will look at a privileged function that
preserves the key defining property of the set. In abstract algebra, this function that preserves the algebraic structure
is called a homomorphism. (Lots more on those later.) For a topological space, the special functions are called
continuous.
If you remember continuous functions from calculus but hated epsilons and deltas, I am going to make you very
happy. Continuous functions simply preserve the idea of open sets, although in the reverse direction.

Definition 2.3.1. Let X and Y be topological spaces and f : X → Y . Then f is continuous at x0 ∈ X if for each
neighborhood V of f (x0 ) in Y, there is a neighborhood U of x0 in X such that f (U ) ⊂ V . We say that f is continuous
on X if it is continuous at every point of X.

By what we have said about the metric space topology, you should convince yourself that the definition of conti-
nuity that you learned in calculus is equivalent to our definition here.
The next theorem gives a much more useful definition of continuity. It is the one we tend to use in practice.

Theorem 2.3.1. Let X and Y be topological spaces and f : X → Y . Then the following are equivalent:

1. f is continuous.

2. for each open set G in Y , f −1 (G) is open in X.

3. for each closed set F in Y , f −1 (F ) is closed in X.

4. For each E ⊂ X, the image f (Ē) of the closure of E is contained in the closure of f (E).

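For finite topological spaces, condition 2 of this theorem can be checked directly by brute force. Here is a minimal Python sketch (the dict encoding of maps and the frozenset encoding of open sets are my own, not from the text):

```python
def preimage(f, G):
    """Preimage f^{-1}(G) of a set G under a map f encoded as a dict."""
    return frozenset(x for x in f if f[x] in G)

def is_continuous(f, tau_X, tau_Y):
    """f is continuous iff the preimage of every open set of Y is open in X."""
    return all(preimage(f, G) in tau_X for G in tau_Y)

# X = {a, b} with open sets {}, {a}, {a, b}; Y = {0, 1} with the discrete topology
tau_X = {frozenset(), frozenset({'a'}), frozenset({'a', 'b'})}
tau_Y = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}

f = {'a': 0, 'b': 1}   # preimage of {1} is {b}, which is not open in X
g = {'a': 0, 'b': 0}   # a constant map; every preimage is {} or all of X

print(is_continuous(f, tau_X, tau_Y))  # False
print(is_continuous(g, tau_X, tau_Y))  # True
```

The constant map is continuous in any topology, exactly as in the f (x) = 0 example discussed below the theorem.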
The characterization of a continuous function as a function having the property that the inverse image of an open
set is open immediately implies the following:

Theorem 2.3.2. If X, Y , and Z are topological spaces, and f : X → Y and g : Y → Z are continuous, then the
composition gf : X → Z is also continuous.

From now on, the terms function or map between topological spaces will always be assumed to be continuous
unless otherwise specified.
Note that for a continuous function, we have not claimed that the image of an open set is always open. For example,
let f be a real valued function of one variable defined by f (x) = 0. It is continuous, as the inverse image of any open
set containing 0 is all of R, which is open, and the inverse image of an open set not containing 0 is ∅, which is also
open. But f ((0, 1)) = {0}, so f takes the open set (0, 1) to the closed set {0}. The maps which preserve
open sets in both directions are called homeomorphisms.

Definition 2.3.2. Let X and Y be topological spaces and f : X → Y . Then f is a homeomorphism if f is continuous,
one-to-one, and onto and if f −1 is continuous. In this case we say that X and Y are homeomorphic.

Theorem 2.3.3. Let X and Y be topological spaces and f : X → Y . Then the following are equivalent:

1. f is a homeomorphism.

2. A set G is open in X if and only if f (G) is open in Y .

3. A set F is closed in X if and only if f (F ) is closed in Y .

4. For each E ⊂ X, f (Ē) equals the closure of f (E).

If X and Y are homeomorphic, then from the point of view of topology, they are basically the same object, even if
they do not look the same. For example, a circle, a square, and a triangle are all homeomorphic: they can be
continuously deformed into each other without tearing or gluing anything. (Those terms will be made more
precise when we discuss connectedness.) The equivalent term for the algebraic objects we will discuss in Chapter 3 is
isomorphic. The relation of being homeomorphic is an equivalence relation. The main goal of algebraic topology is to
classify topological spaces by whether they are homeomorphic to each other. In general, this is a very hard problem.

2.4 Subspaces, Product Spaces, and Quotient Spaces


Here are some new topological spaces you can create from old ones. These all have analogues in abstract algebra.

2.4.1 Subspaces
We get this one almost for free. When is a subset of a topological space another topological space? Always. You
just need to define an open set in the right way.

Definition 2.4.1. Let X be a topological space with topology τ, and let A ⊂ X. The collection τ ′ = {G ∩ A|G ∈ τ }
is a topology for A called the relative topology for A. In this case, we call the subset A a subspace of X.

So basically, we get our open sets in A by taking the open sets of X and intersecting them with A.
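For a finite space this recipe is a one-liner. A quick Python sketch (the encoding is mine, not from the text):

```python
def relative_topology(tau, A):
    """Open sets of the subspace A: intersect each open set of X with A."""
    return {frozenset(G & A) for G in tau}

# X = {1, 2, 3} with open sets {}, {1, 2}, {3}, {1, 2, 3}; subspace A = {2, 3}
tau = {frozenset(), frozenset({1, 2}), frozenset({3}), frozenset({1, 2, 3})}
A = frozenset({2, 3})

print(sorted(tuple(sorted(G)) for G in relative_topology(tau, A)))
# [(), (2,), (2, 3), (3,)]
```

Note that {2} is open in A even though it is not open in X; this is exactly the point of the relative topology.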

Theorem 2.4.1. If A is a subspace of a topological space X then:



1. H ⊂ A is open in A if and only if H = G ∩ A where G is open in X.


2. F ⊂ A is closed in A if and only if F = K ∩ A where K is closed in X.
3. for E ⊂ A, the closure of E in A is the intersection of A with the closure of E in X.
4. If x ∈ A, then V is a neighborhood of x in A if and only if V = U ∩ A where U is a neighborhood of x in X.
5. If x ∈ A, and Bx is a neighborhood base at x in X, then {B ∩ A|B ∈ Bx } is a neighborhood base at x in A.
6. If B is a base for X, then {B ∩ A|B ∈ B} is a base for A.
Note that we haven’t talked about interiors or frontiers. The next example shows why.
Example 2.4.1. Let X be R2 with the usual topology and let A = E be the x-axis. Then the interior of E in A is A
while the interior of E in X is ∅, and A ̸= ∅ ∩ A. It is always true, though, that the interior of E in A contains
the intersection of A and the interior of E in X. In the same case, the frontier of E in A is ∅ while the frontier of E in
X is A, and ∅ ̸= A ∩ A. It is always true, though, that the frontier of E in A is contained in the intersection of A and
the frontier of E in X.
Finally, we mention that when restricted to subspaces, continuous functions stay continuous.
Definition 2.4.2. If f : X → Y and A ⊂ X, then the restriction of f to A written f |A is the map of A into Y defined
by (f |A)(a) = f (a) for every a ∈ A.
Theorem 2.4.2. If A ⊂ X and f : X → Y is continuous, then (f |A) : A → Y is continuous.
Theorem 2.4.3. If X = A ∪ B, where A and B are both open or both closed in X and if f : X → Y is a function
such that both (f |A) and (f |B) are continuous, then f is continuous.

2.4.2 Product Spaces


We discussed the cartesian product of two sets in Section 2.1.2. We repeat the definition here for reference.
Definition 2.4.3. If X1 and X2 are two sets then the Cartesian product (or simply the product) of X1 and X2 is the
set X1 × X2 of all ordered pairs (x1 , x2 ) such that x1 ∈ X1 and x2 ∈ X2 .
We will now loosen up the definition a bit. First of all, we will allow as many of these sets as we want, even
infinitely many. For the product of n sets, the product is
∏_{i=1}^{n} Xi = {(x1 , · · · , xn )},

where xi ∈ Xi for 1 ≤ i ≤ n. To define the product of infinitely many sets, we use the idea of an index set. An index
set is a set A of objects where we have one term for each object in A. So here is the most general definition we will
use:
Definition 2.4.4. Let Xα be a set for each α ∈ A. The Cartesian product (or simply the product) of the Xα is the set
∏_{α∈A} Xα = {x : A → ∪_{α∈A} Xα | x(α) ∈ Xα for each α ∈ A}.

So basically, we define a point in the product as a function which picks out one element from each of the sets. In
order to do this we need the Axiom of Choice. I won’t go into a long discussion of this but informally, suppose you
had infinitely many shoes. You could easily pick out one from each pair by just taking all of the left ones. But suppose
you had infinitely many pairs of socks. It is not obvious that you could pick out one from each pair as this would keep
you busy all morning and every other morning until the end of time. So the Axiom of Choice says that you can pick
out one element from each of an infinite collection of sets. If you believe it or if you don’t, all of the other axioms of
set theory still hold.
The one case where we may be interested in an infinite Cartesian product is when talking about spaces whose
points are actually functions. Consider a real valued function of one variable defined on all of R. Any such function
is a point in the space RR . This is the product of the reals taken infinitely many times, one for each real number.
For example, take the function f (x) = x2 . If we think of a coordinate for every real number, then the coordinate
corresponding to 3 will be 32 = 9.
It will be helpful in what follows to picture products as finite to understand the results in this section but I will
do this in more generality so you can understand the proofs in the paper on template functions that I will discuss in
Chapter 6.
Since we are working with topological spaces, we want to know what open sets look like on a product of spaces.
To get there we need a couple more terms. The value x(α) ∈ Xα is usually denoted xα and called the αth coordinate
of x. The space Xα is the αth factor space.
The map πβ : ∏ Xα → Xβ defined by πβ (x) = xβ is called the βth projection map.
As an example, consider R3 which is the product of the reals with itself three times. The index set A is simply
{1, 2, 3}. Let x = (3, −4.5, 100) be a point in R3 . The three coordinates are x1 = 3, x2 = −4.5, and x3 = 100. The
projection map π2 takes a point in R3 to the second coordinate so π2 (3, −4.5, 100) = −4.5.
Now we are ready to define a topology on a product space.
Definition 2.4.5. The Tychonoff topology or product topology on ∏ Xα is obtained by taking as a base for the open
sets the sets of the form ∏ Uα where
1. Uα is open in Xα for each α in A.
2. For all but finitely many coordinates, Uα = Xα .
In a finite product, we can dispense with the second condition, so open sets are unions of sets of the form ∏_{i=1}^{n} Ui
where Ui is open in Xi .
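In the finite case, the base of the product topology is easy to enumerate: one basic open set for each choice of an open set from each factor. A Python sketch for two factors (the encoding is mine, not from the text):

```python
from itertools import product

def base_for_product(tau_X, tau_Y):
    """Base for the product topology on X x Y: all sets U x V
    with U open in X and V open in Y (finite case)."""
    return {frozenset(product(U, V)) for U in tau_X for V in tau_Y}

# X = {0, 1} with open sets {}, {0}, {0, 1}; Y = {a, b} with {}, {a}, {a, b}
tau_X = {frozenset(), frozenset({0}), frozenset({0, 1})}
tau_Y = {frozenset(), frozenset({'a'}), frozenset({'a', 'b'})}

base = base_for_product(tau_X, tau_Y)
print(len(base))  # 5: the empty set plus four nonempty rectangles U x V
```

Any pair involving an empty factor collapses to the empty set, so the 3 × 3 = 9 pairs give only 5 distinct basic open sets here.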
Definition 2.4.6. If X and Y are topological spaces, and f : X → Y , then f is an open map (closed map) if for each
open (closed) set A in X, f (A) is open (closed) in Y .
An open map is almost the opposite of a continuous map. For an open map, the image of an open set is open
while for a continuous map, the inverse image of an open set is open. If f is one-to-one, onto, and open then f −1 is
continuous. So a one-to-one, onto, continuous, and open map is a homeomorphism. The same facts hold for closed maps.
Theorem 2.4.4. The βth projection map πβ : ∏ Xα → Xβ is continuous and open but need not be closed.
As an example, consider the graph of y = 1/x as a subset of R2 , and let π1 be the projection onto the x axis. If
G = {(x, 1/x)} is this graph, then G is closed in R2 , but π1 (G) = (−∞, 0) ∪ (0, ∞), which is an open set. So in this
case π1 : R2 → R is not closed.
Finally, we have the following:
Theorem 2.4.5. A map f : Y → ∏ Xα is continuous if and only if πα ◦ f is continuous for each α ∈ A.
In abstract algebra, the analogue of a Cartesian product is a direct product.

2.4.3 Quotient Spaces


Here is where things will start to get sticky. By that, I mean that topologists love to glue things together. Twists
and turns that would shred ordinary paper to pieces are totally legal in topology. In this section I will show you some
examples and then define a quotient space (the result of gluing) in general. We will want to know how to define open
sets and learn what happens to continuous functions. Also note, that there are analogous quotient objects in algebra.
Especially quotient groups will play a vital role in algebraic topology.

Figure 2.4.1: Folded Rectangles

Here is an experiment you can try at home. Take a rectangular piece of paper and fold it in the four different ways
shown in Figure 2.4.1. In the first example, we glue along both edges and get our old friend the torus. (See Figure
2.1.2.) In the second example, glue the short edges with a twist and leave the long ones alone. This gives a Moebius
strip (also shown in Figure 2.1.2). In the third case, glue the short ends with a twist and the long ones without one.
This gives a Klein Bottle shown in Figure 2.4.2. If you tried to do this, you would probably have a pretty hard time. A
Klein bottle only fits in 4 dimensions, but in 3 dimensions, we can do an immersion. This means that it may intersect
itself. So you have to imagine that the narrow tube is not hitting a wall as it comes up on the right. If you walked
through a Klein bottle and started on the inside, you would end up on the outside and vice versa.

Figure 2.4.2: Klein Bottle [175]

This brings up two excuses for not doing your math homework:

1. I put it in a Klein bottle and it fell out.



2. I put it in a Klein bottle and a four dimensional dog ate it.


In the last case we have the real projective plane. Here we twist both the long and short edges. Again, we have a
four dimensional object and we picture a three dimensional immersion called Boy’s surface in Figure 2.4.3. Another
way to make the real projective plane is as follows: Take a (hollow) sphere. Two points on a sphere are antipodal
if they lie on a line through the center. So for example, the North Pole and the South Pole are antipodal. Now glue
together every pair of antipodal points. Before you get to the Equator, you are fine. For example, just glue the Southern
hemisphere to the Northern one. Once you start trying to stick the points on the Equator together, you will have the
same trouble (as you would expect). Projective spaces provide important examples in algebraic topology.

Figure 2.4.3: Boy’s Surface [175]

One final example will be important later. Let X be a topological space and I = [0, 1]. The cone CX of X is
the result of taking X × I and identifying the points (x, 1) to a single point. If X is a circle, then CX is an actual
cone. Another way of thinking of a cone is to take every point in X and connect it with a straight line to a fixed point
outside of X. Suppose instead we have two fixed outside points and we connect every point of X to each of them
with a straight line. (It is easier to picture if you think of a point above and a point below.) This is X × I where the
points (x, 0) are all identified and the points (x, 1) are all identified. This is called the suspension and is written SX
or ΣX. Assuming everything is made of silly putty you can think of a cone as plugging up holes and a suspension as
raising the dimension of a sphere by one. So you can flatten a cone to be a filled in circle or deform a suspension of a
circle to be a sphere. Figure 2.4.4 helps illustrate this where the top or bottom half is a cone and the entire object is a
suspension of a circle. (Two ice cream cones stuck together at their open ends).

Figure 2.4.4: Suspension of a Circle [178]

In practice we do the gluing by taking the original space to the new one with a function that is onto but not one-
to-one as points we want to identify will be sent to the same place as each other. We define the topology on the new
space in a way which makes this map continuous. The new space will be called the quotient space.
Definition 2.4.7. If X is a topological space, Y is a set, and g : X → Y is an onto mapping, then G is defined to be
open in Y if and only if g −1 (G) is open in X. The topology on Y induced by g and defined in this manner is called
the quotient topology and Y is called a quotient space of X. The map g is called a quotient map.
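On a finite space, the quotient topology can be computed straight from this definition: test every subset of Y against the openness of its preimage. A Python sketch (the encoding is mine, not from the text):

```python
from itertools import chain, combinations

def quotient_topology(tau_X, g):
    """All subsets G of Y with g^{-1}(G) open in X (Definition 2.4.7)."""
    Y = sorted(set(g.values()))
    all_subsets = chain.from_iterable(combinations(Y, r) for r in range(len(Y) + 1))
    return {frozenset(G) for G in all_subsets
            if frozenset(x for x in g if g[x] in G) in tau_X}

# X = {a, b, c} with the discrete topology; g glues b and c to a single point p
tau_X = {frozenset(s) for s in [(), ('a',), ('b',), ('c',), ('a', 'b'),
                                ('a', 'c'), ('b', 'c'), ('a', 'b', 'c')]}
g = {'a': 'a', 'b': 'p', 'c': 'p'}

print(len(quotient_topology(tau_X, g)))  # 4: the discrete topology on {a, p}
```

Here every preimage is open because X is discrete, so the quotient {a, p} is discrete as well; with a coarser topology on X, fewer subsets of Y would survive the test.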

Now suppose Y was already a topological space. Are open sets in Y the same as those defined in the quotient
topology? If g is continuous, then an open set in the original topology has to be open in the quotient topology by
definition. To go the other way, we need g to be open or closed.

2.5 Separation Axioms


The separation axioms form a hierarchy among topological spaces. They are classified by how well open sets
can separate points in these spaces. The classification is denoted by the names T0 , T1 , T2 , T3 , and T4 . In this section, I
will define these types and show the relationships between them. I will also give some examples of weird spaces that
fulfill some but not all of the conditions. This will help make sure you understand the concepts. Also, imagining weird
spaces is part of the fun of point-set topology or point-set pathology as it is nicknamed. If you want to see more of
these, I refer you to [152]. The material in this section is taken from there as well as from [172]. Note that in algebraic
topology, we almost always deal with spaces that are T2 . In data science applications, the spaces we deal with will
always be T4 .
Definition 2.5.1. A topological space X is T0 if whenever x and y are distinct points in X, there is an open set
containing one and not the other.
Example 2.5.1. A space with at least two points having the trivial topology is not even T0 . The only nonempty open
set is the entire space which has to contain both of these points.
Definition 2.5.2. A topological space X is T1 if whenever x and y are distinct points in X, there is an open set
containing each one and not the other.
Example 2.5.2. This example will illustrate the difference between T0 and T1 . Let X = {a, b} be a space consisting of
two points and suppose that the open sets are τ = {X, ∅, {a}}. (Check that this satisfies the definition of a topology.)
Then X is T0 since there is an open set containing a and not b, namely {a}. But {b} is not open (note that it is closed
though, being the complement of {a}). The smallest open set containing b is all of X which also contains a. So X is
T0 but not T1 .
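The two-point example above can be checked mechanically. A small Python sketch (the function names and encoding are mine, not from the text):

```python
def is_T0(points, tau):
    """Some open set contains exactly one of each pair of distinct points."""
    return all(any((x in G) != (y in G) for G in tau)
               for x in points for y in points if x != y)

def is_T1(points, tau):
    """For each ordered pair (x, y), some open set contains x but not y."""
    return all(any(x in G and y not in G for G in tau)
               for x in points for y in points if x != y)

points = {'a', 'b'}
tau = [set(), {'a'}, {'a', 'b'}]   # the topology of Example 2.5.2

print(is_T0(points, tau), is_T1(points, tau))  # True False
```

The check fails for T1 on the ordered pair (b, a): no open set contains b without a, just as argued above.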
It should be obvious that any T1 space is T0 . Here is an interesting fact about T1 spaces. I will provide the proof
which is short and instructive.
Theorem 2.5.1. The following are equivalent for a topological space X:
1. X is T1
2. Each one point set in X is closed.
3. Each subset of X is the intersection of the open sets containing it.
Proof:
(1) ⇒ (2): If X is T1 and x ∈ X then each y ̸= x is contained in an open set disjoint from {x}, so X − {x} is a
union of open sets and thus open. So {x} is closed.
(2) ⇒ (3): If A ⊂ X then A is the intersection of all sets of the form X − {x} for x ∉ A. Each of these is open
since one point sets are closed.
(3) ⇒ (1): If (3) holds then {x} is the intersection of all open sets containing it, so for any y ̸= x, there is an open
set containing x and not y. ■
The next level is T2 . Another name for a T2 space is a Hausdorff space. I will feel free to use the two terms
interchangeably. In algebraic topology, we work almost exclusively with Hausdorff spaces. They were named after
Felix Hausdorff, whose 1914 book Set Theory was the first textbook on point-set topology. The original version is in
German, but see [64] for an English translation.
Definition 2.5.3. A topological space X is T2 or Hausdorff if whenever x and y are distinct points in X, there are
disjoint open subsets U and V of X such that x ∈ U and y ∈ V .

Example 2.5.3. Any metric space M is Hausdorff. Let x and y be in M and let ρ(x, y) = δ. (Recall that ρ(x, y) is
the distance between x and y.) Then letting U (x, ϵ) be an open ball centered at x of radius ϵ, we have that U (x, δ/2)
and U (y, δ/2) are disjoint open sets containing x and y respectively.

It should be obvious that a Hausdorff space is T1 , but there are T1 spaces that are not Hausdorff as the next example
shows.

Example 2.5.4. Let X be an infinite set with the cofinite topology. In this topology, open sets are those whose
complements are finite, i.e. the complements of open sets have finitely many points. We also add in X and ∅. (Check
that this satisfies the definition of a topology.) This means that the closed sets are the finite subsets of X, together
with X itself. Then one point sets are closed so by Theorem 2.5.1, X is T1 . But X is not T2 . Now it is a
fact that the complement of an intersection of two sets is the union of their complements. (Try to prove this or convince
yourself using a Venn diagram.) So if U and V are nonempty open sets and U ∩ V = ∅, then the union of their complements is
all of X. But this is impossible since X − U and X − V are both finite, but X is infinite. Thus, X is not T2 .
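The set identity used in this argument, De Morgan's law, is easy to spot-check numerically:

```python
import random

random.seed(0)
X = set(range(20))
for _ in range(100):
    U = {x for x in X if random.random() < 0.5}
    V = {x for x in X if random.random() < 0.5}
    # complement of an intersection = union of the complements
    assert X - (U & V) == (X - U) | (X - V)
print("De Morgan's law held on all 100 random samples")
```

Of course a spot check is not a proof, but it is a quick sanity check before drawing the Venn diagram.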

Theorem 2.5.2. 1. Every subspace of a T2 space is T2 .

2. A nonempty product space is T2 if and only if each factor is T2 .

3. Quotients of T2 spaces need not be T2 .

Example 2.5.5. Things are even worse than part 3 of the previous theorem. A continuous open image of a Hausdorff
space need not be Hausdorff. Let A consist of the two horizontal lines in R2 , y = 0 and y = 1. Let f : A → B be a
quotient map which identifies (x, 0) and (x, 1) for x ̸= 0. So every point on the lower line except for (0, 0) is glued to
the point directly above it. Then the space B is one line except that the points (0, 0) and (0, 1) are still separated. But
any open neighborhood containing one of these points has to intersect any open neighborhood containing the other.
So B is not Hausdorff.

For our next trick, we will try to replace points by arbitrary closed sets and separate them.

Definition 2.5.4. A topological space X is regular if whenever A is closed in X, and x ∉ A, there are disjoint open
subsets U and V of X such that x ∈ U and A ⊂ V . A regular T1 space is called a T3 space.

Note that we have not entirely preserved our hierarchy. A regular space is not necessarily Hausdorff. For example,
let X be a space with at least two points having the trivial topology. Let A = ∅ and x ∈ X. Then since A is empty, x ∉ A. Let U = X and V = A. Then
U and V are both open, x ∈ U , A ⊂ V , and U ∩ V = ∅. So X is regular. (Note that we have not insisted that A is a
proper subset of V and in general we won’t as we describe separation axioms.) X is obviously not Hausdorff as the
only nonempty open set is all of X, so no two distinct points can be separated. So a regular space need not be Hausdorff.
To fix this, we defined a T3 space to be a regular T1 space. Then since we saw that in a T1 space, one point sets
are closed, we automatically get that a T3 set is Hausdorff (T2 ). So it is important to remember that T2 and Hausdorff
mean the same thing, but T3 and regular do not.

Example 2.5.6. Not every T2 space is T3 . Let X be the real line with neighborhoods of nonzero points as in the usual
topology, but neighborhoods of 0 are of the form U − A where U is a neighborhood of 0 in the usual topology and

A = {1/n|n = 1, 2, · · · }.

Then X is obviously Hausdorff (draw small enough balls around x and y). But X is not T3 since A is closed in X,
but can’t be separated from 0 by disjoint open sets.

The next two theorems state some properties of regular spaces.

Theorem 2.5.3. The following are equivalent for a topological space X:

1. X is regular.

2. If U is open in X and x ∈ U , then there is an open set V containing x such that V ⊂ U .

3. Each x ∈ X has a neighborhood base consisting of closed sets.

Theorem 2.5.4. 1. Every subspace of a regular (T3 ) space is regular (T3 ).

2. A nonempty product space is regular (T3 ) if and only if each factor space is regular (T3 ).

3. Quotients of T3 spaces need not be regular.

At the next level, we can start to talk about separation with the use of real valued functions. This will eventually
lead to our first result on extending continuous functions. I will get much more into this when I talk about obstruction
theory.

Definition 2.5.5. A topological space X is completely regular if whenever A is closed in X, and x ∉ A, there is a
continuous function f : X → I such that f (x) = 0 and f (A) = 1. (Recall that I = [0, 1] is the closed unit interval.)
Equivalently, we can find a continuous function f : X → R such that f (x) = a and f (A) = b, where a and b are
real numbers with a ̸= b. Any such function f is said to separate A and x. A completely regular T1 space is called a
Tychonoff space.

Note that completely regular spaces are regular. If A is closed, x ∉ A, and f : X → I is such that f (x) = 0 and
f (A) = 1, then f −1 ([0, 1/2)) and f −1 ((1/2, 1]) are disjoint open sets containing x and A respectively.
Since the symbol T4 is used for normal spaces, which we will cover next, some books refer to Tychonoff spaces as
T3½ spaces.

Example 2.5.7. Every metric space is Tychonoff. Let X be a metric space with distance function ρ, A be closed in
X, and x ∉ A. For y ∈ X, let f (y) = ρ(y, A) = inf a∈A ρ(y, a). Then f is a continuous function from X to R,
f (A) = 0, and f (x) > 0. Thus, X is Tychonoff.
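The separating function f (y) = ρ(y, A) is exactly the distance-to-a-set function that appears all over data science. A sketch for a finite point sample A in the Euclidean plane (the function name and data are mine, not from the text; for a finite set the infimum is a minimum):

```python
import math

def dist_to_set(y, A):
    """f(y) = inf over a in A of rho(y, a); Euclidean rho, finite sample A."""
    return min(math.dist(y, a) for a in A)

A = [(0.0, 0.0), (1.0, 0.0)]       # a finite (hence closed) set in R^2
print(dist_to_set((1.0, 0.0), A))  # 0.0 -- points of A are sent to 0
print(dist_to_set((0.0, 3.0), A))  # 3.0 -- points not in A are sent to > 0
```

Continuity of f follows from the triangle inequality: |f (y) − f (z)| ≤ ρ(y, z).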

Theorem 2.5.5. 1. Every subspace of a completely regular (Tychonoff) space is completely regular (Tychonoff).

2. A nonempty product space is completely regular (Tychonoff) if and only if each factor space is completely regular
(Tychonoff).

3. Quotients of Tychonoff spaces need not be regular or even T2 .

Here is a result that is useful in homotopy theory.

Theorem 2.5.6. A topological space is a Tychonoff space if and only if it is homeomorphic to a subspace of some
cube. (Here a cube is a product of copies of I.)

I will stop at the fourth level of our hierarchy, normal spaces. These are the logical next step but are much less well
behaved. Even subspaces of normal spaces need not be normal. I will state some results but avoid the proofs and most of
the counterexamples, which would take us too far afield. I encourage you to look at [172], [152], and [111] if you are
curious.

Definition 2.5.6. A topological space X is normal if whenever A and B are disjoint closed sets in X, there are disjoint
open subsets U and V of X such that A ⊂ U and B ⊂ V . A normal T1 space is called a T4 space.

Here are two relatively easy examples.

Example 2.5.8. A normal space need not be regular. Let X be the real line and let the open sets be the sets of the
form (a, ∞) for a ∈ R. Then X is normal as no two nonempty closed sets are disjoint. But X is not regular since the
point {1} can not be separated from the closed set (−∞, 0] by disjoint open sets.

Example 2.5.9. If X is a metric space, then X is normal. Let ρ be the distance function and let A and B be two
disjoint closed sets in X. Since we already know that a metric space is Hausdorff, for each x ∈ A, we can pick δx > 0
such that the open ball U (x, δx ) centered at x of radius δx does not meet B. Also, for each y ∈ B, we can pick ϵy > 0
such that U (y, ϵy ) does not meet A. Let

U = ∪_{x∈A} U (x, δx /3), V = ∪_{y∈B} U (y, ϵy /3).

Then U and V are open sets in X containing A and B respectively. To see that U and V are disjoint, suppose
z ∈ U ∩ V . Then for some x ∈ A and y ∈ B, ρ(x, z) < δx /3 and ρ(y, z) < ϵy /3. Without loss of generality we can
assume that δx = max{δx , ϵy }. Then by the triangle inequality, ρ(x, y) < δx /3 + ϵy /3 < δx . But then y ∈ U (x, δx ),
which is impossible. Thus U and V must be disjoint so X is normal.

Theorem 2.5.7. 1. Closed subspaces of normal spaces are normal.

2. Product spaces of normal spaces need not be normal.

3. The closed continuous image of a normal (T4 ) space is normal (T4 ).

Our main interest in normal spaces is Tietze’s Extension Theorem, our first example of the solution to an extension
problem. The proof involves Urysohn’s Lemma which is interesting in its own right.

Theorem 2.5.8. Urysohn’s Lemma A space X is normal if and only if whenever A and B are disjoint closed sets in
X, there is a continuous function f : X → [0, 1] with f (A) = 0 and f (B) = 1. An immediate consequence is that
every T4 space is Tychonoff.

Let X and Y be topological spaces, and A ⊂ X. Let f : A → Y be continuous. The extension problem asks if
we can extend f to a continuous function g : X → Y whose restriction to A, g|A : A → Y is equal to f . One of the
main goals of homotopy theory is to answer this question. The next theorem provides an answer in the special case
where A is closed in X and Y = R.

Theorem 2.5.9. Tietze’s Extension Theorem A space X is normal if and only if whenever A is a closed subset of X,
and f : A → R is continuous, there is a continuous function g : X → R such that g|A = f .

2.6 Two Ways to be Infinite


In this section, we will discuss different sizes of infinity. The argument came from the work of Georg Cantor and
was originally pretty controversial. It is pretty much accepted now and resolves a big paradox in probability theory.
This will be a digression back to set theory, but it will be needed in the next section as well as many other areas of
mathematics.

Definition 2.6.1. Let A be a set. Then the cardinality of A, written |A|, is the number of elements in A.

So if X = {cat, dog, horse}, then |X| = 3. So two sets with the same cardinality have the same number of
elements. We can show that two sets have the same cardinality by exhibiting a one-to-one correspondence. For every
element of set A we pair it with exactly one element of set B. We will do the same thing whether these sets are finite
or infinite. For infinite sets you just need to keep this in mind even though things will go against your intuition if you
have never seen these arguments before.

Definition 2.6.2. Let A be a set. Then A is countable if there is a one-to-one correspondence between the elements
of A and the positive integers. If A is countable, we write the cardinality of A as |A| = ℵ0 .

So a set is countable if you can count it: 1, 2, 3, 4, · · ·



Example 2.6.1. The positive integers are obviously countable. So are the integers. To see this we write them as 0, −1,
1, −2, 2, −3, 3, · · · . Then we have the correspondence 1 ↔ 0, 2 ↔ −1, 3 ↔ 1, 4 ↔ −2, 5 ↔ 2, · · · . You might think
that there are more integers than positive integers, but we can match them up.

An illustration of this is in Hilbert’s paradox of the Grand Hotel. There have been several versions of this, but see
[52] for one of them. This will be my own version. Suppose there is a hotel with infinitely many rooms. There is a
single hallway and the rooms are numbered, 1, 2, 3, · · · . Suppose you come to the hotel and find out that all of the
rooms are full. There is an easy solution. Just have everyone move down one room and you can get room 1. The next
day turns out to be the first day of IICI, the Infinite International Convention on Infinite International Conventions
on Infinite International Conventions on .... The conference is very popular and there are countably infinitely many
attendees. There is still no problem: have everyone move to the room that is double their current number. Now there
are countably many empty rooms.
It turns out that there are also countably many rational numbers. To see this, consider the following array. I have
represented every rational number at least once with a lot of duplicates. Now just count them by zig-zagging along the anti-diagonals:
0 −1 1 −2 2 −3 3 ···

0/2 −1/2 1/2 −2/2 2/2 −3/2 3/2 ···

0/3 −1/3 1/3 −2/3 2/3 −3/3 3/3 ···

0/4 ··· ··· ··· ··· ··· ··· ···

By continuing the zig-zag pattern and letting each row i have i in the denominator, we can be sure that we count
all of the rationals at least once.
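Since this book is aimed at data scientists, here is a Python sketch of the diagonal count (entirely my own illustration; the generator name `rationals` is made up). It walks the array diagonal by diagonal and uses a set to skip duplicates:

```python
from fractions import Fraction
from itertools import islice

def rationals():
    """Yield every rational exactly once, sweeping the array diagonal by
    diagonal: on diagonal s we visit rows (denominators) d = 1, ..., s and
    numerators n with |n| <= s - d, skipping values already seen."""
    seen = set()
    s = 1
    while True:
        for d in range(1, s + 1):
            for n in range(-(s - d), s - d + 1):
                q = Fraction(n, d)
                if q not in seen:
                    seen.add(q)
                    yield q
        s += 1

first_seven = list(islice(rationals(), 7))
# The count begins 0, -1, 1, -2, 2, -1/2, 1/2, ...
assert first_seven == [Fraction(0), Fraction(-1), Fraction(1),
                       Fraction(-2), Fraction(2),
                       Fraction(-1, 2), Fraction(1, 2)]
# Every rational eventually gets a number; 3/7 shows up on diagonal 10.
assert Fraction(3, 7) in islice(rationals(), 1000)
```

Note that `Fraction` automatically reduces to lowest terms, which is what lets the `seen` set discard duplicates like 2/4.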
If we replace row i in the diagram with the set Xi = {xi1 , xi2 , xi3 , · · · }, the same argument shows that the union
X1 ∪ X2 ∪ X3 ∪ · · · is countable. So we have shown the following:

Theorem 2.6.1. A countable union of countable sets is countable.

Now I will claim that the real numbers are uncountable, in other words, that there are somehow more of them. So
what would that actually mean for infinite sets? It has to do with matching.
Suppose you bought your silverware as a set and you started with the same number of forks and spoons. But you
suspect that there is a small black hole in your kitchen and that your spoons are being pulled into a parallel universe.
You have a big dinner party planned for that night. (This is post-COVID.) When you set the table, you find that there
are places with forks but no spoons. Since you ran out of spoons but had forks left over, you can conclude that there
are more forks than spoons. So our plan is to match real numbers up with positive integers and still have some left
over.
Suppose we had a sequence x1 , x2 , x3 , · · · that listed every number in [0, 1], with each xi written as a binary
expansion. Let x be the number whose i-th bit is the complement of the i-th bit of xi . Then x differs from every xi on
the list and can’t appear on it. So any countable sequence of numbers in [0, 1] has to leave something out, like the fork
with no spoon. So the numbers in [0, 1], and hence the reals, are not countable.
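A finite snippet of the diagonal construction can even be coded up (my own toy illustration, using lists of bits for the expansions):

```python
def diagonal_missing(rows):
    """Given the first n bits of n listed binary expansions, return n bits
    of a number whose i-th bit is the complement of the i-th bit of row i."""
    return [1 - rows[i][i] for i in range(len(rows))]

# An attempted (finite) enumeration of numbers in [0, 1]:
listed = [[0, 0, 0, 0],
          [1, 1, 1, 1],
          [0, 1, 0, 1],
          [1, 0, 0, 1]]
x = diagonal_missing(listed)
# x disagrees with the i-th listed number in bit i, so it equals none of them.
assert all(x[i] != listed[i][i] for i in range(len(listed)))
```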
So how many real numbers are there? To find out, we use the concept of a power set.

Definition 2.6.3. Let A be a set. Then the power set of A denoted P (A) is the set of all subsets of A.

Example 2.6.2. Let A = {a, b, c}. Listing all of the subsets of A we get:

1. ∅

2. {a}

3. {b}

4. {c}

5. {a, b}

6. {a, c}

7. {b, c}

8. {a, b, c} = A

So |P (A)| = 8.

In general, if |A| = n, then |P (A)| = 2n . To see this, if A = {a1 , a2 , · · · , an }, then for each i where 1 ≤ i ≤ n,
we can include ai in our subset or exclude it. So this makes the number of subsets 2n .
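The include-or-exclude argument translates directly into code. Here is a sketch using the standard itertools recipe for subsets:

```python
from itertools import chain, combinations

def power_set(elements):
    """All subsets of `elements`: for each element, include it or not."""
    s = list(elements)
    return [set(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

subsets = power_set(['a', 'b', 'c'])
assert len(subsets) == 2 ** 3              # |P(A)| = 2^n with n = 3
assert set() in subsets                    # the empty set is a subset
assert {'a', 'b', 'c'} in subsets          # as is A itself
```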
Now looking at real numbers in [0, 1], we write the binary expansion as .x1 x2 x3 · · · . We are in the situation above
if 0 means exclude a member of a set and 1 means include it: each expansion corresponds to a subset of the set of bit
positions. Since the bit positions are indexed by the positive integers, which form a set of cardinality ℵ0 , the numbers
in [0, 1] correspond to the power set of a countable set, which has cardinality 2ℵ0 . We sometimes write 2ℵ0 = C, where C
stands for continuum. The continuum hypothesis states that there is no set with cardinality c such that ℵ0 < c < 2ℵ0 .
Gödel and Cohen showed that it can be neither proved nor disproved from the standard axioms of set theory: it is
independent of them.
A related concept which appears in many topology examples is the set of ordinals Ω. Note that this means something
different from the usual definition of ordinal numbers (first, second, third, etc.) Ω is an uncountable linearly
ordered set with a largest element ω1 . (A set A is linearly ordered if it is partially ordered and has the property that
for a, b ∈ A, either a ≤ b or b ≤ a.) You can think of Ω as the set starting with the positive integers 1, 2, 3, 4, · · · . There
is a smallest ordinal larger than all of these which we call ω0 , the first infinite ordinal. Then we have ω0 + 1, ω0 + 2,
ω0 + 3, · · · until we reach ω0 · 2 (= ω0 + ω0 ). Eventually we get to ω02 , ω03 , · · · until we finally get to the first
uncountable ordinal ω1 and stop. We define Ω0 = Ω − ω1 and call it the set of countable ordinals.
Now we can build one of my favorite weird spaces. First attach 0 to Ω and insert a copy of (0, 1) between any
successive pair of ordinals. This makes a very long number line called the long line. I will follow [152] and let the
long line L only include the countable ordinals. If we extend all the way to ω1 we have the extended long line L∗ .
So next time you are frustrated by standing in a long line, be grateful it isn’t as long as the long line.

2.7 Compactness
If you ever took a class in Real Analysis/Advanced Calculus, you probably learned something about compactness.
I found it a little confusing the first time, so hopefully I can make it clearer. In Rn , compactness is equivalent to being
closed and bounded (all pairs of points are within some fixed distance of each other). Since we live in R3 , this is the case in our
world. This is convenient, as if a compact car were not closed and bounded, the edges would be kind of fuzzy and it
would be very hard to fit into a parking space. And if you had a compact disc (anyone remember what those are?) it
would be hard to find something that could play them. But unless you are in Rn with the usual metric space topology,
being closed and bounded does not imply compactness. Later we will see a space that is closed and bounded but not compact.
Since we now understand what a countable set is, I will start with a little more generality.

Definition 2.7.1. A subset D of a topological space X is dense in X if D̄ = X, i.e. the closure of D is all of X. An
equivalent definition is that every nonempty open set in X contains an element of D.

Definition 2.7.2. A topological space X is separable if it has a countable dense subset.



Example 2.7.1. It is obvious that every open interval in R contains a rational. Since these form a base for the topology
on R, we know that every open set in R contains a rational. So the rationals are dense in R. Since we know that the
rationals are countable, R is separable.

Theorem 2.7.1. 1. A continuous image of a separable space is separable.

2. Subspaces of a separable space need not be separable. But an open subspace of a separable space is separable.

3. A product of separable spaces, each Hausdorff and containing at least 2 points, is separable if and only if there
are no more than 2ℵ0 factors.

Definition 2.7.3. An open cover of a topological space X is a collection of subsets of X such that X is contained in
their union. We say that this collection of open sets covers X. A subcover is a subcollection of a cover which still
covers X.

Definition 2.7.4. A topological space X is Lindelöf if every open cover of X has a countable subcover. A topological
space X is compact if every open cover of X has a finite subcover.

Be careful here. I didn’t say for example that a space is compact if there is some open cover with a finite subcover.
That is always true. Just cover X with itself. But if I give you any open cover, you need to pick out finitely many
which still cover X.
We will start with Lindelöf spaces.

Example 2.7.2. The real line R with the usual topology is Lindelöf. Let {Uα } be an open cover of R. The open
intervals with rational endpoints form a countable base for the topology. For each such interval that is contained in
some Uα , choose one Uα containing it. This gives a countable subcollection, and it still covers R: every x ∈ R lies in
some Uα , which, being open, contains a rational interval around x, and the element we chose for that interval contains
x. So every open cover has a countable subcover.

Theorem 2.7.2. 1. A continuous image of a Lindelöf space is Lindelöf.

2. Closed subspaces of Lindelöf spaces are Lindelöf. Arbitrary subspaces of Lindelöf spaces need not be Lindelöf.

3. Products of even two Lindelöf spaces need not be Lindelöf.

I will prove 1. and the first part of 2.


Proof:
1) Let f : X → Y be continuous and onto, where X is Lindelöf. Let {Uα } be an open cover of Y . Then {f −1 (Uα )} is
an open cover of X. Since X is Lindelöf, we can choose a countable subcollection {f −1 (U1 ), f −1 (U2 ), · · · } which
covers X. Then {U1 , U2 , · · · } is a countable subcollection which covers Y , and Y is Lindelöf.■
2) Let F be closed in a Lindelöf space X. If {Uα } is an open cover of F , we have for each α an open subset Vα of
X such that Uα = F ∩ Vα . Then X − F together with {Vα } form an open cover of X. Since X is Lindelöf, there is a
countable subcover {X − F, V1 , V2 , · · · }. Then the corresponding {U1 , U2 , · · · } form a countable subcover of F .■

Example 2.7.3. As an example of a subset of a Lindelöf space that is not Lindelöf, we let Ω be the set of ordinals,
and Ω0 = Ω − ω1 , where ω1 is the first uncountable ordinal. We provide Ω with the order topology in which a basic
neighborhood of α ∈ Ω0 is of the form (α1 , α2 ) where α1 < α < α2 . A basic neighborhood of ω1 is of the form
(γ, ω1 ]. Then Ω is a Lindelöf space. To see this, given any open cover of Ω one of the elements, say U , contains ω1 .
Then U contains an interval (γ, ω1 ] for some γ. But then we only need to cover [1, γ] which is a countable set, so we
have a countable subcover and Ω is a Lindelöf space.
But the subspace Ω0 is not Lindelöf. Let Uα = [1, α) for α ∈ Ω0 . Then the set of Uα is an open cover of Ω0 . But
there is no countable subcover. If there were, let {Uα1 , Uα2 , · · · } be this subcover. For it to cover Ω0 , the least upper
bound of {α1 , α2 , · · · } would have to be ω1 . But the least upper bound of countably many countable ordinals has
only countably many predecessors, so it is itself a countable ordinal and is strictly less than ω1 . So we have a
contradiction and Ω0 is not Lindelöf.

Now we move on to the much more important compact spaces. It should be obvious that a compact space is always
Lindelöf. The first example shows that not every Lindelöf space is compact.

Example 2.7.4. The real line R with the usual topology is not compact. Consider the open intervals {(−n, n)}, where
n = 1, 2, 3, · · · . Then these intervals cover R but there is no finite subcover. If there were, then there would be a
largest interval (−N, N ), and the subcover would miss every x ∈ R with |x| ≥ N .
Theorem 2.7.3. 1. A continuous image of a compact space is compact.
2. Closed subspaces of a compact space are compact.
We get these for free from Theorem 2.7.2 by replacing “countable” with “finite” in the proofs.
Theorem 2.7.4. A compact subset of a Hausdorff space is closed.
Example 2.7.5. To see a compact subset of a space which is not closed, consider the two point space X = {a, b}
where {a} is open but {b} is not. Then {a} is not closed in X but it is compact as there are only finitely many open
sets in X. Of course, X is not Hausdorff.
The next result which you may know from real analysis gives us lots of examples of compact sets:
Theorem 2.7.5 (Heine-Borel Theorem). A subset of Rn (with the usual topology) is compact if and only if it is closed
and bounded.
So in particular, the unit interval I is compact as is the n-cube I n .
Example 2.7.6. To see a closed and bounded set which is not compact, let X be the discrete metric space with
infinitely many points. Recall that for any x ̸= y, ρ(x, y) = 1. Then X is bounded, as the distance between any two
points is at most 1, and X is closed as it is equal to a closed ball of radius 1. But X is not compact. Take as the
open cover the open balls of radius 1/2 centered at each point of X. Then each of these sets consists of a single point,
so no finite subcollection (indeed, no proper subcollection) can cover X. So X is closed and bounded but not compact.
Finally, products of compact spaces are particularly nice.
Theorem 2.7.6 (Tychonoff Theorem). A product of compact spaces is compact.

2.8 Connectedness
Everything grows together, because you’re all one piece.–Fred Rogers, Mister Rogers’ Neighborhood.
This brings up some serious questions. What does it mean to be one piece? Are you really in one piece? What
about the cells, molecules, atoms, subatomic particles, etc.? Fortunately, for topological spaces, it is easy to define
being in one piece.
While we are on the subject of Mister Rogers, what do you call an open subset of the complex plane containing
the number i? The Neighborhood of Make Believe.
Actually, connectedness and its variants form one of the most important concepts in algebraic topology. Here is
the definition:
Definition 2.8.1. A topological space X is disconnected if there are nonempty disjoint open subsets H and K of X
such that X = H ∪ K. We say that H and K disconnect X. If no such subsets exist, we say that X is connected.
We could replace open with closed in this definition. So a space X is connected if the only subsets that are both
open and closed are X and ∅.
Definition 2.8.2. If x ∈ X, the largest connected subset of X containing x is called the (connected) component of x.
We will denote it as Cx . It is the union of all connected subsets of X containing x.
So you can think of components as the individual pieces of X.
Example 2.8.1. Let X ⊂ R where X = [0, 1] ∪ (4, 5). Then [0, 1] and (4, 5) are the two components of X.

If x, y are two distinct points in X, then either Cx = Cy or Cx ∩ Cy = ∅. If this were not the case, then Cx ∪ Cy
would be a connected set containing both x and y which is larger than both Cx and Cy , which is impossible. So
components of X form a partition of X into maximal connected subsets. (A partition means a collection of disjoint
subsets whose union is all of X.)
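For a data-science flavored illustration, here is a sketch (my own, treating every interval as closed for simplicity) that computes the components of a union of intervals on the real line, as in Example 2.8.1:

```python
def components(intervals):
    """Merge (lo, hi) intervals into the connected components of their union.
    For simplicity this sketch treats every interval as closed, so intervals
    that touch or overlap get merged."""
    parts = []
    for lo, hi in sorted(intervals):
        if parts and lo <= parts[-1][1]:                 # touches the last piece
            parts[-1] = (parts[-1][0], max(parts[-1][1], hi))
        else:                                            # starts a new component
            parts.append((lo, hi))
    return parts

# X = [0, 1] ∪ (4, 5) from Example 2.8.1 has two components:
assert components([(0, 1), (4, 5)]) == [(0, 1), (4, 5)]
# Overlapping pieces merge into a single component:
assert components([(0, 1), (0.5, 2), (4, 5)]) == [(0, 2), (4, 5)]
```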
Here are some examples of connected and disconnected spaces.
Example 2.8.2. The Sorgenfrey line E is the real line with an alternate topology. Here the basic neighborhoods of
x ∈ E are of the form [x, z), where z > x. Then E is disconnected. If we let H = ∪x<0 [x, 0) and K = ∪z>0 [0, z)
then H and K are both open, H ∩ K = ∅, and H ∪ K = E. So E is disconnected.
Example 2.8.3. The discrete metric space is disconnected if it has at least 2 points. Since each point is an open set,
any 2 subsets of the space form a disjoint union of open sets.
Example 2.8.4. The unit interval I = [0, 1] is connected. Suppose it were disconnected by H and K, and say 1 ∈ H.
Then H contains a neighborhood of 1. Let c = sup K, which exists since K is nonempty and bounded above. (The
term sup means least upper bound, or the smallest number that is greater than or equal to everything in K. The greatest
lower bound is denoted inf.) We know that c ̸= 1 since H contains a neighborhood of 1. Since c belongs to H or K
and both of these are open, some neighborhood of c is contained in H or K. But any neighborhood of c must contain
points of K to the left of c (since c = sup K) and points of H to the right (any point greater than c cannot lie in K).
This is a contradiction so I is connected. By the same argument, so is any closed interval in R.
Connected sets are preserved under continuous functions.
Theorem 2.8.1. The continuous image of a connected set is connected.
Proof: Let X be connected and f be a continuous function of X onto Y . Then if H and K disconnect Y , f −1 (H)
and f −1 (K) disconnect X, which is a contradiction. So Y is connected. ■
Now I will show you something really cool. Do you remember the Intermediate Value Theorem from calculus?
Your book probably didn’t even try to prove it. With topology, the proof is two lines.
Corollary 2.8.2. Intermediate Value Theorem: Let f be a real valued function of a single real variable which is
continuous on [a, b]. (Without loss of generality we can assume f (b) ≥ f (a) or we can make the obvious modification.)
Then for every y such that f (a) ≤ y ≤ f (b), there is an x ∈ [a, b] such that f (x) = y.
Proof: Since [a, b] is connected, f ([a, b]) is also connected by the previous theorem. So if f (a) ≤ y ≤ f (b) but y
is not in the image of f then the image of f is disconnected which would be a contradiction. ■.
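The Intermediate Value Theorem is also why the bisection method from numerical analysis works. A sketch (my own illustration; `ivt_solve` is a made-up name):

```python
def ivt_solve(f, a, b, y, tol=1e-12):
    """Find x in [a, b] with f(x) = y, assuming f is continuous and
    f(a) <= y <= f(b).  Bisection keeps the half-interval on which the
    Intermediate Value Theorem still guarantees a solution."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid      # a solution lies in [mid, hi]
        else:
            hi = mid      # a solution lies in [lo, mid]
    return (lo + hi) / 2

# f(x) = x**3 is continuous on [0, 2] with f(0) = 0 <= 5 <= 8 = f(2),
# so the theorem guarantees some x with x**3 = 5:
x = ivt_solve(lambda t: t ** 3, 0.0, 2.0, 5.0)
assert abs(x ** 3 - 5.0) < 1e-9
```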
The next set of results deals with subsets of connected spaces. These need not be connected: X = [0, .1] ∪ (.3, .5)
is a disconnected subset of the connected set I.
Definition 2.8.3. Sets H and K are mutually separated in X if

H̄ ∩ K = H ∩ K̄ = ∅,

where the bar denotes closure in X.

Theorem 2.8.3. A subspace E of X is connected if and only if there are no nonempty mutually separated sets H and
K in X with E = H ∪ K.
Corollary 2.8.4. If H and K are mutually separated in X and E is a connected subset of H ∪ K then E ⊂ H or
E ⊂ K.
We now have some more ways of proving a space is connected.
Theorem 2.8.5. 1. If X = ∪Xα where each Xα is connected and ∩Xα ̸= ∅, then X is connected.
2. If each pair of points in X lies in some connected subset of X then X is connected.
3. If X = ∪ni=1 Xi where each Xi is connected and Xi ∩ Xi+1 ̸= ∅ for all i, then X is connected.
Theorem 2.8.6. If E is a connected subset of X and E ⊂ A ⊂ Ē then A is connected.

Proof: It is enough to show that Ē is connected: if E ⊂ A ⊂ Ē, then A is equal to the closure of E in A, so we
can run the same argument with the closure in A of E in place of the closure in X of E (i.e. Ē). Suppose Ē = H ∪ K,
where H and K are disjoint nonempty open sets in Ē. Then E = (H ∩ E) ∪ (K ∩ E), and these are disjoint open sets
in E, both nonempty since E is dense in Ē. So if Ē is disconnected, so is E, a contradiction. ■
Example 2.8.5. The real line R is connected. To see this, note that R is the union of the intervals [−n, n] for
n = 1, 2, 3, · · · , and each interval [−n, n] is homeomorphic to I and thus connected. Since their intersection is
nonempty (it contains 0), part 1 of Theorem 2.8.5 shows that R is connected.
Example 2.8.6. Rn is connected by the same theorem. Rn is the union of all straight lines through its origin, and
each of these is homeomorphic to R which is connected by the last example. Also, the origin lies in their intersection.
So Rn is connected.
Example 2.8.7. R2 and R1 are not homeomorphic. To see this, remove a point from both of them. Then R2 is still
connected, but R1 becomes disconnected. When we do homology theory in Chapter 4, I will show you why Rn and
Rm are not homeomorphic if n ̸= m. We don’t yet have the tools for it now, but it is an easy consequence of homology
theory.
Connectivity behaves well with product spaces due to continuity of projection maps.
Theorem 2.8.7. A nonempty product space is connected if and only if each factor space is connected.
This follows from the continuity of projection maps and the fact that the continuous image of a connected space is
connected.
Finally, we have the following result on components of a space.
Theorem 2.8.8. The components of X are closed sets.
Proof: If Cx is the component of x in X, then C̄x is a connected set containing x by Theorem 2.8.6. So C̄x ⊂ Cx and Cx is closed.■
Now that we have defined connectedness, you may wonder if you can always get there from here. Is there a ”path”
connecting any two points? In homotopy theory you learn to love paths and especially ”loops” which are paths that
start and end at the same place. We will now define the notion of pathwise connectedness.
Definition 2.8.4. A space X is pathwise connected if for any two points x, y ∈ X, there is a continuous function
f : I → X such that f (0) = x and f (1) = y. The function as well as its range f (I) is called a path from x to y with
initial point x and terminal point y. A continuous function f : I → X such that f (0) = f (1) = x is called a loop.
You will frequently see the term arcwise connected, which means that for any pair of points we can find a path which is a
homeomorphism onto its image. (So it doesn’t intersect itself.) It turns out that a Hausdorff space is arcwise connected
if and only if it is pathwise connected. (See [172] for a proof.) Since we won’t need spaces that are not Hausdorff in
data science applications, we can treat these terms as interchangeable when you see them in books and papers. I will
use the term pathwise connected in the rest of this paper.
Theorem 2.8.9. Every pathwise connected space is connected.
Proof: If H and K disconnect the pathwise connected space X, let f : I → X be a path between x ∈ H and
y ∈ K. Then f −1 (H) and f −1 (K) disconnect I, which is a contradiction.■
Example 2.8.8. Not every connected space is path connected. Figure 2.8.1 shows the topologist’s sine curve, which
is the union of the graph of sin(1/x) for x > 0 and the x-axis for x ≤ 0. The right hand side oscillates between -1
and 1, and the oscillation gets faster and faster as you approach 0. This set is connected, as any neighborhood of the
origin contains points on both pieces of the graph, so it cannot be disconnected by open sets. But there is no path from
(0, 0) to (x, sin(1/x)) for any x > 0. So the topologist’s sine curve is connected but not path connected.
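You can see numerically why no path can reach the origin: arbitrarily close to x = 0 the curve attains both heights +1 and −1. A quick check (my own illustration):

```python
import math

# x-values shrinking toward 0 at which sin(1/x) is exactly +1 or -1:
peaks = [1 / (2 * k * math.pi + math.pi / 2) for k in range(1, 6)]
troughs = [1 / (2 * k * math.pi + 3 * math.pi / 2) for k in range(1, 6)]

assert all(abs(math.sin(1 / x) - 1) < 1e-9 for x in peaks)
assert all(abs(math.sin(1 / x) + 1) < 1e-9 for x in troughs)
# Both sequences approach 0, so every neighborhood of the origin contains
# points of the curve at height +1 and at height -1; a path into the origin
# would have to pass through all of them and could not be continuous.
assert peaks[-1] < 0.04 and troughs[-1] < 0.04
```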
Example 2.8.9. In the discrete metric space, every point is both open and closed, so every point is its own connected
component. Such a space is called totally disconnected. In this space, there is no path between any pair of distinct
points, as such a path would disconnect the unit interval I. So if you lived in this space, even though you would have
a short commute to work (only 1 unit), you would have no way to get there. And you wouldn’t be all one piece, but
infinitely many.
Now you should know enough topology to handle algebraic topology. Next we will focus on the algebra.

Figure 2.8.1: Topologist’s Sine Curve


Chapter 3

Abstract Algebra Background

Like point-set topology, abstract algebra is a huge area. I took an entire year of it in college and an additional year in
graduate school. Fortunately, the amount you need for algebraic topology is not nearly as extensive. So in this chapter,
I will present the amount you need to have a basic understanding of homology theory. We can add a little more as we
need it when we go into more advanced topics.
In topology, we give sets additional structure by defining a distinguished class of subsets, the open sets. In algebra,
by contrast, the structure comes from operations that combine elements of a set to produce other elements of the set.
In a binary operation we take two elements and produce a third. A group is a set in which there is a single operation, called addition
or multiplication. It doesn’t really matter which one, as the operation may not resemble what we do to numbers. But we
want to write the operations in a group in a familiar form, so for example, if we are adding, the identity element (which
leaves the other fixed) is written as a 0, and in multiplication, it is written as a 1. The elements could be numbers but
could also be rocks, flowers, or people as long as you can multiply or add them in a way which follows a fixed set
of rules. So in Section 3.1, I will discuss the basics of groups. In algebraic topology, groups have an especially nice
structure which is very easy to visualize.
Section 3.2 will deal with the idea of an exact sequence. This is a chain of groups and maps between them with
some very nice properties. You can often compute the structure of groups in an exact sequence if you know the
structure of their neighbors. In algebraic topology this is one of your most effective tools.
In Section 3.3, we will introduce rings which have two binary operations, both addition and multiplication. Now
there is a big difference between them. For example, if r is an element of a ring R, then 0 + r = r, but 0r = 0. What
ties them together is a distributive law. A very special type of ring is a field. The real, rational, and complex numbers
are all fields. An important example of a ring that is not a field is the integers.
In Section 3.4, we introduce a new operation called scalar multiplication. A vector space is a group under addition
with a way to multiply elements by elements of an external field called the field of scalars. Vector spaces should be
familiar to you if you have ever taken a class in linear algebra. If we use a ring instead of a field, we get a module. So
a vector space is just a special case of a module. Therefore, a vector space on the moon is a lunar module. If you have
addition, multiplication, and scalar multiplication, you get an algebra. Square matrices are an example.
So now you are going to object and say, “None of this is abstract enough for me.” Well, you don’t have to worry
because we will finish this chapter off with a section on category theory. As I pointed out when we were discussing
set theory, Russell’s paradox doesn’t allow for the set of all sets. So we can talk about the category of sets. Categories
consist of objects and special maps between them called morphisms. Examples are the categories of sets, topological
spaces, groups, and rings. Maps between categories that take objects to objects and morphisms to morphisms are
called functors. Homology is a functor that takes a geometric object (actually a topological space) and replaces it by a
group in each dimension up to the highest dimension of the space. (I will make this more precise in Chapter 4, but we
can always think of these spaces as subsets of some Rn so n is their dimension.)
There are lots of textbooks on abstract algebra, but my personal favorite to start with is Herstein’s Topics in Algebra
[66]. This book seems to be out of print and hard to find now. It has been replaced by the posthumously published
Abstract Algebra, 3rd Edition [67], which covers somewhat less material. Most of Sections 3.1, 3.2, and 3.4 will come

Rule Addition
Closure a, b ∈ G implies that a + b ∈ G
Associative Law a + (b + c) = (a + b) + c
Identity There exists an element 0 ∈ G such that for every a ∈ G, 0 + a = a + 0 = a
Inverse For every a ∈ G there exists −a ∈ G such that a + (−a) = (−a) + a = 0
Rule Multiplication
Closure a, b ∈ G implies that ab ∈ G
Associative Law a(bc) = (ab)c
Identity There exists an element 1 ∈ G such that for every a ∈ G, 1a = a1 = a
Inverse For every a ∈ G there exists a−1 ∈ G such that aa−1 = a−1 a = 1

Table 3.1.1: Group Laws.

from [66], with some additional material from Munkres [110]. Section 3.2 is entirely from [110]. Another one of my
favorites is Jacobson’s Basic Algebra. The only problem is that the name is a misnomer and it is almost embarrassing
to walk around with it as people will think I never learned my high school algebra. In actuality, though, if you know
all the material in this two volume set, you will be well prepared for any Ph.D. qualifying exam in algebra. There is a
good short chapter on category theory in the second volume which I will use in Section 3.5. If you want to learn more,
the classic book by one of the inventors of category theory is MacLane’s Categories for the Working Mathematician
[91].

3.1 Groups
Since as mentioned in the introduction, a group’s operation can either be multiplication or addition, I will define it
in both cases.
Definition 3.1.1. A nonempty set G is a group if there is a binary operation sum which is denoted by + or product
which is denoted by · which takes two elements of G and produces a third and obeys the rules listed in Table 3.1.1.
(Here we write ab for a · b.)
Arranging the rules in a table shows the explicit parallels between groups under addition and multiplication. They
are the same thing other than a difference in notation.
Note that there is something missing from the rules you learned about numbers in high school algebra. There is no
commutative law. This law only holds for a special type of group.
Definition 3.1.2. A group G is called an abelian group if the commutative law holds. It is written a + b = b + a for
addition or ab = ba for multiplication. A group which is not abelian is a non-abelian group.
What is purple and commutes? An abelian grape.
A group can be either finite or infinite. If G is finite, the number of elements in G is called the order of G and is
written as |G|.
Example 3.1.1. The trivial group having only one element is {1} written multiplicatively or {0} written additively. We
will write these simply as 1 and 0.
Example 3.1.2. Let G = Z, the set of integers. Then G is a group under addition. In fact, it is an abelian group.
But Z is not a group under multiplication. No integer other than 1 and −1 has a multiplicative inverse that is also an
integer. The nonzero integers are an example of a monoid. You may remember a monoid as the one-eyed giant that
Odysseus met. (Or was that a Cyclops?) Actually a monoid is similar to a group except that it might not have inverses.
As we will see, a ring is an abelian group under addition and its nonzero elements form a monoid under multiplication.

Example 3.1.3. Let G = R, the set of real numbers. Then R is a group under addition and the nonzero elements of R
form a group under multiplication. The same holds if R is replaced by the set Q of rationals. The irrationals are not a
group under addition, and the nonzero irrationals are not a group under multiplication, as they are missing both 0 and 1.
Example 3.1.4. An example of a finite group is the integers mod n written as Zn . Zn = {0, 1, 2, · · · , n − 1} which
is a group under addition. Addition is done in the usual way, but (n − 1) + 1 = 0. Think of a clock with a 0 where the
12 usually goes. Then for example, 11 + 3 = 2.
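The clock arithmetic of Zn is one line of code. A sketch (my own illustration) verifying the group laws for Z12:

```python
def add_mod(a, b, n=12):
    """Addition in Z_n: wrap around past n - 1, like a clock with 0 on top."""
    return (a + b) % n

assert add_mod(11, 3) == 2                               # the example above
assert add_mod(11, 1) == 0                               # (n - 1) + 1 = 0
assert all(add_mod(a, 0) == a for a in range(12))        # 0 is the identity
assert all(add_mod(a, (12 - a) % 12) == 0 for a in range(12))  # inverses exist
```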
Example 3.1.5. An example of a non-abelian group is the set of invertible n × n matrices with real entries for a fixed
n. The operation is matrix multiplication which we know is not commutative. The set of all n × n matrices with real
entries is a group under addition.
Example 3.1.6. A more exotic yet important group is the permutation group Sn . (Don’t confuse this with the sphere
S n .) This group consists of all possible permutations of n objects where permutations are multiplied by taking them
in succession. |Sn | = n! as can be seen from the fact that a permutation can put the first object in n positions, the
second in n − 1 positions, etc. Also, this group is non-abelian for n ≥ 3. In fact, S3 is the smallest non-abelian group.
As an example, suppose you are the zookeeper in a small zoo which has only 3 animals: a lion, an elephant, and
a giraffe. Your boss likes to constantly move animals around to confuse any potential visitors. So he gives you a card
every day with instructions on moving animals. The cards are written in cycle notation. If two objects are enclosed
in the same set of parentheses, you move each object to the one on its right and circle around if necessary. So for
example, if the cages are in a row in the order [Lion, Elephant, Giraffe] or [L, E, G] and the card says (E)(L, G) this
means to leave the elephant in its place and switch the lion and the giraffe resulting in the new configuration [G, E, L].
We usually omit a cycle with one element so we write the previous permutation as (L, G). If we have the cycle (L)
or equivalently either of the other two letters, this means leave everything alone. We multiply permutations by doing
them in succession, so (E, L)(L, E, G) means first switch the elephant and the lion, then lion to elephant, elephant
to giraffe, and giraffe to lion. So [L, E, G] becomes [E, L, G] and then [L, G, E]. Note that reversing the order is
(L, E, G)(E, L) so we do the circular switch first and then switch the elephant and the lion. So [L, E, G] becomes,
[G, L, E] and then [G, E, L]. So reversing the order gives a different configuration, and S3 is non-abelian. To show it
is an actual group, we can undo a cycle with two elements by doing it again. So applying (L, E) twice switches the
lion and the elephant and then switches them back. We undo (L, E, G) with (G, E, L).
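The zookeeper’s calculations are easy to automate. Here is a sketch (my own illustration; the helper `apply_cycles` is made up) that reproduces the computations above and confirms that S3 is non-abelian:

```python
def apply_cycles(cycles, arrangement):
    """Apply a permutation written in cycle notation to a cage arrangement.
    Within each cycle, every animal moves to the cage of the animal named
    to its right (wrapping around to the front)."""
    new = list(arrangement)
    for cycle in cycles:
        for i, animal in enumerate(cycle):
            target = cycle[(i + 1) % len(cycle)]      # the animal on its right
            new[arrangement.index(target)] = animal
    return new

start = ['L', 'E', 'G']
# (E, L)(L, E, G): first switch the elephant and the lion, then cycle:
step1 = apply_cycles([('E', 'L')], start)
step2 = apply_cycles([('L', 'E', 'G')], step1)
assert step1 == ['E', 'L', 'G'] and step2 == ['L', 'G', 'E']
# The reverse order (L, E, G)(E, L) gives a different configuration,
# so S_3 is non-abelian:
other = apply_cycles([('E', 'L')], apply_cycles([('L', 'E', 'G')], start))
assert other == ['G', 'E', 'L']
```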
Note that Zn is a group formed by taking an element and all multiples until one of them becomes 0. An equivalent
multiplicative group is the group G = {1, a, a2 , a3 , · · · , an−1 } where a(an−1 ) = 1. This type of group is generated
by a single element and is called a cyclic group of order n. The integers form an infinite cyclic group under addition.
An equivalent multiplicative group is a group of the form {g i } where i is any integer and g 0 = 1. I will make precise
what I mean by equivalent a little later.
What do you call 5 friends who all like to ride bicycles together? A cyclic group of order 5.
Recall that topological spaces had subspaces, product spaces, and quotient spaces. I will now describe the analogues
for groups. These will be critical in algebraic topology. I will start with subgroups. Unlike the case of
topological spaces, not every subset of a group is a subgroup. We need this subset to be a group itself.
Definition 3.1.3. A nonempty subset H of a group G is a subgroup of G if it is itself a group.
H inherits the binary operation from G so it is additive if G is additive and multiplicative if G is multiplicative.
I am going to use multiplicative notation for the next part of the discussion. Later I will assume we are dealing
with abelian groups which are traditionally represented as additive. Homology and cohomology only involve abelian
groups. Non-abelian groups can come up in homotopy theory, though, so I will have more to say about them when we
get to Chapter 9. I will always specify when the groups being discussed are restricted to being abelian.
So what does H need to be a group itself? It will automatically have the associative law. So we need closure, the
identity, and inverses. We can drop the requirement that the identity be in H if we have closure and the existence of
inverses since h ∈ H implies h−1 ∈ H, so by closure, h · h−1 = 1 ∈ H. So we have shown:
Theorem 3.1.1. A nonempty subset H of a group G is a subgroup of G if and only if a, b ∈ H implies ab ∈ H and
a ∈ H implies a−1 ∈ H.
34 CHAPTER 3. ABSTRACT ALGEBRA BACKGROUND
Example 3.1.7. If Z is the integers under addition, then let 3Z be the multiples of 3. Then since the sum of multiples
of 3 is always a multiple of 3, and a is a multiple of 3 if and only if −a is a multiple of 3, we have that 3Z is a subgroup
of Z.
Example 3.1.8. If Z is the integers under addition, then the even numbers are equal to 2Z so they form a subgroup by
the argument in the last example. The odd numbers, do not form a subgroup. For example, 3 and 5 are odd, but 3+5=8
which is an even number, so closure fails. And the odd numbers do not contain the identity element 0.
Example 3.1.9. The integers form a subgroup of the reals under addition.
Example 3.1.10. Let G be the nonzero real numbers under multiplication. Then the nonzero rationals form a subgroup
as do the nonzero positive rationals.
If our group is finite, there is an important relationship between the order of G and the order of its subgroups. We
start with the idea of a coset.
Definition 3.1.4. If H is a subgroup of G, and a ∈ G then we define the right coset Ha = {ha|h ∈ H}. The left
coset aH has an analogous definition. If G is additive, cosets are of the form H + a.
Theorem 3.1.2. Any two right cosets of H in G are either identical or disjoint. Since H = H1, H is itself a coset
and the theorem implies that the cosets partition G.
Proof: Suppose Ha and Hb have a common element c. Then c = h1 a = h2 b for some h1 , h2 ∈ H. So h1−1 h1 a = h1−1 h2 b, so a = h1−1 h2 b, which implies that a ∈ Hb. This implies Ha ⊂ Hb. A similar argument shows Hb ⊂ Ha. So Ha = Hb. ■
Note that the above argument shows that Ha = Hb if ab−1 ∈ H.
Now every coset is the same size as H. This is true since for any two distinct elements of h1 , h2 ∈ H, h1 a and
h2 a are distinct as we can see by multiplying both by a−1 on the right. So G is a union of disjoint sets of size |H|. So
we have proved the very important result:
Theorem 3.1.3. (Lagrange's Theorem) If G is a finite group and H is a subgroup of G, then |H| is a divisor of |G|.
So for a group of prime order, the only subgroups are G itself and the trivial subgroup 1.
Definition 3.1.5. If H is a subgroup of G, then the index of H in G written [G : H] is the number of right cosets of
H in G.
Theorem 3.1.4. If H is a subgroup of G, then [G : H] = |G|/|H|.
Example 3.1.11. Let G = Z10 and H = {0, 5}. Then |G| = 10 and |H| = 2. H is a subgroup of G since in Z10 ,
5+5 = 0. The cosets of H are H +0 = {0, 5}, H +1 = {1, 6}, H +2 = {2, 7}, H +3 = {3, 8}, H +4 = {4, 9}. These
exhaust Z10 and there are 5 of them, so [G : H] = 5 = 10/2 = |G|/|H| as we expect. Note that for an additive
group, H + a = H + b if a − b ∈ H. For example H + 2 = {2, 7}, and H + 7 = {7, 2} = H + 2. We check that
7 − 2 = 5 ∈ H.
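The coset computation in Example 3.1.11 can be checked by brute force. A sketch in Python (the variable names are mine):

```python
# Cosets of H = {0, 5} in Z10 (Example 3.1.11)
G = set(range(10))
H = {0, 5}

# Each coset H + a; using frozensets so duplicates collapse automatically
cosets = {frozenset((h + a) % 10 for h in H) for a in G}
print(sorted(sorted(c) for c in cosets))
# Cosets partition G: they are disjoint and their union is all of Z10
assert set().union(*cosets) == G
assert sum(len(c) for c in cosets) == len(G)
print(len(cosets), len(G) // len(H))  # the index [G : H] = |G|/|H| = 5
```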
Example 3.1.12. An infinite group can have a subgroup with finite index. An easy example is letting G be the integers
and H the even integers under addition. H + 1 is the set of odd integers, and G = H ∪ (H + 1). So we have that the
index is [G : H] = 2.
We will now talk about the order of an element of a group.
Definition 3.1.6. If G is a group and g ∈ G, then the order of g, written o(g), is the smallest power m of g such that g m = 1. This is finite for every element if G is finite. The analogue for an additive group is mg = 0.
If g ∈ G has finite order m, then {1, g, g 2 , · · · , g m−1 } is a subgroup of G. So we have:
Theorem 3.1.5. If G is a finite group and g ∈ G has order m, then m divides |G|.
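A quick brute-force check of Theorem 3.1.5 in the additive group Z12 (a sketch; the order helper is mine, not from the text):

```python
# Order of each element of Z12 under addition: the smallest m > 0 with m*g ≡ 0 (mod 12)
n = 12

def order(g):
    m, total = 1, g % n
    while total != 0:
        total = (total + g) % n
        m += 1
    return m

for g in range(n):
    assert n % order(g) == 0  # Theorem 3.1.5: o(g) divides |G|
print({g: order(g) for g in range(n)})
```

Note that order(0) returns 1, since the identity has order 1 by convention.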
If a group is abelian, left and right cosets coincide. In general, though, we want to only look at subgroups where
the cosets behave nicely. Such a subgroup is called normal.
Definition 3.1.7. A subgroup N of a group G is called normal if for any g ∈ G, n ∈ N , we have gng −1 ∈ N .
Be careful. We are not saying that gng −1 = n. That would happen if the group was abelian. But we do have the
following:
Theorem 3.1.6. N is a normal subgroup of G if and only if gN g −1 = N for all g ∈ G.
This says that for n ∈ N , gng −1 = n1 for some n1 ∈ N . For a normal subgroup, a right coset N g is the left
coset gN for the same g. This gives us a way to multiply right cosets. For a normal subgroup N of G and g1 , g2 ∈ G,
N g1 N g2 = N (g1 N )g2 = N (N g1 )g2 = N g1 g2 . With this multiplication, N = N 1 the identity, and the inverse of
N g equal to N g −1 , the right cosets form a group.
Definition 3.1.8. If N is a normal subgroup of G, the set of right cosets with the multiplication written above is called
a quotient group and written G/N . By our comments above, |G/N | = |G|/|N |.
I can’t overemphasize how important quotient groups are in algebraic topology. Homology is defined as a quotient
group, and every TDA paper I have ever seen assumes you already know what a quotient group is.
If G is written additively, then the cosets are of the form N + a. N + 0 = N is the identity, (N + a) + (N + b) =
N + (a + b), and the inverse of N + a is N + (−a) = N − a.
If G is abelian, then every subgroup is normal but this is not true if G is non-abelian.
Example 3.1.13. Let G = Z10 and H = {0, 5}. We listed the cosets of H in G in Example 3.1.11 above. The
quotient group G/H is formed by adding the cosets (H + a) + (H + b) = H + (a + b) mod 10. (Recall a mod b
is the remainder you get when dividing a by b.)
Example 3.1.14. Let G = Z under addition, and H = nZ which is the set of multiples of n. We saw that H is a
subgroup of G. The cosets of H are H + 0 = H, H + 1, H + 2, · · · , H + (n − 1). These exhaust Z since a multiple of
n plus n is still a multiple of n, so H + n = H. We can write out G/H as {0, 1, 2, · · · , n − 1} so we see that Z/nZ
is just Zn . That is why you will often see Zn written as Z/nZ or even Z/n in books. I will stick to the Zn notation.
Definition 3.1.9. A group G is simple if it has no normal subgroups except G itself and the trivial group 1.
An abelian group can only be simple if it is a cyclic group of order p where p is a prime. Non-abelian groups
can be simple and very complicated. There was a big project in the 1960s-1980s to classify all finite simple groups,
and the proof is about 10,000 pages long. There are a few families and 26 sporadic ones. The largest of these
is the Fischer-Griess Monster Group, also known as the ”Friendly Giant” or simply the ”Monster.” Its order is
808,017,424,794,512,875,886,459,904,961,710,757,005,754,368,000,000,000. If you are curious about where it comes
from see [60] for the original 100 page construction or [36] for the simpler 30 page version.
Now that we have seen subgroups and quotient groups, you may wonder if there is an analogue to Cartesian
products. There are a few constructions that are useful but the one we will be concerned with is a direct product.
Definition 3.1.10. Let G and H be groups. Then the direct product of G and H denoted G × H has an underlying set
which is the cartesian product of the underlying sets of G and H and (g1 , h1 )(g2 , h2 ) = (g1 g2 , h1 h2 ). If G and H are
written additively, we write G ⊕ H and use the term direct sum.
We can think of G and H as subgroups of G × H by writing G = {(g, 1)} for g ∈ G and H = {(1, h)} for h ∈ H.
Then we have the following properties:
1. G ∩ H = 1
2. Every element of G × H can be written uniquely as the product of an element of G and an element of H.
3. Every element of G commutes with every element of H.
4. Both G and H are normal subgroups of G × H.
5. The order of G × H is the product of the order of G and the order of H.
Example 3.1.15. There can only be one group each of order 2, order 3, and order 5. These are the cyclic groups Z2 , Z3 ,
and Z5 respectively. We know that there are no others as these are of prime order so they can not have any nontrivial
subgroups.
Example 3.1.16. There are two distinct groups of order 4. The group Z4 is the familiar cyclic group. We also have
the group Z2 ⊕ Z2 . This group contains the identity (0, 0) and three elements of order two: (1, 0), (0, 1), and (1, 1),
since in Z2 , 1 + 1 = 0. This group is called the Klein four group.
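One way to see that Z4 and the Klein four group are genuinely different is to compare element orders, which any isomorphism must preserve. A small sketch (the helper functions are my own):

```python
# Z4 has an element of order 4; every non-identity element of Z2 ⊕ Z2
# has order 2, so the two groups of order 4 cannot be isomorphic.
z4 = list(range(4))
klein = [(a, b) for a in range(2) for b in range(2)]

def order_z4(g):
    m, t = 1, g % 4
    while t != 0:
        t = (t + g) % 4
        m += 1
    return m

def order_klein(g):
    m, t = 1, g
    while t != (0, 0):
        t = ((t[0] + g[0]) % 2, (t[1] + g[1]) % 2)
        m += 1
    return m

print(sorted(order_z4(g) for g in z4))        # [1, 2, 4, 4]
print(sorted(order_klein(g) for g in klein))  # [1, 2, 2, 2]
```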
Example 3.1.17. For order 6, there are 3 groups. They are Z6 , Z3 ⊕ Z2 , and the non-abelian permutation group S3
that we saw earlier. As we will see later, though, Z6 and Z3 ⊕ Z2 can be thought of as the same group.
Now we want to know what it means for two groups to be the same. The analogue of a continuous function from
topology is a homomorphism. This preserves the algebraic structure, this time in the forward rather than the reverse
direction. The analogue of a homeomorphism is an isomorphism. Two groups that are isomorphic are basically the
same group. I will now make these concepts precise.
Definition 3.1.11. A mapping f from a group G to a group H is a homomorphism if f (g1 g2 )=f (g1 )f (g2 ) for all
g1 , g2 ∈ G. (In additive notation, f (g1 + g2 )=f (g1 ) + f (g2 ).) A homomorphism f is a monomorphism if it is one to
one. It is an epimorphism if it is onto. If the homomorphism f is both one to one and onto (implying the existence of
an inverse), then f is called an isomorphism and we say that G and H are isomorphic.
Two groups that are isomorphic are pretty much indistinguishable in the way that two topological spaces are
indistinguishable if they are homeomorphic.
Example 3.1.18. One easy example of a homomorphism is defined as f(g)=1 for all g ∈ G. In additive notation,
f (g) = 0, and this is called the zero homomorphism. We see it often in algebraic topology.
Example 3.1.19. If f : G → G, then f is the identity homomorphism if f (g) = g for g ∈ G.
Example 3.1.20. Let G be the group of real numbers under addition and H the group of nonzero real numbers under
multiplication. Let f : G → H, be defined by f (x) = 2x . This is a homomorphism since f (a + b) = 2a+b = 2a 2b =
f (a)f (b).
Example 3.1.21. Let f : Z → Z, be defined by f (x) = 2x. This is a homomorphism since f (a + b) = 2(a + b) =
2a + 2b = f (a) + f (b).
Example 3.1.22. Let G = {0, 1} = Z2 under addition and H = {1, −1} under the usual multiplication. Let
f : G → H with f (0) = 1 and f (1) = −1. This is one to one and onto and you can check that f is a homomorphism.
For example, f (1 + 1) = f (0) = 1 and f (1)f (1) = (−1)(−1) = 1. Checking the other possibilities gives that
f (a + b) = f (a)f (b). So f is a homomorphism and we have defined it to be one to one and onto. So G and H are
isomorphic.
Example 3.1.23. As mentioned above, Z6 is isomorphic to Z3 ⊕ Z2 . To see this, let g ∈ Z6 and let f (g) = (a, b) ∈
Z3 ⊕ Z2 , where a = g mod 3 and b = g mod 2. The inverse map sends (a, b) to the unique g ∈ Z6 with g ≡ a mod 3
and g ≡ b mod 2, namely f −1 ((a, b)) = (4a + 3b) mod 6. It is easily checked that f (g + h) = f (g) + f (h) and that
f −1 is actually the inverse of f . So Z6 is isomorphic to Z3 ⊕ Z2 .
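Here is a brute-force check that Z6 and Z3 ⊕ Z2 are isomorphic, using the Chinese-Remainder-style map f (g) = (g mod 3, g mod 2) (chosen here because it is easy to code):

```python
# f : Z6 → Z3 ⊕ Z2, a Chinese-Remainder-style map
f = {g: (g % 3, g % 2) for g in range(6)}

# f is a bijection onto Z3 ⊕ Z2 (6 distinct images out of 6 elements)
assert len(set(f.values())) == 6

# f is a homomorphism: f(g + h) = f(g) + f(h) componentwise
for g in range(6):
    for h in range(6):
        left = f[(g + h) % 6]
        right = ((f[g][0] + f[h][0]) % 3, (f[g][1] + f[h][1]) % 2)
        assert left == right
print("Z6 is isomorphic to Z3 + Z2")
```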
Definition 3.1.12. Let f : G → H be a homomorphism. Then the kernel K of f is {g ∈ G|f (g) = 1}. (In additive
notation f (g) = 0.) The image f (G) of f is {h ∈ H|h = f (g) for some g ∈ G}.
Theorem 3.1.7. Let f : G → H be a homomorphism. Then f (1G ) = 1H and f (g −1 ) = f (g)−1 , where 1G and 1H
are the identity elements of G and H respectively.
Proof: Let g ∈ G. Then f (g) = f (g1G ) = f (g)f (1G ). Multiplying both sides by f (g)−1 on the left gives
1H = f (1G ). So 1H = f (1G ) = f (gg −1 ) = f (g)f (g −1 ). So multiplying on the left by f (g)−1 on both sides gives
f (g)−1 = f (g −1 ). ■
Using the above result it is easy to show the following:
Theorem 3.1.8. Let f : G → H be a homomorphism. Then the kernel of f is a normal subgroup of G.
Try this one yourself using the formulas.
Theorem 3.1.9. Let f : G → H be a homomorphism. Then f is one to one if and only if the kernel of f is 1.
Proof: Suppose the kernel of f is 1. Let g1 , g2 ∈ G, and suppose f (g1 ) = f (g2 ). Then 1 = f (g1 )f (g1 )−1 =
f (g1 )f (g2 )−1 = f (g1 g2−1 ). Since the kernel of f is 1, g1 g2−1 = 1, so g1 = g2 and f is one to one. Conversely, if f is
one to one and g is in the kernel of f , then 1 = f (g) = f (1) so we must have g = 1. ■
The next theorem is very important. See [66] for the proof.
Theorem 3.1.10. Let f : G → H be a homomorphism with kernel K. Then G/K ≈ f (G), the image of f ; in
particular, if f is onto, then G/K ≈ H. (Here ≈ denotes an isomorphism.)
Now let G and H be abelian. We will always write abelian groups additively. So the above result says that f is
one to one if and only if the kernel K = 0. Also, recall that every subgroup of an abelian group is normal. So the
image f (G) is a normal subgroup of H and we can form the quotient group H/f (G). We call this group the cokernel
of f . If f is onto, then f (G) = H, so H/f (G) = H/H = 0. So we have:
Theorem 3.1.11. Let f : G → H be a homomorphism of abelian groups. Then f is onto if and only if the cokernel of
f is 0. f is an isomorphism if and only if the kernel and the cokernel of f are both 0.
In both homology and cohomology theory, all of the groups we care about have an especially nice form. To
understand this, we will need a couple more definitions.
Definition 3.1.13. Let G be a group. A set S of generators for G is a set of elements of G such that every element of
G can be expressed as a product of finitely many of the elements of S and their inverses. If G is generated by a finite
set S, then G is finitely generated. If S consists of a single element g then G is a cyclic group generated by g.
Definition 3.1.14. Let G be a group. A set R of relations is a set of strings g1r1 g2r2 · · · gnrn of generators gi and integers
ri whose product is 1. Every group has the relation gg −1 = 1 for every generator g, but there may be others as well.
A group with no other relations is called a free group. A group with only finitely many relations is called finitely
presented.
Example 3.1.24. Let G be the free group with two generators x and y. Then elements of G are the element 1 along
with strings of x, y, x−1 and y −1 . Some examples are x8 , x6 y 3 , and x3 (y −1 )5 y 3 x8 .
In homology and cohomology, every group we will ever deal with is finitely generated and abelian. In homotopy
theory, there is one exception called the fundamental group that could be non-abelian. We will deal with it in Chapter
9.
What’s purple, commutes, and is worshiped by a limited number of people? A finitely venerated abelian grape.
What’s purple, commutes, and given gifts by a limited number of people? A finitely presented abelian grape.
Definition 3.1.15. Let G be a finitely generated group. Then G is a free abelian group if G = Z ⊕ Z ⊕ · · · ⊕ Z. The
number of summands of Z is called the rank of G. In algebraic topology, the rank is also called the Betti number.
Warning: Do not confuse a free group and a free abelian group. A free group is about as non-abelian as you can
get.
A free abelian group of rank n has a basis {g1 , · · · , gn }, which is defined to be a set of elements of G such that
every element of G can be written uniquely as a sum c1 g1 + · · · + cn gn where ci ∈ Z for 1 ≤ i ≤ n.
One way of constructing a free abelian group from a finite set of generators is called a formal sum. In this case, we
start with a generating set S = {g1 , · · · , gn } and a function ϕ : S → Z, so the elements of our group are of the form
ϕ(g1 )g1 + · · · + ϕ(gn )gn .
Example 3.1.25. Suppose you go to a fruit stand that has apples, bananas, and cherries. You buy 2 apples, 4 bananas
and 12 cherries. This represents the element 2a + 4b + 12c, where {a, b, c} is the generating set. You could also have
an element like −6a where you sell them 6 apples. You can add and subtract pieces of fruit of the same kind, but for
different kinds, we just put a plus or minus sign between them.
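Formal sums are easy to model in code as dictionaries from generators to integer coefficients. A sketch of the fruit-stand arithmetic (the add helper is my own):

```python
from collections import Counter

def add(x, y):
    """Add two formal sums given as {generator: coefficient} dicts."""
    s = Counter(x)
    for gen, coeff in y.items():
        s[gen] += coeff
    # Drop zero coefficients so each element has a unique representation
    return {g: c for g, c in s.items() if c != 0}

purchase = {'a': 2, 'b': 4, 'c': 12}   # 2 apples + 4 bananas + 12 cherries
sale = {'a': -6}                        # selling 6 apples
print(add(purchase, sale))  # {'a': -4, 'b': 4, 'c': 12}
```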
When we build homology groups, we will start with the group of chains which will be a set of formal sums. We
will discuss these in Chapter 4.
Recall that for an additive group, the order of an element g is m if m is the smallest positive integer such that
mg = 0. The set of all elements of G of finite order is called the torsion subgroup. If T = 0, we say that G is torsion
free.
We conclude this section with an important result which is the basis of homology theory. The good news is that it
means that homology groups take a very simple form. See [110] for a proof.
Theorem 3.1.12. The Fundamental Theorem of Finitely Generated Abelian Groups: Let G be a finitely generated
abelian group. Then we can write G uniquely as

G ≡ (Z ⊕ · · · ⊕ Z) ⊕ Zt1 ⊕ · · · ⊕ Ztk

with ti > 1 and t1 |t2 | · · · |tk , where a|b means that a divides b. The number of summands Z is called the Betti number
β of G. The torsion subgroup is Zt1 ⊕ · · · ⊕ Ztk .
Example 3.1.26. Let G ≡ (Z ⊕ Z ⊕ Z) ⊕ Z3 ⊕ Z6 ⊕ Z12 . Then G is the set of 6-tuples (a, b, c, d, e, f ) where
addition is coordinatewise and a, b, c ∈ Z, d ∈ Z3 , e ∈ Z6 , f ∈ Z12 . A typical element could be (1, 5, −3, 2, 4, 8).
Note that 3|6|12.
Note that for m and n relatively prime, Zm ⊕ Zn ≡ Zmn (prove this yourself), so any finite cyclic group can be
written as a direct sum of cyclic groups whose orders are powers of primes. So we can also write G uniquely as

G ≡ (Z ⊕ · · · ⊕ Z) ⊕ Za1 ⊕ · · · ⊕ Zap

where the ai are powers of distinct primes.
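The claim that Zm ⊕ Zn ≡ Zmn for relatively prime m and n can be spot-checked by computing element orders: the direct sum is cyclic exactly when it contains an element of order mn. A sketch (the orders helper is my own):

```python
def orders(m, n):
    """Orders of all elements of Zm ⊕ Zn under coordinatewise addition."""
    result = []
    for a in range(m):
        for b in range(n):
            k, t = 1, (a % m, b % n)
            while t != (0, 0):
                t = ((t[0] + a) % m, (t[1] + b) % n)
                k += 1
            result.append(k)
    return result

print(max(orders(2, 3)))  # 6: Z2 ⊕ Z3 is cyclic, so isomorphic to Z6
print(max(orders(2, 2)))  # 2: Z2 ⊕ Z2 is not cyclic (gcd(2, 2) ≠ 1)
```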
So that was a rather long section, but it barely scratched the surface of group theory. It is pretty much most of what
you need to know for algebraic topology. The next section will be a first taste of a branch of abstract algebra called
homological algebra which is specialized for algebraic topology while being a subject in its own right. You will see a
lot more of it in Chapter 4.
3.2 Exact Sequences
Exact sequences are a key tool in computation in homology, cohomology, and homotopy theory. This topic really
belongs in Chapter 4, but I am including it here as it illustrates some of the group theory concepts we have just learned.
In this section, all groups will be abelian and written additively, but there is no reason we can’t have an exact
sequence of non-abelian groups.

Definition 3.2.1. Consider a sequence (finite or infinite) of abelian groups and homomorphisms

· · · → A1 −ϕ1→ A2 −ϕ2→ A3 → · · ·

The sequence is exact at A2 if image ϕ1 = kernel ϕ2 . If this happens at every group (except at the left and right ends,
if they exist), then the sequence is an exact sequence.

We will use some of what we have learned in the last section about homomorphisms to prove some facts about
exact sequences.
Theorem 3.2.1. The sequence

A1 −ϕ→ A2 → 0

is exact if and only if ϕ is an epimorphism.
Proof: Since the second map is the zero homomorphism, its kernel is all of A2 . So the image of ϕ is also all of
A2 , and ϕ is an epimorphism. Conversely, if ϕ is an epimorphism, then the image of ϕ is all of A2 , so the second map
has all of A2 as its kernel and is the zero homomorphism. ■
Theorem 3.2.2. The sequence

0 → A1 −ϕ→ A2

is exact if and only if ϕ is a monomorphism.
Proof: Since the first map is the zero homomorphism, its image is 0. So the kernel of ϕ is 0, and ϕ is a monomorphism
by Theorem 3.1.9. Conversely, if ϕ is a monomorphism, then the kernel of ϕ is 0, so the first map is the zero
homomorphism. ■
You should be very happy whenever you see this as a piece of an exact sequence:

0 → A1 −ϕ→ A2 → 0

By the last two theorems, the map ϕ is both a monomorphism and an epimorphism. So ϕ is an isomorphism and
A1 ≡ A2 , and if you already know one of the groups, you also know the other. This is a very common situation in
algebraic topology and this gives exact sequences a lot of their power.
I will give you two more results to try to prove on your own. We will need two new definitions.
Definition 3.2.2. Let G be an abelian group with subgroup H. Then the homomorphism i : H → G is an inclusion
if i(h) = h for h ∈ H. (Recall the analogous definition for topological spaces.) The homomorphism p : G → G/H
with p(g) = g + H is called a projection. For nonabelian groups we modify the definition of a projection by having
H be a normal subgroup of G and p(g) = Hg = gH. (In some mathematical word reuse, if G = H ⊕ K, the maps
sending (h, k) to either the first or the second coordinate are also called projections.)
Definition 3.2.3. An exact sequence of the form

0 → A1 −ϕ→ A2 −ψ→ A3 → 0

is called a short exact sequence.
Short exact sequences are also very common. Try to prove the next two results with the facts we have learned.
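As a concrete instance of Definition 3.2.3, here is a brute-force check of exactness for 0 → Z2 → Z4 → Z2 → 0, where the first map is x ↦ 2x and the second is reduction mod 2 (a sketch; the dictionaries-as-homomorphisms encoding is my own):

```python
A1, A2, A3 = range(2), range(4), range(2)
phi = {x: (2 * x) % 4 for x in A1}  # Z2 → Z4, x ↦ 2x
psi = {x: x % 2 for x in A2}        # Z4 → Z2, reduction mod 2

image_phi = set(phi.values())
kernel_psi = {x for x in A2 if psi[x] == 0}
assert image_phi == kernel_psi                 # exact at A2

assert {x for x in A1 if phi[x] == 0} == {0}   # phi is a monomorphism (exact at A1)
assert set(psi.values()) == set(A3)            # psi is an epimorphism (exact at A3)
print("the sequence is exact")
```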
Theorem 3.2.3. For the short exact sequence defined above, ψ induces an isomorphism between A2 /ϕ(A1 ) and A3 .
Conversely, if ψ : A → B is an epimorphism with kernel K, then the sequence

0 → K −i→ A −ψ→ B → 0

is exact, where i is inclusion.
Theorem 3.2.4. Suppose we have an exact sequence

A1 −α→ A2 −ϕ→ A3 −β→ A4

Then the following are equivalent:

1. α is an epimorphism.
2. β is a monomorphism.
3. ϕ is the zero homomorphism.
We can get even more information from a short exact sequence if it is split.
Definition 3.2.4. Consider a short exact sequence

0 → A1 −ϕ→ A2 −ψ→ A3 → 0.

The sequence is split if the group ϕ(A1 ) is a direct summand of A2 .

In this case the sequence is of the form

0 → A1 −ϕ→ ϕ(A1 ) ⊕ B −ψ→ A3 → 0.

where ϕ is an isomorphism of A1 and ϕ(A1 ), and ψ is an isomorphism of B and A3 .


Here are two more facts about split exact sequences.

Theorem 3.2.5. Suppose we have an exact sequence

0 → A1 −ϕ→ A2 −ψ→ A3 → 0.

Then the following are equivalent:

1. The sequence splits.

2. There is a map p : A2 → A1 such that p ◦ ϕ = iA1 , where iA is the identity map on an abelian group A.

3. There is a map j : A3 → A2 such that ψ ◦ j = iA3 .

The picture for the last two statements is

0 → A1 −ϕ→ A2 −ψ→ A3 → 0

with p : A2 → A1 and j : A3 → A2 mapping back along ϕ and ψ respectively.
Proof:
To show that (1) implies (2) and (3), consider the sequence

0 → A1 −i→ A1 ⊕ A3 −π→ A3 → 0.

Then just let p : A1 ⊕ A3 → A1 be projection and j : A3 → A1 ⊕ A3 be inclusion. Then you can easily check
that (2) and (3) hold.
To show that (2) implies (1), we will show that A2 = ϕ(A1 ) ⊕ (ker p). For x ∈ A2 , we can write x = ϕ(p(x)) +
(x − ϕ(p(x))). The first term is in ϕ(A1 ) and the second is in ker p, since p(x − ϕ(p(x))) = p(x) − p(ϕ(p(x)) =
p(x) − p(x) = 0. Secondly, if x ∈ ϕ(A1 ) ∩ (ker p) then x = ϕ(y) for some y ∈ A1 , so p(x) = p(ϕ(y)) = y. But
since x ∈ (ker p), p(x) = 0, so y = 0, which implies that x = ϕ(y) = 0. Thus A2 = ϕ(A1 ) ⊕ (ker p).
To show that (3) implies (1), we will show that A2 = (ker ψ) ⊕ j(A3 ). Since (ker ψ) = ϕ(A1 ), we will be
done. First, for x ∈ A2 , we write x = (x − j(ψ(x))) + j(ψ(x)). The first term is in (ker ψ) since ψ(x − j(ψ(x))) =
ψ(x) − ψ(j(ψ(x))) = ψ(x) − ψ(x) = 0. The second term is in j(A3 ). Secondly, if x ∈ (ker ψ) ∩ j(A3 ), then
x = j(z) for some z ∈ A3 , so ψ(x) = ψ(j(z)) = z. Since x ∈ (ker ψ), ψ(x) = 0, so z = 0, and x = j(z) = 0.
Thus, A2 = (ker ψ) ⊕ j(A3 ). ■
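Condition (3) of Theorem 3.2.5 can be tested by brute force for small groups. In the sketch below (my own encoding), the sequence 0 → Z2 → Z4 → Z2 → 0 does not split, while 0 → Z2 → Z2 ⊕ Z2 → Z2 → 0 does:

```python
def splits(A2, add, psi):
    """Search for j : Z2 → A2 with psi∘j = identity on Z2.
    j is determined by c = j(1), which must satisfy c + c = identity
    (so j is a homomorphism) and psi(c) = 1."""
    for c in A2:
        if add(c, c) == A2[0] and psi(c) == 1:  # A2[0] is the identity element
            return True
    return False

z4 = [0, 1, 2, 3]
klein = [(0, 0), (1, 0), (0, 1), (1, 1)]
print(splits(z4, lambda a, b: (a + b) % 4, lambda a: a % 2))       # False
print(splits(klein, lambda a, b: ((a[0]+b[0]) % 2, (a[1]+b[1]) % 2),
             lambda a: a[1]))                                       # True
```

This matches the group theory: Z4 is not isomorphic to Z2 ⊕ Z2, so its short exact sequence cannot split.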

Corollary 3.2.6. Suppose we have an exact sequence

0 → A1 −ϕ→ A2 −ψ→ A3 → 0.

Then if A3 is free abelian, the sequence splits.

Proof: Choose a basis for A3 and for each basis element e, define j(e) to be any element of the nonempty set
ψ −1 (e). The set is nonempty since ψ is onto. Since A3 is free abelian, j extends to a homomorphism j : A3 → A2
with ψ ◦ j = iA3 , so the sequence splits by Theorem 3.2.5. ■
We will see more about exact sequences in Chapter 4, but this is a good test of your understanding of homomor-
phisms and their properties. The full power of exact sequences will wait until we discuss homology theory and the
important Eilenberg-Steenrod axioms.

3.3 Rings and Fields
The purpose of this section and the next will not be to teach you an entire course on abstract algebra or linear
algebra. Instead, the idea is to make sure that you know all of the terms that we will use later.
As I mentioned earlier, a group has a single binary operation, either addition or multiplication. A ring has both with
each one playing a separate role. They are tied together with a distributive law. There are also some variants on the
definition I will give. First of all, there is an element that plays the role of the number 1 which is called a unit element.
I will never use a ring without a unit element, although [66] allows for one. Since the multiplicative identity can be
written as I instead of 1, we can refer to a ring without a unit element as a rng. Also, most rings have the associative
property for multiplication, but there are exceptions. One example is the octonions. They are a generalization of the
quaternions which I will soon define, but multiplication is not associative. They play a role in the homotopy groups of
spheres and obstruction theory by extension, but there are easier problems to solve before we can think about that. So
unless otherwise specified, I will always use associative rings. Here is a definition.

Definition 3.3.1. A nonempty set R is called an (associative) ring if it has two binary operations called addition and
multiplication, it is an abelian group under addition, and it has the following additional properties:

1. If a, b ∈ R, then ab ∈ R. (closure for multiplication)

2. There exists an element 1 ∈ R called the unit element such that for r ∈ R, 1r = r1 = r. (identity for
multiplication)

3. For a, b, c ∈ R, a(bc) = (ab)c. (associative law for multiplication)

4. For a, b, c ∈ R, a(b + c) = ab + ac and (b + c)a = ba + ca (distributive laws).

Definition 3.3.2. A commutative ring is a ring for which multiplication is commutative, i.e. for a, b ∈ R, ab = ba.
(Remember that for a ring, addition is always commutative.) A ring that is not a commutative ring is a noncommutative
ring.

Example 3.3.1. The integers with usual addition and multiplication form a commutative ring. So do the rationals and
the reals. The irrationals are not a ring at all as they don’t contain either 0 or 1.

Example 3.3.2. The complex numbers consist of numbers of the form a + bi, where a, b are real numbers and
i2 = −1. If a + bi and c + di are two complex numbers, then we define (a + bi) + (c + di) = (a + c) + (b + d)i, and
(a + bi)(c + di) = (ac − bd) + (ad + bc)i. The complex numbers form a commutative ring with these definitions of
addition and multiplication. The identities for addition and multiplication are 0 = 0 + 0i and 1 = 1 + 0i respectively.

Example 3.3.3. Our first example of a noncommutative ring is the ring of quaternions. The quaternions are numbers
of the form a + bi + cj + dk where a, b, c, and d are real numbers and i2 = j 2 = k 2 = −1. We also have
ij = k, jk = i, ki = j, ji = −k, kj = −i, and ik = −j. So the quaternions are not commutative. One way to
remember the correct sign is through this diagram:

i → j → k → i
Following the arrows gives a positive sign (e.g. ij = k), while opposing them gives a negative sign (e.g. ji = −k).
We have
(a + bi + cj + dk) + (e + f i + gj + hk) = (a + e) + (b + f )i + (c + g)j + (d + h)k.
For multiplication, we have
(a+bi+cj+dk)(e+f i+gj+hk) = (ae−bf −cg−dh)+(af +be+ch−dg)i+(ag+ce+df −bh)j+(ah+de+bg−cf )k.
If we reverse the order of the multiplication, then we switch the signs on the last two terms of the coefficients of i, j,
and k.
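The quaternion multiplication formula above is easy to implement and check. A sketch representing a + bi + cj + dk as the tuple (a, b, c, d):

```python
def qmul(p, q):
    """Multiply two quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g + c*e + d*f - b*h,
            a*h + d*e + b*g - c*f)

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)
print(qmul(i, j))  # (0, 0, 0, 1)  -> ij = k
print(qmul(j, i))  # (0, 0, 0, -1) -> ji = -k, so multiplication is not commutative
```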
Example 3.3.4. The n × n matrices with real entries form a noncommutative ring under the usual addition and
multiplication of square matrices.
We now look at some particular types of rings.
Definition 3.3.3. If R is a commutative ring, then a nonzero element a ∈ R is a zero-divisor if there exists a b ∈ R
such that b ̸= 0 and ab = 0. A commutative ring is called an integral domain if it has no zero divisors.
Example 3.3.5. The integers form an integral domain.
Example 3.3.6. Let R = Zp where p is prime. If a, b ∈ Zp and ab = 0 then p must divide ab. Since p is prime, it
must divide a or b. But both a and b are less than p, so either a = 0 or b = 0. So Zp is an integral domain.
Example 3.3.7. Let R = Zm where m is not a prime. Then Zm is not an integral domain as m = ab for some a < m
and b < m. For example, if m = 15 then 3 · 5 = 0 in Z15 , so 3 and 5 are both zero-divisors.
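A brute-force search for the zero-divisors of Z15 (a sketch):

```python
# Zero-divisors in Z15 (Example 3.3.7): nonzero a with ab ≡ 0 (mod 15)
# for some nonzero b.
m = 15
zero_divisors = sorted({a for a in range(1, m)
                        for b in range(1, m) if (a * b) % m == 0})
print(zero_divisors)  # [3, 5, 6, 9, 10, 12]
```

As expected, the zero-divisors are exactly the nonzero elements sharing a factor with 15.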
The other thing we didn’t assume about rings was an inverse for multiplication.
Definition 3.3.4. A ring is a division ring if all of its nonzero elements have multiplicative inverses.
Definition 3.3.5. A commutative division ring is called a field.
Example 3.3.8. The integers are a commutative ring but not a division ring and thus not a field. For example, there is
no integer we can multiply by 2 to get 1. The rationals, reals, and complex numbers are all fields.
Here is an easy result for rings that we will need. I will prove the first statement; try to prove the rest.
Theorem 3.3.1. Let R be a ring and a, b ∈ R.
1. a0=0a=0
2. a(-b)=(-a)b=-(ab)
3. (-a)(-b)=ab
4. (-1)a=-a
5. (-1)(-1)=1
Proof of statement 1: a0 = a(0 + 0) = a0 + a0. So a0 = 0 since an additive inverse is unique. Similarly,
0a = (0 + 0)a = 0a + 0a, so 0a = 0. ■
Now a division ring can not have zero divisors. To see this, suppose D is a division ring, and a, b are nonzero
elements of D with ab = 0. Then a has a multiplicative inverse a−1 and b = 1b = (a−1 a)b = a−1 (ab) = a−1 0 = 0,
so b = 0 which is a contradiction. So a division ring is an integral domain. This shows that Zm for m not prime can not
be a field.
We would like to show that Zp for p a prime is a field. To do this, we use the famous Pigeonhole Principle.
The image you should have is a wall of mailboxes. These resemble a design of dovecote where the birds nest in an
array of containers. (Doves and pigeons are really the same family of birds.) So suppose there are 100 mailboxes and
the mail carrier has 101 letters. Then if she puts one letter in each box, there will still be one left over. So at least one
lucky person got at least two letters. This obvious situation is stated below:
Theorem 3.3.2. Pigeonhole Principle: If n objects are distributed over m places, and if n > m, then some place
receives at least two objects.

Example 3.3.9. Here is a more interesting example. Suppose that the average person has 100,000 hairs on their head.
We know that Max has 500,000 hairs on his head and that is more than anyone else in the U.S. Then there are two
people in the U.S. with exactly the same number of hairs on their heads. Suppose we have boxes numbered 0-500,000.
We also have index cards with the names of all 300 million Americans. For each person, we put their card in the box
corresponding to the number of hairs on their head. Then we have more cards than boxes so at least two cards must
go in the same box and we are done.

Theorem 3.3.3. A finite integral domain is a field.

Proof: Let D be a finite integral domain. (All our rings are assumed to have a unit element.) Let r1 , r2 , · · · , rn be
all of the elements in D. Suppose a ∈ D is nonzero. Since integral domains are assumed to be commutative we are
done if we can produce a multiplicative inverse for a. Consider the elements r1 a, r2 a, · · · , rn a. They are all distinct,
since if ri a = rj a, then 0 = ri a − rj a = (ri − rj )a. Since D is an integral domain and a ̸= 0, we must have
ri − rj = 0, so ri = rj . So r1 a, r2 a, · · · , rn a must include all elements of D by the Pigeonhole Principle as there are
n of these and D has n elements. So one of these elements, say rk a must equal 1 and rk = a−1 . ■
Now we know that Zp for p a prime is a field by the previous result. It turns out that any finite field must have pn
elements for p a prime. See Chapter 7 in [66] for details.
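The proof of Theorem 3.3.3 suggests a (very inefficient) way to find inverses in Zp : scan the products ra until one equals 1. A sketch for p = 7:

```python
# In Z7 every nonzero element has a multiplicative inverse, found exactly
# as in the proof of Theorem 3.3.3: scan the products r*a until one is 1.
p = 7
inverses = {}
for a in range(1, p):
    for r in range(1, p):
        if (r * a) % p == 1:
            inverses[a] = r
            break
print(inverses)  # {1: 1, 2: 4, 3: 5, 4: 2, 5: 3, 6: 6}
```

Running the same search with a composite modulus like 15 would leave the zero-divisors without inverses, which is why Zm for m not prime is not a field.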
As in the case of groups, the special maps between rings are called homomorphisms. In this case, they are required
to preserve both addition and multiplication.

Definition 3.3.6. Let R and S be rings. A map ϕ : R → S is a homomorphism if

1. ϕ(a + b) = ϕ(a) + ϕ(b)

2. ϕ(ab) = ϕ(a)ϕ(b)

If a homomorphism of rings is one to one and onto, it is an isomorphism.

Now we start to run into a situation that does not quite match groups. First of all, how do we define the kernel of a
homomorphism? As a ring is an abelian group under addition, we will let the kernel of ϕ be the set {r ∈ R|ϕ(r) = 0}.
So the kernel can easily be shown to be a subgroup of the additive group of R, and we don’t have to worry about
whether it is a normal subgroup since the additive group of R is assumed to be abelian. But is the kernel a subring,
i.e. is it a ring itself? Not quite. For one thing, it may not contain the unit element 1. Also, while ϕ(0) = 0, ϕ(1) may
not equal the unit element of S.

Example 3.3.10. Let R = Z3 and S = Z6 . Let ϕ(r) = 4r, for r ∈ Z3 . Since 42 ≡ 4 (mod 6), ϕ preserves both addition
and multiplication, so it is a homomorphism. Then ϕ(0) = 0, but ϕ(1) = 4 ̸= 1. Also the kernel of ϕ is {0}.

Here is a way the kernel of a homomorphism interacts with multiplication:

Theorem 3.3.4. Let R and S be rings. Let ϕ : R → S be a homomorphism and let K = ker ϕ. Then if u ∈ K, then
ru and ur are both in K for all r ∈ R.

Proof: Let r be in R and u ∈ K. Then ϕ(ru) = ϕ(r)ϕ(u) = ϕ(r)0 = 0, so ru ∈ K. ur ∈ K by a similar


argument. ■
So we would like to define a subset of a ring that would act in a way that is similar to the kernel of a ring
homomorphism in order to define quotient rings which would be analogous to quotient groups. What would be the
ideal subset to define? Hopefully in wishing for such an object, we are not being too idealistic. How about an ideal?

Definition 3.3.7. A nonempty subset U of a ring R is a (two sided) ideal of R if it is a subgroup of R under addition
and for u ∈ U , we have ur ∈ U and ru ∈ U for all r ∈ R.

Example 3.3.11. The prime example you should keep in mind is letting R = Z and letting U be the even integers.
Then any integer times an even number is even. The multiples of a fixed m ∈ Z also form an ideal. Odd numbers are not an ideal
as an even number times an odd number is even. They are not even a subgroup as 0 is not odd. They are a coset 1 + U
of U where U is the even numbers.
Definition 3.3.8. Let R be a ring and U an ideal of R. The quotient ring R/U is the set of all cosets r + U of
U , where r ∈ R. Addition is defined by (r1 + U ) + (r2 + U ) = (r1 + r2 ) + U , and multiplication is defined by
(r1 + U )(r2 + U ) = (r1 r2 ) + U .
My only comment here will be about multiplication. Suppose a1 + U = a2 + U , i.e. they represent the same
coset. Also, let b1 + U = b2 + U . Then for multiplication to make sense, we need a1 b1 + U = a2 b2 + U . From our
discussion of cosets of groups, we know that if a1 + U = a2 + U , then a1 − a2 ∈ U , so a1 = a2 + u1 for some
u1 ∈ U . Similarly, b1 = b2 + u2 for some u2 ∈ U . So a1 b1 = (a2 + u1 )(b2 + u2 ) = a2 b2 + u1 b2 + a2 u2 + u1 u2 .
Since U is an ideal, the last 3 terms are all in U . Thus, a1 b1 + U = a2 b2 + U and we are done. You should be able to
convince yourself that the cosets form a ring with these operations.
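The well-definedness argument can be spot-checked numerically. Here is a small Python sketch (example ours, using R = Z and U = 6Z) showing that different representatives of the same cosets yield the same product coset.

```python
# A quick check that multiplication of cosets in R/U is well defined,
# for R = Z and U = 6Z: choosing different representatives of the same
# cosets gives the same product coset.

def coset(r, n=6):
    """Canonical representative of the coset r + nZ."""
    return r % n

# 2 + 6Z = 8 + 6Z and 3 + 6Z = 9 + 6Z ...
assert coset(2) == coset(8) and coset(3) == coset(9)
# ... and either choice of representatives gives the same product coset:
assert coset(2 * 3) == coset(8 * 9)
print(coset(2 * 3), coset(8 * 9))  # both are 0, since 6 + 6Z = 72 + 6Z = 0 + 6Z
```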
Theorem 3.3.5. Let R and S be rings. Let ϕ : R → S be an onto homomorphism and let K = ker ϕ. Then K is an ideal in
R, and S ≅ R/K.
As a final comment, in cohomology, we use what is known as a graded ring. We assign a dimension to every
element and let r1 r2 have dimension p + q if r1 has dimension p and r2 has dimension q. You already know about
at least one graded ring. The polynomials in x with real coefficients is of this form where we use the degree of the
polynomial as its dimension. The polynomial rings are the prototype for cohomology rings. We will meet them in
Chapter 8.
So that is more than enough ring theory to do algebraic topology. The main purpose of this section was to define
terms and show the analogues with groups.

3.4 Vector Spaces, Modules, and Algebras


In this section, I will just review some basic concepts of vector spaces which are analogous to what we saw for
groups and rings as well as some new properties. This should be a review for most readers. There are a lot of good
books on linear algebra if you have gaps. I will take most of the material from [66]. A classic old book that is very
comprehensive is [70]. The material on modules may be new. It is also taken from [66]. It turns out that a vector space
is a special type of module but so is an abelian group. There is a structure theorem for finitely generated modules
that closely parallels the one for abelian groups. Finally, the Steenrod Squares I will talk about in Chapter 11 form an
algebra. Algebras have addition, multiplication, and scalar multiplication. They come in all sorts of strange shapes so
I will content myself with defining them in general and talk about specific ones in their proper place.
Definition 3.4.1. A nonempty set V is a vector space over a field F if V is an abelian group under addition and if for
every a ∈ F and v ∈ V , there is an element av ∈ V such that for all a, b ∈ F and v, w ∈ V :
1. a(v + w) = av + aw.
2. (a + b)v = av + bv.
3. a(bv) = (ab)v
4. 1v = v, where 1 is the unit element of F .
The elements of a vector space are called vectors and the elements of the associated field are called scalars. The
product of a scalar and a vector is called scalar multiplication.
Example 3.4.1. The most common vector space we will see is Rn , the set of ordered n-tuples of real numbers. Here,
F = R. For v = (v1 , · · · , vn ) and w = (w1 , · · · , wn ), we have v + w = (v1 + w1 , · · · , vn + wn ). For a real number
a, av = (av1 , · · · , avn ).

For vector spaces, we have the ideas of subspaces, homomorphisms, and quotient spaces. We define those next.

Definition 3.4.2. A nonempty subset W of V is a subspace of V if W is itself a vector space. Equivalently, W is a
subspace of V if, given w1 , w2 ∈ W and a, b ∈ F , aw1 + bw2 ∈ W .

As you probably can guess now, a homomorphism of vector spaces preserves addition and scalar multiplication.
A vector space homomorphism has a special name that is much more common: a linear transformation.

Definition 3.4.3. Let V and W be vector spaces. A function T : V → W is called a linear transformation if for
v1 , v2 ∈ V and a ∈ F , T (v1 + v2 ) = T (v1 ) + T (v2 ) and T (av1 ) = aT (v1 ).

Example 3.4.2. Let f : R → R. Then if f (x) = ax + b, the graph of f is a line. Is f a linear transformation? We
have f (x1 + x2 ) = a(x1 + x2 ) + b = ax1 + ax2 + b, while f (x1 ) + f (x2 ) = ax1 + b + ax2 + b = ax1 + ax2 + 2b.
So f (x1 ) + f (x2 ) = f (x1 + x2 ) if and only if b = 0. Also f (cx) = acx + b for c ∈ R. while cf (x) = acx + cb.
Since c is arbitrary, we must again have b = 0. So a line is the graph of a linear transformation if and only if it passes
through the origin. Otherwise, it is the graph of an affine transformation.
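A quick numerical check of this example, with illustrative values a = 2 and b = 5 of our own choosing:

```python
# f(x) = a*x + b preserves addition and scaling only when b = 0;
# with b != 0 it is merely affine. Values a = 2, b = 5 are illustrative.
def f(x, a=2, b=5):
    return a * x + b

def g(x, a=2):          # the b = 0 case: a genuine linear transformation
    return a * x

assert f(1 + 2) != f(1) + f(2)     # 11 vs 16: additivity fails for b != 0
assert g(1 + 2) == g(1) + g(2)     # additivity holds
assert g(3 * 4) == 3 * g(4)        # homogeneity holds
print("f is affine, g is linear")
```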

As practice, try to prove this next theorem:

Theorem 3.4.1. Let V be a vector space over F . Let 0V be the zero element of V and 0F be the zero element of F . Then

1. a0V = 0V , for a ∈ F .

2. 0F v = 0V , for v ∈ V .

3. (−a)v = −(av) for a ∈ F , v ∈ V .

4. If v ̸= 0V , then av = 0V implies a = 0F .

Now we would like to construct quotient spaces. Let W be a subspace of V . Remember that V and W are abelian
groups so we can certainly form a quotient group V /W consisting of cosets v + W for v ∈ V . We would like V /W
to be a vector space as well. The obvious scalar multiplication is for a ∈ F, a(v + W ) = av + W . This is fine as long
as it is true that if v and v ′ generate the same coset (i.e. v + W = v ′ + W ), then av + W = av ′ + W . This is where we
need the previous theorem. If v + W = v ′ + W , (v − v ′ ) ∈ W , and since W is a subspace, a(v − v ′ ) ∈ W . Now
using part 3 of the previous theorem, −v ′ = −(1v ′ ) = (−1)v ′ , so a(v − v ′ ) = av + a(−1)v ′ = av − av ′ ∈ W . So
av + W = av ′ + W and scalar multiplication is well defined. So we have constructed a quotient space V /W .
I will just mention that vector spaces have direct sums with the meaning you would expect. V = U ⊕ W if every
element v ∈ V can be written uniquely in the form v = u + w, where u ∈ U , and w ∈ W .
We will finish off vector spaces with the notion of its dimension. Then you will know how to answer the next
person who asks, “Isn’t the fourth dimension time?” To get there we need a couple more definitions.

Definition 3.4.4. Let V be a vector space over F . If v1 , · · · , vn ∈ V , then any element of the form a1 v1 + · · · + an vn ,
where the ai ∈ F is called a linear combination of v1 , · · · , vn over F .

Definition 3.4.5. If S is a nonempty subset of vector space V over F , the linear span L(S) of S is the set of all linear
combinations of finite subsets of S.

Definition 3.4.6. If V is a vector space over F , and v1 , · · · , vn ∈ V , we say that they are linearly dependent if for
some a1 , · · · , an ∈ F which are not all zero, a1 v1 + · · · + an vn = 0. If they are not linearly dependent, then they are
linearly independent.

Definition 3.4.7. If V is a vector space over F , a linearly independent set S = {v1 , v2 , · · · } ⊆ V is a basis of V if V = L(S).
The number of elements in the basis (which can be finite or infinite) is called the dimension of V . For V finite
dimensional it can be shown that any 2 bases have the same number of elements, so the definition makes sense.

Example 3.4.3. Rn is an n-dimensional vector space. For example, let n = 2. Then (1, 0) and (0, 1) form a basis. Any
element (a, b) ∈ R2 can be written as a(1, 0) + b(0, 1), and they are linearly independent as a(1, 0) + b(0, 1) = (0, 0)
if and only if a = b = 0. Another basis is (2, 1) and (3, 3). a(2, 1) + b(3, 3) = (2a + 3b, a + 3b). If this is (0, 0), then
2a + 3b = 0, and a + 3b = 0. Solving for a and b, we see that again a = b = 0, so they are linearly independent. If
(c, d) is an arbitrary element of R2 , we have c = 2a + 3b, and d = a + 3b. So a = c − d, and d = a + 3b = c − d + 3b,
so b = (2d − c)/3. This shows that (2, 1) and (3, 3) span R2 , and they form a basis since they are linearly independent
as well.
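The change-of-basis formulas just derived can be verified with exact rational arithmetic. This Python sketch (the helper name coords is ours) checks that a = c − d and b = (2d − c)/3 really do express an arbitrary (c, d) in the basis (2, 1), (3, 3).

```python
# Verify the worked example with exact rationals: the coordinates of
# (c, d) in the basis (2, 1), (3, 3) are a = c - d and b = (2d - c)/3.
from fractions import Fraction

def coords(c, d):
    """Coordinates (a, b) of (c, d) in the basis (2, 1), (3, 3)."""
    return Fraction(c - d), Fraction(2 * d - c, 3)

for (c, d) in [(1, 0), (0, 1), (5, -7)]:
    a, b = coords(c, d)
    assert (2 * a + 3 * b, a + 3 * b) == (c, d)  # a(2,1) + b(3,3) = (c,d)
print(coords(1, 0))
```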
Recall that in Chapter 2, we said that Rn was a topological space as well. The usual topology on Rn is that of a
metric space. So Rn is actually a topological vector space. Finite dimensional topological vector spaces are sort of
boring, but there is an entire subject called functional analysis that deals with infinite dimensional topological vector
spaces. If you are curious, a great reference is Functional Analysis by Rudin [141].
We now move on to modules, a generalization of abelian groups. We won’t have a lot of need for them, but they
come up frequently in the field of homological algebra. As this is closely related to algebraic topology, I will include
a short description for completeness. A module is like a vector space but the scalars can be any kind of ring, not just a
field. So any vector space is a module. That’s why if you see a vector space on the moon it is a lunar module.
Definition 3.4.8. Let R be a ring. A nonempty set M is an R-module if M is an abelian group under addition and for
every r ∈ R and m ∈ M , there is an element rm in M such that for a, b ∈ M and r, s ∈ R,
1. r(a + b) = ra + rb
2. r(sa) = (rs)a
3. (r + s)a = ra + sa
4. If 1 is the unit element of R, then 1m = m.
Now we need to be a little careful. If R is a non-commutative ring, then what we have defined is a left module and
we could define an analogous right module. But we won’t have any reason to worry about that in this book.
Example 3.4.4. Every abelian group is a module over the integers. If g ∈ G, and k ∈ Z, then kg = g + · · · + g (k
times). You can check that it satisfies all of the properties of a module.
Example 3.4.5. Let R be a commutative ring and M be an ideal of R. Then we know that rm and mr are both in M
and are equal to each other. So M is a module over R.
Example 3.4.6. R is a module over itself.
We can define direct sums of modules in the usual way. We will develop a structure theorem for modules that is
analogous to the Fundamental Theorem of Finitely Generated Abelian Groups. We will need to make some restrictions
on the ring of scalars first. For the rest of this section, R will be a commutative ring with unit element 1. We will want
R to be a Euclidean ring in which we have a process like long division.
Definition 3.4.9. An integral domain R is a Euclidean ring if for every a ̸= 0 in R, there is a non-negative integer
d(a) called the degree of a such that
1. For all nonzero a, b ∈ R, d(a) ≤ d(ab).
2. For all nonzero a, b ∈ R, there exists t, r ∈ R such that a = tb + r, where either r = 0 or d(r) < d(b).
We do not assign a value to d(0).
Example 3.4.7. The integers form a Euclidean ring where for a ̸= 0, d(a) = |a|. (The absolute value of a.) Then
|ab| ≥ |a|, where equality holds if b = 1 or b = −1. For the second part, suppose for example a = 30, and b = 7.
Then if we divide 30/7, we get a quotient of 4 and a remainder of 2. So let t = 4 and r = 2. Then 30 = 4(7) + 2 with
|2| < |7|. So we are basically doing long division.
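In Python, this long division is exactly what divmod computes; here is the example above, assuming positive a and b as in the text:

```python
# The division from the example: divmod returns the quotient t and
# remainder r with a = t*b + r (for positive a and b, as in the text).
a, b = 30, 7
t, r = divmod(a, b)
assert a == t * b + r and 0 <= r < b
print(t, r)  # 4 2
```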

Example 3.4.8. The other interesting example is to let R be the ring of polynomials in x with coefficients in a field.
(Say with real coefficients.) Then d(f ) is the degree of f . You can check that this is also a Euclidean ring and you can
perform long division with polynomials as well.
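As a hedged illustration, here is a short Python sketch of polynomial long division over the rationals (the coefficient-list representation and the name poly_divmod are our own), showing the Euclidean property a = qb + r with deg r < deg b.

```python
# Long division of polynomials with rational coefficients. A polynomial
# is a list of coefficients from the constant term up; poly_divmod is an
# illustrative name, not from the text.
from fractions import Fraction

def poly_divmod(a, b):
    """Return (q, r) with a = q*b + r and deg r < deg b."""
    a = [Fraction(c) for c in a]
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    while len(a) >= len(b) and any(a):
        shift = len(a) - len(b)
        factor = a[-1] / Fraction(b[-1])
        q[shift] = factor
        # subtract factor * x^shift * b from a
        for i, c in enumerate(b):
            a[i + shift] -= factor * c
        while len(a) > 1 and a[-1] == 0:
            a.pop()
    return q, a

# (x^2 + 3x + 2) divided by (x + 1): quotient x + 2, remainder 0
q, r = poly_divmod([2, 3, 1], [1, 1])
print(q, r)
```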
Ideals of Euclidean rings have a special property. They are all principal ideals, which means that they are of the
form (a) = {xa|x ∈ R}. In other words, they are generated by a single element.
Definition 3.4.10. An integral domain R is a principal ideal domain if every ideal of R is a principal ideal.
Theorem 3.4.2. Every Euclidean ring is a principal ideal domain.
Proof: Let R be a Euclidean ring and A an ideal of R. Assume that A ̸= 0 or we could generate A with the single
element 0. Let a0 ∈ A be of minimal degree in A, i.e. there is no nonzero element with a strictly smaller degree. This
is always possible as the degree is a non-negative integer. Let a ∈ A. Then there exists t, r ∈ R, such that a = ta0 + r
where r = 0 or d(r) < d(a0 ). Since A is an ideal of R, ta0 ∈ A, so r = a − ta0 ∈ A. If r ̸= 0, then d(r) < d(a0 ).
But this is impossible since we assumed a0 was of minimal degree in A. So r = 0 and a = ta0 . Since a was an
arbitrary element of A, A = (a0 ) is a principal ideal. As A was an arbitrary ideal, R is a principal ideal domain. ■
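The proof’s minimal-degree argument can be illustrated by brute force for R = Z (our own example): sampling the ideal generated by 12 and 18 over a finite window, the nonzero element of smallest absolute value generates everything we see.

```python
# Sketch of the theorem for R = Z: the ideal generated by 12 and 18,
# sampled over a finite window, is generated by its element of minimal
# degree d(a) = |a|, which turns out to be gcd(12, 18) = 6.
ideal = {12 * x + 18 * y for x in range(-20, 21) for y in range(-20, 21)}
a0 = min(abs(n) for n in ideal if n != 0)
print(a0)  # 6
assert all(n % a0 == 0 for n in ideal)  # every sampled element lies in (6)
```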
Definition 3.4.11. A module M over R is cyclic if there is an element m0 ∈ M such that every m ∈ M is of the form
m = rm0 for some r ∈ R. M is finitely generated if there is a finite set of elements m1 , · · · , mn such that every
m ∈ M is of the form r1 m1 + · · · + rn mn for some r1 , · · · , rn ∈ R.
Now we can state the promised structure theorem:
Theorem 3.4.3. Let M be a finitely generated module over a principal ideal domain R. Then there is a unique
decreasing sequence of principal ideals (d1 ) ⊇ (d2 ) ⊇ · · · ⊇ (dn ) such that
M ≅ R ⊕ · · · ⊕ R ⊕ R/(d1 ) ⊕ R/(d2 ) ⊕ · · · ⊕ R/(dn ).

The generators of the ideals are unique and for each i with 1 ≤ i ≤ n − 1, di divides di+1 .
Finally, we have an algebra which combines addition, multiplication, and scalar multiplication.
Definition 3.4.12. A ring A is called an algebra over a field F if A is a vector space over F such that for all a, b ∈ A,
there is an element ab ∈ A with the property that if α ∈ F , α(ab) = (αa)b = a(αb).
Example 3.4.9. One example of an algebra is the square n × n matrices with real entries. Then we have the usual
matrix addition and multiplication. Scalars are the real numbers. If m is a matrix and a is a real number, then am is
the matrix m with all of its entries multiplied by a.
In algebraic topology, the Steenrod squares form an algebra. We will see them in Chapter 11.
So you now should have seen a lot of patterns that repeat for groups, rings, vector spaces, etc. They each have
special maps preserving their structure as well as subobjects, product objects, and quotient objects. Category theory
will allow us to talk about these properties in general terms.

3.5 Category Theory


I know what you are thinking now. ”This chapter is not abstract enough for me. I need even more abstraction.”
Never fear, in this last section, we will talk about category theory. I will take the material from the second volume of
Jacobson [76]. For a very complete description, see MacLane [91]. A more recent and currently popular book is Riehl
[135].
Category theory addresses some issues we have already seen. The first is Russell’s paradox. Recall we have seen
that the collection of all sets is not a set. Instead of the set of sets, we will talk about the category of sets. Also, I
have shown you that there are a lot of repeated patterns. Topological spaces, groups, rings, and vector spaces all have

special subsets, products, quotients, and special maps between them that preserve their main structure. Sometimes, it
is convenient to just prove results about these objects in general using the properties they have in common. A category
will contain a type of set or object and a special map between objects called a morphism. Finally, we will want nicely
behaved maps between categories. These are called functors. As an example, homology theory takes a topological
space and maps it to a collection of abelian groups: one for each dimension. This collection is also known as a graded
group.
What do you call the students in an abstract algebra class a week after the final? A graded group.
The amount of category theory used in algebraic topology has varied over time. There are some modern TDA
papers that are filled with it. My personal preference is to go easy on it, but you can’t do algebraic topology without
at least some category theory. In this section, I will provide basic definitions and examples and we will come back to
it later as needed.
Definition 3.5.1. A category C consists of a class ob C of objects, and a set hom C (A, B) of morphisms for each
ordered pair (A, B) of objects in C. (We drop the subscript C if the context is obvious and just write hom(A, B).)
For each ordered triple of objects (A, B, C), there is a map (f, g) → gf called composition taking hom(A, B) ×
hom(B, C) to hom(A, C). The objects and morphisms satisfy the following conditions:
1. If (A, B) ̸= (C, D), then hom(A, B) and hom(C, D) are disjoint.
2. (Associativity) If f ∈ hom(A, B), g ∈ hom(B, C), and h ∈ hom(C, D), then (hg)f = h(gf ).
3. (Unit) For every object A, we have a unique element 1A ∈ hom(A, A), such that f 1A = f for every f ∈
hom(A, B), and 1A g = g, for every g ∈ hom(B, A).
If the objects in a category form an actual set, the category is called a small category.
I will now give some examples. Convince yourself that they satisfy the definition.
Example 3.5.1. Our first example is the category of sets. The objects are sets and the morphisms are functions.
Example 3.5.2. The category of groups. The morphisms are group homomorphisms.
Example 3.5.3. The category of vector spaces over a given field F . The morphisms are linear transformations.
Example 3.5.4. The category of topological spaces. The morphisms are continuous functions.
Definition 3.5.2. A category D is a subcategory of C if any object of D is an object of C, and for any pair A, B of
objects of D, hom D (A, B) ⊂ hom C (A, B).
Example 3.5.5. The category of abelian groups. Again, let the morphisms be group homomorphisms. Abelian groups
are a subcategory of the category of groups.
Categories have special morphisms called isomorphisms. If f ∈ hom(A, B), then f is an isomorphism if there
exists g ∈ hom(B, A) such that f g = 1B , and gf = 1A . If such a map g exists, it is unique and we write g = f −1 .
In that case, (f −1 )−1 = f . If f and h are isomorphisms, and f h is defined, then (f h)−1 = h−1 f −1 . In the category
of sets, isomorphisms are bijective functions, while in the category of groups, they are group isomorphisms.
Here are two ways to get new categories from old ones.
Definition 3.5.3. Let C be a category. We define the dual category or opposite category Cop to have the same objects
as C, but hom Cop (A, B) = hom C (B, A). Commutative diagrams in the dual category correspond to diagrams in the
original category, but the arrows are reversed. If f : A → B in C, then the same f is a morphism from B to A in Cop .
In particular, a commutative triangle in C with morphisms f : A → B, g : B → C, and h = gf : A → C corresponds
to the commutative triangle in Cop obtained by reversing all three arrows.
In general, the term “dual” or the prefix “co-” will mean to reverse arrows.
We can also talk about the product of two categories.
Definition 3.5.4. If C and D are two categories, then we can define the product category C × D. The objects are
ordered pairs (A, A′ ), where A is an object in C and A′ is an object in D. Morphisms are also ordered pairs, and if
A, B, C are objects of C, A′ , B ′ , C ′ are objects of D, f ∈ hom C (A, B), g ∈ hom C (B, C), f ′ ∈ hom D (A′ , B ′ ),
g ′ ∈ hom D (B ′ , C ′ ), then
(g, g ′ )(f, f ′ ) = (gf, g ′ f ′ ).
The next concept is very important in algebraic topology. A functor is a map between categories. It takes objects
to objects and morphisms to morphisms. There are two types of functors. Covariant functors keep morphisms going
in the same direction and contravariant functors reverse the direction. We will meet both kinds in algebraic topology.
Definition 3.5.5. If C and D are categories, a covariant functor F from C to D consists of a map taking an object A in C
to an object F (A) in D, and for every pair of objects A, B in C, a map of hom C (A, B) into hom D (F (A), F (B)), such
that F (1A ) = 1F (A) , and if gf is defined in C, then F (gf ) = F (g)F (f ).
Definition 3.5.6. If C and D are categories, a contravariant functor F from C to D consists of a map taking
an object A in C to an object F (A) in D, and for every pair of objects A, B in C, a map of hom C (A, B) into
hom D (F (B), F (A)), such that F (1A ) = 1F (A) , and if gf is defined in C, then F (gf ) = F (f )F (g).
A contravariant functor can also be thought of as a covariant functor from Cop to D.
I will now give some examples of functors. They are all covariant unless otherwise specified.
Example 3.5.6. Let D be a subcategory of C. The injection functor takes an object in D to itself and maps a morphism
in D to the same morphism in C. In the special case where D = C, it is called the identity functor.
Example 3.5.7. A forgetful functor ”forgets” some of the structure of a category. For example, let F go from the
category of groups to the category of sets where it takes a group to its underlying set. Another example takes rings to
abelian groups and a ring homomorphism to just its additive property to become an additive abelian group homomor-
phism.
Example 3.5.8. The power functor goes from the category of sets to itself and takes any set to its power set (i.e. the
set of its subsets. See Section 2.6.) If f : A → B, this functor sends f to fP : P (A) → P (B) which sends a subset
A1 ⊂ A to the subset f (A1 ) ⊂ B. The empty set is sent to itself.
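Here is a small computational model of the power functor on finite sets (the encoding by frozensets and the names power_set, f_P are ours), including a check of the functor law F (gf ) = F (g)F (f ) on a sample subset.

```python
# Model of the power functor on finite sets: a function f : A -> B
# induces f_P : P(A) -> P(B) taking a subset to its image, and f_P
# respects composition, as a functor must.
from itertools import chain, combinations

def power_set(A):
    A = list(A)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(A, k) for k in range(len(A) + 1))]

def f_P(f, subset):
    """The induced map on subsets: S -> f(S)."""
    return frozenset(f(x) for x in subset)

A = {1, 2, 3}
f = lambda x: x % 2                     # a map A -> {0, 1}
images = {f_P(f, S) for S in power_set(A)}
print(sorted(map(sorted, images)))      # the image subsets of {0, 1}

# functoriality check on a sample subset: F(gf) = F(g)F(f)
g = lambda x: x + 10
S = frozenset({1, 2})
assert f_P(lambda x: g(f(x)), S) == f_P(g, f_P(f, S))
```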
Example 3.5.9. The projection functor goes from the product category C × D to C. It takes (A, A′ ) to A, and
(f, f ′ ) ∈ hom((A, A′ ), (B, B ′ )) to f ∈ hom(A, B).
Example 3.5.10. Homology, which we will define in Chapter 4 is a functor from the category of topological spaces
to the category of graded groups. A continuous function corresponds to a group homomorphism in each dimension.
Example 3.5.11. Finally, I will give an example of a contravariant functor which will be useful in cohomology. (We
will define cohomology in Chapter 8.) Let G be a fixed abelian group. Define a functor which takes an abelian
group A to the group of homomorphisms Hom(A, G) from A to G. A map from A to B corresponds to a map from
Hom(B, G) to Hom(A, G). To see this, let f : A → B, and g ∈ Hom(B, G). Then gf ∈ Hom(A, G), so we
have a map Hom(B, G) → Hom(A, G) given by g → gf . Note that this is a contravariant functor. If we take A to
Hom(G, A) for fixed G then this is a covariant functor.
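For finite cyclic groups this functor is easy to compute by machine. As an illustration of ours (not from the text): a homomorphism Zm → Zn is determined by x = ϕ(1), which must satisfy mx ≡ 0 (mod n), so Hom(Zm , Zn ) can be listed by brute force.

```python
# A group homomorphism phi : Z_m -> Z_n is determined by x = phi(1),
# subject to m*x ≡ 0 (mod n). Enumerating the valid x lists Hom(Z_m, Z_n),
# which is itself a finite abelian group.
def hom_Zm_Zn(m, n):
    return [x for x in range(n) if (m * x) % n == 0]

print(hom_Zm_Zn(4, 6))  # [0, 3]: only x = 0 and x = 3 satisfy 4x ≡ 0 (mod 6)
```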

Finally, I will define a map between two functors called a natural transformation.
Definition 3.5.7. Let F and G be functors from C to D. We define a natural transformation η from F to G to be a
map that assigns to every object A in C a morphism ηA ∈ hom D (F A, GA) such that for any objects A, B of C, and
any f ∈ hom C (A, B), the square whose horizontal arrows are ηA : F A → GA and ηB : F B → GB and whose
vertical arrows are F (f ) : F A → F B and G(f ) : GA → GB is commutative; that is, G(f ) ◦ ηA = ηB ◦ F (f ).

If every ηA is an isomorphism, then η is called a natural isomorphism.


I will save examples of natural transformations until the next chapter.
This gives you a taste of category theory. We will revisit it when necessary.
Now we have all the background we need. The next chapter will start to describe algebraic topology.
Chapter 4

Traditional Homology Theory

Now I will put the last two chapters together and talk about algebraic topology. Before I get into persistent homology,
which is the basis of topological data analysis as it is currently practiced, I will give an introduction to homology
using the traditional approach. Persistent homology will be an easy special case. If we want to get beyond it and start
exploring the possible applications of cohomology and homotopy, we will need a better understanding of the subject
as it was originally developed. Rather than make this chapter 100 pages long, I will use it as an outline and you can
look in any of the textbooks that I will list for more information. One thing I will sacrifice, though, is giving you a full
appreciation for how hard it is to prove that all of the machinery works the way it is supposed to. Like a lot of other
subjects in math, you go through the pain of developing the tools and then never look back.
The first section of this chapter will be the longest and deal with simplicial homology. A topological space is
represented by a collection of vertices, edges, triangles, etc. There is an equivalent combinatorial approach where we
are looking at subsets of a collection of objects. The latter approach is the one that is more applicable to data science.
As we mentioned at the end of the last chapter, homology is a functor which takes a topological space to a graded
group, i.e. an abelian group in every dimension up to the dimension of the space. This means that we not only need to
produce these groups, but for a continuous map between two different spaces, there has to be a corresponding group
homomorphism in every dimension between those spaces’ homology groups. Also, if the spaces are homeomorphic,
we would like these homomorphisms to be actual isomorphisms. This is the property of topological invariance. We
can do all of this, but proving it works turns out to pretty difficult. I won’t go through it all, but you can find it in any
of the references. Along the way, I will briefly discuss computation. There is a very ugly way to compute these groups
directly using the definition. Exact sequences give a shortcut in a lot of special cases. But a computer can always
compute homology groups. I will briefly outline the method in [110]. There are much better methods developed in the
last 30 years and there is software that is publicly available, so I won’t go into a lot of detail on this particular method.
I will finish the section with some interesting applications to maps of spheres and fixed points. The former is the key
to obstruction theory, which I will describe in more detail later.
The next section is about the Eilenberg-Steenrod Axioms. In [47], Eilenberg and Steenrod took an axiomatic
approach to the subject. In other words, there are no pictures. A big contribution was the idea of a homology theory that
would satisfy a set of axioms. The full list is a big help in computation and has interesting parallels in cohomology and
homotopy. Given the notion of a homology theory, you would expect there to be more than one. I will briefly describe
two others in the last section. Simplicial homology only works on spaces that can be represented by simplices or
triangulated. Singular homology is more general and proving topological invariance is much easier than in simplicial
homology. But computations are not at all practical. A good compromise is cellular homology. This has the advantage
that your topological space does not have to be plugged into a wall outlet to use it. It replaces simplicial complexes with
the more flexible CW complexes. For any situation we would meet in data science, you can use any of these theories.
In the case where they all apply, the homology groups are all the same.
The material in this chapter is entirely taken from Munkres’ book, Elements of Algebraic Topology [110], which is
my personal favorite. Here are some other references. As far as I know, the earliest book on algebraic topology was
from Alexandrov and Hopf in 1935 [8]. (We will hear more about Hopf later.) It is in German. The first American


(and English language) textbook was written in 1942 by Solomon Lefschetz [87]. Eilenberg and Steenrod’s book [47]
that I mentioned above was the standard reference for many years. Another popular book is Spanier [149] which first
came out in 1966. Currently, the main competitor to Munkres is Hatcher [63]. Hatcher’s book seems to be the most
popular these days, and it had a free online version for some time before it was published in print. While it covers
more material than Munkres such as Steenrod squares and some homotopy theory, I find it to be a much harder book.
There are books on homotopy that I like better and will mention them when we get to chapter 9.

4.1 Simplicial Homology


I will now describe simplicial homology. In TDA (as it stands now), it is the only homology theory we will ever
need.

4.1.1 Simplicial Complexes


The basic building block of homology theory is a simplex. This is a point in dimension 0, a line segment in
dimension 1, a triangle in dimension 2, a tetrahedron in dimension 3, etc. Note that for an n-dimensional simplex,
there are n + 1 vertices. Simplices are glued together to form simplicial complexes. Most shapes you have ever seen
or thought about can be built in this way. As an example, Epcot’s Spaceship Earth at Disney World is a copy of the
sphere S 2 modeled as a simplicial complex. (See Figure 4.1.1.) Here are the precise definitions.

Figure 4.1.1: Spaceship Earth [177]

Definition 4.1.1. A subset X of Rn is convex if given any two points x1 , x2 ∈ X, the line segment connecting
them is contained in X. (See Figure 4.1.2.) This line is of the form x = tx1 + (1 − t)x2 where 0 ≤ t ≤ 1. If

x0 , x1 , · · · , xk ∈ Rn , the convex hull of these points is the smallest convex subset of Rn containing them.

Figure 4.1.2: Convex vs Non-Convex Set

Definition 4.1.2. A set of points x0 , x1 , · · · , xk ∈ Rn is geometrically independent if for any real scalars ti , the
conditions t0 + t1 + · · · + tk = 0 and t0 x0 + t1 x1 + · · · + tk xk = 0 imply ti = 0 for all i. This is the same as saying
that no three of these points lie on the same line, no four lie in the same plane, etc.
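Geometric independence of x0 , · · · , xk is equivalent to linear independence of the vectors xi − x0 , which can be tested by Gaussian elimination. A Python sketch of ours, using exact arithmetic:

```python
# Test geometric (affine) independence: x_0,...,x_k are geometrically
# independent exactly when the vectors x_i - x_0 are linearly
# independent, checked here by Gaussian elimination over the rationals.
from fractions import Fraction

def geometrically_independent(points):
    if len(points) <= 1:
        return True
    x0 = points[0]
    rows = [[Fraction(a - b) for a, b in zip(p, x0)] for p in points[1:]]
    rank = 0
    for col in range(len(x0)):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col] != 0:
                f = rows[i][col] / rows[rank][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank == len(rows)

# three non-collinear points in R^2 span a 2-simplex (a triangle)
print(geometrically_independent([(0, 0), (1, 0), (0, 1)]))   # True
# three collinear points are not geometrically independent
print(geometrically_independent([(0, 0), (1, 1), (2, 2)]))   # False
```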

Definition 4.1.3. Let x0 , x1 , · · · , xk ∈ Rn be geometrically independent. The k-simplex σ spanned by x0 , x1 , · · · , xk


is the convex hull of these points. In formulas, it is the set

{x | x = t0 x0 + t1 x1 + · · · + tk xk , where t0 + t1 + · · · + tk = 1 and ti ≥ 0 for all i}.

The points, x0 , x1 , · · · , xk are the vertices of σ, and we say that σ has dimension k. If τ is a simplex spanned by a
subset of these vertices, then τ is a face of σ. The real numbers {ti } are called the barycentric coordinates of the point x with respect to σ.

Note that a k-simplex has k + 1 vertices. For example, a triangle is a 2-simplex with three vertices, x0 , x1 , and
x2 . Letting the n-ball B n be the interior of the (n − 1)-sphere S n−1 , an n-simplex is homeomorphic to B n and
its boundary is homeomorphic to S n−1 . So a triangle (remember it is made of silly putty) is homeomorphic to B 2 . Its
boundary can be deformed into the circle, S 1 .
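To make barycentric coordinates concrete, here is a small sketch (the function names and the NumPy approach are mine, not from the text) that solves the defining linear system for a point and tests membership in a simplex: a point lies in the simplex exactly when all of its barycentric coordinates are nonnegative.

```python
import numpy as np

def barycentric_coordinates(x, vertices):
    """Solve for t_0, ..., t_k with x = sum_i t_i v_i and sum_i t_i = 1.

    `vertices` is a (k+1) x n array of geometrically independent points;
    the point x is assumed to lie in their affine hull."""
    V = np.asarray(vertices, dtype=float)
    x = np.asarray(x, dtype=float)
    # Stack the affine constraint sum(t) = 1 on top of the coordinate equations.
    A = np.vstack([np.ones(len(V)), V.T])        # shape (n+1, k+1)
    b = np.concatenate([[1.0], x])               # shape (n+1,)
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

def in_simplex(x, vertices, tol=1e-9):
    """A point of the affine hull lies in the k-simplex iff all t_i >= 0."""
    t = barycentric_coordinates(x, vertices)
    return bool(np.all(t >= -tol))

# The standard 2-simplex (a triangle) in the plane.
triangle = [(0, 0), (1, 0), (0, 1)]
print(in_simplex((0.25, 0.25), triangle))   # True: coordinates (0.5, 0.25, 0.25)
print(in_simplex((1, 1), triangle))         # False: t_0 = -1
```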
We now stick simplices together into a simplicial complex.

Definition 4.1.4. A simplicial complex K in Rn is a collection of simplices such that

1. Every face of a simplex of K is in K.

2. The intersection of any two simplices of K is a face of each of them.

Figure 4.1.3 shows some examples of good and bad simplicial complexes. The main point is that the intersection
of two simplices must be an entire face of each of them.
The next definition is very important.

Definition 4.1.5. Let K be a simplicial complex and K p be the collection of all simplices of dimension at most p.
Then K p is called the p-skeleton of K.

To visualize where this term comes from, imagine Spaceship Earth with all its triangles missing. Then there
would only be vertices and edges and you would see an outline of the sphere. That complex would be the 1-skeleton,
Spaceship Earth^1 .
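As a quick illustration (the representation of a complex as a set of frozensets of vertex labels is my own convention, not the text's), extracting a p-skeleton is just a filter on dimension:

```python
def skeleton(complex_, p):
    """Return the p-skeleton: all simplices of dimension at most p.

    A complex is modeled as a collection of simplices, each a frozenset
    of vertex labels; dimension = number of vertices minus one."""
    return {s for s in complex_ if len(s) - 1 <= p}

# A filled triangle: one 2-simplex together with all of its faces.
K = {frozenset(s) for s in
     [("a",), ("b",), ("c",), ("a", "b"), ("a", "c"), ("b", "c"), ("a", "b", "c")]}
K1 = skeleton(K, 1)           # drop the 2-simplex, keep vertices and edges
print(len(K1))                # 6: three vertices and three edges
```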
Now, we look at the space taken up by the complex K.

Definition 4.1.6. Let K be a simplicial complex. The union of all of the simplices of K as a subset of Rn is called
the polytope or underlying space of K and denoted by |K|.

Figure 4.1.3: Good and Bad Simplicial Complexes

The polytope |K| has some nice properties. First of all, any simplex is a closed bounded subset of Rn so it is
compact by the Heine-Borel Theorem. So for a finite complex K, |K| is compact as it is a finite union of compact
sets. It is easily proved that for a space X, f : |K| → X is continuous if and only if f |σ is continuous for every
simplex σ in K. I will give the proof that |K| is Hausdorff. First, we need to extend the definition of barycentric
coordinates to a simplicial complex.
Definition 4.1.7. Let K be a simplicial complex. If x is a point of |K|, then x is a vertex of K or it is in the interior
of exactly one simplex σ of K. Suppose the vertices of σ are a0 , · · · , ak . Then

x = \sum_{i=0}^{k} t_i a_i ,

where ti > 0 for each i and \sum_i t_i = 1. If v is an arbitrary vertex of K, we define the barycentric coordinate tv (x) of x
as tv (x) = 0 if v is not one of the vertices ai , and tv (x) = ti if v = ai .
Theorem 4.1.1. |K| is Hausdorff.
Proof: If x ̸= y, then there is at least one vertex v of K with tv (x) ̸= tv (y). Without loss of generality, assume
tv (x) < tv (y), and choose a number r with tv (x) < r < tv (y). Then {z ∈ |K| : tv (z) < r} and {z ∈ |K| : tv (z) > r} are
disjoint open sets containing x and y respectively. ■
Homology will be a functor taking a simplicial complex to a group in each dimension. We will then need
special maps between simplicial complexes.
Theorem 4.1.2. Let K and L be complexes, and let f : K 0 → L0 be a map on vertices. Suppose that whenever the vertices
v0 , · · · , vn span a simplex of K, the points f (v0 ), · · · , f (vn ) span a simplex of L. Then f can be extended to a continuous
map g : |K| → |L| such that

x = \sum_{i=0}^{n} t_i v_i implies g(x) = \sum_{i=0}^{n} t_i f (v_i).

We call g the simplicial map induced by f .


Note that we don’t insist that the f (vi ) are always distinct. If they are distinct and f is bijective, then g turns out to be a
homeomorphism. Also note that the composition of two simplicial maps is a simplicial map.

We end this subsection with a different kind of simplicial complex. This version is much more useful in TDA
applications.
Definition 4.1.8. An abstract simplicial complex is a collection S of nonempty sets such that if A is an element of
S, so is every nonempty subset of A. A is called a simplex of S. The dimension of A is one less than the number of
elements of A. A subset of A is called a face of A.
Note that in an abstract simplicial complex, any two simplices automatically intersect in a common face (possibly empty). So the only property
we need to worry about is that the complex contains all of the faces of each of its simplices. We can sometimes define
an interesting abstract simplicial complex by specifying its maximal simplices, i.e. simplices that are not contained in any
larger simplex. We then get the complex by including the maximal simplices and all of their faces.
Example 4.1.1. Here is an idea of mine. The Apriori Algorithm [7] looks at frequent common subsets of a large
collection of objects. For example, a supermarket might analyze the items that customers buy together. It determines
rules such as: a customer who buys peanut butter and bread is also likely to buy jelly. For each customer, we can assign
an abstract simplicial complex. The maximal simplices are the items bought in each shopping trip and then we make
this into a simplicial complex by including all of the faces. My idea is to use the homology to cluster customers. For
example, can you distinguish a customer who is single from one who is buying for a large family? I will discuss this
more in Chapter 7.
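A minimal sketch of this idea, with hypothetical basket data (the function name and the data are mine): each shopping trip becomes a maximal simplex, and we close under taking nonempty subsets.

```python
from itertools import combinations

def complex_from_baskets(baskets):
    """Build an abstract simplicial complex whose maximal simplices are
    the shopping baskets, by adding every nonempty subset of each basket."""
    simplices = set()
    for basket in baskets:
        items = tuple(sorted(set(basket)))
        for k in range(1, len(items) + 1):
            simplices.update(frozenset(c) for c in combinations(items, k))
    return simplices

trips = [{"peanut butter", "bread", "jelly"}, {"bread", "milk"}]
K = complex_from_baskets(trips)
print(frozenset({"peanut butter", "jelly"}) in K)   # True: a face of the first trip
print(len(K))                                       # 9: 4 vertices + 4 edges + 1 triangle
```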
Example 4.1.2. Abstract simplicial complexes are also closely related to hypergraphs. Suppose we have a set of
objects called vertices. An (undirected) graph is the collection of vertices along with a set of pairs of vertices called
edges. The two vertices are the ends of the edge and they can be the same if we allow loops. A hypergraph is a variant
of a graph where the hyperedges can contain more than two vertices. For example, if the vertices are employees in
an office, one person can send an email to five of the employees, making a hyperedge of size 6. Taking all of the hyperedges in a hypergraph along with all of their nonempty subsets forms an abstract simplicial complex.
One more thing to note is that an abstract simplicial complex can always be represented as a simplicial complex as
we first defined it. If there are N + 1 vertices, we use the N standard basis vectors along with the origin in RN . (Recall
that a standard basis vector has a one in one coordinate and zeros in the rest.) Label these points to match the vertices in
your abstract complex, and for each simplex in the complex, form the simplex on its corresponding vertices in RN .
This gives a correspondence between the two kinds of simplicial complexes.
In the next part, we will see how to turn complexes into groups.
In the next part, we will see how to turn complexes into groups.

4.1.2 Homology Groups


Now I will define homology groups. For now, I will talk about homology groups whose coefficients are integers.
If you can understand these, moving to other types of coefficients is not that hard. In persistent homology, integers
are replaced with Z2 coefficients. I will discuss the differences in the next subsection. There is a general formula
for changing between coefficients, but it is closely related to cohomology and will be discussed in Chapter 8 when we
develop some additional tools. One issue that moving to Z2 eliminates is that of orientation, which I will describe
next.
Definition 4.1.9. Let σ be a simplex with vertices v0 , v1 , · · · , vp . We choose an ordering of its vertices and say that
two orderings are equivalent if they differ by an even permutation, i.e. an even number of swaps of pairs of them. The
orderings then divide into two equivalence classes called orientations. We choose a particular orientation and call σ
an oriented simplex, written σ = [v0 , v1 , · · · , vp ].
Definition 4.1.10. Let K be a simplicial complex. A p-chain on K is a function c from the oriented p-simplices of K
to the integers such that:
1. c(σ) = −c(σ ′ ) if σ and σ ′ are opposite orientations of the same simplex.
2. c(σ) = 0 for all but finitely many oriented p-simplices σ.

We can add p-chains by adding their values. They form an abelian group denoted Cp (K). Letting the dimension dim
K be the dimension of the highest dimensional simplex in K, we let Cp (K) = 0 if p >dim K or p < 0.
Definition 4.1.11. Let σ be an oriented simplex. Then the elementary chain c corresponding to σ is defined by
1. c(σ) = 1
2. c(σ ′ ) = −1 if σ and σ ′ are opposite orientations of the same simplex.
3. c(τ ) = 0 for any other simplex τ .
Theorem 4.1.3. Cp (K) is a free abelian group. The basis is obtained by orienting each p-simplex and using the
corresponding elementary chains.
Definition 4.1.12. We now define a special homomorphism

∂p : Cp (K) → Cp−1 (K),

called the boundary operator. By the previous theorem, we just need to define ∂p on the oriented simplices of K.
Let σ = [v0 , v1 , · · · , vp ] be an oriented simplex with p > 0. Then

∂p σ = ∂p [v0 , v1 , · · · , vp ] = \sum_{i=0}^{p} (−1)^i [v0 , · · · , v̂i , · · · , vp ],

where the symbol v̂i means that the vertex vi is deleted. Since Cp (K) = 0 for p < 0, we set ∂p = 0 for p ≤ 0.

Figure 4.1.4: Boundary Examples

Example 4.1.3. Figure 4.1.4 shows an example of the boundary of both a 1-simplex and a 2-simplex. For the 1-
simplex [v0 , v1 ], ∂1 [v0 , v1 ] = v1 − v0 , so the boundary is just the end point minus the start point.
For the 2-simplex, ∂2 [v0 , v1 , v2 ] = [v1 , v2 ] − [v0 , v2 ] + [v0 , v1 ]. Each 1-simplex starts at the left vertex and moves
towards the right one, unless it appears with a negative sign in the sum, in which case we move in the opposite direction. The triangle
[v0 , v1 , v2 ] is oriented clockwise as shown in the figure. So when taking the boundary, the inside is removed and we
move along the edges as follows: start at v1 and move to v2 . Then traverse [v0 , v2 ] in the reverse direction, so we move
from v2 to v0 . Then we move from v0 to v1 . So we are moving around the boundary of the triangle in the same order as
shown on the left hand side of the figure.
Notice if we take the boundary ∂1 ([v1 , v2 ] − [v0 , v2 ] + [v0 , v1 ]), we get v2 − v1 + v0 − v2 + v1 − v0 = 0. So the
boundary of the boundary is 0. Does that always happen?
Theorem 4.1.4. ∂p−1 ◦ ∂p = 0.
Try to prove this yourself from the formula. It is messy but not too hard.
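The boundary formula is easy to implement directly, and ∂ ∘ ∂ = 0 can then be checked numerically. A sketch (the dictionary representation of chains, mapping an oriented simplex tuple to its integer coefficient, is my own convention):

```python
from collections import defaultdict

def boundary(chain):
    """Boundary of an integer chain, stored as {oriented simplex tuple: coeff}.

    Implements the formula from Definition 4.1.12:
    the boundary of [v0,...,vp] is the alternating sum of its faces."""
    out = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:          # the boundary of a vertex is 0
            continue
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # delete the i-th vertex
            out[face] += (-1) ** i * coeff
    return {s: c for s, c in out.items() if c != 0}

# Boundary of the oriented triangle [v0, v1, v2]:
tri = {(0, 1, 2): 1}
print(boundary(tri))            # {(1, 2): 1, (0, 2): -1, (0, 1): 1}
print(boundary(boundary(tri)))  # {} -- the boundary of a boundary is zero
```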
Now we can finally define a homology group.

Definition 4.1.13. The kernel of ∂p : Cp (K) → Cp−1 (K) is called the group of p-cycles, denoted Zp (K). The image
of ∂p+1 : Cp+1 (K) → Cp (K) is called the group of p-boundaries, denoted Bp (K). The previous theorem shows
that Bp (K) ⊂ Zp (K). We can take the quotient group Hp (K) = Zp (K)/Bp (K), and we call it the pth homology
group of K.

What do elements of a homology group look like? We know every boundary is a cycle, but we are looking for
cycles that are not boundaries. The next two examples from [110] are instructive.

Figure 4.1.5: Homology Examples

Example 4.1.4. Consider the complex K of Figure 4.1.5. It is the boundary of a square with edges e1 , e2 , e3 , e4 .
The group C1 (K) has rank 4, and a general 1-chain is of the form c = \sum_{i=1}^{4} n_i e_i . The value of ∂1 c on the vertex v (bottom
right) is n1 − n2 . So for c to be a cycle, we must have n1 = n2 . Looking at the other vertices, we must have
n1 = n2 = n3 = n4 . So Z1 (K) ∼= Z, generated by the cycle e1 + e2 + e3 + e4 . Since there are no 2-simplices,
B1 (K) = 0, so H1 (K) = Z1 (K) ∼= Z.

Example 4.1.5. The complex L on the right side of Figure 4.1.5 is a filled-in square with edges e1 , e2 , e3 , e4 . We
also have a diagonal edge e5 , so a general 1-chain is of the form c = \sum_{i=1}^{5} n_i e_i . For c to be a cycle we need n1 = n2 ,
n3 = n4 , and n5 = n3 − n2 . As an example, the contributions on the top right vertex are −n3 , +n2 , +n5 , so we
want n2 + n5 − n3 = 0. We can arbitrarily assign n2 and n3 , and the other values are then determined. So Z1 (L) is
free abelian of rank 2. We get a basis by letting n2 = 1 and n3 = 0, making our generic 1-chain e1 + e2 − e5 , and
letting n2 = 0 and n3 = 1, making our generic 1-chain e3 + e4 + e5 . The first is ∂2 σ1 and the second is ∂2 σ2 . So
H1 (L) = Z1 (L)/B1 (L) = 0. The general 2-chain is m1 σ1 + m2 σ2 , which is a cycle only if m1 = m2 = 0. So
H2 (L) = 0.
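Since these two examples are torsion-free, their first Betti numbers can also be checked numerically from the boundary matrices, using b_1 = dim ker ∂1 − rank ∂2 with ranks over the rationals. A sketch; the matrix conventions and edge orientations below are read off the two examples as I understand the figure, so treat them as an assumption:

```python
import numpy as np

def betti(d_p, d_p1):
    """b_p = dim ker(boundary_p) - rank(boundary_{p+1}).
    Rank over the rationals suffices here since these examples have no torsion."""
    ker = d_p.shape[1] - np.linalg.matrix_rank(d_p)
    rank_next = np.linalg.matrix_rank(d_p1) if d_p1.size else 0
    return int(ker - rank_next)

# Hollow square K: rows are v0..v3, columns are e1=[v0,v1], e2=[v1,v2],
# e3=[v2,v3], e4=[v3,v0]; there are no 2-simplices.
d1_K = np.array([[-1,  0,  0,  1],
                 [ 1, -1,  0,  0],
                 [ 0,  1, -1,  0],
                 [ 0,  0,  1, -1]])
print(betti(d1_K, np.zeros((4, 0))))   # 1 -- the hole in the hollow square

# Filled square L: add the diagonal e5=[v0,v2] and triangles s1=[v0,v1,v2],
# s2=[v0,v2,v3], whose boundaries are e1+e2-e5 and e3+e4+e5 as in the text.
d1_L = np.hstack([d1_K, [[-1], [0], [1], [0]]])
d2_L = np.array([[ 1, 0],    # e1
                 [ 1, 0],    # e2
                 [ 0, 1],    # e3
                 [ 0, 1],    # e4
                 [-1, 1]])   # e5
print(betti(d1_L, d2_L))               # 0 -- filling in the square kills the hole
```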

This would get pretty ugly if it were the only way to compute homology groups. Suppose you are a typical visitor
to Disney World. You would probably want to calculate the homology groups of Spaceship Earth. Imagine spending
your entire vacation labeling all of the vertices, edges, and triangles. But there is a lot of cancellation, and we can
replace any chain with a chain whose difference is a boundary. (Remember that for an abelian group G and subgroup H,
g1 and g2 are in the same coset of H if g1 − g2 ∈ H.) In a homology group, two chains c1 , c2 are homologous if
c1 − c2 = ∂d for some chain d. (Note that we will drop the subscript on ∂ when it won’t cause confusion.) So homologous
chains represent the same element in the homology group. If c = ∂d, we say that c bounds or is homologous to zero.
So the way to compute some homology groups more easily is to subtract off pieces of chains that are also boundaries. See
[110] for a lot of examples.
The main thing I want you to take away from the first two examples is a feeling for what a cycle is and for when a cycle
is not a boundary. In dimension one, a cycle is a loop, and if it is not the boundary of something, it corresponds to
a hole. So identifying homologous chains as being in the same class, a class which is a generator of a homology group
corresponds to a hole.
Here are some homology groups of interesting objects. See [110] or your favorite textbook for proofs. Basically,
you eliminate pieces of chains that are also boundaries.

Theorem 4.1.5. Hp (S n ) = 0 for p ≥ 1 and p ̸= n, and Hn (S n ) ∼= Z for n ≥ 1.

So now you don’t have to compute the homology of Spaceship Earth as it is basically S 2 .

Theorem 4.1.6. Let T be a complex whose underlying space is a torus. Then H1 (T ) ∼= Z ⊕ Z, and H2 (T ) ∼= Z.

Theorem 4.1.7. Let K be a complex whose underlying space is a Klein bottle. Then H1 (K) ∼= Z ⊕ Z2 , and
H2 (K) = 0.

Theorem 4.1.8. Let P 2 be a complex whose underlying space is a projective plane. Then H1 (P 2 ) ∼= Z2 , and
H2 (P 2 ) = 0.

You may notice that I haven’t discussed zero dimensional homology. I will do that next.

Theorem 4.1.9. Let K be a complex. Then the group H0 (K) is free abelian. If {vα } is a collection consisting of one
vertex from each connected component of |K|, then the homology classes of the chains vα form a basis for H0 (K). In
particular, if |K| is connected, then H0 (K) ∼= Z.

The idea behind the proof is to first show that if two vertices are in the same component of |K|, there is a path
consisting of edges from one to the other. Fix a component of |K| and let v be the corresponding vertex in the
collection. If w is another vertex in that component, let [v, a1 ], [a1 , a2 ], · · · , [an−1 , w] be 1-simplices forming a path from v to w. Then
the 1-chain [v, a1 ] + [a1 , a2 ] + · · · + [an−1 , w] has boundary w − v. So w is homologous to v, and every 0-chain in K
is homologous to a linear combination of the {vα }. Also, no nonzero chain of the form \sum_α n_α v_α can be a boundary, as any one
simplex can only lie in a single component. This proves the theorem.
There is another common version of zero dimensional homology called reduced homology.

Definition 4.1.14. Let ϵ : C0 (K) → Z be the surjective homomorphism defined by ϵ(v) = 1 for each vertex v ∈ K.
If c is a 0-chain, then ϵ(c) is the sum of the values of c on the vertices of K. The map ϵ is called an augmentation map
for C0 (K). Now if d = [v0 , v1 ] is a 1-simplex, then ϵ(∂1 (d)) = ϵ(v1 − v0 ) = 1 − 1 = 0, so im ∂1 ⊂ ker ϵ. We define
the reduced homology group H̃0 (K) = ker ϵ/im ∂1 . If p ≥ 1, we define H̃p (K) ≡ Hp (K).

Theorem 4.1.10. The group H̃0 (K) is free abelian and

H̃0 (K) ⊕ Z ∼= H0 (K).

So for |K| connected, H̃0 (K) = 0. We get a basis by fixing an index α0 and choosing a collection {vα } consisting of
one vertex from each connected component of |K|. Then the classes {vα − vα0 } for α ̸= α0 form a basis of H̃0 (K).

Definition 4.1.15. A simplicial complex K is acyclic if its reduced homology is zero in all dimensions.

Now recall the definition of a cone from Section 2.4.3. It is formed from a space X by taking the Cartesian product
X × I and identifying the points (x, 1) to a single point. Another way to look at it is that you pick a point outside X
and connect every point in X to that point with a straight line. What do you think this operation does to homology?
Imagine X = S 1 and take Y = CX. We have then turned a circle into an ice cream cone. We have also just
plugged up a hole. Taking cones plugs up all of the holes. So homology becomes zero for p ≥ 1 and every point of X
is connected to the external point, so in general, CX is connected even if X didn’t start out that way.
Suppose you start with a simplicial complex K. We form a cone w ∗ K in which for any simplex σ = [a0 , · · · , ap ]
of K, we include the simplex [w, σ] = [w, a0 , · · · , ap ] and all of its faces. Note that any simplex of dimension p > 0
can be formed by taking a cone of a simplex of dimension p − 1.

Theorem 4.1.11. If w ∗ K is a cone, then H̃p (w ∗ K) = 0 for all p.

So we now know the homology of a simplex of dimension p > 0, as it is the cone of a simplex of dimension p − 1.
Also, a point x has no simplices in dimensions higher than 0 and has one connected component, so
H̃p (x) = 0 for all p.
As a final topic in this subsection, I will discuss the idea of relative homology. In this case we have a smaller
complex K0 ⊂ K, and chains are represented as elements of a quotient group. This is surprisingly important for the theory, as
you will see when we discuss the Eilenberg-Steenrod axioms.

Definition 4.1.16. Let K be a simplicial complex, and K0 a subcomplex. Then a p-chain of K0 can be thought of as
a p-chain of K whose value is set to zero on any simplices not in K0 . So Cp (K0 ) is a subgroup of Cp (K) and we can
define the group Cp (K, K0 ) of relative p-chains of K modulo K0 as the quotient group

Cp (K, K0 ) = Cp (K)/Cp (K0 ).

The group Cp (K, K0 ) is free abelian with basis σi + Cp (K0 ) where σi runs over all simplices of K that are not in
K0 .
We say that a chain is carried by a subcomplex if it is zero for every simplex not in the subcomplex. Now we need
a boundary map ∂ : Cp (K, K0 ) → Cp−1 (K, K0 ). Since the restriction of the usual boundary ∂ to Cp (K0 ) takes a
p-chain carried by K0 to a p − 1-chain carried by K0 . We can define a boundary ∂ : Cp (K, K0 ) → Cp−1 (K, K0 )
by ∂(cp + Cp (K0 )) = ∂(cp ) + Cp−1 (K0 ) where the use of ∂ as the original or the relative boundary should be clear
from the context. Then ∂ ◦ ∂ = 0. so we have
1. Zp (K, K0 ) = ker ∂ : Cp (K, K0 ) → Cp−1 (K, K0 )
2. Bp (K, K0 ) = im ∂ : Cp+1 (K, K0 ) → Cp (K, K0 )
3. Hp (K, K0 ) = Zp (K, K0 )/Bp (K, K0 ).
These groups are called the group of relative p-cycles, the group of relative p-boundaries, and the relative homology
group in dimension p respectively. Note that a relative p-chain cp + Cp (K0 ) is a relative cycle if and only if ∂cp is
carried by K0 and a relative p-boundary if and only if there is a p + 1-chain dp+1 of K such that cp − ∂dp+1 is carried
by K0 .
We conclude with an important result, which will also be included as one of the Eilenberg-Steenrod axioms.
Theorem 4.1.12. Excision Theorem: Let K be a complex, and let K0 be a subcomplex. Let U be an open set
contained in |K0 |, such that |K| − U is the underlying space of a subcomplex L of K. Let L0 be the subcomplex of K
whose underlying space is |K0 | − U . Then we have an isomorphism Hp (L, L0 ) ∼ = Hp (K, K0 ).
Proof: Consider the composite map ϕ

Cp (L) → Cp (K) → Cp (K)/Cp (K0 ).

This is inclusion followed by projection. It is onto, since Cp (K)/Cp (K0 ) has as a basis all cosets σi + Cp (K0 ) with
σi not in K0 , and L contains all of these simplices. The kernel is Cp (L0 ). So the map induces an isomorphism

Cp (L)/Cp (L0 ) ∼= Cp (K)/Cp (K0 )

for all p. Since the boundary operator is preserved by this isomorphism, Hp (L, L0 ) ∼= Hp (K, K0 ). ■

Figure 4.1.6: Excision Example

Figure 4.1.6 shows an example. This is the problem of adversarial algebraic topology. You have a nice simplicial
complex with a subcomplex and a bully comes and rips a hole in it. As long as the hole is an open set and contained
within the subcomplex, you don’t have a problem, and everything stays the same. We will see that this is not the case
with homotopy groups and is one of the factors that makes them much harder to compute than homology groups.

4.1.3 Homology with Arbitrary Coefficients


This subsection will be pretty short but relevant to topological data analysis. In most applications, we use homology
over Z2 .
Definition 4.1.17. Let K be a simplicial complex and G an abelian group. Then a p-chain of K with coefficients
in G is a function cp from the oriented p-simplices of K to G that vanishes on all but finitely many p-simplices and
satisfies cp (σ) = −cp (σ ′ ) if σ and σ ′ are opposite orientations of the same simplex. We write Cp (K; G), where Cp (K) is
understood to equal Cp (K; Z).
The boundary operator ∂ : Cp (K; G) → Cp−1 (K; G) is defined by the formula ∂(gσ) = g(∂(σ)) where g ∈ G.
We can then define cycles, boundaries, and homology with coefficients in G in the obvious way. Relative homology
can also be defined.
When we let G = Z2 , we look at the simplices as being included or excluded from a chain. Also for g ∈ Z2 ,
g = −g and 2g = 0. So the boundary map loses its minus signs and we no longer talk about orientation. This changes
the calculations a bit for the homology of the objects we have seen. Persistent homology needs G to be a field as we
will see in the next chapter.
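Over Z2 the boundary matrices have 0/1 entries, and ranks must then be computed mod 2 rather than over the rationals. A sketch of Gaussian elimination over the field with two elements (a generic implementation of mine, not tied to any particular complex), applied again to the hollow square of Example 4.1.4:

```python
import numpy as np

def rank_gf2(M):
    """Rank of an integer matrix over Z2, by Gaussian elimination mod 2."""
    M = (np.asarray(M) % 2).astype(np.uint8)
    rank, rows, cols = 0, M.shape[0], M.shape[1]
    for col in range(cols):
        pivot = next((r for r in range(rank, rows) if M[r, col]), None)
        if pivot is None:
            continue                         # no pivot in this column
        M[[rank, pivot]] = M[[pivot, rank]]  # move the pivot row up
        for r in range(rows):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]              # add rows mod 2: no signs needed
        rank += 1
    return rank

# Over Z2 orientations disappear: the boundary of an edge {u, v} is u + v.
# The hollow square again, now with 0/1 entries and no 2-simplices.
d1 = np.array([[1, 0, 0, 1],
               [1, 1, 0, 0],
               [0, 1, 1, 0],
               [0, 0, 1, 1]])
print(d1.shape[1] - rank_gf2(d1))      # 1 -- the same Betti number as over Z
```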
If we know homology with integer coefficients, we can find it for any other coefficient group using the Universal
Coefficient Theorem. I will defer this theorem until we talk about cohomology, as it uses the machinery of homological
algebra, especially tensor and torsion products. There is a Universal Coefficient Theorem for cohomology as well,
which allows us to compute cohomology groups if we know homology groups. This is good news but also bad news in
the sense that cohomology groups do not give us any additional information. The power of cohomology comes from
its ring structure using the operation of cup products. I will get to all of that in Chapter 8.

4.1.4 Computability of Homology Groups


In this subsection, I will briefly outline the automated method of computing homology groups described in
Munkres [110]. This has some theoretical interest, but there is much better software out there these days. Remember
that we are talking about the state of the art in 1984. Fast computation is not really my area, but I will talk about persistent
homology software in the next chapter. Also, we can calculate homology groups with exact sequences in many interesting
cases using old-fashioned pencil and paper methods.
We start with a theorem on subgroups of a free abelian group.
Theorem 4.1.13. Let A be a free abelian group. Then any subgroup B of A is also free abelian. If A is of finite rank
m, then B has rank r ≤ m.
To compute homology groups, we will work with the matrix of the boundary map. (Remember that chain groups
are always free abelian.) We will explicitly define a matrix of a map between two free abelian groups.
Definition 4.1.18. Let G and G′ be free abelian groups with bases a1 , · · · , an and a′1 , · · · , a′m respectively. If
f : G → G′ is a homomorphism, then

f (aj ) = \sum_{i=1}^{m} λij a′i

for unique integers λij . The matrix (λij ) is called the matrix of f relative to the given bases for G and G′ .
Theorem 4.1.14. Let G and G′ be free abelian groups of ranks n and m respectively, and let f : G → G′ be a
homomorphism. Then there are bases for G and G′ such that, relative to these bases, the matrix of f has the form

\begin{pmatrix} b_1 & & & \\ & \ddots & & 0 \\ & & b_k & \\ & 0 & & 0 \end{pmatrix},

where bi ≥ 1 and b1 |b2 | · · · |bk . This matrix is called the Smith normal form.

I will give an outline of the procedure. See [110] or your favorite linear algebra book for details and examples.
Also, Matlab has a built-in function to do this calculation, and people have developed functions for Python and
Mathematica.
Throughout the process we use the three elementary row operations:

1. Exchange row i and row j.

2. Multiply row i by −1.

3. Replace row i by (row i) +q(row k), where q is an integer and k ̸= i.

There are also similar column operations.


Now for a matrix A = (aij ), let α(A) denote the smallest nonzero value among the absolute values |aij | of the entries
of A. We call aij a minimal entry of A if |aij | = α(A). The reduction proceeds in two steps. The first brings the
matrix to a form where α(A) is as small as possible. The second reduces the dimensions of the matrix involved.
Step 1: To decrease the value of α(A) we use the following fact: If the number α(A) fails to divide some entry
of A then it is possible to decrease the value of α(A) by applying elementary row and column operations to A. The
converse is also true.
The idea is then to perform row and column operations until α(A) divides every entry of A. See [110] for explicit
steps.
Step 2: At this point the minimal nonzero element divides all other nonzero elements. Bring it to the top left
corner and make it positive. Since it divides all entries in its row and column, we can apply row and column operations
to make all of those entries zero.
Now repeat Steps 1 and 2 by ignoring the first row and column and working on the smaller matrix.
Step 3: The algorithm terminates when the smaller matrix is all zeros or disappears. At this point, the matrix is in
Smith normal form and we have guaranteed that each diagonal element will divide all elements below it.
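The reduction described above can be written in a few dozen lines of Python. This is a sketch of the standard algorithm under some simplifying assumptions: it uses exact integer arithmetic, returns only the reduced matrix (no change-of-basis matrices), and makes no attempt to control the coefficient growth that practical implementations must worry about.

```python
def smith_normal_form(A):
    """Reduce an integer matrix to Smith normal form with row/column operations."""
    M = [list(row) for row in A]
    rows = len(M)
    cols = len(M[0]) if rows else 0
    t = 0
    while t < min(rows, cols):
        # Step 1: bring a minimal nonzero entry of the submatrix to (t, t).
        nz = [(abs(M[r][c]), r, c) for r in range(t, rows)
              for c in range(t, cols) if M[r][c]]
        if not nz:
            break                      # the rest of the matrix is zero
        _, r, c = min(nz)
        M[t], M[r] = M[r], M[t]
        for row in M:
            row[t], row[c] = row[c], row[t]
        if M[t][t] < 0:                # make the pivot positive
            for row in M:
                row[t] = -row[t]
        pivot = M[t][t]
        # Step 2: clear row t and column t with Euclidean-style reductions.
        changed = False
        for r in range(t + 1, rows):
            q = M[r][t] // pivot
            if q:
                for cc in range(cols):
                    M[r][cc] -= q * M[t][cc]
            if M[r][t]:
                changed = True         # a smaller remainder appeared
        for c in range(t + 1, cols):
            q = M[t][c] // pivot
            if q:
                for rr in range(rows):
                    M[rr][c] -= q * M[rr][t]
            if M[t][c]:
                changed = True
        if changed:
            continue                   # re-pick a (now smaller) pivot
        # Enforce the divisibility condition b1 | b2 | ... | bk.
        bad = next((r for r in range(t + 1, rows)
                    for c in range(t + 1, cols) if M[r][c] % pivot), None)
        if bad is not None:
            for cc in range(cols):     # mix the offending row into row t
                M[t][cc] += M[bad][cc]
            continue
        t += 1
    return M

print(smith_normal_form([[2, 0], [0, 3]]))   # [[1, 0], [0, 6]]
```

Note how the divisibility fix works on the last example: 2 does not divide 3, so mixing the rows creates an entry whose remainder mod the pivot is smaller, and the pivot shrinks to 1.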
We will state a general theorem on chain complexes, which we now define. Simplicial homology groups are a
special case.

Definition 4.1.19. A chain complex C is a sequence

· · · → Cp+1 →(∂p+1) Cp →(∂p) Cp−1 → · · ·

of abelian groups Ci and homomorphisms ∂i indexed by the integers, such that ∂p ◦ ∂p+1 = 0 for all p. The pth
homology group of C is defined by the equation Hp (C) = ker ∂p /im ∂p+1 .

Theorem 4.1.15. Let {Cp , ∂p } be a chain complex such that each group Cp is free and of finite rank. Then for each
p, there are subgroups Up , Vp , Wp of Cp such that

Cp = Up ⊕ Vp ⊕ Wp ,

where ∂p (Up ) ⊂ Wp−1 and ∂p (Vp ) = ∂p (Wp ) = 0. In addition, there are bases for Up and Wp−1 relative to which
∂p : Up → Wp−1 has a matrix of the form

B = \begin{pmatrix} b_1 & & \\ & \ddots & \\ & & b_k \end{pmatrix},

where bi ≥ 1 and b1 |b2 | · · · |bk .

Outline of Proof [110]:

Step 1:
Let Zp be the group of p-cycles and Bp be the group of p-boundaries. Let Wp consist of all elements cp ∈ Cp
such that mcp ∈ Bp for some nonzero integer m. Then mcp = ∂p+1 dp+1 for some dp+1 , so m∂p cp = ∂p (mcp ) =
∂p ∂p+1 dp+1 = 0. Since Cp−1 is torsion free, this implies that ∂p cp = 0, so Wp ⊂ Zp . Wp is called the group of weak
boundaries. Munkres shows that Wp is a direct summand of Zp and lets Vp be a subgroup such that Zp = Vp ⊕ Wp .
Step 2:
Choose bases e1 , · · · , en for Cp and e′1 , · · · , e′m for Cp−1 so that the matrix of ∂p : Cp → Cp−1 has the Smith
normal form

\begin{pmatrix} b_1 & & & \\ & \ddots & & 0 \\ & & b_k & \\ & 0 & & 0 \end{pmatrix},

with the columns indexed by e1 , · · · , en and the rows by e′1 , · · · , e′m , where bi ≥ 1 and b1 |b2 | · · · |bk . Munkres shows that the following hold:

1. ek+1 , · · · , en is a basis for Zp .

2. e′1 , · · · , e′k is a basis for Wp−1 .

3. b1 e′1 , · · · , bk e′k is a basis for Bp−1 .

Step 3:
Now choose bases for Cp and Cp−1 as in Step 2. Define Up to be the group spanned by e1 , · · · , ek . Then Cp =
Up ⊕ Zp . Using Step 1, choose Vp so that Zp = Vp ⊕ Wp . Then we have a decomposition of Cp such that ∂p (Vp ) =
∂p (Wp ) = 0. The existence of the desired bases for Up and Wp−1 follows from Step 2. ■
Now we state the method of computing homology groups.

Theorem 4.1.16. The homology groups of a finite complex K are effectively computable.

Proof: By the preceding theorem, we have a decomposition

Cp (K) = Up ⊕ Zp = Up ⊕ Vp ⊕ Wp ,

where Zp is the group of p-cycles and Wp is the group of weak p-boundaries. Now

Hp (K) = Zp /Bp ∼= Vp ⊕ (Wp /Bp ) ∼= (Zp /Wp ) ⊕ (Wp /Bp ).

By the proof of the previous theorem, Zp /Wp is free and Wp /Bp is the torsion subgroup.
Orient the simplices of K and choose bases for the groups Cp (K). The matrix of ∂p has entries in the set
{−1, 0, 1}. Reducing it to Smith normal form and looking at Step 2 of the proof of the previous theorem, we have
these facts:

1. The rank of Zp equals the number of zero columns.

2. The rank of Wp−1 equals the number of nonzero rows.

3. There is an isomorphism Wp−1 /Bp−1 ∼= Zb1 ⊕ Zb2 ⊕ · · · ⊕ Zbk .

So we get the torsion coefficients of K in dimension p − 1 from the Smith normal form of the matrix of ∂p . This
normal form also gives the rank of Zp , and the normal form of ∂p+1 gives the rank of Wp . The difference of these
numbers, the rank of Zp minus the rank of Wp , is the Betti number of K in dimension p. ■

Example 4.1.6. Suppose we want to compute H1 (K) and we have the following matrices, already reduced to Smith
normal form:

∂2 = \begin{pmatrix} 1&0&0&0&0&0&0 \\ 0&3&0&0&0&0&0 \\ 0&0&6&0&0&0&0 \\ 0&0&0&12&0&0&0 \\ 0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0 \end{pmatrix},

and

∂1 = \begin{pmatrix} 1&0&0&0&0&0&0&0 \\ 0&5&0&0&0&0&0&0 \\ 0&0&0&0&0&0&0&0 \end{pmatrix}.

Then the rank of Z1 is 6. The rank of W1 , taken from ∂2 , is 4. So the Betti number is 6 − 4 = 2. The torsion
coefficients, from the matrix for ∂2 , are 3, 6, and 12, so

H1 (K) ∼= Z ⊕ Z ⊕ Z3 ⊕ Z6 ⊕ Z12 .
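The bookkeeping in this example is easy to automate. A sketch (function name and matrix conventions are mine) that reads the Betti number and torsion coefficients straight off the two Smith normal forms:

```python
def homology_from_snf(d_p, d_p1, rank_Cp):
    """Read off H_p from Smith normal forms of the boundary matrices.

    rank Z_p = rank C_p minus the number of nonzero columns of d_p;
    rank W_p = number of nonzero diagonal entries of d_{p+1};
    torsion coefficients = diagonal entries of d_{p+1} larger than 1."""
    nonzero_cols = sum(1 for j in range(len(d_p[0]))
                       if any(row[j] for row in d_p))
    rank_Zp = rank_Cp - nonzero_cols
    diag = [d_p1[i][i] for i in range(min(len(d_p1), len(d_p1[0])))
            if d_p1[i][i]]
    betti = rank_Zp - len(diag)
    torsion = [b for b in diag if b > 1]
    return betti, torsion

# The matrices of Example 4.1.6, already in Smith normal form.
d2 = [[1, 0, 0, 0, 0, 0, 0], [0, 3, 0, 0, 0, 0, 0], [0, 0, 6, 0, 0, 0, 0],
      [0, 0, 0, 12, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
d1 = [[1, 0, 0, 0, 0, 0, 0, 0], [0, 5, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
print(homology_from_snf(d1, d2, rank_Cp=8))   # (2, [3, 6, 12])
```

So the answer Z ⊕ Z ⊕ Z3 ⊕ Z6 ⊕ Z12 can be read off as (Betti number, torsion coefficients) = (2, [3, 6, 12]).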

4.1.5 Homomorphisms Induced by Simplicial Maps


I promised you earlier to produce a functor from topological spaces to groups in each dimension. We have the
groups now, but we need to take maps between spaces and turn them into homomorphisms between the groups. We
would like to do this for any continuous map, but we are not quite there yet. Earlier, though, we talked about simplicial
maps. From now on, when we talk about a simplicial map f : K → L, we mean that f is a continuous map of |K|
into |L| that maps each simplex of K onto a simplex of L of the same or lower dimension. So f maps each vertex of
K to a vertex of L, and this map equals the simplicial map we defined in Theorem 4.1.2.

Definition 4.1.20. Let f : K → L be a simplicial map. If [v0 , · · · , vp ] is a simplex of K, then the points f (v0 ), · · · , f (vp )
span a simplex of L. Define a homomorphism f# : Cp (K) → Cp (L) by f# ([v0 , · · · , vp ]) = [f (v0 ), · · · , f (vp )] if all
of the image vertices are distinct, and f# ([v0 , · · · , vp ]) = 0 otherwise. f# is called the chain map induced by f .
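If we fix an ordering of the vertices and store each oriented simplex as a sorted tuple, f_# on a single simplex can be computed with a permutation sign. A sketch (the dictionary representation of the vertex map is my own convention):

```python
def chain_map(f, simplex):
    """Image of one oriented simplex under the chain map induced by the
    vertex map f (a dict).  Returns (sign, sorted image simplex), or None
    when two vertices collapse and the image chain is 0."""
    image = [f[v] for v in simplex]
    if len(set(image)) < len(image):
        return None                       # degenerate: f_# sends it to 0
    # The parity of the permutation sorting the image gives the orientation.
    inversions = sum(1 for i in range(len(image))
                     for j in range(i + 1, len(image)) if image[i] > image[j])
    sign = -1 if inversions % 2 else 1
    return sign, tuple(sorted(image))

f = {"a": 0, "b": 2, "c": 1}                    # a vertex map on {a, b, c}
print(chain_map(f, ("a", "b", "c")))            # (-1, (0, 1, 2)): one swap
print(chain_map({"a": 0, "b": 0}, ("a", "b")))  # None: vertices collapse
```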

The following important theorem can be proved from the above definition. Care must be taken when the vertices of the
image are not distinct.

Theorem 4.1.17. The homomorphism f# commutes with ∂. Therefore, f# induces a homomorphism f∗ : Hp (K) →
Hp (L). The chain map f# also commutes with the augmentation map ϵ, so it induces a homomorphism f∗ of reduced
homology groups.

Theorem 4.1.18. 1. Let i : K → K be the identity simplicial map. Then i∗ : Hp (K) → Hp (K) is the identity
homomorphism for all p.

2. Let f : K → L and g : L → M be simplicial maps. Then (gf )∗ = g∗ f∗ . This makes homology a functor from
simplicial complexes and simplicial maps to groups and homomorphisms.

It turns out that more than one simplicial map can lead to the same homomorphism on homology groups. To see
when this happens we need the following definition:

Definition 4.1.21. Let f, g : K → L be simplicial maps. Suppose that for each p there is a homomorphism
D : Cp (K) → Cp+1 (L) such that

∂D + D∂ = g# − f# .

Then D is called a chain homotopy between f# and g# .
Theorem 4.1.19. If there is a chain homotopy between f# and g# , then the induced homomorphisms f∗ and g∗ for
both ordinary and reduced homology are equal.
Proof: Let z be a p-cycle of K. Then g# (z) − f# (z) = ∂D(z) + D∂(z) = ∂D(z) + 0. So f# (z) and g# (z) differ
by a boundary and are in the same homology class. Thus f∗ = g∗ . ■
We conclude with a condition that allows us to find a chain homotopy.
Definition 4.1.22. Let f, g : K → L be simplicial maps. These maps are contiguous if for each simplex [v0 , · · · , vp ] of
K, the vertices f (v0 ), · · · , f (vp ), g(v0 ), · · · , g(vp ) span a simplex of L. This simplex can have dimension anywhere
between 0 and 2p + 1 depending on how many are distinct.
The proof of this result is lengthy but not too difficult. See [110].
Theorem 4.1.20. Let f, g : K → L be contiguous simplicial maps. Then there is a chain homotopy between f# and
g# , so f∗ and g∗ for both ordinary and reduced homology are equal.

4.1.6 Topological Invariance


One thing we would like is for homeomorphic topological spaces to have the same homology groups and for
homeomorphisms to translate to isomorphisms on homology groups. This does turn out to be true, but unfortunately,
having the same homology groups does not imply that two spaces are homeomorphic. This issue led to the discovery
of cohomology and Steenrod operations which provide additional algebraic structure.
For now, we know that simplicial complexes with invertible simplicial maps have the same homology groups.
So we have some issues to consider. First of all, can any topological space be represented as a simplicial complex?
The answer is no. For this to be true, the space needs to be triangulable which means it is homeomorphic to a
simplicial complex. Almost all spaces you could name are triangulable and you will never have to deal with this issue
in topological data analysis. In Section 4.3.2, I will describe the only counterexample I have ever seen.
The hard part is moving from simplicial maps to continuous maps. Munkres [110] gives the entire proof of
topological invariance (i.e. homeomorphic spaces lead to isomorphic homology groups) in sections 14-18. As it is
long and complicated, I won’t repeat it here but just describe some of the ideas behind it.
We will assume all of our spaces are the underlying spaces or polytopes of simplicial complexes.
Definition 4.1.23. If K is a simplicial complex, then the star of a vertex v ∈ K denoted St(v) is the union of the
interiors of all simplices in K that have v as a vertex.
Definition 4.1.24. Let h : |K| → |L| be a continuous map. If f : K → L is a simplicial map such that h(St(v)) ⊂
St(f (v)) for each vertex v ∈ K, then f is called a simplicial approximation to h.
The map f is an approximation to h in the sense that given x ∈ |K|, there is a simplex in L that contains both f (x)
and h(x).
To find such an approximation, we subdivide K into smaller simplices.
Definition 4.1.25. Let K be a complex. A complex K ′ is a subdivision of K if each simplex of K ′ is contained in a
simplex of K and each simplex of K equals the union of finitely many simplices of K ′ .
Now recall that a simplex σ with vertices x0 , · · · , xk is the set of all points x = t0 x0 + · · · + tk xk where
t0 + · · · + tk = 1 and each ti ≥ 0. The ti are called the barycentric coordinates. The point where they are all equal is

σ̂ = (x0 + · · · + xk )/(k + 1).

σ̂ is called the barycenter of σ.


Now we divide up K into a subdivision sd(K) as follows. First, include all of the vertices in sd(K) that were
originally in K. We then subdivide the p-skeletons for p = 1, 2, · · · . Suppose we have already subdivided the p-skeleton of
K. Then for any (p + 1)-simplex σ of K, find its barycenter σ̂. Then for every p-simplex τ ⊂ σ with vertices
v0 , · · · , vp in sd(K) we include the (p + 1)-simplex [σ̂, v0 , · · · , vp ]. When we are done, the new complex sd(K) is
called the first barycentric subdivision of K. We can do the process a second time creating the second barycentric
subdivision sd2 (K). Continuing in this way, we can make the simplices in K as small as we want. Figure 4.1.7
demonstrates the process and shows the first 2 subdivisions of the complex on the left.

Figure 4.1.7: Barycentric Subdivision [110]
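The combinatorics of sd(K) can be checked in a few lines of Python: the vertices of sd(σ) are the barycenters of the faces of σ, and a p-simplex of sd(σ) corresponds to a strictly increasing chain of p + 1 faces. This is a sketch with hypothetical helper names, not code from the text:

```python
from itertools import combinations

def faces(simplex):
    """All nonempty faces of a simplex, as sorted tuples of vertices."""
    return [f for k in range(1, len(simplex) + 1)
            for f in combinations(sorted(simplex), k)]

def barycentric_subdivision_counts(simplex):
    """Count p-simplices of sd(σ): a p-simplex of sd(σ) is a strictly
    increasing chain of p + 1 faces of σ (each face contributes its barycenter)."""
    fs = faces(simplex)
    counts = {}

    def chains(prefix):
        counts[len(prefix) - 1] = counts.get(len(prefix) - 1, 0) + 1
        for f in fs:
            if set(prefix[-1]) < set(f):   # extend by a strictly larger face
                chains(prefix + [f])

    for f in fs:
        chains([f])
    return counts

# Subdividing one triangle yields 7 vertices, 12 edges, and 6 triangles.
assert barycentric_subdivision_counts((0, 1, 2)) == {0: 7, 1: 12, 2: 6}
```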

The idea in finding a simplicial approximation to a continuous map is to subdivide K until each star of a vertex in
K is in one of the sets h−1 (St(w)) where w is a vertex of L. This gives the following:
Theorem 4.1.21. The finite simplicial approximation theorem: Let K and L be complexes where K is finite. Given
a continuous map h : |K| → |L| there is an integer N such that h has a simplicial approximation f : sdN (K) → L.
With this result, we are now ready to show topological invariance.
Definition 4.1.26. Let K and L be simplicial complexes and let h : |K| → |L| be continuous. Choose a subdivision
K ′ of K such that h has a simplicial approximation f : K ′ → L. So K ′ = sdN (K) for some N if K is finite. Let
C(K) be the chain complex consisting of the chain groups of K and the usual boundaries. Let λ : C(K) → C(K ′ ) be the chain
map representing subdivision. (Munkres [110] Section 17 makes this precise.) Then the homomorphism induced by h
is h∗ : Hp (K) → Hp (L) defined by h∗ = f∗ λ∗ .
Munkres shows that these maps have all the nice properties we had with simplicial maps.
Theorem 4.1.22. The identity map i : |K| → |K| induces the identity homomorphism i∗ : Hp (K) → Hp (K). If
h : |K| → |L| and k : |L| → |M | are continuous, then (kh)∗ = k∗ h∗ . The same holds for reduced homology.
Theorem 4.1.23. Topological invariance of homology groups: If h : |K| → |L| is a homeomorphism then h∗ :
Hp (K) → Hp (L) is an isomorphism. The same holds for reduced homology.
This is immediate from the previous theorem by composing h and h−1 which exists since h is a homeomorphism.
I will remark that everything holds for relative homology as well.
We can even do better than this. I will show this result holds for the weaker condition of homotopy equivalence. I
will discuss this next.
Definition 4.1.27. If X and Y are topological spaces, two continuous maps h, k : X → Y are homotopic if there is a
continuous map F : X × I → Y such that F (x, 0) = h(x) and F (x, 1) = k(x). The map F is called a homotopy of
h to k.
We can think of F as continuously deforming h into k. The set of all maps from X to Y is partitioned into
equivalence classes called homotopy classes where two maps are in the same class if they are homotopic to each other.
As a preview of chapter 9, the elements of the homotopy group πn (X) will be homotopy classes of continuous
maps from S n to X.

Theorem 4.1.24. If h, k : |K| → |L| are homotopic, then h∗ , k∗ : Hp (K) → Hp (L) are equal. The same holds for
reduced homology.

For relative homology, we modify the definition of homotopy to include maps of pairs. Let h, k : (|K|, |K0 |) →
(|L|, |L0 |) be maps of pairs where K0 is a subcomplex of K and L0 is a subcomplex of L. This means that they map
|K| to |L| while mapping |K0 | into |L0 |. Then they are homotopic if there is a homotopy H : |K| × I → |L| of h to k
such that H maps |K0 | × I into |L0 |.

Theorem 4.1.25. If h, k are homotopic as maps of pairs of spaces, then h∗ = k∗ as homomorphisms of relative
homology groups.

Definition 4.1.28. Two spaces are homotopy equivalent or have the same homotopy type if there are maps f : X → Y
and g : Y → X such that gf is homotopic to iX and f g is homotopic to iY . The maps f and g are called homotopy
equivalences, and g is a homotopy inverse of f .

Recall that a complex is acyclic if all reduced homology groups are zero.

Definition 4.1.29. A space X is contractible if it has the homotopy type of a single point.

Now you know what every word in my dissertation title [124] means.

Example 4.1.7. The unit ball is contractible. Let F (x, t) = (1 − t)x, for x ∈ B n . Then F (x, 0) = x and F (x, 1) = 0.
Note that the unit ball and the origin are not homeomorphic. The constant map f : B n → 0 is obviously not one-to-one
and so not invertible.

So now we have a stronger result. If two spaces are not necessarily homeomorphic but only homotopy equivalent,
they have isomorphic homology groups. This is immediate from Theorem 4.1.24 and formally stated here:

Theorem 4.1.26. If h : |K| → |L| is a homotopy equivalence, then h∗ is an isomorphism on homology groups. In
particular, if |K| is contractible, then K is acyclic.

Definition 4.1.30. Let A ⊂ X. A retraction of X onto A is a continuous map r : X → A such that r(a) = a for each
a ∈ A. If there is a retraction of X onto A, we say that A is a retract of X. A deformation retraction of X onto A is a
continuous map F : X × I → X such that

1. F (x, 0) = x for x ∈ X.

2. F (x, 1) ∈ A for x ∈ X.

3. F (a, t) = a for a ∈ A.

If such an F exists, then A is called a deformation retract of X.

Every deformation retract is also a retract. To see this, let r(x) = F (x, 1). Then r : X → A and r(a) = a for a ∈ A.
A retract is not necessarily a deformation retract, as the next example shows.

Example 4.1.8. Let X = S 1 (the unit circle) and A = {(0, 1)}, a single point. Let r(x) = (0, 1) for x ∈ S 1 .
Then r is obviously a retraction. But A is not a deformation retract of S 1 . To see this, let X be any topological space
and A a deformation retract of X. If r : X → A is defined by r(x) = F (x, 1) where F is as above and j : A → X is
inclusion, then F is a homotopy between jr and the identity iX of X. So r and j are homotopy inverses of each other and
the homology groups of X and A are isomorphic. In our case, H1 (S 1 ) ∼ = Z and H1 (A) = 0 for A a single point. So
A is a retract of S 1 but not a deformation retract.
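Since βp = dim ker ∂p − rank ∂p+1 over the rationals, the contrast H1 (S 1 ) ∼ = Z versus H1 (point) = 0 can be checked numerically from boundary matrices. A rough Python sketch (the function `betti` and its matrix conventions are my own, and it works with field coefficients Q, so torsion is invisible):

```python
import numpy as np
from itertools import combinations

def betti(maximal, p):
    """β_p over Q: (number of p-simplices) − rank ∂_p − rank ∂_{p+1}."""
    faces = {}
    for simp in maximal:
        for k in range(1, len(simp) + 1):
            for f in combinations(sorted(simp), k):
                faces.setdefault(len(f) - 1, set()).add(f)

    def D(q):  # matrix of ∂_q : C_q → C_{q-1} in bases of sorted simplices
        rows = sorted(faces.get(q - 1, set()))
        cols = sorted(faces.get(q, set()))
        M = np.zeros((len(rows), len(cols)))
        for j, s in enumerate(cols):
            for i in range(len(s)):
                M[rows.index(s[:i] + s[i + 1:]), j] = (-1) ** i
        return M

    n_p = len(faces.get(p, set()))
    rank_dp = np.linalg.matrix_rank(D(p)) if p > 0 and n_p > 0 else 0
    rank_dp1 = np.linalg.matrix_rank(D(p + 1)) if faces.get(p + 1) else 0
    return n_p - rank_dp - rank_dp1

circle = [(0, 1), (1, 2), (0, 2)]   # boundary of a triangle, a model of S^1
point = [(0,)]
assert betti(circle, 1) == 1 and betti(point, 1) == 0   # H_1(S^1) has rank 1
```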

Theorem 4.1.27. The unit sphere S n−1 is a deformation retract of punctured Euclidean space Rn − 0.

Proof: Let X = Rn − 0. Define F : X × I → X by

F (x, t) = (1 − t)x + tx/||x||,


where ||x|| is the standard Euclidean norm, i.e. if x = (x1 , · · · , xn ) then ||x|| = √(x1² + · · · + xn²). Then F moves x
along the line through the origin to its intersection with the unit sphere, so it is a deformation retraction of Rn − 0
onto S n−1 . ■
Now we can finally settle the question about Euclidean spaces of different dimension being homeomorphic.

Theorem 4.1.28. The Euclidean spaces Rm and Rn are not homeomorphic if m ̸= n.

Proof: Assume m, n > 1; otherwise, removing a point disconnects R1 but not the other space. Remove a point
from each of Rm and Rn . Then the resulting spaces are homeomorphic to Rm − 0 and Rn − 0 respectively. But
S m−1 and S n−1 are deformation retracts of these spaces respectively, and so have homology groups isomorphic to them. But
Hm−1 (S m−1 ) ∼ = Z, while Hm−1 (S n−1 ) = 0. So their homology groups differ and they can not be homeomorphic. ■

Note that we could have made a similar argument by adding a point rather than removing one. Adding a point
called the point at infinity to Rn produces the sphere S n . This is called a one point compactification. It is most easily
visualized for n = 2.

Figure 4.1.8: Stereographic Projection

The stereographic projection from S 2 to R2 involves placing the South pole of S 2 at the origin of R2 . (See Figure
4.1.8.) Given any point x ∈ S 2 other than the North pole, we map it to the point of R2 which is on the line containing
the North pole and x. The South pole maps to the origin and the equator maps to the unit circle in R2 . It can be
checked that this map is a homeomorphism between S 2 with the North pole removed and R2 . As you move closer to
the North pole, the line gets more and more horizontal and meets the plane in a point of increasing magnitude, so we can think
of the projection of the North pole as infinity. With this projection and similar ones in every dimension, we see that Rm
and Rn are homeomorphic only if S m and S n have the same homology groups. As we saw above, this is only
true if m = n.
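One convenient set of formulas for the projection (here I take the unit sphere centered at the origin and project from the North pole (0, 0, 1) onto the equatorial plane, a slight variant of the figure's placement) can be checked numerically in Python:

```python
import numpy as np

def stereo(p):
    """Project p ∈ S^2 − {North pole} from (0, 0, 1) onto the plane z = 0."""
    x, y, z = p
    return np.array([x / (1 - z), y / (1 - z)])

def stereo_inv(q):
    """The continuous inverse R^2 → S^2 − {North pole}."""
    u, v = q
    s = u * u + v * v
    return np.array([2 * u, 2 * v, s - 1]) / (s + 1)

p = np.array([1.0, 2.0, 2.0]) / 3.0                       # a point of S^2
assert np.allclose(stereo_inv(stereo(p)), p)              # mutually inverse maps
assert np.allclose(stereo(np.array([0.0, 0.0, -1.0])), [0.0, 0.0])  # South pole → origin
assert np.isclose(np.linalg.norm(stereo(np.array([0.6, 0.8, 0.0]))), 1.0)  # equator → unit circle
```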

4.1.7 Application: Maps of Spheres, Fixed Point Theorems, and the Euler Number
At this point, I can give you an idea of how obstruction theory works. I will make this more precise when we see
cohomology and homotopy. Suppose, we have a simplicial complex K and L is a subcomplex of K. Let g : L → Y
and we want to extend g to a function f : K → Y where f restricted to L is equal to g. We will suppose Y = S n .
Then the idea is to extend g to the vertices of K − L, then the edges, then the 2-simplices, etc. We can always extend g
to the vertices and if Y is path connected, we can extend to the edges. (This is fine if Y is a sphere.) Suppose we have
extended g to the n-simplices. For an n + 1-simplex, f is defined on the boundary which we know is homeomorphic
to S n . If Y = S n , then extending f to the interior would amount to extending a map h : S n → S n to a map
k : B n+1 → S n . Maps between spheres of the same dimension are classified by an integer called the degree. That
will be the first topic in this section.
So if Y = S n , what do we do if we are not at the step where we go from n-simplices to n + 1-simplices? In
general, if Y = S n and we are going from m to m + 1, then f restricted to the boundary of an m + 1 simplex is a map
from S m to S n . The homotopy classes of these maps form the group πm (S n ). As we are about to see, we can extend

to the interior of this simplex if f is homotopic to a constant map which means that f is the zero element in this group.
If n = m, we are in the case described here. If m < n, πm (S n ) = 0, so the extension is always possible. If m > n,
things get harder. The groups πm (S n ) have not all been calculated even for n = 2, and unlike the case of homology,
they are mostly nonzero. As we will see, though, there are many special cases where the problem is tractable and may
be of some interest in data science.
Definition 4.1.31. Let n ≥ 1 and f : S n → S n be a continuous map. If α is one of the two generators of the group
Hn (S n ) ∼ = Z, then f∗ (α) = dα for some integer d, as these are the only homomorphisms from Z to itself. Since
f∗ (−α) = d(−α), it doesn’t matter which generator we pick. The number d is called the degree of f .
As I mentioned above, a constant map can always be extended. If L is a subcomplex of K and g(l) = y0 for l ∈ L
and a fixed y0 in Y , then this map is continuous and we can extend g to the constant map f (k) = y0 for k ∈ K. A
constant map will have degree 0 as we will now show.
Theorem 4.1.29. Degree has the following properties:
1. If f is homotopic to g then deg f = deg g.
2. If f extends to a continuous map h : B n+1 → S n then deg f = 0.
3. The identity map has degree 1.
4. deg(f g) = (deg(f ))(deg(g)).
Proof: Property 1 holds since homotopic maps give rise to the same homomorphisms in homology. Property 2
follows from the fact that f∗ : Hn (S n ) → Hn (S n ) equals the composite

Hn (S n ) −j∗→ Hn (B n+1 ) −h∗→ Hn (S n ),

where j is inclusion. Since B n+1 is acyclic, the composite is the zero homomorphism. Properties 3 and 4 follow from
Theorem 4.1.22. ■
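For S 1 , the degree is just the winding number, and it can be estimated numerically by summing angle increments around a loop. A Python sketch (the function `degree` is my own illustration, treating S 1 as the unit complex numbers):

```python
import numpy as np

def degree(f, n=4096):
    """Estimate deg f for continuous f : S^1 → S^1 by accumulating angle increments."""
    z = np.exp(2j * np.pi * np.arange(n + 1) / n)   # a fine loop around S^1
    w = f(z)
    increments = np.angle(w[1:] / w[:-1])           # each step is a small rotation
    return round(increments.sum() / (2 * np.pi))

assert degree(lambda z: z ** 3) == 3           # z ↦ z^3 wraps three times
assert degree(lambda z: np.conj(z)) == -1      # reflection reverses orientation
assert degree(lambda z: np.ones_like(z)) == 0  # a constant map has degree 0
assert degree(lambda z: -z) == 1               # the antipodal map on S^1: (−1)^{1+1}
```

The last assertion anticipates Theorem 4.1.32 in the case n = 1.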
Theorem 4.1.30. There is no retraction r : B n+1 → S n .
Proof: The retraction r would be an extension of the identity map i : S n → S n . Since i has degree 1 ̸= 0, the
extension can not exist. ■
We can now easily prove an interesting theorem about fixed points, i.e. points where f (x) = x.
Theorem 4.1.31. Brouwer Fixed-Point Theorem: Every continuous map ϕ : B n → B n has a fixed point.
Proof: Suppose ϕ : B n → B n has no fixed point. Then we can define a map h : B n → S n−1 by the equation
h(x) = (x − ϕ(x))/||x − ϕ(x)||,
since we know x − ϕ(x) ̸= 0. (Note that dividing by the norm makes the points have norm 1 so they lie on the surface
of the sphere.) If f : S n−1 → S n−1 is the restriction of h to S n−1 then f has degree 0 by property 2 of Theorem
4.1.29.
But we can show that f has degree 1. Let H : S n−1 × I → S n−1 be a homotopy defined by
H(u, t) = (u − tϕ(u))/||u − tϕ(u)||.
Then the denominator is always nonzero since for t = 1, u ̸= ϕ(u) as ϕ does not have a fixed point, and for t < 1, we
know ||u|| = 1 and ||tϕ(u)|| = t||ϕ(u)|| ≤ t < 1. The map H is a homotopy between f and the identity of S n−1 so
f has degree 1, which is a contradiction. Hence ϕ must have a fixed point. ■
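Brouwer's theorem is purely existential, but for special maps a fixed point can actually be computed. The sketch below (my own toy example) iterates a map that happens to be a contraction of the disk, so simple iteration converges to the fixed point the theorem guarantees:

```python
import numpy as np

def phi(x):
    """A continuous self-map of the unit disk: rotate 90°, shrink, and shift."""
    return 0.4 * np.array([-x[1], x[0]]) + np.array([0.5, 0.0])

x = np.zeros(2)
for _ in range(200):       # contraction factor 0.4, so this converges fast
    x = phi(x)
assert np.allclose(x, phi(x), atol=1e-10)   # a fixed point, as Brouwer guarantees
assert np.linalg.norm(x) <= 1.0             # and it lies in the disk
```

For a general continuous self-map of the disk there is no such iteration; Brouwer only asserts existence.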
Next we will see a few more examples of degrees of certain maps and what can be done with them.

Definition 4.1.32. The antipodal map a : S n → S n is the map defined by the equation a(x) = −x for x ∈ S n . It
takes a point to its antipodal point which lies on the other side of the sphere on a line from x to the center of the sphere.
Theorem 4.1.32. Let n ≥ 1. The degree of the antipodal map a : S n → S n is (−1)n+1 .
The idea of the proof is that if we reflect a coordinate in Rn+1 by multiplying it by (−1), the reflection map has
degree -1. Then just compose the n + 1 reflection maps corresponding to each coordinate. See [110] for the full proof
which has to take into account simplicial approximations.
Theorem 4.1.33. If h : S n → S n has a degree different from (−1)n+1 then h has a fixed point.
Proof: Suppose h has no fixed point. We will prove h is homotopic to a. Since h(x) ̸= x for all x, the points h(x)
and −x are never antipodal, so we can form a homotopy which moves h(x) to −x along the shorter great circle path. The equation is

H(x, t) = ((1 − t)h(x) + t(−x))/||(1 − t)h(x) + t(−x)||.
So we are done as long as the denominator is never 0. Suppose we had (1 − t)h(x) = tx for some x and t. Then
taking norms of both sides, we would get 1 − t = t = 1/2. So h(x) = x which is a contradiction since h has no fixed
point. So h is homotopic to a and has degree (−1)n+1 . ■
Theorem 4.1.34. If h : S n → S n has a degree different from 1, then h carries some point x to its antipode −x.
Proof: In this case, if a is the antipodal map, then ah has degree different from (−1)n+1 , so ah has a fixed point
x. Then a(h(x)) = x, so h(x) = −x. ■
An immediate consequence is a famous theorem about tangent vector fields. These fields associate a tangent vector
to each point on a sphere. A tangent vector at a point x must be perpendicular to the line from the center of the sphere
to x so the inner product with x must be zero.
Theorem 4.1.35. S n has a non-zero tangent vector field if and only if n is odd.
This shows that since the surface of your head has the homotopy type of S 2 , you can’t brush your hair so that every
hair is lying perfectly flat.
Proof: If n is odd, let n = 2k − 1. Then for x in S n let v(x) = v(x1 , · · · , x2k ) = (−x2 , x1 , −x4 , x3 , · · · , −x2k , x2k−1 ).
Then the inner product ⟨x, v(x)⟩ = x1 (−x2 ) + x2 x1 + · · · + x2k−1 (−x2k ) + x2k x2k−1 = 0. So v(x) is perpendicular
to x and the {v(x)} form a nonzero tangent vector field to S n .
Conversely, if the {v(x)} form a nonzero tangent vector field to S n , then for each x, h(x) = v(x)/||v(x)|| is a map of S n into
S n . Since h(x) is perpendicular to x, h(x) can not equal x or −x. Since h has no fixed point, its degree is (−1)n+1
by Theorem 4.1.33, and since h carries no point to its antipode, its degree is 1 by Theorem 4.1.34. So (−1)n+1 = 1
and n must be odd. ■
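The field v(x) from the proof is easy to check numerically. A short Python sketch (the function `v` implements the coordinate-swapping formula above for points of S 3 ⊂ R4 ):

```python
import numpy as np

def v(x):
    """The tangent field (−x2, x1, −x4, x3, · · ·) on an odd sphere S^(2k−1)."""
    out = np.empty_like(x)
    out[0::2] = -x[1::2]   # even slots get minus the following odd coordinate
    out[1::2] = x[0::2]
    return out

x = np.array([1.0, 2.0, 2.0, 4.0])
x /= np.linalg.norm(x)                  # a point of S^3
assert abs(np.dot(x, v(x))) < 1e-12     # tangent: perpendicular to x
assert np.linalg.norm(v(x)) > 0.0       # nowhere zero, since ||v(x)|| = ||x||
```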
Munkres introduces the Euler number at this point as part of a discussion of more general fixed point theorems. I
won’t get into those here but you should know what an Euler number is as it is a useful topological invariant.
Definition 4.1.33. Let K be a finite complex. Then the Euler number or Euler characteristic χ(K) is defined by the
equation

χ(K) = Σp (−1)p rank(Cp (K)).
For example, if K is 2-dimensional then χ(K) = #vertices − #edges + #faces. Munkres [110] proves the
following:
Theorem 4.1.36. Let K be a finite complex. Let βp = rank(Hp (K)/Tp (K)), i.e. the Betti number of K in dimension
p. Then

χ(K) = Σp (−1)p βp .

We note that for surfaces, there is a closely related idea called the genus. The relationship is χ(X) = 2 − 2g where
g is the genus of X. If X = S 2 then β0 = β2 = 1 and β1 = 0, so χ(X) = 2 and g = 0. For the torus T , β0 = β2 = 1
and β1 = 2, so χ(T ) = 0 and g = 1. Generally, g is equal to the number of handles we attach to a sphere. So a donut
and a coffee cup both have genus one and Euler number 0, and that is the reason a topologist can’t tell them apart.
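The alternating-sum formula is easy to compute once all faces of a complex are generated. A Python sketch (the function name `euler_characteristic` is my own), applied to the boundary of the octahedron, a triangulation of S 2 :

```python
from itertools import combinations

def euler_characteristic(maximal_simplices):
    """χ(K) = Σ_p (−1)^p · #(p-simplices), generating every face of K."""
    faces = set()
    for simp in maximal_simplices:
        for k in range(1, len(simp) + 1):
            faces.update(combinations(sorted(simp), k))
    return sum((-1) ** (len(f) - 1) for f in faces)

# The boundary of the octahedron triangulates S^2: 6 vertices, 12 edges, 8 faces.
octahedron = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
              (0, 2, 5), (2, 1, 5), (1, 3, 5), (3, 0, 5)]
assert euler_characteristic(octahedron) == 2   # χ(S^2) = 2, so genus 0
```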

4.2 Eilenberg Steenrod Axioms


In their textbook [47], Eilenberg and Steenrod took an axiomatic approach to homology theory. They started with
a series of axioms that a homology theory needs to satisfy and showed that simplicial complexes and simplicial
maps satisfy them. In this section, I will list the axioms. First, though, there are two preliminaries that will help
to understand them. The first is the long exact homology sequence of a pair. A related idea is the Mayer-Vietoris
sequence which computes the homology of the union of two complexes using another long exact sequence. Finally,
we will state the axioms and see how they relate to simplicial homology.

4.2.1 Long Exact Sequences and Zig-Zagging


Recall that in Section 3.2, we discussed exact sequences of abelian groups in which the kernel of each map is the
image of the previous map. A sequence of 5 groups whose first and last ones are 0 is called a short exact sequence.
A sequence indexed by the integers is called a long exact sequence. An important long exact sequence involves the
homology of a pair. I will state the version for simplicial homology. One of the Eilenberg-Steenrod axioms will
generalize this. Later, we will see variants for cohomology and homotopy.

Theorem 4.2.1. The long exact homology sequence of a pair: Let K be a complex, K0 a subcomplex. Then there
is a long exact sequence

· · · −→ Hp (K0 ) −i∗→ Hp (K) −j∗→ Hp (K, K0 ) −∂∗→ Hp−1 (K0 ) −→ · · · ,

where i : K0 → K and j : (K, ∅) → (K, K0 ) are inclusions.

The third map ∂∗ is called the homology boundary homomorphism. A cycle z in Cp (K, K0 ) is represented by a chain
d ∈ Cp (K) whose boundary ∂d is carried by K0 . So if {·} denotes homology class, define ∂∗ {z} = {∂d}.
This result will be an immediate consequence of the more general “zig-zag lemma” which I will now describe. It
featured in the only movie I have ever seen depicting algebraic topology: a Dutch movie called “Antonia’s Line”. Her
granddaughter, a math professor, was writing it on the board. Definitely the most important part.

Definition 4.2.1. Let C, D, and E be chain complexes. Let 0 be the trivial chain complex whose groups vanish in every
dimension. Let ϕ : C → D and ψ : D → E be chain maps. (Remember this means they commute with boundaries.)
We say the sequence

0 −→ C −ϕ→ D −ψ→ E −→ 0

is a short exact sequence of chain complexes if

0 −→ Cp −ϕ→ Dp −ψ→ Ep −→ 0

is an exact sequence of groups for every p.

In the previous theorem, we had the exact sequence of complexes

0 −→ C(K0 ) −i→ C(K) −π→ C(K, K0 ) −→ 0.

Here i is inclusion and π is projection. The sequence is exact because Cp (K, K0 ) = Cp (K)/Cp (K0 ).

Theorem 4.2.2. Zig-zag Lemma: Suppose C, D, and E are chain complexes such that the sequence

0 −→ C −ϕ→ D −ψ→ E −→ 0

is exact. Then there is a long exact homology sequence

· · · −→ Hp (C) −ϕ∗→ Hp (D) −ψ∗→ Hp (E) −∂∗→ Hp−1 (C) −→ · · · ,

where ∂∗ is induced by the boundary operator in D.

Figure 4.2.1: Zig-zag Lemma [110]

The proof is long but not that hard. It is an example of what is called ”diagram chasing”. Refer to Figure 4.2.1
from Munkres [110]. You can find the full proof there. I will list the steps and write out step 1. The steps are:
1. Define ∂∗ .
2. Show that ∂∗ is a well-defined homomorphism, i.e. two cycles in Ep that are homologous are sent to homologous
elements of Cp−1 by ∂∗ .
3. Show exactness at Hp (D).
4. Show exactness at Hp (E).
5. Show exactness at Hp−1 (C).
As an example, to define ∂∗ : Let ep ∈ Ep be a cycle. Since ψ is onto by exactness, choose dp ∈ Dp so that
ψ(dp ) = ep . The element ∂D (dp ) ∈ Dp−1 is in ker ψ since ψ(∂D (dp )) = ∂E (ψ(dp )) = ∂E (ep ) = 0, as we
assumed that ep was a cycle. So there is an element cp−1 ∈ Cp−1 such that ϕ(cp−1 ) = ∂D (dp ) since ker ψ = im ϕ.
This element is unique since ϕ is injective. Also cp−1 is a cycle since ϕ(∂C (cp−1 )) = ∂D (ϕ(cp−1 )) = ∂D ∂D (dp ) = 0,
so ∂C (cp−1 ) = 0 since ϕ is injective. So if {·} denotes homology class, let ∂∗ {ep } = {cp−1 }.
So this gives you a taste of how the rest of the proof goes. Look at Figure 4.2.1 for help.

4.2.2 Mayer-Vietoris Sequences


In this very short section, I will give another useful exact sequence for computing homology.
Theorem 4.2.3. Let K be a complex and let K0 and K1 be subcomplexes such that K = K0 ∪ K1 . Let A = K0 ∩ K1 .
Then there is an exact sequence
· · · −→ Hp (A) −→ Hp (K0 ) ⊕ Hp (K1 ) −→ Hp (K) −→ Hp−1 (A) −→ · · · ,

called the Mayer-Vietoris sequence of (K0 , K1 ). There is a similar sequence in reduced homology if A ̸= ∅.

Proof: I will give the proof for ordinary homology. See [110] for the modifications in the reduced homology case.
We want to construct a short exact sequence of chain complexes

0 −→ C(A) −ϕ→ C(K0 ) ⊕ C(K1 ) −ψ→ C(K) −→ 0

and apply the zig-zag lemma.

First, we need to define the chain complex in the middle. Its chain group in dimension p is Cp (K0 ) ⊕ Cp (K1 ).
The boundary operator is defined by ∂ ′ (d, e) = (∂0 d, ∂1 e), where ∂0 and ∂1 are the boundary operators for C(K0 )
and C(K1 ) respectively.
Next, we need to define the chain maps ϕ and ψ. Consider the commutative diagram where all maps are inclusions:
i : A → K0 , j : A → K1 , k : K0 → K, l : K1 → K, and m : A → K, so that m = ki = lj.

Define homomorphisms ϕ(c) = (i# (c), −j# (c)) and ψ(d, e) = k# (d) + l# (e). These are both chain maps.
To check exactness, ϕ is injective since i# and j# are inclusions of chains. To see that ψ is surjective, let d ∈
Cp (K), write d as a sum of oriented simplices, and let d0 be the sum of those carried by K0 . Then d − d0 is carried by K1 and
ψ(d0 , d − d0 ) = d.
Now check exactness at the middle term. Note that ψϕ(c) = m# (c)−m# (c) = 0. Conversely, if ψ(d, e) = 0, then
d = −e as chains in K. Since d is carried by K0 and e is carried by K1 , they must both be carried by K0 ∩ K1 = A.
So (d, e) = (d, −d) = ϕ(d) and we are done.
The homology of the middle chain complex in dimension p is

ker ∂ ′ /im ∂ ′ = (ker ∂0 ⊕ ker ∂1 )/(im ∂0 ⊕ im ∂1 ) ∼ = Hp (K0 ) ⊕ Hp (K1 ).

The result now follows from the zig-zag lemma. ■


As an application, recall that the cone of a space is the result of taking an external point and including all of the
lines from that point to a point of the original space. For a simplicial complex K, if w ̸∈ K, then if v0 , · · · , vp is a
p-simplex of K, we add the (p + 1)-simplex [w, v0 , · · · , vp ] to the cone. Munkres writes the cone as w ∗ K. Cones
basically plug up holes and are contractible, which means they are also acyclic.

Definition 4.2.2. Let K be a complex and w0 ∗ K and w1 ∗ K be two cones of K whose polytopes intersect in |K|.
Then S(K) = (w0 ∗ K) ∪ (w1 ∗ K) is called the suspension of K.

I briefly mentioned suspension in Section 2.4 and Figure 2.4.4. The definition here is equivalent to the one we
gave there but now stated in terms of simplicial complexes. I also mentioned that suspension raises the dimension of a
sphere by one. (Hold two ice cream cones together with their wide ends touching.) Then the homology in dimension
p of the suspension of a sphere is equal to the homology in dimension p − 1 of the original sphere. This is true even
for other spaces as we now show.

Theorem 4.2.4. Let K be a complex. Then for all p there is an isomorphism

H̃p (S(K)) → H̃p−1 (K).



Proof: Let K0 = w0 ∗ K and K1 = w1 ∗ K. Then S(K) = K0 ∪ K1 and K0 ∩ K1 = K. Using the
Mayer-Vietoris sequence, we get

H̃p (K0 ) ⊕ H̃p (K1 ) −→ H̃p (S(K)) −→ H̃p−1 (K) −→ H̃p−1 (K0 ) ⊕ H̃p−1 (K1 ).

Since cones are acyclic, the terms at the ends are zero, so the middle map is an isomorphism. ■
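Combinatorially, the suspension just adds each of the two cone points to every simplex. The shift H̃p (S(K)) ∼ = H̃p−1 (K) forces χ(S(K)) = 2 − χ(K), which we can check in Python (the helper names are my own; cone points are passed as fresh integer labels):

```python
from itertools import combinations

def euler(maximal):
    """χ via the alternating count of all faces of the complex."""
    faces = set()
    for s in maximal:
        for k in range(1, len(s) + 1):
            faces.update(combinations(sorted(s), k))
    return sum((-1) ** (len(f) - 1) for f in faces)

def suspension(maximal, w0, w1):
    """S(K): two cones on K, adding each cone point to every maximal simplex."""
    return [tuple(s) + (w,) for s in maximal for w in (w0, w1)]

circle = [(0, 1), (1, 2), (0, 2)]     # a model of S^1, with χ = 0
susp = suspension(circle, 3, 4)       # combinatorially, the octahedron
assert euler(circle) == 0
assert euler(susp) == 2 - euler(circle)   # χ(S(K)) = 2 − χ(K)
```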
I will end by mentioning that one interesting fact about the Austrian mathematician Leopold Vietoris was his
unusually long lifespan. He died in 2002, just under two months before his 111th birthday.

4.2.3 The Axiom List or What is a Homology Theory?


As I mentioned earlier, in their textbook [47], Eilenberg and Steenrod defined a homology theory as one satisfying
a particular set of axioms. This implies that there could be multiple homology theories. That is in fact true. I will
list the axioms here and spend the rest of the chapter briefly discussing two important alternatives to the simplicial
homology theory we have already described. Homology theories work with special pairs of topological spaces and
subspaces called admissible pairs.
Definition 4.2.3. Let A be a class of pairs (X, A) of topological spaces such that:
1. If (X, A) belongs to A, then so do (X, X), (X, ∅), (A, A), and (A, ∅).
2. If (X, A) belongs to A, then so does (X × I, A × I).
3. There is a one point set P such that (P, ∅) is in A.
Then A is called an admissible class of spaces for a homology theory.
We now state the axioms themselves.
Definition 4.2.4. If A is a set of admissible pairs, a homology theory on A consists of three functions:
1. A function Hp defined for each integer p and each pair (X, A) in A whose value is an abelian group.
2. A function for each integer p that assigns to a continuous map h : (X, A) → (Y, B) (recall that this means
h(A) ⊂ B), a homomorphism
h∗ : Hp (X, A) → Hp (Y, B).

3. A function for each integer p that assigns to each pair (X, A) ∈ A a homomorphism

∂∗ : Hp (X, A) → Hp−1 (A),

where A denotes the pair (A, ∅).


These functions satisfy the following axioms where all pairs of spaces are in A:
− Axiom 1: If i is the identity, then i∗ is the identity.
− Axiom 2: (kh)∗ = k∗ h∗ .
− Axiom 3: If f : (X, A) → (Y, B), then the following diagram commutes:

Hp (X, A) −f∗→ Hp (Y, B)
  ↓ ∂∗                ↓ ∂∗
Hp−1 (A) −(f |A)∗→ Hp−1 (B)

− Axiom 4: (Exactness Axiom) The sequence

· · · −→ Hp (A) −i∗→ Hp (X) −π∗→ Hp (X, A) −∂∗→ Hp−1 (A) −→ · · ·

is exact where i : A → X and π : X → (X, A) are inclusion maps.

− Axiom 5: (Homotopy Axiom) If h, k : (X, A) → (Y, B) are homotopic then h∗ = k∗ .

− Axiom 6: (Excision Axiom) Given (X, A), let U be an open subset of X such that Ū ⊂ A◦ (i.e. the closure of U
is contained in the interior of A). Then if (X − U, A − U ) is admissible, then inclusion induces an isomorphism

Hp (X − U, A − U ) ∼
= Hp (X, A).

− Axiom 7: (Dimension Axiom) If P is a one point space then Hp (P ) = 0 for p ̸= 0, and H0 (P ) ∼ = Z. If we are
using homology with coefficients in an arbitrary abelian group G, then H0 (P ) ∼ = G.

− Axiom 8: (Axiom of Compact Support) If α ∈ Hp (X, A), there is an admissible pair (X0 , A0 ) with X0 ⊂ X
and A0 ⊂ A compact such that α is in the image of the homomorphism Hp (X0 , A0 ) → Hp (X, A) induced by
inclusion.

Note that there are variant homology theories where Axiom 7 may fail and a point may have nonzero homology
in dimensions other than zero. Examples of this are K-theory and cobordism. Neither of them will be covered in this
book. Such a theory is called an extraordinary homology theory. One book that briefly discusses both cobordism and
K-theory is [96].
As you might expect, simplicial homology is an actual homology theory, but there are some subtleties in showing
this. I will refer you to Munkres [110] for most of the details, but we should at least define what type of pair corresponds
to an admissible pair in simplicial homology theory. What we need is called a triangulable pair.

Definition 4.2.5. Let A be a subspace of X. A triangulation of the pair (X, A) is a complex K and a subcomplex K0
along with a homeomorphism h : (|K|, |K0 |) → (X, A). If such a triangulation exists, then (X, A) is a triangulable
pair. If A is empty, then X is a triangulable space.

4.3 Singular and Cellular Homology


In this final section of the chapter I will briefly describe two other important homology theories. Singular homology
has a much nicer theory, but it is hard to visualize or actually compute anything. Unlike simplicial homology, though,
it is very easy to prove topological invariance and the only condition on (X, A) is that A is a subspace of X. Cellular
homology is a good middle ground. It is more flexible than simplicial homology and simple shapes are easier to
construct. Best of all, you don’t have to worry about tripping over cords when you are using it. Finally, I will use
cellular homology to compute the homology of the projective spaces. These will turn out to be very important for the
construction of Steenrod squares.

4.3.1 Singular Homology


Start with the space R∞ , whose points have coordinates indexed by the positive integers with only finitely many
nonzero components. Let ∆p be the p-simplex in R∞ with vertices e0 , · · · , ep such that e0 is the origin and ei has a one
in the i-th coordinate and zeros elsewhere. Then ∆p is called the standard p-simplex. If X is a topological space,
then a singular p-simplex is a continuous map T : ∆p → X. Note that T has no other restrictions and can even be a
constant map.

The free abelian group generated by all singular p-simplices is called the singular chain group of X in dimension
p and denoted Sp (X). You can already tell why this would not be practical for computation. In most cases, this group
would be enormous.
To define faces of a simplex we need a special map called the linear singular simplex. If a0 , · · · , ap are points in
R∞ which are not necessarily independent, then we have a map l : ∆p → R∞ that takes ei to ai for i = 0, 1, · · · , p.
It is defined by

\[ l(x_1, \cdots, x_p, 0, \cdots) = a_0 + \sum_{i=1}^{p} x_i (a_i - a_0). \]

We denote it by l(a0 , · · · , ap ). Then l(e0 , · · · , ep ) is the inclusion of ∆p into R∞ . If we leave one vertex out, we
have a map l(e0 , · · · , êi , · · · , ep ) which takes ∆p−1 onto the face e0 e1 · · · ei−1 ei+1 · · · ep of ∆p .
Now the ith face of the simplex T : ∆p → X is T ◦ l(e0 , · · · , êi , · · · , ep ). So we can define a boundary
homomorphism ∂ : Sp (X) → Sp−1 (X) by

\[ \partial T = \sum_{i=0}^{p} (-1)^i \, T \circ l(e_0, \cdots, \hat{e}_i, \cdots, e_p). \]

If f : X → Y is a continuous map, then define f# : Sp (X) → Sp (Y ) by f# (T ) = f ◦ T.


Theorem 4.3.1. The homomorphism f# commutes with ∂ and ∂∂ = 0.
We now have a chain complex with groups Sp (X) and a boundary homomorphism ∂ so we can define homology
groups in the usual way.
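The identity ∂∂ = 0 can be checked formally, since the boundary formula acts only on vertex orderings. Below is a minimal Python sketch (the Counter-based chain representation and the function name are my own choices, purely for illustration) that applies the alternating face sum twice to a 3-simplex and confirms that everything cancels:

```python
from collections import Counter

def boundary(chain):
    """Formal boundary of a chain, written as a Counter mapping vertex
    tuples to integer coefficients; the i-th face omits vertex i and
    carries the sign (-1)^i, mirroring the formula for the singular
    boundary homomorphism."""
    out = Counter()
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            out[face] += (-1) ** i * coeff
    # keep only nonzero coefficients
    return Counter({s: c for s, c in out.items() if c != 0})

c = Counter({(0, 1, 2, 3): 1})
print(boundary(boundary(c)))  # Counter() -- every face cancels, so dd = 0
```

Each (p − 2)-face appears twice in ∂∂, once with each sign, which is the standard reason ∂∂ = 0.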
We can also define reduced homology by letting the augmentation map ϵ : S0 (X) → Z be defined by ϵ(T ) = 1
for any 0-simplex T . If T is a 1-simplex, then ϵ(∂T ) = ϵ(T ◦ l(e1 )) − ϵ(T ◦ l(e0 )) = 1 − 1 = 0, so ϵ ◦ ∂ = 0 and
reduced homology is well defined.
Theorem 4.3.2. If i : X → X is the identity, then i∗ : Hp (X) → Hp (X) is the identity. If f : X → Y and
g : Y → Z are continuous, then (g ◦ f )∗ = g∗ f∗ . The same holds in reduced homology.
Proof: Both equations actually hold for chains since i# (T ) = iT = T and (gf )# (T ) = (gf )T = g(f T ) =
g# (f# (T )). Now the result follows since everything commutes with ∂.
The preceding result immediately implies topological invariance.
Theorem 4.3.3. If h : X → Y is a homeomorphism, then h∗ is an isomorphism.
The point is that we have proved topological invariance with much less work than in the simplicial case.
I will leave out the rest of the details, but we can define relative homology in the same way as before. If X is a
topological space and A is a subspace, then (X, A) is automatically an admissible pair, and the Eilenberg-Steenrod
Axioms all hold, so singular homology is a true homology theory. Most importantly for a space with the homotopy
type of a simplicial complex, its singular and simplicial homology groups are the same. See [110] for all of the details.

4.3.2 CW Complexes
I will finish my discussion of homology with cellular homology. In many cases, this makes
spaces much easier to describe than in simplicial theory and aids in computations when you don’t have the use of a
computer. Spaces are represented as CW complexes. Instead of simplices, CW complexes are made up of cells, which
are homeomorphic to balls. So a 0-cell is a point, a 1-cell is a line segment, a 2-cell is a solid disk, and a
3-cell is a solid ball. Now let’s go back to Disney World. (Sounds good as I am writing this in January.) Remember the
representation of Spaceship Earth as a simplicial complex. There are 11,520 triangles making up this sphere, although
it is actually missing some due to supports and doors; we won’t worry about that. Imagine if you wanted to compute
the simplicial homology by hand. It would be a lot of work, and most of the triangles would cancel out anyway, so it would be
a big waste of time. Instead, we can form S 2 as a CW complex with a single 2-cell whose entire boundary is glued to a 0-cell. Now the

complex has only 2 pieces. (Note there are other ways to produce S 2 so we need to be a little careful as we will see
later.) In this subsection and the next, I will make this definition formal and show you how to compute homology.
First we need to say what we mean by gluing spaces together.

Definition 4.3.1. Let X and Y be disjoint topological spaces. Let A be a closed subset of X and let f : A → Y
be continuous. Then let X ∪ Y be the topological sum (i.e., a disjoint union). Form a quotient space by identifying
each set {y} ∪ f −1 (y) for y ∈ Y to a point. This quotient space consists of these identified points along with the points {x} of
X − A. We denote it by X ∪f Y and call it the adjunction space determined by f .

Definition 4.3.2. A space is a cell of dimension m or an m-cell if it is homeomorphic to the ball B m . It is an open
cell of dimension m if it is homeomorphic to the interior of B m .

Definition 4.3.3. A CW complex is a space X and a collection of disjoint open cells eα whose union is X such that

1. X is Hausdorff.

2. For each open m-cell eα of the collection, there exists a continuous map fα : B m → X that maps the interior
of B m homeomorphically onto eα and carries the boundary of B m into a finite union of open cells, each of
dimension less than m.

3. A set A is closed in X if A ∩ ēα is closed in ēα for each α, where ēα denotes the closure of eα .

CW complexes were discovered by J. H. C. Whitehead, but they are not named after him. The second condition
is called “closure finiteness”, and the third condition expresses the fact that X has what Whitehead called the “weak
topology” with respect to the {eα }. The letters C and W come from these conditions.
To match the notation in Munkres, we will write ēα for the closure of eα and ėα for ēα − eα . Then the definition implies that fα
carries B m onto ēα and the boundary of B m onto ėα .
The map fα is called a characteristic map for the open cell eα . We use X both for the CW complex and its
underlying space.
A finite CW complex is one which has finitely many cells. Such a complex is always compact. Also, in homotopy
theory, a lot of ugliness is avoided for spaces having the homotopy type of a finite CW complex. Any space that we
will ever see in data science will be of this type.

Theorem 4.3.4. Let X be a CW complex with open cells eα . A function f : X → Y is continuous if and only if f |ēα
is continuous for each α. A function F : X × I → Y is continuous if and only if F |(ēα × I) is continuous for each α.

Now we give some examples.

Example 4.3.1. Recall that a torus can be created by folding a rectangle. We can represent a torus as a CW complex
consisting of one 2-cell (the image of the interior of the rectangle), two 1-cells (the image of the two short ends glued
together and the image of the two long ends glued together), and one 0-cell (the image of the four vertices all glued together).
The short ends become the circle through the inside of the torus, and the two long ends form the circle around the
circumference of the torus.

A CW complex has a notion of a p-skeleton analogous to that of a simplicial complex.

Definition 4.3.4. If X is a CW complex, then the p-skeleton X p is the subspace of X consisting of the open cells of
dimension at most p. The dimension of X is the dimension of the highest dimensional cell in X.

Example 4.3.2. We saw earlier that we can construct S 2 as a CW complex consisting of a disk (a 2-cell) whose
boundary is glued to a single point (a 0-cell). So this complex has 2 cells. Another representation of S 2 has four cells.
Take a point and attach a 1-cell to form a circle. Now letting the circle be the equator, attach two 2-cells for the
Northern and Southern hemispheres respectively. This complex is also of homotopy type S 2 . Let A be the first type
of complex and B the second type. Then A and B are homotopy equivalent. But their 1-skeletons are not homotopy
equivalent: A1 consists of a single point, while B 1 has the homotopy type of the circle, S 1 .
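As a quick cross-check that the two cell structures can describe the same space, we can compare their Euler characteristics, the alternating sum of cell counts, which is a standard homotopy invariant not otherwise needed in this section. A short Python sketch:

```python
def euler_char(cells):
    """Euler characteristic of a finite CW complex: the alternating sum
    of (-1)^p times the number of p-cells; cells[p] is the count of p-cells."""
    return sum((-1) ** p * count for p, count in enumerate(cells))

# Complex A: one 0-cell and one 2-cell. Complex B: one 0-cell, one 1-cell,
# and two 2-cells (Example 4.3.2). Both give chi(S^2) = 2.
print(euler_char([1, 0, 1]), euler_char([1, 1, 2]))  # 2 2
```

Agreeing Euler characteristics are of course only a necessary condition, but they catch many bookkeeping errors when building cell structures.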

As promised, I will now give an example of a CW complex which is non-triangulable. Recall that for a triangulable
space X, there is a simplicial complex K and a homeomorphism h : |K| → X. The example as Munkres presents it
is a little confusing. What I think he means is that he constructs a finite CW complex which cannot be triangulated
by a finite simplicial complex. Assuming that is the case, I can’t find an example of a space that can’t be triangulated
at all. We need one more definition:

Definition 4.3.5. If a CW complex X is triangulated by a complex K in such a way that each skeleton X p of X is
triangulated by a subcomplex of K of dimension at most p, then we say that X is a triangulable CW complex.

Now consider the weird space pictured in Figure 4.3.1. I wouldn’t try to build this one at home.

Figure 4.3.1: Non-triangulable CW Complex [110]

A is the subspace of R3 consisting of a square and a triangle one of whose edges coincides with the diagonal D of
the square. So A is the space of a complex consisting of 3 triangles with an edge in common. Now let C be a 1-cell in
the square intersecting the diagonal in an infinite disconnected set. (Letting D be a piece of the positive x-axis starting
from the origin and letting C be the graph of the function y = x sin(1/x) will accomplish this.) Now take a ball B 3 and
attach it to the curve C by gluing every arc from the South pole to the North pole of B 3 to the curve C. The resulting
adjunction space X is a finite CW complex with cells the open simplices of A in dimensions 0, 1, and 2, and the 3-cell
e3 which is the interior of B 3 .
Suppose h : |K| → X is a triangulation, where K is a finite simplicial complex. We can write X as the disjoint
union
X = (A − C) ∪ C ∪ e3 .
Munkres uses arguments from the theory of local homology groups (see [110] section 35) to argue that for a simplex σ
in K, the image of the interior of σ can not intersect both (A−C) and e3 , so h(σ) lies in A or e3 . So if h triangulates A
and e3 , it must triangulate A ∩ e3 = C. A similar argument shows that D is triangulated by h and so is C ∩ D. But the
latter is an infinite set of disconnected points and can’t be triangulated by a finite complex, so we have a contradiction.
In the above I left out some details that won’t show up in later sections. See [110] for the details. Fortunately, there
is no data science application I can imagine where we will ever see a space like the one just described. We will always
assume we can represent our spaces as finite simplicial complexes.
I will conclude this subsection by stating two results that will help you picture where CW complexes come
from. Also, recall from Chapter 2 that a topological space X is normal if whenever A and B are disjoint closed sets in
X, there are disjoint open subsets U and V of X such that A ⊂ U and B ⊂ V . In his discussion of adjunction spaces,
Munkres shows that the adjunction of two normal spaces is normal and that normality is preserved under coherent unions,
in which a subset of the union of a collection of spaces is closed exactly when its intersection with each of the spaces
is closed. The two theorems below then imply that CW complexes are always normal.

Theorem 4.3.5. 1. Let X be a CW complex of dimension p. Then X is homeomorphic to an adjunction space
formed from X p−1 and a topological sum Σ Bα of closed p-balls by means of a continuous map g : Σ Bd Bα → X p−1 .
2. If Σ Bα is a topological sum of closed p-balls, and if g : Σ Bd Bα → Y is a continuous map, where Y is a CW
complex of dimension at most p − 1, then the adjunction space X formed from Y and Σ Bα by means of g is a
CW complex, and Y is its (p − 1)-skeleton.

Theorem 4.3.6. 1. Let X be a CW complex. Then X p is a closed subspace of X p+1 for each p, and X is the
coherent union of the spaces X 0 ⊂ X 1 ⊂ X 2 ⊂ · · · .
2. Suppose that Xp is a CW complex for each p and Xp is the p-skeleton of Xp+1 for each p. If X is the coherent
union of the spaces Xp , then X is a CW complex having Xp as its p-skeleton.

4.3.3 Homology of CW Complexes


It now remains to see how to compute the homology of a CW complex. To do this, we have to start with a definition
of chains and boundaries. It might seem logical to have chains be formal sums of cells, analogous to how simplicial
chains were defined. The problem is that we can’t really define boundaries in the same way. We will define cellular
chains a little differently and immediately show that they are the same thing for the underlying space of a simplicial
complex.
Let X denote a CW complex with open cells eα and characteristic maps fα . The symbol Hp can designate singular
homology, but since we will never worry about any type of space other than a triangulable finite CW complex, we can
just as easily let it denote simplicial homology. Recall the exact homology sequence of a pair:

\[ \cdots \longrightarrow H_p(A) \xrightarrow{i_*} H_p(X) \xrightarrow{j_*} H_p(X, A) \xrightarrow{\partial_*} H_{p-1}(A) \longrightarrow \cdots \]

where i : A → X and j : X → (X, A) are inclusion maps.


Definition 4.3.6. If X is a CW complex, let

Dp (X) = Hp (X p , X p−1 ).

Let ∂ : Dp (X) → Dp−1 (X) be the composite

\[ H_p(X^p, X^{p-1}) \xrightarrow{\partial_*} H_{p-1}(X^{p-1}) \xrightarrow{j_*} H_{p-1}(X^{p-1}, X^{p-2}), \]

where j is inclusion. The fact that ∂ 2 = 0 follows from the fact that

\[ H_{p-1}(X^{p-1}) \xrightarrow{j_*} H_{p-1}(X^{p-1}, X^{p-2}) \xrightarrow{\partial_*} H_{p-2}(X^{p-2}) \]

is exact, as it is part of the long exact sequence of the pair (X p−1 , X p−2 ), so ∂∗ ◦ j∗ = 0. The chain complex
D(X) = {Dp (X), ∂} is called the cellular chain complex of X.
Example 4.3.3. To help picture this definition, we would like to see how it compares to the simplicial chain complex
when the CW complex X is the underlying space of a simplicial complex K. Letting X p be the cellular p-skeleton
of X and K (p) be the simplicial p-skeleton of K, assume that the open cells of X coincide with the open simplices
of K. We want to compute Hp (X p , X p−1 ). The relative simplicial chain group Ci (K (p) , K (p−1) ) is equal to 0 for i ̸= p, and
Cp (K (p) , K (p−1) ) = Cp (K). Thus,
Hp (X p , X p−1 ) = Hp (K (p) , K (p−1) ) = Cp (K).
Also, the boundary operator is the ordinary simplicial boundary. This shows that in this case we can use the CW
complex to compute the homology of X.
Munkres next shows that the cellular chain complex of a CW complex X behaves in a lot of ways like a simplicial
complex. He shows that Dp (X) is free abelian with a basis consisting of the oriented p-cells. (I will define what we
mean by oriented cells next.) Also, he shows that D(X) can be used to compute the singular and thus the simplicial
homology of X. Rather than reproduce his rather long proof, I will refer you to [110], Section 39.

To understand some examples, we will introduce some new definitions. For each open p-cell eα of the CW complex
X, the group Hp (ēα , ėα ) is infinite cyclic (i.e., isomorphic to the group Z of integers). This is the case since we are
taking a closed ball and identifying its boundary to a point. For example, if eα is a 2-cell, then ēα is a closed disk, and
identifying the boundary of a closed disk to a point produces S 2 . We know that H2 (S 2 ) = Z. Now the group Z has two generators, 1 and −1.
These generators will be called the two orientations of the cell eα . The cell eα together with an orientation is called
an oriented p-cell.
Assuming that X is triangulable, as we always can in data science applications, let K be the complex that triangulates
X. Since X p and X p−1 are subcomplexes of K, any open p-cell eα is a union of open simplices of K, so ēα is
the polytope of a subcomplex of K. The group Hp (ēα , ėα ) is the group of p-chains carried by ēα whose boundaries
are carried by ėα . This group is Z, and either generator of the group is called a fundamental cycle for (ēα , ėα ).
The cellular chain group Dp (X) is the group of all simplicial p-chains of X carried by X p whose boundaries are
carried by X p−1 . Any such p-chain can be written uniquely as a finite linear combination of fundamental cycles for
those pairs (ēα , ėα ) for which dim eα = p.

Figure 4.3.2: Torus and Klein Bottle as CW Complexes. [110]

Example 4.3.4. Now we will compute the homology of the torus and the Klein bottle. You can follow along on Figure
4.3.2.
We let X be the torus or Klein bottle produced by folding the rectangle L as shown in the figure. In either case, X
is a triangulable CW complex having an open 2-cell e2 , two open 1-cells e1 and e′1 which are the images of A and B
respectively, and one 0-cell e0 . So D2 (X) ≅ Z, D1 (X) ≅ Z ⊕ Z, and D0 (X) ≅ Z. To compute homology, we need
generators for these chain groups. Letting d be the sum of all of the 2-simplices of L oriented counterclockwise,
we get a cycle of (L, Bd L); in fact, d is a fundamental cycle for (L, Bd L). By Munkres’ arguments about the
equivalence of cellular and simplicial homology, γ = g# (d), where g is the identification map shown in the figure,
is a fundamental cycle for (ē2 , ė2 ).
Now look at the rectangle at the right side of the figure. w1 = g# (c1 ) is a fundamental cycle for (ē1 , ė1 ), as is
g# (c3 ). Also, z1 = g# (c2 ) is a fundamental cycle for (ē′1 , ė′1 ), as is g# (c4 ). In the complex L, we have
∂c1 = v4 − v1 , ∂c2 = v2 − v1 ,
∂d = −c1 + c2 + c3 − c4 .
The difference between the torus and the Klein bottle is how L is folded, so we take it into account when we
apply g. Now ∂w1 = g# (∂c1 ) = 0 and ∂z1 = g# (∂c2 ) = 0 for both the torus and the Klein bottle. In the case

of the torus, ∂(γ) = g# (∂d) = 0 since g# (c1 ) = g# (c3 ) and g# (c2 ) = g# (c4 ). In the case of the Klein bottle,
∂(γ) = g# (∂d) = 2g# (c2 ) = 2z1 , since g# (c1 ) = g# (c3 ) and g# (c2 ) = −g# (c4 ).
Now it is easy to compute homology. If X is the torus, γ is a 2-cycle and there are no 2-boundaries, so H2 (X) ≅ Z.
The 1-chains of X are generated by w1 and z1 . Both of these are 1-cycles and there are no 1-boundaries, so
H1 (X) ≅ Z ⊕ Z. Since the torus is connected, H0 (X) ≅ Z.
Letting X be the Klein bottle, γ generates the 2-chains and it is not a cycle, so H2 (X) = 0. The 1-chains
of X are generated by w1 and z1 . Both of these are 1-cycles, and the boundaries are generated by ∂γ = 2z1 , so
H1 (X) ≅ Z ⊕ Z2 . Since the Klein bottle is connected, H0 (X) ≅ Z.
These computations agree with what was stated for simplicial homology.
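Both H1 computations above amount to taking the quotient of Z ⊕ Z, generated by w1 and z1 , by the subgroup generated by ∂γ = a·w1 + b·z1 . A small Python sketch of that quotient (the function name and the string encoding of the groups are my own, purely for illustration):

```python
from math import gcd

def h1_one_relation(a, b):
    """H_1 of a CW complex with one 2-cell, two 1-cells, and one 0-cell,
    where both 1-cells are cycles and the 2-cell's boundary is a*w1 + b*z1.
    Then H_1 = Z^2 / (a, b)Z, which is Z + Z when (a, b) = (0, 0) and
    Z + Z/gcd(a, b) otherwise (with Z/1 trivial)."""
    if a == 0 and b == 0:
        return "Z + Z"
    g = gcd(a, b)
    return "Z" if g == 1 else f"Z + Z/{g}"

print(h1_one_relation(0, 0))  # torus: Z + Z
print(h1_one_relation(0, 2))  # Klein bottle: Z + Z/2
```

The gcd appears because a primitive vector spans a direct summand of Z ⊕ Z, and the relation only kills gcd(a, b) times that summand.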

Example 4.3.5. Let S n be an n-sphere with n > 1. S n has the homotopy type of a CW complex with one n-cell and
one 0-cell. Then the cellular chain complex is infinite cyclic in dimensions n and 0, and zero otherwise. So Hi (S n ) ≅ Z
if i = 0, n and Hi (S n ) = 0 otherwise. For n = 1, the one 1-cell is a cycle, as its endpoints are identified, so Hi (S 1 ) ≅ Z
if i = 0, 1 and Hi (S 1 ) = 0 otherwise.

4.3.4 Projective Spaces


Munkres [110] includes this section as an application of cellular homology. I am interested in it both for that
reason and for its role in the construction of Steenrod squares which I will discuss in Chapter 11. This section comes
entirely from [110].

Definition 4.3.7. Starting with the n-sphere S n , identify all pairs of antipodal points x and −x. The resulting quotient
space is called the real projective n-space and denoted P n .

Represent Rn as the set of sequences of real numbers {x1 , x2 , · · · } such that xi = 0 for i > n. Then we have
Rn ⊂ Rn+1 . Letting S n be the set of points {x1 , x2 , · · · } ⊂ Rn+1 such that $x_1^2 + x_2^2 + \cdots + x_{n+1}^2 = 1$,
we see that S n−1 ⊂ S n . Now S n−1 is just the equator of S n , and if x ∈ S n−1 then −x ∈ S n−1 . So we have that P n−1 ⊂ P n . In fact, P n−1
is a closed subspace of P n .

Theorem 4.3.7. The space P n is a CW complex having one cell in each dimension 0 ≤ j ≤ n. Its j-skeleton is P j .

The idea of the proof is to proceed by induction on n. P 0 identifies the two points of S 0 , so it consists of a
single point. Suppose the theorem is true for P n−1 . Any pair of antipodal points of S n off the equator has exactly
one point in the open Northern hemisphere $E^n_+$. Now $E^n_+$ is an open n-cell, and the quotient map takes its boundary
S n−1 onto P n−1 , which is assumed to be a CW complex with one cell in each dimension up to n − 1. So the quotient
map restricted to the closed Northern hemisphere becomes a characteristic map for an n-cell, and we are done.
Note that P 1 has a 0-cell and an open 1-cell, so it is homeomorphic to S 1 .

Definition 4.3.8. P 0 ⊂ P 1 ⊂ P 2 · · · is an increasing sequence of spaces. Let P ∞ be the coherent union of these
spaces. (A coherent union of a sequence of spaces {Xα } is the union of these spaces in which a subset A is defined to
be open if A ∩ Xα is open for every α.) P ∞ is called infinite dimensional projective space. By the previous theorem,
it is a CW complex with one j-cell for all j ≥ 0. Its n-skeleton is P n .

We can also construct an analogous complex projective space. Start the way we did with real numbers, defining
the n-dimensional complex vector space C n to be the space of sequences of complex numbers {z1 , z2 , · · · } such that
zi = 0 for i > n. Then we have C n ⊂ C n+1 . There is a homeomorphism ρ : C n+1 → R2n+2 called the realification
operator defined by ρ(z1 , z2 , · · · ) = (Re(z1 ), Im(z1 ), Re(z2 ), Im(z2 ), · · · ), where Re(zj ) and Im(zj ) are the real
and imaginary parts of zj respectively. Let z̄j be the complex conjugate of zj and |zj | be the norm of zj . If zj = a + bi,
then z̄j = a − bi and $|z_j|^2 = a^2 + b^2 = (a + bi)(a - bi) = z_j \bar{z}_j$. So for z = (z1 , z2 , · · · , zn+1 ),

\[ |z| = \sqrt{\sum_{j=1}^{n+1} z_j \bar{z}_j}. \]

The subspace of C n+1 consisting of all points of norm 1 is called the complex n-sphere. It corresponds to S 2n+1
under the operator ρ so we will denote it by S 2n+1 both as a subset of C n+1 and of R2n+2 .

Definition 4.3.9. In the complex n-sphere S 2n+1 , define an equivalence relation ∼ by

(z1 , · · · , zn+1 , 0, · · · ) ∼ (λz1 , · · · , λzn+1 , 0, · · · ),

where λ is a complex number such that |λ| = 1. The resulting quotient space is called complex projective n-space and
is denoted CP n .

Analogous to the real case, C n ⊂ C n+1 for all n, so S 2n−1 ⊂ S 2n+1 . Passing to quotient spaces we get
CP n−1 ⊂ CP n .

Theorem 4.3.8. The space CP n is a CW complex having one cell in each dimension 2j for 0 ≤ j ≤ n. Its dimension
is 2n and its 2j-skeleton is CP j .

The proof is similar to the proof in the real case but now CP n − CP n−1 is an open 2n-cell. Note that I have
glossed over some of the point set topology that shows that P n and CP n satisfy all of the requirements for a CW
complex. See Munkres [110] for more details.
One useful fact is that CP 0 is a single point and CP 1 is formed by attaching an open 2-cell. This makes CP 1
homeomorphic to S 2 .

Definition 4.3.10. Since CP 0 ⊂ CP 1 ⊂ CP 2 · · · is an increasing sequence of spaces, let CP ∞ be the coherent
union of these spaces. CP ∞ is called infinite dimensional complex projective space. By the previous theorem, it is a
CW complex with one cell in each non-negative even dimension.

Now we will compute the homology of these spaces. The easier case is the complex projective spaces. Since there
is one cell in each even dimension 2j for 0 ≤ j ≤ n, Di (CP n ) ≅ Z if i is even and 0 ≤ i ≤ 2n, and Di (CP n ) = 0 otherwise.
So every chain is a cycle and no chain bounds. This gives the following result:

Theorem 4.3.9. Hi (CP n ) ≅ Z if i is even and 0 ≤ i ≤ 2n, and Hi (CP n ) = 0 otherwise. The group Hi (CP ∞ ) ≅ Z
if i is even and i ≥ 0, and Hi (CP ∞ ) = 0 otherwise.

The case for P n is harder. We know that for P n , the cellular chain groups are infinite cyclic for 0 ≤ k ≤ n. So
we need to compute the boundary operators. Since the open k-cell ek of P n equals P k − P k−1 and ėk = P k−1 ,
Dk (P n ) = Hk (P k , P k−1 ). So we need to compute the boundary ∂∗ : Hk+1 (P k+1 , P k ) → Hk (P k , P k−1 ).

Theorem 4.3.10. Let p : S n → P n be the quotient map, where n ≥ 1, and let j : P n → (P n , P n−1 ) be inclusion. The
composite homomorphism

\[ H_n(S^n) \xrightarrow{p_*} H_n(P^n) \xrightarrow{j_*} H_n(P^n, P^{n-1}) \]

is zero if n is even and multiplication by 2 if n is odd.

Proof: Munkres provides a proof on the chain level and one on the homology level. I will give the simpler chain-level
proof and refer you to [110] for the homology-level proof.
A theorem of [110] says that we can triangulate S n so that the antipodal map a : S n → S n is simplicial, and P n
can be triangulated so that p : S n → P n is simplicial. Then we can use simplicial homology. Let cn be a fundamental
cycle for $(E^n_+, S^{n-1})$. Then p# (cn ) is a fundamental cycle for (P n , P n−1 ). (See [110] Theorem 39.1.)
Let γn = cn + (−1)n−1 a# (cn ) be a chain of S n . The antipodal map a : S n−1 → S n−1 has degree (−1)n , so a#
equals multiplication by (−1)n on Cn−1 (S n−1 ). So

∂γn = ∂cn + (−1)n−1 a# (∂cn ) = ∂cn + (−1)2n−1 ∂cn = ∂cn − ∂cn = 0,



so γn is a cycle. It is a multiple of no other cycle of S n , since its restriction to $E^n_+$ is cn , which is a fundamental cycle
for $(E^n_+, S^{n-1})$. So γn is a fundamental cycle for S n .
Finally, p# (γn ) = p# (cn + (−1)n−1 a# (cn )). Since p ◦ a = p, p# (γn ) = [1 + (−1)n−1 ]p# (cn ). This is 0 for n
even and multiplication by 2 for n odd. The theorem then follows from the fact that γn is a fundamental cycle for S n
and p# (cn ) is a fundamental cycle for (P n , P n−1 ). ■
Now we can compute the cellular boundary maps.
Theorem 4.3.11. The map ∂∗ : Hn+1 (P n+1 , P n ) → Hn (P n , P n−1 ) is zero if n is even and multiplies a generator
by 2 if n is odd.
Proof: The map $p' : (E^{n+1}_+, S^n) \to (P^{n+1}, P^n)$ is a characteristic map for the open (n + 1)-cell of P n+1 , so it
induces an isomorphism in homology. Consider the following commutative diagram:

\[
\begin{array}{ccc}
H_{n+1}(E^{n+1}_+, S^n) & \xrightarrow[\cong]{\partial_*} & H_n(S^n) \\[4pt]
{\scriptstyle p'_*}\big\downarrow{\scriptstyle \cong} & & \big\downarrow{\scriptstyle p_*} \\[4pt]
H_{n+1}(P^{n+1}, P^n) & \xrightarrow{\partial_*} & H_n(P^n) \xrightarrow{\;j_*\;} H_n(P^n, P^{n-1})
\end{array}
\]

The map ∂∗ at the top is an isomorphism by the long exact reduced homology sequence of the pair $(E^{n+1}_+, S^n)$, since
$E^{n+1}_+$ is acyclic. By the preceding theorem, $(j \circ p)_*$ is zero if n is even and multiplication by 2 if n is odd, so we are
done. ■
So all chains in odd dimensions are cycles and the even multiples of a generator bound, while in positive even dimensions there
are no nonzero cycles. So we have the following:
Theorem 4.3.12. The homology of real projective space is as follows:

\[
\tilde{H}_i(P^{2n+1}) \cong
\begin{cases}
\mathbb{Z}/2 & \text{if $i$ is odd and } 0 < i < 2n+1, \\
\mathbb{Z} & \text{if } i = 2n+1, \\
0 & \text{otherwise.}
\end{cases}
\]

\[
\tilde{H}_i(P^{2n}) \cong
\begin{cases}
\mathbb{Z}/2 & \text{if $i$ is odd and } 0 < i < 2n, \\
0 & \text{otherwise.}
\end{cases}
\]

\[
\tilde{H}_i(P^{\infty}) \cong
\begin{cases}
\mathbb{Z}/2 & \text{if $i$ is odd and } 0 < i, \\
0 & \text{otherwise.}
\end{cases}
\]
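Theorem 4.3.12 can be recovered mechanically from the cellular chain complex of P n : the chain group is Z in each dimension 0 ≤ k ≤ n, and by Theorem 4.3.11 the boundary map on k-chains is zero for k odd and multiplication by 2 for even k ≥ 2. Here is a minimal Python sketch of that bookkeeping (the function names are my own; the sketch computes unreduced homology, so H0 = Z):

```python
def cellular_boundary(k):
    """The 1x1 boundary matrix d_k : D_k -> D_{k-1} of P^n, as an integer:
    zero for odd k, multiplication by 2 for even k >= 2 (Theorem 4.3.11)."""
    return 0 if k % 2 == 1 else 2

def rp_homology(n, k):
    """Unreduced H_k(P^n), returned as a string, from ker(d_k)/im(d_{k+1})."""
    if k < 0 or k > n:
        return "0"
    d_k = cellular_boundary(k) if k >= 1 else 0        # d_0 = 0
    d_up = cellular_boundary(k + 1) if k + 1 <= n else 0
    if d_k != 0:              # multiplication by 2 is injective: no cycles
        return "0"
    # cycles form Z; boundaries are 2Z when d_up = 2, trivial when d_up = 0
    return "Z" if d_up == 0 else "Z/2"

# H_k(P^3) for k = 0, ..., 3:
print([rp_homology(3, k) for k in range(4)])  # ['Z', 'Z/2', '0', 'Z']
```

Note how the top dimension distinguishes the two cases: for odd n the top boundary entering dimension n does not exist, leaving H_n = Z, while for even n the top chain group has no nonzero cycles at all.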
We are now ready to apply what we have learned to topological data analysis.
Chapter 5

Persistent Homology

Recall from my introduction to the book that we want to get an idea of the shape of our data. Recall Figure 1.0.2
reproduced here as Figure 5.0.1:

Figure 5.0.1: Ring of Data Points

From the point of view of algebraic topology, this figure is pretty boring. It consists of N disconnected points. The
0-th Betti number is N and all of the other homology groups are zero. But your eyes think this should be a circle. So
how do we know? And how do we convince a computer of that fact?
The answer is that we grow the points into circles of increasing radius. As we do, the circles will start to overlap
and create holes. As they grow, the holes will be plugged up. There will be a lot of bubbles along the circumference
and a big hole in the middle. By keeping track of when the holes form (i.e., at what radius) and when they are plugged up,
we can fully describe the shape. The big hole persists the longest, so we decide it
is the most important feature. In the language of the last chapter, each component is a 0-homology class and each hole
is a 1-homology class. Once we find these classes, we can visualize them. The three standard methods are bar codes,
persistence diagrams, and persistence landscapes. These three visualizations display the same information: the birth
and death times for each homology class.
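For the 0-dimensional classes, this bookkeeping can be sketched in a few lines of Python: sort the pairwise distances and merge components with a union-find structure; every point is born at radius 0, and each merge records a death. This is only an illustrative sketch of the idea, not the algorithm used by production TDA software, and all names are my own:

```python
import math

def h0_persistence(points):
    """Persistent H_0 of a point cloud: every component is born at radius 0,
    and when two growing balls of radius r first overlap (r = distance / 2),
    one component dies. Kruskal-style merging with union-find."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    diagram = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            diagram.append((0.0, d / 2))   # (birth, death) of the merged class
    diagram.append((0.0, math.inf))        # one component never dies
    return diagram

print(h0_persistence([(0, 0), (1, 0), (10, 0)]))
# [(0.0, 0.5), (0.0, 4.5), (0.0, inf)]
```

The isolated point at (10, 0) produces a long-lived class, which is exactly the sense in which persistence separates signal from sampling noise.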
In general, we will replace the points with balls of growing radius. To compute homology at each stage, we will need
a simplicial complex. Section 5.1 shows how to do this with a point cloud and formally defines persistent homology.
As the balls grow, we get a simplicial complex which grows as simplices are added. The series of complexes is
called a filtration. Section 5.2 discusses computational issues and explains why persistent homology is generally done
with coefficients in a field as opposed to the integral coefficients used in Chapter 4. In Section 5.3, we will see the
visualizations described above.
To do machine learning tasks such as clustering, we would like to have some notion of distance between persistence
diagrams. The standard measures are bottleneck distance and Wasserstein distance. Persistence landscapes have their
own distance measure which I used in my own time series work. All of these measures will be described in Section
5.4.
As I mentioned, the simplicial complexes in persistent homology grow monotonically, forming a filtration. Can we


still do something if this is not the case? Section 5.5 will deal with this case, which is called zig-zag persistence.
There are other ways in which persistence diagrams can arise besides point clouds. One example is sublevel sets
of a height function assigned to a data set. An important example of this is grayscale images. These are treated in
Section 5.6. Graphs can also lead to persistence diagrams if they are weighted. I will show how this works in Section
5.7. In addition, I will show how several different simplicial complexes can be produced from graphs and describe a
quick and easy distance measure between graphs.
In Section 5.8, I will talk about time series and how they relate to TDA. A time series is a set of equally spaced
measurements over time. In Section 5.9, I will describe the SAX method of converting a time series into a string of
symbols. This allows for an easy method to detect anomalous substrings as well as classify time series using substrings
that are common in one class and rare in the other. Using a method developed by Gidea and Katz [55] for financial
applications, TDA can be used to convert a multivariate time series into a univariate one. I will conclude the section
with my own use of these techniques to classify internet of things devices.
Is there a good general method of extracting features from persistence diagrams? I will describe two very promising
techniques. The first is the technique of persistence images developed by Adams et al. [2]. Another technique is to
use template functions. These are a family of real-valued functions highlighting different regions of a diagram. This
technique was developed by Perea, Munch, and Khasawneh [120]. I will conclude with a list of software which can be used for TDA
applications.
applications.
Here are some references on TDA. A good survey paper is [127]. I can also recommend four books. Edelsbrunner
and Harer [44] was written by two of the founders of the subject. Other books include [53] and [181]. The recent
book [129] is concerned specifically with applications in biology with a heavy emphasis on virus evolution and cancer
genomics. It has a great condensed introduction to persistent homology, and I will rely on it pretty heavily in this
chapter.

5.1 Definition of Persistent Homology


We will start with turning a set of discrete points into a simplicial complex. The idea is to consider that these points
are just a sample of points from the true shape. We would like to somehow reconstruct the actual shape and then take
the homology. But we can’t really be sure of the true scale. Is the shape folded up? Which holes are actually there and
which are just artifacts of our sampling. The idea of persistent homology is to take several scales at once. We do this
by growing each point to a ball such that every ball has the same radius. We will freeze the process for a set of radii,
build a complex, and then compute its homology.
We would like to build an abstract simplicial complex whose homology reflects that of the shape produced by
taking the union of balls around points. We start with a complex that has been around for a long time. The definitions
and results in this section are taken from [129] unless otherwise specified.

Definition 5.1.1. For x ∈ Rn and ϵ > 0, let Bϵ (x) be the open ball of radius ϵ centered at x. The Čech complex
Cϵ (X) for a finite set of points X ⊂ Rn is an abstract simplicial complex with the points of X as vertices and a
k-simplex [v0 , v1 , · · · , vk ] for {v0 , v1 , · · · , vk } ⊂ X if

∩i Bϵ (vi ) ̸= ∅.

It turns out that the underlying space of the Čech complex is actually homotopy equivalent to the union of balls
centered at the points of X. (See the references cited in [129].) The problem is that in high dimensions, it is not easy to
tell when the intersections of the balls are non-empty. An easier task would be to look at whether the balls intersect
pairwise. In this case, we would only need to know the pairwise distances between the points. We would then include a
simplex if the distance between each pair of vertices is less than or equal to twice the radius of the balls. The
resulting complex is the Vietoris-Rips complex. Note that we can now define such a complex in any metric space with
distance d.
5.1. DEFINITION OF PERSISTENT HOMOLOGY 85

Definition 5.1.2. Let X be a finite metric space with distance d and fix ϵ > 0. The Vietoris-Rips complex V Rϵ (X) is an
abstract simplicial complex with the points of X as vertices and a k-simplex [v0 , v1 , · · · , vk ] whenever d(vi , vj ) ≤ 2ϵ
for all 0 ≤ i, j ≤ k.
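To make Definition 5.1.2 concrete, here is a minimal brute-force sketch of my own (the function name, the `max_dim` cutoff, and the default Euclidean metric are illustrative choices, not from any package); real TDA software uses far more efficient expansion algorithms.

```python
from itertools import combinations

def vietoris_rips(points, eps, max_dim=2, dist=None):
    """Build VR_eps(X) as a list of simplices (tuples of vertex indices).

    A k-simplex is included whenever every pairwise distance between
    its vertices is at most 2*eps, i.e. the balls of radius eps around
    its vertices intersect pairwise.  Brute force: small clouds only.
    """
    if dist is None:  # default to Euclidean distance
        dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    n = len(points)
    simplices = [(i,) for i in range(n)]          # every point is a vertex
    for k in range(1, max_dim + 1):               # edges, triangles, ...
        for face in combinations(range(n), k + 1):
            if all(dist(points[i], points[j]) <= 2 * eps
                   for i, j in combinations(face, 2)):
                simplices.append(face)
    return simplices
```

With the three vertices of a unit equilateral triangle and ϵ = 0.6, every pairwise distance is 1 ≤ 2ϵ, so the 2-simplex (0, 1, 2) appears; with ϵ = 0.4 no edges appear at all.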

The Vietoris-Rips complex is easier to compute than the Čech complex but is not always the same, as the following
example from [129] shows.

Example 5.1.1. Let X = {(0, 0), (1, 0), (1/2, √3/2)} ⊂ R2 . These points are the vertices of an equilateral triangle with
side 1. If ϵ > 1/2, then any pair of balls of radius ϵ intersect, so the Vietoris-Rips and Čech complexes have the
same 1-skeletons. But the centroid of the triangle, which is equidistant from all three vertices, is a distance √3/3 ≈ .577
from each vertex. So for 1/2 < ϵ < √3/3, the pairs of balls intersect but the intersection of all three is empty. So the
Vietoris-Rips complex has a 2-simplex but the Čech complex does not. As the union of the balls has a hole, the 1-
homology of the union has Betti number 1, which agrees with the Čech complex but not with the Vietoris-Rips complex,
which is acyclic.
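The numbers in Example 5.1.1 are easy to verify directly; this short check (with the triangle and thresholds hard-coded from the example) confirms that for ϵ strictly between 1/2 and √3/3 the three balls intersect pairwise but share no common point.

```python
# Verify Example 5.1.1: equilateral triangle with side 1.
pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 3 ** 0.5 / 2)]
centroid = (0.5, 3 ** 0.5 / 6)           # equidistant from all vertices

def d(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

circumradius = d(pts[0], centroid)        # sqrt(3)/3, about 0.577
eps = 0.55                                # 1/2 < eps < sqrt(3)/3

# Balls of radius eps meet pairwise iff centers are within 2*eps;
# all three share a point iff eps reaches the circumradius.
pairwise = all(d(pts[i], pts[j]) <= 2 * eps
               for i in range(3) for j in range(i + 1, 3))
triple = eps >= circumradius
print(round(circumradius, 3), pairwise, triple)   # -> 0.577 True False
```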

It turns out that there is a relationship between the Vietoris-Rips complex and the Čech complex. I will state it here
without proof:

Theorem 5.1.1. If X is a finite set of points in Rn and ϵ > 0 then we have the inclusions

Cϵ (X) ⊂ V Rϵ (X) ⊂ C2ϵ (X).

One thing to notice is that for both the Vietoris-Rips complex and the Čech complex, increasing the scale parameter
ϵ increases the number of simplices in the complex. This is a consequence of the fact that increasing the radius of the
balls centered at each point makes them more likely to intersect. This fact is the motivation for my interest in the con-
nection between obstruction theory and TDA. Obstruction theory looks at extending a map from L to Y continuously
to a map from K to Y if L is a subcomplex of K. Picking the proper map may give interesting information about the
data points. I will discuss this subject further in later chapters after I define cohomology and homotopy groups as they
are needed to provide the criterion for extending maps in this way.
The increasing sequence of complexes is called a filtration. The idea will be to choose several values of epsilon
and compute the resulting homology in dimension k of the homology of the Vietoris-Rips complex, V Rϵ (X). As ϵ
grows, we keep track of each k-dimensional homology class and determine at what value of ϵ the class is born and at
what values it dies. The idea is that classes with a longer lifetime are the most important. The lifetime of a homology
class is its persistence and this process computes the persistent homology of a point cloud. Figure 5.1.1 shows the
resulting complexes in an example point cloud as ϵ grows.

Figure 5.1.1: Vietoris-Rips Complexes of a Point Cloud with ϵ Growing [129]



This leads to some issues. First of all, what coefficients should we use for our homology groups? The most common
choice is Z2 . This avoids the problem of orientation since in Z2 , 1 = −1, but you are throwing away information. Why
not use integer coefficients as we do in Chapter 4? It turns out we need to have coefficients in a field for persistent
homology, and we will explore the reasons in the next section. Also, what values of ϵ should we use? We would like
to capture the changes, and with finitely many points, simplices are added at only finitely many values of ϵ, leading to
only finitely many changes in the homology groups. So equal spacing may work, but the intervals need to be adjusted
to how far apart your points are. Finally, for what values of k should we compute Hk (V Rϵ (X))? It is best to start
with small values as these are easiest to interpret. H0 represents the number of components, H1 the number
of holes, and H2 the number of voids (think of the spherical holes in a block of Swiss cheese). In Section
5.3, we will see how to visualize these classes.

5.2 Computational Issues


In this section, I will discuss the paper of Zomorodian and Carlsson [182], which is one of the earliest papers on
computing persistent homology. I won’t copy the entire algorithm. You can look there, but keep in mind that there
are much faster algorithms that have been developed since then and there is a lot of open source software that will do
these computations for you. Our interest will be in providing a more algebraic description of persistent homology and
answering the question of why we need our coefficients to be in a field.
First let’s review some ring theory. Recall that a Euclidean ring allows for long division. Also, recall Theorem
3.4.2 which states that any Euclidean ring is a principal ideal domain (PID). You may wonder if the converse is true.
Herstein [66] says that the answer is no, and gives the paper by Motzkin [108] as a reference. That paper is lacking in
details and somewhat difficult to understand. A much better paper is that of Wilson [173]. Letting R be the subring of
the complex numbers

R = {a + b(1 + √−19)/2 | a, b ∈ Z},

he shows that R is a PID which is not a Euclidean ring.
Recall that we also mentioned that the polynomial ring F [x] is a Euclidean ring if F is a field. I will outline the
proof from Herstein [66].
First of all, an element of a Euclidean ring has a degree. For a polynomial, this is the usual degree. A generic
element of F [x] is f (x) = a0 + a1 x + a2 x2 + · · · + an xn , where the coefficients a0 , a1 , · · · , an ∈ F. In this case,
the degree of f is deg f = n.
Now let f (x) = a0 + a1 x + a2 x2 + · · · + an xn and g(x) = b0 + b1 x + b2 x2 + · · · + bm xm . Assume an , bm ̸= 0.
Multiplying polynomials in the usual way gives that the highest degree term is an bm xn+m . So we get the following:

Theorem 5.2.1. If f (x) and g(x) are nonzero elements of F [x], then deg (f (x)g(x)) = deg f (x)+ deg g(x).

Theorem 5.2.2. If f (x) and g(x) are nonzero elements of F [x], then deg f (x) ≤ deg (f (x)g(x)).

Proof: This follows from Theorem 5.2.1 and the fact that for g(x) ̸= 0, deg g(x) ≥ 0.■

Theorem 5.2.3. F [x] is an integral domain.

This is immediate from the previous result.


We will now be able to show that F [x] is a Euclidean ring if we can prove the division algorithm.

Theorem 5.2.4. If f (x) and g(x) are nonzero elements of F [x], then there exist polynomials t(x) and r(x) in F [x]
such that f (x) = t(x)g(x) + r(x) where r(x) = 0 or deg r(x) < deg g(x).

Proof: If the degree of f (x) is less than the degree of g(x) we can set t(x) = 0 and r(x) = f (x). So assume that
deg f (x) ≥ deg g(x).
Let f (x) = a0 + a1 x + a2 x2 + · · · + am xm and g(x) = b0 + b1 x + b2 x2 + · · · + bn xn with am , bn ̸= 0, and
m ≥ n. We can proceed by induction on the degree of f (x). If f has degree 0, then so does g. So let f (x) = a0 and

g(x) = b0 . Then since F is a field and a0 and b0 are nonzero, we have f (x) = (a0 /b0 )g(x). Here division means
multiplying by b0 −1 , which we can do since F is a field and all nonzero elements have a multiplicative inverse.
If the degree of f (x) is greater than zero, let f1 (x) = f (x) − (am /bn )xm−n g(x). Then deg f1 (x) ≤ m − 1, so by
induction on the degree of f (x), we can write f1 (x) = t1 (x)g(x) + r(x) where r(x) = 0 or deg r(x) < deg g(x). So
f (x)−(am /bn )xm−n g(x) = t1 (x)g(x)+r(x) or f (x) = (am /bn )xm−n g(x)+t1 (x)g(x)+r(x) = t(x)g(x)+r(x),
where t(x) = (am /bn )xm−n + t1 (x). So we are done. ■
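The proof of Theorem 5.2.4 is effectively the long-division algorithm, and it runs verbatim in code. Here is a sketch of my own, working over the field Q via `Fraction` so leading coefficients are invertible; a polynomial is represented as its coefficient list, lowest degree first.

```python
from fractions import Fraction

def poly_divmod(f, g):
    """Divide f by nonzero g in Q[x]: return (t, r) with
    f = t*g + r and r = 0 or deg r < deg g, as in Theorem 5.2.4.
    Coefficients are lists [a0, a1, ..., an], lowest degree first."""
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    t = [Fraction(0)] * max(1, len(f) - len(g) + 1)
    r = f[:]
    while len(r) >= len(g) and any(r):
        while r and r[-1] == 0:     # drop trailing zeros: r[-1] is
            r.pop()                  # the leading coefficient
        if len(r) < len(g):
            break
        shift = len(r) - len(g)
        c = r[-1] / g[-1]            # divide leading coefficients:
        t[shift] += c                # this step needs a field
        for i, gc in enumerate(g):   # r -= c * x^shift * g
            r[shift + i] -= c * gc
    return t, r
```

For f = x² − 1 and g = x − 1 this returns t = x + 1 and r = 0. The line `c = r[-1] / g[-1]` is exactly where a field is needed; over Z it fails whenever the leading coefficient of g is not ±1.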
Theorem 5.2.5. If F is a field, then the polynomial ring F [x] is a Euclidean ring. Consequently, it is also a principal
ideal domain.
Now letting Z be the integers, we see the problem with this proof in the case of Z[x]. Unless the coefficient of the
highest degree term of g(x) is either 1 or −1, we can’t divide by it. In fact, Z[x] is not even a principal ideal domain. Let
the ideal U be generated by 2 and x. A principal ideal would have to be generated by an element of minimal degree
that is of the form 2a + bx where a and b are integers. So the only possible generators would be 2 or -2. But x is not
a multiple of 2 since 2 has no inverse in Z. We would have the same problem with −2. So Z[x] is not even a PID,
despite the integers being a PID.
Now we will show that the chain complexes we will use for persistent homology will be modules over the polyno-
mial ring R[t]. This will be a PID if R is a field. In that case, we will make use of the structure theorem of modules
over a PID (Theorem 3.4.3).
Now I will summarize the description of persistence from [182]. Assume we have a filtration of complexes. The
i-th complex will be denoted K i . Note that this section will not have any reference to cohomology, in which superscripts
are generally used. Here the superscript will be an index for a filtration. For the Vietoris-Rips complex, we could
think of the superscript indexing the radius of the balls centered at points in a point cloud. For K i , let Cki , Zki , Bki , and
Hki be the group of k-chains, k-cycles, k-boundaries, and the k-th homology group respectively. The p-persistent k-th
homology group of K i is

Hki,p = Zki /(Bki+p ∩ Zki ).

Note that both groups in the denominator are subgroups of Cki+p : the complexes grow as the superscript increases, so
the chain group Cki is a subgroup of Ckj for j > i. So the intersection of the two groups in the denominator is itself a
group and is a subgroup of the numerator.
Now recall from Theorem 3.4.3 that a finitely generated module over a principal ideal domain R is the direct
sum of copies of R (the free part) and summands of the form R/(di ) (the torsion part). The Betti number β is the rank
or number of summands of R in the free part.
Definition 5.2.1. The p-persistent k-th Betti number of K i is the rank βki,p of the free subgroup of Hki,p . We can also
define persistent homology groups using the homomorphism ηki,p : Hki → Hki+p induced by the map on chains. We can
define Hki,p as the image of ηki,p .

We are still left with the problem of finding compatible bases for Hki and Hki+p . To get around this problem, [182]
combines all of the complexes of the filtration into a single algebraic structure. They then establish a correspondence
that gives a simple description when the coefficients are in a field. Persistent homology is represented as the homology
of a graded module over a polynomial ring.
Definition 5.2.2. A persistence complex C is a family of chain complexes {C∗i }i≥0 over R together with chain maps
f i : C∗i → C∗i+1 so that we have the following diagram:

C∗0 --f 0--> C∗1 --f 1--> C∗2 --f 2--> · · ·

The filtered complex K with inclusion maps for the simplices is a persistence complex. Here is a part of the
complex. The filtration indices increase as you move right and the dimension goes down as you move down. ∂k are
the usual boundary operators.

  ∂3 ↓        ∂3 ↓        ∂3 ↓
C20 --f 0--> C21 --f 1--> C22 --f 2--> · · ·
  ∂2 ↓        ∂2 ↓        ∂2 ↓
C10 --f 0--> C11 --f 1--> C12 --f 2--> · · ·
  ∂1 ↓        ∂1 ↓        ∂1 ↓
C00 --f 0--> C01 --f 1--> C02 --f 2--> · · ·

Definition 5.2.3. A persistence module M is a family of R-modules M i together with homomorphisms ϕi : M i →
M i+1 .

Definition 5.2.4. A persistence complex (resp. persistence module) is of finite type if each component complex (mod-
ule) is a finitely generated R-module, and if the maps f i (resp. ϕi ) are isomorphisms for i ≥ m for some m.

For a Vietoris-Rips complex produced by a finite set of points X, the associated persistence complex and persis-
tence module (the homology of the complex) are of finite type. If m is the largest distance between any pair of points,
the complex V Rm (X) consists of one simplex whose vertices are all of the points, and it no longer changes when
ϵ > m.
Now we need to make the algebra a little more precise. A graded ring R is a direct sum of abelian groups Ri
such that if a ∈ Ri and b ∈ Rj then ab ∈ Ri+j . Elements of a single Ri are called homogeneous. The polynomial
ring R[t] is a graded ring with standard grading Ri = Rti for all nonnegative integers i. For a, b ∈ R, ati and btj are
homogeneous elements while their sum is not; their product abti+j has degree i + j as required. We define a graded
module M over a graded ring R in a similar way: M = ⊕Mi where the sum is over all integers, and for r ∈ Ri and
m ∈ Mj , we have rm ∈ Mi+j . A graded ring (resp. module) is non-negatively graded if Ri = 0 (resp. Mi = 0) for
i < 0.
In this notation, we write the structure theorem in the following form:

Theorem 5.2.6. If D is a PID, then every finitely generated D-module M can be uniquely decomposed as

M ∼= Dβ ⊕ (⊕mi=1 D/di D),

for di ∈ D, β ∈ Z, and di divides di+1 for all i. A graded module M over a graded PID D decomposes uniquely as

M ∼= (⊕ni=1 Σαi D) ⊕ (⊕mj=1 Σγj D/dj D),

where dj ∈ D are homogeneous elements so that dj divides dj+1 , αi , γj ∈ Z, and Σα denotes an α-shift upward in
grading.

For a persistence module M = {M i , ϕi }i≥0 over a ring R, give R[t] the standard grading and define a graded
module over R[t] by

α(M) = ⊕∞i=0 M i ,

where the action of r ∈ R is just the sum of the actions on each component and

t(m0 , m1 , m2 , · · · ) = (0, ϕ0 (m0 ), ϕ1 (m1 ), ϕ2 (m2 ), · · · ).

So t shifts elements of the module up a degree. What this means in our case is that if a simplex is added at time i, t
shifts it to time i + 1 but preserves the memory that it had been added earlier by applying ϕi .
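As a toy illustration of this t-action (everything here is invented for the example: components are plain numbers and the structure maps ϕi are ordinary Python functions):

```python
def t_action(m, phi):
    """Multiply an element m = (m0, m1, m2, ...) of alpha(M) by t:
    put 0 in grade 0 and shift every component up one grade,
    applying the structure map phi[i] : M^i -> M^{i+1} as it moves."""
    return (0,) + tuple(phi[i](mi) for i, mi in enumerate(m))

# With identity structure maps, t is a pure shift:
phi = [lambda x: x] * 3
print(t_action((5, 7, 9), phi))   # -> (0, 5, 7, 9)
```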
So α provides a correspondence between persistence modules of finite type over R and finitely generated non-
negatively graded modules over R[t]. So we want R to be a field so that R[t] becomes a PID and we can use the
structure theorem. So we now have that our persistence module is of the form

(⊕ni=1 Σαi R[t]) ⊕ (⊕mj=1 Σγj R[t]/(tnj )),

since R[t] is a PID and its graded ideals are all of the form (tn ).
Now we get to the key point. We want a more meaningful description of the possible graded modules over R[t]. Define
a P-interval to be an ordered pair (i, j) with 0 ≤ i < j, where i ∈ Z and j ∈ Z ∪ {+∞}. For a P-interval (i, j), let
Q(i, j) = Σi R[t]/(tj−i ), and Q(i, +∞) = Σi R[t]. For a set of P-intervals S = {(i1 , j1 ), (i2 , j2 ), · · · , (in , jn )},
define

Q(S) = ⊕nk=1 Q(ik , jk ).

So we have a one-to-one correspondence between finite sets of P-intervals and finitely generated modules over R[t]
where R is a field.
So we know now why we wanted the coefficients to be in a field. Usually the field will be Z2 since we don’t have to
worry about orientations. But what do the P-intervals represent? Suppose we have the interval (i, j). This corresponds
to the summand Σi R[t]/(tj−i ). If the module represents the homology, the shift Σi means that the homology class in
this summand doesn’t show up until time i, while the quotient (tj−i ) means that it disappears by time j. The time i is
called the birth time of the class, and the time j is the death time. The correspondence and the structure theorem imply
that we can characterize the persistent homology by keeping track of all of the birth and death times of the homology
classes in Hk (X) for each k. This will be done using the visualizations in the next section.
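Extracting the birth and death times in all dimensions takes the matrix reduction of [182] or its successors, but in dimension 0 the barcode is exactly a minimum-spanning-tree computation: every point gives a class born at ϵ = 0, and a class dies when its component merges into another. Here is a self-contained sketch of my own (the ϵ = length/2 convention matches the 2ϵ threshold in the Vietoris-Rips complex):

```python
import math
from itertools import combinations

def h0_barcode(points):
    """0-dimensional persistent homology of the Vietoris-Rips
    filtration via Kruskal/union-find on edges sorted by length.
    Returns (birth, death) pairs; one class never dies."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    edges = sorted((math.dist(p, q), i, j)
                   for (i, p), (j, q) in combinations(enumerate(points), 2))
    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                        # two components merge:
            bars.append((0.0, length / 2))  # one class dies here
            parent[ri] = rj
    bars.append((0.0, math.inf))            # the surviving component
    return bars
```

For the three collinear points (0, 0), (1, 0), (10, 0) this yields the bars (0, 0.5), (0, 4.5), and (0, ∞).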
The rest of [182] provides an early algorithm that modifies the algorithm we saw in Section 4.1.4 for the persistent
case where the coefficients are in a field. There is also a modification for coefficients in a PID such as the integers but
only for a pair of times i and i + p. Knowing the birth and death times of the classes doesn’t uniquely determine the
persistence module in this case, but we can compute regular homology at fixed times over the integers. As we will
see in Chapter 8, we can use the Universal Coefficient Theorem to obtain the homology in any other abelian group
from the homology with integer coefficients. We will describe that process when we cover some homological algebra.
Meanwhile, in the rest of this chapter, we will always work in a field and use Z2 unless otherwise specified.
I will not say anything more about the algorithm in [182]. It has historical interest, but has since been replaced by
much more efficient algorithms.

5.3 Bar Codes, Persistence Diagrams, and Persistence Landscapes


Now we know that we can characterize persistent homology by birth and death times of the homology classes
in each dimension. The next step is to produce a visualization. The three most popular are bar codes, persistence
diagrams, and persistence landscapes.
These visualizations are best explained by using pictures. The first picture I will show is from the survey paper by
Chazal and Michel [31].
Bar codes were the first visualization tried. The x-axis represents the filtration index or time (here the radius of
the balls). The y-axis doesn’t have a meaning. There is just a bar for every homology class. In Figure 5.3.1, the 0 and
1-dimensional homology are combined on one chart. Usually, we make a separate chart for each dimension, but this

Figure 5.3.1: Bar Codes and Persistence Diagrams [31]

diagram is simple enough to combine them. The red bars are in dimension 0 and the blue ones are dimension 1. If you
can’t see colors on your copy, there are two blue bars at the bottom of charts c, d, and e.
In picture a, we have a set of disconnected dots. The bars should probably have length 0 at this stage, but they are
longer so you can see them. In b, many of the dots have merged, so many of the 0-dimensional bars have died by then. In
picture c, we have two complete loops so we get the two blue bars on the bottom. From picture b, we see that the top
loop was born before the bottom one, so the corresponding bar starts further to the left. Also, we now have a single
component, so there is only one red bar that is still growing. In picture d, the bottom hole has closed up so the bottom
bar has ended. There is now one red bar corresponding to the one component and one blue bar corresponding to the
one hole. In picture e, all of the holes have closed and the homology stops evolving. We now have the full bar code
picture with one surviving red bar and all of the others have ended.
Bar codes present some problems, though. The main one is the lack of an easy distance measure between bar codes for different
point clouds. Also, they can be a little clumsy if there are a lot of homology classes. A more elegant visualization is a
persistence diagram. At the bottom right of Figure 5.3.1, the final bar chart is transformed into a persistence diagram.
This is more of a conventional scatter plot with the x-axis representing birth times and the y-axis representing death
times. Again, these times correspond to the radius of the balls when the class is formed and when it dies. For each
class, we plot a single point whose x coordinate is its birth time and whose y coordinate is its death time. In Figure
5.3.1, the 0-dimensional and 1-dimensional classes are again plotted together. All of the 0-dimensional classes sit on
the y-axis as they were all born at time 0 but die at different times. The two dots towards the right are the 1-dimensional
classes. Note that the higher one represents the top loop as it is born earlier (so it is further left) and dies later (so it is
higher) than the bottom loop.
Note that the persistence diagram always sits above the diagonal line y = x. After all, a class can’t die before it is
born. Also, the set of points plotted is known as a multiset as the same point can appear multiple times. There is no
reason that two or more classes can’t both be born and die at the same time as each other.
Persistence landscapes were developed by Bubenik [25], who felt they were more suitable for statistical analysis.
The reason has to do with distance measures, so I will defer that to Section 5.5. Like bar codes and persistence
diagrams, we start with a set of birth time-death time intervals. Rather than form a scatterplot, for a persistence
landscape, we construct a function for each of these intervals. The function is shaped like a triangle and its base and
height are proportional to its lifetime. Suppose homology class i is born at time bi and dies at time di . Then we define

a piecewise linear function

f(bi ,di ) (x) = x − bi    if bi < x ≤ (bi + di )/2,
             = −x + di   if (bi + di )/2 < x < di ,
             = 0         otherwise.

These functions are triangles with base from bi to di and height (di − bi )/2. So they are wider and taller the longer their
lifetime.
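A direct transcription of this piecewise linear function, together with the k-th landscape function λk from [25] (the pointwise k-th largest tent value, with λ1 the topmost curve); the function names are my own:

```python
def tent(b, d, x):
    """The piecewise linear function f_(b,d) attached to a class
    born at b and dying at d, evaluated at x."""
    if b < x <= (b + d) / 2:
        return x - b
    if (b + d) / 2 < x < d:
        return d - x
    return 0.0

def landscape(bars, k, x):
    """k-th persistence landscape function: the k-th largest tent
    value at x (k = 1 is the topmost curve), or 0 if fewer than k
    bars contribute at x."""
    vals = sorted((tent(b, d, x) for b, d in bars), reverse=True)
    return vals[k - 1] if k <= len(vals) else 0.0

bars = [(0.0, 2.0), (1.0, 3.0)]
print(landscape(bars, 1, 1.0), landscape(bars, 2, 1.0))  # -> 1.0 0.0
```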

Figure 5.3.2: Persistence Diagrams vs. Persistence Landscapes [55]

Figure 5.3.2 shows an example from [55]. In this case we are only plotting the 1-dimensional homology. Let ϵ be
the radius of the circles. The small hole along the right edge is born at ϵ = ϵ1 and dies at ϵ = ϵ2 . The large hole in
the middle is born at ϵ = ϵ2 and dies at ϵ = ϵ3 . The persistence diagram at the bottom left includes points at (ϵ1 , ϵ2 )
and (ϵ2 , ϵ3 ). In the persistence landscape at the bottom right, we have 2 triangular functions. The one on the left has a
base from ϵ1 to ϵ2 , while the one on the right has a base from ϵ2 to ϵ3 . The heights are (ϵ2 − ϵ1 )/2 and (ϵ3 − ϵ2 )/2
respectively.
In a real-life case there will be lots of triangles and they will overlap. This is a foreshadowing of what the λ1 curve is
doing in the picture.
I will end this section on a cliffhanger and finish this story in Section 5.5 after one short diversion.

5.4 Zig-Zag Persistence


In our previous discussion we were dealing with a filtration in which our complexes grow monotonically. If
this is not the case, there is an alternate notion of zig-zag persistence [30]. The material here is taken from [129].
Suppose we take multiple sets of samples Xi from a metric space X. We can form a sequence

X1 → X1 ∪ X2 ← X2 → X2 ∪ X3 ← X3 → · · ·

where the maps are all inclusions. This gives rise to the corresponding diagram

Hk (V Rϵ (X1 )) → Hk (V Rϵ (X1 ∪ X2 )) ← Hk (V Rϵ (X2 )) → Hk (V Rϵ (X2 ∪ X3 )) ← Hk (V Rϵ (X3 )) → · · ·

We now define zigzag diagrams more generally. They can have other shapes besides the one just shown. Keep in
mind that homology is taken with coefficients in a field F so the homology groups are actually vector spaces over F .
Definition 5.4.1. A zigzag diagram or zigzag module of shape S is a sequence of linear transformations between F
vector spaces:
X1 --f1 -- X2 --f2 -- · · · --fk−1 -- Xk

where each map fi has its direction specified by the i-th letter in S. Each letter is R or L.
Example 5.4.1. If S consists entirely of R′ s or entirely of L′ s then we have an ordinary filtration.
Example 5.4.2. If S = RRL then a zigzag diagram of shape S is

X1 → X2 → X3 ← X4

Example 5.4.3. If S = RLRLRL then a zigzag diagram of shape S is

X1 → X2 ← X3 → X4 ← X5 → X6 ← X7

In the case of filtrations, we would look for features that survived a chain of inclusion maps. For a zig-zag filtration,
we look for consistency across parts of the diagram. For example, if a zigzag diagram has shape RL,
X1 --f1 --> X2 <--f2 -- X3 ,

we look for features x1 ∈ X1 , x2 ∈ X2 , and x3 ∈ X3 such that f1 (x1 ) = x2 = f2 (x3 ).


In the next part, we use the term zigzag module to emphasize the algebraic structure.
Definition 5.4.2. A zigzag submodule N of a zigzag module M of shape S is a zigzag module of shape S such that
each space Ni in N is a subspace of the corresponding space Mi in M and the maps in N are the restrictions of those in M .
Example 5.4.4. Let F = R, and suppose we have the zigzag module
R --f1 --> R2 <--f2 -- R,

where for x ∈ R, f1 (x) = (x, 0) and f2 (x) = (0, x). Then we have a zigzag submodule
R --g1 --> R <--g2 -- R

where the R in the middle is the first coordinate of R2 so that for x ∈ R, g1 (x) = x, and g2 (x) = 0.

A zigzag module is decomposable if it can be written as the direct sum of non-trivial submodules. Otherwise it is
indecomposable.

Theorem 5.4.1. Any zigzag module of shape S can be written as a direct sum of indecomposables in a way that is
unique up to permutation.

Definition 5.4.3. An interval zigzag module of shape S is a zigzag module

X1 --f1 -- X2 --f2 -- · · · --fk−1 -- Xk

where for fixed a ≤ b, Xi = F for 1 ≤ a ≤ i ≤ b ≤ k, and Xi = 0 otherwise.

Theorem 5.4.2. The indecomposable zigzag modules are exactly the interval zigzag modules.

This theorem allows us to produce barcodes in this case by writing one bar extending from ai to bi for each
indecomposable zigzag submodule whose nonzero terms extend from ai to bi .
Here is one type of zigzag module we can build in a metric space. Let X = {x1 , x2 , x3 , · · · , xn } be a finite metric
space where an ordering of the points has been chosen, and for 1 ≤ k ≤ n, Xk = {x1 , x2 , x3 , · · · , xk } consists of the
first k points of the ordering.

Definition 5.4.4. Let A and B be non-empty subsets of a metric space X with distance function d. The Hausdorff
distance between A and B is

dH (A, B) = max( sup_{a∈A} inf_{b∈B} d(a, b), sup_{b∈B} inf_{a∈A} d(a, b) ).

So for sets A and B, for each element of A, find the closest element of B and find the distance. Then take the
largest of these as we run over the elements of A. Then do the same thing with the roles of A and B reversed. Finally
take the largest of those two values.
Now for 1 ≤ k ≤ n, let ϵk = dH (Xk , X). Note that ϵk ≥ ϵk+1 since Xk+1 includes an additional point so it will
always be at least as close to the full set X as Xk in Hausdorff distance.
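For finite sets the suprema and infima in Definition 5.4.4 are just max and min, so the Hausdorff distance is a few lines of code (my own sketch, quadratic in the set sizes):

```python
def hausdorff(A, B, d):
    """Hausdorff distance between finite non-empty sets A and B with
    distance function d.  For each point of A take the distance to its
    nearest point of B and keep the worst case; do the same with the
    roles reversed; return the larger of the two values."""
    d_ab = max(min(d(a, b) for b in B) for a in A)
    d_ba = max(min(d(a, b) for a in A) for b in B)
    return max(d_ab, d_ba)

# On the real line: every point of {0, 1} is close to {0, 5},
# but 5 is far from {0, 1}, so the distance is 4.
print(hausdorff([0, 1], [0, 5], lambda x, y: abs(x - y)))  # -> 4
```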

Definition 5.4.5. Choose real numbers α > β > 0. The Rips zigzag consists of the zigzag module specified by the
diagram of simplicial complexes

· · · → V Rβϵi−1 (Xi−1 ) → V Rαϵi−1 (Xi ) ← V Rβϵi (Xi ) → V Rαϵi (Xi+1 ) ← V Rβϵi+1 (Xi+1 ) → · · ·

The α and β control the size of the complexes and were introduced for computational efficiency.
This is all I will say about zigzag persistence. See [129], Section 5.4.3 for a use in studying dementia in AIDS
patients.

5.5 Distance Measures


Now that we can produce a bar code or persistence diagram from a set of data points, we would like to define a
distance measure between these diagrams. This will allow us to perform machine learning tasks such as clustering and
classification. What we need from these measures is that they have the property of stability. In other words, we want
small changes in our data to result in small changes in the distance between our persistence diagrams. An important
aspect of TDA is that there are measures that behave in this way. The material in this section is taken from [31] and
[129].
Recall that a persistence diagram consists of a multiset of points in the plane, each of which represents a homology
class. The x-coordinate is the birth time and the y-coordinate is the death time. We also will include the diagonal and
consider each point on it to have infinite multiplicity.
Definition 5.5.1. Let a = (a1 , · · · , an ) and b = (b1 , · · · , bn ) be points in Rn . Then the Lp distance between a and b
for p a positive integer is

dp (a, b) = ( Σni=1 |ai − bi |p )1/p .

The L∞ distance is

d∞ (a, b) = max1≤i≤n |ai − bi |.

Now suppose we have two diagrams dgm1 and dgm2 . A matching m between dgm1 and dgm2 is a bijective map
between their points. A point of dgm1 can be mapped to a point of dgm2 or it can be mapped to the closest point
on the diagonal. If p = (b1 , d1 ) and q = (b2 , d2 ) then d∞ (p, q) = max(|b1 − b2 |, |d1 − d2 |). If p = (b, d) and we
want to match p to a point on the diagonal, we map it to ((b + d)/2, (b + d)/2), and the L∞ distance is |d − b|/2. We
want to find the matching that minimizes the worst distance between any pair of points in the matching.
Definition 5.5.2. The bottleneck distance between two diagrams dgm1 and dgm2 is

dB (dgm1 , dgm2 ) = infm max(p,q)∈m d∞ (p, q).

Here the infimum is taken over all possible matchings, where p is in dgm1 or on the diagonal and q is in dgm2 or on
the diagonal.

Figure 5.5.1: Example Matching of Two Persistence Diagrams [31]

Figure 5.5.1 shows an example of matching 2 persistence diagrams. The blue circles belong to one diagram and
the red squares belong to another. Note that some points from each diagram are mapped to the nearest point on the
diagonal.
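For intuition, a matching like the one in Figure 5.5.1 can be found by brute force on tiny diagrams, using the standard trick of padding each diagram with the diagonal projections of the other diagram's points so that every matching is a bijection; diagonal-to-diagonal pairs cost nothing. Everything below is my own illustrative sketch: real packages (persim, for example) use much faster algorithms.

```python
from itertools import permutations

def d_inf(p, q):
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def bottleneck(dgm1, dgm2):
    """Brute-force bottleneck distance between two small persistence
    diagrams (lists of (birth, death) pairs).  Exponential in the
    number of points: for illustration only."""
    proj = lambda p: ((p[0] + p[1]) / 2, (p[0] + p[1]) / 2)
    A = list(dgm1) + [proj(q) for q in dgm2]   # pad with diagonal points
    B = list(dgm2) + [proj(p) for p in dgm1]
    diag_a = set(range(len(dgm1), len(A)))     # indices of diagonal points
    diag_b = set(range(len(dgm2), len(B)))
    best = float('inf')
    for perm in permutations(range(len(B))):   # try every matching
        cost = 0.0
        for i, j in enumerate(perm):
            if i in diag_a and j in diag_b:
                continue                       # diagonal-to-diagonal: free
            cost = max(cost, d_inf(A[i], B[j]))
        best = min(best, cost)
    return best
```

Matching (0, 2) to (0, 2.2) costs 0.2, beating the cost |d − b|/2 = 1 of sending both points to the diagonal.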
The bottleneck distance is entirely determined by the furthest pair of points. To give more influence to the other
points, we can use the Wasserstein distance instead.

Definition 5.5.3. The p-Wasserstein distance between two diagrams dgm1 and dgm2 is

dWp (dgm1 , dgm2 ) = ( infm Σ(p,q)∈m d∞ (p, q)p )1/p .

As before, the infimum is taken over all possible matchings, where p is in dgm1 or on the diagonal and q is in dgm2 or
on the diagonal.

Now recall from the last section that the Hausdorff distance dH is a nearness measure on subsets of a metric space.
What if we want to generalize this when the sets may not lie in the same metric space?

Definition 5.5.4. A function f : X → Z between metric spaces is an isometry if dX (x, y) = dZ (f (x), f (y)) for all
x, y ∈ X. If f is one-to-one then f is an isometric embedding of X into Z.

Definition 5.5.5. If X1 and X2 are compact metric spaces and can be isometrically embedded into Z then we define
the Gromov-Hausdorff distance as

dGH (X1 , X2 ) = inf dH (f1 (X1 ), f2 (X2 ))

where the infimum is taken over all possible isometric embeddings fi : Xi → Z, for i = 1, 2, and dH denotes the
Hausdorff distance between subsets of Z.

The following stability theorem is from [129] and credited by them to [32]. See [31] and its references for several
other variants.

Theorem 5.5.1. Let X and Y be finite metric spaces. Let P Hk (V R(X)) be the k-dimensional persistence diagram
of the Vietoris-Rips complex of X and let P Hk (V R(Y )) be defined similarly. Then for all k > 0,

dB (P Hk (V R(X)), P Hk (V R(Y ))) ≤ dGH (X, Y ).

So a small change in a point cloud results in a small change in bottleneck distance between the respective persis-
tence diagrams. Similar theorems hold for Wasserstein distance and persistence diagrams built in the other ways I will
describe in the next few sections.
You may wonder if there is an efficient way to calculate these distances. The answer is yes, but I won’t go into
the details. The persistent homology packages that I will list at the end of this chapter can all compute them for you. I will refer
you to their descriptions and references if you are interested in efficient computation methods.
There have been some efforts to develop a statistical theory for persistent homology. For example, if you have a
point cloud, how much of the true geometry have you covered? You really only have a sample. Is there a statistical
theory complete with hypothesis testing, confidence intervals, etc.? This appears to be a pretty undeveloped area. If
you are interested, see Chapter 3 of [129] or the relevant sections of [31].
What I will say is that persistence landscapes provide the framework for some familiar statistical results such as
the law of large numbers and the central limit theorem. The key is that we have the notion of a mean persistence
landscape. In addition, we have a metric which makes landscapes into a Banach space, which I will now define.

Definition 5.5.6. Let X be a vector space over the real or complex numbers. Then X is a normed space if for every
vector x ∈ X we define a real number ||x|| called the norm of x with the following properties:

1. For every vector x, ||x|| ≥ 0, and ||x|| = 0 if and only if x = 0.

2. For every vector x and scalar α, ||αx|| = |α|||x||.

3. Triangle Inequality: For x, y ∈ X, ||x + y|| ≤ ||x|| + ||y||.

Any normed space is automatically a metric space where we define for x, y ∈ X, the distance d(x, y) = ||x − y||.
96 CHAPTER 5. PERSISTENT HOMOLOGY

Example 5.5.1. Rn is a normed space. If x = (x1 , · · · , xn ) ∈ Rn , then define

||x|| = √(x1² + x2² + · · · + xn²).

To define a Banach space, we need the notion of completeness.

Definition 5.5.7. Let X be a metric space (or more generally any topological space). A sequence of points in X is
the image of a function f : Z + → X where Z + is the set of positive integers. We write a sequence as {x1 , x2 , · · · },
where xi = f (i) for all i ∈ Z + .

Definition 5.5.8. Let X be a metric space. A sequence of points {x1 , x2 , · · · } in X converges to a point y ∈ X if
given ϵ > 0, there exists an integer N such that d(y, xi ) < ϵ for i > N . The point y is called the limit of the sequence.

Definition 5.5.9. Let X be a metric space. A sequence of points {x1 , x2 , · · · } in X is a Cauchy sequence if given
ϵ > 0, there exists an integer N such that d(xi , xj ) < ϵ for i, j > N .

Definition 5.5.10. Let X be a metric space. Then X is complete if every Cauchy sequence of points in X converges
to a point in X. If X is a complete metric space which is also a normed space, then X is a Banach space.

Example 5.5.2. R is a Banach space and so is every closed subspace. The open interval (0, 1) is not a Banach space:
let xi = 1/(i + 1) for every positive integer i. This is a Cauchy sequence, but its limit 0 is not in (0, 1).

Now recall that to create a persistence landscape, we started with a persistence diagram and for each non-diagonal
point (bi , di ) we constructed a triangular shaped function

f(bi ,di )(x) =
    x − bi ,    if bi < x ≤ (bi + di )/2,
    −x + di ,   if (bi + di )/2 < x < di ,
    0,          otherwise.

We can safely assume that the persistence diagram has finitely many off-diagonal points. Recall that we have a
triangle for each of these and in practice there is a lot of overlap. For every positive integer k, we have a function λk
defined by
λk (x) = kmax{f(bi ,di ) (x)|(bi , di ) ∈ P },
where P is our persistence diagram and kmax takes the k-th largest value of the overlapping triangles. Set λk (x) = 0
if there is no k-th largest value.
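This construction is short enough to sketch in plain Python (a minimal illustration, not any particular package's API):

```python
def tent(b, d, x):
    """The triangle-shaped function f_(b,d) evaluated at x."""
    mid = (b + d) / 2
    if b < x <= mid:
        return x - b          # rising edge
    if mid < x < d:
        return d - x          # falling edge (i.e. -x + d)
    return 0.0

def landscape(diagram, k, x):
    """k-th landscape function at x for a finite list of (b, d) points.
    Returns 0 when there is no k-th largest value."""
    vals = sorted((tent(b, d, x) for b, d in diagram), reverse=True)
    return vals[k - 1] if k <= len(vals) else 0.0

P = [(0, 2), (1, 3)]  # a toy persistence diagram
print(landscape(P, 1, 1.5), landscape(P, 2, 1.5))  # 0.5 0.5
```

At x = 1.5 the two tents overlap with equal height, so the first and second landscape functions agree there.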
Now let λ = {λk }k∈Z + . This sequence of functions is what we will call a persistence landscape. Persistence
landscapes form a vector subspace of the Banach space Lp (Z + × R) consisting of sequences of functions of this form
whose Lp norm defined below is finite. For the vector space structure, if λ and η are two landscapes and c ∈ R, then
we define (λ + η)k to be λk +ηk and (cλ)k = cλk for all k. It is a Banach space with the norm

||λ||p = ( Σ_{k=1}^∞ ||λk||p^p )^{1/p},

where ||f||p denotes the Lp norm ( ∫_R |f|^p )^{1/p}.

Now we can talk about the mean of persistence landscapes. For example, the mean of η and λ is (η + λ)/2. This
doesn’t necessarily correspond to a persistence landscape but it does allow us to derive a law of large numbers and
a central limit theorem for persistence landscapes allowing for a statistical analysis that is more natural than if we
worked with persistence diagrams. See Bubenik [25] for more details.
My own interest in persistence landscapes and especially the work of Gidea and Katz [55] is the application to
time series analysis. I will go into more details in Sections 5.8 and 5.9.
So now we can review the main steps for analyzing a point cloud:

Step 1: Look at the points. If they are low dimensional, we may be able to see some features we want to capture.
Otherwise we can use a standard dimensionality reduction technique like principal component analysis or multidimensional scaling.
Step 2: Compute homology at different values of an increasing radius ϵi . Do this for several low dimensions.
Step 3: Look at a persistence diagram. Points that are farther from the main diagonal are more "persistent" and
may be the most interesting. For example, if the points are in the shape of a ring, the hole in the middle will last the
longest.
Step 4: Use bottleneck or Wasserstein distance to perform clustering and classification tasks.
I will describe some additional ideas in the rest of this chapter and the next two.
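For 0-dimensional homology, Step 2 amounts to single-linkage clustering and can be sketched with a union-find structure: every point is born at ϵ = 0, and a class dies each time two components merge. This is a hypothetical minimal implementation for dimension 0 only; the packages listed at the end of the chapter handle higher dimensions as well.

```python
import math
from itertools import combinations

def h0_persistence(points):
    """0-dimensional persistence of the Vietoris-Rips filtration of a
    point cloud. Returns (birth, death) pairs; one class never dies."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Process edges in order of increasing length; each merge kills a class.
    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    diagram = []
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            diagram.append((0.0, eps))
    diagram.append((0.0, math.inf))  # the surviving component
    return diagram

print(h0_persistence([(0, 0), (1, 0), (10, 0)]))
# [(0.0, 1.0), (0.0, 9.0), (0.0, inf)]
```

The two nearby points merge at ϵ = 1, the outlier joins at ϵ = 9, and one component persists forever, matching the intuition of Step 3.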
So we are left with the question: Is TDA good for anything? It seems to be mostly potential for now, but in the internet
of things classification work that I will describe in Section 5.9, it worked better than other methods we tried for noisy
long term data. In [129] there are a lot of biology examples. For example, they describe work on the evolution of
viruses. If two strains of a virus infect a single cell, they can combine into a new strain creating a cycle in their
family tree. The size of the cycle corresponds to the persistence of a one dimensional homology class and can be
informative in determining a virus’s ancestry. There is also a lot of discussion on the use of TDA in cancer research.
They particularly cite the success of the related technique, Mapper, in the discovery of a previously unknown line of
breast cancer cells. I will discuss that result in Section 6.3. Gidea and Katz [55] have a financial application in mind. I
will say more about that in Section 5.9. There are a lot more applications in the literature, but I will leave it to subject
matter experts to decide how useful they are.
What I can say for sure is that there is a lot of interest in the subject and it will be easy to conduct experiments as
there is free and open source software that can do all of the steps for you. It will be interesting to see where all of this
leads in the future.

5.6 Sublevel Sets


In various problems in topology and especially in the field of differential geometry we deal with curved versions of
Rn called manifolds. Think of the surface of the Earth that looks flat when you are on it but you can see its spherical
structure as you move back. We know that gravity bends space and time so differential geometry is the main tool in
general relativity. Here is a precise definition.
Definition 5.6.1. A topological space M is an n-manifold if every point x ∈ M has an open neighborhood homeo-
morphic to Rn .
A curve is a 1-manifold and a surface is a 2-manifold.
Given a manifold M , we may want to look at the inverse images of a height function h : M → R. The study
of these functions is called Morse theory. A full description can be found in [101]. A source of persistent homology
diagrams is the sublevel sets of this function, i.e. h−1 ((−∞, a]) for a ∈ R. This is best illustrated by some examples.

Figure 5.6.1: Sublevel Sets of the Height Function of a Torus [129]

Example 5.6.1. Figure 5.6.1 [129] shows the sublevel sets of the height function of a torus that is standing on its side.
The four examples on the right side of the figure represent a disk, a (bent) cylinder, a torus with a disk removed, and
then a full torus.

Figure 5.6.2: Barcode and Persistence Diagram of a Function f : [0, 1] → R [31].

Example 5.6.2. Figure 5.6.2 from [31] shows a real valued function of a single variable. Looking at the sublevel sets,
we start at y = 0 and move upward, looking at the sets f −1 ((−∞, y]) for increasing values of y. The resulting sets are
empty until we hit the minimum at a1 . Now we have an interval on the x-axis, so we have created a new 0-dimensional
homology class. A second one is created at a2 , which is another minimum. When we hit a local maximum, a class
is destroyed by the merging of two intervals. There are merges at a4 and a5 . The resulting barcodes and persistence
diagrams are in the center and right of the figure.
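The 0-dimensional bookkeeping in this example is easy to automate for a sampled function: sweep the sample values from lowest to highest, create a component at each local minimum, and at each merge apply the "elder rule" (the younger component dies). A minimal sketch, not tied to any package:

```python
import math

def sublevel_h0(y):
    """0-dimensional sublevel-set persistence of a function sampled at
    y[0], ..., y[n-1] along a line. Returns (birth, death) pairs."""
    parent, birth, pairs = {}, {}, []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add samples in order of increasing value, merging with neighbors.
    for i in sorted(range(len(y)), key=lambda i: y[i]):
        parent[i], birth[i] = i, y[i]
        for j in (i - 1, i + 1):
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    if birth[ri] > birth[rj]:   # elder rule: younger dies
                        ri, rj = rj, ri
                    if birth[rj] < y[i]:        # skip zero-persistence pairs
                        pairs.append((birth[rj], y[i]))
                    parent[rj] = ri
    for r in {find(i) for i in parent}:
        pairs.append((birth[r], math.inf))      # global minimum never dies
    return sorted(pairs)

print(sublevel_h0([0, 2, 1, 3]))  # [(0, inf), (1, 2)]
```

The sampled function has minima with values 0 and 1; the class born at 1 dies when the two intervals merge at the local maximum of value 2, and the class born at the global minimum lives forever.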

Figure 5.6.3: Barcode and Persistence Diagram of a Height function of a surface in R3 [31].

Example 5.6.3. Figure 5.6.3 also comes from [31]. In this case we have a surface in R3 and the sublevel sets have
both 0-dimensional homology (left 2 bars) and 1-dimensional homology (right 3 bars). The corresponding persistence
diagram is on the right. If you can see the colors, the red bars and dots are 0-dimensional classes and the blue ones are
1-dimensional homology. Note that this surface is homeomorphic to a torus.

Example 5.6.4. Suppose we have a surface which is the graph of a function of 2 variables. The sublevel sets form the
lawnmower complex. I once used this as an idea for detecting a large change in a set of geographic points. Here is a
potential use case.
The Island of the Lost Elephants is a little known island off the coast of Alaska originally settled by a herd of
elephants with very bad navigation skills who had been looking for a shortcut to Antarctica. The island is rectangular
and is divided into clearly marked squares of equal size. Every year, the island is visited by a team from the North
American Society for the Counting of Elephants (NASCE). They count the number of elephants in each square and
plot a three dimensional surface whose height is the number of elephants in each square. The motivation is to find out
whether the location of the elephants has significantly changed. Computing a persistence diagram for each year based
on the sublevel sets, we can then plot bottleneck or Wasserstein distances between them. If we see a spike or significant
level change we can issue an ECLA (elephant change of location alert). This is a better solution than using Euclidean
distance because an elephant moving into a neighboring square should not be as significant as a longer move.

Example 5.6.5. Suppose we are interested in a set of grayscale images. Each pixel represents a number between
0 and 255. This is similar to the previous case in that we have a surface which is the graph of a function of two
variables (pixel value vs row and column). We can then compute the associated persistence diagram and look at the
distance measures. Again, this is preferable to Euclidean distance as it will be less sensitive to small differences in
the positions of the objects in the image. For a color image, there are three values between 0 and 255 for each pixel.
We can simply combine them through an operation such as adding them or taking their average value.

Example 5.6.6. The Vietoris-Rips complex of a point cloud is a special case of a sublevel set. Our height function
is then the Euclidean distance of each point of Rn to the nearest point of our finite data set where our data points are
embedded in Rn .

In Chapter 6, I will discuss the data visualization method, Mapper. Mapper also involves looking at the inverse
images of a height function and is a variation on the theme of this section. See the description there for details.

5.7 Graphs
Another structure that can be analyzed with persistent homology is a graph. Graphs can also lead to more traditional simplicial complexes. I will start with a quick review of graph terminology.

5.7.1 Review of Graph Terminology

Figure 5.7.1: Example of a Graph

Definition 5.7.1. A graph G is a nonempty set V (G) of elements called vertices or nodes together with a set E(G)
of elements called edges such that for each edge we associate a pair of vertices called its ends. An edge with identical
ends is called a loop. The 2 ends of an edge are adjacent or joined by the edge, and we refer to adjacent vertices as
neighbors. We say that the edge is incident to each of its ends.

Figure 5.7.1 represents a graph with 6 vertices and 11 edges. The edge E1 , for example, joins the 2 vertices V1 and
V5 so that V1 and V5 are adjacent to each other or V1 is a neighbor of V5 and vice versa. E1 is incident to V1 and V5 .

Definition 5.7.2. A graph G is complete if every pair of distinct vertices is joined by an edge. It is bipartite if the
vertices can be partitioned into 2 sets X and Y such that every edge of G has one end in X and one end in Y .

It is sometimes convenient to represent a graph in terms of a matrix. We define 2 types of matrices which contain
all of the information represented in a graph.

Definition 5.7.3. Given a graph G, the adjacency matrix A(G) is defined to be the matrix (aij ) whose rows and
columns are indexed by the vertices of G and such that aij is the number of edges joining vertices i and j. (Note
that this matrix is always symmetric, since if an edge joins i and j it also joins j and i.) The incidence matrix B(G)
is defined to be the matrix (bve ) whose rows are indexed by the vertices of G and whose columns are indexed by the
edges of G. We let bve = 2 if edge e is a loop at v, bve = 1 if e joins v to another vertex, and bve = 0 otherwise.

For the graph in the figure, taking the vertices and edges in numerical order we have that the adjacency matrix is

       [ 0 1 1 1 1 1 ]
       [ 1 0 1 1 1 1 ]
A(G) = [ 1 1 0 0 1 1 ]
       [ 1 1 0 0 0 0 ]
       [ 1 1 1 0 0 0 ]
       [ 1 1 1 0 0 0 ]

and the incidence matrix is

       [ 1 1 1 1 1 0 0 0 0 0 0 ]
       [ 0 0 1 0 0 1 1 1 1 0 0 ]
B(G) = [ 0 1 0 0 0 0 1 0 0 1 1 ]
       [ 0 0 0 1 0 0 0 0 1 0 0 ]
       [ 1 0 0 0 0 1 0 0 0 1 0 ]
       [ 0 0 0 0 1 0 0 1 0 0 1 ]
Unless otherwise specified we will always be working with simple graphs, i.e. graphs with no loops or multiple
edges.

Definition 5.7.4. In a simple graph, the degree of a vertex is the number of its neighbors. A graph is d-regular if every
vertex has degree d, and it is regular if it is d-regular for some d.

For the graph in the figure, V1 and V2 have degree 5, V3 has degree 4, V5 and V6 have degree 3, and V4 has degree
2. Note that in the adjacency matrix, the degree of a vertex is the weight (i.e. number of ones) of its corresponding
row and corresponding column. It is also equal to the weight of the corresponding row in the incidence matrix.
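This row-weight observation is easy to verify directly; a quick sketch in Python using the adjacency matrix above:

```python
# Adjacency matrix of the example graph (vertices V1..V6 in order).
A = [
    [0, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
]

# The degree of each vertex is the weight (number of ones) of its row.
degrees = [sum(row) for row in A]
print(degrees)  # [5, 5, 4, 2, 3, 3]

# The matrix is symmetric, so column weights give the same answer.
assert degrees == [sum(col) for col in zip(*A)]
```

The output matches the degrees listed in the text: V1 and V2 have degree 5, V3 has degree 4, V4 has degree 2, and V5 and V6 have degree 3.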

Definition 5.7.5. Let G and H be graphs. If V (H) ⊆ V (G), E(H) ⊆ E(G) and every edge in H has the same
pair of ends as it has in G, we say that H is a subgraph of G or G is a supergraph of H. We say that H spans G if
V (H) = V (G).

For example, if H is the subgraph of the graph G in our figure such that

E(H) = {E3 , E7 , E9 , E10 , E11 },

all of the vertices of G are included in the endpoints of these edges, so H is a proper subgraph which spans G.

Definition 5.7.6. A walk in a graph G is a sequence W = v0 e1 v1 e2 . . . ek vk , where the vi are vertices of G, the ei are
edges of G, and for 1 ≤ i ≤ k, vi−1 and vi are the ends of ei . The walk is open if v0 ̸= vk and closed if v0 = vk . The
length of W is the number of its edges. The walk is a trail if the edges are all distinct and a path if the vertices are all
distinct. A closed trail of positive length whose vertices (apart from its ends v0 and vk ) are distinct is called a cycle.

For example, in our graph G, V1 E1 V5 E10 V3 E2 V1 is a cycle. G contains many examples of walks, paths, trails,
and cycles. The reader should look for some examples.

Definition 5.7.7. A graph is connected if any 2 of its vertices are connected by a path. A component of a graph is a
connected subgraph which is maximal in the sense that it is not properly contained in any larger connected subgraph.

Definition 5.7.8. A graph is acyclic or a forest if it contains no cycle. A tree is an acyclic connected graph. A vertex
in a tree of degree one is called a leaf. A spanning tree of a graph G is a subgraph H which is a tree and spans G.
We saw that the subgraph H with edge set

E(H) = {E3 , E7 , E9 , E10 , E11 }

spans our graph G. Figure 5.7.2 shows this subgraph.

Figure 5.7.2: Spanning Tree

It is now easily seen that this graph is a tree, so it is a spanning tree. There are a number of others. The leaves of
this tree are V1 , V4 , V5 , V6 .
Theorem 5.7.1. If H is a spanning tree for G then the number of edges of H is |E(H)| = |V (G)| − 1 = |V (H)| − 1.
So far, I have only been discussing undirected graphs in which the edge does not have a preferred direction. I now
give the following definition:
Definition 5.7.9. A directed graph or digraph D is a graph G in which each edge is assigned a direction with the
starting end called its tail and the finishing end called its head.
Figure 5.7.3 shows a directed graph with the same edge set and vertex set as the graph in Figure 5.7.1.

Figure 5.7.3: Directed Graph



Definition 5.7.10. The associated digraph D(G) of an undirected graph G is the digraph obtained from G by replacing
each edge of G by 2 edges pointing in opposite directions. The underlying graph G(D) of a digraph D is the graph
obtained from D by ignoring the directions of its edges.

Note that G(D(G)) ̸= G and D(G(D)) ̸= D. Figure 5.7.1 is the underlying graph of Figure 5.7.3, but Figure
5.7.3 is not the associated digraph for Figure 5.7.1.
I conclude this section with some analogous definitions for digraphs to those given above for undirected graphs.

Definition 5.7.11. Given a digraph D, the adjacency matrix A(D) is defined to be the matrix (aij ) whose rows and
columns are indexed by the vertices of D and such that aij is the number of edges with tail i and head j. (Note that
this matrix is not generally symmetric.)

For the graph in Figure 5.7.3, taking the vertices and edges in numerical order we have that the adjacency matrix
is

       [ 0 1 1 1 1 1 ]
       [ 0 0 1 1 1 1 ]
A(D) = [ 0 0 0 0 1 1 ]
       [ 0 0 0 0 0 0 ]
       [ 0 0 0 0 0 0 ]
       [ 0 0 0 0 0 0 ]

Definition 5.7.12. In a digraph D, the outdegree of a vertex v is the number of edges of D whose tail is v. The indegree
of v is the number of edges of D whose head is v. The sum of the outdegree and the indegree of v is the total degree of
v. This is equal to the degree of v in the underlying graph G(D). A sink is a vertex of outdegree 0, and a source is a
vertex of indegree 0.

The terms sink and source are actually derived from fluid mechanics. Fluids are sucked into sinks and spray out of
sources. The digraph in Figure 5.7.3 has V1 as its only source and sinks V4 , V5 , and V6 .
The indegree and outdegree of each vertex can be read off from the adjacency matrix. The outdegree is the weight
of the corresponding row, and the indegree is the weight of the corresponding column. For example, V2 has indegree
1 and outdegree 4.

Definition 5.7.13. A directed walk in a digraph D is a sequence W = v0 e1 v1 e2 . . . ek vk , where the vi are vertices of
G, the ei are edges of G, and for 1 ≤ i ≤ k, vi−1 is the tail of ei , and vi is the head of ei . Directed trails, paths and
cycles are defined in an analogous way.

Note that the digraph in Figure 5.7.3 has no directed cycles.


Finally, I will define a weighted graph.

Definition 5.7.14. A weighted graph G has a weighting function w : E(G) → R which assigns a real number called
the weight to each edge of the graph.

Note that both undirected and directed graphs can be weighted. In either case we modify the definition of the
adjacency matrix by replacing each one with the weight of the corresponding edge.

5.7.2 Graph Distance Measures


In areas such as network change detection or graph classification, we would like to have a good way to measure a
distance between two graphs. A good reference on this subject with lots of interesting ideas is the book of Wallis
et al. [167]. If two graphs have a similar set of nodes, a quick and easy metric to compute is graph edit distance. It has
served me well on many network change detection problems.

Definition 5.7.15. Let G1 and G2 be two graphs. Suppose for i = 1, 2 that |Vi | and |Ei | represent the number of vertices
and edges respectively in graph Gi , and |V1 ∩ V2 | and |E1 ∩ E2 | are the number of vertices and edges respectively that
the two graphs have in common. Then the (unweighted) graph edit distance is

d(G1 , G2 ) = |V1 | + |V2 | − 2|V1 ∩ V2 | + |E1 | + |E2 | − 2|E1 ∩ E2 |.

If the graphs are weighted we can add an additional term. Let {ej } be the set of common edges and let βj1 and βj2 be
the respective weights of edge ej in graphs G1 and G2 . Then the weighted edit distance is
d(G1 , G2 ) = |V1 | + |V2 | − 2|V1 ∩ V2 | + |E1 | + |E2 | − 2|E1 ∩ E2 | + α Σj |βj1 − βj2 | / max(|βj1 |, |βj2 |),

where α is a scaling factor which is usually set to one.


Note that if G1 = G2 then |V1 | = |V2 | = |V1 ∩ V2 | and |E1 | = |E2 | = |E1 ∩ E2 |, so

d(G1 , G2 ) = |V1 | + |V2 | − 2|V1 ∩ V2 | + |E1 | + |E2 | − 2|E1 ∩ E2 | = 0.

Figure 5.7.4: Graph Edit Distance Example

Example 5.7.1. Consider the two graphs in Figure 5.7.4. There are 5 vertices in each of the two graphs and they
have 4 of them in common. G1 has 5 edges and G2 has 4 edges and the two graphs have 3 edges in common. The
unweighted graph edit distance is

d(G1 , G2 ) = 5 + 5 − 2(4) + 5 + 4 − 2(3) = 5.

If we look at weights there are three common edges: AB, BC, and BD. For AB, we have |4 − 12| = 8 and
max(4, 12) = 12. For BC, we have |8−1| = 7 and max(8, 1) = 8. For BD, we have |7−4| = 3, and max(7, 3) = 7.
So to compute the weighted graph edit distance for α = 1, we add the unweighted distance of 5 to the terms in the
summation giving
d(G1 , G2 ) = 5 + 8/12 + 7/8 + 3/7 = 5 + (112 + 147 + 72)/168 ≈ 6.97.
In my own work I have gotten better results by dropping the weight factor even if the graph is weighted.
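Definition 5.7.15 translates almost directly into code. Below is a sketch using Python sets, with edges stored as frozensets of endpoints and weights in dictionaries. The vertex sets and the common edge weights reproduce the counts of Example 5.7.1, but the specific non-common edges (AE, DE, AF) are made up for illustration.

```python
def graph_edit_distance(V1, E1, V2, E2, W1=None, W2=None, alpha=1.0):
    """Graph edit distance; pass weight dicts W1, W2 for the weighted form."""
    V1, V2, E1, E2 = set(V1), set(V2), set(E1), set(E2)
    d = (len(V1) + len(V2) - 2 * len(V1 & V2)
         + len(E1) + len(E2) - 2 * len(E1 & E2))
    if W1 is not None and W2 is not None:
        # Weight penalty summed over the common edges.
        d += alpha * sum(abs(W1[e] - W2[e]) / max(W1[e], W2[e])
                         for e in E1 & E2)
    return d

e = frozenset
# Five vertices each, four in common; 5 vs 4 edges, three in common.
V1, V2 = {"A", "B", "C", "D", "E"}, {"A", "B", "C", "D", "F"}
W1 = {e("AB"): 4, e("BC"): 8, e("BD"): 7, e("AE"): 1, e("DE"): 2}
W2 = {e("AB"): 12, e("BC"): 1, e("BD"): 4, e("AF"): 3}

print(graph_edit_distance(V1, W1, V2, W2))                    # 5
print(round(graph_edit_distance(V1, W1, V2, W2, W1, W2), 2))  # 6.97
```

Passing the weight dictionaries as the edge sets works because iterating a dict yields its keys; the two printed values match the worked example.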

5.7.3 Simplicial Complexes from Graphs


Graphs give rise to several natural simplicial complexes. I will start with those. Then I will talk about how they
give rise to persistence diagrams.

Recall that in an abstract simplicial complex we start with a collection of objects called vertices. A simplex is a
subset of the vertices that will be included in the complex. If the subset is of order n, then we have an (n − 1)-simplex.
A maximal simplex is a simplex which is not a subset of any other simplex. So in order to define an abstract simplicial
complex, it suffices to define its vertices and maximal simplices. The complex then consists of the maximal simplices
and all of their subsets.
Before I list these complexes, I will give one more definition that will be helpful in understanding the fourth example.
Definition 5.7.16. A partially ordered set or poset is a set X with a binary relation ≤ called a partial order with the
following three properties:
1. Reflexivity: If a ∈ X, then a ≤ a.
2. Antisymmetry: If a, b ∈ X then a ≤ b and b ≤ a implies a = b.
3. Transitivity: If a, b, c ∈ X, then a ≤ b and b ≤ c implies a ≤ c.
Note that in a partially ordered set it is possible that for a, b ∈ X that a ≤ b and b ≤ a are both false. In this case we
say that a and b are incomparable.
An easy example of a partially ordered set is a collection of sets where A ≤ B if A ⊆ B.
Now here are six simplicial complexes you can build out of a graph:

1. The graph itself. This is a complex of dimension one. We can think of any higher dimensional complex as
strategically throwing information away. This is useful if we are trying to detect only major changes in a graph.
2. Complete Subgraph Complex: The vertices are the graph vertices and the maximal simplices are the vertex sets
   forming maximal complete subgraphs. Note that a complete subgraph is also called a clique.
3. Neighborhood Complex: The vertices are the graph vertices and the maximal simplices consist each vertex
along with their neighbors.
4. Poset Complex: The vertices are the graph vertices and the maximal simplices are the maximal directed paths.
   Note this is defined only for a directed graph. We can also do something similar for any finite partially ordered
   set.
5. Matching Complex: The vertices are the graph edges and the simplices are the edge sets forming a matching.
A matching is a set of edges in which no two are adjacent.
6. Broken Circuit Complex: The vertices are the graph edges and the maximal simplices are the maximal
   connected edge sets containing no circuit. (These are the spanning trees for a connected graph.) Such edge sets
   form the independent sets of a matroid (a generalization of linear independence) and have particularly simple
   homology. For details see [18].

Here are some potential features for a machine learning algorithm:


1. Dimension of the complex.
2. Number of simplices in each dimension.
3. Euler characteristic.
4. Betti numbers of homology groups.
Finally, suppose we have a weighted graph. We can make a Vietoris-Rips complex and the corresponding persistence diagram. For each ϵ, we include a set of vertices in a simplex if for each pair of vertices, there is a path between
them of weight less than or equal to ϵ. (The weight of a path is the sum of the weights of its edges.) Then we can make
bar codes, persistence diagrams, and persistence landscapes just as in the case of point clouds.
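The only new ingredient in this filtration is the minimum path-weight metric on the vertices, which can be computed with the Floyd-Warshall algorithm; the resulting distance matrix is then fed into a Vietoris-Rips construction exactly as for a point cloud. A sketch:

```python
INF = float("inf")

def path_weight_metric(w):
    """All-pairs minimum path weight from an adjacency matrix w, where
    w[i][j] is the edge weight (INF if no edge, 0 on the diagonal)."""
    n = len(w)
    d = [row[:] for row in w]
    for k in range(n):            # Floyd-Warshall relaxation
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

# Path graph V0 -1- V1 -2- V2: V0 and V2 enter a common simplex once eps >= 3.
w = [[0, 1, INF],
     [1, 0, 2],
     [INF, 2, 0]]
print(path_weight_metric(w)[0][2])  # 3
```

Thresholding this matrix at each ϵ gives exactly the simplices described above.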

5.8 Time Series


A time series {x1 , x2 , x3 , · · · } is a sequence of measurements at successive time intervals. For example, a time
series could represent the price of a stock on each day or the temperature every hour. We can also have a multidimensional time series where we take more than one measurement simultaneously. For example, you might keep track of
your height and weight at noon Greenwich Mean Time on your birthday each year. This is a two dimensional time
series.
There is a highly developed theory of time series in economic modeling. If we expect the measurement for each
time period to be a linear combination of the measurements for the last p times for some p (with some random noise
added) we have an autoregressive (AR) model. A moving average (MA) model assumes the value for a time period is a
linear combination of some average value and q white noise terms for some q. An ARMA model is the sum of an AR
and a MA model.
There are also more complicated models such as ARIMA in which first (or higher order) differences are involved,
and ARCH in which the variance is allowed to vary. What all of these have in common is that our goal is to determine
the unknown parameters and be able to make predictions.
I won’t get into details here as time series modeling is a whole subject of its own. The classic book on this type
of modeling is the book by Box, Jenkins, and Reinsel [19]. A more modern book is Hamilton [61]. Two books with
an economics focus are [49] and [163]. Finally, Cryer and Chan [37] is a full description of the time series analysis
(TSA) package in R.
I have found that traditional time series modeling is not readily applicable to cyber defense applications. Economists
are trying to model what is normal, but I was more interested in anomalies. Also, in the applications I looked at, there
was not a good justification for thinking that the quantities I was measuring were going to be linear combinations of
the measurements at previous times.
Another issue is finding a distance measure between two time series. The obvious one is Euclidean distance, in
which you add up the squared distances between the values of the two time series at each time period. But this does not
take into account that the series might just be shifts of each other or differ in scaling. A solution to this is dynamic time
warping or DTW. In DTW, we match indices from the first series to indices from the second. An index in one series
is allowed to match more than one index in the other, and indices can be skipped as long as we satisfy the following
condition: if index i is matched to index f (i), then i > j implies f (i) ≥ f (j). We then take the match which minimizes
the total distance Σi |xi − yf (i) | between matched values. See [15] for details of how this is implemented. This
approach can be rather slow, though, even with more modern variants that are faster.
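The standard dynamic-programming formulation of DTW fills in a table of best alignment costs; here is a minimal sketch (quadratic time, which is the slowness just mentioned):

```python
def dtw(a, b):
    """Dynamic time warping distance between two sequences of numbers."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Repeat an index of a, repeat an index of b, or advance both.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# A copy with one repeated value aligns perfectly under DTW...
print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0
# ...while rigid index-by-index matching would have penalized it.
```
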
Another issue involves finding anomalies in time series. It is not hard to find spikes or unusual values. One way to
do it is to take a sliding window of n samples {xi , xi+1 , · · · , xi+n−1 }. Let µ be the mean of these samples and σ the
standard deviation. Then we say that xi+n is an anomalous value if |xi+n − µ|/σ is above some threshold. But what if we
instead want to find an anomalous substring? For example, an EKG has obvious spikes, but we don't want to say that
a flat line would be normal just because we have eliminated the spikes.
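The sliding-window spike test is a few lines of Python; the threshold of 3 standard deviations below is a made-up illustrative choice:

```python
def is_spike(window, x, threshold=3.0):
    """Flag x as anomalous if it lies more than `threshold` standard
    deviations from the mean of the preceding window of samples."""
    n = len(window)
    mu = sum(window) / n
    sigma = (sum((v - mu) ** 2 for v in window) / n) ** 0.5
    return sigma > 0 and abs(x - mu) / sigma > threshold

window = [0, 1, 0, 1, 0, 1, 0, 1]   # mean 0.5, standard deviation 0.5
print(is_spike(window, 5))  # True  (z-score of 9)
print(is_spike(window, 1))  # False
```

As the text notes, this catches point anomalies but says nothing about anomalous substrings, which is where SAX comes in.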
A good solution to all of these problems was to use the time series discretization method, SAX [88]. I will tell
the story of how I used SAX to build a time series analysis tool that performs classification and anomaly detection in
the next section. But what does any of this have to do with topological data analysis? TDA was the key to dealing
with multivariate time series. The result was a successful effort to classify internet of things devices based on network
traffic [41].

5.9 SAX and Multivariate Time Series


While trying to find a different approach to time series analysis, I found an interesting paper. It was a Master’s
Thesis in Statistics by Caroline Kleist of Humboldt Universität in Berlin [82] that provides a survey of data mining
techniques when the data points are time series. The paper had multiple references to Jessica Lin at George Mason
University who had formulated a time series analysis method called SAX in which the series is represented by a string
of symbols from a small alphabet. It turns out that one of my previous offices was already working with George Mason
University and I started a project with Lin’s colleague Robert Simon to apply SAX to cyber defense problems. Our

internet of things paper [41] was the result of this project. We also developed a software tool called TSAT, a
Java program which runs on either Windows or Linux.
In this section, I will break down the steps of our analysis. In Section 5.9.1, I will introduce SAX and describe
its advantages. In Section 5.9.2, I will discuss the SEQUITUR algorithm which combines symbols into grammar
rules and gives SAX the ability to detect anomalies. Next, I will describe the Representative Pattern Mining or RPM
algorithm which we use to classify time series. In Section 5.9.4, I will give the method of Gidea and Katz for converting
a multivariate time series into a univariate one using topological data analysis. Then I will describe our internet of
things results. Finally, in Section 5.9.6 I will describe some work of Elizabeth Munch at Michigan State University on
TDA and time series. It is hoped that this work will help us improve our current algorithms.

5.9.1 SAX
In this section, I will briefly outline the SAX algorithm. For more information, see [88] and its references.
SAX stands for Symbolic Aggregate approXimation. (The X comes from the middle of the last word.) The
algorithm replaces a sequence of real valued measurements with a string of symbols from an alphabet whose size is
typically 3-12 symbols. Doing this allows for extremely fast distance computations, as there are a small number of
pairs of symbols and their distances can be kept in a lookup table. This greatly speeds up both Euclidean distance
computation and dynamic time warping. In addition, SAX has the following applications:

1. Finding motifs or common patterns within a series.

2. Finding surprising subsequences representing anomalies.

3. Finding contrast sets. Given two sets of time series, are there features which distinguish one set from the other?

4. The RPM (Representative Pattern Mining) Algorithm uses SAX to find strings which distinguish one set of time
series from another set.

The first step in the process is optional and referred to as Piecewise Aggregate Approximation or PAA. PAA reduces
the number of terms in a time series by dividing it into equal-sized frames of width w. If n is the length of the time
series, we will assume in our example that w is a divisor of n. For each frame, we assign the value to be the mean of the
time steps in that frame. This converts the series into a series of length n/w. For example, if n = 128 and w = 16, we get
the time series of length 8 shown in Figure 5.9.1.
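As a concrete sketch (assuming, as above, that the frame width w divides n; the function name paa is my own), PAA takes only a few lines of NumPy:

```python
import numpy as np

def paa(series, w):
    """Piecewise Aggregate Approximation: replace each frame of w
    consecutive values by its mean (assumes w divides len(series))."""
    x = np.asarray(series, dtype=float)
    n = len(x)
    if n % w != 0:
        raise ValueError("this sketch assumes w divides n")
    return x.reshape(n // w, w).mean(axis=1)

x = np.arange(128.0)       # toy series with n = 128
print(len(paa(x, 16)))     # frames of width w = 16 give a series of length 8
```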

Figure 5.9.1: Piecewise Aggregate Approximation (PAA) Example [88]

The next step is to assign a symbol from our alphabet. To do this, let µ be the mean and σ be the standard deviation
of the time series resulting from PAA. For each xi in this series, we normalize it by replacing it with (xi − µ)/σ. Then
we pretend that the values come from a standard normal distribution. This is not necessarily true, but we at least want the
symbols to be roughly equally likely. Divide the range of the standard normal into α parts of equal probability, where
α is the size of the alphabet we want to use.
In Figure 5.9.2 we have an example where the alphabet is of size 3 and consists of the symbols a, b, and c. For the
standard normal distribution,

P (x < −.43) = P (−.43 < x < .43) = P (x > .43) = 1/3.

Figure 5.9.2: SAX Discretization Example: α = 3 [88]

So for each normalized value x in our series, we replace it with a if x < −.43, with c if x > .43, and b otherwise. We
have now converted our time series into the string of symbols, baabccbc.
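A minimal Python sketch of this discretization step (the breakpoints are equiprobable quantiles of the standard normal; the helper name sax_symbols is mine):

```python
import numpy as np
from statistics import NormalDist

def sax_symbols(values, alpha=3):
    """Z-normalize the series, then map each value to one of alpha
    equiprobable symbols a, b, c, ... using standard-normal quantiles."""
    x = np.asarray(values, dtype=float)
    z = (x - x.mean()) / x.std()
    # For alpha = 3 these breakpoints are roughly -0.43 and 0.43.
    cuts = [NormalDist().inv_cdf(i / alpha) for i in range(1, alpha)]
    return "".join(chr(ord("a") + int(np.searchsorted(cuts, v))) for v in z)

print(sax_symbols([-1.0, 0.0, 1.0]))   # -> "abc" after normalization
```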
To detect anomalies and classify series, the symbols are converted into rules using a context free grammar. The
rough idea is as follows:
Suppose we receive the string abcabcaaaabcabc. We see that abc is common, so let R1 = abc be a grammar
rule. We now have R1 R1 aaaR1 R1 . Now let R2 = R1 R1 , giving R2 aaaR2 . Any substrings left over at the end are
anomalous. In addition, we can classify 2 sets of time series by finding grammar rules which are common in one set
but rare in another.
I did cheat a little though. We want to have a systematic algorithm to assign a set of rules. That is our next subject.

5.9.2 SEQUITUR
In SAX, grammar rules are assigned using the SEQUITUR algorithm of Nevill-Manning and Witten. (For computer
scientists, SEQUITUR is an algorithm which computes a context free grammar.) I will outline their algorithm here.
See their paper [115] for details.
SEQUITUR infers a hierarchical structure from a sequence of discrete symbols. The rules are new symbols that
replace strings of existing symbols. Rules can then include other rules, leading to the hierarchy. We are going to impose
two constraints:
1. Digram uniqueness: No pair of adjacent symbols appears more than once in the grammar.
2. Rule utility: Every rule is used more than once.
The next four examples refer to Figure 5.9.3 taken from [115].

Figure 5.9.3: Grammar Rule Examples [115]

Example 5.9.1. The first example introduces their terminology. In box a, the original rule is the sequence S → abcdbc.
Since bc repeats we replace it with the rule A → bc. Now we get the sequence S → aAdA.
Example 5.9.2. Box b shows how a rule can be reused in longer rules. We start with the sequence S → abcdbcabcdbc.
If we impose the rule B → bc and then A → aBdB, we get the sequence S → AA. This gives a rule within a rule
showing a hierarchical structure.

Example 5.9.3. So far we haven’t thought about our constraints. In box c, we have two possible substitutions for the
original sequence S. In the top one, digram uniqueness is violated as we have bc appearing twice in rule A, so we
have introduced redundancy. In the bottom one, rule utility is violated as B is used only once. We would have a more
concise grammar if it was removed.

Example 5.9.4. Box d shows that both constraints can be satisfied but the rules are not necessarily unique. Both the
top and bottom are acceptable and satisfy both constraints.

To show how the algorithm proceeds and enforces the constraints, Nevill-Manning and Witten use a table which
I have reproduced as Figure 5.9.4.

Figure 5.9.4: SEQUITUR Example [115]

SEQUITUR reads a sequence from left to right. If a digram appears more than once, a rule is created, imposing
digram uniqueness. If a rule is only used once, it is eliminated, imposing rule utility. Reading down the table, the
first interesting event is when symbol 6 is added. Now we have bc appearing twice, so we create the rule A → bc and
make the substitution in our sequence. At symbol 9, we have bc appearing again, so we replace it with A, getting the
sequence aAdAaA. But now aA appears twice, so we create the new rule B → aA. This gives the sequence BdAB.
Now at symbol 10, we get BdABd. Now Bd appears twice, so we create C → Bd. So the sequence is CAC.
Now we see that the only use for B was to define C, so we are violating rule utility. Our sequence doesn’t change if
we eliminate rule B and just let C → aAd. That is how we get longer rules.
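To make the digram idea concrete, here is a toy Python sketch (the function name is mine) that repeatedly replaces the most frequent repeated digram with a new rule symbol. It captures the spirit of digram uniqueness but is not the linear-time, online SEQUITUR algorithm itself, and it does not enforce rule utility:

```python
def naive_digram_grammar(seq):
    """Toy illustration of digram replacement: repeatedly replace the
    most frequent repeated digram with a new rule symbol."""
    rules = {}
    tokens = list(seq)
    next_id = 1
    while True:
        counts = {}
        for pair in zip(tokens, tokens[1:]):
            counts[pair] = counts.get(pair, 0) + 1
        best = max(counts, key=counts.get, default=None)
        if best is None or counts[best] < 2:
            break
        name = "R%d" % next_id
        next_id += 1
        rules[name] = best
        out, i = [], 0
        while i < len(tokens):     # left-to-right, non-overlapping replacement
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == best:
                out.append(name)
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        tokens = out
    return tokens, rules

tokens, rules = naive_digram_grammar("abcabcaaaabcabc")
```

Expanding the rules recovers the original string; a real SEQUITUR implementation additionally enforces digram uniqueness and rule utility incrementally as each symbol arrives.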
The remainder of [115] shows that the algorithm is linear in both time and space. The interest for SAX is that the
rules represent common substrings in a sequence. Any symbols that are not covered by these rules are by definition
anomalous. We can also classify time series by looking at rules that are common for one type of series and rare for the
other. That is the subject of our next section.

5.9.3 Representative Pattern Mining


Representative Pattern Mining [169] is the algorithm that TSAT uses to classify time series. The idea is to find
representative patterns, i.e. subsequences that are common in one class of time series and rare in the other. I will
briefly summarize the steps and refer you to the paper for details.
The algorithm consists of two steps: Learning representative patterns during a training stage and then using them
for classification. The training stage consists of preprocessing the data, generating representative pattern candidates,

and then selecting the most representative candidates. Then classification is done by letting each pattern be a feature
and computing a distance from a time series we want to classify to each pattern. Support Vector Machines are used in
[169] as a classifier, but in TSAT, we used random forests instead.
The preprocessing step in the training phase consists of turning the time series into symbols using SAX and then
using SEQUITUR to generate grammar rules. The algorithm keeps track of the position of the rules in the sequence.
The algorithm eliminates patterns that are too similar to each other through a clustering process and eventually chooses
a set that will be the best discriminator of the two classes.
SAX extracts subsequences using a sliding window, and a PAA frame size and alphabet size must be chosen. In
TSAT, a user can choose these parameters or the algorithm can choose them for the user by finding the parameter set
that will do the best job of discriminating the training data. See [169] for details.

5.9.4 Converting Multivariate Time Series to the Univariate Case or Predicting Economic
Collapse
So what does any of this have to do with algebraic topology? The answer comes when we try to use SAX to
classify multivariate time series.
One approach might be to take each coordinate individually. But each coordinate could individually seem normal
but taken together could represent an anomaly. For example, if you represented a person’s height and weight over
time, it might not be unusual that they weigh 100 pounds or are 6 feet tall but being both would be pretty unusual.
TDA allows us to do the conversion without decoupling coordinates.
We used the method of Gidea and Katz [55]. Suppose our time series consisted of m measurements at each time
period. Then letting X = {x1 , x2 , · · · }, each xi represents a point in Rm . Choose a window size w. Then slide
the window across the series. At position k, our window is the set {xk , xk+1 , · · · , xk+w−1 }. We then have a cloud
of w points in Rm and we can compute the persistence landscape of these points. If we step by one time unit, we
now have a cloud of points {xk+1 , xk+2 , · · · , xk+w }. Now letting λ1 and λ2 be the two landscapes, we compute the
distance ||λ1 − λ2 ||p as defined in Section 5.5, where we generally let p = 1 or p = 2. This gives a real number for
the difference between each window setting, so we have a univariate time series and we can use SAX and RPM for
classification as before.
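The windowing step can be sketched directly; computing the persistence landscapes themselves would be delegated to a TDA library, so this sketch (window_clouds is my own name) only shows how the point clouds are extracted:

```python
import numpy as np

def window_clouds(X, w):
    """Slide a length-w window over a multivariate series X of shape
    (T, m); each window is a cloud of w points in R^m, stepped by one
    time unit. Consecutive clouds' landscapes would then be compared
    via ||lambda_1 - lambda_2||_p."""
    X = np.asarray(X, dtype=float)
    return [X[k:k + w] for k in range(len(X) - w + 1)]

X = np.random.rand(200, 4)           # e.g. four daily market indices
clouds = window_clouds(X, 50)
print(len(clouds), clouds[0].shape)  # 151 windows, each a (50, 4) cloud
```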
Gidea and Katz used this technique for forecasting economic downturns. They started with the daily time series of
four US stock market indices: S&P 500, DJIA, NASDAQ, and Russell 2000 from December 23, 1987 to December 8,
2016 (7301 trading days). This is a series of points in R4 . They used a sliding window of size w = 50 and w = 100.
Two interesting dates were chosen, the dot com crash on March 10, 2000 and the Lehman bankruptcy on September 15,
2008. Figure 5.9.5 shows in top to bottom order the normalized S&P series, the normalized L1 norm of the persistence
landscapes calculated with a sliding window of 50 days, and the volatility index (VIX) which measures the volatility
of the S&P index over the next 30 days and is often used for forecasting. The left side represents the 1000 trading days
before the dot com crash and the right side represents the 1000 trading days before the Lehman bankruptcy. Looking at
the middle row, we see the dramatic behavior of the persistence landscapes in the days before the two crashes.

Figure 5.9.5: Behavior of Three Indices Immediately Before Two Economic Crises. [55]

I will briefly mention another Gidea and Katz paper that includes three additional authors [54]. This paper analyzed
four cryptocurrencies: Bitcoin, Ethereum, Litecoin, and Ripple. The problem was to find a warning of collapse
of the value of these currencies. The method was related to the previous paper with a few differences. For each of these
currencies, start with a univariate time series representing the log-returns. The input for TDA is a coordinate embedding
of dimension d = 4 and a sliding window of size w = 50. So if {xi } is the original time series, let z0 = (x0 , x1 , x2 , x3 ), z1 =
(x1 , x2 , x3 , x4 ), · · · . Then our sliding windows are of the form wi = {zi , zi+1 , · · · , zi+49 }. We consider each window
to be a point cloud, compute its persistence landscape, and then compute the L1 norm

||λ||1 = Σk≥1 ||λk ||1 ,

where λk is as defined in Section 5.5.
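The coordinate embedding step can be sketched as follows (delay_embed is my own name; the paper uses d = 4):

```python
import numpy as np

def delay_embed(x, d=4):
    """Turn a univariate series {x_i} into points
    z_i = (x_i, x_{i+1}, ..., x_{i+d-1}) in R^d."""
    x = np.asarray(x, dtype=float)
    return np.stack([x[i:len(x) - d + 1 + i] for i in range(d)], axis=1)

z = delay_embed(np.arange(10.0), d=4)
print(z.shape)      # (7, 4); z[0] is (0, 1, 2, 3)
```

The sliding windows wi = {zi , . . . , zi+49 } are then just slices z[i:i+50] of this array.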


Now perform a k-means clustering to find unusual dates in the series. The input is a series of points (x, y, z) ∈ R3
where x is the log of the price of the asset, y is the log-return of the asset, and z is the L1 norm produced by TDA. For
each of these currencies, there were clusters consisting of points representing dates leading up to a crash. See [54] for
more details.
With these techniques, we were in a good position to use multivariate time series to classify internet of things
devices.

5.9.5 Classification of Internet of Things Data


The material in this section comes from my own paper [41]. The idea was to classify internet of things (IoT)
devices using data that would still be visible under encryption such as statistics on packet size and interarrival time. The
experiments were performed on a proprietary testbed consisting of 183 devices produced by a variety of manufacturers.
The devices were placed in multiple rooms and corridors designed to emulate how they would be used in commercial
and residential environments.
All network traffic over a 9 month period was collected and each received packet was time-stamped. From the
logs, we could look at source and destination MAC and IP addresses, transport layer ports, and the device type for each
payload. This gave us a labeled data set and for each conversation we could compute the total number of bytes sent in
a time window and the mean and variance of interarrival time. We experimented with both single and multi-attribute
time series. For the univariate series we processed the data using SAX, and used Representative Pattern Matching to
perform classification. For the multivariate series, we used both observed and derived attributes. We used the TDA
method of Gidea and Katz to transform a multivariate series into a univariate one and then used SAX and RPM for
classification. For the paper, we used the Dionysus package [105], but later sped up the computations dramatically with
the Ripser package [12]. Ripser is now included in TSAT. An alternate algorithm for classifying multivariate time
series is WEASEL+MUSE [142], which looks at patterns in each coordinate separately and includes the coordinate as
part of their label.
We looked at three types of IoT devices:

1. Cameras including audio speakers and characterized by high-volume burst data.

2. Sensors such as environmental sensors.

3. Multi-purpose devices such as tablets or certain cameras that allowed for two-way and streaming audio.

The collection was noisy and incomplete. There were interruptions due to system maintenance and equipment
upgrades and much higher traffic on weekdays during work hours than at night, on weekends, and during holiday
breaks. For single attribute time series, the best results were for a day of training data and a day of test data. The
attribute that worked best was total number of bytes followed by packet interarrival time.
For the multivariate time series, WEASEL+MUSE worked as well as TDA for small amounts of data. When we
tried increasing the size of the test data, TDA worked much better than other methods. Table 5.9.1 shows the results of
TDA on one month of training and 8 months of testing data. Considering the amount of noise, these results are quite
Device Type F1 MCC
Cameras .687 .535
Sensors .835 .765
Multi-purpose .8 .69
Weighted .77 .66

Table 5.9.1: Results for TDA Multivariate Case.

good. They are evaluated using the F1 score and the Matthews Correlation Coefficient (MCC). Letting TP, FP, TN,
and FN be true positives, false positives, true negatives, and false negatives respectively, we have
F1 = 2TP/(2TP + FP + FN)
and
MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN)).
Note that the F1 score ranges from 0 to 1 while MCC ranges from -1 to 1.
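The two metrics above translate directly into code:

```python
import math

def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); ranges from 0 to 1."""
    return 2 * tp / (2 * tp + fp + fn)

def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient; ranges from -1 to 1."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom

print(f1_score(50, 0, 0), mcc(50, 0, 50, 0))   # a perfect classifier: 1.0 1.0
```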
To conclude, while univariate time series and more traditional methods worked well for shorter time periods, TDA
outperformed the other methods for a noisy multi-month period. See [41] for more details.

5.9.6 More on Time Series and Topological Data Analysis


In this section, I will discuss two additional papers that built on the work of the last section. Also, there are two
remaining practical questions:
1. How do we pick the size of the window which determines the size of the point clouds used to compute persistence
landscapes?
2. In our work we slid the window by one time step at a time. Is this always the right thing to do? How do we
know?
These are still open problems but I will conclude by discussing a paper [112] that was suggested to me by Elizabeth
Munch of Michigan State University to begin to answer these questions.
First, though, I will summarize a paper [34] which cites my IoT work and presents an alternative topology based
algorithm. The paper classifies IoT devices using the single attribute of interarrival time. The advantage, though, is that
they can work in smaller time windows and they create a simpler simplicial complex.

Figure 5.9.6: Clique Complex of Window ω10 (0) After First 4 Nonzero Edges are Added [34]

The complex is best described using a picture. Figure 5.9.6 is taken from [34]. We are given a time series of packets
{p0 , p1 , · · · , pn }. If packet pi arrives at time T (pi ), we define the interarrival time ∆T (pi ) to be T (pi ) − T (pi−1 ).
Now choose a window size k and we classify windows ωk (i) = (pi , pi+1 , · · · , pi+k−1 ) as to the type of IoT device
that produced them. In this paper, they chose k = 25.
Now build a filtration of complexes as follows. Figure 5.9.6 shows a window of size 10 with starting point 0
denoted ωk (i) with k = 10, and i = 0. For the window ωk (i), we draw a graph with nodes i, i + 1, · · · , i + k − 1 along

the bottom and node i + k above them. For the bottom nodes, connect node j to node j + 1. Then we connect node i
to node i + k. This is our initial graph. Now for nodes j = i + 1 to i + k − 1, add the edge from node j to node i + k
in increasing order of interarrival time ∆T (pj ). When two consecutive bottom nodes are connected to the top node, we
add the 2-simplex bounded by those edges and the edge on the bottom between them. In the figure, we have added
the edges connecting 10 to 3, 4, 7, and 8, and this also adds the 2-simplices [3, 4, 10] and [7, 8, 10]. Adding edges and
corresponding 2-simplices in succession creates our filtration. We only look at 1-dimensional homology classes. The
resulting diagram is converted to a persistence image. I will explain how this is done in the next section, but for now
think of it as a 128 × 128 array of numbers. The images are then classified using a convolutional neural net.
According to [34], the technique gave an accuracy of 95.34%, a recall (TP/(TP + FN)) of 95.27%, and a precision
(TP/(TP + FP)) of 95.46%, where TP, FP, and FN are true positives, false positives, and false negatives respectively.
These results came from a single experiment and would not have necessarily done as well in the case we looked
at, but it still provides a quick and easy method to try.
Neither this method nor the method of Gidea and Katz that we used has a good way of choosing window size. The
paper of Myers, Munch, and Khasawneh [112] addresses this issue but only for the univariate case. I will describe it
next.
For a time series we need to choose the time lag τ and the embedding dimension d. The parameter τ tells us when
to take our measurements, and we may not be able to control it if our time series is discrete. Once τ is chosen, we can
think of d as a window size, where our window is {xt , xt+τ , xt+2τ , · · · , xt+(d−1)τ }. Instead of a point cloud, though,
we use these windows to construct a type of graph called an ordinal partition graph. I will define this graph first and
then discuss how we choose τ and d.
Suppose for example that d = 3. So each window is of the form {xt , xt+τ , xt+2τ }. Let 6 consecutive terms of our
time series be {58, 35, 73, 54, 40, 46}. The vertices of the graph will be permutations in the group Sd (recall Example
3.1.6) and we want to find the permutation π ∈ Sd such that xπ(1) ≤ · · · ≤ xπ(d) . Recall that the order of Sd is d!. In
our case I will list the 6 elements of S3 (in no particular order) as follows:

π1 = (1)
π2 = (12)
π3 = (13)
π4 = (23)
π5 = (123)
π6 = (321)

The first window is {58, 35, 73}, and to put these in ascending order we need to switch the first 2 elements while
keeping the last one fixed. So we use π2 . Listing the first four windows and the corresponding permutations, we get

{58, 35, 73} → π2


{35, 73, 54} → π4
{73, 54, 40} → π3
{54, 40, 46} → π6

Now we form a graph with the permutations as vertices and a directed edge from πi to πj if there are consecutive
windows corresponding to πi and πj . The edges are weighted with weights corresponding to the number of times this
transition appears in the graph. Figure 5.9.7 shows the graph corresponding to our example. In practice, since there
are d! vertices, we don’t include the ones that never appear in our time series.
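The construction above can be sketched in Python. Here I represent each permutation by the argsort tuple of the window, a labeling choice that differs from, but is equivalent to, the π1 , . . . , π6 listing above, and collect the weighted directed edges in a Counter:

```python
import numpy as np
from collections import Counter

def ordinal_partition_graph(x, d=3, tau=1):
    """Map each window (x_t, x_{t+tau}, ..., x_{t+(d-1)tau}) to the
    permutation (as an argsort tuple) that puts it in ascending order,
    then count transitions between consecutive windows: these counts
    are the weights of the directed edges."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (d - 1) * tau
    perms = [tuple(int(i) for i in np.argsort(x[t:t + (d - 1) * tau + 1:tau]))
             for t in range(n)]
    edges = Counter(zip(perms, perms[1:]))
    return perms, edges

perms, edges = ordinal_partition_graph([58, 35, 73, 54, 40, 46], d=3)
print(perms)   # four windows -> four permutations, as in the example
```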
Once they build the ordinal partition graph, Myers et al. build a filtration using the shortest distance between every
pair of vertices. This can be computed using the all_pairs_shortest_path_length function from the Python NetworkX
package. The one-dimensional persistence diagram is then computed using the Python wrapper Scikit-TDA for the
software package Ripser.

Figure 5.9.7: Ordinal Partition Graph

It remains to see how they found the time delay τ and the embedding dimension d. First fix d = 3, and plot the
permutation entropy

H(d) = −Σi p(πi ) log2 p(πi ),

where p(πi ) is the probability of permutation πi , for a range of values of τ . Choose τ to be the first prominent peak of
the H(d) versus τ curve. (Note that changing τ changes the probabilities of each permutation.) It was
shown in [134] that the first peak of H(d) is independent of d for d ∈ {3, 4, 5, 6, 7, 8}.
Once we choose τ , define the permutation entropy per symbol to be

h′ (d) = H(d)/(d − 1),

where we make d a free parameter that we want to determine. We then plot h′ (d) for d ranging from 3 to 8, and choose
the value of d that maximizes h′ (d).
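Both quantities can be sketched as follows (function names are mine; probabilities are estimated by the observed pattern frequencies):

```python
import numpy as np
from collections import Counter
from math import log2

def permutation_entropy(x, d=3, tau=1):
    """H(d) = -sum_i p(pi_i) log2 p(pi_i), where p(pi_i) is the observed
    frequency of each ordinal pattern among the delayed windows."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (d - 1) * tau
    patterns = Counter(tuple(np.argsort(x[t:t + (d - 1) * tau + 1:tau]))
                       for t in range(n))
    return -sum((c / n) * log2(c / n) for c in patterns.values())

def entropy_per_symbol(x, d, tau=1):
    """h'(d) = H(d) / (d - 1); maximized over d in {3, ..., 8}."""
    return permutation_entropy(x, d, tau) / (d - 1)

h = permutation_entropy(np.random.rand(500))
# For d = 3, H lies between 0 (constant series) and log2(3!) = log2(6).
```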
This leaves the question of whether we could do something similar for multivariate series. Here d is actually the
window size. In the example of my internet of things work, τ is already fixed. One idea might be to take the largest
value of d for each of the coordinates and have that be the window size. That would need to be the subject of future
work.
I will conclude this section with some ways of getting a single number summary or score of a persistence diagram
to help with classification. These scores are all listed in [112]. If a homology class x is born at time b and dies at time
d, then the persistence is pers(x) = d − b.

1. Maximum Persistence: This is the persistence of the class with the largest persistence. For diagram D, denote
this as maxpers(D).

2. Periodicity Score: Let G′ be a cycle graph with n vertices in which the nodes and edges form one big cycle.
Suppose all edges have weight one and the distance is just the shortest path length. Then there is a single cycle that
is born at time 1 and dies at time ⌈n/3⌉, where ⌈x⌉ is the smallest integer that is greater than or equal to x. The
corresponding diagram D′ has

maxpers(D′ ) = ⌈n/3⌉ − 1.

If we call this quantity Ln , we can compare it to another unweighted graph G. (In our case we can use the
unweighted ordinal partition graph.) If G has persistence diagram D, then define the network periodicity score

P (D) = 1 − maxpers(D)/Ln .

The score ranges from 0 to 1 with P (D) = 0 if G is a cycle graph.



3. The ratio of the number of homology classes to the graph order: This is

M (D) = |D|/|V |.

The number is not really useful in dimension 0, as the number of 0-dimensional classes is n − 1 for a graph
with n vertices. For a higher dimensional diagram, we expect a periodic time series to have a smaller number of
classes than a chaotic one, so this number should be smaller in this case.

4. Normalized persistent entropy: This function is calculated from the lifetimes of the classes in the diagram,
and is defined as

E(D) = −Σx∈D (pers(x)/L(D)) log2 (pers(x)/L(D)),

where L(D) = Σx∈D pers(x) is the sum of the lifetimes of the points in the diagram. Since it is hard to
compare this quantity across diagrams with different numbers of points, we normalize E as

E ′ (D) = E(D)/ log2 (L(D)).
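Given a persistence diagram as a list of (birth, death) pairs, the scores above can be sketched as follows (function names are mine):

```python
from math import ceil, log2

def max_persistence(diagram):
    return max(d - b for b, d in diagram)

def periodicity_score(diagram, n):
    """P(D) = 1 - maxpers(D)/L_n with L_n = ceil(n/3) - 1, the maximum
    persistence of a cycle graph on n vertices."""
    return 1 - max_persistence(diagram) / (ceil(n / 3) - 1)

def normalized_persistent_entropy(diagram):
    """E'(D) = E(D) / log2(L(D)) with L(D) the total lifetime."""
    lifetimes = [d - b for b, d in diagram]
    L = sum(lifetimes)
    E = -sum((p / L) * log2(p / L) for p in lifetimes)
    return E / log2(L)

# A cycle graph on n = 12 vertices has one class born at 1, dead at 4,
# so its periodicity score is 0.
print(periodicity_score([(1, 4)], 12))
```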

Speaking of periodicity, I will mention the paper of Perea and Harer [119] which uses persistence diagrams to
learn about the periodicity of a time series.
One last comment on that. A quick and easy thing to do is to plot the first derivative of the function against the
second derivative. A sine or cosine function plots to a perfect circle. You can tell if a periodic function starts to drift
by connecting the points in this plot and looking at the path. (You can do this in R, for example.) If the path starts to
deviate from your circle, something is happening and those times may be anomalous.
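This check is easy to reproduce numerically; for a sine wave the (first derivative, second derivative) path traces a circle of radius 1:

```python
import numpy as np

t = np.linspace(0, 4 * np.pi, 2000)
x = np.sin(t)
dx = np.gradient(x, t)    # first derivative, approximately cos(t)
d2x = np.gradient(dx, t)  # second derivative, approximately -sin(t)

# Radius of each point on the (x', x'') path; for sin it stays near 1,
# and drift away from 1 would flag anomalous behavior.
r = np.sqrt(dx ** 2 + d2x ** 2)
print(round(float(r[100:-100].mean()), 3))
```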
The scores mentioned in [112] can be used as machine learning features. I will discuss two more interesting ways
to derive features from a persistence diagram in the next section.

5.10 Persistence Images and Template Functions


Recall that in the last section we saw that [34] classified IoT devices using persistence images. Now I will explain
where these images come from.
Persistence images were proposed in the paper of Adams et al. [2].
We start from a persistence diagram B in birth-death coordinates. We transform this diagram to birth-lifetime
coordinates using T (x, y) = (x, y − x). Let gu (x, y) be a normalized symmetric Gaussian probability distribution
over R2 with mean u = (ux , uy ) and variance σ 2 defined as

gu (x, y) = (1/(2πσ²)) exp(−[(x − ux )² + (y − uy )²]/(2σ²)).
Fix a nonnegative weighting function f : R2 → R that is zero on the horizontal axis, continuous, and piecewise
differentiable. We then transform the persistence diagram into a real valued function over the plane.

Definition 5.10.1. For a persistence diagram B, the corresponding persistence surface ρB : R2 → R is the function
ρB (z) = Σu∈T (B) f (u)gu (z).

To get our image, we fix a grid of n pixels in the plane and integrate ρB over each one to get an n-long vector of
real numbers.

Definition 5.10.2. For a persistence diagram B, the corresponding persistence image is the collection of pixels
I(ρB )p = ∬p ρB dy dx.

Persistence images can also combine persistence diagrams representing homology in different dimensions into a
single object by concatenating.
Note that we need to make three choices when generating a persistence image:
1. The resolution of the grid.
2. The distribution function gu with its mean and standard deviation.
3. The weighting function f .
The resolution of the image corresponds to overlaying a grid on the persistence diagram (in birth-lifetime
coordinates). It does not have a huge effect on classification accuracy.
The distribution used in the paper centers a Gaussian at each point. Letting the points in the diagram be the means,
only the variance need be chosen. This is an open problem but changes in variance do not have a big effect.
The weighting function f from the birth-lifetime plane to R needs to be zero on the horizontal axis (the analogue of
the diagonal in birth-death coordinates), be continuous, and be piecewise differentiable. The weighting function used
in the paper was piecewise linear and was chosen to give the most weight to classes of maximum persistence.
Let b > 0 and define wb : R → R as

wb (t) = 0 if t ≤ 0; t/b if 0 < t < b; and 1 if t ≥ b.

Then set f (x, y) = wb (y), where b is the persistence value of the longest lasting class in the persistence diagram.
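Putting the pieces together, here is a simplified sketch that samples the persistence surface on a grid rather than integrating over each pixel (a common shortcut); the resolution, σ, and grid extent are the three choices discussed above:

```python
import numpy as np

def persistence_image(diagram, res=20, sigma=0.1, extent=(0, 1, 0, 1)):
    """Persistence-image sketch: map (birth, death) -> (birth, lifetime),
    add one Gaussian per point weighted by w_b(lifetime), and sample the
    resulting surface on a res x res grid."""
    pts = np.array([(b, d - b) for b, d in diagram], dtype=float)
    b_max = pts[:, 1].max()            # the b in w_b: the largest lifetime
    xs = np.linspace(extent[0], extent[1], res)
    ys = np.linspace(extent[2], extent[3], res)
    img = np.zeros((res, res))
    for bx, ly in pts:
        w = 0.0 if ly <= 0 else min(ly / b_max, 1.0)  # piecewise-linear weight
        gx = np.exp(-((xs - bx) ** 2) / (2 * sigma ** 2))
        gy = np.exp(-((ys - ly) ** 2) / (2 * sigma ** 2))
        img += w * np.outer(gy, gx) / (2 * np.pi * sigma ** 2)
    return img

img = persistence_image([(0.1, 0.9), (0.2, 0.6)])
print(img.shape)      # one 20 x 20 array of pixel values
```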
Adams et al. show that small changes in Wasserstein distance lead to small changes in persistence images.
Figure 5.10.1 shows the pipeline for persistent images from data to the persistence diagram to the transformed
diagram and finally to the surface and image at different grid scales.

Figure 5.10.1: Pipeline For Persistent Images [2].

I will comment that the result is more of a numerical matrix than an image. The picture is actually a Matlab heat
map. If we rescale to be in the range 0-255, we can make a grayscale image or overlay three of them to get a color
image. The numbers can be flattened into a one-dimensional vector and sent to your favorite classifier or kept in
two dimensions and sent to a convolutional neural net. Better still, you can keep the Matlab image and sell it to your
local modern art museum. (Now that should prove that TDA is practical.)
For the last major topic in this chapter, I will summarize the paper by Perea, Munch, and Khasawneh on Template
Functions [120]. The paper is very long and is heavily mathematical, using both point-set topology and functional
analysis. The latter can be thought of as infinite dimensional linear algebra. Finite dimensional spaces are sort of

boring from the point of view of topology. All 5-dimensional spaces, for example, are pretty much the same (i.e. they
are all homeomorphic to each other). Infinite dimensional spaces can vary a lot more. The typical example is function
spaces. For example, integrable periodic real valued functions can be represented by a Fourier series which is the
sum of an infinite number of sine and cosine functions of varying frequencies. We can think of these functions as an
infinite vector space basis for the set of periodic functions. The paper cites the book by Conway [35] as a reference
for functional analysis, but my personal favorite is Rudin [141].
Like persistence images, the main purpose of template functions is to turn a persistence diagram into a vector
of real numbers that can be sent to a classifier. The approach is to build a family of continuous and compactly
supported (i.e. only nonzero on a compact set) functions from persistence diagrams to the real numbers. They build
two families of these functions: tent functions, which emphasize local contributions of points in a persistence diagram,
and interpolating polynomials, which capture global pairwise interactions. Template functions provide high accuracy
rates, and Perea et al. found that in most cases, interpolating polynomials did better than tent functions. Both types
of functions can be computed using the open source Teaspoon package.
I will now summarize the theory presented in [120]. See there for details and especially for the proofs. Recall that
a d-dimensional persistence diagram is a set S of points in R2 where each point (x, y) represents a homology class
of dimension d that is born at time x and dies at time y. Here we have y > x ≥ 0, so we are only looking at points
above the diagonal. We assume the dimension is fixed and we will no longer refer to it. As it is possible for more than
one class to be born and die at the same time, each point has a multiplicity, a positive integer. So let µ : S → N
be the multiplicity function, where N represents the natural numbers, i.e. the positive integers. We can write D = (S, µ) for a
diagram, and we will often use D and S interchangeably, where S is understood to be the set of points in D. (So for
example, we can talk about subsets of D when we mean subsets of S.)
Definition 5.10.3. Given D = (S, µ) and U ⊂ R2 , the multiplicity of D in U is
Mult(D, U ) = Σx∈S∩U µ(x)

if this quantity is finite, and ∞ otherwise.


Here is some more notation used throughout the paper. The diagonal is denoted ∆ = {(x, x) | x ∈ R} ⊂ R2 . The wedge
W is the region above and not including the diagonal. If z = (x, y) ∈ W , then pers(z) = y − x. The portion of W
with persistence greater than ϵ is denoted W ϵ . If we want to include the lower boundary, we write the closure W̄ ϵ .
Definition 5.10.4. The space of persistence diagrams denoted D is the collection of diagrams D = (S, µ) where:
1. S ⊂ W called the underlying set of D, has the property that M ult(D, W ϵ ) is finite for any ϵ > 0.
2. µ is a function from S to the positive integers. In particular, µ(x) is the multiplicity of x ∈ S.
The space of finite persistence diagrams is D0 = {(S, µ) ∈ D|S is finite}.
We can describe the multiplicity of multiple persistence diagrams in a subset U ⊂ R2 by adding their individual
multiplicities.
The next step is to characterize compactness in the space D. We will need this to look at the topology of the
space of real valued continuous functions on this space. The space D is made into a metric space using the bottleneck
distance dB . (See Section 5.5.)
Definition 5.10.5. A subspace of a topological space is relatively compact if its closure is compact.
The main theorem is as follows:
Theorem 5.10.1. A set S of persistence diagrams in D is relatively compact if and only if:
1. It is bounded.
2. It is off-diagonal birth bounded (ODBB).

Figure 5.10.2: Three Criteria for Relative Compactness on Sets of Persistence Diagrams [120].

3. It is uniformly off-diagonally finite (UODF).

I will now define each of these conditions. To help picture this, consider Figure 5.10.2 from [120].

Definition 5.10.6. A subspace of a topological space is bounded if it is contained in an open ball of finite radius.

Note that with the bottleneck distance, unmatched points are matched against points on the diagonal. In [120], the definition of bottleneck distance is modified so that the distance between an unmatched point and the diagonal is 1/2 the perpendicular distance. This is then half of the point's persistence value. So if ∅ is the empty diagram, dB(D, ∅) = (1/2) max{pers(x) | x ∈ S}. So if BC(D) = {D′ ∈ D | dB(D, D′) < C} denotes the ball of radius C > 0 centered at diagram D, then if S ⊂ D is bounded, there exists a C > 0 with S ⊂ BC(∅). This is the situation pictured in the upper left circle of Figure 5.10.2.
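As a sanity check on this convention, here is a minimal Python sketch (the function names are mine, not from [120]) of the distance from a finite diagram to the empty diagram:

```python
def pers(x):
    """Persistence of a diagram point x = (birth, death)."""
    birth, death = x
    return death - birth

def bottleneck_to_empty(diagram):
    """d_B(D, empty diagram): every point is matched to the diagonal
    at half its persistence, so the cost is (1/2) * max pers(x)."""
    return 0.5 * max(pers(x) for x in diagram) if diagram else 0.0
```

For example, the diagram {(0, 1), (2, 5)} has maximum persistence 3, so its distance to the empty diagram is 1.5.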

Definition 5.10.7. A set S ⊂ D is off-diagonal birth bounded (ODBB) if for every ϵ > 0 there exists a constant Cϵ ≥ 0 such that if x ∈ S ∩ W̄^ϵ (equivalently, pers(x) ≥ ϵ) for (S, µ) ∈ S, then birth(x) ≤ Cϵ.

This is the situation in the top right circle where the x coordinate representing birth has a right hand boundary at
Cϵ .

Definition 5.10.8. A set S ⊂ D is uniformly off-diagonally finite (UODF) if for every ϵ > 0 there exists a positive integer Mϵ such that

Mult(D, W^ϵ) ≤ Mϵ
for all D ∈ S.

This is the situation in the bottom circle.


Now we list three examples of sets which satisfy only two out of the three conditions. See Figure 5.10.2.

Example 5.10.1. Let S = {Dn |n ∈ N }, Dn = {(0, 1)} with µDn (0, 1) = n. This is bounded and ODBB but not
UODF.

Example 5.10.2. Let S = {Dn |n ∈ N }, Dn = {(n, n + 1)} with µDn (n, n + 1) = 1. This is bounded and UODF
but not ODBB.

Example 5.10.3. Let S = {Dn |n ∈ N }, Dn = {(0, n)} with µDn (0, n) = 1. This is UODF and ODBB but not
bounded.
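The function Mult is simple to compute for a finitely supported diagram. In this sketch (the dictionary representation and function name are mine, and I take the strict inequality pers(x) > ϵ defining W^ϵ), the failure of UODF in Example 5.10.1 is visible directly:

```python
def mult(diagram, eps):
    """Mult(D, W^eps): total multiplicity of points of the diagram
    whose persistence (death - birth) strictly exceeds eps.  The
    diagram is a dict mapping (birth, death) to a multiplicity."""
    return sum(m for (birth, death), m in diagram.items()
               if death - birth > eps)

# Example 5.10.1: D_n = {(0, 1)} with multiplicity n.  Fixing eps = 1/2,
# Mult(D_n, W^eps) = n is unbounded in n, so the family is not UODF.
```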

What we are interested in is the topology of continuous functions from D to R. The usual topology on this type of
function space is the compact-open topology.

Definition 5.10.9. Let X, Y be topological spaces and let C(X, Y) be the set of continuous functions from X to Y. Given K ⊂ X compact and V ⊂ Y open, let U(K, V) be the set of functions f ∈ C(X, Y) such that f(K) ⊂ V. The set of U(K, V) for all K ⊂ X compact and V ⊂ Y open forms a subbase for a topology on C(X, Y) called the compact-open topology. (Recall that this means that open sets are formed by finite intersections and arbitrary unions of the sets U(K, V).)

Definition 5.10.10. A topological space is locally compact if every point has an open neighborhood contained in a compact set. Such a compact set is called a compact neighborhood.

Perea et al. show the following:

Theorem 5.10.2. Relatively compact subsets of (D, dB) have empty interior. Thus, (D, dB) is not locally compact, and no diagram D ∈ D has a compact neighborhood.

It turns out that compact subsets of the space of persistence diagrams are nowhere dense (i.e. their closure has empty interior). This means that D cannot be written as a countable union of compact sets, which in turn means that the compact-open topology on C(D, R) is not metrizable (i.e. it can't be made into a metric space). For more details on why, see Example 2.2, Chapter IV of Conway [35].
Next, Perea et al. handle the problem of finding subsets of C(D, R) that are dense in the compact-open topology. In other words, we want a family of functions which will approximate any continuous function f ∈ C(D, R). We need the following definition:

Definition 5.10.11. A coordinate system for D is a collection F ⊂ C(D, R) which separates points. In other words,
if D ̸= D′ are two diagrams in D then there exists F ∈ F for which F (D) ̸= F (D′ ).

Now we want a coordinate system to be small. We could take F to be the space of all real-valued continuous functions on D, but this is too big to be practical. Consider the space Rⁿ. We define the Cartesian coordinates to be a basis of n continuous linear functions from Rⁿ to R. We want to find a continuous embedding (i.e., a continuous one-to-one map) of D into an appropriate topological vector space V which we will need to choose. (Note that a topological vector space is a vector space V over a field F which is a topological space and whose addition and scalar multiplication operations are continuous functions from V × V to V and F × V to V respectively. In our case we use the field R.)
I will skip the rather lengthy derivation and just state the main results. To understand them, we will need a few more
definitions.

Definition 5.10.12. If V is a topological vector space, its dual is the vector space V∗ = {T : V → R | T is linear and continuous}. If the topology of V comes from a norm || · ||V, then V∗ has the operator norm

||T||∗ = sup_{||v||V = 1} |T(v)|.

There are several standard topologies on V∗, but the one that we will need is the weak∗ topology.

Definition 5.10.13. The weak∗ topology is the smallest topology so that for each v ∈ V, the resulting evaluation function e_v : V∗ → R defined as e_v(T) = T(v) is continuous. A basis for open neighborhoods of T ∈ V∗ is given by sets of the form

N(v1, · · · , vk; ϵ)(T) = {T′ ∈ V∗ : max_{1≤i≤k} |T′(vi) − T(vi)| < ϵ}

where v1, · · · , vk ∈ V and ϵ > 0.

We now need a generalization of Banach spaces called locally convex spaces.


As before, let W be the wedge on which persistence diagrams live, and let Kn be a sequence of compact subsets of W with Kn ⊂ Kn+1 for all positive integers n and W = ∪Kn. Let Cc(Kn) be the set of continuous real-valued functions f on W whose support (i.e. {x ∈ W | f(x) ≠ 0}) is contained in Kn. Then Cc(Kn) is a Banach space when endowed with the sup norm ||f|| = sup_{x∈Kn} |f(x)|. In particular, it is locally convex.

Definition 5.10.14. A topological vector space V is locally convex if its topology is generated by a family P = {pα }
of continuous functions pα : V → [0, ∞) so that

1. pα (u + v) ≤ pα (u) + pα (v) for all u, v ∈ V.

2. pα (λu) = |λ|pα (u) for all scalars λ.

3. If pα (u) = 0 for all α, then u = 0.

4. The topology of V is the weakest for which all of the pα's are continuous.

In particular, all normed spaces are locally convex. (We just have one pα with pα(v) = ||v||.) Note that each inclusion Cc(Kn) ⊂ Cc(Kn+1) is continuous and Cc(W) = ∪_{n=1}^{∞} Cc(Kn).

Definition 5.10.15. The strict inductive limit topology is the finest locally convex topology for which each inclusion
Cc (Kn ) ⊂ Cc (Kn+1 ) is continuous. In this topology, a linear map T : Cc (W ) → Y where Y is locally convex is
continuous if and only if the restriction of T to each Cc (Kn ) is continuous.

Let Cc(W)∗ be the dual of Cc(W) with respect to the strict inductive limit topology, and let Cc(W)∗ have the weakest topology so that for each f ∈ Cc(W), the resulting evaluation function e_f : Cc(W)∗ → R with e_f(T) = T(f) is continuous. This is the corresponding weak∗ topology.

Theorem 5.10.3. Given a persistence diagram D = (S, µ) ∈ D and a function f ∈ Cc (W ), define


νD(f) = Σ_{x∈S} µ(x)f(x).

If Cc (W ) is endowed with the strict inductive limit topology and Cc (W )∗ with the corresponding weak ∗ topology,
then ν : D → Cc (W )∗ defined by ν(D) = νD is continuous, injective, and if A ⊔ B denotes the disjoint union of A
and B, then for D, D′ ∈ D, ν(D ⊔ D′ ) = ν(D) + ν(D′ ).
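For a finite diagram, the map ν is easy to compute. Here is a minimal sketch (the dictionary representation and function names are mine, not from [120]), including the additivity property over disjoint unions:

```python
def nu(diagram, f):
    """nu_D(f) = sum over points x of mu(x) * f(x), where `diagram`
    is a dict mapping points (birth, death) to multiplicities and f
    is a (compactly supported) function on the wedge."""
    return sum(m * f(x) for x, m in diagram.items())

def disjoint_union(d1, d2):
    """Disjoint union of diagrams: multiplicities of shared points add,
    so that nu(D ⊔ D') = nu(D) + nu(D')."""
    out = dict(d1)
    for x, m in d2.items():
        out[x] = out.get(x, 0) + m
    return out
```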

So now that we have an embedding ν : D → Cc(W)∗, we can look for coordinate functions. The next theorem characterizes the elements of Cc(W)∗∗, the dual of the dual.

Theorem 5.10.4. Let V be a locally convex space and endow its topological dual V∗ with the associated weak∗ topology. This is the smallest topology such that all the evaluations e_v : V∗ → R defined by e_v(T) = T(v) for v ∈ V are continuous. Then the function e : V → V∗∗ defined by e(v) = e_v is an isomorphism of locally convex spaces.

Applying this theorem to the locally convex space Cc(W), topologized with the strict inductive limit topology, implies that the elements of Cc(W)∗∗ are evaluations e_f with f ∈ Cc(W). Composing e_f with ν yields a continuous function e_f ∘ ν : D → R with e_f ∘ ν(D) = νD(f) = νf(D), where νD(f) = νf(D) = Σ_{x∈S} µ(x)f(x). The νf are the functions we are looking for.

Definition 5.10.16. A template system for D is a collection T ⊂ Cc (W ) so that

FT = {νf |f ∈ T }

is a coordinate system for D. The elements of T are called template functions.


Template functions can approximate continuous functions on persistence diagrams in the following sense:
Theorem 5.10.5. Let T ⊂ Cc (W ) be a template system for D, let C ⊂ D be compact, and let F : C → R be
continuous. Then for every ϵ > 0, there exists a positive integer N , a polynomial p ∈ R[x1 , · · · , xN ], and template
functions f1 , · · · , fN ∈ T such that

|p(νD (f1 ), · · · , νD (fN )) − F (D)| < ϵ

for every D ∈ C. In other words, the collection of functions of the form D → p(νD (f1 ), · · · , νD (fN )) is dense in
C(D, R) in the compact-open topology.
We now want to see how to construct template functions. The next theorem shows that we can construct a countable template system by translating and rescaling a function f ∈ Cc(W).

Theorem 5.10.6. Let f ∈ Cc(W), let n be a positive integer, and let m ∈ R² have integer coordinates. Define

f_{n,m}(x) = f(nx + m/n).

If f ≠ 0, then

T = {f_{n,m}} ∩ Cc(W)

is a template system for D. Recall that a function f is Lipschitz if for any x, y in the domain of f there is a constant C independent of x and y such that |f(x) − f(y)| ≤ C|x − y|. Then if f is Lipschitz, the elements of the coordinate system

{ν_{f_{n,m}} = e_{f_{n,m}} ∘ ν | f_{n,m} ∈ T} = FT

are Lipschitz on any relatively compact set S ⊂ D.
Perea et al. describe two families of template functions: tent functions and interpolating polynomials. I will describe them next. The families will be defined in the birth-lifetime plane instead of the birth-death plane just to keep the definitions a little simpler. The shifted objects will be denoted by a tilde. So the wedge W̃ consists of points (x, y) ∈ R² such that x ≥ 0 and y > 0, and W̃^ϵ is the subset of W̃ where y > ϵ. The point x = (a, b) in the birth-death plane is transformed to x̃ = (a, b − a) in the birth-lifetime plane. A diagram D = (S, µ) is transformed to D̃ = (S̃, µ̃) where µ̃(x̃) = µ(x).
The first example is tent functions. Given a point A = (a, b) ∈ W̃ and a radius δ with 0 < δ < min{a, b}, define the tent function on W̃ to be

g_{A,δ}(x, y) = |1 − (1/δ) max{|x − a|, |y − b|}|_+

where | · |_+ means take the value of the function if it is positive and let it be zero otherwise. Since δ < min{a, b}, the function is zero outside of the compact box [a − δ, a + δ] × [b − δ, b + δ] ⊂ W̃.
Given a persistence diagram D = (S, µ), the tent function is

G_{A,δ}(D) = G̃_{A,δ}(D̃) = Σ_{x̃∈S̃} µ̃(x̃) g_{A,δ}(x̃).

To limit the number of these functions, let δ > 0 be the partition scale, let d be the number of subdivisions along the diagonal (birth-death) or y-axis (birth-lifetime), and let ϵ be the upward shift. Then use only the tent functions

{G_{(δi, δj+ϵ), δ} | 0 ≤ i ≤ d, 1 ≤ j ≤ d}.

Figure 5.10.3: Tent function g_{(4,6),2} drawn in the birth-death plane and the birth-lifetime plane [120].

An example plot from [120] is shown in Figure 5.10.3.
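Here is a minimal NumPy sketch of tent-function featurization in the birth-lifetime plane (the array-based diagram representation and function names are mine; Teaspoon provides a full implementation):

```python
import numpy as np

def tent(A, delta, pts):
    """Evaluate g_{A,delta} at an array of birth-lifetime points,
    one (birth, lifetime) pair per row."""
    d = np.max(np.abs(pts - np.asarray(A, dtype=float)), axis=1)
    return np.maximum(1.0 - d / delta, 0.0)   # the |...|_+ positive part

def tent_feature(diagram, A, delta):
    """G_{A,delta}(D) for a diagram given as an (n, 2) array of
    (birth, death) pairs; multiplicity is handled by repeating rows."""
    bd = np.asarray(diagram, dtype=float)
    bl = np.column_stack([bd[:, 0], bd[:, 1] - bd[:, 0]])  # birth-lifetime
    return float(tent(A, delta, bl).sum())

def tent_feature_vector(diagram, delta, d, eps):
    """Feature vector over the grid of centers (delta*i, delta*j + eps),
    0 <= i <= d, 1 <= j <= d, as in the family above."""
    return [tent_feature(diagram, (delta * i, delta * j + eps), delta)
            for i in range(d + 1) for j in range(1, d + 1)]
```

For instance, the tent function g_{(4,6),2} of Figure 5.10.3 evaluates to 1 at the diagram point (4, 10), whose birth-lifetime coordinates are exactly the center (4, 6).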


The output is now a vector of numbers which can be sent to your favorite classifier. Perea et al. show that tent functions separate points. In other words, they have different values for different persistence diagrams.

The other template functions are interpolating polynomials. They are formed by constructing polynomials with specified values at mesh points of R² and evaluating them on the points of the persistence diagram. The construction is somewhat more complicated, so I will refer you to [120] for details. Perea et al. claimed that these performed even better than tent functions in their experiments.

The Teaspoon package in Python does all of the work for you, constructing both tent functions and interpolating polynomials from your data points. It can be found at https://github.com/lizliz/teaspoon .

5.11 Open Source Software and Further Reading


In this chapter, I showed you what persistent homology is, how it can arise from point clouds, sublevel sets, graphs, and time series, and some methods for pulling out numerical features to feed to a classifier. This should give you lots of things to try on your own problems.
There is a lot more written on TDA that I haven't covered. Here are some more places to look for further reading. There are five books I can list on TDA that each have a different perspective. The oldest is Computational Homology by Kaczynski, Mischaikow, and Mrozek [79]. This book doesn't cover persistent homology, but covers algorithms for computing homology groups. The books by Zomorodian [181] and Edelsbrunner/Harer [44] are the earliest books I know of that cover persistent homology.
know of that cover persistent homology. Ghrist’s Elementary Applied Topology [53] gives a taste of a lot of potential
applications of topology which you can follow up on if any of the topics interest you. The book by Rabadan and
Blumberg [129] is the most recent and modern, and it is heavily geared towards biomedical applications, especially
virus evolution and cancer. The mathematical background material is done very well, and they cover some topics that
I skipped. For example, they talk about the use of statistical techniques in topological data analysis. Many of these
problems are hard and still unsolved.
I also left out the idea of witness complexes, in which a subset of a point cloud called landmark points is chosen to reduce computational cost. There are extremely fast programs now, so this is probably not as much of a problem as it once was.
There are also some good papers that will give you an introduction to TDA. Carlsson [29] was one of the earliest introductory papers. Chazal and Michel [31] is a more modern

introduction. Pun et al. [127] is one of the best organized surveys out there, giving you options at every step of the pipeline. They also list software, but they leave out Ripser. Finally, I should mention the introduction by Munch [109] which was used as background for the January 2021 American Mathematical Society short course on Mathematical and Computational Methods for Complex Social Systems.
One advantage of experimenting with TDA is that there is now a lot of open source software. Ripser is probably the fastest for doing persistent homology, but it doesn't do everything. Javaplex, Hera, and Dionysus are three alternatives. I also mentioned Elizabeth Munch's Teaspoon package which handles template functions. Also check out packages that are available in R. Two comprehensive lists of software and what it can do are found in [127] and Appendix A of [129].
So now that you know so much about TDA, what comes next? In the next chapter, I will talk about some famous and important algorithms that are topology related but not exactly persistent homology. Then I will briefly discuss a couple of my own ideas that I never got to fully pursue. After that, I will take you to the frontier and look at recent developments that could help tap the more advanced techniques of cohomology, Steenrod squares, and homotopy theory.
Chapter 6

Related Techniques

In this chapter, I will cover some applications of algebraic topology which do not involve persistent homology. Section 6.1 will cover Ronald Atkin's "Q-analysis." Q-analysis, described in [9], is the earliest attempt I can think of to apply algebraic topology to the real world. In this case, it was to explain the structure and politics of his university, the University of Essex in Colchester, UK. I will spend some time on this subject as nobody else is likely to explain it to you. Q-analysis never really caught on, but I think that Atkin had some interesting things to say about large bureaucracies and how they function.
In Section 6.2, I will talk about sensor coverage, one of the earliest proposed applications of computational homol-
ogy theory. Suppose you have an array of sensors that know if their coverage area overlaps with that of neighboring
sensors. Can you tell if your coverage region has any holes?
Section 6.3 will cover Mapper, the main commercial product of Gunnar Carlsson’s company, Ayasdi. I will de-
scribe the algorithm and its breakthrough application: The discovery of a new class of breast cancer cell.
In Section 6.4, I will discuss simplicial sets, a generalization of simplicial complexes. They are necessary to understand the dimensionality reduction algorithm UMAP, and they are also used in modern computational algebraic topology. Finally, I will discuss the UMAP algorithm itself in Section 6.5.

6.1 Q-Analysis
In [9], Ronald Atkin looked at the problem of using topology to describe social science data and discover relationships. To do that, he used simplicial complexes to describe binary relations. Remember that an abstract simplicial complex consists of a collection of subsets of a vertex set such that if σ is in the complex, then so is any face of σ. The complex then can be used to illustrate relationships. For example, if X is a set of people and Y is a set of committees, we could build a simplicial complex to illustrate the relation "is a member of" by letting xi ∈ σj if person i is a member of committee j. We also have the inverse relationship: committee j contains person i. Q-analysis looks at the geometric structure of these relations at each level of a hierarchy in which the N + 1 level groups together objects at the N level. For example, if football, baseball, and basketball are objects at the N level, they might be grouped together as the object "sports" at the N + 1 level. At each level we look at how many elements are shared between simplices, patterns which associate numbers to simplices, and the forces that cause patterns to evolve.
In Sections 6.1.1, 6.1.2, and 6.1.3, I will make these concepts concrete by discussing topological representation of relations, exterior algebra, and shomotopy respectively. Then I will choose some examples from Atkin's description of the University of Essex to illustrate the concepts. In Section 6.1.4, I will discuss the application to the University's committee structure. Finally, Section 6.1.5 will cover rules for success in handling the hierarchy.

Although there have been a number of papers on Q-analysis since this book was published, you rarely see it in modern discussions of TDA. Still, Atkin had some interesting things to say about the workings of bureaucracies, and it would be interesting to apply similar ideas to other large organizations such as corporations or government agencies.


6.1.1 Topological Representation of Relations


For the definitions in this section, I will refer to the following example from [9].

Given a relation λ ⊂ Y × X, we have a simplicial complex KY(X; λ) where elements of X correspond to vertices and elements of Y correspond to maximal simplices. In what follows, the term "simplex" will just be used for the maximal simplices.
Example 6.1.1. Suppose we have the following incidence matrix Λ in which the rows are maximal simplices and the columns are vertices. Λji = 1 if xi is in simplex Yj and Λji = 0 otherwise:

Λ =
[ 1 1 1 1 0 0 0 0 ]
[ 0 0 1 1 1 0 0 0 ]
[ 0 0 0 0 1 0 0 1 ]
[ 0 0 0 0 0 1 1 1 ]
[ 0 0 1 0 0 0 1 0 ]
[ 0 0 0 1 0 1 0 1 ]

In this case, we have 6 maximal simplices: Y1 = [x1, x2, x3, x4], Y2 = [x3, x4, x5], Y3 = [x5, x8], Y4 = [x6, x7, x8], Y5 = [x3, x7], Y6 = [x4, x6, x8].

Note that we can also define the complex KX(Y; λ⁻¹) whose incidence matrix is the transpose of Λ.

Now we look at chains of simplices that have vertices in common.
Now we look at chains of simplices that have vertices in common.
Definition 6.1.1. Two simplices σ and τ are q-connected if there is a chain σ = σ1 , σ2 , · · · , τ = σh such that each
consecutive pair {σi , σi+1 } has a face of at least dimension q in common. We say that this is a chain of q-connection
of length h − 1.
WARNING: This is not the same as the term n-connected used in homotopy theory.
Also note that I am departing from Atkin’s notation where σn means that n is the dimension of σ.
A q-simplex is always q-connected to itself by a chain of length 0. Also, if σ and τ are q-connected, they are also
n-connected for 0 ≤ n < q.
Example 6.1.2. Let σ1 = [x1 , x2 , x3 , x4 , x5 , x6 ], σ2 = [x2 , x3 , x4 , x5 , x6 , x9 ], σ3 = [x4 , x5 , x6 , x7 , x8 , x9 ], and σ4 =
[x6 , x7 , x8 , x9 , x10 ]. Then σ1 ∩ σ2 = [x2 , x3 , x4 , x5 , x6 ], σ2 ∩ σ3 = [x4 , x5 , x6 , x9 ], and σ3 ∩ σ4 = [x6 , x7 , x8 , x9 ]. So
σ1 and σ2 share a face of dimension 4, σ2 and σ3 share a face of dimension 3, and σ3 and σ4 share a face of dimension
3. So σ1 and σ4 are 3-connected through a chain of length 3.
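The arithmetic in this example can be checked with a couple of lines of Python (the function names are mine):

```python
def shared_dim(s, t):
    """Dimension of the largest face shared by two simplices, given
    as sets of vertex labels; -1 means they are disjoint."""
    return len(s & t) - 1

def is_q_connected_chain(chain, q):
    """Check that consecutive simplices in the list share a face of
    dimension at least q, i.e. the list is a chain of q-connection."""
    return all(shared_dim(a, b) >= q for a, b in zip(chain, chain[1:]))
```

Applied to Example 6.1.2, the consecutive shared faces have dimensions 4, 3, and 3, so the chain σ1, σ2, σ3, σ4 is a chain of 3-connection but not of 4-connection.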
For a simplicial complex K of dimension n, fix q with 0 ≤ q ≤ n. Then the property of being q-connected is an equivalence relation on the simplices of K. We call the equivalence classes q-components and denote their cardinality by Qq. If we analyze K by finding the values of Q0, Q1, · · · , Qn, we say that we have performed a Q-analysis on K.
Example 6.1.3. Referring to Example 6.1.1, the largest simplex is Y1 of dimension 3. Since there is no other of that dimension, we have Q3 = 1. There are 4 simplices of dimension at least 2 (remember we only consider maximal simplices). These are Y1, Y2, Y4, and Y6, and none of these are 2-connected to anything. So Q2 = 4. We also have Q1 = 4, since Y1 and Y2 share the 1-face [x3, x4], and Y4 and Y6 share the 1-face [x6, x8], while Y3 and Y5 are maximal 1-simplices. Finally, Q0 = 1 as every simplex intersects at least one other, so the complex K is connected.
Note that Q0 is the number of connected components, so it always equals the 0th Betti number in homology.
We can do this calculation quickly by multiplying ΛΛᵀ and subtracting 1 from every entry of the resulting matrix. ΛΛᵀ is a symmetric matrix and (ΛΛᵀ)ij is the number of ones that row i and row j have in common. So subtracting 1 gives the dimension of the largest common face.


Definition 6.1.2. Given a complex K with dimension n the structure vector is Q(K) = [Qn , · · · , Q1 , Q0 ].
Example 6.1.4. In our previous example, the structure vector is Q = [1, 4, 4, 1].

Definition 6.1.3. Given a complex K with dimension n and structure vector Q(K) = [Qn , · · · , Q1 , Q0 ], the obstruc-
tion vector Q̂ is defined as Q̂ = [Qn , Qn−1 − 1, · · · , Q1 − 1, Q0 − 1]

The obstruction vector is a property of the complex and acts as an obstruction to the free flow of patterns which
will be defined next.
Word reuse warning: This obstruction has nothing to do with the obstruction theory I will define in Chapter 10.

Example 6.1.5. In our previous example, the obstruction vector is Q̂ = [1, 3, 3, 0].
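Both vectors are easy to compute from the incidence matrix using the ΛΛᵀ observation above. A minimal NumPy sketch (the function names are mine; q-connection is the transitive closure of q-nearness, handled here with a small union-find):

```python
import numpy as np

def structure_vector(L):
    """Structure vector [Q_n, ..., Q_0] from an incidence matrix L
    (rows = maximal simplices, columns = vertices)."""
    L = np.asarray(L)
    shared = L @ L.T - 1        # (i, j) = dim of the largest common face
    dims = np.diag(shared)      # diagonal = dimension of each simplex
    Q = []
    for q in range(int(dims.max()), -1, -1):
        idx = [i for i in range(len(dims)) if dims[i] >= q]
        parent = {i: i for i in idx}       # union-find over q-components
        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i
        for a in idx:                      # merge simplices that are q-near
            for b in idx:
                if a < b and shared[a, b] >= q:
                    parent[find(a)] = find(b)
        Q.append(len({find(i) for i in idx}))
    return Q

def obstruction_vector(Q):
    """Per Definition 6.1.3: keep the top entry, subtract 1 elsewhere."""
    return [Q[0]] + [q - 1 for q in Q[1:]]
```

Applied to the incidence matrix of Example 6.1.1, this returns [1, 4, 4, 1] and [1, 3, 3, 0], matching Examples 6.1.4 and 6.1.5.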

Definition 6.1.4. A pattern π on a complex K is a mapping from the simplices of K to the integers. The restriction of π to t-simplices is written π^t.

Note that this mapping is not necessarily one-to-one and it is not necessarily a homomorphism.
Atkin describes the underlying geometry as the static backcloth, denoted S(N). An incremental change in pattern δπ on S(N) is the analogue of a force in physics. The intensity of the force on a simplex σ is

δπ(σ) / π(σ).

Changes in patterns in the environment can induce social forces such as social pressure, organizational pressure, etc. In this case it will be the patterns in the university environment.

6.1.2 Exterior Algebra


In this section, we will discuss the properties of an exterior algebra. You may have seen this if you have ever
worked with differential forms.
Let V be a finite dimensional vector space over a field F with basis {v1 , · · · , vn }. We turn this space into an
algebra using the exterior product or wedge product.

Definition 6.1.5. An exterior algebra over V is an algebra over F whose product is the wedge product v ∧ w having the properties that v ∧ w = −w ∧ v (antisymmetry) and v ∧ v = 0 for v, w ∈ V. Using the distributive law, if a1, a2, b1, b2 ∈ F and v, w ∈ V, then

(a1v + a2w) ∧ (b1v + b2w) = (a1b2 − a2b1)(v ∧ w).

In addition, the product is associative (i.e. if v, w, z ∈ V, then (v ∧ w) ∧ z = v ∧ (w ∧ z)). This makes it into an associative algebra.

Definition 6.1.6. Let V be an n-dimensional vector space over a field F with basis {v1, · · · , vn}. Define a new vector space ∧^2 V with basis consisting of all elements of the form vi ∧ vj with vi and vj basis elements of V and i ≠ j. In general we can define ∧^k V for 2 ≤ k ≤ n with basis consisting of v_{i1} ∧ · · · ∧ v_{ik} where the product is taken over all k-tuples of distinct basis elements of V. Let ∧^0 V = F and ∧^1 V = V. Then we define the exterior algebra on V to be the associative algebra ∧V defined as

∧V = ∧^0 V ⊕ ∧^1 V ⊕ ∧^2 V ⊕ · · · ⊕ ∧^n V.

Note that if x ∈ ∧^p V and y ∈ ∧^q V, then x ∧ y ∈ ∧^{p+q} V provided that p + q ≤ n. Also, the dimension of ∧^p V is the number of combinations of selecting p objects from n objects (order doesn't matter), or

(n choose p) = n! / ((n − p)! p!).

We will now modify this construction for V a module over the integers (i.e. an abelian group). Ordering all of the elements of a set X, let V = {x1, x2, · · · , xn}. We can then associate simplicial complexes with polynomials over

∧V. We will leave out the wedge symbol and just write vi ∧ vj as vi vj. Then a p-simplex [x_{i0}, x_{i1}, · · · , x_{ip}] can be written as the element x_{i0} x_{i1} · · · x_{ip} ∈ ∧^{p+1} V.

For a complex K and a simplex σ in K, let ρ(σ) be the monomial in ∧V corresponding to σ. Suppose we have a pattern π on K. Then the pattern polynomial, also denoted as π, is defined to be

π = 1 + Σ_{σ⊂K} ρ(σ)π(σ).

Now we introduce a function that takes a simplex to the sum of its faces.

Definition 6.1.7. If σ = [v0, v1, · · · , vp] is a simplex, then we define the face operator f by

f σ = Σ_{i=0}^{p} [v0, · · · , v̂i, · · · , vp].

Note that this looks like the boundary but does not involve an alternating sum. Also, we are not working in Z2, so it is not true that applying f twice gives zero. We will let f⁰ be the identity map and define fⁿ = f(f^{n−1}) for n > 0.

Letting the simplices be monomials in ∧V, f becomes a linear function, i.e. f(aσ) = af(σ) for a ∈ Z, and f(σ + τ) = f(σ) + f(τ). For a 0-dimensional simplex σ = [v], we define f(σ) = 1.

Example 6.1.6. In ∧V, f(x1x2x3) = x1x2 + x1x3 + x2x3. Then f²(x1x2x3) = f(x1x2 + x1x3 + x2x3) = x1 + x2 + x1 + x3 + x2 + x3 = 2x1 + 2x2 + 2x3 ≠ 0.

Now if σ and τ share a vertex, then στ = 0 in ∧V since the vertex appears twice in the product. If they share two vertices, then we also have f(σ)τ = σf(τ) = 0. But σf^t(τ) ≠ 0 for t > 1. So continuing this argument gives the following:

Theorem 6.1.1. Two simplices σ and τ in a complex K share a t-face if and only if their monomial representations in ∧V satisfy (f^i σ)τ = 0 or σ(f^i τ) = 0 for 0 ≤ i ≤ t, and both terms are nonzero for i > t.
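This machinery can be sketched in a few lines of Python, representing an element of ∧V as a dictionary from vertex sets (monomials) to integer coefficients (this representation and the function names are mine, not Atkin's, and shared_face_dim assumes its second argument is nonzero):

```python
from itertools import combinations

def wedge(u, v):
    """Wedge product in the exterior algebra over Z."""
    out = {}
    for a, ca in u.items():
        for b, cb in v.items():
            if a & b:                      # repeated vertex => the term is 0
                continue
            seq = sorted(a) + sorted(b)    # sign = parity of this ordering
            sign = 1
            for i in range(len(seq)):
                for j in range(i + 1, len(seq)):
                    if seq[i] > seq[j]:
                        sign = -sign
            m = a | b
            out[m] = out.get(m, 0) + sign * ca * cb
    return {m: c for m, c in out.items() if c != 0}

def face(u):
    """Atkin's face operator: sum of the codimension-1 faces, no signs."""
    out = {}
    for m, c in u.items():
        for sub in combinations(sorted(m), len(m) - 1):
            out[frozenset(sub)] = out.get(frozenset(sub), 0) + c
    return {m: c for m, c in out.items() if c != 0}

def shared_face_dim(sigma, tau):
    """Largest t with (f^i sigma) * tau = 0 for all 0 <= i <= t, i.e.
    the dimension of the largest common face (-1 if disjoint)."""
    u, t = dict(sigma), -1
    while wedge(u, tau) == {}:
        t += 1
        u = face(u)
    return t
```

For the simplices of Example 6.1.2, shared_face_dim recovers the same shared-face dimensions as Theorem 6.1.1 predicts.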

6.1.3 Shomotopy


We now introduce a concept of nearness between simplices.

Definition 6.1.8. If two simplices share a q-face, we say that they are q-near.

We can describe a chain of q-connectivity in terms of q-nearness: in a chain of q-connection, successive simplices are q-near. In a simplicial complex, we can extend the concept of nearness to chains of connection. We will denote such a chain from σ to τ as [σ, τ].

Definition 6.1.9. Let [σ0, σn] and [τ0, τm] be two chains of connection. The intermediate simplices are numbered σ1, · · · , σn−1 and τ1, · · · , τm−1 respectively. Then [σ0, σn] and [τ0, τm] are q-adjacent if

1. σ0 is q-near τ0.

2. σn is q-near τm.

3. Given σi in [σ0, σn], there is a τj in [τ0, τm] such that σi is q-near τj. Let τ_{j1} and τ_{j2} be the chosen simplices that are q-near σ_{i1} and σ_{i2} respectively. Then i1 < i2 implies j1 < j2.

4. Given τj in [τ0, τm], there is a σi in [σ0, σn] such that τj is q-near σi. Let σ_{i1} and σ_{i2} be the chosen simplices that are q-near τ_{j1} and τ_{j2} respectively. Then j1 < j2 implies i1 < i2.

A q-chain of connection is obviously q-adjacent to itself. The relation is also symmetric but it is not transitive.
Atkin now defines a discrete analog of homotopy called a pseudo-homotopy or shomotopy for short.
Level | Object
N + 1 | {Social Amenity}
N | {Catering, Sport}
N − 1 | {Food, Drink, Groceries, . . . , Athletics, Ball Games, . . . }
N − 2 | {Pies, Chips, . . . , Beer, Wine, . . . , Carrots, Milk, Meat, . . . , Football, Tennis, High-Jump, Billiards, . . . }

Table 6.1.1: Social Amenities at the University of Essex [9].

Definition 6.1.10. Let K⁺ be a simplicial complex with a simplex σ₋₁ added. σ₋₁ is a (−1)-dimensional simplex which will just be something we can use for numbers that are out of range in the function we will now define. Let c1 and c2 be chains of q-connection of length n1 and n2 respectively. A (discrete) q-shomotopy is a mapping Sh : Z × Z → K⁺ such that

1. For fixed x ∈ Z, the image of Sh(x, y) is a chain of q-connection. As this chain will have finite length m since K is a finite complex, we let Sh(x, y) = σ₋₁ for y < 0 or y > m.

2. Sh(0, y) = c1 and Sh(m, y) = c2 for some finite m > 0.

3. Sh(x, y) and Sh(x + 1, y) are q-adjacent for 0 ≤ x < m.

Shomotopy is an equivalence relation, and the finite set of all chains of connection in K falls into disjoint equivalence classes called q-shomotopy classes.
An analogue of a continuous function (more specifically an isometry) is a face saving map.

Definition 6.1.11. Given two complexes K and K′, let β : K → K′ be an injective map. β is a face saving map if for simplices σ, τ ∈ K, σ being q-near to τ in K implies that β(σ) is q-near to β(τ) in K′.

A face saving map preserves q-connectivities but need not preserve chain length. This means that these maps also preserve the property of being q-adjacent and preserve q-shomotopies.
Shomotopy gives an analogue of the homotopy groups I will discuss in Chapter 9. As a preview, consider the fundamental group π1(X, x0). This consists of homotopy classes of maps f from I = [0, 1] to X where f(0) = f(1) = x0. These are loops that start and end at the base point x0. There is a group structure where the multiplication [x]·[y] consists of traveling around a representative loop in class [x] followed by one in class [y]. The loop where all of I is mapped to the basepoint is the identity, and if [x] is represented by f : I → X, then [x]⁻¹ is represented by g : I → X where g(t) = f(1 − t), so that the loop is traversed in the opposite direction. I will get into much more detail in Chapter 9.
For the shomotopy version, let σ0 be a fixed simplex in K. Then σ0 is a base simplex and a chain of q-connection
[σ0 , σ0 ] is a q-loop based at σ0 . Group structures are entirely analogous as the identity is the q-loop of length 1 while
an inverse of a loop is achieved by listing the simplices in the opposite direction. Loops that are not shomotopic to a
loop of length 1 represent holes in the complex.

6.1.4 Structures of the University of Essex


Atkin describes a research study that he did at the University of Essex in Colchester, UK, where he was a professor. The idea was to see how the structures described above would apply to various aspects of the University. These categories ranged from physical structure to administrative organization, committee structures, and social amenities. For each category, there was a set of hierarchical levels N − 1, N, N + 1, · · · . Objects at a higher level include objects at the lower levels. Atkin gives an example of social amenities which I have reproduced in Table 6.1.1.
Note that the levels are not partitions. For example, Drink and Groceries share Milk as a common object. That is
where q-connectivity plays a role.

The report covered levels from N + 3, which was the world outside the University, down to the level of the individual, which was N − 2, but an individual typically operates at multiple levels depending on the context.
Within a level N , the N -level backcloth S(N ) is the space in which activity takes place. It is analyzed by looking
at the structure vector Q and its associated obstruction vector Q̂. This gives a global analysis. A more local one is
the study of shomotopy and especially the q-holes which the traffic at that level must avoid. These holes look like
an object to an individual. There is also the group structure whose role is poorly understood. Anyway, the structure
affects how the University community interacts and how decisions are made.
I will give a couple of examples from the rest of the book. I would have appreciated many of them more if I had ever
been to the University of Essex, as there is a lot of description of the various buildings, dining halls, etc. I imagine it
must have changed in some way in the last 50 years, though. I visited Colchester in 2002 and saw the Colchester
Castle, not to be confused with the "citadel" described below.
The first example I will describe is the committee structure. There are 28 of these including the University Council
which operates both at level N + 3 (the interface with the outside world) and at level N + 2. All of the others operate
at level N + 1.
The maximal simplices are people and the vertices are the committees they are on. The Vice Chancellor is on 20 of
these committees, representing a simplex of dimension 19. We then have two additional components at dimension 9,
contributed by two people, one in the History Department and one in the Literature Department. The structure vector is

Q = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 2, 3, 3, 6, 4, 5, 9, 7, 1}

where the positions go from dimension 19 down to 0 (left to right) and the first 3 is in dimension 9. This gives the
obstruction vector
Q̂ = {2, 1, 2, 2, 5, 3, 4, 8, 6, 0}.
The obstruction vector has its largest value at q = 2 representing business that must be discussed by 3 committees.
As the q-connectivity represents committees that are shared by particular individuals, high values in the obstruction
vector show more rigidity. A pattern vector may show a ranking each individual assigns to each order of business. To
get things done, these patterns need to change and the higher the obstruction value, the more resistance.
Flipping things around, we get a complex where vertices are people and maximal simplices are committees. The
connectivity consists of 2 committees that share people. This is required for good communication and a high obstruction
means that there are several groups of committees that don't share members, so they don't talk to each other. The
structure and obstruction vectors in this case also had some large components.
Atkin proposed that when committees are formed (an annual process), Q-Analysis may help to determine mem-
bership that will be more or less resistant to change.
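Atkin's bookkeeping here is easy to mechanize. The following is a minimal Python sketch (my own illustration, not Atkin's software) that computes a structure vector from a list of maximal simplices, each given as a set of vertices. Two simplices are taken to be q-near when they share at least q + 1 vertices, and the obstruction vector is obtained by subtracting 1 from each entry.

```python
def structure_vector(maximal_simplices):
    """Q-Analysis structure vector: one entry per dimension q, counting the
    q-connected components among simplices of dimension >= q, listed from
    the top dimension down to q = 0 (the left-to-right convention above)."""
    top = max(len(s) for s in maximal_simplices) - 1
    Q = []
    for q in range(top, -1, -1):
        active = [s for s in maximal_simplices if len(s) >= q + 1]
        parent = list(range(len(active)))

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        # two simplices are q-near if they share a face of dimension q,
        # i.e. at least q + 1 common vertices
        for i in range(len(active)):
            for j in range(i + 1, len(active)):
                if len(active[i] & active[j]) >= q + 1:
                    parent[find(i)] = find(j)
        Q.append(len({find(i) for i in range(len(active))}))
    return Q

def obstruction_vector(Q):
    # Atkin's obstruction vector: one less than each structure-vector entry
    return [q - 1 for q in Q]
```

For the committee study, each maximal simplex could be the set of committees a given person sits on (or, flipping things around, the set of people on a given committee).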
Another interesting quantity in this context is the eccentricity of a simplex.

Definition 6.1.12. Let σ be an r-dimensional simplex in a complex K. Define q̌ (the bottom-q) to be the largest value
of q for which a distinct simplex is q-connected to σ. (If 2 simplices are q-connected, they are also t-connected for all
t < q, so for this definition to make sense we take the highest such value.) Then define the eccentricity of σ to be

Ecc(σ) = (r − q̌)/(q̌ + 1).

Higher eccentricity means a simplex is more out of step with the others. If σ does not share any vertices with
another simplex, we say that q̌ = −1 and Ecc(σ) = ∞.
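Eccentricity is a one-liner once we can find q̌. For a simplex in a list of maximal simplices, the first link in any chain of q-connection already determines q̌, so it suffices to take the largest shared-face dimension with any other simplex. The sketch below (illustrative, with hypothetical names) uses that shortcut.

```python
def eccentricity(sigma, others):
    """Ecc(sigma) = (r - q_bot)/(q_bot + 1), where r = dim(sigma) and q_bot
    is the largest q at which sigma is q-connected to a distinct simplex;
    for maximal simplices this is the largest shared-face dimension."""
    r = len(sigma) - 1
    q_bot = max((len(sigma & tau) - 1 for tau in others), default=-1)
    if q_bot == -1:
        return float("inf")   # sigma shares no vertices with anything else
    return (r - q_bot) / (q_bot + 1)
```

A simplex like the Vice Chancellor's, with many vertices but only low-dimensional shared faces, scores high.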
It turns out that the individual with the highest eccentricity is the Vice Chancellor. Committees with high eccen-
tricity were those who shared the fewest members with any other committee.
One of the most interesting ideas is that of a hole. In this case, committees that share members form a loop which
can’t be reconciled. Someone with connections to all of the committees must plug up the hole for business to get done.
In this case, the only candidate is the Vice Chancellor, but the rest of the individuals see this as a power grab. It would
be interesting to compare this process to that of any large organization.

6.1.5 Rules for Success in the Committee Structure


In general a hole is plugged with a "pseudo-committee" at the next highest level. Here are Atkin's two rules:

1. To succeed in politics, inject (N − k)-level business into N -level committees.

2. To thwart in politics, refer (N + k)-level business to N -level committees.

In the latter case, you can purposely create holes. If no such N -level committees exist, you can create them and
call them ”Working Parties.” (Working Groups in our terminology.)
Atkin gives some very specific case studies that are heavily dependent on how his University was structured but
he saw a major disconnect between levels N + 2 (University Council) and N + 1 (Committees) as compared with any
level below that such as the departments. He refers to levels N + 1 and N + 2 as "the citadel". (The other Colchester
Castle?)
Since [9] was published, there have been some other papers published on Q-Analysis, but it seems to be mostly
missing in modern day TDA discussions. Atkin, himself, didn’t seem to publish anything more recently than the early
1980's. Still, I think he has some interesting things to say about the workings of a large organization. Case studies of
large corporations or Government Agencies might be interesting to see if they behave in similar ways. My guess is
that they do.

6.2 Sensor Coverage


I will now move ahead by 30 years and describe one of the first modern applications of topological data analysis.
The problem of sensor coverage was described by Vin de Silva and Robert Ghrist, two pioneers in TDA, in [39].
Suppose we have an array of sensors located at unknown points. (The problem would be trivial if we knew their
locations.) Each sensor covers an area which is a closed ball of a fixed radius. Although we don’t know the exact
locations of the sensors, the sensors can detect neighboring sensors if they are within a certain distance. The question
is whether there is complete coverage or whether there is a hole in the region covered by the sensors. Algebraic topology will
come to the rescue and provide a method for solving this problem. In this section, I will summarize [39] and how it
solves this problem. For a more modern approach, see the book by Robinson [136].
We make six assumptions:

1. Nodes cover a closed ball of covering radius rc .

2. Nodes broadcast unique ID numbers. Each node can detect the identity of any node within radius rs via a strong
signal or within a larger radius rw via a weak signal.

3. The radii of communication rs and rw and the covering radius rc satisfy


rc ≥ rs /√2 and rw ≥ rs √10.

4. Nodes lie in a compact domain D ⊂ Rd . Nodes can detect the presence but not the direction of the boundary
∂D within a fence detection radius rf .

5. The domain D − Nr̂ (∂D) is connected, where



Nr̂ (∂D) = {x ∈ D | ||x − ∂D|| ≤ r̂} and r̂ = rf + rs /√2.

Here ||x − ∂D|| denotes the distance from x to the closest point of ∂D.

6. The outer boundary can not exhibit large-scale wrinkling. (The actual statement involves differential geometry
terms that I don’t want to get into here as that will take us too far afield and are not that relevant in what follows.
See [39] for more details.)

Note that none of the constants mentioned in the assumptions depends on the dimension d.
Let X be the set of nodes and Xf be the set of fence nodes which lie within the radius rf of ∂D. We build a
collection of graphs:


Fs ⊂ Fw
∩    ∩
Rs ⊂ Rw

(the vertical inclusions are Fs ⊂ Rs and Fw ⊂ Rw ).

The graphs Rs and Rw have vertices which are the nodes in X and an edge between 2 nodes if the distance
between them is less than rs and rw respectively. These are the communication graphs for the strong and weak signals
respectively. The fence subgraphs Fs and Fw are the subgraphs of Rs and Rw in which we restrict the nodes to the
subset Xf ⊂ X.
For these four graphs we have the corresponding Vietoris-Rips complexes with radius rs for Rs and Fs and rw for
Rw and Fw . We write the corresponding nested simplicial complexes as:

Fs ⊂ Fw
∩    ∩
Rs ⊂ Rw

The sensor cover U is the union of discs of radius rc (see assumption 1). Here is the main result of [39]:

Theorem 6.2.1. For a fixed set of nodes in a domain D ⊂ Rd satisfying our six assumptions, the sensor cover U
contains D − Nr̂ (∂D) if the homomorphism

i∗ : Hd (Rs , Fs ) → Hd (Rw , Fw )

induced by inclusion is nonzero.

The proof has a lot to do with the geometry of the balls given all the conditions on the radii. I will outline it by
mainly focusing on the role of homology.
Here is where persistence plays a role. The criterion for coverage is equivalent to saying that there is a nonzero
generator for the relative homology group Hd (Rs , Fs ) that persists to Hd (Rw , Fw ). Remember that rw > rs .
Assumptions 5 and 6 turn out to ensure that Hd (D, Nr̂ (∂D)) has Betti number β = 1. Also, Hd (U ∪ Nr̂ (∂D), Nr̂ (∂D))
is nonzero if and only if U contains D − Nr̂ (∂D). Now the complex that would capture the topology of U exactly
would be the Čech complex. But as we saw in Section 5.1, the Čech complex is very hard to compute as opposed to
the Vietoris-Rips complex. So we would like it to be true that with the Vietoris-Rips complex, Hd (Rs , Fs ) is nonzero
if and only if U contains D − Nr̂ (∂D).
Unfortunately, this is not always the case. Consider the situation in Figure 6.2.1. In this case d = 2, and there is
a cycle of points in Fs which are attached to a single vertex in Rs − Fs . The cycle has two edges of length rs and
two short edges of length ϵ ≪ rs . Both diagonals of the top rectangle are longer than rs so they are not present in Fs .
This corresponds to a generator in H2 (Rs , Fs ) which does not imply coverage of the entire domain. When the radius
is increased to rw , the diagonals appear and kill the relative 2-cycle by filling in the rectangle. So the image of this
fake class is the zero element of H2 (Rw , Fw ). Now it is possible to produce a new fake class so H2 (Rw , Fw ) is not
necessarily 0, but the original fake class was killed by i∗ so we know that there was a gap in coverage.
The bulk of the work is in the analysis of the coverage near the boundary ∂D. The proof is lengthy and technical,
so you can see [39] for details. The key point, though, is that the homology calculations find holes that the Čech

Figure 6.2.1: A generator of H2 (Rs , Fs ) which is killed by the inclusion i∗ into H2 (Rw , Fw ). The fence nodes are
on top and the strip is a border of radius rf . [39]

complex would pick up but the Vietoris-Rips complex misses. Finally, I will mention that the bounds on the radii can
be tightened by introducing a dependence on the dimension d of the domain. Then we have
rc ≥ rs √(d/(2(d + 1))) and rw ≥ rs √((7d − 5 + 2√(2d(d − 1)))/(d + 1)),

and assumption 5 says that the domain D − Nr̂ (∂D) is connected, where
Nr̂ (∂D) = {x ∈ D | ||x − ∂D|| ≤ r̂} and r̂ = rf + rs √((d − 1)/(2d)).

6.3 Mapper
Ayasdi is a company specializing in Topological Data Analysis. It was started in 2008 by Gunnar Carlsson, Gurjeet
Singh, and Harlan Sexton of Stanford University. One of their biggest commercial products is a visualization algorithm
called Mapper introduced in [146]. In this section, I will describe the algorithm and how it works. In Section 6.3.1,
I will give the general construction. Section 6.3.2 describes the specific implementation for point cloud data. The
construction involves level curves of a height function, so possible height functions are dealt with in Section 6.3.3.
Finally, Section 6.3.4 describes an interesting use case relating to cancer research. In [116], Nicolau, Levine, and
Carlsson use Mapper to find a previously undiscovered type of breast cancer cell. Patients with this type of cell
exhibited 100% survival and no metastasis. I will describe how Mapper was used to discover this type of cell.
Mapper software is available in open source versions in both R and Python as well as a version sold by Ayasdi
which requires a license.

6.3.1 Construction
Definition 6.3.1. Let X be a topological space and U = {Uα }α∈A be a cover of X by a finite collection of open sets
indexed by a finite indexing set A. We define a simplicial complex N (U) called the nerve of U whose vertex set are
the elements of A and {α0 , α1 , · · · , αk } for αi ∈ A span a simplex of N (U) if and only if Uα0 ∩ Uα1 ∩ · · · ∩ Uαk ̸= ∅.
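Definition 6.3.1 translates directly into code. The sketch below is an illustration, assuming the cover is given as a Python dict from index to a finite set of points; it checks every subset of the index set for a common point, which is exponential in the number of cover sets and so only sensible for small examples.

```python
from itertools import combinations

def nerve(cover):
    """Nerve of a finite cover given as {alpha: set_of_points}.
    A subset of indices spans a simplex exactly when the corresponding
    cover sets have a nonempty common intersection."""
    indices = list(cover)
    simplices = []
    for k in range(1, len(indices) + 1):
        for combo in combinations(indices, k):
            if set.intersection(*(cover[a] for a in combo)):
                simplices.append(frozenset(combo))
    return simplices
```

For Mapper only the pairwise (or low-order) overlaps of nearby cover sets are usually needed, which keeps the computation tractable.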
Now consider a continuous function f : X → Z, where X is a topological space and Z is a parameter space. Usu-
ally Z ⊆ Rn for some n. Now suppose V = {Vα }α∈A is an open cover of Z. Let f −1 (V) = {f −1 (Vα )}α∈A . Then
since f is continuous, f −1 (V) is an open cover of X. Given X, Z, f , and V, Mapper returns the nerve N (f −1 (V)).
We will illustrate this with 2 examples.
Example 6.3.1. Suppose X = [−N, N ] ⊂ R, the parameter space is Z = [0, +∞), and f : X → Z is the absolute
value of the quantile function of a Gaussian probability density function with mean 0 and standard deviation σ. In
other words, if p ∈ [0, 1], f (p) is the absolute value of the number q such that the cumulative distribution function of q

has value p. (This is a change from [146] as just using the probability density function of the Gaussian would not have
the stated properties.) The covering V of Z consists of the four sets [0, 5), (4, 10), (9, 15), and (14, +∞). Assume
that N is large enough so that f (N ) > 14. Then f −1 ([0, 5)) has a single component, but f −1 ((4, 10)), f −1 ((9, 15)),
and f −1 ((14, +∞)) all have 2 components, one on the positive side and one on the negative side. We represent the
simplicial complex produced by the nerve of f −1 (V) as shown in Figure 6.3.1. The nodes are labeled by color and
size. The color indicates the value of the function f (red being high and blue being low) at a representative point in
the indicated subset of V ⊂ Z. (We could take an average over V .) The size represents the number of points in the set
corresponding to the node in the case that X is discrete (as it will be in the next section). In the continuous case, it
corresponds to the integral of f over the represented subset of X.

Figure 6.3.1: Simplicial Complex for Example 6.3.1 [146].

Example 6.3.2. Let X = {(x, y) | x^2 + y^2 = 1} ⊂ R2 be the unit circle in the plane. Let Z = [−1, 1], and
f (x, y) = x. The covering V of Z consists of the three sets [−1, −1/3], [−1/2, 1/2], and [1/3, 1]. Then f −1 ([−1, −1/3])
and f −1 ([1/3, 1]) have one component each, while f −1 ([−1/2, 1/2]) has two components. The corresponding simplicial
complex is shown in Figure 6.3.2. (Note that I have modified the example in [146] in two ways. First of all, I
have replaced 2/3 in the paper by 1/3 so that the intervals overlap. Otherwise, the simplicial complex would be four
disconnected vertices. Also, I used f (x, y) = x rather than y so the picture would be more intuitive. If f (x, y) = y,
the figure should be flipped with the two green nodes on the sides and the red node on the top.)

Figure 6.3.2: Simplicial Complex for Example 6.3.2 [146].

6.3.2 Implementation for Point Cloud Data


In applying Mapper to point cloud data, we move from what Singh et al. call the topological version to what
they call the statistical version. The main difference is that we use clustering to represent the geometric operation of
partitioning a set into its connected components.
Assume we have a point cloud X with N points. We have a function f : X → R on the points called the filter. Let
Z be the range of f . Divide this range into a set S of intervals which overlap. Let L be the length of these intervals
and p be the fraction of L defining the overlap. For each interval Ij ∈ S, let Xj = {x|f (x) ∈ Ij }. Then the sets
Xj cover X. For each Xj , we find clusters Xjk . If Im overlaps with Ij , we will have overlap between clusters of
Xm and clusters of Xj . So let each cluster be a vertex in our complex and draw an edge between Xjk and Xmn if
Xjk ∩ Xmn ̸= ∅.
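As a toy illustration of this pipeline (not the implementation of [146]), the sketch below builds the Mapper graph of a finite point cloud. For the clustering step it simply takes connected components of the graph joining points closer than a threshold `cut`, which amounts to cutting a single-linkage dendrogram at a fixed height rather than choosing the cut from a histogram of merge distances as described below.

```python
import math

def components_at_cut(points, idxs, cut):
    # connected components of the graph joining points closer than `cut`
    # (equivalent to cutting a single-linkage dendrogram at height `cut`)
    parent = {i: i for i in idxs}
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for a in idxs:
        for b in idxs:
            if a < b and math.dist(points[a], points[b]) < cut:
                parent[find(a)] = find(b)
    comps = {}
    for i in idxs:
        comps.setdefault(find(i), set()).add(i)
    return list(comps.values())

def mapper_graph(points, f_vals, n_intervals, overlap, cut):
    lo, hi = min(f_vals), max(f_vals)
    # interval length L and step chosen so consecutive intervals overlap
    # by the fraction `overlap` and together cover [lo, hi]
    L = (hi - lo) / ((n_intervals - 1) * (1 - overlap) + 1)
    step = L * (1 - overlap)
    nodes = []
    for j in range(n_intervals):
        a = lo + j * step
        idxs = [i for i, v in enumerate(f_vals) if a - 1e-9 <= v <= a + L + 1e-9]
        if idxs:
            nodes.extend(components_at_cut(points, idxs, cut))
    # one vertex per cluster, an edge whenever two clusters share a point
    edges = [(i, j) for i in range(len(nodes))
             for j in range(i + 1, len(nodes)) if nodes[i] & nodes[j]]
    return nodes, edges
```

On 12 evenly spaced points of the unit circle with f (x, y) = x, three intervals with 50% overlap recover a 4-node, 4-edge cycle, the discrete analogue of Example 6.3.2.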
Example 6.3.3. Suppose X is a set of points sampled from a noisy circle in the plane. Let Z = [0, 4.2], and
f (x, y) = √((x − qx )^2 + (y − qy )^2 ), where q = (qx , qy ) is the leftmost point in the circle. Let L = 1 and p = .2.
Figure 6.3.3 shows the original circle, the unclustered points, and the final complex.
Any clustering algorithm can be used, but Singh et al. use single linkage clustering [78][77]. The advantages of
this scheme are that its input can be any distance matrix so the points aren’t restricted to lie in Euclidean space, and we
don’t need to know the number of clusters in advance. The algorithm returns a vector C ∈ RN −1 , where N is the size
of our point cloud. At the start, each point is its own cluster. At each step, we merge the two closest clusters where
the distance between clusters A and B is the smallest distance between a point of A and a point of B. This distance at

Figure 6.3.3: Simplicial Complex for Example 6.3.3 [146].

each step is held in C. We histogram these distances and look for a natural break. Longer distances should separate
natural clusters.
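A naive version of this scheme (cubic-time and purely illustrative; real implementations use an efficient single-linkage algorithm) can be written as follows. The `gap_cut` helper is a hypothetical stand-in for visually spotting a break in the histogram: it cuts at the largest jump between consecutive sorted merge distances.

```python
import math

def single_linkage_merge_distances(points):
    """Naive single-linkage agglomeration; returns the N-1 successive merge
    distances (the vector C described above)."""
    clusters = [{i} for i in range(len(points))]
    merges = []
    def cdist(A, B):
        # single-linkage distance: closest pair between the two clusters
        return min(math.dist(points[a], points[b]) for a in A for b in B)
    while len(clusters) > 1:
        d, i, j = min((cdist(A, B), i, j)
                      for i, A in enumerate(clusters)
                      for j, B in enumerate(clusters) if i < j)
        clusters[i] |= clusters.pop(j)
        merges.append(d)
    return merges

def gap_cut(merges):
    # cut at the largest jump between consecutive sorted merge distances
    s = sorted(merges)
    k = max(range(len(s) - 1), key=lambda i: s[i + 1] - s[i])
    return (s[k] + s[k + 1]) / 2
```

Merges below the cut are kept, so well-separated groups of points end up in distinct clusters.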
Another issue is that if the filter function is real valued, our complexes are 1-dimensional, making them graphs.
We may want to look at a higher dimensional simplicial complex. This is achieved if we have Rm for m > 1 as our
parameter space. We would then need to have an open cover of our range in Rm . For example, we could cover R2
with overlapping rectangles. Then if clusters corresponding to three regions overlapped, we would add a 2-simplex
and if there were four regions, we would add a 3-simplex.

6.3.3 Height Function Examples


Singh et al. give some examples of filter functions, but any real valued function will work. (Recall that filter and
height functions are basically the same thing.) Here are two interesting examples.

Example 6.3.4. Density: Let ϵ > 0, and

fϵ (x) = Cϵ Σy∈X exp(−d(x, y)^2 /ϵ),

where x, y ∈ X, and Cϵ is a constant such that ∫ fϵ (x) dx = 1. The function fϵ estimates density and ϵ controls
the smoothness of the estimate. This gives important features of the point cloud. Several other methods of density
estimation are explained in [144].

Example 6.3.5. Eccentricity: This is another family of functions that explore the geometry of the set. Eccentricity
finds points that are far from the center point and assigns them higher values. On a Gaussian distribution, eccentricity
increases when density decreases. Given p with 1 ≤ p < +∞, we set
Ep (x) = ((Σy∈X d(x, y)^p )/N )^(1/p),

and

E∞ (x) = maxy∈X d(y, x),

where x ∈ X and N is the number of points in X.
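Both families of filters are easy to compute directly. The sketch below uses hypothetical helper names, and the density filter is left unnormalized (the constant Cϵ is dropped, since Mapper only needs relative filter values).

```python
import math

def density_filter(X, eps):
    """Gaussian-kernel density estimate, up to the normalizing constant
    C_eps; eps controls the smoothness of the estimate."""
    return [sum(math.exp(-math.dist(x, y) ** 2 / eps) for y in X) for x in X]

def eccentricity_filter(X, p):
    """E_p as defined above; p = math.inf gives the max-distance version."""
    N = len(X)
    if p == math.inf:
        return [max(math.dist(x, y) for y in X) for x in X]
    return [(sum(math.dist(x, y) ** p for y in X) / N) ** (1 / p) for x in X]
```

On a point cloud with one isolated outlier, the outlier gets the lowest density and the highest eccentricity, as expected.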



6.3.4 Application to Breast Cancer Research


In this section I will summarize the paper of Nicolau et al. [116] which uses Mapper to identify a previously
unknown type of breast cancer cell. This paper was a big historical breakthrough, demonstrating Mapper’s usefulness.
As the biology is a little outside of my own personal expertise, I will have to use some terms without a rigorous
definition and refer you to the paper and its references. What I want you to take from this is that the shape of the
graph produced by Mapper had a long tail that required an explanation. This tail would not have been visible using
standard clustering or common dimensionality reduction techniques such as principal component analysis (PCA) or
multidimensional scaling. The procedure outlined in [116] is called progression analysis of disease, or PAD, and is
available as a web tool either with or without Mapper.
The first step is application of a technique called disease-specific genomic analysis (DSGA) [117], which takes a
vector T representing genomic data and represents it as the sum

T = Nc .T + Dc .T,

where Nc .T is the normal component representing healthy tissue and Dc .T is the disease component. This is done
using a method of making high dimensional data less sparse followed by PCA. The reader is referred to [117] for
the details. The idea is to use the disease component, which represents the deviation from a healthy state, to perform
further analysis.
I will now show how to apply Mapper. Note that we could do the same steps on a wide variety of data matrices. It
will turn out, though, that the Mapper graph produced at the end will have biological significance. Start with a matrix
whose columns are patients and whose rows are any genomic variables from a breast cancer microarray gene
expression data set. The columns consist of tumor data vectors T1 , T2 , · · · , Tm and normal tissue vectors N1 , N2 , · · · , Nk .
We perform the following steps:
1. Using DSGA, transform all of the data and construct the following two matrices: (i) Dc.mat, the matrix whose
columns Dc.T1 , Dc.T2 , · · · , Dc.Tm are the disease components of the original tumor vectors T1 , T2 , · · · , Tm .
(ii) L1.mat, the matrix whose columns L1.N1 , L1.N2 , · · · , L1.Nk are an estimate of the disease component of
normal tissue. These matrices are concatenated to form the matrix L1Dc.mat.
2. Threshold data coordinates so that only the genes that show a significant deviation from the healthy state are
retained in the data matrix from Step 1. Any appropriate test for significance can be used.
3. We apply several filter functions to the data points represented by the columns of the matrix. Given a column
v = (v1 , · · · , vn ), the filter function fp,k (v) is defined as

fp,k (v) = (|v1 |^p + · · · + |vn |^p )^(k/p) .

Note that for k = 1, the filter function is just the Lp norm.


4. Apply Mapper to the columns of the matrix using the filter functions defined in step 3. For the clustering, use
the Pearson Correlation Distance

dcor (x, y) = 1 − (Σi (xi − x̄)(yi − ȳ)) / √(Σi (xi − x̄)^2 · Σi (yi − ȳ)^2 ),

where the sums run from i = 1 to n.
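Steps 3 and 4 use two small formulas that can be sketched directly (illustrative helper names, not from [116]):

```python
import math

def f_filter(v, p, k):
    # f_{p,k}(v) = (sum_i |v_i|^p)^(k/p); k = 1 recovers the L^p norm
    return sum(abs(x) ** p for x in v) ** (k / p)

def pearson_distance(x, y):
    # d_cor(x, y) = 1 - (Pearson correlation of x and y); ranges over [0, 2]
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    num = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    den = math.sqrt(sum((a - xb) ** 2 for a in x) *
                    sum((b - yb) ** 2 for b in y))
    return 1 - num / den
```

Perfectly correlated expression profiles get distance 0 and perfectly anti-correlated ones get distance 2, so the clustering groups patients with similar expression patterns regardless of overall magnitude.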

Nicolau et al. used filters with p ranging from 1 to 5 and k ranging from 1 to 10. Figure 6.3.4 shows the Mapper
graph for fp=2,k=4 .
In the figure, the main feature is the long tail at the bottom right. This corresponds to estrogen receptor positive
tumors (ER+) with high levels of the c-MYB gene. This group is denoted as c-MYB+ and represented 7.5% of the
patients. It was newly discovered through this analysis; regular clustering methods would not have discovered it.
This group had a 100% survival rate with no recurrence of disease. See [117] for more details.

Figure 6.3.4: Simplicial Complex for Breast Cancer data with filter p = 2, k = 4. [116].

6.4 Simplicial Sets


It will now be necessary to take a long digression in order to understand the UMAP algorithm in the next section.
It relies heavily on the idea of simplicial sets. Simplicial sets, a generalization of simplicial complexes, are a key part
of UMAP. In addition, they form the basis for combinatorial homotopy theory which we will need in order to apply
obstruction theory to data science.
The idea is as follows: Recall from Chapter 4 that if K and L are simplicial complexes, then a simplicial map
f : K → L takes vertices vi in K to vertices f (vi ) in L. Then for any point x ∈ K with barycentric coordinates
Σ ti vi , we have f (x) = Σ ti f (vi ). In other words, a simplicial map is completely determined by its action on the
vertices of K. Now for distinct vi we don't make a condition that f (vi ) be distinct. So two vertices in K can be taken
to the same vertex in L. As an easy example, let K consist of the 2-simplex [v0 , v1 , v2 ] and let L be the face [v0 , v1 ].
Suppose f (v0 ) = v0 and f (v1 ) = f (v2 ) = v1 . This is a map that collapses K onto one of its faces. We would like
to somehow distinguish the image [v0 , v1 ] as the image of a 2-simplex. We can do this by saying that this simplex is
degenerate and represent it as [v0 , v1 , v1 ].
The standard reference for simplicial sets is the comprehensive book by Peter May [97]. Another reference is [38].
These books can be quite difficult for someone new to the subject as they are entirely combinatorial and have little
in the way of pictures. To remedy this situation, Greg Friedman wrote a paper [51] with a lot of pictures that is a
much gentler introduction to the subject. My previous example came from there, and I will outline his approach in the
remainder of this section.

6.4.1 Ordered Simplicial Complexes


In what follows, it will be easier to restrict ourselves to ordered simplicial complexes in which the vertices all
follow some order. We write a simplex as [vi0 , · · · , vik ] if vim < vin whenever m < n. When the entire complex

consists of a single simplex, we will talk about the standard (ordered) n-simplex written as |∆n |. In the literature, it is
common to write |∆n | as [0, 1, · · · , n]. Then a face of [0, 1, · · · , n] is of the form [i0 , i1 , · · · , ik ] where 0 ≤ i0 < i1 <
· · · < ik ≤ n. For each simplex in an ordered simplicial complex, there is exactly one way to represent the vertices in
this form, and we can think of it as the image of a standard k-simplex |∆k | = [0, 1, · · · , k].
Now we would like a way to assign a map to an n-simplex that will give us an (n − 1)-dimensional face.
Definition 6.4.1. If [0, 1, · · · , n] is the standard n−simplex, then there are n + 1 face maps d0 , d1 , · · · , dn defined as
dj ([0, 1, · · · , n]) = [0, 1, · · · , j − 1, j + 1, · · · , n], i.e. we remove vertex j. (Here I am using the notation of [51]. In
[97] the more common notation ∂j is used.)

Figure 6.4.1: Face Maps of |∆2 | [51].

Figure 6.4.1 shows the face maps of |∆2 |. Note that these maps are assignments rather than continuous maps.
In general, we define d0 , d1 , · · · , dn on the n simplices of an ordered simplicial complex, where dj removes the j-th
vertex in the ordering. We can get more general by allowing composition of face maps to produce a face of dimension
less than n − 1. We just need to have j be less than or equal to k for a k-simplex so that the definition still makes
sense.
Example 6.4.1. Let n = 6 and compute d3 d1 d5 ([0, 1, 2, 3, 4, 5, 6]). We have

d3 d1 d5 ([0, 1, 2, 3, 4, 5, 6]) = d3 d1 ([0, 1, 2, 3, 4, 6]) = d3 ([0, 2, 3, 4, 6]) = [0, 2, 3, 6].

Note that in the last step we removed vertex 4 instead of 3 since by removing vertex 1, vertex 4 moved into slot 3
(counting from 0).
Example 6.4.2. Let n = 5. We have

d2 d4 ([0, 1, 2, 3, 4, 5]) = d2 ([0, 1, 2, 3, 5]) = [0, 1, 3, 5],

while
d3 d2 ([0, 1, 2, 3, 4, 5]) = d3 ([0, 1, 3, 4, 5]) = [0, 1, 3, 5].
We see that these are equal and in general for i < j, removing i first will move j to the j −1 spot so that di dj = dj−1 di .
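Since a face map just deletes a position, these computations are easy to check by machine. The following sketch represents an ordered simplex as a tuple and verifies Examples 6.4.1 and 6.4.2 along with the relation di dj = dj−1 di.

```python
def d(j, simplex):
    # face map d_j: delete the vertex in position j (counting from 0)
    return simplex[:j] + simplex[j + 1:]

s6 = tuple(range(7))                                  # the 6-simplex [0,...,6]
assert d(3, d(1, d(5, s6))) == (0, 2, 3, 6)           # Example 6.4.1
assert d(2, d(4, tuple(range(6)))) == (0, 1, 3, 5)    # Example 6.4.2

# d_i d_j = d_{j-1} d_i whenever i < j
for j in range(7):
    for i in range(j):
        assert d(i, d(j, s6)) == d(j - 1, d(i, s6))
```

The index shift in the identity is exactly the "slot" bookkeeping worked out by hand in Example 6.4.1.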

6.4.2 Delta Sets


Before defining simplicial sets, we will look at a concept that is one level of abstraction beyond a simplicial
complex.

Definition 6.4.2. A Delta set (also called a semisimplicial complex, semisimplicial set, or a ∆-set ) consists of a
sequence of sets X0 , X1 , · · · , and for each n ≥ 0, maps di : Xn+1 → Xn for each i with 0 ≤ i ≤ n + 1 such that
di dj = dj−1 di whenever i < j. (Note that we capitalize Delta to reflect the fact that the Greek letter ∆ is capitalized.)
The difference between Delta sets and ordered simplicial complexes is that the faces may not be unique. This
reflects the fact that we may want to glue faces together. Friedman gives two examples.

Figure 6.4.2: |∆2 | glued into a cone [51].

Example 6.4.3. Figure 6.4.2 shows a 2-simplex |∆2 | with the faces [0, 2] and [1, 2] glued together to form a cone.
The result is not actually a simplicial complex as the two faces of [0, 1] are not unique. We have X0 = {[0], [2]},
X1 = {[0, 1], [0, 2]} and X2 = {[0, 1, 2]}. Then the face maps are what we would expect but d0 ([0, 1]) = [1] = [0] =
d1 ([0, 1]). Check that the relations di dj = dj−1 di for i < j still hold in this case.

Figure 6.4.3: A Delta set with two vertices bounding two different edges (based on example from [51]).

Example 6.4.4. It is possible to have a set of vertices define two different simplices as Figure 6.4.3 shows. Let e0 and
e1 be the top and bottom edges respectively. Then X0 = {v0 , v1 } and X1 = {e0 , e1 }. So d0 (e0 ) = d0 (e1 ) = v1 and
d1 (e0 ) = d1 (e1 ) = v0 . Note that it is the face maps which determine what is glued together.
Now we will give an alternate definition of Delta sets involving category theory. You will need to understand this
approach in order to understand the UMAP paper. First we define a category ∆̂.

Definition 6.4.3. The category ∆̂ has as objects the sets [n] = [0, 1, 2, · · · , n], and morphisms the strictly order
preserving functions f : [m] → [n] for m ≤ n. (A function f is strictly order preserving if i < j implies f (i) < f (j).)
Now we look at the dual or opposite category ∆̂op . This category contains the objects [n] but the morphisms go in
the opposite direction. Rather than include a face of dimension [m] in a simplex of dimension [n] like the morphisms
in ∆̂, we go from a higher dimensional simplex and map it to a particular face. But this is exactly what the face maps
in a Delta set do. If n − m > 1, we just have a composition of face maps. So here is the alternate definition:
Definition 6.4.4. A Delta set is a covariant functor X : ∆̂op → Set, where Set is the category of sets and functions.
Equivalently, a Delta set is a contravariant functor ∆̂ → Set.
In this formulation, we think of the functor X as taking the standard n-simplex [n] to the collection of all
n-dimensional simplices of our Delta set. The formula di dj = dj−1 di for i < j comes from the fact that it holds on
[n].

6.4.3 Definition of Simplicial Sets


Recall our cone in Example 6.4.3. Under the gluing map, the image of |∆2 | should still be 2-dimensional (the
surface of the cone) but it only has 2 distinct vertices, [0] = [1] and [2]. This is an example of a degenerate simplex.
You can think of a degenerate simplex as one without the right number of vertices. But how do we tell it from
a 1-dimensional simplex? The answer is that we allow vertices to repeat. So a degenerate simplex is of the form
[vi0 , · · · , vin ] where the vertices are not all distinct but we still have ik ≤ im if k < m.
Example 6.4.5. How many 1-simplices including degenerate ones are in the 2-simplex |∆2 | = [0, 1, 2]? There are 6
of them: [0, 0], [0, 1], [0, 2], [1, 1], [1, 2], and [2, 2]. In Figure 6.4.4, Friedman illustrates the 1-simplices in |∆1 |, the
1-simplices in |∆2 |, and the 2-simplices in |∆2 |.

There is one more catch. |∆n | can have degenerate simplices of dimension greater than n. For example, |∆0 | has
the degenerate simplex [0, 0, 0, 0, 0, 0, 0, 0]. To keep track we need degeneracy maps.
Definition 6.4.5. If [0, 1, · · · , n] is the standard n−simplex, then there are n + 1 degeneracy maps s0 , s1 , · · · , sn
defined as sj ([0, 1, · · · , n]) = [0, 1, · · · , j, j, · · · , n], i.e. we repeat vertex j.

This idea can easily be extended to simplicial complexes and Delta sets. Any degenerate simplex can be obtained
from an ordinary simplex by compositions of degeneracy maps.
Definition 6.4.6. A simplicial set consists of a sequence of sets X0 , X1 , · · · , and for each n ≥ 0, functions di : Xn →
Xn−1 and si : Xn → Xn+1 for each i with 0 ≤ i ≤ n such that:

di dj = dj−1 di if i < j,
di sj = sj−1 di if i < j,
dj sj = dj+1 sj = id,
di sj = sj di−1 if i > j + 1,
si sj = sj+1 si if i ≤ j.

Try some examples to convince yourself that these formulas are correct.
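In that spirit, the following short script represents a simplex as a tuple (with repeats allowed) and checks all five identities on the standard 3-simplex; d deletes a position and s repeats one.

```python
def d(i, x):
    # face map: delete the entry in position i
    return x[:i] + x[i + 1:]

def s(i, x):
    # degeneracy map: repeat the entry in position i
    return x[:i + 1] + x[i:]

x = (0, 1, 2, 3)       # the standard 3-simplex [0, 1, 2, 3]
n = len(x) - 1
for j in range(n + 1):
    assert d(j, s(j, x)) == x == d(j + 1, s(j, x))    # d_j s_j = d_{j+1} s_j = id
    assert s(j, s(j, x)) == s(j + 1, s(j, x))         # s_i s_j = s_{j+1} s_i, i = j
    for i in range(j):
        assert d(i, d(j, x)) == d(j - 1, d(i, x))     # d_i d_j = d_{j-1} d_i
        assert d(i, s(j, x)) == s(j - 1, d(i, x))     # d_i s_j = s_{j-1} d_i
        assert s(i, s(j, x)) == s(j + 1, s(i, x))     # s_i s_j = s_{j+1} s_i, i < j
    for i in range(j + 2, n + 2):
        assert d(i, s(j, x)) == s(j, d(i - 1, x))     # d_i s_j = s_j d_{i-1}
```

The same loop works on any nondecreasing tuple, degenerate or not.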
Any ordered simplicial complex can be thought of as a simplicial set if we add the degenerate simplices. These
are formed by taking any simplex in the complex and repeating any subset of the vertices as many times as we want.
Also, a simplicial set is a Delta set if we throw away the degeneracy maps.
Example 6.4.6. The standard 0-simplex [0] can be thought of as a simplicial set with an element in each Xn for n ≥ 0.
The element in Xn is [0, · · · , 0] of length n + 1.
Recall that we wrote |∆n | for the standard n-simplex. We will write ∆n with the bars removed to stand for the
standard n-simplex thought of as a simplicial set.

Figure 6.4.4: The 1-simplices in |∆1 |, the 1-simplices in |∆2 |, and the 2-simplices in |∆2 | [51].

Example 6.4.7. The prototype of simplicial sets comes from singular homology. Recall from Section 4.3.1 that if
X is a topological space, then a singular p-simplex is a continuous map T : ∆p → X. We defined the group of
singular p-chains as the free abelian group generated by these simplices. We got a face map by composing T with a
restriction of ∆p to one of its faces. (We did this by mapping ∆p linearly into R∞ . Also we had followed Munkres’
notation and wrote p as a subscript rather than a superscript as we are doing here. The idea is the same though.) For
degeneracy operators, include ∆p into ∆p+1 and then include in R∞ and compose T with the result. A degenerate
singular simplex is a collapsed version of a higher dimensional simplex. In Figures 6.4.5 and 6.4.6, Friedman [51]
illustrates a face of a singular simplex and a degenerate singular simplex respectively.
Definition 6.4.7. A simplex x ∈ Xn is called nondegenerate if x can not be written as si y for some i and some
y ∈ Xn−1 .
140 CHAPTER 6. RELATED TECHNIQUES

Figure 6.4.5: Face of a singular simplex [51].

Figure 6.4.6: Degenerate singular simplex [51].

Every simplex of a simplicial complex or a Delta set is nondegenerate. A degenerate simplex can have a nondegenerate face. For example, any simplex x, whether it is degenerate or not, can be written in the form x = dj sj x, so x is a face of the degenerate simplex sj x. It is more surprising that a nondegenerate simplex can have a degenerate face. See the next subsection for an example.
As is the case with Delta sets, simplicial sets also have a categorical definition. We start by defining a category ∆.

Definition 6.4.8. The category ∆ has objects the finite ordered sets [n] = {0, 1, 2, · · · , n}. The morphisms are finite
order preserving functions f : [m] → [n].

The main difference between the category ∆ and the category ∆̂ is that the morphisms of ∆ are only order preserving rather than strictly order preserving. In other words, we can repeat elements. For example, we can have f : [3] → [5] with f ([0, 1, 2, 3]) = [0, 2, 2, 4].
The morphisms in ∆ are generated by Di : [n] → [n + 1] and Si : [n + 1] → [n] for 0 ≤ i ≤ n. These maps are
defined as Di [0, 1, · · · , n] = [0, 1, · · · , î, · · · , n + 1] and Si [0, 1, · · · , n + 1] = [0, 1, · · · , i, i, · · · , n].
Now we want our face and degeneracy maps to go in the opposite direction. Di includes the ith face of [n] in
[n + 1] so the face map di should map [n] to this face. The opposite of Si which collapses [n + 1] to [n] by identifying
the ith and (i + 1)st vertex is the map si which assigns to an n-simplex the degenerate (n + 1)-simplex which repeats the ith vertex. This leads to the following definition.

Definition 6.4.9. A simplicial set is a contravariant functor X : ∆ → Set or equivalently, a covariant functor
X : ∆op → Set.

We can also think of simplicial sets as objects of a category whose morphisms consist of functions fn : Xn → Yn
that commute with face maps and degeneracy maps.
Maps of simplicial sets, unlike the simplicial maps of Chapter 4, are not uniquely defined by what they do to
vertices. They are uniquely defined, though, by what they do to nondegenerate simplices.

6.4.4 Geometric Realization


Given a simplicial set, we would like to find an actual geometric object which corresponds to it. This can be done,
and the result is called the geometric realization by May [97] or just the realization by Friedman [51].

Definition 6.4.10. Let X be a simplicial set. Give each set Xn the discrete topology. (Recall that this means that every subset is open.) Let |∆n | be the standard n-simplex with the usual topology. The realization |X| is given by

|X| = ( ∐_{n≥0} Xn × |∆n | ) / ∼,

where ∼ is the equivalence relation generated by the relations (x, Di (p)) ∼ (di (x), p) for x ∈ Xn+1 , p ∈ |∆n |, and the relations (x, Si (p)) ∼ (si (x), p) for x ∈ Xn−1 , p ∈ |∆n |, and the symbol ∐ means disjoint union. Here Di and Si are the face inclusions and collapses from our discussion of the category ∆.

This is how the definition works. First of all, we would like an n-simplex for every element of Xn , and that is what Xn × |∆n | provides. The disjoint union means we do this for every n. Now look at the relation (x, Di (p)) ∼ (di (x), p). Then x is an (n + 1)-simplex of X and Di (p) is a point on the ith face of a geometric (n + 1)-simplex. (di (x), p) is the ith face of that simplex together with the same point, now in an n-simplex. So the relation takes the n-simplex corresponding to di (x) in Xn × |∆n | and glues it to the ith face of the (n + 1)-simplex assigned to x in Xn+1 × |∆n+1 |. The last step is to get rid of degenerate simplices. The relation (x, Si (p)) ∼ (si (x), p) takes a degenerate n-simplex and a point p in the pre-collapse n-simplex |∆n | and glues p to the (n − 1)-simplex represented by x at the point Si (p) which is the image of the collapse map. If x happens to be degenerate it is also collapsed.
It turns out that the realization of a simplicial set obtained from an ordered simplicial complex is the original simplicial complex. Let ∂∆n denote the simplicial set obtained from the boundary ∂|∆n | of the ordered simplicial complex |∆n | by adjoining all of the degeneracies. We would like to describe ∂∆n as a simplicial set. One way to do this is to include any m-simplex [i0 , · · · , im ] where 0 ≤ i0 ≤ · · · ≤ im ≤ n. The one condition is that it is not allowed to include all of the numbers from 0 to n, as it would then contain [0, 1, · · · , n], which would not be included in the boundary as it is all of |∆n |. Since the set of simplices of the form [i0 , · · · , im ] whose vertices are nondecreasing and don't include all the numbers from 0 to n forms the simplicial set representing the ordered simplicial complex ∂|∆n |, we have |∂∆n | ≅ S n−1 .
A much more efficient way to represent S n−1 is to only use the nondegenerate simplices [0, 1, · · · , n − 1] and [0]. The only simplex in Xm for 1 ≤ m < n − 1 is the degenerate simplex [0, · · · , 0] (with m + 1 entries). So all of the faces of [0, 1, · · · , n − 1] are [0, · · · , 0] and the resulting simplicial set is the equivalent of collapsing the boundary of an (n − 1)-cell to a point. So this is also a representation of S n−1 .
Note that the second construction relied on degenerate simplices and could not have been done with Delta sets. It is also an example of a nondegenerate simplex with degenerate faces.
We conclude with some useful facts. The proofs are not hard and can be found in [51].

Theorem 6.4.1. A degenerate simplex is a degeneracy of a unique nondegenerate simplex. If z is a degenerate simplex, there is a unique nondegenerate simplex x such that z = si1 · · · sik x for some collection of degeneracy maps si1 , · · · , sik .

Theorem 6.4.2. If X is a simplicial set then its realization |X| is a CW complex with one n-cell for each nondegenerate
n-simplex of X.

Our last result will be needed to understand UMAP.

Definition 6.4.11. Let C and D be categories. An adjunction consists of a pair of functors F : C → D and G : D → C together with an isomorphism D(F c, d) ≅ C(c, Gd) for each c ∈ C and d ∈ D that is natural in both variables. (I will be more explicit below.) F and G are called adjoint functors. By isomorphism, we mean a bijection. We represent the bijection with the pair consisting of the morphism f ♯ : F c → d in D and the morphism f ♭ : c → Gd in C. The pair f ♯ and f ♭ are adjunct or transposes of each other. Riehl [135], the source of this form of the definition, shows that the naturality condition can be expressed as follows: In the diagram below, the left hand square commutes in category D if and only if the right hand square commutes in category C.

           f♯                         f♭
   F c ---------> d           c ---------> Gd
    |             |           |             |
   Fh             k           h            Gk
    v             v           v             v
   F c′ --------> d′          c′ --------> Gd′
           g♯                         g♭

Theorem 6.4.3. In the above notation, let C be the category of simplicial sets and D be the category of topological spaces. Let F : C → D defined by F (X) = |X| be the realization functor and G : D → C be the singular set functor which takes a space Y to the simplicial set S(Y ) where Sn (Y ) consists of the singular n-simplices, i.e. the continuous maps from |∆n | to Y . Then F and G are adjoint functors.

Given f ♯ : |X| → Y , we need to produce f ♭ : X → S(Y ). But restricting f ♯ to a nondegenerate simplex (x, |∆n |) gives a continuous function |∆n | → Y which is in S(Y )n . This gives a map from X to S(Y ) which we can define to be f ♭ . If (x, |∆n |) represents a degenerate simplex, then we precompose with the appropriate collapse map of ∆n into |X| before applying f ♯ .
If we start with f ♭ : X → S(Y ), this assigns to each n-simplex x ∈ X a continuous function σx : |∆n | → Y . Let f ♯ then be the continuous function that acts on the simplex (x, |∆n |) by applying σx to |∆n |.

I will leave it to the reader to check the other parts of the definition of adjoint functors. This result will be crucial
to understanding UMAP.

6.5 UMAP
The Uniform Manifold Approximation and Projection for Dimension Reduction or UMAP algorithm of McInnes,
Healy, and Melville [100] is an algorithm for dimensionality reduction. The description is very long and difficult with
a theory that combines topics from algebraic topology, differential geometry, category theory, and fuzzy set theory. In
this section, I will outline the algorithm while filling in the mathematical details needed to understand the theory. I
will also summarize their discussion on the comparisons to other algorithms and when UMAP is particularly useful.
In machine learning applications, we tend to deal with high dimensional data. This makes the data difficult to
visualize and a large amount of training data is needed to fully train a classifier to deal with new examples. For
this reason, dimensionality reduction is needed both to aid visualization and as a preprocessing step for clustering
and classification algorithms. Dimension reduction algorithms such as principal component analysis (PCA) [73] and
multidimensional scaling (MDS) [86] preserve pairwise distances among the data points, while algorithms such as
t-SNE [165, 164] favor preservation of local distances over global distance. UMAP falls into the latter category. In
this way, t-SNE and UMAP preserve the manifold structure of the data (for example if your data was on the surface of
a torus) as opposed to PCA and MDS which squash everything into flat space.

Following [100], in Section 6.5.1, I will outline the theoretical basis of UMAP. Then 6.5.2 will outline the com-
putational approach. Section 6.5.3 will discuss weaknesses of UMAP and situations in which UMAP should not be
used. Finally, Section 6.5.4 will compare UMAP to t-SNE.
Note that this is probably the hardest section in the entire book, so don’t be too discouraged if you have trouble on
a first reading. The material in later chapters will not be dependent on it.

6.5.1 Theoretical Basis


This section of [100] requires an extensive background. I will try to fill in as many gaps as I can, relying on the
reader having read the material in this book up to this point. For additional background on the specific constructions
used, see [11] and [150]. For category theory, [135] gives you all the information you need. I will try to give you a
feel for the algebraic geometry and differential geometry used but for a full background, you can see [62] for algebraic
geometry and [151] for differential geometry.
Recall that an n-dimensional manifold is a topological space M such that each point x ∈ M has an open neigh-
borhood homeomorphic to Rn . So a curve is a 1-dimensional manifold and a surface is a 2-dimensional manifold.
UMAP assumes that the data is uniformly distributed on some manifold. The algorithm first approximates a manifold
on which the data lies and then constructs a fuzzy simplicial set (to be defined below) representation of the approxi-
mated manifold.
If the data was uniformly distributed on M , then away from boundaries, a ball of fixed volume centered at any
point of M should contain the same number of data points. As this is not realistic for finite data, we will need some
differential geometry.

Definition 6.5.1. Let V be a vector space over the field F , where F is either the real numbers R or the complex
numbers C. Then an inner product on V is a function ⟨·, ·⟩ : V × V → F such that for a, b ∈ F and x, y, z ∈ V :

1. ⟨x, y⟩ is the complex conjugate of ⟨y, x⟩ (so if F = R, then ⟨x, y⟩ = ⟨y, x⟩),

2. ⟨ax + by, z⟩ = a⟨x, z⟩ + b⟨y, z⟩

3. ⟨x, x⟩ ≥ 0 and ⟨x, x⟩ = 0 if and only if x = 0.

The vector space V is called an inner product space.


If a vector space V has an inner product, it automatically has a norm ||x|| = √⟨x, x⟩. Then V is also a metric space with d(x, y) = ||x − y|| = √⟨x − y, x − y⟩. If this metric is complete, i.e. every Cauchy sequence in V converges to a point in V , then V is called a Hilbert space.
For an n-manifold M , let p ∈ M . The homeomorphism ϕ : U → Rn where U is an open subset of M containing
p is called a coordinate chart.

Definition 6.5.2. The tangent space Tp M of an n-dimensional manifold M at the point p is defined as follows: Let
γ1 , γ2 : [−1, 1] → M be two curves on M and suppose a coordinate chart ϕ has been chosen. Let γ1 (0) = γ2 (0) = p.
Then γ1 and γ2 are equivalent if the derivatives of ϕ ◦ γ1 and ϕ ◦ γ2 coincide at 0. (These are functions of one real
variable so we mean derivative in the ordinary sense.) An equivalence class is called a tangent vector and the set of
equivalence classes is called the tangent space. It turns out that the tangent space is independent of the choice of
coordinate chart ϕ.

At each point p of an n-dimensional manifold M , the tangent space Tp M is an n-dimensional vector space. We write ϕ = (x1 , · · · , xn ) : U → Rn . Then we write the basis of Tp M as

{ (∂/∂x1 )p , · · · , (∂/∂xn )p }.

Here (∂/∂xi )p is the equivalence class of a curve γ such that the tangent to ϕ ◦ γ is in the xi direction.

Now we let g be a matrix such that

gij = ⟨ (∂/∂xi )p , (∂/∂xj )p ⟩.

In other words, g determines inner products on Tp M . Then g is called a Riemannian metric and M is called a Riemannian manifold.
Here is the last fact we need before resuming with UMAP.
Theorem 6.5.1. Whitney Embedding Theorem: Let M be an n-dimensional manifold that is smooth (i.e. if U and V are open subsets of M with corresponding coordinate charts ϕU and ϕV , then ϕU ◦ ϕV^{−1} : Rn → Rn is differentiable). Then M can be embedded in the space R2n . (I.e. there is a one-to-one map taking M into R2n .)
If we embed M in Rm then we call Rm the ambient space. Now I can state the next result from [100].
Theorem 6.5.2. Let (M, g) be a Riemannian manifold in an ambient Rn , and let p ∈ M be a point. Let g be a constant diagonal matrix in a ball B ⊂ U centered at p of volume Vn with respect to g, where Vn is the volume of a ball in Rn of radius 1. Explicitly,

Vn = π^{n/2} / (n/2)!  for n even,  and  Vn = 2^n π^{(n−1)/2} ((n − 1)/2)! / n!  for n odd.

Then the geodesic distance (i.e. the shortest distance along a path in the manifold) between p and another point q ∈ B is (1/r) d(p, q), where r is the radius of the ball in the ambient space Rn and d(p, q) is the distance in the ambient space.
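As a sanity check, both parity cases of the volume formula agree with the single closed form Vn = π^{n/2} / Γ(n/2 + 1). A minimal sketch (the helper name ball_volume is mine):

```python
import math

def ball_volume(n):
    """Volume of the unit ball in R^n using the two parity formulas."""
    if n % 2 == 0:
        return math.pi ** (n // 2) / math.factorial(n // 2)
    k = (n - 1) // 2
    return 2 ** n * math.pi ** k * math.factorial(k) / math.factorial(n)

# Both parity cases match pi^(n/2) / Gamma(n/2 + 1)
for n in range(1, 10):
    assert abs(ball_volume(n) - math.pi ** (n / 2) / math.gamma(n / 2 + 1)) < 1e-12
```

For example, ball_volume(2) is π and ball_volume(3) is 4π/3, the familiar disk and sphere volumes.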
The point of all of this is that for each point Xi in our data set, we can normalize distances so that a point and
its k nearest neighbors are contained in a ball of uniform volume no matter what point we take as our center. So
our assumption of the points being uniformly distributed on the manifold will hold locally if we have a good way of
pasting each of these local metric spaces together. That is what we will do next.
The next step involves simplicial sets and your author has prepared you for this in the previous section. There
are some notational differences I need to point out. The first is the idea of a colimit. In [100], it is stated that for a
simplicial set X, we have that
colimx∈Xn ∆n ≅ X.
This turns out to be equivalent to definition 6.4.10. For a sequence of sets and inclusions the colimit is the disjoint
union. (See [135] Chapter 3 and especially Example 3.1.26.) We will also use the fact that the singular set functor and
realization functor are adjoints.
Next, we need to talk about fuzzy set theory [180] or fuzzy logic as opposed to faulty logic. Normally, given a set S and an element x, either x ∈ S or x ∉ S. If S is a fuzzy set, there is a membership function µ : S → [0, 1]. So an element x is in S to a varying degree. µ(x) then is the membership strength of x.
Example 6.5.1. Consider this scenario. You are getting pretty tired of TDA by now and need to take a break. So you
decide to go hiking in the Arizona desert, but unfortunately, you forgot to bring any water with you and you are starting
to get pretty thirsty. Then you see two bottles standing up against a cactus. The bottles have unusual labels. The first
one says that the probability of the contents being a potable liquid is .9. The second one says that the membership of
the contents in the set of potable liquids is .9. What should you do?
Later, someone shows up, apologizes for the confusing messages on the bottles, and tears off the labels. Underneath
are a different set of labels. The first bottle turns out to be hydrochloric acid. Somehow, it got mixed up with 9 identical
looking water bottles, so the probability of it being water was .9. The second bottle turns out to be a bottle of beer.
Assuming the most potable liquid is water, it is similar enough to earn a membership function of .9. There is one other
big difference. Now that we have more information, the probability of bottle 1 being potable is now 0, but the fuzzy
membership function of bottle number 2 is unchanged despite the additional information.
We want to represent fuzzy sets in terms of category theory. To do this we need some more category theory. To
understand the motivation, I will introduce the concept of a sheaf from algebraic geometry.

Definition 6.5.3. Let X be a topological space. A presheaf P on X consists of the following:

1. For each open set U ⊆ X, a set P(U ). The elements in this set are called the sections of P over U .

2. If V ⊆ U is an inclusion of open sets, we have a corresponding function resV,U : P(U ) → P(V ) called a
restriction function or restriction morphism.

3. For every open set U ⊆ X, the restriction resU,U : P(U ) → P(U ) is the identity.

4. If there are three open sets W ⊆ V ⊆ U , then resW,V ◦ resV,U = resW,U .

For the next definition, if s ∈ P(U ) is a section for the presheaf P over the open set U and V ⊆ U is open, then
the notation s|V denotes the element of P(V ) which is equal to resV,U (s).

Definition 6.5.4. A presheaf P is a sheaf if it has two additional properties:

1. Locality: If {Ui } is an open cover of an open set U , and if s, t ∈ P(U ) are such that s|Ui = t|Ui for all i, then
s = t.

2. Gluing: If {Ui } is an open cover of an open set U , and if for each i, there is a section si ∈ P(Ui ) such that for
any pair of open sets Ui and Uj in the cover, si |Ui ∩Uj = sj |Ui ∩Uj , then there is a section s ∈ P(U ) such that
s|Ui = si for all i.

Example 6.5.2. Let X = R and for U ⊂ R, P(U ) is the set of continuous real valued functions on U . For V ⊆ U and f ∈ P(U ), resV,U (f ) is just the usual restriction of f to V .

Example 6.5.3. Let X = R and for U ⊂ R, P(U ) is the set of bounded continuous real valued functions on U . Then P is a presheaf which is not a sheaf. Let {Ui } be an open cover of R consisting of the sets (−i, i). Then gluing fails. Letting f (x) = x, f is bounded when restricted to each of the Ui , but f is unbounded on the real line.

Now notice that given the category C whose objects are open subsets of X and whose morphisms are inclusions, a presheaf is a functor from C op to Set. Now let I = [0, 1] and define a topology whose open sets are intervals of the form [0, a) for a ∈ (0, 1]. Identifying I with the category of subsets of this form and inclusions, we have a presheaf P which is a functor from I op to Set. Define a fuzzy set to be such a presheaf whose restrictions from [0, a) to [0, b) for b ≤ a are one to one. We can think of the section P([0, a)) as the set of all elements whose membership strength is at least a.
It turns out that these presheaves are actually sheaves. They form a category whose objects are sheaves and whose
morphisms are natural transformations. We are almost ready to define the category of fuzzy sets, but we need a couple
more definitions.

Definition 6.5.5. Suppose C and D are categories. A functor F : C → D is full if for each x, y ∈ C, the map
homC (x, y) → homD (F x, F y) is surjective.

Definition 6.5.6. Suppose C and D are categories. A functor F : C → D is faithful if for each x, y ∈ C, the map
homC (x, y) → homD (F x, F y) is injective.

Definition 6.5.7. A functor F : C → D that is both full and faithful is called fully faithful. If C is a subcategory of D and there is a fully faithful functor F : C → D that is injective on the objects of C, then C is a full subcategory of D.

Definition 6.5.8. The category Fuzz of fuzzy sets is the full subcategory of sheaves on I spanned by fuzzy sets.

Now we can define fuzzy simplicial sets. Recall that a simplicial set was a functor X : ∆op → Set. So this is
actually a presheaf on ∆ with values in Set. So a fuzzy simplicial set will be a presheaf on ∆ with values in Fuzz.

Definition 6.5.9. The category of fuzzy simplicial sets sFuzz is the category with objects given by functors from ∆op
to Fuzz and morphisms given by natural transformations.

A fuzzy simplicial set can also be viewed as a sheaf over ∆ × I, where ∆ has the trivial topology and ∆ × I has the product topology. We use ∆n<a to denote the sheaf given by the representable functor of the object ([n], [0, a)), i.e. the functor isomorphic to Hom(([n], [0, a)), −). The importance of the fuzzy version of simplicial sets is their relationship to metric spaces.

Definition 6.5.10. An extended-pseudo-metric space (X, d) is a set X and a map d : X × X → [0, ∞] such that for
x, y, z ∈ X,

1. d(x, y) ≥ 0 and d(x, x) = 0,

2. d(x, y) = d(y, x), and

3. d(x, z) ≤ d(x, y) + d(y, z) or d(x, z) = ∞.

The difference between an extended-pseudo-metric space and a metric space is that an extended-pseudo-metric space can have infinite distances and it is possible for d(x, y) to be 0 even if x ≠ y.

The category of extended-pseudo-metric spaces EPMet has as objects extended-pseudo-metric spaces. The morphisms are non-expansive maps, which are maps f : X → Y with the property that for x1 , x2 ∈ X, dY (f (x1 ), f (x2 )) ≤ dX (x1 , x2 ). (Remember that the map is an isometry if ≤ is replaced by equality.) The subcategory of finite extended-pseudo-metric spaces is denoted FinEPMet.
Now construct adjoint functors Real and Sing between the categories sFuzz and EPMet. These correspond to the realization and singular functors from classical simplicial set theory. The functor Real is defined on the standard fuzzy simplices ∆n<a with
Real(∆n<a ) = { (t0 , · · · , tn ) ∈ Rn+1 | t0 + t1 + · · · + tn = − log(a), ti ≥ 0 }.

The metric on Real(∆n<a ) is inherited from Rn+1 . A morphism ∆n<a → ∆m<b exists only if a ≤ b and is determined by a ∆ morphism σ : [n] → [m]. The action of Real on such a morphism is given by the map

(x0 , x1 , · · · , xn ) → (log(b)/ log(a)) ( ∑_{i0 ∈σ−1 (0)} xi0 , ∑_{i0 ∈σ−1 (1)} xi0 , · · · , ∑_{i0 ∈σ−1 (m)} xi0 ).

This map is non-expansive since 0 ≤ a ≤ b ≤ 1 implies that log(b)/log(a) ≤ 1.


Extend this to a general simplicial set via colimits (the disjoint union operation from Definition 6.4.10). So Real(X) = colim∆n<a →X Real(∆n<a ).
The analog of the adjoint functor Sing is defined for an extended-pseudo-metric space Y by its action on ∆ × I with

Sing(Y ) : ([n], [0, a)) → homEPMet (Real(∆n<a ), Y ).

Since McInnes et al. are only interested in finite metric spaces, they consider the subcategory of bounded fuzzy simplicial sets, Fin-sFuzz. Define the finite fuzzy realization functor as follows:

Definition 6.5.11. The functor FinReal : Fin-sFuzz → FinEPMet is defined by setting

FinReal(∆n<a ) = ({x1 , x2 , · · · , xn }, da ),

where da (xi , xj ) = − log(a) if i ≠ j and da (xi , xi ) = 0. Then define

FinReal(X) = colim∆n<a →X FinReal(∆n<a ).



The action of FinReal on a map ∆n<a → ∆m<b , where a ≤ b, defined by σ : ∆n → ∆m is given by

({x1 , x2 , · · · , xn }, da ) → ({xσ(1) , xσ(2) , · · · , xσ(n) }, db ),

which is a non-expansive map, since if a ≤ b, then − log(a) ≥ − log(b) so that da ≥ db .
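Concretely, FinReal turns a standard fuzzy simplex into a finite metric space in which stronger membership (larger a) means smaller pairwise distances. A small sketch (the helper name fin_real is mine):

```python
import math

def fin_real(n, a):
    """Distance matrix of FinReal of the standard fuzzy n-simplex:
    n points with every off-diagonal distance equal to -log(a)."""
    d = -math.log(a)
    return [[0.0 if i == j else d for j in range(n)] for i in range(n)]

M = fin_real(3, 0.5)  # membership strength below 0.5: distances of log 2
```

Membership strength 1 gives distance 0, while strengths near 0 push the points infinitely far apart, matching the intuition that weak simplices barely glue their vertices together.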


We have a functor FinSing : FinEPMet → Fin-sFuzz defined by

FinSing(Y ) : ([n], [0, a)) → homFinEPMet (FinReal(∆n<a ), Y ).

It is then shown that FinReal and FinSing are adjoint functors.


The point of all of this work is to piece together the metric spaces defined by each Xi in the manifold and its k
nearest neighbors. Each of these metric spaces can be transformed into their corresponding fuzzy simplicial set. We
then take a fuzzy union of all of our simplicial sets. This means that the membership function of any element is the
maximum of its membership function taken over every set involved in the union.
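In code, this membership-max union of the local fuzzy sets might look like the following sketch (the fuzzy_union helper and the edge keys are hypothetical):

```python
def fuzzy_union(*fuzzy_sets):
    """Union of fuzzy sets given as dicts element -> membership strength:
    an element's strength is the maximum over all sets containing it."""
    out = {}
    for fs in fuzzy_sets:
        for elem, strength in fs.items():
            out[elem] = max(out.get(elem, 0.0), strength)
    return out

# Edges (1-simplices) of two local fuzzy 1-skeletons, keyed by vertex pair
local_a = {("x1", "x2"): 1.0, ("x1", "x3"): 0.4}
local_b = {("x1", "x2"): 0.7, ("x2", "x3"): 0.5}
merged = fuzzy_union(local_a, local_b)
```

Here the edge ("x1", "x2") appears in both local sets and keeps the larger strength 1.0, while edges seen only once carry their original strength into the union.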

Definition 6.5.12. Let X = {X1 , · · · , XN } be a dataset in Rn . Let {(X, di )} for 1 ≤ i ≤ N be a family of extended-pseudo-metric spaces with a common underlying set X such that di (Xj , Xk ) = dM (Xj , Xk ) − ρ if i = j or i = k, and di (Xj , Xk ) = ∞ otherwise, where ρ is the distance to the nearest neighbor of Xi in Rn and dM is the geodesic distance on the manifold, either known beforehand or approximated by Theorem 6.5.2. Then the fuzzy topological representation of X is

⋃_{i=1}^{N} FinSing((X, di )).

The fuzzy simplicial set union has now merged the different metric spaces and forms a global representation of
the manifold. We can now perform dimension reduction by finding low dimensional representations that match the
topological structure of the source data.
Let Y = {Y1 , · · · , YN } be a set of points corresponding to the data set X but being points in Rd with d ≪ n. We then compute a fuzzy set representation of Y and compare it to that of X. In this case, we usually consider Y as a subset of the manifold Rd .
To compare two fuzzy sets, we need the same reference set. Given a sheaf representation P, we translate to classical fuzzy sets by setting A = ∪a∈(0,1] P([0, a)) and membership function µ(x) = sup{a ∈ (0, 1] | x ∈ P([0, a))}.

Definition 6.5.13. Consider two fuzzy sets with underlying set A and membership functions µ and ν respectively. The cross entropy C is defined as

C((A, µ), (A, ν)) = ∑_{a∈A} [ µ(a) log( µ(a)/ν(a) ) + (1 − µ(a)) log( (1 − µ(a))/(1 − ν(a)) ) ].
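This definition translates directly into code if we represent each fuzzy set as a dict from elements of A to membership strengths. Note the formula is a sum of Bernoulli Kullback-Leibler divergences, so it is zero exactly when µ = ν and positive otherwise (the helper name is mine):

```python
import math

def fuzzy_cross_entropy(mu, nu):
    """Cross entropy of two fuzzy sets on the same reference set A
    (Definition 6.5.13); mu and nu map each a in A to a strength in (0, 1)."""
    total = 0.0
    for a in mu:
        m, n = mu[a], nu[a]
        if m > 0:
            total += m * math.log(m / n)
        if m < 1:
            total += (1 - m) * math.log((1 - m) / (1 - n))
    return total

mu = {"e1": 0.9, "e2": 0.2}
```

For example, fuzzy_cross_entropy(mu, mu) is 0, while comparing mu against the uniform strengths {"e1": 0.5, "e2": 0.5} gives a positive penalty.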

The last step is to optimize the embedding Y in Rd with respect to fuzzy cross-entropy using stochastic gradient
descent.
McInnes et al. restricted their attention to the 1-skeleton of the fuzzy simplicial sets as this significantly reduced computational costs.

6.5.2 Computational View


McInnes et al. [100] then provide a computational description of UMAP which I will now summarize. The fuzzy simplicial sets described above are only computationally tractable for the 1-skeleton, which can be described as a weighted graph. This makes UMAP a k-neighbor based graph learning algorithm. (Recall that we looked at the k nearest neighbors for each data point.)
The following axioms are assumed to be true:

1. There exists a manifold on which the data is uniformly distributed.



2. The underlying manifold is locally connected, i.e. every point in the manifold has a neighborhood base (see
Section 2.2) consisting of open connected sets.

3. The main goal is to preserve the topological structure of this manifold.

Here are the main steps:

1. Graph Construction

(a) Construct a weighted k-neighbor graph.


(b) Apply a transform on the edges to use the local distance from the ambient space.
(c) Deal with the inherent asymmetry of the k-neighbor graph.

2. Graph Layout

(a) Define an objective function that preserves the desired characteristics of this k-neighbor graph.
(b) Find a low dimensional representation which optimizes this objective function.

To perform the graph construction, start with a collection of data points: let X = {x1 , · · · , xN } be the input dataset with pairwise distances given by the pseudo-metric d. Given an input hyperparameter k, for each xi , compute the k-nearest neighbors under d, labeled {xi1 , · · · , xik }. UMAP finds these neighbors using the nearest neighbor descent algorithm [42]. Now define for each xi two quantities ρi and σi . We define ρi to be the distance between xi and the nearest neighbor xij such that 1 ≤ j ≤ k and d(xi , xij ) > 0. We set σi to be the value such that

∑_{j=1}^{k} exp( − max(0, d(xi , xij ) − ρi ) / σi ) = log2 (k).

Then ρi comes from the local connectivity constraint and ensures that xi connects to at least one other data point with
an edge weight of 1. σi is a normalization factor defining the Riemannian metric local to the point xi .
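Since the left side of the defining equation for σi increases monotonically, from 1 (as σi → 0, only the nearest neighbor contributes) toward k as σi grows, σi can be found by a simple binary search. A sketch under that observation (smooth_knn_params is a hypothetical name; UMAP's actual implementation differs in details):

```python
import math

def smooth_knn_params(dists, k, n_iter=64):
    """Given the positive distances from x_i to its k nearest neighbors, return
    (rho_i, sigma_i) with sum_j exp(-max(0, d_ij - rho_i)/sigma_i) = log2(k)."""
    rho = min(d for d in dists if d > 0)
    target = math.log2(k)
    lo, hi = 1e-12, 1e6
    for _ in range(n_iter):  # binary search: the sum is increasing in sigma
        sigma = (lo + hi) / 2
        s = sum(math.exp(-max(0.0, d - rho) / sigma) for d in dists)
        if s > target:
            hi = sigma
        else:
            lo = sigma
    return rho, sigma

rho, sigma = smooth_knn_params([0.5, 1.0, 1.5, 2.0], k=4)
```

With these neighbor distances, rho is 0.5 and sigma is tuned so the exponential sum hits log2(4) = 2.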
Now define a directed graph G whose vertices are the points X = {x1 , · · · , xN }. Each point xi ∈ X is connected by a directed edge to each of its k nearest neighbors {xi1 , · · · , xik }, where the directed edge from xi to xij has a weight of

exp( − max(0, d(xi , xij ) − ρi ) / σi ).

The graph G is the 1-skeleton of the fuzzy simplicial set associated to the pseudo-metric space associated to xi .
The weight associated to the edge is the fuzzy membership strength of the corresponding 1-simplex within the fuzzy
simplicial set.
Now let A be the weighted adjacency matrix of G and let

B = A + A^T − A ◦ A^T ,

where A^T is the transpose of A and X ◦ Y is the Hadamard product of two matrices, formed by multiplying them componentwise.
If Aij is the probability that the directed edge from xi to xj exists, then Bij is the probability that at least one of the
two directed edges from xi to xj or xj to xi exists. The UMAP graph G is then the undirected graph whose weighted
adjacency matrix is B.
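The symmetrization works elementwise: for edge weights a1 and a2 treated as probabilities of the two directed edges, b = a1 + a2 − a1 a2 is the probability that at least one exists. A sketch on a small adjacency matrix (the symmetrize helper is mine):

```python
def symmetrize(A):
    """B = A + A^T - A o A^T, computed entrywise on a nested-list matrix."""
    n = len(A)
    return [[A[i][j] + A[j][i] - A[i][j] * A[j][i] for j in range(n)]
            for i in range(n)]

A = [[0.0, 1.0, 0.4],
     [0.8, 0.0, 0.0],
     [0.0, 0.5, 0.0]]
B = symmetrize(A)  # symmetric; one-sided edges keep their original weight
```

An edge of weight 1 in either direction stays at 1 after symmetrization, and an edge present in only one direction keeps its weight, since a + 0 − 0 = a.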
Now we need to describe the graph layout step. UMAP uses a force directed graph layout algorithm. A force
directed layout algorithm uses a set of attractive forces along edges and a set of repulsive forces among vertices. The
algorithm proceeds by iteratively applying attractive and repulsive forces at each edge or vertex. Slowly decreasing
these forces guarantee convergence to a local minimum.

In UMAP, the attractive force between two vertices xi and xj connected by an edge and sitting at coordinates yi and yj is determined by

( −2ab ||yi − yj ||^{2(b−1)} / (1 + ||yi − yj ||^2 ) ) w((xi , xj )) (yi − yj ),

where a and b are hyperparameters, w((xi , xj )) is the weight of the edge between xi and xj , and all norms are Euclidean. The repulsive force is given by

( 2b / ( (ϵ + ||yi − yj ||^2 )(1 + a||yi − yj ||^{2b} ) ) ) (1 − w((xi , xj ))) (yi − yj ),

where ϵ is set to .001 to prevent division by zero.
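The two force expressions translate directly into code. In the sketch below the values of a and b are placeholder hyperparameters (the actual implementation fits them from its min_dist parameter), and the function names are mine:

```python
def attractive_force(yi, yj, w, a=1.58, b=0.9):
    """Attractive force on yi along an edge of weight w."""
    d2 = sum((u - v) ** 2 for u, v in zip(yi, yj))  # squared Euclidean distance
    coeff = (-2.0 * a * b * d2 ** (b - 1)) / (1.0 + d2) * w
    return [coeff * (u - v) for u, v in zip(yi, yj)]

def repulsive_force(yi, yj, w, a=1.58, b=0.9, eps=0.001):
    """Repulsive force on yi for a pair with low edge weight."""
    d2 = sum((u - v) ** 2 for u, v in zip(yi, yj))
    coeff = 2.0 * b / ((eps + d2) * (1.0 + a * d2 ** b)) * (1.0 - w)
    return [coeff * (u - v) for u, v in zip(yi, yj)]

# A strongly connected pair is pulled together; a disconnected pair is pushed apart.
pull = attractive_force((1.0, 0.0), (0.0, 0.0), w=1.0)
push = repulsive_force((1.0, 0.0), (0.0, 0.0), w=0.0)
```

The sign of the coefficient does the work: the attractive coefficient is negative, so the force points from yi toward yj, while the repulsive one is positive and points away.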


These forces are derived from gradients optimizing the edgewise cross-entropy between the weighted graph G and
the equivalent weighted graph H constructed from the points {y1 , · · · , yN }. So the idea is to position the points yi so
that the weighted graph induced by them most closely approximates the graph G where we measure the distance by
the total cross-entropy over all edge existence probabilities. Since G matches the topology of the original data, H will
be a good low dimensional representation of the topology of this data.
More implementation details and test results are found in [100]. In the next two sections, we will briefly look at
when you would want to use UMAP as opposed to a competing algorithm.

6.5.3 Weaknesses of UMAP


UMAP is very fast and effective for both visualization and dimension reduction, but there are some situations in which UMAP should not be used.

1. If you need interpretability of the reduced dimensions, UMAP does not give them a specific meaning. Principal
Component Analysis (PCA) or the related Non-negative Matrix Factorization (NMF) are much better in this
situation. For example, PCA reveals the dimensions which are the directions of greatest variability.

2. Don’t use UMAP if you don’t expect to have a manifold structure. Otherwise, it could find manifold structure
in the noise.

3. UMAP favors local distance over global distance. If you are mainly concerned with global distances, multidi-
mensional scaling (MDS) is better suited to this task.

4. Finally, UMAP makes a number of approximations to improve the computational efficiency of the algorithm.
This can impact the results for small data sets with less than 500 points. The authors recommend that UMAP
not be used in this case.

6.5.4 UMAP vs. t-SNE


Since I have been mentioning t-SNE throughout this section as UMAP’s main competitor, I will now give you
an idea of what it actually does. While [165, 164] were the original sources, I will base this discussion on the short summary that appears in [129]. Later, I will compare t-SNE to UMAP and discuss the major differences.
As with UMAP, we face the problem of non-uniform density. The original algorithm was called stochastic neighbor embedding or SNE [69].
As before, we have a set of data points $\{x_1, \cdots, x_N\} \subset \mathbb{R}^n$, and let $\{y_1, \cdots, y_N\}$ be a candidate for the corresponding image points in $\mathbb{R}^k$ for $k < n$. Writing $d$ for the distance in both $\mathbb{R}^n$ and $\mathbb{R}^k$, let
\[
p_{j|i} = \frac{e^{-d(x_i, x_j)^2 / 2\sigma_i^2}}{\sum_{k \neq i} e^{-d(x_i, x_k)^2 / 2\sigma_i^2}}
\]

and
\[
q_{j|i} = \frac{e^{-d(y_i, y_j)^2}}{\sum_{k \neq i} e^{-d(y_i, y_k)^2}},
\]
where the variances $\sigma_i$ will be obtained by a process described below; in the equation for $q_{j|i}$, all of the variances are fixed to be $\frac{\sqrt{2}}{2}$, so that $2\sigma^2 = 1$. Set $p_{i|i} = q_{i|i} = 0$.
Looking at $p_{j|i}$ and $q_{j|i}$ as probability distributions, we would like them to be as close as possible, where the difference will be measured by the Kullback-Leibler divergence. We will use it in the form of the cost function
\[
C = \sum_i \sum_j p_{j|i} \log \frac{p_{j|i}}{q_{j|i}}.
\]
By minimizing this particular function, we are doing the optimization locally for each $x_i, y_i$ pair taking into account their neighbors, then summing over all of the $x_i$. The density around a point is reflected in this cost function.
The SNE algorithm involves solving for the minimizing points $\{y_i\}$ using gradient descent. Convergence is slow and depends on good choices of $\sigma_i$. To estimate $\sigma_i$, fix a value
\[
P = 2^{-\sum_j p_{j|i} \log_2 p_{j|i}}.
\]
This value is called the perplexity and can be thought of as the number of neighbors used. This is a loose estimate of the local dimension. The perplexity is typically between 10 and 100. It is an input parameter, and we solve for values of $\sigma_i$ achieving the desired perplexity.
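This calibration is easy to prototype: the perplexity grows monotonically with $\sigma_i$, so a bisection search recovers the $\sigma_i$ matching a target perplexity. The sketch below is my own illustration; the function names are invented and not taken from any t-SNE implementation.

```python
import math

def conditional_p(dists_sq, sigma):
    """p_{j|i} for a single point i, given squared distances to the other points."""
    weights = [math.exp(-d / (2.0 * sigma ** 2)) for d in dists_sq]
    total = sum(weights)
    return [w / total for w in weights]

def perplexity(p):
    """Perp = 2^{-sum_j p_{j|i} log2 p_{j|i}}."""
    return 2.0 ** (-sum(pj * math.log2(pj) for pj in p if pj > 0))

def sigma_for_perplexity(dists_sq, target, lo=1e-4, hi=1e4, iters=60):
    """Bisect for the sigma_i whose induced perplexity matches the target."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if perplexity(conditional_p(dists_sq, mid)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a point with four neighbors, a tiny $\sigma_i$ drives the perplexity toward 1 and a huge one toward 4, so any target in between can be hit.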
SNE has two major problems:
1. The gradient descent procedure is slow and it can be hard to get it to converge.
2. With a large number of points, visualization can be difficult as the cost function will push most points into the
center of mass.
To address these issues, van der Maaten and Hinton proposed the t-distributed stochastic neighborhood embedding
algorithm or t-SNE. SNE is modified in the following ways:
1. We symmetrize $p_{j|i}$ as
\[
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2}.
\]
2. We symmetrize $q_{j|i}$ as
\[
q_{ij} = \frac{\left(1 + d(y_i, y_j)^2\right)^{-1}}{\sum_{k \neq l} \left(1 + d(y_k, y_l)^2\right)^{-1}}.
\]
In the original definition, $q_{j|i}$ was defined using a Gaussian. This definition replaces it with a Student t-distribution with one degree of freedom, which has more weight in the tails.
3. The cost function is now
\[
C = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}}.
\]

These modifications lead to heavier tails in the embedding space, giving outliers less impact on the overall results and alleviating the compression around the center of mass. The gradient descent procedure also becomes more efficient.
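Modifications (2) and (3) are easy to see in code. The following sketch is my own illustration of the symmetrized Student-t similarities and the cost function, not van der Maaten and Hinton's implementation:

```python
import math
from itertools import combinations

def tsne_q(points):
    """Student-t similarities q_ij for low-dimensional points (lists of floats)."""
    def inv_kernel(p, r):
        d2 = sum((a - b) ** 2 for a, b in zip(p, r))
        return 1.0 / (1.0 + d2)                    # (1 + d(y_i, y_j)^2)^{-1}
    pairs = list(combinations(range(len(points)), 2))
    # The normalizer runs over all ordered pairs k != l, hence the factor 2.
    total = 2.0 * sum(inv_kernel(points[i], points[j]) for i, j in pairs)
    q = {}
    for i, j in pairs:
        q[(i, j)] = q[(j, i)] = inv_kernel(points[i], points[j]) / total
    return q

def kl_cost(p, q):
    """C = sum_{i,j} p_ij log(p_ij / q_ij), skipping zero entries."""
    return sum(pij * math.log(pij / q[key]) for key, pij in p.items() if pij > 0)
```

The $q_{ij}$ sum to 1 over all ordered pairs, and the cost vanishes exactly when $p = q$.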
Now to compare t-SNE and UMAP, we can write the UMAP cross-entropy cost function as
\[
C_{UMAP} = \sum_{i \neq j} \left[\, p_{ij} \log \frac{p_{ij}}{q_{ij}} + (1 - p_{ij}) \log \frac{1 - p_{ij}}{1 - q_{ij}} \,\right].
\]

Although the cost functions look similar, the definitions for $p_{j|i}$ and $q_{j|i}$ are somewhat different for UMAP. In the high dimensional space, the $p_{j|i}$ are fuzzy membership functions taking the form
\[
p_{j|i} = e^{-(d(x_i, x_j) - \rho_i)/\sigma_i}.
\]

The values $p_{j|i}$ are calculated only for the $k$ nearest neighbors, with all other $p_{j|i} = 0$. The distance $d(x_i, x_j)$ in the high dimensional space can be any distance function, $\rho_i$ is the distance to the nearest neighbor of $x_i$, and $\sigma_i$ is a normalizing factor which plays a similar role to the perplexity-derived $\sigma_i$ from t-SNE. Symmetrization is carried out by fuzzy set union and can be expressed as
\[
p_{ij} = (p_{j|i} + p_{i|j}) - p_{j|i}\, p_{i|j}.
\]

The low dimensional similarities are given by
\[
q_{ij} = \left(1 + a\|y_i - y_j\|_2^{2b}\right)^{-1},
\]

where a and b are user-defined values. UMAP's defaults give a = 1.929 and b = 0.7915. Setting a = b = 1 results in the Student t-distribution used in t-SNE.
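Collecting the three UMAP formulas gives a minimal sketch; the naming below is my own and is not the API of the UMAP package:

```python
import math

def p_conditional(d, rho, sigma):
    """Fuzzy membership p_{j|i} = exp(-(d(x_i, x_j) - rho_i) / sigma_i)."""
    return math.exp(-(d - rho) / sigma)

def fuzzy_union(p_ji, p_ij):
    """Symmetrization p_ij = (p_{j|i} + p_{i|j}) - p_{j|i} p_{i|j}."""
    return (p_ji + p_ij) - p_ji * p_ij

def q_low(yi, yj, a=1.929, b=0.7915):
    """Low-dimensional similarity q_ij = (1 + a ||y_i - y_j||^{2b})^{-1}."""
    d2 = sum((u - v) ** 2 for u, v in zip(yi, yj))
    return 1.0 / (1.0 + a * d2 ** b)
```

Note that the nearest neighbor of $x_i$ always gets $p_{j|i} = 1$ (since $d = \rho_i$ there), and `q_low` with $a = b = 1$ is exactly the Student-t kernel of t-SNE.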
According to [100], UMAP is demonstrably faster than t-SNE and provides better scaling. This allows for the
generation of high quality embeddings of larger data sets than had ever been possible in the past.
Chapter 7

Some Unfinished Ideas

In this very short chapter, I will mention some ideas that have never been fully pursued. They involve the use of
algebraic topology for time series of graphs, directed simplicial complexes, computer intrusion detection, and market
basket analysis.

7.1 Time Series of Graphs


In Chapter 5, I discussed time series in general and the basics of graphs. One interesting combination of these two
topics is the idea of time series of graphs. There are many problems in which we are interested in network change
detection. This could involve differences in connectivity between nodes or in the weights of the edges between them.
Edges can be weighted by attributes of the traffic the network carries. Typically, we want to detect drastic changes or
anomalies in the networks.
To convert a series of graphs into a numerical time series, we could use a graph distance measure. Edit distance,
discussed in Section 5.7, is especially useful and easy to compute. The book by Wallis et al. [167] has several other
ideas. We could also get a series by tracking attributes such as number of vertices, number of edges, maximum degree,
etc. As I mentioned, a graph can also be converted to a simplicial complex in various ways, so we can track Betti
numbers, Euler characteristic, or dimension of the complex. We can also convert any graph into a persistence diagram
or landscape and then produce a time series with any of the distance measures we discussed in Chapter 5.
Given a time series, there is an easy trick for detecting spikes. Slide a window of size $k$ for some small number $k$. Let $\{x_n, x_{n+1}, \cdots, x_{n+k-1}\}$ be the values in this window and let $\mu_n$ and $\sigma_n$ be their mean and standard deviation. Then we can detect a spike at $x_{n+k}$ if $(x_{n+k} - \mu_n)/\sigma_n$ is above some predetermined threshold. For significant level changes, we can use control chart techniques such as cusum from statistical quality control. A good book with a lot of examples is [103]. These techniques are very useful if you own a widget factory and a disgruntled employee throws a wrench in the works. For more complex anomalies, you can use SAX as discussed in Chapter 5.
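The windowed spike detector takes only a few lines. This sketch is my own; the threshold of 3 standard deviations is an arbitrary illustrative choice:

```python
from statistics import mean, stdev

def detect_spikes(series, k=5, threshold=3.0):
    """Flag index t as a spike when (x_t - mu) / sigma of the k preceding
    values exceeds the threshold."""
    spikes = []
    for t in range(k, len(series)):
        window = series[t - k:t]
        mu, sd = mean(window), stdev(window)
        if sd > 0 and (series[t] - mu) / sd > threshold:
            spikes.append(t)
    return spikes
```

Applied to a series of graph edit distances (or Betti numbers, Euler characteristics, etc.), the flagged indices mark sudden network changes.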

7.2 Directed Simplicial Complexes


Directed simplicial complexes are a structure that I have just recently learned about. They have a very interesting application in neurology.
The Blue Brain Project [17] is a project of EPFL in Lausanne, Switzerland to fully map the brain of a mouse.
A mouse’s brain has about 100 million neurons and a trillion synapses. The idea is to understand the link between
the neural network structure and its function. By constructing graphs that reflect the information flow and using
algebraic topology it is possible to find cliques of neurons that respond to specific stimuli and determine how these
are organized into cavities. The problem is that algebraic topology mainly deals with data from undirected graphs. I
will now describe the structures needed for this project. The idea of simplicial complexes and persistence related to


directed graphs could easily be applied to a variety of problems. In addition, there is an extension of RIPSER called
FLAGSER that rapidly performs the required computations. See [59, 89, 133, 147] for more details.
Definition 7.2.1. An abstract directed simplicial complex is a collection S of finite ordered sets with the property that
if σ is in S, then so is every subset τ of σ with the ordering inherited from σ.
Directed simplicial complexes are similar to the simplicial complexes that we saw before, but the difference is that the order of the vertices is now important. So, for example, $[v_0, v_1, v_2]$ is not the same as $[v_0, v_2, v_1]$. For this discussion, we will call the $i$-th face $\sigma^i$ of $\sigma = [v_0, \cdots, v_n]$ the face determined by removing the vertex $v_{n-i}$.
The key structure used in the Blue Brain Project is the directed flag complex.
Definition 7.2.2. The directed flag complex associated to the directed graph $G$ is the abstract directed simplicial complex $S(G)$ whose vertices are the vertices of $G$ and whose directed $n$-simplices are $(n+1)$-tuples $[v_0, \cdots, v_n]$ such that for each pair $i, j$ with $0 \leq i < j \leq n$, there is a directed edge from $v_i$ to $v_j$ in the graph $G$. The vertex $v_0$ is called the source of the simplex, and there is a directed edge from $v_0$ to every other vertex in the simplex. The vertex $v_n$ is called the sink of the simplex, and there is a directed edge from every vertex to $v_n$.
Note that the analogue of the directed flag complex for undirected graphs is the complete subgraph or clique
complex defined in Section 5.7.
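To make Definition 7.2.2 concrete, here is a brute-force enumeration of the directed simplices of a small digraph. This is my own illustration; FLAGSER builds the complex far more efficiently:

```python
from itertools import combinations, permutations

def directed_flag_complex(vertices, edges, max_dim=3):
    """Directed simplices of the directed flag complex of a digraph.

    edges is a set of ordered pairs (u, v).  An n-simplex is an ordered
    (n+1)-tuple [v_0, ..., v_n] with an edge v_i -> v_j whenever i < j.
    """
    simplices = {0: [(v,) for v in vertices], 1: sorted(edges)}
    for dim in range(2, max_dim + 1):
        simplices[dim] = []
        for combo in combinations(vertices, dim + 1):
            for order in permutations(combo):
                if all((order[i], order[j]) in edges
                       for i in range(dim + 1) for j in range(i + 1, dim + 1)):
                    simplices[dim].append(order)
    return simplices
```

For the digraph $0 \to 1 \to 2$ with the extra edge $0 \to 2$, the only directed 2-simplex is $(0, 1, 2)$: vertex 0 is the source and vertex 2 is the sink.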
Now we give a definition of directionality in directed graphs.
Definition 7.2.3. Let $G$ be a directed graph. For each vertex $v$ in $G$, define the signed degree of $v$ as
\[
sd(v) = \mathrm{Indegree}(v) - \mathrm{Outdegree}(v).
\]
(Note that for any finite graph, $\sum_{v \in V(G)} sd(v) = 0$.) The directionality of $G$, denoted $Dr(G)$, is defined to be
\[
Dr(G) = \sum_{v \in V(G)} sd(v)^2.
\]

We then have that a directed $n$-simplex $\sigma$ is a fully connected directed graph on $n + 1$ vertices with a unique source and sink. It can be shown that if $G$ is any other directed graph on $n + 1$ vertices, then $Dr(G) \leq Dr(\sigma)$.
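Directionality is a short computation over the edge list. The sketch below (my own code) also checks the remark above on the example of a directed 2-simplex versus a 3-cycle:

```python
def directionality(vertices, edges):
    """Dr(G) = sum of sd(v)^2 with sd(v) = indegree(v) - outdegree(v)."""
    sd = {v: 0 for v in vertices}
    for u, v in edges:
        sd[u] -= 1     # edge leaving u
        sd[v] += 1     # edge entering v
    assert sum(sd.values()) == 0   # signed degrees always sum to zero
    return sum(d * d for d in sd.values())

# The directed 2-simplex [0, 1, 2] beats the 3-cycle, as the theory predicts.
simplex = directionality([0, 1, 2], [(0, 1), (0, 2), (1, 2)])   # 8
cycle = directionality([0, 1, 2], [(0, 1), (1, 2), (2, 0)])     # 0
```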
To compute homology, we need to define chains and boundaries. We will be working with coefficients in Z2 . Then
the $n$-chains $C_n$ are the formal sums of the $n$-dimensional directed simplices. We define the boundary $\partial_n : C_n \to C_{n-1}$ as
\[
\partial_n(\sigma) = \sigma^0 + \sigma^1 + \cdots + \sigma^n,
\]
where $\sigma^i$ is the $i$-th face of $\sigma$ as defined above. Then we define homology in the usual way.
We can also define persistent homology by taking the sublevel sets of a height function on a directed graph as
a filtration. FLAGSER is able to compute Betti numbers and persistence diagrams using a variant of RIPSER. One
feature is that it can skip matrix columns in the computations resulting in a huge speedup with only small drops in
accuracy.
As an example, the Blue Brain project modeled the neocortical column (a cross section of the brain cortex) of a 14
day old rat. The graph has about 30,000 vertices and around 8 million edges. FLAGSER could build a directed flag
complex in about 30 seconds on a laptop and compute a fairly accurate estimate of the homology on an HPC node in
about 12 hours. See [89] for more details and comparisons.

7.3 Computer Intrusion Detection


How many times has something like this happened to you? You are working on your computer unaware that a fierce lion is watching you closely. After watching you type your password, the lion waits for you to leave and then pounces on your computer, deleting several of your files and writing nasty emails to your boss. (See Figure 7.3.1.) How can you convince the system administrator (and your boss) that it wasn't really you?

Figure 7.3.1: Authorized user and lion impostor [126].

Back in 2004, there were already several methods for computer intrusion detection and there are probably even
better ones now. Besides, a lion would probably be hitting several keys at once and its spelling and grammar would
be terrible. Still, my collaborator Tom Goldring and I thought it would be interesting to try to detect unauthorized
users using features from graphs along with the newly developed classification techniques of random forests [20]. Our
work was cut short by organizational changes, but I was able to present what we had at the Second Conference on
Algebraic Topological Methods in Computer Science in July 2004. There were no proceedings from the conference,
but my slides can still be found online on the conference website [126].
The data consisted of 30 sessions each of 10 users on Windows NT machines. The data was represented by 2
directed graphs. Process graphs have vertices representing the processes called by the user and a directed edge from
process 1 to process 2 if process 2 is spawned by process 1. Process graphs are disconnected trees since processes
called by the operating system are omitted. Window graphs have vertices representing windows clicked on by the user
with a directed edge from a window to the next one clicked on. Window graphs are connected and can have cycles.
Both types of graphs have edge weights representing time (in seconds) between the start of processes (process graphs)
or the time between first clicks on consecutive windows (window graphs). Figures 7.3.2 and 7.3.3 show examples of
a process graph and a window graph respectively.

Figure 7.3.2: Example of a process graph [126].

Each session yielded a 56-long feature vector related to the size and shape of window and process graphs as well as edge weights representing times between clicks on windows or the start of processes. We then determined the most important features using a random forest. It turned out that
window graphs were more important than process graphs. The most important window graph features were sources

Figure 7.3.3: Example of a window graph [126].

(representing whether the user returned to the first window used), sinks (representing whether the last window used
was one that was used previously), and timing factors such as the total edge weight and the total weighted degree.
Running the data through a 10-way random forest produced the confusion matrix in Figure 7.3.4. This shows that
some users such as users 4, 9, and 10 were pretty easy to distinguish while user 5 was much more difficult. Still, this
is a good start for the difficult problem of distinguishing 10 classes.

Figure 7.3.4: Confusion matrix of 10-way random forest for intrusion detection [126].

This was as far as we got, but we had some ideas for possible improvements. The first one was to create a directed
graph called an interval graph. The vertices represent windows, and there is a directed edge between two windows that are up at the same time, directed from the window opened earlier to the one opened later. Interval graphs have the advantage that they distinguish users who open many windows at once from neater users.
Using algebraic topology, we could add features related to the graph complexes described in Section 5.7. We
could also now create the directed flag complex described in Section 7.2. Once the complex is created, we could
send a classifier features such as the dimension of the complex, the number of simplices in each dimension, the Euler
characteristic, or the Betti numbers of the homology groups.

Although this project was never finished, it may provide some ideas for other types of problems.

7.4 Market Basket Analysis


Back in Section 4.1.1, I mentioned the idea of market basket analysis or association analysis. (For a very under-
standable and complete description of this topic, see Chapters 6 and 7 of [159].) Suppose a supermarket has a set of
items that it sells. Let those be the vertices of a hypergraph. When a customer visits the store, the list of items they buy
is called a transaction. We can think of the set of transactions belonging to a particular customer as a hypergraph with
each transaction corresponding to a hyperedge. One interesting problem is to produce a set of association rules. An
example would be that if a customer buys peanut butter and bread, they will also buy jelly. To formulate the problem,
we will need some terminology. Let I = {i1 , · · · , in } be the set of items the store sells and T = {t1 , · · · , tk } be the
set of transactions of all customers combined. A subset of I is called an itemset. A transaction ti contains an itemset
X if X is a subset of ti . The number of transactions including the itemset X is called the support count of X and
denoted σ(X).
An association rule is an expression of the form X → Y where X and Y are disjoint itemsets. For example,
{peanut butter, bread} → {jelly} is an association rule. We can always build a rule with any two disjoint sets of items
and no rule will always hold. So how do we decide which ones are interesting? We would like to select rules with
high values of support and confidence. The support of a rule X → Y is the fraction of transactions which contain
both X and Y . The confidence is the fraction of transactions containing X that also contain Y . We would like to find
all rules whose support and confidence each exceed a predetermined threshold. But the number of possible rules for a store with $d$ items is $R = 3^d - 2^{d+1} + 1$. (The proof is an exercise in [159].) So we need to do something more efficient. The Apriori Algorithm [7] is designed to find these rules efficiently. Its strategy is to generate frequent itemsets whose support is above a given threshold and then use that list to find rules of high confidence.
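On a toy transaction set, support and confidence can be computed by brute force. The sketch below is my own illustration of the definitions, not the Apriori algorithm:

```python
def support_count(transactions, itemset):
    """sigma(X): number of transactions containing every item of X."""
    return sum(1 for t in transactions if itemset <= t)

def rule_metrics(transactions, x, y):
    """(support, confidence) of the rule X -> Y for disjoint itemsets X, Y."""
    both = support_count(transactions, x | y)
    return both / len(transactions), both / support_count(transactions, x)

baskets = [
    {"peanut butter", "bread", "jelly"},
    {"peanut butter", "bread"},
    {"milk"},
    {"peanut butter", "bread", "jelly"},
]
sup, conf = rule_metrics(baskets, {"peanut butter", "bread"}, {"jelly"})
```

Here the rule {peanut butter, bread} → {jelly} has support 2/4 and confidence 2/3.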
Now suppose we want to use the data to solve a different problem. We would like to classify or cluster the
customers based on their transactions. Are they living alone or are they buying for a large family? Can we tell the
difference just from what they buy rather than how much?
The set of each customer's transactions forms a hypergraph. As with graphs, there are a number of distance measures to choose from. One example is a generalization of graph edit distance. For hypergraphs $C_1$ and $C_2$, let $\Gamma_i^1$ and $\Gamma_i^2$ be the sets of hyperedges of size $i$ for $C_1$ and $C_2$ respectively. Suppose $m$ is the maximum size of a hyperedge, and $V(C_i)$ is the set of vertices of $C_i$. Then the hypergraph edit distance is
\[
d(C_1, C_2) = |V(C_1)| + |V(C_2)| - 2|V(C_1) \cap V(C_2)| + \sum_{i=2}^{m} \left( |\Gamma_i^1| + |\Gamma_i^2| - 2|\Gamma_i^1 \cap \Gamma_i^2| \right).
\]

When m = 2, this reduces to the usual graph edit distance.
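The hypergraph edit distance is direct to code up. This sketch is my own, representing hyperedges as frozensets of vertices:

```python
def hypergraph_edit_distance(v1, e1, v2, e2):
    """Edit distance between hypergraphs (v*: vertex sets, e*: sets of
    frozenset hyperedges), following the formula above."""
    d = len(v1) + len(v2) - 2 * len(v1 & v2)
    for i in {len(e) for e in e1 | e2}:           # hyperedge sizes that occur
        g1 = {e for e in e1 if len(e) == i}       # Gamma^1_i
        g2 = {e for e in e2 if len(e) == i}       # Gamma^2_i
        d += len(g1) + len(g2) - 2 * len(g1 & g2)
    return d
```

When every hyperedge has size 2, this is exactly the graph edit distance of Section 5.7.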


Now recall that any hypergraph can be made into a simplicial complex by taking the hyperedges as the maximal simplices and adding all of their subsets to form a simplicial complex. Then we can classify customers by Euler characteristic and Betti numbers for each dimension. These techniques are all untried as far as I know, and it would be interesting to see what they would produce.
interesting to see what they would produce.
In a talk at the Workshop on Topological Data Analysis: Theory and Applications sponsored by the Tutte Institute
and the University of Western Ontario (May 1-3, 2021), Emilie Purvine of Pacific Northwest National Laboratory
argued that the above method of producing a simplicial complex from a hypergraph is inadequate. This is due to the
fact that we are adding simplices that aren’t really there. For some interesting alternatives, you can watch her talk
online [128].
Chapter 8

Cohomology

Cohomology is sort of the reverse of homology. Instead of a boundary map that goes down in dimension, we have a
coboundary that goes up in dimension. There is a cohomology exact sequence that is like the homology exact sequence
only backwards. The Universal Coefficient Theorem (see Section 8.4) shows that if we know the homology groups of
a complex, we can compute the cohomology groups. In fact, two spaces with isomorphic homology groups in every
dimension have isomorphic cohomology groups in every dimension. So why do we care?
The answer is that cohomology actually forms a ring. We have a product called the cup product (see Section 8.3)
which takes a cohomology class of dimension p and a class of dimension q and produces a class of dimension p + q.
But couldn’t we do the same thing in homology? Surprisingly, the answer is no. The argument is subtle and we will
see why this is true in Section 8.5. We will also see two spaces with the same homology groups but with a different
cohomology ring structure, meaning the spaces are not homotopy equivalent.
So what does this have to do with data science? Remember that at its heart, algebraic topology is a classifier.
The more we know about a pair of objects, the better we can tell them apart. To paraphrase Norman Steenrod [153],
cohomology encodes more of the geometry into the algebra. Steenrod squares, which we will meet in Chapter 11,
encode even more of the geometry as they enhance cohomology rings by giving them the richer structure of algebras. I would conjecture that the richer structure will provide even more information, especially if we can adapt the definition of persistence. It turns out that there is already some work in this direction, but there are a lot of open problems.
One definite practical advantage of cohomology is that the computations are much faster than for persistent ho-
mology. Once we do these computations we can find the persistent homology as well. This fact is the main reason for
the big speedup of RIPSER over its predecessors.
In the next section, I will describe four important mathematical tools from the area of homological algebra. They
will be necessary to define cohomology and state its most important properties. I will then define cohomology in
Section 8.2 and describe the analogs to homology including a new set of Eilenberg-Steenrod axioms. Section 8.3
deals with the cup product which gives cohomology its ring structure. Section 8.4 discusses the two Universal Coef-
ficient Theorems which allow us to compute cohomology of a space if we know its homology and the homology with
coefficients in any abelian group if we know its homology over the integers. Section 8.5 deals with the homology
and cohomology of product spaces using the Künneth Theorems. In Section 8.6, we will look more closely at the
cohomology ring structure of a product space. In Section 8.7, we turn to persistent cohomology, and we conclude the
chapter in Section 8.8 by describing the role of persistent cohomology in the RIPSER software.
Unless otherwise stated, the material in Sections 8.1-8.5 comes from Munkres [110]. The other algebraic topology
textbooks I mentioned [47, 63, 149] have their own explanations of the same topics.

8.1 Introduction to Homological Algebra


Homological algebra is an important computational tool in algebraic topology, but it is an entire mathematical field
of its own. We will use it to define cohomology and state the Universal Coefficient and Künneth Theorems as well as


to perform the related computations. As stated above, the material comes from Munkres [110], and we will mostly
restrict ourselves to abelian groups, but the operations I will define have generalizations to modules. (Recall that an
abelian group can be thought of as a module over the ring of integers.) If you want to see it in its more general form, the classic reference is Mac Lane's book with the modest title "Homology" [92]. Other popular books are the ones by Rotman [137] and Hilton and Stammbach [68].

8.1.1 Hom
Given two abelian groups, A and G, we get a third abelian group Hom(A, G) consisting of the homomorphisms
from $A$ to $G$. If $\phi, \psi \in \mathrm{Hom}(A, G)$, we define addition by $(\phi + \psi)(a) = \phi(a) + \psi(a)$ for $a \in A$. The reader should check that $\phi + \psi$ is a homomorphism and so an element of $\mathrm{Hom}(A, G)$. The identity of $\mathrm{Hom}(A, G)$ is the function $f$ such that $f(a) = 0$ for all $a \in A$, and if $\phi \in \mathrm{Hom}(A, G)$, then the inverse $-\phi$ is defined by $(-\phi)(a) = \phi(-a)$.
Example 8.1.1. $\mathrm{Hom}(\mathbb{Z}, G) \cong G$ by the isomorphism $\Phi$ that takes $\phi : \mathbb{Z} \to G$ to $\phi(1) \in G$. To see that this is an isomorphism, suppose $\phi, \psi \in \mathrm{Hom}(\mathbb{Z}, G)$. Then $\Phi(\phi + \psi) = (\phi + \psi)(1) = \phi(1) + \psi(1) = \Phi(\phi) + \Phi(\psi)$. $\Phi$ is surjective since given $g \in G$, we can choose a homomorphism $\phi$ such that $\phi(1) = g$. $\Phi$ is injective since if $\phi : \mathbb{Z} \to G$ is a homomorphism and $\phi(1) = 0$, then for any positive integer $n$, $\phi(n) = \phi(1) + \phi(1) + \cdots + \phi(1) = 0 + \cdots + 0 = 0$. For $n$ negative, $\phi(n) = -\phi(-n) = 0$. So the kernel of $\Phi$ contains only the function which is identically 0. Thus, $\Phi$ is an isomorphism.
The next definition will be key when defining cohomology.
Definition 8.1.1. A homomorphism $f : A \to B$ gives rise to a dual homomorphism
\[
\tilde{f} : \mathrm{Hom}(B, G) \to \mathrm{Hom}(A, G)
\]
going in the reverse direction. The map $\tilde{f}$ assigns to the homomorphism $\phi : B \to G$ the composite
\[
A \xrightarrow{f} B \xrightarrow{\phi} G.
\]
In other words, $\tilde{f}(\phi) = \phi \circ f$.


Note that the assignment A → Hom(A, G) and f → f˜ is a contravariant functor from the category of abelian
groups and maps to itself.
Theorem 8.1.1. If f is a homomorphism and f˜ is its dual homomorphism then:
1. If f is an isomorphism, so is f˜.
2. If f is the zero homomorphism, so is f˜.
3. If f is surjective, then f˜ is injective.
The following result generalizes the last statement.
Theorem 8.1.2. If the sequence
\[
A \xrightarrow{f} B \xrightarrow{g} C \to 0
\]
is exact, then so is the dual sequence
\[
0 \to \mathrm{Hom}(C, G) \xrightarrow{\tilde{g}} \mathrm{Hom}(B, G) \xrightarrow{\tilde{f}} \mathrm{Hom}(A, G).
\]
If $f$ is injective and the first sequence splits, then $\tilde{f}$ is surjective and the second sequence splits.

Example 8.1.2. In general, exactness of a short exact sequence does not imply exactness of the dual sequence. Let $f : \mathbb{Z} \to \mathbb{Z}$ be multiplication by 2. Then the sequence
\[
0 \to \mathbb{Z} \xrightarrow{f} \mathbb{Z} \to \mathbb{Z}_2 \to 0
\]
is exact. But $\tilde{f}$ is not surjective. If $\phi \in \mathrm{Hom}(\mathbb{Z}, \mathbb{Z})$, then $\tilde{f}(\phi) = \phi \circ f$. Now any homomorphism of $\mathbb{Z}$ into itself is multiplication by $\alpha$ for some $\alpha \in \mathbb{Z}$, so $\tilde{f}(\phi)$ is multiplication by $2\alpha$ and takes every integer to an even integer. So the image of $\tilde{f}$ is not all of $\mathrm{Hom}(\mathbb{Z}, \mathbb{Z})$, and $\tilde{f}$ is not surjective.
So converting a short exact sequence to Hom reverses the arrows but still needs something else on the left hand
side. That will be accomplished using the Ext functor that we will see in Section 8.1.3. We also remark that applying
Hom with G in the first coordinate instead of the second produces a covariant functor so the arrows are not reversed.
We conclude this section with a description of the Hom of two finitely generated abelian groups. I will state the
theorem in a little less generality than Munkres as I will assume a finite indexing set so we only need to deal with
direct sums rather than direct products.
Theorem 8.1.3.
1. There are isomorphisms
\[
\mathrm{Hom}\Big(\bigoplus_{i=1}^n A_i, G\Big) \cong \bigoplus_{i=1}^n \mathrm{Hom}(A_i, G)
\]
and
\[
\mathrm{Hom}\Big(A, \bigoplus_{i=1}^m G_i\Big) \cong \bigoplus_{i=1}^m \mathrm{Hom}(A, G_i).
\]
2. There is an isomorphism of $\mathrm{Hom}(\mathbb{Z}, G)$ with $G$. If $f : \mathbb{Z} \to \mathbb{Z}$ equals multiplication by $m$, then so does $\tilde{f}$.
3. $\mathrm{Hom}(\mathbb{Z}_m, G) \cong \ker(G \xrightarrow{m} G)$.
Proof: I will give Munkres’ proofs of the second half of statement (2) and statement (3). (We have already
discussed the first statement of (2) in Example 8.1.1 and (1) is just standard group theory.)
Let $f : \mathbb{Z} \to \mathbb{Z}$ be multiplication by $m$. Then for $\phi \in \mathrm{Hom}(\mathbb{Z}, G)$ and $z \in \mathbb{Z}$, $\tilde{f}(\phi)(z) = \phi(f(z)) = \phi(mz) = m\phi(z)$. So $\tilde{f}(\phi) = m\phi$. Then $\tilde{f}$ is multiplication by $m$ in $\mathrm{Hom}(\mathbb{Z}, G)$ and so also in $G$ by way of the isomorphism of $\mathrm{Hom}(\mathbb{Z}, G)$ with $G$.
To see (3), if
\[
0 \to \mathbb{Z} \xrightarrow{m} \mathbb{Z} \to \mathbb{Z}_m \to 0
\]
is exact, then
\[
0 \to \mathrm{Hom}(\mathbb{Z}_m, G) \to \mathrm{Hom}(\mathbb{Z}, G) \xrightarrow{m} \mathrm{Hom}(\mathbb{Z}, G)
\]
is exact. Then $\ker(G \xrightarrow{m} G) \cong \mathrm{Hom}(\mathbb{Z}_m, G)$. ■
What if we want to find Hom(G, H) where G and H are finite cyclic groups?
Theorem 8.1.4. There is an exact sequence
\[
0 \to \mathbb{Z}_d \to \mathbb{Z}_n \xrightarrow{m} \mathbb{Z}_n \to \mathbb{Z}_d \to 0,
\]
where $d = \gcd(m, n)$, i.e., the greatest common divisor of $m$ and $n$. From this it can be shown that
\[
\mathrm{Hom}(\mathbb{Z}_m, \mathbb{Z}_n) \cong \ker(\mathbb{Z}_n \xrightarrow{m} \mathbb{Z}_n) \cong \mathbb{Z}_d.
\]

Now we can determine $\mathrm{Hom}(A, B)$ when $A$ and $B$ are any finitely generated abelian groups. Reviewing what we know, we have:
\[
\mathrm{Hom}(\mathbb{Z}, G) \cong G,
\]
\[
\mathrm{Hom}(\mathbb{Z}_m, G) \cong \ker(G \xrightarrow{m} G).
\]
So
\[
\mathrm{Hom}(\mathbb{Z}_m, \mathbb{Z}) = 0
\]
and
\[
\mathrm{Hom}(\mathbb{Z}_m, \mathbb{Z}_n) \cong \mathbb{Z}_d,
\]
where $d = \gcd(m, n)$.
We will want to remember these formulas for later.
Example 8.1.3. Let $A = \mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}_{15} \oplus \mathbb{Z}_3$ and $B = \mathbb{Z} \oplus \mathbb{Z}_6 \oplus \mathbb{Z}_2$. Find $\mathrm{Hom}(A, B)$.
\begin{align*}
\mathrm{Hom}(A, B) &\cong \mathrm{Hom}(\mathbb{Z} \oplus \mathbb{Z} \oplus \mathbb{Z}_{15} \oplus \mathbb{Z}_3,\ \mathbb{Z} \oplus \mathbb{Z}_6 \oplus \mathbb{Z}_2) \\
&\cong \mathrm{Hom}(\mathbb{Z}, \mathbb{Z}) \oplus \mathrm{Hom}(\mathbb{Z}, \mathbb{Z}_6) \oplus \mathrm{Hom}(\mathbb{Z}, \mathbb{Z}_2) \\
&\quad \oplus \mathrm{Hom}(\mathbb{Z}, \mathbb{Z}) \oplus \mathrm{Hom}(\mathbb{Z}, \mathbb{Z}_6) \oplus \mathrm{Hom}(\mathbb{Z}, \mathbb{Z}_2) \\
&\quad \oplus \mathrm{Hom}(\mathbb{Z}_{15}, \mathbb{Z}) \oplus \mathrm{Hom}(\mathbb{Z}_{15}, \mathbb{Z}_6) \oplus \mathrm{Hom}(\mathbb{Z}_{15}, \mathbb{Z}_2) \\
&\quad \oplus \mathrm{Hom}(\mathbb{Z}_3, \mathbb{Z}) \oplus \mathrm{Hom}(\mathbb{Z}_3, \mathbb{Z}_6) \oplus \mathrm{Hom}(\mathbb{Z}_3, \mathbb{Z}_2) \\
&\cong \mathbb{Z} \oplus \mathbb{Z}_6 \oplus \mathbb{Z}_2 \oplus \mathbb{Z} \oplus \mathbb{Z}_6 \oplus \mathbb{Z}_2 \oplus 0 \oplus \mathbb{Z}_3 \oplus 0 \oplus 0 \oplus \mathbb{Z}_3 \oplus 0 \\
&\cong \mathbb{Z}^2 \oplus (\mathbb{Z}_6)^2 \oplus (\mathbb{Z}_3)^2 \oplus (\mathbb{Z}_2)^2.
\end{align*}
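The four formulas reduce Hom of finitely generated abelian groups to bookkeeping with gcds. In the sketch below (my own encoding, not a standard library), a group is a list of cyclic summands with 0 standing for $\mathbb{Z}$ and $k > 1$ for $\mathbb{Z}_k$; the computation reproduces Example 8.1.3:

```python
from math import gcd

def hom_cyclic(m, n):
    """Hom of cyclic groups: Hom(Z, G) = G, Hom(Z_m, Z) = 0,
    Hom(Z_m, Z_n) = Z_gcd(m, n).  Returns None for the trivial group."""
    if m == 0:                   # Hom(Z, G) = G
        return n
    if n == 0:                   # Hom(Z_m, Z) = 0
        return None
    d = gcd(m, n)
    return d if d > 1 else None  # Z_1 is the trivial group

def hom(A, B):
    """Hom(A, B), summand by summand, for groups encoded as lists
    of cyclic orders (0 encodes Z)."""
    summands = (hom_cyclic(m, n) for m in A for n in B)
    return sorted(s for s in summands if s is not None)

# Example 8.1.3: Hom(Z + Z + Z_15 + Z_3, Z + Z_6 + Z_2)
result = hom([0, 0, 15, 3], [0, 6, 2])
```

The sorted encoding `[0, 0, 2, 2, 3, 3, 6, 6]` is $\mathbb{Z}^2 \oplus (\mathbb{Z}_2)^2 \oplus (\mathbb{Z}_3)^2 \oplus (\mathbb{Z}_6)^2$, agreeing with the example.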

8.1.2 Tensor Product


Just when you thought you could relax, the next topic is guaranteed to make you a lot tenser. (Or maybe tensor?)
I will give a description of the important operation, tensor products.
Before starting, I need to give you a mathematical word reuse warning. Tensor products are not the same thing
as the tensors used in physics. Those tensors are sort of like multidimensional matrices. You can think of matrices
and vectors as special cases. They do have interesting uses in data science and are an alternate method of analyzing
multivariate time series. For anything you would ever want to know about tensors and data science, start with the
paper of Tammy Kolda and Brett Bader of Sandia National Laboratory [85].
With that out of the way, tensor products are a binary operation on abelian groups, or more generally, on modules.
We would like a function f : A × B → C, for abelian groups A, B, C, that is linear when restricted to either variable.
In other words, we want f to be bilinear. This is what tensor products accomplish.
Definition 8.1.2. Let $A$ and $B$ be abelian groups. Let $F(A, B)$ be the free abelian group generated by the elements of $A \times B$. Let $R(A, B)$ be the subgroup generated by elements of the form
\[
(a_1 + a_2, b_1) - (a_1, b_1) - (a_2, b_1),
\]
\[
(a_1, b_1 + b_2) - (a_1, b_1) - (a_1, b_2),
\]
for $a_1, a_2 \in A$ and $b_1, b_2 \in B$. Define the tensor product, denoted $A \otimes B$, by
\[
A \otimes B = F(A, B)/R(A, B).
\]
The coset of the pair $(a, b)$ is denoted $a \otimes b$.

Our goal now is to list some properties of tensor product and show how to compute the tensor products of various abelian groups. Then we will compute the tensor product of the two groups from Example 8.1.3.
Given a function from A × B → C, where C is an abelian group, we have a unique homomorphism of F (A, B)
into C since the elements of A × B form a basis of F (A, B). This function is bilinear if and only if it maps the
subgroup R(A, B) to zero. So every homomorphism of A ⊗ B into C determines a bilinear function of A × B into C
and vice versa.
Now any element of F (A, B) is a finite linear combination of pairs (a, b) so any element of A ⊗ B is a finite linear
combination of elements of the form a ⊗ b. So a ⊗ b is not a typical element of A ⊗ B, but it is a typical generator.
Now the definition implies that
\[
(a_1 + a_2) \otimes b_1 = a_1 \otimes b_1 + a_2 \otimes b_1
\]
and
\[
a_1 \otimes (b_1 + b_2) = a_1 \otimes b_1 + a_1 \otimes b_2.
\]
So
\[
a \otimes b = (0 + a) \otimes b = 0 \otimes b + a \otimes b.
\]
This gives $0 \otimes b = 0$. Similar arguments show
\[
a \otimes 0 = 0,
\]
\[
(-a) \otimes b = -(a \otimes b) = a \otimes (-b),
\]
and
\[
(na) \otimes b = n(a \otimes b) = a \otimes (nb),
\]
where $n \in \mathbb{Z}$.
Definition 8.1.3. Let $f : A \to A'$ and $g : B \to B'$ be homomorphisms. There is a unique homomorphism
\[
f \otimes g : A \otimes B \to A' \otimes B'
\]
such that $(f \otimes g)(a \otimes b) = f(a) \otimes g(b)$ for all $a \in A$, $b \in B$. The map $f \otimes g$ is called the tensor product of $f$ and $g$.
Theorem 8.1.5. There is an isomorphism $\mathbb{Z} \otimes G \cong G$ mapping $n \otimes g$ to $ng$. It is natural with respect to homomorphisms of $G$.
Proof: The function mapping $\mathbb{Z} \times G$ to $G$ sending $(n, g)$ to $ng$ is bilinear, so it induces a homomorphism $\phi : \mathbb{Z} \otimes G \to G$ sending $n \otimes g$ to $ng$.
Let $\psi : G \to \mathbb{Z} \otimes G$ with $\psi(g) = 1 \otimes g$. For $g \in G$, $\phi\psi(g) = \phi(1 \otimes g) = g$. On a generator $n \otimes g$ of $\mathbb{Z} \otimes G$, $\psi(\phi(n \otimes g)) = \psi(ng) = 1 \otimes ng = n \otimes g$. So $\psi$ and $\phi$ are inverses of each other, and $\mathbb{Z} \otimes G \cong G$.
Naturality comes from the commutativity of the diagram
\[
\begin{array}{ccc}
\mathbb{Z} \otimes G & \xrightarrow{\ \cong\ } & G \\
{\scriptstyle i_{\mathbb{Z}} \otimes f}\big\downarrow & & \big\downarrow{\scriptstyle f} \\
\mathbb{Z} \otimes H & \xrightarrow{\ \cong\ } & H
\end{array}
\]
for any homomorphism $f : G \to H$. ■

Important note: In this chapter the term natural isomorphism or natural exact sequence will mean that maps will
satisfy a commutative diagram of the appropriate shape.
Example 8.1.4. Let $A'$ be a subgroup of $A$ and $B'$ be a subgroup of $B$. It is not necessarily true that $A' \otimes B'$ is a subgroup of $A \otimes B$. The reason is that the tensor product of the inclusion mappings is not necessarily injective. As an example, let $A = \mathbb{Q}$, the group of rational numbers under addition. Let $A' = \mathbb{Z}$, and $B = B' = \mathbb{Z}_2$. Then $\mathbb{Z} \otimes \mathbb{Z}_2 \cong \mathbb{Z}_2$ by our previous theorem, but $\mathbb{Q} \otimes \mathbb{Z}_2 = 0$ since in $\mathbb{Q} \otimes \mathbb{Z}_2$, $a \otimes b = (a/2) \otimes 2b = (a/2) \otimes 0 = 0$. Obviously, $\mathbb{Z}_2$ is not a subgroup of $0$. This shows that the tensor product of the inclusion maps $i : A' \to A$ and $j : B' \to B$ is not necessarily injective.
164 CHAPTER 8. COHOMOLOGY

Although the tensor product of injective maps may not be injective, the tensor product of surjective maps is always
surjective.
Theorem 8.1.6. Let ϕ : B → C and ϕ′ : B ′ → C ′ be surjective. Then ϕ ⊗ ϕ′ : B ⊗ B ′ → C ⊗ C ′ is surjective, and
its kernel is the subgroup of B ⊗ B ′ generated by all elements of the form b ⊗ b′ where b ∈ ker ϕ or b′ ∈ ker ϕ′ .
Idea of Proof: Let G be the subgroup of B ⊗B ′ generated by elements of the form b⊗b′ described in the statement
of the theorem. Then ϕ ⊗ ϕ′ maps G to zero, so it induces a homomorphism Φ : (B ⊗ B ′ )/G → C ⊗ C ′ . We are
done if Φ is an isomorphism, so we build an inverse for it. Let ψ : C × C ′ → (B ⊗ B ′ )/G by ψ(c, c′ ) = b ⊗ b′ + G
where b, b′ are chosen so that ϕ(b) = c and ϕ(b′ ) = c′ . Check that ψ is well defined and bilinear so that it defines a
homomorphism Ψ : C ⊗ C ′ → (B ⊗ B ′ )/G. Finally show that Φ and Ψ are inverses. ■
Now we look at what tensor product does to exact sequences.
Theorem 8.1.7. If
        ϕ       ψ
    A −→ B −→ C → 0
is exact, then
          ϕ⊗iG          ψ⊗iG
    A ⊗ G −−−→ B ⊗ G −−−→ C ⊗ G → 0
is exact. If ϕ is injective and the first sequence splits, then ϕ ⊗ iG is injective and the second sequence splits.
Proof: We know from the previous theorem that ψ ⊗ iG is surjective and that the kernel is the subgroup D of
B ⊗ G generated by elements of the form b ⊗ g with b ∈ ker ψ. The image of ϕ ⊗ iG is the subgroup E of B ⊗ G
generated by elements of the form ϕ(a) ⊗ g. Since ker ψ = image ϕ, D = E.
If ϕ is injective and the first sequence splits, let p : B → A be a homomorphism such that pϕ = iA . (Remember
this is the definition of the sequence splitting.) Then

(p ⊗ iG ) ◦ (ϕ ⊗ iG ) = iA ⊗ iG = iA⊗G .

So ϕ ⊗ iG is injective and the second sequence splits via p ⊗ iG . ■


Just as Ext filled out the missing piece of the exact sequence when Hom was applied, Tor does the same thing with
tensor product (under additional conditions). I will describe them both in detail in the next two subsections.
Theorem 8.1.8. There is a natural isomorphism

Zm ⊗ G ∼
= G/mG

.
Proof:
Consider the exact sequence
m
0 → Z −→ Z → Zm → 0.
(Remember that m above the arrow means multiplication by m.) Now tensor it with G to get the exact sequence
          m⊗iG
    Z ⊗ G −−−→ Z ⊗ G → Zm ⊗ G → 0.

Since Z ⊗ G ∼= G, this becomes
        m
    G −→ G → Zm ⊗ G → 0.
This proves the result. ■
The next two results will allow us to compute the tensor product of any two finitely generated abelian groups. The
proofs are lengthier, so the reader is referred to Munkres [110].

Theorem 8.1.9. We have the following natural isomorphisms:

1. A ⊗ B ∼= B ⊗ A. (Recall that this is not true for Hom. Hom(A, B) is generally not isomorphic to Hom(B, A).)

2. (⊕ni=1 Ai ) ⊗ B ∼= ⊕ni=1 (Ai ⊗ B), and A ⊗ (⊕ni=1 Bi ) ∼= ⊕ni=1 (A ⊗ Bi ).

3. A ⊗ (B ⊗ C) ∼= (A ⊗ B) ⊗ C.
Recall that an additive abelian group is torsion free if there are no elements of finite order. Note that a torsion free
finitely generated abelian group is free.
Theorem 8.1.10. If
0→A→B→C→0
is exact and G is torsion free, then

0→A⊗G→B⊗G→C ⊗G→0

is exact.
The proof is easy in the case where G is free, which will always be the case if G is finitely generated and torsion free:
since D ⊗ Z ∼= D for any abelian group D and direct sums of exact sequences are exact, exactness of the second
sequence follows from exactness of the first. See [110] for the proof in the more general case.
An easy consequence is the following:
Theorem 8.1.11. If A is free abelian with basis {ai }, and B is free abelian with basis {bj } then A ⊗ B is free abelian
with basis {ai ⊗ bj }.
Now we can take the tensor products of any two finitely generated abelian groups. To collect what we know:

    Z ⊗ G ∼= G ⊗ Z ∼= G,

    Zm ⊗ G ∼= G ⊗ Zm ∼= G/mG.

These statements imply that
    Zm ⊗ Z ∼= Z ⊗ Zm ∼= Z/mZ = Zm ,
and if d = gcd(m, n), then
    Zm ⊗ Zn ∼= Zd .
Example 8.1.5. Returning to our running example, let A = Z ⊕ Z ⊕ Z15 ⊕ Z3 , and B = Z ⊕ Z6 ⊕ Z2 . Find A ⊗ B.

A ⊗ B ∼= (Z ⊕ Z ⊕ Z15 ⊕ Z3 ) ⊗ (Z ⊕ Z6 ⊕ Z2 )
      ∼= (Z ⊗ Z) ⊕ (Z ⊗ Z6 ) ⊕ (Z ⊗ Z2 ) ⊕
         (Z ⊗ Z) ⊕ (Z ⊗ Z6 ) ⊕ (Z ⊗ Z2 ) ⊕
         (Z15 ⊗ Z) ⊕ (Z15 ⊗ Z6 ) ⊕ (Z15 ⊗ Z2 ) ⊕
         (Z3 ⊗ Z) ⊕ (Z3 ⊗ Z6 ) ⊕ (Z3 ⊗ Z2 )
      ∼= Z ⊕ Z6 ⊕ Z2 ⊕ Z ⊕ Z6 ⊕ Z2 ⊕ Z15 ⊕ Z3 ⊕ 0 ⊕ Z3 ⊕ Z3 ⊕ 0
      ∼= Z² ⊕ Z15 ⊕ (Z6 )² ⊕ (Z3 )³ ⊕ (Z2 )².

Finally, tensor products provide a new definition of homology with coefficients in an arbitrary abelian group G.
For applications, the most common situation is G = Z2 .
Definition 8.1.4. Let G be an abelian group, and C = {Cp , ∂} be a chain complex. Then the p-th homology group
of the chain complex C ⊗ G = {Cp ⊗ G, ∂ ⊗ iG } is denoted Hp (C; G) and called the p-th homology group of C with
coefficients in G.
Let’s look at the simplicial homology case. How does this definition compare with the one we gave in Section
4.1.3? The group Cp (K) ⊗ G is the direct sum of copies of G, one for each simplex of K. Each element of Cp (K) ⊗ G
can be represented as a finite sum Σ σi ⊗ gi . Its boundary can be represented as Σ (∂σi ) ⊗ gi . In Section 4.1.3 we
represented the chain cp with coefficients in G by a finite formal sum cp = Σ gi σi with boundary ∂cp = Σ gi (∂σi ).
It should be clear that the two chain complexes are isomorphic, but for the theory, C(K) ⊗ G is more convenient. We
will use this definition for the rest of the chapter.
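In practice (e.g. for the G = Z2 case common in applications), homology with Z2 coefficients is just linear algebra over the field with two elements. Here is a minimal sketch, not from the text, with my own matrix conventions (rows of ∂1 indexed by vertices, columns by edges), computing H0 and H1 of the boundary of a triangle, a simplicial circle:

```python
import numpy as np

def rank_mod2(M):
    """Rank of an integer matrix over the field Z_2, by Gaussian elimination."""
    M = np.array(M, dtype=np.int64) % 2
    rank = 0
    for c in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]           # swap pivot row up
        for r in range(M.shape[0]):
            if r != rank and M[r, c]:
                M[r] = (M[r] + M[rank]) % 2           # clear the column
        rank += 1
    return rank

# Hollow triangle: vertices a, b, c; edges ab, bc, ca.  Over Z_2 the boundary
# of an edge is the sum of its endpoints (orientation signs vanish mod 2).
d1 = [[1, 0, 1],    # a appears in ab and ca
      [1, 1, 0],    # b appears in ab and bc
      [0, 1, 1]]    # c appears in bc and ca
r1 = rank_mod2(d1)
dim_H0 = 3 - r1     # dim C_0 - rank(d1): one connected component
dim_H1 = 3 - r1     # dim ker(d1): one loop (no 2-simplices, so im(d2) = 0)
print(dim_H0, dim_H1)   # 1 1
```

This recovers the familiar Betti numbers of a circle, here as dimensions over Z2 .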

8.1.3 Ext
This topic is Extremely interesting. In fact it is an example of Extreme algebraic topology.
Actually, Ext is an abbreviation of the word extension. It is the missing piece in taking a short exact sequence
and applying the Hom functor. Ext will be a crucial piece in converting homology to cohomology using the Universal
Coefficient Theorem that I will describe in Section 8.4.
Ext(A, B) is a functor like Hom(A, B) taking two abelian groups and returning a third abelian group. It is
contravariant in the first variable and covariant in the second. This means that given homomorphisms γ : A → A′ and
δ : B ′ → B, there is a homomorphism Ext(γ, δ) : Ext(A′ , B ′ ) → Ext(A, B).
Considering Ext as a black box for now, the following theorem describes its main properties. We need a definition
first.
Definition 8.1.5. Let A be an abelian group. A free resolution of A is a short exact sequence

0→R→F →A→0

such that F and R are free. Any abelian group A has a free resolution

0 → R(A) → F (A) → A → 0,

where F (A) is the free abelian group generated by the elements of A, and R(A) is the kernel of the projection of
F (A) into A. (In other words, R(A) is generated by the relations of A.) This resolution is called the canonical free
resolution of A.
Theorem 8.1.12. There is a function that assigns to each free resolution
        ϕ       ψ
    0 → R −→ F −→ A → 0

of the abelian group A, and to each abelian group B, an exact sequence

                   π               ϕ̃               ψ̃
    0 ← Ext(A, B) ←− Hom(R, B) ←− Hom(F, B) ←− Hom(A, B) ← 0.

This function is natural in the sense that a homomorphism

    0 −→ R −→ F −→ A −→ 0
         │α       │β      │γ
         ↓        ↓       ↓
    0 −→ R′ −→ F′ −→ A′ −→ 0

of free resolutions and a homomorphism δ : B ′ → B of abelian groups gives rise to a homomorphism of exact
sequences:

    0 ← Ext(A, B) ← Hom(R, B) ← Hom(F, B) ← Hom(A, B) ← 0
        ↑Ext(γ,δ)    ↑Hom(α,δ)   ↑Hom(β,δ)   ↑Hom(γ,δ)
    0 ← Ext(A′ , B ′ ) ← Hom(R′ , B ′ ) ← Hom(F ′ , B ′ ) ← Hom(A′ , B ′ ) ← 0.

Now the group Ext(A, B) is the cokernel of ϕ̃ by exactness. It turns out that the map Ext(γ, δ) does not depend
on α or β at all. The proof depends on some cohomology theory that I will skip but is explained in Sections 46 and
52 of [110]. Its definition is derived from the commutativity of the two right squares in the diagram at the end of
the previous theorem. The two right squares commute by functoriality of Hom, so the left square commutes. We say
that Ext(γ, δ) is the homomorphism induced by γ and δ. Munkres also shows that if γ = iA and δ = iB , then
Ext(iA , iB ) is an isomorphism even if we choose two different free resolutions of A. So let
            ϕ
    0 → R(A) −→ F (A) → A → 0

be the canonical free resolution of A. Then

Ext(A, B) = cok(ϕ̃) = Hom(R(A), B)/ϕ̃(Hom(F (A), B)).

We will use this as our definition.


Now if γ : A → A′ and δ : B ′ → B, then if we extend γ to a homomorphism of the canonical free resolution of
A to that of A′ , then we define Ext(γ, δ) : Ext(A′ , B ′ ) → Ext(A, B) to be the homomorphism induced by γ and δ
relative to these free resolutions. So Ext is a functor of two variables which is contravariant in the first variable and
covariant in the second. The name Ext(A, B) is short for the group of extensions of B by A.
To compute Ext(A, B) when A and B are finitely generated abelian groups, use the following result. (Assume
only finitely many terms so that we can use direct sums.)
Theorem 8.1.13. 1. There are natural isomorphisms

    Ext(⊕Ai , B) ∼= ⊕Ext(Ai , B),
and
    Ext(A, ⊕Bi ) ∼= ⊕Ext(A, Bi ).

2. Ext(A, B) = 0 if A is free.
3. Given B, there is an exact sequence
                         m
    0 ← Ext(Zm , B) ← B ←− B ← Hom(Zm , B) ← 0.

Proof: Part 1 follows from the similar result for Hom. Taking direct sums of the free resolutions results in isomorphisms
of the three right-hand terms and thus isomorphisms of the left terms.
For part 2, a free resolution of A splits if A is free. So taking Hom leaves the sequence exact and Ext(A, B) = 0.
                                                m
For part 3, start with the free resolution 0 → Z −→ Z → Zm → 0. Applying Theorem 8.1.12 gives the sequence
                                      m
    0 ← Ext(Zm , B) ← Hom(Z, B) ←− Hom(Z, B) ← Hom(Zm , B) ← 0.

Using the fact that Hom(Z, B) ∼= B, we have the result. ■


The theorem implies that Ext(Z, G) = 0 and Ext(Zm , G) ∼= G/mG. As a consequence, Ext(Zm , Z) ∼= Zm and,
if d = gcd(m, n), Ext(Zm , Zn ) ∼= Zd .

Example 8.1.6. Returning to our running example, let A = Z ⊕ Z ⊕ Z15 ⊕ Z3 , and B = Z ⊕ Z6 ⊕ Z2 . Find Ext(A, B).

Ext(A, B) ∼= Ext(Z ⊕ Z ⊕ Z15 ⊕ Z3 , Z ⊕ Z6 ⊕ Z2 )
          ∼= Ext(Z, Z) ⊕ Ext(Z, Z6 ) ⊕ Ext(Z, Z2 ) ⊕
             Ext(Z, Z) ⊕ Ext(Z, Z6 ) ⊕ Ext(Z, Z2 ) ⊕
             Ext(Z15 , Z) ⊕ Ext(Z15 , Z6 ) ⊕ Ext(Z15 , Z2 ) ⊕
             Ext(Z3 , Z) ⊕ Ext(Z3 , Z6 ) ⊕ Ext(Z3 , Z2 )
          ∼= 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ Z15 ⊕ Z3 ⊕ 0 ⊕ Z3 ⊕ Z3 ⊕ 0
          ∼= Z15 ⊕ (Z3 )³.
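As with the tensor product, these Ext rules mechanize directly. A sketch in the same spirit (my own encoding, not from the text: 0 stands for Z, 1 for the trivial group), checking the computation above:

```python
from math import gcd

def ext_cyclic(m, n):
    """Ext(Z_m, Z_n) as a cyclic order; 0 stands for Z, 1 for the trivial group.

    Rules from the text: Ext(Z, G) = 0 (Z is free), Ext(Z_m, Z) = Z_m,
    and Ext(Z_m, Z_n) = Z_gcd(m, n)."""
    if m == 0:
        return 1
    if n == 0:
        return m
    return gcd(m, n)

def ext(A, B):
    """Ext of direct sums of cyclic groups, using Theorem 8.1.13 part 1."""
    return sorted(d for a in A for b in B if (d := ext_cyclic(a, b)) != 1)

# Running example: Ext(Z + Z + Z_15 + Z_3, Z + Z_6 + Z_2) = Z_15 + (Z_3)^3.
print(ext([0, 0, 15, 3], [0, 6, 2]))   # [3, 3, 3, 15]
```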

8.1.4 Tor
Tor, which stands for ”the onion router”, is free and open source software for anonymous communication. Traffic is
routed through publicly available relays and encrypts the destination IP address. Tor hidden services allow customers
to access a server anonymously and are often used for illegal activities. This subject, while interesting, is NOT the
topic of this section.
The other Tor is short for torsion product and is the last functor we will need. Meanwhile, what do you call a group
of mathematicians who give you anonymous help on your homological algebra? A Tor hidden service.
Tor will play a similar role to Ext where we will apply tensor product rather than Hom to an exact sequence. Ext
and Tor are sometimes called derived functors as Ext is derived from Hom and Tor is derived from tensor product. Tor
is a functor that assigns to an ordered pair A, B of abelian groups an abelian group Tor(A, B) and to an ordered pair
of homomorphisms γ : A → A′ and δ : B → B ′ , a homomorphism Tor(γ, δ) : Tor(A, B) → Tor(A′ , B ′ ). It is
covariant in both variables.
Note: Munkres uses the notation A ∗ B rather than Tor(A, B). I prefer the latter notation, which appears in the
more general treatment of Mac Lane [92].
As the constructions are analogous to those of Ext, I will keep this section brief and refer the reader to Section 54
of [110] for the proofs.
Theorem 8.1.14. There is a function that assigns to each free resolution
        ϕ       ψ
    0 → R −→ F −→ A → 0
of the abelian group A, and to each abelian group B, an exact sequence
                   π⊗iB           ϕ⊗iB           ψ⊗iB
    0 → Tor(A, B) −−−→ R ⊗ B −−−→ F ⊗ B −−−→ A ⊗ B → 0.
The function is natural in the sense that a homomorphism of a free resolution of A to a free resolution of A′ and a
homomorphism of B to B ′ induce a homomorphism of the tensor product exact sequences.
Definition 8.1.6. Let A be an abelian group and let
            ϕ
    0 → R(A) −→ F (A) → A → 0
be the canonical free resolution of A. The group ker(ϕ ⊗ iB ) is denoted Tor(A, B) and called the torsion product of
A and B. If γ : A → A′ and δ : B → B ′ are homomorphisms, then extend γ to a homomorphism of canonical free
resolutions and define Tor(γ, δ) : Tor(A, B) → Tor(A′ , B ′ ) to be the homomorphism induced by γ and δ relative
to these free resolutions.

Theorem 8.1.15. 1. There is a natural isomorphism Tor(A, B) ∼= Tor(B, A).

2. There are natural isomorphisms
    Tor(⊕Ai , B) ∼= ⊕Tor(Ai , B),
and
    Tor(A, ⊕Bi ) ∼= ⊕Tor(A, Bi ).

3. Tor(A, B) = 0 if A or B is torsion free.

4. Given B, there is an exact sequence
                        m
    0 → Tor(Zm , B) → B −→ B → Zm ⊗ B → 0.

This theorem implies that Tor(Z, G) = 0 and Tor(Zm , G) ∼= ker(G −m→ G). The latter gives Tor(Zm , Z) = 0
and for d = gcd(m, n), Tor(Zm , Zn ) ∼= Zd .
Example 8.1.7. Returning to our running example, let A = Z ⊕ Z ⊕ Z15 ⊕ Z3 , and B = Z ⊕ Z6 ⊕ Z2 . Find Tor(A, B).

Tor(A, B) ∼= Tor(Z ⊕ Z ⊕ Z15 ⊕ Z3 , Z ⊕ Z6 ⊕ Z2 )
          ∼= Tor(Z, Z) ⊕ Tor(Z, Z6 ) ⊕ Tor(Z, Z2 ) ⊕
             Tor(Z, Z) ⊕ Tor(Z, Z6 ) ⊕ Tor(Z, Z2 ) ⊕
             Tor(Z15 , Z) ⊕ Tor(Z15 , Z6 ) ⊕ Tor(Z15 , Z2 ) ⊕
             Tor(Z3 , Z) ⊕ Tor(Z3 , Z6 ) ⊕ Tor(Z3 , Z2 )
          ∼= 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ Z3 ⊕ 0 ⊕ 0 ⊕ Z3 ⊕ 0
          ∼= (Z3 )².

The following table from [110] summarizes all four functors and allows us to compute them on any pair of finitely
generated abelian groups:

    Z ⊗ G ∼= G                          Tor(Z, G) = 0
    Hom(Z, G) ∼= G                      Ext(Z, G) = 0
    Zm ⊗ G ∼= G/mG                      Tor(Zm , G) ∼= ker(G −m→ G)
    Hom(Zm , G) ∼= ker(G −m→ G)         Ext(Zm , G) ∼= G/mG
    Zm ⊗ Z ∼= Zm                        Tor(Zm , Z) = 0
    Hom(Zm , Z) = 0                     Ext(Zm , Z) ∼= Zm

If d = gcd(m, n), then

    Zm ⊗ Zn ∼= Tor(Zm , Zn ) ∼= Hom(Zm , Zn ) ∼= Ext(Zm , Zn ) ∼= Zd .
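The whole table fits in one small dispatcher. The sketch below uses my own encoding (0 stands for Z, 1 for the trivial group; the function name is hypothetical) and simply restates the table's entries:

```python
from math import gcd

def cyclic_functor(F, m, n):
    """F(Z_m, Z_n) as a cyclic order, for F in {"tensor", "Hom", "Tor", "Ext"}.

    Encodes the summary table: on a pair of finite cyclic groups all four
    functors give Z_gcd(m, n); the cases involving Z differ as listed."""
    if m == 0 and n == 0:
        return {"tensor": 0, "Hom": 0, "Tor": 1, "Ext": 1}[F]
    if m == 0:                       # Z against Z_n
        return {"tensor": n, "Hom": n, "Tor": 1, "Ext": 1}[F]
    if n == 0:                       # Z_m against Z
        return {"tensor": m, "Hom": 1, "Tor": 1, "Ext": m}[F]
    return gcd(m, n)                 # Z_m against Z_n

# On finite cyclic groups all four functors agree:
print([cyclic_functor(F, 15, 6) for F in ("tensor", "Hom", "Tor", "Ext")])
# [3, 3, 3, 3]
```

Combined with the direct-sum rules of Theorems 8.1.9, 8.1.13, and 8.1.15, this is enough to evaluate any of the four functors on finitely generated abelian groups.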
I will conclude this section with some comments on the more general case where A and B are modules over a ring
R. Let a ∈ A, b ∈ B and r ∈ R. An element f ∈ HomR (A, B) has the additional condition that f (ra) = rf (a). For
tensor products, we write A ⊗R B and have the additional condition

r(a ⊗ b) = (ra) ⊗ b = a ⊗ (rb).



For Ext and Tor, things become more complicated. Let A be an R-module and let F0 be the free module generated
by the elements of A. Let ϕ : F0 → A be the natural epimorphism. What changes here is that a submodule of a free
module is not necessarily free. So we don’t necessarily have an exact sequence 0 → R → F → A → 0 where R and F
are both free. Instead, let F1 be the free R-module generated by the elements of ker(ϕ). This gives an exact sequence
F1 → F0 → A → 0 where F1 and F0 are free. Continuing gives a free resolution of the R-module A of the form
             ϕk                    ϕ1
    · · · → Fk −→ Fk−1 → · · · −→ F0 → A → 0,
where each Fi is free. Applying the functor HomR , we get the sequence
          ϕ̃2                  ϕ̃1                  ϕ̃
    · · · ←− HomR (F1 , G) ←− HomR (F0 , G) ←− HomR (A, G) ← 0.
Exactness only holds for the two right-hand terms. For n ≥ 1, ExtnR (A, G) = ker(ϕ̃n+1 )/im(ϕ̃n ). This gives an entire
sequence of Ext terms. If R = Z, Ext1 is just the Ext we defined earlier, and Extn (A, G) = 0 for n > 1. We can
define TornR (A, G) in an analogous way. See [92] for more details.

8.2 Definition of Cohomology


We are now ready to define cohomology. With what you know now, it will be pretty easy. Think of cohomology
as a dual of homology. Boundary maps that decrease the dimension are replaced with coboundary maps that increase
the dimension. Here are the specifics:
Definition 8.2.1. Let K be a simplicial complex and G an abelian group. The group of p-cochains of K with coeffi-
cients in G is the group
C p (K; G) = Hom(Cp (K), G).
Now let f : Cp (K) → G be a p-cochain. Then we can define a homomorphism δ(f ) : Cp+1 (K) → G by δ(f ) = f ◦∂,
where ∂ : Cp+1 (K) → Cp (K) is the boundary map. So δ(f ) ∈ C p+1 (K; G). We have then defined a map δ :
C p (K; G) → C p+1 (K; G) called the coboundary. Then the group ker(δ : C p (K; G) → C p+1 (K; G)) = Z p (K; G)
is called the group of p-cocycles. The group im(δ : C p−1 (K; G) → C p (K; G)) = B p (K; G) is called the group of
p-coboundaries. Since δ(f ) = f ◦ ∂, ∂ 2 = 0 implies that δ 2 = 0, so any coboundary is a cocycle. Then we define
the p-cohomology group as
    H p (K; G) = Z p (K; G)/B p (K; G).
We will write H p (K; G) as H p (K) when G = Z.
It is convenient to use the notation ⟨cp , cp ⟩ to denote the element of G representing the value of the cochain cp on
the chain cp . Then we have ⟨δcp , cp+1 ⟩ = ⟨cp , ∂cp+1 ⟩.
We still need to define 0-dimensional cohomology.
Theorem 8.2.1. Let K be a simplicial complex. Then H 0 (K; G) is the group of 0-cochains c0 that are constant
on the vertices of each connected component of |K|. If |K| is connected, then H 0 (K; G) ∼= G, and if
G = Z, the group is generated by the cochain which has the value 1 on each vertex.
Proof: H 0 (K; G) = Z 0 (K; G) as there are no coboundaries in dimension 0. If v and w are in the same component
of K, then there is a 1-chain c1 with ∂c1 = v − w. So for any cocycle c0 ,
0 = ⟨δc0 , c1 ⟩ = ⟨c0 , ∂c1 ⟩ = ⟨c0 , v⟩ − ⟨c0 , w⟩.
On the other hand, if c0 is a cochain such that ⟨c0 , v⟩ − ⟨c0 , w⟩ = 0 whenever v and w lie in the same component of
|K|, then for each oriented 1-simplex σ of K,
⟨δc0 , σ⟩ = ⟨c0 , ∂σ⟩ = 0.
Then we have that δc0 = 0, so c0 is a cocycle and we are done. ■
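Computationally, δ(f ) = f ◦ ∂ means that over a field of coefficients such as Z2 the matrix of δ is just the transpose of the matrix of ∂. A minimal sketch (my own matrix conventions, not from the text) for the hollow triangle; it confirms that H 0 is one-dimensional for this connected complex:

```python
import numpy as np

def rank_mod2(M):
    """Rank of an integer matrix over Z_2, by Gaussian elimination."""
    M = np.array(M, dtype=np.int64) % 2
    rank = 0
    for c in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, c]:
                M[r] = (M[r] + M[rank]) % 2
        rank += 1
    return rank

# Hollow triangle (vertices a, b, c; edges ab, bc, ca): partial_1 over Z_2.
d1 = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1]])
delta0 = d1.T            # delta(f) = f o partial: transpose the boundary matrix
r = rank_mod2(delta0)    # same as the rank of d1
dim_H0 = 3 - r           # ker(delta0): cochains constant on each component
dim_H1 = 3 - r           # C^1/im(delta0): no 2-simplices, so every 1-cochain is a cocycle
print(dim_H0, dim_H1)    # 1 1
```

Note that the cohomology dimensions over Z2 match the homology dimensions here, a pattern Theorem 8.2.5 and the Universal Coefficient Theorem will explain.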
Next we describe reduced and relative cohomology.

Definition 8.2.2. Given a complex K, let ϵ be the augmentation map. The diagram
             ∂1          ϵ
    C1 (K) −→ C0 (K) −→ Z
dualizes to the diagram
                 δ1              ϵ̃
    C 1 (K; G) ←− C 0 (K; G) ←− Z.
The homomorphism ϵ̃ is called a coaugmentation. It is injective and δ1 ◦ ϵ̃ = 0. Define the reduced cohomology of K
by setting H̃ p (K; G) = H p (K; G) for p > 0 and

    H̃ 0 (K; G) = ker δ1 /im ϵ̃.

Theorem 8.2.2. If K is connected, then H̃ 0 (K; G) = 0. In general,

    H 0 (K; G) ∼= H̃ 0 (K; G) ⊕ G.
Definition 8.2.3. Let K be a complex and K0 be a subcomplex of K. The group of relative cochains in dimension p
is defined as
C p (K, K0 ; G) = Hom(Cp (K, K0 ), G).
The relative coboundary operator δ is the dual of the relative boundary operator. Then the group ker(δ : C p (K, K0 ; G) →
C p+1 (K, K0 ; G)) = Z p (K, K0 ; G) is called the group of relative p-cocycles. The group im(δ : C p−1 (K, K0 ; G) →
C p (K, K0 ; G)) = B p (K, K0 ; G) is called the group of relative p-coboundaries. Then the relative cohomology group
is
H p (K, K0 ; G) = Z p (K, K0 ; G)/B p (K, K0 ; G).
The dual of the long exact homology sequence of the pair (K, K0 ) leads to a long exact cohomology sequence
going the other direction.
Theorem 8.2.3. Let K be a complex and K0 be a subcomplex. There exists an exact sequence
                                                        δ
    · · · ← H p (K0 ; G) ← H p (K; G) ← H p (K, K0 ; G) ←− H p−1 (K0 ; G) ← · · · .
− H p−1 (K0 ; G) ← · · · .

A similar sequence exists in reduced cohomology if K0 ̸= ∅. A simplicial map f : (K, K0 ) → (L, L0 ) induces a
homomorphism of long exact cohomology sequences.
Like homology theory, cohomology theory has its own set of Eilenberg-Steenrod Axioms. To describe them, we
need the idea of a cochain complex.
Let C = {Cp , ∂} be a chain complex and G an abelian group. Then the p-cochain group of C with coefficients in
G is C p (C; G) = Hom(Cp , G). Letting δ be the dual of ∂, we can define the cohomology groups as before.
Definition 8.2.4. Let C = {Cp , ∂} and C ′ = {Cp′ , ∂ ′ } be chain complexes. Suppose ϕ : C → C ′ is a chain map.
(Recall that this means that ∂ ′ ϕ = ϕ∂.) Then the dual homomorphism
                   ϕ̃
    C p (C; G) ←− C p (C ′ ; G)
commutes with δ and is called a cochain map. It carries cocycles to cocycles and coboundaries to coboundaries so it
induces a homomorphism ϕ∗ : H p (C ′ ; G) → H p (C; G) on cohomology groups.
Now we can state the Eilenberg-Steenrod Axioms for Cohomology. See Section 4.2.3 to compare these to the
axioms for homology.
Given an admissible class A of pairs of spaces (X, A) and an abelian group G, a cohomology theory on A with
coefficients in G consists of the following:

1. A function defined for each integer p and each pair (X, A) in A whose value is an abelian group H p (X, A; G).
2. A function that assigns to each continuous map h : (X, A) → (Y, B) and each integer p a homomorphism
                     h∗
    H p (X, A; G) ←− H p (Y, B; G).
3. A function that assigns to each pair (X, A) in A and each integer p a homomorphism
                     δ∗
    H p (X, A; G) ←− H p−1 (A; G).
These functions satisfy the following axioms where all pairs of spaces are in A:
− Axiom 1: If i is the identity, then i∗ is the identity.
− Axiom 2: (kh)∗ = h∗ k ∗ .
− Axiom 3: If f : (X, A) → (Y, B), then the following diagram commutes:

                          f∗
    H p (X, A; G) ←−−−−− H p (Y, B; G)
         ↑ δ∗                  ↑ δ∗
                        (f |A)∗
    H p−1 (A; G) ←−−−−− H p−1 (B; G)

− Axiom 4: (Exactness Axiom) The sequence

                  i∗              π∗                  δ∗
    · · · ← H p (A; G) ←− H p (X; G) ←− H p (X, A; G) ←− H p−1 (A; G) ← · · ·

is exact, where i : A → X and π : X → (X, A) are inclusion maps.


− Axiom 5: (Homotopy Axiom) If h, k : (X, A) → (Y, B) are homotopic then h∗ = k ∗ .
− Axiom 6: (Excision Axiom) Given (X, A), let U be an open subset of X such that Ū ⊂ A◦ (i.e. the closure of U
is contained in the interior of A). Then if (X − U, A − U ) is admissible, then inclusion induces an isomorphism

    H p (X − U, A − U ; G) ∼= H p (X, A; G).

− Axiom 7: (Dimension Axiom) If P is a one-point space, then H p (P ; G) = 0 for p ̸= 0 and H 0 (P ; G) ∼= G.
Note that there is no equivalent to the axiom of compact support for cohomology theory. Also, the absence of
axiom 7 leads to an extraordinary cohomology theory.
As with homology, we can define singular and cellular cohomology theories. To define them, we simply modify the
definition of cochains in Definition 8.2.1 to have Cp (K) be the group of singular or cellular chains. For a triangulable
space, simplicial, singular, and cellular cohomology all coincide.
Theorem 8.2.4. Let n > 0. Then
1. H i (S n ; G) ∼= G for i = 0, n,
2. H i (B n , S n−1 ; G) ∼= G for i = n.
These groups are zero for all other values of i.


Proof: The first statement easily follows from the cellular chain complex of S n and the fact that Hom(Z, G) ∼= G.
The second statement comes from the long exact sequence of the pair (B n , S n−1 ) in cohomology and the fact that B n
is acyclic. ■
Now recall that if X is a CW complex, let

    Dp (X) = Hp (X p , X p−1 ).

Let ∂ : Dp (X) → Dp−1 (X) be the composite
                         ∂∗                     j∗
    Hp (X p , X p−1 ) −→ Hp−1 (X p−1 ) −→ Hp−1 (X p−1 , X p−2 ),
where j is inclusion.
Example 8.2.1. Let T be the torus and K be the Klein bottle. Find their cohomology with coefficients in the group
G.
For both of them the cellular chain complex is of the form
                  ∂2            ∂1
    · · · → 0 → Z −→ Z ⊕ Z −→ Z → 0.

Let γ generate D2 (T ) and w1 and z1 be a basis for D1 (T ). For the torus, both ∂2 and ∂1 are zero so from the dual
sequence, we get
    H 2 (T ; G) ∼= G, H 1 (T ; G) ∼= G ⊕ G, H 0 (T ; G) ∼= G.
For the Klein bottle K, ∂1 = 0, and we can choose a basis w1 and z1 for D1 (K) such that ∂2 γ = 2z1 . The dual
sequence is of the form
                     δ2             δ1
    · · · ← 0 ← G ←− G ⊕ G ←− G ← 0.
Now Hom(D1 (K), G) ∼= G ⊕ G where the first summand represents homomorphisms ϕ : D1 (K) → G that vanish
on z1 and the second summand represents homomorphisms ψ : D1 (K) → G that vanish on w1 . Since ∂1 = 0, its
dual is δ1 = 0. To compute δ2 , we have that

⟨δ2 ϕ, γ⟩ = ⟨ϕ, ∂2 γ⟩ = ⟨ϕ, 2z1 ⟩ = 2⟨ϕ, z1 ⟩ = 0,

⟨δ2 ψ, γ⟩ = ⟨ψ, ∂2 γ⟩ = ⟨ψ, 2z1 ⟩ = 2⟨ψ, z1 ⟩.


So δ2 takes the first summand to 0 and equals multiplication by 2 on the second summand. This gives

    H 2 (K; G) ∼= G/2G, H 1 (K; G) ∼= G ⊕ ker(G −2→ G), H 0 (K; G) ∼= G.

If G = Z, this becomes
    H 2 (K) ∼= Z2 , H 1 (K) ∼= Z, H 0 (K) ∼= Z.
Note that for the torus, the homology and cohomology groups are the same, but for the Klein bottle, they are very
different. (Recall that H1 (K) ∼= Z ⊕ Z2 and H2 (K) = 0.)
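With G = Z2 , the multiplication-by-2 map in the computation above becomes zero, so the Klein bottle's Z2 -cohomology can be checked in a few lines. This is an illustrative sketch with my own matrix conventions, following the basis γ, w1 , z1 from the example:

```python
# Cellular cochain complex of the Klein bottle with Z_2 coefficients.
# From the example: partial_2(gamma) = 2*z_1 and partial_1 = 0, so delta_2 is
# the transpose of the column (0, 2) reduced mod 2, and delta_1 = 0.
d2_column = [0, 2]                    # partial_2 in the basis (w_1, z_1)
delta2 = [x % 2 for x in d2_column]   # row matrix of delta_2 : C^1 -> C^2
rank_delta2 = int(any(delta2))        # rank of a single-row 0/1 matrix: here 0

dim_H2 = 1 - rank_delta2              # dim C^2 - rank(delta_2)
dim_H1 = 2 - rank_delta2              # dim ker(delta_2), since im(delta_1) = 0
dim_H0 = 1                            # ker(delta_1) = C^0: one 0-cell
print(dim_H0, dim_H1, dim_H2)         # 1 2 1
```

These dimensions agree with the general formulas evaluated at G = Z2 : G/2G = Z2 and G ⊕ ker(G → G) = Z2 ⊕ Z2 .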
I will conclude this section with a result from Munkres [110] that states that if two spaces have the same homology
groups then they have the same cohomology groups. A free chain complex is one for which the chain group Cp is free
for all p. The simplicial, singular, and cellular chain complexes are all free. Also, recall that a chain map is a map that
commutes with boundaries.
Theorem 8.2.5. Let C and D be free chain complexes and let ϕ : C → D be a chain map. If ϕ induces isomorphisms in
homology in all dimensions, then it induces isomorphisms in cohomology in all dimensions. Thus, if Hp (C) ∼= Hp (D)
for all p, then H p (C; G) ∼= H p (D; G) for all p and G.

This seems like bad news. Cohomology would be no better for classification than homology. As we will see later,
the Universal Coefficient Theorem allows us to compute cohomology groups from homology groups and vice versa.
So why do we bother with cohomology at all? As I have already mentioned, cohomology has an additional ring
structure that homology lacks. The next section will define cup products that turn cohomology into a ring. So why
isn’t homology also a ring? I will answer that question in Section 8.5.

8.3 Cup Products


I will follow the approach in [110] of defining cup products for singular cohomology. Then the isomorphism with
simplicial and cellular cohomology will define it for those cases.
In what follows, R will always denote a commutative ring with unit element. Typically, it will be either Z or Zn .
Definition 8.3.1. Let X be a topological space. Let S p (X; R) = Hom(Sp (X), R) denote the group of singular
p-cochains of X with coefficients in R. (After all, a ring is always an abelian group under addition.) Define a map

    S p (X; R) × S q (X; R) −→ S p+q (X; R)

as follows: If T : ∆p+q → X is a singular p + q-simplex then

⟨cp ∪ cq , T ⟩ = ⟨cp , T ◦ l(e0 , · · · , ep )⟩ · ⟨cq , T ◦ l(ep , · · · , ep+q )⟩.

(If you don’t remember what this notation means, see the beginning of Section 4.3.1.)
The cochain cp ∪ cq is called the cup product of the cochains cp and cq .
We can think of T ◦ l(e0 , · · · , ep ) as the front p-face of ∆p+q and T ◦ l(ep , · · · , ep+q ) as its back q-face. The
multiplication is the multiplication in R.
Theorem 8.3.1. The cup product of cochains is bilinear and associative. The cochain z 0 whose value is 1 on each
singular 0-simplex is the unit element. We also have the following coboundary formula:

δ(cp ∪ cq ) = (δcp ) ∪ cq + (−1)p cp ∪ (δcq ).

The theorem can be proved directly from the formula for cup product.
The coboundary formula shows that the cup product of 2 cocycles is itself a cocycle. The cohomology class of
z p ∪ z q for cocycles z p and z q depends only on their cohomology classes since

(z p + δdp−1 ) ∪ z q = z p ∪ z q + δ(dp−1 ∪ z q )

and
z p ∪ (z q + δdq−1 ) = z p ∪ z q + (−1)p δ(z p ∪ dq−1 ).
This means that we can pass to cohomology.
Theorem 8.3.2. The cochain cup product induces a cup product

    H p (X; R) × H q (X; R) −→ H p+q (X; R).

The cup product on cohomology is bilinear and associative with the cohomology class {z 0 } acting as the unit element.
Theorem 8.3.3. If h : X → Y is a continuous map then h∗ preserves cup products.
Proof: The theorem follows immediately from the fact that the cochain map h♯ preserves cup products of cochains
as the value of h♯ (cp ∪ cq ) on T equals the value of cp ∪ cq on h ◦ T which is

⟨cp , h ◦ T ◦ l(e0 , · · · , ep )⟩ · ⟨cq , h ◦ T ◦ l(ep , · · · , ep+q )⟩.

But this value is the same as h♯ (cp ) ∪ h♯ (cq ). ■



Definition 8.3.2. Let H ∗ (X; R) denote the direct sum ⊕∞i=0 H i (X; R). The cup product makes this group into a ring
with unit element. It is called the cohomology ring of X with coefficients in R. A ring in this form is called a graded
ring.
Note that a homotopy equivalence induces a ring isomorphism of cohomology rings, so cohomology is a topological
invariant.
Is the cohomology ring commutative? It sort of is. If αp ∈ H p (X; R) and β q ∈ H q (X; R), then

αp ∪ β q = (−1)pq β q ∪ αp .

This property is called anti-commutativity, but many books just call it commutativity despite it not being commutative in
the usual sense. I will prove the formula in Section 8.6 with access to more tools.
We can also define a cup product for simplicial cohomology rather than singular by modifying the cochain formula
to
⟨cp ∪ cq , [v0 , · · · , vp+q ]⟩ = ⟨cp , [v0 , · · · , vp ]⟩ · ⟨cq , [vp , · · · , vp+q ]⟩.
Simplicial cup products retain all of the properties of singular cup products, and they form isomorphic rings for triangulable spaces.
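The simplicial formula can be implemented literally. The sketch below is hypothetical (the dict-based encoding of cochains on ordered faces is my own convention, not from the text): it evaluates the cup product of two 1-cochains with Z2 values on a single ordered 2-simplex, and illustrates that the formula is not symmetric at the cochain level:

```python
def cup(cp, cq, simplex, mod=2):
    """Evaluate <c^p cup c^q, [v_0, ..., v_{p+q}]> by the simplicial formula:
    the value of c^p on the front p-face times the value of c^q on the back
    q-face.  Cochains are dicts from vertex tuples to integers (missing = 0)."""
    p = len(next(iter(cp))) - 1          # dimension of the faces c^p evaluates
    front = tuple(simplex[: p + 1])      # [v_0, ..., v_p]
    back = tuple(simplex[p:])            # [v_p, ..., v_{p+q}]
    return (cp.get(front, 0) * cq.get(back, 0)) % mod

# Two 1-cochains with Z_2 values on the edges of the 2-simplex [0, 1, 2]:
c1 = {(0, 1): 1, (1, 2): 0, (0, 2): 1}
c2 = {(0, 1): 1, (1, 2): 1, (0, 2): 0}
print(cup(c1, c2, (0, 1, 2)), cup(c2, c1, (0, 1, 2)))   # 1 0
```

At the cochain level the two orders disagree; anti-commutativity only holds after passing to cohomology classes.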
Munkres computes some examples of cohomology rings but there are some difficulties. First of all, we need to
find cocycles representing the cohomology groups. Also, isomorphic cell complexes may not always lead to the same
cohomology ring.
We will look at cohomology rings of product spaces in Section 8.5. Munkres discusses cohomology rings of
manifolds in connection with Poincaré duality, a relation between the homology and cohomology groups of a manifold.
I won’t get too far into it as it is a bit of a diversion, but there is one result about the cohomology ring of real projective
space that I will state that will be useful when we talk about Steenrod squares. (I will derive it in a different way in
Section 11.3.1.)
Theorem 8.3.4. If u is the non-zero element of H 1 (P n ; Z2 ) then uk (i.e. the cup product of u with itself k times) is the
nonzero element of H k (P n ; Z2 ) for 1 < k ≤ n. Then H ∗ (P n ; Z2 ) is a truncated polynomial ring over Z2 with one
generator u in dimension 1 and un+1 = 0. H ∗ (P ∞ ; Z2 ) is a polynomial ring over Z2 with a single one dimensional
generator.
Finally, I will give an example of two spaces with the same cohomology groups but non-isomorphic cohomology
rings. We will need one definition.
Definition 8.3.3. Let X and Y be topological spaces. The wedge product X ∨ Y is the union of X and Y with one
point in common.
Example 8.3.1. S 1 ∨ S 1 is a figure 8.
Example 8.3.2. Now consider the following scenario. You want a donut for dessert and you go to your local donut
shop. You are a little disappointed when you are handed a sphere with 2 circles attached. (Actually a hollow tetrahedron
with two triangles. See Figure 8.3.1.) The manager claims that it is the same thing as all of its homology and
cohomology groups are the same as those of a torus. But something doesn’t seem right to you. Who is right?
Let X = S 1 ∨ S 1 ∨ S 2 and T = S 1 × S 1 be the torus. The space X is a CW complex with one cell in dimension
0, two cells in dimension 1, and one cell in dimension 2. Referring to Figure 8.3.1, write the fundamental cycles

w1 = [a, b] + [b, c] + [c, a],

z1 = [a, d] + [d, e] + [e, a],


z2 = ∂[a, f, g, h]
for these cells. The boundary operators in the cellular chain complex all vanish. So the cellular chain complex of X
is isomorphic to the cellular chain complex of the torus, T . Thus, the homology and cohomology groups of X are
isomorphic to those of T .

The cohomology rings, however, are not isomorphic. We claim that the cohomology ring of X is trivial, i.e. every
product of cohomology classes of positive dimension is zero. This is the case because the fundamental cycles in
dimension 1 do not carry the boundaries of any two cell so all cup products of one dimensional cocycles are 0.
Munkres shows that the product of the two generators α, β of H 1 (T ) is α ∪ β = ±γ where γ is a generator of
H 2 (T ).
Thus X and T do not have isomorphic cohomology rings, which implies that they are not homeomorphic or even
homotopy equivalent. So you are well within your rights to have the donut shop give you a refund.

Figure 8.3.1: S 1 ∨ S 1 ∨ S 2 [110].

In the previous example, I did not duplicate Munkres’ computation of the cohomology ring of the torus. (See [110]
for more details.) Finding cocycles generating the cohomology groups is surprisingly hard. This may make the full
use of cohomology difficult in data science applications. Still, the rest of the chapter will show that there are more
things you can do.

8.4 Universal Coefficient Theorems


In this section, we will start to make more use of the homological algebra we learned in Section 8.1. We will
have two Universal Coefficient Theorems (UCT). The UCT for cohomology allows us to compute cohomology groups
given homology groups. The UCT for homology allows us to compute homology groups with coefficients in any other
finitely generated abelian groups given the homology groups with integer coefficients.
Letting C = {Cp , ∂} be a chain complex and G be an abelian group, recall that C p (C; G) = Hom(Cp (C), G). Is
it true that H p (C; G) = Hom(Hp (C), G)? It would be nice if that were true as it would be pretty easy to compute
cohomology groups from homology groups.
The answer is "almost but not quite". But these will be terms in an exact sequence, so we will often have all the
information we need.

Definition 8.4.1. For a chain complex C = {Cp , ∂} and an abelian group G, there is a map

Hom(Cp , G) × Cp → G

which takes the pair (cp , dp ) to the element ⟨cp , dp ⟩ ∈ G. This is a bilinear map called the evaluation map. It induces
a bilinear map called the Kronecker index in which the image of (αp , βp ) is denoted ⟨αp , βp ⟩.

We need to check that this map is well defined. Let z p be a cocycle and zp be a cycle. Then

⟨z p + δdp−1 , zp ⟩ = ⟨z p , zp ⟩ + ⟨dp−1 , ∂zp ⟩ = ⟨z p , zp ⟩ + 0 = ⟨z p , zp ⟩



and
⟨z p , zp + ∂dp+1 ⟩ = ⟨z p , zp ⟩ + ⟨δz p , dp+1 ⟩ = ⟨z p , zp ⟩ + 0 = ⟨z p , zp ⟩.
Definition 8.4.2. The Kronecker map is the map
κ : H p (C; G) → Hom(Hp (C); G)
that sends αp to the homomorphism ⟨αp , ·⟩. So (κ(αp ))(βp ) = ⟨αp , βp ⟩. The map κ is a homomorphism because the
Kronecker index is linear in the first variable.
I will now state the UCT for cohomology. See [110] Section 53 for the proof.
Theorem 8.4.1. Universal Coefficient Theorem for Cohomology: If C is a free chain complex and G is an abelian
group, then there is an exact sequence

0 ← Hom(Hp (C), G) ← H p (C; G) ← Ext(Hp−1 (C), G) ← 0,

where the first map is the Kronecker map κ. The sequence is natural with respect to homomorphisms induced by
chain maps. It splits but not naturally.
If (X, A) is a topological pair, then the exact sequence takes the form

0 ← Hom(Hp (X, A), G) ← H p (X, A; G) ← Ext(Hp−1 (X, A), G) ← 0,

where the first map is again κ, and it is natural with respect to homomorphisms induced by continuous maps. (Hi (X, A) is replaced by Hi (X) in the
special case where A = ∅.) This sequence also splits but not naturally.
The next result is stronger than Theorem 8.2.5 and follows from the naturality of the exact sequence in the UCT.
Theorem 8.4.2. Let C and D be free chain complexes and let ϕ : C → D be a chain map. If ϕ∗ : Hi (C) → Hi (D) is
an isomorphism for i = p, p − 1, then

ϕ∗ : H p (D; G) → H p (C; G)

is an isomorphism.
Example 8.4.1. Let T be the torus. From Example 4.3.4, we know that H0 (T ) ∼= Z, H1 (T ) ∼= Z ⊕ Z, and H2 (T ) ∼= Z.
What are the cohomology groups?
Since T is connected, we know that H 0 (T ) ∼= Z by Theorem 8.2.1. Now from Section 8.1, Ext(Z, Z) = 0. So
we have from the exact sequence in the UCT that H p (T ) ∼= Hom(Hp (T ), Z). Since Hom(Z, Z) ∼= Z, we have

H 1 (T ) ∼= Hom(H1 (T ), Z) ∼= Hom(Z ⊕ Z, Z) ∼= Z ⊕ Z

and

H 2 (T ) ∼= Hom(H2 (T ), Z) ∼= Hom(Z, Z) ∼= Z.

This agrees with Example 8.2.1.
Example 8.4.2. Let K be the Klein bottle. From Example 4.3.4, we know that H0 (K) ∼= Z, H1 (K) ∼= Z ⊕ Z2 , and
H2 (K) = 0. What are the cohomology groups?
Again, K is connected, so H 0 (K) ∼= Z. For H 1 (K), we have the exact sequence

0 ← Hom(H1 (K), Z) ← H 1 (K) ← Ext(H0 (K), Z) ← 0.

Now Ext(H0 (K), Z) = Ext(Z, Z) = 0, so

H 1 (K) ∼= Hom(H1 (K), Z) = Hom(Z ⊕ Z2 , Z) = Hom(Z, Z) ⊕ Hom(Z2 , Z) = Z ⊕ 0 = Z.

For H 2 (K), we have the exact sequence

0 ← Hom(H2 (K), Z) ← H 2 (K) ← Ext(H1 (K), Z) ← 0.

Now H2 (K) = 0, so Hom(H2 (K), Z) = 0. So

H 2 (K) ∼= Ext(H1 (K), Z) = Ext(Z ⊕ Z2 , Z) = Ext(Z, Z) ⊕ Ext(Z2 , Z) = 0 ⊕ Z2 = Z2 .

This agrees with Example 8.2.1.
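The two computations above follow a purely mechanical recipe, and as a sanity check it can be coded up. The sketch below is my own illustration (not from the text); it represents a finitely generated abelian group as a pair (rank, list of torsion coefficients) and uses the facts Hom(Z, Z) = Z, Hom(Zt , Z) = 0, Ext(Z, Z) = 0, and Ext(Zt , Z) = Zt from Section 8.1:

```python
def cohomology_from_homology(homology, p):
    """H^p(X; Z) = Z^{b_p} (+) torsion(H_{p-1}(X; Z)), by the split UCT.

    `homology` maps dimension -> (rank, [torsion coefficients]),
    encoding the group Z^rank (+) Z_{t_1} (+) ... (+) Z_{t_k}.
    """
    rank_p, _ = homology.get(p, (0, []))            # Hom(-, Z) kills torsion
    _, torsion_prev = homology.get(p - 1, (0, []))  # Ext(-, Z) keeps only torsion
    return (rank_p, list(torsion_prev))

# Torus: H_0 = Z, H_1 = Z^2, H_2 = Z
torus = {0: (1, []), 1: (2, []), 2: (1, [])}
# Klein bottle: H_0 = Z, H_1 = Z (+) Z_2, H_2 = 0
klein = {0: (1, []), 1: (1, [2]), 2: (0, [])}

print(cohomology_from_homology(torus, 1))   # (2, [])  i.e. Z (+) Z
print(cohomology_from_homology(klein, 1))   # (1, [])  i.e. Z
print(cohomology_from_homology(klein, 2))   # (0, [2]) i.e. Z_2
```

The outputs reproduce Examples 8.4.1 and 8.4.2.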

There is an analogous UCT for homology. We want to compute homology with coefficients in some other group
given homology with integer coefficients. Tensor product and Tor replace Hom and Ext, and the arrows point in the
other direction.

Theorem 8.4.3. Universal Coefficient Theorem for Homology If C is a free chain complex and G is an abelian
group, then there is an exact sequence

0 → Hp (C) ⊗ G → Hp (C; G) → Tor(Hp−1 (C), G) → 0

that is natural with respect to homomorphisms induced by chain maps. It splits but not naturally.
If (X, A) is a topological pair, then the exact sequence takes the form

0 → Hp (X, A) ⊗ G → Hp (X, A; G) → Tor(Hp−1 (X, A), G) → 0

and it is natural with respect to homomorphisms induced by continuous maps. This sequence also splits but not
naturally.

Theorem 8.4.4. Let C and D be free chain complexes and let ϕ : C → D be a chain map. If ϕ∗ : Hi (C) → Hi (D) is
an isomorphism for i = p, p − 1, then
ϕ∗ : Hp (C; G) → Hp (D; G)

is an isomorphism.
Example 8.4.3. Let T be the torus. Using the fact that H0 (T ) ∼= Z, H1 (T ) ∼= Z ⊕ Z, and H2 (T ) ∼= Z, what are the
homology groups with coefficients in Z2 ?
Since T is connected, we know that H0 (T ; Z2 ) ∼= Z2 . Now from Section 8.1, Tor(Z, Z2 ) = 0. So we have from
the exact sequence in the UCT for homology that Hp (T ; Z2 ) ∼= Hp (T ) ⊗ Z2 . Since Z ⊗ Z2 ∼= Z2 , we have

H1 (T ; Z2 ) ∼= H1 (T ) ⊗ Z2 ∼= (Z ⊕ Z) ⊗ Z2 ∼= Z2 ⊕ Z2

and

H2 (T ; Z2 ) ∼= H2 (T ) ⊗ Z2 ∼= Z ⊗ Z2 ∼= Z2 .

Example 8.4.4. Let K be the Klein bottle. Using the fact that H0 (K) ∼= Z, H1 (K) ∼= Z ⊕ Z2 , and H2 (K) = 0, what
are the homology groups with coefficients in Z2 ?
Again, K is connected, so H0 (K; Z2 ) ∼= Z2 . For p = 1, the sequence becomes

0 → H1 (K) ⊗ Z2 → H1 (K; Z2 ) → Tor(H0 (K), Z2 ) → 0.

Now Tor(H0 (K), Z2 ) = Tor(Z, Z2 ) = 0, so

H1 (K; Z2 ) ∼= H1 (K) ⊗ Z2 ∼= (Z ⊕ Z2 ) ⊗ Z2 ∼= Z2 ⊕ Z2 .

For p = 2, the sequence becomes

0 → H2 (K) ⊗ Z2 → H2 (K; Z2 ) → Tor(H1 (K), Z2 ) → 0.

Now H2 (K) = 0, so

H2 (K; Z2 ) ∼= Tor(H1 (K), Z2 ) ∼= Tor(Z ⊕ Z2 , Z2 ) ∼= Z2 .

With Z2 coefficients, the homology of the torus and the Klein bottle are the same: in Z2 the "twist" is ignored.
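This recipe can also be automated. The sketch below is my own illustration (not from the text); it uses the identities Z ⊗ Zn ∼= Zn , Zt ⊗ Zn ∼= Zgcd(t,n) , Tor(Z, Zn ) = 0, and Tor(Zt , Zn ) ∼= Zgcd(t,n) :

```python
from math import gcd

def mod_n_homology(homology, p, n):
    """H_p(X; Z_n) from integer homology via the UCT for homology:
    H_p(X; Z_n) = (H_p (x) Z_n) (+) Tor(H_{p-1}, Z_n).

    `homology` maps dimension -> (rank, [torsion coefficients]).
    Returns the sorted list of cyclic summand orders (trivial ones dropped).
    """
    rank_p, tors_p = homology.get(p, (0, []))
    _, tors_prev = homology.get(p - 1, (0, []))
    summands = [n] * rank_p                      # Z (x) Z_n = Z_n
    summands += [gcd(t, n) for t in tors_p]      # Z_t (x) Z_n = Z_gcd(t,n)
    summands += [gcd(t, n) for t in tors_prev]   # Tor(Z_t, Z_n) = Z_gcd(t,n)
    return sorted(o for o in summands if o > 1)

torus = {0: (1, []), 1: (2, []), 2: (1, [])}
klein = {0: (1, []), 1: (1, [2]), 2: (0, [])}

print(mod_n_homology(torus, 1, 2))  # [2, 2] -> Z_2 (+) Z_2
print(mod_n_homology(klein, 1, 2))  # [2, 2] -> Z_2 (+) Z_2
print(mod_n_homology(klein, 2, 2))  # [2]    -> Z_2
```

Note how the torus and the Klein bottle become indistinguishable in dimension 1, exactly as in the examples.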

8.5 Homology and Cohomology of Product Spaces


In this section, we will learn how to compute homology and cohomology groups of a product space. We will also
see what happens to a cohomology ring. The main result is called the Künneth Theorem. There will be a version for
both homology and cohomology.
Definition 8.5.1. Let C = {Cp , ∂} and C ′ = {Cp′ , ∂ ′ } be chain complexes. Their tensor product is a new complex
denoted C ⊗ C ′ whose chain group in dimension m is defined by

(C ⊗ C ′ )m = ⊕p+q=m Cp ⊗ Cq′ ,

whose boundary ∂ is defined as


∂(cp ⊗ c′q ) = ∂cp ⊗ c′q + (−1)p cp ⊗ ∂ ′ c′q .
We define ∂ as a function on Cp × Cq′ ; it is bilinear, so it induces a homomorphism on the tensor product.
It is easy to see that ∂ 2 = 0. Also, the tensor product of two chain maps is a chain map on the tensor product
complexes.
The tensor product of two free chain complexes is itself free.
If K and L are simplicial complexes, then |K| × |L| is a CW complex with cells σ × τ for σ ∈ K and τ ∈ L.
Letting D(K × L) be the corresponding cellular chain complex, the group Dm (K × L) is free abelian with a basis
element for each cell σ p × τ q for which p + q = m. This basis element is a fundamental cycle for σ p × τ q .
Orienting the simplices of K and L, they form bases for the simplicial chain complexes C(K) and C(L). The
m-dimensional chain group of (C(K) ⊗ C(L)) consists of the elements σ p ⊗ τ q for p + q = m.
From this one-to-one correspondence, we can choose a group isomorphism that respects boundaries and becomes
an isomorphism of chain complexes. This leads to the following result. See [110] for the proof.
Theorem 8.5.1. If K and L are simplicial complexes and if |K| × |L| is triangulated so that each cell σ × τ is the
polytope of a subcomplex then
C(K) ⊗ C(L) ∼= D(K × L).
So C(K) ⊗ C(L) can be used to compute the homology of |K| × |L|.
Next we get to the Künneth Theorem itself. As with the Universal Coefficient Theorems, we will start with chain
complexes and then move to topological spaces.
Let
Θ : Hp (C) ⊗ Hq (C ′ ) → Hp+q (C ⊗ C ′ )
be defined by
Θ({zp } ⊗ {zq′ }) = {zp ⊗ zq′ },
where zp is a p-cycle of C, and zq′ is a q-cycle of C ′ . It is easily shown that {zp ⊗ zq′ } is a cycle and the map is well
defined. Extending Θ to a direct sum, the map

Θ : ⊕p+q=m Hp (C) ⊗ Hq (C ′ ) → Hm (C ⊗ C ′ ),

can be shown to be a monomorphism, and its image is a direct summand.


The Künneth Theorem identifies the cokernel of Θ. It is our old friend Tor. Again, the reader is referred to [110]
for the rather lengthy proof.
Theorem 8.5.2. Künneth Theorem for Chain Complexes: Let C and C ′ be free chain complexes. There is an exact
sequence

0 → ⊕p+q=m Hp (C) ⊗ Hq (C ′ ) → Hm (C ⊗ C ′ ) → ⊕p+q=m Tor(Hp−1 (C), Hq (C ′ )) → 0,

where the first map is Θ. The sequence is natural with respect to homomorphisms induced by chain maps. It splits
but not naturally.

Theorem 8.5.3. If K and L are finite simplicial complexes, the Künneth Theorem implies that since C(K) and C(L)
are free,
Hm (|K| × |L|) ∼= ⊕p+q=m [Hp (K) ⊗ Hq (L) ⊕ Tor(Hp−1 (K), Hq (L))].
Example 8.5.1. Recall the donut shop from Example 8.3.2. They can't tell a donut from a sphere wedged with 2
circles, so you decide to try a different one. You get a pleasant surprise when a couple of blocks away, you discover
a shop that not only sells donuts but the Cartesian product of two spheres of any dimension. Suppose you order a
hyperdonut of the form S r × S s for some r, s > 0, and you are very curious about its homology. Since S r has
Hp (S r ) ∼= Z for p = 0, r and is 0 otherwise, the Tor terms are all 0, as Tor(Z, G) = 0 for any abelian group G. So
for p + q = m,

Hm (S r × S s ) ∼= ⊕p+q=m Hp (S r ) ⊗ Hq (S s ).

So for r ̸= s, Hm (S r × S s ) ∼= Z if m = 0, r, s, r + s, and Hm (S r × S s ) = 0 otherwise.
In the case r = s, Hm (S r × S r ) ∼= Z for m = 0, 2r, Hm (S r × S r ) ∼= Z ⊕ Z for m = r, and Hm (S r × S r ) = 0
otherwise.
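Since spheres have free homology, the Künneth sum in this example reduces to a convolution of Betti numbers, which is easy to check by machine. A minimal sketch (my own illustration, not from the text):

```python
def sphere_betti(r, p):
    """Betti numbers of S^r for r > 0: b_0 = b_r = 1, all others 0."""
    return 1 if p in (0, r) else 0

def product_betti(r, s, m):
    """b_m(S^r x S^s) via the Kunneth Theorem.

    The Tor terms vanish because H_*(S^r) is free, leaving the
    convolution sum over p + q = m.
    """
    return sum(sphere_betti(r, p) * sphere_betti(s, m - p)
               for p in range(m + 1))

print([product_betti(2, 3, m) for m in range(6)])  # [1, 0, 1, 1, 0, 1]
print([product_betti(2, 2, m) for m in range(5)])  # [1, 0, 2, 0, 1]
```

The first line shows Z in dimensions 0, 2, 3, 5 for S 2 × S 3 ; the second shows the rank-2 group in the middle dimension for S 2 × S 2 .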
Example 8.5.2. For field coefficients and free chain complexes, the Tor terms also go away. If we have finite simplicial
complexes K and L, we get a vector space isomorphism

⊕p+q=m Hp (K; F ) ⊗F Hq (L; F ) → Hm (|K| × |L|; F ).

There is some background I need to add here. It will be important for applying the Künneth Theorem and for
looking at the ring structure of product space cohomology in the next section. We will also need it for the construction
of Steenrod squares in Chapter 11.
The first step is to establish a relationship between the singular chain complexes of two topological spaces X and
Y and the singular chain complex of their product. This is accomplished by way of the Eilenberg-Zilber Theorem. The
proof relies on the very abstract idea of acyclic models.
C will be an arbitrary category but almost always the category of topological spaces or pairs of spaces and the
continuous maps between them. A is the category of augmented chain complexes and chain maps.
Definition 8.5.2. Let G be a functor from C to A. Given an object X of C, let Gp (X) be the p-dimensional group of
the augmented chain complex G(X). Let M be a collection of objects of C called models. Then G is acyclic relative
to M if G(X) is acyclic for each X ∈ M. We say that G is free relative to M if for each p ≥ 0, there exists:
1. An index set Jp .
2. An indexed family {Mα }α∈Jp of objects of M.
3. An indexed family {iα }α∈Jp , where iα ∈ Gp (Mα ) for each α.
Given X, the elements
G(f )(iα ) ∈ Gp (X)
are distinct and form a basis for Gp (X) as f ranges over hom(Mα , X), and α ranges over Jp .

Example 8.5.3. Consider the singular chain functor from the category C of topological spaces to A. Let M be the
collection {∆p } for p ≥ 0. The functor is acyclic relative to M. It is also free, since for each p, let Jp have only one
element and the corresponding object of M be ∆p . The corresponding element of Sp (∆p ) is the identity simplex ip .
As T ranges over all continuous maps ∆p → X the elements T♯ (ip ) = T form a basis for Sp (X).

Example 8.5.4. Let G be defined on pairs of spaces and G(X, Y ) = S(X × Y ), G(f, g) = (f × g)♯ . Let M =
{(∆p , ∆q )} with p, q ≥ 0. Then G is acyclic relative to M, since ∆p × ∆q is contractible. To show G is free relative
to M: For each index p, let Jp consist of a single element with the corresponding object of M being (∆p , ∆p ) and
the corresponding element of Sp (∆p × ∆p ) being the diagonal map dp (x) = (x, x). As f and g range over all maps
from ∆p to X and Y respectively, (f × g)♯ (dp ) ranges over all maps ∆p → X × Y which is a basis for Sp (X × Y ).

If G is free relative to M, then it is automatically free relative to any larger collection. If it is acyclic relative to
M, then it is automatically acyclic relative to any smaller collection. So for G to be both free and acyclic we must
choose M to be exactly the right size.

Theorem 8.5.4. Acyclic Model Theorem: Let G and G′ be functors from category C to the category A of augmented
chain complexes and chain maps. Let M be a collection of objects of C. If G is free relative to M and G′ is acyclic
relative to M, then the following hold:

1. There is a natural transformation TX of G to G′ .

2. Given two natural transformations TX , TX′ of G to G′ , there is a natural chain homotopy DX between them.

In other words, for each object X of C we have TX : Gp (X) → G′p (X) and DX : Gp (X) → G′p+1 (X), and
naturality implies that for each f ∈ Hom(X, Y ),

G′ (f ) ◦ TX = TY ◦ G(f )

and

G′ (f ) ◦ DX = DY ◦ G(f ).

The next result will use the Acyclic Model Theorem to show that the chain complex S(X) ⊗ S(Y ) can be used to
compute the singular homology of X × Y .

Theorem 8.5.5. Eilenberg-Zilber Theorem: For every pair X, Y of topological spaces there are chain maps

µ : S(X) ⊗ S(Y ) → S(X × Y ) and ν : S(X × Y ) → S(X) ⊗ S(Y )

that are chain-homotopy inverse to each other (i.e. their composition in either direction is chain homotopic to the
identity). They are natural with respect to chain maps induced by continuous maps.

Example 8.5.4 is pretty much the proof. The theorem also justifies the use of topological spaces in the Künneth
Theorem. We state it in the following form.

Theorem 8.5.6. Künneth Theorem for Topological Spaces: Let X and Y be topological spaces. There is an exact
sequence

0 → ⊕p+q=m Hp (X) ⊗ Hq (Y ) → Hm (X × Y ) → ⊕p+q=m T or(Hp−1 (X), Hq (Y )) → 0

which is natural with respect to homomorphisms induced by continuous maps. The sequence splits but not naturally.

The monomorphism

Hp (X) ⊗ Hq (Y ) → Hm (X × Y )

is called the homology cross product. It is equal to the composite

Hp (X) ⊗ Hq (Y ) → Hm (S(X) ⊗ S(Y )) → Hm (X × Y ),

where the first map is Θ, induced by inclusion, and the second is µ∗ , with µ the chain equivalence from the
Eilenberg-Zilber Theorem. We will not use this cross product much, but a similar product in cohomology will be
much more interesting.
We will need an explicit formula for the other Eilenberg-Zilber equivalence ν.
Theorem 8.5.7. Let π1 : X × Y → X and π2 : X × Y → Y be projections. Let

ν : Sm (X × Y ) → ⊕p+q=m Sp (X) ⊗ Sq (Y )

be defined as

ν(T ) = Σ_{i=0}^{m} [π1 ◦ T ◦ l(e0 , · · · , ei )] ⊗ [π2 ◦ T ◦ l(ei , · · · , em )].

Then ν is a natural chain map that preserves augmentation.
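On the combinatorial level, the formula for ν is a front-face/back-face splitting, and it can be illustrated with a short sketch. This is my own illustration: here a simplex in a product is modeled simply as a list of vertex pairs, which is an assumption made for concreteness; the general singular case acts on maps, not vertex lists.

```python
def alexander_whitney(T):
    """Front-face/back-face decomposition of a simplex in a product.

    T is a list of vertex pairs (x_i, y_i) standing in for a simplex in
    X x Y. Returns the summands of nu(T): for each splitting index i,
    the front face projected to X paired with the back face projected
    to Y.
    """
    m = len(T) - 1
    return [([x for x, _ in T[: i + 1]], [y for _, y in T[i:]])
            for i in range(m + 1)]

# A 2-simplex in a product: three summands, one per splitting point.
T = [("a", 0), ("b", 1), ("c", 2)]
for front, back in alexander_whitney(T):
    print(front, back)
# ['a'] [0, 1, 2]
# ['a', 'b'] [1, 2]
# ['a', 'b', 'c'] [2]
```

Each summand pairs a p-dimensional front face with a q-dimensional back face, with p + q = m, matching the sum in the theorem.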


Using this map, our next goal is to define a cohomology cross product. We will let R be a commutative ring with
unit element.
Let C be a chain complex. This gives rise to a cochain complex Hom(C, R) whose p-dimensional group is
Hom(Cp , R). If C ′ is another chain complex, we can form the tensor product Hom(C, R) ⊗ Hom(C ′ , R) whose
m-dimensional cochain group is
⊕p+q=m Hom(Cp , R) ⊗ Hom(Cq′ , R)
and the coboundary is given by

δ(ϕp ⊗ ψ q ) = δϕp ⊗ ψ q + (−1)p ϕp ⊗ δ ′ ψ q .

Instead of taking Hom first and then tensor product, we could perform the two operations in the opposite order and
form a different cochain complex. We will define a homomorphism between them.
Definition 8.5.3. Let p + q = m. We define the homomorphism

θ : Hom(Cp , R) ⊗ Hom(Cq′ , R) → Hom((C ⊗ C ′ )m , R)

by
⟨θ(ϕp ⊗ ψ q ), cr ⊗ c′s ⟩ = ⟨ϕp , cr ⟩ · ⟨ψ q , c′s ⟩.
Note that this is zero unless p = r and q = s.
Theorem 8.5.8. The homomorphism θ is a natural cochain map.
We have the corresponding cohomology map

Θ : H p (C; R) ⊗ H q (C ′ ; R) → H m (C ⊗ C ′ ; R)

for m = p + q defined by

Θ({ϕ} ⊗ {ψ}) = {θ(ϕ ⊗ ψ)}.
Definition 8.5.4. Let m = p + q. The cohomology cross product is the composite

H p (X) ⊗ H q (Y ) → H m (S(X) ⊗ S(Y )) → H m (X × Y ),

where the first map is Θ and the second is ν ∗ , with ν the Eilenberg-Zilber equivalence. The image of αp ⊗ β q under
this homomorphism is denoted αp × β q .
Using the homomorphism Θ, we can now state the Künneth Theorem for Cohomology.

Theorem 8.5.9. Künneth Theorem for Cohomology: Let C and C ′ be free finitely generated chain complexes that
vanish below some specified dimension (usually dimension 0). There is an exact sequence

0 → ⊕p+q=m H p (C) ⊗ H q (C ′ ) → H m (C ⊗ C ′ ) → ⊕p+q=m Tor(H p+1 (C), H q (C ′ )) → 0,

where the first map is Θ. The sequence is natural with respect to homomorphisms induced by chain maps. It splits
but not naturally.
Denoting the cohomology cross product by ×, we can state the version for topological spaces.
Theorem 8.5.10. Let X and Y be topological spaces and suppose Hi (X) is finitely generated for each i. Then there
is a natural exact sequence

0 → ⊕p+q=m H p (X) ⊗ H q (Y ) → H m (X × Y ) → ⊕p+q=m Tor(H p+1 (X), H q (Y )) → 0,

where the first map is the cross product ×. It splits but not naturally if Hi (Y ) is finitely generated for each i.

8.6 Ring Structure of the Cohomology of a Product Space


In this section we will look more closely at the ring structure of a product space in cohomology. We will also settle
the question of why cohomology has a ring structure but homology does not.
Again, we let R be a commutative ring with unit element.
The cohomology cross product is the composite of maps Θ and ν ∗ which are induced by cochain maps θ and ν̃.
Defining cp × cq = ν̃θ(cp ⊗ cq ), we call this the cochain cross product. It is given by the formula

⟨cp × cq , T ⟩ = ⟨cp , π1 ◦ T ◦ l(e0 , · · · , ep )⟩ · ⟨cq , π2 ◦ T ◦ l(ep , · · · , ep+q )⟩,

where T : ∆p+q → X × Y. This says that the value of cp × cq on T equals the value of cp on the front face of the first
component of T times the value of cq on the back face of the second component of T .
The basic properties of the cross products are described in the following result.
Theorem 8.6.1. 1. If λ : X × Y → Y × X is the map that reverses coordinates, then

λ∗ (β q × αp ) = (−1)pq αp × β q .

2. In H ∗ (X × Y × Z; R), we have (α × β) × γ = α × (β × γ).


3. Let π1 : X × Y → X and π2 : X × Y → Y be projections. Let 1X and 1Y be the unit elements of H ∗ (X; R)
and H ∗ (Y ; R) respectively. Then
π1∗ (αp ) = αp × 1Y
and
π2∗ (β q ) = 1X × β q .
Now for the punchline: The cup product and cross product have very similar formulas. The next result is the
connection between them.
Theorem 8.6.2. Let d : X → X × X be the diagonal map d(x) = (x, x). Then

d∗ (αp × β q ) = αp ∪ β q .

Proof: Let z p and z q be representative cocycles for αp and β q respectively. If T : ∆p+q → X is a singular
simplex, then

⟨d♯ (z p × z q ), T ⟩ = ⟨z p × z q , d ◦ T ⟩ = ⟨z p , π1 ◦ (d ◦ T ) ◦ l(e0 , · · · , ep )⟩ · ⟨z q , π2 ◦ (d ◦ T ) ◦ l(ep , · · · , ep+q )⟩

= ⟨z p , T ◦ l(e0 , · · · , ep )⟩ · ⟨z q , T ◦ l(ep , · · · , ep+q )⟩ = ⟨z p ∪ z q , T ⟩,

where we have used the fact that π1 ◦ d = π2 ◦ d = iX , the identity map on X. ■
So now we have the reason why homology isn't a ring. Both homology and cohomology have cross products, but
consider the diagram in homology:

Hp (X) × Hq (X) → Hp+q (X × X) ← Hp+q (X),

where the first map is the cross product × and the second is d∗ . The problem is that for homology, d∗ goes in the
wrong direction.


Anti-commutativity for the cup product is now easy.
Theorem 8.6.3.
αp ∪ β q = (−1)pq β q ∪ αp .
Proof: Let λ : X × X → X × X be the map that interchanges coordinates. Using the fact that λ ◦ d = d, we have
that
αp ∪ β q = d∗ (αp × β q ) = (λ ◦ d)∗ (αp × β q ) = d∗ ((−1)pq β q × αp ) = (−1)pq β q ∪ αp . ■

I will conclude this section with two more properties of cohomology rings of products.
Theorem 8.6.4. In the ring H ∗ (X × Y ; R) we have:

(α × β) ∪ (α′ × β ′ ) = (−1)(dim β)(dim α′ ) (α ∪ α′ ) × (β ∪ β ′ ).

Definition 8.6.1. The tensor product H ∗ (X; R) ⊗ H ∗ (Y ; R) has the structure of a ring where for β ∈ H p (Y ; R) and
α′ ∈ H q (X; R),
(α ⊗ β) ∪ (α′ ⊗ β ′ ) = (−1)pq (α ∪ α′ ) ⊗ (β ∪ β ′ ).
Theorem 8.6.5. If Hi (X) is finitely generated for all i, then the cross product defines a monomorphism of rings

H ∗ (X) ⊗ H ∗ (Y ) → H ∗ (X × Y ).

If F is a field, it defines an isomorphism of algebras

H ∗ (X; F ) ⊗F H ∗ (Y ; F ) → H ∗ (X × Y ; F ).

Example 8.6.1. The hyperdonut shop from Example 8.5.1 was so good, you can’t wait to get back there. Now that
you have computed the homology groups, you would like to compute the cohomology rings. Consider S r × S s for
r, s > 0. Let α ∈ H r (S r ) and β ∈ H s (S s ) be generators. Then since the torsion terms in the Künneth Theorem
vanish, H ∗ (S r × S s ) is free of rank 4 with basis 1 × 1, α × 1, 1 × β, and α × β. The only nonzero product of elements
of positive dimension is (α × 1) ∪ (1 × β) = (α × β).
Example 8.6.2. Consider the cohomology ring of P 2 × P 2 with coefficients in Z2 . Let u ∈ H 1 (P 2 ) be the nonzero
element. We know from Theorem 8.3.4 that u2 = u ∪ u is nonzero and un = 0 for n > 2. By Theorem 8.6.5,
H ∗ (P 2 × P 2 ; Z2 ) is a vector space of dimension 9. The basis is:

dim 0 1×1
dim 1 u × 1, 1 × u
dim 2 u2 × 1, u × u, 1 × u2
dim 3 u2 × u, u × u2
dim 4 u2 × u2 .

Multiplication uses Theorem 8.6.4, and we don’t have to worry about signs as we are working in Z2 . For example,

(u2 × u) ∪ (1 × u) = (u2 ∪ 1) × (u ∪ u) = u2 × u2 .
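This multiplication table is simple enough to encode. In the sketch below (my own illustration, not from the text), a basis element u^i × u^j of H ∗ (P 2 × P 2 ; Z2 ) is stored as the exponent pair (i, j), with None standing for the zero class:

```python
def cup_p2(a, b):
    """Cup product of u^a and u^b in H*(P^2; Z_2) = Z_2[u]/(u^3).
    Returns the resulting exponent, or None for the zero class."""
    return a + b if a + b <= 2 else None

def cup_product_pair(x, y):
    """Product of u^i x u^j and u^k x u^l in H*(P^2 x P^2; Z_2).
    Theorem 8.6.4 applies with no signs since we work over Z_2."""
    (i, j), (k, l) = x, y
    a, b = cup_p2(i, k), cup_p2(j, l)
    return None if a is None or b is None else (a, b)

# (u^2 x u) U (1 x u) = u^2 x u^2, as computed in the text
print(cup_product_pair((2, 1), (0, 1)))  # (2, 2)
# (u x 1) U (u^2 x 1) involves u^3 = 0, so the product vanishes
print(cup_product_pair((1, 0), (2, 0)))  # None
```

Running over all 9 basis elements recovers the full ring structure of the example.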

8.7 Persistent Cohomology


Although Edelsbrunner and Harer [44] discuss cohomology, the earliest source I know of is the paper of de Silva et
al. [40]. The earliest paper that describes an algorithm to compute persistent cohomology is the Ph.D. thesis of Harer's
student at Duke, Aubrey HB [65]. (I was curious about her unusual last name consisting of 2 initials. According to
the biography in her thesis, she was born Aubrey Rae Hebda-Bolduc.) In this section, we will look at her description
of cohomology and cup product as they relate to persistence. There is a heavy emphasis on manifolds, duality, and
characteristic classes. I will briefly discuss the latter in Chapter 11, as they are one of the main applications of Steenrod
squares, but I want to mainly focus on other areas. Duality is covered extensively in Munkres [110], while the classic
book on characteristic classes is Milnor and Stasheff [102].
All of the material in this section is from [65].
Suppose we have a filtration
∅ = K0 ⊂ K1 ⊂ · · · ⊂ Kn = K.
Suppose we add one simplex at each step so that Ki = Ki−1 ∪ σi . Also, cohomology will be over Z2 .
Definition 8.7.1. Let ψ^q_{j,i} : H q (Kj ) → H q (Ki ) for i ≤ j be the map induced by inclusion. (Remember that the
arrow goes in the reverse direction.) We say that a q-dimensional cohomology class Γ is born at Kj if Γ ∈ H q (Kj )
but Γ ∉ Im(ψ^q_{j+1,j} ). We say that Γ dies entering Ki if i is the largest index such that there exists a Λ ∈ H q (Kj+1 )
with ψ^q_{j+1,i} (Λ) = ψ^q_{j,i} (Γ) ∈ H q (Ki ). The copersistence of Γ is j − i.

For a simplex σ, we write σ ∗ for the cochain which is one on σ and zero everywhere else. Then Γ ∈ C q (Kj ) can
be written in the form Γ = σ1∗ + · · · + σn∗ for some n.
We call σ a coface of τ if τ is a face of σ. For the computations we will need the q-dimensional coincidence
matrix, which we will now define.
Definition 8.7.2. For a simplicial complex K, the q-dimensional coincidence matrix CIq is the matrix whose rows
correspond to the ordered q-dimensional simplices and whose columns correspond to the ordered (q − 1)-dimensional
simplices. If σi is a q-dimensional simplex and τj is a (q − 1)-dimensional simplex, then CIq [i, j] = 1 if σi is a coface
of τj and CIq [i, j] = 0 otherwise.
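The coincidence matrix is straightforward to build. The sketch below is my own illustration (not code from [65]); simplices are given as sorted vertex tuples, and faces are obtained by dropping one vertex:

```python
from itertools import combinations

def coincidence_matrix(q_simplices, qm1_simplices):
    """Build CI_q over Z_2: entry [i][j] is 1 iff the q-simplex i is a
    coface of the (q-1)-simplex j, i.e. has it as a face."""
    def faces(s):
        # all (q-1)-faces of s, obtained by deleting one vertex
        return set(combinations(s, len(s) - 1))
    return [[1 if tuple(t) in faces(s) else 0 for t in qm1_simplices]
            for s in q_simplices]

tris = [(1, 2, 3), (1, 3, 4)]
edges = [(1, 2), (1, 3), (2, 3), (1, 4), (3, 4)]
print(coincidence_matrix(tris, edges))
# [[1, 1, 1, 0, 0], [0, 1, 0, 1, 1]]
```

Over Z2 this is just the transpose of the usual boundary matrix from dimension q to q − 1.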
The algorithm does a reduction of the total coincidence matrix CI whose rows and columns are indexed by all
simplices. (Not just those of dimension q or q − 1.) CI[i, j] = 1 if the simplex i is a coface of j and j represents
a simplex of dimension one lower than i. The algorithms in [65] give generators for H p (K) at every level of
the filtration and determine when these generators die as we move backwards through it. See there for details and
numerical examples.
HB’s thesis also looks at cup products. In Z2 it is easy to compute cup products. The cup product of two coho-
mology classes is the pairwise concatenation of the simplices on which the first factor evaluates to one and those on
which the second factor does.
As an example, let α = [1, 2]∗ + [1, 4]∗ + [2, 4]∗ and β = [2, 3]∗ + [3, 4]∗ + [2, 4]∗ . The only concatenations
preserving vertex ordering are [1, 2, 3] and [1, 2, 4]. So α ∪ β([1, 2, 3]) = α([1, 2]) · β([2, 3]) = 1 · 1 = 1. By contrast
α ∪ β([1, 3, 4]) = α([1, 3]) · β([3, 4]) = 0 · 1 = 0.
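This evaluation rule is easy to code over Z2 . The sketch below is my own illustration (not from [65]); a cochain is represented as the set of simplices on which it is one, and the computation above is reproduced:

```python
def cup_z2(alpha, beta, simplex):
    """Evaluate (alpha U beta) on a (p+q)-simplex over Z_2.

    alpha: set of p-simplices (sorted tuples) on which alpha is 1;
    beta:  set of q-simplices on which beta is 1.
    The dimension p is inferred from alpha, and the simplex is split
    into its front p-face and back q-face.
    """
    p = len(next(iter(alpha))) - 1       # dimension of alpha's simplices
    front = tuple(simplex[: p + 1])
    back = tuple(simplex[p:])
    return int(front in alpha and back in beta)

alpha = {(1, 2), (1, 4), (2, 4)}
beta = {(2, 3), (3, 4), (2, 4)}
print(cup_z2(alpha, beta, (1, 2, 3)))  # 1: alpha([1,2]) * beta([2,3]) = 1
print(cup_z2(alpha, beta, (1, 3, 4)))  # 0: alpha([1,3]) = 0
```

This matches the two evaluations worked out in the text.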
The focus in [65] is decomposability.
Definition 8.7.3. Suppose we have a cohomology class β ∈ H p+q (K). β is decomposable if there exists an x ∈
H p (K) and a y ∈ H q (K) such that β = x ∪ y.
The following is the algorithm in [65] for determining if β is decomposable as a cup product with a given x.
Let K be a simplicial complex for which we have computed the cohomology groups. Let [x] be a p-dimensional
class that is born at Ks with s > n and dies at Kt with t < n − 1. Let x = σ1∗ + · · · + σn∗ , and let Kn be Kn−1 with a
(p + q)-dimensional simplex ∆ added. Also suppose that ∆∗ = δ(y) at the level of the filtration Kn . By removing ∆
and moving backwards to Kn−1 , δ(y) becomes 0, and y is born as a cocycle.
When ∆ is removed, we have [y] ∈ H p+q−1 (Kn−1 ). Let y = τ1∗ + · · · + τm∗ . Consider the coincidence matrix
whose columns are indexed by τ1 , · · · , τm and whose rows are codimension one cofaces of the τj 's. (In other words,

the rows are simplices of dimension one more than those of the columns.) If λi represents row i, then the i, j-th entry
of the matrix is equal to one if and only if τj is a face of λi . For each λi , if there is a σk in x such that σk is a front
face of λi , we store λi in a list denoted β. After looping through the rows, if the size of β is even, then β is not a cup
product of anything with x. (Remember that we are working in Z2 .) If the size of the list is odd, then β is born as a
cup product with x. So the birth of this cocycle gives rise to the birth of a cup product.
See [65] for an explicit numerical example.

8.8 Ripser
As I mentioned earlier, Ripser, which was developed by Ulrich Bauer of the Technical University of Munich, is the
fastest software I am aware of for computing persistent homology barcodes. The software is available at [12]. A full
description of its advantages over earlier algorithms can be found in [13]. Bauer has a nice summary of this paper in a
talk he gave at the Alan Turing Institute in 2017 [14]. The material in this section will come from there.
Bauer uses the following example. Take 192 points dispersed over S 2 . Compute homology barcodes up to dimen-
sion 2. There are over 56 million simplices in the 3-skeleton. As an example, javaplex took 53 minutes and 20 seconds
and used 12 GB of memory. Dionysus took 8 minutes and 53 seconds and used 3.4 GB. Ripser takes 1.2 seconds and
uses 152 MB of memory.
The improved performance is based on four optimizations:
1. Clearing inessential columns. (I will explain what this means below.)
2. Computing cohomology rather than homology.
3. Implicit matrix reduction.
4. Apparent and emergent pairs.
It turns out that cohomology yields a considerable speedup but only in conjunction with clearing.
Consider the standard matrix algorithm. Suppose we have a cloud of points in a finite metric space and we want to
compute the persistent homology over Z2 . Start with the filtration boundary matrix D, where the columns and rows
are indexed by simplices in the filtration and Di,j = 1 if σi ∈ ∂σj , and Di,j = 0 otherwise. (See for example [33].)
For a matrix R, let Ri be the ith column of R. The pivot index of Ri , denoted by Pivot Ri , is the largest row index of
any nonzero entry, or 0 if all column entries are 0. Define Pivots R = ∪i Pivot Ri \ {0}.
A column Ri is reduced if Pivot Ri cannot be decreased using column additions by scalar multiples of columns
Rj with j < i. In particular, Ri is reduced if Ri = 0 or all columns Rj with j < i are reduced and satisfy Pivot Rj ̸=
Pivot Ri . The entire matrix R is reduced if all of its columns are reduced.
The following proposition is the basis for matrix reduction algorithms to compute persistence homology.
Theorem 8.8.1. [33]. Let D be a filtration boundary matrix and let V be a full rank upper triangular matrix such
that R = DV is reduced. Then the index persistence (i.e. birth-death) pairs are

{(i, j) | i = Pivot Rj ̸= 0},

and the essential indices (birth times of classes that live forever) are

{i | Ri = 0, i ∉ Pivots R}.

Theorem 8.8.2. The persistent homology is generated by the representative cycles

{Rj |j is a death index} ∪ {Vi |i is an essential index},

in the sense that for all filtration indices k, H∗ (Kk ) is generated by the cycles

{Rj |(i, j) is an index persistence pair with k ∈ [i, j)} ∪ {Vi |i is an essential index with k ∈ [i, ∞)},

and for all pairs of indices k, m, the image of the map in homology H∗ (Kk ) → H∗ (Km ) induced by inclusion has a
basis generated by the cycles

{Rj |(i, j) is an index persistence pair with k, m ∈ [i, j)} ∪ {Vi |i is an essential index with k ∈ [i, ∞)}.

This leads to the first optimization. We don't care about a column with a non-essential birth index, so we set
Ri = 0 if i = Pivot Rj for some j ̸= i. As we still need to compute V , we can simply set Vi = Rj , as this column is a
boundary. Then we still get R = DV reduced and V full rank upper triangular. This is a big savings, as reducing birth
columns is typically harder than reducing death columns, and we only need to reduce essential birth columns.
Now for cohomology, we reduce the transpose DT instead of D. In low dimensions it turns out that the number of skipped
columns is much greater when we use cohomology. The persistence barcodes are easily converted from cohomology
to homology by the Universal Coefficient Theorem, as we are working over a field.
Returning to our example of 192 points on a sphere, we need to reduce 56,050,096 columns. Using clearing, this
reduces to 54,888,816. With cohomology and clearing, the number reduces to 1,161,472 which is less than 1/60 of the
work.
The next optimization is implicit matrix reduction. Ripser only stores the matrix V , which is much sparser than
R = DV , and recomputes columns of D and R only when needed. We only need the current column Rj and the pivots
of the previous columns to obtain the persistence intervals [i, j).
The last optimization of apparent and emergent pairs involves ideas from discrete Morse theory where simplices
can be collapsed onto faces. See [13] and its references for the details which I will omit in the interest of space.
In conclusion, topological data analysis so far only uses a small fraction of the machinery of algebraic topology.
The recent development of the Ripser algorithm demonstrated that cohomology is a very valuable tool that had not
previously been explored. Are there more tools from algebraic topology that would be useful in data science? It is
hard for me to imagine that this is not the case. In the remainder of this book, I would like to explore this possibility.
Chapter 9

Homotopy Theory

Homotopy theory is the branch of algebraic topology dealing with the classification of maps of a sphere into a space.
It grew up alongside homology theory and is closely related. Like homology, it is a functor from topological spaces
and continuous maps to groups and homomorphisms. If in some dimension, the groups are not isomorphic, the spaces
are not homeomorphic or even homotopy equivalent. Also, the theory is somewhat easier to picture for homotopy as
opposed to homology.
There are also some major differences. As we have seen, homology groups are computable for a finite simplicial
complex, while homotopy groups can be computed only in very special cases. While spheres have the simplest homology
groups, we don’t even know all of the homotopy groups of S 2 . (And we never will.) In fact, the homotopy groups of
spheres are a special topic with an extensive and very difficult literature. I will briefly touch on it in Chapter 12.
The simplest spaces in homotopy which have only one nonzero homotopy group are the Eilenberg-Mac Lane spaces.
With the exception of the circle S 1 , Eilenberg-Mac Lane spaces are generally very complicated and usually infinite
dimensional. Sometimes, though, homotopy groups are easier to calculate. For example, homotopy groups of product
spaces are just direct sums, as opposed to homology where we need the more complicated Künneth Theorem. And for
fiber bundles, a twisted version of a product (e.g., a Möbius strip), homotopy produces a long exact sequence, while for
homology, you need to use a spectral sequence, one of the scariest mathematical tools ever dreamed up. I will try to
give you a taste of what spectral sequences are at the end of this chapter.
So why am I putting homotopy in a book about data science? The answer is that it is a vital component of
obstruction theory. Suppose L is a subcomplex of a simplicial complex K. We have a continuous map f : L → Y
and we would like to extend it to a map g : K → Y such that g is continuous and g|L = f . The idea is to extend it
over the vertices, edges, triangles, etc. Suppose we have extended it to the k-simplices. For the next step, we know
that the function has been extended to the boundary of each (k + 1)-simplex and we want to extend it to the interior.
This turns out to be equivalent to asking if a certain cocycle on K is equal to zero. I will explain the details in Chapter
10.
For now, though, recall that in a filtration, the complex at an earlier filtration index is a subcomplex of the complex
at a later one. For a point cloud, we may ask if a function on the Vietoris-Rips complex at radius ϵ1 can be extended
to a function on the complex at radius ϵ2 , where ϵ2 > ϵ1 . If not, where in the cloud can it be done, and where are the
obstructions? Also, what kind of function and what target space Y would be useful in applications?
In this chapter, I will restrict myself to giving the basic definitions and results that you need to understand what
comes afterwards. I will leave out most of the proofs and refer you to the standard textbooks for more details.
The first book I will recommend is Steenrod’s Topology of Fibre Bundles [156], the earliest homotopy textbook I
know of. (Note: Although ”fibre” is often used in the literature, I will use the American spelling ”fiber” unless it is in
a direct quote.) It is dense and quite dated but very clearly written. My personal favorite is the book by Hu [74]. It is
also very clear but also a little old. The chapter on spectral sequences, though, is missing some key pieces so I would
recommend looking elsewhere on that topic. (See below for some suggestions.) Another classic is the extremely thick
and dense book by Whitehead [170]. Finally, there is a section on homotopy in Hatcher [63].
Unless otherwise stated, the material in Sections 9.1-9.7 comes from Hu [74]. Section 9.1 will describe the extension
problem, a major theme for the remainder of this book and its dual, the lifting problem. In Section 9.2, I will start with
the homotopy group in dimension 1, also known as the fundamental group. Section 9.3 will discuss fiber bundles, and
Section 9.4 will discuss an interesting nontrivial example, the Hopf maps. Another example is the spaces of paths and
loops which are the subject of Section 9.5. Higher dimensional homotopy groups are the subject of Section 9.6 and
Section 9.7 gives some examples of computational tricks. Finally, I will discuss the computational tools of Postnikov
systems in Section 9.8 and spectral sequences in Section 9.9.

9.1 The Extension and Lifting Problems


Definition 9.1.1. Let f : X → Y be a continuous map between two topological spaces (generally, the term map will
always mean a continuous function), and let A be a subspace of X. Then the map g : A → Y such that g(x) = f (x)
for all x ∈ A is called the restriction of f to A, and f is called an extension of g.
If h : A → X is inclusion, then the extension problem is to determine whether or not a given map g : A → Y has
an extension over X. Equivalently, we want to find f : X → Y which makes the following diagram commute (i.e.,
g = f h):
g
A Y

h f
X

Using the functoriality properties of homology, if such an extension exists, we can find f∗ so that the following
diagram commutes:
g∗
Hm (A) Hm (Y )

h∗ f∗
Hm (X)

for each m.
Recall that we have looked at this problem for A = Y = S n , n > 0, and X = B n+1 . If g is a constant map, let
g(a) = y0 for all a ∈ A and fixed y0 ∈ Y . We can easily extend to all of X by letting f (x) = y0 .
But suppose g : S n → S n is the identity map. Then Hn (S n ) ≅ Z and g∗ is the identity on Z. But Hn (B n+1 ) = 0,
so h∗ = 0. Thus, it is impossible to find f∗ such that f∗ h∗ = g∗ , and the identity cannot be extended over the entire
ball.
Now the homology problem has a solution even if g ≃ f h meaning that g is homotopic to f h rather than strictly
equal to it. We will use this more general problem in what follows. Hu lists a number of related problems. One
interesting one is the dual of the extension problem.
Let A, X and Y be topological spaces. We don’t assume anything about inclusion but instead let h : X → A be
surjective. Given a map g : Y → A, we would like to find a map f : Y → X such that g = hf . This is called the
lifting problem. Here is a diagram repositioned to illustrate the name ”lifting” but with the spaces and maps having the
same names as before. You can see that all of the arrows have been reversed.

X
f
h

Y g A

Like the extension problem, the lifting problem is also usually generalized to the case of finding a map f such that
g ≃ hf.
My dissertation [124] actually involved solving a lifting problem involving some strange infinite dimensional
function spaces.

9.2 The Fundamental Group


In this section, we will discuss the properties of the one-dimensional homotopy group, also known as the funda-
mental group. It classifies maps between S 1 and a space X.
First, though, we will look at a simple example of what we will call a fiber bundle in the next section.

Definition 9.2.1. The exponential map is the map p : R → S 1 defined by

p(x) = e2πxi ,

for x ∈ R.

The first result is an easy consequence of the definition.

Theorem 9.2.1. The exponential map p is a homomorphism in the sense that p(x + y) = p(x)p(y) for x, y ∈ R. (Note
that multiplying in S 1 is actually adding angles.) The kernel p−1 (1) of p consists of the integers.
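A quick numerical check of both claims; the sample points are arbitrary choices.

```python
import cmath
import math

def p(x):
    """The exponential map p: R -> S^1, p(x) = e^(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

# p is a homomorphism: p(x + y) = p(x) p(y)
x, y = 0.3, 1.7
assert abs(p(x + y) - p(x) * p(y)) < 1e-12

# the kernel p^{-1}(1) consists of the integers
assert abs(p(5) - 1) < 1e-12 and abs(p(-3) - 1) < 1e-12
assert abs(p(0.5) - 1) > 1          # p(1/2) = -1 is not in the kernel
```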

Theorem 9.2.2. For every proper connected subspace U of S 1 , p takes every component of p−1 (U ) homeomorphically
onto U .

For the next result, we need to define paths and loops. These will be crucial in what follows.

Definition 9.2.2. A path in a space X is a map σ : I → X, where I = [0, 1]. σ(0) is called the initial point of σ and
σ(1) is called the terminal point. If σ(0) = σ(1) = x0 , then σ is called a loop, and x0 is its base point.

Theorem 9.2.3. The Covering Path Property: For every path σ : I → S 1 and every point x0 ∈ R such that
p(x0 ) = σ(0), there exists a unique path τ : I → R such that τ (0) = x0 and pτ = σ.

Important: In this subject, it is easy to get lost in all the spaces and maps between them. If you get confused, draw
diagrams and make sure you keep track of which spaces are the domain and range of the maps involved.
In the previous result, if log denotes the natural logarithm (with the branch chosen so that τ varies continuously), the path τ : I → R is given by τ (0) = x0 and

τ (t) = (1/(2πi)) log(σ(t))

for t ∈ I.
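Numerically, the lift can be computed step by step: on each small increment the ratio of consecutive points stays near 1, where the principal logarithm is single-valued, so the logarithm formula picks out the unique nearby lift. This is a sketch; the step count n = 1000 is an arbitrary choice, assumed fine enough for the loops used.

```python
import cmath
import math

def lift(sigma, x0, n=1000):
    """Lift a path sigma: [0,1] -> S^1 to tau: [0,1] -> R with tau(0) = x0,
    applying the (1/2*pi*i) log formula on small increments where it is valid."""
    tau = [x0]
    prev = sigma(0)
    for k in range(1, n + 1):
        z = sigma(k / n)
        # principal log of the small ratio z/prev gives the unique nearby lift
        tau.append(tau[-1] + (cmath.log(z / prev) / (2j * math.pi)).real)
        prev = z
    return tau

# the loop t -> e^{2*pi*i*3t} winds 3 times around S^1; its lift ends at x0 + 3
sigma = lambda t: cmath.exp(2j * math.pi * 3 * t)
tau = lift(sigma, 0.0)
assert abs(tau[-1] - 3.0) < 1e-9
```

For a loop, τ(1) − τ(0) is always an integer: the winding number. This is exactly the invariant behind the isomorphism of π1 (S 1 ) with Z later in this section.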
Recall that for f, g : X → Y , f is homotopic to g, denoted f ≃ g, if there is a continuous map H : X × I → Y
such that H(x, 0) = f (x) and H(x, 1) = g(x). Following Hu, we will write ft : X → Y and call it a homotopy of f if f is
specified but g is not.

Theorem 9.2.4. The Covering Homotopy Property: For every map f : X → R of a space X into the reals, and
every homotopy ht : X → S 1 , (0 ≤ t ≤ 1), of the map h = pf , there exists a unique homotopy ft : X → R of f such
that pft = ht for every t ∈ I.

Now we define the fundamental group π1 (X).


Let X be a space and x0 ∈ X. Let Ω be the set of all maps f : S 1 → X such that f (1) = x0 . If we replace the
map f with the map f p : I → X where p is the exponential map, we have a loop in X with x0 as its base point. The
correspondence is one to one, so identifying f with f p, we can consider Ω to be the set of loops in X with base point
x0 .

Letting
Ω = {f : I → X|f (0) = x0 = f (1)},
we can define a multiplication in Ω. If f, g ∈ Ω, the product consists of travelling around loop f and then loop g. To
be precise, we define the product f · g by

(f · g)(t) = f (2t) if 0 ≤ t ≤ 1/2,
(f · g)(t) = g(2t − 1) if 1/2 ≤ t ≤ 1.

Now homotopy of maps is an equivalence relation, so we denote the homotopy class of f by [f ] and call f a
representative of the class [f ].
Let d denote the degenerate loop d(I) = x0 . For a loop f ∈ Ω, let f −1 be the loop f traversed in the opposite
direction so that f −1 (t) = f (1 − t). Then (f −1 )−1 = f .
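The product and inverse of loops translate directly into code; the two circle loops below, based at the origin of the plane, are arbitrary illustrative choices.

```python
import math

def concat(f, g):
    """The product f . g: traverse f on [0, 1/2], then g on [1/2, 1]."""
    return lambda t: f(2 * t) if t <= 0.5 else g(2 * t - 1)

def inverse(f):
    """The loop f traversed in the opposite direction."""
    return lambda t: f(1 - t)

# two loops in the plane based at (0, 0): circles traversed once
f = lambda t: (math.cos(2 * math.pi * t) - 1, math.sin(2 * math.pi * t))
g = lambda t: (1 - math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

h = concat(f, g)
assert h(0) == (0.0, 0.0)                      # still a loop at the base point
assert all(abs(v) < 1e-12 for v in h(1))
assert h(0.25) == f(0.5)                       # first half runs f at double speed
assert inverse(inverse(f))(0.5) == f(0.5)      # (f^-1)^-1 = f
```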

Theorem 9.2.5. If f, g, h ∈ Ω and d, f −1 are as defined above, the following hold:

1. If f ≃ f ′ and g ≃ g ′ , then f · g ≃ f ′ · g ′ .

2. (f · g) · h ≃ f · (g · h).

3. f · d ≃ d · f ≃ f .

4. f · f −1 ≃ f −1 · f ≃ d.

By property 1, [f ][g] = [f · g]. So Theorem 9.2.5 implies that the homotopy classes form a group with identity
element e = [d] and the inverse of [f ] is [f −1 ].

Definition 9.2.3. The homotopy classes of loops f : I → X with f (0) = f (1) = x0 form a group with the
multiplication defined above called the fundamental group of X at x0 denoted π1 (X, x0 ).

As an example, let X = S 1 . Remember from Section 4.1.7 that the degree of a map f : S n → S n determines the
homotopy class of f . Now Hn (S n ) ≅ Z, and we defined the degree deg(f ) by f∗ (z) = deg(f )z. Theorem 4.1.29
implies that the map [f ] → deg(f ) is a well defined homomorphism from π1 (S 1 , 1) to Z. Hu proves that a map
f : S 1 → S 1 of degree n is homotopic to the map ϕn : S 1 → S 1 defined by ϕn (z) = z n and that ϕn has degree n.
So the homomorphism is one-to-one and onto, and hence π1 (S 1 ) ≅ Z.
Note that the fundamental group is not computable in general. We can, however, write down generators and
relations for the fundamental group of a finite simplicial complex.
Also, unlike homology groups, the fundamental group may not be abelian. For example, let E be a figure 8. Let
x0 be the point where the two loops meet, f denote the top loop and g denote the bottom loop. Then π1 (E, x0 ) is the
(non-abelian) free group generated by f and g.
What happens if we pick a different base point x1 ? If there is a path σ from x1 to x0 , then if f is a loop with
base point x1 , and τ is the reverse of σ, then σ · f · τ is a loop with base point x0 . So the map [f ] → [σ · f · τ ] is an
isomorphism π1 (X, x1 ) ≅ π1 (X, x0 ).
If X is pathwise connected, we can dispense with the base point and simply write π1 (X).
Finally, if ϕ : X → Y and f : I → X is a loop on X, then ϕf : I → Y is a loop on Y . So ϕ gives rise to a
homomorphism ϕ∗ : π1 (X, x0 ) → π1 (Y, y0 ) for y0 = ϕ(x0 ) where ϕ∗ ([f ]) = [ϕf ].
Now we would like to know when π1 (X) = 0 for a pathwise connected space X.

Definition 9.2.4. A pathwise connected space X is simply connected if every pair of paths σ, τ : I → X with
σ(0) = τ (0) and σ(1) = τ (1) are homotopic by a homotopy that fixes the endpoints.

If f is a loop in a simply connected space X, and d is the degenerate loop at f (0) = f (1) = x0 , then [f ] = [d]. So
π1 (X) = 0. Hu proves the stronger statement that the two conditions are equivalent.

Theorem 9.2.6. If X is a nonempty pathwise connected space, then X is simply connected if and only if π1 (X) = 0.

Theorem 9.2.7. S n is simply connected if and only if n > 1.

Proof: S 0 consists of two points and is not even pathwise connected, so it is not simply connected. S 1 is also not
simply connected by the previous theorem since π1 (S 1 ) ≅ Z ̸= 0.
For n > 1, let f : I → S n be a loop on S n with base point x0 . Since n > 1, f can be homotoped to a loop whose image
misses a point, and is therefore contained in a proper subspace of S n , which is contractible to the point x0 . So [f ] = [d] and π1 (S n ) = 0. ■
Since the fundamental group can be non-abelian, how does it compare with the one-dimensional homology group?
For a pathwise connected space X, there is a simple relationship.
Let
h∗ : π1 (X, x0 ) → H1 (X)
be defined as follows. Let α ∈ π1 (X, x0 ). Choose a representative loop for α called f : S 1 → X with f (1) = x0 .
Then f induces a homomorphism f∗ : H1 (S 1 ) → H1 (X). Let ι be the generator for the infinite cyclic group H1 (S 1 )
corresponding to the counterclockwise orientation of S 1 . Then f∗ (ι) ∈ H1 (X) depends only on the class α so we
define h∗ by h∗ (α) = f∗ (ι). (We will see several variants of this construction in future discussions.)
Now we need a definition from abstract algebra.

Definition 9.2.5. Let G be a group and let g, h ∈ G. Then the commutator [g, h] of g and h is the element ghg −1 h−1 .
The set of commutators is not closed under multiplication, but the subgroup they generate is called the commutator
subgroup.

If G is abelian, any commutator reduces to the identity. If Comm(G) is the commutator subgroup of G, then the
quotient group G/Comm(G) is abelian and called the abelianization of G.
Since h∗ is a homomorphism and H1 (X) is abelian, Comm(π1 (X, x0 )) must be contained in the kernel of h∗ .
We actually have the following:

Theorem 9.2.8. If X is pathwise connected then the natural homomorphism h∗ maps π1 (X, x0 ) onto H1 (X) with the
commutator subgroup Comm(π1 (X, x0 )) as its kernel. So H1 (X) is isomorphic to the abelianization of π1 (X, x0 ).

Theorem 9.2.8 is the Hurewicz Theorem for dimension one. The Theorem has a nicer form for higher dimensions
which we will see in Section 9.7.
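For the figure eight above, the Hurewicz map can be made completely explicit: π1 is free on the loops f and g, H1 is Z2 , and h∗ just records exponent sums. The word encoding below is an illustrative choice.

```python
from collections import Counter

def abelianize(word):
    """h*: a word in the free group on f and g, written as (generator, exponent)
    pairs, maps to its exponent sums in H_1, the free abelian group Z^2."""
    sums = Counter()
    for gen, exp in word:
        sums[gen] += exp
    return (sums['f'], sums['g'])

# the commutator [f, g] = f g f^-1 g^-1 is nontrivial in the free group pi_1
# of the figure eight, but it lies in the kernel of h*: it maps to 0 in H_1
commutator = [('f', 1), ('g', 1), ('f', -1), ('g', -1)]
assert abelianize(commutator) == (0, 0)

# f.g and g.f are different elements of pi_1 but agree in the abelian group H_1
assert abelianize([('f', 1), ('g', 1)]) == abelianize([('g', 1), ('f', 1)]) == (1, 1)
```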

9.3 Fiber Bundles


So what is a fiber bundle anyway? The usual definition is a space that is ”locally” a cartesian product of two
spaces. Another definition is a map that has the covering homotopy property.
One thing that will help us a lot is that we are working with finite simplicial complexes in any data science appli-
cation. The definitions become equivalent if the spaces we are working with are paracompact.

Definition 9.3.1. A space X is paracompact if every open cover {Uα } of X has a locally finite refinement. In
other words, there is an open cover {Vβ } such that every set Vβ in the cover is contained in some Uα in the original
cover, and every point x ∈ X has a neighborhood which intersects only finitely many of the sets in {Vβ }.

A compact space is easily seen to be paracompact. Any CW complex is paracompact and any finite CW complex
is actually compact. Finite simplicial complexes are also compact and those are the only kind of spaces we will deal
with in applications.
We will start with the simpler definition found in Hu [74]. Then we will compare it to the more elaborate version
in Steenrod [156]. Be careful of similar sounding words that may or may not mean the same thing. And again, draw
yourself lots of pictures. There are way too many symbols to keep track of otherwise.
Hu starts his discussion with various covering homotopy properties. There are really only three that will concern
us: The CHP, the ACHP, and the PCHP. For the definitions refer to the following diagram:

E
f∗
p

X B
f

Definition 9.3.2. Let X be a given space and f : X → B be a given map, and let ft : X → B for t ∈ [0, 1] be a
given homotopy of f . A map f ∗ : X → E is said to cover f relative to p if pf ∗ = f . A homotopy ft∗ : X → E
for t ∈ [0, 1] of f ∗ is said to cover the homotopy ft relative to p if pft∗ = ft for all t. Then ft∗ is called a covering
homotopy of ft .

Definition 9.3.3. The map p : E → B is said to have the covering homotopy property or CHP for the space X if for
every map f ∗ : X → E and every homotopy ft : X → B of the map f = pf ∗ : X → B, there exists a homotopy
ft∗ : X → E of f ∗ which covers the homotopy ft . The map p : E → B has the absolute covering homotopy property
or ACHP if it has the CHP for every space X. The map p : E → B has the polyhedral covering homotopy property or
PCHP if it has the CHP for every triangulable space X.

Example 9.3.1. Let E = B × D, and let p : E → B be the natural projection. Let X be a given space, f ∗ : X → E
be a given map, and let ft : X → B be a given homotopy of the map f = pf ∗ . Then ft has a covering homotopy
ft∗ : X → E of f ∗ defined by
ft∗ (x) = (ft (x), qf ∗ (x))
for every x ∈ X, t ∈ [0, 1] where q : E → D is the other projection. So p has the ACHP.
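The formula in Example 9.3.1 is easy to realize in code; B, D, X, and the maps below are toy choices for illustration.

```python
import math

# product bundle E = B x D with projection p and "other projection" q
p = lambda e: e[0]
q = lambda e: e[1]

fstar = lambda x: (x, math.sin(x))     # a map f* : X -> E, here with X = R
ft = lambda x, t: x + t                # a homotopy of f = p f*, so f_0(x) = x

def fstar_t(x, t):
    """The covering homotopy f*_t(x) = (f_t(x), q f*(x)) from Example 9.3.1."""
    return (ft(x, t), q(fstar(x)))

x, t = 1.3, 0.7
assert fstar_t(x, 0) == fstar(x)          # starts at f*
assert p(fstar_t(x, t)) == ft(x, t)       # covers the homotopy: p f*_t = f_t
```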

Definition 9.3.4. A map p : E → B is a fibering if it has the PCHP. In this case E is called a fiber space over the base
space B with projection p : E → B. (E is also called the total space.) For each point b ∈ B, the subspace p−1 (b) of
E is called the fiber over b.

For all of our applications a fibering will be equivalent to what Hu calls a bundle space.

Definition 9.3.5. A map p : E → B has the bundle property if there exists a space D such that for each b ∈ B, there
is an open neighborhood U of b in B together with a homeomorphism

ϕU : U × D → p−1 (U )

of U × D onto p−1 (U ) satisfying the condition

pϕU (u, d) = u

for u ∈ U , d ∈ D. Then E is called the bundle space over the base space B relative to the projection p. The space D
is called a director space. The open sets U and the homeomorphisms ϕU are called the decomposing neighborhoods
and decomposing functions respectively.

A bundle space can be thought of as a space that is locally a cartesian product but may have a twist. In this way a
Möbius strip differs from a cylinder, the latter of which is S 1 × I.

Definition 9.3.6. Let E = B × D and p : E → B be projection. Then E is a bundle space over B. (Just let the
neighborhood U be all of B.) Then E is called a product bundle or trivial bundle over B.

Now you can finally understand the pun from my introduction: What do you get when you cross an elephant and
an ant? The trivial elephant bundle over the ant.
The following two results give the connection between a bundle space and the fiber space we defined first. The
idea is to use the fact that the projection of a product space onto one of its factors has the ACHP and thus the PCHP.

Theorem 9.3.1. Every bundle space E over B relative to p : E → B is a fiber space over B relative to p.

Theorem 9.3.2. If a map p : E → B has the bundle property, then it has the CHP for every paracompact Hausdorff
space and thus any finite CW complex or simplicial complex.

Steenrod [156] calls the bundle space we just defined an Ehresmann-Feldbau bundle or E-F bundle. It is a special
case of what he calls a coordinate bundle which has a lot more structure. I will now give his definition of a coordinate
bundle and the related term fiber bundle. We first need the notion of a topological group.

Definition 9.3.7. A topological group G is a set which has a group structure and a topology such that:

1. The map g → g −1 is continuous for g ∈ G.

2. The map µ : G × G → G given by µ(g1 , g2 ) = g1 g2 is continuous.

Definition 9.3.8. If G is a topological group and Y is a topological space, then we say that G is a topological
transformation group of Y relative to a map η : G × Y → Y if:

1. η is continuous.

2. η(e, y) = y if e is the identity of G.

3. η(g1 g2 , y) = η(g1 , η(g2 , y)) for g1 , g2 ∈ G and y ∈ Y .

We say that G acts on Y. If Y is a set and not necessarily a topological space, a group G (not necessarily topological)
acts on Y if statements 2 and 3 hold.

We usually leave out η and write η(g, y) as gy.


For a fixed g, the map y → gy is a homeomorphism of Y onto itself since it has a continuous inverse y → g −1 y.
So η induces a homomorphism from G to the group of homeomorphisms of Y . G is effective if gy = y for all y ∈ Y
implies that g = e. Then G is isomorphic to a group of homeomorphisms of Y .
In our definition, we will assume that G is effective. I will now define a coordinate bundle. There are lots of maps
here. Draw yourself pictures or I guarantee that you will be totally confused. Also, I will use E for the bundle space
and B for the base space which is more conventional, but note that Steenrod uses B for the bundle space and X for
the base space.

Definition 9.3.9. A coordinate bundle B consists of the following:

1. A space E called the bundle space.

2. A space B called the base space.

3. A map p : E → B of E onto B called the projection.

4. A space Y called the fiber.

5. An effective topological transformation group G of Y called the group of the bundle.

6. A family {Vj } of open sets covering B. These open sets are called coordinate neighborhoods.

7. For each coordinate neighborhood Vj ⊂ B, we have a homeomorphism ϕj : Vj × Y → p−1 (Vj ) called the
coordinate function. These functions need to satisfy conditions 8-10.

8. pϕj (b, y) = b.

9. If the map ϕj,b : Y → p−1 (b) is defined by setting

ϕj,b (y) = ϕj (b, y),

then for each pair of indices i, j, and each b ∈ Vi ∩ Vj , the homeomorphism

(ϕj,b )−1 ϕi,b : Y → Y

coincides with the operation of an element of G which is unique since G is effective.

10. For each pair i, j of indices, the map


gji : Vi ∩ Vj → G
defined by
gji (b) = (ϕj,b )−1 ϕi,b

is continuous. These maps are called the coordinate transformations of the bundle.
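The Möbius band mentioned earlier packages this data in the smallest possible way: base S 1 , fiber R, group G = {+1, −1}, two coordinate neighborhoods, and a single sign flip in one transition function. The particular charts below are my own illustrative choices.

```python
import math

# Moebius band as a coordinate bundle over S^1 (angles mod 2*pi) with fiber R
# and group G = {+1, -1}: coordinate neighborhoods V1 = S^1 \ {pi} and
# V2 = S^1 \ {0}; their overlap has two components, (0, pi) and (pi, 2*pi).

def g12(theta):
    """Coordinate transformation g_12 : V1 n V2 -> G.  The -1 on one overlap
    component is the twist; replacing it by +1 gives the trivial cylinder."""
    theta = theta % (2 * math.pi)
    assert 0 < theta < 2 * math.pi and theta != math.pi   # must be in the overlap
    return 1 if theta < math.pi else -1

g11 = lambda theta: 1                   # g_ii is always the identity of G
g21 = g12                               # an element of {+1, -1} is its own inverse

# the cocycle condition g_21(b) g_12(b) = e holds on the overlap
assert g21(1.0) * g12(1.0) == 1 and g21(4.0) * g12(4.0) == 1
assert g12(1.0) == 1 and g12(4.0) == -1   # opposite signs on the two components
```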

Definition 9.3.10. In a coordinate bundle, we denote p−1 (b) = Yb and call it the fiber over b.

Definition 9.3.11. Two coordinate bundles are equivalent if they have the same bundle space, base space, projection,
fiber and group, and their coordinate functions {ϕj }, {ϕ′k } satisfy the conditions that

g kj (b) = (ϕ′k,b )−1 ϕj,b ,

for b ∈ Vj ∩ Vk′ coincides with the operation of an element of G, and the map

g kj : Vj ∩ Vk′ → G

is continuous. This defines an equivalence relation. An equivalence class of coordinate bundles is called a
fiber bundle.

In the bundle space defined in Hu (our Definition 9.3.5), the director space is the same as the fiber of a coordinate
bundle. In this case, the fibers over each point in the base space are homeomorphic. (See [156] Section 6.)
For the rest of this chapter and the book in general, I will use the terms fiber space, bundle space, and fiber bundle
in accordance with my sources, but remember that in the cases we care about, fiber space and bundle space are pretty
much equivalent, and a fiber bundle differs by the inclusion of a group of transformations.
The rest of this section is taken from [74].

Definition 9.3.12. If E is a bundle space over B with projection p : E → B, a cross section in E over a subspace X
of B is a map f : X → E such that pf (x) = x for x ∈ X.

If U ⊂ B is a decomposing neighborhood with decomposing function ϕU : U × D → p−1 (U ) and projection


ψU : U × D → D, then for any point e ∈ p−1 (U ), there is a cross section fe : U → E given by fe (u) = ϕU (u, d),
where d = ψU (ϕU )−1 (e), for each u ∈ U . If u = p(e), then fe (u) = e. So in bundle spaces, local cross sections always
exist.
For global cross sections in which f is defined on all of B, we would have pf = iB so on homology p∗ f∗ is the
identity on H∗ (B). Then f∗ : Hm (B) → Hm (E) is a monomorphism for all m, and p∗ : Hm (E) → Hm (B) is an
epimorphism for all m. We will see that this condition fails for the Hopf maps from the next section.
I will conclude this section with a discussion of a useful construction which produces a new fiber space from an
old one by way of a map of a given space into its base space.

Definition 9.3.13. Let p : E → B and p′ : E ′ → B ′ be two fiberings. A map F : E → E ′ is called a fiber map if it


carries fibers into fibers, i.e. for every b ∈ B there is a b′ ∈ B ′ such that F carries p−1 (b) into p′−1 (b′ ).

In this case, F induces a function f : B → B ′ defined by

f (b) = p′ F p−1 (b)

for every b ∈ B. Then f is called the induced map of the fiber map F . If E is a bundle space over B, then p is open
so f is continuous. The following diagram is commutative:

F
E E′
p p′

B B′
f

Suppose we are given a fibering p′ : E ′ → B ′ and a map f : B → B ′ of a given space B into B ′ . We can construct
a fibering p : E → B and a fiber map F : E → E ′ that induces f . (Hu calls F a lifting of f .) Let E be the subspace of
B × E ′ given by
E = {(b, e′ ) ∈ B × E ′ |f (b) = p′ (e′ )}
and let p : E → B denote the projection defined by p(b, e′ ) = b. Let F : E → E ′ be defined by F (b, e′ ) = e′ . Then
by construction f p = p′ F , so F is a lifting of f .
Hu uses the polyhedral covering homotopy property of p′ to show that it also holds for p so p is a fibering called
the fibering induced by f.
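As a toy sketch of this construction, take p′ : R → S 1 to be the exponential map from Section 9.2 and f : S 1 → S 1 the squaring map. Everything below, including the tolerance-based membership test, is an illustrative choice and not part of Hu's treatment.

```python
import cmath
import math

pprime = lambda t: cmath.exp(2j * math.pi * t)   # p' : R -> S^1, exponential map
f = lambda z: z * z                              # f : S^1 -> S^1, squaring

def in_E(b, t, eps=1e-9):
    """Membership in the induced total space E = {(b, e') : f(b) = p'(e')}."""
    return abs(f(b) - pprime(t)) < eps

p = lambda b, t: b            # induced projection p(b, e') = b
F = lambda b, t: t            # the lifting F(b, e') = e'

b = cmath.exp(2j * math.pi * 0.1)      # point of B = S^1 at angle 2*pi*0.1
t = 0.2                                 # f(b) = p'(0.2), so (b, t) lies in E
assert in_E(b, t)
# the defining square commutes: f p = p' F on E
assert abs(f(p(b, t)) - pprime(F(b, t))) < 1e-12
```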
In the case where B is a subspace of B ′ and f is inclusion, then E can be identified with p′−1 (B) and p with p′ |B.
We then call the induced fibering p : E → B the restriction of p′ : E ′ → B ′ to B. This gives the following result.

Theorem 9.3.3. If E is a fiber space over a base space B with projection p : E → B and if A is a subspace of B
then p−1 (A) is a fiber space over A with (p|p−1 (A)) as a projection.

9.4 The Hopf Maps


There are three interesting fiber bundles (or fiber spaces if we ignore their transformation groups) called the Hopf
bundles or Hopf fibrations. The corresponding projections will be called Hopf maps. These are the only fiber spaces
in which the fiber, bundle space, and base space are all spheres. Proving this is very hard but it is related to the Hopf
invariant problem which we will defer to Chapter 11.
The original paper describing these bundles was published by Hopf in 1935 [72]. The three maps are described
in more or less detail in Steenrod [156], Hu [74], and Hatcher [63]. I will follow Hatcher’s definitions. Then I will
talk more about the first bundle which maps S 3 → S 2 following Lyons, who takes a different approach and gives a
good intuitive description of where this map comes from. Lastly, I will state some results from Hu on maps which
are algebraically trivial, i.e. trivial on homology in each dimension.
Hopefully it is not too much of a spoiler here to tell you that πn (X) for n > 1 is the group of homotopy classes of
maps from S n → X. (The group operation and more details will be explained in their place in Section 9.6.) You may
hope that like homology and cohomology, the homotopy groups πm (S n ) = 0 for m > n. Then I could end this book
right here. The Hopf maps were the bad news that this wasn’t true. As the map from S 3 → S 2 will turn out not to be
homotopic to a constant, we know that π3 (S 2 ) ̸= 0. For obstruction theory applications to data science, this will be
one of our simplest examples, and I will discuss the situation in detail in Section 10.2.
Writing our bundles in the form F → E → B where F is the fiber, E is the bundle space, and B is the base space,
the three Hopf bundles are:

1. S 1 → S 3 → S 2 .

2. S 3 → S 7 → S 4 .

3. S 7 → S 15 → S 8 .

I will describe these one at a time. The material comes from [63]. First, though, consider the map S n → P n which
identifies antipodal points on S n . This is a fiber bundle called a covering space, which means it has a discrete fiber. In
this case, the fiber consists of two points, so we can think of it as S 0 .
Now, look at the complex analogue. We have the fiber space S 1 → S 2n+1 → CP n . Then S 2n+1 is the unit sphere
in C n+1 and CP n is the quotient space of S 2n+1 under the equivalence relation (z0 , · · · , zn ) ∼ λ(z0 , · · · , zn ) for
λ ∈ S 1 , the unit circle in C. The projection S 2n+1 → CP n sends (z0 , · · · , zn ) to its equivalence class [z0 , · · · , zn ].
Since λ varies over S 1 , the fibers are copies of S 1 . To check that it is actually a bundle space, we let Ui ⊂ CP n be
the open set of equivalence classes [z0 , · · · , zn ] such that zi ̸= 0. Let hi : p−1 (Ui ) → Ui × S 1 by hi ((z0 , · · · , zn )) =
([z0 , · · · , zn ], zi /|zi |). We can do this since zi ̸= 0. This takes fibers to fibers and is a homeomorphism with inverse
([z0 , · · · , zn ], λ) → λzi−1 |zi |. So there is a local section.
Now let n = 1. Recall that CP 1 consists of a 0-cell and a 2-cell so it is homeomorphic to S 2 . So the bundle
constructed as above becomes S 1 → S 3 → S 2 . This is our first Hopf bundle. The projection can be taken to be
(z0 , z1 ) → z0 /z1 ∈ C ∪ {∞} = S 2 . The fiber, bundle space, and base space are all spheres. In polar coordinates we
have

p(r0 eiθ0 , r1 eiθ1 ) = (r0 /r1 )ei(θ0 −θ1 )

where r02 + r12 = 1.
Now replace C by the field H of quaternions. Recall from Example 3.3.3 that the quaternions are of the form
a + bi + cj + dk where a, b, c, and d are real numbers and i2 = j 2 = k 2 = −1. We also have ij = k, jk = i, ki =
j, ji = −k, kj = −i, and ik = −j. (So the quaternions are not commutative.) Also define the conjugate h̄ for
h ∈ H: if h = a + bi + cj + dk, then h̄ = a − bi − cj − dk. Quaternionic projective space HP n is formed from
S 4n+3 by identifying points which are multiples of each other by unit quaternions analogous to the complex case. It
has a CW structure with a cell in dimensions that are a multiple of 4 up to 4n. So HP 1 is S 4 . There is a fiber bundle
S 3 → S 4n+3 → HP n . Here S 3 is the unit quaternions and S 4n+3 is the unit sphere in H n+1 . For n = 1 this becomes
S 3 → S 7 → S 4 .
We get our last Hopf map by using the octonions.

Definition 9.4.1. The ring O of octonions or Cayley numbers consists of pairs (h1 , h2 ) of quaternions with multipli-
cation given by
(a1 , a2 )(b1 , b2 ) = (a1 b1 − b̄2 a2 , a2 b̄1 + b2 a1 ),
where b̄ denotes the quaternionic conjugate. Then O is a ring, but it is non-commutative and not even associative.
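A sketch of this arithmetic in code, using the Cayley-Dickson doubling formula (a1, a2)(b1, b2) = (a1 b1 − b̄2 a2, a2 b̄1 + b2 a1), with b̄ the quaternionic conjugate; the specific test elements are arbitrary choices.

```python
import math

def qmul(p, q):
    """Quaternion product; a quaternion is a tuple (w, x, y, z) = w + xi + yj + zk."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def qconj(q):
    """Quaternionic conjugate."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def omul(a, b):
    """Octonion product on pairs of quaternions via Cayley-Dickson doubling:
    (a1, a2)(b1, b2) = (a1 b1 - conj(b2) a2,  a2 conj(b1) + b2 a1)."""
    a1, a2 = a
    b1, b2 = b
    return (tuple(u - v for u, v in zip(qmul(a1, b1), qmul(qconj(b2), a2))),
            tuple(u + v for u, v in zip(qmul(a2, qconj(b1)), qmul(b2, a1))))

def onorm(a):
    """Octonion norm: the Euclidean norm on R^16."""
    return math.sqrt(sum(t * t for q in a for t in q))

zero = (0.0, 0.0, 0.0, 0.0)
one = (1.0, 0.0, 0.0, 0.0)
i = ((0.0, 1.0, 0.0, 0.0), zero)
j = ((0.0, 0.0, 1.0, 0.0), zero)
ell = (zero, one)                         # the new unit adjoined by the doubling

# non-associativity: (i j) ell and i (j ell) differ by a sign
lhs, rhs = omul(omul(i, j), ell), omul(i, omul(j, ell))
assert lhs == (zero, (0.0, 0.0, 0.0, 1.0)) and rhs == (zero, (0.0, 0.0, 0.0, -1.0))

# but the norm is still multiplicative, which is what the Hopf map needs
x = ((0.5, -1.0, 2.0, 0.0), (1.0, 0.0, -0.5, 3.0))
y = ((1.0, 1.0, 0.0, -2.0), (0.0, 2.0, 1.0, 0.5))
assert abs(onorm(omul(x, y)) - onorm(x) * onorm(y)) < 1e-9
```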

Letting S 15 be the unit sphere in the 16-dimensional space O2 , the projection map p : S 15 → S 8 = O ∪ {∞} is
(z0 , z1 ) → z0 z1−1 where z0 , z1 ∈ O. This is a fiber bundle with fiber S 7 , where S 7 is the unit octonions, but the proof
is complicated by the fact that O is not associative. See [63] for details. It turns out that there is an octonion projective
plane OP 2 formed by attaching a 16-cell to S 8 via the Hopf map S 15 → S 8 . But there is no OP n for n > 2 as the
associativity of multiplication is needed for the relation (z0 , · · · , zn ) ∼ λ(z0 , · · · , zn ) to be an equivalence relation.
I will now return to the first Hopf bundle S 1 → S 3 → S 2 , and look at it in more detail. In [90], David Lyons looks
at the Hopf map in the context of rotations of 3-space, one of its original motivations. This connection allows for its
use in physics areas such as magnetic monopoles [113], rigid body mechanics [93], and quantum information theory
[107]. Lyons’ paper is aimed at undergraduates and doesn’t use much topology but is very interesting for the way it
illustrates the map and the spaces involved. I will digress for a while and summarize some of its main points as an aid
to working with it in the context of Chapter 10.
Lyons refers to the Hopf bundle as the Hopf fibration. A fibration is technically a slightly more general term as
the fibers are allowed to be homotopy equivalent in a fibration and not necessarily homeomorphic. In the literature,
though, fibration and fiber bundle are often used interchangeably, and the Hopf fibration is definitely a fiber bundle in
the more strict sense. Still, this allows us to mention the contribution of the Beach Boys to algebraic topology: "I'm picking up good fibrations."

Lyons uses an alternative formula for the projection which will be more useful for us and still produces an S^1 → S^3 → S^2 fiber bundle. Let (a, b, c, d) ∈ S^3, so we have a^2 + b^2 + c^2 + d^2 = 1. Then the Hopf map p : S^3 → S^2 is defined as

p(a, b, c, d) = (a^2 + b^2 − c^2 − d^2, 2(ad + bc), 2(bd − ac)).
The reader should check that the squares of the three coordinates on the right sum to (a^2 + b^2 + c^2 + d^2)^2 = 1, so the image really lies on S^2.
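As a quick sanity check, the formula can also be verified numerically. The following sketch (plain Python, no third-party libraries; the helper names `hopf` and `random_s3` are mine) samples random points on S^3 and confirms that their images land on S^2.

```python
import math
import random

def hopf(a, b, c, d):
    # Lyons' formula for the Hopf map p : S^3 -> S^2
    return (a*a + b*b - c*c - d*d, 2*(a*d + b*c), 2*(b*d - a*c))

def random_s3():
    # A uniform point on S^3: normalize a 4-vector of Gaussians
    v = [random.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(x*x for x in v))
    return tuple(x / n for x in v)

for _ in range(1000):
    x, y, z = hopf(*random_s3())
    # The image lies on the unit sphere S^2
    assert abs(x*x + y*y + z*z - 1.0) < 1e-9
```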
To understand the properties of the Hopf map, we need to understand more about the quaternions and their relation
to rotations. Suppose you want to look at a rotation in R^3. We can represent it by choosing an axis of rotation, which can be represented by a vector in R^3, along with an angle of rotation. So we need a 4-tuple of real numbers. What if you want to look at the composition of two rotations? Given the axis of rotation and angle for each of them, can you find these parameters for their composition? William Hamilton invented quaternions in the mid-19th century to handle problems like this. He was inspired by the corresponding problem in R^2.
In R2 , we can represent rotations of the plane around the origin by unit length complex numbers. If z1 = eiθ1
and z2 = eiθ2 , then z1 z2 = ei(θ1 +θ2 ) . So if θ1 , θ2 represent angles of rotation, then we compose the rotations by
multiplying the corresponding complex numbers. We would like quaternions to have a similar property.
Let H be the set of quaternions and r = a + bi + cj + dk ∈ H. Then the conjugate is r̄ = a − bi − cj − dk and the norm is

||r|| = √(a^2 + b^2 + c^2 + d^2) = √(r r̄).

Now the norm has the property that for r, s ∈ H, ||rs|| = ||r|| ||s||, so the product of quaternions of unit norm also has unit norm. A unit norm quaternion is a point of S^3 ⊂ R^4. If r ≠ 0, then the multiplicative inverse of r is

r^{−1} = r̄ / ||r||^2.

So if r has unit norm, then r^{−1} = r̄. Also, multiplication of quaternions is associative but not commutative.
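These identities are easy to check numerically. Below is a minimal sketch (plain Python; `qmul`, `qconj`, and `qnorm` are my own helper names) representing a quaternion a + bi + cj + dk as the tuple (a, b, c, d).

```python
import math

def qmul(p, q):
    # Quaternion product, using i^2 = j^2 = k^2 = ijk = -1
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qnorm(q):
    return math.sqrt(sum(x*x for x in q))

r = (0.5, -1.0, 2.0, 0.25)
s = (3.0, 0.5, -0.5, 1.0)

# The norm is multiplicative: ||rs|| = ||r|| ||s||
assert abs(qnorm(qmul(r, s)) - qnorm(r) * qnorm(s)) < 1e-9

# r^{-1} = conj(r) / ||r||^2, so r * r^{-1} = 1
rinv = tuple(x / qnorm(r)**2 for x in qconj(r))
assert all(abs(u - v) < 1e-9 for u, v in zip(qmul(r, rinv), (1, 0, 0, 0)))

# Multiplication is not commutative: ij = k but ji = -k
i, j = (0, 1, 0, 0), (0, 0, 1, 0)
assert qmul(i, j) == (0, 0, 0, 1) and qmul(j, i) == (0, 0, 0, -1)
```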
Now we represent rotations in R3 using quaternions as follows. For p = (x, y, z) ∈ R3 , associate a quaternion
p = xi + yj + zk. (A quaternion with no real part is called pure.) If r is an arbitrary nonzero quaternion then it turns
out that rpr−1 is also pure so it is of the form x′ i + y ′ j + z ′ k and can be associated with the point (x′ , y ′ , z ′ ) ∈ R3 .
So r defines a mapping Rr : R3 → R3 . The following theorem can be proved by direct calculation.
Theorem 9.4.1. The map Rr : R3 → R3 where r ∈ H has the following properties:
1. Rr is a linear map.
2. If k is a nonzero real number, then Rkr = Rr .
3. If r ̸= 0, then Rr is invertible and (Rr )−1 = Rr−1 .
Property 2 implies that we can restrict r to have unit norm.
The next two results show how to use quaternions to define rotations. See [90] for an outline of the proof.
Theorem 9.4.2. Let r = a + bi + cj + dk be a quaternion of unit length. If r = ±1, Rr is obviously the identity. Otherwise, Rr is the rotation about the axis determined by the vector (b, c, d) with angle of rotation

θ = 2 cos^{−1}(a) = 2 sin^{−1}(√(b^2 + c^2 + d^2)).

Theorem 9.4.3. Let r and s be unit quaternions. Then Rr Rs = Rrs. So composition of rotations corresponds to multiplication of quaternions. (With my convention of composing functions from right to left, Rr Rs means apply Rs first, and the identity is immediate: Rr(Rs(p)) = r(sps^{−1})r^{−1} = (rs)p(rs)^{−1}. Use r = i and s = j to confirm that this is correct.)
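A minimal numerical check of the composition law (plain Python; `qmul`/`qconj` multiply and conjugate quaternions stored as (a, b, c, d) tuples, and all helper names are mine):

```python
import math
import random

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def rot(r, p):
    # R_r(p) = r p r^{-1}; for unit r, r^{-1} = conj(r)
    q = qmul(qmul(r, (0.0,) + p), qconj(r))
    return q[1:]          # the result is pure, so drop the (zero) real part

def random_unit_q():
    v = [random.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(x*x for x in v))
    return tuple(x / n for x in v)

# Composing right to left: R_r(R_s(p)) = (rs) p (rs)^{-1} = R_{rs}(p)
for _ in range(100):
    r, s, p = random_unit_q(), random_unit_q(), (1.0, 2.0, 3.0)
    lhs = rot(r, rot(s, p))
    rhs = rot(qmul(r, s), p)
    assert all(abs(u - v) < 1e-9 for u, v in zip(lhs, rhs))

# r = i rotates by angle 2*arccos(0) = pi about the x-axis,
# so it sends (0, 0, 1) to (0, 0, -1)
assert all(abs(u - v) < 1e-12
           for u, v in zip(rot((0, 1, 0, 0), (0.0, 0.0, 1.0)), (0, 0, -1)))
```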
Now we will redefine the Hopf map in terms of quaternions. Fix a distinguished point P0 = (1, 0, 0) ∈ S 2 . Let
r = a+bi+cj +dk be a unit quaternion corresponding to the point (a, b, c, d) ∈ S 3 . Then the projection p : S 3 → S 2
of the Hopf fibration is defined as
p(r) = Rr (P0) = r i r̄ (note that r̄ = r^{−1} since r has unit norm).

Computing the formula explicitly shows that it corresponds to Lyons’ earlier formula.
Using that formula, what is the inverse image of (1, 0, 0) ∈ S 2 ? Since,

p(a, b, c, d) = (a2 + b2 − c2 − d2 , 2(ad + bc), 2(bd − ac)) = (1, 0, 0),

we see that the set of points


C = {(cos t, sin t, 0, 0)|t ∈ R}
maps to (1, 0, 0). Since 1 = a^2 + b^2 + c^2 + d^2 and the first coordinate of the image is a^2 + b^2 − c^2 − d^2 = 1, subtracting gives c^2 + d^2 = 0. So c = d = 0 and C is the entire inverse image of (1, 0, 0).
It turns out that the inverse image of every point is a circle, confirming that the fiber of the Hopf fibration is S^1.
See [90] for more discussion.
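The fiber structure can also be checked numerically: multiplying r on the right by a unit complex number cos t + i sin t traces out a circle through r, and every point of that circle has the same image under p(r) = r i r̄, since cos t + i sin t commutes with i. A sketch (plain Python; all helper names are mine):

```python
import math
import random

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def hopf(r):
    # p(r) = r i conj(r), a pure quaternion identified with a point of S^2
    return qmul(qmul(r, (0, 1, 0, 0)), qconj(r))[1:]

# A random unit quaternion r, i.e. a random point of S^3
v = [random.gauss(0.0, 1.0) for _ in range(4)]
n = math.sqrt(sum(x*x for x in v))
r = tuple(x / n for x in v)

# The whole circle {r (cos t + i sin t)} maps to the single point hopf(r)
base = hopf(r)
for k in range(36):
    t = 2 * math.pi * k / 36
    circle_pt = qmul(r, (math.cos(t), math.sin(t), 0.0, 0.0))
    assert all(abs(u - w) < 1e-9 for u, w in zip(hopf(circle_pt), base))
```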
The takeaway for what will come later is that we have some very explicit formulas for a map from a higher to a
lower dimensional sphere. The Hopf map turns out to be the generator of π3 (S 2 ), the group of homotopy classes of
maps from S^3 to S^2. I will discuss more implications of this in Chapter 10.

9.5 Paths and Loops


In this section, I will describe an important fiber bundle. The material comes from [74]. As we will see, fiber
bundles lead to a long exact sequence in homotopy. The path-space fibration will be our goal. It has the advantage of
having a total space which is contractible.
We will write Y^X for the set of continuous maps from X to Y. We will generally consider Y^X to be a topological space having the compact-open topology. Recall that this means that the sets {f : X → Y continuous | f(K) ⊂ V}, one for each compact K ⊂ X and open V ⊂ Y, form a subbase for the topology.
Reusing notation, we will use Ω = Y^I for the space of paths on Y with the compact-open topology. Paths break Y into path components, within which any two points can be connected by a path. If y ∈ Y, then Cy will be the path component containing y.

Definition 9.5.1. A generalized triad (Y ; A, B) is a space Y together with two subspaces A and B. (Y ; A, B) is a
triad if A ∩ B ̸= ∅. For a generalized triad (Y ; A, B) let [Y ; A, B] be the subset of Ω consisting of paths σ in Y such
that σ(0) ∈ A and σ(1) ∈ B.

Here are some interesting special cases. If A = Y and B consists of a single point y, the subspace [Y; Y, y] of Ω
is denoted Ωy and called the space of paths that terminate at y.
Let Λ be the space of loops in Y . Then [Y ; y, y] is the space of paths that start and end at y, so by definition it is
the space of loops on Y with base point at y. We write this space as Λy . Let δy be the degenerate loop δy (I) = y. Hu
uses some facts about the topology of function spaces to prove the following important result:

Theorem 9.5.1. The space Ωy of paths that terminate at y is contractible to the point δy.

I can now define the path-space fibration. See [74] sections 12 and 13 for the proof that it is actually a fiber space.

Definition 9.5.2. The space Ωy is a fiber space called the path-space fibration over Y relative to the initial projection
defined by p : Ωy → Y where for σ ∈ Ωy , p(σ) = σ(0). Since any path σ ∈ Ωy terminates at y, if p(σ) = y, then σ
also begins at y, so σ is a loop with base point y. So the fiber is Λy .

9.6 Higher Homotopy Groups


As I mentioned before, homotopy groups classify maps from S n to X up to homotopy equivalence. In this section,
I will precisely define them and list some of their basic properties.

9.6.1 Definition of Higher Homotopy Groups


In order to make it easier to define relative homotopy and a group multiplication, I will follow Hu and replace S n
with an n-dimensional cube whose boundary is identified to a point. We will use the notation f : (X, A) → (Y, B) to
mean that f : X → Y and f (A) ⊂ B. This can generalize to a triple in the obvious way.

Definition 9.6.1. Let n > 1 and I n be the n-dimensional cube that is the product of I = [0, 1] with itself n times. Let
∂I n be the boundary of I n . Let F n (X, x0 ) be the set of maps f : (I n , ∂I n ) → (X, x0 ). The maps have an addition
f + g defined by
(f + g)(t) =
    f(2t1, t2, · · · , tn)        if 0 ≤ t1 ≤ 1/2,
    g(2t1 − 1, t2, · · · , tn)    if 1/2 ≤ t1 ≤ 1,
for t = (t1 , · · · , tn ) ∈ I n . For f, g ∈ F n (X, x0 ), f + g ∈ F n (X, x0 ). If [f ] is the homotopy class of f ∈ F n (X, x0 ),
then define [f ] + [g] = [f + g]. The homotopy classes of F n (X, x0 ) form a group under this addition called the n-th
homotopy group of X at x0 denoted πn (X, x0 ). The identity is the class [0] of the map (I n , ∂I n ) → (x0 , x0 ) and the
inverse of [f ] is the class [f θ], where θ : I n → I n is defined as θ(t) = (1 − t1 , t2 , · · · , tn ).
For n = 1, π1 (X, x0 ) is the fundamental group we defined in Section 9.2. For n = 0, π0 (X, x0 ) is the set of path
components of X and its neutral element is defined to be the component containing x0 . We say that π0 (X, x0 ) = 0 if
X is path connected. Note that π0 (X, x0 ) is not a group.
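For n = 1, the addition defined above is just loop concatenation at double speed. A small illustration (plain Python; the specific loops are my own choice) with two loops in the plane based at the origin:

```python
import math

def f(t):
    # traverse the unit circle centered at (-1, 0) once, counterclockwise
    return (math.cos(2 * math.pi * t) - 1.0, math.sin(2 * math.pi * t))

def g(t):
    # the same circle traversed once in the opposite direction
    return (math.cos(2 * math.pi * t) - 1.0, -math.sin(2 * math.pi * t))

def add(f, g):
    # (f + g)(t) = f(2t) on [0, 1/2] and g(2t - 1) on [1/2, 1]
    def h(t):
        return f(2 * t) if t <= 0.5 else g(2 * t - 1)
    return h

h = add(f, g)
# h is again a loop at the base point (0, 0) ...
assert all(abs(c) < 1e-12 for c in h(0.0) + h(1.0))
# ... and the two halves agree (up to rounding) at t = 1/2
assert all(abs(u - v) < 1e-12 for u, v in zip(f(1.0), g(0.0)))
```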

Note that πn (X, x0) could have been defined if we used maps f : (S^n, s0) → (X, x0). The two halves of I^n defined by the conditions t1 ≤ 1/2 and t1 ≥ 1/2 correspond to the Southern and Northern hemispheres of S^n respectively.
For n > 1, we can rotate S n to exchange hemispheres while keeping s0 fixed. This is the intuition for the following
important result.

Theorem 9.6.1. For n > 1, πn (X, x0 ) is an abelian group.

As we saw earlier, π1 (X, x0 ) need not be abelian. It is abelian, though, in some interesting special cases.

Definition 9.6.2. Let X be a topological space and µ : X × X → X define a multiplication on X. X is called an H-space if µ is continuous and there exists a point e ∈ X such that µ(x, e) = µ(e, x) = x for all x ∈ X.

The spheres S 0 , S 1 , S 3 , and S 7 are H-spaces when considered to be the norm one elements of the reals, complexes,
quaternions, and octonions. There is a theorem that these are the only spheres that are H-spaces. S 7 is an H-space but
not a group as the octonions are not associative. The space of loops on X with base point x0 is another example of an
H-space, where the multiplication consists of travelling around the two loops one after the other. The identity is the
degenerate loop consisting of only x0. Hu proves the following:

Theorem 9.6.2. If X is an H-space, then π1 (X, x0 ) is abelian.

Now if p + q = n, then

X^{I^n} = X^{I^p × I^q} = (X^{I^p})^{I^q},

so πn (X, x0) = πq (F^p, d0), where F^p = F^p(X, x0) and d0 is the constant map. If p = 1, then F^p is the space of loops in X with base point x0. This proves the following.

Theorem 9.6.3. πn (X, x0) ≅ πn−1 (W, d0), where W is the space of loops on X and n > 0.

Since the space of loops W is an H-space, π1 (W, d0 ) is abelian. As this must be isomorphic to π2 (X, x0 ) which
is abelian by Theorem 9.6.1, the fact that π1 need not always be abelian does not pose a problem in this case.
Finally, the following is a direct result of the fact that I n is pathwise connected.

Theorem 9.6.4. If X0 denotes the path-component of X containing x0, then for n > 0,

πn (X0, x0) ≅ πn (X, x0).

We can already determine the homotopy groups of some spheres. First of all for n > 0, S n is connected so
π0 (S n , s0 ) = 0.
Due to the fact that homotopy classes of maps S^n → S^n are determined by the degree of the map (see Theorem 4.1.29), we see that πn (S^n) ≅ Z. Also, it turns out that πn (S^1) = 0 for n > 1. This can be proved using covering spaces (covered in books such as [111, 63, 74], for example) or the long exact sequence of a fiber space, which I will discuss in Section 9.6.4.
It also turns out that πm (S^n) = 0 for 0 < m < n. This follows from the fact that we can represent the spheres as CW complexes, with S^n consisting of a 0-cell and an n-cell. Using the fact that a map between two CW complexes can be approximated by a cellular map, let f : S^m → S^n and assume that f is cellular. But then f takes S^m into the m-skeleton of S^n, which consists of the single 0-cell. So any map f : S^m → S^n is homotopic to a constant and πm (S^n) = 0 for m < n.
On the other hand, if m > n, this is not the case. We will end this section by looking at π3 (S 2 ).
We will start with two theorems proved in [74].

Theorem 9.6.5. Hopf Classification Theorem: Let X be a triangulable space with dimension less than or equal
to n. Let χ be a generator of the infinite cyclic group H n (S n ). The assignment f → f ∗ (χ) sets up a one-to-one
correspondence between the homotopy classes of the maps f : X → S n and the elements of the cohomology group
H n (X).

This theorem gives another proof that πm (S n ) = 0 for m < n. Letting X = S m , H n (S m ) = 0.

Definition 9.6.3. A map f : X → Y is algebraically trivial if for f∗ : Hm (X) → Hm (Y ) and f ∗ : H m (Y ) →


H m (X), f∗ = 0 and f ∗ = 0 for all m > 0.

The Hopf map p : S 3 → S 2 is algebraically trivial as there is no m > 0 for which Hm (S 3 ) and Hm (S 2 ) are both
nonzero. The same holds for the cohomology groups.
Let X be a triangulable space. For an arbitrary map F : X → S 3 , the composed map f = pF : X → S 2 where p
is the Hopf map is also algebraically trivial.

Theorem 9.6.6. For any given triangulable space X, the assignment F → f = pF sets up a one-to-one correspon-
dence between the homotopy classes of the maps F : X → S 3 and those of the algebraically trivial maps f : X → S 2 .

Using Theorems 9.6.5 and 9.6.6 we get the following.

Theorem 9.6.7. The homotopy classes of the algebraically trivial maps f : X → S 2 of a 3-dimensional triangulable
space X into S 2 are in a one-to-one correspondence with the elements of H 3 (X). For α ∈ H 3 (X), we associate α
with the homotopy class of the map f = pF : X → S 2 such that F : X → S 3 is a map with F ∗ (χ) = α and χ is as
defined in Theorem 9.6.5.

Now let X = S 3 .

Theorem 9.6.8. The homotopy classes of the maps f : S 3 → S 2 are in one to one correspondence with the integers.
For an integer n, we associate n with the homotopy class of the map f = pF : S 3 → S 2 such that F : S 3 → S 3 has
degree n.

The immediate consequence of Theorem 9.6.8 is that π3 (S^2) ≅ Z. It turns out that the generator of this group is [p], where p is the Hopf map.

9.6.2 Relative Homotopy Groups


As in homology theory, there are relative homotopy groups and these groups will produce a long exact sequence.
Let X be a space, A ⊂ X, and x0 ∈ A. We call (X, A, x0 ) a triplet. If A = x0 , then we just write the pair (X, x0 ).

Definition 9.6.4. Consider the n-cube I n for n > 0. The initial (n-1)-face of I n identified with I n−1 is defined by
tn = 0. The union of the remaining (n − 1)-faces is denoted by J n−1 . Then we have

∂I n = I n−1 ∪ J n−1 and ∂I n−1 = I n−1 ∩ J n−1 .

(To help visualize this, let n = 2. Then I 1 is the bottom edge and J 1 consists of the other three edges.) Let

f : (I n , I n−1 , J n−1 ) → (X, A, x0 ).

In particular, f(∂I^n) ⊂ A and f(∂I^{n−1}) = x0. Let F^n = F^n(X, A, x0) be the set of these maps. Then πn (X, A, x0) is the set of homotopy classes of these maps relative to the system {I^{n−1}, A; J^{n−1}, x0}. In other words, if f and g are in the same homotopy class, there is a continuous map F : I^n × I → X such that F(x, 0) = f(x), F(x, 1) = g(x), and for each t ∈ [0, 1], F(·, t) : (I^n, I^{n−1}, J^{n−1}) → (X, A, x0). The set of these classes is the n-th relative homotopy set denoted πn (X, A, x0). If n > 1, then πn (X, A, x0) is a group with addition as defined for πn (X, x0). This group is
called a relative homotopy group.

For absolute homotopy groups, πn (X, x0 ) is a set for n = 0, a group for n > 0 and an abelian group for n > 1.
For relative homotopy groups, it turns out that πn (X, A, x0 ) is a set for n = 1, a group for n > 1 and an abelian group
for n > 2.
I will now state three analogous properties of relative homotopy groups to those of the absolute homotopy groups.
First I need a definition.

Definition 9.6.5. Let T = (X, A, x0 ) be a triplet. Let X ′ = [X; X, x0 ] be the space of paths on X that terminate at
x0 . Let p : X ′ → X be the initial projection. Let

A′ = p−1 (A) = [X; A, x0 ] ⊂ X ′ .

In other words, A′ is the set of paths that start in A and terminate at x0 . Let x′0 be the degenerate loop x′0 (I) = x0 .
Then T ′ = [X ′ , A′ , x′0 ] is called the derived triplet of T . The map p : (X ′ , A′ , x′0 ) → (X, A, x0 ) is called the derived
projection.

Theorem 9.6.9. For every n > 0,

πn (X, A, x0) = πn−1 (A′, x′0).

This implies that every relative homotopy group can be expressed as an absolute homotopy group.

Theorem 9.6.10. If X0 denotes the path component of X containing x0 and A0 denotes the path component of A
containing x0 , then for n > 1,
πn (X, A, x0 ) = πn (X0 , A0 , x0 ).

Theorem 9.6.11. If α ∈ πn (X, A, x0) is represented by a map f ∈ F^n (X, A, x0) such that f(I^n) ⊂ A, then α = 0.

9.6.3 Boundary Operator and Induced Homomorphisms


Now that we have defined relative homotopy groups, we would like to use them in a long exact sequence like we
have in homology. One of the components is a boundary operator

∂ : πn (X, A, x0 ) → πn−1 (A, x0 ).

Unlike in homology, the definition in homotopy is very easy.



Definition 9.6.6. Let (X, A, x0 ) be a triplet. For n > 0 define

∂ : πn (X, A, x0 ) → πn−1 (A, x0 )

as follows: Let α ∈ πn (X, A, x0) be represented by a map

f : (I^n, I^{n−1}, J^{n−1}) → (X, A, x0).

If n = 1, f(I^{n−1}) is a point of A which determines a path component β ∈ π0 (A, x0) of A. If n > 1, then the restriction of f to I^{n−1} is a map of (I^{n−1}, ∂I^{n−1}) into (A, x0), so it represents an element β ∈ πn−1 (A, x0). Since β does not depend on the map f representing α, we may define ∂(α) = β. We call ∂ the boundary operator.
Theorem 9.6.12. The boundary operator ∂ is a homomorphism for n > 1.
Now consider a map
f : (X, A, x0 ) → (Y, B, y0 ).
We would like homotopy to have the functorial properties that homology has, so we would like to obtain a homomor-
phism on homotopy groups. This also turns out to be easy.
First of all, since f is continuous, it sends path components of X to path components of Y , so it induces a
transformation
f∗ : π0 (X, x0 ) → π0 (Y, y0 )
which sends the neutral element of π0 (X, x0 ) to that of π0 (Y, y0 ).
For n > 0, if ϕ ∈ F^n (X, A, x0), the composition f ϕ is in F^n (Y, B, y0). The assignment ϕ → f ϕ defines a map

f# : F n (X, A, x0 ) → F n (Y, B, y0 ).

Since f# is continuous, it carries the path components of F n (X, A, x0 ) into those of F n (Y, B, y0 ). This induces a
transformation
f∗ : πn (X, A, x0 ) → πn (Y, B, y0 )
called the induced transformation which sends the neutral element of πn (X, A, x0 ) to that of πn (Y, B, y0 ).
Theorem 9.6.13. Let f : (X, A, x0 ) → (Y, B, y0 ). If n = 1, f∗ is a transformation which sends the neutral element
of πn (X, A, x0 ) to that of πn (Y, B, y0 ). If n > 1, then

f∗ : πn (X, A, x0 ) → πn (Y, B, y0 )

is a homomorphism.
Theorem 9.6.14. Let f : (X, x0 ) → (Y, y0 ). If n = 0, f∗ is a transformation which sends the neutral element of
πn (X, x0 ) to that of πn (Y, y0 ). If n > 0, then

f∗ : πn (X, x0 ) → πn (Y, y0 )

is a homomorphism.

9.6.4 Properties of Homotopy Groups


In this section, I will list some properties of homotopy groups and compare them to those of homology groups. In
the next section, I will define a homotopy system, the analogue of a homology theory. I will state most of these without
proof, so see [74] for the details.
Theorem 9.6.15. Property 1: If f : (X, A, x0 ) → (X, A, x0 ) is the identity map, then f∗ is the identity transforma-
tion on πn (X, A, x0 ) for all n ≥ 0.

Theorem 9.6.16. Property 2: If f : (X, A, x0) → (Y, B, y0) and g : (Y, B, y0) → (Z, C, z0) are maps, then (gf)∗ = g∗ f∗ for all n ≥ 0. So the assignment (X, A, x0) → πn (X, A, x0) and f → f∗ is a covariant functor from the category of triplets to the category of homotopy groups.
Theorem 9.6.17. Property 3: If f : (X, A, x0) → (Y, B, y0) is a map and g : (A, x0) → (B, y0) is the restriction of f to A, then the following square commutes for all n > 0:

πn (X, A, x0) --∂--> πn−1 (A, x0)
      |f∗                  |g∗
πn (Y, B, y0) --∂--> πn−1 (B, y0)


Properties 1-3 correspond to those of homology groups. There is also a long exact sequence for a triplet which
differs from the long exact sequence of a pair in homology only by the inclusion of the base point. For the pieces that
are sets and not groups, the term kernel will refer to the inverse image of the neutral element which must then coincide
with the image of the previous map to extend the definition of exactness.
Theorem 9.6.18. Property 4 (Exactness Property): If (X, A, x0 ) is a triplet, let i : (A, x0 ) ⊂ (X, x0 ) and j :
(X, x0 ) = (X, x0 , x0 ) ⊂ (X, A, x0 ) be inclusions, and i∗ , j∗ be the induced transformations. Let ∂ be the boundary
operator defined above. Then the following homotopy sequence of the triplet is exact:
· · · → πn+1 (X, A, x0) --∂--> πn (A, x0) --i∗--> πn (X, x0) --j∗--> πn (X, A, x0) --∂--> · · ·
· · · --j∗--> π1 (X, A, x0) --∂--> π0 (A, x0) --i∗--> π0 (X, x0)
Theorem 9.6.19. Property 5 (Homotopy Property): If f, g : (X, A, x0) → (Y, B, y0) are homotopic, then f∗, g∗ : πn (X, A, x0) → πn (Y, B, y0) are equal for every n.
Now the homotopy invariance of homotopy groups (and hence their topological invariance) is an immediate consequence of Properties 1, 2, and 5.
Theorem 9.6.20. If f : (X, A, x0) → (Y, B, y0) is a homotopy equivalence, then the induced transformation f∗ : πn (X, A, x0) → πn (Y, B, y0) is an isomorphism for all n. (One to one and onto for n = 1, where f∗ is a map of sets rather than groups.)
Now you will see the special place that fiber spaces have in homotopy theory. They have a long exact sequence of their own which is very useful for computation. For homology, though, things aren't so nice. That is where spectral sequences, which will be discussed in Section 9.8, come in.
Theorem 9.6.21. Property 6 (Fibering Property): If p : (E, C, x0 ) → (B, D, y0 ) is a fibering, and C = p−1 (D),
then
p∗ : πn (E, C, x0 ) → πn (B, D, y0 )
is an isomorphism for all n. (One to one and onto for n = 1, where p∗ is a map of sets rather than groups.)
A very important consequence of this property is the following. Let D = y0 and C = p−1 (y0 ) = F where F is
the fiber of p by the definition of the fiber. So our fibering becomes p : (E, F, x0 ) → (B, y0 ) . By Property 6, p∗ is an
isomorphism on homotopy groups, so replacing πn (E, F, x0 ) by πn (B, y0 ) in the long exact sequence of the triplet
(E, F, x0 ) gives the long exact sequence of the fiber space.
Theorem 9.6.22. If p : E → B is a fibering with fiber F, and E, B, and F are all path connected (so we can ignore the base point), then we have a long exact sequence:

· · · → πn+1 (B) --∂--> πn (F) --i∗--> πn (E) --p∗--> πn (B) --∂--> · · ·
· · · --p∗--> π1 (B) --∂--> π0 (F) --i∗--> π0 (E)

Example 9.6.1. For the Hopf map, E = S^3, B = S^2, and F = S^1. Since πn (S^1) = 0 for n > 1, we get from the long exact sequence that for n ≥ 3, πn (S^3) ≅ πn (S^2). (Write out some terms and check this for yourself.)
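Writing out some of the terms, as the example suggests:

```latex
% Segment of the long exact sequence of the Hopf fibration, for n >= 3:
\cdots \to \pi_n(S^1) \to \pi_n(S^3) \to \pi_n(S^2) \to \pi_{n-1}(S^1) \to \cdots
% Both outer groups vanish, since n > 1 and n - 1 > 1, so exactness forces
\pi_n(S^3) \;\cong\; \pi_n(S^2).
```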
In homology theory, the fibering property is generally false. It is replaced by the excision property which is false
in homotopy theory.
For our final property, let X consist of one point x0 . Then the only map from I n into X is the constant map. This
gives the following.
Theorem 9.6.23. Property 7 (Triviality Property): If X consists of the single point x0 , then πn (X, x0 ) = 0 for all
n.

9.6.5 Homotopy Systems vs. Eilenberg-Steenrod Axioms


Properties 1-7 describe a homotopy system. In this section, I will give a precise definition. Compare it to the
definition of a homology theory we saw in Section 4.2.3.
Definition 9.6.7. A homotopy system
H = {π, ∂, ∗}
consists of three functions π, ∂, ∗. The function π assigns to each triplet (X, A, x0 ) and each integer n > 0, the set
πn (X, A, x0 ). The function ∂ assigns to each triplet (X, A, x0 ) and each integer n > 0, a transformation

∂ : πn (X, A, x0 ) → πn−1 (A, x0 ),

where if n = 1, π0 (A, x0 ) denotes the set of all path components of A, and for n > 0, πn (A, x0 ) is defined to be
πn (A, x0 , x0 ). The function ∗ assigns to each map

f : (X, A, x0 ) → (Y, B, y0 )

and each integer n > 0, a transformation

f∗ : πn (X, A, x0 ) → πn (Y, B, y0 ).

The system H must satisfy 7 axioms. For 1 ≤ i ≤ 7, Axiom i = Property i from above, with 2 exceptions.
Axiom 4: Similar to Property 4 but the homotopy sequence of a pair need only be weakly exact. This means that
if πn (X, x0 ) = 0 for all n, then ∂ is one to one and onto.
Axiom 6: If p : (X ′ , A′ , x′0 ) → (X, A, x0 ) is the derived projection then p∗ is a one to one and onto map

p : πn (X ′ , A′ , x′0 ) → πn (X, A, x0 )

for all n > 0.


The derived projection is the same as the path-space fibration, so it is actually weaker than Property 6.
Theorem 9.6.24. H = {π, ∂, ∗} as we have defined the three functions in this chapter is an example of a homotopy
system.
Hu proves that any homotopy system is in a sense equivalent to ours. See [74] for details.

9.6.6 Operation of the Fundamental Group on the Higher Homotopy Groups


The goal of the remainder of this section is to determine if we really need the base point and what it does. When can we dispose of it altogether and just talk about the homotopy groups πn (X)?
We need a couple of definitions.
Definition 9.6.8. An automorphism of a group is an isomorphism of the group to itself.

Definition 9.6.9. The action of a group G on a set S is a function f : G × S → S where we write f (g, s) as gs and
such that for g, h ∈ G and s ∈ S,

1. 1G s = s.

2. g(hs) = (gh)s.

Let X be a space and let x0 , x1 ∈ X. Let σ : I → X be a path connecting them so that σ(0) = x0 and σ(1) = x1 .
Then x0 , x1 lie in the same path component of X so π0 (X, x0 ) = π0 (X, x1 ). For n > 0, Hu proves the following
result.

Theorem 9.6.25. For each n > 0, every path σ : I → X with x0 = σ(0) and x1 = σ(1) gives an isomorphism

σn : πn (X, x1) ≅ πn (X, x0)

which depends only on the homotopy class of σ relative to the endpoints (i.e., the endpoints are kept fixed). If σ is the degenerate path σ(I) = x0, then σn is the identity automorphism. If σ, τ are paths with τ(0) = σ(1), then (στ)n = σn τn. For every path σ : I → X and map f : X → Y, let τ = f σ be the corresponding path in Y with y0 = f(x0) and y1 = f(x1). Then the following commutes:

πn (X, x1) --σn--> πn (X, x0)
      |f∗                |f∗
πn (Y, y1) --τn--> πn (Y, y0)

As a loop is really a path, the theorem immediately implies the following.

Theorem 9.6.26. The fundamental group π1 (X, x0 ) acts on πn (X, x0 ) for n ≥ 1 as a group of automorphisms.

Theorem 9.6.25 implies that for a pathwise connected space, the homotopy groups are isomorphic for any choice
of a basepoint and we can write πn (X) for these groups. To give these groups a geometric meaning, though, we need
one more condition on X. We need it to be n-simple.

Definition 9.6.10. A group G acts simply on a set S if gs = s for all g ∈ G, s ∈ S. A space X is n-simple if
π1 (X, x0 ) acts simply on πn (X, x0 ) for all x0 ∈ X.

The following properties are immediate.

Theorem 9.6.27. A pathwise connected space is n-simple if there exists an x0 ∈ X such that π1 (X, x0 ) acts simply
on πn (X, x0 ).

Theorem 9.6.28. A simply connected space is n-simple for every n > 0.

Theorem 9.6.29. A pathwise connected space is n-simple if πn (X) = 0.

It can be shown that π1 (X, x0 ) acts on itself by conjugation. In other words, for g, h in π1 (X, x0 ), hg = h−1 gh.
Then we have the following.

Theorem 9.6.30. A pathwise connected space X is 1-simple if and only if π1 (X) is abelian.

Example 9.6.2. The sphere S m is n-simple for m, n > 0. This holds for m > 1 and any n by Theorem 9.6.28. By
Theorem 9.6.29, S 1 is n-simple for n > 1, and we know that S 1 is 1-simple by Theorem 9.6.30.

To get a better feel for the geometric meaning of n-simplicity we have the following:

Theorem 9.6.31. A space X is n-simple if and only if for every point x0 ∈ X and maps f, g : S n → X with
f (s0 ) = x0 = g(s0 ), f ≃ g implies f ≃ g rel s0 .
Proof: Let X be n-simple. Then π1 (X, x0) acts simply on πn (X, x0). Since f ≃ g, there exists a homotopy ht with h0 = f and h1 = g. Let f and g represent the elements α and β of πn (X, x0) respectively. Let σ : I → X be the path defined by σ(t) = ht(s0) for t ∈ I. Since σ(0) = x0 = σ(1), σ represents an element w ∈ π1 (X, x0), and it can be shown that α = wβ. Since π1 (X, x0) acts simply on πn (X, x0), α = β. So by definition of πn (X, x0), f ≃ g rel s0.
For the other direction, let w ∈ π1 (X, x0 ) and let σ be a loop representing w. If α ∈ πn (X, x0 ) is represented by
a map f : S n → X with f (s0 ) = x0 , then the element wα of πn (X, x0 ) is represented by a map g : S n → X with
g(s0 ) = x0 and satisfying f ≃ g. This implies that f ≃ g rel s0 , so wα = α and X is n-simple. ■
We can use this result to prove the following.
Theorem 9.6.32. A pathwise connected topological group X is n-simple for every n > 0.
Proof: Let x0 be the identity element of X, and f, g : S n → X be two homotopic maps with f (s0 ) = g(s0 ) = x0 .
Then there is a homotopy ht with h0 = f , and h1 = g. Define a homotopy kt : S n → X, by taking

kt (s) = [ht (s0 )]−1 [ht (s)],

for s ∈ S n and t ∈ I. Then k0 = f and k1 = g since f (s0 ) = g(s0 ) = x0 , and kt (s0 ) = x0 for all t ∈ I. Then f ≃ g
rel s0 . By the previous theorem, X is n-simple. ■
Now consider πn (X) to be the set of homotopy classes of maps S^n → X (with no base point condition) and πn (X, x0) to be the set of homotopy classes of maps (S^n, s0) → (X, x0). There is a natural map πn (X, x0) → πn (X) which forgets the base point. Hu shows the following.
Theorem 9.6.33. If X is pathwise connected and n-simple, then the inclusion πn (X, x0 ) → πn (X) is one to one and
onto.
In this case, we can safely drop the basepoint and talk about the group πn (X).

9.7 Calculation of Homotopy Groups


So how do you calculate homotopy groups? It's really hard. There is no general algorithm like there is for homology. Also, even for the sphere S^2 we don't know all of the homotopy groups, and we never will. Still, there are some useful results that can help in a lot of cases. I will mostly list some of the well known theorems in this section.
You should think about what is easy in homotopy and hard in homology or vice versa. In this section, the material
comes from Hu [74] unless otherwise stated.

9.7.1 Products and One Point Union of Spaces


The product of two spaces is a good example of a case that is easier for homotopy than for homology. Recall that
for homology and cohomology, we had the Künneth Theorems (Theorems 8.5.6 and 8.5.10) which give the product as
a term in an exact sequence involving the factors. For homotopy, the formula takes a much simpler form.
Theorem 9.7.1. Let X and Y be spaces with x0 ∈ X and y0 ∈ Y. Let Z = X × Y and z0 = (x0, y0) ∈ Z. Then for every n > 0,

πn (Z, z0) ≅ πn (X, x0) ⊕ πn (Y, y0).

(Here we are using additive notation despite the fact that π1 may be non-abelian.)
Proof: Let p : (Z, z0) → (X, x0) and q : (Z, z0) → (Y, y0) be projections and i : (X, x0) → (Z, z0) and j : (Y, y0) → (Z, z0) be inclusions. Passing to the corresponding homotopy groups gives

p∗ i∗ = 1, q∗ j∗ = 1, p∗ j∗ = 0, q∗ i∗ = 0.

Let
h : πn (Z, z0 ) → πn (X, x0 ) ⊕ πn (Y, y0 )
be a homomorphism defined by h(α) = (p∗ (α), q∗ (α)) for α ∈ πn (Z, z0 ). We need to show that h is an isomorphism.
Letting α ∈ πn (X, x0 ) and β ∈ πn (Y, y0 ), let γ = i∗ (α) + j∗ (β) ∈ πn (Z, z0 ). Then

h(γ) = (p∗ i∗ α + p∗ j∗ β, q∗ i∗ α + q∗ j∗ β) = (α, β),

so h is an epimorphism.
Now let δ ∈ πn (Z, z0) with h(δ) = 0. Then we have p∗δ = 0 and q∗δ = 0. Let f : (I^n, ∂I^n) → (Z, z0) represent δ. Then since p∗δ = 0 and q∗δ = 0, there are homotopies gt : (I^n, ∂I^n) → (X, x0) and ht : (I^n, ∂I^n) → (Y, y0) with g0 = pf, h0 = qf, g1(I^n) = x0, and h1(I^n) = y0. Define a homotopy ft : I^n → Z by letting ft(s) = (gt(s), ht(s)). Then f0 = f, f1(I^n) = z0, and ft(∂I^n) = z0 for all t ∈ I. Thus δ = 0 and h is a monomorphism. Thus h is an isomorphism. ■
Note that the inverse of h,

h−1 : πn (X, x0 ) ⊕ πn (Y, y0 ) → πn (Z, z0 ),

is given by h−1 (α, β) = i∗ (α) + j∗ (β).

Example 9.7.1. Let's return to our favorite hyperdonut shop. First we will be boring and get a regular donut T = S 1 × S 1 . Then π1 (T ) = π1 (S 1 ) ⊕ π1 (S 1 ) ∼= Z ⊕ Z, and πn (T ) = 0 for n > 1. The formula extends to any number of factors, so π1 (S 1 × · · · × S 1 ) ∼= Z ⊕ · · · ⊕ Z, where the free abelian group on the right has rank equal to the number of factors in the product, and πn (S 1 × · · · × S 1 ) = 0 for n > 1. For a hyperdonut X = S m × S n with m, n > 0,
πi (X) ∼= πi (S m ) ⊕ πi (S n ).
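As a quick check of Theorem 9.7.1, here is the computation for X = S 2 × S 3 , using the standard facts π2 (S 2 ) ∼= Z, π3 (S 3 ) ∼= Z, and π3 (S 2 ) ∼= Z (the last generated by the Hopf map):

```latex
% Homotopy groups of S^2 \times S^3 via Theorem 9.7.1
\pi_2(S^2 \times S^3) \cong \pi_2(S^2) \oplus \pi_2(S^3)
                      \cong \mathbb{Z} \oplus 0 \cong \mathbb{Z}, \\
\pi_3(S^2 \times S^3) \cong \pi_3(S^2) \oplus \pi_3(S^3)
                      \cong \mathbb{Z} \oplus \mathbb{Z}.
```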

Our next example is a little more complicated for homotopy than for homology. Let x0 ∈ X and y0 ∈ Y . Identify x0 with y0 and let X ∨ Y be the result. Then we call X ∨ Y the one point union of X and Y . It can be embedded in X × Y by a map k taking X ∨ Y to the subset of X × Y of the form (x, y0 ) ∪ (x0 , y). For homology we have that for n > 0, Hn (X ∨ Y ) ∼= Hn (X) ⊕ Hn (Y ). To see this, we can use a Mayer-Vietoris sequence, as (x, y0 ) and (x0 , y) are homotopy equivalent to X and Y respectively and their intersection is a single point. In homotopy we have a slightly more complicated formula.

Theorem 9.7.2. For every n > 1, we have

πn (X ∨ Y, (x0 ∼ y0 )) ∼= πn (X, x0 ) ⊕ πn (Y, y0 ) ⊕ πn+1 (X × Y, X ∨ Y, (x0 , y0 )).

See Hu for the details of the proof but I will comment that we need n > 1 so that all of the groups are abelian. The
extra term πn+1 (X × Y, X ∨ Y, (x0 , y0 )) on the right comes from the long exact sequence of the pair (X × Y, X ∨ Y ).
Hu shows that in the special case of spheres we have the following. (This should make you hungry for some more hyperdonuts.)

Theorem 9.7.3. For every p, q > 0 and n < p + q − 1, we have

πn (S p ∨ S q ) ∼= πn (S p ) ⊕ πn (S q ).
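For instance, taking p = q = 2 and n = 2 (so that n = 2 < p + q − 1 = 3), Theorem 9.7.3 together with π2 (S 2 ) ∼= Z gives:

```latex
\pi_2(S^2 \vee S^2) \cong \pi_2(S^2) \oplus \pi_2(S^2)
                    \cong \mathbb{Z} \oplus \mathbb{Z}.
```

At the edge of the range, n = 3 = p + q − 1, the conclusion fails: the extra term π4 (S 2 × S 2 , S 2 ∨ S 2 ) from Theorem 9.7.2 contributes an additional summand, generated by the Whitehead product of the two inclusions.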

9.7.2 Hurewicz Theorem


I will now outline the famous Hurewicz Theorem, which describes the relation between the homotopy and homology groups of a space. I will refer you to Hu for the full proof, but it is important for the subsequent material that you understand the form of the map from the first nonzero homotopy group to the homology group of the same dimension.
Let α be an element of the relative homotopy group πn (X, A, x0 ) for n > 0. (Remember that for n = 1, this is a set and not a group.) Let

ϕ : (E n , S n−1 , s0 ) → (X, A, x0 )
210 CHAPTER 9. HOMOTOPY THEORY

represent α, where E n is the unit n-ball in Rn , S n−1 is the unit (n − 1)-sphere bounding E n , and s0 = (1, 0, · · · , 0).
The coordinate system in Rn determines an orientation and thus a generator ξn of the group Hn (E n , S n−1 ) ∼= Z.
Since ϕ maps (E n , S n−1 ) into (X, A) it induces a map on homology ϕ∗ : Hn (E n , S n−1 ) → Hn (X, A) where
Hn (X, A) denotes the singular homology group with integral coefficients. ϕ∗ depends only on α ∈ πn (X, A, x0 ) so
we have a function
χn : πn (X, A, x0 ) → Hn (X, A)
where χn (α) = ϕ∗ (ξn ).

Theorem 9.7.4. If either n > 1 or A = x0 , then χn is a homomorphism which will be called the natural homomor-
phism of πn (X, A, x0 ) into Hn (X, A).

Theorem 9.7.5. For any map f : (X, A, x0 ) → (Y, B, y0 ), the following square is commutative:

    πn (X, A, x0 ) --f∗--> πn (Y, B, y0 )
         |χn                    |χn
         ↓                      ↓
    Hn (X, A)      --f∗--> Hn (Y, B)

That is, χn ◦ f∗ = f∗ ◦ χn .

For the case A = x0 , there is an isomorphism

j∗ : Hn (X) ∼= Hn (X, x0 ).

Let
hn = j∗−1 χn : πn (X, x0 ) → Hn (X).
This is called the natural homomorphism or Hurewicz homomorphism of πn (X, x0 ) into Hn (X). For n = 1, h1
corresponds to the homomorphism h∗ in the proof of Theorem 9.2.7.

Definition 9.7.1. For n ≥ 0, a space X is n-connected if it is pathwise connected and πm (X) = 0 for all 0 < m ≤ n. So a 1-connected space is simply connected.

We can now state the Hurewicz Theorem. See Hu [74] for the proof.

Theorem 9.7.6. Hurewicz Theorem: If X is an (n − 1)-connected finite simplicial complex with n > 1, then the
reduced homology groups H̃i (X) = 0 for 0 ≤ i < n, and the natural homomorphism hn is an isomorphism.

So the first nonzero homotopy group is isomorphic to the homology group of the same dimension for a simply
connected finite simplicial complex.
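Two standard illustrations of the Hurewicz Theorem:

```latex
% S^n (n > 1) is (n-1)-connected, so the first possibly nonzero homotopy group
% is in dimension n:
\pi_n(S^n) \cong H_n(S^n) \cong \mathbb{Z}. \\
% \mathbb{CP}^2 is simply connected with H_2(\mathbb{CP}^2) \cong \mathbb{Z}, so
\pi_2(\mathbb{CP}^2) \cong H_2(\mathbb{CP}^2) \cong \mathbb{Z}.
```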
Recall that the case n = 1 was handled in Theorem 9.2.8. As π1 (X) may be non-abelian, we have to mod out by
the commutator subgroup and make it abelian to produce H1 (X).
The map hn will be important for some of our later constructions.

9.7.3 Freudenthal’s Suspension Theorem


In this section, we will follow Hatcher [63].
As stated above, excision generally doesn’t hold for homotopy. A version of it does hold, though, in a range of
dimensions called the stable range.

Theorem 9.7.7. Let X be a CW complex, and let A, B ⊂ X be subcomplexes of X such that X = A ∪ B, and C =
A ∩ B ̸= ∅. If (A, C) is m-connected and (B, C) is n-connected with m, n ≥ 0, then the map πi (A, C) → πi (X, B)
induced by inclusion is an isomorphism for i < m + n and an epimorphism for i = m + n.
9.8. EILENBERG-MACLANE SPACES AND POSTNIKOV SYSTEMS 211

This theorem is sometimes called the homotopy excision theorem as the rather lengthy proof makes use of an
excision argument that holds in the specified dimension range.
Now recall that for a complex X, the cone CX is the complex obtained by taking a point x0 not in X and connecting every point in X with a line segment to x0 . As an example, it turns a circle into the usual meaning of a cone. Also remember that cones plug up holes and are contractible, so all of their reduced homology groups and homotopy groups are zero. The suspension SX involves taking two points x0 and x1 and joining each to every point in X. The suspension of a circle is S 2 , as you can see by holding two ice cream cones together along their wide ends. In fact we have that S(S n ) ∼= S n+1 for n ≥ 0.
Definition 9.7.2. The suspension map πi (X) → πi+1 (SX) is defined as follows: Let SX = C+ X ∪ C− X where
C+ X and C− X are two cones over X. The suspension map is the map
πi (X) ≈ πi+1 (C+ X, X) → πi+1 (SX, C− X) ≈ πi+1 (SX),
where the isomorphisms on the two ends come from the long exact homotopy sequences of the pairs (C+ X, X) and
(SX, C− X) respectively and the middle map is induced by inclusion.
We can now state the important Freudenthal Suspension Theorem.
Theorem 9.7.8. Freudenthal Suspension Theorem: The suspension map πi (S n ) → πi+1 (S n+1 ) for i > 0 is an
isomorphism for i < 2n − 1 and an epimorphism for i = 2n − 1. This holds more generally for the suspension
πi (X) → πi+1 (SX) whenever X is an (n − 1)-connected CW complex.
Proof: From the long exact sequence of the pair (C+ X, X), we see that πi+1 (C+ X, X) ∼= πi (X), so that (C+ X, X) is n-connected if X is (n − 1)-connected. The same holds for the pair (C− X, X). Replacing i by i + 1, m by n, A by C+ X, B by C− X, X by SX, and C by X in Theorem 9.7.7 gives that the middle map in the suspension is an isomorphism for i + 1 < n + n = 2n and an epimorphism for i + 1 = 2n. ■
Definition 9.7.3. The range of dimensions where the suspension πi (S n ) → πi+1 (S n+1 ) is an isomorphism is called
the stable range. The study of homotopy groups of spheres in the stable range is called stable homotopy theory.
Theorem 9.7.9. πn (S n ) ∼= Z, generated by the identity map, for all n > 0. The degree map πn (S n ) → Z is an isomorphism.
We have already seen this result, but it is also an immediate consequence of the Freudenthal Suspension Theorem, since for i > 1, πi (S i ) ∼= πi+1 (S i+1 ). Although we would only know that π1 (S 1 ) → π2 (S 2 ) is an epimorphism, we could use the Hopf fibration S 1 → S 3 → S 2 to show it is actually an isomorphism.
The Freudenthal Suspension Theorem is the most basic tool for computing homotopy groups of spheres. We will
take another look at it in Chapter 12.
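To see the stable range in action, here is a standard computation using the classical fact that π4 (S 3 ) ∼= Z2 :

```latex
% Freudenthal: \pi_i(S^n) \to \pi_{i+1}(S^{n+1}) is an isomorphism for i < 2n - 1.
% For i = n + 1 this holds once n + 1 < 2n - 1, i.e. n > 2, so the groups stabilize:
\pi_4(S^3) \cong \pi_5(S^4) \cong \pi_6(S^5) \cong \cdots \cong \mathbb{Z}_2.
```

This common value is the first stable homotopy group of spheres.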

9.7.4 Whitehead’s Theorem


It turns out as in the case of homology that two spaces that have the same homotopy groups do not have to be
homotopy equivalent (let alone homeomorphic). We do have the following useful result for CW complexes.
Theorem 9.7.10. Whitehead's Theorem: Let f : X → Y be a map between CW complexes inducing isomorphisms f∗ : πn (X) → πn (Y ) for all n ≥ 0. Then f is a homotopy equivalence.
See Hatcher [63] for a proof.

9.8 Eilenberg-MacLane Spaces and Postnikov Systems


In homology, spheres are the simplest type of space. In reduced homology, H̃n (S n ) ∼= Z and H̃i (S n ) = 0 for i ̸= n. In homotopy, this only holds for S 1 ; for S n with n > 1, we don't know all of the homotopy groups and never will. I will talk more about that in Chapter 12.
So what are the simplest spaces when it comes to homotopy?
So what are the simplest spaces when it comes to homotopy?

Definition 9.8.1. Let n > 1 and π be an abelian group. Then the Eilenberg-MacLane space denoted K(π, n) is a
space whose only nonzero homotopy group is πn (K(π, n)) = π. If n = 1, we have a similar definition but π need not
be abelian.
Example 9.8.1. We have already shown that S 1 is a K(Z, 1) space.
Example 9.8.2. Consider the sphere S ∞ , which is the limit of including S n in S n+1 as n → ∞. This space is
contractible (we will see why in Chapter 11), and the quotient map p : S ∞ → P ∞ which glues together antipodal
points is a covering space, i.e. it is a fiber bundle with a discrete fiber. In this case, the fiber F consists of two points,
and π0 (F ) = Z2 while πn (F ) = 0 for n > 0. Restricting to S n → P n , we have the fiber bundle F → S n → P n and
the long exact sequence of the fiber bundle shows that π1 (P n ) = Z2 and πi (P n ) = 0 for 1 < i < n. Letting n → ∞
shows that P ∞ is a K(Z2 , 1) space.
Example 9.8.3. From the bundle S 1 → S ∞ → CP ∞ , the long exact sequence shows that πi (CP ∞ ) ∼= πi−1 (S 1 ) for all i, since S ∞ is contractible. This makes CP ∞ a K(Z, 2) space.
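Spelling out a segment of that long exact sequence:

```latex
% Long exact homotopy sequence of the fibration S^1 \to S^\infty \to \mathbb{CP}^\infty
\cdots \to \pi_i(S^\infty) \to \pi_i(\mathbb{CP}^\infty)
       \xrightarrow{\partial} \pi_{i-1}(S^1) \to \pi_{i-1}(S^\infty) \to \cdots \\
% Since \pi_*(S^\infty) = 0, the boundary map is an isomorphism:
\pi_2(\mathbb{CP}^\infty) \cong \pi_1(S^1) \cong \mathbb{Z},
\qquad \pi_i(\mathbb{CP}^\infty) = 0 \ \text{for } i \neq 2.
```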
So is there a K(π, n) for any abelian π and n > 1? The answer is yes as we will now show. Note that most of
these spaces are infinite dimensional and you will only visualize them in your worst nightmares. But building a CW
complex of that form is surprisingly easy. I will present the argument found in the book by Mosher and Tangora [106].
We will make a lot of use of their book in Chapter 11.
Theorem 9.8.1. If n > 1 and π is an abelian group, then there exists a CW complex with the homotopy type of a
K(π, n) space.
Proof: Let
0→R→F →π→0
be a free resolution of π and let {ai }i∈I and {bj }j∈J be bases of R and F respectively where I and J are index sets.
Let K be the wedge (one point union) of spheres Sjn for j ∈ J. A more poetic and often used term is that K is a
bouquet of spheres. For each ai , which represents a relation in π, take an (n + 1)-cell ei and attach it to K by a map
fi : ėi → K where [fi ] = ai ∈ πn (K) = F , and ėi is the boundary of ei . Let X be the space formed from K by
attaching the cells ei as described. The dimension of X is at most n+1 and πn (X) = π. It is also (n−1)-connected by
the Hurewicz Theorem as there are no cells in dimension less than n so the reduced homology groups are H̃i (X) = 0
for 0 ≤ i < n.
We now need to kill off the higher homotopy groups. That will be done with the following result.
Theorem 9.8.2. Let Z be the complex formed by attaching an (m+1)-cell to a CW complex Y by a map f : S m → Y . Then the inclusion j : Y → Z induces isomorphisms j∗ : πi (Y ) ∼= πi (Z) for i < m. In dimension m, j∗ is an epimorphism which takes the subgroup generated by [f ] to zero.
Proof: The isomorphism for i < m follows from the fact that j can be approximated by a cellular map and that Y
and Z have the same m-skeleton. If i, j, and k are inclusions, the following diagram commutes:

    S m = ėm+1 --i--> em+1
        |f                |k
        ↓                 ↓
        Y      --j--> Z = Y ∪f em+1

In homotopy we get j∗ f∗ = k∗ i∗ = 0, since em+1 is contractible, which implies that i∗ = 0. If [1] is the generator of πm (S m ) ∼= Z, then j∗ [f ] = j∗ f∗ [1] = 0. Since j∗ is a homomorphism, it takes the subgroup generated by [f ] to zero. Finally, the fact that j∗ is onto πm (Z) follows from cellular approximation, since any map of S m into Z can be deformed into the m-skeleton Z m = Y . This proves Theorem 9.8.2. So by repeatedly attaching cells in successively higher dimensions to kill generators of the higher homotopy groups, all homotopy groups above dimension n are made zero, proving Theorem 9.8.1. ■

We will see that any two K(π, n) spaces with n > 1 have the same homotopy type. This will come from an
important theorem about cohomology operations, the subject of Chapter 11.
The analog of an Eilenberg-MacLane space for homology is a Moore Space.

Definition 9.8.2. Let n ≥ 1 and G be an abelian group. Then the Moore space denoted M (G, n) is a space whose only nonzero reduced homology group is H̃n (M (G, n)) = G. For n > 1 we will need the space to be simply connected.

Example 9.8.4. Any S n is a M (Z, n) space.

Example 9.8.5. Let X be S n with a cell en+1 attached by a map S n → S n having degree m. Then X is a M (Zm , n)
space. We can then form M (G, n) with G any finitely generated abelian group by taking the wedge of spheres and
spaces of the form M (Zm , n).
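As a concrete instance of Example 9.8.5 with n = 1 and m = 2: attaching a 2-cell to S 1 by the degree-2 map z → z 2 yields the real projective plane, so RP 2 is an M (Z2 , 1):

```latex
\mathbb{RP}^2 = S^1 \cup_{z \mapsto z^2} e^2, \qquad
\tilde{H}_1(\mathbb{RP}^2) \cong \mathbb{Z}_2, \quad
\tilde{H}_i(\mathbb{RP}^2) = 0 \ \text{for } i \neq 1.
```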

For a general abelian group G, there is a construction involving the free resolution of G which is similar to the construction of Eilenberg-MacLane spaces above. See Hatcher [63] for details.
The final topic in this section is Postnikov systems, also known as Postnikov towers. They were invented in 1951 by Mikhail Postnikov and are a means to decompose a space with more than one nontrivial homotopy group into simpler spaces. (I would have invented them myself if the "nikov" had been replaced by an "ol".) Postnikov's early papers are mainly in Russian, but the American Mathematical Society published a translation in 1957 [123]. My advisor, Donald Kahn, wrote an early paper in 1963 [80] which shows how a map f : X → Y gives rise to a map of the Postnikov system for X into the Postnikov system for Y . Postnikov systems are covered in a number of books (e.g. [63, 170]), but I will follow the description in Mosher and Tangora [106]. Postnikov systems will be needed for the calculations in the next chapter that are involved in applying obstruction theory to data science.
We would like to find invariants which represent the homotopy type of a space through dimension n. Unfortunately,
the n-skeleton does not uniquely determine homotopy type through dimension n as the following example shows.

Example 9.8.6. Let K denote S n as a CW complex with one n-cell and one 0-cell, where the n-cell is attached to the 0-cell along its boundary. Let L be S n as a CW complex built in a different way: S n consists of S n−1 with two n-cells attached, where S n−1 is the equator and the two n-cells are the northern and southern hemispheres. Then K and L are homeomorphic, but the (n − 1)-skeleton K n−1 of K is a single point, while Ln−1 is S n−1 .

To get around this problem, we need Whitehead’s theory of n-types [171]. First we state his cellular approximation
theorem which we have informally used in our discussion of Eilenberg-MacLane spaces.

Definition 9.8.3. A map f : K → L is cellular if f (K n ) ⊂ Ln . A homotopy F : K × I → L between cellular maps is cellular if F (K n × I) ⊂ Ln+1 for every n.

Theorem 9.8.3. Cellular Approximation Theorem: Let K0 be a subcomplex of K and let f : K → L be a map
such that f |K0 is cellular. Then there exists a cellular map g : K → L such that g ∼ f rel K0 .

Theorem 9.8.4. If f : K → L, there exists a cellular map g : K → L which is homotopic to f . If two cellular maps
f and g are homotopic, there exists a cellular homotopy between them.

Next we will define the ideas of n-homotopy type and n-type.

Definition 9.8.4. Two maps f, g : X → Y are n-homotopic if for every CW complex K of dimension at most n and
for every map ϕ : K → X, we have that f ϕ, gϕ : K → Y are homotopic.

The property of being n-homotopic is an equivalence relation. If f and g are n-homotopic maps from a complex
K into a space X, their restrictions to K n are homotopic by the definition. Conversely, if their restrictions to K n are
homotopic, then they are n-homotopic by the Cellular Approximation Theorem.

Definition 9.8.5. Two spaces X and Y have the same n-homotopy type if there exist maps f : X → Y and g : Y → X such that f g is n-homotopic to idY and gf is n-homotopic to idX . Then we say that (f, g) is an n-homotopy equivalence and g is an n-homotopy inverse of f .

The next definition will be more important.

Definition 9.8.6. Two CW complexes K and L have the same n-type if their n-skeletons K n and Ln have the same
(n − 1)-homotopy type. In other words, K and L have the same n-type if there are maps between K n and Ln whose
compositions are (n − 1)-homotopic to the identity maps of K n and Ln .

WARNING: n-type and n-homotopy type do not mean the same thing. Be careful not to confuse them.
Note that when n = ∞, ∞-homotopic means homotopic and K ∞ = K.

Theorem 9.8.5. If K and L have the same n-type, then they have the same m-type for m < n. This statement holds
even for n = ∞. If K and L have the same ∞-type, then they have the same m-type for all finite m. So n-type is a
homotopy invariant of a complex.

Example 9.8.7. Let K and L be the two complexes representing S n given in Example 9.8.6. Let n = 3. Then K 2 = e0 (i.e. a single point) and L2 = S 2 . Then since any map of a 1-complex into S 2 is nullhomotopic (i.e. homotopic to a constant map), K 2 and L2 have the same 1-homotopy type, so K and L have the same 2-type.

It turns out that the 1-type of a complex is a measure of the number of connected components, and Whitehead
showed that two complexes with the same 2-type have the same fundamental group. Postnikov systems will charac-
terize n-type for higher values of n.
Recall that [X, Y ] denotes homotopy classes of maps from X to Y .

Theorem 9.8.6. If L1 and L2 have the same n-type and the dimension of K is less than or equal to n − 1 then there
is a one to one correspondence between the sets [K, L1 ] and [K, L2 ].

Letting K = S i for i < n gives the following.

Theorem 9.8.7. If L1 and L2 have the same n-type, then πi (L1 ) ∼= πi (L2 ) for all i < n.

The converse is false, but Whitehead [171] proves the following.

Theorem 9.8.8. Suppose that there is a map F : K n → Ln which induces isomorphisms on the homotopy groups πi (K n ) ∼= πi (Ln ) for all i < n. Then K and L have the same n-type.

Now we are almost ready to define Postnikov systems. If L is an (n − 1)-connected complex and π = πn (L), then L and K(π, n) have the same homotopy groups through dimension n.

Theorem 9.8.9. L and K(π, n) have the same (n+1)-type.

We need a map between the two spaces that induces the isomorphism on π so that we can use Theorem 9.8.8.
This comes from a correspondence between [L, K(π, n)] and H n (L; π). This is an important result in the theory of
cohomology operations. Mosher and Tangora cover it earlier in their book, but we will accept it for now and talk about
it more in Chapter 11.
We would like to modify K(π, n) in such a way as to build a space with the same (n + 2)-type as the complex L.
This is done through a Postnikov system.

Definition 9.8.7. Let X be an (n − 1)-connected CW complex with n ≥ 2 so that X is simply connected. A diagram
of the form shown is a Postnikov system if it satisfies the following conditions:
9.9. SPECTRAL SEQUENCES 215

                 ⋮
                 ↓
               Xm+1
                 ↓                 km (X)
    X --ρm-->   Xm  -------------> K(πm+1 , m + 2)
                 ↓
                 ⋮
                 ↓
          Xn = K(πn , n)
                 ↓
                (∗)

(The maps ρm+1 : X → Xm+1 to the higher stages are drawn the same way.)

1. Each Xm for m ≥ n has the same (m + 1)-type as X, and there is a map ρm : X → Xm inducing the isomorphisms πi (X) ∼= πi (Xm ) for all i ≤ m.
2. πi (Xm ) = 0 for i > m.
3. Xm+1 is the fiber space over Xm induced by km (X) from the path-space fibration over K(πm+1 , m + 2).
4. The diagram commutes up to homotopy.
In the diagram, πi denotes πi (X), km denotes a homotopy class of maps Xm → K(πm+1 , m + 2) and (∗) is a one
point space.
The tools that Mosher and Tangora use to construct a Postnikov system are beyond what we have covered. (See the first 12 chapters of [106].) In Chapter 10, I will outline a method that is claimed to be computable in polynomial time using a very different technique.

9.9 Spectral Sequences


Spectral sequences are the most ugly, hideous algebraic objects ever invented. Imagine a two-dimensional array whose grid points contain abelian groups or, more generally, modules. These objects are also constantly changing over time as we stand by helplessly, hoping they converge to something we can deal with. But in algebraic topology, we need to face up to them eventually. For example, suppose we have a fiber space F → E → B. In homotopy theory, we have an exact sequence. But what about homology? If you know the homology of F and B, can you find the homology of E? As we will see in Chapter 11, finding the cohomology of Eilenberg-MacLane spaces involves problems of this type. The Leray-Serre spectral sequence comes to the rescue here. Another spectral sequence, the Adams spectral sequence, is used in computing homotopy groups of spheres.
The main book on spectral sequences, which will tell you everything you ever wanted to know about the topic, is the book by McCleary [99]. Mac Lane's book [92] has a more compact description, and there is also a description in

Mosher and Tangora [106] which is more focused on cohomology operations. The description in Hu [74] seems to be missing some key pieces, so I don't recommend it for this topic. Also, note that Edelsbrunner and Harer [44] have a very brief description in their TDA book, but they don't really explore how spectral sequences might be applied.
I am presenting a brief description as it is important in understanding topics such as cohomology operations and homotopy groups of spheres. Also, one of the main ways that spectral sequences arise is through filtrations, so it may connect to TDA in that way. In what follows, I will mainly follow Mac Lane with help from other sources as needed. I will start with some definitions, then describe the Leray-Serre spectral sequence. Then I will discuss the two ways a spectral sequence can arise: through a filtration or through an exact couple. Finally, I will mention the Kenzo software of Dousson, et al. [43], which can automatically do computations in spectral sequences and Postnikov towers in a number of cases. I have not had the chance to experiment with this software, but it could be the basis of future efforts.

9.9.1 Basic Definitions


Definition 9.9.1. A differential bigraded module over a ring R is a family of modules E = {Ep,q } for integers p and q, together with a differential, which is a family of homomorphisms d : E → E of bidegree (−r, r − 1) with d · d = 0. The bidegree means that for all p, q ∈ Z,

d : Ep,q → Ep−r,q+r−1 .

The condition d · d = 0 means that we can consider E to be a chain complex with d as a boundary map and define homology as

Hp,q (E) = ker(d : Ep,q → Ep−r,q+r−1 ) / im(d : Ep+r,q−r+1 → Ep,q ).

We can make E into a singly graded module E = {En }, with total degree n, by letting

En = Σp+q=n Ep,q .

Then the differential d becomes a differential d : En → En−1 , and

Hn (E) = Σp+q=n Hp,q (E).

Definition 9.9.2. A spectral sequence E = {E r , dr } is a sequence E 2 , E 3 , · · · of bigraded modules, each with a differential

d^r : E^r_{p,q} → E^r_{p−r,q+r−1}

of bidegree (−r, r − 1) for r = 2, 3, · · · , and E r+1 = H(E r , dr ). In other words, E r+1 is the bigraded homology module of the preceding (E r , dr ). E r and dr determine E r+1 but not necessarily dr+1 . E 2 is called the initial term of the spectral sequence. (Note that it is conventional for a spectral sequence to start at r = 2, but there is no reason it couldn't start at r = 1 instead.)
In what follows, I will use d2 for the differential when r = 2 and d · d for d composed with itself.
Definition 9.9.3. If E1 , E2 are two spectral sequences, then a homomorphism f : E1 → E2 is a family of homomorphisms of bigraded modules

f^r : (E1 )^r_{p,q} → (E2 )^r_{p,q}

such that dr f r = f r dr and each f r+1 is the map induced by f r on homology.
Now we will look at the sequence in terms of submodules of E 2 . Recall that E r+1 = H(E r , dr ). So E 3 = H(E 2 , d2 ) = C 2 /B 2 , where C 2 is the kernel of d2 and B 2 is the image of d2 . C 2 and B 2 are then submodules of E 2 . Similarly, E 4 = H(E 3 , d3 ) = C 3 /B 3 , where C 3 /B 2 = ker d3 and B 3 /B 2 = im d3 . Here B 3 ⊂ C 3 . Iterating gives a chain

0 = B 1 ⊂ B 2 ⊂ B 3 ⊂ · · · ⊂ C 3 ⊂ C 2 ⊂ C 1 = E 2

of bigraded submodules of E 2 with E r+1 = C r /B r and

dr : C r−1 /B r−1 → C r−1 /B r−1

has kernel C r /B r−1 and image B r /B r−1 .


We can think of C r−1 as the module of elements that live until stage r, and B r−1 as the module of elements that bound by stage r. Letting r → ∞, C ∞ = ∩_{r=2}^∞ C r is the module of elements that live forever, and B ∞ = ∪_{r=2}^∞ B r is the module of elements that eventually bound.
Now B ∞ ⊂ C ∞ , so the spectral sequence determines a bigraded module

E ∞ = {E^∞_{p,q} = C^∞_{p,q} /B^∞_{p,q} }.

Generally E ∞ will be what you are trying to find. Also note that a homomorphism f : E1 → E2 will induce homomorphisms f r : E1r → E2r for 2 ≤ r ≤ ∞.
In homology calculations we often use a first quadrant spectral sequence, in which E^r_{p,q} = 0 if p < 0 or q < 0. The modules Ep,q are displayed at the grid points (p, q) of the first quadrant of the (p, q)-plane, as shown in Figure 9.9.1.

Figure 9.9.1: First Quadrant Spectral Sequence [92].

The differential dr is marked by an arrow. The terms of total degree n lie on a line of slope −1, and the differentials go from that line to the one below it. At the grid point (p, q), the value of E^{r+1}_{p,q} is the kernel of the arrow starting at E^r_{p,q} modulo the image of the arrow that ends there. There is no change if the outgoing arrow ends outside the first quadrant (if r > p) or the incoming arrow starts outside of it (r > q + 1). So for max(p, q + 1) < r < ∞, E^{r+1}_{p,q} = E^r_{p,q} . So for fixed p and q, E^r_{p,q} is eventually constant as r increases.
The terms Ep,0 on the p-axis are called the base terms. Each arrow dr ending on the base comes from below, so it is zero. So each E^{r+1}_{p,0} is a submodule of E^r_{p,0} , equal to the kernel of d^r : E^r_{p,0} → E^r_{p−r,r−1} . This gives a sequence of monomorphisms

E^∞_{p,0} = E^{p+1}_{p,0} → · · · → E^4_{p,0} → E^3_{p,0} → E^2_{p,0} .
The terms E0,q on the q-axis are called the fiber terms. Each arrow from a fiber term ends at zero. So the kernel of dr is all of E^r_{0,q} , and E^{r+1}_{0,q} is the quotient of E^r_{0,q} by the image of dr . This gives a sequence of epimorphisms

E^2_{0,q} → E^3_{0,q} → E^4_{0,q} → · · · → E^{q+2}_{0,q} = E^∞_{0,q} .

The two sequences are known as edge homomorphisms. The terms base and fiber should sound familiar. You will see
where they come from in the next subsection.

Definition 9.9.4. A spectral sequence is bounded below if for each degree n, there is a number s, dependent on n, such that E^2_{p,q} = 0 when p < s and p + q = n. This means that on a line of constant degree, the terms are eventually zero as p decreases. This is certainly true in a first quadrant sequence.
Theorem 9.9.1. Mapping Theorem: If f : E1 → E2 is a homomorphism of spectral sequences and if f t : E1t → E2t
is an isomorphism for some t, then f r : E1r → E2r is an isomorphism for r ≥ t. If in addition, E1 and E2 are bounded
below, f ∞ : E1∞ → E2∞ is an isomorphism.

9.9.2 Leray-Serre Spectral Sequence


There are a number of famous spectral sequences (see [99] for several examples), but I will focus on one of them.
The Leray-Serre spectral sequence is the main tool for computing homology of fiber spaces. If you can read French,
J-P Serre’s Ph.D. thesis [143] has a great explanation and the proofs of the theorems I will state. I could have only
dreamed of having a thesis that influential. It was considered so importnat that it was published in its entirety.
Consider a fiber space F → E → B. We will assume that B is pathwise and simply connected. Then any fiber has the same homology groups, so we can talk about the groups Hp (B; Hq (F )). This is the (singular) homology of B with coefficients in the group Hq (F ). If p = 0, then H0 (B; Hq (F )) ∼= Hq (F ).
Theorem 9.9.2. Leray-Serre: If f : E → B is a fiber map with base B pathwise and simply connected, and fiber F
pathwise connected, then for each n, there is a nested family of subgroups of the singular homology group Hn (E),

0 = H−1,n+1 ⊂ H0,n ⊂ H1,n−1 ⊂ · · · ⊂ Hn−1,1 ⊂ Hn,0 = Hn (E),

and a first quadrant spectral sequence such that

E^2_{p,q} ∼= Hp (B; Hq (F ))

and

E^∞_{p,q} ∼= Hp,q /Hp−1,q+1 .

If eB is the composite edge homomorphism on the base (defined in Section 9.9.1), the composite

Hp (E) = Hp,0 → Hp,0 /Hp−1,1 ∼= E^∞_{p,0} −eB→ E^2_{p,0} ∼= Hp (B; H0 (F )) ∼= Hp (B)

is the homomorphism induced on homology by the fiber map f : E → B.


If eF is the composite edge homomorphism on the fiber, the composite

Hq (F ) ∼= H0 (B; Hq (F )) ∼= E^2_{0,q} −eF→ E^∞_{0,q} → Hq (E)

is the homomorphism induced on homology by the inclusion F ⊂ E.


The sequence relates the homology of the base and the fiber to that of the total space E. E ∞ gives successive
factor groups in the filtration of the homology of E.
Now recall that the Universal Coefficient Theorem for Homology (Theorem 8.4.3) states that the sequence

0 → Hp (B) ⊗ Hq (F ) → E^2_{p,q} = Hp (B; Hq (F )) → Tor(Hp−1 (B), Hq (F )) → 0

is exact. If the Hp−1 (B) are all torsion free, then E^2_{p,q} ∼= Hp (B) ⊗ Hq (F ).
Mac Lane gives three examples. I will describe them in detail. They are instructive in seeing how a spectral
sequence works.
Theorem 9.9.3. The Wang sequence: If f : E → S k is a fiber space for k ≥ 2, and the fiber F is pathwise connected, then there is an exact sequence

· · · → Hn (E) → Hn−k (F ) −dk→ Hn−1 (F ) → Hn−1 (E) → · · · .

Proof: The base space S k is simply connected and has homology Hp (S k ) ∼= Z if p = 0, k and Hp (S k ) = 0 otherwise. Then we have E^2_{p,q} = Hp (S k ; Hq (F )) ∼= Hp (S k ) ⊗ Hq (F ), since all of the homology groups of S k are torsion free. Also, since Z ⊗ G ∼= G for any abelian group G, we have E^2_{p,q} ∼= Hq (F ) for p = 0, k and E^2_{p,q} = 0 otherwise. The nonzero terms of E^2_{p,q} are all on the vertical lines p = 0 and p = k, so the only nonzero differential dr for r ≥ 2 is dk . (Recall that dk lowers p by k and raises q by k − 1.) This means that E 2 = E 3 = · · · = E k and E k+1 = E k+2 = · · · = E ∞ . Since E k+1 = E ∞ is the homology of (E k , dk ), we have the exact sequence

0 → E^∞_{k,q} → E^k_{k,q} −dk→ E^k_{0,q+k−1} → E^∞_{0,q+k−1} → 0.

Now the filtration for Hn (E) has only two nonzero quotient modules, so it collapses to

0 ⊂ H0,n = Hk−1,n−k+1 ⊂ Hk,n−k = Hn .

Since by definition E^∞_{p,q} ∼= Hp,q /Hp−1,q+1 , we get a short exact sequence

0 → E^∞_{0,n} → Hn (E) → E^∞_{k,n−k} → 0.

Now splice this exact sequence and the previous one together, substituting q = n − k and using E^k_{k,q} = E^2_{k,q} ∼= Hq (F ) and E^k_{0,q} = E^2_{0,q} ∼= Hq (F ). We then get:

0 → E^∞_{k,n−k} → Hn−k (F ) −dk→ Hn−1 (F ) → E^∞_{0,n−1} → 0,

together with the epimorphism Hn (E) → E^∞_{k,n−k} → 0 and the monomorphism 0 → E^∞_{0,n−1} → Hn−1 (E). Splicing these sequences, with dk in the middle, defines Hn (E) → Hn−k (F ) as the composite Hn (E) ↠ E^∞_{k,n−k} ↪ Hn−k (F ) and Hn−1 (F ) → Hn−1 (E) as the composite Hn−1 (F ) ↠ E^∞_{0,n−1} ↪ Hn−1 (E), giving the desired result. ■
For the next example, we will compute the homology of the loop space of a sphere.
Notation warning: Recall that Hu [74] uses the symbol Ω for a path space and Λ for a loop space. Mac Lane [92] and most other books use Ω for a loop space. I will switch notation here to agree with him. From now on, I will specify what the symbol represents. In this section, ΩX will mean the space of loops on X. Mac Lane calls the space of paths on X, L(X).
Now if B is connected and simply connected, we have a fiber space p : L(B) → B. Let b0 be a base point and L(B) the set of paths that end at b0 . Then p takes a path in B to its initial point, and p−1 (b0 ) is the space of loops starting and ending at b0 . We let B be pathwise and simply connected so we can ignore the base point. Then the fiber is F = ΩB. Also recall that L(B) is contractible.
We will look at the homology of the space of loops of the sphere S k . You may find this useful if k = 2 and you
find yourself going around in circles.

Theorem 9.9.4. The loop space ΩS^k of the sphere S^k with k > 1 has homology H_n(ΩS^k) ≅ Z if n ≡ 0 (mod k − 1)
and H_n(ΩS^k) = 0 otherwise, for n ≥ 0.

Proof: S^k for k > 1 is simply connected, so each loop can be contracted to the zero loop. Thus ΩS^k is path
connected, so H_0(ΩS^k) ≅ Z. Now E = L(B) is contractible, so E is acyclic. Thus every third term in the Wang
sequence is zero except H_0(E) ≅ Z, and the sequence gives isomorphisms H_{n−k}(ΩS^k) ≅ H_{n−1}(ΩS^k). The result
follows from the initial value at H_0(ΩS^k). ■
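The recursion in this proof is easy to run by machine. Here is a small Python sketch (the function and its name are my own illustration, not from the text) that fills in the Betti numbers b_n = rank H_n(ΩS^k) by iterating the Wang isomorphism H_{n−1}(ΩS^k) ≅ H_{n−k}(ΩS^k) from the base cases b_0 = 1 and b_n = 0 for 0 < n < k − 1:

```python
def betti_loop_sphere(k, n_max):
    """Betti numbers b_n = rank H_n(Omega S^k) for 0 <= n <= n_max, k > 1,
    computed by iterating the Wang isomorphism H_{n-1} = H_{n-k} from the
    base cases b_0 = 1 and b_n = 0 for 0 < n < k - 1."""
    b = [0] * (n_max + 1)
    b[0] = 1
    for n in range(k, n_max + 2):   # H_{n-1}(Omega S^k) = H_{n-k}(Omega S^k)
        if n - 1 <= n_max:
            b[n - 1] = b[n - k]
    return b

print(betti_loop_sphere(3, 6))   # [1, 0, 1, 0, 1, 0, 1]: Z in every even degree
```

The output agrees with the closed form of Theorem 9.9.4: a copy of Z exactly when (k − 1) divides n.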
220 CHAPTER 9. HOMOTOPY THEORY

Theorem 9.9.5. The Gysin sequence: If p : E → B is a fiber space with simply connected base B and with fiber
F = S^k with k ≥ 1, then there is an exact sequence
· · · → H_n(E) --p_*--> H_n(B) --d^{k+1}--> H_{n−k−1}(B) → H_{n−1}(E) → · · · .

Proof: Since H_q(F) = H_q(S^k) is nonzero only if q = 0, k, we have E^2_{p,q} ≅ H_p(B) if q = 0 or q = k, and
E^2_{p,q} = 0 otherwise. The spectral sequence then lies on the two horizontal lines q = 0 and q = k, and the only
nonzero differential is d^{k+1}. We then get the exact sequences

0 → E^∞_{n,0} → E^2_{n,0} --d^{k+1}--> E^2_{n−k−1,k} → E^∞_{n−k−1,k} → 0

and

0 → E^∞_{n−k,k} → H_n(E) → E^∞_{n,0} → 0,
and we splice them together as in Theorem 9.9.3 to get our result. ■

9.9.3 Filtrations and Exact Couples


I will now discuss two situations that can give rise to spectral sequences.
In the first situation we have a filtration. A filtration F of a module A is a family of submodules Fp A for p ∈ Z
such that
· · · ⊂ Fp−1 A ⊂ Fp A ⊂ Fp+1 A ⊂ · · · .
Each filtration F of A determines an associated graded module GF A = {(GF A)p = Fp A/Fp−1 A}, consisting of
the successive factor modules in the chain.
If F and F ′ are filtrations of A and A′ respectively, then a homomorphism f : A → A′ of filtered modules
is a module homomorphism with f (Fp A) ⊂ Fp′ A′ . The filtration of a graded module A with a differential and
homomorphisms preserving the grading induces a filtration of the homology module H(A) with Fp (H(A)) defined
as the image of H(Fp A) under the injection Fp A → A. Since A itself is graded, the filtration F of A determines a
filtration Fp An of each An , and the differential of A induces homomorphisms ∂ : Fp An → Fp An−1 for each p and each
n. The family {Fp An } is a bigraded module. Letting q = n − p, we call p the filtration degree and q the complementary
degree. The bigraded module then takes the form {Fp Ap+q }. We call this an FDGZ module, an abbreviation of
filtered differential Z-graded module.
A filtration F of a differential graded module A is said to be bounded if there are integers s, t dependent on n such
that s < t, Fs An = 0, and Ft An = An . This makes the filtration of finite length and it takes the form

0 = Fs An ⊂ Fs+1 An ⊂ · · · ⊂ Ft An = An .

A spectral sequence {E^r_p, d^r} is said to converge to a graded module H if there is a filtration F of H and isomorphisms
E^∞_p ≅ F_p H/F_{p−1} H of graded modules for each p. To put the spectral sequence in the more familiar bigraded
form, let E^r_p be the graded module {E^r_{p,q}} graded by the complementary degree q.
Theorem 9.9.6. Each filtration F of a differential graded module A determines a spectral sequence (E^r, d^r) for
r = 1, 2, · · · which is a covariant functor of (F, A), together with the isomorphisms E^1_{p,q} ≅ H_{p+q}(F_p A/F_{p−1} A). If F
is bounded, then the sequence converges to H(A), i.e. E^∞_{p,q} ≅ F_p(H_{p+q} A)/F_{p−1}(H_{p+q} A).
Question to think about: Persistent homology starts with a filtration of a differential graded module. (See Section
5.1 and the more general version in Section 5.2.) Would building the associated spectral sequence give you interesting
information about your data? As I mentioned above, Edelsbrunner and Harer mention spectral sequences in [44] but to
the best of my knowledge, the meaning of a spectral sequence derived from a point cloud has not been fully explored.
See [92] for the rather lengthy proof of Theorem 9.9.6 and [99] for a more complete discussion of the relationship
between a filtration and the associated spectral sequence.
An alternate way of deriving spectral sequences is through exact couples.
9.9. SPECTRAL SEQUENCES 221

Definition 9.9.5. An exact couple C = {D, E, i, j, k} consists of two modules D and E together with homomorphisms
i : D → D, j : D → E, and k : E → D such that the triangle below is exact at each vertex:

         i
    D ------> D
     ^       /
    k \     / j
       \   /
        \ v
         E

The modules D and E may be graded or bigraded.

Since the triangle is exact, the composite jk : E → E has square (jk)(jk) = j(kj)k = j0k = 0, since kj = 0 by
exactness. So jk is a differential on E. Let E ′ = H(E, jk) and D′ = i(D). Let i′ = i|D′ , j ′ (id) = j(d) + jkE, and
k ′ (e + jkE) = ke for d ∈ D and e ∈ E where jk(e) = 0.
Now id = 0 implies d ∈ kE by exactness, so jd ∈ jkE and j ′ is well defined. Also, jke = 0 implies that
ke ∈ iD so k ′ is well defined.
It is easy to show that C′ = {D′ , E ′ , i′ , j ′ , k ′ } forms an exact triangle. (Try it.) C′ is then an exact couple
called the derived couple of C. Iterating the process produces the rth derived couple Cr = {Dr , E r , ir , j r , k r }.
Let C1 = C, C2 = C′ , and Cr+1 = (Cr )′ . Then we have the following:

Theorem 9.9.7. An exact couple of bigraded modules D and E with maps of bidegrees deg(i) = (1, −1), deg(j) =
(0, 0), and deg(k) = (−1, 0) determines a spectral sequence (E r , dr ) with dr = j r k r .

For the exact couple Cr , Mac Lane shows that we have bidegrees deg(ir ) = (1, −1), deg(j r ) = (−r + 1, r − 1),
and deg(k r ) = (−1, 0), so that deg(j r k r ) = (−r, r − 1). Thus, each E r+1 is the homology of E r with the appropriate
bidegree for a spectral sequence. The first stage of the sequence can be expressed as shown:

                 ...                                        ...
                  | i                                        | i
··· --j--> E_{p,q+1}   --k--> D_{p−1,q+1} --j--> E_{p−1,q+1} --k--> D_{p−2,q+1} --j--> ···
                                | i                                        | i
··· --j--> E_{p+1,q}   --k--> D_{p,q}     --j--> E_{p,q}     --k--> D_{p−1,q}   --j--> ···
                                | i                                        | i
··· --j--> E_{p+2,q−1} --k--> D_{p+1,q−1} --j--> E_{p+1,q−1} --k--> D_{p,q−1}   --j--> ···
                                | i                                        | i
                 ...                                        ...

A filtration of a graded differential module A determines an exact couple as follows. The short exact sequence of
complexes
0 → Fp−1 A → Fp A → Fp A/Fp−1 A → 0

yields the exact homology sequence


· · · → Hn (Fp−1 A) --i--> Hn (Fp A) --j--> Hn (Fp A/Fp−1 A) --k--> Hn−1 (Fp−1 A) → · · · ,

where i is injection, j is projection, and k is the homology boundary homomorphism. These sequences give an exact
couple with Dp,q = Hp+q (Fp A) and Ep,q = Hp+q (Fp A/Fp−1 A) and the degrees of i, j, and k as in Theorem 9.9.7.
This is called the exact couple of the filtration F .
Theorem 9.9.8. The spectral sequence of F is isomorphic to that of the exact couple of F .
See Mac Lane [92] for the proof. Also, filtrations give rise to exact couples but exact couples do not have to come
from a filtration. See Mac Lane for an example.

9.9.4 Kenzo
Kenzo is a software package for computational homological algebra and algebraic topology developed by Francis
Sergeraert of Grenoble Alpes University in Grenoble, France. The package, written in LISP, was originally called
EAT (effective algebraic topology) and then CAT (constructive algebraic topology). The latter acronym inspired
Sergeraert to name the package after his cat, Kenzo. (See Figure 9.9.2.) Kenzo has an
extensive website [43] with examples and lots of documentation.
Kenzo is not necessary for computing homology and cohomology groups, even though you can use it that way.
Rather, it is your best hope for computations of objects such as spectral sequences and Postnikov systems. It is also
the basis for the obstruction theory algorithms I will mention in the next chapter. The paper that brought Kenzo to my
attention was the introduction to the 2006 Summer School in Genova (Genoa) Italy [139] written by Sergeraert and
Julio Rubio of the University of La Rioja, Spain. Rubio's Ph.D. student, Ana Romero, is a joint author on a paper that
specifically deals with what type of spectral sequences can be computed with Kenzo. See [140] for details.
I will spend some time on some key concepts. Although you can prove some pretty cool results with exact
sequences and spectral sequences, neither one is actually an algorithm. There are plenty of times where they introduce
a lot of uncertainty and the fact that you can set up your problem that way does not guarantee a solution. We want an
algorithm that is constructive, and that is the type of problem that Kenzo can handle.
Rubio and Sergeraert [139] discuss the notion of constructive mathematics. There are proofs that an object exists
that don’t actually produce the object. Here is one of my favorites.
Definition 9.9.6. If X is a topological space then a sequence in X is a function from the nonnegative integers to X.
If we relax this condition to allow for the domain to be any linearly ordered set, the resulting function is called a net.
An ultranet U is a net such that for any subset A ⊂ X, U is eventually in A or U is eventually not in A. (By this we
mean that if the ultranet is a map from an indexing set J to X, and xα is the point in X that is the image of α ∈ J,
then there exists β ∈ J such that xγ ∈ A for all γ > β or xγ ∉ A for all γ > β. Which of these conditions holds
depends on the particular subset A.)
Example 9.9.1. Let a net (or even a sequence) have the property that for all γ > β, xγ = x0 where x0 is a fixed point
in X and β ∈ J is fixed. Then for a given choice of A ⊂ X, the net is eventually in A if x0 ∈ A and eventually not in
A if x0 ∉ A. This type of ultranet is called a trivial ultranet.
Theorem 9.9.9. There exists a non-trivial ultranet.
I will refer you to Willard [172], for example, if you want to see the proof. He uses a one-to-one correspondence
between nets and a special collection of subsets of a space called a filter. The counterparts to ultranets, called ultrafilters,
are a lot easier to construct. These imply the existence of nontrivial ultranets, but to my knowledge, nobody has ever
constructed one.
Part of the idea of constructive mathematics, then, is that there is a difference between saying that the negation of
a proposition is false and saying that the proposition is true. Here is an example from [139].
Example 9.9.2. Consider these two statements. Do they mean the same thing?

Figure 9.9.2: Kenzo the Cat [43].

1. It is false that there is no book about constructive analysis in the library.

2. The upper shelf to the left of the east window at the second floor of the library has a book about constructive
analysis.

Both statements imply that such a book exists, but only the second one enables it to be found. Romero and
Sergeraert suggest the book by Bishop and Bridges [16] as a good example.
Here is another example from [139].

Example 9.9.3. Are there two irrational numbers a and b such that a^b is rational?

Solution 1: Let c = √2^√2. If c is rational, then a = b = √2 is a solution. If c is irrational, then let a = c and
b = √2. Then

a^b = (√2^√2)^√2 = √2^(√2·√2) = √2^2 = 2.

So either (a, b) = (√2, √2) is a solution or (a, b) = (√2^√2, √2) is a solution.

Solution 2: Let a = √2 and b = 2 log2 3. Both a and b are known to be irrational and

a^b = (√2)^(2 log2 3) = 2^((1/2)·2 log2 3) = 2^(log2 3) = 3.

So (a, b) = (√2, 2 log2 3) is a solution.

Solution 1 is not a constructive solution. It gives two answers and one of them must be right, but we don't know
which one. Solution 2 gives an actual answer.
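Solution 2 can even be checked numerically, up to floating-point error. This quick sanity check (my own, not part of the proof that a and b are irrational) illustrates the difference: Solution 1 gives nothing to compute, while Solution 2 does:

```python
import math

# Sanity check of Solution 2: with a = sqrt(2) and b = 2*log2(3), both known
# to be irrational, a**b equals 3 exactly (very nearly, in floating point).
a = math.sqrt(2)
b = 2 * math.log2(3)
print(a ** b)                     # approximately 3.0
print(abs(a ** b - 3) < 1e-12)    # True
```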

So what is a constructive algorithm in homological algebra? An example of a failure is J.P. Serre’s attempt to
determine π6 (S 3 ) through building fibrations and using his spectral sequence. I will omit the details but in the end he
got the exact sequence
0 ← Z6 ← π6 (S 3 ) ← Z2 ← 0.
This shows that π6 (S 3 ) has 12 elements but he couldn’t decide between Z12 (later proved to be the correct answer)
and Z6 ⊕ Z2 . The problem was that the homology group from the spectral sequence corresponding to Z6 did not have
an explicit isomorphism exhibited between corresponding cycles and elements of Z6 . This would have solved the
problem.
The constructiveness requirement in homological algebra is stated in [139] as follows:

Definition 9.9.7. Let R be a ground ring and C a chain complex of R-modules. A solution S of the homological
problem for C is a set S = {σi }1≤i≤5 of five algorithms:

1. σ1 : C → {True, False} is a predicate deciding, for every n ∈ Z and every n-chain c ∈ Cn , whether c is a
cycle.

2. σ2 : Z → {R-modules} associates to every integer n an R-module σ2 (n) which is isomorphic to Hn (C).

3. The algorithm σ3 is indexed by Z. For each n ∈ Z, σ3,n : σ2 (n) → Zn (C) associates to every n-homology
class h ∈ σ2 (n) a cycle σ3,n (h) representing this homology class.

4. The algorithm σ4 is indexed by Z. For each n ∈ Z, σ4,n : Cn ⊃ Zn (C) → σ2 (n) associates to every n-cycle
z ∈ Zn (C) the homology class of z coded as an element of σ2 (n).

5. The algorithm σ5 is indexed by Z. For each n ∈ Z, σ5,n : Zn (C) → Cn+1 associates to every n-cycle
z ∈ Zn (C) that σ4 determined to be a boundary, a chain c ∈ Cn+1 such that dc = z.
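To make Definition 9.9.7 concrete, here is a toy Python sketch of the first two algorithms, σ1 and σ2 (the latter as a Betti number only), for a finite chain complex over the field GF(2) rather than a general ground ring. The bitmask encoding of chains, the function names, and the circle example are my own illustrative choices, not from [139]. Over a field, σ2 reduces to rank computations; σ3 through σ5 would additionally require keeping the bases produced by the elimination:

```python
def gf2_rank(columns):
    """Rank over GF(2) of a matrix given as a list of column bitmasks."""
    pivots = {}                     # highest set bit -> reduced column
    for v in columns:
        while v:
            p = v.bit_length() - 1
            if p in pivots:
                v ^= pivots[p]      # eliminate the pivot bit
            else:
                pivots[p] = v
                break
    return len(pivots)

def boundary(d_cols, chain):
    """Apply a boundary matrix d_n (list of column bitmasks) to an n-chain."""
    out = 0
    for j, col in enumerate(d_cols):
        if chain >> j & 1:
            out ^= col
    return out

def is_cycle(d_cols, chain):        # sigma_1: is the given n-chain a cycle?
    return boundary(d_cols, chain) == 0

def betti(dim_n, d_n, d_np1):       # sigma_2 (dimension of H_n only)
    return dim_n - gf2_rank(d_n) - gf2_rank(d_np1)

# The simplicial circle: vertices a = bit 0, b = bit 1; edges e1, e2, each
# with boundary a + b = 0b11 (mod 2).
d1 = [0b11, 0b11]                   # columns of d_1 : C_1 -> C_0
d0, d2 = [], []                     # d_0 = 0 and there are no 2-chains

print(is_cycle(d1, 0b11))           # e1 + e2 is a cycle: True
print(is_cycle(d1, 0b01))           # a single edge is not: False
print(betti(2, d0, d1))             # dim H_0 = 1
print(betti(2, d1, d2))             # dim H_1 = 1
```

The two Betti numbers of 1 recover the homology of the circle, as expected.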

Another concept is a locally effective versus an effective object. For a locally effective object, we can perform
calculations without storing the entire object in memory, just as a pocket calculator can add integers without storing
all of Z in memory. For some algorithms, though, we do need to know everything about the object. If we want to
compute Hn (C), we need full knowledge of Ck for k = n + 1, n, n − 1 and full knowledge of the differentials
dn and dn+1 so that we can compute ker dn and image dn+1 .
The last concept I will define is a reduction. It allows us to perform calculations in a simpler complex that is
equivalent in some way to a more complicated one.

Definition 9.9.8. A reduction ρ : Ĉ ⇒ C is a diagram

              f
    h ↻ Ĉ ⇄ C
              g

where:

1. Ĉ and C are chain complexes.



2. f and g are chain-complex morphisms.


3. h is a chain homotopy (degree +1).
4. These relations are satisfied:

(a) f g = idC .
(b) gf + dh + hd = idĈ .
(c) f h = hg = hh = 0.
Theorem 9.9.10. Let ρ : Ĉ ⇒ C be a reduction. Then ρ is equivalent to a decomposition:

Ĉ = A ⊕ B ⊕ C ′

such that:
1. C ′ = im g is a subcomplex of Ĉ.

2. A ⊕ B = ker f is a subcomplex of Ĉ.


3. A = ker f ∩ ker h is not in general a subcomplex of Ĉ.
4. B = ker f ∩ ker d is a subcomplex of Ĉ with null differentials.
5. The chain morphisms f and g are inverse isomorphisms between C ′ and C.

6. The arrows d and h are module homomorphisms of degrees -1 and 1 between A and B.
Figure 9.9.3 shows all of the reduction maps in a single picture.
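The relations in Definition 9.9.8 can be checked directly on the smallest interesting example: Ĉ the simplicial interval (vertices a, b and an edge e with de = b − a) reducing to a one-point complex C. The Python sketch below is my own illustration (the dictionary encoding of linear maps is an arbitrary choice); it verifies fg = id_C, gf + dh + hd = id_Ĉ, and fh = hg = hh = 0:

```python
def add(x, y):
    """Sum of two chains, each a dict {generator: integer coefficient}."""
    out = dict(x)
    for k, v in y.items():
        out[k] = out.get(k, 0) + v
    return {k: v for k, v in out.items() if v}

def apply_map(lin, x):
    """Apply a linear map, encoded as {generator: image chain}, to a chain."""
    out = {}
    for k, v in x.items():
        for k2, v2 in lin.get(k, {}).items():
            out[k2] = out.get(k2, 0) + v * v2
    return {k: v for k, v in out.items() if v}

d = {"e": {"b": 1, "a": -1}}          # differential of C-hat: de = b - a
f = {"a": {"pt": 1}, "b": {"pt": 1}}  # f : C-hat -> C collapses to the point
g = {"pt": {"a": 1}}                  # g : C -> C-hat includes at vertex a
h = {"b": {"e": 1}}                   # h : C-hat -> C-hat, degree +1 homotopy

# Relation (a): f g = id_C.
assert apply_map(f, apply_map(g, {"pt": 1})) == {"pt": 1}
# Relation (b): g f + d h + h d = id_{C-hat} on every basis chain.
for x in ({"a": 1}, {"b": 1}, {"e": 1}):
    lhs = add(apply_map(g, apply_map(f, x)),
              add(apply_map(d, apply_map(h, x)),
                  apply_map(h, apply_map(d, x))))
    assert lhs == x
# Relation (c): f h = h g = h h = 0.
assert apply_map(f, apply_map(h, {"b": 1})) == {}
assert apply_map(h, apply_map(g, {"pt": 1})) == {}
assert apply_map(h, apply_map(h, {"b": 1})) == {}
print("reduction identities verified")
```

Here C′ = im g is spanned by a, B = {0}, and A is spanned by b and e, matching the decomposition of Theorem 9.9.10.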
In the next chapter we will need to determine if maps between certain simplicial complexes and spheres when
restricted to the boundaries of simplices are zero elements in the resulting homotopy groups. Building on the theory
of constructive algebraic topology, this seems to be possible in a lot of cases. I will make this explicit next.

Figure 9.9.3: Maps Involved in a Reduction [139].


Chapter 10

Obstruction Theory

Back in 1989, one of the first chapters I wrote of my dissertation relied heavily on obstruction theory. My advisor said
that it was a good start, but I would have to add to it or people would call me an "obstructionist". Thirty years later,
I was telling anyone who would listen that I thought obstruction theory would have an interesting application in data
science. TDA deals with filtrations and these are really simplicial complexes that are growing. What if there was an
interesting function from these complexes to some space Y ? At each stage in adding a new simplex, we could ask if
this function could be extended continuously to the larger complex. Would that tell you something interesting about
your data that you didn’t know before? Maybe you could find something interesting about the point cloud itself. All
you would have to do is compute cohomology groups of the complex with coefficients in the homotopy groups of the
target space. Most people could only nod their heads at this one.
Then with COVID and the lockdown, I suddenly had a lot more free time. I decided I would review my homotopy
theory (pretty rusty after 30 years) and attack this problem once and for all. As of the time of this writing (July 2021)
I don’t have proof of an application that will change the world. I can argue, though, that this problem is possible due
to recent advances in computational topology such as the ones I mentioned at the end of the last chapter. If you have
read the first 9 chapters of this book, I can now explain to you what I mean.
So the idea is this: Let X be a point cloud and let V Rϵ (X) be the Vietoris-Rips complex associated with X at radius
ϵ. For δ > ϵ, V Rϵ (X) ⊂ V Rδ (X). Let f : X → Y be an "interesting" function and let Y be an "interesting"
space. As we will see, we need to use homotopy groups of Y , so we would like Y to be a low dimensional sphere S k
where k ≥ 2. (Remember that life could get unpleasant if Y is not simply connected.) Suppose we had a continuous
extension g : V Rϵ (X) → Y that agreed with f on the vertices of V Rϵ (X), i.e. on the original point cloud. Could
we extend g to a continuous function g̃ : V Rδ (X) → Y such that g̃ = g on V Rϵ (X)?
You should recognize this as the extension problem. To attack it, consider the two prototype examples. Recall that
if f : A → Y is a constant function of the form f (A) = y0 ∈ Y , we can extend f continuously to g : X → y0 ∈ Y for
any X ⊃ A. This also works if f is only homotopic to a constant map. If A = S n , then this means that f represents
the zero element of πn (Y ).
At the other extreme, we showed in Theorem 4.1.29 that if f : S n → S n has a nonzero degree, then it does not
extend to a continuous map h : B n+1 → S n . This holds in particular if f is the identity on S n . Since everything
holds up to homotopy we can say that f extends to h if and only if [f ] = 0 in πn (S n ).
The idea then behind obstruction theory is this: Suppose K is a complex and L is a subcomplex. Let f : L → Y .
Extend f over the vertices, then the edges, then the 2-simplices of K \ L. Suppose we have extended f to the n-
simplices of K \ L. We want to extend it over the (n + 1)-simplices. If σ is an (n + 1)-simplex, then f is already
defined on the boundary ∂σ ∼ S n . So the restriction of f to the boundary of σ is a map S n → Y . We can extend f
to σ if and only if f |∂σ represents the zero element of πn (Y ). Since we now have an element of πn (Y ) for every
(n + 1)-simplex, the result is a cochain in C n+1 (K, L; πn (Y )) known as the obstruction cochain. We can extend f to
the (n + 1)-simplices if and only if this cochain is zero.
We can tell if there is an extension to each individual simplex σ if we can decide whether f |∂σ represents the
zero element of πn (Y ). How could we do this, and would we live long enough to determine the answer? Francis


Sergeraert comes to our rescue here along with a group of colleagues from Masaryk and Charles Universities in the
Czech Republic. The paper [26] describes a polynomial time algorithm to compute all maps into a sphere from a space
X. This set is presented as a group and a given map can be associated with a group element. The paper [28] provides a
polynomial time computation of homotopy groups and Postnikov systems, and also discusses the extension problem,
which can be solved under certain dimensionality conditions. The companion paper [27] shows that these conditions
cannot be relaxed. Finally, [50] gives an algorithm for deciding whether two given maps are homotopic. The first
three papers especially are very long and complicated, so I will state the claims they make that apply to the problem at hand and
outline their approach.
In the first section, I will give a more rigorous description of obstruction theory as it relates to the extension
problem. In Section 10.2, I will present a possible data science application of obstruction theory and the Hopf map
f : S 3 → S 2 . Finally, in Section 10.3, I will discuss some additional simplicial set concepts needed to understand the
computational papers I mentioned above. The papers themselves are the subject of Section 10.4.
I should mention that obstruction theory was used by Steenrod for the problem of constructing a cross section of
a fiber bundle. See Steenrod [156]. Smith, Bendich, and Harer [145] applied obstructions to constructions of cross
sections for a data merging application. In what follows I will restrict myself to the extension problem.

10.1 Classical Obstruction Theory and the Extension Problem


This section is taken from Hu [74]. Since we are interested in data science applications, we will assume our
complexes are finite and triangulable. We will work with cell complexes here and simplicial sets in Section 10.4, but
in our case they will be equivalent.
Let K be a finite cell complex, L a subcomplex of K, and Y a given pathwise connected space with a fixed point
y0 ∈ Y . Let K n be the n-skeleton of K. The restricted extension problem is whether a map f : L → Y can be
extended continuously over K (as opposed to the more general problem of a map which is homotopic to an extension).
In any case we would meet in data science, extendability is actually equivalent to a weaker condition. The extension
problem depends only on the homotopy class of f .
Definition 10.1.1. Let f : X → Y and A ⊂ X. Then a homotopy ht : A → Y is a partial homotopy of f if f |A = h0 .
Definition 10.1.2. A subspace A of a space X has the homotopy extension property or HEP in X with respect to a
space Y if every partial homotopy ht : A → Y of a map f : X → Y has an extension gt : X → Y such that g0 = f .
A has the absolute homotopy extension property or AHEP in X if it has the HEP in X with respect to every space
Y.
For our applications, the AHEP will always hold as the following theorem shows. (See I.9. of [74] for the proof.)
Theorem 10.1.1. Let (X, A) be a finitely triangulable pair. Then A has the AHEP in X.
As we are dealing with finite simplicial complexes, A will have the AHEP in X with respect to any Y . In this case,
we can loosen our problem to ask whether there is a map f : X → Y such that f h is homotopic to a given g : A → Y
if h : A → X is inclusion.
The method will be to extend our given map step by step over the subcomplexes K̄ n = L ∪ K n until we hit an
obstruction. It may also be possible to go back a step and try a different extension that will avoid the obstruction. The
rest of this section will illustrate the process.

10.1.1 Extension Index


Again, let K be a cell complex, L a subcomplex of K, and K̄ n = L ∪ K n . Also, again assume that Y is pathwise
connected.
Definition 10.1.3. A map f : L → Y is n-extensible over K for a given n ≥ 0 if it has an extension over the
subcomplex K̄ n of K. The extension index of f over K is the least upper bound of integers n such that f is n-
extensible over K.

Since Y is pathwise connected, we can map the vertices of K \ L to arbitrary points of Y and the one cells to paths
connecting their endpoints. So we have the following.

Theorem 10.1.2. Every map f : L → Y is 1-extensible over K.

Theorem 10.1.3. Homotopic maps have the same extension index.

This holds since L has the AHEP in K̄ n .


Hu shows that nothing changes if there is a further subdivision or a different triangulation is used so we can assume
that all maps are simplicial.

10.1.2 The Obstruction Cochain


Now fix an integer n. We will assume that Y is n-simple. Recall that this means that the action of π1 (Y, y0 )
on πn (Y, y0 ) is simple. (See Section 9.6.6). We can ensure this condition by letting Y be simply connected. In
applications, we will have Y = S m where m ≥ 2.
Now let g : K̄ n → Y be given. The map g determines an (n + 1)-cochain cn+1 (g) with coefficients in πn (Y ) as
follows:
Let σ be any (n + 1)-cell in K. The boundary ∂σ is an oriented n-sphere. Since ∂σ ⊂ K n , the map gσ = g|∂σ
is already defined and, as a map S n → Y , determines an element [gσ ] ∈ πn (Y ). Define the obstruction cochain
cn+1 (g) of the map g by taking
[cn+1 (g)](σ) = [gσ ] ∈ πn (Y )
for every (n + 1)-cell σ of K.
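For intuition, here is the simplest computable instance, with a hypothetical setup of my own: take Y = S^1 (which is 1-simple since π1(S^1) = Z is abelian, even though it is not simply connected), define g on the vertices of a 2-cell σ by angles, and map each edge along the shorter arc. The value [c^2(g)](σ) ∈ π1(S^1) = Z is then the winding number of g|∂σ:

```python
import math

def obstruction_on_triangle(angles):
    """Winding number in pi_1(S^1) = Z of the loop g restricted to the
    boundary of a 2-cell: it visits the given vertex angles and travels
    along the shorter arc on each edge (one choice of edge extension)."""
    total = 0.0
    for i in range(len(angles)):
        step = angles[(i + 1) % len(angles)] - angles[i]
        step = (step + math.pi) % (2 * math.pi) - math.pi   # shorter-arc lift
        total += step
    return round(total / (2 * math.pi))

# Vertices spread evenly around the circle: g winds once around S^1 on the
# boundary, the obstruction is 1 != 0, and g does not extend over the 2-cell.
print(obstruction_on_triangle([0.0, 2 * math.pi / 3, 4 * math.pi / 3]))  # 1
# Vertices clustered together: winding number 0, so g extends over the cell.
print(obstruction_on_triangle([0.0, 0.1, 0.2]))                          # 0
```

This is exactly the dichotomy of the two prototype examples above: zero winding number means g|∂σ is null-homotopic and fills in; nonzero winding number is a genuine obstruction.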
It turns out that this cochain is actually a cocycle as the next theorem shows.

Theorem 10.1.4. The obstruction cn+1 (g) is a relative (n + 1)-cocycle of K modulo L, i.e.

cn+1 (g) ∈ Z n+1 (K, L; πn (Y )).

Proof: We first need to show that


cn+1 (g) ∈ C n+1 (K, L; πn (Y )).
We have already shown that it is a (n + 1)-cochain of K with coefficients in πn (Y ). So we just need to show it
vanishes on L. Let σ be an (n + 1)-cell of L. Since g is defined on all of L, it has an extension over the closure of σ
so [gσ ] = 0 in πn (Y ).
Now we need to prove that cn+1 (g) is a cocycle. Given an (n + 2)-cell σ in K, we need to show that [δcn+1 (g)](σ) =
0. Let B denote the subcomplex ∂σ of K and B n denote its n-skeleton. Consider the homomorphisms in the following
diagram:

Cn+1 (B) --∂--> Zn (B) = Zn (B n ) = Hn (B n ) <--h-- πn (B n ) --k∗--> πn (Y ),

where h denotes the natural homomorphism between homotopy and homology (see Section 9.7.2) and k∗ is induced
by the map k = g|B n .
If n > 1, then B n is (n − 1)-connected so h is an isomorphism by the Hurewicz Theorem. If n = 1, then h is an
epimorphism and the kernel of h is contained in the kernel of k∗ since πn (Y ) is abelian (Theorem 9.6.30). So in either
case we get a well defined homomorphism

ϕ = k∗ h−1 : Zn (B) → πn (Y ).

Since Cn (B) is an abelian group, the kernel Zn (B) of ∂ : Cn (B) → Cn−1 (B) is a direct summand of Cn (B). So, ϕ
has an extension
d : Cn (B) → πn (Y ).

Now let τ be an (n + 1)-cell in B. Then [cn+1 (g)](τ ) is represented by the map k|∂τ . We then get

[cn+1 (g)](τ ) = k∗ h−1 (∂τ ) = d(∂τ ).

This implies that


[δcn+1 (g)](σ) = [cn+1 (g)](∂σ) = d(∂∂σ) = 0.
So cn+1 (g) ∈ Z n+1 (K, L; πn (Y )). ■
The next two theorems are easy consequences.
Theorem 10.1.5. The map g : K̄ n → Y has an extension over K̄ n+1 if and only if cn+1 (g) = 0.
Theorem 10.1.6. If g0 , g1 : K̄ n → Y are homotopic maps, then cn+1 (g0 ) = cn+1 (g1 ).

10.1.3 The Difference Cochain


Now consider two maps g0 , g1 : K̄ n → Y which are homotopic on K̄ n−1 . The difference of the obstruction
cocycles of g0 and g1 is a coboundary.
To see this, consider a homotopy ht : K̄ n−1 → Y such that h0 = g0 |K̄ n−1 and h1 = g1 |K̄ n−1 .
Think of the unit interval I = [0, 1] as a cell complex with a 1-cell I and two 0-cells, 0 and 1. We have δ0 = −I
and δ1 = I. Then the product J = K × I is also a cell complex. Let J n be its n-skeleton. Let

J¯n = (L × I) ∪ J n = (K̄ n × 0) ∪ (K̄ n−1 × I) ∪ (K̄ n × 1).

Let F : J¯n → Y be defined by F (x, 0) = g0 (x) and F (x, 1) = g1 (x) for x ∈ K̄ n and F (x, t) = ht (x) for
x ∈ K̄ n−1 .
Then F determines an obstruction cocycle cn+1 (F ) of the complex J modulo L × I with coefficients in πn (Y ).
Then cn+1 (F ) coincides with cn+1 (g0 ) × 0 on K × 0 and with cn+1 (g1 ) × 1 on K × 1.
Let M denote the subcomplex (K × 0) ∪ (L × I) ∪ (K × 1) of J = K × I. Then it follows that

cn+1 (F ) − cn+1 (g0 ) × 0 − cn+1 (g1 ) × 1

is a cochain of J modulo M with coefficients in πn (Y ). Since σ → σ × I is a one-to-one correspondence between


the n-cells of K \ L and the (n + 1)-cells of J \ M , it defines the isomorphism

k : C n (K, L, πn (Y )) ≈ C n+1 (J, M ; πn (Y )).

So there is a unique cochain dn (g0 , g1 ; ht ) ∈ C n (K, L, πn (Y )) called the deformation cochain such that

kdn (g0 , g1 ; ht ) = (−1)n+1 {cn+1 (F ) − cn+1 (g0 ) × 0 − cn+1 (g1 ) × 1}.

If we actually have g0 |K̄ n−1 = g1 |K̄ n−1 and ht (x) = g0 (x) = g1 (x) for every x ∈ K̄ n−1 and every t ∈ [0, 1], we
write dn (g0 , g1 ; ht ) as dn (g0 , g1 ) and call it the difference cochain.
So the difference cochain has the property that g0 and g1 are actually equal on K̄ n−1 rather than just homotopic.
Theorem 10.1.7. In the situation above, the homotopy ht : K̄ n−1 → Y has an extension Ht : K̄ n → Y such that
H0 = g0 and H1 = g1 if and only if dn (g0 , g1 ; ht ) = 0.
The deformation cochain has a nice coboundary formula.
Theorem 10.1.8.
δdn (g0 , g1 ; ht ) = cn+1 (g0 ) − cn+1 (g1 ).
Since δI = 0, the isomorphism k taking σ to σ × I defined above commutes with δ. Applying δ to the formula
for k and using the facts that δ0 = −I, δ1 = I, and that cn+1 (F ), cn+1 (g0 ), and cn+1 (g1 ) are all cocycles proves the
theorem.
The next result shows the importance of the deformation (difference) cochain. See [74] for the proof.

Theorem 10.1.9. Let g0 : K̄ n → Y and suppose there is a homotopy ht : K̄ n−1 → Y with h0 = g0 |K̄ n−1 . For every
cochain c ∈ C n (K, L; πn (Y )), there exists a map g1 : K̄ n → Y such that g1 |K̄ n−1 = h1 and dn (g0 , g1 ; ht ) = c. If
z is a cocycle in Z n+1 (K, L; πn (Y )) such that z is in the same cohomology class as cn+1 (g0 ) mod L, there exists a
map g1 : K̄ n → Y such that g1 |K̄ n−1 = h1 and cn+1 (g1 ) = z.

In particular, if ht = g0 |K̄ n−1 for all t ∈ [0, 1], then there exists a map g1 : K̄ n → Y such that g0 |K̄ n−1 =
g1 |K̄ n−1 and cn+1 (g1 ) = z.

10.1.4 Eilenberg’s Extension Theorem


Let g : K̄ n → Y . Then g determines an obstruction cocycle cn+1 (g) ∈ Z n+1 (K, L; πn (Y )). Since cn+1 (g) is a
cocycle, it represents a cohomology class γ n+1 (g) ∈ H n+1 (K, L; πn (Y )) called the obstruction cohomology class.
The next theorem shows its significance.

Theorem 10.1.10. Eilenberg’s Extension Theorem: γ n+1 (g) = 0 if and only if there exists a map H : K̄ n+1 → Y
such that H|K̄ n−1 = g|K̄ n−1 .

Proof: Assume that such an H exists and let h = H|K̄ n . Since h extends to K̄ n+1 , cn+1 (h) = 0. Since
g|K̄ n−1 = h|K̄ n−1 , the difference cochain dn (g, h) is defined. By Theorem 10.1.8, cn+1 (h) = 0 implies that
cn+1 (g) is the coboundary of dn (g, h). Hence, γ n+1 (g) = 0.
Now assume that γ n+1 (g) = 0. Then cn+1 (g) is in the same cohomology class as 0 mod L. By Theorem
10.1.9, there exists a map h : K̄ n → Y such that g|K̄ n−1 = h|K̄ n−1 , and cn+1 (h) = 0. Then h has an extension
H : K̄ n+1 → Y . ■
Now suppose that g is an extension of f : L → Y . If cn+1 (g) ̸= 0, then g cannot be extended over K̄ n+1 . But if
cn+1 (g) is in the zero cohomology class, then the obstruction is removable by modifying the values of g only on the
n-cells of K \ L.

10.1.5 Obstruction Sets


I will conclude this section with one last definition.

Definition 10.1.4. Let f : L → Y . The (n+1)-dimensional obstruction set

On+1 (f ) ⊂ H n+1 (K, L; πn (Y ))

is defined as follows: If f is not n-extensible over K, then On+1 (f ) = ∅. So suppose instead that there exists an
extension g : K̄ n → Y of f . The cohomology class γ n+1 (g) is called an (n + 1)-dimensional obstruction element of f .
Then On+1 (f ) is the set of all (n + 1)-dimensional obstruction elements of f .

So we have an (n + 1)-dimensional obstruction element of f for each homotopy class of extensions K̄ n → Y of f .

Theorem 10.1.11. Homotopic maps have the same (n + 1)-dimensional obstruction set.

Now let (K ′ , L′ ) be another cellular pair and ϕ : (K ′ , L′ ) → (K, L) be a proper cellular map. Then ϕ induces a
homomorphism
ϕ∗ : H n+1 (K, L; πn (Y )) → H n+1 (K ′ , L′ ; πn (Y )).

Theorem 10.1.12. If f ′ = f ϕ : L′ → Y , then ϕ∗ sends On+1 (f ) into On+1 (f ′ ).

If (K, L) is a subdivision of (K ′ , L′ ) and ϕ is the identity, then ϕ∗ is an isomorphism that sends On+1 (f ) onto a
subset of On+1 (f ′ ) in a one-to-one fashion. If (K, L) and (K ′ , L′ ) are simplicial, then the identity ϕ−1 is homotopic to
a simplicial map and ϕ∗ is an isomorphism that sends On+1 (f ) onto On+1 (f ′ ). So the (n+1)-dimensional obstruction
set is smaller on a subdivision and smallest on a simplicial pair.

Theorem 10.1.13. If (K, L) is a simplicial pair and f : L → Y is a map, then On+1 (f ) is a topological invariant
and does not depend on the triangulation of (K, L).
Now we let (K, L) be a cellular pair, but in applications it will actually be simplicial. The following theorems
come from the definition of obstruction sets and the Eilenberg Extension Theorem. They apply to both the cellular
and simplicial case.
Theorem 10.1.14. The map f : L → Y is n-extensible over K if and only if On+1 (f ) ̸= ∅.
Theorem 10.1.15. The map f : L → Y is (n + 1)-extensible over K if and only if On+1 (f ) contains the zero element
of H n+1 (K, L; πn (Y )).
Theorem 10.1.16. If Y is r-simple and H r+1 (K, L; πr (Y )) = 0 for every r with n ≤ r < m, then the n-extensibility
of f : L → Y over K implies its m-extensibility over K.
If K \ L is of dimension not exceeding m, then the hypothesis of Theorem 10.1.16 implies that f : L → Y has an
extension over K if and only if it is n-extensible over K.
Theorem 10.1.17. If Y is r-simple and H r+1 (K, L; πr (Y )) = 0 for every r ≥ 1, then every map f : L → Y has an
extension over K.

10.1.6 Application to Data Science


Let’s look at what we know in relation to data science. Consider a point cloud X again and let V Rϵ (X) be the
Vietoris-Rips complex at radius ϵ. Let Y be a simply connected space. It is then r-simple for every r. The most
tractable situation is to let Y = S m , where m ≥ 2. As we will see in Chapter 12, finding homotopy groups of spheres
is an extremely difficult problem involving some very sophisticated algebraic topology tools. On the other hand, a
huge number of these groups are known and low dimensional groups of low dimensional spheres can be read from a
table. (I will give you one in Chapter 12.)
Now suppose δ > ϵ. Then V Rϵ (X) ⊂ V Rδ (X). Suppose f is an interesting function going from V Rϵ (X) to Y .
Can f be extended to g : V Rδ (X) → Y ? Looking at the obstruction cochain c^{n+1}(f ) we can discover potential
problem simplices. Let L = V Rϵ (X) and K = V Rδ (X). If Y = S m , we need to look at H n+1 (K, L, πn (S m )).
Now if Y = S m , then πn (S m ) = 0 for n < m, so we know that we can extend f to K̄ m . For n ≥ m, πn (S m ) is
generally nonzero. So there are two things to look at. First of all, is H n+1 (K, L, πn (S m )) = 0? Given our knowledge
of πn (S m ), we can determine if the group is zero by computing cohomology with the help of the long exact sequence
of the pair (K, L) and the Universal Coefficient Theorem. If there is no simplex in K \ L of dimension more than k,
then H k+1 (K, L, πk (S m )) = 0.
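The pair (L, K) in this setup is easy to experiment with. Below is a minimal sketch (the toy point cloud and helper name are my own choices) that builds Vietoris-Rips complexes at two radii by thresholding pairwise distances and lists the simplices of K \ L, which are exactly the places where an extension of f could run into obstructions:

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, eps, max_dim=3):
    """All simplices (as vertex tuples) whose vertices are pairwise within eps."""
    n = len(points)
    close = {pair for pair in combinations(range(n), 2)
             if dist(points[pair[0]], points[pair[1]]) <= eps}
    simplices = {(i,) for i in range(n)}
    for k in range(2, max_dim + 2):  # a simplex with k vertices has dimension k - 1
        for verts in combinations(range(n), k):
            if all(pair in close for pair in combinations(verts, 2)):
                simplices.add(verts)
    return simplices

# A toy point cloud: the corners of a unit square.
X = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
L = vietoris_rips(X, eps=1.1)  # the vertices and the four sides
K = vietoris_rips(X, eps=1.5)  # the diagonals appear, filling in higher simplices

# The simplices of K \ L are where an extension could fail.
new_simplices = sorted(K - L, key=len)
print(new_simplices)
```

Here K \ L consists of the two diagonals, the four triangles, and the solid tetrahedron, so any obstruction cochains live on those seven simplices.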
Two more problems remain.
1. Didn’t we say that we need the coefficients to be in a field? πn (S m ) is not necessarily going to be a field. For
example, πm (S m ) ∼ = Z for all m > 0. This doesn’t actually matter. Although we can’t use the persistent
homology algorithms, we still can compute the homology or cohomology for ϵ and δ fixed. Our problem may
suggest interesting (ϵ, δ) pairs or we can use every pair at which a simplex is added to the Vietoris-Rips complex
at δ. After all, this will only be finitely many times. In a different step, we can compute the persistent homology
of our cloud with Z2 coefficients and use features from both persistent homology and obstruction theory.
2. What if H n+1 (K, L, πn (S m )) ̸= 0? We can still extend f to an (n + 1)-simplex σ if f |∂σ is the zero element
of πn (S m ). How can we tell if that is the case? Using the papers [26, 27, 28] of C̆adek et al., we can determine
whether a map is the zero element of a homotopy group provided certain restrictions on dimensionality are met.
This leaves us with a lot of cases in which this is possible. These papers will be the subject of Section 10.4.
So I will leave you with some questions to think about.
1. For what data science problems would it be interesting to pick out a particular portion of a point cloud? (Or a
graph, time series, etc?)

2. Is there a meaningful function from these points to a sphere for which obstructions in part of your cloud would
have an interesting meaning?

3. Could n-extensibility of such a function be a feature that would help solve a difficult classification problem?

I will give a rough idea of a possible application in the next section.

10.2 Possible Application to Image Classification


I will give a disclaimer at the beginning that this idea is not completely thought out. I know that there are a lot of
great ways to segment and classify images, but the hope is that we will find something interesting that may not have
shown up with other methods. Even if this doesn’t accomplish that goal, it might inspire other situations for which
obstruction theory is more suited.
Consider an image having k × n pixels. Consider each one to be a point on a grid. As a start, we could take the
distance between them to be Euclidean distance where √ we consider neighboring pixels to be a distance of one unit
apart. Pixels adjacent along a diagonal would be 2 units apart. To make things more interesting, we could vary
inter-pixel distances in some way. Even more interesting, we could keep distances the same but stack the frames to
represent a moving picture. Distances to the pixels on different levels could represent units of time.
Now each pixel represents three eight-bit numbers corresponding to Y (luminance or greyscale) and two chrominance
values Cr and Cb . Each of these is a number between 0 and 255. Rescale the numbers by subtracting 128 and
adding some small number like .001 to avoid all three numbers being 0. Let v = [Ȳ , C̄r , C̄b ] be the vector of rescaled
values. Then v corresponds to the point v/||v|| on S 2 . So we have a map f from the vertices of V Rϵ (X) to S 2 , and we
can extend it continuously to the one and two dimensional simplices since S 2 is path connected and simply connected.
The interesting case is when we try to add 3-simplices and 4-simplices. Recall that π2 (S 2 ) ≅ Z is generated by the
identity map idS 2 , and f |∂σ3 is the zero element if and only if it has degree 0. This will determine if the map can be
extended to a given 3-simplex σ3 . For the 4-simplex σ4 , π3 (S 2 ) ≅ Z is generated by the Hopf map. We can use the
method of the C̆adek et al. papers to determine if f |∂σ4 is the zero element of this group. Due to the dimensionality
restrictions in their results, this is as far as you can go.
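The pixel-to-sphere map just described can be sketched in a few lines (the function name is my own; the shift value .001 follows the suggestion above):

```python
from math import sqrt

def pixel_to_sphere(y, cr, cb, shift=0.001):
    """Send a (Y, Cr, Cb) pixel, each component in 0..255, to a point on S^2.

    Rescale each component by subtracting 128 (plus a tiny shift so the
    vector is never exactly zero), then project radially onto the unit sphere.
    """
    v = (y - 128 + shift, cr - 128 + shift, cb - 128 + shift)
    norm = sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

p = pixel_to_sphere(200, 100, 128)
print(p)  # a unit vector in R^3
```

Applying this to every pixel gives the map f from the vertices of V Rϵ (X) to S 2 discussed above.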
Which simplices would have non-zero obstructions? Would anything interesting be going on there? Especially in
the moving picture case, it would make for some interesting experiments.

10.3 More on Simplicial Sets


As computational homotopy theory can’t really handle topological spaces themselves, it has to work with simplicial
sets. Peter May’s book [97] has a very well developed homotopy theory in the simplicial set context. It is probably a
good idea to work through it if you want a full understanding of the algorithms in Section 10.4. For this book, I will
just state the main theorems and outline some of the algorithms which get you there. Still we will need to know a
little more about simplicial sets. I will again get the material from the more brief and easier to understand discussion
in Friedman [51]. While nowhere near as complete as [97], it gives an idea of the major concepts. I will focus on
Friedman’s description of product sets and Kan complexes. You may find it helpful to review Section 6.4 at this point.
Two concepts from there will be especially important to remember:

1. According to Theorem 6.4.1, any degenerate simplex is formed by applying a sequence of degeneracy maps to
a unique non-degenerate simplex.

2. According to Theorem 6.4.2, a simplicial set X can be realized as a CW complex with one n-cell for each
nondegenerate n-simplex of X.

10.3.1 Products
Recall from Section 8.5 that we had to work pretty hard to handle products of simplicial complexes and chain
complexes. Simplicial sets are much easier by comparison.

Definition 10.3.1. Let X and Y be simplicial sets. Their product X × Y is defined by its n-simplices, face maps, and
degeneracy maps as follows:

1. (X × Y )n = Xn × Yn = {(x, y)|x ∈ Xn , y ∈ Yn },

2. If (x, y) ∈ (X × Y )n , then di (x, y) = (di x, di y).

3. If (x, y) ∈ (X × Y )n , then si (x, y) = (si x, si y).

Theorem 10.3.1. Let X and Y be simplicial sets and |X| and |Y | their geometric realizations. If |X × Y | is a
CW-complex, then |X × Y | and |X| × |Y | are homeomorphic.

To see what this looks like, I will take Friedman’s example of a square.

Example 10.3.1. Consider ∆1 × ∆1 . We claim that |∆1 × ∆1 | is the square |∆1 | × |∆1 |. We need to find the
nondegenerate simplices of ∆1 × ∆1 .
Figure 10.3.1 [51] illustrates the situation.

Figure 10.3.1: Realization of ∆1 × ∆1 [51].

∆1 has the two 0-simplices [0] and [1], so the product 0-simplices are

X0 = {([0], [0]), ([0], [1]), ([1], [0]), ([1], [1])},

which are the four vertices of the square.


For dimension 1, the one simplices of ∆1 are [0, 0], [0, 1], and [1, 1]. So the product has nine 1-simplices. The only
one made up of nondegenerate simplices is ([0, 1], [0, 1]). Since d0 ([0, 1], [0, 1]) = ([1], [1]) and d1 ([0, 1], [0, 1]) =
([0], [0]), we know that ([0, 1], [0, 1]) must be the diagonal. Those with one degenerate and one nondegenerate 1-
simplex are ([0, 0], [0, 1]), ([0, 1], [0, 0]), ([1, 1], [0, 1]), and ([0, 1], [1, 1]), which by checking d0 and d1 , we can see are
the left, bottom, right, and top of the square respectively. The other four 1-simplices are degeneracies of the vertices.
For example, ([0, 0], [1, 1]) = s0 ([0], [1]).
Next, there are four 2-simplices of ∆1 : [0, 0, 0], [0, 0, 1], [0, 1, 1], and [1, 1, 1]. So there are sixteen 2-simplices of
∆1 × ∆1 . There are also two degeneracy maps s0 and s1 from (∆1 × ∆1 )1 → (∆1 × ∆1 )2 . These act on the nine
1-simplices, but the count of degenerate 2-simplices is less than 18, since s0 s0 = s1 s0 and there are four degenerate
1-simplices s0 vi of ∆1 × ∆1 corresponding to the degeneracies of the four vertices. This reduces the number to 14
degenerate 2-simplices. There are two nondegenerate 2-simplices, ([0, 0, 1], [0, 1, 1]) and ([0, 1, 1], [0, 0, 1]), which are
the two triangles. We can see this by computing face maps.
Now we need to show that all higher dimensional simplices in ∆1 × ∆1 are degenerate. Since there are no
nondegenerate simplices of ∆1 of dimension greater than 1, each 3-simplex must be a double degeneracy of a
1-simplex. But there are only six options for a 1-simplex e. They are s0 s0 e, s0 s1 e, s1 s0 e, s1 s1 e, s2 s0 e, and s2 s1 e. But
the simplicial set axioms reduce these to s1 s0 e, s2 s0 e, and s2 s1 e. But the axioms then show that for 1-simplices e and
f,

(s1 s0 e, s1 s0 f ) = s1 (s0 e, s0 f )
(s1 s0 e, s2 s0 f ) = (s0 s0 e, s0 s1 f ) = s0 (s0 e, s1 f )
(s1 s0 e, s2 s1 f ) = (s1 s0 e, s1 s1 f ) = s1 (s0 e, s1 f )
(s2 s0 e, s1 s0 f ) = (s0 s1 e, s0 s0 f ) = s0 (s1 e, s0 f )
(s2 s0 e, s2 s0 f ) = s2 (s0 e, s0 f )
(s2 s0 e, s2 s1 f ) = s2 (s0 e, s1 f )
(s2 s1 e, s1 s0 f ) = (s1 s1 e, s1 s0 f ) = s1 (s1 e, s0 f )
(s2 s1 e, s2 s0 f ) = s2 (s1 e, s0 f )
(s2 s1 e, s2 s1 f ) = s2 (s1 e, s1 f )

So all 3-simplices of ∆1 × ∆1 are degenerate. So all higher dimensional simplices are also degenerate.
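The bookkeeping in this example can be checked mechanically. In the sketch below (my own encoding, not Friedman's), an n-simplex of ∆1 is a nondecreasing (n+1)-tuple with entries in {0, 1}, and a pair (x, y) is degenerate exactly when it is si of a lower pair, i.e., when some adjacent position repeats in both coordinates at once:

```python
from itertools import product

def simplices_delta1(n):
    """The n-simplices of Delta^1: nondecreasing (n+1)-tuples over {0, 1}."""
    return [s for s in product((0, 1), repeat=n + 1)
            if all(s[i] <= s[i + 1] for i in range(n))]

def nondegenerate_pairs(n):
    """Nondegenerate n-simplices of Delta^1 x Delta^1.

    A pair (x, y) is degenerate iff some position i repeats in both
    coordinates simultaneously, i.e. (x, y) = (s_i x', s_i y').
    """
    return [(x, y)
            for x in simplices_delta1(n)
            for y in simplices_delta1(n)
            if not any(x[i] == x[i + 1] and y[i] == y[i + 1] for i in range(n))]

counts = [len(nondegenerate_pairs(n)) for n in range(4)]
print(counts)  # [4, 5, 2, 0]: the vertices, the four sides plus diagonal, the two triangles, nothing above
```

The counts reproduce the hand computation above: four vertices, five nondegenerate 1-simplices, two nondegenerate 2-simplices, and only degenerate simplices from dimension 3 on.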

Friedman also looks at ∆p × ∆q ; see [51] for the details.

10.3.2 Kan Complexes


In order to do homotopy theory with simplicial sets we need an equivalent condition to the homotopy extension
property for topological spaces. We saw in Theorem 10.1.1 that the AHEP holds for a finitely triangulable pair. To
apply this to simplicial sets, we need some definitions.

Definition 10.3.2. As a simplicial complex, the kth horn |Λnk | on the n-simplex |∆n | is the subcomplex of |∆n |
obtained by removing the interior of |∆n | and the interior of the face dk ∆n . Let Λnk be the associated simplicial set.
This set consists of simplices [i0 , · · · , im ] with 0 ≤ i0 ≤ · · · ≤ im ≤ n such that not all numbers from 0 to n are
included (this would be the top face or a degeneracy of it), and we never have all numbers other than k represented, as
this would be the (n − 1)-face that was removed or a degeneracy of it. Figure 10.3.2 from [51] shows an example.

Figure 10.3.2: Horns on |∆2 | [51].
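This combinatorial description can be checked directly. Here is a small sketch (my own encoding) that enumerates the nondegenerate simplices of Λnk as the faces of ∆n whose vertex set is neither {0, ..., n} nor {0, ..., n} \ {k}:

```python
from itertools import combinations

def horn_nondegenerate(n, k):
    """Nondegenerate simplices of the horn Lambda^n_k, as vertex tuples."""
    full = set(range(n + 1))
    faces = []
    for m in range(1, n + 2):  # a face with m vertices is an (m - 1)-simplex
        for verts in combinations(range(n + 1), m):
            s = set(verts)
            # Exclude the top face and the removed face d_k; their
            # degeneracies are exactly the other simplices the horn omits.
            if s != full and s != full - {k}:
                faces.append(verts)
    return faces

h = horn_nondegenerate(2, 0)
print(h)  # the three vertices plus the edges [0, 1] and [0, 2]
```

For Λ20 this recovers the left horn in Figure 10.3.2: the face [1, 2] opposite vertex 0 and the interior are missing.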

Definition 10.3.3. The simplicial set X satisfies the extension condition or Kan condition if any morphism of sim-
plicial sets Λnk → X can be extended to a simplicial morphism ∆n → X. X is then called a Kan complex and is said
to be fibrant.

Note that Kan complexes are named after Daniel Kan [81], not my advisor, Donald Kahn, despite the similar
names.
Here is an alternate definition of the Kan condition.
Definition 10.3.4. The simplicial set X satisfies the Kan condition if for any collection of (n − 1)-simplices

x0 , x1 , · · · , xk−1 , xk+1 , · · · , xn

in X such that di xj = dj−1 xi for any i < j with i ̸= k and j ̸= k, there is an n-simplex x in X such that di x = xi for
all i ̸= k.
The alternative definition’s condition on the simplices glues them together to form the horn Λnk (possibly with
degenerate faces) and the definition says we can extend the horn to an n-simplex in X which is possibly degenerate.
Example 10.3.2. The standard simplices ∆n for n > 0 do not satisfy the Kan condition. Let ∆1 = [0, 1] and Λ20 be
the horn depicted in Figure 10.3.2 consisting of [0, 1], [0, 2] and their degeneracies. Let f : Λ20 → ∆1 be a simplicial
morphism with f ([0, 2]) = [0, 0] and f ([0, 1]) = [0, 1]. Then f is unique as it is specified on the nondegenerate
simplices of Λ20 . It is also a simplicial map since all functions on all simplices are order preserving. But this cannot be
extended to a simplicial map ∆2 → ∆1 since f (0) = 0, f (1) = 1, and f (2) = 0, so this map is not order preserving
on ∆2 .
Recall that the singular set functor, briefly mentioned in Theorem 6.4.3, is defined as follows:
Definition 10.3.5. If Y is a topological space, let S(Y ) be the simplicial set where Sn (Y ) consists of the singular
n-chains (i.e., the continuous maps |∆n | → Y ). Then S is called the singular set functor.
Friedman states that this example is critical.
Example 10.3.3. If Y is a topological space, then S(Y ) satisfies the Kan condition. To see this, let f : Λnk → S(Y ).
This map assigns to each (n − 1)-face di ∆n of ∆n with i ̸= k a singular simplex σi : |∆n−1 | → Y . Every other simplex
of Λnk is a face or a degeneracy of a face of one of these (n − 1)-simplices, so the map f is completely determined.
Also the simplicial set axioms ensure that the σi glued together yield a continuous function f : |Λnk | → Y . We can
now extend this function to all of |∆n | by letting r : |∆n | → |Λnk | be a continuous retraction. Such a retraction exists
since (|∆n |, |Λnk |) is homeomorphic to (I n−1 × I, I n−1 × 0). Then let σ = f r : |∆n | → Y . This is a singular
n-simplex with di σ = σi for all i ̸= k. This then determines the extension.
Figure 10.3.3 from [51] illustrates the extension.

10.4 Computability of the Extension Problem


In this section, I will discuss the possibility of deciding the extension problem on a computer. Although obstruction
theory and the extension problem were among Steenrod’s main interests, he never found a systematic procedure. See
[153] for an interesting description, although it is rather dated now. I will have a lot more to say about that paper in
Chapter 11.
The only early paper addressing an algorithmic approach was the paper of E. H. Brown [21] from 1957. Brown
showed that [X, Y ] is computable if Y is simply connected and all higher homotopy groups of Y are finite. This
is pretty limiting as it rules out Y being a sphere, since πn (S n ) ∼ = Z. He also came up with a method to compute
higher homotopy groups of Y which drops the finiteness condition but is hard to generalize. We also know that it is
undecidable to determine if [S 1 , Y ] = π1 (Y ) is trivial.
The two papers of C̆adek et. al. [26, 28] give an algorithm for deciding if a function can be extended and the related
question of characterizing maps into a sphere. As I mentioned above, the main problem to decide if f : ∂σ → Y can
be extended to all of σ is to determine if [f ] is the zero element of πn (Y ) assuming that σ is an (n + 1) simplex.
This problem is a special case of determining the group of maps from a space to a sphere Y = S d . In [26] an even
more general result is proved.

Figure 10.3.3: A singular set satisfies the Kan condition [51].

Theorem 10.4.1. Let d ≥ 2. There is an algorithm that given finite simplicial complexes (or simplicial sets) X and
Y , where dim X ≤ 2d − 2 and Y is (d − 1)-connected, computes the abelian group [X, Y ] of homotopy classes of
simplicial maps from X to Y . Since this is a finitely generated abelian group, our result expresses it as a direct product
of cyclic groups. Not only that, but given a simplicial map f : X → Y , the element of [X, Y ] corresponding to [f ]
can be determined. If d is fixed, the algorithm runs in time polynomial in the number of simplices (or nondegenerate
simplices in the simplicial set case) of X and Y .
Based on this result, [28] explicitly applies it to the extension problem.
Theorem 10.4.2. Let d ≥ 2. There is a polynomial time algorithm that given finite simplicial complexes X and Y
and a subcomplex A ⊂ X, where dim X ≤ 2d − 1 and Y is (d − 1)-connected, determines whether a simplicial
map f : A → Y admits an extension to a (not necessarily simplicial) map f : X → Y . If the extension exists, dim
X ≤ 2d − 2, and [X, Y ]f ⊂ [X, Y ] is the subset of [X, Y ] consisting of the maps X → Y extending f , then the
algorithm computes the structure of [X, Y ]f as a coset in the abelian group [X, Y ]. More generally X can be a finite
simplicial set (as opposed to a simplicial complex), and Y can be a simplicial set whose homology can be computed
in polynomial time.
Another C̆adek et. al. paper [27] shows the importance of the bounds on dimensionality in the previous two results.
Theorem 10.4.3. The extendability problem is undecidable for finite simplicial complexes A ⊂ X and Y and a
simplicial map f : A → Y if dim X = 2d and Y is (d − 1)-connected. In addition, for every d ≥ 2, there is a fixed
(d − 1)-connected Y = Yd such that for maps into Yd with A, X, and f as input, and dim X = 2d, the extension
problem is undecidable.
In particular, for every even d ≥ 2, the extension problem is undecidable for X of dimension 2d and Y = S d . For
odd d, however, it was shown in [166] that if Y = S d , then the extension problem is decidable without any restriction
on the dimension of X.
With the background I have given you so far, you should be able to read the papers, but they are quite long and
involved. Rather than give a complete proof of the results I have just stated, I will go over some of the ideas. I will
mainly focus on [26] but I will also mention the extension proof in [28].

To start, although the results would presumably address the data science applications I have in mind, there is no
mention of actual code or any experiments that I could find. It is true that Francis Sergeraert, the inventor of Kenzo,
is a coauthor on [26], so it is possible that Kenzo could implement the algorithm, but I have not looked closely at it. I
also have not found any mention of using these algorithms for data science applications. It seems to be an open area
for future research and the results give a lot of hope that obstruction theory for many special cases in data science
could be tractable.
I will now describe some of the ideas that lie behind the algorithm described in Theorem 10.4.1 and give the brief
outline found in [26].
First of all, the step-by-step method of obstruction theory is not really a practical algorithm. The extensions at
each step are not unique and we could end up searching an infinite tree of extensions as each step depends on previous
choices. This is the reason that Brown [21] needed all of the higher homotopy groups of Y to be finite. To keep this
process under control, [26] uses the group structure on [X, Y ].
In representing abelian groups, we need to know what information is really available to us. Semi-effective abelian
groups have a representation of their elements that can be stored on a computer and we can compute the group opera-
tions of addition and inverse on the level of representatives of their homotopy classes. A fully effective representation
means that we have a list of generators and relations, and we can express any given element in terms of the generators.
A homomorphism f between two semi-effective abelian groups is locally effective if there is an algorithm that takes a
representative of an element a in the domain of f and returns a representative of f (a).
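To make the notion of a fully effective representation concrete: if an abelian group is presented by generators and an integer matrix of relations, its decomposition into cyclic groups can be computed by diagonalizing the relation matrix over Z. Here is a minimal sketch of my own (it produces a diagonal form without enforcing the divisibility chain of the true Smith normal form, which is what real systems compute):

```python
def diagonal_relations(M):
    """Diagonalize an integer relation matrix by row/column operations over Z.

    Rows of M are relations among generators g_1, ..., g_c of an abelian
    group.  Returns diagonal entries d_1, ..., d_r; the group is the direct
    sum of Z/d_i (for d_i > 1) plus c - r free summands Z.  Unlike the true
    Smith normal form, no divisibility chain d_1 | d_2 | ... is enforced.
    """
    A = [row[:] for row in M]
    rows, cols = len(A), len(A[0])
    diag = []
    r = 0
    while r < rows and r < cols:
        pivot = next(((i, j) for i in range(r, rows) for j in range(r, cols)
                      if A[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        A[r], A[i] = A[i], A[r]              # move the pivot to position (r, r)
        for row in A:
            row[r], row[j] = row[j], row[r]
        done = False
        while not done:                      # Euclidean clearing steps
            done = True
            for i in range(r + 1, rows):     # clear column r
                if A[i][r] != 0:
                    q = A[i][r] // A[r][r]
                    A[i] = [a - q * b for a, b in zip(A[i], A[r])]
                    if A[i][r] != 0:         # nonzero remainder: smaller pivot
                        A[r], A[i] = A[i], A[r]
                    done = False
            for j in range(r + 1, cols):     # clear row r
                if A[r][j] != 0:
                    q = A[r][j] // A[r][r]
                    for row in A:
                        row[j] -= q * row[r]
                    if A[r][j] != 0:
                        for row in A:
                            row[r], row[j] = row[j], row[r]
                    done = False
        diag.append(abs(A[r][r]))
        r += 1
    return diag

# Relations 2*g1 = 0 and 2*g1 + 4*g2 = 0 present the group Z/2 + Z/4.
print(diagonal_relations([[2, 0], [2, 4]]))
```

Extracting generators and relations like this is the easy part; the machinery of [26] is needed to produce the relation matrices for groups like [X, Pi ] in the first place.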
Our geometric objects are represented as simplicial sets. A finite simplicial set is just encoded as a list of its
nondegenerate simplices and how they are glued together. We also need a way to represent an infinite simplicial set
such as an Eilenberg-Mac Lane space. To handle this, we use a framework developed by Sergeraert (see [139]) in
which the simplicial set is treated as a black box. This means we can encode the set and operations like the face
operators. The simplicial set is called locally effective. A simplicial map between locally effective simplicial sets is
locally effective if there is an algorithm that evaluates it on any given simplex of the domain simplicial set.
One more step before outlining the algorithm is to talk about Postnikov systems. To convert to the notation used
by [26], I will modify the diagram from Definition 9.8.7 that we previously used for a Postnikov system.

                              ..
                               .
                               ↓
                             Pm+1
                  ϕm+1   ↗     ↓
              Y ─────ϕm────→  Pm ────km (Y )───→ Km+2 = K(πm+1 , m + 2)
                               ↓
                              ..
                               .
                               ↓
                      Pd = K(πd , d)                                     (∗)

The maps ϕi : Y → Pi induce bijections [X, Y ] → [X, Pi ] if dim X ≤ i. If Y is (d − 1)-connected, then for
i ≤ 2d − 2, the Postnikov stage Pi is an H-space or a group up to homotopy. This in turn induces an abelian group
structure on [X, Pi ] for every space X regardless of the dimension of X. But if dim X ≤ 2d − 2, then we have the
bijection [X, Y ] → [X, P2d−2 ] which then gives the abelian group structure for [X, Y ]. To prove Theorem 10.4.1,
we compute the stages P0 , · · · , P2d−2 of a Postnikov system of Y and then by induction on i, determine [X, Pi ] for
i ≤ 2d − 2. We then return the description of [X, P2d−2 ] as an abelian group.
For the inductive computation of [X, Pi ] we can ignore the dimension restriction on X as we will also need to
compute [SX, Pi−1 ] where SX is the suspension of X and is of dimension one greater than that of X.
Finally, we need the Pi to be Kan simplicial sets. This ensures that every continuous map X → Pi is homotopic to
a simplicial map. Simplicial maps are discrete and finitely representable objects.
Here is an outline of the algorithm described in Theorem 10.4.1.
Main Algorithm:

1. Using the algorithm in Section 4 of [28], find a suitable representation of the first 2d − 2 stages of a Postnikov
system for Y . This algorithm provides us with the isomorphism types of the first 2d − 2 homotopy groups
πi = πi (Y ) along with the Postnikov stages Pi , and the Eilenberg-Mac Lane spaces Ki+1 = K(πi , i + 1) and
Li = K(πi , i) for i ≤ 2d − 2 as locally effective simplicial sets. It also outputs the maps between these spaces
such as the Postnikov classes ki−1 : Pi−1 → Ki+1 as locally effective simplicial maps.

2. Given a finite simplicial set X, the algorithm computes [X, Pi ] as a fully effective abelian group by induction
on i for i ≤ 2d − 2. The final output is then [X, P2d−2 ]. This step involves the following:

(a) Construct locally effective simplicial maps ⊞i : Pi × Pi → Pi and ⊟i : Pi → Pi representing group
addition and inverse respectively. This does imply that the spaces Pi have some sort of group structure.
C̆adek et al. state that the Pi have an H-space structure (i.e., a group up to homotopy) since Pi has
nonzero homotopy groups only in the range from d to 2d − 1, where Y is (d − 1)-connected. They outline the
proof in Section 5 of [26], but refer the reader to Whitehead [170] for a full proof.
(b) Using the group structure on the Pi , induce group operations ⊞i∗ and ⊟i∗ on the set SM ap(X, Pi ) of the
simplicial maps from X to Pi . These correspond to the group operations in [X, Pi ], the group of homotopy
classes of SM ap(X, Pi ). The result is a semi-effective representation for [X, Pi ].
(c) The last step is to convert this semi-effective representation of [X, Pi ] into a fully effective one. For this
step, [X, Li ] and [X, Ki+1 ] are easy to compute as fully effective abelian groups, since the groups of
homotopy classes of maps from X into Eilenberg-Mac Lane spaces are isomorphic to certain cohomology
groups of X. (I will explain this in Chapter 11.) In addition to [X, Li ] and [X, Ki+1 ], assume that we have
already computed [SX, Pi−1 ] and [X, Pi−1 ] in the induction step. (Here SX is the suspension of X.)
(d) The four groups from the previous step fit with [X, Pi ] into an exact sequence of abelian groups as follows:

[SX, Pi−1 ] → [X, Li ] → [X, Pi ] → [X, Pi−1 ] → [X, Ki+1 ]

For the specific maps between them and how you extract the value of [X, Pi ], I will refer you to Section 6
of [26] as some detailed explanation is required. The idea, though, is that we filter out the maps X → Pi−1
that can be lifted to maps X → Pi . (This requires the computation of a different type of obstruction. See
for example, [74].) For each map that can be lifted, we determine all possible liftings and check which
lifts are homotopic. Since there are infinitely many homotopy classes of maps involved in these operations,
we need to work with generators and relations in the appropriate abelian groups of homotopy classes. In
the end, we have a fully effective representation of [X, Pi ].

If Y is fixed, we can evaluate the Postnikov classes ki for i ≤ 2d − 2 as one-time work. Note that if Y = S d , then
kd corresponds to the Steenrod square Sq 2 . You will learn all about Steenrod squares in Chapter 11.
The reader is encouraged to look at Sections 3-6 of [26] for explicit details of the algorithm we have just outlined.
Finally, we will see how Theorem 10.4.1 can be used to derive the extension algorithm in Theorem 10.4.2. The
proof comes from [28].
Proof of Theorem 10.4.2:
Suppose A ⊂ X and Y are simplicial sets and f : A → Y is a simplicial map. Let X be finite with dim
X ≤ 2k − 1 and let Y be (k − 1)-connected. By Theorem 7.6.22 of [149], a continuous extension to X exists if
and only if ϕ2k−2 f : A → P2k−2 can be continuously extended to X, where ϕ2k−2 : Y → P2k−2 is the map in the
Postnikov system for Y . By the homotopy extension property, this happens when there is a map X → P2k−2 whose
restriction to A is homotopic to ϕ2k−2 f . In terms of homotopy classes, we need [ϕ2k−2 f ] to be in the image of the
restriction map ρ : [X, P2k−2 ] → [A, P2k−2 ].
Now to compute [X, Y ] we compute the group [X, P2k−2 ]. If dim X = 2k − 1 the two groups are not isomorphic
but the computation of [X, P2k−2 ] still works. By the methods of the algorithm for Theorem 10.4.1, this group is fully
effective and the generators are specified as simplicial maps. We also use the same methods to compute [A, P2k−2 ].
The group operation in [X, P2k−2 ] is induced by an operation ⊞ on SM ap(X, P2k−2 ) which is defined on each
simplex, so the restriction map ρ is a group homomorphism.
If [g] ∈ [X, P2k−2 ] is represented by g, then the restriction g|A is a representative of an element of [A, P2k−2 ]
which we can express in terms of the generators of [A, P2k−2 ]. So ρ is polynomial time computable and we can
compute its image as a subgroup of [A, P2k−2 ] by computing the images of the generators of [X, P2k−2 ] . Given a
simplicial map f : A → Y , we compute the corresponding element [ϕ2k−2 f ] ∈ [A, P2k−2 ] and test if it lies in the
image of ρ. This is the required polynomial time algorithm.
If we have the additional restriction that dim X ≤ 2k − 2, then [X, Y ] ≅ [X, P2k−2 ] and [A, Y ] ≅ [A, P2k−2 ],
so if x is in the image of ρ, we can compute the preimage ρ^{−1}(x) as a coset of [X, P2k−2 ]. (Here ρ is represented
by a matrix.) This coset is isomorphic to the group [X, Y ]f of maps X → Y which are extensions of f . This proves
Theorem 10.4.2. ■
This leaves us with some questions:

1. Would there be an interesting map f from a Vietoris-Rips complex to a space Y which may not be a sphere but
is (k − 1)-connected?
2. The Hopf map is within range of Theorem 10.4.2. (X = S 3 has dimension 3, and S 2 is 1-connected. Letting
k = 2, we have dim X = 2(2) − 1.) What would the algorithm do in the case of classifying still or moving images?
How long would it take?

3. Many homotopy groups of spheres are tabulated. Would knowing these speed up the algorithm?
This area is wide open. Assuming the algorithms from this section, it would be very interesting to see what
situations are tractable and what we could learn from our data.
Chapter 11

Steenrod Squares and Reduced Powers

With this chapter, we are definitely at the frontier of where algebraic topology and data science intersect. I have seen
very little discussion anywhere of the use of Steenrod squares for a data science application. So why am I including it
in this book?

1. Steenrod squares provide an even richer algebraic structure than cohomology. Using Z2 coefficients (as we must
with Steenrod squares), cup product provides a map

∪ : H p (X; Z2 ) × H q (X; Z2 ) → H p+q (X; Z2 ).

Steenrod squares provide maps


Sq i : H p (X; Z2 ) → H p+i (X; Z2 ),
where 0 ≤ i ≤ p. It turns out that for a ∈ H p (X; Z2 ), we have Sq 0 (a) = a and Sq p (a) = a ∪ a. (Hence the
term Steenrod squares.) This also seems relevant as we normally work in Z2 for persistent homology. But what
if you hate Z2 and really want to do all your work in Z683 ? (Yes, 683 is a prime number.) You can always use
reduced powers. They are of the form

P i : H q (X; Zp ) → H q+2i(p−1) (X; Zp ),

where p is an odd prime. I would probably try to talk you out of it in a practical situation, especially p = 683.
2. There are explicit formulas you can program onto a computer for finding Steenrod squares of simplicial sets.
The paper of González-Díaz and Real [56] provides such a formula. The bad news is that the complexity is
exponential: if c ∈ C i+k (X; Z2 ), then the number of face operators taking part in the formula for Sq i (c) is
O(i^{k+1}). The good news is that these numbers are very manageable if you keep the dimensions low. Also,
Aubrey HB [65] came up with a simplified formula for Sq 1 . Just computing a small number of the squares
could make the difference in a hard classification problem.
3. Steenrod squares are used to construct Stiefel-Whitney classes, which are cohomology classes used to classify
vector bundles. Vector bundles are fiber bundles whose fiber is a vector space. The point is that Steenrod squares
are already used in a classification problem.
4. Steenrod squares are historically tied to obstruction theory and the extension problem. Not only was it a mo-
tivation for their discovery, but the main theorem about the correspondence between cohomology operations
and cohomology classes of Eilenberg-Mac Lane spaces relies on obstruction theory for its proof. (This will be
discussed in Section 11.2.)

In Section 11.1, I will discuss two historical papers by Steenrod that give some insight into his original motivation
for Steenrod squares. Then I will go through the modern theory mainly using the book by Mosher and Tangora


[106]. This will include the correspondence between cohomology operations (of which Steenrod squares are a special
case) and the cohomology of Eilenberg-Mac Lane spaces, construction and properties of Steenrod squares, the Hopf
invariant, the structure of the Steenrod algebra, and some approaches to the very hard problem of computing the
cohomology of Eilenberg-Mac Lane spaces. Then I will briefly discuss reduced powers which involve cohomology
with coefficients in Zp , where p is an odd prime. Mosher and Tangora don’t discuss this case, so we will need to refer
to the book by Steenrod and Epstein [157]. Then I will briefly describe vector bundles and Stiefel-Whitney classes.
The classic book on this subject is Milnor and Stasheff [102], but Aubrey HB deals with them as well in [65]. I will
conclude with the formulas found in [56] and [65] that allow for computations on a computer.

11.1 Steenrod’s Original Motivation


Steenrod squares first appeared in the paper [154], written in 1946 and published in 1947. At that time, homotopy
theory was in its very early stages and there was a lot of interest in describing groups of homotopy classes of maps
between one type of complex and another. The first such result classified maps from S^n to itself in terms of degree. In the 1930s, Hopf discovered that π_3(S^2) ≅ Z, generated by the Hopf map. Then Freudenthal and Pontrjagin discovered that π_{n+1}(S^n) ≅ Z_2 for n > 2.
The problem Steenrod wanted to generalize was first solved by Hopf in 1933. Rather than map S^n to itself, map an n-dimensional complex K^n to S^n. Hopf showed that two maps f, g : K^n → S^n are homotopic (written f ∼ g) if and only if f^*(s^n) = g^*(s^n), where s^n is the generator of H^n(S^n) ≅ Z. Any n-cocycle in K^n is f^*(s^n) for some f, so the homotopy classes are in one-to-one correspondence with H^n(K^n).

In 1941, Pontrjagin [121] enumerated all of the homotopy classes of maps of a 3-complex K^3 to a 2-sphere S^2. If f ∼ g, then f^*(s^2) = g^*(s^2). In this case, g ∼ g′, where g′ is a function that coincides with f on the 2-skeleton K^2. For a 3-cell σ, f and g′ define an element d^3(f, g′, σ) ∈ π_3(S^2). Pontrjagin showed that f ∼ g′, and therefore f ∼ g, if there exists a 1-cocycle e^1 in K^3 such that d^3(f, g′) is cohomologous to 2e^1 ∪ f^*(s^2), where d^3(f, g′) is a 3-cocycle in Z^3(K^3; π_3(S^2)) = Z^3(K^3; Z). He also showed that any pair consisting of a 2-cocycle and a 3-cocycle on K^3 can be realized as f^*(s^2) and d^3(f, g) for a suitable f and g.
Steenrod's goal was to generalize this result to the case of maps from an (n+1)-complex K to S^n. To do this, the cup product is generalized to the cup-i product ∪_i. For i ≥ 0 and cochains u and v of dimensions p and q respectively, u ∪_i v has dimension p + q − i. If i = 0, then u ∪_0 v = u ∪ v.
In the original formula, the cup-i product was similar to our original definition of the simplicial cup product. (See the formula immediately after Definition 8.3.2.) If u and v are as above, then for a (p + q)-simplex, apply u to the p-simplex spanned by the first p + 1 vertices and v to the q-simplex spanned by the last q + 1 vertices, and multiply the results. For the cup-i product, we have some overlap in vertices and need to reorder them in a specific way. The result can then be positive or negative depending on whether the reordering involves an even or odd permutation. (Recall that a permutation is even or odd according to whether it is a product of an even or odd number of transpositions.)
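The front-face/back-face recipe is easy to mechanize. Here is a minimal Python sketch of the mod-2 simplicial cup product (the dict encoding of cochains and the function name are illustrative conventions, not from any standard library); the cup-i products additionally overlap and reorder vertices, which is omitted here:

```python
def cup_product(u, v, p, q):
    """Mod-2 cup product of a p-cochain u and a q-cochain v.

    Cochains are dicts sending sorted vertex tuples to 0 or 1; the value of
    u ∪ v on a (p+q)-simplex is u(front p-face) * v(back q-face) mod 2.
    """
    def value(simplex):
        front = simplex[:p + 1]   # first p+1 vertices
        back = simplex[p:]        # last q+1 vertices (they share one vertex)
        return (u.get(front, 0) * v.get(back, 0)) % 2
    return value

# Toy example on the single 2-simplex (0, 1, 2):
u = {(0, 1): 1, (1, 2): 1}              # a 1-cochain
v = {(0, 1): 0, (1, 2): 1, (0, 2): 1}   # another 1-cochain
w = cup_product(u, v, 1, 1)
print(w((0, 1, 2)))  # u(01) * v(12) = 1 * 1 = 1 mod 2
```

Note that evaluating in the other order gives v(01) · u(12) = 0 · 1 = 0, illustrating that the cup product is not commutative on the cochain level.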
The coboundary formula is also somewhat more complicated. Recall from Theorem 8.3.1 that

δ(c^p ∪ c^q) = (δc^p) ∪ c^q + (−1)^p c^p ∪ (δc^q).

This means the cup product of two cocycles is itself a cocycle. Here is the corresponding formula for the cup-i product.

Theorem 11.1.1. If u and v are cochains of dimensions p and q, then

δ(u ∪_i v) = (−1)^{p+q−i} u ∪_{i−1} v + (−1)^{pq+p+q} v ∪_{i−1} u + δu ∪_i v + (−1)^p u ∪_i δv.

The proof is long, ugly, and painful. See [154] if you are curious.
The point, though, is that the cup-i product of two cocycles is no longer a cocycle. But all is not lost. If we take u ∪_i u and u is a cocycle, the last two terms go away. If u has dimension p, then

δ(u ∪_i u) = (−1)^{2p−i} u ∪_{i−1} u + (−1)^{p^2+2p} u ∪_{i−1} u = ((−1)^{2p−i} + (−1)^{p^2+2p}) u ∪_{i−1} u.

This is ±2(u ∪_{i−1} u) if p − i is even and 0 if p − i is odd. Either way, working mod 2, u ∪_i u is a cocycle if u is a cocycle.
We replace u ∪_i u with the homomorphism

Sq_i : H^p(K; Z_2) → H^{2p−i}(K; Z_2).

Later we will switch to the more modern notation Sq^i, which is equal to Sq_{p−i}.
The Pontrjagin formula for n > 2 becomes: d^{n+1}(f, g) is cohomologous to e^{n−1} ∪_{n−3} e^{n−1}.
The homotopy classification theorem is obtained from an extension theorem. It states that a map f from the n-skeleton K^n of an (n + 2)-complex K to S^n can be extended to K if and only if f^*(s^n) is a cocycle in K and f^*(s^n) ∪_{n−2} f^*(s^n) = 0. See [154] for the proof and all of the details.
The other interesting historical paper is [153]. Steenrod died in 1971, but he had given permission to the journal Advances in Mathematics to publish the paper, which was based on the Colloquium Lectures given at the Annual Meeting of the American Mathematical Society at Pennsylvania State University in August 1957. It was published posthumously in 1972. It is much more polished than his original paper and a great introduction, but a little dated.
What I want to give you is a taste of the connection between the development of cohomology operations and the
extension problem. I will summarize the first part of [153] and use Mosher and Tangora [106] for the actual theory.
Steenrod’s introduction to the subject focused on the extension problem. I will restate it in the way it appeared at
the beginning of Chapter 9. Let X and Y be topological spaces. Let A be a closed subset of X and let h : A → Y be
a mapping. A mapping f : X → Y is called an extension of h if f (x) = h(x) for each x ∈ A. Letting g : A → X be
inclusion we want to find a map f such that f g = h. In terms of diagrams, we have the following:

            X
        g ↗   ↘ f
        A ──h──→ Y

We would like to determine if such an extension f exists.


We use algebraic topology to attack this problem by turning a geometry problem into an algebra problem. Using
homology, we want to find f∗ for each q in the following diagram:

              H_q(X)
        g_* ↗       ↘ f_*
      H_q(A) ──h_*──→ H_q(Y)

Here all the maps are group homomorphisms. A necessary condition for an extension to exist is that there is a group homomorphism f_* such that f_* g_* = h_*. Unfortunately, this condition is not sufficient. As Steenrod points out, too much information is lost in the translation from spaces to groups. We would like to encode as much of the geometry in the algebra as possible.
We can do better by using cohomology. Now we have the diagram

              H^q(X)
        g^* ↙       ↖ f^*
      H^q(A) ←──h^*── H^q(Y)

Now we want to find an f^* such that h^* = g^* f^*. Another key difference from the previous case is that our maps are now ring homomorphisms. This means that they not only respect addition but also ring multiplication. In other words,

f^*(u ∪ v) = f^*(u) ∪ f^*(v).
Below I will give some examples of how this approach is an improvement.
Meanwhile, we can do even better. A cohomology operation T relative to dimensions q and r is a collection of functions {T_X}, one for each space X, such that

T_X : H^q(X) → H^r(X)

and for each mapping f : X → Y,

f^* T_Y u = T_X f^* u

for all u ∈ H^q(Y). (I will state a more general definition in the next section.)
Sq^i and P^i are two examples of cohomology operations.
If f is the solution to an extension problem, then f^* must be a ring homomorphism with f^* T_Y = T_X f^* for every cohomology operation T.
Some examples will illustrate these ideas.
Example 11.1.1. We have already seen that the identity on S n can not be extended to the ball B n+1 of which S n
forms the boundary. I will review this case here for comparison. Converting to homology, we have the diagram

              H_q(B^{n+1})
        g_* ↗           ↘ f_*
      H_q(S^n) ──h_*──→ H_q(S^n)

Let q = n. Then H_n(S^n) ≅ Z. B^{n+1} is acyclic, so in particular H_n(B^{n+1}) = 0. Since h is the identity, h_* is the identity homomorphism on Z. But then g_* = 0, so for any f_*, we have f_* g_* = 0 ≠ h_* = id_Z.
Next we have a case where we need to use cohomology to make a decision.
Example 11.1.2. Let X = CP^2, the complex projective plane. (See Definition 4.3.9.) By Theorem 4.3.8, X is a CW complex with one cell each in dimensions 0, 2, and 4. Let A be CP^1, the complex projective line. It has one 0-cell and one 2-cell, so it is homeomorphic to S^2. I claim that A is not a retract of X. Again, this means that the identity on A cannot be extended to all of X.
Suppose this was not the case. If f : X → A is a retraction, then letting g : A → X be inclusion, we would have g^* f^* = identity. This means that H^q(X) = im f^* ⊕ ker g^* for any q. We can abbreviate this to

H^*(X) = im f^* ⊕ ker g^*.

In addition, g^* : im f^* → H^*(A) is an isomorphism. Using the fact that f^* and g^* are ring homomorphisms, we have that im f^* is a subring and ker g^* is an ideal.
In our case, we need ker g ∗ to be a direct summand. The table shows the cohomology of X and A with integer
coefficients.

        H^0   H^1   H^2   H^3   H^4
   A     Z     0     Z     0     0
   X     Z     0     Z     0     Z
Now g^* : H^q(X) → H^q(A) is an isomorphism in dimensions 0 and 2, and g^* = 0 in dimension 4. So the necessary direct sum decomposition exists and is unique. In dimensions 0 and 2, ker g^* = 0 and im f^* = H^q(X). In dimension 4, g^* = 0, so ker g^* = H^4(X) and im f^* = 0. So using only the group structure, we might think that the retraction exists.
But looking at the ring structure, we see that this is not the case, because im f^* is not a subring. To see this, note that in dimension 2, H^2(X) ≅ Z, so let u be a generator. Then u is in im f^*. Since X is a manifold, we can use Poincaré duality to show that u ∪ u generates H^4(X). (I haven't covered Poincaré duality, which relates the homology and cohomology of manifolds, as we won't make much use of it. If you are curious, see Munkres [110], Sections 65 and 68.) Since we said that im f^* = 0 in dimension 4, we see that u ∈ im f^*, but u ∪ u ∉ im f^*. So im f^* is not a subring and we have a contradiction.
So we know there is no retraction from CP^2 to CP^1, but we needed the ring structure to see that.

The final example is a retraction problem that cannot be decided by the cohomology ring but can be determined using the Steenrod squaring operations.

Example 11.1.3. Let P^n be n-dimensional projective space. Let X be P^5 with the subspace P^2 collapsed to a point. Let A ⊂ X be the image of P^4 under the collapsing map P^5 → X. We claim that A is not a retract of X.
Again, suppose A is a retract of X. Then as before, ker g^* is a direct summand of H^*(X; Z_2). The cohomology of A and X with Z_2 coefficients is shown in the following table:

        H^0   H^1   H^2   H^3   H^4   H^5
   A    Z_2    0     0    Z_2   Z_2    0
   X    Z_2    0     0    Z_2   Z_2   Z_2
Now g^* : H^*(X; Z_2) → H^*(A; Z_2) is an isomorphism in dimensions less than 5. So im f^* = H^q(X; Z_2) in dimensions less than 5, and im f^* = 0 in dimension 5. In this case, though, im f^* actually is a subring. To see this, note that the smallest positive dimension with H^q(X; Z_2) ≠ 0 is dimension 3. But if u is the nonzero element of H^q(X; Z_2) for q ≥ 3, then u ∪ u has dimension at least 6. Since H^q(X; Z_2) = 0 for q ≥ 6, u ∪ u = 0. Thus im f^* is a subring. So it is still possible for A to be a retract of X.
Now consider the cohomology operation Sq^2 : H^3(X; Z_2) → H^5(X; Z_2). Let u be the nonzero element of H^3(X; Z_2). It turns out that Sq^2 u is the nonzero element of H^5(X; Z_2). Now im f^* contains u, but is zero in dimension 5. If v ∈ H^3(A; Z_2) is the nonzero element in dimension 3, then f^*(v) = u. Since H^5(A; Z_2) = 0, we have Sq^2 v = 0, so f^* Sq^2 v = f^*(0) = 0, but Sq^2 f^*(v) = Sq^2(u) ≠ 0. So f^* Sq^2 ≠ Sq^2 f^* and the retraction cannot exist.

The last example is an argument for trying Steenrod squares in data science applications. They are able to settle extension questions that even the cohomology ring can't handle. We will need to get a better idea of what they actually are, what properties they have, and how to calculate them. The rest of this chapter will answer those questions.

11.2 Cohomology Operations


The material from here until the end of Section 11.7 is taken from Mosher and Tangora [106]. I will begin with a
formal definition of a cohomology operation.

Definition 11.2.1. A cohomology operation of type (π, n; G, m) is a family of functions θ_X : H^n(X; π) → H^m(X; G), one for each space X, such that for any map f : X → Y, f^* θ_Y = θ_X f^*. Cohomology operations are not necessarily homomorphisms. We denote the set of all cohomology operations of type (π, n; G, m) by O(π, n; G, m).

Example 11.2.1. Let R be a ring. Then the square u ↦ u^2 = u ∪ u gives for each n an operation

H^n(X; R) → H^{2n}(X; R).

Definition 11.2.2. For a space X and i > 0, define the Hurewicz homomorphism h : π_i(X) → H_i(X) by choosing a generator u of H_i(S^i) ≅ Z and letting h([f]) = f_*(u), where f : S^i → X.

Recall that the Hurewicz Theorem states that if X is (n − 1)-connected for n > 1 then h is an isomorphism in
dimensions less than or equal to n and h is an epimorphism in dimension n + 1.
Now the Universal Coefficient Theorem for Cohomology gives an exact sequence

0 → Ext(Hn−1 (X), π) → H n (X; π) → Hom(Hn (X), π) → 0

for any space X. If X is (n − 1)-connected, then H_{n−1}(X) = 0, so the Ext term goes away and we have

H^n(X; π) ≅ Hom(H_n(X), π).

In this case, if π = π_n(X), then the group Hom(H_n(X), π_n(X)) contains the inverse h^{−1} of the Hurewicz homomorphism. The inverse exists in dimension n since X is (n − 1)-connected.
Definition 11.2.3. Let X be (n − 1)-connected. The fundamental class of X is the cohomology class ı ∈ H^n(X; π_n(X)) corresponding to h^{−1} under the above isomorphism. The class may also be denoted by ı_X or ı_n.
In particular, the Eilenberg-Mac Lane space K(π, n) has a fundamental class ı_n ∈ H^n(K(π, n); π).
Here is the main theorem.
Theorem 11.2.1. There is a one-to-one correspondence [X, K(π, n)] ↔ H^n(X; π) given by [f] ↔ f^*(ı_n). (The square brackets represent homotopy classes.)
I will refer you to [106] for the proof, but I will point out that it makes heavy use of obstruction theory, both
for extending maps and extending homotopies. So obstruction theory provided both the motivation and the tools for
studying cohomology operations.
I will devote the rest of this section to some consequences of Theorem 11.2.1.
Theorem 11.2.2. There is a one-to-one correspondence

[K(π, n), K(π′, n)] ↔ Hom(π, π′).

Proof: By Theorem 11.2.1, [K(π, n), K(π′, n)] ↔ H^n(K(π, n); π′). By the Universal Coefficient Theorem, H^n(K(π, n); π′) ≅ Hom(H_n(K(π, n)), π′) = Hom(π, π′), since H_n(K(π, n)) = π by the Hurewicz Theorem. ■
We have seen earlier that the Eilenberg-Mac Lane spaces have the homotopy type of a CW complex which is unique up to homotopy. We will always assume we are working with CW complexes. We will abbreviate H^m(K(π, n); G) as H^m(π, n; G).
Let θ be a cohomology operation of type (π, n; G, m). Since ı_n ∈ H^n(π, n; π), we have that θ(ı_n) ∈ H^m(π, n; G).
Theorem 11.2.3. There is a one-to-one correspondence

O(π, n; G, m) ↔ H^m(π, n; G)

given by θ ↔ θ(ı_n).
Proof: The idea is to show that the function θ ↦ θ(ı_n) has a two-sided inverse.
Let ϕ ∈ H^m(π, n; G). We will use ϕ to denote an operation of type (π, n; G, m) defined as follows. If u ∈ H^n(X; π), let ϕ(u) = f^*(ϕ) ∈ H^m(X; G), where f : X → K(π, n) and [f] corresponds to u in Theorem 11.2.1.
We now have functions in both directions between O(π, n; G, m) and H^m(π, n; G), and we need to see that their composition in either order is the identity. If X = K(π, n) and u = ı_n, then f must be homotopic to the identity, so ϕ(ı_n) = f^*(ϕ) = ϕ. Going in the other direction, if ϕ = θ(ı_n), then the operation ϕ is equal to θ, since ϕ(u) = f^*(ϕ) = f^*(θ(ı_n)) = θ(f^*(ı_n)) = θ(u), and we are done. ■
We will often use the same symbol for both an operation and its corresponding cohomology class.
The following theorem is a direct result of Theorems 11.2.1 and 11.2.3.
Theorem 11.2.4. There is a one to one correspondence

O(π, n; G, m) ↔ [K(π, n), K(G, m)].



By Theorem 11.2.3, the problem of finding all cohomology operations comes down to finding the cohomology groups of the appropriate Eilenberg-Mac Lane space. This should be easy, as we know there are efficient ways to compute cohomology groups. The bad news is that our methods will not usually work. K(Z, 1) is a circle, but Eilenberg-Mac Lane spaces tend to be infinite dimensional, so they are not representable as finite complexes, and our current methods would literally take forever.
We will see how to calculate cohomology groups of some specific Eilenberg-Mac Lane spaces in Section 11.7.
Until then we will focus on a specific type of cohomology operation that we have already mentioned: Steenrod
squares.

11.3 Construction of Steenrod Squares


In this section, I will outline the construction of Steenrod squares found in Mosher and Tangora. These are
cohomology operations of type (Z2 , n; Z2 , n + i).

11.3.1 Cohomology of K(Z2 , 1)


Recall that we saw in Example 9.8.2 that the infinite dimensional projective space P^∞ is a K(Z_2, 1) space. We will look at the homology and cohomology of this space.
We start by looking at the cell structure. Letting S^∞ = ∪_{n=0}^∞ S^n, we can give S^∞ the structure of a CW complex with two cells in each dimension i for i ≥ 0, denoted d_i and T d_i. These represent the Northern and Southern hemispheres, with T being the map that interchanges them. The boundary for homology is given by

∂d_i = d_{i−1} + (−1)^i T d_{i−1},

with ∂T = T ∂ and T^2 = 1. (Think of S^2, where d_{i−1} and T d_{i−1} are the two pieces of the equator.)
Then S^∞ is acyclic. This is because in even dimensions, the nonzero cycles are generated by d_{2j} − T d_{2j} = ∂d_{2j+1}, so cycles coincide with boundaries. In odd dimensions, the cycles are generated by d_{2j−1} + T d_{2j−1} = ∂d_{2j}.
Now consider the homology of P^∞. We obtain P^∞ from S^∞ by identifying d_i and T d_i so that there is now one cell in every dimension. We will call that cell e_i. Then the boundary formula becomes ∂e_i = e_{i−1} + (−1)^i e_{i−1}. So ∂e_i = 2e_{i−1} for i even and ∂e_i = 0 for i odd. So the reduced homology H̃_i(P^∞) is Z_2 for i odd and 0 for i even.
By the Universal Coefficient Theorems, since H_*(P^∞; Z) is Z_2 in odd dimensions, we have that H^*(P^∞; Z) is Z_2 in positive even dimensions. Then H_*(P^∞; Z_2) and H^*(P^∞; Z_2) are Z_2 in every dimension.
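Since each chain group of P^∞ is a single copy of Z, the boundary formula can be fed straight into a toy computation of the reduced integral homology. This sketch (names illustrative) just reads off kernels and images from the coefficients 2 and 0:

```python
def boundary_coeff(i):
    """∂e_i = 2·e_{i-1} for i even, 0 for i odd (as an integer coefficient)."""
    return 2 if i % 2 == 0 else 0

def reduced_homology(i):
    """Reduced H_i(P^∞; Z) as a string, for i > 0."""
    if boundary_coeff(i) != 0:      # ∂e_i ≠ 0: no nonzero cycles in dimension i
        return "0"
    b = boundary_coeff(i + 1)       # boundaries from dimension i+1 form b·Z
    return "Z" if b == 0 else "Z/%d" % b

print([reduced_homology(i) for i in range(1, 8)])
# ['Z/2', '0', 'Z/2', '0', 'Z/2', '0', 'Z/2']
```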
We next look at the ring structure. Let W be the cellular chain complex of S ∞ . Then W is a Z2 -free acyclic chain
complex with two generators in each dimension i for i ≥ 0.
To get the ring structure of P ∞ we need a diagonal map for W . (You may want to review Section 8.6 to see the
connection.) Let T act on W ⊗ W by T (u ⊗ v) = T (u) ⊗ T (v), where T is the action that flips hemispheres described
above.
Let r : W → W ⊗ W be defined by

r(d_i) = Σ_{j=0}^{i} (−1)^{j(i−j)} d_j ⊗ T^j d_{i−j}

and

r(T d_i) = T(r(d_i)).

Here T^j = T if j is odd and T^j = 1 if j is even, since T^2 = 1. Then r is a chain map with respect to the boundary in W ⊗ W. (Recall that ∂(u ⊗ v) = ∂u ⊗ v + (−1)^{dim(u)} u ⊗ ∂v.)
Let h denote the diagonal map of Z_2, so that h(0) = (0, 0) and h(1) = (1, 1). Then r is h-equivariant, i.e. r(gw) = h(g)r(w) for g ∈ Z_2 and w ∈ W. So r induces a chain map s : W/T → W/T ⊗ W/T, where W/T is the chain complex of P^∞. This means that

s(e_i) = Σ_{j=0}^{i} (−1)^{j(i−j)} e_j ⊗ e_{i−j}.

Then s is a chain approximation to the diagonal map ∆ of P^∞, and we can use it to find cup products in H^*(P^∞; Z_2).
Let α_i be the nonzero element of H^i(P^∞; Z_2) ≅ Z_2. The summation for s(e_{j+k}) contains the term e_j ⊗ e_k with coefficient 1 mod 2. Then

⟨α_j ∪ α_k, e_{j+k}⟩ = ⟨∆^*(α_j × α_k), e_{j+k}⟩
                     = ⟨α_j × α_k, s(e_{j+k})⟩
                     = 1 mod 2.

Thus, α_j ∪ α_k = α_{j+k}. This proves Theorem 8.3.4, which states that H^*(P^∞; Z_2) is a polynomial ring generated by the nonzero element u ∈ H^1(P^∞; Z_2).
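A quick sketch verifying the coefficient claim behind this computation: every term of s(e_i) carries coefficient (−1)^{j(i−j)} = ±1, so mod 2 each e_j ⊗ e_{i−j} appears exactly once (the list-of-triples encoding of s is just an illustrative convention):

```python
def s(i):
    """Terms (coefficient, j, i-j) of s(e_i) = Σ_j (−1)^{j(i−j)} e_j ⊗ e_{i−j}."""
    return [((-1) ** (j * (i - j)), j, i - j) for j in range(i + 1)]

print(s(3))   # [(1, 0, 3), (1, 1, 2), (1, 2, 1), (1, 3, 0)]
# Every coefficient is ±1, hence 1 mod 2, giving α_j ∪ α_k = α_{j+k}.
assert all(c % 2 == 1 for c, _, _ in s(10))
```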

11.3.2 Acyclic Carriers


For the construction of Steenrod squares, we will need a theorem on acyclic carriers. First we will need some
definitions. In what follows, π and G are groups.
Definition 11.3.1. The group ring Z[π] consists of formal sums Σ z_i g_i, where z_i ∈ Z and g_i ∈ π. Addition is carried out componentwise and multiplication using the distributive law, the usual multiplication in Z, and the group multiplication for π. For example,

(z_1 g_1 + z_2 g_2)(z_3 g_3 + z_4 g_4) = z_1 z_3 g_1 g_3 + z_1 z_4 g_1 g_4 + z_2 z_3 g_2 g_3 + z_2 z_4 g_2 g_4.
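This multiplication rule can be sketched directly in code. Here a Z[π] element is stored as a dictionary from group elements to integer coefficients; the encoding of π = Z_2 as {0, 1} under XOR is purely for illustration:

```python
from collections import defaultdict

def group_ring_mult(x, y, mult):
    """Multiply two Z[π] elements via the distributive law; `mult` is the
    group operation of π."""
    out = defaultdict(int)
    for g, zg in x.items():
        for h, zh in y.items():
            out[mult(g, h)] += zg * zh
    return dict(out)

# π = Z2 written multiplicatively as {1, T}: encode 1 -> 0, T -> 1, product = xor.
xor = lambda g, h: g ^ h
one, T = 0, 1
x = {one: 2, T: -1}   # 2·1 − T
y = {one: 1, T: 3}    # 1 + 3T
print(group_ring_mult(x, y, xor))  # (2·1 − T)(1 + 3T) = −1·1 + 5T -> {0: -1, 1: 5}
```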

Let K be a π-free chain complex with a Z[π]-basis B of elements called cells. For two cells, σ and τ , let
[τ : σ] ∈ Z[π] be the coefficient of σ in ∂τ . Let L be a chain complex acted on by G, and let h : π → G be a
homomorphism.

Definition 11.3.2. An h-equivariant carrier C from K to L is a function C from the basis B of K to the subcomplexes
of L such that:

1. If [τ : σ] ̸= 0, then C(σ) ⊂ C(τ ).

2. If x ∈ π and σ ∈ B, then h(x)C(σ) ⊂ C(σ).

C is an acyclic carrier if the subcomplex C(σ) is acyclic for every cell σ ∈ B. The h-equivariant chain map
f : K → L is carried by C if f (σ) ∈ C(σ) for every σ ∈ B.

Theorem 11.3.1. (Equivariant) Acyclic Carrier Theorem: Let C be an acyclic carrier from K to L. Let K ′ be a
subcomplex of K which is a Z[π]-free complex on a subset of B. Let f : K ′ → L be an h-equivariant chain map
carried by C. Then f extends over all of K to an h-equivariant chain map carried by C. This extension is unique up to
an h-equivariant chain homotopy carried by C.

Proof: We proceed by induction on dimension. Suppose that f has been extended over K^q. Let τ ∈ B be a (q + 1)-cell. Then ∂τ = Σ a_i σ_i, where a_i = [τ : σ_i] ∈ Z[π]. Then f(∂τ) = f(Σ a_i σ_i) = Σ h(a_i) f(σ_i), which is in C(τ) by the definition of an h-equivariant carrier. Since f is a chain map, f(∂τ) is a cycle, and it is also a boundary since C(τ) is acyclic. So there exists x ∈ C(τ) such that ∂x = f(∂τ). Choose any such x and let f(τ) = x. Then f is extended over K^{q+1} in a way that is h-equivariant. Uniqueness is proved by applying the construction to the complex K × I and the subcomplex K′ × I ∪ K × İ, where İ denotes the two endpoints of I. ■
11.3. CONSTRUCTION OF STEENROD SQUARES 249

11.3.3 The Cup-i Products


We will next see how the cup-i products mentioned in Section 11.1 are derived using the Acyclic Carrier Theorem.
We will still have a little way to go as this will involve cohomology with integer coefficients.
Let K be the chain complex of a simplicial complex and let W be the Z_2-free complex of S^∞ described earlier. Let T generate the action of Z_2 on W ⊗ K by T(w ⊗ k) = T(w) ⊗ k, and on K ⊗ K by T(x ⊗ y) = (−1)^{dim(x)dim(y)}(y ⊗ x). Let K = C(X) be the chain complex of the simplicial complex X. By the Eilenberg-Zilber Theorem (Theorem 8.5.5), there is a chain-homotopy equivalence Ψ : C(X × X) → C(X) ⊗ C(X). For a generator σ of K (i.e. a simplex of X), Ψ(C(σ × σ)) is a subcomplex of C(X) ⊗ C(X). Then let the carrier C : W ⊗ K → K ⊗ K be defined by C(d_i ⊗ σ) = Ψ(C(σ × σ)). C is acyclic and h-equivariant, where h is the identity map of Z_2. Then there exists an h-equivariant chain map ϕ : W ⊗ K → K ⊗ K carried by C.
The map ϕ is what we need for the cup-i products. If we restrict ϕ to d_0 ⊗ K, we can view it as a map ϕ_0 : K → K ⊗ K which is carried by the diagonal carrier. So it is chain homotopic to the diagonal map and can be used to compute cup products in K. (Review Section 8.6.) The same can be said of T ϕ_0 : σ ↦ ϕ(T d_0 ⊗ σ). Since ϕ_0 and T ϕ_0 are both carried by C, they are equivariantly homotopic. A chain homotopy is given by ϕ_1 : K → K ⊗ K, where ϕ_1(σ) = ϕ(d_1 ⊗ σ). Then ϕ_1 and T ϕ_1 are equivariantly homotopic homotopies, and a chain homotopy between them is given by an analogously defined ϕ_2.

Definition 11.3.3. For each integer i ≥ 0, define a cup-i product

C^p(K) ⊗ C^q(K) → C^{p+q−i}(K), (u, v) ↦ u ∪_i v,

by the formula

(u ∪_i v)(c) = (u ⊗ v)ϕ(d_i ⊗ c)

for c ∈ C_{p+q−i}(K). The definition appears to depend on the specific choice of ϕ, but it is in fact independent of that choice, as we will see below.

Mosher and Tangora give another proof of the coboundary formula (Theorem 11.1.1) with this definition. It is not
hard but messy so I will refer you to [106] for details.

11.3.4 Steenrod Squares


As we mentioned in Section 11.1, if u ∈ C^p(K) is a cocycle mod 2 (i.e. δu = 2a for some a ∈ C^{p+1}(K)), then u ∪_i u is also a cocycle mod 2. As before, we get a map Sq_i : Z^p(K; Z_2) → Z^{2p−i}(K; Z_2) taking u to u ∪_i u. This operation also takes coboundaries to coboundaries, so it passes to a (group) homomorphism Sq_i : H^p(K; Z_2) → H^{2p−i}(K; Z_2).

Theorem 11.3.2. Let f : K → L be a continuous map. Then the following diagram commutes:

                        Sq_i
      H^p(L; Z_2) ──────────────→ H^{2p−i}(L; Z_2)
          │ f^*                         │ f^*
          ↓             Sq_i            ↓
      H^p(K; Z_2) ──────────────→ H^{2p−i}(K; Z_2)

Proof: We can assume f is simplicial, as otherwise we could substitute a simplicial approximation. Let u ∈ C^p(L) and c ∈ C_{2p−i}(K). Then we have

f^*(Sq_i(u))(c) = (u ⊗ u)ϕ_L(d_i ⊗ f(c)) = (u ⊗ u)ϕ_L(1 ⊗ f)(d_i ⊗ c)

Sq_i(f^*(u))(c) = (f^*u ⊗ f^*u)ϕ_K(d_i ⊗ c) = (u ⊗ u)(f ⊗ f)ϕ_K(d_i ⊗ c).

Now the two chain maps ϕ_L(1 ⊗ f) and (f ⊗ f)ϕ_K are both carried by the acyclic carrier C from W ⊗ K to L ⊗ L defined by C(d_i ⊗ σ) = Ψ(C(f(σ) × f(σ))). So they are equivariantly chain homotopic, and the images of the two functions above are cohomologous. ■
Theorem 11.3.3. The operation Sq_i is independent of the choice of ϕ.
Proof: In the previous theorem, let K = L and let f be the identity. Letting ϕ_L and ϕ_K be two different choices of ϕ implies the result. ■
Recall that in defining cup-i products, we said that (u ∪_0 u)(c) = (u ⊗ u)ϕ_0(c), and that ϕ_0 is chain homotopic to the diagonal map and thus suitable for computing cup products. This shows the following.
Theorem 11.3.4. If u ∈ C^p(X; Z_2), then Sq_0(u) = u^2 ≡ u ∪ u.
Recall that Sq_i was the notation used in Steenrod's original paper [154]. From now on, we will use the more modern notation Sq^i defined below.
Definition 11.3.4. We define Sq^i : H^p(X; Z_2) → H^{p+i}(X; Z_2) for 0 ≤ i ≤ p by Sq^i = Sq_{p−i}. For values of i outside this range, define Sq^i = 0. We refer to Sq^i as a Steenrod square.
Sq^i raises the cohomology dimension by i. If u ∈ H^p(X; Z_2), then Sq^p(u) = Sq_{p−p}(u) = Sq_0(u) = u^2 by Theorem 11.3.4. The name "Steenrod square" is derived from this fact.
We can also define Steenrod squares for relative cohomology. Consider the exact sequence

0 → C^*(K, L) ──q^*──→ C^*(K) ──j^*──→ C^*(L) → 0,

where j and q are inclusions. We can let ϕ_L = ϕ_K | W ⊗ L, since ϕ_K(d_i ⊗ σ) ∈ C(σ × σ) ⊂ L ⊗ L for σ ∈ L. Then for u, v ∈ C^*(K), j^*(u ∪_i v) = j^*(u) ∪_i j^*(v). To define a relative cup-i product, let u, v ∈ C^*(K, L). Then j^*(q^*u ∪_i q^*v) = j^*q^*(u) ∪_i j^*q^*(v) = 0 since j^*q^* = 0. But by exactness, we have that q^*u ∪_i q^*v is in the image of q^*. Since q^* is a monomorphism, we can define u ∪_i v as the unique cochain in C^*(K, L) such that q^*(u ∪_i v) = q^*u ∪_i q^*v. The coboundary formula stays the same in this case, so we get a homomorphism

Sq^i : H^p(K, L; Z_2) → H^{p+i}(K, L; Z_2)

with q^* Sq^i = Sq^i q^*.
Letting δ^* be the coboundary homomorphism δ^* : H^p(L) → H^{p+1}(K, L) in the exact cohomology sequence of the pair (K, L), Mosher and Tangora show the following.
Theorem 11.3.5. Sq^i commutes with δ^* as in the diagram below.

                            Sq^i
      H^p(L; Z_2) ───────────────────→ H^{p+i}(L; Z_2)
          │ δ^*                              │ δ^*
          ↓                  Sq^i            ↓
      H^{p+1}(K, L; Z_2) ────────────→ H^{p+i+1}(K, L; Z_2)

We conclude this section with one more property that will provide an interesting example. Recall that the suspension SX of X is the result of taking X × I and collapsing each of X × {0} and X × {1} to a single point. By an argument similar to the proof of Theorem 4.2.1, there is an isomorphism S^* : H̃^p(X) ≅ H̃^{p+1}(SX).
Theorem 11.3.6. The Steenrod squares commute with suspension, i.e.

Sq^i S^* = S^* Sq^i : H̃^p(X) → H̃^{p+i+1}(SX).

Example 11.3.1. This example shows a nontrivial Steenrod square which is not a cup product. Let P^2 be the projective plane. H^*(P^2; Z_2) is the truncated polynomial ring generated by the nonzero class u ∈ H^1(P^2; Z_2) ≅ Z_2, with u^3 = 0. Now Sq^1(u) = Sq_0(u) = u^2 ≠ 0, so S^* Sq^1(u) ≠ 0. By Theorem 11.3.6, Sq^1 S^*(u) is also nonzero. Thus the operation Sq^1 is nontrivial on H^2(SP^2; Z_2), even though all cup products of positive-dimensional classes in a suspension vanish.

11.4 Basic Properties


Theorem 11.4.1. The Steenrod squares Sq^i for i ≥ 0 have the following properties:

1. Sq^i is a natural homomorphism H^p(K, L; Z_2) → H^{p+i}(K, L; Z_2).

2. If i > p, Sq^i(x) = 0 for x ∈ H^p(K, L; Z_2).

3. If x ∈ H^p(K, L; Z_2), then Sq^p(x) = x^2 = x ∪ x.

4. Sq^0 is the identity homomorphism.

5. Sq^1 is the Bockstein homomorphism defined below.

6. δ^* Sq^i = Sq^i δ^*, where δ^* is the coboundary homomorphism δ^* : H^p(L; Z_2) → H^{p+1}(K, L; Z_2).

7. Cartan Formula: Writing xy for x ∪ y, we have

   Sq^i(xy) = Σ_j (Sq^j x)(Sq^{i−j} y).

8. Adem Relations: For a < 2b,

   Sq^a Sq^b = Σ_c \binom{b−c−1}{a−2c} Sq^{a+b−c} Sq^c,

   where the binomial coefficient is taken mod 2.

We have already seen properties 1, 2, 3, and 6. Chapter 3 of Mosher and Tangora [106] is devoted to proving the
other properties given our definitions. Rather than give all of the details which involve some lengthy calculations, I
will outline some of the interesting points. See [106] for more details.
Note that these properties can be taken as axioms that completely characterize the squares. This is done in Steenrod
and Epstein [157].
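The Adem relation in particular can be expanded mechanically. Here is a hedged sketch (representing a composite Sq^x Sq^y by the pair (x, y) is an illustrative convention, and Python's math.comb conveniently returns 0 when the lower index exceeds the upper):

```python
from math import comb

def adem(a, b):
    """Expand Sq^a Sq^b (a < 2b) via the Adem relation, returning the list of
    pairs (a+b-c, c) whose mod-2 binomial coefficient C(b-c-1, a-2c) is 1.
    Sq^0 is the identity, so (x, 0) means just Sq^x."""
    assert a < 2 * b
    terms = []
    for c in range(a // 2 + 1):
        if comb(b - c - 1, a - 2 * c) % 2 == 1:
            terms.append((a + b - c, c))
    return terms

print(adem(1, 1))  # Sq^1 Sq^1 = 0          -> []
print(adem(1, 2))  # Sq^1 Sq^2 = Sq^3      -> [(3, 0)]
print(adem(2, 2))  # Sq^2 Sq^2 = Sq^3 Sq^1 -> [(3, 1)]
```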

11.4.1 Sq^1 and Sq^0

First of all, I will define Sq^1 as promised. Consider the exact sequence

0 → Z ──m──→ Z → Z_2 → 0,

where m is multiplication by 2. Define the Bockstein homomorphism β : H^p(K, L; Z_2) → H^{p+1}(K, L; Z) as follows. Let x ∈ H^p(K, L; Z_2). Represent x by a cocycle c and choose an integral cochain c′ which maps to c under reduction mod 2. Then δc′ is a multiple of 2, and (1/2)(δc′) represents βx. The composition of β and the reduction homomorphism H^{p+1}(K, L; Z) → H^{p+1}(K, L; Z_2) gives a homomorphism

δ_2 : H^p(K, L; Z_2) → H^{p+1}(K, L; Z_2),

which is also called the Bockstein homomorphism. I claim that Sq^1 = δ_2.
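The lift-divide-reduce recipe is concrete enough for a toy illustration on P^2, whose integral cellular cochain complex is Z → Z → Z with δ = 0 on 0-cochains and δ = multiplication by 2 on 1-cochains (from ∂e_2 = 2e_1); the encoding below is illustrative only:

```python
def bockstein_on_H1(c):
    """δ_2 on H^1(P^2; Z2): c is the value (0 or 1) of a mod-2 1-cocycle on e_1."""
    c_int = c              # an integral lift c' of c
    delta = 2 * c_int      # δc' is a multiple of 2, since ∂e_2 = 2e_1
    beta = delta // 2      # (1/2)·δc' represents βx in H^2(P^2; Z)
    return beta % 2        # reduce mod 2: this is δ_2(x) in H^2(P^2; Z2)

print(bockstein_on_H1(1))  # 1: δ_2 is nonzero on the generator, matching Sq^1(u) = u^2 ≠ 0
```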


To outline the proof, we start with the following theorem, which is also a special case of the Adem Relations.

Theorem 11.4.2. δ_2 Sq^j = 0 if j is odd, and δ_2 Sq^j = Sq^{j+1} if j is even.



An immediate consequence is that Property 4 implies Property 5, by taking j = 0 in Theorem 11.4.2. See [106] for the proof of Theorem 11.4.2. I will give the proof of Property 4.
First of all, let K = P^2. Then if u is the nonzero element of H^1(P^2; Z_2), δ_2 Sq^0(u) = Sq^1(u) = u^2 ≠ 0. So Sq^0(u) ≠ 0 and Sq^0(u) = u, since u is the only nonzero element of H^1(P^2; Z_2). So Property 4 holds in this case.
If K = S^1, then letting f : S^1 → P^2 be such that f^*(u) = σ, where σ generates H^1(S^1; Z_2), we have

Sq^0(σ) = Sq^0(f^*(u)) = f^*(Sq^0(u)) = f^*(u) = σ.

Using suspensions and Theorem 11.3.6, we have that Property 4 holds for any sphere S^n. We extend this to any complex K of dimension n by mapping K to S^n so that if σ generates H^n(S^n; Z_2), then f^*(σ) is a given n-dimensional cohomology class of K. We then generalize to an infinite dimensional complex K through inclusion of K^n into K, and finally to pairs via an excision argument. (See [106] for the details in these cases.)

11.4.2 The Cartan Formula


I won't prove the Cartan Formula, but an interesting observation is that while the squares Sq^i are homomorphisms with respect to group addition, they are obviously not ring homomorphisms. Otherwise we would have Sq^i(xy) = Sq^i(x)Sq^i(y), which contradicts the Cartan Formula. On the other hand, there is something we can do.

Definition 11.4.1. Define Sq : H^*(K; Z_2) → H^*(K; Z_2) by

Sq(u) = Σ_i Sq^i u.

This sum is finite, and the image Sq(u) is not homogeneous. In other words, it is the sum of elements of differing dimensions.

Theorem 11.4.3. Sq is a ring homomorphism.

Proof: Sq(u) ∪ Sq(v) = (Σ_{i} Sq i u) ∪ (Σ_{j} Sq j v) has Sq i (u ∪ v) as its (p + q + i)-dimensional term by the Cartan formula. (Here u has dimension p and v has dimension q.) Then Sq(u) ∪ Sq(v) = Sq(u ∪ v). ■

Theorem 11.4.4. For u ∈ H 1 (K; Z2 ), we have Sq i (uj ) = (j choose i) uj+i .

Proof: Sq(u) = Sq 0 (u) + Sq 1 (u) = u + u2 . Since Sq is a ring homomorphism, Sq(uj ) = (u + u2 )j = uj Σ_{k} (j choose k) uk . Comparing coefficients proves the result. ■
In particular, this gives the action of all Sq i on H ∗ (P ∞ ; Z2 ).
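Theorem 11.4.4 is easy to experiment with computationally. The short script below (an illustration of mine, not from Mosher and Tangora) tabulates the nonzero values of the squares on powers of u by reducing the binomial coefficient mod 2.

```python
from math import comb

def sq_on_power(i, j):
    """Mod-2 coefficient of u^(j+i) in Sq^i(u^j); by Theorem 11.4.4 this is
    the binomial coefficient (j choose i) reduced mod 2."""
    return comb(j, i) % 2

# Nonzero actions of the squares on low powers of u in H^*(P^inf; Z_2).
for j in range(1, 6):
    hits = [f"Sq^{i}(u^{j}) = u^{j + i}" for i in range(j + 1) if sq_on_power(i, j)]
    print(";  ".join(hits))
```

Note, for example, that Sq 1 (u2 ) = 0 because (2 choose 1) = 2 is even, matching the fact that squares of classes are killed by Sq 1 .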

11.4.3 Cartesian Products of P ∞


This subsection is necessary if you want to understand more of the theory from Mosher and Tangora in later
chapters. It is especially needed to understand the structure of the Steenrod algebra that I will discuss in Section 11.6.
Let K = P ∞ = K(Z2 , 1). We will write Kn = K × · · · × K (n times). Since H ∗ (K; Z2 ) is the polynomial ring
on the one dimensional class, the Künneth Theorem implies that H ∗ (Kn ; Z2 ) is the polynomial ring over Z2 generated
by x1 , · · · , xn , where xi is the nonzero one dimensional class in the i-th copy of K. (Recall that for coefficients in a
field, the Tor terms go away. See Example 8.5.2.)
11.4. BASIC PROPERTIES 253

Definition 11.4.2. The elementary symmetric polynomials in n variables x1 , · · · , xn are defined as follows:

σ0 = 1
σ1 = Σ_{1≤i≤n} xi
σ2 = Σ_{1≤i<j≤n} xi xj
σ3 = Σ_{1≤i<j<k≤n} xi xj xk
· · ·
σn = x1 x2 · · · xn

Definition 11.4.3. A symmetric polynomial in n variables is a polynomial which is unchanged when any two variables are interchanged.
Obviously the elementary symmetric polynomials are all symmetric polynomials. Let S be the subring of Z2 [x1 , · · · , xn ]
consisting of the symmetric polynomials.
Theorem 11.4.5. Fundamental Theorem of Symmetric Algebra: Let R be a commutative ring with unit element
and S the subring of the polynomial ring R[x1 , · · · , xn ] consisting of the symmetric polynomials. Then S is equal to
the polynomial ring R[σ1 , · · · , σn ] where σi is the ith elementary symmetric polynomial in n variables. (Note that
σ0 = 1 is excluded.)
So in our case, S = Z2 [σ1 , · · · , σn ]. First, we will see what the squares do to σn .
Theorem 11.4.6. In H ∗ (Kn ; Z2 ), Sq i (σn ) = σn σi for 0 ≤ i ≤ n.
Proof:

Sq(σn ) = Sq(x1 x2 · · · xn ) = Sq(x1 )Sq(x2 ) · · · Sq(xn ) = Π_{i=1}^{n} (xi + xi^2 ) = σn Π_{i=1}^{n} (1 + xi ) = σn Σ_{i=0}^{n} σi .

The term of degree n + i is then σn σi . ■
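The identity Sq(σn ) = σn (σ0 + σ1 + · · · + σn ) can also be checked by machine in a small number of variables. The sketch below (my own helper names, not from the text) represents mod 2 polynomials as sets of exponent tuples, so that addition is symmetric difference.

```python
from itertools import combinations

def pmul(p, q, n):
    """Multiply two mod-2 polynomials in n variables; a polynomial is a set
    of exponent tuples, since Z_2 coefficients make addition an XOR."""
    out = set()
    for e in p:
        for f in q:
            out ^= {tuple(e[i] + f[i] for i in range(n))}
    return out

def sigma(k, n):
    """k-th elementary symmetric polynomial in x_1, ..., x_n over Z_2."""
    out = set()
    for idx in combinations(range(n), k):
        out ^= {tuple(1 if i in idx else 0 for i in range(n))}
    return out

def total_sq_of_top(n):
    """Sq(sigma_n) = Sq(x_1) ... Sq(x_n) = prod_i (x_i + x_i^2)."""
    out = {tuple([0] * n)}
    for i in range(n):
        factor = {tuple(1 if j == i else 0 for j in range(n)),
                  tuple(2 if j == i else 0 for j in range(n))}
        out = pmul(out, factor, n)
    return out

# Verify the total form of Theorem 11.4.6: Sq(sigma_n) = sigma_n (sigma_0 + ... + sigma_n).
n = 4
rhs = set()
for k in range(n + 1):
    rhs ^= pmul(sigma(n, n), sigma(k, n), n)
print(total_sq_of_top(n) == rhs)  # True
```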


As in Section 11.2, we will abbreviate H ∗ (K(π, n); G) as H ∗ (π, n; G). Let ın ∈ H n (Z2 , n; Z2 ) be the fundamental class. (See Definition 11.2.3.)
Theorem 11.4.7. In H ∗ (Z2 , n; Z2 ), Sq i (ın ) ̸= 0 for 0 ≤ i ≤ n.
Proof: By Theorem 11.2.1, there is a map f : Kn → K(Z2 , n) such that f ∗ (ın ) = σn . Then

f ∗ (Sq i (ın )) = Sq i f ∗ (ın ) = Sq i (σn ) = σn σi ̸= 0,

and we are done. ■


We can find more linearly independent elements of H ∗ (Z2 , n; Z2 ) by using compositions of the squares.
Let I = {i1 , · · · , ir } be a sequence of positive integers. Then Sq I denotes the composition Sq i1 Sq i2 · · · Sq ir . If I = {} is the empty sequence, we let Sq I = Sq 0 .
The following definition will be very important in Section 11.6 when we talk about the Steenrod algebra.
Definition 11.4.4. A sequence I as described above is admissible if ij ≥ 2ij+1 for every j < r. This is automatically satisfied if I is empty or I = {i1 }. If I is an admissible sequence, we also refer to Sq I as admissible. The length of I is the number r of terms. The degree d(I) is the sum of the terms, d(I) = Σ_{j=1}^{r} ij . So Sq I raises the dimension by d(I). For an admissible sequence I, the excess is e(I) = 2i1 − d(I).

Theorem 11.4.8. If e(I) is the excess, then

e(I) = 2i1 − d(I) = i1 − i2 − · · · − ir = (i1 − 2i2 ) + (i2 − 2i3 ) + · · · + ir .

Again, let S be the symmetric polynomial subring of the polynomial ring H ∗ (Kn ; Z2 ) = Z2 [x1 , · · · , xn ]. Define
an ordering of the monomials of S as follows. Given any such monomial, write it as

m = σ_{j1}^{e1} σ_{j2}^{e2} · · · σ_{js}^{es}

such that j1 > j2 > · · · > js . Then put m < m′ if j1 < j1′ or if j1 = j1′ and (m/σj1 ) < (m′ /σj1 ).

Theorem 11.4.9. If d(I) ≤ n, then Sq I (σn ) can be written as σn QI where

QI = σi1 · · · σir + (a sum of monomials of lower order).

The result is proved using induction on the length of I, the Cartan Formula, and Theorem 11.4.6.
As I runs through all the admissible sequences of degree ≤ n, the monomials σI = σi1 σi2 · · · σir are linearly
independent in S, and hence in H ∗ (Kn ; Z2 ). The theorem shows that the Sq I (σn ) are also linearly independent.
Choosing a map f : Kn → K(Z2 , n) such that f ∗ (ın ) = σn shows the following.

Theorem 11.4.10. As I runs through all the admissible sequences of degree ≤ n, the elements Sq I (ın ) are linearly
independent in H ∗ (Z2 , n; Z2 ).

Theorem 11.4.11. If u ∈ H n (K; Z2 ) for some space K and I has excess e(I) > n, then Sq I u = 0. If e(I) = n, then Sq I u = (Sq J u)2 , where J denotes the sequence obtained from I by dropping i1 .

Proof: If e(I) = i1 − i2 − · · · − ir > n, then i1 > n + i2 + · · · + ir = dim (Sq J )(u). So by Property 2 of


Theorem 11.4.1, Sq I u = 0.
If e(I) = n, then we replace the greater than sign with an equal sign and i1 = dim (Sq J )(u). By Property 3 of
Theorem 11.4.1, Sq I u = (Sq J u)2 . ■
These results are included in a theorem of Serre which states that H ∗ (Z2 , n; Z2 ) is the polynomial ring over Z2 with generators {Sq I (ın )} as I runs through all admissible sequences of excess less than n. Mosher and Tangora use this result to prove the Adem relations but don’t prove it until later using the machinery of spectral sequences. For completeness, I will discuss it and some related results in Section 11.7.

11.4.4 Adem Relations


The Adem Relations (Property 8 of Theorem 11.4.1) describe what happens when you switch the order of the composition of two Steenrod squares. Rather than copy Mosher and Tangora’s very long proof, I will focus on showing you how to use them.
Letting ⌊x⌋ denote the floor of x, that is, the largest integer z ≤ x, an Adem relation has the form

R = Sq a Sq b + Σ_{c=0}^{⌊a/2⌋} (b−c−1 choose a−2c) Sq a+b−c Sq c ≡ 0 mod 2,

where a < 2b. We use the convention that the binomial coefficient (x choose y) = 0 if y < 0 or x < y.
Let’s try some examples.

Example 11.4.1.
Sq 2 Sq 6 = Sq 7 Sq 1

Proof: Let a = 2, b = 6. Then since 2 < 2(6) = 12, we can use the Adem relation

Sq 2 Sq 6 = Σ_{c=0}^{1} (6−c−1 choose 2−2c) Sq 8−c Sq c
= (6−0−1 choose 2−2(0)) Sq 8 Sq 0 + (6−1−1 choose 2−2(1)) Sq 7 Sq 1
= (5 choose 2) Sq 8 Sq 0 + (4 choose 0) Sq 7 Sq 1
= 10 Sq 8 Sq 0 + Sq 7 Sq 1
= Sq 7 Sq 1 ,

since we are working modulo 2. ■

Example 11.4.2.
Sq 2 Sq 4 = Sq 6 + Sq 5 Sq 1

Proof: Let a = 2, b = 4. Then since 2 < 2(4) = 8, we can use the Adem relation

Sq 2 Sq 4 = Σ_{c=0}^{1} (4−c−1 choose 2−2c) Sq 6−c Sq c
= (4−0−1 choose 2−2(0)) Sq 6 Sq 0 + (4−1−1 choose 2−2(1)) Sq 5 Sq 1
= (3 choose 2) Sq 6 Sq 0 + (2 choose 0) Sq 5 Sq 1
= 3 Sq 6 Sq 0 + Sq 5 Sq 1
= Sq 6 + Sq 5 Sq 1 ,

since we are working modulo 2 and Sq 0 is the identity so it can be dropped. ■

Example 11.4.3.
Sq 2n−1 Sq n = 0

Proof: Let a = 2n − 1, b = n. Then since 2n − 1 < 2n, we can use the Adem relation

Sq 2n−1 Sq n = Σ_{c=0}^{⌊(2n−1)/2⌋} (n−c−1 choose 2n−1−2c) Sq 3n−1−c Sq c .

Now for the binomial coefficient to be nonzero, we need 2n − 1 − 2c ≥ 0, so c ≤ ⌊(2n − 1)/2⌋ = n − 1. This condition is already taken care of in the summation. But we also need

2n − 1 − 2c ≤ n − c − 1
n − 2c ≤ −c
n ≤ c.

But this is impossible since c ≤ n − 1. Thus Sq 2n−1 Sq n = 0. ■
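The manipulations in these examples are mechanical enough to automate. The following sketch (my own helper, using the conventions above) expands Sq a Sq b for a < 2b into admissible compositions and reproduces Examples 11.4.1 through 11.4.3.

```python
from math import comb

def adem(a, b):
    """Expand Sq^a Sq^b (0 < a < 2b) by the Adem relation
    Sq^a Sq^b = sum_{c=0}^{floor(a/2)} (b-c-1 choose a-2c) Sq^{a+b-c} Sq^c (mod 2).
    Returns a set of tuples; (7, 1) means Sq^7 Sq^1 and (6,) means Sq^6."""
    assert 0 < a < 2 * b
    terms = set()
    for c in range(a // 2 + 1):
        if comb(b - c - 1, a - 2 * c) % 2:
            terms ^= {(a + b - c,) if c == 0 else (a + b - c, c)}
    return terms

print(adem(2, 6))  # Example 11.4.1: Sq^2 Sq^6 = Sq^7 Sq^1
print(adem(2, 4))  # Example 11.4.2: Sq^2 Sq^4 = Sq^6 + Sq^5 Sq^1
print(adem(5, 3))  # Example 11.4.3 with n = 3: Sq^5 Sq^3 = 0
```

Python's `comb` already returns 0 when the lower index exceeds the upper one, which matches the convention stated after the Adem relation.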



11.5 The Hopf Invariant


The next topic has little to do with data science and is a little tangential but a rather interesting application of
Steenrod squares. I will outline some of the arguments and refer the reader to Chapter 4 of Mosher and Tangora [106]
for the details.
Consider the sphere S n where n ≥ 2. Let f : S 2n−1 → S n be given. Let S 2n−1 be the boundary of an oriented
2n-cell and form the cell complex K = S n ∪f e2n . Recall that this means we attach a 2n-cell to S n with its boundary attached by f . The integral cohomology of K is zero except in dimensions 0, n, and 2n, and H i (K; Z) ≅ Z
in those dimensions. Let σ and τ generate H n (K; Z) and H 2n (K; Z) respectively. Then σ 2 = σ ∪ σ is an integral
multiple of τ .

Definition 11.5.1. The Hopf invariant of f is the integer H(f ) such that σ 2 = H(f )τ .

Question: Under what circumstances does H(f ) = 1?


The homotopy type of K depends only on the homotopy class of f . So the Hopf invariant defines a map H :
π2n−1 (S n ) → Z where the class α ∈ π2n−1 (S n ) is taken to H(f ) where f is any map representing α.
If n is odd, then by anti-commutativity of cup product (Theorem 8.6.3),

σ ∪ σ = (−1)^{n^2} σ ∪ σ = −σ ∪ σ,

so σ 2 = 0. So for n odd, H(f ) = 0.


If n = 2, 4, or 8, we have the Hopf maps S 3 → S 2 , S 7 → S 4 , and S 15 → S 8 . (See Section 9.4.) These maps are
known to have Hopf invariant one as they turn out to be attaching maps of 2n-cells in the complex, quaternionic, and
Cayley projective planes respectively and these spaces have cohomology rings that are truncated polynomial rings so
that powers of generators are also generators. So σ 2 = τ.

Theorem 11.5.1. If n is even, there exists a map f : S 2n−1 → S n with Hopf invariant 2.

Figure 11.5.1: Attaching a 2n-cell to S n ∨ S n to form S n × S n . [106].

It takes some work to prove this, but the map itself is the map F g : S 2n−1 → S n formed as follows: Consider
the product S n × S n as a cell complex formed by attaching a 2n-cell to S n ∨ S n . Let g : S 2n−1 → S n ∨ S n be the
attaching map shown in Figure 11.5.1 [106]. Let F : S n ∨ S n → S n be the folding map. Then Mosher and Tangora
show that F g has Hopf invariant 2.

Theorem 11.5.2. The transformation H : π2n−1 (S n ) → Z is a homomorphism, i.e. if f, g represent elements of


π2n−1 (S n ), then H(f + g) = H(f ) + H(g).

Again, I will refer you to [106] for the proof, but we immediately have the following.

Theorem 11.5.3. If n is even, then π2n−1 (S n ) contains Z as a direct summand.



Now we will work towards showing that there does not exist a map f : S 2n−1 → S n with Hopf invariant 1 if n is
not a power of 2. The key is the fact that f has an odd Hopf invariant, so σ 2 is an odd multiple of τ in K = S n ∪f e2n .
But then, since σ 2 = Sq n σ, we have Sq n σ = τ in the mod 2 cohomology of K.
Definition 11.5.2. Sq i is said to be decomposable if
Sq i = Σ_{t<i} at Sq t ,

where each at is a sequence of squaring operations. If Sq i is not decomposable, then it is indecomposable.


Example 11.5.1. Sq 1 is indecomposable. Also Sq 2 is indecomposable since Sq 1 Sq 1 = 0.
Example 11.5.2. Sq 3 = Sq 1 Sq 2 by the Adem relations, so it is decomposable. Also, we showed in Example 11.4.2
that Sq 2 Sq 4 = Sq 6 + Sq 5 Sq 1 , so Sq 6 is decomposable.
Theorem 11.5.4. Sq i is indecomposable if and only if i is a power of 2.
Proof: Let i be a power of 2 and let u ∈ H 1 (P ∞ ; Z2 ) be the generator. Using the ring homomorphism Sq,

Sq(ui ) = (Sq(u))i = (u + u2 )i = ui + u2i mod 2.

So Sq t (ui ) = 0 unless t = 0 or t = i, and Sq i (ui ) = u2i . The fact that u2i ̸= 0 shows that Sq i is indecomposable,
since otherwise,

u2i = Sq i (ui ) = Σ_{t<i} at Sq t (ui ) = 0.

For the converse, let i = a + 2^k , where 0 < a < 2^k . Writing b for 2^k , the Adem relations give

Sq a Sq b = (b−1 choose a) Sq a+b + Σ_{c>0} (b−c−1 choose a−2c) Sq a+b−c Sq c .

Since b = 2^k is a power of 2, the next theorem will show that (b−1 choose a) = 1 mod 2. So Sq i = Sq a+b is decomposable.

Theorem 11.5.5. Let p be a prime and let a and b have expansions a = Σ_{i=0}^{m} ai p^i and b = Σ_{i=0}^{m} bi p^i . Then

(b choose a) = Π_{i=0}^{m} (bi choose ai ) mod p.

Proof: In the ring Zp [x], (1 + x)^p = 1 + x^p . (This makes high school algebra students especially happy.) Thus (1 + x)^b = Π_{i} (1 + x)^{bi p^i} = Π_{i} (1 + x^{p^i})^{bi} . Then (b choose a) is the coefficient of x^a in this expansion as seen from the first expression, but the last expression shows that this coefficient is Π_{i=0}^{m} (bi choose ai ). ■


I can now state Mosher and Tangora’s main result.


Theorem 11.5.6. If there exists a map f : S 2n−1 → S n of Hopf invariant 1, then n is a power of 2.
Proof: Recall that if f exists, then Sq n σ = τ in the complex K = S n ∪f e2n , where σ, τ are the generators
of H ∗ (K; Z2 ) in dimensions n and 2n respectively. If n is not a power of 2, then Sq n is decomposable, but in K,
Sq i = 0 for 0 < i < n. This is a contradiction, so n must be a power of 2. ■
J. Frank Adams proved an even stronger result.
Theorem 11.5.7. If there exists a map f : S 2n−1 → S n of Hopf invariant 1, then n = 2, 4, or 8. So the only maps of Hopf invariant one are the three Hopf maps.
The proof is the subject of a very dense 85-page paper [3]. You should try it if you are looking for a challenge.

11.6 The Steenrod Algebra


Let’s recall the structures of abstract algebra. A group has a multiplication or an addition but not necessarily both.
A ring has addition and multiplication tied together by a distributive law. A vector space has addition and scalar
multiplication, and an algebra has addition, multiplication, and scalar multiplication. The typical first example of an
algebra is the set of n × n matrices over a field.
Just when you thought things couldn’t get more complicated, I will introduce the Steenrod algebra. It involves our
friends the Steenrod squares. The Steenrod algebra is an example of a Hopf algebra. And cohomology rings are also
graded modules over the Steenrod algebra.
So what is our interest? As I mentioned earlier, the more algebraic structure that encodes our data, the more
information we have for classification. For now, other than using specific Steenrod squares Sq i as features, I don’t know
what else to do with the extra structure. But I will leave it to you to think about.
Meanwhile, Frank Adams used the Steenrod algebra to build a new spectral sequence to study homotopy groups
of spheres. The Adams spectral sequence provides information about the 2-component of πm (S n ), where m is in the
stable range (i.e. m ≤ 2n − 2) and the 2-component is the set of elements whose orders are powers of 2. The E2
term which starts the sequence involves ExtA (M, N ) where M and N are modules over the Steenrod algebra A that
we will define in this chapter. While this topic is somewhat more advanced than I want to cover in this book, the idea
is that even this much algebraic structure leads to meaningful results in geometry. I believe that the potential for data
science goes well beyond persistent homology, and it would be very interesting to see where it could lead.
In the next chapter, I will focus on some more basic approaches to computing homotopy groups of spheres. If you
are intrigued by the discussion of the Adams spectral sequence, you can learn more in Adams’ original papers [3, 4],
the last chapter of Mosher and Tangora [106] or the book by McCleary [99].
The rest of this section is taken from [106]. I will discuss some of the major properties of the Steenrod algebra and
state some results mostly without the proofs.
The Steenrod algebra is an example of a graded algebra. I will start by building up to the definition of a graded
algebra. Then I will define the Steenrod algebra as an example.
Recall that we defined a module in Definition 3.4.8. It is basically a generalization of a vector space where the
scalars can be an arbitrary ring as opposed to a field. If the ring is non-commutative, we have to distinguish between
left and right modules depending on which side the scalars are multiplied. In our case, the scalars will always be a
commutative ring R with unit element. In this case, we don’t run into this problem and generally multiply with the
scalars on the left. So r ∈ R and m ∈ M implies rm ∈ M .
Definition 11.6.1. Let M be a module over a commutative ring with unit element R. Then M is unitary or unital if
1m = m, where 1 is the unit element of R.
Definition 11.6.2. M is a graded R-module if M = ∪_{i=0}^{∞} Mi where the Mi are unitary R-modules. A homomorphism f : M → N is a sequence fi of R-homomorphisms fi : Mi → Ni . (I.e. if r ∈ R and m1 , m2 ∈ Mi then fi (rm1 ) = rfi (m1 ) and fi (m1 + m2 ) = fi (m1 ) + fi (m2 ).) The tensor product M ⊗ N of two graded R-modules is defined by letting

(M ⊗ N )t = Σ_{i=0}^{t} Mi ⊗ Nt−i .

Definition 11.6.3. A is a graded R-algebra if A is a graded R-module with a multiplication ϕ : A⊗A → A, where ϕ is
a homomorphism of graded R-modules and there exists an element e ∈ A such that for a ∈ A, ϕ(e⊗a) = ϕ(a⊗e) = a.
We will write ϕ(m ⊗ n) as mn. We call e the unit of A. A homomorphism of graded R-algebras is a graded R-module
homomorphism which respects multiplication and units.
Definition 11.6.4. A graded R-algebra A is associative if ϕ(ϕ ⊗ 1A ) = ϕ(1A ⊗ ϕ). If a, b, c ∈ A, then this means
that (ab)c = a(bc), corresponding to our usual definition.
Definition 11.6.5. A graded R-algebra A is commutative if ϕ ◦ T = ϕ : A ⊗ A → A, where T : A ⊗ A → A ⊗ A is defined as

T (m ⊗ n) = (−1)^{deg(m) deg(n)} (n ⊗ m),

where m, n ∈ A. This means that mn = (−1)^{deg(m) deg(n)} nm.


Definition 11.6.6. A graded R-algebra A is augmented if there is a graded algebra homomorphism ϵ : A → R, where
R is considered as a graded R-algebra such that R0 = R, and Ri = 0 for i > 0. This means that since ϵ respects the
grading, we have that ϵ : A0 → R. An augmented graded R-algebra is connected if ϵ is an isomorphism.
Let A and B be graded R-algebras. Then A ⊗ B can be thought of as the tensor product of graded R-modules.
We can give it an algebra structure by defining

(a1 ⊗ b1 )(a2 ⊗ b2 ) = (−1)^{deg(a2 ) deg(b1 )} (a1 a2 ) ⊗ (b1 b2 ),

for a1 , a2 ∈ A and b1 , b2 ∈ B.
Definition 11.6.7. Let M be an R-module. The tensor algebra Γ(M ) is defined as follows: Let M 0 = R, M 1 = M , M 2 = M ⊗ M , and M n = M ⊗ · · · ⊗ M (n times). Then Γ(M ) is the graded R-algebra defined by Γ(M )r = M r , and the product is given by the isomorphism M s ⊗ M t ≅ M s+t .
The tensor algebra Γ(M ) is associative but not commutative.
Now we are finally ready to define the Steenrod algebra. Let R = Z2 and M be the graded Z2 -module such that
Mi = Z2 generated by the symbol Sq i for i ≥ 0. Then the tensor algebra Γ(M ) is bigraded where for example
Sq p ⊗ Sq q is in Γ(M )2,p+q and represents the composition Sq p Sq q . For each pair of integers (a, b) such that 0 <
a < 2b let
R(a, b) = Sq a ⊗ Sq b + Σ_{c} (b−c−1 choose a−2c) Sq a+b−c ⊗ Sq c .
(By now you should recognize these as the Adem relations.)
Definition 11.6.8. The Steenrod algebra A is the quotient algebra Γ(M )/Q, where Q is the ideal of Γ(M ) generated
by the R(a, b) defined above along with 1 + Sq 0 .
The elements of A are polynomials in Sq i with coefficients in Z2 subject to the Adem relations.
Theorem 11.6.1. The monomials Sq I where I runs through the admissible sequences form a basis for A as a Z2
module.
Idea of proof: Linear independence follows from Theorem 11.4.10. To show that these elements span A, define the moment m(I) by the formula

m(I) = m({i1 , · · · , ir }) = Σ_{s=1}^{r} s is .

If I is not admissible, there is a pair is , is+1 with is < 2is+1 . Starting at the right and applying the Adem relations to the first such pair leads to a sum of monomials with strictly lower moment than I. Since the moment function is bounded below by 0, the process terminates and the admissible Sq I span A. ■
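The moment-lowering argument in the proof is effective: repeatedly applying an Adem relation at the first inadmissible pair from the right rewrites any composition in the admissible basis. A minimal sketch (helper names are my own):

```python
from math import comb

def adem(a, b):
    """Admissible-side terms of the Adem relation for Sq^a Sq^b, 0 < a < 2b."""
    terms = set()
    for c in range(a // 2 + 1):
        if comb(b - c - 1, a - 2 * c) % 2:
            terms ^= {(a + b - c,) if c == 0 else (a + b - c, c)}
    return terms

def admissible_form(seq):
    """Write Sq^{i_1} ... Sq^{i_r} as a mod-2 sum of admissible monomials by
    applying Adem relations at the first inadmissible pair from the right,
    as in the proof of Theorem 11.6.1.  Monomials are tuples of exponents."""
    result, todo = set(), {tuple(i for i in seq if i != 0)}
    while todo:
        mono = todo.pop()
        for j in range(len(mono) - 2, -1, -1):   # scan pairs from the right
            if mono[j] < 2 * mono[j + 1]:        # inadmissible pair found
                for rep in adem(mono[j], mono[j + 1]):
                    todo ^= {mono[:j] + rep + mono[j + 2:]}  # XOR = mod 2
                break
        else:
            result ^= {mono}                     # already admissible
    return result

print(admissible_form((2, 6)))     # {(7, 1)}: Sq^2 Sq^6 = Sq^7 Sq^1
print(admissible_form((1, 2, 2)))  # Sq^1 Sq^2 Sq^2 = 0
```

The symmetric-difference updates implement mod 2 cancellation, and termination follows from the strictly decreasing moment noted in the proof.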
Example 11.6.1. Find a basis for A7 . We need to find sequences {i1 , · · · , ir } that sum to 7 where is ≥ 2is+1 for every consecutive pair of indices is , is+1 . Starting at 7 and working downwards gives that A7 is generated by Sq 7 , Sq 6 Sq 1 , Sq 5 Sq 2 , and Sq 4 Sq 2 Sq 1 .
Example 11.6.2. Find a basis for A9 . Starting at 9 and working downwards gives that A9 is generated by Sq 9 , Sq 8 Sq 1 , Sq 7 Sq 2 , Sq 6 Sq 3 , and Sq 6 Sq 2 Sq 1 .
Now we will give A even more structure. We will produce an algebra homomorphism ψ : A → A ⊗ A called the
diagonal map of A.
Let M be the graded Z2 module generated in each degree i ≥ 0 by Sq i . Define ψ : Γ(M ) → Γ(M ) ⊗ Γ(M ) by the formula

ψ(Sq i ) = Σ_{j} Sq j ⊗ Sq i−j

and the requirement that ψ be an algebra homomorphism so that


ψ(Sq r ⊗ Sq s ) = ψ(Sq r ) ⊗ ψ(Sq s ) = Σ_{a} Σ_{b} (Sq a ⊗ Sq r−a ) ⊗ (Sq b ⊗ Sq s−b ).

Theorem 11.6.2. The map ψ induces an algebra homomorphism ψ : A → A ⊗ A.

The proof takes some work but the basic idea is to show that the kernel of the natural projection

p : Γ(M ) → Γ(M )/Q = A

is contained in the kernel of ψ defined above.


The diagonal map turns A into a Hopf algebra which I will now define.
Let A be a connected graded R-algebra with unit. The existence of the unit is expressed by the fact that both
compositions in this diagram are the identity, where 1 represents the identity map on A, and η is the coaugmentation,
which is the inverse of the isomorphism ϵ : A0 → R. Also, ϕ is the multiplication in A. So we have a unit element
1 ∈ R with 1a = a1 = a for a ∈ A.

A ≅ A ⊗ R −1⊗η→ A ⊗ A −ϕ→ A,    A ≅ R ⊗ A −η⊗1→ A ⊗ A −ϕ→ A

Now let A be a graded R-module with a given augmentation, which is an R-homomorphism ϵ : A → R. We say
that A is a co-algebra with co-unit if we are given an R-homomorphism ψ : A → A ⊗ A such that both compositions
are the identity in the dual diagram (i.e. arrows turned backwards) below.

A −ψ→ A ⊗ A −1⊗ϵ→ A ⊗ R ≅ A,    A −ψ→ A ⊗ A −ϵ⊗1→ R ⊗ A ≅ A

Then ψ is called the diagonal map or the co-multiplication.

Definition 11.6.9. Let A be a connected R-algebra with augmentation ϵ. Suppose A also has a co-algebra structure with co-unit such that the diagonal map ψ : A → A ⊗ A is a homomorphism of R-algebras. Then A is a (connected) Hopf algebra.

The above discussion shows the following:

Theorem 11.6.3. The Steenrod algebra A is a Hopf algebra.



Now let R be a field (think of Z2 ) and let A be a connected Hopf algebra over R. Since R is a field, each Ai
is a vector space. Suppose each Ai is finite dimensional over R. We define the dual Hopf algebra A∗ by setting
(A∗ )i = (Ai )∗ , the vector space dual. (Recall from linear algebra that if V is a vector space over a field F then the
dual space V ∗ is the vector space of homomorphisms (i.e. linear transformations) from V to F .)
The multiplication ϕ in A gives rise to the diagonal map ϕ∗ in A∗ and the diagonal map ψ in A gives rise to the
multiplication map ψ ∗ in A∗ . Then A∗ is itself a Hopf algebra and ϕ∗ is associative (commutative) if and only if ϕ is
associative (commutative) and the same holds for ψ and ψ ∗ . A and A∗ are isomorphic as R-modules but not in general
as algebras.
Now let A∗ be the dual of the Steenrod algebra. Then A∗ is a Hopf algebra with an associative diagonal map ϕ∗
and with associative and commutative multiplication ψ ∗ . It turns out that A∗ is a polynomial ring.
Definition 11.6.10. Let u be the generator of H 1 (P ∞ ; Z2 ). For each i ≥ 0, let ξi be the element of (A∗ )_{2^i −1} such that

ξi (θ) u^{2^i} = θ(u) ∈ H^{2^i} (P ∞ ; Z2 )

for all θ ∈ A_{2^i −1} . We set ξ0 to be the unit in A∗ .
To break this down, ξi (θ) is an element of Z2 . If this element is z, we want z u^{2^i} = θ(u). So ξi (θ) is the coefficient of u^{2^i} when θ is applied to u.
Here is the main structure theorem for A∗ . See [106] for the proof.
Theorem 11.6.4. As an algebra, A∗ is the polynomial ring over Z2 generated by the {ξi } for i ≥ 1.
Now the Steenrod algebra actually acts on the cohomology ring of a space. This makes the cohomology ring an
algebra over an algebra.
To put this more precisely, let A be a graded R-algebra and M be a graded R-module, where R is a commutative
ring with unit element.
Definition 11.6.11. M is a graded A-module if there is an R-module homomorphism µ : A ⊗ M → M such that
µ(1 ⊗ m) = m and
µ(ϕA ⊗ 1) = µ(1 ⊗ µ) : A ⊗ A ⊗ M → M,
where ϕA is the multiplication in A.
Now suppose that A is a Hopf algebra and that M is an A-module which is also an R-algebra. Then M ⊗ M is an
A-module under the composition
µ′ : A ⊗ M ⊗ M −ψ⊗1⊗1→ A ⊗ A ⊗ M ⊗ M −T→ A ⊗ M ⊗ A ⊗ M −µ⊗µ→ M ⊗ M,
where ψ is the diagonal map of A, and T interchanges the second and third terms of the tensor product.
Definition 11.6.12. M is an algebra over the Hopf algebra A if the multiplication ϕM : M ⊗ M → M is a homomor-
phism of modules. This means that

ϕM ◦ µ′ = µ ◦ (1 ⊗ ϕM ) : A ⊗ M ⊗ M → M.

In the case of interest to us, let R = Z2 , A = A, and M = H ∗ (X; Z2 ), where X is a topological space. Recall
that the diagonal map ψ of A is defined by
ψ(Sq i ) = Σ_{j} Sq j ⊗ Sq i−j

and we have

ψ(Sq r ⊗ Sq s ) = ψ(Sq r ) ⊗ ψ(Sq s ) = Σ_{a} Σ_{b} (Sq a ⊗ Sq r−a ) ⊗ (Sq b ⊗ Sq s−b ).
If θ ∈ A and x, y ∈ H ∗ (X; Z2 ), then the Cartan formula implies that
θ(ϕM (x ⊗ y)) = θ(x ∪ y) = ϕM (ψ(θ)(x ⊗ y)).
So the cohomology ring with Z2 coefficients is actually an algebra over the Hopf algebra A.
Mosher and Tangora also derive a formula for the diagonal map of the dual A∗ which I will state for completeness.
The diagonal map ϕ∗ of A∗ is given by the formula

ϕ∗ (ξk ) = Σ_{i=0}^{k} (ξk−i )^{2^i} ⊗ ξi .

An interesting question is what the extra structure of Steenrod squares could tell you about your data. Could the
extra structure help with hard classification problems now that more of the geometry is included? The good news is
that in low dimensional cases, there are algorithms that can compute them in a reasonable time period. These issues
will be the subject of the last section of this chapter.

11.7 Cohomology of Eilenberg-Mac Lane Spaces


Recall way back in Section 11.2 that we said that the cohomology operations of the form H n (X; π) → H m (X; G)
are in one to one correspondence with the cohomology group H m (K(π, n); G). (We abbreviate the latter as H m (π, n; G).)
To recall how the correspondence worked, we let θ be such an operation. By the Universal Coefficient Theorem,
H n (X; πn (X)) ≅ Hom(Hn (X), πn (X))
if X is (n − 1)-connected. K(π, n) definitely meets this criterion, so we let ın be the element of H n (π, n; π) corresponding to the inverse h−1 of the Hurewicz isomorphism h : πn (X) → Hn (X) where X = K(π, n). Then ın ∈ H n (π, n; π) is called the fundamental class and the cohomology operation θ is paired with θ(ın ) ∈ H m (π, n; G).
Now look at the cohomology operation Sq i . It is of the form H n (X; Z2 ) → H n+i (X; Z2 ). Here G = π = Z2 and m = n + i. So Sq i is paired with an element of H n+i (Z2 , n; Z2 ). So we would expect a close connection between the structure of the cohomology ring H ∗ (Z2 , n; Z2 ) and the Steenrod squares. We will see that this is a polynomial ring whose generators are certain Steenrod squares. Proving this fact is not easy, though. It will involve some sophisticated machinery, especially spectral sequences.
The calculations are not so much hard as messy. To understand them, though, I will outline some key points. Then
you can look at Mosher and Tangora if you are interested in the details.
At this point, you should review Section 9.9. I will assume you know the basic definitions of spectral sequences,
the use of the Leray-Serre spectral sequence in computing homology of fiber spaces, and how to get a spectral sequence
from an exact couple.

11.7.1 Bockstein Exact Couple


The Bockstein exact couple will be needed in Section 11.7.8 to describe the structure of H ∗ (Z_{2^m} , 1; Z2 ).
Definition 11.7.1. The Bockstein exact couple is of the form:

D1 = H ∗ (; Z) −i1→ H ∗ (; Z) = D1 , with j 1 : D1 → E 1 = H ∗ (; Z2 ) and k 1 : E 1 → D1 completing the exact triangle.

Here i1 is induced by multiplication by 2 in Z; j 1 is induced by the reduction mod 2 homomorphism ρ : Z → Z2 ,


and k 1 is the Bockstein homomorphism β defined at the start of Section 11.4.1. The differential d = j 1 k 1 is the
Bockstein homomorphism δ2 (Section 11.4.1).

Mosher and Tangora use subscripts rather than superscripts for successive differentials. So d1 = d, d2 = j 2 k 2 : E 2 → E 2 , etc.

The operation dr acts as follows. Take a cocycle in Z2 coefficients, represent it by an integral cocycle, take its coboundary, divide by 2^r , and reduce the coefficients mod 2. (The division is possible since dr is defined only on the kernel of dr−1 .) Every dr raises the dimension by 1 in H ∗ (; Z2 ).
The Bockstein differentials act on Z2 cohomology, but they can give information on integral cohomology as well.
The Universal Coefficient Theorems can be used to prove the following.

Theorem 11.7.1. Elements of H ∗ (X; Z2 ) which come from free integral classes lie in ker dr for every r and not in im dr for any r. If z generates a summand Z_{2^r} in H n+1 (X; Z), then there exist corresponding summands Z2 in H n (X; Z2 ) and H n+1 (X; Z2 ). If we call their generators z ′ and z ′′ respectively, then di (z ′ ) and di (z ′′ ) are zero for i < r and dr (z ′ ) = z ′′ . This means that z ′ and z ′′ are not in im di for i < r. We say that the image of ρ on the free subgroup of H ∗ (X; Z) "persists to E ∞ ", and that z ′ and z ′′ "persist to E r but not to E r+1 ."

Our last result will be useful in calculating homotopy groups of spheres.

Theorem 11.7.2. Suppose that H i (X; Z2 ) = 0 for i < n, and H n (X; Z2 ) = Z2 with generator z. Then the part of H n (X; Z) not involving odd primes is Z if dr (z) = 0 for all r, and Z_{2^r} if di (z) = 0 for i < r and dr (z) ̸= 0.

11.7.2 Serre’s Exact Sequence for a Fiber Space


Recall from Theorem 9.6.22 that a fiber space has a long exact sequence in homotopy. Is there something similar
in homology? Not always, but we can show the following using the Leray-Serre spectral sequence (Section 9.9.2).
Theorem 11.7.3. Let F −i→ E −p→ B be a fiber space with B simply connected. Suppose that Hi (B) = 0 for 0 < i < p and Hj (F ) = 0 for 0 < j < q. Then there is an exact sequence

Hp+q−1 (F ) −i∗→ Hp+q−1 (E) −p∗→ Hp+q−1 (B) −τ→ Hp+q−2 (F ) → · · · → H1 (E) → 0.
2 2
Proof: Since Ei,j = Hp (B; Hq (F )) in the Leray-Serre spectral sequence, Ei,j = 0, when either 0 < i < p or

0 < j < q. Looking at the Ei,j terms where i + j = n gives the exact sequence
∞ ∞
0 → E0,n → Hn (E) → En,0 → 0.

In general, we have the exact sequence

∞ n dn
n ∞
0 → En,0 → En,0 −→ E0,n−1 → E0,n−1 → 0.
n ∼ n ∼
If n < p + q, En,0 = Hn (B) and E0,n−1 = Hn−1 (F ). Substituting into the previous sequence in splicing it into the
one above for all n < p + q proves the theorem. ■
The sequence of the theorem is called Serre’s exact sequence for a fiber space.

11.7.3 Transgression
The map τ : Hn (B) → Hn−1 (F ) of Theorem 11.7.3 corresponds to d^n_{n,0} and is called the transgression. This map was only defined if n < p + q and F and B are (q − 1)-connected and (p − 1)-connected respectively.
To define it more generally as d^n_{n,0} , τ has a subgroup of Hn (B) as its domain and a quotient of Hn−1 (F ) as its range.
Here is a different but equivalent description which will be more useful. In the fiber space F → E −p→ B, let p0 : (E, F ) → (B, ∗) be the map of pairs induced by p. (Here ∗ denotes a single point of B.) For any n, let τ̄ be a map from im (p0 )∗ , which is a subgroup of Hn (B, ∗), to a quotient group of Hn−1 (F ) defined as follows: If x ∈ im (p0 )∗ ⊂ Hn (B, ∗), choose y ∈ Hn (E, F ) such that (p0 )∗ (y) = x, and take ∂y as a representative of τ̄ (x). Then τ̄ (x) is in a quotient group because of the indeterminacy in choosing y.
Definition 11.7.2. We say that x ∈ Hn (B) is transgressive if τ (x) is defined. Since Hn (B) = E^2_{n,0} , this is the same as saying that di (x) = 0 for all i < n.
The above is equivalent to the condition that x ∈ im (p0 )∗ and that if x = (p0 )∗ (y) then τ̄ x is the homology class of ∂y. So τ = τ̄ . The proof is in Serre’s thesis [143].

11.7.4 Cohomology Version of Leray-Serre Spectral Sequence


As we want to interact with Steenrod squares, we really would like a spectral sequence for cohomology of a fiber
space as opposed to homology. Fortunately, Serre supplied us with one of those as well. We will rely on it heavily for
computing cohomology of some Eilenberg-Mac Lane spaces.
We will assume $F \to E \xrightarrow{p} B$ is a fiber space and make life easier by letting $B$ be simply connected. We will get a spectral sequence $(E_r, d_r)$ with $E_2^{p,q} = H^p(B; H^q(F))$. Note that we distinguish the cohomology spectral sequence from the homology spectral sequence by turning subscripts into superscripts and vice versa. We have

$$d_r^{p,q} : E_r^{p,q} \to E_r^{p+r,\, q-r+1},$$

so that the bidegree of $d_r$ is $(r, -r+1)$, which is the opposite direction of the homology differential $d_r$. The sequence converges to $H^*(E)$.
Theorem 11.7.4. If $R$ is a commutative ring with unit element, then there is a spectral sequence with $E_2^{p,q} = H^p(B; H^q(F; R))$ converging to $H^*(E; R)$ such that:
1. For each $r$, $E_r$ is a bigraded ring. If $\mu$ is the ring multiplication, then
$$\mu : E_r^{p,q} \otimes E_r^{p',q'} \to E_r^{p+p',\, q+q'}.$$

2. In $E_r$, $d_r$ is an anti-derivation, which means that
$$d_r(ab) = d_r(a)b + (-1)^k a\, d_r(b),$$
where $k$ is the total degree of $a$. The formula holds wherever it makes sense.
3. The product in the ring Er+1 is induced by the product in Er , and the product in E∞ is the cup product in
H ∗ (E; R).
Note that if R is a field, the Künneth Theorem implies that E2 = H ∗ (B; R) ⊗ H ∗ (F ; R).
Theorem 11.7.5. Serre's Cohomology Exact Sequence: If $B$ is $(p-1)$-connected and $F$ is $(q-1)$-connected, then there is an exact sequence analogous to the homology version which terminates as:
$$\cdots \to H^{p+q-2}(F) \xrightarrow{\tau} H^{p+q-1}(B) \xrightarrow{p^*} H^{p+q-1}(E) \xrightarrow{i^*} H^{p+q-1}(F).$$
The cohomology transgression is
$$\tau = d_n^{0,n-1} : E_n^{0,n-1} \to E_n^{n,0}.$$
Definition 11.7.3. $x \in H^{n-1}(F)$ is transgressive if $\tau(x)$ is defined. This is true if $d_i(x) = 0$ for all $i < n$, or equivalently, if $\delta x$ lies in $\operatorname{im} p^* \subset H^n(E, F)$. If $\delta x = p^*(y)$, then $\tau(x)$ contains $y$.
Theorem 11.7.6. If $x \in H^{n-1}(F)$ is transgressive, then so is $Sq^i(x)$, and if $y \in \tau(x)$, then $Sq^i(y) \in \tau(Sq^i(x))$.
Proof: If $y \in \tau(x)$, then $p^* y = \delta x$, so $Sq^i(p^* y) = Sq^i(\delta x)$. By naturality, $p^*(Sq^i(y)) = \delta(Sq^i(x))$, which means that $Sq^i(y) \in \tau(Sq^i(x))$. ■

11.7.5 H ∗ (Z, 2; Z)
For the remainder of this section, I will state some results about cohomology of Eilenberg-Mac Lane spaces. I will
limit myself to stating results with some short comments. See Mosher and Tangora for detailed calculations, all of
which involve cohomology spectral sequences.
Two types of fiber spaces will be useful here. The first is the path-space fibration of Definition 9.5.2. We will write it as $\Omega B \to E \xrightarrow{p} B$. (I am following the notation of Mosher and Tangora that $\Omega B$ is the space of loops on $B$, which is more common than the notation $\Lambda B$ found in Hu.) Recall that $E$ is the space of paths on $B$ and is contractible. If $B = K(\pi, n)$, then the homotopy exact sequence shows that the fiber is $F = \Omega B = K(\pi, n-1)$.
The other type comes from a short exact sequence of abelian groups

0 → A → B → C → 0.

Mosher and Tangora show that we can use this sequence to construct a fiber space

$$K(A, n) \xrightarrow{i} K(B, n) \xrightarrow{p} K(C, n).$$

The work will involve setting up the appropriate fiber space and calculating the cohomology using the Serre
spectral sequence.

Theorem 11.7.7. H ∗ (Z, 2; Z) is the polynomial ring Z[ı2 ] where ı2 is of degree 2 and (ı2 )n generates H 2n (Z, 2; Z).

The theorem is proved using the spectral sequence of the fiber space S 1 = K(Z, 1) → E → K(Z, 2) where E is
contractible and R = Z.

11.7.6 H ∗ (Z2 , n; Z2 )
The first approach in Mosher and Tangora is to calculate $H^*(Z_2, 2; Z_2)$ in low dimensions using the fiber space
$$F = K(Z_2, 1) = P^\infty \to E \to B = K(Z_2, 2).$$

We already know that the cohomology of F is a polynomial ring on one generator and the results about transgression
are used to relate it to the generators of the cohomology of B. To find the ring structure in general, we need another
theorem.

Definition 11.7.4. A graded ring R over Z2 has the ordered set x1 , x2 , · · · as a simple system of generators if the
monomials {xi1 xi2 · · · xir |i1 < i2 < · · · < ir } form a Z2 -basis for R and if for each n, only finitely many xi have
degree n.
p
Theorem 11.7.8. Borel’s Theorem: Let F → E − → B be a fiber space with E acyclic, and suppose that H ∗ (F ; Z2 )
has a simple system {xi } of transgressive generators. Then H ∗ (B; Z2 ) is a polynomial ring generated by {τ (xi )}.

Theorem 11.7.9. H ∗ (Z2 , n; Z2 ) is a polynomial ring over Z2 with generators {Sq I (ın )} where I runs through all
admissible sequences with excess less than n.

We know that the theorem is true for n = 1 so we proceed by induction on n. We suppose it is true for n and use
the fiber space
F = K(Z2 , n) → E → B = K(Z2 , n + 1)

with E acyclic. We apply Borel’s Theorem and go through a long calculation with admissible sequences. See [106]
for the details.
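To get a feel for Theorem 11.7.9, it is easy to enumerate the sequences that index the generators in low degrees. The Python sketch below (the helper names are mine) uses the usual conventions that $I = (i_1, \ldots, i_r)$ is admissible when $i_j \ge 2 i_{j+1}$ and that the excess is $e(I) = i_1 - i_2 - \cdots - i_r$; the empty sequence corresponds to $\imath_n$ itself.

```python
def admissible_sequences(total):
    """Admissible sequences (i_1, ..., i_r) of positive integers with
    i_j >= 2*i_{j+1} for all j and i_1 + ... + i_r == total."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in admissible_sequences(total - first):
            if not rest or first >= 2 * rest[0]:
                yield (first,) + rest

def excess(seq):
    """e(I) = i_1 - (i_2 + ... + i_r); the empty sequence has excess 0."""
    return seq[0] - sum(seq[1:]) if seq else 0

def generators(n, max_extra_degree):
    """Sequences I with e(I) < n indexing polynomial generators
    Sq^I(iota_n) of H^*(Z_2, n; Z_2), in degrees n, ..., n + max_extra_degree."""
    gens = []
    for d in range(max_extra_degree + 1):
        for I in admissible_sequences(d):
            if excess(I) < n:
                gens.append(I)
    return gens

# For n = 2 the first generators are iota_2, Sq^1 iota_2, Sq^2 Sq^1 iota_2.
print(generators(2, 5))  # [(), (1,), (2, 1)]
```

For $n = 2$ this recovers the beginning of the well-known answer: $H^*(Z_2, 2; Z_2)$ is polynomial on $\imath_2$, $Sq^1\imath_2$, $Sq^2 Sq^1\imath_2$, and so on.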

11.7.7 Additional Examples


Here are four more results from [106] on cohomology of Eilenberg-Mac Lane spaces.

Theorem 11.7.10. H ∗ (Z, 2; Z2 ) is the polynomial ring over Z2 generated by ı2 ∈ H 2 (Z, 2; Z2 ).

Theorem 11.7.11. H ∗ (Z, n; Z2 ) is the polynomial ring over Z2 generated by the elements {Sq I (ın )} where I runs
through all admissible sequences of excess e(I) < n, and where the last entry ir of I is not equal to one.

For the next theorem, we need to define an exterior algebra.

Definition 11.7.5. Let V be a vector space over a field F . (More generally, we could start with a module over a
commutative ring with unit element, but we will be using the field Z2 in our case.) The exterior algebra E(V ) is the
quotient Γ(V )/J where Γ(V ) is the tensor algebra over V (Definition 11.6.7), and J is generated by the elements of
the form v ⊗ v for v ∈ V . The exterior algebra E(v1 , · · · , vr ) for v1 , · · · , vr is the exterior algebra over the subspace
of V generated by v1 , · · · , vr .

If f : Γ(V ) → Γ(V )/J = E(V ) is the projection, then we write v ∧ w for f (v ⊗ w). So we have that v ∧ v = 0
and
(v + w) ∧ (v + w) = (v ∧ v) + (v ∧ w) + (w ∧ v) + (w ∧ w) = (v ∧ w) + (w ∧ v) = 0,
so (v ∧ w) = −(w ∧ v).

Theorem 11.7.12. H ∗ (Z2m , 1; Z2 ) = P [dm (ı1 )] ⊗ E(ı1 ) for m ≥ 2. Here dm denotes the mth Bockstein homomor-
phism (See Sections 11.7.1-11.7.4.), P [dm (ı1 )] is the polynomial ring over Z2 generated by dm (ı1 ) ∈ H 2 (Z2m , 1; Z2 )
and E(ı1 ) is the exterior algebra generated by ı1 ∈ H 1 (Z2m , 1; Z2 ). (Recall that the Bockstein homomorphisms raise
the dimension by one. This happens in both the homology and the cohomology versions.)

Theorem 11.7.13. H ∗ (Z2m , n; Z2 ) is the polynomial ring over Z2 generated by elements {Sq Im (ın )} where we
define Sq Im = Sq I if I terminates with ir > 1, and we replace Sq ir with dm in Sq I if ir = 1. In this case as well, I
runs through all admissible sequences of excess e(I) < n.

One further question. We know that we could not have computed these cohomology rings with software like
Ripser, since Eilenberg-Mac Lane spaces are infinite dimensional and therefore have infinitely many simplices. Given
that Kenzo can compute spectral sequences in some cases, could it have found these results? That could be the subject
of some interesting experiments.

11.8 Reduced Powers


It is well known that if Superman is exposed to kryptonite, he becomes less powerful and thus has reduced powers. That, however, is not the subject of this section.
Steenrod reduced powers are an additional family of cohomology operations. Suppose that 2 is not your favorite
prime number, and you really prefer to work over Z41 . Reduced powers handle the case of cohomology over Zp where
p is an odd prime number.
The construction is discussed in Steenrod’s lectures [153] but presented in much more detail in Steenrod and
Epstein [157]. Since Mosher and Tangora leave out reduced powers entirely, the material for this section comes from
[157]. I will refer you there for the full constructions, but I will mention that the construction in Section 11.3 is modified to look at the action of a subgroup $\pi$ of the permutation group $S_n$ on $W \otimes K \otimes \cdots \otimes K = W \otimes K^{\otimes n}$, where $W$ is now a $\pi$-free acyclic complex, and $K^{\otimes n}$ is the tensor product of $K$ with itself $n$ times.
The result is the (cyclic) reduced powers taking the form

P i : H q (K; Zp ) → H q+2i(p−1) (K; Zp ).



The notation leaves ambiguous which value of $p$ is being used. One could imagine notation like $3^i$ for $p = 3$, but that could obviously get confusing. In [153], Steenrod sometimes uses the notation $P_p^i$, but I have never seen that anywhere else. So for now, we will use the conventional notation $P^i$ and understand which value of $p$ we are using from context.
Our first task is to write down the properties corresponding to those listed in Theorem 11.4.1. We start with a
Bockstein homomorphism analogous to the one representing Sq 1 .
Let $p$ be an odd prime and let $\beta : H^q(X; Z_p) \to H^{q+1}(X; Z_p)$ be the Bockstein operator associated with the exact coefficient sequence
$$0 \to Z \xrightarrow{m} Z \to Z_p \to 0,$$
where $m$ is multiplication by $p$. To define it, we start with a homomorphism $\tilde{\beta} : H^q(X; Z_p) \to H^{q+1}(X; Z)$ defined in an analogous way to the mod 2 case. Let $x \in H^q(X; Z_p)$. Represent $x$ by a cocycle $c$ and choose an integral cochain $c'$ which maps to $c$ under reduction mod $p$. Then $\delta c'$ is a multiple of $p$, and $\frac{1}{p}(\delta c')$ represents $\tilde{\beta} x$. The composition of $\tilde{\beta}$ and the reduction homomorphism $H^{q+1}(X; Z) \to H^{q+1}(X; Z_p)$ gives a homomorphism
$$\beta : H^q(X; Z_p) \to H^{q+1}(X; Z_p).$$
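The cochain-level recipe can be tried on the smallest possible example: a hypothetical two-cell model with one cell in degree $q$ and one in degree $q+1$ whose integral coboundary is multiplication by $p$ (as for a mod $p$ Moore space). A Python sketch:

```python
p = 5  # any prime; the recipe below is the same

# Hypothetical minimal model: one cell in degree q, one in degree q+1,
# with integral coboundary delta(c) = p*c.
def delta(c):
    return p * c

c = 1                    # mod-p cocycle generating H^q(X; Z_p)
c_lift = c               # integral cochain reducing to c mod p
dc = delta(c_lift)       # its coboundary...
assert dc % p == 0       # ...is a multiple of p, as claimed in the text
beta_c = (dc // p) % p   # divide by p, then reduce mod p
print(beta_c)            # 1: here beta sends generator to generator
```

In this toy model the Bockstein is an isomorphism from degree $q$ to degree $q+1$, which is the expected behavior for a mod $p$ Moore space.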

We have that $\beta^2 = 0$ and, writing $xy$ for $x \cup y$,
$$\beta(xy) = \beta(x)y + (-1)^{\dim x}\, x\beta(y).$$

We now give the main axioms for P i . Compare them to the axioms for Sq i in Theorem 11.4.1.

Theorem 11.8.1. The Steenrod reduced powers P i for i ≥ 0 have the following properties:

1. $P^i$ is a natural homomorphism $H^q(K, L; Z_p) \to H^{q+2i(p-1)}(K, L; Z_p)$.

2. If $2k > q$, then $P^k(x) = 0$ for $x \in H^q(K, L; Z_p)$.

3. If $x \in H^{2k}(K, L; Z_p)$, then $P^k(x) = x^p = x \cup \cdots \cup x$ ($p$ times).

4. $P^0$ is the identity homomorphism.

5. $\delta^* P^i = P^i \delta^*$, where $\delta^*$ is the coboundary homomorphism $\delta^* : H^q(L; Z_p) \to H^{q+1}(K, L; Z_p)$.

6. Cartan Formula: Writing $xy$ for $x \cup y$, we have
$$P^k(xy) = \sum_i (P^i x)(P^{k-i} y).$$

7. Adem Relations: For $a < pb$,
$$P^a P^b = \sum_{t=0}^{\lfloor a/p \rfloor} (-1)^{a+t} \binom{(p-1)(b-t)-1}{a-pt} P^{a+b-t} P^t,$$
where the binomial coefficient is taken mod $p$.

If $a \le pb$, then
$$P^a \beta P^b = \sum_{t=0}^{\lfloor a/p \rfloor} (-1)^{a+t} \binom{(p-1)(b-t)}{a-pt} \beta P^{a+b-t} P^t + \sum_{t=0}^{\lfloor (a-1)/p \rfloor} (-1)^{a+t-1} \binom{(p-1)(b-t)-1}{a-pt-1} P^{a+b-t} \beta P^t.$$
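The binomial coefficients mod $p$ appearing in the Adem relations can be computed digit by digit using Lucas's theorem. The Python sketch below (the function name is mine) evaluates the coefficient of $P^{a+b-t}P^t$ in the first relation for a small case; for $p = 3$ it recovers the familiar relation $P^1 P^1 = 2P^2$.

```python
def binom_mod_p(m, n, p):
    """C(m, n) mod p via Lucas's theorem: multiply the binomial
    coefficients of the base-p digits of m and n."""
    if n < 0 or n > m:
        return 0
    result = 1
    while m or n:
        mi, ni = m % p, n % p
        if ni > mi:
            return 0          # a digit of n exceeds that of m
        c = 1
        for k in range(ni):   # small binomial C(mi, ni) with mi, ni < p
            c = c * (mi - k) // (k + 1)
        result = (result * c) % p
        m //= p
        n //= p
    return result

# Coefficient of P^{a+b-t} P^t in P^a P^b for p = 3, a = b = 1, t = 0:
# (-1)^{a+t} C((p-1)(b-t) - 1, a - p*t) = -C(1, 1) = 2 mod 3.
p, a, b, t = 3, 1, 1, 0
coeff = (-1) ** (a + t) * binom_mod_p((p - 1) * (b - t) - 1, a - p * t, p) % p
print(coeff)  # 2
```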

It turns out that like Sq i , P i also commutes with suspension. The Bockstein homomorphism β also commutes
with both coboundary and suspension.
Also, like Sq i , the reduced powers form an algebra.

Definition 11.8.1. Define the Steenrod algebra A(p) to be the graded associative algebra generated by elements P i of
degree 2i(p − 1) and β of degree 1, subject to β 2 = 0, the Adem Relations, and P 0 = 1.

A monomial in $A(p)$ can be written in the form
$$\beta^{\epsilon_0} P^{s_1} \beta^{\epsilon_1} \cdots P^{s_k} \beta^{\epsilon_k},$$
where $\epsilon_i \in \{0, 1\}$ and $s_i$ is a positive integer. We denote this monomial by $P^I$, where
$$I = \{\epsilon_0, s_1, \epsilon_1, \cdots, s_k, \epsilon_k, 0, 0, \cdots\}.$$
A sequence $I$ is admissible if $s_i \ge p s_{i+1} + \epsilon_i$ for each $i \ge 1$. The corresponding $P^I$ as well as $P^0$ are called admissible monomials. Define the moment of $I$ to be $\sum_{i=1}^k i(s_i + \epsilon_i)$. The degree of $I$ is the degree of $P^I$.
Steenrod and Epstein prove the following.

Theorem 11.8.2. The admissible monomials form a basis for A(p).
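The admissibility condition, degree, and moment are all mechanical to check. A Python sketch (the encoding of $I$ as two tuples is my own, and the admissibility test assumes the Steenrod-Epstein convention stated above):

```python
def is_admissible(eps, s, p):
    """The monomial beta^{eps[0]} P^{s[0]} beta^{eps[1]} ... P^{s[k-1]} beta^{eps[k]}
    is admissible when s_i >= p*s_{i+1} + eps_i for each interior eps_i."""
    return all(s[i - 1] >= p * s[i] + eps[i] for i in range(1, len(s)))

def degree(eps, s, p):
    """deg P^i = 2i(p-1) and deg beta = 1."""
    return sum(2 * si * (p - 1) for si in s) + sum(eps)

def moment(eps, s):
    """moment(I) = sum over i of i*(s_i + eps_i)."""
    return sum((i + 1) * (s[i] + eps[i + 1]) for i in range(len(s)))

# For p = 3: P^3 P^1 is admissible (3 >= 3*1), but P^2 P^1 is not (2 < 3).
print(is_admissible((0, 0, 0), (3, 1), 3), degree((0, 0, 0), (3, 1), 3))  # True 16
```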

Steenrod and Epstein discuss the structure of $A(p)$ and its dual $A(p)^*$ in a way similar to their treatment of the Steenrod squares. For example, $A(p)$ is a Hopf algebra, so reduced powers make the cohomology ring $H^*(X; Z_p)$ an algebra over a Hopf algebra. See there for more details.
Obviously, Steenrod squares and reduced powers do not represent all possible cohomology operations. For example, Steenrod [153] mentions the Pontrjagin $p$th power
$$P : H^{2q}(K; Z_{p^k}) \to H^{2pq}(K; Z_{p^{k+1}}).$$

Pontrjagin [122] discovered these operations for p = 2 and Thomas [160, 161] discovered them for primes p > 2.
There are also higher order cohomology operations derived from the Steenrod squares that have proved useful in
the computation of homotopy groups of spheres. I will not discuss them in this book but there is a good description in
Chapter 16 of Mosher and Tangora [106].
Finally, there is some work out there on computer calculation of Steenrod reduced powers. See for example the
papers of Gonzalez-Diaz and Real [57, 58] on this subject. Still, the idea of using cohomology operations for data
science is so new, I will stick with the simplest ones, the Steenrod squares, for the remainder of this chapter.

11.9 Vector Bundles and Stiefel-Whitney Classes


In this section, I will describe a classical example of Steenrod squares being used in a classification problem.
Algebraic topology has often been focused on solving problems in differential geometry. As an example, recall the
problem of the existence of a nonzero tangent vector field for S n that we addressed in Theorem 4.1.35. This section
deals with vector bundles which lie on the boundary between these two fields.
Roughly speaking, vector bundles are fiber spaces whose fiber is a vector space. An example from differential
geometry is the tangent bundle on a manifold. The base space is the manifold and the fiber at a point on the manifold
is the tangent space at that point. Vector bundles can be classified by certain cohomology classes of the base space
known as characteristic classes. I will focus on a specific type called Stiefel-Whitney classes. These are classes in the
cohomology over Z2, and it turns out they can be represented using Steenrod squares. Two vector bundles that are isomorphic have the same Stiefel-Whitney classes, but the converse is not always true.
Question: If our data lies on a manifold, could we solve a classification problem by calculating the Stiefel-Whitney
classes of some appropriate bundle?
I won’t address computation issues, but Aubrey HB touches on this area in her thesis [65]. See there for details.

In what follows, I will give the definition of vector bundles and Stiefel-Whitney classes along with stating some of
their properties. I will describe a notable example, the Grassmann manifold, and describe its cohomology. Finally, I
will describe how the Stiefel-Whitney classes are represented by Steenrod squares.
The classic book on characteristic classes is the one by Milnor and Stasheff [102]. All of the material in this section comes from there.

11.9.1 Vector Bundles


I will start by defining a vector bundle. Compare this to the definition of a fiber space from Section 9.3.
Definition 11.9.1. A real vector bundle ξ over a topological space B (called the base space) consists of the following:
1. A topological space E = E(ξ) called the total space.
2. A map π : E → B called the projection map.
3. For each b ∈ B, π −1 (b) is a vector space over the reals called the fiber over b and denoted by Fb .
In addition, the bundle must satisfy the condition of local triviality: For each b ∈ B, there exists a neighborhood
U ⊂ B, an integer n ≥ 0, and a homeomorphism
h : U × Rn → π −1 (U )
so that for each b ∈ U , the correspondence x → h(b, x) defines a vector space isomorphism between Rn and π −1 (b).
The pair (U, h) is called a local coordinate system for ξ about b. If U can be chosen to be all of B, then ξ is called
a trivial bundle.
The bundle ξ is called an n-plane bundle if the dimension of Fb is n for all b ∈ B.
Definition 11.9.2. Two bundles $\xi$ and $\eta$ over the same base space $B$ are isomorphic, written $\xi \cong \eta$, if there exists a homeomorphism $f : E(\xi) \to E(\eta)$ which maps each vector space $F_b(\xi)$ isomorphically onto the corresponding vector space $F_b(\eta)$.
Example 11.9.1. The trivial bundle with total space B × Rn and projection π(b, x) = b is a vector bundle denoted
ϵnB .
Example 11.9.2. The tangent bundle τM of a smooth manifold M (the coordinate charts are infinitely differentiable)
has a total space DM which consists of all pairs (x, v) with x ∈ M and v in the tangent space Tx M . (See Definition
6.5.2.) The projection map is π : DM → M with π(x, v) = x. If τM is trivial, then M is called parallelizable.
Example 11.9.3. The normal bundle ν of a smooth manifold M ⊂ Rn has a total space E ⊂ M × Rn which consists of all pairs (x, v) such that v is orthogonal to the tangent space Tx M.
Definition 11.9.3. The canonical line bundle γn1 over the projective space P n is the bundle with total space E(γn1 ) ⊂
P n × Rn+1 consisting of all pairs ({±x}, v) such that v is a multiple of x. Each fiber π −1 ({±x}) can be identified
with the line through x and −x in Rn+1 .
Theorem 11.9.1. The bundle γn1 over P n is not trivial for n ≥ 1.
Definition 11.9.4. A cross-section of a vector bundle ξ with base space B is a continuous function s : B → E(ξ)
which takes b ∈ B into the corresponding fiber Fb (ξ). The cross section is nowhere zero if s(b) is a nonzero vector in
Fb (ξ) for each b.
A vector field is a cross-section of the tangent bundle of a manifold.
A trivial R1 bundle has a nowhere zero cross-section. The bundle γn1 turns out to have no such cross section.
Milnor and Stasheff prove a more general result.
Theorem 11.9.2. An Rn bundle ξ is trivial if and only if ξ admits n cross-sections s1, · · · , sn which are nowhere dependent, i.e., the vectors s1(b), · · · , sn(b) are linearly independent for all b ∈ B.

11.9.2 New Bundles from Old Ones


Now that we know what a vector bundle is, here are some constructions of new ones from old ones. They will be
necessary to understand the axioms for Stiefel-Whitney classes.

Definition 11.9.5. Let ξ be a vector bundle with projection π : E → B, and let B′ ⊂ B. Then letting E′ = π−1(B′) and π′ : E′ → B′ be the restriction of π to E′, we get a vector bundle ξ|B′ called the restriction of ξ to B′. Each fiber Fb(ξ|B′) is equal to the corresponding fiber Fb(ξ).

The next construction is entirely analogous to the induced fibering construction at the end of Section 9.3.

Definition 11.9.6. Let ξ be a vector bundle with projection π : E → B, and let B1 be a topological space. Given
a map f : B1 → B, we get the induced bundle f ∗ ξ over B1 . The total space E1 of f ∗ ξ is the subset of B1 × E
consisting of pairs (b, e) with f (b) = π(e).

Definition 11.9.7. Let ξ1 and ξ2 be vector bundles with projections πi : Ei → Bi , for i = 1, 2. Then the Cartesian
product ξ1 × ξ2 is the bundle with projection map

π1 × π2 : E1 × E2 → B1 × B2

with fibers
(π1 × π2 )−1 (b1 , b2 ) = Fb1 (ξ1 ) × Fb2 (ξ2 ).

Definition 11.9.8. Let ξ1 , ξ2 be bundles over the same base space B. Let d : B → B × B be the diagonal map
d(b) = (b, b). The induced bundle d∗ (ξ1 × ξ2 ) over B is called the Whitney sum of ξ1 and ξ2 and is denoted ξ1 ⊕ ξ2 .
Each fiber Fb (ξ1 ⊕ ξ2 ) is isomorphic to the direct sum Fb (ξ1 ) ⊕ Fb (ξ2 ).

Definition 11.9.9. Let ξ, η be bundles over the same base space B, and suppose that E(ξ) ⊂ E(η). The bundle ξ is a
sub-bundle of η if each fiber Fb (ξ) is a vector subspace of the corresponding fiber Fb (η).

Theorem 11.9.3. Let ξ1 , ξ2 be sub-bundles of η such that each fiber Fb (η) is the vector space direct sum Fb (ξ1 ) ⊕
Fb (ξ2 ). Then η is isomorphic to the Whitney sum ξ1 ⊕ ξ2 .

Recall that a Hilbert space is a vector space $V$ with an inner product. Assuming the scalars are the reals, we have an inner product $v \cdot w \in R$ for $v, w \in V$. Milnor and Stasheff call this a Euclidean vector space. There is a function $\mu : V \to R$ with $\mu(v) = v \cdot v$. In general,
$$v \cdot w = \frac{1}{2}(\mu(v + w) - \mu(v) - \mu(w)).$$
The function $\mu$ is quadratic, which means that $\mu(v) = \sum_i \ell_i(v)\, \ell'_i(v)$ where the $\ell_i$ and $\ell'_i$ are linear, and $\mu$ is positive definite, which means that $\mu(v) > 0$ if $v \ne 0$.
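The polarization identity above is easy to sanity-check numerically. A small Python sketch using the standard inner product on $R^4$:

```python
def mu(v):
    """The quadratic function mu(v) = v . v for the standard inner product."""
    return sum(x * x for x in v)

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

v = [1, -2, 3, 0]
w = [4, 1, -1, 2]
vw = [x + y for x, y in zip(v, w)]

# v . w = (mu(v + w) - mu(v) - mu(w)) / 2
print(dot(v, w), (mu(vw) - mu(v) - mu(w)) // 2)  # -1 -1
```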

Definition 11.9.10. A Euclidean vector bundle is a real vector bundle $\xi$ together with a continuous function $\mu : E(\xi) \to R$ such that the restriction of $\mu$ to each fiber of $\xi$ is quadratic and positive definite. $\mu$ is called a Euclidean metric on $\xi$.

Now let η be a Euclidean vector bundle and ξ a sub-bundle of η. Let Fb(ξ⊥) consist of all vectors v ∈ Fb(η) such that v · w = 0 for all w ∈ Fb(ξ). Let E(ξ⊥) be the union of the Fb(ξ⊥). Then E(ξ⊥) is the total space of a sub-bundle ξ⊥ of η, and η is isomorphic to the Whitney sum ξ ⊕ ξ⊥. We call ξ⊥ the orthogonal complement of ξ.
As an example, the orthogonal complement of the tangent bundle of a manifold is the normal bundle.

11.9.3 Stiefel-Whitney Classes


We can now define Stiefel-Whitney classes. They will be elements of the cohomology ring over Z2 of the base
space of a vector bundle. Milnor and Stasheff also discuss other types of characteristic classes such as the Euler

classes for integral cohomology and Chern classes for vector bundles whose fibers are vector spaces over the complex
numbers. See [102] for more details.
As was the case for Steenrod squares, Stiefel-Whitney classes can be defined axiomatically without worrying about
whether they actually exist. This is the approach of Milnor and Stasheff. After the axioms and some properties are
described, [102] describes the construction in terms of Steenrod squares. It is then shown that classes constructed in
this way satisfy the required axioms. I will follow their approach including a short discussion of an interesting example
known as a Grassmann manifold.
Definition 11.9.11. Axiomatic definition: To each real vector bundle ξ there is a sequence of cohomology classes

wi (ξ) ∈ H i (B(ξ); Z2 ), i = 0, 1, 2, · · · ,

called the Stiefel-Whitney classes of ξ. They satisfy the following four axioms:
1. The class w0 (ξ) is equal to the unit element 1 ∈ H 0 (B(ξ); Z2 ), and wi (ξ) = 0 for i > n if ξ is an n-plane
bundle.
2. Naturality: If $f : B(\xi) \to B(\eta)$ is covered by a bundle map from $\xi$ to $\eta$ (i.e., there is a map $E(\xi) \to E(\eta)$ that preserves fibers), then
$$w_i(\xi) = f^* w_i(\eta).$$

3. The Whitney Product Theorem: If $\xi$ and $\eta$ are vector bundles over the same base space, then
$$w_k(\xi \oplus \eta) = \sum_{i=0}^k w_i(\xi) \cup w_{k-i}(\eta).$$

4. For the line bundle γ11 over the circle P 1 , w1 (γ11 ) ̸= 0.
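Over $Z_2$, the Whitney Product Theorem is just multiplication of truncated polynomials: represent the total class $w(\xi) = 1 + w_1(\xi) + w_2(\xi) + \cdots$ by its coefficient sequence. The Python sketch below (the naming is mine, not from [102]) uses the standard Milnor-Stasheff computation $w(\tau_{P^n}) = (1+a)^{n+1}$ to check that the tangent bundle of $P^3$ has vanishing Stiefel-Whitney classes in positive degrees, consistent with $P^3$ being parallelizable.

```python
def whitney_product(w_xi, w_eta, top):
    """Multiply total Stiefel-Whitney classes given as mod-2 coefficient
    lists [w_0, w_1, ...], truncating above degree `top` (the dimension
    of the base space)."""
    w = [0] * (top + 1)
    for i, a in enumerate(w_xi):
        for j, b in enumerate(w_eta):
            if i + j <= top:
                w[i + j] = (w[i + j] + a * b) % 2
    return w

# Milnor-Stasheff: w(tau of P^n) = (1 + a)^{n+1}, truncated above degree n.
n = 3
w = [1]
for _ in range(n + 1):                  # multiply out (1 + a)^{n+1}
    w = whitney_product(w, [1, 1], n)   # [1, 1] encodes 1 + a
print(w)  # [1, 0, 0, 0]: all positive-degree classes vanish
```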


Assuming such classes exist, here are some consequences.
Theorem 11.9.4. If ξ is isomorphic to η, then wi (ξ) = wi (η) for all i.
This shows that Stiefel-Whitney classes classify vector bundles just as homology, cohomology, and homotopy
classify topological spaces. A similarity is that isomorphic vector bundles must have the same Stiefel-Whitney classes,
but having the same classes does not guarantee that two bundles are isomorphic.
Theorem 11.9.5. If ϵ is a trivial vector bundle then wi (ϵ) = 0 for i > 0.
This is true since for a trivial vector bundle ϵ, there is a bundle map from ϵ to a bundle over a point. This theorem
combined with the Whitney product theorem gives the following.
Theorem 11.9.6. If ϵ is trivial, and η is another vector bundle over B, then wi (ϵ ⊕ η) = wi (η).
Theorem 11.9.7. If ξ is an n-plane bundle with a Euclidean metric which possesses a nowhere zero cross-section, then wn(ξ) = 0. If ξ has k cross-sections that are nowhere linearly dependent, then

wn−k+1(ξ) = wn−k+2(ξ) = · · · = wn(ξ) = 0.

11.9.4 Grassmann Manifolds


Grassmann manifolds are an interesting class of manifolds whose cohomology ring can be described in terms of
Stiefel-Whitney classes. I will describe what they are and state some of their properties.
Definition 11.9.12. A Grassmann manifold Gn (Rn+k ) is the set of all n-dimensional planes through the origin of
Rn+k .

Definition 11.9.13. An n-frame in Rn+k is an n-tuple of linearly independent vectors in Rn+k. The collection of all n-frames in Rn+k is an open subset of the n-fold Cartesian product Rn+k × · · · × Rn+k called a Stiefel manifold and denoted Vn(Rn+k).
There is a function
q : Vn (Rn+k ) → Gn (Rn+k )
which maps an n-frame to the n-plane it spans. Milnor and Stasheff define the topology on Gn (Rn+k ) as the quotient
topology determined by this map and use it to prove the following.
Theorem 11.9.8. The Grassmann manifold Gn (Rn+k ) is a compact manifold of dimension nk. The correspondence
X → X ⊥ which assigns each n-plane to its orthogonal k-plane defines a homeomorphism between Gn (Rn+k ) and
Gk (Rn+k ).
The proof of this statement is lengthy, but I will comment on the dimension nk. The dimension of a manifold is d if every point has an open neighborhood homeomorphic to Rd. Now if X0 ∈ Gn(Rn+k), then X0 is an n-plane in Rn+k, so let X0⊥ be its orthogonal k-plane, so that Rn+k = X0 ⊕ X0⊥. Then let U be the open subset of Gn(Rn+k) consisting of all n-planes Y such that the projection p : X0 ⊕ X0⊥ → X0 maps Y onto X0. In other words, Y ∩ X0⊥ = 0. So each Y ∈ U can be considered to be the graph of a linear transformation T(Y) : X0 → X0⊥. So there is a one-to-one correspondence T between U and Hom(X0, X0⊥), and the latter has dimension nk. It takes some work, but we are done once we prove that T is a homeomorphism. (See [102].)
A canonical vector bundle γn (Rn+k ) can be constructed over Gn (Rn+k ) by taking as the total space E(γn (Rn+k ))
the set of pairs
(n-plane in Rn+k , vector in that plane).
We can generalize this construction to R∞ and call the bundle γ n .
Let Gn = Gn (R∞ ) be the set of n-planes in R∞ . Milnor and Stasheff prove the following.
Theorem 11.9.9. The cohomology ring H∗(Gn; Z2) is a polynomial ring over Z2 freely generated (i.e., there are no polynomial relations) by w1(γn), · · · , wn(γn).

11.9.5 Representation of Stiefel-Whitney Classes as Steenrod Squares


We are finally ready to see the connection between Stiefel-Whitney classes and Steenrod squares.
Let ξ be an n-plane vector bundle with total space E, base space B, and projection map π. Let E0 be the nonzero elements of E. (Remember that an element of E is an element of some fiber Fb = π−1(b). So any element of E is an element of some vector space, and it makes sense to talk about nonzero elements.) If F is a typical fiber, let F0 = F ∩ E0 denote the nonzero elements of F.
The following two theorems are proved in a later chapter of [102] with more advanced tools that we have not
covered. See there (Chapter 10) if you are curious.
In this section, we will always work with Z2 coefficients, so H i (X) will be understood to be H i (X; Z2 ).
Theorem 11.9.10. H i (F, F0 ) = 0 for i ̸= n and H n (F, F0 ) = Z2 .
H i (E, E0 ) = 0 for i < n and H i (E, E0 ) = H i−n (B) for i ≥ n.
Theorem 11.9.11. H i (E, E0 ) = 0 for i < n, and H n (E, E0 ) contains a unique class u called the fundamental
cohomology class such that for each fiber F , the restriction u|(F, F0 ) ∈ H n (F, F0 ) is the unique nonzero class in
H n (F, F0 ). In addition the correspondence x → x ∪ u defines an isomorphism H k (E) → H k+n (E, E0 ) for every k.
Now a vector bundle has a zero cross-section which embeds B as a deformation retract of E, so that the projection
π : E → B induces an isomorphism π ∗ : H k (B) → H k (E).
Definition 11.9.14. The Thom isomorphism $\phi : H^k(B) \to H^{k+n}(E, E_0)$ is defined to be the composition of the two isomorphisms
$$H^k(B) \xrightarrow{\pi^*} H^k(E) \xrightarrow{\cup u} H^{k+n}(E, E_0).$$

Using $\phi$ we define the Stiefel-Whitney class $w_i(\xi) \in H^i(B)$ by
$$w_i(\xi) = \phi^{-1} Sq^i \phi(1).$$
Now $\phi(1) = 1 \cup u = u$, so $Sq^i(\phi(1)) = Sq^i(u)$, and $w_i(\xi)$ is the unique cohomology class in $H^i(B)$ such that $\phi(w_i(\xi)) = \pi^* w_i(\xi) \cup u$ is equal to $Sq^i(u)$.
It remains to check that this definition satisfies the four axioms of Stiefel-Whitney classes.
The main takeaway from this is that Steenrod squares have appeared in a classical classification problem from
geometry. Although there is still the weakness that two non-isomorphic bundles could have the same Stiefel-Whitney
classes, they may still be useful given the right type of data.
For any practical problem involving Steenrod squares in data science, we will need to know how to compute them.
That is the topic of the next section.

11.10 Computer Computation of Steenrod Squares


In this section, I will describe in detail the formula for computing Steenrod squares derived by González-Díaz and Real. All of the material comes from [56], except for a simplified formula for $Sq^1$ from Aubrey HB's thesis [65].
The computations will be performed on our old friends, the simplicial sets. We also use a different construction of Steenrod squares than the one presented in Section 11.2. We will use the construction of Steenrod and Epstein [157], in which they construct a series of morphisms $\{D_i\}$ called higher diagonal approximations, which are proved to exist by the acyclic model theorem. Then, letting $C^N_*(X)$ denote the normalized chain complex consisting of only the non-degenerate simplices of $X$, if $c \in \operatorname{Hom}(C^N_j(X), Z_2)$ and $x \in C^N_{i+j}(X)$, then
$$Sq^i(c)(x) = \mu(\langle c \otimes c, D_{j-i}(x) \rangle), \quad i \le j,$$
and $Sq^i(c)(x) = 0$ if $i > j$. Here $\mu$ is the homomorphism induced by multiplication in $Z_2$.
So we are left with the problem of writing down a formula for $D_i$. Real derived such a formula in [132] in terms of the component morphisms of the Eilenberg-Zilber contraction from $C^N_*(X \times X)$ onto $C^N_*(X) \otimes C^N_*(X)$. The formula for $D_i$ involves an explicit formula for the homotopy operator. (All of these terms will be defined soon.) But the homotopy operator is defined in terms of partitions or shuffles of the face and degeneracy operators of the simplicial sets. This would give complexity of order $2^n$ to evaluate an element of dimension $n$. The main contribution of [56] is to drastically reduce this complexity by realizing that compositions of the face and degeneracy operators of $X$ can always be written as
$$s_{j_t} \cdots s_{j_1} \partial_{i_1} \cdots \partial_{i_s},$$
where $j_t > \cdots > j_1 \ge 0$ and $i_s > \cdots > i_1 \ge 0$. Since the image of the morphisms of the Eilenberg-Zilber contraction is always normalized, we can throw away any terms with a degeneracy operator in their expression. This produces a much simpler formula for the $D_i$.
I won't repeat the definition of a simplicial set, but we will use $\partial_i$ for face operators and $s_i$ for degeneracy operators. A simplex $x$ is degenerate if $x = s_i y$ for some simplex $y$ and degeneracy operator $s_i$. A simplex which is not degenerate is non-degenerate. The proofs in the paper make heavy use of the relation $\partial_i s_j = s_{j-1} \partial_i$ for $i < j$. This puts a composition of face and degeneracy operators in the standard form and determines which terms are degenerate. For an $n$-simplex, we use $d_n = \sum_{i=0}^n (-1)^i \partial_i$ as the homology boundary and $\delta^n$ as the corresponding coboundary.
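The rewriting into standard form can be sketched algorithmically. In the Python below (my own encoding), the relation above is supplemented by the other standard simplicial identities $\partial_j s_j = \partial_{j+1} s_j = \mathrm{id}$ and $\partial_i s_j = s_j \partial_{i-1}$ for $i > j+1$; it only pushes face operators past degeneracies, without reordering the degeneracies among themselves.

```python
def push_faces_right(word):
    """word: list of ('s', j) / ('d', i) pairs written left to right, as in
    the standard form s_{j_t} ... s_{j_1} d_{i_1} ... d_{i_s}.  Rewrites
    every adjacent pattern d_i s_j using the simplicial identities
        d_i s_j = s_{j-1} d_i       if i < j
        d_j s_j = d_{j+1} s_j = id
        d_i s_j = s_j d_{i-1}       if i > j + 1
    until all degeneracies are to the left of all face operators."""
    word = list(word)
    changed = True
    while changed:
        changed = False
        for k in range(len(word) - 1):
            (t1, i), (t2, j) = word[k], word[k + 1]
            if t1 == 'd' and t2 == 's':
                if i < j:
                    word[k:k + 2] = [('s', j - 1), ('d', i)]
                elif i in (j, j + 1):
                    word[k:k + 2] = []      # the pair cancels
                else:
                    word[k:k + 2] = [('s', j), ('d', i - 1)]
                changed = True
                break
    return word

print(push_faces_right([('d', 1), ('s', 2)]))  # [('s', 1), ('d', 1)]
print(push_faces_right([('d', 2), ('s', 2)]))  # []
```

A word whose standard form contains a degeneracy operator produces a degenerate simplex, which is exactly what lets [56] discard such terms.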
Letting $X$ be a simplicial set and $C_*(X)$ be the chain complex $\{C_n(X), d_n\}$, define $s(C_*(X))$ as the graded $R$-module generated by all of the degenerate simplices. In $C_*(X)$, $d_n(s(C_{n-1}(X))) \subset s(C_{n-2}(X))$, so $C^N_*(X) = \{C_n(X)/s(C_{n-1}(X)), d_n\}$ is a chain complex called the normalized chain complex associated to $X$. If $G$ is a ring, let $C^*(X; G)$ be the cochain complex associated to $C^N_*(X)$.
At the end of Chapter 9, I spoke about reductions. Following Eilenberg and Mac Lane [45], I will use the term
contraction and repeat the definition with the notation used in [56].
Definition 11.10.1. Let N and M be chain complexes. A contraction from N onto M is a triple (f, g, ϕ), where
f : N → M is called projection, g : M → N is called inclusion, and ϕ : N → N is called homotopy and raises the
degree by 1. (Note that the analogy is with chain homotopy.) These maps satisfy the following relations:

1. f g = 1M .
2. ϕd + dϕ + gf = 1N .
3. ϕg = 0.
4. f ϕ = 0.
5. ϕϕ = 0.
This definition implies an equivalence between the big complex N and the small complex M.
Definition 11.10.2. Let (f1 , g1 , ϕ1 ) and (f2 , g2 , ϕ2 ) be two contractions. We can construct two additional contractions:
1. The tensor product contraction (f1 ⊗ f2 , g1 ⊗ g2 , ϕ1 ⊗ g2 f2 + 1N1 ⊗ ϕ2 ) from N1 ⊗ N2 to M1 ⊗ M2 .
2. If N2 = M1 , the composition contraction (f2 f1 , g1 g2 , ϕ1 + g1 ϕ2 f1 ) from N1 to M2 .
The formula for Steenrod squares comes from an explicit formula for a particular contraction from C∗N (X × Y )
to C∗N (X) ⊗ C∗N (Y ) called the Eilenberg-Zilber contraction [48]. Before I define it, I will have to teach you a new
dance, the (p, q)-shuffle.
Definition 11.10.3. If p and q are non-negative integers, a (p, q)-shuffle (α, β) is a partition of the set
{0, 1, · · · , p + q − 1}
of integers into disjoint subsets α1 < · · · < αp and β1 < · · · < βq of p and q integers respectively. The signature of
the shuffle (α, β) is defined by
sig(α, β) = Σ_{i=1}^{p} (αi − (i − 1)).

Example 11.10.1. Let p = 4, q = 6, α = {3, 5, 6, 8}, and β = {0, 1, 2, 4, 7, 9}. Then


sig(α, β) = (3 − 0) + (5 − 1) + (6 − 2) + (8 − 3) = 3 + 4 + 4 + 5 = 16.
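Shuffles are straightforward to enumerate. A small sketch (mine; `shuffles` and `signature` are hypothetical helper names): every p-element subset α of {0, …, p+q−1} determines a (p, q)-shuffle, with β the complement.

```python
from itertools import combinations

def shuffles(p, q):
    """Enumerate all (p, q)-shuffles (alpha, beta) of {0, ..., p+q-1}."""
    universe = range(p + q)
    for alpha in combinations(universe, p):
        beta = tuple(k for k in universe if k not in alpha)
        yield alpha, beta

def signature(alpha):
    """sig(alpha, beta) = sum_i (alpha_i - (i - 1)), alpha_1 < ... < alpha_p."""
    return sum(a - i for i, a in enumerate(alpha))

# Example 11.10.1: p = 4, q = 6, alpha = {3, 5, 6, 8}.
assert signature((3, 5, 6, 8)) == 16
# There are C(p+q, p) shuffles; for p = 4, q = 6 that is 210.
assert sum(1 for _ in shuffles(4, 6)) == 210
```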
Definition 11.10.4. The Eilenberg-Zilber contraction from C∗N (X × Y ) to C∗N (X) ⊗ C∗N (Y ), where X and Y are
simplicial sets, consists of the triple (AW, EML, SHI) defined by:
1. The Alexander-Whitney operator
AW : C∗N (X × Y ) → C∗N (X) ⊗ C∗N (Y )
defined as
AW (am × bm ) = Σ_{i=0}^{m} ∂i+1 · · · ∂m am ⊗ ∂0 · · · ∂i−1 bm .
If X = Y , AW is a simplicial approximation to the diagonal and this operator allows for the construction of
cup products. Interchanging am and bm produces a different approximation and comparing them leads to the
Steenrod squares.
2. The Eilenberg-Mac Lane operator
EML : C∗N (X) ⊗ C∗N (Y ) → C∗N (X × Y )
defined by
EML(ap ⊗ bq ) = Σ_{(α,β) ∈ {(p,q)-shuffles}} (−1)^{sig(α,β)} sβq · · · sβ1 ap × sαp · · · sα1 bq .

This operator is a process of "triangulation" in X × Y .


11.10. COMPUTER COMPUTATION OF STEENROD SQUARES 275

3. The Shih operator
SHI : C∗N (X × Y ) → C^N_{∗+1} (X × Y )
is defined by
SHI(a0 × b0 ) = 0;
SHI(am × bm ) = Σ (−1)^{m̄+sig(α,β)+1} s_{βq+m̄} · · · s_{β1+m̄} s_{m̄−1} ∂_{m−q+1} · · · ∂m am × s_{αp+1+m̄} · · · s_{α1+m̄} ∂_{m̄} · · · ∂_{m−q−1} bm ,
where the sum runs over 0 ≤ q ≤ m − 1, 0 ≤ p ≤ m − q − 1, and (α, β) ∈ {(p + 1, q)-shuffles}, with m̄ = m − p − q and sig(α, β) = Σ_{i=1}^{p+1} (αi − (i − 1)).

Eilenberg and Mac Lane [46] derived a recursive formula for SHI, but the one here was stated by Rubio in [138]
and proved by Morace in the appendix to [131].
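With simplices encoded as vertex tuples (the same hypothetical encoding as before), AW and EML reduce to short functions. For EML I use the standard reading of a (p, q)-shuffle as a monotone staircase path: step k raises the first coordinate if k ∈ α and the second if k ∈ β. This is a sketch of the two operators' combinatorics, not of the full chain-complex machinery:

```python
from itertools import combinations

def aw(a, b):
    """Alexander-Whitney: AW(a x b) = sum_i (front i-face of a) ⊗ (back face of b).
    a and b are vertex tuples of the same length m+1; returns (left, right) pairs."""
    m = len(a) - 1
    return [(a[:i + 1], b[i:]) for i in range(m + 1)]

def eml(a, b):
    """Eilenberg-Mac Lane: signed sum over (p, q)-shuffles of product simplices.
    Each shuffle is read as a staircase path in the p-by-q grid."""
    p, q = len(a) - 1, len(b) - 1
    terms = []
    for alpha in combinations(range(p + q), p):
        sign = (-1) ** sum(s - i for i, s in enumerate(alpha))
        x = y = 0
        verts = [(a[0], b[0])]
        for k in range(p + q):
            if k in alpha:
                x += 1          # step in the a-direction
            else:
                y += 1          # step in the b-direction
            verts.append((a[x], b[y]))
        terms.append((sign, tuple(verts)))
    return terms

# AW on the product of two 1-simplices splits into front ⊗ back pieces:
assert aw((0, 1), (0, 1)) == [((0,), (0, 1)), ((0, 1), (1,))]
# EML triangulates the square into two signed triangles:
assert eml((0, 1), (0, 1)) == [(1, ((0, 0), (1, 0), (1, 1))),
                               (-1, ((0, 0), (0, 1), (1, 1)))]
```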
We will need to define some more maps.
The diagonal map
∆ : C∗N (X) → C∗N (X × X)
is defined by ∆(x) = (x, x).
The automorphism
t : C∗N (X × X) → C∗N (X × X)
is defined by t(x1 , x2 ) = (x2 , x1 ).
The automorphism
T : C∗N (X) ⊗ C∗N (X) → C∗N (X) ⊗ C∗N (X)
is defined by T (x1 ⊗ x2 ) = (−1)dim(x1 )dim(x2 ) x2 ⊗ x1 .
Now the AW operator is not commutative. Being an approximation to the diagonal, though, it can be used to
compute cup products. Let R be a ring, c ∈ C^i (X; R), c′ ∈ C^j (X; R), and x ∈ C^N_{i+j} (X). Then the cup product of c and c′ is

(c ∪ c′ )(x) = µ(⟨c ⊗ c′ , AW ∆(x)⟩) = µ(⟨c, ∂i+1 · · · ∂i+j x⟩ ⊗ ⟨c′ , ∂0 · · · ∂i−1 x⟩),

where µ is induced by multiplication in R.
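As a sketch of this formula (my own encoding again: a cochain is a function on vertex tuples, a simplex is a vertex tuple), the cup product just pairs c with the front i-face and c′ with the back j-face:

```python
def cup(c, cprime, i, j):
    """Cup product of an i-cochain c and a j-cochain cprime, both given as
    functions on vertex tuples, evaluated on an (i+j)-simplex x:
    (c ∪ c')(x) = c(front i-face of x) * c'(back j-face of x)."""
    def product(x):
        assert len(x) == i + j + 1
        return c(x[:i + 1]) * cprime(x[i:])
    return product

# A Z2 example on the 2-simplex (0, 1, 2) with two 1-cochains:
c = lambda e: 1 if e == (0, 1) else 0       # indicator of the edge (0, 1)
cprime = lambda e: 1 if e == (1, 2) else 0  # indicator of the edge (1, 2)
assert cup(c, cprime, 1, 1)((0, 1, 2)) % 2 == 1
assert cup(cprime, c, 1, 1)((0, 1, 2)) % 2 == 0
```

Swapping c and c′ gives a different answer on the same simplex, illustrating the non-commutativity of AW noted above.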


In [155], Steenrod builds a sequence of maps {Di } called higher diagonal approximations where

Di : C∗N (X) → C∗N (X) ⊗ C∗N (X)

defined by

D0 = AW ∆,
d⊗ Di+1 + (−1)^i Di+1 d = T Di + (−1)^{i+1} Di ,

where d and d⊗ are the homology differentials (boundary maps) of C∗N (X) and C∗N (X) ⊗ C∗N (X) respectively. It
turns out that the Di can be expressed in the form Di = hi ∆ where

hi : C∗N (X × X) → C∗N (X) ⊗ C∗N (X)

is a homomorphism of degree i and whose existence can be proved using the acyclic model theorem. This gives a
recursive formula for the {Di }.

Instead of following this approach, González-Díaz and Real make use of the explicit formula for the Eilenberg-
Zilber contraction. It is shown in [132] that hi = AW (t(SHI))^i for all i. The formula for the Steenrod square

Sq i : H j (X; Z2 ) → H j+i (X; Z2 )

is now
Sq^i (c)(x) = µ(⟨c ⊗ c, AW (t(SHI))^{j−i} (x, x)⟩)
for i ≤ j, and Sq^i (c)(x) = 0 for i > j.
Now the image of hi lies in C∗N (X) ⊗ C∗N (X), so expressing the compositions of face and degeneracy operators
of the summands of the formulas in the standard form

sjt · · · sj1 ∂i1 · · · ∂is ,

determines which terms can be eliminated. We only keep terms with no degeneracy operators. In this way, González-
Díaz and Real reduce the amount of work required to compute Steenrod squares and make these computations tractable
in many cases.
Now I will state some of the explicit formulas from [56]. As scary as they may look at first, think of how easy
they would be to implement on a computer. The proofs involve some combinatorics and a lot of messy arithmetic, but
follow directly from the formulas we have given. See [56] for the details.
Theorem 11.10.1. Let R be a ring and X a simplicial set. Let (AW, EML, SHI) be the Eilenberg-Zilber contraction
from C∗N (X × X) onto C∗N (X) ⊗ C∗N (X). Then the morphism

hn = AW ((t(SHI))^n ) : C^N_m (X × X) → (C∗N (X) ⊗ C∗N (X))_{m+n}

can be expressed in the form:



AW ((t(SHI))^n ) = Σ_{i_n=n}^{m} Σ_{i_{n−1}=n−1}^{i_n−1} · · · Σ_{i_0=0}^{i_1−1} (−1)^{A(n)+B(n,m,ı)+C(n,ı)+D(n,m,ı)}
∂_{i_0+1} · · · ∂_{i_1−1} ∂_{i_2+1} · · · ∂_{i_{n−1}−1} ∂_{i_n+1} · · · ∂_m
⊗ ∂_0 · · · ∂_{i_0−1} ∂_{i_1+1} · · · ∂_{i_{n−2}−1} ∂_{i_{n−1}+1} · · · ∂_{i_n−1}

if n is even, or

AW ((t(SHI))^n ) = Σ_{i_n=n}^{m} Σ_{i_{n−1}=n−1}^{i_n−1} · · · Σ_{i_0=0}^{i_1−1} (−1)^{A(n)+B(n,m,ı)+C(n,ı)+D(n,m,ı)}
∂_{i_0+1} · · · ∂_{i_1−1} ∂_{i_2+1} · · · ∂_{i_{n−2}−1} ∂_{i_{n−1}+1} · · · ∂_{i_n−1}
⊗ ∂_0 · · · ∂_{i_0−1} ∂_{i_1+1} · · · ∂_{i_{n−1}−1} ∂_{i_n+1} · · · ∂_m

if n is odd,
where
A(n) = 1 if n ≡ 3, 4, 5, 6 mod 8, and A(n) = 0 otherwise;
B(n, m, ı) = Σ_{j=0}^{⌊n/2⌋} i_{2j} if n ≡ 1, 2 mod 4, and B(n, m, ı) = Σ_{j=0}^{⌊(n−1)/2⌋} i_{2j+1} + nm if n ≡ 0, 3 mod 4;

C(n, ı) = Σ_{j=0}^{⌊n/2⌋} (i_{2j} + i_{2j−1} )(i_{2j−1} + · · · + i0 )
and
D(n, m, ı) = (m + in )(in + · · · + i0 ) if n is even, and D(n, m, ı) = 0 if n is odd,
where ı = (i0 , i1 , · · · , in ).
The theorem gives an explicit formula for the cup-i product, since for c ∈ C^p (X; R) and c′ ∈ C^q (X; R), we have

(c ∪i c′ )(x) = µ(⟨c ⊗ c′ , Di (x)⟩),

for all i and x ∈ C^N_{p+q−i} (X).
In a similar way, we get a formula for Steenrod squares. Since we said that

Sq^i (c)(x) = µ(⟨c ⊗ c, AW (t(SHI))^{j−i} (x, x)⟩),

we get the following.


Theorem 11.10.2. Let X be a simplicial set, c ∈ C^j (X; Z2 ), and x ∈ C^N_{i+j} (X). Then

Sq^i : H^j (X; Z2 ) → H^{j+i} (X; Z2 )

is defined by:
• If i ≤ j and i + j is even, then:
Sq^i (c)(x) = Σ_{i_n=S(n)}^{m} Σ_{i_{n−1}=S(n−1)}^{i_n−1} · · · Σ_{i_1=S(1)}^{i_2−1}
µ(⟨c, ∂_{i_0+1} · · · ∂_{i_1−1} ∂_{i_2+1} · · · ∂_{i_{n−1}−1} ∂_{i_n+1} · · · ∂_m x⟩
⊗ ⟨c, ∂_0 · · · ∂_{i_0−1} ∂_{i_1+1} · · · ∂_{i_{n−2}−1} ∂_{i_{n−1}+1} · · · ∂_{i_n−1} x⟩).

• If i ≤ j and i + j is odd, then:


Sq^i (c)(x) = Σ_{i_n=S(n)}^{m} Σ_{i_{n−1}=S(n−1)}^{i_n−1} · · · Σ_{i_1=S(1)}^{i_2−1}
µ(⟨c, ∂_{i_0+1} · · · ∂_{i_1−1} ∂_{i_2+1} · · · ∂_{i_{n−2}−1} ∂_{i_{n−1}+1} · · · ∂_{i_n−1} x⟩
⊗ ⟨c, ∂_0 · · · ∂_{i_0−1} ∂_{i_1+1} · · · ∂_{i_{n−1}−1} ∂_{i_n+1} · · · ∂_m x⟩).

• If i > j, then Sq i (c)(x) = 0.


Here n = j − i, m = i + j, and
S(k) = i_{k+1} − i_{k+2} + · · · + (−1)^{k+n−1} i_n + (−1)^{k+n} ⌊(m + 1)/2⌋ + ⌊k/2⌋,
for all 0 ≤ k ≤ n, with i_0 = S(0).
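The nested summation in Theorem 11.10.2 can be enumerated mechanically. The sketch below (mine) counts the summands of Sq^i(c)(x) under the stated limits, with each i_k running from S(k) to i_{k+1} − 1 and i_n running up to m; for n = 0 it returns the single cup-square term.

```python
def S(k, chosen, n, m):
    """S(k) = i_{k+1} - i_{k+2} + ... + (-1)^{k+n-1} i_n
              + (-1)^{k+n} floor((m+1)/2) + floor(k/2).
    `chosen` maps t -> i_t for the already-fixed indices t = k+1, ..., n."""
    alt = sum((-1) ** (t - k - 1) * chosen[t] for t in range(k + 1, n + 1))
    return alt + (-1) ** (k + n) * ((m + 1) // 2) + k // 2

def count_terms(i, j):
    """Number of summands in the formula for Sq^i(c)(x), c of dimension j, i <= j."""
    n, m = j - i, i + j
    def rec(k, chosen):
        if k == 0:
            return 1          # i_0 = S(0) is determined by the other indices
        lo = S(k, chosen, n, m)
        hi = m if k == n else chosen[k + 1] - 1
        return sum(rec(k - 1, {**chosen, k: v}) for v in range(lo, hi + 1))
    return rec(n, {})

assert count_terms(3, 3) == 1   # n = 0: Sq^j on dimension j is the cup square
assert count_terms(2, 3) == 3   # n = 1: i_1 ranges over i + 1 values
```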
Finally, [56] evaluates the complexity. Here is the main result.

Theorem 11.10.3. Let X be a simplicial set and k be a non-negative integer. If c ∈ C i+k (X; Z2 ), then the number of
face operators taking part in the formula for Sq i (c) is O(ik+1 ).
This looks like bad news: the complexity is exponential in k. Is it so bad that you won't live long enough to see the
computation finish? Let's take a closer look. We start with the example in [56].
Example 11.10.2. Suppose there are O(k^2 ) non-degenerate simplices in each Xk . The number of face operators in
Sq^i (c) for c ∈ C^{i+2} (X; Z2 ) is O(i^5 ). This is because the number of face operators in the formula is O(i^3 ), and we
multiply this by the number of simplices in X2i+2 , as we need to evaluate Sq^i (c) on each of them.
Here are some reasons why this is not so bad.

1. In TDA, we normally work in low dimensions. If c is of dimension 3, for example, the worst complexity is 3^3 ,
or 27, face operators per simplex.
2. Remember what a face operator is. We have an ordered list of vertices and we remove one corresponding to a
particular index. That should be easy and fast to implement.

3. The lower the dimension you start with, the fewer nonzero Steenrod squares.
4. If you want to just use Sq^1 as a feature, you can implement HB's simplified formula, which I state below.
What I have not seen is any timing experiments. It might be worth a try before giving up on Steenrod squares. I
have not done any myself, but I think you will be pleasantly surprised.
To conclude, here is the promised simplified formula for Sq 1 from Aubrey HB’s thesis [65]. See there for the
proof.
Theorem 11.10.4. Let c ∈ C^p (X; Z2 ). Then

Sq^1 (c)(⟨v0 , v1 , · · · , vp+1 ⟩) = Σ_{i=0}^{⌊p/2⌋} Σ_{j=0}^{i−1} c(⟨v0 , · · · , v̂_{2i} , · · · , vp+1 ⟩) · c(⟨v0 , · · · , v̂_{2j} , · · · , vp+1 ⟩)
+ Σ_{i=0}^{⌊p/2⌋} Σ_{j=0}^{i−1} c(⟨v0 , · · · , v̂_{2i−1} , · · · , vp+1 ⟩) · c(⟨v0 , · · · , v̂_{2j−1} , · · · , vp+1 ⟩),

where the notation v̂_k means that vertex vk is left out, and all addition and multiplication is carried out in Z2 .
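This formula translates almost line-by-line into code. The sketch below (mine, not from [65]) represents a mod-2 p-cochain as a function on vertex tuples; I have assumed that summands whose hatted index falls outside the simplex (such as v̂_{−1}, which arises when j = 0 in the second sum) are simply omitted, so check [65] for the exact index conventions before relying on this.

```python
def sq1(c, p, simplex):
    """Evaluate Sq^1(c) on a (p+1)-simplex (a vertex tuple of length p+2),
    where c is a mod-2 p-cochain given as a function on vertex tuples.
    Terms with an out-of-range hatted index are skipped (an assumed convention)."""
    def drop(k):                       # the face with vertex v_k left out
        return simplex[:k] + simplex[k + 1:]
    total = 0
    for offset in (0, -1):             # first sum: hats at 2i, 2j; second: 2i-1, 2j-1
        for i in range(p // 2 + 1):
            for j in range(i):
                a, b = 2 * i + offset, 2 * j + offset
                if a < 0 or b < 0:
                    continue           # skip v̂_{-1}-type terms (assumption)
                total += c(drop(a)) * c(drop(b))
    return total % 2

# p = 2: on <0,1,2,3> only the (i, j) = (1, 0) term of the first sum survives.
c = lambda t: 1 if t in {(0, 1, 3), (1, 2, 3)} else 0
assert sq1(c, 2, (0, 1, 2, 3)) == 1
```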
Chapter 12

Homotopy Groups of Spheres

As the homotopy groups of spheres form the last chapter in Hu's book, this seems a good place to conclude. A lot has happened
since 1959, and this topic is now the subject of a number of entire books, so I am not going to even try to cover all of
it. Instead, I will give you a small taste of it and provide you with a table which you can use for obstruction theory or
just to satisfy your curiosity.
I will start with a short description of stable homotopy groups of spheres that appears in Hatcher [63]. These are
sequences of homotopy groups you can get using the Freudenthal suspension theorem. Although these groups seem
chaotic, there are some interesting patterns when you look at p-components of these groups (i.e., quotients in which
elements of order prime to p are eliminated). Then I will talk about classes of abelian groups in which notions such as
isomorphism are generalized. This is a useful tool for proving theorems about p-components of abelian groups. Then I
will outline some of the techniques and examples from the last chapter of Hu [74]. I would encourage you to read it if
you ever need to deal with these groups as it is one of the only places I know of that includes explicit generators. Then
I will briefly mention some of the more advanced techniques such as the Adams spectral sequence. I will also mention
a very recent paper that claims to have found a huge number of new groups with machine computations. Finally, I will
give you a table of πi (S n ) for 1 ≤ i ≤ 15 and 1 ≤ n ≤ 10 courtesy of H. Toda [162].
This material will stretch the limit of my own knowledge, so I can only give you an idea of a lot of it, but it is
something I would like to look into further. Maybe I will have more to say in a sequel.

12.1 Stable Homotopy Groups


The material in this section is taken from Hatcher [63].
The Freudenthal suspension theorem (Theorem 9.7.8) states that the suspension map πi (X) → πi+1 (SX) for an
(n − 1)-connected CW complex X is an isomorphism for i < 2n − 1. Raising dimensions by 1, for an
n-connected complex the map is an isomorphism for i < 2(n + 1) − 1 = 2n + 1. In particular, it is an isomorphism
for i ≤ n, so the suspension SX is (n + 1)-connected.

Definition 12.1.1. The sequence of iterated suspensions

πi (X) → πi+1 (SX) → πi+2 (S 2 X) → · · ·

eventually has all maps become isomorphisms. The group that results is the stable homotopy group πiS (X). The
range of dimensions where the suspension is an isomorphism is called the stable range and its study is called stable
homotopy theory. The terms unstable range and unstable homotopy theory are often used for the opposites.

Definition 12.1.2. The group πiS (S 0 ) equals πi+n (S n ) for n > i + 1. We write πiS for πiS (S 0 ) and call it the stable
i-stem.

280 CHAPTER 12. HOMOTOPY GROUPS OF SPHERES
i      0   1    2    3     4   5   6    7     8        9
πiS    Z   Z2   Z2   Z24   0   0   Z2   Z240  (Z2 )2   (Z2 )3
i      10   11    12   13   14       15          16       17       18        19
πiS    Z6   Z504  0    Z3   (Z2 )2   Z480 ⊕ Z2   (Z2 )2   (Z2 )4   Z8 ⊕ Z2   Z264 ⊕ Z2

Table 12.1.1: Stable homotopy groups πiS for i ≤ 19 [63, 162].

It turns out that πiS is finite for i > 0. At the time that Hatcher was writing his book, these were known for i ≤ 61.
The recent paper by Isaksen, Wang, and Xu [75] has increased this to 90 with only a couple of exceptions. I will talk
more about that later. Table 12.1.1 was borrowed by Hatcher from [162]. It shows the groups πiS for i ≤ 19. Note that
I will use the notation (Zp )n for Zp ⊕ · · · ⊕ Zp (n times).
There are some interesting patterns when looking at p-components of these groups.
Definition 12.1.3. Let G be a group. Then the p-component of G is the quotient of G obtained by factoring out all
elements of G whose orders are relatively prime to p.
Example 12.1.1. Let G = Z ⊕ (Z2 )3 ⊕ (Z5 )2 . Then the 2-component of G is Z ⊕ (Z2 )3 and the 5-component is
Z ⊕ (Z5 )2 . The 3-component is Z.
Example 12.1.2. For a finite abelian group, the p-component consists of all elements whose order is a power of p.
Letting H = (Z2 )3 ⊕ (Z5 )2 , the 2-component of H is (Z2 )3 , the 5-component is (Z5 )2 , and the 3-component is 0.
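For a finitely generated abelian group given as a direct sum of cyclic groups, the p-component is computable summand by summand: a Z summand survives untouched, and a Zn summand contributes Z_{p^v}, where p^v is the largest power of p dividing n. A sketch (mine), encoding the group as a list of cyclic orders with 0 standing for Z:

```python
def p_component(cyclic_orders, p):
    """p-component of a f.g. abelian group, given as a list of cyclic summand
    orders (0 encodes Z): kill every element of order prime to p."""
    result = []
    for n in cyclic_orders:
        if n == 0:
            result.append(0)        # Z survives: no nonzero element has finite order
        else:
            q = 1
            while n % p == 0:
                q *= p
                n //= p
            if q > 1:
                result.append(q)    # Z_n contributes Z_{p^v} with p^v || n
    return result

# Example 12.1.1: G = Z ⊕ (Z2)^3 ⊕ (Z5)^2
G = [0, 2, 2, 2, 5, 5]
assert p_component(G, 2) == [0, 2, 2, 2]
assert p_component(G, 5) == [0, 5, 5]
assert p_component(G, 3) == [0]
```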

Figure 12.1.1: 2-components of πiS for i ≤ 60 [63].

Figure 12.1.1 from [63] is a diagram of the 2-components of πiS for i ≤ 60. A vertical chain of n dots in column
i represents a Z2n summand of πiS . The bottom dot is a generator and moving up vertically represents multiplication
by 2. The three generators η, ν, and σ for i = 1, 3, 7 are the Hopf maps S 3 → S 2 , S 7 → S 4 , and S 15 → S 8
respectively. The horizontal and diagonal lines provide information about compositions of maps between spheres. We
have products πiS × πjS → πi+jS
defined by compositions S i+j+k → S j+k → S k . Hatcher proves that this product
produces a graded ring structure on stable homotopy groups.
Theorem 12.1.1. The composition products πiS × πjS → πi+jS induce a graded ring structure on π∗S = ⊕i πiS such
that for α ∈ πiS and β ∈ πjS , αβ = (−1)^{ij} βα.
12.2. CLASSES OF ABELIAN GROUPS 281

It then follows that the p-components p π∗S = ⊕i (p πiS ) also form a graded ring with the same property. In 2 πiS
many of the compositions with suspensions of the Hopf maps η and ν are nontrivial. These are indicated in the diagram
by segments extending one or two units to the right, diagonally for η and horizontally for ν. In the far left corner, you
can see the relation η 3 = 4ν (each vertical step is multiplication by 2) in 2 π3S . Now Table 12.1.1 shows π3S ∼
= Z24 so
the 2-component is Z8 . So in π3S the relation is η 3 = 12ν since 2η = 0 (there is no dot directly above η), so 2η 3 = 0
and η 3 must be the unique element of order 2 in Z24 .
At the bottom we see a repeating pattern with spikes in dimensions 8k − 1. The spike in dimension 2^m (2n + 1) − 1
has height m + 1. (Try some examples.)

Figure 12.1.2: 3-components of πiS for i ≤ 100 [63].

Figure 12.1.2 shows the 3-components of πiS for i ≤ 100. Now vertical segments are multiplication by 3, and the solid
diagonal and horizontal segments denote composition with α1 ∈ 3π3S and β1 ∈ 3π10S respectively. The dashed lines
involve a more complicated composition known as a Toda bracket. These are described briefly in Hatcher and in more
detail in Mosher and Tangora [106]. For now, I would call your attention to the regularity of the bottom of the diagram,
where there is a spike of height m + 1 in dimension 4k − 1, where m has the property that 3^m is the highest power of
3 dividing 4k.
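Both spike patterns reduce to p-adic valuations: for p = 2 the spike in dimension d = 2^m(2n+1) − 1 has height m + 1, so m is the number of factors of 2 in d + 1, and for p = 3 the spike in dimension d = 4k − 1 has height v3(4k) + 1. A sketch of these rules (mine; it encodes the stated patterns, it does not compute the groups):

```python
def valuation(n, p):
    """Largest e with p^e dividing n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

def spike_height(d, p):
    """Height of the p-primary spike in dimension d, per the patterns in the text:
    p = 2: d = 2^m(2n+1) - 1 has height m + 1;
    p = 3: d = 4k - 1 has height v_3(4k) + 1."""
    if p == 2:
        return valuation(d + 1, 2) + 1
    if p == 3 and (d + 1) % 4 == 0:
        return valuation(d + 1, 3) + 1
    raise ValueError("pattern stated only for p = 2 and p = 3 spike dimensions")

# d = 7: 2-component of pi_7^S = Z_240 is Z_16, a chain of height 4.
assert spike_height(7, 2) == 4
# d = 15: 2-component of Z_480 ⊕ Z_2 contains Z_32, height 5.
assert spike_height(15, 2) == 5
# d = 11: 3-component of pi_11^S = Z_504 is Z_9, height 2.
assert spike_height(11, 3) == 2
```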
Even more regularity appears with higher primes. Figure 12.1.3 shows the 5-components for i ≤ 1000.
Hatcher produced these diagrams from tables published in Kochman [83] and Kochman and Mahowald [84] for
p = 2. The computations for p = 3, 5 are from Ravenel [130].
I haven’t told you how any of those groups were computed. I will give some small examples. But first we will
need to talk about classes of abelian groups.

12.2 Classes of Abelian Groups


Although this subject is also covered in Hu [74], I will take the material in this section from Mosher and Tangora
[106].
Our goal will be to show that if X and Y are simply connected, all of their homology groups are finitely
generated (always true for a finite complex), and f : X → Y induces an isomorphism on homology with Zp
coefficients, then the p-components of their homotopy groups are also isomorphic. This is a valuable tool in computing
homotopy groups.

Definition 12.2.1. A class of abelian groups is a collection C of abelian groups satisfying the following axiom:
Axiom 1: If 0 → A′ → A → A′′ → 0 is a short exact sequence, then A is in C if and only if A′ and A′′ are in C.

Figure 12.1.3: 5-components of πiS for i ≤ 1000 [63].

This means that classes of abelian groups are closed under formation of subgroups, quotient groups, and group
extensions. (By definition A is an extension of A′′ by A′ as it is the middle term in the short exact sequence. This is a
direct product if the sequence splits.)
We will always assume that C is nonempty.

Definition 12.2.2. A homomorphism f : A → B is a C-monomorphism if ker f ∈ C. It is a C-epimorphism if
coker f ∈ C. It is a C-isomorphism if it is both a C-monomorphism and a C-epimorphism.

Note that a C-isomorphism does not necessarily have an inverse. For this reason, the relation A ∼ B if there is a
C-isomorphism from A to B is reflexive but not symmetric. We say A and B are C-isomorphic if they are equivalent
by the smallest reflexive, symmetric, and transitive relation containing ∼. Equivalently, there is a finite sequence
{A = A0 , A1 , A2 , · · · , An = B} such that for each i with 0 ≤ i < n, there is a C-isomorphism between Ai and Ai+1
in one direction or the other.
We will also make use of the following axioms:
Axiom 2A: If A and B are in C, then so are A ⊗ B and Tor(A, B).
Axiom 2B: If A is in C, then so is A ⊗ B for every abelian group B.
Axiom 3: If A is in C, then so is Hn (A, 1; Z) for every n > 0.
Axiom 2B implies 2A, since Tor(A, B) is a subgroup of A ⊗ R for a certain group R.

Example 12.2.1. C0 contains only the group G = {0} having one element. Then C0 -monomorphism, C0 -epimorphism,
and C0 -isomorphism reduce to the usual definitions. This satisfies axioms 1, 2B, and 3. Axiom 2A doesn’t apply since
C0 does not contain two distinct groups.

Example 12.2.2. CF G is the class of finitely generated abelian groups. It satisfies axioms 1, 2A, and 3. It can fail for
2B if B is not finitely generated.

For axiom 3, we know that K(Z, 1) = S 1 and K(Z2 , 1) = P ∞ . They have finitely many cells in each dimension
so their homology is finitely generated. A similar structure can be built for K(Zp , 1), where p is an odd prime. So
axiom 3 holds.

Example 12.2.3. Cp , where p is a prime, is the class of abelian torsion groups of finite exponent (i.e., the least common
multiple of the orders of the group elements is finite) in which the order of every element is relatively prime to p. This
class satisfies axioms 1, 2B, and 3, but the latter is not obvious.

We will generalize three theorems we saw earlier to make use of classes of abelian groups. See [106] for the
proofs.

Theorem 12.2.1. Hurewicz Theorem mod C: If X is simply connected and C is a class satisfying axioms 1, 2A, and
3, then if πi (X) is in C for all i < n, we have that Hi (X) is in C for all i < n, and the Hurewicz homomorphism
h : πn (X) → Hn (X) is a C-isomorphism.

Theorem 12.2.2. Relative Hurewicz Theorem mod C: Let A be a subspace of X and let X and A be simply
connected. Also suppose that i# : π2 (A) → π2 (X) is an epimorphism where i : A → X is inclusion. Suppose C is
a class satisfying axioms 1, 2B, and 3. Then if πi (X, A) is in C for all i < n, we have that Hi (X, A) is in C for all
i < n, and h : πn (X, A) → Hn (X, A) is a C-isomorphism.

Theorem 12.2.3. Whitehead’s Theorem mod C: Let f : A → X where X and A are simply connected and
suppose f# : π2 (A) → π2 (X) is an isomorphism. Suppose C is a class satisfying axioms 1, 2B, and 3. Then
f# : πi (A) → πi (X) is a C-isomorphism for all i < n and f# : πn (A) → πn (X) is a C-epimorphism if and only if
the same statements hold for f∗ on H∗ .

It is an easy fact that for C satisfying Axiom 1, if A ∈ C and A and B are C-isomorphic, then B ∈ C. Then
Theorem 12.2.1 implies the following.

Theorem 12.2.4. All homotopy groups of a finite simply connected complex are finitely generated.

We now state the main theorem of this section. The proof is a pretty easy consequence of our three generalized
theorems. (Their proofs are more complicated. See [106].) For the final part we need one more theorem proved in
[106].

Theorem 12.2.5. Let f : A1 → A2 be a homomorphism of finitely generated abelian groups. Suppose f is a
Cp -isomorphism. Then A1 and A2 have isomorphic p-components.

Theorem 12.2.6. Cp Approximation Theorem: Let X and A be simply connected and nice. (CW-complexes are
fine.) Suppose Hi (A) and Hi (X) are finitely generated for every i. Let f : A → X be such that f# : π2 (A) → π2 (X)
is an epimorphism. (The map f is then homotopic to an inclusion map, so we can let it be inclusion without loss of
generality.) Then conditions 1-6 below are equivalent and imply condition 7.

1. f ∗ : H i (X; Zp ) → H i (A; Zp ) is isomorphic for i < n and monomorphic for i = n.

2. f∗ : Hi (A; Zp ) → Hi (X; Zp ) is isomorphic for i < n and epimorphic for i = n.

3. Hi (X, A; Zp ) = 0 for i ≤ n.

4. Hi (X, A; Z) ∈ Cp for i ≤ n.

5. πi (X, A) ∈ Cp for i ≤ n.

6. f# : πi (A) → πi (X) is a Cp -isomorphism for i < n and a Cp -epimorphism for i = n.

7. πi (A) and πi (X) have isomorphic p-components for i < n.



So the theorem implies that if we want to find the p-component of πi (X), find a space A with the same cohomology
in dimension i with Zp coefficients and a map of A into X inducing isomorphisms in Zp coefficients.
Proof: Conditions 1 and 2 are equivalent by vector space duality, since Zp is a field and all Hi are finitely
generated.
Conditions 2 and 3 are equivalent by the exact homology sequence of the pair (X, A).
Condition 3 implies condition 4, since the universal coefficient theorem gives an exact sequence

0 → Hi (X, A) ⊗ Zp → Hi (X, A; Zp ) → T or(Hi−1 (X, A), Zp ) → 0

and the middle group is zero by condition 3. But this implies that the left group is zero since it maps into zero by a
monomorphism. Since Hi (X, A) is finitely generated, Hi (X, A) ⊗ Zp = 0 implies that Hi (X, A) is the direct sum of
finite groups of order prime to p, so Hi (X, A) ∈ Cp , proving Condition 4.
Condition 4 applied to Hi and Hi−1 combined with the fact that Cp satisfies axiom 2B and thus axiom 2A, shows
that the tensor product or Tor of these groups with Zp is in Cp . Then the exact sequence shows that Condition 3 holds.
Conditions 4 and 5 are equivalent by the relative Hurewicz theorem.
Conditions 5 and 6 are equivalent by the exact homotopy sequence mod Cp of the pair (X, A).
Then Condition 6 implies condition 7 by the Hurewicz theorem mod CF G , which implies that the groups πi (A)
and πi (X) are finitely generated. Condition 7 now follows by Theorem 12.2.5. ■

12.3 Some Techniques from the Early Years of the Subject


In this section, I will describe some techniques and results from Hu's book [74]. The book was written in 1959, so
the Serre spectral sequence had already been discovered, but not the Adams spectral sequence, which I will describe in the
next section. Steenrod squares had already been developed, but Hu doesn't cover them. Still, he has some interesting
results derived using tools I have already covered, and it is instructive to see some examples. I will briefly outline
what has happened in the last 60 years in the next section, and I will cite some references you can read if the subject
interests you.
Hu describes the techniques involving the Freudenthal suspension theorem we have already covered in the first
part of the chapter. I will now give some additional examples, all from his book.

12.3.1 Finiteness of Homotopy Groups of Odd Dimensional Spheres


We already know the homotopy groups of S 1 so assume that we are looking at S n for n odd and n ≥ 3. The main
theorem is as follows.
Theorem 12.3.1. If S n is an odd dimensional sphere and m > n then πm (S n ) is finite.
We will need a definition.
Definition 12.3.1. A fiber space p : E → B where B is pathwise connected is called n-connective if E is n-connected
and
p∗ : πm (E, e0 ) → πm (B, b0 )
is an isomorphism for m > n, where e0 ∈ E and p(e0 ) = b0 ; hence πn+1 (B, b0 ) is isomorphic to Hn+1 (E).
Let X be an n-connective fiber space over S n . By the long exact sequence of a fiber space, the fiber F is a K(Z, n − 1)
space. Now by Theorem 12.2.4, all homotopy groups of S n are finitely generated. So by Theorem 12.2.1, all homology
groups of X are finitely generated.
Using the Leray-Serre spectral sequence, it can be shown that H ∗ (F ; Z) is a polynomial algebra generated by an
element of degree n − 1. (Similar to the proof of Theorem 11.7.7 which is the special case n = 3.) Hu then uses the
Wang exact sequence for cohomology
· · · → H m (X; Z) → H m (F ; Z) −ρ∗→ H m−n+1 (F ; Z) → H m+1 (X; Z) → · · ·
12.3. SOME TECHNIQUES FROM THE EARLY YEARS OF THE SUBJECT 285

to show that
ρ∗ : H p(n−1) (F ; Z) → H (p−1)(n−1) (F ; Z)
is an isomorphism for every positive integer p. Then exactness shows that H m (X; Z) = 0 for every m > 0. Since
Hm (X) is finitely generated, the universal coefficient theorem shows that Hm (X) is finite for m > 0. But then the
Hurewicz theorem mod the class of finite abelian groups shows that πm (X) is finite. So the theorem holds since X is n-connective.

12.3.2 Iterated Suspension


For this chapter, I will break with Hu’s notation and use the more common ΩX rather than ΛX for the space of
loops on X.
Pick a point s0 ∈ S n to be the base point. Let S n be the equator of the sphere S n+1 . Let W = ΩS n+1 , which is
the space of loops in S n+1 starting and ending at s0 . Let i : S n → W be defined as follows: Let u and v be the North
and South poles of S n+1 respectively. Define i(x) for x ̸= s0 to be the loop formed by joining s0 to u to x to v and then
back to s0 , all by the shortest geodesic arcs. The loop i(s0 ) is the loop from s0 to u to s0 to v and back to s0 . The map
i is a homeomorphism of S n into W .
We have that i(s0 ) is homotopic to the degenerate loop w0 ∈ W , so i(s0 ) can be joined to w0 by a path σ. For
m > 0 we have a homomorphism
i∗ : πm (S n , s0 ) → πm (W, s0 )
where we identify x with i(x) so we write s0 for i(s0 ). The path σ induces an isomorphism

σ∗ : πm (W, s0 ) ≈ πm (W, w0 ).

We also have that since W is the space of loops on S n+1 then there is an isomorphism

h∗ : πm (W, w0 ) ≈ πm+1 (S n+1 , s0 ).

Composing these three maps gives the map

Σ = h∗ σ∗ i∗ : πm (S n , s0 ) → πm+1 (S n+1 , s0 ).

This map is called the suspension map and is equivalent to the suspension map defined in Definition 9.7.2. The
Freudenthal suspension theorem (Theorem 9.7.8) also applies with this definition.
If we repeat the process with
j : Ω(S n+1 ) → Ω2 (S n+2 ),
we get an embedding
k = ji : S n → Ω2 (S n+2 ).
For each m, we have that k induces an isomorphism

k∗ : πm (S n , s0 ) → πm (Ω2 (S n+2 ), s0 ).

Similar to before, we also have an isomorphism

l∗ : πm (Ω2 (S n+2 ), s0 ) → πm+2 (S n+2 , s0 ).

Theorem 12.3.2. The map l∗ k∗ is equal to the iterated suspension Σ2 . In addition k∗ is an isomorphism for m <
2n − 1 and an epimorphism for m = 2n − 1.
The next result uses Σ2 to obtain information on p-primary components of the homotopy groups.
Theorem 12.3.3. Let n ≥ 3 be an odd integer and p a prime. Then the iterated suspension

Σ2 : πm (S n ) → πm+2 (S n+2 )

is a Cp -isomorphism if m < p(n + 1) − 3 and is a Cp -epimorphism if m = p(n + 1) − 3.



Theorem 12.3.4. If n ≥ 3 is an odd integer, p is prime, and m < n + 4p − 6, then the p-primary components of
πm (S n ) and πm−n+3 (S 3 ) are isomorphic.

Proof: We will proceed by induction on n. When n = 3, this is obvious. Assume q ≥ 5 is an odd integer and the
theorem is true for every odd integer n with 3 ≤ n < q.
By the previous theorem, the p-primary components of πm (S q ) and πm−2 (S q−2 ) are isomorphic if m − 2 <
p(q − 1) − 3. Since q ≥ 5, we have (p − 1)(q − 5) ≥ 0 so

q + 4p − 8 ≤ p(q − 1) − 3.

So m − 2 < p(q − 1) − 3 whenever m < q + 4p − 6.


By our induction hypothesis, the p-primary components of πm−2 (S q−2 ) and πm−q+3 (S 3 ) are isomorphic if m <
q + 4p − 6. So the result is true if n = q. ■

Example 12.3.1. Let p = 7. Then the 7-primary components of πm (S n ) and πm−n+3 (S 3 ) are isomorphic if m <
n + 28 − 6 = n + 22 where n is odd. So for example, the 7-components of πm (S 11 ) and πm−8 (S 3 ) are isomorphic
if 8 < m < 33.

12.3.3 The p-primary components of πm (S 3 )


Let X be a 3-connective fiber space over S 3 . We know the fiber F is a K(Z, 2) space.

Theorem 12.3.5. Hm (X) = 0 if m is odd and H2n (X) ≅ Zn for every n > 0.
This is also derived from the Wang exact sequence for cohomology and the fact that H ∗ (F ) is a polynomial algebra
with generator of dimension 2.

Theorem 12.3.6. If p is prime, then the p-primary component of πm (S 3 ) is 0 if m < 2p and Zp if m = 2p.

Proof: Letting Cp be the class of abelian groups of order prime to p, we have by Theorem 12.3.5 that Hm (X) ∈
Cp if 0 < m < 2p. By the Hurewicz theorem mod Cp , we have that πm (X) ∈ Cp for 0 < m < 2p, and π2p (X) is
Cp -isomorphic to Zp . Since X is a 3-connective fiber space over S 3 , we have that πm (S 3 ) ≅ πm (X) for each m > 3. ■

Theorem 12.3.7. If n ≥ 3 is an odd integer and p is prime then the p-primary component of πm (S n ) is 0 if m <
n + 2p − 3 and is Zp if m = n + 2p − 3.

Proof: Since p ≥ 2, we have

n + 2p − 3 < n + 4p − 6.

Then by Theorem 12.3.4, the p-primary components of πm (S n ) and πm−n+3 (S 3 ) are isomorphic. The result now
follows from Theorem 12.3.6, since m − n + 3 < 2p when m < n + 2p − 3, and m − n + 3 = 2p when m = n + 2p − 3. ■

Example 12.3.2. If p = 7 and n = 11, then the 7-primary component of πm (S 11 ) is 0 if m < 11 + 14 − 3 = 22 and
is Z7 if m = 22.

12.3.4 Pseudo-projective Spaces


Definition 12.3.2. A space P = P_h^{n+1} is a pseudo-projective space if it is formed by attaching an (n + 1)-cell E n+1
to S n by a map ϕ : ∂E n+1 = S n → S n of degree h > 0.

The homology groups are H0 (P ) ≅ Z, Hn (P ) ≅ Zh , and Hi (P ) = 0 for i ̸= 0, n.
Theorem 12.3.8. For every m < 2n − 1, there is an exact sequence

0 → πm (S n ) ⊗ Zh → πm (P ) → Tor(πm−1 (S n ), Zh ) → 0.
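The two outer groups in this sequence are computable summand by summand using Z ⊗ Zh = Zh, Zk ⊗ Zh = Z_gcd(k,h), Tor(Z, Zh) = 0, and Tor(Zk, Zh) = Z_gcd(k,h). A sketch (mine), encoding a finitely generated abelian group as a list of cyclic orders with 0 standing for Z:

```python
from math import gcd

def tensor_with_Zh(cyclic_orders, h):
    """A ⊗ Z_h for A a direct sum of cyclic groups (0 encodes Z)."""
    out = []
    for k in cyclic_orders:
        d = h if k == 0 else gcd(k, h)   # Z ⊗ Z_h = Z_h, Z_k ⊗ Z_h = Z_gcd(k,h)
        if d > 1:
            out.append(d)
    return out

def tor_with_Zh(cyclic_orders, h):
    """Tor(A, Z_h): Z summands contribute nothing, Z_k contributes Z_gcd(k,h)."""
    return [gcd(k, h) for k in cyclic_orders if k != 0 and gcd(k, h) > 1]

# With pi_m(S^n) = Z_24 and pi_{m-1}(S^n) = Z_2 and h = 2, the sequence of
# Theorem 12.3.8 pins pi_m(P) between a Z_2 subgroup and a Z_2 quotient:
assert tensor_with_Zh([24], 2) == [2]
assert tor_with_Zh([2], 2) == [2]
assert tor_with_Zh([0], 5) == []
```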

Let X be a 3-connective fiber space over S 3 with projection ω : X → S 3 . Since the p-primary component of
π2p (X) is Zp , there is a map f : S 2p → X representing a generator [f ] of this component.
Now let P = P_p^{2p+1} . Then S 2p ⊂ P , and p[f ] = 0 implies that f can be extended to a map g : P → X. Compose
with the projection ω : X → S 3 to get χ = ωg : P → S 3 .
Theorem 12.3.9. χ∗ : πm (P ) → πm (S 3 ) is a monomorphism for m < 4p − 1. It sends the p-primary component of
πm (P ) onto that of πm (S 3 ) for m ≤ 4p − 1.
Theorem 12.3.10. If p is a prime number and m ≤ 4p − 2, then

π2p (P ) ≅ Zp ,
π4p−3 (P ) ≅ Zp ,
π4p−2 (P ) ≅ Zp if p > 2,
πm (P ) = 0 otherwise.

Proof: By the homology groups of P computed above and the Hurewicz theorem, πm (P ) = 0 for m < 2p and π2p (P ) = Zp . Applying
Theorem 12.3.8 with n = 2p, h = p, and m ≤ 4p − 3, we get an exact sequence

0 → πm (S 2p ) ⊗ Zp → πm (P ) → T or(πm−1 (S 2p ), Zp ) → 0.

By the Freudenthal suspension theorem, πm (S 2p ) ≅ πm−1 (S 2p−1 ) for every m ≤ 4p − 3. Since 2p − 1 ≥ 3, Theorem
12.3.7 implies that the p-primary component of πm−1 (S 2p−1 ) is 0 if m < 4p − 3 and Zp if m = 4p − 3. This gives
πm (P ) = 0 for 2p < m < 4p − 3 and π4p−3 (P ) ≅ Zp .
By Theorem 12.3.9, the p-primary component of πm (S 3 ) is 0 whenever 2p < m < 4p−3, and is Zp if m = 4p−3.
Then by Theorem 12.3.4, if n ≥ 3 is odd, the p-primary component of πm (S n ) is 0 for n + 2p − 3 < m < n + 4p − 6.
Now πm (S 2p ) ∼ = πm+1 (S 2p+1 ) for every m ≤ 4p − 2. So the p-primary component of πm+1 (S 2p+1 ) is 0 if
4p − 3 < m < 6p − 6.
If p > 2 then 4p − 2 < 6p − 6 so the p-primary component of π4p−2 (S 2p ) is 0. By Theorem 12.3.8 with n = 2p,
h = p, and m = 4p − 2, we get π4p−2 (P ) ∼= Zp . ■
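The tensor and Tor terms driving arguments like this one are mechanical to compute for finitely generated abelian groups. Here is a small Python sketch of that bookkeeping (the encoding of a group as a list of cyclic orders, with 0 standing for a Z summand, and the function names are mine, not part of Hu's treatment); it uses the identities Z ⊗ Zh = Zh , Za ⊗ Zh = Zgcd(a,h) , Tor(Z, Zh ) = 0, and Tor(Za , Zh ) = Zgcd(a,h) .

```python
from math import gcd

def tensor_with_Zh(factors, h):
    """A ⊗ Z_h for A given as a list of cyclic orders (0 means Z).
    Uses Z ⊗ Z_h = Z_h and Z_a ⊗ Z_h = Z_gcd(a,h); a factor of 1
    in the result is a trivial summand."""
    return [h if a == 0 else gcd(a, h) for a in factors]

def tor_with_Zh(factors, h):
    """Tor(A, Z_h): Tor(Z, Z_h) = 0 and Tor(Z_a, Z_h) = Z_gcd(a,h)."""
    return [gcd(a, h) for a in factors if a != 0]

# End terms of 0 -> pi_m(S^n) ⊗ Z_h -> pi_m(P) -> Tor(pi_{m-1}(S^n), Z_h) -> 0
# in the case pi_m(S^n) = Z_2, pi_{m-1}(S^n) = Z, h = 2:
print(tensor_with_Zh([2], 2))  # [2]: Z_2 ⊗ Z_2 = Z_2
print(tor_with_Zh([0], 2))     # []: Tor(Z, Z_2) = 0
```

In the case shown, exactness then forces πm (P ) ∼= Z2 , which is how the sequence of Theorem 12.3.8 is used throughout this subsection.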
Now we have two facts about the homotopy groups of S 3 .
Theorem 12.3.11. If p is a prime number, then the p-primary component of πm (S 3 ) is 0 if 2p < m < 4p − 3 and is
Zp if m = 4p − 3. If p > 2, then the p-primary component of π4p−2 (S 3 ) is Zp .
Theorem 12.3.12. If n ≥ 3 is an odd integer and p is prime, then the p-primary component of πm (S n ) is 0 if n+2p−3 < m < n + 4p − 6 and that of πn+4p−6 (S n ) is 0 or Zp .

12.3.5 The Hopf Invariant Revisited


Recall the definition of the Hopf invariant from Section 11.5. We have a map f : S 2n−1 → S n . Letting K = S n ∪f e2n , we have that if σ and τ are generators of H n (K; Z) and H 2n (K; Z) respectively, then the Hopf invariant H(f ) of f is defined by σ ∪ σ = H(f )τ . If n is odd, graded commutativity gives σ ∪ σ = −σ ∪ σ, so 2H(f ) = 0 and hence H(f ) = 0. Hu proves the following:
Theorem 12.3.13. Let n be even and f : S 2n−1 → S n be a map with Hopf invariant H(f ) = k ̸= 0. Let C denote
the class of all finite abelian groups of order dividing some power of k. If

χf : πm−1 (S n−1 ) ⊕ πm (S 2n−1 ) → πm (S n )

is the homomorphism defined by


χf (α, β) = Σ(α) ⊕ f∗ (β)
where α ∈ πm−1 (S n−1 ), β ∈ πm (S 2n−1 ), and Σ is the suspension map, then χf is a C-isomorphism for every m > 1.
Theorem 12.3.14. If H(f ) = ±1, then χf is an isomorphism and the suspension Σ : πm−1 (S n−1 ) → πm (S n ) is a
monomorphism for every m > 1.

Theorem 11.5.1 states that there exists a map of Hopf invariant 2, so we have the following.
Theorem 12.3.15. Let n be even and let C denote the class of all finite abelian groups of order dividing some power
of 2. Then the homotopy group πm (S n ) is C-isomorphic to πm−1 (S n−1 ) ⊕ πm (S 2n−1 ).

12.3.6 πn+1 (S n ) and πn+2 (S n )


Since πm (S 1 ) = 0 for m > 1 and πm (S 2 ) ∼= πm (S 3 ) for m > 2, assume n ≥ 3.
Theorem 12.3.16. πn+1 (S n ) ∼= Z2 for n ≥ 3.
Proof: Let X be a 3-connective fiber space over S 3 . Then
π4 (S 3 ) ∼= π4 (X) ∼= H4 (X) ∼= Z2 ,
where the last equality follows by Theorem 12.3.5. By the Freudenthal suspension theorem we have
π4 (S 3 ) ∼= π5 (S 4 ) ∼= · · · ∼= πn+1 (S n ) ∼= · · · ,
so πn+1 (S n ) ∼= Z2 for n ≥ 3. ■
Now recall that the generator of π3 (S 2 ) is the Hopf map p. The suspension Σ : π3 (S 2 ) → π4 (S 3 ) is an epi-
morphism, so Σp : S 4 → S 3 is the generator of π4 (S 3 ). In general, the generator of πn+1 (S n ) is the (n − 2)-fold
suspension Σn−2 p of the Hopf map p.
Theorem 12.3.17. πn+2 (S n ) ∼= Z2 for n ≥ 3.
Proof: Apply Theorem 12.3.8 with n = 4, h = 2, and m = 5. Then we get an exact sequence
0 → π5 (S 4 ) ⊗ Z2 → π5 (P25 ) → T or(Z, Z2 ) → 0.
Now π5 (S 4 ) ∼= Z2 by the previous result and Z2 ⊗ Z2 = Z2 . Also T or(Z, Z2 ) = 0, so π5 (P25 ) ∼= Z2 . Then Theorem 12.3.9 implies that the 2-primary component of π5 (S 3 ) is isomorphic to Z2 . By Theorem 12.3.3, the p-primary component of π5 (S 3 ) is 0 for p > 2, so π5 (S 3 ) ∼= Z2 .
Since the Hopf map S 7 → S 4 has Hopf invariant 1, we can apply Theorem 12.3.14 with n = 4 and m = 6 so that
π6 (S 4 ) ∼= π5 (S 3 ) ⊕ π6 (S 7 ) ∼= Z2 ⊕ 0 = Z2 .
By the Freudenthal suspension theorem we have
π6 (S 4 ) ∼= π7 (S 5 ) ∼= · · · ∼= πn+2 (S n ) ∼= · · · ,
so πn+2 (S n ) ∼= Z2 for n ≥ 3. ■
In general, let p : S 3 → S 2 be the Hopf map. Then the suspension is Σp : S 4 → S 3 and the 2-fold suspension
is Σ2 p : S 5 → S 4 . The generator of π5 (S 3 ) is represented by q = ΣpΣ2 p : S 5 → S 3 . In general, πn+2 (S n ) is
generated by the (n − 3)-fold suspension Σn−3 q.

12.3.7 πn+r (S n ) for 3 ≤ r ≤ 8


Hu lists some results for πn+r (S n ) for 3 ≤ r ≤ 8. For r = 3, 4, he computes them with methods similar to those
we have just used, and he refers readers to [162] for the rest. Here is the rest of his list:
r=3
π6 (S 3 ) ∼= Z12
π7 (S 4 ) ∼= Z ⊕ Z12
πn+3 (S n ) ∼= Z24 if n ≥ 5.

r=4
π7 (S 3 ) ∼= Z2
π8 (S 4 ) ∼= Z2 ⊕ Z2
π9 (S 5 ) ∼= Z2
πn+4 (S n ) = 0 if n ≥ 6.

r=5
π8 (S 3 ) ∼= Z2
π9 (S 4 ) ∼= Z2 ⊕ Z2
π10 (S 5 ) ∼= Z2
π11 (S 6 ) ∼= Z
πn+5 (S n ) = 0 if n ≥ 7.

r=6
π9 (S 3 ) ∼= Z3
π10 (S 4 ) ∼= Z24 ⊕ Z3
πn+6 (S n ) ∼= Z2 if n ≥ 5.

r=7
π10 (S 3 ) ∼= Z15
π11 (S 4 ) ∼= Z15
π12 (S 5 ) ∼= Z30
π13 (S 6 ) ∼= Z60
π14 (S 7 ) ∼= Z120
π15 (S 8 ) ∼= Z ⊕ Z120
πn+7 (S n ) ∼= Z240 if n ≥ 9.

r=8
π11 (S 3 ) ∼= Z2
π12 (S 4 ) ∼= Z2
π13 (S 5 ) ∼= Z2
π14 (S 6 ) ∼= Z24 ⊕ Z2
π15 (S 7 ) ∼= Z2 ⊕ Z2 ⊕ Z2
π16 (S 8 ) ∼= Z2 ⊕ Z2 ⊕ Z2 ⊕ Z2
π17 (S 9 ) ∼= Z2 ⊕ Z2 ⊕ Z2
πn+8 (S n ) ∼= Z2 ⊕ Z2 if n ≥ 10.

One final note: Hu gives π10 (S 4 ) ∼= Z24 ⊕ Z2 , but after checking [162], I believe this to be a typo and that π10 (S 4 ) ∼= Z24 ⊕ Z3 is actually correct.

12.4 More Modern Techniques and Further Reading


The entire second half of Mosher and Tangora [106] is devoted to computing homotopy groups of spheres.
Their first approach is to build an approximation to S n using a Postnikov system. As a first approximation, K(Z, n) has the same cohomology and homology groups as S n up to dimension n. Unfortunately, K(Z, n) has nonzero cohomology in higher dimensions, so we would like a better approximation. Here we look at 2-components and use mod 2 coefficients. Now H n+1 (Z, n; Z2 ) = 0 and H n+2 (Z, n; Z2 ) ∼= Z2 , generated by Sq 2 (ın ) where ın is the fundamental class (see Theorem 11.7.11). This class defines a map K(Z, n) → K(Z2 , n + 2) by Theorem 11.2.1. From the standard contractible fiber space (i.e., the path-space fibration) over K(Z2 , n + 2), the map induces a fiber space X1 over K(Z, n) with fiber ΩK(Z2 , n + 2) = K(Z2 , n + 1). It turns out that X1 is a better approximation to S n than K(Z, n), because the class Sq 2 (ın ) ∈ H n+2 (Z, n; Z2 ) has been killed, so that H n+2 (X1 ; Z2 ) = H n+2 (S n ; Z2 ) = 0. Using this construction along with the Cp Approximation Theorem (Theorem 12.2.6), Mosher and Tangora are able to compute the 2-components of stable homotopy groups πn+k (S n ) for k ≤ 7. I have left out a lot of details, so see the description in [106].
Mosher and Tangora use the Postnikov decomposition of X to construct a spectral sequence for [Y, X], the homotopy classes of maps from Y to X. If Y = S m and X = S n where m ≥ n, we get πm (S n ) as a special case. The problem is that this spectral sequence starts at H ∗ (Y ; π∗ X) (see [106], Chapter 14), so it requires us to know the homotopy groups of X already. This led to the big breakthrough that happened just as Hu's book was being written: the discovery of the Adams spectral sequence [4].
The Adams spectral sequence gives information about the 2-component of [Y, X]. It is valid in the stable range, so if X is (n − 1)-connected, we need dim(Y ) ≤ 2n − 2. Finally, the E 2 term is the module ExtA (H ∗ (X), H ∗ (Y )), where we view the cohomology of X and Y as modules over the Steenrod algebra A. It turns out that ExtA (H ∗ (X), H ∗ (Y )) also has a multiplicative structure. Although [106] doesn't get into that, there is more in McCleary's book on spectral sequences [99] as well as Adams' original paper [4].
To get the full homotopy groups, we need the p-primary components where p > 2 is an odd prime. For this, we need Steenrod reduced powers and the corresponding mod p Steenrod algebras Ap . For more on p-components of homotopy groups of spheres, see Steenrod and Epstein [157] and McCleary [99].
Meanwhile, Toda had computed several homotopy groups of spheres using the EHP spectral sequence related to
the fibration
S n → ΩS n+1 → ΩS 2n+1

over Z2 . The first term is E1k,n = πk+n (S 2n−1 ) where the 2-primary components are implied. He used slightly
different fibrations for the p-primary components. His methods were used to make the table in Hu’s book. See [162]
for details.
It turns out that calculating the Ext term for the Adams spectral sequence is very hard. One approach is to use the May spectral sequence [94, 95]. The Adams spectral sequence can be strengthened by replacing mod p cohomology with a generalized cohomology theory such as complex cobordism. This was the innovation of Novikov [118] in 1967. For a complete description of cobordism theory, including complex cobordism, see [158]. For a good description of how all of this fits together to compute stable homotopy groups of spheres, see Ravenel [130]. The book assumes a strong background, so I would recommend reading Mosher and Tangora and the relevant parts of McCleary first.
Finally, Isaksen, Wang, and Xu [75] made the most recent advance in the subject in 2020. The motivic homotopy theory of Morel and Voevodsky [104] was originally developed to apply homotopy methods to algebraic geometry. Using these methods, Isaksen et al. have computed the stable homotopy groups πiS up to i = 90, beating the old record of i ≤ 61. This has been made possible by computer calculations of the Ext groups involved in the Adams spectral sequence, which have produced data up to dimension 200 that has not yet been fully interpreted. The relevant algorithms were developed by Bruner [22, 23, 24], Nassau [114], and Wang [168]. See [75] and its references for the details.

Sn          S1  S2         S3         S4                S5        S6        S7    S8        S9   S10
π1 (S n )   Z   0          0          0                 0         0         0     0         0    0
π2 (S n )   0   Z          0          0                 0         0         0     0         0    0
π3 (S n )   0   Z          Z          0                 0         0         0     0         0    0
π4 (S n )   0   Z2         Z2         Z                 0         0         0     0         0    0
π5 (S n )   0   Z2         Z2         Z2                Z         0         0     0         0    0
π6 (S n )   0   Z12        Z12        Z2                Z2        Z         0     0         0    0
π7 (S n )   0   Z2         Z2         Z ⊕ Z12           Z2        Z2        Z     0         0    0
π8 (S n )   0   Z2         Z2         Z22               Z24       Z2        Z2    Z         0    0
π9 (S n )   0   Z3         Z3         Z22               Z2        Z24       Z2    Z2        Z    0
π10 (S n )  0   Z15        Z15        Z24 ⊕ Z3          Z2        0         Z24   Z2        Z2   Z
π11 (S n )  0   Z2         Z2         Z15               Z2        Z         0     Z24       Z2   Z2
π12 (S n )  0   Z22        Z22        Z2                Z30       Z2        0     0         Z24  Z2
π13 (S n )  0   Z12 ⊕ Z2   Z12 ⊕ Z2   Z23               Z2        Z60       Z2    0         0    Z24
π14 (S n )  0   Z84 ⊕ Z22  Z84 ⊕ Z22  Z120 ⊕ Z12 ⊕ Z2   Z23       Z24 ⊕ Z2  Z120  Z2        0    0
π15 (S n )  0   Z22        Z22        Z84 ⊕ Z25         Z72 ⊕ Z2  Z23       Z23   Z ⊕ Z120  Z2   0

Table 12.5.1: Homotopy Groups πi (S n ) for 1 ≤ n ≤ 10 and 1 ≤ i ≤ 15 [162].

12.5 Table of Some Homotopy Groups of Spheres


I will conclude this section with the promised table. Recall that we want πi (S n ) for 1 ≤ n ≤ 10 and 1 ≤ i ≤ 15.
First of all, we know that π1 (S 1 ) ∼= Z and πi (S 1 ) = 0 for i > 1. We also know that πn (S n ) ∼= Z and πi (S n ) = 0 for i < n.
Theorem 9.6.8 says that π3 (S 2 ) ∼ = Z and Example 9.6.1 says that πn (S 2 ) ∼= πn (S 3 ) for n ≥ 3.
As an exercise, step through the results in Section 12.3 and see what you can fill in. Then use the stable stems that
were listed but not proved in Section 12.1. There will still be a few groups that you will not find, but you will get a lot
of them.
Table 12.5.1 is the result. I will write Zpn for the direct sum of Zp with itself n times.
One last comment. The one place where I can see homotopy groups of spheres showing up in data science is in obstruction theory. Recall that obstructions to the extension problem, where we want to extend f : A → Y to f : X → Y , lie in cohomology groups whose coefficients are homotopy groups of Y . If Y is a sphere, it would be useful to know some of its low-dimensional homotopy groups.
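As a concrete illustration of how Table 12.5.1 could be packaged for such a computation, here is a minimal Python sketch (the encoding and names are my own, and only a few table entries are transcribed; a real application would transcribe the whole table):

```python
# pi_i(S^n) for a few (i, n) pairs from Table 12.5.1, encoded as lists
# of cyclic orders (0 stands for a Z summand).
PI = {
    (3, 2): [0],               # Z, generated by the Hopf map
    (4, 2): [2], (4, 3): [2],
    (5, 3): [2], (5, 4): [2],
    (6, 3): [12], (6, 4): [2],
    (7, 4): [0, 12],           # Z ⊕ Z_12
}

def pi(i, n):
    """pi_i(S^n), combining general rules with the partial table."""
    if i < n:
        return []      # S^n is (n-1)-connected
    if i == n:
        return [0]     # pi_n(S^n) = Z
    if n == 1:
        return []      # pi_i(S^1) = 0 for i > 1
    return PI[(i, n)]  # raises KeyError for entries not transcribed

print(pi(4, 3))  # [2]: pi_4(S^3) = Z_2
```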
Chapter 13

Conclusion

This book serves three functions. Chapters 2-4 give you the math background in point set topology, abstract algebra,
and homology to enable you to understand most current papers on topological data analysis. Chapters 5-7 give you
a taste of what is going on currently. It should give you an idea of what to do with data in the form of a point cloud,
graph, image, or time series. It also includes a list of open source software that should enable you to easily perform
experiments. Finally, in Chapters 8-12, I give you a taste of some more advanced topics in algebraic topology such
as cohomology, homotopy, obstruction theory, and Steenrod cohomology operations. Although there are questions of
complexity, I have pointed you to several papers that have addressed algorithmic issues and a number of open source
software projects. Although persistent homology works very well in a lot of practical cases, it would be interesting to
experiment with some of the other techniques and see if they can help in solving hard classification problems. Their
application in data science is an area that is currently wide open.

As I write this book, topological data analysis is evolving rapidly. The goal of this book is to enable you to
understand the range of topics and enable you to experiment and contribute to this exciting field.

Bibliography

[1] Adams, D., The Hitchhiker’s Guide to the Galaxy, Pan Books, 1979.

[2] Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., Chepushtanova, S., Hanson, E.,
Motta, F., and Ziegelmeier, L., ”Persistence Images: A Stable Vector Representation of Persistent Homology,”
Journal of Machine Learning Research, Vol. 18, 2017. Figures reprinted with permission of authors.

[3] Adams, J. F., ”On the Non-existence of Elements of Hopf Invariant One,” Annals of Mathematics, series 2, vol.
75, pp. 20-104, 1962.

[4] Adams, J. F., ”On the Structure and Applications of the Steenrod Algebra,” Comm. Math. Helv., vol. 32, pp.
180-214, 1958.

[5] Adams, J. F., Stable Homotopy Theory, Lecture Notes in Mathematics 3, Springer-Verlag, Berlin, 1964.

[6] Adams, J. F., Stable Homotopy and Generalized Homology, University of Chicago Press, 1995.

[7] Agrawal, R., Imielinski, T., and Swami, A., ”Mining Association Rules Between Sets of Items in Large
Databases”, Proceedings of ACM SIGMOD Intl. Conference on Management of Data, pp. 207-216, 1993.

[8] Alexandroff, P. and Hopf, H., Topologie, Springer-Verlag, 1935.

[9] Atkin, R., Combinatorial Connectivities in Social Systems, Birkhauser Verlag, 1977.

[10] Barnes, D. and Roitzheim, C., Foundations of Stable Homotopy Theory, Cambridge University Press, 2020.

[11] Barr, M., ”Fuzzy Set Theory and Topos Theory,” Canadian Mathematics Bulletin, vol. 24, no. 4, pp. 501-508,
1986.

[12] Bauer, U., ”Ripser,” https://github.com/Ripser/ripser, 2016.

[13] Bauer, U., ”Ripser: Efficient Computation of Vietoris-Rips Persistence Barcodes,” arXiv:1908.02518v2
[math.AT] 26 Feb 2021.

[14] Bauer, U., ”Ripser: Efficient Computation of Vietoris-Rips Persistence Barcodes,” Talk at Conference on Com-
putational and Statistical Aspects of Topological Data Analysis, Alan Turing Institute, London, UK, March 23,
2017. Available at ulrich-bauer.org/ripser-talk.pdf

[15] Berndt, D. and Clifford, J., ”Using Dynamic Time Warping to Find Patterns in Time Series,” Workshop on
Knowledge Discovery in Databases, 12th International Conference on Artificial Intelligence, Seattle, WA, 1994,
pp. 229-248.

[16] Bishop, E. and Bridges, D, Constructive Analysis, Springer-Verlag, 1985.

[17] Blue Brain Project. Available online: https:///www.epfl.ch/research/domains/bluebrain


[18] Björner, A. ”Homology and Shellability of Matroids and Geometric Lattices”, in White, N., Matroid Applica-
tions, Cambridge University Press, 1992.

[19] Box, G., Jenkins, G. and Reinsel, G., Time Series Analysis: Forecasting and Control, 4th Edition, Wiley, 2008.

[20] Breiman, L., ”Random Forests,” Machine Learning, vol. 45, pp. 5-32, 2001.

[21] Brown, E., ”Finite Computability of Postnikov Complexes,” Annals of Mathematics, Second Series, vol. 65, pp.
1-20, 1957.

[22] Bruner, R., ”Calculation of Large Ext Modules,” Computers in Geometry and Topology (Chicago, IL 1986),
Lectures in Pure and Applied Mathematics, vol. 114, pp. 79-104, 1989.

[23] Bruner, R., ”Ext in the Nineties,” Algebraic Topology (Oaxtepec 1991), Contemp. Math., American Mathematical Society, vol. 146, pp. 71-90, 1993.

[24] Bruner, R., ”The Cohomology of the Mod 2 Steenrod Algebra: A Computer Calculation,” Wayne State University
Research Report, vol. 37, 1997.

[25] Bubenik, P., ”Statistical Topological Data Analysis Using Persistence Landscapes,” Journal of Machine Learning
Research, Vol. 16, pp. 77-102, 2015.

[26] C̆adek, M., Krc̆ál, M., Matous̆ek, Sergeraert, F., Vokr̆ı́nek, L., and Wagner, U, ”Computing all Maps into a
Sphere,” Journal of the ACM, vol. 61, no. 3, pp. 1-44, May 2014.

[27] C̆adek, M., Krc̆ál, M., Matous̆ek, Vokr̆ı́nek, L., and Wagner, U, ”Extendability of Continuous Maps is Undecid-
able,” Discrete Computational Geometry, vol. 51, pp.24-66, 2014.

[28] C̆adek, M., Krc̆ál, M., Matous̆ek, Vokr̆ı́nek, L., and Wagner, U, ”Polynomial Time Computation of Homotopy
Groups and Postnikov Systems in Fixed Dimension,” SIAM Journal on Computing, vol. 43, no.5, 1728-1780,
2014.

[29] Carlsson, G., ”Topology and Data,” Bulletin of the American Mathematical Society, vol. 46, no.2. pp. 255-308,
April 2009.

[30] Carlsson, G., de Silva, V. Morozov, D. ”Zigzag persistent homology and real valued functions,” Proceedings of
the 25th ACM Symposium on Computational Geometry, Aarhus, Denmark, June 8-10, 2009, pp. 247-256.

[31] Chazal, F. and Michel, B., ”An Introduction to Topological Data Analysis: Fundamental and Practical Aspects
for Data Scientists,” arXiv: 1710.04019v1 [math.ST] 11 Oct 2017.

[32] Cohen-Steiner, D. Edelsbrunner, H. and Harer, J., ”Stability of Persistence Diagrams,” Discrete Computational
Geometry, Vol. 37, no. 1, pp. 103-120, 2007.

[33] Cohen-Steiner, D. Edelsbrunner, H. and Morozov, D., ”Vines and Vineyards by Updating Persistence in Linear
Time,” Proceedings of the Twenty-second Annual Symposium on Computational Geometry (SCG ’06), pp. 119-
126, Sedona, Arizona, June 2006.

[34] Collins, C., Iorga, M., Cousin, D., and Chapman, D., ”Passive Encrypted IoT Device Fingerpringing with Persis-
tent Homology,” Topological Data Analyis and Beyond Workshop at the 34th Conference on Neural Information
Processing Systems, Vancouver, Canada, December 2020. Figure used with permission of authors.

[35] Conway, J. B., A Course in Functional Analysis, Volume 96, Springer Science & Business Media, 2013.

[36] Conway, J. H,, ”A Simple Construction for the Fischer-Griess Monster Group,” Inventiones Mathematicae, vol.
79, pp. 513-540, 1985.

[37] Cryer, J. and Chan, K., Time Series Analysis With Applications in R, 2nd Edition, Springer, 2008.

[38] Curtis, E. B., Simplicial Homotopy Theory Advances in Mathematics, vol. 6, pp. 107-209, 1971.

[39] de Silva, V. and Ghrist, R., ”Coverage in Sensor Networks Via Persistent Homology,” Algebraic & Geometric
Topology, vol. 7, pp. 339-358, 2007. Figures reprinted with permission of authors.

[40] de Silva, V., Morozov, D., and Vejdemo-Johansson, M., ”Persistent Cohomology and Circular Coordinates,”
Discrete and Computational Geometry, vol. 45, pp. 737-759, 2011.

[41] Diaz, C. Postol, M., Simon, R., and Wicke, D., ”Time-Series Data Analysis for Classification of Noisy and
Incomplete Internet-of-Things Datasets,” Proceedings of the 18th IEEE International Conference on Machine
Learning and Applications (ICMLA), December 2019.

[42] Dong, W., Moses, C., and Li, K., ”Efficient k-Nearest Neighbor Graph Construction for Generic Similarity Mea-
sures,” Proceedings of the 20th ACM International Conference on World Wide Web (WWW ’11), Hyderabad,
India, pp. 577-586, Mach 2011.

[43] Dousson, X., Rubio, J., Sergeraert, F., and Siret, Y., The Kenzo Program, http://www-fourier.ujf-grenoble.fr/~sergerar/Kenzo/ Photograph reproduced with permission of content owner.

[44] Edelsbrunner, H.and Harer, J., Computational Topology – An Introduction, American Mathematical Society,
2010.

[45] Eilenberg, S. and Mac Lane S., ”On the Groups H(π, n), I.,” Annals of Math., vol. 58, pp. 55-106, 1953.

[46] Eilenberg, S. and Mac Lane S., ”On the Groups H(π, n), II.,” Annals of Math., vol. 60, pp. 49-139, 1954.

[47] Eilenberg, S. and Steenrod, N, Foundations of Algebraic Topology, Princeton University Press, 1952.

[48] Eilenberg, S. and Zilber, J., ”On Products of Complexes,” American Journal of Math., vol. 75, pp. 200-204, 1953.

[49] Enders, W., Applied Econometric Time Series, 3rd Edition, Wiley, 2009.

[50] Filakovský, M. and Vokr̆ı́nek, L., ”Are Two Given Maps Homotopic?.” Foundations of Computational Mathe-
matics, vol. 20, pp. 311-330, 2020.

[51] Friedman, G., ”An Elementary Illustrated Introduction to Simplicial Sets,” Rocky Mountain Journal of Mathe-
matics, vol. 42., no. 2, pp. 353-423, 2012. Figures reprinted with permission of author.

[52] Gamow, G., One Two Three... Infinity: Facts and Speculations of Science, New York, Viking Press, 1947.

[53] Ghrist, R. Elementary Applied Topology, Createspace, 2014.

[54] Gidea, M., Goldsmith, D., Katz, Y., Roldan, P., and Shmalo, Y., ”Topological Recognition of Critical Transitions
in Time Series of Cryptocurrencies,” arXiv:1809.00695v1 [q-fin.MF] 3 Sep 2018.

[55] Gidea, M. and Katz, Y., ”Topological Data Analysis of Financial Time Series: Landscapes of Crashes”, Physica
A, vol. 491, pp. 820-834, 2018. Figures reprinted with permission of authors.

[56] González-Dı́az, R. and Real, P., ”A Combinatorial Method for Computing Steenrod Squares,” Journal of Pure
and Applied Algebra, vol. 139, pp. 89-108, 1999.

[57] González-Dı́az, R. and Real, P., ”Computation of Cohomology Operations on Finite Simplicial Complexes,” Homology, Homotopy and Applications, vol. 5, no. 2, pp. 83-93, 2003. DOI: https://dx.doi.org/10.4310/HHA.2003.v5.n2.a4

[58] González-Dı́az, R. and Real, P., ”Steenrod Reduced Powers and Computability,” Proceed-
ings of the 5th International IMACS Conference on Applications of Computer Algebra.
ACA’99 Session 18. Computations in Pure Mathematics (Algebra, Analysis, Geometry, ...)
https://math.unm.edu/∼aca/ACA/1999/Proceedings/pure/Gonzalez-Diaz abstract.ps
[59] Govc, D., ”Computing Homotopy Types of Directed Flag Complexes,” arXiv:2006.05333v1 [math.AT] 9 Jun
2020.
[60] Griess, R. L., Jr., ”The Friendly Giant,” Inventiones Mathematicae, Vol. 69, pp. 1-102, 1982.
[61] Hamilton, J. Time Series Analysis, Princeton University Press, 1994.
[62] Hartshorne, R., Algebraic Geometry, Springer Science + Business Media, Inc., 1977.
[63] Hatcher, A., Algebraic Topology, Cambridge University Press, 2009. License for figure purchased from publisher:
PLSclear%20FPL%20Licence%20[73717]%20(1).pdf
[64] Hausdorff, F., Set Theory, Reprinted by the American Mathematical Society, 2005.
[65] HB, A., Persistent Cohomology Operations, Duke University, 2011.
[66] Herstein, I., Topics in Algebra, 2nd Edition, Xerox Corporation, 1975.
[67] Herstein, I., Abstract Algebra, 3rd Edition, John Wiley and Sons Inc., 1999.
[68] Hilton, P. and Stammbach, U., A Course in Homological Algebra: Second Edition, Springer-Verlag, 1997.
[69] Hinton, G. and Roweis, S., ”Stochastic Neighbor Embedding,” Proceedings of the 15th International Conference
on Neural Information Processing Systems (NIPS ’02), Cambridge, MA, USA, pp. 857-864, MIT Press, 2002.
[70] Hoffman, K. and Kunze, R., Linear Algebra, 2nd Edition, Pearson, 1971.
[71] Hopf, H., ”Die Klassen der Abbildungen der n-dimensionalen Polyeder auf die n-dimensionale Sphäre,” Comm.
Math. Helv., vol. 5, pp. 39-54, 1933.
[72] Hopf, H., ”Über die Abbildungen von Sphären auf Sphären niedrigerer Dimension,” Fundamenta Mathematicae,
vol. 25, pp. 427-440, 1935.
[73] Hotelling, H., ”Analysis of a Complex of Statistical Variables into Principal Components,” Journal of Educational
Psychology, vol. 24, no. 6. pp. 417-441, 1933.
[74] Hu, S-T., Homotopy Theory, Academic Press, 1959.
[75] Isaksen, D., Wang, G., and Xu, Z., ”Stable Homotopy Groups of Spheres,” Proceedings of the National Academy
of Sciences, vol. 117, no. 40, pp. 24757-24763, October 6, 2020.
[76] Jacobson, N. Basic Algebra I and II, Dover Edition, 2009.
[77] Jain, A. K. and Dubes, R. C., Algorithms for Clustering Data, Prentice Hall Advanced Reference Series, Prentice
Hall Inc., Englewood Cliffs, NJ, 1988.
[78] Johnson, S. C., ”Hierarchical Clustering Schemes,” Psychometrika, vol. 32, pp. 241-254, 1967.
[79] Kaczynski, T., Mischaikow, K., and Mrozek, M., Computational Homology, Springer-Verlag, 2004.
[80] Kahn, D. ”Induced maps for Postnikov Systems,” Transactions of the American Mathematical Society, vol. 10,
pp. 432-450, 1963.
[81] Kan, D., ”A Combinatorial Definition of Homotopy Groups,” Annals of Mathematics, vol. 67, pp. 282-312, 1958.

[82] Kleist, C., Time Series Data Mining Methods: A Review, Master’s Thesis, Humboldt-Universität zu Berlin,
School of Business and Economics, Berlin, Germany, March 25, 2015.

[83] Kochman, S., Stable Homotopy Groups of Spheres, Springer Lecture Notes 1423, 1990.

[84] Kochman, S. and Mahowald, M., ”On the Computation of Stable Stems,” Contemp. Math., vol. 181, pp. 299-316,
1995.

[85] Kolda, T. and Bader, B., ”Tensor Decompositions and Applications,” SIAM Review, vol. 51, no. 3, pp. 455-500,
2009.

[86] Kruskal, J. B., ”Multidimensional Scaling by Optimizing Goodness of Fit to a Nonmetric Hypothesis,” Psy-
chometrika, vol. 29, no. 1, pp. 1-27, March 1964.

[87] Lefschetz, S. Algebraic Topology, American Mathematical Society Colloquium Publications, Volume 27, 1942.

[88] Lin, J., Keogh, E., Wei, L., and Lonardi, S., ”Experiencing SAX: A Novel Symbolic Representation of Time
Series,” Data Mining and Knowledge Discovery, Springer, 2007. Figures reprinted with permission of authors.

[89] Lütgehetmann, D., Govc, D., Smith, J., and Levi, R., ”Computing Persistent Homology of Directed Flag Com-
plexes,” Algorithms, vol. 13, no. 19, pp. 1-18, 2020.

[90] Lyons, D., ”An Elementary Introduction to the Hopf Fibration,” Mathematics Magazine, vol. 76, no. 2, pp. 87-98,
April 2003.

[91] Mac Lane, S., Categories for the Working Mathematician, Second Edition, Springer, 1978.

[92] Mac Lane, S., Homology, Springer Verlag, 1995. License for figures purchased from publisher:
https://marketplace.copyright.com/rs-ui-web/mp/license/f6c295ee-f9b0-4f97-8e55-8995fafad7be/f9658e3d-
7752-4d12-9cc4-61ee1e75a1dd

[93] Marsden, J. and Ratiu, T., Introduction to Mechanics and Symmetry, Springer-Verlag, 1994.

[94] May, J. P., ”The Cohomology of Restricted Lie Algebras and of Hopf Algebras,” Bulletin of the American
Mathematical Society, vol. 71, pp. 372-377, 1965.

[95] May, J. P., ”The Cohomology of Restricted Lie Algebras and of Hopf Algebras,” Journal of Algebra, vol. 3, no.
2, pp. 123-146, 1966.

[96] May, J. P., A Concise Course in Algebraic Topology, University of Chicago Press, 1999.

[97] May, J. P. Simplicial Objects in Algebraic Topology, Chicago Lectures in Mathematics, 1992.

[98] Mathematics Stack Exchange, ”How can a mug and a torus be equivalent if the mug is
chiral?”, electronpusher: https://math.stackexchange.com/users/398916/electronpusher, URL:
https://math.stackexchange.com/q/2258389 (version: 2017-05-02) License: CC BY-SA 3.0

[99] McCleary, J., A User’s Guide to Spectral Sequences, Second Edition, Cambridge University Press, 2001.

[100] McInnes, L., Healy, J., and Melville, J., ”UMAP: Uniform Manifold Approximation and Projection for Dimen-
sion Reduction,” arXiv: 1802.03426v3 [stat.ML]. September 18, 2020.

[101] Milnor, J., Morse Theory, Princeton University Press, 1963.

[102] Milnor, J. and Stasheff, J., Characteristic Classes, Princeton University Press, 1974.

[103] Montgomery, D., Introduction to Statistical Quality Control, 8th Edition, John Wiley & Sons, 2020.

[104] Morel, F. and Voevodsky, V. ”A1 -homotopy Theory of Schemes,” Inst. Hautes Études Sci. Publ. Math., vol. 90,
pp. 45-143, 1999.

[105] Morozov, D., ”Dionysus,” Software available at http://www.mrzv.org/software/dionysus, 2012.

[106] Mosher, R. and Tangora, M., Cohomology Operations and Applications in Homotopy Theory, Dover Publica-
tions, Inc., 2008.

[107] Mosseri, R. and Dandoloff, R., ”Geometry of Entangled States, Bloch Spheres and Hopf Fibrations,” Journal of
Physics A, vol. 34, pp. 10243-10252, 2001.

[108] Motzkin, T., ”The Euclidean Algorithm,” Bulletin of the American Mathematical Society, vol. 55, pp. 1142-1146, 1949.

[109] Munch, E., ”A User’s Guide to Topological Data Analysis,” Journal of Learning Analytics, vol. 4. no. 2. pp.
47-61, 2017.

[110] Munkres, J., Elements of Algebraic Topology, The Benjamin/Cummings Publishing Company, Inc.,
1984. Images reprinted with permission of the publisher. License: https://marketplace.copyright.com/rs-ui-
web/mp/license/82de90d1-7f17-462f-9337-ea838209180c/d1580ee1-6e26-462d-8008-71fcdcf4f0f0

[111] Munkres, J., Topology, Second Edition, Pearson, 2017.

[112] Myers, A., Munch, E., and Khasawneh, F., ”Persistent Homology of Complex Networks for Dynamic State
Detection,” arXiv:1904.07403v2 [nlin.CD], 27 Jan 2020.

[113] Nakahara, M., Geometry, Topology, and Physics, Institute of Physics Publishing, 1990.

[114] Nassau, C., www.nullhomotopie.de

[115] Nevill-Manning, C. and Witten, I., ”Identifying Hierarchical Structure in Sequences: A linear-time algorithm,”
Journal of Artificial Intelligence Research, volume 7, 1997, pp 67-82. Figures reprinted with permission of
authors.

[116] Nicolau, M., Levine, A., and Carlsson, G., ”Topology Based Data Analysis Identifies a Subgroup of Breast
Cancers With a Unique Mutational Profile and Excellent Survival,” Proceedings of the National Academy of
Sciences, vol. 108, no. 17, pp. 7265-7270, 2011.

[117] Nicolau, M., Tibshirani R., Børresen-Dale, A. L., Jeffrey, S. S. ”Disease-Specific Genomic Analysis: Identify-
ing the Signature of Pathologic Biology,” Bioinformatics, vol. 23, pp. 957-965, 2007.

[118] Novikov, S. ”Methods of Algebraic Topology from the Point of View of Cobordism Theory,” Izvestiya Akademii
Nauk SSR. Seriya Matematicheskaya (in Russian), vol. 31, pp. 855-951, 1967.

[119] Perea,J. and Harer, J., ”Application of Topological Methods to Signal Analysis,” Foundations of Computational
Mathematics, Vol. 15, pp. 799-838, 2015.

[120] Perea, J. Munch, E., and Khasawneh, F., ”Approximating Continuous Functions on Persistence Diagrams Using
Template Functions,” arXiv: 1902.07190v2 [cs.CG] 19 Mar 2019. Figures reprinted with permission of authors.

[121] Pontrjagin, L., ”A Classification of Mappings of the 3-dimensional Complex into the 2-dimensional Sphere,”
Rec. Math. [Mat. Sbornik], N. S., vol. 9, no, 51, pp. 331-363, 1941.

[122] Pontrjagin, L., ”Mappings of a 3-sphere into an n-complex,” C. R. Akad. Nauk SSSR, vol. 34, pp. 35-37, 1942.

[123] Postnikov, M., ”Investigations in Homotopy Theory of Continuous Mappings,” American Mathematical Society
Translations, Series 2, vol, 7, pp. 1-134, 1957.

[124] Postol, M., Realization of Homotopy Equivalences by Homeomorphisms, Thesis, University of Minnesota,
1990.

[125] Postol, M., “Homotopy and Topological Actions on Spaces With Few Homotopy Groups,” Proceedings of the
American Mathematical Society, vol. 114, pp. 251-260, 1992.

[126] Postol, M. and Goldring, T., ”Computer Intrusion Detection Using Features From Graph Theory and Alge-
braic Topology,” Presented at the Second Conference on Algebraic Topological Methods in Computer Science,
London, Ontario, Canada, July 2004, Available online: atmcs2.appliedtopology.org/talks/Postol.pdf

[127] Pun, C., Xia, K., Lee, S., ”Persistent-Homology-Based Machine Learning — A Survey, ” SSRN Electronic
Journal, January 2018.

[128] Purvine, E., ”Homology of Graphs and Hypergraphs,” Talk at the Workshop on Topological Data Analy-
sis: Theory and Applications, May 2, 2021, Recorded talk available online: math.sci.uwo.ca/ jardine/conf-
recordings/Purvine-HomologyHypergraphs.mp4

[129] Rabadán, R. and Blumberg, A. Topological Data Analysis for Genomics and Evolution, Cam-
bridge University Press, 2020. Figures reprinted with permission of the publisher. License:
PLSclear%20FPL%20Licence%20[73717].pdf

[130] Ravenel, D., Complex Cobordism and the Stable Homotopy Groups of Spheres, 2nd Edition, AMS Chelsea
Publishing, 2004.

[131] Real, P., Homological Perturbation Theory and Associativity, Preprint of the Department of Applied Mathematics I,
University of Seville, 1996.

[132] Real, P., "On the Computability of the Steenrod Squares," Annali dell'Università di Ferrara, Sezione VII,
Scienze Matematiche, vol. XLII, pp. 57-63, 1996.

[133] Reimann, M., Nolte, M., Scolamiero, M., Turner, K., Perin, R., Chindemi, G., Dłotko, P., Levi, R., Hess, K.,
and Markram, H., ”Cliques of Neurons Bound into Cavities Provide a Missing Link Between Structure and
Function,” Frontiers in Computational Neuroscience, vol. 11, article 48, pp. 1-16, June 2017.

[134] Riedl, M., Müller, A., and Wessel, N., "Practical Considerations of Permutation Entropy: A Tutorial Review,"
The European Physical Journal Special Topics, vol. 222, pp. 249-262, 2013.

[135] Riehl, E. Category Theory in Context, Dover Publications, Inc., 2016.

[136] Robinson, M., Topological Signal Processing, Springer-Verlag, 2014.

[137] Rotman, J. An Introduction to Homological Algebra, Springer, 2009.

[138] Rubio, J., Homologie effective des espaces de lacets itérés: un logiciel, Thèse de doctorat de l’Institut Fourier,
Grenoble, 1991.

[139] Rubio, J. and Sergeraert, F., ”Constructive Homological Algebra and Applications,” 2006 Genova Summer
School, arXiv:1208.3816v3 [math.KT] 22 Aug 2013. Figures reprinted with permission of authors.

[140] Romero, A., Rubio, J., and Sergeraert, F., "Computing Spectral Sequences," Journal of Symbolic Computation,
vol. 41, no. 10, pp. 1059-1079, October 2006.

[141] Rudin, W. Functional Analysis, 2nd Edition, McGraw-Hill, 1991.

[142] Schäfer, P. and Leser, U., ”Multivariate Time Series Classification With WEASEL+MUSE,” CoRR, vol.
abs/1711.11343, 2017.

[143] Serre, J.-P., ”Homologie singulière des espaces fibrés,” Annals of Mathematics, vol. 2, no. 54, pp. 425-505,
1951.
[144] Silverman, B. W., Density Estimation for Statistics and Data Analysis, Monographs on Statistics and Applied
Probability, Chapman & Hall, London, 1986.
[145] Smith, A., Bendich, P., and Harer, J., "Persistent Obstruction Theory for a Model Category of Measures with
Applications to Data Merging," Transactions of the American Mathematical Society, Series B, vol. 8, no. 1, pp.
1-38.
[146] Singh, G., Memoli, F., and Carlsson, G., ”Topological Methods for the Analysis of High Dimensional Data
Sets and 3D Object Recognition,” Eurographics Symposium on Point-Based Graphics, Prague, Czech Republic,
September 2-3, 2007.
[147] Smith, J.P., "Flagser-Adaptions," 2019. Available online: https://github.com/JasonPSmith/flagser-adaptions
[148] Smullyan, R., What is the Name of This Book?: The Riddle of Dracula and Other Logical Puzzles, Dover
Recreational Math, 2011.
[149] Spanier, E., Algebraic Topology, Springer-Verlag, 1966.
[150] Spivak, D., "Metric Realization of Fuzzy Simplicial Sets," Self-published notes, 2012.
[151] Spivak, M., A Comprehensive Introduction to Differential Geometry, Volume 1, Publish or Perish, 3rd Edition,
1999.
[152] Steen, L. and Seebach, J., Counterexamples in Topology, Second Edition, Springer-Verlag, New York, 1978.
[153] Steenrod, N., ”Cohomology Operations and Obstructions to Extending Continuous Functions,” Advances in
Mathematics, vol. 8, pp. 371-416, 1972.
[154] Steenrod, N., "Products of Cocycles and Extensions of Mappings," Annals of Mathematics, vol. 48, no. 2, pp.
290-320, April 1947.
[155] Steenrod, N., ”Reduced Powers of Cohomology Classes,” Annals of Math., vol. 56, pp. 47-67, 1952.
[156] Steenrod, N., The Topology of Fibre Bundles, Princeton University Press, 1951.
[157] Steenrod, N. and Epstein, D., Cohomology Operations, Annals of Mathematics Studies, Number 50, Princeton
University Press, 1962.
[158] Stong, R., Notes on Cobordism Theory, Princeton University Press, 2015.
[159] Tan, P., Steinbach, M., and Kumar, V., Introduction to Data Mining, Pearson Education, Inc., 2006.
[160] Thomas, P., ”A Generalization of the Pontrjagin Square Cohomology Operations,” Proceedings of the National
Academy of Sciences, vol. 42, pp. 266-269, 1956.
[161] Thomas, P., ”The Generalized Pontrjagin Cohomology Operations and Rings with Divided Powers,” Memoirs
of the American Mathematical Society, no. 27, 1957.
[162] Toda, H., Composition Methods in Homotopy Groups of Spheres, Annals of Mathematical Studies 49, Princeton
University Press, 1962.
[163] Tsay, R., Analysis of Financial Time Series, 3rd Edition, Wiley, 2010.
[164] van der Maaten, L., ”Accelerating t-SNE Using Tree-Based Algorithms,” Journal of Machine Learning Re-
search, vol. 15, no. 1, pp. 3221-3245, 2014.

[165] van der Maaten, L., and Hinton, G., ”Visualizing Data Using t-SNE,” Journal of Machine Learning Research,
vol. 9, pp. 2579-2605, 2008.
[166] Vokřínek, L., "Decidability of the Extension Problem for Maps into Odd Dimensional Spheres," Preprint,
arXiv:1312.2474, January 2014.
[167] Wallis, W., Bunke, H., Dickinson, P., and Kraetzl, M., A Graph-Theoretic Approach to Enterprise Network
Dynamics, Birkhauser, 2006.
[168] Wang, G., github.com/pouiyter/morestablestems

[169] Wang, X., Lin, J., Senin, P., Oates, T., Gandhi, S., Boedihardjo, A., Chen, C. and Frankenstein, S., ”RPM:
Representative Pattern Mining for Efficient Time Series Classification,” Proceedings of the 19th International
Conference on Extending Database Technology (EDBT), Bordeaux, France, March 15-18, 2016.
[170] Whitehead, G., Elements of Homotopy Theory, Springer-Verlag, 1979.
[171] Whitehead, J., ”Combinatorial Homotopy I, II”, Bulletin of the American Mathematical Society, vol. 55, pp.
213-245, 453-496, 1949.
[172] Willard, S., General Topology, Addison-Wesley Publishing Company, 1970.
[173] Wilson, J., "A Principal Ideal Ring That is not a Euclidean Ring," Mathematics Magazine, vol. 46, no. 1, pp.
34-38, January 1973.

[174] "cylinder," Wiktionary, Wikimedia Foundation, June 30, 2020, en.wiktionary.org/wiki/cylinder. License:
https://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License,_version_1.2
[175] "Immersion (mathematics)," Wikipedia, Wikimedia Foundation, July 25, 2020,
en.wikipedia.org/wiki/Immersion_(mathematics). License: https://creativecommons.org/licenses/by-sa/4.0/deed.en

[176] "Möbius Strip," Wikipedia, Wikimedia Foundation, July 30, 2023, https://en.wikipedia.org/wiki/Möbius_strip.
License: https://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License,_version_1.2
[177] "Spaceship Earth (Epcot)," Wikipedia, Wikimedia Foundation, September 6, 2007,
https://en.wikipedia.org/wiki/Spaceship_Earth_(Epcot)#/media/File:Spaceship_Earth_2.jpg, image created
by Katie Rommel-Esham. License: CC BY-SA 4.0

[178] "Suspension (topology)," Wikipedia, Wikimedia Foundation, February 22, 2019,
en.wikipedia.org/wiki/Suspension_(topology). License:
https://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License,_version_1.2
[179] "Torus," Wikipedia, Wikimedia Foundation, July 30, 2020, en.wikipedia.org/wiki/Torus. License:
https://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License,_version_1.2

[180] Zadeh, L., ”Fuzzy Sets,” Information and Control, vol. 8, no. 3, pp. 338-353, 1965.
[181] Zomorodian, A., Topology for Computing, Cambridge University Press, New York, 2009.
[182] Zomorodian, A. and Carlsson, G., "Computing Persistent Homology," Discrete and Computational Geometry,
vol. 33, no. 2, pp. 249-264, February 2005.
Index

A ⊗ B, 162
CP^n, 81
K(π, n), 212
M(G, n), 213
P^∞, 80
P^n, 80
S^n, 7
Sq^i, 250
T_0, 19
T_1, 19
T_2, 19
T_3, 20
T_4, 21
T_{3 1/2}, 21
X ∪_f Y, 76
X ∨ Y, 175
∆-set, 137
ℵ_0, 22
χ(K), 69
π_i^S(X), 279
π_n(X, x_0), 201
C, 24
(p, q)-shuffle, 274

abelian group, 32
abelianization, 193
absolute covering homotopy property, 194
absolute homotopy extension property, 228
abstract simplicial complex, 55
ACHP, 194
acyclic carrier, 248
acyclic complex, 58
acyclic model, 180
adjacency matrix, 100
adjunct functors, 142
adjunction, 142
adjunction space, 76
admissible sequence, 253
AHEP, 228
algebra (object), 47
arcwise connected, 28
associative graded R-algebra, 258
associative ring, 41
augmentation map, 58
augmented graded R-algebra, 259
automorphism, 206
AW, 274

Banach space, 96
bar code, 89
barycenter, 65
barycentric coordinates, 64
barycentric subdivision, 65
basic neighborhood, 12
Betti number, 37
Blue Brain Project, 153
Bockstein exact couple, 262
Bockstein homomorphism, 251
bottleneck distance, 94
boundary, 11
bounded, 117
bouquet of spheres, 212
Boy's surface, 18
bundle space, 194

canonical free resolution, 166
canonical line bundle, 269
cardinality, 22
Cartesian product, 7, 15
category, 48
Cauchy sequence, 96
Cayley number, 198
Čech complex, 84
cell, 76
cellular chain complex, 78
chain, 55
chain complex, 61
chain homotopy, 64

chain map, 63
CHP, 194
class of abelian groups, 281
closed map, 16
closed set, 10
closure, 10
co-algebra, 260
co-unit, 260
coaugmentation, 171, 260
coboundary, 170
cochain, 170
cocycle, 170
coface, 185
coherent union, 80
cohomology, 170
cohomology cross product, 182
cohomology operation, 245
cohomology ring, 175
coincidence matrix, 185
cokernel, 37
commutative diagram, 8
commutative graded R-algebra, 258
commutative ring, 41
commutator, 193
commutator subgroup, 193
compact space, 25
compact-open topology, 118
complement, 5
complete metric space, 96
completely regular, 21
complex projective n-space, 81
component, 26
cone, 18
connected, 26
connected augmented graded R-algebra, 259
continuous function, 13
continuum hypothesis, 24
contractible, 66
contraction, 273
contravariant functor, 49
convex hull, 53
convex set, 52
coordinate bundle, 195
coordinate chart, 143
copersistence, 185
coset, 34
countable, 22
covariant functor, 49
covering homotopy property, 194
cross section, 196
cube, 21
cup product, 174
cup-i product, 249
CW complex, 76
cycle, 100
cyclic group, 33

decomposable Steenrod square, 257
deformation cochain, 230
deformation retraction, 66
degeneracy map, 138
degenerate simplex, 138
degree of a map, 68
Delta set, 137
dense, 24
derived functor, 168
derived triplet, 203
difference cochain, 230
differential bigraded module, 216
digraph, 101
  associated, 102
direct product, 35
direct sum, 35
directed flag complex, 154
directed simplicial complex, 154
division ring, 42
dual category, 48
dual homomorphism, 160
dual space, 261
dynamic time warping, 105

E(G), 99
edges, 99
ends, 99
Eilenberg-MacLane space, 212
Eilenberg-Steenrod Axioms for Homology, 73
Eilenberg-Zilber contraction, 274
elementary symmetric polynomials, 253
EML, 274
empty set, 6
epimorphism, 36
equivariant carrier, 248
equivalence relation, 8
Euclidean ring, 46
Euler characteristic, 69
Euler number, 69
exact couple, 221
exact sequence, 38
exponential map, 191
Ext, 166
extension condition, 235

extension index, 228 simple, 100


extension problem, 190 spanned by a subgraph, 100
exterior algebra, 266 underlying, 102
undirected, 101
face map, 136 weighted, 102
faulty logic, 144 graph edit distance, 102
fiber bundle, 196 Grassmann manifold, 271
fiber map, 196 Gromov-Hausdorff distance, 95
fibering, 194 group, 32
fibrant, 235 group acting on a set., 195
field, 42 group ring, 248
filtration, 85 Gysin sequence, 220
finitely generated, 37
finitely presented, 37 H-space, 201
forest, 101 Hausdorff, 19
forgetful functor, 49 Hausdorff distance, 93
formal sum, 37 head, 101
free abelian group, 37 HEP, 228
free chain complex, 173 Hilbert space, 143
free group, 37 Hom, 160
free resolution, 166 homeomorphism, 14
frontier, 11 homology cross product, 182
function, 7 homology group, 57
function: domain, 7 homology with arbitrary coefficients, 60, 166
function: image, 7 homomorphism, 36
function: range, 7 homotopic maps, 65
functor, 49 homotopy equivalence, 66
faithful, 145 homotopy equivalent, 66
full, 145 homotopy extension property, 228
fundamental cohomology class, 246 homotopy group, 201
fundamental cycle, 79 homotopy inverse, 66
fundamental group, 192 homotopy system, 206
fuzzy logic, 144 homotopy type, 66
fuzzy set theory, 144 Hopf algebra, 260
Hopf invariant, 256
generalized triad, 200 Hurewicz homomorphism, 210, 245
generators, 37 hyperdonut, 180
genus, 69
geometrically independent, 53 ideal, 43
graded R-algebra, 258 identity function, 7
graded R-module, 258 incidence matrix, 100, 124
graded ring, 175 incident, 99
graph, 99 inclusion, 8
acyclic, 101 indecomposable Steenrod square, 257
bipartite, 99 index set, 15
complete, 99 initial point, 28
component of, 100 injection, 7
connected, 100 injective, 7
d-regular, 100 inner product, 143
directed, 101 inner product space, 143
regular, 100 integral domain, 42
308 INDEX

interior, 11
intersection, 6
isometry, 95
isomorphism, 36

Kan complex, 235
Kan condition, 235
kernel, 36
Klein bottle, 17
Kronecker index, 176
Kronecker map, 177

leaf, 101
lifting problem, 190
Lindelöf space, 25
linear combination, 45
linear independence, 45
linear singular simplex, 75
linear span, 45
linear transformation, 45
local triviality, 269
locally convex, 119
long exact sequence, 70
long line, 24
loop, 28, 99, 191

manifold, 97
Mayer-Vietoris sequence, 71
metric space, 9
module, 46
monoid, 32
monomorphism, 36
Moore space, 213
morphism, 48
Morse theory, 97
mutually separated, 27

n-connected, 210
n-connective fiber space, 284
n-extensible, 228
n-homotopic, 213
n-homotopy type, 213
n-simple, 207
n-type, 214
natural transformation, 50
neighborhood, 11
neighborhood base, 12
neighborhood system, 11
neighbors, 99
nerve, 131
net, 222
nodes, 99
noncommutative ring, 41
normal bundle, 269
normal space, 21
normal subgroup, 35
normed space, 95

obstruction cochain, 229
obstruction cohomology class, 231
obstruction set, 231
octonion, 198
one-to-one, 7
onto, 7
open cover, 25
open map, 16
open set, 10
opposite category, 48
ordered simplicial complexes, 135
ordinal partition graph, 112
ordinals, 24
orientation, 55
oriented p-cell, 79

p-component, 280
paracompact, 193
parallelizable manifold, 269
partial homotopy, 228
partial order, 8
partially ordered set, 104
path, 28, 100, 191
path-space fibration, 200
pathwise connected, 28
PCHP, 194
permutation group, 33
persistence complex, 87
persistence diagram, 90
persistence image, 115
persistence landscape, 90, 95
persistence surface, 114
persistent homology, 85
Pigeonhole Principle, 42
polyhedral covering homotopy property, 194
polytope, 64
poset, 104
Postnikov system, 213, 214
Postnikov tower, 213
power set, 23
presheaf, 145
principal ideal domain, 47
product bundle, 194
product topology, 16

pseudo-projective space, 286

quaternion, 41, 198
quotient group, 35
quotient map, 18
quotient space, 18
quotient topology, 18

real projective plane, 18
real projective space, 80
reduced cohomology, 171
reduced homology, 58
reduced powers, 266
regular, 20
relation, 8
relative coboundary, 171
relative cochain, 171
relative cocycle, 171
relative cohomology, 171
relative homology, 58
relative homotopy group, 203
relative topology, 14
relatively compact, 116
retraction, 66
Riemannian manifold, 144
Riemannian metric, 144
ring, 41
Rips zigzag, 93
rng, 41
Russell's Paradox, 6

semisimplicial complex, 137
semisimplicial set, 137
separable, 24
sequence, 96, 222
SEQUITUR algorithm, 107
Serre's exact sequence for a fiber space, 263
set, 5
sheaf, 145
SHI, 275
short exact sequence, 39
simple group, 35
simplex, 53
simplicial approximation, 64
simplicial complex, 53
simplicial map, 54
simplicial set, 138
simply connected, 192
singular chain group, 75
singular p-simplex, 74
singular set functor, 142, 236
sink, 102
skeleton, 53
Smith normal form, 61
source, 102
spanning tree, 101
spectral sequence, 216
split exact sequence, 40
stable homotopy group, 279
stable homotopy theory, 211, 279
stable i-stem, 279
stable range, 211, 279
standard p-simplex, 74
standard (ordered) n-simplex, 136
star of a vertex, 64
Steenrod algebra, 259
Steenrod reduced powers, 266
Steenrod square, 250
stereographic projection, 67
Stiefel manifold, 272
Stiefel-Whitney classes, 271
subcategory
  full, 145
subgraph, 100
subgroup, 33
subset, 5
subspace, 14, 45
supergraph, 100
surjection, 7
surjective, 7
suspension, 18, 72
suspension map, 211, 285
symmetric polynomial, 253

tail, 101
tangent bundle, 269
tangent space, 143
template functions, 120
tensor algebra, 259
tensor product, 162
tensor product of complexes, 179
tent functions, 120
terminal point, 28
Thom isomorphism, 272
time series, 105
topography, 2
topological group, 195
topological space, 10
topological transformation group, 195
topologist's sine curve, 28
topology, 10

  base, 13
  cofinite, 20
  discrete, 10
  subbase, 13
  trivial, 10
Tor, 168
torsion free, 38
torsion product, 168
torsion subgroup, 38
totally disconnected, 28
trail, 100
transgression, 263
transgressive, 264
tree, 101
triad, 200
triangulable pair, 74
triangulable space, 74
triangulation, 74
trivial bundle, 194
trivial cohomology ring, 176
trivial ultranet, 222
Tychonoff space, 21
Tychonoff topology, 16

ultranet, 222
union, 6
unit element, 41
unital module, 258
unitary module, 258
universal coefficient theorem for cohomology, 177
universal coefficient theorem for homology, 178

V(G), 99
vector bundle, 269
vector space, 44
vertices, 99
  adjacent, 99
  degree, 100
  indegree, 102
  joined by an edge, 99
  outdegree, 102
  total degree, 102
Vietoris-Rips complex, 85

walk, 100
  directed, 102
  length, 100
  open, 100
Wang sequence, 218
Wasserstein distance, 95
wedge product, 175
Whitney sum, 270

zero-divisor, 42
zig-zag persistence, 91
zigzag diagram, 92
zigzag module, 92
zigzag submodule, 92
