W11 Trees PDF
4.2 Trees
Investigate!
Consider the graph drawn below.
Does the definition above agree with your intuition for what graphs
we should call trees? Try thinking of examples of trees and make sure
they satisfy the definition. One thing to keep in mind is that while the
trees we study in graph theory are related to trees you might see in other
subjects, the correspondence is not exact. For instance, in anthropology,
you might study family trees, like the one below,
⁴ Sometimes this is stated as “a tree is an acyclic connected graph”; “acyclic” is just a
fancy word for “containing no cycles.”
[Figure: a family tree showing “Me” with parents “Mom” and “Dad”.]
So far so good, but while your grandparents are (probably) not blood-
relatives, if we go back far enough, it is likely that they did have some
common ancestor. If you trace the tree back from you to that common
ancestor, then down through your other grandparent, you would have a
cycle, and thus the graph would not be a tree.
You might also have seen something called a decision tree (such as the
algorithm for deciding whether a series converges or diverges). Some-
times these too contain cycles, as the decision for one node might lead
you back to a previous step.
Both the examples of trees above also have another feature worth
mentioning: there is a clear order to the vertices in the tree. In general,
there is no reason for a tree to have this added structure, although we can
impose such a structure by considering rooted trees, where we simply
designate one vertex as the root. We will consider such trees in more
detail later in this section.
Properties of Trees
We wish to really understand trees. This means we should discover
properties of trees; what makes them special and what is special about
them.
A tree is a connected graph with no cycles. Is there anything else we
can say? It would be nice to have other equivalent conditions for a graph
to be a tree. That is, we would like to know whether there are any graph
theoretic properties that all trees have, and perhaps even that only trees
have.
To get a feel for the sorts of things we can say, we will consider three
propositions about trees. These will also illustrate important proof tech-
niques that apply to graphs in general, and happen to be a little easier for
trees.
Our first proposition gives an alternate definition for a tree. That is, it
gives necessary and sufficient conditions for a graph to be a tree.
Proposition 4.2.1 A graph T is a tree if and only if between every pair of distinct
vertices of T there is a unique path.
4.2. Trees 249
Proof. This is an “if and only if” statement, so we must prove two impli-
cations. We start by proving that if T is a tree, then between every pair of
distinct vertices there is a unique path.
Assume T is a tree, and let u and v be distinct vertices (if T only has
one vertex, then the conclusion is satisfied automatically). We must show
two things to show that there is a unique path between u and v: that there
is a path, and that there is not more than one path. The first of these is
automatic: since T is a tree, it is connected, so there is a path between any
pair of vertices.
To show the path is unique, we suppose there are two paths between
u and v, and get a contradiction. The two paths might start out the same,
but since they are different, there is some first vertex u′ after which the
two paths diverge. However, since the two paths both end at v, there
is some first vertex after u′ that they have in common; call it v′. Now
consider the two paths from u′ to v′. Taken together, these form a cycle,
which contradicts our assumption that T is a tree.
Now we consider the converse: if between every pair of distinct ver-
tices of T there is a unique path, then T is a tree. So assume the hypothesis:
between every pair of distinct vertices of T there is a unique path. To prove
that T is a tree, we must show it is connected and contains no cycles.
The first half of this is easy: T is connected, because there is a path
between every pair of vertices. To show that T has no cycles, we assume
it does, for the sake of contradiction. Let u and v be two distinct vertices
in a cycle of T. Since we can get from u to v by going clockwise or
counterclockwise around the cycle, there are two paths from u to v,
contradicting our assumption.
We have established both directions so we have completed the proof.
qed
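The proposition can be checked by brute force on small graphs. The sketch below is only an illustration, not part of the proof: the function names and the adjacency-list dictionaries are assumptions made for this example, and the exhaustive path count is only practical for tiny graphs.

```python
from itertools import combinations

def count_simple_paths(adj, start, goal, visited=None):
    """Count paths from start to goal that never repeat a vertex."""
    if visited is None:
        visited = {start}
    if start == goal:
        return 1
    total = 0
    for nxt in adj[start]:
        if nxt not in visited:
            visited.add(nxt)
            total += count_simple_paths(adj, nxt, goal, visited)
            visited.remove(nxt)  # backtrack so other routes may use nxt
    return total

def has_unique_paths(adj):
    """True when every pair of distinct vertices is joined by exactly one path."""
    return all(count_simple_paths(adj, u, v) == 1
               for u, v in combinations(adj, 2))

# A path a-b-c with an extra leaf d attached to b: a tree.
tree = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
# A 4-cycle: connected, but not a tree.
cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}

print(has_unique_paths(tree))   # True
print(has_unique_paths(cycle))  # False
```

In the 4-cycle, any two vertices are joined by two paths (one in each direction around the cycle), which is exactly the situation the second half of the proof rules out.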
Read the proof above very carefully. Notice that both directions had
two parts: the existence of paths, and the uniqueness of paths (which
related to the fact that there were no cycles). In this case, these two parts
were really separate. In fact, if we just considered graphs with no cycles
(a forest), then we could still do the parts of the proof that explore the
uniqueness of paths between vertices, even if there might not exist paths
between vertices.
This observation allows us to state the following corollary:⁵
Corollary 4.2.2 A graph F is a forest if and only if between any pair of vertices
in F there is at most one path.
We do not give a proof of the corollary (it is, after all, supposed to
follow directly from the proposition) but for practice, you are asked to
give a careful proof in the exercises. When you do so, try to use proof by
contrapositive instead of proof by contradiction.
⁵ A corollary is another sort of provable statement, like a proposition or theorem, but
one that follows directly from another already established statement, or its proof.
Our second proposition tells us that all trees have leaves: vertices of
degree one.
Proposition 4.2.3 Any tree with at least two vertices has at least two vertices of
degree one.
Proof. We give a proof by contradiction. Let T be a tree with at least
two vertices, and suppose, contrary to stipulation, that there are not two
vertices of degree one.
Let P be a path in T of longest possible length. Let u and v be the
endpoints of the path. Since T does not have two vertices of degree one,
at least one of these must have degree two or higher. Say that it is u. We
know that u is adjacent to a vertex in the path P, but now it must also be
adjacent to another vertex, call it u′.
Where is u′? It cannot be a vertex of P, because if it were, there would
be two distinct paths from u to u′: the edge between them, and the first
part of P (up to u′). But u′ also cannot be outside of P, for if it were, there
would be a path from u′ to v that was longer than P, which has longest
possible length.
This contradiction proves that there must be at least two vertices of
degree one. In fact, we can say a little more: u and v must both have
degree one. qed
The proposition is quite useful when proving statements about trees,
because we often prove statements about trees by induction. To do so, we
need to reduce a given tree to a smaller tree (so we can apply the inductive
hypothesis). Getting rid of a vertex of degree one is an obvious choice,
and now we know there is always one to get rid of.
To illustrate how induction is used on trees, we will consider the
relationship between the number of vertices and number of edges in
trees. Is there a tree with exactly 7 vertices and 7 edges? Try to draw one.
Could a tree with 7 vertices have only 5 edges? There is a good reason
that these seem impossible to draw.
Proposition 4.2.4 Let T be a tree with v vertices and e edges. Then e = v − 1.
Proof. We will give a proof by induction on the number of vertices in the
tree. That is, we will prove that every tree with v vertices has exactly v − 1
edges, and then use induction to show this is true for all v ≥ 1.
For the base case, consider all trees with v = 1 vertices. There is only
one such tree: the graph with a single isolated vertex. This graph has
e = 0 edges, so we see that e = v − 1 as needed.
Now for the inductive case, fix k ≥ 1 and assume that all trees with
v = k vertices have exactly e = k − 1 edges. Now consider an arbitrary tree
T with v = k + 1 vertices. By Proposition 4.2.3, T has a vertex v₀ of degree
one. Let T′ be the tree resulting from removing v₀ from T (together with
its incident edge). Since we removed a leaf, T′ is still a tree (the unique
paths between pairs of vertices in T′ are the same as the unique paths
between them in T).
Now T′ has k vertices, so by the inductive hypothesis, it has k − 1 edges.
What can we say about T? Well, it has one more edge than T′, so it has
k edges. But this is exactly what we wanted: v = k + 1, e = k, so indeed
e = v − 1. This completes the inductive case, and the proof. qed
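The inductive step can be mirrored in code: starting from an arbitrary tree, repeatedly remove a leaf together with its one incident edge until a single vertex remains, counting the edges removed. The sketch below assumes its input really is a tree (otherwise a leaf may not exist and the search for one fails); the function name and dictionary representation are assumptions for this example.

```python
def count_edges_by_stripping_leaves(adj):
    """Remove leaves one at a time, as in the inductive step, counting edges."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    removed_edges = 0
    while len(adj) > 1:
        # Proposition 4.2.3 guarantees a tree with 2+ vertices has a leaf.
        leaf = next(v for v in adj if len(adj[v]) == 1)
        (neighbor,) = adj[leaf]      # a leaf has exactly one neighbor
        adj[neighbor].remove(leaf)
        del adj[leaf]                # one vertex and one edge disappear
        removed_edges += 1
    return removed_edges

tree = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b", "e"],
        "d": ["b"], "e": ["c"]}

v = len(tree)
e = count_edges_by_stripping_leaves(tree)
print(v, e)         # 5 4
print(e == v - 1)   # True
```

Each pass through the loop removes exactly one vertex and one edge, so a tree with v vertices reaches the single-vertex base case after v − 1 removals.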
There is a very important feature of this induction proof that is worth
noting. Induction makes sense for proofs about graphs because we can
think of graphs as growing into larger graphs. However, arguing by growing
a smaller graph does NOT work. It would not be correct to start with a tree
with k vertices, and then
add a new vertex and edge to get a tree with k + 1 vertices, and note that
the number of edges also grew by one. Why is this bad? Because how do
you know that every tree with k + 1 vertices is the result of adding a vertex
to your arbitrary starting tree? You don’t!
The point is that whenever you give an induction proof that a statement
about graphs holds for all graphs with v vertices, you must
start with an arbitrary graph with v + 1 vertices, then reduce that graph to
a graph with v vertices, to which you can apply your inductive hypothesis.
Rooted Trees
So far, we have thought of trees only as a particular kind of graph. How-
ever, it is often useful to add additional structure to trees to help solve
problems. Data is often structured like a tree. This book, for example,
has a tree structure: draw a vertex for the book itself. Then draw vertices
for each chapter, connected to the book vertex. Under each chapter, draw
a vertex for each section, connecting it to the chapter it belongs to. The
graph will not have any cycles; it will be a tree. But it is a tree with a clear
hierarchy, which is not present if we don’t identify the book vertex as the
“top”.
As soon as one vertex of a tree is designated as the root, then every
other vertex on the tree can be characterized by its position relative to the
root. This works because there is a unique path between any two vertices
in a tree. So from any vertex, we can travel back to the root in exactly one
way. This also allows us to describe how distinct vertices in a rooted tree
are related.
If two vertices are adjacent, then we say one of them is the parent of
the other, which is called the child of the parent. Of the two, the parent is
the vertex that is closer to the root. Thus the root of a tree is a parent, but
is not the child of any vertex (and is unique in this respect: all non-root
vertices have exactly one parent).
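Once a root is chosen, the parent of every other vertex is determined. The sketch below computes parents and children for a small tree modeled loosely on the book example above; the vertex names and function names are made up for illustration.

```python
from collections import deque

def root_tree(adj, root):
    """Return (parent, children) maps for the tree adj rooted at root."""
    parent = {root: None}              # the root is nobody's child
    children = {v: [] for v in adj}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:        # w is farther from the root than u
                parent[w] = u
                children[u].append(w)
                queue.append(w)
    return parent, children

def path_to_root(parent, v):
    """Follow parents from v back to the root (the only way to get there)."""
    path = [v]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

book = {
    "book": ["ch1", "ch2"],
    "ch1": ["book", "sec1.1", "sec1.2"],
    "ch2": ["book", "sec2.1"],
    "sec1.1": ["ch1"], "sec1.2": ["ch1"], "sec2.1": ["ch2"],
}

parent, children = root_tree(book, "book")
print(parent["sec1.1"])                # ch1
print(children["book"])                # ['ch1', 'ch2']
print(path_to_root(parent, "sec2.1"))  # ['sec2.1', 'ch2', 'book']
```

Choosing a different root for the same graph gives different parent and child relationships, which is why the root must be designated before these terms make sense.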
Example 4.2.5
[Figure: a rooted tree; only the vertex labels a, c, d, e, g, i survive.]
Example 4.2.6
… are adjacent (they are siblings), so we are good so far. Now put
into A every child of every vertex in B (i.e., every grandchild of the
root). Keep going until all vertices have been assigned to one of the
sets, alternating between A and B every “generation.” That is, a
vertex is in set B if and only if it is the child of a vertex in set A.
The key to how we partitioned the tree in the example was to know
which vertex to assign to a set next. We chose to visit all vertices in the
same generation before any vertices of the next generation. This is usually
called a breadth first search (we say “search” because you often traverse
a tree looking for vertices with certain properties).
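A generation-by-generation partition can be sketched with a queue. The code below assumes the root goes into A and its children into B, which is how the surrounding text appears to set things up; the function name and the example tree are assumptions.

```python
from collections import deque

def bipartition_bfs(adj, root):
    """Split a tree's vertices into sets A and B, one generation at a time."""
    side = {root: "A"}
    queue = deque([root])
    while queue:
        u = queue.popleft()            # oldest unprocessed vertex first
        for w in adj[u]:
            if w not in side:          # w is a child of u
                side[w] = "B" if side[u] == "A" else "A"
                queue.append(w)
    A = {v for v, s in side.items() if s == "A"}
    B = {v for v, s in side.items() if s == "B"}
    return A, B

tree = {"r": ["x", "y"], "x": ["r", "p", "q"], "y": ["r"],
        "p": ["x"], "q": ["x"]}

A, B = bipartition_bfs(tree, "r")
print(sorted(A))  # ['p', 'q', 'r']
print(sorted(B))  # ['x', 'y']

# No edge joins two vertices on the same side.
print(all((u in A) != (w in A) for u in tree for w in tree[u]))  # True
```

The queue is what enforces the breadth first order: every vertex of one generation is dequeued before any vertex of the next.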
In contrast, we could also have partitioned the tree in a different order.
Start with the root, put it in A. Then look for one child of the root to put in
B. Then find a child of that vertex, into A, and then find its child, into B,
and so on. When you get to a vertex with no children, retreat to its parent
and see if the parent has any other children. So we travel as far from the
root as fast as possible, then backtrack until we can move forward again.
This is called depth first search.
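The depth first version can be sketched with recursion: returning from a recursive call plays the role of retreating to the parent. The function name and example tree are assumptions for illustration, and very deep trees would need an explicit stack instead of recursion.

```python
def bipartition_dfs(adj, root):
    """Label vertices 'A' or 'B', going as deep as possible before backtracking."""
    side = {}

    def visit(v, label):
        side[v] = label
        for child in adj[v]:
            if child not in side:      # skip the parent, already labeled
                visit(child, "B" if label == "A" else "A")
        # Falling out of the loop is the "retreat to the parent" step.

    visit(root, "A")
    return side

tree = {"r": ["x", "y"], "x": ["r", "p", "q"], "y": ["r"],
        "p": ["x"], "q": ["x"]}

side = bipartition_dfs(tree, "r")
print(list(side))   # ['r', 'x', 'p', 'q', 'y']  (order of first visits)
print(side)         # {'r': 'A', 'x': 'B', 'p': 'A', 'q': 'A', 'y': 'B'}
```

On this tree a breadth first search started at r would reach y before p and q; depth first search instead finishes everything below x before returning to r and visiting y. The resulting partition is the same either way.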
These algorithmic explanations can serve as a proof that every tree
is bipartite, although care must be taken to prove that the algorithms
are correct. Another approach to prove that all trees are bipartite, using
induction, is requested in the exercises.
Spanning Trees
One of the advantages of trees is that they give us a few simple ways to
travel through the vertices. If a connected graph is not a tree, then we can
still use these traversal algorithms if we identify a subgraph that is a tree.
First we should consider if this even makes sense. Given any connected
graph G, will there always be a subgraph that is a tree? Well, that is
actually too easy: you could just take a single edge of G. If we want to
use this subgraph to tell us how to visit all vertices, then we want our
subgraph to include all of the vertices. We call such a tree a spanning
tree. It turns out that every connected graph has one (and usually many).
Spanning tree.
Given a connected graph G, a spanning tree of G is a subgraph of
G which is a tree and includes all the vertices of G.
Every connected graph has a spanning tree.
To see why, start with a connected graph G. If G has no cycles, it is
already a tree and we are done. Otherwise, let e be an edge in a cycle of G,
and consider the new graph G₁ = G − e (i.e., the graph you get by deleting
e). This graph is still connected: since e belonged to a cycle, there were at
least two paths between its incident vertices. Now repeat: if G₁ has no
cycles, we are done; otherwise define G₂ to be G₁ − e₁, where e₁ is an edge
in a cycle in G₁. Keep going. This process must eventually stop, since
there are only a finite number of edges to remove. The result will be a
tree, and since we never removed any vertex, a spanning tree.
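The edge-deletion process can be sketched as follows. To decide whether an edge lies on a cycle, the code tentatively deletes it and checks whether its endpoints are still connected; if they are not, the edge is restored. This test, the function names, and the example graph are assumptions for illustration, and the graph is assumed to be simple (no loops or repeated edges) and connected.

```python
def still_connected(adj, s, t):
    """True when t can be reached from s in the graph adj."""
    seen = {s}
    stack = [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return False

def spanning_tree_by_deletion(adj):
    """Delete edges that lie on cycles until none remain."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    edges = {frozenset((u, w)) for u in adj for w in adj[u]}
    for edge in edges:
        u, w = tuple(edge)
        adj[u].remove(w)
        adj[w].remove(u)
        if not still_connected(adj, u, w):
            # The edge was not on any cycle, so it must stay.
            adj[u].add(w)
            adj[w].add(u)
    return adj

# A 4-cycle 1-2-3-4 with the chord 1-3.
G = {1: [2, 3, 4], 2: [1, 3], 3: [1, 2, 4], 4: [1, 3]}

T = spanning_tree_by_deletion(G)
print(sum(len(nbrs) for nbrs in T.values()) // 2)  # 3 edges on 4 vertices
```

Which three edges survive depends on the order in which edges are examined, but the result always keeps all four vertices connected.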
This is by no means the only algorithm for finding a spanning tree.
You could have started with the empty graph and added edges that belong
to G as long as adding them would not create a cycle. You have some
choices as to which edges you add first: you could always add an edge
adjacent to edges you have already added (after the first one, of course),
or add them using some other order. Which spanning tree you end up
with depends on these choices.
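The edge-adding approach can be sketched with a simple union-find structure, which tracks which vertices are already joined so that an edge is accepted only when it connects two separate pieces. Union-find is one common way to detect that an edge would create a cycle; the text does not prescribe it, and the names below are assumptions for illustration.

```python
def spanning_tree_by_addition(vertices, edges):
    """Add edges in the given order, skipping any that would close a cycle."""
    leader = {v: v for v in vertices}   # each vertex starts in its own piece

    def find(v):
        """Representative of the piece containing v."""
        while leader[v] != v:
            leader[v] = leader[leader[v]]   # shorten the chain as we go
            v = leader[v]
        return v

    tree_edges = []
    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru != rw:                    # different pieces: no cycle created
            leader[ru] = rw
            tree_edges.append((u, w))
    return tree_edges

vertices = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 1)]

print(spanning_tree_by_addition(vertices, edges))
# [(1, 2), (2, 3), (3, 4)]
print(spanning_tree_by_addition(vertices, list(reversed(edges))))
# [(4, 1), (3, 4), (2, 3)]
```

The two calls consider the same edges in opposite orders and produce different spanning trees, illustrating that the outcome depends on the choices made along the way.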
Example 4.2.7
⁶ In a graph whose edges carry weights: if you always add the smallest-weight edge adjacent to
edges you have already added, you are doing Prim’s algorithm. If you always add the smallest-weight
edge in the entire graph (that does not create a cycle), you are following Kruskal’s algorithm.