Trees

• General Trees
  - Tree Definitions and Properties
  - The Tree Abstract Data Type
  - Computing Depth and Height
• Binary Trees
  - The Binary Tree Abstract Data Type
  - Properties of Binary Trees
• Implementing Trees
  - Linked Structure for Binary Trees
  - Array-Based Representation of a Binary Tree
  - Linked Structure for General Trees
• Tree Traversal Algorithms
  - Preorder and Postorder Traversals of General Trees
  - Breadth-First Tree Traversal
  - Inorder Traversal of a Binary Tree
  - Implementing Tree Traversals in Python
  - Applications of Tree Traversals
  - Euler Tours and the Template Method Pattern
General Trees
A tree is an abstract data type that stores elements hierarchically. With the exception
of the top element, each element in a tree has a parent element and zero or
more children elements. A tree is usually visualized by placing elements inside
ovals or rectangles, and by drawing the connections between parents and children
with straight lines. We typically call the top element the root
of the tree, but it is drawn as the highest element, with the other elements being
connected below.

Figure 8.2: A tree with 17 nodes representing the organization of a fictitious corporation.
• The root stores Electronics R’Us.
• The children of the root store R&D, Sales, Purchasing, and Manufacturing.
• The internal nodes store Sales, International, Overseas, Electronics R’Us, and Manufacturing.
Formal Tree Definition

Formally, we define a tree T as a set of nodes storing elements such that the nodes have a parent-child relationship that
satisfies the following properties:
• If T is nonempty, it has a special node, called the root of T, that has no parent.
• Each node v of T different from the root has a unique parent node w; every node with parent w is a child of w.
Note that according to our definition, a tree can be empty, meaning that it does not have any nodes. This convention also
allows us to define a tree recursively such that a tree T is either empty or consists of a node r, called the root of T, and a
(possibly empty) set of subtrees whose roots are the children of r.

Other Node Relationships

• Two nodes that are children of the same parent are siblings.
• A node v is external if v has no children.
• A node v is internal if it has one or more children.
• External nodes are also known as leaves.
Example 8.1: In Section 4.1.4, we discussed the hierarchical relationship between files and directories in a computer’s file
system, although at the time we did not emphasize the nomenclature of a file system as a tree. In Figure 8.3, we revisit an
earlier example. We see that the internal nodes of the tree are associated with directories and the leaves are associated with
regular files. In the UNIX and Linux operating systems, the root of the tree is appropriately called the “root directory,” and
is represented by the symbol “/.”

• A node u is an ancestor of a node v if u = v or u is an ancestor of the parent of v.
• Conversely, we say that a node v is a descendant of a node u if u is an ancestor of v.
• For example, in Figure 8.3, cs252/ is an ancestor of papers/, and pr3 is a descendant of cs016/.
• The subtree of T rooted at a node v is the tree consisting of all the descendants of v in T (including v itself).
• In Figure 8.3, the subtree rooted at cs016/ consists of the nodes cs016/, grades, homeworks/, programs/, hw1, hw2, hw3, pr1, pr2, and pr3.
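The ancestor, descendant, and subtree definitions above can be sketched with a plain parent map. The dictionary below is a hypothetical, partial fragment loosely modeled on the cs016/ portion of Figure 8.3, and the helper names is_ancestor and subtree are illustrative assumptions rather than part of the tree ADT.

```python
# Hypothetical fragment of a file-system tree: node -> parent (the root maps to None).
parent = {
    "cs016/": None,
    "grades": "cs016/",
    "homeworks/": "cs016/",
    "programs/": "cs016/",
    "hw1": "homeworks/",
    "pr3": "programs/",
}


def is_ancestor(u, v):
    """Return True if u is an ancestor of v (every node is its own ancestor)."""
    while v is not None:
        if u == v:
            return True
        v = parent[v]  # climb one level toward the root
    return False


def subtree(v):
    """Return the set of all descendants of v, including v itself."""
    return {w for w in parent if is_ancestor(v, w)}


print(is_ancestor("cs016/", "pr3"))   # True
print(sorted(subtree("programs/")))   # ['pr3', 'programs/']
```

Walking parent links makes each ancestor test cost time proportional to the depth of v, which is adequate for a small illustration.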
Edges and Paths in Trees

• An edge of tree T is a pair of nodes (u,v) such that u is the parent of v, or vice versa. A path of T is a sequence of nodes
such that any two consecutive nodes in the sequence form an edge.
• For example, the tree in Figure 8.3 contains the path (cs252/, projects/, demos/, market).
• Example 8.2: The inheritance relation between classes in a Python program forms a tree when single inheritance is used.
• The BaseException class is the root of that hierarchy, while all user-defined exception classes should conventionally be
declared as descendants of the more specific Exception class.
Ordered Trees
A tree is ordered if there is a meaningful linear order among the children of each node; that is, we purposefully identify
the children of a node as being the first, second, third, and so on. Such an order is usually visualized by arranging siblings
left to right, according to their order.

Example 8.3: The components of a structured document, such as a book, are hierarchically organized as a tree whose
internal nodes are parts, chapters, and sections, and whose leaves are paragraphs, tables, figures, and so on. (See Figure
8.6.) The root of the tree corresponds to the book itself. We could, in fact, consider expanding the tree further to show
paragraphs consisting of sentences, sentences consisting of words, and words consisting of characters. Such a tree is an
example of an ordered tree, because there is a well-defined order among the children of each node.
The Tree Abstract Data Type

We define a tree ADT using the concept of a position as an abstraction for a node of a tree.
An element is stored at each position, and positions satisfy parent-child relationships that define the tree structure.
A position object for a tree supports the method:
p.element(): Return the element stored at position p.
The tree ADT then supports the following accessor methods, allowing a user to navigate the various positions of a tree:
T.root(): Return the position of the root of tree T, or None if T is empty.
T.is_root(p): Return True if position p is the root of tree T.
T.parent(p): Return the position of the parent of position p, or None if p is the root of T.
T.num_children(p): Return the number of children of position p.
T.children(p): Generate an iteration of the children of position p.
T.is_leaf(p): Return True if position p does not have any children.
len(T): Return the number of positions (and hence elements) that are contained in tree T.
T.is_empty(): Return True if tree T does not contain any positions.
T.positions(): Generate an iteration of all positions of tree T.
iter(T): Generate an iteration of all elements stored within tree T.
• Any of the above methods that accepts a position as an argument should generate a ValueError if that position is
invalid for T.
• If a tree T is ordered, then T.children(p) reports the children of p in the natural order. If p is a leaf, then T.children(p)
generates an empty iteration.
• In similar regard, if tree T is empty, then both T.positions( ) and iter(T) generate empty iterations.
A Tree Abstract Base Class in Python
• We choose to define a Tree class, in Code Fragment 8.1, that serves as an abstract base class corresponding to the tree
ADT.
• The Tree class provides a definition of a nested Position class (which is also abstract), and declarations of many of the
accessor methods included in the tree ADT.
• However, our Tree class does not define any internal representation for storing a tree, and five of the methods given in
that code fragment remain abstract (root, parent, num_children, children, and __len__); each of these methods raises a
NotImplementedError.
• The subclasses are responsible for overriding abstract methods, such as children, to provide a working implementation
for each behavior, based on their chosen internal representation.
• Although the Tree class is an abstract base class, it includes several concrete methods with implementations that rely on
calls to the abstract methods of the class.
• In defining the tree ADT in the previous section, we declare ten accessor methods.
• Five of those are the ones we left as abstract, in Code Fragment 8.1.
• The other five can be implemented based on the former.
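As a rough illustration of this split between abstract and concrete methods, the following sketch shows the five abstract methods raising NotImplementedError and three of the concrete methods built on top of them. It is written in the spirit of Code Fragment 8.1 but is not a copy of it; positions and __iter__ are omitted here for brevity, and the error message text is an assumption.

```python
class Tree:
    """Sketch of an abstract base class for the tree ADT."""

    class Position:
        """Abstraction representing the location of a single element."""

        def element(self):
            """Return the element stored at this position."""
            raise NotImplementedError("must be implemented by subclass")

    # Abstract methods: each concrete subclass must override these.
    def root(self):
        raise NotImplementedError("must be implemented by subclass")

    def parent(self, p):
        raise NotImplementedError("must be implemented by subclass")

    def num_children(self, p):
        raise NotImplementedError("must be implemented by subclass")

    def children(self, p):
        raise NotImplementedError("must be implemented by subclass")

    def __len__(self):
        raise NotImplementedError("must be implemented by subclass")

    # Concrete methods: written once, in terms of the abstract ones.
    def is_root(self, p):
        return self.root() == p

    def is_leaf(self, p):
        return self.num_children(p) == 0

    def is_empty(self):
        return len(self) == 0
```

Because is_root, is_leaf, and is_empty only call the abstract methods, any subclass that supplies a storage representation inherits them without further work.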
8.1.3 Computing Depth and Height

• Let p be the position of a node of a tree T.
• The depth of p is the number of ancestors of p, excluding p itself.
• Note that this definition implies that the depth of the root of T is 0.
• The depth of p can also be recursively defined as follows:
  - If p is the root, then the depth of p is 0.
  - Otherwise, the depth of p is one plus the depth of the parent of p.

• Based on this definition, we present a simple, recursive algorithm, depth, in Code Fragment 8.3, for computing the
depth of a position p in Tree T. This method calls itself recursively on the parent of p, and adds 1 to the value
returned.
• The running time of T.depth(p) for position p is $O(d_p + 1)$, where $d_p$ denotes the depth of p in the tree T, because the
algorithm performs a constant-time recursive step for each ancestor of p.
• Thus, algorithm T.depth(p) runs in O(n) worst-case time, where n is the total number of positions of T, because a
position of T may have depth $n - 1$ if all nodes form a single branch.
• Although such a running time is a function of the input size, it is more informative to characterize the running time
in terms of the parameter $d_p$, as this parameter may be much smaller than n.
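The recursive definition of depth translates almost directly into code. The sketch below is not Code Fragment 8.3 itself: it assumes a hypothetical parent dictionary in place of the tree ADT, with the root mapped to None.

```python
def depth(parent, p):
    """Return the number of ancestors of p, excluding p itself."""
    if parent[p] is None:  # p is the root
        return 0
    return 1 + depth(parent, parent[p])  # one more than the parent's depth


# Hypothetical tree: A is the root, and E sits at the end of the chain A-B-D-E.
parent = {"A": None, "B": "A", "C": "A", "D": "B", "E": "D"}

print(depth(parent, "A"))  # 0
print(depth(parent, "E"))  # 3
```

Each call does constant work and then moves one step toward the root, which is where the bound in terms of the depth of p comes from.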
Height
• The height of a position p in a tree T is also defined recursively:
• If p is a leaf, then the height of p is 0.
• Otherwise, the height of p is one more than the maximum of the heights of p’s children.
• The height of a nonempty tree T is the height of the root of T.
• For example, the tree of Figure 8.2 has height 4.
• In addition, height can also be viewed as follows.

Proposition 8.4: The height of a nonempty tree T is equal to the maximum of the depths of its leaf positions.
• Proposition 8.4 suggests computing the height as the maximum depth over all leaves, but calling depth on every leaf
takes $O(n^2)$ time in the worst case. We can compute the height of a tree more efficiently, in O(n) worst-case time, by
relying instead on the original recursive definition.
• To do this, we will parameterize a function based on a position within the tree, and calculate the height of the subtree
rooted at that position.
• Algorithm height2, shown as nonpublic method _height2 in Code Fragment 8.5, computes the height of tree T in this
way.
Proposition 8.5: Let T be a tree with n positions, and let $c_p$ denote the number of children of a position p of T. Then,
summing over the positions of T, $\sum_p c_p = n - 1$.
Justification: Each position of T, with the exception of the root, is a child of another position, and thus contributes one
unit to the above sum.
Algorithm height2 spends $O(c_p + 1)$ time at each position p, for a total of $O(n + \sum_p c_p)$. By Proposition 8.5, the
running time of algorithm height2, when called on the root of T, is therefore O(n), where n is the number of
positions of T. Revisiting the public interface for our Tree class, the ability to compute heights of subtrees is beneficial, but
a user might expect to be able to compute the height of the entire tree without explicitly designating the tree root. We can
wrap the nonpublic _height2 in our implementation with a public height method that provides a default interpretation when
invoked on tree T with syntax T.height(). Such an implementation is given in Code Fragment 8.6.
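A minimal sketch of this arrangement follows, using a hypothetical children dictionary instead of the Tree class; the function signatures are assumptions made for illustration and differ from the methods in Code Fragments 8.5 and 8.6. The last lines also check Proposition 8.5 on the sample tree.

```python
def _height2(children, p):
    """Return the height of the subtree rooted at p."""
    kids = children.get(p, [])
    if not kids:  # a leaf has height 0
        return 0
    return 1 + max(_height2(children, c) for c in kids)


def height(children, root, p=None):
    """Return the height of the subtree rooted at p (the whole tree by default)."""
    if p is None:
        p = root
    return _height2(children, p)


# Hypothetical tree with 5 positions: A has children B and C, B -> D, D -> E.
children = {"A": ["B", "C"], "B": ["D"], "D": ["E"]}

print(height(children, "A"))       # 3
print(height(children, "A", "B"))  # 2

# Proposition 8.5: the child counts sum to n - 1.
n = 5
print(sum(len(kids) for kids in children.values()) == n - 1)  # True
```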
Binary Trees

A binary tree is an ordered tree with the following properties:


1. Every node has at most two children.
2. Each child node is labeled as being either a left child or a right child.
3. A left child precedes a right child in the order of children of a node.

• The subtree rooted at a left or right child of an internal node v is called a left subtree or right subtree, respectively, of v.
• A binary tree is proper if each node has either zero or two children.
• Some people also refer to such trees as being full binary trees.
• Thus, in a proper binary tree, every internal node has exactly two children.
• A binary tree that is not proper is improper.
Example 8.6:
• An important class of binary trees arises in contexts where we wish to represent a number of different outcomes that can
result from answering a series of yes-or-no questions.
• Each internal node is associated with a question. Starting at the root, we go to the left or right child of the current node,
depending on whether the answer to the question is “Yes” or “No.”
• With each decision, we follow an edge from a parent to a child, eventually tracing a path in the tree from the root to a
leaf.
• Such binary trees are known as decision trees, because a leaf position p in such a tree represents a decision of what to do
if the questions associated with p’s ancestors are answered in a way that leads to p.
• A decision tree is a proper binary tree.
• Figure 8.7 illustrates a decision tree that provides recommendations to a prospective investor.
• Example 8.7: An arithmetic expression can be represented by a binary tree whose leaves are associated with variables or
constants, and whose internal nodes are associated with one of the operators +, −, ×, and /. (See Figure 8.8.)
• Each node in such a tree has a value associated with it.
  - If a node is a leaf, then its value is that of its variable or constant.
  - If a node is internal, then its value is defined by applying its operation to the values of its children.
• An arithmetic expression tree is a proper binary tree, since each operator +, −, ×, and / takes exactly two operands. Of
course, if we were to allow unary operators, like negation (−), as in “−x,” then we could have an improper binary tree.
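The value rule for expression trees can be sketched as a short recursive function. The nested-tuple encoding (operator, left, right) and the ASCII operator symbols `*` and `-` standing in for × and − are assumptions chosen for this illustration.

```python
import operator

# Map each operator symbol to the function that applies it.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(node):
    """Return the value of an expression tree given as nested tuples."""
    if not isinstance(node, tuple):  # leaf: a constant
        return node
    op, left, right = node           # internal node: operator with two operands
    return OPS[op](evaluate(left), evaluate(right))


# The tree for (3 + 1) * (9 - 5).
expr = ("*", ("+", 3, 1), ("-", 9, 5))
print(evaluate(expr))  # 16
```

Every internal node here has exactly two children, so the tree is proper, matching the observation above.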
The Binary Tree Abstract Data Type

As an abstract data type, a binary tree is a specialization of a tree that supports three additional accessor methods:

T.left(p): Return the position that represents the left child of p, or None if p has no left child.

T.right(p): Return the position that represents the right child of p, or None if p has no right child.

T.sibling(p): Return the position that represents the sibling of p, or None if p has no sibling.
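The sibling accessor can be derived from parent, left, and right alone. The following sketch assumes a hypothetical bare Node class with those three links rather than the position-based interface.

```python
class Node:
    """Hypothetical binary tree node with parent and child links."""

    def __init__(self, element, parent=None):
        self.element = element
        self.parent = parent
        self.left = None
        self.right = None


def sibling(p):
    """Return the sibling of node p, or None if p has no sibling."""
    parent = p.parent
    if parent is None:        # the root has no sibling
        return None
    if p is parent.left:
        return parent.right   # possibly None
    return parent.left        # possibly None


root = Node("r")
a, b = Node("a", root), Node("b", root)
root.left, root.right = a, b
c = Node("c", a)              # c is an only child
a.left = c

print(sibling(a).element)     # b
print(sibling(c))             # None
```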
The Binary Tree Abstract Base Class in Python
Properties of Binary Trees

• Binary trees have several interesting properties dealing with relationships between their heights and number of nodes.
• In a binary tree, level 0 has at most one node (the root), level 1 has at most two nodes (the children of the root), level 2
has at most four nodes, and so on. (See Figure 8.9.)
• In general, level d has at most $2^d$ nodes.
Proposition 8.8: Let T be a nonempty binary tree, and let $n$, $n_E$, $n_I$, and $h$ denote the number of nodes, number of external
nodes, number of internal nodes, and height of T, respectively.

Then T has the following properties:

1. $h + 1 \le n \le 2^{h+1} - 1$
2. $1 \le n_E \le 2^h$
3. $h \le n_I \le 2^h - 1$
4. $\log(n + 1) - 1 \le h \le n - 1$

Also, if T is proper, then T has the following properties:

1. $2h + 1 \le n \le 2^{h+1} - 1$
2. $h + 1 \le n_E \le 2^h$
3. $h \le n_I \le 2^h - 1$
4. $\log(n + 1) - 1 \le h \le (n - 1)/2$
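To make the bounds for proper binary trees concrete, the sketch below counts n, n_E, n_I, and h for one small proper tree and tests each inequality. The (left, right) tuple encoding is an assumption for illustration, and the logarithm is taken base 2.

```python
from math import log2


def stats(t):
    """Return (n, n_E, n_I, h) for a nonempty binary tree of (left, right) pairs."""
    kids = [c for c in t if c is not None]
    if not kids:                      # external node
        return 1, 1, 0, 0
    n, n_e, n_i, h = 1, 0, 1, 0       # count this internal node
    for c in kids:
        cn, ce, ci, ch = stats(c)
        n, n_e, n_i = n + cn, n_e + ce, n_i + ci
        h = max(h, ch + 1)
    return n, n_e, n_i, h


leaf = (None, None)
proper = ((leaf, leaf), leaf)         # root with an internal left child and a leaf right child

n, n_e, n_i, h = stats(proper)
print((n, n_e, n_i, h))                          # (5, 3, 2, 2)
print(2 * h + 1 <= n <= 2 ** (h + 1) - 1)        # True
print(h + 1 <= n_e <= 2 ** h)                    # True
print(h <= n_i <= 2 ** h - 1)                    # True
print(log2(n + 1) - 1 <= h <= (n - 1) / 2)       # True
```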
Relating Internal Nodes to External Nodes in a Proper Binary Tree

In addition to the earlier binary tree properties, the following relationship exists between the number of internal nodes and
external nodes in a proper binary tree.

Proposition 8.9: In a nonempty proper binary tree T, with $n_E$ external nodes and $n_I$ internal nodes, we have
$n_E = n_I + 1$.
Justification:
• We justify this proposition by removing nodes from T and dividing them up into two “piles,” an internal-node pile and an
external-node pile, until T becomes empty.
• The piles are initially empty. By the end, we will show that the external-node pile has one more node than the internal-
node pile. We consider two cases:

Case 1:
If T has only one node v, we remove v and place it on the external-node pile. Thus, the external-node pile has one node and
the internal-node pile is empty.
Case 2:
• Otherwise (T has more than one node), we remove from T an (arbitrary) external node w and its parent v, which is an
internal node.
• We place w on the external-node pile and v on the internal-node pile.
• If v has a parent u, then we reconnect u with the former sibling z of w, as shown in Figure 8.10.
• This operation removes one internal node and one external node, and leaves the tree a proper binary tree.
• Repeating this operation, we eventually are left with a final tree consisting of a single node.
• Note that the same number of external and internal nodes have been removed and placed on their respective piles by the
sequence of operations leading to this final tree.
• Now, we remove the node of the final tree and we place it on the external-node pile.
• Thus, the external-node pile has one more node than the internal-node pile.
Implementing Trees

There are several choices for the internal representation of trees. We describe the most common representations in this
section. We begin with the case of a binary tree, since its shape is more narrowly defined.

Linked Structure for Binary Trees


• A natural way to realize a binary tree T is to use a linked structure, with a node (see Figure 8.11a) that maintains
references to the element stored at a position p and to the nodes associated with the children and parent of p.
• If p is the root of T, then the parent field of p is None.
• Likewise, if p does not have a left child (respectively, right child), the associated field is None.
• The tree itself maintains an instance variable storing a reference to the root node (if any), and a variable, called size, that
represents the overall number of nodes of T.
• We show such a linked structure representation of a binary tree in Figure 8.11b.
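A stripped-down version of such a linked structure might look as follows. This is a sketch under simplifying assumptions: it hands out node objects directly instead of wrapping them in positions, and the method names add_root, add_left, and add_right are chosen for illustration.

```python
class LinkedBinaryTree:
    """Minimal linked binary tree; nodes are exposed directly for brevity."""

    class _Node:
        __slots__ = ("element", "parent", "left", "right")

        def __init__(self, element, parent=None):
            self.element = element
            self.parent = parent   # None for the root
            self.left = None       # None when there is no left child
            self.right = None      # None when there is no right child

    def __init__(self):
        self._root = None          # reference to the root node, if any
        self._size = 0             # overall number of nodes

    def __len__(self):
        return self._size

    def add_root(self, element):
        if self._root is not None:
            raise ValueError("root already exists")
        self._root = self._Node(element)
        self._size = 1
        return self._root

    def add_left(self, node, element):
        if node.left is not None:
            raise ValueError("left child already exists")
        node.left = self._Node(element, parent=node)
        self._size += 1
        return node.left

    def add_right(self, node, element):
        if node.right is not None:
            raise ValueError("right child already exists")
        node.right = self._Node(element, parent=node)
        self._size += 1
        return node.right


T = LinkedBinaryTree()
r = T.add_root("+")
a = T.add_left(r, 3)
b = T.add_right(r, 4)
print(len(T))  # 3
```

Keeping a size counter in the tree object lets len(T) run in constant time instead of requiring a traversal.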
Array-Based Representation of a Binary Tree

An alternative representation of a binary tree T is based on a way of numbering the positions of T. For every position p of
T, let $f(p)$ be the integer defined as follows.
• If p is the root of T, then $f(p) = 0$.
• If p is the left child of position q, then $f(p) = 2f(q) + 1$.
• If p is the right child of position q, then $f(p) = 2f(q) + 2$.

• The numbering function f is known as a level numbering of the positions in a binary tree T, for it numbers the positions
on each level of T in increasing order from left to right.
• Note well that the level numbering is based on potential positions within the tree, not actual positions of a given tree,
so they are not necessarily consecutive.
• For example, in Figure 8.12(b), there are no nodes with level numbering 13 or 14, because the node with level
numbering 6 has no children.
• The level numbering function f suggests a representation of a binary tree T by means of an array-based structure A
(such as a Python list), with the element at position p of T stored at index $f(p)$ of the array.
• We show an example of an array-based representation of a binary tree in Figure 8.13.
• One advantage of an array-based representation of a binary tree is that a position p can be represented by the single
integer $f(p)$, and that position-based methods such as root, parent, left, and right can be implemented using simple
arithmetic operations on the number $f(p)$.
• Based on our formula for the level numbering, the left child of p has index $2f(p) + 1$, the right child of p has index
$2f(p) + 2$, and the parent of p has index $\lfloor (f(p) - 1)/2 \rfloor$.
• We leave the details of a complete implementation as an exercise (R-8.18).
• The space usage of an array-based representation depends greatly on the shape of the tree.
• Let n be the number of nodes of T, and let $f_M$ be the maximum value of $f(p)$ over all the nodes of T.
• The array A requires length $N = 1 + f_M$, since elements range from A[0] to A[$f_M$].
• Note that A may have a number of empty cells that do not refer to existing nodes of T. In fact, in the worst case,
$N = 2^n - 1$ (for example, when the n nodes form a single chain of right children).
• Another drawback of an array representation is that some update operations for trees cannot be efficiently supported. For
example, deleting a node and promoting its child takes O(n) time because it is not just the child that moves locations
within the array, but all descendants of that child.
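The index arithmetic behind the level numbering can be sketched in a few lines. The list A below is a hypothetical five-cell example with None marking an empty cell, and right_spine_length is an illustrative helper for the worst-case array length.

```python
def left(i):
    """Index of the left child of the position with level number i."""
    return 2 * i + 1


def right(i):
    """Index of the right child of the position with level number i."""
    return 2 * i + 2


def parent(i):
    """Index of the parent, or None for the root (index 0)."""
    return (i - 1) // 2 if i > 0 else None


# Hypothetical tree: a is the root, b and c are its children, and d is the
# right child of b. Index 3 (the left child of b) is an empty cell.
A = ["a", "b", "c", None, "d"]

print(A[left(0)], A[right(0)])   # b c
print(A[parent(4)])              # b


def right_spine_length(n):
    """Array length needed when n nodes form a chain of right children."""
    i = 0
    for _ in range(n - 1):
        i = right(i)
    return i + 1


print(right_spine_length(4))     # 15
```

The last call illustrates the worst case for space: four nodes on a right spine occupy indices 0, 2, 6, and 14, so the array needs 15 cells.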
Tree Traversal Algorithms
• A traversal of a tree T is a systematic way of accessing, or “visiting,” all the positions of T.
• The specific action associated with the “visit” of a position p depends on the application of this traversal, and could
involve anything from incrementing a counter to performing some complex computation for p.

Preorder and Postorder Traversals of General Trees


• In a preorder traversal of a tree T, the root of T is visited first and then the subtrees rooted at its children are traversed
recursively.
• If the tree is ordered, then the subtrees are traversed according to the order of the children.
• The pseudo-code for the preorder traversal of the subtree rooted at a position p is shown in Code Fragment 8.12.
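As one possible rendering of the preorder idea in Python, the generator below visits a position and then recurses into its children in order. It is not Code Fragment 8.12: the children dictionary and the sample document outline are assumptions made for this sketch.

```python
def preorder(children, p):
    """Generate the positions of the subtree rooted at p in preorder."""
    yield p                                # visit p before its subtrees
    for c in children.get(p, []):
        yield from preorder(children, c)   # then traverse each child's subtree


# Hypothetical document outline.
children = {
    "Paper": ["Title", "Abstract", "S1", "S2"],
    "S1": ["S1.1", "S1.2"],
    "S2": ["S2.1"],
}

print(list(preorder(children, "Paper")))
# ['Paper', 'Title', 'Abstract', 'S1', 'S1.1', 'S1.2', 'S2', 'S2.1']
```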
Postorder Traversal

• Another important tree traversal algorithm is the postorder traversal.


• In some sense, this algorithm can be viewed as the opposite of the preorder traversal, because it recursively traverses
the subtrees rooted at the children of the root first, and then visits the root (hence, the name “postorder”).
• Pseudo-code for the postorder traversal is given in Code Fragment 8.13, and an example of a postorder traversal is
portrayed in Figure 8.16.
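A postorder traversal can be sketched the same way, with the visit moved after the recursive calls. As before, the children dictionary and the sample outline are hypothetical and this is not Code Fragment 8.13 itself.

```python
def postorder(children, p):
    """Generate the positions of the subtree rooted at p in postorder."""
    for c in children.get(p, []):
        yield from postorder(children, c)  # traverse each child's subtree first
    yield p                                # visit p after its subtrees


# Hypothetical document outline.
children = {
    "Paper": ["Title", "Abstract", "S1", "S2"],
    "S1": ["S1.1", "S1.2"],
    "S2": ["S2.1"],
}

print(list(postorder(children, "Paper")))
# ['Title', 'Abstract', 'S1.1', 'S1.2', 'S1', 'S2.1', 'S2', 'Paper']
```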
Running-Time Analysis

• Both preorder and postorder traversal algorithms are efficient ways to access all the positions of a tree.
• The analysis of either of these traversal algorithms is similar to that of algorithm height2, given in Code Fragment 8.5
of Section 8.1.3.
• At each position p, the nonrecursive part of the traversal algorithm requires time $O(c_p + 1)$, where $c_p$ is the number of
children of p, under the assumption that the “visit” itself takes O(1) time.
• By Proposition 8.5, the overall running time for the traversal of tree T is O(n), where n is the number of positions in the
tree.
• This running time is asymptotically optimal since the traversal must visit all the n positions of the tree.
Inorder Traversal of a Binary Tree

• During an inorder traversal, we visit a position between the recursive traversals of its left and right subtrees.
• The inorder traversal of a binary tree T can be informally viewed as visiting the nodes of T “from left to right.”
• Indeed, for every position p, the inorder traversal visits p after all the positions in the left subtree of p and before all the
positions in the right subtree of p.
• Pseudo-code for the inorder traversal algorithm is given in Code Fragment 8.15, and an example of an inorder traversal
is portrayed in Figure 8.18.
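The left-subtree, position, right-subtree order can be sketched as a generator. The (left, element, right) tuple encoding, with None for a missing subtree, is an assumption for illustration and not the representation used in Code Fragment 8.15.

```python
def inorder(node):
    """Generate the elements of a binary tree in inorder."""
    if node is None:              # empty subtree: nothing to visit
        return
    left, element, right = node
    yield from inorder(left)      # everything in the left subtree comes first
    yield element                 # then the position itself
    yield from inorder(right)     # then everything in the right subtree


# Hypothetical tree: root 2 with left child 1 and right child 4; 4 has left child 3.
tree = ((None, 1, None), 2, ((None, 3, None), 4, None))

print(list(inorder(tree)))  # [1, 2, 3, 4]
```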
Binary Search Trees

• An important application of the inorder traversal algorithm arises when we store an ordered sequence of elements in a
binary tree, defining a structure we call a binary search tree.
• Let S be a set whose unique elements have an order relation.
• For example, S could be a set of integers.
• A binary search tree for S is a binary tree T such that, for each position p of T:
  - Position p stores an element of S, denoted as e(p).
  - Elements stored in the left subtree of p (if any) are less than e(p).
  - Elements stored in the right subtree of p (if any) are greater than e(p).
• An example of a binary search tree is shown in Figure 8.19.
• The above properties assure that an inorder traversal of a binary search tree T visits the elements in nondecreasing order.
Figure 8.19: A binary search tree storing integers. The solid path is traversed when searching
(successfully) for 36. The dashed path is traversed when searching (unsuccessfully) for 70.
• We can use a binary search tree T for set S to find whether a given search value v is in S, by traversing a path down the
tree T, starting at the root.
• At each internal position p encountered, we compare our search value v with the element e(p) stored at p.
• If v < e(p), then the search continues in the left subtree of p.
• If v = e(p), then the search terminates successfully.
• If v > e(p), then the search continues in the right subtree of p.
• Finally, if we reach an empty subtree, the search terminates unsuccessfully.
• In other words, a binary search tree can be viewed as a binary decision tree (recall Example 8.6), where the question
asked at each internal node is whether the element at that node is less than, equal to, or larger than the element being
searched for.
• Note that the running time of searching in a binary search tree T is proportional to the height of T.
• Recall from Proposition 8.8 that the height of a binary tree with n nodes can be as small as log(n+1)−1 or as large as
n−1.
• Thus, binary search trees are most efficient when they have small height.
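The search procedure described above can be sketched as a loop that follows one root-to-leaf path. The tree below is a small hypothetical example (not the tree of Figure 8.19), encoded as (left, element, right) tuples; the function also returns the path it followed so that the cost is visible.

```python
def leaf(e):
    """Build a node with no children."""
    return (None, e, None)


def search(node, v):
    """Return (found, path), where path lists the elements compared with v."""
    path = []
    while node is not None:
        left, e, right = node
        path.append(e)
        if v == e:
            return True, path          # successful search
        node = left if v < e else right
    return False, path                 # reached an empty subtree


# Hypothetical binary search tree with root 31.
bst = ((leaf(12), 25, None), 31, (leaf(36), 42, leaf(47)))

print(search(bst, 36))  # (True, [31, 42, 36])
print(search(bst, 40))  # (False, [31, 42, 36])
```

In both calls the number of comparisons is at most one more than the height of the tree, which is the sense in which search time is proportional to height.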
Implementing Tree Traversals in Python

• When first defining the tree ADT in Section 8.1.2, we stated that tree T should include support for the following methods:
  - T.positions(): Generate an iteration of all positions of tree T.
  - iter(T): Generate an iteration of all elements stored within tree T.
• At that time, we did not make any assumption about the order in which these iterations report their results.
• In this section, we demonstrate how any of the tree traversal algorithms we have introduced could be used to produce
these iterations.
• To begin, we note that it is easy to produce an iteration of all elements of a tree, if we rely on a presumed iteration of all
positions.
• Therefore, support for the iter(T) syntax can be formally provided by a concrete implementation of the special method
__iter__ within the abstract base class Tree.
• Our implementation of Tree.__iter__ is given in Code Fragment 8.16.
Applications of Tree Traversals

Table of Contents

• When using a tree to represent the hierarchical structure of a document, a preorder traversal of the tree can naturally be
used to produce a table of contents for the document.
• For example, the table of contents associated with the tree from Figure 8.15 is displayed in Figure 8.20.
• Part (a) of that figure gives a simple presentation with one element per line; part (b) shows a more attractive
presentation produced by indenting each element based on its depth within the tree.
• A similar presentation could be used to display the contents of a computer’s file system, based on its tree representation
(as in Figure 8.3).
The unindented version of the table of contents, given a tree T, can be produced with the
following code:
    for p in T.preorder():
        print(p.element())
• To produce the presentation of Figure 8.20(b), we indent each element with a number of spaces equal to twice the
element’s depth in the tree (hence, the root element was unindented).
• Although we could replace the body of the above loop with the statement print(2 * T.depth(p) * ' ' + str(p.element())),
such an approach is unnecessarily inefficient.
• Although the work to produce the preorder traversal runs in O(n) time, based on the analysis of Section 8.4.1, the
calls to depth incur a hidden cost.
• Making a call to depth from every position of the tree results in $O(n^2)$ worst-case time, as noted when analyzing the
algorithm height1 in Section 8.1.3.
• A preferred approach to producing an indented table of contents is to redesign a top-down recursion that includes the
current depth as an additional parameter.
• Such an implementation is provided in Code Fragment 8.23.
• This implementation runs in worst-case O(n) time (except, technically, the time it takes to print strings of increasing
lengths).
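A top-down recursion that carries the current depth as a parameter might be sketched as follows. This is not Code Fragment 8.23: it assumes a hypothetical children dictionary, and it yields the lines instead of printing them so that the result is easy to inspect.

```python
def indented_lines(children, p, d=0):
    """Generate one line per position, indented two spaces per level of depth."""
    yield 2 * d * " " + str(p)                         # d is known, so no call to depth
    for c in children.get(p, []):
        yield from indented_lines(children, c, d + 1)  # children are one level deeper


# Hypothetical document outline.
children = {"Paper": ["Title", "S1"], "S1": ["S1.1"]}

for line in indented_lines(children, "Paper"):
    print(line)
```

Passing d down the recursion replaces the repeated upward walks of depth with a single addition per position.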
Parenthetic Representations of a Tree
• It is not possible to reconstruct a general tree, given only the preorder sequence of elements, as in Figure 8.20(a).
• The use of indentation or numbered labels provides such context, with a very human-friendly presentation.
• However, there are more concise string representations of trees that are computer-friendly.
• The parenthetic string representation P(T) of tree T is recursively defined as follows.
• If T consists of a single position p, then
