Chapter 8 - Binary Trees
1
Binary Trees
A fundamental data structure
Combines advantages of arrays and linked lists
Fast search time
Fast insertion
Fast deletion
Moderately fast access time
2
Recall Ordered Arrays…
Their search time is faster, because there is some ‘ordering’ to the
elements.
We can do binary search, O(log n)
Instead of linear search, O(n)
Their insertion time is slower: you first find the correct position (O(log n) with binary search), then shift the larger elements over to make room
That shifting takes O(n) time on average
Instead of just dropping the element at the end, O(1)
4
Trees: General
A tree consists of nodes, connected by edges
Trees cannot have cycles
Otherwise, it’s a graph
Here’s a general tree:
5
Traversing a Tree
Start at the root and traverse downward along its edges
Typically, edges represent some kind of relationship
We represent these by references
Just as in linked lists:
class Link {
    int data;
    Link next;
}
In a tree:
class Node {
    int data;
    Node child1;
    Node child2;
    …
}
6
Size of a Tree
Increases as you go down
Opposite of nature.
7
Binary Trees
A special type of tree
In the general tree we just saw, nodes had varying numbers of children:
8
A Binary Tree
For now, note each node has no more than two children
9
A Binary Tree
Each node thus has a left and right child
What would the Java class look like?
10
Binary Trees: Terms
Path: Sequence of nodes connected by edges
Green line is a path from A to J
11
Binary Trees: Terms
Root: The node at the top of the tree
There can be only one (in this case, A)
12
Binary Trees: Terms
Parent: The node above. (B is the parent of D, A is the parent of B, A is
the grandparent of D)
13
Binary Trees: Terms
Child: A node below. (B is a child of A, C is a child of A, D is a child of
B and a grandchild of A)
14
Binary Trees: Terms
Leaf: A node with no children
In this graph: H, E, I, J, and G
15
Binary Trees: Terms
Subtree: A node's children, its children's children, and so on
The highlighted example is just one; there are many subtrees in this tree
16
Binary Trees: Terms
Visit: Access a node, and do something with its data
For example we can visit node B and check its value
17
Binary Trees: Terms
Traverse: Visit all the nodes in some specified order.
One example: A, B, D, H, E, C, F, I, J, G
18
Binary Trees: Terms
Levels: Number of generations a node is from the root
A is level 0, B and C are at level 1, D, E, F, G are level 2, etc.
19
Binary Trees: Terms
Key: The contents of a node
20
A Binary Search Tree
A binary tree, with the following characteristics:
The left child is always smaller than its parent
The right child is always larger than its parent
In fact, every key in a node's left subtree is smaller than that node, and every key in its right subtree is larger
21
Integer Tree
Will use this class for individual nodes:
class Node {
    public int data;
    public Node left;
    public Node right;
}
Let’s sketch the Java template for a binary search tree (page 375)
22
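Not the book's exact code, but a rough sketch of what such a template might look like, using the Node class above (the method bodies are filled in on later slides):

class Tree {
    private Node root;                                  // the tree's only entry point

    public Node find(int key)      { return null; }    // sketched on a later slide
    public void insert(int value)  { }                  // sketched on a later slide
    public boolean delete(int key) { return false; }    // sketched on a later slide
}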
Example main() function
Page 275, with a slight tweak
Insert three elements: 50, 25, 75
Search for node 25
If it was found, print that we found it
If it was not found, print that we did not find it
23
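A sketch of what that main() might look like, assuming the Tree class sketched above (the method names are assumptions, not necessarily the book's exact API):

public class TreeApp {
    public static void main(String[] args) {
        Tree theTree = new Tree();
        theTree.insert(50);
        theTree.insert(25);
        theTree.insert(75);

        Node found = theTree.find(25);       // search for the key 25
        if (found != null)
            System.out.println("Found the node with key 25");
        else
            System.out.println("Could not find the node with key 25");
    }
}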
Finding a node
What do we know?
For all nodes:
All elements in the left subtree are smaller
All elements in the right subtree are larger
24
Searching for a KEY
We’ll start at the root, and check its value
If the value = key, we’re done.
If the value is greater than the key, look at its left child
If the value is less than the key, look at its right child
Repeat.
25
Example
Searching for element 57
26
Java Implementation – find()
Pages 377-378
27
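The book has the real code; here is a rough sketch of the idea, written as a method of the Tree class sketched earlier:

// Walk down from the root, going left or right depending on the key
public Node find(int key) {
    Node current = root;
    while (current != null) {
        if (key == current.data)
            return current;              // found it
        else if (key < current.data)
            current = current.left;      // key is smaller: go left
        else
            current = current.right;     // key is larger: go right
    }
    return null;                         // fell off the tree: not found
}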
Number of operations: Find
Typically about O(log n). Why?
29
Example
Inserting element 45
30
Java Implementation – insert()
Page 380
31
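Again just a sketch of the idea, not the book's code, as a method of the Tree class sketched earlier:

// Walk down as in find() until we fall off the tree, remembering the parent
// so we can attach the new node to it
public void insert(int value) {
    Node newNode = new Node();
    newNode.data = value;
    if (root == null) {                  // empty tree: new node becomes the root
        root = newNode;
        return;
    }
    Node current = root;
    while (true) {
        Node parent = current;
        if (value < current.data) {      // go left
            current = current.left;
            if (current == null) { parent.left = newNode; return; }
        } else {                         // go right
            current = current.right;
            if (current == null) { parent.right = newNode; return; }
        }
    }
}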
Traversing a Tree
Three Ways:
Inorder (most common)
Preorder
Postorder
32
Inorder Traversal
Visits each node of the tree in ascending order:
Implies: We have to print a node's left child before the node itself, and the node itself before its right child
34
Inorder Traversal
Ascending Order
We can think of this recursively: start at the root, inorder traverse the left
subtree, print the root, inorder traverse the right subtree
35
Java Implementation
Page 382
A recursive function
Let’s try it with a simple example
36
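A sketch of such a recursive method (the book's version may differ in details); it is called as inOrder(root), and for a binary search tree it prints the keys in ascending order:

// Recursive inorder traversal: left subtree, then the node, then right subtree
private void inOrder(Node localRoot) {
    if (localRoot != null) {
        inOrder(localRoot.left);                    // 1. traverse the left subtree
        System.out.print(localRoot.data + " ");     // 2. visit the node itself
        inOrder(localRoot.right);                   // 3. traverse the right subtree
    }
}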
Preorder Traversal
Prints all parents before children
Prints all left children before right children. So with
this tree:
37
A preorder traversal produces:
Preorder Traversal
Order: Root, left, right
39
Postorder Traversal
Prints all children before parents
Prints all left children before right children. So with
this tree:
40
A postorder traversal produces:
Postorder Traversal
Order: Left, right, root
42
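For comparison, sketches of the other two orders in the same style; only the position of the "visit" step changes:

private void preOrder(Node localRoot) {          // root, then left, then right
    if (localRoot != null) {
        System.out.print(localRoot.data + " ");
        preOrder(localRoot.left);
        preOrder(localRoot.right);
    }
}

private void postOrder(Node localRoot) {         // left, then right, then root
    if (localRoot != null) {
        postOrder(localRoot.left);
        postOrder(localRoot.right);
        System.out.print(localRoot.data + " ");
    }
}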
Finding the Minimum
In a binary search tree, this is always the leftmost node in the tree! Easy. Java?
Start at the root, and keep following left children until there are no more
43
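A possible sketch in Java, as a method of the Tree class sketched earlier:

// Keep following left children; the last node reached is the minimum
public Node minimum() {
    Node current = root;
    Node last = null;
    while (current != null) {
        last = current;
        current = current.left;
    }
    return last;                 // null only if the tree is empty
}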
Finding the Maximum
In a binary search tree, this is also easy – it's the rightmost node in the tree
Start at the root, and keep following right children until there are no more
Java?
44
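The mirror-image sketch of minimum():

// Keep following right children; the last node reached is the maximum
public Node maximum() {
    Node current = root;
    Node last = null;
    while (current != null) {
        last = current;
        current = current.right;
    }
    return last;                 // null only if the tree is empty
}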
Deletion
This is the challenging one
First, find the element you want to delete
Once you’ve found it, one of three cases:
1. The node has no children (easy)
2. The node has one child (decently easy)
3. The node has two children (difficult)
45
Case 1: No Children
To delete a node with no children:
Find the node
Set the appropriate child field in its parent to null
Example: Removing 7 from the tree below
46
Java Implementation
Pages 390-391
Find the node first
As we go through, keep track of:
The parent
Whether the node is a left or right child of its parent
47
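A rough sketch of that search plus the no-children case (the book's full delete() also handles the one- and two-child cases):

// Sketch of delete() handling only Case 1 (no children)
public boolean delete(int key) {
    Node current = root;
    Node parent = root;
    boolean isLeftChild = true;
    while (current != null && current.data != key) {     // find the node, tracking its parent
        parent = current;
        if (key < current.data) { isLeftChild = true;  current = current.left; }
        else                    { isLeftChild = false; current = current.right; }
    }
    if (current == null)
        return false;                                     // key not in the tree
    if (current.left == null && current.right == null) {  // Case 1: no children
        if (current == root)      root = null;            // deleting the only node
        else if (isLeftChild)     parent.left = null;
        else                      parent.right = null;
        return true;
    }
    return false;    // one- and two-child cases handled on the next slides
}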
Case 2: One Child
Assign the deleted node's child as the child of its parent
Essentially, 'snip out' the deleted node from the sequence
Example: deleting 71 from this tree:
48
Java Implementation
Pages 392-393
Two cases to handle. Either:
The right child is null
If the node is a left child, set its parent’s left child to the node’s left child
If the node is a right child, set its parent’s right child to the node’s left child
The left child is null
If the node is a left child, set its parent’s left child to the node’s right child
If the node is a right child, set its parent’s right child to the node’s right child
49
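Continuing the rough delete() sketch from before: the one-child cases might look like this (current, parent, and isLeftChild are assumed to have been set while finding the node, as on slide 47):

// Case 2: exactly one child – splice that child into the deleted node's place
if (current.right == null) {                 // only a left child
    if (current == root)      root = current.left;
    else if (isLeftChild)     parent.left = current.left;
    else                      parent.right = current.left;
} else if (current.left == null) {           // only a right child
    if (current == root)      root = current.right;
    else if (isLeftChild)     parent.left = current.right;
    else                      parent.right = current.right;
}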
Case 3: Two Children
Here’s the tough case.
Let’s see an example of why it’s complicated…
50
Case 3: Two Children
What we need is the node with the next-higher key to replace 25.
For example, if we replaced 25 by 30, we’re set.
51
Case 3: Two Children
We call this the inorder successor of the deleted node
i.e., 30 is the inorder successor of 25. This replaces 25.
52
Inorder successor
The inorder successor is always going to be the smallest element in the right subtree
In other words, the smallest element that is larger than the deleted node.
53
Finding the inorder successor
Algorithm to find the inorder successor of some node X:
First go to the right child of X
Then keep moving to left children
Until there are no more
Then we are at the inorder successor
55
If the successor is the deleted node's right child, it's simple:
1. Set the successor's left to the deleted node's left
2. Replace the deleted node by the successor
If the successor is not the deleted node's right child, it's tougher
We must add two steps:
1. Set the successor's parent's left to the successor's right
2. Set the successor's right to the deleted node's right
3. Set the successor's left to the deleted node's left (as before)
4. Replace the deleted node by the successor (as before)
56
Java Implementation (time permitting)
getSuccessor() function, page 396
Accepts a node
First goes to its right child
Then keeps going to left children
Until there are no more
57
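A sketch of what getSuccessor() might look like (the book's version is on page 396); it also does the extra splicing needed when the successor is not the deleted node's right child:

// Find the inorder successor of delNode: go right once, then left as far as possible
private Node getSuccessor(Node delNode) {
    Node successorParent = delNode;
    Node successor = delNode;
    Node current = delNode.right;            // step 1: go to the right child
    while (current != null) {                // step 2: keep going left
        successorParent = successor;
        successor = current;
        current = current.left;
    }
    // If the successor is not delNode's right child, splice it out of its old
    // position and connect it to delNode's right subtree (steps 1-2 from slide 56)
    if (successor != delNode.right) {
        successorParent.left = successor.right;
        successor.right = delNode.right;
    }
    return successor;
}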
Efficiency: Binary Search Trees
Note that:
Insertion, deletion, and searching all involve visiting nodes of the tree until we find either:
The position for insertion
The value for deletion
The value we were searching for
For any of these, we never visit more than the number of levels in the tree
Because for every node we visit, we check its value, and if we're not done, we go down to one of its children
58
Efficiency: Binary Search Trees
So for a tree of n nodes, how many levels are there?
Nodes Levels
1 1
3 2
7 3
15 4
31 5
….
1,073,741,823 30
It's actually ⌊log₂(n)⌋ + 1!
59
So…
All three of our algorithms (insertion, deletion, and searching) take O(log n) time, assuming the tree stays reasonably balanced
We go through about log₂(n) + 1 levels, with one comparison at each.
At the point of insertion or deletion, we just manipulate a constant number of references (say, c)
That's independent of n
60
Compare to Arrays
Take 1 million elements and delete an element in the middle
Arrays -> Average case, 500 thousand shifts
Binary Search Trees -> 20 or fewer comparisons
Similar case when comparing with insertion into an ordered array
61
Huffman Codes
An algorithm to ‘compress’ data
Purpose:
Apply a compression algorithm to take a large file and store it as a smaller set
of data
Apply a decompression algorithm to take the smaller compressed data, and
get the original back
So, you only need to store the smaller compressed version, as long as you
have a program to compress/decompress
Compression Examples: WinZip, MP3
62
Quick Lesson In Binary
Generally, for an n-digit number in binary:
b_(n-1) … b_2 b_1 b_0 = b_(n-1)·2^(n-1) + … + b_2·2^2 + b_1·2^1 + b_0·2^0
For example, 01001001 = 2^6 + 2^3 + 2^0 = 64 + 8 + 1 = 73 (the ASCII code for 'I')
Internal Storage
01001001 01001100 01001111 01010110 01000101 01010100 01010010
01000101 01000101 01010011
All characters take one byte (8 bits) of storage, and those 8 bits correspond to the ASCII values (here, the ten bytes spell "ILOVETREES")
65
Underlying Motivation
Why use the same number of bits to store all characters?
For example, E is used much more often than Z
So what if we only used two bits to store E
And still used all eight bits to store Z
We should save space.
66
One thing we must watch
When choosing shorter codes, we cannot use any code that is a prefix of
another code; otherwise, the decoder could not tell where one character ends and the next begins. For example, we could not have:
E: 01
X: 01011000
67
Most Used Characters
The most used characters will vary by file
Computing Huffman Codes first requires computing the frequency of each
character, for example for “SUSIE SAYS IT IS EASY”:
CHAR COUNT
A 2
E 2
I 3
S 6
T 1
U 1
Y 2
Space 4
Linefeed 1
68
Computing Huffman Codes
Huffman Codes have varying bit lengths depending on frequency: the more frequent the character, the shorter its code
(remember S had the highest frequency at 6):
CHAR CODE
A 010
E 1111
I 110
S 10
T 0110
U 01111
Y 1110
Space 00
Linefeed 01110
69
Coding “SUSIE SAYS IT IS EASY”
CHAR CODE
A 010
E 1111
I 110
S 10
T 0110
U 01111
Y 1110
Space 00
Linefeed 01110
10 01111 10 110 1111 00 10 010 1110 10 00 110 0110 00 110 10 00 1111
010 10 1110 01110 (65 bits)
70
Before, at 8 bits per character, it would've been 21*8 = 168 bits (176 counting the linefeed)!
A Huffman Tree
Idea:
Each character appears as a leaf in the tree
The higher the frequency of a character, the higher up in the tree it is
Number outside a leaf is its frequency
Number outside a non-leaf is the sum of its child frequencies
71
A Huffman Tree
Decoding a message:
For each bit, go right (1) or left (0)
Once you hit a character, print it, go back to the root, and repeat
Example: 0100110
Start at root:
Go L(0), R(1), L(0), get A
Go back to root
72
Go L(0), R(1), R(1), L(0), get T
Encoding
Decoding is thus easy when you have this tree
However, we must produce the tree
To start, make a one-node tree for each character and place them all in a priority queue, ordered by frequency (lowest first)
74
Next
Take the two leftmost (lowest-frequency) elements, and form a subtree
The two leaves are the two characters
The parent holds no character; its frequency is the sum of its two children's frequencies
Put this subtree back in the priority queue, in the right spot
75
Continue this process…
Again, adjoin the leftmost two elements (now we actually adjoin a leaf and a subtree):
76
Keep going…
Adjoin leaves Y (2) and E (2), forming a subtree with a root frequency of 4
77
Continue until we have one tree…
78
Continue until we have one tree…
79
Continue until we have one tree…
80
Continue until we have one tree…
81
Our final tree
Note we were able to construct this from the frequency table
82
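A rough sketch of the whole construction in Java, using java.util.PriorityQueue as the priority queue; the class and method names here are illustrative, not from the book:

import java.util.PriorityQueue;

// One node of a Huffman tree: leaves carry a character, internal nodes do not
class HuffNode {
    int frequency;
    char character;          // meaningful only for leaves
    HuffNode left, right;    // null for leaves

    HuffNode(int frequency, char character) { this.frequency = frequency; this.character = character; }
    HuffNode(HuffNode left, HuffNode right) {            // internal node
        this.frequency = left.frequency + right.frequency;
        this.left = left;
        this.right = right;
    }
}

class HuffmanBuilder {
    // counts[c] is the frequency of character c; returns the root of the Huffman tree
    static HuffNode buildTree(int[] counts) {
        PriorityQueue<HuffNode> queue =
                new PriorityQueue<>((a, b) -> a.frequency - b.frequency);   // lowest frequency first
        for (char c = 0; c < counts.length; c++)
            if (counts[c] > 0)
                queue.add(new HuffNode(counts[c], c));    // one single-leaf tree per character
        while (queue.size() > 1) {                        // repeat until one tree remains
            HuffNode first = queue.poll();                // the two lowest-frequency trees
            HuffNode second = queue.poll();
            queue.add(new HuffNode(first, second));       // merge them and put the result back
        }
        return queue.poll();
    }
}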
Obtaining the Huffman Code from the tree
Once we construct the tree, we still need the Huffman Code to encode the file
No way around this: we have to start from the root and traverse all possible paths to leaf nodes
As we go along, keep track of whether we go left (0) or right (1)
So A went left (0), then right (1), then left (0): its code is 010
83
Code Table
When we get the Huffman Code for each character, we insert it into a Code Table, as shown to the right
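One way to sketch that traversal, reusing the HuffNode class from the earlier sketch (the String-based code table is just for illustration):

// Walk every root-to-leaf path; the 0s and 1s collected along the way form the code
static void buildCodeTable(HuffNode node, String codeSoFar, String[] codeTable) {
    if (node == null)
        return;
    if (node.left == null && node.right == null) {    // a leaf: record its code
        codeTable[node.character] = codeSoFar;
        return;
    }
    buildCodeTable(node.left, codeSoFar + "0", codeTable);     // going left appends a 0
    buildCodeTable(node.right, codeSoFar + "1", codeTable);    // going right appends a 1
}
// Usage: String[] table = new String[128]; buildCodeTable(root, "", table);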
Decoding a File
Read the compressed file bit-by-bit
Use the Huffman Tree to get each character
85
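A sketch of that loop, assuming the bits arrive as a String of '0'/'1' characters (a real implementation would read actual bits from the file):

// Follow the tree bit by bit; every time a leaf is reached, emit its character
static String decode(String bits, HuffNode root) {
    StringBuilder result = new StringBuilder();
    HuffNode current = root;
    for (int i = 0; i < bits.length(); i++) {
        current = (bits.charAt(i) == '0') ? current.left : current.right;
        if (current.left == null && current.right == null) {   // reached a leaf
            result.append(current.character);
            current = root;                                     // start over at the root
        }
    }
    return result.toString();
}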