Chapter 8 - Binary Trees

This document provides an overview of binary trees as a fundamental data structure. It defines key terms such as root, parent, child, and leaf, and discusses operations on binary trees: searching, insertion, deletion, and traversal. Searching and insertion in a binary search tree take O(log n) time when the tree is balanced, since each comparison eliminates half of the remaining nodes. Traversal can be done inorder, preorder, or postorder and is implemented recursively by traversing the left and right subtrees. Deletion is more complex, with different approaches depending on whether the node has 0, 1, or 2 children.


Binary Trees

CS221N, Data Structures

1
Binary Trees
A fundamental data structure
Combines advantages of arrays and linked lists
Fast search time
Fast insertion
Fast deletion
Moderately fast access time

Of course, they’re a bit more complex to implement

2
Recall Ordered Arrays…
Their search time is faster, because there is some ‘ordering’ to the
elements.
We can do binary search, O(log n)
Instead of linear search, O(n)

Their insertion time is slower, because you have to find the correct
position first, and then shift later elements over to make room
Finding the position takes O(log n), but the shifting makes insertion O(n) overall
Instead of just dropping the element at the end, O(1)

Trees provide ‘somewhat’ of an ordering


3
Each of these algorithms will be O(log n)
Recall Linked Lists…
Insertion and deletion are fast
O(1) on the end
In the middle, O(n) to find the position, but O(1) to insert/delete
Better than the expensive shifting in arrays
Finding is slower, O(n)

Trees perform insertion/deletion similarly, by changing references


But they provide shorter paths for finds that are log n in length, as
opposed to a linked list which could be length n

4
Trees: General
A tree consists of nodes, connected by edges
Trees cannot have cycles
Otherwise, it’s a graph
Here’s a general tree:

5
Traversing a Tree
 Start at the root and traverse downward along its edges
 Typically, edges represent some kind of relationship
 We represent these by references
 Just as in linked lists:
class Link {
    int data;
    Link next;
}
 In a tree:
class Node {
    int data;
    Node child1;
    Node child2;
    ...
}

6
Size of a Tree
Increases as you go down
Opposite of nature. 

7
Binary Trees
A special type of tree
With this tree, nodes had varying numbers of children:

With binary trees, nodes can have a maximum of two children.


The tree above is called a multiway tree

8
A Binary Tree
For now, note each node has no more than two children

9
A Binary Tree
Each node thus has a left and right child
What would the Java class look like?

10
Binary Trees: Terms
Path: Sequence of nodes connected by edges
Green line is a path from A to J

11
Binary Trees: Terms
Root: The node at the top of the tree
Can be only one (in this case, A)

12
Binary Trees: Terms
Parent: The node above. (B is the parent of D, A is the parent of B, A is
the grandparent of D)

13
Binary Trees: Terms
Child: A node below. (B is a child of A, C is a child of A, D is a child of
B and a grandchild of A)

14
Binary Trees: Terms
Leaf: A node with no children
In this tree: H, E, I, J, and G

15
Binary Trees: Terms
Subtree: A node's children, its children's children, etc.
The highlighted example is just one; there are many subtrees in this tree

16
Binary Trees: Terms
Visit: Access a node, and do something with its data
For example, we can visit node B and check its value

17
Binary Trees: Terms
Traverse: Visit all the nodes in some specified order.
One example: A, B, D, H, E, C, F, I, J, G

18
Binary Trees: Terms
Levels: Number of generations a node is from the root
A is level 0, B and C are at level 1, D, E, F, G are level 2, etc.

19
Binary Trees: Terms
Key: The contents of a node

20
A Binary Search Tree
A binary tree, with the following characteristics:
The left child is always smaller than its parent
The right child is always larger than its parent
All nodes to the right are bigger than all nodes to the left

21
Integer Tree
Will use this class for individual nodes:

class Node {
    public int data;
    public Node left;
    public Node right;
}

Let’s sketch the Java template for a binary search tree (page 375)

22
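The page-375 listing is not reproduced on these slides, so here is a minimal sketch of what the template might look like (the field and method names are assumptions in the book's style; the method bodies are the subject of the following slides):

```java
// Sketch of a binary search tree template (assumed structure; the
// book's page-375 listing is not shown on the slides).
class Node {
    public int data;        // the key
    public Node left;       // left child: smaller keys
    public Node right;      // right child: larger keys
}

class Tree {
    private Node root;      // the only field a tree object needs

    public boolean isEmpty() { return root == null; }

    // Filled in on later slides:
    public Node find(int key)      { return null; }
    public void insert(int value)  { }
    public boolean delete(int key) { return false; }
}
```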
Example main() function
Page 275, with a slight tweak
Insert three elements: 50, 25, 75
Search for node 25
If it was found, print that we found it
If it was not found, print that we did not find it

23
Finding a node
What do we know?
For all nodes:
All elements in the left subtree are smaller
All elements in the right subtree are larger

24
Searching for a KEY
We’ll start at the root, and check its value
If the value = key, we’re done.
If the value is greater than the key, look at its left child
If the value is less than the key, look at its right child
Repeat.

25
Example

Searching for
element 57

26
Java Implementation – find()
Pages 377-378

27
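Since the pages 377-378 listing is not shown here, a sketch of how find() is typically written (the sample() helper that hand-builds a small tree is just for illustration):

```java
// A sketch of find() in the spirit of pages 377-378 (details assumed,
// since the book's listing is not reproduced on the slide).
class Node {
    public int data;
    public Node left, right;
    public Node(int d) { data = d; }
}

class Tree {
    Node root;

    // Start at the root; go left when the current key is too big,
    // right when it is too small. Returning null means "not found".
    public Node find(int key) {
        Node current = root;
        while (current != null && current.data != key) {
            current = (key < current.data) ? current.left : current.right;
        }
        return current;
    }

    // Hand-built example tree: 50 at the root, 25 and 75 as its children.
    static Tree sample() {
        Tree t = new Tree();
        t.root = new Node(50);
        t.root.left = new Node(25);
        t.root.right = new Node(75);
        return t;
    }
}
```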
Number of operations: Find
Typically about O(log n). Why?

What’s a case where it won’t be?


How can we guarantee O(log n)?

28
Inserting a node
What must we do?
Find the place to insert a node (similar to a find, except we go all the way
till there are no more children)
Put it there

29
Example

Inserting
element
45

30
Java Implementation – insert()
Page 380

31
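The page-380 listing is not reproduced here; a sketch of the idea: descend exactly as in find(), but remember the parent so the new node can be attached once we fall off the tree (details are assumptions):

```java
// A sketch of insert() for a binary search tree (assumed details;
// the book's page-380 listing is not shown on the slide).
class Node {
    int data;
    Node left, right;
    Node(int d) { data = d; }
}

class Tree {
    Node root;

    public void insert(int d) {
        Node newNode = new Node(d);
        if (root == null) { root = newNode; return; }   // empty tree
        Node current = root;
        while (true) {
            Node parent = current;
            if (d < current.data) {                      // go left
                current = current.left;
                if (current == null) { parent.left = newNode; return; }
            } else {                                     // go right
                current = current.right;
                if (current == null) { parent.right = newNode; return; }
            }
        }
    }
}
```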
Traversing a Tree
Three Ways:
Inorder (most common)
Preorder
Postorder

32
Inorder Traversal
Visits each node of the tree in ascending order:

In this tree, an inorder traversal produces:

9 14 23 30 34 39 47 53 61 72 79 84

33
Inorder Traversal
Ascending Order

Implies: We have to print a node's left child before the node itself, and the node itself before its right child

34
Inorder Traversal
Ascending Order

We can think of this recursively: start at the root, inorder traverse the left
subtree, print the root, inorder traverse the right subtree
35
Java Implementation
Page 382
A recursive function
Let’s try it with a simple example

36
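A sketch of the recursive function in the spirit of page 382 (the string-building wrapper is an addition so the output can be checked; the listing itself is not reproduced here):

```java
// Inorder traversal: left subtree, then the node, then right subtree.
// The StringBuilder wrapper is an illustrative addition.
class Node {
    int data;
    Node left, right;
    Node(int d) { data = d; }
}

class Tree {
    Node root;

    void insert(int d) { root = insert(root, d); }
    private Node insert(Node n, int d) {     // recursive insert, for brevity
        if (n == null) return new Node(d);
        if (d < n.data) n.left = insert(n.left, d);
        else            n.right = insert(n.right, d);
        return n;
    }

    private void inOrder(Node n, StringBuilder out) {
        if (n == null) return;
        inOrder(n.left, out);                // 1. traverse the left subtree
        out.append(n.data).append(' ');      // 2. visit the node itself
        inOrder(n.right, out);               // 3. traverse the right subtree
    }
    String inOrder() {
        StringBuilder out = new StringBuilder();
        inOrder(root, out);
        return out.toString().trim();
    }
}
```

As the slide says, this prints the keys in ascending order.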
Preorder Traversal
Prints all parents before children
Prints all left children before right children. So with
this tree:

A preorder traversal produces:

37

Preorder Traversal
Order: Root, left, right

We can again do this recursively: print the root, preorder traverse the left subtree, preorder traverse the right subtree

38
Java Implementation
Not in the book
But we should be able to do it!

39
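Since it is not in the book, here is one way to do it, mirroring the inorder version but visiting the node first (a sketch; the string-building wrapper is an illustrative addition):

```java
// Preorder traversal: visit the node first, then left, then right.
class Node {
    int data;
    Node left, right;
    Node(int d) { data = d; }
}

class Tree {
    Node root;

    void insert(int d) { root = insert(root, d); }
    private Node insert(Node n, int d) {
        if (n == null) return new Node(d);
        if (d < n.data) n.left = insert(n.left, d);
        else            n.right = insert(n.right, d);
        return n;
    }

    private void preOrder(Node n, StringBuilder out) {
        if (n == null) return;
        out.append(n.data).append(' ');      // 1. visit the root
        preOrder(n.left, out);               // 2. traverse the left subtree
        preOrder(n.right, out);              // 3. traverse the right subtree
    }
    String preOrder() {
        StringBuilder out = new StringBuilder();
        preOrder(root, out);
        return out.toString().trim();
    }
}
```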
Postorder Traversal
Prints all children before parents
Prints all left children before right children. So with
this tree:

A postorder traversal produces:

40

Postorder Traversal
Order: Left, right, root

We can again do this recursively: postorder traverse the left subtree, postorder traverse the right subtree, print the root

41
Java Implementation
Not in the book
But again, we should be able to do it!

42
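Again not in the book, but the same pattern with the visit moved last (a sketch; the string-building wrapper is an illustrative addition):

```java
// Postorder traversal: left subtree, then right subtree, then the node.
class Node {
    int data;
    Node left, right;
    Node(int d) { data = d; }
}

class Tree {
    Node root;

    void insert(int d) { root = insert(root, d); }
    private Node insert(Node n, int d) {
        if (n == null) return new Node(d);
        if (d < n.data) n.left = insert(n.left, d);
        else            n.right = insert(n.right, d);
        return n;
    }

    private void postOrder(Node n, StringBuilder out) {
        if (n == null) return;
        postOrder(n.left, out);              // 1. traverse the left subtree
        postOrder(n.right, out);             // 2. traverse the right subtree
        out.append(n.data).append(' ');      // 3. visit the root last
    }
    String postOrder() {
        StringBuilder out = new StringBuilder();
        postOrder(root, out);
        return out.toString().trim();
    }
}
```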
Finding the Minimum
In a binary search tree, this is always the leftmost node in the tree! Easy. Java?
Start at the root, and traverse until you have no more left
children

43
Finding the Maximum
In a binary search tree, this is also easy – it's the rightmost node in the tree
Start at the root, traverse until there are no more right children
Java?

44
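In answer to the "Java?" prompts on the last two slides, a sketch of both (assuming a non-empty tree, since an empty one has no minimum or maximum):

```java
// Minimum: keep going left from the root; maximum: keep going right.
// Both assume the tree is non-empty.
class Node {
    int data;
    Node left, right;
    Node(int d) { data = d; }
}

class Tree {
    Node root;

    int minimum() {
        Node current = root;
        while (current.left != null)   // no more left children -> smallest key
            current = current.left;
        return current.data;
    }

    int maximum() {
        Node current = root;
        while (current.right != null)  // no more right children -> largest key
            current = current.right;
        return current.data;
    }
}
```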
Deletion
This is the challenging one
First, find the element you want to delete
Once you’ve found it, one of three cases:
1. The node has no children (easy)
2. The node has one child (decently easy)
3. The node has two children (difficult)

45
Case 1: No Children
To delete a node with no children:
Find the node
Set the appropriate child field in its parent to null
Example: Removing 7 from the tree below

46
Java Implementation
Start from page 390-391
Find the node first
As we go through, keep track of:
 The parent
 Whether the node is a left or right child of its parent

Then, handle the case when both children are null


Set either the left or right child of the parent to null
Unless it’s the root, in which case the tree is now empty

47
Case 2: One Child
Assign the deleted node's child as the child of its parent
Essentially, 'snip out' the deleted node from the sequence
Example, deleting 71 from this tree:

48
Java Implementation
Pages 392-393
Two cases to handle. Either:
The right child is null
 If the node is a left child, set its parent’s left child to the node’s left child
 If the node is a right child, set its parent’s right child to the node’s left child
The left child is null
 If the node is a left child, set its parent’s left child to the node’s right child
 If the node is a right child, set its parent’s right child to the node’s right child

49
Case 3: Two Children
Here’s the tough case.
Let’s see an example of why it’s complicated…

50
Case 3: Two Children
What we need is the next highest node to replace 25.
For example, if we replaced 25 by 30, we’re set.

51
Case 3: Two Children
We call this the inorder successor of the deleted node
i.e., 30 is the inorder successor of 25. This replaces 25.

52
Inorder successor
The inorder successor is always going to be the smallest element in the right subtree
In other words, the smallest element that is larger than the deleted node.

53
Finding the inorder successor
Algorithm to find the inorder successor of some node X:
First go to the right child of X
Then keep moving to left children
Until there are no more
Then we are at the inorder successor

This is what should replace X

54
Removing the successor
We must remove the successor from its current spot, and place it in the spot of the deleted node

If the successor is the deleted node's right child:
Set the successor's left to the deleted node's left
Replace the deleted node by the successor

If the successor is not the deleted node's right child, tougher
We must add two steps:
1. Set the successor's parent's left to the successor's right
2. Set the successor's right to the deleted node's right
3. Set the successor's left to the deleted node's left (as before)
4. Replace the deleted node by the successor (as before)

55

56
Java Implementation (Time Pending)
getSuccessor() function, page 396
Accepts a node
First goes to its right child
Then goes to the left child
Does this until no more left children

Also removes successor

Rest of delete(), page 398

57
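The pages 390-398 listings are not reproduced here, so below is a sketch of delete() and getSuccessor() following the steps described on the preceding slides (names and details are assumptions in the book's style):

```java
// Sketch of BST deletion: find the node and its parent, then handle
// the three cases (no children, one child, two children).
class Node {
    int data;
    Node left, right;
    Node(int d) { data = d; }
}

class Tree {
    Node root;

    void insert(int d) { root = insert(root, d); }
    private Node insert(Node n, int d) {
        if (n == null) return new Node(d);
        if (d < n.data) n.left = insert(n.left, d);
        else            n.right = insert(n.right, d);
        return n;
    }

    boolean delete(int key) {
        Node current = root, parent = root;
        boolean isLeftChild = true;
        while (current != null && current.data != key) {   // step 1: find it
            parent = current;
            if (key < current.data) { isLeftChild = true;  current = current.left; }
            else                    { isLeftChild = false; current = current.right; }
        }
        if (current == null) return false;                 // not in the tree

        Node replacement;
        if (current.left == null && current.right == null) // case 1: no children
            replacement = null;
        else if (current.right == null)                    // case 2: left child only
            replacement = current.left;
        else if (current.left == null)                     // case 2: right child only
            replacement = current.right;
        else {                                             // case 3: two children
            replacement = getSuccessor(current);
            replacement.left = current.left;               // successor adopts left subtree
        }
        if (current == root)  root = replacement;
        else if (isLeftChild) parent.left = replacement;
        else                  parent.right = replacement;
        return true;
    }

    // Go right once, then left as far as possible. If the successor is not
    // the deleted node's right child, apply the two extra steps from the slide.
    private Node getSuccessor(Node delNode) {
        Node successorParent = delNode, successor = delNode, current = delNode.right;
        while (current != null) {
            successorParent = successor;
            successor = current;
            current = current.left;
        }
        if (successor != delNode.right) {
            successorParent.left = successor.right;   // unlink the successor
            successor.right = delNode.right;          // adopt the right subtree
        }
        return successor;
    }

    String inOrder() { StringBuilder sb = new StringBuilder(); inOrder(root, sb); return sb.toString().trim(); }
    private void inOrder(Node n, StringBuilder sb) {
        if (n == null) return;
        inOrder(n.left, sb); sb.append(n.data).append(' '); inOrder(n.right, sb);
    }
}
```

An inorder traversal before and after each delete is an easy way to check that the BST ordering survives.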
Efficiency: Binary Search Trees
Note that:
Insertion, deletion, searching all involved visiting nodes of the tree until we
found either:
 The position for insertion
 The value for deletion
 The value we were searching for

For any of these, we would not visit more than the number of levels in the
tree
Because for every node we visit, we check its value, and if we’re not done, we
go to one of its children

58
Efficiency: Binary Search Trees
So for a full tree of n nodes, how many levels are there?

Nodes           Levels
1               1
3               2
7               3
15              4
31              5
…
1,073,741,823   30

It's actually log2(n) + 1!

59
So…
All three of our algorithms: insertion, deletion, and searching take O(log
n) time
We go through log n + 1 levels, each time with one comparison.
At the point of insertion or deletion, we just manipulate a constant number
of references (say, c)
That’s independent of n

So the number of operations is log n + 1 + c, or O(log n)

60
Compare to Arrays
Take 1 million elements and delete an element in the middle
Arrays -> Average case, 500 thousand shifts
Binary Search Trees -> 20 or fewer comparisons
Similar case when comparing with insertion into an ordered array

What is slow for a binary search tree is traversal


Going through all elements in the entire tree
But for a large database, this probably will never be necessary

61
Huffman Codes
An algorithm to ‘compress’ data
Purpose:
Apply a compression algorithm to take a large file and store it as a smaller set
of data
Apply a decompression algorithm to take the smaller compressed data, and
get the original back

So, you only need to store the smaller compressed version, as long as you
have a program to compress/decompress
Compression Examples: WinZip, MP3

62
Quick Lesson In Binary
Generally for an n-digit number in binary:
b(n-1) … b2 b1 b0 = b(n-1)·2^(n-1) + … + b2·2^2 + b1·2^1 + b0·2^0

Assume unsigned bytes, convert these:


01011010
10000101
00101001
10011100
11111111

Everything internally is stored in binary


63
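One way to check the conversions above: Integer.parseInt with radix 2 interprets a bit string as an unsigned value, which matches the slide's assumption of unsigned bytes (the class name here is just for the demo):

```java
// Convert the slide's unsigned bytes from binary to decimal.
public class BinaryDemo {
    public static void main(String[] args) {
        String[] bytes = { "01011010", "10000101", "00101001", "10011100", "11111111" };
        for (String b : bytes) {
            // radix 2 -> treat the string as a base-2 number
            System.out.println(b + " = " + Integer.parseInt(b, 2));
        }
        // 01011010 = 64 + 16 + 8 + 2     = 90
        // 10000101 = 128 + 4 + 1         = 133
        // 00101001 = 32 + 8 + 1          = 41
        // 10011100 = 128 + 16 + 8 + 4    = 156
        // 11111111 = 255
    }
}
```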
Characters
Take one byte of memory (8 bits)
All files are just sequences of characters

Internal storage (ASCII Codes):


CHAR DEC BINARY_
A 65 01000001
B 66 01000010
C 67 01000011
...
Y 89 01011001
Z 90 01011010

64
Example File
ILOVETREES
(ASCII: I=73, L=76, O=79, V=86,E=69,T=84,R=82,S=83)

Internal Storage
01001001 01001100 01001111 01010110 01000101 01010100 01010010
01000101 01000101 01010011

All characters take one byte (8 bits) of storage, and those 8 bits correspond
to the ASCII values
65
Underlying Motivation
Why use the same number of bits to store all characters?
For example, E is used much more often than Z
So what if we only used two bits to store E
And still used the eight to store Z
We should save space.

For example, with ILOVETREES, if we change E’s to 01:


We save 6*3 = 18 bits

For large files, we could start saving a significant amount of data!

66
One thing we must watch
When choosing shorter codes, we cannot use any code that is the prefix of
another code. For example, we could not have:
E: 01
X: 01011000

Why? Because if we come across this in the file:


…….010110000
We don’t know if it’s an E followed by something else,
Or an X.

67
Most Used Characters
The most used characters will vary by file
Computing Huffman Codes first requires computing the frequency of each
character, for example for “SUSIE SAYS IT IS EASY”:
CHAR COUNT
A 2
E 2
I 3
S 6
T 1
U 1
Y 2
Space 4
Linefeed 1
68
Computing Huffman Codes
Huffman Codes are varying bit lengths depending on frequency
(remember S had the highest freq at 6):

CHAR CODE
A 010
E 1111
I 110
S 10
T 0110
U 01111
Y 1110
Space 00
Linefeed 01110

69
Coding “SUSIE SAYS IT IS EASY”
CHAR CODE
A 010
E 1111
I 110
S 10
T 0110
U 01111
Y 1110
Space 00
Linefeed 01110
10 01111 10 110 1111 00 10 010 1110 10 00 110 0110 00 110 10 00 1111
010 10 1110 01110 (65 bits)
70
Before, it would've been 22*8 = 176 bits (counting the linefeed)!
A Huffman Tree
Idea:
Each character appears as a leaf in the tree
The higher the frequency of a character, the higher up in the tree it is
Number outside a leaf is its frequency
Number outside a non-leaf is the sum of all child frequencies
71
A Huffman Tree
Decoding a message:
For each bit, go right (1) or left (0)
Once you hit a character, print it, go back to the root and repeat
Example: 0100110
Start at root:
Go L(0), R(1), L(0), get A
Go back to root
Go L(0), R(1), R(1), L(0), get T

72
Encoding
Decoding is thus easy when you have this tree

However, we must produce the tree

The idea will be to start with just characters and frequencies (the leaves), and then grow the tree
73
First step
Start from the leaves, which contain single characters and their associated
frequencies
Store these nodes in a priority queue, ordered by frequency

74
Next
Take the left two elements, and form a subtree
The two leaves are the two characters
The parent is empty, with a frequency as the sum of its two children
Put this back in the priority queue, in the right spot

75
Continue this process…
Again, adjoin the leftmost two elements (now we actually adjoin a leaf
and a subtree):

76
Keep going…
Adjoin leaves Y (2) and E (2), this forms a subtree with root frequency of
4

77
Continue until we have one tree…

78
Our final tree
Note we were able to construct this from the frequency table

82
Obtaining the Huffman Code from the tree
Once we construct the tree, we still need the Huffman Code to encode the file
No way around this: we have to start from the root and traverse all possible paths to leaf nodes
As we go along, keep track of if we go left (0) or right (1)
So A went left (0), then right (1), then left (0): its code is 010

83
Code Table
When we get the Huffman Code for each character, we insert them into a Code Table, as to the right

Now encoding becomes easy!
For each character, lookup in the code table and store those binary digits

84

“Official Steps”
Encoding a File
Obtain a frequency table (count each character’s frequency)
Make the Huffman Tree
Make the Code Table
Store each character as its Huffman Code from the Code Table

Decoding a File
Read the compressed file bit-by-bit
Use the Huffman Tree to get each character

85
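The "official steps" above can be sketched in Java with a priority queue. The node class and tie-breaking rules here are assumptions (ties in the queue can be broken either way, so the exact codes may differ from the table on the slides), but any tree built this way yields a prefix-free code:

```java
import java.util.*;

// Sketch of Huffman coding: frequency table -> tree -> code table.
class Huff {
    static class HNode {
        int freq; char ch; HNode left, right;
        HNode(int f, char c) { freq = f; ch = c; }                    // leaf
        HNode(HNode l, HNode r) { freq = l.freq + r.freq; left = l; right = r; }
        boolean isLeaf() { return left == null; }
    }

    static Map<Character, String> codes(String text) {
        // Step 1: frequency table
        Map<Character, Integer> freq = new HashMap<>();
        for (char c : text.toCharArray()) freq.merge(c, 1, Integer::sum);

        // Step 2: repeatedly join the two lowest-frequency nodes
        PriorityQueue<HNode> pq =
            new PriorityQueue<>(Comparator.comparingInt((HNode n) -> n.freq));
        for (Map.Entry<Character, Integer> e : freq.entrySet())
            pq.add(new HNode(e.getValue(), e.getKey()));
        while (pq.size() > 1) pq.add(new HNode(pq.poll(), pq.poll()));

        // Step 3: walk root-to-leaf paths; left = 0, right = 1
        Map<Character, String> table = new HashMap<>();
        walk(pq.poll(), "", table);
        return table;
    }

    private static void walk(HNode n, String path, Map<Character, String> table) {
        if (n.isLeaf()) { table.put(n.ch, path.isEmpty() ? "0" : path); return; }
        walk(n.left, path + "0", table);
        walk(n.right, path + "1", table);
    }
}
```

Because characters live only at leaves, no code can be the prefix of another, which is exactly the property slide 67 required.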
