Data Structure and Algorithms (005) Fall 2024 - Week 12
Data Structure and Algorithms (005) Fall 2024 - Week 12
Fall 2024
Instructor: Prof. John Yun
Lecture: 9-12P, Wednesdays; S2-402
Index
Week 12
Fall 2024
Index
Searching…
• The Era of Data Deluge
• (2.5*1018) = 25,000,000,000,000,000,000 bytes of data created
daily
• 1 Terabyte Disk = 2,500,000 Disks
• Searching…
• Index: At the end of a book
• Website
• File
• Record
Searching in our lives
Record, Key, Search Tree
• Record
• Including everything about an object
• Example: Personal Record
• National ID (Social Security) Number
• Name
• Student ID
• Home Address
• Home Telephone Number Each of these data is
• Work Phone Number, called Field
• Personal Mobile Number
• Education Level
• Annual Income
• Family Members
• The purpose of the index is to search for the record of the object.
Key
• Index is made of a field that represents the record
• Field that uniquely identifying a record
• We call such field a Key.
• In the Personal Record example: disk
• The Name CANNOT be the key
• Fields that CAN be the key:
record
• Student ID
• National ID
• Mobile Number
Red-Black Tree
Search Tree
Balanced Search Tree
2-3-4 Tree
We can use the array for the index, but very inefficient,
It’s not practical.
Week 12
Fall 2024
Tree
root
40
root
10 55
7 20 70
(a) Root node with the key value 30 (b) Root node with the key value 40
7 20 70
10 30 50 65
17
7 20 35 61 70 80
TR
TL
Subtree r
40
10 55
7 20 70
7 20
70
17
Expression can be represented
with the binary search tree
a-b a–b*c (a + b) / c
- - /
a b a * + c
b c a b
Array-based Binary Search Tree Representation
leftChild rightChild
0 40 3 1
1 55 -1 2
0 2 70 -1 -1
40
3 10 6 4
3 1 4 20 5 -1
10 55
5 17 -1 -1
6 4 2 6 7 -1 -1
7 20 70
7 ?
8 ?
5
17
… … … …
Structure of (Binary Search Tree)
Node Object
(a) Binary Search Tree Node (b) Actual Binary Search Tree Node
Structure of (Binary Search Tree)
Node Object
Relationship between Binary Search Tree object and Binary Search Tree represented with tree nodes
Same Data, Different Binary Search Tree
20
Optimal Shape
17 55
55
7 10 40 70 20 70
17 40
7 10
It could look like this…
20 7 Worst Case Shape
10
17 40
17
7 10 55
20
70 40
55
70
• Index
• Binary Search Tree
• Binary Search Tree
Algorithm &
Binary Search Tree Implementation
Algorithm &
Implementation
Week 12
Fall 2024
Search (of a Binary Search Tree)
search(t, x): # t: root reference of the subtree
# x: the key you’re searching for
if (t = null || t.item = x)
return t
else if (x < t.item)
return search(t.left, x) # t.left: left children of the t
else
return search(t.right, x) # t.right: right children of the t
t.left t.right
Solving it recursively!
Recursive Perspective on Search
search(t, x)
t
search(t.left, x)
t.left
search(t, x):
if ( t = null || t.item = x )
return t
else if (x < t.item)
return search(t.left, x)
else
return search(t.right, x)
Failed Search 55
20 70
17 40 Φ(null)
7 10
search(t, x):
if ( t = null || t.item = x )
return t
else if (x < t.item)
return search(t.left, x)
else
return search(t.right, x)
55
Successful Search
20 70
17 40
7 10
(a) Successful Search (b) Failed Searches
Insertion
• Assumption: There is no element x in the search tree
55
20 70
17 40 60
7 10
Insert() Algorithm (for Binary Search Tree)
insert(x):
root ← insertItem(root, x) # root: the reference root node
insertItem(t, x):
# t: root node of the (sub)tree
# x: element x to be inserted
# after finishing inserting the element, return the root node reference
if (t = null)
r.item ← x; r.left ← null; r.right ← null # r: new node
return r
else if (x < t.item)
t.left ← insertItem(t.left, x)
return t
else
t.right ← insertItem(t.right, x)
return x
insert() Algorithm (non-recursive)
# root: reference of the tree root node
insert(x): # x: inserting key
r.item ← x; r.left ← null; r.right ← null # r: new node
if (root = null)
root ← r
else
t ← root
while (t ≠ null)
parent ← t
if (x < t.item) t ← t.left
else t ← t.right
if (x < parent.item) parent.left ← r
else parent.right ← r
Unbalanced Binary Search Tree
Deletion
deleteSketch(r): # r: node to be deleted
if (r is the leaf node) # case 1
leave r as where it is
else if (if r has only one child) # case 2
let r’s parent point to r’s child directly
else # case 3
find the smallest element s from the right subtree of r
copy s into r’s place and remove s
deleteSketch(r): ◀ r: deleting node
if (r is the leaf node) ◀ Case 1
leave the r
else if (r has only child) ◀ Case 2
let r’s parent point to r’s child directly
else ◀ Case 3
find the minimum element s from r’s right subtree
copy s into r and remove s
r r r
r has 2 children
r has no child r has only 1 child
Case 1
Case 1 Delete Result
p p
r has no child
Case 2
Case 2 Delete Result
p p
r r r
r has 2 children
m m
Case 1
(a) r has only 1 child (b) Delete r (c) Place r’s child into r’s place
Case 3 – (1)
(c) Move s’ child to the s place (d) Move s into r’s place
delete() Algorithm (non-recursive)
# root: root node of the tree
delete(r, p): # r: deleting node, p: parent node of r
# when r is the root node
if (r = root)
root ← deleteNode(root);
# when r is not the root node
else if (r = p.left)
p.left ← deleteNode(r) # r is the left child of p
else
p.right ← deleteNode(r) # r is the right child of p
deleteNode(r):
if (r.left = r.right = null) return null
else if (r.left = null and r.right ≠ null) return r.right
else if (r.left ≠ null and r.right = null) return r.left
else
s ← r.right
while (s.left ≠ null)
parent ← s; s ← s.left
r.item ← s.item # copy content
if (s = r.right) r.right ← s.right
else parent.left = s.right
return r
delete() Algorithm (recursive with given x)
delete(t, x): # t: root node of the tree
# x: deleting element
if (t = null)
Error(“item not found!”)
else if (x = t.item)
t ← deleteNode(t)
return t
else if (x < t.item)
t.left ← delete(t.left, x)
return t
else
t.right ← delete(t.right, x)
return t
deleteNode(t):
if (t.left = null && t.right = null)
return null # case 1
else if (t.left = null)
return t.right # case 2 (has only left child)
else if (t.right = null)
return t.left # case 2 (has only right child)
else # case 3 (has both children)
(minItem, node) ← deleteMinItem(t.right)
t.item ← minItem
t.right ← node
return t # t survived but changed
deleteMinItem(t):
if (t.left = null)
return (t.item, t.right) # found minimum at t
else # branch left
(minItem, node) = deleteMinItem(t.left)
t.left ← node
return (minItem, t)
Search Time
Fact:
The asymptotic average search time for a binary
search tree with a total of n nodes is Θ(log n)
Properties of a Binary Search Tree
For integer h>=1, the total number of leaf nodes l(h) of a full binary
Fact 1
search tree with the height h is 2h-1.
Proof
i) (Base Case) When h=1, there is only 1 leaf node, so l(1) = 21-1 =
20 = 1 is satisfied.
ii) (Recursive Assumption) Let’s assume that the full binary search
tree with the height=k has l(k) = 2k-1 number of leaf nodes.
Proof
i) (Base Case) When h=0, it is an empty binary search tree. This
also belongs to the full binary search tree. The number of nodes
is 0. n(0) = 20-1 = 1 is satisfied.
ii) (Recursive Assumption) Let’s assume that for all 0<=k<h, the
number of nodes for a full binary search tree is n(k) = 2k-1
iii) (Recursive Analysis) We need to prove that n(h) = 2h-1 Given the
assumption, the number of nodes for a full binary search tree with
the height h-1 is n(h-1) = 2h-1-1. The full binary search tree with
the height h has two subtrees which are full binary search tree of
their own with the height h-1. Thus, the total number of nodes is
n(h) = 1+2ㆍn(h-1) = 1+2(2h-1-1) = 2h-1 (Here we are adding 1 for
the root node)
For integer h>=0, the binary tree with the height h has equal or less
Fact 3
than 2h-1 number of nodes
Proof
The full binary tree with the height h has the most number of nodes
among all binary trees. Since the full binary tree with the height h has
2h-1 number of nodes, all binary trees with the height h has at most
2h-1 number of nodes.
The height of a binary tree having the total number of n nodes is at
Fact 4
least h log2(n+1) )
Proof If the height of the binary tree is h, then its number of nodes is n 2h-
1 from the Fact 3. When converted, it is h log2(n+1). Because h is
an integer, h log2(n+1) )
Thus, the height of the binary tree having n total number of nodes can
be represented as Ω(log n)
The asymptotic time for insertion in a binary search tree is the same as
Fact 6
the time for searching
The asymptotic average search time for a binary search tree with n
Fact 7
nodes is Ꝋ(log n).
Traversal of Binary Tree
Traversal: Visiting all nodes of a binary tree
Preorder Traversal
Inorder Traversal
Postorder Traversal Number of the node
represents the order of the
visit
Fact:
1 When n elements are inserted
40
sequentially to form a binary search tree,
2 2 and all possible permutations of the input
10 55 elements are equally likely, the expected
IPL (Internal Path Length) of the
3 3 3 resulting binary search tree is O(n log n).
7 20 70
What it means:
4 Average Search Time is O(n logn)
17
IPL = 18
Finding the size of the tree
size(t):
if (t = null) return 0
else return (1+size(t.left)+size(t.right))
Tree Sort
Sort using the binary search tree
treeSort(A[ ], n):
Create a binary search tree, T from reading the elements from A[0…n-1]
in order
Traverse the tree T in inOrder, and save in A[0…n-1]
newItem
newItem
left right
Implementation: Search
Implementation: Insert
Implementation: Delete (1)
Implementation: Delete (2)
# 3 cases
# 1. tNode is the leaf node
# 2. tNode has only 1 child
# 3. tNode has two children
# case 1(no children)
# Miscellaneous
End of Lecture