Heap Sort: Input: One-Dimension Array Advantages of Insertion Sort and Merge Sort Heap Sort
Heap Sort
Heapify Function
1D Array
Memory (from start): a y f k
1-dimensional array x = [a, y, f, k]
x[1] = a; x[2] = y; x[3] = f; x[4] = k
Sorting Revisited
So far we’ve talked about two algorithms to sort an array of
numbers
What is the advantage of merge sort?
Answer: good worst-case running time O(n lg n)
Conceptually easy, Divide-and-Conquer
What is the advantage of insertion sort?
Answer: sorts in place: only a constant number of array
elements are stored outside the input array at any time
Easy to code; runs fast in practice when the array is “nearly sorted”
                  avg case    worst case
Insertion sort    n^2         n^2
Merge sort        n log n     n log n
Next on the agenda: Heapsort
Combines advantages of both previous algorithms
Heaps
A heap can be seen as a complete binary tree
In practice, heaps are usually implemented as arrays
A = 16 14 10 8 7 9 3 2 4 1
Read level by level, A corresponds to the tree:
            16
       14        10
     8    7    9    3
    2 4  1
Referencing Heap Elements
The root node is A[1]
Node i is A[i]
Parent(i)
return ⌊i/2⌋
Left(i)
return 2*i
Right(i)
return 2*i + 1
Index:  1  2  3  4  5  6  7  8  9 10
Value: 16 15 10  8  7  9  3  2  4  1
Level (root down to leaves): 3 2 1 0
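The index arithmetic above can be sketched in Python. This is only an illustration: the unused slot at index 0 is an assumption made here so that list positions match the 1-based slides, and `is_max_heap` is a hypothetical helper that checks the max-heap property A[Parent(i)] >= A[i], which the slides use implicitly.

```python
def parent(i: int) -> int:
    """Index of the parent of node i (1-based); // is floor division."""
    return i // 2


def left(i: int) -> int:
    """Index of the left child of node i."""
    return 2 * i


def right(i: int) -> int:
    """Index of the right child of node i."""
    return 2 * i + 1


def is_max_heap(a: list, heap_size: int) -> bool:
    """True if a[parent(i)] >= a[i] for every node i in 2..heap_size."""
    return all(a[parent(i)] >= a[i] for i in range(2, heap_size + 1))


# Slot 0 is a placeholder (assumption) so that A[1] is the root.
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(A[parent(9)], A[left(2)], A[right(2)])  # parent of 4, children of 14
print(is_max_heap(A, 10))
```

Python lists are 0-based, so an alternative is to drop the placeholder and use (i - 1) // 2, 2*i + 1 and 2*i + 2 instead.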
Heap Operations: Heapify()
Heapify(): maintain the heap property
Given: a node i in the heap with children L and R;
the two subtrees rooted at L and R are assumed to be heaps
Problem: The subtree rooted at i may violate the
heap property (How?)
A[i] may be smaller than one of its children
Action: let the value of the parent node “float
down” so that the subtree rooted at i satisfies the heap property
If A[i] < A[L] or A[i] < A[R], swap A[i] with the
larger of A[L] and A[R]
Recurse on the subtree whose root received A[i]
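The float-down step described above might look as follows in Python. This is a minimal sketch, not the slides' own code: the explicit `heap_size` parameter and the placeholder at index 0 (to keep 1-based indices) are assumptions of this example.

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at 2*i and 2*i + 1 are already max-heaps.
    """
    l, r = 2 * i, 2 * i + 1
    largest = i
    # Pick the largest of a[i] and its children that lie inside the heap.
    if l <= heap_size and a[l] > a[largest]:
        largest = l
    if r <= heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, heap_size)  # recurse into the affected subtree


# Slot 0 is a placeholder (assumption); node 2 holds 4 and violates the property.
A = [None, 16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
heapify(A, 2, 10)
print(A[1:])  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

The value 4 moves from node 2 to node 4 and then to node 9, and stops there because node 9 is a leaf.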
Heapify() Example
16
4 10
14 7 9 3
2 8 1
A = 16 4 10 14 7 9 3 2 8 1
Heapify() Example
16
14 10
4 7 9 3
2 8 1
A = 16 14 10 4 7 9 3 2 8 1
Heapify() Example
16
14 10
8 7 9 3
2 4 1
A = 16 14 10 8 7 9 3 2 4 1
Heap Height
Definitions:
The height of a node in the tree = the number of
edges on the longest downward path to a leaf
# of nodes in each level
Heap Height
A heap storing n keys has height $h = \lfloor \lg n \rfloor = \Theta(\lg n)$
Due to heap being complete, we know:
The maximum # of nodes in a heap of height h:
$2^h + 2^{h-1} + \dots + 2^2 + 2^1 + 2^0 = \sum_{i=0}^{h} 2^i = (2^{h+1} - 1)/(2 - 1) = 2^{h+1} - 1$
The minimum # of nodes in a heap of height h:
$1 + 2^{h-1} + \dots + 2^2 + 2^1 + 2^0 = \sum_{i=0}^{h-1} 2^i + 1 = [(2^{h-1+1} - 1)/(2 - 1)] + 1 = 2^h$
Therefore
$2^h \le n \le 2^{h+1} - 1$
$h \le \lg n$ and $\lg(n+1) - 1 \le h$
$\lg(n+1) - 1 \le h \le \lg n$
which in turn implies:
$h = \lfloor \lg n \rfloor = \Theta(\lg n)$
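The bounds on n and the resulting height formula can be checked numerically. The sketch below is only a sanity check, not a proof; `heap_height` is a hypothetical helper that measures the height by walking from the root to the leftmost leaf in the 1-based array layout.

```python
def heap_height(n: int) -> int:
    """Height of an n-node heap: edges on the path from node 1 to the leftmost leaf."""
    h, i = 0, 1
    while 2 * i <= n:  # node i still has a left child inside the heap
        i *= 2
        h += 1
    return h


for n in range(1, 1025):
    h = heap_height(n)
    assert 2 ** h <= n <= 2 ** (h + 1) - 1  # min and max node counts for height h
    assert h == n.bit_length() - 1          # floor(lg n) for a positive integer n

print(heap_height(10))  # 3, the height of the 10-element example heap
```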
Analyzing Heapify()
So we have
$T(n) \le T(2n/3) + \Theta(1)$
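One way to solve this recurrence is to unroll it. In the sketch below, $c$ is an assumed constant that bounds the $\Theta(1)$ term; it is introduced here for illustration and is not a symbol from the slides. After $k$ steps the subproblem has size at most $(2/3)^k n$, which drops to 1 once $k \ge \log_{3/2} n$.

```latex
\begin{align*}
T(n) &\le T\!\left(\tfrac{2}{3}\,n\right) + c \\
     &\le T\!\left(\left(\tfrac{2}{3}\right)^{2} n\right) + 2c \\
     &\;\;\vdots \\
     &\le T(1) + c\,\lceil \log_{3/2} n \rceil \\
     &= O(\lg n)
\end{align*}
```

This agrees with the height argument: Heapify() does constant work per level and a heap has $\Theta(\lg n)$ levels.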
Heap Operations: BuildHeap()
So:
BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
    heap-size[A] ← length[A]
    for i ← ⌊length[A]/2⌋ downto 1
        do Heapify(A, i)
}
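A runnable version of the pseudocode above could be sketched like this. The iterative `heapify`, the returned heap size, and the placeholder at index 0 are choices made for this example, not details taken from the slides.

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Float a[i] down (iteratively) until the subtree rooted at i is a max-heap."""
    while True:
        l, r, largest = 2 * i, 2 * i + 1, i
        if l <= heap_size and a[l] > a[largest]:
            largest = l
        if r <= heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_heap(a: list) -> int:
    """Turn a[1..n] into a max-heap in place and return the heap size n."""
    n = len(a) - 1  # a[0] is an unused placeholder (assumption)
    # Nodes n//2 + 1 .. n are leaves, so only nodes n//2 .. 1 need fixing.
    for i in range(n // 2, 0, -1):
        heapify(a, i, n)
    return n


A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_heap(A)
print(A[1:])  # [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

Working from ⌊n/2⌋ down to 1 guarantees that both subtrees of node i are already heaps when Heapify(A, i) runs, which is the precondition Heapify() relies on.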
BuildHeap() Example
Each row shows A just before Heapify(A, i) is called
Initial: A = 4 1 3 2 16 9 10 14 8 7
i = 5:   A = 4 1 3 2 16 9 10 14 8 7
i = 4:   A = 4 1 3 2 16 9 10 14 8 7
i = 3:   A = 4 1 3 14 16 9 10 2 8 7
i = 2:   A = 4 1 10 14 16 9 3 2 8 7
i = 1:   A = 4 16 10 14 7 9 3 2 8 1
Result:  A = 16 14 10 8 7 9 3 2 4 1
Analyzing BuildHeap()
Each call to Heapify() takes O(lg n) time
There are O(n) such calls (specifically, n/2)
Thus the running time is O(n lg n)
Is this a correct asymptotic upper bound?
YES
Is this an asymptotically tight bound?
NO
A tighter bound is O(n)
How can this be? Is there a flaw in the above reasoning?
We can derive a tighter bound by observing that the time
for Heapify to run at a node varies with the height of the
node in the tree, and the heights of most nodes are small.
Fact: an n-element heap has at most $2^{h-k}$ nodes at level k,
where h is the height of the tree.
Analyzing BuildHeap(): Tight
The time required by Heapify on a node of height k is O(k).
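So we can express the total cost of BuildHeap as
$$\sum_{k=0}^{h} 2^{h-k} \cdot O(k) = O\left(2^h \sum_{k=0}^{h} \frac{k}{2^k}\right) = O(n)$$
since $\sum_{k=0}^{\infty} k/2^k = 2$ and $2^h \le n$.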
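The linear bound can also be observed empirically. The sketch below counts swaps while building a heap; `build_heap_counting` is a hypothetical helper written for this illustration. Each node moves down at most as far as its height, and the heights of all nodes in an n-element heap sum to less than n, so the swap count should stay below n.

```python
import random


def build_heap_counting(a: list) -> int:
    """Build a max-heap in a[1..n] in place and return the number of swaps."""
    n = len(a) - 1  # a[0] is an unused placeholder (assumption)
    swaps = 0
    for i in range(n // 2, 0, -1):
        j = i
        while True:  # float a[j] down, one level per swap
            l, r, largest = 2 * j, 2 * j + 1, j
            if l <= n and a[l] > a[largest]:
                largest = l
            if r <= n and a[r] > a[largest]:
                largest = r
            if largest == j:
                break
            a[j], a[largest] = a[largest], a[j]
            swaps += 1
            j = largest
    return swaps


random.seed(0)
for n in (10, 100, 1000, 10000):
    data = [None] + random.sample(range(10 * n), n)
    swaps = build_heap_counting(data)
    assert swaps < n  # linear, not n lg n
    print(n, swaps)
```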
Heapsort
Given BuildHeap(), an in-place sorting
algorithm is easily constructed:
Maximum element is at A[1]
Discard by swapping with element at A[n]
Decrement heap_size[A]
A[n] now contains correct value
Restore heap property at A[1] by calling
Heapify()
Repeat, always swapping A[1] with
A[heap_size[A]]
Heapsort
Heapsort(A)
{
    BuildHeap(A)
    for i ← length[A] downto 2
        do exchange A[1] ↔ A[i]
           heap-size[A] ← heap-size[A] - 1
           Heapify(A, 1)
}
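Put together, the pseudocode above might be written in Python as follows. This is a sketch under the same assumptions as before: index 0 is an unused placeholder so that indices stay 1-based, and `heapify` takes the heap size as an explicit argument instead of reading heap-size[A].

```python
def heapify(a: list, i: int, heap_size: int) -> None:
    """Float a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        l, r, largest = 2 * i, 2 * i + 1, i
        if l <= heap_size and a[l] > a[largest]:
            largest = l
        if r <= heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a: list) -> None:
    """Sort a[1..n] in place in ascending order; a[0] is ignored."""
    n = len(a) - 1
    for i in range(n // 2, 0, -1):  # BuildHeap
        heapify(a, i, n)
    heap_size = n
    for i in range(n, 1, -1):
        a[1], a[i] = a[i], a[1]     # move the current maximum to its final slot
        heap_size -= 1              # the sorted tail is no longer part of the heap
        heapify(a, 1, heap_size)    # restore the heap property at the root


A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(A)
print(A[1:])  # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

Only a constant number of elements are ever held outside the list, so the sort is in place, as with insertion sort.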
HeapSort() Example
Each row shows A after iteration i; elements to the right of | are in their final sorted position
After BuildHeap: A = 16 14 10 8 7 9 3 2 4 1
i = 10: A = 14 8 10 4 7 9 3 2 1 | 16
i = 9:  A = 10 8 9 4 7 1 3 2 | 14 16
i = 8:  A = 9 8 3 4 7 1 2 | 10 14 16
i = 7:  A = 8 7 3 4 2 1 | 9 10 14 16
i = 6:  A = 7 4 3 1 2 | 8 9 10 14 16
i = 5:  A = 4 2 3 1 | 7 8 9 10 14 16
i = 4:  A = 3 2 1 | 4 7 8 9 10 14 16
i = 3:  A = 2 1 | 3 4 7 8 9 10 14 16
i = 2:  A = 1 | 2 3 4 7 8 9 10 14 16
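Analyzing Heapsort
The call to BuildHeap() takes O(n) time
Each of the n - 1 calls to Heapify() takes O(lg n) time
Thus the total time taken by Heapsort()
= O(n) + (n - 1) O(lg n)
= O(n) + O(n lg n)
= O(n lg n)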