
Chapter 6

Heap Sort

6 -- 1

Outline: Heap Sort

- Input: One-Dimensional Array
- Advantages of Insertion Sort and Merge Sort
- Heap Sort:
  - The Heap Property
  - Heapify Function
  - Build Heap Function
  - Heap Sort Function

6 -- 2

1D Array

Memory (from address `start`): a y f k

- 1-dimensional array x = [a, y, f, k]
- x[1] = a; x[2] = y; x[3] = f; x[4] = k

6 -- 3

Sorting Revisited

- So far we have seen two algorithms that sort an array of numbers
- What is the advantage of merge sort?
  - Good worst-case running time: O(n lg n)
  - Conceptually easy; divide-and-conquer
- What is the advantage of insertion sort?
  - Sorts in place: only a constant number of array elements are stored outside the input array at any time
  - Easy to code; runs fast in practice when the array is "nearly sorted"

                 average case   worst case
Insertion sort       n^2           n^2
Merge sort         n lg n        n lg n

- Next on the agenda: Heapsort
  - Combines the advantages of both previous algorithms

6 -- 4

2
Heaps

- A heap can be seen as a complete binary tree
- In practice, heaps are usually implemented as arrays
- An array A that represents a heap is an object with two attributes: A[1 .. length[A]]
  - length[A]: # of elements in the array
  - heap-size[A]: # of elements in the heap stored within array A, where heap-size[A] ≤ length[A]
- No element past A[heap-size[A]] is an element of the heap

A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

6 -- 5

Heaps

- For example, heap-size of the following heap = 10
- Also, length[A] = 10

A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

Tree by levels: 16 | 14 10 | 8 7 9 3 | 2 4 1

6 -- 6

Referencing Heap Elements

- The root node is A[1]
- Node i is A[i]
- Parent(i): return ⌊i/2⌋
- Left(i): return 2*i
- Right(i): return 2*i + 1

index: 1  2  3  4  5  6  7  8  9  10
value: 16 14 10 8  7  9  3  2  4  1

Node heights, level by level (root to leaves): 3 2 1 0

6 -- 7
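These index computations can be sketched in Python; the function names and the 1-based convention below are my own rendering of the slide's formulas:

```python
# 1-based heap index arithmetic, as on the slide (a sketch; a real
# Python list usually stores the same tree 0-based).

def parent(i: int) -> int:
    return i // 2          # floor(i/2)

def left(i: int) -> int:
    return 2 * i

def right(i: int) -> int:
    return 2 * i + 1
```

For node 4 in the example array this gives parent 2 and children 8 and 9.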

The Heap Property

- Heaps also satisfy the heap property:
  - A[Parent(i)] ≥ A[i] for all nodes i > 1
  - In other words, the value of a node is at most the value of its parent
- The largest value in a heap is at its root (A[1])
- Subtrees rooted at a specific node contain values no larger than that node's value

6 -- 8
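The property above is easy to test mechanically. A small checker (my own sketch, using 0-based Python lists, so the parent of index i is (i − 1) // 2):

```python
# Verify the max-heap property: every node's value is at most
# its parent's value.

def is_max_heap(a: list) -> bool:
    for i in range(1, len(a)):
        if a[(i - 1) // 2] < a[i]:  # parent smaller than child: violation
            return False
    return True
```

On the slide's array [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] it returns True; swapping 14 out of position makes it fail.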

Heap Operations: Heapify()

- Heapify(): maintains the heap property
- Given: a node i in the heap with children L and R
  - The two subtrees rooted at L and R are assumed to be heaps
- Problem: the subtree rooted at i may violate the heap property (How?)
  - A[i] may be smaller than its children's values
- Action: let the value of the parent node "float down" so the subtree rooted at i satisfies the heap property
  - If A[i] < A[L] or A[i] < A[R], swap A[i] with the larger of A[L] and A[R]
  - Recurse on that subtree

6 -- 9

Heap Operations: Heapify()

Heapify(A, i)
{
  L ← Left(i)
  R ← Right(i)
  if L ≤ heap-size[A] and A[L] > A[i]
     then largest ← L
     else largest ← i
  if R ≤ heap-size[A] and A[R] > A[largest]
     then largest ← R
  if largest ≠ i
     then exchange A[i] ↔ A[largest]
          Heapify(A, largest)
}

6 -- 10
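A Python rendering of the Heapify pseudocode above (a sketch; it uses 0-based list indices instead of the slides' 1-based arrays, so the children of node i are 2i+1 and 2i+2):

```python
# Restore the max-heap property at node i, assuming both of its
# children's subtrees are already max-heaps.

def heapify(a: list, i: int, heap_size: int) -> None:
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]   # let the value float down
        heapify(a, largest, heap_size)
```

Running it at node index 1 (0-based) of the example array [16, 4, 10, 14, 7, 9, 3, 2, 8, 1] floats the 4 down, yielding [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] as in the worked example.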

Heapify() Example

A = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]

Tree by levels: 16 | 4 10 | 14 7 9 3 | 2 8 1

6 -- 11


Heapify() Example

A = [16, 14, 10, 4, 7, 9, 3, 2, 8, 1]

Tree by levels: 16 | 14 10 | 4 7 9 3 | 2 8 1

6 -- 14


Heapify() Example

A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

Tree by levels: 16 | 14 10 | 8 7 9 3 | 2 4 1

6 -- 17


Heap Height

- Definition:
  - The height of a node in the tree = the number of edges on the longest downward path from that node to a leaf

- What is the height of an n-element heap? Why?
  - The height of the tree for a heap is Θ(lg n)
  - Because the heap is a binary tree, the height of any node is O(lg n)
  - Thus, the basic operations on a heap run in O(lg n) time

6 -- 20

# of Nodes in Each Level

- Fact: an n-element heap has at most 2^(h−k) nodes at level k, where h is the height of the tree and levels are numbered from the leaves (level 0) up to the root (level h)
  - for k = h (root level): 2^(h−h) = 2^0 = 1
  - for k = h−1: 2^(h−(h−1)) = 2^1 = 2
  - for k = h−2: 2^(h−(h−2)) = 2^2 = 4
  - for k = h−3: 2^(h−(h−3)) = 2^3 = 8
  - …
  - for k = 1: 2^(h−1)
  - for k = 0 (leaf level): 2^(h−0) = 2^h

6 -- 21

Heap Height

- A heap storing n keys has height h = ⌊lg n⌋ = Θ(lg n)
- Because the heap is complete, we know:
  - The maximum # of nodes in a heap of height h:
    2^h + 2^(h−1) + … + 2^2 + 2^1 + 2^0 = Σ_{i=0..h} 2^i = (2^(h+1) − 1)/(2 − 1) = 2^(h+1) − 1
  - The minimum # of nodes in a heap of height h:
    1 + 2^(h−1) + … + 2^2 + 2^1 + 2^0 = Σ_{i=0..h−1} 2^i + 1 = [(2^h − 1)/(2 − 1)] + 1 = 2^h
- Therefore:
  - 2^h ≤ n ≤ 2^(h+1) − 1
  - h ≤ lg n and lg(n+1) − 1 ≤ h
  - lg(n+1) − 1 ≤ h ≤ lg n
- Which in turn implies:
  - h = ⌊lg n⌋ = Θ(lg n)
6 -- 22
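The bound 2^h ≤ n ≤ 2^(h+1) − 1 is easy to sanity-check numerically (a quick sketch, not part of the slides):

```python
# Check that h = floor(lg n) satisfies 2^h <= n <= 2^(h+1) - 1
# for every heap size n up to 1000.

def heap_height(n: int) -> int:
    return n.bit_length() - 1   # floor(lg n) for n >= 1

for n in range(1, 1001):
    h = heap_height(n)
    assert 2**h <= n <= 2**(h + 1) - 1
```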

Analyzing Heapify()

- Aside from the recursive call, what is the running time of Heapify()?
- How many times can Heapify() recursively call itself?
- What is the worst-case running time of Heapify() on a heap of size n?

6 -- 23

Analyzing Heapify()

- The running time at any given node i is:
  - Θ(1) time to fix up the relationships among A[i], A[Left(i)] and A[Right(i)]
  - plus the time to call Heapify recursively on a subtree rooted at one of the children of node i
- The children's subtrees each have size at most 2n/3
  - The worst case occurs when the last row of the tree is exactly half full: the half-full bottom row then falls entirely within one child's subtree, giving that subtree about 2n/3 of the n nodes
6 -- 24

Analyzing Heapify()

- So we have
  T(n) ≤ T(2n/3) + Θ(1)
- Heapify takes T(n) = Θ(h)
  - h = height of heap = ⌊lg n⌋
  - T(n) = Θ(lg n)

6 -- 25

Heap Operations: BuildHeap()

- We can build a heap in a bottom-up manner by running Heapify() on successive subarrays
- Fact: for an array of length n, all elements in the range A[⌊n/2⌋+1 .. n] are heaps (Why?)
  - These elements are leaves; they have no children
- We know that (for a full tree of height h):
  - 2^(h+1) − 1 = n  ⟹  2·2^h = n + 1
  - 2^h = (n + 1)/2 = ⌊n/2⌋ + 1 = ⌈n/2⌉
- We also know that the leaf level has at most
  - 2^h = ⌊n/2⌋ + 1 = ⌈n/2⌉ nodes
  - and the other levels have a total of ⌊n/2⌋ nodes
  - ⌈n/2⌉ + ⌊n/2⌋ = n
6 -- 26

Heap Operations: BuildHeap()

- So:
  - Walk backwards through the array from ⌊n/2⌋ down to 1, calling Heapify() on each node
  - This order of processing guarantees that the children of node i are heaps when i is processed

6 -- 27

BuildHeap()

// given an unsorted array A, make A a heap

BuildHeap(A)
{
  heap-size[A] ← length[A]
  for i ← ⌊length[A]/2⌋ downto 1
      do Heapify(A, i)
}

The BuildHeap procedure, which runs in linear time, produces a max-heap from an unsorted input array. However, the Heapify procedure, which runs in O(lg n) time, is the key to maintaining the heap property.
6 -- 28
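A Python rendering of BuildHeap (a sketch; with 0-based indices the loop starts at n//2 − 1 rather than the slides' ⌊n/2⌋, and Heapify is repeated here so the block is self-contained):

```python
# Restore the max-heap property at node i (children's subtrees
# assumed to be max-heaps already).
def heapify(a, i, n):
    largest, l, r = i, 2 * i + 1, 2 * i + 2
    if l < n and a[l] > a[largest]:
        largest = l
    if r < n and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)

# Build a max-heap bottom-up: walk backwards over the internal nodes.
def build_heap(a):
    for i in range(len(a) // 2 - 1, -1, -1):
        heapify(a, i, len(a))
```

On the example array {4, 1, 3, 2, 16, 9, 10, 14, 8, 7} this produces the heap {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}.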

BuildHeap() Example

- Work through example A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
- n = 10, ⌊n/2⌋ = 5

Tree by levels: 4 | 1 3 | 2 16 9 10 | 14 8 7

6 -- 29

BuildHeap() Example

- A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}, i = 5

Tree by levels: 4 | 1 3 | 2 16 9 10 | 14 8 7

6 -- 30

BuildHeap() Example

- A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}, i = 4

Tree by levels: 4 | 1 3 | 2 16 9 10 | 14 8 7

6 -- 31

BuildHeap() Example

- A = {4, 1, 3, 14, 16, 9, 10, 2, 8, 7}, i = 3

Tree by levels: 4 | 1 3 | 14 16 9 10 | 2 8 7

6 -- 32

BuildHeap() Example

- A = {4, 1, 10, 14, 16, 9, 3, 2, 8, 7}, i = 2

Tree by levels: 4 | 1 10 | 14 16 9 3 | 2 8 7

6 -- 33

BuildHeap() Example

- A = {4, 16, 10, 14, 7, 9, 3, 2, 8, 1}, i = 1

Tree by levels: 4 | 16 10 | 14 7 9 3 | 2 8 1

6 -- 34

BuildHeap() Example

- A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}

Tree by levels: 16 | 14 10 | 8 7 9 3 | 2 4 1

6 -- 35

Analyzing BuildHeap()

- Each call to Heapify() takes O(lg n) time
- There are O(n) such calls (specifically, ⌊n/2⌋)
- Thus the running time is O(n lg n)
- Is this a correct asymptotic upper bound?
  - YES
- Is this an asymptotically tight bound?
  - NO
  - A tighter bound is O(n)
- How can this be? Is there a flaw in the above reasoning?
  - We can derive a tighter bound by observing that the time for Heapify to run at a node varies with the height of the node in the tree, and the heights of most nodes are small
  - Fact: an n-element heap has at most 2^(h−k) nodes at level k, where h is the height of the tree
6 -- 36

Analyzing BuildHeap(): Tight

- The time required by Heapify on a node of height k is O(k), so we can express the total cost of BuildHeap as

  Σ_{k=0..h} 2^(h−k) · O(k) = O(2^h · Σ_{k=0..h} k/2^k) = O(n · Σ_{k=0..h} k·(½)^k)

- From Σ_{k=0..∞} k·x^k = x/(1−x)^2 with x = ½:

  Σ_{k=0..∞} k/2^k = (½)/(1 − ½)^2 = 2

- Therefore, O(n · Σ_{k=0..h} k/2^k) = O(n)

- So we can bound the running time for building a heap from an unordered array by linear time
6 -- 37
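The constant behind that O(n) bound can be checked numerically (my own sketch, not from the slides): the partial sums of Σ k/2^k converge to 2 very quickly.

```python
# Numeric check that sum_{k>=0} k/2^k approaches 2; the first 60
# terms already agree to well beyond 9 decimal places.

partial = sum(k / 2**k for k in range(60))
assert abs(partial - 2.0) < 1e-9
```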

Heapsort

- Given BuildHeap(), an in-place sorting algorithm is easily constructed:
  - The maximum element is at A[1]
  - Discard it by swapping it with the element at A[n]
  - Decrement heap-size[A]
  - A[n] now contains the correct value
  - Restore the heap property at A[1] by calling Heapify()
  - Repeat, always swapping A[1] with A[heap-size[A]]

6 -- 38

Heapsort

Heapsort(A)
{
  BuildHeap(A)
  for i ← length[A] downto 2
      do exchange A[1] ↔ A[i]
         heap-size[A] ← heap-size[A] − 1
         Heapify(A, 1)
}

6 -- 39
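The whole algorithm can be rendered in Python (a sketch with 0-based indices; Heapify is repeated so the block is self-contained):

```python
# Restore the max-heap property at node i within the first n elements.
def heapify(a, i, n):
    largest, l, r = i, 2 * i + 1, 2 * i + 2
    if l < n and a[l] > a[largest]:
        largest = l
    if r < n and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n)

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):   # BuildHeap
        heapify(a, i, n)
    for end in range(n - 1, 0, -1):       # repeatedly move the max to the end
        a[0], a[end] = a[end], a[0]
        heapify(a, 0, end)                # heap shrinks to `end` elements
```

The sort happens entirely inside the input list, mirroring the in-place property discussed above.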

HeapSort() Example

- A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}

Tree by levels: 16 | 14 10 | 8 7 9 3 | 2 4 1

6 -- 40

HeapSort() Example

- A = {14, 8, 10, 4, 7, 9, 3, 2, 1, 16}, i = 10

Heap (tree by levels): 14 | 8 10 | 4 7 9 3 | 2 1

6 -- 41

HeapSort() Example

- A = {10, 8, 9, 4, 7, 1, 3, 2, 14, 16}, i = 9

Heap (tree by levels): 10 | 8 9 | 4 7 1 3 | 2

6 -- 42

HeapSort() Example

- A = {9, 8, 3, 4, 7, 1, 2, 10, 14, 16}, i = 8

Heap (tree by levels): 9 | 8 3 | 4 7 1 2

6 -- 43

HeapSort() Example

- A = {8, 7, 3, 4, 2, 1, 9, 10, 14, 16}, i = 7

Heap (tree by levels): 8 | 7 3 | 4 2 1

6 -- 44

HeapSort() Example

- A = {7, 4, 3, 1, 2, 8, 9, 10, 14, 16}, i = 6

Heap (tree by levels): 7 | 4 3 | 1 2

6 -- 45

HeapSort() Example

- A = {4, 2, 3, 1, 7, 8, 9, 10, 14, 16}, i = 5

Heap (tree by levels): 4 | 2 3 | 1

6 -- 46

HeapSort() Example

- A = {3, 2, 1, 4, 7, 8, 9, 10, 14, 16}, i = 4

Heap (tree by levels): 3 | 2 1

6 -- 47

HeapSort() Example

- A = {2, 1, 3, 4, 7, 8, 9, 10, 14, 16}, i = 3

Heap (tree by levels): 2 | 1

6 -- 48

HeapSort() Example

- A = {1, 2, 3, 4, 7, 8, 9, 10, 14, 16}, i = 2

Heap (tree by levels): 1

6 -- 49

Analyzing Heapsort

- The call to BuildHeap() takes O(n) time
- Each of the n − 1 calls to Heapify() takes O(lg n) time
- Thus the total time taken by HeapSort()
  = O(n) + (n − 1) · O(lg n)
  = O(n) + O(n lg n)
  = O(n lg n)

6 -- 50

Analyzing Heapsort

- The O(n lg n) running time of heapsort is much better than the O(n^2) running time of selection sort and insertion sort
- Although it has the same running time as merge sort, heapsort is better than merge sort with regard to memory space
  - Heapsort is an in-place sorting algorithm
- But heapsort is not stable
  - It does not preserve the relative order of elements with equal keys

6 -- 51
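Instability is easy to demonstrate concretely. The sketch below (my own; the key-comparing variant of the Python heapsort is not from the slides) sorts records by their numeric key only and shows that records with equal keys can come out in a different relative order than they went in:

```python
# Heapsort comparing only key(record); labels let us observe the
# relative order of equal-keyed records.

def heapify(a, i, n, key):
    largest, l, r = i, 2 * i + 1, 2 * i + 2
    if l < n and key(a[l]) > key(a[largest]):
        largest = l
    if r < n and key(a[r]) > key(a[largest]):
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, n, key)

def heapsort(a, key=lambda x: x):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n, key)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        heapify(a, 0, end, key)

records = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')]
heapsort(records, key=lambda r: r[0])
# keys are now sorted, but among the key-2 records 'c' precedes 'a',
# the opposite of their input order: not stable
```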
