
L14 Heaps

Uploaded by

Jessica Milner
Heaps

The Heap Property and the Best Priority Queue


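Recall: Array Binary Trees
[Figure: binary tree whose nodes are labeled with their array indices: 0 at the root, 1 and 2 on the next level, 3 4 5 6 on the level below]

More compact storage and better locality of reference

It is expensive to grow and wastes space proportional to $2^h$ for a tree of depth h with n nodes.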
Recall: Array Binary Trees

Binary Trees starting at index 0
We can represent a tree with n keys by means of an array/vector of length n
Cell at index 0 is the root node
Where i is the parent’s index:
● A left child is held at index 2i + 1
● A right child is held at index 2i + 2
A parent is at ⌊(i - 1) / 2⌋ for a node at index i

Binary Trees starting at index 1 (Book Examples)
We can represent a tree with n keys by means of an array/vector of length n + 1
Cell at index 0 is not used, cell 1 is the root node
Where i is the parent’s index:
● A left child is held at index 2i
● A right child is held at index 2i + 1
A parent is at ⌊i / 2⌋ for a node at index i
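The two index conventions above can be sketched as plain functions. This is a minimal illustration only; the function names (left0, parent1, and so on) are hypothetical and do not come from the slides or the book.

```cpp
#include <cassert>
#include <cstddef>

// 0-based layout: root in cell 0, array of length n.
std::size_t left0(std::size_t i)   { return 2 * i + 1; }
std::size_t right0(std::size_t i)  { return 2 * i + 2; }
std::size_t parent0(std::size_t i) {
    assert(i > 0);             // the root (index 0) has no parent
    return (i - 1) / 2;        // integer division gives the floor
}

// 1-based layout: cell 0 unused, root in cell 1, array of length n + 1.
std::size_t left1(std::size_t i)   { return 2 * i; }
std::size_t right1(std::size_t i)  { return 2 * i + 1; }
std::size_t parent1(std::size_t i) {
    assert(i > 1);             // the root (index 1) has no parent
    return i / 2;              // integer division gives the floor
}
```

In both layouts, applying parent to either child of i returns i, which is a quick way to check the formulas.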
Data Structures: Heaps

Learn about heaps. This video is a part of HackerRank's Cracking The Coding Interview Tutorial with Gayle Laakmann McDowell.
http://www.hackerrank.com/domains/tutorials/cracking-the-coding-interview?utm_source=video&utm_medium=youtube&utm_campaign=ctci
Heaps
A heap is (typically) a binary tree storing keys at its nodes and satisfying the
following properties:

Heap Property: if P is a parent node of C, then the key of P is either greater than
or equal to (in a max heap) or less than or equal to (in a min heap) the key of
C

● Max Heap: key(P) ≥ key(C)


● Min Heap: key(P) ≤ key(C)

The last node of a heap is the rightmost node of maximum depth
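As a rough sketch of what the heap property means for an array-stored tree, the check below walks every non-root cell and compares it with its parent. It assumes the 0-based layout and int keys; the names isMinHeap and isMaxHeap are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True if key(P) <= key(C) holds for every parent/child pair (0-based layout).
bool isMinHeap(const std::vector<int>& a) {
    for (std::size_t c = 1; c < a.size(); ++c) {
        std::size_t p = (c - 1) / 2;          // parent of cell c
        if (a[p] > a[c]) return false;
    }
    return true;
}

// True if key(P) >= key(C) holds for every parent/child pair (0-based layout).
bool isMaxHeap(const std::vector<int>& a) {
    for (std::size_t c = 1; c < a.size(); ++c) {
        std::size_t p = (c - 1) / 2;          // parent of cell c
        if (a[p] < a[c]) return false;
    }
    return true;
}
```

Note that the property only relates a parent to its children; siblings may appear in any order.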


VisuAlgo - Heaps
Heaps Implementation
Heaps are usually implemented in an array (fixed size or dynamic array/vector)

This avoids the need for pointers between elements, like in linked trees

After an element is inserted into or deleted from a heap, the heap property may
be violated and must be restored by internal operations (upheap after an
insertion, downheap after a removal)

[Figure: heap stored in an array with cells indexed 0 to 14]
Complete Binary Tree
Complete Binary Tree: let h be the height of the heap

● For i = 0, … , h − 1, there are $2^i$ nodes of depth i
● At depth h − 1, the internal nodes are to the left of the external nodes

The last node of a heap is the rightmost node of maximum depth
Complete Binary Tree Property
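● A heap T storing n entries has height $h = \lfloor \log n \rfloor$ (logarithms are base 2)

Proof: From the fact that T is complete, there are $2^i$ nodes in level $i$, $0 \le i \le h - 1$, and level $h$ has at least 1 node and at most $2^h$ nodes. Therefore

$n \ge 1 + 2 + 4 + \cdots + 2^{h-1} + 1 = (2^h - 1) + 1 = 2^h$

$n \le 1 + 2 + 4 + \cdots + 2^{h-1} + 2^h = 2^{h+1} - 1$

so $2^h \le n \le 2^{h+1} - 1$, which gives $h \le \log n$ and $\log(n + 1) - 1 \le h$.

Since h is an integer, $h = \lfloor \log n \rfloor$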
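A small numeric check of this result, as a sketch: the height is computed by repeated halving, which is one way to obtain the floor of the base-2 logarithm without floating point. The function name heapHeight is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>

// Height of a complete binary tree with n >= 1 nodes, i.e. floor(log2 n).
// Counts how many times n can be halved before it reaches 1.
int heapHeight(std::size_t n) {
    assert(n >= 1);
    int h = 0;
    while (n > 1) {
        n /= 2;
        ++h;
    }
    return h;
}
```

For every n, the returned h satisfies the bounds used in the proof: 2^h ≤ n ≤ 2^(h+1) − 1.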
Interface: Complete Tree for Heap
// left-complete tree interface
template <typename E>
class CompleteTree {
public:
    class Position;                                   // node position type
    int size() const;                                 // number of elements
    Position left(const Position& p);                 // get left child
    Position right(const Position& p);                // get right child
    Position parent(const Position& p);               // get parent
    bool hasLeft(const Position& p) const;            // node left child?
    bool hasRight(const Position& p) const;           // node right child?
    bool isRoot(const Position& p) const;             // is this the root?
    Position root();                                  // get root position
    Position last();                                  // get last node
    void addLast(const E& e);                         // add a new last node
    void removeLast();                                // remove last node
    // swap node contents
    void swap(const Position& p, const Position& q);
};
Implementation (Partial): Complete Tree for Heap using a Vector
#include <vector>

template <typename E>
class VectorCompleteTree {
public:
    VectorCompleteTree() : V(1) {}                    // cell 0 is an unused placeholder
    // a position in the tree
    typedef typename std::vector<E>::iterator Position;
private:
    std::vector<E> V;                                 // tree contents
protected:
    // map an index to a position
    Position pos(int i) { return V.begin() + i; }
    // map a position to an index
    int idx(const Position& p) const { return p - V.begin(); }
};
Implementation (Partial): Complete Tree for Heap using a Vector
template <typename E>
class VectorCompleteTree {
public:
    int size() const { return V.size() - 1; }
    Position root() { return pos(1); }
    Position last() { return pos(size()); }
    void addLast(const E& e) { V.push_back(e); }
    void removeLast() { V.pop_back(); }
    void swap(const Position& p, const Position& q) {
        E e = *q;
        *q = *p;
        *p = e;
    }
};

Note: This Binary Tree is starting at index 1 (like the Book Examples)
Implementation (Partial): Complete Tree for Heap using a Vector
template <typename E>
class VectorCompleteTree {
public:
    Position left(const Position& p)
        { return pos(2 * idx(p)); }
    Position right(const Position& p)
        { return pos(2 * idx(p) + 1); }
    Position parent(const Position& p)
        { return pos(idx(p) / 2); }
    bool hasLeft(const Position& p) const
        { return 2 * idx(p) <= size(); }
    bool hasRight(const Position& p) const
        { return 2 * idx(p) + 1 <= size(); }
    bool isRoot(const Position& p) const
        { return idx(p) == 1; }
};

Note: This Binary Tree is starting at index 1 (like the Book Examples)
Left child is held at index 2i, Right child is held at index 2i + 1,
A parent is at floor( i / 2 ) for a node at index i
Recall: Priority Queue ADT
A priority queue stores a collection of entries

Typically, an entry is a pair(key, value), where the key indicates the priority

Main methods of the Priority Queue ADT:

● insert(e) inserts an entry e


● removeMin() removes the entry with smallest key

Additional methods: min(), size(), empty()

Applications: Standby flyers, Auctions, Stock market
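To make the ADT concrete, here is a sketch that uses the C++ standard library's std::priority_queue rather than the heap class developed in these slides. std::priority_queue puts the largest element on top by default, so std::greater is supplied to get removeMin behaviour. The Entry alias and the drainByPriority function are illustrative names only.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;   // (key, value); smaller key = higher priority

// Inserts all entries, then removes them one by one in order of increasing key.
std::vector<std::string> drainByPriority(const std::vector<Entry>& entries) {
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
    for (const Entry& e : entries) {
        pq.push(e);                           // insert(e)
    }
    std::vector<std::string> order;
    while (!pq.empty()) {                     // empty()
        order.push_back(pq.top().second);     // min()
        pq.pop();                             // removeMin()
    }
    return order;
}
```

With standby flyers keyed by a priority number, for example, the returned vector lists passengers in the order they would be seated.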


Heaps and Priority Queues
We can use a heap to implement a priority queue

We store a (key, element) item at each node

We keep track of the position of the last node


Insertion into a Heap
Method insert() of the priority queue
ADT corresponds to the insertion of a
key k to the heap

Insertion algorithm consists of 3 steps

● Find the insertion node z (the new last node)
● Store k at z
● Restore the heap-order property
(discussed next in Upheap)
Upheap (Restoring the Heap)
After the insertion of a new key k, the heap property may be violated

Upheap restores the heap-order by swapping k along an upward path from the insertion
node

Upheap terminates when key k reaches the root or a node whose parent has a key smaller
than or equal to k

Since a heap has height O(log n), upheap runs in O(log n) time
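The insertion steps and upheap can be sketched directly on a vector. This is a simplified stand-in for the class-based code in these slides: it assumes int keys, a min-heap, and the 1-based layout with an unused cell 0. The name heapInsert is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Min-heap stored 1-based in `heap`; heap[0] is an unused placeholder.
// Stores k at the new last node, then swaps it upward until its parent
// has a key smaller than or equal to k, or k reaches the root.
void heapInsert(std::vector<int>& heap, int k) {
    assert(!heap.empty());                     // the placeholder cell must exist
    heap.push_back(k);                         // new last node z
    std::size_t i = heap.size() - 1;
    while (i > 1 && heap[i] < heap[i / 2]) {   // parent of i is i / 2
        std::swap(heap[i], heap[i / 2]);
        i /= 2;
    }
}
```

Each loop iteration moves k up one level, so the number of swaps is at most the height of the heap.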
Removal from a Heap
Method removeMin of the priority queue ADT
corresponds to the removal of the root key from
the heap

The removal algorithm consists of three steps

● Replace the root key with the key of the last node w
● Remove w
● Restore the heap-order property (discussed
next in Downheap)
Downheap (Restoring the Heap)
After replacing the root key with the key k of the last node, the heap property may be
violated

Downheap restores the heap property by swapping key k along a downward path from the
root

Downheap terminates when key k reaches a leaf or a node whose children have keys
greater than or equal to k

Since a heap has height O(log n), downheap runs in O(log n) time
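The removal steps and downheap can be sketched on a vector in the same spirit. As an assumption of this example, keys are ints, the heap is a min-heap, and the layout is 1-based with an unused cell 0. The name heapRemoveMin is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Min-heap stored 1-based in `heap`; heap[0] is an unused placeholder.
// Removes and returns the root key, then restores the heap property.
int heapRemoveMin(std::vector<int>& heap) {
    assert(heap.size() > 1);                   // at least one real key
    int minKey = heap[1];
    heap[1] = heap.back();                     // move last node's key to the root
    heap.pop_back();                           // remove the last node w
    std::size_t n = heap.size() - 1;           // number of keys remaining
    std::size_t i = 1;
    while (2 * i <= n) {                       // while i has a left child
        std::size_t c = 2 * i;
        if (c + 1 <= n && heap[c + 1] < heap[c]) {
            ++c;                               // c is the smaller child
        }
        if (heap[i] <= heap[c]) break;         // both children are >= key: done
        std::swap(heap[i], heap[c]);
        i = c;
    }
    return minKey;
}
```

Swapping with the smaller child matters: swapping with the larger one could leave the other child smaller than its new parent.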
Note on Min vs Max Heaps
The examples in the book and in this presentation show a
min-heap, where key(P) ≤ key(C)

If a max-heap were used, the Upheap and Downheap algorithms
would instead work to restore the key(P) ≥ key(C) property
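One way to see that only the comparison changes is to build a heap with the standard library's std::make_heap under two different comparators. This sketch uses the standard algorithm rather than the Upheap and Downheap code from these slides; rootAfterHeapify is a made-up helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Arranges `keys` into a heap under `comp` and returns the key at the root.
// With std::less<int> the standard algorithms produce a max-heap;
// with std::greater<int> they produce a min-heap.
template <typename Compare>
int rootAfterHeapify(std::vector<int> keys, Compare comp) {
    assert(!keys.empty());
    std::make_heap(keys.begin(), keys.end(), comp);
    return keys.front();
}
```

The same idea is what the comparator template parameter C provides in a heap-based priority queue: swapping the comparator flips the heap between min and max order without touching the bubbling logic.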
Interface: Priority Queue with a Heap
template <typename E, typename C>
class HeapPriorityQueue {
public:
    int size() const;                 // number of elements
    bool empty() const;               // is the queue empty?
    void insert(const E& e);          // insert element
    const E& min();                   // minimum element
    void removeMin();                 // remove minimum
private:
    VectorCompleteTree<E> T;          // priority queue contents
    C isLess;                         // less-than comparator
    // shortcut for tree position
    typedef typename VectorCompleteTree<E>::Position Position;
};
Simple Functions: Priority Queue with a Heap
// number of elements
template <typename E, typename C>
int HeapPriorityQueue<E,C>::size() const {
    return T.size();
}

// is the queue empty?
template <typename E, typename C>
bool HeapPriorityQueue<E,C>::empty() const {
    return size() == 0;
}

// minimum element
template <typename E, typename C>
const E& HeapPriorityQueue<E,C>::min() {
    return *(T.root());               // return reference to root element
}
Insert: Priority Queue with a Heap
// insert element
template <typename E, typename C>
void HeapPriorityQueue<E,C>::insert(const E& e) {
    T.addLast(e);                     // add e to heap
    Position v = T.last();            // e's position
    // up-heap bubbling
    while (!T.isRoot(v)) {
        Position u = T.parent(v);
        // if v in order, we're done
        if (!isLess(*v, *u)) break;
        // ...else swap with parent
        T.swap(v, u);
        v = u;
    }
}
Remove Min: Priority Queue with a Heap
template <typename E, typename C>
void HeapPriorityQueue<E,C>::removeMin() {
    if (size() == 1) {                // only one node? remove it
        T.removeLast();
    } else {
        Position u = T.root();
        // swap last with root and remove last
        T.swap(u, T.last());
        T.removeLast();
        // down-heap bubbling
        while (T.hasLeft(u)) {
            Position v = T.left(u);
            if (T.hasRight(u) && isLess(*(T.right(u)), *v)) {
                v = T.right(u);       // v is u's smaller child
            }
            // is u out of order? then swap
            if (isLess(*v, *u)) {
                T.swap(u, v);
                u = v;
            } else { break; }         // else we're done
        }
    }
}
Heap-Sort
Consider a priority queue with n items implemented by means of a heap
● the space used is O(n)
● methods insert and removeMin take O(log n) time
● methods size, empty, and min take O(1) time

Using a heap-based priority queue, we can sort a sequence of n elements in
O(n log n) time

The resulting algorithm is called heap-sort

Heap-sort is much faster than quadratic ( $O(n^2)$ ) sorting algorithms, such as
insertion-sort and selection-sort
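The O(n log n) bound comes from n insertions followed by n removeMin operations. A minimal sketch of that idea, using std::priority_queue as the heap-based priority queue instead of the class from these slides (pqSort is a made-up name):

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Sorts into ascending order with a min-heap priority queue.
std::vector<int> pqSort(const std::vector<int>& input) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> pq;
    for (int x : input) {
        pq.push(x);                   // n inserts, O(log n) each
    }
    std::vector<int> sorted;
    sorted.reserve(input.size());
    while (!pq.empty()) {
        sorted.push_back(pq.top());   // min() is O(1)
        pq.pop();                     // n removeMin calls, O(log n) each
    }
    return sorted;
}
```

This version uses O(n) extra space for the queue; it is not the in-place variant.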
Heap-Sort (based on our implementation)
We can represent a heap with n keys by means of a vector of length n + 1

For the node at index i: left child is at index 2i and right child is at index 2i + 1

Links between nodes are not explicitly stored

The cell at index 0 is not used

Operation insert corresponds to inserting at index n + 1

Operation removeMin corresponds to removing at index n

Yields in-place heap-sort


Heap-Sort in-place

A heap sort is considered to be an in-place algorithm because no extra
memory is used to perform the sort.

Although space is required for temporary variables, the whole algorithm works by
swapping elements in the array that is provided as input.

The build max operation works in-place (no extra memory allocated)
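A sketch of in-place heap-sort follows. Two choices here are assumptions of this example rather than something the slides specify: it uses the 0-based layout so that the whole input array is the heap, and it builds the max-heap bottom-up by sifting down from the last internal node. The names siftDown and heapSortInPlace are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Restores the max-heap property for the subtree rooted at i,
// looking only at cells a[0..n-1] (0-based layout).
void siftDown(std::vector<int>& a, std::size_t i, std::size_t n) {
    while (2 * i + 1 < n) {                        // while i has a left child
        std::size_t c = 2 * i + 1;
        if (c + 1 < n && a[c + 1] > a[c]) ++c;     // c is the larger child
        if (a[i] >= a[c]) break;
        std::swap(a[i], a[c]);
        i = c;
    }
}

// Sorts `a` into ascending order using only swaps within the array.
void heapSortInPlace(std::vector<int>& a) {
    std::size_t n = a.size();
    // Phase 1: build a max-heap over the whole array.
    for (std::size_t i = n / 2; i-- > 0; ) {
        siftDown(a, i, n);
    }
    // Phase 2: move the current maximum to the end and shrink the heap.
    for (std::size_t end = n; end > 1; --end) {
        std::swap(a[0], a[end - 1]);
        siftDown(a, 0, end - 1);
    }
}
```

A max-heap is used so that each removed maximum can be parked in the cell that the shrinking heap has just vacated, which leaves the array in ascending order.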
Question:
At which nodes of a max-heap can an entry
with the largest key be stored? What about a
min-heap?
Question:
At which nodes of a min-heap can an entry
with the smallest key be stored? What about a
max-heap?
Question
Illustrate the performance of the selection sort
algorithm on the following input sequence: (23, 15,
3, 7, 44, 9, 8, 1, 12)
● The selection sort algorithm sorts an array by repeatedly finding the minimum element
(considering ascending order) in the unsorted part and putting it at the beginning of that part.
The algorithm maintains two subarrays within the given array:
   1) The subarray which is already sorted.
   2) The remaining subarray, which is unsorted.
● In every iteration of selection sort, the minimum element (considering ascending order)
of the unsorted subarray is picked and moved to the end of the sorted subarray.
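The description above can be sketched as follows. To make the quadratic cost visible, this version also returns the number of key comparisons it performed; that return value and the name selectionSort are additions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts `a` into ascending order and returns the number of key comparisons.
// a[0..i-1] is the sorted subarray; a[i..] is the unsorted subarray.
std::size_t selectionSort(std::vector<int>& a) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        std::size_t m = i;                         // index of minimum seen so far
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            ++comparisons;
            if (a[j] < a[m]) m = j;
        }
        std::swap(a[i], a[m]);                     // grow the sorted subarray by one
    }
    return comparisons;
}
```

For an input of n keys the comparison count is always n(n − 1)/2 regardless of the initial order, so the nine-key sequence in the question takes 36 comparisons.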
Question
Illustrate the performance of the insertion sort
algorithm on the following input sequence:
(23, 15, 3, 7, 44, 9, 8, 1, 12)
Question
Illustrate the performance of the heap-sort
algorithm (in-place) on the following input
sequence: (23, 15, 3, 7, 44, 9, 8, 1, 12)
Static Huffman Coding

• Static Huffman coding assigns variable-length codes to symbols based on their
frequency of occurrence in the given message. Low-frequency symbols are
encoded using many bits, and high-frequency symbols are encoded using fewer
bits.

• The message to be transmitted is first analyzed to find the relative frequencies of
its constituent characters.

• The coding process generates a binary tree, the Huffman code tree, with branches
labeled with bits (0 and 1).

• The Huffman tree (or the character-codeword pairs) must be sent with the
compressed information to enable the receiver to decode the message.
Static Huffman Coding Algorithm
Find the frequency of each character in the file to be compressed;

For each distinct character create a one-node binary tree containing the character and its frequency as its
priority;

Insert the one-node binary trees in a priority queue in increasing order of frequency;

while (there is more than one tree in the priority queue) {

dequeue two trees t1 and t2;

Create a tree t that contains t1 as its left subtree and t2 as its right subtree; // 1

priority (t) = priority(t1) + priority(t2);

insert t in its proper location in the priority queue; // 2


}

Assign 0 and 1 weights to the edges of the resulting tree, such that the left and right edge of each node do
not have the same weight; // 3

Note: The Huffman code tree for a particular set of characters is not unique.
(Steps 1, 2, and 3 may be done differently).
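The algorithm above can be sketched with a min-heap priority queue of (weight, node) pairs. This is one possible realization: ties are broken by node index, the first tree dequeued becomes the left subtree, and left edges are labeled 0, so the exact codewords may differ from a hand-built tree even though the codeword lengths agree. HuffNode and huffmanCodes are names invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct HuffNode {
    long weight;   // frequency (leaf) or sum of the children's weights
    char symbol;   // meaningful only for leaves
    int left;      // index of left child in the node table, -1 for a leaf
    int right;     // index of right child in the node table, -1 for a leaf
};

// Returns one valid Huffman codeword per symbol (left edge '0', right edge '1').
std::map<char, std::string> huffmanCodes(const std::map<char, long>& freq) {
    std::vector<HuffNode> nodes;
    typedef std::pair<long, int> Item;             // (weight, node index)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;

    // One single-node tree per distinct character.
    for (std::map<char, long>::const_iterator it = freq.begin(); it != freq.end(); ++it) {
        nodes.push_back(HuffNode{it->second, it->first, -1, -1});
        pq.push(Item(it->second, static_cast<int>(nodes.size()) - 1));
    }

    // Repeatedly merge the two lightest trees.
    while (pq.size() > 1) {
        Item t1 = pq.top(); pq.pop();
        Item t2 = pq.top(); pq.pop();
        nodes.push_back(HuffNode{t1.first + t2.first, '\0', t1.second, t2.second});
        pq.push(Item(t1.first + t2.first, static_cast<int>(nodes.size()) - 1));
    }

    std::map<char, std::string> codes;
    if (pq.empty()) return codes;

    // Walk the tree from the root, extending the path label at each edge.
    std::vector<std::pair<int, std::string> > stack;
    stack.push_back(std::make_pair(pq.top().second, std::string()));
    while (!stack.empty()) {
        std::pair<int, std::string> cur = stack.back();
        stack.pop_back();
        const HuffNode& n = nodes[cur.first];
        if (n.left < 0) {
            // A lone symbol would get an empty path; give it a 1-bit code.
            codes[n.symbol] = cur.second.empty() ? std::string("0") : cur.second;
        } else {
            stack.push_back(std::make_pair(n.left, cur.second + "0"));
            stack.push_back(std::make_pair(n.right, cur.second + "1"));
        }
    }
    return codes;
}
```

Because each merge needs two removeMin operations and one insert, a heap-based priority queue gives O(n log n) time for n distinct characters.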
Static Huffman Coding example

Example: Information to be transmitted over the internet contains
the following characters with their associated frequencies:

Character   a    e    l    n    o    s    t
Frequency   45   65   13   45   18   22   53
Use Huffman technique to answer the following questions:

• Build the Huffman code tree for the message.

• Use the Huffman tree to find the codeword for each character.

• If the data consists of only these characters, what is the total number of bits to be
transmitted? What is the compression ratio?
Static Huffman Coding example (cont’d)

The sequence of zeros and ones on the arcs of the path from the root to each leaf node
gives the desired codes:

Character          a     e    l      n     o      s     t
Huffman codeword   110   10   0110   111   0111   010   00
Static Huffman Coding example (cont’d)

If we assume the message consists of only the characters a, e, l, n, o, s, t, then the
number of bits for the compressed message will be 696:

$45 \cdot 3 + 65 \cdot 2 + 13 \cdot 4 + 45 \cdot 3 + 18 \cdot 4 + 22 \cdot 3 + 53 \cdot 2 = 696$

If the message is sent uncompressed with 8-bit ASCII representation for the
characters, we have 261*8 = 2088 bits.
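These totals can be rechecked mechanically from the frequency and codeword tables of the example. The sketch below hard-codes those tables; the struct and function names are invented for this illustration, and the compression ratio is taken here to mean uncompressed bits divided by compressed bits, which is one common convention and an assumption of this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct CodeEntry {
    char symbol;
    long frequency;
    std::string codeword;
};

// Frequencies and codewords copied from the example tables.
std::vector<CodeEntry> exampleTable() {
    return {
        {'a', 45, "110"}, {'e', 65, "10"},  {'l', 13, "0110"}, {'n', 45, "111"},
        {'o', 18, "0111"}, {'s', 22, "010"}, {'t', 53, "00"}
    };
}

// Total bits when every symbol is sent with its Huffman codeword.
long compressedBits(const std::vector<CodeEntry>& table) {
    long bits = 0;
    for (const CodeEntry& e : table) {
        bits += e.frequency * static_cast<long>(e.codeword.size());
    }
    return bits;
}

// Total bits when every symbol is sent with a fixed number of bits.
long fixedWidthBits(const std::vector<CodeEntry>& table, int bitsPerSymbol) {
    long count = 0;
    for (const CodeEntry& e : table) {
        count += e.frequency;
    }
    return count * bitsPerSymbol;
}
```

Under that definition the ratio for this example is 2088 / 696 = 3, so the Huffman-coded message needs one third of the bits of the 8-bit encoding (not counting the cost of sending the code tree).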
