Open addressing: at most 1/(1-α) probes for an unsuccessful search, (1/α) ln(1/(1-α)) for a successful search.
Lecture 2
- hash function with chaining -> the expected length of a list (and hence the expected number of collisions) equals the load factor α = n/m, i.e. the number of keys divided by the number of slots in the table (e.g. 100 keys in 25 slots gives α = 4).
- double hashing: why must the second hash value be relatively prime to m? To guarantee that the probe sequence is a full permutation of the m slots (see the sketch below).
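A minimal sketch of a double-hashing probe sequence, assuming m is prime so that the step size is always relatively prime to m (the function name and the particular choice of h1 and h2 are illustrative, not from the lecture):

```python
def double_hash_probe(key, m):
    """Yield the probe sequence h(key, i) = (h1 + i*h2) mod m for i = 0..m-1.

    Assumes m is prime, so any step h2 in 1..m-1 is relatively prime to m
    and the sequence visits every slot exactly once.
    """
    h1 = key % m                  # primary hash: starting slot
    h2 = 1 + (key % (m - 1))      # secondary hash: step size, never 0
    for i in range(m):
        yield (h1 + i * h2) % m

# Example: with m = 7, the probe sequence for key 10 visits all 7 slots.
print(list(double_hash_probe(10, 7)))   # [3, 1, 6, 4, 2, 0, 5]
```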
- when doing open addressing, insertion and search must use the same probe sequence, i.e. visit the slots of the table in the same order every time.
- universal set of hash functions H: for any pair of distinct keys, the number of hash functions in H that map them to the same slot is at most the total number of hash functions divided by the range m (the number of slots). Equivalently, for a hash function chosen uniformly at random from H, the probability of a collision between two distinct keys is <= 1/m.
- example of universal hashing (dot-product family): suppose we have a table with m slots, m being a prime number. Take a key x and decompose it into a vector of base-m digits of length r+1; pick a random vector a with r+1 entries in {0, ..., m-1}; the hash is the dot product of these two vectors mod m (sketch below).
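A rough sketch of this dot-product family, assuming m is prime and keys fit in r+1 base-m digits (the helper name is invented for illustration):

```python
import random

def make_dot_product_hash(m, r):
    """Pick a random vector a = (a_0, ..., a_r) with entries in {0, ..., m-1}.

    The returned function hashes a key x by writing x in base m as
    (x_0, ..., x_r) and returning (sum of a_i * x_i) mod m.  Choosing a at
    random from this family gives Pr[h(x) = h(y)] <= 1/m for x != y,
    provided m is prime.
    """
    a = [random.randrange(m) for _ in range(r + 1)]

    def h(x):
        total = 0
        for a_i in a:
            total += a_i * (x % m)   # current base-m digit of x
            x //= m
        return total % m

    return h

# Example: table with m = 101 slots, keys up to 101**3 - 1 (so r = 2).
h = make_dot_product_hash(101, 2)
print(h(123456))
```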
- Heap is a tree-based data structure (binary tree, but we can also use k-ary trees)
- for this class, we are going to restrict ourselves only to binary trees
- max-heap: the root has the biggest key. In this class we are going to stick to the max-heap.
- max-heap property: the value of the parent is greater than or equal to the value of each of its children.
- before going one level lower in a heap, we must completely fill the current row (the last level is filled from left to right).
- if implemented as an array, we have: the root is A[1], left(i) is A[2i], right(i) is A[2i + 1], and parent(i) is A[floor(i/2)] (index helpers sketched below).
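The 1-based index arithmetic above, written out as a small sketch (with 0-based arrays the formulas shift to 2i+1, 2i+2, and (i-1)//2):

```python
def parent(i): return i // 2     # A[1] is the root, so parent(1) = 0 means "no parent"
def left(i):   return 2 * i
def right(i):  return 2 * i + 1
```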
- Heapsort combines the best of merge sort and insertion sort: you can guarantee a running time of O(n log n), and you can also do the sorting in place! Algorithm design technique: you create a data structure (a heap) during the execution of the algorithm.
- time to fix node i relative to its children: O(1). Time to fix the subtree rooted at a child is T(size of that subtree).
- running time of MaxHeapify: every recursive call goes one level down, so T(n) = T(size of the largest child subtree) + O(1).
- the maximum number of recursive calls corresponds to the height of the heap, which is log n.
- the largest child subtree has size at most 2n/3; this worst case occurs when the last row of the tree is exactly half full. The recurrence T(n) <= T(2n/3) + O(1) then gives O(log n) for MaxHeapify (sketch below).
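A sketch of MaxHeapify on a 1-based array, following the recurrence above (the heap_size parameter and function name are assumptions for illustration):

```python
def max_heapify(A, i, heap_size):
    """Fix the max-heap property at node i, assuming the subtrees rooted at
    left(i) and right(i) are already max-heaps.  A is 1-based (A[0] unused)."""
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and A[l] > A[largest]:
        largest = l
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]   # swap i with its largest child
        max_heapify(A, largest, heap_size)    # and recurse one level down
```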
- for 10 elements, BuildMaxHeap starts at floor(10/2) = 5: MaxHeapify(5), (4), (3), (2), (1).
- Correctness of BuildMaxHeap: loop invariant - at the start of each iteration, every node i+1, i+2, ..., n is the root of a max-heap. Before the first iteration, i = floor(n/2), and nodes floor(n/2) + 1, ..., n are leaves, hence trivial max-heaps. By the loop invariant, the subtrees rooted at the children of i are max-heaps (the children have indices larger than i). MaxHeapify makes i the root of a max-heap while keeping nodes i+1, ..., n roots of max-heaps. Decrementing i re-establishes the loop invariant for the next iteration.
- simple bound on the running time of BuildMaxHeap: each call to MaxHeapify costs O(log n), and there are O(n) calls, giving O(n log n).
- With a tighter bound, we can show that BuildMaxHeap is actually linear, O(n): most nodes sit near the bottom of the heap, where MaxHeapify is cheap (a node at height h costs O(h), and summing over all heights gives O(n)). A sketch follows.
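A sketch of BuildMaxHeap matching the example above; it reuses the hypothetical max_heapify from the earlier sketch:

```python
def build_max_heap(A, n):
    """Turn A[1..n] into a max-heap.  Nodes n//2 + 1 .. n are leaves
    (trivial max-heaps), so we start at n//2 and work back to the root."""
    for i in range(n // 2, 0, -1):
        max_heapify(A, i, n)
```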
- RECAP (heapsort): build a max-heap from the array, swap the first and last elements, remove the last element from the heap size, call MaxHeapify on the new root (to give it back its good shape), and repeat until the heap size is equal to 1 (sketch below).
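Putting the recap together, a sketch of heapsort under the same assumptions (1-based array, the max_heapify and build_max_heap sketches above):

```python
def heapsort(A, n):
    """Sort A[1..n] in place in O(n log n)."""
    build_max_heap(A, n)
    for heap_size in range(n, 1, -1):
        A[1], A[heap_size] = A[heap_size], A[1]   # move current max to the end
        max_heapify(A, 1, heap_size - 1)          # restore the heap on the rest

# Example usage (A[0] is a dummy slot because the heap is 1-based):
A = [None, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(A, len(A) - 1)
print(A[1:])   # [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```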