These notes summarize key concepts about hashing, heaps, and heapsort. They discuss how hash functions map keys to table slots, the load factor α, and collision resolution techniques such as double hashing. They then cover heap data structures, including their representation as binary trees and arrays. Max-heaps store the maximum value at the root. Heapsort uses a max-heap to sort in place in O(n log n) time by repeatedly removing the maximum element. Building a max-heap from an array takes O(n) time using the MaxHeapify procedure.

Thursday, August 8, 199

Lecture 2
- hash function with chaining: the expected number of collisions per slot equals the average chain length, i.e. the number of keys divided by the number of slots in the table

- Θ(1 + α) expected search time with chaining, where α is the load factor


- "if and only if" trap: you could still have constant-time search if you only have 1 or 2 checks to do before finding the element, even supposing you have many data entries

- why must the step in double hashing be relatively prime to m? to guarantee that the probe sequence is a full permutation of the m slots
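As a quick sketch of why this matters (the hash functions and key below are hypothetical choices, not from the lecture): with h(k, i) = (h1(k) + i·h2(k)) mod m, a step h2(k) coprime to m visits every slot exactly once.

```python
from math import gcd

def probe_sequence(key, m, h1, h2):
    """Yield the double-hashing probe sequence h(k, i) = (h1(k) + i*h2(k)) mod m."""
    start, step = h1(key) % m, h2(key) % m
    for i in range(m):
        yield (start + i * step) % m

# Hypothetical hash functions for illustration.
m = 11                               # prime table size
h1 = lambda k: k % m                 # primary hash
h2 = lambda k: 1 + (k % (m - 1))     # secondary hash, never 0

seq = list(probe_sequence(37, m, h1, h2))
assert gcd(h2(37), m) == 1           # the step is coprime to m...
assert sorted(seq) == list(range(m)) # ...so the sequence hits every slot once
```

With m prime, any nonzero step is automatically coprime to m, which is one reason prime table sizes are convenient here.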

- when doing open addressing, search must use the same probe sequence as insertion, i.e. visit the slots in the same order used when putting elements into the table

- linear probing tends to create clusters (primary clustering); quadratic probing reduces this, though it suffers from its own secondary clustering


- best one? double hashing
- what is the meaning of alpha in the two theorems? the load factor.
- when you have 50 keys and 100 slots, you get α = 0.5, so yes, alpha is the load factor. At α = 0.5, about two probes are enough to see whether your element is present.

- expected number of probes: 1/(1-α) for an unsuccessful search, (1/α) ln(1/(1-α)) for a successful search
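These two bounds are easy to evaluate numerically; a small sketch (the function names are mine, not the lecture's):

```python
import math

def expected_probes_unsuccessful(alpha):
    """Expected probes for an unsuccessful search: 1 / (1 - alpha)."""
    return 1 / (1 - alpha)

def expected_probes_successful(alpha):
    """Expected probes for a successful search: (1/alpha) * ln(1 / (1 - alpha))."""
    return (1 / alpha) * math.log(1 / (1 - alpha))

# The alpha = 0.5 example from above: 50 keys in 100 slots.
print(expected_probes_unsuccessful(0.5))  # 2.0 -- the "two tries" from the notes
print(expected_probes_successful(0.5))    # ~1.386
```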


- the equations above are theoretical constructs: they assume uniform hashing
- universal hashing works by choosing the hash function at random from a universal class of functions, so that no fixed set of keys produces more collisions than the average case; the collision probability for any pair of keys is at most 1/m

- universal set of hash functions H: for any given pair of distinct keys, the number of hash functions in H mapping them to the same slot is less than or equal to the total number of hash functions divided by the range (the number of slots), i.e. |H|/m. Equivalently, for a hash function chosen at random from H, the probability of a collision between two distinct keys is ≤ 1/m.

- example of universal hashing: suppose we have a table with m slots, m being a prime number. Take a key x and decompose it into a vector ⟨x0, x1, ..., xr⟩ of length r+1. Pick a random vector a = ⟨a0, a1, ..., ar⟩ with the same number of elements, each ai drawn from {0, ..., m-1}. Dot-product the two vectors and take the result mod m.

- H = ∪_a {h_a} with h_a(x) = (Σ_{i=0}^{r} a_i · x_i) mod m
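A sketch of this dot-product family in Python (the base-m decomposition helper and the names are my own framing of the formula above):

```python
import random

def decompose(x, m, r):
    """Decompose key x into r+1 base-m digits <x0, ..., xr> (hypothetical helper)."""
    digits = []
    for _ in range(r + 1):
        digits.append(x % m)
        x //= m
    return digits

def make_hash(m, r):
    """Draw one member h_a of the family H by picking the random vector a."""
    a = [random.randrange(m) for _ in range(r + 1)]
    def h_a(x):
        xs = decompose(x, m, r)
        return sum(ai * xi for ai, xi in zip(a, xs)) % m  # dot product mod m
    return h_a

m, r = 11, 2              # m prime; handles keys up to m**(r+1) - 1 = 1330
h = make_hash(m, r)
assert 0 <= h(1234) < m   # hash values land inside the table
```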


- perfect hashing: not needed for this course



Second part: Heap and Heapsort

- a heap is a tree-based data structure (a binary tree, but we can also use k-ary trees)
- for this class, we are going to restrict ourselves to binary trees
- max-heap: the root has the biggest key. In this class we are going to stick to max-heaps.
- the value of the parent is greater than or equal to the value of each child
- the tree is complete: before going lower in a heap, we must fill the current row
- if implemented as an array, we have: the root is A[1], left(i) is A[2i], right(i) is A[2i + 1], parent(i) is A[⌊i/2⌋]

- there are no gaps in the array implementation of a heap
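The index arithmetic can be checked directly; a small sketch using 1-based indexing with A[0] left unused (the sample array is just an illustration):

```python
# 1-based heap index arithmetic; A[0] is unused padding.
def parent(i): return i // 2   # floor division implements A[floor(i/2)]
def left(i):   return 2 * i
def right(i):  return 2 * i + 1

A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]   # a sample max-heap, no gaps
assert A[left(1)] == 14 and A[right(1)] == 10 # children of the root
assert parent(10) == 5 and A[parent(10)] == 7 # last element's parent
```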


- height of a node: number of edges on the longest path from the node down to a leaf
- height of a heap: height of the root = O(log n)
- most operations on a heap run in O(log n)
- to sort with a heap, first you've got to build a heap out of the array. Then you know the maximum is at the root, at the top of the tree. Swap the first and last elements of the array; you now need to place the remaining n-1 elements correctly. Make the remainder a heap again, then move its max to the end, and repeat until no elements are left.

- heapsort gets the best of insertion sort and merge sort: you can guarantee a running time of O(n log n), and you can also do the sorting in place! Algorithm design technique: you create a data structure (a heap) during the execution of the algorithm.

- The heap is useful for building priority queues.


- to implement heapsort we need two functions. The first is basically MaxHeapify: in essence, compare the node with its two children and swap it with the largest of the three. After the swap, that node satisfies the heap property, but the subtree it was swapped into is potentially broken, so recurse at the swapped position. The i is the position of the node in the whole array. We also need something to deal with the case where there is nothing to heapify (the child indices fall outside the heap).
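A minimal sketch of MaxHeapify as described (1-based indices with A[0] unused; the sample array is an assumption for illustration):

```python
def max_heapify(A, i, heap_size):
    """Float A[i] down until the subtree rooted at i is a max-heap.
    Assumes the subtrees rooted at the children of i are already max-heaps.
    1-based indices; A[0] is unused padding."""
    l, r = 2 * i, 2 * i + 1
    largest = i
    if l <= heap_size and A[l] > A[largest]:   # bounds check: nothing to
        largest = l                            # heapify past heap_size
    if r <= heap_size and A[r] > A[largest]:
        largest = r
    if largest != i:                           # heap property violated at i
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)     # recurse where we disturbed

A = [None, 4, 14, 10, 8, 7, 9, 3, 2, 1]       # only the root is out of place
max_heapify(A, 1, 9)
assert A[1:] == [14, 8, 10, 4, 7, 9, 3, 2, 1]
```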

- time to fix node i relative to its children: O(1); time to fix the tree rooted at a subtree: T(size of that subtree)

- running time of this? every time we call MaxHeapify, we go one level down.
- T(n) = T(size of the largest subtree) + O(1)


- the maximum number of times this recursion runs corresponds to the height of your heap, and the height is log n.
- the worst case is when the largest subtree has about 2n/3 of the nodes, i.e. when the last row of the tree is exactly half full; the recurrence T(n) ≤ T(2n/3) + O(1) still solves to O(log n).

- to build a heap, we've got to call MaxHeapify in a bottom-up manner.


- by construction, about half of your nodes are leaves, and leaves are trivially heaps themselves. So you can start operating at the last internal node of your tree, index ⌊n/2⌋.

- for 10 elements, we start at ⌊10/2⌋ = 5: MaxHeapify(5), then (4), (3), (2), (1).
- correctness of BuildMaxHeap: loop invariant — at the start of each iteration, every node numbered above i (i.e. i+1, i+2, ..., n) is the root of a max-heap. Before the first iteration, i = ⌊n/2⌋, and nodes ⌊n/2⌋ + 1, ⌊n/2⌋ + 2, ..., n are all leaves, hence trivial max-heaps. At each iteration, by the loop invariant the children of i are roots of max-heaps, so MaxHeapify(i) makes i the root of a max-heap while keeping nodes i+1, ..., n roots of max-heaps. Decrementing i then re-establishes the loop invariant for the next iteration.

- a first bound on the running time of BuildMaxHeap: the cost of MaxHeapify [O(log n)] times the number of calls to it [O(n)], i.e. O(n log n)

- with a tighter bound, we can show that BuildMaxHeap is actually linear: O(n).
- RECAP: build a max-heap from the array; swap the first and last elements; remove the last element from the heap size; call MaxHeapify on the new root (to give the heap back its good shape); repeat until the heap size is equal to 1.
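The recap can be sketched end to end in Python (1-based indices with A[0] as padding; a self-contained sketch, not the course's exact pseudocode):

```python
def max_heapify(A, i, n):
    # Float A[i] down within the heap A[1..n] (1-based; A[0] unused).
    l, r, largest = 2 * i, 2 * i + 1, i
    if l <= n and A[l] > A[largest]:
        largest = l
    if r <= n and A[r] > A[largest]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, n)

def heapsort(A):
    """Sort A[1..n] in place, following the recap above."""
    n = len(A) - 1
    for i in range(n // 2, 0, -1):      # BuildMaxHeap: bottom-up, O(n)
        max_heapify(A, i, n)
    for size in range(n, 1, -1):        # repeat until heap size is 1
        A[1], A[size] = A[size], A[1]   # swap first and last
        max_heapify(A, 1, size - 1)     # restore heap shape on shrunk heap

A = [None, 5, 13, 2, 25, 7, 17, 20, 8, 4]
heapsort(A)
assert A[1:] == [2, 4, 5, 7, 8, 13, 17, 20, 25]
```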

- are BuildMaxHeap and MaxHeapify the same thing? NO


- heapsort cost: BuildMaxHeap is O(n); the for loop runs n-1 times, with an O(1) swap and an O(log n) MaxHeapify per iteration: O(n log n) overall.
- in place and guaranteed O(n log n), but in practice quicksort runs a bit faster.
