Comp 2123 Notes
Algorithm Efficiency
• Efficient if it runs in polynomial time
- For n elements, it performs at most p(n) steps for some polynomial p
- However, it is hard to determine exactly what p(n) is
• Big-Oh notation
- T(n) = O(f(n)) if there exist c > 0 and n0 such that T(n) <= c·f(n) for all n > n0
- T(n) = Ω(f(n)) if there exist c > 0 and n0 such that T(n) >= c·f(n) for all n > n0
- T(n) = Θ(f(n)) if T(n) = O(f(n)) and T(n) = Ω(f(n))
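- Worked example: T(n) = 3n^2 + 5n is O(n^2), taking c = 4 and n0 = 5 (since 5n <= n^2 once n >= 5); it is also Ω(n^2) with c = 3, so T(n) = Θ(n^2)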
For a double 'for loop': using the upper bound, the inner loop (j - i iterations) runs at most n times per outer iteration, so the total is O(n^2). Can often be improved by using fewer loops and more +, -, / (see the sketch below).
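A minimal sketch of this loop-reduction idea. The notes don't show the original code, so prefix averages are used here purely as an assumed example:

def prefix_averages_slow(a):
    """O(n^2): the inner loop runs at most n times for every i."""
    n = len(a)
    result = []
    for i in range(n):
        total = 0
        for j in range(i + 1):      # inner loop, at most n iterations
            total += a[j]
        result.append(total / (i + 1))
    return result

def prefix_averages_fast(a):
    """O(n): replace the inner loop with a running sum (just + and /)."""
    result, total = [], 0
    for i, x in enumerate(a):
        total += x
        result.append(total / (i + 1))
    return result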
Week 2
• Stack
- Uses LIFO (last in, first out)
- Supports push, pop, top, size, isEmpty
- All operations run in O(1)
• Queue
- Uses FIFO (first in, first out)
- You can remove the first element and add to the end
- Supports enqueue, dequeue, first, size, isEmpty
- Algorithm tips
• Can change enqueue/dequeue to maintain extra state so the queue also supports sum( ) or max( ) commands (see the sketch below)
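A minimal sketch of the sum( ) variant, assuming a plain Python deque as the underlying queue (the class and method names here are mine, not from the course); max( ) would need a bit more machinery, e.g. a monotonic deque:

from collections import deque

class SumQueue:
    """FIFO queue that also answers sum() in O(1) by keeping a running total."""

    def __init__(self):
        self._items = deque()
        self._total = 0

    def enqueue(self, x):
        self._items.append(x)
        self._total += x          # maintain the aggregate on the way in

    def dequeue(self):
        x = self._items.popleft()
        self._total -= x          # and on the way out
        return x

    def first(self):
        return self._items[0]

    def sum(self):
        return self._total        # O(1)

    def size(self):
        return len(self._items)

    def is_empty(self):
        return len(self._items) == 0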
Week 3
Trees
• Inheritance hierarchy
• Terminology
- Root = the node without a parent (there is only 1; the whole tree consists of the root and its descendants)
- Internal node = node with at least one child
- External/leaf node = no children
- Ancestors of a node = its parent, grandparent, etc., up to the root
- Descendants of a node = its children, grandchildren, etc.
- Siblings = same parent
• Concepts
- Depth of node = number of ancestors (so root is depth 0)
- Level = a given depth of siblings
- Height of tree = maximum depth+1 (generations )
- Subtree = a node together with all of its descendants
- Edge = a direct parent-child link
- Path = a sequence of nodes where consecutive nodes are joined by an edge
- Lowest common ancestor of (x,y) = the deepest node that is an ancestor of both x and y
• Operators
- size( )
- isEmpty( )
- iterator( )
- positions( )
- root( )
- parent( )
- children( )
- numChildren( )
- isInternal( ), isExternal( ), isRoot( )
Traversing
Pre-order traversal visits the root first; post-order traversal visits the root last.
preorder(x):
    visit(x)
    for each child c of x:
        preorder(c)
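A runnable sketch of the two traversals, assuming a simple node type with a list of children:

class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def preorder(node, visit):
    visit(node)                    # root first
    for child in node.children:
        preorder(child, visit)

def postorder(node, visit):
    for child in node.children:
        postorder(child, visit)
    visit(node)                    # root last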
Binary tree
• At most 2 children (right or left child; the left child comes first)
• Proper binary tree = every internal node has exactly 2 children
• leftChild( ), rightChild( ), sibling( )
• Inorder traversal visits the whole left subtree first, then the node itself, then the right subtree (see the sketch below)
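A minimal inorder sketch, assuming binary nodes with left and right attributes:

def inorder(node, visit):
    """Left subtree first, then the node itself, then the right subtree."""
    if node is None:
        return
    inorder(node.left, visit)
    visit(node)
    inorder(node.right, visit)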
Week 4 - Binary search trees
Insert(key, newvalue)
    search(key)
    if key is found:
        key.value = newvalue
    if key is not found:
        the external node reached by the search becomes a new node holding (key, newvalue)

delete(key)
    search(key)
    if key has at least 1 external child:
        delete key and that external child
        set the parent of key's internal child to key's parent
    if key has 2 internal children:
        find the next internal node in inorder (the successor)
        key.value = successor's value
        the successor's left (external) child dies
        the successor's right child takes the successor's place under the successor's parent

Get, put, remove all run in O(height)
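A minimal Python sketch of search and insert (get/put); it uses None in place of the explicit external nodes described above, which is a simplifying assumption:

class BSTNode:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None

def get(node, key):
    """O(height): walk down, comparing keys."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None

def put(node, key, value):
    """Insert or replace; returns the (possibly new) subtree root."""
    if node is None:
        return BSTNode(key, value)          # the empty position becomes a new node
    if key == node.key:
        node.value = value                  # key found: overwrite the value
    elif key < node.key:
        node.left = put(node.left, key, value)
    else:
        node.right = put(node.right, key, value)
    return node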
Range Queries
Priority Queue
• Similar to a map or dictionary
• Operations
- insert(key, value)
- remove_min( ) - removes the entry with the smallest key
- min( )
• Can be sorted or unsorted
- Unsorted has O(1) insertion but O(n) removal
- Sorted has O(n) insertion but O(1) removal
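A minimal sketch of the unsorted variant, with the O(1)/O(n) trade-off visible in the code (the names are mine):

class UnsortedPQ:
    """Unsorted-list priority queue: O(1) insert, O(n) min / remove_min."""

    def __init__(self):
        self._data = []                              # list of (key, value) pairs

    def insert(self, key, value):
        self._data.append((key, value))              # O(1): just append

    def min(self):
        return min(self._data, key=lambda kv: kv[0])  # O(n) scan

    def remove_min(self):
        item = self.min()                            # O(n) scan, then O(n) removal
        self._data.remove(item)
        return item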
Sorting methods
• Priority queue sorting: insert the n elements, then remove the minimum n times; with an unsorted list each remove_min is O(n), so the total is O(n^2)
• Selection sorting
- For each i, find the minimum of the entries ahead of it and swap it into place
• Insertion sorting
- For each i, compare the next entry with the one before it; if it is smaller, swap it backwards repeatedly until it reaches its right place
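Minimal sketches of both sorts, operating in place on a Python list:

def selection_sort(a):
    """For each i, find the minimum of a[i:] and swap it into place."""
    for i in range(len(a)):
        m = min(range(i, len(a)), key=lambda j: a[j])
        a[i], a[m] = a[m], a[i]
    return a

def insertion_sort(a):
    """Walk each new entry backwards until it reaches its right place."""
    for i in range(1, len(a)):
        x, j = a[i], i
        while j > 0 and a[j - 1] > x:
            a[j] = a[j - 1]       # shift larger entries right
            j -= 1
        a[j] = x
    return a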
Heap sort
• Use a heapify function to convert an array into a heap in O(n)
• Then remove_min n times, so heap sort takes O(n log n)
• Implementation of heapify
- Root is index 0
- Last node is index n-1
- For node i
• Left child at 2i+1
• Right child at 2i + 2
• Parent at floor((i-1)/2)
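A minimal min-heap sketch using exactly these index formulas (a simplified version; the course's in-place variant may differ in detail):

def _down_heap(a, i, n):
    """Sift a[i] down within a[:n], using left = 2i+1, right = 2i+2."""
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest

def heapify(a):
    """Bottom-up heap construction in O(n); last internal node is (n-2)//2."""
    n = len(a)
    for i in range((n - 2) // 2, -1, -1):
        _down_heap(a, i, n)

def heap_sort(a):
    """heapify (O(n)) then n remove_min operations (O(log n) each)."""
    heapify(a)
    out, n = [], len(a)
    while n > 0:
        a[0], a[n - 1] = a[n - 1], a[0]   # move the current min to the end
        out.append(a.pop())               # remove it
        n -= 1
        _down_heap(a, 0, n)               # restore the heap property at the root
    return out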
Week 6
Maps
• Key-value
• Operations
- Get
- Put - O(1) if the key is not yet present (just append), O(n) if it has to be found and replaced
- Remove - O(n)
- Size
- Keys, Values
Implementations
• Unsorted list
- Put is O(1) if the key is not yet present, O(n) if it has to be found and replaced
- Remove, get are also O(n)
• Restricted keys
- Put in array of size N, so index = key
- Makes get, put, remove O(1)
- However this is inefficient, e.g. for a 9-digit ID, N = 10^9
- Can't do string keys
• Hash
- Key -> hash function -> h(key) -> index in array
- Used for a fixed array size N
- Usually uses mod N, e.g. h(key) = key mod N
• h(key) involves a hash code (converting the key to an int) then compression (int to index)
• The probability that 2 given keys collide is 1/N, so the expected number of keys colliding with a given key is n/N
• A hash function with this property is called a universal hash function
- Random linear hash function
• h(k) = ((ak + b) mod p) mod N, where p is a prime larger than N, 0 < a < p and 0 <= b < p
• Collision probability is still <= 1/N
- Other options for the hash code include summing all components of the key, or a polynomial in the components (which can be evaluated in O(n) time using Horner's rule)
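A sketch of the two stages, hash code then compression; the polynomial base 33 and the prime 2^31 - 1 are assumptions for illustration, not course-prescribed values:

import random

def polynomial_hash_code(s, a=33):
    """Polynomial hash code for a string, evaluated in O(n) via Horner's rule."""
    h = 0
    for ch in s:
        h = h * a + ord(ch)
    return h

def make_mad_compression(N, p=2_147_483_647):
    """Random linear compression: h(k) = ((a*k + b) mod p) mod N, p prime > N."""
    a = random.randrange(1, p)                  # a != 0
    b = random.randrange(0, p)
    return lambda k: ((a * k + b) % p) % N

# usage sketch
compress = make_mad_compression(N=1024)
index = compress(polynomial_hash_code("comp2123"))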
Collision Handling
• Separate chaining
- Makes each array entry a separate linked list of the records that have the same hash value
- E(collisions) is still n/N = α = loading factor
- This makes put O(1 + α), as you might have to look through α entries in a linked list
- Worst case is O(n) if all hash values collide
• Open addressing and linear probing
- Chuck collisions in the next available cell
- However clusters can form, and you then have to linearly probe a longer run of cells
- Search
• Go to index h(key)
• If the cell there holds the search key, return its value
• Else keep moving forward until the key is found or an empty cell is reached
- Get, insert and delete
• Get skips over 'DEFUNCT' cells
• Remove removes the element but leaves a 'DEFUNCT' marker in its cell
• Put can reuse 'DEFUNCT' or empty cells for the new entry
• Worst case is still O(n)
• Expected number of probes is 1/(1-α), so if α is kept below a constant < 1 the operations run in expected O(1) (see the sketch after this section)
• Cuckoo hashing
- Uses 2 hash arrays
- Get and remove are O(1) as you just check both arrays
- Put: you find the location; if it is occupied, evict the resident and put the resident into the other hash table
- This can create an eviction cycle, which can be resolved by rehashing into larger tables
- Expected time of n put operations is O(n)
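A minimal linear-probing sketch with DEFUNCT markers, using Python's built-in hash( ) as the hash function (a simplification; it does not resize, so it assumes the load factor stays below 1):

DEFUNCT = object()   # tombstone marker left behind by remove

class LinearProbingMap:
    """Open addressing with linear probing and tombstones."""

    def __init__(self, N=16):
        self._table = [None] * N

    def _find(self, key):
        """Return (found, index): index of the key, or of a usable slot."""
        N = len(self._table)
        first_free = None
        i = hash(key) % N
        for _ in range(N):
            cell = self._table[i]
            if cell is None:                      # true end of the probe run
                return False, first_free if first_free is not None else i
            if cell is DEFUNCT:                   # skip, but remember the slot
                if first_free is None:
                    first_free = i
            elif cell[0] == key:
                return True, i
            i = (i + 1) % N
        return False, first_free

    def get(self, key):
        found, i = self._find(key)
        return self._table[i][1] if found else None

    def put(self, key, value):
        _, i = self._find(key)
        self._table[i] = (key, value)             # reuses DEFUNCT/empty or overwrites

    def remove(self, key):
        found, i = self._find(key)
        if found:
            self._table[i] = DEFUNCT              # leave a tombstone, don't break runs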
Graphs
• V is a set of nodes
• E is a set of edges
- Can be directed (from A to B)
- Or undirected (2 way)
Terminology
• Undirected
- Edges connect endpoints
- Edges with same endpoint are incident
- Connected vertices are adjacent
- Degree = the number of edges incident on a vertex
- Parallel edges share the same 2 endpoints
- A self-loop has both endpoints at the same vertex
- Simple graphs have no parallel edges or self-loops
• Directed
- Head and tail
- Out-degree and in-degree = the number of outgoing and incoming edges
- Parallel edges have the same head and tail
- Simple digraphs have no self-loops or parallel edges; anti-parallel edges are 2 edges with head and tail reversed
• Path
- A path is a sequence of vertices joined by consecutive edges
- A simple path has only distinct nodes (no repeat visits)
- A cycle is a path that starts and ends at the same vertex
• a simple cycle is the same logic (no other repeated vertices)
• An acyclic graph has no cycles
• Subgraphs
- A subgraph of a graph uses a subset of its vertices and edges
- Can be G' = (vertex subset, all edges between those vertices)
- Or G' = (vertices used by an edge subset, edge subset)
• Connectivity
- A graph is connected if there is a path between every pair of vertices
- Can have connected subgraphs
• Trees and forests
- A tree is a graph that is connected but acyclic
- A forest is a collection of disjoint trees (its connected components are trees)
- A tree with n vertices must have n-1 edges
- A spanning tree is a subgraph that is a tree and contains every vertex of the graph
- Can also have spanning forests
Properties
• Letting m = number of edges, n = number of vertices
• Sum of all vertex degrees = 2m (each edge is counted at both endpoints)
• In a simple undirected graph: m <= n(n-1)/2
• In a simple directed graph: m <= n(n-1)
Data Structure
• getAllVertices( )
• getAllEdges( )
• getEdge(nodeA, nodeB)
• getVertices(edge)
• outDegree( ) and inDegree( ), or degree( )
• outgoingEdges(node)
• insert and delete for both vertices and edges
EdgeList structure: each edge just stores the 2 vertices it connects
Adjacency list: in addition, each vertex stores a list of the edges incident on it
- The graph can also be stored as an adjacency matrix
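A minimal adjacency-list sketch for a simple undirected graph (neighbours are stored directly rather than edge objects, which is a simplification):

class Graph:
    """Undirected adjacency-list graph: each vertex stores its neighbours."""

    def __init__(self):
        self._adj = {}                    # vertex -> set of adjacent vertices

    def insert_vertex(self, v):
        self._adj.setdefault(v, set())

    def insert_edge(self, u, v):
        self.insert_vertex(u)
        self.insert_vertex(v)
        self._adj[u].add(v)
        self._adj[v].add(u)

    def degree(self, v):
        return len(self._adj[v])

    def incident(self, v):
        return iter(self._adj[v])

    def vertices(self):
        return iter(self._adj)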
Traversal techniques
Depth first
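A minimal recursive DFS sketch over an adjacency map; is_connected is the kind of check the cut-edge test below relies on:

def dfs(adj, start, visited=None):
    """Depth-first search over an adjacency map {vertex: iterable of neighbours}."""
    if visited is None:
        visited = set()
    visited.add(start)
    for w in adj[start]:
        if w not in visited:
            dfs(adj, w, visited)          # go as deep as possible before backtracking
    return visited

def is_connected(adj):
    """A graph is connected iff DFS from any vertex reaches all vertices."""
    if not adj:
        return True
    start = next(iter(adj))
    return len(dfs(adj, start)) == len(adj)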
Cut edges
• A cut edge is an edge whose removal disconnects the graph
• Essentially use DFS: remove an edge and check whether the graph is still connected
Weighted graphs
• Each edge has an associated numerical value (weight) representing something, e.g. distance or cost
• The priority-queue-based algorithm (Dijkstra's shortest paths) can use a Fibonacci heap, which is O(1) for decreaseKey, so it runs in O(m + n log n)
• May not work for negative edge weights
Prim's algorithm
Essentially grows the tree from a start vertex: at each step it adds the cheapest edge in the cutset between the tree so far and the remaining vertices (see the sketch below)
Based on the assumption that all edge weights are distinct; ties can be broken by adding i/n^2 to the weight of edge i
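A sketch of the "lazy" version of Prim's algorithm using Python's heapq (a binary heap, so O(m log n); the O(m + n log n) bound above needs a Fibonacci heap's O(1) decreaseKey):

import heapq

def prim_mst(adj, start):
    """Lazy Prim: adj maps vertex -> {neighbour: weight}; returns MST edges."""
    in_tree = {start}
    edges = [(w, start, v) for v, w in adj[start].items()]
    heapq.heapify(edges)
    mst = []
    while edges and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(edges)     # cheapest edge leaving the tree (maybe stale)
        if v in in_tree:
            continue                       # both endpoints already in the tree: skip
        in_tree.add(v)
        mst.append((u, v, w))
        for x, wx in adj[v].items():
            if x not in in_tree:
                heapq.heappush(edges, (wx, v, x))
    return mst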
Greedy Algorithms
• You build a solution by continually taking locally optimal choices
• May not be best solution
Fractional knapsack
• Choosing the optimal combination of items, each with a benefit and a weight (fractions of an item are allowed)
• The total weight taken is however bounded by W
• So the x_i (amount of item i taken) must satisfy: sum of all x_i <= W
- And 0 <= x_i <= w_i -> this cap stops taking more of an item than exists
- The goal is to maximise the total benefit, sum of b_i · (x_i / w_i)
O(n log n) to sort items by benefit per unit weight
O(n) to go through them greedily, so O(n log n) overall (see the sketch below)
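A minimal greedy sketch; items are (benefit, weight) pairs and W is the capacity:

def fractional_knapsack(items, W):
    """Take fractions of items greedily by benefit per unit weight."""
    # O(n log n) to sort by value density, O(n) to sweep through
    items = sorted(items, key=lambda bw: bw[0] / bw[1], reverse=True)
    total_benefit, remaining = 0.0, W
    amounts = []
    for b, w in items:
        x = min(w, remaining)              # 0 <= x_i <= w_i and sum of x_i <= W
        amounts.append(x)
        total_benefit += b * (x / w)
        remaining -= x
        if remaining == 0:
            break
    return total_benefit, amounts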
Interval partitioning
• Like classroom scheduling: assign lectures (intervals) to the smallest number of rooms so that no two overlapping lectures share a room
• The answer equals the maximum depth: the largest number of intervals that overlap at any single point in time
• O(n log n) with a greedy sweep over the intervals sorted by start time (see the sketch below)
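A minimal sketch that computes the maximum depth (= minimum number of rooms) with a heap of finish times; the interval format (start, finish) is an assumption:

import heapq

def min_classrooms(intervals):
    """intervals: list of (start, finish); returns the minimum number of rooms."""
    ends = []                                   # heap of finish times, one per room in use
    for start, finish in sorted(intervals):
        if ends and ends[0] <= start:
            heapq.heapreplace(ends, finish)     # reuse the room that frees up earliest
        else:
            heapq.heappush(ends, finish)        # need a new room
    return len(ends)                            # equals the maximum depth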
Merge-sort
• Halve the array, sort each half recursively, then recombine
Merge: repeatedly takes the smaller of the two front elements of the sorted halves and appends it to the output (see the sketch below)
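A minimal merge-sort sketch (not in place, for clarity):

def merge_sort(a):
    """Halve, sort each half recursively, then merge. O(n log n)."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

def merge(left, right):
    """Repeatedly take the smaller front element of the two sorted halves."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out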
Quick sort
• Pick a pivot, partition around it and recurse on each side; expected O(n log n) with a random pivot (worst case O(n^2))
Divide and conquer 2
Maxima-set
The master theorem can be used to get the time complexity of divide-and-conquer recurrences T(n) = a·T(n/b) + f(n); the result depends on how log_b(a) compares with the power of n in f(n)
Finding the kth smallest integer in an unsorted n-list
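The notes only name this problem; quick-select is the usual randomised divide-and-conquer answer, sketched here (expected O(n)):

import random

def quick_select(a, k):
    """k-th smallest of an unsorted list (k = 1 for the minimum); assumes 1 <= k <= len(a)."""
    pivot = random.choice(a)
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    if k <= len(less):
        return quick_select(less, k)           # answer lies among the smaller items
    if k <= len(less) + len(equal):
        return pivot                           # the pivot itself is the k-th smallest
    return quick_select(greater, k - len(less) - len(equal))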
Randomisation
Randomized algorithms are algorithms where the behaviour doesn’t depend solely on the input. It
also depends (in part) on random choices or the values of a number of random bits.
Skiplists
• Like a map
• Uses get, put, remove, etc.