CS 97SI: INTRODUCTION TO PROGRAMMING CONTESTS
Jaehyun Park
Todays Lecture: Graph Algorithms
What are graphs? Adjacency Matrix and Adjacency List Special Graphs Depth-First Search and Breadth-First Search Topological Sort Eulerian Circuit Minimum Spanning Tree (MST) Strongly Connected Components (SCC)
What are graphs?
An abstract way of representing connectivity using nodes (or vertices) and edges We will label the nodes from 1 to edges connect some pairs of nodes
Edges
can be either one-directional (directed) or bidirectional
Nodes and edges can have some auxiliary information
Figure from Wikipedia
Why study graphs?
Lots of problems formulated and solved in terms of graphs
Shortest
path problems Network flow problems Matching problems 2-SAT problem Graph coloring problem Traveling Salesman Problem (TSP): still unsolved! and many more
Storing Graphs
We need to store both the set of nodes and the set of edges
Nodes
can be stored in an array Edges must be stored in some other way
We want to support the following operations
Retrieving
all edges incident to a particular node Testing if given two nodes are directly connected
Use either adjacency matrix or adjacency list to store the edges
Adjacency Matrix
An easy way to store connectivity information
Checking
if two nodes are directly connected: 1
time
Make an matrix
= 1 if there is an edge from to = 0 otherwise
Uses 2 memory
use when is less than a few thousands, AND when the graph is dense
Only
Adjacency List
Each node has its own list of edges
The
lists have variable lengths Space usage: ( + )
1
3 From 1 2 3 5 2 4 4 5 2 3 2 2 5 To 3 5 5
Implementing Adjacency List
Solution 1. Using linked lists
Too
much memory/time overhead Using dynamic allocated memory or pointers is bad
Solution 2. Using an array of vectors
Easier
to code, no bad memory issues But very slow
Solution 3. Using arrays (!)
Assuming
the total number of edges is known Very fast and memory-efficient
Implementation Using Arrays
ID 1 2 1 To 2 3 3 5 3 2 2 5 1 4 2 8 Next Edge ID 1 3 5 2 3 6 4 7 5 -
3
3 1 2 6 5 4 From Last Edge ID
3 4 5 6 7 8
2 7
Implementation Using Arrays
Have two arrays E of size and LE of size
contains the edges LE contains the starting pointers of the edge lists
E LE[i] = 0
Initialize LE[i] = -1 for all i
is also fine if the arrays are 1-indexed
Inserting a new edge from u to v with ID k
E[k].to = v E[k].nextID = LE[u] LE[u] = k
Implementation Using Arrays
Iterating over all edges starting at u:
for(ID = LE[u]; ID != -1; ID = E[ID].nextID) // E[ID] is an edge starting from u
Its pretty hard to modify the edge lists
The
graph better be static!
Special Graphs
Tree: a connected acyclic graph
The
most important type of graph in CS Alternate definitions (all are equivalent!)
connected graph with 1 edges An acyclic graph with 1 edges There is exactly one path between every pair of nodes An acyclic graph but adding any edge results in a cycle A connected graph but removing any edge disconnects it
A
Special Graphs
Directed Acyclic Graph (DAG): the name says what it is
Equivalent
to a partial ordering of nodes
Bipartite Graph
Nodes
can be separated into two groups and such that edges exist between and only (no edges within or within )
Graph Traversal
The most basic graph algorithm that visits nodes of a graph in certain order Used as a subroutine in many other algorithms
We will cover two algorithms
Depth-First
Search (DFS): uses recursion (stack) Breadth-First Search (BFS): uses queue
Depth-First Search Pseudocode
DFS(): visits all the nodes reachable from in depth-first order
as visited For each edge :
Mark
If
is not visited, call DFS()
Use non-recursive version if recursion depth is too big (over a few thousands)
Replace
recursive calls with a stack
Breadth-First Search Pseudocode
BFS(): visits all the nodes reachable from in breadth-first order
a queue Mark as visited and push it to While is not empty:
Initialize
the front element of and call it For each edge :
Take
If is not visited, mark it as visited and push it to
Topological Sort
Input: a DAG = , Output: an ordering of nodes such that for each edge , comes before There can be many answers
e.g.
{6, 1, 3, 2, 7, 4, 5, 8} and {1, 6, 2, 3, 4, 5, 7, 8} are valid orderings for the graph on the right
6
2 1 4 3
Topological Sort
Any node without an incoming edge can be the first element After deciding the first node, remove outgoing edges from it Repeat!
Time complexity: 2 +
Ugh,
too slow
Topological Sort (faster version)
Precompute the number of incoming edges deg for each node Put all nodes with zero deg into a queue Repeat until becomes empty:
from For each edge
Take
deg (essentially removing the edge ) If deg becomes zero, push to
Decrement
Time complexity: +
Eulerian Circuit
Given an undirected graph We want to find a sequence of nodes that visits every edge exactly once and comes back to the starting point Eulerian circuits exist if and only if
is connected and each node has an even degree
Constructive Proof of Existence
Pick any node in and walk randomly(!) without using the same edge more than once Each node is of even degree, so when you enter a node, there will be an unused edge you exit through
Except
at the starting point, at which you can get stuck
When you get stuck, what you have is a cycle
Remove
the cycle and repeat the process in each connected component Glue the cycles together to finish!
Related Problems
Eulerian path: exists if and only if the graph is connected and the number of nodes with odd degree is 0 or 2.
Hamiltonian path/cycle: a path/cycle that visits every node in the graph exactly once. Looks similar but still unsolved!
Minimum Spanning Tree (MST)
Given an undirected weighted graph = , Want to find a subset of with the minimum total weight that connects all the nodes into a tree
We will cover two algorithms:
Kruskals
algorithm Prims algorithm
Kruskals Algorithm
Main idea: the edge with the smallest weight has to be in the MST
Simple
proof:
not. Take the MST that doesnt contain . to , which results in a cycle. Remove the edge with the highest weight from the cycle.
Assume Add
The removed edge cannot be since it has the smallest weight.
we have a better spanning tree than Contradiction!
Now
Kruskals Algorithm
Another main idea: after an edge is chosen, the two nodes at the ends can be merged and considered as a single node (supernode) Pseudocode:
Sort
the edges in increasing order of weight Repeat until there is one supernode left:
Take If
the minimum weight edge connects two different supernodes:
Connect them and merge the supernodes (use union-find)
Otherwise,
ignore and go back
Prims Algorithm
Main idea:
a set that starts out with a single node Find the smallest weighted edge = , that connects and Add to the MST, add to Repeat until =
Maintain
Differs from Kruskals in that we grow a single supernode instead of growing multiple ones here and there
Prims Algorithm Pseudocode
Initialize to , to cost , for every
If
there is no edge between and , cost , =
Repeat until = :
Find
Add
with smallest
a priority queue or a simple linear search
Use
to , add to the total weight of the MST For each edge , :
Update
to min , cost ,
Can be modified to compute the actual MST along with the total weight
Kruskals vs Prims
Kruskals Algorithm
log time Pretty easy to code Generally slower than Prims
Takes
Prims Algorithm
Time A
complexity depends on the implementation:
be 2 + , log , log
Can
bit trickier to code Generally faster than Kruskals
Strongly Connected Components (SCC)
Given a directed graph = , A graph is strongly connected if all nodes are reachable from every single node in Strongly connected components of are maximal strongly connected subgraphs of
The
graph on the right has 3 SCCs: {a, b, e}, {c, d, h}, {f, g}
Figure from Wikipedia
Kosarajus Algorithm
Initialize counter = 0 While not all nodes are labeled:
Choose an arbitrary unlabeled node Start DFS from
Check the current node as visited Recurse on all unvisited neighbors After the DFS calls are finished, increment and set s label to
Reverse the direction of all the edges For node with label 1
Find all reachable nodes from and group them as an SCC
Kosarajus Algorithm
We wont prove why this works Two graph traversals are performed
Running
time: +
Other SCC algorithms exist but this one is particularly easy to code
and
asymptotically optimal