9 - Introduction Graph Theory
Topics
Introduction to biology (cell, DNA, RNA, genes, proteins)
Sequencing and genomics (sequencing technology, sequence alignment algorithms)
Functional genomics and microarray analysis (array technology, statistics, clustering and classification)
Introduction to biological networks
Introduction to graph theory
Network properties
Network/node centralities
Network motifs
Network models
Network/node clustering
Network comparison/alignment
Software tools for network analysis
Interplay between topology and biology
Chapter 3
Graphs
A = B    equal sets
A ∪ B    union of sets A and B
A ∩ B    intersection of sets A and B
a ∈ A    a is an element of set A
V = { 1, 2, 3, 4, 5, 6, 7, 8 }
E = { {1,2}, {1,3}, {2,3}, {2,4}, {2,5}, {3,5}, {3,7}, {3,8}, {4,5}, {5,6}, {7,8} }
n = 8 (number of nodes)
m = 11 (number of edges)
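A minimal sketch of how the sets V and E could be written down in Python. The edge {7,8} is included because the stated count m = 11 and the cycle example 1-2-3-7-8-3-1 imply it; the helper name `neighbors` is an illustrative assumption, not part of the slides.

```python
# Vertex set and edge set of the example graph.
# frozenset models an unordered pair {u, v}, so {1,2} and {2,1} are the same edge.
V = {1, 2, 3, 4, 5, 6, 7, 8}
E = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
                            (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]}

n = len(V)  # number of nodes
m = len(E)  # number of edges


def neighbors(v):
    """Return the set of nodes that share an edge with v (hypothetical helper)."""
    return {u for e in E if v in e for u in e if u != v}


print(n, m, sorted(neighbors(2)))
```

Storing edges as a set of frozensets mirrors the set notation directly; it is convenient for membership tests but not the most efficient layout for traversal.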
Graph types:
Undirected
Directed
Mixed (some edges directed, some undirected)
Weighted (weights on edges or nodes)
[Figure: a bipartite graph]
cycle C = 1-2-4-5-3-1
Simple cycle:
all vertices and edges are distinct
each edge is preceded and followed by its end-vertices
E.g.: 1-2-3-7-8-3-1 in the figure above is not a simple cycle (vertex 3 repeats), but C above is.
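The simple-cycle condition can be checked mechanically. The sketch below assumes a cycle is given as a closed vertex sequence and that the graph is the 8-node example with edge {7,8} present; the function name `is_simple_cycle` is an illustrative choice.

```python
# Edges of the 8-node example graph, stored as unordered pairs.
EDGES = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
                                (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]}


def is_simple_cycle(seq, edges):
    """seq is a closed vertex sequence such as [1, 2, 4, 5, 3, 1]."""
    # A simple cycle closes on itself and has at least 3 distinct vertices.
    if len(seq) < 4 or seq[0] != seq[-1]:
        return False
    inner = seq[:-1]
    # No vertex may repeat (apart from the closing one).
    if len(set(inner)) != len(inner):
        return False
    # Every consecutive pair must be an edge of the graph.
    return all(frozenset((u, v)) in edges for u, v in zip(seq, seq[1:]))


print(is_simple_cycle([1, 2, 4, 5, 3, 1], EDGES))
print(is_simple_cycle([1, 2, 3, 7, 8, 3, 1], EDGES))
```

With at least three distinct vertices, distinct edges follow automatically, so the edge-distinctness condition needs no separate check.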
[Figure: cycles C3, C4, C5]
[Figure: a tree, with a child of v and the leaves marked]
[Figure: edge-weighted graph G = (V, E) and a tree T with total edge cost Σ_{e∈T} c_e = 50]
[Figure: complete graphs K3, K4, K5]
Walk: {2,3}, {3,1}, {1,2}, {2,3}, {3,4}
Path: {3,1}, {1,2}, {2,3}, {3,4}
Simple path: {1,2}, {2,3}, {3,4}
Cycle: a path that starts and ends at the same vertex
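The three examples differ only in what may repeat: a walk may reuse edges, a path (in the slides' usage) may reuse vertices but not edges, and a simple path reuses neither. A small sketch, assuming the figure's graph has the four edges {1,2}, {2,3}, {3,1}, {3,4}; the function name `classify` is hypothetical.

```python
# Assumed edge set of the 4-node figure, inferred from the listed examples.
EDGES = {frozenset(e) for e in [(1, 2), (2, 3), (3, 1), (3, 4)]}


def classify(seq, edges):
    """Classify a vertex sequence as 'walk', 'path', or 'simple path'."""
    steps = [frozenset(pair) for pair in zip(seq, seq[1:])]
    # Every step must follow an existing edge, otherwise it is not a walk at all.
    if not steps or any(step not in edges for step in steps):
        return "not a walk"
    if len(set(seq)) == len(seq):
        return "simple path"   # no vertex repeats
    if len(set(steps)) == len(steps):
        return "path"          # vertices may repeat, edges do not
    return "walk"              # some edge is used more than once


print(classify([2, 3, 1, 2, 3, 4], EDGES))
print(classify([3, 1, 2, 3, 4], EDGES))
print(classify([1, 2, 3, 4], EDGES))
```

Note that many textbooks call the middle category a trail and reserve "path" for what is called a simple path here.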
E.g., a graph with 3 connected components
Partial subgraph: 3-node path
Induced subgraph: 3-node cycle, C3 = K3
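The difference between the two notions is that an induced subgraph must keep every edge of G whose endpoints are both selected, while a partial subgraph may drop some. A hedged sketch on the 8-node example graph (with edge {7,8} assumed present); the function names are illustrative.

```python
# Edges of the 8-node example graph, stored as unordered pairs.
EDGES = {frozenset(e) for e in [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
                                (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]}


def induced_subgraph(nodes, edges):
    """Keep every edge whose two endpoints both lie in `nodes`."""
    return {e for e in edges if e <= nodes}


def is_partial_subgraph(sub_edges, nodes, edges):
    """A partial subgraph uses only selected nodes and only edges of G."""
    return all(e in edges and e <= nodes for e in sub_edges)


triangle = induced_subgraph({1, 2, 3}, EDGES)           # 3-node cycle, C3 = K3
path_123 = {frozenset((1, 2)), frozenset((2, 3))}        # 3-node path 1-2-3
print(len(triangle), is_partial_subgraph(path_123, {1, 2, 3}, EDGES))
```

The path 1-2-3 is a valid partial subgraph on nodes {1, 2, 3}, but it is not induced because it omits the edge {1,3}.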
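Adjacency matrix A of the example graph (rows and columns indexed by nodes 1-8):
     1  2  3  4  5  6  7  8
1    0  1  1  0  0  0  0  0
2    1  0  1  1  1  0  0  0
3    1  1  0  0  1  0  1  1
4    0  1  0  0  1  0  0  0
5    0  1  1  1  0  1  0  0
6    0  0  0  0  1  0  0  0
7    0  0  1  0  0  0  0  1
8    0  0  1  0  0  0  1  0
Sum of row 2 = degree of node 2 = 4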
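As a small illustration of the row-sum remark, the sketch below builds the adjacency matrix of the 8-node example from its edge list (edge {7,8} assumed present) using plain nested lists; the names `A`, `index`, and `degree` are illustrative.

```python
V = list(range(1, 9))
E = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
     (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]

# Map node labels 1..8 to matrix indices 0..7.
index = {v: i for i, v in enumerate(V)}

# n x n matrix of zeros; an undirected edge sets two symmetric entries.
A = [[0] * len(V) for _ in V]
for u, v in E:
    A[index[u]][index[v]] = 1
    A[index[v]][index[u]] = 1


def degree(v):
    """Degree of node v = sum of the entries in its row."""
    return sum(A[index[v]])


print(degree(2))
```

Summing all row sums counts each edge twice, which gives the identity "sum of degrees = 2m".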
[Figure: linear, quadratic, and cubic growth]
Complexity Classes
Polynomial-time algorithms:
On input size n, their running time is O(n^k) for some constant k
Not all problems can be solved in polynomial time (poly-time).
Intuition:
Polynomial time algorithms are tractable or easy
Problems that require super-polynomial time are hard
Complexity classes:
P: problems that are solvable in polynomial time
NP: problems whose solutions are verifiable in polynomial time, i.e., decision problems for which there exists a poly-time certifier
E.g.
A Hamiltonian cycle of a graph G(V,E) is a simple cycle that contains each vertex in V.
Problem: does a graph have a Hamiltonian cycle? NP-complete.
If a candidate solution is given as a sequence (v1, v2, ..., vn), it is easy to check in poly-time that it lists each vertex of V exactly once, that {vi, vi+1} is in E for all i < n, and that {vn, v1} is in E.
Instance s: the graph G
Certificate t: the sequence (v1, v2, ..., vn)
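The poly-time check described above can be sketched as a certifier that takes an instance (the graph) and a certificate (a vertex ordering). The function name and the 4-cycle test graph are assumptions made for illustration.

```python
def certify_hamiltonian_cycle(vertices, edges, order):
    """Return True if `order` is a Hamiltonian cycle of the graph (vertices, edges)."""
    n = len(order)
    # The certificate must list every vertex exactly once.
    if n < 3 or n != len(vertices) or set(order) != set(vertices):
        return False
    # Consecutive vertices, including the wrap-around pair (vn, v1), must be adjacent.
    return all(frozenset((order[i], order[(i + 1) % n])) in edges for i in range(n))


# Hypothetical instance: the 4-cycle 1-2-3-4-1.
square_vertices = {1, 2, 3, 4}
square_edges = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (4, 1)]}

print(certify_hamiltonian_cycle(square_vertices, square_edges, [1, 2, 3, 4]))
print(certify_hamiltonian_cycle(square_vertices, square_edges, [1, 3, 2, 4]))
```

The check runs in time linear in n (given constant-time edge lookup), which is what places the problem in NP; finding such an ordering is the hard part.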
Complexity Classes: P, NP
P. Decision problems for which there is a poly-time algorithm.
NP. Decision problems for which there is a poly-time certifier.
Claim. P ⊆ NP.
Proof. Consider any problem X in P.
By definition, there exists a poly-time algorithm A that solves X.
A certifier can ignore the certificate and simply run A, so if we can solve in poly-time, we can verify a solution in poly-time.
NP-complete. A problem Y in NP with the property that for every problem X in NP, X ≤p Y (X is poly-time reducible to Y).
A is poly-time reducible to B if there exists a function f: A → B such that x is a yes-instance of A if and only if f(x) is a yes-instance of B, and f is poly-time computable.
Problem L is NP-complete if:
L is in NP
every problem in NP is poly-time reducible to L (i.e., L is NP-hard: at least as hard as any NP problem)
[Figure: diagram relating P, NP, NP-c (NP-complete), and NP-hard]
Graph Traversing
Given a graph G(V,E), explore every vertex and every edge
Using an adjacency list is more efficient: traversal takes O(|V| + |E|) time, versus O(|V|^2) with an adjacency matrix
Example algorithms:
Depth-first search (DFS)
Breadth-first search (BFS)
BFS example:
[Figure: graph partitioned into BFS layers L0, L1, L2, L3]
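The layers L0, L1, L2, ... group nodes by their distance from the start node. A minimal BFS sketch that returns these layers, run here on the 8-node example graph (edge {7,8} assumed present); this is not the LEDA code, and the function name `bfs_layers` is illustrative.

```python
from collections import deque

EDGE_LIST = [(1, 2), (1, 3), (2, 3), (2, 4), (2, 5), (3, 5),
             (3, 7), (3, 8), (4, 5), (5, 6), (7, 8)]

# Adjacency list: node -> list of neighbours.
adj = {v: [] for v in range(1, 9)}
for u, v in EDGE_LIST:
    adj[u].append(v)
    adj[v].append(u)


def bfs_layers(adj, s):
    """Return [L0, L1, ...] where Li holds the nodes at distance i from s."""
    dist = {s: 0}
    layers = [[s]]
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:              # first time v is discovered
                dist[v] = dist[u] + 1
                if dist[v] == len(layers):
                    layers.append([])      # open a new layer
                layers[dist[v]].append(v)
                queue.append(v)
    return layers


print(bfs_layers(adj, 1))
```

Each node enters the queue once and each adjacency list is scanned once, which gives the O(|V| + |E|) running time.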
BFS: code from LEDA
The LEDA Platform of Combinatorial and Geometric Computing, by K. Mehlhorn and St. Näher
DFS applications:
Determines whether G is connected
Computes the connected components of G (strongly connected
components of a digraph = directed graph)
Path / cycle finding
Topological sort (ordering of vertices of digraph G(V,E) such that for every
edge (u,v) in E, u appears before v in the ordering)
Linear running time, O(|V| + |E|)
BFS applications:
Computes the distance from s to each reachable vertex in unweighted G
Finds shortest paths from s to all other nodes in unweighted G
Finds a simple cycle, if there is one
Computes the connected components of G
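Two of the DFS applications listed above, connected components and topological sort, can be sketched briefly. Both functions assume the graph is a dict mapping each node to a list of neighbours (successors for a digraph), and `topological_sort` assumes its input is acyclic; the names and test graphs are illustrative.

```python
def connected_components(adj):
    """Connected components of an undirected graph via iterative DFS."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)                 # u is visited when popped (DFS order)
            comp.append(u)
            stack.extend(v for v in adj[u] if v not in seen)
        components.append(sorted(comp))
    return components


def topological_sort(dag):
    """Order the vertices of a DAG so that u precedes v for every edge (u, v)."""
    order, done = [], set()

    def visit(u):
        done.add(u)
        for v in dag[u]:
            if v not in done:
                visit(v)
        order.append(u)                 # u finishes after all its successors

    for u in dag:
        if u not in done:
            visit(u)
    return order[::-1]                  # reverse finishing order


# Hypothetical inputs.
undirected = {1: [2], 2: [1], 3: [], 4: [5], 5: [4]}
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(connected_components(undirected))
print(topological_sort(dag))
```

The recursive `visit` is fine for small graphs; very deep graphs would need an explicit stack to stay within Python's recursion limit.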
Single-source shortest path problems (SSSPP):
Given a source vertex s, find distances and shortest paths from s to all
other vertices
BFS works on unweighted graphs
Dijkstra's algorithm for weighted graphs:
Each node is labeled with its distance from the source node along the best known path
Initially, the source is labeled 0 and all other nodes are labeled with infinity
Dijkstra's Shortest Path Algorithm
Step 1: initially all nodes are non-permanent
Step 2: set the source node (A) as permanent
A is at the same time the current node
Step 3:
Examine all non-permanent nodes i adjacent to the current node
For each i, calculate the cumulative distance from the source node to i via the current node
Relabel i with the newly computed distance
But if i already has a shorter cumulative distance than the calculated one, then do NOT relabel.
When relabeling, also label i with the name of the current node (as predecessor)
Compare labels (distances) of all non-permanent nodes and choose the one with the smallest value. Change the node to permanent and set it as the current node.
Repeat step 3 until all nodes become permanent.
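The steps above can be sketched in Python. This version uses a binary heap (`heapq`) to pick the non-permanent node with the smallest label, which is one common implementation choice rather than the only one; the graph, its weights, and the function name `dijkstra` are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """adj maps node -> list of (neighbour, weight); weights must be non-negative."""
    dist = {v: float("inf") for v in adj}   # distance labels
    pred = {v: None for v in adj}           # predecessor labels
    dist[source] = 0
    permanent = set()
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)          # smallest label among candidates
        if u in permanent:
            continue                        # stale heap entry, already finalised
        permanent.add(u)                    # u becomes the current node
        for v, w in adj[u]:
            # Relabel v only if the route via u is strictly shorter.
            if v not in permanent and d + w < dist[v]:
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


# Hypothetical weighted graph with source A.
graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("A", 4), ("C", 2), ("D", 5)],
    "C": [("A", 1), ("B", 2), ("D", 8)],
    "D": [("B", 5), ("C", 8)],
}
dist, pred = dijkstra(graph, "A")
print(dist, pred)
```

Following the predecessor labels back from any node recovers its shortest path to the source, e.g. D, B, C, A in this example.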
[Figures: step-by-step worked example of Dijkstra's Shortest Path Algorithm]
Dijkstra's Shortest Path Algorithm (SSSPP)
Does not allow negative weights on edges
Similar to BFS (BFS is this algorithm with all weights equal to 1)
Time complexity depends on the implementation:
O(|V|^2): this is O(|E|) for dense graphs
O(|E| log|V|): good for sparse graphs (for them, |E| is O(|V|))
O(|V| log|V| + |E|): good for both sparse and dense graphs
Bellman-Ford Algorithm for SSSPP:
Works on weighted, directed graphs
Allows negative weights on edges, but no negative weight cycles
If there is a negative weight cycle reachable from source vertex s, it reports no
solution exists; otherwise produces the shortest paths and their weights
B-F algorithm (G, s)
  For each v ∈ V
    d[v] = ∞
  d[s] = 0
  For i = 1 to |V|-1                  (O(|V|) iterations)
    For each edge (u,v) ∈ E           (O(|E|) per iteration)
      If d[v] > d[u] + w(u,v)
        d[v] = d[u] + w(u,v)
  For each edge (u,v) ∈ E
    If d[v] > d[u] + w(u,v)
      Return FALSE (negative weight cycle found)
  Return TRUE
Total running time: O(|V| |E|)
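A direct Python transcription of the pseudocode, as a sketch. It assumes edges are given as (u, v, w) triples and returns None instead of FALSE when a negative weight cycle is reachable; the function name and the two test graphs are hypothetical.

```python
def bellman_ford(vertices, edges, s):
    """Return shortest-path weights from s, or None if a negative cycle is reachable."""
    d = {v: float("inf") for v in vertices}
    d[s] = 0
    # Relax every edge |V| - 1 times.
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w
    # One more pass: any further improvement means a negative weight cycle.
    for u, v, w in edges:
        if d[u] + w < d[v]:
            return None
    return d


# Hypothetical digraph with one negative edge but no negative cycle.
ok_edges = [("s", "a", 4), ("s", "b", 5), ("b", "a", -3)]
# Hypothetical digraph where the cycle a -> b -> a has total weight -2.
bad_edges = [("s", "a", 1), ("a", "b", 1), ("b", "a", -3)]

print(bellman_ford(["s", "a", "b"], ok_edges, "s"))
print(bellman_ford(["s", "a", "b"], bad_edges, "s"))
```

The two nested loops give the O(|V| |E|) running time; unreachable vertices keep the label infinity.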
All-pairs shortest paths
Goal: create an n x n matrix of distances d(u,v)
Use the B-F algorithm once from each vertex as a source
But O(|V|^2 |E|) running time, i.e., O(|V|^4) running time for dense graphs
Can do slightly better with Dijkstra's from each node, but then no negative weight edges
Floyd-Warshall algorithm
Uses an adjacency matrix representation of G with entries being the weights of edges, w_ij
Negative weights are allowed, but no negative weight cycles (detects them if they exist)
Output:
A matrix of distances, D (or equivalently, costs, C)
A predecessor matrix, Π
Dynamic programming algorithm (breaking down into smaller subproblems)
All-pairs shortest paths
Floyd-Warshall algorithm
For i = 1 to n
  For j = 1 to n
    d^(0)_ij = w_ij
(d^(k)_ij is the length of the shortest i,j-path using only intermediate nodes from {1, 2, ..., k})
For k = 1 to n
  For i = 1 to n
    For j = 1 to n
      d^(k)_ij = min{ d^(k-1)_ij , d^(k-1)_ik + d^(k-1)_kj }
Return D^(n) (matrix of distances, or costs C)
O(n^3) time
O(n^2) space (store only the previous matrix)
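A compact Python sketch of the recurrence. It updates a single matrix in place, a standard simplification that gives the same result as keeping the previous matrix when there are no negative weight cycles; the weight matrix below is a hypothetical 3-node digraph and the predecessor matrix is omitted for brevity.

```python
INF = float("inf")


def floyd_warshall(w):
    """w[i][j] is the weight of edge i -> j (INF if absent, 0 on the diagonal)."""
    n = len(w)
    d = [row[:] for row in w]           # d starts as a copy of the weights
    for k in range(n):                  # allow node k as an intermediate node
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


# Hypothetical digraph: 0 -> 1 (3), 1 -> 2 (-2), 2 -> 0 (1).
weights = [
    [0, 3, INF],
    [INF, 0, -2],
    [1, INF, 0],
]
distances = floyd_warshall(weights)
print(distances)
```

A negative entry on the diagonal of the result would indicate a negative weight cycle through that node.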
[Figure: worked example of the Floyd-Warshall algorithm]