Final Exam Prep History of Science
5.1: Trees
5.2: Fundamentals
Theorem 5.1: For any connected (simple) graph G with n vertices and m edges, n ≤ m + 1.
Since the theorem holds for simple graphs, it also holds for non-simple graphs: adding parallel edges or loops only increases m, so n ≤ m + 1 still holds.
Theorem 5.2: For any tree T with n vertices and m edges, n = m + 1.
Theorem 5.3: A connected graph G with n vertices and m edges for which n = m + 1, is a tree.
Theorem 5.4: A graph G is a tree if and only if there exists exactly one path between every two
vertices u and v.
Theorem 5.5: An edge e of a graph G is a cut edge if and only if e is not part of any cycle of G.
Therefore, a connected graph G is a tree if and only if every edge is a cut edge.
Kruskal’s Algorithm: Choose an edge e_1 with minimal weight. Choose a next edge e_{k+1} such that the following two conditions are met:
• The induced graph is acyclic
• The weight w(e_{k+1}) is minimal
Stop when there is no more edge to select in the previous step.
Theorem 5.7: Consider a weighted graph G with n vertices. Any spanning tree T_Kruskal of G constructed by Kruskal’s algorithm has minimal weight.
For an efficient implementation it is important to use the disjoint-set data structure, to
1. determine if an edge joins different subtrees, and
2. join two different subtrees
See the course Data structures & Algorithms.
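The two union-find operations above can be sketched as follows; a minimal Python sketch (not the course implementation), assuming edges are given as (weight, u, v) tuples and using a path-compressing disjoint-set forest:

```python
def kruskal(n, edges):
    """Minimum spanning tree of a connected weighted graph on vertices 0..n-1.
    edges: list of (weight, u, v). Returns the list of chosen edges."""
    parent = list(range(n))          # disjoint-set forest

    def find(x):                     # find the root, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):    # consider edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                 # edge joins two different subtrees
            parent[ru] = rv          # join the subtrees
            mst.append((w, u, v))
    return mst
```

Note that the resulting tree has n − 1 edges, as Theorem 5.2 requires.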
Lemma 2 (Weighted Graph with Distinct Weights): Let 𝐺 be a weighted graph in which all edge
weights are distinct. Let 𝑇 be a minimal spanning tree for 𝐺, and 𝑆 a subtree of 𝑇.
Let 𝑒 be the lowest-weight outgoing edge of 𝑆 (i.e., 𝑒 is incident to exactly one vertex in 𝑆).
Then 𝑒 ∈ 𝐸(𝑇)
Consequences: A different MST algorithm (Prim-Jarnik): Repeatedly extend a subgraph, starting
from an arbitrary vertex, by adding the least-weight outgoing edge (which is unique!). The result is
a minimal spanning tree by the lemma.
In a weighted graph G in which all edge weights are distinct, the minimal spanning tree is unique.
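The Prim-Jarník procedure described above can be sketched with a binary heap; a minimal sketch assuming a connected graph given as adjacency lists of (weight, neighbor) pairs:

```python
import heapq

def prim(adj, start=0):
    """Total weight of a minimum spanning tree via Prim-Jarnik.
    adj: dict vertex -> list of (weight, neighbor); graph assumed connected."""
    in_tree = {start}
    heap = list(adj[start])          # candidate outgoing edges of the subtree
    heapq.heapify(heap)
    total = 0
    while heap and len(in_tree) < len(adj):
        w, v = heapq.heappop(heap)   # least-weight candidate edge
        if v in in_tree:
            continue                 # edge no longer leaves the subtree
        in_tree.add(v)               # extend the subtree by this edge
        total += w
        for e in adj[v]:
            heapq.heappush(heap, e)
    return total
```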
Dijkstra’s Algorithm: It is used to find a sink tree, one for every node in the network. The
algorithm works for both directed and undirected graphs. One restriction is that arc weights
cannot be negative. It is widely deployed in communication networks, where
it is known as a link-state routing protocol. The algorithm uses edge relaxation.
The root node is the destination.
WRITE THE ALGORITHM LATER
Implementing the algorithm efficiently requires the Fibonacci heap data structure, to determine
which vertex in H has the smallest d-value. (This will be explained in the course Data structures &
Algorithms.) Its worst-case time complexity then is Ο(𝑚 + 𝑛 log 𝑛), where 𝑛 and 𝑚 are number of
vertices and arcs respectively. Dijkstra’s original description had worst-case time complexity Ο(n2),
because he didn’t use a heap...
We need to know which vertices are adjacent to each other and what the weight of their
respective connecting edges are.
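Pending the algorithm to be written out above, here is a minimal sketch of Dijkstra with edge relaxation, using a binary heap (Python's heapq) rather than a Fibonacci heap, so O((n + m) log n) instead of O(m + n log n); adjacency lists of (weight, neighbor) pairs, non-negative weights:

```python
import heapq

def dijkstra(adj, root):
    """Shortest distances between root and every vertex.
    adj: dict vertex -> list of (weight, neighbor); weights must be >= 0.
    In an undirected graph these are also the distances *to* root, i.e.
    the d-values of the sink tree with root as destination."""
    d = {v: float('inf') for v in adj}
    d[root] = 0
    heap = [(0, root)]
    while heap:
        du, u = heapq.heappop(heap)     # vertex with smallest d-value
        if du > d[u]:
            continue                    # stale heap entry, skip
        for w, v in adj[u]:
            if du + w < d[v]:           # edge relaxation
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d
```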
Bellman-Ford Algorithm: We don’t have to know the topology of the graph when we’re applying
this algorithm. It computes shortest paths from all vertices to a vertex u, in a weighted digraph. It
allows edges to be negative. Again it uses relaxation, but now all arcs are considered n times (with
n the number of vertices in the digraph). The idea is that each shortest path contains at most 𝑛 − 1
arcs. Except when there is a cycle of negative weight!
The time complexity of the algorithm is O(mn). At each step, every vertex inspects the information collected at each of its neighbors; over all vertices this amounts to roughly m inspections per step, where m is the total number of edges. The total number of steps equals the length of the longest shortest path, which can be shown to grow proportionally to the number of vertices n.
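A minimal sketch of Bellman-Ford as described: relaxation over all arcs in up to n passes, with negative-cycle detection. It is sketched from a single source s; for distances *to* a destination u, as in the notes, reverse every arc first:

```python
def bellman_ford(n, arcs, s):
    """Shortest distances from source s in a digraph on vertices 0..n-1.
    arcs: list of (x, y, w); the weight w may be negative.
    Returns the distance list, or None if a reachable negative cycle exists."""
    d = [float('inf')] * n
    d[s] = 0
    for i in range(n):                  # n passes over all m arcs: O(mn)
        changed = False
        for x, y, w in arcs:
            if d[x] + w < d[y]:         # relaxation of arc <x, y>
                d[y] = d[x] + w
                changed = True
        if not changed:                 # shortest paths have <= n-1 arcs
            break
        if i == n - 1:                  # still improving on the n-th pass
            return None                 # => cycle of negative weight
    return d
```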
Let 𝐺 be connected. 𝑑 (𝑢, 𝑣) denotes the distance between vertices 𝑢 and 𝑣, i.e., the length of a
shortest path between 𝑢 and 𝑣.
Eccentricity ε(u): max { d(u,v) | v ∈ V(G) } — the maximum distance between vertex u and any other vertex.
Radius rad(G): min { ε(u) | u ∈ V(G) } — the minimum eccentricity; indicates how dispersed the vertices in a network are.
Diameter diam(G): max { d(u,v) | u, v ∈ V(G) } — the maximum distance between any two vertices.
d̄(u) is the average length of the shortest paths from u to every other vertex v.
d̄(G) denotes the average path length of G.
The characteristic path length is the median over all d̄(u). If the number of values is even, the median is the average of the two middle values.
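For an unweighted graph these notions can be sketched with plain BFS (representation illustrative, graph assumed connected):

```python
from collections import deque

def distances(adj, u):
    """BFS distances from u in an unweighted graph (dict of neighbor lists)."""
    d = {u: 0}
    q = deque([u])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in d:
                d[y] = d[x] + 1
                q.append(y)
    return d

def eccentricity(adj, u):
    return max(distances(adj, u).values())   # farthest vertex from u

def radius(adj):
    return min(eccentricity(adj, u) for u in adj)

def diameter(adj):
    return max(eccentricity(adj, u) for u in adj)
```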
In a simple connected undirected graph G, a vertex v ∈ V(G) with δ(v) > 1 and neighbor set N(v) has clustering coefficient cc(v) = 2·m_v / (n_v·(n_v − 1)), where m_v is the number of edges in the subgraph induced by N(v) and n_v = |N(v)|. For a directed graph, the formula becomes m_v / (n_v·(n_v − 1)).
The clustering coefficient CC(G) is the average of cc(v) over all vertices v with degree greater than 1.
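A sketch of the undirected formulas, with adjacency given as sets of neighbors and integer vertex labels (names illustrative):

```python
def cc(adj, v):
    """Clustering coefficient of v (degree >= 2) in an undirected graph.
    adj: dict vertex -> set of neighbors."""
    nbrs = adj[v]
    nv = len(nbrs)
    # m_v: number of edges in the subgraph induced by N(v)
    mv = sum(1 for x in nbrs for y in nbrs if x < y and y in adj[x])
    return 2 * mv / (nv * (nv - 1))

def cc_graph(adj):
    """CC(G): average cc over all vertices with degree > 1."""
    vs = [v for v in adj if len(adj[v]) > 1]
    return sum(cc(adj, v) for v in vs) / len(vs)
```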
6.4: Centrality
Consider a (strongly) connected graph G. The center C(G) is the set of vertices with minimal eccentricity: C(G) = { v ∈ V(G) | ε(v) = rad(G) }.
Being at the center means being at minimal distance to the farthest vertex. The higher the centrality, the "closer" a vertex is to the center of the graph.
c_E(u) denotes the (eccentricity-based) vertex centrality of u: c_E(u) = 1/ε(u).
In some cases, it is more important to know how close a node is to all other nodes. c_C(u) denotes the closeness of u: c_C(u) = 1 / Σ_{v ∈ V(G)} d(u, v).
Important vertices are those who lie on many shortest paths, as their removal may significantly
increase the distance between other vertices.
𝑆(𝑥, 𝑦) is set of shortest paths between 𝑥 and 𝑦.
𝑆 (𝑥, 𝑢, 𝑦) ⊆ 𝑆(𝑥, 𝑦) contains the shortest paths that pass through 𝑢.
c_B(u) denotes the betweenness centrality of u: c_B(u) = Σ_{x ≠ u ≠ y} |S(x,u,y)| / |S(x,y)|.
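For small unweighted connected graphs, betweenness can be computed by brute force, using |S(x,u,y)| = σ(x,u)·σ(u,y) whenever d(x,u) + d(u,y) = d(x,y), where σ counts shortest paths from a BFS; a sketch with integer vertex labels:

```python
from collections import deque

def bfs_counts(adj, s):
    """Distances d and shortest-path counts sigma from s (BFS)."""
    d, sigma = {s: 0}, {s: 1}
    q = deque([s])
    while q:
        x = q.popleft()
        for y in adj[x]:
            if y not in d:
                d[y] = d[x] + 1
                sigma[y] = 0
                q.append(y)
            if d[y] == d[x] + 1:        # x precedes y on a shortest path
                sigma[y] += sigma[x]
    return d, sigma

def betweenness(adj, u):
    """Sum over unordered pairs {x, y} (both != u) of |S(x,u,y)| / |S(x,y)|."""
    d_u, s_u = bfs_counts(adj, u)
    total = 0.0
    for x in adj:
        if x == u:
            continue
        d_x, s_x = bfs_counts(adj, x)
        for y in adj:
            if y <= x or y == u:
                continue
            if d_x[u] + d_u[y] == d_x[y]:   # u lies on a shortest x-y path
                total += s_x[u] * s_u[y] / s_x[y]
    return total
```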
An Erdős–Rényi 𝐸𝑅(𝑛, 𝑝) graph is a simple, undirected graph which consists of 𝑛 vertices and each
pair of distinct vertices is adjacent with probability 𝑝 ∈ [0,1]. Each vertex has 𝑛 − 1 possible
incident edges. So in an ER-graph, on average we can expect p·(n − 1) edges at each vertex.
Theorem: The expected clustering coefficient for any vertex in an ER-graph is 𝑝.
Theorem: In an ER-graph, the degree distribution is binomial:
P[δ(u) = k] = C(n−1, k) · p^k · (1−p)^(n−1−k).
The probability that u is adjacent to one specific subset of k of its n − 1 potential neighbors (and to none of the others) is p^k · (1−p)^(n−1−k), and there are C(n−1, k) such subsets.
𝐸𝑅(𝑛, 𝑝) represents a group of Erdős–Rényi graphs. Most 𝐸𝑅(𝑛, 𝑝) graphs aren’t isomorphic.
Example ER(2000, 0.015):
• expected average degree δ̄ = 0.015 × 1999 = 29.985
• expected number of edges |E(G)| = ½ · n · δ̄ = ½ × 2000 × 29.985 = 29985
• one sampled graph G₂ had 29708 edges
The probability that the degree distribution of an ER-graph resembles the expected one increases
with the size of the graph.
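A quick simulation sketch of ER(n, p) (sizes and seed arbitrary), checking that the average degree lands near p·(n − 1):

```python
import random

def er_graph(n, p, seed=42):
    """Sample a simple undirected ER(n, p) graph as adjacency sets."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):       # each distinct pair considered once
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

adj = er_graph(500, 0.05)
avg_deg = sum(len(nb) for nb in adj.values()) / 500
# expected average degree: p * (n - 1) = 0.05 * 499 = 24.95
```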
For large G ∈ ER(n, p), the expected average shortest path length d̄(G) tends to
d̄(G) → (ln(n) − γ) / ln(p·n) + ½,
where ln is the natural logarithm and γ is Euler's constant (≈ 0.5772).
In an ER-graph most vertices gather in one large component while a few remain outside it. This component is called the giant component, and as we increase p it grows very quickly.
Watts-Strogatz graphs: The idea is to combine properties of classical random graphs with high
clustering coefficients.
𝑉= {v1, v2, v3,…, vn}. Choose 𝑛 ≫ 𝑘 ≫ ln(𝑛) ≫ 1, with 𝑘 even. (≫ means much larger)
• Order the n vertices into a ring. Connect each vertex to its first k/2 right-hand neighbors and to its first k/2 left-hand neighbors. This is equivalent to constructing a Harary graph H_{k,n}.
• For each vertex u, consider (only once) each edge ⟨u,v⟩ ∈ H_{k,n}. With probability p, replace it by ⟨u,w⟩, where w ≠ v is randomly chosen from V − N(u). The resulting graph is in WS(n, k, p).
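The two construction steps can be sketched as follows (seed and representation illustrative); note that rewiring preserves the number of edges, n·k/2:

```python
import random

def ws_graph(n, k, p, seed=1):
    """Watts-Strogatz graph: ring lattice H_{k,n} (k/2 neighbors on each
    side, k even, n >> k), then rewire each lattice edge with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):                     # step 1: build the ring lattice
        for j in range(1, k // 2 + 1):
            v = (u + j) % n
            adj[u].add(v)
            adj[v].add(u)
    for u in range(n):                     # step 2: consider each edge once
        for j in range(1, k // 2 + 1):
            v = (u + j) % n
            if v in adj[u] and rng.random() < p:
                choices = [w for w in range(n) if w != u and w not in adj[u]]
                if choices:                # replace <u,v> by <u,w>
                    w = rng.choice(choices)
                    adj[u].discard(v); adj[v].discard(u)
                    adj[u].add(w); adj[w].add(u)
    return adj
```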
Observation: Many vertices in a WS-graph will be close to each other. Because if 𝑝 isn’t very close
to 0, edges are created to other “groups” of vertices.
Weak links are the long links in a WS-graph that cross the ring.
Observation: WS-graphs have a high clustering coefficient because for 𝑝 significantly smaller than
1, many edges aren’t replaced.
Theorem: For any WS(n, k, 0) graph, CC(G) = 3(k−2) / (4(k−1)), which approaches 0.75 for large k.
Theorem: For all graphs in WS(n, k, 0), the average shortest path length d̄(u) from vertex u to any other vertex is roughly n/(2k).
Theorem: The average shortest path length in 𝑊𝑆(𝑛, 𝑘, 0) graphs is high whereas in small world
graphs this is not the case. But if 𝑝 increases, the average shortest path length drops rapidly but
the clustering coefficient stays relatively high. Typically, 𝑝 = 0.05 is a good value for both 𝐶𝐶(𝐺)
and 𝑑̅(𝐺).
We can only build scale-free networks by using a growth process combined with preferential
attachment. Meaning that in order to understand real-world large networks, we mimic their
creation by observing how new nodes attach themselves to existing nodes.
Barabási-Albert graphs: Let G ∈ ER(n₀, p) and V = V(G). Let n ≫ n₀. While |V| < n do:
1. Add a new vertex v_s to V_{s−1} (i.e., V_s ← V_{s−1} ∪ {v_s}).
2. Add a new edge ⟨v_s,u⟩ for m ≤ n₀ distinct u ∈ V − {v_s}, each u chosen with probability proportional to its degree δ(u).
Result: a Barabási-Albert graph BA(n, n₀, m). The expected degree distribution is P[δ(u) = k] proportional to k^−3.
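A sketch of the growth process with preferential attachment; for simplicity the ER(n₀, p) start is replaced by a clique on n₀ vertices, and the preferential choice is implemented by sampling uniformly from a list that holds each vertex once per incident edge:

```python
import random

def ba_graph(n, n0, m, seed=7):
    """Grow a BA-style graph: start from a clique on n0 vertices
    (simplification of the ER(n0, p) start), then add vertices one at a
    time, each with m <= n0 edges to distinct existing vertices."""
    rng = random.Random(seed)
    adj = {v: {u for u in range(n0) if u != v} for v in range(n0)}
    # each vertex appears once per incident edge, so a uniform pick from
    # this list chooses u with probability proportional to delta(u)
    endpoints = [v for v in range(n0) for _ in range(n0 - 1)]
    for s in range(n0, n):
        targets = set()
        while len(targets) < m:          # m distinct preferential targets
            targets.add(rng.choice(endpoints))
        adj[s] = set()
        for u in targets:
            adj[s].add(u)
            adj[u].add(s)
            endpoints.extend([u, s])     # both endpoints gained a degree
    return adj
```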
Generalized Barabási-Albert graphs: Start with a set V of n₀ vertices, and no edges. While |V| < n do:
1. V ← V ∪ {v} for some new vertex v.
2. Add edges ⟨v,u⟩ for m ≤ n₀ distinct u ∈ V − {v}. Each u is chosen with probability proportional to δ(u).
3. For a constant c ≥ 0, add c × m edges between vertices in V − {v}. The probability to add an edge ⟨x,y⟩ is proportional to δ(x) × δ(y).
Hubs in scale-free networks make them vulnerable to targeted attacks. A scale-free network
quickly becomes disconnected if hubs are removed. But it is at least as robust as an ER-graph
under a random attack.
Among the vertices present at the start of constructing a BA-graph, only a few are lucky enough to get connections early on, and those are likely to end up with a very high degree. Vertices that don't get connections early on most likely end up with a small clustering coefficient. By contrast, vertices v_s inserted late (say s ≥ 20000) are mostly linked to vertices with a very high degree.
BA-graphs have higher clustering coefficients than ER-graphs, but these values are still relatively
small compared to the real-world networks.
PageRank is clearly based on indegrees, yet the rank of a page and its indegree turn out to be only weakly correlated. When we rank pages according to PageRank, P[rank = k] is proportional to 1/k^2.1.
Graphs can be used to analyze social structures such as the formation of groups, the influence of relationships, ties of families and friends, and (dis)liking in groups of people.
Classroom example: first use minimal eccentricity ε to see which child is important, then closeness; but closeness is not always a good indicator of importance. Thus we look at betweenness centrality, which is considered the best indicator of importance: a popular child may not be the best for spreading information.
Intuition: the fraction of vertices that can reach v, divided by the average distance of these vertices
to v.
Consider a digraph with adjacency matrix A where 0 ≤ A[v,u] ≤ 1 and Σ_v A[v,u] = 1 for each vertex u.
Intuition: A[v,u] expresses how much v is appreciated by u.
The ranked prestige of the vertices v satisfies: prank(v) = Σ_u A[v,u] × prank(u), normalized so that Σ_v prank(v)² = 1.
Intuition: the importance of A[v,u] depends on prank(u).
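The fixed point prank = A·prank with Σ prank(v)² = 1 can be approximated by power iteration; a sketch with a small hypothetical appreciation matrix (each column sums to 1):

```python
def ranked_prestige(A, iterations=100):
    """Power iteration for prank(v) = sum_u A[v][u] * prank(u),
    normalized so that sum_v prank(v)^2 = 1."""
    n = len(A)
    p = [1.0 / n] * n
    for _ in range(iterations):
        p = [sum(A[v][u] * p[u] for u in range(n)) for v in range(n)]
        norm = sum(x * x for x in p) ** 0.5     # renormalize to unit length
        p = [x / norm for x in p]
    return p

# hypothetical appreciation matrix: column u divides u's appreciation
# evenly over the two other vertices, so each column sums to 1
A = [[0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0]]
p = ranked_prestige(A)
```

By symmetry, this particular matrix gives all three vertices equal prestige 1/√3.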
A triad is a (potential) relationship between a triple of social entities; the
relationship between each pair is labeled as positive or negative. We consider
balanced triads.
In a signed graph, each edge e is labeled with either a positive (“+”) or a negative (“−”) sign,
denoted by sign(e). The graph can be directed or undirected.
The sign of a walk T is the product of the signs of its edges: sign(T) = ∏_{e ∈ E(T)} sign(e).
Theorem: A signed graph is balanced if and only if all its cycles are positive. A signed graph G is balanced if and only if V(G) can be partitioned into two disjoint subsets V₀ and V₁ such that:
• each negative edge is incident to a vertex from V₀ and a vertex from V₁: E−(G) = { ⟨x,y⟩ | x ∈ V₀, y ∈ V₁ }
• each positive edge is incident to vertices from the same set: E+(G) = { ⟨x,y⟩ | x,y ∈ V₀ or x,y ∈ V₁ }
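The partition characterization yields a simple balance check: two-color the vertices so that positive edges keep the color and negative edges flip it; any conflict corresponds to a negative cycle. A sketch (names illustrative):

```python
from collections import deque

def is_balanced(n, signed_edges):
    """Check balance of a signed graph on vertices 0..n-1.
    signed_edges: list of (x, y, sign) with sign +1 or -1.
    Returns (True, coloring) with coloring[v] in {0, 1} giving the
    V0/V1 partition, or (False, None) if some cycle is negative."""
    adj = {v: [] for v in range(n)}
    for x, y, s in signed_edges:
        adj[x].append((y, s))
        adj[y].append((x, s))
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        q = deque([start])
        while q:
            x = q.popleft()
            for y, s in adj[x]:
                # positive edge: same set; negative edge: opposite sets
                want = color[x] if s > 0 else 1 - color[x]
                if color[y] is None:
                    color[y] = want
                    q.append(y)
                elif color[y] != want:
                    return False, None      # negative cycle found
    return True, color
```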
In an affiliation network, people are tied together through membership relations (e.g., a sports
club or management team).
Social structures are assumed to consist of actors and events. Actors are tied to each other
through joint participation in an event.
Discover correlations between events by actors that participate in both.
Affiliation networks are naturally represented as bipartite graphs: vertices in V_A represent actors and vertices in V_E represent events. An edge ⟨a, e⟩ represents that actor a participates in event e.
Number of events in which actors a and b both participate: NE[a,b] = Σ_e AE[a,e] × AE[b,e].
Number of actors that participate in both events e and f: NA[e,f] = Σ_a AE[a,e] × AE[a,f].
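Both counts are products of the actor-event incidence matrix with its transpose; a tiny sketch with a hypothetical incidence matrix:

```python
# incidence matrix: AE[a][e] = 1 iff actor a participates in event e
AE = [[1, 1, 0],    # actor 0 takes part in events 0 and 1
      [1, 0, 1],    # actor 1 takes part in events 0 and 2
      [0, 1, 1]]    # actor 2 takes part in events 1 and 2

def co_events(AE, a, b):
    """NE[a,b] = sum_e AE[a,e] * AE[b,e]: events shared by actors a and b."""
    return sum(x * y for x, y in zip(AE[a], AE[b]))

def co_actors(AE, e, f):
    """NA[e,f] = sum_a AE[a,e] * AE[a,f]: actors attending both e and f."""
    return sum(row[e] * row[f] for row in AE)
```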
Empirical evaluation has been done on static networks. Do realistic graphs indeed grow with
these assumptions?
Densification
If we look at real-world graphs (esp. social networks) for a long time, then we observe that:
1. The average node degree increases. This phenomenon is called densification. More specifically: e(t) is proportional to n(t)^α, where e(t) and n(t) are the number of edges and nodes, respectively, at time t, and α > 1.
2. The diameter decreases as the network grows.
What causes the densification and related diameter shrinking? Leskovec et al. (2005) propose two
models: Community Guided Attachment (CGA) and Forest Fire Model (not covered in this course)
Community Guided Attachment (CGA): It has been observed that power laws occur in self-similar datasets.
An object is self-similar if it is similar to a part of itself. One example of self-similar graph structure is communities.
Definition: Let T be a tree with N leaves, height H, and constant fanout b. Let h(v, w) be the tree distance between leaves v and w, i.e., the height of the smallest subtree of T that contains both v and w.
Algorithm: We construct a random graph with N nodes where the probability that there is an edge between v and w is f(h(v, w)). The function f is called the difficulty function.