Mit14 15s22 Lec2
Alexander Wolitzky
MIT
Plan
Types of networks:
I Social and economic networks: nodes are people or groups of
people.
I Friendship networks, business relationships between firms,
intermarriages between families, employment relations in the
labor market
I Information networks: nodes are “information objects”
I Web links, citation network between academic articles,
semantic/classification networks (e.g., taxonomies)
I ...
Types of Networks in the Real World (cntd.)
I Technological networks
I Infrastructure networks like internet, power grid, transportation
networks
I Temporary networks like sensor networks, autonomous vehicles
I Biological networks
I Food web, protein interaction network, neural network,
network of metabolic pathways
History of Study of Graphs/Networks
Historical study of networks:
I Mathematical graph theory: central part of discrete math
I Started with Euler’s 1735 solution to the Königsberg bridge
problem.
I Social network analysis in sociology.
I Typical studies involved circulation of questionnaires, leading
to relatively small networks; also little focus on individual
behavior.
The length of a walk (or path) is the number of edges in the walk
(or path).
I The distance between nodes i and j is the length of a
geodesic (a shortest path) between them (or ∞ if no such path exists).
For directed graphs, the same definitions hold with directed edges
(in which case we say “a path from node i to node j”).
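The distance definition above can be sketched with a breadth-first search. This is a minimal illustration, not part of the original slides: the function name `distances_from`, the adjacency-list representation, and the example graph are all assumptions made for the sketch.

```python
from collections import deque
import math

def distances_from(adj, source):
    """Return {node: distance from source}, following the edges in adj.

    adj maps each node to the list of nodes it links to (assumed format).
    Unreachable nodes keep distance infinity, matching the definition above.
    """
    dist = {node: math.inf for node in adj}
    dist[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == math.inf:      # first visit gives a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical directed graph: 0 -> 1 -> 2, plus an isolated node 3.
adj = {0: [1], 1: [2], 2: [], 3: []}
d = distances_from(adj, 0)
```

In this example node 2 is at distance 2 from node 0, while node 0 is at infinite distance from node 2, because the directed edges cannot be followed backwards.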
Powers of the Adjacency Matrix
The powers of the adjacency matrix contain useful information
about walks and paths.
You’ll see more ways of using the adjacency matrix on the pset.
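One standard fact behind this statement is that entry (i, j) of the k-th power of the adjacency matrix counts the walks of length k from i to j. The sketch below checks this on a small hypothetical graph using NumPy; the graph and the variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical undirected path graph 0 - 1 - 2.
g = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

# Entry (i, j) of g^k counts walks of length k from i to j.
g2 = np.linalg.matrix_power(g, 2)
g3 = np.linalg.matrix_power(g, 3)
```

Here `g2[1, 1]` is 2, for the walks 1-0-1 and 1-2-1, and `g3[0, 1]` is 2, for the walks 0-1-0-1 and 0-1-2-1.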
Connectivity and Components
An undirected graph is connected if for every two nodes there
exists a path between them.
A directed graph is
I connected if the underlying undirected graph is connected
(i.e. ignoring the directions of the edges).
I strongly connected if each node can reach every other node
by a “directed path”.
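The two notions of connectivity for directed graphs can be checked with a simple graph search. The following is a sketch under stated assumptions: graphs are non-empty dictionaries mapping every node to the list of nodes it links to, and the helper names are hypothetical.

```python
def reachable(adj, source):
    """Set of nodes reachable from source by following edges in adj."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_strongly_connected(adj):
    """Every node can reach every other node by a directed path."""
    return all(reachable(adj, s) == set(adj) for s in adj)

def is_connected(adj):
    """Connectivity of the underlying undirected graph."""
    undirected = {u: set() for u in adj}
    for u, nbrs in adj.items():
        for v in nbrs:
            undirected[u].add(v)   # keep the edge
            undirected[v].add(u)   # and add its reverse
    start = next(iter(undirected))
    return reachable(undirected, start) == set(undirected)

# Hypothetical examples: a directed chain and a directed cycle.
chain = {0: [1], 1: [2], 2: []}
cycle = {0: [1], 1: [2], 2: [0]}
```

The chain is connected but not strongly connected (node 2 cannot reach node 0), while the cycle satisfies both definitions.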
Degree Distributions (cntd.)
Two types of degree distributions for random graph models:
I $P(d) \le c e^{-\alpha d}$ for some constants $\alpha > 0$ and $c > 0$:
the tails of the distribution fall off exponentially (or faster):
large degrees are very unlikely.
I $P(d) = c d^{-\gamma}$ for some constants $\gamma > 0$ and $c > 0$:
called a power-law distribution, the tails of the distribution
are “fat”: large degrees are much less unlikely.
I (Approximate) power laws appear in many settings, including
distributions of income, city populations, and internet traffic.
I Also known as a scale-free distribution: a distribution that is
unchanged (within a multiplicative factor) under a rescaling of
the variable.
I Appear linear on a log-log plot.
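The last two properties follow directly from the power-law form. As a short worked check (the rescaling factor $k > 0$ is a symbol introduced here for illustration):

```latex
% Scale-free: rescaling d by k only multiplies P by a constant factor.
P(kd) = c\,(kd)^{-\gamma} = k^{-\gamma}\, c\, d^{-\gamma} = k^{-\gamma} P(d)

% Log-log linearity: taking logs gives a line with slope -\gamma.
\log P(d) = \log c - \gamma \log d
```

By contrast, an exponential tail $c e^{-\alpha d}$ gives $\log P(d) = \log c - \alpha d$, which is linear in $d$ rather than in $\log d$.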
The average path length is the average distance between any two
nodes:
$$\text{average path length} = \frac{\sum_{i \neq j} \ell(i, j)}{n(n-1)}.$$
The average clustering coefficient is $Cl^{Avg}(g) = \frac{1}{n} \sum_i Cl_i(g)$.
Consider the undirected “windmill” network, where everyone is
linked to the center and one other node.
I Average clustering is close to 1, because $Cl_i(g) = 1$ for
everyone except the center.
I Overall clustering is close to 0, because the vast majority of
potential triangles consist of the center and two individuals
who are not linked.
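The windmill comparison can be reproduced numerically. The sketch below assumes the standard definitions: individual clustering is the share of pairs of a node's neighbors that are themselves linked, and overall clustering is the total number of linked neighbor pairs divided by the total number of neighbor pairs, summed over all nodes. The function names and the choice of 50 pairs are illustrative assumptions.

```python
from itertools import combinations

def windmill(m):
    """Center node 0 plus m linked pairs (2k-1, 2k), each linked to the center."""
    nbrs = {i: set() for i in range(2 * m + 1)}

    def link(a, b):
        nbrs[a].add(b)
        nbrs[b].add(a)

    for k in range(1, m + 1):
        a, b = 2 * k - 1, 2 * k
        link(0, a)
        link(0, b)
        link(a, b)
    return nbrs

def triple_counts(nbrs, i):
    """Return (linked pairs of i's neighbors, all pairs of i's neighbors)."""
    pairs = list(combinations(sorted(nbrs[i]), 2))
    closed = sum(1 for j, k in pairs if k in nbrs[j])
    return closed, len(pairs)

def average_clustering(nbrs):
    """Mean of the individual clustering coefficients (0 if fewer than 2 neighbors)."""
    vals = []
    for i in nbrs:
        closed, total = triple_counts(nbrs, i)
        vals.append(closed / total if total else 0.0)
    return sum(vals) / len(vals)

def overall_clustering(nbrs):
    """Linked neighbor pairs over all neighbor pairs, pooled across nodes."""
    counts = [triple_counts(nbrs, i) for i in nbrs]
    return sum(c for c, _ in counts) / sum(t for _, t in counts)

g = windmill(50)                  # 101 nodes in total
avg = average_clustering(g)       # roughly 0.99
overall = overall_clustering(g)   # exactly 3/101, roughly 0.03
```

With m pairs, the center's own coefficient is 1/(2m - 1) and overall clustering works out to 3/(2m + 1), so the gap between the two measures widens as the windmill grows.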
Centrality Measures
There are several measures that capture some notion of the
“centrality” or “importance” of a node in a network.
I Different measures capture different notions of centrality,
which matter for answering different questions.
© ACM. All rights reserved. This content is excluded from our Creative Commons
license. For more information, see https://ocw.mit.edu/help/faq-fair-use/
The Friendship Network at a US High School
Measuring Homophily
Introducing Eigenvector Centrality (time permitting)
The simplest measure is eigenvector centrality: a non-zero vector
$C = (C_i)_{i \in N}$ such that, for some scalar $\lambda > 0$, we have
$$\lambda C_i = \sum_{j \in N} g_{ji} C_j \quad \text{for all } i \in N.$$
When is Eigenvector Centrality Well-Defined?
When is Eigenvector Centrality Well-Defined? (cntd.)
In matrix form, the equation for the $C_i$'s is
$$\lambda C = g^T C,$$
where $\lambda$ is a scalar, $C$ is an $n \times 1$ vector, and $g^T$ is the transpose of
the $n \times n$ adjacency matrix (transposed because, for directed
graphs, we care about the nodes that link to you, not the nodes
you link to).
I That is, $C$ is an eigenvector of $g^T$, with $\lambda$ the corresponding
eigenvalue.
I The Perron-Frobenius theorem of linear algebra says that, for
every irreducible non-negative matrix, its largest eigenvalue is
positive, and all the components of the corresponding
eigenvector are also positive.
I The adjacency matrix of a strongly connected network is
irreducible, and so is its transpose $g^T$.
I So, if we let $\lambda$ be the largest eigenvalue of $g^T$, the
corresponding eigenvector $C$ can be chosen with all components
positive.
I Thus, for any strongly connected network, the eigenvector
centrality vector $C$ is well-defined.
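As a numerical illustration of the argument above, the sketch below computes the largest eigenvalue of the transposed adjacency matrix and its eigenvector with NumPy. The three-node graph, the convention that `g[i, j] = 1` means i links to j, and the normalization of C to sum to 1 are assumptions made for this example.

```python
import numpy as np

# Hypothetical strongly connected directed graph:
# 0 -> 1, 0 -> 2, 1 -> 0, 2 -> 1   (g[i, j] = 1 means i links to j).
g = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)

# Eigen-decomposition of the transpose, as in lambda * C = g^T C.
eigvals, eigvecs = np.linalg.eig(g.T)

k = np.argmax(eigvals.real)      # index of the largest eigenvalue
lam = eigvals[k].real
C = eigvecs[:, k].real
C = C / C.sum()                  # fix sign and scale so entries sum to 1
```

For this graph the largest eigenvalue is the real root of $\lambda^3 = \lambda + 1$, about 1.3247, and node 1, which is the only node receiving two links, gets the highest centrality.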
Interpretation as Long-Run Population Shares
A useful interpretation of eigenvector centrality as the long-run
outcome of a reproduction process (which also explains why it’s
always well-defined for strongly connected networks):
I Suppose a “virus” starts at a random node in the graph.
I In each period, every virus sends one copy of itself along each
link from the node where it is located. Then it dies.
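I (So there’s 1 virus in period 1, $|N_i|$ viruses in period 2,
$\sum_{j \in N_i} |N_j|$ viruses in period 3, etc., where $i$ is the starting node and $N_i$ is the set of nodes that $i$ links to.)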
I Letting this process run forever, the virus never dies out
(because the network is strongly connected), and we can
calculate the long-run fraction of viruses located at each node.
I The long-run fraction of viruses located at node i equals Ci .
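The reproduction process can be simulated directly by repeatedly applying the transposed adjacency matrix to the vector of virus counts. This is a sketch under stated assumptions: the three-node graph is hypothetical, the virus is placed at node 0 rather than at a random node, and the graph was chosen to contain cycles of lengths 2 and 3 so that the shares settle down rather than oscillate.

```python
import numpy as np

# Hypothetical strongly connected directed graph:
# 0 -> 1, 0 -> 2, 1 -> 0, 2 -> 1   (g[i, j] = 1 means i links to j).
g = np.array([[0, 1, 1],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)

# counts[i] = number of viruses currently at node i; start with one at node 0.
counts = np.array([1.0, 0.0, 0.0])

# Record the total number of viruses in periods 1, 2 and 3.
totals = []
for _ in range(3):
    totals.append(int(counts.sum()))
    counts = g.T @ counts        # each virus copies itself along every out-link

# Keep iterating, tracking shares instead of counts to avoid large numbers.
shares = counts / counts.sum()
for _ in range(200):
    shares = g.T @ shares
    shares = shares / shares.sum()
```

The recorded totals are 1, 2 and 2, matching the counts described above for a start at node 0, and the final `shares` vector satisfies the eigenvector equation with the growth factor of the population playing the role of $\lambda$.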
Theorem
For every irreducible non-negative matrix $A$, its largest eigenvalue
$r_1$ is a positive real number, and the components of the
corresponding eigenvector $v_1$ are also all positive.
Intuition for the Perron-Frobenius Theorem
I Fix any non-negative vector $x(0) \in \mathbb{R}^n$. Suppose that we can
write it as a linear combination of the eigenvectors $v_i$ of $A$:
$$x(0) = \sum_i c_i v_i.$$
Just like with the viruses, the limiting vector $x(\infty)$ is proportional
to the eigenvector associated with the largest eigenvalue. This vector
defines eigenvector centrality.
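The step from the eigenvector decomposition to this limit can be sketched as follows. The update rule $x(t) = A\,x(t-1)$, the labels $r_i$ for the eigenvalue belonging to $v_i$, and the conditions $c_1 \neq 0$ and $|r_i| < r_1$ for all $i \geq 2$ are assumptions of this sketch rather than statements taken from the text above.

```latex
% Apply A repeatedly to the decomposition of x(0):
x(t) = A^t x(0) = \sum_i c_i\, r_i^{\,t}\, v_i

% Divide by the dominant growth factor r_1^t:
\frac{x(t)}{r_1^{\,t}} = c_1 v_1 + \sum_{i \geq 2} c_i \left(\frac{r_i}{r_1}\right)^{t} v_i
\;\longrightarrow\; c_1 v_1 \quad \text{as } t \to \infty
```

Every term with $|r_i| < r_1$ shrinks geometrically, so only the direction $v_1$ survives after rescaling.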
MIT OpenCourseWare
https://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.