Graph and Patterns

The document outlines a lecture on graph theory and its applications in data science, focusing on the motivations for studying graphs, various graph concepts, and network generation. It discusses real-world networks, properties of graphs, and models for complex networks, including power-law distributions and the concept of 'small world' phenomena. Additionally, it highlights the importance of understanding graph patterns and their implications in social behavior and information propagation.


Algorithm Foundations of Data Science

Lecture 3: Graph and Patterns

MING GAO

DaSE@ECNU
(for course related communications)
[email protected]

Mar. 28, 2018

Outline

1 Graph
Motivations
Patterns

2 Graph Concepts
Graph types
Properties
Graph Modeling

3 Network Generation

MING GAO (DaSE@ECNU) Algorithm Foundations of Data Science Mar. 28, 2018 2 / 47
Graph Motivations

Graphs - why should we care?

Networks in real world

“YahooWeb graph”: 1B vertices (Web sites), 6B edges (HTTP links)
Facebook, Twitter, etc.: more than 1B users
Food web: all species, linked by food chains
Power-grid: vertices (plants or consumers), edges (power lines)
Airline route: vertices (airports), edges (flights)
Adoption: users purchase products, adopt services, etc.
Motivation questions
Questions
What do real graphs look like?
What properties of vertices, edges are important to model?
What local and global properties are important to measure?
Are graphs helpful to understand the real world?
Social influence
Recommendation
Information propagation
Human behaviors
Is a sub-graph “normal”? (Detection of paid posters (“water army”), fraud detection, spam filtering, etc.)
How to generate realistic graphs?
How to get a “good” sample of a network?
Models for complex networks

Steven H. S. proposed models for complex networks in Nature, 2001.
Regular network: each node has exactly the same number of edges.
Random network: obtained by starting with a set of n isolated vertices and adding successive edges between them at random.
Scale-free network: grows by attaching new nodes to existing nodes at random, with probability proportional to the degree of the target node. Richly connected nodes tend to get richer, leading to the formation of hubs and a skewed degree distribution with a heavy tail (Matthew effect or Pareto’s law).

Are real graphs random?

Looks random - right?

What does the Internet look like? Any rules?

Diameter: would you like to guess?

In- and outdegree distributions: if average degree is 2, what
is the most probable degree?
Other (surprising) patterns?

Graph Patterns

Power-law I

Internet topology

The out-degree distribution is plotted in log-log scale.
It forms a line with slope ≈ −2.15, i.e., $\text{freq.} \propto \text{deg.}^{-2.15}$
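
The slope of such a log-log plot can be estimated with an ordinary least-squares fit of log(freq.) against log(deg.). The sketch below is only an illustration on synthetic data: the function name `loglog_slope` and the sample values are assumptions, not part of the lecture, and a least-squares fit on log-log data is a rough estimate rather than a rigorous test for a power law.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var

# Synthetic (hypothetical) data following freq = 1000 * deg^(-2.15).
degrees = [1, 2, 4, 8, 16, 32]
freqs = [1000 * d ** -2.15 for d in degrees]
slope = loglog_slope(degrees, freqs)  # recovers the exponent of the synthetic data
```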

What is the power-law?

Due to the Matthew effect, Pareto’s law, “rich-get-richer”, or the 80/20 principle, there are many settings with a power law (Zipf’s law).
80% of Italy’s land is owned by 20% of the population.
The richest 20% obtain 82.70% of income.
Bible: rank vs. frequency (log-log)
Web: hit count vs. volume
File: count vs. size
Publication: citation vs. count
Business
80% of a company’s profits come from 20% of customers.
80% of a company’s complaints come from 20% of customers.
80% of a company’s profits come from 20% of the time staff spend.
80% of a company’s sales are made by 20% of sales staff.

Power-law II

Rank of out-degrees

Vertices are ranked in decreasing out-degree order, and plotted in log-log scale.
It forms a line with slope ≈ −0.74, i.e., $\text{deg.} \propto \text{rank}^{-0.74}$

Power-law III

Rank of eigenvalues

Eigenvalues of the adjacency matrix (top 20) are ranked in decreasing order, and plotted in log-log scale.
It forms a line with slope ≈ −0.48, i.e., $\text{eigen.} \propto \text{rank}^{-0.48}$

Power-law IV

Hop plot

How many neighbors within 1, 2, ..., h hops? ($\sum_{i=1}^{h} \text{avg.}_i$)
Pairs of vertices are plotted in log-log scale. It forms a line with slope ≈ 2.83, i.e., $\text{pairs} \propto \text{hop}^{2.83}$
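
The counts behind a hop plot can be obtained with one breadth-first search per vertex. The sketch below is a minimal illustration, assuming an unweighted graph stored as a dictionary of neighbor lists; the names `bfs_distances` and `hop_plot` and the toy path graph are hypothetical, and counting ordered pairs is a convention chosen here.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def hop_plot(adj, max_hops):
    """pairs[h-1] = number of ordered pairs (u, v), u != v, within h hops."""
    pairs = [0] * max_hops
    for u in adj:
        for v, d in bfs_distances(adj, u).items():
            if v != u:
                # A pair at distance d is within h hops for every h >= d.
                for h in range(d, max_hops + 1):
                    pairs[h - 1] += 1
    return pairs

# Toy example: the path 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
counts = hop_plot(path, 3)
```

Plotting `counts` against h = 1, 2, ... on log-log axes gives the hop plot; on this tiny path the values are too few to show a power law.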

Power-law V

Counting of triangles

X-axis: # of triangles a vertex participates in

Y-axis: count of such vertices
In log-log scale, the plot is almost linear.


Triangle law
How to count # triangles?

Naive algorithm: 3-way join ($O(n^3)$).
# triangles = $\frac{1}{6}\sum_{i=1}^{n} \lambda_i^3$. Why?
Because of skewness, we only need the top few eigenvalues, computed via the Lanczos algorithm.
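
One way to see the identity: the diagonal entry $(A^3)_{ii}$ counts closed walks of length 3 starting at vertex i, each triangle is counted 6 times (3 starting vertices, 2 directions), and $\mathrm{tr}(A^3) = \sum_i \lambda_i^3$. The sketch below checks the trace form against a naive count on a toy graph. It uses plain matrix multiplication rather than an eigenvalue solver, and the helper names and the example graph are assumptions made for illustration.

```python
from itertools import combinations

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def triangles_by_trace(A):
    """tr(A^3) / 6, which equals (1/6) * sum of cubed eigenvalues of A."""
    A3 = matmul(matmul(A, A), A)
    return sum(A3[i][i] for i in range(len(A))) // 6

def triangles_naive(A):
    """Check every vertex triple (the O(n^3) approach)."""
    n = len(A)
    return sum(1 for i, j, k in combinations(range(n), 3)
               if A[i][j] and A[j][k] and A[i][k])

# Toy graph: a 4-clique on vertices 0..3 plus vertex 4 attached to vertex 0.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (0, 4)]
A = [[0] * 5 for _ in range(5)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

by_trace = triangles_by_trace(A)
by_enumeration = triangles_naive(A)
```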

Erdős number

Small world - six degrees of separation

The world looks “small” when you think of how short a path of friends it takes to get from you to almost anyone else. Stanley Milgram and his colleagues ran an experiment in the 1960s.
296 randomly chosen “starters” were asked to forward a letter to a “target” person, a stockbroker in a Boston suburb.
The six degrees of separation were also found by Jure Leskovec on Microsoft Instant Messenger.

Shrinking diameter

Citation or patent networks

For the citation network, citations among physics papers were collected.
11 years of data
29,555 papers
352,807 citations
For each month, create a graph of all citations up to that month.
The diameters are plotted in the figures.
Temporal evolution of graphs

Question

Let N(t) and E(t) be the # nodes and # edges at time t, respectively.
Suppose that N(t + 1) = 2N(t); what is your guess for E(t + 1)?
It more than doubles, obeying $E(t) \sim N(t)^{\alpha}$ for all t, where $1 < \alpha < 2$.
For a tree, α = 1; for a clique, α = 2.
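
A short worked step makes "more than doubles" concrete. It assumes the power law holds exactly with a constant c, which is introduced here only for illustration:

```latex
E(t) = c\,N(t)^{\alpha}
\;\Longrightarrow\;
E(t+1) = c\,\bigl(2N(t)\bigr)^{\alpha} = 2^{\alpha}\,E(t),
\qquad 2 < 2^{\alpha} < 4 \quad \text{for } 1 < \alpha < 2 .
```

So doubling the nodes multiplies the edges by a factor strictly between 2 (tree-like growth) and 4 (clique-like growth).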

Dunbar’s number
Why do primates have unusually big brains?
Social group size (and a lot of social behaviour as well) correlates with relative neocortex volume.
Our relationships form a hierarchically inclusive series of circles of increasing size but decreasing intensity.
150 is the limit on the number of reciprocated relationships.
Graph Concepts Graph types

Graph types
Undirected graph

An undirected graph on 4 vertices

Degree: # edges connected to the vertex
Degree 0 vertex: isolated vertex

Directed graph

A directed graph on 4 vertices

In-degree: # incoming edges to the vertex
Out-degree: # outgoing edges from the vertex
Degree: in-degree + out-degree

Graph types cont.

Signed graph

A signed graph on 3 vertices

Positive-degree: # edges associated
with positive labels
Negative-degree: # edges associated
with negative labels

Bipartite graph

Users interact on social platforms

Reply network
Retweet network
Adoption network

Graph Concepts Properties

Paths
Path
A path is a sequence of nodes with the property that each consecutive pair in the sequence is connected by an edge.
A simple path does not repeat nodes.
The length of a path is the number of edges in the path.


Cycle

A cycle is a path with at least three edges, in which the first and last nodes are the same.
Every edge in the 1970 Arpanet belongs to a cycle, and this was by design. Why?

Connectivity

Connected component

A connected component is a subset of nodes such that:
Every node in the subset has a path to every other; and
The subset is not part of some larger set with the property that every node can reach every other.
A graph is connected if for every pair of nodes there is a path between them, i.e., the whole graph is a connected component.
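
The definition translates directly into a search: start from an unvisited node, collect everything reachable from it, and repeat. The sketch below is a minimal illustration for an undirected graph stored as a dictionary of neighbor lists; the function name and the toy graph are assumptions.

```python
from collections import deque

def connected_components(adj):
    """Return the connected components of an undirected graph as sorted lists."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        component = []
        queue = deque([start])
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(component))
    return components

# Toy graph with three components; vertex 5 is isolated.
graph = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
components = connected_components(graph)
is_connected = len(components) == 1
```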

Strongly connected component

A directed graph is strongly connected if there is a path from every node to every other node.
Edges of the path must follow the forward direction.
An undirected graph can be treated as a bidirectional graph. Thus a connected component in an undirected graph is also an SCC.
In a strongly connected component, each node has both followers and followees.
SCCs can be treated as super-nodes.
Giant component

Giant connected component

A connected component that contains a significant fraction of all the nodes.
When a network (e.g., a friendship network) contains a giant component, it almost always contains only one.
The other connected components are very small by comparison.
The largest connected component would break apart into three distinct components if this node were removed [related to robustness of network].

Web giant component

Web graph

The Web contains a giant strongly connected component (containing the home pages of many of the major commercial, governmental, and non-profit organizations).
IN: nodes that can reach the giant SCC but cannot be reached from it, i.e., nodes that are “upstream” of it.
Distance and diameter

Distance or Geodesic distance

The distance between two vertices in a graph is the number of edges in a shortest path.
Diameter is the length of the “longest shortest path” between any two vertices of a graph.
The Erdős number is bounded by the diameter of the graph.
Mean Geodesic distance of undirected networks

Definition

$$L = \frac{1}{\frac{1}{2} n(n+1)} \sum_{i \ge j} d_{ij},$$

where n denotes the # of nodes, and $d_{ij}$ is the shortest distance between nodes i and j.
The mean geodesic distance includes the distance from each node to itself.
It can be computed in O(mn) using breadth-first search, where m denotes the # of edges.
What happens if the network has multiple connected components?

$$L^{-1} = \frac{1}{\frac{1}{2} n(n+1)} \sum_{i \ge j} d_{ij}^{-1}$$
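
Both averages can be computed with one breadth-first search per node, which gives the O(mn) cost mentioned above. The sketch below is a minimal illustration under stated assumptions: the graph is unweighted and stored as a dictionary of neighbor lists, the function names are hypothetical, and in the harmonic form the i = j terms are skipped and unreachable pairs contribute 0 (treating their distance as infinite), which is one common reading of the formula rather than something the slides spell out.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_geodesic(adj):
    """L = sum_{i >= j} d_ij / (n(n+1)/2); assumes a connected graph."""
    nodes = list(adj)
    n = len(nodes)
    total = 0
    for idx, u in enumerate(nodes):
        dist = bfs_distances(adj, u)
        total += sum(dist[v] for v in nodes[:idx + 1])  # includes d_ii = 0
    return total / (n * (n + 1) / 2)

def harmonic_mean_geodesic(adj):
    """Inverse of (sum_{i > j} 1/d_ij) / (n(n+1)/2); needs at least one edge."""
    nodes = list(adj)
    n = len(nodes)
    inv_total = 0.0
    for idx, u in enumerate(nodes):
        dist = bfs_distances(adj, u)
        inv_total += sum(1 / dist[v] for v in nodes[:idx] if v in dist)
    return (n * (n + 1) / 2) / inv_total

path = {0: [1], 1: [0, 2], 2: [1]}            # connected path 0 - 1 - 2
split = {0: [1], 1: [0, 2], 2: [1], 3: []}    # same path plus an isolated vertex
L_mean = mean_geodesic(path)
L_harmonic = harmonic_mean_geodesic(path)
L_harmonic_split = harmonic_mean_geodesic(split)
```

The harmonic form stays finite on the disconnected graph, whereas the plain mean is undefined there because some distances do not exist.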
Summarization

Graph Concepts Graph Modeling

Adjacency matrix

Definition
Given a finite graph G = (V , E ), an adjacency matrix A is a |V |×|V |
matrix, whose elements indicate whether pairs of vertices are adjacent
or not in the graph.
The adjacency matrix is a (0,1)-matrix with zeros on its
diagonal.
If the graph is undirected, the adjacency matrix is symmetric.
The adjacency matrix A of a bipartite graph whose two parts have r and s vertices can be written in the form

$$A = \begin{pmatrix} 0_{r,r} & B \\ B^{T} & 0_{s,s} \end{pmatrix}.$$


Storing a graph
Adjacency lists

An adjacency list is a collection of unordered lists used to represent a graph G. Each list describes the set of neighbors of a vertex in the graph.
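
The two storage schemes can be built from the same edge list. The sketch below is a minimal illustration for a simple undirected graph with vertices numbered 0..n-1; the function names and the example edges are assumptions.

```python
def adjacency_list(n, edges):
    """One neighbor list per vertex; memory grows with n + m."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def adjacency_matrix(n, edges):
    """Symmetric (0,1)-matrix with zero diagonal; memory grows with n^2."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1
    return A

# Hypothetical undirected graph on 4 vertices.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
adj = adjacency_list(4, edges)
A = adjacency_matrix(4, edges)

# The degree of a vertex is the length of its list, or equivalently its row sum.
degrees = [len(adj[v]) for v in range(4)]
```

For sparse graphs the list form is usually the more economical choice, while the matrix form gives constant-time adjacency tests.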

Random walk of a graph

Markov chain
Suppose that G = (V, E) is a graph of n vertices with vertex set V and edge set E ⊂ V × V. Let N(x) = {y | (x, y) ∈ E}, and denote the degree of vertex x by d(x) = |N(x)|.
Note that x is an isolated vertex if N(x) = ∅.
If G is an undirected graph, then (x, y) ∈ E if and only if (y, x) ∈ E.
For each x ∈ V, the transition probability is $P(y \mid x) = \frac{1}{d(x)}$ if y ∈ N(x), and P(y | x) = 0 otherwise.
Let X be a random walk on G; if G is connected, then X is irreducible.
X has period 2 if and only if G is bipartite, in which case the parts are the cyclic classes of X.
Let $D = \mathrm{diag}(d_1, d_2, \cdots, d_n)$ be a diagonal matrix; then $P = D^{-1}A$.
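
The transition matrix can be written down row by row from the neighbor sets. The sketch below is a minimal illustration that assumes an undirected graph with no isolated vertices; the function name and the toy graph are hypothetical. It also checks the standard fact, not stated on the slides, that the distribution proportional to degree is stationary for this walk.

```python
from fractions import Fraction

def transition_matrix(adj):
    """P[x][y] = 1/d(x) if y is a neighbor of x, else 0 (vertices in sorted order)."""
    nodes = sorted(adj)
    P = []
    for x in nodes:
        d = len(adj[x])  # assumes d >= 1, i.e. no isolated vertices
        P.append([Fraction(1, d) if y in adj[x] else Fraction(0) for y in nodes])
    return P

# Hypothetical connected graph on vertices 0..3.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
P = transition_matrix(graph)
row_sums = [sum(row) for row in P]

# Candidate stationary distribution: pi(x) = d(x) / (2 * number of edges).
two_m = sum(len(nbrs) for nbrs in graph.values())
pi = [Fraction(len(graph[x]), two_m) for x in sorted(graph)]
pi_next = [sum(pi[x] * P[x][y] for x in range(4)) for y in range(4)]
```

Exact fractions are used so that the checks `row_sums == 1` and `pi_next == pi` hold without rounding error.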

Combinatorial Laplacian of graph

Definition
Given a graph G, the (combinatorial) Laplacian of G is L = D − A, i.e.,

$$L(u, v) = \begin{cases} d_v, & \text{if } u = v; \\ -1, & \text{if } u \text{ and } v \text{ are adjacent}; \\ 0, & \text{otherwise.} \end{cases}$$

If G is an undirected graph and its Laplacian matrix L has eigenvalues $\lambda_0 \le \lambda_1 \le \cdots \le \lambda_{n-1}$, then
L is singular and symmetric (there exists $\lambda_i = 0$).
Since every row sum and column sum of L is zero, $\lambda_0 = 0$ and $v_0 = (1, 1, \cdots, 1)$.
The second smallest eigenvalue is called the algebraic connectivity.
Incidence matrix
Definition
An incidence matrix B is a |V |×|E | matrix that shows the relationship
between vertices and edges of graph G = (V , E ).
Each column corresponds to an edge e = (vi , vj ) (with i < j),
where the value of an entry is 1 in the row corresponding to vi ,
and entry −1 in the row corresponding to vj .
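
The incidence matrix and the combinatorial Laplacian are linked by $L = BB^{T}$, the factorization that the normalized Laplacian also relies on. The sketch below builds both for a toy graph and checks the identity together with the zero row sums of L. It assumes a simple undirected graph (no self-loops or repeated edges); the function names and the example are hypothetical.

```python
def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of a simple undirected graph."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L

def incidence_matrix(n, edges):
    """|V| x |E| matrix: +1 at the smaller endpoint, -1 at the larger one."""
    B = [[0] * len(edges) for _ in range(n)]
    for col, (u, v) in enumerate(edges):
        i, j = min(u, v), max(u, v)
        B[i][col] = 1
        B[j][col] = -1
    return B

def times_transpose(B):
    """Compute B * B^T as dot products between rows of B."""
    n = len(B)
    return [[sum(a * b for a, b in zip(B[r], B[s])) for s in range(n)]
            for r in range(n)]

edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
L = laplacian(4, edges)
B = incidence_matrix(4, edges)
BBt = times_transpose(B)
row_sums = [sum(row) for row in L]
```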
Normalized Laplacian of graph

Definition

Given a graph G, the normalized Laplacian of G is $\mathcal{L} = D^{-1/2} L D^{-1/2}$, i.e.,

$$\mathcal{L}(u, v) = \begin{cases} 1, & \text{if } u = v; \\ -\dfrac{1}{\sqrt{d_u d_v}}, & \text{if } u \text{ and } v \text{ are adjacent}; \\ 0, & \text{otherwise.} \end{cases}$$

$\mathcal{L} = D^{-1/2} B B^{T} D^{-1/2} = I - D^{-1/2} A D^{-1/2} = D^{1/2} (I - P) D^{-1/2}$. Thus, $\mathcal{L}$ is positive semidefinite and $0 \le \lambda(\mathcal{L}) \le 2$.
$\mathcal{L}$ is singular and symmetric, and $\lambda_0 = 0$ corresponds to the eigenvector $D^{1/2} v_0^{T} = D^{1/2} (1, 1, \cdots, 1)^{T}$.
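
The entrywise definition is easy to check numerically. The sketch below builds the normalized Laplacian of a toy graph from its adjacency matrix and verifies that $D^{1/2}(1, \cdots, 1)^{T}$ is mapped to zero, i.e. that it is an eigenvector for eigenvalue 0. It assumes an undirected graph with no isolated vertices; the function name and the example matrix are hypothetical.

```python
import math

def normalized_laplacian(A):
    """Entrywise normalized Laplacian; assumes every vertex has degree >= 1."""
    n = len(A)
    deg = [sum(row) for row in A]
    N = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u == v:
                N[u][v] = 1.0
            elif A[u][v]:
                N[u][v] = -1.0 / math.sqrt(deg[u] * deg[v])
    return N, deg

# Hypothetical graph: triangle 0-1-2 with a pendant vertex 3 attached to 2.
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
N, deg = normalized_laplacian(A)

# Candidate eigenvector for eigenvalue 0: D^{1/2} applied to the all-ones vector.
v0 = [math.sqrt(d) for d in deg]
residual = [sum(N[u][v] * v0[v] for v in range(4)) for u in range(4)]
```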


Properties of normalized Laplacian [WebScience 2013]

Properties

The eigenvalues of the normalized Laplacian matrix of a graph G with n vertices satisfy the following properties:
$0 \le \lambda_2 \le \frac{n}{n-1} \le \lambda_n \le 2$.
$\lambda_2 = \cdots = \lambda_n = \frac{n}{n-1}$ if and only if G is a clique.
$\lambda_n = 2$ if and only if G is a bi-clique.
G has at least i connected components if and only if $\lambda_j = 0$ for $j = 1, 2, \cdots, i$.
The mean of the eigenvalues $\lambda_2, \lambda_3, \cdots, \lambda_n$ of a network G with n vertices is $\frac{n}{n-1}$.
The variance of the eigenvalues $\lambda_2, \lambda_3, \cdots, \lambda_n$ of a network G with n vertices is $\frac{1}{n-1} \sum_{i=1}^{n} \sum_{j \ne i} \frac{A_{ij}}{d(v_i) d(v_j)} - \frac{n}{(n-1)^2}$ (R-energy).

Network Generation

Network generation

Generators
Erdős-Rényi model
Preferential attachment
Variations + extensions
Copying model
Triad-closing
Butterfly model
Recursion - Kronecker generator


Random network generator: Erdős-Rényi model

The Erdős-Rényi model is known as the random graph model; it generates undirected random graphs.
Parameters: N (# vertices) and p (prob. of forming an edge)
For each possible node pair, the approach generates an edge with probability p. Thus, the expected # edges = $\frac{pN(N-1)}{2}$.
Degree distribution: $P(\text{node has degree } k) = \binom{N-1}{k} p^{k} (1-p)^{N-1-k}$
It follows a binomial distribution with mean (N − 1)p and variance (N − 1)p(1 − p) (not a power-law distribution).
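
The generator described above fits in a few lines: visit every unordered node pair once and keep it with probability p. The sketch below is a minimal illustration; the function name, the seed, and the parameter values are assumptions chosen for the example, and the quadratic loop is only practical for small N.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Return the edge list of an undirected G(n, p) random graph."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

n, p = 200, 0.05
edges = erdos_renyi(n, p, seed=42)

# Compare the sample with the expectations stated above.
expected_edges = p * n * (n - 1) / 2   # pN(N-1)/2
expected_mean_degree = (n - 1) * p     # (N-1)p

degree = [0] * n
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
mean_degree = sum(degree) / n
```

On a single sample the edge count and mean degree fluctuate around their expectations; a degree histogram of such a graph is concentrated near (N − 1)p rather than heavy-tailed.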


Scale-free network generator

Preferential attachment model
The more connected a node is, the more likely it is to receive new links (namely, rich gets richer, Matthew effect, Pareto’s law, etc.).
Price model
Barabási-Albert model

Price model for citation networks

Each new paper is generated with m citations (mean).
New papers cite previous papers with probability proportional to their in-degree (citations).
Power law with exponent $\alpha = 2 + \frac{1}{m}$ [Science 1965]
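
A citation process of this kind can be simulated in a few lines. The sketch below is a simplified illustration, not Price's exact formulation: each new paper cites exactly m distinct earlier papers (the slides give m as a mean), and the citation probability is taken proportional to in-degree + 1 so that papers with no citations yet can still be chosen. Both simplifications, the function name, and the parameter values are assumptions.

```python
import random

def price_model(n_papers, m, seed=0):
    """Grow a citation network; returns (in_degree list, list of (citing, cited))."""
    rng = random.Random(seed)
    in_degree = [0]          # start from a single paper with no citations
    citations = []
    for new in range(1, n_papers):
        # Assumed attachment weights: in-degree + 1 for every existing paper.
        weights = [d + 1 for d in in_degree]
        k = min(m, new)      # early papers cannot cite more papers than exist
        targets = set()
        while len(targets) < k:
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            in_degree[t] += 1
            citations.append((new, t))
        in_degree.append(0)
    return in_degree, citations

in_degree, citations = price_model(n_papers=500, m=3, seed=7)
max_in_degree = max(in_degree)
median_in_degree = sorted(in_degree)[len(in_degree) // 2]
```

Comparing `max_in_degree` with `median_in_degree` on a run gives a quick impression of how skewed the in-degree distribution is.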

Barabási-Albert model

Model

Start with an initial network of $m_0$ (≥ 2) nodes, in which each node has degree ≥ 1; otherwise it will always remain isolated.
Kronecker product of matrices

Given two matrices $U \in \mathbb{R}^{n \times m}$ and $V \in \mathbb{R}^{p \times q}$, the Kronecker product matrix $S \in \mathbb{R}^{np \times mq}$ is given by

$$S = U \otimes V = \begin{pmatrix} u_{11}V & u_{12}V & \cdots & u_{1m}V \\ u_{21}V & u_{22}V & \cdots & u_{2m}V \\ \cdots & \cdots & \cdots & \cdots \\ u_{n1}V & u_{n2}V & \cdots & u_{nm}V \end{pmatrix}$$

$A \otimes (aB + C) = (aA) \otimes B + A \otimes C$, but $A \otimes B \ne B \otimes A$ in general.
$(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$.
$(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$ and $(A \otimes B)^{T} = A^{T} \otimes B^{T}$.
$|A \otimes B| = |A|^{m} |B|^{n}$ and $\mathrm{Tr}(A \otimes B) = \mathrm{Tr}(A)\,\mathrm{Tr}(B)$ if $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{m \times m}$.
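
The block definition and a few of the listed identities can be checked on small integer matrices. The sketch below is a minimal pure-Python illustration; the helper names `kron`, `matmul`, and `trace` and the example matrices are assumptions, and a numerical library would normally be used for anything larger.

```python
def kron(U, V):
    """Kronecker product: block (i, j) of the result is U[i][j] * V."""
    p, q = len(V), len(V[0])
    return [[U[i][j] * V[k][l] for j in range(len(U[0])) for l in range(q)]
            for i in range(len(U)) for k in range(p)]

def matmul(X, Y):
    """Ordinary matrix product of lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

mixed_lhs = matmul(kron(A, B), kron(C, D))   # (A ⊗ B)(C ⊗ D)
mixed_rhs = kron(matmul(A, C), matmul(B, D)) # (AC) ⊗ (BD)
commutes = kron(A, B) == kron(B, A)          # False for these matrices
```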


Kronecker model cont.

Model
Rather than matching a single property, the Kronecker model can fit multiple properties of a network at once, which makes it attractive for fitting.
Deterministic Kronecker model: it begins with an initiator graph $G_1$ with $N_1$ nodes, and produces successively larger graphs $G_2, \cdots, G_n$ such that the k-th graph $G_k$ has $N_k = N_1^{k}$ nodes.
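
The growth $N_k = N_1^{k}$ follows if each graph is obtained by taking the Kronecker product of the previous adjacency matrix with the initiator's adjacency matrix. The sketch below assumes that recursion, which the slides do not spell out in full; the function names and the 3-node initiator with self-loops are hypothetical.

```python
def kron(U, V):
    """Kronecker product of two matrices given as lists of rows."""
    p, q = len(V), len(V[0])
    return [[U[i][j] * V[k][l] for j in range(len(U[0])) for l in range(q)]
            for i in range(len(U)) for k in range(p)]

def kronecker_graph(G1, k):
    """Adjacency matrix of G_k, assuming G_k = G_{k-1} ⊗ G_1."""
    Gk = G1
    for _ in range(k - 1):
        Gk = kron(Gk, G1)
    return Gk

def nonzeros(M):
    return sum(1 for row in M for x in row if x)

# Hypothetical initiator: a 3-node path with a self-loop on every node.
G1 = [[1, 1, 0],
      [1, 1, 1],
      [0, 1, 1]]
G3 = kronecker_graph(G1, 3)
n_nodes = len(G3)         # N_1^3
n_entries = nonzeros(G3)  # (nonzeros of G1)^3
```

Under this recursion the number of nonzero adjacency entries also grows as a power of the initiator's count, so nodes and edges scale together in a power-law fashion.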
Sources for generator

Generators
Erdős-Rényi: http://ladamic.com/netlearn/NetLogo501/ErdosRenyiDegDist.html
BRITE: http://www.cs.bu.edu/brite/
INET: http://topology.eecs.umich.edu/inet
Kronecker:
[email protected]
http://www.cc.gatech.edu/dimacs10/archive/kronecker.shtml

Take-home messages

Graph
Motivations
Patterns
Graph aspects
Graph types
Properties
Graph modeling
Network generation
Erdős-Rényi model
Barabási-Albert model
Kronecker model

Gionis
No ratings yet
Gionis
191 pages
Graphs 1
No ratings yet
Graphs 1
29 pages
Introducción A La Teoría de Grafos
No ratings yet
Introducción A La Teoría de Grafos
58 pages
Complex Network Models
No ratings yet
Complex Network Models
110 pages
A Graph 2
No ratings yet
A Graph 2
17 pages
Lecture2
No ratings yet
Lecture2
25 pages
Social Network Analysis
No ratings yet
Social Network Analysis
38 pages
First Half Scribe
No ratings yet
First Half Scribe
33 pages
Revised Social Network Analysis - Chapter 3 - Network Growth
No ratings yet
Revised Social Network Analysis - Chapter 3 - Network Growth
36 pages
Network Dynamics 2013
No ratings yet
Network Dynamics 2013
93 pages
Aph Theory
No ratings yet
Aph Theory
46 pages
Uniformed Search
No ratings yet
Uniformed Search
37 pages
Unit-IV Notes Data Structure
No ratings yet
Unit-IV Notes Data Structure
35 pages
Data Mining and BI: Social Network Analytics: Random Graphs
No ratings yet
Data Mining and BI: Social Network Analytics: Random Graphs
46 pages
NetworkEffects A16z.compressed
100% (1)
NetworkEffects A16z.compressed
85 pages
Lesson 1
No ratings yet
Lesson 1
50 pages
2 Networks in Economics and Finance PDF
No ratings yet
2 Networks in Economics and Finance PDF
24 pages
CAIM: Cerca I Anàlisi D'informació Massiva: FIB, Grau en Enginyeria Informàtica
No ratings yet
CAIM: Cerca I Anàlisi D'informació Massiva: FIB, Grau en Enginyeria Informàtica
97 pages
LECTURE NOTES 1 - Introduction To Graphs To Post
No ratings yet
LECTURE NOTES 1 - Introduction To Graphs To Post
67 pages
Complex Network
No ratings yet
Complex Network
6 pages
Lecture 8nnn
No ratings yet
Lecture 8nnn
16 pages
Introduction To Complex Networks: Flavia Bonomo
No ratings yet
Introduction To Complex Networks: Flavia Bonomo
38 pages
Chapter10 Graphs
No ratings yet
Chapter10 Graphs
90 pages
AI Basics
From Everand
AI Basics
Anand Vemula
No ratings yet
AI Algorithms: Foundations, Applications, and Advancements
From Everand
AI Algorithms: Foundations, Applications, and Advancements
Anand Vemula
No ratings yet
TSClu Win
No ratings yet
TSClu Win
24 pages
Sliding Window Topk
No ratings yet
Sliding Window Topk
30 pages
A Learning Problem For Entity Matching
No ratings yet
A Learning Problem For Entity Matching
19 pages
07 Clustering
No ratings yet
07 Clustering
44 pages
Sampling
No ratings yet
Sampling
100 pages
Finding Top-K Shortest Simple Paths With Diversity
No ratings yet
Finding Top-K Shortest Simple Paths With Diversity
26 pages
分布式数据流
No ratings yet
分布式数据流
64 pages
Central It y
No ratings yet
Central It y
92 pages
Redis源代码分析
No ratings yet
Redis源代码分析
32 pages
13 基于知识图谱的问答
No ratings yet
13 基于知识图谱的问答
73 pages
Vldb2008 Ps 4up
No ratings yet
Vldb2008 Ps 4up
16 pages
国际象棋入门与提高
No ratings yet
国际象棋入门与提高
253 pages
64格导游大师 - 国际象棋实战教科书
No ratings yet
64格导游大师 - 国际象棋实战教科书
316 pages
Statistical Inference: Lecture 2: Transformations and Expectations
No ratings yet
Statistical Inference: Lecture 2: Transformations and Expectations
95 pages
8 图数据库系统
No ratings yet
8 图数据库系统
72 pages
Challenges & Opportunities in Graph Processing at Alibaba, 钱正平
No ratings yet
Challenges & Opportunities in Graph Processing at Alibaba, 钱正平
49 pages
Life Hacks Sample
No ratings yet
Life Hacks Sample
30 pages
Winawer Sample
No ratings yet
Winawer Sample
17 pages
中国国际象棋：国际象棋中局妙手
No ratings yet
中国国际象棋：国际象棋中局妙手
211 pages
The Benoni For The Tournament Player (John Nunn) (Z-Library)
No ratings yet
The Benoni For The Tournament Player (John Nunn) (Z-Library)
164 pages
软件逆向工程原理与实践
No ratings yet
软件逆向工程原理与实践
162 pages
Karjakin Defence Sample
No ratings yet
Karjakin Defence Sample
16 pages
Key Elementsof Chess Strategy Excerpt
No ratings yet
Key Elementsof Chess Strategy Excerpt
17 pages
Bogoljubov Vol1 Sample
No ratings yet
Bogoljubov Vol1 Sample
25 pages
Key Elementsof Chess Tactics Excerpt
No ratings yet
Key Elementsof Chess Tactics Excerpt
19 pages
9152
No ratings yet
9152
25 pages
Play The Barry Attack: Andrew Martin
No ratings yet
Play The Barry Attack: Andrew Martin
27 pages
Bogoljubov Volume 2 Sample
No ratings yet
Bogoljubov Volume 2 Sample
15 pages
Sphinx Vol 1 - Sample
No ratings yet
Sphinx Vol 1 - Sample
20 pages
Aramco HSE Questions
No ratings yet
Aramco HSE Questions
20 pages
Aditya Vahini News Letter
No ratings yet
Aditya Vahini News Letter
3 pages
Ecs268: Structural & Material Laboratory: I. Objective
No ratings yet
Ecs268: Structural & Material Laboratory: I. Objective
7 pages
StraMa Comprehensive Guidelines (C1 To C8) PDF
No ratings yet
StraMa Comprehensive Guidelines (C1 To C8) PDF
103 pages
Practical Manual BT511P Introduction To Biotechnology
0% (1)
Practical Manual BT511P Introduction To Biotechnology
60 pages
Carder - Poedagar 2
No ratings yet
Carder - Poedagar 2
1 page
Policy Server Installation Guide
0% (1)
Policy Server Installation Guide
24 pages
HJ 1
No ratings yet
HJ 1
1 page
Styrofoam Cup: Experiment 3: Energy in Thermal System Objectives
No ratings yet
Styrofoam Cup: Experiment 3: Energy in Thermal System Objectives
3 pages
Hunshu
No ratings yet
Hunshu
6 pages
Chapter 3
No ratings yet
Chapter 3
10 pages
Rubrics Essay
No ratings yet
Rubrics Essay
1 page
China Plastic Chair in Furniture Suppliers, Plastic Chair in Furniture Manufacturers From China On
No ratings yet
China Plastic Chair in Furniture Suppliers, Plastic Chair in Furniture Manufacturers From China On
11 pages
Sinopsis Muhammad Haris Yulianto-1
No ratings yet
Sinopsis Muhammad Haris Yulianto-1
6 pages
Semantic Textual Similarity With Siamese Neural Networks: Tharindu Ranasinghe, Constantin or Asan and Ruslan Mitkov
No ratings yet
Semantic Textual Similarity With Siamese Neural Networks: Tharindu Ranasinghe, Constantin or Asan and Ruslan Mitkov
8 pages
Statistics Study Group 1
No ratings yet
Statistics Study Group 1
3 pages
T7 Astro Camera Astronomy Planetary Quick Guide
No ratings yet
T7 Astro Camera Astronomy Planetary Quick Guide
20 pages
DAA NOTES UNIT 1 (Design and Analysis of Algorithm)
No ratings yet
DAA NOTES UNIT 1 (Design and Analysis of Algorithm)
18 pages
Letter in Support of Responsible Fintech Policy
No ratings yet
Letter in Support of Responsible Fintech Policy
155 pages
ANT 4468 - Syllabus PDF
No ratings yet
ANT 4468 - Syllabus PDF
5 pages
Part 1 «Listening»: Содержание ↑ Audioscript ↓
No ratings yet
Part 1 «Listening»: Содержание ↑ Audioscript ↓
7 pages
‎⁨إجابات تانية ثانوي (شرح ومراجعة) - ترم ثاني - 2025⁩
No ratings yet
‎⁨إجابات تانية ثانوي (شرح ومراجعة) - ترم ثاني - 2025⁩
63 pages
YETI Documentation: Release 1.0
No ratings yet
YETI Documentation: Release 1.0
53 pages
Daily Report Swiss Embassy Jakarta
No ratings yet
Daily Report Swiss Embassy Jakarta
1 page
Math 101 Exercises PDF
No ratings yet
Math 101 Exercises PDF
2 pages
Material Safety Data Sheet: Ephedrine Hydrochloride
No ratings yet
Material Safety Data Sheet: Ephedrine Hydrochloride
6 pages
Horsetail Equisetum Hyemale1
No ratings yet
Horsetail Equisetum Hyemale1
8 pages
Refind Conf
No ratings yet
Refind Conf
8 pages