Graph and Patterns
MING GAO
DaSE@ECNU
(for course related communications)
[email protected]
1 Graph
Motivations
Patterns
2 Graph Concepts
Graph types
Properties
Graph Modeling
3 Network Generation
MING GAO (DaSE@ECNU) Algorithm Foundations of Data Science Mar. 28, 2018 2 / 47
Motivation questions
Questions
What do real graphs look like?
What properties of vertices, edges are important to model?
What local and global properties are important to measure?
Are graphs helpful to understand the real world?
Social influence
Recommendation
Information propagation
Human behaviors
Is a given sub-graph “normal”? (Useful for detecting paid posters
(“water army”), fraud, spam, etc.)
How to generate realistic graphs?
How to get a “good” sample of a network?
How to design efficient algorithms that can handle large-scale
graphs?
Power-law I
Internet topology
Power-law II
Rank of out-degrees
Power-law III
Rank of eigenvalues
Power-law IV
Hop plot
Power-law V
Counting of triangles
Triangle law
How to count # triangles?
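One simple way to answer this question, sketched here under the assumption of a simple undirected graph stored as a dictionary of neighbor sets: for every edge (u, v), count the common neighbors of u and v. Each triangle is found once per edge, so the total is divided by 3. The function name `count_triangles` and the example graph are illustrative and do not come from the lecture.

```python
def count_triangles(adj):
    """Count triangles in a simple undirected graph.

    adj maps each vertex to the set of its neighbors
    (assumed symmetric, with no self-loops).
    """
    total = 0
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge only once
                # every common neighbor closes a triangle over edge (u, v)
                total += len(adj[u] & adj[v])
    # each triangle was counted once for each of its three edges
    return total // 3


# Illustrative example: the complete graph on 4 vertices has 4 triangles.
k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
triangles_in_k4 = count_triangles(k4)
```

The cost is dominated by the set intersections, which is usually far cheaper than testing all vertex triples on sparse graphs.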
Erdős number
The world looks “small” when you consider how short a chain of friends
it takes to get from you to almost anyone else. Stanley Milgram and
his colleagues ran an experiment on this in the 1960s.
296 randomly chosen “starters” were asked to forward a letter to a
“target” person, a stockbroker living in a suburb of Boston.
Roughly six degrees of separation were also observed by Jure Leskovec
on the Microsoft Instant Messenger network.
Shrinking diameter
Dunbar’s number
Why do primates have unusually big brains?
Graph types
Undirected graph
Directed graph
Bipartite graph
Paths
Path
A path is a sequence of nodes in which each consecutive pair
is connected by an edge.
A simple path does not repeat nodes.
The length of a path is the number of edges it traverses,
i.e. one less than the number of nodes in the sequence.
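The definitions above can be checked mechanically. The sketch below assumes an undirected graph stored as a dictionary of neighbor sets and takes the length of a path to be its number of edges (the usual convention); the helper names and the example graph are illustrative only.

```python
def is_path(adj, nodes):
    """True if every consecutive pair in `nodes` is joined by an edge."""
    if not nodes:
        return False
    return all(v in adj[u] for u, v in zip(nodes, nodes[1:]))


def is_simple_path(adj, nodes):
    """True if `nodes` is a path that never repeats a node."""
    return is_path(adj, nodes) and len(set(nodes)) == len(nodes)


def path_length(nodes):
    """Length of a path, counted as the number of edges traversed."""
    return len(nodes) - 1


# Illustrative graph: a triangle 1-2-3 with a pendant node 4 attached to 3.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
walk = [1, 2, 3, 1, 2]   # a path that repeats nodes, so not simple
simple = [1, 2, 3, 4]    # a simple path with 3 edges
```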
Cycle
Connectivity
Connected component
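As a rough illustration, connected components of an undirected graph can be found with repeated breadth-first search: every unvisited node starts a new component, and the search collects everything reachable from it. The sketch assumes a dictionary of neighbor sets; the function name and the example graph are hypothetical. The largest component returned is the natural candidate for a giant component.

```python
from collections import deque


def connected_components(adj):
    """Return the connected components of an undirected graph as a list of sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # breadth-first search from an unvisited node
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components


# Illustrative graph with three components: {1, 2, 3}, {4, 5} and {6}.
example = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}, 6: set()}
components = connected_components(example)
largest = max(components, key=len)
```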
Giant component
Web graph
\[ L^{-1} = \frac{1}{\frac{1}{2} n(n+1)} \sum_{i \ge j} d_{ij}^{-1} \]
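A minimal sketch of how this quantity might be computed for an unweighted, undirected graph, taking d_ij to be the shortest-path (hop) distance found by breadth-first search. Two choices here are assumptions rather than part of the slide: the i = j terms are skipped because d_ii = 0 has no finite reciprocal, and unreachable pairs contribute 0 (their distance is treated as infinite). The normalization by n(n + 1)/2 is kept as in the formula.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def inverse_harmonic_mean_distance(adj):
    """Compute L^{-1} = (sum over pairs of 1/d_ij) / (n(n+1)/2).

    Assumptions: i == j terms are skipped, unreachable pairs add 0.
    """
    nodes = list(adj)
    n = len(nodes)
    total = 0.0
    for idx, i in enumerate(nodes):
        dist = bfs_distances(adj, i)
        for j in nodes[:idx]:       # each unordered pair once
            if j in dist:           # skip pairs in different components
                total += 1.0 / dist[j]
    return total / (0.5 * n * (n + 1))


# Illustrative graph: a triangle plus one isolated node.
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: set()}
inv_l = inverse_harmonic_mean_distance(example)
```

Because disconnected pairs simply add nothing to the sum, this form stays finite on graphs with several components, which is the usual reason for preferring it over the plain mean distance.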
Summarization
Adjacency matrix
Definition
Given a finite graph G = (V, E), the adjacency matrix A is a |V| × |V|
matrix whose entry A_ij indicates whether vertices v_i and v_j are
adjacent in the graph.
For a simple graph (no self-loops or multi-edges), the adjacency
matrix is a (0,1)-matrix with zeros on its diagonal.
If the graph is undirected, the adjacency matrix is symmetric.
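To make the definition concrete, here is a small sketch that builds the adjacency matrix of a simple graph from an edge list and checks the two stated properties. It assumes vertices are numbered 0..n-1; the function names and the example edges are illustrative.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build the n x n (0,1) adjacency matrix from a list of (i, j) edges."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # an undirected edge is stored in both directions
    return A


def is_symmetric(A):
    """True if A equals its transpose."""
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))


# Illustrative undirected graph: a triangle 0-1-2 plus the edge 2-3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
A = adjacency_matrix(4, edges)
degrees = [sum(row) for row in A]  # row sums give the vertex degrees
```

The matrix always takes |V|² entries regardless of how many edges exist, which is why sparse graphs are usually stored differently.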
Storing a graph
Adjacency lists
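A minimal sketch of the adjacency-list representation, assuming the graph arrives as a list of (u, v) edges: each vertex maps to the list of its neighbors, so storage grows with |V| + |E| rather than |V|². The function name and the example edges are illustrative.

```python
def adjacency_lists(edges, directed=False):
    """Map every vertex to the list of its (out-)neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])        # make sure v is present even with no out-edges
        if not directed:
            adj[v].append(u)
    return adj


# Illustrative undirected graph: a triangle 0-1-2 plus the edge 2-3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
adj = adjacency_lists(edges)
stored_entries = sum(len(neighbors) for neighbors in adj.values())
```

For an undirected graph every edge appears in two lists, so the number of stored entries is twice the number of edges.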
Incidence matrix
Definition
An incidence matrix B is a |V| × |E| matrix that shows the relationship
between the vertices and edges of a graph G = (V, E).
Each column corresponds to an edge e = (v_i, v_j) (with i < j):
the entry is 1 in the row corresponding to v_i, −1 in the row
corresponding to v_j, and 0 elsewhere.
The graph Laplacian satisfies $L = BB^T$. Thus L is positive semidefinite
and has nonnegative eigenvalues ($\lambda_i \ge 0$), since for any vector x
\[ x^T L x = x^T B B^T x = (B^T x)^T (B^T x) = \|B^T x\|^2 \ge 0. \]
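The identity can be checked numerically on a small example. The sketch below builds the oriented incidence matrix as defined above, forms the product of B with its transpose in plain Python, and evaluates the quadratic form, which equals the sum of (x_i − x_j)² over the edges and is therefore never negative. Vertex numbering 0..n-1, the helper names, and the example graph are assumptions made for illustration.

```python
def incidence_matrix(n, edges):
    """Oriented |V| x |E| incidence matrix: +1 at the smaller endpoint, -1 at the larger."""
    B = [[0] * len(edges) for _ in range(n)]
    for col, (a, b) in enumerate(edges):
        i, j = min(a, b), max(a, b)
        B[i][col] = 1
        B[j][col] = -1
    return B


def times_transpose(B):
    """Return B multiplied by its own transpose (a |V| x |V| matrix)."""
    n = len(B)
    return [[sum(p * q for p, q in zip(B[r], B[s])) for s in range(n)]
            for r in range(n)]


def quadratic_form(L, x):
    """Evaluate x^T L x."""
    n = len(L)
    return sum(x[r] * L[r][s] * x[s] for r in range(n) for s in range(n))


# Illustrative undirected graph: a triangle 0-1-2 plus the edge 2-3.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
B = incidence_matrix(4, edges)
L = times_transpose(B)            # degrees on the diagonal, -1 for adjacent pairs
x = [1, -2, 3, 0]
value = quadratic_form(L, x)      # equals the sum of (x_i - x_j)^2 over edges
```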
Properties
Network generation
Generators
Erdős-Rényi model
Preferential attachment
Variations + extensions
Copying model
Triad-closing
Butterfly model
Recursion - Kronecker generator
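The first two generators in the list can be sketched in a few lines. The code below is one plausible reading, not the lecture's implementation: the Erdős-Rényi function is the G(n, p) variant, where each possible edge is included independently with probability p, and the preferential-attachment function adds vertices one at a time, each linking to m distinct existing vertices chosen with probability proportional to their degree. The seed graph (a complete graph on m + 1 vertices), the function names, and the parameters are assumptions.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """G(n, p): keep each of the n(n-1)/2 possible edges with probability p."""
    rng = random.Random(seed)
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]


def preferential_attachment(n, m, seed=None):
    """Grow a graph to n vertices; each new vertex attaches to m existing ones.

    Assumes 1 <= m < n. Starts from a complete graph on m + 1 vertices
    (an assumption; other seed graphs are possible).
    """
    rng = random.Random(seed)
    edges = list(combinations(range(m + 1), 2))
    # each vertex appears in `endpoints` once per incident edge, so a uniform
    # draw from this list picks a vertex with probability proportional to degree
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:       # draw until m distinct targets are found
            chosen.add(rng.choice(endpoints))
        for old in chosen:
            edges.append((old, new))
            endpoints.extend((old, new))
    return edges


# Illustrative usage with arbitrary parameters.
er_edges = erdos_renyi(100, 0.05, seed=1)
pa_edges = preferential_attachment(100, 2, seed=1)
```

The G(n, p) model tends to produce degrees concentrated around the mean, while preferential attachment produces a few high-degree hubs, which is why the latter is often used to mimic the heavy-tailed degree patterns discussed earlier.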
Model
Generators
Erdős-Rényi: http://ladamic.com/netlearn/NetLogo501/ErdosRenyiDegDist.html
BRITE: http://www.cs.bu.edu/brite/
INET: http://topology.eecs.umich.edu/inet
Kronecker:
[email protected]
http://www.cc.gatech.edu/dimacs10/archive/kronecker.shtml
Take-home messages
Graph
Motivations
Patterns
Graph aspects
Graph types
Properties
Graph modeling
Network generation
Erdős-Rényi model
Barabási-Albert model
Kronecker model