Basics of Network Analysis

Hiroki Sayama
Graph = Network
• G(V, E): graph (network)
V: vertices (nodes), E: edges (links)
• Example:
Nodes = 1, 2, 3, 4, 5
Links =
1<->2, 1<->3, 1<->5,
2<->3, 2<->4, 2<->5,
3<->4, 3<->5, 4<->5
• (Nodes may have states;
links may have directions
and weights)

[Figure: drawing of the example network with nodes 1 to 5]
Representation of a network

• Adjacency matrix:
A matrix with rows and columns labeled by
nodes, where element a_ij shows the number
of links going from node i to node j
(becomes symmetric for an undirected graph)
• Adjacency list:
A list of links whose element “i->j” shows a
link going from node i to node j
(also represented as “i -> {j1, j2, j3, …}”)
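
The two representations can be sketched in plain Python for the example network of the "Graph = Network" slide. This is a minimal illustration, not part of the original slides: the function names `adjacency_matrix` and `adjacency_list` are made up for this sketch, and self-loops are assumed to be absent.

```python
# Example network from the "Graph = Network" slide: 5 nodes, 9 undirected links.
nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def adjacency_matrix(nodes, links, directed=False):
    """Return a list-of-lists matrix; a[i][j] counts links from node i to node j."""
    index = {node: pos for pos, node in enumerate(nodes)}
    a = [[0] * len(nodes) for _ in nodes]
    for i, j in links:
        a[index[i]][index[j]] += 1
        if not directed:
            a[index[j]][index[i]] += 1  # an undirected link counts in both directions
    return a

def adjacency_list(nodes, links, directed=False):
    """Return a dict mapping each node i to the list of nodes j it links to."""
    adj = {node: [] for node in nodes}
    for i, j in links:
        adj[i].append(j)
        if not directed:
            adj[j].append(i)
    return adj

A = adjacency_matrix(nodes, links)
adj = adjacency_list(nodes, links)
for row in A:
    print(row)
print(adj)
```

The matrix costs memory proportional to |V|^2 regardless of how many links exist, while the list grows with |E|, which is why sparse networks are usually stored as lists.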
Exercise
• Represent the following network in:
– Adjacency matrix
– Adjacency list

[Figure: a small network with numbered nodes]
Degree of a node
• The degree of node u, deg(u), is the
number of links connected to u

[Figure: two example nodes, with deg(u1) = 4 and deg(u2) = 2]
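
As a minimal sketch, degrees can be counted directly from a list of links. The helper name `degrees` is hypothetical, and the data is the example network of the "Graph = Network" slide; note that a node with no links never appears in the result.

```python
# Example network from the "Graph = Network" slide.
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def degrees(links):
    """Count, for every node that appears in a link, how many links touch it."""
    deg = {}
    for i, j in links:
        deg[i] = deg.get(i, 0) + 1
        deg[j] = deg.get(j, 0) + 1
    return deg

deg = degrees(links)
print(deg)
```

Every link contributes to two degrees, so the degrees always sum to 2|E|.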
Connected graph
• A graph in which there is a path
between any pair of nodes

Connected components

[Figure: a network made of two connected components;
number of connected components = 2]
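
One way to find connected components is a breadth-first search started from every node that has not been visited yet. The sketch below assumes an undirected network; the six-node network is a made-up example (the network in the original figure is not recoverable), and `connected_components` is a name chosen for this sketch.

```python
from collections import deque

def connected_components(nodes, links):
    """Return a list of node sets, one per connected component (undirected)."""
    adj = {n: set() for n in nodes}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    component.add(w)
                    queue.append(w)
        components.append(component)
    return components

# Hypothetical network: two separate chains, 1-2-3 and 4-5-6.
components = connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5), (5, 6)])
print(len(components), components)
```

A graph is connected exactly when this list contains a single component.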
Complete graph
• A graph in which every pair of nodes
is connected by a link (often written as K_n
for n nodes: K1, K2, …)
Regular graph
• A graph in which all nodes have the
same degree (often called k-regular
graph with degree k)

Bipartite graph
• A graph whose nodes can be divided
into two subsets so that no link
connects nodes within the same subset

[Figure: two equivalent drawings of the same bipartite graph]
Directed graph

• Each link is directed
• Direction represents either order
of relationship or accessibility
between nodes

E.g. genealogy
Weighted directed graph

• Most general version of graphs
• Both a weight and a direction are
assigned to each link

E.g. traffic network
Measuring Topological Properties of
Networks (1):
Macroscopic Properties
Network density
• The ratio of the # of actual links to the #
of possible links

– For an undirected graph:


d = |E| / ( |V| (|V| - 1) / 2 )

– For a directed graph:


d = |E| / ( |V| (|V| - 1) )
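
Both formulas fit in one small function. This is a sketch with a hypothetical name, `density`; it assumes at least two nodes and no self-loops or multiple links.

```python
def density(num_nodes, num_links, directed=False):
    """Ratio of actual links to possible links, following the two formulas above."""
    possible = num_nodes * (num_nodes - 1)   # ordered pairs of distinct nodes
    if not directed:
        possible /= 2                        # unordered pairs for an undirected graph
    return num_links / possible

# Example network from the "Graph = Network" slide: |V| = 5, |E| = 9.
d = density(5, 9)
print(d)
```

For the example network, 9 of the 10 possible undirected links are present.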

Characteristic path length
• In graph theory: Maximum of the
shortest path lengths between pairs
of nodes (a.k.a. network diameter)
• In complex network science: Average
shortest path length
• Characterizes how large the world
being modeled is
– A small length implies that the network is
well connected globally
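
Both definitions can be computed from breadth-first-search distances. The sketch below assumes an unweighted, undirected, connected network (pairs with no path would simply be left out of the average); the function names are made up for this illustration and the data is the example network of the "Graph = Network" slide.

```python
from collections import deque

nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def bfs_distances(adj, source):
    """Shortest path length (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def path_length_stats(nodes, links):
    """Return (average shortest path length, diameter)."""
    adj = {n: set() for n in nodes}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    lengths = []
    for s in nodes:
        dist = bfs_distances(adj, s)
        lengths.extend(d for t, d in dist.items() if t != s)
    return sum(lengths) / len(lengths), max(lengths)

average_length, diameter = path_length_stats(nodes, links)
print(average_length, diameter)
```

In the example only nodes 1 and 4 are not directly linked, so every pair is at distance 1 except that one pair at distance 2.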
Clustering coefficient
• For each node:
– Let n be the number of its neighbor nodes
– Let m be the number of links among the n
neighbors
– Calculate c = m / (n choose 2)
Then C = <c> (the average of c)
• C indicates the average probability for
two of one’s friends to be friends too
– A large C implies that the network is well
connected locally to form a cluster
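
The recipe above translates almost line by line into code. In this sketch, nodes with fewer than two neighbors get c = 0, which is a common convention but an assumption here because the slide does not cover that case; the function name `clustering` is also made up.

```python
from itertools import combinations

nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def clustering(nodes, links):
    """Return {node: c}, with c = m / (n choose 2) as defined above."""
    adj = {u: set() for u in nodes}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    c = {}
    for u in nodes:
        n = len(adj[u])                      # number of neighbors
        if n < 2:
            c[u] = 0.0                       # assumed convention for n < 2
            continue
        # m: number of links among the n neighbors
        m = sum(1 for a, b in combinations(adj[u], 2) if b in adj[a])
        c[u] = m / (n * (n - 1) / 2)
    return c

c = clustering(nodes, links)
C = sum(c.values()) / len(c)                 # C = <c>
print(c, C)
```

For the example network, nodes 1 and 4 have c = 1 and the other three have c = 5/6.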
Degree distribution

P(k) = Prob. (or #) of nodes with degree k

• Gives a rough profile of how the
connectivity is distributed within the
network

Σ_k P(k) = 1 (or total # of nodes)
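
A degree distribution is a histogram of node degrees. The sketch below computes both variants (counts and probabilities) for the example network of the "Graph = Network" slide; the function name `degree_distribution` is hypothetical.

```python
from collections import Counter

links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def degree_distribution(links):
    """Return (counts, probabilities): # and fraction of nodes with each degree k."""
    deg = Counter()
    for i, j in links:
        deg[i] += 1
        deg[j] += 1
    counts = Counter(deg.values())           # k -> # of nodes with degree k
    total = sum(counts.values())             # total # of nodes
    probabilities = {k: n / total for k, n in counts.items()}
    return dict(counts), probabilities

counts, P = degree_distribution(links)
print(counts, P)
```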
Power law degree distribution
• P(k) ~ k^(-γ)
– A few well-connected nodes,
a lot of poorly connected nodes
• Linear in log-log plot:
log P(k) = -γ log k + const.
-> No characteristic scale
(Scale-free networks)

[Figure: P(k) vs. k, and log P(k) vs. log k]
How it appears

[Figure: two example networks, labeled "Random" and "Scale-free"]
Degree Distributions of Real-World
Complex Networks

[Figure: degree distributions of three networks: Actors, WWW, Power grid]

(from Barabási, A.-L. and Albert, R., Science 286: 509-512, 1999)
Degree distribution of FB
[Figure: two panels, P(k) and CCDF]

• http://www.facebook.com/note.php?note_id=10150388519243859
• http://arxiv.org/abs/1111.4503
Measuring Topological Properties of
Networks (2):
Centralities
Centrality measures (“B,C,D,E”)
• Degree centrality
– How many connections the node has
• Betweenness centrality
– How many shortest paths go through the
node
• Closeness centrality
– How close the node is to other nodes
• Eigenvector centrality
Degree centrality
• Simply, # of links attached to a node

CD(v) = deg(v)

or sometimes defined as
CD(v) = deg(v) / (N-1)
(N: # of nodes)
Betweenness centrality
• Prob. for a node to be on shortest
paths between two other nodes

CB(v) = Σ_{s≠v, e≠v} #sp(s,e,v) / #sp(s,e)

• s: start node, e: end node
• #sp(s,e,v): # of shortest paths from s to e
that go through node v
• #sp(s,e): total # of shortest paths from s to e
• Easily generalizable to “group betweenness”
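
The formula can be evaluated directly by counting shortest paths with breadth-first search. The sketch below is a straightforward (not the fastest) implementation for unweighted, undirected networks; it sums over ordered pairs (s, e), so each unordered pair is counted twice. The function names are made up, and the data is the example network of the "Graph = Network" slide.

```python
from collections import deque

nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def shortest_path_counts(adj, source):
    """BFS from source; return (dist, sigma), sigma[t] = # of shortest paths to t."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:       # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(nodes, links):
    """CB(v) = sum over ordered pairs (s, e) of #sp(s,e,v) / #sp(s,e)."""
    adj = {n: set() for n in nodes}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    info = {s: shortest_path_counts(adj, s) for s in nodes}
    cb = {v: 0.0 for v in nodes}
    for s in nodes:
        dist_s, sigma_s = info[s]
        for e in nodes:
            if e == s or e not in dist_s:
                continue
            for v in nodes:
                if v == s or v == e or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                # v lies on a shortest s-e path iff the distances add up exactly.
                if e in dist_v and dist_s[v] + dist_v[e] == dist_s[e]:
                    cb[v] += sigma_s[v] * sigma_v[e] / sigma_s[e]
    return cb

cb = betweenness(nodes, links)
print(cb)
```

In the example, the only pair that is not directly linked is (1, 4), and its three shortest paths pass through nodes 2, 3 and 5, so each of those nodes receives 1/3 per direction.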
Closeness centrality
• Inverse of the average distance from a
node to all the other nodes

CC(v) = (n - 1) / Σ_{w≠v} d(v,w)

• n: # of nodes
• d(v,w): length of the shortest path from v to w
• Its inverse is called “farness”
• Sometimes “Σ” is moved out of the fraction, i.e.,
Σ_{w≠v} 1 / d(v,w) (it works for
networks that are not strongly connected)
• NetworkX calculates closeness within each connected
component
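
The sketch below computes CC(v) as defined above, together with the variant in which the sum is moved out of the fraction. It assumes an unweighted, undirected, connected network with at least two nodes; the function names are hypothetical, and the data is the example network of the "Graph = Network" slide.

```python
from collections import deque

nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def bfs_distances(adj, source):
    """Shortest path length (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_and_sum_of_inverses(nodes, links):
    """Return (closeness, inverse_sum) dicts; assumes a connected network."""
    adj = {n: set() for n in nodes}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    closeness, inverse_sum = {}, {}
    for v in nodes:
        dist = bfs_distances(adj, v)
        others = [d for w, d in dist.items() if w != v]
        closeness[v] = (len(nodes) - 1) / sum(others)   # (n - 1) / sum of distances
        inverse_sum[v] = sum(1 / d for d in others)     # sum of 1 / distance
    return closeness, inverse_sum

closeness, inverse_sum = closeness_and_sum_of_inverses(nodes, links)
print(closeness)
print(inverse_sum)
```

The second quantity stays finite when some node is unreachable, because an unreachable node simply contributes nothing to the sum, whereas the first formula has no well-defined distance to plug in.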
Eigenvector centrality
• Eigenvector associated with the largest
eigenvalue of the adjacency matrix of a network

CE(v) = (v-th element of x)

Ax = λx

• λ: dominant eigenvalue
• x is often normalized (|x| = 1)
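
The dominant eigenvector can be approximated by power iteration, without any linear algebra library. The sketch below assumes a connected, undirected network. As a labeled deviation from the plain iteration x <- Ax, it multiplies by (A + I), which has the same eigenvectors as A but avoids oscillation on bipartite networks. The function name and the parameters `iterations` and `tol` are choices made for this sketch.

```python
import math

nodes = [1, 2, 3, 4, 5]
links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def eigenvector_centrality(nodes, links, iterations=1000, tol=1e-12):
    """Approximate the dominant eigenvector of A, normalized so that |x| = 1."""
    adj = {n: [] for n in nodes}
    for i, j in links:
        adj[i].append(j)
        adj[j].append(i)
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # One step of x <- (A + I) x: own value plus the values of all neighbors.
        new = {n: x[n] + sum(x[w] for w in adj[n]) for n in nodes}
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {n: value / norm for n, value in new.items()}
        change = sum(abs(new[n] - x[n]) for n in nodes)
        x = new
        if change < tol:
            break
    return x

x = eigenvector_centrality(nodes, links)
print(x)
```

In the example network, nodes 2, 3 and 5 (degree 4) end up with equal, higher scores than nodes 1 and 4 (degree 3).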
Exercise
• Who is most central by degree,
betweenness, closeness, eigenvector?

Which centrality to use?
• To find the most popular person
• To find the most efficient person to
collect information from the entire
organization
• To find the most powerful person to
control information flow within an
organization
• To find the most important person (?)
Measuring Topological Properties of
Networks (3):
Mesoscopic Properties
Degree correlation (assortativity)

• Pearson’s correlation coefficient of
node degrees across links

r = Cov(X, Y) / (σX σY)

• X: degree of start node (in / out)
• Y: degree of end node (in / out)
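
For an undirected network, one way to apply the formula is to treat every link as two directed links, so that each endpoint serves once as the start node and once as the end node. That symmetrization is an assumption of this sketch, as is the function name `degree_assortativity`; the result is undefined (division by zero) when all nodes have the same degree.

```python
import math

links = [(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)]

def degree_assortativity(links):
    """Pearson correlation of the degrees at the two ends of each link."""
    deg = {}
    for i, j in links:
        deg[i] = deg.get(i, 0) + 1
        deg[j] = deg.get(j, 0) + 1
    xs, ys = [], []
    for i, j in links:
        xs += [deg[i], deg[j]]               # X: degree of start node
        ys += [deg[j], deg[i]]               # Y: degree of end node
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

r = degree_assortativity(links)
# Hypothetical star network: one hub linked to three leaves.
r_star = degree_assortativity([(0, 1), (0, 2), (0, 3)])
print(r, r_star)
```

A star is the extreme disassortative case: every link joins the highest-degree node to a lowest-degree node.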
Assortative/disassortative networks

• Social networks are assortative
• Engineered / biological networks are
disassortative

(from Newman, M. E. J., Phys. Rev. Lett. 89: 208701, 2002)

K-cores
• A connected component of the network
that remains after repeatedly deleting all
the nodes whose degree is less than k
until no more such nodes exist
– Helps identify where the core cluster is
– All nodes of a k-core have degree at
least k within the core
– The largest value of k for which a k-core
exists is called “degeneracy” of the
network
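
The deletion procedure can be sketched as a loop that keeps pruning until nothing changes. The function below returns all surviving nodes for a given k; splitting them into connected components, as the definition above requires, is left out. The function name and the five-node network are made up for this illustration.

```python
def k_core_nodes(nodes, links, k):
    """Return the set of nodes left after repeatedly deleting nodes of degree < k."""
    adj = {n: set() for n in nodes}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    remaining = set(nodes)
    changed = True
    while changed:
        # Degrees are measured among the nodes that are still present.
        low = {u for u in remaining if len(adj[u] & remaining) < k}
        remaining -= low
        changed = bool(low)
    return remaining

# Hypothetical network: a triangle 1-2-3 with a tail 3-4-5 attached.
example_nodes = [1, 2, 3, 4, 5]
example_links = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
core2 = k_core_nodes(example_nodes, example_links, 2)
core3 = k_core_nodes(example_nodes, example_links, 3)
print(core2, core3)
```

Deleting node 5 lowers the degree of node 4 below 2, so node 4 goes in the next round; this cascade is why the deletion has to be repeated.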
Exercise
• Find the k-core (with the largest k)
of the following network

Coreness (core number)
• A node’s coreness (core number) is c
if it belongs to a c-core but not to
a (c+1)-core
• Indicates how strongly the node is
connected to the network
• Classifies nodes into several layers
– Useful for visualization
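
Core numbers can be obtained by peeling the network layer by layer: nodes removed while the threshold is k belong to the k-core but not to the (k+1)-core. This is a simple sketch rather than an efficient implementation, and the function name `core_numbers` and the five-node network are made up for this illustration.

```python
def core_numbers(nodes, links):
    """Return {node: coreness} by peeling off low-degree nodes layer by layer."""
    adj = {n: set() for n in nodes}
    for i, j in links:
        adj[i].add(j)
        adj[j].add(i)
    core = {}
    remaining = set(nodes)
    k = 0
    while remaining:
        # Nodes with at most k remaining neighbors cannot be part of a (k+1)-core.
        low = {u for u in remaining if len(adj[u] & remaining) <= k}
        if low:
            for u in low:
                core[u] = k
            remaining -= low
        else:
            k += 1                           # everything left survives: next layer
    return core

# Hypothetical network: a triangle 1-2-3 with a tail 3-4-5 attached.
core = core_numbers([1, 2, 3, 4, 5], [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)])
print(core, max(core.values()))
```

The largest core number in the result equals the degeneracy of the network.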
Community
• A subgraph of a network within which
nodes are connected to each other
more densely than to the outside
– Still defined vaguely…
– Various detection algorithms proposed
• K-clique percolation
• Hierarchical clustering
• Girvan-Newman algorithm
• Modularity maximization
(e.g., Louvain method)

(diagram from Wikipedia)
Modularity
• A quantity that characterizes how
good a given community structure is in
dividing the network

Q = ( |Ein| - |Ein-R| ) / |E|

• |Ein|: # of links connecting nodes that belong
to the same community
• |Ein-R|: Estimated |Ein| if links were random
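
The sketch below evaluates Q for a given assignment of nodes to communities in an undirected network. The slide does not say how |Ein-R| is estimated; this sketch assumes the common choice of random rewiring that preserves node degrees, which gives |Ein-R| = Σ_c (D_c)^2 / (4|E|), where D_c is the total degree of community c. The function name and the six-node network are made up for this illustration.

```python
def modularity(links, community):
    """Q = (|Ein| - |Ein-R|) / |E| for a node -> community mapping."""
    num_links = len(links)
    # |Ein|: links whose two ends are in the same community.
    e_in = sum(1 for i, j in links if community[i] == community[j])
    # D_c: sum of node degrees inside each community.
    degree_sum = {}
    for i, j in links:
        for u in (i, j):
            c = community[u]
            degree_sum[c] = degree_sum.get(c, 0) + 1
    # |Ein-R| under the assumed degree-preserving random model.
    e_in_random = sum(d * d for d in degree_sum.values()) / (4 * num_links)
    return (e_in - e_in_random) / num_links

# Hypothetical network: two triangles joined by a single link (3-4).
two_triangles = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
split = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
merged = {n: "a" for n in range(1, 7)}
Q_split = modularity(two_triangles, split)
Q_merged = modularity(two_triangles, merged)
print(Q_split, Q_merged)
```

Putting every node into one community always gives Q = 0 under this model, so a positive Q means the division captures more internal links than chance would.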
Community detection based on
modularity

• The Louvain method
– Heuristic algorithm to construct
communities that optimize modularity
• Blondel et al. J. Stat. Mech. 2008 (10):
P10008
• Python implementation by Thomas
Aynaud available at:
– https://bitbucket.org/taynaud/python-louvain/
