03_NetworkStructure

The document discusses the structure and properties of networks, focusing on concepts such as connected components, giant components, and clustering coefficients. It highlights the significance of triadic closure and the small-world property in real-world networks, along with definitions of various metrics like degree centrality and betweenness centrality. Additionally, it explores the implications of these properties on social networks and the emergence of giant components in random graphs.

Uploaded by

Husein Yusuf
Copyright
© © All Rights Reserved
Network Structure

Basic definitions: Components, giant component, triadic closure, clustering coefficient, cliques, bridges, local bridges
Outline
• As a reference point:
– Poisson random graphs
– Regular graphs
• Common properties of real-world
networks
– Size of largest connected component
– Small-world property
– Heavy-tailed degree distribution
– Hierarchical organization
– Network motifs
Review: Connectedness
• Dfn: A graph (or subgraph) is connected if there is a path between each pair of nodes. A path is enough; the pair does not need to share an edge.
• If some pair of nodes has no path between them, the graph is disconnected.
Components
• Dfn: a connected component is a
subset S of nodes where:
1. Every node in S has a path to every other node in S:
$S_{\text{comp}} = \{\, v \in S \mid \forall u \in S,\ \exists\, \text{path}(u, v) \,\}$
2. S is not part of a larger subset S' where property #1 holds
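The two properties above can be sketched in code. This is a minimal illustration, assuming the graph is stored as a dictionary from each node to the set of its neighbors; the function name `connected_components` and the example graph are hypothetical, not part of the slides.

```python
from collections import deque

def connected_components(adj):
    """Return a list of components, each a set of nodes.

    `adj` maps every node to the set of its neighbors (undirected graph).
    """
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`,
        # so the result cannot be extended to a larger connected subset.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in component:
                    component.add(nbr)
                    queue.append(nbr)
        seen |= component
        components.append(component)
    return components

# Hypothetical example: a triangle, a separate edge, and an isolated node.
example = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"},
    "d": {"e"}, "e": {"d"},
    "f": set(),
}
components = connected_components(example)
```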
Giant Component

• Across a range of network datasets, large, complex networks often have what is called a giant component, a deliberately informal term for a connected component that contains a significant fraction of all the nodes. Moreover, when a network contains a giant component, it almost always contains only one.
• To see why, take the example of the global friendship network and try imagining that there were two giant components, each with hundreds of millions of people. All it would take is a single edge from someone in the first of these components to someone in the second, and the two giant components would merge into a single component.
• In the distant past, the global social network likely contained two giant components: one in the Americas, and one in the Europe-Asia land mass. Because of this, technology evolved independently in the two components, and perhaps even worse, human diseases evolved independently.
• So when the two components finally came in contact, the technology and diseases of one quickly and disastrously overwhelmed the other.
• The notion of giant components is useful for reasoning about networks on a much smaller scale as well.
Giant Component
• A giant component: “connected
component that contains a
significant fraction of all the nodes”
• Ex: Real-world social network
• Ex: Random graphs
Giant component(1)
• How large are the components in the G(n,p) model?
– p = 0 → every node is isolated
– → largest component size = 1 (independent of n)
– p = 1 → all nodes are connected with each other
– → largest component size = n (proportional to n)

• It is interesting to examine what happens for values of p in between
– In particular, what happens to the largest component in the network as p increases?
• The size of the largest component undergoes a sudden change, or phase transition, from constant size to extensive size at one particular special value of p
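A small simulation can make the two extreme cases and the in-between behavior concrete. The sketch below is an assumption-laden illustration, not the code behind the slides: the helper names `gnp_graph` and `largest_component_size`, the choice n = 400, and the seed are all hypothetical, and p is expressed through the mean degree (n - 1)p for convenience.

```python
import random
from collections import deque

def gnp_graph(n, p, rng):
    """Sample G(n, p): each of the n(n-1)/2 possible edges appears with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        best = max(best, size)
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 400
for mean_degree in (0.0, 0.5, 1.0, 2.0, 4.0):
    p = mean_degree / (n - 1)
    size = largest_component_size(gnp_graph(n, p, rng))
    print(f"mean degree {mean_degree:.1f}: largest component holds {size / n:.2f} of the nodes")
```

Exact numbers depend on the seed and on n, so the printed fractions should be read as one random sample rather than as the limiting values.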
Emergence of giant connected component in G(n,p) as p increases

http://networkx.lanl.gov/archive/networkx-1.1/examples/drawing/giant_component.html
Giant component (2)
• A network component whose size grows in proportion to n is called a giant component

• Let u be the fraction of nodes that do not belong to the giant component. Hence,
– If there is no giant component → u = 1
– If there is a giant component → u < 1

• For a node i not to connect to the giant component via another node j, one of two things must hold:
– i is not connected to j → probability 1 - p
– i is connected to j, but j itself is not connected to the giant component → probability pu
Giant component
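• Hence, the total probability of i not being connected to the giant component via vertex j is: 1 - p + pu
– Considering all n - 1 vertices through which i can connect:
$u = (1 - p + pu)^{n-1}$
– Taking logarithms of both sides, writing p = c/(n - 1) with c the mean degree, and using the Taylor approximation ln(1 - x) ≈ -x for large n:
$\ln u = (n-1)\,\ln\!\left[1 - \frac{c}{n-1}(1-u)\right] \approx -c\,(1-u)$
so that, with S = 1 - u the fraction of nodes in the giant component,
$S = 1 - e^{-cS}$
– This equation cannot be solved in closed form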
Giant component
• We plot $y = 1 - e^{-cS}$ with S between 0 and 1 (since it represents a fraction of nodes)
• We also plot y = S
• The points where the two curves intersect are the solutions
• For small c there is only one solution
– S = 0
• For greater c there might be two solutions
• The point where two solutions start appearing is when the gradients of the two curves are equal at S = 0
– The gradient of $1 - e^{-cS}$ is $c\,e^{-cS}$, which equals c at S = 0; the gradient of y = S is 1
• This happens for c = 1
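Since the equation has no closed-form solution, the intersection of the two curves can be located numerically. The following is a minimal sketch using fixed-point iteration; the function name `giant_component_fraction`, the tolerance, and the iteration cap are assumptions made for this illustration.

```python
import math

def giant_component_fraction(c, tol=1e-12, max_iter=100_000):
    """Solve S = 1 - exp(-c S) by fixed-point iteration, starting from S = 1.

    Starting at S = 1 makes the iterates decrease toward the largest solution,
    so the result is (approximately) 0 when S = 0 is the only solution.
    """
    s = 1.0
    for _ in range(max_iter):
        s_next = 1.0 - math.exp(-c * s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s  # not converged within max_iter (happens very close to c = 1)

for c in (0.5, 1.5, 2.0, 3.0):
    print(f"c = {c}: S is about {giant_component_fraction(c):.4f}")
```

Convergence slows down sharply near c = 1, where the two curves are tangent at S = 0, which is why that value is left out of the loop.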
Giant component
• Until now we have proved that if c ≤ 1 there cannot be any giant component
– However, we have not proved what happens if c > 1
• How can we be sure that there is a giant component?
• After all, there are two solutions to the equation: one for S = 0 and one for larger S
• Consider a small connected set of nodes: the periphery of this initially small component can only increase if c > 1
Giant component
[Figure, axis label: Average degree]
• We expect a giant component if c > 1
Emergence of giant component
[Figure: emergence of giant connected component in G(n,p) as p increases; axis label: Mean degree]
Reference point-2: Regular graphs
• Ring lattice with k connections to nearest neighbors
Small world property
• The connection topology of a network is often assumed to be either completely regular or completely random.
• But many biological, technological and social networks lie somewhere between these two extremes.
• To explore this middle ground, regular networks are ‘rewired’ to introduce increasing amounts of disorder.
• These systems can be highly ‘clustered’ like regular lattices, yet have small characteristic path lengths like random graphs. They are called “small world” networks.
• Ex: power grids. Models of dynamical systems with small-world coupling display enhanced signal propagation speed, computational power and synchronizability.
• In particular, infectious diseases spread more easily in small-world networks than in regular lattices.
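The interpolation between a regular lattice and a random graph can be sketched as follows. The rewiring rule used here (move one end of each edge to a random node with probability `beta`) is an assumption in the spirit of the Watts-Strogatz procedure; the slides do not spell out the exact rule, and all function names, sizes, and the seed are hypothetical.

```python
import random
from collections import deque
from itertools import combinations

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def rewire(adj, beta, rng):
    """With probability beta, replace edge (i, j) by (i, t) for a random new t."""
    n = len(adj)
    new = {i: set(nbrs) for i, nbrs in adj.items()}
    edges = sorted((i, j) for i in adj for j in adj[i] if i < j)
    for i, j in edges:
        if rng.random() < beta:
            choices = [t for t in range(n) if t != i and t not in new[i]]
            if choices:
                t = rng.choice(choices)
                new[i].discard(j)
                new[j].discard(i)
                new[i].add(t)
                new[t].add(i)
    return new

def average_clustering(adj):
    """Mean over nodes of the fraction of linked neighbor pairs (0 if < 2 neighbors)."""
    total = 0.0
    for nbrs in adj.values():
        pairs = list(combinations(nbrs, 2))
        if pairs:
            total += sum(1 for a, b in pairs if b in adj[a]) / len(pairs)
    return total / len(adj)

def average_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs of nodes."""
    total = count = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        count += len(dist) - 1
    return total / count

rng = random.Random(0)  # fixed seed so the run is repeatable
lattice = ring_lattice(200, 6)
for beta in (0.0, 0.05, 1.0):
    g = rewire(lattice, beta, rng)
    print(f"beta={beta}: clustering={average_clustering(g):.3f}, "
          f"path length={average_path_length(g):.2f}")
```

The quantity to watch is how the two printed measures change between beta = 0 (pure lattice) and beta = 1 (fully rewired); the values for intermediate beta vary with the seed.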
The principle of Triadic closure
“If two people in a social network have a friend in common, then there is an increased likelihood that they will become friends themselves ...”
- Rapoport 1953
• Why?
– Opportunity
– Trust
– Incentive
How prevalent is triadic closure in a graph?
[Figure: example graph on nodes A, B, C, D, E, F, G]
Global clustering based on triplets of nodes
A triplet consists of three nodes that are connected by either two (open triplet) or
three (closed triplet) undirected ties.
A triangle consists of three closed triplets, one centred on each of the nodes.

Figure 3.1: The formation of the edge between B and C illustrates the effects of
triadic closure, since they have a common neighbor A.
Cluster coefficient
• In graph theory, a clustering coefficient is a measure of the
degree to which nodes in a graph tend to cluster together.
Evidence suggests that in most real-world networks, and in
particular social networks, nodes tend to create tightly knit
groups characterized by a relatively high density of ties; this
likelihood tends to be greater than the average probability of
a tie randomly established between two nodes.
• Two versions of this measure exist: the global and the local.
The global version was designed to give an overall indication
of the clustering in the network, whereas the local gives an
indication of the embeddedness of single nodes.
The Clustering Coefficient
• The basic role of triadic closure in social networks has motivated the formulation of simple social network measures to capture its prevalence. One of these is the clustering coefficient.
• The clustering coefficient of a node A is defined as the probability that two
randomly selected friends of A are friends with each other.
• In other words, it is the fraction of pairs of A’s friends that are connected to each
other by edges.
• For example, the clustering coefficient of node A in Figure 3.2(a) is 1/6
(because there is only the single C-D edge among the six pairs of friends B-C, B-
D, B-E, C-D, C-E, and D-E), and it has increased to 1/2 in the second snapshot of
the network in Figure 3.2(b) (because there are now the three edges B-C, C-D,
and D-E among the same six pairs). In general, the clustering coefficient of a
node ranges from 0 (when none of the node’s friends are friends with each
other) to 1 (when all of the node’s friends are friends with each other), and the
more strongly triadic closure is operating in the neighborhood of the node, the
higher the clustering coefficient will tend to be.
Figure 3.2: If we watch a network for a longer span of time, we can see multiple edges
forming — some form through triadic closure while others (such as the D-G edge)
form even though the two endpoints have no neighbors in common.
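The worked example for node A can be checked with a few lines of code. This sketch encodes only the edges named in the text (A's four friends B, C, D, E and the edges among them), not the full graphs of Figure 3.2; the helper names `build` and `local_clustering` are hypothetical.

```python
from itertools import combinations

def build(edges):
    """Turn a list of undirected edges into a node -> set-of-neighbors dictionary."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbors that are themselves linked."""
    pairs = list(combinations(sorted(adj[node]), 2))
    if not pairs:
        return 0.0  # convention assumed here for nodes with fewer than 2 neighbors
    linked = sum(1 for a, b in pairs if b in adj[a])
    return linked / len(pairs)

a_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")]
snapshot_a = build(a_edges + [("C", "D")])
snapshot_b = build(a_edges + [("B", "C"), ("C", "D"), ("D", "E")])

print(local_clustering(snapshot_a, "A"))  # one linked pair out of six
print(local_clustering(snapshot_b, "A"))  # three linked pairs out of six
```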
Clustering Coefficient
Dfn: The (overall) clustering coefficient
of a node v is the probability that u1, u2
(two randomly selected neighbors of v)
are themselves neighbors.
$$Cl(G) = \frac{\sum_{v} \#\{(u_1, u_2) \in E \mid u_1 \neq u_2,\ u_1 \in N(v),\ u_2 \in N(v)\}}{\sum_{v} \#\{(u_1, u_2) \mid u_1 \neq u_2,\ u_1 \in N(v),\ u_2 \in N(v)\}}$$
Numerator: pairs of neighbors of v that are linked
Denominator: any 2 neighbors of v
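The ratio of linked neighbor pairs to all neighbor pairs, summed over every node v, can be computed directly. A minimal sketch, assuming unordered pairs are counted (counting ordered pairs doubles both sums and leaves the ratio unchanged); the name `overall_clustering` and the toy graph are hypothetical.

```python
from itertools import combinations

def overall_clustering(adj):
    """Linked neighbor pairs divided by all neighbor pairs, both summed over nodes."""
    linked = 0
    total = 0
    for v in adj:
        for u1, u2 in combinations(sorted(adj[v]), 2):
            total += 1
            if u2 in adj[u1]:
                linked += 1
    return linked / total if total else 0.0

# Hypothetical toy graph: triangle A-B-C with a pendant node D attached to A.
toy = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
print(overall_clustering(toy))  # 3 linked pairs out of 5 neighbor pairs
```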
Cliques
• Dfn: A clique is a maximal,
completely connected subgraph of a
given graph.
[Figure: three drawings of a graph on nodes A, B, C, D]
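Maximal cliques in a small graph can be enumerated by brute-force recursion. The sketch below uses the basic Bron-Kerbosch recursion (without pivoting), which is one standard technique and not necessarily what the course intends; the function name and toy graph are hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        # `current` is a clique; `candidates` can still extend it;
        # `excluded` holds nodes already tried, to avoid duplicate output.
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in sorted(candidates):
            expand(current | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)

# Hypothetical toy graph: triangle A-B-C plus the edge C-D.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(maximal_cliques(toy))
```

The edge A-B is a completely connected subgraph too, but it is not reported because it is contained in the larger clique A-B-C.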
INDIVIDUAL EXERCISE:

What is the clustering coefficient of this graph?
What are the cliques?
[Figure: exercise graph; visible node labels: A, C, D, E, F, G]
Bridges
Dfn: An edge e is a bridge in the graph G if G - e has more components than G
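The definition translates directly into a check: delete each edge in turn and count components. This is a deliberately naive sketch that follows the definition rather than an efficient bridge-finding algorithm; the function names and toy graph are hypothetical.

```python
from collections import deque

def count_components(adj):
    """Number of connected components, found by breadth-first search."""
    seen = set()
    count = 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
    return count

def bridges(adj):
    """Return the edges whose removal increases the number of components."""
    base = count_components(adj)
    found = []
    for u in sorted(adj):
        for v in sorted(adj[u]):
            if u < v:
                # Copy the graph without the edge (u, v), i.e. G - e.
                pruned = {n: set(nbrs) for n, nbrs in adj.items()}
                pruned[u].discard(v)
                pruned[v].discard(u)
                if count_components(pruned) > base:
                    found.append((u, v))
    return found

# Hypothetical toy graph: triangle A-B-C plus the edge C-D.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(bridges(toy))
```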
What about giant components?
Local Bridges
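Dfn: An edge e = (v,u) is a local bridge in the graph G if the shortest path between v and u in G - e is longer than 2: $\text{path}_{G-e}(v, u) > 2$
The length $\text{path}_{G-e}(v, u)$ is the span of the local bridge
i.e., v and u have no friends in common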
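The span can be computed by a breadth-first search that ignores the edge in question. A minimal sketch, assuming a disconnected pair is given an infinite span (so a bridge also counts as a local bridge); the function names and the toy graph are hypothetical.

```python
from collections import deque

def span(adj, v, u):
    """Distance between v and u after deleting the edge (v, u); inf if disconnected."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if {node, nbr} == {v, u}:
                continue  # skip the deleted edge
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist.get(u, float("inf"))

def is_local_bridge(adj, v, u):
    """True if (v, u) is an edge and its endpoints have no common neighbor."""
    return u in adj[v] and span(adj, v, u) > 2

# Hypothetical toy graph: a square A-B-C-D and, separately, a triangle X-Y-Z.
toy = {
    "A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C", "A"},
    "X": {"Y", "Z"}, "Y": {"X", "Z"}, "Z": {"X", "Y"},
}
print(span(toy, "A", "B"), is_local_bridge(toy, "A", "B"))
print(span(toy, "X", "Y"), is_local_bridge(toy, "X", "Y"))
```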


TIE STRENGTH AND BRIDGES

GROUP DISCUSSION:
Not all edges (“ties”) are created equal.
To find a job, which would be more useful:
A good friend? (“strong” tie)
Or an acquaintance? (“weak” tie)
Strong Triadic Closure Property
• Node v violates the Strong Triadic
Closure Property if:
– it has strong ties to 2 other nodes u1 and u2
– and there is no edge at all (strong or weak)
between u1 and u2

• Node v satisfies the Strong Triadic Closure Property if it does not violate it.
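The violation test can be written as a short check over pairs of strong neighbors. This sketch assumes strong and weak ties are kept in two separate neighbor dictionaries, which is a representation chosen for the illustration; the name `violates_stc` and the toy data are hypothetical.

```python
from itertools import combinations

def violates_stc(strong, weak, v):
    """True if v has strong ties to two nodes that share no edge at all."""
    strong_nbrs = sorted(strong.get(v, set()))
    for u1, u2 in combinations(strong_nbrs, 2):
        linked = u2 in strong.get(u1, set()) or u2 in weak.get(u1, set())
        if not linked:
            return True
    return False

# Hypothetical toy data: A has strong ties to B and C.
strong = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}}
no_weak_ties = {}
weak_bc = {"B": {"C"}, "C": {"B"}}

print(violates_stc(strong, no_weak_ties, "A"))  # B and C share no edge
print(violates_stc(strong, weak_bc, "A"))       # a weak B-C tie is enough
```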
Local Bridges must be Weak Ties!
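• What if:
– A has at least two strong ties and satisfies the Strong Triadic Closure Property
– one of A's strong ties, say A-B, is a local bridge
• Contradiction of Strong Triadic Closure!
– A has a second strong tie, say A-C, so the property requires an edge B-C; but then A and B share the neighbor C, so A-B is not a local bridge

“Strong” and “weak” are imprecise!
How does this look in the real world?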
Tie Strength
• Define “strong” and “weak” ties?
– “Weak”: e.g., follow on Facebook (FB)
– “Strong”: e.g., reciprocal FB messages
• Quantify “strong” and “weak” ties?
– E.g., # likes, messages, etc
Neighborhood Overlap
• Dfn: The neighborhood overlap of an edge e = (v, u) is:
$$\text{neighborhoodOverlap}(v, u) = \frac{|(N(v) \cap N(u)) \setminus \{u, v\}|}{|(N(v) \cup N(u)) \setminus \{u, v\}|}$$
Numerator: shared neighbors (i.e., triadic closure)
Denominator: all neighbors of either node
INDIVIDUAL EXERCISE:

What is the neighborhood overlap of:
1. BC edge? 2. AB edge?
[Figure: exercise graph; visible node labels: A, C, D, E, F, G]
Centrality: degree, betweenness, closeness, information;

NODE-LEVEL METRICS
Centrality
• How does a node relate to the
overall network?
• In real-life…
– Information flow
– Bargaining power
– Influence
Review: Node Degree
• Dfn: The node degree is the number of neighbors a node has.
d_i(G) = |N(i)|
[Figure: graph on nodes A, B, C, D, E, with one node annotated deg=3]
• w.r.t. adjacency matrix (pick row i and sum the adjacency matrix values):
$d_i(G) = \sum_{j \in V} a_{i,j}$

    A B C D E
A   0 1 0 0 1
B   1 0 1 0 0
C   0 1 0 0 1
D   0 0 0 0 1
E   1 0 1 1 0
Review:
Local Bridges
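Dfn: An edge e = (i,j) is a local bridge in the graph G if the shortest path between i and j in G - e is longer than 2: $\text{path}_{G-e}(i, j) > 2$
The length $\text{path}_{G-e}(i, j)$ is the span of the local bridge
i.e., i and j have no friends in common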
Centrality Measures
Centrality
Degree Centrality
• How connected is the node?
• Node degree, scaled to [0,1]
$$C_i^D(G) = \frac{d_i(G)}{|V| - 1}$$
where d_i(G) is the degree of node i and |V| - 1 is the # of nodes in the network (not i)
• Missing: node location?
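Degree and degree centrality follow from row sums of the adjacency matrix. The sketch below uses the five-node matrix from the Node Degree review slide (edges A-B, A-E, B-C, C-E, D-E); the variable names are choices made for this illustration.

```python
nodes = ["A", "B", "C", "D", "E"]
adjacency = [
    [0, 1, 0, 0, 1],  # A
    [1, 0, 1, 0, 0],  # B
    [0, 1, 0, 0, 1],  # C
    [0, 0, 0, 0, 1],  # D
    [1, 0, 1, 1, 0],  # E
]

# Degree of node i: pick row i of the adjacency matrix and sum it.
degree = {node: sum(row) for node, row in zip(nodes, adjacency)}

# Degree centrality: degree divided by the number of other nodes, |V| - 1.
degree_centrality = {node: d / (len(nodes) - 1) for node, d in degree.items()}

for node in nodes:
    print(node, degree[node], degree_centrality[node])
```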
Betweenness Centrality
• How important is this node in
connecting others?
$$C_i^B(G) = \sum_{k \neq j;\ i \notin \{k, j\}} \frac{P_i(k, j) / P(k, j)}{(|V| - 1)(|V| - 2)/2}$$
where P_i(k, j) is the # of shortest paths between k and j that include i, and P(k, j) is the # of shortest paths between k and j
Closeness Centrality
• How easily can a node reach others?
• Simple: Inverse average distance
$$C_i^C(G) = \frac{|V| - 1}{\sum_{j \neq i} \text{distance}(i, j)}$$
where |V| - 1 is the # of nodes (not i), so the expression is 1 over the avg. distance from node i
• Decay centrality:
$$\sum_{j \neq i} \delta^{\text{distance}(i, j)}$$
with decay parameter δ in [0,1]
– If δ -> 0? Degree centrality
– If δ -> 1? Component size
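Both closeness measures need only the distances from node i. A minimal sketch, assuming a connected graph for closeness and summing decay centrality over the nodes reachable from i; the function names are hypothetical, and the example reuses the five-node graph from the Node Degree review slide.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def closeness_centrality(adj, i):
    """(|V| - 1) divided by the summed distance from i; assumes a connected graph."""
    dist = distances_from(adj, i)
    return (len(adj) - 1) / sum(d for node, d in dist.items() if node != i)

def decay_centrality(adj, i, delta):
    """Sum of delta ** distance(i, j) over every other node j reachable from i."""
    dist = distances_from(adj, i)
    return sum(delta ** d for node, d in dist.items() if node != i)

graph = {
    "A": {"B", "E"}, "B": {"A", "C"}, "C": {"B", "E"},
    "D": {"E"}, "E": {"A", "C", "D"},
}
print(closeness_centrality(graph, "E"), closeness_centrality(graph, "D"))
print(decay_centrality(graph, "E", 0.5), decay_centrality(graph, "E", 1.0))
```

With delta = 1 every reachable node counts fully, so the sum equals the number of other nodes in i's component; with small delta only the direct neighbors matter, so the ranking follows degree.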


Information Centrality
• How well can the network respond
if a node is deactivated?
– Measure in graph efficiency:
$$E(G) = \frac{1}{|V|(|V| - 1)} \sum_{i, j \in G,\ i \neq j} \frac{1}{d_{i,j}}$$
• Compare the change in efficiency
$$C_i^I(G) = \frac{\Delta E}{E} = \frac{E[G] - E[G']}{E[G]}$$
where G' is the graph G without node i or corresponding edges
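The efficiency drop can be computed by deactivating one node and re-measuring. This sketch makes one assumption that the slide leaves open: the deactivated node i loses all of its edges but stays in the node count, so E[G'] is normalized by the same |V|(|V| - 1) as E[G]. Dropping the node from the count instead gives different values. Function names and the toy graph are hypothetical.

```python
from collections import deque

def efficiency(adj):
    """Average of 1 / distance over all ordered pairs of distinct nodes."""
    n = len(adj)
    total = 0.0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        # Unreachable pairs contribute 1 / infinity = 0, so they are skipped.
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n * (n - 1))

def information_centrality(adj, i):
    """Relative drop in efficiency when node i is deactivated."""
    # Assumption: node i stays in the node set but loses all of its edges.
    pruned = {n: (set() if n == i else nbrs - {i}) for n, nbrs in adj.items()}
    full = efficiency(adj)
    return (full - efficiency(pruned)) / full

# Hypothetical toy graph: the path A - B - C.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
for node in sorted(path):
    print(node, round(information_centrality(path, node), 3))
```

In the path example the middle node B scores highest, since deactivating it leaves no connected pair at all.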
