
2 Introduction To Network Analysis-Network Structure - Jasmine Mondolo

This document discusses network structure and topology. It defines key concepts like degree distribution, density, and centrality measures. Degree distribution looks at the number of connections each node has. Density measures the proportion of actual vs. possible connections. Centrality measures evaluate nodes' importance, like degree centrality, closeness centrality, and betweenness centrality. The document also discusses network-level metrics, clustering coefficients, modularity, and different network models like random, regular, scale-free, and small-world networks.

A gentle introduction to network analysis

Part 2: Network structure

Jasmine Mondolo
Network structure/topology: an introduction

• It defines the way different nodes are placed and linked to each other, and the
overall patterns that emerge

• Each network has its own topology or structure, which affects the actions and
capabilities of the nodes at the local level, the “cost” of travelling through the
network (e.g., the ease of people’s movements in a transportation network, or of
water flow in a hydraulic network) and the diffusion/contagion of a phenomenon
(e.g., an idea, a disease), which is also influenced by the node(s) from which it
starts
Degree Distribution

• The degree of a node is the number of edges attached to that node

• A degree distribution is a tally of how many nodes have each degree
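
The tally described above can be sketched in a few lines of Python. The edge list below is a hypothetical toy network (it is not one of the networks shown in the slides), and the function names are illustrative only.

```python
from collections import Counter

# Hypothetical undirected toy network, given as a list of edges.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

def degrees(edge_list):
    """Return a dict mapping each node to its degree (number of attached edges)."""
    deg = Counter()
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def degree_distribution(edge_list):
    """Return a dict mapping each degree to the number of nodes having it."""
    return dict(Counter(degrees(edge_list).values()))

deg = degrees(edges)
dist = degree_distribution(edges)
print(deg)   # degree of every node
print(dist)  # how many nodes have each degree
```

Note that nodes without any edge never appear in an edge list, so in this sketch isolated nodes (degree 0) would have to be added separately.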


Degree Distribution and Power-law Distribution

• The larger the range of degrees, the more the degree distribution
approximates a curve

• We may observe quite different degree distributions; networks often follow a
power-law distribution
Density

• It measures how many edges exist versus how many edges there could
possibly be

• It is the ratio between the number of “realized” edges and the number
of all the possible edges; it is a measure of network connectedness

$$\text{density} = \frac{\#\,\text{edges}}{\#\,\text{possible edges}}$$

$$\#\,\text{possible edges} = \frac{\#\,\text{nodes} \cdot (\#\,\text{nodes} - 1)}{2} \quad \text{(«hand-shake» problem)}$$
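
As a minimal sketch of the two formulas above for an undirected network, the following Python functions compute the number of possible edges and the density. The example figures (7 nodes, 9 edges) are hypothetical and are not taken from the slides.

```python
def possible_edges(n_nodes):
    """Number of possible edges among n_nodes in an undirected network."""
    return n_nodes * (n_nodes - 1) // 2

def density(n_nodes, n_edges):
    """Ratio of realized edges to possible edges (0 for fewer than 2 nodes)."""
    if n_nodes < 2:
        return 0.0
    return n_edges / possible_edges(n_nodes)

# Hypothetical example: 7 nodes and 9 edges.
print(possible_edges(7))        # 21
print(round(density(7, 9), 2))  # 0.43
```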
Density

In the example network, about 43% of all the possible edges exist; equivalently,
taking two nodes at random, the probability that an edge exists between them is
about 43%
Density
• Another example: two networks with the same nodes but different density

• Caution is required when interpreting the value, especially when comparing
networks of different magnitude (the bigger the network gets, the lower the
density often is; e.g., 1% of density in a network of 1,000 nodes vs 1% of density in
a network of 100,000 nodes)

• The type of relationship matters (e.g., informal interactions/similarities vs
collaborations)
Relevant structural metrics
• Node-level metrics

• Degree Centrality
• Closeness Centrality
• Betweenness Centrality
• Local Clustering Coefficient

• Network-level metrics

• Magnitude/Scale
• Density
• Diameter
• Average degree and average weighted degree
• Average (shortest) path length
• Average/global clustering coefficient
• Modularity
(the last two are «cluster-level» metrics)
Node-level measures

Which node is the most “important”?


Node-level measures
They measure the position/“importance” of a node (from various perspectives)
within a network

• Local (first-order) centrality measures: they account only for the “nearest
neighbourhood” of a node
• Degree centrality
• Local clustering coefficient
• Other measures (e.g., average neighbours’ degree)

• Global (higher-order) centrality measures: they account for the overall
network structure
• Closeness centrality
• Betweenness centrality
• Other measures (not studied here)
Degree Centrality
• Degree centrality is simply the degree of a node (i.e., the number of edges a node
has)

DC_D = 5
DC_R = 9
DC_F = 3
DC_B = 4

• It can be practical to normalise this value (for comparing networks with different
numbers of nodes, but also for easier interpretation), using N-1 (all the other nodes)
as the normalizing factor; the range is then 0-1

• NormDC_R = 9/17 = 0.53 (R is directly connected to 53% of the other nodes)


• In a directed network we can calculate in-degree and out-degree centrality
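
The normalization and the in-degree/out-degree distinction can be sketched as follows. The value for node R (degree 9, with N-1 = 17) comes from the slide; the directed edge list is a hypothetical example, and the function names are illustrative.

```python
from collections import Counter

def normalized_degree_centrality(degree, n_nodes):
    """Degree divided by N - 1, the number of other nodes (range 0-1)."""
    return degree / (n_nodes - 1)

def in_out_degrees(directed_edges):
    """Return (in_degree, out_degree) counters for (source, target) pairs."""
    in_deg, out_deg = Counter(), Counter()
    for source, target in directed_edges:
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg

# Node R from the slide: degree 9 in a network of 18 nodes (N - 1 = 17).
norm_r = normalized_degree_centrality(9, 18)
print(round(norm_r, 2))  # 0.53

# Hypothetical directed network.
directed_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
in_deg, out_deg = in_out_degrees(directed_edges)
print(dict(in_deg), dict(out_deg))
```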
Local Clustering Coefficient

• It is the ratio between the number of actual edges between a node’s
alters/nearest neighbours and the total number of possible edges between
them, i.e., how close a node’s alters are to forming a clique (e.g., “How many
of my friends are also close to each other?”)

LCC(i) = 10/10 = 1      LCC(i) = 3/10 = 0.3      LCC(i) = 0/10 = 0
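
A minimal sketch of this ratio in Python is given below. The toy network is hypothetical: node "i" has five neighbours with three edges among them, which reproduces the 3/10 case. Returning 0 for nodes with fewer than two neighbours is a convention assumed here, not something stated in the slides.

```python
from itertools import combinations

def build_adjacency(edge_list):
    """Turn an undirected edge list into a dict: node -> set of neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def local_clustering(adj, node):
    """Actual edges among the node's neighbours / possible edges among them."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: no pair of neighbours to connect
    possible = k * (k - 1) / 2
    actual = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return actual / possible

def average_clustering(adj):
    """Average of the local clustering coefficients of all the nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# Hypothetical ego network: "i" has 5 neighbours and 3 edges among them.
edges = [("i", "a"), ("i", "b"), ("i", "c"), ("i", "d"), ("i", "e"),
         ("a", "b"), ("b", "c"), ("d", "e")]
adj = build_adjacency(edges)
lcc_i = local_clustering(adj, "i")
print(round(lcc_i, 2))                    # 0.3
print(round(average_clustering(adj), 2))  # average over all six nodes
```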


Local Clustering Coefficient

• It corresponds to the density of the 1.5-degree egocentric network of
the node

• High levels of local clustering coefficients are often associated with low
levels of degree centrality

• The average of the local clustering coefficients of all the nodes in the
network is a useful structural measure
Closeness Centrality
• Closeness centrality measures how close (in terms of topological distance) a node
is to every other node in the network

• It can be regarded as a measure of how long it will take to spread something (e.g.,
information or a disease) from the node of interest to all the other nodes
sequentially

• It is often calculated as the average of the shortest path lengths (d) from one
node (i) to every other node (j) in the network (lower values, higher centrality):

$$CC(i) = \frac{\sum_{j=1}^{n} d(i,j)}{n-1}$$

• Sometimes it is calculated as the inverse of the average shortest path length, or
simply as the inverse of the sum of the shortest path lengths (the higher the index,
the closer a node is to the other nodes):

$$CC(i) = \frac{n-1}{\sum_{j=1}^{n} d(i,j)} \qquad \text{or} \qquad CC(i) = \frac{1}{\sum_{j=1}^{n} d(i,j)}$$
Closeness Centrality

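• If a node is directly connected to all the other nodes, $d(i,j) = 1$ for every
$j \neq i$, so $\sum_{j=1}^{n} d(i,j) = n-1$ and $CC(i) = 1$ (using the inverse of the
average shortest path length); for a disconnected node, $CC = 0$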
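
The inverse-average form of closeness centrality can be sketched with a breadth-first search over an unweighted, undirected network. The star network below is hypothetical, and returning 0 whenever a node cannot reach every other node is an assumption made for this sketch.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths (in steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, node):
    """(n - 1) / sum of d(i, j); 0 if some node cannot be reached (assumption)."""
    n = len(adj)
    dist = bfs_distances(adj, node)
    if n < 2 or len(dist) < n:
        return 0.0
    return (n - 1) / sum(dist.values())

# Hypothetical star network: "c" is directly connected to all the other nodes.
star = {"c": {"a", "b", "d", "e"},
        "a": {"c"}, "b": {"c"}, "d": {"c"}, "e": {"c"}}
print(closeness(star, "c"))            # 1.0
print(round(closeness(star, "a"), 3))  # 4/7, about 0.571

# The same star plus an isolated node "z".
with_isolate = dict(star, z=set())
print(closeness(with_isolate, "z"))    # 0.0
```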
Betweenness Centrality
• It measures the extent to which a node is a gatekeeper/bridge/intermediary in
the network, i.e., how much potential control a node has over the flow of
information

• It captures how frequently a node (i) lies on the shortest paths (G) linking
pairs of the other n-1 nodes

• To calculate the BC of node C, we first compute, for every pair of other nodes
in the network, the fraction of the shortest paths between them that go through C
(i.e., on which C lies), and then we sum these fractions
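
The procedure described above can be sketched as follows for an unweighted, undirected network. This is a straightforward pair-by-pair computation rather than an optimized algorithm; the toy network and the function names are hypothetical, and the result is left unnormalized.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): distances and numbers of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # every shortest path to u extends to v
    return dist, sigma

def betweenness(adj, node):
    """Sum, over pairs of other nodes, of the share of shortest paths via node."""
    info = {s: bfs_counts(adj, s) for s in adj}
    dist_v, sigma_v = info[node]
    total = 0.0
    others = [x for x in adj if x != node]
    for s, t in combinations(others, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s or node not in dist_s:
            continue  # s and t (or s and node) are not connected
        if dist_s[node] + dist_v[t] == dist_s[t]:
            total += sigma_s[node] * sigma_v[t] / sigma_s[t]
    return total

# Hypothetical network: a square A-B-D-C-A with a pendant node E attached to D.
adj = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"},
       "D": {"B", "C", "E"}, "E": {"D"}}
for v in adj:
    print(v, betweenness(adj, v))
```

In this toy network D scores highest because every shortest path to E passes through it, while B and C each carry only half of the shortest paths between A and D (and between A and E).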
Example
Network-level metrics

• Magnitude/order/cardinality (the number of nodes) and the number of edges

• Density

• Diameter: how many steps it takes for the two most distant nodes in the network to
reach one another (i.e., the longest of the shortest paths)

• Average degree: the average of the degrees of all the nodes

• Average weighted degree: the average degree that accounts for the weights of the
nodes’ edges

• Average (shortest) path length/average geodesic distance: the average of the shortest
path lengths, i.e., how many steps it takes, on average, to get from one node to another;
it is sometimes regarded as a measure of communication efficiency
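
A minimal sketch of some of these network-level metrics for an unweighted, undirected and connected network is shown below. The path network A-B-C-D is a hypothetical example; average weighted degree is left out because it would require edge weights.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest path lengths (in steps) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def network_metrics(adj):
    """Basic network-level metrics; assumes the network is connected."""
    n = len(adj)
    m = sum(len(neighbours) for neighbours in adj.values()) // 2
    lengths = []
    for s in adj:
        dist = bfs_distances(adj, s)
        lengths.extend(d for t, d in dist.items() if t != s)
    return {
        "nodes": n,
        "edges": m,
        "average_degree": 2 * m / n,  # each edge adds 1 to two degrees
        "diameter": max(lengths),     # longest of the shortest paths
        "average_path_length": sum(lengths) / len(lengths),
    }

# Hypothetical path network A-B-C-D.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
metrics = network_metrics(path)
print(metrics)
```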
Cluster-level metrics

• Average clustering coefficient: the average of the (local) clustering
coefficients of all the nodes in the network; it captures the proportion of
complete triangles

• e.g., avgCC = 0.25 means that 25% of the possible triangles are complete

• Modularity
Modularity

• A measure of the extent to which a network can be divided into
groups/clusters/communities/modules (“Does it have a strong group
structure?”). It can be regarded as an indicator of clustering

• Networks with high modularity have dense connections between the
nodes within communities but sparse connections between nodes in
different communities

• Note: clustering (the intrinsic tendency of the nodes to
form clusters, regardless of the reasons) vs
partitioning (which is made by the researcher)
• Its calculation requires an optimization process (for more information, see the video:
https://www.youtube.com/watch?v=berZf9Nhr0E); popular methods for community
detection: Louvain’s method and the algorithm by Valente et al. 2015 (0-1 range)
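
The sketch below only scores a partition that is already given; it does not perform the optimization that community-detection methods such as Louvain's carry out. It assumes the standard modularity formula for an unweighted, undirected network, Q = Σ_c [L_c/m − (d_c/2m)²], where L_c is the number of edges inside community c, d_c is the sum of the degrees of its nodes and m is the total number of edges. The network of two triangles joined by one edge is a hypothetical example.

```python
def modularity(edges, communities):
    """Modularity Q of a given partition (a list of sets of nodes)."""
    m = len(edges)
    community_of = {}
    for index, members in enumerate(communities):
        for node in members:
            community_of[node] = index
    internal = [0] * len(communities)    # L_c: edges inside each community
    degree_sum = [0] * len(communities)  # d_c: sum of degrees per community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in range(len(communities)))

# Hypothetical network: two triangles joined by the single edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
q_two_groups = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_one_group = modularity(edges, [{1, 2, 3, 4, 5, 6}])
print(round(q_two_groups, 3))  # 0.357
print(round(q_one_group, 3))   # 0.0
```

Splitting the network into its two triangles scores higher than keeping all the nodes in one group, which matches the idea of dense connections within communities and sparse connections between them.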
Networks with multiple types of nodes (brief mention)

• Two-mode or bipartite networks (e.g., people and organizations)
• There are two categories of nodes
• Connections take place only between nodes
of different types
• There may also be more than two partitions
(multipartite networks)

• [Networks with multiple types of nodes that are not bi/multipartite]


Main types of network/network models

There are some major network typologies presenting some specific
features:

• Random networks

• Regular networks

• Centralized or scale-free networks

• «Small worlds»
Random network

• Also known as Erdős–Rényi random graph model

• It does not display a clear pattern

• It is generated by simply taking a set of nodes and randomly placing edges
between them with some given fixed probability

• The degree distribution is often bell-shaped

• The average clustering coefficient is generally small

• The average shortest path length is generally very small
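
The generation rule described above (every possible pair of nodes receives an edge independently with a fixed probability p) can be sketched as follows. The function name, the parameters and the seed are illustrative choices for this example.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Random network: each of the n(n-1)/2 possible edges exists with probability p."""
    rng = random.Random(seed)
    nodes = list(range(n))
    edges = [(u, v) for u, v in combinations(nodes, 2) if rng.random() < p]
    return nodes, edges

nodes, edges = erdos_renyi(50, 0.1, seed=42)
possible = len(nodes) * (len(nodes) - 1) // 2
print(len(edges), "edges out of", possible, "possible")
print("density:", round(len(edges) / possible, 3))  # expected to be near p
```

Because each edge is placed independently, the density of the generated network fluctuates around p from one run (or seed) to another.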


Regular network

• Each node has the same degree

• There is a clear pattern

• Edges are placed following a rule

[Figure: regular network vs random network]

• The average shortest path length is generally quite long/high

• The average clustering coefficient is generally high

• Example: the map of Manhattan’s streets


Small world
• Small world “concept” (related to the “six degrees of separation” theory) and more
formal definition (small-world graph/network; Watts & Strogatz, 1998)

• No strong degree variation; several well-connected nodes

• It has a high average clustering coefficient, like regular networks

• The average shortest path length tends to be short, like random networks,
relative to the network magnitude (which is generally very large)

• It can be very effective at spreading information and connecting individuals who
are physically remote

• Examples: several social networks, neural networks, power grid networks… (When
Facebook had 721 million users, some researchers calculated that the average shortest path
length was just 4.74)
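
One common way to build a small-world network, in the spirit of Watts & Strogatz (1998), is to start from a regular ring lattice and rewire each edge with a small probability p. The sketch below is a simplified variant written for illustration, not necessarily the exact procedure of the original paper; the function names, parameters and seed are assumptions.

```python
import random
from itertools import combinations

def ring_lattice(n, k):
    """Regular network: each node is linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def small_world(n, k, p, seed=None):
    """Rewire each lattice edge with probability p to a randomly chosen new node."""
    rng = random.Random(seed)
    adj = ring_lattice(n, k)
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if rng.random() < p:
                # New endpoint: not i itself and not already a neighbour of i.
                candidates = [w for w in range(n) if w != i and w not in adj[i]]
                if not candidates:
                    continue
                new = rng.choice(candidates)
                adj[i].discard(j)
                adj[j].discard(i)
                adj[i].add(new)
                adj[new].add(i)
    return adj

def average_clustering(adj):
    """Average local clustering coefficient (0 for nodes with < 2 neighbours)."""
    total = 0.0
    for node, neighbours in adj.items():
        k = len(neighbours)
        if k >= 2:
            links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)

regular = small_world(20, 4, 0.0)
rewired = small_world(20, 4, 0.2, seed=1)
print(average_clustering(regular))  # 0.5 for this ring lattice
print(sum(len(nb) for nb in rewired.values()) // 2, "edges after rewiring")
```

Rewiring keeps the number of edges unchanged; with a small p, most of the local clustering of the lattice is typically retained while a few long-range shortcuts reduce the path lengths.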
Centralized or scale-free network
• It is very heterogeneous and unequal in terms of how connected and influential the
different nodes in the network are: it has a very uneven/skewed/asymmetric
degree distribution (there is a power-law relationship between a degree and
the frequency of its occurrence)

• It is quite widespread in the real world (e.g., the network of
global banking activity or the World Wide Web)

• It can be very robust or very fragile, depending on the type of “attack”
(whether nodes are removed randomly or strategically)
Main types of network structure: a summary

Network type    Average clustering    Average (shortest)    Degree distribution
                coefficient           path length
Regular         high                  high                  nodes have the same degree
Random          low                   low                   tends to be bell-shaped
Scale-free      (low)                 variable              power-law
Small worlds    high                  low                   (power-law)
Measuring and visualizing networks

[table retrieved from Chapter 8, p.100 of Patent Analytics]

> Stata and especially R can be used as well

> Gephi: ease of use + quite complete analysis

> Pajek is often used for patent network analysis
What we learnt so far

• Network basics
• Definition of network structure and some basic examples
• Degree distribution
• Main network metrics at the node level, network level and cluster
level
• One-mode vs two-mode networks
• Main types of network
• Things to consider when building a network
• Main software used for performing network analysis
