SNA - T2-3 - Graphs and Degree

Graphs an introduction

Social Network Analysis

A.Y. 23/24

Communication Strategies

1
Graphs
an introduction

2
Euler and the 7 bridges of Königsberg
(Prussia, 1736), today Kaliningrad

How to walk through the city by crossing each bridge only once?

3
Networks as graphs

Graph G(V, E): network

• Vertices (set V): nodes, people, concepts
• Edges (set E): links, relations, associations

[Figure: example graph; labels in the plot: technology, mathematics, social psychology, social cognition]

4
Directed versus undirected

• A connection relationship can have a privileged direction or can be mutual
• Either a directed or an undirected link

[Figure: nodes A, B, C, D joined by directed and undirected links]

• If the network has only (un)directed links, it is itself called an (un)directed network
• Certain networks can have both types

5
Directed versus undirected

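• At first glance, an undirected link can be turned into a pair of directed links by duplicating it (one per direction), but the two representations are not necessarily the same

Ann & Bob know each other (one undirected link)

Carl sent an email to Dana (one directed link)

Dana sent an email to Carl (another directed link)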

6
Some examples

network | nodes and links | type
Facebook | Profiles and friendship | undirected
Instagram | Accounts and followers | directed
the www | Webpages and links | directed
citation network | Papers and references | directed
social network | People and friends/acquaintances | undirected
movie network | Actors and co-starring | undirected
genealogy | People and parenthood | directed

7
Can U think of other social networks?

network | nodes and links | type
Twitter | Accounts & follows | directed
WhatsApp | People & messages | directed
WhatsApp | People & contacts | undirected
TikTok | Accounts & friendship | undirected
LinkedIn | Accounts & friendship | undirected
TikTok | Accounts & follows | directed
Pinterest | People & image like | directed
YouTube | Accounts & followers | directed
YouTube | Accounts & collaborations | undirected
Ask | Accounts & replies | directed
LinkedIn | Accounts & followers | directed
8
Generality of representation

[Figure: example graph with nodes labelled A, B, C, E]

9
Graph representations
visual plot, adjacency matrix, edge list

10
Multi-graphs

• Multi-graphs (or pseudo-graphs)
Some network representations require multiple links (e.g., number of citations from one author to another)

[Figure: nodes A and B joined by multiple edges]

11
Weighted graphs

• Weighted graph
Sometimes a weight is associated with a link, e.g., to underline that the links are not identical (strong/weak relationships)

Can be seen as a generalization of multi-graphs (weight = # of links)

e.g., strength of a tie:
0.2 = weak (acquaintances)
1 = strong (friends)
1.5 = stronger (close friends)
2.3 = very strong (best friends)

[Figure: nodes A, B, C, D with link weights 1, 1.5, 0.2, 2.3]
12
Signed graphs

• Edges can have signed values
positive if there is an agreement between nodes
negative if there's a disagreement

• This is typical of correlation networks
correlation = a measure of similarity

• More difficult to handle

13
Signed graph example

A personality network (Costantini et al., 2015)
Negative links are displayed in red

[Figure: network of personality facets: Sincerity, Modesty, Fairness, Greed-avoidance, Fearfulness, Dependence, Sentimentality, Anxiety, Forgiveness, Gentleness, Flexibility, Patience, Social boldness, Social self-esteem, Sociability, Liveliness, Inquisitiveness, Creativity, Unconventionality, Aesthetic appreciation, Diligence, Prudence, Perfectionism, Organization]

14
Self interactions

• In many networks nodes do not interact with themselves

• To account for self-interactions, we add loops to represent them

[Figure: nodes A, B, C with a self-loop]

15
Adjacency matrix

• An adjacency matrix A = [a_ij] associated with graph G has entries a_ij, where i is the row index and j is the column index
a_ij = 0 if nodes i and j are not connected
a_ij ≠ 0 if nodes i and j are connected

[Figure: weighted graph with nodes 1 to 4 and link weights 0.3, 1, 1.5, 0.2, 2.3]

A =
0.3  1    0    0
1    0    1.5  0.2
0    1.5  0    2.3
0    0.2  2.3  0

e.g., a_12 = 1 is the entry in row 1, column 2

16
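As a minimal sketch, the matrix on this slide can be built in Python from a list of weighted links. The links are read off the matrix entries (including the 0.3 on the diagonal, taken here as a self-loop on node 1); the helper name `adjacency_matrix` is hypothetical.

```python
def adjacency_matrix(n, weighted_edges):
    """Build an n x n adjacency matrix for an undirected weighted graph.

    Nodes are numbered 1..n; each edge is a tuple (i, j, weight).
    """
    A = [[0.0] * n for _ in range(n)]
    for i, j, w in weighted_edges:
        A[i - 1][j - 1] = w
        A[j - 1][i - 1] = w  # undirected: mirror the entry
    return A


# Links read off the slide's matrix: a self-loop on node 1 plus four weighted links.
edges = [(1, 1, 0.3), (1, 2, 1), (2, 3, 1.5), (2, 4, 0.2), (3, 4, 2.3)]
A = adjacency_matrix(4, edges)

for row in A:
    print(row)
print("a12 =", A[0][1])  # row 1, column 2 (Python indices start at 0)
```

Storing the matrix as a list of rows keeps the row/column reading of the slide: `A[i - 1][j - 1]` is a_ij.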
Symmetries

• Undirected graph = symmetric matrix

A =
0.3  1    0    0
1    0    1.5  0.2
0    1.5  0    2.3
0    0.2  2.3  0

• Directed graph = asymmetric matrix

A =
0.3  1    0    0
1    0    1.5  0
0    1.5  0    0
0    0.2  2.3  0

17
Convention

• The weight a_ij (i-th row, j-th column) is associated with the directed edge j → i, starting from node j and leading to node i

A =
0.3  1    0    0
1    0    1.5  0
0    1.5  0    0
0    0.2  2.3  0

Here a_42 = 0.2 and a_43 = 2.3 are nonzero while a_24 = a_34 = 0: the links 2 → 4 and 3 → 4 exist, but the reverse links do not
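The convention can be sketched in a few lines of Python. The list of directed links below is derived from the matrix on the slide, and the function name `directed_adjacency` is a hypothetical helper, not part of the course material.

```python
def directed_adjacency(n, edges):
    """a[i][j] holds the weight of the directed edge j -> i (the slide's convention)."""
    A = [[0.0] * n for _ in range(n)]
    for source, target, w in edges:
        A[target - 1][source - 1] = w  # row = arrival node, column = starting node
    return A


# Directed links as (from, to, weight), consistent with the matrix on the slide.
edges = [(1, 1, 0.3), (2, 1, 1), (1, 2, 1), (3, 2, 1.5),
         (2, 3, 1.5), (2, 4, 0.2), (3, 4, 2.3)]
A = directed_adjacency(4, edges)

print("a42 =", A[3][1])  # link 2 -> 4 exists
print("a24 =", A[1][3])  # link 4 -> 2 does not
```

Other texts and libraries often use the opposite convention (a_ij for the edge i → j), so it is worth checking which one a given tool assumes.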

18
An example
which of these representations do you like best?

[Figure: graph plot with six nodes numbered 1 to 6, named Oliver, Marc, Sarah, Giulia, Anna, Thomas]

A =
0 0 1 1 1 0
0 0 0 1 1 0
1 0 0 0 1 0
1 1 0 0 0 1
1 1 1 0 0 1
0 0 0 1 1 0

19
Graph plots may carry relevant info…
US republicans and democrats interactions on Twitter (2020)

[Figure: network plot; labels: speaker of the US House of Representatives, republicans, democrats]

20
… or may not!

21
Real networks are sparse

• The adjacency matrix is typically sparse
good for tractability!

[Figure: plot of a sparse adjacency matrix A]

22
Multi-layer networks

described by a set of adjacency matrices A_ℓ
e.g., one for likes, one for mentions, and one for retweets

23
A question 4 U

• So, what's the take-away so far?

24
Storing network data
adjacency matrix versus edge list

adjacency matrix: N² entries

A =
0 0 1 1 1 0
0 0 0 1 1 0
1 0 0 0 1 0
1 1 0 0 0 1
1 1 1 0 0 1
0 0 0 1 1 0

edge list: L entries

1 → 3
1 → 4
1 → 5
2 → 4
2 → 5
3 → 5
4 → 6
5 → 6

Which one do U think is better?
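To make the comparison concrete, here is a small Python sketch that converts between the two storage formats for the six-node example. The function names are hypothetical, and the graph is treated as undirected, so each link is listed once.

```python
def matrix_to_edge_list(A):
    """List each undirected link once as (i, j) with i < j; nodes are numbered from 1."""
    n = len(A)
    return [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n) if A[i][j]]


def edge_list_to_matrix(n, edges):
    """Rebuild the symmetric 0/1 adjacency matrix from an undirected edge list."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i - 1][j - 1] = A[j - 1][i - 1] = 1
    return A


A = [
    [0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
edges = matrix_to_edge_list(A)
print(edges)
print("matrix entries:", len(A) ** 2, "| edge-list entries:", len(edges))
```

For this example the matrix stores 36 numbers and the edge list 8 links; since real networks are sparse, the gap grows quickly with N.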


25
Distances in graphs
and related concepts

26
Paths

• Path
a sequence of interconnected nodes (meaning each pair of nodes adjacent in the sequence is connected by a link)
[Figure: path through nodes A, B, C, D]

• Path length
# of links involved in the path (if the path involves n nodes then the path length is n-1)

• Cycle
path where starting and ending nodes coincide
[Figure: cycle through nodes A, B, C, D]
27
Distances

• Shortest path (between any two nodes)
the path with the minimum length; its length is called the distance
it is not unique!

• Diameter (of the network)
the highest distance in the network

[Figure: graph with nodes 1 to 8; the distance between nodes 1 and 8 is d_18 = 5, and the diameter is d = 5]
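Distances in an unweighted graph can be computed with breadth-first search (BFS). The sketch below is one possible implementation; as an assumption, it uses the eight-node graph whose adjacency matrix appears on the Bridges slide, since this slide does not list its links. With that graph the results agree with the values quoted here.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return {node: number of links on a shortest path from source}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first visit = shortest distance in an unweighted graph
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Assumed links: taken from the adjacency matrix on the Bridges slide.
edges = [(1, 2), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5),
         (5, 6), (6, 7), (6, 8), (7, 8)]
adj = {node: set() for node in range(1, 9)}
for i, j in edges:
    adj[i].add(j)
    adj[j].add(i)

d18 = bfs_distances(adj, 1)[8]
diameter = max(max(bfs_distances(adj, s).values()) for s in adj)
print("d18 =", d18, "| diameter =", diameter)
```

In this graph two different shortest paths join nodes 1 and 8 (1-2-3-5-6-8 and 1-2-4-5-6-8), which illustrates why the shortest path is not unique while the distance is.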

28
Small world

• Average path length
average distance between all node pairs (apply a shortest-path algorithm to all node pairs, and take the average)

• In real networks the distance between two randomly chosen nodes is generally short
• Milgram [1967]: 6 degrees of separation

• What does this mean?
We are more connected than we think
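As an illustration of the average path length, the sketch below computes all pairwise distances with the Floyd-Warshall algorithm (one possible choice; running BFS from every node works as well) and averages them. It assumes a connected undirected graph and reuses the six-node example from the adjacency-matrix slide; the function name is hypothetical.

```python
from itertools import combinations


def average_path_length(n, edges):
    """Average shortest-path distance over all node pairs (graph assumed connected)."""
    INF = float("inf")
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j in edges:
        d[i - 1][j - 1] = d[j - 1][i - 1] = 1
    # Floyd-Warshall: allow paths through intermediate node k, one k at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    pairs = list(combinations(range(n), 2))
    return sum(d[i][j] for i, j in pairs) / len(pairs)


edges = [(1, 3), (1, 4), (1, 5), (2, 4), (2, 5), (3, 5), (4, 6), (5, 6)]
apl = average_path_length(6, edges)
print("average path length:", round(apl, 3))
```

Here 8 of the 15 pairs are directly linked and the other 7 are two steps apart, so the average is 22/15, a little under 1.5.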
29
Small world
we and the US presidents

[Figure: numbered steps 1, 2, 4; step 1 is annotated "Granovetter's weak tie ;-)"]

30
Connectivity

• Connected graph (undirected)
for all pairs (i,j) there exists a path connecting them
if disconnected, we count the # of connected components (e.g., use BFS and iterate)

• Giant component (the biggest one)

• Isolates (the other ones)

[Figure: graph with nodes 1 to 8, split into the components {1, 2, 3, 4, 5} and {6, 7, 8}]

A =
0 1 0 0 0 0 0 0
1 0 1 1 0 0 0 0
0 1 0 1 1 0 0 0
0 1 1 0 1 0 0 0
0 0 1 1 0 0 0 0
0 0 0 0 0 0 1 1
0 0 0 0 0 1 0 1
0 0 0 0 0 1 1 0

block-diagonal matrix

31
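The "use BFS and iterate" recipe can be sketched as follows: start a BFS from any node not yet visited, collect everything it reaches as one component, and repeat. The matrix is the one on this slide; the function name is a hypothetical helper.

```python
from collections import deque


def connected_components(A):
    """Return the connected components of an undirected graph given its adjacency matrix."""
    n = len(A)
    seen = set()
    components = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            u = queue.popleft()
            component.append(u + 1)  # report nodes with the slide's 1-based numbering
            for v in range(n):
                if A[u][v] and v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(component))
    return components


A = [
    [0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 1, 0],
]
components = connected_components(A)
giant = max(components, key=len)
print("components:", components)
print("giant component:", giant)
```

The two components correspond to the two diagonal blocks of the matrix.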
Bridges

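• A bridge is a link whose removal would make the network disconnected, i.e., it is the only connection between two parts of the network

[Figure: graph with nodes 1 to 8 in which the link between nodes 5 and 6 joins the groups {1, 2, 3, 4, 5} and {6, 7, 8}]

A =
0 1 0 0 0 0 0 0
1 0 1 1 0 0 0 0
0 1 0 1 1 0 0 0
0 1 1 0 1 0 0 0
0 0 1 1 0 1 0 0
0 0 0 0 1 0 1 1
0 0 0 0 0 1 0 1
0 0 0 0 0 1 1 0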
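A direct, if not the fastest, way to find bridges is to remove each link in turn and check whether the number of connected components grows. The sketch below does this with a small union-find; the function names are hypothetical, and the links are read off the matrix on this slide.

```python
def count_components(n, edges):
    """Count connected components of a graph on nodes 1..n using union-find."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(1, n + 1)})


def find_bridges(n, edges):
    """A link is a bridge if deleting it increases the number of components."""
    base = count_components(n, edges)
    return [e for e in edges
            if count_components(n, [f for f in edges if f != e]) > base]


edges = [(1, 2), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5),
         (5, 6), (6, 7), (6, 8), (7, 8)]
bridges = find_bridges(8, edges)
print("bridges:", bridges)
```

Besides the link between nodes 5 and 6, this test also flags the link between nodes 1 and 2, because node 1 has no other connection. Faster linear-time bridge algorithms exist; this version favours readability.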

32
Bipartite graphs
and semantic networks

33
Bipartite graphs

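• Connections are available only between the groups 𝒜 and ℬ

[Figure: bipartite graph with nodes 1, 2, 3 in 𝒜 and nodes 4, 5, 6, 7, 8 in ℬ, next to its adjacency matrix A with rows and columns grouped into 𝒜 and ℬ blocks]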
34
Bipartite graph example

Tweets (nodes 1 to 3)

1: those who think they are crazy enough to change the world eventually do. #climatechange #ClimateCrisis #ClimateAction #GretaThunberg #Greta

2: Hopefully these kids will succeed where past generations have failed. #TheResistance #FBR #ClimateChange #Environment #GlobalWarming #GretaThunberg

3: The #environment can have a major effect on the human cardiovascular system. A new study has found an increase in heat-induced #heartattack risk in recent years. Could #ClimateChange be a risk factor? #longevity

Hashtags (nodes 4 to 8)

4: #GretaThunberg
5: #climatechange
6: #GlobalWarming
7: #Environment
8: #longevity

35
Meaning

• Bipartite graphs represent memberships/relationships, e.g., groups (𝒜) to which people (ℬ) belong
examples: movies/actors, classes/students, conferences/authors

• We can build separate networks (projections) for 𝒜 and ℬ (sometimes this is useful)
in the movies/actors example being linked can be interpreted in two ways: “actors in the same movie” (projection on ℬ), or “movies sharing the same actor” (projection on 𝒜)

36
Abstract example

[Figure: bipartite graph with nodes 1, 2, 3 in 𝒜 and nodes 4 to 8 in ℬ, shown together with its two projections]

Projection on 𝒜 (nodes 1, 2, 3): nodes are linked if they have a common neighbour in ℬ

Projection on ℬ (nodes 4, 5, 6, 7, 8): nodes are linked if they have a common neighbour in 𝒜

PS: we say that nodes i and j have a common neighbour k if both i and j are connected to k
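The projection rule can be sketched in Python. Because the links of the abstract figure are not recoverable, the example below uses the tweet/hashtag graph from the earlier bipartite example slide instead, with tweet-to-hashtag links read off the tweet texts (hashtags matched case-insensitively, which is an assumption). The function name `project` is hypothetical.

```python
from itertools import combinations


def project(memberships):
    """Link two items whenever they share at least one common neighbour.

    memberships maps each node of one group to the set of its neighbours
    in the other group; the result is the projection on that other group.
    """
    links = set()
    for neighbours in memberships.values():
        for a, b in combinations(sorted(neighbours), 2):
            links.add((a, b))
    return sorted(links)


# Tweets 1-3 and hashtag nodes 4-8, as numbered on the example slide.
tweet_hashtags = {1: {4, 5}, 2: {4, 5, 6, 7}, 3: {5, 7, 8}}

# Projection on the hashtags: linked if they appear in the same tweet.
hashtag_links = project(tweet_hashtags)

# Invert the mapping to project on the tweets: linked if they share a hashtag.
hashtag_tweets = {}
for tweet, tags in tweet_hashtags.items():
    for tag in tags:
        hashtag_tweets.setdefault(tag, set()).add(tweet)
tweet_links = project(hashtag_tweets)

print("hashtag projection:", hashtag_links)
print("tweet projection:", tweet_links)
```

A common refinement, not shown here, is to weight each projected link by the number of shared neighbours.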
37
Projection on a semantic network
#hashtags that appear in the same tweet are linked

[Figure: #climateaction tweets after Greta Thunberg]

38

Projection on a semantic network
words that appear in the same tweet are linked

[Figure: #metoo tweets]

39
Takeaways so far

• (un)Directed graphs
• Weighted and signed graphs
• Adjacency matrix & edge list
• Distances
• Giant component, isolates, bridges
• Bipartite graphs & projections

41
Degree centrality
a first approach to node importance

42
The notion of centrality
In Network Science

43
An example of node centrality
museums network
The Museum ecosystem on Twitter: communities, countries, and some key players

United States 28.3%
Unknown 18.24%
United Kingdom 10.69%
Mexico 4.4%
Canada 2.52%
Australia 1.89%

44
Node degree
undirected networks

• The degree k_i of node i in an undirected network is
the # of links i has to other nodes, or
the # of nodes i is linked to

[Figure: four-node graph with k_1 = 1, k_2 = 3, k_3 = 2, k_4 = 2]

The average degree is
<k> = ∑_i k_i / N = (1+3+2+2)/4 = 2
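A short Python sketch of the same computation. The edge list is an assumption: it is a four-node graph chosen to reproduce the degrees shown on the slide (1, 3, 2, 2), since the slide's links are not listed.

```python
# Assumed links, chosen so that the degrees match the slide: k = [1, 3, 2, 2].
edges = [(1, 2), (2, 3), (2, 4), (3, 4)]

degree = {node: 0 for node in range(1, 5)}
for i, j in edges:
    degree[i] += 1  # each undirected link adds one to both endpoints
    degree[j] += 1

average_degree = sum(degree.values()) / len(degree)
print("degrees:", degree)
print("average degree:", average_degree)
```

Since every link is counted at both of its ends, the average degree of an undirected network also equals 2L/N, here 2·4/4 = 2.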
45
Node degree
directed networks

• For directed networks we distinguish between
in-degree k_i^in = # of entering links
out-degree k_i^out = # of exiting links
(undirected: k_i^in = k_i^out due to the symmetry)

[Figure: four-node directed graph with k_2^in = 2 and k_2^out = 3]

The average degree is
<k> = ∑_i k_i^out / N = (1+3+2+0)/4
= ∑_i k_i^in / N = (1+2+1+2)/4
= 3/2
46
Meaning

• A social-capital measure of cohesion

• In-degree = importance as an Authority
• Out-degree = importance as a Hub
an influencer: authority or hub?

In www:
• Authorities (quality as a content provider)
nodes that contain useful information, or that have a high number of edges pointing to them (e.g., course homepages)

• Hubs (quality as an expert)
trustworthy nodes, or nodes that link to many authorities (e.g., course bulletin)

47
Adjacency matrix and degree

• With the convention a_ij = link j → i, the in-degree of node i is the sum of row i of the adjacency matrix, and the out-degree is the sum of column i

A =
0 1 0 0
1 0 1 0
0 1 0 0
0 1 1 0

e.g., k_2^in = 2 (sum of row 2) and k_2^out = 3 (sum of column 2)
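The row and column sums can be checked with a few lines of Python, using the matrix on this slide and the convention that a_ij stores the link j → i.

```python
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

# Row i collects the links arriving at node i; column i the links leaving it.
in_degree = [sum(row) for row in A]
out_degree = [sum(col) for col in zip(*A)]  # zip(*A) iterates over columns

print("in-degrees: ", in_degree)
print("out-degrees:", out_degree)
print("average degree:", sum(in_degree) / len(A))
```

Both lists add up to the number of links (6), which is why the average in-degree and the average out-degree coincide at 3/2.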

48
Real networks are sparse

• The maximum degree is N-1

• In real networks <k> ≪ N-1

49
Visualizing degree centrality
how to get useful insights on centrality

50
Graphical representations
of degree centrality

by size by colour

51
Degree distribution

✓ a probability distribution p_k
✓ p_k = the fraction of nodes that have degree equal to k
✓ p_k = # of nodes with degree k, divided by N

Example 1 (four-node graph): k = [1 3 2 2], so p_1 = 0.25, p_2 = 0.5, p_3 = 0.25

Example 2 (eight-node graph): k = [2 2 2 2 2 2 2 2], so p_2 = 1

52
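The two examples can be reproduced with a small helper that counts how often each degree occurs and divides by N. The function name `degree_distribution` is hypothetical.

```python
from collections import Counter


def degree_distribution(degrees):
    """Return {k: fraction of nodes with degree k} for a list of node degrees."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


print(degree_distribution([1, 3, 2, 2]))  # four-node example
print(degree_distribution([2] * 8))       # eight-node example, all degrees equal to 2
```

The values of p_k always add up to 1, as expected of a probability distribution.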
Log-log plot

• In real (large) networks, degrees have a large range → log representation

[Figure: log-log plot of the degree distribution, with the annotations below]
0.5 = 1/2, i.e., half of the nodes have degree 1
0.2 = 1/5, i.e., 1 node in every 5 has degree 2
1 node in every 1000 has degree 27
nodes with high degree = hubs
53
Scale-free networks
those that follow a power-law

54
The power law
typical of social networks

[Figure: log-log degree distribution with saturation at low degrees, an approximately linear region, and a plateau at high degrees]
γ is the slope of the approx. linear behaviour

Why the name power-law? Because the (approx.) linear behaviour in the log domain ensures
ln(p_k) = c - γ ⋅ ln(k) → p_k = C k^(-γ), with C = e^c
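Reading the slope off the log-log plot amounts to fitting a straight line to the points (ln k, ln p_k). The sketch below does this with ordinary least squares on synthetic, noise-free data; the values γ = 2.5 and C = 0.7 are hypothetical, chosen only to check that the fit recovers them. On real data the fit should be restricted to the approximately linear region, and a straight-line fit gives only a rough estimate.

```python
import math


def fit_power_law(pk):
    """Least-squares line through (ln k, ln p_k); returns (gamma, C)."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    c = mean_y - slope * mean_x  # intercept of the line ln(p_k) = c - gamma * ln(k)
    return -slope, math.exp(c)


# Synthetic values following p_k = C * k^(-gamma) exactly (hypothetical parameters,
# not normalized to sum to 1).
pk = {k: 0.7 * k ** -2.5 for k in range(1, 51)}
gamma, C = fit_power_law(pk)
print("gamma =", round(gamma, 3), "| C =", round(C, 3))
```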
55
Examples
from past projects

semantic network of politicians' posts: γ_in = 2.7, γ_out = 2.6 (a scale-free network)

Erasmus+ exchanges: γ_in = 2.1, γ_out = 2.3 (a scale-free network)

56
The ultra-small-world
of scale-free networks

Ultra small world: large hubs
Small world: hubs not significantly large

[Figure: the two regimes shown against γ, the slope]
57
Scale-free networks
versus random networks

Random network (a network with a scale)
➢ Randomly wired network
➢ Has smaller hubs
➢ Needs a linear plot

Scale-free network (a scale-free network)
➢ Power-law network
➢ Has big hubs
➢ Needs a log-log plot
58
Preferential attachment
a simple concept that (partially) explains the power-law

Nodes link to the more connected nodes
e.g., think of www

This idea has a long history
[Figure: timeline with the years 1923, 1925, 1931, 1941, 1955, 1968, 1976, 1999]

Matthew effect: “the rich get richer”, i.e., high connectivity quantifies attractiveness
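The "rich get richer" mechanism can be illustrated with a toy growth simulation. This is a simplified sketch under stated assumptions, not the model used in the course: the network starts from two linked nodes, each new node brings exactly one link, and the target is picked with probability proportional to its current degree. The function name and the `seed` parameter are hypothetical.

```python
import random


def preferential_attachment(n_nodes, seed=0):
    """Grow a network node by node; newcomers prefer already well-connected nodes."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Every node appears in this list once per link it has, so a uniform draw
    # from it selects a node with probability proportional to its degree.
    endpoints = [0, 1]
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges


edges = preferential_attachment(1000)
degree = {}
for i, j in edges:
    degree[i] = degree.get(i, 0) + 1
    degree[j] = degree.get(j, 0) + 1

print("max degree:", max(degree.values()))
print("nodes with degree 1:", sum(1 for k in degree.values() if k == 1))
```

Running it typically shows a few early nodes with many links and a large majority of nodes with a single link, the qualitative signature of hubs.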
59
The copying model
explaining preferential attachment

• Citation network
researchers decide what papers to read and cite by “copying” references from papers they have read → papers with more citations are more likely to be cited

• Social network
the more acquaintances an individual has, the higher the chance of getting new friends, i.e., we “copy” the friends of friends → difficult to get friends if you have none

• Semantic network
does the model apply here?
60
Attractiveness
a further essential concept to explain the power-law

• There is an innate ability of a node to attract links
just a quality assessment of the individual

• Otherwise the oldest nodes would have an inherent advantage and could not be defeated (first mover’s advantage), which is in contrast with intuition and evidence
e.g., Altavista [1990] → Google [2000] → Facebook [2011] → Instagram [202?]
e.g., #parisagreement [2018] → #fridays4future [2019]
Attractiveness
a visual example

[Figure: two panels showing node degree over time for #climateaction and #fridays4future, one with a linear degree axis and one with a log degree axis; the plot is annotated "Attractiveness"]

The attractiveness of node i can be measured by data scientists!

62
Takeaways

• Degree, degree distribution, log-log plot
• Authorities and hubs
• Power law, scale-free networks
• Slope, ultra-small-world regime
• Preferential attachment
• Attractiveness

63
