
Machine Learning for Complex Data

Networks and Sequences


Summer semester 2023

Prof. Dr. Martin Atzmüller

Osnabrück University & DFKI


Semantic Information Systems Group
https://sis.cs.uos.de/
4 – Basic Graph/Network Methods:
Ranking, Community Detection, Link Prediction

1 Overview
2 Overview: Graph/Network Methods
3 Ranking on Networks/Graphs
4 Community Detection
5 Link Prediction
6 Summary

Basic Graph/Network Methods

• Ranking:
– Importance/impact/influence of nodes
– Node-centered criterion
– Centralities
– PageRank algorithm
• Community Detection:
– Detecting groups/clusters etc.
– Different criteria; nodes and edges
– Basically: connection structure
– Extensions: labels of nodes and/or edges
– Compare to machine learning – clustering algorithms
• Link Prediction:
– Consider link structure
– Structural: hidden/missing links in a network
– Temporal: predicting new links to appear in a network

Reprise: Complex networks


Definition
Graphs abstracting, directly or indirectly, interactions in real-world
systems.

Basic topological features
• Low density
• Small diameter
• Scale-free
• High clustering coefficient

Notation

A graph G = (V, E), with E ⊆ V × V

• V: set of nodes (a.k.a. vertices, actors, sites)
• E: set of edges (a.k.a. ties, links, bonds)

Notations
• A_G: adjacency matrix, with a_ij ≠ 0 iff (v_i, v_j) ∈ E, and 0 otherwise
• n = |V|
• m = |E|; often m ∼ n
• Γ(v): neighbors of node v, Γ(v) = {x ∈ V : (x, v) ∈ E}
• Node degree: d(v) = |Γ(v)|

Centralities & Ranking


Reprise: Definitions I

• In a directed graph, E denotes a subset of V × V.
• The density of G is the fraction of possible edges that are actually present.
• In a weighted graph, each edge l ∈ E is given an edge weight w(l) by some weighting function w: E → ℝ.
• The degree of a node in a network is the number of connections it has to other nodes.
• The adjacency matrix of a set of nodes S with n = |S| contained in a (weighted) graph G = (V, E) is a matrix A ∈ ℝ^{n×n} with A_ij = 1 (A_ij = w(i, j)) iff (i, j) ∈ E for any nodes i, j in S (assuming some bijective mapping from 1, ..., n to S). We identify a graph with its corresponding adjacency matrix where appropriate.

Reprise: Definitions II

• A path v₀ →_G v_n of length n in a graph G is a sequence v₀, ..., v_n of nodes with n ≥ 1 and (v_i, v_{i+1}) ∈ E for i = 0, ..., n − 1.
• A shortest path between nodes u and v is a path u →_G v of minimal length; the diameter of G is the length of the longest shortest path between any pair of nodes.
• The transitive closure of a graph G = (V, E) is given by G* = (V, E*) with (u, v) ∈ E* iff there exists a path u →_G v.
• A strongly connected component (SCC) of G is a subset U ⊆ V such that (u, v) ∈ E* for every u, v ∈ U.
• A (weakly) connected component (WCC) is defined accordingly, ignoring the direction of the edges (u, v) ∈ E.

Reprise: Definitions III

• Many observations of network properties can be explained just by the network's degree distribution.
• It is therefore important to contrast the observed property with the corresponding result obtained on a random graph sharing the same degree distribution, used as a null model.
• If a single network G is considered, a corresponding null model can be obtained by randomly replacing edges (u₁, v₁), (u₂, v₂) ∈ E with (u₁, v₂) and (u₂, v₁), ensuring that these edges were not present in G beforehand.
• This rewiring step is typically repeated a number of times proportional to the cardinality of the graph's edge set.

Centralities

Node centralities

• Idea: estimate the importance and/or influence of a node
• PageRank, HITS algorithms: ranking of web search results
• FolkRank algorithm: tag recommendation
• Betweenness, Closeness: importance of actors in a social network
(Figure: degree centrality example)

Node Centrality I

• The betweenness centrality bet measures the number of shortest paths of all node pairs that go through a specific node:

$$bet(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

Hereby, σ_st denotes the number of shortest paths between s and t, and σ_st(v) is the number of shortest paths between s and t passing through v. Thus, a vertex has a high betweenness centrality if it can be found on many shortest paths between other vertex pairs.

Node Centrality II
• The closeness centrality clos considers the length of these shortest paths: the shorter its shortest path length to all other reachable nodes, the higher a vertex ranks:

$$clos(v) = \frac{1}{\sum_{t \in V \setminus \{v\}} d_G(v, t)}$$

Hereby, d_G(v, t) denotes the geodesic distance (shortest path length) between the vertices v and t.

• The eigenvector centrality eig of a node is an important measure of its influence, similar to the PageRank measure. Intuitively, a node is central if it has many central neighbors. The eigenvector centrality eig(v) of node v is defined as

$$eig(v) = \lambda \sum_{\{u,v\} \in E} eig(u),$$

where λ ∈ ℝ is a constant.
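These centralities can be computed directly with networkx; a minimal sketch on an arbitrary example graph:

```python
import networkx as nx

G = nx.karate_club_graph()  # example graph

bet = nx.betweenness_centrality(G)   # fraction of shortest paths through each node
clos = nx.closeness_centrality(G)    # based on inverse summed geodesic distances
eig = nx.eigenvector_centrality(G)   # entries of the principal eigenvector of A

# Rank nodes by betweenness, highest first
top3 = sorted(G.nodes, key=bet.get, reverse=True)[:3]
print(top3, [round(bet[v], 3) for v in top3])
```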

Ranking

• Estimate: importance of a node
• In complex (social) systems also: influence of a node
• Centrality-based: different criteria – e.g., degree-based, path-based (relating, e.g., to degree, betweenness, closeness, eigenvector centrality)
• Now: PageRank:
– Considers link structure/paths
– Many applications in graph-based machine learning
– Example: optimizing network structure in graph neural networks

Ranking Methods
• 2 independently developed algorithms
– PageRank (Brin & Page)
– HITS: Hypertext Induced Topic Search (Kleinberg)
• Basic idea:
– Popularity
– Link-based
– Intuitively: web pages are popular if many popular pages link to them
– "PageRank is a global ranking of all web pages, regardless of their content, based solely on their location in the Web's graph structure." [Page et al. 1998]


Popularity-based Ranking
• Hubs and Authorities
– Hubs link to highly rated nodes (collecting links)
– Authorities are linked to (as being relevant) from
other popular nodes


PageRank


Lecture: Web Science

PageRank: Original Summation Formula


[Langville, Meyer 2004]
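In the notation of Langville and Meyer (2004), the original summation formula reads

$$r(P_i) = \sum_{P_j \in B_{P_i}} \frac{r(P_j)}{|P_j|},$$

where $B_{P_i}$ is the set of pages pointing to $P_i$ and $|P_j|$ is the number of outlinks of page $P_j$.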


Starting Problems
• Sinks & cycles
– Some pages get fully ranked, some get no "ranking" contribution
– Cycles "reverse" the evaluation
– Some nodes have no outgoing edges ⇒ dangling nodes
• How many iterations?
– Does the process converge?
– Does it converge to a single vector?

Approach of Brin & Page


• Crucial idea: "Random Surfer"
– Similar to navigating through the web (hyperlinks)
– Example: 6 links ⇒ probability 1/6 of selecting a specific link
– For dangling nodes: jump to an arbitrary node with equal probability
– Also: stop following a link path with a given (fixed) probability
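A minimal power-iteration sketch of this random-surfer model (the damping factor 0.85 and the 4-node example graph are illustrative choices):

```python
import numpy as np

def pagerank(A, d=0.85, tol=1e-10):
    """PageRank by power iteration.
    A[i, j] = 1 iff page i links to page j; d is the probability of
    following a link, 1-d the probability of stopping/teleporting.
    Dangling nodes jump to an arbitrary node with equal probability."""
    n = A.shape[0]
    H = np.empty((n, n))
    for i in range(n):
        out = A[i].sum()
        H[i] = A[i] / out if out > 0 else np.full(n, 1.0 / n)
    r = np.full(n, 1.0 / n)
    while True:
        r_next = d * (H.T @ r) + (1 - d) / n   # follow links + teleport
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)
print(pagerank(A).round(3))
```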



Google Matrix: Step by Step

PageRank: Result

Community Detection


Community detection

Overview
• Given a network/graph, find "modules"
– Single network
– Multiplex networks
– Attributed networks
• Community structures
– Graph clustering / disjoint communities
– Hierarchical organization
– Overlapping communities
• Questions:
– What is "a community"?
– What are "good" communities?
– How do we evaluate these?

Community detection

Definitions
• A dense subgraph loosely coupled to other modules in the network
• A community is a set of nodes seen as "one" by nodes outside of the community
• A subgraph where almost all nodes are linked to other nodes in the community
• ...

Subjectivity of Community Definition


(Two readings of the same graph: the whole graph as one densely-knit community, or each component as a community.)

Taxonomy of Community Criteria


• Criteria vary depending on the tasks
• Roughly, community detection methods can be divided
into 4 categories (not exclusive):
• Node-Centric Community
– Each node in a group satisfies certain properties
• Group-Centric Community
– Consider the connections within a group as a whole; the group has to satisfy certain properties without zooming into node level
• Network-Centric Community
– Partition the whole network into several disjoint sets
• Hierarchy-Centric Community
– Construct a hierarchical structure of communities

Complete Mutuality: Cliques


• Clique: a maximum complete subgraph in which all
nodes are adjacent to each other

Nodes 5, 6, 7 and 8 form a clique

• NP-hard to find the maximum clique in a network


• A straightforward implementation for finding cliques is very expensive in terms of time complexity

Finding the Maximum Clique


• In a clique of size k, each node has degree ≥ k − 1
• Nodes with degree < k − 1 will not be included in the maximum clique
• Recursively apply the following pruning procedure (a code sketch follows below):
– Sample a sub-network from the given network, and find a clique in the sub-network, say, by a greedy approach
– Suppose the clique above has size k; in order to find a larger clique, all nodes with degree ≤ k − 1 can be removed
• Repeat until the network is small enough
• Many nodes will be pruned, as complex networks typically follow a power-law distribution for node degrees
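A minimal sketch of the pruning loop (the function name is illustrative):

```python
import networkx as nx

def prune_for_larger_clique(G, k):
    """Remove nodes that cannot be part of a clique larger than k,
    i.e., nodes with degree <= k-1; repeat, since removals lower degrees."""
    H = G.copy()
    while True:
        low = [v for v in H if H.degree(v) <= k - 1]
        if not low:
            return H
        H.remove_nodes_from(low)
```

After finding a greedy clique of size k in a sampled sub-network, prune_for_larger_clique(G, k) shrinks the remaining search space before the next iteration.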

Maximum Clique Example

• Suppose we sample a sub-network with nodes


{1-5} and find a clique {1, 2, 3} of size 3
• In order to find a clique >3, remove all nodes with
degree <=3-1=2
– Remove nodes 2 and 9
– Remove nodes 1 and 3
– Remove node 4


Clique Percolation Method (CPM)


• A clique is a very strict definition, and unstable
• Normally, cliques are used as a core or a seed to find larger communities
• CPM is such a method to find overlapping communities
– Input: a parameter k, and a network
– Procedure (see the sketch after the example below):
• Find all cliques of size k in the given network
• Construct a clique graph: two cliques are adjacent if they share k − 1 nodes
• Each connected component in the clique graph forms a community


CPM Example
Cliques of size 3:
{1, 2, 3}, {1, 3, 4}, {4, 5, 6},
{5, 6, 7}, {5, 6, 8}, {5, 7, 8},
{6, 7, 8}

Communities:
{1, 2, 3, 4}
{4, 5, 6, 7, 8}
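networkx implements CPM as k_clique_communities; the sketch below reproduces the example above (the edge list is chosen to induce exactly the listed size-3 cliques):

```python
import networkx as nx
from networkx.algorithms.community import k_clique_communities

G = nx.Graph([(1, 2), (1, 3), (2, 3), (1, 4), (3, 4),
              (4, 5), (4, 6), (5, 6), (5, 7), (5, 8),
              (6, 7), (6, 8), (7, 8)])

# Two cliques are adjacent in the clique graph if they share k-1 = 2 nodes
print([sorted(c) for c in k_clique_communities(G, 3)])
# -> [[1, 2, 3, 4], [4, 5, 6, 7, 8]]
```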


Reachability: k-clique, k-club

• Any node in a group should be reachable within k hops
• k-clique: a maximal subgraph in which the largest geodesic distance between any two nodes is ≤ k
• k-club: a substructure of diameter ≤ k
Example – Cliques: {1, 2, 3}; 2-cliques: {1, 2, 3, 4, 5}, {2, 3, 4, 5, 6}; 2-clubs: {1, 2, 3, 4}, {1, 2, 3, 5}, {2, 3, 4, 5, 6}
• Note: a k-clique might have a diameter larger than k in the induced subgraph (the geodesic distances are measured in the whole graph)
• Detection: often involves combinatorial optimization

Group-Centric: Density-Based Groups


• The group-centric criterion requires the whole group to satisfy a certain condition
– E.g., the group density ≥ a given threshold
• A subgraph is a γ-quasi-clique if its density is at least the threshold γ
• A similar strategy to that of cliques can be used:
– Sample a subgraph, and find a maximal quasi-clique (say, of size k)
– Remove nodes whose degree is too low for a larger quasi-clique

Network-Centric Community Detection


• Network-centric criterion needs to consider the
connections within a network globally
• Goal: partition nodes of a network into disjoint sets
• Approaches:
– Clustering based on vertex similarity
– Latent space models
– Block model approximation
– Spectral clustering
– Modularity maximization


Clustering based on Vertex Similarity


• Apply k-means or similarity-based clustering to nodes
• Vertex similarity is defined in terms of the similarity of
their neighborhood
• Structural equivalence: two nodes are structurally equivalent iff they connect to the same set of actors
Example: nodes 1 and 3 are structurally equivalent; so are nodes 5 and 6.
• Structural equivalence is too restrictive for practical use.

Vertex Similarity
• Jaccard similarity: $J(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$
• Cosine similarity: $\cos(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{\sqrt{|\Gamma(u)| \cdot |\Gamma(v)|}}$
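A minimal sketch of both neighborhood similarities for a networkx graph:

```python
import math
import networkx as nx

def jaccard(G, u, v):
    Nu, Nv = set(G[u]), set(G[v])
    union = Nu | Nv
    return len(Nu & Nv) / len(union) if union else 0.0

def cosine(G, u, v):
    Nu, Nv = set(G[u]), set(G[v])
    return len(Nu & Nv) / math.sqrt(len(Nu) * len(Nv)) if Nu and Nv else 0.0
```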

Latent Space Models


• General idea: Map nodes into a
low-dimensional space

• Multi-dimensional scaling (MDS)


– Given a network, construct a
proximity matrix representing the
pairwise distance between nodes
(e.g., geodesic distance)
– Construct clusters in the new space:
• Apply clustering such that the proximity between nodes based on network connectivity is preserved
⇒ Apply k-means clustering
Example result: two communities {1, 2, 3, 4} and {5, 6, 7, 8, 9}

Block Models

• S is the community indicator matrix
• Relax S to take numerical values; then the optimal solution corresponds to the top eigenvectors of A
Example result: two communities {1, 2, 3, 4} and {5, 6, 7, 8, 9}

Cut
• Most interactions are within groups, whereas interactions between groups are few
• Community detection ⇒ minimum cut problem
• Cut: a partition of the vertices of a graph into two disjoint sets
• Minimum cut problem: find a graph partition such that the number of edges between the two sets is minimized

Ratio Cut & Normalized Cut

• Minimum cut often returns an imbalanced partition, with one set being a singleton
• Change the objective function to take community size into account (see the definitions below):
– C_i: a community
– |C_i|: number of nodes in C_i
– vol(C_i): sum of degrees in C_i
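Written out, the standard objective functions for a partition π = (C₁, ..., C_k), with cut(C_i, C̄_i) denoting the number of edges between C_i and the rest of the network, are:

$$\mathrm{RatioCut}(\pi) = \frac{1}{k} \sum_{i=1}^{k} \frac{\mathrm{cut}(C_i, \bar{C}_i)}{|C_i|}, \qquad \mathrm{NCut}(\pi) = \frac{1}{k} \sum_{i=1}^{k} \frac{\mathrm{cut}(C_i, \bar{C}_i)}{\mathrm{vol}(C_i)}$$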

Ratio Cut & Normalized Cut Example

(Worked example on the slide: ratio cut and normalized cut values for a red and a green partition.)
Both ratio cut and normalized cut prefer a balanced partition.

Modularity Maximization
• Modularity measures the strength of a community partition by taking the degree distribution into account
• Given a network with m edges, the expected number of edges between two nodes with degrees d_i and d_j is d_i · d_j / (2m)
Example: with m = 14, the expected number of edges between nodes 1 and 2 (degrees 3 and 2) is 3 · 2 / (2 · 14) = 3/14
• Strength of a community: the number of observed intra-community edges minus the number expected under the null model
• Modularity: the normalized sum of these strengths over all communities (written out below)
• A larger value indicates a good community structure
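Written out in its standard form, modularity is

$$Q = \frac{1}{2m} \sum_{\ell} \sum_{i \in C_\ell} \sum_{j \in C_\ell} \left( A_{ij} - \frac{d_i d_j}{2m} \right),$$

and networkx can evaluate it directly; a minimal sketch (example graph and greedy partitioning are illustrative choices):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

G = nx.karate_club_graph()
parts = greedy_modularity_communities(G)  # agglomerative modularity maximization
print(modularity(G, parts))               # larger values = better partition
```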

Modularity Matrix
• Modularity matrix: B_ij = A_ij − d_i d_j / (2m)
• Similar to spectral clustering, modularity maximization can be reformulated as a trace optimization problem (see below)
• Optimal solution: top eigenvectors of the modularity matrix
• Apply k-means to S as a post-processing step to obtain the community partition
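The reformulation referenced above takes, in its standard form (with S relaxed to real values),

$$\max_{S}\; \operatorname{tr}(S^{T} B S) \quad \text{s.t.} \quad S^{T} S = I,$$

whose optimum is attained by the top eigenvectors of the modularity matrix B.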

Modularity Maximization Example

(Slide example: the modularity matrix of the example graph is computed, and k-means on its top eigenvector yields two communities: {1, 2, 3, 4} and {5, 6, 7, 8, 9}.)

Hierarchy-Centric Community Detection


• Goal: build a hierarchical structure of communities
based on network topology

• Allow the analysis of a network at different


resolutions

• Representative approaches:
– Divisive Hierarchical Clustering
– Agglomerative Hierarchical clustering


Divisive Hierarchical Clustering


• Divisive clustering
– Partition nodes into several sets
– Each set is further divided into smaller ones
– Network-centric partition can be applied for the partition
• One particular example: recursively remove the "weakest" tie
– Find the edge with the least strength
– Remove the edge and update the corresponding strength of each edge
• Recursively apply the above two steps until the network is decomposed into the desired number of connected components (a code sketch follows below)
• Each component forms a community
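The edge-betweenness instantiation of this scheme (detailed on the next slides) is the Girvan–Newman algorithm, available in networkx; a minimal sketch on an example graph:

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

G = nx.karate_club_graph()
splits = girvan_newman(G)       # repeatedly removes the edge with highest betweenness
two_communities = next(splits)  # first decomposition into two components
print([sorted(c) for c in two_communities])
```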

Edge Betweenness
• The strength of a tie can be measured by edge
betweenness
• Edge betweenness: the number of shortest paths that pass along the edge
Example: the edge betweenness of e(1, 2) is 4: of the six shortest paths from 2 to {4, 5, 6, 7, 8, 9}, half pass e(1, 2) and half pass e(2, 3) (contributing 3), and e(1, 2) is itself the shortest path between 1 and 2 (contributing 1).
• An edge with higher betweenness tends to be a bridge between two communities.

Divisive Clustering Based on Edge Betweenness

(Initial betweenness values are shown on the slide.)
• After removing e(4, 5), the betweenness of e(4, 6) becomes 20, which is the highest
• After removing e(4, 6), the edge e(7, 9) has the highest betweenness value, 4, and should be removed next

Agglomerative Hierarchical Clustering


• Initialize each node as a community
• Merge communities successively into larger
communities following a certain criterion
– E.g., based on modularity increase


Further Community Quality Functions I

One example we have already seen – Modularity – for estimating the quality of a community partition.
• For a given undirected graph G = (V, E) and a community C ⊆ V we use the following notation:
– n := |V|,
– m := |E|,
– n_C := |C|,
– m_C := |{{u, v} ∈ E : u, v ∈ C}| – the number of intra-edges of C, and
– m̄_C := |{{u, v} ∈ E : |{u, v} ∩ C| = 1}| – the number of inter-edges of C.
• Furthermore, it is convenient to introduce an inter-degree for a node u ∈ C (that depends on the choice of C) by d̄_C(u) := |{{u, v} ∈ E : v ∉ C}|, counting the number of edges between u and nodes outside of C.

Further Community Quality Functions II

• The Inverse Average-ODF (out-degree fraction) IAODF compares the number of inter-edges to the number of all edges of a community C, and averages this for the whole community by considering the fraction for each individual node:

$$\mathrm{IAODF}(C) := 1 - \frac{1}{n_C} \sum_{u \in C} \frac{\bar{d}_C(u)}{d(u)} \qquad (1)$$

• The segregation index SIDX compares the number of expected inter-edges to the number of observed inter-edges, normalized by the expectation:

$$\mathrm{SIDX}(C) = \frac{E(\bar{m}_C) - \bar{m}_C}{E(\bar{m}_C)} = 1 - \frac{\bar{m}_C \, n(n-1)}{2m \, n_C (n - n_C)} \qquad (2)$$
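A minimal sketch of both quality functions, following Equations (1) and (2) (assuming an undirected networkx graph and a community given as a set of nodes):

```python
import networkx as nx

def iaodf(G, C):
    """Inverse Average-ODF, Eq. (1)."""
    C = set(C)
    frac = sum(sum(1 for v in G[u] if v not in C) / G.degree(u) for u in C)
    return 1 - frac / len(C)

def sidx(G, C):
    """Segregation index, Eq. (2)."""
    C = set(C)
    n, m, n_C = G.number_of_nodes(), G.number_of_edges(), len(C)
    inter = sum(1 for u, v in G.edges if (u in C) != (v in C))  # inter-edges of C
    return 1 - inter * n * (n - 1) / (2 * m * n_C * (n - n_C))
```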

Community Detection on Attributed Graphs I

First: What is an attributed graph?


• Integrates additional attribute data about individual instances
• Example: In a social network, attributes describing each user’s
characteristics might be combined with the underlying
friendship network to form an attributed graph
• Also, the relationships (friendship) could be labeled with some information, e.g., specifying the "degree" of friendship
Other terms: "graphs with feature vectors", "labeled graphs", or "annotated graphs"; the term "network" is also often substituted for "graph"

Community Detection on Attributed Graphs II


Definition 1 (Attributed Graph)
An attributed graph is a graph G in which each node v ∈ V is associated with a vector of attribute values x = (x₁, ..., x_d), and each edge e ∈ E with a vector y = (y₁, ..., y_t). We use a_i(v) to refer to the i-th attribute value of a node v, and a_i(e) for the edge e, respectively. We denote by A_V the set of node attributes, d = |A_V|, and by A_E the set of edge attributes, t = |A_E|. If |A_V| > 0 and |A_E| = 0, we refer to G as node-attributed; similarly, if |A_V| = 0 and |A_E| > 0, we call it edge-attributed. If |A_V| = 0 and |A_E| = 0, we refer to a plain graph.

• Classic community detection just identifies subgroups of nodes with a dense structure
• Thus, this lacks an interpretable description – a community is just given by a set of nodes / node IDs

Community Detection on Attributed Graphs III

• With attributed graphs, it becomes possible to use attribute-based descriptions: description-oriented community detection
• Interpretable, actionable (recommendations, etc.)
• Local patterns (not a global partitioning)
• Main idea: detect descriptions of subsets of nodes (subgraphs) with a high community quality function
• Example: COMODO algorithm (Atzmueller et al. 2016)
– Applies descriptive community detection
– Core idea: discover subsets of nodes according to a description (pattern)
– The subset should have high quality according to a community quality function
– Thus, two representations:
• Subgraph (set of nodes) – graph structure level
• Descriptive pattern (set of attributes) – inducing the subgraph

Knowledge-Based Systems (KBS), Prof. Dr. M. Atzmüller, Osnabrück University

Transformation & Mining


■ Dataset of edges connecting two nodes
■ Described by intersection of labels of the two nodes
■ Additionally: Store nodes, and respective degrees
■ Apply
top-k method w/ optimistic-estimate pruning
(COMODO)
Web Mining,
Web Mining, Computer,
Computer JavaScript

Web Mining,
Computer, Java

60
Knowledge-Based Systems (KBS), Prof. Dr. M. Atzmüller, Osnabrück University

■ The algorithm utilizes a special FP-tree-like structure & optimistic estimates for efficient processing

Evaluation: Datasets
■ BibSonomy dump (until January 2010)
– 175,521 tags, 5,579 users, 467,291 resources, 2,120,322 tag assignments, 700 friendship links
– Friend, Click, Visit graphs
■ Delicious (HetRec workshop): 1,861 users, 7,664 bi-directional links, 53,388 tags
■ Last.fm (HetRec workshop): 1,892 users, 12,717 bi-directional links, 11,946 tags

Results: Pruning performance

BibSonomy friend graph


Results: Pruning Performance

Last.fm friend graph Delicious friend graph


Community Quality - Optimistic Estimates I

Making use of the minimum support threshold τ_n, we can first observe the following inequality for each subcommunity C′ of a community C with a size above the minimal size threshold τ_n, i.e., |C′| ≥ τ_n:

$$\bar{m}_{C'} = \sum_{i=1}^{n_{C'}} \bar{\delta}_{C'}(i) \;\geq\; \sum_{i=1}^{n_{C'}} \bar{\delta}_{C}(i) \;\geq\; \sum_{i=1}^{\tau_n} \bar{\delta}_{C}(i).$$

Here, we assume that the values δ̄_C(i), i = 1, ..., n_C, and δ̄_{C′}(i), i = 1, ..., n_{C′}, are the inter-degrees of the nodes in C and C′, respectively, in ascending order, such that δ̄_C(i), i = 1, ..., τ_n, denote the minimal τ_n inter-degrees with respect to C.

Community Quality - Optimistic Estimates II

Proposition 2
An optimistic estimate for SIDX(C) is given by

$$oe(\mathrm{SIDX}(C)) := 1 - \frac{n(n-1)}{2m} \cdot \max\left\{ \frac{\sum_{i=1}^{\tau_n} \bar{\delta}_C(i)}{p(C)},\; \min_{t=\tau_n}^{n_C} \frac{\sum_{i=1}^{t} \bar{\delta}_C(i)}{t(n-t)} \right\},$$

where

$$p(C) := \begin{cases} \frac{n^2}{4}, & \text{if } n_C \geq \frac{n}{2}, \\ n_C(n-n_C), & \text{otherwise.} \end{cases}$$

Community Quality - Optimistic Estimates III

Proof.
For a subcommunity C′ ⊆ C with |C′| ≥ τ_n we have

$$\mathrm{SIDX}(C') = 1 - \frac{n(n-1)}{2m} \cdot \frac{\bar{m}_{C'}}{n_{C'}(n-n_{C'})} \leq 1 - \frac{n(n-1)}{2m} \cdot \frac{\sum_{i=1}^{n_{C'}} \bar{\delta}_C(i)}{n_{C'}(n-n_{C'})} \qquad (3)$$

$$\leq 1 - \frac{n(n-1)}{2m} \cdot \frac{\sum_{i=1}^{\tau_n} \bar{\delta}_C(i)}{\max_{t=\tau_n}^{n_C}\{t(n-t)\}}. \qquad (4)$$

From (3) it is clear that $1 - \frac{n(n-1)}{2m} \min_{t=\tau_n}^{n_C} \frac{\sum_{i=1}^{t} \bar{\delta}_C(i)}{t(n-t)}$ is an optimistic estimate for SIDX(C). On the other hand, we have $\max_{t=\tau_n}^{n_C}\{t(n-t)\} = p(C)$, since $t(n-t)$ has its maximum at $t = \frac{n}{2}$. Together with (4) we obtain $1 - \frac{n(n-1)}{2m} \cdot \frac{\sum_{i=1}^{\tau_n} \bar{\delta}_C(i)}{p(C)}$ as another optimistic estimate. ∎

Community Quality - Optimistic Estimates IV

Proposition 3
For the inverse Average-ODF, let $\tilde{d}_C(u) := \frac{\bar{d}_C(u)}{d(u)}$ and let $\tilde{\delta}_C(i)$, $i = 1, \ldots, n_C$, denote these ratios for all nodes in C in ascending order. Then

$$oe(\mathrm{IAODF}(C)) := 1 - \frac{1}{\tau_n} \sum_{i=1}^{\tau_n} \tilde{\delta}_C(i)$$

is an optimistic estimate for IAODF(C).

Community Quality - Optimistic Estimates V

Proof.
For a subcommunity C′ ⊆ C with |C′| ≥ τ_n we have

$$\mathrm{IAODF}(C') = 1 - \frac{1}{n_{C'}} \sum_{u \in C'} \frac{|\{\{u,v\} \in E : v \in V \setminus C'\}|}{d(u)} \leq 1 - \frac{1}{n_{C'}} \sum_{u \in C'} \frac{|\{\{u,v\} \in E : v \in V \setminus C\}|}{d(u)}$$

$$= 1 - \frac{1}{n_{C'}} \sum_{u \in C'} \tilde{d}_C(u) \leq 1 - \frac{1}{n_{C'}} \sum_{i=1}^{n_{C'}} \tilde{\delta}_C(i) \leq 1 - \frac{1}{\tau_n} \sum_{i=1}^{\tau_n} \tilde{\delta}_C(i). \;\; ∎$$

Community Quality - Optimistic Estimates VI

Proposition 4
An optimistic estimate for the local modularity contribution can be derived based only on the number of edges m_C within the community:

$$oe(\mathrm{MODL}(C)) = \begin{cases} 0.25, & \text{if } m_C \geq \frac{m}{2}, \\ \frac{m_C}{m} - \frac{m_C^2}{m^2}, & \text{otherwise.} \end{cases}$$

A simple but useful observation is the following equation that combines some of the above defined entities for a community C:

$$\sum_{i \in C} d(i) = 2 m_C + \bar{m}_C. \qquad (5)$$

Community Quality - Optimistic Estimates VII


Proof.
Using Equation (5) we obtain:

$$\mathrm{MODL}(C) = \frac{m_C}{m} - \sum_{u,v \in C} \frac{d(u)\,d(v)}{4m^2} = \frac{m_C}{m} - \frac{1}{4m^2} \sum_{u \in C} \sum_{v \in C} d(u)\,d(v)$$

$$= \frac{m_C}{m} - \frac{1}{4m^2} \sum_{u \in C} d(u)\,(2m_C + \bar{m}_C) = \frac{m_C}{m} - \frac{1}{4m^2}\,(2m_C + \bar{m}_C)^2$$

$$\leq \frac{m_C}{m} - \frac{m_C^2}{m^2} = oe(\mathrm{MODL}(C)). \;\; ∎$$

How to evaluate community detection algorithms and the detected communities, respectively?

Evaluating Community Detection


• For groups with clear definitions
– E.g., Cliques, k-cliques, k-clubs, quasi-cliques
– Verify whether extracted communities satisfy the
definition
• For networks with ground truth information
– Normalized mutual information
– Accuracy of pairwise community memberships
• Using Evidence networks
– Data-driven approach
– Inexpensive, if secondary data is available


Measuring a Clustering Result


Example – Ground Truth: {1, 2, 3}, {4, 5, 6}; Clustering Result: {1, 3}, {2}, {4, 5, 6}

How to measure the clustering quality?
• The number of communities after grouping can be different from the ground truth
• There is no clear community correspondence between the clustering result and the ground truth
• Normalized Mutual Information can be used

Normalized Mutual Information


• Entropy: the information contained in a distribution
• Mutual Information: the shared information between two distributions
• Normalized Mutual Information (between 0 and 1)
• Considering a partition as a distribution (the probability of one node falling into one community), we can compute the matching between two clusterings (formulas below)
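Written out (standard definitions; the geometric-mean normalization below is the variant matching the example value on the next slides):

$$H(\pi^a) = -\sum_{h=1}^{k^{(a)}} \frac{n_h^a}{n} \log \frac{n_h^a}{n}, \qquad MI(\pi^a, \pi^b) = \sum_{h=1}^{k^{(a)}} \sum_{l=1}^{k^{(b)}} \frac{n_{h,l}}{n} \log \frac{n \cdot n_{h,l}}{n_h^a \, n_l^b},$$

$$NMI(\pi^a, \pi^b) = \frac{MI(\pi^a, \pi^b)}{\sqrt{H(\pi^a)\, H(\pi^b)}}$$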

NMI
k^(a): number of communities in partition a
k^(b): number of communities in partition b

Notation:
π^a, π^b denote the different partitions.
n_h^a: number of nodes in partition a assigned to the h-th community
n_l^b: number of nodes in partition b assigned to the l-th community
n_{h,l}: number of nodes assigned to the h-th community in partition a and to the l-th community in partition b

NMI Example
• Partition a: [1, 1, 1, 2, 2, 2], i.e., communities {1, 2, 3} and {4, 5, 6}
• Partition b: [1, 2, 1, 3, 3, 3], i.e., communities {1, 3}, {2}, {4, 5, 6}

n_h^a: h=1: 3, h=2: 3
n_l^b: l=1: 2, l=2: 1, l=3: 3
n_{h,l}: (h=1): l=1: 2, l=2: 1, l=3: 0; (h=2): l=1: 0, l=2: 0, l=3: 3

NMI = 0.8278
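The example value can be reproduced with scikit-learn (the geometric-mean normalization matches the slide's result):

```python
from sklearn.metrics import normalized_mutual_info_score

a = [1, 1, 1, 2, 2, 2]
b = [1, 2, 1, 3, 3, 3]
print(normalized_mutual_info_score(a, b, average_method='geometric'))
# -> 0.8278...
```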

Accuracy of Pairwise Memberships


• Consider all the possible pairs of nodes and check whether they reside in
the same community
• An error occurs if
– Two nodes belonging to the same community are
assigned to different communities after clustering
– Two nodes belonging to different communities are
assigned to the same community
• Construct a contingency table


Accuracy Example
Ground Truth: {1, 2, 3}, {4, 5, 6} – Clustering Result: {1, 3}, {2}, {4, 5, 6}

Contingency table (rows: clustering result; columns: ground truth):
                | C(vi) = C(vj) | C(vi) ≠ C(vj)
C(vi) = C(vj)   |       4       |       0
C(vi) ≠ C(vj)   |       2       |       9

Accuracy = (4 + 9) / (4 + 2 + 9 + 0) = 13/15
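A minimal sketch reproducing this example (the community identifiers are arbitrary):

```python
from itertools import combinations

ground = {1: 'A', 2: 'A', 3: 'A', 4: 'B', 5: 'B', 6: 'B'}
result = {1: 'x', 2: 'y', 3: 'x', 4: 'z', 5: 'z', 6: 'z'}

pairs = list(combinations(ground, 2))
# a pair is correct if both partitions agree on "same community or not"
ok = sum((ground[u] == ground[v]) == (result[u] == result[v]) for u, v in pairs)
print(ok, '/', len(pairs))  # -> 13 / 15
```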

Community Assessment
w/ Evidence Networks
• Evaluation & comparison of communities
• Are the discovered communities meaningful?
• Hard problem!
– Gold standard ⇒ costly!
– User study ⇒ costly!
• Approach in [Mitzlaff et al. 2011]
– Cost-effective
– Secondary data
– Relative ranking of communities

Data: Evidence Networks


[Mitzlaff et al. 2011]
– Explicit social relations are often sparse, e.g., the friendship network
⇒ Idea: use secondary data that is simple to acquire
• User interactions with typical "Web 2.0" applications establish implicit relations between users, by
§ visiting profile pages,
§ sending messages,
§ ...
⇒ "Evidence Networks"

Evidence Networks in BibSonomy


[Mitzlaff et al. 2011]
(Evidence networks in BibSonomy, ordered by explicitness and size:)
• Friend Graph: user v is a friend of user u
• Click Graph: user u clicked on a post of user v
• Visit Graph: user u looked at the user page of v

Evaluation: Paradigm I
[Mitzlaff et al. 2011]

I) Given: a community structure and an evidence network
II) Evaluation: map users to nodes in the evidence network and calculate a quality measure (e.g., modularity)

Evaluation: Paradigm II
[Mitzlaff et al. 2011]

• Follows the paradigm of reconstructing existing social structures
• No absolute quality figure for assessing community quality
• But: a relative ranking for given community allocations
⇒ The proposed paradigm yields the following workflow:
1) Automatically generate many different community allocations (different methods with exhaustive parameter variation)
2) Calculate a ranking over all models
3) Select the top-n candidate models and manually choose the best community allocation by inspection

Experiments: Description
[Mitzlaff et al. 2011]

• Problem: we need good communities before we can verify that their quality is assessed in a sensible way
• But we are only interested in the relative ranking of different community allocations
⇒ We generate many different initial community allocations, each of which is then gradually disturbed by randomization
1) We examine whether the proposed evaluation methodology captures the expected decrease in quality
2) We examine whether the different evidence networks induce consistent rankings
3) We check the correlation between the individual rankings and the rankings on the Friend Graph as a reference

Experiments: Results
[Mitzlaff et al. 2011]

Modularity in the Friend Graph
• LDA-x-kMeans-y: x topics, y communities
⇒ The quality measure decreases as expected for all considered initial clusterings (plot: quality from the initial clustering down to a random clustering)

Experiments: Results
Modularity in the Copy Graph
⇒ The ranking is similar for different evidence networks

Experiments: Results
[Mitzlaff et al. 2011]

Modularity in the Visit Graph
⇒ The ranking varies slightly for the top positions – the degree of explicitness correlates with ranking consistency

Experiments: Results
[Mitzlaff et al. 2011]
• Modularity and conductance values correlate highly between the Friend Graph and each of the other evidence networks (Friend Graph as a reference):

Evidence Network | Modularity  | Conductance
Follower Graph   | 0.86 ± 0.17 | 0.89 ± 0.28
Group Graph      | 0.91 ± 0.13 | 1.00 ± 0.01
Copy Graph       | 0.82 ± 0.17 | 0.99 ± 0.03
Click Graph      | 0.80 ± 0.17 | 0.99 ± 0.04
Visit Graph      | 0.72 ± 0.25 | 0.97 ± 0.06

Evaluation using Semantics


• For networks with semantics
– Networks come with semantic or attribute information of
nodes or connections
– Human subjects can verify whether the extracted
communities are coherent
• Evaluation is qualitative
• It is also intuitive and helps to understand a community
Example: an animal community and a health community

Application: User Recommendation


• Most users do not belong to one single research community
• Interests diverge and the number of users increases
• Relevant resources are hidden
⇒ Community detection
⇒ Personalization
⇒ User recommendation
⇒ Necessary: a descriptive method (see descriptive community detection above)
⇒ Goal: personalized view on the application

Conferator & MyGroup


[Atzmueller et al. 2011, it+ti]

• Social Conference Guidance


System
• GI: LWA 2010, 2011, 2012, 2015
• ACM Hypertext 2011
• INFORMATIK 2013
– "Virtual" business cards
– List of all conversations
– Personalization of schedule
– Recommendations (Talks, Persons)
– In-door Localization
• MyGroup: Working groups
– Current activities, locations
– Enhance interactions
– Expert finder


Link Prediction

Link prediction
• Structural: find hidden/missing links in a network (e.g., missing links in Wikipedia)
• Temporal: predict new links that appear at time t_p, based on the network state at time t < t_p

Link Prediction


Link Prediction - Formal


A graph G is an ordered pair (V, E) consisting of a set of vertices (nodes) and a set of edges. An edge (u, v) consists of a pair of nodes u, v representing a relationship between them. A social network can be abstracted as a graph, where actors correspond to nodes and the links between them correspond to edges. A node v is a neighbor of (adjacent to) a node u if there is an edge (u, v) between them. Γ(u) stands for the set of neighbors of a node u.

Let 𝒢 = {G_{t=0}, G_{t=1}, ..., G_{t=n}} be a temporal sequence of evolving graphs, where G_{t=i} = (V_{t=i}, E_{t=i}). For link prediction on such sequences, given t = n the goal is to predict the structure of the graph at t = n + 1, i.e., G_{t=n+1}. Specifically, we try to identify pairs (u, v) such that u, v ∈ V_{t=n+1} and (u, v) ∉ E_{t=n+1}.

Prominent approaches for link prediction consider similarity scores between pairs of nodes, e.g., based on the neighborhoods of the pair of nodes. One prominent neighborhood-based similarity score is the Common Neighbors score: it counts the number of common neighbors of a pair of nodes. For a pair of nodes (u, v) under observation, it can formally be written as:

$$CN(u, v) = |\Gamma(u) \cap \Gamma(v)|$$

Relatively simple approaches based on:


• Topological, e.g., common neighbors
• Path-based measures (e.g., distances)
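Several of these scores are built into networkx; a minimal sketch scoring all non-edges of an example graph:

```python
import networkx as nx

G = nx.karate_club_graph()
candidates = list(nx.non_edges(G))

# Common Neighbors (the CN score above)
cn = {(u, v): len(list(nx.common_neighbors(G, u, v))) for u, v in candidates}

# Built-in neighborhood scores yield (u, v, score) triples
jc = list(nx.jaccard_coefficient(G, candidates))
aa = list(nx.adamic_adar_index(G, candidates))

print(sorted(cn, key=cn.get, reverse=True)[:5])  # top-5 predicted links
```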


Applying Answer Set Programming for Knowledge-Based Link Prediction

Link Prediction – Simple Example
Figure 1. Interaction network (links are split into subsets based on time). Connections observed in the first interval are used, e.g., with neighborhood-based measures, to predict links in the following intervals; such predictions can be complemented by domain knowledge, here questionnaire information.
96
Machine Learning for Complex Data, Prof. Dr. M. Atzmüller, Osnabrück University

Link Prediction ... Illustrated


Prediction of new and recurring contacts,
e.g., Face-to-Face contacts at a conference


Prediction of New Contacts


Prediction of Recurring Contacts


Example: Prediction of New Contacts


with Common Neighbors as Predictor
(Figure: contact graphs for the first day vs. the second and third day.)

Unsupervised ...
[Eberhard, Trattner & Atzmueller, 2018]
Table 7 – Overview of all features: formal definition and mean values of user pairs with trading interactions (∅Valw).

Online social network – topological:
• O+CN – num. common outgoing neighbors: O+CN(u, v) = |Θ+(u) ∩ Θ+(v)|; ∅Valw = 1.16e−01
• O−CN – num. common incoming neighbors: O−CN(u, v) = |Θ−(u) ∩ Θ−(v)|; ∅Valw = 1.56e−01
• O+JC – outgoing Jaccard's coefficient: O+JC(u, v) = |Θ+(u) ∩ Θ+(v)| / |Θ+(u) ∪ Θ+(v)|; ∅Valw = 2.13e−03
• O−JC – incoming Jaccard's coefficient: O−JC(u, v) = |Θ−(u) ∩ Θ−(v)| / |Θ−(u) ∪ Θ−(v)|; ∅Valw = 2.39e−03
• O+PS – preferential attachment (+−): O+PS(u, v) = |Θ+(u)| · |Θ−(v)|; ∅Valw = 1.05e+02
• O−PS – preferential attachment (−+): O−PS(u, v) = |Θ−(u)| · |Θ+(v)|; ∅Valw = 9.41e+01
• OR – reciprocity of user communication: OR(u, v) = 0 if (u, v) ∈ E_O, (v, u) ∉ E_O; 1 if (u, v) ∈ E_O, (v, u) ∈ E_O; ∅Valw = 1.86e−02
• OAA – Adamic–Adar: OAA(u, v) = Σ_{z ∈ Θ−(u) ∩ Θ−(v)} 1 / log(|Θ−(z)|); ∅Valw = 1.02e−01
• OK001, OK01, OK1 – Katz (β = .001, .01, .1): OKβ(u, v) = Σ_{l=1}^{c} β^l · |path^l_{u,v}|; ∅Valw = 1.55e−05, 1.72e−04, 9.79e−03
• ORPR01, ORPR05, ORPR15, ORPR3, ORPR5 – Rooted PageRank (α = .01, .05, .15, .3, .5): ORPRα(u, v) = stationary probability of v under a random walk that (1) with probability 1 − α moves to a random neighbor of the current node, and (2) with probability α returns to u; ∅Valw = 1.79e−03, 1.91e−03, 1.98e−03, 1.95e−03, 1.70e−03

Homophilic:
• GC – num. common groups: GC(u, v) = |Δ(u) ∩ Δ(v)|; ∅Valw = 2.43e−01
• GJC – Jaccard's coefficient of groups: GJC(u, v) = |Δ(u) ∩ Δ(v)| / |Δ(u) ∪ Δ(v)|; ∅Valw = 6.33e−03
• IC – num. common interests: IC(u, v) = |Φ(u) ∩ Φ(v)|; ∅Valw = 2.01e−02
• IJC – Jaccard's coefficient of interests: IJC(u, v) = |Φ(u) ∩ Φ(v)| / |Φ(u) ∪ Φ(v)|; ∅Valw = 1.14e−03
• OI – num. interactions: OI(u, v) = |ι(u, v)|; ∅Valw = 4.64e−01
• RRC – num. common check-ins: RRC(u, v) = |Λ(u) ∩ Λ(v)|; ∅Valw = 1.12e−02
• RRJC – Jaccard's coefficient of check-ins: RRJC(u, v) = |Λ(u) ∩ Λ(v)| / |Λ(u) ∪ Λ(v)|; ∅Valw = 2.13e−04
• RRO – overlap of check-ins: RRO(u, v) = |Λ(u) ∩ Λ(v)| / (|Λ(u)| + |Λ(v)|); ∅Valw = 1.81e−04
• RFC – num. common favored regions: RFC(u, v) = |Ξ(u) ∩ Ξ(v)|; ∅Valw = 3.55e−02
• RFJC – Jaccard's coefficient of favored regions: RFJC(u, v) = |Ξ(u) ∩ Ξ(v)| / |Ξ(u) ∪ Ξ(v)|; ∅Valw = 6.50e−03
• RFO – overlap of favored regions: RFO(u, v) = |Ξ(u) ∩ Ξ(v)| / (|Ξ(u)| + |Ξ(v)|); ∅Valw = 5.08e−03

Location-based social network – topological:
• LCN – num. common neighbors: LCN(u, v) = |Γ(u) ∩ Γ(v)|; ∅Valw = 2.57e+00
• LJC – Jaccard's coefficient: LJC(u, v) = |Γ(u) ∩ Γ(v)| / |Γ(u) ∪ Γ(v)|; ∅Valw = 1.02e−02
• LAA – Adamic–Adar: LAA(u, v) = Σ_{z ∈ Γ(u) ∩ Γ(v)} 1 / log(|Γ(z)|); ∅Valw = 1.58e+00
• LDS – num. days seen: LDS(u, v) = |η(u, v)|; ∅Valw = 3.14e−01
• LMD – mean distance: LMD(u, v) = (1/|ω(u, v)|) Σ_{d ∈ ω(u, v)} d; ∅Valw = 4.22e−01
• LK001, LK01, LK1 – Katz (β = .001, .01, .1); ∅Valw = 3.80e−05, 8.40e−04, 2.60e−01

... as Supervised Machine Learning

[Hasan et al. 2006]


Some more details [Yang, Lichtenwalter & Chawla, 2015]

Fig. 1 – Link prediction and evaluation. The black color indicates snapshots of the network from which link prediction features are calculated (feature network). The gray color indicates snapshots of the network from which the link prediction instances are labeled (label network). We can observe all links at or before time t, and we aim to predict future links that will occur at time t + 1.

Network snapshots are based on particular segments of data. Comparisons among predictors require that evaluation encompasses precisely the same set of instances, whether the predictor is unsupervised or supervised. We construct four network snapshots:
– Training features: data from some period in the past, G_{t−x} up to G_{t−1}, from which we derive feature vectors for training data.
– Training labels: data from G_t, the last training-observable period, from which we derive class labels (whether the link forms or not) for the training feature vectors.
– Testing features: data from some period in the past up to G_t, from which we derive feature vectors for testing data. Sometimes it may be ideal to maintain the window size that we use for the training feature vector, so we commence the snapshot at G_{t−x+1}. In other cases, we might want to be sure not to ignore effects of previously existing links, so we commence the snapshot at G_{t−x}.
– Testing labels: data from G_{t+1}, from which we derive class labels for the testing feature vector. These data are strictly excluded from inclusion in any training data.

A classifier is constructed from the training data and evaluated on the testing data. There are always strictly divided training and testing sets, because G_{t+1} is never observable in training.
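A minimal sketch of this setup; here the two snapshots are simulated by hiding some edges of an example graph, and the features and classifier are illustrative choices:

```python
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Simulated snapshots: G_feat = feature network, G_label = later label network
G_label = nx.karate_club_graph()
G_feat = G_label.copy()
G_feat.remove_edges_from(list(G_label.edges)[::10])  # hide every 10th edge

pairs = list(nx.non_edges(G_feat))                   # candidate instances
jc = {(u, v): p for u, v, p in nx.jaccard_coefficient(G_feat, pairs)}
aa = {(u, v): p for u, v, p in nx.adamic_adar_index(G_feat, pairs)}
X = np.array([[len(list(nx.common_neighbors(G_feat, u, v))), jc[u, v], aa[u, v]]
              for u, v in pairs])
y = np.array([G_label.has_edge(u, v) for u, v in pairs])  # did the link form?

clf = RandomForestClassifier(random_state=0).fit(X, y)
# In a real temporal setting, testing features and labels must come from
# later, strictly disjoint snapshots, exactly as described above.
```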

Outlook: Multiplex Networks


• Use networks capturing/modeling different aspects of
relations/interactions,
– Friendship, work, relationships, ...
– Interactions on social media
• Instagram: Similar posts
• Flickr, Twitter: Followers, friends
– Face-to-face networks
– Co-citation: DBLP (co-authorship)
– Conference: Paper similarity
– ...

[Scholz et al., AAAI ICWSM 2013]
Hybrid Rooted PageRank
• A novel extension of the Rooted PageRank algorithm for multiplex networks
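On a single network, rooted (personalized) PageRank is available in networkx via the personalization vector; a minimal sketch (the hybrid multiplex combination of the paper is not shown):

```python
import networkx as nx

G = nx.karate_club_graph()
root = 0
# networkx's alpha is the walk-continuation probability,
# so the restart probability to the root is 1 - alpha
rpr = nx.pagerank(G, alpha=0.85, personalization={root: 1.0})

# Score candidate links (root, v) by the stationary probability of v
cand = sorted((v for v in G if v != root and not G.has_edge(root, v)),
              key=rpr.get, reverse=True)
print(cand[:5])
```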
[Scholz et al., AAAI ICWSM 2013]
Link Prediction – Hybrid Rooted PR
Context: contact network at an academic conference
• Co-author network (DBLP)
In addition:
• Encounter network (close-by)
• Paper similarity network

Summary

What Did We Learn?

• Overview on basic graph/network learning methods

• Approaches for ranking on Networks and Graphs

• Methods and approaches for community detection

• Methods for community evaluation/assessment

• Overview of approaches for link prediction

• Brief outlook onto multiplex networks

