
Machine Learning for Complex Data

Networks and Sequences


Summer semester 2023

Prof. Dr. Martin Atzmüller

Osnabrück University & DFKI


Semantic Information Systems Group
https://sis.cs.uos.de/
4 – Basic Graph/Network Methods:
Ranking, Community Detection, Link Prediction

1 Overview
2 Overview: Graph/Network Methods
3 Ranking on Networks/Graphs
4 Community Detection
5 Link Prediction
6 Summary

Basic Graph/Network Methods

• Ranking:
– Importance/impact/influence of nodes
– Node-centered criterion
– Centralities
– PageRank algorithm
• Community Detection:
– Detecting groups/clusters etc.
– Different criteria; nodes and edges
– Basically: connection structure
– Extensions: labels of nodes and/or edges
– Compare to machine learning – clustering algorithms
• Link Prediction:
– Consider link structure
– Structural: hidden/missing links in a network
– Temporal: predicting new links to appear in a network

Reprise: Complex networks


Definition
Graphs abstracting, directly or indirectly, interactions in real-world
systems.

Basic topological features
• Low density
• Small diameter
• Scale-free
• High clustering coefficient

Notation

A graph G = (V, E), with E ⊆ V × V

• V: set of nodes (a.k.a. vertices, actors, sites)
• E: set of edges (a.k.a. ties, links, bonds)

Notations
• A_G: adjacency matrix, with a_ij ≠ 0 iff (v_i, v_j) ∈ E, and 0 otherwise
• n = |V|
• m = |E|; often m ∼ n
• Γ(v): neighbors of node v, Γ(v) = {x ∈ V : (x, v) ∈ E}
• Node degree: d(v) = |Γ(v)|

Centralities & Ranking


Reprise: Definitions I

• In a directed graph, E denotes a subset of V × V.
• The density of G is the fraction of possible edges that are actually present.
• In a weighted graph, each edge l ∈ E is given an edge weight w(l) by some weighting function w: E → ℝ.
• The degree of a node in a network is the number of connections it has to other nodes.
• The adjacency matrix of a set of nodes S with n = |S| contained in a (weighted) graph G = (V, E) is a matrix A ∈ ℝ^{n×n} with A_ij = 1 (A_ij = w(i, j)) iff (i, j) ∈ E for any nodes i, j in S (assuming some bijective mapping from 1, ..., n to S). We identify a graph with its corresponding adjacency matrix where appropriate.

Reprise: Definitions II

• A path v₀ →_G v_n of length n in a graph G is a sequence v₀, ..., v_n of nodes with n ≥ 1 and (v_i, v_{i+1}) ∈ E for i = 0, ..., n − 1.
• A shortest path between nodes u and v is a path u →_G v of minimal length; the diameter of G is the length of the longest shortest path between any pair of nodes.
• The transitive closure of a graph G = (V, E) is given by G* = (V, E*) with (u, v) ∈ E* iff there exists a path u →_G v.
• A strongly connected component (SCC) of G is a subset U ⊆ V such that (u, v) ∈ E* for every u, v ∈ U.
• A (weakly) connected component (WCC) is defined accordingly, ignoring the direction of the edges (u, v) ∈ E.

Reprise: Definitions III

• Many observations of network properties can be explained just by the network's degree distribution.
• It is therefore important to contrast the observed property with the corresponding result obtained on a random graph sharing the same degree distribution, used as a null model.
• If a single network G is considered, a corresponding null model can be obtained by randomly replacing edges (u₁, v₁), (u₂, v₂) ∈ E with (u₁, v₂) and (u₂, v₁), ensuring that these edges were not present in G beforehand.
• This rewiring step is typically repeated a number of times proportional to the cardinality of the graph's edge set.

Centralities

Node centralities

• Idea: estimate the importance and/or influence of a node
• PageRank, HITS algorithms: ranking of web search results
• FolkRank algorithm: tag recommendation
• Betweenness, Closeness: importance of actors in a social network
(Figure: degree centrality example)

Node Centrality I

• The betweenness centrality bet measures the number of shortest paths of all node pairs that go through a specific node:

$$bet(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

Hereby, σ_st denotes the number of shortest paths between s and t, and σ_st(v) is the number of shortest paths between s and t passing through v. Thus, a vertex has a high betweenness centrality if it can be found on many shortest paths between other vertex pairs.

Node Centrality II
• The closeness centrality clos considers the length of these shortest paths: the shorter its shortest path length to all other reachable nodes, the higher a vertex ranks:

$$clos(v) = \frac{1}{\sum_{t \in V \setminus \{v\}} d_G(v, t)}$$

Hereby, d_G(v, t) denotes the geodesic distance (shortest path length) between the vertices v and t.

• The eigenvector centrality eig of a node is an important measure of its influence, similar to the PageRank measure. Intuitively, a node is central if it has many central neighbors. The eigenvector centrality eig(v) of node v is defined as

$$eig(v) = \lambda \sum_{\{u,v\} \in E} eig(u),$$

where λ ∈ ℝ is a constant.
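These centralities can be computed directly with networkx; a minimal sketch on an arbitrary example graph:

```python
import networkx as nx

G = nx.karate_club_graph()  # example graph

bet = nx.betweenness_centrality(G)   # fraction of shortest paths through each node
clos = nx.closeness_centrality(G)    # based on inverse summed geodesic distances
eig = nx.eigenvector_centrality(G)   # entries of the principal eigenvector of A

# Rank nodes by betweenness, highest first
top3 = sorted(G.nodes, key=bet.get, reverse=True)[:3]
print(top3, [round(bet[v], 3) for v in top3])
```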

Ranking

• Estimate: importance of a node
• In complex (social) systems also: influence of a node
• Centrality-based: different criteria – e.g., degree-based, path-based (relating, e.g., to degree, betweenness, closeness, eigenvector centrality)
• Now: PageRank:
– Considers link structure/paths
– Many applications in graph-based machine learning
– Example: optimizing network structure in graph neural networks

Ranking Methods
• 2 independently developed algorithms
– PageRank (Brin & Page)
– HITS: Hypertext Induced Topic Search (Kleinberg)
• Basic idea:
– Popularity
– Link-based
– Intuitively: web pages are popular if many popular pages link to them
– "PageRank is a global ranking of all web pages, regardless of their content, based solely on their location in the Web's graph structure." [Page et al. 1998]


Popularity-based Ranking
• Hubs and Authorities
– Hubs link to highly rated nodes (collecting links)
– Authorities are linked to (as being relevant) from
other popular nodes


PageRank


Lecture: Web Science

PageRank: Original Summation Formula


[Langville, Meyer 2004]
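In the notation of Langville and Meyer (2004), the original summation formula reads

$$r(P_i) = \sum_{P_j \in B_{P_i}} \frac{r(P_j)}{|P_j|},$$

where $B_{P_i}$ is the set of pages pointing to $P_i$ and $|P_j|$ is the number of outlinks of page $P_j$.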


Starting Problems
• Sinks & cycles
– Some pages get fully ranked, some get no "ranking" contribution
– Cycles "reverse" the evaluation
– Some nodes have no outgoing edges ⇒ dangling nodes
• How many iterations?
– Does the process converge?
– Does it converge to a single vector?

Approach of Brin & Page


• Crucial idea: "Random Surfer"
– Similar to navigating through the web (hyperlinks)
– Example: 6 links ⇒ probability 1/6 of selecting a specific link
– For dangling nodes: jump to an arbitrary node with equal probability
– Also: stop following a link path with a given (fixed) probability
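A minimal power-iteration sketch of this random-surfer model (the damping factor 0.85 and the 4-node example graph are illustrative choices):

```python
import numpy as np

def pagerank(A, d=0.85, tol=1e-10):
    """PageRank by power iteration.
    A[i, j] = 1 iff page i links to page j; d is the probability of
    following a link, 1-d the probability of stopping/teleporting.
    Dangling nodes jump to an arbitrary node with equal probability."""
    n = A.shape[0]
    H = np.empty((n, n))
    for i in range(n):
        out = A[i].sum()
        H[i] = A[i] / out if out > 0 else np.full(n, 1.0 / n)
    r = np.full(n, 1.0 / n)
    while True:
        r_next = d * (H.T @ r) + (1 - d) / n   # follow links + teleport
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)
print(pagerank(A).round(3))
```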



Google Matrix: Step by Step

PageRank: Result

Community Detection


Community detection

Overview
• Given a network/graph, find "modules"
– Single network
– Multiplex networks
– Attributed networks
• Community structures
– Graph clustering / disjoint communities
– Hierarchical organization
– Overlapping communities
• Questions:
– What is "a community"?
– What are "good" communities?
– How do we evaluate these?

Community detection

Definitions
• A dense subgraph loosely coupled to other modules in the network
• A community is a set of nodes seen as "one" by nodes outside of the community
• A subgraph where almost all nodes are linked to other nodes in the community
• ...

Subjectivity of Community Definition


(Two readings of the same graph: the whole graph as one densely-knit community, or each component as a community.)

Taxonomy of Community Criteria


• Criteria vary depending on the tasks
• Roughly, community detection methods can be divided
into 4 categories (not exclusive):
• Node-Centric Community
– Each node in a group satisfies certain properties
• Group-Centric Community
– Consider the connections within a group as a whole; the group has to satisfy certain properties without zooming into node level
• Network-Centric Community
– Partition the whole network into several disjoint sets
• Hierarchy-Centric Community
– Construct a hierarchical structure of communities

Complete Mutuality: Cliques


• Clique: a maximum complete subgraph in which all
nodes are adjacent to each other

Nodes 5, 6, 7 and 8 form a clique

• NP-hard to find the maximum clique in a network


• A straightforward implementation for finding cliques is very expensive in terms of time complexity

Finding the Maximum Clique


• In a clique of size k, each node has degree ≥ k − 1
• Nodes with degree < k − 1 will not be included in the maximum clique
• Recursively apply the following pruning procedure (a code sketch follows below):
– Sample a sub-network from the given network, and find a clique in the sub-network, say, by a greedy approach
– Suppose the clique above has size k; in order to find a larger clique, all nodes with degree ≤ k − 1 can be removed
• Repeat until the network is small enough
• Many nodes will be pruned, as complex networks typically follow a power-law distribution for node degrees
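A minimal sketch of the pruning loop (the function name is illustrative):

```python
import networkx as nx

def prune_for_larger_clique(G, k):
    """Remove nodes that cannot be part of a clique larger than k,
    i.e., nodes with degree <= k-1; repeat, since removals lower degrees."""
    H = G.copy()
    while True:
        low = [v for v in H if H.degree(v) <= k - 1]
        if not low:
            return H
        H.remove_nodes_from(low)
```

After finding a greedy clique of size k in a sampled sub-network, prune_for_larger_clique(G, k) shrinks the remaining search space before the next iteration.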

Maximum Clique Example

• Suppose we sample a sub-network with nodes


{1-5} and find a clique {1, 2, 3} of size 3
• In order to find a clique >3, remove all nodes with
degree <=3-1=2
– Remove nodes 2 and 9
– Remove nodes 1 and 3
– Remove node 4


Clique Percolation Method (CPM)


• A clique is a very strict definition, and unstable
• Normally, cliques are used as a core or a seed to find larger communities
• CPM is such a method to find overlapping communities
– Input: a parameter k, and a network
– Procedure (see the sketch after the example below):
• Find all cliques of size k in the given network
• Construct a clique graph: two cliques are adjacent if they share k − 1 nodes
• Each connected component in the clique graph forms a community


CPM Example
Cliques of size 3:
{1, 2, 3}, {1, 3, 4}, {4, 5, 6},
{5, 6, 7}, {5, 6, 8}, {5, 7, 8},
{6, 7, 8}

Communities:
{1, 2, 3, 4}
{4, 5, 6, 7, 8}
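networkx implements CPM as k_clique_communities; the sketch below reproduces the example above (the edge list is chosen to induce exactly the listed size-3 cliques):

```python
import networkx as nx
from networkx.algorithms.community import k_clique_communities

G = nx.Graph([(1, 2), (1, 3), (2, 3), (1, 4), (3, 4),
              (4, 5), (4, 6), (5, 6), (5, 7), (5, 8),
              (6, 7), (6, 8), (7, 8)])

# Two cliques are adjacent in the clique graph if they share k-1 = 2 nodes
print([sorted(c) for c in k_clique_communities(G, 3)])
# -> [[1, 2, 3, 4], [4, 5, 6, 7, 8]]
```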


Reachability: k-clique, k-club

• Any node in a group should be reachable within k hops
• k-clique: a maximal subgraph in which the largest geodesic distance between any two nodes is ≤ k
• k-club: a substructure of diameter ≤ k
Example – Cliques: {1, 2, 3}; 2-cliques: {1, 2, 3, 4, 5}, {2, 3, 4, 5, 6}; 2-clubs: {1, 2, 3, 4}, {1, 2, 3, 5}, {2, 3, 4, 5, 6}
• Note: a k-clique might have a diameter larger than k in the induced subgraph (the geodesic distances are measured in the whole graph)
• Detection: often involves combinatorial optimization

Group-Centric: Density-Based Groups


• The group-centric criterion requires the whole group to satisfy a certain condition
– E.g., the group density ≥ a given threshold
• A subgraph is a γ-quasi-clique if its density is at least the threshold γ
• A similar strategy to that of cliques can be used:
– Sample a subgraph, and find a maximal quasi-clique (say, of size k)
– Remove nodes whose degree is too low for a larger quasi-clique

Network-Centric Community Detection


• Network-centric criterion needs to consider the
connections within a network globally
• Goal: partition nodes of a network into disjoint sets
• Approaches:
– Clustering based on vertex similarity
– Latent space models
– Block model approximation
– Spectral clustering
– Modularity maximization


Clustering based on Vertex Similarity


• Apply k-means or similarity-based clustering to nodes
• Vertex similarity is defined in terms of the similarity of
their neighborhood
• Structural equivalence: two nodes are structurally equivalent iff they connect to the same set of actors
Example: nodes 1 and 3 are structurally equivalent; so are nodes 5 and 6.
• Structural equivalence is too restrictive for practical use.

Vertex Similarity
• Jaccard similarity: $J(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{|\Gamma(u) \cup \Gamma(v)|}$
• Cosine similarity: $\cos(u, v) = \frac{|\Gamma(u) \cap \Gamma(v)|}{\sqrt{|\Gamma(u)| \cdot |\Gamma(v)|}}$
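A minimal sketch of both neighborhood similarities for a networkx graph:

```python
import math
import networkx as nx

def jaccard(G, u, v):
    Nu, Nv = set(G[u]), set(G[v])
    union = Nu | Nv
    return len(Nu & Nv) / len(union) if union else 0.0

def cosine(G, u, v):
    Nu, Nv = set(G[u]), set(G[v])
    return len(Nu & Nv) / math.sqrt(len(Nu) * len(Nv)) if Nu and Nv else 0.0
```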

Latent Space Models


• General idea: Map nodes into a
low-dimensional space

• Multi-dimensional scaling (MDS)


– Given a network, construct a
proximity matrix representing the
pairwise distance between nodes
(e.g., geodesic distance)
– Construct clusters in the new space:
• Apply clustering such that the proximity between nodes based on network connectivity is preserved
⇒ Apply k-means clustering
Example result: two communities {1, 2, 3, 4} and {5, 6, 7, 8, 9}

Block Models

• S is the community indicator matrix
• Relax S to take numerical values; then the optimal solution corresponds to the top eigenvectors of A
Example result: two communities {1, 2, 3, 4} and {5, 6, 7, 8, 9}

Cut
• Most interactions are within groups, whereas interactions between groups are few
• Community detection ⇒ minimum cut problem
• Cut: a partition of the vertices of a graph into two disjoint sets
• Minimum cut problem: find a graph partition such that the number of edges between the two sets is minimized

Ratio Cut & Normalized Cut

• Minimum cut often returns an imbalanced partition, with one set being a singleton
• Change the objective function to take community size into account (see the definitions below):
– C_i: a community
– |C_i|: number of nodes in C_i
– vol(C_i): sum of degrees in C_i
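Written out, the standard objective functions for a partition π = (C₁, ..., C_k), with cut(C_i, C̄_i) denoting the number of edges between C_i and the rest of the network, are:

$$\mathrm{RatioCut}(\pi) = \frac{1}{k} \sum_{i=1}^{k} \frac{\mathrm{cut}(C_i, \bar{C}_i)}{|C_i|}, \qquad \mathrm{NCut}(\pi) = \frac{1}{k} \sum_{i=1}^{k} \frac{\mathrm{cut}(C_i, \bar{C}_i)}{\mathrm{vol}(C_i)}$$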

Ratio Cut & Normalized Cut Example

(Worked example on the slide: ratio cut and normalized cut values for a red and a green partition.)
Both ratio cut and normalized cut prefer a balanced partition.

Modularity Maximization
• Modularity measures the strength of a community partition by taking the degree distribution into account
• Given a network with m edges, the expected number of edges between two nodes with degrees d_i and d_j is d_i · d_j / (2m)
Example: with m = 14, the expected number of edges between nodes 1 and 2 (degrees 3 and 2) is 3 · 2 / (2 · 14) = 3/14
• Strength of a community: the number of observed intra-community edges minus the number expected under the null model
• Modularity: the normalized sum of these strengths over all communities (written out below)
• A larger value indicates a good community structure
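Written out in its standard form, modularity is

$$Q = \frac{1}{2m} \sum_{\ell} \sum_{i \in C_\ell} \sum_{j \in C_\ell} \left( A_{ij} - \frac{d_i d_j}{2m} \right),$$

and networkx can evaluate it directly; a minimal sketch (example graph and greedy partitioning are illustrative choices):

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

G = nx.karate_club_graph()
parts = greedy_modularity_communities(G)  # agglomerative modularity maximization
print(modularity(G, parts))               # larger values = better partition
```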

Modularity Matrix
• Modularity matrix: B_ij = A_ij − d_i d_j / (2m)
• Similar to spectral clustering, modularity maximization can be reformulated as a trace optimization problem (see below)
• Optimal solution: top eigenvectors of the modularity matrix
• Apply k-means to S as a post-processing step to obtain the community partition
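The reformulation referenced above takes, in its standard form (with S relaxed to real values),

$$\max_{S}\; \operatorname{tr}(S^{T} B S) \quad \text{s.t.} \quad S^{T} S = I,$$

whose optimum is attained by the top eigenvectors of the modularity matrix B.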

Modularity Maximization Example

(Slide example: the modularity matrix of the example graph is computed, and k-means on its top eigenvector yields two communities: {1, 2, 3, 4} and {5, 6, 7, 8, 9}.)

Hierarchy-Centric Community Detection


• Goal: build a hierarchical structure of communities
based on network topology

• Allow the analysis of a network at different


resolutions

• Representative approaches:
– Divisive Hierarchical Clustering
– Agglomerative Hierarchical clustering


Divisive Hierarchical Clustering


• Divisive clustering
– Partition nodes into several sets
– Each set is further divided into smaller ones
– Network-centric partition can be applied for the partition
• One particular example: recursively remove the "weakest" tie
– Find the edge with the least strength
– Remove the edge and update the corresponding strength of each edge
• Recursively apply the above two steps until the network is decomposed into the desired number of connected components (a code sketch follows below)
• Each component forms a community
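The edge-betweenness instantiation of this scheme (detailed on the next slides) is the Girvan–Newman algorithm, available in networkx; a minimal sketch on an example graph:

```python
import networkx as nx
from networkx.algorithms.community import girvan_newman

G = nx.karate_club_graph()
splits = girvan_newman(G)       # repeatedly removes the edge with highest betweenness
two_communities = next(splits)  # first decomposition into two components
print([sorted(c) for c in two_communities])
```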

Edge Betweenness
• The strength of a tie can be measured by edge
betweenness
• Edge betweenness: the number of shortest paths that pass along the edge
Example: the edge betweenness of e(1, 2) is 4: of the six shortest paths from 2 to {4, 5, 6, 7, 8, 9}, half pass e(1, 2) and half pass e(2, 3) (contributing 3), and e(1, 2) is itself the shortest path between 1 and 2 (contributing 1).
• An edge with higher betweenness tends to be a bridge between two communities.

Divisive Clustering Based on Edge Betweenness

(Initial betweenness values are shown on the slide.)
• After removing e(4, 5), the betweenness of e(4, 6) becomes 20, which is the highest
• After removing e(4, 6), the edge e(7, 9) has the highest betweenness value, 4, and should be removed next

Agglomerative Hierarchical Clustering


• Initialize each node as a community
• Merge communities successively into larger
communities following a certain criterion
– E.g., based on modularity increase


Further Community Quality Functions I

One example we have already seen – Modularity – for estimating the quality of a community partition.
• For a given undirected graph G = (V, E) and a community C ⊆ V we use the following notation:
– n := |V|,
– m := |E|,
– n_C := |C|,
– m_C := |{{u, v} ∈ E : u, v ∈ C}| – the number of intra-edges of C, and
– m̄_C := |{{u, v} ∈ E : |{u, v} ∩ C| = 1}| – the number of inter-edges of C.
• Furthermore, it is convenient to introduce an inter-degree for a node u ∈ C (that depends on the choice of C) by d̄_C(u) := |{{u, v} ∈ E : v ∉ C}|, counting the number of edges between u and nodes outside of C.

Further Community Quality Functions II

• The Inverse Average-ODF (out-degree fraction) IAODF compares the number of inter-edges to the number of all edges of a community C, and averages this for the whole community by considering the fraction for each individual node:

$$\mathrm{IAODF}(C) := 1 - \frac{1}{n_C} \sum_{u \in C} \frac{\bar{d}_C(u)}{d(u)} \qquad (1)$$

• The segregation index SIDX compares the number of expected inter-edges to the number of observed inter-edges, normalized by the expectation:

$$\mathrm{SIDX}(C) = \frac{E(\bar{m}_C) - \bar{m}_C}{E(\bar{m}_C)} = 1 - \frac{\bar{m}_C \, n(n-1)}{2m \, n_C (n - n_C)} \qquad (2)$$
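A minimal sketch of both quality functions, following Equations (1) and (2) (assuming an undirected networkx graph and a community given as a set of nodes):

```python
import networkx as nx

def iaodf(G, C):
    """Inverse Average-ODF, Eq. (1)."""
    C = set(C)
    frac = sum(sum(1 for v in G[u] if v not in C) / G.degree(u) for u in C)
    return 1 - frac / len(C)

def sidx(G, C):
    """Segregation index, Eq. (2)."""
    C = set(C)
    n, m, n_C = G.number_of_nodes(), G.number_of_edges(), len(C)
    inter = sum(1 for u, v in G.edges if (u in C) != (v in C))  # inter-edges of C
    return 1 - inter * n * (n - 1) / (2 * m * n_C * (n - n_C))
```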

Community Detection on Attributed Graphs I

First: What is an attributed graph?


• Integrates additional attribute data about individual instances
• Example: In a social network, attributes describing each user’s
characteristics might be combined with the underlying
friendship network to form an attributed graph
• Also, the relationships (friendship) could be labeled with some information, e.g., specifying the "degree" of friendship
Other terms: "graphs with feature vectors", "labeled graphs", or "annotated graphs"; the term "network" is also often substituted for "graph"

Community Detection on Attributed Graphs II


Definition 1 (Attributed Graph)
An attributed graph is a graph G in which each node v ∈ V is associated with a vector of attribute values x = (x₁, ..., x_d), and each edge e ∈ E with a vector y = (y₁, ..., y_t). We use a_i(v) to refer to the i-th attribute value of a node v, and a_i(e) for the edge e, respectively. We denote by A_V the set of node attributes, d = |A_V|, and by A_E the set of edge attributes, t = |A_E|. If |A_V| > 0 and |A_E| = 0, we refer to G as node-attributed; similarly, if |A_V| = 0 and |A_E| > 0, we call it edge-attributed. If |A_V| = 0 and |A_E| = 0, we refer to a plain graph.

• Classic community detection just identifies subgroups of nodes with a dense structure
• Thus, this lacks an interpretable description – a community is just given by a set of nodes / node IDs

Community Detection on Attributed Graphs III

• With attributed graphs, it becomes possible to use attribute-based descriptions: description-oriented community detection
• Interpretable, actionable (recommendations, etc.)
• Local patterns (not a global partitioning)
• Main idea: detect descriptions of subsets of nodes (subgraphs) with a high community quality function
• Example: COMODO algorithm (Atzmueller et al. 2016)
– Applies descriptive community detection
– Core idea: discover subsets of nodes according to a description (pattern)
– The subset should have high quality according to a community quality function
– Thus, two representations:
• Subgraph (set of nodes) – graph structure level
• Descriptive pattern (set of attributes) – inducing the subgraph

Knowledge-Based Systems (KBS), Prof. Dr. M. Atzmüller, Osnabrück University

Transformation & Mining


■ Dataset of edges connecting two nodes
■ Described by intersection of labels of the two nodes
■ Additionally: Store nodes, and respective degrees
■ Apply
top-k method w/ optimistic-estimate pruning
(COMODO)
Web Mining,
Web Mining, Computer,
Computer JavaScript

Web Mining,
Computer, Java

60
Knowledge-Based Systems (KBS), Prof. Dr. M. Atzmüller, Osnabrück University

■ The algorithm utilizes a special FP-tree-like structure & optimistic estimates for efficient processing

Evaluation: Datasets
■ BibSonomy dump (until January 2010)
– 175,521 tags, 5,579 users, 467,291 resources, 2,120,322 tag assignments, 700 friendship links
– Friend, Click, Visit graphs
■ Delicious (HetRec workshop): 1,861 users, 7,664 bi-directional links, 53,388 tags
■ Last.fm (HetRec workshop): 1,892 users, 12,717 bi-directional links, 11,946 tags

Results: Pruning performance

BibSonomy friend graph


Results: Pruning Performance

Last.fm friend graph Delicious friend graph


Community Quality - Optimistic Estimates I

Making use of the minimum support threshold τ_n, we can first observe the following inequality for each subcommunity C′ of a community C with a size above the minimal size threshold τ_n, i.e., |C′| ≥ τ_n:

$$\bar{m}_{C'} = \sum_{i=1}^{n_{C'}} \bar{\delta}_{C'}(i) \;\geq\; \sum_{i=1}^{n_{C'}} \bar{\delta}_{C}(i) \;\geq\; \sum_{i=1}^{\tau_n} \bar{\delta}_{C}(i).$$

Here, we assume that the values δ̄_C(i), i = 1, ..., n_C, and δ̄_{C′}(i), i = 1, ..., n_{C′}, are the inter-degrees of the nodes in C and C′, respectively, in ascending order, such that δ̄_C(i), i = 1, ..., τ_n, denote the minimal τ_n inter-degrees with respect to C.

Community Quality - Optimistic Estimates II

Proposition 2
An optimistic estimate for SIDX(C) is given by

$$oe(\mathrm{SIDX}(C)) := 1 - \frac{n(n-1)}{2m} \cdot \max\left\{ \frac{\sum_{i=1}^{\tau_n} \bar{\delta}_C(i)}{p(C)},\; \min_{t=\tau_n}^{n_C} \frac{\sum_{i=1}^{t} \bar{\delta}_C(i)}{t(n-t)} \right\},$$

where

$$p(C) := \begin{cases} \frac{n^2}{4}, & \text{if } n_C \geq \frac{n}{2}, \\ n_C(n-n_C), & \text{otherwise.} \end{cases}$$

Community Quality - Optimistic Estimates III

Proof.
For a subcommunity C′ ⊆ C with |C′| ≥ τ_n we have

$$\mathrm{SIDX}(C') = 1 - \frac{n(n-1)}{2m} \cdot \frac{\bar{m}_{C'}}{n_{C'}(n-n_{C'})} \leq 1 - \frac{n(n-1)}{2m} \cdot \frac{\sum_{i=1}^{n_{C'}} \bar{\delta}_C(i)}{n_{C'}(n-n_{C'})} \qquad (3)$$

$$\leq 1 - \frac{n(n-1)}{2m} \cdot \frac{\sum_{i=1}^{\tau_n} \bar{\delta}_C(i)}{\max_{t=\tau_n}^{n_C}\{t(n-t)\}}. \qquad (4)$$

From (3) it is clear that $1 - \frac{n(n-1)}{2m} \min_{t=\tau_n}^{n_C} \frac{\sum_{i=1}^{t} \bar{\delta}_C(i)}{t(n-t)}$ is an optimistic estimate for SIDX(C). On the other hand, we have $\max_{t=\tau_n}^{n_C}\{t(n-t)\} = p(C)$, since $t(n-t)$ has its maximum at $t = \frac{n}{2}$. Together with (4) we obtain $1 - \frac{n(n-1)}{2m} \cdot \frac{\sum_{i=1}^{\tau_n} \bar{\delta}_C(i)}{p(C)}$ as another optimistic estimate. ∎

Community Quality - Optimistic Estimates IV

Proposition 3
For the inverse Average-ODF, let $\tilde{d}_C(u) := \frac{\bar{d}_C(u)}{d(u)}$ and let $\tilde{\delta}_C(i)$, $i = 1, \ldots, n_C$, denote these ratios for all nodes in C in ascending order. Then

$$oe(\mathrm{IAODF}(C)) := 1 - \frac{1}{\tau_n} \sum_{i=1}^{\tau_n} \tilde{\delta}_C(i)$$

is an optimistic estimate for IAODF(C).

Community Quality - Optimistic Estimates V

Proof.
For a subcommunity C′ ⊆ C with |C′| ≥ τ_n we have

$$\mathrm{IAODF}(C') = 1 - \frac{1}{n_{C'}} \sum_{u \in C'} \frac{|\{\{u,v\} \in E : v \in V \setminus C'\}|}{d(u)} \leq 1 - \frac{1}{n_{C'}} \sum_{u \in C'} \frac{|\{\{u,v\} \in E : v \in V \setminus C\}|}{d(u)}$$

$$= 1 - \frac{1}{n_{C'}} \sum_{u \in C'} \tilde{d}_C(u) \leq 1 - \frac{1}{n_{C'}} \sum_{i=1}^{n_{C'}} \tilde{\delta}_C(i) \leq 1 - \frac{1}{\tau_n} \sum_{i=1}^{\tau_n} \tilde{\delta}_C(i). \;\; ∎$$

Community Quality - Optimistic Estimates VI

Proposition 4
An optimistic estimate for the local modularity contribution can be derived based only on the number of edges m_C within the community:

$$oe(\mathrm{MODL}(C)) = \begin{cases} 0.25, & \text{if } m_C \geq \frac{m}{2}, \\ \frac{m_C}{m} - \frac{m_C^2}{m^2}, & \text{otherwise.} \end{cases}$$

A simple but useful observation is the following equation that combines some of the above defined entities for a community C:

$$\sum_{i \in C} d(i) = 2 m_C + \bar{m}_C. \qquad (5)$$

Community Quality - Optimistic Estimates VII


Proof.
Using Equation (5) we obtain:

$$\mathrm{MODL}(C) = \frac{m_C}{m} - \sum_{u,v \in C} \frac{d(u)\,d(v)}{4m^2} = \frac{m_C}{m} - \frac{1}{4m^2} \sum_{u \in C} \sum_{v \in C} d(u)\,d(v)$$

$$= \frac{m_C}{m} - \frac{1}{4m^2} \sum_{u \in C} d(u)\,(2m_C + \bar{m}_C) = \frac{m_C}{m} - \frac{1}{4m^2}\,(2m_C + \bar{m}_C)^2$$

$$\leq \frac{m_C}{m} - \frac{m_C^2}{m^2} = oe(\mathrm{MODL}(C)). \;\; ∎$$

How to evaluate community detection algorithms and the detected communities, respectively?

Evaluating Community Detection


• For groups with clear definitions
– E.g., Cliques, k-cliques, k-clubs, quasi-cliques
– Verify whether extracted communities satisfy the
definition
• For networks with ground truth information
– Normalized mutual information
– Accuracy of pairwise community memberships
• Using Evidence networks
– Data-driven approach
– Inexpensive, if secondary data is available


Measuring a Clustering Result


Example – Ground Truth: {1, 2, 3}, {4, 5, 6}; Clustering Result: {1, 3}, {2}, {4, 5, 6}

How to measure the clustering quality?
• The number of communities after grouping can be different from the ground truth
• There is no clear community correspondence between the clustering result and the ground truth
• Normalized Mutual Information can be used

Normalized Mutual Information


• Entropy: the information contained in a distribution
• Mutual Information: the shared information between two distributions
• Normalized Mutual Information (between 0 and 1)
• Considering a partition as a distribution (the probability of one node falling into one community), we can compute the matching between two clusterings (formulas below)
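Written out (standard definitions; the geometric-mean normalization below is the variant matching the example value on the next slides):

$$H(\pi^a) = -\sum_{h=1}^{k^{(a)}} \frac{n_h^a}{n} \log \frac{n_h^a}{n}, \qquad MI(\pi^a, \pi^b) = \sum_{h=1}^{k^{(a)}} \sum_{l=1}^{k^{(b)}} \frac{n_{h,l}}{n} \log \frac{n \cdot n_{h,l}}{n_h^a \, n_l^b},$$

$$NMI(\pi^a, \pi^b) = \frac{MI(\pi^a, \pi^b)}{\sqrt{H(\pi^a)\, H(\pi^b)}}$$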

NMI
k^(a): number of communities in partition a
k^(b): number of communities in partition b

Notation:
π^a, π^b denote the different partitions.
n_h^a: number of nodes in partition a assigned to the h-th community
n_l^b: number of nodes in partition b assigned to the l-th community
n_{h,l}: number of nodes assigned to the h-th community in partition a and to the l-th community in partition b

NMI Example
• Partition a: [1, 1, 1, 2, 2, 2], i.e., communities {1, 2, 3} and {4, 5, 6}
• Partition b: [1, 2, 1, 3, 3, 3], i.e., communities {1, 3}, {2}, {4, 5, 6}

n_h^a: h=1: 3, h=2: 3
n_l^b: l=1: 2, l=2: 1, l=3: 3
n_{h,l}: (h=1): l=1: 2, l=2: 1, l=3: 0; (h=2): l=1: 0, l=2: 0, l=3: 3

NMI = 0.8278
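The example value can be reproduced with scikit-learn (the geometric-mean normalization matches the slide's result):

```python
from sklearn.metrics import normalized_mutual_info_score

a = [1, 1, 1, 2, 2, 2]
b = [1, 2, 1, 3, 3, 3]
print(normalized_mutual_info_score(a, b, average_method='geometric'))
# -> 0.8278...
```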

Accuracy of Pairwise Memberships


• Consider all the possible pairs of nodes and check whether they reside in
the same community
• An error occurs if
– Two nodes belonging to the same community are
assigned to different communities after clustering
– Two nodes belonging to different communities are
assigned to the same community
• Construct a contingency table


Accuracy Example
Ground Truth: {1, 2, 3}, {4, 5, 6} – Clustering Result: {1, 3}, {2}, {4, 5, 6}

Contingency table (rows: clustering result; columns: ground truth):
                | C(vi) = C(vj) | C(vi) ≠ C(vj)
C(vi) = C(vj)   |       4       |       0
C(vi) ≠ C(vj)   |       2       |       9

Accuracy = (4 + 9) / (4 + 2 + 9 + 0) = 13/15
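A minimal sketch reproducing this example (the community identifiers are arbitrary):

```python
from itertools import combinations

ground = {1: 'A', 2: 'A', 3: 'A', 4: 'B', 5: 'B', 6: 'B'}
result = {1: 'x', 2: 'y', 3: 'x', 4: 'z', 5: 'z', 6: 'z'}

pairs = list(combinations(ground, 2))
# a pair is correct if both partitions agree on "same community or not"
ok = sum((ground[u] == ground[v]) == (result[u] == result[v]) for u, v in pairs)
print(ok, '/', len(pairs))  # -> 13 / 15
```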

Community Assessment
w/ Evidence Networks
• Evaluation & comparison of communities
• Are the discovered communities meaningful?
• Hard problem!
– Gold standard ⇒ costly!
– User study ⇒ costly!
• Approach in [Mitzlaff et al. 2011]
– Cost-effective
– Secondary data
– Relative ranking of communities

Data: Evidence Networks


[Mitzlaff et al. 2011]
– Explicit social relations are often sparse, e.g., the friendship network
⇒ Idea: use secondary data that is simple to acquire
• User interactions with typical "Web 2.0" applications establish implicit relations between users, by
§ visiting profile pages,
§ sending messages,
§ ...
⇒ "Evidence Networks"

Evidence Networks in BibSonomy


[Mitzlaff et al. 2011]
(Evidence networks in BibSonomy, ordered by explicitness and size:)
• Friend Graph: user v is a friend of user u
• Click Graph: user u clicked on a post of user v
• Visit Graph: user u looked at the user page of v

Evaluation: Paradigm I
[Mitzlaff et al. 2011]

I) Given: a community structure and an evidence network
II) Evaluation: map users to nodes in the evidence network and calculate a quality measure (e.g., modularity)

Evaluation: Paradigm II
[Mitzlaff et al. 2011]

• Follows the paradigm of reconstructing existing social structures
• No absolute quality figure for assessing community quality
• But: a relative ranking for given community allocations
⇒ The proposed paradigm yields the following workflow:
1) Automatically generate many different community allocations (different methods with exhaustive parameter variation)
2) Calculate a ranking over all models
3) Select the top-n candidate models and manually choose the best community allocation by inspection

Experiments: Description
[Mitzlaff et al. 2011]

• Problem: we need good communities before we can verify that their quality is assessed in a sensible way
• But we are only interested in the relative ranking of different community allocations
⇒ We generate many different initial community allocations, each of which is then gradually disturbed by randomization
1) We examine whether the proposed evaluation methodology captures the expected decrease in quality
2) We examine whether the different evidence networks induce consistent rankings
3) We check the correlation between the individual rankings and the rankings on the Friend Graph as a reference

Experiments: Results
[Mitzlaff et al. 2011]

Modularity in the Friend Graph
• LDA-x-kMeans-y: x topics, y communities
⇒ The quality measure decreases as expected for all considered initial clusterings (plot: quality from the initial clustering down to a random clustering)

Experiments: Results
Modularity in the Copy Graph
⇒ The ranking is similar for different evidence networks

Experiments: Results
[Mitzlaff et al. 2011]

Modularity in the Visit Graph
⇒ The ranking varies slightly for the top positions – the degree of explicitness correlates with ranking consistency

Experiments: Results
[Mitzlaff et al. 2011]
• Modularity and conductance values correlate highly between the Friend Graph and each of the other evidence networks (Friend Graph as a reference):

Evidence Network | Modularity  | Conductance
Follower Graph   | 0.86 ± 0.17 | 0.89 ± 0.28
Group Graph      | 0.91 ± 0.13 | 1.00 ± 0.01
Copy Graph       | 0.82 ± 0.17 | 0.99 ± 0.03
Click Graph      | 0.80 ± 0.17 | 0.99 ± 0.04
Visit Graph      | 0.72 ± 0.25 | 0.97 ± 0.06

Evaluation using Semantics


• For networks with semantics
– Networks come with semantic or attribute information of
nodes or connections
– Human subjects can verify whether the extracted
communities are coherent
• Evaluation is qualitative
• It is also intuitive and helps to understand a community
Example: an animal community and a health community

Application: User Recommendation


• Most users do not belong to one single research community
• Interests diverge and the number of users increases
• Relevant resources are hidden
⇒ Community detection
⇒ Personalization
⇒ User recommendation
⇒ Necessary: a descriptive method (see descriptive community detection above)
⇒ Goal: personalized view on the application

Conferator & MyGroup


[Atzmueller et al. 2011, it+ti]

• Social Conference Guidance


System
• GI: LWA 2010, 2011, 2012, 2015
• ACM Hypertext 2011
• INFORMATIK 2013
– "Virtual" business cards
– List of all conversations
– Personalization of schedule
– Recommendations (Talks, Persons)
– In-door Localization
• MyGroup: Working groups
– Current activities, locations
– Enhance interactions
– Expert finder


Link Prediction

Link prediction
• Structural: find hidden/missing links in a network (e.g., missing links in Wikipedia)
• Temporal: predict new links that appear at time t_p, based on the network state at time t < t_p

Link Prediction


Link Prediction - Formal


A graph G is an ordered pair (V, E) consisting of a set of vertices (nodes) and a set of edges. An edge (u, v) consists of a pair of nodes u, v representing a relationship between them. A social network can be abstracted as a graph, where actors correspond to nodes and the links between them correspond to edges. A node v is a neighbor of (adjacent to) a node u if there is an edge (u, v) between them. Γ(u) stands for the set of neighbors of a node u.

Let 𝒢 = {G_{t=0}, G_{t=1}, ..., G_{t=n}} be a temporal sequence of evolving graphs, where G_{t=i} = (V_{t=i}, E_{t=i}). For link prediction on such sequences, given t = n the goal is to predict the structure of the graph at t = n + 1, i.e., G_{t=n+1}. Specifically, we try to identify pairs (u, v) such that u, v ∈ V_{t=n+1} and (u, v) ∉ E_{t=n+1}.

Prominent approaches for link prediction consider similarity scores between pairs of nodes, e.g., based on the neighborhoods of the pair of nodes. One prominent neighborhood-based similarity score is the Common Neighbors score: it counts the number of common neighbors of a pair of nodes. For a pair of nodes (u, v) under observation, it can formally be written as:

$$CN(u, v) = |\Gamma(u) \cap \Gamma(v)|$$

Relatively simple approaches based on:


• Topological, e.g., common neighbors
• Path-based measures (e.g., distances)
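Several of these scores are built into networkx; a minimal sketch scoring all non-edges of an example graph:

```python
import networkx as nx

G = nx.karate_club_graph()
candidates = list(nx.non_edges(G))

# Common Neighbors (the CN score above)
cn = {(u, v): len(list(nx.common_neighbors(G, u, v))) for u, v in candidates}

# Built-in neighborhood scores yield (u, v, score) triples
jc = list(nx.jaccard_coefficient(G, candidates))
aa = list(nx.adamic_adar_index(G, candidates))

print(sorted(cn, key=cn.get, reverse=True)[:5])  # top-5 predicted links
```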


Applying Answer Set Programming for Knowledge-Based Link Prediction

Link Prediction – Simple Example
Figure 1. Interaction network (links are split into subsets based on time). Connections observed in the first interval are used, e.g., with neighborhood-based measures, to predict links in the following intervals; such predictions can be complemented by domain knowledge, here questionnaire information.
96
Machine Learning for Complex Data, Prof. Dr. M. Atzmüller, Osnabrück University

Link Prediction ... Illustrated


Prediction of new and recurring contacts,
e.g., Face-to-Face contacts at a conference


Prediction of New Contacts


Prediction of Recurring Contacts


Example: Prediction of New Contacts


with Common Neighbors as Predictor
(Figure: contact graphs for the first day vs. the second and third day.)

Unsupervised ...
[Eberhard, Trattner & Atzmueller, 2018]
Table 7 – Overview of all features: formal definition and mean values of user pairs with trading interactions (∅Valw).

Online social network – topological:
• O+CN – num. common outgoing neighbors: O+CN(u, v) = |Θ+(u) ∩ Θ+(v)|; ∅Valw = 1.16e−01
• O−CN – num. common incoming neighbors: O−CN(u, v) = |Θ−(u) ∩ Θ−(v)|; ∅Valw = 1.56e−01
• O+JC – outgoing Jaccard's coefficient: O+JC(u, v) = |Θ+(u) ∩ Θ+(v)| / |Θ+(u) ∪ Θ+(v)|; ∅Valw = 2.13e−03
• O−JC – incoming Jaccard's coefficient: O−JC(u, v) = |Θ−(u) ∩ Θ−(v)| / |Θ−(u) ∪ Θ−(v)|; ∅Valw = 2.39e−03
• O+PS – preferential attachment (+−): O+PS(u, v) = |Θ+(u)| · |Θ−(v)|; ∅Valw = 1.05e+02
• O−PS – preferential attachment (−+): O−PS(u, v) = |Θ−(u)| · |Θ+(v)|; ∅Valw = 9.41e+01
• OR – reciprocity of user communication: OR(u, v) = 0 if (u, v) ∈ E_O, (v, u) ∉ E_O; 1 if (u, v) ∈ E_O, (v, u) ∈ E_O; ∅Valw = 1.86e−02
• OAA – Adamic–Adar: OAA(u, v) = Σ_{z ∈ Θ−(u) ∩ Θ−(v)} 1 / log(|Θ−(z)|); ∅Valw = 1.02e−01
• OK001, OK01, OK1 – Katz (β = .001, .01, .1): OKβ(u, v) = Σ_{l=1}^{c} β^l · |path^l_{u,v}|; ∅Valw = 1.55e−05, 1.72e−04, 9.79e−03
• ORPR01, ORPR05, ORPR15, ORPR3, ORPR5 – Rooted PageRank (α = .01, .05, .15, .3, .5): ORPRα(u, v) = stationary probability of v under a random walk that (1) with probability 1 − α moves to a random neighbor of the current node, and (2) with probability α returns to u; ∅Valw = 1.79e−03, 1.91e−03, 1.98e−03, 1.95e−03, 1.70e−03

Homophilic:
• GC – num. common groups: GC(u, v) = |Δ(u) ∩ Δ(v)|; ∅Valw = 2.43e−01
• GJC – Jaccard's coefficient of groups: GJC(u, v) = |Δ(u) ∩ Δ(v)| / |Δ(u) ∪ Δ(v)|; ∅Valw = 6.33e−03
• IC – num. common interests: IC(u, v) = |Φ(u) ∩ Φ(v)|; ∅Valw = 2.01e−02
• IJC – Jaccard's coefficient of interests: IJC(u, v) = |Φ(u) ∩ Φ(v)| / |Φ(u) ∪ Φ(v)|; ∅Valw = 1.14e−03
• OI – num. interactions: OI(u, v) = |ι(u, v)|; ∅Valw = 4.64e−01
• RRC – num. common check-ins: RRC(u, v) = |Λ(u) ∩ Λ(v)|; ∅Valw = 1.12e−02
• RRJC – Jaccard's coefficient of check-ins: RRJC(u, v) = |Λ(u) ∩ Λ(v)| / |Λ(u) ∪ Λ(v)|; ∅Valw = 2.13e−04
• RRO – overlap of check-ins: RRO(u, v) = |Λ(u) ∩ Λ(v)| / (|Λ(u)| + |Λ(v)|); ∅Valw = 1.81e−04
• RFC – num. common favored regions: RFC(u, v) = |Ξ(u) ∩ Ξ(v)|; ∅Valw = 3.55e−02
• RFJC – Jaccard's coefficient of favored regions: RFJC(u, v) = |Ξ(u) ∩ Ξ(v)| / |Ξ(u) ∪ Ξ(v)|; ∅Valw = 6.50e−03
• RFO – overlap of favored regions: RFO(u, v) = |Ξ(u) ∩ Ξ(v)| / (|Ξ(u)| + |Ξ(v)|); ∅Valw = 5.08e−03

Location-based social network – topological:
• LCN – num. common neighbors: LCN(u, v) = |Γ(u) ∩ Γ(v)|; ∅Valw = 2.57e+00
• LJC – Jaccard's coefficient: LJC(u, v) = |Γ(u) ∩ Γ(v)| / |Γ(u) ∪ Γ(v)|; ∅Valw = 1.02e−02
• LAA – Adamic–Adar: LAA(u, v) = Σ_{z ∈ Γ(u) ∩ Γ(v)} 1 / log(|Γ(z)|); ∅Valw = 1.58e+00
• LDS – num. days seen: LDS(u, v) = |η(u, v)|; ∅Valw = 3.14e−01
• LMD – mean distance: LMD(u, v) = (1/|ω(u, v)|) Σ_{d ∈ ω(u, v)} d; ∅Valw = 4.22e−01
• LK001, LK01, LK1 – Katz (β = .001, .01, .1); ∅Valw = 3.80e−05, 8.40e−04, 2.60e−01

... as Supervised Machine Learning

[Hasan et al. 2006]


Some more details [Yang, Lichtenwalter & Chawla, 2015]

Fig. 1 – Link prediction and evaluation. The black color indicates snapshots of the network from which link prediction features are calculated (feature network). The gray color indicates snapshots of the network from which the link prediction instances are labeled (label network). We can observe all links at or before time t, and we aim to predict future links that will occur at time t + 1.

Network snapshots are based on particular segments of data. Comparisons among predictors require that evaluation encompasses precisely the same set of instances, whether the predictor is unsupervised or supervised. We construct four network snapshots:
– Training features: data from some period in the past, G_{t−x} up to G_{t−1}, from which we derive feature vectors for training data.
– Training labels: data from G_t, the last training-observable period, from which we derive class labels (whether the link forms or not) for the training feature vectors.
– Testing features: data from some period in the past up to G_t, from which we derive feature vectors for testing data. Sometimes it may be ideal to maintain the window size that we use for the training feature vector, so we commence the snapshot at G_{t−x+1}. In other cases, we might want to be sure not to ignore effects of previously existing links, so we commence the snapshot at G_{t−x}.
– Testing labels: data from G_{t+1}, from which we derive class labels for the testing feature vector. These data are strictly excluded from inclusion in any training data.

A classifier is constructed from the training data and evaluated on the testing data. There are always strictly divided training and testing sets, because G_{t+1} is never observable in training.
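A minimal sketch of this setup; here the two snapshots are simulated by hiding some edges of an example graph, and the features and classifier are illustrative choices:

```python
import networkx as nx
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Simulated snapshots: G_feat = feature network, G_label = later label network
G_label = nx.karate_club_graph()
G_feat = G_label.copy()
G_feat.remove_edges_from(list(G_label.edges)[::10])  # hide every 10th edge

pairs = list(nx.non_edges(G_feat))                   # candidate instances
jc = {(u, v): p for u, v, p in nx.jaccard_coefficient(G_feat, pairs)}
aa = {(u, v): p for u, v, p in nx.adamic_adar_index(G_feat, pairs)}
X = np.array([[len(list(nx.common_neighbors(G_feat, u, v))), jc[u, v], aa[u, v]]
              for u, v in pairs])
y = np.array([G_label.has_edge(u, v) for u, v in pairs])  # did the link form?

clf = RandomForestClassifier(random_state=0).fit(X, y)
# In a real temporal setting, testing features and labels must come from
# later, strictly disjoint snapshots, exactly as described above.
```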

Outlook: Multiplex Networks


• Use networks capturing/modeling different aspects of
relations/interactions,
– Friendship, work, relationships, ...
– Interactions on social media
• Instagram: Similar posts
• Flickr, Twitter: Followers, friends
– Face-to-face networks
– Co-citation: DBLP (co-authorship)
– Conference: Paper similarity
– ...

[Scholz et al., AAAI ICWSM 2013]
Hybrid Rooted PageRank
• A novel extension of the Rooted PageRank algorithm for multiplex networks
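On a single network, rooted (personalized) PageRank is available in networkx via the personalization vector; a minimal sketch (the hybrid multiplex combination of the paper is not shown):

```python
import networkx as nx

G = nx.karate_club_graph()
root = 0
# networkx's alpha is the walk-continuation probability,
# so the restart probability to the root is 1 - alpha
rpr = nx.pagerank(G, alpha=0.85, personalization={root: 1.0})

# Score candidate links (root, v) by the stationary probability of v
cand = sorted((v for v in G if v != root and not G.has_edge(root, v)),
              key=rpr.get, reverse=True)
print(cand[:5])
```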
[Scholz et al., AAAI ICWSM 2013]
Link Prediction – Hybrid Rooted PR
Context: contact network at an academic conference
• Co-author network (DBLP)
In addition:
• Encounter network (close-by)
• Paper similarity network

Summary

What Did We Learn?

• Overview on basic graph/network learning methods

• Approaches for ranking on Networks and Graphs

• Methods and approaches for community detection

• Methods for community evaluation/assessment

• Overview of approaches for link prediction

• Brief outlook onto multiplex networks

