Lec31 32
• Even though the general problem is the same, different algorithms deal with each of these problems
Graph partitioning
• In graph partitioning the number of groups k is pre-defined
• In particular, we want to divide the network vertices into k non-overlapping groups of given sizes such that the number of edges across groups is minimized
• Sizes may also be specified only approximately
• E.g., within a specific range
• E.g., divide the network nodes into two groups of equal size, such that the
number of edges between them is minimized
• Graph partitioning is useful in parallel processing of numerical
solutions of network processes
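As a concrete illustration of the k = 2, equal-sizes case, the sketch below brute-forces the minimum bisection of a tiny toy graph. The graph (two triangles joined by a single bridge edge), the node labels, and the `cut_size` helper are all assumptions of this example; exhaustive search is only feasible for very small networks.

```python
from itertools import combinations

# Toy undirected graph (an assumption): two triangles joined by one bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
nodes = range(6)

def cut_size(group_a):
    """Number of edges with exactly one endpoint in group_a."""
    a = set(group_a)
    return sum((u in a) != (v in a) for u, v in edges)

# k = 2 with equal group sizes: enumerate every 3-node subset.
best = min(combinations(nodes, 3), key=cut_size)
print(sorted(best), cut_size(best))  # [0, 1, 2] 1 — the bridge is the only cut edge
```

Real partitioners (e.g., spectral bisection or Kernighan-Lin) replace the exhaustive enumeration with heuristics, since the number of bisections grows exponentially.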
Community detection
• In community detection neither the number of groups nor their sizes
are pre-defined
• The goal is to find the natural fault lines along which the network
separates
• Few edges between groups and many edges within groups
• Community detection is not as well-posed a problem as graph partitioning
• What do we mean by “many” and “few”?
• Many different objectives lead to many different algorithms
Problem formulation
• Formulate the problem of community detection as a maximization problem
• Consider a simple, undirected network
• Consider every possible division of the network
• Assign a high score if the division is “good” (many edges within communities), and a low score if it is “bad”
• Search through the divisions to find the one with the highest score
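The search procedure above can be sketched by brute force, using modularity as the score. The two-triangle toy graph and all variable names are assumptions of this sketch, and enumerating all 2^n sign vectors only works for tiny n.

```python
import itertools
import numpy as np

# Toy graph (an assumption of this sketch): two triangles joined by a bridge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1
k = A.sum(axis=1)                    # degree of each vertex
m = A.sum() / 2                      # number of edges
B = A - np.outer(k, k) / (2 * m)     # modularity matrix

def modularity(s):
    """Q = (1/4m) s^T B s for a ±1 community vector s."""
    s = np.asarray(s, dtype=float)
    return s @ B @ s / (4 * m)

# Score every possible 2-way division and keep the best one.
best = max(itertools.product([1, -1], repeat=n), key=modularity)
print(best, modularity(best))  # the two triangles, Q = 5/14
```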
• Start from the modularity Q = (1/4m) Σ_ij B_ij s_i s_j and separate out the terms that involve vertex v
• The first group of terms (those with neither index equal to v, together with the diagonal term B_vv s_v², which is unchanged by the flip) does not change when we change the sign of s_v, and we can ignore it
• Since B_ij = B_ji, the other two terms are equal and their sum is 2 s_v Σ_{i≠v} B_iv s_i
• Change in modularity, when we flip s_v to −s_v:
  ΔQ = (1/4m)(−2 s_v Σ_{i≠v} B_iv s_i − 2 s_v Σ_{i≠v} B_iv s_i) = −(s_v/m) Σ_{i≠v} B_iv s_i
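A quick numerical check of this ΔQ formula: on a small made-up graph (an 8-cycle plus one chord, chosen only for this sketch), flip one vertex and compare the closed form against a full recomputation of Q.

```python
import numpy as np

# Fixed example graph (an assumption): an 8-cycle plus the chord (0, 4).
n = 8
edges = [(i, (i + 1) % n) for i in range(n)] + [(0, 4)]
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1
k = A.sum(axis=1)
m = A.sum() / 2
B = A - np.outer(k, k) / (2 * m)     # modularity matrix

def Q(s):
    return s @ B @ s / (4 * m)

s = np.array([1, 1, -1, 1, -1, -1, 1, -1], dtype=float)
v = 3
# Closed-form change when s_v is flipped ...
dQ_pred = -(s[v] / m) * sum(B[i, v] * s[i] for i in range(n) if i != v)
# ... versus recomputing Q from scratch.
s2 = s.copy()
s2[v] = -s[v]
dQ_direct = Q(s2) - Q(s)
print(np.isclose(dQ_pred, dQ_direct))  # True
```

This O(n) update is what makes greedy vertex-flipping heuristics cheap compared with recomputing Q in O(n²) after every move.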
• Relax the constraint s_i = ±1, requiring only Σ_i s_i² = n; maximizing with a Lagrange multiplier β then gives Σ_j B_ij s_j = β s_i
• Or in matrix notation 𝐁𝐬 = 𝛽𝐬
• The optimal s is one of the eigenvectors of the modularity matrix and 𝛽 is the
corresponding eigenvalue
• Q = (1/4m) β s^T s = (n/4m) β, where s^T s = n
• Since our goal is to make the modularity as large as possible, we want the
eigenvalue 𝛽 to be as large as possible.
MA 653: Network Science
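A small numerical sanity check of the relation Q = (n/4m)β at the relaxed optimum, where s is the leading eigenvector scaled so that s^T s = n. The two-triangle toy graph is an assumption of this sketch.

```python
import numpy as np

# Toy graph (an assumption of this sketch): two triangles joined by a bridge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1
k = A.sum(axis=1)
m = A.sum() / 2
B = A - np.outer(k, k) / (2 * m)

# np.linalg.eigh returns eigenvalues of a symmetric matrix in ascending order.
vals, vecs = np.linalg.eigh(B)
beta, u = vals[-1], vecs[:, -1]      # leading eigenpair

# The relaxed optimum: s proportional to u, rescaled so that s.s = n.
s = np.sqrt(n) * u
print(np.isclose(s @ B @ s / (4 * m), n * beta / (4 * m)))  # True
```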
• To remove the relaxation (i.e., to make s take values ±1):
• We want to minimize the angle between s and the leading eigenvector (denoted by u)
• Equivalently, we want to maximize the inner product s^T u = Σ_i s_i u_i
• The maximum is achieved when 𝑠𝑖 𝑢𝑖 is positive for all i
• This occurs when si has the same sign as ui for all i
• Thus we approximate the vector s as follows:
  s_i = +1 if u_i > 0, and s_i = −1 if u_i < 0
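Putting the whole spectral method together on a toy graph (two triangles joined by a bridge; the graph and names are assumptions of this sketch): build the modularity matrix, take its leading eigenvector, and round the entries to ±1.

```python
import numpy as np

# Toy graph (an assumption of this sketch): two triangles joined by a bridge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for a, b in edges:
    A[a, b] = A[b, a] = 1
k = A.sum(axis=1)
m = A.sum() / 2
B = A - np.outer(k, k) / (2 * m)     # modularity matrix

# Leading eigenvector: eigh sorts eigenvalues in ascending order.
vals, vecs = np.linalg.eigh(B)
u_lead = vecs[:, -1]

# Round each entry to ±1 according to its sign.
s = np.where(u_lead > 0, 1, -1)
print(s)  # separates the two triangles (up to a global sign flip)
```

The recovered division matches the natural communities here; on larger networks this spectral bisection is typically applied recursively to split the network further.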