
W7 - Graph Analytics - Features of Graph (II) (Me)

The document discusses two ways to describe features of nodes in a graph: structure-based features and graphlet degree vector. Structure-based features include clustering coefficient, which measures how close a node's neighbors are to forming a complete graph. Graphlet degree vector counts the number of small subgraphs a node participates in to characterize its local network topology.

Uploaded by

gihankumar4678
Copyright
© All Rights Reserved

Graph Analytics - Features of Graph (II)

Another way to describe a node is with structure-based features, which capture topological
properties of the local neighbourhood around the node. Typical structure-based measures
include the clustering coefficient and the graphlet degree vector.

0.1 Clustering coefficient


Evidence suggests that in most real-world networks, and in particular social networks, nodes tend
to create tightly knit groups characterised by a relatively high density of ties; this likelihood tends
to be greater than the average probability of a tie randomly established between two nodes.
A clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together.
It gives an indication of the embeddedness of single nodes.
• The clustering coefficient of a node in a graph quantifies how close its neighbours are to being
a clique (complete graph).
• The clustering coefficient 𝐶𝑖 for a node 𝑖 is the number of links between the nodes
within its neighbourhood divided by the number of links that could possibly exist between
them:
$$C_i = \frac{\text{number of links among neighbouring nodes of node } i}{\binom{k(i)}{2}}$$

where $k(i)$ is the degree of node $i$, and $\binom{k(i)}{2}$ gives the possible number of node
pairs (i.e., links) among the $k(i)$ neighbouring nodes of node $i$.
In the three graphs below, node 𝑣 has a clustering coefficient of 1 (all of the node’s neighbours
know each other), 0.5, and 0 (none of the node’s neighbours know each other), respectively.

This kind of graph is called an ego network. It consists of a focal node (“ego”), in this case the
node 𝑣, the nodes to which the ego is directly connected (these are called “alters”), and the
links, if any, among the alters.
The local clustering coefficient in effect counts the triangles in the ego network: each link between
two alters closes one triangle with the ego.
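
The clustering coefficient formula above can be sketched in a few lines of Python. This is a minimal illustration, not the notes' own implementation. The graph is assumed to be undirected and stored as a dictionary of neighbour sets. The helper `ego_network` and the three four-alter example graphs are hypothetical and need not match the figure. Returning 0.0 for nodes with fewer than two neighbours is a convention assumed here.

```python
from itertools import combinations


def local_clustering(adj, i):
    """Clustering coefficient of node i in an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        # No pair of neighbours exists; 0.0 is an assumed convention.
        return 0.0
    # Count links among the neighbours of i (each one closes a triangle with i).
    links = sum(1 for u, w in combinations(neighbours, 2) if w in adj[u])
    possible = k * (k - 1) // 2  # binomial(k, 2)
    return links / possible


def ego_network(alter_links, alters=("a", "b", "c", "d")):
    """Build a hypothetical ego network: ego "v" is linked to every alter."""
    adj = {"v": set(alters)}
    for a in alters:
        adj[a] = {"v"}
    for u, w in alter_links:
        adj[u].add(w)
        adj[w].add(u)
    return adj


all_pairs = list(combinations("abcd", 2))  # the 6 possible alter-alter links
full = ego_network(all_pairs)
half = ego_network([("a", "b"), ("b", "c"), ("c", "d")])
empty = ego_network([])

print(local_clustering(full, "v"))   # 1.0
print(local_clustering(half, "v"))   # 0.5
print(local_clustering(empty, "v"))  # 0.0
```

With four alters there are 6 possible links among them, so 6, 3 and 0 actual links give 1, 0.5 and 0.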

0.2 Graphlet Degree Vector
We can generalise the above by counting occurrences of pre-specified subgraphs, i.e., graphlets.
Graphlets are small, connected, induced, non-isomorphic subgraphs of a large graph; they
differentiate nodes according to their positions within the subgraph, i.e., their orbits. The
isomorphisms of a graphlet onto itself are its automorphisms. Two nodes 𝑢 and 𝑣 of a graphlet are
said to be equivalent (meaning “in the same orbit”) when some automorphism maps 𝑢 to 𝑣.
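
To make the orbit definition concrete, the sketch below finds the automorphism orbits of a small graph by brute force: it tries every permutation of the nodes, keeps those that map the edge set onto itself, and merges each node with its image. The function name `automorphism_orbits` and the node labels are illustrative assumptions. The approach is only practical for graphlet-sized graphs, since it enumerates all n! permutations.

```python
from itertools import permutations


def automorphism_orbits(nodes, edges):
    """Group the nodes of a small undirected graph into automorphism orbits."""
    nodes = list(nodes)
    edge_set = {frozenset(e) for e in edges}
    orbit_of = {n: {n} for n in nodes}
    for perm in permutations(nodes):
        mapping = dict(zip(nodes, perm))
        mapped = {frozenset((mapping[u], mapping[w])) for u, w in edges}
        if mapped != edge_set:
            continue  # this permutation does not preserve the edges
        # mapping is an automorphism, so each node shares an orbit with its image.
        for u in nodes:
            merged = orbit_of[u] | orbit_of[mapping[u]]
            for n in merged:
                orbit_of[n] = merged
    unique = {frozenset(o) for o in orbit_of.values()}
    return sorted(sorted(o) for o in unique)


# 3-node path a - b - c: the two end nodes are interchangeable, the middle is not.
print(automorphism_orbits("abc", [("a", "b"), ("b", "c")]))
# [['a', 'c'], ['b']]

# Triangle: every node is in the same orbit.
print(automorphism_orbits("abc", [("a", "b"), ("b", "c"), ("a", "c")]))
# [['a', 'b', 'c']]
```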
The figure below shows the 2- to 5-node graphlets (numbered from G0 to G29) and their automorphism
orbits (numbered from 0 to 72). Within each graphlet, nodes in the same automorphism orbit are
shown in the same colour. For example, all blue nodes in G1 are in orbit O1 because they occupy
symmetric positions in the graphlet, while the green node occupies a different topological position
and is in O2. In addition, the graphlets are ordered within groups from the least to the most
dense, where density is the number of edges relative to the maximum possible number of edges in
the graphlet.

The Graphlet Degree Vector (GDV) is a graphlet-based feature for nodes. The GDV of a node counts,
for each orbit position, the number of graphlets that the node touches (or participates in). In the
figure below, node 𝑣 participates in two graphlets of type 𝑎, one of type 𝑏, none of type 𝑐, and
two of type 𝑑, so the GDV of node 𝑣 is the vector [2, 1, 0, 2].

Considering graphlets on 2 to 5 nodes, we get:
• A vector of 73 coordinates that serves as a signature of a node and describes the topology of
the node’s neighbourhood.
• The vector captures the node’s interconnectivities out to a distance of 4 hops.
Graphlet Degree Vector characterises the local neighbourhood structure around the given node
based on the frequencies of these graphlets that the node participates in. It provides a measure of
a node’s local network topology. Comparing vectors of two nodes provides a more detailed measure
of local topological similarity than node degrees or clustering coefficient.
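
As a small illustration of how a GDV is built, the sketch below computes only the first four coordinates, i.e., the orbits of the 2- and 3-node graphlets. It assumes the usual orbit numbering: orbit 0 for an edge endpoint, orbits 1 and 2 for the end and middle of a 3-node path, and orbit 3 for a triangle node. The function name `gdv_up_to_3_nodes` and the example graph are hypothetical; a full 73-coordinate GDV would need the 4- and 5-node graphlets as well.

```python
from itertools import combinations


def gdv_up_to_3_nodes(adj, node):
    """Return [orbit0, orbit1, orbit2, orbit3] for one node.

    orbit 0: node is an endpoint of an edge (this is its degree)
    orbit 1: node is an end of an induced 3-node path
    orbit 2: node is the middle of an induced 3-node path
    orbit 3: node belongs to a triangle
    """
    nbrs = adj[node]
    orbit0 = len(nbrs)
    orbit1 = orbit2 = orbit3 = 0
    # Pairs of neighbours: a linked pair closes a triangle, an unlinked pair
    # makes the node the middle of an induced path.
    for u, w in combinations(nbrs, 2):
        if w in adj[u]:
            orbit3 += 1
        else:
            orbit2 += 1
    # Paths node - u - w with w not adjacent to node make it an end node.
    for u in nbrs:
        for w in adj[u]:
            if w != node and w not in nbrs:
                orbit1 += 1
    return [orbit0, orbit1, orbit2, orbit3]


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, w) pairs."""
    adj = {}
    for u, w in edges:
        adj.setdefault(u, set()).add(w)
        adj.setdefault(w, set()).add(u)
    return adj


# Hypothetical example: a triangle v-a-b with a pendant node c attached to b.
adj = build_adjacency([("v", "a"), ("v", "b"), ("a", "b"), ("b", "c")])
print(gdv_up_to_3_nodes(adj, "v"))  # [2, 1, 0, 1]
print(gdv_up_to_3_nodes(adj, "b"))  # [3, 0, 2, 1]
print(gdv_up_to_3_nodes(adj, "c"))  # [1, 2, 0, 0]
```

In this sketch, nodes 𝑣 and 𝑏 each sit in one triangle, yet their vectors differ, which is the extra detail a GDV gives beyond the triangle count alone. The first and last coordinates also recover the clustering coefficient as orbit 3 divided by the number of neighbour pairs.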
