W7 - Graph Analytics - Features of Graph (II) (Me)
Another way to describe a node is with structure-based features, which capture the topological
properties of the local neighbourhood around the node. Typical structure-based measures include
the clustering coefficient and the graphlet degree vector.
Such a local neighbourhood is called an ego network. It consists of a focal node (“ego”), in this
case the node $v$, the nodes to which the ego is directly connected (these are called “alters”),
and the links, if any, among the alters.
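The local clustering coefficient counts the triangles in the ego network that pass through the ego
and normalises the count by the number of pairs of alters, so it measures the fraction of pairs of
alters that are themselves linked.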
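A minimal sketch of the local clustering coefficient, assuming the standard normalised definition (linked pairs of alters divided by all pairs of alters) and an undirected graph stored as a dictionary of neighbour sets. The function name `local_clustering` and the toy graph are hypothetical and are not taken from the lecture figure.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours (alters) that are linked to each other."""
    alters = adj[v]
    k = len(alters)
    if k < 2:
        return 0.0  # no pairs of alters, so no triangle can pass through v
    # Each linked pair of alters closes one triangle through the ego v.
    triangles = sum(1 for a, b in combinations(alters, 2) if b in adj[a])
    return 2.0 * triangles / (k * (k - 1))

# Hypothetical toy graph: ego "v" with alters a, b, c, d and links a-b, b-c.
adj = {
    "v": {"a", "b", "c", "d"},
    "a": {"v", "b"},
    "b": {"v", "a", "c"},
    "c": {"v", "b"},
    "d": {"v"},
}

print(local_clustering(adj, "v"))  # 2 of the 6 pairs of alters are linked
```

In this toy graph the ego has four alters, hence six pairs of alters, and only two of those pairs are linked, so the coefficient is 2/6.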
0.2 Graphlet Degree Vector
We can generalise the above by counting occurrences of pre-specified subgraphs, i.e., graphlets.
Graphlets are small, connected, non-isomorphic induced subgraphs of a large graph. They
differentiate nodes according to their positions within the subgraph, which are called orbits. The
isomorphisms of a graphlet onto itself are its automorphisms. Two nodes 𝑢 and 𝑣 are said to be
equivalent (meaning “in the same orbit”) when there exists some automorphism that maps 𝑢 to 𝑣.
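The orbit definition can be sketched by brute force for a small graphlet: try every permutation of the nodes, keep those that preserve the edge set (the automorphisms), and collect the images of each node. The function name `automorphism_orbits` is hypothetical, and the approach is only practical for a handful of nodes.

```python
from itertools import permutations

def automorphism_orbits(nodes, edges):
    """Return the automorphism orbits of a small undirected graph as a set of frozensets."""
    edge_set = {frozenset(e) for e in edges}
    images = {u: {u} for u in nodes}
    for perm in permutations(nodes):
        mapping = dict(zip(nodes, perm))
        mapped = {frozenset((mapping[a], mapping[b])) for a, b in edges}
        if mapped == edge_set:  # the permutation preserves the edges: an automorphism
            for u in nodes:
                images[u].add(mapping[u])
    # The orbit of u is the set of nodes that some automorphism maps u to.
    return {frozenset(orbit) for orbit in images.values()}

# 3-node path 0 - 1 - 2: the two end nodes are symmetric, the middle node is not.
path_orbits = automorphism_orbits([0, 1, 2], [(0, 1), (1, 2)])
# Triangle: all three nodes are symmetric.
triangle_orbits = automorphism_orbits([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
print(path_orbits, triangle_orbits)
```

The path yields two orbits (ends and middle), while the triangle yields a single orbit containing all three nodes.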
The figure below shows 2- to 5-node graphlets (numbered from G0 to G29) and their automorphism
orbits (numbered from 0 to 72). Within each graphlet, nodes in the same automorphism orbit are
shown in the same colour. For example, the blue nodes in G1 are all in O1 because they occupy
symmetric positions in the graphlet, while the green node occupies a different topological position
and is in O2. In addition, the graphlets are ordered within groups from the least to the most dense,
where density is the number of edges relative to the maximum possible number of edges in the
graphlet.
The Graphlet Degree Vector (GDV) is a graphlet-based feature for nodes. It counts the number of
graphlets that a node touches (or participates in), with one count per graphlet type. In the figure
below, node 𝑣 participates in two graphlets of type 𝑎, one graphlet of type 𝑏, no graphlets of
type 𝑐, and two graphlets of type 𝑑. The GDV of node 𝑣 is therefore the vector [2, 1, 0, 2].
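A minimal sketch of a GDV restricted to graphlets on 2 and 3 nodes, assuming an undirected graph stored as a dictionary of neighbour sets. The coordinate order used here (edge, end of a 3-node path, middle of a 3-node path, triangle) is an assumption and need not match the types 𝑎 to 𝑑 in the figure. The function name `gdv_up_to_3` and the toy graph are hypothetical. Only induced subgraphs are counted, so three mutually linked nodes count as a triangle and not as a path.

```python
from itertools import combinations

def gdv_up_to_3(adj, v):
    """Counts for v: [edge, end of 3-node path, middle of 3-node path, triangle]."""
    nbrs = adj[v]
    edge = len(nbrs)  # one 2-node graphlet per incident edge
    # v at the end of an induced path v - u - w: w must not be linked to v.
    path_end = sum(1 for u in nbrs for w in adj[u] if w != v and w not in nbrs)
    path_mid = 0
    triangle = 0
    for a, b in combinations(nbrs, 2):
        if b in adj[a]:
            triangle += 1  # a, b, v are mutually linked
        else:
            path_mid += 1  # induced path a - v - b with v in the middle
    return [edge, path_end, path_mid, triangle]

# Hypothetical toy graph, not the graph from the lecture figure.
adj = {
    "v": {"a", "b", "c", "d"},
    "a": {"v", "b"},
    "b": {"v", "a", "c"},
    "c": {"v", "b"},
    "d": {"v"},
}

print(gdv_up_to_3(adj, "v"))
print(gdv_up_to_3(adj, "a"))
```

Nodes "a" and "c" occupy symmetric positions in this toy graph, so they receive the same vector, whereas "v" and "d" differ in every coordinate.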
Considering graphlets on 2 to 5 nodes, we get:
- A vector of 73 coordinates, which is a signature of a node that describes the topology of the
  node’s neighbourhood
- It captures the node’s interconnectivities out to a distance of 4 hops
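The figures of 30 graphlets and 73 orbits can be checked with a brute-force sketch: enumerate every labelled graph on n nodes, keep the connected ones, collapse isomorphic copies, and count automorphism orbits for each distinct graphlet. The helper names (`connected`, `relabel`, `count_graphlets_and_orbits`) are hypothetical, and the exhaustive search is only feasible for very small n.

```python
from itertools import combinations, permutations

def connected(n, edges):
    """True if the graph on nodes 0..n-1 with the given edges is connected."""
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == n

def relabel(edges, perm):
    """Edge set after renaming every node u to perm[u]."""
    return frozenset(frozenset((perm[a], perm[b])) for a, b in edges)

def count_graphlets_and_orbits(n):
    """Return (number of n-node graphlets, total number of their orbits)."""
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    seen = set()
    graphlets = orbits = 0
    for mask in range(1 << len(pairs)):
        edges = [p for i, p in enumerate(pairs) if (mask >> i) & 1]
        if not connected(n, edges):
            continue
        edge_set = relabel(edges, range(n))  # identity relabelling
        if edge_set in seen:
            continue  # isomorphic to a graphlet already counted
        seen |= {relabel(edges, p) for p in perms}
        graphlets += 1
        autos = [p for p in perms if relabel(edges, p) == edge_set]
        orbits += len({frozenset(p[u] for p in autos) for u in range(n)})
    return graphlets, orbits

counts = [count_graphlets_and_orbits(n) for n in range(2, 6)]
total_graphlets = sum(g for g, _ in counts)
total_orbits = sum(o for _, o in counts)
print(total_graphlets, total_orbits)  # should match the 30 graphlets and 73 orbits in the text
```

The per-size counts show where the 73 coordinates come from: 1 orbit for the 2-node graphlet, 3 for the 3-node graphlets, 11 for the 4-node graphlets, and 58 for the 5-node graphlets.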
The Graphlet Degree Vector characterises the local neighbourhood structure around a given node
based on the frequencies of the graphlets that the node participates in. It provides a measure of
a node’s local network topology. Comparing the vectors of two nodes provides a more detailed
measure of local topological similarity than node degree or clustering coefficient alone.
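One simple way to compare two nodes is to take a distance between their GDVs. The sketch below uses plain Euclidean distance, which is an assumption made for illustration and not a measure prescribed by these notes. The function name `gdv_distance` and the two example vectors are hypothetical.

```python
import math

def gdv_distance(x, y):
    """Euclidean distance between two graphlet degree vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("GDVs must have the same number of coordinates")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical GDVs of two nodes with the same degree (first coordinate).
gdv_u = [2, 3, 0, 1]
gdv_w = [2, 1, 1, 0]

print(gdv_distance(gdv_u, gdv_w))
```

The two nodes have the same degree, yet the distance is non-zero because the remaining coordinates differ, which is the extra detail that the GDV adds over node degree.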