Node2vec is an algorithm that learns continuous feature representations for nodes in a network. It extends the skip-gram model from word2vec by defining a flexible notion of a node's network neighborhood using biased random walks. This allows node2vec to learn node embeddings that preserve both the local and global network structure. Node2vec was shown to outperform other methods on tasks like multi-label classification and link prediction on several real-world networks.


node2vec: Scalable Feature Learning for Networks

Aditya Grover et al.

Presented by:
Saim Mehmood
Ahmadreza Jeddi
Background: Networks

● Ubiquitous in the real world.

● Examples include:
○ Social networks, road networks, the World Wide Web, IoT, etc.

● A flexible and general data structure:
○ Many types of data can be formulated as networks.
Network Mining: Ranking

● Edges between nodes identify co-authorship of papers.

● Ranking helps discover the most influential author.
Network Mining: Community Detection

● Who tends to work together?

● Dividing the graph into a set of communities:
○ Machine Learning, Information Retrieval & Data Mining
Tasks: Node Classification

● We are interested in predicting the most probable labels of nodes in a network.

● Social networks: predicting the interests of users.

● Protein-protein interaction (PPI) networks: predicting functional labels of proteins.

● Example: d1 is a Democrat and d2 is a Republican. What about d3 and d4?
Tasks: Link Prediction

● We wish to predict whether a pair of nodes in a network have an edge connecting them.

● Usefulness of link prediction:
○ In genomics, it helps us discover novel interactions between genes.
○ In social networks, it can identify real-world friends.
Contribution & Main Idea

● The key contribution is defining a flexible notion of a node's network neighborhood.

● By choosing an appropriate notion of neighborhood, node2vec can learn representations that organize nodes based on their network roles (structural equivalence) and the communities they belong to (homophily).
Word2Vec

Representation (feature) learning: automatically learn the representations needed for feature detection. Example: CNNs.

Key points:
1) Neural networks take numbers as input.
2) Not all datasets are originally in numerical form.
3) For words, we often use one-hot encodings; word2vec also starts from one-hot vectors to learn dense representations.
Skip-gram model

● A text document is given: a set of sentences.

● Input data for the neural network is generated with a sliding context window of size ω.
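As a concrete illustration (not from the slides), here is a minimal Python sketch of how (center, context) training pairs are extracted with a sliding context window; `window_size` plays the role of ω:

```python
# Minimal sketch: build (center, context) skip-gram training pairs
# from tokenized sentences using a sliding context window.

def skipgram_pairs(sentences, window_size=2):
    """Yield (center_word, context_word) pairs for skip-gram training."""
    for sentence in sentences:
        for i, center in enumerate(sentence):
            lo = max(0, i - window_size)
            hi = min(len(sentence), i + window_size + 1)
            for j in range(lo, hi):
                if j != i:
                    yield center, sentence[j]

sentences = [["the", "quick", "brown", "fox"]]
print(list(skipgram_pairs(sentences, window_size=1)))
# [('the', 'quick'), ('quick', 'the'), ('quick', 'brown'), ...]
```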
Skip-gram model, Cont.

● Say we have V words in our vocabulary.
● Use one-hot encodings to train the network.
● One hidden layer, with no activation function.
● Trained with stochastic gradient descent.

Example:
● V = 10000 (size of vocabulary)
● 300 neurons in the hidden layer
● The word "ants" as input to the network

Once training is done, freeze all the weights; feeding in every one-hot vector simply reads out the input weight matrix, so that matrix is the table of word embeddings.
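A minimal numpy sketch of this architecture, with the sizes shrunk for readability (V = 10000 and 300 hidden neurons in the slide's example); the names `W_in`/`W_out` are illustrative, not from the slides:

```python
import numpy as np

V, H = 10, 4          # vocabulary and hidden sizes (V=10000, H=300 in the example)
rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, H))   # input->hidden weights (the embeddings)
W_out = rng.normal(scale=0.1, size=(H, V))  # hidden->output weights

def forward(word_idx):
    """One-hot input; hidden layer has no activation; softmax output."""
    h = W_in[word_idx]          # selecting a row == multiplying by a one-hot vector
    scores = h @ W_out
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()      # probability of each context word

# After training, freezing the weights and feeding every one-hot vector
# through the first layer just reads out W_in row by row, so the
# learned embedding of word i is simply W_in[i].
probs = forward(3)
print(probs.shape, probs.sum())  # (10,) ~1.0
```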
Node Embedding

● In 2014, DeepWalk: Online Learning of Social Representations.

● Treat each graph as a document.

● Random walks are the sentences in this document.

● Random walk in a graph G(V, E):
○ A sequence of nodes <v1, v2, …, vk> such that each (vi, vi+1) is an edge in E.

This is the general recipe for node embedding.
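A minimal sketch of such a uniform (DeepWalk-style) random walk, assuming the graph is given as an adjacency-list dict:

```python
import random

def random_walk(adj, start, length):
    """Uniform random walk <v1, ..., vk> over G(V, E) given as adjacency lists."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:          # dead end: stop the walk early
            break
        walk.append(random.choice(neighbors))
    return walk

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(random_walk(adj, start=0, length=5))   # e.g. [0, 2, 3, 2, 1]
```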
Classic Search Strategies

● The problem of sampling neighborhoods of a source node can be viewed as a form of local search.

● There are two extreme sampling strategies for generating neighborhood sets:
○ Breadth-first Sampling (BFS)
○ Depth-first Sampling (DFS)
Breadth-first Sampling (BFS)

● The neighborhood is restricted to nodes which are immediate neighbors of the source node u.

● For a neighborhood of size k, BFS samples the k nodes closest to u.
BFS ⟶ Structural Equivalence

● Nodes that have similar structural roles in networks should be embedded closely together.
○ E.g., nodes u and s6 in the paper's Figure 1.

● By restricting the search to nearby nodes, BFS gives a microscopic view of the neighborhood.

● Network roles such as bridges and hubs can be inferred using BFS.
Depth-first Sampling (DFS)

● The neighborhood consists of nodes sequentially sampled at increasing distances from the source node u.

● For a neighborhood of size k, DFS samples k nodes at increasing distance from u.
DFS ⟶ Homophily

● Nodes that are highly interconnected and belong to similar network communities should be embedded closely together.
○ E.g., nodes u and s1 in the paper's Figure 1.

● DFS-sampled nodes reflect a macro-view of a node's neighborhood.
Visualization of the Les Misérables Network

● Generated by node2vec.

● Label colors reflect:
○ Homophily (top)
○ Structural equivalence (bottom)
Flexible notion of neighborhood

● The authors design a flexible neighborhood sampling strategy which allows them to smoothly interpolate between BFS and DFS.

● This is achieved by a biased random walk that can explore neighborhoods in both a BFS and a DFS fashion.
Drawbacks of DeepWalk and LINE

● DeepWalk learns d-dimensional feature representations by simulating uniform random walks. It can be seen as a special case of node2vec with parameters p = 1 and q = 1.

● LINE learns d-dimensional feature representations in two steps:
○ d/2 dimensions via a BFS-style strategy over immediate neighbors
○ d/2 dimensions by sampling nodes at a 2-hop distance

● Real networks exhibit a mixture of homophily and structural equivalence, which is not effectively captured by either method.

How node2vec takes random walks

Default setup (parameters the same as DeepWalk):
● Dimensionality (d): 128
● Number of walks starting from each node (r): 10
● Walk length (l): 80
● Context size (k): 10

Each random walk:
● Step 1: start from an initial node.
● Step 2: look at the current node's neighbors and select one as the next node.
● Step 3: repeat step 2 until the length of the walk equals l.

(This assumes each node has degree at least one.)
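A sketch of the outer walk-generation loop with the default parameters above, reusing the `random_walk` helper from the node-embedding sketch earlier (in full node2vec, step 2 would use the biased selection described on the next slide):

```python
import random

def generate_walks(adj, num_walks=10, walk_length=80):
    """r walks of length l from every node; the walks become the 'sentences'."""
    walks = []
    nodes = list(adj)
    for _ in range(num_walks):
        random.shuffle(nodes)      # reorder start nodes on each pass
        for node in nodes:
            walks.append(random_walk(adj, start=node, length=walk_length))
    return walks
```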


Where node2vec really comes in

Biasing the selection of the next node:
● The last edge traversed in the walk is t → v.
● We are currently at node v.
● How do we select the next node from v's neighbors?

● How much do we want to go back to t? ⟶ return parameter p
● How much do we want to move away from t? ⟶ in-out parameter q
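In the paper, these two questions become an unnormalized bias α on each candidate next node x: weight 1/p if x = t (going back), 1 if x is a distance-1 neighbor of t, and 1/q if x is at distance 2 from t. A minimal sketch for an unweighted graph:

```python
import random

def biased_step(adj, t, v, p=1.0, q=1.0):
    """Pick the next node after traversing the edge t -> v, node2vec-style."""
    neighbors = adj[v]
    weights = []
    for x in neighbors:
        if x == t:                 # returning to t: weight 1/p
            weights.append(1.0 / p)
        elif x in adj[t]:          # distance 1 from t: weight 1
            weights.append(1.0)
        else:                      # distance 2 from t: weight 1/q
            weights.append(1.0 / q)
    return random.choices(neighbors, weights=weights, k=1)[0]
```

The reference implementation precomputes these probabilities and draws in O(1) time with alias sampling; this sketch recomputes the weights at every step for clarity.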
Overview of node2vec algorithm
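Putting the pieces together, the pipeline is: simulate the walks, then train skip-gram on them. A hedged sketch using gensim (assuming gensim ≥ 4.0, where the embedding-size parameter is `vector_size`; `generate_walks` and `adj` come from the earlier sketches, with uniform walks standing in for the biased ones for brevity):

```python
from gensim.models import Word2Vec

walks = generate_walks(adj, num_walks=10, walk_length=80)
# gensim expects lists of string tokens
walks = [[str(node) for node in walk] for walk in walks]

# sg=1 selects the skip-gram architecture; window is the context size k
model = Word2Vec(walks, vector_size=128, window=10, sg=1, min_count=0, workers=4)
embedding_of_node_0 = model.wv["0"]    # a 128-dimensional vector
```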
Edge Embedding

● Link prediction deals with pairs of nodes, so we need embeddings of edges as well.

● The embedding of an edge (u, v) is produced by a binary operator g over the two node embeddings, g : V × V ⟶ R^d.
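The paper evaluates several elementwise choices of g over the node embeddings f(u) and f(v): average, Hadamard (elementwise product), weighted-L1, and weighted-L2. A small numpy sketch:

```python
import numpy as np

def edge_embedding(f_u, f_v, op="hadamard"):
    """Combine two node embeddings into an edge embedding in R^d."""
    if op == "average":
        return (f_u + f_v) / 2.0
    if op == "hadamard":
        return f_u * f_v               # elementwise product
    if op == "weighted_l1":
        return np.abs(f_u - f_v)
    if op == "weighted_l2":
        return (f_u - f_v) ** 2
    raise ValueError(op)
```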


node2vec Scalability

● Processing time is linear in the number of nodes: O(a · |V|).
○ a is a constant depending on r (the number of walks) and l (the walk length).

● For optimization (SGD), negative sampling is used.

● Example: Erdős–Rényi graphs with an average degree of 10.
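For reference, a sketch of the negative-sampling objective for a single (center, context) pair, with `W_in`/`W_out` laid out as in the earlier skip-gram sketch (the names are illustrative): the model maximizes log σ(f(u)·f(c)) while pushing down the scores of a few sampled negatives:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def neg_sampling_loss(W_in, W_out, center, context, negatives):
    """Negative-sampling loss for one skip-gram pair (lower is better)."""
    h = W_in[center]
    loss = -np.log(sigmoid(h @ W_out[:, context]))
    for n in negatives:                 # a handful of sampled non-context words
        loss -= np.log(sigmoid(-h @ W_out[:, n]))
    return loss
```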
Results: Multi-label Classification

● Every node is assigned one or more labels from a finite set L.

● The algorithms are trained on a fraction of the nodes and their labels.

● The task is to predict the labels of the remaining nodes.

● Datasets used:
○ BlogCatalog: a network of social relationships among bloggers
○ Protein-Protein Interactions (PPI): a subgraph of the PPI network for Homo sapiens
○ Wikipedia: a co-occurrence network of words
Graphs

[Performance plots; the x-axis shows the fraction of labeled data.]
Results: Link Prediction

● Given a network with a certain fraction of edges removed, we would like to predict the missing edges.

● Datasets used:
○ Facebook: nodes represent users and edges represent friendships between them.
○ PPI: nodes represent proteins, and an edge indicates a biological interaction between a pair of proteins.
○ arXiv: nodes represent scientists, and an edge indicates a collaboration between them.
Table

● Since none of the feature learning algorithms had previously been used for link prediction, the authors additionally evaluate node2vec against popular heuristic scores.

● The table reports Area Under Curve (AUC) scores for link prediction.
Thank You
