
Relational Learning with Neural Encoders
Pedro Almagro Blanco

CCIA - Universidad de Sevilla


GMSC - Universidad Central de Ecuador
Agenda
1. Some notes on Machine Learning (ML)

2. A successful ML model: the Neural Network (NN)

3. Using NN to encode data: Neural Encoders

4. Word2vec: Learning Text Encodings

5. From Texts to Graphs: Learning Graph Encodings!

6. The focus is on...

7. Bibliography
Some notes on Machine Learning (ML)
Machine Learning focuses on learning functions from data examples.

- Supervised (curve fitting): classification, regression, ...
- Unsupervised (data exploration): clustering, dimensionality reduction, ...
- Reinforcement (maximize rewards): control theory, multi-agent systems, ...
Some notes on Machine Learning (ML)
Traditional ML models learn from vectors. Images are easily vectorizable, but the same cannot be said for texts and graphs.

(Diagram: texts and graphs → ? → ML tasks such as classification, regression, clustering, anomaly detection, signal processing, dimensionality reduction, ...)
A successful ML model: the Neural Network (NN)
“One Model to rule them all…”

“A feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of ℝⁿ, under mild assumptions on the activation function.” (Universal approximation theorem)
A successful ML model: the Neural Network (NN)
“One Model to rule them all…”

Moreover...

Artificial neural networks are efficient on modern hardware (GPUs/TPUs)!


A successful ML model: the Neural Network (NN)
Artificial neural networks can be expressed by means of a graph G = (V, E, σ, Φ), where V is the set of nodes (neurons), E is the set of directed edges, Φ : E → ℝ is the weight function associated with the edges, and σ is the activation function associated with the nodes (usually the same for all).
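As a concrete illustration, here is a minimal sketch (all names are illustrative, not taken from the slides) of a feed-forward network stored exactly as such a graph, with Φ as a dictionary of edge weights and a shared activation σ:

```python
import math

# A tiny NN stored as a graph G = (V, E, sigma, Phi):
# `nodes` plays the role of V, `weights` the role of Phi : E -> R.
nodes = ["x1", "x2", "h1", "h2", "y"]
weights = {("x1", "h1"): 0.5, ("x2", "h1"): -1.0,
           ("x1", "h2"): 0.8, ("x2", "h2"): 0.2,
           ("h1", "y"): 1.5, ("h2", "y"): -0.7}

def sigma(z):
    return 1.0 / (1.0 + math.exp(-z))        # logistic activation, shared by all nodes

def forward(inputs):
    """Propagate activations through the graph in topological order."""
    act = dict(inputs)                        # activations of the input nodes
    for n in ["h1", "h2", "y"]:               # hidden and output nodes
        z = sum(act[src] * w for (src, dst), w in weights.items() if dst == n)
        act[n] = sigma(z)
    return act["y"]

print(forward({"x1": 1.0, "x2": 0.0}))
```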
A successful ML model: the Neural Network (NN)
The supervised learning problem is to find the right weights (Φ’) for the system to
return the expected output for a given set of samples (D).
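As a toy illustration of this problem (a sketch with made-up data, not from the slides), gradient descent can adjust the weights of a single sigmoid neuron until it reproduces the expected outputs of D:

```python
import math, random

# Adjust the weights (Phi') so the output matches the expected value for each sample in D.
D = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
     ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]       # learn logical OR

w = [random.uniform(-1, 1) for _ in range(2)]    # edge weights
b = 0.0
sigma = lambda z: 1.0 / (1.0 + math.exp(-z))     # activation

for _ in range(2000):                            # gradient descent on the log-loss
    for (x1, x2), y in D:
        out = sigma(w[0] * x1 + w[1] * x2 + b)
        err = out - y                            # gradient of the loss w.r.t. the pre-activation
        w[0] -= 0.5 * err * x1
        w[1] -= 0.5 * err * x2
        b    -= 0.5 * err

print([round(sigma(w[0] * x1 + w[1] * x2 + b), 2) for (x1, x2), _ in D])
```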
Using NN to encode data: Neural Encoders
Once the NN has been trained, the intermediate layers capture all the necessary
input information from previous layers to calculate the corresponding output.
Using NN to encode data: Neural Encoders
The activations of the intermediate layers can be used as a latent representation of any input.

Note that the obtained encoding depends on the dataset (D) used to train the NN!
Using NN to encode data: Neural Encoders
An autoencoder is a specific case of a neural encoder in which the network learns the identity function.
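As a toy illustration (a sketch only, using plain NumPy rather than a deep learning framework), a linear autoencoder trained by gradient descent to reproduce its input:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                 # toy dataset D of 8-dimensional inputs

d, k = 8, 3                                   # input size, bottleneck size
W_enc = rng.normal(scale=0.1, size=(d, k))    # encoder weights
W_dec = rng.normal(scale=0.1, size=(k, d))    # decoder weights
lr = 0.1

for epoch in range(1000):
    H = X @ W_enc                             # latent representation
    X_hat = H @ W_dec                         # reconstruction of the input
    G = 2 * (X_hat - X) / X.size              # gradient of the mean squared error
    dW_dec = H.T @ G                          # backprop through the decoder
    dW_enc = X.T @ (G @ W_dec.T)              # backprop through the encoder
    W_enc -= lr * dW_enc
    W_dec -= lr * dW_dec

codes = X @ W_enc                             # latent codes usable by other ML models
print(np.mean((codes @ W_dec - X) ** 2))      # final reconstruction error
```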
Using NN to encode data: Neural Encoders
When the network reaches an acceptable state, the activations in the hidden layers capture the information about the input needed to reproduce it at the output: they form the latent representation.
Neural encoders to vectorize texts and graphs!
Traditional ML models learn from vectors. Images are easily vectorizable, but the same cannot be said for texts and graphs. Neural encoders can fill that gap.

(Diagram: texts and graphs → NEURAL ENCODERS → ML tasks such as classification, regression, clustering, anomaly detection, signal processing, dimensionality reduction, ...)
Word2vec: Learning Text Encodings
We need a training set (D) composed of pairs (x, y) that capture the semantic information of texts (and, later, graphs) in order to properly train our encoder. For texts, Mikolov et al. proposed sampling word/context pairs from sentences to obtain D.
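For instance, a minimal sketch (my own illustration, not Mikolov et al.'s code) of how (word, context-word) pairs can be sampled from sentences with a sliding window:

```python
# Sample skip-gram training pairs (x, y) = (word, context-word) from sentences.
def skipgram_pairs(sentences, window=2):
    pairs = []
    for sentence in sentences:
        for i, word in enumerate(sentence):
            # context = words up to `window` positions to the left and right
            for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
                if j != i:
                    pairs.append((word, sentence[j]))
    return pairs

D = skipgram_pairs([["graphs", "are", "relational", "data"],
                    ["neural", "encoders", "learn", "representations"]])
print(D[:5])  # e.g. ('graphs', 'are'), ('graphs', 'relational'), ...
```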
Word2vec: CBOW vs. Skip-Gram
Mikolov et al. presented two architectures, under the generic name word2vec, for learning vector representations of words: Continuous Bag-of-Words (CBOW) and Skip-Gram.
Word2vec: Skip-Gram
It uses one word to predict its context, where the context is composed of the words appearing to the right and to the left of the given word. Given a sequence of training words $w_1, w_2, \dots, w_T$, the objective of the Skip-Gram model is to maximize the average log probability

$$\frac{1}{T}\sum_{t=1}^{T}\;\sum_{-c \le j \le c,\, j \ne 0} \log p(w_{t+j} \mid w_t)$$

where $c$ is the size of the training context. The basic Skip-Gram formulation defines $p(w_{t+j} \mid w_t)$ using the soft-max function:

$$p(w_O \mid w_I) = \frac{\exp\!\left(\phi'(w_O)^{\top}\phi(w_I)\right)}{\sum_{w=1}^{W}\exp\!\left(\phi'(w)^{\top}\phi(w_I)\right)}$$
Word2vec: Skip-Gram
where φ(w) and φ′(w) are the input and output vector representations of w, and W is the number of words in the vocabulary. Optimizing this model by gradient descent means taking a training example and adjusting all the parameters of the model, which is expensive.

Two efficient approximations: Negative Sampling and Hierarchical Soft-max.

Word2vec: Learning Texts Encodings
Trained on a large corpus of books, this model obtains vector representations of words that capture interesting syntactic and semantic properties: for example, vector("king") - vector("man") + vector("woman") lies close to vector("queen").
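For readers who want to try this, a hedged usage example with the gensim library (assuming gensim >= 4.0, which is not part of the slides); the two-sentence corpus is only a placeholder, a real run needs a large collection of texts:

```python
from gensim.models import Word2Vec

corpus = [["the", "king", "rules", "the", "kingdom"],
          ["the", "queen", "rules", "the", "kingdom"]]
model = Word2Vec(sentences=corpus, vector_size=50, window=2,
                 min_count=1, sg=1, negative=5, epochs=50)

print(model.wv["king"][:5])            # the learned vector representation of "king"
print(model.wv.most_similar("king"))   # nearest words in the embedding space
# With a large corpus, analogies such as king - man + woman ~ queen can be
# recovered with model.wv.most_similar(positive=[...], negative=[...]).
```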
From Texts to Graphs: Learning Graph Encodings!
When dealing with graphs, we need pairs (node, node-in-context) to train our neural encoder.

words → nodes
contexts → ?
From Texts to Graphs: Learning Graph Encodings!
Many graph-embedding models inspired by Word2vec have been proposed. They differ in how the node-in-context is defined:

- Random Walks (DeepWalk, 2014), see the sketch after this list
- Node Neighborhoods (LINE, 2015)
- Parametrized Random Walks (node2vec, 2016)
- ...
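Below is a minimal DeepWalk-style sketch (names are illustrative, not from the original implementation): random walks turn the graph into "sentences" of node ids, from which (node, node-in-context) pairs can be sampled exactly as in word2vec.

```python
import random

# Adjacency-list representation of a toy graph.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}

def random_walk(graph, start, length=6):
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))  # uniform step to a neighbour
    return walk

walks = [random_walk(graph, node) for node in graph for _ in range(10)]
# Each walk is a "sentence" of node ids; feed them to a skip-gram encoder
# (e.g. the pair-sampling function shown earlier) to learn node embeddings.
print(walks[0])
```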
From Texts to Graphs: Learning Graph Encodings!
Random Walks (DeepWalk, 2014)

Pipeline: 1. Original graph → 2. Random walk generation → 3. Neural encoder training → Embedding.
From Texts to Graphs: Learning Graph Encodings!
Parametrized Random Walks (node2vec, 2016)

Pipeline: 1. Original graph → 2. Parametrized random walk generation → 3. Neural encoder training → Embedding. The walks are biased between Breadth-First Search behaviour (capturing structural equivalence) and Depth-First Search behaviour (capturing homophily).
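A minimal sketch of the parametrized step, assuming the usual node2vec return parameter p and in-out parameter q (the graph and function names below are illustrative):

```python
import random

graph = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}

def biased_step(prev, curr, p=1.0, q=1.0):
    """Choose the next node given the previous and current nodes of the walk."""
    weights = []
    for nxt in graph[curr]:
        if nxt == prev:                      # distance 0 from prev: controlled by p
            weights.append(1.0 / p)
        elif nxt in graph[prev]:             # distance 1 from prev: BFS-like move
            weights.append(1.0)
        else:                                # distance 2 from prev: DFS-like move, controlled by q
            weights.append(1.0 / q)
    return random.choices(graph[curr], weights=weights, k=1)[0]

def node2vec_walk(start, length=6, p=1.0, q=2.0):
    walk = [start, random.choice(graph[start])]
    while len(walk) < length:
        walk.append(biased_step(walk[-2], walk[-1], p, q))
    return walk

print(node2vec_walk("a"))
```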
From Texts to Graphs: Learning Graph Encodings!
Node Neighborhoods (LINE, 2015)

Pipeline: 1. Original graph → 2. Sampling of (target, context) pairs from node neighborhoods → 3. Neural encoder training → Embedding. The embeddings learned from the different contexts of a target node (Context-1, Context-2) are concatenated to form its final embedding.
The focus is on...

Pipeline: D → SAMPLING → ENCODING → ML. The sampling step can be guided by node importance measures such as Degree, PageRank, Betweenness, Closeness, ...

Target tasks:
- Node/Link/Graph Classification
- Node/Link/Graph Clustering
- Node/Link/Graph Ranking
- Entity Retrieval
- Link Prediction
- Community Detection
- Long-Distance Query Approximation
- Semantic Analysis of Networks
- ...
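As an aside, a hedged illustration (it assumes the third-party networkx library, which is not mentioned in the slides) of how such importance measures can be computed and used to bias the sampling step:

```python
import networkx as nx

G = nx.karate_club_graph()                      # small built-in example graph

scores = {
    "degree": nx.degree_centrality(G),
    "pagerank": nx.pagerank(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
}

# e.g. start random walks preferentially from high-PageRank nodes
top_nodes = sorted(scores["pagerank"], key=scores["pagerank"].get, reverse=True)[:5]
print(top_nodes)
```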
Bibliography

- Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their
compositionality. In Burges, C. J. C., Bottou, L., Welling, M., Ghahramani, Z., & Weinberger, K. Q. (Eds.), Advances in Neural
Information Processing Systems 26, pp. 3111–3119. Curran Associates, Inc.

- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. CoRR,
abs/1301.3781.

- Mikolov, T., Yih, S. W.-t., & Zweig, G. (2013). Linguistic regularities in continuous space word representations. In Proceedings of
the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies (NAACL-HLT-2013). Association for Computational Linguistics.

- Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, pp. 701–710, New York, NY, USA. ACM.

- Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining, 2016.

- Tang, J., Qu, M., Wang, M., Zhang, M., Yan, J., & Mei, Q. (2015). Line: Large-scale information network embedding. In
Proceedings of the 24th International Conference on World Wide Web, WWW ’15, pp. 1067–1077, New York, NY, USA. ACM.
