NLP_Machine Learning
Chapter 24. Machine Learning for Natural Language
1. Word embedding
6. Summary
1. Word embedding
• If we want to plug words into a neural network, or some other machine learning algorithm, we need a way to turn the words into numbers.
• Word embedding is a technique used in NLP and ML to represent words as numerical vectors.
“She is beautiful”
She → 1
is → 2
beautiful → 3
“Beautiful” and “pretty” mean similar things, but they are given different numbers.
-> The neural network will need a lot more complexity and training.
-> It would be nice if similar words that are used in similar ways could be given similar numbers, so that learning how to use one word also helps the model learn how to use the other.
[Figure: words plotted along two embedding dimensions, Z1 and Z2]
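As a sketch of this idea (not from the slides; the tiny 3-dimensional vectors below are invented for illustration): each word maps to a dense vector, and similar words such as “beautiful” and “pretty” get vectors pointing in nearly the same direction, which cosine similarity makes visible.

```python
import numpy as np

# Hypothetical 3-dimensional word embeddings, chosen by hand so that
# "beautiful" and "pretty" point in almost the same direction.
embeddings = {
    "she":       np.array([0.10, 0.90, 0.20]),
    "is":        np.array([0.80, 0.10, 0.10]),
    "beautiful": np.array([0.20, 0.30, 0.90]),
    "pretty":    np.array([0.25, 0.28, 0.88]),
}

def cosine_similarity(u, v):
    """Cosine of the angle between two word vectors (1.0 = same direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Similar words now get similar representations, unlike the integer ids 1, 2, 3.
print(cosine_similarity(embeddings["beautiful"], embeddings["pretty"]))  # close to 1
print(cosine_similarity(embeddings["beautiful"], embeddings["is"]))      # much lower
```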
• An LSTM is a kind of RNN that can choose to remember some parts of the input, copying them over to the next timestep, and forget other parts.
• Unlike traditional RNNs, LSTMs use gating units to selectively retain or forget information over time steps, enabling them to better preserve relevant information for NLP tasks.
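A hand-rolled single LSTM step in NumPy, as a sketch of the gating described above (random, untrained weights; purely illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W: (4*hidden, hidden+input), b: (4*hidden,)."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])   # forget gate: what to erase from the cell state
    i = sigmoid(z[1 * hidden:2 * hidden])   # input gate: what new information to write
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate values to write
    o = sigmoid(z[3 * hidden:4 * hidden])   # output gate: what to expose as the hidden state
    c = f * c_prev + i * g                  # keep part of the old memory, add part of the new
    h = o * np.tanh(c)
    return h, c

# Run a random 5-step sequence through the cell, copying (h, c) to the next timestep.
rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
W = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for t in range(5):
    h, c = lstm_step(rng.normal(size=input_size), h, c, W, b)
print("final hidden state:", h)
```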
• The model most commonly used for machine translation is called a sequence-to-sequence model, in which an encoder network reads the source sentence and a decoder network generates the translation.
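A minimal encoder-decoder sketch, assuming PyTorch is available; the vocabulary sizes, layer sizes, and random token ids are invented for illustration, and a real system adds attention, training, and beam search on top of this.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Tiny encoder-decoder: encode the source sentence into a fixed state,
    then decode the target sentence conditioned on that state (teacher forcing)."""
    def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.src_emb(src_ids))            # summarize the source
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)   # condition the decoder on it
        return self.out(dec_out)                                  # per-step scores over target vocab

# Toy batch: 2 source sentences of length 5, 2 target sentences of length 6.
model = Seq2Seq(src_vocab=100, tgt_vocab=120)
src = torch.randint(0, 100, (2, 5))
tgt = torch.randint(0, 120, (2, 6))
print(model(src, tgt).shape)  # torch.Size([2, 6, 120])
```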
• First, the attention component itself has no learned weights and supports variable-length sequences on both the source and target side.
• Second, attention is entirely latent: the alignment between source and target positions is never supervised directly, but emerges from end-to-end training.
• Attention can also be combined with multilayer RNNs.
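A NumPy sketch of (scaled) dot-product attention, assuming queries come from the decoder/target side and keys and values from the encoder/source side; note that this component has no trainable weights of its own and accepts any source or target length:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    """Scaled dot-product attention.
    queries: (T_tgt, d), keys/values: (T_src, d) -> context vectors: (T_tgt, d)."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # similarity of each target step to each source step
    weights = softmax(scores, axis=-1)       # latent alignment, learned implicitly end to end
    return weights @ values, weights

rng = np.random.default_rng(0)
src = rng.normal(size=(7, 16))   # 7 source positions (keys = values here)
tgt = rng.normal(size=(4, 16))   # 4 target positions (queries)
context, weights = attention(tgt, src, src)
print(context.shape, weights.shape)  # (4, 16) (4, 7)
```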
Derive the semantic relationship between words using a word-word co-occurrence matrix (e.g., the corpus “I love cats”, “I love you” with window = 1):

       I   love  cats  you
I      0    2     0     0
love   2    0     1     1
cats   0    1     0     0
you    0    1     0     0

P(I | love) = 2/2 = 1  (“love” occurs twice, and “I” is next to it both times)
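A short sketch that rebuilds the same window-1 co-occurrence matrix from the assumed toy corpus:

```python
# Hypothetical toy corpus, consistent with the counts in the matrix above.
corpus = ["I love cats", "I love you"]
window = 1

vocab = ["I", "love", "cats", "you"]
index = {w: i for i, w in enumerate(vocab)}
X = [[0] * len(vocab) for _ in vocab]

for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        # Count every neighbour within the window on either side of position i.
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                X[index[w]][index[words[j]]] += 1

for w, row in zip(vocab, X):
    print(f"{w:5s}", row)

# Conditional co-occurrence probability, as on the slide:
count_love = sum(s.split().count("love") for s in corpus)           # "love" occurs twice
print("P(I | love) =", X[index["love"]][index["I"]] / count_love)   # 2/2 = 1.0
```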
Transformer

6. Summary
1. Word embeddings represent words as numerical vectors, so that similar words get similar representations.
2. Recurrent neural networks (RNNs) excel in capturing local and long-distance context.
3. Sequence-to-sequence models are valuable for machine translation and text generation.
4. Transformers, with self-attention, effectively model both local and long-range context, optimizing parallel computation over large amounts of data.
5. Transfer learning, leveraging pretrained contextual word embeddings, enables versatile model development.
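Building on the attention sketch above, self-attention (the core of the transformer) uses the same sequence as queries, keys, and values, so every position can attend to every other position, local or long-range; the projection matrices below are random stand-ins for learned parameters.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention: every position attends to every other position,
    whether it is nearby or far away in the sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv         # queries, keys, values all come from X itself
    scores = Q @ K.T / np.sqrt(Q.shape[-1])  # pairwise similarities between positions
    return softmax(scores, axis=-1) @ V      # weighted mix of the whole sequence per position

rng = np.random.default_rng(0)
d_model, d_head, T = 16, 8, 10
X = rng.normal(size=(T, d_model))                                    # 10 token vectors
Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))  # random projections
print(self_attention(X, Wq, Wk, Wv).shape)                           # (10, 8)
```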