Chapter 8: Word embedding
8.1 Lexical Semantics
• Common terms:
• Word form: the inflected form as it appears in text (e.g., mice)
• Lemma: the citation or dictionary form shared by a set of word forms (e.g., mouse for mice)
• Word sense: a discrete representation of one aspect of a word's meaning
Relations between word senses
• Similar words:
• Words with similar meanings: not synonyms, but sharing some element of meaning
• Word similarity
• Helps in computing how similar the meanings of two phrases or sentences are (see the cosine sketch below)
• Measured against human judgements such as the SimLex-999 dataset
• Word relatedness / word association
• Words associated by co-occurring in the same semantic field, without necessarily being similar
• Antonym
• “Similar” words that are opposites with respect to only one feature of meaning
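Word similarity is commonly quantified as the cosine between word vectors (the vector representations introduced in the next section). Below is a minimal sketch with made-up toy vectors; the words and values are illustrative assumptions, not data from the slides.

```python
import numpy as np

def cosine_similarity(v, w):
    """Cosine of the angle between vectors v and w (1.0 = identical direction)."""
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

# Toy 3-dimensional vectors (illustrative values only, not real embeddings).
cat = np.array([0.9, 0.8, 0.1])
dog = np.array([0.8, 0.9, 0.2])
car = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(cat, dog))  # high: similar meanings
print(cosine_similarity(cat, car))  # low: dissimilar meanings
```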
8.2 Word semantic vectors
Figure: a two-dimensional (t-SNE) projection of embeddings for some words and phrases, with negative words, neutral function words, and positive words forming separate clusters.
• => offers enormous power to NLP applications: models can work with similar meanings instead of exact word forms only
• Vector semantic models
• Learned automatically from text
• tf-idf model, word2vec model, etc. (a tf-idf sketch follows below)
Jurafsky, Daniel, and James H. Martin. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition.
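As a concrete illustration of a vector semantic model learned automatically from text, here is a minimal tf-idf sketch using scikit-learn's TfidfVectorizer (assuming scikit-learn >= 1.0; the toy corpus is an assumption for illustration).

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (illustrative only): each document is mapped to one tf-idf vector.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "stock markets fell sharply today",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)       # sparse matrix: documents x vocabulary

print(vectorizer.get_feature_names_out())  # learned vocabulary
print(X.toarray().round(2))                # tf-idf weight of each word in each document
```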
8.3 Word embeddings
(Source: https://web.stanford.edu/class/cs224n/)
• Skip-gram models
• Naïve softmax (simple but expensive loss function when there are many output classes)
• More optimized variants like hierarchical softmax
• Negative sampling with logistic regression
• Skip-gram as logistic regression
• Instead of counting how often each word c occurs near a target word w, train a binary classifier on the prediction task: “Is word c likely to show up near the target word w?”
• Use logistic regression to train the classifier to distinguish the two cases (+/-).
• Self-supervision:
• only use running text as implicitly supervised training data
• a word c that occurs near the target word w acts as a ‘positive example’
• randomly sample other words in the lexicon to get negative examples (negative sampling)
• The learned weights are used as the embeddings (a sketch of the pair-generation step follows below)
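A minimal sketch of the self-supervised pair-generation step described above: slide a window over running text, emit (target, context, +) pairs, and draw k random words from the lexicon as negatives for each positive. This simplified version samples negatives uniformly, unlike word2vec's weighted unigram sampling; the toy sentence and lexicon are assumptions.

```python
import random

def make_training_pairs(tokens, lexicon, window=2, k=2, seed=0):
    """Generate (target, context, label) triples for the skip-gram classifier."""
    rng = random.Random(seed)
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            pairs.append((target, tokens[j], 1))            # positive: real context word
            for _ in range(k):                              # k negatives per positive
                pairs.append((target, rng.choice(lexicon), 0))
    return pairs

tokens = "a tablespoon of apricot jam".split()
lexicon = ["aardvark", "zebra", "jam", "of", "hello", "my"]  # toy lexicon (assumption)
for p in make_training_pairs(tokens, lexicon)[:6]:
    print(p)
```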
Skip-gram model as logistic regression
• The classifier
• Context: a window of ±n words around the target
• Near: a word c is likely to occur near the target w if its embedding vector is similar to the target embedding:
• Similarity(w,c) ≈ c·w
• Model the probability that word c is/is not a real context word for target word w:
$P(+ \mid w, c) = \sigma(\mathbf{c} \cdot \mathbf{w}) = \frac{1}{1 + \exp(-\mathbf{c} \cdot \mathbf{w})}; \qquad P(- \mid w, c) = 1 - P(+ \mid w, c) = \frac{1}{1 + \exp(\mathbf{c} \cdot \mathbf{w})}$
• Model the probability that a sequence c1..cL is a real context sequence for target word w:
$P(+ \mid w, c_{1:L}) = \prod_{i=1}^{L} \sigma(\mathbf{c}_i \cdot \mathbf{w})$
• Each word has two embeddings: one for the word as a target, and one for the word considered as context
• For all |V| words in the vocabulary, we therefore need to build two embedding matrices W and C (a sketch of the probability computation follows below)
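A minimal sketch of the classifier's probability computation with the two matrices W (target embeddings) and C (context embeddings); here they are randomly initialized for illustration, whereas in practice they are learned by minimizing the negative-sampling loss. Vocabulary size and dimension are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 6, 4                      # toy vocabulary size and embedding dimension (assumptions)
W = rng.normal(size=(V, d))      # target embeddings, one row per vocabulary word
C = rng.normal(size=(V, d))      # context embeddings, one row per vocabulary word

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def p_positive(w_id, c_id):
    """P(+|w,c) = sigma(c . w): probability that c is a real context word of w."""
    return sigmoid(C[c_id] @ W[w_id])

def p_positive_window(w_id, context_ids):
    """P(+|w, c_1:L) = product over i of sigma(c_i . w)."""
    return np.prod([p_positive(w_id, c) for c in context_ids])

print(p_positive(0, 1))                    # one candidate context word
print(p_positive_window(0, [1, 2, 4, 5]))  # a window of L = 4 context words
```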
Figure: skip-gram classifier example with a context window of L = ±2 words around the target.
Visualizing Embeddings - Semantic properties of embeddings
• Embeddings capture comparative and superlative morphology (a probe with pretrained vectors follows below)
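This regularity can be probed with vector arithmetic: the offset between a base form and its comparative is roughly constant across word pairs, so the vector closest to vec("bigger") - vec("big") + vec("small") should be near vec("smaller"). The sketch below uses gensim's downloadable pretrained vectors; the specific model name is an assumption, and any pretrained embedding set would do.

```python
# Probe comparative morphology with pretrained embeddings (requires gensim and a download).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")   # returns a KeyedVectors instance

# Analogy big : bigger :: small : ?  computed as the nearest neighbours of
# vec("bigger") - vec("big") + vec("small").
print(vectors.most_similar(positive=["bigger", "small"], negative=["big"], topn=3))
```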
Resources