Week 8
Pawan Goyal
Week 8, Lecture 1
Definition
Lexical semantics is concerned with the systematic meaning related
connections among lexical items, and the internal meaning-related structure of
individual lexical items.
To identify the semantics of lexical items, we need to focus on the notion of
lexeme, an individual entry in the lexicon.
What is a lexeme?
A lexeme should be thought of as a pairing of a particular orthographic and phonological form with some sort of symbolic meaning representation.
Orthographic form and phonological form refer to the form part of a lexeme.
Sense refers to a lexeme’s meaning counterpart.
red n. the color of blood or a ruby
blood n. the red liquid that circulates in the heart, arteries and veins of
animals
The entries are descriptions of lexemes in terms of other lexemes.
Dictionary definitions of right and left similarly make it clear that they are similar kinds of lexemes that stand in some kind of alternation, or opposition, to one another.
We can glean that red is a color, that it can be applied to both blood and rubies, and that blood is a liquid.
Relations between senses
Homonymy
Polysemy
Synonymy
Antonymy
Hypernymy
Hyponymy
Meronymy
Homonymy
Examples
Bat (wooden stick-like thing) vs Bat (flying mammal thing)
Bank (financial institution) vs Bank (riverside)
Homonymy causes problems for NLP applications
Text-to-Speech: same orthographic form but different phonological form
Information Retrieval: different meaning but same orthographic form
Speech Recognition: to, two, too
Perfect homonyms are also problematic
Polysemy
Consider bank:
Sense 1: “The building belonging to a financial institution”
Sense 2: “A financial institution”
Are those the same sense?
Another example
Heavy snow caused the roof of the school to collapse.
The school hired more teachers this year than ever before.
More examples:
Author (Jane Austen wrote Emma) ↔ Works of Author (I really love Jane Austen)
Zeugma test
Combine two separate uses of a lexeme into a single example using conjunction:
Which of these flights serve breakfast?
Does Midwest Express serve Philadelphia?
*Does Midwest Express serve breakfast and San Jose?
Since it sounds weird, we say that these are two different senses of serve.
Synonymy
couch / sofa
big / large
automobile / car
vomit / throw up
water / H2O
Two lexemes are synonyms if they can be successfully substituted for each
other in all situations.
Would I be flying on a large or small plane?
Miss Nelson, for instance, became a kind of big sister to Benjamin.
*Miss Nelson, for instance, became a kind of large sister to Benjamin.
Why?
big has a sense that means being older, or grown up
large lacks this sense
Shades of meaning
What is the cheapest first class fare?
*What is the cheapest first class price?
Collocational constraints
We frustrate ’em and frustrate ’em, and pretty soon they make a big mistake.
*We frustrate ’em and frustrate ’em, and pretty soon they make a large mistake.
Antonymy
Senses that are opposites with respect to one feature of their meaning
Otherwise, they are similar!
I dark / light
I short / long
I hot / cold
I up / down
I in / out
Hyponymy
One sense is a hyponym of another if the first sense is more specific, denoting
a subclass of the other
car is a hyponym of vehicle
dog is a hyponym of animal
mango is a hyponym of fruit
Hypernymy
Conversely
vehicle is a hypernym/superordinate of car
animal is a hypernym of dog
fruit is a hypernym of mango
Entailment
Sense A is a hyponym of sense B if being an A entails being a B.
Ex: dog, animal
Transitivity
A hypo B and B hypo C entails A hypo C
Definition
Meronymy: an asymmetric, transitive relation between senses.
X is a meronym of Y if it denotes a part of Y .
The inverse relation is holonymy.
meronym    holonym
porch      house
wheel      car
leg        chair
nose       face
Pawan Goyal
Week 8, Lecture 2
WordNet
https://wordnet.princeton.edu/wordnet/
A hierarchically organized lexical database
A machine-readable thesaurus, and aspects of a dictionary
Versions for other languages are under development
part of speech    no. synsets
noun              82,115
verb              13,767
adjective         18,156
adverb             3,621
Example: chump as a noun to mean ‘a person who is gullible and easy to
take advantage of’
Chump belongs to a synset, a set of near-synonymous senses; each of these senses shares this same gloss.
For WordNet, the meaning of this sense of chump is this list.
Word similarity
I Word similarity or
I Word distance
Two words are more similar if they share more features of meaning
Actually these are really relations between senses:
I Instead of saying “bank is like fund”
I We say
F Bank1 is similar to fund3
F Bank2 is similar to slope5
Distributional algorithms: compare words based on their distributional context in corpora
Thesaurus-based algorithms: based on whether words are “nearby” in WordNet
A thesaurus like WordNet also provides:
I Glosses and example sentences
In practice, “thesaurus-based” methods usually use:
I the is-a/subsumption/hypernymy hierarchy
I and sometimes the glosses too
Word similarity vs. word relatedness
I Similar words are near-synonyms
I Related words could be related any way
F car, gasoline: related, but not similar
F car, bicycle: similar
Basic Idea
Two words are similar if they are nearby in the hypernym graph
pathlen(c1, c2) = number of edges in the shortest path (in the hypernym graph) between senses c1 and c2
simpath(c1, c2) = 1 / (1 + pathlen(c1, c2))
sim(w1, w2) = max over c1 ∈ senses(w1), c2 ∈ senses(w2) of sim(c1, c2)
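A minimal sketch of these two formulas over a toy hypernym graph (the graph, node names, and helper functions below are illustrative, not from the lecture):

```python
from collections import deque

# Toy hypernym graph: sense -> its hypernyms (illustrative only).
hypernyms = {
    "car": ["vehicle"], "truck": ["vehicle"], "vehicle": ["artifact"],
    "artifact": ["entity"], "food": ["entity"],
    "fruit": ["food"], "mango": ["fruit"],
}

def pathlen(c1, c2):
    """Number of edges on the shortest path between two senses,
    following hypernym links in either direction (plain BFS)."""
    adj = {}
    for child, parents in hypernyms.items():
        for p in parents:
            adj.setdefault(child, set()).add(p)
            adj.setdefault(p, set()).add(child)
    dist, queue = {c1: 0}, deque([c1])
    while queue:
        node = queue.popleft()
        if node == c2:
            return dist[node]
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return float("inf")

def sim_path(c1, c2):
    # simpath(c1, c2) = 1 / (1 + pathlen(c1, c2))
    return 1.0 / (1.0 + pathlen(c1, c2))

print(sim_path("car", "truck"))   # 2 edges via "vehicle" -> 1/3
print(sim_path("car", "mango"))   # longer path -> smaller similarity
```

For word-level similarity one would then take the maximum over all sense pairs, as in the last formula above.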
L-C similarity
d: maximum depth of the hierarchy
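The formula itself is left to the slide figure; the standard Leacock–Chodorow measure, assuming pathlen is counted in edges as above (implementations sometimes count nodes instead), is

\[ \mathrm{sim}_{LC}(c_1, c_2) = -\log\frac{\mathrm{pathlen}(c_1, c_2)}{2d} \]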
Concept probabilities
For each concept (synset) c, let P(c) be the probability that a randomly selected word in a corpus is an instance (hyponym) of c
P(ROOT) = 1
The lower a node in the hierarchy, the lower its probability
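The estimator is not spelled out on the slide; a common choice, following Resnik, counts, for each concept, the corpus frequency of every word it subsumes:

\[ P(c) = \frac{\sum_{w \in \mathrm{words}(c)} \mathrm{count}(w)}{N} \]

where words(c) is the set of words subsumed by c and N is the total number of word tokens covered by the hierarchy.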
Information content
Information content: IC(c) = −logP(c)
Lowest common subsumer LCS(c1, c2): the lowest node in the hierarchy that subsumes (is a hypernym of) both c1 and c2
We are now ready to see how to use information content (IC) as a
similarity metric.
Resnik Similarity
Intuition: how similar two words are depends on how much they have in common
It measures the commonality by the information content of the lowest common subsumer
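The formula itself (not shown on this slide) is then just the information content of that lowest common subsumer:

\[ \mathrm{sim}_{\mathrm{Resnik}}(c_1, c_2) = -\log P(\mathrm{LCS}(c_1, c_2)) = \mathrm{IC}(\mathrm{LCS}(c_1, c_2)) \]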
Lin: The more information content they don’t share, the less similar they
are
Not the absolute quantity of shared information but the proportion of
shared information
simLin(c1, c2) = 2 log P(LCS(c1, c2)) / (log P(c1) + log P(c2))
The information content common to c1 and c2, normalized by their average information content.
JC similarity
We can use IC to assign lengths to graph edges:
distJC(c, hypernym(c)) = IC(c) − IC(hypernym(c))
distJC(c1, c2) = distJC(c1, LCS(c1, c2)) + distJC(c2, LCS(c1, c2))
             = IC(c1) − IC(LCS(c1, c2)) + IC(c2) − IC(LCS(c1, c2))
             = IC(c1) + IC(c2) − 2 × IC(LCS(c1, c2))
simJC(c1, c2) = 1 / (IC(c1) + IC(c2) − 2 × IC(LCS(c1, c2)))
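All of these thesaurus-based measures are available through NLTK's WordNet interface; a small sketch, assuming NLTK with the wordnet and wordnet_ic data packages installed:

```python
from nltk.corpus import wordnet as wn, wordnet_ic

# Information content estimated from the Brown corpus (shipped with wordnet_ic).
brown_ic = wordnet_ic.ic("ic-brown.dat")

dog, cat = wn.synset("dog.n.01"), wn.synset("cat.n.01")

print(dog.path_similarity(cat))            # 1 / (1 + pathlen)
print(dog.lch_similarity(cat))             # Leacock-Chodorow
print(dog.res_similarity(cat, brown_ic))   # Resnik: IC of the LCS
print(dog.lin_similarity(cat, brown_ic))   # Lin
print(dog.jcn_similarity(cat, brown_ic))   # Jiang-Conrath (1 / distJC)
```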
Gloss overlap
I Drawing paper: paper that is specially prepared for use in drafting
I Decal: the art of transferring designs from specially prepared paper to a
wood or glass or metal surface
For each n-word phrase that occurs in both glosses, add a score of n²
paper (1²) and specially prepared (2²) → 1 + 4 = 5
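A minimal sketch of that scoring: greedily find the longest phrase common to both glosses, add n² for its length, remove it, and repeat (the function names are illustrative):

```python
def longest_common_phrase(a, b):
    """Longest contiguous run of tokens shared by token lists a and b.
    Returns (length, start_in_a, start_in_b)."""
    best = (0, 0, 0)
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best[0]:
                best = (k, i, j)
    return best

def overlap_score(gloss1, gloss2):
    """Add n^2 for every maximal n-word phrase shared by the two glosses."""
    a, b = gloss1.lower().split(), gloss2.lower().split()
    score = 0
    while True:
        n, i, j = longest_common_phrase(a, b)
        if n == 0:
            return score
        score += n * n
        del a[i:i + n]   # remove the matched phrase so it is not counted twice
        del b[j:j + n]

drawing_paper = "paper that is specially prepared for use in drafting"
decal = ("the art of transferring designs from specially prepared paper "
         "to a wood or glass or metal surface")
print(overlap_score(drawing_paper, decal))  # "specially prepared" (4) + "paper" (1) = 5
```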
I saw a man who is 98 years old and can still walk and tell jokes
Pawan Goyal
Week 8, Lecture 3
Sense ambiguity
Many words have several meanings or senses
The meaning of bass depends on the context
Are we talking about music, or fish?
I An electric guitar and bass player stand off to one side, not really part of
the scene, just as a sort of nod to gringo expectations perhaps.
Approaches to WSD
Knowledge Based Approaches
I Overlap Based Approaches
Machine Learning Based Approaches
I Supervised Approaches
I Semi-supervised Algorithms
I Unsupervised Algorithms
Hybrid Approaches
Overlap-based approach
Find the overlap between the features of different senses of an ambiguous word (sense bag) and the features of the words in its context (context bag).
The features could be sense definitions, example sentences, hypernyms, etc.
Example: On burning coal we get ash. (Which sense of ash?)
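A minimal simplified-Lesk sketch for that example; the two senses and their glosses below are hand-written for illustration, not actual dictionary entries:

```python
STOPWORDS = {"the", "a", "an", "of", "on", "we", "get", "is", "as", "or", "and", "with", "from"}

# Toy sense inventory for "ash" (glosses written only for illustration).
senses = {
    "ash (residue)": "the powdery residue left after burning coal or wood",
    "ash (tree)":    "a tree of the olive family with compound leaves and winged fruits",
}

def simplified_lesk(context_sentence, senses):
    """Pick the sense whose gloss (sense bag) overlaps most with the context bag."""
    context_bag = {w for w in context_sentence.lower().split() if w not in STOPWORDS}
    best_sense, best_overlap = None, -1
    for sense, gloss in senses.items():
        sense_bag = {w for w in gloss.lower().split() if w not in STOPWORDS}
        overlap = len(context_bag & sense_bag)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap

print(simplified_lesk("On burning coal we get ash", senses))
# -> ('ash (residue)', 2): "burning" and "coal" overlap with that gloss
```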
A context word will add 1 to the score of the sense if the thesaurus
category of the word matches that of the sense.
I E.g. The money in this bank fetches an interest of 8% per annum
Target word: bank
Clue words from the context: money, interest, annum, fetch
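A small sketch of that counting with a hand-built category table; the categories and the word-to-category assignments are illustrative, not taken from an actual thesaurus:

```python
# Thesaurus categories of the two senses of "bank" (illustrative).
sense_category = {"bank#finance": "FINANCE", "bank#river": "GEOGRAPHY"}

# Category membership of possible context words (illustrative).
word_categories = {
    "money": {"FINANCE"}, "interest": {"FINANCE"}, "annum": {"FINANCE"},
    "fetch": set(), "water": {"GEOGRAPHY"}, "river": {"GEOGRAPHY"},
}

def walker_score(context_words, sense_category, word_categories):
    """Each context word adds 1 to every sense whose thesaurus category it shares."""
    scores = {sense: 0 for sense in sense_category}
    for word in context_words:
        for sense, category in sense_category.items():
            if category in word_categories.get(word, set()):
                scores[sense] += 1
    return scores

context = ["money", "interest", "annum", "fetch"]
print(walker_score(context, sense_category, word_categories))
# {'bank#finance': 3, 'bank#river': 0} -> the financial sense wins
```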
Naïve Bayes for WSD
Choose the most probable sense given the feature vector f: ŝ = argmax_{s∈S} P(s|f)
Using Bayes’ law, this can be expressed as:
ŝ = argmax_{s∈S} P(s) P(f|s)
With the naive independence assumption, P(f|s) is the product of the individual feature probabilities P(fj|s).
Features:
I Words in the context and their POS’s
I Co-occurrence vector
Set parameters of Naïve Bayes using maximum likelihood estimation
(MLE) from training data
P(si) = count(si, wj) / count(wj)
P(fj | si) = count(fj, si) / count(si)
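A minimal sketch of these estimates and the arg-max decision over a tiny hand-labelled training set; the data, the tokenization, and the add-one smoothing (not on the slide) are all illustrative choices:

```python
from collections import Counter, defaultdict

# Tiny sense-tagged corpus for the word "bank" (illustrative).
training = [
    ("FINANCE", "the money in this bank fetches a good interest"),
    ("FINANCE", "she deposited the cheque at the bank"),
    ("RIVER",   "they walked along the bank of the river"),
]

sense_counts = Counter()
feature_counts = defaultdict(Counter)
vocab = set()
for sense, sentence in training:
    sense_counts[sense] += 1
    for w in sentence.lower().split():
        feature_counts[sense][w] += 1
        vocab.add(w)

def p_sense(s):
    # P(s_i): fraction of the word's training instances labelled with sense s_i
    return sense_counts[s] / sum(sense_counts.values())

def p_feature(f, s):
    # Relative frequency of feature f among sense s's tokens, with add-one smoothing
    return (feature_counts[s][f] + 1) / (sum(feature_counts[s].values()) + len(vocab))

def disambiguate(sentence):
    # s_hat = argmax_s P(s) * prod_j P(f_j | s)
    scores = {}
    for s in sense_counts:
        score = p_sense(s)
        for f in sentence.lower().split():
            score *= p_feature(f, s)
        scores[s] = score
    return max(scores, key=scores.get)

print(disambiguate("interest rates at this bank"))  # -> FINANCE
```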
Decision lists
Collect a large set of collocations for the ambiguous word
Calculate word-sense probability distributions for all such collocations
Calculate the log-likelihood ratio:
log( P(Sense-A | Collocationi) / P(Sense-B | Collocationi) )
Higher log-likelihood ⇒ more predictive evidence
Collocations are ordered in a decision list, with the most predictive collocations ranked highest
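A compact sketch of building such a list from per-collocation sense counts; the collocations, the counts, and the smoothing constant are made up for illustration:

```python
import math

# (count with Sense-A, count with Sense-B) for each collocation of "bass" (illustrative numbers).
counts = {
    "fish within window":   (0, 98),
    "play within window":   (45, 2),
    "guitar within window": (30, 1),
    "river within window":  (0, 40),
}

def log_likelihood(a, b, alpha=0.1):
    """|log( P(Sense-A | collocation) / P(Sense-B | collocation) )|, smoothed to avoid log(0)."""
    return abs(math.log((a + alpha) / (b + alpha)))

# Order collocations so that the most predictive evidence is ranked highest.
decision_list = sorted(
    ((log_likelihood(a, b), coll, "A" if a > b else "B") for coll, (a, b) in counts.items()),
    reverse=True,
)
for score, coll, sense in decision_list:
    print(f"{score:5.2f}  {coll:22s} -> Sense-{sense}")
```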
Classification of a test sentence is based on the highest-ranking collocation found in the test sentence.
Pawan Goyal
Week 8, Lecture 4
“Bootstrapping” or co-training
I Start with (small) seed, learn decision list
I Use decision list to label rest of corpus
I Retain ‘confident’ labels, treat as annotated data to learn new decision list
I Repeat . . .
Heuristics (derived from observation):
I One sense per discourse
I One sense per collocation
A word tends to preserve its meaning across all its occurrences in a given
discourse
Example
Disambiguating plant (industrial sense) vs. plant (living thing sense)
Think of seed features for each sense
I Industrial sense: co-occurring with ‘manufacturing’
I Living thing sense: co-occurring with ‘life’
Use ‘one sense per collocation’ to build initial decision list classifier
Treat results (having high probability) as annotated data, train new decision list classifier, iterate
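The loop itself, as a schematic sketch: the trainer is passed in as a callable, since this fragment only shows the bootstrapping control flow, not any particular classifier:

```python
def bootstrap(train, seed_labels, contexts, max_iter=10, threshold=0.9):
    """Yarowsky-style bootstrapping.

    train(labelled) must return a classifier: a callable mapping a context to
    (sense, confidence). Both are supplied by the caller; this sketch only
    shows how the labelled set is grown with confident predictions.
    """
    labelled = dict(seed_labels)                  # {example_id: sense}
    for _ in range(max_iter):
        classify = train(labelled)
        added = 0
        for ex_id, context in contexts.items():
            if ex_id in labelled:
                continue
            sense, confidence = classify(context)
            if confidence >= threshold:           # retain only 'confident' labels
                labelled[ex_id] = sense
                added += 1
        if added == 0:                            # no new confident labels: stop
            break
    return train(labelled)                        # final classifier for WSD
```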
Termination
Stop when
I Error on training data is less than a threshold
I No more training data is covered
Use final decision list for WSD
Advantages
Accuracy is about as good as a supervised algorithm
Bootstrapping: far less manual effort
Senses can also be induced from a co-occurrence graph built over the contexts for a word.
In each high density component one of the nodes (hub) has a higher
degree than the others.
Step 1: Construct co-occurrence graph, G.
Step 2: Arrange nodes in G in decreasing order of degree.
Step 3: Select the node from G which has the highest degree. This node will be the hub of the first high density component.
Step 4: Delete this hub and all its neighbors from G.
Step 5: Repeat Steps 3 and 4 to detect the hubs of other high density components.
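A small sketch of Steps 1-5 on a toy co-occurrence graph for an ambiguous target word; the graph and the minimum-degree cut-off are illustrative:

```python
# Toy co-occurrence graph for a target word, as an undirected adjacency map (illustrative).
graph = {
    "guitar": {"play", "music", "string"},
    "play":   {"guitar", "music"},
    "music":  {"guitar", "play", "string"},
    "string": {"guitar", "music"},
    "fish":   {"river", "catch"},
    "river":  {"fish", "catch"},
    "catch":  {"fish", "river"},
}

def find_hubs(graph, min_degree=2):
    """Repeatedly pick the highest-degree node as a hub, then delete it and its neighbours."""
    g = {node: set(neigh) for node, neigh in graph.items()}
    hubs = []
    while g:
        # Steps 2-3: the node of highest remaining degree becomes the next hub.
        hub = max(g, key=lambda n: len(g[n]))
        if len(g[hub]) < min_degree:        # remaining nodes are too weakly connected
            break
        hubs.append(hub)
        # Step 4: remove the hub and all its neighbours from the graph.
        removed = {hub} | g[hub]
        g = {n: neigh - removed for n, neigh in g.items() if n not in removed}
    return hubs

print(find_hubs(graph))   # ['guitar', 'fish'] -> one hub per high density component
```

Each hub returned this way stands for one induced sense (one high density component).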
The distance between two nodes is measured as the smallest sum of
weights of the edges on the paths linking them.
Computing distance between two nodes wi and wj
A score vector s is associated with each wj ∈ W(j, i), such that sk represents the contribution of the kth hub as:
sk = 1 / (1 + d(hk, wj)) if hk is an ancestor of wj
sk = 0 otherwise.
All score vectors associated with all wj ∈ W(j , i) are summed up
The hub which receives the maximum score is chosen as the most
appropriate sense
Pawan Goyal
Week 8, Lecture 5: Novel Word Sense Detection
Tracking Sense Changes
Classical sense¹
Novel sense
¹ http://www.merriam-webster.com/
Comparing sense clusters
Split, join, birth and death
A real example of birth