Word Sense Disambiguation: Michael Melese (PHD) Michael - Melese@Aau - Edu.Et

This document discusses word sense disambiguation, which is the process of identifying the intended meaning of words with multiple meanings based on context. It describes different relationships between word senses such as polysemy, homonymy, synonymy, antonymy, hyponomy, and hypernomy. Common approaches to word sense disambiguation include supervised machine learning, dictionary-based methods, and semi-supervised learning.

Uploaded by

Gezish
Word Sense Disambiguation

Michael Melese (PhD)

WSD
- Word sense disambiguation (WSD) is the problem of selecting a sense for a word from a set of predefined possibilities.
  - The sense inventory usually comes from a dictionary or thesaurus.
  - Approaches include knowledge-intensive methods, supervised learning, and (sometimes) bootstrapping.
- WSD selects the correct sense of a word in context.
Relationships between word senses
- Polysemy
- Homonymy
- Synonymy
- Antonymy
- Hypernymy
- Hyponymy
Polysemy
- Polysemy: a word or phrase has several different but related senses; most words have more than one possible meaning.
  - A single lexeme with multiple meanings (bank the building, bank the financial institution). Are those the same sense? Which sense of bank is meant?
    - Is it distinct from (homonymous with) the river bank sense?
    - How about the savings bank sense?
- A computer program has no basis for disambiguating the sense of a given instance, even if it is obvious to a human.
Homonymy
- Lexemes that share the same form (phonological, orthographic, or both)
- but have unrelated, distinct meanings
  - Examples
    - bat (wooden stick-like thing) vs bat (flying scary mammal thing)
    - bank (financial institution) vs bank (river side)
  - Can be homophones, homographs, or both:
    - Homophones: write and right, piece and peace
Synonymy
- Words that have the same meaning in some or all contexts
  - couch / sofa
  - big / large
  - automobile / car
  - vomit / throw up
  - water / H2O
- Two lexemes are synonyms if they can be successfully substituted for each other in many situations
Antonymy
- Antonyms are senses that are opposites with respect to one feature of their meaning. They can define the opposite or reverse ends of a scale (long/short, fast/slow, rise/fall, up/down).
- Examples
  - dark/light
  - short/long
  - hot/cold
  - up/down
  - in/out
Hyponymy and Hypernymy
- One sense is a hyponym of another if the first sense is more specific, denoting a subclass of the other
  - car is a hyponym of vehicle
  - dog is a hyponym of animal
  - mango is a hyponym of fruit
- Conversely, the more general sense is the hypernym (superordinate)
  - vehicle is a hypernym/superordinate of car
  - animal is a hypernym of dog
  - fruit is a hypernym of mango
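
The hyponym/hypernym relation can be pictured as links pointing from a specific sense to a more general one. The following is a minimal sketch under that assumption: the `HYPERNYM_OF` table and both function names are hypothetical, and the `poodle` entry is added only to show that the relation chains across levels. A real system would read these links from a lexical resource such as WordNet.

```python
# Toy hypernym links. car/dog/mango come from the examples above;
# "poodle" is an extra entry added for illustration only.
HYPERNYM_OF = {
    "car": "vehicle",
    "dog": "animal",
    "mango": "fruit",
    "poodle": "dog",
}

def hypernym_chain(word):
    """Return the increasingly general terms above `word`, nearest first."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain

def is_hyponym(specific, general):
    """True if `general` appears anywhere above `specific` in the hierarchy."""
    return general in hypernym_chain(specific)

print(hypernym_chain("poodle"))      # ['dog', 'animal']
print(is_hyponym("car", "vehicle"))  # True
print(is_hyponym("vehicle", "car"))  # False
```

The last two calls show that the relation is directional: car is a hyponym of vehicle, but not the other way round.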
Approaches to WSD
- Supervised Machine Learning
- Thesaurus/Dictionary Methods
- Semi-Supervised Learning
Supervised ML approach
- Supervised machine learning approach:
  - A training corpus of words tagged in context with their sense
  - Used to train a classifier that can tag words in new text
- What we need in order to disambiguate:
  - The tag set (“sense inventory”)
  - The training corpus
  - A set of features extracted from the training corpus
  - A classifier (e.g., Naive Bayes, logistic regression, neural network, SVM, k-NN)
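
The four ingredients listed above can be put together in a few lines. This is a minimal sketch, not a production setup: it assumes a two-sense tag set (`bank1`, `bank2`), reuses four example sentences for "bank" as a toy training corpus, uses plain bag-of-words context features, and implements Naive Bayes with add-one smoothing by hand. The function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Toy sense-tagged corpus: (context words, sense tag).
# Four sentences are far too few for real use; they only illustrate the flow.
TRAIN = [
    ("he cashed a check at the bank".split(), "bank1"),
    ("that bank holds the mortgage on my home".split(), "bank1"),
    ("they pulled the canoe up on the bank".split(), "bank2"),
    ("he sat on the bank of the river and watched the currents".split(), "bank2"),
]

def train_naive_bayes(examples):
    """Count how often each sense and each (sense, word) pair occurs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab

def classify(words, model):
    """Return the sense with the highest log-probability for the context."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense in sorted(sense_counts):
        score = math.log(sense_counts[sense] / total)            # prior
        denom = sum(word_counts[sense].values()) + len(vocab)    # add-one smoothing
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

model = train_naive_bayes(TRAIN)
print(classify("the bank approved my mortgage".split(), model))   # bank1
print(classify("we walked along the river bank".split(), model))  # bank2
```

Any of the other classifiers named above could replace the Naive Bayes step; the tag set, corpus, and feature extraction stay the same.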


Dictionary and Thesaurus Methods
- A simplified Lesk algorithm can be used to disambiguate a word sense using WordNet.
  - Example sentence: “The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities.”

bank1
  Gloss: a financial institution that accepts deposits and channels the money into lending activities
  Examples: “he cashed a check at the bank”, “that bank holds the mortgage on my home”
bank2
  Gloss: sloping land (especially the slope beside a body of water)
  Examples: “they pulled the canoe up on the bank”, “he sat on the bank of the river and watched the currents”
Simplified Lesk Algorithm (1)
- Given the sentence
  “The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities.”
- Choose the sense with the most word overlap between gloss and context.

bank1
  Gloss: a financial institution that accepts deposits and channels the money into lending activities
  Examples: “he cashed a check at the bank”, “that bank holds the mortgage on my home”
bank2
  Gloss: sloping land (especially the slope beside a body of water)
  Examples: “they pulled the canoe up on the bank”, “he sat on the bank of the river and watched the currents”
Simplified Lesk Algorithm (2)
- For a given sentence, choose the sense with the most overlap between the context and the signature (the gloss and examples of the sense).
  - “The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities.”

bank1
  Gloss: a financial institution that accepts deposits and channels the money into lending activities
  Examples: “he cashed a check at the bank”, “that bank holds the mortgage on my home”
bank2
  Gloss: sloping land (especially the slope beside a body of water)
  Examples: “they pulled the canoe up on the bank”, “he sat on the bank of the river and watched the currents”
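
The overlap computation can be sketched in a few lines of Python. This is a minimal illustration, assuming the two senses of "bank" are typed in by hand rather than loaded from WordNet; the stopword list, the tokenizer, and the function names are assumptions made for this example.

```python
import re

# Small hand-picked stopword list (an assumption; any standard list would do).
STOPWORDS = {"a", "an", "the", "and", "of", "on", "at", "in", "into", "it",
             "he", "they", "that", "my", "up", "can", "will", "because"}

# Sense inventory for "bank": glosses and examples typed in by hand.
SENSES = {
    "bank1": {
        "gloss": "a financial institution that accepts deposits and "
                 "channels the money into lending activities",
        "examples": ["he cashed a check at the bank",
                     "that bank holds the mortgage on my home"],
    },
    "bank2": {
        "gloss": "sloping land (especially the slope beside a body of water)",
        "examples": ["they pulled the canoe up on the bank",
                     "he sat on the bank of the river and watched the currents"],
    },
}

SENTENCE = ("The bank can guarantee deposits will eventually cover future "
            "tuition costs because it invests in adjustable-rate mortgage "
            "securities.")

def content_words(text, target):
    """Lowercase, split on non-letters, drop stopwords and the target word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS and t != target}

def simplified_lesk(target, sentence, senses):
    """Pick the sense whose signature shares the most words with the context."""
    context = content_words(sentence, target)
    best_sense, best_overlap = None, set()
    for sense, entry in senses.items():
        # Signature = words of the gloss plus words of the example sentences.
        signature = content_words(
            " ".join([entry["gloss"], *entry["examples"]]), target)
        overlap = context & signature
        # On a tie, the sense listed first is kept.
        if best_sense is None or len(overlap) > len(best_overlap):
            best_sense, best_overlap = sense, overlap
    return best_sense, sorted(best_overlap)

print(simplified_lesk("bank", SENTENCE, SENSES))
# ('bank1', ['deposits', 'mortgage'])
```

bank1 wins because its signature shares two content words with the sentence (deposits, mortgage), while the bank2 signature shares none.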
Semi-Supervised Learning
- Problem: supervised and dictionary-based approaches require large hand-built resources.
  - What if you don’t have that much training data?
- Solution: bootstrapping
  - Generalize from a very small hand-labeled seed set.
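
One way to picture bootstrapping is a loop that starts from a couple of seed words, labels the contexts that contain them, and then learns new indicator words from those freshly labeled contexts. The sketch below is a deliberately simplified self-training loop, not a full bootstrapping algorithm as published: the unlabeled contexts, the two seeds, the stopword list, and the function names are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical unlabeled contexts for the target word "bank".
CONTEXTS = [
    "the bank approved the mortgage loan",
    "the loan officer at the bank checked my deposits",
    "deposits at the bank earn interest",
    "the canoe drifted toward the river bank",
    "we fished from the muddy bank near the canoe",
    "the muddy bank was covered in reeds",
]

# Very small hand-labeled seed set: one indicator word per sense.
SEEDS = {"mortgage": "bank1", "river": "bank2"}

STOPWORDS = {"the", "at", "my", "we", "from", "in", "was", "near",
             "toward", "bank"}

def content_words(text):
    return {w for w in text.lower().split() if w not in STOPWORDS}

def bootstrap(contexts, seeds, max_rounds=10):
    collocates = dict(seeds)            # indicator word -> sense
    labels = [None] * len(contexts)
    for _ in range(max_rounds):
        changed = False
        # Step 1: label contexts whose known indicator words agree on one sense.
        for i, text in enumerate(contexts):
            if labels[i] is not None:
                continue
            senses = {collocates[w] for w in content_words(text)
                      if w in collocates}
            if len(senses) == 1:
                labels[i] = senses.pop()
                changed = True
        # Step 2: words seen with exactly one sense become new indicators.
        seen = defaultdict(set)
        for text, sense in zip(contexts, labels):
            if sense is not None:
                for w in content_words(text):
                    seen[w].add(sense)
        for w, senses in seen.items():
            if len(senses) == 1 and w not in collocates:
                collocates[w] = next(iter(senses))
        if not changed:
            break
    return labels

print(bootstrap(CONTEXTS, SEEDS))
# ['bank1', 'bank1', 'bank1', 'bank2', 'bank2', 'bank2']
```

Only two contexts contain a seed word, yet all six end up labeled: "loan" and "canoe" are learned in the first round, "deposits" and "muddy" in the second. A practical system would rank candidate indicators by confidence instead of accepting every word seen with a single sense.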
Practical Applications
- Machine Translation
  - Translate “bill” from English to Spanish
    - Is it a “pico” or a “cuenta”?
    - Is it a bird's beak or an invoice?
- Information Retrieval
  - Find all web pages about “cricket”
    - The sport or the insect?
Practical Applications
- Question Answering
  - What is George Miller’s position on gun control?
    - The psychologist or the US congressman?
- Knowledge Acquisition
  - Add to the knowledge base (KB): Herb Bergson is the mayor of Duluth.
    - Minnesota or Georgia?
