
Word Sense Disambiguation in Natural Language Processing
Last Updated : 21 Apr, 2023

Word sense disambiguation (WSD) in Natural Language Processing (NLP) is the problem of identifying which “sense” (meaning) of a word is activated by its use in a particular context or scenario. In humans, this appears to be a largely unconscious process. Correctly identifying the intended sense of a word is a common challenge for NLP systems, and determining the specific usage of a word in a sentence has many applications. Applications of Word Sense Disambiguation include Information Retrieval, Question Answering systems, chatbots, and more.

Word Sense Disambiguation (WSD) is a subtask of Natural Language Processing that deals with identifying the correct sense of a word in context. Many words in natural language have multiple meanings, and WSD aims to determine which sense is intended in a particular context. For example, the word “bank” has different meanings in the sentences “I deposited money in the bank” and “The boat went down the river bank”.
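As a quick illustration, the senses of “bank” can be inspected in WordNet through NLTK (a minimal sketch, assuming NLTK is installed and its WordNet data has been downloaded with nltk.download("wordnet")):

from nltk.corpus import wordnet as wn

# List every WordNet sense of "bank" with its definition (gloss).
for synset in wn.synsets("bank"):
    print(synset.name(), "-", synset.definition())

The output includes, among others, both the financial-institution sense and the sloping-land (river bank) sense, which is exactly the ambiguity WSD has to resolve.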

WSD is a challenging task because it requires understanding the context in which the word is used and the different senses in which the word can be used. Some common approaches to WSD include:

1. Supervised learning: This involves training a machine learning model on a dataset of annotated examples, where each example contains a target word and its sense in a particular context. The model then learns to predict the correct sense of the target word in new contexts.
2. Unsupervised learning: This involves clustering words that appear in
similar contexts together, and then assigning senses to the resulting
clusters. This approach does not require annotated data, but it is less
accurate than supervised learning.
3. Knowledge-based: This involves using a knowledge base, such as a
dictionary or ontology, to map words to their different senses. This
approach relies on the availability and accuracy of the knowledge base.
4. Hybrid: This involves combining multiple approaches, such as supervised
and knowledge-based methods, to improve accuracy.

WSD has many practical applications, including machine translation, information retrieval, and text-to-speech systems. Improvements in WSD can lead to more accurate and efficient natural language processing systems.

Word Sense Disambiguation (WSD) is a subfield of Natural Language Processing (NLP) that deals with determining the intended meaning of a word in a given context. It is the process of identifying the correct sense of a word from a set of possible senses, based on the context in which the word appears. WSD is important for natural language understanding and machine translation, as it can improve the accuracy of these tasks by providing more accurate word meanings. Some common approaches to WSD include using WordNet, supervised machine learning, and unsupervised methods such as clustering.

The noun ‘star’ has eight different meanings or senses, each corresponding to a distinct concept. For example,

“He always wanted to be a Bollywood star.” Here the word ‘star’ means “a famous singer, performer, sports player, actor, personality, etc.”
“The Milky Way galaxy contains between 200 and 400 billion stars.” Here the word ‘star’ means “a big ball of burning gas in space that we view as a point of light in the night sky.”

Difficulties in Word Sense Disambiguation

Word Sense Disambiguation (WSD) faces several difficulties, described below.

Different Text-Corpus or Dictionary: One issue with word sense disambiguation is determining what the senses are, because different dictionaries and thesauruses divide words into senses differently. Some researchers have proposed using a specific lexicon and its set of senses to address this problem. In general, however, research results based on broad sense distinctions have been better than those based on fine-grained ones, yet the majority of researchers continue to work on fine-grained WSD.
PoS Tagging: Part-of-speech tagging and sense tagging have been shown to be very tightly coupled in any real test, with each potentially constraining the other. Both WSD and part-of-speech tagging involve disambiguating or tagging words. However, algorithms designed for one do not always work well for the other, because a word’s part of speech is mostly determined by the one to three words immediately adjacent to it, whereas a word’s sense can depend on words further away.

Sense Inventories for Word Sense Disambiguation

A sense inventory is a resource that lists words, abbreviations, and acronyms together with their possible senses. Some sense inventories commonly used in Word Sense Disambiguation are:

Princeton WordNet: a vast, manually curated lexicographic database of English and other languages. For WSD, this is the de facto standard inventory. Its well-organized Synsets, or clusters of contextual synonyms, are nodes in a network (a short query sketch is shown after this list).
BabelNet: a multilingual dictionary that covers both lexicographic and encyclopedic terminology. It was created by semi-automatically mapping numerous resources, including WordNet, multilingual versions of WordNet, and Wikipedia.
Wiktionary: a collaborative project aimed at creating a dictionary for each language separately; it is another inventory that has recently gained popularity.
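As a small illustration of WordNet as a sense inventory (a sketch assuming NLTK and its WordNet data are installed), a single synset can be queried for its definition, its lemmas (the contextual synonyms mentioned above), and its hypernym links, which form the edges of the network:

from nltk.corpus import wordnet as wn

# Inspect one sense of the noun "star": its gloss, synonyms, and hypernyms.
synset = wn.synset("star.n.01")
print("Definition:", synset.definition())
print("Lemmas:", [lemma.name() for lemma in synset.lemmas()])
print("Hypernyms:", [hyp.name() for hyp in synset.hypernyms()])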

Approaches for Word Sense Disambiguation


There are many approaches to Word Sense Disambiguation. The three main
approaches are given below:

1. Supervised: The assumption behind supervised approaches is that the context can supply enough evidence to disambiguate words on its own (hence, world knowledge and reasoning are deemed unnecessary).

Supervised methods for Word Sense Disambiguation (WSD) involve training a model using a labeled dataset of word senses. The model is then used to disambiguate the sense of a target word in new text. Some common techniques used in supervised WSD include:

1. Decision list: A decision list is a set of rules that are used to assign a
sense to a target word based on the context in which it appears.
2. Neural Network: Neural networks such as feedforward networks,
recurrent neural networks, and transformer networks are used to model
the context-sense relationship.
3. Support Vector Machines: SVM is a supervised machine learning
algorithm used for classification and regression analysis.
4. Naive Bayes: Naive Bayes is a probabilistic algorithm that uses Bayes’
theorem to classify text into predefined categories.
5. Decision Trees: Decision Trees are a flowchart-like structure in which an internal node represents a feature (or attribute), a branch represents a decision rule, and each leaf node represents the outcome.

6. Random Forest: Random Forest is an ensemble learning method for classification, regression, and other tasks that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes predicted by the individual trees.
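To make the supervised setup concrete, here is a minimal sketch that trains a Naive Bayes classifier with bag-of-words context features for the target word “bank”. The training sentences and sense labels are invented toy data (a real system would use a sense-annotated corpus such as SemCor), and scikit-learn is assumed to be installed:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy sense-annotated contexts for the target word "bank" (illustrative only).
train_sentences = [
    "i deposited money in the bank",
    "the bank approved my loan application",
    "we walked along the river bank",
    "the boat was pulled onto the bank of the stream",
]
train_senses = ["FINANCE", "FINANCE", "RIVER", "RIVER"]

# Bag-of-words features of the context + Naive Bayes sense classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_sentences, train_senses)

# Disambiguate the target word in unseen contexts.
print(model.predict(["i deposited cash in the bank"]))           # FINANCE on this toy data
print(model.predict(["we walked along the bank of the river"]))  # RIVER on this toy data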

Supervised WSD Exploiting Glosses: Textual definitions (also known as glosses) are a prominent source of information in sense inventories. Definitions, which follow the format of traditional dictionaries, are a quick and easy way to clarify sense distinctions (a simplified gloss-matching sketch is given after this list).
Purely Data-Driven WSD: In this case, a token tagger is a popular baseline model that generates a probability distribution over all senses in the vocabulary for each word in a context.
Supervised WSD Exploiting Other Knowledge: Additional sources of knowledge, both internal and external to the knowledge base, are also beneficial to WSD models. Some researchers use BabelNet translations to fine-tune the output of any WSD system by comparing the output senses’ translations to the target’s translations provided by an NMT system.
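As a simplified, non-authoritative sketch of the gloss-matching idea, the context can be embedded alongside each candidate sense’s WordNet gloss and the most similar gloss selected. This assumes the sentence-transformers package and the “all-MiniLM-L6-v2” checkpoint, neither of which is mentioned in the original text; it is a stand-in for the gloss-augmented systems described above, not a reproduction of them:

from nltk.corpus import wordnet as wn
from sentence_transformers import SentenceTransformer, util

# Assumed model checkpoint; any sentence encoder could be substituted.
model = SentenceTransformer("all-MiniLM-L6-v2")

def disambiguate_with_glosses(context, target_word):
    # Score every WordNet gloss of the target word against the context
    # and return the sense whose gloss is most similar.
    senses = wn.synsets(target_word)
    if not senses:
        return None
    context_emb = model.encode(context, convert_to_tensor=True)
    gloss_embs = model.encode([s.definition() for s in senses], convert_to_tensor=True)
    scores = util.cos_sim(context_emb, gloss_embs)[0]
    return senses[int(scores.argmax())]

print(disambiguate_with_glosses("I deposited money in the bank", "bank"))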

2. Unsupervised: The underlying assumption is that similar senses occur in similar contexts, and thus senses can be induced from the text by clustering word occurrences using some measure of similarity of context. Using fixed-size dense vectors (word embeddings) to represent words in context has become one of the most fundamental building blocks in several NLP systems. Traditional word embedding approaches can still be utilized to improve WSD, despite the fact that they conflate words with many meanings into a single vector representation. In addition to word embedding techniques, lexical databases (e.g., WordNet, ConceptNet, BabelNet) can also help unsupervised systems map words to their senses.
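The clustering idea can be sketched as follows; the contexts are invented toy data, the number of senses is assumed to be known in advance, and scikit-learn is assumed to be installed:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy occurrences of "bank", each represented by its surrounding context.
contexts = [
    "deposited my salary at the bank this morning",
    "the bank raised its interest rates again",
    "picnic on the grassy bank of the river",
    "the canoe drifted toward the muddy bank",
]

# Represent each context as a TF-IDF vector and cluster the occurrences;
# each cluster is treated as one induced sense of "bank".
vectors = TfidfVectorizer(stop_words="english").fit_transform(contexts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for context, label in zip(contexts, labels):
    print(label, context)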
3. Knowledge-Based: It is built on the idea that words used in a text are related to one another, and that this relationship can be seen in the definitions of the words and their meanings. The pair of dictionary senses having the highest word overlap in their dictionary definitions is used to disambiguate two (or more) words. The Lesk algorithm is the classical knowledge-based WSD algorithm. It assumes that words in a given “neighborhood” (a portion of text) will share a common theme. In a simplified version of the Lesk algorithm, the dictionary definition of an ambiguous word is compared to the terms in its neighborhood, as sketched below.
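Here is a minimal sketch of that simplified Lesk idea, scoring each WordNet gloss by its word overlap with the target word’s neighborhood (assuming NLTK and its WordNet data are installed):

from nltk.corpus import wordnet as wn

def simplified_lesk(context_words, target_word):
    # Pick the sense whose gloss shares the most words with the context.
    context = set(word.lower() for word in context_words)
    best_sense, best_overlap = None, -1
    for sense in wn.synsets(target_word):
        gloss_words = set(sense.definition().lower().split())
        overlap = len(context & gloss_words)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("I deposited my money in the bank".split(), "bank"))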

Subtopics:

1. Supervised methods for WSD
2. Unsupervised methods for WSD
3. Knowledge-based methods for WSD
4. Distributional methods for WSD
5. Hybrid methods for WSD
6. Evaluation metrics for WSD
7. Applications of WSD in NLP tasks such as machine translation,
information retrieval, and text summarization.
8. Limitations and challenges in WSD research
9. Recent developments and future directions in WSD
10. Annotation schemes and tools for WSD

Example:

For example, consider the word “bank” in the sentence “I deposited my money in the bank.” Without WSD, it would be difficult for a computer to determine whether the word “bank” refers to a financial institution or the edge of a river. However, with WSD, the computer can use context clues such as “deposited” and “money” to determine that the intended meaning of “bank” in this sentence is a financial institution. This will improve the accuracy of natural language understanding and machine translation, as the computer will understand that the sentence is talking about depositing money in a bank account, not at the edge of a river.
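NLTK ships its own implementation of the Lesk algorithm, which can be tried directly on this example (a sketch assuming NLTK and its WordNet data; because the decision rests purely on gloss overlap, the returned synset may not always match intuition):

from nltk.wsd import lesk

sentence = "I deposited my money in the bank"
# Restrict candidates to noun senses of "bank" and let gloss overlap decide.
sense = lesk(sentence.split(), "bank", pos="n")
print(sense, "-", sense.definition() if sense else "no sense found")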
