NATURAL LANGUAGE PROCESSING
UNIT 5
Ambiguity Resolution in NLP
Ambiguity resolution refers to the process of identifying and resolving ambiguities in language. Ambiguities arise when a word, phrase, or sentence can be
interpreted in multiple ways. Ambiguity can occur at various levels:
1. Example: "I saw the man with the telescope" can mean either the speaker used a telescope to see the man or the man had a telescope.
1. Example: "He promised to meet her at the bank" is semantically ambiguous if the context of 'bank' is not clear.
1. Example: "Can you pass the salt?" usually means a request rather than a question about ability.
Techniques for Ambiguity Resolution:
•Rule-Based Methods: Utilize predefined linguistic rules and context to resolve ambiguities.
•Machine Learning Approaches: Use statistical models trained on large corpora to predict the most likely interpretation.
•Disambiguation Algorithms: Algorithms designed specifically to handle certain types of ambiguities, like Word Sense Disambiguation (WSD)
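To make the rule-based idea concrete, here is a minimal sketch that disambiguates the word "bank" by matching hand-picked cue words against the surrounding sentence (the sense names and cue lists are invented for illustration):

```python
# Toy rule-based word sense disambiguation for "bank".
# Each sense is associated with a hand-picked set of cue words; the sense whose
# cues overlap most with the sentence wins.

SENSE_CUES = {
    "bank/financial_institution": {"money", "deposit", "loan", "account", "teller"},
    "bank/river_side":            {"river", "water", "shore", "fish", "mud"},
}

def disambiguate_bank(sentence):
    words = set(sentence.lower().split())
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    return max(scores, key=scores.get)

print(disambiguate_bank("She opened an account at the bank to deposit money"))
print(disambiguate_bank("They sat on the bank and watched the river"))
```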
Statistical Methods in NLP
Statistical methods in NLP involve the use of mathematical models and algorithms to understand and generate human language based on statistical properties derived from large datasets.
These methods have been widely adopted due to their ability to handle the variability and complexity of natural language. Key statistical techniques include:
1.N-grams: Sequences of 'n' items (words, characters) used to predict the next item in a sequence. Useful in tasks like language modeling and text generation.
• Example: In a trigram model, the probability of a word depends on the two preceding words.
2.Hidden Markov Models (HMMs): Used for tasks like part-of-speech tagging and speech recognition. HMMs model sequences where the system being modeled is assumed to be a
Markov process with hidden states.
3.Naive Bayes Classifiers: Simple probabilistic classifiers based on Bayes' theorem with strong (naive) independence assumptions between features (see the sketch after this list).
4.Conditional Random Fields (CRFs): Used for labeling and segmenting sequential data. Unlike HMMs, CRFs consider the context of the entire sequence for each prediction.
5.Maximum Entropy Models: Also known as logistic regression in some contexts, these models are used for classification tasks by choosing the probability distribution with the highest entropy that is consistent with the observed feature constraints.
6.Neural Networks and Deep Learning: Modern NLP relies heavily on deep learning techniques, including:
1. Recurrent Neural Networks (RNNs): Effective for sequential data and tasks like language modeling and translation.
2. Long Short-Term Memory Networks (LSTMs) and Gated Recurrent Units (GRUs): Variants of RNNs that handle long-range dependencies better.
3. Transformers: Architecture used in models like BERT, GPT, and T5, known for handling context over long text spans more efficiently
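To make the Naive Bayes item above concrete, here is a minimal from-scratch sketch trained on an invented toy dataset; a real system would use a large labeled corpus, better tokenization, and a library implementation:

```python
from collections import Counter, defaultdict
import math

# Toy training data: (document, class) pairs invented for illustration.
TRAIN = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]

def train(examples):
    class_counts = Counter()                 # documents per class
    word_counts = defaultdict(Counter)       # word_counts[cls][word]
    vocab = set()
    for text, cls in examples:
        class_counts[cls] += 1
        for word in text.split():
            word_counts[cls][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(text, class_counts, word_counts, vocab):
    total_docs = sum(class_counts.values())
    best_cls, best_logp = None, float("-inf")
    for cls in class_counts:
        # log P(cls) plus sum of log P(word | cls), with add-one (Laplace) smoothing.
        logp = math.log(class_counts[cls] / total_docs)
        total_words = sum(word_counts[cls].values())
        for word in text.split():
            logp += math.log((word_counts[cls][word] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_cls, best_logp = cls, logp
    return best_cls

class_counts, word_counts, vocab = train(TRAIN)
print(classify("win a free prize", class_counts, word_counts, vocab))  # -> spam
```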
BASIC THEORY OF PROBABILITY
Probability theory plays a crucial role in Natural Language Processing (NLP) by providing a mathematical
framework to model uncertainty and make predictions about language. Here’s an overview of basic probability
concepts as applied to NLP:
1. Probability Basics
•Probability: The likelihood of an event occurring, expressed as a number between 0 and 1. For example, the
probability of flipping a coin and it landing heads is 0.5.
•Random Variable: A variable that can take different values based on the outcome of a random phenomenon. For
instance, a random variable could represent the next word in a sentence.
•Probability Distribution: A function that describes the likelihood of different outcomes. For example, the
probability distribution of a die roll assigns a probability of 1/6 to each of the six possible outcomes
2. Conditional Probability
•Conditional Probability: The probability of an event occurring given that another event has occurred. It's denoted as
P(A|B), the probability of A given B.
• In NLP, this can be the probability of a word appearing given the previous words, which is crucial for language
modeling.
3. Bayes' Theorem
•Bayes' Theorem: A fundamental theorem that relates conditional probabilities. It’s used to update the probability estimate of
an event based on new evidence.
• In NLP, Bayes' theorem can be used for tasks like spam detection, where we update the probability that an email is
spam based on the presence of certain words
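As a worked illustration with invented numbers: suppose 20% of all emails are spam, the word "free" occurs in 60% of spam emails, and it occurs in 5% of non-spam emails. Bayes' theorem then gives the probability that an email containing "free" is spam:

```latex
P(\text{spam} \mid \text{free})
  = \frac{P(\text{free} \mid \text{spam})\, P(\text{spam})}{P(\text{free})}
  = \frac{0.6 \times 0.2}{0.6 \times 0.2 + 0.05 \times 0.8}
  = \frac{0.12}{0.16} = 0.75
```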
4. Language Models
•Language Models: Models that assign probabilities to sequences of words. They help in predicting the next word in a sentence, generating text, and
understanding the structure of language.
• N-gram Models: Simplest form of language models that use the conditional probability of a word given the previous n-1 words.
• Example: In a bigram model (n=2), the probability of a sentence is the product of the conditional probabilities of each word given the previous
word.
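For instance, ignoring sentence-boundary markers, a bigram model scores the short sentence "the cat sat" as:

```latex
P(\text{the cat sat}) \approx P(\text{the}) \times P(\text{cat} \mid \text{the}) \times P(\text{sat} \mid \text{cat})
```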
5. Markov Models
•Markov Assumption: A simplifying assumption that the probability of a word depends only on a limited history of previous words.
• Hidden Markov Models (HMMs): A statistical model where the system being modeled is assumed to follow a Markov process with hidden states.
Commonly used in part-of-speech tagging, speech recognition, etc.
6. Applications in NLP
•Text Classification: Using probability to classify documents into categories (e.g., spam vs. non-spam).
•Speech Recognition: Modeling the probability of sequences of phonemes or words to recognize spoken language.
•Machine Translation: Estimating the probability of translating one sequence of words into another.
•Information Retrieval: Ranking documents based on the probability that they are relevant to a query
PROBABILISTIC MODELS OF NATURAL LANGUAGE PROCESSING
Probabilistic models of Natural Language Processing (NLP) are frameworks that leverage statistical methods to predict and analyze language
patterns. These models operate on the principle that language can be understood in terms of probabilities of sequences of words or other
linguistic units. Here are some key concepts and components of probabilistic models in NLP
•Probability Theory: Uses probabilities to model linguistic events and predict word sequences.
•N-grams:
  • Bigram Model: Predicts the next word based on the previous word.
•Hidden Markov Models (HMMs):
  • Model sequences of observable events (words) and hidden states (linguistic categories).
  • Used in part-of-speech tagging and named entity recognition.
•Bayesian Networks:
  • Graphical models that represent probabilistic dependencies among linguistic variables.
•Latent Dirichlet Allocation (LDA):
  • Assumes documents are mixtures of topics, and topics are mixtures of words.
•Language Models:
  • Assign probabilities to sequences of words, as described earlier.
•Maximum Likelihood Estimation (MLE):
  • Method for estimating model parameters to maximize the likelihood of observed data.
  • Enhances model accuracy in predicting language patterns.
Probabilistic models handle uncertainty and variability in language and are often combined with deep learning to build robust NLP systems.
Part-of-Speech (POS) Tagging in Natural Language Processing (NLP)
Definition: Part-of-speech tagging, or POS tagging, is the process of labeling each word in a text with its corresponding part of speech, such as noun, verb,
adjective, adverb, etc. This tagging helps in understanding the syntactic structure of the text and aids in various NLP tasks
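For a quick hands-on illustration, NLTK ships a pretrained English tagger (this assumes NLTK is installed and its tokenizer and tagger resources have been downloaded; the exact tags may vary by NLTK version):

```python
import nltk

# One-time downloads (uncomment on first run):
# nltk.download("punkt")
# nltk.download("averaged_perceptron_tagger")

tokens = nltk.word_tokenize("The quick brown fox jumps over the lazy dog")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'), ...]
```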
Importance
•POS tags capture the syntactic role of each word, providing information that downstream tasks such as parsing, named entity recognition, and information extraction rely on.
Techniques
1. Rule-Based Tagging: Assigns tags using hand-written linguistic rules.
   • Example: "If a word ends in '-ing', tag it as a verb (VBG)."
2. Statistical Tagging: Uses probabilistic models such as Hidden Markov Models (HMMs) trained on tagged corpora (a minimal Viterbi sketch follows this list).
3. Machine Learning-Based Tagging:
   • Algorithms: Decision Trees, Maximum Entropy Models, Support Vector Machines (SVMs).
4. Hybrid Approaches: Combine rule-based and statistical or machine-learning methods.
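To make the statistical (HMM) tagging idea concrete, here is a minimal Viterbi decoding sketch; the tag set and all probability tables are invented toy numbers rather than estimates from a real corpus:

```python
# Toy HMM POS tagger: states are tags, observations are words.
TAGS = ["DET", "NOUN", "VERB"]

START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}          # P(first tag)
TRANS_P = {                                                # P(next tag | tag)
    "DET":  {"DET": 0.01, "NOUN": 0.89, "VERB": 0.10},
    "NOUN": {"DET": 0.10, "NOUN": 0.30, "VERB": 0.60},
    "VERB": {"DET": 0.50, "NOUN": 0.40, "VERB": 0.10},
}
EMIT_P = {                                                 # P(word | tag)
    "DET":  {"the": 0.9, "a": 0.1},
    "NOUN": {"dog": 0.4, "walk": 0.2, "park": 0.4},
    "VERB": {"walk": 0.6, "runs": 0.4},
}
FLOOR = 1e-6   # tiny probability for unseen word/tag pairs

def viterbi(words):
    """Return the most likely tag sequence for `words` under the toy HMM."""
    # best[t][tag] = (probability of the best path ending in `tag` at step t, backpointer)
    best = [{tag: (START_P[tag] * EMIT_P[tag].get(words[0], FLOOR), None) for tag in TAGS}]
    for word in words[1:]:
        column = {}
        for tag in TAGS:
            prob, prev = max(
                (best[-1][p][0] * TRANS_P[p][tag] * EMIT_P[tag].get(word, FLOOR), p)
                for p in TAGS
            )
            column[tag] = (prob, prev)
        best.append(column)
    # Trace back from the highest-probability final tag.
    tag = max(TAGS, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))

print(viterbi("the dog walk".split()))   # -> ['DET', 'NOUN', 'VERB']
```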
Evaluation Metrics
•Accuracy: The proportion of words assigned the correct tag.
Challenges
•Ambiguity: Words with multiple possible tags (e.g., "book" as noun or verb).
•Morphological Variations: Different forms of words (e.g., "run", "running").
Example Calculation Using Bigram Model
Given a corpus (for example, the sentences "the cat sat on the mat", "the cat is happy", and "the dog sat on the mat", which are consistent with the counts below):
1. Count Bigrams:
1. "the cat": 2
2. "cat sat": 1
3. "sat on": 2
4. "on the": 2
5. "the mat": 2
6. "cat is": 1
7. "is happy": 1
8. "the dog": 1
9. "dog sat": 1
DISCUSS ENCODING AMBIGUITY IN LOGICAL FORM
Ambiguity
Definition: Ambiguity arises when a word, phrase, sentence, or text can be interpreted in more than one way. It is a prevalent feature in
natural language, often leading to multiple possible meanings or interpretations
Types of Ambiguity:
•Syntactic: Sentences with multiple possible parses due to structure (e.g., "I saw the man with the telescope").
•Semantic: Sentences with multiple interpretations based on word meanings (e.g., "The chicken is ready to eat").
•Pragmatic: Interpretations influenced by context or situation (e.g., "Can you pass the salt?").
Encoding Ambiguity in Logical Form
Logical Form: Logical form is a structured representation of the semantic content of a sentence, often used in
formal semantics and computational linguistics to capture syntactic and semantic structure precisely
TYPES:
•Scope Ambiguity: Different interpretations of quantifiers, negations, or modals (e.g., "Everyone didn't go."; see the two logical forms after this list).
•Quantifier Ambiguity: Ambiguity in the order or grouping of quantifiers (e.g., "Every student read a book.").
•Attachment Ambiguity: Uncertainty about how modifying phrases attach to the sentence (e.g., "She saw the
man with the telescope").
•Coordination Ambiguity: Ambiguity in how coordinating conjunctions connect parts of the sentence (e.g., "She
likes coffee and tea.")
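For instance, the scope ambiguity in "Everyone didn't go." can be encoded as two distinct logical forms, depending on whether the universal quantifier or the negation takes wide scope:

```latex
\forall x\, \neg \mathrm{go}(x) \qquad \text{("nobody went": negation inside the scope of the quantifier)}
\neg \forall x\, \mathrm{go}(x) \qquad \text{("not everyone went": negation outside the quantifier)}
```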
Addressing Ambiguity:
•Disambiguation Algorithms: Resolve ambiguity using word sense disambiguation and syntactic parsing.
•Contextual Understanding: Utilize context and machine learning models (e.g., BERT, GPT) to infer meanings.
•Semantic Role Labeling: Identify semantic roles to clarify relationships in ambiguous sentences
WHAT IS WORD SENSE
Word Sense refers to the particular meaning that a word takes on in a specific context or situation. In natural language, many
words have multiple meanings or senses depending on how they are used. Understanding word senses is crucial in language
understanding tasks such as natural language processing (NLP), machine translation, and semantic analysis
1.Polysemy: Many words are polysemous, meaning they have multiple related senses. For example, "bank" can refer to a
financial institution, the side of a river, or a slope.
2.Homonymy: Some words are homonyms, meaning they have the same spelling and pronunciation but different meanings
unrelated to each other. For example, "bat" as a flying mammal and "bat" as a piece of sports equipment.
3.Context Dependence: The meaning of a word often depends on the context in which it is used. Context can include
surrounding words, the overall topic of discussion, or the speaker's intention.
4.Word Sense Disambiguation (WSD): WSD is the task of determining which sense of a word is intended in a given context.
It is a crucial step in various NLP applications to ensure accurate understanding and interpretation
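As a quick illustration of WSD, NLTK provides a simple Lesk-based disambiguator over WordNet senses (this assumes NLTK and its WordNet data are installed; the returned sense depends on WordNet's sense inventory):

```python
import nltk
from nltk.wsd import lesk

# One-time downloads (uncomment on first run):
# nltk.download("punkt")
# nltk.download("wordnet")

sentence = "He sat on the bank of the river and watched the water"
sense = lesk(nltk.word_tokenize(sentence), "bank")
print(sense, "-", sense.definition() if sense else "no sense found")
```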
WHAT IS BEST-FIRST PARSING
Best-first parsing is a parsing strategy that uses a scoring function (for example, probabilities from a statistical grammar) to decide which partial parse to expand next.
Objective:
•It aims to efficiently find the most promising parse tree by exploring the most likely
paths first, rather than exhaustively searching through all possible parse trees
•Efficiency: Prioritizes likely parse trees, reducing computational effort.
•Applications: Used in syntactic parsing, semantic parsing, and machine
translation.
•Advantage: Speeds up parsing while maintaining accuracy.
•Limitation: Dependency on the quality and coverage of parsing models
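To illustrate the idea, here is a minimal best-first parsing sketch: partial derivations are kept on a priority queue ordered by probability, so the most promising one is always expanded next (the toy grammar, lexicon, and probabilities are invented for illustration):

```python
import heapq

# Toy probabilistic grammar: nonterminal -> list of (rule probability, right-hand side).
GRAMMAR = {
    "S":  [(1.0, ("NP", "VP"))],
    "NP": [(0.6, ("Det", "N")), (0.4, ("N",))],
    "VP": [(0.7, ("V", "NP")), (0.3, ("V",))],
}
LEXICON = {                       # word -> (preterminal, probability)
    "the": ("Det", 1.0), "dog": ("N", 0.5), "cat": ("N", 0.5), "saw": ("V", 1.0),
}

def best_first_parse(words):
    """Return the probability of the best top-down derivation of `words` as an S, or None."""
    # Queue items: (-probability, symbols still to derive, words still to consume, probability)
    heap = [(-1.0, ("S",), tuple(words), 1.0)]
    while heap:
        _, symbols, remaining, prob = heapq.heappop(heap)
        if not symbols and not remaining:
            return prob                       # first complete parse popped is the most probable
        if not symbols or not remaining:
            continue                          # dead end: symbols or words left over
        head, rest = symbols[0], symbols[1:]
        if head in GRAMMAR:                   # expand a nonterminal with each of its rules
            for rule_p, rhs in GRAMMAR[head]:
                new_prob = prob * rule_p
                heapq.heappush(heap, (-new_prob, rhs + rest, remaining, new_prob))
        else:                                 # match a preterminal against the next word
            tag, word_p = LEXICON.get(remaining[0], (None, 0.0))
            if tag == head:
                new_prob = prob * word_p
                heapq.heappush(heap, (-new_prob, rest, remaining[1:], new_prob))
    return None

print(best_first_parse("the dog saw the cat".split()))   # -> 0.063
```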