
NLP Unit 03

The document discusses key concepts in Natural Language Processing (NLP), focusing on grammars and parsing techniques such as top-down and bottom-up parsing. It explains the role of parsers in validating sentence structures and introduces morphological analysis, augmented transition networks, and common parsing issues like ambiguity and error propagation. Additionally, it covers the importance of feature systems in capturing linguistic details beyond basic syntax.

Natural Language Processing (KOE-088)
Unit - 03 (Grammars And Parsing)

Dr. Abdul Kalam Technical University, Lucknow


Agenda

Grammar, Parser
Basic Methods of Searching
The Top-Down Chart Parsing Algorithm
Feature System & Augmented Grammar
Morphological Analysis
Issues in Parsing & Various Techniques
Augmented Transition Network

Grammar
In Natural Language Processing (NLP),
grammar refers to a set of rules that define the
structure of sentences in a language. These rules
specify how words can be combined to form
phrases, clauses, and sentences that are
syntactically correct.

Key Points About Grammar in NLP

Structure
Syntax
Formal Representation
Context-Free Grammar (CFG)
Probabilistic Context-Free Grammar (PCFG)
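
A CFG and a PCFG can both be written down as plain data. The following is a minimal sketch, assuming a toy grammar whose rules and probabilities are invented for illustration; the helper `is_normalised` is a hypothetical name. In a PCFG, the probabilities of all rules sharing the same left-hand side should sum to 1.

```python
# Hypothetical toy grammar; rules and probabilities are invented for illustration.
# CFG: each non-terminal maps to a list of possible right-hand sides.
CFG = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}

# PCFG: the same rules, each paired with a probability.
PCFG = {
    "S": [(["NP", "VP"], 1.0)],
    "NP": [(["Det", "N"], 1.0)],
    "VP": [(["V", "NP"], 0.7), (["V"], 0.3)],
}


def is_normalised(pcfg, tol=1e-9):
    """Return True if the rule probabilities of every left-hand side sum to 1."""
    return all(
        abs(sum(prob for _, prob in rules) - 1.0) <= tol
        for rules in pcfg.values()
    )
```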
Parser
A parser is a software component that takes input text and checks it
against the rules of a grammar. If the input is valid, the parser
generates a parse tree.

How is it used in NLP?

Grammar Checking
Intermediate Stage of Semantic Analysis
Concept of Parse Tree

A parse tree is a graphical representation of a derivation.

The start symbol is the root of the parse tree.

Leaf nodes are terminals.

Interior nodes are non-terminals.

Read from left to right, the leaves of a correct parse tree reproduce the input text.
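
The properties above can be checked on a small tree. This is a sketch only: the nested-tuple encoding, the example sentence, and the helper names `root` and `leaves` are assumptions made for illustration.

```python
# A parse tree as nested tuples: (label, child, child, ...).
# A leaf (terminal) is a plain string. The sentence is an invented example.
tree = (
    "S",
    ("NP", ("Det", "the"), ("N", "cat")),
    ("VP", ("V", "sleeps")),
)


def root(node):
    """Return the label at the top of the tree (the start symbol)."""
    return node[0]


def leaves(node):
    """Collect the terminals of the tree from left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words
```

Joining `leaves(tree)` with spaces gives back the input text, and `root(tree)` is the start symbol.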


Basic Methods of Searching

Top Down Parsing


Bottom Up Parsing
Top Down Parsing

1. It is a parsing strategy that starts at the highest level of the
parse tree (the start symbol) and works down the tree by applying
the rules of the grammar.
2. It attempts to find a leftmost derivation for the input string.
3. Parsing proceeds from the root down to the leaves.
4. The technique used is leftmost derivation: at each step the
leftmost non-terminal is expanded.
5. The main decision is which production rule to apply in order to
construct the string.
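
The points above can be sketched as a small recursive recognizer with backtracking. The grammar, the vocabulary, and the function names `expand` and `parse_top_down` are assumptions for illustration; the grammar has no left recursion, which this naive approach would not handle.

```python
# Hypothetical toy grammar (no left recursion). Symbols that are not keys
# of GRAMMAR are treated as terminals (words).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["chased"], ["slept"]],
}


def expand(symbols, words):
    """Return True if the symbol list can derive exactly the word list.

    The leftmost symbol is always expanded first, so a successful run
    traces a leftmost derivation.
    """
    if not symbols:
        return not words  # success only if all input has been consumed
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:  # terminal: must match the next word
        return bool(words) and words[0] == first and expand(rest, words[1:])
    # Non-terminal: try each production in turn (backtracking on failure).
    return any(expand(production + rest, words) for production in GRAMMAR[first])


def parse_top_down(sentence):
    """Start from the start symbol S and work down to the words."""
    return expand(["S"], sentence.split())
```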
Bottom Up Parsing

1. It is a parsing strategy that starts at the lowest level of the parse
tree (the input words) and works up the tree by applying the rules of
the grammar.
2. Bottom-up parsing can be defined as an attempt to reduce the input
string to the start symbol of the grammar.
3. Parsing proceeds from the leaves up to the root.
4. This parsing technique traces a rightmost derivation in reverse.
5. The main decision is when to apply a production rule to reduce the
string toward the start symbol.
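
A greedy shift-reduce recognizer gives a rough picture of these points. This is a sketch under assumptions: the rules and vocabulary are invented, and the parser reduces whenever it can, without backtracking. Practical bottom-up parsers need a better policy for deciding when to shift and when to reduce.

```python
# Hypothetical toy rules as (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("chased",)),
]


def reduce_once(stack):
    """Replace a rule's right-hand side on top of the stack with its left-hand side."""
    for lhs, rhs in RULES:
        n = len(rhs)
        if tuple(stack[-n:]) == rhs:
            stack[-n:] = [lhs]
            return True
    return False


def parse_bottom_up(sentence):
    """Shift words one at a time, reducing greedily after each shift."""
    stack = []
    for word in sentence.split():
        stack.append(word)         # shift
        while reduce_once(stack):  # reduce as long as some rule applies
            pass
    return stack == ["S"]          # success: input reduced to the start symbol
```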
Top Down Chart Parsing Algorithm

Top-down chart parsing is a parsing technique used in Natural Language
Processing (NLP) to analyze the syntactic structure of a sentence.

1. Initialization
2. Prediction
3. Scanning
4. Completion
5. Repetition
Initialization
Start with an empty chart (a table used to store intermediate parsing results).
Initialize the chart with the start symbol of the grammar at the root.
Prediction
For each non-terminal in the chart, use the grammar rules to predict possible
expansions (productions) of that non-terminal.
Add these expansions to the chart.
Scanning

Compare the next word in the input sentence with the terminals in the chart.
If there’s a match, add this information to the chart.
Completion
Once all parts of a rule match the input, mark this rule as completed in the chart.
Use completed rules to complete higher-level rules that depend on them.
Repetition
Repeat the prediction, scanning, and completion steps until the entire input is
parsed or no more expansions are possible.
Example
Consider a simple grammar for a fragment of English:

S → NP VP
NP → Det N
VP → V NP
Det → 'the'
N → 'cat' | 'mat'
V → 'sat on'

Input sentence: "the cat sat on the mat"
Note: the rule V → 'sat on' treats "sat on" as a single terminal.
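
The prediction, scanning, and completion steps can be sketched as an Earley-style recognizer, which is one well-known form of top-down chart parsing; the slides do not name a specific algorithm, so this choice is an assumption. The function name `chart_parse` and the item layout are also assumptions, and the input is pre-tokenized so that "sat on" arrives as one token, matching the grammar's V rule.

```python
# The example grammar; right-hand sides are tuples of symbols.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N")],
    "VP": [("V", "NP")],
    "Det": [("the",)],
    "N": [("cat",), ("mat",)],
    "V": [("sat on",)],
}


def chart_parse(tokens, start="S"):
    """Return True if the tokens can be derived from the start symbol."""
    # An item (lhs, rhs, dot, origin) means rhs[:dot] has been matched,
    # beginning at input position origin.
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in GRAMMAR[start]:                      # Initialization
        chart[0].add((start, rhs, 0, 0))
    for i in range(len(tokens) + 1):
        agenda = list(chart[i])
        while agenda:                               # Repetition
            lhs, rhs, dot, origin = agenda.pop()
            new, target = [], i
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in GRAMMAR:                  # Prediction
                    new = [(nxt, r, 0, i) for r in GRAMMAR[nxt]]
                elif i < len(tokens) and tokens[i] == nxt:  # Scanning
                    new, target = [(lhs, rhs, dot + 1, origin)], i + 1
            else:                                   # Completion
                new = [
                    (l, r, d + 1, o)
                    for (l, r, d, o) in chart[origin]
                    if d < len(r) and r[d] == lhs
                ]
            for item in new:
                if item not in chart[target]:
                    chart[target].add(item)
                    if target == i:
                        agenda.append(item)
    # Accept if a completed start rule spans the whole input.
    return any((start, rhs, len(rhs), 0) in chart[-1] for rhs in GRAMMAR[start])
```

For the example sentence, the call would be `chart_parse(["the", "cat", "sat on", "the", "mat"])`. Because the chart stores each item only once, work already done for one hypothesis is reused by others.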


Feature System

A feature system in Natural Language Processing (NLP) is a way to
represent additional information about words and phrases to
capture linguistic details that go beyond basic syntactic structure.

Features: Attributes or properties of linguistic elements.

Example: In the sentence "The dogs are running":
"dogs" has features: {number: plural}
"are" has features: {tense: present, number: plural}

Purpose: Helps in disambiguating and understanding the finer details of
language.
Example: Differentiating between "he" (singular, masculine) and "they" (plural).

Represents additional linguistic information to capture nuances in language.
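
Features can be stored as attribute-value pairs and compared. The following is a minimal sketch, assuming a small invented lexicon; `unify` and `agrees` are hypothetical helper names. It shows how a number feature lets a grammar accept "dogs are" and reject "dog are".

```python
# Hypothetical lexicon: each word maps to a set of features.
LEXICON = {
    "dogs": {"cat": "N", "number": "plural"},
    "dog": {"cat": "N", "number": "singular"},
    "are": {"cat": "V", "tense": "present", "number": "plural"},
    "is": {"cat": "V", "tense": "present", "number": "singular"},
}


def unify(a, b):
    """Merge two feature sets; return None if a shared feature conflicts."""
    merged = dict(a)
    for key, value in b.items():
        if key in merged and merged[key] != value:
            return None
        merged[key] = value
    return merged


def agrees(subject, verb):
    """Check subject-verb agreement by unifying the number features only."""
    subject_number = {"number": LEXICON[subject]["number"]}
    verb_number = {"number": LEXICON[verb]["number"]}
    return unify(subject_number, verb_number) is not None
```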


Morphological Analysis

Morphological analysis involves breaking down
words into their smallest meaningful units, called
morphemes, and understanding how these units
combine to form words.

Key Concepts in Morphological Analysis

Morphemes
Free & Bound
Word Formation
Inflection
Derivation
Compounding
Steps in Morphological Analysis

Identification of Morphemes:

Segmenting Words: Dividing words into their constituent
morphemes.
Example: "unhappiness" → "un-" + "happy" + "-ness"
Classifying Morphemes: Determining whether morphemes are
free or bound, and identifying their roles (prefix, suffix, root).

Analyzing Word Structure:

Root Identification: Finding the base morpheme that
carries the primary meaning.
Example: In "disapproval", the root is "approve".
Affix Identification: Identifying any prefixes, suffixes, or
infixes that modify the root.
Example: In "disapproval", the prefix is "dis-" and
the suffix is "-al".
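
Segmentation and root identification can be sketched with simple affix stripping. This is a toy illustration only: the affix lists, the `STEMS` table (which undoes the spelling changes y → i and the dropped final e), and the function name `segment` are assumptions, and the approach would mis-segment words such as "under".

```python
# Hypothetical affix lists covering only the two examples above.
PREFIXES = ["un", "dis"]
SUFFIXES = ["ness", "al"]
# Maps the surface form of a stem to its dictionary root.
STEMS = {"happi": "happy", "approv": "approve"}


def segment(word):
    """Split a word into prefix, root, and suffix (each part is optional except the root)."""
    prefix = next((p for p in PREFIXES if word.startswith(p)), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES if rest.endswith(s)), "")
    stem = rest[: len(rest) - len(suffix)] if suffix else rest
    root = STEMS.get(stem, stem)  # restore the dictionary spelling if known
    parts = []
    if prefix:
        parts.append(prefix + "-")  # bound morpheme: prefix
    parts.append(root)              # free morpheme: root
    if suffix:
        parts.append("-" + suffix)  # bound morpheme: suffix
    return parts
```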

Understanding Morphological Rules:

Inflectional Rules: Rules for adding
inflectional morphemes to indicate
grammatical features.
Example: Adding "-s" to form plurals
("cat" → "cats").
Derivational Rules: Rules for adding
derivational morphemes to create new words
or change word classes.
Example: Adding "-ly" to form adverbs
("quick" → "quickly").
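
The two kinds of rule can be written as small functions. This is a sketch that covers the examples above plus a few common English spelling adjustments, which are added here as assumptions; irregular forms such as "mouse" → "mice" are not handled, and the function names are hypothetical.

```python
VOWELS = "aeiou"


def pluralize(noun):
    """Inflectional rule: add the plural morpheme; the word stays a noun."""
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if len(noun) > 1 and noun.endswith("y") and noun[-2] not in VOWELS:
        return noun[:-1] + "ies"   # consonant + y: y becomes i
    return noun + "s"


def to_adverb(adjective):
    """Derivational rule: add "-ly", changing an adjective into an adverb."""
    if len(adjective) > 1 and adjective.endswith("y") and adjective[-2] not in VOWELS:
        return adjective[:-1] + "ily"
    return adjective + "ly"
```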
Examples of Morphological Analysis

Simple Inflection (dogs): "dog" + "-s"

Derivation (happiness): "happy" + "-ness"

Complex Inflection and Derivation (unbelievably): "un-" + "believe" + "-able" + "-ly"

Compounding (sunflower): "sun" + "flower"
Augmented Transition Network

An Augmented Transition Network (ATN) is a type of computational model used in Natural Language
Processing (NLP) for parsing sentences. An ATN is like a flowchart that processes sentences by moving
through different states according to specific rules, augmented with additional capabilities to handle complex
language.

Key Components

1. States: Points in the network representing different stages in the parsing process.

2. Transitions: Arrows connecting states, representing the rules for moving from one state to another.

3. Registers: Memory slots used to store information during parsing.

4. Tests and Actions: Conditions that must be met to follow a transition and actions to be taken (like
storing or manipulating data in registers).


How It Works

Start State: The initial state where the parsing begins.

End State: The final state representing the completion of parsing.

Arc: A transition that can involve:

Word Tests: Checking if the next word in the input matches specific criteria (e.g., part of speech).

Push: Handling recursive structures by temporarily moving to a subroutine or another part of the
network.

Pop: Returning from a subroutine or another part of the network after processing a nested structure.

Actions: Operations like storing parts of the sentence in registers for later use.
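
The components above can be sketched as a very small ATN interpreter. Everything here is an assumption made for illustration: the two networks (S and NP), the lexicon, the register names, and the agreement test. The sketch follows the first arc that succeeds and does not backtrack, which a full ATN parser would do.

```python
# Hypothetical lexicon with a part of speech and, where relevant, a number feature.
LEXICON = {
    "the": {"cat": "Det"},
    "dog": {"cat": "N", "number": "singular"},
    "dogs": {"cat": "N", "number": "plural"},
    "cat": {"cat": "N", "number": "singular"},
    "chases": {"cat": "V", "number": "singular"},
    "chase": {"cat": "V", "number": "plural"},
}


def subject_verb_agree(registers, word):
    """Test on the verb arc: the verb must agree in number with the subject head."""
    head = registers["SUBJ"]["HEAD"]
    return LEXICON[head]["number"] == LEXICON[word]["number"]


# Each arc is (arc type, label, register to fill, next state, optional test).
NETWORKS = {
    "S": {"start": "S0", "final": {"S3"}, "arcs": {
        "S0": [("PUSH", "NP", "SUBJ", "S1", None)],
        "S1": [("CAT", "V", "VERB", "S2", subject_verb_agree)],
        "S2": [("PUSH", "NP", "OBJ", "S3", None)],
    }},
    "NP": {"start": "NP0", "final": {"NP2"}, "arcs": {
        "NP0": [("CAT", "Det", "DET", "NP1", None)],
        "NP1": [("CAT", "N", "HEAD", "NP2", None)],
    }},
}


def run(network, words, pos):
    """Traverse one network from pos; return (registers, new pos) or None."""
    net = NETWORKS[network]
    state, registers = net["start"], {}
    while state not in net["final"]:
        for kind, label, register, target, test in net["arcs"][state]:
            if kind == "CAT":  # word test on the next input word
                if (pos < len(words)
                        and LEXICON.get(words[pos], {}).get("cat") == label
                        and (test is None or test(registers, words[pos]))):
                    registers[register] = words[pos]  # action: store the word
                    pos += 1
                    break
            elif kind == "PUSH":  # descend into a sub-network
                result = run(label, words, pos)
                if result is not None:
                    registers[register], pos = result  # store its registers
                    break
        else:
            return None  # no arc could be followed from this state
        state = target
    return registers, pos  # POP: return to the calling network


def parse(sentence):
    """Return the top-level registers if the whole sentence is accepted, else None."""
    words = sentence.split()
    result = run("S", words, 0)
    if result is None or result[1] != len(words):
        return None
    return result[0]
```

The registers make the result more than a yes/no answer: a successful parse returns the subject, verb, and object it recorded along the way, and the test on the verb arc rejects "the dogs chases the cat".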
Issues in Parsing

Parsing in NLP involves analyzing the grammatical structure of sentences to derive their syntactic
structure. Several issues can arise during parsing:

1. Ambiguity:

Lexical Ambiguity: A word can have multiple meanings.

Example: "bank" can mean a financial institution or the side of a river.

Syntactic Ambiguity: A sentence can have multiple valid parse trees.

Example: "I saw the man with the telescope" can mean either "I used a telescope to see the
man" or "I saw a man who had a telescope."

2. Complex Sentences:

Sentences with nested or long structures can be difficult to parse accurately.

Example: Sentences with multiple clauses or embedded phrases.


3. Non-Standard Grammar:

Informal or colloquial language, including slang and incomplete sentences, can be
challenging to parse.

Example: "Gonna go now."

4. Error Propagation:

Mistakes in tokenization or part-of-speech tagging can lead to parsing errors.

Example: Misidentifying a word's part of speech can lead to incorrect parse
trees.

5. Incomplete or Noisy Data:

Incomplete sentences or sentences with errors (e.g., typos) can complicate parsing.

Example: "She want to go" instead of "She wants to go."
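
Syntactic ambiguity can be made concrete by counting parse trees. The sketch below uses a CKY-style table over a small grammar in binary form; the grammar, the lexicon, and the name `count_parses` are assumptions chosen so that the telescope sentence from item 1 is covered. The prepositional phrase can attach either to the verb phrase (the seeing was done with the telescope) or to the noun phrase (the man has the telescope), so the sentence gets two trees.

```python
from collections import defaultdict

# Hypothetical grammar: binary rules as (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun phrase
    ("PP", "P", "NP"),
]
LEXICAL = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_parses(sentence, start="S"):
    """Count the distinct parse trees rooted in `start` that cover the sentence."""
    words = sentence.split()
    n = len(words)
    # table[(i, j)][A] = number of trees rooted in A that cover words[i:j]
    table = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for category in LEXICAL.get(word, []):
            table[(i, i + 1)][category] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                for parent, left, right in BINARY:
                    table[(i, j)][parent] += table[(i, k)][left] * table[(k, j)][right]
    return table[(0, n)][start]
```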


Thank you!
Do you have any questions?
