
Unit 4

Syntax
CFG, Probabilistic CFG
Word’s Constituency (Phrase level, Sentence level),
Parsing (Top-Down and Bottom-Up),
CYK Parser, Probabilistic Parsing

Natural Language Processing (NLP)


MDS 555
Objective

CFG

Probabilistic CFG

Word’s Constituency (Phrase level, Sentence
level)

Parsing (Top-Down and Bottom-Up)

CYK Parser

Probabilistic Parsing
Grammar

Grammar is the structure and system of a
language

It consists of
– Syntax
– Morphology
Constituency

Syntactic constituency is the idea that groups of
words can behave as single units, or
constituents.

Part of developing a grammar involves building
an inventory of the constituents in the language.

Constituents
– are groups of words behaving as single units and
consist of phrases, words, or morphemes
Constituency

Consider the noun phrase, a sequence of words surrounding at
least one noun.
– Here are some examples of noun phrases

What evidence do we have that these words group together (or
“form constituents”)?
– One piece of evidence is that they can all appear in similar
syntactic environments, for example, before a verb
Morphemes

A morpheme is the smallest unit of meaning in a
language.

A typical word consists of one or more morphemes

Morphemes need not stand alone as words; a single
word can comprise multiple morphemes

For example, “apple” is a word and also a morpheme.
“Apples” is a word comprised of two morphemes, “apple”
and “-s”, which is used to signify the noun is plural.
Words

Words are the smallest units with independent
meanings, making them the focus of our
analysis

In NLP, at the word level we perform POS tagging

Primary POS tags
– Noun (N), Verb (V), Adjective (ADJ), Adverb (ADV)
Phrases

A phrase can consist of a single word or a combination of
words, depending on its position and role in a sentence.

There are five major categories of phrases:
– Noun Phrase (NP)
– Verb Phrase (VP)
– Adjective Phrase (ADJP)
– Adverb Phrase (ADVP)
– Prepositional Phrase (PP)
Phrases

Noun phrase (NP): These phrases revolve around a noun as the head word and often
serve as subjects or objects of verbs. They can typically be replaced by a pronoun
without affecting the sentence’s syntactical correctness.

Verb phrase (VP): Verb phrases have a verb as the head word, and they can take
various forms. Some include components beyond the finite verb, while others consist of
the finite verb itself. These play a significant role in both constituency and dependency grammars.

Adjective phrase (ADJP): These phrases feature an adjective as the head word and
serve to describe or qualify nouns and pronouns in a sentence.

Adverb phrase (ADVP): Adverb phrases use an adverb as the head word and serve as
modifiers for verbs, adjectives, or other adverbs.

Prepositional phrase (PP): Prepositional phrases involve a preposition as the head
word and other lexical components, providing additional details that describe other words
or phrases.
Context Free Grammars

A widely used formal system for modeling constituent
structure in natural language is the context-free grammar, or
CFG.

Context-free grammars are also called phrase-structure
grammars, and the formalism is equivalent to Backus-Naur
form (BNF).

The idea of basing a grammar on constituent structure dates
back to the psychologist Wilhelm Wundt (1900) but was not
formalized until Chomsky (1956) and, independently, Backus
(1959).
CFG

A context-free grammar consists of a set of rules or productions,
each of which expresses the ways that symbols of the language can
be grouped and ordered together, and a lexicon of words and
symbols.
NP → Det Nominal
NP → ProperNoun
Nominal → Noun | Nominal Noun

– For example, the above productions express that an NP (or noun phrase) can
be composed of

either a Proper Noun or a determiner (Det) followed by a Nominal;

a Nominal in turn can consist of one or more Nouns.
CFG

A CFG can be thought of in two ways:

as a device for generating sentences and

as a device for assigning a structure to a given sentence.


Viewing a CFG as a generator,
– we can read the → arrow as “rewrite the symbol on the left with the string
of symbols on the right”.

So starting from the symbol: NP

we can use our first rule to rewrite NP as: Det Nominal

and then rewrite Nominal as: Noun

and finally rewrite these parts-of-speech as: a flight
CFG

We say the string a flight can be derived from
the non-terminal NP
– A CFG can be used to generate a set of strings. This sequence
of rule expansions is called a derivation of the string of words
– It is common to represent a derivation
by a parse tree
(commonly shown inverted with
the root at the top)
CFG


We can use this grammar to generate sentences of this “language”.
– We start with S, expand it to NP VP, then choose a random expansion of NP (let’s
say, to I)
– and a random expansion of VP (let’s say, to Verb NP),
– and so on until we generate the string
I prefer a morning flight
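
Below is a minimal sketch of this generator view in Python. The grammar dictionary mirrors the NP rules above; the Pronoun entry and the small lexicon (“prefer”, “morning”, “flight”) are assumptions added so the example is runnable, not the slides’ full grammar.

```python
import random

# Toy grammar in the generator view: keys are non-terminals, values are
# lists of alternative right-hand sides. Lexicon entries are illustrative.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Pronoun"], ["Det", "Nominal"]],
    "Nominal": [["Noun"], ["Nominal", "Noun"]],
    "VP": [["Verb", "NP"]],
    "Pronoun": [["I"]],
    "Det": [["a"]],
    "Noun": [["morning"], ["flight"]],
    "Verb": [["prefer"]],
}

def generate(symbol):
    """Rewrite `symbol` with a random production until only words remain."""
    if symbol not in GRAMMAR:            # terminal: emit the word itself
        return [symbol]
    words = []
    for sym in random.choice(GRAMMAR[symbol]):
        words += generate(sym)
    return words

print(" ".join(generate("S")))           # e.g. "I prefer a morning flight"
```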
Context-Free Grammar (CFG)

Formal Definition
– A context-free grammar (CFG) G is a
quadruple (N, Σ, R, S) where
– N is a finite set of non-terminal symbols
– Σ is a finite set of terminal symbols, disjoint from N
– R is a finite set of production rules of the form A → α,
where A ∈ N and α is a string over (N ∪ Σ)
– S ∈ N is the start symbol
CFG - Example

N = {q, f}

Σ = {0, 1}

R = {q → 11q, q → 00f,

f → 11f, f → ε }

S=q

(R= {q → 11q | 00f, f → 11f | ε })
CFG - Rules

If A → B, then xAy → xBy, and we say that

xAy derives xBy.


If s → ··· → t, then we write s →* t.

A string x in Σ* is generated by G = (N, Σ, R, S)
if S →* x.

L(G) = { x in Σ* | S →* x }.
CFG - Example

G = ({S}, {0,1}, {S → 0S1 | ε }, S)

ε in L(G) because
S → ε.

01 in L(G) because
S → 0S1 → 01.

0011 in L(G) because
S → 0S1 → 00S11 → 0011.

L(G) = { 0^n 1^n | n ≥ 0 }
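
Since ε, 01, and 0011 are all derived by nesting the rule S → 0S1, a recognizer for L(G) can simply peel one 0 and one 1 off the ends at a time. A minimal sketch (the function name is an arbitrary choice):

```python
def in_L(s: str) -> bool:
    """Recognize L(G) = {0^n 1^n | n >= 0} by mimicking S -> 0S1 | eps."""
    if s == "":                            # S -> eps
        return True
    if s.startswith("0") and s.endswith("1"):
        return in_L(s[1:-1])               # S -> 0 S 1, recurse on the middle
    return False

assert in_L("") and in_L("01") and in_L("0011")
assert not in_L("011") and not in_L("10")
```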
Context-free Language (CFL)

A language L is context-free if there exists a
CFG G such that L = L(G).

A grammar G generates a language L
Example

G = (N, T, S, P), where

N = {S, NP, VP, PP, Det, Noun, Verb, Aux, Pre}

T = {‘a’, ‘ate’, ‘cake’, ‘child’, ‘fork’, ‘the’, ‘with’}

S = S

P = { S → NP VP
NP → Det Noun | NP PP
PP → Pre NP
VP → Verb NP
Det → ‘a’ | ‘the’
Noun → ‘cake’ | ‘child’ | ‘fork’
Pre → ‘with’
Verb → ‘ate’ }
Example

Some notes:
– Note 1: In P, the pipe symbol (|) combines productions that share the
same left-hand side (LHS) into a single representation.

For example, Det → ‘a’ | ‘the’ abbreviates the two rules Det → ‘a’ and Det →
‘the’. It denotes two rules, not one.
– Note 2: In the slide’s colour coding, the productions highlighted in red are referred to as the grammar,
and those in green (the rules that rewrite categories to words) as the lexicon.
– Note 3:

NP – Noun Phrase, VP – Verb Phrase, PP – Prepositional Phrase, Det –
Determiner, Aux – Auxiliary verb
Sample derivation

Using the productions P above:

S → NP VP
→ Det Noun VP
→ the Noun VP
→ the child VP
→ the child Verb NP
→ the child ate NP
→ the child ate Det Noun
→ the child ate a Noun
→ the child ate a cake
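
A minimal sketch of replaying this leftmost derivation in Python. The rule dictionary encodes P above, and `derive` expands the leftmost non-terminal at each step; the index sequence passed in is chosen to reproduce the slide’s derivation.

```python
# The grammar P as Python data: each non-terminal maps to its alternatives.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["NP", "PP"]],
    "PP": [["Pre", "NP"]],
    "VP": [["Verb", "NP"]],
    "Det": [["a"], ["the"]],
    "Noun": [["cake"], ["child"], ["fork"]],
    "Pre": [["with"]],
    "Verb": [["ate"]],
}

def derive(choices):
    """Apply a leftmost derivation; `choices[i]` picks the alternative
    used at step i. Returns every intermediate sentential form."""
    form = ["S"]
    steps = [" ".join(form)]
    for idx in choices:
        for i, sym in enumerate(form):
            if sym in RULES:                     # leftmost non-terminal
                form = form[:i] + RULES[sym][idx] + form[i + 1:]
                break
        steps.append(" ".join(form))
    return steps

# Reproduces the slide: S -> NP VP -> ... -> the child ate a cake
for step in derive([0, 0, 1, 1, 0, 0, 0, 0, 0]):
    print(step)
```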
Probabilistic Context Free Grammar (PCFG)

PCFG is an extension of CFG with a probability for each
production rule

Ambiguity is the reason why we use the probabilistic version
of CFG
– For instance, some sentences may have more than one underlying derivation.
– The sentence can be parsed in more than one way.
– In this case, the parse of the sentence becomes ambiguous.


To resolve this ambiguity, we can use a PCFG to find the
probability of each parse of the given sentence
PCFG - Definition

A probabilistic context free grammar G is a quintuple G = (N, T, S, R, P)
where
– (N, T, S, R) is a context free grammar,
where N is the set of non-terminal (variable) symbols, T is the set of terminal symbols, S is the start
symbol, and R is the set of production rules, each of the form A → s with A ∈ N and s a string over (N ∪ T)

– A probability P(A → s) for each rule in R. The properties governing the probabilities
are as follows:

P(A → s) is the conditional probability of choosing the rule A → s in a left-most derivation, given that
A is the non-terminal that is expanded.

The value of each probability lies between 0 and 1.

The sum of the probabilities of all rules with A as the left-hand side non-terminal must equal 1:
Σs P(A → s) = 1.
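
A minimal sketch of checking this last condition in Python. The probability values below are illustrative assumptions; they are not the values from the slides’ table.

```python
from collections import defaultdict

# Each entry is (LHS, RHS, probability); probabilities are assumed
# for illustration only.
PCFG_RULES = [
    ("S",    ("NP", "VP"),    1.0),
    ("NP",   ("Det", "Noun"), 0.7),
    ("NP",   ("NP", "PP"),    0.3),
    ("PP",   ("Pre", "NP"),   1.0),
    ("VP",   ("Verb", "NP"),  1.0),
    ("Det",  ("a",),          0.5),
    ("Det",  ("the",),        0.5),
    ("Noun", ("cake",),       0.4),
    ("Noun", ("child",),      0.4),
    ("Noun", ("fork",),       0.2),
    ("Pre",  ("with",),       1.0),
    ("Verb", ("ate",),        1.0),
]

def is_proper(rules):
    """Check that each P(A -> s) is in [0, 1] and that the probabilities
    of all rules sharing a left-hand side sum to 1."""
    totals = defaultdict(float)
    for lhs, _rhs, p in rules:
        if not 0.0 <= p <= 1.0:
            return False
        totals[lhs] += p
    return all(abs(t - 1.0) < 1e-9 for t in totals.values())

print(is_proper(PCFG_RULES))  # True
```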
PCFG - Example

Probabilistic Context Free Grammar G = (N, T, S, R, P)

N = {S, NP, VP, PP, Det, Noun, Verb, Pre}

T = {‘a’, ‘ate’, ‘cake’, ‘child’, ‘fork’, ‘the’, ‘with’}

S=S

R={ S → NP VP
NP → Det Noun | NP PP
PP → Pre NP
VP → Verb NP
Det → ‘a’ | ‘the’
Noun → ‘cake’ | ‘child’ | ‘fork’
Pre → ‘with’
Verb → ‘ate’ }
PCFG - Example

P = R with associated probabilities,
as in the table below

Observe from the table that the sum of the probability values for all rules that have the same left-hand side is 1
Parse

To parse: to resolve (a sentence) into its component parts
and describe their syntactic roles.

In NLP, a parse can be visualized in tree
form
Syntax Parsing
Syntax parsing is also widely used in programming, e.g.:
b = c + 1;
a = a - d;
Parse Tree

A parse of the sentence "the giraffe dreams" is:
– s => np vp => det n vp => the n vp => the giraffe vp
=> the giraffe iv => the giraffe dreams
Parsing

In natural language processing, parsing is the
process of analyzing a sentence to determine its
grammatical structure

There are two main approaches to parsing:
– top-down parsing
– bottom-up parsing
Top-down Parsing

Top-down parsing is a parsing technique that starts with the
highest level of a grammar’s production rules, and then works its
way down to the lowest level.
– It begins with the start symbol of the grammar and applies the production
rules recursively to expand it into a parse tree.
– One example of a top-down parsing algorithm is the Recursive Descent
Parsing.
Top-down Parsing

For example, consider the following CFG:
S -> NP VP
NP -> Det N
VP -> V NP
Det -> the | a
N -> dog | cat | boy | girl
V -> chased | hugged


A top-down parser would begin with the start symbol “S” and then
apply the production rule “S -> NP VP” to expand it into “NP VP”.

The parser would then apply the production rule “NP -> Det N” to
expand “NP” into “Det N”.
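
A minimal recursive-descent sketch of this top-down process for the grammar above. The function and variable names are arbitrary, and real parsers add memoization and fuller backtracking; this is just enough to accept the example sentences.

```python
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N":   [["dog"], ["cat"], ["boy"], ["girl"]],
    "V":   [["chased"], ["hugged"]],
}

def parse(symbol, words, pos):
    """Top-down: expand `symbol`, matching words from `pos` onward.
    Returns the position after the matched span, or None on failure."""
    if symbol not in GRAMMAR:                     # terminal word
        return pos + 1 if pos < len(words) and words[pos] == symbol else None
    for production in GRAMMAR[symbol]:            # try each alternative
        p = pos
        for sym in production:
            p = parse(sym, words, p)
            if p is None:
                break
        else:
            return p                              # whole production matched
    return None

sentence = "the dog chased the cat".split()
print(parse("S", sentence, 0) == len(sentence))   # True: sentence accepted
```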
Bottom-up Parsing

Bottom-up parsing is a parsing technique that starts with the
sentence’s words and works its way up to the highest level of the
grammar’s production rules.

It begins with the input sentence and applies the production rules
in reverse, reducing the input sentence to the start symbol of the
grammar.

One example of a bottom-up parsing algorithm is shift-reduce
parsing.
Bottom-up Parsing

For example, consider the following CFG:
S -> NP VP
NP -> Det N
VP -> V NP
Det -> the | a
N -> dog | cat | boy | girl
V -> chased | hugged


A bottom-up parser would begin with the input sentence “the dog chased the cat” and would apply the
production rules in reverse to reduce it to the start symbol “S”.

The parser would start by matching
– “the dog” to the “Det N” production rule,
– “chased” to the “V” production rule, and
– “the cat” to another “Det N” production rule.


These reduce steps will be repeated until the input sentence is reduced to “S”, the start symbol of the
grammar.
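
A minimal shift-reduce sketch of this process for the same grammar. The control strategy here (always reduce when possible, otherwise shift) is an assumption that happens to work for this grammar; real shift-reduce parsers resolve shift/reduce conflicts more carefully.

```python
# (RHS, LHS) pairs: reducing replaces a matching stack top by the LHS.
RULES = [
    (("Det", "N"), "NP"),
    (("V", "NP"), "VP"),
    (("NP", "VP"), "S"),
    (("the",), "Det"), (("a",), "Det"),
    (("dog",), "N"), (("cat",), "N"), (("boy",), "N"), (("girl",), "N"),
    (("chased",), "V"), (("hugged",), "V"),
]

def shift_reduce(words):
    """Shift words onto a stack; reduce whenever the stack top matches
    the right-hand side of a rule. Succeeds if only 'S' remains."""
    stack, buffer = [], list(words)
    while True:
        for rhs, lhs in RULES:                     # try to reduce first
            n = len(rhs)
            if len(stack) >= n and tuple(stack[-n:]) == rhs:
                stack[-n:] = [lhs]
                break
        else:
            if not buffer:
                break                              # nothing left to do
            stack.append(buffer.pop(0))            # otherwise: shift
    return stack

print(shift_reduce("the dog chased the cat".split()))  # ['S']
```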
Probability of a parse tree

Use of PCFG
– A sentence can be parsed in more than one way
– We can have more than one parse trees for the
sentence as per the CFG due to ambiguity.
Probability of a parse tree

Given a parse tree t,
– with the production rules α1 → β1, α2 → β2, … , αn → βn
– from R (i.e., αi → βi ∈ R), we can find the probability of tree t under the
PCFG as follows:

P(t) = P(α1 → β1) × P(α2 → β2) × … × P(αn → βn)

As per the equation, the probability P(t) of a parse tree is the
product of the probabilities of the production rules used in the tree t.
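
A minimal sketch of this product in Python, reusing the illustrative rule probabilities assumed earlier for the “the child ate a cake” grammar:

```python
from math import prod

# Illustrative probabilities (assumed, as before), keyed by (LHS, RHS).
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "Noun")): 0.7,
    ("VP", ("Verb", "NP")): 1.0,
    ("Det", ("the",)): 0.5,
    ("Det", ("a",)): 0.5,
    ("Noun", ("child",)): 0.4,
    ("Noun", ("cake",)): 0.4,
    ("Verb", ("ate",)): 1.0,
}

# The rules used in the parse tree of "the child ate a cake":
tree_rules = [
    ("S", ("NP", "VP")), ("NP", ("Det", "Noun")), ("Det", ("the",)),
    ("Noun", ("child",)), ("VP", ("Verb", "NP")), ("Verb", ("ate",)),
    ("NP", ("Det", "Noun")), ("Det", ("a",)), ("Noun", ("cake",)),
]

p_t = prod(RULE_PROB[r] for r in tree_rules)
print(round(p_t, 6))  # 0.0196 with these assumed probabilities
```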
Probability of a parse tree

Which is the most probable tree?
– The probability of the parse tree t1 is greater than
the probability of parse tree t2. Hence, t1 is the more
probable of the two parses.
Probability of a sentence

Probability of a sentence is the sum of the probabilities of all
parse trees that can be derived for the sentence under the
PCFG:

P(s) = Σt P(t), summing over every parse tree t of the sentence s


Probability of the sentence “astronomers saw the stars
with ears”
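
For instance, if the sentence has exactly two parse trees t1 and t2 under the PCFG, then P(astronomers saw the stars with ears) = P(t1) + P(t2); with hypothetical values P(t1) = 0.0009 and P(t2) = 0.0007, this gives P(s) = 0.0016.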
Ambiguity

Ambiguity is the most serious problem faced by
syntactic parsers

The most common ambiguity is
– Structural ambiguity
Ambiguity

The phrase in my pajamas can be part of the NP
headed by elephant or a part of the VP headed
by shot

Two parse trees for an ambiguous
sentence. The parse on the left
corresponds to the humorous reading
in which the elephant is in the
pajamas; the parse on the right
corresponds to the reading in which
Captain Spaulding did the shooting in his pajamas
Self Study

Chomsky Normal Form (CNF)

Cocke–Younger–Kasami (CYK) algorithm
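
Since the CYK algorithm is listed for self-study, here is a minimal recognizer sketch to start from. It assumes a grammar already in Chomsky Normal Form (every rule is A → B C or A → word); the grammar below is a CNF-ready fragment of the earlier example, with the PP rules omitted for brevity.

```python
from itertools import product

# CNF binary rules, keyed by the right-hand side pair.
CNF_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "Noun"): {"NP"},
    ("Verb", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"}, "a": {"Det"},
    "child": {"Noun"}, "cake": {"Noun"},
    "ate": {"Verb"},
}

def cyk(words):
    """CYK recognition: table[i][j] holds the non-terminals that can
    derive the span words[i..j] (inclusive)."""
    n = len(words)
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):                 # length-1 spans
        table[i][i] = set(LEXICON.get(w, ()))
    for span in range(2, n + 1):                  # longer spans, bottom-up
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                 # every split point
                for B, C in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= CNF_RULES.get((B, C), set())
    return "S" in table[0][n - 1]

print(cyk("the child ate a cake".split()))  # True
```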
Treebank

A corpus in which every sentence is annotated with a
parse tree is called a treebank

Treebanks play an important role in parsing as well as in
linguistic investigations of syntactic phenomena
Reference

Jurafsky, D. and Martin, J. H., Speech and Language Processing (3rd Edition), Chapter 17

Automatic Generation of Python Programs Using Context-Free Grammars:
https://arxiv.org/pdf/2403.06503v1
Thank you