
Natural Language Processing

By
Dr. L. Lakshmi
Course Code: DS311
Module-3
Module-3 Contents

➢ Context Free Grammars for English


➢ Syntax: Context-Free Rules and Trees
➢ Sentence-Level Constructions
➢ Agreement
➢ Subcategorization
➢ Parsing
➢ Top-Down and Earley Parsing
➢ Feature Structures
➢ Probabilistic Context-Free Grammars.
Context Free Grammars for English
• Constituency:
• How do words group together in English? Consider the Noun phrase, a sequence of words
surrounding at least one noun. Here are some examples of noun phrases
• Harry the Horse
• a high-class spot such as Mindy’s
• the Broadway coppers
• the reason he comes into the Hot Box
• they
• three parties from Brooklyn
• Constituency: Noun phrases function as single units or constituents within sentences.
• Evidence: These groups can appear in similar syntactic environments.
• Examples:
• Before a verb: "Three parties from Brooklyn arrive."
• "A high-class spot such as Mindy’s attracts attention."
• "The Broadway coppers love their job."
• "They sit at the table."
Context Free Grammars for English
• Constituency:
• While the entire noun phrase can occur before a verb, individual words within the phrase cannot be
separated or reordered arbitrarily.
• Examples:
• Grammatical: "Three parties from Brooklyn arrive."
• Non-Grammatical: "From arrive." "As attracts." "The is." "Spot is."
• These non-grammatical examples highlight that constituents must remain intact to maintain proper
sentence structure.
• Preposing and Postposing: Constituents can be moved to different parts of a sentence, maintaining
their integrity.
• Examples:
• Preposed: "On September seventeenth, I’d like to fly from Atlanta to Denver."
• Postposed: "I’d like to fly from Atlanta to Denver on September seventeenth."
• Non-Grammatical Examples:
• "On September, I’d like to fly seventeenth from Atlanta to Denver."
• "I’d like to fly on September from Atlanta to Denver seventeenth."
• Explanation: The integrity of the phrase is crucial. The entire phrase can be moved, but its internal
order cannot be changed without losing grammaticality.
Context Free Grammars for English
• Context-Free Grammars:
• A mathematical system used to model the constituent structure of languages like English.
• CFGs are also known as Phrase-Structure Grammars and are formally equivalent to Backus-
Naur Form (BNF).
• A context-free grammar consists of a set of rules or productions, each of which expresses the
ways that symbols of the language can be grouped and ordered together, and a lexicon of
words and symbols.
• An NP (noun phrase) can be composed of either a ProperNoun or a determiner (Det) followed by a Nominal; a Nominal can be one or more Nouns.
• NP → Det Nominal    Det → a | the    Noun → flight
• Nominal → Noun | Nominal Noun
• The symbols that are used in a CFG are divided into two classes
• Terminals: Symbols that correspond to actual words in the language (e.g., “the”, “nightclub”).
• Non-Terminals: Symbols that represent clusters or generalizations (e.g., NP for noun phrase).
• The item to the left of the arrow is a single non-terminal.
• The items to the right of the arrow are an ordered list of one or more terminals and non-terminals.
Context Free Grammars for English
• Context-Free Grammars:
• A CFG can be thought of in two ways: as a device for generating sentences, and as a device for
assigning a structure to a given sentence.
• Generation: CFGs can generate sentences by rewriting non-terminal symbols using production
rules.
• Example: Starting with NP, applying rules leads to the derivation of "a flight".
• Derivation Sequence:
• Start with NP
• Apply NP → Det Nominal, then Det → a and Nominal → flight
• Result: "a flight"
• Structure Assignment: CFGs also assign structure to sentences by defining how constituents
like noun phrases are formed.
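The generation view above can be sketched in a few lines of Python; the grammar encoding and the `derive` helper are illustrative, not from any parsing library:

```python
# A minimal sketch of top-down generation from a CFG (hypothetical encoding).
# Non-terminals map to lists of right-hand sides; anything else is a terminal.
GRAMMAR = {
    "NP":      [["Det", "Nominal"]],
    "Det":     [["a"], ["the"]],
    "Nominal": [["Noun"], ["Nominal", "Noun"]],
    "Noun":    [["flight"]],
}

def derive(symbol, choose=lambda alts: alts[0]):
    """Rewrite `symbol` top-down until only terminals remain."""
    if symbol not in GRAMMAR:          # terminal: nothing left to rewrite
        return [symbol]
    rhs = choose(GRAMMAR[symbol])      # pick one production for this non-terminal
    words = []
    for sym in rhs:
        words.extend(derive(sym, choose))
    return words

print(" ".join(derive("NP")))  # → "a flight"
```

Passing a different `choose` function (e.g., `random.choice`) turns the same code into a random sentence generator.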
Context Free Grammars for English
• Context-Free Grammars:
• The formal language defined by a CFG is the set of strings that are derivable from the
designated start symbol (S).
• Since context-free grammars are often used to define sentences, S is usually interpreted as the
“sentence” node, and the set of strings that are derivable from S is the set of sentences.
• A sentence can consist of a noun phrase followed by a verb phrase.
• CFGs can define more complex sentence structures.
• Example Rules:
• S → NP VP I prefer a morning flight
• VP → Verb NP prefer a morning flight
• VP → Verb NP PP leave Boston in the morning
• VP → Verb PP leaving on Thursday
• PP → Preposition NP from Los Angeles
Context Free Grammars for English
• Context-Free Grammars:
• Lexical Rules (for terminals):
• NP → "I" | "a morning flight" | "Boston" | "Los Angeles"
• Verb → "prefer" | "leave" | "leaving"
• Preposition → "in" | "on" | "from"
• Sentence Derivation Examples:
• Let’s use the CFG rules to derive a few sentences.
• 1. Sentence: "I prefer a morning flight"
• Start with the sentence symbol:
• S → NP VP
• Expand NP to "I" and VP to Verb NP:
• NP → I    VP → Verb NP
• Expand Verb to "prefer" and NP to "a morning flight":
• Verb → prefer
• NP → a morning flight
Context Free Grammars for English
• Context-Free Grammars:
• "I leave Boston in the morning"
• S → NP VP
• → I VP
• → I Verb NP PP
• → I leave NP PP
• → I leave Boston PP
• → I leave Boston Preposition NP
• → I leave Boston in NP
• → I leave Boston in the morning

• "leaving on Thursday"
• S → VP
• → Verb PP
• → leaving PP
• → leaving Preposition NP
• → leaving on NP
• → leaving on Thursday
Context Free Grammars for English
• Example ATIS Corpus:

[S [NP [Pro I]] [VP [V prefer] [NP [Det a] [Nom [N morning] [Nom [N flight]]]]]]
Context Free Grammars for English
• Context-Free Grammars:
• Grammatical Sentences: Sentences derivable by the CFG rules are considered grammatical.
• Ungrammatical Sentences: Sentences not derivable by the grammar are ungrammatical.
• Generative Grammar: CFGs define a formal language by generating all possible grammatical
sentences.
• Formal definition of context-free grammar
• A context-free grammar G is defined by four parameters N, Σ, R, S (technically, G "is a 4-tuple"):
• N: a set of non-terminal symbols (or variables)
• Σ: a set of terminal symbols (disjoint from N)
• R: a set of rules or productions, each of the form A → β, where A is a non-terminal and β is a string of symbols from the infinite set of strings (Σ ∪ N)∗
• S a designated start symbol
• The formal properties of context-free grammars:
• Capital letters like A, B, and S: Non-terminals
• S: The start symbol
• Lower-case Greek letters like α, β, γ: Strings drawn from (Σ ∪ N)∗
• Lower-case Roman letters like u, v, and w: Strings of terminals
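The 4-tuple definition can be written down directly as a data structure; the field names below are illustrative, not standard:

```python
from typing import NamedTuple

# A CFG as a 4-tuple (N, Sigma, R, S) — a sketch with illustrative field names.
class CFG(NamedTuple):
    nonterminals: frozenset  # N
    terminals: frozenset     # Sigma (disjoint from N)
    rules: tuple             # R: pairs (A, rhs) with A in N, rhs a tuple over N ∪ Sigma
    start: str               # S

G = CFG(
    nonterminals=frozenset({"NP", "Det", "Nominal", "Noun"}),
    terminals=frozenset({"a", "the", "flight"}),
    rules=(
        ("NP", ("Det", "Nominal")),
        ("Det", ("a",)), ("Det", ("the",)),
        ("Nominal", ("Noun",)), ("Nominal", ("Nominal", "Noun")),
        ("Noun", ("flight",)),
    ),
    start="NP",
)

# Well-formedness checks straight from the definition:
assert G.nonterminals.isdisjoint(G.terminals)          # N and Sigma are disjoint
assert all(lhs in G.nonterminals for lhs, _ in G.rules)  # every LHS is a single non-terminal
```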
Context Free Grammars for English
• Understanding Derivation in CFGs:
• Direct Derivation: A string αAγ directly derives αβγ if A → β is a production in R.
• Formal Representation: If A → β is a production of R, and α and γ are any strings in (Σ ∪ N)∗, then αAγ directly derives αβγ.
• General Derivation: Extends direct derivation to sequences.
• Formal Representation: If α1 directly derives α2, α2 directly derives α3, and so on up to αm, we say α1 derives αm.
• Language Generated by a CFG: The set of all strings composed of terminal symbols that can be derived from the start symbol S.
• Formal Definition:
• L(G) = {w | w is in Σ∗ and S ⇒∗ w}
• Here, w is a string of terminals derivable from S.
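The definition of L(G) suggests a (very inefficient) enumeration procedure: repeatedly rewrite the leftmost non-terminal, breadth-first, and collect the all-terminal strings. A sketch, with a length bound since L(G) here is infinite (Nominal is recursive):

```python
from collections import deque

# Breadth-first enumeration of L(G) up to a word-length bound — a sketch, not efficient.
RULES = {
    "NP":      [("Det", "Nominal")],
    "Det":     [("a",), ("the",)],
    "Nominal": [("Noun",), ("Nominal", "Noun")],
    "Noun":    [("flight",)],
}

def language(start, max_len=3):
    """All terminal strings derivable from `start` with at most max_len words."""
    found = set()
    queue = deque([(start,)])
    seen = {(start,)}
    while queue:
        form = queue.popleft()
        if len(form) > max_len:
            continue
        for i, sym in enumerate(form):
            if sym in RULES:                       # leftmost non-terminal: expand it
                for rhs in RULES[sym]:
                    new = form[:i] + rhs + form[i + 1:]
                    if new not in seen:
                        seen.add(new)
                        queue.append(new)
                break
        else:                                      # no non-terminal left: w is in L(G)
            found.add(" ".join(form))
    return found

print(sorted(language("NP")))
```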
Sentence-Level Constructions
• Sentence-Level Constructions:
• There are four common and important sentence structures in English.
• Declarative: Statements.
• Imperative: Commands or requests.
• Yes-No Questions: Questions with yes/no answers.
• Wh-Questions: Questions using "wh" words (who, what, when, where, etc.).
Sentence-Level Constructions
• Declarative Sentence Structure:
• A declarative sentence typically consists of a subject noun phrase (NP) followed by a verb
phrase (VP).
• Used to make statements or declarations.
• Examples:
• The flight should be eleven a.m. tomorrow
• The return flight should leave at around seven p.m.
• I’d like to fly the coach discount class
• I want a flight from Ontario to Chicago
• I plan to leave on July first around six thirty in the evening
• Rule:
• S → NP VP
Sentence-Level Constructions
• Imperative Sentence Structure:
• An imperative sentence often begins with a verb phrase (VP) and does not have an explicit
subject.
• Commonly used for commands or requests.
• Examples:
• Show the lowest fare.
• List all flights between five and seven p.m.
• Please list the flights from Charlotte to Long Beach arriving after lunchtime.
• Rule:
• S → VP
Sentence-Level Constructions
• Yes-No Question Structure:
• Typically begins with an auxiliary verb (Aux) followed by a subject noun phrase (NP) and a verb phrase (VP).
• Used to ask questions that can be answered with "yes" or "no."
• Examples:
• Do any of these flights have stops?
• Does American’s flight eighteen twenty-five serve dinner?
• Can you give me the same information for United?
• Rule:
• S → Aux NP VP
Sentence-Level Constructions
• Wh-Question Structure:
• 1) Begins with a wh-phrase (e.g., who, what, which) followed by a verb phrase (VP). The wh-word serves as the subject.
• 2) The wh-phrase is not the subject of the sentence. The sentence includes another subject, and the auxiliary verb appears before the subject NP.
• 1) Used to ask questions about specific information.
• 2) Used to ask questions where the wh-word refers to something other than the subject.
• Examples:
• 1)What airlines fly from Burbank to Denver?
• Which flights depart Burbank after noon and arrive in Denver by six p.m.?
• Whose flights serve breakfast?
• 2) What flights do you have from Burbank to Tacoma Washington?
• What movie did you watch last night?
• Rule:
• S → Wh-NP VP
• S → Wh-NP Aux NP VP
Sentence-Level Constructions
• Long-Distance Dependencies in Wh-Questions:
• A long-distance dependency occurs when a wh-phrase is semantically related to a distant verb
or predicate in the sentence
• Examples:
• What flights do you have from Burbank to Tacoma Washington?
• The relationship between "flights" and "have" can be seen as either a semantic or syntactic
connection.
• Some grammar models use markers like traces or empty categories to handle these
dependencies
• Topicalization: Moving a phrase to the beginning of a sentence for emphasis or discourse
purposes.
• Example: On Tuesday, I’d like to fly from Detroit to Saint Petersburg.
• Fronting Constructions: Other similar constructions that alter the typical word order for
specific effects.
Clauses and Sentences
• Clauses and Sentences:
• A clause is a complete thought, often represented by an S node in a parse tree.
• The main verb and its arguments define the completeness of a sentence.
• S rules define complete sentences and can be part of larger grammatical structures.
• S can also appear on the right-hand side of grammar rules, indicating it can be
embedded within larger sentences.
• S rules are not just about standalone sentences; they can be part of more complex
structures.
• Sentence Constructions vs. Other Grammar Rules
• Sentence Constructions (S Rules):
• Represent complete units of discourse.
• Correspond to the notion of a clause in traditional grammar.
Clauses and Sentences
• Clauses and Sentences:
• Other Grammar Rules:
• May not represent complete thoughts.
• Can be part of larger grammatical structures.
• A sentence (S) forms a complete thought, often aligning with the concept of a
clause.
• An S is defined as a node in the parse tree where the main verb has all its
arguments.
• Example:
• Sentence: "I prefer a morning flight."
• Main Verb: prefer
• Arguments: Subject "I" and Object "a morning flight".
• Mastering S rules is essential for understanding the structure and function of
sentences in grammar.
The Noun Phrase
• The Noun Phrase:
• Our grammar introduced three of the most frequent types of noun phrases that
occur in English
• Pronouns
• Proper Nouns
• The Det + Nominal construction (the focus of this section), which is central to syntactic complexity in English noun phrases
• The Structure of Noun Phrases:
• Head Noun: The central noun in a noun phrase.
• Modifiers: Elements that can occur before or after the head noun
• Examples: "a morning flight", "the longest layover"
The Noun Phrase
• The Determiner: Noun phrases can begin with simple lexical determiners
• a stop, the flights, this flight, those flights, any flights, some flights
• The role of the determiner in English noun phrases can also be filled by more
complex expressions, as follows:
• United’s flight
• United’s pilot’s union
• Denver’s mayor’s mother’s canceled flight
• The role of the determiner is filled by a possessive expression consisting of a noun
phrase followed by an ’s as a possessive marker
• Det → NP’s
• This rule is recursive (since an NP can start with a Det)
• There are also circumstances under which determiners are optional in English. For
example, determiners may be omitted if the noun they modify is plural
• Show me flights from San Francisco to Denver on weekdays (the flights, the
weekdays)
• Does this flight serve dinner? (the dinner)
The Noun Phrase
• The Nominal:
• The nominal construction follows the determiner and contains any pre- and post-head noun
modifiers.
• In its simplest form, a nominal can consist of a single noun.
• Nominal → Noun
• This rule also provides the basis for the bottom of various recursive rules used to capture more
complex nominal constructions.
• Before the Head Noun:
• A number of different kinds of word classes can appear before the head noun in a nominal.
• These include
• Cardinal numbers
• Ordinal numbers
• Quantifiers
• Examples of cardinal numbers:
• two friends, one stop
• Ordinal numbers include first, second, third, and so on, but also words like next, last, past,
other, and another
• the first one, the next day, the second leg, the last flight, the other American flight
The Noun Phrase
• Before the Head Noun:
• Some quantifiers (many, (a) few, several) occur only with plural count
nouns:
• many fares
• The quantifiers much and a little occur only with noncount nouns.
• Adjectives occur after quantifiers but before nouns.
• a first-class fare, a nonstop flight, the longest layover, the earliest lunch flight
• Adjectives can also be grouped into a phrase called an adjective phrase
or AP.
• APs can have an adverb before the adjective:
• the least expensive fare
• We can combine all the options for prenominal modifiers with one rule
as follows:
• NP →(Det) (Card) (Ord) (Quant) (AP) Nominal
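The parenthesized rule NP → (Det) (Card) (Ord) (Quant) (AP) Nominal abbreviates one plain CFG rule per keep/drop choice of the optional constituents. A sketch of that expansion (the helper name is hypothetical):

```python
from itertools import product

# Expanding a rule with optional constituents into plain CFG rules — a sketch.
# NP → (Det) (Card) (Ord) (Quant) (AP) Nominal: each parenthesized symbol may be absent.
OPTIONAL = ["Det", "Card", "Ord", "Quant", "AP"]

def expand_optionals(optional, required):
    """Yield every right-hand side obtained by keeping or dropping each optional symbol."""
    for keep in product([True, False], repeat=len(optional)):
        yield [sym for sym, k in zip(optional, keep) if k] + required

rhs_list = list(expand_optionals(OPTIONAL, ["Nominal"]))
print(len(rhs_list))  # → 32, i.e. 2**5 explicit rules behind the one abbreviated rule
```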
The Noun Phrase
• After the Head Noun:
• A head noun can be followed by postmodifiers. Three kinds of nominal
postmodifiers are very common in English:
• prepositional phrases: all flights from Cleveland
• non-finite clauses: any flights arriving after eleven a.m.
• relative clauses: a flight that serves breakfast
• Prepositional phrase postmodifiers are particularly common in the
ATIS corpus, since they are used to mark the origin and destination of
flights. Here are some examples, with brackets inserted to show the
boundaries of each PP; note that more than one PP can be strung
together:
• any stopovers [for Delta seven fifty one]
• all flights [from Cleveland] [to Newark]
• arrival [in San Jose] [before seven p.m.]
• a reservation [on flight six oh six] [from Tampa] [to Montreal]:
• Here’s a new nominal rule to account for postnominal PPs:
• Nominal → Nominal PP
The Noun Phrase
• The three most common kinds of non-finite postmodifiers are the gerundive (-ing), -ed, and infinitive
forms.
• Gerundive postmodifiers are so-called because they consist of a verb phrase that begins with the gerundive
(-ing) form of the verb. In the following examples, the verb phrases happen to all have only prepositional
phrases after the verb, but in general this verb phrase can have anything in it (anything, that is, which is
semantically and syntactically compatible with the gerund verb).
• any of those [leaving on Thursday]
• any flights [arriving after eleven a.m.]
• flights [arriving within thirty minutes of each other]
• We can define the Nominals with gerundive modifiers as follows, making use of a new non-terminal
GerundVP:
• Nominal →Nominal GerundVP
• We can make rules for GerundVP constituents by duplicating all of our VP productions, substituting
GerundV for V.
• GerundVP → GerundV NP | GerundV PP | GerundV | GerundV NP PP
• GerundV can then be defined as:
• GerundV → being | arriving | leaving | . . .
• The phrases in italics below are examples of the two other common kinds of non-finite clauses, infinitives
and -ed forms:
• the last flight to arrive in Boston
• I need to have dinner served
• Which is the aircraft used by this flight?
The Noun Phrase
• A postnominal relative clause (more correctly, a restrictive relative clause) often begins with a relative pronoun (that and who are the most common). The relative pronoun functions as the subject of the embedded verb (a subject relative) in the following examples:
• a flight that serves breakfast
• flights that leave in the morning
• the United flight that arrives in San Jose around ten p.m.
• the one that leaves at ten thirty five
• We might add rules like the following to deal with these:
• Nominal → Nominal RelClause
• RelClause → (who | that) VP
• Before the Noun Phrase
• Word classes that modify and appear before NPs are called predeterminers. Many of these
have to do with number or amount; a common predeterminer is all:
• all the flights, all flights, all non-stop flights
• The example noun phrase given in Fig. 12.5 illustrates some of the complexity that arises
when these rules are combined.
Agreement
• In previous module we discussed English inflectional morphology.
• Recall that most verbs in English can appear in two forms in the present tense:
• the form used for third-person, singular subjects (the flight does), and
• the form used for all other kinds of subjects (all the flights do, I do).
• The third-person-singular (3sg) form usually has a final -s where the non-3sg form does not.
Here are some examples, again using the verb do, with various subjects
• Do [NP all of these flights] offer first class service?
• Do [NP I] get dinner on this flight?
• Do [NP you] have a flight from Boston to Fort Worth?
• Does [NP this flight] stop in Dallas?
• Here are more examples with the verb leave:
• What flights leave in the morning?
• What flight leaves from Pittsburgh?
• This agreement phenomenon occurs whenever there is a verb that has some noun acting as its
subject.
Agreement
• Note that sentences in which the subject does not agree with the verb are ungrammatical:
• *[What flight] leave in the morning?
• *Does [NP you] have a flight from Boston to Fort Worth?
• *Do [NP this flight] stop in Dallas?
• How can we modify our grammar to handle these agreement phenomena?
• One way is to expand our grammar with multiple sets of rules, one rule set for 3sg subjects,
and one for non-3sg subjects.
• For example, the rule that handled these yes-no-questions used to look like this:
• S → Aux NP VP
• We could replace this with two rules of the following form:
• S → 3sgAux 3sgNP VP
• S → Non3sgAux Non3sgNP VP
• We could then add rules for the lexicon like these:
• 3sgAux → does | has | can | . . .
• Non3sgAux → do | have | can | . . .
Agreement
• But we would also need to add rules for 3sgNP and Non3sgNP, again by making two copies of
each rule for NP.
• While pronouns can be first, second, or third person, full lexical noun phrases can only be third
person, so for them we just need to distinguish between singular and plural.
• 3sgNP → Det SgNominal
• Non3sgNP → Det PlNominal
• SgNominal → SgNoun
• PlNominal → PlNoun
• SgNoun → flight | fare | dollar | reservation | . . .
• PlNoun → flights | fares | dollars | reservations | . . .
• The problem with this method of dealing with number agreement is that it doubles the size of
the grammar.
• Every rule that refers to a noun or a verb needs to have a “singular” version and a “plural”
version.
• Unfortunately, subject-verb agreement is only the tip of the iceberg.
• We’ll also have to introduce copies of rules to capture the fact that head nouns and their
determiners have to agree in number as well.
• this flight *this flights
• those flights *those flight
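The rule-doubling approach described above can be mimicked with a small table keyed by number feature; the lexicon below is a toy fragment for illustration:

```python
# Sketch of the rule-doubling approach to number agreement (toy lexicon, illustrative names).
LEXICON = {
    "3sg":    {"Aux": {"does", "has"}, "Noun": {"flight", "fare"}},
    "non3sg": {"Aux": {"do", "have"},  "Noun": {"flights", "fares"}},
}

def agrees(aux, noun):
    """True iff the auxiliary and the subject noun carry the same number feature."""
    return any(aux in entries["Aux"] and noun in entries["Noun"]
               for entries in LEXICON.values())

assert agrees("does", "flight")    # "Does this flight stop in Dallas?"
assert not agrees("do", "flight")  # *"Do this flight stop in Dallas?"
```

Note the duplication: every auxiliary and noun must be listed once per number feature, which is exactly the grammar-size explosion the slide describes; feature structures (Ch. 16) avoid it.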
Agreement
• Noun case and agreement systems introduce complexity into grammatical rule sets.
• Rule proliferation occurs when additional rules are needed for different grammatical features such as case,
gender, and number.
• Nouns and pronouns change form based on case. Nominative Pronouns: (I, she, he, they) Accusative
Pronouns: (me, her, him, them)
• Each noun phrase (NP) and noun (N) rule requires separate versions for these cases. Each case variation
necessitates additional grammatical rules, increasing complexity. As more grammatical categories are
considered, the number of required rules increases significantly.
• Example: Different rules for "I see her" vs. "She sees me."
• Key Point: In computational grammars, rule proliferation can lead to an unwieldy number of rules.
• Unlike English, languages like German and French have gender agreement along with number agreement.
Gender must agree between the noun, adjective, and determiner.
• Example: In French, "le grand homme" (the big man) vs. "la grande femme" (the big woman).
• This adds another layer of rule proliferation for gender agreement.
• For each combination of grammatical features, new rules are needed, compounding the size of the grammar.
• Managing large numbers of grammatical rules is challenging in computational grammars. Practical grammars
often rely on Context-Free Grammars (CFGs), which can be inefficient when the number of rules increases.
• The need to balance accuracy in linguistic representation with computational efficiency.
The Verb Phrase and Subcategorization
• The verb phrase consists of the verb and a number of other constituents.
• In the simple rules we have built so far, these other constituents include NPs and PPs and
combinations of the two:
• VP → Verb disappear
• VP → Verb NP prefer a morning flight
• VP → Verb NP PP leave Boston in the morning
• VP → Verb PP leaving on Thursday
• Verb phrases can be significantly more complicated than this. Many other kinds of constituents
can follow the verb, such as an entire embedded sentence.
• These are called sentential complements.
• You [VP [V said] [S there were two flights that were the cheapest]]
• You [VP [V said] [S you had a two hundred sixty six dollar fare]]
• [VP [V Tell] [NP me] [S how to get from the airport in Philadelphia to downtown]]
• I [VP [V think] [S I would like to take the nine thirty flight]]
• Here’s a rule for these:
• VP → Verb S
The Verb Phrase and Subcategorization
• Another potential constituent of the VP is another VP. This is often the case for verbs like want, would like,
try, intend, need:
• I want [VP to fly from Milwaukee to Orlando]
• Hi, I want [VP to arrange three flights]
• Hello, I’m trying [VP to find a flight that goes from Pittsburgh to Denver after two p.m.]
• Verbs can also be followed by particles.
• Phrasal Verbs: Verbs can be followed by particles (e.g., "take off"), forming phrasal verbs treated as single
verbs composed of two words.
• Verb Complements: Not all verbs can be followed by the same types of constituents (e.g., objects, phrases).
• Examples:
• "Want": Can take an NP complement ("I want a flight") or an infinitive VP complement ("I want to fly").
• "Find": Can take an NP complement ("I found a flight") but not an infinitive VP complement (*"I found to
fly").
• This idea that verbs are compatible with different kinds of complements is a very old one
• Transitive vs. Intransitive Verbs:
• Transitive verbs (e.g., "find") require a direct object NP ("I found a flight").
• Intransitive verbs (e.g., "disappear") cannot take a direct object (*"I disappeared a flight").
The Verb Phrase and Subcategorization
• Traditional vs. Modern Grammars: Traditional grammar subcategorizes verbs as transitive or intransitive.
Modern grammars have more detailed subcategories, with up to 100 subcategorization frames.
• Examples of Tagsets:
• COMLEX (Macleod et al., 1998)
• ACQUILEX (Sanfilippo, 1993)
• Subcategorization: Verbs subcategorize for different complements:
• "Find": Requires an NP complement (e.g., "I found a flight").
• "Want": Can take an NP or non-finite VP complement (e.g., "I want a flight" / "I want to fly").
• Subcategorization Frame: The possible complements of a verb define its subcategorization frame.
• Predicate-Argument Structure: Verbs can be viewed as logical predicates, and their complements as
arguments:
• Example: FIND(I, A FLIGHT) or WANT(I, TO FLY).
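Subcategorization frames can be represented as a table from verbs to their permitted complement sequences; the frame inventory below is a toy fragment, not the COMLEX or ACQUILEX tagset:

```python
# Subcategorization frames as a verb → allowed-complement-sequences table (a sketch).
SUBCAT = {
    "find":      [("NP",), ("NP", "NP")],  # "find a flight", "find me a flight"
    "want":      [("NP",), ("VPto",)],     # "want a flight", "want to fly"
    "disappear": [()],                     # intransitive: no complement at all
    "think":     [("S",)],                 # "think [S I would like ...]"
}

def licensed(verb, complements):
    """True iff the verb subcategorizes for this sequence of complements."""
    return tuple(complements) in SUBCAT.get(verb, [])

assert licensed("find", ["NP"])
assert not licensed("find", ["VPto"])     # *"I found to fly"
assert not licensed("disappear", ["NP"])  # *"I disappeared a flight"
```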
The Verb Phrase and Subcategorization
• Subcategorization frames for a set of example verbs are given in Fig. 12.6.
• Note that a verb can subcategorize for a particular type of verb phrase, such as a verb phrase whose verb is an
infinitive (VPto), or a verb phrase whose verb is a bare stem (uninflected: VPbrst).
• Note also that a single verb can take different subcategorization frames.
• The verb find, for example, can take an NP NP frame (find me a flight) as well as an NP frame.
• How can we represent the relation between verbs and their complements in a context-free grammar?
The Verb Phrase and Subcategorization
• One thing we could do is to do what we did with agreement features:
• make separate subtypes of the class Verb (Verb-with-NP-complement, Verb-with-Inf-VP-complement, Verb-with-S-complement, and so on):
• Verb-with-NP-complement → find | leave | repeat | . . .
• Verb-with-S-complement → think | believe | say | . . .
• Verb-with-Inf-VP-complement → want | try | need | . . .
• Then each VP rule could be modified to require the appropriate verb subtype:
• VP → Verb-with-no-complement disappear
• VP → Verb-with-NP-comp NP prefer a morning flight
• VP → Verb-with-S-comp S said there were two flights
• The problem with this approach, as with the same solution to the agreement feature problem, is a vast
explosion in the number of rules.
• The standard solution to both of these problems is the feature structure, which will be introduced in Ch. 16
where we will also discuss the fact that nouns, adjectives, and prepositions can subcategorize for
complements just as verbs can.
Auxiliaries
• Auxiliaries (Helping Verbs): These verbs have specific syntactic constraints and are a subclass of verbs. They
include:
• Modal verbs: can, could, may, might, must, will, would, shall, should
• Perfect auxiliary: have
• Progressive auxiliary: be
• Passive auxiliary: be
• Syntactic Constraints: Each auxiliary verb places constraints on the form of the following verb and must
appear in a specific order.
• Subcategorization by Auxiliaries:
• Modal verbs: Require a VP with a bare stem verb (e.g., "can go in the morning", "will try to find a flight").
• Perfect auxiliary (have): Requires a VP with a past participle verb (e.g., "have booked 3 flights").
• Progressive auxiliary (be): Requires a VP with a gerundive participle verb (e.g., "am going from Atlanta").
• Passive auxiliary (be): Requires a VP with a past participle verb (e.g., "was delayed by inclement weather").
Auxiliaries
• A sentence can have multiple auxiliary verbs, but they must occur in a particular order: modal <
perfect < progressive < passive. Here are some examples of multiple auxiliaries:
• modal + perfect: could have been a contender
• modal + passive: will be married
• perfect + progressive: have been feasting
• modal + perfect + passive: might have been prevented
• Auxiliaries as Verbs: Auxiliaries (e.g., "can", "will", "have") are treated similarly to verbs like
"want", "seem", or "intend", which subcategorize for specific VP complements.
• Lexicon Entry: For example, "can" would be listed as a verb that requires a bare-stem VP
complement (e.g., "can go").
• Systemic Grammar Approach: In Halliday's Systemic Grammar (1985), auxiliaries and the main
verb are grouped into a single constituent called the verb group, which captures their ordering.
• Auxiliary Ordering: Modals (e.g., "can", "will") cannot follow progressive or passive "be" or perfect
"have" because they don't have progressive or participle forms.
• Passive Construction: Difference in Semantics: In an active sentence, the subject is the agent (e.g.,
"I prevented a catastrophe"). In a passive sentence, the subject is the patient or undergoer of the
action (e.g., "A catastrophe was prevented").
• This semantic difference will be explored further in Chapter 18.
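The ordering constraint modal < perfect < progressive < passive can be checked by assigning each auxiliary class a rank; a minimal sketch:

```python
# Checking the auxiliary ordering modal < perfect < progressive < passive (a sketch).
AUX_RANK = {"modal": 0, "perfect": 1, "progressive": 2, "passive": 3}

def well_ordered(aux_classes):
    """True iff the auxiliary classes appear in strictly increasing rank order."""
    ranks = [AUX_RANK[a] for a in aux_classes]
    return all(a < b for a, b in zip(ranks, ranks[1:]))

assert well_ordered(["modal", "perfect", "passive"])  # "might have been prevented"
assert not well_ordered(["perfect", "modal"])         # *a modal after perfect "have"
```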
Coordination
• The major phrase types discussed here can be conjoined with conjunctions like and, or, and but to
form larger constructions of the same type.
• For example a coordinate noun phrase can consist of two other noun phrases separated by a
conjunction.
• Please repeat [NP [NP the flights] and [NP the costs]]
• I need to know [NP [NP the aircraft] and [NP the flight number]]
• Here’s a rule that allows these structures:
• NP →NP and NP
• Note that the ability to form coordinate phrases via conjunctions is often used as a test for
constituency. Consider the following examples which differ from the ones given above in that they
lack the second determiner.
• Please repeat the [Nom [Nom flights] and [Nom costs]]
• I need to know the [Nom [Nom aircraft] and [Nom flight number]]
• The fact that these phrases can be conjoined is evidence for the presence of the underlying
Nominal constituent we have been making use of. Here’s a new rule for this:
• Nominal → Nominal and Nominal
Coordination
• The following examples illustrate conjunctions involving VPs and Ss.
• What flights do you have [VP [VP leaving Denver] and [VP arriving in San Francisco]]
• [S [S I’m interested in a flight from Dallas to Washington] and [S I’m also interested in going to
Baltimore]]
• The rules for VP and S conjunctions mirror the NP one given above.
• VP → VP and VP
• S → S and S
• Since all the major phrase types can be conjoined in this fashion, it is also possible to represent this conjunction fact more generally; a number of grammar formalisms, such as that of Gazdar et al. (1985), do this via metarules such as the following:
• X → X and X
• This metarule simply states that any non-terminal can be conjoined with the same nonterminal to
yield a constituent of the same type.
• Of course, the variable X must be designated as a variable that stands for any non-terminal rather
than a non-terminal itself.
Parsing with Context Free Grammars
• Parsing is the process of analyzing a sequence of words or symbols (such as a sentence) to
determine its grammatical structure according to a specific set of rules or grammar.
• In computational linguistics and natural language processing (NLP), parsing involves breaking
down a sentence into its constituent parts (like nouns, verbs, and phrases) and establishing the
relationships between these parts to form a parse tree or syntax tree that represents the structure of
the sentence.
• Parsing is fundamental in tasks like language translation, grammar checking, information
extraction, and semantic analysis.
• Parse trees are useful for applications like grammar checking and serve as an important intermediate stage of representation for semantic analysis (e.g., question answering, information extraction).
• For example, to answer the question
• “What books were written by British women authors before 1800?”
• we’ll need to know that the subject of the sentence was what books and that the by-adjunct was
British women authors to help us figure out that the user wants a list of books (and not a list of
authors).
Parsing with Context Free Grammars
• First, we revisit the search metaphor for parsing and recognition, introduced for finite-state automata in Ch. 2, along with the top-down and bottom-up search strategies.
• Parsing as Search
• In finite-state automata, the search is through the space of all possible paths through the
automaton to find the correct path for the input.
• In syntactic parsing, the parser searches through the space of possible parse trees to find the correct
tree for a given sentence.
• The structure of the automaton defines the search space of possible paths, just as the structure of
the grammar defines the search space of possible parse trees.
• This concept is applied to syntactic parsing, using a sample sentence from the Air Travel
Information System (ATIS) corpus as an example.
• Book that flight.
• Fig. 13.1 introduces the L1 grammar, which consists of the L0 grammar from the last chapter with a
few additional rules.
• Given this grammar, the correct parse tree for this example would be the one shown in Fig. 13.2
Parsing with Context Free Grammars
• The goal of parsing is to find all possible parse trees whose root is the start symbol (S) and which
cover the exact words of the input sentence.
• Two Constraints for Parsing:
• Data Constraint (from the input sentence): The final parse tree must have three leaves
corresponding to the words book, that, and flight.
• Grammar Constraint (from the grammar structure):The final parse tree must have one root, which
is the start symbol S of the grammar.
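Both constraints can be checked mechanically. Here is a minimal sketch representing the correct parse of "Book that flight" (the tree of Fig. 13.2) as nested tuples; the tuple encoding and the leaves helper are illustrative assumptions, not from the text:

```python
# Sketch: the correct parse of "Book that flight" (Fig. 13.2)
# encoded as nested tuples: (category, child, child, ...).
tree = ("S",
        ("VP",
         ("Verb", "book"),
         ("NP",
          ("Det", "that"),
          ("Nominal", ("Noun", "flight")))))

def leaves(t):
    """Collect the terminal words at the fringe of a tree."""
    if isinstance(t, str):
        return [t]
    words = []
    for child in t[1:]:
        words.extend(leaves(child))
    return words

# The two parsing constraints: root is S, leaves are the input words.
assert tree[0] == "S"
assert leaves(tree) == ["book", "that", "flight"]
```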
• Search Strategies for Parsing:
• Two key search strategies arise from these constraints:
• Top-down (Goal-directed) Search: Guided by the grammar rules, starting from the root (S) and
moving toward the input words.
• Bottom-up (Data-directed) Search: Starts from the input words and builds up to the root (S).
• Philosophical Insights:
• The two search strategies reflect philosophical traditions:
• Rationalist Tradition: Emphasizes prior knowledge and guides the top-down search.
• Empiricist Tradition: Focuses on the data (input) and informs the bottom-up search.
Top-Down Parsing
• Top-down parsing begins by attempting to construct a parse tree starting from the root node (S)
and works its way down to the leaves (terminal symbols).
• The search space that a top-down parser explores involves building all possible trees in parallel,
meaning it considers every potential parse tree that could fit the sentence.
• The algorithm initially assumes that the input sentence can be derived from the start symbol (S).
• The parser's next step is to find the possible expansions for the start symbol (S) by looking at
grammar rules with S on the left-hand side.
• In the grammar shown in Fig. 13.1, there are three rules that expand from S, so the second level (or
ply) of the search space has three partial trees based on these rules.
• The parser continues this process, trying to fill out the tree by applying rules to expand each non-
terminal symbol until it reaches the leaves.
• After initially expanding S in the parse tree, the next step is to expand the constituents in the
newly formed trees (from the three rules that expand S).
• The first tree expects an NP followed by a VP.
• The second tree expects an Aux followed by an NP and a VP.
• The third tree expects a VP on its own.
Top-Down Parsing
• In the third ply (level) of the search space, only a subset of the trees is shown, focusing on the
expansion of the left-most leaves.
• At each ply, the parser uses the right-hand sides of the rules to set new expectations, which guide
further expansions.
• This process is recursive, generating the rest of the trees.
• The trees grow downward, expanding until they reach the part-of-speech categories at the bottom.
• Trees whose leaves do not match the words in the input sentence are rejected.
• In this example, only the fifth parse tree in the third ply (the one where VP expands into Verb NP)
will eventually match the input sentence "Book that flight."
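The top-down strategy described above can be sketched as a naive recursive-descent parser. The grammar below is an assumed fragment of L1 (not the full rule set of Fig. 13.1), and the generator-based depth-first search is one possible implementation, not the book's algorithm:

```python
# Minimal top-down (recursive-descent) sketch for "Book that flight".
GRAMMAR = {
    "S":       [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP":      [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP":      [["Verb"], ["Verb", "NP"]],
}
LEXICON = {
    "book": ["Verb", "Noun"],  # "book" is lexically ambiguous
    "that": ["Det"],
    "flight": ["Noun"],
}

def parse(cat, words, i):
    """Expand `cat` starting at position i; yield (tree, next_i) pairs."""
    if cat in GRAMMAR:  # non-terminal: try each expansion in turn
        for rhs in GRAMMAR[cat]:
            for children, j in parse_seq(rhs, words, i):
                yield (cat, *children), j
    elif i < len(words) and cat in LEXICON.get(words[i], []):
        yield (cat, words[i]), i + 1  # part of speech matches input word

def parse_seq(cats, words, i):
    """Parse a sequence of categories left to right."""
    if not cats:
        yield [], i
        return
    for tree, j in parse(cats[0], words, i):
        for rest, k in parse_seq(cats[1:], words, j):
            yield [tree] + rest, k

sentence = "book that flight".split()
# Keep only trees rooted in S that cover the whole input.
trees = [t for t, j in parse("S", sentence, 0) if j == len(sentence)]
# trees[0] == ('S', ('VP', ('Verb', 'book'),
#                    ('NP', ('Det', 'that'),
#                           ('Nominal', ('Noun', 'flight')))))
```

Note that this depth-first expansion would loop forever on a left-recursive rule such as NP → NP PP; handling left recursion is one motivation for the chart-based Earley parser covered later in this module.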
Top-Down Parsing
Bottom-Up Parsing
• Bottom-up parsing is an early parsing algorithm, first proposed by Yngve (1955), and commonly
used in shift-reduce parsers for computer languages.
• In bottom-up parsing, the process begins with the words of the input and attempts to build trees
upwards by applying grammar rules incrementally.
• A parse is successful when the parser constructs a tree that is rooted in the start symbol (S) and
covers all the input words.
• Fig. 13.1.2 illustrates the bottom-up parsing process for the sentence "Book that flight."
• The parser starts by looking up each word in the lexicon to assign part-of-speech categories.
• For example:
• Book can be either a noun or a verb (lexical ambiguity).
• That is assigned Det, and flight is assigned Noun.
• Due to this ambiguity with book, the parser must explore two possible sets of trees for the initial
words.
Bottom-Up Parsing
• The first two plies (levels) of the search space in Fig. 13.1.2 show this bifurcation, where the parser
considers multiple options based on the noun or verb tag for book.
• The parser continues applying rules and building trees bottom-up, checking if the resulting tree
can be rooted in S and cover the input.
• After creating the trees in the second ply (based on the possible parts of speech for "book"), the
parser begins to expand each tree further.
• In the leftmost parse (where "book" is incorrectly treated as a noun), the rule Nominal → Noun is
applied to both book and flight (both considered nouns).
• In the right parse, where "book" is a verb and "flight" is the only noun, the Nominal → Noun rule
is applied only to "flight," leading to different tree expansions in the third ply.
• The parser advances through each ply by searching for places in the partial tree where the right-
hand side of a grammar rule might fit.
Bottom-Up Parsing
• This method contrasts with top-down parsing, where trees are expanded by applying rules when
the left-hand side matches an unexpanded non-terminal.
• In the fourth ply, in the first and third parses, the sequence Det Nominal is recognized as the right-
hand side of the rule NP → Det Nominal, leading to the next expansion.
• By the fifth ply, the interpretation of "book" as a noun is pruned from the search space, as no further expansions are possible (there is no grammar rule with Nominal NP as its right-hand side).
• The final ply (not shown) contains the correct parse, where "book" is interpreted as a verb, aligning
with Fig. 13.2.
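The bottom-up process can be sketched as a toy shift-reduce loop. The rule set is an assumed fragment of the grammar, each word is given a single tag (ignoring the noun/verb ambiguity of book for brevity), and the greedy reduce-first control strategy is a simplification with no backtracking:

```python
# Minimal shift-reduce sketch of bottom-up parsing (assumed rule
# fragment; greedy reduce-first strategy, no backtracking).
RULES = [
    ("NP", ("Det", "Nominal")),
    ("Nominal", ("Noun",)),
    ("VP", ("Verb", "NP")),
    ("S", ("VP",)),
]
LEXICON = {"book": "Verb", "that": "Det", "flight": "Noun"}

def shift_reduce(words):
    stack, buffer = [], [LEXICON[w] for w in words]
    while buffer or len(stack) > 1 or (stack and stack[0] != "S"):
        reduced = False
        for lhs, rhs in RULES:  # reduce: a rule's rhs tops the stack
            n = len(rhs)
            if tuple(stack[-n:]) == rhs:
                stack[-n:] = [lhs]
                reduced = True
                break
        if not reduced:
            if not buffer:
                return None  # stuck: cannot reduce, nothing to shift
            stack.append(buffer.pop(0))  # shift next word's category
    return stack[0]

print(shift_reduce("book that flight".split()))  # → S
```

Tracing the example: the parser shifts Verb, Det, Noun, then reduces Noun to Nominal, Det Nominal to NP, Verb NP to VP, and finally VP to S, mirroring the upward tree-building in Fig. 13.1.2.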
Bottom-Up Parsing
Comparing Top-Down and Bottom-Up Parsing
• Initial Strategy: Top-down starts from the root symbol (S) and generates trees downward; bottom-up starts from the input words and builds trees upward.
• Efficiency in Exploring Trees: Top-down does not waste time exploring trees that cannot lead to an S; bottom-up may generate trees that cannot lead to an S or fit with other trees.
• Handling of Subtrees: Top-down does not explore subtrees that don't fit into an S-rooted tree; bottom-up may generate subtrees with no chance of fitting into an S tree.
• Consistency with Input: Top-down may spend time on S trees that do not match the input; bottom-up never generates trees that are not locally grounded in the input.
• Example of Inefficiency: In Fig. 13.1.1, many top-down trees in the third ply cannot match "book"; bottom-up generates trees with no chance of leading to S "with wild abandon."
• Input Sensitivity: Top-down generates trees before examining the input; bottom-up builds trees that are at least locally grounded in the input.
• Top-down parsing is efficient in focusing on trees rooted in S, but may generate trees incompatible with the input, wasting time on unnecessary branches.
• Bottom-up parsing avoids generating trees that don't match the input, but can be inefficient in creating subtrees that have no chance of leading to an S.
Grammar Rules for English
• Optional Use of Determiners
N-gram Models
Applications of N-gram Models
Speech Recognition
• Definition: Helps in identifying words in noisy and ambiguous input.
• Example: Woody Allen's "I have a gub" misread scenario.
• Solution: N-grams help avoid such errors by predicting more probable word sequences, e.g., "I have a gun."
Machine Translation
• Purpose: Translating Chinese into English.
• Example: Consider a Chinese sentence and its potential rough English translations:
• he briefed to reporters on the chief contents of the statement
• he briefed reporters on the chief contents of the statement
• he briefed to reporters on the main contents of the statement
• he briefed reporters on the main contents of the statement
• Goal: Choosing the best translation.
Spelling Correction
• Definition: Real words resulting from spelling errors.
• Example: "in about fifteen minuets" corrected to "in about fifteen minutes."
• Capabilities: A spellchecker can use a probability estimator both to detect these errors and to suggest higher-probability corrections.
Augmentative Communication
• Definition: For people who are unable to use speech or sign language to communicate, like the physicist Stephen Hawking.
• Example: Suggesting likely words for the menu.
• Benefit: Enhances the efficiency of communication systems.
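The spelling-correction idea, using a probability estimator to prefer "minutes" over the real-word error "minuets", can be sketched with a toy bigram model (the tiny corpus and its counts are invented for illustration):

```python
# Sketch: a bigram model scoring two candidates, as a spellchecker
# might to prefer "fifteen minutes" over the real-word error
# "fifteen minuets". Toy corpus invented for illustration.
from collections import Counter

corpus = ("in about fifteen minutes we leave . "
          "she danced three minuets . "
          "fifteen minutes later . in about an hour .").split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def bigram_prob(w1, w2):
    """P(w2 | w1) by maximum likelihood estimation (no smoothing)."""
    return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

# "minutes" follows "fifteen" in the corpus; "minuets" never does,
# so the model flags "fifteen minuets" as the less probable sequence.
assert bigram_prob("fifteen", "minutes") > bigram_prob("fifteen", "minuets")
```

A real spellchecker would combine such sequence probabilities with an error model and smoothing; this sketch shows only the ranking step.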