
Syntax

Syntax provides rules for forming sentences in a language and determining if a sentence is valid. It can help improve search applications by understanding relationships between words, enable paraphrasing by changing word order, and allow extracting information from documents by identifying elements like subjects and objects. Languages vary in how much morphological information is expressed within words versus word order in sentences. Context-free grammars are formal systems that generate the syntactic structures of sentences using rewriting rules.

Uploaded by

Mahabat

Syntax: Structural Descriptions of Sentences

Why Study Syntax?

Syntax provides
• systematic rules for forming new sentences in a language.
• a way to verify whether a sentence is legitimate in a language.
• a step closer to the “meaning” of a sentence.
– “Who did what to whom” semantics
Applications
• Improving precision in search applications (same words, different meaning)
– Yankees beat Red Sox
– Red Sox beat Yankees

• Paraphrasing
– John loves Mary = Mary is loved by John
• Information Extraction
– Fill in a form by extracting information from a document.
Structure of Words

What are words?


• Orthographic tokens separated by white space.

In some languages the distinction between words and sentences is less clear.
• Chinese, Japanese: no white space between words
– nowhitespace → no white space / no whites pace / now hit esp ace
• Turkish: words could represent a complete “sentence”
– E.g.: uygarlastiramadiklarimizdanmissinizcasina (“(behaving) as if you are among those whom we could not civilize”)
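The segmentation ambiguity in the “nowhitespace” example can be made concrete with a short sketch. It enumerates every way to split an unspaced string into words from a lexicon. The function name `segmentations` and the toy lexicon are assumptions made for illustration; a real segmenter would also need a way to rank the candidates.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words found in `lexicon`."""
    if not text:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in lexicon:
            # Keep this prefix and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([prefix] + rest)
    return results

# Toy lexicon, chosen only to reproduce the example (hypothetical).
LEXICON = {"no", "now", "white", "whites", "hit", "space", "pace", "esp", "ace"}

for words in segmentations("nowhitespace", LEXICON):
    print(" ".join(words))
```

With this lexicon the sketch finds the three readings listed above; the string alone does not say which one is intended.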
Morphology: the structure of words
• Basic elements: morphemes

• Morphological Rules: how to combine morphemes.


Syntax: the structure of sentences
• Rules for ordering words in a sentence

• Elementary units: phrases and clauses


Morphology and Syntax
Interplay between syntax and morphology
• How much information does a language allow to be packed into a word, and how easy is it to unpack?
• More information in the word → less rigid syntax → freer word order
• Hindi: “John likes Mary” – all six orders are possible, due to rich morphological information.
– John-nom Mary-acc likes
English expresses relations between words through word order.
Morphologically rich languages have freer word order.
• However, some parts have rigid word order.
– Noun groups in Hindi: “one yellow book”
Outline

Constituency
• How does this notion arise?
• Type of constituents
• Representation: Tree Structure
Formal device: Context Free Grammars
• Derived tree and derivation tree
• Grammar Equivalence
– Strong and weak generative capacity
– Chomsky Normal Form
• Other Formal Frameworks (Tree-Adjoining Grammar)
Other topics in syntax
• Dependency
• Spoken language syntax
• Structural Priming
Constituency

Words are grouped into part-of-speech groups


• Similar morphological inflections
• Allows us to create new word forms (“blog”, “xerox”)
• Nouns, Verbs, Determiners, Adjectives etc…
Certain sequences of words in a sentence are grouped as constituents
• Distributionally similar behavior
• cohesive units (move around in a sentence as a unit)
– In the morning I take a walk
– I take a walk in the morning

• Substrings are typed: “Clause”, “Noun Phrase”, “Verb Phrase”, “Preposition Phrase”, etc.
Constituency – contd.
Examples of constituents:
• Noun phrase:
– the dog, two big light blue vans
• Preposition phrase:
– in the box, under the bridge
• Clause:
– the dog bit the man, John thought the dog bit the man
The type of a constituent is derived from the “head word” of
the constituent.
Constituent Structure

Decomposition of a sentence into its constituents.


Attaching constituents to each other to reflect relations among words:
Emergence of Tree Structure
• John saw the man with the telescope
• PP inside the object NP (the man has the telescope): (S (NP John) saw (NP (NP the man) (PP with (NP the telescope))))
• PP attached beside the object NP (John uses the telescope): (S (NP John) saw (NP the man) (PP with (NP the telescope)))
Select a sentence from a newspaper text and provide its constituent
structure.
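To see that the two bracketings of the telescope sentence really are different trees over the same words, the bracketed strings can be read into nested lists. This is a minimal sketch; `parse_brackets` and `words` are names introduced here, and the list encoding `[label, child, ...]` is only one possible representation.

```python
def parse_brackets(text):
    """Parse a bracketed string into nested lists of the form [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                  # skip "("
            node = [tokens[pos]]      # the first token is the constituent label
            pos += 1
            while tokens[pos] != ")":
                node.append(read())
            pos += 1                  # skip ")"
            return node
        word = tokens[pos]            # a bare token is a word
        pos += 1
        return word

    tree = read()
    if pos != len(tokens):
        raise ValueError("unbalanced brackets")
    return tree

def words(tree):
    """Return the words (leaves) of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

np_attach = parse_brackets("(S (NP John) saw (NP (NP the man) (PP with (NP the telescope))))")
vp_attach = parse_brackets("(S (NP John) saw (NP the man) (PP with (NP the telescope)))")
print(words(np_attach) == words(vp_attach))  # same word string
print(np_attach == vp_attach)                # different structure
```

In the first tree the PP is a child of the object NP; in the second it is a direct child of S, which is what the two readings correspond to.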
Evidence of another constituent – verb phrase (“VP”)
• Substrings involving a verb move around and can be referred to as a unit.
– VP-fronting (and quickly clean the carpet he did!)
– VP-ellipsis (He cleaned the carpets quickly, and so did she)
– Can have adjuncts before and after the VP, but not inside it (He often eats beans, * he eats often beans)
Relations among Words
Types of relations between words
• Arguments: subject, object, indirect object, prepositional object
• Adjuncts: temporal, locative, causal, manner, …
• Function Words
Subcategorization: List of arguments of a word (verb)
• with features about realization (POS, perhaps case, verb form etc)

For English, the argument order: Subject-Object-IndirectObj


Example:
• like: NP-NP (“John likes Mary”), NP-VP(to-inf) (John likes to watch movies)
• think: NP-S (“John thought Mary was going to the party”)
• put: NP-NP-PP
Adjuncts are optional (typically modifiers of an action)
• John put the book on the table at 3pm yesterday

There are words with “demands” and words that fill the “demands”.
• Demands are typed (NP, VP, PP, S)
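The idea of typed “demands” can be sketched as a lookup table of subcategorization frames, used to separate arguments from adjuncts. The frames below are the ones from the examples above; the function `split_arguments` and the assumption that arguments always precede adjuncts are simplifications made for this illustration.

```python
# Frames from the examples above, subject included: like NP-NP / NP-VP, think NP-S, put NP-NP-PP.
SUBCAT = {
    "like": [("NP", "NP"), ("NP", "VP")],
    "think": [("NP", "S")],
    "put": [("NP", "NP", "PP")],
}

def split_arguments(verb, constituents):
    """Split (type, text) pairs into (arguments, adjuncts) for `verb`.

    `constituents` lists the typed constituents in sentence order, subject
    first and verb omitted. Returns None if no frame of the verb is satisfied.
    Assumes (as a simplification) that arguments come before adjuncts.
    """
    types = tuple(t for t, _ in constituents)
    # Try longer frames first so that a verb's fullest demand wins.
    for frame in sorted(SUBCAT[verb], key=len, reverse=True):
        if types[:len(frame)] == frame:
            return constituents[:len(frame)], constituents[len(frame):]
    return None

sentence = [("NP", "John"), ("NP", "the book"), ("PP", "on the table"),
            ("PP", "at 3pm"), ("AdvP", "yesterday")]
arguments, adjuncts = split_arguments("put", sentence)
print([text for _, text in arguments])  # the demands of "put" that were filled
print([text for _, text in adjuncts])   # optional modifiers
```

Dropping the PP “on the table” leaves the demand of “put” unfilled, so the function returns None for “John put the book”.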
English Syntax: A Sample

Sentence types:
• Declarative (John closed the door)
• Imperative (close the door!!)
• Yes-No-Question (can you close the door?)
• Wh-question (who closed the door? What did John close?)
Clause types:
• Infinitival (to read a book)
• Gerundive (reading of a book)
• Relative Clause (that has a green cover)
English Syntax: A Sample – contd.

Noun Phrase:
• Before the head noun:
– Pre-determiner Determiner Post-determiner (Adjective|Noun) Noun
• After the head noun (Modifiers)
– Preposition phrases
– Relative Clauses (the book that has only one sentence)
– Gerundive (the flight arriving after 10pm)
Auxiliary Verbs
• Modal (could, might, will, should…) < perfect (have) < progressive (be) <
passive (be)
• “might have been destroyed”
Large wide-coverage grammars have been developed/under
development
• XTAG (www.cis.upenn.edu/~xtag), HPSG, LFG
Two Representations of Syntactic Structure

Phrase structure: illustrates the constituents and their types.
Dependency structure: relations between words without intervening structure.

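[Figure: phrase structure and dependency structure for “the boy reads a book slowly”]
Phrase structure: (S (NP (DetP the) boy) reads (NP (DetP a) book) (Adv slowly))
Dependency structure: reads → boy (arg0), reads → book (arg1), reads → slowly (adj), boy → the (fw), book → a (fw)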
Context-Free Grammars

String Rewriting Systems


• Transform one string to another (until termination)
G = (V, T, P, S)
where V: vocabulary of non-terminals
T: vocabulary of terminals
S: start symbol
P: set of productions of the form
A → α, where A ∈ V and α ∈ (V ∪ T)*
Derivation: repeatedly rewrite a non-terminal using a production of the grammar until no non-terminals remain in the string.
• Start with “S”
Sample Context-Free Grammar, derivation and derived structure.
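A derivation can be traced step by step with a small sketch. The grammar below is a toy grammar assumed for illustration (it is not the sample grammar from the slide), and `leftmost_derivation` is a name introduced here; it always rewrites the leftmost non-terminal, using a caller-supplied list of production choices.

```python
# Toy grammar: each non-terminal maps to a list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N": [["boy"], ["book"]],
    "V": [["reads"]],
}

def leftmost_derivation(grammar, choices, start="S"):
    """Rewrite the leftmost non-terminal until only terminals remain.

    `choices` gives, for each rewriting step, the index of the production
    to apply to the non-terminal being rewritten.
    """
    form = [start]
    steps = [list(form)]
    choices = iter(choices)
    while any(symbol in grammar for symbol in form):
        i = next(k for k, symbol in enumerate(form) if symbol in grammar)
        rhs = grammar[form[i]][next(choices)]
        form = form[:i] + rhs + form[i + 1:]
        steps.append(list(form))
    return steps

steps = leftmost_derivation(GRAMMAR, [0, 0, 0, 0, 0, 0, 0, 1, 1])
for step in steps:
    print(" ".join(step))
```

The last line printed is the derived string; the sequence of rewriting steps is the information that a phrase-structure tree records.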
Two Representations

String rewriting system: we derive a string (= derived structure).
But the derivation history is represented by a phrase-structure tree (= derivation structure)!
Grammar Equivalence
• Can have different grammars that generate same set of strings (weak
equivalence)
• Can have different grammars that have same set of derivation trees (strong
equivalence)
• Strong equivalence implies weak equivalence
CFG Normal Forms:
• Chomsky Normal Form (A → B C or A → a)
• Greibach Normal Form (A → a α, where a is a terminal and α is a possibly empty string of non-terminals)
• Convert a grammar into CNF and GNF
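As a starting point for the conversion exercise, the following sketch checks whether a grammar is already in Chomsky Normal Form. The function `is_cnf` and both toy grammars are assumptions made for illustration, and the special case that allows S → ε in some definitions is ignored.

```python
def is_cnf(productions, nonterminals):
    """True if every production is A -> B C (two non-terminals) or A -> a (one terminal)."""
    for lhs, rhs in productions:
        if lhs not in nonterminals:
            return False
        if len(rhs) == 2 and all(symbol in nonterminals for symbol in rhs):
            continue  # A -> B C
        if len(rhs) == 1 and rhs[0] not in nonterminals:
            continue  # A -> a
        return False
    return True

# A flat rule with three symbols on the right-hand side is not in CNF.
flat = [("VP", ["V", "NP", "PP"])]
# Binarized version with a new non-terminal X (hypothetical name).
binarized = [("VP", ["V", "X"]), ("X", ["NP", "PP"])]

print(is_cnf(flat, {"VP", "V", "NP", "PP"}))
print(is_cnf(binarized, {"VP", "V", "NP", "PP", "X"}))
```

The two rule sets expand VP into the same strings but assign different trees, so binarization preserves weak equivalence and not strong equivalence.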
Penn Treebank (PTB)

Syntactically annotated corpus (phrase structure)
Contains 1 million words of Wall Street Journal sentences marked up with syntactic structure.
• Can be converted into a dependency Treebank.
– need for head percolation tables
• Completely flat structure in NP
– brown bag lunch, pink-and-yellow child seat
• Represents a particular linguistic theory
PropBank
• PTB with some grammatical relations made explicit
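The conversion from phrase structure to dependencies mentioned above relies on a head percolation table that names the head child of each constituent type. The sketch below uses a heavily simplified table (one head category per constituent) that is assumed for illustration; real tables list prioritized candidates and search directions. The names `HEAD_CHILD` and `to_dependencies` are introduced here.

```python
# Simplified head table: constituent label -> label of its head child (assumption).
HEAD_CHILD = {"S": "VP", "VP": "V", "NP": "N", "PP": "P"}

def to_dependencies(tree, deps):
    """Return the head word of `tree` and append (head, dependent) pairs to `deps`.

    Trees are nested lists [label, child, ...]; a pre-terminal is [POS, word].
    """
    label, children = tree[0], tree[1:]
    if isinstance(children[0], str):  # pre-terminal: its head is the word itself
        return children[0]
    heads = [to_dependencies(child, deps) for child in children]
    head_index = next(i for i, child in enumerate(children)
                      if child[0] == HEAD_CHILD[label])
    # Every non-head child's head word depends on the head child's head word.
    for i, head in enumerate(heads):
        if i != head_index:
            deps.append((heads[head_index], head))
    return heads[head_index]

tree = ["S",
        ["NP", ["Det", "the"], ["N", "boy"]],
        ["VP", ["V", "reads"],
               ["NP", ["Det", "a"], ["N", "book"]]]]
deps = []
root = to_dependencies(tree, deps)
print(root)
print(sorted(deps))
```

The head word percolates upward from each head child, so the verb ends up as the root of the dependency structure.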
Unification
Mechanism needed to pass and check constraints.
Constraints, syntactic and semantic:
• Subject-verb agreement
– S → NP VP
– the boy reads / the boys read / * the boys reads
• Subject/Auxiliary inversion: (Yes-no-question)
– S → AuxVerb NP VP
– Do you have flights / * does you have flights
• Selectional restrictions:
– An apple reads a book
Need a mechanism to encode these constraints
• Refine the non-terminal set to encode these constraints.
• S → 3sgAux 3sgNP VP; 3sgAux → does | has …
• S → Non3sgAux Non3sgNP VP; Non3sgAux → do | have | can
• We need to split the NP rule into the 3sgNP and Non3sgNP.
• The size of the grammar grows.
• Can we factor these constraints out of the structure of the rules?
Unification – contd.

Attribute value matrix:
boy: [Cat N, Number sg, Person 3]
boys: [Cat N, Number pl, Person 3]
read: [Cat V, Subj agr: [Number sg, Person 1|2] or [Number pl]]
reads: [Cat V, Subj agr: [Number sg, Person 3]]

Percolate constraints (VP → V):
VP.subj.agr.number = V.subj.agr.number
VP.subj.agr.person = V.subj.agr.person
Check constraints (S → NP VP):
NP.number = VP.subj.agr.number
NP.person = VP.subj.agr.person
The boy reads / * the boys reads / the boys read
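The agreement check can be sketched with a small unification routine over feature structures. The encoding is an assumption made for this illustration: nested dictionaries for structures, sets of atoms for values (so that a disjunction such as 1|2 is the set {1, 2}), and a list of alternatives for a verb that accepts more than one kind of subject. The names `unify` and `agrees` are introduced here.

```python
def unify(a, b):
    """Unify two feature structures; return the result, or None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None      # clash on a shared feature
                result[key] = merged
            else:
                result[key] = value  # feature present on one side only
        return result
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        common = a & b               # atoms allowed by both sides
        return common or None
    return None

def agrees(noun, subject_options):
    """True if the noun unifies with at least one subject agreement option."""
    return any(unify(noun, option) is not None for option in subject_options)

boy = {"number": {"sg"}, "person": {3}}
boys = {"number": {"pl"}, "person": {3}}
reads_subj = [{"number": {"sg"}, "person": {3}}]
read_subj = [{"number": {"sg"}, "person": {1, 2}}, {"number": {"pl"}}]

print(agrees(boy, reads_subj))   # the boy reads
print(agrees(boys, reads_subj))  # * the boys reads
print(agrees(boys, read_subj))   # the boys read
```

The grammar rule S → NP VP stays the same for every combination; only the feature structures attached to the words differ.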


Structural Priming

The structure of preceding sentences helps or hinders the reading times of subsequent sentences.
• Dative alternation
– The woman gave her car to the church
– The woman gave the church her car
• One of these forms is primed, depending on the structure of the prime
– V NP NP → gave the church her car
– V NP PP → gave her car to the church
Spoken Language Syntax

Not as “clean”: rampant disfluency.
• Edits (restarts, repairs)
• Filled pauses
• Ungrammaticality
Sentence ≠ utterance.
“Clean up” the utterance first before understanding it.
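A very rough idea of the clean-up step is sketched below. It handles only two of the simplest cases, filled pauses and immediate word repetitions; restarts and repairs need far more than this. The pause list and the function `clean_utterance` are assumptions made for illustration.

```python
FILLED_PAUSES = {"uh", "um", "er"}  # illustrative list, not exhaustive

def clean_utterance(utterance):
    """Drop filled pauses and immediate word repetitions from an utterance."""
    cleaned = []
    for word in utterance.lower().split():
        if word in FILLED_PAUSES:
            continue  # filled pause
        if cleaned and cleaned[-1] == word:
            continue  # the word repeats the previous kept word
        cleaned.append(word)
    return " ".join(cleaned)

print(clean_utterance("I uh I want um a flight to to Boston"))
```

The repetition rule would also delete legitimate repetitions (for example “had had”), which is one reason practical systems treat disfluency detection as a modeling problem and not a filter.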
