
(S
  (PP (IN in)
      (NP (PRP$ their) (JJ public) (NNS lectures)))
  (NP (PRP they))
  (VP (VBP have)
      (RB even)
      (VP (VBN claimed)
          (SBAR (IN that)
                (S
                   (NP (DT the) (JJ only) (NN evidence)
                       (SBAR (IN that)
                             (S (NP (NNP Khufu))
                                (VP (VBD built)
                                    (NP (DT the) (NN pyramid))))))
                   (VP (VBZ is)
                       (NP (NP (DT the) (NN graffiti))
                           (VP (VBN found)
                               (PP (IN in)
                                   (NP (DT the) (CD five) (NNS chambers)))))))))))

Tree Syntax of Natural Language

Lecture Note 1 for COM S 474

Mats Rooth

Introduction
In linguistics and natural language processing, it is common to attribute labeled tree
structures called syntactic trees or parse trees to phrases and sentences of human
languages. An example is found above. The tree consists of a set of vertices (also
known as nodes or addresses), including a unique root vertex which is drawn at the
top. Each vertex has a label and an ordered sequence of children. In the example,
the root vertex has label S and three children, which (in order) have labels PP, NP
and VP. The child labeled VP has three children, which (in order) have the labels
VBP, RB and VP. The child of VP with label VBP has one child, which has the label
“have”, and the vertex labeled “have” has no children. Vertices which have no
children are called terminal nodes. Other nodes are non-terminal nodes. A vertex
right above a terminal node is a pre-terminal node. Table 1 gives the conventional
long pronunciations of the pre-terminal labels used in the example tree. These pre-
terminal labels correspond to the parts of speech of traditional grammar. In NLP
usage, the term part of speech is lengthened to part of speech tag, and then shortened
to tag. So in this tree, the tag for built is VBD.
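The definitions above can be tried out on a small fragment of the example tree. The following is a minimal sketch, assuming a tree is encoded as a nested tuple whose first element is the label and whose remaining elements are the children, with terminals as plain strings. This encoding and the function names are illustrative choices, not something the notes prescribe, and the SBAR child of the inner VP is left out to keep the fragment short.

```python
# A tree is a nested tuple: (label, child, child, ...). A terminal is a bare string.
# Fragment of the example tree: the VP "have even claimed" (its SBAR child is omitted).
VP = ("VP",
      ("VBP", "have"),
      ("RB", "even"),
      ("VP", ("VBN", "claimed")))


def is_terminal(node):
    """A terminal node has no children; in this encoding it is a plain string."""
    return isinstance(node, str)


def is_preterminal(node):
    """A pre-terminal node sits right above a single terminal node."""
    return (not is_terminal(node)) and len(node) == 2 and is_terminal(node[1])


def terminals(node):
    """Return the terminal labels (the words) in left-to-right order."""
    if is_terminal(node):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(terminals(child))
    return words


def tagged_words(node):
    """Return (word, tag) pairs, one for each pre-terminal node."""
    if is_terminal(node):
        return []
    if is_preterminal(node):
        return [(node[1], node[0])]
    pairs = []
    for child in node[1:]:
        pairs.extend(tagged_words(child))
    return pairs


print(terminals(VP))
print(tagged_words(VP))
```

Reading the pairs off the pre-terminals is what makes the tag of a word a property of its position in the tree rather than of the word alone.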


TABLE 1.

label   long name                                     example
NN      singular noun                                 pyramid
NNS     plural noun                                   lectures
NNP     proper noun                                   Khufu
VBD     past tense verb                               claimed
VBZ     3rd person singular present tense verb        is
VBP     non-3rd person singular present tense verb    have
VBN     past participle                               found
PRP     pronoun                                       they
PRP$    possessive pronoun                            their
JJ      adjective                                     public
IN      preposition                                   in
IN      complementizer                                that
DT      determiner                                    the

Table 2 gives the other non-terminal labels in the tree. The labels ending in the
letter P are known as phrasal categories, such as noun phrase and verb phrase. A
noun phrase is, roughly speaking, a phrase organized around a noun. This noun is
known as the head of the phrase. The head of the NP their public lectures is lectures,
and the head of the NP beginning the only evidence is evidence. Similarly, a verb
phrase is a phrase organized around a verb, and a prepositional phrase is a phrase
organized around a preposition.

TABLE 2.

label   long name               example (represented by terminal string)
NP      noun phrase             their public lectures
VP      verb phrase             built the pyramid
PP      prepositional phrase    in the five chambers
S       sentence                Khufu built the pyramid
SBAR    sbar                    that Khufu built the pyramid

It is useful to become familiar with the symbols used in syntactic trees, and with
the tree analysis of common constructions and sentence types. In these notes, we
use the system of tree annotations from the Penn Treebank of English, which is a
database of trees for about 50,000 English sentences. The system is on one hand a
scientific hypothesis about the structure of the English language, and on the other
hand an engineering standard which is used in designing and testing NLP systems.
Treebanks for other languages (such as Chinese and German) have been published.
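Treebank trees are commonly written as bracketed strings, with each vertex rendered as an opening parenthesis, its label, its children, and a closing parenthesis. The following is a minimal sketch of a reader for that notation, assuming the simple form in which every token is a parenthesis, a label, or a word. The function names and the nested-tuple result are illustrative assumptions, and the sketch ignores any further detail that released treebank files may carry.

```python
def tokenize(text):
    """Split a bracketed string into parentheses, labels, and words."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text):
    """Turn '(S (NP ...) (VP ...))' into a nested tuple (label, child, ...)."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token          # a terminal word
        label = tokens[pos]       # the first token after "(" is the label
        pos += 1
        children = []
        while tokens[pos] != ")":
            children.append(read())
        pos += 1                  # skip the closing ")"
        return (label, *children)

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected input after the first tree")
    return tree


tree = parse("(S (NP (NNP Khufu)) (VP (VBD built) (NP (DT the) (NN pyramid))))")
print(tree)
```

The recursion mirrors the definition of a tree: a vertex is a label together with an ordered sequence of children, each of which is again a tree or a terminal.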


Tensed sentences and VP recursion


A minimal sentence in English consists of a subject noun phrase such as the temperature
and a tensed verb phrase such as dropped or is high. S is the label for the sentence. In
the tag for the verb heading the VP, there is a three-way distinction between past tense
(tag VBD for the pre-terminal above the verb), 3rd person singular present tense (tag
VBZ) and non-3rd person singular present tense (tag VBP):

(S (NP (NNS temperatures)) (VP (VBD dropped)))
(S (NP (NNS temperatures)) (VP (VBP are) (ADJP (JJ high))))
(S (NP (DT the) (NN temperature)) (VP (VBZ is) (ADJP (JJ high))))

This distinction is expressed in the part of speech tag, but not in the VP or S label.

Where there are auxiliary verbs (such as the modal verbs will, can, and may, or various
forms of have and be), the verbs are arrayed in a right-branching structure of VPs:

(S (NP (PRP he)) (VP (VBD was) (VP (VBG sleeping))))
(S (NP (PRP she)) (VP (MD will) (VP (VB sleep))))
(S (NP (PRP he)) (VP (MD may) (VP (VB have) (VP (VBN)))))
(S (NP (PRP she)) (VP (MD may) (VP (VB have) (VP (VBN) (VP)))))
(S (NP (PRP he)) (VP (VBZ has) (VP (VBN slept))))

The rightmost verbs in these structures are called main verbs, in opposition to
auxiliary verbs. However, in the Penn Treebank tag vocabulary, auxiliary verbs are not
given tags different from those of main verbs, with the exception of modals and to.
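Because the tags do not separate auxiliaries from main verbs, the distinction has to be read off the tree. The following is a minimal sketch, assuming trees are encoded as nested tuples (label, child, ...) with words as plain strings. It walks down the right-branching VPs, collects the verbs in order, and treats the last one as the main verb. The helper names are hypothetical.

```python
# Tags that mark a verb-like pre-terminal, including modals and "to".
VERB_TAGS = {"VB", "VBD", "VBZ", "VBP", "VBG", "VBN", "MD", "TO"}

HAS_SLEPT = ("S", ("NP", ("PRP", "he")),
             ("VP", ("VBZ", "has"), ("VP", ("VBN", "slept"))))
WILL_SLEEP = ("S", ("NP", ("PRP", "she")),
              ("VP", ("MD", "will"), ("VP", ("VB", "sleep"))))


def find_child(node, label):
    """Return the first child of node carrying the given label, or None."""
    for child in node[1:]:
        if child[0] == label:
            return child
    return None


def verb_chain(sentence):
    """Collect (word, tag) for each verb along the nested VPs, outermost first."""
    chain = []
    vp = find_child(sentence, "VP")
    while vp is not None:
        for child in vp[1:]:
            if child[0] in VERB_TAGS:
                chain.append((child[1], child[0]))
        vp = find_child(vp, "VP")   # descend into the embedded VP, if any
    return chain


for sentence in (HAS_SLEPT, WILL_SLEEP):
    chain = verb_chain(sentence)
    auxiliaries, main_verb = chain[:-1], chain[-1]
    print("auxiliaries:", auxiliaries, "main verb:", main_verb)
```

In a sentence with a single verb the chain has one element, so the list of auxiliaries is empty and that verb is the main verb.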

Part of speech tags for verbs


Here is the complete vocabulary of verb tags.

TABLE 3.

Tag   Long name                                Example
VBD   past tense                               He ate/VBD the cookies.
                                               She answered/VBD the question.
VBZ   present tense, 3rd person singular       He likes/VBZ cookies.
VBP   present tense, non-3rd person singular   They like/VBP cookies.
                                               They answer/VBP such questions.
                                               They are/VBP tired.
VB    base                                     He may like/VB cookies.
                                               I heard her answer/VB the question.
                                               They may be/VB tired.
VBG   present participle, G-form               Eating/VBG cookies is unhealthy.
                                               He likes eating/VBG cookies.
VBN   past participle, N-form                  He has eaten/VBN the cookies.
                                               She has answered/VBN the questions.
                                               My question was not answered/VBN.
MD    modal                                    She will/MD prevail.
TO    auxiliary to                             She expects to/TO prevail.

Most distinctions between tags correspond to overt differences in the form of the
verb. VBP and VB systematically have the same form, with the exception of are/be.
For many verbs, including the regular ones (such as answer), there is no distinction in
form between VBD and VBN. In general, the assignment of tags is determined by
context in the tree, not just by word form.
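To see how often one word form carries more than one verb tag, we can collect the tags per form. The following is a minimal sketch that uses only the tagged verbs from the examples in Table 3; the variable and function names are illustrative.

```python
from collections import defaultdict

# (word, tag) pairs taken from the example sentences in Table 3.
TAGGED = [
    ("ate", "VBD"), ("answered", "VBD"),
    ("likes", "VBZ"),
    ("like", "VBP"), ("answer", "VBP"), ("are", "VBP"),
    ("like", "VB"), ("answer", "VB"), ("be", "VB"),
    ("eaten", "VBN"), ("answered", "VBN"),
]


def tags_by_form(pairs):
    """Map each word form to the set of tags it was seen with."""
    table = defaultdict(set)
    for word, tag in pairs:
        table[word].add(tag)
    return dict(table)


table = tags_by_form(TAGGED)
ambiguous = {word: sorted(tags) for word, tags in table.items() if len(tags) > 1}
print(ambiguous)
```

The forms that come out as ambiguous are exactly the VB/VBP and VBD/VBN pairs discussed above, which is why a tagger has to look at context.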

The VB form of a verb is a “base” form of the verb in that, in the case of regular verbs,
other forms are derived from it by adding suffixes. This process may be accompanied
by minor alterations in spelling, such as consonant doubling (sit/VB, sitting/VBG)
or deletion of an e (site/VB, siting/VBG). Such processes are much more
elaborate in other languages.
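The two spelling alterations just mentioned can be written down as rules. The following is a rough sketch for deriving the G-form from the base form. The conditions it uses (a single run of vowel letters for doubling, a preceding vowel for e-deletion) are simplifying assumptions made here for illustration, not rules stated in these notes, and irregular spellings and many exceptions are not handled.

```python
VOWELS = set("aeiou")


def vowel_groups(word):
    """Count maximal runs of vowel letters, a crude stand-in for syllables."""
    count = 0
    previous_was_vowel = False
    for letter in word:
        is_vowel = letter in VOWELS
        if is_vowel and not previous_was_vowel:
            count += 1
        previous_was_vowel = is_vowel
    return count


def g_form(base):
    """Derive a VBG spelling from a VB spelling with two rough rules."""
    # Deletion of a final e (site -> siting), but not in "be" or "see".
    if (base.endswith("e") and not base.endswith("ee")
            and vowel_groups(base[:-1]) >= 1):
        return base[:-1] + "ing"
    # Consonant doubling (sit -> sitting): one vowel run, and the word
    # ends in consonant + vowel + consonant (w, x and y are not doubled).
    if (len(base) >= 3 and vowel_groups(base) == 1
            and base[-1] not in VOWELS and base[-1] not in "wxy"
            and base[-2] in VOWELS and base[-3] not in VOWELS):
        return base + base[-1] + "ing"
    return base + "ing"


for verb in ["sit", "site", "answer", "eat", "sleep", "be"]:
    print(verb, g_form(verb))
```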

To is considered an auxiliary verb because it is found in VP recursion structures
similar to what is found with modal verbs:

(S (NP (PRP they))
   (VP (VBD believed)
       (SBAR (IN that)
             (S (NP (NNS prices))
                (VP (MD would) (VP (VB rise)))))))

(S (NP (PRP they))
   (VP (VBD waited)
       (SBAR (IN for)
             (S (NP (NNS prices))
                (VP (TO to) (VP (VB rise)))))))


Noun phrases
A minimal noun phrase consists of just a noun:
(NP (NN film))
(NP (NNS movies))
(NP (NNP Casablanca))
(NP (NNPS Casablancans))

A singular noun has tag NN, a plural noun has tag NNS, a singular proper noun has
tag NNP, and a plural proper noun has tag NNPS.

What is called a determiner may be added at the start of the noun phrase:
(NP (DT every) (NN film))
(NP (DT most) (NN entertainment))
(NP (DT no) (NNS movies))

Some determiners can form noun phrases in isolation. The interpretation is elliptical,
meaning that the understood noun is picked up from context.
Many impressed me.
Each impressed me.
Some impressed me.
*The impressed me.
*A impressed me.
*Every impressed me.

In the tree structure for these examples, an NP node dominates a DT and nothing
else:
(NP (DT some))

The star notation used above is used to mark sentences which do not sound right to
the native speaker, and which, though they may possibly be comprehensible, would
not be used. Such sentences are ungrammatical in the language under discussion.
Scientific and technical work on human language takes a naturalistic view on what
counts as grammatical: if a sentence sounds right to native speakers of the language,
or if one can find the sentence (or a corresponding sentence pattern) being used
regularly, then the sentence is considered grammatical.

The noun in an NP can be preceded by a variety of modifiers, notably adjectives and
other nouns, but also including G-form and N-form verbs:

(NP (DT a) (JJ weak) (NN economy))
(NP (DT the) (NN priority) (NN list))
(NP (DT a) (VBG slowing) (NN economy))
(NP (DT a) (VBN disputed) (NN ruling))

Modifiers can be combined, making the NP longer:

(NP (DT the) (NN state) (NN teacher) (NN cadet) (NN program))

Arguably, sequences of modifiers have internal structure. There are two meanings
for school law review (a law review at a school, and a review of school law, possibly
performed in another institution such as a legislature). These correlate with two
intonations (with primary stress on law, and primary stress on school, respectively).
It is plausible to attribute these different meanings and pronunciations to different
tree structures, along the following lines.

(NP (DT a) (NN school) (NN (NN law) (NN review)))
(NP (DT a) (NN (NN school) (NN law)) (NN review))

As an approximation, a flat structure is used.
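One reason a flat structure is a convenient approximation is that the number of possible internal structures grows quickly with the number of nouns. The following is a minimal sketch that enumerates every binary bracketing of a noun sequence. Restricting attention to binary groupings is an assumption made here for illustration.

```python
def bracketings(words):
    """Return every binary bracketing of a word sequence as nested pairs."""
    if len(words) == 1:
        return [words[0]]
    results = []
    # Try every split point, then combine each left analysis with each right one.
    for split in range(1, len(words)):
        for left in bracketings(words[:split]):
            for right in bracketings(words[split:]):
                results.append((left, right))
    return results


for analysis in bracketings(["school", "law", "review"]):
    print(analysis)

# The four nouns of "the state teacher cadet program" allow more groupings.
print(len(bracketings(["state", "teacher", "cadet", "program"])))
```

For three nouns the function yields exactly the two groupings discussed above, one with law review as a unit and one with school law as a unit.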

Prepositional phrases
A typical prepositional phrase consists of a preposition (tag IN) followed by a noun
phrase. The tree structure is as follows.
(PP (IN in) (NP (NNP Novgorod)))
(PP (IN with) (NP (NNP Wes)))
(PP (IN on) (NP (PRP$ your) (NN birthday)))

There are some systematic semantic subclasses of prepositional phrases:

TABLE 4.

class of PPs examples


temporal on Monday, in November, after lunch
locative in Ithaca, on campus, under the sheet
path through downtown, into Barcelona


Complementation
A simple transitive sentence such as the cat ate the rat consists of a subject, a verb,
and an object. The object is an NP just like the subject, and it is represented as a
child of VP:

(S (NP (DT the) (NN cat))
   (VP (VBD ate)
       (NP (DT the) (NN rat))))

The object NP is said to be a complement of the verb ate. Ditransitive verbs are
found with two noun phrase complements:

(S (NP (DT the) (NN sheriff))
   (VP (VBD gave)
       (NP (PRP him))
       (NP (DT a) (NN summons))))

Prepositional complements
PP complements are PP children of VP, occurring alone or with another complement:

(S (NP (PRP I))
   (VP (VBP depend)
       (PP (IN on) (NP (PRP her)))))

(S (NP (PRP I))
   (VP (VBD sent)
       (NP (DT an) (NN email))
       (PP (IN to) (NP (PRP her)))))

(S (NP (PRP I))
   (VP (VBD spoke)
       (PP (IN to) (NP (PRP her)))
       (PP (IN about) (NP (PRP him)))))
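Since complements are children of VP, the pattern of complements a verb occurs with can be read off the tree. The following is a minimal sketch, assuming trees are encoded as nested tuples (label, child, ...) and that every non-verb child of the VP counts as a complement. That second assumption is a simplification, since a VP may also contain children that are not complements. The function name is hypothetical.

```python
VERB_TAGS = {"VB", "VBD", "VBZ", "VBP", "VBG", "VBN"}

DEPEND = ("VP", ("VBP", "depend"),
          ("PP", ("IN", "on"), ("NP", ("PRP", "her"))))
SENT = ("VP", ("VBD", "sent"),
        ("NP", ("DT", "an"), ("NN", "email")),
        ("PP", ("IN", "to"), ("NP", ("PRP", "her"))))
SPOKE = ("VP", ("VBD", "spoke"),
         ("PP", ("IN", "to"), ("NP", ("PRP", "her"))),
         ("PP", ("IN", "about"), ("NP", ("PRP", "him"))))


def complement_frame(vp):
    """Return (verb, labels of the other children) for a VP headed by one verb."""
    verb = None
    frame = []
    for child in vp[1:]:
        if verb is None and child[0] in VERB_TAGS:
            verb = child[1]          # the head verb of this VP
        else:
            frame.append(child[0])   # treated here as a complement label
    return verb, frame


for vp in (DEPEND, SENT, SPOKE):
    print(complement_frame(vp))
```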


Clausal complements
Clausal complements are sentences embedded as complements of a verb. Like other
complements of verbs, they are children of VP:

(S (NP (PRP he))
   (VP (VBZ knows)
       (SBAR (IN that)
             (S (NP (PRP he))
                (VP (VBD made)
                    (NP (DT a) (NN mistake)))))))

(S (NP (PRP he))
   (VP (MD will)
       (ADVP (RB never))
       (VP (VB know)
           (SBAR (IN whether)
                 (S (NP (PRP he))
                    (VP (VBD made)
                        (NP (DT a) (NN mistake))))))))

The label for the complement is SBAR; the SBAR begins with a complementizer
such as that, whether, or if. The complementizers have the prepositional tag IN. The
SBAR has an S child, which in these examples is a tensed sentence. Even if there is
no complementizer, an SBAR node is present:

(S (NP (PRP he))
   (VP (VBZ knows)
       (SBAR (S (NP (PRP he))
                (VP (VBD made)
                    (NP (DT a) (NN mistake)))))))

An alternative label for the complementizer is C, and an alternative label for SBAR
is CP (complementizer phrase).


Selection
It is characteristic of complementation that the kind of complement which is possible
correlates with the verb. If we switch the verbs in the examples above, the result
is often an ungrammatical sentence:
*I depend her.
*I ate to her about him.
*He believed to her.
*He spoke whether he made a mistake.

A verb is said to select the complement or pattern of complements it can occur with.
The complements that a verb can occur with are a property of the individual word,
and this information is typically listed in a computational dictionary.
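A computational dictionary of this kind can be pictured as a table from verbs to the complement patterns they select. The following is a minimal sketch. The entries are hypothetical and list only the patterns attested in the grammatical examples of these notes, so a missing pattern here says nothing about English in general.

```python
# Hypothetical lexicon: each verb form maps to the complement frames it selects.
LEXICON = {
    "ate": [("NP",)],
    "gave": [("NP", "NP")],
    "depend": [("PP",)],
    "sent": [("NP", "PP")],
    "spoke": [("PP", "PP")],
    "knows": [("SBAR",)],
}


def selects(verb, frame):
    """Return True if the lexicon lists this complement frame for the verb."""
    return tuple(frame) in LEXICON.get(verb, [])


print(selects("depend", ["PP"]))        # I depend on her.
print(selects("depend", ["NP"]))        # *I depend her.
print(selects("ate", ["PP", "PP"]))     # *I ate to her about him.
print(selects("knows", ["SBAR"]))       # He knows that he made a mistake.
```

A parser can consult such a table to reject analyses in which a verb appears with a pattern of complements that it does not select.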

Some verbs with prepositional complements select particular prepositions:


I depend on/*in her.
He yearned for/*to an ice cream cone.

Others select a semantic class of prepositional phrases:


He left the paper in the trash. (Location)
*He left the paper into the trash. (Path)
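Both kinds of selection can be stored in one dictionary entry per verb: either a set of particular prepositions, or a semantic class of prepositional phrases. The following is a minimal sketch. The class assigned to each preposition follows the examples in Table 4, and treating a preposition as belonging to a single class is a simplifying assumption. All names are hypothetical.

```python
# Semantic class of a PP, keyed by its preposition (simplified from Table 4).
PP_CLASS = {
    "in": "locative", "on": "locative", "under": "locative",
    "through": "path", "into": "path",
}

# Hypothetical entries: a verb selects particular prepositions or a class of PPs.
SELECTION = {
    "depend": {"preposition": {"on"}},
    "yearned": {"preposition": {"for"}},
    "left": {"pp_class": {"locative"}},
}


def pp_allowed(verb, preposition):
    """Check a PP complement against the verb's entry; unknown verbs are rejected."""
    entry = SELECTION.get(verb)
    if entry is None:
        return False
    if "preposition" in entry:
        return preposition in entry["preposition"]
    return PP_CLASS.get(preposition) in entry["pp_class"]


print(pp_allowed("depend", "on"), pp_allowed("depend", "in"))
print(pp_allowed("yearned", "for"), pp_allowed("yearned", "to"))
print(pp_allowed("left", "in"), pp_allowed("left", "into"))
```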
