NLP - Unit 2
Natural Language Processing (CS735E01) – 7th Semester
Mithun B N
Asst. Prof
Unit 2:
Grammars and Parsing
Some Basic Feature Systems for English
Person and Number Features
• Words may be classified as to whether they can describe a single
object or multiple objects.
• Subjects and verbs must also agree on a second dimension: person.
• The possible values of this dimension are:
• First Person: I, We
• Second Person: you
• Third Person: one or more objects.
• It is convenient to combine the number and person features into a single feature, AGR, which has six possible values:
• First person singular – 1s
• Second person singular – 2s
• Third person singular – 3s
• First person plural – 1p
• Second person plural – 2p
• Third person plural – 3p
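The six AGR codes can be generated from person and number, and agreement can be checked as a set intersection. The following is a minimal sketch; the names `make_agr`, `AGR_VALUES` and `agree` are hypothetical helpers introduced here for illustration, not part of the course material.

```python
# Hypothetical helpers: AGR codes are strings such as "3s" (third person singular).
PERSONS = (1, 2, 3)
NUMBERS = {"singular": "s", "plural": "p"}


def make_agr(person, number):
    """Combine a person (1, 2, 3) and a number into one AGR code."""
    if person not in PERSONS or number not in NUMBERS:
        raise ValueError(f"unknown person/number: {person!r}, {number!r}")
    return f"{person}{NUMBERS[number]}"


# The six possible AGR values: 1s, 2s, 3s, 1p, 2p, 3p.
AGR_VALUES = {make_agr(p, n) for p in PERSONS for n in NUMBERS}


def agree(subject_agr, verb_agr):
    """Return the AGR values shared by subject and verb; an empty set means they clash."""
    return set(subject_agr) & set(verb_agr)


# A 3s subject agrees with a 3s verb, but a 3p subject does not.
print(agree({"3s"}, {"3s"}))
print(agree({"3p"}, {"3s"}))
```

Representing AGR as a set lets one entry cover several values at once, as with a word that is both 3s and 3p.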
Verb-Form Features and Verb Subcategorization
• The feature VFORM takes the following values:
• Base – base form ( go, be, say, decide)
• Pres – simple present tense (go, goes, am, is, say, says, decide)
• Past – simple past tense (went, was, said, decided)
• Fin – finite (that is, a tensed form, equivalent to {pres past})
• Ing – present participle (going, being, saying, deciding)
• Pastprt – past participle (gone, been, said, decided)
• Inf – used for infinitive forms with the word to
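As a small illustration, the VFORM values can be held in a set, with the finite forms as a subset. This sketch assumes that fin abbreviates the two tensed forms pres and past; the names `VFORMS`, `FINITE` and `is_finite` are hypothetical.

```python
# VFORM values from the list above (fin is treated as shorthand, not a separate value).
VFORMS = {"base", "pres", "past", "ing", "pastprt", "inf"}

# Assumption: a finite (tensed) verb form is either pres or past.
FINITE = {"pres", "past"}


def is_finite(vform):
    """Return True if the given VFORM value is a tensed form."""
    if vform not in VFORMS:
        raise ValueError(f"unknown VFORM: {vform!r}")
    return vform in FINITE


print(is_finite("past"))
print(is_finite("ing"))
```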
The SUBCAT values for NP/VP combinations
Value          Example verb    Example
_none          Laugh           Jack laughed
_np            Find            Jack found a key
_np_np         Give            Jack gave Sue the paper
_vp:inf        Want            Jack wants to fly
_np_vp:inf     Tell            Jack told the man to go
_vp:ing        Keep            Jack keeps hoping for the best
_np_vp:ing     Catch           Jack caught Sam looking at his desk
_np_vp:base    Watch           Jack watched Sam look at his desk
• Many verbs have complement structures that require a prepositional
phrase with a particular preposition, or one that plays a particular
role.
• A feature PFORM is introduced on prepositional phrases.
• A prepositional phrase with the PFORM value TO must have the preposition to as its head; one with the PFORM value LOC must describe a location.
• Ex: Yogesh gave the money to the customer.
• Another useful PFORM value is MOT, used with verbs such as ‘walk’; it marks phrases that describe some aspect of a path.
• Ex: Geetha and Preetha walked to the college.
• The prepositions to, from and along are used to create such phrases.
PFORM feature for prepositional phrases
Value    Example prepositions             Example
TO       to                               I gave it to the bank
LOC      in, on, by, inside, on top of    I put it on the bank
MOT      to, from, along                  I walked to the store
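The table can be turned into a lookup from a preposition to the PFORM values it may carry. This is only a sketch built from the example prepositions listed above, which are not exhaustive; `PFORM_PREPOSITIONS` and `pform_values` are hypothetical names.

```python
# Example prepositions per PFORM value, copied from the table (not an exhaustive list).
PFORM_PREPOSITIONS = {
    "TO": {"to"},
    "LOC": {"in", "on", "by", "inside", "on top of"},
    "MOT": {"to", "from", "along"},
}


def pform_values(preposition):
    """Return every PFORM value whose example list contains the preposition."""
    return {
        pform
        for pform, prepositions in PFORM_PREPOSITIONS.items()
        if preposition in prepositions
    }


# "to" is ambiguous between TO and MOT; the verb's SUBCAT decides which one is needed.
print(pform_values("to"))
print(pform_values("on"))
```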
Lexical rules for common suffixes on verbs and nouns
• +S is a new lexical category that contains only the suffix morpheme -s.
Present Tense:
1. (V ROOT ?r SUBCAT ?s VFORM pres AGR 3s) → (V ROOT ?r SUBCAT ?s VFORM base IRREG-PRES -) +S
2. (V ROOT ?r SUBCAT ?s VFORM pres AGR {1s 2s 1p 2p 3p}) → (V ROOT ?r SUBCAT ?s VFORM base IRREG-PRES -)
Past Tense:
3. (V ROOT ?r SUBCAT ?s VFORM past AGR {1s 2s 3s 1p 2p 3p}) → (V ROOT ?r SUBCAT ?s VFORM base IRREG-PAST -) +ED
Past Participle:
4. (V ROOT ?r SUBCAT ?s VFORM pastprt) → (V ROOT ?r SUBCAT ?s VFORM base EN-PASTPRT -) +ED
5. (V ROOT ?r SUBCAT ?s VFORM pastprt) → (V ROOT ?r SUBCAT ?s VFORM base EN-PASTPRT +) +EN
Present Participle:
6. (V ROOT ?r SUBCAT ?s VFORM ing) → (V ROOT ?r SUBCAT ?s VFORM base) +ING
Plural Nouns:
7. (N ROOT ?r AGR 3p) → (N ROOT ?r AGR 3s IRREG-PL -) +S
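Rules 1 and 7 can be sketched as functions that take a base lexical entry and return the derived entry, or None when the rule does not apply. The dictionary encoding, the function names, and the convention that a missing IRREG feature counts as "-" are all assumptions made for this illustration.

```python
def apply_present_3s(entry):
    """Rule 1: base verb + S gives a present-tense 3s verb, unless IRREG-PRES is +."""
    if entry.get("CAT") != "V" or entry.get("VFORM") != "base":
        return None
    # Assumption: a feature that is not listed defaults to "-".
    if entry.get("IRREG-PRES", "-") == "+":
        return None
    return {
        "CAT": "V",
        "ROOT": entry["ROOT"],      # ?r is carried over unchanged
        "SUBCAT": entry["SUBCAT"],  # ?s is carried over unchanged
        "VFORM": "pres",
        "AGR": {"3s"},
    }


def apply_plural(entry):
    """Rule 7: singular noun + S gives a 3p noun, unless IRREG-PL is +."""
    if entry.get("CAT") != "N" or "3s" not in entry.get("AGR", set()):
        return None
    if entry.get("IRREG-PL", "-") == "+":
        return None
    return {"CAT": "N", "ROOT": entry["ROOT"], "AGR": {"3p"}}


cry = {"CAT": "V", "ROOT": "CRY1", "VFORM": "base", "SUBCAT": "_none"}
be = {"CAT": "V", "ROOT": "BE1", "VFORM": "base", "IRREG-PRES": "+",
      "IRREG-PAST": "+", "SUBCAT": "_adjp"}
dog = {"CAT": "N", "ROOT": "DOG1", "AGR": {"3s"}}
fish = {"CAT": "N", "ROOT": "FISH1", "AGR": {"3s", "3p"}, "IRREG-PL": "+"}

print(apply_present_3s(cry))  # regular verb: rule applies
print(apply_present_3s(be))   # irregular present: rule blocked
print(apply_plural(dog))      # regular noun: rule applies
print(apply_plural(fish))     # irregular plural: rule blocked
```

The IRREG features act as switches: an irregular word sets the feature to + so that the regular suffix rule cannot produce a wrong form.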
A Lexicon
Example sentences for saw:
• The saw was broken
• Jack wanted me to saw the board in half
• I saw Jack eat the pizza
a: (CAT ART ROOT A1 AGR 3s)
be: (CAT V ROOT BE1 VFORM base IRREG-PRES + IRREG-PAST + SUBCAT {_adjp _np})
cry: (CAT V ROOT CRY1 VFORM base SUBCAT _none)
dog: (CAT N ROOT DOG1 AGR 3s)
fish: (CAT N ROOT FISH1 AGR {3s 3p} IRREG-PL +)
happy: (CAT ADJ SUBCAT _vp:inf)
he: (CAT PRO ROOT HE1 AGR 3s)
is: (CAT V ROOT BE1 VFORM pres SUBCAT {_adjp _np} AGR 3s)
Jack: (CAT NAME AGR 3s)
man: (CAT N1 ROOT MAN1 AGR 3s)
men: (CAT N ROOT MAN1 AGR 3p)
saw: (CAT N ROOT SAW1 AGR 3s)
saw: (CAT V ROOT SEE1 VFORM past SUBCAT _np)
see: (CAT V ROOT SEE1 VFORM base SUBCAT _np IRREG-PAST + EN-PASTPRT +)
seed: (CAT N ROOT SEED1 AGR 3s)
the: (CAT ART ROOT THE1 AGR {3s 3p})
to: (CAT TO)
want: (CAT V ROOT WANT1 VFORM base SUBCAT {_np _vp:inf _np_vp:inf})
was: (CAT V ROOT BE1 VFORM past AGR {1s 3s} SUBCAT {_adjp _np})
were: (CAT V ROOT BE1 VFORM past AGR {2s 1p 2p 3p} SUBCAT {_adjp _np})
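A lexicon of this kind can be stored as a mapping from each word to a list of feature structures, since an ambiguous word such as saw has more than one entry. The sketch below covers only a few of the entries; the dictionary layout and the `lookup` function are assumptions for illustration.

```python
# A few entries from the lexicon, encoded as dictionaries (hypothetical layout).
# Multi-valued features such as AGR are stored as sets.
LEXICON = {
    "a": [{"CAT": "ART", "ROOT": "A1", "AGR": {"3s"}}],
    "the": [{"CAT": "ART", "ROOT": "THE1", "AGR": {"3s", "3p"}}],
    "dog": [{"CAT": "N", "ROOT": "DOG1", "AGR": {"3s"}}],
    "men": [{"CAT": "N", "ROOT": "MAN1", "AGR": {"3p"}}],
    "saw": [
        {"CAT": "N", "ROOT": "SAW1", "AGR": {"3s"}},
        {"CAT": "V", "ROOT": "SEE1", "VFORM": "past", "SUBCAT": {"_np"}},
    ],
}


def lookup(word, cat=None):
    """Return all entries for a word, optionally restricted to one category."""
    entries = LEXICON.get(word.lower(), [])
    return [entry for entry in entries if cat is None or entry["CAT"] == cat]


# "saw" is ambiguous between a noun and the past tense of see.
for entry in lookup("saw"):
    print(entry["CAT"], entry["ROOT"])
```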
Parsing with features
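• The parsing algorithm used in the previous chapter can be extended to handle features.
• A constituent X could extend an arc of the form C → C1 … Ci ∘ X … Cn to produce a new arc of the form C → C1 … Ci X ∘ … Cn, where ∘ marks the position reached so far.
• In a grammar with features, the parser may have to instantiate variables in the original arc before it can be extended by X:
• (NP AGR ?a) → ∘ (ART AGR ?a) (N AGR ?a)
• Ex: extending this arc with (ART ROOT A1 AGR 3s) instantiates ?a to 3s, giving (NP AGR 3s) → (ART AGR 3s) ∘ (N AGR 3s)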
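Arc extension with feature variables can be sketched as follows. This is a simplified illustration, not the full parsing algorithm: feature values are sets of atomic values, a variable binding is the set of values still allowed, and a feature missing from the input constituent is treated as unconstrained. The names `match` and `extend` and the arc layout are hypothetical.

```python
def is_var(value):
    """Variables are strings that start with '?', for example '?a'."""
    return isinstance(value, str) and value.startswith("?")


def match(pattern, constituent, bindings):
    """Match one pattern constituent against an input constituent.

    Returns the updated bindings, or None if the two are incompatible.
    Constant feature values in the pattern must be given as sets.
    """
    if pattern["CAT"] != constituent["CAT"]:
        return None
    new_bindings = dict(bindings)
    for feature, pattern_value in pattern.items():
        if feature == "CAT" or feature not in constituent:
            continue  # assumption: an unlisted feature imposes no constraint
        found = set(constituent[feature])
        if is_var(pattern_value):
            # Narrow the variable to the values compatible with this constituent.
            previous = new_bindings.get(pattern_value)
            allowed = found if previous is None else previous & found
            if not allowed:
                return None
            new_bindings[pattern_value] = allowed
        elif not (set(pattern_value) & found):
            return None
    return new_bindings


def extend(arc, constituent):
    """Move the dot over the next constituent if it matches; return a new arc or None."""
    if arc is None or arc["dot"] >= len(arc["rhs"]):
        return None
    bindings = match(arc["rhs"][arc["dot"]], constituent, arc["bindings"])
    if bindings is None:
        return None
    return {**arc, "dot": arc["dot"] + 1, "bindings": bindings}


# (NP AGR ?a) -> . (ART AGR ?a) (N AGR ?a)
np_arc = {
    "lhs": {"CAT": "NP", "AGR": "?a"},
    "rhs": [{"CAT": "ART", "AGR": "?a"}, {"CAT": "N", "AGR": "?a"}],
    "dot": 0,
    "bindings": {},
}
a = {"CAT": "ART", "ROOT": "A1", "AGR": {"3s"}}
the = {"CAT": "ART", "ROOT": "THE1", "AGR": {"3s", "3p"}}
dog = {"CAT": "N", "ROOT": "DOG1", "AGR": {"3s"}}
men = {"CAT": "N", "ROOT": "MAN1", "AGR": {"3p"}}

print(extend(extend(np_arc, a), dog))    # "a dog": ?a is narrowed to 3s
print(extend(extend(np_arc, a), men))    # "a men": agreement fails
print(extend(extend(np_arc, the), men))  # "the men": ?a is narrowed to 3p
```

Each call returns a new arc rather than changing the old one, so the same starting arc can be extended by several competing constituents.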