2.2.4 Lexical Functional Grammar (LFG) Model

This section presents those features of LFG that throw light on language modelling. For the details of lexical functional grammar, readers are encouraged to check Dalrymple et al. (1995). Unlike GB, LFG represents sentences at two syntactic levels: constituent structure (c-structure) and functional structure (f-structure). Based on Woods' Augmented Transition Networks (1970), which used phrase structure trees to represent the surface structure of sentences and the underlying predicate-argument structure, Kaplan (1975a, b) proposed a concrete form for the register names and values (used in ATN implementation), which became the functional structures in LFG. On the other hand, Bresnan (1976a, 1977) was more concerned with the problem of explaining some linguistic issues, such as active/passive and dative alternations, in the transformational approach. She proposed that such issues can be dealt with by using lexical redundancy rules. The unification of these two diverse approaches (with a common concern) led to the development of the LFG theory, which was presented as Lexical Functional Grammar: A Formal System for Grammatical Representation in 1982.

LFG is a formalism that is both computationally and linguistically motivated and provides precise algorithms for the linguistic issues it can handle. The term 'lexical functional' is composed of two terms: the 'functional' part is derived from 'grammatical functions', such as subject and object, or the roles played by various arguments in a sentence. The 'lexical' part is derived from the fact that lexical rules can be formulated to help define the given structure of a sentence and some of the long distance dependencies, which is difficult in transformational grammars.

C-structure and f-structure in LFG

As LFG is aimed at providing exact computational algorithms, it provides well-defined objects called constituent structure (c-structure) and functional structure (f-structure). The c-structure is derived from the usual phrase and sentence structure syntax, as in CFG (discussed in Chapter 4).
Natural Language Processing and Information Retrieval
However, as the grammatical-functional role cannot be derived directly from phrase and sentence structure, functional specifications are annotated on the nodes of the c-structure, which, when applied to sentences, result in the f-structure. Hence, the f-structure is the final product, which encodes the information obtained from phrase and sentence structure rules and functional specifications.
Let us consider an example.
Example 2.5
She saw stars in the sky.
CFG rules to handle this sentence are:
S → NP VP
VP → V {NP} {NP} PP* {S′}
PP → P NP
NP → {Det} N {PP}
S′ → Comp S

where
S: sentence          V: verb
P: preposition       N: noun
Comp: complement     S′: clause
{ }: optional
*: the phrase can appear any number of times, including zero
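These rules can be sketched in code. The following toy recognizer is not from the book: the encoding, with '?' for an optional constituent and '*' for a repeatable one, is our own, and it checks whether a sequence of part-of-speech tags can be derived from S.

```python
# Toy encoding of the CFG rules: '?' marks an optional constituent ({X}),
# '*' marks one that may repeat zero or more times (X*).
GRAMMAR = {
    "S":  ["NP", "VP"],
    "VP": ["V", "NP?", "NP?", "PP*", "S'?"],
    "PP": ["P", "NP"],
    "NP": ["Det?", "N", "PP?"],
    "S'": ["Comp", "S"],
}

def match(symbol, tags, pos):
    """Return the set of end positions `symbol` can reach from `pos`."""
    base = symbol.rstrip("?*")
    results = set()
    if symbol.endswith(("?", "*")):
        results.add(pos)                      # zero occurrences allowed
    if base not in GRAMMAR:                   # terminal: match one POS tag
        if pos < len(tags) and tags[pos] == base:
            results.add(pos + 1)
        return results
    frontier = {pos}
    for daughter in GRAMMAR[base]:            # match daughters in sequence
        frontier = {e for p in frontier for e in match(daughter, tags, p)}
    if symbol.endswith("*"):                  # allow repetitions of `base`
        grown = set(frontier)
        while True:
            new = {e for p in grown for e in match(base, tags, p)} - grown
            if not new:
                break
            grown |= new
        frontier = grown
    return results | frontier

def recognize(tags):
    return len(tags) in match("S", tags, 0)

# 'She saw stars in the sky' as POS tags: N V N P Det N
print(recognize(["N", "V", "N", "P", "Det", "N"]))   # True
```

The grammar has no left recursion, so the naive recursive matcher terminates without memoization.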
When annotated with functional specifications, the rules become:
Rule 1: S → NP           VP
             (↑ Subj)=↓  ↑=↓
Rule 2: VP → V  {NP}        {NP}         PP*              {S′}
                (↑ Obj)=↓   (↑ Obj2)=↓   (↑ (↓ Case))=↓   (↑ Comp)=↓
Rule 3: PP → P  NP
                (↑ Obj)=↓
Rule 4: NP → {Det}  N  {PP}
                       (↑ Adjunct)=↓
Rule 5: S′ → Comp  S
                   ↑=↓
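One way to make such annotations machine-readable is to pair each daughter with its equation, writing ↑ as 'up' and ↓ as 'down'. This is a hypothetical encoding, not the book's; terminals, which implicitly carry ↑ = ↓, are written out explicitly here.

```python
# Each mother category maps to (daughter, equation) pairs.
ANNOTATED_RULES = {
    "S":  [("NP", "up.Subj = down"), ("VP", "up = down")],
    "VP": [("V", "up = down"), ("NP", "up.Obj = down"),
           ("NP", "up.Obj2 = down"), ("PP*", "up.(down.Case) = down"),
           ("S'", "up.Comp = down")],
    "PP": [("P", "up = down"), ("NP", "up.Obj = down")],
    "NP": [("Det", "up = down"), ("N", "up = down"),
           ("PP", "up.Adjunct = down")],
    "S'": [("Comp", "up = down"), ("S", "up = down")],
}
print(ANNOTATED_RULES["S"][0])   # ('NP', 'up.Subj = down')
```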
Here, ↑ (up arrow) refers to the f-structure of the mother node that is on the left hand side. The ↓ (down arrow) symbol refers to the f-structure of the node under which it is placed. Hence, in Rule 1, (↑ Subj) = ↓ indicates that the f-structure of the NP is the f-structure of the subject of the sentence, while ↑ = ↓ indicates that the f-structure of the VP node goes directly to the f-structure of the sentence. Similarly, in Rule 2, the f-structure of VP is defined by the lexical item V, the two optional NPs, any number of PPs, and the optional clause (S′). The f-structure of V can be obtained from the lexicon itself. All terminals in LFG can be thought of as annotated with ↑ = ↓. The NPs can function as object and object 2 of the sentence, and their f-structures are obtained using the f-structures of Obj and Obj2. (↑ (↓ Case)) = ↓ in Rule 2 indicates that the f-structure of the PP and the case of the PP (some literature refers to it as P case) determine the f-structure of the VP. 'Comp' refers to the complement in a sentence, e.g., 'He said that she is powerful.'
Let us first see the lexical entries of various words in the sentence.

She saw stars. (2.4)

She    N   (↑ Pred) = 'PRO'
           (↑ Pers) = 3
           (↑ Num) = SG
           (↑ Gen) = FEM
           (↑ Case) = NOM
saw    V   (↑ Pred) = 'see <(↑ Subj) (↑ Obj)>'
           (↑ Tense) = PAST
stars  N   (↑ Pred) = 'star'
           (↑ Pers) = 3
           (↑ Num) = PL
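These lexical entries can be written as Python dicts. This is a sketch of our own: the attribute names follow the entries above, and the 'governs' key, recording the grammatical functions subcategorized by the verb's Pred, is our own convention.

```python
# Lexical entries as attribute-value dicts (hypothetical representation).
LEXICON = {
    "she":   {"cat": "N", "Pred": "PRO", "Pers": 3, "Num": "SG",
              "Gen": "FEM", "Case": "NOM"},
    "saw":   {"cat": "V", "Pred": "see", "governs": ["Subj", "Obj"],
              "Tense": "PAST"},
    "stars": {"cat": "N", "Pred": "star", "Pers": 3, "Num": "PL"},
}
print(LEXICON["saw"]["governs"])   # ['Subj', 'Obj']
```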
This will lead to the c-structure shown in Figure 2.8.

[Figure 2.8  C-structure of sentence (2.4): an S tree with NP (N 'She') and VP (V 'saw', NP (N 'stars')); each node carries its lexical equations, e.g., Pred = 'PRO' for 'She', Pred = 'see <(↑ Subj) (↑ Obj)>' and Tense = PAST for 'saw', and Pred = 'star' for 'stars'.]
Finally, the f-structure is the set of attribute-value pairs, represented as:

Subj    Pers   3
        Num    SG
        Gen    FEM
        Case   NOM
        Pred   'PRO'

Obj     Pers   3
        Num    PL
        Pred   'star'

Pred    'see <(↑ Subj) (↑ Obj)>'
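As a nested attribute-value structure, the result can also be rendered as a Python dict (our own rendering of the matrix above, not the book's notation):

```python
# Final f-structure of 'She saw stars' as nested attribute-value pairs.
f_structure = {
    "Subj": {"Pers": 3, "Num": "SG", "Gen": "FEM",
             "Case": "NOM", "Pred": "PRO"},
    "Obj":  {"Pers": 3, "Num": "PL", "Pred": "star"},
    "Pred": "see <(Subj) (Obj)>",
}
print(f_structure["Subj"]["Case"])   # NOM
```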
It is interesting to note that the final f-structure is obtained through the unification of various f-structures for subject, object, verb, complement, etc. This unification is based on the functional specifications of the verb, which predict the overall sentence structure. LFG requires that all possible structures corresponding to passive constructs, dative constructs, etc., must be specified. If the given sentence does not match the specifications, it is said to be ill-formed. There are three conditions on f-structure (Sells 1985).
Consistency  In a given f-structure, a particular attribute can have at most one value. Hence, while unifying two f-structures, if the attribute Num has value SG in one and PL in the other, it will be rejected.
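The consistency condition can be sketched as a unification routine over the dict representation (a hypothetical helper, not the book's code): unification fails whenever one attribute would receive two different atomic values.

```python
def unify(f1, f2):
    """Unify two f-structures (dicts); return None if they are inconsistent."""
    result = dict(f1)
    for attr, val in f2.items():
        if attr not in result:
            result[attr] = val
        elif isinstance(result[attr], dict) and isinstance(val, dict):
            sub = unify(result[attr], val)   # unify subsidiary f-structures
            if sub is None:
                return None
            result[attr] = sub
        elif result[attr] != val:
            return None                      # consistency condition violated
    return result

print(unify({"Num": "SG"}, {"Pers": 3}))     # {'Num': 'SG', 'Pers': 3}
print(unify({"Num": "SG"}, {"Num": "PL"}))   # None: SG/PL clash rejected
```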
Completeness  A function is called governable if it appears within the predicate of some lexical form, e.g., Subj, Obj, and Obj2. Adjunct is not a governable function.
When an f-structure and all its subsidiary f-structures (as the value of any attribute of an f-structure can again contain other f-structures) contain all the functions that their predicates govern, then and only then is the f-structure complete. For example, since the predicate 'see <(↑ Subj) (↑ Obj)>' contains an object as its governable function, a sentence like 'He saw' will be incomplete.
Coherence  Coherence maps the completeness property in the reverse direction. It requires that all the governable functions in an f-structure and all its subsidiary f-structures must be governed by their respective predicates. Hence, as the predicate of 'laughed' does not allow an object, the coherence condition will reject the sentence 'He laughed a book.'
The completeness and coherence conditions are counterparts of the θ-criterion in GB theory.
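The two conditions can be sketched over the dict representation as well. These are hypothetical helpers checking only the top level of an f-structure; a full implementation would recurse into subsidiary f-structures.

```python
# Governable functions per the text: those that can appear within a
# predicate's lexical form; Adjunct is not governable.
GOVERNABLE = {"Subj", "Obj", "Obj2", "Comp"}

def complete(f):
    """Completeness: every function the predicate governs must be present."""
    return all(g in f for g in f.get("governs", []))

def coherent(f):
    """Coherence: every governable function present must be governed."""
    governed = set(f.get("governs", []))
    return all(attr in governed for attr in f if attr in GOVERNABLE)

# 'He saw' lacks the Obj that 'see <(Subj) (Obj)>' governs:
saw = {"Pred": "see", "governs": ["Subj", "Obj"], "Subj": {"Pred": "PRO"}}
print(complete(saw))      # False

# 'He laughed a book' supplies an Obj that 'laugh' does not govern:
laughed = {"Pred": "laugh", "governs": ["Subj"],
           "Subj": {"Pred": "PRO"}, "Obj": {"Pred": "book"}}
print(coherent(laughed))  # False
```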
1.4 THE CHALLENGES OF NLP
There are a number of factors that make NLP difficult. These relate to
the problems of representation and interpretation. Language computing
requires precise representation of content. Given that natural languages
are highly ambiguous and vague, achieving such a representation can be difficult. The inability to capture all the required knowledge is another
source of difficulty. It is almost impossible to embody all sources of
knowledge that humans use to process language. Even if this were done,
it is not possible to write procedures that imitate language processing as
done by humans. In this section, we detail some of the problems associated
with NLP.
Perhaps the greatest source of difficulty in natural language is identifying
its semantics. The principle of compositional semantics considers the
meaning of a sentence to be a composition of the meaning of words
appearing in it. In the earlier section, we saw a number of examples
where this principle failed to work. Our viewpoint is that words alone do
not make a sentence. Instead, it is the words as well as their syntactic and
semantic relations that give meaning to a sentence. As pointed out by Wittgenstein (1953): 'The meaning of a word is its use in the language.' A language keeps on evolving. New words are added continually and existing words are introduced in new contexts. For example, most newspapers and
TV channels use 9/11 to refer to the terrorist attack on the World Trade Center in the USA in 2001. When we process written text or spoken
utterances, we have access to underlying mental representation. The only
way a machine can learn the meaning of a specific word in a message is
by considering its context, unless some explicitly coded general world or
domain knowledge is available. The context of a word is defined by co-
occurring words. It includes everything that occurs before or after a
word. The frequency of a word being used in a particular sense also
affects its meaning. The English word ‘while’ was initially used to mean ‘a
short interval of time’. But now it is more in use as a conjunction. None
of the usages of ‘while’ discussed in this chapter correspond to this
meaning.
Idioms, metaphors, and ellipses add more complexity to identifying the meaning of written text. As an example, consider the sentence:
The old man finally kicked the bucket. (1.3)
The meaning of this sentence has nothing to do with the words 'kick'
and ‘bucket’ appearing in it.
Quantifier-scoping is another problem. The scope of quantifiers (the, each, etc.) is often not clear and poses problems in automatic processing.
The ambiguity of natural languages is another difficulty. Ambiguities go unnoticed most of the time, yet are correctly interpreted. This is possible because we use explicit as well as implicit sources of knowledge. Communication via language involves two brains, not just one: the brain of the speaker/writer and that of the hearer/reader. Anything that is assumed to be known to the receiver is not explicitly encoded. The
receiver possesses the necessary knowledge and fills in the gaps while
making an interpretation. As humans, we are aware of the context and
current cultural knowledge, and also of the language and traditions, and
utilize these to process the meaning. However, incorporating contextual
and world knowledge poses the greatest difficulty in language computing.
An example of cultural impact on language is the representation of different
shades of white in the Eskimo world. It may be hard for a person living in the plains to distinguish among these various shades. Similarly, to an Indian, the word 'Taj' may mean a monument, a brand of tea, or a hotel, which may
not be so for a non-Indian. Let us now take a look at the various sources
of ambiguities in natural languages.
The first level of ambiguity arises at the word level. Without much effort, we can identify words that have multiple meanings associated with
them, e.g., bank, can, bat, and still. A word may be ambiguous in its part-of-speech or it may be ambiguous in its meaning. The word 'can' is ambiguous in its part-of-speech whereas the word 'bat' is ambiguous in its meaning. We hardly consider all possible meanings of a word to get the correct one. A program, on the other hand, must be explicitly coded to resolve each meaning. Hence, we need to develop various models and algorithms to resolve them. Deciding whether 'can' is a noun or a verb is solved by 'part-of-speech tagging', whereas identifying whether a particular use of 'bank' corresponds to the 'financial institution' sense or the 'river bank' sense is solved by 'word sense disambiguation'. 'Part-of-speech tagging' and 'word sense disambiguation' algorithms are discussed in Chapters 3 and 5 respectively.
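As a toy illustration (not a real tagger), the context around 'can' can be used to guess its part of speech; the word sets and labels below are our own simplification of what a tagger learns from data.

```python
# A toy heuristic: after a pronoun, 'can' is read as a modal verb;
# after a determiner, as a noun. All other words are left untagged.
PRONOUNS = {"i", "you", "he", "she", "we", "they"}
DETERMINERS = {"a", "an", "the", "this", "that"}

def tag_can(tokens):
    tags = []
    for i, tok in enumerate(tokens):
        if tok.lower() == "can" and i > 0:
            prev = tokens[i - 1].lower()
            if prev in PRONOUNS:
                tags.append("MODAL")
            elif prev in DETERMINERS:
                tags.append("NOUN")
            else:
                tags.append("CAN?")   # context insufficient to decide
        else:
            tags.append("-")          # other words left untagged here
    return tags

print(tag_can("I can open the can".split()))
# ['-', 'MODAL', '-', '-', 'NOUN']
```

Real part-of-speech taggers, discussed in Chapter 3, learn such contextual cues statistically rather than from hand-written word lists.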
A sentence may be ambiguous even if the words are not, for example, the sentence: 'Stolen rifle found by tree.' None of the words in this sentence is ambiguous but the sentence is. This is an example of structural ambiguity. Verb sub-categorization may help to resolve this type of ambiguity, but not always. Probabilistic parsing, which is discussed in Chapter 4, is another solution. At a still higher level are pragmatic and discourse ambiguities. Ambiguities are discussed in Chapter 5.
A number of grammars have been proposed to describe the structure of sentences. However, there are an infinite number of ways to generate them, which makes writing grammar rules, and the grammar itself, extremely complex. On top of it, we often make correct semantic interpretations of non-grammatical sentences. This fact makes it almost impossible for a grammar to capture the structure of all and only meaningful text.