2.2.4 Lexical Functional Grammar (LFG) Model

This section presents those features of LFG that throw light on language modelling. For the details of lexical functional grammar, readers are encouraged to see Dalrymple et al. (1995). Unlike GB, LFG represents sentences at two syntactic levels: constituent structure (c-structure) and functional structure (f-structure). Based on Woods' Augmented Transition Networks (1970), which used phrase structure trees to represent the surface structure of sentences and the underlying predicate-argument structure, Kaplan (1975a, b) proposed a concrete form for the register names and values (used in ATN implementation), which became the functional structures in LFG. On the other hand, Bresnan (1976a, 1977) was more concerned with the problem of explaining some linguistic issues, such as active/passive and dative alternations, in the transformational approach. She proposed that such issues can be dealt with by using lexical redundancy rules. The unification of these two diverse approaches (with a common concern) led to the development of the LFG theory, which was presented as Lexical Functional Grammar: A Formal System for Grammatical Representation in 1982. LFG is a formalism that is both computationally and linguistically motivated, and it provides precise algorithms for the linguistic issues it can handle.

The term 'lexical functional' is composed of two terms: the 'functional' part is derived from 'grammatical functions', such as subject and object, or the roles played by various arguments in a sentence. The 'lexical' part is derived from the fact that lexical rules can be formulated to help define the given structure of a sentence and some of the long-distance dependencies, which is difficult in transformational grammars.

C-structure and f-structure in LFG

36 Natural Language Processing and Information Retrieval

As LFG is aimed at providing exact computational algorithms, it provides two well-defined objects called constituent structure (c-structure) and functional structure (f-structure). The c-structure is derived from the usual phrase and sentence structure syntax, as in CFG (discussed in Chapter 4). However, as the grammatical-functional role cannot be derived directly from the phrase and sentence structure, functional specifications are annotated on the nodes of the c-structure, which, when applied to sentences, result in the f-structure. Hence, the f-structure is the final product, which encodes the information obtained from the phrase and sentence structure rules and the functional specifications. Let us consider an example.

Example 2.5  She saw stars in the sky.

CFG rules to handle this sentence are:

S → NP VP
VP → V {NP} {NP} PP* {S'}
PP → P NP
NP → {Det} N {PP}
S' → Comp S

where
S: sentence  V: verb  P: preposition  N: noun  Comp: complement  S': clause
{ }: optional
*: phrase can appear any number of times, including zero

When annotated with functional specifications, the rules become:

Rule 1: S → NP VP
            (↑ Subj) = ↓   ↑ = ↓
Rule 2: VP → V {NP} {NP} PP* {S'}
            ↑ = ↓   (↑ Obj) = ↓   (↑ Obj2) = ↓   (↑ (↓ Case)) = ↓   (↑ Comp) = ↓
Rule 3: PP → P NP
            (↑ Obj) = ↓
Rule 4: NP → {Det} N {PP}
            (↑ Adjunct) = ↓
Rule 5: S' → Comp S

Here, ↑ (up arrow) refers to the f-structure of the mother node, that is, the node on the left-hand side of the rule, while ↓ (down arrow) refers to the f-structure of the node under which it is written. Hence, in Rule 1, (↑ Subj) = ↓ indicates that the f-structure of the NP node goes into the f-structure of the subject of the sentence, while ↑ = ↓ indicates that the f-structure of the VP node goes directly to the f-structure of the sentence. Similarly, in Rule 2, the f-structure of VP is defined by the lexical item V, the two optional NPs, any number of PPs, and the optional clause (S'). The f-structure of V can be obtained from the lexicon itself. All terminals in LFG can be thought of as annotated with ↑ = ↓. The NPs can function as object and object 2 of the sentence, and their f-structures are obtained using the f-structures of Obj and Obj2.
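As a rough illustration of how these annotations drive f-structure construction, here is a toy Python sketch. It is my own, not the book's: the dictionary encoding of f-structures and all names (fs_she, build_sentence_fs, etc.) are assumptions of this sketch. It applies Rule 1 ((↑ Subj) = ↓ for NP, ↑ = ↓ for VP) and Rule 2 ((↑ Obj) = ↓ for the object NP) to 'She saw stars':

```python
# Toy illustration (not the book's code): f-structures as plain dicts.
# Lexical entries for 'She saw stars'.
fs_she = {"Pred": "PRO", "Pers": 3, "Num": "SG", "Gen": "FEM", "Case": "NOM"}
fs_saw = {"Pred": "see <(Subj) (Obj)>", "Tense": "PAST"}
fs_stars = {"Pred": "star", "Pers": 3, "Num": "PL"}

def build_sentence_fs(subj_fs, verb_fs, obj_fs):
    """Assemble the sentence f-structure following Rules 1 and 2."""
    fs = dict(verb_fs)    # up = down: the VP's f-structure becomes the sentence's
    fs["Subj"] = subj_fs  # (up Subj) = down, from Rule 1
    fs["Obj"] = obj_fs    # (up Obj) = down, from Rule 2
    return fs

fs = build_sentence_fs(fs_she, fs_saw, fs_stars)
print(fs["Subj"]["Num"])  # SG
print(fs["Obj"]["Pred"])  # star
print(fs["Tense"])        # PAST
```

The sketch hard-codes the rule applications for this one sentence; a real LFG system would instead solve the annotated equations for an arbitrary c-structure.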
(↑ (↓ Case)) = ↓ in Rule 2 indicates that the f-structure of the PP, together with the case of the PP (some literature refers to it as P case), determines the f-structure of the VP. 'Comp' refers to the complement in a sentence, e.g., 'He said that she is powerful.'

Let us first see the lexical entries of the various words in the sentence:

She saw stars.    (2.4)

she    N    (↑ Pred) = 'PRO'
            (↑ Pers) = 3
            (↑ Num) = SG
            (↑ Gen) = FEM
            (↑ Case) = NOM
saw    V    (↑ Pred) = 'see <(↑ Subj) (↑ Obj)>'
            (↑ Tense) = PAST
stars  N    (↑ Pred) = 'star'
            (↑ Pers) = 3
            (↑ Num) = PL

This will lead to the c-structure shown in Figure 2.8.

[Figure 2.8  C-structure of sentence (2.4): the phrase structure tree for 'She saw stars', with the lexical annotations (Pred, Tense, etc.) attached below the terminal nodes she, saw, and stars.]

Finally, the f-structure is the set of attribute-value pairs, represented as:

Subj   [ Pred  'PRO'
         Pers  3
         Num   SG
         Gen   FEM
         Case  NOM ]
Obj    [ Pred  'star'
         Pers  3
         Num   PL ]
Tense  PAST
Pred   'see <(↑ Subj) (↑ Obj)>'

It is interesting to note that the final f-structure is obtained through the unification of the various f-structures for the subject, object, verb, complement, etc. This unification is based on the functional specifications of the verb, which predict the overall sentence structure. LFG requires that all possible structures corresponding to passive constructs, dative constructs, etc., be specified. If the given sentence does not match the specifications, it is said to be ill-formed. Sells (1985) puts three conditions on f-structure.

Consistency  In a given f-structure, a particular attribute can have at most one value. Hence, while unifying two f-structures, if the attribute Num has the value SG in one and PL in the other, the unification will be rejected.

Completeness  A function is called governable if it appears within the predicate of some lexical form, e.g., Subj, Obj, and Obj2. Adjunct is not a governable function.
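The consistency condition can be sketched as a unification procedure over nested attribute-value dictionaries. This is an illustrative fragment, not the book's algorithm; the function name `unify` and the dict encoding are assumptions:

```python
def unify(f1, f2):
    """Unify two f-structures (nested dicts); return None on inconsistency."""
    result = dict(f1)
    for attr, val in f2.items():
        if attr not in result:
            result[attr] = val
        elif isinstance(result[attr], dict) and isinstance(val, dict):
            sub = unify(result[attr], val)  # recurse into embedded f-structures
            if sub is None:
                return None                 # clash inside a sub-f-structure
            result[attr] = sub
        elif result[attr] != val:
            return None                     # consistency violated: two values
    return result

print(unify({"Num": "SG"}, {"Pers": 3}))    # {'Num': 'SG', 'Pers': 3}
print(unify({"Num": "SG"}, {"Num": "PL"}))  # None (SG/PL clash is rejected)
```

The second call shows the consistency condition at work: the attribute Num cannot carry both SG and PL, so unification fails.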
When an f-structure and all its subsidiary f-structures (as the value of any attribute of an f-structure can again contain other f-structures) contain all the functions that their predicates govern, then and only then is the f-structure complete. For example, since the predicate 'see <(↑ Subj) (↑ Obj)>' contains an object as its governable function, a sentence like 'He saw' will be incomplete.

Coherence  Coherence maps the completeness property in the reverse direction. It requires that all governable functions present in an f-structure and all its subsidiary f-structures be governed by their respective predicates. For example, as 'laugh' is an intransitive verb, an object cannot follow it. Thus, LFG will reject the sentence 'He laughed a book'.

The completeness and coherence conditions are the counterparts of the θ-criterion in GB theory.

1.4 THE CHALLENGES OF NLP

There are a number of factors that make NLP difficult. These relate to the problems of representation and interpretation. Language computing requires a precise representation of content. Given that natural languages are highly ambiguous and vague, achieving such a representation can be difficult. The inability to capture all the required knowledge is another source of difficulty. It is almost impossible to embody all the sources of knowledge that humans use to process language. Even if this were done, it is not possible to write procedures that imitate language processing as done by humans. In this section, we detail some of the problems associated with NLP.

Perhaps the greatest source of difficulty in natural language is identifying its semantics. The principle of compositional semantics considers the meaning of a sentence to be a composition of the meanings of the words appearing in it. In the earlier section, we saw a number of examples where this principle failed to work. Our viewpoint is that words alone do not make a sentence.
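The completeness and coherence checks above can be sketched in a few lines, assuming the predicate's governed functions are listed explicitly in its value (here recovered from a simple string, which is an encoding invented for this sketch, as are the names `governed`, `complete`, and `coherent`):

```python
GOVERNABLE = {"Subj", "Obj", "Obj2", "Comp"}  # Adjunct is not governable

def governed(fs):
    """Functions the predicate governs, e.g. 'see <(Subj)(Obj)>' -> {Subj, Obj}."""
    pred = fs.get("Pred", "")
    return {g for g in GOVERNABLE if "(" + g + ")" in pred}

def complete(fs):
    # Completeness: every function the predicate governs must be present.
    return all(g in fs for g in governed(fs))

def coherent(fs):
    # Coherence: every governable function present must be governed.
    return (GOVERNABLE & set(fs)) <= governed(fs)

see, laugh = "see <(Subj)(Obj)>", "laugh <(Subj)>"
he, book = {"Pred": "PRO"}, {"Pred": "book"}

print(complete({"Pred": see, "Subj": he}))                 # False: 'He saw'
print(coherent({"Pred": laugh, "Subj": he, "Obj": book}))  # False: 'He laughed a book'
```

A fuller implementation would also recurse into subsidiary f-structures; the fragment checks only the top level, which is enough to reject both example sentences.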
Instead, it is the words as well as their syntactic and semantic relations that give meaning to a sentence. As pointed out by Wittgenstein (1953): 'The meaning of a word is its use in the language.' A language keeps on evolving. New words are added continually and existing words are introduced in new contexts. For example, most newspapers and TV channels use 9/11 to refer to the terrorist attack on the World Trade Center in the USA in 2001. When we process written text or spoken utterances, we have access to the underlying mental representation. The only way a machine can learn the meaning of a specific word in a message is by considering its context, unless some explicitly coded general world or domain knowledge is available. The context of a word is defined by co-occurring words. It includes everything that occurs before or after a word. The frequency of a word being used in a particular sense also affects its meaning. The English word 'while' was initially used to mean 'a short interval of time'. But now it is more in use as a conjunction. None of the usages of 'while' discussed in this chapter correspond to this meaning.

Idioms, metaphor, and ellipses add more complexity to identifying the meaning of written text. As an example, consider the sentence:

The old man finally kicked the bucket.    (1.3)

The meaning of this sentence has nothing to do with the words 'kick' and 'bucket' appearing in it.

Quantifier scoping is another problem. The scope of quantifiers (the, each, etc.) is often not clear and poses problems in automatic processing.

The ambiguity of natural languages is another difficulty. These ambiguities go unnoticed most of the time, yet are correctly interpreted. This is possible because we use explicit as well as implicit sources of knowledge. Communication via language involves two brains, not just one: the brain of the speaker/writer and that of the hearer/reader. Anything that is assumed to be known to the receiver is not explicitly encoded.
The receiver possesses the necessary knowledge and fills in the gaps while making an interpretation. As humans, we are aware of the context and current cultural knowledge, and also of the language and its traditions, and we utilize these to process the meaning. However, incorporating contextual and world knowledge poses the greatest difficulty in language computing. An example of the cultural impact on language is the representation of different shades of white in the Eskimo world. It may be hard for a person living in the plains to distinguish among these various shades. Similarly, to an Indian, the word 'Taj' may mean a monument, a brand of tea, or a hotel, which may not be so for a non-Indian. Let us now take a look at the various sources of ambiguities in natural languages.

The first level of ambiguity arises at the word level. Without much effort, we can identify words that have multiple meanings associated with them, e.g., bank, can, bat, and still. A word may be ambiguous in its part-of-speech or it may be ambiguous in its meaning. The word 'can' is ambiguous in its part-of-speech, whereas the word 'bat' is ambiguous in its meaning. We hardly consider all possible meanings of a word to get the correct one. A program, on the other hand, must be explicitly coded to resolve each meaning. Hence, we need to develop various models and algorithms to resolve them. Deciding whether 'can' is a noun or a verb is solved by 'part-of-speech tagging', whereas identifying whether a particular use of 'bank' corresponds to the 'financial institution' sense or the 'river bank' sense is solved by 'word sense disambiguation'. Part-of-speech tagging and word sense disambiguation algorithms are discussed in Chapters 3 and 5 respectively.

A sentence may be ambiguous even if the words in it are not, for example, the sentence: 'Stolen rifle found by tree.' None of the words in this sentence is ambiguous, but the sentence is.
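The point that a program must be explicitly coded to resolve each meaning can be made concrete with a deliberately naive fragment. The rule below (tag 'can' as a modal after a pronoun, otherwise as a noun) is invented purely for illustration and is far cruder than the part-of-speech tagging methods of Chapter 3:

```python
def tag_can(prev_word):
    """Toy disambiguation rule for the word 'can', based only on the
    preceding word. Invented for illustration; not a real tagger."""
    pronouns = {"i", "you", "he", "she", "we", "they", "it"}
    return "MD" if prev_word.lower() in pronouns else "NN"

print(tag_can("She"))  # MD, as in 'She can swim'
print(tag_can("tin"))  # NN, as in 'a tin can'
```

Even this single word needs an explicit rule referring to its context; a human resolves it without noticing the ambiguity at all.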
This is an example of structural ambiguity. Verb sub-categorization may help to resolve this type of ambiguity, but not always. Probabilistic parsing, which is discussed in Chapter 4, is another solution. At a still higher level are pragmatic and discourse ambiguities. Ambiguities are discussed in Chapter 5.

A number of grammars have been proposed to describe the structure of sentences. However, there are an infinite number of ways to generate sentences, which makes writing grammar rules, and the grammar itself, extremely complex. On top of that, we often make correct semantic interpretations of non-grammatical sentences. This fact makes it almost impossible for a grammar to capture the structure of all and only meaningful text.
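Structural ambiguity can be made concrete by counting distinct parses with a small chart parser. The sketch below uses the CKY algorithm on a toy grammar and the classic prepositional-phrase example 'I saw the man with the telescope'; both the grammar and the sentence are illustrative assumptions, not taken from the book:

```python
from collections import defaultdict

# A toy grammar in Chomsky normal form, plus lexical entries.
RULES = [("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
         ("NP", "NP", "PP"), ("NP", "Det", "N"), ("PP", "P", "NP")]
LEX = {"I": ["NP"], "saw": ["V"], "the": ["Det"],
       "man": ["N"], "telescope": ["N"], "with": ["P"]}

def count_parses(words):
    """CKY chart where chart[i][j][A] counts derivations of words[i:j] from A."""
    n = len(words)
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for a in LEX[w]:
            chart[i][i + 1][a] += 1
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):           # split point
                for a, b, c in RULES:
                    chart[i][j][a] += chart[i][k][b] * chart[k][j][c]
    return chart[0][n]["S"]

print(count_parses("I saw the man with the telescope".split()))  # 2
```

The two parses correspond to attaching the PP 'with the telescope' either to the verb phrase (the seeing was done with the telescope) or to the noun phrase (the man has the telescope), the same kind of choice that probabilistic parsing resolves by preferring the more likely structure.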
