Chapter 8 Natural Language Processing in Prolog
Abstract concepts include things such as love, beauty, and loyalty that do not correspond to images in our minds.
Conceptual relation nodes indicate a relation involving one or more concepts. One advantage of formulating conceptual graphs as bipartite graphs rather than using labeled arcs is that it simplifies the representation of relations of arbitrary arity: a relation of arity n is represented by a conceptual relation node with n arcs, as shown in Figure 8.1.
Each conceptual graph represents a single proposition. A typical
knowledge base will contain a number of such graphs. Graphs may be
arbitrarily complex but must be finite. For example, one graph in Figure
8.1 represents the proposition “A dog has a color of brown.” Figure 8.2 is
a graph of somewhat greater complexity that represents the sentence
“Mary gave John the book.” This graph uses conceptual relations to
represent the cases of the verb “to give” and indicates the way in which
conceptual graphs are used to model the semantics of natural language.
object links an event or state with an entity and represents the verb–
object relation.
part links concepts of type physobj and defines the relation between
whole and part.
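Such a graph is also easy to mock up in Prolog for experimentation. The sketch below encodes Figure 8.2, "Mary gave John the book," as a set of facts; the predicates concept/2 and relation/3 and the instance markers give1 and book1 are illustrative assumptions, not a representation the text itself commits to.

% Each concept node becomes a concept/2 fact; each conceptual
% relation node becomes a relation/3 fact linking concept nodes.
concept(mary, person).
concept(john, person).
concept(book1, book).               % book1 marks the particular book
concept(give1, give).               % give1 marks the act of giving
relation(agent, give1, mary).       % Mary is the agent of the giving
relation(object, give1, book1).     % the book is the object given
relation(recipient, give1, john).   % John is the recipient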
The verb plays a particularly important role in building an interpretation, as it
defines the relationships between the subject, object, and other components of
the sentence. We can represent each verb using a case frame that specifies:
The linguistic relationships (agent, object, instrument, and so on) appropriate
to that particular verb. Transitive verbs, for example, can have a direct object;
intransitive verbs do not.
Constraints on the values that may be assigned to any component of the case
frame. For example, in the case frame for the verb bites, we have asserted that
the agent of biting must be of the type dog. This causes “Man bites dog” to be
rejected as semantically incorrect.
Default values on components of the case frame. In the “bites” frame, we
have a default value of teeth for the concept linked to the instrument relation.
The case frames for the verbs like and bite appear in Figure 8.3.
Figure 8.3. Case frames for the verbs “like” and “bite.”
These verb-based case frames are also easily built in Prolog. Each verb is
paired with a list of the semantic relations assumed to be part of the verb.
These may include agents, instruments, and objects. We next offer
examples for the verbs give and bite. The verb give requires a subject,
object, and indirect object; in the English sentence "John gives Mary the
book," this structure takes on the obvious assignments. We can define
defaults in a case frame by binding the appropriate variable values. For
example, we could give bite a default instrument of teeth and, indeed,
indicate that the instrument for biting, teeth, belongs to the agent! Case
frames for these two verbs might be:
verb(give,
    [human(Subject),
     agent(Subject, give),
     act_of_giving(give),
     object(Object, give),
     inanimate(Object),
     recipient(Ind_obj, give),
     human(Ind_obj)]).

verb(bite,
    [animate(Subject),
     agent(Subject, Action),
     act_of_biting(Action),
     object(Object, Action),
     animate(Object),
     instrument(teeth, Action),
     part_of(teeth, Subject)]).
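Querying these frames is a matter of simple unification. In the usage sketch below (the variable names in the answer are schematic), the default instrument teeth comes back as a bound constant, while the subject and object remain open variables shared across the relations.

?- verb(bite, Frame).
Frame = [animate(S), agent(S, A), act_of_biting(A),
         object(O, A), animate(O), instrument(teeth, A),
         part_of(teeth, S)].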
Logic programming also offers a powerful medium for building
grammars as well as representations for semantic meanings. We next
build recursive descent parsers in Prolog, and then add syntactic and
semantic constraints to these parsers.
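A minimal top level for such a parser takes two clauses; the second is the sentence rule examined below. This is a reconstructed sketch consistent with the following discussion, with utterance serving simply as a convenient entry point.

% An utterance is a word list that parses as a complete sentence.
utterance(X) :- sentence(X, []).

% A sentence is a noun phrase followed by a verb phrase.
sentence(Start, End) :-
    nounphrase(Start, Rest),
    verbphrase(Rest, End).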
This sentence rule takes two parameters, each a list. The first, Start, is the sequence of words to be parsed. The rule attempts to determine whether some initial segment of this list is a noun phrase; whatever remains of the list after the nounphrase goal succeeds is passed as the first parameter to the verbphrase predicate. Any symbols that remain after the verbphrase check are passed back as the second argument of sentence. If the original list is a sentence, the second argument of sentence must be empty, []. Two alternative Prolog descriptions each of nounphrase and verbphrase follow.
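The clauses below are a reconstructed sketch; the vocabulary facts at the end are assumptions sufficient for the example sentence.

% A noun phrase is a noun, or an article followed by a noun.
nounphrase([Noun | End], End) :-
    noun(Noun).
nounphrase([Article, Noun | End], End) :-
    article(Article),
    noun(Noun).

% A verb phrase is a verb, or a verb followed by a noun phrase.
verbphrase([Verb | End], End) :-
    verb(Verb).
verbphrase([Verb | Rest], End) :-
    verb(Verb),
    nounphrase(Rest, End).

% A small assumed vocabulary:
article(a).    article(the).
noun(man).     noun(dog).
verb(likes).   verb(bites).

With these definitions the query ?- utterance([the, man, bites, the, dog]). succeeds.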
Figure 8.4 shows the parse tree of "the man bites the dog," with the and relationships of the grammar rules reflected by and links in the tree.
Figure 8.4. The and/or parse tree for “The man bites the dog.”
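The trace below shows utterance(Prob, X) enumerating the sentences of a stochastic context-free grammar in which every rule and every word carries a probability, the probability of a parse being their product. The following sketch reproduces the trace exactly; the individual pr/2, noun/2, verb/2, and article/2 values are back-calculated from the trace and should be read as assumptions.

% Stochastic CFG: each rule multiplies in its own probability.
utterance(Prob, X) :- sentence(Prob, X, []).

sentence(Prob, Start, End) :-
    nounphrase(P1, Start, Rest),
    verbphrase(P2, Rest, End),
    pr(r1, P), Prob is P*P1*P2.

nounphrase(Prob, [Noun | End], End) :-
    noun(P1, Noun), pr(r2, P), Prob is P*P1.
nounphrase(Prob, [Article, Noun | End], End) :-
    article(P1, Article), noun(P2, Noun),
    pr(r3, P), Prob is P*P1*P2.

verbphrase(Prob, [Verb | End], End) :-
    verb(P1, Verb), pr(r4, P), Prob is P*P1.
verbphrase(Prob, [Verb | Rest], End) :-
    verb(P1, Verb), nounphrase(P2, Rest, End),
    pr(r5, P), Prob is P*P1*P2.

% Rule and word probabilities (assumed values):
pr(r1, 1.0).  pr(r2, 0.3).  pr(r3, 0.7).
pr(r4, 0.2).  pr(r5, 0.8).
article(0.25, a).   article(0.75, the).
noun(0.65, man).    noun(0.35, dog).
verb(0.9, likes).   verb(0.1, bites).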
?- utterance(Prob, X).
Prob = 0.0351
X = [man, likes]
;
Prob = 0.0039
X = [man, bites]
;
Prob = 0.027378
X = [man, likes, man]
;
Prob = 0.014742
X = [man, likes, dog]
etc.
A Probabilistic Lexicalized Context-Free Parser
We next demonstrate a probabilistic lexicalized context-free parser. This is a much more constrained system in which the probabilities, besides giving measures for the various grammatical structures and individual words as in the previous section, also describe the possible combinations of words (thus, it is a probabilistic lexicalized parser). For example, we now measure the likelihood of both noun-verb and verb-object word combinations.
Constraining noun-verb combinations gives us much of the power of the
context-sensitive parsing that we see next in Section 8.4, where noun-verb
agreement is enforced by the constraints across the subtrees of the parse.
There are a number of goals here, including "measuring" the "quality" of utterances in the language by determining a probabilistic measure of their occurrence. Thus, we can determine that a possible sentence fails for
syntactic or semantic reasons by seeing that it produces a very low or zero
probability measure, rather than by the interpreter simply saying “no.”
In the following grammar we have hard-coded the probabilities of the various structure and word combinations. In a real system, this lexical information would be better obtained by sampling appropriate corpora for noun-verb and verb-object bigrams. We discuss the n-gram approach to language analysis in Luger (2009, Section 15.4), where the probability of word combinations is described (two words: bigrams; three words: trigrams; and so on). These probabilities are usually determined by sampling over a large collection of sentences, called a corpus. The result is the ability to assess the likelihood of particular word combinations, e.g., to determine the probability of the verb "bite" following the noun "dogs."
In the following examples the Prob value is made up of the probabilities
of the particular sentence structure, the probabilities of the verb-noun and
verb-object combinations, and the probabilities of individual words.
utterance(Prob, X) :-
sentence(Prob, Verb, Noun, X, [ ]).
sentence(Prob, Verb, Noun, Start, End) :-
nounphrase(P1, Noun, Start, Rest),
verbphrase(P2, Verb, Rest, End),
pr(r1, P), % Probability of this structure
pr([r1, Verb, Noun], PrDep),
% Probability of this noun/verb combo
pr(shead, Verb, Pshead),
% Probability this verb heads the sentence
Prob is Pshead*P*PrDep*P1*P2.
nounphrase(Prob, Noun, [Noun | End], End) :-
noun(P1, Noun), pr(r2, P), Prob is P*P1.
nounphrase(Prob, Noun, [Article,Noun | End], End) :-
article(P1, Article), noun(P2,Noun), pr(r3, P),
pr([r3, Noun, Article], PrDep),
% Probability of art/noun combo
Prob is P*PrDep*P1*P2.
verbphrase(Prob, Verb, [Verb | End], End) :-
verb(P1, Verb), pr(r4, P), Prob is P*P1.
verbphrase(Prob, Verb, [Verb | Rest], End) :-
    verb(P1, Verb),
    nounphrase(P2, Object, Rest, End),
    pr([r5, Verb, Object], PrDep),
        % Probability of verb/object combo
    pr(r5, P), Prob is P*PrDep*P1*P2.
pr(r1, 1.0).
pr(r2, 0.3).
pr(r3, 0.7).
pr(r4, 0.2).
pr(r5, 0.8).
article(1.0, a).
article(1.0, the).
article(1.0, these).
noun(1.0, man).
noun(1.0, dogs).
verb(1.0, likes).
verb(1.0, bite).
pr(shead, likes, 0.5).
pr(shead, bite, 0.5).
pr([r1, likes, man], 1.0).
pr([r1, likes, dogs], 0.0).
pr([r1, bite, man], 0.0).
pr([r1, bite, dogs], 1.0).
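The rules above also consult dependency probabilities for the article/noun (r3) and verb/object (r5) combinations, so facts of the following form are required. The values for the likes pairs and for man with a are inferred from the sample run below; the remaining entries are assumptions.

% Verb/object combinations (r5):
pr([r5, likes, man], 0.2).
pr([r5, likes, dogs], 0.8).
pr([r5, bite, man], 1.0).      % assumed
pr([r5, bite, dogs], 0.0).     % assumed
% Article/noun combinations (r3):
pr([r3, man, a], 0.5).
pr([r3, man, the], 0.5).       % assumed
pr([r3, dogs, the], 0.5).      % assumed
pr([r3, dogs, these], 0.5).    % assumed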
?- utterance(Prob, X).
Prob = 0.03
X = [man, likes]
;
Prob = 0
X = [man, bite]
;
Prob = 0.0072
X = [man, likes, man]
;
Prob = 0.0288
X = [man, likes, dogs]
;
Prob = 0.0084
X = [man, likes, a, man]
etc.
We next enforce many of the same syntax/semantic relationships seen in
this section by imposing constraints (context sensitivity) across the subtrees
of the parse. Context sensitivity can be used to constrain subtrees to
support relationships within a sentence such as article-noun and noun-verb
number agreement.
8.4 A Context-Sensitive Parser in Prolog
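One standard Prolog idiom for such constraints, sketched here as a standalone fragment with assumed predicate names and vocabulary, threads a Number argument through the nonterminals so that subject and verb must unify on it:

sentence(Start, End) :-
    nounphrase(Start, Rest, Number),
    verbphrase(Rest, End, Number).

nounphrase([Noun | End], End, Number) :-
    noun(Noun, Number).

verbphrase([Verb | End], End, Number) :-
    verb(Verb, Number).
verbphrase([Verb | Rest], End, Number) :-
    verb(Verb, Number),
    nounphrase(Rest, End, _).   % the object need not agree with the verb

% Assumed vocabulary carrying number features:
noun(man, singular).   noun(men, plural).
verb(bites, singular). verb(bite, plural).

With these clauses, sentence([man, bites], []) succeeds, while sentence([men, bites], []) fails because plural and singular do not unify.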
Exercises
1. Create a predicate calculus and a Prolog representation for the
Conceptual Graph presented in Figure 8.2, “Mary gave John the book.”
Take this same example and create a general Prolog rule, “X gave Y the Z”
along with a number of constraints, such as “object(Z).” Also create a
number of Prolog facts, such as “object(book)” and show how this
conceptual graph can be constrained by using the Prolog interpreter on
your simple program.
2. Figure 8.3 presents case frames for the verbs like and bite. Write
Prolog specifications that capture the constraints of these representations.
Add other related facts and rules in Prolog and then use the Prolog
interpreter to instantiate the constraints that are implicit in these two verb
case frames.
3. Create a predicate calculus and a Prolog representation for the two
Conceptual Graphs presented in Figure 8.5.
4. Describe an algorithm that could be used to impose graph constraints
across the structures of Figure 8.5. You will have to address the nesting
issue to handle sentences like “Mary believes that John does not like soup.”
5. Create Prolog case frames, similar to those of Section 8.1, for five other
verbs, including like, trade, and pardon.
6. Write the Prolog code for a subset of English grammar rules, as in the
context-free and context-sensitive parsers in Sections 8.2 and 8.4, adding:
Adjectives and adverbs that modify nouns and verbs, respectively.
Prepositional phrases. (Can you do this with a recursive call?)
Compound sentences (two sentences joined by a conjunction).
7. Extend the stochastic context-free parser of Section 8.3 to include
probabilities for the new sentence structures of Exercise 6. Explore
obtaining probabilities for these sentence structures from a treebank for
natural language processing. Examples may be found on the web.
8. Add probabilities for more word-pair relationships as in the lexicalized
context-free parser of Section 8.3.2. Explore the possibility of obtaining
the probabilistic bigram values for the noun-verb, verb-object, and other
word pairs from actual corpus linguistics. Such data may be found on the web.
9. Many of the simple natural language parsers presented in Chapter 8 will
accept grammatically correct sentences that may not have a commonsense
meaning, such as “the man bites the dog.” These sentences may be
eliminated from the grammar by augmenting the parser to include some
notion of what is semantically plausible. Design a small “semantic
network” (Section 2.4.1) in Prolog to allow you to reason about some
aspect of the possible interpretations of the English grammar rules, such as
when it is reasonable for the man to bite a dog.
10. Rework the semantic net parser of Section 14.3.2 to support richer class
hierarchies. Specifically, rewrite match_with_inheritance so that
instead of enumerating the common specializations of two items, it
computes this by searching a type hierarchy.