
Anaphora Resolution

This document discusses different approaches to anaphora resolution, which is the process of determining the antecedents of referring expressions like pronouns. It summarizes Hobbs' 1978 algorithm, which uses syntactic constraints and search order to find antecedents. It also summarizes Lappin and Leass' 1994 approach, which maintains a discourse model with representations of potential referents that have degrees of salience based on syntactic and recency factors. The document explains how these two algorithms work and some of their limitations.

Uploaded by aisha ahmed
Anaphora Resolution

Spring 2010, UCSC – Adrian Brasoveanu

[Slides based on various sources, collected over a couple of years and repeatedly modified – the work required to track them down & list them would take too much time at this point. Please email me ([email protected]) if you can identify particular sources.]
There are more slides added to Adrian’s presentation.
Reference Phenomena
Five common types of referring expression
Type                      Example
Indefinite noun phrase    I saw a Ford Escort today.
Definite noun phrase      I saw a Ford Escort today. The Escort was white.
Pronoun                   I saw a Ford Escort today. It was white.
Demonstratives            I like this better than that.
One-anaphora              I saw 6 Ford Escorts today. Now I want one.

Three types of referring expression that complicate reference resolution

Type                  Example
Inferrables           I almost bought a Ford Escort, but a door had a dent.
Discontinuous sets    John and Mary love their Escorts. They often drive them.
Generics              I saw 6 Ford Escorts today. They are the coolest cars.
Reference Resolution

 How to develop successful algorithms for reference resolution? There are two necessary steps.

 First is to filter the set of possible referents by certain hard-and-fast constraints.

 Second is to set the preference for possible referents.


Constraints (for English)
 Number Agreement:
 To distinguish between singular and plural references.
 *John has a new car. They are red.

 Gender Agreement:
 To distinguish male, female, and non-personal genders.
 John has a new car. It is attractive. [It = the new car]

 Person and Case Agreement:
 To distinguish between the three forms of person.
 *You and I have Escorts. They love them.
 To distinguish between subject position, object position, and genitive position.
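The agreement constraints above act as a hard filter on candidate antecedents. A minimal sketch, assuming a hypothetical `Mention` record whose number, gender, and person features are assigned by hand (a real system would take them from a lexicon or tagger), and ignoring case:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Mention:
    text: str
    number: str             # "sg" or "pl"
    gender: Optional[str]   # "masc", "fem", "neut", or None if unspecified
    person: int             # 1, 2, or 3

def agrees(pronoun: Mention, candidate: Mention) -> bool:
    """True if the candidate matches the pronoun in number, person, and gender."""
    same_number = pronoun.number == candidate.number
    same_person = pronoun.person == candidate.person
    # An unspecified gender (e.g. plural "they") is compatible with anything.
    same_gender = (pronoun.gender is None or candidate.gender is None
                   or pronoun.gender == candidate.gender)
    return same_number and same_person and same_gender

def filter_by_agreement(pronoun, candidates):
    """Keep only the candidates that pass the agreement check."""
    return [c for c in candidates if agrees(pronoun, c)]

# "John has a new car. It is attractive." / "*John has a new car. They are red."
john = Mention("John", "sg", "masc", 3)
car = Mention("a new car", "sg", "neut", 3)
it = Mention("it", "sg", "neut", 3)
they = Mention("they", "pl", None, 3)

print([m.text for m in filter_by_agreement(it, [john, car])])
print([m.text for m in filter_by_agreement(they, [john, car])])
```

With these hand-assigned features, "it" keeps only the car, while "they" is left with no candidate, which mirrors the starred example.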
Constraints (for English)
 Syntactic Constraints:
 Syntactic relationships between a referring expression and a possible antecedent noun phrase
 John bought himself a new car. [himself=John]
 John bought him a new car. [him≠John]

 Selectional Restrictions:
 A verb places restrictions on its arguments.
 John parked his Acura in the garage. He had driven it around for hours. [it=Acura, it≠garage]
 I picked up the book and sat in a chair. It broke. [it=chair]
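Selectional restrictions can be treated as one more filter. The sketch below is a toy illustration only: the two lookup tables are hypothetical, hand-written stand-ins for whatever lexical resource a real system would consult.

```python
# Hypothetical toy lexicon: semantic classes each verb accepts as its object.
OBJECT_RESTRICTIONS = {"drive": {"vehicle"}, "read": {"text"}}

# Hypothetical semantic classes for a few nouns from the examples.
SEMANTIC_CLASS = {"Acura": "vehicle", "garage": "location", "book": "text"}

def satisfies_restriction(verb, candidate):
    """True if `candidate` is an acceptable object of `verb`."""
    allowed = OBJECT_RESTRICTIONS.get(verb)
    if allowed is None:
        return True  # no restriction recorded for this verb
    return SEMANTIC_CLASS.get(candidate) in allowed

# "John parked his Acura in the garage. He had driven it around for hours."
candidates = ["Acura", "garage"]
survivors = [c for c in candidates if satisfies_restriction("drive", c)]
print(survivors)
```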
Syntax can’t be all there is
 John hit Bill. He was severely injured.

 Margaret Thatcher admires Hillary Clinton, and George W. Bush absolutely worships her.
Preferences in Pronoun Interpretation

 Recency:
 Entities introduced recently are more salient than those introduced before.
 John has a Legend. Bill has an Escort. Mary likes to drive it.

 Grammatical Role:
 Entities mentioned in subject position are more salient than those in object position.
 Bill went to the Acura dealership with John. He bought an Escort. [he=Bill]
Preferences in Pronoun Interpretation

 Repeated Mention:
 Entities that have been focused on in the prior discourse are more salient.
 John needed a car to get to his new job. He decided that he wanted something sporty. Bill went to the Acura dealership with him. He bought an Integra. [he=John]
Preferences in Pronoun Interpretation
 Parallelism (more generally – discourse structure):
 There are also strong preferences that appear to be induced by parallelism effects.

 Mary went with Sue to the cinema. Sally went with her to the mall. [her = Sue]

 Jim surprised Paul and then Julie shocked him. [him = Paul]
Preferences in Pronoun Interpretation
 Verb Semantics:
 Certain verbs appear to place a semantically-oriented emphasis on one of their argument positions.

 John telephoned Bill. He had lost the book in the mall. [He = John]
 John criticized Bill. He had lost the book in the mall. [He = Bill]

 David praised Hans because he … [he = Hans]
 David apologized to Hans because he … [he = David]
Preferences in Pronoun Interpretation
 World knowledge in general:

 The city council denied the demonstrators a permit because they {feared|advocated} violence. [feared: they = the city council; advocated: they = the demonstrators]
The Plan
Introduce and compare 3 algorithms for
anaphora resolution:

 Hobbs 1978

 Lappin and Leass 1994

 Centering Theory
Hobbs 1978
 Hobbs, Jerry R., 1978, “Resolving Pronoun References”, Lingua, Vol. 44, pp. 311-338.

 Also in Readings in Natural Language Processing, B. Grosz, K. Sparck-Jones, and B. Webber, editors, pp. 339-352, Morgan Kaufmann Publishers, Los Altos, California.
Hobbs 1978

 Hobbs (1978) proposes an algorithm that searches parse trees (i.e., basic syntactic trees) for antecedents of a pronoun:
 starting at the NP node immediately dominating the pronoun
 in a specified search order
 looking for the first match of the correct gender and number

 Idea: discourse and other preferences will be approximated by search order.
Hobbs’s point
… the naïve approach is quite good. Computationally speaking, it will be a long time before a semantically based algorithm is sophisticated enough to perform as well, and these results set a very high standard for any other approach to aim for.

Yet there is every reason to pursue a semantically based approach. The naïve algorithm does not work. Anyone can think of examples where it fails. In these cases it not only fails; it gives no indication that it has failed and offers no help in finding the real antecedent. (p. 345)
Hobbs 1978

 This simple algorithm has become a baseline: more complex algorithms should do better than this.

 Hobbs distance: the i-th candidate NP considered by the algorithm is at a Hobbs distance of i.
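Hobbs distance is simply a candidate's 1-based rank in the order the algorithm proposes candidates. A minimal sketch, assuming the proposal order is already available as a list (the list below is a made-up illustration, not output from a parser):

```python
def hobbs_distance(proposals, antecedent):
    """Return the 1-based position at which `antecedent` is proposed,
    or None if the search never proposes it."""
    for i, candidate in enumerate(proposals, start=1):
        if candidate == antecedent:
            return i
    return None

# Hypothetical proposal order for some pronoun.
proposals = ["Bill", "an Escort", "John"]
print(hobbs_distance(proposals, "an Escort"))
```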
Hobbs’s “Naïve” Algorithm
1. Begin at the NP node immediately dominating the pronoun.
2. Go up the tree to the first NP or S node encountered.
 Call this node X, and the path used to reach it p.
 Search below X and to the left of p, left-to-right, breadth-first, proposing any NP node which has an NP or S node between it and X.
3. If X is the highest S node in the sentence,
 Search the trees of previous sentences, in order of recency, each left-to-right, breadth-first, proposing any NP encountered.
4. Otherwise, from X, go up to the first NP or S node encountered.
 Call this new node X, and the path to it p.
5. If X is an NP, and p does not pass through an N-bar that X immediately dominates, propose X.
6. Search below X, to the left of p, left-to-right, breadth-first, proposing any NP encountered.
7. If X is an S, search below X to the right of p, left-to-right, breadth-first, but without going below any NP or S encountered, proposing any NP encountered.
8. Go to 3.
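The sketch below covers only the search over previous sentences (the step taken once X is the highest S node), not the full algorithm. It assumes a hypothetical tree encoding as nested tuples `(label, child, ...)` with plain strings as words, and it omits the gender and number check.

```python
from collections import deque

def bfs_noun_phrases(tree):
    """Yield the NP subtrees of `tree` in left-to-right, breadth-first order."""
    queue = deque([tree])
    while queue:
        node = queue.popleft()
        if isinstance(node, str):      # a word (leaf): nothing to expand
            continue
        label, *children = node
        if label == "NP":
            yield node
        queue.extend(children)         # children are queued left to right

def words(tree):
    """Return the words covered by a subtree, in order."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

def propose_from_previous(trees_oldest_first):
    """Visit earlier sentences most recent first, proposing every NP found."""
    for tree in reversed(trees_oldest_first):
        for np in bfs_noun_phrases(tree):
            yield " ".join(words(np))

# "John has a Legend. Bill has an Escort."
s1 = ("S", ("NP", "John"),
      ("VP", ("V", "has"), ("NP", ("Det", "a"), ("N", "Legend"))))
s2 = ("S", ("NP", "Bill"),
      ("VP", ("V", "has"), ("NP", ("Det", "an"), ("N", "Escort"))))

proposals = list(propose_from_previous([s1, s2]))
print(proposals)
```

Because the traversal is breadth-first, the subject NP of each sentence is proposed before the object NP, which is how search order approximates the grammatical-role preference.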
Another example:

The referent for “he”: we follow the same path, get to the same place, but reject NP4,
then reject NP5. Finally, accept NP6.
Lappin and Leass 1994
 Idea: Maintain a discourse model, in which there are representations for potential referents (much like the DRSs we built throughout the quarter).

 Lappin and Leass 1994 propose a discourse model in which potential referents have degrees of salience.

 They try to resolve (pronoun) references by finding highly salient referents compatible with pronoun agreement features.

 In effect, they incorporate:
 recency
 syntax-based preferences
 agreement, but no (other) semantics
Lappin and Leass 1994
 First, we assign a number of salience factors & salience values to each referring expression.

 The salience values (weights) are arrived at by experimentation on a certain corpus.
Lappin and Leass 1994
Salience Factor             Salience Value
Sentence recency            100
Subject emphasis            80
Existential emphasis        70
Accusative emphasis         50
Indirect object emphasis    40
Non-adverbial emphasis      50
Head noun emphasis          80
Lappin and Leass 1994
 Non-adverbial emphasis is to penalize “demarcated adverbial PPs” (e.g., “In his hand, …”) by giving points to all other types.

 Head noun emphasis is to penalize embedded referents.

 Other factors & values:
 Grammatical role parallelism: 35
 Cataphora: -175
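The factor weights listed on these slides can be held in a lookup table and summed. A minimal sketch; the dictionary keys and the `salience` helper are illustrative names, not taken from Lappin and Leass:

```python
# Weights as listed on the slides.
SALIENCE_WEIGHTS = {
    "sentence_recency": 100,
    "subject": 80,
    "existential": 70,
    "accusative": 50,
    "indirect_object": 40,
    "non_adverbial": 50,
    "head_noun": 80,
}
ROLE_PARALLELISM = 35   # bonus applied when resolving a pronoun
CATAPHORA = -175        # penalty when the candidate follows the pronoun

def salience(factors):
    """Sum the weights of the factors that apply to one referring expression."""
    return sum(SALIENCE_WEIGHTS[f] for f in factors)

# A subject NP that is a head noun and not inside a demarcated adverbial PP.
subject_np = salience(["sentence_recency", "subject", "non_adverbial", "head_noun"])
print(subject_np)              # 100 + 80 + 50 + 80
print(subject_np + CATAPHORA)  # the same NP as a cataphoric candidate
```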
Lappin and Leass 1994
 The algorithm employs a simple weighting scheme that integrates the effects of several preferences:

 For each new entity, a representation for it is added to the discourse model and a salience value is computed for it.

 The salience value is computed as the sum of the weights assigned by a set of salience factors.
 The weight a salience factor assigns to a referent is the highest one the factor assigns to the referent’s referring expressions.

 Salience values are cut in half each time a new sentence is processed.
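The halving step can be sketched as a function over the discourse model, here assumed to be a plain dictionary from referent name to salience value (a simplification chosen for illustration):

```python
def start_new_sentence(model):
    """Return a copy of the discourse model with every salience value halved."""
    return {referent: value / 2 for referent, value in model.items()}

model = {"John": 310, "Integra": 280}
after_one = start_new_sentence(model)
after_two = start_new_sentence(after_one)
print(after_one)
print(after_two)
```

A referent that is not mentioned again therefore loses salience geometrically, which is how the scheme encodes recency.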
Lappin and Leass 1994
The steps taken to resolve a pronoun are as follows:

 Collect potential referents (up to four sentences back);
 Remove potential referents that don’t semantically agree;
 Remove potential referents that don’t syntactically agree;
 Compute salience values for the remaining potential referents;
 Select the referent with the highest salience value.
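The steps above can be sketched as a small pipeline. This is an illustration under several assumptions, not the authors' implementation: the `Referent` record is hypothetical, the agreement step is reduced to number and gender, the syntactic filter is stood in for by a caller-supplied `blocked` set of names, and only the role-parallelism bonus is applied (the cataphora penalty is left out).

```python
from dataclasses import dataclass

ROLE_PARALLELISM = 35   # bonus listed on the slides
WINDOW = 4              # how many sentences back to look

@dataclass
class Referent:
    name: str
    number: str      # "sg" or "pl"
    gender: str      # "masc", "fem", or "neut"
    role: str        # grammatical role of its latest mention
    sentence: int    # index of the sentence containing its latest mention
    salience: float  # current, already halved, salience value

def resolve(pronoun, current_sentence, referents, blocked=()):
    """Pick a referent for `pronoun`, given as (number, gender, role)."""
    number, gender, role = pronoun
    # 1. Collect potential referents from the last few sentences.
    pool = [r for r in referents if current_sentence - r.sentence <= WINDOW]
    # 2. Remove referents that fail agreement.
    pool = [r for r in pool if (r.number, r.gender) == (number, gender)]
    # 3. Remove referents ruled out by syntactic constraints (stand-in).
    pool = [r for r in pool if r.name not in blocked]
    # 4. Compute salience, adding the parallelism bonus where roles match.
    def score(r):
        return r.salience + (ROLE_PARALLELISM if r.role == role else 0)
    # 5. Select the most salient survivor, or None if nothing is left.
    return max(pool, key=score, default=None)

# State before "He showed it to Bob." (values from sentence 1, halved).
referents = [
    Referent("John", "sg", "masc", "subject", 0, 155.0),
    Referent("Integra", "sg", "neut", "object", 0, 140.0),
    Referent("dealership", "sg", "neut", "oblique", 0, 115.0),
]
he = resolve(("sg", "masc", "subject"), 1, referents)
it = resolve(("sg", "neut", "object"), 1, referents)
print(he.name, it.name)
```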


Lappin and Leass 1994
 Salience factors apply per NP, i.e., per referring expression.

 However, we want the salience of a potential referent.
 So, all NPs determined to have the same referent are examined.
 For each salience factor, the referent receives the highest weight that factor assigns to any of these referring expressions; the referent’s salience is the sum of these weights.

 Salience factors are considered to have scope over a sentence
 so references to the same entity over multiple sentences add up
 while multiple references within the same sentence don’t.
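Within one sentence, each factor counts at most once for a referent, however many mentions trigger it. Since every factor has a single fixed weight, taking the highest weight per factor amounts to taking the union of the factor sets. A minimal sketch with illustrative names:

```python
WEIGHTS = {
    "sentence_recency": 100, "subject": 80, "existential": 70,
    "accusative": 50, "indirect_object": 40,
    "non_adverbial": 50, "head_noun": 80,
}

def sentence_contribution(mentions):
    """Salience one sentence contributes to a referent.

    `mentions` is a list of factor-name sets, one set per mention of the
    same referent within the sentence. Each factor is counted once.
    """
    factors = set().union(*mentions)
    return sum(WEIGHTS[f] for f in factors)

subject_mention = {"sentence_recency", "subject", "non_adverbial", "head_noun"}
object_mention = {"sentence_recency", "accusative", "non_adverbial", "head_noun"}

# Two mentions of one referent in the same sentence: shared factors count once.
print(sentence_contribution([subject_mention, object_mention]))
# A single mention, for comparison.
print(sentence_contribution([subject_mention]))
```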
Example (from Jurafsky and Martin)
 John saw a beautiful Acura Integra at the dealership.
 He showed it to Bob.
 He bought it.
John saw a beautiful Acura Integra at the dealership.
Salience Factor             John    Integra    dealership
Sentence recency            100     100        100
Subject emphasis            80      -          -
Existential emphasis        -       -          -
Accusative emphasis         -       50         -
Indirect object emphasis    -       -          -
Non-adverbial emphasis      50      50         50
Head noun emphasis          80      80         80
He showed it to Bob.
Salience Factor             He      It      Bob
Sentence recency            100     100     100
Subject emphasis            80      -       -
Existential emphasis        -       -       -
Accusative emphasis         -       50      -
Indirect object emphasis    -       -       40
Non-adverbial emphasis      50      50      50
Head noun emphasis          80      80      80
He bought it.
Salience Factor             He      It
Sentence recency            100     100
Subject emphasis            80      -
Existential emphasis        -       -
Accusative emphasis         -       50
Indirect object emphasis    -       -
Non-adverbial emphasis      50      50
Head noun emphasis          80      80
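The running totals for this example can be worked out by summing each table and halving between sentences. The sketch below takes the resolutions He = John and it = Integra as given: John is the only masculine candidate, and Integra outranks the dealership 140 to 115 after the first halving. Variable and function names are illustrative.

```python
W = {"recency": 100, "subject": 80, "existential": 70, "accusative": 50,
     "indirect_object": 40, "non_adverbial": 50, "head_noun": 80}

# Factors that every NP in this example receives.
COMMON = ("recency", "non_adverbial", "head_noun")

def weight(*roles):
    """Sum the common factors plus any role-specific ones."""
    return sum(W[f] for f in COMMON + roles)

def halve(model):
    """Salience values are cut in half when a new sentence starts."""
    return {ref: value / 2 for ref, value in model.items()}

# Sentence 1: John saw a beautiful Acura Integra at the dealership.
s1 = {"John": weight("subject"),
      "Integra": weight("accusative"),
      "dealership": weight()}

# Sentence 2: He showed it to Bob.  (He = John, it = Integra)
s2 = halve(s1)
s2["John"] += weight("subject")
s2["Integra"] += weight("accusative")
s2["Bob"] = weight("indirect_object")

# Start of sentence 3: He bought it.
s3 = halve(s2)

for label, model in [("after S1", s1), ("after S2", s2), ("start of S3", s3)]:
    print(label, model)
```

At the start of the third sentence John (232.5) outranks Bob (135) for "He", and the Integra (210) outranks the dealership (57.5) for "it".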
Evaluation of Lappin and Leass 1994
 Weights were arrived at by experimentation on a corpus of computer training manuals.

 Combined with other filters, the algorithm achieves 86% accuracy (74% / 89% inter- / intra-sentential):
 applied to unseen data of the same genre

 Hobbs’ algorithm applied to the same data is 82% accurate (87% / 81% inter- / intra-sentential).
