NLP Unit-IV
Unit-IV
• Predicate Argument Structure
• Meaning Representation Systems
Predicate-Argument Structure
• Resources
• Systems
• Software
Predicate-Argument Structure
• Shallow semantic parsing, or semantic role labeling, is the process of
identifying the various arguments of predicates in a sentence.
• There has been debate over what constitutes the set of arguments and
what the granularity of the argument labels should be for various
predicates.
Resources
• FrameNet
• PropBank
• Other Resources
Resources
• We have two important corpora that are semantically tagged: one is
FrameNet and the other is PropBank.
• These resources have driven a shift from rule-based approaches to more
data-oriented approaches.
• These approaches focus on transforming linguistic insights into features.
• FrameNet is based on the theory of frame semantics where a given
predicate invokes a semantic frame initiating some or all of the possible
semantic roles belonging to that frame.
• PropBank is based on Dowty’s prototype theory and uses a more
linguistically neutral view. Each predicate has a set of core arguments
that are predicate dependent and all predicates share a set of noncore
or adjunctive arguments.
FrameNet
• FrameNet contains frame-specific semantic annotation of a number
of predicates in English.
• The process of FrameNet annotation consists of identifying specific
semantic frames and creating a set of frame-specific roles called
frame elements.
• A set of predicates that instantiate the semantic frame irrespective of
their grammatical category are identified and a variety of sentences
are labelled for those predicates.
• The labeling process identifies the following:
• The frame that an instance of the predicate lemma invokes
• The semantic arguments for that instance
• Tagging them with one of a predetermined set of frame elements for that
frame.
FrameNet
• The combination of the predicate lemma and the frame that its instance
invokes is called a lexical unit (LU).
• Each sense of a polysemous word tends to be associated with a unique
frame.
• The verb “break” can mean fail to observe (a law, regulation, or
agreement), in which case it belongs to a COMPLIANCE frame along with
other words such as violation, obey, and flout.
• It can also mean cause to suddenly separate into pieces in a destructive
manner, in which case it belongs to a CAUSE_TO_FRAGMENT frame along
with other words such as fracture, fragment, and smash.
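A lexical unit, defined earlier as the pairing of a predicate lemma with the frame its instance invokes, can be sketched as a simple mapping. The gloss strings and the `frames_of` helper below are illustrative assumptions, not FrameNet's actual representation:

```python
# Sketch of lexical units: (lemma, frame) pairs with a gloss. A polysemous
# lemma like "break" yields two distinct LUs, one per invoked frame.
lexical_units = {
    ("break", "COMPLIANCE"):        "fail to observe (a law, agreement)",
    ("break", "CAUSE_TO_FRAGMENT"): "cause to separate into pieces",
    ("obey",  "COMPLIANCE"):        "act in accordance with (a law)",
}

def frames_of(lemma):
    """All frames that instances of this lemma can invoke."""
    return sorted(frame for (lu_lemma, frame) in lexical_units
                  if lu_lemma == lemma)

print(frames_of("break"))  # ['CAUSE_TO_FRAGMENT', 'COMPLIANCE']
```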
FrameNet
• Here the frame Awareness is instantiated by the verb predicate believe
and the noun predicate comprehension.
FrameNet
• FrameNet contains a wide variety of nominal predicates like:
• Ultra-nominal
• Nominals
• Nominalizations
• It also contains some adjective and preposition predicates.
• The frame elements share the same meaning across the lexical units.
• Example:
The frame element BODY_PART in frame CURE has the same meaning
as the same element in the frame GESTURE or WEARING.
PropBank
• PropBank includes annotations of arguments of verb predicates.
• PropBank restricts the argument boundaries to that of a syntactic
constituent as defined in the Penn Treebank.
• The arguments are tagged either:
• Core arguments, with labels of the form ARGN, where N takes values from 0 to 5.
• Adjunctive arguments, with labels of the form ARGM-X, where X can take
values such as TMP for temporal, LOC for locative, etc.
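The two label families can be told apart mechanically. The helper below is a small illustrative sketch of that distinction, not part of any PropBank toolkit:

```python
# Distinguish PropBank core arguments (ARG0..ARG5) from adjunctive
# arguments (ARGM-X); label shapes follow the description in the text.
import re

def parse_propbank_label(label):
    """Return ('core', N) for ARGN, or ('adjunct', X) for ARGM-X."""
    m = re.fullmatch(r"ARG([0-5])", label)
    if m:
        return ("core", int(m.group(1)))
    m = re.fullmatch(r"ARGM-([A-Z]+)", label)
    if m:
        return ("adjunct", m.group(1))
    raise ValueError("not a PropBank argument label: " + label)

print(parse_propbank_label("ARG0"))      # ('core', 0)
print(parse_propbank_label("ARGM-TMP"))  # ('adjunct', 'TMP')
```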
PropBank
PropBank
• Adjunctive arguments share the same meaning across all predicates.
• The meaning of core arguments has to be interpreted in connection
with a predicate.
PropBank
• Let us look at an example from PropBank corpus along with its syntax
tree.
PropBank
• Most Treebank-style trees have trace nodes that refer to another node in
the tree but have no words associated with them.
• These can also be marked as arguments.
• Since traces are not reproduced by a typical syntactic parser, the community
has disregarded them in most standard experiments.
• There are a few disagreements between the Treebank and PropBank. In such
cases, a sequence of nodes in the tree is annotated as the argument; these
are called discontinuous arguments.
FrameNet Vs Propbank
• An important distinction between FrameNet and Propbank is as
follows:
• In FrameNet we have lexical units which are words paired with their meanings
or the frames that they invoke.
• In Propbank each lemma has a list of different framesets that represent all the
senses for which there is a different argument structure.
Other Resources
• Other resources have been developed to aid further research in
predicate-argument recognition.
• NomBank was inspired by PropBank.
• In the process of identifying and tagging the arguments of nouns, the
NOMLEX (NOMinalization LEXicon) dictionary was expanded to cover
about 6,000 entries.
• The frames from PropBank were used to generate the frame files for
NomBank.
• Another resource that ties PropBank frames with more predicate-
independent thematic roles and also provides a richer representation
associated with Levin classes is VerbNet.
Other Resources
• FrameNet frames are also related in the sense that FrameNet’s generation
of verb classes is more data driven than theoretical.
• The philosophy of FrameNet and PropBank have propagated to other
languages.
• Since the nature of semantics is largely language independent, frames can be
reused to annotate data in other languages.
• The SALSA project was the first to put this into practice.
• Although FrameNet tags both literal and metaphorical interpretations, the
SALSA project remained close to the lexical meaning.
• There are FrameNets in other languages like Japanese, Spanish and
Swedish.
Other Resources
• PropBank has inspired creation of similar resources in Chinese, Arabic,
Korean, Spanish, Catalan, and Hindi.
• Every new PropBank requires the creation of a new set of frame files,
unlike FrameNet.
• FrameNet and PropBank are not the only styles used in practice.
• The Prague Dependency Treebank tags the predicate-argument structure in
its tectogrammatical layer, on top of the dependency structure.
• It also makes a distinction similar to core versus adjunctive arguments,
called inner participants and free modifications.
• The NAIST text corpus is strongly influenced by the traditions in
Japanese linguistics.
Systems
• Syntactic Representations
• Classification Paradigms
• Overcoming the Independence Assumptions
• Feature Performance
• Feature Salience
• Feature Selection
• Size of Training Data
• Overcoming Parsing Errors
• Noun Arguments
• Multilingual Issues
• Robustness across Genre
Systems
• Very little research has gone into learning predicate-argument structure
from unannotated corpora.
• One reason is that predicate-argument structure is close to actual
applications and has always been tied to the area of information extraction.
• Early systems for predicate-argument structure were rule based, built on
heuristics over the syntax tree.
Systems
• A few of the early systems were:
• The Absity parser and the PUNDIT understanding system were among the early
rule-based systems.
• One hybrid method for thematic role tagging, using WordNet as a resource,
was also introduced.
• Other notable applications are:
• Corpus-based studies by Manning and by Briscoe and Carroll, which seek to
derive subcategorization information from large corpora
• Work by Pustejovsky, which tries to acquire lexical semantic knowledge from
corpora
Systems
• A major step in semantic role labelling research happened after the
introduction of FrameNet and PropBank.
• One problem with these corpora is that significant work goes into creating
the frames, that is, into classifying verbs into framesets in preparation
for manual annotation.
• Providing coverage for all possible verbs in one or more languages
requires significant manual effort.
• Green, Dorr, and Resnik propose a way to learn the frame structures
automatically, but the result is not accurate enough to replace manual
frame creation.
Systems
• Swier and Stevenson represent one of the more recent approaches to
handling this problem in an unsupervised fashion.
• Let us now look at a few applications after the advent of these corpora.
• Example:
Systems
• In the example sentence on the previous slide, for the predicate
operates, the word “It” fills the role ARG0, the word “stores” fills
the role ARG1, and the sequence of words “mostly in Iowa and Nebraska”
fills the role ARGM-LOC.
• An ARGN for one predicate need not have similar semantics compared
to another predicate.
• FrameNet was the first project that used hand-tagged arguments of
predicates in data.
• Gildea and Jurafsky formulated semantic role labeling as a supervised
classification problem, which assumes that the arguments of the predicate
can be mapped to constituents in the sentence’s syntax tree.
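The labeling of the “It operates stores …” example can be written down as a plain data structure. The token-index spans and the `argument_text` helper below are an illustrative sketch, not PropBank's file format:

```python
# PropBank-style labeling of the example sentence, as a plain structure.
# Spans are half-open (start, end) token indices.
sentence = "It operates stores mostly in Iowa and Nebraska".split()

annotation = {
    "predicate": "operates",
    "arguments": {
        "ARG0": (0, 1),      # "It"
        "ARG1": (2, 3),      # "stores"
        "ARGM-LOC": (3, 8),  # "mostly in Iowa and Nebraska"
    },
}

def argument_text(tokens, span):
    """Recover the surface string an argument span covers."""
    start, end = span
    return " ".join(tokens[start:end])

print(argument_text(sentence, annotation["arguments"]["ARGM-LOC"]))
# mostly in Iowa and Nebraska
```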
Systems
• The predicate itself can also be mapped to a node in the syntax tree of
that sentence.
• They introduced three tasks which can be used to evaluate the
system:
• Argument Identification: This is the task of identifying all and only the parse
constituents that represent valid semantic arguments of a predicate.
• Argument Classification: Given constituents known to represent arguments
of a predicate, assign the appropriate argument labels to them.
• Argument identification and classification: This task is a combination of the
previous two tasks where the constituents that represent arguments of a
predicate are identified and the appropriate argument label is assigned to
them.
Systems
• After parsing, each node in the parse tree can be classified as:
• One that represents a semantic argument (non-null node)
• One that does not represent any semantic argument (null node)
• The non-null node can further be classified into the set of argument
labels.
• In the previous tree the noun phrase that encompasses “mostly in
Iowa and Nebraska” is a null node because it does not correspond to
a semantic argument.
• The node NP that encompasses “stores” is a non-null node because it
does correspond to a semantic argument: ARG1.
Systems
• The pseudocode for a generic semantic role labeling (SRL) algorithm is as
follows:
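A minimal sketch of the generic two-stage SRL algorithm described above (argument identification followed by argument classification). The toy constituents and the two rule-based stand-ins for trained classifiers are illustrative assumptions:

```python
def srl(constituents, identify, classify):
    """Generic two-stage semantic role labeling over parse constituents."""
    labels = {}
    for node in constituents:
        if identify(node):                        # argument identification
            labels[node["text"]] = classify(node) # argument classification
    return labels

# Toy constituents with precomputed features for the predicate "operates".
constituents = [
    {"text": "It", "phrase": "NP", "before_pred": True},
    {"text": "stores", "phrase": "NP", "before_pred": False},
    {"text": "operates stores", "phrase": "VP", "before_pred": False},
]

# Stand-ins for trained models: NPs are arguments; position picks the label.
identify = lambda n: n["phrase"] == "NP"
classify = lambda n: "ARG0" if n["before_pred"] else "ARG1"

print(srl(constituents, identify, classify))
# {'It': 'ARG0', 'stores': 'ARG1'}
```

A real system would replace the two lambdas with classifiers trained on the features discussed in the next sections.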
Syntactic Representation
• Phrase Structure Grammar
• Combinatory Categorial Grammar
• Tree Adjoining Grammar
Syntactic Representations
• PropBank was created as a layer of annotation on top of Penn
TreeBank style phrase structure trees.
• Gildea and Jurafsky added argument labels to parses obtained from a
parser trained on the Penn Treebank.
• Researchers have also used other types of sentence representations
to tackle the semantic role labeling problem.
• We now look at a few of these sentence representations and the
features that were used to tag text with PropBank arguments.
Phrase Structure Grammar
• FrameNet marks word spans in sentences to represent arguments
whereas PropBank tags nodes in a treebank tree with arguments.
• Since the phrase structure representation is amenable to tagging
Gildea and Jurafsky introduced the following features:
• Path: This feature is the syntactic path through the parse tree from
the parse constituent to the predicate being classified.
• For example:
• In the figure in the next slide, the path from ARG0 “It” to the predicate
“operates” is represented by the string NP↑S↓VP↓VBZ.
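The Path feature can be computed by walking up from the constituent to the lowest common ancestor and then down to the predicate's node. The toy tree below, given as explicit parent links covering just the nodes of this example, is a sketch rather than a parser API:

```python
# Toy tree for "It operates ...": node names carry a word suffix where
# needed to keep them unique; parent links are given explicitly.
parents = {"NP-It": "S", "VP": "S", "VBZ": "VP", "S": None}

def ancestors(node):
    """The node itself followed by its ancestors up to the root."""
    chain = [node]
    while parents[node] is not None:
        node = parents[node]
        chain.append(node)
    return chain

def path_feature(constituent, predicate):
    """Up-arrows to the lowest common ancestor, down-arrows to the predicate."""
    up, down = ancestors(constituent), ancestors(predicate)
    common = next(n for n in up if n in down)
    up_part = up[: up.index(common) + 1]
    down_part = list(reversed(down[: down.index(common)]))
    label = lambda n: n.split("-")[0]  # strip the word suffix
    return ("\u2191".join(label(n) for n in up_part)
            + "\u2193" + "\u2193".join(label(n) for n in down_part))

print(path_feature("NP-It", "VBZ"))  # NP↑S↓VP↓VBZ
```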
Phrase Structure Grammar
Phrase Structure Grammar
• Predicate: The identity of the predicate lemma is used as a feature.
• Phrase Type: This feature is the syntactic category (NP, PP, S, etc.) of
the constituent to be labeled.
• Position: This feature is a binary feature identifying whether the
phrase is before or after the predicate.
• Voice: This feature indicates whether the predicate is realized as an
active or passive construction. A set of hand written expressions on
the syntax tree are used to identify the passive-voiced predicates.
Phrase Structure Grammar
• Head Word: This feature is the syntactic head of the phrase. It is
calculated using a head word table.
• Subcategorization: This feature is the phrase structure rule expanding
the predicate’s parent node in the parse tree.
• For example:
• In the figure in the previous slide, the subcategorization for the predicate
“operates” is VP → VBZ-NP.
Phrase Structure Grammar
• Verb Clustering:
• The predicate is one of the most salient features in predicting the argument
class.
• Gildea and Jurafsky used a distance function for clustering that is based on the
intuition that verbs with similar semantics will tend to have similar direct
objects.
• For example:
• Verbs such as eat, devour and savor will occur with direct objects describing food.
• The clustering algorithm uses a database of verb-direct object relations.
• The verbs were clustered into 64 classes using the probabilistic co-occurrence
model.
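The clustering intuition can be sketched with direct-object count vectors. The counts below are invented, and cosine similarity stands in for the probabilistic co-occurrence model Gildea and Jurafsky actually used:

```python
# Verbs represented by counts of their direct objects: verbs with similar
# semantics (eat, devour) share objects, unrelated verbs (fly) do not.
import math

direct_objects = {
    "eat":    {"bread": 4, "soup": 3, "cake": 2},
    "devour": {"bread": 2, "cake": 5, "meal": 1},
    "fly":    {"plane": 5, "kite": 3},
}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a.get(k, 0) * v for k, v in b.items())
    norm = lambda c: math.sqrt(sum(v * v for v in c.values()))
    return dot / (norm(a) * norm(b))

# eat/devour share food objects, so they would land in the same cluster.
print(cosine(direct_objects["eat"], direct_objects["devour"])
      > cosine(direct_objects["eat"], direct_objects["fly"]))  # True
```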
Phrase Structure Grammar
• Surdeanu suggested the following features:
• Content Word: Since in some cases head words are not very informative, a
different set of rules was used to identify a so-called content word
instead of the head-word-finding rules. The rules that they used are:
Phrase Structure Grammar
• POS of Head Word and Content Word: Adding the POS of the head word
and the content word of a constituent as features helps generalize in the
task of argument identification and gave a performance boost to their
decision-tree-based system.
• Named Entity of the Content Word: Certain roles, such as ARGM-TMP and
ARGM-LOC, tend to contain time or place named entities. This information
was added as a set of binary-valued features.
• Boolean Named Entity Flags: Named entity information was also added as a
feature. They created indicator functions for each of the seven named entity
types: PERSON, PLACE, TIME, DATE, MONEY, PERCENT, ORGANIZATION.
Phrase Structure Grammar
• Phrasal Verb Collocations: This feature comprises frequency statistics
related to the verb and the immediately following preposition.
• Fleischman, Kwon, and Hovy added the following features to their
system:
• Logical function: This feature takes three values (external argument,
object argument, and other argument) and is computed using heuristics
on the syntax tree.
• Order of Frame Elements: This feature represents the position of a
frame element relative to other frame elements in a sentence.
Phrase Structure Grammar
• Syntactic Pattern: This feature is also generated using heuristics on
the phrase type and the logical function of the constituent.
• Previous Role: This is a set of features indicating the nth previous role
that had been observed/assigned by the system for the current
predicate.
Phrase Structure Grammar
• Pradhan suggested using the following additional features:
• Named Entities in Constituents:
• Named entities such as location and time are important for the adjunctive
arguments ARGM-LOC and ARGM-TMP.
• Entity tags are also helpful in cases where head words are not common.
• Each of these features is true if its representative type of named entity is contained
in the constituent.
• Verb Sense Information:
• The arguments that a predicate can take depend on the sense of the predicate.
• Each predicate tagged in the PropBank corpus is assigned a separate set of
arguments depending on the sense in which it is used.
• This is also known as the frameset ID.
Phrase Structure Grammar
• The table below illustrates the argument sets for a word. Depending on
the sense of the predicate “talk”, either ARG1 or ARG2 can identify the
hearer.
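As an illustration of how the hearer shifts between ARG1 and ARG2, the sketch below writes two framesets for “talk” as plain dictionaries. The role glosses are paraphrased approximations, not the official frame files:

```python
# Illustrative frameset entries for "talk": the hearer is ARG2 in one
# sense and ARG1 in the other. Glosses are approximate.
framesets = {
    "talk.01": {  # "speak": talk about something to someone
        "ARG0": "talker", "ARG1": "topic", "ARG2": "hearer",
    },
    "talk.02": {  # "persuade": talk someone into/out of something
        "ARG0": "talker", "ARG1": "hearer", "ARG2": "resulting action",
    },
}

def role_of_hearer(frameset_id):
    """Which ARGN labels the hearer under the given frameset (sense)."""
    roles = framesets[frameset_id]
    return next(arg for arg, gloss in roles.items() if gloss == "hearer")

print(role_of_hearer("talk.01"), role_of_hearer("talk.02"))  # ARG2 ARG1
```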
Classification Paradigms
• Rule Based
• Supervised
Rule Based
• A few semantic parsing systems that performed very well for both
ATIS and Communicator projects were rule-based systems.
• They used an interpreter whose semantic grammar was handcrafted
to be robust to speech recognition errors.
• A syntactic explanation of a sentence is much more complex than the
underlying semantic information.
• Parsing the meaning units in the sentence directly into semantics proved
to be a better approach.
• In dealing with spontaneous speech, the system has to account for
ungrammatical constructions, stutters, filled pauses, etc.
Rule Based
• Word order becomes less important which leads to meaning units
scattered in the sentences and not necessarily in the order that would
make sense to a syntactic parser.
• Ward’s system, Phoenix, uses recursive transition networks (RTNs) and a
handcrafted grammar to extract a hierarchical frame structure.
• It reevaluates and adjusts the values of the frames with each new piece of
information obtained.
• The system had the following error rates:
• 13.2% for spontaneous speech input, of which 4.4% was due to speech
recognition word error
• 9.3% for transcript input
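The slot-filling idea behind such semantic grammars can be sketched with a few patterns: word order and grammaticality matter less than spotting the meaning units and dropping them into frame slots. The flight-query slots and patterns below are invented for illustration, and the real Phoenix system used handcrafted RTNs rather than regular expressions:

```python
# Keyword-anchored patterns fill frame slots even from disfluent input.
import re

slot_patterns = {
    "origin":      r"\bfrom (\w+)",
    "destination": r"\bto (\w+)",
    "day":         r"\bon (monday|tuesday|wednesday|thursday|friday)",
}

def fill_frame(utterance):
    """Scan the utterance for each slot's pattern, ignoring word order."""
    frame = {}
    for slot, pattern in slot_patterns.items():
        m = re.search(pattern, utterance.lower())
        if m:
            frame[slot] = m.group(1)
    return frame

# Spontaneous, slightly disfluent input still yields a usable frame.
print(fill_frame("uh show me flights to denver from boston on friday"))
# {'origin': 'boston', 'destination': 'denver', 'day': 'friday'}
```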
Supervised
• The following are the few problems with rule-based systems:
• They need some effort upfront to create the rules
• The time and specificity required to write rules restrict the development to
systems that operate in limited domains
• They are hard to maintain and scale up as the problems become more
complex and more domain independent
• They tend to be brittle
• As an alternative statistical models derived from hand annotated data
can be used.
• However, statistical models cannot deal with phenomena that are not
represented in the available hand-annotated data.
Supervised
• During the ATIS evaluations some data was hand-tagged for semantic information.
• Schwartz used that information to create the first end-to-end supervised statistical
learning system for ATIS domain.
• They had four components in their system:
• Semantic parse
• Semantic frame
• Discourse
• Backend
• This system used a supervised learning approach, with training quickly
augmented through a human-in-the-loop corrective approach that generated
more, if lower-quality, data for improved supervision.
Supervised
• This research is now known as natural language interface for databases
(NLIDB).
• Zelle and Mooney tackled the task of retrieving answers from a Prolog
database by converting natural language questions into Prolog queries
in the domain of GeoQuery.
• The CHILL (Constructive Heuristics Induction for Language Learning)
system uses a shift-reduce parser to map the input sentence into parses
expressed as a Prolog program.
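The input/output behavior of such a system can be sketched as a mapping from questions to Prolog queries. The small pattern table below stands in for the shift-reduce parser CHILL actually learns, and the query syntax only mimics GeoQuery's style:

```python
# Map natural-language questions to GeoQuery-style Prolog queries.
import re

patterns = [
    (r"what is the capital of (\w+)\??",
     r"answer(C, capital(\1, C))"),
    (r"what rivers are in (\w+)\??",
     r"answer(R, (river(R), loc(R, \1)))"),
]

def to_prolog(question):
    """Return a Prolog query string, or None if no pattern applies."""
    q = question.lower()
    for pattern, template in patterns:
        if re.fullmatch(pattern, q):
            return re.sub(pattern, template, q)
    return None

print(to_prolog("What is the capital of Texas?"))
# answer(C, capital(texas, C))
```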
Supervised
• A representation closer to formal logic than SQL is preferred for CHILL
because it can be translated into other equivalent representations.
• It took CHILL 175 training queries to match the performance of Geobase.
• After the advances in machine learning, new approaches were identified
and existing ones were refined.
• The SCISSOR (Semantic Composition that Integrates Syntax and
Semantics to get Optimal Representation) system uses a statistical
syntactic parser to create a Semantically Augmented Parse Tree (SAPT).
• Training for SCISSOR consists of a (natural language, SAPT, meaning
representation) triplet.
Supervised
• KRISP (Kernel-based Robust Interpretation for Semantic Parsing) uses string
kernels and SVMs to improve the underlying learning techniques.
• WASP (Word Alignment based Semantic Parsing) takes a radical approach to
semantic parsing by using state-of-the-art machine translation techniques
to learn a semantic parser.
• Wong and Mooney treat the meaning representation language as an
alternative form of natural language.
• They used GIZA++ to produce an alignment between the natural language
and a variation of the meaning representation language.
• Complete meaning representations are then formed by combining these
aligned strings using a synchronous CFG framework.
Supervised
• SCISSOR, which benefits from SAPTs, is more accurate than WASP and
KRISP.
• These systems also have semantic parsers for Spanish, Turkish, and
Japanese with similar accuracies.
• Another approach is from Zettlemoyer and Collins.
• They trained a structured classifier for natural language interfaces.
• The system learned a probabilistic combinatory categorial grammar (PCCG)
along with a log-linear model that represents the distribution over the
syntactic and semantic analyses conditioned on the natural language input.
Software
• The software programs available are as follows:
• WASP
• KRISPER
• CHILL