
SEM-I: Why and What?

This document discusses the SEM-I, which is a semantic interface specification that defines the semantic representations output by natural language grammars. The SEM-I specifies things like the syntax of representations, naming conventions, and attributes of variables. It serves as an interface between grammars and applications, allowing applications to understand the expected representations. The document outlines plans to develop the SEM-I, including automatically generating parts from grammars, documenting it, and establishing a change protocol. It also discusses how the SEM-I could be extended in the future to include more semantics through a proposed SEM-I++.

Uploaded by

Raj Mehta

SEM-I: why and what?

Overview
Interfacing grammars to other systems via semantics: requirements
What is in the SEM-I?
SEM-I tools
Some modest proposals ...
SEM-I++

Modular architecture
Language independent component
Meaning representation (MRS/RMRS)

Language dependent analysis/realization (DELPH-IN grammar)

string

Semantics as interface

Applications need to know what representations to expect / deliver:
  transfer component for MT
  query answering
  information extraction, etc
Deep/shallow integration via RMRS
  RMRS from shallow grammars is an underspecified form of semantics from deep grammars
  treats deep grammars as normative, so need to know their output
Explaining what we're doing!

What must be specified

Syntax of representation (XML)
Formalism (MRS/RMRS)
Naming conventions
Attributes and values on variables
Relations, features, constant values, variable sorts, optionality
  'grammar' relations (e.g., udef_q_rel)
  open-class relations (e.g., _interview_v_rel)
Hierarchy of relations (where motivated by denotation)

Consultants were interviewed by Abrams
<mrs>
<var vid='h1'/>
<ep><pred>prpstn_m_rel</pred><var vid='h1'/>
<fvpair><rargname>MARG</rargname><var vid='h3'/></fvpair></ep>
<ep><pred>udef_q_rel</pred><var vid='h6'/>
<fvpair><rargname>ARG0</rargname><var vid='x4'/></fvpair>
<fvpair><rargname>RSTR</rargname><var vid='h7'/></fvpair></ep>
<ep><pred>_consultant_n_rel</pred><var vid='h9'/>
<fvpair><rargname>ARG0</rargname><var vid='x4'/></fvpair></ep>
<ep><pred>_interview_v_rel</pred><var vid='h10'/>
<fvpair><rargname>ARG0</rargname><var vid='e2'/></fvpair>
<fvpair><rargname>ARG1</rargname><var vid='x11'/></fvpair>
<fvpair><rargname>ARG2</rargname><var vid='x4'/></fvpair></ep>
<ep><pred>_by_p_cm_rel</pred><var vid='h10'/>
<fvpair><rargname>ARG0</rargname><var vid='e13'/></fvpair>
<fvpair><rargname>ARG1</rargname><var vid='u12'/></fvpair>
<fvpair><rargname>ARG2</rargname><var vid='x11'/></fvpair></ep>
<ep><pred>proper_q_rel</pred><var vid='h14'/>
<fvpair><rargname>ARG0</rargname><var vid='x11'/></fvpair>
<fvpair><rargname>RSTR</rargname><var vid='h15'/></fvpair></ep>
<ep><pred>named_rel</pred><var vid='h17'/>
<fvpair><rargname>ARG0</rargname><var vid='x11'/></fvpair>
<fvpair><rargname>CARG</rargname><constant>abrams</constant></fvpair></ep>
<hcons hreln='qeq'><hi><var vid='h3'/></hi><lo><var vid='h10'/></lo></hcons>
<hcons hreln='qeq'><hi><var vid='h7'/></hi><lo><var vid='h9'/></lo></hcons>
<hcons hreln='qeq'><hi><var vid='h15'/></hi><lo><var vid='h17'/></lo></hcons>
</mrs>
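
An application consuming this XML needs only the predicate, label, and role/value pairs of each elementary predication, plus the qeq constraints. The following is a minimal sketch of reading them with Python's standard library. The function name read_mrs and the returned structure are assumptions made for illustration, and the embedded string is a shortened excerpt of the example above.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the MRS above: two elementary predications and one qeq.
MRS_XML = """
<mrs>
<var vid='h1'/>
<ep><pred>_interview_v_rel</pred><var vid='h10'/>
<fvpair><rargname>ARG0</rargname><var vid='e2'/></fvpair>
<fvpair><rargname>ARG1</rargname><var vid='x11'/></fvpair>
<fvpair><rargname>ARG2</rargname><var vid='x4'/></fvpair></ep>
<ep><pred>named_rel</pred><var vid='h17'/>
<fvpair><rargname>ARG0</rargname><var vid='x11'/></fvpair>
<fvpair><rargname>CARG</rargname><constant>abrams</constant></fvpair></ep>
<hcons hreln='qeq'><hi><var vid='h3'/></hi><lo><var vid='h10'/></lo></hcons>
</mrs>
"""

def read_mrs(xml_text):
    """Return (top, eps, hcons) extracted from an MRS XML string."""
    root = ET.fromstring(xml_text)
    top = root.find("var").get("vid")
    eps = []
    for ep in root.findall("ep"):
        args = {}
        for fv in ep.findall("fvpair"):
            name = fv.findtext("rargname")
            var = fv.find("var")
            # A role value is either a variable or a constant (e.g. CARG).
            args[name] = var.get("vid") if var is not None else fv.findtext("constant")
        eps.append({
            "pred": ep.findtext("pred"),
            "label": ep.find("var").get("vid"),  # direct child <var> is the label
            "args": args,
        })
    hcons = [
        (hc.get("hreln"), hc.find("hi/var").get("vid"), hc.find("lo/var").get("vid"))
        for hc in root.findall("hcons")
    ]
    return top, eps, hcons

top, eps, hcons = read_mrs(MRS_XML)
for ep in eps:
    print(ep["label"], ep["pred"], ep["args"])
```

An application that does not need scope information, such as some analysis-only uses, could simply drop the hcons list.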

Some issues

Specification/documentation:
  treatment of bare plural, message relations
  defining when such relations are present
  arity and correspondence of arguments for _interview_v_rel etc
'unwanted' predicates such as _by_p_cm_rel (some of these are going/gone: can all be avoided?)
qeqs etc can be ignored for analysis for some applications, not for realisation (currently)
changes to grammars: e.g., message relations?

SEM-I: semantic interface

Formal level: MRS/RMRS syntax and semantics, naming conventions (_lemma_POS[_sense])
Meta-level: variable feature values; manually specified 'grammar' relations
  udef_q_rel (construction)
  named_rel, proper_q_rel ('fixed' lexical relations)
Object-level (e.g., _consultant_n_rel)
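
The naming convention can be checked mechanically. The sketch below assumes one reading of _lemma_POS[_sense]: a leading underscore, a single-letter POS, an optional sense, a trailing _rel, and no underscores inside the lemma or sense. The pattern, the function name classify, and the rule that any name not matching the pattern is treated as a grammar relation are illustrative assumptions, not part of the SEM-I itself.

```python
import re

# Assumed shape of an open-class predicate name: _lemma_POS[_sense]_rel
LEXICAL_PRED = re.compile(
    r"^_(?P<lemma>[^_]+)_(?P<pos>[a-z])(?:_(?P<sense>[^_]+))?_rel$"
)

def classify(pred):
    """Split an open-class predicate into its parts, or mark it as a grammar relation."""
    match = LEXICAL_PRED.match(pred)
    if match:
        return {"level": "object", **match.groupdict()}
    # Assumption: anything else is a manually specified grammar relation.
    return {"level": "meta", "name": pred}

for pred in ["_consultant_n_rel", "_by_p_cm_rel", "udef_q_rel"]:
    print(pred, classify(pred))
```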

SEM-I and grammars

Object-level SEM-Is are auto-generated and distinct for each grammar
Meta-level SEM-Is should be (partially) shared

[Figure: one meta SEM-I and three object SEM-Is]

SEM-I functionality

Offline
  Definition of 'correct' (R)MRS for developers
  Documentation
  Checking of test-suites
Online
  SEM-I plus lexical link used in lexical lookup phase of generation (already)
  rejection of invalid (R)MRSs (input to generator, deep/shallow integration)
  patching up input to generation, fixing up output from parser
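
Rejection of invalid (R)MRSs amounts to checking each elementary predication against the SEM-I entry for its predicate. The following is a minimal sketch under stated assumptions: the table SEMI, its sorts and optionality flags, and the function check_ep are all hypothetical, and a variable's sort is taken from the first letter of its identifier without any sort hierarchy (so an underspecified variable such as u12 would need extra handling).

```python
# Hypothetical SEM-I fragment: predicate -> {role: (variable sort, optional?)}
SEMI = {
    "_interview_v_rel": {"ARG0": ("e", False), "ARG1": ("x", False), "ARG2": ("x", True)},
    "named_rel": {"ARG0": ("x", False), "CARG": ("string", False)},
}

def check_ep(pred, args, semi=SEMI):
    """Return a list of problems; an empty list means the EP is accepted."""
    if pred not in semi:
        return [f"unknown predicate {pred}"]
    spec = semi[pred]
    problems = []
    for role, (sort, optional) in spec.items():
        if role not in args:
            if not optional:
                problems.append(f"{pred}: missing {role}")
        elif sort != "string" and not args[role].startswith(sort):
            problems.append(f"{pred}: {role} should be of sort {sort}, got {args[role]}")
    # Roles that the SEM-I entry does not list are also reported.
    problems += [f"{pred}: unexpected role {role}" for role in args if role not in spec]
    return problems

print(check_ep("_interview_v_rel", {"ARG0": "e2", "ARG1": "x11", "ARG2": "x4"}))
print(check_ep("_interview_v_rel", {"ARG0": "e2"}))
```

The same check could run offline over a test-suite or online on generator input; a "patching up" step would repair the reported problems rather than reject.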

SEM-I: implementation (current and planned)

Database of relations, features, value sorts, optionality:
  Meta-level: plan to generate from grammars, with manual identification of relations (some relations are grammar-internal, see later) and manual documentation
  Object-level: auto-generated from lexical entries in deep grammars (current version is based on generator code; optionality not there yet)
Semantic test suite exemplifying grammar relations (partial for ERG, in progress for other grammars)
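
To illustrate what auto-generating the object-level SEM-I from lexical entries could look like, here is a minimal sketch. The lexicon format, the entry identifiers, and the function build_object_semi are hypothetical; real lexical entries are typed feature structures, and the current implementation is described above as based on generator code.

```python
# Hypothetical, heavily simplified lexical entries: each names the relation
# it introduces and the roles of that relation with their variable sorts.
LEXICON = [
    {"id": "interview_v1", "pred": "_interview_v_rel",
     "roles": {"ARG0": "e", "ARG1": "x", "ARG2": "x"}},
    {"id": "consultant_n1", "pred": "_consultant_n_rel", "roles": {"ARG0": "x"}},
    {"id": "consultant_n2", "pred": "_consultant_n_rel", "roles": {"ARG0": "x"}},
]

def build_object_semi(lexicon):
    """Collect one SEM-I record per relation, rejecting inconsistent entries."""
    semi = {}
    for entry in lexicon:
        known = semi.setdefault(entry["pred"], dict(entry["roles"]))
        if known != entry["roles"]:
            raise ValueError(f"conflicting role frames for {entry['pred']}")
    return semi

for pred, roles in sorted(build_object_semi(LEXICON).items()):
    print(pred, roles)
```

Raising on a conflict is one possible design; recording optionality would instead require merging the frames of entries that share a relation.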

SEM-I development

SEM-I development must be incremental
SEM-I eventually forms the 'API': stable, changes negotiated.
Grammar writers need flexibility to hide things, make changes: SEM-I only constrains the external view
Shared meta-level SEM-I is presumably part of Matrix, but negotiated with consumers
Management needs to be worked out
BUT: automate production of SEM-I from grammars as much as possible
Documentation needs to be automated as much as possible: documentation by example

Interface

External representation: (R)MRS_SEM-I
  public, documented
  reasonably stable
Internal representation
  mapping to feature structures (MRS_FS)
  MRS_SEM-I to MRS_FS mapping needed anyway, but may have to go via MRS_INTERNAL to MRS_FS mapping
  distinctions between relations which are irrelevant for denotation are hidden: only some relations are public
  e.g., 'selected for' relations are internal only
External/Internal inter-conversion
  e.g., internal-only relation automatically converted to supertype in output
BUT: want to minimize the discrepancies
  relation hierarchies in SEM-I consistent with grammar hierarchies
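
The internal-to-external conversion of relations can be pictured as walking up the grammar's relation hierarchy until a public relation is reached. The sketch below assumes a single-parent hierarchy; the relation names, the tables PARENT and PUBLIC, and the function externalize are all hypothetical.

```python
# Hypothetical fragment of a grammar's relation hierarchy: child -> parent.
PARENT = {"foo_selected_rel": "foo_rel", "foo_rel": "basic_rel"}
# Hypothetical set of relations marked as public, i.e. listed in the SEM-I.
PUBLIC = {"foo_rel", "basic_rel"}

def externalize(pred, parent=PARENT, public=PUBLIC):
    """Replace an internal-only relation by its nearest public supertype."""
    current = pred
    while current not in public:
        if current not in parent:
            raise ValueError(f"no public supertype for {pred}")
        current = parent[current]
    return current

print(externalize("foo_selected_rel"))
print(externalize("foo_rel"))
```

The reverse direction (external to internal, as needed for generation) is not covered here, since a public relation may correspond to several internal ones.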

Architecture with indirection

[Figure: External LF (defined by SEM-I) <-> bidirectional mapping <-> Internal LF <-> parser/generator <-> String]

Semi-automated documentation

[Figure: components are [incr tsdb()], Lex DB, grammar, object-level SEM-I, meta-level SEM-I, documentation strings and semantic test-suite, and documentation; links are labelled "autogenerate", "semi-automatic", "auto-generate examples", and "examples, autogenerated on demand"]

Hierarchies

Type hierarchies of relations in grammars are not there to support inference
GLB condition not needed for SEM-I
Proposal: basic SEM-I hierarchy of grammar relations derived automatically from grammar type hierarchy plus marking of relations as in SEM-I. (Possibly augmented in SEM-I++, see later)

[Figure: a grammar hierarchy over type1, type2, type3, type4, type5 and the corresponding SEM-I hierarchy over type1, type2, type4, type5, with type3 omitted]
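
The proposal can be sketched as follows: keep only the types marked as in the SEM-I, and attach each one to its nearest marked ancestors. The type names follow the slide, but the edges between them are invented for this example, and the function names are assumptions.

```python
# Hypothetical grammar hierarchy (type -> list of parents). The edges are
# invented for illustration; only the type names come from the slide.
GRAMMAR_PARENTS = {
    "type1": [],
    "type3": ["type1"],
    "type2": ["type3"],
    "type4": ["type3"],
    "type5": ["type2"],
}
# Types marked as belonging to the SEM-I (type3 is grammar-internal).
IN_SEMI = {"type1", "type2", "type4", "type5"}

def public_ancestors(t, parents, public):
    """Nearest marked ancestors of t, skipping over unmarked types."""
    found = set()
    for p in parents[t]:
        if p in public:
            found.add(p)
        else:
            found |= public_ancestors(p, parents, public)
    return found

def derive_semi_hierarchy(parents, public):
    """Project the grammar hierarchy onto the marked types."""
    return {t: sorted(public_ancestors(t, parents, public))
            for t in parents if t in public}

print(derive_semi_hierarchy(GRAMMAR_PARENTS, IN_SEMI))
```

With multiple inheritance this projection can produce redundant edges (a parent that is also an ancestor of another parent), which a fuller version would prune.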

Proposals

Documentation on wiki, mailing list for SEM-I developers and consumers
MRS code to support particular TFS encoding of MRSs and enforce naming conventions, simplifying basic MRS_FS to MRS mapping and making grammars more consistent
Allow substantive MRS_INTERNAL to MRS_SEM-I mapping (via transfer rule mechanism), but hope to keep this minimal since it hinders deep/shallow integration.
Agreed procedure for adding/changing variable features and values
Inventory of grammar predicates: extensions/changes by grammar developers require notification and documentation

Change protocol (initial proposal)

A developer (grammar developer or software developer) implementing a change which will affect the SEM-I must follow the protocol:
Consultation (meta-SEM-I only). Proposed changes to the meta-SEM-I must be discussed on the mailing list.
Notification. All changes to the SEM-I (meta and object) must be posted on the website.
A script for conversion from new to old version must be posted (unless an incompatible change is agreed by the list members).
Testing. For each grammar, there will be a semantic test suite, with agreed SEM-I output (for a specified reading). All changes to a grammar must be validated against the corresponding test-suite. All software changes must be validated against all test-suites. The conversion script must also be validated.
Commit changes.
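
The testing step can be sketched as a regression check: run each test-suite item, optionally pass the result through the new-to-old conversion script, and compare with the agreed output. Everything in this sketch is hypothetical, including the function validate_change, the representation of an output as a set of predicate names, and the renamed predicate bare_q_rel.

```python
def validate_change(test_suite, analyse, convert_new_to_old=None):
    """Return the sentences whose output no longer matches the agreed output.

    test_suite: list of (sentence, agreed_output) pairs
    analyse: function from a sentence to the output under the proposed change
    convert_new_to_old: optional conversion script applied before comparing
    """
    failures = []
    for sentence, agreed in test_suite:
        produced = analyse(sentence)
        if convert_new_to_old is not None:
            produced = convert_new_to_old(produced)
        if produced != agreed:
            failures.append(sentence)
    return failures

# Hypothetical test-suite item; outputs are reduced to sets of predicate names.
SUITE = [("Consultants were interviewed",
          frozenset({"udef_q_rel", "_consultant_n_rel", "_interview_v_rel"}))]

def new_analyse(sentence):
    # Stand-in for a changed grammar that renames udef_q_rel (invented change).
    return frozenset({"bare_q_rel", "_consultant_n_rel", "_interview_v_rel"})

def rename_back(output):
    # Stand-in for the conversion script from the new to the old version.
    return frozenset("udef_q_rel" if p == "bare_q_rel" else p for p in output)

print(validate_change(SUITE, new_analyse))
print(validate_change(SUITE, new_analyse, rename_back))
```

A non-empty result without the conversion script shows that the change is visible to consumers; an empty result with it shows that the script restores the agreed output.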

Applications and the SEM-I

Application code will be isolated from grammar changes
MT: semantic transfer mapping from one SEM-I to another
IE: mapping from SEM-I to template (often ignoring much of the detail in the original MRS)
QA: matching RMRSs: SEM-I hierarchy used for compatibility tests (also SEM-I++)

SEM-I++ (aka Floyd)

SEM-I++ is not built by grammar developers, depends on SEM-I, not grammars
More semantics, domain-independent, shared between applications
Might include:
  Definitions of grammar relations and closed-class relations to support inference
  Mapping to external resources (e.g., WordNet and FrameNet)
  Enriched hierarchies
  Word classes
    word classes could support a richer encoding of thematic role e.g., experiencer-stimulus psych verbs map ARG1 to EXP and ARG2 to STIM
Plan is to support specification of SEM-I++ in some version of OWL
SEM-I++ information is additional to grammars but DELPH-IN community may agree to support it
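
The word-class idea can be illustrated with a small lookup: a class assigns thematic role labels to argument positions, and predicates are assigned to classes. The table names, the class name, and the membership of _fear_v_rel in that class are assumptions made for the example; the ARG1 to EXP and ARG2 to STIM mapping is the one mentioned above.

```python
# Hypothetical word-class table: class -> mapping from roles to thematic roles.
WORD_CLASS_ROLES = {
    "experiencer_stimulus": {"ARG1": "EXP", "ARG2": "STIM"},
}
# Hypothetical class membership for one psych verb.
PRED_CLASS = {"_fear_v_rel": "experiencer_stimulus"}

def thematic_roles(pred, args):
    """Relabel ARGn roles with thematic roles where a word class is known."""
    mapping = WORD_CLASS_ROLES.get(PRED_CLASS.get(pred), {})
    return {mapping.get(role, role): value for role, value in args.items()}

print(thematic_roles("_fear_v_rel", {"ARG0": "e2", "ARG1": "x4", "ARG2": "x9"}))
print(thematic_roles("_interview_v_rel", {"ARG0": "e2", "ARG1": "x11", "ARG2": "x4"}))
```

Predicates without a known word class pass through unchanged, so this layer stays additional to the grammar output rather than replacing it.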
