S2MT Paper

Conceptual Annotations Preserve Structure Across Translations:
A French-English Case Study
Elior Sulem, Institute of Computer Science, Hebrew University of Jerusalem
Omri Abend, School of Informatics, University of Edinburgh
Ari Rappoport, Institute of Computer Science, Hebrew University of Jerusalem
Abstract

Divergence of syntactic structures between languages constitutes a major challenge in using linguistic structure in Machine Translation (MT) systems. Here, we examine the potential of semantic structures. While semantic annotation is appealing as a source of cross-linguistically stable structures, little has been accomplished in demonstrating this stability through a detailed corpus study. In this paper, we experiment with the UCCA conceptual-cognitive annotation scheme in an English-French case study. First, we show that UCCA can be used to annotate French, through a systematic type-level analysis of the major French grammatical phenomena. Second, we annotate a parallel English-French corpus with UCCA, and quantify the similarity of the structures on both sides. Results show a high degree of stability across translations, supporting the usage of semantic annotations over syntactic ones in structure-aware MT systems.

1 Introduction

Structural information, be it syntactic or semantic, has the potential to address long-standing problems in Statistical Machine Translation (SMT), such as phrase-level (rather than word-level) reordering and discontiguous phrases. Structure-aware models [1] (Chiang, 2005; Liu et al., 2006; Mi et al., 2008) aim to address these and other problems by taking into account the hierarchical structure of language. However, while structure-aware models are effective at improving reordering at the phrase level, they are limited in their ability to map between arbitrarily divergent structures. Cross-linguistic divergences therefore pose a difficult problem for the integration of structural knowledge into statistical models (Dorr, 1994; Ding and Palmer, 2004; Zhang et al., 2008).

[1] We use the term “structure-aware” rather than “syntax-based” so as to include any type of hierarchical structure.

Consequently, an annotation scheme that assigns similar structures to translations has direct applicative value for structure-aware MT systems. Such structures can be used either as features in phrase-based systems, yielding more robust decoding, or as a structural scheme which directs the translation, replacing the PCFG trees often used today. Using more stable schemes is likely to result in simpler MT systems, avoiding structure modifications like pseudo-nodes (Marcu et al., 2006) or tree sequences (Zhang et al., 2008) used in syntax-based systems to handle cross-linguistic divergences.

Semantic annotation is an appealing avenue for constructing cross-linguistically stable structures, since a major goal of translation is to preserve the meaning of a sentence. Cross-linguistically stable schemes have further benefits for applications such as knowledge projection across languages (Kozhevnikov and Titov, 2013), the induction of cross-lingual semantic relations (Lewis and Steedman, 2013), or in translation studies (Lembersky et al., 2013) (see Section 7.3). A recent example of a semantic scheme aiming to be cross-linguistically stable is AMR (Abstract Meaning Representation) (Banarescu et al., 2013), which uses elaborate hierarchical structures in order to abstractly represent semantic information and presents promising preliminary results for SMT improvement (Jones et al., 2012). Nevertheless, the stability of semantic annotation across
translations is seldom addressed and has yet to be adequately supported (see Section 2), a gap we address in this paper using a detailed analysis of a semantically annotated parallel corpus.

Universal Cognitive Conceptual Annotation (UCCA) is a coarse-grained semantic annotation scheme which builds on typological and cognitive linguistic theory (Abend and Rappoport, 2013a; Abend and Rappoport, 2013b). The scheme aims to be applicable cross-linguistically, to abstract away from specific syntactic forms and to directly represent semantic distinctions. These properties make UCCA an appealing source of structural annotation which is cross-linguistically stable. We give an overview of UCCA in Section 3.

This paper focuses on the case study of English-French, a well-studied language pair in MT. We demonstrate through this language pair both UCCA’s portability, namely its ability to be applied to different languages, and its stability, namely its ability to preserve structure across translations. We conduct both type-level and token-level experiments to support our claim.

To verify UCCA’s portability to French, we first conduct a type-level analysis by systematically examining UCCA’s applicability over all major grammatical phenomena in French. We find that UCCA is fully applicable to French, as exemplified in the case of French-specific phenomena like pronominal verbs (Section 4.1). Also at the type level, we apply UCCA to a published inventory of structural divergences, and find that UCCA abstracts away from almost all of them (Section 4.2).

For a token-level analysis, we manually UCCA-annotated a parallel French-English corpus of over 25K tokens, which we make publicly available, and compare the similarity between the UCCA structures in the two languages to the corresponding similarity between syntactic annotations. We find that UCCA is considerably less divergent than syntactic annotation (Section 6). We expect the relative stability of UCCA compared to syntactic schemes to be even greater in language pairs that are more syntactically different than the relatively similar English-French.

Finally, we analyze the semantic correspondence between the annotations on both sides of the parallel corpus (Section 7). We find remarkably high semantic correspondence between the two languages. For instance, over 92% of the Scenes (a similar notion to a “frame”; see Section 3) in both languages have a correspondent in the other. We analyze the non-corresponding units in the two languages according to various parameters, and show that many of them are due to ambiguity or semantic changes. These results offer a better understanding of UCCA’s stability and suggest paths for further improvements.

2 Related Work

We begin by discussing previous work that studied the portability and stability of semantic schemes. We then briefly survey the ways in which semantic information is integrated into MT systems.

Portability of semantic annotation. Several works addressed the portability of semantic annotation schemes, namely whether the same scheme, often originally developed for English, can be applied to other languages.

Burchardt et al. (2009) addressed the application of the English FrameNet (Baker et al., 1998) to German. They found that about a third of the verb senses identified in the German corpus were not covered by FrameNet. Their analysis further revealed that the English category set is not always sufficient, resulting in the introduction of a new category for German. Van der Plas et al. (2010) addressed the application of English PropBank (Palmer et al., 2005) to French, and found that while the scheme can be applied to French, the annotation requires proficiency in both languages. Samardzic et al. (2010a; 2010b) also studied the portability of the English PropBank to French, and found that the overwhelming majority of the French verbal predicates in the corpus correspond to a verb sense in the PropBank lexicon. The portability of PropBank was also examined in the case of English-Chinese through the construction of annotated parallel corpora used in the OntoNotes project (Weischedel et al., 2012).

Portability has also been studied in the context of more elaborate hierarchical structures (Dorr et al., 2010; Banarescu et al., 2013), often with the intention of producing an inter-language – a representation independent of any specific language, which exhaustively accounts for the meaning of the sentence. Dorr et al. (2010) studied portability through the construction of a set of annotated parallel corpora in six languages, as part of the IAMTC project. Portability has also been investigated through the construction of annotated parallel treebanks such as the Prague Czech-English
Dependency Treebank [2], enabling a subsequent valency stability study (Urešová et al., 2015).

[2] https://ufal.mff.cuni.cz/pcedt2.0/

Stability of semantic annotation. Another line of work focused on the stability of specific schemes, i.e., their ability to preserve structure across translations. Fung et al. (2006; 2007) studied the stability of semantic role annotation between arguments in English and Chinese. They found that 83% of the alignable verbal arguments in English have a role-compatible argument in Chinese, but did not address arguments that have no correspondent in the other language. This motivated the use of semantic roles in MT, but also highlighted the existence of divergences between the structures in the two languages.

Semantic role schemes used in MT are generally restricted to verbal predicates, excluding several highly frequent constructions, such as copula clauses and nominalizations, which can result in a loss of stability. Furthermore, the fine-grained information such schemes provide as to the role of the arguments can be difficult to port across languages. For further discussion, see (Abend and Rappoport, 2013b) and (Birch et al., 2013).

Abstract Meaning Representation (AMR) (Banarescu et al., 2013) is a hierarchical semantic representation scheme whose aim is to provide simple, readable semantic annotation that can be applied cross-linguistically and assist MT systems. While UCCA is encoded over the text, AMR provides a structure for each sentence that is not trivially alignable with the text (Flanigan et al., 2014). Xue et al. (2014) studied the scheme’s portability and stability when applied to English-Chinese and English-Czech parallel corpora. They annotated 100 Chinese and Czech sentences translated from English, and examined the similarities and differences of the AMRs across translations. In the English-Czech comparison, 53% of the sentences are reported to be structurally different in a non-local way. They conclude that at this point AMR is not stable enough to be used as an inter-language, but should be used only on either the target or the source side.

Focusing on closer languages, namely English-French, we employ both type-level and token-level approaches for UCCA, including a comparison to syntax and a qualitative analysis of divergences, which are likely to generalize to some extent to other semantic annotations. We report a preliminary study of the stability of AMR in our corpus.

Integrating semantics into MT systems. Widely used in early MT (Uchida, 1987; Nirenburg, 1989), the integration of semantics into SMT systems has received much renewed interest in recent years. The first line of research is the integration of semantic features (often semantic roles) in SMT systems. In phrase-based SMT models, they were mainly utilized for influencing reordering (Wu and Fung, 2009; Xiong et al., 2012; Feng et al., 2012). In syntax-based SMT models, semantic roles were involved in assisting reordering models (Li et al., 2013) and in translation rules (Zhai et al., 2012; Liu and Gildea, 2010; Bazrafshan and Gildea, 2013).

The second line of research concerns the use of an inter-language as an intermediary representation in SMT. Edelman and Solan (2009), relying on the cognitive model Revised Hierarchical Model (RHM), tried to represent the network of constructions that mediates between concepts and the channels of linguistic input and output. Jones et al. (2012) conducted preliminary experiments on a geographical querying domain using AMR.

3 UCCA Annotation

UCCA is a semantic annotation scheme, strongly influenced by typological theory, notably Basic Linguistic Theory (Dixon, 2010a; Dixon, 2010b; Dixon, 2012), and by cognitive linguistic theories (Langacker, 2008). The scheme aims to provide a coarse-grained, cross-linguistically applicable representation by directly reflecting the major semantic phenomena represented in the text and abstracting away from specific syntactic forms. We briefly introduce the UCCA formalism and main categories. For a more elaborate presentation, as well as evidence for the accessibility of UCCA to annotators with no linguistic background, see (Abend and Rappoport, 2013a; Abend and Rappoport, 2013b).

UCCA structures are directed acyclic graphs, where the words in the text correspond to (a subset of) their leaves. The nodes of the graphs, called units, are either terminals or several elements jointly viewed as a single entity according to some semantic or cognitive consideration. The edges bear one or more categories, indicating the role of the sub-unit in the relation that the parent represents.

UCCA is built as a multi-layered scheme, where each layer represents a different set of distinctions. In this work we use the foundational layer of UCCA, which mostly addresses predicate-argument structures and linkage relations between them.

UCCA views the text as a collection of Scenes and relations between them. A Scene, the most basic notion of this layer, describes a movement, an action or a state which is persistent in time. Every Scene contains one main relation, or anchor (similar to a frame-evoking element in FrameNet), and is labeled as a State (S) or a Process (P).

A Scene may contain one or more Participants (A), which are interpreted in a broad sense, and include locations, destinations and complement clauses. Secondary relations in the Scene, such as manner or temporal descriptions, are labeled as Adverbials (D). For example, the sentence “He slowly ran into the park” is annotated as follows: “[He]_A [slowly]_D [ran]_P [into the park]_A”.

The definitions of the UCCA categories are not dependent on POS distinctions. For instance, a Scene’s main relation can be an adjective (“[He]_A [’s thin]_S”) or a noun (“[John ’s]_A [decision]_P”).

4 Type-Level Analysis

In this section we focus on type-level analysis and show both the portability of UCCA, examining the annotation of the French grammatical phenomena with UCCA, and its stability, assessing UCCA’s influence on commonly studied structural divergences.

4.1 Portability

We examine UCCA’s applicability to French by systematically examining the major grammatical phenomena in French, and verifying that UCCA categories can be applied to them. To this purpose, we use the same annotation guidelines and category set previously applied to English, and apply them to the phenomena and examples described in a French grammar book (Hawkins and Towell, 2001). Tense and agreement are not covered in the UCCA foundational layer which we use, and are therefore disregarded in this work.

We find that even for French-specific phenomena, current UCCA categories permit their annotation in the foundational layer without requiring changes in the definitions or additional categories. Due to space limitations, we only present here one case of interest. The full analysis according to the grammar book can be found in Sulem (2014) (Appendix 2) [3].

[3] www.cs.huji.ac.il/~eliors/papers/elior_sulem_thesis.pdf

As an example, we consider reflexive pronouns, representing the applicability of UCCA to French phenomena that have no direct parallel in English. In French, in addition to the counterparts of “himself” and “themselves” (“lui-même” and “eux-mêmes”), reflexivity is also expressed through the pronouns “se”, “me”, “te”, “nous” and “vous”, which precede some verbs (termed “pronominal verbs”). For instance, “lavé” is “washed”, while “s’est lavé” is “washed himself”. We show that UCCA’s category definitions can be applied naturally to this phenomenon.

A key guideline in UCCA is that the annotation of a unit does not depend on its part of speech but rather on its meaning and role in the context it is situated in. We therefore distinguish between three cases based on their semantics.

First, cases where the reflexive pronoun refers to the same Participant as the subject. Here the pronoun is annotated as an A: “[Jean]_A [s’]_A [est lavé]_P” (“Jean washed himself”).

Second, cases where the pronoun changes the meaning of the verb in an unpredictable way, or alternatively, where the verb may only appear in a pronominal form. In these cases the formal means of reflexivity is used, but is not associated with the semantic phenomenon of reflexivity. Semantically then, the reflexive pronoun and the verb form one unanalyzable unit, as in the following example: “Il [s’ est aperçu]_P qu’il était tard” (“He realized that it was late”).

Third, cases where the pronoun changes the meaning and the number of arguments of the verb without creating semantic reflexivity. In these cases the verb is the Center (C) of the Process, while the reflexive pronoun serves as an Elaborator (E). For example: “Je [m’_E appelle_C]_P John” (“my name is John”, where “appelle” means “call”).

4.2 Stability

Overcoming cross-linguistic divergences (or translation divergences) is one of the main challenges in machine translation. We briefly review the main examples of translation divergences presented in (Dorr, 1994; Dorr et al., 2002; Dorr et al., 2004), adapting the original English-Spanish examples to English-French analogues. Then, for each example, we present its annotation according
to UCCA. The resulting annotations show that UCCA abstracts away from almost all of these divergences and exposes the semantic similarity, demonstrating the stability of the scheme at the type level.

Categorical divergence: Translation of words in one language into words that have different POS tags in another language. For example, “to be cold” – “avoir froid” (“to have cold”). In UCCA the expression in both languages is annotated as a State where the Center (similar to the notion of a semantic head) is “cold” / “froid”.

Conflational divergence: Translation of two or more words in one language into one word in another language. For example: “to kick” – “donner un coup de pied” (“give a kick”). In UCCA, the expression describes a Process in the two languages, and the French light verb “donner” (“give”) is a Function (a unit which does not introduce a relation or participant) inside the Process.

Structural [4] divergence: Realization of verb arguments in different syntactic configurations in different languages. For example, “to enter the house” – “entrer dans la maison” (“enter in the house”). In UCCA there is a Participant in both languages.

[4] Here the term “structural” refers specifically to syntax, in contrast to the broader sense used elsewhere in the paper.

Thematic divergence: Realization of verb arguments in syntactic configurations that reflect different thematic to syntactic mapping orders. For example, “I like this house” – “Cette maison me plaît” (“this house pleases to me”). In UCCA there are two Participants in English as well as two Participants in French (“cette maison” / “this house” and “me” / “me”).

Promotional/Demotional divergence: Promotion is the case where a modifier in the source language is promoted to a main verb in the target language (Dorr, 1990; Gola, 2012). Demotion is its mirror image, where a main verb in the source language becomes a modifier in the target language.

An example where an English adverb is promoted to a main verb in French is “John usually goes home” – “John a l’habitude de rentrer à la maison” (“John has the habit to go home”). In UCCA, both “usually” and “a l’habitude” (“has the habit”) are annotated as Adverbials.

An example where an English verb is demoted to an adverb in French is “to run in” – “entrer en courant” (“enter running”). In UCCA, the English example contains a Process (“to run”) and a Participant (“in”). The annotation in French is somewhat different, where “entrer” (“enter”) is a Process, while “en courant” (“running”) is an Adverbial.

To summarize, aside from the case of demotional divergence, the UCCA annotation (in its foundational layer) abstracts away from canonical examples of cross-linguistic divergences. With demotional divergence, where UCCA annotation is different across languages, we note that the divergence does correspond to a semantic difference of emphasis, that is, whether the entering action or the running action is the main relation. We leave it open whether this divergence should be considered a result of a true semantic difference between the languages or a shortcoming of UCCA that fails to capture the similarity between them.

5 Parallel French-English UCCA Corpus

The parallel corpus. The French-English corpus used here is an extract from the book Twenty Thousand Leagues Under the Sea (Vingt Mille Lieues Sous les Mers), a classic science fiction novel written in French by Jules Verne (1828–1905) and first published in 1870. We use an online version of the book and the English translation by J.P. Walter (Verne, 1870; Verne, 1991). Each of the two monolingual parts of the corpus contains 583 sentences, which correspond to 12.5K tokens in English and 13.1K tokens in French. The annotated corpus is publicly available [5].

[5] www.cs.huji.ac.il/~eliors

Initial alignment. We segment the parallel corpus into 154 bilingual pairs of aligned passages. Each passage in French corresponds to a single passage in English. The passages correspond to the paragraphs in the original texts except in a few cases of long dialogues, where we split the paragraphs into several passages. A sentence-level alignment is not necessary in our analysis since in UCCA, the text is viewed as a collection of Scenes, where sentence boundaries play no significant role.

Manual annotation. The annotation was carried out using UCCA’s web application. Both French and English texts were annotated by the same annotator (one of the authors of the paper), according to UCCA annotation guidelines [6]. Recent updates to the guidelines concerning the annotation of secondary verbs as Adverbials are not applied here. We expect these changes to further improve the quality of the results (Section 7.3). The annotation in English and French was carried out separately in each of the languages, rather than in parallel, thus permitting cases where the same linguistic form in English and French is subject to different interpretations, leading to different annotations. This effect on the differences in UCCA annotation in English and French is discussed in Section 7.

[6] Both the web application and the guidelines are available at homepages.inf.ed.ac.uk/oabend/ucca/.

6 Token-level Analysis

In order to demonstrate UCCA’s stability at the token level, we examine the number of UCCA units of various types in both English and French for each parallel passage in our annotated parallel corpus. We compare these numbers to those obtained through syntactic annotation. In light of our type-level analysis (Section 4), we expect these UCCA categories to be more stable cross-linguistically than syntactic ones. The number of Scenes is compared to the number of non-auxiliary verbs, and the number of Participants and Adverbials is compared to the number of noun phrases (NPs), prepositional phrases (PPs) and adverb phrases (ADVPs).

We note that English-French is a particularly challenging candidate for this type of analysis since the language pair is relatively structurally similar (e.g., measured by word reordering (Birch, 2011)). Syntactic annotation is therefore a strong baseline. We expect UCCA’s relative stability to be even greater in more syntactically divergent language pairs.

We are mainly interested not in the absolute number of units/constituents of a certain type, but more in the extent to which this number diverges between languages. Minimal divergence in the number of units/constituents of a certain type between the two languages is an indication of the scheme’s stability.

We compute the similarity in the number of units/constituents of each type in the two languages in the following manner. For each language $l \in \{Fre, Eng\}$ and for each unit/constituent type $t$, we compute the number of instances of that type, $n_i^{(t,l)}$, in each passage $i = 1, \ldots, N$. We thereby obtain for each $(t, l)$ a vector $n^{(t,l)} = \{n_i^{(t,l)}\}_i$. For each type $t$, the similarity between $n^{(t,Fre)}$ and $n^{(t,Eng)}$, which is an indication of the stability of the scheme, is computed using the $l_1$ and $l_2$ norms of the difference between them.

We further compute an F-score as follows: precision and recall of the French vector against the English one are defined respectively by $P = s/f$ and $R = s/e$, where $s = \sum_i \min(n_i^{(t,Fre)}, n_i^{(t,Eng)})$, $f = \sum_i n_i^{(t,Fre)}$ and $e = \sum_i n_i^{(t,Eng)}$. The F-score $F$ is the harmonic mean of $P$ and $R$. This measure provides an upper bound on the number of aligned units in the two languages, looking at the category of the units and their appearance in aligned passages. We note that the measures described are more applicable in this context than statistical correlation measures (e.g., the Pearson correlation coefficient). This is because a stable scheme is determined by the similarity of the count vectors in absolute terms, rather than their statistical correlation.

Experimental setup. For tagging, we use the Stanford POS tagger package (Toutanova et al., 2003). We compute the number of verbs in the parallel corpus and compare them to the number of Scenes. We exclude auxiliaries since such verbs tend to differ considerably between languages. We manually correct the tagging (by a single annotator, highly proficient in both languages), and therefore expect these numbers to be comparable in quality to a gold standard [7].

[7] The French tagger overestimated the number of verbs by 0.6%, while the English tagger overestimated it by 8.7%.

The syntactic constituents we study are noun phrases (NP), prepositional phrases (PP) and adverb phrases (ADVP in English and AdP in French). We used the Stanford parser’s pre-trained models for English (englishPCFG (Klein and Manning, 2003)) and French (frenchFactored (Green et al., 2011)), with the same manual tokenization taken from the UCCA annotation. Six passages which contain very long sentences in French and for which the parser was unable to produce a parse were omitted from this evaluation. We note that we include in our analysis Scenes marked as unanalyzable (for example, “Hello!”), but exclude Scenes appearing as remote Participants, so as to avoid double counting.

In order to correct for possible biases of the parsers towards overprediction or underprediction of certain syntactic constituents, we conduct the following experiment. We manually count the
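As an illustration, the count-vector measures defined in this section can be sketched in Python. This is a minimal sketch under stated assumptions, not the code used to produce the reported results: the function name `stability_measures`, the dictionary keys it returns and the toy per-passage counts are all introduced here for illustration only.

```python
import math


def stability_measures(fre_counts, eng_counts):
    """Compare per-passage counts of one unit/constituent type.

    fre_counts[i] and eng_counts[i] are the numbers of instances of the
    type in the French and English sides of aligned passage i.
    (Hypothetical helper; names are assumptions, not from the paper.)
    """
    if len(fre_counts) != len(eng_counts):
        raise ValueError("count vectors must cover the same passages")

    # l1 and l2 norms of the difference between the two count vectors.
    diffs = [f - e for f, e in zip(fre_counts, eng_counts)]
    l1 = sum(abs(d) for d in diffs)
    l2 = math.sqrt(sum(d * d for d in diffs))

    # s: per-passage minimum, summed; f and e: total counts per language.
    s = sum(min(f, e) for f, e in zip(fre_counts, eng_counts))
    f_total = sum(fre_counts)
    e_total = sum(eng_counts)

    # Precision and recall of the French vector against the English one.
    precision = s / f_total if f_total else 0.0
    recall = s / e_total if e_total else 0.0

    # F is the harmonic mean of precision and recall.
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0

    return {"l1": l1, "l2": l2, "P": precision, "R": recall, "F": f_score}


# Toy counts for four aligned passages (made-up numbers).
fre_scenes = [3, 5, 2, 4]
eng_scenes = [3, 4, 2, 6]
result = stability_measures(fre_scenes, eng_scenes)
print(result)
```

Since $P = s/f$ and $R = s/e$, the harmonic mean reduces to $F = 2s/(f + e)$, so $F$ reaches 1 only when the two count vectors are identical, which is why it reflects similarity in absolute terms rather than correlation.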
l1 l2 F Fr. Avg. En. Avg.
Scenes 124 14.97 0.96 9.25 9.49
ber of English PPs was obtained through manual
Verbs 157 18.79 0.94 9.30 9.10 and automatic counting. In these passages the
Participants (As) 273 31.13 0.95 17.68 18.27 French parser overpredicted NPs by 0.9% and PPs
NPs and PPs 952 102.74 0.89 26.64 32.33
NPs 847 88.89 0.87 18.78 24.20 by 11.4%. The average difference between the
PPs 299 32.05 0.87 7.86 8.13
Adverbials (Ds) 133 17.18 0.86 3.3 3.07
results of the manual and automatic counting of
Adverb Phrases 342 40.0 0.15 0.24 2.49 French adverb phrases was 0.5. The biases are in
As + Ds 334 37.18 0.95 20.99 21.34 an order of magnitude less than the relative differ-
NPs+PPs+ADVPs 1226 127.40 0.87 26.88 34.82
ences in the l1 and l2 norms. Therefore, the stabil-
Table 1: Comparison of UCCA’s Scene, Participant and ity of UCCA relative to syntactic schemes is not a
Adverbial stability across the two languages with the stabil- result of the parsers’ biases.
ity of verbs, NPs, PPs and ADVPs. l1 and l2 represent re-
spectively the l1 and l2 norms of the difference between the 7 Divergence Analysis and Discussion
French and English count vectors. The F-score F , resulting The analysis in Section 6 provides a comparison
from an upper bound on the number of aligned units in the
two languages, evaluates the similarity between these vectors. in terms of the number of units of specific types,
The Scenes and the verbs are computed over the whole cor- as opposed to corresponding numbers of syntac-
pus (154 passages), while the other categories are computed tic constituents. In this section we define a more
on 148 passages (see text).
refined methodology (Section 7.1) for examining
not only the correspondence in the number of units
number of NPs, PPs and ADVPs in the first 10 pas- between the languages, but also the semantic cor-
sages in English and French, according to the orig- respondence between units (Section 7.2 and 7.3).
inal guidelines of the English and French Tree- 7.1 Defining Divergences using UCCA
banks (Bies et al., 1995; Abeillé et al., 2004). All
borderline cases are counted pessimistically, i.e., We define a correspondence between two UCCA
in the direction that maximizes the difference be- annotations to be a one-to-one mapping which
tween the manual and automatic counts. preserves UCCA’s categories and meaning. Con-
cretely, given a parallel corpus, a unit in one lan-
Results. Our results are given in Table 1. In guage corresponds to a unit in the other language if
all cases the UCCA annotation is more stable they have the same category and if the units have
across annotations than the syntactic counterpart. the same meaning. More formally, we define a
The relative similarity between the number of sufficient subset of a unit u to be a subset of e that
PPs in the two languages, as reflected in the relatively low vector distances of n(PP,Eng) and
n(PP,Fre), can be explained by the fact that the presence of a preposition in French usually
requires a preposition in its English translation. PPs are also less affected than NPs by
nominalizations which often result in cross-linguistic syntactic divergences⁸. Table 1 also
presents the average number of units/constituents of each type per passage, on the two right
columns. The latter numbers cannot be seen as a measure of stability, as an excessive number of
units in one passage (relative to the translation) may cancel out a deficient number of units
in another.

Concerning the correction term for the parsers’ biases, we find that in the first 10 passages,
the English parser overpredicted NPs by 12.2% and underpredicted ADVPs by 3.8%. The same num-

⁸ The low number of French adverb phrases is partially due to the presence of some adverbial
expressions that were tagged as multi-word adverbs (MWADV). If we consider MWADV as adverb
phrases as well, the l1 value is 292 and the l2 value is 33.05, which is still much higher than
the distances for UCCA’s Adverbials (133 for l1 and 17.18 for l2).

contains its heads (the main relation in the case of a Scene, or the Centers in the case of a
non-Scene). For example, “He ran” is a sufficient subset of the Scene “He slowly ran” since it
contains the main relation “ran”. A unit e in English and a unit f in French correspond to each
other if they have the same category and any of the three following conditions hold: (1) e is a
translation of f, (2) a sufficient subset of e is a translation of f, or (3) a sufficient
subset of f is a translation of e. For example, the English Scene “He slowly ran” corresponds
to the French Scene “Il a couru” (“He ran”) since condition (2) holds.

Given a UCCA category, some of the units of that category are left unaligned between the two
sides of the parallel corpus, creating a UCCA divergence. We classify UCCA divergences
according to their category, defining Scene, Participant and Adverbial divergences. We
distinguish between divergences in the English and French sides.

An example of a UCCA divergence from our French-English corpus is: “of the ship victimized by
this new ramming” – “du navire victime de ce
                                    Scene Div.       Participant Div.   Adverbial Div.
#  Property                         Eng.     Fre.    Eng.     Fre.      Eng.     Fre.
Translation Study
1  Similar Translation Possible     65.18    58.33   50       35.29     70.83    50.0
2  Similar Source Possible          73.21    63.89   54.55    47.06     75.0     46.15
–  None                             18.75    31.94   38.64    47.06     16.67    42.31
Annotation Study
3  Conforming Analysis              41.96    54.16   72.72    73.53     25.0     53.85
4  Different Interpretation         10.71    1.39    25       23.53     8.33     7.69
–  None                             55.36    44.44   25       20.59     70.83    46.15
Semantic Effect of the Unaligned Unit
5  Additional Information           38.39    18.06   25.0     20.59     37.50    0.0
6  Tense Information                8.04     5.56    –        –         –        –
7  Emphasis                         19.64    8.33    31.82    26.47     41.67    3.85
–  None                             50.89    80.56   61.36    64.71     58.33    96.15

Table 2: Percentage of UCCA divergences according to their types (columns) that have certain
properties (rows). All numbers are percentages computed over all UCCA divergences of the given
type. Note that the properties are not mutually exclusive (see text). Participant and Adverbial
divergences are only evaluated on passages with no Scene divergences.
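
The divergences counted in Table 2 are units left without a correspondent under the correspondence definition given in the text. As a rough illustration of that definition only, the following Python sketch encodes the three conditions. The names `Unit`, `sufficient_subsets`, `corresponds` and `unaligned_english`, the token-level treatment of subsets, and the toy translation oracle are all assumptions of this sketch; the analysis reported here was carried out manually, and the translation judgment is delegated to a caller-supplied predicate.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Iterator, List, Tuple

Tokens = Tuple[str, ...]


@dataclass(frozen=True)
class Unit:
    category: str   # e.g. "Scene", "Participant", "Adverbial"
    tokens: Tokens  # surface tokens of the unit
    heads: Tokens   # main relation of a Scene, or Centers of a non-Scene


def sufficient_subsets(unit: Unit) -> Iterator[Tokens]:
    """Yield proper, order-preserving token subsets that keep every head."""
    n = len(unit.tokens)
    for size in range(1, n):
        for idx in combinations(range(n), size):
            subset = tuple(unit.tokens[i] for i in idx)
            if all(h in subset for h in unit.heads):
                yield subset


def corresponds(e: Unit, f: Unit,
                is_translation: Callable[[Tokens, Tokens], bool]) -> bool:
    """Same category, plus any of conditions (1)-(3) from the text."""
    if e.category != f.category:
        return False
    if is_translation(e.tokens, f.tokens):                        # condition (1)
        return True
    if any(is_translation(s, f.tokens)
           for s in sufficient_subsets(e)):                       # condition (2)
        return True
    return any(is_translation(e.tokens, s)
               for s in sufficient_subsets(f))                    # condition (3)


def unaligned_english(english: List[Unit], french: List[Unit],
                      is_translation) -> List[Unit]:
    """English units with no French correspondent (English-side divergences)."""
    return [e for e in english
            if not any(corresponds(e, f, is_translation) for f in french)]


# Toy oracle (hypothetical): a hand-written set of known translation pairs.
known_pairs = {(("He", "ran"), ("Il", "a", "couru"))}


def oracle(e_tokens: Tokens, f_tokens: Tokens) -> bool:
    return (e_tokens, f_tokens) in known_pairs


english = [Unit("Scene", ("He", "slowly", "ran"), ("ran",)),
           Unit("Adverbial", ("slowly",), ("slowly",))]
french = [Unit("Scene", ("Il", "a", "couru"), ("a", "couru"))]
divergent = unaligned_english(english, french, oracle)
```

On the running example, the two Scenes correspond through condition (2), while the English Adverbial “slowly” has no French counterpart and is therefore returned as an English Adverbial divergence. French-side divergences would be obtained symmetrically.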
nouvel abordage”. The French noun “victime” describes a result, while the corresponding English
“victimized” is an action. The unaligned Scene is in English. It is therefore an English Scene
divergence. In the example “He slowly ran”/“Il a couru” we saw above, there are no Scene
divergences but the English Adverbial “slowly” is unaligned, creating an English Adverbial
divergence.

7.2 Number of UCCA Divergences

The analysis of Scene divergences is performed manually over the entire set of passages. The
analysis of Participant and Adverbial divergences is restricted to passages with no Scene
divergences, i.e., with a perfect Scene to Scene correspondence (57 passages of the total 154).
This permits the capture of lower level divergences which are not just consequences of the
divergences at the Scene level.

We found a total of 112 English Scene divergences and 72 French ones. This amounted to 92.3% of
the English Scenes having a French correspondent and 94.9% of the French Scenes having an
English correspondent. Only 25% of the sentences (148 out of 583) contain any Scene
divergences.

Concerning Participant divergences, we found that 694 out of 738 English Participants (94.0%)
have a correspondent in French. 694 of the 728 French Participants (95.3%) have a correspondent
in English. 100 out of the 124 English Adverbials (80.6%) have a correspondent in French, and
100 out of the 126 French Adverbials (79.4%) have a correspondent in English. Thus, our results
show low rates of UCCA French-English divergences.

We also conduct a preliminary study into the applicability of another semantic scheme, namely
AMR, to our domain. We annotate 10 sentence pairs with AMR. Our analysis shows that AMR
conserves the main structures in most sentences (7 out of 10), and suggests that other semantic
annotations may also be structurally stable. However, semantic roles, used in PropBank and AMR,
are often a source of divergences across the languages.

7.3 Properties of UCCA Divergences

In order to examine the causes and semantic types of the different divergences, we manually
classified each of them according to three groups of properties, which are not mutually
exclusive. The results of the divergence analysis are presented in Table 2.

Translation study: The properties in this group investigate whether a given UCCA divergence can
be avoided using a different formulation closer to the one used in the other language. This
approach evaluates the translator’s choices and creativity. Properties #1 and #2 check whether
different formulations can be used in the source and target side respectively, that would avoid
the UCCA divergence. Results show that many of the divergences can indeed be ascribed to the
specific translation selected. For example, less than a third of the Scene divergences in each
language could not have been avoided through a different translation. We thus speculate that in
a more technical and less literary corpus, the number of UCCA divergences will be lower.

Annotation study: These properties study the influence of the annotator’s preferences. Property
#3 (conforming analysis) covers cases where UCCA allows another analysis which would have
avoided the divergence. While both annotations are permitted, one of them is sometimes
preferred, to capture a nuance of meaning conveyed by one language but not the other. Property
#4 refers to
                             Scene Div.       Participant Div.    Adverbial Div.
Replaced by                  Eng.     Fre.    Eng.      Fre.      Eng.      Fre.
Linker                       6.25     1.39    –         –         8.33      7.69
Ground                       1.79     1.39    –         –         4.17      3.85
Elaborator of Participant    –        –       0         2.94      4.17      19.23
Main Relation                –        –       20.45∗    20.59∗    25.0∗     26.92∗
Parallel Scene               –        –       13.64     2.94      –         –
Participant                  –        –       –         –         4.17      11.54
Adverbial                    –        –       6.82      2.94      –         –
2 Participants               –        –       11.36     2.94      –         –
2 Adverbials                 –        –       –         –         4.17      0.0
None                         91.96    98.21   47.73     67.65     50.0      30.77

Table 3: Analysis of divergences in terms of replacements by other UCCA categories. Columns
correspond to divergence types, while rows correspond to the category, as defined in Abend and
Rappoport (2013b), of the replacing unit. All numbers are given in percents. Percentage is taken
over all UCCA divergences of the same type. ∗: In these cases, a Participant or an Adverbial in
one of the languages is included in the meaning of the main relation (Process or State) in the
other language.
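
Since every cell in Tables 2 and 3 is a percentage over the divergences of one type, approximate absolute counts can be recovered from the totals reported in Section 7.2. The following Python sketch illustrates the arithmetic. The helper names `rate` and `count` are ours, and the Participant and Adverbial totals are obtained by subtraction (e.g., 738 − 694), which assumes that each unit without a correspondent counts as exactly one divergence.

```python
def rate(aligned: int, total: int) -> float:
    """Percentage of units that have a correspondent, to one decimal place."""
    return round(100 * aligned / total, 1)


def count(pct: float, total: int) -> int:
    """Approximate number of divergences behind a percentage cell."""
    return round(pct * total / 100)


# Number of divergences per (type, side). The Scene totals are stated in
# Section 7.2; the others are derived as total units minus aligned units.
divergences = {
    ("Scene", "Eng"): 112,
    ("Scene", "Fre"): 72,
    ("Participant", "Eng"): 738 - 694,
    ("Participant", "Fre"): 728 - 694,
    ("Adverbial", "Eng"): 124 - 100,
    ("Adverbial", "Fre"): 126 - 100,
}

# Correspondence rates quoted in Section 7.2.
participant_eng_rate = rate(694, 738)
adverbial_fre_rate = rate(100, 126)

# Table 2, row 1, English Scene column: 65.18% of 112 divergences.
similar_translation_scene_eng = count(65.18, divergences[("Scene", "Eng")])

# Table 3, "Main Relation" row, English Participant column: 20.45% of 44.
main_relation_participant_eng = count(20.45, divergences[("Participant", "Eng")])
```

Under these assumptions, the percentage cells map back to whole numbers of divergences; for the smaller categories a single divergence accounts for several percentage points, so small differences between columns should be read with caution.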
divergences resulting from different readings (ambiguity) allowed by the text, where one meaning
was selected in one language and another in the other. The results for this group (properties
#3 and #4) reveal that most of the Scene and Adverbial divergences could have been avoided had
a different annotation been selected. This suggests that more restrictive annotation guidelines
or some post-annotation normalization can substantially reduce the number of divergences.

Effect of the unaligned unit: Divergences are often a result of a semantic or pragmatic
difference between the source text and its translation. Property #5 addresses cases where
additional information is conveyed by the unaligned unit. Property #6 is a sub-case of #5 that
specifically addresses tense information. Property #7 addresses cases where the unaligned unit
emphasizes some aspect of meaning. The results show that many divergences can be ascribed to a
true semantic difference between the source and the translation.

Finally, in some cases, the UCCA divergences simply replace one UCCA category with another
(Table 3). In these cases there are unaligned units in the English and the French sides that
roughly correspond to one another semantically, but have different UCCA categories. Cases of
replacement are common with Participant and Adverbial divergences, but fairly rare in the case
of Scene divergences. In the case of Adverbial divergences, many of them result from including
the meaning of an Adverbial in one language in the meaning of the main relation (Process or
State) in the other language. This can be seen as a generalization of demotional/promotional
divergences (Dorr, 1994) discussed in Section 4.2. Annotating secondary verbs (e.g., “begin” or
“try”) as Adverbials instead of being part of the main relation, as was done in the latest
version of UCCA’s guidelines, may considerably reduce this kind of divergence.

To summarize, our study sheds light on the circumstances in which UCCA divergences arise and
suggests how many divergences can be avoided. This study also contributes to the understanding
of the differences between original and translated texts, which can improve MT (Lembersky et
al., 2013).

8 Conclusion

We showed that basic semantic structures can be stably preserved across English-French
translations. This means that semantic structures may be more suitable for SMT systems than
syntactic ones, which exhibit well-known divergence phenomena. We used the UCCA scheme, but we
expect these advantages to generalize to other structured semantic schemes. Future work will
address the integration of UCCA into structure-based SMT either by adding UCCA as features to
phrase-based and syntax-based systems, or by replacing existing syntactic structures with UCCA
structures. We also plan to investigate related tasks that would benefit from UCCA’s stability
like bilingual alignment and MT evaluation.

Acknowledgments

We would like to thank Roy Schwartz for helpful comments. This research was supported by the
Language, Logic and Cognition Center (LLCC) at the Hebrew University of Jerusalem (for the
first author) and by the ERC Advanced Fellowship 249520 GRAMPLUS (for the second author).

References

Anne Abeillé, François Toussenel, and Martine Chéradame. 2004. Corpus le Monde Annotation en
constituents Guide pour les correcteurs. http://www.llf.cnrs.fr/Gens/Abeille/guide-annot.pdf.

Omri Abend and Ari Rappoport. 2013a. UCCA: A semantic-based grammatical annotation scheme. In
Proc. of IWCS-13, pages 1–12.

Omri Abend and Ari Rappoport. 2013b. Universal Conceptual Cognitive Annotation (UCCA). In Proc.
of ACL-13, pages 228–238.

Collin F. Baker, Charles J. Fillmore, and John B. Lowe. 1998. The Berkeley Framenet project. In
Proc. of ACL-COLING-98, pages 86–90.

Laura Banarescu, Claire Bonial, Shu Cai, Madalina Georgescu, Kira Griffitt, Ulf Hermjakob, Kevin
Knight, Philipp Koehn, Martha Palmer, and Nathan Schneider. 2013. Abstract Meaning
Representation for sembanking. In Proc. of Linguistic Annotation Workshop and Interoperability
with Discourse, pages 178–186.

Marzieh Bazrafshan and Daniel Gildea. 2013. Semantic roles for string to tree machine
translation. In Proc. of ACL-13 (Short Paper), pages 419–423.

Ann Bies, Mark Ferguson, Karen Katz, and Robert MacIntyre. 1995. Bracketting Guidelines for
Treebank II Style Penn Treebank Project. Linguistic Data Consortium.

Alexandra Birch, Barry Haddow, Ulrich Germann, Maria Nadejde, Christian Buck, and Philipp Koehn.
2013. The feasibility of HMEANT as a human MT evaluation metric. In Proc. of the 8th Workshop on
SMT, ACL-13, pages 52–61.

Alexandra Birch. 2011. Reordering metrics for statistical machine translation. Ph.D. thesis,
University of Edinburgh.

Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó, and Manfred
Pinkal. 2009. Using FrameNet for the semantic analysis of German: Annotation, representation
and automation. In Hans C. Boas, editor, Multilingual FrameNets in Computational Lexicography -
Methods and Applications, pages 209–244, New York/Berlin. Mouton de Gruyter.

David Chiang. 2005. A hierarchical phrase-based model for statistical machine translation. In
Proc. of ACL-05, pages 263–270.

Yuan Ding and Martha Palmer. 2004. Synchronous dependency insertion grammars: a grammar
formalism for syntax based statistical MT. In Workshop on Recent Advances in Dependency
Grammars, COLING-04.

Robert M.W. Dixon. 2010a. Basic Linguistic Theory: Grammatical Topics, volume 2. Oxford
University Press.

Robert M.W. Dixon. 2010b. Basic Linguistic Theory: Methodology, volume 1. Oxford University
Press.

Robert M.W. Dixon. 2012. Basic Linguistic Theory: Further Grammatical Topics, volume 3. Oxford
University Press.

Bonnie J. Dorr, Lisa Pearl, Rebecca Hwa, and Nizar Habash. 2002. DUSTer: a method for unraveling
cross-language divergences for statistical word-level alignment. In Proc. of AMTA-02, pages
31–43.

Bonnie J. Dorr, Eduard H. Hovy, and Lori S. Levin. 2004. Machine translation: interlingual
methods. In Encyclopedia of language and linguistics, 2nd edition. ms.939, Brown, Keith.

Bonnie Dorr, Rebecca Passonneau, David Farwell, Rebecca Green, Nizar Habash, Stephen Helmreich,
Edward Hovy, Lori Levin, Keith Miller, Teruko Mitamura, Owen Rambow, and Advaith Siddharthan.
2010. Interlingual annotation of parallel text corpora: A new framework for annotation and
evaluation. Natural Language Engineering, pages 197–243.

Bonnie Dorr. 1990. Solving thematic divergences in machine translation. In Proc. of ACL-90,
pages 127–134.

Bonnie J. Dorr. 1994. Machine translation divergences: a formal description and proposed
solution. Computational linguistics, 20(4):597–635.

Shimon Edelman and Zach Solan. 2009. Machine translation using automatically inferred
construction-based correspondence and language models. In Proc. of PACLIC-09, pages 654–661.

Minwei Feng, Weiwei Sun, and Hermann Ney. 2012. Semantic cohesion model for phrase-based SMT. In
Proc. of COLING-12, pages 867–878.

Jeffrey Flanigan, Sam Thomson, Jaime Carbonell, Chris Dyer, and Noah A. Smith. 2014. A
discriminative graph-based parser for the Abstract Meaning Representation. In Proc. of ACL-14,
pages 1426–1436.

Pascale Fung, Zhaojun Wu, Yongsheng Yang, and Dekai Wu. 2006. Learning of Chinese/English
semantic structure mapping. In Workshop on Spoken Language Technology, IEEE/ACL-06, pages
230–233.

Pascale Fung, Zhaojun Wu, Yongsheng Yang, and Dekai Wu. 2007. Learning bilingual semantic
frames: shallow semantic parsing vs. semantic role projection. In Proc. of the 11th Conference
on Theoretical and Methodological Issues in Machine Translation (TMI 2007), pages 75–84.

Francesca Gola. 2012. An analysis of translation divergence patterns using PanLex translation
pairs. Master’s thesis, University of Washington.

Spence Green, Marie-Catherine de Marneffe, John Bauer, and Christopher D. Manning. 2011.
Multiword expression identification with Tree Substitution Grammars: a parsing tour de force
with French. In Proc. of EMNLP-11, pages 725–735.
Roger Hawkins and Richard Towell. 2001. French Grammar and Usage. McGraw-Hill, 2nd edition.

Bevan Jones, Jacob Andreas, Daniel Bauer, Karl Moritz Hermann, and Kevin Knight. 2012.
Semantics-based machine translation with hyperedge replacement grammars. In Proc. of COLING-12,
pages 1359–1376.

Dan Klein and Christopher D. Manning. 2003. Accurate unlexicalized parsing. In Proc. of ACL-03,
pages 423–430.

Mikhail Kozhevnikov and Ivan Titov. 2013. Cross-lingual transfer of semantic role labeling
models. In Proc. of ACL-13, pages 1190–1200.

Ronald W. Langacker. 2008. Cognitive Grammar: A Basic Introduction. Oxford University Press,
USA.

Gennadi Lembersky, Noam Ordan, and Shuly Wintner. 2013. Improving statistical machine
translation by adapting translation models to translationese. Computational Linguistics,
39(4):999–1023.

Mike Lewis and Mark Steedman. 2013. Unsupervised induction of cross-lingual semantic relations.
In Proc. of EMNLP-13, pages 681–692.

Junhui Li, Philip Resnik, and Hal Daumé III. 2013. Modeling syntactic and semantic structures in
hierarchical phrase-based translation. In Proc. of NAACL-HLT-13, pages 540–549.

Ding Liu and Daniel Gildea. 2010. Semantic role features for machine translation. In Proc. of
COLING-10, pages 716–724.

Yang Liu, Qun Liu, and Shouxun Lin. 2006. Tree-to-string alignment template for statistical
machine translation. In Proc. of COLING-ACL-06, pages 609–616.

Daniel Marcu, Wei Wang, Abdessamad Echihabi, and Kevin Knight. 2006. SPMT: Statistical machine
translation with syntactified target language phrases. In Proc. of EMNLP-06, pages 44–52.

Haitao Mi, Liang Huang, and Qun Liu. 2008. Forest-based translation. In Proc. of ACL-08 HLT,
pages 192–199.

Sergei Nirenburg. 1989. New developments in knowledge-based machine translation. In Alatis J.E.,
editor, Georgetown University Round Table on Languages and Linguistics 1989: Language teaching,
testing, and technology: lessons from the past with a view toward the future, pages 344–357.
Georgetown University Press.

Martha Palmer, Daniel Gildea, and Paul Kingsbury. 2005. The proposition bank: An annotated
corpus of semantic roles. Computational Linguistics, 31(1):149–159.

Tanja Samardžić, Lonneke van der Plas, Goljihan Kashaeva, and Paola Merlo. 2010a. The scope and
the sources of variation in verbal predicates in English and French. In Proc. of the 9th
International Workshop on Treebanks and Linguistic Theories, pages 199–211.

Tanja Samardžić, Lonneke van der Plas, Goljihan Kashaeva, and Paola Merlo. 2010b. Variation in
verbal predicates in English and French. Generative Grammar in Geneva, 6:109–135.

Elior Sulem. 2014. Integration of a cognitive annotation into machine translation: Theoretical
foundations and bilingual corpus analysis. Master’s thesis, Hebrew University of Jerusalem.

Kristina Toutanova, Dan Klein, Christopher Manning, and Yoram Singer. 2003. Feature-rich
part-of-speech tagging with a cyclic dependency network. In Proc. of HLT-NAACL-03, pages
252–259.

Hiroshi Uchida. 1987. ATLAS: Fujitsu machine translation system. In Machine Translation Summit,
Japan.

Zdeňka Urešová, Ondřej Dušek, Eva Fučíková, Jan Hajič, and Jana Šindlerová. 2015. Bilingual
English-Czech valency lexicon linked to a parallel corpus. In Proc. of the 9th Linguistic
Annotation Workshop (The LAW IX), pages 124–128.

Lonneke van der Plas, Tanja Samardžić, and Paola Merlo. 2010. Cross-lingual validity of PropBank
in the manual annotation of French. In Proc. of the 4th Linguistic Annotation Workshop (The LAW
IV), pages 113–117.

Jules Verne. 1870. Vingt Mille Lieues Sous les Mers. J. Hetzel.
http://fr.wikisource.org/wiki/Vingt_mille_lieues_sous_les_mers.

Jules Verne. 1991. Twenty Thousands Leagues Under the Sea. Translated from the original French
by J.P. Walter. http://jv.gilead.org.il/fpwalter.

Ralph Weischedel, Sameer Pradhan, Lance Ramshaw, Jeff Kaufman, Michelle Franchini, Mohammed
El-Bachouti, Nianwen Xue, Martha Palmer, Jena D. Hwang, Claire Bonial, Jinho Choi, Aous
Mansouri, Maha Foster, Abdel aati Hawwary, Mitchell Marcus, Ann Taylor, Craig Greenberg, Eduard
Hovy, Robert Belvin, and Ann Houston. 2012. OntoNotes Release 5.0. Technical report, Linguistic
Data Consortium 2013T19.

Dekai Wu and Pascale Fung. 2009. Semantic roles for SMT: a hybrid two-pass model. In Proc. of
NAACL-HLT-09 (Short Paper), pages 13–19.

Deyi Xiong, Min Zhang, and Haizhou Li. 2012. Modeling the translation of predicate-argument
structure for SMT. In Proc. of ACL-12, pages 902–911.

Nianwen Xue, Ondřej Bojar, Jan Hajič, Martha Palmer, Zdeňka Urešová, and Xiuhong Zhang. 2014.
Not an interlingua, but close: comparison of English AMRs to Chinese and Czech. In Proc. of
LREC-14, pages 1765–1772.

Feifei Zhai, Jiajun Zhang, Yu Zhou, and Chengqing Zong. 2012. Machine translation by modeling
predicate-argument structure transformation. In Proc. of COLING-12, pages 3019–3036.

Min Zhang, Hongfei Jiang, Aiti Aw, Haizhou Li, Chew Lim Tan, and Sheng Li. 2008. A tree sequence
alignment-based tree-to-tree translation model. In Proc. of ACL-08, pages 559–567.
