
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages

Peng Qi*  Yuhao Zhang*  Yuhui Zhang
Jason Bolton  Christopher D. Manning
Stanford University
Stanford, CA 94305
{pengqi, yuhaozhang, yuhuiz}@stanford.edu
{jebolton, manning}@stanford.edu

Abstract

We introduce Stanza, an open-source Python natural language processing toolkit supporting 66 human languages. Compared to existing widely used toolkits, Stanza features a language-agnostic fully neural pipeline for text analysis, including tokenization, multi-word token expansion, lemmatization, part-of-speech and morphological feature tagging, dependency parsing, and named entity recognition. We have trained Stanza on a total of 112 datasets, including the Universal Dependencies treebanks and other multilingual corpora, and show that the same neural architecture generalizes well and achieves competitive performance on all languages tested. Additionally, Stanza includes a native Python interface to the widely used Java Stanford CoreNLP software, which further extends its functionality to cover other tasks such as coreference resolution and relation extraction. Source code, documentation, and pretrained models for 66 languages are available at https://stanfordnlp.github.io/stanza/.

Figure 1: Overview of Stanza's neural NLP pipeline. Stanza takes multilingual text as input, and produces annotations accessible as native Python objects. Besides this neural pipeline, Stanza also features a Python client interface to the Java CoreNLP software.
1 Introduction

The growing availability of open-source natural language processing (NLP) toolkits has made it easier for users to build tools with sophisticated linguistic processing. While existing NLP toolkits such as CoreNLP (Manning et al., 2014), FLAIR (Akbik et al., 2019), spaCy¹, and UDPipe (Straka, 2018) have had wide usage, they also suffer from several limitations. First, existing toolkits often support only a few major languages. This has significantly limited the community's ability to process multilingual text. Second, widely used tools are sometimes under-optimized for accuracy, either due to a focus on efficiency (e.g., spaCy) or the use of less powerful models (e.g., CoreNLP), potentially misleading downstream applications and insights obtained from them. Third, some tools assume input text has been tokenized or annotated with other tools, lacking the ability to process raw text within a unified framework. This has limited their wide applicability to text from diverse sources.

We introduce Stanza², a Python natural language processing toolkit supporting many human languages. As shown in Table 1, compared to existing widely-used NLP toolkits, Stanza has the following advantages:

• From raw text to annotations. Stanza features a fully neural pipeline which takes raw text as input, and produces annotations including tokenization, multi-word token expansion, lemmatization, part-of-speech and morphological feature tagging, dependency parsing, and named entity recognition.

• Multilinguality. Stanza's architectural design is language-agnostic and data-driven, which allows us to release models supporting 66 languages, by training the pipeline on the Universal Dependencies (UD) treebanks and other multilingual corpora.

• State-of-the-art performance. We evaluate Stanza on a total of 112 datasets, and find its neural pipeline adapts well to text of different genres, achieving state-of-the-art or competitive performance at each step of the pipeline.

* Equal contribution. Order decided by a tossed coin.
¹ https://spacy.io/
² The toolkit was called StanfordNLP prior to v1.0.0.
System    # Human Languages   Programming Language   Raw Text Processing   Fully Neural   Pretrained Models   State-of-the-art Performance
CoreNLP   6                   Java                   ✓                                    ✓
FLAIR     12                  Python                                       ✓              ✓                   ✓
spaCy     10                  Python                 ✓                                    ✓
UDPipe    61                  C++                    ✓                     ✓              ✓
Stanza    66                  Python                 ✓                     ✓              ✓                   ✓

Table 1: Feature comparisons of Stanza against other popular natural language processing toolkits.

Additionally, Stanza features a Python interface to the widely used Java CoreNLP package, allowing access to additional tools such as coreference resolution and relation extraction.

Stanza is fully open source and we make pretrained models for all supported languages and datasets available for public download. We hope Stanza can facilitate multilingual NLP research and applications, and drive future research that produces insights from human languages.

Figure 2: An example of multi-word tokens in French:
(fr) L'Association des Hôtels / (en) The Association of Hotels
(fr) Il y a des hôtels en bas de la rue / (en) There are hotels down the street
The des in the first sentence corresponds to two syntactic words, de and les; the second des is a single word.

2 System Design and Architecture

At the top level, Stanza consists of two individual components: (1) a fully neural multilingual NLP pipeline; and (2) a Python client interface to the Java Stanford CoreNLP software. In this section we introduce their designs.

2.1 Neural Multilingual NLP Pipeline

Stanza's neural pipeline consists of models that range from tokenizing raw text to performing syntactic analysis on entire sentences (see Figure 1). All components are designed with processing many human languages in mind, with high-level design choices capturing common phenomena in many languages, and data-driven models that learn the differences between these languages from data. Moreover, the implementation of Stanza components is highly modular, and reuses basic model architectures when possible for compactness. We highlight the important design choices here, and refer the reader to Qi et al. (2018) for modeling details.

Tokenization and Sentence Splitting. When presented with raw text, Stanza tokenizes it and groups tokens into sentences as the first step of processing. Unlike most existing toolkits, Stanza combines tokenization and sentence segmentation from raw text into a single module. This is modeled as a tagging problem over character sequences, where the model predicts whether a given character is the end of a token, the end of a sentence, or the end of a multi-word token (MWT, see Figure 2).³ We choose to predict MWTs jointly with tokenization because this task is context-sensitive in some languages.

³ Following Universal Dependencies (Nivre et al., 2020), we make a distinction between tokens (contiguous spans of characters in the input text) and syntactic words. These are interchangeable aside from the cases of MWTs, where one token can correspond to multiple words.
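To make the character-tagging formulation concrete, the following toy sketch (our illustration, not Stanza's internal code) decodes per-character predictions into tokens and sentences; the real model additionally predicts MWT endings, and produces the tags with a neural tagger rather than receiving them as input:

    def decode_char_tags(text, tags):
        """Group characters into tokens and sentences from per-character
        tags: 0 = inside a token, 1 = ends a token, 2 = ends a token and
        a sentence (the MWT label is omitted in this sketch)."""
        sentences, tokens, current = [], [], ''
        for char, tag in zip(text, tags):
            current += char
            if tag in (1, 2):          # character ends a token
                tokens.append(current.strip())
                current = ''
            if tag == 2:               # character also ends a sentence
                sentences.append(tokens)
                tokens = []
        return sentences

    # "Hi. Go!": tokens end after "Hi", ".", "Go" and "!";
    # sentences end after "." and "!".
    print(decode_char_tags('Hi. Go!', [0, 1, 2, 0, 0, 1, 2]))
    # [['Hi', '.'], ['Go', '!']]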
Multi-word Token Expansion. Once MWTs are identified by the tokenizer, they are expanded into the underlying syntactic words as the basis of downstream processing. This is achieved with an ensemble of a frequency lexicon and a neural sequence-to-sequence (seq2seq) model, which ensures that frequently observed expansions in the training set are always robustly expanded, while maintaining the flexibility to model unseen words statistically.
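The ensemble logic can be pictured as a dictionary lookup with a neural fallback. In this sketch, FREQUENT_EXPANSIONS and seq2seq_expand are hypothetical stand-ins for the components described above, not Stanza's internal API:

    # Expansions observed frequently in the training data (hypothetical).
    FREQUENT_EXPANSIONS = {
        ('fr', 'des'): ['de', 'les'],
        ('fr', 'du'): ['de', 'le'],
    }

    def expand_mwt(lang, token, seq2seq_expand):
        """Expand a multi-word token into its syntactic words."""
        key = (lang, token.lower())
        if key in FREQUENT_EXPANSIONS:
            # Frequent expansions are handled robustly by the lexicon.
            return FREQUENT_EXPANSIONS[key]
        # Unseen tokens fall back to the neural seq2seq model.
        return seq2seq_expand(lang, token)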
POS and Morphological Feature Tagging. For each word in a sentence, Stanza assigns it a part-of-speech (POS), and analyzes its universal morphological features (UFeats, e.g., singular/plural, 1st/2nd/3rd person, etc.). To predict POS and UFeats, we adopt a bidirectional long short-term memory network (Bi-LSTM) as the basic architecture. For consistency among universal POS (UPOS), treebank-specific POS (XPOS), and UFeats, we adopt the biaffine scoring mechanism from Dozat and Manning (2017) to condition the XPOS and UFeats predictions on that of UPOS.
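As a rough PyTorch sketch of this conditioning step (toy dimensions, in the spirit of Dozat and Manning (2017) rather than Stanza's exact implementation), XPOS scores can be computed biaffinely from a word's Bi-LSTM state and an embedding of its predicted UPOS tag:

    import torch
    import torch.nn as nn

    class Biaffine(nn.Module):
        """Score out_dim classes from two input vectors with a bilinear
        term plus a linear term over their concatenation."""
        def __init__(self, dim1, dim2, out_dim):
            super().__init__()
            self.U = nn.Parameter(torch.randn(out_dim, dim1, dim2) * 0.01)
            self.W = nn.Linear(dim1 + dim2, out_dim)

        def forward(self, x1, x2):
            # bilinear term: for each class k, x1^T U_k x2
            bilinear = torch.einsum('bi,kij,bj->bk', x1, self.U, x2)
            return bilinear + self.W(torch.cat([x1, x2], dim=-1))

    scorer = Biaffine(dim1=400, dim2=50, out_dim=30)
    h = torch.randn(8, 400)      # Bi-LSTM states for 8 words
    u = torch.randn(8, 50)       # embeddings of their predicted UPOS tags
    xpos_scores = scorer(h, u)   # shape: (8, 30)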
Lemmatization. Stanza also lemmatizes each word in a sentence to recover its canonical form (e.g., did→do). Similar to the multi-word token expander, Stanza's lemmatizer is implemented as an ensemble of a dictionary-based lemmatizer and a neural seq2seq lemmatizer. An additional classifier is built on the encoder output of the seq2seq model, to predict shortcuts such as lowercasing and identity copy, for robustness on long input sequences such as URLs.
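The control flow of this ensemble can be sketched as follows; lemma_dict, classify_shortcut, and seq2seq_lemmatize are hypothetical stand-ins for the components described above:

    def lemmatize(word, pos, lemma_dict, classify_shortcut, seq2seq_lemmatize):
        """Dictionary first, then shortcut classes, then full seq2seq."""
        if (word, pos) in lemma_dict:
            return lemma_dict[(word, pos)]
        shortcut = classify_shortcut(word)
        if shortcut == 'identity':      # e.g., URLs are copied verbatim
            return word
        if shortcut == 'lowercase':
            return word.lower()
        return seq2seq_lemmatize(word, pos)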
Dependency Parsing. Stanza parses each sentence for its syntactic structure, where each word in the sentence is assigned a syntactic head that is either another word in the sentence or, in the case of the root word, an artificial root symbol. We implement a Bi-LSTM-based deep biaffine neural dependency parser (Dozat and Manning, 2017). We further augment this model with two linguistically motivated features: one that predicts the linearization order of two words in a given language, and the other that predicts the typical distance in linear order between them. We have previously shown that these features significantly improve parsing accuracy (Qi et al., 2018).

Named Entity Recognition. For each input sentence, Stanza also recognizes named entities in it (e.g., person names, organizations, etc.). For NER we adopt the contextualized string representation-based sequence tagger from Akbik et al. (2018). We first train a forward and a backward character-level LSTM language model, and at tagging time we concatenate the representations at the end of each word position from both language models with word embeddings, and feed the result into a standard one-layer Bi-LSTM sequence tagger with a conditional random field (CRF)-based decoder.
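The tagger input described above can be assembled as in this sketch (toy dimensions; our illustration rather than Stanza's exact code): for each word, the forward char-LM state at its last character and the backward char-LM state at its first character are concatenated with a word embedding before entering the Bi-LSTM-CRF tagger:

    import torch

    n_words = 6
    word_emb = torch.randn(n_words, 100)   # word embeddings
    fwd_lm = torch.randn(n_words, 1024)    # forward LM states at word ends
    bwd_lm = torch.randn(n_words, 1024)    # backward LM states at word starts

    # Input to the one-layer Bi-LSTM-CRF sequence tagger.
    tagger_input = torch.cat([word_emb, fwd_lm, bwd_lm], dim=-1)
    print(tagger_input.shape)              # torch.Size([6, 2148])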

2.2 CoreNLP Client

Stanford's Java CoreNLP software provides a comprehensive set of NLP tools, especially for the English language. However, these tools are not easily accessible with Python, the programming language of choice for many NLP practitioners, due to the lack of official support. To facilitate the use of CoreNLP from Python, we take advantage of the existing server interface in CoreNLP, and implement a robust client as its Python interface.

When the CoreNLP client is instantiated, Stanza will automatically start the CoreNLP server as a local process. The client then communicates with the server through its RESTful APIs, after which annotations are transmitted in Protocol Buffers and converted back to native Python objects. Users can also specify JSON or XML as the annotation format. To ensure robustness, while the client is being used, Stanza periodically checks the health of the server, and restarts it if necessary.
3 System Usage

Stanza's user interface is designed to allow quick out-of-the-box processing of multilingual text. To achieve this, Stanza supports automated model download via Python code and pipeline customization with processors of choice. Annotation results can be accessed as native Python objects to allow for flexible post-processing.

3.1 Neural Pipeline Interface

Stanza's neural NLP pipeline can be initialized with the Pipeline class, taking a language name as an argument. By default, all processors will be loaded and run over the input text; however, users can also specify the processors to load and run with a list of processor names as an argument. Users can additionally specify other processor-level properties, such as the batch sizes used by processors, at initialization time (a combined example appears at the end of this subsection). The following code snippet shows a minimal usage of Stanza for downloading the Chinese model, annotating a sentence with customized processors, and printing out all annotations:

    import stanza

    # download Chinese model
    stanza.download('zh')
    # initialize Chinese neural pipeline
    nlp = stanza.Pipeline('zh', processors='tokenize,pos,ner')
    # run annotation over a sentence
    # ("Stanford is a private research university.")
    doc = nlp('斯坦福是一所私立研究型大学。')
    print(doc)

After all processors are run, a Document instance will be returned, which stores all annotation results. Within a Document, annotations are further stored in Sentences, Tokens and Words in a top-down fashion (Figure 1). The following code snippet demonstrates how to access the text and POS tag of each word in a document and all named entities in the document:

    # print the text and POS of all words
    for sentence in doc.sentences:
        for word in sentence.words:
            print(word.text, word.pos)

    # print all entities in the document
    print(doc.entities)

Stanza is designed to be run on different hardware devices. By default, CUDA devices will be used whenever they are visible to the pipeline; otherwise CPUs will be used. However, users can force all computation to be run on CPUs by setting use_gpu=False at initialization time.
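The snippet below illustrates both kinds of initialization options discussed above in one call. The tokenize_batch_size option is one example of a processor-level property; treat this as a sketch, since exact option names may vary across Stanza versions:

    import stanza

    # Initialize a pipeline with selected processors, a processor-level
    # property (the tokenizer's batch size), and CPU-only computation.
    nlp = stanza.Pipeline('zh',
                          processors='tokenize,pos',
                          tokenize_batch_size=32,  # processor-level property
                          use_gpu=False)           # force CPU computation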

3.2 CoreNLP Client Interface

The CoreNLP client interface is designed in a way that the actual communication with the backend CoreNLP server is transparent to the user. To annotate an input text with the CoreNLP client, a CoreNLPClient instance needs to be initialized, with an optional list of CoreNLP annotators. After the annotation is complete, results will be accessible as native Python objects.

This code snippet shows how to establish a CoreNLP client and obtain the NER and coreference annotations of an English sentence:
    from stanza.server import CoreNLPClient

    # start a CoreNLP client
    with CoreNLPClient(annotators=['tokenize', 'ssplit', 'pos', 'lemma',
                                   'ner', 'parse', 'coref']) as client:
        # run annotation over input
        ann = client.annotate('Emily said that she liked the movie.')
        # access all entities
        for sent in ann.sentence:
            print(sent.mentions)
        # access coreference annotations
        print(ann.corefChain)

With the client interface, users can annotate text in 6 languages as supported by CoreNLP.

3.3 Interactive Web-based Demo

To help visualize documents and their annotations generated by Stanza, we build an interactive web demo that runs the pipeline interactively. For all languages and all annotations Stanza provides in those languages, we generate predictions from the models trained on the largest treebank/NER dataset, and visualize the result with the Brat rapid annotation tool.⁴ This demo runs in a client/server architecture, and annotation is performed on the server side. We make one instance of this demo publicly available at http://stanza.run/. It can also be run locally with the proper Python libraries installed. An example of running Stanza on a German sentence can be found in Figure 3.

Figure 3: Stanza annotates a German sentence, as visualized by our interactive demo. Note am is expanded into syntactic words an and dem before downstream analyses are performed.

⁴ https://brat.nlplab.org/
3.4 Training Pipeline Models

For all neural processors, Stanza provides command-line interfaces for users to train their own customized models. To do this, users need to prepare the training and development data in compatible formats (i.e., CoNLL-U format for the Universal Dependencies pipeline and BIO format column files for the NER model). The following command trains a neural dependency parser with user-specified training and development data:

    $ python -m stanza.models.parser \
        --train_file train.conllu \
        --eval_file dev.conllu \
        --gold_file dev.conllu \
        --output_file output.conllu
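As an illustration of the BIO column format mentioned above, the following generic reader (a sketch, not Stanza's internal data loader) parses one-token-per-line files with blank lines separating sentences:

    def read_bio(path):
        """Read a BIO column file into lists of (token, tag) pairs."""
        sentences, current = [], []
        with open(path, encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:               # blank line ends a sentence
                    if current:
                        sentences.append(current)
                        current = []
                else:
                    cols = line.split()    # token first, BIO tag last
                    current.append((cols[0], cols[-1]))
        if current:
            sentences.append(current)
        return sentences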
4 Performance Evaluation

To establish benchmark results and compare with other popular toolkits, we trained and evaluated Stanza on a total of 112 datasets. All pretrained models are publicly downloadable.

Datasets. We train and evaluate Stanza's tokenizer/sentence splitter, MWT expander, POS/UFeats tagger, lemmatizer, and dependency parser with the Universal Dependencies v2.5 treebanks (Zeman et al., 2019). For training we use 100 treebanks from this release that have non-copyrighted training data, and for treebanks that do not include development data, we randomly split out 20% of the training data as development data.
Treebank                  System   Tokens  Sents.  Words  UPOS   XPOS   UFeats  Lemmas  UAS    LAS
Overall (100 treebanks)   Stanza   99.09   86.05   98.63  92.49  91.80  89.93   92.78   80.45  75.68
Arabic-PADT               Stanza   99.98   80.43   97.88  94.89  91.75  91.86   93.27   83.27  79.33
                          UDPipe   99.98   82.09   94.58  90.36  84.00  84.16   88.46   72.67  68.14
Chinese-GSD               Stanza   92.83   98.80   92.83  89.12  88.93  92.11   92.83   72.88  69.82
                          UDPipe   90.27   99.10   90.27  84.13  84.04  89.05   90.26   61.60  57.81
English-EWT               Stanza   99.01   81.13   99.01  95.40  95.12  96.11   97.21   86.22  83.59
                          UDPipe   98.90   77.40   98.90  93.26  92.75  94.23   95.45   80.22  77.03
                          spaCy    97.30   61.19   97.30  86.72  90.83  –       87.05   –      –
French-GSD                Stanza   99.68   94.92   99.48  97.30  –      96.72   97.64   91.38  89.05
                          UDPipe   99.68   93.59   98.81  95.85  –      95.55   96.61   87.14  84.26
                          spaCy    98.34   77.30   94.15  86.82  –      –       87.29   67.46  60.60
Spanish-AnCora            Stanza   99.98   99.07   99.98  98.78  98.67  98.59   99.19   92.21  90.01
                          UDPipe   99.97   98.32   99.95  98.32  98.13  98.13   98.48   88.22  85.10
                          spaCy    99.47   97.59   98.95  94.04  –      –       79.63   86.63  84.13

Table 2: Neural pipeline performance comparisons on the Universal Dependencies (v2.5) test treebanks. For our system we show macro-averaged results over all 100 treebanks. We also compare our system against UDPipe and spaCy on treebanks of five major languages where the corresponding pretrained models are publicly available. All results are F1 scores produced by the 2018 UD Shared Task official evaluation script.

These treebanks represent 66 languages, mostly European languages, but spanning a diversity of language families, including Indo-European, Afro-Asiatic, Uralic, Turkic, Sino-Tibetan, etc. For NER, we train and evaluate Stanza with 12 publicly available datasets covering 8 major languages, as shown in Table 3 (Nothman et al., 2013; Tjong Kim Sang and De Meulder, 2003; Tjong Kim Sang, 2002; Benikova et al., 2014; Mohit et al., 2012; Taulé et al., 2008; Weischedel et al., 2013). For the WikiNER corpora, as canonical splits are not available, we randomly split them into 70% training, 15% dev and 15% test splits. For all other corpora we used their canonical splits.

Training. On the Universal Dependencies treebanks, we tuned all hyper-parameters on several large treebanks and applied them to all other treebanks. We used the word2vec embeddings released as part of the 2018 UD Shared Task (Zeman et al., 2018), or the fastText embeddings (Bojanowski et al., 2017) whenever word2vec is not available. For the character-level language models in the NER component, we pretrained them on a mix of the Common Crawl and Wikipedia dumps, and the news corpora released by the WMT19 Shared Task (Barrault et al., 2019), except for English and Chinese, for which we pretrained on the Google One Billion Word (Chelba et al., 2013) and the Chinese Gigaword corpora⁵, respectively. We again applied the same hyper-parameters to models for all languages.

Universal Dependencies Results. For performance on UD treebanks, we compared Stanza (v1.0) against UDPipe (v1.2) and spaCy (v2.2) on treebanks of 5 major languages whenever a pretrained model is available. As shown in Table 2, Stanza achieved the best performance on most scores reported. Notably, we find that Stanza's language-agnostic architecture is able to adapt to datasets of different languages and genres. This is also shown by Stanza's high macro-averaged scores over 100 treebanks covering 66 languages.

NER Results. For performance of the NER component, we compared Stanza (v1.0) against FLAIR (v0.4.5) and spaCy (v2.2). For spaCy we reported results from its publicly available pretrained model whenever one trained on the same dataset can be found; otherwise we retrained its model on our datasets with default hyper-parameters, following the publicly available tutorial.⁶ For FLAIR, since their downloadable models were pretrained on dataset versions different from canonical ones, we retrained all models on our own dataset splits with their best reported hyper-parameters. All test results are shown in Table 3. We find that on all datasets Stanza achieved either higher or close F1 scores when compared against FLAIR. When compared to spaCy, Stanza's NER performance is much better. It is worth noting that Stanza's high performance is achieved with much smaller models compared with FLAIR (up to 75% smaller), as we intentionally compressed the models for memory efficiency and ease of distribution.

⁵ https://catalog.ldc.upenn.edu/LDC2011T13
⁶ https://spacy.io/usage/training#ner. Note that, following this public tutorial, we did not use pretrained word embeddings when training spaCy NER models, although using pretrained word embeddings may potentially improve the NER results.
Language   Corpus       # Types   Stanza   FLAIR   spaCy
Arabic     AQMAR        4         74.3     74.0    –
Chinese    OntoNotes    18        79.2     –       –
Dutch      CoNLL02      4         89.2     90.3    73.8
Dutch      WikiNER      4         94.8     94.8    90.9
English    CoNLL03      4         92.1     92.7    81.0
English    OntoNotes    18        88.8     89.0    85.4*
French     WikiNER      4         92.9     92.5    88.8*
German     CoNLL03      4         81.9     82.5    63.9
German     GermEval14   4         85.2     85.4    68.4
Russian    WikiNER      4         92.9     –       –
Spanish    CoNLL02      4         88.1     87.3    77.5
Spanish    AnCora       4         88.6     88.4    76.1

Table 3: NER performance across different languages and corpora. All scores reported are entity micro-averaged test F1. For each corpus we also list the number of entity types. * marks results from publicly available pretrained models on the same dataset, while others are from models retrained on our datasets.

Task   Stanza (CPU)   Stanza (GPU)   UDPipe (CPU)   FLAIR (CPU)   FLAIR (GPU)
UD     10.3×          3.22×          4.30×          –             –
NER    17.7×          1.08×          –              51.8×         1.17×

Table 4: Annotation runtime of various toolkits relative to spaCy (CPU) on the English EWT treebank and OntoNotes NER test sets. For reference, on the compared UD and NER tasks, spaCy is able to process 8140 and 5912 tokens per second, respectively.
Speed comparison. We compare Stanza against existing toolkits to evaluate the time it takes to annotate text (see Table 4). For GPU tests we use a single NVIDIA Titan RTX card. Unsurprisingly, Stanza's extensive use of accurate neural models makes it take significantly longer than spaCy to annotate text, but it is still competitive when compared against toolkits of similar accuracy, especially with the help of GPU acceleration.
5 Conclusion and Future Work

We introduced Stanza, a Python natural language processing toolkit supporting many human languages. We have shown that Stanza's neural pipeline not only has wide coverage of human languages, but is also accurate on all tasks, thanks to its language-agnostic, fully neural architectural design. Simultaneously, Stanza's CoreNLP client extends its functionality with additional NLP tools.

For future work, we consider the following areas of improvement in the near term:

• Models downloadable in Stanza are largely trained on a single dataset. To make models robust to many different genres of text, we would like to investigate the possibility of pooling various sources of compatible data to train "default" models for each language;

• The amount of computation and resources available to us is limited. We would therefore like to build an open "model zoo" for Stanza, so that researchers from outside our group can also contribute their models and benefit from models released by others;

• Stanza was designed to optimize for the accuracy of its predictions, but this sometimes comes at the cost of computational efficiency and limits the toolkit's use. We would like to further investigate reducing model sizes and speeding up computation in the toolkit, while still maintaining the same level of accuracy;

• We would also like to expand Stanza's functionality by adding other processors, such as neural coreference resolution or relation extraction, for richer text analytics.

Acknowledgments

The authors would like to thank the anonymous reviewers for their comments, Arun Chaganty for his early contribution to this toolkit, Tim Dozat for his design of the original architectures of the tagger and parser models, Matthew Honnibal and Ines Montani for their help with spaCy integration and helpful comments on the draft, Ranting Guo for the logo design, and John Bauer and the community contributors for their help with maintaining and improving this toolkit. This research is funded in part by Samsung Electronics Co., Ltd. and in part by the SAIL-JD Research Initiative.
References

Alan Akbik, Tanja Bergmann, Duncan Blythe, Kashif Rasul, Stefan Schweter, and Roland Vollgraf. 2019. FLAIR: An easy-to-use framework for state-of-the-art NLP. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations). Association for Computational Linguistics.

Alan Akbik, Duncan Blythe, and Roland Vollgraf. 2018. Contextual string embeddings for sequence labeling. In Proceedings of the 27th International Conference on Computational Linguistics. Association for Computational Linguistics.

Loïc Barrault, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Mark Fishel, Yvette Graham, Barry Haddow, Matthias Huck, Philipp Koehn, Shervin Malmasi, Christof Monz, Mathias Müller, Santanu Pal, Matt Post, and Marcos Zampieri. 2019. Findings of the 2019 conference on machine translation (WMT19). In Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1). Association for Computational Linguistics.

Darina Benikova, Chris Biemann, and Marc Reznicek. 2014. NoSta-D named entity annotation for German: Guidelines and dataset. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14).

Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5.

Ciprian Chelba, Tomas Mikolov, Mike Schuster, Qi Ge, Thorsten Brants, Phillipp Koehn, and Tony Robinson. 2013. One billion word benchmark for measuring progress in statistical language modeling. Technical report, Google.

Timothy Dozat and Christopher D. Manning. 2017. Deep biaffine attention for neural dependency parsing. In International Conference on Learning Representations (ICLR).

Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Finkel, Steven J. Bethard, and David McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Association for Computational Linguistics (ACL) System Demonstrations.

Behrang Mohit, Nathan Schneider, Rishav Bhowmick, Kemal Oflazer, and Noah A. Smith. 2012. Recall-oriented learning of named entities in Arabic Wikipedia. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics.

Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, and Daniel Zeman. 2020. Universal Dependencies v2: An evergrowing multilingual treebank collection. In Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC'20).

Joel Nothman, Nicky Ringland, Will Radford, Tara Murphy, and James R. Curran. 2013. Learning multilingual named entity recognition from Wikipedia. Artificial Intelligence, 194:151–175.

Peng Qi, Timothy Dozat, Yuhao Zhang, and Christopher D. Manning. 2018. Universal dependency parsing from scratch. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Association for Computational Linguistics.

Milan Straka. 2018. UDPipe 2.0 prototype at CoNLL 2018 UD shared task. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Association for Computational Linguistics.

Mariona Taulé, M. Antònia Martí, and Marta Recasens. 2008. AnCora: Multilevel annotated corpora for Catalan and Spanish. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08). European Language Resources Association (ELRA).

Erik F. Tjong Kim Sang. 2002. Introduction to the CoNLL-2002 shared task: Language-independent named entity recognition. In COLING-02: The 6th Conference on Natural Language Learning 2002 (CoNLL-2002).

Erik F. Tjong Kim Sang and Fien De Meulder. 2003. Introduction to the CoNLL-2003 shared task: Language-independent named entity recognition. In Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003.

Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, et al. 2013. OntoNotes release 5.0. Linguistic Data Consortium.

Daniel Zeman, Jan Hajič, Martin Popel, Martin Potthast, Milan Straka, Filip Ginter, Joakim Nivre, and Slav Petrov. 2018. CoNLL 2018 shared task: Multilingual parsing from raw text to universal dependencies. In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Association for Computational Linguistics.

Daniel Zeman, Joakim Nivre, Mitchell Abrams, et al. 2019. Universal Dependencies 2.5. LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University.