End-to-End Information Extraction without Token-Level Supervision

Palm, Rasmus Berg; Hovy, Dirk; Laws, Florian; Winther, Ole

arXiv:1707.04913 (cs)

[Submitted on 16 Jul 2017]

Abstract: Most state-of-the-art information extraction (IE) approaches rely on token-level labels to find the areas of interest in text. Unfortunately, these labels are time-consuming and costly to create and are consequently unavailable for many real-life IE tasks. To make matters worse, token-level labels are usually not the desired output, but just an intermediate step. End-to-end (E2E) models, which take raw text as input and produce the desired output directly, need not depend on token-level labels. We propose an E2E model based on pointer networks, which can be trained directly on pairs of raw input and output text. We evaluate our model on the ATIS data set, the MIT restaurant corpus, and the MIT movie corpus, and compare it to neural baselines that do use token-level labels. We achieve competitive results, within a few percentage points of the baselines, showing the feasibility of E2E information extraction without token-level labels. This opens up new possibilities: for many tasks currently handled by human extractors, raw input and output data are available, but token-level labels are not.
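
As a rough illustration of the pointer mechanism the abstract refers to, the sketch below shows a single decoding step that scores every input position against a decoder state and copies the highest-scoring input token. It is not the authors' model: the dot-product scoring, the hand-picked two-dimensional vectors, the example query, and the names `softmax` and `pointer_step` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_step(decoder_state, encoder_states):
    """Return a probability distribution over input positions.

    Assumption: dot-product scoring stands in for whatever attention
    function a real pointer network would learn.
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    return softmax(scores)


# Hypothetical input query and toy encoder states (one vector per token).
tokens = ["flights", "from", "boston", "to", "denver"]
encoder_states = [[0.1, 0.0], [0.0, 0.1], [2.0, 0.1], [0.0, 0.2], [0.1, 2.0]]

# Hypothetical decoder state while producing an "origin city" output field.
decoder_state = [3.0, 0.0]

probs = pointer_step(decoder_state, encoder_states)
copied = tokens[max(range(len(tokens)), key=probs.__getitem__)]
print(copied)
```

Because the output is copied from the input by position, training such a step needs only the input text and the desired output string, not a label for every token.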

Comments:	this http URL @ EMNLP 2017
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1707.04913 [cs.CL]
	(or arXiv:1707.04913v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1707.04913

Submission history

From: Rasmus Berg Palm
[v1] Sun, 16 Jul 2017 16:57:36 UTC (58 KB)
