Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering

Lin, Shih-Ting; Durrett, Greg

arXiv:2009.09120 (cs)

[Submitted on 18 Sep 2020]

Abstract: Current methods in open-domain question answering (QA) usually employ a pipeline of first retrieving relevant documents, then applying strong reading comprehension (RC) models to that retrieved text. However, modern RC models are complex and expensive to run, so techniques to prune the space of retrieved text are critical to allow this approach to scale. In this paper, we focus on approaches which apply an intermediate sentence selection step to address this issue, and investigate the best practices for this approach. We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question. We examine trade-offs between processing speed and task performance in these two approaches, and demonstrate an ensemble module that represents a hybrid of the two. From experiments on Open-SQuAD and TriviaQA, we show that very lightweight QA models can do well at this task, but retrieval-based models are faster still. An ensemble module we describe balances between the two and generalizes well cross-domain.
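
The intermediate sentence selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the abstract does not specify the selectors' architectures, so the example below substitutes a simple IDF-weighted lexical overlap score as a stand-in for a retrieval-based selector. The function names (`tokenize`, `select_sentences`), the parameter `k`, and the toy sentences are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_sentences(question, sentences, k=2):
    """Keep the k sentences with the highest IDF-weighted overlap with the question.

    Hypothetical stand-in for a retrieval-based selector; not the paper's model.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    q_terms = set(tokenize(question))

    def score(toks):
        # Sum a smoothed IDF weight over question terms present in the sentence.
        return sum(math.log((n + 1) / (df[t] + 1)) + 1.0 for t in q_terms & set(toks))

    ranked = sorted(range(n), key=lambda i: score(tokenized[i]), reverse=True)
    keep = sorted(ranked[:k])  # restore passage order before handing off to a reader
    return [sentences[i] for i in keep]


# Toy passage (made up for this example).
sentences = [
    "Austin is the capital of Texas.",
    "Its music festival draws large crowds every spring.",
    "Texas became a state in 1845.",
    "Many people enjoy tacos.",
]
selected = select_sentences("What is the capital of Texas?", sentences, k=2)
print(selected)
```

Under this sketch, only the `selected` sentences would be passed to the more expensive RC model, which is the pruning idea the abstract motivates. A QA-based selector would instead run a lightweight reader over every sentence, trading speed for accuracy.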

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2009.09120 [cs.CL]
	(or arXiv:2009.09120v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2009.09120

Submission history

From: Shih-Ting Lin
[v1] Fri, 18 Sep 2020 23:39:15 UTC (2,881 KB)
