Multi-document abstractive summarization using ILP based multi-sentence compression

Banerjee, Siddhartha; Mitra, Prasenjit; Sugiyama, Kazunari


arXiv:1609.07034 (cs)

[Submitted on 22 Sep 2016]



Abstract: Abstractive summarization is an ideal form of summarization since it can synthesize information from multiple documents to create concise informative summaries. In this work, we aim at developing an abstractive summarizer. First, our proposed approach identifies the most important document in the multi-document set. The sentences in the most important document are aligned to sentences in other documents to generate clusters of similar sentences. Second, we generate K-shortest paths from the sentences in each cluster using a word-graph structure. Finally, we select sentences from the set of shortest paths generated from all the clusters employing a novel integer linear programming (ILP) model with the objective of maximizing information content and readability of the final summary. Our ILP model represents the shortest paths as binary variables and considers the length of the path, information score and linguistic quality score in the objective function. Experimental results on the DUC 2004 and 2005 multi-document summarization datasets show that our proposed approach outperforms all the baselines and state-of-the-art extractive summarizers as measured by the ROUGE scores. Our method also outperforms a recent abstractive summarization technique. In manual evaluation, our approach also achieves promising results on informativeness and readability.
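
As a rough illustration of the second step described in the abstract (K-shortest paths over a word graph built from one cluster of similar sentences), the following is a minimal sketch rather than the authors' implementation. The node definition (lowercased whitespace tokens), the edge weight of 1 / bigram count, the restriction to simple paths, and the names `build_word_graph` and `k_shortest_paths` are all assumptions made for illustration. The clustering step and the ILP selection step are not shown.

```python
import heapq
from collections import defaultdict

# Hypothetical sentinel tokens marking sentence start and end.
START, END = "<s>", "</s>"


def build_word_graph(sentences):
    """Build a directed word graph from one cluster of similar sentences."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = [START] + sentence.lower().split() + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    # Assumed weighting: transitions seen more often get cheaper edges.
    return {u: {v: 1.0 / c for v, c in nbrs.items()} for u, nbrs in counts.items()}


def k_shortest_paths(graph, k, min_words=1):
    """Return up to k simple START->END paths in nondecreasing cost order."""
    heap = [(0.0, [START])]
    results = []
    while heap and len(results) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node == END:
            words = path[1:-1]  # drop the sentinel tokens
            if len(words) >= min_words:
                results.append((cost, " ".join(words)))
            continue
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:  # keep paths simple (no repeated words)
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return results


cluster = [
    "a storm hit florida on monday",
    "a storm hit florida",
    "a powerful storm hit florida on monday",
]
graph = build_word_graph(cluster)
for cost, candidate in k_shortest_paths(graph, k=3):
    print(f"{cost:.2f}  {candidate}")
```

Because all edge weights are positive, popping partial paths from the heap in cost order yields complete paths cheapest first. This exhaustive enumeration is only practical for small clusters. In the pipeline the abstract describes, candidates like these from every cluster would then be passed to the ILP model, which picks the final summary sentences.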

Comments:	IJCAI'15 Proceedings of the 24th International Conference on Artificial Intelligence, Pages 1208-1214, AAAI Press
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1609.07034 [cs.CL]
	(or arXiv:1609.07034v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1609.07034

Submission history

From: Siddhartha Banerjee
[v1] Thu, 22 Sep 2016 15:51:43 UTC (216 KB)
