A Deep Network Model for Paraphrase Detection in Short Text Messages

Agarwal, Basant; Ramampiaro, Heri; Langseth, Helge; Ruocco, Massimiliano

doi:10.1016/j.ipm.2018.06.005


arXiv:1712.02820 (cs)

[Submitted on 7 Dec 2017]

Abstract: This paper is concerned with paraphrase detection. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication and question answering. Given two sentences, the objective is to detect whether they are semantically identical. An important insight from this work is that existing paraphrase systems perform well when applied to clean texts, but they do not necessarily deliver good performance on noisy texts. Challenges with paraphrase detection on user-generated short texts, such as Twitter messages, include language irregularity and noise. To cope with these challenges, we propose a novel deep neural network-based approach that relies on coarse-grained sentence modeling using a convolutional neural network and a long short-term memory model, combined with a specific fine-grained word-level similarity matching model. Our experimental results show that the proposed approach outperforms existing state-of-the-art approaches on user-generated noisy social media data, such as Twitter texts, and achieves highly competitive performance on a cleaner corpus.
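
As a rough illustration of what fine-grained word-level similarity matching can look like, the sketch below builds a pairwise similarity matrix between the words of two sentences. It is not the model from the paper, which this abstract does not specify in detail: the use of cosine similarity, the function names `cosine` and `similarity_matrix`, and the two-dimensional `TOY_EMBEDDINGS` are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(sent_a, sent_b, embeddings):
    """Pairwise word similarities: one row per word of sent_a, one column per word of sent_b."""
    return [
        [cosine(embeddings[wa], embeddings[wb]) for wb in sent_b]
        for wa in sent_a
    ]


# Hypothetical toy embeddings, invented for this example only.
TOY_EMBEDDINGS = {
    "good": [1.0, 0.0],
    "great": [0.8, 0.6],
    "movie": [0.0, 1.0],
    "film": [0.0, 2.0],
}

matrix = similarity_matrix(["good", "movie"], ["great", "film"], TOY_EMBEDDINGS)
for row in matrix:
    print([round(value, 2) for value in row])
```

In a sketch like this, a high entry in the matrix marks a word pair that is close in the embedding space even when the surface forms differ, which is the kind of signal a word-level matching component could feed into a classifier alongside sentence-level representations.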

Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:1712.02820 [cs.IR]
	(or arXiv:1712.02820v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.1712.02820
Journal reference:	B. Agarwal, H. Ramampiaro, H. Langseth, M. Ruocco (2018). "A Deep Network Model for Paraphrase Detection in Short Text Messages". Information Processing & Management (IPM), 54(6), pp. 922-937. Elsevier.
Related DOI:	https://doi.org/10.1016/j.ipm.2018.06.005

Submission history

From: Heri Ramampiaro
[v1] Thu, 7 Dec 2017 19:10:45 UTC (303 KB)
