An Adversarially-Learned Turing Test for Dialog Generation Models

Gao, Xiang; Zhang, Yizhe; Galley, Michel; Dolan, Bill

arXiv:2104.08231 (cs)

[Submitted on 16 Apr 2021]


Abstract: The design of better automated dialogue evaluation metrics offers the potential to accelerate research on conversational AI. However, existing trainable dialogue evaluation models are generally restricted to classifiers trained in a purely supervised manner, which leaves them highly vulnerable to adversarial attacks (e.g., a nonsensical response that receives a high classification score). To alleviate this risk, we propose an adversarial training approach to learn a robust model, ATT (Adversarial Turing Test), that discriminates machine-generated responses from human-written replies. In contrast to previous perturbation-based methods, our discriminator is trained by iteratively generating unrestricted and diverse adversarial examples using reinforcement learning. The key benefit of this unrestricted adversarial training approach is that it allows the discriminator to improve its robustness through an iterative attack-defense game. Our discriminator shows high accuracy against strong attackers including DialoGPT and GPT-3.
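
The iterative attack-defense game described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the hand-written features, the canned response lists (HUMAN, CANDIDATES), the logistic-regression discriminator, and the softmax attacker policy are all assumptions chosen to keep the example small and self-contained. The attacker is updated with REINFORCE, using the discriminator's human-likeness score as reward, and the discriminator is then retrained on fresh attacker samples.

```python
import math
import random

# Hypothetical data: a few "human" replies and a fixed pool of attacker responses.
HUMAN = ["i love hiking on weekends", "no, i have not seen it yet"]
CANDIDATES = ["i do not know", "great great great great", "asdf qwer zxcv"]

def features(text):
    # Toy features (an assumption): bias, distinct-token ratio, capped length.
    tokens = text.split()
    return [1.0, len(set(tokens)) / len(tokens), min(len(tokens), 10) / 10.0]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def score(w, text):
    # Discriminator output: estimated probability that the text is human-written.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, features(text))))

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def attack_defense_game(rounds=5, steps=50, lr=0.5, seed=0):
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]                # discriminator weights
    logits = [0.0] * len(CANDIDATES)   # attacker policy over candidate responses
    history = []
    for _ in range(rounds):
        # Attack phase: REINFORCE with an expected-reward baseline.
        for _ in range(steps):
            probs = softmax(logits)
            i = rng.choices(range(len(CANDIDATES)), weights=probs)[0]
            reward = score(w, CANDIDATES[i])
            baseline = sum(p * score(w, c) for p, c in zip(probs, CANDIDATES))
            for j in range(len(logits)):
                grad_log_prob = (1.0 if j == i else 0.0) - probs[j]
                logits[j] += lr * (reward - baseline) * grad_log_prob
        # Defense phase: logistic-regression updates on human replies (label 1)
        # versus responses sampled from the current attacker (label 0).
        probs = softmax(logits)
        for _ in range(steps):
            fake = CANDIDATES[rng.choices(range(len(CANDIDATES)), weights=probs)[0]]
            for text, label in ((rng.choice(HUMAN), 1.0), (fake, 0.0)):
                err = label - score(w, text)
                w = [wi + lr * err * xi for wi, xi in zip(w, features(text))]
        history.append(softmax(logits))
    return w, history
```

Calling attack_defense_game() returns the final discriminator weights and the attacker's response distribution after each round. In this toy setting the attacker can only choose among fixed candidates, whereas the abstract describes generating unrestricted responses; the sketch is meant only to show the alternating structure of the two phases.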

Comments:	7 pages, 2 figures
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2104.08231 [cs.CL]
	(or arXiv:2104.08231v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2104.08231

Submission history

From: Xiang Gao
[v1] Fri, 16 Apr 2021 17:13:14 UTC (5,459 KB)
