Quality Fit for Purpose: Building Business Critical Errors Test Suites

Mariana Cabeça, Marianna Buchicchio, Madalena Gonçalves, Christine Maroti, João Godinho, Pedro Coelho, Helena Moniz, Alon Lavie


Abstract
This paper illustrates a new methodology based on Test Suites (Avramidis et al., 2018) with a focus on Business Critical Errors (BCEs) (Stewart et al., 2022) to evaluate the output of Machine Translation (MT) and Quality Estimation (QE) systems. We demonstrate the value of semi-automatic evaluation through scalable BCE-focused Test Suites for monitoring the performance of both MT and QE systems across 8 language pairs (LPs) and 4 error categories. This approach allows us not only to track the impact of new features and implementations in a real business environment, but also to identify the strengths and weaknesses of models with respect to different error types, and thus to determine what to improve next.
Anthology ID:
2023.eamt-1.44
Volume:
Proceedings of the 24th Annual Conference of the European Association for Machine Translation
Month:
June
Year:
2023
Address:
Tampere, Finland
Editors:
Mary Nurminen, Judith Brenner, Maarit Koponen, Sirkku Latomaa, Mikhail Mikhailov, Frederike Schierl, Tharindu Ranasinghe, Eva Vanmassenhove, Sergi Alvarez Vidal, Nora Aranberri, Mara Nunziatini, Carla Parra Escartín, Mikel Forcada, Maja Popovic, Carolina Scarton, Helena Moniz
Venue:
EAMT
Publisher:
European Association for Machine Translation
Pages:
451–460
URL:
https://aclanthology.org/2023.eamt-1.44
Cite (ACL):
Mariana Cabeça, Marianna Buchicchio, Madalena Gonçalves, Christine Maroti, João Godinho, Pedro Coelho, Helena Moniz, and Alon Lavie. 2023. Quality Fit for Purpose: Building Business Critical Errors Test Suites. In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, pages 451–460, Tampere, Finland. European Association for Machine Translation.
Cite (Informal):
Quality Fit for Purpose: Building Business Critical Errors Test Suites (Cabeça et al., EAMT 2023)
PDF:
https://aclanthology.org/2023.eamt-1.44.pdf