Machine translation for multilingual troubleshooting in the IT domain: a comparison of different strategies

S Štajner, J Rodrigues, L Gomes… - Proceedings of the 1st …, 2015 - aclanthology.org
Proceedings of the 1st Deep Machine Translation Workshop, 2015 - aclanthology.org
Abstract
In this paper, we address the problem of machine translation (MT) of domain-specific texts for which large amounts of parallel training data are not available. We focus on the IT domain and on English-to-Portuguese machine translation, and compare different strategies for improving system performance over two baselines: the first uses only a large dataset of out-of-domain data, and the second uses only a small dataset of in-domain data. Our results indicate that adding a domain-specific bilingual lexicon to the training dataset significantly improves the performance of both a hybrid MT system and a phrase-based statistical MT (PBSMT) system, while adding out-of-domain sentence pairs to the training dataset improves the performance of the hybrid MT system only. Furthermore, we perform a human evaluation of the sentences generated by the hybrid MT system and the standard PBSMT system built using the same training datasets. The results indicate some significant differences between the two MT approaches in this specific task.