Authors:
Daniel Atzberger; Jonathan Schneider; Willy Scheibel; Daniel Limberger; Matthias Trapp and Jürgen Döllner
Affiliation:
Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Germany
Keyword(s):
Bug Triaging, Topic Models, Latent Dirichlet Allocation, Author-topic Model.
Abstract:
During the software development process, software defects, so-called bugs, are captured in a semi-structured manner in a bug tracking system using textual components and categorical features. It is the task of the triage owner to assign open bugs to developers with the required skills and expertise. This task, known as bug triaging, requires in-depth knowledge of each developer’s skills. Various machine learning techniques have been proposed to automate this task; most of these approaches apply topic models, especially Latent Dirichlet Allocation, to mine the textual components of bug reports. However, none of the proposed approaches explicitly models a developer’s expertise. In most cases, these algorithms are treated as a black box, as they offer no explanation for their recommendations. In this work, we show how the Author-Topic Model, a variant of Latent Dirichlet Allocation, can be used to capture a developer’s expertise in the latent topics of a corpus of bug reports from the model itself. Furthermore, we present three novel bug triaging techniques based on the Author-Topic Model. We compare our approach against a baseline model based on Latent Dirichlet Allocation on a dataset of 18,269 bug reports from the Mozilla Firefox project collected between July 1999 and June 2016. The results show that the Author-Topic Model can outperform the baseline approach in terms of the Mean Reciprocal Rank.
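For readers unfamiliar with the evaluation metric, the Mean Reciprocal Rank averages, over all test bug reports, the reciprocal of the position at which the actual assignee appears in the ranked developer list. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name, the developer names, and the choice to score a missing assignee as 0 are assumptions made for illustration.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    assignees: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the true assignee over all bug reports.

    rankings[i] is the ranked developer list recommended for bug i,
    assignees[i] is the developer who actually fixed bug i.
    Assumption: a bug whose assignee is absent from the list contributes 0.
    """
    if len(rankings) != len(assignees):
        raise ValueError("rankings and assignees must have the same length")
    if not rankings:
        return 0.0

    total = 0.0
    for ranked, target in zip(rankings, assignees):
        # Ranks are 1-based: the top recommendation has rank 1.
        for position, developer in enumerate(ranked, start=1):
            if developer == target:
                total += 1.0 / position
                break
    return total / len(rankings)


# Hypothetical recommendations for three bug reports.
example_rankings = [
    ["alice", "bob", "carol"],   # true assignee at rank 1
    ["bob", "alice", "carol"],   # true assignee at rank 2
    ["carol", "bob", "alice"],   # true assignee not recommended
]
example_assignees = ["alice", "alice", "dave"]
example_mrr = mean_reciprocal_rank(example_rankings, example_assignees)
```

In this toy case the reciprocal ranks are 1, 1/2, and 0, so the score is 0.5. A value closer to 1 means the correct developer tends to appear near the top of the recommendation list.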