Trincavelli 2010
Abstract: In this paper, we introduce a method for identification of bacteria in human blood culture samples using an electronic nose. The method uses features that capture the static (steady state) and dynamic (transient) properties of the signal from the gas sensor array and proposes a means to ensemble results from consecutive samples. The underlying mechanism for ensembling is based on an estimation of posterior probability, which is extracted from a support vector machine classifier. A large dataset representing ten different bacteria cultures has been used to validate the presented methods. The results detail the performance of the proposed algorithm and show that through ensembling decisions on consecutive samples, significant reliability in classification accuracy can be achieved.

Index Terms: Bacteria identification, electronic nose, sepsis.

I. INTRODUCTION

Sepsis, also known as blood poisoning or septicaemia, is caused by the presence of microorganisms, predominantly bacteria, in circulating blood. Rapid administration of efficient antibiotic treatment is crucial as sepsis can result in septic shock, multiple organ dysfunction, and even death. The current standard procedure for diagnosis involves routine microbiological blood cultures. Such procedures can take at least 36 h to several days before diagnosis can be made. Typically, automated blood-culture-monitoring systems are first used to incubate a blood culture and monitor the production or reduction of gases (this is normally done via nonintrusive methods for detection of CO2). Once a sample indicates a change in gas tension, a lab technician will culture the sample on plates for further identification. This secondary subculturing may require up to 36 h before a final identification can be made. Thus, finding diagnostic methods that provide fast detection of the presence of the bacteria and identification of its type, thereby allowing proper antibiotic treatment, can provide an increased benefit to the patient as well as help to reduce cost to the health care system.

In the last decade, compact gas sensors have been integrated into an instrument called an electronic nose, which can provide fast (order of minutes) and nonintrusive measurement of gaseous agents. Within the medical field, electronic noses have been applied to the identification of bacterial agents in different contexts ranging from upper respiratory diseases [1] and urine samples [2] to ear, nose, and throat bacteria [3]. The challenge of developing an electronic nose for bacteria identification in blood cultures depends on a number of factors. First, the way in which the headspace sampling system introduces the samples to the sensor array has been shown in comparative studies to have an effect on identification rates, where samplers that control temperature and humidity provide better performance results [4]. Second, the blood in which the bacteria are sampled may differ from person to person, and thus, identification should be independent from the medium itself. Third, the postprocessing of the signals should be suited for the characteristics of the signals and application. In applications with a larger number of sensors, it is important to extract the relevant features from the signals that can be used for further processing. Although trials have been made with a single sensor type for identification of bacteria in blood cultures [5], today's off-the-shelf electronic nose devices contain anywhere from 4 to 32 sensors in an array. This allows for a wider range of odors to be detected.

In this paper, a new approach for classifying bacteria with an electronic nose is presented. The method evaluates the suitability of a given sample for classification by representing the output from a support vector machine (SVM) with a posterior probability estimation. This estimation is ensembled across ten consecutive responses of the same sample in order to make the classification more reliable. An electronic nose containing 22 gas sensors with partial and overlapping selectivities is used, together with an automatic headspace sampler that regulates the samples. The data processing methods presented consist of extracting features that reside on the static (steady state) and dynamic (transient) properties of the signal. These features are fed into an SVM and to the ensembling algorithm. The mechanism of ensembling is based on treating the posterior probabilities as a random sample and estimating the 95% confidence interval for the mean of the posterior of each class. If the confidence interval for one class is disjoint from and above those of all the other classes, then classification is performed (assigning the sequence of samples to that class); otherwise, a rejection is declared. Identifications

Manuscript received December 13, 2009; revised March 2, 2010; accepted March 27, 2010. Date of publication May 10, 2010; date of current version November 17, 2010. This work was supported by European Union structural funds from NovaMedTech. Asterisk indicates corresponding author.
M. Trincavelli and S. Coradeschi are with the Center for Applied Autonomous Sensor Systems, School of Science and Technology, Örebro University, Örebro, SE-70182, Sweden (e-mail: [email protected]; [email protected]).
*A. Loutfi is with the Center for Applied Autonomous Sensor Systems, School of Science and Technology, Örebro University, Örebro, SE-70182, Sweden (e-mail: [email protected]).
B. Söderquist is with the Örebro University Hospital, Örebro, Sweden, and also with the Örebro University, Örebro, SE-70182, Sweden (e-mail: [email protected]).
P. Thunberg is with the Department of Medical Physics, Örebro University Hospital, Örebro, Sweden, and also with the Örebro University, Örebro, SE-70182, Sweden (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TBME.2010.2049492
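The ensembling rule outlined in the Introduction (a 95% confidence interval for the mean posterior of each class, with classification only when one interval is disjoint from and above all the others) can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the text: a Student's t interval is used for the mean, the input is a list of per-sample posterior vectors, and the function name `ensemble_decision` is hypothetical. The authors' actual interval construction may differ.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, indexed by degrees of freedom.
# Assumption of this sketch: a t interval is used; only up to ten samples are tabulated.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def ensemble_decision(posteriors):
    """Return the winning class index, or None when the sequence is rejected.

    posteriors: one list of class probabilities per consecutive sample
    (hypothetical input layout).
    """
    n = len(posteriors)
    if n < 2:
        return None  # an uncertainty cannot be estimated from a single sample
    if n - 1 not in T_CRIT_95:
        raise ValueError("this sketch only tabulates up to ten samples")
    t_crit = T_CRIT_95[n - 1]
    n_classes = len(posteriors[0])

    means, lower, upper = [], [], []
    for c in range(n_classes):
        values = [row[c] for row in posteriors]
        mean = statistics.mean(values)
        # Half-width of the confidence interval for the mean of class c.
        half_width = t_crit * statistics.stdev(values) / math.sqrt(n)
        means.append(mean)
        lower.append(mean - half_width)
        upper.append(mean + half_width)

    # Candidate class: the one with the largest mean posterior.
    best = max(range(n_classes), key=means.__getitem__)
    # Classify only if its interval lies strictly above every other interval.
    disjoint = all(lower[best] > upper[c]
                   for c in range(n_classes) if c != best)
    return best if disjoint else None
```

Called on a sequence in which one class dominates every sample, the function returns that class index; when two classes alternate as the most probable, their intervals overlap and the function returns `None`, which corresponds to a rejection.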
local solution is also a global optimum. Many variations of the original model of SVM have been proposed, both for classifi-

$$\min_{p} \; \sum_{i=1}^{k} \sum_{j : j \neq i} \left( r_{ji}\, p_i - r_{ij}\, p_j \right)^2 \tag{4}$$
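To make the minimization in (4) concrete, the following Python sketch computes class probabilities from pairwise estimates. It assumes, following the pairwise coupling formulation in [12], that r_ij is the estimated probability of class i in the i-versus-j problem and that the probabilities p are constrained to sum to one. The function names and the simple fixed-point iteration are illustrative choices for this sketch, not necessarily the solver used by the authors.

```python
def coupling_objective(r, p):
    """Value of the sum in (4) for pairwise estimates r and probabilities p."""
    k = len(p)
    return sum((r[j][i] * p[i] - r[i][j] * p[j]) ** 2
               for i in range(k) for j in range(k) if j != i)


def couple_pairwise(r, max_iter=1000, eps=1e-10):
    """Approximate the minimizer of (4) subject to sum(p) == 1.

    r: k-by-k nested list, r[i][j] being the pairwise estimate for class i
    against class j (diagonal entries are ignored). Hypothetical layout.
    """
    k = len(r)
    # The objective equals 2 * p^T Q p with the matrix Q built below.
    Q = [[0.0] * k for _ in range(k)]
    for t in range(k):
        for j in range(k):
            if j != t:
                Q[t][t] += r[j][t] ** 2
                Q[t][j] = -r[j][t] * r[t][j]

    p = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(max_iter):
        Qp = [sum(Q[t][j] * p[j] for j in range(k)) for t in range(k)]
        pQp = sum(p[t] * Qp[t] for t in range(k))
        # At the optimum every component of Qp equals p^T Q p.
        if max(abs(Qp[t] - pQp) for t in range(k)) < eps:
            break
        for t in range(k):
            Qp_t = sum(Q[t][j] * p[j] for j in range(k))
            pQp = sum(p[i] * sum(Q[i][j] * p[j] for j in range(k))
                      for i in range(k))
            p[t] += (pQp - Qp_t) / Q[t][t]  # coordinate-wise update
            total = sum(p)
            p = [v / total for v in p]      # renormalize onto the simplex
    return p
```

A quick sanity check is to build r from a known probability vector as r_ij = p_i / (p_i + p_j); every term of (4) then vanishes at that vector, so the sketch should recover it.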
IV. RESULTS
A 12-fold cross-validation on the collected dataset has been performed to evaluate the proposed pattern recognition algorithm. In every fold, one sequence of measurements has been left out and used for testing the algorithm trained with the remaining 11 sequences. The hyperparameters of the SVM have been estimated with an exhaustive grid search in the interval [−10, 10] with step 1 in the base 2 logarithmic scale for both C and γ (that is, values from 2^−10 to 2^10). An eightfold cross-validation, where the folds have been extracted randomly from the training set, has been performed at every point in the grid. The confusion matrices obtained using only the features based on the response of the sensor and using the features based both on the response and the derivative are displayed in Tables II and III, respectively. The classification accuracies obtained in the two cases are 91.8% ± 11.5% and 94% ± 12.7%, respectively. Before ensembling the posterior probabilities calculated for the sampling cycles, an analysis of the reliability of the estimate of the posterior probabilities as a confidence measure for the classification algorithm is made, as well as an analysis of the distribution of errors with respect to the measuring cycles and measuring sequences. In order to check the validity of the posteriors as a confidence measure, a threshold is introduced so that, if not exceeded by the maximum of the posteriors, a rejection is declared. Fig. 2 shows the results of varying this threshold in the range (0,1) for both the considered feature sets (response and response + derivative). The fact that the error rate decreases when the rejection threshold increases indicates that the estimation of the posterior is a reliable confidence measure for the classification algorithm. Moreover, it can also be observed that the addition of the derivative-based features reduces the error by roughly 25% over the whole rejection threshold spectrum without increasing the rejection rate. This confirms that the dynamic characteristics of the signal contain useful information for the discrimination of odors.

Fig. 2. Performance of the classification algorithm with a varying rejection threshold. The upper figure shows the error rate and the lower figure the rejection rate. The dashed lines represent the performance obtained by the algorithm that uses only the response-based features, while the solid lines represent the performance obtained by the algorithm that uses both the response and derivative-based features.

A second aspect to analyze is how errors are spread across measurement sessions and sampling cycles. Table IV shows the performances obtained in 12 measurement sessions. It is evident that measurement sessions 1 and 7 perform much worse than the other sessions. This can be explained by the fact that these two sessions are the ones recorded at the beginning of the two experiment batches. Therefore, we can suppose that this degradation of performance is due to interference in the measuring system, such as humidity deposited on the sensors' surface, sensors that were not fully warmed up, or stagnant air in the sampling system. An analysis of how the errors are spread across measurement cycles is given in Fig. 3. It can be observed that the number of errors made during the first measuring cycle is larger than the number of errors in the other cycles. This can be due to the fact that the purging procedure of the nose at the end of a measuring cycle is not perfect, and therefore, some leftover from the earlier sampling cycle is still there. This effect is particularly evident in the first measuring cycle, since the bacteria smelled in the preceding sampling cycle were different. Indeed, it is possible to use a sample that will better "condition" the sensors to the bacteria-infected blood, which is a common practice in the electronic nose community. Using a conditioner may result in a better classification performance for the first sample, and this will be investigated in future studies.

TABLE IV
CLASSIFICATION ACCURACY FOR 12 MEASUREMENT SESSIONS

Fig. 5. Number of errors committed by the classification algorithm in the different measuring cycles for the dataset where sequences 1 and 7 have been removed. It is evident that the first measurement cycle is more prone to erroneous decisions.

If the data collected during measurement sessions 1 and 7 are removed from the set, we obtain 96.4% ± 4.3% classification accuracy with the features based on the static response and 98.9% ± 1.4% with the features that include the dynamics of the sensors. Fig. 4 shows the effect of the introduction of a

2888 IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 57, NO. 12, DECEMBER 2010
TRINCAVELLI et al.: DIRECT IDENTIFICATION OF BACTERIA IN BLOOD CULTURE SAMPLES USING ELECTRONIC NOSE 2889
threshold on the posterior probability on this reduced dataset. The observations made for the full dataset are confirmed, and in this case, the improvement obtained with the introduction of the features based on the derivative of the signal is even more significant. Fig. 5 shows that most of the errors are made in the first measuring cycle, also when measurement sessions 1 and 7 are left out.

Results from ensembling the decisions for the full dataset and for the dataset without sequences 1 and 7 are shown in Figs. 6 and 7, respectively. It is important to notice that neglecting the first cycle improves the performance of the ensemble. This confirms that the first cycle contains additional noise with respect to the subsequent cycles. In particular in Fig. 7, the performance of

Fig. 7. Performance of the ensembled classification algorithm with a varying number of measuring cycles for the dataset where sequences 1 and 7 have been removed. Only the rejection rate is shown since the error is constantly zero. The solid lines represent the performance obtained by the algorithm that uses all the measuring cycles, while the dashed lines represent the performance obtained by the algorithm that neglects the first measurement cycle. Notice that the solid line starts from cycle two and the dashed line from cycle three. This is because at least two samples are needed to calculate an uncertainty.

REFERENCES
[1] S. Y. Lai, O. F. Deffenderfer, W. Hanson, M. P. Phillips, and E. R. Thaler, “Identification of upper respiratory bacterial pathogens with the electronic nose,” Laryngoscope, vol. 112, pp. 975–979, Jun. 2002.
[2] S. Aathithan, J. C. Plant, A. N. Chaudry, and G. L. French, “Diagnosis of bacteriuria by detection of volatile organic compounds in urine using an automated headspace analyzer with multiple conducting polymer sensors,” J. Clin. Microbiol., vol. 39, pp. 2590–2593, Jul. 2001.
[3] M. Holmberg, F. Gustafsson, E. G. Hörnsten, F. Winquist, L. E. Nilsson, L. Ljung, and I. Lundström, “Feature extraction from sensor data on bacterial growth,” Biotechnol. Tech., vol. 12, no. 4, pp. 319–324, 2004.
[4] G. C. Green, A. D. Chan, and R. A. Goubran, “An investigation into the suitability of using three electronic nose instruments for the detection and discrimination of bacteria types,” in Proc. IEEE Eng. Med. Biol. Soc., 2006, vol. 1, pp. 1850–1853.
[5] M. Bruins, A. Bos, P. L. Petit, K. Eadie, A. Rog, R. Bos, G. H. van Ramshorst, and A. van Belkum, “Device-independent, real-time identification of bacterial pathogens with a metal oxide-based olfactory sensor,” Eur. J. Clin. Microbiol. Infect. Dis., vol. 28, pp. 775–780, Jul. 2009.
[6] K. Arshak, E. Moore, G. Lyons, J. Harris, and S. Clifford, “A review of gas sensors employed in electronic nose applications,” Sens. Rev., vol. 24, no. 2, pp. 181–198, Oct. 2004.
[7] T. Eklöv, P. Martensson, and I. Lundström, “Enhanced selectivity of MOSFET gas sensors by systematical analysis of transient parameters,” Anal. Chim. Acta, vol. 353, no. 2–3, pp. 291–300, 1997.
[8] R. Gutierrez-Osuna, H. T. Nagle, and S. S. Schiffman, “Transient response analysis of an electronic nose using multi-exponential models,” Sens. Actuators B, Chem., vol. 61, no. 1–3, pp. 170–182, 1999.
[9] C. J. C. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining Knowl. Discov., vol. 2, no. 2, pp. 121–167, 1998.
[10] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics). New York: Springer-Verlag, Aug. 2006.
[11] C.-W. Hsu and C.-J. Lin, “A comparison of methods for multi-class support vector machines,” IEEE Trans. Neural Netw., vol. 13, no. 2, pp. 415–425, Mar. 2002.
[12] T.-F. Wu, C.-J. Lin, and R. C. Weng, “Probability estimates for multi-class classification by pairwise coupling,” J. Mach. Learning Res., vol. 5, pp. 975–1005, 2004.
[13] H. T. Lin, C. J. Lin, and R. C. Weng, “A note on Platt’s probabilistic outputs for support vector machines,” Mach. Learn., vol. 68, no. 3, pp. 267–276, Oct. 2007.
[14] J. C. Platt, “Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods,” in Advances in Large Margin Classifiers. Cambridge, MA: MIT Press, 1999, pp. 61–74.
[15] Y. Hochberg and A. C. Tamhane, Multiple Comparison Procedures (Wiley Series in Probability and Statistics). New York: Wiley, Nov. 1987.

Amy Loutfi received the B.Sc. degree in electrical engineering from the University of New Brunswick, Saint John, NB, Canada, and the Ph.D. degree in computer science from Örebro University, Örebro, Sweden.
She is a Postdoctoral Researcher at Örebro University. She has several years of experience in electronic nose devices. Her current research interests include the integration of electronic nose into intelligent systems, such as mobile robots, mobile robotics for environmental monitoring, artificial intelligence, and distributed robotic systems.