Icml 2005
4.3.1. Overall Results and Analysis

Figure 2 shows the number of experiments (out of 39) for which each algorithm performed the best or tied with the best when considering all three classifiers: C4.5, Naive Bayes, and k-Nearest Neighbour. As shown in this figure, the three FortalFS settings (FS10, FS8 and FS6) are among the best algorithms in terms of accuracy. FortalFS(10·N) had the best performance in 12 cases, FortalFS(8·N) and Relief in 7, the Backward Wrapper in 6, and FortalFS(6·N) and the Forward Wrapper in 5. When considering the best FortalFS result in each case (FFS), FortalFS performs at least as well as all other algorithms in 24 out of the 39 experiments.

Figure 3. Overall performance of all algorithms in terms of accuracy when considering the three classifiers: number of experiments (out of 39) in which each algorithm performed the best or tied with the best.

In terms of time consumption (Figure 3), as expected, the wrappers and FortalFS are near the bottom of the list, which shows the impact of the evaluation method. However, the three FortalFS settings were faster than all wrappers.
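The "best or tied with the best" counts reported in this section can be illustrated with a small sketch. It is not the authors' evaluation code: the function names, the tolerance parameter, and the toy accuracies below are assumptions made up for illustration, and only the counting rule (an algorithm scores in an experiment when its accuracy equals the top accuracy) follows the text. The helper `with_best_of` mimics the FFS entry by taking the best of the three FortalFS settings in each experiment.

```python
from collections import Counter


def count_best_or_tied(results, tol=1e-9):
    """Count, per algorithm, the experiments in which it matched the top accuracy."""
    wins = Counter()
    for accuracies in results.values():
        best = max(accuracies.values())
        for algorithm, accuracy in accuracies.items():
            # An algorithm scores if it is the best or tied with the best.
            if best - accuracy <= tol:
                wins[algorithm] += 1
    return wins


def with_best_of(results, members, name):
    """Add a virtual algorithm whose accuracy is the best among `members`."""
    merged = {}
    for experiment, accuracies in results.items():
        row = dict(accuracies)
        row[name] = max(accuracies[m] for m in members)
        merged[experiment] = row
    return merged


# Hypothetical accuracies (NOT from the paper), keyed by (dataset, classifier).
toy_results = {
    ("dataset_a", "C4.5"): {"FS6": 0.81, "FS8": 0.83, "FS10": 0.85, "Rel": 0.85, "FWR": 0.80},
    ("dataset_a", "NB"): {"FS6": 0.78, "FS8": 0.78, "FS10": 0.77, "Rel": 0.75, "FWR": 0.79},
    ("dataset_b", "kNN"): {"FS6": 0.90, "FS8": 0.88, "FS10": 0.89, "Rel": 0.86, "FWR": 0.87},
}

wins = count_best_or_tied(toy_results)
ffs_wins = count_best_or_tied(
    with_best_of(toy_results, ["FS6", "FS8", "FS10"], "FFS")
)

for algorithm, count in sorted(wins.items()):
    print(f"{algorithm}: best or tied in {count} experiment(s)")
print(f"FFS: best or tied in {ffs_wins['FFS']} experiment(s)")
```

Because ties count for every tied algorithm, the per-algorithm counts can sum to more than the number of experiments, which is also why the counts in the text need not add up to 39.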
[Figure: x-axis "algorithm" with tick labels FWR, B-F, Gen, LVF, Foc, FS6, FS8, FS10, WN2, W10N, Rel, BWR.]

Figure 5. Percentage of the features selected by each algorithm in all experiments and the three classifiers, from a total of 1230 features.

5.1. More Experiments

The experiments presented here give us a very good idea of what we can expect when applying FortalFS in real situations. However, more experiments with higher-dimensional, practical domains would be useful to confirm FortalFS's performance. In addition, since FortalFS can be parameterized to work with different machine learning algorithms and feature selection systems, it would be important to try other combinations of algorithms to evaluate how they behave with different datasets.

Finally, we acknowledge CAPES (Brazilian Federal Agency for Graduate Studies) as well as NSERC (Natural Sciences and Engineering Research Council of Canada) for their financial support.