Chapter 9
Conclusions
In the real world, data grow exponentially, and it is practically impossible to benefit
from these data without mining or extracting useful rules or interesting patterns
from them. Data mining, in turn, utilizes many techniques. No single
technique is found suitable for all kinds of data and for all types of domains
(the no-free-lunch theorem holds true). Hybrid techniques have been designed to
overcome such limitations. This thesis has presented a hybridized approach based
on rough set theory and classical decision tree induction. This chapter summarizes
the work and presents the conclusions with directions for future research.
9.1 Summary
The thesis facilitates an important task of data mining, namely classification using
decision tree induction. The RDT framework, a hybridization of rough set theory
and decision tree induction, is proposed to address various data mining issues
for classification. The performance of RDT has been investigated in handling
inconsistency, continuous attributes, large datasets, and noisy domains. In the first
phase, benchmark datasets from the UCI repository have been used. These
datasets are well known to the research community, and the corresponding
algorithms have the chance of being fine-tuned to perform well on them;
therefore, in the second phase, the RDT model is also employed to mine two real-world
datasets from the agricultural domain. Machine learning based data mining
applications for the agricultural domain have not been in focus; hence, useful
applications need to be identified and developed using the proposed approach.
From the review of decision tree induction, the ID3 and C4.5 algorithms
were adjudged suitable as the base for the hybridization. Using these two
algorithms, a baseline was established for comparison of the classifiers obtained
for the small datasets. Some additional measures of the resulting classifier were
also given due weight for the moderate and large datasets. The selection of
datasets for further study was based on the issues they exhibit; in some of these
datasets the problem of a large number of attributes also prevailed. Further, the
Iris, Vehicle, Australian-credit-card, Adult and Cover type datasets were selected
to deal with the problems of attributes with continuous and missing values using
rough set theory. For real-world data, the assumption is that the dataset includes
all kinds of noise, inconsistency, or any other imperfection that usually occurs in
the real world. To address issues in such datasets, dynamic RDT was employed.
Finally, RDT was tested for prediction of epidemic outbreak in mango using
another real dataset covering a period of eleven years; at a prior stage, this
problem had been studied using statistical methods.
There is no algorithm that dominates all other algorithms for all types of
domains. For every problem, there is no perfect algorithm that guesses the
target function with no error. The best that can be hoped for is to identify the best
algorithm for the data being studied, rather than the one with the best average
performance across domains. The approach taken here is to execute various
models for classification and estimate the accuracy, the complexity, the size of the
rule-set, and the number of attributes required for each of them. In addition, the
cumulative score may be utilized for the comparisons.
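The idea of folding several evaluation criteria into one cumulative score can be sketched as follows. This is a minimal illustration only, assuming a simple rank-based aggregate over accuracy, complexity, rule count and attribute count (lower total rank means a better classifier); the exact CS formulation is defined earlier in the thesis, and the model names and figures below are hypothetical.

```python
# Hypothetical sketch of a Cumulative Score (CS) style comparison.
# Assumption: rank each classifier on every criterion (higher accuracy
# is better; lower complexity, rule count and attribute count are
# better) and sum the ranks, so a lower total means a better classifier.

def cumulative_score(models):
    """models: dict name -> (accuracy, complexity, n_rules, n_attrs)."""
    names = list(models)
    # One sort key per criterion, arranged so that "best" sorts first.
    criteria = [
        lambda m: -models[m][0],  # accuracy: higher is better
        lambda m: models[m][1],   # complexity: lower is better
        lambda m: models[m][2],   # number of rules: lower is better
        lambda m: models[m][3],   # number of attributes: lower is better
    ]
    scores = {n: 0 for n in names}
    for key in criteria:
        for rank, name in enumerate(sorted(names, key=key), start=1):
            scores[name] += rank
    return scores

# Invented figures for two classifiers on one dataset.
models = {
    "C4.5": (0.91, 120, 40, 14),
    "RDT":  (0.93, 80, 25, 9),
}
print(cumulative_score(models))
```

In this toy setting RDT ranks first on all four criteria, so it receives the minimum possible score of 4 while C4.5 receives 8.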
No system is foolproof. The performance of RDT was not always the best on
every dataset; however, a few datasets from a domain cannot disprove the validity
of the RDT model, because in such cases the difference from the base
decision tree induction algorithm was insignificant. The implementation of the RDT
algorithm is automated with the help of interface programs, script files and batch
files, which are not as efficient as a fully automated module for the RDT model
would be; some manual steps are required on the part of the user to use CS for
performance comparison. The results of classification using the RDT model and
its variants are compared with the results obtained from the rough set approach
and the decision tree induction approach; other models based on neural networks
or genetic algorithms are not attempted. Validating the conditions for updating the
prior predictive model was not possible at this stage due to the non-availability of
more instances of the data. The study leads to the following
conclusions:
1. A new hybridized model for classification, namely RDT, based on rough set
theory and decision tree induction, is proposed. For a decision system with n
objects and m attributes, the complexity of the base decision tree induction is
O(m²n log n) + O(n(log n)²). However, RDT removes irrelevant attributes at
an early stage, reducing the effective number of attributes for tree induction.
2. The accuracy, the complexity, the number of rules, and the number of
attributes are used to evaluate the resulting classifiers. As a single classifier
rarely scores best on all of these measures, the Cumulative Score (CS) is
formulated mainly for comparing classifiers across these criteria.
3. For some datasets a large number of reducts might exist. In order to fine-tune
the RDT to extend its applicability to such datasets too, the concept of
approximate core is proposed along with the algorithm for its computation.
All possible reducts, denoted by R, are obtained as part of this computation.
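For a small decision table, the notions of reduct and core can be illustrated by brute force. This is a toy sketch, not the approximate-core algorithm proposed in the thesis: it enumerates attribute subsets, keeps the minimal consistent ones as reducts, and takes their intersection as the core. The decision table below is invented for illustration.

```python
from itertools import combinations

# A subset of condition attributes is a reduct if it preserves the
# consistency of the decision table (no two objects agree on the subset
# but disagree on the decision) and no proper subset does. The core is
# the intersection of all reducts. Brute-force illustration only; it
# does not scale, which is why approximate methods are needed.

def consistent(rows, attrs):
    """True if the attribute subset still determines the decision."""
    seen = {}
    for cond, dec in rows:
        key = tuple(cond[a] for a in attrs)
        if seen.setdefault(key, dec) != dec:
            return False
    return True

def reducts(rows, n_attrs):
    found = []
    # Enumerate by size, so every subset added here is minimal.
    for size in range(1, n_attrs + 1):
        for s in combinations(range(n_attrs), size):
            if consistent(rows, s) and not any(set(r) <= set(s) for r in found):
                found.append(s)
    return found

rows = [  # (condition attribute values, decision) -- invented data
    ((0, 0, 1), "no"),
    ((0, 1, 1), "yes"),
    ((1, 0, 0), "no"),
    ((1, 1, 0), "yes"),
]
R = reducts(rows, 3)
core = set.intersection(*(set(r) for r in R))
print(R, core)  # attribute 1 alone determines the decision here
```

In this table the middle attribute alone determines the decision, so R contains the single reduct {1} and the core is also {1}.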
4. The rough set based discretization offers additional utility of rough set theory
and compares well with the popular standard algorithm for handling continuous
attributes, namely C4.5. The experimental results obtained from using RDT
and its variants exhibited the potential to trade off between complexity and
accuracy.
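The boundary-point intuition behind rough set discretization can be sketched as follows, under the assumption that candidate cuts are placed midway between consecutive attribute values whose decision labels differ, which keeps the decision classes discernible. The exact algorithm used in the thesis may differ, and the sample values below are illustrative.

```python
# Sketch of boundary-cut candidate generation for one continuous
# attribute. Cuts between equal-label neighbours are skipped, since
# they cannot help discern decision classes. Illustration only, not
# the thesis's discretization algorithm.

def boundary_cuts(values, labels):
    pairs = sorted(zip(values, labels))
    cuts = []
    for (v1, d1), (v2, d2) in zip(pairs, pairs[1:]):
        if d1 != d2 and v1 != v2:
            cuts.append((v1 + v2) / 2)  # midpoint between the two values
    return cuts

# Invented Iris-like sample: petal lengths and species labels.
petal_len = [1.4, 1.5, 1.3, 4.5, 4.1, 5.9, 6.1]
species = ["setosa"] * 3 + ["versicolor"] * 2 + ["virginica"] * 2
print(boundary_cuts(petal_len, species))
```

Only two cuts survive (near 2.8 and 5.2), one per class boundary, so the continuous attribute collapses to three intervals without losing any discernibility.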
5. The dynamic RDT model is based on dynamic reducts and is observed to be
suitable for handling real-world datasets containing noise.
6. For the real-world prediction problem, the RDT algorithms are observed to
perform well. The resulting classifier obtained from the hybrid framework is
simple.
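The dynamic-reduct idea behind the dynamic RDT model can be illustrated by measuring how often an attribute subset remains a reduct across random subtables; subsets that stay reducts under sampling are more robust to noise. This is a toy sketch under that stability definition; the thesis's actual dynamic RDT algorithm is defined in earlier chapters, and the decision table is invented.

```python
import random
from itertools import combinations

# Stability of an attribute subset = fraction of sampled subtables in
# which it is still a reduct. Subsets with high stability ("dynamic
# reducts") are preferred when the data are noisy. Illustration only.

def consistent(rows, attrs):
    seen = {}
    for cond, dec in rows:
        key = tuple(cond[a] for a in attrs)
        if seen.setdefault(key, dec) != dec:
            return False
    return True

def reducts(rows, n_attrs):
    found = []
    for size in range(1, n_attrs + 1):
        for s in combinations(range(n_attrs), size):
            if consistent(rows, s) and not any(set(r) <= set(s) for r in found):
                found.append(s)
    return found

def stability(rows, n_attrs, subset, samples=50, frac=0.7):
    hits = 0
    for _ in range(samples):
        sub = random.sample(rows, int(frac * len(rows)))  # random subtable
        if subset in reducts(sub, n_attrs):
            hits += 1
    return hits / samples

random.seed(0)
rows = [((0, 0, 1), "no"), ((0, 1, 1), "yes"), ((1, 0, 0), "no"),
        ((1, 1, 0), "yes"), ((1, 1, 1), "yes"), ((0, 0, 0), "no")]
print(stability(rows, 3, (1,), samples=20))  # 1.0: reduct in every subtable
```

Here attribute 1 determines the decision in every row, so the subset {1} remains a reduct in every sampled subtable and its stability is 1.0; a noisy attribute would score lower.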
Our experiments in the dissertation suggest that RDT or one of its variants,
i.e. RDTcoreu, DJU, DJP, RJU, RJP, DRJU or DRJP, outperforms the other base
algorithms. The results indicate the suitability of the RDT model to yield a simple
classifier. Future work includes exploring the potential of rough set theory to
hybridize it with other conventional techniques and applying the model to more
datasets from real-world domains. Experimental results obtained from real-world
data encourage predictive modelling using RDT, and more applications for
agricultural domains may be identified. A systematic comparison of statistical
methods for prediction with RDT and its variants, or with other machine learning
algorithms, may increase the credibility base of the proposed RDT model.
List of Publications
1. Sonajharia Minz, Rajni Jain, Rough Set based Decision Tree model for
Classification, 1479, 2003
5. Sonajharia Minz, Rajni Jain, Refining Decision Tree Classifiers using
2005 (accepted)
6. Rajni Jain, Sonajharia Minz, P. Adhiguru, Rough Set based Decision Tree
7. Rajni Jain, Sonajharia Minz, Dynamic RDT Model for Data Mining, 2nd
2005 (submitted)