Agent Based Meta Learning in Distributed
Sanjay Kumar Sen, Dr Sujata Dash, Subrat P Pattanayak / International Journal of Engineering Research
and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 2, Issue 3, May-Jun 2012, pp. 342-348
[Diagram not reproduced. Recoverable labels: Data set 1, Machine Level algorithm, Classifier 1; Data set 3, Machine Level algorithm, Classifier 3.]
Figure – 1
3. These three base-level classifiers are used in the meta-level training to compute the final classifier.

2. By applying a set of algorithms at the base level and combining their results through a meta learner, information is extracted.
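The idea of training several base-level classifiers and combining them through a meta learner can be sketched as follows. This is a minimal illustration, not the authors' implementation: the base learner (a one-feature threshold rule), the combiner (a lookup table from base predictions to the true label), the function names `train_stump` and `train_meta`, and all data values are assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_stump(rows, feature):
    """Base learner (assumed for illustration): pick the threshold on one
    feature that classifies the most training rows correctly."""
    best = None
    for threshold in sorted({x[feature] for x, _ in rows}):
        correct = sum((x[feature] >= threshold) == bool(y) for x, y in rows)
        if best is None or correct > best[0]:
            best = (correct, threshold)
    threshold = best[1]
    return lambda x: int(x[feature] >= threshold)


def train_meta(base_classifiers, validation_rows):
    """Meta learner: map each tuple of base predictions to the majority
    true label observed for that tuple on the validation rows."""
    votes = defaultdict(Counter)
    for x, y in validation_rows:
        key = tuple(c(x) for c in base_classifiers)
        votes[key][y] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

    def meta(x):
        key = tuple(c(x) for c in base_classifiers)
        # Unseen prediction pattern: fall back to a plain majority vote.
        return table.get(key, int(sum(key) * 2 > len(key)))

    return meta


# Three toy data sets, one per base-level classifier (values are made up).
data_set_1 = [((1, 0, 0), 0), ((2, 0, 0), 0), ((8, 0, 0), 1), ((9, 0, 0), 1)]
data_set_2 = [((0, 1, 0), 0), ((0, 3, 0), 0), ((0, 7, 0), 1), ((0, 9, 0), 1)]
data_set_3 = [((0, 0, 2), 0), ((0, 0, 4), 0), ((0, 0, 6), 1), ((0, 0, 8), 1)]

base_classifiers = [
    train_stump(data_set_1, 0),
    train_stump(data_set_2, 1),
    train_stump(data_set_3, 2),
]

# Meta-level training data: labelled rows the base classifiers predict on.
validation = [
    ((9, 9, 0), 1),
    ((9, 8, 9), 1),
    ((9, 0, 9), 0),
    ((0, 9, 9), 0),
    ((0, 0, 0), 0),
]

meta = train_meta(base_classifiers, validation)
print(meta((9, 1, 9)), meta((9, 9, 1)), meta((0, 0, 9)))
```

In this toy data the meta learner differs from simple voting: for the input `(9, 1, 9)` two of the three base classifiers predict 1, yet the meta learner outputs 0 because the validation rows showed that this prediction pattern goes with label 0.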
Scalability
The data mining system is considered highly scalable if its performance does not degrade as the number of data sites increases. Scalability depends on the protocols that transfer and manage the intelligent agents that support the data sites.
[Diagram not reproduced. Recoverable labels: Data Site – 2 (B); Filtered Classifier; Data repository; Data site; Data base; BASE CLASSIFIERS: A.ID1, B.ID2, C.ID3; META CLASSIFIERS: B.CART; Data messages; Transfer learning & classifier agents.]
Figure 1.1. The JAM Architecture
Figure 1.1 shows an example architecture of a meta-learning system. Three data sites, A (Data site 1), B (Data site 2) and C (Data site 3), share their learning tasks by exchanging their local classifiers; the figure depicts the JAM system while the three sites exchange classifier agents. Here B has imported the RIPPER classifier agent from A and the ID3 classifier agent from C and combined them with its own classifier agent, CART, to form a local meta-classifier agent, BAYES. JAM operates under distributed protocols in which the participating database sites execute independently and also collaborate with other data sites.

A data mining system built on JAM addresses the problem of how to make a learning system evolve and adjust to its changing environment. In medicine, for example, the data held in a medical database (types of dosages, treatments and so on) change over time. Credit card data are another example: new security systems are introduced and new ways of committing fraud are devised. Traditional data mining systems are static and cannot adapt to such changes, whereas data mining systems built on JAM can adapt to a new environment. This adaptivity is achieved by employing meta-learning techniques to design learning systems capable of incorporating into their accumulated knowledge the new classifiers that capture patterns learned on new data sources.

6. Fraud and Intrusion Detection

A data mining system must be flexible enough to accommodate not only data and patterns that change from time to time, but also machine learning algorithms and data mining technology that change over time. JAM is designed using object-oriented methods so that it can be implemented independently of any particular machine learning program or any meta-learning or classifier-combining technique. The learning and meta-learning agents are designed as objects. JAM provides the definition of the parent agent class, and every instance agent (i.e. a program that implements a learning algorithm such as ID3 [16], CART [17] or Bayes [18]) is then defined as a subclass of this parent class.

The smooth operation of an electronic commerce system, e.g. an inter-banking network, requires that access be granted only to legitimate users through verification and authentication mechanisms, and that fraudulent activity be thwarted. To protect financial systems against such fraudulent activity, we
adopted a protection mechanism based on models of errant transaction behaviour: pattern-directed inference systems that can forewarn of impending threats. Using JAM, we can compute fraud models by analysing a huge, distributed database. Local fraud detection agents are computed first; each detects fraudulent transactions and intrusions within a single financial corporation. JAM then integrates them into a meta-learning system that combines the collective knowledge acquired by the individual local agents. The meta-learned system is constructed by sharing models of fraudulent transactions through the exchange of classifier agents over a secured agent infrastructure. The resulting meta-classifier agents act as sentries, forewarning of possibly fraudulent transactions and threats by inspecting, classifying and labelling every transaction.

7. Detection of Fraud Label Approach

Step 1: There are N banks in total. Each bank Bi has its own database Di and combines its training data Ti with its own learning agent to produce a classifier ci.

Step 2: Each bank sends its local classifier agent ci to a central data repository, which holds the global training database T of all banks, i.e. T = T1 ∪ T2 ∪ … ∪ TN.

Step 3: In the central repository, a learning algorithm takes the training data T and the classifiers ci of all banks and produces the meta classifier C.

Step 4: The meta classifier (or global classifier) C is sent to all banks, where it acts as a data filter that marks transactions with a fraud label.

8. Evaluation

[...] are generated. There is a central repository database, called DB, where the training data of the three data sites (three banks) and the three classifiers are stored. In the central repository the three classifiers C1, C2 and C3, along with the three databases, are combined by a learning algorithm to generate a new classifier C, also called the global classifier. This global classifier C is sent to the database sites of the three banks, i.e. DB1, DB2 and DB3, where it acts as a data filter that marks transactions with a fraud label.

9. Conclusion and Future Work

The main objective of a distributed data mining system is to combine and extract information and patterns from remote data that are distributed across multiple databases. To fulfil this objective and to discover descriptive patterns, classifiers are computed by applying various machine learning programs. In this paper we aim to find useful information efficiently and more accurately in huge, distributed databases. To achieve this we applied an agent-based meta-learning system called JAM. Various machine learning programs are implemented in the meta-learning system to compute combined classifier models for smooth and effective mining of large databases. For JAM in a distributed data mining system, we have also discussed issues such as scalability and efficiency. Scalability means that performance does not degrade as the number of data sites increases, which depends on the protocols that transfer and manage the intelligent agents supporting the data sites. Efficiency is achieved by making effective use of the available system resources. We have also discussed an approach to fraud detection. This paper provides only an overview of distributed data mining systems, not a detailed exposition of techniques; extensive further research on all of these issues is required.
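The four steps of the fraud-label approach in Section 7, together with the parent-class and subclass agent design described in Section 6, can be sketched as follows. This is a hedged illustration only, not JAM's actual code or API: the class names (`LearningAgent`, `AmountThresholdAgent`, `NightHoursAgent`, `MetaClassifierAgent`), the two toy learning rules, the fallback for unseen prediction patterns, and every transaction value are assumptions invented for the example.

```python
from abc import ABC, abstractmethod
from collections import Counter, defaultdict


class LearningAgent(ABC):
    """Hypothetical parent agent class; every concrete learner subclasses it."""

    @abstractmethod
    def train(self, rows): ...

    @abstractmethod
    def classify(self, transaction): ...


class AmountThresholdAgent(LearningAgent):
    """Toy learner: flag amounts at or above the smallest fraudulent amount seen."""

    def train(self, rows):
        fraud = [t["amount"] for t, label in rows if label == 1]
        self.threshold = min(fraud) if fraud else float("inf")
        return self

    def classify(self, transaction):
        return int(transaction["amount"] >= self.threshold)


class NightHoursAgent(LearningAgent):
    """Toy learner: flag hours in which fraud outnumbered legitimate traffic."""

    def train(self, rows):
        counts = defaultdict(Counter)
        for t, label in rows:
            counts[t["hour"]][label] += 1
        self.bad_hours = {h for h, c in counts.items() if c[1] > c[0]}
        return self

    def classify(self, transaction):
        return int(transaction["hour"] in self.bad_hours)


class MetaClassifierAgent(LearningAgent):
    """Combiner: learns which pattern of local predictions goes with which label."""

    def __init__(self, local_agents):
        self.local_agents = local_agents
        self.table = {}

    def _key(self, transaction):
        return tuple(a.classify(transaction) for a in self.local_agents)

    def train(self, rows):
        votes = defaultdict(Counter)
        for t, label in rows:
            votes[self._key(t)][label] += 1
        self.table = {k: c.most_common(1)[0][0] for k, c in votes.items()}
        return self

    def classify(self, transaction):
        key = self._key(transaction)
        # Assumption: for an unseen pattern, flag if any local agent flags.
        return self.table.get(key, int(any(key)))


# Step 1: each bank Bi trains a local classifier ci on its own data Ti.
T1 = [({"amount": 20, "hour": 10}, 0), ({"amount": 50, "hour": 14}, 0),
      ({"amount": 900, "hour": 3}, 1)]
T2 = [({"amount": 40, "hour": 12}, 0), ({"amount": 500, "hour": 2}, 1)]
T3 = [({"amount": 60, "hour": 3}, 1), ({"amount": 80, "hour": 15}, 0),
      ({"amount": 70, "hour": 3}, 1), ({"amount": 300, "hour": 16}, 0)]
c1 = AmountThresholdAgent().train(T1)
c2 = AmountThresholdAgent().train(T2)
c3 = NightHoursAgent().train(T3)

# Step 2: the classifiers are sent to the central repository, which holds
# the combined training data T = T1 ∪ T2 ∪ T3.
T = T1 + T2 + T3

# Step 3: a learning algorithm combines T and the ci into the meta classifier C.
C = MetaClassifierAgent([c1, c2, c3]).train(T)

# Step 4: C is sent back to the banks and used as a filter on new transactions.
new_transactions = [{"amount": 25, "hour": 13}, {"amount": 950, "hour": 4},
                    {"amount": 45, "hour": 3}]
labels = [C.classify(t) for t in new_transactions]
print(labels)
```

Because every agent shares the `train`/`classify` interface of the parent class, the meta classifier can combine local agents without knowing which learning algorithm each one implements, which is the independence property the object-oriented design is meant to give.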
[4] Shailesh S. Dhok. "Credit Card Fraud Detection Using Hidden Markov Model". International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume 2, Issue 1, March 2012.

[5] Distributed learning with bagging-like performance, Pattern Recognition Letters 24, 455–471. Datta, S., Bhaduri, K., Giannella, C., Wolff, R. and Kargupta, H., 2006. Distributed data mining in peer-to-peer networks, IEEE Internet Computing 10(4), 18–26.

[6] Ferri, C., Flach, P. and Hernández-Orallo, J., 2004. Delegating classifiers, ICML '04: Proceedings of the 21st International Conference on Machine Learning, ACM, New York, NY, USA, pp. 289–296.

[7] Abhinav Srivastava, Amlan Kundu, Shamik Sural, and Arun K. Majumdar. "Credit Card Fraud Detection Using Hidden Markov Model". IEEE Transactions on Dependable and Secure Computing, Vol. 5, No. 1, January-March 2008.

[8] R. Grossman, S. Baily, S. Kasif, D. Mon, and A. Ramu. The preliminary design of Papyrus: A system for high performance. In P. Chan and H. Kargupta, editors, Work. Notes KDD-98 Workshop on Distributed Data Mining, pages 37–43. AAAI Press, 1998.

[9] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. AAAI Press/MIT Press, Menlo Park, California / Cambridge, Massachusetts / London, England, 1996.

[10] K. Lang. NewsWeeder: Learning to filter net news. In A. Prieditis and S. Russell, editors, Proc. 12th Intl. Conf. Machine Learning, pages 331–339. Morgan Kaufmann, 1995.

[11] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. Chan. Credit card fraud detection using meta-learning: Issues and initial results. In Working notes of AAAI Workshop on AI Approaches to Fraud Detection and Risk Management, 1997.

[12] P. Chan and S. Stolfo. Sharing learned models among remote database partitions by local meta-learning. In Proc. Second Intl. Conf. Knowledge Discovery and Data Mining, pages 2–7, 1996.

[13] D. Pomerleau. Neural network perception for mobile robot guidance. PhD thesis, School of Computer Sc., Carnegie Mellon Univ., Pittsburgh, PA, 1992. (Tech. Rep. CMU-CS-92-115).

[14] W. Lee, S. Stolfo, and K. Mok. Mining audit data to build intrusion models. In R. Agrawal, P. Stolorz, and G. Piatetsky-Shapiro, editors, Proc. Fourth Intl. Conf. Knowledge Discovery and Data Mining, pages 66–72. AAAI Press, 1998.

[15] P. Chan and S. Stolfo. Meta-learning for multistrategy and parallel learning. In Proc. Second Intl. Work. Multistrategy Learning, pages 150–165, 1993.

[16] J. R. Quinlan. Induction of decision trees. Machine Learning, 1:81–106, 1986.

[17] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth, Belmont, CA, 1984.

[18] R. Duda and P. Hart. Pattern classification and scene analysis. Wiley, New York, NY, 1973.

[19] Kargupta H., Park B., Hershberger D., Johnson E., Collective Data Mining: A New Perspective Toward Distributed Data Analysis. Accepted in Advances in Distributed Data Mining, H. Kargupta and P. Chan (eds.), AAAI/MIT Press, (1999).

[20] Zhang X., Lam C., Cheung W.K., Mining Local Data Sources For Learning Global Cluster Model Via Local Model Exchange. IEEE Intelligent Informatics Bulletin, 4(2) (2004).

[21] Prodromidis A., Chan P.K., Stolfo S.J., Meta-learning in Distributed Data Mining Systems: Issues and Approaches. In: Advances in Distributed and Parallel Knowledge Discovery, Kargupta H., Chan P. (eds.), AAAI/MIT Press, Chapter 3, (2000).

[22] Zhang S., Wu X., Zhang C., Multi-Database Mining. IEEE Computational Intelligence Bulletin, 2(1) (2003).