

Sanjay Kumar Sen, Dr Sujata Dash, Subrat P Pattanayak / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com
Vol. 2, Issue 3, May-Jun 2012, pp. 342-348

AGENT BASED META LEARNING IN DISTRIBUTED DATA MINING SYSTEM

1 Sanjay Kumar Sen, 2 Dr Sujata Dash, 3 Subrat P Pattanayak
1 Faculty (Comp Sc & Engg.), BEC, Bhubaneswar
2 Principal, KMBB College, Bhubaneswar
3 Faculty (Comp Application), RIMS, Rourkela

ABSTRACT
Data mining technology is used to identify patterns and extract information from huge quantities of data. The conventional approach stores all data in a single repository at a central site and applies data mining algorithms to that database to extract patterns, which is clearly implausible and untenable for many realistic problems and databases. Dealing with such complex systems has revealed opportunities to improve distributed data mining systems in a number of ways. Furthermore, in certain situations, data may be inherently distributed and cannot be merged into a single database for a variety of reasons including security, fault tolerance, legal constraints and competitive reasons. In such cases, it may not be possible to examine all of the data at a central processing site to compute a single global model. Here, we develop techniques that scale up to large and possibly physically distributed databases. Meta-learning (learning from learned knowledge) is a technique that deals with the problem of computing a global classifier from large and inherently distributed databases. This paper describes meta-learning and the JAM system (Java Agents for Meta-learning), an agent-based meta-learning system for large-scale data mining applications. Several important desiderata of data mining systems are addressed (i.e., scalability, efficiency, portability, compatibility, adaptivity, extensibility and effectiveness), and a combination of AI-based methods and distributed systems techniques is presented. We applied JAM to the real-world data mining task of modeling and detecting credit card fraud with notable success.

Keywords: compatibility, distributed data mining, global classifier, meta-learning, portability, scalability.

1. Introduction
Data mining systems aim to discover patterns and extract useful information from facts recorded in databases. One means of acquiring knowledge from databases is to apply various machine learning algorithms that compute descriptive representations of the data as well as patterns that may be exhibited in the data. Most of the current generation of learning algorithms, however, are computationally complex and require all data to be resident in main memory, which is clearly untenable for many realistic problems and databases. Furthermore, in certain situations, data may be inherently distributed and cannot be merged into a single database for a variety of reasons including security, fault tolerance, legal constraints and competitive reasons. In such cases, it may not be possible to examine all of the data at a central processing site to compute a single global model. Traditional data analysis methods that require humans to process large data sets are completely inadequate. Applying traditional data mining tools to discover knowledge from distributed data sources might not be possible [19]. Hence, knowledge discovery from multi-databases has become an important research field and is considered to be a more complex and difficult task than knowledge discovery from mono-databases [22]. The relatively new field of Knowledge Discovery and Data Mining (KDD) has emerged to compensate for these deficiencies. Knowledge discovery in databases denotes the complex process of identifying valid, novel, potentially useful and ultimately understandable patterns in data [8]. Data mining refers to a particular step in the KDD process. According to the most recent and broad definition [8], “data mining consists of particular algorithms (methods) that, under acceptable computational efficiency limitations, produce a particular enumeration of patterns (models) over the data.” A common methodology for distributed machine learning and data mining has two stages: first performing local data analysis and then combining the local results to form the global one [20]. For example, in [21], a meta-learning process was proposed as an additional learning process for combining a set of locally learned classifiers
(decision trees in particular) into a global classifier. The Hidden Markov Model (HMM) is a statistical tool used by engineers and scientists, and credit card fraud can be detected with an HMM during transactions [4]. Srivastava, Kundu, Sural and Majumdar [7] have shown a credit card fraud detection system that uses an HMM to model a cardholder's spending habits without requiring fraud signatures. They have also suggested a method for finding the spending profile of cardholders, as well as for applying this knowledge in deciding the values of the observation symbols and the initial estimates of the model parameters. Machine-learning algorithms have been deployed in detecting credit card fraud [11], in steering vehicles driving autonomously on public highways at 70 miles an hour [13], and in computing customized electronic newspapers [10], to name a few applications. Adnan M. Al-Khatib [3] aims “to present a high accuracy method or prototype to detect Card-Not-Present (CNP) Fraudulent transactions in the e-payment systems” by integrating data from multiple databases (e.g., bank transactions, federal/state crime history DBs) and then using suitable and effective data mining and artificial intelligence (AI) tools to find unusual access sequences. Recently, data mining techniques have been successfully applied to intrusion detection in network-based systems [14].

The main focus of this paper is on the management of machine learning programs with the capacity to travel between computer sites to mine the local data. The term “management” denotes the ability to dispatch and exchange such programs across data sites, but also the potential to control, evaluate, filter, resolve compatibility problems and combine their products (which can themselves be intelligent agents). Data mining refers to the process of automatically or semi-automatically extracting novel, useful, and understandable pieces of information (e.g., patterns, rules, regularities, constraints) from data in large databases. One way of acquiring knowledge from databases is to apply various machine learning algorithms that search for patterns that may be exhibited in the data and compute descriptive representations. Machine learning and classification techniques have been successfully applied to many problems in diverse areas. Although the field of machine learning has made substantial progress over the past few years, both empirically and theoretically, one of the continuing challenges is the development of inductive learning techniques that effectively scale up to large and possibly physically distributed data sets. This paper investigates data mining techniques that scale up to large and physically distributed databases. In this respect, we incorporate some additional features into JAM (Java Agents for Meta-learning), an agent-based distributed data mining system that supports the remote dispatch and exchange of learning agents across multiple data sources and employs meta-learning techniques to combine the separately learned models into a higher-level representation.

2. Meta-learning
Meta-learning [19] is loosely defined as learning from learned knowledge. Meta-learning is a recent technique that seeks to compute higher-level models, called meta-classifiers, that integrate in some principled fashion the information gleaned by the separately learned classifiers to improve predictive performance. In the meta-learning process, a number of learning programs are executed on a number of data subsets in parallel, and the results are then collected in the form of classifiers.
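
The parallel step described above can be illustrated with a minimal Python sketch. It is not taken from JAM: the learner `train_majority_stump`, the round-robin split and the toy transaction rows are all hypothetical, and threads stand in for the separate machines or sites that a real deployment would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_majority_stump(rows):
    """Hypothetical base learner: for each value of feature 0, predict the
    label most often seen with that value in this subset."""
    by_value = {}
    for features, label in rows:
        by_value.setdefault(features[0], Counter())[label] += 1
    table = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    return lambda features: table.get(features[0], default)


def split_into_subsets(rows, k):
    """Deal the rows round-robin into k disjoint subsets."""
    return [rows[i::k] for i in range(k)]


def learn_in_parallel(rows, k):
    """Run the same learner on k subsets concurrently and collect k classifiers."""
    subsets = split_into_subsets(rows, k)
    with ThreadPoolExecutor(max_workers=k) as pool:
        return list(pool.map(train_majority_stump, subsets))


# Toy data (an assumption): one feature, the transaction amount band.
data = [(("high",), "fraud"), (("low",), "ok"), (("high",), "fraud"),
        (("low",), "ok"), (("high",), "fraud"), (("low",), "ok")]
classifiers = learn_in_parallel(data, k=3)
```

Each element of `classifiers` is a callable trained on only one third of the rows, so no single learner needs the whole data set in memory.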


Figure 1. A simplified meta-learning scenario. [Diagram: data sets 1, 2 and 3 are each passed to a machine-level learning algorithm, producing Classifier 1, Classifier 2 and Classifier 3; a meta-learning algorithm combines these into the final classifier.]

From Figure 1, the different stages of a simplified meta-learning scenario are:

1. There are three data sets, namely data set 1, data set 2 and data set 3, which are called base-level training data sets.

2. A machine learning program is executed on each of these data sets to produce three base classifiers.

3. These three base-level classifiers are then combined by a meta-level learning algorithm, trained on meta-level training data, to compute the final classifier.

3. Meta-learning Techniques

In this section we describe previous research in meta-learning and in particular address the following specific research issues:

1. Datasets are characterized using a variety of statistical and information-theoretic measures.

2. Information is extracted by applying a set of algorithms at the base level and combining them through a meta-learner.

3. Knowledge is extracted through a continuous learner to accelerate the learning process.
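
To make the idea of combining base-level algorithms through a meta-learner concrete, the following Python sketch follows the three stages of the simplified scenario. It is only an illustration under stated assumptions: the toy learner `train_stump`, the lookup-table meta-learner `train_meta` and the two-feature transaction rows are hypothetical and are not the algorithms used in JAM.

```python
from collections import Counter


def train_stump(rows, feature):
    """Hypothetical base learner: majority label per value of one feature."""
    counts = {}
    for x, y in rows:
        counts.setdefault(x[feature], Counter())[y] += 1
    table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    default = Counter(y for _, y in rows).most_common(1)[0][0]
    return lambda x: table.get(x[feature], default)


def train_meta(base_classifiers, meta_rows):
    """Meta-learner: map each tuple of base predictions to the majority
    true label observed for that tuple in the meta-level training data."""
    counts = {}
    for x, y in meta_rows:
        key = tuple(c(x) for c in base_classifiers)
        counts.setdefault(key, Counter())[y] += 1
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}

    def final_classifier(x):
        key = tuple(c(x) for c in base_classifiers)
        if key in table:
            return table[key]
        # Unseen combination of predictions: fall back to a majority vote.
        return Counter(key).most_common(1)[0][0]

    return final_classifier


# Stage 1: three base-level training sets; features are (amount, location).
set1 = [(("high", "abroad"), "fraud"), (("high", "abroad"), "fraud"),
        (("high", "home"), "ok"), (("low", "home"), "ok")]
set2 = [(("high", "abroad"), "fraud"), (("high", "abroad"), "fraud"),
        (("low", "abroad"), "ok"), (("low", "home"), "ok")]
set3 = [(("high", "abroad"), "fraud"), (("low", "abroad"), "ok"),
        (("low", "home"), "ok")]

# Stage 2: one base classifier per data set.
base = [train_stump(set1, 0), train_stump(set2, 1), train_stump(set3, 0)]

# Stage 3: meta-level training data yields the final classifier.
meta_rows = [(("high", "abroad"), "fraud"), (("high", "home"), "ok"),
             (("low", "abroad"), "ok"), (("low", "home"), "ok")]
final = train_meta(base, meta_rows)
```

In this toy setting two of the three base classifiers label a high-amount home transaction as fraud, yet the meta-learner has seen that this combination of predictions corresponds to a legitimate transaction, which is the kind of correction a simple vote cannot make.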


Meta-learning (learning from learned knowledge) is a technique that deals with the problem of computing a global classifier from large and inherently distributed databases. A number of independent classifiers, the “base classifiers”, are computed in parallel. The base classifiers are then collected and combined into a “meta-classifier” by another learning process. Meta-classifiers can be defined recursively as collections of classifiers structured in multi-level trees [12]. Such structures, however, can be unnecessarily complex, meaning that many classifiers may be redundant, wasting resources and reducing system throughput.

• The predictive accuracy of base classifiers is improved.
• Assuming that a system consists of several databases interconnected through an intranet or internet, the goal is to provide the means for each data site to utilize its own local data and, at the same time, benefit from the data that is available at other data sites without transferring or directly accessing that data.

4. Advantages of Meta-learning

Meta-learning executes the same base learning process in parallel on subsets of the training data set, which improves efficiency: running the same serial program in parallel reduces the execution time. Another advantage is that learning takes place on small subsets of data, which can easily be accommodated in main memory, instead of on the huge full data set. Meta-learning combines different learning systems, each having a different inductive bias; as a result, predictive performance is increased. A higher-level learned model is derived by combining the separately learned concepts. Meta-learning constitutes a scalable machine learning method because it generalizes to hierarchical multi-level meta-learning. Also, most of these algorithms generate classifiers by applying the same algorithms to different databases.

Scalability
The data mining system is highly scalable because its performance does not degrade as the number of data sites increases. Scalability depends on the protocols that transfer and manage the intelligent agents to support the data sites.

Efficiency
Efficiency refers to the effective use of the available system resources. It depends on the appropriate evaluation and filtering of the available agents, which minimizes redundancy.

5. JAM (Java Agents for Meta-Learning):

The meta-learning system is implemented by JAM (Java Agents for Meta-learning), a distributed agent-based data mining system. It provides a set of learning agents that are used to compute classifier agents at each site. The classifier agents are launched and exchanged among all sites of the distributed data mining system, and a set of meta-learning agents combines the models computed at the different sites. We have achieved this goal through the implementation and demonstration of a system we call JAM (Java Agents for Meta-Learning). To our knowledge, JAM is the first system to date that employs meta-learning as a means to mine distributed databases. A commercial system based upon JAM has recently appeared [15].

The JAM Architecture:
First, local classifiers are computed at each local data site by executing learning agents. These locally computed classifiers are then exchanged between the sites and combined with each site's local classifiers through meta-learning agents. Each local data site is administered by a local configuration file, which is used to perform the learning and meta-learning tasks. To supervise the agent exchange and the execution of the meta-learning process, each data site is equipped with the GUI and animation facilities of JAM. After computing the base and meta-classifiers, the JAM system executes the modules for classification of the desired data sets. The configuration file manager (CFM) acts as a server that is responsible for keeping the state of the system up to date.
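
The exchange of classifier agents between data sites can be sketched in a few lines of Python. This is a hedged illustration, not JAM's implementation (JAM itself is written in Java): the class names `LearningAgent`, `ClassifierAgent`, `FeatureTableLearner` and `DataSite`, the toy rows, and the use of a simple majority vote in place of a learned meta-classifier are all assumptions made for brevity.

```python
from collections import Counter


class ClassifierAgent:
    """A trained model that can be shipped to other sites; it holds no raw data."""

    def __init__(self, origin, table, feature, default):
        self.origin = origin
        self.table = table
        self.feature = feature
        self.default = default

    def classify(self, x):
        return self.table.get(x[self.feature], self.default)


class LearningAgent:
    """Hypothetical parent class that every learning agent extends."""

    def learn(self, origin, rows):
        raise NotImplementedError


class FeatureTableLearner(LearningAgent):
    """Toy learner: majority label per value of one feature."""

    def __init__(self, feature):
        self.feature = feature

    def learn(self, origin, rows):
        counts = {}
        for x, y in rows:
            counts.setdefault(x[self.feature], Counter())[y] += 1
        table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        default = Counter(y for _, y in rows).most_common(1)[0][0]
        return ClassifierAgent(origin, table, self.feature, default)


class DataSite:
    def __init__(self, name, rows, learner):
        self.name = name
        self._rows = rows        # local data never leaves the site
        self.learner = learner
        self.local_agent = None
        self.repository = []     # classifier agents imported from peers

    def train_local(self):
        self.local_agent = self.learner.learn(self.name, self._rows)
        return self.local_agent

    def import_agent(self, agent):
        self.repository.append(agent)

    def classify(self, x):
        """Combine the local and imported agents by majority vote."""
        agents = [self.local_agent] + self.repository
        votes = Counter(agent.classify(x) for agent in agents)
        return votes.most_common(1)[0][0]


site_a = DataSite("A", [(("high", "abroad"), "fraud"), (("low", "home"), "ok")],
                  FeatureTableLearner(0))
site_b = DataSite("B", [(("high", "abroad"), "fraud"), (("low", "home"), "ok")],
                  FeatureTableLearner(1))
site_c = DataSite("C", [(("high", "home"), "fraud"), (("low", "abroad"), "ok")],
                  FeatureTableLearner(0))

for site in (site_a, site_b, site_c):
    site.train_local()

# Site B imports the classifier agents of A and C; only models are exchanged.
site_b.import_agent(site_a.local_agent)
site_b.import_agent(site_c.local_agent)
```

Only `ClassifierAgent` objects cross site boundaries here, which mirrors the stated goal of benefiting from remote data without transferring or directly accessing it.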


Figure 1.1. The JAM Architecture
[Diagram: Data Site 1 (A), Data Site 2 (B) and Data Site 3 (C), each with its own database, classifier repository and filtered data, connected by data messages and by the transfer of learning and classifier agents. The labels list base classifiers A.ID1, B.ID2 and C.ID3 and meta-classifier B.CART.]

The above figure is an example of the architecture of a meta-learning system. Three data sites, A (Data Site 1), B (Data Site 2) and C (Data Site 3), share their learning tasks by exchanging their local classifiers. Figure 1.1 depicts the JAM system with the three data sites A, B and C while they exchange their classifier agents. Here B has imported the RIPPER classifier agent from A and the ID3 classifier agent from C, and has combined them with its own classifier agent CART to form a local meta-classifier agent BAYES. JAM operates under distributed protocols in which the participating database sites execute independently and also collaborate with other data sites. A data mining system built on JAM addresses the problem of how to make a learning system evolve and adjust to its changing environment. For example, in medical science the data held in medical databases, such as the types of dosages and treatments, change over time; another example is credit card data, where new security systems are introduced and new ways to commit fraud are devised. Traditional data mining systems are static and cannot adapt to new conditions, whereas data mining systems with JAM are adaptive to new environments. This adaptivity in JAM is achieved by employing meta-learning techniques to design learning systems capable of incorporating into their accumulated knowledge the new classifiers that capture patterns learned from new data sources.

6. Fraud and Intrusion Detection

A data mining system must be flexible enough to accommodate not only data and patterns that change from time to time, but also machine learning algorithms and data mining technology that change over time. JAM is designed using object-oriented methods and can be implemented independently of any particular machine learning program or any meta-learning or classifier-combining technique. The learning and meta-learning agents are designed as objects. JAM provides the definition of the parent agent class, and every instance agent (i.e., a program that implements any of your favorite learning algorithms: ID3 [16], CART [17], Bayes [18], etc.) is then defined as a subclass of this parent class. An electronic commerce system, e.g. an inter-banking network, requires smooth operation and must grant access only to legitimate users through verification and authentication mechanisms. It must also thwart attempted fraudulent activity. To protect financial systems from fraudulent activities and threats, we

adopted a protection mechanism that uses models of errant transaction behaviour, consisting of pattern-directed inference systems that can detect and forewarn of impending threats. Using JAM, we can compute fraud models by analyzing huge and distributed databases. Local fraud detection agents are computed that can detect fraudulent transactions and intrusions within a single financial corporation; JAM then integrates them into a meta-learning system that combines the collective knowledge acquired by the individual local agents. JAM constructs the meta-learned system by sharing the models of fraudulent transactions through the exchange of classifier agents in a secured agent infrastructure. The meta-learning agent computes meta-classifiers that act as sentries, forewarning of possibly fraudulent transactions and threats by inspecting, classifying and labeling every transaction.

7. Detection of Fraud Label Approach

Step 1: Among N banks in total, each bank Bi, having its own database Di, uses its training data Ti together with its own learning agent to produce a classifier ci.

Step 2: Each bank then sends its local classifier agent ci to a central data repository, which holds the global training data T of all banks, i.e. T = T1 ∪ T2 ∪ … ∪ TN.

Step 3: In the central repository, a learning algorithm takes the training data T and the classifiers ci of all banks and produces the meta-classifier C.

Step 4: This meta-classifier, or global classifier, C is sent to all banks as a data filter to mark transactions with a fraud label.

8. Evaluation

Here we provide a database schema of several banks that hold transaction data sets. To capture fraudulent activities, the banks continuously analyse several pieces of information. The information on a customer contains the credit card number, the amount of the transaction, details of past transactions, the transaction date and time, the age of the account and card, the location of the transactor and of the transaction, confidential and proprietary fields, other credit card account information, the fraud label, etc. The approach can be described by a suitable illustration. Let us assume that there are three banks, namely “A”, “B” and “C”. Each of these banks has its own database site, namely DB1, DB2 and DB3. Taking the training data T1, T2 and T3 from the databases DB1, DB2 and DB3 respectively, and combining each with the bank's own learning algorithm, the classifiers C1, C2 and C3 are generated. There is a central repository database called DB where the training data of the three banks' data sites and the three classifiers are stored. In the central repository, the three classifiers C1, C2 and C3, along with the three sets of training data, are combined by a learning algorithm to generate a new classifier C, which is also called the global classifier. This global classifier C is sent to the three database sites of the three banks, i.e. DB1, DB2 and DB3, where it acts as a data filter to mark transactions with a fraud label.

9. Conclusion and future work

The main objective of a distributed data mining system is to extract and combine information and patterns from remote data that are distributed across multiple databases. To fulfil this objective and to discover the various descriptive patterns, classifiers are computed by applying various machine learning programs. In this paper we aim to find useful information efficiently and more accurately from huge and distributed databases. To achieve this objective we applied an agent-based meta-learning system called JAM to the distributed data mining system. Various machine learning programs are implemented in the meta-learning system to compute the combined classifier models for smooth and effective mining of large data sets. For the JAM system in a distributed data mining system, we have also discussed several issues such as scalability and efficiency. Scalability is shown by performance that does not degrade as the number of data sites increases, which depends on the protocols that transfer and manage the intelligent agents to support the data sites. Efficiency is achieved by effectively using the available system resources. A fraud detection approach has also been discussed. This paper has provided only an overview of the distributed data mining system, not a detailed exposition of the techniques, so extensive research on all of these issues is still required.

References

[1] Adnan M. Al-Khatib and Ezz Hattab; “Mining Fraudulent Transactions in e-payment Systems”; the 9th international conference (iiWAS 2007); 2007; pp. 179–189.

[2] Adnan M. Al-Khatib; “Mining Fraudulent Behavior in epayment Systems”; Ph.D. Dissertation; 2007.

[3] Adnan M. Al-Khatib, “Detect CNP Fraudulent Transactions”, World of Computer Science and Information Technology Journal (WCSIT), ISSN: 2221-0741, Vol. 1, No. 8, 326–332, 2011.

[4] Shailesh S. Dhok, “Credit Card Fraud Detection Using Hidden Markov Model”, International Journal of Soft Computing and Engineering (IJSCE), ISSN: 2231-2307, Volume-2, Issue-1, March 2012.

[5] Distributed learning with bagging-like performance, Pattern Recognition Letters 24, 455–471. (Datta et al., 2006) Datta, S., Bhaduri, K., Giannella, C., Wolff, R. and Kargupta, H., 2006. Distributed data mining in peer-to-peer networks, IEEE Internet Computing 10(4), 18–26.

[6] (Ferri et al., 2004) Ferri, C., Flach, P. and Hernández-Orallo, J., 2004. Delegating classifiers, ICML ’04: Proceedings of the 21st International Conference on Machine Learning, ACM, New York, NY, USA, pp. 289–296.

[7] Abhinav Srivastava, Amlan Kundu, Shamik Sural, Senior Member, IEEE, and Arun K. Majumdar, Senior Member, IEEE, “Credit Card Fraud Detection Using Hidden Markov Model”, IEEE Transactions on Dependable and Secure Computing, Vol. 5, No. 1, January-March 2008.

[8] R. Grossman, S. Baily, S. Kasif, D. Mon, and A. Ramu. The preliminary design of papyrus: A system for high performance. In P. Chan and H. Kargupta, editors, Work. Notes KDD-98 Workshop on Distributed Data Mining, pages 37–43. AAAI Press, 1998.

[9] U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. AAAI Press/MIT Press, Menlo Park, California / Cambridge, Massachusetts / London, England, 1996.

[10] K. Lang. NewsWeeder: Learning to filter net news. In A. Prieditis and S. Russell, editors, Proc. 12th Intl. Conf. Machine Learning, pages 331–339. Morgan Kaufmann, 1995.

[11] S. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. Chan. Credit card fraud detection using meta-learning: Issues and initial results. In Working notes of AAAI Workshop on AI Approaches to Fraud Detection and Risk Management, 1997.

[12] P. Chan and S. Stolfo. Sharing learned models among remote database partitions by local meta-learning. In Proc. Second Intl. Conf. Knowledge Discovery and Data Mining, pages 2–7, 1996.

[13] D. Pomerleau. Neural network perception for mobile robot guidance. PhD thesis, School of Computer Sc., Carnegie Mellon Univ., Pittsburgh, PA, 1992. (Tech. Rep. CMU-CS-92-115).

[14] W. Lee, S. Stolfo, and K. Mok. Mining audit data to build intrusion models. In R. Agrawal, P. Stolorz, and G. Piatetsky-Shapiro, editors, Proc. Fourth Intl. Conf. Knowledge Discovery and Data Mining, pages 66–72. AAAI Press, 1998.

[15] P. Chan and S. Stolfo. Meta-learning for multistrategy and parallel learning. In Proc. Second Intl. Work. Multistrategy Learning, pages 150–165, 1993.

[16] J. R. Quinlan. Induction of decision trees. Machine Learning, 1:81–106, 1986.

[17] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth, Belmont, CA, 1984.

[18] R. Duda and P. Hart. Pattern classification and scene analysis. Wiley, New York, NY, 1973.

[19] Kargupta H., Park B., Hershberger D., Johnson E., Collective Data Mining: A New Perspective Toward Distributed Data Analysis. Accepted in the Advances in Distributed Data Mining, H. Kargupta and P. Chan (eds.), AAAI/MIT Press, (1999).

[20] Zhang X., Lam C., Cheung W.K., Mining Local Data Sources For Learning Global Cluster Model Via Local Model Exchange. IEEE Intelligent Informatics Bulletin, 4(2) (2004).

[21] Prodromidis A., Chan P.K., Stolfo S.J., Meta-learning in Distributed Data Mining Systems: Issues and Approaches. In: Advances in Distributed and Parallel Knowledge Discovery, Kargupta H., Chan P. (eds.), AAAI/MIT Press, Chapter 3, (2000).

[22] Zhang S., Wu X., Zhang C., Multi-Database Mining. IEEE Computational Intelligence Bulletin, 2(1) (2003).
