Classification Algorithm in Data Mining: An Overview
I. INTRODUCTION
…the instance space into two or more sub-spaces according to a certain discrete function of the input attribute values.
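The splitting idea in the fragment above can be made concrete with a one-level split (a decision stump). This is only an illustrative sketch; the attribute, threshold, and data below are assumptions, not taken from the paper.

```python
def split_instances(samples, attribute, threshold):
    """Partition the instance space into two sub-spaces using a
    discrete test on one input attribute (here: value <= threshold)."""
    left = [s for s in samples if s[attribute] <= threshold]
    right = [s for s in samples if s[attribute] > threshold]
    return left, right

# Illustrative samples, each described by its attribute values
samples = [{"age": 25, "label": "yes"}, {"age": 40, "label": "no"},
           {"age": 31, "label": "no"},  {"age": 22, "label": "yes"}]
left, right = split_instances(samples, "age", threshold=30)
print([s["label"] for s in left], [s["label"] for s in right])  # ['yes', 'yes'] ['no', 'no']
```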
K-Nearest Neighbor Classifiers (KNN)

K-nearest neighbor classifiers are based on learning by analogy. The training samples are described by n numeric attributes, so each sample represents a point in an n-dimensional space. In this way, all of the training samples are stored in an n-dimensional pattern space. When given an unknown sample, a k-nearest neighbor classifier searches the pattern space for the k training samples that are closest to the unknown sample. "Closeness" is defined in terms of Euclidean distance, where the Euclidean distance between two points X = (x1, x2, ..., xn) and Y = (y1, y2, ..., yn) is

$d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
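The paper stops at the distance definition, but the search-and-vote procedure just described fits in a few lines of Python. This is a minimal sketch; the dataset and the choice of k are assumptions for illustration.

```python
import math
from collections import Counter

def euclidean(x, y):
    # d(X, Y) = sqrt(sum_i (x_i - y_i)^2), as defined above
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def knn_classify(training, unknown, k=3):
    """Assign the majority class among the k training samples
    closest (in Euclidean distance) to the unknown sample."""
    neighbors = sorted(training, key=lambda s: euclidean(s[0], unknown))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Illustrative 2-dimensional training data: (point, class label)
training = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
            ((4.0, 4.2), "B"), ((3.8, 4.0), "B")]
print(knn_classify(training, (1.1, 0.9), k=3))  # -> "A"
```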
Bayesian Networks
A Bayesian network (BN) consists of a
directed, acyclic graph and a probability
distribution for each node in that graph given its
immediate predecessors [7]. A Bayes network classifier is based on a Bayesian network which represents a joint probability distribution over a set of categorical attributes. It consists of two parts: a directed acyclic graph G of nodes and arcs, and a set of conditional probability tables. The nodes represent attributes, whereas the arcs indicate direct dependencies.
The density of the arcs in a BN is one measure of its complexity. Sparse BNs can represent simple probabilistic models (e.g., naïve Bayes models and hidden Markov models), whereas dense BNs can capture highly complex models. Thus, BNs provide a flexible method for probabilistic modelling.
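To illustrate how the graph and the conditional probability tables together define a joint distribution, here is a minimal sketch of a hypothetical two-node network (Rain → WetGrass). The structure and probabilities are invented for the example, not taken from the paper.

```python
# Hypothetical BN: Rain -> WetGrass, with illustrative CPTs.
p_rain = {True: 0.2, False: 0.8}                    # P(Rain)
p_wet_given_rain = {True: {True: 0.9, False: 0.1},  # P(WetGrass | Rain)
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    # Chain rule over the DAG: P(Rain, WetGrass) = P(Rain) * P(WetGrass | Rain)
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# P(WetGrass = True), marginalising out Rain
p_wet = sum(joint(r, True) for r in (True, False))
print(p_wet)  # 0.2*0.9 + 0.8*0.2 ≈ 0.34
```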
Neural Network

An artificial neural network (ANN), often just called a "neural network" (NN), is a mathematical or computational model based on biological neural networks; in other words, it is an emulation of a biological neural system. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.
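The connectionist computation described above bottoms out in individual artificial neurons. As a minimal sketch, one neuron's forward pass is a weighted sum of its inputs followed by an activation function; the sigmoid used here, and the weights and inputs, are illustrative assumptions.

```python
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias, passed through a sigmoid
    # activation: the classic model of a single artificial neuron.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative values for a 3-input neuron
print(neuron([0.5, 0.1, 0.9], [0.4, -0.6, 0.2], bias=0.05))
```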
Support Vector Machine (SVM)

…where t is the number of training examples, and $\alpha_i$, i = 1, ..., t, are non-negative numbers such that the derivatives of $L_P$ with respect to $\alpha_i$ are zero. The $\alpha_i$ are the Lagrange multipliers and $L_P$ is called the Lagrangian. In this equation, the vector $w$ and the constant $b$ define the hyperplane. A learning machine, such as the SVM, can be modeled as a function class based on some parameters. Different function classes can have different capacity in learning, which is represented by a parameter h known as the VC dimension. The VC dimension measures the maximum number of training examples for which the function class can still learn perfectly, obtaining zero error rates on the training data, for any assignment of class labels on these points. It can be proven that the actual error on future data is bounded by a sum of two terms: the first term is the training error, and the second term is proportional to the square root of the VC dimension h. Thus, if we can minimize h, we can minimize the future error, as long as we also minimize the training error. SVM can also be extended to perform numerical calculations.

One of the initial drawbacks of SVM is its computational inefficiency. However, this problem is being solved with great success. One approach is to break a large optimization problem into a series of smaller problems, where each problem only involves a couple of carefully chosen variables so that the optimization can be done efficiently. The process iterates until all the decomposed problems are solved.
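The decomposition strategy just described is what SMO-style solvers implement. As a hedged usage sketch, scikit-learn's SVC (which is backed by such a solver) can be applied as follows; the data and parameters are illustrative assumptions.

```python
# SVC in scikit-learn uses an SMO-style decomposition solver of the
# kind described above. Data and parameters here are illustrative.
from sklearn.svm import SVC

X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]  # training points
y = [0, 0, 1, 1]                                       # class labels
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.predict([[0.1, 0.0], [1.0, 0.9]]))           # -> [0 1]
```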
III. CONCLUSIONS
Data mining offers promising ways to uncover hidden patterns within large amounts of data. These hidden patterns can potentially be used to predict future behaviour. The availability of new data mining algorithms, however, should be met with caution. First of all, these techniques are only as good as the data that has been collected. Good data is the first requirement for good data exploration. Assuming good data is available, the next step is to
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]