Pattern Recognition Techniques in AI
International Journal of Computer Science and Telecommunications [Volume 3, Issue 8, August 2012]
ISSN 2047-3338

1Seema Asht and 2Rajeshwar Dass
1,2Department of Electronics and Communication Engineering, DCRUST, Murthal, Sonepat, India
1[email protected], [email protected]
Abstract– Pattern Recognition has attracted the attention of researchers in the last few decades as a machine learning approach due to its widespread application areas. The application areas include medicine, communications, automation, military intelligence, data mining, bioinformatics, document classification, speech recognition, business and many others. In this review paper, various approaches to Pattern Recognition are presented together with their pros and cons and application-specific paradigms. On the basis of this survey, pattern recognition techniques can be categorized into six parts: Statistical Techniques, Structural Techniques, Template Matching, the Neural Network Approach, Fuzzy Models and Hybrid Models.

Index Terms– Pattern Recognition, Statistical Pattern Recognition, Structural Pattern Recognition, Neural Networks and Fuzzy Sets

I. INTRODUCTION

…and pattern classes available in the data. This information or knowledge about the data is used for further processing. The third step in pattern recognition is classification; its purpose is to decide the category of new data on the basis of the knowledge received from the data analysis process. The data set presented to a Pattern Recognition system is divided into two sets: a training set and a testing set. The system learns from the training set, and its efficiency is checked by presenting the testing set to it. The performance of pattern recognition techniques is influenced mainly by three elements: (i) the amount of data, (ii) the technology (method) used, and (iii) the designer and the user. The challenging job in pattern recognition is to develop systems capable of handling massive amounts of data. The various models adopted for pattern recognition are: Statistical Techniques, Structural Techniques, Template Matching, Neural Network based techniques, Fuzzy Models and Hybrid Models.
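The training/testing protocol described above can be sketched with a minimal nearest-centroid classifier; the classifier, the synthetic data and all function names here are illustrative assumptions, not material from the paper:

```python
import numpy as np

def split(X, y, frac=0.8, seed=0):
    """Shuffle and divide a data set into a training set and a testing set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(frac * len(X))
    tr, te = idx[:cut], idx[cut:]
    return X[tr], y[tr], X[te], y[te]

def fit_centroids(X, y):
    """Learning step: store one centroid per pattern class."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def classify(centroids, X):
    """Assign each new sample to the class of the nearest centroid."""
    classes = list(centroids)
    D = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in classes])
    return np.array([classes[i] for i in D.argmin(axis=0)])

# Two well-separated synthetic pattern classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# The system learns from the training set; efficiency is checked on the testing set
Xtr, ytr, Xte, yte = split(X, y)
model = fit_centroids(Xtr, ytr)
acc = (classify(model, Xte) == yte).mean()
print(f"test accuracy: {acc:.2f}")
```

Any of the six models surveyed below can be plugged into the same two-set protocol; only the learning and classification steps change.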
II. PATTERN RECOGNITION MODELS
A. Statistical Model

…measurement is done while testing, and the resulting feature values are presented to the learned system, which performs the classification. When the conditional probability density distribution is known, parametric classification schemes are used; otherwise, non-parametric classification schemes need to be used. Various decision rules exist to determine the decision boundary, such as the Bayes Decision Rule, the Optimal Bayes Decision Rule, the Maximum Likelihood Rule, the Neyman-Pearson Rule and the MAP Rule. As the feature space is partitioned, the system becomes noise insensitive; therefore, in the case of noisy patterns, the choice of a statistical model is a good solution. Depending upon whether the method adopted is supervised or unsupervised, statistical techniques can be categorized as Discriminant Analysis and Principal Component Analysis [1].

Fig. 1: Statistical Pattern Recognition Model

Discriminant Analysis is a supervised technique used for dimensionality reduction. Here a linear combination of features is utilized to perform the classification operation. For each pattern class, a discriminant function is defined which performs the classification [5]-[8]. There is no well-defined rule regarding the form of the discriminant function: a minimum distance classifier uses one reference point for each class, and its discriminant function computes the minimum distance from an unknown vector to these points, whereas a nearest neighbor classifier uses a set of points for each class. Various kinds of Discriminant Analysis methods are used depending upon the application and system requirements, such as Linear Discriminant Analysis (LDA), Null-LDA (N-LDA), Fisher Discriminant Analysis (FDA), Two-Dimensional Linear Discriminant Analysis (2D-LDA) and Two-Dimensional Fisher Discriminant Analysis (2D-FDA). In LDA the feature set is obtained by a linear combination of the original features. Intra-class distance is minimized and inter-class distance is maximized to obtain optimum results. LDA suffers from the small sample size (SSS) problem.

In FDA the ratio of inter-class variance to intra-class variance defines the separation between classes. In FDA inter-class scatter is maximized and intra-class scatter is minimized to get optimum results [8]. The FDA approach is a combination of PCA and LDA. 2D-LDA avoids the small sample size (SSS) problem associated with 1D-LDA. Here matrices of input data are computed to form the feature vector. The trace of the inter-class scatter matrix is maximized while the trace of the intra-class scatter matrix is minimized to get optimum results in 2D-LDA. As compared to 1D-LDA, 2D-FDA provides non-singular inter-class and intra-class matrices. Chen et al. [9] suggested that the null space spanned by the eigenvectors of the intra-class scatter matrix having zero eigenvalues contains highly discriminating information. An LDA method in the null space of the intra-class scatter matrix is N-LDA, which involves solving the eigenvalue problem for a very large matrix.

Principal Component Analysis (PCA), or the Karhunen-Loève expansion, is a multivariate unsupervised technique for dimensionality reduction [10]. Using PCA, patterns are detected in the data, and these patterns determine the similarity measure [11]. In PCA the eigenvectors with the largest eigenvalues are computed to form the feature space. PCA is closely related to Factor Analysis [11]. Kernel PCA is a solution for nonlinear feature extraction [12], [13]. Other nonlinear feature extraction techniques are Multidimensional Scaling (MDS) and the Kohonen feature map [14]. Application areas of PCA include graphically unreliable patterns. Discriminant Analysis is more efficient than PCA in terms of accuracy and time elapsed [15].

B. Structural Model

When we come across patterns with strong inherent structures, statistical methods give ambiguous results, because feature extraction destroys vital information concerning the basic structure of the pattern. Therefore, in complex pattern recognition problems, such as the recognition of multidimensional objects, it is preferred to adopt a hierarchical system, where a pattern is considered to be made up of simple sub-patterns, which are themselves composed of simpler sub-patterns [16], [17]. In the structural approach to pattern recognition, a collection of complex patterns is described by a number of sub-patterns and the grammatical rules with which these sub-patterns are associated with each other. This model is concerned with structure and attempts to recognize a pattern from its general form. The language which provides the structural description of patterns in terms of pattern primitives and their composition is termed the pattern description language. Increased descriptive power of a language leads to increased complexity of the syntax analysis system.

To recognize finite-state languages, finite-state automata are used. The descriptive power of finite-state languages is weaker than that of context-sensitive languages. Context-sensitive languages are described by non-deterministic procedures. The selection of the type of grammar for pattern description depends upon the primitives and on the grammar's descriptive power and analysis efficiency [18]. For the description of patterns such as chromosome images, 2D mathematics, chemical structures, spoken words, English characters and fingerprint patterns, a number of languages have been suggested [19], [20]. High-dimensional patterns need high-dimensional grammars, such as web grammars, tree grammars, graph grammars and shape grammars, for efficient description [19], [21]-[23].

Stochastic languages, approximation and transformational grammars are used to describe noisy and distorted patterns [19], [24]-[26]. This approach demands large training sets and very large computational effort [27]. When dealing with noisy patterns, the grammar defining the basic structure of complex patterns becomes too difficult to define; in such cases the statistical approach is a good option. Acceptance error is the criterion used to measure performance. This model is used in application areas such as textured images, shape analysis of contours and image interpretation, where patterns have a definite structure [28].
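The PCA construction described in the statistical model above, where the eigenvectors of the data covariance with the largest eigenvalues form the feature space, can be sketched as follows; the synthetic data and the function name are illustrative assumptions, not code from the paper:

```python
import numpy as np

def pca(X, k):
    """Project X onto the k eigenvectors of its covariance matrix
    with the largest eigenvalues (the principal components)."""
    Xc = X - X.mean(axis=0)            # center the data
    cov = np.cov(Xc, rowvar=False)     # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(vals)[::-1]     # largest eigenvalues first
    W = vecs[:, order[:k]]             # basis of the feature space
    return Xc @ W, W

# Synthetic 3-D data that varies mostly along a single direction
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1)) @ np.array([[3.0, 2.0, 1.0]])
X += 0.1 * rng.normal(size=(200, 3))

Z, W = pca(X, k=1)    # reduce 3 dimensions to 1
print(Z.shape)        # (200, 1)
```

A supervised alternative in the spirit of LDA would instead pick directions that maximize inter-class scatter relative to intra-class scatter; PCA, being unsupervised, ignores class labels entirely.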
C. Template Matching Model

Template matching is the simplest and most primitive of all pattern recognition models. It is used to determine the similarity between two samples, pixels or curves. The pattern to be recognized is matched against the stored templates, while allowing for the possibility that the template has gone through rotational or scale changes. The efficiency of this model depends upon the stored templates. A correlation function is taken as the recognition function and is optimized on the available training set. The shortcoming of this approach is that it does not work efficiently in the presence of distorted patterns [29].

D. Neural Network Based Model

Neural networks are massively parallel structures composed of "neuron"-like subunits [29]. Neural networks provide efficient results in the field of classification. Their property of changing their weights iteratively and learning [10], [30] gives them an edge over other techniques in the recognition process. The perceptron is a primitive neuron model with a two-layer structure. If the output function of the perceptron is a step function, it performs classification; if it is linear, it performs regression. The most commonly used family of neural networks for pattern classification is the feed-forward networks, such as MLP and RBF networks [31]. Different types of neural networks are used depending upon the requirements of the application.

The Feed-Forward Back-Propagation Neural Network (FFBP-NN) is used to implement nonlinear differentiable functions. An increase in the learning rate of a back-propagation neural network leads to a decrease in convergence time [32]. The General Regression Neural Network (GRNN) is a highly parallel structure in which learning proceeds from the input side to the output side [33]. The GRNN performs more efficiently on noisy data than back-propagation. The FFBP neural network does not work accurately if the available data set is large; in a GRNN, on the other hand, the error approaches zero as the size of the data increases [33]. Kohonen networks are mainly used for data clustering and feature mapping [14]. Ripley [34] and Anderson et al. [35] stated the relationship between neural networks and the statistical models of pattern recognition.

The performance of a neural network improves upon increasing the number of hidden layers, up to a certain extent. An increased number of neurons in a hidden layer also improves the performance of the system. The number of neurons must be large enough to adequately represent the problem domain and small enough to permit generalization from the training data. A trade-off must be maintained between the size of the network and the complexity that results from the network size. The percentage recognition accuracy of a neural network can be further enhanced by using the 'tansig'-'tansig' combination of activation functions for the neurons of the hidden and output layers rather than other combinations [36].

E. Fuzzy Based Model

The importance of fuzzy sets in Pattern Recognition lies in modeling forms of uncertainty that cannot be fully understood by the use of probability theory [37], [38]. Kandel [39] states, "In a very fundamental way, the intimate relation between the theory of fuzzy sets and the theory of Pattern Recognition and classification rests on the fact that most real world classes are fuzzy in nature," and defined various techniques of fuzzy pattern recognition. Syntactic techniques are utilized when the pattern sought is related to the formal structure of a language. Semantic techniques are used when fuzzy partitions of data sets are to be produced; a similarity measure based on a weighted distance is then used to obtain the degree of similarity between the fuzzy description of an unknown shape and a reference shape.

F. Hybrid Model

In most of the emerging applications, it is clear that a single model used for classification does not behave efficiently, so multiple methods have to be combined, giving rise to hybrid models. Primitive approaches to designing a Pattern Recognition system, which aim at utilizing the best individual classifier, have some drawbacks [40]: it is very difficult to identify the best classifier unless deep prior knowledge is available at hand [40], [41]. Statistical and structural models can be combined to solve hybrid problems. In such cases the statistical approach is utilized to recognize pattern primitives, and the syntactic approach is then used for the recognition of sub-patterns and of the pattern itself. Fu [28] gave the concept of attributed grammars, which unifies the statistical and structural pattern recognition approaches. To enhance system performance, one can use a set of individual classifiers and a combiner to make the final decision. Tumer and Ghosh [42] experimentally proved that using a linear combiner or an order statistics combiner minimizes the variance of the actual decision boundaries around the optimal boundary. Multiple classifiers can be used in several ways to enhance system performance: each classifier can be trained on a different region of the feature space, or each classifier can provide a probability estimate and the decision can be made by analyzing the individual results. Methods utilizing classifier ensemble design [43], [44] generate a set of mutually complementary classifiers that achieve optimal accuracy using a fixed decision function. Methods which utilize combination function design tend to find an optimal combination of decisions from a set of classifiers. To achieve optimum results, a large set of combination functions of increasing complexity, ranging from simple voting rules to trainable combination functions, is available to the designer [45]-[47].

III. CONCLUSION

A comparative view of all the models of pattern recognition has been presented, which shows that for the various domains in this area different models or combinations of models can be used. In the case of noisy patterns, the choice of a statistical model is a good solution. The practical importance of the structural model depends upon the recognition of simple pattern primitives and their relationships, represented by a description language. As
compared to statistical pattern recognition, structural pattern recognition is a newer area of research. For complex patterns and applications utilizing a large number of pattern classes, it is beneficial to describe each pattern in terms of its components. A wise decision regarding the selection of the pattern grammar influences the computational efficiency of the recognition system. The pattern primitives and pattern grammar to be utilized depend upon the application requirements. The low dependence of neural networks on prior knowledge and the availability of efficient learning algorithms have made neural networks famous in the field of Pattern Recognition. Although neural networks and statistical pattern recognition models have different principles, most neural networks are similar to statistical pattern recognition models. To recognize unknown shapes, fuzzy methods are good options. As each model has its own pros and cons, to enhance system performance for complex applications it is beneficial to combine two or more recognition models at various stages of the recognition process.

REFERENCES

[1]. K.S. Fu, "A Step towards Unification of Syntactic and Statistical Pattern Recognition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 5, no. 2, pp. 200-205, Mar. 1983.
[2]. A. Fazel and S. Chakrabartty, "An Overview of Statistical Pattern Recognition Techniques for Speaker Verification," IEEE Circuits and Systems Magazine, pp. 61-81, 2nd quarter 2011.
[3]. L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. Berlin: Springer-Verlag, 1996.
[4]. R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. New York: John Wiley & Sons, 1973.
[5]. W.W. Cooley and P.R. Lohnes, Multivariate Data Analysis. New York: Wiley, 1971.
[6]. R. Fisher, "The Use of Multiple Measurements in Taxonomic Problems," Ann. Eugenics, vol. 7, no. 2, pp. 179-188, 1936.
[7]. M.M. Tatsuoka, Multivariate Analysis. New York: Wiley, 1971.
[8]. Z. Lei, R. Chu, R. He, S. Liao and S.Z. Li, "Face Recognition by Discriminant Analysis with Gabor Tensor Representation," vol. 4642/2007, pp. 87-95, Springer Berlin/Heidelberg, 2007.
[9]. L. Chen, H. Liao, M. Ko, J. Lin and G. Yu, "A New LDA-Based Face Recognition System which can Solve the Small Sample Size Problem," Pattern Recognition, 2000.
[10]. E. Alpaydin, Introduction to Machine Learning. Prentice Hall of India, 2005.
[11]. Y. Sun and Minghui, "DT-CWT Feature Based Classification Using Orthogonal Neighborhood Preserving Projections for Face Recognition," vol. 1, pp. 719-724, Nov. 2006.
[12]. S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Englewood Cliffs, N.J.: Prentice Hall, 1999.
[13]. B. Schölkopf, A. Smola, and K.R. Müller, "Nonlinear Component Analysis as a Kernel Eigenvalue Problem," Neural Computation, vol. 10, no. 5, pp. 1299-1319, 1998.
[14]. T. Kohonen, Self-Organizing Maps, Springer Series in Information Sciences, vol. 30, Berlin, 1995.
[15]. T. Ahmad, A. Jameel and B. Ahmad, "Pattern Recognition using Statistical and Neural Techniques," IEEE, 2011.
[16]. K.S. Fu, Syntactic Pattern Recognition and Applications. Englewood Cliffs, N.J.: Prentice Hall, 1982.
[17]. T. Pavlidis, Structural Pattern Recognition. New York: Springer-Verlag, 1977.
[18]. K.-S. Fu and A. Rosenfeld, "Pattern Recognition and Image Processing," IEEE Trans. on Computers, vol. C-25, no. 12, Dec. 1976.
[19]. K.S. Fu, Syntactic Methods in Pattern Recognition. New York: Academic, 1974.
[20]. K.S. Fu, Applications of Syntactic Pattern Recognition. New York: Springer, 1976.
[21]. J. Gips, Shape Grammars and their Uses. Basel and Stuttgart: Birkhäuser Verlag, 1975.
[22]. P.A. Ota, "Mosaic Grammars," Pattern Recognition, vol. 7, June 1975.
[23]. K.L. Williams, "A Multidimensional Approach to Syntactic Pattern Recognition," Pattern Recognition, vol. 7, Sept. 1975.
[24]. U. Grenander, "Foundations of Pattern Analysis," Quart. Appl. Math., vol. 27, pp. 1-55, 1969.
[25]. K.S. Fu, Ed., Digital Pattern Recognition, Communications and Cybernetics, vol. 10. New York: Springer, 1976.
[26]. T. Pavlidis and F. Ali, "Computer Recognition of Handwritten Numerals by Polygonal Approximations," IEEE Trans. Syst., Man, Cybern., vol. SMC-5, Nov. 1975.
[27]. L.I. Perlovsky, "Conundrum of Combinatorial Complexity," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 6, pp. 666-670, 1998.
[28]. K.S. Fu, Syntactic Pattern Recognition and Applications. Englewood Cliffs, N.J.: Prentice-Hall, 1982.
[29]. R. Bajcsy and S. Kovacic, "Multiresolution Elastic Matching," Computer Vision, Graphics and Image Processing, vol. 46, pp. 1-21, 1989.
[30]. N.J. Nilsson, Artificial Intelligence: A New Synthesis, Elsevier; A.K. Jain, J. Mao, and K.M. Mohiuddin, "Artificial Neural Networks: A Tutorial," Computer, pp. 31-44, Mar. 1996.
[31]. A. Joshi, N. Ramakrishman, E.N. Houstis and J.R. Rice, "On Neurobiological, Neuro-Fuzzy, Machine Learning and Statistical Pattern Recognition Techniques," IEEE Trans. on Neural Networks, vol. 8, no. 1, 1997.
[32]. B. Erkmen and T. Yıldırım, "Improving classification performance of sonar targets by applying general regression neural network with PCA," ScienceDirect.
[33]. B. Ripley, "Statistical Aspects of Neural Networks," in Networks and Chaos: Statistical and Probabilistic Aspects, O. Barndorff-Nielsen, J. Jensen, and W. Kendall, eds., Chapman and Hall, 1993.
[34]. J. Anderson, A. Pellionisz, and E. Rosenfeld, Neurocomputing 2: Directions for Research. Cambridge, Mass.: MIT Press, 1990.
[35]. A. Choudhary, R. Rishi, S. Ahlawat and V.S. Dhaka, "Performance Analysis of Feed Forward MLP with Various Activation Functions for Handwritten Numerals Recognition," IEEE, 2010, vol. 5, pp. 852-856.
[36]. J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum Press, 1981.
[37]. J.C. Bezdek and S.K. Pal, Eds., Fuzzy Models for Pattern Recognition: Methods that Search for Structures in Data. IEEE CS Press, 1992.
[38]. A. Kandel, Fuzzy Techniques in Pattern Recognition. New York: John Wiley and Sons, 1982.
[39]. J. Kittler, F.J. Ferri, J.M. Iñesta, A. Amin and P. Pudil, Eds., "A framework for classifier fusion: is it still needed?" in Advances in Pattern Recognition, LNCS 1876, pp. 45-56, Springer-Verlag, 2000.
[40]. R.O. Duda, P.E. Hart and D.G. Stork, Pattern Classification. John Wiley & Sons, 2000.
[41]. K. Tumer and J. Ghosh, "Analysis of Decision Boundaries in Linearly Combined Neural Classifiers," Pattern Recognition, vol. 29, pp. 341-348, 1996.
[42]. J. Kittler and F. Roli, Eds., Multiple Classifier Systems, LNCS 1857, Springer-Verlag, 2000.
[43]. A.J.C. Sharkey, "Multi-Net Systems," in Combining Artificial Neural Nets: Ensemble and Modular Multi-Net Systems, pp. 1-27, Springer-Verlag, 1999.
[44]. G. Giacinto and F. Roli, "Dynamic Classifier Selection based on Multiple Classifier Behaviour," Pattern Recognition, vol. 34, pp. 179-181, 2001.
[45]. G. Giacinto, F. Roli and G. Fumera, "Selection of Image Classifiers," Electronics Letters, vol. 36, pp. 420-422, 2000.
[46]. L.I. Kuncheva, "Combinations of multiple classifiers using fuzzy sets," in Fuzzy Classifier Design, pp. 233-267, Springer-Verlag, 2000.

Seema Asht is pursuing her M.Tech. from DCRUST Murthal in Electronics and Communication Engineering. She received her B.E. (Honors) degree in Electronics & Communication Engineering from CRSCE, Murthal (Sonepat) (affiliated to M.D.U. Rohtak) in 2010. Her interests include Neural Networks and Wireless Communication.

Rajeshwar Dass received his M.E. degree in Electronics & Communication Engineering from the National Institute of Technical Teachers Training and Research Center (NITTTR), Chandigarh, in 2007 and his B.E. (Honors) degree in Electronics & Communication Engineering from Apeejay College of Engineering, Sohna, Gurgaon (affiliated to M.D.U. Rohtak) in 2004. He is pursuing his Ph.D. from DCRUST Murthal. In August 2004 he joined the Department of Electronics & Communication Engineering of Apeejay College of Engineering, Sohna, Gurgaon. He joined the Department of Electronics & Communication Engineering of Deenbandhu Chhotu Ram University of Science and Technology (D.C.R.U.S.T.) Murthal, Sonepat (India) as Assistant Professor in October 2008. His interests include Medical Image Processing, Neural Networks, Wireless Communication and Soft Computing Techniques. He has contributed 15 technical papers to international journals and conferences. He has written a book on wireless communication. He is a member of IEEE (92061144).