
Pattern and Classification

Pattern recognition is the process of classifying data based on knowledge gained from patterns and their representations. It involves extracting features from raw data, classifying patterns into categories, and clustering similar patterns together. Applications include image processing, computer vision, speech recognition, and fingerprint identification.

Uploaded by

Saumya Gurnani
Copyright
© All Rights Reserved

Patterns are everywhere in the digital world. A pattern can either be seen
physically or be observed mathematically by applying algorithms.
Example: the colors on clothes, speech patterns, etc. In computer
science, a pattern is represented using a vector of feature values.
What is Pattern Recognition?
Pattern recognition is the process of recognizing patterns by using a
machine learning algorithm. It can be defined as the classification of data
based on knowledge already gained or on statistical information extracted
from patterns and/or their representation. One of the important aspects of
pattern recognition is its application potential.
Examples: Speech recognition, speaker identification, multimedia document
recognition (MDR), automatic medical diagnosis.
In a typical pattern recognition application, the raw data is processed and
converted into a form that is amenable for a machine to use. Pattern
recognition involves the classification and clustering of patterns.
• In classification, an appropriate class label is assigned to a pattern
based on an abstraction that is generated using a set of training
patterns or domain knowledge. Classification is used in supervised
learning.
• Clustering generates a partition of the data that helps decision
making, the specific decision-making activity of interest to us.
Clustering is used in unsupervised learning.
Features may be represented as continuous, discrete, or discrete binary
variables. A feature is a function of one or more measurements, computed
so that it quantifies some significant characteristic of the object.
Example: for a face, the eyes, ears, nose, etc. are features.
A set of features taken together forms the feature vector.
Example: in the face example above, if all the features (eyes, ears,
nose, etc.) are taken together, the sequence is a feature vector ([eyes,
ears, nose]). The feature vector is the sequence of features represented as
a d-dimensional column vector. In the case of speech, MFCCs (Mel-frequency
cepstral coefficients) are spectral features of the speech signal. The
sequence of the first 13 coefficients forms a feature vector.
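To make the idea of a feature vector concrete, here is a minimal sketch in Python. The measurement names (such as `eye_distance_cm`) and their values are made up for illustration and do not come from any real dataset.

```python
# Hypothetical measurements of one face; names and values are illustrative only.
measurements = {
    "eye_distance_cm": 6.2,
    "nose_length_cm": 4.8,
    "ear_height_cm": 6.0,
}

# Fix an order for the features so every pattern uses the same layout.
feature_names = sorted(measurements)

# The feature vector is the ordered sequence of feature values (d = 3 here).
feature_vector = [measurements[name] for name in feature_names]
d = len(feature_vector)

print(feature_names)
print(feature_vector, "dimension:", d)
```

Fixing the feature order matters because a classifier compares vectors position by position.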
A pattern recognition system should:
• Recognize familiar patterns quickly and accurately
• Recognize and classify unfamiliar objects
• Accurately recognize shapes and objects from different angles
• Identify patterns and objects even when partly hidden
• Recognize patterns quickly, with ease, and with automaticity.
Training and Learning in Pattern Recognition
Learning is the process through which a system is trained and becomes
able to give accurate results. It is the most important phase, because how
well the system performs on the data provided to it depends on which
algorithms are applied to that data. The entire dataset is divided into two
parts: one that is used to train the model (the training set) and one that
is used to test the model after training (the testing set).
• Training set:
The training set is used to build a model. It consists of the examples
(for instance, a set of images) that are used to train the system.
Training rules and algorithms give relevant information on how to
associate input data with output decisions. The system is trained by
applying these algorithms to the dataset: the relevant information is
extracted from the data, and results are obtained. Generally, 80% of
the dataset is used as training data.
• Testing set:
Testing data is used to test the system. It is the set of data used to
verify whether the system produces the correct output after being
trained. Generally, 20% of the dataset is used for testing. Testing
data is used to measure the accuracy of the system. For example, if a
system that identifies which category a particular flower belongs to
classifies seven out of ten flowers correctly and the rest incorrectly,
then its accuracy is 70%.
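The split and the accuracy calculation described above can be sketched in a few lines of Python. This is only an illustration: the helper names `train_test_split_80_20` and `accuracy` are hypothetical, and the flower labels are invented to reproduce the seven-out-of-ten example.

```python
import random

def train_test_split_80_20(samples, seed=0):
    """Shuffle a copy of the samples and split it 80% / 20%."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# 50 placeholder samples: 40 go to training, 10 to testing.
train, test = train_test_split_80_20(range(50))
print(len(train), len(test))

# Ten test flowers, seven of them classified correctly.
true_labels = ["rose"] * 5 + ["tulip"] * 5
predicted = ["rose"] * 5 + ["tulip"] * 2 + ["rose"] * 3
print(accuracy(true_labels, predicted))
```

Shuffling before splitting is a common precaution so that both parts contain a mix of classes.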

Real-time Examples and Explanations:


A pattern is a physical object or an abstract notion. When talking about the
classes of animals, a description of an animal is a pattern. When talking
about various types of balls, a description of a ball is a pattern. If balls
are the patterns, the classes could be football, cricket ball, table tennis
ball, etc. Given a new pattern, the class of the pattern is to be
determined. The choice of attributes and the representation of patterns is
a very important step in pattern classification. A good representation is
one that makes use of discriminating attributes and also reduces the
computational burden of pattern classification.
An obvious representation of a pattern is a vector. Each element of the
vector represents one attribute of the pattern. The first element of the
vector contains the value of the first attribute for the pattern being
considered.
Example: when representing spherical objects, (25, 1) may represent a
spherical object with 25 units of weight and 1 unit of diameter. The class
label can form part of the vector. If spherical objects belong to class 1,
the vector would be (25, 1, 1), where the first element represents the
weight of the object, the second element the diameter of the object, and
the third element the class of the object.
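As a rough sketch of how such labeled vectors can be used to determine the class of a new pattern, the Python snippet below applies a simple nearest-neighbor rule. The training patterns, the second class, and the helper `classify_nearest` are assumptions made for illustration; the text above does not prescribe a particular classifier.

```python
import math

# Hypothetical labeled patterns: (weight, diameter, class label).
# Class 2 is a made-up second category for illustration.
training_patterns = [
    (25.0, 1.0, 1),
    (27.0, 1.2, 1),
    (5.0, 0.4, 2),
    (6.0, 0.5, 2),
]

def classify_nearest(pattern, labeled_patterns):
    """Return the class label of the closest training pattern."""
    def distance(labeled):
        *features, _label = labeled  # the last element is the class label
        return math.dist(pattern, features)
    return min(labeled_patterns, key=distance)[-1]

print(classify_nearest((24.0, 1.1), training_patterns))
print(classify_nearest((5.2, 0.45), training_patterns))
```

Because weight and diameter are on different scales here, the weight dominates the distance; rescaling the attributes is one way to choose a better representation.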
Advantages:
• Pattern recognition solves classification problems.
• Pattern recognition solves the problem of fake biometric detection.
• It is useful for cloth pattern recognition for visually impaired
people.
• It helps in speaker diarization.
• We can recognize particular objects from different angles.
Disadvantages:
• The syntactic pattern recognition approach is complex to implement
and is a very slow process.
• Sometimes a larger dataset is required to get better accuracy.
• It cannot explain why a particular object is recognized.
Example: my face vs my friend’s face.
Applications:
• Image processing, segmentation, and analysis
Pattern recognition is used to give machines the human-like
recognition ability that image processing requires.
• Computer vision
Pattern recognition is used to extract meaningful features from
given image/video samples and is used in computer vision for
various applications like biological and biomedical imaging.
• Seismic analysis
The pattern recognition approach is used for the discovery,
imaging, and interpretation of temporal patterns in seismic array
recordings. Statistical pattern recognition is implemented and used
in different types of seismic analysis models.
• Radar signal classification/analysis
Pattern recognition and signal processing methods are used in
various applications of radar signal classification, such as AP mine
detection and identification.
• Speech recognition
The greatest success in speech recognition has been obtained
using pattern recognition paradigms. They are used in various
speech recognition algorithms that try to avoid the problems of using
a phoneme level of description and treat larger units such as words
as patterns.
• Fingerprint identification
Fingerprint recognition technology is a dominant technology in the
biometric market. A number of recognition methods have been used
to perform fingerprint matching, of which pattern recognition
approaches are widely used.

Phases in Pattern Recognition System


A pattern recognition system can be described as a sequence of distinct
phases, each corresponding to one component of the system:
• Phase 1: Convert images or sounds or other inputs into signal data.
• Phase 2: Isolate the sensed objects from the background.
• Phase 3: Measure object properties that are useful for
classification.
• Phase 4: Assign the sensed object to a category.
• Phase 5: Take other considerations into account to decide on an
appropriate action.
The problems addressed by these phases are as follows:
1. Sensing: It deals with problems that arise in the input, such as its
bandwidth, resolution, sensitivity, distortion, signal-to-noise ratio,
latency, etc.
2. Segmentation and Grouping: One of the deepest problems in pattern
recognition; it deals with recognizing or grouping together the
various parts of an object.
3. Feature Extraction: It deals with characterizing an object by
measurements so that it can be recognized easily. Objects whose
feature values are very similar are considered to be in the same
category, while objects whose feature values are quite different are
placed in different categories.
4. Classification: It deals with assigning the object to its category
by using the feature vector provided by the feature extractor, which
holds the values of all of the features for a particular input.
5. Post Processing: It deals with deciding on an action by using the
output of the classifier. For example, the action can be chosen to
minimize the error rate or, more generally, the total expected cost.
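The five phases can be sketched as a chain of small Python functions. Everything in this toy pipeline is an assumption made for illustration: the one-dimensional sensor trace, the background threshold, the two features, the hand-written classification rule, and the accept/reject actions.

```python
def sense():
    """Phase 1: return raw signal data (a made-up 1-D sensor trace)."""
    return [0, 0, 5, 6, 5, 0, 0, 0, 9, 9, 9, 9, 9, 0]

def segment(signal, background=1):
    """Phase 2: isolate runs of samples that rise above the background."""
    objects, current = [], []
    for value in signal:
        if value > background:
            current.append(value)
        elif current:
            objects.append(current)
            current = []
    if current:
        objects.append(current)
    return objects

def extract_features(obj):
    """Phase 3: measure properties useful for classification."""
    return (len(obj), sum(obj) / len(obj))  # (width, mean amplitude)

def classify(features):
    """Phase 4: assign a category using a hand-written rule."""
    width, mean_amplitude = features
    return "large" if width >= 4 and mean_amplitude >= 8 else "small"

def post_process(category):
    """Phase 5: decide on an action given the category."""
    return {"large": "reject", "small": "accept"}[category]

for obj in segment(sense()):
    features = extract_features(obj)
    category = classify(features)
    print(features, category, post_process(category))
```

In a real system each phase would be far more elaborate, but the data flow from signal to objects to features to category to action has the same shape.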
Activities for designing the Pattern Recognition Systems
Designing a pattern recognition system involves the following sequence of
activities:
• Data Collection
• Feature Choice
• Model Choice
• Training
• Evaluation
Classification
As the name suggests, Classification is the task of “classifying things” into
sub-categories. But, by a machine! If that doesn’t sound like much, imagine
your computer being able to differentiate between you and a stranger.
Between a potato and a tomato. Between an A grade and an F-.
Yeah. It sounds interesting now!
In Machine Learning and Statistics, Classification is the problem of
identifying to which of a set of categories (subpopulations) a new
observation belongs, on the basis of a training set of data containing
observations whose category membership is known.
Types of Classification
Classification is of two types:
• Binary Classification: When we have to categorize given data into
2 distinct classes. Example – On the basis of the given health
conditions of a person, we have to determine whether the person
has a certain disease or not.
• Multiclass Classification: The number of classes is more than 2.
Example – On the basis of data about different species of
flowers, we have to determine which species our observation
belongs to.

Fig: Binary and Multiclass Classification. Here x1 and x2 are our variables
upon which the class is predicted.
How does classification work?
Suppose we have to predict whether a given patient has a certain disease or
not, on the basis of 3 variables, called features.
This means there are two possible outcomes:
1. The patient has the said disease. Basically, a result labeled “Yes”
or “True”.
2. The patient is disease-free. A result labeled “No” or “False”.

This is a binary classification problem.


We have a set of observations called the training data set, which comprises
sample data with actual classification results. We train a model, called a
Classifier, on this data set, and use that model to predict whether a
certain patient will have the disease or not.
The outcome thus depends upon:
1. How well these features are able to “map” to the outcome.
2. The quality of our data set. By quality, I refer to statistical and
mathematical qualities.
3. How well our Classifier generalizes this relationship between the
features and the outcome.
4. The values of x1 and x2.

Following is the generalized block diagram of the classification task.

Generalized Classification Block Diagram.


1. X: pre-classified data, in the form of an N*M matrix. N is the number
of observations and M is the number of features.
2. y: An N-d vector corresponding to the known classes for each of the
N observations.
3. Feature Extraction: Extracting valuable information from input X
using a series of transforms.
4. ML Model: The “Classifier” we’ll train.
5. y’: Labels predicted by the Classifier.
6. Quality Metric: Metric used for measuring the performance of the
model.
7. ML Algorithm: The algorithm that is used to update the weights w’,
which update the model so that it “learns” iteratively.
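To show how these pieces fit together, here is a minimal sketch in Python that uses a perceptron as the ML algorithm. The perceptron, the toy data, the learning rate, and the epoch limit are all assumptions chosen for illustration; the block diagram itself does not fix a particular algorithm.

```python
# X: N x M matrix of features; y: N known class labels (0 or 1).
# The toy data below is made up and linearly separable.
X = [[2.0, 1.0], [3.0, 1.5], [0.5, 2.5], [1.0, 3.0]]
y = [1, 1, 0, 0]

def predict(weights, bias, row):
    """ML model: a linear threshold unit producing a predicted label y'."""
    score = sum(w * x for w, x in zip(weights, row)) + bias
    return 1 if score > 0 else 0

def accuracy(weights, bias):
    """Quality metric: fraction of observations classified correctly."""
    hits = sum(predict(weights, bias, row) == label for row, label in zip(X, y))
    return hits / len(X)

# ML algorithm: the perceptron rule updates the weights after each mistake.
weights, bias, learning_rate = [0.0, 0.0], 0.0, 0.1
for epoch in range(20):
    for row, label in zip(X, y):
        error = label - predict(weights, bias, row)
        weights = [w + learning_rate * error * x for w, x in zip(weights, row)]
        bias += learning_rate * error
    if accuracy(weights, bias) == 1.0:
        break  # stop once every training observation is classified correctly

print(weights, bias, accuracy(weights, bias))
```

The feature extraction step is skipped here because the toy rows are already numeric features.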

Types of Classifiers (algorithms)


There are various types of classifiers. Some of them are:
• Linear Classifiers: Logistic Regression
• Tree-Based Classifiers: Decision Tree Classifier
• Support Vector Machines
• Artificial Neural Networks
• Bayesian Regression
• Gaussian Naive Bayes Classifiers
• Stochastic Gradient Descent (SGD) Classifier
• Ensemble Methods: Random Forests, AdaBoost, Bagging
Classifier, Voting Classifier, ExtraTrees Classifier

A detailed description of these methodologies is beyond the scope of a
single article!


Practical Applications of Classification
1. Google’s self-driving car uses deep learning-enabled classification
techniques that enable it to detect and classify obstacles.
2. Spam e-mail filtering is one of the most widespread and
well-recognized uses of Classification techniques.
3. Detecting health problems, facial recognition, speech
recognition, object detection, and sentiment analysis all use
Classification at their core.
Implementation
Let’s get some hands-on experience with how Classification works.
We are going to study various Classifiers and see a rather simple
analytical comparison of their performance on a well-known, standard
data set, the Iris data set.
Requirements for running the given script
1. Python 2.7
2. Scipy and Numpy
3. Matplotlib for data visualization
4. Pandas for data i/o
5. Scikit-learn, which provides all the classifiers
Python Implementation- Github link to the Project
Solving A Simple Classification Problem with Python
— Fruits Lovers’ Edition Data

The fruits dataset was created by Dr. Iain Murray from the University of
Edinburgh. He bought a few dozen oranges, lemons and apples of
different varieties, and recorded their measurements in a table. The
professors at the University of Michigan then formatted the fruits
data slightly, and it can be downloaded from here.

Let’s have a look at the first few rows of the data.


%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

fruits = pd.read_table('fruit_data_with_colors.txt')
fruits.head()

Figure 1

Each row of the dataset represents one piece of fruit, described by
several features that are in the table’s columns.

We have 59 pieces of fruit and 7 columns in the dataset:


print(fruits.shape)

(59, 7)

We have four types of fruits in the dataset:


print(fruits['fruit_name'].unique())
['apple' 'mandarin' 'orange' 'lemon']

The data is pretty balanced except for mandarin. We will just have to go
with it.
print(fruits.groupby('fruit_name').size())

Figure 2
import seaborn as sns
# Pass the column by keyword; recent seaborn versions expect x= and data=.
sns.countplot(x='fruit_name', data=fruits)
plt.show()

Figure 3

Visualization

• A box plot for each numeric variable will give us a clearer idea
of the distribution of the input variables:
fruits.drop('fruit_label', axis=1).plot(kind='box', subplots=True,
                                        layout=(2,2), sharex=False,
                                        sharey=False, figsize=(9,9),
                                        title='Box Plot for each input variable')
plt.savefig('fruits_box')
plt.show()
Figure 4

• It looks like perhaps color score has a near Gaussian distribution.
fruits.drop('fruit_label', axis=1).hist(bins=30, figsize=(9,9))
plt.suptitle("Histogram for each numeric input variable")
plt.savefig('fruits_hist')
plt.show()
Figure 5

• Some pairs of attributes are correlated (for example, mass and
width), which suggests a predictable relationship.
# scatter_matrix now lives in pandas.plotting (pandas.tools.plotting was removed).
from pandas.plotting import scatter_matrix

feature_names = ['mass', 'width', 'height', 'color_score']
X = fruits[feature_names]
y = fruits['fruit_label']
cmap = plt.get_cmap('gnuplot')
scatter = scatter_matrix(X, c=y, marker='o', s=40,
                         hist_kwds={'bins': 15}, figsize=(9,9), cmap=cmap)
plt.suptitle('Scatter-matrix for each input variable')
plt.savefig('fruits_scatter_matrix')
Figure 6

Statistical Summary
Figure 7

We can see that the numerical values do not have the same scale. We
will need to apply to the test set the same scaling that we computed
on the training set.
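The idea of computing the scaling on the training set and reusing it on the test set can be sketched in plain Python. The helper names `fit_min_max` and `transform` and the sample rows are hypothetical; the sketch also assumes that no training feature is constant, since that would cause a division by zero.

```python
def fit_min_max(rows):
    """Learn per-feature minimum and maximum from the training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def transform(rows, minimums, maximums):
    """Rescale each feature with the training minimum and maximum."""
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, minimums, maximums)]
        for row in rows
    ]

# Made-up (mass, width) rows standing in for the fruit measurements.
train_rows = [[100.0, 6.0], [200.0, 8.0], [150.0, 7.0]]
test_rows = [[250.0, 6.5]]

minimums, maximums = fit_min_max(train_rows)
print(transform(train_rows, minimums, maximums))
print(transform(test_rows, minimums, maximums))
```

A test value can fall outside the range 0 to 1 when it lies outside the range seen in training, which is expected because the test set plays no part in fitting the scaler.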

Create Training and Test Sets and Apply Scaling


from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the scaler on the training set only, then reuse it on the test set.
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

Build Models

Logistic Regression
from sklearn.linear_model import LogisticRegression

logreg = LogisticRegression()
logreg.fit(X_train, y_train)
print('Accuracy of Logistic regression classifier on training set: {:.2f}'
      .format(logreg.score(X_train, y_train)))
print('Accuracy of Logistic regression classifier on test set: {:.2f}'
      .format(logreg.score(X_test, y_test)))

Accuracy of Logistic regression classifier on training set: 0.70
Accuracy of Logistic regression classifier on test set: 0.40

Decision Tree
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier().fit(X_train, y_train)
print('Accuracy of Decision Tree classifier on training set: {:.2f}'
      .format(clf.score(X_train, y_train)))
print('Accuracy of Decision Tree classifier on test set: {:.2f}'
      .format(clf.score(X_test, y_test)))

Accuracy of Decision Tree classifier on training set: 1.00
Accuracy of Decision Tree classifier on test set: 0.73

K-Nearest Neighbors
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
print('Accuracy of K-NN classifier on training set: {:.2f}'
      .format(knn.score(X_train, y_train)))
print('Accuracy of K-NN classifier on test set: {:.2f}'
      .format(knn.score(X_test, y_test)))

Accuracy of K-NN classifier on training set: 0.95
Accuracy of K-NN classifier on test set: 1.00

Linear Discriminant Analysis


from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)
print('Accuracy of LDA classifier on training set: {:.2f}'
      .format(lda.score(X_train, y_train)))
print('Accuracy of LDA classifier on test set: {:.2f}'
      .format(lda.score(X_test, y_test)))

Accuracy of LDA classifier on training set: 0.86
Accuracy of LDA classifier on test set: 0.67

Gaussian Naive Bayes


from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()
gnb.fit(X_train, y_train)
print('Accuracy of GNB classifier on training set: {:.2f}'
      .format(gnb.score(X_train, y_train)))
print('Accuracy of GNB classifier on test set: {:.2f}'
      .format(gnb.score(X_test, y_test)))

Accuracy of GNB classifier on training set: 0.86
Accuracy of GNB classifier on test set: 0.67

Support Vector Machine


from sklearn.svm import SVC

svm = SVC()
svm.fit(X_train, y_train)
print('Accuracy of SVM classifier on training set: {:.2f}'
      .format(svm.score(X_train, y_train)))
print('Accuracy of SVM classifier on test set: {:.2f}'
      .format(svm.score(X_test, y_test)))

Accuracy of SVM classifier on training set: 0.61
Accuracy of SVM classifier on test set: 0.33

The KNN algorithm was the most accurate model that we tried. The
confusion matrix shows that no errors were made on the test set.
However, the test set was very small.
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
pred = knn.predict(X_test)
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))
Figure 7
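For readers unfamiliar with what a confusion matrix counts, here is a small pure-Python sketch. The helper `count_confusions` is hypothetical, and the labels are made up for illustration; they are not the fruit test results.

```python
def count_confusions(true_labels, predicted_labels, classes):
    """Build a matrix whose rows are true classes and columns predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

# Made-up labels for illustration only.
classes = ["apple", "lemon", "orange"]
true_labels = ["apple", "apple", "lemon", "orange", "orange", "orange"]
predicted = ["apple", "orange", "lemon", "orange", "orange", "apple"]

matrix = count_confusions(true_labels, predicted, classes)
for name, row in zip(classes, matrix):
    print(name, row)

# Correct predictions sit on the diagonal of the matrix.
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(true_labels)
print("accuracy:", accuracy)
```

A matrix with zeros everywhere off the diagonal corresponds to a test set classified without errors.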

Plot the Decision Boundary of the k-NN Classifier


import numpy as np
import matplotlib.patches as mpatches
from matplotlib.colors import ListedColormap
from sklearn import neighbors

X = fruits[['mass', 'width', 'height', 'color_score']]
y = fruits['fruit_label']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def plot_fruit_knn(X, y, n_neighbors, weights):
    # as_matrix() was removed from pandas; to_numpy() is its replacement.
    X_mat = X[['height', 'width']].to_numpy()
    y_mat = y.to_numpy()

    # Create color maps
    cmap_light = ListedColormap(['#FFAAAA', '#AAFFAA', '#AAAAFF', '#AFAFAF'])
    cmap_bold = ListedColormap(['#FF0000', '#00FF00', '#0000FF', '#AFAFAF'])

    clf = neighbors.KNeighborsClassifier(n_neighbors=n_neighbors, weights=weights)
    clf.fit(X_mat, y_mat)

    # Plot the decision boundary by assigning a color in the color map
    # to each mesh point.
    mesh_step_size = .01  # step size in the mesh
    plot_symbol_size = 50

    x_min, x_max = X_mat[:, 0].min() - 1, X_mat[:, 0].max() + 1
    y_min, y_max = X_mat[:, 1].min() - 1, X_mat[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, mesh_step_size),
                         np.arange(y_min, y_max, mesh_step_size))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.figure()
    plt.pcolormesh(xx, yy, Z, cmap=cmap_light)

    # Plot training points
    plt.scatter(X_mat[:, 0], X_mat[:, 1], s=plot_symbol_size,
                c=y_mat, cmap=cmap_bold, edgecolor='black')
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())

    patch0 = mpatches.Patch(color='#FF0000', label='apple')
    patch1 = mpatches.Patch(color='#00FF00', label='mandarin')
    patch2 = mpatches.Patch(color='#0000FF', label='orange')
    patch3 = mpatches.Patch(color='#AFAFAF', label='lemon')
    plt.legend(handles=[patch0, patch1, patch2, patch3])

    plt.xlabel('height (cm)')
    plt.ylabel('width (cm)')
    plt.title("4-Class classification (k = %i, weights = '%s')"
              % (n_neighbors, weights))
    plt.show()

plot_fruit_knn(X_train, y_train, 5, 'uniform')

Figure 8
k_range = range(1, 20)
scores = []
for k in k_range:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    scores.append(knn.score(X_test, y_test))

plt.figure()
plt.xlabel('k')
plt.ylabel('accuracy')
plt.scatter(k_range, scores)
plt.xticks([0, 5, 10, 15, 20])
Figure 9

For this particular dataset, we obtain the highest accuracy when k=5.
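Since the test set here is very small, one alternative worth considering is to compare values of k with leave-one-out validation on the training data instead of scoring each k on the test set. The sketch below is not the method used in this post: the helpers `knn_predict` and `leave_one_out_accuracy` and the toy points are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    """Hold out each point in turn and predict it from all the others."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)

# Two made-up, well-separated clusters of three points each.
data = [
    ((1.0, 1.0), "a"), ((1.2, 0.9), "a"), ((0.8, 1.1), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.8, 3.1), "b"),
]
for k in (1, 3, 5):
    print(k, leave_one_out_accuracy(data, k))
```

With only three points per class, k=5 forces the vote to be dominated by the other class, which illustrates why k has to be chosen relative to the amount of data available.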

Summary

In this post, we focused on prediction accuracy. Our objective is
to learn a model that has good generalization performance. Such a
model maximizes the prediction accuracy. To identify the machine
learning algorithm best suited to the problem at hand (i.e. fruit
type classification), we compared different algorithms and selected
the best-performing one.

Source code that created this post can be found here. I would be
pleased to receive feedback or questions on any of the above.
