Unit 1 Image Proc
For example:
• All the above are classified as character T
What is Pattern Recognition?
• Pattern recognition is a data analysis method that uses machine learning
algorithms to automatically recognize patterns and regularities in data.
(This data can be anything from text and images to sounds or other definable
qualities.) Typical applications include:
• Speech Recognition
• Fingerprint Identification
• OCR ( Optical Character Recognition)
• DNA sequence identification
Pattern Recognition System
[Figure: pattern recognition system, showing the training phase and the test phase]
Features:
• A symbolic or numeric property of a real-world object that might
be useful for determining its class.
• The word “attribute” is also used for this.
• In general, different objects may have different numbers of
attributes.
• Usually, however, the same features can be measured for all
objects in the same problem.
• Thus objects may be represented by a feature vector, or by a set of
attributes.
• The domain of a feature, stored as a property in the dataset, is
the set of values that the feature may take.
• When new objects are added to a dataset, their feature values
may be checked against the defined domain.
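The ideas of a feature vector and a feature domain can be sketched in a few lines. The fruit-sorting features, their domains, and the helper names check_domain and to_feature_vector are hypothetical and chosen only for illustration.

```python
# Hypothetical feature domains for a fruit-sorting problem.
FEATURE_DOMAINS = {
    "weight_g": (0.0, 2000.0),             # numeric feature: allowed (min, max) range
    "colour": {"red", "green", "yellow"},  # symbolic feature: allowed set of values
}

def check_domain(sample):
    """Return the names of features whose values fall outside their domain."""
    bad = []
    for name, domain in FEATURE_DOMAINS.items():
        value = sample.get(name)
        if isinstance(domain, tuple):
            low, high = domain
            ok = isinstance(value, (int, float)) and low <= value <= high
        else:
            ok = value in domain
        if not ok:
            bad.append(name)
    return bad

def to_feature_vector(sample):
    """Order the attribute values the same way for every object."""
    return [sample[name] for name in FEATURE_DOMAINS]

apple = {"weight_g": 150.0, "colour": "red"}
odd = {"weight_g": -5.0, "colour": "blue"}

print(to_feature_vector(apple))  # the object as a feature vector
print(check_domain(apple))       # features outside their domain (none here)
print(check_domain(odd))         # both values violate their domains
```

Checking a new object with check_domain before adding it to the dataset corresponds to the domain check described in the last bullet.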
Two broad types of classification:
• Supervised classification
• Guided by humans.
• It is called supervised learning because the process of an algorithm
learning from the training dataset can be thought of as a teacher
supervising the learning process.
• We know the correct answers; the algorithm iteratively makes predictions
on the training data and is corrected by the teacher.
(Example: classify emails as spam or non-spam based on predefined parameters.)
• Unsupervised classification
• Not guided by humans.
• Unsupervised classification is also called clustering.
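The supervised case can be illustrated with a toy spam filter. This is only a minimal sketch: it uses a nearest-centroid classifier (chosen for brevity, and not an iterative method), and the two features (number of links, number of occurrences of the word "free") and all data values are made-up assumptions, not taken from these notes.

```python
import math

# Labeled training set: (feature vector, label). Hypothetical values.
# Feature vector layout: [number of links, occurrences of the word "free"]
TRAINING = [
    ([8.0, 5.0], "spam"),
    ([7.0, 6.0], "spam"),
    ([9.0, 4.0], "spam"),
    ([1.0, 0.0], "not spam"),
    ([0.0, 1.0], "not spam"),
    ([2.0, 1.0], "not spam"),
]

def train(samples):
    """Compute one mean vector (centroid) per class from the labeled samples."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label of the class centroid closest to the feature vector."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

centroids = train(TRAINING)
print(predict(centroids, [6.0, 5.0]))  # many links and "free" words
print(predict(centroids, [1.0, 1.0]))  # few links and "free" words
```

The labels in TRAINING play the role of the teacher's correct answers: without them, train could not form the class centroids.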
• Clustering :
• The system is not given a set of labeled patterns for training. Instead the
system establishes the classes itself based on the regularities of the patterns.
• Clustering separate clouds:
• Clustering methods work well when the clusters form well-separated,
compact clouds.
• They work less well when the number of samples differs greatly
between clusters.
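As an illustration of clustering on two well-separated, compact clouds, here is a minimal k-means sketch. k-means is one common clustering method, picked here as an assumption; these notes do not name a specific algorithm. The points and the choice of initial centroids are made up.

```python
import math

def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and recomputing centroids."""
    centroids = [list(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids

# Two compact clouds; no labels are supplied anywhere.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centres = k_means(POINTS, centroids=[POINTS[0], POINTS[3]])
print(labels)
print(centres)
```

The cluster indices are arbitrary names that the system invents for itself; they carry no meaning beyond "these points belong together".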
Another type of classification : Semi-supervised learning
• It learns from a small amount of labeled data together with a
large amount of unlabeled data.
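One simple way to combine a few labeled samples with many unlabeled ones is self-training. The sketch below is an assumption about how this might look, not a method described in these notes: it repeatedly gives the unlabeled point that lies closest to an already labeled point the same label as that neighbour. All data values are made up.

```python
import math

def self_train(labeled, unlabeled):
    """Repeatedly label the unlabeled point that is closest to any labeled point."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # The (unlabeled, labeled) pair with the smallest distance is the
        # most confident guess, so it is labeled first.
        point, (_, label) = min(
            ((u, l) for u in remaining for l in labeled),
            key=lambda pair: math.dist(pair[0], pair[1][0]),
        )
        labeled.append((point, label))
        remaining.remove(point)
    return labeled

# Two labeled samples and five unlabeled ones.
LABELED = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
UNLABELED = [(1.0, 1.0), (2.0, 2.0), (9.0, 9.0), (8.0, 8.0), (3.0, 3.0)]

result = dict(self_train(LABELED, UNLABELED))
print(result[(3.0, 3.0)])
print(result[(8.0, 8.0)])
```

Each newly labeled point is then treated as labeled data in later rounds, which is how the two original labels spread through the unlabeled set.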
Samples or patterns :
• The individual items, objects or situations to be classified
are referred to as samples, patterns or data.
• Testing set : around 20-30% of the data is used for testing the system.
Test data is used to measure the performance, such as accuracy
or efficiency, of the trained algorithm.
• Testing is the measure of the quality of your algorithm.
• Often, even after training on 80% of the data, failures are seen
during testing because the test data is not well represented in
the training set.
• An unsupervised classifier does not use labeled training data.
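Holding out a testing set and measuring accuracy on it can be sketched as follows. The helper names train_test_split and accuracy, the 30% test fraction, and the sample data are illustrative assumptions.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and hold out test_fraction of them for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Ten made-up labeled samples: (feature value, label).
samples = [(i, "even" if i % 2 == 0 else "odd") for i in range(10)]
train_set, test_set = train_test_split(samples, test_fraction=0.3)
print(len(train_set), len(test_set))

# Accuracy of four hypothetical predictions against the true labels.
predicted = ["spam", "spam", "not spam", "spam"]
actual = ["spam", "not spam", "not spam", "spam"]
print(accuracy(predicted, actual))
```

Shuffling before the split is one way to reduce the risk that the test data is poorly represented in the training set, although with very small datasets an unlucky split is still possible.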
Approaches for Pattern Recognition
Example: OCR
Statistical Decision Theory