Feature Extraction Phase
Feature extraction is the process of deriving the pertinent features from objects or characters in order to build feature vectors.
These feature vectors are then used by the classifier to map each input unit to the corresponding output unit. Well-chosen features make it considerably easier for the classifier to discriminate between dissimilar classes.
Several techniques have been proposed in the literature for extracting features from segmented characters. U. Pal et al. proposed directional chain-code features combined with zoning for handwritten numeral recognition, used a feature vector of length 100, and reported a high recognition accuracy. However, their feature extraction process is time-consuming and complex.
According to several authors, features fall into two major classes: statistical features and structural features.
Statistical features are obtained from the statistical distribution of the points in a character matrix; examples include zoning, moments, crossings, Fourier transforms and projection histograms. Statistical features are also known as global features, since they are usually averaged over and extracted from sub-images such as meshes. Statistical features were initially applied to the recognition of machine-printed characters.
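As an illustration, the following minimal sketch computes zoning features, i.e. the foreground-pixel density of each cell of a grid laid over the character matrix. The function name, the 4x4 grid and the random placeholder image are assumptions chosen for the example, not part of any specific system described here.

```python
import numpy as np

def zoning_features(char_img, zones=(4, 4)):
    """Zoning (pixel-density) features from a binary character matrix.

    char_img : 2-D array of 0/1 values (a hypothetical segmented character).
    zones    : grid used to partition the image; each zone yields one feature.
    """
    rows, cols = zones
    h, w = char_img.shape
    zh, zw = h // rows, w // cols
    features = []
    for r in range(rows):
        for c in range(cols):
            zone = char_img[r * zh:(r + 1) * zh, c * zw:(c + 1) * zw]
            # Density of foreground pixels in this zone (a statistical/global feature).
            features.append(zone.mean())
    return np.array(features)

# A 20x20 binary character split into a 4x4 grid gives a 16-element feature vector.
char = (np.random.rand(20, 20) > 0.5).astype(int)   # placeholder for a real character
print(zoning_features(char).shape)                   # (16,)
```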
Structural (or topological) features, on the other hand, relate to the geometry of the character set under consideration. Examples include convexities and concavities in the characters, the number of holes in the characters, the number of end points, and so on.
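One such structural feature, the number of holes, can be sketched as follows: a hole is counted as a background region that does not touch the image border (so 'O' has one hole, 'B' has two, 'L' has none). The function name is hypothetical and the approach is only one possible implementation.

```python
import numpy as np
from scipy import ndimage

def count_holes(char_img):
    """Count holes (enclosed background regions) in a binary character matrix."""
    background = char_img == 0
    labels, n = ndimage.label(background)
    # Background components that reach the border belong to the outside, not to holes.
    border_labels = (set(labels[0, :]) | set(labels[-1, :]) |
                     set(labels[:, 0]) | set(labels[:, -1]))
    holes = set(range(1, n + 1)) - border_labels
    return len(holes)
```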
Classification Phase
OCR systems make broad use of pattern-recognition methodology, which assigns each sample to a predefined class.
Classification is the procedure of assigning inputs, on the basis of the detected information, to their corresponding class, so that groups with homogeneous characteristics are formed while dissimilar inputs are separated into different classes. Classification is carried out on the basis of the stored features in the feature space, such as structural features, global features and so forth.
In effect, classification partitions the feature space into several classes according to a decision rule.
The choice of classifier depends on several factors, such as the number of free parameters and the available training set. Researchers have explored a variety of classification techniques for OCR; they can be categorized as template matching, statistical techniques, neural networks, kernel methods such as Support Vector Machine (SVM) algorithms, and combinations of classifiers.
Template Matching
This is the simplest method of character recognition, based on matching stored models against the word or character to be recognized. By comparing shapes, pixels, curvature and so forth, the matching operation determines the degree of similarity between two vectors. A grey-level or binary input character is compared with a standard set of stored models. The recognition rate of this approach is highly sensitive to noise and to deformation of the input.
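A minimal sketch of this idea, assuming size-normalised binary characters and a hypothetical dictionary of stored prototypes, could look as follows; the pixel-agreement score used here is only one possible similarity measure.

```python
import numpy as np

def match_template(char_img, templates):
    """Classify a binary character by matching it against stored templates.

    char_img  : 2-D 0/1 array, already size-normalised to the template shape.
    templates : dict mapping a class label to its stored 0/1 prototype
                (hypothetical models built beforehand, e.g. averaged samples).
    """
    best_label, best_score = None, -1.0
    for label, tmpl in templates.items():
        # Similarity = fraction of pixels that agree; this is what makes the
        # method sensitive to noise and deformation, as noted above.
        score = np.mean(char_img == tmpl)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score
```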
Statistical Techniques
Statistical decision theory deals with statistical decision functions and a set of optimality criteria that, for a given model of a particular class, maximize the probability of the observed pattern.
The main statistical methods applied in OCR are the Nearest Neighbour (NN) classifier, the likelihood or Bayes classifier, cluster analysis, Hidden Markov Modelling (HMM), fuzzy-set reasoning, and the quadratic classifier.
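As a simple example of the nearest-neighbour idea, the sketch below assigns an unknown character the label of the closest stored training vector under Euclidean distance; the function name and inputs (e.g. zoning feature vectors) are assumptions for illustration.

```python
import numpy as np

def nearest_neighbour(x, train_X, train_y):
    """1-NN classification: assign x the label of its closest training vector.

    x       : feature vector of the unknown character (e.g. zoning features).
    train_X : array of stored training feature vectors, one row per sample.
    train_y : array of class labels aligned with train_X.
    """
    distances = np.linalg.norm(train_X - x, axis=1)  # Euclidean distance to each sample
    return train_y[np.argmin(distances)]
```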
Neural Networks
The character classification problem is related to heuristic reasoning, since humans recognize characters and documents through learning and experience. Neural networks, which are largely heuristic in nature, are therefore well suited to this kind of problem.
A neural network is a computing architecture consisting of a massively parallel interconnection of adaptive node processors. The output of one node feeds into the next one in the network, and the final decision depends on the complex interaction of all nodes. Because of its parallel nature, a neural network can perform computations at a higher rate than traditional techniques.
Neural network architectures can be divided into feed-forward networks and feedback networks. The table compares and discusses some recently proposed OCR applications based on neural networks.
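The sketch below shows the forward pass of a small feed-forward network to make the "interconnection of adaptive nodes" concrete. The layer sizes (16 zoning features in, 32 hidden nodes, 26 output classes for A-Z) and random weights are assumptions; in practice the weights would be learned from labelled character samples, e.g. by backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feed-forward network: 16 input features, 32 hidden nodes, 26 classes.
W1, b1 = rng.normal(size=(16, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 26)), np.zeros(26)

def forward(features):
    """Propagate a feature vector through the network; the final decision
    depends on the combined outputs (softmax) of all output nodes."""
    hidden = np.tanh(features @ W1 + b1)
    logits = hidden @ W2 + b2
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return probs.argmax(), probs   # predicted class index and class probabilities

# Random weights are used only to keep the sketch self-contained.
print(forward(rng.normal(size=16))[0])
```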
Kernel Methods
While the most important kernel method is the Support Vector Machine, techniques such as Kernel Fisher Discriminant Analysis (KFDA) and Kernel Principal Component Analysis (KPCA) also employ kernels.
Support Vector Machines (SVMs) are among the most widely used and most effective supervised learning techniques, and can be used for binary or multi-class classification.
In classification, the data set is conventionally partitioned into a training set and a test set. The objective of the SVM is to produce a model that predicts the output for the test set. The optimization criterion is the width of the margin between the classes, i.e. the empty region around the decision boundary defined by the distance to the nearest training examples.
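A minimal SVM sketch, using scikit-learn's bundled 8x8 digit images as a stand-in for segmented OCR characters, could look as follows; the RBF kernel and the specific hyperparameter values are assumptions chosen only for illustration.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

# Flattened pixel values of the bundled digit images act as the feature vectors.
digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# An RBF-kernel SVM: the margin around the decision boundary is maximised.
clf = svm.SVC(kernel="rbf", gamma=0.001, C=10.0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```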
Combination Classifier
Different classification strategies have their own advantages and shortcomings, so several classifiers are often combined to solve a given classification problem, as in the sketch below.
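The following sketch combines a nearest-neighbour classifier, a small neural network and an SVM through majority voting, again using the bundled digit images as stand-in data; the choice of base classifiers and their parameters is an assumption for illustration only.

```python
from sklearn import datasets
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# Hard (majority) voting over three dissimilar base classifiers.
ensemble = VotingClassifier(estimators=[
    ("knn", KNeighborsClassifier(n_neighbors=3)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
    ("svm", SVC(kernel="rbf", gamma=0.001, C=10.0)),
])
ensemble.fit(X_train, y_train)
print("combined test accuracy:", ensemble.score(X_test, y_test))
```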