Optical Character Recognition: Shilpa Kale (46052) Kanchan Navgire (46122)
PCA is the optimal linear transformation for keeping the subspace that has
the largest variance. However, this comes at the price of a greater
computational requirement. Unlike other linear transforms, PCA does not
have a fixed set of basis vectors; its basis vectors depend on the data set.
Assuming zero empirical mean (the empirical mean of the distribution has
been subtracted from the data set), the principal components w_i of a
dataset x can be calculated by finding the eigenvalues and eigenvectors of
the covariance matrix of x. The eigenvectors with the largest eigenvalues
correspond to the dimensions that have the strongest correlation in the
dataset. The original measurements are finally projected onto the reduced
vector space spanned by these eigenvectors.
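The description above can be illustrated with a minimal sketch. It is written in Python using only the standard library, not in the Matlab of the project, and it finds only the first principal component. Power iteration is used here as an assumed stand-in for a full eigensolver. The function names and the four sample points are hypothetical and chosen purely for illustration.

```python
import math

def mean_center(rows):
    """Subtract the per-dimension empirical mean from every sample (row)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows], mean

def covariance(centered):
    """Sample covariance matrix (d x d) of zero-mean data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, by power iteration.

    Assumption: the starting vector is not orthogonal to the top eigenvector.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for the unit vector v.
    mv = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * mv[i] for i in range(d)), v

# Hypothetical 2-D samples that vary mostly along the diagonal direction.
samples = [[2.0, 1.9], [0.0, 0.1], [1.0, 1.1], [3.0, 2.9]]
centered, mean = mean_center(samples)
eigenvalue, w1 = top_eigenpair(covariance(centered))

# Project each zero-mean sample onto the first principal component.
projected = [sum(r[j] * w1[j] for j in range(2)) for r in centered]
```

For these samples the two entries of `w1` come out nearly equal, which reflects the strong correlation between the two dimensions.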
3. Appearance Based Recognition Using PCA.
1. Obtain a training set of images of all objects of interest (in our case:
the ABC or the aleph-beth) under variable conditions (in our case:
different fonts, bold, italic, etc.).
o Project both the image and the training set to the PCA subspace
(eigenspace).
o Return the name of object o_j.
4. The Algorithm
4.1 Creating PCA subspace (eigenspace).
o Organize the image database into column vectors. The vector size
equals the image height multiplied by the image width. All the
database images must be of the same size, 64 x 48 in our case
(equivalent to a 36 size font). The result is a vector_size x
database_size matrix, ColumnVectors.
o Find the empirical mean vector. Find the empirical mean along
each dimension. The result is a vector_size x 1 vector,
EmpiricalMean.
o Find the distance from the image column vector to each of the
database column vectors in the subspace. Project both the image
vector and the training set (database) vectors to the PCA subspace:
multiply SubSpaceT by ImageColumVec and by ColumnVectors,
then compute the distance using the L2 norm.
o Return the name (label) of the column vector with the minimal
distance.
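The recognition steps listed above can be sketched as follows, in Python with only the standard library rather than the Matlab of the project. This is a sketch under stated assumptions, not the project's code. The subspace basis is passed in as an argument and is hand-picked in the example, because the steps that compute it are not shown here. Images are stored as a list of vectors instead of the columns of one matrix. The mean is subtracted from both the query and the database vectors before projecting, which does not change the distances between them. All function names and the 2 x 2 toy images are hypothetical.

```python
import math

def to_column_vector(image):
    """Flatten a height x width image (list of rows) into one vector."""
    return [pixel for row in image for pixel in row]

def empirical_mean(vectors):
    """Mean along each dimension over all database vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def project(vector, mean, basis):
    """Coordinates of (vector - mean) along each basis vector of the subspace."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [sum(b * c for b, c in zip(axis, centered)) for axis in basis]

def recognize(image, database, labels, basis):
    """Return the label of the database image nearest to `image` (L2 norm)."""
    vectors = [to_column_vector(img) for img in database]
    mean = empirical_mean(vectors)
    query = project(to_column_vector(image), mean, basis)
    distances = [math.dist(query, project(v, mean, basis)) for v in vectors]
    return labels[distances.index(min(distances))]

# Toy 2 x 2 "letters" standing in for the 64 x 48 images of the report.
database = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
labels = ["A", "B"]
# Assumed one-dimensional subspace (unit vector), chosen by hand.
basis = [[0.5, -0.5, -0.5, 0.5]]

noisy_a = [[1, 0], [0, 0.8]]
result = recognize(noisy_a, database, labels, basis)
```

Here `result` is the label of the first database image, since the noisy query projects much closer to it than to the second one.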
5. The Program
The program consists of 9 files and 2 main functions, written in Matlab.
Main functions:
1. CreateDB(): loads the database, trains on it, and saves the data for
OCR use.
Files:
3. getImages.m: a script used for loading the database images. This is
the file to change in order to update or add images to the
database.
4. pca.m: this function trains on the dataset of images loaded by the
script, using the PCA algorithm described above.
I ran the program using a full Hebrew Aleph Beth database of 270 images (10
different images per letter).
After running a few tests, the recognition success rate was about 50%, for
two main reasons: location sensitivity and similar letters with small
differences between them.
but the same image (letter) moved aside within the letter box was
incorrectly recognized as .
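The location sensitivity can be illustrated with a small sketch. It compares raw pixel vectors rather than PCA projections, so it shows only why a shift is harmful to an L2 comparison in general. It is not a reproduction of the test above. The 3 x 5 toy images and the function names are hypothetical.

```python
import math

def shift_right(image, columns):
    """Move the image content right by `columns`, padding with background (0)."""
    width = len(image[0])
    return [[0] * columns + row[:width - columns] for row in image]

def l2_distance(a, b):
    """L2 distance between two images of the same size, pixel by pixel."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    return math.dist(flat_a, flat_b)

# A vertical stroke near the left edge of the box.
letter = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
# The same stroke moved two columns to the right.
shifted = shift_right(letter, 2)
# A different shape drawn in the original location.
other = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]

same_letter_moved = l2_distance(letter, shifted)
different_letter = l2_distance(letter, other)
```

In this toy case the moved copy of the same stroke ends up farther from the original than the different shape does, so a nearest-distance rule would prefer the wrong image.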
Similar letters: among the Hebrew Aleph Beth one can find very similar
letters with very small differences between them. For example, VAV “ו” and
NUN SOFIT “ן” can be called a “similar couple”. This kind of similarity
can cause mistakes when writing with a specific font in which the letter
resembles its “similar couple partner”.
7. Discussion
The goal of my project was to create a reliable OCR using the PCA method.
After testing this method and ending up with a poor recognition rate, as
described above, one may think that I failed to reach that goal. With a
few enhancements (maybe a project for next year), however, the problems
described above can be corrected.
Centering the letters. By centering the letters within the image box
(both the dataset and the given character), the location sensitivity
problem would be solved, because all the letters would be in the same
location within the image box. This can again be done by using
clustering and/or edge detection techniques in order to find the
location of the letter within the image box and move it to the
center.
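A minimal sketch of the centering idea follows, in Python with only the standard library. It does not use clustering or edge detection. Instead it assumes a clean image in which every non-background pixel belongs to the letter, finds the bounding box of those pixels, and moves that box to the middle of the image. The function name and the toy images are hypothetical.

```python
def center_letter(image, background=0):
    """Return a copy of `image` with the letter's bounding box centered."""
    height, width = len(image), len(image[0])
    # Rows and columns that contain at least one letter pixel.
    rows = [r for r in range(height)
            if any(p != background for p in image[r])]
    cols = [c for c in range(width)
            if any(image[r][c] != background for r in range(height))]
    if not rows:
        # Empty image: nothing to move.
        return [row[:] for row in image]
    top, left = rows[0], cols[0]
    box_h = rows[-1] - top + 1
    box_w = cols[-1] - left + 1
    # Top-left corner of the box after centering (rounded down).
    new_top = (height - box_h) // 2
    new_left = (width - box_w) // 2
    centered = [[background] * width for _ in range(height)]
    for r in range(box_h):
        for c in range(box_w):
            centered[new_top + r][new_left + c] = image[top + r][left + c]
    return centered

# The same vertical stroke drawn at the left and at the right edge.
at_left = [[1, 0, 0, 0, 0] for _ in range(3)]
at_right = [[0, 0, 0, 0, 1] for _ in range(3)]

centered_left = center_letter(at_left)
centered_right = center_letter(at_right)
```

After centering, both toy images are identical, so their distance is zero regardless of where the stroke was originally drawn. With noisy scans the bounding box would be thrown off by stray pixels, which is where the clustering or edge detection mentioned above would be needed.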