Computer Vision
NTOGAS, NIKOLAOS
Postgraduate Student, MSc in Computer Science
Department of Computer Science, Technology & Telecommunications, TEI of Larisa, Greece,
& Staffordshire University, Faculty of Computing, Engineering & Technology
Abstract: Binarization methods are applied to document images to discriminate the text from the background, based on pure thresholding and filtering combined with image processing algorithms. The proposed binarization procedure consists of five discrete image processing steps for different classes of document images. A refinement technique further enhances the image quality. Results on Byzantine historical manuscripts are discussed, and potential applications and further research are proposed. The main contribution of this paper is to propose a simple and robust binarization procedure for pre-filtered historical manuscript images; simulation results are also presented.
Keywords: Image processing, document, binarization, denoising, global, local, thresholding.
Introduction: Academic libraries, institutions and historical museums pile up or preserve documents in storage areas. Our work in this paper contributes to the safe and efficient preservation of documents in their original state throughout the years and to their unconditional availability to researchers, a major issue for historical document collections that are poorly preserved and prone to degradation processes, see fig. 1. Document digitization allows access to a wider public, while cultural institutions and heritage organizations create local or national digital libraries accessed through the internet. Our work concentrates on basic techniques used for image enhancement and restoration, denoising and binarization. The entire system is implemented in a visual environment using Matlab programming and the MathWorks Image Processing Toolbox (MathWorks, 2004).
Figure 1: Byzantine manuscript; taking photos of a Byzantine manuscript. Digital photo and intensity histograms
Denoising refers to the removal of noise from the image (Sonka et al, 2008) and binarization refers to the conversion of a grayscale image to binary. Both techniques are basic stages in our image processing of Byzantine historical manuscripts. Denoising uses filtering methods that eliminate the noise, enhance the quality of text characters and make the background texture uniform (Gonzalez et al, 2002; Papamarkos, 2001). Binarization (thresholding) converts the grayscale document image to binary by changing the foreground pixels (text characters) to black and the background pixels to white. The paper presents the need for the preservation of degraded historical manuscript images by binarization, implemented by a procedure based on image preparation, type classification and refinement of images pre-filtered in the spatial (mean, median and Wiener filters) and frequency (Butterworth and Gaussian low pass filters) domains.
The work concentrates on text image enhancement and local smoothing of pixels in a neighbourhood N (MathWorks, 2004). For a median filter over n sorted neighbourhood values a_1, ..., a_n, the output is a_{(n+1)/2} when n is odd and (a_{n/2} + a_{n/2+1})/2 when n is even. The Butterworth low pass filter transfer function is

H(u,v) = 1 / (1 + [D(u,v)/D0]^{2n})

and the Gaussian low pass filter is

H(u,v) = e^{-D^2(u,v) / (2σ^2)}

where D(u,v) is the distance from the centre of the frequency spectrum and D0 is the cutoff frequency.

ISSN: 1790-5117
ISBN: 978-960-6766-84-8
12th WSEAS International Conference on COMMUNICATIONS, Heraklion, Greece, July 23-25, 2008
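The two low pass transfer functions above can be sketched in NumPy (an illustrative re-creation, not the paper's Matlab Toolbox code; the parameter `d0` stands for the cutoff D0 and, in the Gaussian case, plays the role of σ):

```python
import numpy as np

def lowpass_mask(shape, d0, kind="butterworth", n=2):
    """Build a centred low pass transfer function H(u, v)."""
    rows, cols = shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    # D(u, v): distance of each frequency from the centre of the spectrum
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
    if kind == "butterworth":
        # H(u, v) = 1 / (1 + [D/D0]^(2n))
        return 1.0 / (1.0 + (D / d0) ** (2 * n))
    # Gaussian: H(u, v) = exp(-D^2 / (2 * d0^2))
    return np.exp(-(D ** 2) / (2.0 * d0 ** 2))

def lowpass_filter(img, d0, kind="butterworth", n=2):
    """Apply the transfer function to an image in the frequency domain."""
    F = np.fft.fftshift(np.fft.fft2(img))
    H = lowpass_mask(img.shape, d0, kind, n)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))
```

Both masks equal 1 at the spectrum centre (D = 0) and fall off with distance, so low frequencies (smooth background) pass while high-frequency noise is attenuated.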
Binarization - Thresholding:
Robust binarization enables correct extraction of the sketched line drawing or text from its background. Many algorithms have been implemented for the binarization of images. Thresholding is a sufficiently accurate, high-speed segmentation approach for monochrome images. This paper describes a modified logical thresholding method for the binarization of seriously degraded and very poor quality gray-scale document images. The method can deal with complex signal-dependent noise and variable background intensity caused by non-uniform illumination, shadow, smear or smudge, and very low contrast. The resulting binary image has no obvious loss of useful information. Firstly, we analyse the clustering and
Global thresholding
The simplest implementation of thresholding is to choose an intensity value as a threshold level: values below this threshold become 0 (black) and values above it become 1 (white). If T is the global threshold of image f(x,y) and g(x,y) is the thresholded image, then:

g(x,y) = 1, if f(x,y) ≥ T
g(x,y) = 0, otherwise
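The rule above can be sketched in NumPy (an illustrative sketch, not the paper's Matlab code):

```python
import numpy as np

def global_threshold(f, T):
    """Binarize: g(x, y) = 1 where f(x, y) >= T, else 0."""
    return (f >= T).astype(np.uint8)

# Example: pixels at or above T = 150 map to 1 (white), the rest to 0 (black)
f = np.array([[10, 200],
              [90, 150]])
g = global_threshold(f, 150)
```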
Otsu's method
Among the global techniques the most efficient is Otsu's technique [7]. Otsu's method applies clustering analysis to the grayscale data of the input image and models the pixels as two clusters of Gaussian distributions. The optimal threshold minimizes the within-class variance of the two classes of pixels.
Figure 2: Global threshold (a) grayscale image (b) T=80 (c) T=150
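Otsu's criterion can be illustrated with a small NumPy implementation (a sketch, not the paper's Matlab code): the threshold that maximizes the between-class variance also minimizes the within-class variance.

```python
import numpy as np

def otsu_threshold(img):
    """Exhaustive search over a 256-bin histogram for the threshold that
    maximizes the between-class variance w0*w1*(mu0 - mu1)^2."""
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    p = hist / hist.sum()
    levels = np.arange(256)
    mu_total = (p * levels).sum()
    best_t, best_sigma = 0, -1.0
    w0 = 0.0       # cumulative weight of the dark class
    mu0_sum = 0.0  # cumulative first moment of the dark class
    for t in range(255):
        w0 += p[t]
        mu0_sum += t * p[t]
        w1 = 1.0 - w0
        if w0 == 0.0 or w1 == 0.0:
            continue
        mu0 = mu0_sum / w0
        mu1 = (mu_total - mu0_sum) / w1
        sigma_b = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance
        if sigma_b > best_sigma:
            best_sigma, best_t = sigma_b, t
    return best_t
```

On a clearly bimodal histogram (e.g. dark ink on bright parchment) the returned threshold falls between the two modes.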
Local thresholding

Niblack's method
Niblack's method is based on the calculation of the local mean and the local standard deviation (Niblack, 1986). The threshold at pixel (x,y) is decided by the expression:

T(x,y) = m(x,y) + k·s(x,y)

where m(x,y) and s(x,y) are the average and the standard deviation of a local area, respectively. The size of the window must be large enough to suppress the noise in the image, but also small enough to preserve local details; a 15-by-15 window works efficiently. The value of k adjusts the percentage of total pixels assigned to the foreground object, especially at the boundaries of the object; a value of k = -0.2 is used. The algorithm depends on the value of k and also on the size n of the N-by-N window.

Sauvola's method
Sauvola's method is an adaptive threshold method (Sauvola et al, 2000). The local threshold (i.e., for each pixel separately) is computed from estimates of the local mean and local standard deviation. The threshold T(x,y) at pixel (x,y) is defined by the relation:

T(x,y) = m(x,y)·[1 + k·(s(x,y)/R − 1)]

where R is the dynamic range of the standard deviation (typically R = 128 for 8-bit images).

Bernsen's method
Bernsen's method calculates the local threshold as the mean of the minimum and maximum intensities of the pixels within a window (Papamarkos, 2001). If the window is centred at the pixel (x,y), the threshold for I(x,y) is defined by:

T(x,y) = (Zmax + Zmin) / 2

where Zmax and Zmin are the maximum and minimum pixel intensities within the window.
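The Niblack and Sauvola rules can be sketched with SciPy (an illustrative sketch, not the paper's Matlab experiments; the window size `w = 15`, `k` values and dynamic-range constant `R = 128` follow the values discussed above and the Sauvola literature):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_stats(img, w=15):
    """Local mean m(x,y) and standard deviation s(x,y) over a w-by-w window."""
    img = img.astype(np.float64)
    m = uniform_filter(img, size=w)
    m2 = uniform_filter(img * img, size=w)
    s = np.sqrt(np.maximum(m2 - m * m, 0.0))  # clamp tiny negatives from rounding
    return m, s

def niblack(img, w=15, k=-0.2):
    """1 where the pixel is at or above T = m + k*s, else 0."""
    m, s = local_stats(img, w)
    return (img >= m + k * s).astype(np.uint8)

def sauvola(img, w=15, k=0.2, R=128.0):
    """1 where the pixel is at or above T = m*(1 + k*(s/R - 1)), else 0."""
    m, s = local_stats(img, w)
    return (img >= m * (1.0 + k * (s / R - 1.0))).astype(np.uint8)
```

With this convention 1 marks bright background and 0 marks dark strokes; inverting the output gives black text on white, as in the paper.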
TECHNICAL COMMENTS (document image categories)
1. Paper acceptable, without spots, stains, smears, aging or brightness degradation
2. Images with spots, stains, smears or smudges, with more or less background noise
3. High humidity and illumination variation causing wrinkle effects and shadows
4. Ink seeping from the other side of the page and oily pages
5. Images with thin pen strokes, i.e. requiring stroke width analysis
6. Broken characters with red ink
Stage 4: Thresholding
Thresholding is applied with the global (Otsu's) and local (Niblack, Sauvola, Bernsen) thresholding techniques on the filtered images resulting from the previous stage.
Stage 5: Refinement
A refinement procedure, based on erosion and dilation, is applied to the binarized image, so that the texture and foreground characteristics of the obtained image are further clarified relative to the background area.
Figure 3: Proposed method stages
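A minimal sketch of such a refinement step in SciPy (assumed details: a 3-by-3 structuring element for the erosion/dilation pair, and a 50-pixel minimum component size as in Table 5; the paper's own implementation is in Matlab):

```python
import numpy as np
from scipy import ndimage

def refine(binary, min_size=50):
    """Morphological opening (erosion then dilation) to clean speckle,
    then drop connected components smaller than min_size pixels."""
    fg = binary.astype(bool)
    opened = ndimage.binary_opening(fg, structure=np.ones((3, 3)))
    labels, n = ndimage.label(opened)
    if n == 0:
        return opened.astype(np.uint8)
    sizes = ndimage.sum(opened, labels, index=np.arange(1, n + 1))
    keep = np.zeros(n + 1, dtype=bool)
    keep[1:] = sizes >= min_size   # component i survives iff it has >= min_size pixels
    return keep[labels].astype(np.uint8)
```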
Denoising results
The filters applied are the Mean, Median, Wiener, Gaussian and Butterworth filters. Applying each filter with variable window sizes explored all possible denoising results:
a. Filtering improved the quality of the image, thus preparing it for binarization, see Table 2.
b. Spatial domain filtering used the Mean, Median and especially the Wiener filters.
c. Frequency domain filtering used the Butterworth and Gaussian low pass filters.
d. The paper condition is an unexpected factor.
e. Document filtering is a preliminary stage for optical character recognition.
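The spatial-domain filters above can be sketched with SciPy (an illustrative helper, not the paper's Matlab code; `scipy.signal.wiener` implements the adaptive Wiener filter):

```python
import numpy as np
from scipy.ndimage import uniform_filter, median_filter
from scipy.signal import wiener

def denoise(img, method="median", size=3):
    """Spatial-domain denoising over a size-by-size window."""
    img = img.astype(np.float64)
    if method == "mean":
        return uniform_filter(img, size=size)    # local average
    if method == "median":
        return median_filter(img, size=size)     # robust to salt-and-pepper noise
    return wiener(img, mysize=size)              # adaptive Wiener filter
```

The median filter removes isolated impulse noise without blurring edges, which is why it suits text strokes better than the mean filter.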
Method  | Cat.1 | Cat.2 | Cat.3 | Cat.4 | Cat.5 | Cat.6
--------|-------|-------|-------|-------|-------|------
Bernsen | BEST  | BAD   | BAD   | BAD   | BAD   | BEST
Niblack | BEST  | GOOD  | BEST  | GOOD  | BAD   | GOOD
Otsu    | BEST  | BAD   | BAD   | BAD   | GOOD  | GOOD
Sauvola | BEST  | BEST  | BEST  | BEST  | BAD   | GOOD

Table 3: Results from combining the 5-by-5 Wiener filter with binarization methods for each image category
Sauvola's method gave the best results in almost all of the specified image categories, see Table 3. Eikvil's and Parker's binarization methods were not included in our comparison, but a review of thresholding techniques indicated poor text detection recall for them.
Table 4:
Table 5: Final refinement (a) binary image (b) after removing connected components smaller than 50 pixels
FUTURE WORK:
Potential application fields include the automation of the combined binarization-filtering procedure by a neural network and the extension of the method to a wider range of documental or non-documental images. Parallel computational machines and perceptual optical processing techniques should further increase the method's efficiency. The application of filtering as a preliminary stage for the binarization of the document image promises a great improvement in the quality of the final images. Other filter schemes in the preliminary stage of digital pre-processing can be investigated. By converting historical documents and old newspapers (which have been degraded or partly damaged) to digital formats, we preserve them, in the form of the original document, for future reproduction. By digitizing and storing copies of old books and historical manuscripts, entire libraries can be stored electronically to preserve historical manuscripts. Such a text image storage environment and database is proposed for further research.
CONCLUSION:
No algorithm works well for all types of images, but some work better than others for particular types, suggesting that improved performance can be obtained by selecting or combining the appropriate algorithm for the type of document image under investigation. We have described algorithms that utilize spatial structure, global and local features, or both. Many algorithms require extensive preprocessing steps in order to obtain useful data to work with, because document image and data mining classification techniques are still in their infancy. The purpose of our work on text image binarization was to introduce an innovative procedure for digital image acquisition of historical documents based on image preparation, type classification and refinement of pre-filtered images.
References
1. John Ashley Burgoyne, Laurent Pugin, Greg Eustace, Ichiro Fujinaga, "A comparative survey of image binarisation algorithms for optical recognition on degraded musical sources", ISMIR 2007, p. 509.
2. Efthimios Badekas, Nikos Nikolaou, Nikos Papamarkos, "Text Binarization in Color Documents", Electrical and Computer Engineering Dept., Image
33. Sue Wu, Adnan Amin, "Automatic Thresholding of Gray-level Using Multi-stage Approach", Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR 2003), IEEE, 2003.
34. Yanowitz, D.L. and Bruckstein, A.M., "A new method for image segmentation", Computer Vision, Graphics and Image Processing, vol. 46, no. 1, 1989, pp. 82-95.
35. Yibing Yang and Hong Yan, "An adaptive logical method for binarization of degraded document images", Pattern Recognition, vol. 33, no. 5, May 2000, pp. 787-807, Pattern Recognition Society, Elsevier Science.
36. Zhang, Z. and Tan, C., "Restoration of images scanned from thick bound documents", in Proceedings of the International Conference on Image Processing 2001, vol. 1, 2001, pp. 1074-1077.
N. Ntogas received the diploma degree in Applied Informatics at the Economic University of Athens, Greece, in 1989. He finished the School of Pedagogical and Technological Education of Thessalonica, Greece, in 1998. He attained an MSc in Computer Science at Staffordshire University, United Kingdom, in cooperation with TEI of Larissa, Greece, in 2007. He worked as a computer programmer and analyst and was responsible for the deployment of the computerization of the Municipality of Trikala, Greece. Since 1998 he has worked as a teacher in the department of Computer Laboratories of the 1st Technical Institution of Trikala. His research interests include image processing, historical document acquisition, etc. This work is part of his MSc Dissertation by Research.
Email: [email protected].
Dr. D. E. Ventzas (1956), SMISA, is an Electronic Engineer and Professor at the Technological Institute of Larissa, Greece. He holds an MSc in Control Engineering and a PhD in Microprocessor-based Instrumentation from Bradford University, Yorkshire, UK. He was an Instrument and Systems Engineer at the Hellenic Aspropyrgos Refinery SA, Athens. His research interests lie in Signal and Image Processing, Process Control and Instrumentation, Biomedical Engineering and Computer Tools for Instrumentation. He is the author of many research and review papers in English and books in Greek. He has supervised many undergraduate diploma theses and postgraduate dissertations.
Email: [email protected], [email protected]