Information System For Graphological Ide PDF
Abstract- In this paper an automated information system is presented that classifies scripts to their corresponding writers using graphology. The methodology is based on the idea of creating a representative of each alphabet symbol in each script via proper fitting of all realizations of the specific symbol in it. The decision for writer identification is based on pair-wise comparisons of statistical quantities computed for all representatives. The system was applied to ancient Greek inscriptions of the classical era, which were correctly attributed to 6 different hands.

Keywords - pattern recognition, graphology, handwriting classification, archaeology, writer identification

I. INTRODUCTION

Ancient inscriptions were chosen mainly because they form one of the most important primary sources of information about the ancient world, and in particular ancient Greece [1]. In addition, ancient inscriptions contain capital letters, which are easier to represent satisfactorily, and inscriptions are also carved following stricter rules than ink-written scripts. The main difficulties in studying and interpreting inscriptions on stone are that, as a rule, they are unsigned, undated and very fragmentary. As a consequence, they are often difficult to date; yet dating them, and so being able to give them their correct historical context, is crucial to unlocking the valuable information they contain. Identifying the writer who carved an inscription allows for correct and unambiguous dating of it. For example, during the Classical era in Attica, individual inscribers of decrees had working careers of limited duration. Suppose that one can identify, via image processing and pattern recognition, the characteristics of the writing of a certain workman and that at least one of the inscriptions carved by this writer can be dated on the basis of the inscription context and/or archaeological arguments. Then any other inscription or fragment which can be identified as being carved by the same writer “gains a date”. Thus, the development of an information system that will perform automatic identification of writers and classification of the inscriptions according to the hand that carved them provides a very powerful tool for dating.

Some of the major difficulties one may encounter in the development of such a system are:
(a) the essential variability of the shape of each letter in the same inscription and by the same writer,
(b) quite commonly, the similarity between two specific samples of the same writer is smaller than the similarity between various samples by different writers,
(c) inscriptions suffer from serious wear.
In addition, we point out that there is substantial variability in the form of the alphabet symbol realizations generated by the same writer.

Writer identification and verification, employing handwritten text on paper or a paper-like surface (e.g., a pen tablet computer), in both the on-line and off-line domain, has received attention from various researchers, e.g. [2], [3], [4], [5], [6], [7], [8], [9], [10], [14]. In [11] and [12] an approach based on analyzing both the texture and allograph levels is presented.

II. THE IDEAL REPRESENTATIVE OF EACH ALPHABET SYMBOL FOR EACH INSCRIPTION

A. The assumption of the existence of a “platonic” prototype for each writer and all alphabet symbols

We make the working assumption that the writer had in his mind an ideal prototype for each alphabet symbol, whose contour corresponds to the curve $\vec{r}_M(t)$. We call this the platonic prototype and thus we further assume that for each alphabet symbol and each writer separately a
Authorized licensed use limited to: National Technical University of Athens. Downloaded on January 29, 2010 at 06:03 from IEEE Xplore. Restrictions apply.
distinct platonic prototype exists. Due to instability in the carver's hands, random interaction between chisel and marble, variations in the writer's posture and mood, variability of the employed carving instruments and in the material of the plate, etc., the writer carved $\vec{r}_M(t) + \vec{n}(t)$ instead, where $\vec{n}(t)$ incorporates the related noise. We have developed a novel methodology to suppress the noise $\vec{n}(t)$ from different realizations of the same alphabet symbol in an inscription. This method consists in rotating, translating and resizing all symbol realizations of the inscription, so that they optimally match according to an introduced criterion. By properly averaging the matched contours, a realization of the ideal prototype for each alphabet symbol and each inscription separately is generated. We call this the ideal representative or “platonic” realization of this alphabet symbol.

B. Estimation of the platonic representative from the letters of an inscription

First estimation: The method to obtain a first estimation of the platonic prototype consists of the following:

(i): Consider an inscription E1 and any letter, say M, realizations of which appear in E1. The contour (Π1) of one letter realization is randomly chosen and placed at the center of a selected frame.

(ii): The contour (Π2) of some other realization of letter M is randomly chosen and we apply to it: a) rotation around its center, b) resizing, c) parallel translation, so that Π2 optimally matches Π1 via the minimization of the quantity

$$\varepsilon = \left( \sum_{k=1}^{N_S} \sum_{i=1}^{N_{k,1}} d^2(P_i, \Pi_{2,k}) + \sum_{k=1}^{N_S} \sum_{j=1}^{N_{k,2}} d^2(Q_j, \Pi_{1,k}) \right) \Big/ 2, \qquad (1)$$

where $N_S$ is the number of sides of the considered letter, $\Pi_{1,k}$ is the k-th side of Π1 and $\Pi_{2,k}$ that of Π2, $P_i$, $Q_j$ are arbitrary points of the k-th side of Π1, Π2 respectively, $N_{k,1}$ the [...] the entire Π2 is calculated not for each side separately, then the two realizations may match in completely erroneous positions.

(iii): A kind of “mean curve” of Π1, Π2, say M1, is evaluated as follows: consider Π1 and Π2 in their best matching position obtained in step (ii) and in particular two matching sides $s_{i,1}$ and $s_{i,2}$ of the two letters. We compute the average curvature of the ensemble $s_{i,1}$ and $s_{i,2}$ via the circle that best fits this ensemble in the Least Squares sense. Then, the center O of this circle is considered to be an observation point of the ensemble: we construct the minimal angle $\Theta_i$ with vertex at O that includes all pixels of $s_{i,1}$ and $s_{i,2}$ (Fig. 1). Next, if $\Theta_i \geq \pi/6$ we divide angle $\Theta_i$ into $N_i$ small angles $\delta\Theta_j$ = κ 2, where κ is the curvature of the circle. In each angle $\delta\Theta_j$ we compute the mean value of the coordinates of the pixels that belong to it, thus obtaining a single point $A_{j,i}$ (see Fig. 1, where the approach is demonstrated for an arbitrary number of matching sides).

Figure 1. Determining the observation point O and the sequence of successive angles $\delta\Theta_j$ that cover the ensemble of matched letter contour realizations
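The matching of step (ii) can be illustrated with a minimal sketch. It assumes that each letter is given as a list of sides, each side being a list of (x, y) contour samples, and that the distance d from a point to a side is approximated by the distance to the nearest sample. The function names and the simple pattern search are hypothetical choices made for illustration; the paper does not state which optimizer was used, and the trailing factor of Eq. (1) is read here as a division by 2.

```python
import math

def centroid(sides):
    """Centroid of all samples of a letter given as a list of sides."""
    pts = [p for side in sides for p in side]
    return (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))

def apply_similarity(sides, angle, scale, tx, ty, center):
    """Rotate about `center`, resize, then translate every side of a letter."""
    c, s = math.cos(angle), math.sin(angle)
    cx, cy = center
    return [[(cx + scale * (c * (x - cx) - s * (y - cy)) + tx,
              cy + scale * (s * (x - cx) + c * (y - cy)) + ty)
             for x, y in side] for side in sides]

def sq_dist_to_side(p, side):
    """Squared distance from p to the nearest sample of a side (approximates d^2)."""
    return min((p[0] - qx) ** 2 + (p[1] - qy) ** 2 for qx, qy in side)

def matching_error(sides1, sides2):
    """Symmetric, side-by-side squared error in the spirit of Eq. (1).

    Assumption: the trailing factor is taken as a division by 2; a constant
    factor does not change the minimizing transformation.
    """
    total = 0.0
    for s1, s2 in zip(sides1, sides2):
        total += sum(sq_dist_to_side(p, s2) for p in s1)
        total += sum(sq_dist_to_side(q, s1) for q in s2)
    return total / 2.0

def match(sides1, sides2, rounds=40):
    """Pattern search over (angle, scale, tx, ty) applied to sides2."""
    center = centroid(sides2)
    params = [0.0, 1.0, 0.0, 0.0]
    steps = [0.2, 0.1, 1.0, 1.0]
    best = matching_error(sides1, apply_similarity(sides2, *params, center))
    for _ in range(rounds):
        improved = False
        for i in range(4):
            for sign in (1.0, -1.0):
                trial = list(params)
                trial[i] += sign * steps[i]
                if trial[1] <= 0.0:  # keep the scale positive
                    continue
                err = matching_error(sides1, apply_similarity(sides2, *trial, center))
                if err < best:
                    best, params, improved = err, trial, True
        if not improved:  # refine the search once no move helps
            steps = [s / 2.0 for s in steps]
    return params, best
```

Summing the error side by side, rather than over the whole contour at once, follows the remark in step (ii) that a whole-contour error may lock the two realizations into an erroneous position.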
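The mean-curve construction of step (iii) can be sketched as follows, under stated assumptions. The circle is fitted with the algebraic (Kasa) least-squares formulation, which is only one way to fit a circle "in the Least Squares sense" and is not necessarily the one used by the authors. The angular bin width `delta` is a hypothetical parameter supplied by the caller, the names `fit_circle` and `mean_curve` are illustrative, and the check of the angle against π/6 is left out.

```python
import math

def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_circle(points):
    """Algebraic least-squares circle x^2 + y^2 + D x + E y + F = 0.

    Returns ((cx, cy), radius). The curvature is 1 / radius.
    """
    n = float(len(points))
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sz = sum(x * x + y * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    b = [-sxz, -syz, -sz]
    det = _det3(a)
    sol = []
    for col in range(3):  # Cramer's rule on the normal equations
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        sol.append(_det3(m) / det)
    d, e, f = sol
    cx, cy = -d / 2.0, -e / 2.0
    return (cx, cy), math.sqrt(cx * cx + cy * cy - f)

def mean_curve(points, delta):
    """Average the pixels of matched sides inside successive angles of width delta.

    `points` holds the pixels of all matched sides together. Angles are
    measured from the fitted center O, relative to the direction of the
    point centroid, which assumes the ensemble spans less than a full turn.
    """
    (cx, cy), _ = fit_circle(points)
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    ref = math.atan2(my - cy, mx - cx)
    angles = []
    for x, y in points:
        a = math.atan2(y - cy, x - cx) - ref
        angles.append(math.atan2(math.sin(a), math.cos(a)))  # wrap to (-pi, pi]
    lo = min(angles)
    bins = {}
    for p, a in zip(points, angles):
        bins.setdefault(int((a - lo) / delta), []).append(p)
    return [(sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
            for _, pts in sorted(bins.items())]
```

Each returned point plays the role of one averaged point per small angle; passing the pixels of three or more matched sides at once gives the analogous average for more realizations.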
(v): Proceeding as in step (iii) we compute the average curve of Π1, Π2 and Π3 in an exactly analogous manner, via the average curve of $s_{i,1}$, $s_{i,2}$ and $s_{i,3}$ only.

Figure 3. The ensemble of matched contours of all letter realizations in the considered inscription and the first estimation I1 of the real prototype of them, depicted in magenta.

Second estimation of the ideal prototype of the letter: All contours Π1,..., ΠN of the realizations of the specific letter in the considered inscription are optimally matched to I1, again via rotation, resizing, translation and minimization of the same norm L as in step (ii). After the completion of the matching process, the mean curve of the optimally matched contours is evaluated as in step (v) and it is considered to be a better approximation I2 of the specific letter's ideal prototype for the inscription in hand.

[...]

$$d = \frac{\left( S_1^2 / N_1^{\Sigma} + S_2^2 / N_2^{\Sigma} \right)^2}{\dfrac{\left( S_1^2 / N_1^{\Sigma} \right)^2}{N_1^{\Sigma} - 1} + \dfrac{\left( S_2^2 / N_2^{\Sigma} \right)^2}{N_2^{\Sigma} - 1}}$$

degrees of freedom and density function $f(t)$ [16], and quantity

$$F_{k,j} = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2}, \qquad (3)$$

follows Snedecor’s F distribution with $(N_{\Sigma,1} - 1,\; N_{\Sigma,2} - 1)$ degrees of freedom and density function $h(F)$. Quantity

$$\prod_{L \text{ for all letters}} f(t_{i,L}) \, h(F_{i,L}), \qquad (4)$$

is used to obtain maximum likelihood, and corresponding statistical hypotheses are defined to decide whether inscriptions E1 and E2 were carved by the same hand.

[...] these prototypes come from the same hand we expect a very good approximation like in Fig. 4. On the contrary, if two ideal representatives come from different writers the matching result will be like the one shown in Fig. 5, where an example of the letter M is depicted.
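As a rough illustration of how the quantities entering Eq. (4) might be evaluated, the sketch below computes, per letter, a two-sample t statistic with the degrees of freedom d given above, the variance ratio of Eq. (3), and the logarithm of the product in Eq. (4). Several points are assumptions rather than statements of the paper: the t statistic is taken to be the standard unequal-variance (Welch) form, which the degrees-of-freedom formula suggests; the same-hand hypothesis is taken to mean equal population means and variances, so that F reduces to S1²/S2²; and the per-letter samples `a` and `b` stand for whatever scalar measurements are compared, which the surviving text does not specify. All function names are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t_and_dof(a, b):
    """t statistic and degrees of freedom d for two samples with unequal variances."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2  # S^2 / N for each sample
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    d = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, d

def student_t_logpdf(t, nu):
    """Log density of Student's t distribution with nu degrees of freedom."""
    return (math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi)
            - (nu + 1) / 2.0 * math.log1p(t * t / nu))

def f_logpdf(x, d1, d2):
    """Log density of Snedecor's F distribution with (d1, d2) degrees of freedom."""
    log_beta = (math.lgamma(d1 / 2.0) + math.lgamma(d2 / 2.0)
                - math.lgamma((d1 + d2) / 2.0))
    return (d1 / 2.0 * math.log(d1 / d2) + (d1 / 2.0 - 1.0) * math.log(x)
            - (d1 + d2) / 2.0 * math.log1p(d1 * x / d2) - log_beta)

def log_likelihood(samples_by_letter):
    """Log of the product in Eq. (4): sum over letters of log f(t) + log h(F).

    `samples_by_letter` maps a letter to a pair (a, b) of measurement lists,
    one list per inscription (a hypothetical data layout).
    """
    total = 0.0
    for a, b in samples_by_letter.values():
        t, d = welch_t_and_dof(a, b)
        ratio = variance(a) / variance(b)  # Eq. (3) with equal sigmas assumed
        total += student_t_logpdf(t, d) + f_logpdf(ratio, len(a) - 1, len(b) - 1)
    return total
```

Working with the logarithm turns the product over letters into a sum, which avoids numerical underflow when many letters are combined; a larger value indicates that the two inscriptions are more compatible with the same-hand hypothesis under these assumptions.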
[...] prototypes that manifest impressive similarities. However, eventually, there always seems to be a sufficient number of letters that offer clear distinction between hands.

In addition, future work should include the automated recognition of ancient letters/characters, together with an improved image segmentation system, so as to include inscriptions that have suffered serious wear. Exploiting the depths of the letters on the marble would also accelerate the methodology and would make the segmentation easier and more accurate; this requires a three-dimensional scanner for obtaining a 3D representation of the inscriptions.

Finally, the proposed method could be employed in writer identification or classical graphology, offering authentication of whether a record is genuine or not.

ACKNOWLEDGMENT

The authors would like to thank the expert archaeologist/epigraphist Prof. St. Tracy, former Director of the American School of Classical Studies, now at the Institute for Advanced Study in Princeton, for his cooperation in this work and his comments, as well as for verifying the results.

REFERENCES

[1] S. V. Tracy, “Attic Letter-Cutters of 300 to 229 B.C.”, Athens and Macedon, Berkeley Ed., 2003.
[2] E. N. Zois and V. Anastassopoulos, “Morphological waveform coding for writer identification”, Pattern Recognition, 33:385-398, 2000.
[3] H. Said, T. Tan, and K. Baker, “Personal identification based on handwriting”, Pattern Recognition, 33(1):149-160, January 2000.
[4] S.-H. Cha, S. N. Srihari, “Multiple feature integration for writer verification”, in L. R. B. Schomaker & L. G. Vuurpijl (Eds.), Proc. of the 7th Int. Workshop on Frontiers in Handwriting Recognition, pp. 333-342, 2000.
[5] U.-V. Marti, R. Messerli, and H. Bunke, “Writer identification using text line based features”, Proc. of 6th ICDAR, pp. 101-105, Seattle, USA, September 2001.
[6] A. Bensefia, T. Paquet, and L. Heutte, “Handwriting analysis for writer verification”, Proc. of 9th IWFHR, pp. 196-201, Tokyo, Japan, 26-29 October 2004.
[7] B. Zhang, S. N. Srihari, S. Lee, “Individuality of Handwritten Characters”, 7th Int. Conf. on Document Analysis and Recognition, Edinburgh, Scotland, August 3-6 (Paper ID: 527), 2003.
[8] A. Schlapbach, H. Bunke, “A writer identification and verification system using HMM based Recognizers”, Pattern Analysis & Applications, Vol. 10, No. 1, February 2007.
[9] Zhenyu He, Bin Fang, Jianwei Du, Yuan Yan Tang and Xinge You, “A Novel Method for Off-line Handwriting-based Writer Identification”, Proc. of the 2005 8th Int. Conf. on Document Analysis and Recognition (ICDAR’05), 1520, IEEE, 2005.
[10] A. Bensefia, T. Paquet and L. Heutte, “A writer identification and verification system”, Pattern Recognition Letters, 26:2080-2092, 2005.
[11] M. Bulacu, L. Schomaker, “Text-Independent Writer Identification and Verification Using Textural and Allographic Features”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 4, April 2007.
[12] L. Schomaker, M. Bulacu, “Automatic writer identification using connected-component contours and edge-based features of uppercase western script”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6):787-798, June 2004.
[13] R. G. Casey, E. Lecolinet, “A survey of methods and strategies in character segmentation”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(7):690-760, 1996.
[14] V. Pervouchine, G. Leedham, “Extraction and analysis of forensic document examiner features used for writer identification”, Pattern Recognition, 40, 2007.
[15] St. Tracy, C. Papaodysseus, P. Rousopoulos, M. Panagopoulos, D. Fragoulis, D. Dafi, Th. Panagopoulos, “Identifying Hands on Ancient Athenian Inscriptions: First Steps towards a Digital Approach”, Archaeometry, Vol. 49, Issue 4, p. 749, November 2007.
[16] G. E. Kokolakis, “Bayesian Classification and Classification Performance for Independent Distributions”, IEEE Trans. Inform. Theory, IT-27, pp. 419-421, 1981.
[17] M. Panagopoulos, C. Papaodysseus, P. Rousopoulos, D. Dafi, S. Tracy, “Automatic Writer Identification of Ancient Greek Inscriptions”, IEEE Transactions on Pattern Analysis and Machine Intelligence, in press.