2016 15th International Conference on Frontiers in Handwriting Recognition

Line-of-Sight Stroke Graphs and Parzen Shape Context Features for Handwritten Math Formula Representation and Symbol Segmentation
Lei Hu and Richard Zanibbi
Department of Computer Science, Rochester Institute of Technology, Rochester, USA
[email protected], [email protected]

Abstract—This paper presents a new representation for handwritten math formulae: a Line-of-Sight (LOS) graph over handwritten strokes, computed using stroke convex hulls. Experimental results using the CROHME 2012 and 2014 datasets show that LOS graphs capture the visual structure of handwritten formulae better than commonly used graphs such as Time-series, Minimum Spanning Trees, and k-Nearest Neighbor graphs. We then introduce a shape context-based feature (Parzen window Shape Contexts (PSC)) which is combined with simple geometric features and the distance in time between strokes to obtain state-of-the-art symbol segmentation results (92.43% F-measure for CROHME 2014). This result is obtained using a simple method, without use of OCR or an expression grammar. A binary random forest classifier identifies which LOS graph edges represent stroke pairs that should be merged into symbols, with connected components over merged strokes defining symbols. Line-of-Sight graphs and Parzen Shape Contexts represent visual structure well, and might be usefully applied to other notations.

Keywords: Line-of-Sight graph, symbol segmentation, handwritten math recognition, shape contexts

I. INTRODUCTION

Math expressions are an essential part of scientific communication. Recognizing handwritten expressions written on tablets and other touch-sensitive devices would be helpful in document editing, mathematics education applications, and search engines that support mathematical notation in queries. In this paper, we are interested in recognizing Symbol Layout Trees (SLTs) for expressions, which represent expression appearance by a set of symbols with spatial relationships between them (e.g., Right-adjacent, Subscript, Above [1]). SLTs represent information similar to LaTeX formulae, but without formatting information.

Recognizing handwritten formulae requires three main tasks: symbol segmentation, symbol recognition and structural analysis. While often implicit in the literature [1], all tasks require a graph-based representation for handwritten strokes in the expression. These stroke graphs constrain stroke and symbol relationships considered while searching for the best interpretation of a formula. An ideal stroke graph contains enough edges to represent all spatial relationships and symbols (i.e., perfect recall), while containing as few extraneous edges as possible (i.e., high precision). We want to be able to express the correct expression, but we also want few additional edges so that we increase the likelihood of producing the correct interpretation.

In this paper, we propose Line-of-Sight (LOS) stroke graphs for representing handwritten formulae written online, and introduce a symbol segmentation technique using LOS graphs and new Parzen window-modified Shape Context features (PSC). We first present existing stroke graph representations in Section II, and then algorithms for constructing LOS stroke graphs in Section III. In Section IV we compare different graph representations using the CROHME competition benchmarks [2], and find that LOS graphs are able to represent the most expressions correctly, while still having a reasonable Precision. In Section V we present our LOS-based segmenter using PSC features, which obtains state-of-the-art symbol segmentation results for the CROHME 2014 data set. We then conclude and identify directions for future work in Section VI.

II. HANDWRITTEN STROKE GRAPH REPRESENTATIONS

In this Section we briefly introduce stroke-level graph representations used to parse formulae written online. Nodes represent individual strokes, while edges represent possible relationships between strokes, such as identifying strokes belonging to the same symbol, and identifying spatial relationships between symbols such as right-adjacency (R) or superscript (Sup). Details regarding using stroke graphs to represent formula appearance may be found elsewhere [2].

The graph types below define edge subsets for the complete graph with \binom{n}{2} = n(n-1)/2 undirected edges between all stroke pairs. The motivation to use more compact graphs is to reduce the number of irrelevant edges, making both training and parsing more efficient and accurate. However, pruning stroke pairs can reduce the space of representable expressions, as we will later show in Section IV.

Time-series. Time-series graphs representing the sequence in which strokes are written are common [3], [4]. Many current systems that parse formulae using a modified Cocke-Younger-Kasami (CYK) algorithm consider strokes in time order [2]. A Time-series graph can be represented using an undirected edge between each stroke and its successive stroke (except for the final stroke). Time-series graphs are unable to directly represent formulae with delayed strokes (e.g., the dot for an 'i' written after writing other symbols) or non-local relationships. These graphs are compact; as sequences they are a restricted form of tree, with n − 1 undirected edges for n strokes.

k-NN Graph. Others such as Eto and Suzuki [5] have used k-Nearest Neighbor graphs (k-NN). In a k-NN graph, there is an undirected edge from each stroke to each of its k nearest neighboring strokes. This allows strokes that are nearby in space but not necessarily time to be related in the graph (e.g., the dot of an 'i' written after a delay). k-NN graphs are less compact than Time-series, with O(kn) edges. Smaller values of k may produce a disconnected graph, splitting a formula into two or more sub-expressions.

Minimum Spanning Tree (MST). Matsakis [6] uses a stroke graph where edges are defined by the Minimum Spanning Tree (MST) over strokes, based on the distance between stroke bounding boxes. The MST is more compact than k-NN graphs, with n − 1 undirected edges, and guarantees that the graph is connected. A limitation is that edges for relationships between non-neighboring strokes may be absent.

Delaunay Triangulation. Hirata et al. employ Delaunay Triangulation (DT) for stroke graphs [7]. A DT for a set of 2D points S is a triangulation where no point in S is inside the circumcircle of any triangle [8]. It is the dual structure of the Voronoi diagram. Each point in the triangulation is adjacent to all vertices (strokes) in attached triangles. Like MSTs, Delaunay Triangulations guarantee a connected graph, but have more edges: 3n − 3 − h, where h is the number of points on the convex hull. This allows more relationships to be represented in a DT than an MST.

In the next Section we propose using a new graph representation, Line-of-Sight graphs.

III. LINE-OF-SIGHT (LOS) GRAPHS OVER STROKES

We came to consider Line-of-Sight graphs after creating k-NN graphs with large values of k, and finding that expressions with large exponents were having edges between horizontally adjacent symbols pruned (e.g., for k = 2, in x^{2a} + 1, the x and + do not share an edge). However, an unobstructed line could often be drawn between strokes with relationships pruned by k-NN in cases like these. A Line-of-Sight graph [8] is a visibility graph defining which nodes can 'see' one another. For our stroke graphs, the LOS graphs define edges for strokes that can be 'seen' from the bounding box center of another stroke, or vice versa. An example LOS stroke graph is shown in Figure 1.

Figure 1: Line-of-Sight (LOS) Graph for a Handwritten Expression. Small square nodes represent bounding box centers for eleven handwritten strokes. Edges represent mutually 'visible' strokes. Two strokes share an edge if an unobstructed line can be drawn from one stroke's bounding box center to a point on the convex hull for the other stroke (see Algorithm 1).

Algorithm 1 constructs LOS graphs from handwritten stroke set S. For a given stroke s ∈ S, we consider whether other strokes are visible by incrementally blocking angles covered by strokes. Strokes other than s are sorted by the smallest distance between two sample points: one point is from s, while the other point is from the other stroke. To test their visibility, strokes are represented by their convex hull [8]: the smallest convex polygon containing sample points in the stroke. While 'looking' at each stroke t from sc, if we find an unobstructed angle range between sc and the convex hull for t, we define an undirected edge between s and the other stroke (as two directed edges). We then remove the visible angle range for t from the set of unblocked angle intervals U. Figure 2 illustrates how visible and blocked angles are defined using convex hulls.

Algorithm 1: Line-of-Sight (LOS) Graph Construction
Input: S, the set of handwritten strokes for an expression
  let E = ∅ be an empty edge set
  for each stroke s ∈ S with bounding box center sc = (x0, y0) do
    let the unblocked angle interval set U = {[0, 2π]} radians
    for t ∈ S − s, by increasing distance from s, do
      let θmin = +∞, θmax = −∞
      for each node n = (xh, yh) in the convex hull of t do
        let vector w = n − sc = (xh − x0, yh − y0)
        let the angle θ between w and the horizontal h = (1, 0) be
            θ = arccos( w · h / (‖w‖ ‖h‖) )         if yh ≥ y0
            θ = 2π − arccos( w · h / (‖w‖ ‖h‖) )    if yh < y0
        let θmin = min(θmin, θ), θmax = max(θmax, θ)
      let the hull interval h = [θmin, θmax]
      if V = {u ∩ h : u ∈ U} contains a non-empty interval then
        let E = E ∪ (s, t) ∪ (t, s)   (sc 'sees' t)
        let U = { u − v0 − ... − vn : u ∈ U }, where V = {v0, ..., vn}
  return E (LOS stroke edges)
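For illustration, the following is a minimal Python sketch of the construction in Algorithm 1 (an independent sketch, not the system's implementation). It assumes each stroke is a list of (x, y) sample points, uses SciPy's ConvexHull for the hulls, and, like the pseudocode above, treats a hull's angular extent as a single [θmin, θmax] interval (wrap-around at 0/2π is ignored). Function names are illustrative.

```python
# Sketch of Algorithm 1: block angular intervals while visiting strokes in order
# of increasing closest-point distance; emit directed edges for visible strokes.
import math
import numpy as np
from scipy.spatial import ConvexHull


def bbox_center(stroke):
    pts = np.asarray(stroke, dtype=float)
    return (pts.min(axis=0) + pts.max(axis=0)) / 2.0


def hull_points(stroke):
    pts = np.asarray(stroke, dtype=float)
    if len(pts) < 3:
        return pts
    try:
        return pts[ConvexHull(pts).vertices]
    except Exception:                       # e.g., collinear strokes
        return pts


def angle_interval(center, hull):
    """Angular extent [theta_min, theta_max] of a hull as seen from center."""
    angles = [math.atan2(y - center[1], x - center[0]) % (2 * math.pi)
              for x, y in hull]
    return min(angles), max(angles)


def subtract(intervals, lo, hi):
    """Remove the angle range [lo, hi] from a list of (a, b) intervals."""
    out = []
    for a, b in intervals:
        if hi <= a or lo >= b:
            out.append((a, b))
            continue
        if a < lo:
            out.append((a, lo))
        if hi < b:
            out.append((hi, b))
    return out


def closest_point_distance(s, t):
    ps, pt = np.asarray(s, float), np.asarray(t, float)
    return np.sqrt(((ps[:, None, :] - pt[None, :, :]) ** 2).sum(-1)).min()


def line_of_sight_edges(strokes):
    edges = set()
    for i, s in enumerate(strokes):
        center = bbox_center(s)
        unblocked = [(0.0, 2 * math.pi)]
        others = sorted((j for j in range(len(strokes)) if j != i),
                        key=lambda j: closest_point_distance(s, strokes[j]))
        for j in others:
            lo, hi = angle_interval(center, hull_points(strokes[j]))
            # Visible if some unblocked interval overlaps the hull interval.
            if any(max(a, lo) < min(b, hi) for a, b in unblocked):
                edges.add((i, j))
                edges.add((j, i))
            unblocked = subtract(unblocked, lo, hi)
    return edges
```

Here `line_of_sight_edges` returns directed edge pairs over stroke indices, matching the two directed edges per visible stroke pair described above.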

Figure 2: Line-of-Sight Illustration (adapted from [8]). 'Looking' from point p, the purple and blue arcs represent the angle ranges blocked by Polygons A and B, ignoring other polygons in the scene (h in Algorithm 1). The green arc represents the visible range of Polygon B (V), taking into account the portion blocked by Polygon C. The red lines are sight lines to vertices blocked by Polygon C.

Algorithm 2 recovers missing labeled edges in stroke graph representations, including LOS, to make sure that edge labels represent a valid SLT. Connected components of merge-labeled stroke edges are converted into cliques (i.e., all pairs of strokes in a symbol share a merge-labeled edge), and all strokes in a symbol are given the same relationship with strokes in other symbols. The algorithm ignores conflicting undefined (_) labels and assumes no conflicting relationship labels, which is valid for well-defined SLTs. Note that if we have a labeled stroke graph, we can pass it in both inputs for Algorithm 2 to ensure that the SLT representation is consistent, e.g., for recognition results [9]. In fact, the symbol segmenter described later in this paper uses Algorithm 2 in this way.

Algorithm 2: Assigning Labels to an Undirected Stroke Graph. Label *: merge strokes into symbol; Label _: undefined stroke pairs
Inputs:
  G = (S, E), an undirected and unlabeled stroke graph
  Gt = (S, Et, λSt, λEt), a complete labeled stroke graph with the same node set S (strokes) and label functions λSt and λEt

  let λS = λSt   (label strokes)
  let φ : S → 2^S map strokes to stroke sets for symbols; initially, φ(s) = {s}, ∀s ∈ S
  let Ec = E ∩ Et be the common edges in G and Gt
  for e = (s1, s2) ∈ Ec do
    λE(e) = λEt(e)   (label edge)
    if λE(e) = ∗   (define symbols)
      let φ(s1) = φ(s1) ∪ φ(s2), φ(s2) = φ(s1)
      for (si, sj) ∈ φ(s1) × φ(s1) where si ≠ sj do
        let λE(si, sj) = λE(sj, si) = ∗
  (refine relationships)
  for e = (s1, s2) ∈ Ec do
    if λE(e) is labeled with a relationship (i.e., not ∗ or _)
      for so ∈ φ(s1) where so ≠ s1 do
        let λE(so, s2) = λE(s1, s2)
  return (λS, λE)   (stroke and edge labels for G)
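A small Python sketch of the label transfer in Algorithm 2 is given below (an illustration, not the authors' code). It assumes directed edges are stroke-index pairs, uses '*' for the merge label and '_' as a stand-in for the undefined label, and grows symbols with a simple union-find over merge edges.

```python
# Sketch of Algorithm 2: copy ground-truth labels onto a candidate graph's edges,
# grow symbols as connected components of merge edges, convert them to cliques,
# and propagate relationship labels across strokes of the same symbol.
from itertools import permutations


def assign_labels(edges, gt_labels):
    """edges: iterable of directed (i, j) stroke pairs; gt_labels: dict (i, j) -> label."""
    labels = {e: gt_labels[e] for e in edges if e in gt_labels}

    parent = {}                         # union-find over strokes on merge edges
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    def union(a, b):
        parent[find(a)] = find(b)

    for (i, j), lab in labels.items():
        if lab == '*':
            union(i, j)

    symbols = {}                        # component root -> member strokes
    for s in parent:
        symbols.setdefault(find(s), set()).add(s)

    # Convert each symbol into a merge clique.
    for members in symbols.values():
        for i, j in permutations(members, 2):
            labels[(i, j)] = '*'

    # Refine relationships: strokes in a symbol share their siblings' labeled
    # relationships with strokes in other symbols.
    for (i, j), lab in list(labels.items()):
        if lab not in ('*', '_'):
            for other in symbols.get(find(i), {i}) - {i}:
                labels[(other, j)] = lab
    return labels
```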
Figure 3: Recovering Edge Labels with Algorithm 2. Panels: (a) handwritten strokes; (b) Time-series edges; (c) edge types; (d) recovered SLT ('z = 4'). Edges in the Time-series graph (b) are assigned corresponding labels from Ground Truth in (c). (d) is then produced from (c) by requiring strokes in a symbol to have the same spatial relationship with strokes in other symbols. The Time-series graph in (a) represents the same SLT as in Ground Truth.

Figure 3 illustrates recovering an SLT from a Time-series graph for expression z = 4, using the corresponding complete ground truth graph with labels for all strokes and stroke pairs (identical to Figure 3(d)). In Figure 3(c) and (d), there are two directed * (merge) edges between the two lines in the equals sign.

From a different perspective, Algorithm 2 can test whether a stroke graph has sufficient edges to represent labeled edges in a ground truth SLT stroke graph. We use this to test the expressivity (coverage) of different stroke graph types in the next Section.

IV. GRAPH COVERAGE EXPERIMENTS

Datasets and Performance Metrics. Our experiments test the expressivity (coverage) of LOS and other stroke graph representations. We use data from the Competition on Recognition of On-line Handwritten Mathematical Expressions (CROHME) [2]. CROHME 2012 has 1336 training and 486 test expressions; CROHME 2014 contains more structurally complex formulae, and is larger with 8834 training and 986 test expressions.

For each test expression, all graph representations are passed along with Ground Truth to Algorithm 2. Performance metrics are then computed using the resulting graph. We compare SLT graph coverage using four metrics. First, the number of expressions that can be correctly represented: if all labeled ground truth edges are recovered, the expression is represented correctly (as in Figure 3). At the level of directed edges, we also consider the Recall for labeled edges, Precision of selected edges, and F-score:

1) Recall = |Labeled Recovered Edges| / |Labeled Ground Truth Edges|
2) Precision = |Labeled Recovered Edges| / |Graph Edges|
3) F-score = 2 × (Recall × Precision) / (Recall + Precision)
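A minimal sketch of these metrics, assuming both graphs are given as sets of directed stroke-index pairs (function and argument names are illustrative):

```python
# Directed-edge coverage metrics for a candidate stroke graph against labeled
# ground-truth SLT edges.
def coverage_metrics(graph_edges, labeled_gt_edges):
    recovered = graph_edges & labeled_gt_edges            # labeled recovered edges
    recall = len(recovered) / len(labeled_gt_edges)
    precision = len(recovered) / len(graph_edges)
    f_score = (2 * recall * precision / (recall + precision)
               if recall + precision > 0 else 0.0)
    expression_ok = labeled_gt_edges <= graph_edges        # all labeled edges recovered
    return recall, precision, f_score, expression_ok
```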

Graph Construction and Distance Metrics. MST, k-NN and LOS require distances between stroke pairs for construction. We consider three Euclidean distances:

1) AC: the distance between the two strokes' averaged centers (centers-of-mass), i.e., the mean of the x and y coordinates for the set of stroke sample points.
2) BBC: the distance between bounding box centers.
3) CPP: the smallest distance between two sample points, one from each of the strokes (the Closest Point Pair).

We choose the distance metric and stroke representation based on model selection experiments [9]. AC works best for MST, while CPP distance works best for k-NN. For k-NN, 2-NN achieves the highest F-score, while 6-NN achieves the highest precision for k-NN with a recall higher than 99% [9]. For Delaunay Triangulation, we do not require a stroke pair distance, but instead need a point to represent strokes, for which we try the center-of-mass (AC) or bounding box center (BBC); AC worked best.
Results. Table I shows results for the CROHME 2012 and 2014 test sets. All graph types have stable edge metrics across the datasets. For complete graphs containing directed edges between all stroke pairs, Recall is perfect, but over 90% of the directed edges are irrelevant (Precision < 10%). The highest Precision and F-scores are obtained for MST, suggesting these edges frequently belong to the SLT. Time-series graphs have the second-highest Precision and F-score values. For expressions on a single baseline, the MST and Time-series graphs may be identical. This high Precision is partly caused by having fewer edges than the other types, which also leads to very low expression coverage (< 45%).

For k-NN, as one expects, edge Recall increases with k, while Precision decreases more rapidly. 2-NN obtains the lowest expression coverage for CROHME 2012, while 6-NN obtains the second-highest expression rate for both data sets. Using k ≥ 6 mostly decreases Precision and F-score. Delaunay obtains the next-highest expression rate. While Delaunay has higher Precision and F-score through producing fewer edges than 6-NN and LOS, its expression coverage is 13%-14% lower than 6-NN.

LOS obtains the highest expression coverage (> 98.5%) with a slightly higher edge Precision than 6-NN of roughly 30%. LOS misses fewer than 0.1% of labeled ground truth edges. Table II summarizes missing edges in the LOS results. Most missing edges are 'Right' relationships. Almost all edges with a merge label can be covered by the Line-of-Sight graph, as related strokes are often close to and can 'see' one another. In Figure 4, we see missing edges caused by using a single 'eye' at the stroke bounding box center, or completely blocked sight lines.

The nearly perfect Recall for merge edges in LOS graphs provides a strong foundation for graph-based symbol segmentation, which we discuss in the next Section. Work on parsing with LOS graphs using visual features may be found in a companion paper [10].

Table I: Coverage Comparison for Stroke Graph Types. Percentage of representable CROHME expressions (SLTs) are shown along with metrics for directed stroke pair edges.

CROHME 2012 Test (486 Expressions)
              Expr. (%)   Recall   Precision   F-score
Complete        100.00    1.000      0.087      0.159
LOS              98.56    0.999      0.309      0.472
6-NN             89.92    0.994      0.286      0.444
Delaunay         76.75    0.977      0.388      0.555
MST              36.21    0.882      0.921      0.901
Time-series      31.28    0.878      0.917      0.897
2-NN             24.90    0.879      0.708      0.784

CROHME 2014 Test (986 Expressions)
              Expr. (%)   Recall   Precision   F-score
Complete        100.00    1.000      0.091      0.167
LOS              98.99    0.999      0.297      0.458
6-NN             95.81    0.994      0.283      0.441
Delaunay         79.11    0.973      0.391      0.558
2-NN             44.93    0.885      0.685      0.773
MST              42.39    0.875      0.899      0.887
Time-series      40.77    0.868      0.891      0.879

Complete: all stroke pairs; LOS: Line-of-Sight; 2/6-NN: k-nearest neighbor; MST: Minimum Spanning Tree; Delaunay: Delaunay Triangulation.

Table II: Number of Missing Edges for Line-of-Sight Graphs (CROHME 2012/2014; Test sets correspond to Table I). Edge labels: * (merge into symbol), R (Right), Sub, Sup, Above, Below, Inside.
  2012 Train: 14 total (14 R)
  2012 Test: 13 total (10 R, 3 other relationships)
  2014 Train: 358 total (2 *, 235 R, 65 Sub, 24 Sup, 16 Above, 16 Below)
  2014 Test: 22 total (22 R)

Figure 4: Examples of Missing Right-Adjacency (R) Edges in Line-of-Sight graphs. (a) Sight lines for the leftmost 'A' and the comma are blocked (the bottom-left of the 'A' and the comma can see one another). (b) The sight line between the leftmost 'y' and the second '+' is blocked by the subscript and exponent y's.

V. SEGMENTATION USING LOS GRAPHS AND PARZEN SHAPE CONTEXTS (PSC)

In this Section we present a new technique for segmenting handwritten symbols using LOS graphs and modified Shape Context features [11]. Our segmentation algorithm is simple, using a classifier to identify which directed LOS stroke pair edges correspond to strokes that should be merged. We will now briefly review work on handwritten math symbol segmentation, provide a description of the segmentation algorithm and features, and present segmentation results for the CROHME 2012 and 2014 benchmarks.

Related Work. There have been many graph-based segmentation methods for online handwritten formulae [3], [6], [12]. Toyozumi et al. [12] use a candidate character lattice method, where the closest distance between points on two strokes along with language constraints are used to determine whether strokes should be merged. Matsakis [6] proposes a minimum spanning tree (MST) approach, where each node in the MST represents a stroke, with distances defined by the Euclidean distance between stroke bounding box centers. A limitation is that this technique only considers connected subtrees in the MST for partitioning.

Other methods include Smithies et al.'s [4] progressive segmentation method, which assumes that symbols are written one-at-a-time. After four strokes are written, the segmenter generates all possible groupings. Strokes from the highest confidence candidate symbol are removed. The process then repeats after another four strokes have been written. Kosmala et al. [13] propose a segmentation method based on Hidden Markov Models (HMM). Discrete left-to-right HMMs without skips and with differing numbers of states are used. A space model is also introduced to represent spaces between symbols. Many recent techniques perform segmentation as a sub-routine while parsing handwritten strokes using an expression grammar, e.g., using a modified Cocke-Younger-Kasami (CYK) algorithm [2].

Hu and Zanibbi [3] classify pairs of strokes in time order as merge/split, i.e., using the Time-series graph for strokes. An AdaBoost classifier with multi-scale shape contexts and symbol classification confidence features is used. In this paper we extend this work, but use random forests applied to Line-of-Sight graphs, do not use classification features, and improve the shape context features.

Stroke Preprocessing and Image Generation. To reduce the effects of sample noise, writing jitter and differences in resolution between expressions, we preprocess strokes and render the expression as a binary image. Preprocessing contains four steps: duplicate point filtering, size normalization, smoothing and resampling.

We first delete duplicate points which have the same (x, y) coordinate as the previous point, because they are uninformative. To reduce the influence of writing velocity and differences in the coordinate range and resolution for different stroke recording devices, we normalize y coordinate values to be in the interval [0, 1], while preserving the width-height aspect ratio for x coordinates. To reduce noise caused by stylus/finger jitter, we smooth all strokes: for each stroke, with the exception of the first and last points, we replace the coordinate of each point by the average of the current, previous, and next coordinates. Finally, we use linear interpolation to resample the expression and render it as a binary image. For the image we use a fixed height of 200 pixels, and then set the width to preserve the width/height aspect ratio of the formula. In generating the image we interpolate ten points between each consecutive pair of sample points, and remove any duplicates.
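The preprocessing steps can be sketched as follows (an illustration, not the system's implementation), assuming strokes are given as arrays of (x, y) sample points:

```python
# Preprocessing sketch: duplicate-point filtering, y-normalization to [0, 1] with
# the aspect ratio preserved, 3-point smoothing, and linear resampling.
import numpy as np


def preprocess_strokes(strokes, points_between=10):
    strokes = [np.asarray(s, dtype=float) for s in strokes]

    # 1) Remove points that repeat the previous (x, y) coordinate.
    strokes = [s[np.insert(np.any(np.diff(s, axis=0) != 0, axis=1), 0, True)]
               for s in strokes]

    # 2) Normalize y to [0, 1] over the whole expression, scaling x identically
    #    so the width-height aspect ratio is preserved.
    all_pts = np.vstack(strokes)
    y_min, y_max = all_pts[:, 1].min(), all_pts[:, 1].max()
    x_min = all_pts[:, 0].min()
    scale = (y_max - y_min) or 1.0
    strokes = [np.column_stack([(s[:, 0] - x_min) / scale,
                                (s[:, 1] - y_min) / scale]) for s in strokes]

    # 3) Smooth: replace interior points by the mean of previous, current, next.
    smoothed = []
    for s in strokes:
        t = s.copy()
        if len(s) > 2:
            t[1:-1] = (s[:-2] + s[1:-1] + s[2:]) / 3.0
        smoothed.append(t)

    # 4) Resample: linearly interpolate extra points between consecutive samples.
    resampled = []
    for s in smoothed:
        pts = [s[0]]
        for a, b in zip(s[:-1], s[1:]):
            for k in range(1, points_between + 1):
                pts.append(a + (b - a) * k / (points_between + 1))
            pts.append(b)
        resampled.append(np.array(pts))
    return resampled
```

Rendering the resampled strokes into a 200-pixel-high binary image is then a matter of scaling the normalized coordinates and marking the visited pixels.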

Segmentation Algorithm. Our algorithm is fairly simple:

1) Construct a stroke LOS graph using Algorithm 1.
2) Use a binary classifier to classify all directed edges as * (merge into symbol) or _ (undefined).
3) Define symbols by converting connected components for * labels into cliques; pass the graph from Step 2 as both inputs for Algorithm 2.

We create a random forest for classification in Step 2, using the Python scikit-learn library [14]. We use 129 features, which are described below. These include the distance in time between strokes ('time gap,' e.g., the first and third handwritten strokes have a time gap of two), Parzen window-modified Shape Context features (PSC) and geometric features.

Parzen Shape Context Features (PSC). A Shape Context characterizes the relative position and density of points (pixels) in an image around a given point using a log-polar histogram [11]. Shape Contexts have been widely used in computer vision for shape matching and classification, as these local representations of appearance and context are often globally discriminative.

In our work, we use Shape Contexts to characterize the density of points (pixels) in an expression image around two strokes being considered for merging. First, we produce smoother probability distributions with Parzen window estimation, using a 2D Gaussian kernel. Second, the shape context region is divided into bins using uniform angles and distances from the center of the histogram. Previously, it was found that features using equal rather than the conventional log-polar bin distances perform better when classifying spatial relationships in formulae [15]. The center of the PSC is the average of the two stroke bounding box centers. The radius of the shape context includes the strokes being compared. Note that for distant strokes, the polar histogram may cover the entire expression.

We use three separate Parzen window Shape Contexts when classifying directed LOS edges as 'merge' or 'split.' We use a separate PSC for each stroke, and then a third PSC for other strokes in the neighborhood of the two strokes. We do this to improve discrimination by clearly separating point sources. We confirmed empirically that using multiple histograms is beneficial. Figure 5 shows an example of Parzen Shape Context features for classifying a directed LOS edge. In this example, we consider an edge from the vertical stroke of '+' (the parent of the edge) to a nearby '1' stroke (the child of the edge). Red represents points from the parent stroke, green points from the child stroke, and blue points from other strokes in the histogram. Color intensity represents the density of each bin. Note that during segmentation, the reverse edge from the '1' to the vertical stroke of the '+' would also be considered.

PSC features produce a simplified image of the region around a pair of strokes, but with the point sources (parent, child, and other strokes) clearly separated. Polar histograms have higher resolution near their center, which we hoped would be beneficial. However, 2D histograms, convolution masks or other abstracted/compressed image representations might be used to similar or better effect.

Figure 5: Example Parzen Shape Contexts (PSCs), showing the expression with the PSC center and perimeter, and separate histograms for the parent stroke, child stroke, and other strokes. A directed edge from the vertical line in '+' to the '1' is considered. Each PSC has 120 bins, with the PSC radius reaching the furthest parent or child stroke point (pixel). In experiments we use only 30 bins (5 distances × 6 angles), with a radius 1.5 times the distance to the furthest parent or child stroke point, capturing more context from other strokes.
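As a rough sketch of the idea (a simplification under our own assumptions, not the authors' implementation), a single Parzen-windowed polar histogram can be computed by spreading each point's mass over uniformly spaced angle/distance bins with a 2D Gaussian kernel evaluated at the bin centers; the kernel width and the use of bin centers are illustrative choices here.

```python
# Simplified Parzen-windowed shape context over one set of points.
import numpy as np


def parzen_shape_context(points, center, radius, n_dist=5, n_angle=6, window=None):
    """points: (N, 2) array; center: (2,) array; returns an (n_dist, n_angle)
    normalized histogram with uniform (not log-polar) distance bins."""
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    window = window if window is not None else radius / 5.0   # Parzen kernel width

    # Bin centers in polar coordinates around the histogram center.
    d_centers = (np.arange(n_dist) + 0.5) * radius / n_dist
    a_centers = (np.arange(n_angle) + 0.5) * 2 * np.pi / n_angle
    dd, aa = np.meshgrid(d_centers, a_centers, indexing='ij')
    bin_xy = center + np.stack([dd * np.cos(aa), dd * np.sin(aa)], axis=-1)

    # Accumulate Gaussian kernel responses from every point at every bin center.
    hist = np.zeros((n_dist, n_angle))
    for p in points:
        sq_dist = ((bin_xy - p) ** 2).sum(axis=-1)
        hist += np.exp(-sq_dist / (2.0 * window ** 2))

    total = hist.sum()
    return hist / total if total > 0 else hist
```

Three such histograms (parent stroke, child stroke, other strokes) would be flattened and combined with the time-gap and geometric features to form the feature vector for a directed edge.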
Geometric Features. We also use geometric features from previous work on classifying relationships between stroke pairs, including horizontal distance, size difference and vertical offset [16]; minimum point distance [12]; overlapping area [17]; minimum distance, horizontal overlapping of the bounding box, distance and offset between stroke start and end points, and finally backward movement and parallelity [18]. Parallelity is the angle between two vectors representing strokes, with the vectors defined by the first and last points of each stroke.

We also add some additional geometric features. These include the distance between bounding box centers, distance between centers-of-mass, maximal point pair distance (the two points are from different strokes of the stroke pair), horizontal offset between the last point of the first stroke and the starting point of the second stroke, vertical distance between bounding box centers, writing slope (the angle between the horizontal and the line connecting the last point of the current stroke and the first point of the next stroke) and writing curvature (the angle between the lines defined by the first and last points of each stroke). We normalize all geometric features to lie in the interval [0, 1] except for parallelity, writing slope and writing curvature.
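A few of the geometric features described above can be sketched as follows (an illustration with hypothetical function and key names, not the authors' feature extractor), for an ordered stroke pair (s, t):

```python
# Sketch of selected geometric features for a directed stroke pair.
import math
import numpy as np


def geometric_features(s, t):
    s, t = np.asarray(s, float), np.asarray(t, float)
    feats = {}

    bc_s = (s.min(axis=0) + s.max(axis=0)) / 2.0            # bounding box centers
    bc_t = (t.min(axis=0) + t.max(axis=0)) / 2.0
    feats['bbox_center_distance'] = float(np.linalg.norm(bc_s - bc_t))
    feats['bbox_center_vertical_distance'] = float(abs(bc_s[1] - bc_t[1]))
    feats['center_of_mass_distance'] = float(np.linalg.norm(s.mean(0) - t.mean(0)))

    # Maximal distance over point pairs drawn from different strokes.
    feats['max_point_pair_distance'] = float(
        np.sqrt(((s[:, None, :] - t[None, :, :]) ** 2).sum(-1)).max())

    # Horizontal offset between the end of the first stroke and start of the second.
    feats['horizontal_offset'] = float(t[0, 0] - s[-1, 0])

    # Writing slope: angle between the horizontal and the line from the last point
    # of the current stroke to the first point of the next stroke.
    dx, dy = t[0] - s[-1]
    feats['writing_slope'] = math.atan2(dy, dx)

    # Parallelity: angle between the first-to-last-point vectors of the two strokes.
    v1, v2 = s[-1] - s[0], t[-1] - t[0]
    denom = np.linalg.norm(v1) * np.linalg.norm(v2)
    feats['parallelity'] = (float(math.acos(np.clip(np.dot(v1, v2) / denom, -1, 1)))
                            if denom > 0 else 0.0)
    return feats
```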
Training. Using CROHME training data, we used greedy search and cross-validation with random forest classifiers to determine the parameters of the PSC features [9]. We choose six angles and five distances for the polar histograms (30 bins), and a Parzen window width of 1/5 of the shape context radius. The shape context radius itself is 1.5 times the longest distance between stroke points to the center of the histogram. We also used the CROHME training data to create our random forest merge/split classifier [9]. We use a random forest with 50 trees, with a maximum depth of 40 for each decision tree. The Gini criterion was used for splitting. There are n = 129 features, and √n features (11) are selected to define candidate splits at each decision tree node.
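In scikit-learn, the classifier configuration described above corresponds to the following minimal setup (the feature matrix and CROHME data loading are assumed to exist elsewhere; variable names are illustrative):

```python
# Random forest merge/split classifier with the settings described in the text:
# 50 trees, maximum depth 40, Gini splitting, sqrt(n) candidate features per split.
from sklearn.ensemble import RandomForestClassifier

merge_classifier = RandomForestClassifier(
    n_estimators=50,       # 50 decision trees
    max_depth=40,          # maximum depth per tree
    criterion='gini',      # Gini criterion for splits
    max_features='sqrt',   # roughly sqrt(129) = 11 features tried at each split
)

# X: (num_directed_LOS_edges, 129) feature matrix; y: 1 for merge ('*'), 0 otherwise.
# merge_classifier.fit(X_train, y_train)
# predictions = merge_classifier.predict(X_test)
```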
Experimental Results. The classification rate for merging or splitting stroke pairs is quite high: we obtain 98.26% for CROHME 2012, and 97.88% for CROHME 2014. Table III shows our segmenter obtaining the second-highest reported symbol recall for CROHME 2012 (94.87%) and CROHME 2014 (92.41%), while obtaining the highest F-score for CROHME 2014 (92.43%).

Note that we obtain these results without using OCR or an expression grammar. Many of the systems shown are parser-driven, and use classification and relationship constraints (i.e., context) to refine segmentation. We believe that language constraints would improve our results substantially.

Table III: Symbol Segmentation Metrics for CROHME 2012 Test (only Recall reported [19]) and CROHME 2014 Test.

CROHME 2012 Test (486 Expressions)
                       Recall (%)   Precision (%)   F-score (%)
MacLean et al. [20]       95.56
LOS + PSC                 94.87         94.56          94.72
Alvaro [21]               91.95
Awal et al. [22]          87.75
Hu et al. [23]            87.51
Simistira et al.          71.21
Celik et al. [24]         59.20

CROHME 2014 Test (986 Expressions)
                       Recall (%)   Precision (%)   F-score (%)
Alvaro [25]               93.31         90.72          92.00
LOS + PSC                 92.41         92.45          92.43
Awal et al. [26]          89.43         86.13          87.75
Yao and Wang [27]         88.23         84.20          86.17
Hu et al. [3]             85.52         86.09          85.80
Le et al. [28]            83.05         85.36          84.19
Aguilar [29]              76.63         80.28          78.41

LOS + PSC: Line-of-Sight Graph using Parzen Shape Contexts with a Random Forest Classifier.

VI. CONCLUSION

We propose a Line-of-Sight (LOS) stroke graph that is able to represent more formulae than Time-series, Minimum Spanning Tree, Delaunay and k-NN graphs. For the CROHME 2012 and 2014 Test sets, LOS graphs omit fewer than 0.1% of necessary directed stroke pair edges, with a Precision of roughly 30%. We have used LOS graphs to create a symbol segmenter making use of Parzen window-modified Shape Context features (PSC) that obtains state-of-the-art results for the CROHME 2014 Test set (92.43% F-measure) without using OCR or expression grammars. In other work, LOS graphs have been used to obtain surprisingly strong results for parsing handwritten formulae using primarily visual features [10].

Avenues for future work include exploring modified versions of LOS graphs (e.g., relaxing the notion of 'visibility' by allowing strokes to be partially transparent), exploring new graphs and combinations of graph types, incorporating classification and language constraints with our segmenter, and improving Parzen Shape Context features.

ACKNOWLEDGMENT

This material is based upon work supported by the National Science Foundation under Grant No. IIS-1016815. We thank Francisco Álvaro for providing code to convert CROHME stroke data to images.

REFERENCES

[1] R. Zanibbi and D. Blostein, "Recognition and retrieval of mathematical expressions," IJDAR, vol. 15, no. 4, pp. 331–357, 2012.
[2] H. Mouchère, R. Zanibbi, U. Garain, and C. Viard-Gaudin, "Advancing the state of the art for handwritten math recognition: The CROHME competitions, 2011–2014," IJDAR, pp. 1–17, 2016.
[3] L. Hu and R. Zanibbi, "Segmenting handwritten math symbols using AdaBoost and multi-scale shape context features," in Proc. ICDAR, Aug. 2013, pp. 1212–1216.
[4] S. Smithies, K. Novins, and J. Arvo, "A handwriting-based equation editor," in International Conference on Graphics Interface, 1999, pp. 84–91.
[5] Y. Eto and M. Suzuki, "Mathematical formula recognition using virtual link network," in Proc. ICDAR, Sep. 2001, pp. 762–767.
[6] N. Matsakis, "Recognition of handwritten mathematical expressions," Master's thesis, Massachusetts Institute of Technology, Cambridge, MA, May 1999.
[7] N. S. T. Hirata and W. Y. Honda, "Automatic labeling of handwritten mathematical symbols via expression matching," in International Conference on Graph-Based Representations in Pattern Recognition, May 2011, pp. 295–304.
[8] M. de Berg, O. Cheong, M. van Kreveld, and M. Overmars, Computational Geometry: Algorithms and Applications, 3rd ed. Springer-Verlag TELOS, 2008.
[9] L. Hu, "Features and algorithms for visual parsing of handwritten mathematical expressions," Ph.D. dissertation, Rochester Institute of Technology, 2016.
[10] L. Hu and R. Zanibbi, "MST-based visual parsing of online handwritten mathematical expressions," in Proc. ICFHR, 2016.
[11] S. Belongie, J. Malik, and J. Puzicha, "Shape matching and object recognition using shape contexts," TPAMI, vol. 24, no. 4, pp. 509–522, 2002.
[12] K. Toyozumi, N. Yamada, T. Kitasaka, K. Mori, Y. Suenaga, K. Mase, and T. Takahashi, "A study of symbol segmentation method for handwritten mathematical formula recognition using mathematical structure information," in Proc. ICPR, Aug. 2004, pp. 630–633.
[13] A. Kosmala and G. Rigoll, "On-line handwritten formula recognition using statistical methods," in Proc. ICPR, Aug. 1998, pp. 1306–1308.
[14] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, "Scikit-learn: Machine learning in Python," J. Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
[15] F. Alvaro and R. Zanibbi, "A shape-based layout descriptor for classifying spatial relationships in handwritten math," in Proc. ACM DocEng, Sep. 2013, pp. 123–126.
[16] Y. Shi, H. Li, and F. Soong, "A unified framework for symbol segmentation and recognition of handwritten mathematical expressions," in Proc. ICDAR, Sep. 2007, pp. 854–858.
[17] S. MacLean and G. Labahn, "A new approach for recognizing handwritten mathematics using relational grammars and fuzzy sets," IJDAR, vol. 16, no. 2, pp. 139–163, 2013.
[18] S. Lehmberg, H.-J. Winkler, and M. Lang, "A soft-decision approach for symbol segmentation within handwritten mathematical expressions," in International Conference on Acoustics, Speech, and Signal Processing, May 1996, pp. 3434–3437.
[19] H. Mouchère, C. Viard-Gaudin, D. H. Kim, J. H. Kim, and U. Garain, "ICFHR 2012 competition on recognition of on-line mathematical expressions (CROHME 2012)," in Proc. ICFHR, Sep. 2012, pp. 811–816.
[20] S. MacLean and G. Labahn, "A Bayesian model for recognizing handwritten mathematical expressions," Pattern Recognition, vol. 48, no. 8, pp. 2433–2445, 2015.
[21] F. Alvaro, J.-A. Sanchez, and J. Benedi, "Recognition of printed mathematical expressions using two-dimensional stochastic context-free grammars," in Proc. ICDAR, Sep. 2011, pp. 1225–1229.
[22] A.-M. Awal, H. Mouchère, and C. Viard-Gaudin, "Towards handwritten mathematical expression recognition," in Proc. ICDAR, 2009, pp. 1046–1050.
[23] L. Hu, K. Hart, R. Pospesel, and R. Zanibbi, "Baseline extraction-driven parsing of handwritten mathematical expressions," in Proc. ICPR, Nov. 2012, pp. 326–330.
[24] M. Celik and B. Yanikoglu, "Probabilistic mathematical formula recognition using a 2D context-free graph grammar," in Proc. ICDAR, Sep. 2011, pp. 161–166.
[25] F. Alvaro, J. Sanchez, and J. Benedi, "Recognition of on-line handwritten mathematical expressions using 2D stochastic context-free grammars and hidden Markov models," Pattern Recognition Letters, vol. 35, pp. 58–67, 2014.
[26] A. Awal, H. Mouchère, and C. Viard-Gaudin, "A global learning approach for an online handwritten mathematical expression recognition system," Pattern Recognition Letters, vol. 35, pp. 68–77, 2014.
[27] H. Mouchère, C. Viard-Gaudin, R. Zanibbi, and U. Garain, "ICFHR 2014 competition on recognition of on-line handwritten mathematical expressions (CROHME 2014)," in Proc. ICFHR, Sep. 2014, pp. 791–796.
[28] A. D. Le, T. V. Phan, and M. Nakagawa, "A system for recognizing online handwritten mathematical expressions and improvement of structure analysis," in Proc. DAS, Apr. 2014, pp. 51–55.
[29] F. Julca-Aguilar, N. Hirata, C. Viard-Gaudin, H. Mouchère, and S. Medjkoune, "Mathematical symbol hypothesis recognition with rejection option," in Proc. ICFHR, Sep. 2014, pp. 500–505.
