Convolutional Neural Networks For Page Segmentation of Historical Document Images

Abstract—This paper presents a page segmentation method for handwritten historical document images based on a Convolutional Neural Network (CNN). We consider page segmentation as a pixel labeling problem, i.e., each pixel is classified as one of the predefined classes. Traditional methods in this area rely on hand-crafted features carefully tuned using prior knowledge. In contrast, we propose to learn features from raw image pixels using a CNN. While many researchers focus on developing deep CNN architectures to solve different problems, we train a simple CNN with only one convolution layer. We show that this simple architecture achieves competitive results against other deep architectures on different public datasets. Experiments also demonstrate the effectiveness and superiority of the proposed method compared to previous methods.

Keywords-convolutional neural network; page segmentation; layout analysis; historical document images; deep learning

I. INTRODUCTION

Page segmentation is an important prerequisite step of document image analysis and understanding. The goal is to split a document image into regions of interest. Compared to the segmentation of machine-printed document images, page segmentation of historical document images is more challenging due to many variations such as layout structure, decoration, writing style, and degradation. Our goal is to develop a generic segmentation method for handwritten historical documents. In this method, we consider the segmentation problem as a pixel-labeling problem, i.e., for a given document image, each pixel is labeled as one of the predefined classes.

Some page segmentation methods have been developed recently. These methods rely on hand-crafted features [1], [2], [3], [4], on prior knowledge [5], [6], [7], or on models that combine hand-crafted features with domain knowledge [8], [9]. In contrast, in this paper our goal is to develop a more general method which automatically learns features from the pixels of document images. Elements such as strokes of words, words in sentences, and sentences in paragraphs form a hierarchical structure from low to high levels, and these patterns are repeated in different parts of the documents. Based on these properties, feature learning algorithms can be applied to learn the layout information of document images.

A Convolutional Neural Network (CNN) is a feed-forward artificial neural network which shares weights among neurons in the same layer. By enforcing a local connectivity pattern between neurons of adjacent layers, a CNN can discover spatial correlations at different granularities of local context [10]. With multiple convolutional and pooling layers, CNNs have achieved many successes in various fields, e.g., handwriting recognition [11], image classification [12], and text recognition in natural images [13].

In [14], the authors show that an autoencoder can be used to learn features automatically from the training images. An autoencoder is a feed-forward neural network trained to reconstruct its input; the hidden layer outputs are then used as features to feed an off-the-shelf classifier. In [15], the authors show that using superpixels as the units of labeling increases the speed of the method. In [16], a Conditional Random Field (CRF) [17] is applied in order to model the local and contextual information jointly and thus refine the segmentation results achieved in [15]. Following the same idea as [16], we consider the segmentation problem as an image patch labeling problem, where the image patches are generated with a superpixel algorithm. In contrast to [14], [15], [16], in this work we focus on developing an end-to-end method: feature learning and classifier training are combined into one step. Image patches are used as input to train a CNN for the labeling task, and during training the features used to predict the labels of the image patches are learned in the convolution layers of the CNN.

While many researchers focus on developing very deep CNNs to solve various problems [12], [18], [19], we train a simple CNN with one convolution layer. Experiments on public historical document image datasets show that, despite the simple structure and little tuning of hyperparameters, the proposed method achieves results comparable to other CNN architectures.

II. METHODOLOGY

In order to create a general page segmentation method that does not use any prior knowledge of the layout structure of the documents, we consider the page segmentation problem as a pixel labeling problem and propose to use a CNN for the pixel labeling task. The main idea is to learn a set of feature detectors and to train a nonlinear classifier on the features extracted by these detectors. With the set of feature detectors and the classifier, pixels on unseen document images can be classified into the different classes.
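As a rough illustration of this patch-labeling setup, the sketch below builds a one-convolution-layer classifier in PyTorch (the paper's own implementation uses Theano); the patch size, number of kernels, kernel size, hidden width, and number of classes used here are placeholder assumptions rather than the values used in the paper.

# Minimal sketch of a one-convolution-layer patch classifier.
# PyTorch is used for illustration only; all hyperparameter values are assumptions.
import torch
import torch.nn as nn

class OneConvPatchClassifier(nn.Module):
    def __init__(self, patch_size=28, num_kernels=4, kernel_size=5, num_classes=5):
        super().__init__()
        # Single convolution layer: the learned kernels act as the feature detectors.
        self.conv = nn.Conv2d(1, num_kernels, kernel_size)
        self.relu = nn.ReLU()
        # Nonlinear classifier trained jointly on the convolutional features.
        feat_dim = num_kernels * (patch_size - kernel_size + 1) ** 2
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, patches):
        # patches: (batch, 1, patch_size, patch_size) grayscale patches, each
        # centered on the image location whose label is being predicted.
        return self.classifier(self.relu(self.conv(patches)))

model = OneConvPatchClassifier()
logits = model(torch.randn(32, 1, 28, 28))   # a batch of 32 synthetic patches
predicted_classes = logits.argmax(dim=1)

Because feature learning and classification live in one network, a single cross-entropy training loop updates the feature detectors and the classifier together, which is the end-to-end property described above.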
Table I: Details of the training, test, and validation sets. TR, TE, and VA denote the training, test, and validation sets respectively.

dataset         image size (pixels)   |TR|   |TE|   |VA|
G. Washington   2200 × 3400            10      5      4
St. Gall        1664 × 2496            20     30     10
Parzival        2000 × 3008            20     13      2
CB55            4872 × 6496            20     10     10
CSG18           3328 × 4992            20     10     10
CSG863          3328 × 4992            20     10     10

B. Metrics

The most commonly used metrics for page segmentation of historical document images are precision, recall, and pixel-level accuracy. Besides these standard metrics, we also adapt the metrics that are well defined and widely used for semantic segmentation and scene parsing evaluations in order to evaluate the different page segmentation methods. These metrics were proposed in [26] and are based on pixel accuracy and region intersection over union (IU). Consequently, the metrics used in the experiments are: pixel accuracy, mean pixel accuracy, mean IU, and frequency weighted IU (f.w. IU).

In order to obtain these metrics, we define the following variables:

• n_c: the number of classes.
• n_ij: the number of pixels of class i predicted to belong to class j. For class i:
  – n_ii: the number of correctly classified pixels (true positives).
  – n_ij (j ≠ i): the number of pixels of class i wrongly assigned to class j (false negatives for class i).
  – n_ji (j ≠ i): the number of pixels of class j wrongly assigned to class i (false positives for class i).
• t_i: the total number of pixels in class i, such that

    t_i = \sum_j n_{ij}.    (4)

With the defined variables, we can compute:

• pixel accuracy:

    acc = \frac{\sum_i n_{ii}}{\sum_i t_i}.    (5)

• mean accuracy:

    acc_{mean} = \frac{1}{n_c} \sum_i \frac{n_{ii}}{t_i}.    (6)

• mean IU:

    iu_{mean} = \frac{1}{n_c} \sum_i \frac{n_{ii}}{t_i + \sum_j n_{ji} - n_{ii}}.    (7)

• f.w. IU:

    iu_{weighted} = \frac{1}{\sum_k t_k} \sum_i \frac{t_i \, n_{ii}}{t_i + \sum_j n_{ji} - n_{ii}}.    (8)
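For concreteness, the four metrics can be computed directly from a confusion matrix. The short sketch below is not code from the paper; it simply follows the definitions above, with n[i, j] holding the number of pixels of ground-truth class i predicted as class j.

# Sketch of Eqs. (4)-(8) computed from a confusion matrix (NumPy).
import numpy as np

def segmentation_metrics(n):
    # n[i, j]: number of pixels of ground-truth class i predicted as class j.
    # Assumes every class has at least one ground-truth pixel (no zero division).
    n = np.asarray(n, dtype=np.float64)
    t = n.sum(axis=1)                # Eq. (4): t_i, total pixels of class i
    diag = np.diag(n)                # n_ii, correctly classified pixels
    predicted = n.sum(axis=0)        # sum_j n_ji, pixels predicted as class i
    union = t + predicted - diag     # per-class union used by the IU metrics

    pixel_acc = diag.sum() / t.sum()              # Eq. (5)
    mean_acc = np.mean(diag / t)                  # Eq. (6)
    mean_iu = np.mean(diag / union)               # Eq. (7)
    fw_iu = np.sum(t * diag / union) / t.sum()    # Eq. (8)
    return pixel_acc, mean_acc, mean_iu, fw_iu

# Toy example with three classes.
confusion = np.array([[50, 2, 3],
                      [4, 40, 1],
                      [0, 5, 45]])
print(segmentation_metrics(confusion))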
C. Evaluation

We compare the proposed method to the previous methods [15], [16]. As in the proposed method, superpixels are considered as the basic units of labeling. In [15], the features are learned on randomly selected grayscale image patches with a stacked convolutional autoencoder in an unsupervised manner. The features and the labels of the superpixels are then used to train a classifier, and with the trained classifier, superpixels are classified into the different classes. In [16], a Conditional Random Field (CRF) is applied in order to model the local and contextual information jointly for the superpixel labeling task. The trained classifier of [15] is considered as the local classifier in [16]. The local classifier is then used to train a contextual classifier, which takes the output of the local classifier as input and outputs the scores of the given labels. With the local and contextual classifiers, a CRF is trained to label the superpixels of a given image. In the experiments, we use a multilayer perceptron (MLP) as the local classifier in [15], [16] and an MLP as the contextual classifier in [16]. The Simple Linear Iterative Clustering (SLIC) algorithm [20] is applied to generate the superpixels; the superiority of SLIC over other superpixel algorithms is demonstrated in [15]. In the experiments, 3000 superpixels are generated for each image.

Table II reports the pixel accuracy, mean pixel accuracy, mean IU, and f.w. IU of the three methods. It shows that the proposed CNN outperforms the previous methods. Figure 2 gives the segmentation results of the three methods. We can see that, visually, the CNN achieves more accurate segmentation results than the other methods.
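As a usage note on the superpixel generation step described above, the sketch below produces roughly 3000 SLIC superpixels per image with scikit-image; the paper does not state which SLIC implementation it uses, and the compactness value here is an assumption.

# Sketch of superpixel generation with SLIC [20] via scikit-image (assumed implementation).
import numpy as np
from skimage.segmentation import slic

def generate_superpixels(image, n_segments=3000):
    # image: H x W x 3 RGB array; returns an H x W integer label map in which
    # each labeled region is one superpixel, i.e., one unit of labeling.
    return slic(image, n_segments=n_segments, compactness=10, start_label=0)

labels = generate_superpixels(np.random.rand(480, 640, 3))
print(labels.max() + 1, "superpixels generated")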
D. Max Pooling

Pooling is a widely used technique in CNNs. Max pooling is the most common type of pooling; it is applied in order to reduce the spatial size of the representation and thereby the number of parameters of the network. In order to show the impact of max pooling on the segmentation task, we add a max pooling layer after the convolution layer. The pooling size is 2 × 2 pixels. Table II reports the performance of the CNN with a max pooling layer. We can see that only on the CB55 dataset are the mean pixel accuracy and mean IU slightly improved by max pooling. In general, adding a max pooling layer does not improve the performance on the segmentation task. Figure 3 reports the f.w. IU of the CNN with different max pooling sizes. We define the max pooling size as m × m, such that m ∈ {2n | n ∈ N, 0 ≤ n ≤ 13}. We can see that increasing the pooling size decreases the performance. The reason is that for some computer vision problems, e.g., object recognition and text extraction in natural images, the exact location of a feature is less important than its rough location relative to other features. However, for a given document image, to label the pixel in the center of a patch it is not sufficient to know whether there is text somewhere in that patch; the location of the text is needed. Therefore, the exact location of a feature is helpful for the page segmentation task.
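Under the same illustrative assumptions as the patch-classifier sketch in Section II, the max-pooling variant examined here amounts to a single extra 2 × 2 pooling layer after the convolution; a minimal sketch of that change:

# Max-pooling variant of the earlier illustrative patch classifier
# (PyTorch for illustration; patch and kernel sizes remain assumptions).
import torch
import torch.nn as nn

conv_then_pool = nn.Sequential(
    nn.Conv2d(1, 4, 5),   # one convolution layer, as in the base sketch
    nn.ReLU(),
    nn.MaxPool2d(2),      # 2 x 2 max pooling halves each spatial dimension
    nn.Flatten(),
)
# A 28 x 28 patch now yields 4 x 12 x 12 = 576 features instead of 4 x 24 x 24.
features = conv_then_pool(torch.randn(32, 1, 28, 28))
print(features.shape)  # torch.Size([32, 576])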
Table II: Performance (in percentage) of superpixel labeling with only local MLP, CRF, and the proposed CNN.
Figure 2: Segmentation results on the Parzival, CB55, and CSG863 datasets from top to bottom respectively. The colors: black, white,
blue, red, and pink are used to represent: periphery, page, text, decoration, and comment respectively. The columns from left to right are:
input, ground truth, and segmentation results of the local MLP, CRF, and CNN respectively.
Figure 4 reports the f.w. IU of the CNN on different numbers of kernels. We can see that, except on the CSG18 dataset, the performance is not improved when K ≥ 4.
Figure 3: f.w. IU of the CNN on different max pooling sizes.

Figure 4: f.w. IU of the CNN on different numbers of kernels.

Figure 5: f.w. IU of the CNN on different numbers of conv layers.

Figure 5 reports the f.w. IU of the CNN on different numbers of convolution layers. It shows that the number of layers does not affect the performance on the segmentation task. However, on the G. Washington dataset, the performance degrades slightly with more layers. The reason is that, compared to the other datasets, the G. Washington dataset has fewer training images. Furthermore, the layouts of the pages in the G. Washington dataset are more varied.

G. Number of Training Images

Figure 6: f.w. IU of the CNN on different numbers of training images.

dataset the pages are more varied and the ground truth is less consistent.

H. Run Time

The proposed CNN is implemented with the Python library Theano [27]. The experiments are performed on a PC with an Intel Core i7-3770 3.4 GHz processor and 16 GB of RAM. On average, the CNN takes about 1 second of processing time per image. The superpixel labeling method [15] and the CRF model [16] take about 2 and 5 seconds per image respectively.

IV. CONCLUSION

In this paper, we have proposed a convolutional neural network (CNN) for page segmentation of handwritten historical document images. In contrast to traditional page segmentation methods, which rely on off-the-shelf classifiers trained with hand-crafted features, the proposed method learns features directly from image patches. Furthermore, feature learning and classifier training are combined into one step. Experiments on public datasets show the superiority of the proposed method over previous methods. While many researchers focus on applying very deep CNN architectures to different tasks, we show that a simple CNN with one convolution layer achieves performance comparable to other network architectures.

ACKNOWLEDGMENT

This work is supported by the Swiss National Science Foundation project HisDoc 2.0 (grant number 205120 150173) and by the National Natural Science Foundation of China (grant numbers 61202257 and 61650110512).
[4] K. Chen, H. Wei, J. Hennebert, R. Ingold, and M. Liwicki, “Page segmentation for historical handwritten document images using color and texture features,” in Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on. IEEE, 2014, pp. 488–493.

[5] M. Bulacu, R. van Koert, L. Schomaker, and T. van der Zant, “Layout analysis of handwritten historical documents for searching the archive of the cabinet of the dutch queen,” in Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), vol. 1. IEEE, 2007, pp. 357–361.

[6] C. Panichkriangkrai, L. Li, and K. Hachimura, “Character segmentation and retrieval for learning support system of japanese historical books,” in Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing. ACM, 2013, pp. 118–122.

[7] B. Gatos, G. Louloudis, and N. Stamatopoulos, “Segmentation of historical handwritten documents into text zones and text lines,” in Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on. IEEE, 2014, pp. 464–469.

[8] R. Cohen, A. Asi, K. Kedem, J. El-Sana, and I. Dinstein, “Robust text and drawing segmentation algorithm for historical documents,” in Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing. ACM, 2013, pp. 110–117.

[9] A. Asi, R. Cohen, K. Kedem, J. El-Sana, and I. Dinstein, “A coarse-to-fine approach for layout analysis of ancient manuscripts,” in Frontiers in Handwriting Recognition (ICFHR), 2014 14th International Conference on. IEEE, 2014, pp. 140–145.

[10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

[11] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural computation, vol. 1, no. 4, pp. 541–551, 1989.

[12] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.

[13] T. Wang, D. J. Wu, A. Coates, and A. Y. Ng, “End-to-end text recognition with convolutional neural networks,” in Pattern Recognition (ICPR), 2012 21st International Conference on. IEEE, 2012, pp. 3304–3308.

[14] K. Chen, M. Seuret, M. Liwicki, J. Hennebert, and R. Ingold, “Page segmentation of historical document images with convolutional autoencoders,” in Document Analysis and Recognition (ICDAR), 2015 13th International Conference on. IEEE, 2015, pp. 1011–1015.

[15] K. Chen, C.-L. Liu, M. Seuret, M. Liwicki, J. Hennebert, and R. Ingold, “Page segmentation for historical document images based on superpixel classification with unsupervised feature learning,” in Document Analysis Systems (DAS), 2016 12th IAPR International Workshop on. IEEE, 2016, pp. 299–304.

[16] K. Chen, M. Seuret, M. Liwicki, J. Hennebert, C.-L. Liu, and R. Ingold, “Page segmentation for historical handwritten document images using conditional random fields,” in Frontiers in Handwriting Recognition (ICFHR), 2016 15th International Conference on. IEEE, 2016, pp. 90–95.

[17] J. Lafferty, A. McCallum, and F. Pereira, “Conditional random fields: Probabilistic models for segmenting and labeling sequence data,” in Proceedings of the eighteenth international conference on machine learning, ICML, vol. 1, 2001, pp. 282–289.

[18] M. D. Zeiler and R. Fergus, “Visualizing and understanding convolutional networks,” in European conference on computer vision. Springer, 2014, pp. 818–833.

[19] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.

[20] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, “Slic superpixels compared to state-of-the-art superpixel methods,” IEEE transactions on pattern analysis and machine intelligence, vol. 34, no. 11, pp. 2274–2282, 2012.

[21] V. Nair and G. E. Hinton, “Rectified linear units improve restricted boltzmann machines,” in Proceedings of the 27th international conference on machine learning (ICML-10), 2010, pp. 807–814.

[22] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 2010, pp. 249–256.

[23] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.

[24] K. Chen, M. Seuret, H. Wei, M. Liwicki, J. Hennebert, and R. Ingold, “Ground truth model, tool, and dataset for layout analysis of historical documents,” in IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics, 2015, pp. 940204–940204.

[25] F. Simistira, M. Seuret, N. Eichenberger, A. Garz, M. Liwicki, and R. Ingold, “Diva-hisdb: A precisely annotated large dataset of challenging medieval manuscripts,” in Frontiers in Handwriting Recognition (ICFHR), 2016 15th International Conference on. IEEE, 2016, pp. 471–476.

[26] J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.

[27] J. Bergstra, O. Breuleux, F. Bastien, P. Lamblin, R. Pascanu, G. Desjardins, J. Turian, D. Warde-Farley, and Y. Bengio, “Theano: a cpu and gpu math expression compiler,” in Proceedings of the Python for Scientific Computing Conference (SciPy), 2010.