Vanishing Point Detection With Convolutional Neural Networks
Ali Borji
Center for Research in Computer Vision, University of Central Florida
[email protected]
arXiv:1609.00967v1 [cs.CV] 4 Sep 2016

1. Introduction
Figure 1. Left: Two sample frames from each of 29 videos downloaded from YouTube. Top-right: Sample images without a vanishing point, used to train the vanishing-point existence prediction network. Bottom-right: Average vanishing point location; the left panel shows all visited locations and the right panel shows the VP histogram.
[Figure 2 plot: error rate (y-axis, 0–0.5) versus grid size (10 × 10, 20 × 20, 30 × 30).]
Figure 2. Left: Error rates of deep models for VP detection. Top-right: Sample images where our model is able to accurately locate the VP within five tries. The red circle is the top-1 prediction and the blue circles are the next four. Bottom-right: Failure examples of our model.
20 grid) but are well below the deep learning performance (deep learning Top-1 accuracy is about 57%).

We also compare our model with two vanishing point detection algorithms from the literature. The first is a method by Košecká and Zhang [12] and the second is the classic Hough transform [5]. These two algorithms score 15.6% and 35%, respectively, in detecting the vanishing point on a 20 × 20 map (Top-1 accuracy), both much lower than our results using CNNs.

To assess the generalization power of our approach in detecting vanishing points in arbitrary natural scenes, we experimented with pictures of buildings, tunnels, sketches, and fields, shown in Figure 3. Although our model (VGG) has not been explicitly trained on these images, it successfully finds VPs in some of them. It fails on some other unseen examples (e.g., sketches). Augmenting our dataset with more images of these kinds could help overcome this shortcoming. Another way to improve performance would be through data augmentation (i.e., adding jittered, cropped, noisy, and blurry versions of the input images).

3. Discussion

We proposed a method for vanishing point detection based on convolutional neural networks that does well on road scenes but is not very effective on arbitrary images. We will consider collecting a larger image dataset with a variety of scenes containing vanishing points, as well as more recent deep learning architectures, to improve accuracy. Extending this approach to videos is another interesting future direction. Our dataset is freely available at: http://crcv.ucf.edu/people/faculty/Borji/code.php

Acknowledgments: We wish to thank NVIDIA for the generous donation of the GPU used in this study.
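The data-augmentation idea mentioned above (jittered, cropped, noisy, and blurry copies of each training image) can be sketched with NumPy as follows; the offsets, crop ratio, and noise/blur parameters are illustrative assumptions, not settings from this work.

```python
import numpy as np

def augment(img, rng):
    """Return jittered, cropped, noisy, and blurred copies of an H x W image in [0, 1].
    All parameter values here are illustrative choices, not values from the paper."""
    h, w = img.shape[:2]
    out = {}
    # jitter: small random brightness offset
    out["jitter"] = np.clip(img + rng.uniform(-0.1, 0.1), 0.0, 1.0)
    # crop: random 90% window (in practice one would resize back to H x W)
    ch, cw = int(0.9 * h), int(0.9 * w)
    top, left = rng.integers(0, h - ch + 1), rng.integers(0, w - cw + 1)
    out["crop"] = img[top:top + ch, left:left + cw]
    # noise: additive Gaussian noise
    out["noise"] = np.clip(img + rng.normal(0.0, 0.05, img.shape), 0.0, 1.0)
    # blur: 3x3 box filter built from shifted averages of an edge-padded copy
    p = np.pad(img, 1, mode="edge")
    out["blur"] = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return out
```

Note that geometric augmentations (crops, shifts) also move the vanishing point, so the ground-truth grid cell would need to be recomputed for each augmented copy.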
Figure 3. Performance of our vanishing point detector on arbitrary images containing vanishing points. The largest red circle is the first detection; the other four detections are shown in blue.
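For context on the classical baselines compared above (the Hough transform [5] and Košecká and Zhang [12]), such methods typically detect line segments and let their pairwise intersections vote for the vanishing point. The sketch below is a minimal, generic version of that line-voting idea, assuming line segments have already been extracted (e.g., by a Hough transform); it is not a reimplementation of either cited method.

```python
import numpy as np

def intersect(s1, s2):
    """Intersection of the two infinite lines through segments ((x1,y1),(x2,y2)); None if parallel."""
    (x1, y1), (x2, y2) = s1
    (x3, y3), (x4, y4) = s2
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-9:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def vote_vp(segments, img_wh, grid=20):
    """Accumulate pairwise segment intersections on a grid x grid map; return the winning cell center."""
    w, h = img_wh
    acc = np.zeros((grid, grid))
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            p = intersect(segments[i], segments[j])
            if p is None:
                continue
            x, y = p
            if 0 <= x < w and 0 <= y < h:  # only vote for intersections inside the image
                acc[int(y / h * grid), int(x / w * grid)] += 1
    r, c = np.unravel_index(acc.argmax(), acc.shape)
    return ((c + 0.5) * w / grid, (r + 0.5) * h / grid)
```

Evaluating such a voter on the same 20 × 20 map makes its Top-1 score directly comparable to the CNN numbers above.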
References
[1] M. Land, D.N. Lee, Where we look when we steer, Nature, 1994.
[2] A. Borji, L. Itti, State of the art in visual attention modeling, IEEE Trans. PAMI, 2013.
[3] A. Borji, M. Feng, Vanishing point attracts gaze in free-viewing and visual search tasks, arXiv:1512.01722, 2015.
[4] M. Feng, A. Borji, H. Lu, Fixation prediction with a combined model of bottom-up saliency and vanishing point, WACV, 2015.
[5] P.V.C. Hough, Method and means for recognizing complex patterns, 1962.
[6] Z. Bylinskii, T. Judd, A. Borji, L. Itti, F. Durand, A. Oliva, A. Torralba, MIT Saliency Benchmark, http://saliency.mit.edu/
[7] A. Borji, L. Itti, CAT2000: A large scale fixation dataset for boosting saliency research, arXiv:1505.03581, 2015.
[8] G. Griffin, A. Holub, P. Perona, Caltech-256 object category dataset, 2007.
[9] A. Torralba, A. Oliva, M.S. Castelhano, J. Henderson, Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search, Psychological Review, 2006.
[10] T. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C.L. Zitnick, Microsoft COCO: Common objects in context, ECCV, 2014.
[11] J. Deng, W. Dong, R. Socher, L.J. Li, K. Li, L. Fei-Fei, ImageNet: A large-scale hierarchical image database, CVPR, 2009.
[12] J. Košecká, W. Zhang, Video compass, ECCV, 2002.