Review Article: A Survey On Breaking Technique of Text-Based CAPTCHA
Jun Chen,1,2 Xiangyang Luo,1 Yanqing Guo,3 Yi Zhang,1 and Daofu Gong1
1 State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450002, China
2 Henan Institute of Science and Technology, Xinxiang 453003, China
3 Dalian University of Technology, Dalian 116024, China
Copyright © 2017 Jun Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The CAPTCHA has become an important issue in multimedia security. Focusing on the commonly used text-based CAPTCHA, this paper outlines some typical methods and summarizes the technological progress in text-based CAPTCHA breaking. First, the paper presents a comprehensive review of recent developments in the text-based CAPTCHA breaking field. Second, a framework of text-based CAPTCHA breaking technique is proposed; it mainly consists of preprocessing, segmentation, combination, recognition, postprocessing, and other modules. Third, the research progress of the techniques involved in each module is introduced, and some typical methods of segmentation and recognition are compared and analyzed. Lastly, the paper discusses some problems worth further research.
2005, the international seminars on HIP have been held, and a large number of related research results were published. In subsequent years, many research results were reported in international conferences including CVPR, NIPS, CCS, and NDSS. Many internationally renowned universities and research institutions have established research groups on CAPTCHA technology, such as CMU [1, 8–14], PARC [15–19], UCB [16, 17, 20, 21], Microsoft [2, 22–27], Google [28–30], Bell Laboratory [31, 32], Yan et al. [4, 33–42], Xidian University [41–47], and University of Science and Technology of China [48, 49]. In addition, many websites offer public CAPTCHA services, such as CAPTCHA [10], BotBlock [50], JCAPTCHA [51], and HCaptcha [52]. Some research groups focus on CAPTCHA recognition, such as PWNtcha [53], Captchacker [54], aiCaptcha [55], and Greg Mori [56].
The security of text-based CAPTCHA mainly depends on the visual interference effects [25], including rotation, twisting, adhesion, and overlap. The typical types of text-based CAPTCHA and their features are shown in Table 1.
To resist machine recognition, the text-based CAPTCHA's security is often protected by a series of technologies. From Table 1, we can sum up the following main features of the text-based CAPTCHA.
(1) A large enough character set. Only when the character set is large enough is the total number of CAPTCHA strings large enough to resist brute-force breaking.
(2) The characters with distortion, adhesion, and overlap. When characters are distorted, adherent, and overlapping, breaking methods cannot easily segment a CAPTCHA image into single characters.
(3) The characters are different in size, width, angle, location, and fonts. When comparing features of different characters, the various transformations may reduce recognition accuracy.
(4) The strings with unfixed length. In a CAPTCHA scheme, strings with unfixed length can increase breaking difficulty to a certain extent.
(5) Hollow characters and broken contours. Compared with solid characters, hollow characters have fewer features, and broken contours can effectively resist the filling attack.
(6) The color and shape of complex backgrounds are similar to those of characters. If the images meet these conditions, the noise is difficult to remove. This may reduce recognition accuracy.
The above features effectively enhance text-based CAPTCHAs' security and, at the same time, bring great challenges to CAPTCHA breaking research.
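Features (1) and (4) can be made concrete with a small calculation. The sketch below is illustrative only: the function name `keyspace` and the example alphabet sizes and lengths are assumptions, not values taken from any surveyed scheme. It counts how many distinct strings a scheme can generate, which bounds the success probability of a single blind guess.

```python
def keyspace(alphabet_size: int, min_len: int, max_len: int) -> int:
    """Number of distinct strings with length in [min_len, max_len]."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))


# Hypothetical schemes, chosen only to show the orders of magnitude.
digits_fixed = keyspace(10, 4, 4)      # 4 digits, fixed length
alnum_fixed = keyspace(62, 6, 6)       # 6 letters/digits, fixed length
alnum_variable = keyspace(62, 4, 8)    # 4 to 8 letters/digits

for name, size in [("digits, length 4", digits_fixed),
                   ("alphanumeric, length 6", alnum_fixed),
                   ("alphanumeric, length 4-8", alnum_variable)]:
    # A single uniformly random guess succeeds with probability 1/size.
    print(f"{name}: {size} strings, blind-guess probability {1 / size:.3e}")
```

A larger character set and an unfixed length both enlarge this count, but the count says nothing about recognition-based attacks, which the remaining features are meant to hinder.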
Security and Communication Networks 3
Table 2: Comparison of typical methods based on segmentation for breaking nonadherent CAPTCHA.
Note. SVM: support vector machine, CW-SSIM: complex wavelet based structural similarity.
3. Research Progress of Breaking Text-Based CAPTCHA

For all kinds of text-based CAPTCHA schemes, the breaking methods are also various. According to whether there is segmentation or not, the existing breaking methods can be divided into two categories.

3.1. Text-Based CAPTCHA Breaking Methods Based on Segmentation. The text-based CAPTCHA breaking based on segmentation has different processing methods for different objects and results. When there is no adherent character, individual characters are obtained using vertical projection and connected components with good effect. As shown in Table 2, the success rates of nonadherent character CAPTCHA range from 78% to 100%.
However, these methods had little success on adherent characters. Therefore, more complicated methods, based on, for example, different widths, character features, and character contours, have been proposed one after another. With more and more antisegmentation technologies in the CAPTCHA field, obtaining individual characters is becoming harder and harder. Researchers then proposed segmentation methods that obtain character components by character structure, filters, and so forth. As can be seen from Table 3, the success rates of CAPTCHA breaking are generally low, with only a few higher than 80%.

Table 3: Comparison of typical methods based on segmentation for breaking adherent CAPTCHA.

3.2. Text-Based CAPTCHA Breaking Methods Based on Nonsegmentation. The text-based CAPTCHA breaking methods based on nonsegmentation can directly recognize preprocessed CAPTCHA images. The success rate of such a method relies on the recognition technique. In the early stage, different pattern matching algorithms such as shape context [20] and similarity [57] were used for recognition. Later, as the success rates of individual character recognition improved, researchers focused on the character segmentation technique. However, text-based CAPTCHA designs use antisegmentation techniques, which can prevent obtaining complete and individual characters. Nowadays, with the advantage of deep learning, breaking based on nonsegmentation will bounce back. The success rates of typical text-based CAPTCHA breaking methods based on nonsegmentation are shown in Table 4.

3.3. The Framework of Text-Based CAPTCHA Breaking Technique. As text-based CAPTCHA design improves, the breaking technique changes to meet it. The early text-based CAPTCHA contains nonadherent characters, and the breaking technique follows the traditional framework of “preprocessing + segmentation + recognition.” In recent years, most text-based CAPTCHAs use CCT (Crowded Characters Together). Therefore, various breaking frameworks have come into being, for example, “preprocessing + recognition,” “preprocessing + recognition + postprocessing,” “preprocessing + segmentation + combination + recognition,” and “preprocessing + segmentation + combination + recognition + postprocessing.”
In this paper, the existing frameworks are integrated into an overall framework of text-based CAPTCHA breaking, as shown in Figure 1. The framework mainly consists of preprocessing, segmentation, combination, recognition, postprocessing, and other modules. The research progress of each module is described in the following.

4. Preprocessing Methods of Breaking Text-Based CAPTCHA

CAPTCHA preprocessing is the first step of CAPTCHA image processing, before segmentation and recognition. Its main purpose is to highlight the information related to characters in a given image and to weaken or eliminate interfering information. The preprocessing of existing CAPTCHA breaking methods mainly includes image binarization, image thinning, denoising, and so on.

4.1. Image Binarization. Image binarization highlights the contours of the objects of interest and removes noise in the background. The key to binarization is to select an appropriate threshold. When the threshold is applied to the whole image, it is called the global threshold method; otherwise, it is called the local threshold method. If the threshold is not fixed during processing, it is called the variable threshold method or dynamic threshold method. The common thresholding methods are Sauvola and Pietikainen's method [65], Otsu's method [66], and so on.

4.2. Image Thinning. Image thinning reduces the character's contour to its skeleton. It must not change the character's adhesion. Its purpose is to highlight the image contour and to simplify subsequent processing. The thinning algorithms fall into two categories: noniterative algorithms and iterative algorithms. The common thinning algorithms include the Hilditch algorithm [67] and the Zhang and Suen algorithm [68].

4.3. Image Denoising. To resist breaking, CAPTCHA images contain noise and interference lines. In addition, some noise is generated during grayscale conversion and binarization. Therefore, the CAPTCHA image needs to be denoised. The typical methods are shown in Table 5. The denoising method should be chosen according to the actual situation.

5. Segmentation Methods of Breaking Text-Based CAPTCHA

Segmentation aims to get individual characters or character components. Accordingly, there are segmentation methods based on individual characters and segmentation methods based on character components.
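As an illustration of the global threshold method described in Section 4.1, the following is a minimal sketch of Otsu's method [66] in plain Python. It assumes an 8-bit grayscale image stored as a list of rows and dark characters on a light background; the function names `otsu_threshold` and `binarize` and the sample image are illustrative, not taken from any surveyed attack.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes the between-class
    variance when pixels <= t form one class and pixels > t the other."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0          # pixel count and gray-level sum of class 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue         # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break            # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Map dark pixels (<= threshold) to 1 (character) and the rest to 0."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]


# Tiny hypothetical image: dark strokes (10, 20) on a bright background.
image = [[10, 200, 210],
         [20, 210, 10]]
t = otsu_threshold([p for row in image for p in row])
print(t, binarize(image, t))
```

A local method such as Sauvola and Pietikainen's [65] would instead compute a separate threshold for each pixel from its neighborhood, which tends to help when the background brightness varies across the image.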
Table 4: Comparisons of typical methods based on nonsegmentation for breaking adherent CAPTCHA.
Program generation    55%      [62]    RNN            2011
Program generation    54.9%    [63]    2D LSTM-RNN    2013
Note. RNN: recurrent neural network, 2D LSTM: 2-dimensional long short-term memory, DCNN: spatial displacement of the neural network, HMM: hidden Markov model.
5.1. Segmentation Methods Based on Individual Characters. The segmentation methods based on individual characters segment a CAPTCHA image into individual characters. For nonadherent characters, we can use segmentation methods based on character projection and connected components. For CCT characters, we can use segmentation methods based on character width, connected feature, and character contour.

5.1.1. Segmentation Methods Based on Character Projection. The segmentation methods based on character projection determine the optimal segmentation position by analyzing the number of pixels projected under different conditions. This method applies to CAPTCHA characters with no adhesion or slight adhesion. However, it is not effective for seriously adherent and distorted characters. The typical methods include vertical projection segmentation, horizontal projection segmentation, and guideline projection segmentation.
Using (1), [61] defines a three-color bar code to segment reCAPTCHA images:

\[
\text{Three-color Bar}(x) =
\begin{cases}
\text{Blue}, & \text{for } H_\Sigma(x) = 0, \\
\text{White}, & \text{for } H_\Sigma(x) = 1, \\
\text{Black}, & \text{for } H_\Sigma(x) > 1,
\end{cases}
\tag{1}
\]

where $H_\Sigma(x)$ represents the total number of object pixels in the $x$th column. In the three-color bar, a column is colored blue if no pixel in the column belongs to a character ($H_\Sigma(x) = 0$). If there is only one object pixel in the column ($H_\Sigma(x) = 1$), the column is encoded as white. Finally, black corresponds to a column with more than one object pixel ($H_\Sigma(x) > 1$), as shown in Figure 2(a). After denoising, the optimal segmentation line is placed in the middle of a blue bar or white bar, as shown in Figure 2(b).

5.1.2. Segmentation Methods Based on Connected Components. The segmentation methods based on connected components segment individual characters using the different connected components in an image. This method is effective for slanted and distorted characters. However, it is limited by adherent characters.
Reference [4] tried to segment the Microsoft MSN CAPTCHA by combining connected components and vertical projection, as shown in Figure 3. First, different connected components are marked with different colors. Then the character blocks are generated according to the different colors. Finally, strings are segmented into individual characters using the vertical projection feature, with a success rate of more than 90%.

5.1.3. Segmentation Methods Based on Character Width. The segmentation methods based on character width are suitable for CAPTCHA images which are not easily segmented into individual characters. [60] used different widths (0.75, 1, 1.5, and 2 times the average width) to segment an image. Thus, each character corresponds to four recognition results, from which an optimal segment is chosen as the final recognition result. In addition, [5] did not take the average width as standard; they gave a set of character segments between the minimum width and the maximum width and then determined the optimal segmentation scheme using dynamic programming, as shown in Figure 4.

5.1.4. Segmentation Methods Based on Character Feature. The segmentation method based on character features uses the features of the CAPTCHA string, including inside features and outside features. Reference [38] classifies characters according to their own inside features, and each class contains the characters shown in Table 6.
Reference [6] segments characters according to the outside features among them. This paper proposes a new segmentation algorithm called middle-axis point separation
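The three-color bar code of (1) can be sketched as follows. This is a minimal illustration and not the implementation of [61]: it assumes a binarized image given as a list of rows with 1 for object pixels, and, as a simplification introduced here, it treats every maximal run of blue or white columns between two black regions as a single bar and cuts in its middle. The function names are hypothetical.

```python
def column_counts(binary):
    """H_sigma(x): number of object pixels (value 1) in each column x."""
    return [sum(column) for column in zip(*binary)]


def three_color_bar(binary):
    """Encode each column as blue (0 pixels), white (1) or black (>1)."""
    colors = []
    for h in column_counts(binary):
        if h == 0:
            colors.append("blue")
        elif h == 1:
            colors.append("white")
        else:
            colors.append("black")
    return colors


def cut_positions(colors):
    """Column index in the middle of every non-black run that lies
    between two black regions (runs touching the border are margins)."""
    cuts, x, n = [], 0, len(colors)
    while x < n:
        if colors[x] == "black":
            x += 1
            continue
        start = x
        while x < n and colors[x] != "black":
            x += 1
        if start > 0 and x < n:          # interior run only
            cuts.append((start + x - 1) // 2)
    return cuts


# Two hypothetical "characters" joined by a single stray pixel in column 3.
binary = [[1, 1, 0, 0, 0, 1, 1],
          [1, 1, 0, 1, 0, 1, 1]]
bar = three_color_bar(binary)
print(bar)
print(cut_positions(bar))
```

Plain vertical projection corresponds to cutting only at blue columns; encoding single-pixel columns as white additionally exposes thin connections between characters as cut candidates.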
[Figure 1: The framework of text-based CAPTCHA breaking. Modules in order: input images; preprocessing (binarization, thinning, denoising, ...); decision “Nonsegmentation?”; segmentation, either based on single character (character projection, connected components, character width, character feature, character contour, ...) or based on character components (character structure, filter, ...); decision “Single character?”; combination (based on redundancy, based on nonredundancy); recognition; decision “Redundancy?”; postprocessing (based on selection, based on rejection, ...).]
Table 5: Typical denoising methods.

Denoising method based on filter in the spatial domain
  Average filter
    Principle: The gray value of a pixel is replaced by the mean of its neighboring pixels' gray values.
    Advantages: ... removed.
    Disadvantages: The image is blurred.
  Median filter
    Principle: The gray value of a pixel is replaced by the median of its neighboring pixels' gray values.
    Advantages: Effectively removes salt and pepper noise and speckle noise.
    Disadvantages: Not applicable to images with many dots, lines, and spires.
  Wiener filter
    Principle: The minimum mean square error criterion is used to adjust the filter effect.
    Advantages: Effectively removes Gaussian noise.
    Disadvantages: Computation is complex.

Denoising method based on Gibbs and Hough transform
  Gibbs
    Principle: Gibbs Markov random field theory.
    Advantages: Effectively removes noise points.
  Hough transform
    Principle: Straight lines in the image are detected by using the point-line duality of image space and Hough parameter space.
    Advantages: Effectively removes interference lines.
    Disadvantages: Not applicable to irregular interference lines.

Denoising method based on morphology
  Open operation
    Principle: Erosion followed by dilation.
    Advantages: Smooth contours, cut off narrow lines, and eliminate fine.
  Close operation
    Principle: Dilation followed by erosion.
    Advantages: Smooths contours and fills holes, gaps, and fractures of contour lines.
  Disadvantages (both operations): The effect of denoising varies with operation mode and the size and shape of structural elements; the experiment needs to be repeated; the adaptability is poor.

Denoising method based on connected component
  Connected component
    Principle: A recursive method is used to find the connected domains of pixel points, and denoising is then based on the gray features and morphological features of each connected domain.
    Advantages: Effectively removes noise interference, and the original details of the characters are generally not lost.
    Disadvantages: Need to analyze the character's properties; hard to determine distinguishing features.

Denoising method based on wavelet transform
  Wavelet transform
    Principle: Find the best mapping of the original image in the wavelet transform domain to restore the original image.
    Advantages: Retains more image details.
    Disadvantages: Complex computation, and relative parameters need to be adjusted.
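To make the median filter entry of the denoising table concrete, the sketch below replaces each pixel by the median of its neighborhood. It is a minimal plain-Python illustration under stated assumptions: the image is a list of rows of gray values, the window is square with a hypothetical `radius` parameter, and windows are simply clipped at the image border (other border policies are equally common).

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Replace every pixel by the median of its (2*radius+1)^2 window.
    Windows are clipped at the border; median_low keeps the result an
    actual gray value from the window when its size is even."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            result[y][x] = median_low(window)
    return result


# A single isolated bright pixel (salt noise) on a dark background.
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter(noisy)
print(cleaned[2][2])
```

Because an isolated noise pixel is always a minority in its window, it is replaced by the background value, while the center of a solid region larger than the window is left unchanged.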
Figure 7: An example of segmented CAPTCHA image in [36]. (a) Original image. (b) Nonshared character components. (c) Shared character components.
[Figure: Gabor filters at four orientations (0, π/4, π/2, 3π/4); the image is filtered and then binarized.]
generation. However, due to the diversity and timeliness of text-based CAPTCHA, it has not been possible to construct a common image database in the field of text-based CAPTCHA recognition. It is necessary to collect, classify, organize, and establish a database of text-based CAPTCHA images. The database can provide reliable training and testing data for research work and also provide the premise and basis of unified evaluation for various methods in this field.

(2) Multitype CAPTCHA Recognition. At present, a classifier can effectively recognize CAPTCHAs only when the training set and test set belong to the same type. In fact, there are a variety of character changes in a CAPTCHA. Therefore, it is an arduous and important task to design a reasonable classifier to recognize various types of CAPTCHAs.

(3) Segmentation-Free CAPTCHA Recognition. After more than ten years of development, text-based CAPTCHA breaking has achieved a high success rate on individual characters. However, the breaking success rate on whole CAPTCHA strings is generally low, and reported results are few. With the wide application of CCT strings in text-based CAPTCHA, the problem of segmentation-free CAPTCHA recognition needs to be solved urgently. Deep learning may now provide new ideas and technical means to solve this problem.

(4) Application of Deep Learning Model. At present, in the field of CAPTCHA recognition, deep learning models can achieve better results than traditional methods. The most frequently used methods are based on CNN and its improved variants, while other deep learning models such as DBN (Deep Belief Networks), RNN, LSTM/BLSTM/MDLSTM, and DRL (Deep Reinforcement Learning) have not been widely used in text-based CAPTCHA recognition. Furthermore, the study of the interrelationships and fusion applications between the various deep learning models is not thorough. We hope that newer and better deep learning models are proposed to make a breakthrough in CAPTCHA recognition, which will certainly promote the development of this field.

(5) Rejection of Text-Based CAPTCHA. With the development of CAPTCHA breaking technique, the reliability of recognition results is also increasing. In this regard, on one hand, we should improve the correct rate of recognition; on the other hand, we should guarantee correct rejection. In the field of CAPTCHA recognition, the concept of rejection is not yet well known to researchers. Therefore, this topic has potential for development.

(6) Misrecognition of Confusable Characters. When a deep learning network is used to extract character features automatically, characters with similar features are easily confused. It is of practical significance to improve the precision of feature extraction and the training methods of the deep learning network.

10. Conclusions

Based on detailed investigation and in-depth analysis, this paper reviews the progress of text-based CAPTCHA breaking technique. First of all, this paper introduces various text-based CAPTCHAs and focuses on their features. Second, according to whether there is segmentation or not, we classify the existing breaking methods of text-based CAPTCHA and summarize their features. Meanwhile, we propose a framework of text-based CAPTCHA breaking technique and introduce the modules contained in the framework one by one. Next, we compare and analyze the basic principles, advantages, and disadvantages of the existing methods from five aspects: preprocessing, segmentation, combination, recognition, and postprocessing. Finally, some problems worth further research are discussed.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61379151, 61401512, 61572052, and U1636219), the National Key R&D Program of China (nos. 2016YFB0801303 and 2016QY01W0105), and the Key Technologies R&D Program of Henan Province (no. 162102210032).

References

[1] L. Von Ahn, M. Blum, and J. Langford, “Telling humans and computers apart automatically,” Communications of the ACM, vol. 47, no. 2, pp. 56–60, 2004.
[2] K. Chellapilla and P. Y. Simard, “Using Machine Learning to Break Visual Human Interaction Proofs (HIPs),” in Proceedings of the Advances in Neural Information Processing Systems, pp. 265–272, 2004.
[3] N. Roshanbin and J. Miller, “A survey and analysis of current CAPTCHA approaches,” Journal of Web Engineering, vol. 12, no. 1-2, pp. 001–040, 2013.
[4] J. Yan and A. S. E. Ahmad, “A low-cost attack on a Microsoft CAPTCHA,” in Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS’08, pp. 543–554, USA, October 2008.
[5] F. Jean-Baptiste and R. Paucher, “The Captchacker Project,” 2009, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.800.3065&rep=rep1&type=pdf.
[6] S.-Y. Huang, Y.-K. Lee, G. Bell, and Z.-H. Ou, “An efficient segmentation algorithm for CAPTCHAs with line cluttering and character warping,” Multimedia Tools and Applications, vol. 48, no. 2, pp. 267–289, 2010.
[7] R. A. Nachar, E. Inaty, P. J. Bonnin, and Y. Alayli, “Breaking down Captcha using edge corners and fuzzy logic segmentation/recognition technique,” Security and Communication Networks, vol. 8, no. 18, pp. 3995–4012, 2015.
[8] L. von Ahn, M. Blum, N. J. Hopper, and J. Langford, “CAPTCHA: using hard AI problems for security,” in Advances in Cryptology - EUROCRYPT 2003, vol. 2656 of Lecture Notes in Computer Science, pp. 294–311, Springer, Berlin, Germany, 2003.
[9] https://www.google.com/recaptcha.
[10] http://captcha.net/.
[11] http://www.captcha.net/captchas/bongo.
[12] A. Schlaikjer, “A Dual-Use Speech CAPTCHA: Aiding Visually Impaired Web Users while Providing Transcriptions of Audio Streams,” Tech. Rep. LTI-CMU-07-014, Carnegie Mellon University, Pittsburgh, Pa, USA, 2007.
[13] J. Tam, J. Simsa et al., “Improving Audio CAPTCHAs,” in Proceedings of the Symposium on Usable Privacy and Security, 2008.
[14] J. Tam, S. Hyde, J. Simsa, and L. Von Ahn, “Breaking audio CAPTCHAs,” in Proceedings of the 22nd Annual Conference on Neural Information Processing Systems, NIPS 2008, pp. 1625–1632, Canada, December 2008.
[15] H. S. Baird and K. Popat, “Human Interactive Proofs and Document Image Analysis,” in Proceedings of the International Workshop on Document Analysis Systems, vol. 2423 of Lecture Notes in Computer Science, pp. 507–518, Springer, 2002.
[16] A. L. Coates, H. S. Baird, and R. J. Fateman, “Pessimal print: A reverse Turing test,” in Proceedings of the 6th International Conference on Document Analysis and Recognition, ICDAR 2001, pp. 1154–1158, USA, September 2001.
[17] M. Chew and H. S. Baird, “Baffletext: A human interactive proof,” in Proceedings of the Document Recognition and Retrieval X, pp. 305–316, USA, January 2003.
[18] R. Chow, P. Golle, M. Jakobsson, L. Wang, and X. Wang, “Making CAPTCHAs clickable,” in Proceedings of the 9th Workshop on Mobile Computing Systems and Applications, HotMobile 2008, pp. 91–94, USA, February 2008.
[19] P. Golle, “Machine learning attacks against the Asirra CAPTCHA,” in Proceedings of the 15th ACM Conference on Computer and Communications Security, CCS’08, pp. 535–542, USA, October 2008.
[20] G. Mori and J. Malik, “Recognizing objects in adversarial clutter: breaking a visual CAPTCHA,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 134–144, June 2003.
[21] M. Chew and J. D. Tygar, “Image Recognition CAPTCHAs,” in Proceedings of the 7th International Information Security Conference, vol. 3225 of Lecture Notes in Computer Science, pp. 268–279, Springer.
[22] K. Chellapilla, K. Larson, P. Simard, and M. Czerwinski, “Designing human friendly human interaction proofs (HIPs),” in Proceedings of the SIGCHI Conference, p. 711, Portland, Oregon, USA, April 2005.
[23] P. Y. Simard, R. Szeliski, J. Benaloh, J. Couvreur, and I. Calinov, “Using character recognition and segmentation to tell computer from humans,” in Proceedings of the 7th International Conference on Document Analysis and Recognition, ICDAR 2003, pp. 418–423, UK, August 2003.
[24] K. Chellapilla, K. Larson, P. Y. Simard, and M. Czerwinski, “Building segmentation based human-friendly human interaction proofs (HIPs),” in Proceedings of the Second International Workshop on Human Interactive Proofs, HIP 2005, pp. 1–26, USA, May 2005.
[25] K. Chellapilla, K. Larson, P. Simard, and M. Czerwinski, “Computers beat humans at single character recognition in reading based human interaction proofs (HIPs),” in Proceedings of the 2nd Conference on Email and Anti-Spam, USA, July 2005.
[26] J. Elson, J. R. Douceur, J. Howell, and J. Saul, “Asirra: A CAPTCHA that exploits interest-aligned manual image categorization,” in Proceedings of the 14th ACM Conference on Computer and Communications Security, CCS’07, pp. 366–374, USA, November 2007.
[27] Y. Rui and Z. Liu, “ARTiFACIAL: Automated reverse Turing test using FACIAL features,” Multimedia Systems, vol. 9, no. 6, pp. 493–502, 2004.
[28] K. A. Kluever and R. Zanibbi, “Balancing usability and security in a video CAPTCHA,” in Proceedings of the 5th Symposium on Usable Privacy and Security, SOUPS 2009, USA, July 2009.
[29] R. Gossweiler, M. Kamvar, and S. Baluja, “What’s up CAPTCHA? A CAPTCHA based on image orientation,” in Proceedings of the 18th International World Wide Web Conference, WWW 2009, pp. 841–850, Spain, April 2009.
[30] I. J. Goodfellow, Y. Bulatov, J. Ibarz et al., “Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks,” 2014, https://www.researchgate.net/publication/259399973_Multi-digit_Number_Recognition_from_Street_View_Imagery_using_Deep_Convolutional_Neural_Networks.
[31] T.-Y. Chan, “Using a text-to-speech synthesizer to generate a reverse Turing test,” in Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence, pp. 226–232, Sacramento, Calif, USA, 2003.
[32] G. Kochanski, D. Lopresti, and C. Shih, “A reverse Turing test using speech,” in Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP 2002, pp. 1357–1360, September 2002.
[33] http://www.lancaster.ac.uk/people/yanj2/.
[34] J. Yan and A. S. El Ahmad, “Breaking visual CAPTCHAs with naïve pattern recognition algorithms,” in Proceedings of the 23rd Annual Computer Security Applications Conference, ACSAC 2007, pp. 279–291, December 2007.
[35] J. Yan and A. S. El Ahmad, “Usability of CAPTCHAs or usability issues in CAPTCHA design,” in Proceedings of the 4th Symposium on Usable Privacy and Security, SOUPS 2008, pp. 44–55, July 2008.
[36] A. S. El Ahmad, J. Yan, and L. Marshall, “The robustness of a new CAPTCHA,” in Proceedings of the 3rd European Workshop on System Security, EUROSEC’10, pp. 36–41, April 2010.
[37] B. B. Zhu, J. Yan, Q. Li et al., “Attacks and design of image recognition CAPTCHAs,” in Proceedings of the 17th ACM Conference on Computer and Communications Security, CCS’10, pp. 187–200, October 2010.
[38] A. S. E. Ahmad, J. Yan, and M. Tayara, “The Robustness of Google CAPTCHAs,” Computing Science Technical Report CS-TR-1278, Newcastle University, 2011.
[39] A. S. El Ahmad, J. Yan, and W.-Y. Ng, “CAPTCHA design: Color, usability, and security,” IEEE Internet Computing, vol. 16, no. 2, pp. 44–51, 2012.
[40] A. Algwil, D. Ciresan, B. Liu, and J. Yan, “A security analysis of automated Chinese Turing tests,” in Proceedings of the 32nd Annual Computer Security Applications Conference, ACSAC 2016, pp. 520–532, December 2016.
[41] H. Gao, W. Wang, J. Qi, X. Wang, X. Liu, and J. Yan, “The robustness of hollow CAPTCHAs,” in Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, CCS 2013, pp. 1075–1085, November 2013.
[42] H. Gao, J. Yan, F. Cao et al., “A Simple Generic Attack on Text Captchas,” in Proceedings of the Network and Distributed System Security Symposium, pp. 1–14, San Diego, Calif, USA, 2016.
[43] http://web.xidian.edu.cn/hchgao/paper.html.
[44] H. Gao, W. Wang, and Y. Fan, “Divide and conquer: An efficient attack on Yahoo! CAPTCHA,” in Proceedings of the 11th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom-2012, pp. 9–16, June 2012.
[45] F. Dai, H. Gao, and D. Liu, “Breaking CAPTCHAs with second template matching and BP neural network algorithms,” International Journal of Information Processing and Management, vol. 4, no. 3, pp. 126–133, 2013.
[46] H. Gao, W. Wang, Y. Fan, J. Qi, and X. Liu, “The robustness of “connecting characters together” CAPTCHAs,” Journal of Information Science and Engineering, vol. 30, no. 2, pp. 347–369, 2014.
[47] H. Gao, X. Wang, F. Cao et al., “Robustness of text-based completely automated public Turing test to tell computers and humans apart,” IET Information Security, vol. 10, no. 1, pp. 45–52, 2016.
[48] R. Hussain, H. Gao, and R. A. Shaikh, “Segmentation of connected characters in text-based CAPTCHAs for intelligent character recognition,” Multimedia Tools and Applications, pp. 1–15, 2016.
[49] R. Hussain, H. Gao, R. A. Shaikh, and S. P. Soomro, “Recognition based segmentation of connected characters in text based CAPTCHAs,” in Proceedings of the 8th IEEE International Conference on Communication Software and Networks, ICCSN 2016, pp. 673–676, June 2016.
[50] https://captcha.com/.
[51] http://jcaptcha.sourceforge.net/.
[52] http://www.hinsite.com.
[53] http://caca.zoy.org/wiki/PWNtcha.
[54] https://code.google.com/p/captchacker.
[55] http://www.brains-n-brawn.com/default.aspx?vDir=aicaptcha.
[56] http://www.cs.sfu.ca/~mori/research/gimpy/.
[57] G. Moy, N. Jones, C. Harkless, and R. Potter, “Distortion estimation techniques in solving visual CAPTCHAs,” in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004, pp. II23–II28, July 2004.
[58] A. Bansal, D. Garg, and A. Gupta, “Breaking a Visual CAPTCHA: A Novel Approach using HMM,” 2008, https://pdfs.semanticscholar.org/3c2c/9af1e9a3b7095edaf8f205dfbadc30f917fb.pdf.
[59] S. Li, S. A. H. Shah, M. A. U. Khan, S. A. Khayam, A.-R. Sadeghi, and R. Schmitz, “Breaking e-banking CAPTCHAs,” in Proceedings of the 26th Annual Computer Security Applications Conference, ACSAC 2010, pp. 171–180, December 2010.
[60] C. Hong, B. Lopez-Pineda, K. Rajendran, and A. Recasens, “Breaking Microsoft’s CAPTCHA,” 2015, https://courses.csail.mit.edu/6.857/2016/files/hong-lopezpineda-rajendran-recansens.pdf.
[61] O. Starostenko, C. Cruz-Perez, F. Uceda-Ponga, and V. Alarcon-Aquino, “Breaking text-based CAPTCHAs with variable word and character orientation,” Pattern Recognition, vol. 48, no. 4, pp. 1097–1108, 2015.
[62] L. Zhang, L. Zhang, S.-G. Huang, and Z.-X. Shi, “A highly reliable CAPTCHA recognition algorithm based on rejection,” Acta Automatica Sinica, vol. 37, no. 7, pp. 891–900, 2011.
[63] R. Chen, J. Yang, R.-G. Hu, and S.-G. Huang, “A novel LSTM-RNN decoding algorithm in CAPTCHA recognition,” in Proceedings of the 3rd International Conference on Instrumentation and Measurement, Computer, Communication and Control, IMCCC 2013, pp. 766–771, September 2013.
[64] S. Sano, T. Otsuka, K. Itoyama, and H. G. Okuno, “HMM-based attacks on Google’s ReCAPTCHA with continuous visual and audio symbols,” Journal of Information Processing, vol. 23, no. 6, pp. 814–826, 2015.
[65] J. Sauvola and M. Pietikäinen, “Adaptive document image binarization,” Pattern Recognition, vol. 33, no. 2, pp. 225–236, 2000.
[66] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, pp. 62–66, 1979.
[67] C. J. Hilditch, “Linear Skeletons from Square Cupboards,” Machine Intelligence, pp. 403–420, 1969.
[68] T. Y. Zhang and C. Y. Suen, “A fast parallel algorithm for thinning digital patterns,” Communications of the ACM, vol. 27, no. 3, pp. 236–239, 1984.