Captcha: G.Pullaiah College of Engineering and Technology (Gpcet)
Submitted by
G.SINDHU PALLAVI
ROLL.NO:07AT1A0519
CSE
Contents
TOPIC
1. Introduction
2. History
3. Characteristics
4. Applications
5. Guidelines
8. Conclusion
9. References
INTRODUCTION:
HISTORY:
Moni Naor was the first person to theorize a list of ways to verify that a request
comes from a human and not a bot. Primitive CAPTCHAs seem to have been developed in
1997 by Andrei Broder, Martin Abadi, Krishna Bharat, and Mark Lillibridge to prevent bots
from adding URLs to their search engine. To make the images resistant to OCR (Optical
Character Recognition), the team simulated the situations that scanner manuals claimed resulted in
bad OCR. In 2000, Luis von Ahn and Manuel Blum coined the term 'CAPTCHA' and improved and
publicized the notion, which covers any program that can distinguish humans from
computers. They invented multiple examples of CAPTCHAs, including the first CAPTCHAs to
be widely used, which were those adopted by Yahoo!.
CHARACTERISTICS:
Although a checkbox "check here if you are not a bot" might serve to distinguish
between humans and computers, it is not a CAPTCHA because it relies on the fact that an
attacker has not spent effort to break that specific form.
Withholding the algorithm can increase the integrity of a limited set of systems,
as in the practice of security through obscurity. The most important factor in deciding whether
an algorithm should be made open or restricted is the size of the system. Although an algorithm
that survives scrutiny by security experts may be assumed to be more conceptually secure
than an unevaluated algorithm, an unevaluated algorithm specific to a very limited set of
systems is usually of less interest to those engaging in automated abuse. Breaking a CAPTCHA
generally requires some effort specific to that particular CAPTCHA implementation, and an
abuser may decide that the benefit gained from automated bypass is outweighed by the effort required
to attack that system in the first place.
For example, humans can read distorted text such as that used in a text-based CAPTCHA, but
current computer programs cannot.
Figure: An example of a text-based CAPTCHA.
APPLICATIONS of CAPTCHAs:
Most bloggers are familiar with programs that submit bogus comments, usually
for the purpose of raising the search engine rank of some website (e.g., "buy penny stocks
here"). This is called comment spam. With a CAPTCHA in place, only humans can enter
comments on a blog. There is no need to make users sign up before they enter a
comment, and no legitimate comments are ever lost!
Several companies (Yahoo!, Microsoft, etc.) offer free email services. Up until a
few years ago, most of these services suffered from a specific type of attack: "bots" that
would sign up for thousands of email accounts every minute. The solution to this problem
was to use CAPTCHAs to ensure that only humans obtain free accounts. In general, free
services should be protected with a CAPTCHA in order to prevent abuse by automated
programs.
Online Polls:
CAPTCHAs also offer a plausible solution against email worms and spam: "I will
only accept an email if I know there is a human behind the other computer." A few
companies are already marketing this idea.
These are the main applications of CAPTCHAs on websites. Websites that implement
CAPTCHAs should follow some general guidelines.
GUIDELINES:
Accessibility:
Image Security:
Images of text should be distorted randomly before being presented to the user.
Many implementations of CAPTCHAs use undistorted text, or text with only minor
distortions. These implementations are vulnerable to simple automated attacks. For example,
CAPTCHAs of this kind can be broken using image processing techniques, mainly
because they use a consistent font.
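As a rough illustration of random distortion, the sketch below draws fresh rendering parameters for every character of a challenge. The function name, the font names, and the parameter ranges are assumptions made for illustration only; the actual drawing step, which an imaging library would perform, is left out.

```python
import secrets
import string

# Hypothetical font names; a real system would load actual font files.
FONTS = ["serif-a", "sans-b", "mono-c", "script-d"]

# OS-backed random source, harder to predict than the default generator.
_rng = secrets.SystemRandom()


def random_challenge(length=6):
    """Return (text, glyph_params): random text plus per-character distortion settings."""
    alphabet = string.ascii_uppercase + string.digits
    text = "".join(_rng.choice(alphabet) for _ in range(length))
    glyph_params = []
    for ch in text:
        glyph_params.append({
            "char": ch,
            "font": _rng.choice(FONTS),             # vary the font per character
            "rotation_deg": _rng.uniform(-30, 30),  # random tilt (assumed range)
            "scale": _rng.uniform(0.8, 1.3),        # random size (assumed range)
            "dx": _rng.randint(-3, 3),              # horizontal jitter in pixels
            "dy": _rng.randint(-5, 5),              # vertical jitter in pixels
        })
    return text, glyph_params
```

Because the font and geometry change from character to character and from challenge to challenge, an attacker cannot rely on a single fixed template per letter.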
Script Security:
A CAPTCHA must also be secure at the script level. Common weaknesses include:
(1) Systems that pass the answer to the CAPTCHA in plain text as part of the
web form.
(2) Systems where a solution to the same CAPTCHA can be used multiple times
(this makes the CAPTCHA vulnerable to so-called "replay attacks").
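One way to avoid both weaknesses is to keep the answer on the server and to discard each challenge after its first use. The sketch below is a minimal illustration under stated assumptions, not a production design: the in-memory dictionary, the function names, the 120-second lifetime, and the case-insensitive comparison are all choices made for this example.

```python
import hmac
import secrets
import time

# In-memory store for illustration; maps challenge id -> (answer, expiry time).
_pending = {}
TTL_SECONDS = 120  # assumed lifetime of a challenge


def issue_challenge(answer, now=None):
    """Store the answer on the server and return only an opaque id for the form."""
    now = time.time() if now is None else now
    challenge_id = secrets.token_urlsafe(16)
    _pending[challenge_id] = (answer, now + TTL_SECONDS)
    return challenge_id


def verify(challenge_id, response, now=None):
    """Check a response; each challenge id can be used at most once."""
    now = time.time() if now is None else now
    # pop() removes the challenge on the first attempt, right or wrong,
    # so a captured id and solution cannot be replayed.
    entry = _pending.pop(challenge_id, None)
    if entry is None:
        return False
    answer, expires = entry
    if now > expires:
        return False
    # Constant-time comparison; ignoring case is an assumption of this sketch.
    return hmac.compare_digest(answer.lower().encode(), response.lower().encode())
```

The web form carries only the opaque id, never the answer, and a second submission with the same id fails because the entry has already been removed.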
Human solvers:
CAPTCHAs are vulnerable to a relay attack that uses humans to solve the puzzles.
One approach involves relaying the puzzles to a group of human operators who can solve
CAPTCHAs. In this scheme, a computer fills out a form and, when it reaches a CAPTCHA,
passes the CAPTCHA to a human operator to solve.
Another variation of this technique involves copying the CAPTCHA images and
using them as CAPTCHAs for a high-traffic site owned by the attacker. With enough traffic, the
attacker can get a solution to the CAPTCHA puzzle in time to relay it back to the target site. In
October 2007, a piece of malware appeared in the wild which enticed users to solve CAPTCHAs
in order to see progressively further into a series of striptease images.
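A number of research projects have attempted (often with success) to beat visual
CAPTCHAs by creating programs that contain the following functionality:
(1) Pre-processing: removal of background clutter and noise.
(2) Segmentation: splitting the image into regions that each contain a single character.
(3) Classification: identifying the character in each region.
Steps 1 and 3 are easy tasks for computers. The only step where humans still
outperform computers is segmentation. If the background clutter consists of shapes similar to
letter shapes, and the letters are connected by this clutter, segmentation becomes nearly
impossible.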
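The effect of connecting clutter on segmentation can be seen with a deliberately simple segmenter. The sketch below splits a binary image wherever a column contains no ink; the function name and the tiny '#'/'.' bitmaps are assumptions made for illustration, and real attacks use far more capable methods.

```python
def segment_columns(bitmap):
    """Split a binary image into character spans using empty columns as separators.

    bitmap: list of equal-length strings, '#' for ink and '.' for background.
    Returns a list of (start, end) column ranges, end exclusive.
    """
    width = len(bitmap[0])
    # A column contains ink if any row has '#' at that position.
    has_ink = [any(row[x] == "#" for row in bitmap) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                    # a new span begins
        elif not ink and start is not None:
            spans.append((start, x))     # an empty column ends the span
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


clean = [
    "##..#.##",
    "#...#..#",
    "##..#.##",
]
cluttered = [
    "##..#.##",
    "#####..#",  # a stroke of clutter joins the first two glyphs
    "##..#.##",
]
print(segment_columns(clean))      # [(0, 2), (4, 5), (6, 8)]
print(segment_columns(cluttered))  # [(0, 5), (6, 8)]
```

In the clean image the three glyphs are separated correctly; a single connecting stroke merges the first two into one region, which is why clutter that touches the letters makes this step hard.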
A shape-context-based approach has been used to break Gimpy, the CAPTCHA test used at
Yahoo! to screen out bots. This approach makes use of general-purpose algorithms that have been
designed for generic object recognition. The same basic ideas have been applied to finding
people in images, matching handwritten digits, and recognizing 3D objects.
Below are a few examples of images analyzed using this method, and the word
that was found. Correct words are shown in green, incorrect words in red. For EZ-Gimpy,
experiments were run on 191 images. The method correctly identified the word in 176 of these
images, a success rate of 92%. The algorithm takes only a few seconds to process one image.
Figure: Example images; words found: SCREW, SPACE.
Extreme Distortion:
One way to improve security is to use more heavily distorted text in images, so
that bots are unable to break them.
With extremely distorted images, it becomes very difficult for current
programs to break them.
reCAPTCHAs:
reCAPTCHA is a system developed at Carnegie Mellon University that uses
CAPTCHA to help digitize the text of books while protecting websites from bots attempting to
access restricted areas. reCAPTCHA supplies subscribing websites with images of words that
optical character recognition (OCR) software has been unable to read. The subscribing websites
present these images for humans to decipher as CAPTCHA words, as part of their normal
validation procedures.
Image-Recognition Captchas:
Here the challenge is presented in image form: the user has to click the 3 pictures of
kittens. This is difficult for a bot to overcome, and the probability of getting the answer right by
chance is very low.
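The chance of passing such a challenge by blind guessing can be estimated by counting the possible selections. The source does not state how many pictures are shown, so the grid sizes of 9 and 16 below are assumptions; the sketch also assumes the bot picks exactly 3 distinct pictures uniformly at random.

```python
from math import comb


def chance_of_guessing(total_images, targets):
    """Probability that a uniformly random pick of `targets` images is exactly right."""
    return 1 / comb(total_images, targets)


# Assumed grid of 9 images with 3 kittens: C(9, 3) = 84 possible selections.
print(chance_of_guessing(9, 3))   # 0.0119... (1 in 84)
# Assumed grid of 16 images with 3 kittens: C(16, 3) = 560 possible selections.
print(chance_of_guessing(16, 3))  # 0.00178... (1 in 560)
```

Even under these assumptions a bot that retries many times will eventually succeed, so such challenges are usually combined with limits on repeated attempts.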
CONCLUSION:
Thus CAPTCHAs have taken many forms, from text-based to image
recognition, and many new forms are yet to come. There are even proposals to
introduce common-sense questions as CAPTCHAs in the future. We hope for
more sophisticated CAPTCHAs in the future that will lock out any bot.
REFERENCES:
http://www.captcha.net/
http://en.wikipedia.org/
http://www.captcha.ru/en/breakings/
http://www.cs.sfu.ca/~mori/research/gimpy/