Gfeissweet (talk | contribs) I removed spaces, and removed a advertisement. |
→External links: Removed template per Wikipedia:Templates for discussion/Log/2024 December 4#Template:CAPTCHAs. |
(36 intermediate revisions by 25 users not shown) | |||
Line 3:
{{Cleanup rewrite|date=November 2022|It feels like an essay criticising CAPTCHA|section=no}}
{{Use dmy dates|date=October 2022}}
[[File:captcha.jpg|upright=1.35|thumb|This CAPTCHA (
A '''CAPTCHA''' ({{IPAc-en|ˈ|k|æ|p|.|tʃ|ə}} {{respell|KAP|chə}}) is a type of [[challenge–response authentication|challenge–response]] test used in [[computing]] to determine whether the user is human in order to deter bot attacks and spam.<ref>{{Cite web|title=The reCAPTCHA Project – Carnegie Mellon University CyLab|url=https://www.cylab.cmu.edu/partners/success-stories/recaptcha.html|url-status=dead|archive-url=https://web.archive.org/web/20171027203659/https://www.cylab.cmu.edu/partners/success-stories/recaptcha.html|archive-date=2017-10-27|access-date=2017-01-13|website=www.cylab.cmu.edu}}</ref>
The term was coined in 2003 by [[Luis von Ahn]], [[Manuel Blum]], Nicholas J. Hopper, and [[John Langford (computer scientist)|John Langford]].<ref name="abhl" /> It is a [[contrived acronym]] for "Completely Automated Public [[Turing test]] to tell Computers and Humans Apart."<ref>{{Cite web |title=What is CAPTCHA? |url=https://support.google.com/a/answer/1217728 |url-status=live |archive-url=https://web.archive.org/web/20200806173938/https://support.google.com/a/answer/1217728 |archive-date=6 August 2020 |access-date=2022-09-09 |website=Google Support |publisher=Google Inc. |quote=CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a [...]}}</ref> A historically common type of CAPTCHA (displayed as
Two widely used CAPTCHA services are
== Purpose ==
== History ==
Line 19:
One of the earliest commercial uses of CAPTCHAs was in the Gausebeck–Levchin test. In 2000, idrive.com began to protect its signup page<ref>{{Cite web|title=idrive turing signup page|url=https://drive.google.com/open?id=0BzbOLm20p6CrUE1SSXp5Zjl2MW8|access-date=2017-05-19|website=Google Drive|archive-date=15 March 2023|archive-url=https://web.archive.org/web/20230315233241/https://accounts.google.com/v3/signin/identifier?dsh=S-569764738%3A1678923161301090&continue=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fdrive.google.com%2Fopen%3Fid%3D0BzbOLm20p6CrUE1SSXp5Zjl2MW8&followup=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fdrive.google.com%2Fopen%3Fid%3D0BzbOLm20p6CrUE1SSXp5Zjl2MW8&ifkv=AWnogHfb-QQLSi-KGh4vgzje6iZGJ1BZZvpaKSlXZLsXVSfSHlafPjo8v6B9qJTV2nuxzahDQYGTtw&osid=1&passive=1209600&service=wise&flowName=GlifWebSignIn&flowEntry=ServiceLogin|url-status=live}}</ref> with a CAPTCHA and prepared to file a patent.<ref name=":1">{{Cite web|url=https://drive.google.com/open?id=0BzbOLm20p6CrOS1mWEhITGJ4d2s|title=idrive turing patent application|access-date=2017-05-19|archive-date=15 March 2023|archive-url=https://web.archive.org/web/20230315233244/https://accounts.google.com/v3/signin/identifier?dsh=S956306740%3A1678923164278227&continue=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fdrive.google.com%2Fopen%3Fid%3D0BzbOLm20p6CrOS1mWEhITGJ4d2s&followup=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fdrive.google.com%2Fopen%3Fid%3D0BzbOLm20p6CrOS1mWEhITGJ4d2s&ifkv=AWnogHfWh9qH38C8IGelcYVq9WSJcqP6Q30eP1Bba6t1EcfIlDb1n3eZMtAJSv1IRxdTdxgTsu8r0A&osid=1&passive=1209600&service=wise&flowName=GlifWebSignIn&flowEntry=ServiceLogin|url-status=live}}</ref> In 2001, [[PayPal]] used such tests as part of a fraud prevention strategy in which they asked humans to "retype distorted text that programs have difficulty recognizing."<ref name=stringham2015>{{cite book 
|last1=Stringham|first1=Edward P |title=Private Governance : creating order in economic and social life |publisher=[[Oxford University Press]] |year=2015 |page=105 |isbn=978-0-19-936516-6 |oclc=5881934034 }}</ref> PayPal co-founder and CTO [[Max Levchin]] helped commercialize this use.
A popular deployment of CAPTCHA technology, [[reCAPTCHA]], was acquired by Google in 2009.<ref>{{cite web |title=Teaching computers to read: Google acquires reCAPTCHA |url=https://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html |url-status=live |archive-url=https://web.archive.org/web/20190831195346/https://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html |archive-date=31 August 2019 |access-date=29 October 2018 |website=Google Official Blog}}</ref> In addition to preventing bot fraud for its users, Google used reCAPTCHA and CAPTCHA technology to digitize the archives of ''[[The New York Times]]'' and books from Google Books in 2011.<ref>{{cite news |last1=Gugliotta |first1=Guy |date=28 March 2011 |title=Deciphering Old Texts, One Woozy, Curvy Word at a Time |website=The New York Times |url=https://www.nytimes.com/2011/03/29/science/29recaptcha.html |url-status=live |access-date=29 October 2018 |archive-url=https://web.archive.org/web/20171117172409/http://www.nytimes.com/2011/03/29/science/29recaptcha.html |archive-date=17 November 2017}}</ref>
== Characteristics ==
Line 33 ⟶ 32:
Each of these problems poses a significant challenge for a computer, even in isolation. Therefore, these three techniques in tandem make CAPTCHAs difficult for computers to solve.<ref name=bursz>{{cite book|last1=Bursztein|first1=Elie|first2=Matthieu|last2=Martin|first3=John C.|last3=Mitchell|chapter=Text-based CAPTCHA Strengths and Weaknesses|title=ACM Computer and Communication Security 2011 (CSS'2011)|year=2011|chapter-url=https://www.elie.net/publication/text-based-captcha-strengths-and-weaknesses|access-date=5 April 2016|archive-date=24 November 2015|archive-url=https://web.archive.org/web/20151124055747/https://www.elie.net/publication/text-based-captcha-strengths-and-weaknesses|url-status=live}}</ref>
Whilst primarily used for security reasons, CAPTCHAs can also serve as a benchmark task for artificial intelligence technologies. According to an article by von Ahn, Blum, Hopper and Langford,<ref name=Ahn2003>{{Cite book| chapter-url=https://link.springer.com/content/pdf/10.1007/3-540-39200-9_18.pdf| doi=10.1007/3-540-39200-9_18| chapter=CAPTCHA: Using Hard AI Problems for Security| title=Advances in Cryptology—EUROCRYPT 2003| volume=2656| pages=294–311| series=Lecture Notes in Computer Science| year=2003| last1=von Ahn| first1=Luis| last2=Blum| first2=Manuel| last3=Hopper| first3=Nicholas J.| last4=Langford| first4=John| isbn=978-3-540-14039-9| s2cid=5658745| access-date=30 August 2019| archive-date=4 May 2019| archive-url=https://web.archive.org/web/20190504115630/https://link.springer.com/content/pdf/10.1007%2F3-540-39200-9_18.pdf| url-status=live}}</ref> "any program that passes the tests generated by a CAPTCHA can be used to solve a hard unsolved AI problem."<ref>
== Accessibility ==
{{See also|Web accessibility}}
[[File:FancyCaptcha screenshot.png|left|thumb|260px|Many websites require typing a CAPTCHA when creating an account to prevent spam. This image shows a user typing the CAPTCHA word "sepalbeam".]]
CAPTCHAs based on reading text—or other visual-perception tasks—prevent [[blindness|blind]] or [[visual impairment|visually impaired]] users from accessing the protected resource.<ref name="w3c_inaccessibility">{{cite web |url=http://www.w3.org/TR/turingtest/ |title=Inaccessibility of CAPTCHA |date=2005-11-23 |access-date=2015-04-27 |publisher=[[W3C]] |author=May, Matt |archive-date=21 May 2012 |archive-url=https://web.archive.org/web/20120521023537/http://www.w3.org/TR/turingtest/ |url-status=live }}</ref><ref>{{cite
CAPTCHAs do not have to be visual. Any hard [[artificial intelligence]] problem, such as [[speech recognition]], can be used as a CAPTCHA. Some implementations, such as reCAPTCHA, permit users to opt for an audio CAPTCHA, though a 2011 paper demonstrated a technique for defeating the popular schemes at the time.<ref>{{cite book|last1=Bursztein|first1=Elie|first2=Romain|last2=Beauxis|first3=Hristo|last3=Paskov|first4=Daniele|last4=Perito|last5=Fabry|first5=Celine|last6=Mitchell|first6=John C.|title=2011 IEEE Symposium on Security and Privacy |chapter=The Failure of Noise-Based Non-continuous Audio Captchas |pages=19–31|year=2011|chapter-url=https://www.elie.net/publication/the-failure-of-noise-based-non-continuous-audio-captchas|doi=10.1109/SP.2011.14|isbn=978-1-4577-0147-4|s2cid=6933726|access-date=5 April 2016|archive-date=16 April 2016|archive-url=https://web.archive.org/web/20160416221427/https://www.elie.net/publication/the-failure-of-noise-based-non-continuous-audio-captchas|url-status=live}}</ref>
ProtectWebForm proposed a method, named "Smart CAPTCHA", to make CAPTCHAs easier for users.<ref>{{cite web|date=2006-10-08|title=Smart Captcha|url=http://www.protectwebform.com/smartcaptcha|url-status=dead|archive-url=https://web.archive.org/web/20161104163541/http://protectwebform.com/smartcaptcha|archive-date=2016-11-04|access-date=2017-09-15|publisher=Protect Web Form .COM}}</ref> Developers are advised to combine CAPTCHA with JavaScript. Because most bots have difficulty parsing and executing JavaScript, the proposed combined method uses a script to fill in the CAPTCHA field and to hide both the image and the field from human eyes.<ref>{{Cite web |title=Invisible reCAPTCHA |url=https://developers.google.com/recaptcha/docs/invisible |access-date=2022-10-28 |website=Google Developers |language=en |archive-date=16 January 2020 |archive-url=https://web.archive.org/web/20200116133416/https://developers.google.com/recaptcha/docs/invisible |url-status=live }}</ref>
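The server side of such a combined check might look roughly like the sketch below. It is an illustration only, not the ProtectWebForm or reCAPTCHA implementation: the function names, the use of an HMAC over a session identifier, and the fallback to a visible challenge are all assumptions made for the example. The idea is that a page script writes a server-issued value into the hidden CAPTCHA field, so a client that does not run JavaScript submits the field empty or wrong.

```python
import hashlib
import hmac
from typing import Optional


def expected_field_value(secret: bytes, session_id: str) -> str:
    """Value that the page's script would write into the hidden CAPTCHA field.

    Hypothetical scheme: an HMAC of the session identifier under a server secret.
    """
    return hmac.new(secret, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def needs_visible_captcha(secret: bytes, session_id: str, submitted: Optional[str]) -> bool:
    """Return True when the hidden field was not filled in correctly.

    A client that did not execute the script leaves the field empty (or wrong),
    so the server would fall back to showing an ordinary, visible CAPTCHA.
    """
    expected = expected_field_value(secret, session_id)
    # Compare as bytes in constant time; a missing field counts as empty.
    return not hmac.compare_digest(expected.encode("utf-8"), (submitted or "").encode("utf-8"))
```

A check like this only filters out clients that cannot execute scripts; a bot driving a full browser would still pass it, which is why the sketch falls back to a visible challenge instead of rejecting the request outright.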
Line 54 ⟶ 49:
== Circumvention ==
The two main ways to bypass CAPTCHAs are to use cheap human labor to solve them and to use [[machine learning]] to build an automated solver.<ref>{{cite book|last=Jakobsson|first=Markus|title=The death of the Internet|url=http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1118062418.html|access-date=4 April 2016|date=August 2012|archive-date=15 October 2014|archive-url=https://web.archive.org/web/20141015182639/http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1118062418.html|url-status=live}}</ref> According to former Google "''[[click fraud]] czar''" [[Shuman Ghosemajumder]], there are numerous services which solve CAPTCHAs automatically.<ref name=ai-security>{{cite news |last=Ghosemajumder |first=Shuman |title=The Imitation Game: The New Frontline of Security |url=http://www.infoq.com/presentations/ai-security |agency=InfoQ |access-date=8 December 2015 |newspaper=InfoQ |date=8 December 2015 |archive-date=23 March 2019 |archive-url=https://web.archive.org/web/20190323061742/https://www.infoq.com/presentations/ai-security |url-status=live }}</ref>
[[File:Modern-captcha.jpg|thumb|An example of a [[reCAPTCHA]] challenge from 2007, containing the words "following finding". The waviness and horizontal stroke were added to increase the difficulty of breaking the CAPTCHA with a computer program.]]
[[File:Captchacat.png|thumb|A CAPTCHA usually has a text box directly underneath where the user should fill out the text that they see. In this case, "sclt ..was here".]]
=== Machine learning-based attacks ===
Early CAPTCHAs were designed and evaluated without a systematic methodology.<ref name=bursz /> As a result, many CAPTCHAs were of a fixed length, so automated solvers could make educated guesses about where segmentation should take place. Other early CAPTCHAs drew on limited sets of words, which made the test much easier to game. Still others{{Example needed|date=October 2022}} made the mistake of relying too heavily on background confusion in the image. In each case, algorithms were created that completed the task by exploiting these design flaws. However, light changes to the CAPTCHA could thwart them. Modern CAPTCHAs like [[reCAPTCHA]] present variations of characters that are collapsed together, making them hard to segment, and they have warded off automated tasks.<ref name=bursz2 />
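Two of the design flaws described above can be illustrated with a short sketch. It is a simplified illustration under stated assumptions, not a reconstruction of any published attack: the function names are hypothetical, the characters are assumed to be evenly spaced, and the word list is assumed to be known to the attacker. The first function guesses segmentation boundaries for a fixed-length CAPTCHA, and the second maps a noisy reading onto the closest entry in a limited word list.

```python
import difflib
from typing import List, Optional, Tuple


def equal_width_segments(image_width: int, n_chars: int) -> List[Tuple[int, int]]:
    """Guess (start, end) pixel columns for each character of a fixed-length CAPTCHA.

    Assumes evenly spaced characters, so the image is cut into n_chars equal slices.
    """
    if n_chars <= 0 or image_width < n_chars:
        raise ValueError("need at least one pixel column per character")
    bounds = [i * image_width // n_chars for i in range(n_chars + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def snap_to_word_list(noisy_reading: str, known_words: List[str]) -> Optional[str]:
    """Replace an imperfect reading with the most similar word from a limited list.

    Returns None when no word in the list is similar enough.
    """
    matches = difflib.get_close_matches(noisy_reading, known_words, n=1, cutoff=0.6)
    return matches[0] if matches else None
```

For example, `equal_width_segments(120, 4)` yields four 30-pixel slices, and a reading with one misrecognised character can still be resolved when the set of possible words is small. Variable length and collapsed characters remove the assumptions that both functions depend on.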
In October 2013, artificial intelligence company [[Vicarious (Company)|Vicarious]] claimed that it had developed a generic CAPTCHA-solving algorithm that was able to solve modern CAPTCHAs with character recognition rates of up to 90%.<ref>{{cite web|last=Summers|first=Nick|title=Vicarious claims its AI software can crack up to 90% of CAPTCHAs offered by Google, Yahoo and PayPal|url=https://thenextweb.com/insider/2013/10/28/vicarious-claims-ai-software-can-now-crack-90-captchas-google-yahoo-paypal/|publisher=TNW|access-date=19 June 2018|archive-date=15 September 2018|archive-url=https://web.archive.org/web/20180915002117/https://thenextweb.com/insider/2013/10/28/vicarious-claims-ai-software-can-now-crack-90-captchas-google-yahoo-paypal/|url-status=live}}</ref> However, [[Luis von Ahn]], a pioneer of early CAPTCHA and founder of reCAPTCHA, said: "It's hard for me to be impressed since I see these every few months." Fifty claims similar to that of Vicarious had been made since 2003.<ref>{{cite web|last=Hof|first=Robert|title=AI Startup Vicarious Claims Milestone In Quest To Build A Brain: Cracking CAPTCHA|url=https://www.forbes.com/sites/roberthof/2013/10/28/ai-startup-vicarious-claims-milestone-in-quest-to-build-a-brain-craking-captcha/|work=Forbes|access-date=25 August 2017|archive-date=15 September 2018|archive-url=https://web.archive.org/web/20180915002819/https://www.forbes.com/sites/roberthof/2013/10/28/ai-startup-vicarious-claims-milestone-in-quest-to-build-a-brain-craking-captcha/|url-status=live}}</ref>
Line 74 ⟶ 68:
Another technique consists of using a script to re-post the target site's CAPTCHA as a CAPTCHA to the attacker's site, which unsuspecting humans visit and solve within a short while for the script to use.<ref>{{cite web|url=http://www.boingboing.net/2004/01/27/solving_and_creating.html |title=Solving and creating captchas with free porn |last=Doctorow |first=Cory |author-link=Cory Doctorow |date=2004-01-27 |work=Boing Boing |archive-url=https://web.archive.org/web/20060209040456/http://www.boingboing.net/2004/01/27/solving_and_creating.html |archive-date=2006-02-09 |access-date=2015-04-27 |url-status=dead }}</ref><ref>{{cite web | url = http://petmail.lothar.com/design.html#auto35 | title = Hire People To Solve CAPTCHA Challenges | access-date = 2015-04-27 | date = 2005-07-21 | work = Petmail Design | archive-date = 18 September 2020 | archive-url = https://web.archive.org/web/20200918050055/http://petmail.lothar.com/design.html#auto35 | url-status = live }}</ref>
In 2023,
=== Outsourcing to paid services ===
Line 89 ⟶ 83:
Chew et al. published their work in the 7th International Information Security Conference, ISC'04, proposing three different versions of image recognition CAPTCHAs, and validating the proposal with user studies. It is suggested that one of the versions, the anomaly CAPTCHA, is best, with 100% of human users being able to pass an anomaly CAPTCHA with at least 90% probability in 42 seconds.<ref>{{cite web |url=http://www.cs.berkeley.edu/~tygar/papers/Image_Recognition_CAPTCHAs/imagecaptcha.pdf |title=Image Recognition CAPTCHAs |publisher=Cs.berkeley.edu |access-date=2013-09-28 |archive-url=https://web.archive.org/web/20130510022240/http://www.cs.berkeley.edu/~tygar/papers/Image_Recognition_CAPTCHAs/imagecaptcha.pdf |archive-date=2013-05-10 |url-status=dead }}</ref> Datta et al. published their paper, named IMAGINATION (IMAge Generation for INternet AuthenticaTION), in the [[Association for Computing Machinery|ACM]] [[Multimedia]] '05 Conference, proposing a systematic approach to image recognition CAPTCHAs. Images are distorted so that image recognition approaches cannot recognise them.<ref>{{cite web |url=http://infolab.stanford.edu/~wangz/project/imsearch/IMAGINATION/ACM05/ |title=Imagination Paper |publisher=Infolab.stanford.edu |access-date=2013-09-28 |archive-date=2 October 2013 |archive-url=https://web.archive.org/web/20131002170726/http://infolab.stanford.edu/~wangz/project/imsearch/IMAGINATION/ACM05/ |url-status=live }}</ref>
Microsoft (Jeremy Elson, John R. Douceur, Jon Howell, and Jared Saul) claim to have developed Animal Species Image Recognition for Restricting Access (ASIRRA), which asks users to distinguish cats from dogs. Microsoft had a beta version of this for websites to use.<ref>{{cite web |url=https://www.microsoft.com/en-us/research/publication/asirra-a-captcha-that-exploits-interest-aligned-manual-image-categorization/ |archive-url=https://web.archive.org/web/20081215032402/http://research.microsoft.com/en-us/um/redmond/projects/asirra/ |archive-date=15 December 2008 |title=Asirra is a human interactive proof that asks users to identify photos of cats and dogs |website=[[Microsoft]] |url-status=dead }}</ref> They claim "Asirra is easy for users; it can be solved by humans 99.6% of the time in under 30 seconds. Anecdotally, users seemed to find the experience of using Asirra much more enjoyable than a text-based CAPTCHA." This solution was described in a 2007 paper in the Proceedings of the 14th ACM Conference on Computer and Communications Security (CCS).<ref>{{
== See also ==
* [[Bot prevention]]
* [[Defense strategy (computing)]]
* [[Proof of personhood]]
* [[Proof
* [[reCAPTCHA]]
Line 168 ⟶ 162:
==Further references==
* von Ahn, L;
== External links ==
{{Sister project links|d=Q484598|mw=CAPTCHA|voy=no|species=no|commons=Category:Captcha|v=no|q=no|n=no|s=no}}
<!--Please, no implementations here – see discussion-->
* [http://www.wisdom.weizmann.ac.il/~naor/PAPERS/human_abs.html Verification of a human in the loop, or Identification via the Turing Test], Moni Naor, 1996.
* [http://www.w3.org/TR/turingtest/ Inaccessibility of CAPTCHA: Alternatives to Visual Turing Tests on the Web], a [[W3C]] Working Group Note.