CAPTCHA: Difference between revisions

Accessibility: rearrange to improve flow
(22 intermediate revisions by 13 users not shown)
{{Cleanup rewrite|date=November 2022|It feels like an essay criticising CAPTCHA|section=no}}
{{Use dmy dates|date=October 2022}}
[[File:captcha.jpg|upright=1.35|thumb|This CAPTCHA ([[reCAPTCHA v1]]) of "smwm" obscures its message from computer interpretation by twisting the letters and adding a slight background color gradient.]]
 
A '''CAPTCHA''' ({{IPAc-en|ˈ|k|æ|p|.|tʃ|ə}} {{respell|KAP|chə}}) is a type of [[challenge–response authentication|challenge–response]] test used in [[computing]] to determine whether the user is human in order to deter bot attacks and spam.<ref>{{Cite web|title=The reCAPTCHA Project – Carnegie Mellon University CyLab|url=https://www.cylab.cmu.edu/partners/success-stories/recaptcha.html|url-status=dead|archive-url=https://web.archive.org/web/20171027203659/https://www.cylab.cmu.edu/partners/success-stories/recaptcha.html|archive-date=2017-10-27|access-date=2017-01-13|website=www.cylab.cmu.edu}}</ref>
 
The term was coined in 2003 by [[Luis von Ahn]], [[Manuel Blum]], Nicholas J. Hopper, and [[John Langford (computer scientist)|John Langford]].<ref name="abhl" /> It is a [[contrived acronym]] for "Completely Automated Public [[Turing test]] to tell Computers and Humans Apart."<ref>{{Cite web |title=What is CAPTCHA? |url=https://support.google.com/a/answer/1217728 |url-status=live |archive-url=https://web.archive.org/web/20200806173938/https://support.google.com/a/answer/1217728 |archive-date=6 August 2020 |access-date=2022-09-09 |website=Google Support |publisher=Google Inc. |quote=CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a [...]}}</ref> A historically common type of CAPTCHA (displayed as [[reCAPTCHA v1]]) was first invented in 1997 by two groups working in parallel. This form of CAPTCHA requires entering a sequence of letters or numbers shown in a distorted image. Because the test is administered by a computer, in contrast to the standard Turing test that is administered by a human, CAPTCHAs are sometimes described as [[reverse Turing test]]s.<ref>{{Cite journal |last=Mayumi Takaya |last2=Yusuke Tsuruta |last3=Akihiro Yamamura |date=2013-09-30 |title=Reverse Turing Test using Touchscreens and CAPTCHA |url=http://isyou.info/jowua/papers/jowua-v4n3-3.pdf |journal=Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications |volume=4 |issue=3 |pages=41–57 |doi=10.22667/JOWUA.2013.09.31.041|archive-date=22 August 2017|archive-url=https://web.archive.org/web/20170822001858/http://isyou.info/jowua/papers/jowua-v4n3-3.pdf|url-status=live}}</ref>
 
Two widely used CAPTCHA services are [[Google]]'s [[reCAPTCHA]]<ref>{{Cite web |title=What is reCAPTCHA? – reCAPTCHA Help |url=https://support.google.com/recaptcha/answer/6080904?hl=en |access-date=2023-07-20 |website=support.google.com |archive-date=20 July 2023 |archive-url=https://web.archive.org/web/20230720192427/https://support.google.com/recaptcha/answer/6080904?hl=en |url-status=live }}</ref><ref>{{Cite web |last=Sulgrove |first=Jonathan |date=2022-07-07 |title=reCAPTCHA: What It Is and Why You Should Use It on Your Website – TSTS |url=https://www.tsts.com/blog/recaptcha-what-it-is-and-why-you-should-use-it-on-your-website/ |access-date=2022-11-10 |website=Twin State Technical Services |language=en-US |archive-date=10 November 2022 |archive-url=https://web.archive.org/web/20221110020410/https://www.tsts.com/blog/recaptcha-what-it-is-and-why-you-should-use-it-on-your-website/ |url-status=live }}</ref> and the independent hCaptcha.<ref>{{Cite web |title=Websites using hCaptcha |url=https://trends.builtwith.com/websitelist/hCaptcha |access-date=2022-11-10 |website=trends.builtwith.com |archive-date=10 November 2022 |archive-url=https://web.archive.org/web/20221110020408/https://trends.builtwith.com/websitelist/hCaptcha |url-status=live }}</ref><ref>{{Cite web |title=hCaptcha – About Us |url=https://www.hcaptcha.com/about |access-date=2023-07-20 |website=www.hcaptcha.com |language=en |archive-date=20 July 2023 |archive-url=https://web.archive.org/web/20230720192429/https://www.hcaptcha.com/about |url-status=live }}</ref> It takes the average person approximately 10 seconds to solve a typical CAPTCHA.<ref>{{cite book|last1=Bursztein|first1=Elie|last2=Bethard|first2=Steven|last3=Fabry|first3=Celine|last4=Mitchell|first4=John C.|last5=Jurafsky|first5=Dan|title=2010 IEEE Symposium on Security and Privacy |chapter=How Good Are Humans at Solving CAPTCHAs? A Large Scale Evaluation |access-date=March 30, 2018|chapter-url=https://web.stanford.edu/~jurafsky/burszstein_2010_captcha.pdf|year=2010|pages=399–413|doi=10.1109/SP.2010.31|citeseerx=10.1.1.164.7848|isbn=978-1-4244-6894-2|s2cid=14204454|archive-date=8 August 2018|archive-url=https://web.archive.org/web/20180808033552/https://web.stanford.edu/~jurafsky/burszstein_2010_captcha.pdf|url-status=live}}</ref>
 
== Purpose ==
The purpose of CAPTCHAs is to prevent spam on websites, such as promotion spam, registration spam, and data scraping. Bots are less likely to abuse websites with spam if those websites use CAPTCHAs, and many websites use them effectively to prevent bot raiding. CAPTCHAs are designed so that humans can complete them, while most robots cannot.<ref>{{Cite web |last=Stec |first=Albert |date=2022-06-12 |title=What is CAPTCHA and How Does It Work? |url=https://www.baeldung.com/cs/captcha-intro |url-status=live |archive-url=https://web.archive.org/web/20221101005730/https://www.baeldung.com/cs/captcha-intro |archive-date=1 November 2022 |access-date=2022-11-01 |website=Baeldung on Computer Science |language=en-US}}</ref> Newer CAPTCHAs look at the user's behaviour on the internet to determine whether they are human.<ref>{{Cite web |date=November 1, 2022 |title=What is a CAPTCHA? |url=https://www.cloudflare.com/learning/bots/how-captchas-work/ |url-status=live |archive-url=https://web.archive.org/web/20221027061629/https://www.cloudflare.com/learning/bots/how-captchas-work/ |archive-date=27 October 2022 |access-date=November 1, 2022 |website=Cloudflare}}</ref> A normal CAPTCHA test only appears if the user acts like a bot, such as when they request webpages or click links too fast.
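
The idea of showing a test only when a user "acts like a bot" can be illustrated with a request-rate check. The following is a minimal sketch, not a description of how any particular CAPTCHA service works: the class name <code>RequestRateMonitor</code>, the method <code>should_challenge</code>, and the thresholds are all hypothetical and chosen for illustration only.

```python
from collections import deque


class RequestRateMonitor:
    """Hypothetical helper: flags a client when it sends too many requests
    within a short sliding time window."""

    def __init__(self, max_requests=5, window_seconds=2.0):
        # Illustrative thresholds (assumptions, not values from any real service).
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def should_challenge(self, now):
        """Record a request at time `now` (in seconds) and return True if the
        client has exceeded the allowed number of requests in the window."""
        # Forget requests that fall outside the sliding window.
        while self._timestamps and now - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        self._timestamps.append(now)
        return len(self._timestamps) > self.max_requests


monitor = RequestRateMonitor(max_requests=3, window_seconds=1.0)
# Four requests within 0.3 seconds: only the fourth exceeds the limit.
decisions = [monitor.should_challenge(t) for t in (0.0, 0.1, 0.2, 0.3)]
```

In this sketch a human browsing at an ordinary pace never exceeds the limit, while a script requesting pages in rapid succession would be shown a CAPTCHA.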
 
== History ==
 
A popular deployment of CAPTCHA technology, [[reCAPTCHA]], was acquired by Google in 2009.<ref>{{cite web |title=Teaching computers to read: Google acquires reCAPTCHA |url=https://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html |url-status=live |archive-url=https://web.archive.org/web/20190831195346/https://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html |archive-date=31 August 2019 |access-date=29 October 2018 |website=Google Official Blog}}</ref> In addition to preventing bot fraud for its users, Google used reCAPTCHA and CAPTCHA technology to digitize the archives of ''[[The New York Times]]'' and books from Google Books in 2011.<ref>{{cite news |last1=Gugliotta |first1=Guy |date=28 March 2011 |title=Deciphering Old Texts, One Woozy, Curvy Word at a Time |website=The New York Times |url=https://www.nytimes.com/2011/03/29/science/29recaptcha.html |url-status=live |access-date=29 October 2018 |archive-url=https://web.archive.org/web/20171117172409/http://www.nytimes.com/2011/03/29/science/29recaptcha.html |archive-date=17 November 2017}}</ref>
 
=== Invention ===
Eran Reshef, [[Gili Raanan]] and Eilon Solan, who worked at [[Sanctum (company)|Sanctum]] on [[Web application firewall|Application Security Firewall]], first patented CAPTCHA in 1997. Their patent application details that "The invention is based on applying human advantage in applying sensory and cognitive skills to solving simple problems that prove to be extremely hard for computer software. Such skills include, but are not limited to processing of sensory information such as identification of objects and letters within a noisy graphical environment, signals and speech within an auditory signal, patterns and objects within a video or animation sequence".<ref name=":0">{{cite patent|country=US|number=2005/0114705 A1|status=|title=Method and system for discriminating a human action from a computerized action|pubdate=26 May 2005|gdate=|pridate=11 December 1997|fdate=1 March 2004|inventor1-first=Eran|inventor1-last=Reshef|inventor2-first=Gil|inventor2-last=Raanan|inventorlink2=Gili Raanan|inventor3-first=Eilon|inventor3-last=Solan|url=https://patentimages.storage.googleapis.com/9c/fc/21/1188d59d94d268/US20050114705A1.pdf}} {{Webarchive|url=https://web.archive.org/web/20190224001924/https://patentimages.storage.googleapis.com/9c/fc/21/1188d59d94d268/US20050114705A1.pdf |date=24 February 2019 }}</ref>
 
== Characteristics ==
 
== Circumvention ==
Two main ways to bypass CAPTCHA include using cheap human labor to recognize them, and using [[machine learning]] to build an automated solver.<ref>{{cite book|last=Jakobsson|first=Markus|title=The death of the Internet|url=http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1118062418.html|access-date=4 April 2016|date=August 2012|archive-date=15 October 2014|archive-url=https://web.archive.org/web/20141015182639/http://eu.wiley.com/WileyCDA/WileyTitle/productCd-1118062418.html|url-status=live}}</ref> According to former Google "''[[click fraud]] czar''" [[Shuman Ghosemajumder]], there are numerous services which solve CAPTCHAs automatically.<ref name=ai-security>{{cite news |last=Ghosemajumder |first=Shuman |title=The Imitation Game: The New Frontline of Security |url=http://www.infoq.com/presentations/ai-security |agency=InfoQ |access-date=8 December 2015 |newspaper=InfoQ |date=8 December 2015 |archive-date=23 March 2019 |archive-url=https://web.archive.org/web/20190323061742/https://www.infoq.com/presentations/ai-security |url-status=live }}</ref>
 
[[File:Modern-captcha.jpg|thumb|An example of a [[reCAPTCHA]] challenge from 2007, containing the words "following finding". The waviness and horizontal stroke were added to increase the difficulty of breaking the CAPTCHA with a computer program.]]
[[File:Captchacat.png|thumb|A CAPTCHA usually has a text box directly underneath where the user should fill out the text that they see. In this case, "sclt ..was here".]]
 
=== Machine learning-based attacks ===
There was no systematic methodology for designing or evaluating early CAPTCHAs.<ref name=bursz /> As a result, many early CAPTCHAs were of a fixed length, so automated tasks could make educated guesses about where segmentation should take place. Other early CAPTCHAs drew on limited sets of words, which made the test much easier to game. Still others{{Example needed|date=October 2022}} made the mistake of relying too heavily on background confusion in the image. In each case, algorithms were created that successfully completed the task by exploiting these design flaws. However, light changes to the CAPTCHA could thwart them. Modern CAPTCHAs like [[reCAPTCHA]] present variations of characters that are collapsed together, making them hard to segment, and they have warded off automated tasks.<ref name=bursz2 />
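
Two of the design flaws described above can be sketched in a few lines. The example below is illustrative only and assumes a hypothetical character recognizer has already produced a noisy guess. The function names <code>segment_boundaries</code> and <code>snap_to_word_list</code> and the sample word list are assumptions for this sketch, not part of any published attack.

```python
import difflib


def segment_boundaries(image_width, num_chars):
    """Fixed-length flaw: if every challenge has `num_chars` characters,
    equal-width slices are a reasonable first guess at where to cut."""
    step = image_width / num_chars
    return [(round(i * step), round((i + 1) * step)) for i in range(num_chars)]


def snap_to_word_list(noisy_guess, word_list, cutoff=0.6):
    """Limited-vocabulary flaw: an imperfect reading can be corrected by
    picking the most similar word from the small set of known answers.
    Returns None when no word is similar enough."""
    matches = difflib.get_close_matches(noisy_guess, word_list, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical 120-pixel-wide image known to contain exactly 4 characters.
cuts = segment_boundaries(120, 4)

# Hypothetical small vocabulary and a reading with two misrecognized characters.
words = ["following", "finding", "morning"]
corrected = snap_to_word_list("f0llowinq", words)
```

Collapsing characters together, as later CAPTCHAs did, undermines the first function because equal-width cuts no longer fall between characters.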
 
In October 2013, artificial intelligence company [[Vicarious (Company)|Vicarious]] claimed that it had developed a generic CAPTCHA-solving algorithm that was able to solve modern CAPTCHAs with character recognition rates of up to 90%.<ref>{{cite web|last=Summers|first=Nick|title=Vicarious claims its AI software can crack up to 90% of CAPTCHAs offered by Google, Yahoo and PayPal|url=https://thenextweb.com/insider/2013/10/28/vicarious-claims-ai-software-can-now-crack-90-captchas-google-yahoo-paypal/|publisher=TNW|access-date=19 June 2018|archive-date=15 September 2018|archive-url=https://web.archive.org/web/20180915002117/https://thenextweb.com/insider/2013/10/28/vicarious-claims-ai-software-can-now-crack-90-captchas-google-yahoo-paypal/|url-status=live}}</ref> However, [[Luis von Ahn]], a pioneer of early CAPTCHA and founder of reCAPTCHA, said: "It's hard for me to be impressed since I see these every few months." 50 similar claims to that of Vicarious had been made since 2003.<ref>{{cite web|last=Hof|first=Robert|title=AI Startup Vicarious Claims Milestone In Quest To Build A Brain: Cracking CAPTCHA|url=https://www.forbes.com/sites/roberthof/2013/10/28/ai-startup-vicarious-claims-milestone-in-quest-to-build-a-brain-craking-captcha/|work=Forbes|access-date=25 August 2017|archive-date=15 September 2018|archive-url=https://web.archive.org/web/20180915002819/https://www.forbes.com/sites/roberthof/2013/10/28/ai-startup-vicarious-claims-milestone-in-quest-to-build-a-brain-craking-captcha/|url-status=live}}</ref>
Another technique consists of using a script to re-post the target site's CAPTCHA as a CAPTCHA to the attacker's site, which unsuspecting humans visit and solve within a short while for the script to use.<ref>{{cite web|url=http://www.boingboing.net/2004/01/27/solving_and_creating.html |title=Solving and creating captchas with free porn |last=Doctorow |first=Cory |author-link=Cory Doctorow |date=2004-01-27 |work=Boing Boing |archive-url=https://web.archive.org/web/20060209040456/http://www.boingboing.net/2004/01/27/solving_and_creating.html |archive-date=2006-02-09 |access-date=2015-04-27 |url-status=dead }}</ref><ref>{{cite web | url = http://petmail.lothar.com/design.html#auto35 | title = Hire People To Solve CAPTCHA Challenges | access-date = 2015-04-27 | date = 2005-07-21 | work = Petmail Design | archive-date = 18 September 2020 | archive-url = https://web.archive.org/web/20200918050055/http://petmail.lothar.com/design.html#auto35 | url-status = live }}</ref>
 
In 2023, the generative AI chatbot [[ChatGPT]] tricked a [[Taskrabbit|TaskRabbit]] worker into solving a CAPTCHA by telling the worker it was not a robot and had impaired vision.<ref>{{cite web |last1=Hurler |first1=Kevin |title=Chat-GPT Pretended to Be Blind and Tricked a Human Into Solving a CAPTCHA |url=https://gizmodo.com/gpt4-open-ai-chatbot-task-rabbit-chatgpt-1850227471 |website=Gizmodo |access-date=11 April 2023 |archive-date=11 April 2023 |archive-url=https://web.archive.org/web/20230411200745/https://gizmodo.com/gpt4-open-ai-chatbot-task-rabbit-chatgpt-1850227471 |url-status=live }}</ref>
 
=== Outsourcing to paid services ===
* [[Bot prevention]]
* [[Defense strategy (computing)]]
* [[NuCaptcha]]
* [[Proof of personhood]]
* [[Proof of work]]
* [[reCAPTCHA]]
 
{{Sister project links|d=Q484598|mw=CAPTCHA|voy=no|species=no|commons=Category:Captcha|v=no|q=no|n=no|s=no}}
<!--Please, no implementations here – see discussion-->
* {{Curlie|Computers/Internet/Abuse/CAPTCHA/}}
* [http://www.wisdom.weizmann.ac.il/~naor/PAPERS/human_abs.html Verification of a human in the loop, or Identification via the Turing Test], Moni Naor, 1996.
* [http://www.w3.org/TR/turingtest/ Inaccessibility of CAPTCHA: Alternatives to Visual Turing Tests on the Web], a [[W3C]] Working Group Note.
* [https://web.archive.org/web/20170915204258/https://pdfs.semanticscholar.org/692a/31f65e29ea3667de46933245f53bda55a65b.pdf Reverse Engineering CAPTCHAs] Abram Hindle, Michael W. Godfrey, Richard C. Holt, 2009-08-24
 
{{CAPTCHAs}}
{{Authority control}}