Captcha

from Wikipedia, the free encyclopedia

A Captcha [ˈkæptʃə] (also CAPTCHA; English: Completely Automated Public Turing test to tell Computers and Humans Apart) is used to determine whether a human or a machine (a robot program, or bot for short) is operating a service. As a rule, it is used to check who has made entries in Internet forms, because bots are often misused there. Captchas therefore serve a security purpose. In contrast to the classic Turing test, in which a human is the judge, captchas are meant to let a computer clearly distinguish between humans and machines.

The term captcha was first used in 2000 by Luis von Ahn, Manuel Blum and Nicholas J. Hopper of Carnegie Mellon University and John Langford of IBM; it is a homophone of the English word capture. Occasionally, methods for distinguishing between humans and bots are also referred to as HIP (Human Interaction Proof).

Captcha based on the distortion of letters in an image (here: "smwm")
In this captcha, the character string must first be recognized against the background; the task must then be understood and solved by entering "25"
Captcha in which the background makes reading difficult (here: MV52RQ)

Explanation

Captchas are mostly challenge-response tests in which the respondent has to solve a task (challenge) and send back the result (response). Ideally, the tasks are set so that they are easy for humans to solve but very difficult for computers. One example is text that has been distorted by image filters. To process such images, computers need pattern recognition algorithms, which are complex to program and place high demands on the hardware. In addition to image captchas, audio and video captchas are now also used. In Asirra, Microsoft requires the user to recognize animals in photos.
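
The challenge-response exchange described above can be sketched roughly as follows. This is a minimal illustration, not the implementation of any real captcha service: the function names, the in-memory store and the six-character alphabet are assumptions, and the image distortion step is left out entirely.

```python
import secrets
import string

# Hypothetical in-memory store mapping a token to the expected answer.
_pending = {}

# Assumed alphabet: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def issue_challenge(length=6):
    """Create a random challenge and remember its expected answer."""
    answer = "".join(secrets.choice(ALPHABET) for _ in range(length))
    token = secrets.token_urlsafe(16)
    _pending[token] = answer
    # A real service would send only a distorted image of `answer`;
    # it is returned here so the sketch runs without any image code.
    return token, answer


def verify_response(token, response):
    """Check a response; every challenge can be answered only once."""
    expected = _pending.pop(token, None)
    if expected is None:
        return False  # unknown or already used token
    given = response.strip().upper()
    return secrets.compare_digest(expected.encode(), given.encode())
```

Removing the token on the first attempt means that a wrong guess cannot be retried against the same challenge; the client has to request a new one.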

According to their name, captchas have the following properties:

  • Questions and answers are generated fully automatically by a random generator for every access attempt, in compliance with certain rules. No human-compiled catalog of questions and answers is used, because its limited range of values would lead to repetitions much sooner and thus make an attack easier.
  • The algorithm used is published so that experts can assess the security of the system. Captchas thus follow Kerckhoffs' principle and avoid security through obscurity.
Captcha with additional characters in the background
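
Why a fixed catalog repeats so much sooner than generated challenges can be illustrated with a small calculation. The figures below (a catalog of 500 entries, six-character challenges over 36 symbols, 100 requests) are assumptions chosen for illustration and do not come from any real service.

```python
# Assumed, illustrative figures; none of them come from a real service.
CATALOG_SIZE = 500   # hand-written question/answer pairs
ALPHABET_SIZE = 36   # letters A-Z plus digits 0-9
LENGTH = 6           # characters per generated challenge

generated_space = ALPHABET_SIZE ** LENGTH


def repeat_probability(space, draws):
    """Chance that at least two of `draws` uniformly random challenges coincide."""
    p_unique = 1.0
    for i in range(draws):
        p_unique *= (space - i) / space
    return 1.0 - p_unique


print(generated_space)  # 2176782336
print(repeat_probability(CATALOG_SIZE, 100))
print(repeat_probability(generated_space, 100))
```

Under these assumptions, a repetition within 100 requests is almost certain for the catalog and very unlikely for the generated challenges.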

Disadvantages

Image-based puzzles are the most common type; however, they are not barrier-free, because visually impaired people have difficulty seeing them. Various providers therefore also offer audio captchas to increase accessibility. Even so, deafblind users and users of purely text-based browsers remain excluded. For the latter, another method would be suitable: a question posed in plain text, such as "What do you call a motor-driven, four-wheeled vehicle?", to which the answer would be "car". However, such question-and-answer lists would have to be expanded or rewritten constantly, since a spambot could be taught them. They also present a barrier to non-native speakers and people with cognitive impairments.

The subject of accessibility in connection with captchas is also addressed in the Web Content Accessibility Guidelines (WCAG) of the Web Accessibility Initiative (WAI) of the World Wide Web Consortium (W3C), which, among other things, set out the minimum requirements for a low-barrier captcha.

In principle, any hard problem in artificial intelligence can serve as the basis for a captcha. This technical escalation has, however, made captchas increasingly difficult for humans to solve: as early as 2010, researchers at Stanford University showed in a study that captchas are often a major challenge even for human Internet users. Captchas are therefore not a long-term solution.

Captchas are also problematic in terms of usability, because they are hurdles that sometimes cost users considerable additional effort in reaching their goal. In addition, many users do not intuitively understand the purpose of a captcha, which leads to irritation at a seemingly pointless function. These problems often result in abandoned sessions, error messages and dissatisfied users.

Areas of application

Possible areas of application are services that bots can manipulate or misuse, such as online surveys, guest books and the registration of e-mail addresses (from which spam can then be sent), as well as spam avoidance by concealing an e-mail address. Captchas are also used in conjunction with an indexed TAN list to protect against man-in-the-middle attacks in online banking applications.

Example: reCAPTCHA

With its reCAPTCHA service, Google Inc. derives a secondary benefit from the solving of captchas for its Google Books project. Text passages that could not be recognized automatically in the Google Books scanning project are shown to Internet users as captchas, one word at a time. A real captcha word is displayed next to the scanned word, and the user transcribes both. The real captcha word is used for the actual security check, while the user's reading of the scanned word is stored. If the scanned word is transcribed identically several times in different captcha sessions, that reading is saved as correct and fed into the Google Books scanning project.
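
The two-word procedure can be sketched as a simple voting scheme. This is only an illustration of the idea described above, not Google's actual implementation: the class name, the method signature and the agreement threshold of three identical readings are assumptions.

```python
from collections import Counter, defaultdict

# Assumed value; the text only says "several times".
AGREEMENT_THRESHOLD = 3


class WordVoting:
    """Hypothetical bookkeeping for control words and scanned words."""

    def __init__(self):
        self.votes = defaultdict(Counter)  # scan_id -> readings and their counts
        self.accepted = {}                 # scan_id -> accepted reading

    def submit(self, control_word, control_answer, scan_id, scan_answer):
        """Return True if the user passed the security check."""
        if control_answer.strip().lower() != control_word.lower():
            # Control word wrong: reject and ignore the scanned-word reading.
            return False
        if scan_id not in self.accepted:
            reading = scan_answer.strip().lower()
            self.votes[scan_id][reading] += 1
            if self.votes[scan_id][reading] >= AGREEMENT_THRESHOLD:
                self.accepted[scan_id] = reading
        return True
```

Only the control word decides whether the user passes; the reading of the scanned word is counted solely for users who got the control word right.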

Circumventing captchas

Solving by machines

As captcha-protected websites became more widespread, an arms race began between the makers of captchas and the developers of automated solvers, and programs to bypass captcha protection soon appeared. Many older implementations can now be solved by machines with relatively little effort. For widespread implementations, such as the one used in phpBB (software for running an Internet forum), there are already spambots that can read the captchas and thus circumvent the protection. Another example is spammers circumventing captchas when automatically creating Gmail accounts, with a recognition rate of 20 to 30 percent. The original version of the SchülerVZ captchas could also be circumvented, which made mass copying of profile data possible: an automated bot, which was merely a simple combination of a Perl script and Ajax programming, solved 70% of the captchas on average. As one experiment shows, there are algorithms that solve captchas, for example those of Google's widely used reCAPTCHA, correctly well over 90% of the time and thus achieve a higher recognition rate than humans.
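
A rough calculation shows why even a recognition rate of 20 to 30 percent counts as circumvention. It assumes, as a simplification not made in the text above, that attempts are independent and that each one succeeds with the same probability p; a bot then needs 1/p attempts on average per solved captcha.

```python
def expected_attempts(success_rate):
    """Mean number of independent attempts until the first success (1/p)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in the interval (0, 1]")
    return 1 / success_rate


# Recognition rates taken from the figures quoted in the text.
for rate in (0.2, 0.3, 0.7, 0.9):
    attempts = expected_attempts(rate)
    print(f"{rate:.0%} recognition rate: about {attempts:.1f} attempts per solved captcha")
```

Since an automated attempt costs a bot almost nothing, a handful of attempts per success is no real obstacle.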

Solving by humans

Unknowingly

A technically simple way of circumventing a captcha mechanism is to delegate the recognition task to unwitting users. For example, a spammer sets up a honeypot site on which visitors solve captchas that were taken from the spammer's target site. In 2007, Trend Micro and Panda Security uncovered a trojan that disguised captchas as an interactive striptease game.

Knowingly

At the end of the first decade of the 21st century, groups collaborated on so-called captcha exchange servers to solve captchas for one another. Because of technical developments and the rise of paid services, the idea hardly caught on. In developing countries, some providers have captchas solved in sweatshops.
