reCAPTCHA

from Wikipedia, the free encyclopedia

reCAPTCHA is a CAPTCHA service that Google LLC has operated since 2009. It tries to determine whether a given action on the Internet is carried out by a person or by a computer program (bot). This fully automated public test is thus similar to the Turing test. reCAPTCHA is also used to digitize books and magazines, as well as house numbers and street names from Google Street View.

History

According to an estimate by Carnegie Mellon University, Internet users worldwide spend 150,000 hours a day solving CAPTCHAs. Because this work is performed regularly and free of charge, the idea arose of putting it to meaningful use. The computer scientist Luis von Ahn, who was significantly involved in the invention of the CAPTCHA process in 2000, developed a system called reCAPTCHA from this idea in 2007. It takes words that were scanned during book digitization but that the text recognition software could not recognize, and has them deciphered by users solving CAPTCHAs. The system initially took its words from a section of the Internet Archive that deals with the digitization of books. The service also helped digitize the archive of the New York Times, comprising all 130 available volumes: within a few months of the start of this project in 2009, 20 volumes had already been digitized.

On September 16, 2009 it was announced that Google had bought the company reCAPTCHA. Google benefits from this because digitizing books and other printed matter is part of its field of activity. In March 2012 it was confirmed that Google now also uses reCAPTCHA to have house numbers from Google Street View recognized in order to improve the database for Google Maps. Since around October 2015, street signs have increasingly been displayed as well; the recognized street names are likewise used to improve Street View. Sometimes only these street signs and house numbers are displayed for identification, and no longer excerpts from scanned books.

Function and use

Two words are shown on each CAPTCHA: one is already known and confirmed by the system, the other is an unrecognized word from a digitization project.

By solving this CAPTCHA, the user contributes unpaid work to the reCAPTCHA character recognition project (see crowdsourcing). To pass the CAPTCHA, however, it is sufficient to solve the actual test word and ignore the crowdsourcing task, that is, to leave out the word from the digitized copy, which is usually much more legible. There are plugins for integration into popular web applications such as Lifetype, WordPress, TYPO3, Drupal, vBulletin, phpBB, Joomla or MediaWiki. Many millions of people thus contribute their work to the project without knowing its exact purpose.

Whether a user's input is correct can be determined statistically: the word pair is presented to several users within a very short period of time, and the most frequent input is assumed to be correct.
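
The two-word scheme and the statistical check can be illustrated with a minimal Python sketch. It is not Google's implementation: the function names, the case-insensitive comparison and the `min_votes` threshold are assumptions made for this example. The source only states that the known word decides whether the test is passed and that the most frequent input for the unknown word is assumed to be correct.

```python
from collections import Counter
from typing import List, Optional


def passes_captcha(control_answer: str, known_control_word: str) -> bool:
    """The test is passed if the already-known (control) word is typed correctly.

    Ignoring case and surrounding whitespace is an assumption of this sketch.
    """
    return control_answer.strip().lower() == known_control_word.strip().lower()


def consensus_transcription(answers: List[str], min_votes: int = 3) -> Optional[str]:
    """Return the most frequent input for the unknown word.

    `min_votes` is a hypothetical safeguard: with fewer matching inputs
    the word is treated as still unresolved and None is returned.
    """
    normalized = [a.strip().lower() for a in answers if a.strip()]
    if not normalized:
        return None
    word, votes = Counter(normalized).most_common(1)[0]
    return word if votes >= min_votes else None
```

For instance, `consensus_transcription(["upon", "upon", "up0n", "Upon "])` returns `"upon"`, because three of the four normalized inputs agree.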

No CAPTCHA reCAPTCHA

In 2013 reCAPTCHA began to use behavioral analysis in its CAPTCHAs. Among other things, the user's interactions with the browser are examined in order to calculate the probability that the user is a person. If the user is identified as a person with a high degree of probability, a simple checkbox labeled “I'm not a robot” is presented, which must be confirmed with a mouse click. In cases where the visitor cannot be identified as human with sufficient certainty, a CAPTCHA that is “significantly more difficult” than older versions is displayed. In late 2014, Google started using the new mechanism in most of its publicly available services.
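
The decision described above can be sketched in a few lines of Python. The function name, the return values and the threshold of 0.9 are purely hypothetical; how the probability is actually computed, and where the cut-off lies, is not stated in the source.

```python
def choose_challenge(human_probability: float, threshold: float = 0.9) -> str:
    """Pick the challenge type from an estimated probability that the user is human.

    Both the threshold and the labels "checkbox" / "hard_captcha" are
    assumptions made for this illustration.
    """
    if not 0.0 <= human_probability <= 1.0:
        raise ValueError("human_probability must lie between 0 and 1")
    # High confidence: the simple "I'm not a robot" checkbox is enough.
    # Otherwise: fall back to a more difficult CAPTCHA.
    return "checkbox" if human_probability >= threshold else "hard_captcha"
```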

Privacy

Whenever this technology is used, personal data (IP address, access location and time) are forwarded to Google. As a rule, Google holds further data about the user at the same time, owing to the numerous background services that it offers for integration into other websites: Google Maps, Google Analytics, Google AdWords, etc. This enables comprehensive tracking.

Web links

Commons: ReCAPTCHA - collection of images, videos and audio files
