The program distorts a known word, which gives it a way to check that the user is human, and pairs it with a word that OCR has failed to decipher. Each mystery word is served up in multiple reCAPTCHAs until a consensus about the correct answer emerges. Sometimes a single user confirms the computer’s best guess, but the average is about four users per word. The system is now correcting over 10 million words a day, with 99.1 percent accuracy, von Ahn says.
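The pairing-and-consensus scheme described above can be sketched in a few lines of Python. This is only an illustration, not reCAPTCHA's actual code: the names `MysteryWord`, `record_response` and `consensus`, the three-vote threshold, and the head start given to the OCR guess are all assumptions made for the example. The article says only that a consensus must emerge.

```python
from collections import Counter

# Assumed threshold: the article does not say how much agreement counts
# as a "consensus"; three matching votes is a placeholder.
CONSENSUS_VOTES = 3


class MysteryWord:
    """A word OCR could not read, plus the transcriptions collected so far."""

    def __init__(self, ocr_guess=None):
        self.votes = Counter()
        if ocr_guess:
            # Assumption: the OCR's best guess gets a head start, so a single
            # human who agrees with it is enough to settle the word.
            self.votes[ocr_guess.lower()] = CONSENSUS_VOTES - 1


def record_response(word, control_answer, typed_control, typed_unknown):
    """Check the known word; count the mystery-word vote only if it passes."""
    if typed_control.strip().lower() != control_answer.lower():
        return False  # failed the human check, so discard the transcription
    word.votes[typed_unknown.strip().lower()] += 1
    return True


def consensus(word):
    """Return the agreed transcription, or None if no consensus yet."""
    if not word.votes:
        return None
    answer, count = word.votes.most_common(1)[0]
    return answer if count >= CONSENSUS_VOTES else None
```

In this sketch a mystery word created with an OCR guess is resolved as soon as one user who passes the control word types the same thing. A word with no usable guess stays in circulation until three users agree.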
As innovative as the system is, von Ahn wasn’t the first to harness the power of personal computers around the world. Since the late 1990s, scientists have been recruiting people to download special screen savers that devote spare computing power to projects ranging from the search for extraterrestrial life to climate modelling. The difference with reCAPTCHA is that humans are doing the computing, without necessarily realizing it.
Still, the service is supplied free to any website that wants it, and in addition to helping decipher books scanned for the Internet Archive, reCAPTCHA has been recruited to assist in the digitization of the entire archive of the New York Times back to 1851, which should be completed this year and posted on the Times website. The pursuit of such public goods, von Ahn hopes, will deflect any resentment from the human scanners whose brain cycles he is capturing. “We could do other things, like digitizing cheques,” he notes. “But banks already make enough money.”












