Wednesday, April 16, 2014

Oh, Craptcha

Surely you have at some point run into a Captcha image: the distorted word (or other alphanumeric combination) that's meant to guard against bots gaining access to somewhere. The idea, of course, is that these distortions are no problem for a human to decipher (though I've had occasions where I've had to refresh quite a few times before I got something readable enough to confidently punch in), while a computer has trouble reading anything even slightly altered. A bot could thus be stopped just by putting a little bend in the letters and running a line through them, or by using a word photocopied out of a book or a house number from Google Street View, or something like that. In the process, as people type in what the book or the house number says, they actually help computers record the contents of those books or images.

There's just one small issue with this, which Google has just realized: if you tell a computer what enough distorted images say, eventually the computer becomes able to read them itself, and then the game is up. Google has created a program that has shown itself capable of 90% accuracy when presented with a Street View house number, and 99.8% accuracy when given "the hardest category" (PDF) of distorted text. This opens up a big ol' security flaw, and Lord knows we've heard about enough of online security risks lately, what with Heartbleed and all.

This is not to say Captcha is going away. Google believes it can patch up the way it works so as to better thwart a bot. To a degree, one would think that comes down to looking at the images the program did miss and making more like those. But because the success rate is just so high, product manager Vinay Shet is also hinting that Captcha may become less text-based altogether in the future. Perhaps audio makes more of an appearance.

Which will work fine, until Siri learns to actually understand human speech. Then the mental arms race will begin again.
