In Search Of The Perfect CAPTCHA


CAPTCHAs, or Completely Automated Public Turing Tests to Tell Computers and Humans Apart, exist to ensure that user input has not been generated by a computer. These peculiar puzzles are commonly used on the web to protect registration and comment forms from spam. To be honest, I have mixed feelings about CAPTCHAs. They have annoyed me on many occasions, but I’ve also implemented them as quick fixes on websites.

This article follows the search for the perfect solution to the problem of increasing amounts of human-generated spam. We’ll look at how and why CAPTCHAs are used and their effect on usability in order to answer key questions: what is the perfect CAPTCHA, and are they even desirable?

The Incentive To Act Human

To understand the need for CAPTCHAs, we should understand spammers’ incentives for creating and using automated input systems. For the sake of this article, we’ll think of spam as any unwarranted interaction or input on a website that runs counter to the website’s purpose, whether it is malicious or simply benefits the spammer. Incentives to spam include:

  • Advertising on a massive scale;
  • Manipulating online voting systems;
  • Destabilizing a critical human equilibrium (i.e. creating an unfair advantage);
  • Vandalizing or destroying the integrity of a website;
  • Creating unnatural, unethical links to boost search engine rankings;
  • Accessing private information;
  • Spreading malicious code.

All of these incentives lead to profitable or otherwise gainful situations for spammers. Automating the process obviously allows for superhuman speed and efficiency.

Those who run websites know that this is a big business and a big problem. Akismet1, the popular spam killer (commonly seen as a WordPress plug-in2), catches over 18 million spam comments per day and has caught more than 20 billion in its history. Mollom3, which provides a similar service, catches over half a million spam comments per day and estimates that more than 90% of all messages are spam.

No amount of asking nicely will stop the spammers, but their greed can be used against them; using automated systems to increase profit does have a weakness.

Enter the CAPTCHA

On one side of the coin is the spammer; on the other is the humble website owner, a pleasant sort, who experiences common problems:

  • Blogs and forums that sink under the weight of spam posts,
  • Accounts that are registered under false pretences for unlawful purposes,
  • Bots that ruin the dynamics of a website,
  • A dive in the quality of content and the user experience.

Automated spam plagues website owners to no end, so CAPTCHAs are appealing and compelling… initially. The time needed to moderate and review user-generated content versus the time needed to implement a CAPTCHA is what pushes most developers to do it.

In fact, CAPTCHAs are used a lot. The reCAPTCHA4 project estimates that over 200 million reCAPTCHAs are completed daily, and each takes an average of 10 seconds to complete. The Drupal CAPTCHA project5 logs close to 100 thousand uses per week, and this is just a fraction of websites (those that choose to report back).
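To put the reCAPTCHA figures in perspective, here is a back-of-the-envelope calculation. It uses only the two numbers quoted above and assumes, for simplicity, that every completion takes exactly the 10-second average.

```python
# Rough estimate of the human time spent on reCAPTCHAs each day,
# based solely on the two figures quoted in the text.
COMPLETIONS_PER_DAY = 200_000_000   # "over 200 million" per day
SECONDS_PER_COMPLETION = 10         # quoted average

total_seconds = COMPLETIONS_PER_DAY * SECONDS_PER_COMPLETION
total_hours = total_seconds / 3600
total_years = total_hours / 24 / 365

print(f"{total_hours:,.0f} hours per day")                    # 555,556 hours per day
print(f"roughly {total_years:.0f} years of attention daily")  # roughly 63 years
```

In other words, on these numbers, users collectively spend more than half a million hours a day proving that they are human.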

CAPTCHAs tackle a problem head-on: they focus purely on stopping spammers. Genuine users are, for the most part, overlooked. That is to say, an assumption is made that the normal behavior of users is not affected.

It’s not true, though. The issue of genuine usability is not new. The W3C released a report back in 2005 on the inaccessibility of CAPTCHAs6, which suggested that some systems can be defeated with up to 90% accuracy. More recently (in 2009), Casey Henry looked at the effect of CAPTCHAs on conversion rates and suggested a possible conversion loss of just over 3%:

“Given the fact that many clients count on conversions to make money, not receiving 3.2% of those conversions could put a dent in sales. Personally, I would rather sort through a few spam conversions instead of losing out on possible income.”

— Casey Henry, CAPTCHAs’ Effect on Conversion Rates7

In 2010, a team from Stanford University released a report entitled “How Good Are Humans at Solving CAPTCHAs? A Large Scale Evaluation8” (PDF), which evaluates CAPTCHAs on the Internet’s biggest websites. Unsurprisingly, the results weren’t favourable, the most astounding being an average of 28.4 seconds to complete audio CAPTCHAs. The study also highlighted worrisome issues for non-native English speakers.

Web developer Tim Kadlec has called for death to CAPTCHAs9, and he makes a strong argument against their use:

“Spam is not the user’s problem; it is the problem of the business that is providing the website. It is arrogant and lazy to try and push the problem onto a website’s visitors.”

— Tim Kadlec, Death To CAPTCHAs10

Completing a CAPTCHA may seem like a trivial task, but studies (like that of the W3C) have shown that that’s far from the reality. And as Kadlec mentions later in his article, what about users with visual impairments, dyslexia and other special needs? Providing an inaccessible wall doesn’t seem fair. Users are the ones who invest in and give purpose to websites.

The question is, are CAPTCHAs so unusable that they shouldn’t be used at all? Perhaps more importantly, does a usable CAPTCHA that cannot be cracked exist? If the answer is no, what is the real solution to online spam?

The World Of CAPTCHAs

The human brain is an amazing piece of work. Its ability to conceptualize, to find order in chaos and to adapt under extraordinary circumstances makes it highly useful, to say the least. For some tasks, it outshines a computer with great ease. In other tasks — mathematics, for example — it is laughably inferior.

Logic would dictate, therefore, that the most successful CAPTCHA would be:

  • A task that users excel at naturally but that computers can’t begin to comprehend,
  • A task that is incredibly quick for users to perform but arduous for computers,
  • A task that minimizes the need for additional user input,
  • A task that is relatively accessible to all users, even those with special needs (that is, the CAPTCHA should be no more difficult than general web usage and the current task demand).

One of the greatest advantages that humans have over machines is our ability to visually recognize patterns. The most popular CAPTCHA technique derives from this.

Web developers have explored many options: simple recognition tests, interactive tasks, games of Tic Tac Toe and equations11 that even mathematicians would have struggled with. We’ll explore the more sensible ideas being implemented online today.

Text Recognition

The most popular type of CAPTCHA currently used is text recognition (as seen with the reCAPTCHA12 project).

The reCAPTCHA project aims to stop spam and help digitize books.

reCAPTCHA was created at Carnegie Mellon University, home to the CAPTCHA pioneers and (in 2000) coiners of the term. Now run by Google, the project uses scanned text that optical character recognition (OCR) technology has failed to interpret. This, in theory, provides unbreakable CAPTCHAs, with the secondary benefit of helping to digitize books.

reCAPTCHA’s example of OCR mistakes14
reCAPTCHA’s example of failed OCR scanning.

Concerns about accessibility and usability are often voiced with regard to this type of CAPTCHA. Completely illegible CAPTCHAs are common on the Web, and asking users to perform impossible tasks cannot be good for usability.

The reCAPTCHA project does make efforts to provide audio alternatives for visually impaired users, but many more text-recognition CAPTCHAs are being used without aids. As noted in the Stanford University study, audio CAPTCHAs take a long time to complete. The same study also highlighted an undesirable reliance on recognition of English-language words.

Another take on the basic text CAPTCHA was introduced15 in late 2010 by Solve Media, whose solution was to replace text with an advertisement and a related question, a move that many saw as too invasive.

Solve Media CAPTCHA16

Solve Media claims its CAPTCHAs can be solved more quickly than others. While we should be skeptical of marketing talk, there is potential here for at least a marginal improvement, given that many global brands transcend a single language.

While text-recognition CAPTCHAs have a few downsides (e.g. spammers could use software that recognizes text embedded in the image and tries all possible combinations to “break” the anti-spam mechanism), they are undoubtedly recognizable. This fact alone can go a long way in usability decisions.

Logic Questions

Some have suggested that answering simple logic questions would be better than performing visual tasks, the idea being that the complexity of written language would be enough to confuse computers.

The TextCAPTCHA17 service has over 180 million questions in its database, including:

  • The 6th letter in “unrolled” is?
  • What is fifty-eight thousand, five hundred and seventy-four as digits?
  • Which of 3, twenty-nine, 70, 46 or 65 is the lowest?

These CAPTCHA questions are designed for the intelligence of a seven-year-old child. They are far more accessible than text and image recognition, and while this is a big advantage, it comes with a price. First, the time required to read and comprehend these questions will vary because they are unusual and unknown to users. Secondly, computers can still break these CAPTCHAs. Joel Vanhorn18 points to Wolfram Alpha as an intelligence source strong enough to crack them.
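To illustrate why such questions offer limited protection, here is a minimal sketch of a script that answers two of the sample questions quoted above. It is not how Wolfram Alpha or any real attack works; the function names, the regular expressions and the tiny number-word table are all assumptions made for this example.

```python
import re
from typing import Optional

# Number words needed for the sample question only; a real solver
# would need a much fuller vocabulary (hypothetical table).
NUMBER_WORDS = {"twenty-nine": 29}


def nth_letter(question: str) -> Optional[str]:
    """Answer questions shaped like: The 6th letter in "unrolled" is?"""
    match = re.search(
        r'The (\d+)(?:st|nd|rd|th) letter in ["“](\w+)["”]', question
    )
    if not match:
        return None
    position, word = int(match.group(1)), match.group(2)
    return word[position - 1]  # positions in the question are 1-based


def lowest(question: str) -> Optional[int]:
    """Answer questions shaped like: Which of 3, twenty-nine, 70 or 65 is the lowest?"""
    match = re.search(r"Which of (.+) is the lowest", question)
    if not match:
        return None
    tokens = re.split(r",\s*|\s+or\s+", match.group(1))
    values = [NUMBER_WORDS[t] if t in NUMBER_WORDS else int(t) for t in tokens]
    return min(values)


print(nth_letter("The 6th letter in “unrolled” is?"))                  # l
print(lowest("Which of 3, twenty-nine, 70, 46 or 65 is the lowest?"))  # 3
```

A database of 180 million questions sounds large, but if those questions are generated from a limited number of templates, a spammer only needs one small function per template.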

With the likes of IBM’s Watson19 recently showcasing an eerily human-like ability to process language, such technology might become mainstream quicker than we think. Instead of worrying about logic questions becoming solvable by computers, we should use this technology to analyze user-submitted content and then separate natural language from the computer-generated content that is common to spam. Services like SBlam!20 are implementing this idea.

Questions that are website-specific, such as “What is the name of this website?” and “What is the dominant color in the image above?”, might be better than general questions. The downside, of course, is that the pool of pointed questions is very small compared to the 180 million possibilities of TextCAPTCHA.

The biggest problem with logic questions is that they’re specific to a language, usually English. Providing millions of questions in every language in order to avoid alienating potential users would be a huge task. When presented with such a daunting prospect, the same question resurfaces: are CAPTCHAs the right solution?

Image Recognition

Many have experimented with photography instead of text. The benefit? No legibility issues. Services like identiPIC21 ask users to identify the object in an image. Microsoft has also researched this method through its Asirra project22.

Microsoft Asirra Project23
Microsoft’s Asirra project on image recognition.

The fact that we haven’t seen widespread adoption of image-recognition CAPTCHAs suggests that they don’t improve usability. In fact, they jeopardize accessibility. Visually impaired users have no chance of passing this type of CAPTCHA, and including a description or alternative text would weaken the tests.

In 2009, Google published research24 (by a team led by Rich Gossweiler, Maryam Kamvar and Shumeet Baluja) that looked at an alternative form of image CAPTCHA. The project asked users to correct the orientation of images by rotating them.

Photo rotation CAPTCHA - Google Research

A novel idea, I’m sure you’ll agree, and the research showed a preference for the ease and simplicity of this technique. Sadly, it fails the accessibility requirement (think again of the visually impaired).

Friend Recognition

One of the more interesting CAPTCHA ideas appeared in January 2011 as a result of an effort by social-networking giant Facebook. The company is currently experimenting with social authentication25 in an effort to verify account authenticity. In the words of the experiment:

“We will show you a few pictures of your friends and ask you to name the person in those photos. Hackers halfway across the world might know your password, but they don’t know who your friends are.”

— Alex Rice, Facebook, A Continued Commitment to Security26

Facebook friend recognition CAPTCHA27
A peek at Facebook’s friend recognition test.

What makes Facebook’s project slightly different from the normal CAPTCHA is that the authentication is supposed to filter out human hackers rather than machines.

There is potential for Facebook to roll this out across the web. With 600 million users and millions of websites that integrate with it, Facebook has the ability to use this social recognition CAPTCHA in a big way — and it could prove to be easier than text recognition (Orwellian privacy concerns aside for the moment).

There is one problem. Do you actually know who your friends are? The reality is that friend requests are exchanged between even the barest of acquaintances; remembering names to go with all those faces could be challenging. As intuitive and intelligent as Facebook’s idea might be, it is ultimately flawed because, as humans, we don’t follow the rules.

User Interaction

One method getting a lot of attention has users perform tasks that are impossible for virtual intelligence. They Make Apps28 features a small slider that must be dragged to the right in order to submit a form. It asks the visitor to “Show your human side; slide the cursor to the end of the line to create your account.”

They Make Apps uses a slider CAPTCHA.

Obviously this option is inaccessible to people with special needs. Furthermore, developing a script that moves the slider automatically to activate the “Submit” button would probably not be that difficult. A more elaborate version of the slider option is used in the comments section of the Adafruit blog30. Four different sliders have to be matched to the corresponding colors in order to validate a comment and activate submission.

The Adafruit blog’s slider CAPTCHA.

An Over-Engineered Solution?

None of the solutions above meets all of the requirements we highlighted for a perfect CAPTCHA. Each of them impairs usability for a large segment of potential users. Even if we went so far as to assume that users generally welcomed traditional text-recognition CAPTCHAs, they would not likely welcome the other alternatives. The extra few seconds the user takes to decipher what is being asked of them negates the benefits. Too slow means not worth it.

Of the solutions available, text recognition (like reCAPTCHA) still feels like the best choice. But the question remains: why are we asking users to perform these tasks? Surely we can beat spammers at their own game by using automated systems to do the work for us. So far we have assumed that a common problem actually exists for CAPTCHAs to solve.

Despite the advances in intelligent computer systems, most spamming mechanisms are stupid. If a submission fails (because of the CAPTCHA or some other reason), the spam bot will move down its list of thousands of websites. Jeff Atwood showed this in his 2006 article “CAPTCHA Effectiveness32.” Despite all the research that goes into CAPTCHA-breaking, most spammers have no incentive to invest effort in defeating them. The sheer quantity of websites available to attack and the speed at which they can be attacked mean that CAPTCHA-breaking is unlikely to concern many spammers.

The BBC is one of the most highly scrutinized institutions in the UK. Its requirements for accessibility are second to none, and its recent examination33 of CAPTCHAs resulted in an emphatic “No”:

“Visually impaired participants expected full accessibility from the BBC and we felt it would affect our reputation to use them. Elderly users had issues with the distorted text. The logic puzzles were found to be odd and patronising. The audio was struggled with. Overall, extremely negative feelings were expressed towards CAPTCHA technology.”

— Rowun Giles, BBC, CAPTCHA and BBC iD34

Alternative solutions exist that prevent automated submissions without resorting to CAPTCHAs and, more importantly, without user interference.

Alternatives To The CAPTCHA

CAPTCHAs, in their purest form, might realize their potential in another field. As website protectors, though, they’re far from ideal. Doing a disservice to users in an effort to combat spam doesn’t cut it on today’s web. Human-powered spam is on the rise (as is unethical link-building), and we should be implementing unobtrusive, invisible methods.

Automated and Manual Spam Detection

We touched on spam detection services earlier in this article. Akismet35, Mollom36 and SBlam!37 all analyze user-submitted data and flag spam automatically. Mollom sometimes presents a CAPTCHA, but only when it’s unsure. But why not develop your own system that is tuned to the mechanics of your website?

Taking responsibility and removing the burden from users will improve their interactions with and impressions of your website. Manually moderating content is often a sacrifice worth making.
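As a rough idea of what a home-grown filter could look like, here is a minimal sketch that scores a comment and holds suspicious ones for manual moderation. The signals (link count, a short word list) and the threshold are illustrative assumptions only; this is not how Akismet, Mollom or SBlam! work, and a real system would be tuned against the spam your own website actually receives.

```python
import re

LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)
# Hypothetical word list; replace it with terms seen in your own spam.
SUSPICIOUS_WORDS = {"casino", "loan", "pills"}


def spam_score(comment: str) -> int:
    """Return a crude score: the higher, the more spam-like the comment."""
    score = 0
    if len(LINK_PATTERN.findall(comment)) >= 3:
        score += 2  # several links in one comment is a common spam trait
    words = set(re.findall(r"[a-z']+", comment.lower()))
    score += len(words & SUSPICIOUS_WORDS)  # one point per suspicious word
    return score


def needs_review(comment: str, threshold: int = 2) -> bool:
    """Hold the comment for manual moderation instead of rejecting it."""
    return spam_score(comment) >= threshold


print(needs_review("Great article, thanks!"))  # False
print(needs_review(
    "Cheap casino loan http://a.example http://b.example http://c.example"
))  # True
```

Holding borderline comments for review, rather than rejecting them outright, keeps the burden on the website owner and not on the genuine user.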

The Honeypot Method

In 2007, Phil Haack suggested38 a clever method of detecting bots: using a honeypot. The idea behind the honeypot method is simple: website forms would include an additional field that is hidden to users. Spam robots process and interact with raw HTML rather than render the source code and therefore would not detect that the field is hidden. If data is inserted into this “honeypot,” the website administrator could be certain that it was not done by a genuine user.
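The server-side half of the honeypot idea can be sketched in a few lines. This is an illustration rather than Haack’s implementation: the field name `website_url` is hypothetical, and the form is assumed to arrive as a simple mapping of field names to values.

```python
from typing import Mapping

# Hypothetical field name: it should look tempting to a bot, while the
# matching input is hidden from people with CSS in the page.
HONEYPOT_FIELD = "website_url"


def looks_like_bot(form: Mapping[str, str]) -> bool:
    """Return True when the hidden honeypot field has been filled in."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


human = {"name": "Ada", "comment": "Nice article", "website_url": ""}
bot = {"name": "x", "comment": "Buy now", "website_url": "http://spam.example"}

print(looks_like_bot(human))  # False
print(looks_like_bot(bot))    # True
```

Because some assistive technologies may still expose a visually hidden field, it is worth giving it a label that tells people to leave it blank.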

The honeypot method can be made more sophisticated by using JavaScript and data hashing. These obfuscation methods are not hack-proof, but we can assume that robots are not sophisticated enough to enter the required information.

JavaScript can be used to fill in hidden fields dynamically, which server-side validation can check for. Scratchmedia39 provides an example of this hidden field solution, along with an alternative CAPTCHA if JavaScript is disabled.
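Here is one way the server side of that check might look. It is a sketch under stated assumptions, not the Scratchmedia method: the server issues a signed nonce, the page’s JavaScript is expected to write the reversed nonce into a hidden field, and the server verifies both. The secret key, the function names and the “reverse the string” task are all made up for illustration.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: never sent to the browser


def _sign(nonce: str) -> str:
    return hmac.new(SECRET_KEY, nonce.encode(), hashlib.sha256).hexdigest()


def issue_challenge() -> Tuple[str, str]:
    """Create a nonce and its signature to embed in the rendered form."""
    nonce = secrets.token_hex(8)
    return nonce, _sign(nonce)


def verify(nonce: str, signature: str, answer: str) -> bool:
    """Check that the nonce is ours and that the script-filled field is correct."""
    signature_ok = hmac.compare_digest(signature.encode(), _sign(nonce).encode())
    # The page's JavaScript is expected to submit the nonce reversed.
    answer_ok = hmac.compare_digest(answer.encode(), nonce[::-1].encode())
    return signature_ok and answer_ok


nonce, signature = issue_challenge()
print(verify(nonce, signature, nonce[::-1]))  # True: the script ran
print(verify(nonce, signature, ""))           # False: hidden field left empty
```

A bot that executes JavaScript would pass this test, so it only raises the bar; it works for the same reason the plain honeypot does, which is that most bots do not bother.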

Additional timestamp and session data checks can also be used to detect automated submissions. A recent discussion40 on Stack Overflow provides many examples and ideas about this, including the implementation of Hashcash41, which is available as a WordPress plug-in42. A jQuery tutorial43 explains a similar method and includes an interesting thought:

“Thieves know to look for stickers, dogs in the yard, lights on the exterior of a home, and other signs of a well-guarded house. They’re looking for high payoff with minimal work and risk.”

— Jack Born, Safer Contact Forms Without CAPTCHAs44

The analogy suggests that, as with CAPTCHAs, the method used does not stop intruders so much as the presence of any hurdle at all. As mentioned, spammers currently have too many targets to bother searching for a back door.
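The Hashcash idea mentioned above makes the hurdle a computational one: the browser must burn a little CPU time before the form is accepted, which costs a single visitor nothing noticeable but adds up for a bot hitting thousands of forms. The sketch below is a simplified proof-of-work in that spirit, not the actual Hashcash stamp format or the WordPress plug-in’s code; the use of SHA-256, the challenge string and the difficulty are assumptions.

```python
import hashlib
from itertools import count

DIFFICULTY_BITS = 12  # assumption: about 4,000 hashes on average to solve


def verify(challenge: str, counter: int, bits: int = DIFFICULTY_BITS) -> bool:
    """Cheap server-side check: the hash must start with `bits` zero bits."""
    digest = hashlib.sha256(f"{challenge}:{counter}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def solve(challenge: str, bits: int = DIFFICULTY_BITS) -> int:
    """Expensive client-side work: try counters until one verifies."""
    for counter in count():
        if verify(challenge, counter, bits):
            return counter


challenge = "comment-form-42"  # hypothetical per-form value issued by the server
stamp = solve(challenge)
print(verify(challenge, stamp))  # True
```

In a real deployment the solving step would run in the visitor’s browser, with the server issuing a fresh challenge for each form and performing only the cheap verification.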

Centralizing the User Base

With the rise of the social web, many websites now allow users to register and interact with one another. Publishing to a third-party website was traditionally done either by registering a full-fledged account or by submitting totally anonymously, both of which leave the gate open to spam. In 2008, Facebook announced45 Facebook Connect, which provides websites and their users with an integrated platform that addresses this and other concerns. Twitter followed suit in 2009 with a similar service (“Sign in with Twitter”). Both of these services can be implemented on websites relatively easily, and they eliminate the need for registration and comment forms, which are accessible to robots.

So many websites offer social-networking integration that services like Janrain46 have popped up. Janrain provides an abstracted umbrella solution to ensure that websites are accessible through any account platform.

Janrain social login at Mahalo.com47
Mahalo48 provides social log-in functionality via Janrain.

Other services, such as the commenting platform Disqus49, allow user interaction with built-in spam detection and user sign-in.

Less anonymity and more accountability make users think twice about the content they submit. It also enables human spammers to be detected and banned quickly across entire websites; remove one Facebook profile and the whole Facebook Connect network is safe from that account owner’s dastardly deeds.

Such services, of course, provoke heated debates about privacy, data protection and the like… but that’s for another article. As alternatives for preventing spam without CAPTCHAs while maintaining usability and accessibility, they have great potential.

Recording User Time Expenditure

Another rather simple method that can be implemented without annoying users is to distinguish between users and bots by measuring the time they take to fill out a contact form or compose a comment. By estimating the average time spent on a comment, one could define certain rules. For example, if a submission takes less than five seconds, which is virtually impossible for a human but just enough time for a bot to do its job, you could ask the user to try again. Jack Born’s tutorial50 on a slight variation of this concept for jQuery is worth a peek, since most users have JavaScript enabled. The whole endeavor is based on one crucial assumption: spammers prefer going after the easiest targets and will leave a website untouched if their initial attempt fails (although this can never be guaranteed).
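A server-side version of this timing check might look like the sketch below. It is not Jack Born’s jQuery approach: here the server embeds a signed timestamp in a hidden field when it renders the form, then compares it with the time of submission. The secret key and function names are assumptions; the five-second threshold is the one suggested above.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: known only to the server
MIN_SECONDS = 5  # threshold suggested in the text


def _sign(text: str) -> str:
    return hmac.new(SECRET_KEY, text.encode(), hashlib.sha256).hexdigest()


def render_token(now: Optional[float] = None) -> str:
    """Value for a hidden field: the render time plus a signature."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}:{_sign(issued)}"


def submitted_too_fast(token: str, now: Optional[float] = None) -> bool:
    """True if the token was tampered with or the form came back too quickly."""
    current = time.time() if now is None else now
    issued, _, signature = token.partition(":")
    if not hmac.compare_digest(signature.encode(), _sign(issued).encode()):
        return True  # forged or missing token: treat as suspicious
    return current - int(issued) < MIN_SECONDS


token = render_token(now=1000)
print(submitted_too_fast(token, now=1002))  # True: two seconds is too quick
print(submitted_too_fast(token, now=1030))  # False: plausible for a human
```

Signing the timestamp matters because a bot could otherwise submit any old value; with the signature, the only way to pass is to actually wait.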

The Perfect CAPTCHA

It would seem evident from years of use and research that CAPTCHAs are far from perfect as a solution. Remove spammers from the equation and we remove the need for CAPTCHAs entirely; this is the mentality we should be aiming for. The perfect CAPTCHA is no CAPTCHA at all.

The Rise of Humans

CAPTCHAs, by nature, function more by blocking spam than by detecting humans (which is their purpose). But they can’t do that when the spammer is not a computer. A better solution would be to remove the incentive to spam altogether. If we can reverse the trend and drive spam from being highly lucrative to being a net loss, then both automated and manual spam will become worthless.

One of the many dark arts of search engine optimization (SEO) is to artificially generate links to the website being “optimized.” Search engines consider inbound links a strong indicator of value. This can be abused, obviously, by posting worthless links on many websites (forums and comment forms are perfect for this). The SEO benefits are so worthwhile that automated spamming isn’t even required. The practice of enlisting cheap human labor is emerging. And CAPTCHAs are not designed to stop this.

We should accept the need for moderation and background detection. CAPTCHAs are a stop-gap solution at best, and are lazy and inaccessible at worst. Whether you choose to fight the good fight or simply put the interests of genuine users first, you have options.

Taking a Stand

If website owners work together to eliminate the incentives to spam, then spam will slowly wear away and the need for CAPTCHAs will eventually disappear. Is that too idealistic? Probably. In reality, we are likely to see a combination of technology and law dealing the death blow to spammers.

Google’s latest algorithm change51 has significantly demoted low-quality content farms (the effects of which are explained by Johannes Beus52). Advances such as this will ultimately remove all incentives to game the system. However, if we website owners don’t evangelize and adopt alternative solutions, then we might just wake up to a world where CAPTCHAs are worthless and our websites are unmanageable.

Understanding the alternatives (whereby spam detection is silent to users) and implementing them on our clients’ websites is a good start. It’s a positive step toward usability and conversion rates (and clients will love that!). If users comment on your website, reward them with a simple experience. This can be done in several ways:

  • Moderation wherever possible
    Disallow certain content to be posted directly to your website, or allow it through maintained and verified account management. Better yet, use a service like Facebook Connect or Disqus; you’ll make things easier for both yourself and users.
  • CAPTCHA alternatives
    Try the honeypot method or another that is invisible to users. Some could potentially be bypassed, but their presence is often enough to thwart automated efforts.
  • Client-side detection
    This can work because, while there are simple workarounds, spammers won’t waste time (for now). Keyboard and mouse interactions can be used to detect genuine user input. This option shouldn’t be used on its own but can add extra assurance.
  • Server-side spam detection
    Developers should focus on server-side spam detection that monitors users and flags unusual activity. Specialist services like Akismet are affordable and proven, but bespoke systems can be tailored to the nuances of your website.
  • Social moderation
    Move toward more sophisticated features that allow this. The simple act of voting content up and down can help to push spam away or flag it for deletion.
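As a small illustration of the last item in the list, social moderation, the sketch below sorts comments by their vote score and sets aside heavily down-voted ones for deletion or review. The `Comment` class and the threshold of -3 are hypothetical choices for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple

FLAG_AT_OR_BELOW = -3  # hypothetical threshold for flagging a comment


@dataclass
class Comment:
    text: str
    up: int = 0
    down: int = 0

    @property
    def score(self) -> int:
        return self.up - self.down


def split_by_votes(comments: List[Comment]) -> Tuple[List[Comment], List[Comment]]:
    """Return (visible, flagged): visible is sorted best-first."""
    flagged = [c for c in comments if c.score <= FLAG_AT_OR_BELOW]
    visible = sorted(
        (c for c in comments if c.score > FLAG_AT_OR_BELOW),
        key=lambda c: c.score,
        reverse=True,
    )
    return visible, flagged


comments = [
    Comment("Buy cheap watches", up=0, down=4),
    Comment("Thanks, ok read", up=1, down=0),
    Comment("Helpful summary", up=5, down=1),
]
visible, flagged = split_by_votes(comments)
print([c.text for c in visible])  # ['Helpful summary', 'Thanks, ok read']
print([c.text for c in flagged])  # ['Buy cheap watches']
```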

It seems clear, considering all the pros and cons of CAPTCHA, that the future lies in a system that is invisible to normal web use. For now, using a CAPTCHA should be your last resort.


We express sincere gratitude to our Twitter followers62 and Facebook fans63 for their support and feedback in helping to prepare this article.


Footnotes

  1. http://akismet.com/about/
  2. http://wordpress.org/extend/plugins/akismet/
  3. http://mollom.com/
  4. http://www.google.com/recaptcha
  5. http://drupal.org/project/usage/captcha
  6. http://www.w3.org/TR/turingtest/
  7. http://www.seomoz.org/blog/captchas-affect-on-conversion-rates
  8. http://www.stanford.edu/~jurafsky/burszstein_2010_captcha.pdf
  9. http://timkadlec.com/2011/01/death-to-captchas/
  10. http://timkadlec.com/2011/01/death-to-captchas/
  11. http://www.flickr.com/photos/ceejayoz/2674227920/
  12. http://www.google.com/recaptcha
  13. http://www.google.com/recaptcha
  14. http://www.google.com/recaptcha/learnmore
  15. http://www.solvemedia.com/index_ss2.html
  16. http://www.solvemedia.com/index_ss2.html
  17. http://textcaptcha.com/
  18. http://joelvanhorn.com/2010/11/10/using-wolframalpha-to-hack-text-captcha/
  19. http://www.engadget.com/2011/01/13/ibms-watson-supercomputer-destroys-all-humans-in-jeopardy-pract/
  20. http://sblam.com/en.html
  21. http://identipic.com/
  22. http://research.microsoft.com/en-us/um/redmond/projects/asirra/
  23. http://research.microsoft.com/en-us/um/redmond/projects/asirra/
  24. http://googleresearch.blogspot.com/2009/04/socially-adjusted-captchas.html
  25. http://blog.facebook.com/blog.php?post=486790652130
  26. http://blog.facebook.com/blog.php?post=486790652130
  27. http://blog.facebook.com/blog.php?post=486790652130
  28. http://theymakeapps.com/users/add
  29. http://theymakeapps.com/users/add
  30. http://www.adafruit.com/blog/2011/03/01/some-adafruit-website-updates/
  31. http://www.adafruit.com/blog/2011/03/01/some-adafruit-website-updates/
  32. http://www.codinghorror.com/blog/2006/10/captcha-effectiveness.html
  33. http://www.bbc.co.uk/blogs/bbcinternet/2010/10/captcha_and_bbc_id.html
  34. http://www.bbc.co.uk/blogs/bbcinternet/2010/10/captcha_and_bbc_id.html
  35. http://akismet.com/about/
  36. http://mollom.com/
  37. http://sblam.com/en.html
  38. http://haacked.com/archive/2007/09/11/honeypot-captcha.aspx
  39. http://www.webdesignfromscratch.com/javascript/human-form-validation-check-trick/
  40. http://stackoverflow.com/questions/4683117/alternative-to-annoying-captcha-in-forms-how-to-smell-the-difference-between-a-h
  41. http://en.wikipedia.org/wiki/Hashcash
  42. http://wordpress-plugins.feifei.us/hashcash/
  43. http://docs.jquery.com/Tutorials:Safer_Contact_Forms_Without_CAPTCHAs
  44. http://docs.jquery.com/Tutorials:Safer_Contact_Forms_Without_CAPTCHAs
  45. http://developers.facebook.com/blog/post/108/
  46. http://www.janrain.com/products/engage
  47. http://www.mahalo.com/login
  48. http://www.mahalo.com
  49. http://disqus.com
  50. http://docs.jquery.com/Tutorials:Safer_Contact_Forms_Without_CAPTCHAs
  51. http://googleblog.blogspot.com/2011/02/finding-more-high-quality-websites-in.html
  52. http://www.sistrix.com/blog/985-google-farmer-update-quest-for-quality.html
  53. http://polldaddy.com/poll/4657952/
  54. http://polldaddy.com/features-surveys/
  55. http://www2.parc.com/istl/projects/captcha/history.htm
  56. http://www.w3.org/TR/turingtest/
  57. http://www.seomoz.org/blog/captchas-affect-on-conversion-rates
  58. http://www.stanford.edu/~jurafsky/burszstein_2010_captcha.pdf
  59. http://www.codinghorror.com/blog/2008/03/captcha-is-dead-long-live-captcha.html
  60. http://www.readwriteweb.com/archives/the_state_of_web_spam_human-posted_spam_is_on_the.php
  61. http://stackoverflow.com/questions/450835/how-do-you-stop-scripters-from-slamming-your-website-hundreds-of-times-a-second
  62. http://twitter.com/#!/search?q=%23smcaptcha
  63. http://www.facebook.com/smashmag/posts/10150112377942490


David Bushell is a website designer and front-end developer working at Browser Creative, London. He blogs regularly at dbushell.com and xheight, and shares inspiration and web design related interests at Design Heroes. You can also follow him on Twitter.

  1. 1

    Daniel Nordstrom

    March 4, 2011 5:54 am

    I personally really hate CAPTCHAs and that was my first thought when I saw the title of this article as it was linked on Twitter. Thinking it was going to be a horrible article about a horribly annoying technique, I was pleasantly surprised.

    There are some neat things and ideas here. Up until now, I’ve liked stuff like the logical questions most since they’re simple and you don’t need to try several times to recognize an unrecognizable piece of messed up text—especially if you’re partially colorblind and nearsighted.

    Out of these, I find the user interaction method and the friend recognition very pleasing. Centralizing user accounts at Google, Facebook, OpenID or similar is simple too, of course.

    Anyway, thanks for a surprisingly interesting article!

    Daniel Nordstrom
    Nintera(ctive)
    b: mrnordstrom.com
    w: nintera.com

  2. 2

    Awesome article. I’ve had to deal with spam quite a lot recently and resorted to word-recognition CAPTCHA tactics. This makes me re-evaluate that decision and look for an alternative. Thanks!

  3. 9

    People still use Captcha? How quaint.

    • 10

      They’re very common, especially on big websites like eBay for example. Smaller, more progressive and web-oriented sites tend to have moved on (though that audience is probably the most CAPTCHA-knowledgeable of all).

    • 11

      Without captchas, how else could we block humans from accessing precious machine-only content?

  4. 12

    Facebook’s solution is pretty pointless/flawed if it’s meant to stop hackers. They’re giving the “hacker” multiple choices. Unless there’s a time limit, you could easily Google those names. In today’s world, it’s extremely easy to find a photo of someone when you know their name; heck, I just Googled someone that I haven’t seen in real life just to find out what they looked like, and it took me 20 seconds max. They’d be better off with some sort of input field for names, with auto-complete for instance, but hey, that’s just me!

    • 13

      Another reason why Facebook’s solution is flawed is because plenty of people tag themselves as stupid things, like food or animals (or their babies). I have no idea which of my friends has tagged themselves as a slice of pizza, for example.

      I had to use this Facebook recognition thing on holiday and it told me I could only get a couple tags wrong or I wouldn’t be let in. Even though I got more wrong than was allowed (see above), it still let me into my account. The whole thing takes ages and is a complete waste of time.

      0
  5. 14

    I think the points raised in this article are very good, but some of the alternatives to CAPTCHAs require a lot of skill on the development side. A designer who is stronger with frontend languages like HTML and CSS will more than likely go for the easiest option, which is a CAPTCHA. I am guilty of doing this in some of my web projects, but after reading this I will try out some of these ideas and test the results.

    0
    • 15

      This is very true. As a designer I used a CAPTCHA on my website for a long time (only just removed it!). The real issue is with very large websites with a huge audience – they should have the budget to implement alternatives if they consider accessibility.

      0
  6. 16

    An informative summary of CAPTCHAs. The fact is that every CAPTCHA has its own advantages and disadvantages.

    0
  7. 17

    The Facebook model is also flawed because profiles have friends listed publicly by default and it also lets you search them.

    So all you would need to do is search the user’s friends until you found the profile photo that matched?

    0
  8. 18

    What about KeyCaptcha? Bots are not able to solve it.

    0
  9. 19

    Great article. I’m not a website owner or designer, just a humble visitor. I, like the BBC folks, hate captchas and never understood the issues. This was a great overview and an appropriately exhortative piece.

    0
  10. 20

    nice article! contains some very interesting thoughts and methods.

    0
  11. 21

    I use a captcha/honeypot solution where an image with text is generated and a field to enter the text but both are invisible for users. I give the image and field captcha named classes to hide them with CSS.
    I’ve had no trouble with people not being able to complete the forms, and I’ve never seen any bot generated content in the databases.

    0
  12. 22

    Vladimir Sobolev

    March 4, 2011 5:24 am

    Best captcha – is NO captcha. Use tokens.

    0
  13. 23

    Great article. One of the worst CAPTCHAs I’ve ever seen is used by Google. If you fail too many times to login to analytics they throw up a CAPTCHA that is next to impossible.

    0
  14. 24

    I believe that in most cases you could allow one comment/login/download per IP address and then use some other method to filter out spammers, since most users only comment every so often.
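
One way to read the one-action-per-IP idea is as a cooldown keyed on the client address. A minimal sketch in Python, where the class name `IpCooldown`, the 60-second window and the in-memory dictionary are all illustrative assumptions rather than anything the commenter specified:

```python
import time


class IpCooldown:
    """Allow one action per IP address within a cooldown window (hypothetical helper)."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock        # injectable so the behaviour can be tested
        self._last_seen = {}      # ip -> time of the last accepted action

    def allow(self, ip):
        """Return True if this IP may act now, False if it is still cooling down."""
        now = self.clock()
        last = self._last_seen.get(ip)
        if last is not None and now - last < self.window_seconds:
            return False          # too soon: hand off to another filter
        self._last_seen[ip] = now
        return True
```

Shared addresses (offices, mobile carriers) would hit the limit together, so a rejected request is probably better routed to a second check than dropped outright.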

    0
  15. 25

    Captcha is bad. Most spambots just look for pages called guestbook or comment + blog. So replacing things like name, email and comment with images could do the trick. I had a guestbook called gbook it got spammed a lot. I renamed it cafe. And no more spam.

    And if that fails, I will do random sums like 3+4 = ?

    The worst thing to do is use cats like rapidshare did. I think it cost them a lot of clients.

    0
  16. 26

    Be careful with the honeypot method! Some users have ‘autofill’ software installed and will populate the hidden field. Text recognition has worked well for years and it’s familiar. I would be slightly confused (and dare I say, put off) if I was filling in a form about mortgages and had to identify a picture of a cat or answer a math question!

    0
    • 27

      If auto fill software fills in hidden fields then that software is broken.

      0
      • 28

        Autofill software is a huge issue when it comes to the honeypot technique. The issue is that in order for the honeypot hidden field to properly work, it needs to be named something that the spam bot is likely to fill in, i.e. email. But autofill software will see email and even though it is a hidden field, will try to complete it. This explains the issue thoroughly:

        http://www.electrictoolbox.com/html-form-honeypots-autofill/

        0
        • 29

          Maybe the solution is to show the user an error message asking them to clear the honeypot field after they have sent the form.

          0
        • 30

          I use a field called ‘hf53bdh9e7j’ and it works just fine. You don’t have to use a name that the bot is likely to fill in. Most of them are too stupid and just fill in every field on the form. In fact, I think you’re better off using an obscure field name so that browsers don’t auto-fill the field. This used to be a problem with Chrome but I no longer have that problem.

          0
        • 31

          This can be solved by dynamically adding autocomplete="off" like this:
          window.onload = function () { document.getElementById("email").setAttribute("autocomplete", "off"); };

          0
  17. 32

    Thanks, interesting read. It strikes me that a well-scripted trio of a honeypot, ‘time to complete’ and a real-time check against a known spam source blacklist service would meet the usability criteria (i.e. no user action is impeded) – that would probably eliminate many human spammers who are copy/pasting links too.
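
The ‘time to complete’ part of that trio could be sketched roughly as follows. This is only an illustration in Python: the names `issue_form_stamp` and `submitted_too_fast`, the secret key and the three-second threshold are assumptions, not details from the article or this comment. The form carries a signed timestamp in a hidden field, and the server rejects anything that comes back too quickly or with a tampered stamp.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: known only to the server
MIN_SECONDS = 3.0                            # assumption: tune per form


def _sign(issued):
    """HMAC-SHA256 signature of the timestamp string."""
    return hmac.new(SECRET_KEY, issued.encode(), hashlib.sha256).hexdigest()


def issue_form_stamp(now=None):
    """Return 'timestamp:signature' to embed in a hidden field when rendering the form."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}:{_sign(issued)}"


def submitted_too_fast(stamp, now=None):
    """True if the stamp is forged or malformed, or the form came back too quickly."""
    issued, _, sig = stamp.partition(":")
    if not hmac.compare_digest(sig.encode(), _sign(issued).encode()):
        return True
    current = time.time() if now is None else now
    return current - int(issued) < MIN_SECONDS
```

Signing the timestamp means a bot cannot simply backdate the hidden field; it has to actually wait, which is the cost this approach relies on.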

    One other thing I wondered about is tracking/detecting mouse movements – a bit like you can for heatmaps. If a regular user’s mouse has shifted on the screen since pageload, the chances of them being a robot are slim! How easily could that be gamed?

    On the Facebook approach – that seems like just another Big Brother-like way to link everyone together, without actually solving the problem.

    I’ve used maths questions before (to avoid language issues), until I realised just how bad many people are at maths!

    0
  18. 33

    Very interesting points! – And it has certainly made me re-think the verification methods I use on my websites.

    But the underlying point that I think we should all take from this article is that we will NEVER have a 100% perfect verification system.

    Sticking with the “Burglar” analogy: people have been living in houses for thousands of years… from castles to caravans! And yet thieves are still able to “break in” to them… despite stickers, alarms, dogs, barbed wire fences, CCTV, laser systems, security guards, biometrics etc… Burglaries still happen daily, bypassing all types of security measures and using all types of technology to do it (just like spammers use spambots).

    You say remove the incentive to “spam”, but this is the same as asking us to remove the valuable items from our homes… Stop putting TVs, games consoles, jewellery, laptops etc. in your homes and the incentive to burgle goes away! – An impractical solution.

    The good news is, that most people are not thieves (or spammers) and the numbers of spam comments are relatively low when compared to the genuine ones. It is a shame we live in a society like this, but that is “The Human way” and it always has been.

    Another interesting point there… CAPTCHAs are designed to verify that you are a human being in order to stop their users from behaving like human beings!

    You’ve got me thinking now! – Thanks for taking the time to write and share this article! – A great read.

    0
    • 34

      Very good point about never being able to remove all the incentives. Perhaps then the punishment and effort required must outweigh the reward – a very difficult challenge without upsetting genuine users! but I think it’s possible. On the analogy, some societies have practically zero crime rates.

      0
  19. 35

    I’ve been using Fassim.com for all my forums and it works great. I was getting too many false positives with all the others, and it doesn’t have a question or annoying letters to fill out on the forum.

    0
  20. 36

    A very interesting read. Thanks, David!

    0
  21. 37

    This article is based on very old information, at least four years out of date.
    It mostly mentions CAPTCHAs that have been cracked and bypassed by spam bots for 1-3 years now,
    and fails to mention up-to-date CAPTCHAs like the one from keycaptcha.com.

    0
    • 38

      keycaptcha is awful, drawn out and negates the whole point of this article. “The perfect CAPTCHA is no CAPTCHA at all.” I personally use the honeypot method and have never had a problem. Great article.

      -1
  22. 39

    Why not just include a pre-filled tick box which users are instructed to untick?

    Obviously if it became a common feature somebody would be able to write a bot to get around it relatively easily, but as noted above most spammers don’t actually bother trying to hack captchas.

    0
    • 40

      We’ve implemented a similar mechanism on all our recent sites plus a timed server side validation: works like a charm

      0
  23. 41

    We used to have a form that received dozens of spam submissions per day. A simple trick was to insert a hidden input field and on server-processing, make sure that field was left empty. Most often, spambots go through and fill out every field in a form not regarding whether they are hidden fields or not. There is no obstacle presented to humans, and it has proved to be 100% effective against spam. This also goes to prove the point made in the article that most spambots are not very intelligent.
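
The server-side half of this trick is tiny. A minimal sketch in Python, assuming the submitted form arrives as a dictionary of strings; the field name `website` and the function name `looks_like_bot` are hypothetical choices for illustration:

```python
HONEYPOT_FIELD = "website"  # hypothetical name; the input is hidden from humans with CSS


def looks_like_bot(form):
    """Treat a submission as spam when the hidden honeypot field has any content."""
    return form.get(HONEYPOT_FIELD, "").strip() != ""
```

A real visitor never sees the field and leaves it empty, so `looks_like_bot({"name": "Ann", "website": ""})` is false, while a bot that fills in every input is caught.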

    0
  24. 42

    Note that detecting non-human users by detecting keyboard/mouse usage or by looking at the speed with which they compose posts both fall into the realm of behavioural biometrics.

    I mention this only to highlight that any other type of behavioural biometric, which can be detected by a computer, can be used to distinguish between human and non-human users.

    An interesting approach would be to take the method you have described for scrutinising the speed at which posts are composed and extrapolate it onto the entire registration process, measuring various behavioural biometrics.

    These measurements could be used to create a general behavioural profile, reflecting the typical human user for a given registration process. As more and more users register, the system would, over time, be automatically trained to reliably flag potential non-human users.
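
As a very rough illustration of such a self-training profile, the sketch below tracks a single measurement (seconds taken to complete registration) and flags values far from what earlier users produced. The class name `TimingProfile`, the minimum sample count and the three-standard-deviation threshold are all assumptions made for this example; a real system would combine many more signals.

```python
import statistics


class TimingProfile:
    """Learn typical completion times and flag outliers (hypothetical sketch)."""

    def __init__(self, min_samples=30, threshold=3.0):
        self.min_samples = min_samples  # do not judge anyone until enough data exists
        self.threshold = threshold      # allowed distance from the mean, in std devs
        self.samples = []

    def record(self, seconds):
        """Add the completion time of a registration believed to be genuine."""
        self.samples.append(seconds)

    def is_suspicious(self, seconds):
        """True if the time lies far outside the range seen so far."""
        if len(self.samples) < self.min_samples:
            return False
        mean = statistics.mean(self.samples)
        spread = statistics.stdev(self.samples)
        if spread == 0:
            return seconds != mean
        return abs(seconds - mean) / spread > self.threshold
```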

    0
    • 43

      This definitely needs exploring and is the type of automated moderation we need. Is there any software that does this yet?

      0
      • 44

        Not that I am aware – behavioural biometrics are predominantly used for authorisation and authentication but I don’t see why it can’t be applied to unobtrusive SPAM protection.

        0
        • 45

          What? “. . . which can be detected by a computer, can be used to distinguish between human and non-human users.”

          If a computer can measure or detect it, then a “bad computer” can also fake it. There are no silver bullets here. A good combination that evolves over time is only going to help reduce the problem, but never completely eliminate it.

          0
          • 46

            For sure a bad computer may fake its typing speed, but spamming will no longer be lucrative if a spambot needs to pause between autofill and submit.

            That is what makes this approach so brilliant.

            0
  25. 47

    Nicely written, but I fear your research work fell short. Actually, I came across a pretty impressive captcha example a few days ago. I think it was the simplest and most reliable so far. It works as follows:

    It includes an instruction, “Drag the flower on this circle”, and there are various images next to the circle: dogs, a house, etc.

    I think it’s an excellent human identifier. Too bad I forgot where I saw this. It was a file hosting website.

    0
    • 48

      Guilherme Ventura

      March 4, 2011 8:11 am

      Even if it is easy to use, it is still information that the user should not have to give, since it has no relation to the purpose of the form itself.

      0
    • 49

      That would not be accessible and it presumably relies on JS, so it’s a non-starter. I would rather receive spam than shut out some legitimate users.

      0
    • 50

      I was going to post this too. On another design site this functionality was mentioned last month, I just can’t remember where.

      0
    • 51

      jquery has a plugin just like that….

      0
  26. 52

    For non-human spam, drop-in solutions like reCAPTCHA are actually broken by design. As soon as a critical mass is using an anti-spam technique, spammers find a way to break it or work around it.

    I have found the honeypot technique to be the most effective for contact forms, but it will only be a matter of time before it’s used on enough websites that spam programs will start checking stylesheets to see if fields are hidden.

    The only suitable method in my eyes is to *be unique*. Create a solution that is not on another site, and which is tailored for your own website. If your website is about cars, use a simple car question. Even if it’s only one question that’s always the same, spammers won’t go programming the answer into their spambot.

    Unfortunately, as the article states human spammers are on the rise so no CAPTCHA will ever work. Akismet et al are going to be the only solutions from here on.

    0
  27. 53

    Guilherme Ventura

    March 4, 2011 8:06 am

    Great Article!

    Personally, I hate CAPTCHAs. I think they are a great hindrance and break usability… I defend my forms using the honeypot method; it’s a simple, efficient and fast way to stop spammers.

    0
  28. 55

    Re the behavioural biometrics approach – I guess another i/o source is a webcam, an increasingly common component in laptops especially. “Wave at the webcam” could give rapid feedback to prove you’re human – motion detection software is much more readily available than the stuff which is being designed to identify exact individuals. Granting temporary access to your webcam is something users (particularly in some, er, industries) are already familiar with – think Skype and other video chat. Using motion rather than face detection is less intrusive, potentially quick and language-independent. A couple of seconds to authenticate is a blip in the ocean compared to 15 minutes of YouTube, so the bandwidth cost could still pay off.

    Oh, and I guess the other i/o is voice – given the rise of the mobile web and the smartphone this is probably more appealing in some ways.

    Ultimately all these intrude, so it’s balancing the value of having a ‘noise-free conversation’ versus the cost/delay of authentication.

    0
    • 56

      But most desktop PCs don’t have a webcam, and even if your netbook has one, it could be broken.

      0
      • 57

        Reminds me of the way Microsoft’s Kinect console allows users to login by waving at the console! If we’re talking very long-term, I think biometrics have a strong future.

        0
  29. 59

    I still trust reCAPTCHA, as it is used globally.

    0
  30. 60

    I use a CAPTCHA on my web design projects all the time, never really thought of using an alternative. But reading how users can get pissed off by them makes me think now… an alternative might be the answer. Great post.

    0
  31. 61

    Thanks for mentioning my article about Wolfram Alpha breaking TextCAPTCHA. You’ve compiled a nice and comprehensive comparison of CAPTCHA variations, each of which have their advantages and flaws. I’m working with a startup called ShareThink on a new CAPTCHA implementation we hope will be more user friendly and harder to crack. More on that soon.

    On a side note, I just saw the “hardest CAPTCHA ever” posted to Hacker News….

    http://random.irb.hr/signup.php

    0
  32. 62

    Personally I totally hate ‘reCaptcha’ which nowadays are hardly even readable by a human, as a web designer I have a slight ‘advantage’ in that I am very impatient online so if it works for me the chances are a lot of people will be happy too. Example: I’d rather just delete my post or whatever than have to poke my eyes out trying to read reCaptcha rubbish.

    As you pointed out in your article there are better ways to stop the scumbag spammers so why do people always resort to the horrible reCaptcha solution?

    0
    • 63

      I’ve implemented reCAPTCHA before myself; it just seemed easy and insignificant at the time. Some of the alternatives require a fair amount of programming knowledge that most web designers don’t have, but I think it’s mainly that people only care about stopping spam quickly and haven’t considered much else.

      0
  33. 64

    I really don’t understand the fuss or necessity for all these horrid methods of confirming an entry is from a human. Ever since I found spam entries being a problem with online forms I added a simple and effective solution. I ask the question – “what colour is grass” and have a dropdown red, blue, green. One line of php confirms. Never had a spam entry since.

    Still get people offering SEO and linking services, but that’s another issue.

    0
  34. 65

    Isn’t it unfortunate that we can’t stop this problem at the “source”? Meaning, the jerks who create this software that crawls our forms?

    If internet providers can catch those downloading illegal music, etc. why can’t they distinguish those who are submitting to 1000’s of forms within an hour or whatever? Is that not possible to detect?

    0
  35. 66

    Great article – very thorough and informative, and I’ve just installed reCaptcha for a client, but may now rethink that.

    0
  36. 67

    I am just a website user, not a developer, but I prefer CAPTCHAs. I would rather spend 28.4 seconds of my time entering some text than 45 minutes wading through SPAM to find one worthwhile comment or article. Despite the one claim made in the article, CAPTCHAs are a user-oriented tool that helps ensure visitors have a positive experience on a website rather than spending lots of time reading bogus or offensive posts.

    0
    • 68

      I think we would all prefer to enter CAPTCHAs if the only other option was wading through spam – there’s nothing worse! However, I hope this article has highlighted that there are indeed alternatives, even for non-developers. There definitely needs to be more progress in this respect.

      0
  37. 69

    I agree with you that CAPTCHAs are not the best solution for spam protection,
    but they are a valid option for avoiding server crashes from heavy traffic created by a download button, for example.

    0
  38. 70

    I hate captcha. I think the future is for the services like Akismet. Not this awful atrocity. Captcha…

    0
  39. 71

    I use a simple antispam plugin that asks no questions when JavaScript is available; without JavaScript, it shows a text box instead. It’s a small trick, but very useful against spam.

    0
  40. 72
  41. 73

    I would like to comment on the Social Authentication.
    I have lots of friends on my FB account that I don’t know personally. When this authentication appeared on my index page, I simply did a “search” of the names on Facebook in a separate window and solved the authentication form easily.

    This way (and only if the hacked user’s friends have their face on their profile image) any profile hacker can overcome this problem. Social Authentication doesn’t really work properly like this…

    On the other hand, more personal questions like “Where was this photo taken?” can make authentication more difficult for strangers.

    0
  42. 74

    Thanks for a great article. I hadn’t realized how sophisticated CAPTCHA had truly become.

    0
  43. 75

    grunching:

    The image solutions are just as user friendly for the visually impaired as existing CAPTCHAs; you just need to support them with a sound-based solution, just like the ‘old school’ CAPTCHAs have.

    0
  44. 76

    Hi David, this is a great article, I enjoyed it, and learnt new things.

    One idea about the perfect *client side* captcha that I had in my mind for a while:
    It asks the user to perform some easy *mouse* action, such as dragging and dropping this shape into that one, clicking here and there, or even drawing some symbol.

    It would be also nice to have some service from a huge DB somewhere on Google or else. The DB would contain hashed spammers’ posts, which would be recognized by *website’s owner*, when he approves/disapproves the message. It’s probably the same way how the Gmail works…

    Val

    0
  45. 77

    The best CAPTCHA is no CAPTCHA! It’s probably the most effective, but why punish all users for some spam bots? It’s much better to use systems that prevent spam automatically, like Akismet.

    0
  46. 78

    Interesting article!
    One more thing about the reCAPTCHA project is its two-word CAPTCHA. One word is known to the system, and it takes the input for the second word from us; the chance of a correct new word is three, and it is then added to the system. The solved CAPTCHAs are now used in digitizing the NY Times archives.

    0
  47. 79

    Thank you for the well-researched article explaining the different options available. I especially liked how you pointed out issues regarding accessibility and usability. I was surprised and disappointed by the results of the survey showing the high number of people using reCAPTCHA and the low number of people using the honeypot method.

    0
  48. 80

    Funny how smashing mag ALWAYS makes a post regarding a topic I am currently researching. I like webdesignbeach’s new captcha technique. Simple jquery ui dragndrop

    0
  49. 81

    Thanks for the comment Richard. It is an exhaustive article! My point was that after all the efforts in CAPTCHA research not one exists that stops spam completely and maintains a good experience for all users. I also hoped to highlight that there are better alternatives. Particularly automated spam detection techniques which are invisible to the user.

    0
  50. 82

    richard hellyer

    March 5, 2011 4:22 pm

    Moderator — Please ignore and delete my last comment. This is a very useful article and I wrote an unthinking critique. It is good to see postings such as these and others on smashing magazine that have real content, well researched.

    0
  51. 83

    While I understand your point about captcha being annoying or presenting usability issues for some, I think it really depends on the nature of the site. Most modern sites oriented towards adults employ captcha, and I think most adults have no issues with the fact they have a captcha hurdle. For tweenies and other immature markets it likely is an issue, but I have yet to talk to the operators of any normal business site and find that they are losing clients due to having captcha in place.

    0
    • 84

      I agree, yet is there a valid reason not to implement an alternative that solves these annoyances and usability problems, while at the same time not burdening the business more or less?

      0
  52. 85

    Aakash Bhowmick

    March 6, 2011 1:39 am

    Nice post. CAPTCHAs have always been annoying to me. It was cool to find out that the reCAPTCHA project is being used to digitize books.

    0
  53. 86

    If you want to detect whether a comment was posted by a human or a bot, you can post the comment via AJAX and check in your PHP script: if (isset($_SERVER['HTTP_X_REQUESTED_WITH']) && $_SERVER['HTTP_X_REQUESTED_WITH'] == 'XMLHttpRequest') { /* accept the comment */ }

    Then comments will be accepted only if they are posted from JavaScript. And for people who don’t use JavaScript, just print a message: “Sorry Dude, but you can do this only if you have JavaScript enabled”

    0
  54. 87

    The problem with “logic” questions is that many of us who are not native English speakers would find some of them difficult to solve, even if they could be solved by 7-year-old (English-speaking) children. This has happened to me in the past and it’s really annoying. In the case of reCAPTCHA, if I don’t know what the words mean it doesn’t matter; I just type the letters and that’s it, and the same goes for any other language. Maybe if you could choose your language before solving the question it could work, but that would mean another annoying step.
    btw sorry for my english

    0
  55. 88

    Gerben van Dijk

    March 6, 2011 12:29 pm

    Bumped into this the other day: http://www.myjqueryplugins.com/QapTcha/demo
    Also a nice way for a captcha ;)

    0
  56. 89

    Murtaza Ali-Netvepaar

    March 6, 2011 2:14 pm

    There is one more idea for Captcha……that’s http://www.adscaptcha.com/

    I am sure you must have heard about it.

    0
  57. 90

    What do you mean perfect captcha? There is no such thing. Anyone who attempts to use a captcha on their website (including the big Google) is kidding themselves. We have technology that can determine if the user is a bot – it works great.

    0
  58. 91

    Is it just me, or is reCAPTCHA awful? Half of the time I cannot read the damn text; I can’t imagine how someone with a severe sight impairment or disability would manage (I know audio exists, but the point stands). Not to mention that reCAPTCHA is currently broken by spam bots. The more user-friendly designs you have shown seem to be both better for UI and more complicated for bots to solve than text recognition is.

    0
  59. 92

    The best CAPTCHA is to not use CAPTCHA…

    Try this:
    One invisible text-input field (with css-> display:none)

    If the value of this field is “” (empty) -> request is valid. Otherwise a spam-bot has sent the request.

    0
  60. 93

    Bart Breeschoten

    March 7, 2011 6:19 am

    Great article! Thanks for the overview of CAPTCHA and alternative solutions. Apart from the ones you mention, I’ve also seen random field names in server-generated forms. For spammers this means that an HTTP POST string that has worked once won’t work again. Spambots would have to be programmed to retrieve the form first, figure out what the fields are called this time and use those names to construct an HTTP POST request. Not impossible, but definitely a hurdle for spambots (if not for human spammers). For the online guestbook of my party band I’ve chosen to silently ignore on the server any entries with a hyperlink in the body text. That has effectively filtered out 99.9% of the spam messages I was getting.
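
Both ideas in this comment are easy to prototype. The Python sketch below is an assumption-laden illustration, not the commenter’s implementation: `new_field_map`, `decode_submission` and `contains_hyperlink` are made-up names, and the server is assumed to keep the field map (for example in the visitor’s session) between rendering the form and receiving the POST.

```python
import re
import secrets

LINK_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)


def new_field_map(logical_names):
    """Map each logical field to a fresh random name for one rendering of the form."""
    return {name: "f_" + secrets.token_hex(8) for name in logical_names}


def decode_submission(field_map, posted):
    """Translate random names back to logical ones; None if stale names were posted."""
    try:
        return {name: posted[random_name] for name, random_name in field_map.items()}
    except KeyError:
        return None  # a replayed POST string no longer matches this form


def contains_hyperlink(text):
    """Crude link detector for silently dropping entries that contain URLs."""
    return LINK_PATTERN.search(text) is not None
```

A replayed request fails because its field names belong to an earlier rendering, and the link filter catches most of what is left, at the cost of also dropping genuine entries that happen to include a URL.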

    0
  61. 94

    Richard - Accessible Web Testing

    March 7, 2011 7:53 am

    Very interesting and useful article. There does need to be a debate about Captcha.

    Under the User Interaction section you say “Obviously this option is inaccessible to people with special needs.” – that is a very sweeping generalisation as there are plenty of people with special needs who would find it accessible (e.g. someone who is deaf, or someone who is a wheelchair user but has good fine motor skills) . For certain special needs it would prove inaccessible of course.

    0
  62. 95

    The best way to prevent spam is to make sure your form doesn’t work.

    That’s the technique used in the survey on this page, which – when you select ‘Honeypot’ as the only answer – says “This poll cannot find nonce”.

    I bet it hasn’t been spammed once! (or nonce!)

    0
  63. 96

    Great article… I hadn’t put much thought into CAPTCHAs hurting the user experience, which seems fairly “common-sensical” once you actually think about it.

    Also, this article inspired me to add Disqus to my site, in an attempt to make things easier for users, and harder for spammers.

    Thanks!

    0
  64. 97

    The perfect captcha isn’t one – it depends on your pov. For the average user, the perfect captcha is the non-existent (or seemingly non-existent) one; for bots it’s the one that’s easiest to bypass; and for webmasters it’s the one that’s easiest to implement that seemingly gives good enough protection. I started out using reCaptcha for static sites, then progressed to using the built-in solution for the CMS I’d be using (if there was one) and nowadays, as an informed web designer/webmaster I use a combination of honeypot and IP identification. This last one is the only one that works perfectly against both spambots and human spammers. I’d guess it’s the “vicious dog” factor that does it…

    Users shouldn’t have to wade through tons of spam to find interesting stuff. But then webmasters shouldn’t have to deal with tons of sludge just to keep an initially clean site clean for visitors either.

    0
  65. 98

    Marie-Lynn Richard

    March 7, 2011 11:51 am

    This is the most complete post I have found on this topic. It is a topic that has been grating on me for years, but it’s also almost passé. Why are we spending so much time, money and social capital on getting humans to prove they are human when we can simply catch robots being robots? (http://marie-lynn.org/2011/03/04/on-the-end-of-captcha/) Though I was recently inspired to create a new kind of CAPTCHA system by Kevin Nealon, of all people.

    Also I vehemently disagree with Alex Rice’s statement that “Hackers halfway across the world might know your password, but they don’t know who your friends are.” because I have proven that the contrary is much more likely in my article about Facebook security holes: Facebook Unintentional Feature – Retrieving a Hidden Friend List (http://marie-lynn.org/2011/02/03/facebook-unintentional-feature-retrieving-a-hidden-friend-list/)

    0
  66. 99

    I kind of liked xkcd’s approach, although I doubt anyone has built it. http://xkcd.com/810/

    0
  67. 100

    Richard Ozh (@ozh) custom-coded a matching captcha over at scr.im that’s pretty awesome.

    0
  68. 101

    Image recognition CAPTCHAs can be compromised very easily. In my opinion, it’s worth taking a look at scientifically founded research results published in this area. There are some really surprising results, e.g. how image recognition CAPTCHAs can be outmaneuvered with very little effort -> http://epub.uni-regensburg.de/16872/1/trustbus_1.pdf

    If you’re interested in that kind of topic, try Google Scholar and search for CAPTCHA.

    Regards,

    Andy

    0
  69. 102

    @weallneedheroes

    March 10, 2011 2:10 am

    Thanks for this i was just talking about a better solution for “Captcha” the other day.

    Personally I don’t like them. I know they are useful, but I like the honeypot idea combined with a question that has an obvious answer, like 1 + 1 =.

    Great article by the way :)

    0
  70. 103

    Honeypot is the best solution from my point of view. It is the least annoying for the user!

    0
  71. 104

    I think there is only one solution – Google needs to change its algorithm so that incoming links don’t affect SEO. I now often skip the first page of a Google search, and never pick anything in the first 5 as they are never the type of website I’m looking for.

    I’d like to see a banlist that Google uses for serial offenders. I am tired of long pages of comments like “nice site” or “great article” which are obviously added only for link building by humans.

    0
  72. 105

    My comment: I don’t think any company (except for, perhaps, search engines) receives sooooo much spam to really need to implement one of these uncomfortable and silly captchas. It was just that someone came up with the idea, and the rest copied it.

    0
  73. 106

    CAPTCHAs were invented by lazy people without inspiration.

    The honeypot is unbeatable even when spambots learn to read styles. Just place a container with the background color, position and z-index above the honeypot input. No need to hide it with display: none. Or just simply set the height to 1px and all colors to the background color.

    0
  74. 107

    As a freelance developer I’ve used reCAPTCHA in the past and continue to use it on my projects. However, I now think of the 3% of people who are turned away because they find it cumbersome. Maybe some JavaScript widgets will be the next ‘in’ thing. I’ve also heard of logical CAPTCHAs where users are asked random questions.

    0
  75. 108

    The least annoying is the math question, for example “1+3=?”. Very simple and straightforward. Each question varies, and the sum of the two numbers is never more than 10. Anyone who doesn’t know 1+3 is the stupidest human ever.
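
A question generator along these lines might look like the Python sketch below. The function names are hypothetical, and it assumes the expected answer is stored server-side (for example in the session) rather than sent to the browser:

```python
import random


def make_sum_question(rng=random, max_total=10):
    """Return (question_text, expected_answer) with a sum no larger than max_total."""
    a = rng.randint(1, max_total - 1)
    b = rng.randint(1, max_total - a)
    return f"{a}+{b}=?", a + b


def check_answer(expected, submitted):
    """Accept the answer only if it parses as the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False
```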

    0
  76. 109

    There’s a viable CAPTCHA method that you didn’t cover: salted text. The human sees undistorted text, as large as needed to suit his eyes, and simply types what he sees, “Text” for example. Machines, on the other hand, observe a salted string, such as “Tsaltext”.

    The trick is that spans and inline CSS are used to make the “salt” substring invisible. Details are outlined in Aza Raskin’s article: http://j.mp/cltO7S.

    Salted text is easy to implement. That said, it doesn’t work for screen readers.
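
    To make the idea concrete, here is a rough sketch that reproduces the “Text” / “Tsaltext” example. It is not taken from the linked article: the function names are hypothetical, and hiding the span with display:none is only one assumed way to make the salt invisible.

```python
import html
import re


def salt_text(word: str, salt: str, position: int) -> str:
    """Insert `salt` into `word` at `position`, wrapped in a visually hidden span."""
    hidden = f'<span style="display:none">{html.escape(salt)}</span>'
    return html.escape(word[:position]) + hidden + html.escape(word[position:])


def naive_machine_view(markup: str) -> str:
    """What a scraper reads if it strips the tags but ignores the CSS."""
    return re.sub(r"<[^>]+>", "", markup)


markup = salt_text("Text", "salt", 1)
print(markup)                      # the HTML sent to the browser
print(naive_machine_view(markup))  # the salted string a naive bot extracts
```

    The server compares the submission with the unsalted word. A bot that evaluates the inline style sees through this, which is why a real deployment would need to vary how the salt is hidden.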

    0
  77. 110

    The guys from Web Design Beach in Belgrade, Serbia have a great Captcha.
    It is a drag-and-drop jQuery plugin. Hopefully you like it.

    http://www.webdesignbeach.com/beachbar/ajax-fancy-captcha-jquery-plugin

    E.

    0
  78. 111

    Thanks for sharing. I’ll be sending others to read your great article. BTW, blind users who rely on speech technology would also be left out by a honeypot, as speech technology reads the code, not what is seen. I can’t understand why folks who feel they must use a captcha don’t just add “e-mail us to get this account if you can’t use the captcha”. Some junk mail is certainly worth ensuring that everyone can use a website.

    0
  79. 112

    Pathetic, the article doesn’t even mention the fact that nowadays HUMANS solve those captchas, not robots.

    http://www.scribd.com/doc/35796404/CAPTCHAs-—-Understanding-CAPTCHA-Solving-Services-in-an-Economic-Context

    P.S.: there is a .pdf version on the net, but I’m too lazy to look for it now.

    0
  80. 113

    A few years ago, we invented the IMAGINATION image-based captcha system. It is a two-step process: click and annotate. In the click process, a user is asked to find the geometric center of an image from a collage of images. In the annotation process, the user is asked to choose a word to describe the content of the image. In both steps, carefully distorted images are used so that image recognition software cannot be used to attack the captcha. A demo system is available at:
    http://goldbach.cse.psu.edu/s/captcha/index.php
    This invention is currently being commercialized by the Imagination Systems Inc. A prototype is being tested at http://www.asphaltandrubber.com/

    James Wang
    Professor, Penn State

    0
  81. 114

    I like simple captchas like 7+3=?, but they should be a bit harder: (7×3)/3=?

    0
  82. 115

    In cases where a comment is not supposed to contain links or e-mail addresses, as in some reviews, would it not be possible to simply lock the submit button with an error message: ‘links and e-mail addresses are not allowed in the comments’?

    A human reviewer will almost never post links or e-mail addresses, but a spam-bot will.

    Perhaps automatically making the submit button fake (doing nothing), making the bot think it has submitted the content. This would completely prevent any traffic from being sent or having to be filtered in the backend…as it will be blocked in the frontend.
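
    A sketch of the check itself, with one deliberate change from the suggestion above: it runs on the server, on the assumption that a bot may post to the form without ever loading the front end. The patterns are simple illustrations rather than complete URL or e-mail grammars, and all names are hypothetical.

```python
import re

# Matches http(s):// links and bare www. addresses.
LINK_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)
# Matches something shaped like name@domain.tld.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def contains_link_or_email(comment: str) -> bool:
    """True if the comment contains something that looks like a link or e-mail address."""
    return bool(LINK_PATTERN.search(comment) or EMAIL_PATTERN.search(comment))


def review_comment(comment: str) -> str:
    """Return the message the form would show for this comment."""
    if contains_link_or_email(comment):
        return "Links and e-mail addresses are not allowed in the comments."
    return "Thank you for your review."
```

    The same patterns could be mirrored in the browser to give human reviewers immediate feedback, with the server-side check as the one that actually blocks the post.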

    0
  83. 116

    Hi, I’m a web designer from Persia.
    I don’t know whether you have heard, but Google recently started using its captcha service to check the OCR output of scanned libraries and old books, which makes the job interesting.

    0
  84. 117

    that was a good read, thank you

    0
  85. 118

    CAPTCHA is an unacceptable user experience. We need to stop getting clever with CAPTCHA and keep creating ways of not using it.

    0
  86. 119

    Tried KeyCAPTCHA

    July 28, 2011 9:04 am

    I uninstalled KeyCAPTCHA because immediately after installing it I got tons of bot spam passing through it.

    Besides, the e-mail address with which I registered on the keycaptcha.com site (and which I used only for this registration) was flooded with spam.

    0
  87. 120

    If a captcha is based on regular text, it could be a much better user experience. http://www.enmask.com uses encrypted regular text, so humans see regular text while machines read encrypted text.
    The key is that a machine does not use a font to ‘see’, so if we provide a matching encrypted web font, humans can read it easily. And the fonts and styles can change at any time, making it difficult for machines to use OCR.

    0
  88. 121

    Hi Everyone,

    Great Captchas here!

    I am using sweetCaptcha. I think it’s the best captcha I’ve ever seen. It looks so good, and now they have translation coming too. I think it’s worth a mention on a page like this.

    0
  89. 122

    The best captcha is the invisible one.

    I have used keypic for some time and it is much better than any captcha.

    0
  90. 123

    The Facebook friend recognition sucks. Alok Menghrajani is the man in the picture, and he is not even MY friend. I don’t even have a Facebook account! In my opinion, I expected a more creative solution from one of the biggest online communities in the world.

    0
  91. 124

    Hey

    I just signed up at coding.smashingmagazine.com and I am Completely new

    I am Super sorry If this is not right coding.smashingmagazine.com category to input

    I am Completely new to forums Im 51 year old And my buddy Gifted me A new laptop for For my birthday so still I am not familliar with posting on forums correctly.

    Please Be kind to me, I am looking forward to contributing at coding.smashingmagazine.com

    0
  92. 125

    Brilliantly written! You’ve nudged my outlook regarding reCaptcha so I will now remove it. What really caught my attention was the fact that SPAM is the website owner’s problem, not the user/client. For now at least, I will go for the ‘Hidden input-field’ solution and just make peace with the few SPAM bots that make it through.

    Can anybody direct me to such a script?

    0
  93. 126

    The article was informative and entertaining. I do not like the idea of one-login-for-many-sites. It is a security risk that begs for catastrophic results.

    0
  94. 127

    Hello,
    What does it mean Copy Rights per installation of my Software Program? I am working with the Company and they are asking me to sign the contract where I can give them Copy Rights per installation. I want to make sure that I will not give the ownership or Copyrights of my Software Program.

    Thank you

    0
  95. 128

    I always have clients asking if the CAPTCHA can be removed or made more user friendly. This is a great explanation into how it can be achieved. Thanks for the post, really helpful !

    0
  96. 129

    I have used the honeypot method for 4 years now, and neither I nor my clients ever get any spam filled in by a bot. Really ZERO spam from bots! Set an input field (the honeypot) to display: none in the CSS, so that only bots will see it. If PHP notices, before it sends the message, that the field is filled in, it sends the message straight into the honeypot. Bye bye, bots!

    Contact me if you want the PHP mailer script: info*at*gentlemedia*dot*nl
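
    For readers who want the gist without the mailer script, here is a minimal sketch of the server-side half, written in Python rather than PHP. It is not the script offered above; the field name "website" and both function names are assumptions made for illustration.

```python
from typing import Mapping

# Hypothetical name of the input that the CSS hides from human visitors.
HONEYPOT_FIELD = "website"


def is_bot_submission(form: Mapping[str, str]) -> bool:
    """True when the hidden honeypot field came back with a value in it."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


def handle_contact_form(form: Mapping[str, str]) -> str:
    """Silently drop bot submissions; pass everything else on to the mailer."""
    if is_bot_submission(form):
        return "discarded"  # no error shown, so the bot gets no hint
    return "sent"           # the real mail-sending step would go here
```

    A human never sees the field, so it arrives empty. A bot that fills in every input it finds gives itself away.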

    0
  97. 130

    I think your choice of captcha depends on your value to the potential spammer/hacker and the data you are trying to protect. If you’re just blogging at wordpress.com (other social platforms are available) then the invisible captcha is more than adequate, but if you believe people are specifically targeting your comment fields or, more likely, your login screens, then you have to take it more seriously.

    PS. Anyone else think comments 121 & 123 may be spam? That would be ironic.

    0
  98. 131

    Really interesting article – thank you. Am now assessing my use of captchas and the need to make them invisible to genuine users.

    0
  99. 132

    Christopher Supnig

    May 7, 2012 2:44 pm

    Some time ago I started to use a honeypot in combination with javascript and never had problems with any bots. I described it here: http://www.supnig.com/?rva.portlet=Blog&p.post=%236%3A2&p.action=view It would be nice to hear your thoughts on that.

    Of course it always depends on the site you are using the captcha on. If it is very popular then somebody might find it worth while to implement a special bot that can circumvent your captcha. But in general I think it is a good idea to keep users from spending their time on filling out captchas.

    0
  100. 133

    How many dots do you see here: • • • 1, 2 or 3? [ ] Don’t know why, but it has worked fine on many of my sites for years. :-)

    0
  101. 134

    Nice article!
    I firmly believe in “the best CAPTCHA is no CAPTCHA”. Instead of building gates and letting fewer people through, why not let everyone through and then concentrate on building scripts and checks along the way that detect malicious behavior from a user? This way the website will detect both traditional bots and paid human spammers (who are increasing as technology reaches more developing areas). One way could be to build indicators into a site that trigger flags when suspicious behavior occurs after registration; once enough flags are triggered for a certain type of user, the user is frozen or deleted. This may be more time-consuming to build and will require maintenance, but as long as spammers keep changing tactics, site owners need to do so as well. At the end of the day it will be the most user-friendly and least noticeable solution.
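
    The flag idea could be sketched roughly as follows. The class name, the threshold of three distinct flags, and the example reasons are all assumptions; a real site would persist the flags and decide for itself what counts as suspicious.

```python
from collections import defaultdict


class SuspicionTracker:
    """Collect distinct suspicion flags per user and freeze users past a threshold."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._flags = defaultdict(set)  # user id -> set of distinct reasons
        self.frozen = set()

    def flag(self, user_id: str, reason: str) -> bool:
        """Record a flag; return True if the user is frozen afterwards."""
        self._flags[user_id].add(reason)  # repeating a reason does not add up
        if len(self._flags[user_id]) >= self.threshold:
            self.frozen.add(user_id)
        return user_id in self.frozen


tracker = SuspicionTracker(threshold=2)
tracker.flag("user-42", "link in first post")
tracker.flag("user-42", "posted within seconds of registering")
print("user-42" in tracker.frozen)
```

    Counting distinct reasons rather than raw events is a design choice here, so that a single noisy indicator cannot freeze an account on its own.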

    0
  103. 136

    I always liked motion captcha as a solution. Not very accessible, agreed. But no problems with color (blind), legibility, etc…
    http://www.josscrowcroft.com/demos/motioncaptcha/

    0
  104. 137

    I am not hearing impaired, disabled or the like but half the time I can neither visually make out nor hear the captcha resulting in me simply leaving the site; not good for business!

    0
  105. 138

    A very interesting article.
    But I have a question: what about multiple captchas?
    As an example: honeypot + 5 second count + choose 1 out of X very simple user interaction (this grants no accessibility problems)?

    Users only “feel” one of them, and very simple, but we made 3 captchas.

    0
  106. 139

    Hi,
    Your article was absolutely fantastic, and browsing a website the other day I found the perfect solution. It is a CAPTCHA that uses a small logic game to verify your result, and has platform specific options. Forgot the name though!!

    0
  107. 140

    I wrote this captcha implementation myself, and I’m just trying to share it with other developers, because it’s so much simpler than many of the other implementations out there.

    http://meta64.com/axis/fb/?id=13655

    0
  108. 141

    With the spread of automated software, traditional captcha solutions are becoming less effective day by day. For me, the best way to tell a robot from a human is to ask a question that makes the user think, like “When was this site established?” Regards, Russel from seo company sydney

    0
  109. 142

    Thank you! I’ve been researching what type of captcha to install on several WordPress sites that I moderate. Akismet works well enough for comments, but there are questionable user registrations that appear almost daily. I was considering SweetCaptcha, but had concerns about the complexity of the instructions and the ability of some users to follow through. I also don’t like the implications of certain captcha text instructions, which ask you to “verify your real existence” or “prove you are human.” After reading this, I am much more inclined to go with a honeypot solution.

    0
  110. 143

    Good alternative to Captcha:
    http://areyouahuman.com/

    0
    • 144

      Seems like a confusing solution. Captchas should be as unobtrusive as possible, otherwise you lose visitors/clients. If a user has to figure out what she has to do to pass a test when she potentially does not even understand why she has to pass it, user experience can get very messy.

      0
  111. 145

    As long as there is valuable content, there will be spam and people who want to do harm… it’s kind of a vicious circle really ;).

    0
  112. 146

    Excellent article!

    I’ve noticed that a lot of articles here on SM seem to present ideas without a lot of thought behind them or any real substance in the article. This one prompted a discussion at my office and is likely to result in my development team implementing a honeypot solution instead of using reCAPTCHA as we do now.

    Great job David.

    0
  113. 147

    Alternative for the new age :)
    http://www.lirullu.com

    What did you think?

    0
  114. 148

    I’m surprised that http://areyouahuman.com got no mention! Those guys are doing great work.

    0
  115. 149

    Great work! This is the type of info that are supposed to be shared around the web. Shame on Google for now not positioning this publish higher! Come on over and visit my site . Thank you =)

    0
  116. 150

    An alternative to Captcha I prefer is keypic. It’s free of charge and quite efficient. I’d recommend it to anyone who wants a strong protection from spam, without any action needed from the site users. Try and search the web for keypic.

    0
  117. 151

    I use a combination of Mollom + my own implementations of honeypot and timestamp methods for my own (Drupal-powered) website. The numbers are great: Mollom stops most of the attempts, and the other two methods catch most of the rest. From time to time, about once a week, something gets through and I have to moderate it, but the effectiveness of these three methods combined is over 99%.

    Highly recommended.

    My honeypot and Timestamp methods are for Drupal 6, but they can be easily adapted to Drupal 7, in case someone wants to try:
    http://www.isegura.es/blog/stop-spam-your-site-being-invisible-honeytrap-drupal-comments-form
    http://www.isegura.es/blog/stop-spam-your-site-being-slow-flood-control-method-drupal

    0
  118. 152

    Even though this article is dated 2011, I found it really interesting.

    I think that one additional field, checked on the server side (PHP in my case) just before running a database query, sending an email or calling a PHP function, is really effective and transparent.

    But a discussion on another website raised one big issue: hidden fields can be tricky if someone uses some kind of website reader that doesn’t process CSS.

    If you turn off CSS the hidden field is not hidden anymore.

    So I made my own version: a visible field labeled “prohibited field”; after the label comes the text input, and AFTER the text input comes another label, “<-- don’t fill this”. The input’s POST name could be “subject2” or “emailme” or some other common field name (for paranoid bots).

    Whoever fills in that field was warned, and PHP execution exits with an echo that prints “You were warned! You have filled out the prohibited field” plus a link back to the contact/registration/comments page.
    “Hey man, you were warned, next time skip the prohibited field” is the thought for those users who fill in that text input. I’m pretty sure some real humans will, but who cares?

    This method IMO is also interesting because you can optionally render the label as an image, and (image or not) it also defeats those bots that are able to recognize hidden fields :-)

    (Edit: HEY! Great idea and alternative!!! An image that simulates a CAPTCHA in the HTML: a real image but a fake CAPTCHA ;-) In fact we would use that image as a label that says “this is an anti-captcha, don’t waste your time and leave this field untouched”. This text is quite long, but it gives the idea.)

    What do you think about this? “Don’t write here”: a real anti-captcha.

    NOTE: labels can be interpreted by smart bots, so
    – you can set more than one (a couple) of anti-captcha fields
    – at the top of the page, where you might say that fields marked with a “*” are mandatory, you can add that those marked with a “#” are prohibited, must be left empty, and will make the form submission fail if filled in.

    Thank you for your opinion

    Robert – Verona – Italy

    0
    • 153

      P.S. The solution I’ve described above is quite recent.
      I needed an effective but simple and easy-to-implement anti-spam system for an open source (and unfortunately abandoned) ticket system named lynxHD.

      Recently a spam campaign started and a lot of tickets were created.

      What seemed like magic to me (I’m not an expert) was that I removed the submit button from the page, I commented it out… but the tickets were still created?!

      Thanks to anyone who can explain how that was possible. It was as if the bot was able to generate the submit using some kind of event.

      P.S. Before implementing the visible field trick described above (which works for the moment), I went the long way.
      – I used JavaScript code that measured the time elapsed between the page load and the onsubmit event. This anti-spam measure is still active and it works (I’ve tested and measured it). If you fill out the form in less than 7 seconds, the page reloads itself, like pressing F5 in the browser, the time counter resets and you must start over… but no luck, new spam tickets were still being created too fast.
      I increased the 7 seconds to 1 minute, but they were still creating 4 or 5 tickets per minute?!
      Maybe the bot had JavaScript disabled. Ahahah, hey, now I get it :-((

      At that point I removed the submit button, only to discover that the tickets were still being created?!

      So the bot was exploiting some kind of leak and submitting the form unaffected by both the JavaScript elapsed-time check and the absence of the submit button.

      That was the moment when I realized that the field to be left empty was the right solution, at least for this kind of bot.

      But please, if any of you can explain to me why I was going crazy with that sort of magic spam, thank you in advance.

      Robert

      P.S. The JavaScript issue may be the key… bots are not browsers. This could be the explanation, so any elapsed-time measurement before the submit event must be done on the server side.
      This thread, which I just found, could help those of you who want to try this solution: http://forums.phpfreaks.com/topic/265035-calculating-elapsed-time-between-pageload-and-page-submission/
      Combining the empty field with elapsed-time measurement could lead to an effective and transparent anti-spam and anti-captcha solution.
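
      A bot that posts straight to the form’s URL never loads the page, so it never runs the JavaScript or needs the submit button, which is presumably what happened here. One way to measure the elapsed time on the server is sketched below: the page embeds a signed timestamp in a hidden field, and the handler rejects forms that come back too quickly or with a tampered token. This is not the code from the linked thread; the secret key, the function names and the 7-second minimum are assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET_KEY = b"change-me"  # hypothetical; keep the real key out of the page source


def _sign(issued: str) -> str:
    return hmac.new(SECRET_KEY, issued.encode(), hashlib.sha256).hexdigest()


def issue_token(now: Optional[float] = None) -> str:
    """Token for a hidden form field: '<unix time>:<signature>'."""
    issued = str(int(time.time() if now is None else now))
    return f"{issued}:{_sign(issued)}"


def reject_submission(token: str, min_seconds: int = 7,
                      now: Optional[float] = None) -> bool:
    """True if the token is malformed or forged, or the form came back too fast."""
    try:
        issued, signature = token.split(":", 1)
        issued_at = int(issued)
    except ValueError:
        return True
    if not hmac.compare_digest(signature.encode(), _sign(issued).encode()):
        return True
    current = time.time() if now is None else now
    return current - issued_at < min_seconds
```

      Because the timestamp is signed, a bot cannot simply send an old-looking value. A fuller version would also expire old tokens so that one token cannot be replayed indefinitely.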

      0
  119. 154

    I built https://verscaptcha.com, IT’S FREE, FAST, and EASY TO ENTER!

    0
  120. 155

    The best captcha you can use, is not to use it!

    try keypic.com

    0
  121. 156

    I quite like the idea of incorporating a standard text CAPTCHA as a honeypot into forms along with the methods from this article http://nedbatchelder.com/text/stopbots.html

    I’ll definitely be implementing this next year on all our forms.

    0
