


Lately, across our network of sites, we've been getting more spam than usual. The cause of the problem: the spambots are getting better. Spambots are programmatic robots that crawl the internet (in much the same way the search engines do) with the sole purpose of gathering e-mail addresses in order to send you completely irrelevant promotional material. I've often wondered about that last part: it would be really easy to determine the context of the e-mail address you find, and you could even glean information from multiple sources as to the consumer preferences of your subject, and so increase what must be a dismal conversion rate for the spammers.

Anyway, that's not what this article is about. This article is about e-mail obfuscation, or making the e-mail addresses on your site readable by humans but unintelligible to robots. There are a couple of techniques out there.

The most basic, which has been in use since the dawn of time (well, of spambots), is to fire up your favourite graphics editor, create an image of your e-mail address and replace the text of your e-mail address on your site with the newly created graphic. This has its limitations: if you run a large site with a lot of e-mail addresses (hence prone to change and additions), you would have to have someone on hand to create this plethora of e-mail address graphics. Also, unless you use some form of CAPTCHA-looking text, the bot (depending on its sophistication) could harvest the e-mail address from the image.

The second technique is to write out the e-mail address but replace "legible" characters with "illegible" characters. So, for example, a mailto link to an e-mail address would be written out with every character of the address in its encoded form. The drawback: it's painfully easy to "decode".

Recently I came across a solution from one Tim Williams (University of Arizona) that works well, later modified by Andrew Moulden (Site Engineering). Here's the theory and its resulting evolution.

Get the e-mail address you want to make unreadable to the spambots, convert it to lowercase, create a cipher (an encryption technique), encrypt the e-mail address using the cipher, write out the coded e-mail address to the document, write out the cipher to the document (both are basically useless to a harvester) and then wrap this in a piece of javascript to actually write out the link based on the cipher key and cipher text.

The trouble with this technique was that it used the same cipher key for each e-mail address, so if the technique was used widely, a spambot would just need to take the fixed cipher key and write code (again, really easy, but a lot harder to do efficiently, what with coding idiosyncrasies) to decode the obfuscated text into a useful e-mail address. Then Andrew Moulden modified the javascript so that a different cipher key would be used every time the script was run. This was a really valuable evolution of the script.

As the script stood in that form, it was the best solution out there for small to medium sites, but the problem lay mostly in the distribution. The code used to encrypt the e-mail address was written in javascript and executed from a browser, taking the plaintext e-mail address as a parameter. But, of course, you couldn't pass the plaintext e-mail address to the encryption function (munge) from within the page, because you would have to write out the legible address into the document. I wanted to implement and distribute this code on a reasonably large scale across several sites, and so didn't want to generate the code offline and then paste the resulting code into my document, so here's my variation on the current offering.

#Html email obfuscator code#

Every bit of content (where an e-mail address might appear) on every site we've ever built is housed in one or more databases. Before the content is sent to the browser, we pre-parse it looking for plaintext e-mail addresses. I'm not going to reprint the regular expression for detecting standard e-mail addresses here because it's really long and complicated. Conveniently, the addresses on our sites all take one of a couple of simple forms, so some of the "deeper" patterns are irrelevant. Once we find an e-mail address in our pre-parse, we replace the plaintext with the result of a call to the PHP function munge. The output then contains the coded address and its key in place of the plaintext (the variables coded and key will be different every time).
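
To make the cipher steps described above concrete, here is a minimal sketch in JavaScript. It is not the Williams/Moulden script or our PHP munge function, neither of which is reproduced here. It only assumes a simple substitution cipher with a freshly shuffled key on every call. The names ALPHABET, makeKey and demunge, and the example.com address in the usage note, are hypothetical.

```javascript
// Characters the cipher knows about. Anything else passes through
// unchanged (an assumption made for this sketch).
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789@._-";

// Build a fresh key: a random permutation of ALPHABET (Fisher-Yates shuffle).
// Math.random is fine here because the goal is obfuscation, not security.
function makeKey(random = Math.random) {
  const chars = ALPHABET.split("");
  for (let i = chars.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [chars[i], chars[j]] = [chars[j], chars[i]];
  }
  return chars.join("");
}

// Encode: lowercase the address, then swap each character for the one
// sitting at the same position in the key.
function munge(address, key = makeKey()) {
  const coded = address
    .toLowerCase()
    .split("")
    .map((ch) => {
      const i = ALPHABET.indexOf(ch);
      return i === -1 ? ch : key[i];
    })
    .join("");
  return { coded, key };
}

// Decode: look each coded character up in the key and map it back
// to the character at the same position in ALPHABET.
function demunge(coded, key) {
  return coded
    .split("")
    .map((ch) => {
      const i = key.indexOf(ch);
      return i === -1 ? ch : ALPHABET[i];
    })
    .join("");
}
```

In this sketch the encoding side would call something like `munge("someone@example.com")` and write only `coded` and `key` into the page. A small inline script would then call `demunge(coded, key)` in the browser and build the mailto link from the result, so the plaintext address never appears in the document source.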
