Reveal Your Hidden Email Addresses
Because of the large volume of unwanted email that webmasters receive, hiding email addresses from spammers' email harvesting robots is becoming commonplace. Methods vary (see below).
No method comes with a guarantee. If a browser can reveal your email address to site users, create a link with it, or send it to a script, chances are the dark one's robots can harvest it. Their robots are likely to get smarter and smarter in order to circumvent ever more sophisticated obfuscation attempts.
One could argue that spammers, realizing that those who attempt to hide their email addresses wouldn't buy into their "marketing messages" anyway, won't make an attempt to harvest hidden email addresses. However, harvesters want valid email addresses, especially if the addresses are going on a CD for sale to spammers. Looking at it that way, email addresses in hiding are nearly 100% certain to be valid, with a live recipient at the other end. Thus, they could be considered prime value.
To demonstrate how easy it is to find your "hidden" email addresses, I've made a CGI program that can be used to find them. Simply type in the URL of your page and the program will tell you which email addresses it found. /a/15/pl.pl?art153demo
Methods of hiding email addresses include:
-
Encoding email addresses with the ISO-Latin-1 codeset. Each encoded character begins with the characters "&#" and ends with a ";" character. In between is a number from the codeset. (For ASCII characters, the number is the decimal value of the character.) Thus, the "@" character is represented as "&#64;".
The anti-spam function of Advanced Email Link Generator with Anti-Spam Encoder from /a/15/pl.pl?aelgwase uses this encoding method. "ab@ab.com" becomes:
&#97;&#98;&#64;&#97;&#98;&#46;&#99;&#111;&#109;
-
Encoding email address links with hexadecimal numbers. Each encoded character begins with "%" and is followed by the hexadecimal value of the ASCII character. Thus, the "@" character is encoded as "%40" and "ab@ab.com" becomes:
%61%62%40%61%62%2e%63%6f%6d
-
Inserting arbitrary HTML comment tags within the email address. Because HTML comment tags contain characters invalid in URLs, it is assumed or hoped that the robot will retrieve only an invalid email address or no email address at all. Encoding "ab@ab.com" as
a<!-- hi :) -->b@ab.com
might cause the robot to incorrectly assume "b@ab.com" is the email address.
Encoding "ab@ab.com" as
ab<!-- X -->@<!---->a<!-- -->b.com
might cause the robot to miss the email address altogether.
-
A combination of the above. "ab@ab.com" could be encoded as:
a<!-- hi :) -->b@ab.com
The encoded email address could contain a multi-line HTML comment tag:
a<!-- hi :) -->b@<!-- any text here -->ab.com
As a link, "ab@ab.com" could be encoded as:
a<!-- hi :) -->b%40%61b.<!---->com
-
Using JavaScript to insert the email address into the page can be effective when the address is printed in portions and/or encoded with one of the above methods. Well, it can be effective until a robot reads JavaScript.
This would print "ab@ab.com" on a web page:
<script type="text/javascript" language="JavaScript"><!--
document.write('ab');
document.write('@');
document.write('ab.com');
//--></script>
JavaScript code could be much more convoluted. Even the use of document.write() can be hidden. But the above should work against any robot that can't read JavaScript.
-
Probably the most effective way of hiding email addresses from robots is to create an image with the address and then put the image on the web page. When you must display your email address on a web page, use this. However, no links can be created with this method.
-
For 100% certainty that robots won't find email addresses on your web pages, eliminate them altogether. Many of the Master Series titles have been created with this in mind. Master Feedback, Master Form, and the two "recommend my site" titles are the most popular of these (see /a/15/pl.pl?cgi). If you still want to display an email address for humans to read, use an image as described above.
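The first two encodings in the list above are mechanical enough to sketch in a few lines. The following is only an illustration, assuming a plain ASCII address like the "ab@ab.com" example. The function names toNumericEntities and toPercentHex are made up for this sketch and are not taken from Advanced Email Link Generator or any other program mentioned here.

```javascript
// Hypothetical helper names, for illustration only.

// Method 1: numeric character references, e.g. "@" becomes "&#64;".
function toNumericEntities(address) {
  return Array.from(address)
    .map((ch) => `&#${ch.codePointAt(0)};`)
    .join('');
}

// Method 2: "%" followed by the two-digit hexadecimal ASCII value,
// e.g. "@" becomes "%40". Assumes the address is plain ASCII.
function toPercentHex(address) {
  return Array.from(address)
    .map((ch) => '%' + ch.charCodeAt(0).toString(16).padStart(2, '0'))
    .join('');
}

console.log(toNumericEntities('ab@ab.com'));
// &#97;&#98;&#64;&#97;&#98;&#46;&#99;&#111;&#109;
console.log(toPercentHex('ab@ab.com'));
// %61%62%40%61%62%2e%63%6f%6d
```

Because each function is a simple character-by-character mapping, the reverse mapping is just as simple, which is why neither encoding should be counted on to stop a determined robot.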
The demonstration to find your hidden email addresses at /a/15/pl.pl?art153demo is designed to find email addresses hidden with the first four methods listed above. If you successfully hide an email address with any of those methods, please let me know the URL of the page so I can upgrade the program.
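To see why the first four methods offer so little protection, here is a minimal self-check sketch. It is not the demonstration program, whose code is not public, and the names normalizeFragment and revealsAddress are hypothetical. It assumes you already know the address you are looking for and only tells you whether a fragment of your own page source gives it away once comments and the two encodings are undone.

```javascript
// Hypothetical self-check; not the article's CGI program.

// Undo the first four hiding methods on a fragment of page source.
function normalizeFragment(source) {
  return source
    // Methods 3 and 4: drop HTML comments, including multi-line ones.
    .replace(/<!--[\s\S]*?-->/g, '')
    // Method 1: decimal numeric character references such as "&#64;".
    .replace(/&#(\d+);/g, (match, dec) => {
      const code = Number(dec);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    })
    // Method 2: percent-encoded hexadecimal such as "%40".
    .replace(/%([0-9a-fA-F]{2})/g, (match, hex) =>
      String.fromCharCode(parseInt(hex, 16)));
}

// Report whether an address you already know can be recovered.
function revealsAddress(source, knownAddress) {
  return normalizeFragment(source)
    .toLowerCase()
    .includes(knownAddress.toLowerCase());
}

console.log(revealsAddress('ab<!-- X -->@<!---->a<!-- -->b.com', 'ab@ab.com')); // true
console.log(revealsAddress('a<!-- hi :) -->b%40%61b.<!---->com', 'ab@ab.com')); // true
```

The order of the steps matters a little: comments are stripped first so that a comment splitting an encoded sequence does not hide it from the later replacements.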
The JavaScript method hasn't yet been incorporated into the program. It will be a bit of work because of the myriad ways JavaScript can be coded, but not impossible.
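As one small illustration of those myriad ways, the sketch below assembles "ab@ab.com" without the address, or even the "@" character, ever appearing as a literal in the source. The function name assembleAddress is hypothetical, and this is just one arbitrary variation among many, not a recommended technique.

```javascript
// Hypothetical illustration; the name assembleAddress is made up.
function assembleAddress() {
  const user = ['b', 'a'].reverse().join('');          // "ab"
  const at = String.fromCharCode(64);                   // "@"
  const host = 'moc.ba'.split('').reverse().join('');   // "ab.com"
  return user + at + host;
}

// In a browser, the result could be passed to document.write(),
// as in the article's own example.
console.log(assembleAddress()); // ab@ab.com
```

A robot that only scans the page text for patterns has nothing to match here. One that actually runs the script sees the address exactly as a browser would.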
If you become aware of a method not addressed above, please let me know. I sincerely want to keep the program updated. It can reveal vulnerable pages to webmasters.
I know of no way to extract email addresses from images. That method will probably be safe for the foreseeable future. (I've quit saying "it's impossible" because I've almost always ended up being proved wrong.)
Do you have vulnerable pages? Use the form at the demonstration page and see.
If you wish, you may offer the same demonstration to your site visitors at no charge. Copy the URL from your browser's address bar and offer it to your site visitors. Put the URL into a link, into a popup, into an iframe, or even into an ebook to offer readers the service of checking their pages for email harvesting vulnerability.
The CGI program underlying the demonstration is not and will not be made available to the public. Otherwise, and contrary to its current purpose, it might be converted into an automated email address harvester.
Will Bontrager
©2002 Bontrager Connection, LLC