Harvestable Email Addresses
This 6-line PHP script was designed for two purposes:
- To test your web pages for vulnerable email addresses.
- To show how vulnerable web pages are to email harvesting robots by demonstrating how very little code is required.
The script requires PHP to be configured so that http://... URLs can be used with the function file_get_contents(), which is controlled by the allow_url_fopen setting. Many, if not most, PHP installations are configured that way.
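To check this on a given server, the allow_url_fopen directive can be read with ini_get(). The following is a minimal sketch, not part of the original script; the function name url_fetching_enabled() is made up for illustration.

```php
<?php
// Hypothetical helper (not part of the original script): reports whether
// PHP is configured to let file_get_contents() open http:// URLs.
function url_fetching_enabled() {
    // ini_get() returns the directive's value as a string such as "1" or "",
    // or false if the directive is unknown.
    $value = ini_get('allow_url_fopen');
    // Normalize values like "1", "On", "0", "" to a real boolean.
    return filter_var($value, FILTER_VALIDATE_BOOLEAN);
}

if (url_fetching_enabled()) {
    echo "allow_url_fopen is on; http:// URLs should work.\n";
} else {
    echo "allow_url_fopen is off; file_get_contents() cannot fetch http:// URLs.\n";
}
```

If the setting is off, it normally has to be changed in php.ini or by the hosting company.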
Some harvesting robots are likely to be more sophisticated and may find email addresses where this script does not, such as addresses broken up and reassembled with JavaScript. In other words, if an email address isn't found with this script, it may still be found by other robots.
Here's the script:
<?php
$URL = "http://example.com/webpage.html";
preg_match_all('/[a-zA-Z0-9\-\_\.]+@[a-zA-Z0-9\-\_\.]*[a-zA-Z0-9][a-zA-Z0-9]/',html_entity_decode(rawurldecode(file_get_contents($URL))),$matches);
if( count($matches) and count($matches[0]) ) { echo implode('<br>',$matches[0]); }
else { echo "No email addresses recognized at $URL"; }
?>
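To see what each step of the script contributes, the same decoding and pattern can be run on a string instead of a live page. This is only an illustrative sketch: the function name find_addresses() and the sample markup are made up, and the pattern is copied from the script above.

```php
<?php
// Hypothetical helper: applies the script's decoding and pattern to a string.
function find_addresses($html) {
    // rawurldecode() runs first and turns %40 into "@";
    // html_entity_decode() then turns entities such as &#64; into "@".
    $text = html_entity_decode(rawurldecode($html));
    preg_match_all(
        '/[a-zA-Z0-9\-\_\.]+@[a-zA-Z0-9\-\_\.]*[a-zA-Z0-9][a-zA-Z0-9]/',
        $text,
        $matches
    );
    return $matches[0];
}

// Made-up sample markup: one URL-encoded, one entity-encoded, one plain address.
$sample = '<a href="mailto:will%40example.com">Write</a> '
        . '<p>sales&#64;example.com</p> '
        . '<p>plain@example.org</p>';

echo implode("\n", find_addresses($sample)), "\n";
```

All three addresses in the sample are recognized, which suggests that URL encoding or numeric entities alone do little to hide an address from a robot built along these lines.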
To use the script, follow these steps:
- Copy the code and replace "http://example.com/webpage.html" with the URL of your own web page.
- Paste the code into a new or existing PHP page and upload it to your server.
- Type the uploaded page's URL into your browser.
Remember, even if this script doesn't find an email address on your page, other robots may still find it.
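As an illustration of that limit, the script's pattern can be tried on an address that only exists after JavaScript runs. The sample markup below is made up, and the pattern is copied from the script.

```php
<?php
// Made-up sample: the address is assembled in the browser, so the raw
// page source never contains it in one piece.
$page = "<script>document.write('will' + '@' + 'example.com');</script>";

// Same decoding and pattern as the script, applied to the sample string.
$count = preg_match_all(
    '/[a-zA-Z0-9\-\_\.]+@[a-zA-Z0-9\-\_\.]*[a-zA-Z0-9][a-zA-Z0-9]/',
    html_entity_decode(rawurldecode($page)),
    $matches
);

if ($count) {
    echo implode("\n", $matches[0]), "\n";
} else {
    echo "No email addresses recognized\n";
}
```

The pattern finds nothing here because the "@" is surrounded by quote characters in the page source. A robot that executes the JavaScript, or one written to recognize this kind of string concatenation, could still recover the address.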
Will Bontrager