Redirect Browsers, Not Bots
Robots and search engine spiders, referred to simply as bots in this article, follow redirects to the destination URL, but only the redirects they recognize.
There are several ways to redirect to another URL. I'll mention three.
The last method mentioned is the one this article is about — a method few bots recognize. It may be employed when you want to redirect only regular browsers, not bots.
Perhaps:
- The current page should be indexed instead of the page at the destination URL.
- The current page is a honey pot for bad bots that browsers should be redirected away from.
- Social media bots sent out to detect if a URL is a redirect should miss the redirect.
Whatever your reason for redirecting browsers and not bots, consider the last of the three methods mentioned below (the Obfuscated JavaScript Redirect).
Header Redirect
A header redirect sends the browser or bot redirect information in the response header. The server generally responds with a 301 or 302 status code, although other 3xx response codes can be used.
Most short URL services use header redirects because they are the fastest redirects of all. Click counters like Short URL V3 also generally use header redirects.
Header redirects are the most widely recognized of all. Virtually every bot with any redirect coding in it at all will recognize them. A header redirect is the method to use when you want bots to proceed to another URL.
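For illustration only, here is a minimal sketch of a header redirect. It assumes a Node.js server, which this article does not otherwise use; the function name sendHeaderRedirect and the example.com destination are hypothetical. The redirect itself is nothing more than a 3xx status code and a Location header.

```javascript
// Hypothetical helper: send a header redirect from a Node.js HTTP handler.
// `res` is an http.ServerResponse (or any object with writeHead and end).
function sendHeaderRedirect(res, destinationUrl, statusCode = 302) {
  // 301 means "moved permanently"; 302 means "found" (temporary).
  res.writeHead(statusCode, { Location: destinationUrl });
  res.end();
}

// Example wiring, assuming Node's built-in http module:
// const http = require("node:http");
// http.createServer((req, res) => {
//   sendHeaderRedirect(res, "https://www.example.com/", 301);
// }).listen(8080);
```

Because the destination arrives in the response header, a bot never has to read the page body to find it.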
Meta Tag Redirect
The meta tag redirect sends the browser to another URL with the meta refresh tag.
Many bots have the ability to find meta-refresh URLs and redirect themselves because it's an easy thing to code.
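To show why this is easy to code, here is a rough sketch of how a simple bot might pull the destination out of a meta refresh tag. The function name findMetaRefreshUrl and the example.com URL are hypothetical, and the regular expression only handles the common form of the tag with http-equiv before content; a real crawler would more likely use an HTML parser.

```javascript
// Hypothetical sketch: find the destination URL of a meta refresh tag
// by scanning the raw HTML text. Returns null when no tag is found.
function findMetaRefreshUrl(html) {
  const pattern =
    /<meta[^>]*http-equiv=["']?refresh["']?[^>]*content=["']?\s*\d+\s*;\s*url=([^"'>\s]+)/i;
  const match = html.match(pattern);
  return match ? match[1] : null;
}

// A typical meta refresh tag with a zero-second delay.
const metaRefreshPage =
  '<meta http-equiv="refresh" content="0; url=https://www.example.com/new-page.html">';

console.log(findMetaRefreshUrl(metaRefreshPage));
// prints "https://www.example.com/new-page.html"
```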
Obfuscated JavaScript Redirect
Some bots scan JavaScript to find URLs within it, in which case they're able to follow them.
When the URL is broken up, however, and not even recorded in the JavaScript, the bot will need to run the JavaScript — not just scan it, but compile and run it — to detect that a browser redirect is coded and what the redirect URL is.
Few bots, if any, will find the redirect URL.
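The difference between scanning and running can be sketched as follows. The scanner here is a hypothetical, deliberately naive one (scanForUrls is not any real bot's code) that looks for complete http or https URLs in the source text. It is run against a plain JavaScript redirect and against the split-up markup presented later in this article.

```javascript
// Hypothetical naive scanner: looks for complete http(s) URLs in raw
// source text without running any of it.
function scanForUrls(source) {
  return source.match(/https?:\/\/[^\s"'<>]+/g) || [];
}

// A plain JavaScript redirect: the whole URL sits in the source.
const plainRedirect =
  'window.location = "https://www.willmaster.com/contact.php";';

// The split-up form: protocol, domain, and file are stored separately.
const splitRedirect = [
  '<div style="display:none;" id="url-protocol">https</div>',
  '<div style="display:none;" id="url-domain">www.willmaster.com</div>',
  '<div style="display:none;" id="url-location">contact.php</div>',
  '<script type="text/javascript">',
  'window.location = document.getElementById("url-protocol").innerHTML +',
  '"://" + document.getElementById("url-domain").innerHTML +',
  '"/" + document.getElementById("url-location").innerHTML;',
  "</script>",
].join("\n");

console.log(scanForUrls(plainRedirect)); // one URL found
console.log(scanForUrls(splitRedirect)); // empty array, nothing found
```

The scanner finds nothing in the second case because no complete URL ever appears in the source. A more determined bot could still guess at the pieces, which is why the claim is "few bots" rather than "no bots."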
How It Works
Three non-displayed divs contain parts of the redirect URL. The JavaScript collects the parts, constructs the URL, and redirects the browser. Because bots generally don't compile and run JavaScript, they never see the assembled URL and aren't redirected.
The Source Code
Here are the three divs and the JavaScript, all in one source code box.
<div style="display:none;" id="url-protocol">https</div>
<div style="display:none;" id="url-domain">www.willmaster.com</div>
<div style="display:none;" id="url-location">contact.php</div>
<script type="text/javascript">
window.location = document.getElementById("url-protocol").innerHTML +
   "://" +
   document.getElementById("url-domain").innerHTML +
   "/" +
   document.getElementById("url-location").innerHTML;
</script>
In the above source code, you'll see three divs. Each has an id value.
- The first div has an id value url-protocol and will contain either http or https. The example contains https.
- The second div has an id value url-domain and will contain the domain name of the URL. The example contains www.willmaster.com.
- The third div has an id value url-location and will either be blank or contain a file location for the URL. The example contains contact.php.
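As a possible variation on the source code above, the URL assembly can be pulled into a small function that also copes with a blank location and with stray whitespace inside the divs. This is a sketch only: the names buildRedirectUrl and redirectFromDivs are hypothetical, and reading textContent instead of innerHTML is an assumption of this sketch, not the article's code.

```javascript
// Hypothetical helper: join the three parts into one URL.
// A blank location yields the bare domain with a trailing slash.
function buildRedirectUrl(protocol, domain, location) {
  // Treat a missing part as blank and strip surrounding whitespace.
  const clean = (text) => (text || "").trim();
  return clean(protocol) + "://" + clean(domain) + "/" + clean(location);
}

// Reads the three divs by id and redirects. `doc` and `win` are passed
// in so the function can also be exercised outside a browser.
function redirectFromDivs(doc, win) {
  const part = (id) => doc.getElementById(id).textContent;
  win.location = buildRedirectUrl(
    part("url-protocol"),
    part("url-domain"),
    part("url-location")
  );
}
```

In a browser, the call would be redirectFromDivs(document, window), made from a script that runs after the three divs exist.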
The JavaScript immediately below the divs uses the content of the divs to create a URL and redirect the browser. In the JavaScript you'll see the id values of the divs, each passed to getElementById.
Because bots generally don't run JavaScript, they aren't redirected.
The JavaScript can be anywhere on your web page so long as it is somewhere below the three divs. Placing it at the bottom of the page, just above the closing </body> tag, can work.
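The ordering matters because the script reads the divs the moment it runs, so they must already exist. If placing the script below the divs is inconvenient, one possible alternative (not part of the original code) is to wait until the browser has parsed the page. The helper name whenDomReady is hypothetical; document.readyState and the DOMContentLoaded event are standard browser features.

```javascript
// Hypothetical helper: run `callback` once the document has been parsed.
// `doc` is the browser's document object, passed in explicitly.
function whenDomReady(doc, callback) {
  if (doc.readyState === "loading") {
    // Still parsing: wait for the browser to finish building the page.
    doc.addEventListener("DOMContentLoaded", callback, { once: true });
  } else {
    // Already parsed ("interactive" or "complete"): run right away.
    callback();
  }
}

// Browser usage (commented out so the sketch also loads outside a browser):
// whenDomReady(document, function () {
//   /* read the three divs here and assign window.location */
// });
```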
When you put the three divs on the page and the JavaScript somewhere below them, the browser will be redirected, but bots generally won't be.
(This article first appeared in Possibilities newsletter.)
Will Bontrager