
To prevent this, the following meta tag should be added to the <head> section of all cloaked documents:
<meta name="robots" content="noarchive" />
If you are cloaking only for a specific spider, you can use a tag like the following. (This can also be applied for noindex,nofollow, as shown in Chapter 5.)
<meta name="googlebot" content="noarchive" />
This prevents the cache from being stored or displayed to users. Notably, The New York Times also uses this tag to prevent people from reading its content through the search engines' cache.
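As a quick illustration of emitting the tag conditionally, the following hypothetical helper (not part of this chapter's library; the function name is an assumption) outputs the noarchive directive only when the visitor's user agent matches Googlebot:

```php
<?php
// Hypothetical helper (not from the chapter's library): emit a
// noarchive meta tag only when the user agent matches Googlebot.
// Matching on user agent alone is unreliable; this chapter's
// cloaking library also verifies IP addresses.
function metaNoArchive($userAgent)
{
    if (stripos($userAgent, 'googlebot') !== false) {
        return '<meta name="googlebot" content="noarchive" />';
    }
    return '';
}

echo metaNoArchive($_SERVER['HTTP_USER_AGENT'] ?? '');
```

In a real page you would call this inside the <head> section, so that regular visitors receive no extra markup at all.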
Implementing Cloaking
In the upcoming exercise, you implement a simple cloaking library in the form of a class named SimpleCloak. This class exposes two functions that you can call from your web applications:
- updateAll() updates your cloaking database with search engine IP and user-agent data
- ipSpider() verifies whether the visitor is a search engine spider
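The two functions listed above could be organized as in the following skeleton. This is only a sketch of the interface; the method bodies are placeholders, and the actual implementation is built step by step in the exercise:

```php
<?php
// Sketch of the SimpleCloak interface described above; the method
// bodies here are placeholders, not the exercise's implementation.
class SimpleCloak
{
    // Refreshes the cloaking database with search engine IP and
    // user-agent data (retrieved from iplists.com in the exercise).
    public static function updateAll()
    {
        // ... download the spider lists and store them locally ...
    }

    // Returns true when the current visitor appears to be a search
    // engine spider, based on its IP address and user agent.
    public static function ipSpider()
    {
        // ... match $_SERVER['REMOTE_ADDR'] and the user agent
        //     against the stored spider data ...
        return false; // placeholder
    }
}
```

Making both functions static keeps the calling code simple: pages can call SimpleCloak::ipSpider() without constructing an object first.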
The cloaking data is retrieved from Dan Kramer's iplists.com. Kudos to Dan for providing such a useful set of data for everyone to use!
To test the SimpleCloak library, you'll create a script named cloaking_test.php, which will produce the output shown in Figure 11-1 when read by a "normal" visitor, and the output shown in Figure 11-2 when read by a search engine.
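A minimal test script along these lines could look as follows. This is a sketch, not the exercise's cloaking_test.php: the file name simple_cloak.php and the output messages are assumptions, and a stub class is defined so the sketch runs standalone when the library file is absent:

```php
<?php
// Sketch of a cloaking test script: show different content to
// spiders and to regular visitors. The messages and the library
// file name (simple_cloak.php) are illustrative assumptions.
if (file_exists('simple_cloak.php')) {
    require_once 'simple_cloak.php';
}
if (!class_exists('SimpleCloak')) {
    // Stub so this sketch runs without the exercise's library;
    // it treats every visitor as a normal (non-spider) visitor.
    class SimpleCloak
    {
        public static function ipSpider()
        {
            return false;
        }
    }
}

if (SimpleCloak::ipSpider()) {
    echo "Content served to a search engine spider\n";
} else {
    echo "Content served to a normal visitor\n";
}
```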
Figure 11-1
Chapter 11: Cloaking, Geo-Targeting, and IP Delivery

