Professional Search Engine Optimization (Seo). Developer’s Guide to SEO


To sanitize user input, you simply call the sanitizeHTML() function on the user-provided input. It will strip any tags that are not in the variable $allowed_tags, as well as common attributes that can be cleverly used to execute JavaScript.

Without running sanitizeHTML() over the input HTML, the cleverly constructed HTML would redirect the visitor to http://too.much.spam/. The onerror event handler is executed when an error occurs. Because the image INVALID-IMAGE does not exist, loading it fails, and the browser executes the onerror handler, location.href='http://too.much.spam/', which causes the redirection.

After executing sanitizeHTML(), onerror is replaced with SANITIZED, and nothing occurs.
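The behavior described above can be approximated with a short sketch. This is not the book's implementation of sanitizeHTML(): the function name sketchSanitizeHTML(), the contents of $allowed_tags, and the regular expression are all assumptions chosen for illustration.

```php
<?php
// Hypothetical allowlist; the actual value of $allowed_tags is not shown in this section.
$allowed_tags = '<p><a><b><i><em><strong><img>';

/**
 * Sketch of a sanitizer in the spirit of sanitizeHTML(); an assumption, not the book's code.
 */
function sketchSanitizeHTML(string $inHTML, string $allowedTags): string
{
    // Remove every tag that is not on the allowlist (text inside removed tags is kept).
    $html = strip_tags($inHTML, $allowedTags);

    // Rename inline event-handler attributes (onerror, onclick, ...) so they never fire.
    return preg_replace('/\bon[a-z]+(\s*=)/i', 'SANITIZED$1', $html);
}

// The payload described above: a missing image whose onerror handler redirects the visitor.
$inHTML = '<p>Sanitizing <img src="INVALID-IMAGE" ' .
          'onerror="location.href=\'http://too.much.spam/\'">!</p>';

$sanitized = sketchSanitizeHTML($inHTML, $allowed_tags);
echo $sanitized, "\n";
```

Because the pattern matches any word that starts with "on" and is followed by an equals sign, it can also rewrite harmless text such as "one = 1" in a comment. For a stopgap filter that trade-off is probably acceptable, but it is one reason the result should not be treated as a complete defense.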

The sanitizeHTML() function does not typically return valid HTML. In practice, this does not matter, because the function is designed as a stopgap to prevent spam. The modified HTML is unlikely to cause problems in browsers or search engines, and the content would eventually be deleted or edited by the site owner anyway.

Having such “black hat” content within a web site can damage its reputation with both human visitors and search engines. Embedding JavaScript-based redirects can raise red flags in search engine algorithms and may result in penalties and web site bans. It is therefore of the utmost importance to address and mitigate these concerns.

Note that the nofollow library was not used in this latest example, but you could combine nofollow with sanitize to obtain a better result, like this:

// display third comment
$inHTML = '<p>Sanitizing <img src="INVALID-IMAGE" ' .
          'onerror="location.href=\'http://too.much.spam/\'">!</p>';
// sanitize first, then mark any remaining links as nofollow
$sanitized = noFollowLinks(sanitizeHTML($inHTML));
echo $sanitized;
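To make the combination concrete, here is a minimal, self-contained sketch that applies both steps to a comment containing a link. Both functions are hypothetical stand-ins: sketchNoFollowLinks() and sketchStripHandlers() are assumed names, and the real noFollowLinks() and sanitizeHTML() may behave differently.

```php
<?php
// Hypothetical stand-in for noFollowLinks(); the real library function may differ.
function sketchNoFollowLinks(string $html): string
{
    // Add rel="nofollow" to every <a ...> tag that does not already carry a rel attribute.
    return preg_replace('/<a\b(?![^>]*\brel\s*=)([^>]*)>/i', '<a$1 rel="nofollow">', $html);
}

// Hypothetical stand-in for sanitizeHTML(): it only renames inline event-handler attributes.
function sketchStripHandlers(string $html): string
{
    return preg_replace('/\bon[a-z]+(\s*=)/i', 'SANITIZED$1', $html);
}

// Illustrative comment: a spam link that also carries an inline event handler.
$comment = '<p>Visit <a href="http://too.much.spam/" onclick="steal()">my site</a>!</p>';

// Same order as in the text: sanitize first, then add nofollow to the links that remain.
$result = sketchNoFollowLinks(sketchStripHandlers($comment));
echo $result, "\n";
```

Sanitizing first means the nofollow pass only has to deal with markup that survived the filter, which keeps the two steps independent of each other.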

Lastly, your implementations of noFollowLinks() and sanitizeHTML() will not exhaustively block every attack, nor will they allow the flexibility some programmers require. They do, however, make a spammer's life much more difficult, and he or she will likely proceed to an easier target. A project called safehtml by Pixel-Apes is a more robust solution. It is open-source and written in PHP. You can find it at http://pixel-apes.com/safehtml/.

Requesting Human Input

One common problem webmasters and developers need to consider is automated spam robots, which submit comments on unprotected blogs and other web sites that support comments.

The typical solution to this problem is a “CAPTCHA” image, which requires the visitor to read a graphical version of text that has been obfuscated in some way. A typical human can read the image, but an automated script cannot. Unfortunately, this approach presents usability problems, because blind users can no longer access the functionality it protects. For more information on this type of CAPTCHA, visit http://freshmeat.net/projects/kcaptcha/. An improvement on this


Chapter 8: Black Hat SEO
