Cloaking

From Wikipedia, the free encyclopedia

For the cloaking used in science fiction, see cloaking device.

Cloaking is a black hat search engine optimization (SEO) technique in which the content presented to the search engine spider is different from that presented to the user's browser. This is done by delivering content based on the IP address or the User-Agent HTTP header of the user requesting the page. When a requester is identified as a search engine spider, a server-side script delivers a different version of the web page, one that contains content not present on the visible page. The purpose of cloaking is to deceive search engines so they display the page when it would not otherwise be displayed.
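As an illustration, a minimal server-side check might look like the following sketch in Python, written against the standard WSGI interface. The crawler tokens and page bodies are hypothetical placeholders, not the behaviour of any particular site or search engine.

    # Hypothetical sketch: serve a different page to clients whose User-Agent
    # matches a known crawler token. Tokens and page bodies are placeholders.
    KNOWN_SPIDER_TOKENS = ("Googlebot", "Slurp", "msnbot")   # assumed examples

    SPIDER_PAGE = "<html><body>Keyword-rich content shown only to spiders</body></html>"
    VISITOR_PAGE = "<html><body>The page ordinary visitors actually see</body></html>"

    def app(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        is_spider = any(token in user_agent for token in KNOWN_SPIDER_TOKENS)
        # Spiders receive the hidden version; everyone else sees the normal page.
        body = SPIDER_PAGE if is_spider else VISITOR_PAGE
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [body.encode("utf-8")]

In practice such checks are usually combined with lists of known spider IP addresses, since the User-Agent header is trivially spoofed.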

Historically, the only legitimate use for cloaking was to deliver content in formats that search engines could not parse, such as Adobe Flash. As of 2006, better methods of accessibility, including progressive enhancement, are available, so cloaking is no longer necessary for this purpose. Cloaking is often used as a spamdexing technique, to try to trick search engines into giving the relevant site a higher ranking; it can also be used to trick search engine users into visiting a site based on its search engine description, when the site turns out to have substantially different, or even pornographic, content. For this reason, major search engines consider deceptive cloaking to be a violation of their guidelines, and therefore delist sites when deceptive cloaking is reported.[1][2][3][4][5]

Cloaking is a form of the doorway page technique.

A similar technique is also used on the Open Directory Project web directory. It differs in several ways from search engine cloaking:

  • It is intended to fool human editors, rather than computer search engine spiders.
  • The decision to cloak or not is often based upon the HTTP referrer, the user agent or the visitor's IP address; more advanced techniques can also be based on an analysis of the client's behaviour after a few page requests: the raw number of, ordering of, and latency between subsequent HTTP requests sent to a website's pages, plus whether the client checks for a robots.txt file, are some of the parameters in which search engine spiders differ markedly from natural user behaviour. The referrer gives the URL of the page on which a user clicked a link to get to the page. Some cloakers will give the fake page to anyone who comes from a web directory website, since directory editors will usually examine sites by clicking on links that appear on a directory web page. Other cloakers give the fake page to everyone except those coming from a major search engine; this makes it harder to detect cloaking, while not costing them many visitors, since most people find websites by using a search engine.
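A sketch of the referrer-based decision described in the list above, with hypothetical domain lists, could look like this:

    # Hypothetical referrer-based cloaking decision; the domain lists are
    # placeholders, not data about any real directory or search engine.
    from urllib.parse import urlparse

    DIRECTORY_HOSTS = {"directory.example.org"}               # assumed directory host
    SEARCH_ENGINE_HOSTS = {"google.com", "search.yahoo.com"}  # assumed examples

    def choose_page(referrer_url):
        host = urlparse(referrer_url or "").netloc.lower()
        if any(host.endswith(d) for d in DIRECTORY_HOSTS):
            return "fake_page"   # directory editors are shown the decoy page
        if any(host.endswith(s) for s in SEARCH_ENGINE_HOSTS):
            return "real_page"   # visitors arriving from a search engine see the real page
        return "fake_page"       # everyone else gets the decoy, making detection harder

Which branch returns the fake page depends on the strategy: cloakers targeting directory editors key on directory referrers, while those hiding from casual inspection serve the decoy to everyone who did not arrive from a major search engine.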


The Black Hat Perspective

Increasingly, for a page that lacks the natural popularity of compelling or rewarding content to rank well in the search engines, webmasters may be tempted to design pages solely for the search engines. This results in pages with too many keywords and other factors that might be search engine "friendly" but that make the pages difficult for actual visitors to consume. As such, black hat SEO practitioners consider cloaking to be an important technique that allows webmasters to split their efforts and separately target the search engine spiders and human visitors.

Cloaking versus IP Delivery

IP delivery can be considered a more benign variation of cloaking, where different content is served based upon the requester's IP address. With cloaking, search engines and people never see each other's pages, whereas with other uses of IP delivery both search engines and people can see the same pages.

One use of IP delivery is to determine the requester's location and deliver content specifically written for that country. This use is not necessarily cloaking. For instance, Google uses IP delivery for its AdWords and AdSense advertising programs in order to target users in different geographic locations.
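For example, a geographic IP delivery handler might look like the following sketch; the network-to-country table is made-up sample data, where a real deployment would query a maintained geolocation database.

    # Hypothetical geographic IP delivery; the network-to-country data is invented.
    import ipaddress

    COUNTRY_BY_NETWORK = {
        ipaddress.ip_network("203.0.113.0/24"): "AU",
        ipaddress.ip_network("198.51.100.0/24"): "US",
    }

    def country_for(client_ip, default="US"):
        addr = ipaddress.ip_address(client_ip)
        for network, country in COUNTRY_BY_NETWORK.items():
            if addr in network:
                return country
        return default

    def page_for(client_ip):
        # The same underlying page exists in every region; only locale-specific
        # details differ, which is why this use is not necessarily cloaking.
        return f"regional_home_{country_for(client_ip).lower()}.html"

Since spiders and humans from the same region receive the same page, this is IP delivery rather than cloaking in the sense described above.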

As a means of determining the language(s) in which to provide content, IP delivery is a crude and unreliable method; many countries and regions are multilingual, or the requester may be a foreign national. A better method of content negotiation is to examine the client's Accept-Language HTTP header.
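A minimal sketch of such negotiation, parsing the header's q-values and falling back to a default language, might look like this (illustrative only, not any particular framework's implementation):

    # Hypothetical Accept-Language negotiation: pick the highest-ranked
    # language tag whose primary subtag the site supports.
    def preferred_language(accept_language, supported=("en", "fr", "de"), default="en"):
        ranked = []
        for part in (accept_language or "").split(","):
            piece = part.strip()
            if not piece:
                continue
            lang, _, params = piece.partition(";")
            quality = 1.0
            if params.strip().startswith("q="):
                try:
                    quality = float(params.strip()[2:])
                except ValueError:
                    quality = 0.0
            ranked.append((quality, lang.strip().lower()))
        for _, lang in sorted(ranked, reverse=True):
            if lang.split("-")[0] in supported:
                return lang.split("-")[0]
        return default

    # Example: "fr-CH, fr;q=0.9, en;q=0.8" resolves to "fr" when French is supported.

This respects the visitor's stated preference regardless of where the request originates.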

As of 2006, many well-known and well-respected sites have taken up IP delivery to personalise content for their regular customers. In fact, many of the top 1000 sites, including household names like Amazon (amazon.com), actively use IP delivery. None of these have been banned from search engines, because their intention is not deceptive.

See also

References

External links