Duplicate content

from Wikipedia, the free encyclopedia

Duplicate content refers to the same content appearing on different web pages. This applies to pages on the same domain as well as to pages on different domains.

Search engines filter out duplicate content and sometimes even rate it negatively. Unique content is the opposite of duplicate content.

Causes

Duplicate content can arise when several URLs display the same content. This can be the case, for example, if GET parameters are appended to a URL in a different order: www.example.com/index.php?a=1&b=2 and www.example.com/index.php?b=2&a=1 usually deliver identical pages but are different URLs. Search engines see two addresses with the same content and will display only one of these pages in their search results.
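The parameter-order case can be illustrated with a short Python sketch that rewrites a URL so its query parameters always appear in sorted order. The function name normalize_query_order is a hypothetical helper chosen for this illustration, and the sketch assumes that parameter order has no meaning for the application, which is not true of every website.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_query_order(url: str) -> str:
    """Return the URL with its query parameters sorted by name, then value.

    Hypothetical helper: assumes the application ignores parameter order.
    """
    parts = urlsplit(url)
    # keep_blank_values=True keeps parameters such as "a=" instead of dropping them
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))


first = normalize_query_order("http://www.example.com/index.php?a=1&b=2")
second = normalize_query_order("http://www.example.com/index.php?b=2&a=1")
print(first == second)  # both variants collapse to the same URL
```

A site could use such a normalized form as the single address it links to internally, so that crawlers encounter only one variant.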

Another frequently encountered form of duplicate content arises when a website can be reached both with and without the www subdomain (e.g. http://www.example.com/ and http://example.com/). This problem usually affects every single subpage of the website automatically.

Solution

The solution is to set up a redirect so that, for example, a request for http://www.example.com/ is redirected to http://example.com/. For this purpose, http://www.example.com/ should return the HTTP status code 301 so that the search engines' web crawlers recognize the redirect as permanent.
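As a rough sketch of such a redirect, the following Python WSGI middleware answers requests for the www host with a 301 response pointing at the bare domain. The names CANONICAL_HOST and redirect_www are assumptions made for this example, and the sketch assumes the Host header carries no port number. In practice the redirect is often configured in the web server itself rather than in application code.

```python
CANONICAL_HOST = "example.com"  # assumed preferred host name


def redirect_www(app):
    """Wrap a WSGI app so requests for www.<host> get a 301 to the bare host."""

    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "").lower()
        if host == "www." + CANONICAL_HOST:
            scheme = environ.get("wsgi.url_scheme", "http")
            location = scheme + "://" + CANONICAL_HOST + environ.get("PATH_INFO", "/")
            query = environ.get("QUERY_STRING", "")
            if query:
                # Preserve the query string so subpages redirect one-to-one
                location += "?" + query
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # Any other host is passed through to the wrapped application
        return app(environ, start_response)

    return middleware
```

Because the path and query string are carried over, every subpage of the www variant redirects to its counterpart on the bare domain.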

The canonical link can be used to tell the search engine under which URL the "original" page is located. This tag is used whenever a page must remain accessible via different URLs, for example for the print versions of a website where separate pages were used instead of CSS media queries. Canonical links can be set for HTML pages as well as for non-HTML documents such as Office or PDF files. In the latter case, however, the canonical link must be sent in the HTTP header via the configuration of the web server. To avoid problems with pagination, the link relations rel="next" and rel="prev" are used, which express the relationship between a main category page and its pagination pages.
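The two forms of the canonical link mentioned above can be sketched in Python as follows. The helper names canonical_link_element and canonical_link_header are hypothetical, and the PDF URL is a placeholder. The first function builds the element placed in the head of an HTML page, the second builds the HTTP response header a server could send for non-HTML documents.

```python
from html import escape


def canonical_link_element(url: str) -> str:
    """Build the <link> element for the <head> of an HTML page."""
    return '<link rel="canonical" href="' + escape(url, quote=True) + '">'


def canonical_link_header(url: str) -> tuple:
    """Build a (name, value) HTTP header pair for non-HTML documents."""
    return ("Link", "<" + url + '>; rel="canonical"')


print(canonical_link_element("http://example.com/page"))
print(canonical_link_header("http://example.com/files/report.pdf"))
```

How the header is actually attached to a response depends on the web server in use, so this sketch only shows the value to be sent.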

Consequences

Google distinguishes between maliciously and non-maliciously duplicated content. Non-malicious duplication includes, for example, different URLs for different end devices, store items that are displayed or linked via several distinct URLs, and print versions of web pages. Under certain conditions, however, Google treats duplicate content as a problem: “Occasionally, however, content is deliberately duplicated on different domains with the intention of influencing search engine rankings or attracting more hits. Such unfair behavior can lead to a negative user experience, as visitors see essentially the same content in a series of search results.”

References

  1. Google: Duplicated content. Retrieved November 23, 2017.
  2. Google Webmaster Tools: Pagination.
