HTTP / HTTPS, WWW / Non-WWW Sites, & Mixed Case URLs

This page explains the duplicate content issues caused by HTTP and HTTPS, www and non-www sites, and mixed case URLs.

HTTP and HTTPS

We could get technical and talk about protocols and security, but the details of how HTTP and HTTPS differ don't really matter for duplicate content and SEO.

What matters is that the two URLs below are two completely different websites. If both serve up the same content, they create a duplicate content issue. Note the use of HTTP in the first URL and HTTPS in the second.

  • http://www.duplicate-content.org/
  • https://www.duplicate-content.org/
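As a rough illustration of the point above, the following Python sketch flags pairs of URLs that are identical apart from the protocol. The function name `differs_only_by_scheme` is made up for this example; it is one possible way to spot such pairs in a list of crawled URLs, not a standard tool.

```python
from urllib.parse import urlsplit


def differs_only_by_scheme(url_a: str, url_b: str) -> bool:
    """Return True when two URLs match except for http vs https.

    Hypothetical helper for illustration: the host is compared
    case-insensitively, the path and query string exactly.
    """
    a, b = urlsplit(url_a), urlsplit(url_b)
    same_rest = (a.netloc.lower(), a.path, a.query) == (
        b.netloc.lower(),
        b.path,
        b.query,
    )
    # One URL must be http and the other https.
    return same_rest and {a.scheme, b.scheme} == {"http", "https"}


print(
    differs_only_by_scheme(
        "http://www.duplicate-content.org/",
        "https://www.duplicate-content.org/",
    )
)
```

A pair that returns True here is a candidate duplicate: both URLs are worth checking to see whether they serve the same content.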

An illustration

Many people don't know that a single domain can host two different sites, one on HTTP and the other on HTTPS, but it certainly can.

Take a look at the screenshots below. Note the domain that is shared between them, and that one site resides at HTTP and the other at HTTPS. Ignore the parameter in the HTTPS version. That's a separate issue that we will get to later.

In the example below, the technical architecture is set up correctly and duplicate content issues are avoided. Whether it is a good idea to host the content on the HTTPS site is another topic.

http vs https

www and non-www

Similar to HTTP and HTTPS, you can have different sites living at the www and non-www versions of a domain. Note that of the 2 URLs below, one has www and the other does not. You should choose one as your canonical domain. Either www or non-www will do. What is most important is that you consistently communicate the canonical domain to Google and other search engines, both in all of your technical architecture configurations and in your internal linking.

  • http://www.duplicate-content.org/
  • http://duplicate-content.org/
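To make the idea of a canonical domain concrete, here is a minimal Python sketch that rewrites either form of the host to one chosen version. It assumes the www version is canonical (either choice works); `CANONICAL_HOST`, `HOST_ALIASES` and `canonical_url` are names invented for this example. For simplicity it drops any port number and leaves the protocol untouched.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the www host is the canonical one.
CANONICAL_HOST = "www.duplicate-content.org"
HOST_ALIASES = {"duplicate-content.org", "www.duplicate-content.org"}


def canonical_url(url: str) -> str:
    """Rewrite a www or non-www URL of this site to the canonical host."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in HOST_ALIASES:
        # Not one of our hosts: leave the URL unchanged.
        return url
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, parts.fragment)
    )


print(canonical_url("http://duplicate-content.org/"))
```

A helper like this could be used when generating internal links, so that every link on the site points at the same version of the domain.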

An illustration

With the help of the Firefox plugin Live HTTP Headers, you can see in the screenshot below how http://duplicate-content.org redirects to http://www.duplicate-content.org with a 301 Redirect.

301 redirect of non-www to www

In this example, the technical architecture is set up correctly and duplicate content issues are avoided. It should be; after all, this site is about duplicate content : )
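If you would rather check a redirect from a script than from a browser plugin, the Python sketch below is one way to do it with the standard library. The function names `fetch_status_and_location` and `is_301_to_host` are hypothetical, and the first function needs network access to run.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status_and_location(host: str, path: str = "/"):
    """Send a HEAD request over plain HTTP without following redirects."""
    conn = http.client.HTTPConnection(host, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def is_301_to_host(status: int, location, expected_host: str) -> bool:
    """Return True for a 301 whose Location header points at expected_host."""
    if status != 301 or not location:
        return False
    return (urlsplit(location).hostname or "").lower() == expected_host.lower()


# Example usage (requires network access):
# status, location = fetch_status_and_location("duplicate-content.org")
# print(is_301_to_host(status, location, "www.duplicate-content.org"))
```

A 302 or a 200 on the non-canonical host would make `is_301_to_host` return False, which is the case worth investigating.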

Mixed case URLs and duplicate content

You can add a tremendous amount of duplicate content (and decrease site performance, another Google ranking factor) if your server serves the same page for both uppercase and lowercase versions of a URL. This is especially true in larger organizations where there are multiple hands in the web maintenance cookie jar. One mistake in creating a new URL or linking to a mixed case URL could lead to a slew of duplicate content issues.

The best thing to do is apply a server level rule that states:

If a URL is requested in anything other than all lowercase, then 301 Redirect that URL to the all lowercase version of that URL.
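How you write this rule depends on your web server. As one possible sketch, here is the same logic expressed as a small Python WSGI middleware. The name `lowercase_redirect` is made up for this example. It lowercases only the path, leaves the query string as-is, and ignores any mount prefix (`SCRIPT_NAME`), so treat it as a starting point rather than a drop-in rule.

```python
def lowercase_redirect(app):
    """Wrap a WSGI app so mixed case paths get a 301 to the lowercase path."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "")
        lowered = path.lower()
        if path != lowered:
            # Keep the query string unchanged; only the path is lowercased.
            query = environ.get("QUERY_STRING", "")
            location = lowered + ("?" + query if query else "")
            start_response(
                "301 Moved Permanently",
                [("Location", location), ("Content-Length", "0")],
            )
            return [b""]
        # Already lowercase: hand the request to the wrapped application.
        return app(environ, start_response)

    return middleware
```

With this wrapper in place, a request for /About/Us.HTML would be answered with a 301 pointing to /about/us.html, while /about/us.html itself is served normally.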

It's also a good idea to establish house naming conventions for web pages and files that require lowercase names.

An illustration

The Microsoft page below can be accessed with a variety of letter cases. Try it for yourself; all 3 URLs go to the same content.

This technical architecture is not set up correctly and is an example of a duplicate content issue waiting to happen.

Duplicate-content.org | Made in San Francisco, California © 2012