Fixing Duplicate Content So Google Doesn't See It.

Plenty of blogs offer a handy code snippet designed to ensure Google indexes only the posts on your website and not the duplicate copies on your category and tag pages. Unfortunately, the code they're using has a critical flaw, as my Google Webmaster Tools report shows.


Even with the snippet in place, I keep getting duplicate content notifications.

The code is here:

<?php if (is_home() || is_single() || is_page()) {
    echo '<meta name="robots" content="index,follow">';
} else {
    echo '<meta name="robots" content="noindex,follow">';
} ?>

You can find it on a ton of blogs.

It places the <meta name="robots" content="index,follow"> tag on your home page, content pages and post pages, which tells robots to index the page. On every other page it places <meta name="robots" content="noindex,follow">, which tells robots to follow the links but leave the page out of their index. This is great except, what happens when you create a post that's in two categories on your website?
My post, Meta Tags that Kill Your Blog, is found at:
  1. http://www.thisismyurl.com/wordpress/meta-tags-that-kill-your-blog/ and;
  2. http://www.thisismyurl.com/web-advice/meta-tags-that-kill-your-blog/
Google is counting this page twice and will most likely assume that I'm trying to pull a fast one on the robot when in fact, I'm simply trying to help users by listing the content in two categories. The solution? We have to tell Google to ignore one of the two URLs.
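To make the flaw concrete, here is a minimal sketch of the original rule as plain PHP that runs outside WordPress. The function name original_robots_rule() and the page-type labels are my own stand-ins for is_home(), is_single() and is_page(); they are assumptions for illustration, not WordPress values.

```php
<?php
// Hypothetical stand-in for the original snippet's decision.
// $pageType is an assumed label ('home', 'single', 'page', 'category', ...),
// playing the role of is_home(), is_single() and is_page().
function original_robots_rule(string $pageType): string
{
    $indexable = ['home', 'single', 'page'];
    return in_array($pageType, $indexable, true) ? 'index,follow' : 'noindex,follow';
}

// Both URLs serve the same single post, so both are treated as 'single'.
$urls = [
    'http://www.thisismyurl.com/wordpress/meta-tags-that-kill-your-blog/'  => 'single',
    'http://www.thisismyurl.com/web-advice/meta-tags-that-kill-your-blog/' => 'single',
];

foreach ($urls as $url => $pageType) {
    echo $url, ' => ', original_robots_rule($pageType), PHP_EOL;
}
```

Both lines print index,follow, which is exactly the problem: the rule only looks at the type of page, so it cannot tell the two URLs of the same post apart.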
To tell Google which URL to ignore, we first need to edit the earlier code sample to give us a place to add our new code:

if (is_home()) { echo '<meta name="robots" content="index,follow" />'; }
elseif (is_page()) { echo '<meta name="robots" content="index,follow" />'; }
elseif (is_single()) {
    // the new check for single posts goes here
} else { echo '<meta name="robots" content="noindex,follow" />'; }

 

Now we know that we're on a single (that's a post), so we can test whether this URL is the one we want indexed. I struggled with how to do this but finally decided that the first category would have to be the most important category. Hopefully that's the best way, but I'm always open to suggestions.

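$category = get_the_category();
// strpos() returns false when there is no match, so compare against false rather than 0.
if (!empty($category) && strpos($_SERVER['REQUEST_URI'], $category[0]->category_nicename . '/') !== false) {
    echo '<meta name="robots" content="index,follow" />';
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}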
What I've done is take the first category ($category[0]) and check its category_nicename against the current URL. If we're on the URL of the first category, we tell Google to index our content; otherwise we tell it to ignore our content. I'm sure that I'll have to fix this at a later date but it's worked so far. If you have a better way to do it, let me know.
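The same check can be tried outside WordPress. The sketch below is a hypothetical helper, robots_for_single(), that takes the request URI and the first category's slug as plain strings. Treating "wordpress" as the first category of the example post is an assumption on my part.

```php
<?php
// Hypothetical helper: decides the robots value for a single post.
// $requestUri plays the role of $_SERVER['REQUEST_URI'];
// $primarySlug plays the role of $category[0]->category_nicename.
function robots_for_single(string $requestUri, string $primarySlug): string
{
    // Ignore any query string and look at the path only.
    $path = (string) parse_url($requestUri, PHP_URL_PATH);

    // Match the slug as a whole path segment, slashes on both sides.
    $isPrimary = $primarySlug !== ''
        && strpos($path, '/' . $primarySlug . '/') !== false;

    return $isPrimary ? 'index,follow' : 'noindex,follow';
}

// Assumption: "wordpress" is the post's first category.
echo robots_for_single('/wordpress/meta-tags-that-kill-your-blog/', 'wordpress'), PHP_EOL;
// prints "index,follow"
echo robots_for_single('/web-advice/meta-tags-that-kill-your-blog/', 'wordpress'), PHP_EOL;
// prints "noindex,follow"
```

This variation puts a slash on both sides of the slug, so a category called "advice" would not accidentally match a URL under /web-advice/. Whether that extra strictness is needed depends on how your category slugs are named.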
Here’s the final code:
<?php
if (is_home()) { echo '<meta name="robots" content="index,follow" />'; }
elseif (is_page()) { echo '<meta name="robots" content="index,follow" />'; }
elseif (is_single()) {
    // Index the post only when it is viewed under its first category.
    $category = get_the_category();
    if (!empty($category) && strpos($_SERVER['REQUEST_URI'], $category[0]->category_nicename . '/') !== false) {
        echo '<meta name="robots" content="index,follow" />';
    } else {
        echo '<meta name="robots" content="noindex,follow" />';
    }
} else { echo '<meta name="robots" content="noindex,follow" />'; }
?>
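If you would rather not paste this into your theme's header, one possible alternative is to register the same logic on the wp_head action from functions.php. This is only a sketch under assumptions: the function names my_robots_meta_tag() and my_robots_meta() are made up for illustration, and your theme has to call wp_head() in its header for the hook to fire.

```php
<?php
// Builds the tag. Kept separate so it can be checked outside WordPress.
function my_robots_meta_tag(string $content): string
{
    return '<meta name="robots" content="'
        . htmlspecialchars($content, ENT_QUOTES) . '" />' . "\n";
}

// Hypothetical callback name; mirrors the decision logic of the final code.
function my_robots_meta(): void
{
    $content = 'noindex,follow';

    if (is_home() || is_page()) {
        $content = 'index,follow';
    } elseif (is_single()) {
        // Index the post only under its first category's URL.
        $category = get_the_category();
        if (!empty($category)
            && strpos($_SERVER['REQUEST_URI'], $category[0]->category_nicename . '/') !== false) {
            $content = 'index,follow';
        }
    }

    echo my_robots_meta_tag($content);
}

// add_action() only exists inside WordPress, so guard the registration.
if (function_exists('add_action')) {
    add_action('wp_head', 'my_robots_meta');
}
```

If you go this route, remove the snippet from the header template so the page does not end up with two robots tags.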

About the author
Christopher Ross - Christopher Ross is a technology evangelist living in Fredericton, Canada who travels the world preaching the benefits of technology to business professionals. When he's not writing or blogging about the future of the industry, he's busy building websites or helping businesses market themselves on the Internet.

4 Responses to “Fixing Duplicate Content So Google doesn’t See It.”

Comments

  1. Internet Marketing Do-Follow Blog on October 26, 2008 at 3:56 am

    Isn’t restricting crawls for tags and categories in a robots.txt file not enough? I have the first code you mention in my theme and I also have the robots file.

  2. malcolm coles on February 11, 2009 at 7:29 pm

    Wouldn’t it be simpler to leave the category out of your permalink? That way you’d have the post on just one URL, regardless of how many categories you had it in?
