Fixing duplicate content so Google doesn’t see it.
Plenty of blogs publish a handy code snippet designed to ensure Google indexes only the posts on your website and not the duplicate copies on your category and tag pages. Unfortunately, the code they’re using has a critical flaw, as my Google Webmaster Tools report shows: even with the snippet in place, I keep getting duplicate content notifications.
The code is here:
<?php if (is_home() || is_single() || is_page()) {
    echo '<meta name="robots" content="index,follow">';
} else {
    echo '<meta name="robots" content="noindex,follow">';
} ?>
You can find it on a ton of blogs.
It places the <meta name="robots" content="noindex,follow"> tag on any page that is not your home page, a static page or a single post. On the home page, static pages and single posts it places <meta name="robots" content="index,follow">, which tells robots to index the page and follow its links. This is great, except what happens when you create a post that’s in two categories on your website?
My post, Meta Tags that Kill Your Blog is found at:
- http://www.thisismyurl.com/wordpress/meta-tags-that-kill-your-blog/ and;
- http://www.thisismyurl.com/web-advice/meta-tags-that-kill-your-blog/
Google is counting this page twice and will most likely assume that I’m trying to pull a fast one on the robot when, in fact, I’m simply trying to help users by listing the content in two categories. The solution? We have to tell Google to ignore one of the two URLs.
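To make the flaw concrete, here is a minimal sketch that models the original snippet’s decision outside WordPress. The function name original_robots_content is made up for illustration, and its boolean arguments stand in for the results of is_home(), is_single() and is_page(). Because both URLs display the same single post, is_single() would be true for each of them.

```php
<?php
// Models the original snippet's decision. The booleans stand in for
// WordPress's is_home(), is_single() and is_page() (an assumption made
// so the example runs without WordPress).
function original_robots_content($is_home, $is_single, $is_page) {
    if ($is_home || $is_single || $is_page) {
        return 'index,follow';
    }
    return 'noindex,follow';
}

// The two URLs of the same post, taken from the list above.
$urls = array(
    'http://www.thisismyurl.com/wordpress/meta-tags-that-kill-your-blog/',
    'http://www.thisismyurl.com/web-advice/meta-tags-that-kill-your-blog/',
);

foreach ($urls as $url) {
    // Each URL is a single-post view, so only $is_single is true.
    echo $url . ' => ' . original_robots_content(false, true, false) . "\n";
}
```

Both URLs come out as index,follow, so the fix has to make one of them noindex.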
To accomplish this, we need to first edit the earlier code sample to give us a place to add our new code:
if (is_home()) {
    echo '<meta name="robots" content="index,follow" />';
} elseif (is_page()) {
    echo '<meta name="robots" content="index,follow" />';
} elseif (is_single()) {
    // the duplicate-content check for single posts goes here
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}
Now we know that we’re on a single (that’s a post), so we can test whether the current URL is the one we want indexed. I struggled with how to do this but finally decided to treat the post’s first category as its most important one. Hopefully that’s the best way, but I’m always open to suggestions.
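$category = get_the_category();
// index the post only when the URL contains the first category's nicename
if (!empty($category) && strpos($_SERVER['REQUEST_URI'], $category[0]->category_nicename . '/') !== false) {
    echo '<meta name="robots" content="index,follow" />';
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}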
What I’ve done is take the post’s first category ($category[0]) and check its category_nicename against the current URL. If we’re on the URL of the first category, we tell Google to index our content; otherwise we tell it to ignore our content. I’m sure that I’ll have to fix this at a later date, but it’s worked so far. If you have a better way to do it, let me know.
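One possible refinement, offered as a sketch and not as a tested replacement: the strpos() check looks for the nicename anywhere in the URL, so a category whose nicename happens to end another path segment would also match. The helper below compares whole path segments instead. The name uri_has_category_segment and the category slug "advice" are made up for illustration, and the example assumes permalinks shaped like /category/post-name/.

```php
<?php
// Hypothetical helper (not part of WordPress): true when $slug appears as a
// complete path segment of the request URI, not merely as a substring.
function uri_has_category_segment($request_uri, $slug) {
    $path = parse_url($request_uri, PHP_URL_PATH);
    if (!is_string($path) || $slug === '') {
        return false;
    }
    $segments = explode('/', trim($path, '/'));
    return in_array($slug, $segments, true);
}

// Substring test: the made-up slug "advice" also matches inside
// "web-advice/", so this evaluates to true.
var_dump(strpos('/web-advice/meta-tags-that-kill-your-blog/', 'advice/') !== false);

// Segment test: only a whole segment counts, so this evaluates to false.
var_dump(uri_has_category_segment('/web-advice/meta-tags-that-kill-your-blog/', 'advice'));

// The real first category from the post still matches.
var_dump(uri_has_category_segment('/wordpress/meta-tags-that-kill-your-blog/', 'wordpress'));
```

Inside the is_single() branch, the call would look like uri_has_category_segment($_SERVER['REQUEST_URI'], $category[0]->category_nicename).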
Here’s the final code:
<?php
if (is_home()) {
    echo '<meta name="robots" content="index,follow" />';
} elseif (is_page()) {
    echo '<meta name="robots" content="index,follow" />';
} elseif (is_single()) {
    $category = get_the_category();
    // index the post only when the URL contains the first category's nicename
    if (!empty($category) && strpos($_SERVER['REQUEST_URI'], $category[0]->category_nicename . '/') !== false) {
        echo '<meta name="robots" content="index,follow" />';
    } else {
        echo '<meta name="robots" content="noindex,follow" />';
    }
} else {
    echo '<meta name="robots" content="noindex,follow" />';
}
?>
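If you would rather not paste this into header.php, one option is to hook the same logic into wp_head from the theme’s functions.php. This is only a sketch: it assumes the theme calls wp_head() inside its head section, and the function names robots_content and print_robots_meta are made up for illustration (prefix them in a real theme to avoid name collisions). The decision is split into a plain function so it can be checked without WordPress loaded.

```php
<?php
// Hypothetical helper: picks the robots value from plain booleans so the
// decision can be exercised outside WordPress.
function robots_content($is_home, $is_page, $is_single, $on_first_category_url) {
    if ($is_home || $is_page) {
        return 'index,follow';
    }
    if ($is_single) {
        return $on_first_category_url ? 'index,follow' : 'noindex,follow';
    }
    return 'noindex,follow'; // category, tag and other archive pages
}

// Hypothetical callback: prints the meta tag using the same checks as the
// final code above.
function print_robots_meta() {
    $on_first = false;
    if (is_single()) {
        $category = get_the_category();
        $on_first = !empty($category)
            && strpos($_SERVER['REQUEST_URI'], $category[0]->category_nicename . '/') !== false;
    }
    echo '<meta name="robots" content="'
        . robots_content(is_home(), is_page(), is_single(), $on_first)
        . '" />' . "\n";
}

// Register the hook only when WordPress is loaded.
if (function_exists('add_action')) {
    add_action('wp_head', 'print_robots_meta');
}
```

With this in place, the echo statements would come out of header.php so the tag is not printed twice.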






Isn’t restricting crawls for tags and categories in a robots.txt file enough? I have the first code you mention in my theme and I also have the robots file.
Wouldn’t it be simpler to leave the category out of your permalink? That way you’d have the post on just one URL, regardless of how many categories you had it in?
malcolm coles’s last blog post..SEO friendly URLs: myth and fact