Blog SEO: Get Your Blog out of the Supplemental Index

A little blog SEO tip today: check to see if, like my blog, you have a “supplemental index problem”.

Check out my Google search result; as you can see, I have about 106 pages in the supplemental index. Change chrisg.com to your own domain to test your own result.

So what is this “Supplemental Index”?

I like the answers from Tropical SEO best:

  • The Google Supplemental index is the Siberian work camp for web pages.
  • The Google Supplemental index is where they put web pages with little trust.
  • The Google Supplemental index is where they put web pages that aren’t going to rank for anything important.

Essentially, Google throws a page into the supplemental index when it is not sure what to do with it but doesn’t want to throw it away.

Why am I in “Supplemental Index” hell?

In a nutshell, it’s that old SEO monster, “duplicate content”. On my blog, the internal linking and archives were confusing Googlebot by serving up the same content over and over. Graywolf did a wonderful video on this.

How do I get my blog out of the “Supplemental Index”?

The standard answer seems to be to use robots.txt to stop Google indexing junk pages. I didn’t want to add a robots.txt file, so I looked for a plugin and came across a recommendation from Ogletree. He hacked a WordPress plugin that seemed ideal, but then in the comments I saw he had provided some template code, which was just what I was looking for. With a little tweak so it outputs a comment to explain what is going on, here is what I added to my header template just above the title tag:


<?php
// Allow indexing of single posts, category pages, static pages and the homepage;
// ask search engines to skip everything else (date archives, paged listings, etc.)
if ((is_single() || is_category() || is_page() || is_home()) && (!is_paged())) {
    echo "<!-- ok google, index me! -->";
} else {
    echo "<!-- google, please ignore - thanks! -->";
    echo "<meta name=\"robots\" content=\"noindex,follow\">\n";
}
?>

What this does is output a special instruction that tells search engines to ignore a page if it is not the homepage, a single article, a static page, a category page, etc. My main problem was the date archives, so hopefully this will sort it out. We shall see!
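
If you would rather go the robots.txt route that most SEO folks recommend instead, here is a minimal sketch of what such a file might look like. Note this is only an illustration, not the exact file I use; the paths below are just the usual WordPress suspects, so swap in whichever junk pages your own blog actually generates:

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /feed/
Disallow: /tag/

The idea is the same as the template snippet above: keep crawlers away from the housekeeping and archive URLs that only duplicate your real content.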



Table of contents for Blog SEO

  1. Blog SEO: Get Your Blog out of the Supplemental Index
  2. Blog SEO: Boost Your Search Rankings With Internal Links


Comments

  1. Ivan Brezak Brkan says:

    What pages did Googlebot exactly see as junk pages and basically how can I implement this for example in Expression Engine. Do I need to use robots.txt (and if so, how?). Thanks in advance Chris, this is a very interesting post!

  2. You can see the pages Google didn’t like in the linked query above. I don’t know what you can do with Expression Engine; I guess robots.txt would do it, but someone else would have to guide you, as I wimped out of using it myself ;)

  3. Ivan Brezak Brkan says:

    Heh, don’t worry. I read the article on TropicalSEO and basically what I have to add to my “watchlist” is unique meta data for each and every page! :) Now if only I could get Googlebot to spider my blog at night (CET time)…

  4. Heh good luck! :)

  5. Another way to complicate this simple thing is to add a robot.txt file which I did. I’m not sure whether it worked as my blog traffic was not stable. I’m gonna try this thing and remove the robot.txt shit.

  6. Well, it seems the pro SEO technique is still robots.txt, but for me it was just one more thing I didn’t want to have to learn.

  7. there is a great post on this problem at:
    http://www.seomoz.org/blog/how-i-escaped-googles-supplemental-hell

    maybe it will help…

  8. Any suggestions for Blogger?

    Of ~1250 pages indexed, ~1140 are in the supplemental. It looks like it’s mostly the pages for individual posts. I’m reading that Categories may help, so I’m wondering if going back to archived posts and adding labels is worth the effort? Any other suggestions?

  9. I wouldn’t think adding categories would help. Make sure they have unique titles, and that you are not showing the same posts in several archive/category/date pages.

  10. My main issue seems to be a common one — printer friendly pages. I’ve gotten positive feedback from readers, so I’m keeping them, but will be adding a line to robots.txt to ignore anything in /print/.

  11. Yeah, that was why I have avoided the printer-friendly thing; although I have been informed how to do it with CSS, I just never got around to doing it :S

  12. Chris, that’s really a good one. I was looking for this kind of stuff as the date archive was getting indexed too. Can you tell me about the tag thing? I am using UTW; what would the syntax for that be, is_tag?

    Awaiting your reply

  13. Hmm, could be but I haven’t used that plugin myself … I am sure the developer or somebody else could help you out?

  14. All my results appear to be my comments. Is this a problem I should look at fixing or does it not matter too much? I don’t think I want folks finding comments when searching in google anyway – it’s my content not my comments I’d be worried about? Or am I missing the point?

  15. I think it is actually your comment feed that is being indexed mel?

  16. Modifying the robots.txt file is real easy guys. Here’s a sample:

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /logs/
    Disallow: /wp-admin/
    Disallow: /wp-includes/

    This tells all bots to avoid these directories.

    This article might also help some people as it shows how to (in WordPress) create unique title and meta tags – amongst other things.

    I also think 302 and possibly 301 redirects may temporarily get you tossed into the supplemental index…

  17. Cool thanks Mark

  18. Creating unique title tags for each and every page that you do want indexed is another way to avoid the supplemental index.

    If you’re using WordPress, install the SEO Title Tag plugin – http://www.netconcepts.com/seo-title-tag-plugin/

    Do keyword research for your blog and put as many different relevant terms as possible. If you have a recurring column, such as recaps or roundups, then put the date in the title tag to make it unique. Here’s a free keyword research tool – http://www.keyworddiscovery.com/search.html – this tool can help give you ideas for posts as well when you see what people are searching for.

  19. @Ashisha: That’s a good call, I too use the UTW plugin.

    Add this to your robots.txt file:

    User-agent: *
    Disallow: /tag/

  20. Chris – I was the same way with getting around to a print stylesheet, and got lazy. I’m using the wp-print plugin, with some custom formatting, and it works great. Since it uses a re-write to make the print version a directory path I can block it with:

    User-agent: *
    Disallow: /print/

  21. Man! I have 10,400 in the supplemental index from my main site. I dont even use wordpress! Gonna have to get that fixed. Thanks for the info!!

  22. @Mark: Need one favor. My blog is in a subdirectory; can you help me out in making the robots.txt file? I can email you what I have made.

  23. @Ashish: sure, go to my blog and use the contact form. I’ll see what I can do.

  24. @Mark: Thanks, I will do that.

    One more thing, why not just index the single page and ignore the rest of them? Everything other than the single page is duplicate, right?

  25. It looks like a great solution; I’ve just implemented it.

    It seems there is a variation of this solution that doesn’t use the WordPress API and so is portable to all CMSs:

    Apply the ‘nofollow’ property on links to archives and other pages with duplicate content.

  26. @Ashisha, I don’t think that’s a good idea, you’re using a shotgun approach with that when you should use a sniper rifle and be specific on your target when disallowing.

  27. I checked this out and my site only has 2 pages indexed on Google, the home page and the feed. How can this be possible? I’ve been confused why my PR has been 1 for so long. I’ve been writing since last November and have around 100 posts. How can I get google to index the rest of the pages??

  28. @Justin – You might want to try getting google to index using a sitemap?
    http://www.google.com/webmasters/sitemaps/

  29. One thing I haven’t seen mentioned here is that Google doesn’t index everyone just because “you” want them to. They have like a billion pages to worry about storing on their computers and they have to prioritize everything. They look at your PR rank and the higher you are the more content they’ll put in their primary search results. If you are a PR1 then it doesn’t matter how you clean up your site because they still aren’t going to index more than a few pages. As your PR ranking goes up then they will put more of your pages in their primary ranking. Of course, cleaning up your site, adding Robots.txt, eliminating duplicate content all helps improve your site quality. But if you have a site with lots of pages, then you need a high PR to get Google to rank them all.

  30. This is real fun. When Google’s engineer tells you “the main determinant of whether a url is in our main web index or in the supplemental index is PageRank” (http://www.mattcutts.com/blog/infrastructure-status-january-2007/ – 4th paragraph) – there are still people believing in fairies and “old SEO monsters of duplicate content”.

  31. 2Dolphins has 347 supplemental results hits on Google. I wonder if this is because, with a Blogger-based blog, the meta data is the same on every page? Each page does have a unique and (fairly) descriptive title and I am using Blogger labels on most posts…

  32. Chris,

    I find it very kind of you not only to share your tip, but to answer comments as well.

    I do not know you, but wish you the best and hope you have a wonderful life.

    c_v

  33. When I used that modified plugin and set up my robots.txt my site fell a little bit in the rankings for a few days but then came back with even better rankings. I now only have a few pages in the supp index. They are very short posts that I made. It is very important to have more words on your pages. It really helps to have at least 300-500 words of unique content per page. It also helps to have a unique title and description on each page. That along with more links and links from authority sites is what will get you to rank in Google.

    Here are the two blog posts the guy was talking about.

    http://www.ogletreeseo.com/157.html

    http://www.ogletreeseo.com/146.html

  34. Has anyone used the URL removal tool at http://www.google.com/webmasters/sitemaps/ ?

    I tried to remove my comments and feeds which are supplemental results right now, but no luck so far. I’m adding a robots.txt with

    User-agent: *
    Disallow: /wordpress/comments/
    Disallow: /wordpress/feed/

  35. That is clever!
    Just want to add that, if like me, you are going to do a copy and paste, take care of the formatting (the quotes).

  36. also, if you cut and paste the code above, you will get a syntax error because of the slanted quotes; try this:

    <?php
    if((is_single() || is_category() || is_page() || is_home()) && (!is_paged())){
    echo "<!-- ok google, index me! -->";
    }else{
    echo "<!-- google, please ignore - thanks! -->";
    echo "<meta name=\"robots\" content=\"noindex,follow\">\n";
    }
    ?>

  37. There is some great information in your post as well as the comments. I’ve already added a meta and differentiating title to my pages and I’ll probably look into the robots.txt as some of your users have suggested.

  38. @Guilherme Zuhlke O’Connor: “Apply the ‘nofollow’ property on links to archives and other pages with duplicate content.”

    That’s not necessary. Archives are nothing more than a different way to get to the same spot or post, that’s not duplicate content.

  39. @cyntax – unfortunately WordPress just does this; anyone know how I can stop the reformatting happening? :o

  40. It seems all the tags page are considered as good ones and the single post page are considered as supplemental :( Only now I understood why I am getting very less hits from Google as I am currently getting very high number of hits from Yahoo and MSN when compared with Google. Any idea, how to make the single posts back to good posts from supplemental?

  41. All I can suggest is follow the advice above and in the comments, but the main thing would be to try to get good links to the single pages

  42. Another thing that could potentially cause this is your WordPress feed being indexed. For some reason Google occasionally does this. You’re best off blocking it in robots.txt.

  43. The problem with Blogger seems to be the archive pages. If you have the widget to browse by Year / Month and so on, under this you have the item pages, which are fine. They have their own URLs and won’t be duplicating any content, providing you don’t have every article on your main page. But if you click on, e.g., a Month name, you’ll see all articles for that month on one page … hence the duplication.

    So the only way around this is to add a meta tag to the page header for archive pages to get Google to ignore them. Thankfully Blogger allows you to identify archive pages in the template, so it should just be a question of adding the code somewhere in the <head> section of your template.

    Will have to see if that helps at all the next time Google comes around.

    Very useful article … thanks!

  44. Chris, ever since you published this article I’ve been trying to figure out why my site had all its pages listed as supplemental. I’ve had a very robust robots.txt file installed for over a month. The only bad thing was I wasn’t using excerpts on my categories. Then today I ran across 2 articles of interest that really explain what might be going on. I’d be curious to see what you think about these:

    http://www.searchengineguide.com/wallace/2005/0209_dw1.html

    and

    http://www.mattcutts.com/blog/google-hell/

    - Nathan

  45. My site went from tens of thousands of pages in the main index to two in a matter of what seems like weeks. Supp hell for sure. I’m guessing my server migration ended up on a bad neighborhood IP. I’m going to contact Google and see what’s up.