What is the “best” setup for robots.txt?
I’m using the following permalink structure /%category%/%postname%/.

My robots.txt currently looks like this (copied from somewhere a long time ago):

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /comments
Disallow: /category/*/*
Disallow: */trackback
Disallow: */comments
  1. I want my comments to be indexed, so I can remove the comments rules (Disallow: /comments and Disallow: */comments).
  2. Do I want to disallow indexing categories because of my permalink structure?
  3. An article can have several tags and be in multiple categories. This may cause duplicates in search providers like Google. How should I work around this?

Would you change anything else here?


FWIW, trackback URLs issue redirects and have no content, so they won’t get indexed.

And at the risk of not answering the question, RE your points 2 and 3:

http://googlewebmastercentral.blogspot.com/2008/09/demystifying-duplicate-content-penalty.html

Put another way, I think you’re wasting your time worrying about duplicate content, and your robots.txt should be limited to:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-content/cache
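
If you want to sanity-check what a file like that actually blocks, here is a minimal sketch using Python's standard urllib.robotparser. The host example.com, the helper name is_allowed, and the sample paths are assumptions for illustration only, not part of your site. Note that this parser does plain prefix matching and does not interpret wildcard rules such as */trackback the way Google does, so it is only a rough check for the wildcard-free rules above.

```python
from urllib.robotparser import RobotFileParser

# The trimmed-down rules suggested above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-content/cache
"""


def is_allowed(path, agent="*"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT.

    `is_allowed` and the example.com host are hypothetical names
    used only for this sketch.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network fetch is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    # Hypothetical paths matching a /%category%/%postname%/ structure.
    for path in (
        "/news/hello-world/",
        "/category/news/",
        "/wp-admin/options.php",
        "/wp-content/cache/page.html",
        "/wp-content/themes/mytheme/style.css",
    ):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Posts, category archives and theme files stay crawlable with these rules, and only the admin area, cgi-bin and the cache directory are blocked.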
