If Google finds a robots.txt Disallow for a page, it will remove the page's title and description from its search results. It will also no longer match search terms to the words on that page. So, the page essentially disappears from the Google search results pages. However, if Google finds a link to that page, it will still show that page in results when someone clicks on "More results from <this domain>".

I went around and around with this, trying to find a way to tell them "don't mention my contact form pages at all, please", and here's what I ended up with: For Google, don't Disallow the page in robots.txt, but place a <meta name="robots" content="noindex"> tag in the head section of the page itself.
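
To make the meta-tag approach concrete, here is a minimal Python sketch of how a crawler might look for that tag once it has downloaded the page. The class name RobotsMetaParser, the function has_noindex, and the sample CONTACT_PAGE markup are made up for illustration; this is not a description of how Google or any other engine actually parses pages.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values may be None.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def has_noindex(html):
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical contact page carrying the tag described above.
CONTACT_PAGE = """<html><head>
<title>Contact us</title>
<meta name="robots" content="noindex">
</head><body><form></form></body></html>"""

if __name__ == "__main__":
    print(has_noindex(CONTACT_PAGE))
```

Note that the check only works on HTML that has already been fetched, so the page must not be blocked in robots.txt for the tag to be seen.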

You'll also need to do this for Ask Jeeves/Teoma; their handling of robots.txt is the same as Google's. All the others seem to interpret a robots.txt Disallow as "don't mention this page at all."

Jim goes on to point out why engines prefer the robots.txt... it saves bandwidth, because to see the robots meta tag, the engines have to download the page. I suggest you read his post... it's more precise than this summary here.

Interesting. Basically, both are non-optional solutions in my view. Robots.txt indeed says "don't fetch this", but that does not imply "don't use or link to this". The problem with meta tags is that they have to be parsed, and not all bots do this (it depends on their purpose). Also, they can again be interpreted widely. Nofollow does not imply "do not follow links to other (sub)domains".
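
The "don't fetch this" reading of robots.txt can be sketched with Python's standard urllib.robotparser module. The rules in ROBOTS_TXT, the bot name "ExampleBot", the example.com URLs, and the helper may_fetch are assumptions made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every robot out of the /contact/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /contact/
"""


def may_fetch(robots_txt, user_agent, url):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/contact/form.html",
                "https://example.com/about.html"):
        print(url, may_fetch(ROBOTS_TXT, "ExampleBot", url))
```

The answer is only about fetching. Nothing in the file tells a bot what to do with a URL it has learned about from a link elsewhere, which is the gap described above.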

I use a combination of both, but I conclude that I may have to make some adjustments for Google and AJ/T.

I think the point of Jim's post is that for Google and AJ/T you do need to use both. The robots.txt by itself won't do everything you might hope it will do, as someone reading just your first post could perhaps infer.