As part of an ongoing effort to index the so-called Invisible Web, Google's automated crawlers are now toying with HTML forms. But only on certain "high-quality sites."
"In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who …

@Henry Cobb

If you are hiding data behind simple forms (we're not talking about password hacking here), then you have bigger problems.

And for most people, database load won't be an issue. If you're controlling load by making it difficult for people to find publicly accessible data, then I pity you. And if you are worried about the Googlebot loading your site periodically, or find that it already does, that's what robots.txt is for.
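To illustrate the robots.txt point, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `/search` path, and the `example.com` URLs are hypothetical, chosen only to show how a site might keep Googlebot away from a form-driven endpoint while leaving the rest of the site open.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot off a form-driven search endpoint,
# leave everything open to all other crawlers.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("Googlebot", "https://example.com/search?q=widgets"))
print(allowed("Googlebot", "https://example.com/about"))
```

This only models what a well-behaved crawler would do; robots.txt is advisory and does nothing against a client that ignores it.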

@Henry Cobb

It shouldn't make a difference if it's a GET or a POST. It's just as easy to fake either. The only difference is the field values are in the URL for the GET and in the request body for the POST.
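A small sketch of that difference, using Python's standard `urllib`. The form fields and the `example.com` URL are made up for illustration, and nothing is actually sent over the network; the requests are only built and inspected.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields, e.g. from a "select your region" form.
fields = {"region": "uk", "category": "books"}
encoded = urlencode(fields)

# GET: the field values travel in the URL's query string, with no body.
get_req = Request("https://example.com/catalogue?" + encoded, method="GET")

# POST: the same values travel in the request body instead.
post_req = Request(
    "https://example.com/catalogue",
    data=encoded.encode("ascii"),
    method="POST",
)

for req in (get_req, post_req):
    print(req.get_method(), req.full_url, req.data)
```

Either request is equally easy for a script to construct, which is the point being made: the method changes where the values sit, not how hard they are to fake.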

And as Stephen Stagg points out, they're not trying to get around your security or logins or anything like that. Consider online shopping sites: now they can "browse" the catalogue if it's only available via a form, which is quite common these days.

On some sites, it's as simple as selecting a region before you get a customised site.

Watch out!

Soon, the Googlebot will be commenting on El Reg articles! It might even choose the Paris icon! No wait, it only does it on "high-quality sites" ;-)

But really, does this mean Google might start inadvertently spamming forums, sending queries to helpdesks, requesting password resets, and even (although highly unlikely) logging into websites' member areas and then indexing the results?

Hahahaha!

GoogleBot's Al Gore rhythm (hat please!)

I've heard rumors that these new GoogleBots are actually a half-million third world children with OLPCs trawling the web twenty-four hours a day. The most recent trend is to pack them into shipping containers to be sent abroad to work on a contractual basis. Apparently, using its new internally developed compression technology, Google achieves four times the child density per container of its nearest rival, and The Environmentalists are praising Google's efficient harnessing of our most precious, carbon-neutral renewable resource.

@Perhaps El Reg was a test site...

And the result analysis of such BetaTesting, Steven? Are Robots Human with Network InterNetworking IQs/ICQs?

And if El Reg was a test site, what is it after Testing? An Application of Special Access ProgramMIng and/or 4Access2Special ProgramMIng ....AI Stealth Projects Portal .....Virtual PerlyGatesWay with Pythonesque ASPs ..... for the Full Monty of SAIS..... Special Advanced IntelAIgent Serverings .....And just the tip of a NIceberg/QuITe Titanic Quarter Offering, Holywood Palace ProjectIOn Style?

Any doubts would be yours .....and just whenever you are so close to dismissing Disbelief ..... the First Frontier and Final Hurdle for Reality Imagined Virtually and IT that is Truly SurReal....... Life in Love is AIdDream in Love with Life for the Holy Grail of XXXXistentialist Code .....QuITe Peculiar Particular Parameters for Global Operating Devices, XXXXCommunicating. ......... AI Work in Constant Progress, although hardly Artificial. :-) whenever IT is for Real, Virtually.

@Kevin

GET and POST are very different. Only POST should be used for destructive changes and to request real-world actions. By sending POST requests they might order a holiday, or post a message on this page.
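One way a cautious crawler could respect that distinction is to submit only forms that declare (or default to) GET and leave POST forms alone. The sketch below uses Python's standard `html.parser`; the class name, the helper function, and the sample page are all hypothetical, and this is not a description of how Googlebot actually decides.

```python
from html.parser import HTMLParser


class FormMethodCollector(HTMLParser):
    """Record the action and HTTP method of every <form> in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []  # list of (action, METHOD) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            attr = dict(attrs)
            # HTML treats a missing method attribute as GET.
            method = (attr.get("method") or "get").upper()
            self.forms.append((attr.get("action") or "", method))


def crawlable_forms(html: str) -> list:
    """Return the actions of forms a read-only crawler might submit."""
    collector = FormMethodCollector()
    collector.feed(html)
    # Only GET forms are treated as safe to submit automatically.
    return [action for action, method in collector.forms if method == "GET"]


# Hypothetical page: one search form (GET by default), one order form (POST).
PAGE = """
<form action="/search"><input name="q"></form>
<form action="/order" method="post"><input name="item"></form>
"""
print(crawlable_forms(PAGE))
```

This only works as a safeguard if sites follow the convention themselves; a form that triggers a real-world action over GET would still be submitted.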

WTF!!!oNONEONE!

This seems absolutely ridiculous. If they post data to a form and index the resulting page, how on earth will a user ever see the same page? Build a hidden form on the fly, with some JavaScript hitting submit for them?

Never get rid of the B******s now then

I installed a tracking mechanism on my website to see who was where and when.

No matter what time of day I looked and cleared the logs, within 10 minutes the Googlebot was back trawling the site. Due to the circumstances, the site hardly ever changed, so I don't believe we were singled out; I just think they are programmed badly.