Sounds about right for them. I wonder if it's got anything to do with the IRC GoogleBot TCLs. :S Hope they fix it soon, because it's annoying having to open up a remote Opera just to quickly search for something from a browser on a shell.

Google has been doing this for a little while, I think to cut down on automated search queries against its database (i.e. Perl scripts etc.). That said, it is simple to change the browser identification string and carry on as normal.
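As a purely illustrative sketch of what "changing the browser identification string" amounts to (the spoofed UA string below is an arbitrary example, and the target URL is hypothetical), a script can simply send a browser-like User-Agent header instead of its default one:

```python
import urllib.request

# Hypothetical sketch: present a mainstream browser's User-Agent
# string instead of the default "Python-urllib/x.y" that scripts
# announce (and that filters like Google's appear to key on).
SPOOFED_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

def build_request(url, user_agent=SPOOFED_UA):
    """Build a request carrying a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

# urllib.request.urlopen(build_request("http://www.google.com/search?q=test"))
# would then be served as if it came from an ordinary browser session.
```

Whether doing so is a good idea is another matter, given the TOS point raised below.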

Dan wrote:
>
> Sounds about right for them. I wonder if its got
> anything to do with the IRC GoogleBot TCL's. :S

> What is that?

Eggdrop has a Tcl script which can query the Google database via a trigger like !google blah, and I'm assuming it has an issue with the browser identity.

> hope they fix
> it soon because its annoying having to open up a remote opera
> to quickly search for something on a browser on shell

> Pardon?

I'm talking about when I have to use VNC to search for something due to internet restrictions on the local machine at my college. It has most search engines blocked due to "pornography searches" - as you can tell, my administrator is slightly... "lost in space". I have to VNC into my remote Unix box just to be able to surf when I'm at college. I used to just use lynx for most of my browsing because it was simpler.

The campaign is specifically targeted against links and wget, which
can "dump" the content of a remote page into a text file. (Should
this be a starting point for performing automated queries?)
Anyway, note that links 2.x can't do this anymore, so banning this
browser is hilarious.

I can't see why you would want to use a non-interactive browser for any reason other than violating their TOS. Google provides a SOAP API, which I have used quite successfully for programmatic searching. They even provide excellent documentation and sample scripts.
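For the curious, that API is driven by a SOAP envelope wrapping a doGoogleSearch call. Here is a rough, hand-rolled sketch of the request body; the method name and "urn:GoogleSearch" namespace match the beta API, but the exact parameter list should be checked against the WSDL shipped in Google's developer kit, so treat this as illustrative only:

```python
# Hypothetical sketch of a doGoogleSearch SOAP 1.1 request envelope.
# A real client would POST this to Google's SOAP endpoint with a
# valid license key; parameter names here are from memory.
def build_google_soap_envelope(license_key, query, max_results=10):
    """Return a minimal SOAP envelope for a Google search query."""
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <ns1:doGoogleSearch xmlns:ns1="urn:GoogleSearch">
      <key>{license_key}</key>
      <q>{query}</q>
      <start>0</start>
      <maxResults>{max_results}</maxResults>
    </ns1:doGoogleSearch>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""
```

In practice the sample scripts in the developer kit handle all of this envelope-building for you.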

I noticed the same thing happening with a page-download tool I wrote (bget, available on CPAN in the scripts section). When I emulated a browser's identification string, I could get access.

As for mjl, I was doing it because I wanted to save an article I found in Google Groups in the same place I save all my other news posts. So I copied the URL from the "view original format" link and tried to fetch the page.

By the way, when I did it, the forbidden message I got had just a
simple base64-encoded block in the 'code below' section, but the one here is doubly base64-encoded.

[1] What's a TCL?
[2] Yahoo is not as good as Google ;) but they have improved
[3] Sniffy McNickels makes a great point: "Much better would be to throttle repetitive looking requests, which is pretty easy to do." Could you provide a URL explaining how this is done and at what level (e.g. as a daemon? in hardware?)
[4] Most browsers and spiders allow the user to spoof the UA (User-Agent). What is this coming to? A fixed browser ID? As trustworthy as an IP? :)
[5] What is mjl?
[6] To 'chaz' with the college sysadmin who has "search engines blocked due to 'pornography searches'": That sysadmin needs to be fired and expelled. This makes about as much sense as closing down a city because a criminal lives within its boundaries!
[7] Is there a good write up (URL) about the Google SOAP API and what can be done using it?
[8] Sniffy McNickels is incorrect in his/its argument: "Any browser can save the contents of a page to a text file." There's more to the story! wget is non-interactive, whereas most browsers require clicking or scheduling through a GUI. Also, several instances of wget can run at once, and it acts in a more linear, consecutive, "robotic" manner than PACUs' (point-and-click users') requests to an HTTP or FTP site.
[9] There is no nine. Please email me if this thread changes, it is hostmaster then an at symbol then Video2Video is the dot com domain. Thanks.
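On the throttling suggestion quoted in [3]: one simple way to do it at the application level (no particular daemon or hardware implied; this is an illustrative sketch, not how Google actually does it) is a sliding-window counter per client:

```python
import time
from collections import defaultdict, deque

# Illustrative sliding-window throttle: allow at most `limit`
# requests per `window` seconds from each client (e.g. per IP).
class Throttle:
    def __init__(self, limit=10, window=60.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # client -> recent request times

    def allow(self, client, now=None):
        """Record a request; return False if the client is over quota."""
        now = time.monotonic() if now is None else now
        q = self.hits[client]
        while q and now - q[0] > self.window:
            q.popleft()            # forget requests outside the window
        if len(q) >= self.limit:
            return False           # looks repetitive/robotic: refuse
        q.append(now)
        return True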