New Google Reader Feature Can Create a Feed for Any Site

Google has launched a new feature for Google Reader that lets users create a custom feed to track changes on pages that don’t have their own feed. In other words, you can follow changes to any site.

"These custom feeds are most useful if you want to be alerted whenever a specific page has been updated," says Google’s Brian Shih. "For example, if you wanted to follow Google.org’s latest products, just type ‘http://www.google.org/products.html’ into Reader’s ‘Add a subscription’ field. Click ‘create a feed’, and Reader will periodically visit the page and publish any significant changes it finds as items in a custom feed created just for that page."

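To make the idea concrete, here is a minimal Python sketch of how a tool might turn page changes into feed items. It is not Google’s implementation, which has not been published in detail. The function names, the `min_chars` threshold and the rule that a "significant" change means a newly added line of visible text are all assumptions made for illustration.

```python
import difflib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_lines(html):
    """Return the page's visible text as a list of stripped chunks."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def new_snippets(cached_html, current_html, min_chars=20):
    """Return text chunks that are new or changed in current_html.

    Assumption: a change is "significant" if the new chunk has at
    least min_chars characters. This threshold is hypothetical.
    """
    old = visible_lines(cached_html)
    new = visible_lines(current_html)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(line for line in new[j1:j2] if len(line) >= min_chars)
    return added
```

Calling `new_snippets(cached_html, current_html)` with two snapshots of the same page returns the added text, and each returned string could become one item in a custom feed.
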
Google points to examples such as the Zillow home page, which would show new real estate listings; a Macy’s special offers page, which would surface the latest deals; and the NYU Computer Science Department page, which features news and highlights. The possibilities are easy to see. If data overload was already a problem for you, this probably won’t help. But if you want to stay up to date on even more web content, this might be just the ticket.

This feature could also serve as a fallback if sites stop publishing RSS feeds. I’m not saying feeds are dying, but some think they will, and if that happens, a feature like this would presumably let you keep following those sites in your feed reader anyway.

The feature provides short snippets of page changes so users can decide whether the changes make the page worth visiting. If you have a site and don’t want Google to crawl it or create feeds for it, you can opt out. To do so, Google says you can:

Add a <meta name="googlebot" content="noarchive"> tag to any page you don’t want available in Reader. Google compares the cached and current versions of your page to determine if that page has been updated. Adding the NOARCHIVE meta tag will prevent Google from caching your page.

Use robots.txt to block Googlebot from crawling your site. (You can block your entire site, or a file or directory.) However, if you decide to block Googlebot, your content will not be available to appear in search results. Doing this will not remove the previously generated feed from Reader, but Reader will stop generating feeds after this measure has been taken.
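
Site owners who want to confirm that either opt-out is in place could check with something like the following Python sketch. It uses only the standard library. The helper names `has_noarchive` and `googlebot_allowed` are hypothetical, the example.com URL in the usage note is a placeholder, and the standard library’s robots.txt parser may not match Googlebot’s own matching rules in every edge case.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class NoArchiveFinder(HTMLParser):
    """Look for <meta name="googlebot" content="... noarchive ...">."""

    def __init__(self):
        super().__init__()
        self.noarchive = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; normalize the values too.
        attr = {name: (value or "").lower() for name, value in attrs}
        if attr.get("name") == "googlebot":
            directives = {d.strip() for d in attr.get("content", "").split(",")}
            if "noarchive" in directives:
                self.noarchive = True


def has_noarchive(html):
    """Return True if the page carries the googlebot noarchive meta tag."""
    finder = NoArchiveFinder()
    finder.feed(html)
    finder.close()
    return finder.noarchive


def googlebot_allowed(robots_txt, url):
    """Return True if the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)
```

For example, with a robots.txt containing `User-agent: Googlebot` followed by `Disallow: /private/`, `googlebot_allowed` returns `False` for `http://example.com/private/page.html` and `True` for pages outside that directory.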

A few limitations can prevent a page from being picked up by this feature. It supports only English-language content in HTML, and it does not detect updates to content in iframes or to content that requires signing in to view.