README.rdoc

MetaInspector

MetaInspector is a gem for web scraping. You give it a URL, and
it lets you easily get the page's title, links, and meta tags.

Installation

Install the gem from RubyGems:

gem install metainspector

This gem is tested on Ruby versions 1.8.7, 1.9.2 and 1.9.3.

Usage

Initialize a scraper instance for a URL, like this:

page = MetaInspector::Scraper.new('http://pagerankalert.com')

Or, for short, use the convenience alias:

page = MetaInspector.new('http://pagerankalert.com')

If you don't include the scheme in the URL, http:// will be used by
default:

page = MetaInspector.new('pagerankalert.com')
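
As a rough illustration of the default-scheme behavior described above, here is a minimal sketch using only Ruby's standard library. The helper name with_default_scheme is hypothetical and is not part of MetaInspector; the gem's own implementation may differ.

```ruby
require 'uri'

# Hypothetical helper (not MetaInspector's API): prepend a default
# scheme when the given string does not already start with one.
def with_default_scheme(url, default = 'http')
  url =~ %r{\A[a-z][a-z0-9+.\-]*://}i ? url : "#{default}://#{url}"
end

normalized = with_default_scheme('pagerankalert.com')
normalized                    # => "http://pagerankalert.com"
URI.parse(normalized).scheme  # => "http"

# URLs that already carry a scheme are left untouched.
with_default_scheme('https://pagerankalert.com')
# => "https://pagerankalert.com"
```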

Then you can see the scraped data like this:

page.url # URL of the page
page.scheme # Scheme of the page (http, https)
page.title # title of the page, as string
page.links # array of strings, with every link found on the page
page.absolute_links # array of all the links converted to absolute URLs
page.meta_description # meta description, as string
page.description # returns the meta description, or the first long paragraph if no meta description is found
page.meta_keywords # meta keywords, as string
page.image # Most relevant image, if defined with og:image
page.images # array of strings, with every img found on the page
page.absolute_images # array of all the images converted to absolute URLs
page.feed # RSS or Atom links found in meta data fields, as array
page.meta_og_title # Open Graph title
page.meta_og_image # Open Graph image
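
To make the difference between links and absolute_links more concrete, the following sketch resolves relative links against a page URL with the standard library's URI.join. The absolutify helper and the sample links are assumptions made for illustration; this is not necessarily how MetaInspector performs the conversion.

```ruby
require 'uri'

# Hypothetical helper (not MetaInspector's API): resolve every link
# against the URL of the page it was found on.
def absolutify(base_url, links)
  links.map { |link| URI.join(base_url, link).to_s }
end

# Sample links as they might appear in href attributes.
links = ['/about', 'contact.html', 'http://example.com/']

absolutify('http://pagerankalert.com/', links)
# => ["http://pagerankalert.com/about",
#     "http://pagerankalert.com/contact.html",
#     "http://example.com/"]
```

Links that are already absolute pass through unchanged, so the same helper can be applied to every link on the page.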

MetaInspector uses dynamic methods for meta tag discovery, so all these
will work: each call is converted into a search for the meta tag with the
corresponding name and returns that tag's content attribute
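
The dynamic lookup can be pictured with Ruby's method_missing hook. The sketch below is an assumption about the general technique, not MetaInspector's actual code: TinyMetaPage is a hypothetical class that receives an already-parsed hash of meta tag names and contents, and the fallback from underscores to colons (for names such as og:title) is a guess made for this example.

```ruby
# Hypothetical stand-in for a scraped page. It holds meta tags as a
# hash, e.g. { 'description' => '...', 'og:title' => '...' }.
class TinyMetaPage
  def initialize(meta)
    @meta = meta
  end

  # Any call of the form meta_<name> becomes a lookup of <name>.
  # Other unknown methods still raise NoMethodError via super.
  def method_missing(name, *args, &block)
    key = meta_key(name)
    return super unless key
    # Assumption: try the plain name first, then the colon-separated form.
    @meta[key] || @meta[key.tr('_', ':')]
  end

  # Keep respond_to? consistent with method_missing (Ruby 1.9.2+).
  def respond_to_missing?(name, include_private = false)
    !meta_key(name).nil? || super
  end

  private

  # Returns the part after "meta_", or nil if the name does not match.
  def meta_key(name)
    name.to_s[/\Ameta_(.+)\z/, 1]
  end
end

page = TinyMetaPage.new('description' => 'A test page', 'og:title' => 'Hello')
page.meta_description  # => "A test page"
page.meta_og_title     # => "Hello"
page.meta_keywords     # => nil (no such tag in the hash)
```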