Sunspot is a solution for adding full-text searching to Ruby applications. It uses Solr in the background and has many great features. In this episode we’ll use it to add full-text searching to a Rails application, using the simple blogging app we’ve used in previous episodes.

This application has a page that displays a number of articles and we want to implement the ability to search across them. Using SQL to do this can quickly become difficult and is often not the best approach. A dedicated full-text solution such as Sunspot is a much better way to implement this feature.

Installing Sunspot

Sunspot comes as a gem and is installed in the usual way by adding it to the Gemfile and running bundle.
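As a rough sketch, the Gemfile entry might look like the one below. The version is left unpinned here as an assumption. Newer Sunspot releases package the bundled Solr server in a separate sunspot_solr gem, so the commented-out line may or may not be needed depending on the version you install.

```ruby
# Gemfile (fragment; evaluated by Bundler, not runnable on its own)
gem 'sunspot_rails'

# Assumption: only needed on Sunspot versions that ship Solr separately.
# gem 'sunspot_solr'
```

After editing the Gemfile, running bundle install in the application's directory fetches the gem and its dependencies.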

Once the gem and its dependencies are installed we’ll need to generate Sunspot’s configuration file, which we can do by running

terminal

$ rails g sunspot_rails:install

This command creates a YAML file at /config/sunspot.yml. We don’t need to make any changes to the default settings in this file.

Sunspot embeds Solr inside the gem so there’s no need to install it separately. This means that it works straight out of the box which makes it far more convenient to use in development. To get it up and running we run

terminal

$ rake sunspot:solr:start

If you’re running OS X Lion and you haven’t installed a Java runtime you’ll be prompted to do so when you run this command. You may also see a deprecation warning but this can be safely ignored. The command will also create some more configuration files for advanced configuration. We won’t cover them here but there are details in the documentation on how to modify these.

Using Sunspot

Now that we have Sunspot installed we can use it in our Article model. To add full-text searching we use the searchable method.

This method takes a block and inside it we define the attributes that we want to search against so that Sunspot knows what data to index. We can use the text method to define the attributes that will have full-text searches run against them. For our articles we’ll do this for the name and content fields.
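A minimal sketch of what this might look like in the model is shown below. It assumes a Rails application with sunspot_rails loaded and an articles table with name and content columns, so it is a fragment rather than a standalone script.

```ruby
# app/models/article.rb (fragment; needs Rails and sunspot_rails)
class Article < ActiveRecord::Base
  searchable do
    # Index both attributes as full-text fields.
    text :name, :content
  end
end
```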

Sunspot automatically indexes any new records but not existing ones. We can tell Sunspot to reindex the existing records by running

terminal

$ rake sunspot:reindex

All of the articles are now in our Solr index and can be searched so we’ll add a search field at the top of the index page.

This form is submitted to the index action using GET, so the search term is passed in the query string. We’ll modify the controller next so that it fetches the articles using that search parameter. To perform a search with Sunspot we call search on the model and pass in a block. Inside the block we can call various methods to handle complex searches. We’ll use the fulltext method and pass it the search term from the form. Finally we’ll assign the result of all of this to @search. We can call results on this to get a list of the matching articles.
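The controller change described above can be sketched roughly as follows. The parameter name :search is an assumption about what the form field is called, and the fragment needs a Rails application with sunspot_rails and a running Solr instance.

```ruby
# app/controllers/articles_controller.rb (fragment; needs Rails and sunspot_rails)
class ArticlesController < ApplicationController
  def index
    # Build and run the Solr query. params[:search] is assumed to be
    # the name of the text field in the search form.
    @search = Article.search do
      fulltext params[:search]
    end

    # Load the matching Article records for the view.
    @articles = @search.results
  end
end
```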

We can test this now by reloading the articles page and searching for a keyword. When we do so we’ll get a list of matching articles returned.

The search returns a list of the articles that contain the search term whether it’s in the article’s name or its content.

There’s a lot more that we can do inside the searchable block in the Article model. For example we can use boost to weight the results so that matches in the article’s name are considered more important than those in the content.

This is important when we want to sort results by relevance. In this case articles whose name contains the search term will appear higher up in the results than articles where the search term only appears in the content.
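As an illustration, boosting the name field might look like this. The boost value of 5 is an arbitrary choice for the sketch, not a recommended setting, and the fragment again assumes a Rails application with sunspot_rails.

```ruby
# app/models/article.rb (fragment; needs Rails and sunspot_rails)
class Article < ActiveRecord::Base
  searchable do
    # Matches in the name count for more than matches in the content.
    # The value 5 is illustrative only.
    text :name, :boost => 5
    text :content
  end
end
```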

The attributes listed in the searchable block don’t have to be actual database columns; we can use any method that we define in the model. We’ll create a publish_month method that will return a string containing the name of the month and the year when the article was published, then search against that method just as if it were a database column.
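The method itself is plain Ruby, so it can be tried outside Rails. The sketch below uses a small stand-in class instead of the real ActiveRecord model, which is an assumption made only to keep the example self-contained. In the application the method would sit in the Article model and be listed in the searchable block with text :publish_month.

```ruby
# Plain-Ruby stand-in for the Article model (hypothetical, for illustration).
class Article
  attr_reader :published_at

  def initialize(published_at)
    @published_at = published_at
  end

  # Full month name followed by the four-digit year.
  def publish_month
    published_at.strftime("%B %Y")
  end
end

article = Article.new(Time.utc(2011, 1, 15))
puts article.publish_month
```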

We’ll need to reindex the records by running rake sunspot:reindex again before we can search against this new field, but once we’ve done so we can search for articles based on their month name.

As an alternative to creating a method we can pass in a block and search against whatever the block returns. An article has many comments so we’ll add the ability to search for the comments’ content by using a block.

The context inside the block is an instance of an Article so inside it we can get the comments for an article and map them to the content of each comment. Even though this returns an array Sunspot will handle this and index all of the comments so that they’re searchable.
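To make the block's context concrete, here is a small plain-Ruby sketch. The Struct classes are hypothetical stand-ins for the real models, and instance_eval is used only to imitate evaluating the block against an article; it is not a claim about how Sunspot works internally. In the model the same block body would be written as text :comments do ... end inside searchable.

```ruby
# Hypothetical stand-ins for the Comment and Article models.
Comment = Struct.new(:content)
Article = Struct.new(:name, :comments)

article = Article.new(
  "Superman",
  [Comment.new("Great post"), Comment.new("Thanks for sharing")]
)

# The body that would go inside `text :comments do ... end`.
comment_text = proc { comments.map(&:content) }

# Evaluate the block with the article as self, so `comments`
# refers to that article's comments.
indexed_values = article.instance_eval(&comment_text)
p indexed_values
```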

Searching Against Attributes

What if we want to add some search capabilities that go beyond simple full-text searching, maybe searching on a specific attribute? For this we can pass in the type of attribute we want to search, whether it’s a string, an integer, a float or even a timestamp. To add the published_at attribute to the search fields we can use the time method.

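Indexing the attribute doesn’t filter anything by itself. In the controller we also add a with restriction to the search block so that it only matches articles whose published_at is earlier than the current time. With this in place the search won’t return articles that haven’t yet been published. There is some great documentation on the Sunspot wiki about the attribute types you can pass in.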
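One way the attribute field and the matching restriction might be written is sketched below. Both fragments assume a Rails application with sunspot_rails, and the :search parameter name is an assumption carried over from the search form.

```ruby
# app/models/article.rb (fragment; needs Rails and sunspot_rails)
class Article < ActiveRecord::Base
  searchable do
    text :name, :content
    # Index published_at as a time attribute so it can be filtered on.
    time :published_at
  end
end

# app/controllers/articles_controller.rb (fragment)
class ArticlesController < ApplicationController
  def index
    @search = Article.search do
      fulltext params[:search]
      # Only match articles whose published_at is earlier than now.
      with(:published_at).less_than(Time.zone.now)
    end
    @articles = @search.results
  end
end
```

The records need to be reindexed after the new field is added before the restriction has anything to match against.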

Faceted Searching

Faceted searching allows us to filter the search results based on certain attributes such as the month in which the article was published. Let’s say that we want to add a list of links showing the months for which there are published articles. When we click one of the links it will filter the list of articles so that only those published in that month are shown.

To do this we’ll first add a string attribute to the searchable block for our publish_month method.

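In the index view we loop through each of the publish_month facet rows and display them. For the facet data to be available, the search in the controller also needs to request it by calling facet(:publish_month) inside the search block. If we call facet on our @search object and pass in the attribute that we want to list the facets by, in this case :publish_month, and then call rows on the result, it returns every facet option for that attribute.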

When we call row.value it returns the value for that attribute, e.g. “January 2011”. We can also call row.count to return the number of articles that match that value. If there’s a month parameter in the query string we’ll display the value along with a “remove” link that will remove the parameter. This gives us some nice functionality for selecting a given facet and passing it in through a month parameter.
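The display logic described above can be sketched in plain Ruby. FacetRow is a hypothetical stand-in for the rows returned by @search.facet(:publish_month).rows, and facet_lines is an illustrative helper name. In the application this logic would live in the index view and produce links rather than plain strings.

```ruby
# Hypothetical stand-in for a Sunspot facet row (responds to value and count).
FacetRow = Struct.new(:value, :count)

# Returns one line of text per facet row, mirroring the view logic.
def facet_lines(rows, selected_month)
  rows.map do |row|
    if selected_month.nil? || selected_month.empty?
      # No month selected: show the month (the link text) and its count.
      "#{row.value} (#{row.count})"
    else
      # A month is selected: show it with a "remove" link label.
      "#{row.value} (remove)"
    end
  end
end

rows = [FacetRow.new("January 2011", 3), FacetRow.new("February 2011", 1)]
puts facet_lines(rows, nil)
puts facet_lines([FacetRow.new("January 2011", 3)], "January 2011")
```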

Once we’ve reindexed the records and reloaded the page we’ll see a list of facets in a panel, each one of which shows a month and the number of articles published in that month. If we select a month we’ll see it as a month parameter in the query string but the articles aren’t filtered. To fix this we need to add another with call to the search in the controller so that it filters by the month if the month parameter is present.
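Put together, the faceted search might look roughly like the sketch below. It assumes a Rails application with sunspot_rails, a publish_month method on the model, and the parameter names :search and :month.

```ruby
# app/models/article.rb (fragment; needs Rails and sunspot_rails)
class Article < ActiveRecord::Base
  searchable do
    text :name, :content
    time :published_at
    # String attribute used for faceting and exact-match filtering.
    string :publish_month
  end
end

# app/controllers/articles_controller.rb (fragment)
class ArticlesController < ApplicationController
  def index
    @search = Article.search do
      fulltext params[:search]
      with(:published_at).less_than(Time.zone.now)
      # Ask Solr for the number of matches per publish_month value.
      facet(:publish_month)
      # Filter by month only when the month parameter is set.
      with(:publish_month, params[:month]) if params[:month].present?
    end
    @articles = @search.results
  end
end
```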

Now when we select a month the list is filtered to show only the articles that were published in that month.

Clicking the “remove” link will return us to the complete list. This works in conjunction with search results too. If we enter a search term the list will show the months that have articles that match.

Facets are a great feature to have alongside searching.

That’s it for this episode on Sunspot. It’s a great way to add full-text searching to Rails applications and has many extra features that we’ve not covered here. Be sure to take a look at the wiki for more information.