Tropical Software Observations

There comes a time in every app when doing a “SQL LIKE” query just doesn’t cut it. I’m going to show you how easy it is to add proper full-text search to your Rails app using the Sunspot::Rails plugin.

Sunspot

Sunspot is a standalone Ruby library that makes integrating with a Solr search engine a cinch. It wraps all the nitty-gritty of indexing and querying in a declarative DSL which you can use to expose virtually any Ruby object to search, not just ActiveRecord models. The sunspot gem bundles a standalone Solr search engine (mostly stock and served by Jetty, although it also includes support for geolocational ordering).

Sunspot::Rails is a Rails plugin that is basically the Sunspot library plus some hooks: into ActiveRecord, to update the index on creates and updates, and into the Rails request lifecycle, to commit the index at the end of every request. It adds the DSL as class methods on ActiveRecord so that you can configure the index in much the same style as you configure associations or named_scopes. The gem also bundles a set of rake tasks for starting, stopping and restarting the Solr service.

Sunspot supports text, string, time, boolean, integer and float fields. When planning what to index, note that only text fields are searched by full-text (keyword) queries; the other field types are used for restricting, sorting and faceting.

What I like about the DSL is the flexibility. You can directly index an ActiveRecord attribute (:title, :body), or index a virtual attribute by giving the field a block (:sort_title) or a symbol naming a method (:published). Even indexing associations is really just a matter of calling methods on the model.
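To make the field declarations above concrete, here is a minimal sketch of what such a setup block can look like. So that it runs without Rails or Solr, it leans on a small stand-in module, FieldRecorder, which is a hypothetical helper of my own and not part of Sunspot; it only records the declarations. In a real app the searchable method comes from Sunspot::Rails and Article would be an ActiveRecord model. The :using option and the body of the sort_title block are assumptions about how the attributes named above would be declared.

```ruby
# FieldRecorder is a hypothetical stand-in for the `searchable` class method
# that Sunspot::Rails adds to ActiveRecord. It only records declarations.
module FieldRecorder
  Field = Struct.new(:type, :name, :options, :block)

  class Setup
    attr_reader :fields

    def initialize
      @fields = []
    end

    # One declaration method per field type listed in the article.
    %w[text string time boolean integer float].each do |type|
      define_method(type) do |*names, &block|
        options = names.last.is_a?(Hash) ? names.pop : {}
        names.each { |name| @fields << Field.new(type.to_sym, name, options, block) }
      end
    end
  end

  def searchable(&block)
    @search_setup = Setup.new
    @search_setup.instance_eval(&block)
  end

  def search_fields
    @search_setup.fields
  end

  # Computes the value that would be handed to the index for one field.
  def indexed_value(record, name)
    field = search_fields.find { |f| f.name == name }
    return record.instance_eval(&field.block) if field.block
    record.public_send(field.options.fetch(:using, name))
  end
end

# Plain Ruby class standing in for an ActiveRecord model.
class Article
  extend FieldRecorder
  attr_reader :title, :body

  def initialize(title, body, published)
    @title, @body, @published = title, body, published
  end

  def published?
    @published
  end

  searchable do
    text :title, :body                          # real attributes, full-text searchable
    string :sort_title do                       # virtual attribute computed by a block
      title.downcase.sub(/^(an?|the)\s+/, '')
    end
    boolean :published, :using => :published?   # value comes from a method
  end
end
```

With this in place, Article.indexed_value(article, :sort_title) returns the title lower-cased with any leading article stripped, which is the kind of value you would want to sort on.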

The second part is indexing. Sunspot provides a utility method to reindex all records for a particular class. In our example, we can call

Article.reindex!

and have the entire Article index rebuilt. For finer-grained indexing, you can call Article#index! on a particular instance. As mentioned above, if you are creating and updating models via controllers, as in a typical Rails app, this should all be transparent to you.

Querying

Sunspot provides a flexible DSL for querying. A SearchController might look something like this

keywords will be applied to all text fields. The remaining non-text fields can be used to restrict the query (in the example, we want to restrict it to published Articles) and to order the results (in the example, we ordered by updated_at). If you don’t define an ordering, the results will be returned sorted by relevance, based on the occurrence and location of the keywords in the document and in the index as a whole. You can tweak the relevance score by defining boosts; in this example, Articles whose titles match the keywords are given a boost over Articles that match the keywords elsewhere.
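The behaviour described above can be modelled in a few lines of plain Ruby. This is only a sketch: ToySearch and Doc are hypothetical stand-ins of my own that work on an in-memory array, not Sunspot code, and they skip relevance scoring and boosts entirely. With Sunspot::Rails, a block of the same shape would be passed to Article.search or Sunspot.search(Article) and run against Solr.

```ruby
# Hypothetical in-memory stand-in for the query DSL; not Sunspot itself.
Doc = Struct.new(:title, :body, :published, :updated_at)

class ToySearch
  TEXT_FIELDS = [:title, :body] # assumed text fields for this sketch

  def initialize(docs, &block)
    @docs = docs
    @filters = []
    @order = nil
    instance_eval(&block)
  end

  # Keywords are matched against every text field. This toy requires each
  # word to appear somewhere; Solr's real matching is more involved.
  def keywords(phrase)
    words = phrase.downcase.split
    @filters << lambda do |doc|
      text = TEXT_FIELDS.map { |field| doc[field].downcase }.join(' ')
      words.all? { |word| text.include?(word) }
    end
  end

  # Non-text fields restrict the result set...
  def with(field, value)
    @filters << lambda { |doc| doc[field] == value }
  end

  # ...and can also order it.
  def order_by(field, direction = :asc)
    @order = [field, direction]
  end

  def results
    found = @docs.select { |doc| @filters.all? { |filter| filter.call(doc) } }
    return found unless @order # no relevance model here: input order is kept
    sorted = found.sort_by { |doc| doc[@order[0]] }
    @order[1] == :desc ? sorted.reverse : sorted
  end
end

# Integers stand in for updated_at timestamps to keep the sketch short.
docs = [
  Doc.new('Solr basics', 'Indexing with Solr', true, 1),
  Doc.new('Draft notes', 'More about Solr', false, 2),
  Doc.new('Rails search', 'Using Solr from Rails', true, 3)
]

search = ToySearch.new(docs) do
  keywords 'solr'
  with :published, true
  order_by :updated_at, :desc
end

puts search.results.map(&:title).inspect
```

All three documents mention the keyword, but the unpublished draft is filtered out and the remaining two come back newest first.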

You can define multiple restrictions, and they don’t always have to test for equality. Sunspot supports restricting by a value being less than, greater than, between, any of or all of (the last two when comparing against an indexed array). The restriction with(:published, true) is simply shorthand for with(:published).equal_to(true). You can also test for the absence of a value using the without operator.
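To show how these operators read when chained, here is a small in-memory model of them. Again this is a sketch under assumptions: the Restriction class and the top-level with and without helpers below are hypothetical and written for this post, and they test plain Ruby hashes rather than building a Solr query. The operator names follow the ones Sunspot uses (equal_to, less_than, greater_than, between, any_of, all_of).

```ruby
# Hypothetical in-memory model of the restriction operators; not Sunspot code.
class Restriction
  def initialize(field, negated = false)
    @field = field
    @negated = negated
    @test = nil
  end

  def equal_to(value)
    finish { |v| v == value }
  end

  def less_than(value)
    finish { |v| v < value }
  end

  def greater_than(value)
    finish { |v| v > value }
  end

  def between(range)
    finish { |v| range.include?(v) }
  end

  # For array-valued fields: at least one of the given values is present.
  def any_of(values)
    finish { |v| !(Array(v) & values).empty? }
  end

  # For array-valued fields: every given value is present.
  def all_of(values)
    finish { |v| (values - Array(v)).empty? }
  end

  def match?(doc)
    result = @test.call(doc[@field])
    @negated ? !result : result
  end

  private

  def finish(&test)
    @test = test
    self
  end
end

# with(:field, value) is shorthand for with(:field).equal_to(value).
def with(field, *value)
  restriction = Restriction.new(field)
  value.empty? ? restriction : restriction.equal_to(value.first)
end

# without builds the same restriction, negated.
def without(field, *value)
  restriction = Restriction.new(field, true)
  value.empty? ? restriction : restriction.equal_to(value.first)
end

doc = { :published => true, :rating => 4, :tags => %w[ruby solr] }

puts with(:published, true).match?(doc)
puts with(:rating).between(3..5).match?(doc)
puts with(:tags).any_of(%w[solr java]).match?(doc)
puts without(:tags).any_of(%w[java]).match?(doc)
```

Each of the four calls at the end matches the sample document, so each prints true.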

Finally, Sunspot plays nice with the WillPaginate plugin. In your view, you can paginate easily by doing

<%= will_paginate @search.results %>

and expect it to work seamlessly.

Conclusion

That’s all it takes to get up and running with Sunspot. My take-home point is that Sunspot exposes extremely flexible DSLs that allow you to scale from simple to pretty complicated queries with ease.

If this interested you, you may want to check out the wiki for other features not covered by this article including highlighting of keywords, facets and stored fields.