Using Sunspot for Free-Text Search with Redis

After spending time to get some data into Redis (as documented in some of my previous posts here), I not surprisingly wanted to make the data searchable. After looking around at some of the full-text search solutions available for Ruby, I really liked the look of Sunspot. Well-presented, well-designed, and it even has decent documentation. It uses Solr underneath, which is a very respectable search engine, so that’s all good. Of course, it didn’t take me long to discover that the sunspot_rails plugin makes things drop-and-go when using ActiveRecord, but those of us branching off into alternatives have to put in more effort. Hence, I’ll document my findings here to hopefully make it easier for others.

I won’t bother going into the details of getting things set up, as the Sunspot wiki does a fine job of that. Suffice it to say that we install the gem (and the sunspot_rails gem if you’re going to have some ActiveRecord models as well), start the Solr server, and that’s about it. We’ve got Redis already going, right? So now it’s time to get our model indexed and searchable!

There are a few steps that we need to follow to make this happen. First, we put code in the model to tell Sunspot what fields should be indexed, which ones are just for ordering/filtering, and which ones should be stored if desired for quicker display:

First, note that we need to require 'sunspot' to get access to the Sunspot module. This isn’t needed for ActiveRecord models, but since we’re on our own, we have to do it explicitly. Then we call Sunspot.setup, passing our model class. In the code block, we specify a few text fields: the number, title, excerpt, and authors. Those fields will be indexed and searchable. Then we specify title and number again as strings, asking that they be stored for quicker retrieval. This is so we can display just that data without fetching the whole object, if we want. I won’t get into the details of doing that here because fetching the objects from Redis is so fast that I found it didn’t matter. Last, the publication date is also listed, so we can filter and order by it if we want.

In our save() method, after we store a book in Redis, we tell Sunspot to index it and then commit the updated index. So far, so good. In theory, we should be able to create a Book, save it, and then search for it. If this were an ActiveRecord model we’d be pretty much done (and wouldn’t even need the index/commit part, because those are triggered automatically on create and update). Unfortunately, we have some harder work ahead of us.

Sunspot uses what it calls “adapters” to tell it what to do when it wants to identify an object, and when it wants to fetch an object given an id. We have to provide the adapters for our model. To give credit where it’s due, this Linux Magazine article helped me figure out what to do, and then reading through the Sunspot adapter source code filled in the blanks. If you look back at our model, you’ll see that it requires ‘sunspot_helper’. That’s where we’ll put our adapters:

So, what’s going on here? We provide two adapters for Sunspot: the InstanceAdapter, and the DataAccessor. The InstanceAdapter just provides a method that returns the ID of the object. Easy enough, we just return the book’s number, which is the unique identifier. The DataAccessor has to provide two methods, load() and load_all(), that take an id and a list of ids, respectively, and expect objects back. In my case, the objects are serialized JSON, so we just call our find_by_number() method to get each object, call JSON.parse() to get the Hash of data, and construct a new Book object. (Note: obviously this requires having an initializer that can take a Hash and create the object, which I’ll leave as an exercise) Now we just register our adapters, by adding a couple of lines of code right before the call to Sunspot.setup():
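As a rough sketch of the load()/load_all() contract described above, here is a version that runs without Sunspot or Redis. The STORE hash stands in for Redis, the Book attributes and the Hash-taking initializer are assumptions for illustration, and in the real thing the accessor would subclass Sunspot's DataAccessor rather than stand alone:

```ruby
require 'json'

# Stand-in for Redis (assumption): book number => serialized JSON string.
STORE = {
  '101' => JSON.generate('number' => '101', 'title' => 'Test Driven Search'),
  '102' => JSON.generate('number' => '102', 'title' => 'Redis by Example')
}

class Book
  attr_reader :number, :title

  # Hypothetical initializer that builds a Book from a parsed JSON Hash.
  def initialize(attrs)
    @number = attrs['number']
    @title  = attrs['title']
  end

  # Returns the raw JSON string for a book number, or nil if it is missing.
  def self.find_by_number(number)
    STORE[number.to_s]
  end
end

# Provides the same two methods a Sunspot DataAccessor has to provide.
class BookDataAccessor
  # Takes one id and returns a Book (or nil if the id is unknown).
  def load(id)
    json = Book.find_by_number(id)
    json && Book.new(JSON.parse(json))
  end

  # Takes a list of ids and returns the Books that could be found.
  def load_all(ids)
    ids.map { |id| load(id) }.compact
  end
end
```

Skipping ids that are no longer in the store is a choice made for this sketch; an index that is out of date with Redis would otherwise hand nil entries back to the view.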

Yes, Sunspot is so cool that it integrates automatically with will_paginate. So, looking through the above, we have a form that posts to our action (assuming you set the routes up, which you did, yes?). The action then takes the searchterm parameter if it’s there, or pulls it from the session if it’s not. Note that this is not robust code: if it’s called with no parameter and nothing in the session, it will end up searching for an empty string, which will return every book. In any case, we store the search term in the session, so that when someone clicks through to page 2, we can re-run the search to get the second page. The more important code here, though, is the call to search.
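One way to close the empty-search hole mentioned above is to resolve the term in a small helper before searching. This is only a sketch: the helper name resolve_search_term is made up for illustration, and params and session are plain Hashes standing in for the Rails objects.

```ruby
# Returns the term to search for, or nil when there is nothing to search on.
# params and session are plain Hashes here (assumption), keyed by :searchterm.
def resolve_search_term(params, session)
  term = params[:searchterm]
  # Fall back to the remembered term when the form did not send one.
  term = session[:searchterm] if term.nil? || term.strip.empty?
  return nil if term.nil? || term.strip.empty?

  session[:searchterm] = term # remember it so page 2 can re-run the search
  term
end
```

With this in place, the action can skip the search (or show an empty result) when the helper returns nil, instead of asking Solr for every book.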

I will give a thousand thanks to this blog post, specifically the fourth item! I was doing this:

search = Sunspot.search(Book) do
  keywords @search_term
end

And it didn’t work: it was fetching every object, even though I knew that @search_term was getting set properly. As that blog post notes, though, a block that takes no arguments is evaluated in a new scope (Sunspot runs it against its own query object), so @search_term inside the block is not the controller’s instance variable and comes back nil. The code I showed above, which takes the query as a block argument, fixes that problem. It certainly took me a while to figure that out, though, because nothing is said about it anywhere in the examples in the Sunspot wiki.
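The scoping problem is easy to reproduce in plain Ruby. The sketch below does not use Sunspot at all: MiniQuery, mini_search, and SearchController are hypothetical stand-ins that mimic a DSL which uses instance_eval for blocks without arguments and a plain call for blocks that take the query object.

```ruby
# Hypothetical stand-in for a search DSL object; not Sunspot's real class.
class MiniQuery
  attr_reader :term

  def keywords(term)
    @term = term
  end
end

# Runs the block the two ways a DSL can: instance_eval for blocks with
# no arguments, a plain call for blocks that take the query object.
def mini_search(&block)
  query = MiniQuery.new
  if block.arity == 1
    block.call(query)
  else
    query.instance_eval(&block)
  end
  query
end

class SearchController
  def initialize(term)
    @search_term = term
  end

  # Inside instance_eval, self is the MiniQuery, so @search_term is nil.
  def broken
    mini_search { keywords @search_term }.term
  end

  # With a block argument, self is still the controller.
  def working
    mini_search { |query| query.keywords @search_term }.term
  end
end
```

Calling SearchController.new("test").broken gives nil, while working gives "test". A nil keyword is what makes the search match everything.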

So now you should be all set. Put “test” into the form, submit it, the controller will do the search, return the book, and your view will list it. You are searching! Not so bad, and the fetches from Redis are so fast that the whole thing really speeds along. Pretty simple free-text search against any objects that you put into Redis.

A Warning

I had one other hitch when I was working on this, which mysteriously went away. I hate that. So, in case someone else encounters it, I wanted to document the issue here. When I got the adapters in place for the Book model and tried to work with it, I got an error saying that there was no adapter registered for String. I was very puzzled, wondering if something about the fact that Redis was returning a JSON String was confusing Sunspot. So I made a quick change to the InstanceAdapter:

class InstanceAdapter < Sunspot::Adapters::InstanceAdapter
  def id
    if @instance.is_a?(String)
      @instance # workaround: a String was passed in, so use it as the id
    else
      @instance.number # return the book number as the id
    end
  end
end

And that did the trick. I didn’t like it, and intended to try to figure out what was going on. But after getting all the rest of it working, when I put the code back to its pre-String-adapter state, the error didn’t return. Like I said, I hate that. Hopefully it was just due to something that I was unknowingly doing wrong which I fixed along the way, but…just in case, now the quick-fix is documented here for anyone else who runs into the problem.

If your load(id) or load_all(ids) method returns a String instead of an instance of your adapted class, you can get the “no adapter registered for String” error. I got it by setting some attributes as the last statement in the load method and relying on Ruby to return the result of the last statement. So load returned the string attribute rather than the loaded instance of the adapted class.

You can confirm whether this is your case from the Rails console like this:
> search = Sunspot.search(AdaptedClass) do keywords "find this" end
> one_hit = search.hits.first
> data_accessor = search.data_accessor_for(AdaptedClass)
> r = data_accessor.load(one_hit.primary_key)
> r.class
If r.class comes back as String rather than AdaptedClass, your load method is returning the wrong thing.
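The implicit-return pitfall can be shown without Sunspot. In this sketch the classes and attribute names are made up for illustration; the only point is what each load method returns.

```ruby
# Hypothetical adapted class with two plain attributes.
class AdaptedClass
  attr_accessor :id, :label
end

# Buggy: the last statement is an assignment, so load returns the String.
class BuggyAccessor
  def load(id)
    record = AdaptedClass.new
    record.id = id
    record.label = "record #{id}"
  end
end

# Fixed: the instance itself is the last expression in the method.
class FixedAccessor
  def load(id)
    record = AdaptedClass.new
    record.id = id
    record.label = "record #{id}"
    record # return the instance explicitly
  end
end
```

BuggyAccessor.new.load(7).class is String, which is the kind of object Sunspot then has no adapter for; FixedAccessor.new.load(7).class is AdaptedClass.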