Tagging with Oracle and Ruby on Rails

by Matt Kern

Clear the confusion and add tags and tag clouds to your Oracle+Rails application.

Published June 2007

Social computing has taken the Internet by storm in the last few years and one of the signatures of the trend has been the notion of "tagging." While tagging is certainly not new, its latest incarnation is novel in its application—at least as far as the Web application world is concerned. Sharing tag data has allowed users of the latest round of Web applications to search for and share data like never before.

Tagging by itself won't add incredible functionality to your Rails application, but the features you build on top of tagging can add a layer of richness to your users' experience. You might find yourself quickly becoming addicted to new features that leverage your simple little strings.

This article will show you just how easy it is to add tag functionality to your site through the use of the acts_as_taggable_on_steroids plugin.

Confusion Abounds

Believe it or not, the hardest part of adding tags to your Rails application is figuring out which of the libraries to use. There are a number of competing implementations of the acts_as_taggable idea. At the time of writing there are at least four advertised ways to implement tagging in a Rails application:

acts_as_taggable gem

acts_as_taggable plugin

acts_as_taggable_on_steroids plugin

has_many_polymorphs plugin

Let's review each of these options and try to shed some light on what can be a very confusing situation.

acts_as_taggable gem. This gem was the first of the implementations written for Rails, and as such it's really meant for older versions of Rails (the 1.0 era). It will still work on the current Rails versions (1.1, 1.2, and Edge Rails, "Edge Rails" being a fancy term for running the very latest, or HEAD, version of Rails). The gem has one significant weakness that makes it a non-starter for most Rails applications: it requires a separate join table for every model you want to tag. If you only need to tag one model, this may be the simplest solution. You can tag more than one model, but you must create a separate join table and association for each model tagged, which adds a significant amount of overhead.

acts_as_taggable plugin. This plugin (as opposed to the gem) was written by David Heinemeier Hansson, the creator of Rails. It was implemented using some of the more advanced features available in the newer versions of Rails, like has_many :through and polymorphic associations. It's a good start, but Hansson himself admits that the plugin is only half-baked, and the core Rails team has opted not to apply many of the patches that have been submitted by the Rails community. (There's even a reported SQL injection vulnerability that, to my knowledge, has not been patched in the official releases!) This is probably the easiest implementation to install and use, though, so it may be something you want to look at. It's accessible directly through the Rails SVN server, so all that's needed is to call:

$ script/plugin install acts_as_taggable

Unfortunately, according to the Rails core developers, the plugin was intended only as a proof of concept and there are quite a few issues with it.

acts_as_taggable_on_steroids plugin. Although the name sounds like additional functionality built atop the original acts_as_taggable plugin, it's really not. It is a rewrite of the original plugin that adds tests and some additional functionality. Adding to the confusion, the original author of the acts_as_taggable gem posted a blog entry called "Tagging on Steroids with Rails". Whew! (We'll focus on this plugin, since the "official" Rails plugin isn't intended for production use.)

has_many_polymorphs plugin. A relative newcomer on the block, has_many_polymorphs is a more abstract, very powerful plugin that can be used for adding tags to your models, although it's not built specifically for that purpose. I'll explain some of the problems that the has_many_polymorphs plugin solves later; for now, we'll favor the acts_as_taggable_on_steroids plugin because it is more straightforward and easier to use for developers new to Rails.

Polymorphic associations are an advanced Rails topic. Although the acts_as_taggable_on_steroids plugin uses them, it uses them as plumbing, so you don't need to fully understand them to follow along.

Installation

First, read my article "Guide to Ruby on Rails Migrations". In that article, readers learn how to use Rails Migrations to create a database schema for an online social music cataloging application called Discographr. You'll be building on that application for the rest of this article.

Let's try out the acts_as_taggable_on_steroids plugin. It's not included in the official Rails SVN repository, so you'll need to find it first. Rails comes with a handy script/plugin command that you can use to find the plugin you're looking for:

This migration adds two tables. The first, called tags, holds the actual tag names. The second, called taggings, associates a tag with a model. The plugin works by taking the value of the taggable_type field, deriving the table name from it (using the standard Rails magic of pluralize and singularize), and looking up the object by its primary key. For your reference, this migration was derived directly from the tests included in the plugin. Tests are one of several things that the acts_as_taggable_on_steroids plugin added to the DHH version.

Look in RAILS_ROOT/vendor/plugins/acts_as_taggable_on_steroids/test and you'll find all the tests written for the plugin. Tests serve as an excellent safety net: you can feel much freer to change your code, knowing that you can always run rake test and see what breaks. They also provide excellent usage examples for a library. It's almost as good as built-in documentation! So, you can tell exactly what the plugin requires to function by looking in the schema.rb file under the test directory.
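To make the taggable_type lookup concrete, here is a minimal plain-Ruby sketch with no Rails or database involved. The in-memory TAGS and TAGGINGS arrays, the stand-in Artist class, and the taggable_for helper are all hypothetical; they only imitate what ActiveRecord does behind the scenes and are not part of the plugin.

```ruby
# Hypothetical in-memory stand-in for a model class; this is NOT ActiveRecord.
class Artist
  attr_reader :id, :name

  @records = {}

  class << self
    attr_reader :records

    # Mimics a primary-key lookup such as Artist.find(1).
    def find(id)
      records.fetch(id)
    end
  end

  def initialize(id, name)
    @id = id
    @name = name
    self.class.records[id] = self
  end
end

# Rows as they might appear in the tags and taggings tables (made-up data).
TAGS     = [{ :id => 1, :name => "folk" }]
TAGGINGS = [{ :tag_id => 1, :taggable_type => "Artist", :taggable_id => 1 }]

# Resolve a tagging row to the object it points at: the taggable_type
# string names the class, and taggable_id is that object's primary key.
def taggable_for(tagging)
  Object.const_get(tagging[:taggable_type]).find(tagging[:taggable_id])
end

Artist.new(1, "Bob Dylan")

tagging = TAGGINGS.first
tag     = TAGS.find { |row| row[:id] == tagging[:tag_id] }
puts "#{tag[:name]} -> #{taggable_for(tagging).name}"
```

Because the type is stored as a string in each row, one taggings table can point at artists, albums, and songs alike, which is what lets a single plugin tag every model.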

Next, you need to run the migration to actually add the required tables to the schema:

Let's Play Tag

Now it's time to use the added functionality. The first thing you need to do is add an acts_as helper to your models. In Discographr you want all the models to be taggable, so you'll need to add a single line to each of the models. Rails uses acts_as helpers to add functionality or extend the ActiveRecord::Base class and subclasses. So, for example, acts_as_taggable adds tagging functionality to your models in a single call (theoretically), while acts_as_tree adds hierarchical tree functionality to your models. Start off by adding tags to the Artist model.

class Artist < ActiveRecord::Base
  acts_as_taggable
end

That simple declaration adds a ton of functionality to the Artist model. The first thing it does is create has_many relationships with the Tagging and Tag models. But "Wait!" you say. You've never defined either of those models! This is one of the features of using plugins. If you look in the plugin directory for your newly installed plugin, you'll notice two model files, tag.rb and tagging.rb. These models are included in your application by virtue of being in the plugin's lib directory. I'm going to issue a warning here, though: defining models in your plugins can be a handy thing to do, but it can cause problems as well, because a plugin's models can clash with models of the same name in the containing application.

If you've worked with Rails in the past, you know that the has_many, has_one, and has_and_belongs_to_many methods add an awful lot of methods to your models. Since the plugin relies on has_many and polymorphic associations, you get all that functionality by virtue of declaring your model as acts_as_taggable. You'll see those added methods in action later.
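If you're curious how a one-line declaration can add so much, the following plain-Ruby sketch shows the general shape of an acts_as-style class macro. Everything in it (the Labelable module, the acts_as_labelable macro, and the Base, Album, and Plain classes) is hypothetical and greatly simplified; it is not the plugin's code, only an illustration of the pattern.

```ruby
# A hypothetical, much-simplified "acts_as" macro; NOT the plugin's code.
module Labelable
  # When a class includes Labelable, give that class the macro below.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # The class-level macro: calling it mixes the instance methods
    # into whichever class made the call.
    def acts_as_labelable
      include InstanceMethods
    end
  end

  module InstanceMethods
    def labels
      @labels ||= []
    end
  end
end

# Stand-in for ActiveRecord::Base: every subclass inherits the macro,
# but only classes that call it gain the new instance methods.
class Base
  include Labelable
end

class Album < Base
  acts_as_labelable
end

class Plain < Base
end

album = Album.new
album.labels << "live"
puts album.labels.inspect
puts Plain.new.respond_to?(:labels)
```

The macro is available everywhere but costs nothing until a model opts in, which is why adding a single line to each model is all the setup a plugin like this needs.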

So, go ahead and add tags to the other models in Discographr. You want all of the models you've already defined to be taggable so you'll add the acts_as_taggable call to the Album and Song models:

Now that all your models are taggable, you can start using the tag information in your application. As you haven't really written any controllers at this point, you'll use the handy script/console to test-drive your newly taggable models. Fire up the console by issuing:

The << method (remember that in Ruby everything called on an object is a method, << included) was added to your Artist model when you added the acts_as_taggable helper. Recall that the helper adds a has_many :tags association to your model, and with it the append operator (method!). So, you've just added two tags to your "Bob Dylan" artist object.
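The point that << is an ordinary method is easy to verify in plain Ruby. In this small sketch, TagBag is a hypothetical stand-in for a tag collection (it is not a class from the plugin), and the tag names are made up.

```ruby
# Hypothetical stand-in for a tag collection; NOT a class from the plugin.
class TagBag
  attr_reader :names

  def initialize
    @names = []
  end

  # Defining << is no different from defining any other method.
  # Returning self allows chaining: bag << "a" << "b".
  def <<(name)
    @names << name unless @names.include?(name)
    self
  end
end

bag = TagBag.new
bag << "folk" << "sixties"   # operator syntax
bag.<<("folk")               # the same call written as a plain method call
bag.send(:<<, "protest")     # and dispatched by name

puts bag.names.inspect
```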

There's already an awful lot you can do with your application now that you've added the ability to tag your data! Take some time to browse the Rails API documentation and get familiar with all the methods that the association methods add to your models. They'll all work with your tags now, too!

But, it gets better. The acts_as_taggable_on_steroids plugin adds quite a few convenience methods you can use to make development even easier and quicker. The plugin is designed to replicate the functionality provided by the original acts_as_taggable plugin. This means that you can do things like:

Note the two artist objects returned there, Bob Dylan and The Flaming Lips. Add four or five more artists of your choice in the same way and make sure to tag them. Try to use the same tags a few times so that the tag cloud example we'll be working on next will make sense.
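If you want to reason about what a tag lookup returns without a database, a plain-Ruby stand-in can help. The sketch below uses made-up in-memory data; artists_tagged_with and tag_frequencies are hypothetical helpers that imitate the behavior of a lookup such as find_tagged_with and of a frequency count, and they are not the plugin's implementation.

```ruby
# Hypothetical in-memory data; the tag names are made up for illustration.
TaggedArtist = Struct.new(:name, :tags)

ARTISTS = [
  TaggedArtist.new("Bob Dylan",        ["folk", "favorites"]),
  TaggedArtist.new("The Flaming Lips", ["psychedelic", "favorites"]),
  TaggedArtist.new("Miles Davis",      ["jazz"])
]

# Stand-in for a lookup like Artist.find_tagged_with("favorites"):
# return every artist carrying the given tag.
def artists_tagged_with(tag)
  ARTISTS.select { |artist| artist.tags.include?(tag) }
end

# Count how often each tag is used; this is the raw data a tag cloud needs.
def tag_frequencies
  ARTISTS.flat_map(&:tags).each_with_object(Hash.new(0)) do |tag, counts|
    counts[tag] += 1
  end
end

puts artists_tagged_with("favorites").map(&:name).inspect
puts tag_frequencies.inspect
```

Reusing a tag across several artists is what gives the frequency counts some spread, and that spread is what a tag cloud visualizes.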

Forecast Calls for Clouds

One of the most common approaches to visualizing tag data in social and folksonomy-based applications is the tag cloud. You've seen these before on sites like Flickr and del.icio.us. A tag cloud is basically a visual representation of the tags present in a system, sized according to popularity (in other words, frequency). The acts_as_taggable_on_steroids plugin provides a convenient method for calculating tag frequencies called tag_counts. Unfortunately, the developer of the plugin must have had a MySQL bias, because the definition of tag_counts includes a find_by_sql call whose SQL mixes aggregate functions with columns that are not listed in the GROUP BY clause. MySQL tolerates this, but Oracle follows the standard and requires that every non-aggregate expression in the SELECT list be included in the GROUP BY clause. Thus tag_counts breaks even though Oracle is doing the right thing!

There's still hope, though; simply add the following to the end of the init.rb file found in the plugin at RAILS_ROOT/vendor/plugins/acts_as_taggable_on_steroids:

require File.dirname(__FILE__) + '/lib/acts_as_taggable_override'

Then create a file in RAILS_ROOT/vendor/plugins/acts_as_taggable_on_steroids/lib called acts_as_taggable_override.rb and put the following code in it:

That code overrides the existing tag_counts method and fixes several problems. First, it changes the quotes around #{name} to single quotes so that Oracle treats the value as a string literal, which avoids an ORA-00904: invalid identifier error. Second, it makes the SQL valid by adding the missing expressions to the GROUP BY clause. It also changes the code around the :start_at and :end_at options and removes the :limit option. Restart your console or application to make sure the changes are picked up; calling the method should now return successfully:
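The two Oracle fixes described above can be illustrated with a small query builder. This is a sketch under stated assumptions, not the override itself: tag_counts_sql is a hypothetical helper, the tag_count column alias is my own choice, and only the :at_least and :order options are handled.

```ruby
# Hypothetical helper, NOT the plugin's override: builds a tag-count
# query in a form that Oracle accepts.
def tag_counts_sql(class_name, options = {})
  clauses = []
  clauses << "SELECT tags.id, tags.name, COUNT(*) AS tag_count"
  clauses << "FROM tags INNER JOIN taggings ON tags.id = taggings.tag_id"

  # Single quotes make the class name a string literal. Inside double
  # quotes, Oracle would read it as an identifier and raise ORA-00904.
  # class_name is assumed to be a trusted model name, never user input.
  clauses << "WHERE taggings.taggable_type = '#{class_name}'"

  # Every non-aggregate column in the SELECT list is repeated here.
  clauses << "GROUP BY tags.id, tags.name"

  # Integer() rejects anything that is not a whole number.
  if options[:at_least]
    clauses << "HAVING COUNT(*) >= #{Integer(options[:at_least])}"
  end
  clauses << "ORDER BY tags.name" if options[:order] == "name"

  clauses.join(" ")
end

puts tag_counts_sql("Artist", :at_least => 10, :order => "name")
```

Listing tags.id and tags.name in both the SELECT list and the GROUP BY clause is the whole fix for the grouping error; the quoting change is independent of it.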

Hopefully in a future version this bug will be fixed. But until it is, don't forget to reapply these changes if you update to a newer version of the plugin.

The tag_counts method allows you to add conditions like :start_at, :end_at, :at_least, :at_most and :order. These allow you to make queries like:

Artist.tag_counts(:start_at => 7.days.ago) # retrieves tags added in the last 7 days
Artist.tag_counts(:at_least => 10)         # retrieves tags that have been used 10 or more times
Artist.tag_counts(:order => "name")        # orders the array of tags alphabetically by name

Now that you have a working tag_counts method you can use it to generate a tag cloud. But before you get started, create the necessary files by using the Rails generator script. Rails includes a generator called scaffold that provides all the needed files for basic CRUD operations on a given model. All the generator requires is that it be passed the name of the model you'll be scaffolding along with the name of the controller to be used. Go ahead and run the scaffold generator for the artist model and catalog controller now:

$ script/generate scaffold Artist Catalog

Now you have the files you need. Fire up the built-in server and see what was created:

$ script/server

Go to http://localhost:3000/catalog and you'll see the list view that you'll be working with from here on out.

To make your tag cloud method available to your entire application, put the following code in the RAILS_ROOT/app/helpers/application_helper.rb file. You could put it in the catalog_helper.rb file, but you'll probably want to use the tag_cloud method outside the catalog controller, too.

You probably noticed several calls to Math.log in that listing. This tag cloud method attempts to compensate for the "long tail/power law" phenomenon by using a logarithmic distribution to counter the power curve and balance out the distribution across font sizes. Without doing this you could very easily end up with just a couple of large tags and hundreds of small tags. Essentially, the logarithmic distribution magnifies the differences among the large number of low-popularity tags.

Next you'll have to add your CSS. As you used the scaffold command to generate a basic view you can add to the scaffold.css file under RAILS_ROOT/public/stylesheets:

That style sets the maximum font-size for your tag cloud. If you look back at the method definition for tag_cloud you'll see this: size = (((Math.log(count) - floor)/range)*66)+33. It sets the minimum font-size percentage to 33% and then adds the weighted size to the minimum, resulting in the highest frequency tags being closest to 100% and the lowest frequency tags closest to 33% of the font size you set in the tagCloud CSS class. The nice thing about this algorithm is that it allows for much more variability in the sizing of the tags than do most other tag cloud algorithms. That said, it's best suited for size-based tag clouds, not color-based ones. (I call the latter "heat maps").
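To see the sizing formula at work outside of Rails, here is a minimal plain-Ruby sketch. The tag_sizes helper is hypothetical, and its definitions of floor (the log of the smallest count) and range (the log of the largest count minus floor) are my assumptions about what those names mean in the formula; the tag names and counts are made up.

```ruby
# Hypothetical helper: maps tag => count to tag => font-size percentage
# using the formula quoted above. The meanings of floor and range are
# assumptions, and counts are expected to be 1 or greater.
def tag_sizes(counts)
  return {} if counts.empty?

  logs  = counts.values.map { |count| Math.log(count) }
  floor = logs.min
  range = logs.max - floor

  # If every tag has the same count the range is zero; fall back to 1.0
  # so the division is safe (all tags then get the minimum size).
  range = 1.0 if range.zero?

  sizes = {}
  counts.each do |tag, count|
    sizes[tag] = (((Math.log(count) - floor) / range) * 66) + 33
  end
  sizes
end

sizes = tag_sizes("folk" => 1, "rock" => 10, "jazz" => 100)
sizes.each { |tag, size| puts "#{tag}: #{size.round}%" }
```

With counts of 1, 10, and 100 the sizes come out at roughly 33%, 66%, and 99%. A linear scale would instead put the middle tag at about 39%, barely distinguishable from the smallest one, which is the imbalance the logarithm corrects.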

Finally, now that the plumbing is in place, you need to add the code to display the tag cloud in a view. Again, since you used the script/generate scaffold command, the necessary views and controller actions have been created for you. Let's go ahead and add the tag cloud to the list action and view. First, change CatalogController's list action in RAILS_ROOT/app/controllers/catalog_controller.rb to:

You've taken out the pagination that the scaffold added for you, just to keep things simple. You also added a call to the tag_counts method so that the view will have access to the tag frequency information. Remember that in Rails, views have access to any instance variables set in the controller action.
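The way instance variables reach a template can be imitated with Ruby's standard ERB library. The CatalogSketch class below is a hypothetical stand-in, not a Rails controller, and it evaluates the template against its own binding as a simplification; Rails instead copies the controller's instance variables into the view, but the visible effect is similar.

```ruby
require 'erb'

# Hypothetical stand-in for a controller; NOT a Rails class.
class CatalogSketch
  # The "action" only sets instance variables (made-up data here).
  def list
    @artists = ["Bob Dylan", "The Flaming Lips"]
    @tags    = ["folk", "rock"]
    self
  end

  # The template is evaluated against this object's binding, so the
  # instance variables set in the action are visible inside it.
  def render(template)
    ERB.new(template).result(binding)
  end
end

html = CatalogSketch.new.list.render(
  "<%= @artists.size %> artists, tags: <%= @tags.join(', ') %>"
)
puts html
```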

So, now that the controller has the data needed to create the tag cloud, finish up with the view. Open up the default view for the list action at RAILS_ROOT/app/views/catalog/list.rhtml. list.rhtml was automatically generated by the scaffold command along with the catalog controller. Add the following to the end of the list.rhtml file (after the last line):

Also, go ahead and remove the pagination links that the scaffold command put in the view. Delete these lines toward the end of the file; otherwise you'll get an exception, since @artist_pages no longer exists in your list action:

Finally you need to add the show_tag action that is referenced as the action to which your tags link in the cloud. Add this last action to the end of the RAILS_ROOT/app/controllers/catalog_controller.rb file:

You don't need to create a view for this action, as you'll just use the same view as the list action does. The only difference is that the link_to call in the view passes tag.name as a parameter, so that only the artists carrying the tag you clicked on are displayed. The show_tag action then hands that parameter to the Artist.find_tagged_with(params[:id]) method call. Simple.

Now going to http://localhost:3000/catalog/list should display a list of Artists with CRUD functionality along with a tag cloud beneath it:

Clicking on the tags should display a filtered list of artists by tag with the same tag cloud below.

That is but one example of the experience that tags can bring to your application. Tag clouds can convey a great amount of meaning in a very simple format—the mark of a powerful user interface.

Bringing it Home

There are quite a few shortcomings to the acts_as_taggable_on_steroids plugin, as you've seen already. As of this writing, the has_many_polymorphs plugin has recently started to gain acceptance as a more powerful replacement for the acts_as_taggable plugins. As is the case with many solutions that gain flexibility and power, it is a bit more abstract than the plugins we've looked at in this article. As such, it is not as much of an out-of-the-box, straightforward solution for tagging, but it does an excellent job of avoiding many of the inherent challenges of the acts_as_taggable_on_steroids plugin: potential model definition clashes, incompatibility with Oracle without tweaking, inflexible tag separators, and, perhaps most important, the inability to easily query across models for a given tag.

That said, the acts_as_taggable_on_steroids plugin is a very powerful and simple-to-use extension for any Rails application.

Matt Kern has been searching for and developing ways to make life easier through technologies like Rails for years—mostly an attempt at finding ways to spend ever more time roaming the mountains of Central Oregon with his family. He is the founder of Artisan Technologies Inc. and co-founder of Atlanta PHP.