Ruby on Rails makes it easy to bootstrap a greenfield project. Unfortunately not all projects are greenfield, so sometimes you need to work with a legacy database schema. ActiveRecord is a great choice when you have a full control over your database from the very beginning, but what should you do if you need to connect to a database with a schema that isn’t in-line with Rails conventions? In this post I will explain how to setup a Rails application that uses DataMapper as the ORM. I’ll also guide you through configuring your models to work with a legacy schema. If you are not familiar with DataMapper I recommend reading the “Getting Started” section of the official documentation.
###A legacy schema
For the sake of this post let’s use an example schema of a blog application using MySQL:

column names use camel-case naming convention prefixed with a table name

post primary key is an md5 sum of concatenated values of “author” and “publishTime”

post => user foreign key has a custom name “author”

“postPublishTime” is stored as a string and we would like to use DateTime object

###Creating a rails application with DataMapper as ORM
This is actually a straight-forward task. All you need to do is to use a template provided by DataMapper when generating an application skeleton as follows:
rails new blog-legacy -m http://datamapper.org/templates/rails.rb

By default the skeleton uses sqlite as the database. To change that, simply edit Gemfile and change ‘dm-sqlite-adapter’ to ‘dm-mysql-adapter’ and run bundle install. To generate empty models run:

script/rails g model user
script/rails g model post

###Naming conventions
DataMapper gives you two ways of defining table and column names. You can either set it manually in every model or you can configure global naming conventions for all models.Here’s how to set table and field names explicitly in the model:

With many properties it would add clutter to our models, that is why it is better to set global naming conventions for resources and fields. Those conventions will be applied to every model and will eliminate the need to define storage names and set :field options for every property.

To set global naming conventions, we can create datamapper.rb initializer like this:

# config/initializers/datamapper.rb
Rails::DataMapper.configuration.resource_naming_convention[:default] = lambda do |name|
name.downcase.singularize
end
# note: this will be available in dm-rails 1.1.1
Rails::DataMapper.configuration.field_naming_convention[:default] = lambda do |property|
"#{property.model.name.downcase}#{property.name.to_s.camelize}"
end

Much better!
###Dealing with natural primary keys
In our example schema, a post’s primary key is an MD5 sum of concatenated fields “postAuthorId” and “postPublishTime”. If we want to have the value generated in Ruby we can use :default options in property declaration:

Now whenever you create a post and don’t provide :id it will be generated automatically based on the values of :author and :publish_time properties.

As you can see this doesn’t look very nice in the model. So, we could create a custom property with a generalized functionality to clean this up! This way we can re-use this code in different scenarios whenever we want an MD5 sum of concatenated fields. You might be surprised how easy this is to do:

How easy was that?!?
###Associations with custom foreign keys
Our database has a basic 1:M relationship between User and Post models. The only caveat is that foreign key on post table pointing to user id has a name that doesn’t follow our global naming conventions. To configure the association we must override our defaults:

Easy enough.
###Fine tuning - typecasting properties
Post :publish_time is a String property as values are stored in a VARCHAR column. Our goal is to work with DateTime objects, not strings. To achieve that we could simply override property accessor and do typecasting like this:

class Post
# …
def publish_time
DateTime.parse(super)
end
end

While this approach works, it is not a recommended approach. DataMapper is based on the concept that a model’s logic should be decoupled from the database schema. A much better and cleaner way is to abstract away that logic in to a custom property object. An extra benefit comes with that too - you can re-use that code in many applications as it’s not tied to a specific model. Here’s another approach:

###Further reading and resources
This post explains basics as well as some more advanced things you can do with DataMapper. If you would like to learn more, please take a look at the documentation, subscribe to the Mailing List or jump into the #datamapper IRC channel.

Piotr Solnica is a team member atCode Benders, an agile software collective based in Bend, Oregon. When Piotr’s not writing great code, he’s writing awesome code. He has been working as a web developer for 6+ years and has solid experience with Ruby-based technologies and frameworks. His proficiencies include a range of JavaScript libraries such as Prototype, jQuery, YUI and MooTools. Piotr is an unobtrusive JavaScript evangelist, an HTML/CSS purist and a linux geek. In his free time he supports OSS, is a DataMapper core contributor, an author of DataMapper plugins and adapters and one of the developers of rubytime.org and coderack.org.