Lately, with my friend and colleague Joseph, we made some experiments with the Sinatra Ruby Framework.

As part of the experiment we’ve chosen to stick with our DB of choice PostgreSQL and our preferred template engine HAML, but we decided to give the SequelORM a try as well as using Ruby 1.9.1. We also wanted to store our datas as UTF-8 (this part is the most painful of all).

Our first goal was to have a simple application tying everything together and testable with RSpec and Cucumber.

Using Sinatra and HAML is a snap, as it’s a core feature of Sinatra, just be sure to use HAML version >= 2.2.0 as it includes some work to support new Ruby 1.9 String Encoding.

Then comes Sequel, using it is as simple as requiring it and feeding it with database connection information, just be sure to set the :encoding => 'UTF-8' (cf : Sequel::Database.connect method).
Sequel is great in that it has adapters for most commons connectors, first we tried DataObjects’s do_postgres as it should support asynchronous query (and it does !), but we had to fall back to the PG one and even to a patched version.

Let me explain the problem here, and be warned it’s not limited to Sequel, but to any ORM using currently available db connectors, when using a charset different from ASCII-8BIT under Ruby 1.9.
What happens is that ORMs do not force any encoding on String returned by the database connectors even when you specified an encoding (commonly used to set the connection’s “client_encoding”). Current ruby connectors (under Ruby 1.9) do not use the database/client’s connection encoding as a “hint” to determine and set the encoding of returned values.
This is not a problem on Ruby 1.8, but on Ruby 1.9 you get some weird results, the string returned from db have a default encoding “ASCII-8BIT” (in fact default for BINARY), as ORMs do not force encodings this result in a String with the bad Encoding.
Try to display it on a page and you’re welcomed with friendly “incompatible character encodings: ASCII-8BIT and UTF-8.” messages or try to use Webrat with RSpec matchers and you get “incompatible encoding regexp match (UTF-8 regexp with ASCII-8BIT string)”.

So here are the libraries versions to use to have a working Sinatra, Sequel with Postgres, HAML, RSpec, Cucumber stack :

Sinatra >= 0.9.4

Rack-Test >= 0.4.1

Webrat >= 0.5.0

kamk-pg >= 0.8.0.3 (http://github.com/kamk/pg/tree/master)

Sequel >= 3.0.0

RSpec >= 1.2.8

Cucumber >= 0.3.94

Then you need to monkey-patch Rack (this is highly untested, it worked for my current app but it should not be used in production environment) :