SQL Dependency Injection

“All problems in computer science can be solved by another level of indirection, but that usually will create another problem.” — David Wheeler, inventor of the subroutine

This is the story of a “framework” that doesn’t exist: SqlDi

Most database frameworks’ authors consider that it is wrong to write SQL in your source code and provide many kinds of wrappers you need to deal with, instead of pure query strings.

Their reasons are along the lines of:

Portability: SQL is not database independent

Sadly, this is true. Although there are ANSI standards – ANSI SQL 99, 2003 etc. – no RDBMS supports all the features from ANSI and most vendors force you to use some of their custom extensions to accomplish things from the standard.

(For example, SQL 2003 introduces ‘window functions’ that could be used instead of the vendor specific ROWNUM, LIMIT or TOP keywords – and the window functions allow you to do even more. Few database systems support SQL 2003 though..)

Abstraction: SQL queries contain details about the storage mechanism

The relational model is a good way of assuring the integrity and non-redundancy of data. JOINs allow you to reverse the normalization process, while keeping the “Single Point of Truth” – storage model. INDEXes allow you to lookup faster.

Having SQL (in your source) that displays all these storage details is bad in practice. Even if your attributes (columns) are not subject to change, the way of retrieving them may be.

For example, if you want to display a (paginated) list of books on a web page, with their authors, a minimal query would be:

SELECT books.title as title, books.date_published as date_published, authors.name as author
FROM
books
INNER JOIN
authors
ON
books.author_id = autors.id
WHERE
ROW_NUMBER() OVER (ORDER BY books.title) BETWEEN 1 AND 10

This is both an example of SQL that reveals the storage internals and that is not portable (because ROW_NUMBER() is supported by very few ‘chosen’ ones)

Correctness: Query strings are validated only when run against the database

Even if you’re using an editor that knows how to highlight syntax from the SQL strings, you are still prone to mistakes, syntactical or not (e.g. wrong column name).

The usual solution is to write objects that construct your query – here’s an example for Hibernate’s Criteria API

As opposed to this ruby Starbucks language I found on the web, SQL has proper syntax rules, and everybody knows at least its basics.

So SqlDI will allow you to write a subset of SQL called SqlDi.

But how about the aforementioned Abstraction ?

In order to decouple the storage model from the application query, SqlDi uses a kind of remote-process inversion of control (RP/IoC) pattern where the actual storage-specific part of the query gets injected in your code by the RDBMS process (instead of your code depending on the storage layout).

Now, if some other time you discover that one book can have multiple authors, all you have to do is to re-write the storage-dependent part of the query.

But how does SqlDi provide Portability ?

SqlDi is a parser for a subset of SQL – you have already seen the SELECT statement syntax. The various functions dealing with numbers, dates and strings, even the operators (such as string concatenation) are defined one way or another in SqlDi and translated into the nativedialect.

For example, the LIMIT keyword is translated into ROWNUM, LIMIT, TOP or ROW_NUMBER() depending on the used database.

This translation is accomplished in two steps:

First, the query is tokenized and parsed, and an object which represents the root of the Abstract Syntax Tree is constructed.

Second, the AST spills out the native dialect – based on some form of run-time pluggable dialect writers (plus ‘drivers’)

In this way you’ll never need to change the “SELECT x FROM y WHERE …” from your source, even if you change the used database. (You do, however, need another schema creation script).

Yeah, but how about the compile-time Correctness ?

This is the most interesting part.

SqlDi is implemented in D (because I don’t know LISP), and the first step (parsing) is accomplished at compile-time, through the use of mixin() statement. The output of parsing is code, like this: