Software Development: An Intro to ORMs

Let’s start this post with a couple questions. What is an ORM and why would you use one? If you don’t know the answer to both of these questions, then this post is for you. If you do know the answers, feel free to skip to the next post (after I’ve created it of course).

First, let’s look at what an ORM is. An ORM is an object-relational mapping. According to Wikipedia, which we all know is the infallible and definitive resource for all knowledge, an ORM is “a programming technique for converting data between incompatible type systems in object-oriented programming languages.” Ok, what the hell does that mean? My definition is a little less “scholarly.” I’d say that

An ORM is a way to abstract your data storage into simple class objects (POCOs if you’d like) so that you don’t have to worry about all the crap involved in actually writing data fields, rows, tables and the like. As a developer, I want to deal with a “Person” class. I don’t want to think about SQL statements. I don’t want to have to worry about indexes or tables or even how my “Person” gets stored to or read from disk. I just want to say “Save this Person instance” and have it done. THAT is what an ORM is. A framework that lets me concentrate on solving my business problem instead of spending days writing bullshit, mind-numbing, error-prone data access layer code.

Why would we use an ORM? I think I’ve been fairly upfront that I consider myself to be a lazy developer. No, I don’t mean that I take shortcuts or do shoddy work, I mean that I hate doing things more than once. I hate having to write reams of code to solve problems that have already been solved. I’ve been writing applications that consume data for years, and as a consequence, I’ve been writing data access code for years. If you’re not using an ORM, you probably know down in your soul that this type of work flat-out sucks. Anything that simplifies data access to me is a win.

Of course there are other, more tangible benefits as well. If you’re using an ORM, it’s often possible to swap out data stores – so maybe you could write to SQL Server, then change a line or two in configuration and write to MySQL or an XML file. It also allows you to mock things or create stubs to remove data access (really handy when someone tells you that your data access is what’s slowing things down when you’re pretty sure it’s not).
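To make the swap-and-stub idea concrete, here’s a minimal sketch. Every name in it (IPersonStore, InMemoryPersonStore, Roster) is made up for illustration and is not part of any real ORM. The point is only that the business code talks to an interface, so the thing behind it can be a database, a file or a plain in-memory stub.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names throughout; this is an illustration, not a real ORM API.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// The application depends only on this interface, never on SQL.
public interface IPersonStore
{
    void Save(Person person);
    IEnumerable<Person> GetAll();
}

// A stub that keeps everything in memory; handy for tests and for
// ruling data access in or out when chasing a performance problem.
public class InMemoryPersonStore : IPersonStore
{
    private readonly Dictionary<int, Person> m_people = new Dictionary<int, Person>();

    public void Save(Person person)
    {
        // Insert, or overwrite an existing entry with the same key
        m_people[person.Id] = person;
    }

    public IEnumerable<Person> GetAll()
    {
        return m_people.Values.ToList();
    }
}

// Business logic that works against any IPersonStore implementation.
public class Roster
{
    private readonly IPersonStore m_store;

    public Roster(IPersonStore store)
    {
        m_store = store;
    }

    public int HeadCount()
    {
        return m_store.GetAll().Count();
    }
}
```

Swapping the stub for a SQL-backed implementation would then mean changing only the line that constructs the store, which is the kind of flexibility I’m talking about.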

I can see that some of you still need convincing. That’s good – you shouldn’t ever just take someone’s word for it that they know what they’re doing. Ask for proof. Well let’s look at a case of my own code, why I built the OpenNETCF ORM and how I was recently reminded why it’s a good thing.

A couple years ago I wrote an application for a time and attendance device (i.e. a time clock) that, not surprisingly, stored data about employees, punches, schedules, etc. on the device. It also has the option to store it on a server and synchronize the data from clock to server, but that’s not core to our discussion today. The point is that we were storing a fair bit of data and this was a project I did right before I created the ORM framework. It was, in fact, the project that made it clear to me that I needed to write an ORM.

Just about a month ago the customer wanted to extend the application, adding a couple features to the time clock that required updates to how the data was stored. It took me very little time to realize that the existing DAL code was crap. Crap that I architected and wrote. Sure, it works. They’ve shipped thousands of these devices and I’ve heard no complaints and had no bug reports, so functionally it’s fine and it does exactly what was required. Nonetheless the code is crap and here’s why.

First, let’s look at a table class in the DAL, like an Employee (the fact that the DAL knows about tables is the first indication there’s a problem):

Sure, I get a few bonus points for using inheritance so that each table doesn’t have to do all of this work, but it’s still a pain. Adding a new Table required that I understand all of this goo, create the ColumnInfo right, know what the index stuff is, etc. And what happens when I need to add a field to an existing table? It’s not so clear.

Now how about consuming this from the app? When the app needs to get an Employee you have code like this:

Never mind the fact that this could be improved a little with GetFields – the big issues here are that you have to hard-code the SQL to get the data, then you have to parse the results and fill out the Employee entity instance. You do this for every table. If you change a table, you then have to go change the SQL and every method that touches the table. The process is error-prone, time-consuming and just not fun. It also makes me uneasy because the test surface area needs to be big. How do I ensure that all places that access the table were fixed? Unit tests help give me some comfort, but really it has to go through full integration testing of all features (since Employees are used by just about every feature on the clock).

Now what would the ORM do for me here? Without going into too much detail on exactly how to use the ORM (we’ll look at that in another blog entry), let’s just look at what the ORM version of things would look like.

We’d not have any “Table” crap. No SQL. No building Commands and no parsing Resultsets. We’d just define an Entity like this:

Note how much cleaner this is than the Table code I had previously. Also note that this one class replaces *both* the Table class and the Business Object class. So this is much shorter.
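To give a feel for the general shape of an attribute-marked entity, here’s a rough sketch. The EntityAttribute and FieldAttribute classes below are stand-ins I’ve defined so the snippet compiles on its own; the framework’s actual attribute names and options may differ, and the Employee members are invented for illustration. The small EntityInspector helper is also hypothetical. It just shows how a framework could discover what to persist through reflection, so you never have to describe columns by hand.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Stand-in marker attributes (assumptions, not the real framework types).
[AttributeUsage(AttributeTargets.Class)]
public class EntityAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Property)]
public class FieldAttribute : Attribute
{
    public bool IsPrimaryKey { get; set; }
}

// One class serves as both the "table" definition and the business object.
[Entity]
public class Employee
{
    [Field(IsPrimaryKey = true)]
    public int EmployeeID { get; set; }

    [Field]
    public string FirstName { get; set; }

    [Field]
    public string LastName { get; set; }

    // No [Field] attribute, so a framework would not persist this.
    public string DisplayName
    {
        get { return FirstName + " " + LastName; }
    }
}

public static class EntityInspector
{
    // Returns the names of the public properties marked with [Field].
    public static string[] GetFieldNames(Type entityType)
    {
        return entityType.GetProperties()
            .Where(p => p.GetCustomAttributes(typeof(FieldAttribute), true).Length > 0)
            .Select(p => p.Name)
            .ToArray();
    }
}
```

Adding a field to the entity in a design like this is a one-property change, rather than a hunt through hand-written SQL.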

What about all of that create table, insert, update and delete SQL and index garbage I had to know about, write and maintain? Well, it’s replaced with this:

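// Point the store at the database file
m_store = new DataStore(databasePath);

// Create the backing store on first run
if (!m_store.StoreExists)
{
    m_store.CreateStore();
}

// Register each entity type the store should persist
m_store.AddType<Employee>();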

That’s it. Adding another Entity simply requires adding just one more line of code – a call to AddType for the new Entity type. In fact the ORM can auto-detect all Entity types in an assembly with a single call if you want. So that’s another big win. The base class garbage gets shifted into a framework that’s already tested. Less code for me to write means more time to solve my real problems and less chance for me to add bugs.

What about the long, ugly, unmaintainable query though? Well that’s where the ORM really, really pays off. Getting all Employees becomes stupid simple.

var allEmployees = m_store.Select<Employee>();

Yep, that’s it. There are overloads that let you do filtering. There are other methods that allow you to do paging. Creates, updates and deletes are similarly easy.

Why did I create my own instead of using one that already exists? Simple – there isn’t one for the Compact Framework. I also find that existing ORMs, like many existing IoC frameworks, try to be everything to everyone and end up overly complex. Another benefit to the OpenNETCF ORM is that it is fully supported on both the Compact and Full Frameworks, so I can use it in desktop and device projects and not have to cloud my brain with knowing multiple frameworks. I even have a partial port to Windows Phone (it just needs a little time to work around my use of TableDirect in the SQL Compact implementation).

Oh, and it’s fast. Really fast. Since my initial target was a device with limited resources, I wrote the code for that environment. The SQL Compact implementation avoids using the query parser whenever possible because experience has taught me that as soon as you write actual SQL, you’re going to pay an order of magnitude performance penalty (yes, it really is that bad). It uses TableDirect whenever possible. It caches type info so reflection use is kept to a bare minimum. It caches common commands so that when SQL is necessary, it can at least reuse query plans.
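The type-info caching idea is easy to sketch. This is not the framework’s actual code, just a minimal illustration under my own assumptions: TypeInfoCache and its ReflectionCalls counter are hypothetical names. The reflection lookup runs once per type and every later request is a dictionary hit.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper; it illustrates the caching technique only.
public static class TypeInfoCache
{
    private static readonly Dictionary<Type, PropertyInfo[]> s_cache =
        new Dictionary<Type, PropertyInfo[]>();
    private static readonly object s_lock = new object();

    // Counts cache misses so the effect of the cache is observable.
    public static int ReflectionCalls { get; private set; }

    public static PropertyInfo[] GetProperties(Type type)
    {
        lock (s_lock)
        {
            PropertyInfo[] props;
            if (!s_cache.TryGetValue(type, out props))
            {
                // The expensive reflection call happens only on a cache miss
                props = type.GetProperties(BindingFlags.Public | BindingFlags.Instance);
                s_cache[type] = props;
                ReflectionCalls++;
            }
            return props;
        }
    }
}
```

On a resource-constrained device, paying the reflection cost once per entity type instead of once per row read or written is the kind of saving that adds up quickly.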

So that’s why I use an ORM. Doing data access in any other way has become insanity.