Entity Framework and ORMs: Understand the Trade-Offs

There's no doubt that the ADO.NET Entity Framework and other object-relational mapping (ORM) frameworks provide tangible productivity benefits for developers. Yet I keep bumping into ugly situations in which Entity Framework applications simply can't scale or perform well when lots of data and users are concurrently thrown into the mix. In my experience, that doesn't mean that ORMs can't scale. Instead, it means that too many organizations and developers are intently focused on short-term productivity benefits at the expense of long-term concerns and costs.

Entity Framework Adoption

As a full-time SQL Server consultant, I'm seeing the Entity Framework crop up everywhere. I think this is happening for two primary reasons. First, there's nothing more tedious and ugly for developers than defining parameter after parameter for stored procedures in their data tiers; the process typically involves copy, paste, and tweak techniques that are a fantastic breeding ground for costly bugs that require significant troubleshooting. Second, the Entity Framework ships as a part of Microsoft's development stack, and the company is doing a great job of evangelizing the stack to developers who are sick of manually defining database interactions.

Although I personally hate the Entity Framework (I find it bloated and don't understand why it attempts to do so many things at runtime instead of through generated code), I realize that it brings some substantial benefits to the table. First and foremost, the Entity Framework decreases the potential for SQL injection attacks. Granted, ORMs don't make SQL injection attacks impossible, but their heavy reliance on parameterization means that developers who don't know or aren't inclined to learn SQL will typically do far less damage with these frameworks than they would by writing database interactions from scratch. Heavy reliance on parameterization also yields performance benefits in SQL Server environments compared to using ad-hoc, in-line generated T-SQL code, because parameterized statements are far more likely to reuse cached execution plans. Also, there's simply no denying the huge increase in developer productivity that comes with the use of ORMs.
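To make the parameterization point concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. It is not Entity Framework or SQL Server code; the table, the function names, and the sample data are all hypothetical and exist only to contrast SQL built by string concatenation with a parameterized statement.

```python
import sqlite3


def find_users_unsafe(conn, name):
    # Anti-pattern: the caller's text is spliced into the SQL statement itself.
    query = "SELECT id, name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_users_parameterized(conn, name):
    # The value is passed separately from the SQL text through a placeholder,
    # so it is always treated as data, never as SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


def make_demo_db():
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


conn = make_demo_db()
malicious = "nobody' OR '1'='1"

unsafe_rows = find_users_unsafe(conn, malicious)        # condition is rewritten by the input
safe_rows = find_users_parameterized(conn, malicious)   # input is compared as a literal name

print(len(unsafe_rows), len(safe_rows))  # prints: 2 0
```

The concatenated version returns every row because the input changes the WHERE clause, while the parameterized version finds no user with that literal name. An ORM generates statements of the second kind by default, which is where most of the protection comes from.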

The Entity Framework and ORMs Aren't Evil

Although ORMs can do dumb things that can cause performance problems when put under load, the reality is that developers who don't use ORMs are just as likely to create performance, scalability, and concurrency problems when creating code from scratch. Stated differently, developers who don't take the time to adequately understand best practices for database architecture, concurrency, and performance will suffer from performance and concurrency problems in production when their systems come under load, with or without an ORM. (However, I've had some serious belly laughs while looking at some of the convoluted code that the Entity Framework has generated for some types of queries.) I've touched on some of these realities and have recommended solutions in my SQL Server Pro article "Troubleshooting Performance Problems in Entity Framework Applications."

The Entity Framework and ORMs Aren't Magical Either

The problem with many ORMs (including the Entity Framework) is that they commonly encourage developers to bind application logic and functionality directly to physical schema rather than through abstractions or additional interfaces, such as views and stored procedures, which can provide an additional degree of flexibility. Binding directly against tables (as opposed to projections) is the root of many performance problems that come from using ORMs. In addition, these performance problems are much harder to address when they occur. Stated differently, if DBAs find a performance problem in a poorly written stored procedure, they can transparently rewrite the procedure while preserving its interface (its parameters and the shape of its output), which results in a net win for overall performance and concurrency.
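The value of binding to an interface rather than to tables can be sketched in a few lines. The example below uses Python and SQLite, and a view instead of a stored procedure because SQLite has no stored procedures; the table, view, and function names are hypothetical. The application function queries only the view, so the physical storage behind it can change without the function being touched.

```python
import sqlite3


def customer_totals(conn):
    # Application code depends only on the view's name and column names,
    # not on whichever tables happen to sit behind it.
    return conn.execute(
        "SELECT customer, total FROM customer_totals_v ORDER BY customer"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount INTEGER);
    INSERT INTO orders (customer, amount)
        VALUES ('acme', 10), ('acme', 5), ('zenith', 7);

    -- Original definition: aggregate the orders table on every call.
    CREATE VIEW customer_totals_v AS
        SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer;
""")
before = customer_totals(conn)

# Later, suppose a DBA swaps in a precomputed summary table and repoints
# the view at it. The view's columns stay the same.
conn.executescript("""
    CREATE TABLE customer_summary (customer TEXT PRIMARY KEY, total INTEGER);
    INSERT INTO customer_summary
        SELECT customer, SUM(amount) FROM orders GROUP BY customer;

    DROP VIEW customer_totals_v;
    CREATE VIEW customer_totals_v AS
        SELECT customer, total FROM customer_summary;
""")
after = customer_totals(conn)

print(before == after)  # prints: True
```

Had `customer_totals` queried the `orders` table directly, the same change would have required editing and redeploying the application code.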

If a performance problem lives in queries that are fully coupled to application code within a compiled assembly, the ability to transparently address it simply doesn't exist. In many cases, the application code has to be modified and recompiled, either to introduce a new 'seam' or stored procedure where corrections can be made, or to ship a more performant set of query definitions.

Consequently, I find it perplexing that developers spend so much time and effort on leveraging design patterns and abstractions that make their code much more robust and flexible, yet end up hard-coding their apps to tables in the database. Not only does such a poor choice in coupling make performance problems harder to deal with, but it also makes versioning and extensibility that much harder.

Developers Shouldn't Go Without an ORM

Granted, many applications might never be modified after they're deployed, and many others will never grow to a point where they need to worry about performance problems. For such applications, building with an ORM is a no-brainer, because concerns about versioning, performance, and extensibility are simply moot. Even beyond those cases, I wouldn't personally undertake any development effort today without an ORM (I use a customized version of PLINQO, which relies heavily on stored procedures), simply because of the productivity benefits that I'd lose.

My point is merely that ORMs aren't magical and they're not evil. Instead, ORMs are tools that can be used correctly or incorrectly. You'll have a better chance of long-term success if you evaluate ORM limitations and pain points before you embark on development efforts that are focused solely on the up-front productivity benefits that they provide.