Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

What are some database design gotchas to watch out for when you know the database will be accessed using a Object Relational Mapper (ORM) Wikipedia? Also see Entity Framework NHibernate or LLBLGenPro.

As an example, I'll note the 2100 parameter limit on SqlServer's RPC call. This is a problem when using LLBLgen and joining tables when compound primary keys were utilized see MSDN article for compound keys.

No offense, but "joining tables with more than 1 primary key"..this redefines one of the basic rules of a table, a single PK per table. What exactly did you mean?
–
MarianJan 19 '11 at 18:28

How does this hit the 2100 parameter limit? Do you have tables with that many columns? Also, I presume you mean "Object-Relational Mapping" and not "Object-Role Modeling"
–
John SaundersJan 20 '11 at 3:35

The 2100 parameter limit is sometimes an issue if you pass a large list of values to a SQL 'IN' ( LINQ array.Contains(member) ). However, that doesn't really have anything to database design; that is more of a query pattern issue. The easiest workaround for ORMs that 'suffer' from that is to insert the list into a [permanent] temporary table and join to that and/or use a subquery that select the items to be queried from wherever they originate. (passing thousands of items to a SQL 'IN' is generally bad anyway, whether you're using an OR mapper or not).
–
KristoferAJan 20 '11 at 3:40

4 Answers
4

Go through standard normalization techniques, mainly to 3rd normal form. Some ORMs can pick up on the relationships defined at the database level so you won't have to manually.

I would really go the other way and ask what should I know about ORMs. Make sure you know:
- How do I profile the queries?
- How can I override the ORM and write my own query?
- Can I use stored procs instead?

The database should be ORM agnostic since, more than likely, it will outlast the app.

Never let the ORM create or update the schema. Always use its generated SQL as a template but go through it before making the change (I have seen our ORM make redundant indexes, use bad data types...all that bad stuff)

Be aware of what your ORM cannot do. Django for a while did not support group by through the ORM which was a pain (still is, legacy systems fun!). So being aware of the ORM's limitations is a must for the DBA even if they are not the one writing the code for the ORM

Periodically, grab your slow logs, aggregate them through mysqlslowlog, and look at the top 10 or so. Basically it will show you the queries that took the most aggregate time overall. A query that only took on average 1 sec but ran 50K times in the past month would stick out on that aggregate report like a sore thumb ;)

I would not say go as extreme as to dump the ORM completely. If your developers using the ORM are not DBA's, they are likely to write SQL that is as bad or worse than the ORM. So eventually it will fall on the DBA to watch what goes on.

There are a few things I notice right away about a legacy database that differs from a database created by ORMs like Sequel and ActiveRecord for Ruby.

The use of a primary key in all tables called 'id'. Yeah, that can be overridden, but id is the default.

The use of plural names for the tables.

The use of underscores ("snakecase") to separate the words creating the descriptive name of the tables and fields.

The use of _id appended to foreign keys: The layout is typically like referenced_table_id.

I've been writing my SQL by hand for years because I didn't trust an ORM, but over the last two or three I've started using the two previously mentioned ORMs, and am impressed by how well they integrate with existing databases, and how well they can decipher a database if their conventions are followed, or I take the time to give them the hints needed to understand a legacy DB.

I don't know if there are any common guidelines for designing schemas to play nice with ORMs, but I think we'd all benefit by having some standards. There's definitely a time and place for an ORM, especially if you're using a good OO language and you're on a tight schedule.

FYI: there are tools to deal with naming convention differences for many OR mappers. I have an add-in for Visual Studio that takes care of that for Entity Framework and Linq-to-SQL with rule-based naming; see huagati.com/dbmltools for more details on that.
–
KristoferAJan 20 '11 at 5:33

Id is a SQL antipattern for naming the id field and should not be used.
–
HLGEMMar 5 '13 at 18:39

Care to explain why? How do we handle relationships between tables when we need to do lookups like one-to-many and many-to-one? In the ORMs I use not using ID is definitely going against the flow.
–
the Tin ManMar 5 '13 at 18:59

It depends ( :) ) a bit on what OR mapper you're using, so spend some time researching what db features the OR mapper in question support/don't support.

E.g. Microsoft's OR mappers don't support all of SQL Server's built-in datatypes, don't support some of the newer/advanced TSQL features (recursive queries, optimizer hints etc spring to mind).

In theory, a good OR mapper should be flexible enough to overcome (and allow you to map) a well designed relational database schema to a good object model. In reality, we still have a bit to go before all pieces of the puzzle are in place; although many OR mappers support advanced mapping it often comes at the expense of complex queries and performance issues.

For good db performance (and to preserve the dba's sanity :) ) you should still follow best practices when it comes to db schema design; normalize first and denormalize where [/if] necessary. On the code-side, don't go overboard with your object model; even if the OR mapper support complex inheritance models and entities that merge many tables together, these are also the areas where you risk running into trouble with overly complex queries hitting the database etc. Profile, profile, profile and don't just take the ORM generated queries for granted. Keep in mind that OR mapper generated queries can often be tweaked just like normal SQL queries and that two functionally equivalent queries on the object side (e.g. linq queries) can sometimes result in vastly different SQL queries.