Short story
I have a technical problem with a third-party library at my hands that I seem to be unable to easily solve in a way other than creating a surrogate key (despite the fact that I'll never need it). I've read a number of articles on the Net discouraging the use of surrogate keys, and I'm a bit at a loss if it is okay to do what I intend to do.

Long story
I need to specify a primary key, because I use SQLAlchemy ORM (which requires one), and I cannot just set it in __mapper_args__, since the class is being built with classobj, and I have yet to find a way to reference the field of a not-yet-existing class in the appropriate PK definition argument. Another problem is that the natural equivalent of the PK is a composite key that is too long for the version of MySQL I use (and it's generally a bad idea to use such long primary keys anyway).

Actually, as I understand it, using a surrogate key(e.g. sequence) is the best practice.
–
Tom TuckerSep 14 '10 at 21:19

AFAIK it is not. Using a surrogate key allows duplicates to be created when using a natural key would have prevented such problems. Please explain your point if I'm thinking in the wrong direction.
–
dpqSep 14 '10 at 21:22

2

When it comes to your logical model, your primary key should almost always be a natural key. When it comes to your physical model, meaning what's actually used at the database level, your primary key should always be a surrogate key with a unique constraint on your natural key.
–
RandolphoSep 14 '10 at 21:47

"Using a surrogate key allows duplicates to be created" only if you forgot to include a unique index on the natural key. A common error.
–
S.LottSep 15 '10 at 2:29

Don't add TODOs if you have no intention of ever doing them. If you think it's a waste of effort now, it'll be a waste of effort in the future, and all you're doing is adding to the list of garbage you'll have to waste through every time you egrep 'TODO|XXX|FIXME'.
–
Glenn MaynardSep 14 '10 at 23:46

1

Add TODO's to absolve your conscience because you have some guidance from based on "a number of articles on the Net" that you think is important but is clearly inappropriate in this case.
–
S.LottSep 15 '10 at 2:30

When I get stuck finding the "right" way to do something, I will often DoTheSimplestThingThatCouldPossiblyWork, and come back later during a refactoring cycle to determine whether "right" way is actually necessary. I use TODO: merely because "TODO:" is a convenient way in virtually all IDEs to add visibility to a line of source code. If REVIEW_FOR_ACADEMIC_RIGOR: suits you better, go for it! :)
–
SethSep 15 '10 at 3:20

"Using a surrogate key allows duplicates to be created when using a natural key would have prevented such problems" Exactly, so you should have both keys, not just a surrogate. The error you seem to be making is not that you are using a surrogate, it's that you are assuming the table only needs one key. Make sure you create all the keys you need to ensure the integrity of your data.

Having said that, in this case it seems like a deficiency of the ORM software (apparently not being able to use a composite key) is the real cause of your problems. It's unfortunate that a software limitation like that should force you to create keys you don't otherwise need. Maybe you could consider using different software.

That's not a very practical principle, as so few ORMs support composite primary keys you'd be limiting yourself to a tiny set of software. The next time you hit a software limitation and apply this principle, the intersection of the sets leaves you with no selections at all.
–
Glenn MaynardSep 14 '10 at 23:52

"deficiency of the ORM software (apparently not being able to use a composite key". Hardly. It's a deficiency in use of the ORM for failing to declare a unique index on the composite natural key. Many ORM's support "unique" declarations outside the surrogate key.
–
S.LottSep 15 '10 at 10:06

I use surrogate keys in a db that I use reflection on with sqlalchemy. The pro is that you can more easily manage the foreign keys / relationships that exists in your tables / models. Also, the rdbms is managing the data more efficiently. The con is the data inconsistency: duplicates. To avoid this - always use the unique constraint on your natural key.

Now, I understand from your long story that you can't enforce this uniqueness because of your mysql limitations. For long composite keys mysql causes problems. I suggest you move to postgresql.