Database - Slave or Master? 3 of 3 - Integration

This story begins with an effort to store Moose classes in a
Tangram store. Specifically, converting from
Moose::Meta::Class objects to a Tangram::Schema
structure.

The structures are already quite similar. In the Tangram schema,
you have a per-class map of (type, name, (details...)).
In Moose::Meta::Class, you have a map of (attribute,
(details...)), where the details include a type
constraint. Based on the type constraint, you can guess a reasonable
type. Well, not quite. The next thing you really need is
Higher Order Types on your type constraints (called
parametric roles in the Perl 6 canon). In a nutshell,
that's not just saying there's an Array somewhere, but saying
there's an Array of something. Then you can make
sure that you put an actual foreign key or link
table at that point in the schema, rather than the
oid+type pair that you get with Tangram when you use
a ref column (and, in recent versions, without specifying a
class). Getting parametric roles working in Moose is still
an open question, but certainly one I hope to find time for.
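To make the difference concrete, here is a minimal sketch of guessing a storage mapping from a type constraint name. It assumes constraints are written as plain strings in an `ArrayRef[Some::Class]` style; the `guess_mapping` function, the `%scalar_type` table and the `My::` class names are all hypothetical, and none of this is Tangram's or Moose's actual API.

```perl
use strict;
use warnings;

# Hypothetical table from scalar type constraint names to column types.
my %scalar_type = (
    Str => 'string',
    Int => 'int',
    Num => 'real',
);

# Guess a storage mapping from a type constraint name (a plain string).
sub guess_mapping {
    my ($constraint) = @_;

    # Simple scalar: one ordinary column.
    return { type => $scalar_type{$constraint} }
        if exists $scalar_type{$constraint};

    # An Array *of something*: the member class is known, so a real
    # link table (or foreign key on the member) becomes possible.
    if ( $constraint =~ /^ArrayRef\[([\w:]+)\]$/ ) {
        return { type => 'array', class => $1 };
    }

    # A bare Object could be anything: fall back to an oid + type pair.
    return { type => 'ref' } if $constraint eq 'Object';

    # A bare ArrayRef says nothing about its members: no sensible guess.
    return undef if $constraint eq 'ArrayRef';

    # Otherwise assume the constraint names a class: a typed reference,
    # which can become a proper foreign key.
    return { type => 'ref', class => $constraint };
}

for my $constraint (qw(Int Object My::Artist ArrayRef[My::Track])) {
    my $mapping = guess_mapping($constraint);
    printf "%-20s => %s%s\n", $constraint, $mapping->{type},
        defined $mapping->{class} ? " of $mapping->{class}" : '';
}
```

Only the last two cases carry enough information to put a constraint in the schema, which is the whole point of wanting parameterised types.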

So, during this deep contemplation, I thought, well, what would
Tangram be adding? I mean, other than the obvious elitism and other
associated baggage? Why not just tie the schema to the Moose
meta-model, and start a new persistence system from scratch? Or use
DBIx::Class for all the bits I couldn't be bothered
re-writing?

In principle, there are reasons why you might want the storage
schema and the object metamodel to differ. You might not want to map
all object properties to database columns, for instance. Or you might
want to use your own special mapping for them - not just the default.

Then I thought, how often did I do that? I added a
transient type in Class::Tangram for columns that
were not mapped, but only rarely used it, and never for data that I
couldn't derive from the formal columns or some other truly transient
source. I only used the idbif mapping type for classes when
I didn't have the time to describe their entire model. So, perhaps a
storage system that just ties these two things together would be
enough of a good start that the rest wouldn't matter.

The Evil Plan to NOT refactor Tangram using DBIx::Class

Ok, so the plan is basically this. Take the Tangram API, and make
the core bits that I remember using into thin wrappers around
DBIx::Class and friends. Then, all of the stuff under the
hood that was a headache working with, I'll conveniently forget to
port. That way, it won't be a source compatible refactoring, just
enough to let people who liked the Tangram API do similar sorts of
things with DBIx::Class.

The first thing I remember using is a schema object for the
connection, if only because of acme's reaction when I say "schema".
In a talk I'd use a UML diagram at this point, but given
<img> tags are banned, instead let's use Moose code.

That weak_set is a little bit of magic I cooked up for
nothingmuch recently. All we're doing is keeping references to the
objects we've already loaded from the database, primarily for
transactional consistency. Actually, Tangram uses there a hash from
an oid to a weak reference to the member with that
oid, but I think that oids suck. In Perl memory,
the refaddr can be the oid.
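The idea of a weak set keyed on refaddr can be sketched with nothing but the core Scalar::Util module. This is not the weak_set implementation mentioned above, just a rough illustration under my own assumptions; the `remember` and `live_objects` names are made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr weaken);

# refaddr => weak reference to an object we have already loaded.
my %loaded;

# Register an object without keeping it alive.
sub remember {
    my ($object) = @_;
    my $addr = refaddr($object);
    $loaded{$addr} = $object;
    weaken( $loaded{$addr} );    # the registry must not hold a strong ref
    return $object;
}

# Return the objects that are still alive, purging freed slots first.
sub live_objects {
    # A weakened slot becomes undef once its object has been freed.
    delete @loaded{ grep { !defined $loaded{$_} } keys %loaded };
    return values %loaded;
}

my $kept = remember( { name => 'kept' } );
{
    my $temporary = remember( { name => 'temporary' } );
    my @live = live_objects();
    print scalar(@live), " objects live inside the block\n";
}
# $temporary has gone out of scope, so only $kept should remain.
my @live = live_objects();
print scalar(@live), " object live afterwards: $live[0]{name}\n";
```

Purging dead slots matters here because a refaddr can be reused once the original object is freed, which is one place where using it as an oid needs care.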

So, hopefully, the DBIx::Class::ResultSet API will be rich
enough to be able to deal with all the things I did with
Tangram, or at least it will given enough TH^HLC.

There will be a bit of double-handling of objects involved.
Basically, the objects that we get back from DBIx::Class will
be freed very soon after loading, their values passed to a
schema-specified constructor (probably just Class->new),
and then their slots that contain collections that are not already
loaded set up to lazy load the referent collections on access. This
happens already in Tangram; the intermediate rows are the arrayrefs
returned by DBI::fetchrow_arrayref(). So there will be
lots of classes, perhaps under DBIx::Moose::DB::, that mirror
the objects in the schema. Perhaps we don't need that, but it should
be a good enough starting point, and if it can be eliminated entirely
later on, then all the better. (Update: Matt has kindly pointed me to the part of the API that deals with this; this shouldn't be a problem at all)
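The double-handling step might look roughly like the following, in plain Perl rather than Moose or DBIx::Class. Everything here is an assumption for illustration: the `Album` class, the `inflate` and `fetch_tracks_for` functions, and the stand-in "query" that just builds a list.

```perl
use strict;
use warnings;

package Album;    # hypothetical domain class

sub new {
    my ( $class, %args ) = @_;
    return bless {%args}, $class;
}

sub title { $_[0]{title} }

# Accessor that loads the collection on first access only.
sub tracks {
    my ($self) = @_;
    if ( ref( $self->{tracks} ) eq 'CODE' ) {
        $self->{tracks} = $self->{tracks}->();    # replace loader with data
    }
    return $self->{tracks};
}

package main;

my $queries = 0;    # counts how often the stand-in "database" is hit

# Stand-in for a real query fetching the collection for one row.
sub fetch_tracks_for {
    my ($id) = @_;
    $queries++;
    return [ "track 1 of album $id", "track 2 of album $id" ];
}

# Turn an intermediate row (an arrayref, as from fetchrow_arrayref)
# into a domain object; the row itself can be thrown away afterwards.
sub inflate {
    my ( $class, $columns, $row ) = @_;
    my %args;
    @args{@$columns} = @$row;
    my $id = $args{id};
    return $class->new( %args, tracks => sub { fetch_tracks_for($id) } );
}

my $album = inflate( 'Album', [qw(id title)], [ 7, 'Example' ] );
print "queries before access: $queries\n";
print "first track: $album->tracks->[0]\n" if 0;    # interpolation of method calls does not work; see below
print "first track: ", $album->tracks->[0], "\n";
print "track count: ", scalar @{ $album->tracks }, "\n";
print "queries after two accesses: $queries\n";
```

The collection slot starts out holding a code reference and is swapped for the loaded data on first use, so repeated access does not go back to the database.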

Mapping the Index from the Class

One of the nice things about a database index is that it's
basically a performance 'hack' only (because databases are too dumb to
know what to index themselves), and does not actually affect the
operation of the database. So, for the most part, we can ignore
mapping indices and claim we are doing the 'correct' thing ;).

That is, unless the index happens to be a unique index or
a primary key. What those add is a uniqueness
constraint, which does affect the way that the object
behaves. So, what of that?

Interestingly, Perl 6 has the concept of a special .id
property. If two object references have the same .id
property, then they are considered to be the same object.
This has some interesting implications.

So, we can perhaps map this in Perl 6 code, at least map one
uniqueness constraint per type. Generalising this to multiple
uniqueness constraints is probably something best left to our Great
Benevolent Navel-Gazers. In the short term, we'll need to come up
with some other kind of way of specifying this per-class; probably a
Moose::Util::UniquenessConstraint or somesuch.

Mapping Inheritance

Alright, so we still have inheritance to deal with. But wait!
We've got a bigger, brighter picture with Moose. We've now got roles.

Fortunately, this is OK. The Tangram type column was only
ever used (conceptually, anyway) to derive a bitmap of
associated (i.e., sharing a primary key) tables that we expect to find
rows in for a particular tuple. So, if we map the role's properties
to columns, then we only have to "duplicate" columns for particular
roles, if those roles are composed into classes that don't share a
common primary key.

The other features

Well, there may be other important features that I'll remember when
the time comes, but for now I think there are enough ideas here to form
a core roadmap, or at least provide a starting point for discussion.
