this is a continuation of issue #114, which introduces the problem of merging revision branches that are also applied to target databases already. In that issue, it took me a really long time to get my sluggish mind around it and even after I "got it" I think my thinking was still muddy. Perhaps it still is now, but I'd like to re-establish this as a feature add with my latest thinking on it.

I've spent many months thinking about #114 and what always disturbed me was how to cleanly address my notion that to support a DAG (directed acyclic graph), we need to store every node that's been invoked in the versions table. Doing so would mean that all existing alembic version tables would need to be "upgraded", having every version that's been invoked be written out to the table. This in itself is extremely difficult to manage, especially for shops using flat SQL files to maintain versions on production databases which would need some special migration script that has a huge series of INSERT statements. And then, once the version table has been populated in this new way, now you can't use an older version of Alembic anymore either. And, you can't even easily tell what the "head" version is anymore without reconciling the version table with the structure of the version scripts themselves.

I can only assume that the use case I was thinking was needed, is the one that we talk about in the documentation; that of knitting together two branches into a single line:

A -> B -> Cb1 -> Db1 -> Eb1
A -> B -> Cb2 -> Db2

becomes

A -> B -> Cb1 -> Db1 -> Eb1 -> Cb2 -> Db2

that is, we knitted 'b1' and 'b2' into a single line - this is how we say to do it in the "Working with Branches" documentation (e.g. "splice"). So for alembic to figure out on databases that previously only had "b2" applied that "b1" also needs to be applied, it needs to examine the version table for every version and find those that aren't present.

But this really shouldn't be necessary. My thinking above is still missing the fact that as we support a DAG, we support multiple heads, which can be reconciled in the traditional way, using a branch point. For Alembic to determine which migrations are necessary in order to reconcile multiple heads only needs to store just what any other VCS stores, the heads.

The use case in the docs of "splicing" two branch points into a single line just wouldn't be necessary anymore. And in fact the manual editing of files wouldn't be needed anymore. The ScriptDirectory object already knows how to recognize multiple heads. All we need to do is have the migration system track these multiple heads as it proceeds. It also makes trivial the notion of an "alembic merge" command, which does nothing more than create a migration file that declares all the current heads as its dependencies. The only need for this file is to keep the migration stream clean; it would not be at all necessary and the system could be used in the "pure DAG" format suggested in #114 as an alternative mode of operation.

Suppose we start with a migration series like the above, with two heads:

A -> B -> Cb1 -> Db1 -> Eb1
-> Cb2 -> Db2

A target database which is at version Db1 would proceed as follows:

Identify Db1 from alembic_version as the current "head".

Identify Eb1 and Db2 as the current target heads.

As Db1 is the actual head invoked, we can determine that the full set of migration files that have been run is A, B, Cb1, Db1. We automatically know that Cb2 and Db2 also need to be run.

We invoke Eb1, and upgrade this particular branch point in the version table. The current SQL for this is UPDATE alembic_version SET version='Eb1'. All we have to do is change this SQL to read UPDATE alembic_version SET version='Eb1' WHERE version='Db1'. That is, our UPDATE statements are now always local to a single head, and other heads in the version table remain unaffected.

We then attend to b2. We invoke Cb2, and upgrade this particular branch point in the version table. We know that this branch isn't present, so we invoke INSERT INTO alembic_version VALUES ('Cb2').

We continue on b2 and invoke Db2. We upgrade in alembic version via UPDATE alembic_version SET version='Db2' WHERE version='Cb2'.

We now have two heads in the alembic_version table, and we are done.

Now, most users probably don't want to work in this mode, where they just maintain two heads like that. These users will want to create a "merge point", like this:

A -> B -> Cb1 -> Db1 -> Eb1 -> Eb1Db2_merge
-> Cb2 -> Db2 ->

Alembic can by default deliver its usual message, "version history has two heads" (or three, or whatever), and suggest running alembic merge to create a merge point. The command just creates a new version file corresponding above to Eb1Db2_merge.

The process for taking our database from version Db1 to Eb1Db2_merge looks like this:

Identify Db1 from alembic_version as the current "head".

Identify Eb1Db2_merge as the current target head.

Upon detection that Eb1Db2_merge has multiple parents, we again move to handle multiple heads at this point, and we go back to step 2 from the previous list of instructions.

Upon completion of the "multiple heads" use case, we now have Eb1 and Db2 in alembic_version. We choose a single one of the heads as the UPDATE target and delete the rest: DELETE FROM alembic_version WHERE version='Db2'; UPDATE alembic_version SET version='Eb1Db2_merge' WHERE version='Eb1'.

We're back to a single head.

Using this system, we now no longer have any upgrade path for users that upgrade to an Alembic that supports this feature; their alembic_version table is already in the correct format. These users can, if they choose, immediately start working with multiple heads and/or merge points, or not; the data written to alembic_version doesn't change.

The alembic revision command will behave as follows:

When run on a series of migration files that only have a single head, the behavior is as it is now; a new file is created with the parent referring to this single head.

When run on a series of files that have multiple heads, currently we deliver an error about "only a single head supported; manually edit the files". This error will be altered to read: "multiple heads are present. Please merge these using alembic merge, or specify an explicit parent revision to work with multiple heads".

The user can either specify an explicit parent revision, as in alembic revision -p <parent_revision>, which will make this new file refer to that parent alone, or they can run alembic merge, which will create a do-nothing "merge point" which specifies all current heads as parents. This "merge point" file can of course be edited to refer only to specific heads if desired. The "merge point" reconciles the multiple heads back into one version, and invoking alembic revision again would use this merge point as the parent.

the -p / --parent option to alembic revision must validate that the given parent version is in fact a head.

This comment has been minimized.

I haven't started this one yet. This one, and the SQlite migrations which I have started, are the current priority. If you want to show me some patches towards making a pull request that would be great! I don't think I'm starting any new development for a couple of weeks.

This comment has been minimized.

The "multiple heads / branches" feature has now landed. This is
by far the most significant change Alembic has seen since its inception;
while the workflow of most commands hasn't changed, and the format
of version files and the alembic_version table are unchanged as well,
a new suite of features opens up in the case where multiple version
files refer to the same parent, or to the "base". Merging of
branches, operating across distinct named heads, and multiple
independent bases are now all supported. The feature incurs radical
changes to the internals of versioning and traversal, and should be
treated as "beta mode" for the next several subsequent releases
within 0.7.
fixes #167

This comment has been minimized.

Sounds great! I just read the documentation and it is pretty extensive. Just one thing I noticed: If I use a branch label and there is a branch point later in that branch, what does branchname@head point to? Is there something like branchname@heads?

Example:

--- 29f8
/
<base> --> 3782 -----> 109e ---->
(networking) \
--- 1975

For example, I would expect networking@head to give an error and networking@heads to point to 29f8 and 1975.

This comment has been minimized.

@NiDi - stamp has been adjusted. It does something a lot more complicated now which is that it has to calculate which branches logically need to be at the revision you're giving it, and adjust the whole table accordingly. but it works, try it out!