This is Django 1.3.1. My setup uses two databases: an SQLite one and a PostgreSQL one. The application we deploy to customers was based on SQLite, and the next version will use PostgreSQL. So I want to call syncdb on both databases and then use dumpdata/loaddata to move the data over. When calling syncdb --database=old_mdm --noinput in the installation script, I get the error above.
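For reference, the workflow I'm attempting looks roughly like this (a sketch; the database aliases match my settings below, and the dump filename is just an example):

```shell
# Create the schema in both databases.
python manage.py syncdb --database=default --noinput
python manage.py syncdb --database=old_mdm --noinput

# Dump the data from the old SQLite database...
python manage.py dumpdata --database=old_mdm > mdm_dump.json

# ...and load it into the new PostgreSQL database.
python manage.py loaddata --database=default mdm_dump.json
```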

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': 'mdm',
        'USER': 'django',
        'PASSWORD': 'abcabc',
        'HOST': 'localhost',
        'PORT': '5432',
    },
    'old_mdm': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': '/fwxserver/DB/django.sqlite',  # Path to the SQLite database file.
        'USER': '',      # Not used with sqlite3.
        'PASSWORD': '',  # Not used with sqlite3.
        'HOST': '',      # Not used with sqlite3.
        'PORT': '',      # Not used with sqlite3.
    },
}

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    # 'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.transaction.TransactionMiddleware',
)

INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.messages',
    'api',
    'ios',
    'django.contrib.admin',
    'south',
    # Uncomment the next line to enable admin documentation:
    # 'django.contrib.admindocs',
    # Sentry
    'indexer',
    'paging',
    'sentry',
    'sentry.client',
)

@asandroq Does that work with a router returning False for allow_syncdb('specified_db', ContentType) and/or allow_syncdb('specified_db', Permission)? Maybe an initial router.allow_syncdb(db, ContentType) and/or router.allow_syncdb(db, Permission) should be added to short-circuit methods?

It seems the only common multi-db setups for contenttypes are "in single database" and "one DB contains master data, others mirror it".

We could work a solution based on the following:

Document that the suggested practice is to sync the contenttypes/permissions tables into only one DB (do we need an easier way than routers here? Could we somehow have this default to the 'default' database in global_settings.py?)

If there is need for a master-slave-like setup for contenttypes, and we know which DB is the master and which are the slaves (again, do we need something other than routers?), then syncdb can handle the syncing of the data: read from the master, copy to the slaves, ensure the PK values match.

If users create contenttypes and/or permissions manually through the ORM in a master-slave setup, there will be no automatic replication. This is a documentation issue; most users running a master-slave setup will need to deal with this possibility anyway.

The hardest question is how to make it easy to define which DB is the master, and which (if any) are the slaves. We could take an approach where the 'default' database is the master for contenttypes/permissions unless there are routers, in which case we inspect the routers' db_for_write and allow_syncdb to determine the master and the slaves.
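As an illustration of the router-based approach, something along these lines could pin contenttypes and permissions to a single database (the class name and the choice of 'default' are my assumptions, not part of any patch):

```python
class ContentTypesToDefaultRouter(object):
    """Hypothetical router: only let syncdb create the contenttypes and
    auth tables in the 'default' database; stay silent about the rest."""

    PINNED_APPS = ('contenttypes', 'auth')

    def allow_syncdb(self, db, model):
        if model._meta.app_label in self.PINNED_APPS:
            # True on 'default', False everywhere else.
            return db == 'default'
        return None  # No opinion: let other routers (or the default) decide.
```

With this in place, `syncdb --database=other` would skip the contenttypes and auth tables entirely, so there is only one copy of the data and the PK values stay consistent.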

Personally, I think that introducing the master/slave idea is too complex for just one bug, and I don't think it will add any more functionality. Having said that, I would go for the solution where contenttypes/permissions go to 'default' unless stated otherwise in the routers. I hope I understood the idea of the routers correctly.

Would someone please confirm that the idea for the solution is right? If so, I will write the patch.

So, currently, even though ContentTypes and Permissions are installed to each database by default, the data isn't created by syncdb in 'other'. Actually, 'default' is populated by running manage.py syncdb --database=other. This seems wrong: the default DB should not be touched at all when syncing 'other'.

So, here is a simplified suggestion for this ticket:

Make sure the default database isn't accessed at all by syncdb --database=other (if there are errors beyond contenttypes and permissions, deal with them in separate tickets).

Sync the contenttypes and permissions only to one database.

The database is 'default' if the contenttypes _and_ permissions are installed to the 'default' database. If they are not synced to 'default' database, don't do anything at all.

The above solution is very limited: if you want the models in some DB other than 'default', or you want them mirrored to other databases, you are on your own. In short, this is the simplest fix to the current situation I can think of. It works mostly as things do now, except that 'default' isn't touched at all by syncdb --database=other.

I wonder how to document this... Maybe something along the lines that we currently do not handle syncing the contenttypes and permissions data to databases other than 'default', but we might change this later on.

This means we will intentionally leave two use-cases open:

installing the data in 'other' only

no syncing between databases

To solve the above issues we need a notion of "master" database for contenttypes and permissions.

> Personally, I think that introducing master/slave idea is too complex for just one bug and I don't think that it will introduce any more functionality. Having said that, I would go for the solution contenttypes/permissions go to 'default' unless stated differently in the routers. I hope I understood the idea of the routers correctly.
>
> Would someone please confirm the idea for the solution is right? If so, I would write the solution.

I can confirm that this is almost certainly *not* the right solution. Putting special cases into routing behaviour is something we want to avoid. "Special cases aren't special enough to break the rules."

> So, currently even if ContentTypes and Permissions are installed to each database by default, the data isn't created by syncdb to 'other'. Actually, the default is populated by doing a manage.py syncdb --database=other. This seems wrong - the default DB should not be touched at all if syncing 'other'.
>
> So, here is a simplified suggestion for this ticket:

There is another option, which I raised on IRC over the weekend but didn't fully endorse at the time; however, I've had a chance to think about it now, and I think it's worth looking into some more.

Don't make any special cases here, don't worry about master/slaving contenttypes, and so on; treat each database as a completely separate concern. If contenttypes is marked to be synchronized onto a database, then you populate that database with the contenttypes values for models synchronized onto that database.

The default routing strategy is "everything synchronizes everywhere", so this means that a content type for every model will be created on every database.

A more subtle routing strategy might allow the model Foo to only be synchronized on 'other'. This would mean no content type for Foo on 'default', but there would be one on 'other'.
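As a sketch of that "more subtle routing strategy" (the model name Foo is hypothetical, as in the text above):

```python
class FooToOtherRouter(object):
    """Hypothetical router: the model Foo lives only in the 'other'
    database, so its table (and, under this proposal, its content type)
    would exist only there."""

    def allow_syncdb(self, db, model):
        if model.__name__ == 'Foo':
            # Foo syncs only to 'other'.
            return db == 'other'
        return None  # No opinion about any other model.
```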

The upshot of this approach is that all the generic FK issues that have been reported with multi-db get resolved. It also means the patch that has been provided is pretty much ready to go.

The complication is that any given content type won't have a unique PK across all databases. Contenttypes would be guaranteed complete and accurate on any given database, but a content type would only be available if the model was synchronized on that database. However, you can't join across databases anyway; all this means is that if you want to simulate a cross-database join, you need to do a lookup on the remote database for the comparable foreign key.

You'll also need to be careful when doing queries against the contenttype table to make sure you're getting the "right" contenttype from the "right" database. A default ContentType.objects.get() call won't hit the right database without providing context. However, any attempt to use a contenttype from the wrong database should raise a cross-database error, so while this will be an annoying detail to get right all the time, it is something that should be easy to debug.
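To make the "right database" point concrete, such lookups would need an explicit database hint, e.g. (a sketch; the app_label/model values are made up, and this assumes a model synced only to 'other'):

```python
from django.contrib.contenttypes.models import ContentType

# Without context, the lookup goes to the 'default' database:
ct_default = ContentType.objects.get(app_label='myapp', model='foo')

# Explicitly route the query to the database that holds the model:
ct_other = ContentType.objects.using('other').get(app_label='myapp', model='foo')

# get_for_model() keeps a per-database cache, so for non-default
# databases go through db_manager():
# ct = ContentType.objects.db_manager('other').get_for_model(MyModel)
```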

I think the latest patch is doing what Russell is asking. However, there are no docs and I think we want some in syncdb and/or ContentTypes and Permissions docs.

Also, we might need to go through the uses of ContentType and check for potential problems: for example, fetch a Comment from DB A and save it to DB B; check that this throws an error (or that the ContentType ID gets updated/verified).

I wonder if we want different treatment for Permissions. There seems to be less value in having per-database permissions for synced models. Instead, one database holding all the permissions and their associations to users and groups seems like the right way to go.

I am beginning to think that if you get any contenttypes into a DB, you might want them all. For example, users will likely want to store all permissions in one database (the database that contains the Users and Groups data), even permissions for models that are not synced to the default database, so that there is one DB to ask for all permissions. But if we install all permissions to a database, then we need all ContentTypes there too.

The "sync only those contenttypes whose model is in the DB" approach seems useful for one use case: GenericForeignKeys. But that isn't the only use case, and installing additional ContentTypes into a DB seems safe to me.

This ticket has potential for endless amounts of complexity. MultiDB is complex. In addition, we don't have information about the use case, nor do we know essential details: for example, in a master-slave setup we don't know which DB is the master...

So, another proposal for solution:

Sync _all_ contenttypes and permissions to default database (this is essentially what is happening now)

No sync at all for other databases

The proposal isn't the ultimate right solution, but it is simple and safe. With the information at hand at sync time, doing anything smarter might be hard, too.

I am not a multidb expert at all. But, if this ticket doesn't get a solution from others for 1.5, then I am going to go with the simple solution.

After discussing with Anssi on IRC and reviewing the history of this ticket, I have two concrete proposals:

(1) All content types and permissions are synced to the target database (with or without the --database option) unless the routers prevent it. That is, if allow_syncdb returns True or None, they're synced; if it returns False, they aren't.

(2) Use the same rule as (1) for the default database; for non-default databases, sync all content types and permissions only if the routers mandate it. That is, if allow_syncdb returns True, they're synced; if it returns False or None, they aren't.

Both rules are compatible with the philosophy of the routers.

Option (2) biases Django towards having a single copy of the content types and permissions in the default database, reducing the likelihood of the ID mismatch problem. But it's harder to implement, document, and explain, and I don't have any strong arguments supporting it. Also, I suspect it won't solve the reporter's problem: without any routers, syncing django.contrib.auth to a non-default database will still fail. So I have a weak preference for option (1).
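The two proposals differ only in how a router answer of None is treated on non-default databases. As an illustrative decision table (the function name is mine, not Django's):

```python
def should_sync_contenttypes(db, router_answer, option):
    """Illustrate proposals (1) and (2).

    router_answer is what allow_syncdb returned: True, False or None.
    option is 1 or 2, matching the numbered proposals above.
    """
    if option == 1:
        # Proposal (1): sync unless the routers explicitly forbid it.
        return router_answer is not False
    # Proposal (2): on the default database, same as (1); elsewhere,
    # sync only when the routers explicitly allow it.
    if db == 'default':
        return router_answer is not False
    return router_answer is True
```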

In addition, for users who wish to run syncdb on more than one database, we should recommend defining routers that allow syncing the contenttypes and permissions tables to only one database, to avoid the PK mismatch problem. I don't expect many people to attempt this anyway. Most multi-db users will:

have a master / slave setup: they run syncdb once to master, and the replication takes care of the rest;

have a master database for Django itself, and interact with existing external databases, most likely with unmanaged models;

or a combination of both.

It seems to me that Russell's proposal to split content types across databases doesn't work as is. The admin's permissions have a foreign key to content types; therefore all content types need to be available in a table in the same database that holds the admin's permissions table. (Actually that's what Anssi says in the comment just above, which I read after writing this.) If we want to investigate this option further, let's split it to another ticket, because it isn't a release blocker.

OK for me for contenttypes. I am still a bit worried about the ID mismatch. But, as said, users who have this setup will likely need to know how to tackle the ID problem anyway. Permissions are a bigger problem: it seems to me that some core parts of Django aren't ready to handle situations where different databases give different permissions to a user (auth/backends.py).

This ticket is out of my domain, I just don't have experience of the use cases needing contenttypes and/or permissions in different databases so I don't know what behavior we should be looking after.

Maybe one approach could be checking if there is any core-supported use case for having the contenttypes/permissions in multiple databases: for contenttypes, sure, generic foreign keys for example. For permissions - I don't think so, auth/backends.py doesn't work and thus default user.has_perm() doesn't work. So, what is the use case for having permissions in more than one DB?

> To me it seems that some core parts of Django aren't ready to handle situations where different databases give different permissions to a user (auth/backends.py).

Yes, you almost certainly have to put all auth-related tables on the same database. But it doesn't have to be the default database.

> This ticket is out of my domain, I just don't have experience of the use cases needing contenttypes and/or permissions in different databases so I don't know what behavior we should be looking after.

A legitimate use case would be to use a default database with some business data, and a separate database just for Django's apps (auth, admin, etc.) This is easy to achieve with a router. (But I never tried so I'm not sure to what extent it works.)
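A router for that use case might look like this (the 'django_apps' alias and the app list are made up for the example; I haven't verified this against a real deployment):

```python
# Hypothetical set of apps to keep out of the business-data database.
DJANGO_APPS = ('auth', 'admin', 'contenttypes', 'sessions', 'messages')


class DjangoAppsRouter(object):
    """Hypothetical router: keep Django's own apps in a separate
    'django_apps' database, and the business data in 'default'."""

    def db_for_read(self, model, **hints):
        if model._meta.app_label in DJANGO_APPS:
            return 'django_apps'
        return None  # Fall through to 'default'.

    def db_for_write(self, model, **hints):
        return self.db_for_read(model, **hints)

    def allow_syncdb(self, db, model):
        if model._meta.app_label in DJANGO_APPS:
            return db == 'django_apps'
        return db == 'default'
```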

> Maybe one approach could be checking if there is any core-supported use case for having the contenttypes/permissions in multiple databases: for contenttypes, sure, generic foreign keys for example. For permissions - I don't think so, auth/backends.py doesn't work and thus default user.has_perm() doesn't work. So, what is the use case for having permissions in more than one DB?

In my opinion, in this ticket, the use case is to put permissions and content types in a non-default database. It isn't to put them in more than one database; that is too far from being usable right now.