If you can avoid all of these, then you can benefit from transparent auto-reconnection and transparent connection reuse / pooling etc.

If you can't, then you will have to deal with reconnection, retry at a higher level. This is complicated and difficult to test.

I'll discuss ways of avoiding these things one at a time:

Transactions

I'm not advocating shunning transactions completely, just avoiding keeping any transaction state in the connection:

Keep autocommit on. Execute everything you want in a transaction in a single statement (in MySQL it is possible to delete from several tables in one statement, insert several rows etc).

Transactional engines will still use transactions if autocommit is on - a single statement becomes a single transaction - you will see a consistent view within that statement - no other thread will see parts of a batch insert, update or delete.

Non-transactional engines behave the same anyway.

Alternative: do everything that needs to be in a transaction inside a stored procedure - if these are retried, they will be retried as a whole.

Temporary tables

These are best avoided. They also cause problems for statement-level replication.

Use short-lived permanent tables with unique names instead (perhaps in a separate database which is automatically cleaned up when tables get old)

Wrap things which need temporary tables inside stored procedures

Use derived tables instead - in some cases this is sufficient.

Session variables which affect behaviour

Sometimes you can't get away from having to set these variables. Common examples are:

SET NAMES utf8

SET SQL_MODE = 'TRADITIONAL'

If an autoreconnect happens while these are in effect, you could accidentally get rubbish data inserted into the database if a connection reverts to latin1 unexpectedly or SQL_MODE reverts to its default "anything goes".

As these are unavoidable, they always have to be set - but there is a solution - the MYSQL_INIT_COMMAND option. As these commands won't actually have any effect on their own, we can have them automatically re-done when the library reconnects to the database.

The only problem with this approach is that not all client library interfaces allow the setting of this option - but in some cases it can be set in my.cnf (for client libraries which read this)

Other session variables

Quite simply - don't use them. There is no reason I know of for having another session variable.

Don't select @@LAST_INSERT_ID or last_insert_id() - use your API method to get it instead. An auto-reconnect happening between the insert and the select @@LAST_INSERT_ID will cause it to return zero, which is probably not what you wanted or expected.

Don't define your own variables - keep them on the client instead (or if you absolutely must store state server-side, put it in a table!)

Conclusions

It is possible to enable automatic reconnect for greater resilience at the MySQL API level

To do so safely you must avoid keeping session state in the connection

Variables which need to be reset on connect can be reset with mysql_init_command

5 comments:

"in MySQL it is possible to delete from several tables in one statement, insert several rows etc" - How about updating several rows in one statement (I'm talking about actual unique data, of course, not just a column deletion or something)? Is/will that be possible?

Mark, sure, I understand. As far as updates, what I meant when I said 'unique data' is updating a set of values to something that cannot be described by a single WHERE clause (a set of youtube view counts, for example).