Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community.

Is it possible (and how) to convert a huge MyISAM table to InnoDB without taking the application offline? The application needs to insert a couple of rows into that table every second, but it can be suspended for about two minutes.

Obviously ALTER TABLE ... ENGINE=InnoDB will not work, since it locks the table for the duration of the rebuild. Therefore my plan was to create a new table with the InnoDB engine and copy the content into it, and at the end suspend the application's logging thread and RENAME TABLE.

Unfortunately even doing the copying in small batches of 100 rows generates significant lag after some time.

Edit: Existing rows are never changed, this table is used for logging.

Well, that question is about minimizing conversion time. I don't care if the conversion takes a couple of days or weeks. But it must work in the background, without requiring downtime of the application and without creating noticeable lag.
–
Hendrik Brummermann Jan 9 '11 at 13:14

It would still be blocking because of the way InnoDB handles auto_increment. By default, InnoDB takes a table-level lock when inserting into an auto_increment column and releases the lock as soon as the insert is finished.
–
ovais.tariq May 6 '11 at 12:47

I don't care if the conversion takes a couple of days or weeks. But it must work in the background without requiring downtime of the application and without creating noticeable lag.

As you're doing logging, it helps if you have some good way to set a marker, so you can tell at what point you started the process; you can then re-apply any logs written after that point, or have them written out to a text file so you can later ingest them with LOAD DATA INFILE.
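A minimal sketch of the marker idea, assuming the log table has an auto-increment id column (the function names and the list-of-tuples stand-in for the table are illustrative, not part of the original answer):

```python
def snapshot_marker(log_rows):
    """Remember the highest id present when the conversion starts.
    `log_rows` stands in for the MyISAM log table: a list of (id, msg)
    tuples; in MySQL this would be SELECT MAX(id) FROM log."""
    return max((rid for rid, _ in log_rows), default=0)

def rows_to_reapply(log_rows, marker):
    """Rows written after the marker was taken: these are the ones that
    must be re-applied to the new InnoDB table at cutover time (or
    dumped to a file for LOAD DATA INFILE)."""
    return [row for row in log_rows if row[0] > marker]
```

Everything with an id at or below the marker goes through the bulk copy; only the small tail above it needs re-applying during the brief suspension window.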

Part of the problem is that writing in smaller batches means the indexes have to be updated over and over again. You're better off running it all at once; this might cause some noticeable lag on the system, but you don't have to do it on your production server.

Pause the logging or set some marker so you can re-apply the logs from this point on later.

Unfortunately even doing the copying in small batches of 100 rows generates significant lag after some time.

Are you adding any delay between each batch, or just batching up the updates and running each batch directly after the previous one?

If you're running them back-to-back, then try scripting the conversion in your favourite language with something like:

repeat
    copy the oldest 100 rows that haven't been copied yet to the new table
    sleep for as long as that copy took
until there are <100 rows unprocessed
stop the logging service
move the last few rows
rename the tables
restart logging
delete the old table once you are sure the conversion has worked
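The loop above can be sketched in Python; the in-memory lists stand in for the two tables, and the batch size and id-ordering scheme are illustrative assumptions (in practice each batch would be an INSERT ... SELECT against MySQL):

```python
import time

BATCH = 100

def copy_in_batches(old_table, new_table):
    """Copy rows oldest-first in small batches, sleeping as long as each
    batch took, so the copy never uses more than roughly half capacity."""
    copied = 0
    while len(old_table) - copied >= BATCH:
        start = time.monotonic()
        # In MySQL: INSERT INTO new_table SELECT ... FROM old_table
        #           WHERE id > last_copied_id ORDER BY id LIMIT 100
        new_table.extend(old_table[copied:copied + BATCH])
        copied += BATCH
        elapsed = time.monotonic() - start
        time.sleep(elapsed)  # back off for as long as the copy took
    # Fewer than BATCH rows remain: this is where you would stop the
    # logging service, move the tail, and RENAME TABLE.
    new_table.extend(old_table[copied:])
    return len(new_table)
```

The sleep-as-long-as-it-took rule is what keeps the conversion to about half the server's capacity, whatever the current load happens to be.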

This should ensure that the conversion doesn't take more than roughly half your server's capacity, even allowing for differences in load as the system's use varies over time.

Or, if you want to use as much time as possible when the service is relatively idle but back off (potentially pausing for quite a length of time) when the database needs to do some work for its users, replace "sleep for as long as the copy took" with: if the server's load is above <upper measure>, sleep for some seconds then check again, looping around the sleep/check until the load drops below <lower measure>. This means the copy can steam ahead in quiet times but will pause completely when the server is busy performing its normal workload. Determining load will depend on your OS: under Linux and similar, the 1-minute load average value from /proc/loadavg or the output of uptime should do. <lower measure> and <upper measure> may be the same value, though it is usual in controls like this to leave a gap between them, so your process doesn't keep starting and then immediately pausing because its own restart influences the load measure.

Of course this would not work for tables where old rows may get modified, but will work fine for a log table like the one you describe.

You will want to ignore the usual wisdom of creating the indexes after populating the new table in this case. While that is indeed more efficient when you want things to be as fast as possible (the effect on the rest of the system be damned), here you don't want the big glut of load at the end of the process when the indexes are built in one go, because that is a step you can't pause when things get busy.