
DISCLAIMER: I'm well aware this is not how it's supposed to be done, but time consistency between tables isn't a concern here, and I'm trying whatever comes to mind right now to have alternatives for quick, on-demand backups to complement the more robust, scheduled ones.

Essentially, I want to know whether there will be any read problems if I copy a MyISAM table's files (the .frm, .MYD, and .MYI) while a transaction hits that table.

Most tables are small, so we could just take the risk, but there are a couple that worry me because of their size. I don't mind waiting for the transaction to finish; what worries me is getting a read error and failing to get a response from the query.

So your question is: "If the files of a MyISAM table (*.MYD, *.MYI) are copied using the Linux command cp, does this have any impact on transactions reading or writing that table?" The answer is NO (if we don't bother about performance issues, because copying a file may stress the disks). cp does not lock the table. Maybe I did not read your question in the right way.
– miracle173, Mar 25 '14 at 6:29

The final recommendation is to stop using MyISAM, and use InnoDB instead. Then you can do fast, non-locking physical backups with Percona XtraBackup.

Re your comment:

Because reading through a large file isn't instantaneous or atomic. While your backup is progressing through the table, other concurrent updates could change both rows that your backup has already read, and rows that your backup hasn't reached yet.

Take a textbook example for transaction behavior: I do a bank transfer by debiting my bank account and crediting your bank account. My bank account is stored on a row that is physically early in the file, and your bank account is stored on a row later in the file.

While this is going on, the backup is reading through the file, and it has read up to a mid-point at the time our transaction happens. When we restore, we get my original account balance, without the debit applied, because the backup had already passed that point when the debit occurred (and it isn't able to go backwards). But the restore includes your updated account balance, because the backup got to that point in the file after we increased your balance.

Ergo, free money! ;-)
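The torn-copy effect above can be reproduced with plain shell tools. The file names and balances here are made up purely for illustration: one "row" per line, a sequential copy, and a transfer that commits halfway through.

```shell
# Toy reproduction of the inconsistent backup: two "account rows" in a
# file, and a transfer committed while a sequential copy is mid-read.
printf 'alice=100\nbob=100\n' > accounts.txt

head -n 1 accounts.txt > backup.txt           # backup reads row 1 (alice=100)
printf 'alice=50\nbob=150\n' > accounts.txt   # transfer commits mid-backup
tail -n 1 accounts.txt >> backup.txt          # backup reads row 2 (bob=150)

cat backup.txt   # alice=100 and bob=150: the "restore" holds 250, not 200
```

The copied file contains alice's pre-transfer balance and bob's post-transfer balance, so restoring it conjures 50 units out of thin air.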

MyISAM solves this by requiring the table to be locked against updates while it's doing a read.

InnoDB solves this by keeping multiple versions of rows, for as long as a reading transaction needs to see them for the sake of a consistent view of the database. So anyone can update the data without waiting, even though the backup is in progress.

Percona XtraBackup solves this in a slightly different way: it can go backwards, in a way. While it's reading the data file, it keeps checking the transaction log continually, to see if there are any late changes it needs to include. These changes may apply to parts of the datafile that Percona XtraBackup has already read. But as long as it gets the data file plus any changes that were logged since the backup started, it can reconstruct the full database.

But that only works for storage engines like InnoDB, that create a reliable transaction log. Percona XtraBackup can also back up MyISAM, but only by using locking, like any other backup tool.
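As a sketch of that workflow (the `xtrabackup` flags below are from the tool's CLI as I know it, and `/backups` is a placeholder path; check your installed version, since older releases shipped the `innobackupex` wrapper instead):

```shell
# Date-stamped target directory for this backup run, e.g. /backups/2014-03-25
TARGET=/backups/$(date +%F)

# Take a physical backup while the server keeps accepting writes; the
# transaction log is copied alongside the data files during this phase.
xtrabackup --backup --target-dir="$TARGET"

# Replay the log changes captured during the copy so the data files
# reach a single consistent point in time before any restore.
xtrabackup --prepare --target-dir="$TARGET"
```

The `--prepare` step is what "goes backwards": it applies the logged changes to parts of the data files that were copied before those changes happened.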

But, just to be completely clear: why would a single-table copy without locking be inconsistent?
– Hito_kun, Mar 24 '14 at 20:43

Just read your edit, thanks for the info. I was under the impression that data couldn't be accessed while in use (copied, in this case). In this particular setting, the database entries are totally independent of one another (or so I've been told). The whole idea is to get a quick backup while the production floor is working, so a long-held lock is a no-no. We already have a locking backup strategy, and I'm looking to implement a concurrent DB system with a slave, but sometimes we need to recover data quickly from a short time span, so that's why we have to do weird stuff like this.
– Hito_kun, Mar 25 '14 at 22:44

In some cases, you could face issues copying .MYD and .MYI files because of a major weakness MyISAM has: data changes (writes to .MYD files) are cached by the OS. (Of course, this would be 100 times worse with MySQL on Windows, so I'll leave Windows out of this answer.)

If you are using MySQL 5.6, I have a bit of good news.

The command FLUSH TABLES WITH READ LOCK would lock all MyISAM tables under a single lock. MySQL 5.6 now has the syntax FLUSH TABLES db.tb1, db.tb2, ... WITH READ LOCK, and the documentation says:

This statement flushes and acquires read locks for the named tables.
The statement first acquires exclusive metadata locks for the tables,
so it waits for transactions that have those tables open to complete.
Then the statement flushes the tables from the table cache, reopens
the tables, acquires table locks (like LOCK TABLES ... READ), and
downgrades the metadata locks from exclusive to shared. After the
statement acquires locks and downgrades the metadata locks, other
sessions can read but not modify the tables.

Because this statement acquires table locks, you must have the LOCK
TABLES privilege for each table, in addition to the RELOAD privilege
that is required to use any FLUSH statement.

This statement applies only to existing base tables. If a name refers
to a base table, that table is used. If it refers to a TEMPORARY
table, it is ignored. If a name applies to a view, an ER_WRONG_OBJECT
error occurs. Otherwise, an ER_NO_SUCH_TABLE error occurs.

Use UNLOCK TABLES to release the locks, LOCK TABLES to release the
locks and acquire other locks, or START TRANSACTION to release the
locks and begin a new transaction.

This variant of FLUSH enables tables to be flushed and locked in a
single operation. It provides a workaround for the restriction in
MySQL 5.6 that FLUSH TABLES is not permitted when there is an active
LOCK TABLES ... READ.

This statement does not perform an implicit UNLOCK TABLES, so an error
results if you use the statement while there is any active LOCK TABLES
or use it a second time without first releasing the locks acquired.

If a flushed table was opened with HANDLER, the handler is implicitly
flushed and loses its position.

CAVEAT: For you InnoDB people, the same doc has FLUSH TABLES tbl_name [, tbl_name] ... FOR EXPORT, which includes metadata locking within the system tables.

You could write a script that locks every MyISAM table in a database, puts that DB connection to sleep, does your OS copy, and then releases the lock.
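A minimal sketch of such a script follows. The database name, table list, datadir, backup path, and sleep duration are all placeholders for your environment; it assumes credentials in ~/.my.cnf and relies on the fact that closing the client session releases the lock.

```shell
#!/bin/sh
# Build the FLUSH statement from a space-separated table list.
TABLES="t1 t2"
STMT="FLUSH TABLES $(echo "$TABLES" | sed 's/ /, /g') WITH READ LOCK"

# Hold the lock by keeping the session alive with SELECT SLEEP, run in
# the background; the lock lives only as long as this connection does.
mysql mydb -e "$STMT; SELECT SLEEP(600);" &
LOCK_PID=$!
sleep 5                          # crude wait for the lock to be acquired

cp /var/lib/mysql/mydb/t1.* /var/lib/mysql/mydb/t2.* /backup/

kill $LOCK_PID                   # closing the session releases the lock
```

Note the crude `sleep 5`: a robust version would confirm the lock is actually held (for example, by polling SHOW PROCESSLIST) before starting the copy.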

This is what I do now, because I work with a MyISAM table that has 4019 partitions. This forces the flush of everything to the .MYDs. By issuing a lock like this and putting the connection to sleep, I can then FTP the .MYD and .MYI files to other data centers.

If you use Percona Server 5.6, I have even better news: Percona just introduced LOCK TABLES FOR BACKUP. This takes the guesswork and grunt work out of locking tables, with a measure of confidence in the tables' consistency.