I have a directory containing a lot of result log files (directory size ~ 4GB), and a set of processes running that keep on writing to these files.

To be able to correctly analyze the results at a later time, I want to copy the whole directory to an archive destination, and I cannot stop the processes.

I want a copy of the directory as it was at a particular point in time. Because the directory is large (it takes about 40 seconds to copy) and some of the files are being written to, a normal cp -r does NOT give me a snapshot at a single point in time, but rather a snapshot of files spread over some 40 seconds. This is not good enough.

Is there a way to get an exclusive lock on the directory and all its components while copying?

5 Answers

If cp is taking too long to provide the results you need, you're most probably not going to be satisfied with other options. Unix copy is about as basic as it gets for I/O overhead.

However, if you slowed down the I/O of the processes by putting another management layer between the disk and the files as they change, the overhead may not matter. What I mean is, you could run the processes in a VM (if that is possible), try snapshots with rsync, or even use Subversion.

The only other thing I can think of is maybe piping the log output to a symbolic link and updating the links to a secondary location with a batch file.

For instance:

file.log -> /mnt/disk1/logarea/file.log

file.log -> /mnt/disk/logarea2/file.log

That way you can update the symbolic link instantly, and the output goes to the new location.

You might even be able to write a batch file to update the link, count to 45 seconds, and then switch it back - thus isolating the data you need.
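The link switch described above could be sketched roughly as follows. This is only an illustration in Python, not a tested recipe for the original setup: the function name `switch_symlink` and the temporary link name are made up for the example. It also assumes the writing processes reopen the log file through the link; a process that already holds the file open keeps writing to the old location until it reopens it.

```python
import os


def switch_symlink(link_path, new_target):
    """Repoint link_path at new_target in a single rename step."""
    # Create the new link under a temporary name next to the real one,
    # so the final rename stays within one directory and filesystem.
    # The ".tmp.<pid>" suffix is an arbitrary choice for this sketch.
    tmp_path = f"{link_path}.tmp.{os.getpid()}"
    os.symlink(new_target, tmp_path)
    # os.replace renames the temporary link over the old one; on POSIX
    # systems the rename replaces the link itself, not its target.
    os.replace(tmp_path, link_path)
```

With the paths from the example, `switch_symlink("file.log", "/mnt/disk/logarea2/file.log")` would redirect newly opened log files to the second location, and a later call with the first path would switch back.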

Some filesystems provide snapshot facilities. Creating a filesystem (on an unused area of drive, or in a dedicated data file [loopback]) using one of these filesystems and then mounting that where your logs are written would allow you to snapshot the logs filesystem.

Filesystems known to support snapshots:

ZFS (fuse based)

FreeBSD's FFS

LFS (Log Structured Filesystem)

XFS (via xfs_freeze combined with a volume-manager snapshot)

There are probably more, but those are the ones I can think of off the top of my head.

Snapshots don't copy the data; rather, they freeze the data at a particular moment in time and create a virtual filesystem containing that data, which can be mounted and used like any other filesystem (albeit read-only). The original filesystem continues to function as if nothing had happened.
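As a rough sketch of that workflow with ZFS, the snippet below builds the commands to take a snapshot, copy the frozen view to the archive, and drop the snapshot again. The dataset name, mount point, snapshot name, and destination are hypothetical, and it assumes a ZFS implementation that exposes snapshots under `.zfs/snapshot` in the dataset's mount point. By default it only prints the commands.

```python
import subprocess


def snapshot_copy_commands(dataset, mountpoint, snap_name, dest):
    """Return the commands for a snapshot-then-copy archive run."""
    snap = f"{dataset}@{snap_name}"
    # Read-only view of the dataset as it was when the snapshot was taken.
    frozen = f"{mountpoint}/.zfs/snapshot/{snap_name}/."
    return [
        ["zfs", "snapshot", snap],   # freeze the current state
        ["cp", "-a", frozen, dest],  # copy from the frozen view
        ["zfs", "destroy", snap],    # remove the snapshot afterwards
    ]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

For example, `run_all(snapshot_copy_commands("tank/logs", "/tank/logs", "archive", "/archive/logs"))` prints the three commands without running them. The 40-second copy then reads from the frozen view while the processes keep writing to the live filesystem.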