Posted
by
Soulskill
on Friday June 11, 2010 @03:46PM
from the restorative-tux dept.

An anonymous reader writes "I was asked to manage a number of Linux servers at work. I would like to use volume snapshots to improve my backup scripts and keep recent copies of data around for quick restore. I normally manage Windows servers and on those I would just use Microsoft's Volume Shadow Copy for this. I tried Linux LVM snapshots, but most of the servers I manage run regular partitions with ext3 file systems, so LVM snapshots will not work. I found some versioning file systems out there like ext3cow and Tux3. Those look interesting, but I need something I can use on my existing ext3 file systems. I also found the R1Soft Hot Copy command-line utility, but it does not yet support my older 2.4 Linux servers. What are you using to make snapshots on Linux?"

Sweet, I'm going to install Linux on all my systems. I didn't know that Linux could prevent natural and man made disasters as well as being a stable operating system. We've been wasting all this money on backup for all these years.

There's a mix of humor and catty vitriol here all around, but here is something that addresses the serious point made in Grandparent's statement about it being a "Windows" way of thinking.

Take a look at Infrastructures.org [infrastructures.org], which describes a whole way of thinking about server reliability and configuration. Where I work we essentially use this approach. The fundamental concepts concentrate more on system configuration and on the ability to pick a random server, drop it out the window, and have another one just like it online in moments. It's less about backups and far more about comprehensive disaster recovery/prevention. The approaches described there are probably more easily implemented using Unix/Linux, but they are probably also possible with Windows boxes.

Yes, it's very old. They also talk about using cvs for version control, and mention that that world has moved on to svn, and the world has moved on a couple of times since then even. We also use Nagios rather than more ancient monitoring software. But still the central ideas are sound, even with many details changed. And practical, too.

These ideas actually apply very much to cloud infrastructure. It's really all about the cloud-- considering a machine not as just "a machine", but instead thinking of it as having a base image with certain functionality bolted on top of it. Thinking of a machine not just as a machine, but as a replaceable/exchangeable component in a larger collective system. That essentially is cloud computing. The thing a lot of people don't consider is that even a smaller cluster of machines should/could be configuration managed, maintained, and viewed this way.

Shadow copies are not about server reliability, they're about stupid users inadvertently overwriting or deleting wrong files, and allowing them to fix their mistakes themselves, without needing to access the backup system or bothering the system administrators.

Also, versioning filesystems use copy-on-write, so there's a backup of *every* version of a file, and not only the versions that are there when the backup (or shadow copy) runs. Just this week we've been looking fruitlessly through the backups for some vanished files. We can only assume that the files were erroneously deleted on the same day they were created, before the backups had a chance of picking them up.

Eventually you're going to want to migrate those filesystems to btrfs as well, and that has really nice built-in snapshotting capability. But until then (a few years from now), move to LVM. It will save you so much hassle and be much more tested and stable than some hack like ext3cow.

Actually, I'd call LVM the long-term pain. For all its benefits, I don't want my servers to become unresponsive when large files are being written. That doesn't happen on regular partitions, only with LVM.

You can't do this in Windows either, not with partitions. That's why their solution is called shadow VOLUME copy. Because it needs VOLUMES to work.

Our solution is called LVM snapshots because it needs LVM VOLUMES to work.

Now is that so hard to understand?

Well, it's obvious you have never used Volume Shadow Copy, because in the Windows world there is no practical difference between a partition and a volume. No, I'm not joking; no, I'm not trolling.

Find a Windows server with a single drive (basic disk) and a RAID array (dynamic disk). Right-click on My Computer and choose Manage. Click on Disk Management. Right-click on an unallocated portion of a "basic disk" to "create a new partition"; right-click on an unallocated portion of a "dynamic disk" to "create a new volume".

Windows does it in precisely the way uncoordinated open source projects will never be able to do it. They told the NTFS team, a new team in charge of the volume snapshot service, and the team in charge of logical disk management to work together, create and perform regression tests against each other's code on every check-in and patch, and likely set up team liaisons whose sole purpose was to ensure interoperability.

If you told me a third party open source organization, without having full control of the developers and direction of both the ext filesystems and the LVM system, was going to write a service that performed the same function as the volume snapshot service, I would laugh at you. I would laugh and laugh. Open source, because of its nature, tends to attract developers who want to do something, and they want it to be the best at that something. At the same time, they don't want to tie themselves down to stable APIs because, well, it can be limiting and slow development. I totally understand why. So telling me that some third party is going to extend LVM with one API, EXT with another API, and then write a service to coordinate the two is mind-boggling. Those people would have to constantly commit code to match changes in either of the two rapidly changing projects, they'd have to fully understand the inner workings of both ext and LVM, and then they'd have to make it all work without corrupting anyone's data and ruining their reputation beyond repair.

On the other hand, you have projects like ZFS or BTRFS, which are just as monolithic, but more ambitious, and powered by the same developers I mentioned above. They want their solution to be the best. It takes a long time though because they essentially have to start from scratch and incorporate all the things that appear to be within arm's reach. But the people who start projects like BTRFS realize that it's a fool's errand to try and create interoperability between massive, disparately managed open source projects. GNOME and KDE only survive because they threw everything else out and decided to simply come with their own full suite of stuff. X is its own long story.

I don't want to diss open source, like I said, it creates magnificent pieces of software, and the developers really, truly tend to care about their projects. (Even if they can be a little defensive, sometimes.) But without a dictator forcing cooperation between different teams, you often see open source reinventing the wheel. Sure, LVM and EXT3 could theoretically work together to provide sane, fast, performant snapshots. But I'd like to meet the person who thinks they can pull that project off.

The 2.4 series was the last stable series of Linux kernels in existence. 2.6 is in a perpetual state of bleeding edge, which makes it a gamble to use on a system you care about. It is one of the reasons people (like myself) switched from 2.4 to one of the BSDs.

I'm troubled that people still run 2.4 servers. I remember when I was reluctant to upgrade to 2.6 and preferred the older 2.4, which felt more comfortable than 2.6, regardless of how tempting the new changes sounded. But now I don't see any reason to run it anywhere; even my router runs 2.6. Especially on newer hardware, 2.4 is really just too old.

I know there are people who probably still run Linux 2.2, but those are probably systems that are running some task well enough not to require any changes, and leaving them as they are is best. Servers are usually not like that. They need security updates, upgrades to catch up with the times, and many other changes required by circumstances (for example, adding snapshot abilities, which someone asked about recently on Slashdot). Most production servers are not systems that you just leave running, so upgrades to the kernel are also expected and highly recommended. Not to mention that most recent distributions require 2.6.

Exactly. About 2 1/2 years ago I worked for a large company that everyone here has heard of, which at the time was running ~4000 servers on a modified Red Hat 6.2 image. There was a large code base that got orphaned sometime in the start-up phase of the mid '90s, and it was much easier to never touch the OS than to rewrite the code.

If you need to keep around such old software, it needs to be running inside Xen/VirtualBox and/or become NFS-booted so that it's insulated from the hardware. That way, you're not forced to keep old hardware around to run your old software. If you insulate with Xen/VBox at the block level, you can use LVM2 on the host system to do snapshots, but you are still constrained by the legacy filesystem. If you NFS-boot, you can use future snapshottable Linux filesystems to do filesystem-level snapshots.

I will admit that I have not tried it on Linux, but zfs is the best of the next gen filesystems. It does cryptographically assured reads and writes (remember that transitory undetected disk malfunctions occur at a rate of ~1/TB of data), it can snapshot changes, it fricken slices bread. If it had a gender, I would probably marry it (well, I guess I can date it for a while and see how things work out). http://zfs-fuse.net/ [zfs-fuse.net]

rsnapshot uses a clever blend of rsync + hard links to do what you want... you can store many incremental backups in just a little more space than a full backup. You can run rsnapshot on a backup server with lots of disk space, and all you need to expose on your target machines is SSH.

You'd create "backup" users on all the target hosts, generate an SSH key pair, and put the private key on your backup server. Put the public key in the "backup" account on each target machine so the backup server can securely log in without a password. Then you just set up rsnapshot to log into your targets, and it will use rsync-over-SSH to pull the data.
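The key setup above can be sketched as follows. The paths, the `backup` user name, and the target host names are illustrative; the commented rsnapshot.conf lines show the usual pull-over-SSH configuration:

```shell
set -e
keydir=$(mktemp -d)

# Generate a dedicated, passphrase-less key pair so rsnapshot's cron job
# can log in to the targets unattended.
ssh-keygen -q -t rsa -N "" -f "$keydir/backup_key"

# On each target: append backup_key.pub to ~backup/.ssh/authorized_keys.
# Then point rsnapshot at the private key, e.g. in rsnapshot.conf
# (fields are TAB-separated):
#   cmd_ssh   /usr/bin/ssh
#   ssh_args  -i /root/.ssh/backup_key
#   backup    backup@target1:/etc/   target1/
ls "$keydir"
```

Using a dedicated key (rather than root's normal key) also lets you restrict what that key may do on the targets via authorized_keys options.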

[RSnapshot] is no good for a true snapshot since the rsync operation is non-atomic on a live filesystem.

I cannot help wondering, when I read stuff like that, who *really needs* atomicity, and who just likes it because it sounds cool. If that 2.4 guy doesn't really *need* theoretical atomicity, and he can do his work with something much simpler, he should.

I wasn't trying to guess at what he needed, but his question was about snapshotting. One of the key features (if not THE key feature) of a snapshot is that it is atomic. Anything that rolls through a changing filesystem one file at a time is not going to fit that bill. You also run the risk of making "backups" that break things that presume state consistency: if you capture the log of a daemon before its product output, for example, your backup could have no record of the event which created the output.

Unison does a better job, in that it checks for changes while it's running and can be configured to retry until there are none. However, it can still be tripped up by changes that touch multiple files; it doesn't give an atomic snapshot either.

One file is atomically updated by rsync, but not the entire filesystem. So you can have a DB that uses 2 files, and when it finishes copying the first, the status of the second file has already changed and therefore the two aren't compatible.

Since the situation is so hobbled (old Linux kernel, no LVM), about the only thing you will be able to do is learn to use hardlinks [wikipedia.org]. The ext* filesystems support them, but you will have to manage them yourself (cp -varl /source/* /destination/version). Yeah, it's a huge hack, but unless you can actually fix the problem, it's about your only hope.
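A minimal, runnable demonstration of the hard-link trick in a throwaway directory (file names are made up). The catch it illustrates: the "snapshot" survives only if files are replaced, never edited in place:

```shell
set -e
top=$(mktemp -d)
mkdir "$top/live"
echo "version 1" > "$top/live/report.txt"

# "Snapshot" via hard links: instant to create, near-zero extra space.
cp -al "$top/live" "$top/snap-monday"

# Replace the file instead of editing it in place; an in-place write would
# go through the shared inode and change the snapshot copy too.
echo "version 2" > "$top/live/report.txt.tmp"
mv "$top/live/report.txt.tmp" "$top/live/report.txt"

cat "$top/snap-monday/report.txt"   # prints "version 1"
cat "$top/live/report.txt"          # prints "version 2"
```

This replace-don't-edit behavior is exactly what rsync does by default (write to a temp file, then rename), which is why rsync-fed hardlink trees work as cheap snapshots.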

If you have backups, then moving to LVM is obviously the way to go if you desire snapshots. The other options are short-term hackery; LVM was designed from the ground up to do such things. And ext3 has nothing to do with the price of butter.

To clarify, let me rephrase your question for the other way around

"I was asked to manage a number of *Windows* servers at work. I would like to use volume snapshots to improve my backup scripts and keep recent copies of data around for quick restore. I tried Windows Shadow Copy, but most of the servers I manage run MBR partitions with FAT file systems, so Shadow Copy will not work. I found some versioning file systems out there... Those look interesting, but I need something I can use on my existing FAT file systems. I also found --random freeware--, but it does not yet support my older Windows NT 3.5 servers. What are you using to make snapshots on Windows?"

Except, in that case, it makes more sense because the filesystem is the determining factor, not the volume management. If you have LVM, it doesn't matter what the underlying filesystem is, really. Stop faffing about - if you have a server, with backups, that you need snapshots on, take the hit and wipe the drives to a config that supports that... and while you're there, upgrade that damn kernel already. If nothing else, it will test that the backups you're making are actually worth the effort. It's like complaining that 95 on FAT16 doesn't support Shadow Copy. If you absolutely *can't* take those servers down, or are unable to restore your backups to another machine for testing such changes (whether because of compatibility, software licensing and/or bad backups), you have bigger problems than some random desire for a feature you don't actually *need* at the moment.

First: upgrade your shit. 2.4 kernel systems? Are you running Red Hat 6? You know, from the turn of the millennium.

Second: upgrade your shit. Really.

Third: if your kernels are that old and you're using these machines for file storage/backup, chances are the hardware needs to be replaced before you even consider messing with them. Seriously: this stuff is ancient. Even Debian hasn't had a 2.4 kernel in 5+ years, I think.

Fourth: you can do what you're trying to do with rsync 'snapshots'. It works very well, failing filesystem-level support. If you're sharing data over Samba, this makes it easy: just put a '.snapshot' dir for these 'temporary' backups in their $HOME and hide dotfiles. Then make sure rsync ignores .snapshot. (Of course, there are other ways to do this.)

People expect a snapshot to be immediately usable and reliable. In practice, however, the state of a device, even if synchronized with the filesystem through its transactions, is not the state of all data -- some data may be in buffers, prepared to be written, and rebooting into a restored filesystem may require some cleanup of that state. In particular, SQL databases are completely unsuitable for this kind of backup (this is why they have their own backup and transaction-log handling procedures), and database-like applications, such as mail servers, may require reindexing.

However for purposes other than those applications, file-level backup is entirely adequate, so utilities like rdiff-backup end up providing more functionality than complicated snapshot-handling procedures -- incremental backups for subtrees, readable trees in backup media, etc.

It also should be noted that backups should not be used as a replacement of package management -- on Linux anything installed through a package manager can be uninstalled through it.

On real server operating systems, snapshot support is integrated into applications, which receive a "snapshot about to occur" event so they can quiesce their writes for a short period to make the snapshot clean.

For example, on a Windows server, a VSS snapshot is a complete restorable backup of everything, including your databases, event logs, the registry, etc... It's the standard mechanism that practically all Windows backup tools use. They take a snapshot, back it up, and then release it. The point in time that the snapshot was taken turns up in the "last backed up on" date field in SQL Server!

Even third party snapshot mechanisms integrate using plugins. If you take a snapshot with, say, VMware or your SAN, then the same quiescing mechanism is triggered.

Some real server operating systems like HP-UX appear to have LVM extensions that are similar to what VSS can do, but I can't find the equivalent in Linux. From what I can see, the closest you can get is to temporarily halt writes from the ext3 filesystem, but that's not the same thing as proper application quiescing.
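For completeness, the write-halting just mentioned can be done on reasonably recent kernels (2.6.29+) with `fsfreeze` from util-linux. A sketch with hypothetical mount point, device, and backup path; note this quiesces only the filesystem, not the applications, which is exactly the gap compared with VSS:

```shell
# Requires root. Freezing blocks new writes and flushes dirty data, so the
# block device underneath is momentarily in a clean, mountable state.
frozen_block_copy() {
    fsfreeze --freeze /mnt/data                  # halt writes to the fs
    dd if=/dev/sdb1 of=/backup/data.img bs=1M    # copy while quiesced
    fsfreeze --unfreeze /mnt/data                # resume writes
}
```

In practice you would freeze only long enough to take an LVM or SAN snapshot and then unfreeze and copy from the snapshot, since every writer on the box stalls for the duration of the freeze.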

"In particular, SQL databases are completely unsuitable for this kind of backup (this is why they have their own backup and transaction log handling procedures)"

While snapshots aren't ideal for SQL DBs, any real snapshot is equivalent to a point-in-time copy of the state of the file system. Restoring it and starting the database should seem to the database just like it's recovering after unexpected power failure or a process crash.

Any database that doesn't recover properly after a snapshot restore will also fail to recover properly after powerfail or a sudden hardware reset, because it's not ordering its writes properly.

Of course, proper snapshot implementations (i.e. not LVM) notify apps that a snapshot is about to happen so they can pause their work and enter a stable, easy-to-recover-from state for the moment it takes to make the snapshot. So it's even easier on the database.

Now, I'll grant that for databases it's usually *better* to do incremental block-level copies, SQL-level dumps, etc. using the database's own tools, because doing so is usually much more _efficient_ than taking a snapshot and then archiving it somewhere. But sometimes you just want the snapshot around as insurance before doing a major config change or upgrade, and for that they're just unbeatable.

While I don't much like Windows servers in general, I have a major case of VSS envy (Volume Shadow Copy Service, not Visual SourceSafe - blech!), because it's worlds ahead of anything seen on any open *nix and has been for nearly ten years. Hell, my one and only Windows server maintains an incremental backup of itself on a remote iSCSI volume, including many point-in-time snapshots, that I can just unplug from the iSCSI storage host and boot if I need to for disaster recovery. It's impressive, and VSS is the core of what makes it possible.

rdiff-backup backs up one directory to another, possibly over a network. The target directory ends up a copy of the source directory, but extra reverse diffs are stored in a special subdirectory of that target directory, so you can still recover files lost some time ago. The idea is to combine the best features of a mirror and an incremental backup. rdiff-backup also preserves subdirectories, hard links, dev files, permissions, uid/gid ownership, modification times, extended attributes, ACLs, and resource forks. Also, rdiff-backup can operate in a bandwidth-efficient manner over a pipe, like rsync. Thus you can use rdiff-backup and ssh to securely back a hard drive up to a remote location, and only the differences will be transmitted. Finally, rdiff-backup is easy to use, and settings have sensible defaults.
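Typical usage, sketched as a function. The host name `backuphost` and all paths are placeholders, and `rdiff-backup` must be installed on both ends:

```shell
# Push /home to a remote repository over SSH; only the differences are
# transmitted, and each run leaves a restorable increment behind.
nightly_rdiff() {
    rdiff-backup /home backuphost::/backups/home
    # keep eight weeks of history; --force allows removing several
    # increments in one go
    rdiff-backup --remove-older-than 8W --force backuphost::/backups/home
}

# Restoring a single file as it was three days ago:
#   rdiff-backup -r 3D backuphost::/backups/home/alice/report.txt report.txt
```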

Seriously, what is wrong with dump(8)?
It works on ext3. I use it on FreeBSD. It takes snapshots to do the dump, so you can shutdown your database, start the dump and then immediately start your database again.
Of course you have to backup the entire volume, but still...
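On FreeBSD the snapshot-backed dump described above is a single flag; a sketch with a hypothetical UFS2 device and output path:

```shell
# -L tells dump the filesystem is live: it takes a UFS2 snapshot first and
# dumps from that, so the copy is internally consistent. -0 = full dump,
# -a = auto-size, -u = record in /etc/dumpdates, -f = output file.
live_dump() {
    dump -0 -L -a -u -f /backup/home.dump /dev/ad0s1f
}
```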

I'm pretty sure the Linux version of dump doesn't do any snapshotting. The FreeBSD version can do it because the FS supports snapshots, but ext3 does not. (Maybe it will do snapshots automatically if you have a setup that supports them, but the original problem is that this is not the case.)

Just upgrade your kernel using a manual build of the 2.6 kernel. Also install static versions of the modutils (insmod, modprobe, etc.). Use an external machine (one with decent software) for the build so your compile doesn't break. I have done so in the past and it works fine. (And plan an update for those machines; anything with 2.4 is way too old...)

Works nicely. I use it for backups over the net. One very nice feature is that it does reverse diffs, i.e. the nearer to the present time, the faster you get files restored. You can also remove older diffs without any fuss.

Since the early 1990s, Novell has had the ability to "Salvage" deleted files, and has even maintained a near-limitless number of previous versions with copy-on-write functionality. It still exists, even on Linux, in their NSS (Novell Storage Services) volumes.

Microsoft finally got on board when their Server 2003 product implemented Volume Shadow Copies. This isn't nearly as good as Novell's implementation, but it was better than anything Microsoft had previously offered.

Perhaps you are making this more complex than it has to be. I've had zero success simply copying the files and filesystems from a Windows server to backup and then being able to restore anything but data -- you can't reinstall a Windows OS from the moral equivalent of a Linux "cp -R" command. Linux, however, does not share this limitation. You can -- and I have -- use rsync or tar to copy the entire filesystem off of a Linux machine to your backup device, then restore an entire machine -- data and OS alike.

Reading through this thread has brought back the memories of when I first started using Linux. There is a subset of Linux users who seem to think that acting like a giant douche bag will help people adopt the platform.

Don't get me wrong, I've found that there are some amazing people in the Linux community that are more than willing to help out someone genuinely willing to learn, but there still exists this subset of assholes that seem to think ridicule, and basically acting like a dickhead makes them superior. If you're one of those people get over yourself. Linux would be better off without you!

Yes, it could be phrased a bit better: "To get atomic backups of data, you need LVM with the snapshot module, or to be using btrfs. If the snapshot module for LVM is unavailable in 2.4.x, you will need to upgrade." The OP is basically asking how he can use shadow volume copy to back up his FAT16 partition on a Windows 98 computer.

Depending on how long you're keeping them around, LVM Snapshots are likely to be a bad choice anyway. Their intended use-case is to have a very short lifespan, because they're intended to be used like so:

1. Create snapshot

2. Mount snapshot & copy data to backup server

3. Unmount & destroy snapshot

The point behind them is to create an unchanging version of a live partition so that you can copy the data out without worrying about whether it is being updated while you copy. Since the snapshots keep a copy of every block that changes on the origin volume, a long-lived snapshot keeps growing until it exhausts its allocated space.
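The three steps above map directly onto LVM commands. A sketch, with a hypothetical volume group `vg0`, logical volume `data`, and backup destination `backupserver`:

```shell
# Requires root and free extents in vg0 for the snapshot's backing store.
snapshot_cycle() {
    # 1. create a copy-on-write snapshot of the live volume
    lvcreate --snapshot --size 1G --name data-snap /dev/vg0/data
    # 2. mount it read-only and copy the frozen view off-box
    mkdir -p /mnt/data-snap
    mount -o ro /dev/vg0/data-snap /mnt/data-snap
    rsync -a /mnt/data-snap/ backupserver:/backups/data/
    # 3. unmount and destroy the snapshot before its backing store fills up
    umount /mnt/data-snap
    lvremove --force /dev/vg0/data-snap
}
```

The `--size` argument is the backing store for changed blocks, not the size of the origin volume; it only needs to hold the writes that occur while the snapshot exists.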

You are approaching management of the Linux boxes as if they were Windows boxes. That is what is causing you the greatest pain. Basically, a Linux box can be divided into 3 groups:

Configurations

Data

OS

Configurations: two types. User configs: these are kept in users' home directories, so no need to worry about that, as normal backup takes care of it (an exception can be /root). System configs: this is /etc/ and key entries in /var, for the most part.

LVM snapshots suck because you can't store the snapshot data on the filesystem you're snapshotting. Sure, there's tons of ways to come up with extra space to store the snapshot data in, but they're all gigantic pains in the ass.

All it needs is the ability to exclude particular blocks from the snapshot, which should be a ridiculously easy option to implement for anyone who's worked on the snapshot code, and then people who aren't experts in kernel hacking can take care of the rest of the layers to make it a usable feature.

And they only work fine if you have no qualms about seriously degrading performance over time. Windows VSS-provided snapshots ("shadow copies") more closely resemble ZFS snapshots than, say, dumb SAN or LVM snapshots, where the snapshots reside in a dedicated "snapshot area". NTFS and ZFS have filesystem-level snapshots, so the FS is able to put old and stale data relatively close together, and defrags can intelligently move stale data out of the way, keeping the live data in a contiguous area of disk.

They require you to pre-allocate space, so you have to guess how much copy-on-write difference will accumulate between the original and snapshot over the lifetime of the snapshot.

If the snapshot runs out of space, it *should* cleanly disable itself. Pity about the filesystem mounted from it, which has no idea its backing block device just vanished. It gets messy, fast, when an LVM snapshot runs out of storage.

LVM snapshots are really inefficient, because they track all block changes, not just user-level file data and metadata. This massively bloats the snapshot, reducing how long it lives until it runs out of backing store and disables itself.

LVM snapshots don't share backing store. If you have three of them, snapshot t+3 has to store all the data snapshots t+2 and t+1 do, and so on. The differences between the real fs (t) and snapshot t+1 end up being stored three times in three separately allocated backing stores. You waste a HUGE amount of space this way, and it's hard to reliably predict how much you need, so your snapshots often vanish out from under you as you're trying to use them.

LVM is useful, but for someone used to the Volume Shadow Copy Service (VSS) on Windows servers, to ZFS, or to any of the "enterprise-y" file systems often seen deployed with big SANs, it's just going to make them cry.

If you use --link-dest=DIR, rsync will hard link to any files that are identical, so you can have snapshots that are accessible as an entire tree but that consume little more space than a snapshot delta.

Isn't that like complaining that your FAT32 partitions in Windows are not supported by Shadow Volume Copy then?

I think there's a bit of double standard here in the question. The OP is stating that they want to use a feature on an older server (2.4 kernel?) that isn't available unless you update the server, reformat, or what have you. The same applies to a Windows environment.

I think the question is misguided. They should be asking for a VSS-like feature for kernel 2.4 and ext3 systems.

I tried Linux LVM snapshots, but most of the servers I manage run regular partitions with ext3 file systems, so LVM snapshots will not work.

They have, presumably, tried and failed. It could support it, but it would have to be installed, updated or something to get it to work. There's something they're not doing on an old server that needs to be changed to support the feature they want. I was pointing out how this is not exclusive to the Linux servers.

He isn't complaining. You seem to be responding to his mentioning that "he knows how to do this on Windows" , by interpreting it as "Why is Linux so broken that it can't do a simple thing like that?"
This isn't a Linux versus Windows thing. This is a Windows user, migrating to Linux and wants to know how to accomplish something. Constructive answers are more useful in such cases than getting defensive by alleging hypocrisy and double standards.

The double standard being that the Linux servers wouldn't need to be updated where the Windows servers would. There's an update that has to happen to support the feature. Linux is not immune to this (though it would likely do the update without a total rebuild, as opposed to Windows).

You currently have ext3 fs that are NOT on LVM. In the future, choose LVM.

The choice isn't that simple. LVM comes with its own complications, including a tendency to get volume offsets "wrong" so the file system data doesn't align nicely to RAID stripes. This is not good for performance.

Also, LVM has only recently acquired barrier support, and the combination of no barriers + write cache can be quite dangerous if power is lost. Even battery backed cache won't save you if you use a journalling file system (and everybody does these days) because request ordering isn't guaranteed.

I haven't touched Solaris since it had a 2 in front of its version number, but I must admit that I suffer from ZFS envy.

If they are indeed regular partitions, he can't use LVM snapshots. However, the best solution is to convert from partitions to LVM volumes. It's a little effort to do so, but switching is worth it. Second best is to wait until btrfs is more usable. As a ZFS user, I can say that filesystem-level snapshots are much nicer than LVM snapshots in lots of ways.

Another possibility is to abuse hardlinks. You can create a copy of a directory with cp -al, and then if you overwrite (not modify in place) files on the original, you have a copy-on-write copy. If you make your backups with rsync, you can rely on rsync replacing changed files wholesale rather than updating them in place, then use cp -al each week or day to store "incremental" backups for weeks and maybe more. I personally found this solution nice, but then I installed Solaris on the backup machine and used ZFS snapshots, which do the same more safely, simply, and efficiently. If the backups are stored on a separate machine, switching it to Solaris is an option.
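The weekly/daily rotation described above, sketched as a script. The `/backup/daily.N` layout and the source host `server` are made up; this is the classic pre-`--link-dest` rotation scheme:

```shell
# daily.0 is the newest copy; unchanged files are shared via hard links,
# so seven "full" trees cost little more than one plus the daily churn.
rotate_and_backup() {
    rm -rf /backup/daily.6
    for i in 5 4 3 2 1 0; do
        if [ -d "/backup/daily.$i" ]; then
            mv "/backup/daily.$i" "/backup/daily.$((i+1))"
        fi
    done
    if [ -d /backup/daily.1 ]; then
        cp -al /backup/daily.1 /backup/daily.0   # hard-link copy of yesterday
    fi
    # rsync replaces changed files (write-temp-then-rename), which breaks
    # the hard link for that file only -- exactly what we want here.
    rsync -a --delete server:/data/ /backup/daily.0/
}
```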

Another thing that can be done is to keep the LVM snapshots on a separate machine and leave the current partitions as they are. It can be done with rsync, or a DRBD device can be used to sync with the server (it can be created without reformatting the partitions, but you still need to make some changes, like shrinking and/or moving the data, which might destroy the data if you don't know what you're doing).

It can be done with rsync, or a DRBD device can be used to sync with the server (it can be created without reformatting the partitions, but you still need to make some changes like shrinking and/or moving the data, which might destroy the data if you don't know what you're doing).

DRBD does not require resizing partitions. The metadata can be kept on a small partition on an entirely different disk if you so choose. At 128MB each, with minimal data being written, you could well plug in a small USB flash drive.

rsnapshot has no actual snapshotting functionality. It's just a (very useful) wrapper around rsync that takes care of de-duplication. While making an rsync/rsnapshot backup, files on the system can still be changed. For example: an LVM snapshot would give you a consistent MySQL dump if you're using InnoDB. Rsync/rsnapshot does not.

Only works if the partitions don't change while you are copying them. The big advantage of using LVM for this is that you can create snapshots on a live system, without resorting to remounting the partition read-only (and all the problems that will cause).

But really, those are his only options. If you insist on using plain ext3 and won't add a layer between the FS and the disk to allow for this, then you have no choice but to freeze the partition while doing a volume-level backup.

The GNU folks, in general, abhor man pages, and create info documents instead. The maintainer of tar falls into this category. Thus this man page may not be complete, nor current, and was included in the Red Hat CVS tree because man is a great tool. :) This man page was first taken from Debian Linux and has since been lovingly updated here.

man tar [die.net] Yet another reason to switch from Linux. The crappy state of GNU documentation.