NFS Use For Backups

This is the proposed statement to be included in the release notes following the RFE:

ZCS & NFS:

Zimbra will support customers that store backups (e.g. /opt/zimbra/backup) on an NFS-mounted partition. Please note that this does not relieve the customer of the responsibility of providing a storage system with a performance level appropriate to their desired backup and restore times. In our experience, network-based storage is more likely to encounter latency or disconnects than equivalent storage attached by Fibre Channel or direct SCSI.

Zimbra continues to view NFS storage of other parts of the system as unsupported. Our testing has shown poor read and write performance for small files over NFS implementations, and as such we view it as unlikely that this policy will change for the index and database stores. We will continue to evaluate support for NFS for the message store as customer demand warrants.

When working with Zimbra Support on related issues, please disclose that the backup storage in use is NFS.

Things To Check

Check /var/log/messages on both the Zimbra server and the NFS server for NFS-related errors during the time frame of your backup.

Check /opt/zimbra/log/mailbox.log for errors about folders or files that could not be written, or about missing directories.
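
A quick way to scan both logs for the backup window (a sketch; the grep patterns are only examples, adjust to your environment):

grep -i nfs /var/log/messages
grep -iE "unable to create|failed to write|no such file" /opt/zimbra/log/mailbox.log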

Is root_squash configured on the NFS server? If it's changed to no_root_squash, does the behavior of the backup change?
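
For reference, a hypothetical /etc/exports entry on the NFS server for the no_root_squash test (the export path, network, and other options are placeholders):

/nfs-test 192.168.1.0/24(rw,sync,no_root_squash)

Run exportfs -ra on the NFS server after editing /etc/exports for the change to take effect.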

Is the */backup directory owned by zimbra:zimbra with at least 750 or 755 permissions?

The parent directory is the one given by:

zmprov gs `zmhostname` zimbraBackupTarget

Does the zimbraBackupTarget directory have at least the subdirectories sessions and tmp, and are they owned by zimbra:zimbra with 750 or 755 permissions?

If not, try manually creating them and then running a test backup.
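
A minimal sketch of creating them manually, assuming the default /opt/zimbra/backup target:

mkdir -p /opt/zimbra/backup/sessions /opt/zimbra/backup/tmp
chown zimbra:zimbra /opt/zimbra/backup /opt/zimbra/backup/sessions /opt/zimbra/backup/tmp
chmod 755 /opt/zimbra/backup /opt/zimbra/backup/sessions /opt/zimbra/backup/tmp
zmbackup -f -a user@domain.com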

Debugging Example

Steps I wrote for one customer; saving out the information as you walk through all the commands should [hopefully] give enough information to submit a good RFE/bug:

1. make a test partition on nfs server - /nfs-test
2. mount on zimbra server
2A. mkdir /nfs-test
2B. chmod 755 /nfs-test
2C. mount nfs-server:/nfs-test /nfs-test
2D. ls -la /nfs-test
2E. mkdir /nfs-test/backup
2F. chown zimbra:zimbra /nfs-test/backup
2G. chmod 755 /nfs-test/backup
2H. su - zimbra -c 'touch /nfs-test/backup/testfile'
2I. ls -laR /nfs-test/
2J. rm /nfs-test/backup/testfile
3. Set zimbraBackupTarget
3A. zmprov ms `zmhostname` zimbraBackupTarget /nfs-test/backup
4. Run a full backup against one account
4A. ex. zmbackup -f -a user@domain.com
5. ls -laR /nfs-test/
6. If you run into the same problem again, you could repeat the backup after increasing the backup
logging level for the account you're trying to back up. If you didn't run into the same problem, it might
have had to do with the initial setup of the NFS mount and the permissions used during directory creation.
6A. zmprov aal user@domain.com zimbra.backup debug
6B. Logging will show up in /opt/zimbra/log/mailbox.log
6C. Remove the account logging when you're done: zmprov ral user@domain.com zimbra.backup
7. Change zimbraBackupTarget back to your production path.

Setup A Fast Test NOT Using NFS

A way to "avoid" the NFS issues for testing purposes would be to setup a new zimbraBackupTarget to try doing a full backup of a couple of user accounts. I DON'T recommend this if your using auto-group for zimbraBackupMode [ zmprov gs `zmhostname` zimbraBackupMode ] , only if your using Standard mode.

Understanding Option Flags For zmbackup & zmrestore

First, they don't make sense if you're just reading the help output; I will not argue this point at all.

The biggest problem with the options I point out below is that you can often include them in a command and they do nothing, or you include them for a particular situation and they don't apply. Why is this a problem? Because they are silent: there is no output telling you that an option is unnecessary, redundant, or will actually cause your intended result to fail simply because you included it.

zmrestore Options

Problems mostly revolve around these options.

To Times In The Past (If -lb Isn't Used, Implies You're Using Times/Incr/RedoSeq AFTER Last Full)

-restoreToIncrLabel

<arg> Replay redo logs up to and including this incremental backup

Requires: --label or -lb

-restoreToTime

<arg> Replay redo logs until this time

Requires: --label or -lb

-restoreToRedoSeq

<arg> Replay up to and including this redo log sequence

Redolog Variables

--backedupRedologs or -br

Replays the redo logs in backup only, which excludes archived and current redo logs of the system

Only useful when restoring against the latest full backup (NOT using the -lb option).

It will still restore incremental backup data taken after the last full backup, but NOT any redolog activity from the live system (archived or current redologs).
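
For example, a sketch of a restore against the latest full plus its incrementals while excluding the live system's redologs (the account is a placeholder):

zmrestore -a user@domain.com -br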

--restorefullBackup or -rf

Restores to the full backup only, not any incremental backups since that backup.

The default behavior of zmrestore is to always replay from a "full" plus the "incrementals" associated with it UNLESS you state otherwise.

If you do:

zmrestore -a user@domain.com -lb full6monthsago

It will playback the incremental data associated with that full from 6 months ago.

If you do:

zmrestore -a user@domain.com -lb full6monthsago -rf

It will ONLY playback the data in the full from 6 months ago.

In other words, it will not progress past the data in that full backup; no incremental backups after it are applied.

It implies NO redolog replay, so there's no need to use -br.

Targets And Labels

--label or -lb

<arg> The label of the full backup to restore. Restores to the latest full backup if this is omitted.

--target or -t

<arg> Specifies the backup target location. The default is <zimbra_home>/backup.
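
For example, a sketch of restoring from a non-default target; the path and label are placeholders:

zmrestore -a user@domain.com -t /mnt/nfs-backup -lb full-20081005.120129.352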

Impact Of AutoGroup Option Being Used

Place for notes about how autogroup backup option might impact or limit command options.

zmbackup Options

Problems mostly revolve around these options:

--target or -t

<arg> Specifies the target backup location. The default is <zimbra_home>/backup.

--zip or -z

Zips email blobs in backup - using compression

--zipStore

Zips email blobs in backup - does NOT use compression

Deleting Old Backups -del

Caution: you want to delete from the oldest label to the newest. The -del option will automatically purge all older sessions prior to the label you specify. To find the label names, use zmbackupquery.

Format example:

zmbackup -del <oldest_backup_label>
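
A sketch of the workflow, with a hypothetical label; remember that everything older than the given label is purged along with it, so double-check the label first:

zmbackupquery --type full
zmbackup -del full-20081001.120000.123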

Impact Of AutoGroup Option Being Used

Place for notes about how autogroup backup option might impact or limit command options.

Use DL COS Status To Generate User List Of Accounts

An Overview Of Some Backup/Restore Items

Shared Blobs (messages)

I believe we are a little light on describing the shared-blobs situation. Shared blobs can require different corrective steps than a normal message issue where the message isn't shared. I'll start some notes on this here.

"When backing up shared messages, the backup process looks to see whether a Binary Large Object file (BLOB) representing a message already exists in the backup. If it does, it simply flags this object as such and does not copy its content again."

"Keeping the same backup target saves disk space, because shared binary large object files (BLOB) and other files do not have to be duplicated every time the backup process runs.

Bugs/RFE's

"Use zip files for shared blobs of a full backup made with --zip option"

Remote copies of backup data for DR use

You will want to copy /opt/zimbra/redolog/archive/* and /opt/zimbra/redolog/redo.log over frequently in order to stay current. The redo.log file being open is not a problem, since the crash-recovery step can work with a redo.log file in any state.

The redolog/archive/ directory contains logs that have not yet been backed up by an incremental backup (or by a full in auto-grouped mode).

The redo.log rolls over when it reaches zimbraRedoLogRolloverFileSizeKB (by default 100MB prior to ZCS 5.0.11 and 1GB after). When ZCS restarts after a crash, it seems to work through the current redo.log fine regardless of its state, if the current log really must be copied.

My Initial Thoughts On This

Start of process:

Weekend full on prod

rsync full-xxx on prod > remote sessions/

rsync redolog/* > remote redolog/

through non-full and non-incremental times, rsync redolog/* every X minutes

weekday nights 10pm: incr-xxx on prod

rsync incr-xxx on prod > remote sessions/

rsync redolog/* > remote redolog/

Created three separate rsync cron rules.

Full - once a week

Confirms full is done and then looks for latest full-xxxx and rsyncs that specific directory

Incr - once a night except for the full-schedule night

Confirms the incremental is done and then looks for the latest incr-xxx and rsyncs that specific directory

Redolog/ - every X minutes (outside of full and incremental backup sessions)

Does a full rsync of redolog/ - probably want the delete/remove option?

lsof will report /opt/zimbra/redolog/redo.log is open.

Somewhere we need to account for accounts.xml in this process, and also confirm what else might be missing. We also need steps on the actual restore process, depending on when/where the DR event took place. A rough sketch of the rsync jobs is below.
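
The remote host and destination paths below are placeholders, and the -H flag preserves hard links (see the hard-links section later on this page):

# weekly, after the full completes
rsync -aH /opt/zimbra/backup/sessions/full-xxx remotehost:/dr/backup/sessions/
# nightly, after the incremental completes
rsync -aH /opt/zimbra/backup/sessions/incr-xxx remotehost:/dr/backup/sessions/
# every X minutes, outside the backup windows
rsync -a --delete /opt/zimbra/redolog/ remotehost:/dr/redolog/
# accounts.xml, per the note above
rsync -a /opt/zimbra/backup/accounts.xml remotehost:/dr/backup/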

What's Needed For Later Restores

In regards to what is moved and what is needed for later restores, you must remember this "flow" of the backups:

Full backup files, which contain all the information needed to restore mailboxes (to that point in time)

Incremental backup files, which contain the LDAP directory server files and all the redo log transactions written since the last backup (to that point in time)

Redo logs, which contain current and archived transactions processed by the Zimbra server since the last incremental backup (to that point in time; more about this topic above)

Variables to be aware of in regard to backup/restore [ view with: zmprov gs [server-name] | grep Backup ]:

During the backup we update the zimbra.mailbox table for each mailbox to record the most recent backup time. This is in the "last_backup_at" column within Mysql.

This data is used by auto-grouped backup to figure out which mailboxes to backup.

Possible Issue That A Failed Or Interrupted Backup Causes

An interrupted backup can cause an issue because the table currently gets updated right off the bat rather than waiting for backup to be successfully completed.

Possible RFE: To update zimbra.mailbox.last_backup_at column for successfully backed-up mailboxes to the very end of the backup process, to either just before or just after renaming the /opt/zimbra/backup/tmp/<backup label> directory to /opt/zimbra/backup/sessions/<backup label>.

Setting To Null To Cause A New Backup For User

To undo what was done by an interrupted backup, for example, you need to clear this column (set it to NULL) for the affected mailboxes. By clearing the column, you force the next auto-grouped backup to pick these mailboxes because they look like they have never been backed up. If you don't clear this column, you have to wait until the next cycle (7 days with the default number of groups).

Example syntax to view:

mysql zimbra -e "select last_backup_at from mailbox where id=27"

Example syntax to set last_backup_at to NULL:

mysql -e "UPDATE mailbox SET last_backup_at = NULL WHERE id = 27"

Related Bugs RFE's

"In auto-grouped backup, delay the update of mailboxes' last_backup_at timestamp to the very end of backup"

Change "Location" For Backup Or Restore Source Data

Remember that zmbackup and zmrestore can also take flags regarding the location of items.

zmrestore & zmbackup can both take: -t,--target (default <zimbra_home>/backup)

You can't specify a different location for redo logs, though. There's a command called zmplayredo [for newer versions of ZCS] that has a variable to point at the redologs to play from [ --logfiles ]. It will replay into the default redolog directory or redolog file. The mailbox has to be stopped to run zmplayredo; it's a command to manually kick off a replay of a redo log. This replay is normally handled by zmrestore when restore-to-a-specific-time options aren't included. A sketch of a manual replay is below.
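
The file path and mailbox id here are placeholders; zmmailboxdctl stops and starts just the mailbox service:

zmmailboxdctl stop
zmplayredo --logfiles /path/to/redolog/archive/redo-file.log --mailboxId 27
zmmailboxdctl start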

Manual Removal Of Older Backup Sessions

General Situation:

Keep in mind every restore requires starting with data from a full backup.

For each account on the server, there must be at least 1 full backup after the deletion is complete.

You should also make sure all incremental backups made after the oldest of the remaining backups are retained.

This basically gets reduced to deleting only those backups that are old enough, based on your full/incremental schedule.

More Specific Issues:

Does the accounts.xml dependency cause issues with this?

No. Just don't delete it.

What about the contents of the backup/tmp/ data or shared blobs type references?

Don't touch this directory either. It is used during backup and restore. You don't want to change its content while an operation is going on, so best to leave it alone.

What if a zimbra server is running some type of restore of backup command while the manual removal is running on the nfs server?

You shouldn't remove backups that are being used by a restore currently underway; you are responsible for avoiding that race condition. Make sure no backup or restore is happening at the moment, then rename the directories to be deleted, preferably moving them to another subdirectory, e.g. /opt/zimbra/backup/sessions_to_delete. Then delete. A sketch follows.
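
The session name here is a placeholder:

cd /opt/zimbra/backup
mkdir sessions_to_delete
mv sessions/full-20080101.* sessions_to_delete/
# once you're satisfied nothing is using them
rm -rf sessions_to_delete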

A Way To Verify Backup Integrity

Auto-Group Backups Rather Than Default Method Topics

General Description And Official References

Having trouble completing that entire full backup during off-hours? Enter the hybrid auto-grouped mode, which combines the concepts of full and incremental backups: you completely back up a target number of accounts daily rather than running incrementals.

Auto-grouped mode automatically pulls in the redologs since the last run, so you get incremental backups of the remaining accounts, although the accounts captured via the redologs are not listed specifically in the backup account list. This still allows you to do a point-in-time restore for any account.

Simply divide your total accounts by the number of groups you choose (zimbraBackupAutoGroupedNumGroups is 7 by default) and that's how many will get a full backup session each night. For example, with 7,000 accounts and the default 7 groups, roughly 1,000 accounts get a full backup each night. Newly provisioned accounts, and accounts whose last backup is older than a specified number of days, are picked first (zimbraBackupAutoGroupedInterval defaults to 1d).

Think of auto-grouped mode as a full backup for the scheduled group plus an incremental (via redologs) for all other accounts at the same time.
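
For reference, a sketch of switching a server to auto-grouped mode and tuning the group count and interval (verify the attribute values against your version's documentation):

zmprov ms `zmhostname` zimbraBackupMode Auto-Grouped
zmprov ms `zmhostname` zimbraBackupAutoGroupedNumGroups 7
zmprov ms `zmhostname` zimbraBackupAutoGroupedInterval 1d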

Bugs - RFE's To Review For Auto-Group

Please see:

"In auto-grouped backup, delay the update of mailboxes' last_backup_at timestamp to the very end of backup"

The Zip - Compression Option For Backups

Using the zip option will compress all those thousands of single files that exist under a user's backup, reducing the performance issues that arise from writing out thousands of small files as compared to a few large ones. This is often a win when you are:

Using NFS for the backup directory

Copying/rsyncing backups to a remote server

Using some third-party backup software (to tape) to archive/back up the Zimbra backup sessions.

Optional Tweaks To The Zip Options

Please see this comment and those underneath it within this RFE:

"Use zip files for shared blobs of a full backup made with --zip option"

Each zip file gets a queue of files to add. This key sets the queue size. Default is 10. Range is 1 to 10,000.

backup_zip_copier_deflate_level

Compression level. Default is -1 (same as in java.util.zip.ZipOutputStream); -1 is the same as level 6. This behavior comes from the zlib library, which the JVM uses to implement zip. Other than the special default value, the level ranges from 0 to 9: 0 means no compression, 1 is fastest compression, and 9 is best compression.

backup_disable_shared_blobs

This one isn't limited to zip backups. When this is set to true, all blobs are backed up as private backups. Default is false.

backup_debug_use_old_zip_format

If true, backup will behave like ZCS 5.0.4 and earlier. Shared blobs are never zipped, and private blobs are added to a single blobs.zip file in zip backup. Default is false.
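
These are localconfig keys, so viewing and setting them would look like this (the values here are only examples):

zmlocalconfig backup_zip_copier_deflate_level
zmlocalconfig -e backup_zip_copier_deflate_level=1
zmlocalconfig -e backup_disable_shared_blobs=false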

"-zip can be added to the command line to zip the message files during backup. Zipping these can save backup storage space."

The implication is that instead of having all the individual message files in the backup, it bunches them together into zip files. The body of a shared blob is added once to a shared-blobs zip file, then a small pointer-only entry is added to a mailbox's zip file - the same effect as in the non-zipped case. This is useful when the number of message files is causing disk I/O issues; maybe you're trying to rsync the backup session directories off to another server, or you're running a third-party backup on them to save to tape. The default use of -zip uses compression; if that also causes overhead you need to avoid, you can use the -zipStore option.

Note about -zipStore:

"when used with the -zip option, it allows the backup to write fewer files (-zip), but not incur the compression overhead as well"

The zip options affect backups that are in blob formats (fulls). Incremental backups are basically redologs, not the full message store of the user. In summary, the zip option will not impact incremental-type backups. Auto-grouped backups are a mixture of both fulls and incrementals.

How To Use As A Default Option?

You'll add the options to the zimbra crontab file. This can be done with the zmschedulebackup command.

Run zmschedulebackup with help option:

zmschedulebackup --help

You'll see:

-z: compress - compress email blobs with zip

It appears that you'll need to manually add the -zipStore option, if you want it, to the crontab file.
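
A sketch of reviewing the schedule and then editing the crontab by hand, run as the zimbra user:

zmschedulebackup -q
crontab -e
# append --zipStore to the zmbackup lines you want it on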

Edit /opt/zimbra/conf/log4j.properties for a "temporary" change that lasts until the next restart. It can take a couple of minutes before jetty "sees" the change.

Edit /opt/zimbra/conf/log4j.properties.in for a "permanent" change that sticks after a restart. A restart of jetty/mailboxd is required for this change: zmmailboxdctl restart .

log4j.logger.zimbra.backup=DEBUG

For incremental backups, this should log each redolog being copied to the backup and also log which ones will be deleted from the archive directory. Those not deleted are kept because they are newer than 1 hour (the default). The kept logs are deleted (but not copied again) during the next incremental backup.

Redolog Files

Redologs Copied To Backup Session And When Deleted

Archived logs that are less than an hour old at the time of an incremental backup are copied to the backup but aren't deleted, to support the post-crash waitset reinitialization mechanism. The interval is set in the localconfig key backup_archived_redolog_keep_time, which is in seconds; default=3600.

An Outline Of The Steps

1. Redo operations are written to the current /opt/zimbra/redolog/redo.log file.
2. Upon hitting zimbraRedoLogRolloverFileSizeKB, redo.log is flushed (renamed) to /opt/zimbra/redolog/archive/[file].

Steps 1 & 2 keep repeating each time zimbraRedoLogRolloverFileSizeKB is hit.

When a backup is done, the archive/* files are copied. The redo.log file is not moved.

When the backup processes archive/* logs, it first figures out the last sequence copied to backup. All newer logs are copied to the current backup. Then, all logs are deleted except those that are too new, determined by localconfig parameter backup_archived_redolog_keep_time, which defaults to 1 hour. (This is part of the waitset feature.)

In standard backup mode, only incremental backups move the redologs.

In auto-grouped mode, every backup is a hybrid of full and incremental and thus redologs are moved.

Redologs And Auto-group In Regards To Backups

Think of auto-grouped mode as a full backup for the scheduled group as well as an incremental (via redologs) for all other accounts at the same time. Auto-grouped mode automatically pulls in the redologs since the last run, so you get incremental backups of the remaining accounts, although the accounts captured via the redologs are not listed specifically in the backup account list. This still allows you to do a point-in-time restore for any account.

If You Have Older Redologs Not Being Deleted

According to the code, only archived logs newer than 1 hour old (default for backup_archived_redolog_keep_time) should remain after an incremental backup. It is a bug if you are seeing older logs sticking around. If so, look at mailbox.log and see if any error was logged. If you enable DEBUG logging for "zimbra.backup" logger in log4j.properties you will see log statements for each copy and deletion.

The zimbraRedoLogDeleteOnRollover variable

zimbraRedoLogDeleteOnRollover shouldn't have an effect on "If you have older redologs not being deleted". By default it's FALSE, and it affects whether anything makes it into /opt/zimbra/redolog/archive at all. With it set to TRUE there's just /opt/zimbra/redolog/redo.log, and on rollover it is deleted rather than rolled over into archive. As discussed above, old redologs are deleted after the incremental backup; thus if you don't take incremental backups, you should set this value to TRUE or periodically script manual deletion of /opt/zimbra/redolog/archive. (And with zimbraRedoLogEnabled FALSE there's no redo.log at all.)

If You Don't Run Incremental Backups Or Don't Need Archive Redologs

You would set zimbraRedoLogDeleteOnRollover to TRUE.

(Auto-Grouped backups you can still leave this to the default of FALSE.)
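
For example:

zmprov ms `zmhostname` zimbraRedoLogDeleteOnRollover TRUE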

Redolog Sequence And The Backup Session

Redologs will exist in the incremental backup sessions. The zmbackupquery command will reference the redologs associated with the backup. For example:

Need To Move Redologs Because Partition Getting Full

Let's say you have a partition getting full and you need to move the redolog to another partition or NFS mount temporarily, to deal with the crisis that will happen when the partition becomes full. You'll need to relocate the complete redolog/ directory, including the archive subdirectory, to the same partition, because the rollover from redo.log into the archive directory happens via a rename function within the Java code. This will require downtime, since you'll need to move the actual redo.log file and Zimbra can't be running while you do this. You can use a symlink to your new partition path; for example, see the sketch below.
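
A minimal sketch of the move, assuming the new partition is mounted at /new-partition:

zmcontrol stop
mv /opt/zimbra/redolog /new-partition/redolog
ln -s /new-partition/redolog /opt/zimbra/redolog
chown -h zimbra:zimbra /opt/zimbra/redolog
zmcontrol start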

Pick the right redolog file, either redo.log or one of the files under archive/, based on timestamp.

zmplayredo And zmredodump

zmplayredo - Replaying Content From Any Redolog File

zmplayredo is a newer command, first introduced in 5.0.5 I believe. The mailbox has to be stopped to run zmplayredo.

The help output from 6.0.8:

$ zmplayredo --help
usage: zmplayredo <options>
--fromSeq <arg> Replay from this redolog sequence (inclusive)
--fromTime <arg> Replay from this time (inclusive)
-h,--help Show help (this output)
--logfiles <arg> Replay these logfiles, in order
--mailboxId <arg> Replay for this mailbox only
--queueCapacity <arg> Queue capacity per player thread; default=100
--stopOnError Stop replay on any error
--threads <arg> Number of parallel redo threads; default=50
--toSeq <arg> Replay to this redolog sequence (inclusive)
--toTime <arg> Replay to this time (inclusive)
Specify date/time in one of these formats:
2010/11/19 13:55:08
2010/11/19 13:55:08 802
2010/11/19 13:55:08.802
2010/11/19-13:55:08-802
2010/11/19-13:55:08
20101119.135508.802
20101119.135508
20101119135508802
20101119135508
Specify year, month, date, hour, minute, second, and optionally millisecond.
Month/date/hour/minute/second are 0-padded to 2 digits, millisecond to 3 digits.
Hour must be specified in 24-hour format, and time is in local time zone.

zmredodump - Dumping Content From Any Redolog File

zmredodump is a newer command and very useful. It does not require mailboxd to be stopped like zmplayredo does.

The help output from 6.0.8:

$ zmredodump --help
usage: zmredodump [options] <redolog file/directory> [...]
where [options] are:
-h,--help show this output
--m <arg> one or more mailbox ids separated by comma or white
space. The entire list must be quoted if using space as separator. If
this option is given, only redo ops for the specified mailboxes are
dumped. Omit this option to dump redo ops for all mailboxes.
--no-offset don't show file offsets and size for each redo op
-q,--quiet quiet mode. Only print the log filename and any errors.
This option can be used to verify the integrity of redologs with minimal
output.
--show-blob show blob content. Item's blob is printed, surrounded
by <START OF BLOB> and <END OF BLOB> markers. The last newline before end
marker is not part of the blob.
Multiple log files/directories can be specified. For each directory, all
redolog files directly under it are processed, sorted in ascending redolog
sequence order.

Using zmredodump To Get Message Blobs To Inject With zmlmtpinject - RFE

Please see:

"RFE: zmredodump blobs to single files for zmlmtpinject [for example]"

How Do I Figure Out Which Sequence or Time Variable To Use For Restore Or Replay

In 5.0.10+ we'll have a CLI wrapper (zmredodump) with a slightly different command-line syntax, but the long syntax below works in earlier versions.

To locate the correct restore-to time, you have to start with the approximate time the message was added/deleted. Look at the redolog files: the filename contains the GMT time when the file was rolled over, which is roughly the timestamp of the last operation in the file. If your time data is accurate, you can find the specific file; otherwise you have a range of files to examine.

Use the redolog dump tool to dump the contents into text form, with the option that shows message body data (--show-blob in the zmredodump syntax above):

If the message was deleted and you don't know the id, you must go by some other clue, such as the subject. Search the file to locate your message. You can cut/paste the message and LMTP-inject it to recover it; no need to go through a restore if this is all you needed.
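
A sketch of hunting for a deleted message by subject; the mailbox id, subject, and paths are placeholders:

zmredodump --m 27 --show-blob /opt/zimbra/redolog/archive > /tmp/redodump.txt
grep -n "Subject: quarterly report" /tmp/redodump.txt

Per the help output above, a directory argument is processed in ascending redolog sequence order, so pointing zmredodump at the archive directory covers the whole range.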

Are Your Messages Really Gone - Things To Check If zmplayredo Isn't Doing What You Expect

Here's something I found out testing zmplayredo for a customer case. Testing on a ZCS 6.0.8 single ZCS server.

Created a test account and sent it one message, which sits in the Inbox. I deleted the msg in ZWC but didn't purge the Trash; the msg is in the Trash now.

Gap In Redo Log

"To avoid future restore problems, discard all existing backups and take a full backup of all accounts; If this error occurred during restore, try the --ignoreRedoErrors option"

The output is pretty accurate in how to handle the situation.

If you get the error during a backup, the recommendation is to move your old backups out - the directories in /opt/zimbra/backup/sessions/* . You'll want to keep them around just in case, and then proceed to do a full backup.

If you get the error during a restore, you would add the flag --ignoreRedoErrors to your restore command.

Another possible related issue is if your /tmp or /opt/zimbra/redolog/ is filling up.

Error Executing redoOp

Errors with restores that involve the message 'error executing redoOp' will not show up in the admin console but will when you attempt the restore from CLI. This can also be the cause when you use the RestoreToTime option from the admin console and it doesn't seem to work correctly - the restore stopping prematurely from the specified date/time.

This could explain why your restore to time isn't working in the Admin Console but does from CLI when you see an error about redologs and then reattempt restore with the --ignoreRedoErrors and it works.

Another RFE that was made but marked as 'WONTFIX' that gives a background story to the issue is:

Do your restore against the latest full backup of the account in question and then use the zmplayredo command against the redologs in the incrementals and/or the /opt/zimbra/redolog/* directory. This gives you more control to walk the restored account up to the point in time you want it at. One should really read through the whole section above, Ajcody-Backup-Restore-Issues#Redolog_Files, to understand the whole concept of redologs and the use of zmplayredo.

Generally, "fixing" the redolog itself is not an option.

Why Do My Fulls Not Report All Accounts?

Are you sure it was a full backup that was run, or just a full session that was generated by your incremental backup job? When an incremental runs, it will create a "full" session for any new accounts it discovers after the last actual full backup job.

For example, here's a full session that was created by an incremental backup job:

Notice the Start and End times; these show that the full is related to the incremental job.

You'll want to run zmbackupquery against your full labels to see your "main" full backup sessions - assuming you can't simply tell from the cron entry for it [ as the zimbra user: crontab -l | grep backup ]. For example, to see all your fulls from today's date back to October 01, 2008, and the accounts within each session, you would do:

zmbackupquery -v --from 20081001.000000 --type full

The -v flag outputs the accounts, --from uses YearMonthDay.HourMinuteSecond, and --type can be full or incremental. To see just one particular session, you would use the -lb [label] flag:
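
For example, with a hypothetical label:

zmbackupquery -lb full-20081005.120129.352 -v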

You can run into numerous issues if you allow your backup directory to become full.

Confirm that /opt/zimbra/backup/accounts.xml is being updated after a backup. If you see that the newest accounts.xml* file is accounts.xml.new, this is a sign of problems.

Confirm that the files in /opt/zimbra/backup/tmp/* don't have 0-byte lengths. There might be files like 1.xml and 3.xml in there; if they show 0 bytes, you need to remove them. The backup/restore commands fail if the files exist and are empty. Your errors might look like this:

The gist: you MUST use the -lb full-200xxxxxx option when you're trying to restore anything that ISN'T meant to include the latest information in the mailbox. The -lb argument should specify a full backup that took place prior to the point in time you wish to restore.

Restore An Individual Message

The zmrestore command is at a mailbox level.

An RFE was filed already to expand this. It is currently targeted for the Helix release.

You can also export SOME of the 'old' data from the restored account using other options. One option is the before and after query variables. NOTE: we have to set the query string as a shell variable to get around some shell quoting issues.

For example:

$ query='before:11/20/2010 after:11/1/2010'

ZCS 5 might require you to use %20 rather than the actual space character.

Note: a critical option in the above command is &resolve=replace. There are various ways you can handle the importing of data; please review the following to determine what is best for your needs.
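
A sketch of the export/import pair using that query variable; the account names and paths are placeholders, and tgz is one format choice among several:

query='before:11/20/2010 after:11/1/2010'
zmmailbox -z -m restore_user@domain.com getRestURL "//?fmt=tgz&query=$query" > /tmp/olddata.tgz
zmmailbox -z -m user@domain.com postRestURL "//?fmt=tgz&resolve=replace" /tmp/olddata.tgz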

Restore Deleted Items - skipDeletes Option - ZCS6+

Added new option --skipDeletes to zmrestore. If specified, skip
over delete operations during redo replay. Delete ops are:
DeleteItem (hard delete)
DeleteMailbox
EmptyFolder
MoveItem, if moving item to Trash folder
PurgeImapDeleted
PurgeOldMessages
Skipping these deletes can lead to other problems later, such as
conflicting paths, but it is assumed the priority is recovering as
much data as possible when using this option.
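
For example, a sketch with a hypothetical label:

zmrestore -a user@domain.com --skipDeletes -lb full-20101105.020002.123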

Restore Account Not Yet In Backups

Please see:

"add ability to restore accounts not yet backed up (but still in redologs)"

Users Trash Items

User Ability To Recover Trash Purge

Retention Policy About Purges

Can't Restore Or Find An Account That Was Renamed

When an account is "renamed", the old account name will no longer be "found" in your "default" type restore or backup queries. This can cause some confusion when you need to restore to a time frame when the account was under its older name.

Restoring From Admin Console GUI

I'll give a detailed explanation of the situation when working around restores of renamed accounts in the admin console [web GUI]:

If you go to the "Restore" button in the GUI, it first asks for an account rather than giving an option for date/time/session. As stated, "renamed" accounts don't show up in this query window, so one can't really progress to the next window that would allow changing the backup session label.

The way around this is to double-click on the full session listing that you see on the backup page in the admin UI. This brings you to another page specific to that session. There you should see the old account name prior to the rename. You can then highlight that account listing and click the "Restore" button. This brings up the restore dialog, which will now have the date/time/session label auto-filled.

Quota Is Stopping A Redirected Restore

Update

"accounts with quota set by COS fail to restore when over default quota"

Reasoning for the need: maybe the msg files coming from the restore are no longer "shared message blobs" and therefore increase the mailbox to a size it wasn't in the past. Changes to HSM, maybe?

I think I'll need to create an RFE about adding an option to the zmrestore command to set the COS value on a created account. Until then... create a new COS and set it up to NOT have any quota. Once you kick off the restore and you see the account has been created, you can then apply the COS to that account. Call the COS something like no-quota. Here are the steps below.

zmrestoreldap doesn't have options that allow specific items to be restored (COS, DLs, etc.); it only has an option for named accounts (-a). One could try an ldapadd with an LDIF of the COS or DL details. One could also take the information on the COS or DL from the LDAP file in the backup session, to at least have all the variables to manually add it back (via the zmprov command). You're looking at the backups on the LDAP master if you're in a multi-server configuration.

/opt/zimbra/backup/sessions/full-xxxxxx/ldap/ldap.bak

Start of ldap entry example to search for:

Cos example

dn: cn=default,cn=cos,cn=zimbra

DL example

dn: uid=dl-group,ou=people,dc=mail,dc=domain,dc=com

To compare a current DL with past details, just save out the LDAP entry from the backup to a text file, and then do:

zmprov gdl maillist@domain.com

Make the necessary changes after comparing the two.
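
A sketch of the comparison, assuming a placeholder session label and DL name; the -A 50 context size is arbitrary, widen it if the entry is longer:

grep -A 50 "dn: uid=dl-group" /opt/zimbra/backup/sessions/full-xxxxxx/ldap/ldap.bak > /tmp/dl-backup.txt
zmprov gdl maillist@domain.com > /tmp/dl-current.txt
diff /tmp/dl-backup.txt /tmp/dl-current.txt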

Restoring A Calendar (ics)

There seems to be a bug, or an odd expectation, in how this command currently works. If the appointment exists in the Calendar and its time differs from the same appointment in the ics file you're importing, the time of the appointment will not change to the imported one. If you delete the event first, then the imported appointment will reflect the correct time.

Here's what I did to reproduce this situation. It seems this has been true for some time; the customer was on 4.5.11 and I was on 5.0.8.

Created test account and made two appointments on friday - 9am and 4pm.
Did a full backup.
Restored test account to restore_user
Ran :
zmmailbox -z -m user@domain.com gru /Calendar > /tmp/calendarA.ics
zmmailbox -z -m restore_user@domain.com gru /Calendar > /tmp/calendarB.ics
And then
diff /tmp/calendarA.ics /tmp/calendarB.ics [no differences]
Now some tests.
As user, I deleted the two appointments and then:
zmmailbox -z -m user@domain.com pru /Calendar /tmp/calendarB.ics
Refreshed Calendar as User in webclient.
9am and 4pm appointment shows up.
I then moved the 9am appointment to 11am in the web client
Did another restore:
zmmailbox -z -m user@domain.com pru /Calendar /tmp/calendarB.ics
Refreshed Calendar as User in webclient.
11am and 4pm appointment shows up.
** The restore did not move the 11am appointment back to the 9am slot as in /tmp/calendarB.ics
** Assumption: this process will not overwrite an appointment if one is already there; it does not look at the time.
Let's do a diff of the state of the calendar
zmmailbox -z -m user@domain.com gru /Calendar > /tmp/calendarC.ics
diff /tmp/calendarB.ics /tmp/calendarC.ics
The DTSTAMP and SEQUENCE shows the difference in the time.
If I delete the 11am appointment and then do the calendarB.ics restore the appointment shows up again at 9am.

I see the same behavior if I use the web interface to export/import the calendar between the restored account and the user's. Even when I import into a NEW calendar, the two appointments change to reflect the new calendar rather than the default one.

One Fix

One fix, if the situation allows, is to purge the current Calendar and then import the full ics file. A sketch of how this might be done is below.
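
Reusing the test account and ics file from above; note that emptyFolder permanently deletes the calendar's contents, so be sure your ics export is good first:

zmmailbox -z -m user@domain.com emptyFolder /Calendar
zmmailbox -z -m user@domain.com pru /Calendar /tmp/calendarB.ics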

Setup A Secondary Zimbra Box For Restores Of Archive Accounts

Your Zimbra license can be installed on multiple machines. One idea that might prove useful in handling these "archive" accounts, for those situations when you need to investigate something, is to set up an "archive" Zimbra box. You'll want to isolate this box from any "production" use. It will need to be configured with the "domains" of the archive accounts. You can then use this box to restore the "archive" account and use the administration tools to investigate the user data.

Use Of REST And Other Tools For Specific User Data

The following page, User_Migration , shows numerous examples of how to export different types of data from a user account into a neutral file format that one could use for "archive" purposes.

Use Of The REST Command To Export ALL User Data - Version Dependent, 5.0.9+ [I believe]

Backup And HSM

Scripting Out Individual Backups Of Accounts

If you want to do individual backups of accounts using a for-loop, for example, you might want to include the -sync option of zmbackup. Without it, zmbackup will normally give an error as it reaches the next zmbackup command, stating that there's a backup already in progress.

Example command in some for-loop script is [without the -sync option]:

"Full backup is usually run asynchronously. When you begin the full backup, the label of the ongoing backup process is immediately displayed. The backup continues in the background. You can use the zmbackupquery command to check the status of the running backup at any time."

I couldn't find any other indication beyond that to explain the purpose of the flag in more detail. But from what is stated above, it does look like the -sync flag will resolve the "backup in progress" issues when scripting multiple zmbackup commands.

If not, you could query for "Status: in progress" from the zmbackupquery command.

You can give the zmbackupquery command flags for date/time, label, account, -t target, and so forth [do a zmbackupquery --help to see the option formats].
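
A sketch of such a loop, with a placeholder domain; the -sync flag makes each zmbackup run to completion before the loop moves on:

for acct in `zmprov -l gaa domain.com`; do
  zmbackup -f -sync -a "$acct"
done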

Backing Up Backups - 3rd Party Tools And Software - Dealing With Directories With Hard Links

Description Of Hard Links

Zimbra uses hard links, and special attention needs to be given to this fact. See hard links if you're not familiar with hard links and their difference from symbolic links. Not all third-party backup software will handle or respect hard links, and many Unix commands need special flags to maintain them. When hard links are not respected and each link is "copied" to the new location as a full file, you can find your data usage becoming a large multiple of the original size.

A good thread I found on the topic of preserving hard links for copy/move/backup operations is here:

"How to Copy a Filesystem and Preserve Hard Links in Linux - Some random bits scribbled by Jeremy Zawodny"

I'll be summarizing the comments and including additional information I find on the topic below, based upon the command or software being used.
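
As a starting point, rsync with -H is the commonly cited approach; a sketch for copying a backup tree while preserving hard links (paths are placeholders):

rsync -aH /opt/zimbra/backup/ /mnt/archive/zimbra-backup/

Note that -a alone does NOT preserve hard links; -H is what does it, and it adds memory and CPU overhead proportional to the number of links being tracked.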

Zimbra and Single Instance Storage - Hard Links

If hard links are possible, we use them. When hard links aren't possible, we have to save the 'msg' again; then, internally to that mailstore/disk partition, we can use hard links to that initial save.

Easy Way To SEE Hard Links In Use

I sent a test email with 5 accounts listed in the To field; it is shown below with the inode listing of 2133394. The first column of the ls output [because of the -i option] lists the inode number of the file. I included -type f because . and .. will show directories using the same inode as the 'name' of the directory listed.
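
A sketch of reproducing that kind of check; the store path and name pattern are placeholders:

find /opt/zimbra/store -type f -name '*.msg' -exec ls -i {} \; | sort -n | less

Messages sharing the same inode number in the first column are hard links to a single blob.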

Simple `cp -a` using cp (GNU coreutils) 5.97 on my debian does the job quite nicely, I just checked. No need for the --preserve=all option, -a implies --preserve=link. It didn't seem to take too long either, but I would be surprised if it was very much better than rsync. Much easier to remember though.

xfs_copy

LVM Tricks

LVM Snapshots

SAN Snapshot

Possible note of caution I've seen from someone, "When implementing snapshots for ZCS, you should do the snapshot across all ZCS LUNs for a single host at the same time using a consistency group (for netapp, I believe this means cg-start/cg-commit)."

Cloud Backups

Amazon S3 , Amazon EC2 , SecoBackup And/Or Tar

I've not used Amazon S3 or SecoBackup. I have no idea about the pricing structure of Amazon S3 and how differing solutions might cause price differences. What I think would be a reasonable approach:

Adjust zimbra cron to:

run zmbackup as normally scheduled but then include:

tar and gzip "new" backup that was made to a "staging" partition.

Setup Secobackup [CLI method for cron] to then copy this tar'd/gzip'd file to the Amazon S3 cloud.

Remove local tar'd/gzip'd file from staging partition.

I propose the tar'd/gzip'd step because I doubt there's a way to avoid the hard-link issue with SecoBackup/Amazon S3. Why pay multiple times for the same data?

Some information a customer reported to me:

S3 does not work as a normal filesystem and you cannot mount it; hence
it wouldn't normally work.
However, there are various projects out there which let you use S3 as
a local POSIX-compliant file system.
Possible options:
s3fs-fuse
jungle disk
subcloud
persistentfs
Amazon EBS
To cut a long story short, PersistentFS came out on top - it worked
extremely well - however it did not work with Zimbra at all once I set it
up as the store (/opt/zimbra/store).
The problem is that the filesystem, while POSIX compliant, does not
have support for hard linking (which is what Zimbra does with tmp incoming
messages to the store). -- [bug 43019 below]
So, overall it's not really possible to do it right now with S3.
They should have hard link support soon.

References:

Amazon Web Services Homepage - There's various "services" available there

Backup Exec - Symantec / Veritas

Uses the "Backup Exec Remote Agent for Linux or UNIX Servers". The "server" is Windows based only.

I was unable to find any reference about whether the Linux/Unix Agent and the Backup Exec server can handle symbolic and hard links.

Backup Exec "server" is only available on Windows. One might inquire with Symantec if you can "swap out" your current investment in "Backup Exec" and use their NetBackup product. This supports hard links and the "server" can run on Windows or other *nixes. See Ajcody-Backup-Restore-Issues#NetBackup_-_Veritas.2FSymantec

BackupPC

Bacula

When enabled (default), this directive will cause hard links to be backed up. However, the File daemon keeps track of hard linked files and will backup the data only once. The process of keeping track of the hard links can be quite expensive if you have lots of them (tens of thousands or more). This doesn't occur on normal Unix systems, but if you use a program like BackupPC, it can create hundreds of thousands, or even millions of hard links. Backups become very long and the File daemon will consume a lot of CPU power checking hard links. In such a case, set hardlinks=no and hard links will not be backed up. Note, using this option will most likely backup more data and on a restore the file system will not be restored identically to the original.

BRU - TOLIS Group

Found this in one of their manuals, you should confirm with them based upon the product version you'll be using:

Special Files - BRU will save and restore all types of filesystems and files with their proper ownership, access attributes, creation dates, and modification dates. BRU can be used to move an entire directory hierarchy from one system to another, with all files, including directories, block special files, character special files, fifos, hard links, and symbolic links reproduced with all attributes intact.

NetBackup - Veritas/Symantec

On most UNIX systems, only the root user can create a hard link to a directory. Some systems do not permit hard links and many vendors recommend that these links be avoided. NetBackup does not back up and restore hard-linked directories in the same manner as files:

During a backup, if NetBackup encounters hard-linked directories, the directories are backed up once for each hard link.

During a restore, NetBackup restores multiple copies of the hard-linked directory contents if the directories do not already exist on the disk. If the directories exist on disk, NetBackup restores the contents multiple times to the same disk location.

Hard links to files :

A hard link differs from a symbolic link in that a hard link is not a pointer to another file. A hard link is two directory entries that point to the same inode number.

If the backup selection list includes hard-linked files, the data is backed up only once during a backup. NetBackup uses the first file name reference that is found in the directory structure. If a subsequent file name reference is found, it is backed up as a link to the name of the first file. Backing up only the link means that only one backup copy of the data is created, regardless of the number of hard links. Any hard link to the data works.

For more information and examples, see “Hard links to files (NTFS volumes or UNIX)” on page 173.

NetVault - ORBiT

[screen shot of GUI check box] The 'Attempt to Restore Hard Links' option as revealed in the Restore Options tab on a Linux/UNIX-based version of the File System Plugin.

Attempt to Restore Hard Links (Linux/UNIX-based O/S, ONLY) - During a backup, when the first occurrence of a hard link is found, the complete data will be backed up. For all other occurrences, only the link is backed up. This data can only be restored when the first occurrence exists; trying to restore subsequent occurrences without the presence of the first causes the job to fail. Selecting this option will attempt to locate the full sequence so that all occurrences of the hard link will be restored.

rsnapshot

Tivoli - IBM

When you back up files that are hard-linked, Tivoli Storage Manager backs up each instance of the linked file. For example, if you back up two files that are hard-linked, Tivoli Storage Manager will back up the file data twice.

When you restore hard-linked files, Tivoli Storage Manager attempts to reestablish the links. For example, if you had a hard-linked pair of files, and only one of the hard-linked files is on your workstation, when you restore both files, they will be hard-linked. The files will also be hard-linked even if neither of the files exists at the time of restore, if both of the files are restored together in a single command. The one exception to this procedure occurs if you back up two files that are hard-linked and then break the connection between them on your workstation. If you restore the two files from the server, Tivoli Storage Manager will respect the current file system and not re-establish the hard link.

Attention: If you do not back up and restore all files that are hard-linked at the same time, problems will occur. To ensure that hard-linked files remain synchronized, back up all hard links at the same time and restore those same files together.

Other Related Items

freedup

I've never used this tool, but from the description it seems it might come in handy for some circumstances.

"Establishes hard or symbolic links between identical files. Search all given file system trees for identical files and link them to the most frequently referenced inode or if equally referenced to the inode of the first file tree. If the devices differ a symbolic link is used instead of a hard link. Symbolic links will not replace files, when at least one of the directory trees is not starting with a '/'."

Tape Devices

Many times, "drivers" aren't needed for tape devices for linux, but many administrators are unaware of this and never give it a test. Instead, they just assume the device doesn't work and the tape vendor isn't supporting it because they "didn't publish" a driver for it.