This article is intended for cases in which the error "database is malformed" appears in P5. It covers checking and repairing structural index errors. Note that only the index's structural integrity is handled here, not its content or whether it is up to date.

Checking an index

P5 backup indexes consist of four *.db files; archive indexes have two more. These files reside in a subfolder of the P5 install directory:

Repairing the index (the standard way)

The safest way to repair an index is to recover an intact version from tape. For a broken backup index, navigate to the Advanced Options → Manage Indexes section, select the index and click "Check". The check function checks the index and recovers the last saved version from tape. For archive indexes there is no automatic backup, so recovery is only possible if you saved the index yourself.

In case the problem persists

If the last backup is too old for any reason, or if the recovery does not resolve the problem because the saved index was already faulty, the sqlite3 command can again be used to dump the index and then create a new index from that dump. This command creates a dump of the database and feeds that dump directly into another instance of sqlite3, which creates a new database from it:

That way a new file is written, here addrs_new.db. Please check whether the new file has a similar size to the original. The size will probably be slightly smaller, as the internal structures are recreated in a more optimized way, but make sure that the mechanism did not fail completely for some reason and that larger parts of the data have not been lost.

If the size fits, replace the dumped file with the new one. The new file should then be OK.
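The dump-and-rebuild step and the size check can also be sketched in Python with the standard sqlite3 module. This is a minimal sketch, not the procedure used by P5 itself: the function names and the source file name addrs.db are assumptions made for illustration, and on a badly damaged file iterdump() may raise an error where the sqlite3 command-line tool would carry on.

```python
import os
import sqlite3


def rebuild_from_dump(src_path, dst_path):
    """Replay a SQL dump of src_path into a fresh database at dst_path."""
    if os.path.exists(dst_path):
        # Never overwrite an existing file; the original must stay untouched.
        raise FileExistsError(dst_path)
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # iterdump() yields the database as SQL statements, one per item.
        dst.executescript("\n".join(src.iterdump()))
        dst.commit()
    finally:
        src.close()
        dst.close()


def size_ratio(src_path, dst_path):
    """Return the new file size divided by the original file size."""
    return os.path.getsize(dst_path) / os.path.getsize(src_path)


if __name__ == "__main__":
    # File names are examples only; addrs.db is an assumed source name.
    rebuild_from_dump("addrs.db", "addrs_new.db")
    print(f"new file is {size_ratio('addrs.db', 'addrs_new.db'):.0%} of the original")
```

A ratio slightly below 100% matches the expectation described above, while a much smaller value suggests that data was lost during the dump.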

Organisation basics

PresSTORE records, for each file in the backup, the tape it is stored on and the time stamp of that file. This information is kept in the backup index. The backup index consists of multiple file sets, each one holding the data for one backup cycle. The files holding the backup index data are stored in subdirectories below the PresSTORE home directory.

The index data is kept as long as files from the corresponding cycle can be recovered from tape, that is, as long as the tape exists. The index data file is kept as long as at least one volume of that index cycle exists.

Calculating the backup index size

An index entry requires approximately 1 KB of data.

Example: For 100,000 files over 4 backup cycles, the total amount of space is approximately 100,000 * 4 * 1 KB = 400 MB. The figure of 100,000 files represents approximately "all files of a workstation computer".

Note that the index size depends on the number of files, not on the amount of data. To find out how many files are on the disk, execute the following command from a terminal window in the root directory of the volume to be backed up:

On Mac, Linux and Solaris: ls -R | wc -l

On Windows (regard the last two output lines): dir /s
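The same estimate can be scripted. The following is a rough sketch that assumes the 1 KB per entry figure given above and counts folders as well as files; the function names are made up for this example.

```python
import os

KB_PER_ENTRY = 1  # approximate figure from the text above


def count_entries(root):
    """Count files and folders below root."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def estimate_index_mb(entries, cycles, kb_per_entry=KB_PER_ENTRY):
    """Approximate index size in MB (1 MB taken as 1000 KB, as in the example)."""
    return entries * cycles * kb_per_entry / 1000


print(estimate_index_mb(100_000, 4))  # 400.0
```

To estimate for a real volume, pass the result of count_entries() for its root directory instead of the fixed figure of 100,000.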

Pitfalls

If huge disks are to be backed up, the disk space in the PresSTORE directory must be sufficient for the indices.

If a backup is misused as an archive and the backup tapes are kept forever, note that the index files are kept forever, too.

If old tapes of the media pool are recycled (when the cycle is no longer required), the corresponding indices are cleared when the last tape of that cycle is recycled.

If tapes are taken off the system and kept elsewhere, the corresponding index files are kept automatically.

Note: When erasing tapes outside of PresSTORE, make sure the corresponding volume is cleared from the volume manager.

PLEASE NOTE: The procedures described here only work in P4 and earlier. They do NOT work in P5 due to the switch to a different database engine!

1. Using the P4 GUI

To perform a quick check of index consistency and optionally schedule an index recovery from the GUI, follow these steps:

Open the PresSTORE Browser

Navigate to General Setup / Databases

Select the index database you want to check

Pull down the Edit menu and select Check database

Click on the Start button

This initiates the index consistency check. If a non-repairable index problem is found, you will be presented with a list of media volumes needed to perform the index recovery. You can then start the index recovery if needed. Please note that this procedure might take some time to finish.

2. Using the P4 CLI

To perform a more elaborate index check, you need to access the P4 CLI and issue commands to perform index operations. Through the CLI, you have more options for handling backup indexes. Please see the list of available commands below.

2.1 Check index

The command can be summarized as follows:

cli::index::backup::check <client_name> <option_list>

<client_name> name of the client

<option_list> list of options, as follows:

rebuild rebuild the index files

fixshadows repair some rare table inconsistencies

cleanup clean up the index directory structure

cycles <num> limit the operation to <num> number of cycles (default is: 0, that is, all known cycles)

During the execution, this command will generate lots of log messages in the P4 server log file. You might examine this file to see the detailed info produced by the command. The command returns:

0 check and repair done OK

1 parts of the index are missing; use GUI to recover

This command is supported starting with the 2.3.9 release. The command can be summarized as follows:

cli::index::backup::dropcycle <client_name> <backup_plan_id> <cycle_id>

<client_name> name of the P4 client

<backup_plan_id> ID of the backup plan used to back up data

<cycle_id> ID of the cycle (see getcycles command)

This command drops one of the backup index cycles for the given client and backup plan ID. If the cycle being dropped is the current cycle, it adjusts all backup tasks to point to the previous cycle. If there is no previous cycle, all the tasks are reset to their defaults (as if no backup had been performed).

- This time, the P5 CLI returned the name of each custom field along with its newly added contents:

user_color RED user_year 2016 user_country Deutschland

Afterwards, I was able to view the newly added metadata in the restore browser.
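For scripting, a reply like the one above can be turned into a lookup table. This is only a sketch under the assumption that the reply is a flat sequence of alternating field names and values without spaces inside a value; the function name parse_pairs is made up for this example.

```python
def parse_pairs(reply):
    """Split a flat 'key value key value ...' reply into a dict."""
    tokens = reply.split()
    if len(tokens) % 2:
        # An odd count means this simple split does not fit the reply.
        raise ValueError("odd number of tokens; a value may contain spaces")
    return dict(zip(tokens[0::2], tokens[1::2]))


meta = parse_pairs("user_color RED user_year 2016 user_country Deutschland")
print(meta["user_country"])  # Deutschland
```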

I realize that this was done with just a single file. Getting the paths and then the handles for a larger number of archived files may be a bit more complicated. However, there are two CLI commands that may help with that:

- "ArchiveIndex <name> inventory <output file> [<options>]"

&

- "Job <name> inventory <output file> [<options>]"

Someone who is good at scripting could probably write a script that first creates a list of all archived files contained in an archive index, or a list of archived files for a particular job (depending on which one of the two commands is being used), then starts a loop that gets the archive handle for each file, then adds the new metadata for each file using the 'setmeta' command in conjunction with the archive handle.

For more information on the various P5 CLI commands mentioned in this article, please refer to the P5 CLI manual available here:

We were asked for an explanation of why the timestamp and job ID are added to the archive index automatically. This feature has been mandatory since PresSTORE 4; it was advisory in PresSTORE 3.

The reason is to protect against creating an unmanageable archive.

There is today a variety of different archiving patterns, but only a few people really care about the structure of the archive; most simply attempt to copy the well-known structure they have on disk. In fact, the filesystem on disk is a living system that changes permanently, while an archive can be seen more or less as a series of snapshots of (parts of) that filesystem, taken at different times. To maintain time as an additional dimension, the date stamps are included in the archive index.

Customers who disabled the unique index after a year of archiving discovered some problems that cannot really be solved AFTER archiving:

The archive ends up with ten thousand to a hundred thousand files in one folder. Browsing such a structure becomes extremely slow.

The folder structure in the archive "overshadows" other parts of the archive. For files, this is resolved in file versions. That is the smaller problem as it simply implies that the restore is possible for one of the versions at a time only.

In some cases a certain file name stands for a folder at one time and a file at another time and for a different folder at a third time. These cases cannot be resolved in a handy way.

Please remember that the folder structure in the archive index does not represent a true file system and cannot be dynamic. An attempt to implement that would result in a rather lengthy decision process on whether or not to show a folder.

In PresSTORE 4.2 we will allow the date stamp to be removed as an expert option. We do so because of the feedback we received for making the unique index path advisory. Again, that was intended as a protective step for less experienced users. The resulting archive index will be only partially supported: the effects described above will be allowed again, but without support from our development.

In the restore screen, in Backup as well as in Archive, folders are shown without a size.

The reason for that is a technical one: the size of the content of a folder is not constant. It changes, for instance, depending on the index options.

A folder or a subtree thereof may be included or excluded by selecting a specific pool or tapeset. So the folder size cannot be calculated in advance; it would have to be calculated when viewing a folder.

When starting the restore for one folder, that calculation is performed, and it is then visible how long it takes. Doing that online would take too much time, so opening a folder in the restore section would become rather time consuming. In the worst case, if there are many subfolders, the browser may even time out in the meantime and close the connection.


The file and folder structure and change thereof over time, and

The media (tape) related tables with positions where the data is stored.

Both parts are independent of each other. If, for instance, a file is saved on three tapes, it has three database records in the media part but just one in the file structure.

When deleting or relabeling a volume, the files of that volume can no longer be restored. In the index, however, they will still be visible: on deletion of a medium (tape), the media part of the file is marked as deleted. The file part is not, as that would require a check for each file whether it is on another tape. As such a test takes time, it is not done immediately.

When browsing in the Restore areas in P5, only the file and folder structure is consulted to allow fast browsing. If the media part of the index were checked as well, browsing would be rather slow. So it is possible that files which can no longer be restored are still visible in the index. They remain visible until the index has been cleaned up.

It is always possible to check whether or not a file can be restored by opening the versions window from the context menu. That window shows all media on which the file is stored and in which version.
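The relationship between the two parts can be illustrated with a toy model. This is not P5's actual schema: the class names, function names, paths and volume names below are all made up, and the sketch only mirrors the behaviour described above.

```python
from dataclasses import dataclass, field


@dataclass
class MediaRecord:
    volume: str
    deleted: bool = False


@dataclass
class FileEntry:
    path: str
    media: list = field(default_factory=list)

    def restorable(self):
        # The question the versions window answers: is any copy on a live volume?
        return any(not m.deleted for m in self.media)


def delete_volume(index, volume):
    """Mark the media records of a volume as deleted; file entries stay untouched."""
    for entry in index:
        for m in entry.media:
            if m.volume == volume:
                m.deleted = True


def browse(index):
    """Browsing consults the file part only, so it never looks at media records."""
    return [e.path for e in index]


def cleanup(index):
    """Drop file entries that have no live media record left."""
    return [e for e in index if e.restorable()]


index = [
    FileEntry("/projects/a.mov", [MediaRecord("T1"), MediaRecord("T2")]),
    FileEntry("/projects/b.mov", [MediaRecord("T1")]),
]
delete_volume(index, "T1")
print(browse(index))                     # ['/projects/a.mov', '/projects/b.mov']
print([e.restorable() for e in index])   # [True, False]
print(browse(cleanup(index)))            # ['/projects/a.mov']
```

After the volume is deleted, both files are still listed when browsing, although only the first one can be restored; the second disappears only once the cleanup has run.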

Note that this is handled differently for files and folders: even if there are "versions" of a folder in the index, these are not exposed, as they refer to the time stamps and other attributes of the folder but not to its contents. A folder with a timestamp from May can contain files that are much newer, so the "version" of a folder would be misleading. Some folders may even appear without a time stamp; these were not saved but exist only as nodes (to navigate to saved files and folders).

To figure out whether a folder contains files (without navigating down the folder tree), one can select the folder and click "restore to". P5 will then count the files and folders below and sum up the size before the actual restore is started. The restore can be cancelled before files are really restored, but the window shows size and number of the contained files.

In Backup indexes, the cleanup happens automatically after the next backup job using that index. The cleanup only takes effect in the index tables once 10% of the contained data is invalidated. So if only a few files are affected, the cleanup may be delayed.
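As a worked example of that threshold, here is a small sketch. It assumes that the share is measured in index entries and that reaching exactly 10% is enough to trigger the cleanup; neither detail is stated above, and the function name is made up.

```python
def cleanup_due(invalidated, total, threshold=0.10):
    """True once the invalidated share of entries reaches the threshold."""
    if total == 0:
        return False
    return invalidated / total >= threshold


print(cleanup_due(50_000, 1_000_000))   # False: only 5% invalidated
print(cleanup_due(120_000, 1_000_000))  # True: 12% invalidated
```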

In Archive indexes, there is no automated cleanup. If required it is possible to execute the cleanup manually. The cleanup can be called through the nsdchat utility with the following commands:

This call does the cleanup of elements no longer on tape. Please make sure that during the cleanup, no archive or restore jobs are running that use this index.

/usr/local/aw/bin/nsdchat -c cli::index::purge Default-Archive true
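If the purge is to be run from a script, the call above can be wrapped as follows. This is a sketch only: the function names are made up, the nsdchat path and arguments are copied from the call above, and nothing is assumed about what nsdchat prints.

```python
import subprocess

NSDCHAT = "/usr/local/aw/bin/nsdchat"  # path as used in the call above


def purge_command(index_name):
    """Build the argument list for the purge call shown above."""
    return [NSDCHAT, "-c", "cli::index::purge", index_name, "true"]


def run_purge(index_name):
    """Run the purge and return nsdchat's standard output as text."""
    result = subprocess.run(
        purge_command(index_name), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Only run this while no archive or restore job is using the index.
    print(run_purge("Default-Archive"))
```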

This call removes empty folders in the index. It does so recursively and runs for the given number of seconds. In the example, this is 600 seconds, or 10 minutes. Please make sure that during the cleanup, no archive or restore jobs are running that use this index.