Revision as of 10:19, 14 December 2012

Currently, content uploaded for archival storage must follow a specific organization (specified here: Share_Drive_Protocols).

Once this content is placed into the /srv/deposits/content/ directory on libcontent (a Linux server), we:

# verify that it copied correctly across the network,
# check the content with quality control verification scripts (such as File:TestIncoming.txt),
# upload the collection information file content into the database to provide access to the online collection via a web-side PHP script, and
# then archive it.

Archiving means that we weed out extraneous files, re-order content (via copy) according to our storage organization (specified here: Organization_of_completed_content_for_long-term_storage), version the metadata, XML, or text files (only the versioned copy is linked into the manifest; the updated file overwrites the unversioned copy in the directory), and either create a LOCKSS manifest for this content or alter existing ones to include it.

We have different versions of the archiving script (Relocating) for different materials (on libcontent). Most content is archived using the relocating script in /srv/storing/; there's another for ETDs in the bornDigital subdirectory, one for MODS in the MODS subdirectory, one for EADs in the eads subdirectory, one for tags and transcriptions in the crowdsource subdirectory, and so forth. In each, by uncommenting the $test = 1; line, you can run the script as a test, which will not change any existing manifests or copy content. Instead, it writes all the manifest changes and creations into one large file called RelocateManifests, and it still writes the list of which files it would copy where to the "moveme" file.

After running this script for real, run "checkem", which goes through the moveme file and does an MD5 comparison of the old file and the new one. If they're the same, it deletes the old one in the deposits directory. If they're not the same, it outputs an error and leaves the original untouched.

Here's the checkem script: File:Checkem.txt
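
The compare-then-delete logic described for checkem can be sketched roughly as follows. This is not the checkem script itself, only a minimal Python illustration. It assumes, hypothetically, that each line of the moveme file holds a source path and a destination path separated by a tab; the real file format may differ. The function names md5_of and check_moves are made up for this example.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_moves(moveme_path, dry_run=False):
    """Verify each copied file and delete the original only on a match.

    Assumes (hypothetically) one "source<TAB>destination" pair per line.
    Returns the list of (source, destination) pairs that failed the check.
    """
    failures = []
    with open(moveme_path, encoding="utf-8") as listing:
        for line in listing:
            line = line.rstrip("\n")
            if not line:
                continue
            source, dest = line.split("\t")
            both_exist = os.path.isfile(source) and os.path.isfile(dest)
            if both_exist and md5_of(source) == md5_of(dest):
                # Checksums agree: the deposit copy is no longer needed.
                if not dry_run:
                    os.remove(source)
            else:
                # Mismatch or missing file: report it and leave the original alone.
                print(f"ERROR: {source} and {dest} do not match; original left in place")
                failures.append((source, dest))
    return failures
```

Passing dry_run=True reports mismatches without deleting anything, which loosely mirrors the idea of the $test switch in the relocating scripts.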

We also have verification scripts, such as "checkArchive", which verifies that everything in each manifest is in the archive and that everything we intended to link into the manifest is indeed linked there properly.
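
As a rough illustration of the two checks described for checkArchive, here is a minimal Python sketch. It is not the checkArchive script. It assumes the manifest is an HTML page whose links are relative paths under the archive root, and that the list of files intended for the manifest is available separately; LinkCollector, check_manifest, and the parameter names are all hypothetical.

```python
import os
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a manifest page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def check_manifest(manifest_html, archive_root, intended):
    """Return (links missing from the archive, intended files not linked).

    Assumes links are relative to archive_root (an assumption for this sketch).
    """
    collector = LinkCollector()
    collector.feed(manifest_html)
    linked = set(collector.links)
    # Check 1: every file the manifest links to must exist in the archive.
    missing_from_archive = sorted(
        link for link in linked
        if not os.path.isfile(os.path.join(archive_root, link))
    )
    # Check 2: every file we meant to link must actually appear in the manifest.
    not_linked = sorted(set(intended) - linked)
    return missing_from_archive, not_linked
```

Both returned lists being empty would mean the manifest and the archive agree for that collection.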

When we digitize multiple tiny collections, we may combine their spreadsheets for simplicity. They must then be split out by collection for archiving: File:SplitExcel.txt
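
The splitting step can be pictured with a small Python sketch. This is not the SplitExcel script. It assumes the combined spreadsheet has been exported to CSV and has a column naming the collection for each row; the column name "collection" and the function split_by_collection are hypothetical.

```python
import csv
import io
from collections import defaultdict


def split_by_collection(combined_csv, key="collection"):
    """Split combined CSV text into one CSV text per collection.

    `key` is the (assumed) name of the column holding the collection identifier.
    Returns a dict mapping each collection identifier to its own CSV text.
    """
    reader = csv.DictReader(io.StringIO(combined_csv))
    groups = defaultdict(list)
    for row in reader:
        groups[row[key]].append(row)

    outputs = {}
    for name, rows in groups.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=reader.fieldnames, lineterminator="\n")
        writer.writeheader()  # each split file repeats the original header row
        writer.writerows(rows)
        outputs[name] = buffer.getvalue()
    return outputs
```

Each value in the returned dict could then be written to its own file before the collection is archived.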