Revision as of 10:52, 29 October 2010

Currently, content uploaded for archival storage follows a specific organization (specified here: Share_Drive_Protocols).

Once this content is placed into the /srv/deposits/content/ directory on libcontent1 (a Linux server), we:

1. verify that it copied correctly across the network,
2. check the content with quality control verification scripts (such as File:TestIncoming.txt),
3. upload the collection information file content into the database, via a web-side PHP script, to provide access to the online collection, and
4. archive it.
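As an illustration of the first step (verifying that content copied correctly across the network), here is a minimal Python sketch that compares MD5 checksums of every file in a source tree against its copy. It is not one of the scripts linked on this page; the function names `md5_of` and `verify_copy` and the idea of comparing two directory trees directly are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_root, copy_root):
    """Compare each file under source_root with its counterpart under copy_root.

    Returns a list of (relative_path, problem) tuples; an empty list means
    every source file has an identical copy. (Hypothetical helper.)
    """
    source_root, copy_root = Path(source_root), Path(copy_root)
    problems = []
    for source in sorted(p for p in source_root.rglob("*") if p.is_file()):
        relative = source.relative_to(source_root)
        copy = copy_root / relative
        if not copy.is_file():
            problems.append((relative.as_posix(), "missing"))
        elif md5_of(source) != md5_of(copy):
            problems.append((relative.as_posix(), "checksum mismatch"))
    return problems
```

Calling `verify_copy` with the original share-drive folder and the folder under /srv/deposits/content/ would return the files that need to be copied again, if any.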

Archiving means that we weed out extraneous files, re-order the content (via copy) according to our storage organization (specified here: Organization_of_completed_content_for_long-term_storage), version the metadata, XML, or text files, and either create a LOCKSS manifest for this content or alter existing manifests to include it. When versioning, only the versioned copy is linked into the manifest; the updated file overwrites the unversioned copy in the directory.

The script that does this (still being modified and updated to handle new problems) is here: File:Relocating.txt
By uncommenting the $test = 1; line, you can run it as a test, which will not change any existing manifests or copy any content. Instead, it writes all the manifest changes and creations into one large file called RelocateManfests, and it still writes the list of which files it would copy where to the "moveme" file.

After running this script for real, run "checkem", which goes through the moveme file and does an MD5 comparison of each old file and its new copy. If they are the same, it deletes the old one in the deposits directory. If they are not the same, it outputs an error and leaves the original untouched.
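The compare-then-delete logic described above can be sketched in Python as follows. This is not the actual checkem script: the format of the moveme file is not documented here, so the sketch assumes one "old path, TAB, new path" pair per line, and the names `file_md5` and `check_moves` are hypothetical.

```python
import hashlib
import os


def file_md5(path):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_moves(moveme_path, delete=True):
    """Process 'old<TAB>new' pairs (assumed format) from a moveme-style file.

    Deletes the old file only when its checksum matches the new copy.
    Returns (removed, errors) as two lists.
    """
    removed, errors = [], []
    with open(moveme_path, encoding="utf-8") as listing:
        for line in listing:
            line = line.rstrip("\n")
            if not line.strip():
                continue
            old, sep, new = line.partition("\t")
            if not sep:
                errors.append(f"unparseable line: {line}")
                continue
            if not (os.path.isfile(old) and os.path.isfile(new)):
                errors.append(f"missing file: {old} -> {new}")
                continue
            if os.path.samefile(old, new):
                # Never delete when both paths name the same file.
                errors.append(f"same file: {old} -> {new}")
                continue
            if file_md5(old) == file_md5(new):
                if delete:
                    os.remove(old)
                removed.append(old)
            else:
                # Leave the original untouched and report the problem.
                errors.append(f"checksum mismatch: {old} -> {new}")
    return removed, errors
```

Passing `delete=False` gives a dry run that reports what would be removed without touching the deposits directory.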

Another handy script is archiveCheck (File:CheckArchive.txt), which verifies that everything in each manifest is in the archive, and that everything I intended to link into the manifest is indeed linked there properly.
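A two-way check of this kind might look like the Python sketch below. It is not the archiveCheck script itself. It assumes that a manifest is an HTML page whose relative links point at files under an archive root directory, and that the list of intended files is supplied by the caller; `LinkCollector`, `manifest_links`, and `check_manifest` are hypothetical names.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def manifest_links(manifest_html):
    """Return the relative paths linked from a manifest page."""
    collector = LinkCollector()
    collector.feed(manifest_html)
    collector.close()
    relative = []
    for href in collector.links:
        parsed = urlparse(href)
        path = unquote(parsed.path)
        # Skip external URLs, absolute paths, and empty links (assumption:
        # archived content is always linked with relative paths).
        if parsed.scheme or parsed.netloc or not path or path.startswith("/"):
            continue
        relative.append(path)
    return relative


def check_manifest(manifest_html, archive_root, intended):
    """Return (missing_from_archive, not_linked) for one manifest."""
    root = Path(archive_root)
    linked = set(manifest_links(manifest_html))
    missing_from_archive = sorted(p for p in linked if not (root / p).is_file())
    not_linked = sorted(set(intended) - linked)
    return missing_from_archive, not_linked
```

Both returned lists being empty corresponds to the "everything is linked and everything linked exists" outcome described above.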

When we digitize multiple tiny collections, we may combine their spreadsheets for simplicity. They must then be split out by collection for archiving: File:SplitExcel.txt
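Splitting a combined spreadsheet by collection can be sketched in Python as shown below. This is not the linked script: it assumes the spreadsheet has been exported to CSV with well-formed rows, that a column (here hypothetically named "collection") identifies each row's collection, and that collection names are safe to use as file names.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_collection(combined_csv, output_dir, column="collection"):
    """Write one CSV per distinct value of `column`, keeping the header row.

    Returns a dict mapping each collection name to the file written for it.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    groups = defaultdict(list)
    with open(combined_csv, newline="", encoding="utf-8") as source:
        reader = csv.DictReader(source)
        fieldnames = reader.fieldnames
        if not fieldnames or column not in fieldnames:
            raise ValueError(f"column {column!r} not found in {combined_csv}")
        for row in reader:
            # Group rows by collection, preserving their original order.
            groups[(row[column] or "").strip()].append(row)
    written = {}
    for name, rows in groups.items():
        target = output_dir / f"{name}.csv"
        with open(target, "w", newline="", encoding="utf-8") as out:
            writer = csv.DictWriter(out, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written[name] = target
    return written
```

Each output file keeps the full header, so every per-collection file can be archived and read on its own.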