The following scripts are not being used anymore. For current usage, see [[Capture Workflow Scripts]], section Quality Control.

<font color="gray">

[[Image:filenames.txt]] is a Windows Perl script for locating badly formed and mislocated file names. This version of the script is also known as 'filenamesAndTranscripts' as it checks transcript directories also, if they exist.

outputs txt file tab delimiting item number followed by number of files

[[Image:BoxFolderCheck.txt]] is a Windows or Macintosh Perl script designed for digital content in which the item number section of the identifier includes the box and folder number where the item is to be linked into the EAD finding aid, and the digital files are located in directories named for the box, with subdirectories named for the folder. It verifies that files are named appropriately and are located in directories that reflect their file names.

</font>

== On Linux Server, for Web Delivery ==

Once content is uploaded to the Linux server for archival storage, one of the scripts we run to verify that all the archival filenames are correct and in the right directory, and that no sequences are missing, is [[Image:TestDeposits.txt]]. For the Cabaniss content, it is [[Image:testNums.txt]].

Once content is uploaded to the Linux server for archival storage, one of the scripts we run to verify that all the archival filenames are correct and in the right directory, and no sequences are missing, is [[Image:TestIncoming.txt]]

To test content online in Acumen, and locate items that have no derivatives or no MODS record, use this Linux Perl script: [[Image:findMissing.txt]]

For Cabaniss, this one checks the content in the web directory (in Acumen) against what's in the storage archive, to make sure nothing is missing: [[Image:findMissingFile.txt]]

To create OCR files for items listed in *ocrList.txt files located in the /srv/deposits/ocrMe directory, and place those OCR files in the correct web location: [[Image:ocrSelected.txt]]

== On Linux Server, the Storage Archive ==

Checking the MD5 checksums of content stored prior to each full-tape backup: [[Image:Dirs.txt]] (as described in [[Watching Our Backs]])
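
The checksum check described above can be illustrated with a minimal Python sketch. This is not the linked Perl script; the function names and the manifest shape (a mapping of file path to expected MD5 hex digest) are assumptions made only for illustration.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest):
    """Compare stored checksums against the files on disk.

    `manifest` is a hypothetical dict of {file path: expected MD5 hex digest}.
    Returns a dict of {file path: problem description} for every file that is
    missing or whose current checksum differs from the stored one.
    """
    problems = {}
    for path, expected in manifest.items():
        if not os.path.isfile(path):
            problems[path] = "missing"
        elif md5_of_file(path) != expected.lower():
            problems[path] = "checksum mismatch"
    return problems
```

An empty result from `verify_manifest` would mean every listed file is present and unchanged; anything else would need review before the backup runs.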

Latest revision as of 15:13, 12 February 2014

Quality Control checks happen in multiple parts of the workflow pipeline.

The following scripts are not being used anymore. For current usage, see Capture Workflow Scripts, section Quality Control.

File:Filenames.txt is a Windows Perl script for locating badly formed and mislocated file names. This version of the script is also known as 'filenamesAndTranscripts' as it checks transcript directories also, if they exist.

File:Numfiles.txt is a Windows Perl script that looks through the scans directories in a selected collection directory, counts all files with the chosen input extension, and outputs a tab-delimited text file listing each item number followed by its number of files.
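
The counting step described above can be sketched in Python as follows. This is not the linked Perl script: the function names are hypothetical, and the sketch assumes that each immediate subdirectory of the collection directory is named for one item.

```python
import os


def count_files_by_item(collection_dir, extension):
    """Count files with the given extension under each item directory.

    Assumption: every immediate subdirectory of `collection_dir` is one item.
    The extension match is case-insensitive.
    """
    ext = extension.lower()
    if not ext.startswith("."):
        ext = "." + ext
    counts = {}
    for entry in sorted(os.listdir(collection_dir)):
        item_path = os.path.join(collection_dir, entry)
        if not os.path.isdir(item_path):
            continue  # skip stray files at the collection level
        total = 0
        for _root, _dirs, files in os.walk(item_path):
            total += sum(1 for name in files if name.lower().endswith(ext))
        counts[entry] = total
    return counts


def format_report(counts):
    """Render one 'item<TAB>count' line per item, sorted by item name."""
    return "".join(f"{item}\t{count}\n" for item, count in sorted(counts.items()))
```

Writing the string returned by `format_report` to a text file would give a tab-delimited listing of the kind described.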

File:BoxFolderCheck.txt is a Windows or Macintosh Perl script designed for digital content in which the item number section of the identifier includes the box and folder number where the item is to be linked into the EAD finding aid, and the digital files are located in directories named for the box, with subdirectories named for the folder. It verifies that files are named appropriately and are located in directories that reflect their file names.

Once content is uploaded to the Linux server for archival storage, one of the scripts we run to verify that all the archival filenames are correct and in the right directory, and that no sequences are missing, is File:TestDeposits.txt. For the Cabaniss content, it is File:TestNums.txt.
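
The "no sequences are missing" part of that check might look like the Python sketch below. It is not the linked script. The filename pattern (an item identifier, an underscore, a numeric sequence, then an extension) and the assumption that sequences start at 1 are hypothetical and would need to match the real naming scheme.

```python
import re
from collections import defaultdict

# Hypothetical pattern: <item>_<sequence digits>.<extension>
SEQ_PATTERN = re.compile(r"^(?P<item>.+)_(?P<seq>\d+)\.[A-Za-z0-9]+$")


def find_sequence_gaps(filenames):
    """Return (gaps, malformed).

    gaps: {item: sorted list of sequence numbers missing between 1 and the
          highest sequence seen for that item}
    malformed: filenames that do not fit the assumed pattern
    """
    sequences = defaultdict(set)
    malformed = []
    for name in filenames:
        match = SEQ_PATTERN.match(name)
        if match is None:
            malformed.append(name)
        else:
            sequences[match.group("item")].add(int(match.group("seq")))
    gaps = {}
    for item, found in sequences.items():
        missing = sorted(set(range(1, max(found) + 1)) - found)
        if missing:
            gaps[item] = missing
    return gaps, malformed
```

Note that a sketch like this cannot detect files missing from the end of a sequence, since it only knows the highest number actually present.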

To test content online in Acumen, and locate items that have no derivatives or no MODS record, use this Linux Perl script: File:FindMissing.txt

For Cabaniss, this one checks the content in the web directory (in Acumen) against what's in the storage archive, to make sure nothing is missing: File:FindMissingFile.txt
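
A comparison of this kind can be sketched in Python as below. This is not the linked script. It assumes, purely for illustration, that a web derivative shares its base filename (without extension) with the corresponding archival file; the function names are hypothetical.

```python
import os


def stems_under(root):
    """Collect the base names (without extension) of all files under `root`."""
    stems = set()
    for _dirpath, _dirs, files in os.walk(root):
        for name in files:
            stems.add(os.path.splitext(name)[0])
    return stems


def missing_from_web(archive_root, web_root):
    """List base names present in the archive but absent from the web directory."""
    return sorted(stems_under(archive_root) - stems_under(web_root))
```

Swapping the two arguments would report the reverse case: web files with no archival counterpart.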

To create OCR files for items listed in *ocrList.txt files located in the /srv/deposits/ocrMe directory, and place those OCR files in the correct web location: File:OcrSelected.txt