Things to do before marking a folder as "Ready" or "Store"

The collection number u0003_0000001 will be used as an example.

Check Subfolders of Collection Level Folder

The Collection Level folder contains subfolders and their content must adhere to certain specifications prior to the collection being considered "Ready" (to go online via the Metadata Unit) or "Store" (to go directly into storage).

Admin

Include a text version of the log file with every batch; previous versions of the log will be overwritten by the newest version.
Specifically, since this log file's source file is an Excel workbook, it is the "log" sheet within the Excel file that needs to be saved as u0003_0000001.log.txt.
This is the sheet with all the scanning data (technician, dates, # of scans, etc.).

Also include any other relevant documents, saved as plain .txt (ANSI or UTF-8 without BOM preferred). If possible, please incorporate any additional data into the log.txt file.

For example, audio collections often have significant item-level notes that we want to retain. These plain text files can be saved with a ".notes.txt" extension, e.g. "u0008_0000001_0000001.notes.txt".

If multiple Digital Collections spawn from the same Analog Collection, there can be more than one Collection Information XML file as follows:

u0003_0000001.1.xml

u0003_0000001.2.xml

Metadata

This folder must exist.

Must contain:

u0003_0000001.txt

This is a tab-delimited text export of the original spreadsheet.
The source .xlsx spreadsheet should be moved to S:\Digital Projects\Administrative\collectionInfo\forMDlib\needsRemediation.
If this is a large or ongoing collection, the tab-delimited text export should contain ONLY the metadata for the items currently being transported to storage.
The text file's name should include a period and then a number to indicate which portion, or "batch", of the complete metadata this is.
The first tab-delimited export would be named, for example, u0003_0000001.1.txt and might contain the first 500 entries.
The second tab-delimited export, for items 501-1000, would be named u0003_0000001.2.txt, and so forth.
Thus, only by collecting all these tab-delimited exports do we have a complete set of descriptive metadata for the collection items.

For more on how to parse these "batches" out from the complete set of descriptive metadata, see Parsing Metadata.
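The batch naming described above can be sketched in a few lines of Python. This is only an illustration, not the Parsing Metadata procedure itself: the function name split_into_batches, the batch size of 500 (taken from the example above), and the choice to repeat the header row in every batch are all assumptions.

```python
import csv
import io


def split_into_batches(tsv_text, collection_id, batch_size=500):
    """Split a tab-delimited export into numbered batches.

    Returns a dict mapping file names such as 'u0003_0000001.1.txt'
    to tab-delimited text. Assumption: the first row is a header,
    and it is repeated at the top of every batch.
    """
    rows = list(csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t"))
    if not rows:
        return {}
    header, items = rows[0], rows[1:]
    batches = {}
    for start in range(0, len(items), batch_size):
        number = start // batch_size + 1  # batches are numbered from 1
        out = io.StringIO()
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(items[start:start + batch_size])
        batches[f"{collection_id}.{number}.txt"] = out.getvalue()
    return batches
```

With 1,200 items and the default batch size, this would yield three entries named u0003_0000001.1.txt, u0003_0000001.2.txt and u0003_0000001.3.txt, the last one holding items 1001-1200.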

a MODS folder

This folder will contain all the MODS files created via Archivist Utility (see: Making MODS).

If you are comfortable with XML, please open the EAD file and look for this line (it should be roughly the fourth line down):
<eadid countrycode="US" mainagencycode="US-US-ALM"></eadid>
If the collection number (which is also what we name the file, e.g. u0003_0000580) is not there, please enter it so that the line looks like this:
<eadid countrycode="US" mainagencycode="US-US-ALM">u0003_0000580</eadid>
This way the file self-references and can be found by this number during searches if we index it properly. Also, if something gets misnamed somewhere, this will help to sort out the problem.
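For anyone who would rather check this with a script than by eye, the following is a minimal sketch using Python's standard xml.etree.ElementTree module. It only reads the file contents and reports what it finds; it does not edit the EAD. The function names eadid_text and eadid_matches are hypothetical, and matching on the local tag name (so that namespaced EAD files are also handled) is an assumption about how the files may be structured.

```python
import xml.etree.ElementTree as ET


def eadid_text(ead_xml):
    """Return the stripped text of the first <eadid> element, or None if absent."""
    root = ET.fromstring(ead_xml)
    for element in root.iter():
        # Compare on the local name so that namespaced EAD files also match.
        if element.tag.rsplit("}", 1)[-1] == "eadid":
            return (element.text or "").strip()
    return None


def eadid_matches(ead_xml, collection_id):
    """True only if <eadid> is present and contains exactly the collection number."""
    return eadid_text(ead_xml) == collection_id
```

An empty result ("") means the eadid line is there but still needs the collection number entered; None means no eadid element was found at all.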

Scans

This folder must exist.

Note: we may break Scans folders into chunks for manageability; for more information, click here.

Ideally, no additional fields such as "Notes" are in the Metadata file. Any such notes should be deleted or moved to the appropriate row in the log.txt file.

Make sure the Format column in the Metadata file has not been converted to the Time format, as sometimes happens:

If a tab-delimited metadata file is opened in Excel (especially by right-clicking the file and choosing to open it in Excel), Format column values such as 3 p., 4 p., etc. will be interpreted as 3:00 PM, 4:00 PM, etc.
If the file is then resaved as .txt, times will have been saved instead of page counts.
The way around this is to start Excel first and choose Open.
Then open your text file and, while Excel is interrogating you about how to import it, set the Format column to "Text".
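A quick way to spot a Format column that has already been damaged is to scan the tab-delimited export for values that look like clock times. The sketch below is one possible check, not an established part of the workflow: the function name find_time_like_formats and the regular expression are assumptions, and the column is assumed to be headed exactly "Format".

```python
import csv
import io
import re

# Matches values such as "3:00 PM", "4:00", or "3:00:00 PM" (assumed time shapes).
TIME_LIKE = re.compile(r"^\d{1,2}:\d{2}(:\d{2})?\s*([AaPp][Mm])?$")


def find_time_like_formats(tsv_text, column="Format"):
    """Return (line_number, value) pairs where the column holds a time-like value."""
    reader = csv.DictReader(io.StringIO(tsv_text, newline=""), delimiter="\t")
    problems = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if TIME_LIKE.match(value):
            # reader.line_num is the line of the source text just read.
            problems.append((reader.line_num, value))
    return problems
```

An empty list means no time-like values were found; any entries point to lines where a page count such as 3 p. was probably turned into a time.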

Check all Folder names

Make sure folders are named correctly and that no superfluous words have been appended to object-level folder names, etc.
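As a rough aid, object-level folder names can be compared against a pattern. The pattern below is only inferred from the example identifiers on this page (such as u0008_0000001_0000001) and is not an official naming specification; the function name misnamed_object_folders is hypothetical. Adjust the pattern if your collection uses a different depth or number of digits.

```python
import re

# Assumed shape: one lowercase letter + 4 digits, then two 7-digit groups,
# e.g. "u0008_0000001_0000001". Inferred from examples, not a formal rule.
OBJECT_FOLDER = re.compile(r"[a-z]\d{4}_\d{7}_\d{7}")


def misnamed_object_folders(folder_names):
    """Return the names that do not match the assumed object-level pattern."""
    return [name for name in folder_names if not OBJECT_FOLDER.fullmatch(name)]
```

A name such as u0003_0000001_0000002_letter would be returned by this check, since it has an extra word attached.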

Match Data across documents and folders

This table attempts to show how data in one of our documents/folders should match with data in another document/folder.

From a digital preservation/delivery perspective, it's not as important to match information to the Archivist Queue spreadsheet, though it would be ideal if possible.
Also, it isn't always feasible to match the number of actual scans to what is noted in the TrackingFiles, although that is also ideal.