Revision as of 15:12, 2 October 2009

Currently, we are developing collections on a shared Windows drive called "Share" in the Digital Projects folder.

We are storing content long-term on a Linux server.

We are delivering content on another Linux server via CONTENTdm, and Tonio Loewald of OLT is developing an alternative (Linux-based) delivery solution, which runs off our storage area.

This is the current set of procedures for moving content from Share to long-term storage and linking it into manifests for LOCKSS.
This procedure should happen every Friday:

TO MOVE CONTENT TO STORAGE -- every Friday

ACCESS and IDENTIFICATION

1) ssh to libcontent1.lib.ua.edu.

2) To get root access, run: sudo su (if you do not have sudo access, you cannot do this process)

3) To get access to the share drive from the libcontent1 Linux server, run:
mount -t cifs -o username=jlderidder,domain=lib //libfs1.lib.ua-net.ua.edu/share/Digital\ Projects/ /cifs-mount
and type in your password for the share drive. (Substitute your own username in the line above.)

4) To determine the status of content in the Digital Services folder on share,
run: /srv/scripts/qc/status
This script goes through the Digital_Coll_Complete and Digital_Coll_in_progress folders, and pulls out
paths to folders that end in _ready (for Mary), _store and _online (for pickup for storage), and
_check (for Digital Services quality control). The lists of these paths are written to files in
/srv/scripts/qc/lists:
forMary: things waiting for her;
checkme: things for the DS folks to check;
storeMe: things ready to store that are NOT going to CONTENTdm;
online: things ALREADY in CONTENTdm, ready to store.
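
To spot-check one of these lists by hand, a minimal sketch along the following lines may help. It is NOT the real /srv/scripts/qc/status script; the function name list_by_suffix is made up for illustration, and it assumes the share is mounted at /cifs-mount as in step 3.

```shell
#!/bin/bash
# Hypothetical helper, not the real /srv/scripts/qc/status script.
# Lists directories under ROOT whose names end in SUFFIX.
# Usage: list_by_suffix ROOT SUFFIX
list_by_suffix() {
    local root="$1" suffix="$2"
    find "$root" -type d -name "*${suffix}" | sort
}

# Example (assumes the mount from step 3):
# list_by_suffix /cifs-mount/Digital_Coll_Complete _store
```

The output can be compared against the matching file in /srv/scripts/qc/lists.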

5) Share this output with Digital Services folks, to verify it is correct.

MOVE TO LINUX SERVER

6) Once verified, look under /cifs-mount for content in share directory.
Copy it over to /srv/deposits/content/.
for example:
cp -r /cifs-mount/Digital_Coll_Complete/u0003_0002121_Aston_1939 /srv/deposits/content/.
This does a recursive copy of the entire specified directory and all its contents, and places the copy
in the /srv/deposits/content directory on the libcontent1 server.

7) Do a diff between the content on the share drive and the copy in /srv/deposits/content
to see whether the content copied correctly; for example:
diff -r /cifs-mount/Digital_Coll_Complete/u0003_0002121_Aston_1939 /srv/deposits/content/u0003_0002121_Aston_1939
Look at the output: if there is none, it's a match. If there's a difference, recopy the specified files
and then run the diff again. If it's a big directory, prepend the command with "nohup " so the
process will continue to run if you log off or lose your connection. The output will then be
in the nohup.out file in the directory where you ran the "diff" command.

8) Once you're sure you have a clean copy (no output from "diff"), delete the specified directory from
the share drive. For example: rm -r /cifs-mount/Digital_Coll_Complete/u0003_0002121_Aston_1939
This is a powerful command, a recursive removal of a directory and everything in it,
including subdirectories. Be careful with it.
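
Steps 6 and 7 can be combined into one small function, sketched below. The name copy_and_verify is hypothetical (no such script exists in /srv/scripts as far as this page says), and it deliberately never deletes anything: step 8 should still be done by hand after you see MATCH.

```shell
#!/bin/bash
# Hypothetical wrapper around steps 6 and 7; it never deletes anything.
# Usage: copy_and_verify SOURCE_DIR DEST_PARENT
copy_and_verify() {
    local src="${1%/}" dest_parent="${2%/}"
    local name
    name=$(basename "$src")
    # Recursive copy of the whole directory into the destination parent.
    cp -r "$src" "$dest_parent/" || return 1
    # diff -r prints nothing and returns 0 only when the two trees match.
    if diff -r "$src" "$dest_parent/$name"; then
        echo "MATCH: $name"
    else
        echo "MISMATCH: $name (recopy and rerun the diff)"
        return 1
    fi
}

# Example (paths from steps 6 and 7):
# copy_and_verify /cifs-mount/Digital_Coll_Complete/u0003_0002121_Aston_1939 /srv/deposits/content
```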

UPDATE DATABASE WITH COLLECTION INFO

9) If the collection has an icon: make sure it is in the Admin folder, named for the collection.
For example, u0003_0002121.icon.png for the collection above. (You will need to copy this
to the CONTENTdm server (content.lib.ua.edu) and place it in the /usr/local/Content4/docs/cdm4/images
directory, with 664 access rights "chmod 664 xyz.png"). List this icon in /srv/scripts/collstuff/icons,
so the following script will know not to apply a default icon link.

10) Then go to the collstuff directory /srv/scripts/collstuff and run: collToDbase
to put the collection info in the database. This looks through the Admin folders in the top level
directories placed in /srv/deposits, for collection-numbered XML files. It parses through the content,
repairing encoding errors, puts the content into mysql InfoTrack.digColls, adds the expected canned link,
and asks you for each collection whether or not the collection is online. To answer, copy the title given and paste it
into the CONTENTdm advanced search interface as the exact phrase in the Relation:isPartOf field. Run the search and make sure
this brings up the content, as this is the link created by the software. Note that the links created
are the ones expected for CONTENTdm -- the software needs updating when Tonio's software goes live.
(It will need to ask where the content is live, verify the URL it creates, and allow you to correct the link
on the command line.)

[Verify that links are correct. Update about.php on CONTENTdm server if collection page PHP is not yet written.]

12) copy the metadata spreadsheets to /srv/scripts/metadata/spreadsheets, and make sure they get into the next batch
processing of MODS files.

{Following step applies if we are keeping a use copy of the database on content.lib.ua.edu}

12b) Back up database and copy to the content.lib.ua.edu server so the collection page will be updated
by the new entries in the digColls table that you marked as "online."
**** NOTE: do this early or late in the day to avoid disruptions to users. ******
a) On libcontent1:
i) cd /srv/mysql (this changes your directory location)
ii) then you back up the existing database:
mysqldump -u jlderidder -p InfoTrack > ./backups/InfoTrack_20090625.sql (where 20090625 is today's date)
iii) then copy it to your home directory on the CONTENTdm server --
substitute your name, use your password:
scp backups/InfoTrack_20090625.sql jlderidder@content.lib.ua.edu:.
b) On content.lib.ua.edu :
i) ssh there;
ii) 'su root' first, to get root access
iii) move the file where we need it:
mv InfoTrack_20090625.sql /usr/local/mysql/var/backups/.
iv) cd /usr/local/mysql/var/ (change your directory location)
v) backup existing database: use yesterday's date, so as not to overwrite new version:
mysqldump -u jlderidder -p InfoTrack > ./backups/InfoTrack_20090624.sql (where 20090624 is yesterday)
vi) drop the old database: (NOTE: the database will go down until you have refilled it!)
mysqladmin -u jlderidder -p drop InfoTrack
vii) create it again:
mysqladmin -u jlderidder -p create InfoTrack
viii) refill it with the newer backed-up version:
mysql -u firestar -p InfoTrack < backups/InfoTrack_20090625.sql
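
Rather than typing the dates by hand, the two backup filenames used above can be generated, as in this sketch. The function name backup_name is made up for illustration, and "date -d yesterday" is a GNU date extension that is assumed (not confirmed here) to be available on both servers.

```shell
#!/bin/bash
# Hypothetical helper: build the backup filename for a given YYYYMMDD stamp.
backup_name() {
    echo "InfoTrack_$1.sql"
}

today=$(date +%Y%m%d)
# "-d yesterday" is a GNU date extension; assumed to exist on these servers.
yesterday=$(date -d yesterday +%Y%m%d)

echo "new dump (libcontent1):        backups/$(backup_name "$today")"
echo "safety copy (content server):  backups/$(backup_name "$yesterday")"
```

This only prints the names; the mysqldump, scp, and mysql commands are still run as written in the steps above.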

PREPARE & MOVE CONTENT TO ARCHIVE

13) Look through the directories in /srv/deposits/content for anomalies, badly named things,
and new categories (any which are not listed on http://libcontent1.lib.ua.edu/lockss/Manifest.html;
this manifest is in /srv/www/htdocs/lockss/). Anomalies might be directories which do NOT match
the expected structure (Admin, Metadata, Scans, Transcripts) or the subfolder structure beneath them,
spaces in filenames or directory names, etc.
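
Part of this inspection can be roughed out with a few lines of shell. The sketch below is an illustration only: check_collection is a made-up name, it assumes the four expected folders sit directly inside each collection directory, and it catches only two of the anomalies described (unexpected top-level entries and names containing spaces). It reports and changes nothing.

```shell
#!/bin/bash
# Hypothetical pre-check for step 13; it only reports, it changes nothing.
# Usage: check_collection COLLECTION_DIR
check_collection() {
    local coll="${1%/}" entry name
    # Report top-level entries other than the four expected folders.
    for entry in "$coll"/*; do
        [ -e "$entry" ] || continue
        name=$(basename "$entry")
        case "$name" in
            Admin|Metadata|Scans|Transcripts) ;;
            *) echo "UNEXPECTED: $entry" ;;
        esac
    done
    # Report any file or directory below the collection whose name contains a space.
    find "$coll" -mindepth 1 -name '* *' | sort | sed 's/^/SPACE: /'
}

# Example:
# check_collection /srv/deposits/content/u0003_0002121_Aston_1939
```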

13b) Move any item-level text files out of Scans to S:\Digital Projects\Administrative\collectionInfo\forMDlib\itemMD\
and notify Mary.
13c) Run testIncoming (in /srv/scripts/qc) and look at the output for problems.
13d) Open /srv/scripts/storing/relocating and set $test = 1. Run it and look at the output. Make repairs.

14) Make whatever repairs are necessary for the script to work. If there are any new categories (z0004?),
then add them, and create top level manifests in those category directories (copy from like directories
and modify to fit).

15) Open /srv/scripts/storing/relocating and edit out the line $test = 1;

16) Empty the file /srv/scripts/storing/RelocateManifests, and copy moveme to another filename.

17) run "relocating > look" and look at the output to make sure it's doing what you want. It will:
a) list the files it's going to move, from where and to where in "moveme"
b) write the new manifests into RelocateManifests so you can see what they're going to look like
c) write the parent manifests to parentMans, so you can see what they're going to look like
d) write errors to "look" file, as well as other info

18) WARNING! Before running /srv/scripts/storing/relocating, do a chmod -R 755 /archive
so that you can write to the directories as root.
After running it, do chmod -R 555 /archive
to close off that ability again.

19) Repair the edits made to the relocating script and rerun, this time for real, then check the results.
This time it will actually copy the content into the directories in /srv/archive,
create new manifests, and modify existing ones.

SUPPLEMENTAL WORK, CLEAN-UP, AND QUALITY CONTROL

20) If the output tells you to modify or create manifests, do that next (if not done already)
in top level of /srv/www/htdocs/lockss directory
The script does NOT create the top 2 levels of manifests, as it does not have sufficient
information about new categories to fill this in.

21) check links at http://libcontent1.lib.ua.edu/lockss/Manifest.html -- drill down.
If you cannot access it, modify the .htaccess file in /srv/www/htdocs/lockss to allow your IP,
and restart the apache web server: '/usr/sbin/apache2ctl restart'
(be sure to change it back, and then restart the web server)

22) Then run 'checkem > look' in /srv/scripts/storing -- it will use the "moveme" file to
verify md5 sums of content that was moved,
and delete from the deposits directory if there's a match. If there's NOT a match, it will output
"ERROR: " and the error. So look through the "look" file for problems.
Again, if there's a lot of content, precede this command with "nohup " and then look later in the
nohup.out file for the content that normally would go into "look".
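
If checkem reports an error for a particular file, the same kind of comparison can be repeated by hand. The sketch below is an illustration of an md5 comparison, NOT the checkem script itself; same_md5 is a made-up name, and unlike checkem it never deletes anything.

```shell
#!/bin/bash
# Hypothetical illustration of an md5 comparison; not the checkem script.
# Usage: same_md5 DEPOSITED_FILE ARCHIVED_FILE
same_md5() {
    local a b
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "ERROR: missing file ($1 or $2)"
        return 2
    fi
    # md5sum prints "<hash>  -" when reading stdin; keep only the hash.
    a=$(md5sum < "$1" | cut -d' ' -f1)
    b=$(md5sum < "$2" | cut -d' ' -f1)
    if [ "$a" = "$b" ]; then
        echo "OK: $2"
    else
        echo "ERROR: md5 mismatch for $2"
        return 1
    fi
}
```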

23) then go look in the /srv/deposits directory, make sure folders are clean, and delete them.
I do this: ls /srv/deposits/content/*
ls /srv/deposits/content/*/*
ls /srv/deposits/content/*/*/*
ls /srv/deposits/content/*/*/*/*
If you find content, there's a problem. Solve it!! It may be necessary to rename things and go back
through the steps from 13 onward again.
if no content: rm -r /srv/deposits/content/*
(this will recursively delete all content in the directory)
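
As an alternative to the nested ls commands, a single find can list any files left behind at any depth. This is a sketch only; leftover_files is a made-up name. Run it BEFORE the rm -r above: no output means only empty folders remain.

```shell
#!/bin/bash
# Hypothetical check: list every regular file remaining under a directory.
# No output means only empty folders remain.
# Usage: leftover_files DIR
leftover_files() {
    find "$1" -type f | sort
}

# Example:
# leftover_files /srv/deposits/content
```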