For ETDs

For DSpace

New Set Of Content:

New content arrives in /srv/deposits/bornDigital/u0015_0000001/. You MUST collect copies AFTER the metadata librarians are done with them and BEFORE the end-of-month archiving, when the files are dispersed to the preservation archive. New content comes in three times a year, and whenever there are corrections. We depend on the metadata librarians to tell us when new content is available.

Log into the InfoTrack database and query the bornDigital table to find which files are still under embargo. For example:
select id_2009, dateAvailable from bornDigital where datestamp > "date of batch" and dateAvailable > "today's date"

This returns the ETD items that become available after today's date (yyyy-mm-dd). Make a list of the dates when the content will be available, with the last 4 digits of each identifier (which will be the DSpace ID assigned). Provide this list to the DSpace admin so that embargoes can be assigned there.
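The query and the list for the DSpace admin can be sketched as two small shell helpers. This is only an illustration: the function names embargo_query and embargo_list are hypothetical, the helpers do not connect to InfoTrack, and the list helper assumes the query results have been saved as plain text with one identifier and one date per line, separated by whitespace.

```sh
# Print the embargo query for one batch.
# $1 = batch date (yyyy-mm-dd); $2 = cutoff date, defaults to today.
embargo_query() {
    batch_date=$1
    today=${2:-$(date +%Y-%m-%d)}
    printf 'select id_2009, dateAvailable from bornDigital where datestamp > "%s" and dateAvailable > "%s"\n' \
        "$batch_date" "$today"
}

# Read "identifier date" lines on stdin and print
# "last-4-characters-of-identifier <tab> date" for the DSpace admin.
embargo_list() {
    awk 'NF >= 2 { print substr($1, length($1) - 3) "\t" $2 }'
}
```

For example, embargo_query 2024-01-15 prints the query with today's date filled in, ready to paste into the database client. Feeding a line with a made-up identifier, such as "u0015_0000001_0000123 2025-06-01", to embargo_list prints 0123 and the date separated by a tab.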

Create a directory named by datestamp in S:\Digital Projects\Other\IncomingDigital\ETD_supp and put the supplemental files from the content directory there.

Back on libcontent, in /srv/scripts/etds/toDspace, rename "all" to all_datestamp and recreate "all". Do the same with the uploads directory.
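The rename-and-recreate step can be sketched as a shell function. The name rotate_dir is hypothetical, the YYYYMMDD datestamp format is an assumption (use whatever format earlier batches used), and the usage lines assume the uploads directory sits next to "all".

```sh
# Archive a working directory under a datestamped name and recreate it empty.
# $1 = parent directory, $2 = directory name,
# $3 = datestamp (assumed format YYYYMMDD, defaults to today).
# The body runs in a subshell, so the caller's working directory is unchanged.
rotate_dir() (
    cd "$1" || exit 1
    stamp=${3:-$(date +%Y%m%d)}
    # Never overwrite an archive from an earlier run.
    if [ -e "${2}_${stamp}" ]; then
        echo "refusing to overwrite ${2}_${stamp}" >&2
        exit 1
    fi
    mv "$2" "${2}_${stamp}" && mkdir "$2"
)

# Usage in this procedure (assumed locations):
#   rotate_dir /srv/scripts/etds/toDspace all
#   rotate_dir /srv/scripts/etds/toDspace uploads
```

The function stops rather than overwriting if a directory with the same datestamp already exists, which guards against running the step twice on the same day.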

Run pullAndRename_new, which is in /srv/scripts/etds/toDspace/.

cd into the "all" directory and run this command to remove duplicate PDFs (it is not clear why the duplicates appear, but this is a workaround):
find . -maxdepth 2 -name 'file_2.pdf' -delete

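Then run:
find . -maxdepth 2 -type f -name "*contents*" -exec sed -i 's/file_2\.pdf//g' {} +
This removes the file_2.pdf references from the contents manifests, but leaves an empty line where each reference was. Adding \n to the pattern will not remove the linefeed, because sed works one line at a time and never sees the trailing newline. To drop the whole line in one move, use the sed expression '/file_2\.pdf/d' instead.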
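The two cleanup commands can be combined with a final check, sketched below. The function name remove_duplicate_pdfs is hypothetical. The sketch assumes GNU find and sed, as the commands above do, and assumes each file_2.pdf reference sits on its own line in the contents manifest, so deleting the whole line is safe.

```sh
# Remove duplicate file_2.pdf files and their manifest lines.
# $1 = directory to clean (the "all" directory in this procedure).
remove_duplicate_pdfs() (
    cd "$1" || exit 1
    # Print each duplicate PDF as it is deleted.
    find . -maxdepth 2 -name 'file_2.pdf' -print -delete
    # Delete every manifest line that mentions file_2.pdf
    # (the dot is escaped so it matches a literal dot).
    find . -maxdepth 2 -type f -name '*contents*' \
        -exec sed -i '/file_2\.pdf/d' {} +
    # List any manifest that still mentions the file.
    remaining=$(find . -maxdepth 2 -type f -name '*contents*' \
        -exec grep -l 'file_2\.pdf' {} +)
    if [ -n "$remaining" ]; then
        echo "file_2.pdf is still referenced in:" >&2
        echo "$remaining" >&2
        exit 1
    fi
)

# Usage in this procedure (assumed location):
#   remove_duplicate_pdfs /srv/scripts/etds/toDspace/all
```

A non-zero exit status means at least one manifest still mentions file_2.pdf and should be inspected by hand.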