This document specifies the requirements for the BEN Collaborators Collection Tool. The collections tool will be an open-source software system written for the PHP/MySQL platform. Although the system will be written in an extensible way, its primary focus will be the provision of the BEN Collaborative with a suite of tools which facilitate its mission of providing a federated network of peer-reviewed biological learning resources and their metadata.

Revision History

| Date | Version | Description of Document Updates | Author |
| --- | --- | --- | --- |
| 3/10/2006 | 0.1 | First draft | ccollins et al |
| 3/10/2006 | 0.2 | First draft of glossary | mbusby |
| 3/13/2006 | 0.3 | Second draft of glossary | ccollins |
| 3/14/2006 | 0.4 | Included first release of glossary. Added purpose, scope, and overview. | ssachs |
| 3/15/2006 | 0.5 | Updated definitions, added use cases | ssachs |
| 3/15/2006 | 0.6 | Added diagrams, as well as operating environment and database description. | |

The primary goal of the collections tool is to streamline the maintenance of existing BEN Collaborative digital library collections, and to facilitate the addition of new collections to the collaborative.

The collections tool will provide users with all of the tools they need to create and maintain a fully functioning catalog of peer-reviewed learning resources and their metadata. This tool does not provide for end user functionality, such as searching and browsing. Furthermore, this tool does not provide the functionality required for a peer review system. However, this tool will be built with the assumption that such end user and peer review functionality will be built to augment the collections tool. Consequently, the software will be developed in a modular, extensible manner which facilitates such augmentation.

The collections tool will include facilities for:

- harvesting metadata into the catalog and out of the catalog, and managing the harvesting process

- managing the collection metadata structure

- managing the cataloging process(es)

- managing the collection vocabularies

- managing the collection users

The collections tool does not include facilities for end users to view, browse, or search collection resources or metadata.

The cataloging process will make provisions for the user to refer to an external resource, to upload a resource for permanent archival on the collection’s server, and to refer to a resource previously uploaded to the collection’s server via FTP. Accurate meta-metadata will be recorded for the entire lifecycle of the metadata, including creation, editing, and validation.

The metadata validation will be a simple, fixed process. The administrator may assign a validator to validate a resource’s metadata, or may allow any validator to validate the metadata. Validators will view, modify, and validate or reject the metadata. The administrator may change the assignments for a resource’s metadata at any time before the validation is complete.
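The assignment rules in the preceding paragraph can be sketched in a few lines of Python. This is an illustration only, not the tool's design: the class name, the method names, and the outcome labels "validated" and "rejected" are assumptions made for the example.

```python
class MetadataValidation:
    """Tracks who may validate one resource's metadata, and the outcome."""

    def __init__(self):
        self.assigned_validator = None  # None means any validator may act
        self.outcome = None             # None until a validator decides

    def assign(self, validator):
        """Administrator (re)assigns a validator; None allows any validator."""
        if self.outcome is not None:
            raise RuntimeError("validation is already complete")
        self.assigned_validator = validator

    def decide(self, validator, accept):
        """Record a validator's decision to validate or reject the metadata."""
        if self.outcome is not None:
            raise RuntimeError("validation is already complete")
        if self.assigned_validator not in (None, validator):
            raise PermissionError(f"{validator} is not assigned to this record")
        # Outcome labels are hypothetical; the document does not name them.
        self.outcome = "validated" if accept else "rejected"
```

Reassignment is refused once a decision has been recorded, which mirrors the rule that assignments may change only before validation is complete.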

Taking into account the assignment of validators, the editing of metadata, and the validation process, the lifecycle of a resource and its metadata record is depicted in Figure 1.

Figure 1 The resource and metadata lifecycle

The peer review process for resources submitted by outside users, while anticipated, remains out of scope of this document and of the initial version of this software. Collection administrators and validators are assumed to use methods external to the collections tool (e.g., email-based peer reviews) to complete the peer review process for such resources, and to start the validation process only after the resource has been peer-reviewed.

The validation process follows the peer review process because the validation process is meant to certify that the metadata record accurately describes the learning resource. If the learning resource changes as a result of the peer review process, then it will be necessary to modify the metadata record as well, and to certify the accuracy of the resulting metadata during validation.

The harvesting process will be compliant with the OAI Protocol for Metadata Harvesting, version 2.0. Through this process, collection administrators will be able to import metadata into the system from another database, and export metadata to another catalog, in particular the BEN Portal.
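As an illustration of what an OAI-PMH 2.0 harvest request looks like, the sketch below builds ListRecords request URLs. The base URL is a hypothetical placeholder and the function names are invented for this example. The verb and the `metadataPrefix`, `from`, `set`, and `resumptionToken` arguments come from the protocol itself, and `oai_dc` is the metadata format every OAI-PMH repository is required to support.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; each collection would publish its own.
BASE_URL = "https://collection.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH 2.0 ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # UTC datestamp, for incremental harvests
    if set_spec is not None:
        params["set"] = set_spec    # optional selective harvesting by set
    return base_url + "?" + urlencode(params)


def resume_url(base_url, resumption_token):
    """Build the follow-up request for the next chunk of a partial response.

    resumptionToken is an exclusive argument: it is sent with the verb only.
    """
    params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    return base_url + "?" + urlencode(params)
```

A harvester would issue the first URL, then keep calling `resume_url` with each token the repository returns until a response carries no further token.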

The collection tool allows the administrator to manage the metadata structure and the cataloging process separately. The metadata structure is the set of metadata fields for which data is collected about each of the resources in the collection. The cataloging process is the process of providing values for each of those fields in order to describe one resource. The separation between the two will allow the administrator to maintain multiple cataloging processes. For example, the administrator may specify that one cataloging process allows outside users to submit metadata about a new resource, and that another process allows organization staff to catalog metadata about a resource previously peer reviewed by the organization. The processes may differ in terms of the instructions given to the user in providing metadata; or the options available to the user; or the fields which are displayed to the user.

As a baseline, the collection tool’s metadata structure will be that specified in the BEN Metadata Specification, including optional Learning Object Metadata (LOM) specification fields not required by the BEN Metadata Specification. The collection administrator may add metadata fields to that structure, but may not remove or modify those fields.

Administrators will be able to manage the vocabularies available for each metadata field, thereby creating new vocabularies and modifying them over time if necessary. It will also be possible to create mappings which translate terms in one vocabulary into terms in a corresponding BEN vocabulary.
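A vocabulary mapping of this kind can be pictured as a simple lookup table. The sketch below is only an illustration: the term names are invented, and the choice to return unmapped terms separately (so that an administrator could review them) is an assumption rather than a stated requirement.

```python
# Hypothetical mapping from a local vocabulary to a BEN vocabulary.
# The term names are illustrative only.
LOCAL_TO_BEN = {
    "Food webs": "Ecology",
    "Photosynthesis": "Plant Biology",
}


def translate_terms(local_terms, mapping):
    """Translate local terms; return (translated, unmapped) lists."""
    translated, unmapped = [], []
    for term in local_terms:
        if term in mapping:
            translated.append(mapping[term])
        else:
            unmapped.append(term)  # left for manual review
    return translated, unmapped
```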

While it will be possible to modify the metadata structure, controlled vocabularies and cataloging processes for the collection at any time, it will be advantageous to finalize these elements, to the degree possible, before entering any resources or metadata into the collection. The suggested workflow for installing, configuring, and using the collection is depicted in Figure 2.

Figure 2 The collection installation, configuration and use procedure

The collections tool will be built under the assumption that it may coexist with other installations of the collections tool on the same server. Consequently, it will be possible for multiple collections tools to share the same user database. Each user’s role on each collection will be set by administrators for that collection, making it possible for the collections to share users, if so desired by the administrators.
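One way to picture a shared user database with per-collection roles is to key role assignments on the pair (user, collection). The sketch below is a minimal in-memory illustration. The class and method names are assumptions, and the role names are taken from the user roles defined later in this document.

```python
from collections import defaultdict

# Role names follow the user roles defined in this document.
ROLES = {"administrator", "manager", "validator", "cataloguer", "submitter"}


class SharedUserDirectory:
    """One user store shared by several collections on the same server."""

    def __init__(self):
        # (username, collection_id) -> set of role names
        self._roles = defaultdict(set)

    def grant(self, username, collection_id, role):
        """Grant a role to a user within a single collection."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role}")
        self._roles[(username, collection_id)].add(role)

    def roles(self, username, collection_id):
        """Return a copy of the user's roles in the given collection."""
        return set(self._roles.get((username, collection_id), set()))
```

Because roles are stored per collection, the same account can be a validator in one collection and hold no role at all in another.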

Because the collection will also enable users to archive learning resources on the server, it will also be necessary to maintain a resource repository. The repository will be a file store. By default, the resources in the repository will be available for Web download without restrictions. However, it will also be possible to restrict access to the resources using simple web server configuration. The organization maintaining the collection may subsequently implement a separate resource access application to provide end users with access to resources in its repository according to the organization’s business rules. The collection administrator will be responsible for setting up the cataloging process so that the rights metadata is accurately recorded for such resources.

Although this document is primarily concerned with the functionality available to the user via a web user interface, the user interface will be built on top of a well-documented application program interface (API). The API will enable other web applications and services to access and modify data in the collection’s databases. This architecture will pave the way for further extensibility, including the implementation of a peer review system, the integration of a resource access application, as well as advanced resource browse and search web applications. Wherever possible, the API will be designed to enable each organization maintaining a collection to develop and install modules which will extend the application to fit its unique business requirements.

The interaction between the web user interface, API, and collection data stores is depicted in Figure 3.

Figure 3 Collections Tool Architecture

The user database, metadata repository, and collection administration database will be built on the MySQL 4.0 platform. The API and user interface will be built in PHP version 4.0, using the PEAR v1.4 extensions.

Learning Resource – A file or set of files which together comprise a digital learning object.

Metadata Record – A set of data which describes a learning resource, and conforms to a metadata specification. All metadata records created by the BEN Collaborators Collections Tool will conform to the BEN Metadata Specification.

Activities

Acceptance – The decision to include a learning resource in the collection. This decision certifies that the resource is scientifically accurate and/or of sufficient pedagogical quality to meet the standards of the organization maintaining the collection.

Cataloging – The process of creating a metadata record for a resource for inclusion in the collection. Generally performed by a staff member or affiliate of the Collaborator’s organization for already peer-reviewed resources.

Metadata Editing – The process of modifying a metadata record for a specific learning resource.

Peer Review – The process of subjecting a learning resource to the scrutiny of one or more experts in the field for the purposes of reviewing the scientific accuracy and/or pedagogical quality of a learning resource. Used to help make a decision on eventual acceptance & publication of the learning resource. May involve resource revision and metadata editing.

Publishing – The process of making a learning resource and its associated metadata record available to non-administrative users in a collection.

Resource Revision – The process of modifying a learning resource. Usually performed by the resource’s author, in order to respond to comments made during the peer review process.

Submission – The process of submitting a resource for inclusion in the collection. Generally performed by the author of the resource or somebody otherwise outside of the Collaborator’s organization for resources that have not yet been peer-reviewed. Includes creation of a minimal metadata record for the resource.

Validation – The process by which a Validator confirms that the metadata record representing a learning resource is complete and accurate, and that the learning resource is a valid, peer-reviewed resource. May involve metadata editing, but usually does not involve resource revision.

Learning Resource/Metadata Status

Accepted – Learning resource has been peer reviewed and is considered of sufficient scientific and/or pedagogical quality to meet the standards of the organization maintaining the collection.

Not Accepted – Learning resource has not yet been included in the collection.

Not Validated – Metadata record for a learning resource has not yet been validated.

Published – Submission is in final format and has been made available in the online library.

Rejected – Submission has been peer reviewed and has not been accepted for publication.

Validated – The metadata record associated with a learning resource has been reviewed by an expert and acknowledged as accurate and complete.

User Roles

Administrator – System administrator. Has control over configuration of the collections tool and user accounts.

Author – Person who creates a learning resource.

Browser – User who consumes learning resources via the services available in the digital library, usually search & browse.

Cataloger – User who creates the metadata record for a learning resource with the collections tool.

Staging – Test site. Usually appears to be very similar to the live site, but is not made available to external system users. Used for deployments of new system versions before migration to production use.

The following user roles define sets of responsibilities for users in the system. In some cases it is desirable to have the same person filling more than one role – for example, the role of both Collection Administrator and Collection Manager. The user management component of this tool will make these types of role assignments possible, at the collection administrator’s discretion.

- Collection Administrator – The collection administrator manages the collection tool, by configuring it so that its metadata structure, cataloging processes, and controlled vocabularies are sufficient to capture metadata about the resources to be included in the collection. The collection administrator also oversees other users on the system.

- Collection Manager – The collection manager oversees the cataloging, validation and harvesting processes. The manager will be responsible for determining, for each cataloging process, which user roles may use that process, and who is assigned to validate a resource. The manager will also be able to perform metadata imports via harvesting.

- Validator – The validator is responsible for certifying that the metadata for a resource is correct and that the resource is peer-reviewed.

- Cataloguer – The cataloguer is responsible for adding new resources, and the metadata for those resources, to the collection, by entering the relevant metadata into the system. The cataloguer is usually on staff of the organization maintaining the collection.

- Submitter – The submitter is a user who may, at the collection manager’s discretion, participate in the cataloging process, like the cataloguer. The submitter is usually not on staff of the organization maintaining the collection, and usually will not be allowed to use the same cataloging processes as the cataloguer.

A use case describes how a user of the proposed system will interact with the system to perform a unit of work. It describes an interaction over time that has meaning for the end user (person, machine or other system), and leaves the system in a complete state.

· A use case typically has requirements and constraints that describe the essential features and rules under which it operates.

· A use case may have a diagram or description illustrating behavior over time: who does what, to whom, and when.

· A use case typically has scenarios associated with it that describe the work flow over time that produces the end result. Alternate work flows (to capture exceptions, etc.) are also allowed.

Provides an overview of the metadata structure, allowing the user to view the results of any changes to the structure.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator.

Basic Course of Events

1. The user clicks “View metadata structure”.
2. The system displays the hierarchy of the metadata structure, including name, description, data type, and controlled vocabulary, if appropriate, for each field.

This use case makes it possible to collect more metadata about each resource in the catalog.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator.

Basic Course of Events

1. The user indicates that a new metadata field is to be added by clicking “Add metadata field”.
2. The system responds by providing the user with the following options: field name, description, and data type.
   a. Data type may be free entry text, multiple text items, date, single vocabulary term, multiple vocabulary terms, integer, or real number.
   b. Whether the field is required.
   c. If the data type is “single vocabulary term” or “multiple vocabulary terms”, the system allows the user to select the vocabulary to which the field must adhere. The user may choose from any vocabulary which is available in the system.
3. The user provides the required options.
4. The system displays the user’s selections, explains the impact of adding the new field, and asks the user to confirm addition of the field.
5. The user confirms the addition of the field.

Alternative Paths

1. In Step 3, the user makes a mistake by not providing a required option, or by providing invalid data. The system returns to Step 2, displaying the user’s previous inputs and indicating the errors made.
2. In Step 5, the user does not confirm the addition of the field. The system reverts to the precondition, and the new field is not created.

Post conditions

The new field is created. The field is added to existing cataloging processes as a fixed, hidden field with blank or zero value, as appropriate. Corresponding new metadata is not added to existing metadata records; it must be added subsequently on a record-by-record basis. The field is added within the Custom category, which is not part of the BEN specification and therefore is not harvested with the default BEN metadata format.
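The post condition above can be illustrated with a short sketch. The dictionary layout of a cataloging process is invented for this example. Which data types receive a zero rather than a blank default is also an assumption, since the text says only "blank or zero value, as appropriate".

```python
# Assumption: numeric data types default to zero, all others to blank.
NUMERIC_TYPES = {"integer", "real number"}


def attach_new_field(processes, field_name, data_type):
    """Add a newly created metadata field to every existing cataloging process.

    The field is added as a fixed, hidden input, so cataloguers neither see
    nor edit it until an administrator reconfigures the process.
    """
    default = 0 if data_type in NUMERIC_TYPES else ""
    for process in processes:
        process["fields"].append({
            "metadata_field": field_name,
            "input_method": "fixed hidden",
            "default": default,
            "category": "Custom",  # outside the BEN specification
        })
```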

This use case makes it possible to make revisions to previously added metadata fields.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator, and the field being modified is not part of the BEN metadata structure.

Basic Course of Events

1. The user indicates that an existing metadata field is to be modified by clicking “Modify metadata field” from the View metadata structure screen.
2. The system responds by providing the user with the following options: field name, description, data type, and controlled vocabulary, if appropriate. The options are pre-populated with information already recorded for the metadata field.
3. The user provides the required options.
4. The system displays the user’s selections, explains the impact of modifying the field, and asks the user to confirm modification of the field.
5. The user confirms the modification of the field.

Alternative Paths

1. In Step 3, the user makes a mistake by not providing a required option, or by providing invalid data. The system returns to Step 2, displaying the user’s previous inputs and indicating the errors made.
2. In Step 5, the user does not confirm the modification of the field. The system reverts to the precondition, and the field is not modified.

Post conditions

The field is modified. If the field is already in use by some cataloging processes, the input type for the field will be modified accordingly. If existing metadata records have metadata for the field, the metadata will be transformed as described in Table 1.

An existing metadata field is removed from the metadata structure, all cataloging processes, and the harvester.

Rationale

This use case makes it possible for a metadata field to be temporarily removed from the metadata structure; it may be later re-added via the Re-enable a metadata field use case.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator, and the field being disabled is not part of the BEN metadata structure.

Basic Course of Events

1. The user indicates that a metadata field is to be disabled by clicking “Disable metadata field” from the View metadata structure screen.
2. The system explains the impact of disabling the field, and asks the user to confirm disabling of the field.
3. The user confirms the disabling of the field.

Alternative Paths

1. In Step 3, the user does not confirm the disabling of the field. The system reverts to the precondition, and the field is not disabled.

Post conditions

The field is disabled. Catalogers may not enter metadata for that field in those cataloging processes which include an input for this field, and those inputs are not displayed. If some metadata records already have entries for that metadata field, the entries are not deleted. However, when the records are harvested, those entries are not exported.
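The distinction between stored and exported entries can be sketched as a filter applied at harvest time. The function name and the dictionary representation of a metadata record are assumptions made for illustration.

```python
def export_record(record, disabled_fields):
    """Return the entries to expose to a harvester.

    Entries for disabled fields stay in the stored record; they are only
    left out of the exported copy.
    """
    return {
        name: value
        for name, value in record.items()
        if name not in disabled_fields
    }
```

Because the stored record is never modified, re-enabling the field later makes the earlier entries exportable again without any data recovery step.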

An existing metadata field is re-inserted into the metadata structure, all cataloging processes, and the harvester.

Rationale

This use case makes it possible for a metadata field to be reinstated after having been temporarily removed from the metadata structure.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator, and the field being re-enabled has been previously disabled.

Basic Course of Events

1. The user indicates that a disabled metadata field is to be re-enabled by clicking “Re-enable metadata field” from the View metadata structure screen.
2. The system explains the impact of re-enabling the field, and asks the user to confirm enabling of the field.
3. The user confirms the enabling of the field.

Alternative Paths

1. In Step 3, the user does not confirm the re-enabling of the field. The system reverts to the precondition, and the field remains disabled.

Post conditions

The field is re-enabled. Catalogers may now enter metadata for that field in those cataloging processes which include an input for this field, provided that those processes are enabled. When metadata records are harvested, entries for the re-enabled field are exported.

The following use cases do not include a use case for creating a new cataloging process from the ground up, field by field. It is anticipated that such a use case would be extremely time consuming. Instead, the collections tool will be pre-configured to include the default BEN Cataloging Process. Collection administrators may then set up their own cataloging processes by copying the initial BEN Cataloging Process and modifying it.

The cataloging processes in the system are displayed, and a preview of the cataloging forms is provided.

Rationale

This use case makes it possible to preview the cataloging processes to ensure that the proper data is being collected for each metadata field.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator.

Basic Course of Events

1. The user clicks “View cataloging processes”.
2. The system displays a list of all cataloging processes in the system, including disabled processes.
3. The user clicks “Preview” next to the cataloging process she wishes to view in detail.
4. The system displays the cataloging process as it would appear to the cataloguer and/or submitter. The user may provide inputs and submit the cataloging forms, but no new metadata records will be created.

An existing cataloging process is copied to a new, separate cataloging process.

Rationale

Cataloging processes are very similar to one another, and moreover are large, complex objects. For the most part users will not want to create new cataloging processes that differ substantially from previously defined cataloging processes. This use case allows users to easily and quickly take care of the “common case.”

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator.

Basic Course of Events

1. The user indicates that a cataloging process is to be copied by clicking “Copy cataloging process” from the View cataloging processes screen.
2. The system prompts the user for the name of the new process, a description of the process, and a specification of which types of users may perform cataloging with the process.
3. The user provides the required information.
4. The system explains the impact of creating the new process, and prompts the user for confirmation.
5. The user confirms the creation of the new cataloging process.

Alternative Paths

1. In Step 3, the user does not provide the required information. The system returns to Step 2, pre-populating the options displayed with the previously provided information and indicating the user’s errors.
2. In Step 5, the user does not confirm the copy. The system returns to the precondition, and the cataloging process is not copied.

Post conditions

The cataloging process is copied. The new cataloging process is initially disabled, so that users may not begin cataloging until the administrator has finished modifying the process.
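The copy operation can be sketched as a deep copy that starts out disabled. The dictionary layout of a cataloging process and the function name are assumptions made for this example.

```python
import copy


def copy_process(process, new_name, description, allowed_roles):
    """Return an independent, initially disabled copy of a cataloging process."""
    duplicate = copy.deepcopy(process)  # pages and fields are copied, not shared
    duplicate["name"] = new_name
    duplicate["description"] = description
    duplicate["allowed_roles"] = list(allowed_roles)
    duplicate["enabled"] = False  # stays off until the administrator enables it
    return duplicate
```

A deep copy matters here: with a shallow copy, editing a page of the new process would silently change the original process as well.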

A cataloging process may require some refinement and modification over time in order to accommodate changes in the metadata structure or to improve usability.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator.

Basic Course of Events

1. The user indicates that a cataloging process is to be modified by clicking “Modify cataloging process” from the View cataloging processes screen. It is possible to modify cataloging processes which have been disabled.
2. The system provides an overview screen, prompting the user for:
   a. The name and description of the process.
   b. The name and help text for each page, as well as the relative ordering of each page.
3. The user provides the required information. The user may also click an “Add new page” button, which will add a new page at the end of the process.
4. For each page in the cataloging process, the system displays the input fields on that page, including for each field:
   a. The metadata field with which it is associated.
   b. The caption prompting the cataloguer for metadata.
   c. A brief help text, explaining to the cataloguer what is required for the field.
   d. More extended help text, providing detailed instructions to the cataloguer for completing the field.
   e. The input method for the field: not fixed, fixed visible, or fixed hidden.
      i. If a field is not fixed, users will be able to provide input for that field. If a field is fixed visible, users will be able to see the metadata that will be saved for that field, but will not be able to modify it. If a field is fixed hidden, users will not be able to see or modify the metadata that will be saved for that field.
      ii. Note that for fixed fields, the default value will be the value for all metadata records submitted using this cataloging process.
   f. The input type for the field, as appropriate to the metadata field with which the cataloging field is associated, as described in Table 2.
   g. The default value for the field.
5. The user makes changes to each field, reorders the fields within the page, or moves them to another page. The user may also click an “Add new field” button, which adds a new field to the page. When the user has completed the page, she moves on to the next page or, if desired, to another page within the cataloging process. Note that the user may not specify the following cataloging process configurations:
   a. A required metadata field is fixed visible or fixed hidden, with no default value.
   b. A required metadata field is not included in the cataloging process.
6. Steps 4 and 5 continue for each page until the user has made all necessary changes to the cataloging process.
7. The system presents the user with a summary of the new cataloging process; the user may pop up a preview in a new window if desired. The system provides the user with an explanation of the impact of modifying the cataloging process, and prompts the user for confirmation.
8. The user confirms the modification of the cataloging process, and the process is modified.

Alternative Paths

1. In Step 3 or Step 5, the user does not provide the required information. The system returns to the previous screen, pre-populating the options displayed with the previously provided information and indicating the user’s errors.
2. In Step 7, the system notes that not all required metadata fields have been included in the cataloging process. The user is notified that some fields are missing and is not allowed to complete modification until all required metadata fields are included in the cataloging process. The user may return to modify other pages within the process as necessary.
3. In Step 8, the user does not confirm the modification. The system returns to the precondition, and the cataloging process is not modified.
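The two disallowed configurations named in this use case (a required field that is fixed without a default value, and a required field missing from the process) can be checked mechanically. The sketch below is one possible check. The `ProcessField` type, the function name, and the message wording are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

FIXED_METHODS = ("fixed visible", "fixed hidden")


@dataclass
class ProcessField:
    metadata_field: str
    input_method: str            # "not fixed", "fixed visible" or "fixed hidden"
    default: Optional[str] = None


def configuration_errors(required_fields, process_fields):
    """Return a list of problems that block saving a cataloging process."""
    errors = []
    included = {f.metadata_field for f in process_fields}
    # Every required metadata field must appear somewhere in the process.
    for name in sorted(set(required_fields) - included):
        errors.append(f"required field '{name}' is not included in the process")
    # A required field that users cannot edit must carry a default value.
    for f in process_fields:
        is_fixed = f.input_method in FIXED_METHODS
        if f.metadata_field in required_fields and is_fixed and not f.default:
            errors.append(f"required field '{f.metadata_field}' is fixed but has no default value")
    return errors
```

An empty list means the process may be saved; otherwise the messages can be shown next to the summary screen so the administrator knows which pages to revisit.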

Post conditions

The cataloging process is modified. If it is enabled, then users will be able to create resources using the modified process immediately.

This use case allows administrators to revise a cataloging process over a period of time, without allowing catalogers to use the process until it is in proper form.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator, and the cataloging process being disabled is currently enabled.

Basic Course of Events

1. The user indicates that a cataloging process is to be disabled by clicking “Disable cataloging process” from the View cataloging processes screen.
2. The system explains the impact of disabling the process, and asks the user to confirm disabling of the process.
3. The user confirms the disabling of the process.

Alternative Paths

1. In Step 3, the user does not confirm the disabling of the process. The system reverts to the precondition, and the process is not disabled.

Post conditions

The process is disabled. Catalogers are not allowed to catalog new resources using the cataloging process.

An existing cataloging process which was made unavailable to catalogers is made available.

Rationale

This use case makes it possible for administrators to “turn on” a cataloging process once revisions to it have been made.

Users

Collection Administrator

Preconditions

The user is logged in as the Collection Administrator, and the process being re-enabled has been previously disabled.

Basic Course of Events

1. The user indicates that a cataloging process is to be enabled by clicking “Re-enable cataloging process” from the View cataloging processes screen.
2. The system explains the impact of re-enabling the process, and asks the user to confirm enabling of the process.
3. The user confirms the enabling of the process.

Alternative Paths

1. In Step 3, the user does not confirm the re-enabling of the process. The system reverts to the precondition, and the process remains disabled.

Post conditions

The process is re-enabled. Catalogers may now use the cataloging process to catalog new resources.

Note that the collections tool will be packaged with the following controlled vocabularies by default: the BEN Subject/Discipline Taxonomy; the BEN Pedagogical Use Taxonomy; the BEN Learning Resource Type vocabulary; the ISO Country Code list; the ISO Language Code list.

Provides an overview of the controlled vocabularies, allowing the user to view the results of any changes to the vocabularies.

Users

Collection Administrator or Collection Manager

Preconditions

The user is logged in as the Collection Administrator or Collection Manager.

Basic Course of Events

1. The user clicks “View controlled vocabularies”.
2. The system displays the list of controlled vocabularies in the system, including the name, description, and number of terms for each vocabulary.
3. The user clicks “View full vocabulary” for one of the vocabularies.
4. The system displays the full list of terms, including both the id and the full text of each term. The system also displays mappings to other controlled vocabularies, if available.

This use case makes it possible to extend the categorization capabilities of the tool, by adding new controlled vocabularies which may be domain-specific or otherwise meet the unique needs of the organization maintaining the collection.

Users

Collection Administrator or Collection Manager

Preconditions

The user is logged in as the Collection Administrator or Collection Manager.

Basic Course of Events

1. The user indicates that a new controlled vocabulary is to be added by clicking “Add controlled vocabulary”.
2. The system responds by providing the user with the following options: controlled vocabulary name, description, whether the vocabulary is flat or hierarchical, and a text box to enter the list of vocabulary terms, separated by newlines. The user indicates hierarchical relationships within the vocabulary by preceding child terms with dashes: e.g. “Biology” for a first level term, “-Ecology” for a second level term, and “--Energy Transfer” for a third level term.
   a. Vocabulary terms may be preceded by a number in parentheses, to indicate the numerical identifier for that term: e.g. “(1) Biology”, “- (1.1) Ecology”, and “-- (1.1.1) Energy Transfer”.
   b. Vocabulary terms may not begin with a space, a dash or a left parenthesis.
3. The user provides the required input.
4. The system displays the controlled vocabulary settings which the user has provided and asks for confirmation.
5. The user confirms the creation of the controlled vocabulary.

Alternative Paths

1. In Step 3, the user makes a mistake and does not provide all of the required input. The system informs her of the mistake, and presents the form again, with the user’s previous input displayed on the form. The following conditions are errors: a. The vocabulary is flat, but the list of terms is hierarchical. b. The same identifier has been assigned to two different terms. c. The identifier for a term includes characters that are not digits or decimal points. 2. In Step 5, the user does not confirm the addition of the vocabulary. The system reverts to the precondition, and the new vocabulary is not created.

Post conditions

The new vocabulary is created. The vocabulary may be used in defining new metadata fields or modifying existing ones. If numeric identifiers have not been indicated as specified in 2.a, the system automatically computes numeric identifiers (“ids”) for each new term in the vocabulary; these identifiers will remain static for the lifetime of the vocabulary – they will not be re-assigned to other terms in the vocabulary, even if the term to which an identifier was originally assigned is removed from the vocabulary.
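The term-list format from Step 2 and the error conditions from the Alternative Paths can be illustrated with a minimal sketch. It is written in Python purely for illustration (the tool itself targets PHP/MySQL). The function name `parse_terms`, the regular expression, and the error messages are assumptions rather than part of this specification, and the automatic assignment of identifiers is left out.

```python
import re

# Hypothetical line format: leading dashes give the depth, followed by an
# optional "(identifier)" and then the term text. The regular expression is
# an assumption made for this sketch, not part of the specification.
LINE = re.compile(r"^(-*)(?:\s*\(([^)]*)\)\s*)?(.*)$")


def parse_terms(text, hierarchical):
    """Return (terms, errors); each term is a (depth, identifier, name) tuple."""
    terms, errors, seen = [], [], set()
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        dashes, ident, name = LINE.match(line).groups()
        depth = len(dashes)
        # Rule 2.b: a term may not begin with a space, a dash or "(".
        if not name or name[0] in " -(":
            errors.append(f"line {number}: invalid start of term")
        # Error a: hierarchical list of terms in a flat vocabulary.
        if depth > 0 and not hierarchical:
            errors.append(f"line {number}: hierarchical term in a flat vocabulary")
        if ident is not None:
            # Error c: only digits and decimal points are allowed.
            if not re.fullmatch(r"[0-9.]+", ident):
                errors.append(f"line {number}: malformed identifier")
            # Error b: the same identifier assigned to two terms.
            elif ident in seen:
                errors.append(f"line {number}: duplicate identifier {ident}")
            seen.add(ident)
        terms.append((depth, ident, name))
    return terms, errors
```

Under these assumptions, a term without an identifier is returned with `None` in the identifier position, so that a later step could compute one.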

The user is logged in as the Collection Administrator or Collection Manager.

Basic Course of Events

1. The user clicks the “Edit controlled vocabulary” link on the View controlled vocabularies screen. 2. The system responds by providing the user with the following options: controlled vocabulary name, description, whether the vocabulary is flat or hierarchical, and a text box to enter the list of vocabulary terms. The text box is pre-populated with the list of terms in the vocabulary, separated by newlines and preceded by dashes indicating the level of each term, and the numerical identifier of each term in parentheses. 3. The user provides the required input. The user may add new terms, may modify the text (but not the numerical identifier) of existing terms, and may disable existing terms by removing the line corresponding to that term from the text box. When adding a new term, the user may not assign the term a numerical identifier already assigned to an existing term. 4. The system displays the controlled vocabulary settings which the user has provided and asks for confirmation. 5. The user confirms the modification of the controlled vocabulary.

Alternative Paths

1. In Step 3, the user makes a mistake and does not provide all of the required input, or makes an error in modifying the terms of the vocabulary. The system informs her of the mistake, and presents the form again, with the user’s previous input displayed on the form. The following conditions indicate errors: a. The vocabulary is flat, but the list of terms is hierarchical. b. The same identifier has been assigned to two different terms. c. The numerical identifier for a term has been changed. d. The identifier for a term includes characters that are not digits or decimal points. 2. In Step 5, the user does not confirm the modification of the vocabulary. The system reverts to the precondition, and the vocabulary is not modified.

Post conditions

The vocabulary is modified as specified. Changes in the list of terms have the following implications: - A new term is added: that term is not mapped to any controlled vocabulary. - A term is removed: the term is no longer displayed in any cataloging processes in the system, and the term is removed from any metadata records in which it was previously entered. Existing mappings of the term to another vocabulary are removed. - A term is modified: the term is modified in any cataloging processes which use the controlled vocabulary. Any metadata records containing the term are considered updated, and future harvests of that record will include the modified term.
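The implications listed above can be sketched as a comparison of the term list before and after the edit. This is a minimal illustration in Python, not the tool's implementation. The names `diff_terms` and `apply_changes`, and the in-memory layout of records and mappings, are hypothetical.

```python
def diff_terms(existing, submitted):
    """Compare two {identifier: text} mappings and classify the changes."""
    added = {i: t for i, t in submitted.items() if i not in existing}
    removed = {i: t for i, t in existing.items() if i not in submitted}
    modified = {i: (existing[i], t) for i, t in submitted.items()
                if i in existing and existing[i] != t}
    return added, removed, modified


def apply_changes(records, mappings, removed, modified):
    """Propagate removals and modifications to records and term mappings.

    Assumed layout: each record is a dict holding a set of term identifiers
    under "terms" and an "updated" flag; mappings maps a term identifier to
    a term in another vocabulary.
    """
    removed_ids = set(removed)
    modified_ids = set(modified)
    for record in records:
        # A removed term is taken out of every record that contained it.
        record["terms"] -= removed_ids
        # A record containing a modified term is considered updated.
        if record["terms"] & modified_ids:
            record["updated"] = True
    # Existing mappings of removed terms are dropped.
    for ident in removed_ids:
        mappings.pop(ident, None)
```

Newly added terms need no propagation in this sketch, since a new term starts out unmapped and unused.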

This use case facilitates the use of domain-specific controlled vocabularies. In particular, organizations may create domain-specific controlled vocabularies and map them to standard BEN vocabularies; the domain-specific controlled vocabulary may then be used in the cataloging processes. When the metadata is harvested to BEN, the standard BEN vocabulary term will be used in place of the domain-specific vocabulary term.

Users

Collection Administrator or Collection Manager

Preconditions

The user is logged in as the Collection Administrator or Collection Manager.

Basic Course of Events

1. The user initiates the mapping process by clicking “map terms” next to the name of a controlled vocabulary in the “View controlled vocabularies” screen. This controlled vocabulary is called the “source”. 2. The system lists the other controlled vocabularies in the system. 3. The user selects a controlled vocabulary to map to (the “target”). 4. The system displays the terms of the source vocabulary on the left side of the screen, and the terms of the target vocabulary on the right side. 5. The user selects one or more terms from the source vocabulary and one term from the target vocabulary, and clicks the “map” button. 6. The system records the mapping. If unmapped terms remain in the source vocabulary, the system displays the mapping and then returns to Step 4, displaying only those terms in the source vocabulary which have not yet been mapped. 7. When all the terms have been mapped, the system displays the mapping which the user has created and requests confirmation. 8. The user confirms the mapping.

Alternative Paths

1. In Step 3 or Step 5, the user makes a mistake and does not provide all of the required input. The system informs her of the mistake, and presents the form again, with the user’s previous input displayed on the form. 2. In Step 8, the user does not confirm the mapping. The system returns to the precondition, and the mapping is not created.

Post conditions

The new mapping is created. The source vocabulary may now be used in cataloging processes to provide options for metadata fields which use the target vocabulary.
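The mapping described in this use case is many-to-one: several source terms may share one target term. A minimal Python sketch under that reading follows. The function names and the plain-dictionary storage are hypothetical, and passing unmapped terms through unchanged in `translate` is an assumption, since the use case only confirms a mapping once every source term is mapped.

```python
def record_mapping(mapping, source_terms, target_term):
    """Step 6: map one or more source terms to a single target term."""
    for term in source_terms:
        mapping[term] = target_term


def unmapped(source_vocabulary, mapping):
    """Return the source terms still to be shown when returning to Step 4."""
    return [term for term in source_vocabulary if term not in mapping]


def translate(terms, mapping):
    """Substitute target terms for source terms, e.g. when harvesting to BEN.

    Duplicates that arise when two source terms share a target are dropped.
    """
    result = []
    for term in terms:
        target = mapping.get(term, term)  # assumption: keep unmapped terms
        if target not in result:
            result.append(target)
    return result
```

For example, once “Pond Ecology” and “Forest Ecology” are both mapped to “Ecology”, a record catalogued with both source terms would be harvested with the single term “Ecology”.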

The collections tool will make it possible for outside agents to harvest the collection’s metadata records using the Open Archives Initiative Protocol for Metadata Harvesting, v 2.0, via a component called the “metadata provider.” The provider is complemented by a similar component which harvests metadata from outside sources into the collections tool’s database; this component is called the “metadata ingester.” The metadata provider makes it possible for BEN to retrieve and manage metadata records from each collection in an automated and consistent way. The metadata ingester makes it possible for a collection administrator to automatically import metadata records from an existing database or from an outside web service, e.g., iVia.

In order for the ingester to effectively import records from an existing database, that database must have an existing metadata provider which communicates with the collection’s metadata ingester. For this reason, it will be possible to separately install a second copy of the metadata provider included with the collections tool, and to use this second copy to provide metadata records from an existing database. Advanced customization work will be necessary to configure the second copy of the provider to export data from the existing database in a format acceptable to the collection’s metadata ingester. Care will be taken to minimize the amount of work necessary. In any event, this customization work is only necessary during the installation process. Once completed, the collection administrator will be able to regularly ingest records from the pre-existing database into the collections tool without any additional effort.

It will not be necessary to use the collections tool’s metadata harvester for the purpose of importing records from a pre-existing database. Any harvester which conforms to OAI-PMH v2.0, and which provides metadata records in Dublin Core or BEN-LOM format, is suitable for this task.

Note that the following use cases do not provide the collection administrator with any capabilities for configuring the metadata ingester or the metadata provider. Administrators may decide when to initiate a metadata ingest, and may also review logs of metadata records imported by the ingester, and exported by the provider.

Note: The first time a metadata record is ingested into the collection, it is marked un-validated. On subsequent ingests, the ingested metadata record overwrites the corresponding collection record. Moreover, the corresponding collection record is marked as un-validated. Consequently, it is best not to re-ingest a metadata record into the catalog once editing and validation of that record has commenced.
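The overwrite behavior in the note above can be illustrated with a minimal Python sketch. The function name `ingest` and the in-memory catalog layout are hypothetical; the tool itself stores records in its database.

```python
def ingest(catalog, harvested):
    """Store harvested records by identifier; return (new, overwritten) counts.

    Assumed layout: catalog maps a record identifier to a dict holding the
    metadata and a "validated" flag.
    """
    new = overwritten = 0
    for identifier, metadata in harvested.items():
        if identifier in catalog:
            overwritten += 1
        else:
            new += 1
        # Every ingested record is marked un-validated, including a record
        # that had already been edited and validated in the catalog.
        catalog[identifier] = {"metadata": metadata, "validated": False}
    return new, overwritten
```

The sketch shows why re-ingesting is discouraged once validation has begun: the earlier edits and the validated status of an existing record are both lost.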

Summary

Metadata Records are harvested into the collection.

Rationale

This use case allows organizations to import existing digital collections of metadata into the catalog automatically.

Users

Collection Manager

Preconditions

The user is logged in as the Collection Manager, and the metadata provider provides metadata in BEN LOM or Dublin Core format.

Basic Course of Events

1. The user initiates the harvesting process by clicking “Harvest Metadata Records into the Collection”. 2. The system provides the user with the following options: the URL for the metadata provider, and the metadata format (BEN LOM or Dublin Core). 3. The user provides the required input. 4. The system attempts to access the metadata provider and to determine how many records will be provided. 5. The system displays the information it has obtained about the metadata provider and the number of resources to be provided, as well as an explanation of the impact of performing the harvest, and prompts the user for confirmation. 6. The user confirms the harvest.

Alternative Paths

1. In step 3, the user does not provide the required input. The system indicates the user’s mistake and prompts for a correction. 2. In step 4, the system cannot access the metadata provider. The system returns to step 3, and explains that the URL provided does not resolve to an OAI-PMH metadata provider. 3. In step 6, the user does not confirm the harvesting. The system returns to the precondition, and the harvesting is not performed.

Post conditions

The harvesting is performed in the background. When the harvest is complete, the user receives an email indicating the results of the harvest. If a record has been previously harvested into the collection’s catalog, then the new metadata record will overwrite the corresponding existing metadata record, including the meta-metadata parts of the record. Regardless of whether the record was previously harvested into the collection’s catalog, the harvested metadata record is recorded as not validated. Harvested records must be validated before they are made available in the collection.
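For orientation, Step 4 of this use case amounts to issuing OAI-PMH requests against the URL supplied by the user. The Python sketch below only builds such request URLs. `Identify`, `ListRecords`, `metadataPrefix`, `from`, and `until` are defined by OAI-PMH v2.0, whereas the helper name `oai_request_url` and the example base URL are hypothetical.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, arguments=None):
    """Build an OAI-PMH request URL, skipping arguments whose value is None."""
    query = {"verb": verb}
    for name, value in (arguments or {}).items():
        if value is not None:
            query[name] = value
    return f"{base_url}?{urlencode(query)}"


# Check that the URL answers as an OAI-PMH provider at all.
identify_url = oai_request_url("http://example.org/oai", "Identify")

# Request Dublin Core records updated on or after a given day.
list_url = oai_request_url(
    "http://example.org/oai",
    "ListRecords",
    {"metadataPrefix": "oai_dc", "from": "2006-03-01", "until": None},
)
```

A provider that does not answer the `Identify` request with a valid response would correspond to the second alternative path above.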

Note: Ingester logs are cleared once every six months in order to prevent disk quota overruns.

Summary

Logs of previous harvests of metadata records into the collection are displayed.

Rationale

This use case allows administrators to review previous invocations of the metadata ingester, and which records were included in the collections tool as a result.

Users

Collection Manager

Preconditions

The user is logged in as the Collection Manager.

Basic Course of Events

1. The user clicks “View Ingester Logs”. 2. The system displays all previous invocations of the metadata ingester, including the following details: a. Date of the ingest b. Total number of records to be ingested c. Total number of records actually ingested i. The number of records actually ingested will differ from the number of records to be ingested if the ingest has not been completed, or if there was an error ingesting one of the records in the ingest. d. Number of records with errors in the ingest process e. OAI-PMH parameters of the ingest, including metadata format, date range, and set 3. The user clicks the “view ingest details” button. 4. The system displays the list of records to be ingested, and for each record, an indication of whether the record has been ingested, whether there was an error in the ingest process for that record, or whether that record was successfully ingested. If the record was successfully ingested, the record’s title and URL are displayed, as well as the identifier for the record within the collection’s catalog. If the record was de-accessioned, then the de-accessioning is indicated. 5. The user clicks the “view record details” button. 6. The system displays the entire metadata record.

Alternative Paths

1. In Step 5, the user clicks the “view request/response” button. The system displays the ingester’s XML request for the record, as well as the XML response from the metadata provider.

Note: Provider logs are cleared once every six months in order to prevent disk quota overruns.

Summary

Logs of previous harvests of metadata records from the collection are displayed.

Rationale

This use case allows administrators to review previous invocations of the metadata provider, and which records were provided as a result.

Users

Collection Manager

Preconditions

The user is logged in as the Collection Manager.

Basic Course of Events

1. The user clicks “View Provider Logs”. 2. The system displays all previous invocations of the metadata provider, including the following details: a. Date of the provision b. Total number of records to be provided c. Total number of records actually provided d. Number of records with errors in the provision process e. OAI-PMH parameters of the provision, including metadata format, and date range 3. The user clicks the “view ingest details” button. 4. The system displays the list of records to be provided, and for each record, an indication of whether the record has been provided, whether there was an error in the provision process for that record, or whether that record was successfully provided. The record’s title and URL are also displayed. If the record was de-accessioned, then the de-accessioning is indicated. 5. The user clicks the “view record details” button. 6. The system displays the entire metadata record.

Alternative Paths

1. In Step 5, the user clicks the “view request/response” button. The system displays the XML request for the record, as well as the provider’s XML response.

A new resource and its associated metadata record are entered into the database.

Rationale

This use case provides for resource-by-resource expansion of the collection.

Users

Cataloguer or Submitter

Preconditions

The user is logged in as the Cataloguer or Submitter, and has sufficient permissions to use the desired cataloging process.

Basic Course of Events

1. The user initiates the cataloging process by clicking “Catalog a resource with the <cataloging process name> process”. 2. The system displays the first page of the cataloging process. 3. The user fills out the first page. 4. The system checks the input and displays the next page of the form. The input is checked according to the metadata fields corresponding to the input. 5. The input continues in this way, with the user allowed to easily navigate between the different pages of the form as necessary. When the user is satisfied with the input, she clicks the “Submit” button at the bottom of the screen. 6. The system summarizes the user’s input and requests confirmation. 7. The user confirms the resource creation.

Alternative Paths

1. In step 3, the user does not provide the required input. The system indicates the user’s mistake and prompts for a correction. 2. In step 7, the user does not confirm the cataloging. The system returns to the precondition, and the resource is not cataloged.

Post conditions

The resource is cataloged in the system and is recorded as not validated. The collection manager is notified of the new resource. The system records the appropriate meta-metadata regarding the creation of the metadata.

The user views a summary of those metadata records which she has previously cataloged, and their status on the system.

Rationale

This use case allows users to review their previous work.

Users

Cataloguer or Submitter

Preconditions

The user is logged in as the Cataloguer or Submitter.

Basic Course of Events

1. The user clicks the “View previously cataloged records” link. 2. The system displays a list of all of the user’s previously cataloged records, including the title and URL for each record, as well as the status of that record (unvalidated, validated, or de-accessioned). 3. The user clicks the “View detailed metadata record” link. 4. The system displays the complete metadata record.

Unvalidated metadata records are assigned to validators, so that validation may proceed.

Rationale

This use case provides for quality control over metadata in the collection, balanced by the need to control the workload on individual validators.

Users

Collection Manager

Preconditions

The user is logged in as the Collection Manager.

Basic Course of Events

1. The user clicks the “View unvalidated metadata records” link. 2. The system displays the records which have not yet been validated. Alongside each record, the system displays all of the validators in the system, and the current validation workload of each validator. 3. For each record, the user selects either “Unvalidated”, “Any validator”, or the name of a validator. 4. The system summarizes the user’s input and requests confirmation. 5. The user confirms the assignment.

Alternative Paths

1. In step 3, the user does not provide the required input. The system indicates the user’s mistake and prompts for a correction. 2. In step 5, the user does not confirm the assignment. The system returns to the precondition, and the validation is not assigned.

Post conditions

The record is assigned to the selected validators, and the assignment is recorded. The assigned validators are notified; in the case of “Any validator”, all validators are notified of the new assignments. If appropriate, previous assignments are deleted, and the previously assigned validators are notified of the unassignment.

A metadata record is certified as accurate, and the resource is certified as peer-reviewed.

Rationale

This use case provides for quality control over the metadata records in the system.

Users

Validator

Preconditions

The user is logged in as the Validator, and has been assigned the selected metadata record for validation.

Basic Course of Events

1. The user clicks the “View assigned metadata records” link. 2. The system displays a list of metadata records assigned to the user for validation, as well as the records assigned to “Any validator.” 3. The user clicks the “Validate” button next to a metadata record. 4. The system displays the full metadata record, as well as the assignment details. 5. The user optionally inspects the resource and edits the metadata record as appropriate. 6. The user clicks the Validate button. 7. The system records the validation of the metadata record.

Alternative Paths

1. In step 6, the user clicks the Reject button. The metadata record is not included in the collection, but is permanently archived in the metadata repository as a rejected record.

Post conditions

The record is validated. The system records the appropriate meta-metadata regarding the validation of the metadata. The record and its associated resource may be viewed by users through a separate resource access application, or by metadata requesters via the metadata harvester.

This use case provides for consistency in the application of the cataloging process across the organization, by ensuring that metadata experts modify metadata records according to the organization’s policies.

Users

Validator or Collection Manager

Preconditions

The user is logged in as the Validator or Collection Manager. If the user is a Validator, the user has been assigned the selected metadata record for validation.

Basic Course of Events

1. The user clicks the “Edit metadata” button from the validation screen for the metadata record. 2. The system displays the cataloging process which was used to create the metadata record, pre-populated with the available metadata. 3. The user modifies the metadata on the first page of the process, or navigates to another page and modifies the metadata on that page. 4. The system checks the input and displays the next page of the form. The input is checked according to the metadata fields corresponding to the input. 5. The input continues in this way, with the user allowed to easily navigate between the different pages of the form as necessary. When the user is satisfied with the input, she clicks the “Submit” button at the bottom of the screen. 6. The system summarizes the user’s input and requests confirmation. 7. The user confirms the modification of the resource.

Alternative Paths

1. In step 3, the user does not provide the required input. The system indicates the user’s mistake and prompts for a correction. 2. In step 7, the user does not confirm the modification. The system returns to the precondition, and the metadata record is not modified.

Post conditions

The record is updated. The system records the appropriate meta-metadata regarding the editing of the metadata.

This use case allows administrators to review the resources in their collections, and the metadata records for those resources.

Users

Collection Manager

Preconditions

The user is logged in as the Collection Manager.

Basic Course of Events

1. The user clicks the “View validated metadata records” link. 2. The system displays the title and URL for every resource in the system. Each resource’s status is indicated as “validated”, “unvalidated”, or “de-accessioned.” 3. The user clicks on a resource in order to view the metadata record for that resource. 4. The system displays the metadata record for the resource.

A resource is removed from the collection. While the metadata record for the resource remains in the collection, the resource is no longer considered part of the collection. The de-accessioning is communicated to outside parties via the metadata provider.

Rationale

This use case makes it possible to maintain the integrity and quality of the collection, by de-accessioning from the collection those resources which are no longer available, no longer meet the organization’s standards for scientific or pedagogical quality, or for some other reason no longer belong in the collection.

Users

Collection Manager

Preconditions

The user is logged in as the Collection Manager.

Basic Course of Events

1. The user clicks the “De-accession resource” button from the view all resources screen. 2. The system prompts the user to confirm the de-accessioning. 3. The user confirms that the resource is to be de-accessioned. 4. The system marks the resource as de-accessioned.

Alternative Paths

1. In step 3, the user clicks the “cancel” button. The resource is not de-accessioned from the collection.

Post conditions

The resource is permanently de-accessioned from the collection, and may not be subsequently included in the collection again. The metadata provider reflects the de-accessioning in future harvesting operations.

The collections tool will be delivered with certain default configuration options which facilitate its use as a software platform for building a digital library collection in the BEN collaborative. These “out of the box” settings are detailed in this section.

The collections tool will be preconfigured to use the BEN Metadata Schema, as outlined in the BEN Metadata White Paper. The metadata structure will also include all fields in the Learning Object Metadata Specification v1.0 (LOMv1.0) which are not included in the BEN Metadata Schema but which are optional, except 4.4 Technical Requirements, 7 Relation and 8 Annotation.

The collections tool will be preconfigured to use a single cataloging process. The cataloging process will be consistent with the cataloging process used on the AAAS General Biology Collection. This process is decomposed into six screens and is summarized as follows:

The metadata provider will have several configuration options pre-defined upon installation:

- The selective datestamp-based harvesting granularity is by day, meaning that the harvester will selectively provide metadata for all records updated within a specified range of dates (as opposed to a range of seconds).

- The adminEmail is the email address for the collection administrator user of the collection.

- Compression is not supported.

- Collection description is blank.

- Metadata formats supported are Dublin Core without qualification (oai_dc), NSDL-compliant Dublin Core (nsdl_dc), BEN-LOM (oai_BEN), and customized BEN LOM (oai_BEN_Custom). The last format is equivalent to appending all metadata in the Custom category to the oai_BEN metadata format.

- Deleted records are supported at the persistent level, i.e., the collections tool persistently keeps track of the full history of deletions and consistently reveals the status of a deleted record over time.

- Harvesting by sets is not supported.

- “Incomplete harvesting” is provided if the requested set of records to be harvested is larger than 25 records. In such cases, parcels of 25 records will be distributed together with resumption tokens for the next 25 records. Resumption tokens expire after one day.
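The incomplete-harvesting setting in the last item can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the provider's implementation: the function name `next_parcel`, the in-memory token table, and the `offset-N` token format are hypothetical, and a real provider would need tokens that cannot collide between concurrent harvests.

```python
from datetime import datetime, timedelta

PARCEL_SIZE = 25                     # records per parcel, as configured above
TOKEN_LIFETIME = timedelta(days=1)   # resumption tokens expire after one day


def next_parcel(records, tokens, now, token=None):
    """Return (parcel, resumption_token); the token is None on the last parcel.

    tokens maps an issued token to its (offset, expiry time) pair.
    """
    if token is None:
        start = 0
    else:
        start, expires = tokens.pop(token)  # KeyError for an unknown token
        if now > expires:
            # A real provider would answer with a badResumptionToken error.
            raise ValueError("resumption token has expired")
    parcel = records[start:start + PARCEL_SIZE]
    end = start + len(parcel)
    if end >= len(records):
        return parcel, None  # the list is complete
    new_token = f"offset-{end}"  # hypothetical token format
    tokens[new_token] = (end, now + TOKEN_LIFETIME)
    return parcel, new_token
```

With 60 records, for instance, a harvester would receive parcels of 25, 25, and 10 records, with a resumption token accompanying the first two.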

In the event that a metadata field is modified from one field type to another, the information within that metadata field for existing records is transformed according to the field type, as described below:

Table 1 Method by which metadata fields are transformed. The old field type is in italics, and the new field type is in bold type. The value which results from the conversion is entered in the appropriate table cell.

Free Entry Text

Multiple Text Items

Date

Single Vocabulary Term

Multiple Vocabulary Terms

Integer

Real Number

Free Entry Text

n/a

The individual items are joined by a semicolon.

The date is converted to text according to the format MM/DD/YYYY.

The vocabulary term is saved as text.

The vocabulary terms are saved as a comma-separated list of terms.

The integer is converted to text.

The real number is converted to text.

Multiple Text Items

The text is split by semicolons and each resulting piece saved as a single text item.

n/a

The date is converted to text according to the format MM/DD/YYYY.

The vocabulary term is saved as a single item of text.

Each vocabulary term is saved as a separate item of text.

The integer is converted to text.

The real number is converted to text.

Date

A null date.

A null date.

n/a

A null date.

A null date.

A null date.

A null date.

Single Vocabulary Term

An empty string (no vocabulary term)

An empty string (no vocabulary term)

An empty string (no vocabulary term)

n/a

An empty string (no vocabulary term)

The term whose identifier equals the integer, if any; or the empty string (no vocabulary term).

An empty string (no vocabulary term)

Multiple Vocabulary Terms

An empty string (no vocabulary term)

An empty string (no vocabulary term)

An empty string (no vocabulary term)

No change

n/a

The term whose identifier equals the integer, if any; or the empty string (no vocabulary term).