
EPICUR Working Report: Architecture of the service

Current situation

URN delivery path and interface for complex URN/URL management

An automated procedure currently exists for registering URNs for online dissertations.
The aim is to make it possible to register URNs at the German National Library with as little
additional work as possible. For this reason, the URN registration process is linked to
the procedure for registering online dissertations. This has been implemented by extracting
the URN and the associated URL from the "MetaDiss" metadata set and then registering them.
However, the URN/URL pairings are registered only after the online dissertation has been
bibliographically indexed and archived by the German National Library. The following properties
characterise this procedure:

A time delay occurs between generating the URN at the institution in question and registering
the URN at the German National Library, such that URNs will not work if they are publicly visible
on a document server immediately after being generated.

Automatic URN registration does NOT mean that it is functionally possible to administer all
associated URN/URL data, such as updating or deleting URLs. Currently, such administration is
handled by a separate, semi-automated procedure. A technical interface is required to allow full
automation of all administrative processes.

If URNs were assigned to other document types, the same approach of integrating the URN
registration process into existing workflows would lead to interdependencies that can only be
resolved with increased expenditure of time and money, e.g. by forming special one-off agreements.

Technical infrastructure of the service

A variety of factors influence the technical stability of the service, such as the hardware platform used,
hardware data redundancy and recovery systems in case of system failure. An aspect that calls for particularly
careful consideration is the database management system being used. The prototype is based on MySQL. However,
with regard to data consistency, quick recovery after a database failure and synchronising data with external
databases, this is not adequate for a production service. The service therefore needs to be migrated
to a database management system that meets these requirements. A production system also requires that the
scalability of the service be taken into consideration, from which stems the need to set up a mirror.

Architecture: Target situation

Interfaces

To reduce the time lag between generating and registering a URN, and to offer a universal interface
for managing the data associated with URN/URL pairings, the link between this interface and existing
business procedures has been cut.

Advantages

URNs can be used actively as soon as they are generated, for example to notify the author that an
online publication is available on a particular document server. A URN could then be used both as
an access address and as a permanent, citable identifier.

URNs could be registered in a structured way for partial documents.

Furthermore, changes to URL registrations could be carried out automatically at the German National Library if
a URL changes, is removed, or needs enhancing. Bulk changes to registrations could also be made if
an entire collection of documents were migrated.

For URN resolution to be controlled individually using targeted parameters, for example to provide a
reference to a PDF version or an archive copy of a document, this information must first be captured.
Capturing it is in any case part of the scope of this interface.
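
The advantages listed above can be pictured with a small in-memory sketch. It is only an illustration under stated assumptions: the class `UrnRegistry`, its method names and the "kind" labels (`frontpage`, `pdf`, `archive`) are hypothetical and do not come from the EPICUR project, and a real service would sit on a database rather than a dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UrnRegistry:
    """Hypothetical stand-in for the URN/URL store (not the EPICUR design)."""

    # urn -> {kind: url}; the kind labels are illustrative assumptions
    entries: dict[str, dict[str, str]] = field(default_factory=dict)

    def register(self, urn: str, url: str, kind: str = "frontpage") -> None:
        """Register or update the URL of the given kind for a URN."""
        self.entries.setdefault(urn, {})[kind] = url

    def remove(self, urn: str, kind: str) -> None:
        """Delete one URL of a URN; unknown entries are ignored."""
        self.entries.get(urn, {}).pop(kind, None)

    def resolve(self, urn: str, kind: str = "frontpage") -> str | None:
        """Resolve a URN, optionally steering to a PDF or archive copy."""
        return self.entries.get(urn, {}).get(kind)

    def migrate_prefix(self, old_prefix: str, new_prefix: str) -> int:
        """Bulk change: rewrite every URL that starts with old_prefix."""
        changed = 0
        for urls in self.entries.values():
            for kind, url in urls.items():
                if url.startswith(old_prefix):
                    urls[kind] = new_prefix + url[len(old_prefix):]
                    changed += 1
        return changed
```

In this sketch a URN resolves as soon as `register` returns, and a collection move is a single `migrate_prefix` call rather than one request per document.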

Implementation

The functions described will be implemented using an XML based data transfer format.
All URN/URL registrations received at the German National Library, for example by e-mail or web form, will be
converted into this format. Subsequent, internal, automated processing of registrations can thus be carried out.
At the same time, new delivery methods can be incorporated without difficulty.
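
As a rough illustration of converting every incoming registration into one XML based format, the following sketch uses Python's standard `xml.etree.ElementTree` module. The element names (`registrations`, `record`, `urn`, `url`) and the web-form field names are placeholders chosen for this example; they are not the EPICUR transfer format, which is defined by the project's own XML schema.

```python
import xml.etree.ElementTree as ET


def to_transfer_xml(pairs):
    """Serialise (urn, url) pairs into one XML document.

    Element names are placeholders, not the EPICUR schema.
    """
    root = ET.Element("registrations")
    for urn, url in pairs:
        record = ET.SubElement(root, "record")
        ET.SubElement(record, "urn").text = urn
        ET.SubElement(record, "url").text = url
    return ET.tostring(root, encoding="unicode")


def from_transfer_xml(xml_text):
    """Read the pairs back for internal, automated processing."""
    root = ET.fromstring(xml_text)
    return [(rec.findtext("urn"), rec.findtext("url"))
            for rec in root.iter("record")]


def from_web_form(form):
    """Adapt one hypothetical web-form submission to the common shape."""
    return [(form["urn"].strip(), form["url"].strip())]
```

Under this assumption, a new delivery method only needs an adapter like `from_web_form`; everything downstream reads the same XML.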

The following delivery methods are currently envisaged:

Email: The plan is to receive registrations as email attachments in text format. The text file will contain data
records, structured in accordance with the XML specifications of the EPICUR project. Furthermore, it is
planned to provide file uploading through a graphical front-end using a web form.

OAI 2.0: Complex URN management is planned via an XML schema implemented in OAI 2.0. Implementation began
in early 2004. An appropriate XML schema was created in the EPICUR project.

Existing registration methods: In parallel to these various delivery methods, URN registration is possible
via the interfaces currently in use to register online dissertations by means of MetaDiss and XMetaDiss
(planned) as well as network publications (using the interactive registration form).

Technical infrastructure of the service

URN/URL data supplied via these various delivery methods, such as email or OAI, will be converted into the XML based data
transfer format. The XML based data structure will be transferred internally to a modular "Ingest" component.
There the registration will be checked for consistency and completeness, and for duplicate URNs/URLs. To ensure
consistent data, duplicates will also be blocked at the Sybase database level in the enhanced data model. The institution
concerned will be notified once the registration has been processed successfully, and likewise if erroneous data
are found.
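
The ingest checks described here might look roughly like the sketch below. The function names, the error wording and the simple `urn:` prefix test are assumptions made for illustration; the actual rules of the "Ingest" module are not specified in this report, and the database-level duplicate block would be a separate safeguard.

```python
def ingest(records, known_urns=(), known_urls=()):
    """Split (urn, url) records into accepted ones and error reports."""
    seen_urns, seen_urls = set(known_urns), set(known_urls)
    accepted, errors = [], []
    for urn, url in records:
        if not urn or not url:
            problem = "incomplete: URN or URL missing"
        elif not urn.lower().startswith("urn:"):
            # Assumed consistency rule, for illustration only
            problem = "inconsistent: identifier is not a URN"
        elif urn in seen_urns:
            problem = "duplicate URN"
        elif url in seen_urls:
            problem = "duplicate URL"
        else:
            problem = None

        if problem is None:
            accepted.append((urn, url))
            seen_urns.add(urn)
            seen_urls.add(url)
        else:
            errors.append((urn, url, problem))
    return accepted, errors


def notification(accepted, errors):
    """Build the message sent back to the delivering institution."""
    lines = [f"{len(accepted)} registration(s) accepted, "
             f"{len(errors)} rejected."]
    lines += [f"rejected {urn!r} -> {url!r}: {problem}"
              for urn, url, problem in errors]
    return "\n".join(lines)
```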
To enhance service availability, work will start in early 2004 on setting up an external mirror at the Bibliothekszentrum
Baden-Württemberg (BSZ). Data exchange will occur at protocol level, not at database level. To achieve this, an XML
based interface will be implemented at the German National Library and the Bibliothekszentrum Baden-Württemberg. Data
comparisons at broadly spaced intervals are planned for the opening phase.
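
A periodic data comparison between the primary service and the mirror could, as one possible approach, work on snapshots exchanged through the XML interface. The sketch below assumes each side can export its registrations as a plain mapping from URN to URL; the function name and the report keys are hypothetical.

```python
def compare_snapshots(primary, mirror):
    """Report where two URN -> URL snapshots disagree."""
    shared = primary.keys() & mirror.keys()
    return {
        # URNs the mirror has not received yet
        "missing_on_mirror": sorted(primary.keys() - mirror.keys()),
        # URNs known only to the mirror
        "missing_on_primary": sorted(mirror.keys() - primary.keys()),
        # URNs present on both sides with different URLs
        "differing": sorted(u for u in shared if primary[u] != mirror[u]),
    }
```

An empty list under every key would indicate that the two sides agree for the compared snapshot.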
