DataFinder

The DataFinder is a lightweight tool for managing scientific data. It has been developed to manage large data sets, which can be stored using a variety of data storage interfaces (e.g., WebDAV, FTP, GridFTP, OpenAFS, or TSM). The data structure and the descriptive metadata are stored in XML format on a central server. Both are managed and maintained over network connections using the standardized WebDAV protocol.

DataFinder has the following main functionalities:

Up- and downloading data

Assigning standardized and user-defined meta-information to data

Search function allowing linkage of multiple search terms

Script execution for automation of tasks (e.g., for automatic up- and download or running of calculations)
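As an illustration of what linking multiple search terms can mean, the following is a minimal sketch in Python. It is not DataFinder's query syntax or API: the function names, the "and"/"or" modes, and the sample metadata are all hypothetical, and the search runs over an in-memory dictionary rather than a server.

```python
# Illustrative sketch only: hypothetical names and data, not DataFinder's API.

def matches(metadata, terms, mode="and"):
    """Check one item's metadata against several search terms.

    mode="and" requires every term to match; mode="or" requires at least one.
    """
    checks = [metadata.get(key) == value for key, value in terms.items()]
    return all(checks) if mode == "and" else any(checks)


def search(items, terms, mode="and"):
    """Return the names of all items whose metadata satisfies the linked terms."""
    return [name for name, metadata in items.items()
            if matches(metadata, terms, mode)]


# Hypothetical sample metadata keyed by item name.
ITEMS = {
    "wing_run_01": {"project": "AeroGrid", "status": "final"},
    "wing_run_02": {"project": "AeroGrid", "status": "draft"},
    "hull_run_01": {"project": "Other", "status": "final"},
}

final_aerogrid = search(ITEMS, {"project": "AeroGrid", "status": "final"})
any_match = search(ITEMS, {"project": "AeroGrid", "status": "final"}, mode="or")
```

With "and", only items satisfying every term are returned; with "or", an item qualifies as soon as one term matches.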

DataFinder-GUI (user view)

Administering data at a central point and describing it with standardized meta-tags makes data much easier to find and cuts down on duplicated work. Users can also reuse partial results or input data from other calculations already on the server for their own work, without having to generate the data all over again. The flexible metadata concept also allows users to attach additional meta-information to data on the server.

A data management solution with DataFinder relies on open and flexible standards, such as the WebDAV protocol, on the server side. This ensures that the system remains easy to extend in the future. Currently, a couple of existing WebDAV server products are supported. A commercial solution is the Tamino XML Server from Software AG, and a free solution is the open source WebDAV server Catacomb. Both server solutions allow server-side searching and automatic versioning of documents and data.
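In WebDAV terms, attaching meta-information to a resource corresponds to a PROPPATCH request whose body lists the properties to set. The following minimal sketch only builds such a request body with the Python standard library. It is not DataFinder's own code: the metadata namespace URI, the property names, and the function name are assumptions made for illustration, and nothing is sent to a server.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # standard WebDAV namespace
# Hypothetical namespace for user-defined metadata (an assumption, not DataFinder's).
META_NS = "http://example.org/datafinder-meta/"


def build_proppatch_body(properties, namespace=META_NS):
    """Build the XML body of a WebDAV PROPPATCH request that sets properties.

    `properties` maps property names to values; every property is placed
    in `namespace`. The result is returned as a string.
    """
    ET.register_namespace("D", DAV_NS)
    ET.register_namespace("m", namespace)

    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_element = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop_element = ET.SubElement(set_element, f"{{{DAV_NS}}}prop")

    for name, value in properties.items():
        child = ET.SubElement(prop_element, f"{{{namespace}}}{name}")
        child.text = str(value)

    return ET.tostring(root, encoding="unicode")


# Hypothetical property names and values.
body = build_proppatch_body({"project": "AeroGrid", "mach_number": 0.8})
```

A client would send this body with the PROPPATCH method to the URL of the resource. Keeping user-defined properties in their own namespace avoids clashes with the standard DAV: properties.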

DataFinder for Data Management in Grids

The DataFinder can be used as a general and easy-to-use tool for scientific data management in Grids and Clouds. Within the German D-Grid-Integration project (DGI), the DataFinder has been extended with more data storage interfaces. Within the D-Grid project AeroGrid, the DataFinder is being extended with interfaces to UNICORE 6 and is being used as a user interface for performing complex simulations in Grids.

Availability

The DataFinder is available as Open Source software under the BSD license. Extended and customized versions of the DataFinder can be developed and provided on request.