The DataGrid Project

by Robin Middleton and David Boyd

The DataGrid project is a large European collaboration, supported by the EU, to develop a pan-European Grid infrastructure linking the various science Grids of the participants and to demonstrate its utility through demanding scientific applications. With 21 partners and a challenging technical agenda, the project is calling on the expertise of several ERCIM partners to help achieve its goals.

The LHC Challenge
The initial driver behind development of the DataGrid project was the recognition that the scale of computing and data management required by the particle physics experiments on the Large Hadron Collider (LHC), which will become operational at CERN near Geneva in 2005, far exceeded the capability or capacity of existing resources. With data volumes of several petabytes per year from all the experiments on the LHC feeding into the global particle physics community, who will then progressively reconstruct, filter and analyse the data, the aggregate computational and data throughput required is massive.
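
To give a rough sense of scale, the short sketch below converts an annual data volume into an average sustained data rate. It is an illustration only: it assumes decimal petabytes (10^15 bytes), a 365-day year and a perfectly even flow of data, and the volumes of 1, 3 and 5 petabytes are hypothetical stand-ins for "several petabytes", not project figures. The function name is likewise invented for this example.

```python
# Back-of-the-envelope conversion: petabytes per year -> average MB/s.
# Assumptions (not from the article): decimal units, 365-day year, even flow.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60   # 31,536,000 seconds
BYTES_PER_PETABYTE = 10**15             # decimal petabyte
BYTES_PER_MEGABYTE = 10**6              # decimal megabyte


def sustained_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Return the average rate in MB/s needed to move the given yearly volume."""
    bytes_per_second = petabytes_per_year * BYTES_PER_PETABYTE / SECONDS_PER_YEAR
    return bytes_per_second / BYTES_PER_MEGABYTE


if __name__ == "__main__":
    # Illustrative volumes only.
    for volume in (1, 3, 5):
        rate = sustained_rate_mb_per_s(volume)
        print(f"{volume} PB/year is about {rate:.1f} MB/s averaged over the year")
```

This is only the average rate for a single pass over the raw data. Because the data are repeatedly reconstructed, filtered and analysed by a distributed community, the aggregate throughput would be expected to be considerably higher than this figure.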

The DataGrid Solution
To address this challenge, and similar requirements in other sciences, the DataGrid project was conceived and funded with approximately 10 million euro through the EU 5th Framework Information Society Technologies programme. In addition to addressing the problems of Europe's scientists, this three-year project has a wider remit: to develop and prove a technological infrastructure which could potentially revolutionise commercial and social activities throughout Europe.

The initial task, now underway, is to define the overall architectural vision of the project and to establish a detailed technical framework within which the project can progress.

Workpackages
The above components have been further broken down into workpackages and these are currently defining their detailed work programmes and assigning tasks to the partners. The figure shows these workpackages and indicates the relationships between them.

WP2 - Data Management will develop and demonstrate the middleware needed to ensure remote access to petabyte databases and the replication and caching of data in a secure environment.

WP3 - Monitoring will produce the means for users and managers to monitor and optimise performance.

WP4 - Fabric Management will develop new automated system management techniques to support the deployment and operation of tens of thousands of commodity processors.

WP5 - Mass Storage Management will agree on and implement interfaces to the mass storage systems in use within the partners.

WP6 - Integration Testbed will evaluate the effectiveness of the integrated DataGrid architecture for production use across European networks and provide a platform for computation by the applications.

WP7 - Networking Services will oversee the networking aspects of the project.

WP8 (High Energy Physics), WP9 (Earth Observation) and WP10 (Biology) will build on the framework created by the other workpackages to demonstrate use of the DataGrid environment.

Workpackage relationships.

Project Partners
There are six main partners, CERN, ESA, PPARC (UK), CNRS (France), INFN (Italy) and NIKHEF (The Netherlands), with CERN as the co-ordinating partner, and 15 associated partners from 10 countries across Europe. The industrial partners are IBM, Datamat and Compagnie des Signaux. They will contribute their technical expertise and commercial experience, and will address how to disseminate the new technology developed by the project into the marketplace effectively, so that European society and business can also benefit from advances initially driven by science.

Collaboration
The project will be working very closely with several groups in the US including the Globus team, the Grid Physics Network (GriPhyN) and the Particle Physics Data Group (PPDG). By sharing technology and agreeing on joint development programmes, the resources of all these partners can be brought together most effectively to tackle what must be the largest global computing challenge ever undertaken.