2
CERN November 2001 EU DataGrid
Main project goals and characteristics
• To build a significant prototype of the LHC computing model
• To collaborate with and complement other European and US projects
• To develop a sustainable computing model applicable to other sciences and industry: biology, earth observation, etc.
• Specific project objectives
  - Middleware for fabric and Grid management (mostly funded by the EU): evaluation, test, and integration of existing middleware software, and research and development of new software as appropriate
  - Large-scale testbed (mostly funded by the partners)
  - Production-quality demonstrations (partially funded by the EU)
• Open source and technology transfer
  - Global Grid Forum
  - Industry and Research Forum

10
WP 4 Fabric Management
• Goals
  - Facilitate high-performance grid computing through effective local site management
  - Permit job performance optimisation and problem tracing at local sites
  - Building on the partners' experience in managing clusters of several hundred nodes, provide all the necessary tools to manage a site with grid services on thousands of nodes
• Issues
  - How to install the reference platform and EDG software on large numbers of hosts with minimal human intervention per node
  - How to ensure the node configurations are consistent and handle updates to the software suites
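The consistency issue above can be illustrated with a minimal sketch. It assumes each node can report its installed packages as a name-to-version mapping and that the site keeps one reference profile; the package names, versions, node names, and function names are hypothetical and are not taken from the EDG or LCFG tools.

```python
from typing import Dict, List

# Hypothetical reference profile: package name -> required version.
REFERENCE_PROFILE: Dict[str, str] = {
    "example-middleware": "1.0",
    "example-tools": "2.1",
}


def find_drift(reference: Dict[str, str], installed: Dict[str, str]) -> List[str]:
    """Return readable differences between one node and the reference profile."""
    problems: List[str] = []
    for package, wanted in sorted(reference.items()):
        found = installed.get(package)
        if found is None:
            problems.append(f"missing {package} (want {wanted})")
        elif found != wanted:
            problems.append(f"{package} is {found}, want {wanted}")
    # Packages present on the node but absent from the profile are also drift.
    for package in sorted(set(installed) - set(reference)):
        problems.append(f"unexpected {package}")
    return problems


def audit_site(reference: Dict[str, str],
               nodes: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Map each inconsistent node name to its list of problems."""
    report: Dict[str, List[str]] = {}
    for name, installed in nodes.items():
        drift = find_drift(reference, installed)
        if drift:
            report[name] = drift
    return report
```

Calling `audit_site(REFERENCE_PROFILE, nodes)` with a dictionary of per-node package mappings returns only the nodes that differ from the profile, so a site manager would need to look at the exceptions rather than at every node.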

11
WP 5 Mass Storage Management
• Goals
  - Provide extra functionality through common user and data export/import interfaces to all the different existing local mass storage systems used by the project partners
  - Ease integration of local mass storage systems with the Grid data management system by using these interfaces and publishing information
• Issues
  - How to interface the many mass storage systems to the grid and provide mechanisms for interrogating their status
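One way to picture a common interface over different local mass storage systems is an abstract base class with one adapter per system. The sketch below is illustrative only: the class names, method names, and status fields are assumptions, and the in-memory backend stands in for a real storage system.

```python
from abc import ABC, abstractmethod
from typing import Dict


class MassStorageSystem(ABC):
    """Hypothetical common interface that each local system would implement."""

    @abstractmethod
    def import_file(self, name: str, data: bytes) -> None:
        """Store data under the given name."""

    @abstractmethod
    def export_file(self, name: str) -> bytes:
        """Return the data stored under the given name."""

    @abstractmethod
    def status(self) -> Dict[str, int]:
        """Return status information that can be published to the grid."""


class InMemoryStore(MassStorageSystem):
    """Toy backend used only to exercise the interface."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self._files: Dict[str, bytes] = {}

    def _used(self) -> int:
        return sum(len(content) for content in self._files.values())

    def import_file(self, name: str, data: bytes) -> None:
        # Account for an existing file of the same name being overwritten.
        projected = self._used() - len(self._files.get(name, b"")) + len(data)
        if projected > self.capacity_bytes:
            raise IOError(f"not enough space for {name}")
        self._files[name] = data

    def export_file(self, name: str) -> bytes:
        return self._files[name]

    def status(self) -> Dict[str, int]:
        return {
            "capacity_bytes": self.capacity_bytes,
            "used_bytes": self._used(),
            "files": len(self._files),
        }


def publish_status(systems: Dict[str, MassStorageSystem]) -> Dict[str, Dict[str, int]]:
    """Collect the status of every registered system through the same call."""
    return {site: system.status() for site, system in systems.items()}
```

Because callers depend only on `MassStorageSystem`, a new local system would be integrated by writing one more adapter class, without changing the code that interrogates status.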

12
WP 6 Integration testbed
• Goals
  - Plan, organise, and enable testbeds for the end-to-end application experiments, which will demonstrate the effectiveness of the Data Grid in production-quality operation over high-performance networks
  - Integrate successive releases of the software components from each of the development work packages
  - Demonstrate, by the end of the project, testbeds operating as production facilities for real end-to-end applications over large trans-European and potentially global high-performance networks
• Issues
  - How to bring together software components from multiple sites to make a coherent, working testbed deployed at multiple sites on which the application groups can perform useful work

13
WP 7 Networking Services
• Goals
  - Review the network service requirements of DataGrid, then make detailed plans in collaboration with the European and national actors involved
  - Establish and manage the DataGrid network facilities
  - Monitor the traffic and performance of the network, develop models, and provide tools and data for the planning of future networks, to satisfy the requirements of data-intensive grids
  - Deal with the distributed security aspects of DataGrid
• Issues
  - Dealing with the various national bodies and other institutions to ensure that sufficient network capacity with the appropriate characteristics is available to the application groups
  - Work in close co-ordination with GEANT

14
Earth Observation and Biology Science Applications
• Earth Observation (WP9)
  - Provides a good opportunity to exploit Earth Observation Science (EO) applications that require large computational power and access large data files distributed over a geographical archive (e.g. numerical weather and climate models)
• Biology Science (WP10)
  - Production, analysis and data mining of data produced within genome sequencing projects or in high-throughput projects for the determination of three-dimensional macromolecular structures
  - Production, storage, comparison and retrieval of measures of genetic expression levels obtained through gene profiling systems based on micro-arrays, or through techniques that involve the massive production of non-textual data such as still images or video

22
Licenses & Copyrights
• Package repository and web site
  - Provides access to the packaged Globus, DataGrid and required external software
  - All software is packaged as source and binary RPMs
• Copyright statement
  - Copyright (c) 2001 EU DataGrid – see http://www.edg.org/license.html
• License
  - Will be the same as (or very similar to) the Globus license
  - A BSD-style license which puts few restrictions on use
• Condor-G (used by WP1)
  - Not open source or redistributable
  - Through special agreement, can be redistributed within DataGrid
• LCFG (used by WP4)
  - Uses the GPL

23
Security
• The EDG software supports many Certification Authorities (CAs) from the various partners involved in the project
  - http://marianne.in2p3.fr/datagrid/ca/ca-table-ca.html
  - but not the Globus CA
• For a machine to participate as a Testbed 1 resource, all the CAs must be enabled
  - All CA certificates can be installed without compromising local site security
• Each host running a Grid service needs to be able to authenticate users and other hosts
  - The site manager has full control over security for local nodes
• A Virtual Organisation (VO) represents a community of users
  - 6 VOs for Testbed 1: 4 HEP (ALICE, ATLAS, CMS, LHCb), 1 EO, 1 Biology
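The rule that a Testbed 1 resource must have all the CAs enabled amounts to a set comparison. The sketch below assumes that the required CAs and the CAs enabled on each host are available as sets of identifiers; the CA identifiers, host names, and function names are placeholders and do not reflect the real EDG CA list or any Globus API.

```python
from typing import Dict, Iterable, List, Set

# Placeholder identifiers; the real list is the one published by the project.
REQUIRED_CAS = frozenset({"ca-partner-a", "ca-partner-b", "ca-partner-c"})


def missing_cas(required: Iterable[str], enabled: Iterable[str]) -> Set[str]:
    """Return the required CAs that a host has not enabled."""
    return set(required) - set(enabled)


def eligible_hosts(required: Iterable[str],
                   hosts: Dict[str, Set[str]]) -> List[str]:
    """Return, in sorted order, the hosts that have every required CA enabled."""
    required = set(required)
    return sorted(
        host for host, enabled in hosts.items()
        if not missing_cas(required, enabled)
    )
```

A host that enables additional CAs beyond the required set still qualifies under this check, which matches the statement that the site manager keeps full control over security for local nodes.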