Speeding Adoption of Web Services in Place of FTP

Abstract/Agenda:

Purpose:

To explore and perhaps enhance the future of remote data access, highlighting the merits of accessing data via specialized Web services in contrast to FTP, identifying obstacles to (speedier) adoption of such services, and recommending practical actions (perhaps by ESIP) that would address those obstacles.

In the ESIP community, adoption of Web-based data-access services (in lieu of FTP, e.g.) appears to lag behind related trends in other contexts, as evidenced at the recent Extremely Large Databases conference. This lag may become especially problematic for the ESIP community as:

Data volumes outpace the growth of network bandwidths;

Server functions extend the usual notions of "query" to encompass computations (regridding or binning operations, e.g.) that are best executed near the data;

Multi-disciplinary studies create needs for complex data-discovery workflows, wherein data access occurs only after a sequence of preliminary queries to determine the suitability of a dataset under consideration.

Attendees of the Winter Meeting are ideally positioned to provide feedback on the existence/extent of the problem, on obstacles to the adoption of Web-based data-access services, and on actions to overcome these obstacles.

Agenda (with approximate timings):

4 min: Panel introduction, with a focusing (controversial?) assertion:

"move data if and only if you can't query them"

20 min: Panel members (5 mins each):

reactions to the assertion

18 min: Audience comments and/or questions (for panel members)

16 min: Panel members (4 mins each):

obstacles to adoption

actions (by ESIP?) to overcome these

20 min: Audience comments and/or questions (for panel members)

12 min: Panel members (3 mins each):

concluding remarks

Duration: 90 minutes

Notes:

Dave Fulker begins, goes through slides

Panel: Jim Frew, Peter Baumann, Hook Hua, Peter Cornillon

Frew: Move data if and only if you can't query them. Most users grab data, then do analysis. It's a big transition to think of asking your questions on the server.

Hook: FTP ranks as #1 service used. Matlab-like users typically bring up an app and work in a particular directory all day.

Comment: file name is still the most popular metadata discovery mechanism

Comment: regardless of implementation, end-users prefer a "filesystem-like" view of the data

Frew: drivers of service adoption: 1) bandwidth, 2) what he called the "fixed schema" issue, which I would call interoperability

Peter B: in Europe, the datacenters are the impasse - they do not want to change their present models

Peter B: Scientists today spend 80% of their time massaging data and 20% on analysis; a measure of success would be to reverse this.

Frew: Need to ensure versioning confidence: assurance that what we get out of the web service pipe is the same each day, or that we know exactly what the diffs are. Just as you trust that each time you read a file off disk it is the same.
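
One way to picture the versioning concern is a client-side check that records a digest of each response and compares it with what the same query returned before. The following is only a minimal sketch under stated assumptions: the query string, the payloads, and the `ledger` dictionary are hypothetical stand-ins, not part of any service discussed in the session, and a real service might instead publish its own version identifiers.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Return the SHA-256 hex digest of a response body."""
    return hashlib.sha256(payload).hexdigest()


def check_response(query: str, payload: bytes, ledger: dict) -> str:
    """Compare a response with the digest first recorded for the same query.

    Returns "new" (and records the digest) the first time a query is seen,
    otherwise "unchanged" or "changed" relative to the recorded digest.
    """
    digest = fingerprint(payload)
    previous = ledger.get(query)
    if previous is None:
        ledger[query] = digest
        return "new"
    return "unchanged" if previous == digest else "changed"


# Hypothetical query and three days of simulated responses (assumptions).
ledger = {}
query = "sst?lat=30:40&lon=-80:-70"
day1 = json.dumps({"sst": [18.2, 18.4, 18.9]}).encode()
day2 = json.dumps({"sst": [18.2, 18.4, 18.9]}).encode()
day3 = json.dumps({"sst": [18.2, 18.5, 18.9]}).encode()

results = [check_response(query, body, ledger) for body in (day1, day2, day3)]
print(results)
```

A digest only reports that something changed; knowing "exactly what the diffs are" would still require the service (or the client) to keep the earlier response or a change log.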