Workshop Motivation, Goals and Structure

This workshop brought together a cross section of people
involved with information server technologies, search technologies, and
directory and online services, to discuss where repository interface standards
could support better approaches to distributed indexing and searching.
The goal of the workshop was not to produce standards, but rather to uncover and
discuss areas of mutual concern where standards might gain momentum.

There was a great deal of interest in this workshop. To
keep the group at a workable size while maximizing breadth of attendee
backgrounds, the workshop co-chairs limited attendance to one person per
position paper and, furthermore, to one person per
institution. (One extra attendee from each of @Home Network and Transarc
attended as "scribes" for the plenary sessions of the workshop.)
There were quite a few cases where people expressed legitimate desires
to bring multiple representatives (e.g., from two different parts of a
large company or government agency, or from two different research labs
at a university), but we felt it was necessary to restrict attendance to keep
the workshop to a reasonable size.

The workshop spanned two days. The first day's goal was
to identify areas for potential standardization through several directed
discussion sessions, while the second day's goal was to filter the list
of issues and identify those most likely to lead to useful standards. Each
technical session during the first day began with two 15-minute talks expressing
opposing views on the session topic, followed by a breakout session in
three parallel tracks. During the breakouts, participants were asked to examine
what might be standardized over 3-month, 12-month, and longer time periods,
and then to report back with a summary slide at the plenary session. At
first we tried to have a brief question-and-answer session after each talk
and a plenary discussion after the breakout session, but after the first
technical session we decided to cut out questions and plenary discussions
to maximize time available for the breakout sessions.

We selected three areas for the above sessions, based
on the position papers submitted.

The first area, Distributed Data Collection, addressed
issues associated with the collection of data across the network. We led
with questions such as, "Is robots.txt adequate for future
needs?" and, "What is the value of protocol-based and programmatic
solutions?"

The second area concerned Data Transfer Formats, for which we suggested
a discussion of the relative merits of early deployment (for creating
a dominant standard) vs. format negotiation (allowing interoperable access
to multiple standards).

The final area examined the need for architectures that
distribute search across several repositories. We observed that the most
popular indexes are currently constructed as centralized repositories in
the mainframe model, while meta-search engines have more recently
gained popularity. We asked workshop participants to consider the role
of global vs. topic-specific indexes, repository access protocols like
Z39.50, and other mesh-like models appearing on the Web. We also asked
whether distributed searching is a realistic paradigm for administratively
decentralized resources.

To keep the workshop reporting process manageable, only
session chairs were given a chance to work over drafts of this report
before it was published. Other workshop attendees will have two weeks
after the report is published in which to
submit a web page, to be linked into the report, containing any comments they want
to add.

This page is part of the DISW 96 workshop.
Last modified: Thu Jun 20 18:20:11 EST 1996.