Saveen, this is a terrific beginning of the requirements. I agree with your
overall sense of scope, particularly the things you excluded.
I must also say that there are a lot of requirements in this document whose
rationale (or perhaps semantics) I do not understand. Can you enlighten me?
Variants (3.1.4). What WebDAV group is working on "mechanisms ... to use
when submitting variants to the server"? I must have missed this.
Regular Expressions (3.1.6). By "must" do you mean that every DASL server
MUST support regex? If so, why? This seems too expensive to me. As far
as I know, the large search engines (e.g. Verity) do not support regex.
Likewise for NEAR (3.1.7). Again, why is this mandatory?
Result Record Definition (3.2.1). I can certainly see the value of
supporting this - it improves performance by cutting round trips.
(Otherwise, you do a SEARCH to get the list of resources that match, then a
PROPFIND for each one.)
But is this the only reason, and should it be mandatory?
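To make the round-trip argument concrete, here is a rough count. It is only a sketch: it assumes the client issues one PROPFIND per matching resource with no batching, and the function names are mine, not anything in the draft.

```python
def round_trips_without_result_records(match_count: int) -> int:
    """One SEARCH to get the matches, then one PROPFIND per match (assumed, no batching)."""
    return 1 + match_count


def round_trips_with_result_records(match_count: int) -> int:
    """A single SEARCH whose response already carries the requested properties."""
    return 1


if __name__ == "__main__":
    for matches in (1, 10, 100):
        print(matches,
              round_trips_without_result_records(matches),
              round_trips_with_result_records(matches))
```

So the saving grows linearly with the number of matches, which is why I see the performance value even while questioning whether it must be mandatory.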
Paged Search Results (3.2.3). I don't at all see why we need this. If the
search results are returned in chunked Transfer-Encoding, then the search
engine can start returning results as soon as the first match occurs, and
the client can certainly start displaying them as they arrive. Or perhaps
I don't know what this means. I hope it does not mean that the server has
to store the state of searches in progress, as in Z39.50.
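To illustrate what I mean by streaming, here is a minimal sketch of HTTP/1.1 chunked framing applied to search matches as they are found. The function name and the resource paths are hypothetical, and a real response body would be whatever result format DASL ends up defining rather than bare paths.

```python
from typing import Iterable, Iterator


def chunked(results: Iterable[str]) -> Iterator[bytes]:
    """Frame each result as one HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    for result in results:
        data = result.encode("utf-8")
        if not data:
            continue  # a zero-length chunk would signal the end of the body
        yield b"%x\r\n%s\r\n" % (len(data), data)
    yield b"0\r\n\r\n"  # last-chunk marker, no trailers


if __name__ == "__main__":
    # Hypothetical matches; a server could yield these lazily, one per hit.
    for frame in chunked(["/docs/a", "/docs/b.txt"]):
        print(frame)
```

Because the input is consumed lazily, the server can emit each chunk as soon as a match occurs and keeps no search state once the response has ended.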
Search Scope (3.3.1) - is a search scope a collection? Why do we need this?
Is it a performance improvement, so one does not have to issue N searches?
Search Depth (3.3.2) - by "container" do you mean "collection"?
Extensible Query Syntax (3.4.2) - I am leery of this. Where does this
requirement come from? I challenge it in two ways:
1 - it's not needed, because generic query syntaxes are sufficient.
Consider, for example, the DMA (Document Management Alliance) API. It
provides only one syntax, albeit a powerful one, unless by "extensible" you
mean that the client can discover the list of searchable properties and
operators allowed on each one. For this kind of discovery, you should
look at DMA, which provides means of describing the operators, the required
operands for each, the datatypes supported, the default values, and so on.
I think that RDF is actually expressive enough to express all this.
2 - it's not sufficient - I find it hard to believe that a client C can do
discovery on server S and generate an effective query using a *syntax* it
did not previously know, without the intervention of a human. If all you
are trying to say is that server S should be allowed to provide proprietary
search interfaces so that client C (from the same vendor) can work with it,
that's neither difficult nor worthy of the spec.
If you have something different in mind, please correct my misunderstanding.
Internationalization (3.7). I strongly agree. I've certainly run into
problems in searching against non-ASCII data, e.g. names in German or
Greek. (Did you think I would have only disagreements?)
Finally, something needs to be said about full text variants - when a
document is stored with N variants, it's not clear to me which one(s) a
full text search applies to. With some indexing systems, you don't get to
choose.
Likewise, if versioning is still in WebDAV (or should I say 'WebDA'?) then
there must be some interaction with search.
I look forward to discussing these and becoming more enlightened.
PS: we should carry on follow-up discussion on www-webdav-dasl only, I think.