I enthusiastically support this proposal, and congratulate Jim on
the clarity of his presentation!
Like Judy, I would advocate replacing (rather than just supplementing)
the URL-to-URL "reference" functionality in the current protocol with
the URL-to-resource "bind" functionality described here. I believe
there is utility in URL-to-URL mappings, but they are less important
and could be deferred in order to get the critical URL-to-resource
mapping defined and accepted.
I would suggest one modification and a couple of supplements to
Jim's proposal.
The supplements are easy, so I'll do those first.
The first supplement is:
"If a server cannot support BIND semantics (in particular,
could not support the fix-up stage for MOVE, or could not guarantee
that GET/PUT/LOCK on this URL affect the same resource as a
GET/PUT/LOCK to all other bindings to this resource), the BIND call
MUST fail."
The second supplement is:
"A MOVE request MUST fail if the binding fix-up cannot be done. A client
can always issue explicit COPY/DELETE in this situation".
The one modification is the semantics of DELETE:
In the context of "BIND", there are two important
delete-like operations. The first is "UNBIND xxx", which says
that the binding of xxx to a resource should be deleted. The second
is "DESTROY xxx", which says that *all* bindings to the resource
bound to xxx should be deleted.
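
To make the distinction concrete, here is a minimal sketch of the two operations over a toy binding table. The class, the method names, and the URLs "/a" and "/b" are illustrative assumptions, not part of Jim's proposal, and the sketch does not decide what happens to the underlying state once no bindings remain.

```python
class BindingTable:
    """Toy model (assumption): each URL maps to a resource id."""

    def __init__(self):
        self.bindings = {}  # url -> resource id

    def bind(self, url, rid):
        self.bindings[url] = rid

    def unbind(self, url):
        """UNBIND: delete only the binding named by url."""
        del self.bindings[url]

    def destroy(self, url):
        """DESTROY: delete *all* bindings to the resource bound to url."""
        rid = self.bindings[url]
        for other in [u for u, r in self.bindings.items() if r == rid]:
            del self.bindings[other]


table = BindingTable()
table.bind("/a", "R")
table.bind("/b", "R")
table.bind("/c", "R")

table.unbind("/a")                    # "/b" and "/c" still reach R
after_unbind = sorted(table.bindings)

table.destroy("/b")                   # no binding to R remains
after_destroy = sorted(table.bindings)
```
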
Jim's proposal says that DELETE should have DESTROY semantics
(and that some other method be created to have UNBIND semantics).
While I agree with Jim that associating DELETE with DESTROY is
probably more "natural", given the current wording in RFC 2068,
I believe that for Advanced Collections to be usable for versioning,
it is essential that a downlevel "DELETE" have "UNBIND" semantics.
I also believe that associating UNBIND with DELETE is fully
compatible with current client expectations, namely that after
they do a "DELETE xxx", a "GET xxx" will fail (perhaps with a 404),
any entity caching associated with xxx should be invalidated,
and a "PUT xxx" will return a 201 (Created).
It is essential for DELETE to have UNBIND semantics for versioning
because a key characteristic of versioning is that you be able to
recover previous "states" of the web site. In particular,
when one client issues a DELETE, another client with a different
"workspace" still wants to be able to see that resource, and even after
all clients have "accepted" that DELETE, a client will want to be able
to find that old resource (in the revision history of the web site).
All of this is possible if DELETE is interpreted as UNBIND, since
the deletion of one binding is compatible with the resource still being
visible under another binding (at a different URL). So a versioning
server can just perform an UNBIND whenever a client issues a DELETE.
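
A rough sketch of such a versioning server follows, assuming it keeps one extra binding per resource under a hypothetical "/history/" prefix. The prefix, the class name, and the choice of 204 for a successful DELETE or overwrite are my assumptions for illustration. The sketch also shows the client expectations mentioned earlier: after the DELETE, a GET fails with 404 and a PUT returns 201.

```python
class VersioningServer:
    """Toy server that treats DELETE as UNBIND (illustrative only)."""

    def __init__(self):
        self.bindings = {}   # url -> resource id
        self.resources = {}  # resource id -> body
        self.next_id = 0

    def put(self, url, body):
        if url in self.bindings:
            self.resources[self.bindings[url]] = body
            return 204
        rid = self.next_id
        self.next_id += 1
        self.resources[rid] = body
        self.bindings[url] = rid
        # Assumption: the server also binds the resource into its history.
        self.bindings[f"/history/{rid}"] = rid
        return 201

    def get(self, url):
        rid = self.bindings.get(url)
        if rid is None:
            return 404, None
        return 200, self.resources[rid]

    def delete(self, url):
        if url not in self.bindings:
            return 404
        del self.bindings[url]  # UNBIND: other bindings are untouched
        return 204


server = VersioningServer()
created = server.put("/doc", b"v1")
deleted = server.delete("/doc")
status_after, _ = server.get("/doc")
history_status, history_body = server.get("/history/0")
recreated = server.put("/doc", b"v2")
```
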
But if DELETE is interpreted as DESTROY, then a versioning server is
forced to refuse to accept any DELETE calls (or risk violating one
of the key goals of versioning), even though a client would
have been perfectly happy if just an UNBIND were performed.
And then we would have to introduce a DESTROY operation anyway, for those
times when you really *did* want to do a DESTROY.
In summary, our choices appear to be either:
- interpret DELETE as UNBIND, and introduce a new DESTROY method, or
- interpret DELETE as DESTROY, have a versioning server reject all
DELETE requests, and introduce a new UNBIND and a new DESTROY method.
I would far prefer the former choice.
Cheers,
Geoff
From: Jim Whitehead <ejw@ics.uci.edu>
Within the authors' group for Advanced Collections, we've been exploring the
idea of creating a mechanism for adding new URL-to-resource bindings,
possibly in addition to Direct References, or perhaps as a replacement for
them.
Since the goal of a Direct Reference is to create a new location in the HTTP
namespace which can be used to access the reference's target resource, the
basic idea behind the mapping concept is to have a mechanism for creating a
new URL for a resource which acts, in every possible way, just like any
existing URL does for any existing resource.
This would remove some of the limitations of Direct References which result
from a Direct Reference creating a resource for the direct reference, in
addition to the resource which is the target of the reference. This extra
resource creates problems for methods which have a source and a destination
like COPY and MOVE -- which resource do they apply to by default, the
reference or the target? The two resources also create problems for LOCK --
LOCK really needs to lock both to preserve RFC 2518 semantics.
The current best understanding of how a URL relates to a resource is given in
RFC 2396 (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt).
I like to visualize the relationships between URIs, resources, and underlying
chunks of state with the following diagram:
    uri1       uri2    uri3
        \        |       |
         \       |       |
          \      |       |
           \     |       |
        +-------------------+
        |    resource R     |
        +-------------------+
                  |
                  |
                  |
       +---------------------+
       | chunk of state      |
       | (e.g., a file,      |
       | or multiple files,  |
       | or an EEPROM memory |
       | or database record  |
       | or DMS object, ...) |
       +---------------------+
That is, there is a mapping of one or more URIs to a resource, which is an
abstraction maintained by a server which models a chunk of state. Note that
the same chunk of state can be mapped to more than one resource (and each
resource can be mapped to several URIs).
In a nutshell, the proposed BIND method would take two URIs as input. The
first one will be the new URI through which the resource will be accessible
once the operation has completed. The other URI will be an existing URI by
which the resource is accessible. When the server performs the operation,
it performs a lookup on the existing URI (uri3 in the example below), finds
the resource to which it has been mapped, and then creates a new mapping of
the new URI (uri4 in the example below) to that resource.
So, a bind operation with uri4 (new URI) and uri3 (existing URI) as inputs
on the resource in the previous diagram, would produce the following:
    uri1       uri2    uri3    uri4
        \        |       |     /
         \       |       |    /
          \      |       |   /
           \     |       |  /
        +-------------------+
        |    resource R     |
        +-------------------+
                  |
                  |
                  |
       +---------------------+
       | chunk of state      |
       | (e.g., a file,      |
       | or multiple files,  |
       | or an EEPROM memory |
       | or database record  |
       | etc.)               |
       +---------------------+
That is, after the operation, a GET on uri4 will produce the same response
as a GET on uri3, uri2, or uri1. No new resource is created. The chunk of
state has not been modified.
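
The lookup-then-map step can be sketched as follows. This is only an illustration: the Resource class and the bind function are hypothetical names, and refusing a new URI that is already bound is an assumption, since the behavior in that case is not stated here.

```python
class Resource:
    """Stand-in for resource R; `state` models the chunk of state."""

    def __init__(self, state):
        self.state = state


def bind(bindings, new_uri, existing_uri):
    """Map new_uri to whatever resource existing_uri is mapped to."""
    if existing_uri not in bindings:
        raise LookupError(f"{existing_uri} is not bound to any resource")
    if new_uri in bindings:
        # Assumption: this sketch refuses to replace an existing binding.
        raise ValueError(f"{new_uri} is already bound")
    # No new resource is created; only a new mapping is added.
    bindings[new_uri] = bindings[existing_uri]


r = Resource(b"chunk of state")
bindings = {"uri1": r, "uri2": r, "uri3": r}
bind(bindings, "uri4", "uri3")

same_response = bindings["uri4"].state == bindings["uri1"].state
resource_count = len({id(res) for res in bindings.values()})
```
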
To preserve namespace consistency, the proposed BIND method would have some
side effects. Creating a binding would have the side effect of adding the
new URI into its parent collection (take the new URI, whack off the last
path segment, then resolve this URI to a collection and add the new URI to
this collection.) If the parent collection doesn't exist, BIND would fail.
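
A small sketch of that side effect, under a simplifying assumption: collections are keyed directly by their URI (with a trailing slash) rather than resolved through the binding table. The function names and example paths are hypothetical.

```python
import posixpath


def parent_uri(uri):
    """Strip the last path segment and return the parent collection URI."""
    parent = posixpath.dirname(uri.rstrip("/"))
    return parent if parent.endswith("/") else parent + "/"


def add_to_parent(collections, new_uri):
    """Add new_uri to its parent collection, failing if the parent is absent."""
    parent = parent_uri(new_uri)
    if parent not in collections:
        raise LookupError(f"parent collection {parent} does not exist")
    collections[parent].add(new_uri)


collections = {"/docs/": set()}
add_to_parent(collections, "/docs/spec.txt")

try:
    add_to_parent(collections, "/missing/spec.txt")
    bind_failed = False
except LookupError:
    bind_failed = True
```
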
BIND can create a binding to a collection resource. The new collection
would behave exactly as would a current DAV collection. This could create
loops, and servers would need to check for these during "Depth: infinity"
operations.
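
One way a server might guard against such loops is to remember which resources it has already visited during the traversal. The sketch below assumes collections are modeled as a mapping from a collection's resource id to its members; the ids "C1", "C2", and "R" are made up. Skipping an already-visited resource is one possible policy; a server could equally choose to fail the request.

```python
def walk(collections, root_id):
    """Return each resource id reachable from root_id, visiting each once."""
    seen = set()
    order = []
    stack = [root_id]
    while stack:
        rid = stack.pop()
        if rid in seen:
            continue  # reached again through another binding: do not descend
        seen.add(rid)
        order.append(rid)
        # Non-collection resources have no members.
        for child in collections.get(rid, {}).values():
            stack.append(child)
    return order


# C2 is a member of C1, and a binding inside C2 points back at C1.
collections = {
    "C1": {"sub": "C2", "file": "R"},
    "C2": {"back": "C1"},
}
visited = walk(collections, "C1")
```
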
In the example above (a new URI, uri4, has been bound to resource R), the
effect of methods on the new URI is exactly the behavior these methods would
have if this binding had been created by another HTTP method, such as PUT
(which creates the resource, and also creates a binding from a URI to the
resource).
GET uri4: return an entity response for R
HEAD uri4: return the protocol metadata for the GET entity response for R
PUT uri4: if overwrite active, overwrites R, affecting the GET entity
  response for uri1, uri2, and uri3
POST uri4: this method can do anything
OPTIONS uri4: return the same as OPTIONS on uri1, uri2, or uri3
DELETE uri4: my read of RFC 2068 is it deletes R, and removes the bindings
  for uri1, uri2, uri3, and uri4 (we might want to introduce an UNBIND
  operation which only removes the binding)
COPY uri4, uriX: duplicate R in new resource T, then create a mapping of
  uriX to T. Mappings of uri1, uri2, uri3, and uri4 are unaffected.
MOVE uri4, uriX: duplicate R in new resource T, then create a mapping of
  uriX to T. Perform fix-up stage, which is currently under-specified in RFC
  2518, but which in this case would mean re-mapping uri1, uri2, and uri3 to
  T. One of the things that would be required by introducing BIND is to
  specify well this "fix-up" stage.
LOCK uri4: Resource R is locked, and the lock is visible via uri1, uri2,
  and uri3.
UNLOCK uri4: remove the lock from R, and the lock is no longer visible via
  uri1, uri2, and uri3.
PROPFIND uri4: retrieve properties on R
PROPPATCH uri4: set or delete properties on R
MKCOL uri4: fails, since there is already a resource bound to uri4. But, if
  no resource were bound to uri4, it would create a collection resource, and
  bind it to uri4.
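
The PUT, LOCK, and UNLOCK entries above all follow from the same point: the state lives on resource R, not on any one URI. A minimal sketch, with a hypothetical Resource class and a made-up lock token:

```python
class Resource:
    """Stand-in for resource R: body and lock live on the resource itself."""

    def __init__(self, body):
        self.body = body
        self.lock_token = None


r = Resource(b"old")
bindings = {uri: r for uri in ("uri1", "uri2", "uri3", "uri4")}


def put(uri, body):
    bindings[uri].body = body


def lock(uri, token):
    bindings[uri].lock_token = token


def unlock(uri):
    bindings[uri].lock_token = None


def is_locked(uri):
    return bindings[uri].lock_token is not None


put("uri4", b"new")
body_via_uri1 = bindings["uri1"].body      # PUT on uri4 is seen via uri1

lock("uri4", "token-1")                    # hypothetical token value
locked_via_uri2 = is_locked("uri2")        # lock is visible via uri2

unlock("uri4")
locked_after_unlock = is_locked("uri3")    # and gone again via uri3
```
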