If you'll excuse my use of the RDF IG for "bookmarking" ideas
that aren't really thought thru...
I read the Findings of Fact on Microsoft in the usvsms case[1],
and it reminded me of Philg's tutorial on legal citations[2] and it
seems to me that
-- the promise of the "semantic web" is automating
(parts of) social protocols, and those social protocols
are often grounded in law
-- there are established conventions for legal citations
-- more and more legal proceedings are published via the web
all the time
-- those legal proceedings are often copied in many places,
and there's no recognized canonical URI for them, so
-- caches don't help
-- my browser doesn't tell me I've been there before
-- etc.
So... some ideas...
-- an RDF schema for legal citations
(probably one schema per jurisdiction, with lots of
sharing and subclassing)
-- a corresponding HTML form for each jurisdiction that, in effect,
allows you to compute the address of a document
To take the example from philg's tutorial:
Ford Motor Co. v. Lonon, 2117 Tenn 400, 398 S.W.2d 240 (1966)
Perhaps in RDF, I'd spell that:
<RDF:Description xmlns="" xmlns:RDF="http:...I.forget..">
<plaintiff>Ford Motor Co.</plaintiff>
<defendant>Lonon</defendant>
<volume>2117</volume>
<jurisdiction>Tenn</jurisdiction> <!-- there should be a URI for this;
if not for the jurisdiction, then for
the (web projection of) the reporter -->
<page>400</page>
...
Hmm... the other part:
398 S.W.2d 240 (1966)
seems to have RDF:alternate semantics. And the year is related to the
dublin core notion of "coverage". Hmm...
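To make the field breakdown concrete, here is a rough Python sketch that splits a citation of this one shape into the pieces named above. The class and function names (ReporterRef, Citation, parse_citation) are made up for illustration, and the parser assumes exactly the "A v. B, vol reporter page, vol reporter page (year)" layout; party names containing commas, or anything else in real citation practice, would need more care.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReporterRef:
    """One 'volume reporter page' triple, e.g. 398 S.W.2d 240."""
    volume: str
    reporter: str
    page: str


@dataclass(frozen=True)
class Citation:
    plaintiff: str
    defendant: str
    official: ReporterRef
    alternate: Optional[ReporterRef]  # the parallel (unofficial) cite
    year: Optional[str]


_REF = re.compile(r"^(\d+)\s+(.+?)\s+(\d+)$")
_YEAR = re.compile(r"\s*\((\d{4})\)\s*$")


def parse_ref(text: str) -> ReporterRef:
    match = _REF.match(text.strip())
    if match is None:
        raise ValueError(f"not a 'volume reporter page' reference: {text!r}")
    return ReporterRef(*match.groups())


def parse_citation(text: str) -> Citation:
    # Peel off a trailing "(1966)" if there is one.
    year_match = _YEAR.search(text)
    year = year_match.group(1) if year_match else None
    body = text[:year_match.start()] if year_match else text

    # Assumption: commas only separate the parties from the references.
    parts = [part.strip() for part in body.split(",")]
    parties, refs = parts[0], parts[1:]
    plaintiff, sep, defendant = parties.partition(" v. ")
    if not sep or not refs:
        raise ValueError(f"unrecognized citation layout: {text!r}")

    official = parse_ref(refs[0])
    alternate = parse_ref(refs[1]) if len(refs) > 1 else None
    return Citation(plaintiff, defendant, official, alternate, year)


cite = parse_citation(
    "Ford Motor Co. v. Lonon, 2117 Tenn 400, 398 S.W.2d 240 (1966)"
)
print(cite)
```

Keeping the second reference as a separate "alternate" field mirrors the RDF:alternate idea: two names for the same decision, not two decisions.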
Anyway... to compute the canonical address, you need to know
(1) the address of the reporter of the jurisdiction;
The web site for the state of Tennessee is:
http://www.state.tn.us/
so let's call it:
http://www.state.tn.us/reporter
(ok... so it would probably be in a subdomain for the judicial
branch of government, ala the TN supreme court:
http://tscaoc.tsc.state.tn.us/
but let's gloss over that for now.)
(2) the RDF schema for that jurisdiction; let's say it just
has defendant, plaintiff, volume, and page number.
I'd make an HTML form ala:
<form action="http://www.state.tn.us/reporter">
<input name="defendant" />
<input name="plaintiff" />
<input name="volume"/>
<input name="page"/>
</form>
hm... we may need conventions for canonical representations of page
numbers (which we should be able to get from [3]). More tricky: canonical
spelling of plaintiffs and defendants. Those won't be computable; in the
general case, you'll have to look at the published document to be sure.
Anyway... the resulting address is:
http://www.state.tn.us/reporter?plaintiff=Ford%20Motor%20Co.&defendant=Lonon&volume=2117&page=400
and there would be another address for the unofficial reporter, and
an assertion relating them.
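The citation-to-address computation could be sketched in Python roughly as follows. The reporter address is the made-up one from this message, not a real service. The fixed field order, the integer normalization of page numbers, and the function names are all assumptions of mine; a real convention would have to pin those down (and page numbers are not always plain integers).

```python
from urllib.parse import quote, urlencode

# Hypothetical reporter address used in this message.
REPORTER = "http://www.state.tn.us/reporter"

# Assumption: a fixed field order, so that equal citations always
# produce byte-identical addresses.
FIELDS = ("plaintiff", "defendant", "volume", "page")


def canonical_page(page: str) -> str:
    """Assumed canonical form: a decimal integer, no padding or spaces."""
    return str(int(page.strip()))


def citation_address(reporter: str, **fields: str) -> str:
    missing = [name for name in FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing citation fields: {missing}")
    values = dict(fields)
    values["page"] = canonical_page(values["page"])
    pairs = [(name, str(values[name])) for name in FIELDS]
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return reporter + "?" + urlencode(pairs, quote_via=quote)


address = citation_address(
    REPORTER,
    plaintiff="Ford Motor Co.",
    defendant="Lonon",
    volume="2117",
    page=" 0400",
)
print(address)
```

Note that the party names go through untouched: as said above, their canonical spelling isn't computable, so the function can only encode whatever spelling it is handed.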
Strictly speaking, we don't need the function from citation to address
to be computable locally; we can allow courts to publish an arbitrary
mapping, so that the canonical address of that case is something like:
http://www.state.tn.us/archive/1966/32l4ij5203984u029384029
but my intuition says it's more cost-effective for the citation->
address mapping to be a globally deployed convention rather than
a web-site-private issue.
There are some thorny issues around copyright etc. of the actual page
numbers and such; I gather the Westlaw folks have defended their
ownership of this stuff rigorously. But it's hard for me to believe that
it's not best for all concerned for courts to publish authoritative
copies of their stuff from their own web sites.
Cornell has published a bunch of stuff... for example
U.C.C. - ARTICLE 3 - § 3-104.
http://www.law.cornell.edu/ucc/3/3-104.html
Related issues:
-- authenticity, non-repudiation
one mechanism is digital signatures, but another
mechanism is massively redundant publishing, ala newspapers,
which is effectively non-repudiable
(of course, it takes revenue away from Westlaw)
(I have some notes on authenticity at
http://www.w3.org/Architecture/qos that may be relevant)
hmm... this looks interesting:
The Authority Public Key Distribution Protocol
http://www.oasis-open.org/cover/publicKeyXML.html
-- format for the content itself
(the TN court uses some friggin Java applet to publish
their content! I wonder if that's the easiest way they
found to extract data from their legacy database,
or if it's a copy restriction mechanism)
e.g.
TEI Extensions for Legal Text
http://www.oasis-open.org/cover/finkeTEI10.html
Legal XML Working Group
http://www.oasis-open.org/cover/xml.html#legalXMLWG
Legal XML
http://www.legalxml.org/
-- stable publishing
guarantees of availability and persistence,
perhaps with time-limits (ala DNS and ala phone
company area code changes; they don't guarantee
an address will work forever, but they tell you
how much notice you'll get before a change, or
how long you can cache a binding).
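The "how long you can cache a binding" idea can be sketched as a small time-limited cache, in the spirit of DNS TTLs. This is only an illustration: BindingCache and its methods are names I made up, and how a publisher would actually announce the time limit is left open.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Binding:
    address: str
    expires_at: float  # clock value after which the binding is stale


class BindingCache:
    """Citation -> address bindings that expire after a published limit."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock  # injectable so the expiry logic is testable
        self._bindings: Dict[str, Binding] = {}

    def store(self, citation: str, address: str, ttl_seconds: float) -> None:
        expires_at = self._clock() + ttl_seconds
        self._bindings[citation] = Binding(address, expires_at)

    def lookup(self, citation: str) -> Optional[str]:
        binding = self._bindings.get(citation)
        if binding is None:
            return None
        if self._clock() >= binding.expires_at:
            # The guarantee has run out; the caller must ask again.
            del self._bindings[citation]
            return None
        return binding.address


# Usage with a fake clock, so the example does not depend on real time.
now = [0.0]
cache = BindingCache(clock=lambda: now[0])
cache.store(
    "2117 Tenn 400",
    "http://www.state.tn.us/archive/1966/32l4ij5203984u029384029",
    ttl_seconds=60,
)
now[0] = 59.0
fresh = cache.lookup("2117 Tenn 400")
now[0] = 60.0
stale = cache.lookup("2117 Tenn 400")
print(fresh, stale)
```

The point of the design is that an expired binding is not an error: it just means the publisher's promise has lapsed and the mapping has to be fetched again.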
[1] United States of America v. Microsoft Corporation,
C.A. 98-1232
http://usvms.gpo.gov/
[2] Reading Legal Citations
by Philip Greenspun
http://photo.net/philg/litigation/reading-cites.html
[3] XML Schema Part 2: Datatypes
http://www.w3.org/TR/xmlschema-2/
[Hmm... these citation issues are closely related to URI design and
philosophy; I considered crossposting to uri@w3.org, but decided
against it, for now.]
--
Dan Connolly, W3C
http://www.w3.org/People/Connolly/