> I am interested in the interaction between this and web robots;
> specifically, how to tell the robot that the document is available
> in more than one language.
By using the Alternates header field, which does not yet have a
well-defined syntax.
> Using the Apache type-map model, if no Accept-language
> field is sent, one gets the first URL (assuming qs is not used or are equal).
> Would it be valid to do something like this:
>
> URI: start; vary="language"
>
> URI: unknown.html
> Content-type: text/html
> Content-language: en,fr,de
>
> URI: german.html
> Content-type: text/html
> Content-language: de
>
> URI: english.html
> Content-type: text/html
> Content-language: en
> etc.
>
> so that a robot with language-accept unset would retrieve the
> "Content-language: en,fr,de" header, and then retry with
> Accept-language set to en, fr, de in turn?
No, that would not be valid.
> As this multiple-language header seems to be valid in the current draft
> spec., I presume it should not break (future) browsers.
The semantics are different -- "Content-language: en,fr,de" means that
the content of that *response* is intended for people who want to read
English, French, or German (e.g., a document containing three descriptions
of the same thing in those three languages). It does not indicate what
other languages are available at that resource.
What you want is Alternates. What is currently available is

    Vary: Accept-Language

which does give the browser a clue that other languages may be
available, but does not say which ones.
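
As a rough sketch of how a robot might read these two fields, assuming
the response headers are already in hand as a simple name-to-value
mapping: the function names and the input shape below are made up for
illustration and are not part of any spec or library.

```python
def parse_list(value):
    """Split a comma-separated header value into lowercase tokens."""
    return [item.strip().lower() for item in value.split(",") if item.strip()]


def language_hints(headers):
    """Summarize what a response's headers say about language.

    `headers` is a plain dict of header name -> value (a hypothetical
    input shape chosen for this sketch).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    vary = parse_list(lowered.get("vary", ""))
    return {
        # Languages this particular response is intended for.
        # This is NOT a list of other variants available at the resource.
        "audience": parse_list(lowered.get("content-language", "")),
        # True only means other language variants MAY exist; nothing here
        # says which ones. "*" is treated as "may vary on anything".
        "may_vary_by_language": "*" in vary or "accept-language" in vary,
    }


hints = language_hints({
    "Content-Type": "text/html",
    "Content-Language": "en,fr,de",
    "Vary": "Accept-Language",
})
print(hints)
```

A robot that sees may_vary_by_language set could then choose to retry
with its own Accept-Language values, but it should not take the
"audience" list as the set of values to try.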
...Roy T. Fielding
Department of Information & Computer Science (fielding@ics.uci.edu)
University of California, Irvine, CA 92697-3425 fax:+1(714)824-4056
http://www.ics.uci.edu/~fielding/