Abstract

The purpose of this document is to provide guidance to implementors of components of the delivery context on how to communicate their intentions and capabilities in respect of content transformation.

Status of this Document

This document is an editors' copy that has no official standing.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

Publication as a Group Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.


1 Introduction

1.1 Purpose

For the purposes of this document, Content Transformation is the manipulation in various ways, by proxies, of content delivered by an origin server, with a view to making it more suitable for mobile presentation.

The W3C MWI BPWG neither approves nor disapproves of Content Transformation, but recognizes that it is being deployed widely across mobile data access networks. The deployments diverge widely from one another, with many non-standard HTTP implications, and there is no well-understood means either of identifying the presence of such transforming proxies or of controlling their actions.

This document establishes a framework to allow such identification and control.

A more extensive discussion of the requirements for these guidelines is given in "Content Transformation Landscape" [CT-Landscape].

1.2 Scope

Note that we are not chartered to create new technology.

And that we are talking only about browsing.

Rev 1e: And that we are not talking about interactions between the browser and the proxy

1.3 Audience

Transforming Proxy Vendors and Operators, Content Providers.

2 Guidelines

2.1 Summary of Requirements

Rev 1e: Note that this is now beyond the scope we agreed, but there is still stuff here we want to say.

The purpose of this section is to summarize what the actors (transforming proxies, origin servers, and to some extent users) need to communicate to each other. The relevant scenario involving a content transformation proxy is as follows:

that all content transformation should be avoided, or that reformatting is allowed/desired.

The content transformation proxy needs to be able to tell the origin server:

that some degree of content transformation (re-coding and reformatting) can be performed;

that content transformation will be carried out unless instructed not to;

that content is being requested on behalf of something else [@@?? and what that something else is];

that the request headers have been altered (e.g. additional content types inserted).

The origin server needs to be able to tell the content transformation proxy:

that it varies its presentation according to device type and other factors;

that it's permissible or otherwise to perform content transformation of various kinds;

that it has media-specific representations;

that it is unable or unwilling to deal with the request in its present form.

The content transformation proxy needs to be able to tell the client browser:

that it has applied transformations of various kinds to the content;

how to access the untransformed representation of the content.

The content transformation proxy needs to be able to interact with the user:

to allow the user to disable its features;

to alert the user to the fact that it has transformed content and to allow access to an untransformed representation of the content.

2.2 Objectives

Rev 1e: In satisfying these requirements, existing HTTP headers, directives and behaviors must be respected and, as far as is practical, no extensions to HTTP [@@bibref RFC 2616] are to be used.

[@@ other principles behind what we are trying to do - e.g. noting Sean's point that there is a wide diversity of different devices that all fall under the simple appellation of "handheld".]

2.3 Types of Proxy

Alteration of HTTP requests and responses is not prohibited by HTTP other than in the circumstances referred to in [HTTP] section 13.5.2. This document describes how the Client and the Destination Server may require conforming transforming proxies not to alter HTTP requests and responses.

"A [Definition: transparent proxy] is a proxy that does not modify the request or response beyond what is required for proxy authentication and identification. A [Definition: non transparent proxy] is a proxy that modifies the request or response in order to provide some added service to the user agent, such as group annotation services, media type transformation, protocol reduction, or anonymity filtering. Except where either transparent or non-transparent behavior is explicitly stated, the HTTP proxy requirements apply to both types of proxies."

This document elaborates the behavior of non-transparent proxies, when used for content transformation in the context discussed in [Content Transformation Landscape] and henceforward referred to as transforming proxies.

2.4 Types of Transformation

Transforming proxies can carry out a wide variety of operations. An exhaustive survey of those operations, and a discussion of means of server or client side control of them, is beyond the scope of this document. In this document we categorize this rich vocabulary of possible operations into two types:

Alteration of Request Headers;

Alteration of Responses.

Alteration of responses is further sub-categorized into:

Restructuring content including rewriting URIs;

Recoding content;

Optimizing content.

Restructuring content is a process whereby the original layout is altered so that content is added or removed, or whereby the spatial or navigational relationship of parts of the content is altered, e.g. by linearization or pagination. It also includes rewriting of URIs so that subsequent requests route via the proxy.

Recoding content is a process whereby the layout of the content remains the same, but details of its encoding may be altered. Examples include re-encoding HTML as XHTML, correcting invalid markup in HTML, and conversion of images between formats (but not, for example, reducing animations to static images).

2.5 Control of the Behavior of the Proxy

A transforming proxy as described in this document must offer a level of control to users and to origin servers with which it communicates.

2.5.1 Control by the User

Rev 1f: Placeholder for contribution under ACTION-666, moved from 3.2.3, in last draft, and edited down

Question: If the client has sent a no-transform directive, should the proxy interact with user?

Notification that the content has been transformed.

A means of retrieving the untransformed version, preferably from cache

Notification that the content has not been transformed - and that it can be, especially if the page is a "your browser not supported".

2.5.2 Control by Server

Rev 1f: Rewrite this to make better sense

A transforming proxy gains knowledge of whether a server permits alteration of requests and responses in a number of ways. For requests, it does so by having previously received from the origin server [for a resource on the path that this request is in scope of] an indication of what degree of content transformation is permissible. For responses, it does so from indications in the response of the server's intentions; such indications include use of HTTP conventions and site labelling.

Interactions between proxies and origin servers are discussed further in the following section (3 Behavior of Components).

2.5.3 Control by Administrative or Other Arrangements

Rev 1e: New in this rev, need to say more.

The preferences of users and of servers MAY be ascertained by means outside the scope of this document.

3 Behavior of Components

3.1 Proxy Treatment of Request

3.1.1 no-transform directive in Request

If the request contains a Cache-Control: no-transform directive the proxy must forward the request unaltered to the server, other than to comply with transparent HTTP behavior and in particular to add a Via HTTP header.

Irrespective of the presence of the no-transform directive, the proxy must behave transparently (q.v.) if it detects that the user agent is not a browser [@@open question as to how it does that].
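The no-transform rule for requests can be sketched as follows. This is a minimal illustration only, assuming that headers are held in a plain dictionary with canonical names; the function names and the Via entry "1.1 ct-proxy.example" are hypothetical and are not defined by this document.

```python
def has_no_transform(headers):
    """Return True if the Cache-Control header carries a no-transform directive."""
    value = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in value.split(",")]
    return "no-transform" in directives


def forward_headers(headers, via_entry="1.1 ct-proxy.example"):
    """Copy the request headers unchanged, appending this proxy to Via.

    The via_entry default is a made-up value used for illustration.
    """
    forwarded = dict(headers)
    existing = forwarded.get("Via")
    forwarded["Via"] = f"{existing}, {via_entry}" if existing else via_entry
    return forwarded
```

A proxy would call has_no_transform on each incoming request and, when it returns True, send forward_headers(request_headers) upstream with no other alteration.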

3.1.2 Proxy Decision to Transform

If there is no no-transform directive present in the request the proxy should analyze whether it intends to offer transformation services by referring to:

any administrative arrangements that are in place with the user of the client, or the server

any a priori knowledge it has of client capabilities [@@ from a DDR and so on]

any a priori knowledge it has of server preferences, derived either from a repository of such preferences, or from previous interaction with the server

The type and HTTP method of the request.

Proxies should not alter HTTP requests unless not doing so would result in the user's request being rejected by the origin server (this includes a response with HTTP 406 status, as well as a response with HTTP 200 status saying that the request cannot be handled, e.g. "Your browser is not supported"). Rev 1f: and unless the user has specifically requested a reformatted version of the desktop-oriented experience that would not otherwise be delivered if the headers were not altered.

Proxies should not intervene in methods other than GET, POST, HEAD and PUT.

Proxies should not alter request bodies [@@we have an open question here as to when and whether they should or not].

If the proxy knows that the browser has a linearization or zoom capability available and/or supports a broad range of content formats, the proxy should not recode content.

Rev 1f: Reflects uncertain outcome of discussion 2008-02-26

If, as a result of carrying out this analysis, the proxy remains unaware of the server's preferences and capabilities it should:

Issue a HEAD request for the content, and analyse the response as discussed below under [@@server response]

If it is still in doubt, issue a request with the original headers

If it is still in doubt, only then, issue a request with the altered headers
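The escalation described in the three steps above might be sketched as follows. This is an illustration under stated assumptions: send and still_in_doubt are hypothetical callables supplied by the proxy implementation, and the use of the original headers for the HEAD request is an assumption, since the text does not say which headers that request carries.

```python
def probe_server(send, original_headers, altered_headers, still_in_doubt):
    """Try progressively more intrusive requests until the doubt is resolved.

    send(method, headers) performs a request and returns a response object.
    still_in_doubt(response) returns True while the server's preferences
    remain unknown. Both are hypothetical hooks, not part of this document.
    """
    attempts = [
        ("HEAD", original_headers),  # step 1: HEAD request (headers assumed original)
        ("GET", original_headers),   # step 2: request with the original headers
        ("GET", altered_headers),    # step 3: only then, the altered headers
    ]
    for method, headers in attempts:
        response = send(method, headers)
        if not still_in_doubt(response):
            break
    # Returns the last attempt made, whether or not it resolved the doubt.
    return method, headers, response
```

The ordering of the attempts list is the point of the sketch: altered headers are sent only after both less intrusive requests have failed to settle the question.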

3.1.3 Proxy Indication of its Presence to Server

Rev 1f: per call of 2008-02-26

The proxy must (in compliance with RFC 2616) include a Via HTTP header indicating its presence.

The proxy should indicate its presence and capabilities by including a comment in the Via HTTP header of the form [@@ (POWDER (<URI>))] where <URI> is a reference to a Web Description Resource, formatted using [@@bibref POWDER] and the Vocabulary defined in [@@Appendix @@], describing the capabilities and defaults of this transforming proxy.

If a proxy intends to (restructure / reformat / compress) the proxy must indicate this by including a [@@@ Further POWDER indication about this].

3.1.4 Altering Header Values

Rev 1f: This is where we ran out of time at the meeting of 2008-02-26 - this may be open for removal

When altering the User-Agent HTTP header, the proxy [@@@@]

When altering the Accept HTTP header, the proxy should indicate any formats that it intends to recode for delivery by assigning a lower q factor (indicated by the q parameter) than those natively supported.

e.g. Accept: image/jpeg, image/gif, image/png;q=0.7
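A header value of this shape could be assembled as in the following sketch. The function name and the default q value of 0.7 (taken from the example above) are illustrative assumptions; this document does not mandate any particular value.

```python
def build_accept(native_types, recoded_types, recode_q=0.7):
    """Build an Accept value in which recoded formats get a lower q factor.

    native_types: media types the client supports natively (implicit q=1).
    recoded_types: media types the proxy intends to recode for delivery.
    """
    if not 0 < recode_q < 1:
        raise ValueError("recode_q must be lower than the implicit q=1")
    parts = list(native_types)
    parts += [f"{media_type};q={recode_q:g}" for media_type in recoded_types]
    return ", ".join(parts)


# Reproduces the example given in the text.
print(build_accept(["image/jpeg", "image/gif"], ["image/png"]))
```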

If HTTP header fields are altered then the proxy must be prepared to re-issue the request in an unaltered form on receipt of a Vary header in the response indicating that the server offers variants of its presentation according to any of the HTTP header fields that have been modified.
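One way to decide whether a Vary header in the response refers to a header field that the proxy has modified is sketched below. The function names are hypothetical, headers are assumed to be plain dictionaries using the same key spelling in both copies, and treating a Vary value of * as a match is an assumption of this sketch rather than a statement of this document.

```python
def altered_fields(original, altered):
    """Return the lower-cased names of header fields that differ between the two."""
    names = set(original) | set(altered)
    return {n.lower() for n in names if original.get(n) != altered.get(n)}


def must_reissue(vary_value, original, altered):
    """Return True if the response's Vary value names any altered header field."""
    varied = {f.strip().lower() for f in vary_value.split(",") if f.strip()}
    changed = altered_fields(original, altered)
    if not changed:
        return False
    # Assumption: "*" means the server varies on unspecified factors, so any
    # alteration is treated as potentially significant.
    return "*" in varied or bool(varied & changed)
```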

3.2 Server Response to Proxy

Rev 1f: what are we saying the server should do with the POWDER received from the proxy?

If the server varies its presentation according to examination of received HTTP Headers then it must include a Vary HTTP header indicating this to be the case. If, in addition to, or instead of HTTP headers, the server varies its presentation on other factors (source IP Address ...) then it must include a * as one of the fields in the Vary response.
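A server-side sketch of constructing the Vary value follows. The function and parameter names are hypothetical. Note also that the grammar in RFC 2616 defines Vary as either * or a list of field names, so an implementation may prefer to send * on its own; the sketch follows the wording of this section and appends * to the list.

```python
def vary_value(request_headers_examined, other_factors=False):
    """Build a Vary header value from the request headers the server examined.

    other_factors: True if presentation also depends on something other than
    request headers (for example the source IP address).
    """
    # dict.fromkeys removes duplicates while keeping first-seen order.
    fields = list(dict.fromkeys(request_headers_examined))
    if other_factors:
        fields.append("*")
    return ", ".join(fields)
```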

Rev 1f

The server should indicate, using [@@ref POWDER] and the vocabulary defined in Appendix [@@] its capabilities and its defaults in respect of content it serves. [@@ note that you should be able to get all the info for a server by following links from a single resource]

The server must include a no-transform directive if one is received from the client. If it is capable of varying its presentation it should take account of client capabilities [@@as derived from a DDR etc.] and formulate an appropriate experience according to those criteria.

If the server has distinct presentations according to its perception of the presentation media, then the medium for which the presentation is intended should be indicated [@@using the ...] This is going to be something like the link headers, I think

If the server creates a specific user experience for certain presentation media types it should inhibit transformation of the response by including a no-transform directive.

Note that including a no-transform directive may [@@should actually] disrupt the behavior of WAP/WML proxies, because this inhibits such proxies from converting WML to WMLC (because this is a content-encoding behavior).

Servers may base their actions on a priori knowledge of behavior of transforming proxies, when they are identified in a Via header.

The server should not choose a Content-Type for its response based on its assumptions about the heuristic behavior of any intermediaries. (e.g. it should not choose content-type: application/vnd.wap.xhtml+xml solely on the basis that it suspects that transforming proxies will apply heuristics that make them not restructure it).

[@@ Vary headers in 406 response - restrict to the one(s) that have caused the 406.??]

Servers should respond with a 406, not a 200, if they can't handle the request. Servers should provide information about alternative representations by using the Vary header (if the alternatives are available from the same URI) or using link information if alternative representations are handled by different URIs. [This restricts to HTML for now. If link headers are reinstated in HTTP then this becomes a more universal mechanism. Open question as to whether SVG or WICD etc. support any such notion]

[@@300 Response - could this be used as a signal from the server to say that it understands the protocol? A la RFC 2295]

3.3 Proxy Receipt and Forwarding of Response from Server

If the proxy has altered any of the HTTP request headers and it receives a response containing a Vary header from the server, it should re-make the request with the original headers and forward the subsequent response without restructuring it, irrespective of the contents of the subsequent response. The proxy should take note of this and should not vary headers for subsequent requests, unless responses are subsequently received with no Vary header. [@@ + note on back off below]

[@@note that loop detection and elimination is needed here]
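The re-request behavior of this section might look like the following sketch. The send callable and the dictionary shape of the response are assumptions made for illustration. The request is re-made at most once, which is one simple way of avoiding the loop mentioned in the note above; it is not the only possible approach.

```python
def fetch_with_fallback(send, original_headers, altered_headers):
    """Fetch with altered headers, falling back to the originals on Vary.

    send(headers) is a hypothetical hook that performs the request and
    returns a dict with a "headers" entry holding the response headers.
    Returns (response, forward_unrestructured).
    """
    response = send(altered_headers)
    headers_were_altered = altered_headers != original_headers
    if headers_were_altered and "Vary" in response["headers"]:
        # Re-make the request exactly once with the original headers and
        # forward whatever comes back without restructuring it.
        return send(original_headers), True
    return response, False
```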

3.4 Proxy Response to Client

If the response includes a Warning: 214 Transformation Applied the proxy must not apply further transformation.

If the response includes a Cache-Control: no-transform directive then the response must be forwarded to the client unaltered other than in the respects noted for transparent operation of HTTP proxies as specified in RFC2616, and in particular the addition of a Via HTTP header.
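The two rules above can be combined into a single check, sketched below. The function name is hypothetical, headers are assumed to be a plain dictionary with canonical names, and splitting the Warning header on commas is a simplification that ignores commas inside quoted warning text.

```python
def must_forward_unaltered(response_headers):
    """Return True if the response must reach the client without transformation."""
    # A warning value begins with its three-digit code, e.g. 214.
    warnings = response_headers.get("Warning", "").split(",")
    already_transformed = any(w.split()[:1] == ["214"] for w in warnings)

    cache_control = response_headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",")]

    return already_transformed or "no-transform" in directives
```

Even when this returns True, the proxy still adds its Via header, as required for transparent operation.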

Rev 1f: Follows from saying the server should label itself

If the response contains references to one or more Web Description Resources indicating server preferences for the treatment of its content, the proxy should retrieve those WDRs, analyze them as to how they pertain to this resource, and act in accordance with the server's expressed preferences. Proxies should retain such WDRs for future reference in accordance with the policies, if any, described in those WDRs.

In the absence of a Vary header, a no-transform directive or a WDR, the proxy should apply heuristics to the content to determine whether it is appropriate to restructure or recode it. In the presence of any of these, heuristics should not be used.

The server has previously shown that it is contextually aware, even if the present response does not indicate this - modified by a need for the proxy to be aware that the server has changed its behavior and is no longer aware in that way

the content-type is known to be specific to the device or class of device e.g. application/vnd.wap.xhtml+xml

examination of the content reveals that it is of a specific type appropriate to the device or class of device e.g. DOCTYPE XHTML-MP or WBMP or [@@mobile video] [@@ note Sean's extensive list of heuristics that should be included as an informative example?]

The response is an HTML response that includes <link> elements specifying alternatives according to media type [or such links are included as HTTP headers], or the content has a mobileOK label.

If the proxy alters the content then it must add a Warning: 214 Transformation Applied HTTP Header. [@@ should this be elaborated to say what kind of transformation?]
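Adding the warning could be done as in the sketch below. The function name and the agent name "ct-proxy.example" are hypothetical; the warning text "Transformation applied" is the one RFC 2616 associates with code 214.

```python
def mark_transformed(response_headers, agent="ct-proxy.example"):
    """Return a copy of the headers with a 214 warning appended.

    agent identifies the proxy adding the warning; the default is a
    made-up value used for illustration.
    """
    marked = dict(response_headers)
    value = f'214 {agent} "Transformation applied"'
    existing = marked.get("Warning")
    # Preserve any warnings already present rather than overwriting them.
    marked["Warning"] = f"{existing}, {value}" if existing else value
    return marked
```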

If the response contains URIs with the scheme https the proxy must not rewrite them unless [@@er actually this should be a discussion of intercepting links that were https and either a) informing the user of them being now insecure, or alternatively that content transformation is not going to be applied so they may get garbage].

If the proxy has transformed (reformatted) the content but not rewritten https links it should annotate those links to indicate that transformation service is not available on them.

A proxy should strive for the best possible user experience that the client supports. It should only alter the format, layout, dimensions etc. to match the specific capabilities of the client. For example, when resizing images, they should only be reduced so that they are suitable for the specific client, and this should not be done on a generic basis.

Rev 1e: Did we say we would remain silent on this?

If the proxy determines that the resource as currently represented is likely to cause serious mis-operation of the client then the proxy may transform the resource but only sufficiently to alter the specific aspect of the content that is likely to cause mis-operation. Proxies must not exhibit this behavior unless this has been specifically allowed by both the server and the user. [@@ either by persistent registration of preferences, or by use of the [@@correct dangerous content] directive.]

4 Use Case Analysis

Client Proxy Server

Unaware, Unaware, Unaware etc.

[@@TBD]

5 Testing

All ... must be tested for deleterious effects ... [@@TBD]

Providers of transforming proxies should make available interfaces that facilitate testing of Web sites accessed through them. [@@ though how they should make known how to do this and what administrative arrangements would be needed are both probably out of scope]

D To Do (Non-Normative)

Work needed on this draft:

There could be a note that the host should provide interactions that allow the user to have a choice of presentations and so should the proxy and the client, for that matter.

Another as yet unopened Pandora's box is that the discussion and proposed text looks at the issues primarily from the point of view of "varying presentation from Thematically consistent URIs". What hasn't, as yet, been explored is how it all works if there is a common entry point to a site (Thematically consistent URI for a home page) which then dispatches via redirect to media specific versions. This is possibly rather more common than the previous case (e.g. redirect to example.com/mobile - or rather better, imo, example.mobi). Naturally, there will also be varying presentation even within a redirected solution. This whole area needs further thought.

We need to discuss what relationship, if any, this has to the following RFCs:

RFC 2295 is experimental, but actually gets to some of the points we want to make, though doesn't exactly address what we are doing. It's rather a lengthy and detailed read, and has a lot of features that we don't need. It does, however, introduce a couple of headers and field values which have been IANA registered. Also, the main points of the negotiation are implemented in Apache in mod_negotiation (see [APACHE]).

[APACHE] http://httpd.apache.org/docs/2.2/content-negotiation.html

This draft (1f) has introduced the notion of POWDER to describe the proxy and the server. It would seem that two vocabularies are needed, one to describe the CT proxy and one to describe server preferences.