Abstract

XHTML-Print is a member of the family of XHTML languages defined by the Modularization of XHTML [XHTMLMOD]. It is designed to be appropriate for printing from mobile devices to low-cost printers that might not have a full-page buffer and that generally print from
top-to-bottom and left-to-right with the paper in a portrait orientation. XHTML-Print is also targeted at printing in environments where it is not feasible or desirable to install a printer-specific
driver and where some variability in the formatting of the output is acceptable.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of
this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

All sections of this document are normative unless noted as informative.

This document is a Proposed Edited Recommendation of XHTML Print. If approved, it will supersede the previous version. The only
substantive change in this version is the addition of an implementation of the markup language using XML Schema.

Publication as a Proposed Edited Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time.
It is inappropriate to cite this document as other than work in progress.

W3C Advisory Committee Members are invited to send formal review comments on this Proposed Edited Recommendation to the W3C Team until 4 June 2009. Members of the W3C Advisory Committee will find
the appropriate review form for this document by consulting their list of current WBS questionnaires.

1. Introduction

All sections of this document are normative unless noted as informative.

1.1. XHTML for Printing

This section is informative.

This document specifies a simple XHTML-based data stream suitable for printing as well as display. It is based on XHTML Basic [XHTMLBASIC]. Its targeted
usage is for printing in environments where it is not feasible or desirable to install a printer-specific driver and where some variability in the formatting of the output is acceptable. Throughout
this document this data stream is called "XHTML-Print."

XHTML-Print is designed to be appropriate for low-cost printers that might not have a full-page buffer and that generally print from top-to-bottom and left-to-right with the paper in a portrait
orientation. For other printers (i.e., those that print in another direction or orientation) a full-page buffer could be needed.

XHTML-Print is not appropriate when strict layout consistency and repeatability across printers are needed. The design objective of XHTML-Print is to provide a relatively simple, broadly
supportable page description format where content preservation and reproduction are the goal, i.e. "Content is King." Traditional printer page description formats such as PostScript or PCL are more
suitable when strict layout control is needed. XHTML-Print does not utilize bi-directional communications with the printer either for capabilities or status inquiries.

This document creates a set of conformance criteria for XHTML-Print. It provides a strong basis for consistent printing results without a detailed understanding of each individual printer's
characteristics.

The document type definition for XHTML-Print is implemented based on the
XHTML modules defined in Modularization of XHTML [XHTMLMOD].

1.2. Terminology

The keywords "MUST", "SHALL", "MUST NOT", "SHALL NOT", "REQUIRED", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" when used in this document
are to be interpreted as described in RFC 2119 [RFC2119]. However, for readability, these words might not appear in all uppercase letters in this
specification.

1.3. Design Rationale

This section explains why certain HTML features are not part of XHTML-Print
and any special circumstances concerning a module and printing.

1.3.1. Script and Events

Scripts, as programs that are executed in conjunction with a document, are not relevant to the printed page. However, documents can provide information as an alternative to a script. Therefore,
the script module is part of XHTML-Print. Scripts MUST NOT be executed and their results MUST NOT be printed. If a noscript element is present, it contains alternate content that MUST be printed in
place of the content of the script element.

Events are not applicable to static, printed versions of a document. Therefore, the Intrinsic Events module is not part of XHTML-Print.

1.3.2. Presentation

Many simple printers cannot print a wider variety of fonts than generic serif, sans serif and monospace. It is RECOMMENDED that
style sheets be used to create a presentation that is appropriate for a particular category of printer. How printers are categorized, what those categories are, how a printer identifies itself as a
member of a category, and how style sheets are selectively applied based on category are outside the scope of this document.

1.3.3. Forms

Basic XHTML forms ([XHTMLMOD], section 5.5.1) are supported. Content developers SHOULD keep in mind that users might not be able to input
many characters from some devices (e.g. from a mobile phone). Furthermore, developers are cautioned that a printer prints a static version
of a form, and the visual appearance of a form depends heavily on the implementation.

1.3.4. Tables

Basic XHTML tables ([XHTMLMOD], section 5.6.1) are supported, but tables can be difficult to format on very low-resource devices. Furthermore, content developers are cautioned that in the Basic Tables
Module, nesting of tables is prohibited.

1.3.5. Frames

Frames are not supported. Frames depend on a screen interface and therefore are not applicable to printers.

1.3.6. Attributes

XHTML-Print is a member of the family of XHTML languages defined by Modularization of XHTML [XHTMLMOD]. Therefore, the elements and attributes in the modules that make up XHTML-Print are all valid
constructs of the language. However, not all the attributes are applicable to a rendering of an XHTML-Print document in printed media, especially those that are integral to a dynamic display of the
document in a browser and the submission of a form. Furthermore, special attention is given to simple printers and some attributes are deemed too complex for such a printer to render. These
attributes are treated as discretionary in that a conforming printer is not REQUIRED to support them, but if a printer wishes to provide
that support, there are requirements stated for consistency in the implementation of extensions.

1.3.7. Character Model

The W3C architectural specification Character Model for the World Wide Web 1.0 [CHARMOD] gives the RECOMMENDED representation of characters in XHTML-Print. Authors of applications that produce XHTML-Print should be aware that low-cost printers might be limited in both processing power
and memory, and that normalized UTF-8 encoded documents could therefore print more quickly than documents in other forms and encodings.

2. Conformance

2.1. Document Conformance

A conforming XHTML-Print document is a document that requires only the facilities described as mandatory in this specification. Such a document SHALL meet all of the following criteria:

The document SHALL conform to the constraints expressed in the DTD found in Appendix B and the XML Schema found in
Appendix B and conform to the constraints expressed in Design Rationale.

The root element of the document MUST be html.

The name of the default namespace on the root element SHALL be the XHTML namespace name, http://www.w3.org/1999/xhtml.

The start tag MAY also contain the declaration of the XML Schema Instance Namespace and an XML Schema Instance schemaLocation
attribute [XMLSCHEMA]. Such an attribute would associate the XHTML namespace http://www.w3.org/1999/xhtml with the XML Schema at the URI
http://www.w3.org/MarkUp/SCHEMA/xhtml-print10.xsd.

There SHALL be a DOCTYPE declaration in the document prior to the root element. If present, the public identifier included in the
DOCTYPE declaration SHALL reference the DTD found in Appendix B of this specification, using its Formal Public
Identifier. The system identifier MAY be modified appropriately.

The DTD subset MUST NOT be used to override any parameter entities in the DTD.

The MIME type used to refer to a conforming XHTML-Print document SHALL be "application/xhtml+xml" with an OPTIONAL "profile" parameter of 'http://www.w3.org/Markup/Profile/Print'. An OPTIONAL "charset" parameter
MAY be provided with the MIME type. Invalid values MUST be ignored, and the result MUST be
as if the value were "utf-8". Usage of the OPTIONAL "charset" parameter is as described in Section 3.2 of RFC3023 - XML Media
Types [RFC3023]. Usage of the OPTIONAL "profile" parameter is as described in Section 8 of
RFC3236 - The 'application/xhtml+xml' Media Type [RFC3236].
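
As a rough illustration of the media type rules above, the following Python sketch checks whether a media type value carries the XHTML-Print profile and resolves the charset parameter. The function names are hypothetical, the parameter parser is deliberately naive (it does not handle semicolons inside quoted strings), and treating "invalid" as "unknown to Python's codec registry" is an assumption of this sketch rather than something this specification defines.

```python
import codecs

XHTML_MIME = "application/xhtml+xml"
PRINT_PROFILE = "http://www.w3.org/Markup/Profile/Print"


def parse_media_type(value):
    """Split 'type/subtype; name=value; ...' into (type, params)."""
    head, *rest = value.split(";")
    params = {}
    for item in rest:
        name, sep, val = item.partition("=")
        if sep:
            # Parameter names are case-insensitive; values may be quoted.
            params[name.strip().lower()] = val.strip().strip('"')
    return head.strip().lower(), params


def has_print_profile(value):
    """True if the value is application/xhtml+xml with the Print profile."""
    media_type, params = parse_media_type(value)
    return media_type == XHTML_MIME and params.get("profile") == PRINT_PROFILE


def effective_charset(value):
    """Return the charset to use, or None when no charset parameter is given."""
    _, params = parse_media_type(value)
    charset = params.get("charset")
    if charset is None:
        return None  # absence is governed by RFC 3023, not by this sketch
    try:
        codecs.lookup(charset)  # assumption: "invalid" means "unknown codec"
    except LookupError:
        return "utf-8"  # invalid value: behave as if "utf-8" had been given
    return charset.lower()
```

A caller might pass the value of an HTTP Content-Type header to has_print_profile and effective_charset before handing the document to the XML parser.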

If a printer encounters an image in a format it does not support, it SHALL render any alternate content provided, and SHOULD reserve the space specified by the height and width attributes and
MAY draw a box around this space of the size specified for the image.

If the image format is not supported and no alternate content is provided, the image is omitted and space SHOULD NOT be
reserved.

If the image format is supported and the height and width attributes are not provided, the printer MUST attempt
to print the image at its intrinsic size. If the image data contain no size information, this specification does not define the size at which the image will be rendered.
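
The image rules above can be read as a small decision procedure. The following Python sketch is one possible reading, not a normative algorithm; the function name plan_image, its parameters, and the action labels are hypothetical.

```python
def plan_image(format_supported, alt_text=None, width=None, height=None):
    """Return (action, box) for one image, following the rules above."""
    has_size = width is not None and height is not None
    if format_supported:
        if has_size:
            return ("print", (width, height))
        # No explicit size: attempt the intrinsic size from the image data.
        return ("print", "intrinsic")
    if alt_text:
        # Alternate content SHALL be rendered. Reserving the box is a SHOULD,
        # and is only possible when the document supplied a size.
        return ("alternate", (width, height) if has_size else None)
    # Unsupported format and no alternate content: omit, reserve no space.
    return ("omit", None)
```

For example, plan_image(False, alt_text="Company logo", width=120, height=60) selects the alternate content and keeps the 120 by 60 box available for the layout.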

A printer MUST support images referenced by a URI [RFC3986] utilizing an http
[RFC2616] scheme. Support for other schemes is OPTIONAL.

Printers that do not support the xml:lang attribute are not REQUIRED to adhere to the rules for language specific white
space handling.

2.3.2 XHTML Requirements

A conforming printer SHALL print a static version of a form using default and selected values as specified in the form.

A conforming printer SHALL identify this datastream by the exact string: "XHTML-Print" (without the quotation marks) in all service
discovery records and protocols, device identification records and protocols, and in other cases where a list of supported datastreams is to be presented by the printer. Where such datastreams are
identified by a MIME media type, the identifier "application/xhtml+xml" SHALL be used in combination with a "profile" parameter of
"http://www.w3.org/Markup/Profile/Print"; e.g., application/xhtml+xml; profile="http://www.w3.org/Markup/Profile/Print".

3.1 Attributes and Attribute Collections

Some of the attributes defined in the Modularization of XHTML [XHTMLMOD] are not applicable to
the printed page or are not relevant due to the exclusion of their module from XHTML-Print. Other attributes are not REQUIRED but if
supported by a printer, support SHOULD be provided in the RECOMMENDED
manner.

Each attribute in the following sections is annotated with one of the following keywords indicating support options for a conforming printer:

Key: Description

MUST: Support is mandatory; a conforming printer MUST implement this attribute. (However, the inability of a printer to implement part of this
specification due to the limitations of a particular device does not imply non-conformance. E.g., the fact that a monochrome printer user agent cannot render colors does not preclude its
conformance to this specification.)

SHOULD: Support for the attribute is RECOMMENDED, but not REQUIRED.

MAY: The attribute's functionality is entirely OPTIONAL.

N/A: The attribute does Not Apply to the printed page; a conforming printer MAY ignore this attribute for one of the following reasons, but MUST NOT treat it as an error:

The attribute applies to a user interface that is not represented on a printed page. For example, the accesskey attribute is irrelevant.

The attribute applies to form submission, which is not performed by the printer (for example, the method attribute of the form element).

The attribute, such as title, describes data that is not represented on a printed page.

The attribute applies to objects other than JPEG images, such as Java applets.

The attribute does not apply since links specified by the anchor element are not followed.

The Modularization of XHTML ([XHTMLMOD], section 5.1) contains a set of attribute collections for
ease of presentation. This specification continues this practice under the same conditions: the collections below are informative, while their contents are normative.

Note that the title attribute of the Core collection is not applicable to the printed page since there is no place to display such supplementary information.

A printer MAY support special processing based on the natural language of the document, such as the use of guillemets for quotation marks
in French text. If a printer implements processing based on the natural language of the document, that processing SHALL be controlled by the
xml:lang attribute.

A printer SHOULD support CSS style sheets, as noted in section 1.3.2 Presentation, within the limits of its
capabilities.

If a printer implements a feature to truncate the contents of a cell because of space constraints, it MUST support the abbr
attribute and print the value of the abbr attribute (if present) instead of the cell's content.
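
A minimal sketch of the abbr rule, assuming (for illustration only) that available space is modeled as a character count and that a printer without an abbr value simply cuts the text; the function name cell_text and the max_chars parameter are hypothetical.

```python
def cell_text(content, abbr=None, max_chars=20):
    """Choose the text to print for a table cell with limited space."""
    if len(content) <= max_chars:
        return content  # fits: no truncation needed
    if abbr is not None:
        return abbr  # truncation would occur: print the abbr value instead
    return content[:max_chars]  # hypothetical fallback: plain truncation
```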

A printer MUST support the values left, right, and center for the align attribute of
the td, th, and tr elements; other values are OPTIONAL. If the align attribute is
missing or has an unsupported value, a printer MUST act as if the align attribute has the value left for the
td and tr elements, and as if the align attribute has the value center for the th element.

A printer MUST support the values top, middle, and bottom for the valign attribute of
the td, th, and tr elements; other values are OPTIONAL. If the valign attribute is
missing or has an unrecognized value, a printer SHOULD act as if the valign attribute has the value middle. Vertical
alignment is undefined across page boundaries.
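
The default handling for align and valign described above might be sketched as follows in Python. The sketch assumes a printer that supports only the mandatory values; the function and constant names are hypothetical.

```python
ALIGN_SUPPORTED = {"left", "right", "center"}
VALIGN_SUPPORTED = {"top", "middle", "bottom"}


def effective_align(element, value=None):
    """Horizontal alignment for a td, th, or tr element."""
    if value in ALIGN_SUPPORTED:
        return value
    # Missing or unsupported value: th defaults to center, td and tr to left.
    return "center" if element == "th" else "left"


def effective_valign(value=None):
    """Vertical alignment; missing or unrecognized values act as middle."""
    return value if value in VALIGN_SUPPORTED else "middle"
```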

Printers MUST support the http [RFC2616] URI scheme [RFC3986]. Support for other
schemes is OPTIONAL.

A printer MUST support resources of type "image/jpeg". A printer MAY support
other types of image formats and therefore other values of the type attribute. A printer MUST process the content of the
object element when it does not recognize or support the object type referenced by the value of the type attribute.

Conforming documents SHOULD specify the width and height of the image using the width and height attributes or
equivalent styling instructions. (See 2.3.1 Formatting/Rendering Rules).

The param element's purpose is to pass data to an application specified in the enclosing object element. The param element MAY be completely ignored.

A printer MAY implement support for this element and provide implementation specific processing of the meta-information. However,
guidelines and/or recommendations for processing a document's meta-information are beyond the scope of this document.

3.12 Scripting Module

Scripts, as programs that are executed in conjunction with a document, are not relevant to the printed page and MUST NOT be executed
or printed. The noscript element contains alternate content that MUST be printed in place of the content of the
script element when present.
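
One way to picture the script and noscript rule is a text extraction pass that skips script elements and treats noscript as an ordinary container. The Python sketch below makes that assumption and, for brevity, uses element names without the XHTML namespace; the function name printable_text is hypothetical.

```python
import xml.etree.ElementTree as ET


def printable_text(elem):
    """Collect the text to print, skipping script content entirely."""
    if elem.tag == "script":
        return ""  # scripts are neither executed nor printed
    parts = [elem.text or ""]
    for child in elem:
        parts.append(printable_text(child))
        parts.append(child.tail or "")  # text that follows the child element
    return "".join(parts)


body = ET.fromstring(
    "<body><p>Total: </p><script>compute()</script>"
    "<noscript><p>42 items</p></noscript></body>"
)
text = printable_text(body)
```

Here the content of the noscript element is kept and the script body is dropped, so text holds the alternate content in the position where the script appeared.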

A printer MUST read and process the content of style elements where the media attribute has the value
print or all. A printer MAY read and process the content of style elements where the media
attribute has the value projection. A printer SHOULD ignore the content of style elements where the
media attribute has any other value. The absence of the media attribute MUST be treated as if the media
attribute had the value all.

A printer SHOULD read and process the content of style elements where the value of the type attribute is
"text/css"; a printer MAY read and process the content of style elements where the value of the type attribute is
other than "text/css"; all unsupported values for type MUST cause the content to be ignored. Style elements without a
type attribute will be treated in an implementation-dependent manner.
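
The media and type rules for the style element can be combined into a single decision. The following Python sketch is one possible reading; the function name, the returned labels, and the supported_types parameter are hypothetical, and the sketch only considers a media attribute that holds a single value.

```python
def style_element_action(media=None, type_=None, supported_types=("text/css",)):
    """Decide how a printer treats the content of one style element."""
    if type_ is None:
        return "implementation-dependent"  # no type attribute given
    if type_ not in supported_types:
        return "ignore"  # unsupported type: content is ignored
    effective_media = "all" if media is None else media  # absent means all
    if effective_media in ("print", "all"):
        return "process"
    if effective_media == "projection":
        return "optional"  # a printer MAY process projection styles
    return "ignore"  # any other media value
```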

3.14 Style Sheet Attribute Module

This module adds the style attribute to the Common attribute collection (section 3.1).

Printers MUST support the http [RFC2616] URI scheme [RFC3986]. Support for other
schemes is OPTIONAL.

If the printer implements processing based on the natural language of the document, then the hreflang attribute MUST be
supported.

A printer SHOULD read and process the content of external style sheets where the media attribute has the value
print or all. A printer MAY read and process the content of external style sheets where the media attribute
has the value projection. A printer SHOULD ignore the content of external style sheets where the media attribute
has any other value. The absence of the media attribute MUST be treated as if the media attribute had the value
all.

A printer SHOULD support the value stylesheet for the rel attribute along with the value "text/css" for the
type attribute; all other values are OPTIONAL.

3.16 Base Module

Printers MUST support the http [RFC2616] URI scheme [RFC3986]. Support for other
schemes is OPTIONAL.

3.17 Character Entities

XHTML-Print is in the family of XHTML document types, since it is created by combining XHTML modules. The character entities that are part of XHTML-Print are, therefore, defined in XHTML Character Entities ([XHTMLMOD], Section F.1).

4. How to Use XHTML-Print

XHTML-Print inherits all the structure, encoding and other basic infrastructure specified by XHTML 1.0 [XHTML1]. The following sections describe and clarify
the application and usage restrictions of XHTML-Print.

4.1 Images

This document specifies only one mandatory image format: baseline JPEG as defined in JPEG File Interchange Format [JPEG]. See Appendix A
for a description of JPEG decoder requirements. Printers are not REQUIRED to support:

4.1.1 Recommended Attributes on the img and object Elements

Because many printers create the page in a serial manner from top to bottom, it is important for the printer to know the size of images before retrieving the image data itself. This information is
then used to create portions of the page layout.

Therefore, the document SHOULD include the height and width attributes within the img or the
object element (or equivalent styling instructions). These attributes MAY be expressed as pixels or percentages within the
img or the object element. Percentages are relative to the parent element and not the page width or printable area.
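
As a small worked example of the percentage rule, the Python sketch below resolves a width or height attribute value against the size of the parent element. The function name resolve_length is hypothetical, and the sketch assumes the parent size is already known in pixels.

```python
def resolve_length(value, parent_size):
    """Resolve a width or height attribute value to pixels."""
    text = value.strip()
    if text.endswith("%"):
        # Percentages are relative to the parent element, not the page.
        return float(text[:-1]) * parent_size / 100.0
    return float(text)
```

With a parent element 600 pixels wide, a width of "50%" resolves to 300 pixels regardless of the page width or printable area.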

4.1.2 Image Data

[Informative]

In traditional Web-based applications of XHTML, image data is contained in a separate file on a Web server that the user agent retrieves.

However, there are circumstances where it is desirable to include the image data along with the rest of the print data. For example, some low-cost, resource-constrained clients might want to
include images in their print output but cannot afford to include an HTTP server. Furthermore, circumstances could require that all the print data be encapsulated in a single file for
transportability, avoiding firewall issues, etc. Therefore, conforming XHTML-Print printers MAY support a format that contains both
a document and its referenced image data as well as the REQUIRED traditional format that contains only the document.

The format recommended for including image data along with XHTML-Print markup is defined by RFC3391 - The MIME Application/Vnd.pwg-multiplexed Content-type [MIMEMPX].

Including image data as defined in RFC2397 - The data URL scheme [RFC2397] may be appropriate for printers capable of buffering large amounts of data, but
will not achieve the intended results for most cost- and memory-constrained printer UAs. Because this method normally encodes the binary image data using base64, which emits four characters for every three bytes, the
transmitted data grows by roughly one third. This should be avoided over low-speed connections. Printers supporting included data can support base64 encoding using the img or
object element.
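
To make the cost of base64 concrete, the following Python sketch compares the raw and encoded sizes of an image payload. The 300 000 byte buffer is a stand-in for real JPEG data, and the helper name base64_length is hypothetical.

```python
import base64


def base64_length(n_bytes):
    """Length in characters of the padded base64 encoding of n_bytes bytes."""
    return 4 * ((n_bytes + 2) // 3)


raw = bytes(300_000)  # stand-in for 300 000 bytes of JPEG data
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1  # fraction of extra data transmitted
```

Every three input bytes become four output characters, so overhead comes out at about one third before any line breaks or URL escaping are added.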

Mechanisms for determining whether or not a printer supports either of the above OPTIONAL document formats are outside the scope of
this specification.

4.1.3 Side-by-Side Images

Low-cost printers today often have very little memory in which page data can be stored before being printed. As such, they may build and print the page in swaths on the fly from the top of the
page to the bottom. To enable the use of XHTML-Print in these low-cost printers, some restrictions on the order of images contained in the XHTML-Print data stream are needed.

If two or more images will be even partially side-by-side on the printed page (i.e., a line across the short axis of the page will intersect more than one image), they SHOULD be included by reference; for example, <img src="http://example.com/example.jpg" />. This allows the printer to retrieve chunks of
each image as needed while it prints down the page. Interleaved or included image data, as discussed in Section 4.1.2, is discouraged.

An XHTML-Print conforming printer lacking sufficient buffer space to hold multiple side-by-side images MAY choose to reformat the layout
of the page to preserve content. Printers SHALL attempt to preserve content when encountering side-by-side images that might be impossible to print as specified within the XHTML-Print document. Discarding the second and subsequent side-by-side images SHOULD be avoided unless preservation of content is best achieved by doing so. Other than attempting to best preserve content, this specification does not mandate any
specific behavior when encountering this situation. Clients providing images SHOULD order them from left-to-right, top-to-bottom unless the
print direction is known to be otherwise.
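
The side-by-side condition (a line across the short axis of the page intersecting more than one image) amounts to an overlap test on the vertical extents of the images. The Python sketch below assumes a portrait page whose layout is already known, with each image described by a hypothetical (name, top, height) tuple in page units.

```python
def side_by_side_pairs(images):
    """Return the pairs of images whose vertical extents overlap."""
    pairs = []
    for i, (name_a, top_a, height_a) in enumerate(images):
        for name_b, top_b, height_b in images[i + 1:]:
            # Two extents overlap when each starts before the other ends.
            if top_a < top_b + height_b and top_b < top_a + height_a:
                pairs.append((name_a, name_b))
    return pairs


layout = [("a.jpg", 0, 100), ("b.jpg", 50, 100), ("c.jpg", 200, 50)]
overlapping = side_by_side_pairs(layout)
```

In this layout only a.jpg and b.jpg share a horizontal swath, so a client following the recommendation above would include those two by reference.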

4.2 Style Sheets

Conforming XHTML-Print printers SHALL support both in-line and referenced style sheets within the style element or
link element in the head element of a document. Conforming XHTML-Print printers SHALL also support the
style attribute (i.e. in-line style) when used within other elements as defined by XHTML 1.1 [XHTML1.1]. Normal cascading rules apply.

4.3 Forms Usage

This section is informative.

An HTML form is a dynamic entity when the document is displayed in a browser: data can be entered into text fields, buttons can be pushed, selections made, and options checked. None of this
dynamic activity can be rendered on a printed page. However, a printed page can permanently record a particular state of the form. For example, users might wish to print forms that record products
ordered or payments made.

The following discussion illustrates the activity involved when interacting with and printing forms. Please refer to Sequence Diagram 1.

Sequence Diagram 1. Forms Usage

Steps:

1. The User enters a URL into the Browser

2. The Browser fetches the form from the Server and displays it

3. The User enters data into the form

4. The User asks the Browser to print the form

5. The Browser composes a page with the form and the user data

6. The Browser sends the newly composed form to the printer

7. The User selects the Submit button on the form

8. The Browser sends the user data to the Server

Detailed discussion of Steps:

The user interacts with a browser on a mobile device to access a form presented by a server on the network (steps 1 and 2 of Sequence Diagram 1). The following fragment
of an XHTML-Print document shows what the server sends to the browser to present to the user. Note that the form is blank when first presented to the user.

Here is an example presentation of the above form as the user would see it:

First name: [text field]  Last name: [text field]  email: [text field]
[ ] IEEE
[ ] ACM

The user enters data (step 3 of Sequence Diagram 1) into the text fields and checks the IEEE check box so that the form now looks like the following:

First name: [text field, filled in]  Last name: [text field, filled in]  email: [text field, filled in]
[x] IEEE
[ ] ACM

The user then clicks on the browser's print button (step 4 of Sequence Diagram 1), to print the form as it currently appears.

The browser then creates a (possibly new) document (step 5 of Sequence Diagram 1) containing the original form and the user's data. Note that in the XHTML-Print document below,
created by the browser, the user's data is included by either a value attribute or a checked attribute.
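
The composition step itself might be sketched as follows in Python. This is only an illustration of how a browser could record the current state of a form; the function name freeze_form, the field names, and the sample value are hypothetical, and the fragment omits the XHTML namespace for brevity.

```python
import xml.etree.ElementTree as ET


def freeze_form(form_xml, text_values, checked_names):
    """Write the user's current entries into value and checked attributes."""
    form = ET.fromstring(form_xml)
    for field in form.iter("input"):
        name = field.get("name")
        kind = field.get("type", "text")
        if kind == "text" and name in text_values:
            field.set("value", text_values[name])
        elif kind == "checkbox":
            if name in checked_names:
                field.set("checked", "checked")
            else:
                field.attrib.pop("checked", None)  # reflect an unchecked box
    return ET.tostring(form, encoding="unicode")


blank = (
    '<form><input type="text" name="first"/>'
    '<input type="checkbox" name="ieee"/>'
    '<input type="checkbox" name="acm"/></form>'
)
frozen = freeze_form(blank, {"first": "Ada"}, {"ieee"})
```

The returned markup is static: a printer only needs to render the value and checked attributes it finds there.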

This specification is based in large part on the specification of the same name, XHTML™-Print [XHTMLPRINT], from the Printer Working Group, a program of the IEEE Industry Standards and Technology Organization, Inc., which
was in turn based in large part upon an earlier work with the same name by Fujisawa, Grant, Wright, and Zehler. The editors wish to express their gratitude to
all who contributed to this and earlier versions.

A. JPEG Decoder Requirements

A.1 Introduction

A.1.1 Intent

This appendix describes REQUIRED behaviors for JPEG decoders in XHTML-Print devices. Many of the behaviors described in this document
follow directly from language already present in the relevant JPEG standards, but are repeated here to emphasize their importance.

A.1.2 Objectives

The decoder behaviors described in this document are intended to minimize implementation complexity, while retaining maximum compatibility with existing JPEG files. In particular, these
recommendations seek to ensure compatibility with both EXIF (Exchangeable Image File Format) and baseline JFIF (JPEG File Interchange Format); i.e., the subset of JFIF files that use only baseline
JPEG processes. Support for JPEG streams using non-baseline processes, such as arithmetic coding or progressive coding, is not mandated for XHTML-Print compliance.

A.2.2 Handling of APPx Markers

Baseline decoders MAY ignore application-specific markers, such as the JFIF APP0 marker and the EXIF APP1/APP2 markers; rotation fields
within these markers SHOULD be ignored. (Specifically, conforming printers SHOULD
NOT decode the TIFF IFDs embedded in the EXIF APP1 and APP2 markers, as described in Section 2.6.4 of [JEIDA].) This implies images will print in the orientation in
which they are stored, unless style markup indicates otherwise. The image SHOULD be rendered at the size specified in the JPEG SOF marker, if
not overridden by style markup. A JPEG decoder for a conforming printer SHALL NOT fail as a consequence of encountering an unsupported
APPx marker (i.e. all such markers SHALL be correctly parsed, even if they are ignored).
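
The following Python sketch illustrates what parsing APPx markers without decoding them can look like: each segment before the frame header is stepped over using its length field, and the image size is read from the baseline SOF0 marker. It is a simplified illustration, not a complete decoder; the function name baseline_size is hypothetical, and the sketch assumes no fill bytes or parameterless markers appear before the frame header.

```python
import struct

SOI = b"\xff\xd8"  # start of image
SOF0 = 0xC0        # baseline DCT frame header
SOS = 0xDA         # start of scan: entropy-coded data follows


def baseline_size(data):
    """Return (width, height) from the SOF0 marker, or None if absent."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == SOS:
            return None  # reached scan data without seeing a baseline frame
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == SOF0:
            # Segment payload: precision (1 byte), height (2), width (2), ...
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        # Any other segment, including APP0..APP15, is skipped unread.
        pos += 2 + length
    return None
```

Because the length field counts itself but not the two marker bytes, advancing by 2 + length lands on the next marker even when the segment's contents are unknown to the decoder.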
