MultilingualWeb-LT Working Group Charter

The mission of the MultilingualWeb-LT Working Group, part of the Internationalization Activity, is
to define metadata for web content (mainly HTML5) and "deep Web" content, for
example a CMS or XML files from which HTML pages are generated, that facilitates its
interaction with multilingual technologies and localization processes.

Scope

The MultilingualWeb-LT Working Group will enhance the foundation for the integration of
language-related technologies into core Web technologies. This will be achieved via the
creation of a W3C Recommendation defining metadata applicable by language-related
technologies in the Web. Beginning with the network of stakeholders built via the MultilingualWeb and META-NET projects, the Working Group will
create broad consensus across communities, involving producers of content,
localization workers, language technology experts, browser vendors, tool makers and
users. MultilingualWeb-LT will lay the technical foundations for new business opportunities for
content creators and the vendors who provide them with language and content services
and tools. This will enable content creators and distributors to reach out to a
growing linguistic and cultural diversity of Web users worldwide, and to respond to
their specific needs in a timely and cost-effective manner.

MultilingualWeb-LT will base its work on the ITS 1.0 specification. MultilingualWeb-LT will produce the
successor of ITS 1.0.

ITS 1.0 provides conceptual, prose descriptions of its data categories for XML documents, with examples of possible syntax. The MultilingualWeb-LT Working Group has four goals:

To develop the successor of ITS 1.0.

To concentrate on the use of these data categories in HTML5 and “deep Web” content, for example a CMS or XML files from which HTML pages are generated. This does not exclude the definition of additional data categories (see below), but describes the focus of the Working Group.

To define processing requirements of data categories formally and in a consistent manner.

To foster reference implementations of the data categories in other XML and non-XML environments that are on the Web or closely related to the Web: CMS systems, Web-based localization chain services, online machine translation systems.

In addition to the data categories mentioned above, MultilingualWeb-LT MAY discuss data categories in the following areas. The MultilingualWeb-LT Working Group does not commit to define metadata for these data categories, but MAY do so, if this does not influence the timeline described below.

Translation provenance (human and machine translation of different types)

Human or automated post- and pre-editing, including degree of post-editing

Legal metadata pertaining to ownership and usage rights

QA provenance: application of translation QA; result of QA; human and tool input into QA assessment

For HTML5, the prose description of data categories will be normatively implemented as both microdata and RDFa Lite 1.1. This approach is taken in order to avoid the development of a new metadata mechanism for HTML5 and to avoid adding markup attributes to the HTML5 language. The only exception MAY be adding the "translate" attribute to the HTML5 language; see the HTML liaison description below.

When discussing solutions, the MultilingualWeb-LT Working Group will always look for existing standards.

The working group MAY decide to define and implement further data categories, if this does not influence the timeline described below.

Many of the above data categories could be defined in a complex manner, e.g. legal metadata. The Working Group MUST NOT pursue complex definitions if they would lead to divergence from the timeline defined below.

Success Criteria

The MultilingualWeb-LT Working Group is expected to demonstrate interoperable implementations during the Call for Implementations step. "Interoperable" means here at least that the metadata must be available in various parts of Web related technologies, like CMS systems, localization chains etc., or in profiles of related formats like XLIFF.

Out of Scope

The MultilingualWeb-LT Working Group will focus on metadata that can be added to HTML5 or to "deep Web" content (for example a CMS or XML files from which HTML pages are generated). The Working Group will not define container formats for localization, like XLIFF, but will cooperate closely with the XLIFF TC and other relevant TCs to ensure the compatibility of MultilingualWeb-LT metadata with these formats. The Working Group also will not address, as part of its normative outcome, metadata for styling formats like CSS or scripting languages like JavaScript. These topics MAY be addressed, if time permits, in a non-normative manner, or in a successor of the Working Group.

Metadata such as the "Translate" data category can be processed in various ways: to extract translatable text for subsequent localization, as input to a machine translation system, etc. The MultilingualWeb-LT Working Group will not define processing expectations for MultilingualWeb-LT metadata. In an (optional) primer document, the Working Group MAY give non-normative guidance on the topic.
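As a purely illustrative, non-normative sketch of one such processing path, the following Python fragment extracts translatable text from an HTML fragment. It assumes a "translate" attribute that takes the values "yes" and "no" and is inherited by child elements, in the manner of the ITS 1.0 Translate data category; any other value is treated here as "inherit", which is a simplification. The class name TranslatableTextExtractor and the function extract_translatable are hypothetical names chosen for this example and are not defined by any specification.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they are never pushed on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class TranslatableTextExtractor(HTMLParser):
    """Collect text whose nearest enclosing translate setting is not "no"."""

    def __init__(self):
        super().__init__()
        self._stack = []    # one (tag, translatable) entry per open element
        self.segments = []  # translatable text, in document order

    def _current(self):
        # Content is assumed translatable unless an ancestor says otherwise.
        return self._stack[-1][1] if self._stack else True

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        value = (dict(attrs).get("translate") or "").lower()
        if value == "no":
            flag = False
        elif value == "yes":
            flag = True
        else:
            flag = self._current()  # absent or unknown value: inherit
        self._stack.append((tag, flag))

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating sloppy markup.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self._current():
            self.segments.append(text)


def extract_translatable(html):
    """Return the list of translatable text segments found in html."""
    parser = TranslatableTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.segments
```

A localization tool could pass the returned segments to a translation workflow while leaving protected content, such as product codes or user interface labels, untouched. A machine translation system could consume the same flags differently, which is why this sketch should be read as one possible use of the metadata and not as a required behaviour.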

Deliverables

A W3C Recommendation about metadata for multilingual language technologies and localization processes. This MAY be a new version of the ITS 1.0 specification. The Working Group will coordinate this decision with the stakeholders who have developed ITS 1.0 and who are now active in the ITS Interest Group.

Other Deliverables

The Working Group will produce a test suite intended to promote implementation of the Candidate Recommendation, and to assess interoperability between these implementations.

The Working Group may also:

Author a primer that includes guidance on the use of the Recommendation, and

Dependencies and Liaisons

Dependencies

The MultilingualWeb-LT WG will closely interact with the HTML WG in order to assure that the metadata defined within MultilingualWeb-LT extends HTML5 in a manner that is in agreement with the HTML WG, and to assure that MultilingualWeb-LT metadata does not lead to any impact on Web browser behaviour that does not have the consensus of the HTML WG. This interaction already started with the HTML5 Last Call comment "HTML5 is missing attribute for specifying translatability of content". This comment is in part driven forward by the people and organizations who will also join the MultilingualWeb-LT Working Group. As explained in the scope section, the "translate" attribute will be the only addition to the HTML5 language requested by the MultilingualWeb-LT WG. Other data categories discussed by the MultilingualWeb-LT WG will make use of RDFa and microdata.

Relationships to External Groups

Dependencies

OASIS XML Localisation Interchange File Format (XLIFF) TC

The MultilingualWeb-LT WG will liaise with the OASIS XLIFF TC to understand Localization Industry requirements for MultilingualWeb-LT metadata and to ensure that its metadata data types are considered and incorporated into the XLIFF standard where necessary.

Liaisons

Unicode Unicode Localization Interoperability (ULI) TC

The MultilingualWeb-LT WG will liaise with the ULI TC in order to understand the latest developments in segmentation rules for localization interoperability and how these may relate to data types and best practices elaborated by MultilingualWeb-LT. As ULI's work progresses, it may also be worth watching whether it takes up morphology and lemmatization, XLIFF profiling, content leveraging memories, etc.

ETSI Localisation Industry Standards (LIS) ISG

The MultilingualWeb-LT WG will liaise with ETSI LIS ISG to have insight into the development of the former LISA OSCAR standards portfolio, which includes important legacy standards such as TMX, TBX, and SRX. It is especially important to watch TMX, as it is currently not being further developed in any other open standardization body.

ISO TC 37

The MultilingualWeb-LT WG will liaise with ISO TC 37 to understand further development of TBX and SRX standards if any such development occurs within ISO TC 37.

OASIS Open Architecture for XML Authoring and Localization (OAXAL) TC

The MultilingualWeb-LT WG will liaise with the OASIS OAXAL TC to understand how the OAXAL Reference Model, and its approach to linking the processing of source language files with localization workflow files such as XLIFF, could inform the work of the MultilingualWeb-LT WG.

OASIS Darwin Information Typing Architecture (DITA) TC

The MultilingualWeb-LT WG will liaise with the OASIS DITA TC to understand their current stance on component content mark-up in relation to localization and translation (including XLIFF interoperability) and how this could be developed in line with the MultilingualWeb-LT outputs.

OASIS Content Management Interoperability Services (CMIS) TC

The MultilingualWeb-LT WG will liaise with the OASIS CMIS TC to determine whether their model of document grouping could support multi-document mark-up requirements and also to inform any future CMIS revisions that may include localization, which is currently out of scope.

Participation

To be successful, the MultilingualWeb-LT Working Group is expected to have 10 or more active participants for its
duration. Effective participation in the MultilingualWeb-LT Working Group is expected to consume
one work day per week for each participant, and two days per week for editors.
The MultilingualWeb-LT Working Group will also allocate the necessary resources for building
the test suite for its specification.

Decision Policy

As explained in the Process Document (section
3.3), this group will seek to make decisions when there
is consensus. When the Chair puts a question and observes
dissent, after due consideration of different opinions, the
Chair should record a decision (possibly after a formal vote)
and any objections, and move on.

Patent Policy

This Working Group operates under the W3C
Patent Policy (5 February 2004 Version). To promote the
widest adoption of Web standards, W3C seeks to issue
Recommendations that can be implemented, according to this
policy, on a Royalty-Free basis.

About this Charter

This charter for the MultilingualWeb-LT Working Group has been created according to
section 6.2 of the Process Document. In the event of a
conflict between this document or the provisions of any charter
and the W3C Process, the W3C Process shall take precedence.