Blog

Go here to register: /register
If you are still planning to register for the MultilingualWeb workshop in Luxembourg next week, this is the last day for registrations. If you don't register today you may not be able to gain access to the event location.
Hope to see you there!

If you would like to exhibit a poster during the MultilingualWeb workshop in Luxembourg, 15-16 March, please contact Manuel Tomas Carrasco Benitez (manuel.carrasco-benitez at ec.europa.eu) and copy Richard Ishida for more details.

We made the decision to accept poster applications quite late in the process, as a result of requests from attendees. If you are interested, please make contact as soon as possible, since the deadline for registrations is now only a week away.

A translate attribute was recently added to HTML5. At the three MultilingualWeb workshops we have run over the past two years, the idea of this kind of ‘translate flag’ has consistently attracted strong interest from localizers, content creators, and folks working with language technology. Richard Ishida has written a blog post describing how the attribute is meant to work and how it is supported in Google Translate and Microsoft Translator. He also hints at some ways in which the translate flag could be extended.
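To make the idea of a translate flag concrete, here is a minimal sketch in Python of how a tool might separate translatable from protected text. It assumes a simplified reading of the attribute: "yes" or an empty value switches translation on, "no" switches it off, and anything else inherits from the parent element. The names TranslateFlagParser and extract_segments are hypothetical, the sketch expects well-formed markup, it ignores translatable attributes such as title and alt, and it is not a description of how Google Translate or Microsoft Translator process the flag.

```python
from html.parser import HTMLParser

# Void elements have no end tag, so they are never pushed on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class TranslateFlagParser(HTMLParser):
    """Collects (text, translatable) pairs from well-formed HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = [True]   # assumption: the document root is translatable
        self.segments = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_ELEMENTS:
            return
        state = self._stack[-1]            # inherit from the parent by default
        for name, value in attrs:
            if name == "translate":
                keyword = (value or "").lower()
                if keyword in ("", "yes"):
                    state = True
                elif keyword == "no":
                    state = False
                # any other value is treated as invalid and keeps the inherited state
        self._stack.append(state)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if len(self._stack) > 1:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.segments.append((text, self._stack[-1]))


def extract_segments(html):
    """Return a list of (text, translatable) tuples for the given markup."""
    parser = TranslateFlagParser()
    parser.feed(html)
    parser.close()
    return parser.segments


if __name__ == "__main__":
    sample = '<p>Click <code translate="no">Save</code> to continue.</p>'
    for text, translatable in extract_segments(sample):
        print("translate" if translatable else "protect  ", repr(text))
```

A translation pipeline built along these lines would send only the segments flagged as translatable to the MT engine and copy the protected ones through unchanged.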

META-NET and Lionbridge are sponsoring the 3rd workshop in the MultilingualWeb series, which will be held in Limerick, Ireland on 21-22 September 2011. (See the Call for Participation and the recently published Program.)

If your organization would also like to sponsor the workshop, see how to apply. The deadline for sponsorship proposals for Limerick is 7 September 2011.

Both the Madrid and the Pisa workshops of the Thematic Network “Multilingual Web” mentioned the XML Localization Interchange File Format (XLIFF) as a central component of streamlined localization processes. Presentations in which XLIFF was mentioned included:

Given that the next workshop is in Limerick, we have translated the MultilingualWeb site into Irish.

A few user interface terms are still pending translation and, as with all of the other languages, the reports, program, call for participation, etc. are still in English (mostly because we don't have the resources to translate them, and partly because the workshop is held in English). But a large amount of the text on the site, along with the navigation, is now in Irish.

In addition to Irish, we have translated the site into Spanish, German, French, Italian and Romanian.

There are also two widgets at the bottom of each page, one from Microsoft and one from Google, that allow you to get gist translations of parts of the site that are not translated, or get gist translations into many other languages.

The first two events related to the Thematic Network “Multilingual Web” provided a couple of opportunities to share information on the W3C Internationalization Tag Set (ITS). Presentations in which ITS was mentioned included:

The workshop in Pisa in particular prompted a couple of interesting ITS-related thoughts:

1. Several speakers mentioned that it would be good if content could be categorized in a standard way as "Generated by Machine Translation (MT)". I guess there are various ways of looking at this from an ITS point of view:

a. an additional data category with semantics such as "generatedBy"

b. via a special, BCP47-compliant, value for the existing ITS data category "Language Information"; that special value may actually be a composite one since there may be a need to capture things like the following

Name of the MT system that generated the content

Quality of the input

(Semi-)official quality rating of the system (BLEU score or the like)

2. Several speakers explained that it would be good if content could be categorized in a standard way as "OK to be submitted to Natural Language Processing (NLP)". Example: The Web is deemed to be an invaluable resource for building statistical Machine Translation models. However, some uncertainty seems to exist as to whether this use of Web-based content is permitted. A standardized categorization could help. I guess there are various ways of looking at this from an ITS point of view:
a. an additional data category with semantics such as "nlpOK"
b. something similar to the existing ITS data category "Localization Note" (namely one that captures information for machine processing, not for human consumption; see the discussion).

3. Charles McCathieNevile mentioned the addition of the notion of a default locale to the Widget Packaging and Configuration specification (see http://www.w3.org/TR/widgets/#widget-package ). This made me wonder whether "defaultLocale" might be useful in quite a number of contexts, and thus be a candidate for an additional ITS data category. The Widget document also prompted another localization-related thought: it should be required reading for anyone who works on standardized packaging for translation-related processes.

P.S.: The above is similar to a post to the mailing list for the W3C ITS Interest Group.

By all accounts, the MultilingualWeb Workshop in Pisa proved to be as popular as its predecessor in Madrid, thanks to the efforts of the many excellent speakers and the local organizers. Once again, we had around 100 attendees and 33 speakers. The program page has now been updated to point to speakers' slides and to the relevant part of the IRC log. Links to video recordings will follow in about a week's time.

There is also a page pointing to social media reports, such as blog posts, tweets and photos, related to the workshop. If you have other blog posts, photos, etc. online, please let Richard Ishida know so that we can link to them from this page.

The MultilingualWeb Workshop in Madrid appears to have been a great success, thanks to the efforts of the many excellent speakers. As a first step in reporting the workshop, a page of links is now available that points to speakers' slides and to the relevant part of the IRC log. It also points to blog posts, tweets and photos related to the workshop.

We are still missing a small number of slide sets, and those will be added as speakers provide them.

If you know of other social media references to the workshop, please inform Richard Ishida so that they can be added to the page.