It is ironic that the localization industry, an industry specialized in adaptation and customization, has been insistent on its existing process model and is not adapting itself to meet the requirements of projects using Agile Methodology.

Agile methodology should be embraced by LSPs and localization practices should be rethought in its context. This has the potential to lead to significant opportunities for innovative localization companies.

Agile projects have frequent revisions. The industry needs to rethink its tools and file formats to support and manage frequent changes. Translation memory, localization kits, and the like should be designed to allow fast identification and removal of obsolete strings, and to support quick updates and validation.
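As an illustrative sketch only (the function and data structure here are hypothetical, not taken from any particular TM tool), identifying and removing obsolete strings could be as simple as intersecting the translation memory with the strings present in the current build:

```python
# Hypothetical sketch: a translation memory modeled as a dict of
# source string -> translation. After an Agile revision, drop entries
# whose source string no longer exists in the current build.
def prune_tm(tm, current_strings):
    """Return a copy of tm containing only entries still present in the build."""
    live = set(current_strings)
    return {src: tgt for src, tgt in tm.items() if src in live}

tm = {"Save": "Guardar", "Quit": "Salir"}
current_build = ["Save", "Open"]        # "Quit" was removed, "Open" is new
pruned = prune_tm(tm, current_build)    # {"Save": "Guardar"}
```

Real TM tools would of course need fuzzy matching and metadata, but the point is that frequent-change support must be a first-class operation.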

Localization personnel need practices that support efficient and accurate communication about design and other changes. Where possible, tools should catch errors made by workers who missed or forgot a revision discussion.

Recently a client gave me a laptop to use on their behalf. Using it to look up information on my I18nguy web site, I noticed that one of my web pages, which I had recently reworked to scale with different font sizes, was grossly out of whack. It took me quite a while to sort out why it seemed to work on the laptops I had tested with, but not on this client’s laptop. Here is what I learned. Most of the details are relevant to Mozilla Firefox 3.6. In the future, I may investigate how other browsers behave with respect to default fonts.

Accessibility Recommendation: Avoid Absolute Font Sizes

I believe in universality of the Web and those practices, standards and recommendations that promote accessibility and internationalization so that everyone can have access to information. One of those recommendations is to avoid absolute font sizes, so that users can adjust text to a size that is legible for their eyes and equipment. An absolute font size is one declared in terms of inches, centimeters, millimeters, points, or pixels. Font sizes can instead be declared in em units or percentages, relative to other fonts.

For example, using CSS declarations, here are two fixed and two relative font size settings:

.fixedpixel { font-size: 16px; }

.fixedpoint { font-size: 12pt; }

.relativeem { font-size: 1.5em; }

.relpercent { font-size: 150%; }

My web page used entirely relative font sizes.

Browser Default Font

However, there must be a font from which others are declared relatively. Browsers have a default font which can serve this purpose.

There are also font-size keywords (xx-small, x-small, small, medium, large, x-large, xx-large). Medium is considered the default size for body text. The specific value for the font name (or, in CSS terms, the font-family) and the size of the medium font are defined by the browser’s default font. The browser’s default font is usually configurable by users.

The recommendations to browser implementers for the values of the remaining keywords have changed over time. (Lack of stability and consistency of implementations is a related frustration when it comes to default font sizes.)

My page had declared medium as the font size for the body element and other fonts were sized relative to this.
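To see why this matters for testing, here is a deliberately simplified model (not actual browser layout code): every relative declaration ultimately resolves against the browser’s default size for medium, so changing the default should rescale the whole page.

```python
# Simplified model: relative declarations (1.5em, 150%, etc.) resolve to
# pixels by multiplying against the browser's default ("medium") size.
def resolved_px(relative_factor, default_px):
    """Pixel size of a relative font declaration given the default size."""
    return relative_factor * default_px

# With the common 16px default, a 150% heading renders at 24px;
# raise the default to 20px and the same declaration renders at 30px.
print(resolved_px(1.5, 16))  # 24.0
print(resolved_px(1.5, 20))  # 30.0
```

A page built entirely from relative sizes should track the default in exactly this way, which is what my tests were intended to verify.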

Many people have written about the differences between browsers and their treatment of the default font and potential solutions. (For example, A List Apart)

I had tested scalability with Internet Explorer and Firefox. IE lets you set the default font-family but not the size; size in IE is adjusted through zoom controls. Firefox, on the other hand, lets the user specify both a default font and a size.

Firefox Default Font Size

Firefox lets you set both the font-family and the font-size of the default font. Go to the Firefox Tools menu, choose Options, and select the Content tab. To test scalability I view pages with different default font size settings.

Now being an I18n Guy, I had also clicked the Advanced... button. This brings up a dialog box with a drop-down for different scripts (writing systems) such as Arabic, Bengali, Simplified Chinese, Central European, Western… For each script there are font settings for Proportional, Serif, Sans-Serif, and Monospace fonts, as well as other properties such as the default character encoding.

I had looked through and edited the settings for a few different scripts. Apparently, the last script I had touched was Central European. Returning to the previous dialog box, it shows as the default font size the value last set for any script. My subsequent changes to the default font size were therefore all applied to the Central European font. Unfortunately, Firefox doesn’t use that value as the default! In my experiments, only the value for the Western script was applied to my pages. So my tests had falsely indicated that the page was stable and independent of the browser default font size.

Note this occurred even though the Western script uses the same font as the Central European script. The Western font size was not changed when the default font size was changed. So I was changing the default font size, but it was not affecting the size of the font applied to the page.

I have since changed the language of the page to several different values, using statements such as <HTML lang="en-US">, but the changes did not seem to determine the script selection. The web page is UTF-8, so the character encoding didn’t drive the choice of script and default font size. Perhaps the browser detects the relevant script based on the actual characters used in the page.

Note also that it doesn’t matter that the page may be using fonts other than the default font. The size associated with medium and the other keywords is based on the browser’s default font size, even if that size was associated with a completely different font.

I think it is a mistake that the Firefox setting for the default font size hides, behind the Advanced... button, the script it is applied to. The script should be made explicit, as should the method by which a script is determined for a page. A good feature would also be an option to apply a default size change across all scripts.

Summary

As recommended, I defined a page using relative font sizes.

The base font was medium.

Firefox provides settings for different default fonts by script.

The default font size displayed is the value for the last script edited.

Further changes to the default font size are applied to that last script.

The default font size actually used by a web page may therefore differ from the one shown.

I have not determined yet how the script selection associated with a page is made.

Impact

This information has consequences for users, designers, localizers and testers.

Users, especially multilingual users, may be frustrated when changes to the default font size have no effect on the pages they are viewing.

The relationship between implementation of relative fonts and default fonts is important to providing usable and accessible pages.

To perform QA and operate pages with different default font-sizes, testers must understand the need to change the default settings for the relevant script(s) used by the pages, not just the default value.

Testers also need to understand how the browser associates a script with a page.

I have lots to blog about, but haven’t had time, so apologies that my blog has gone stale. As I try to get back into the swing of things, I’ll note that I just returned from a trip to Shanghai and Beijing. I’ll offer some thoughts about China subsequently, but for now I’ll report that I had a great time visiting with the folks at CSOFT International. They are developing interesting open source collaboration technologies like TermWiki, which you should all pay attention to. They also have a great organization, and it was my privilege to meet with many of their staff and join them for several activities (like the Shanghai Expo!) while I was in Shanghai.

I also spent some time with Zach Overline and Elena McCoy, who were nice enough to take some of my remarks and post them on CSOFT’s blog. We spoke about cloud computing, crowd sourcing, and related tools for the most part. Zach also asked me my thoughts about GUIs, and I’ll admit to being vain and worrying that discussing the transition from command line to GUI made me seem too much like an old-timer. (Which it probably does.) However, several people have already remarked that they weren’t aware of the tradeoffs I mentioned, so at least the points have value for some. Going forward though, if you ask me about command-line interfaces I am just going to say I only know what my dad told me, and that ever since I had the nanocomputer inserted in my right temple, I don’t even need to click or use voice commands any more.

Although there is a small amount of missing context, owing to references to internal presentations we saw together and a presentation I gave referencing “Kowabunga”, surfing, and a vision for localization technology, the posted conversation stands pretty well on its own.

As I was preparing my visit to China, I mentioned to several friends in the industry that I would be visiting CSOFT. I was surprised at how enthusiastic their clients are about them and the high marks they get for quality and service. Those of you who know me know I lean toward localization providers that are strong on relationship and mutual understanding and that innovate their processes with technology enhancements. CSOFT is definitely in that category.

Ben Cornelius and Ray Flournoy asked whether there is interest in a regular meeting group in Silicon Valley for Machine Translation.

Ben and I host another meeting group, for Globalization Management Systems, which has been very beneficial for members. We invite speakers, and as a result the group has learned a lot about what is available on the market, what is forthcoming, and how other users think about purchasing, deploying, and integrating these tools, as well as their real-world experiences.

Should there be a meeting group for Machine Translation topics?
What do you think the goals of this group should be? Who should (or would) attend? (e.g. what are typical roles of attendees? MT developers, users, admins, etc.)

We discussed a number of our interests and possibilities. Without revealing our own personal preferences and agendas, I list a few potential topics below.

You can respond here, but I have set up a Yahoo Group that you can add yourself to if interested. If there is enough interest we can work on logistics and have at least an initial meeting or two.
The group is at:

Is anyone interested in helping to organize a meeting schedule for the first few meetings? (I personally don’t have the time.)

Please also comment on how often you are interested and/or willing to meet. Monthly? Quarterly?

Here is a potential mission statement:
This group is for people in Silicon Valley to meet to discuss machine translation topics.
Members are interested in the design, application, integration, use, testing, selection, optimization, etc. of machine translation tools, and in sharing real-world experiences.

Meetings are webcast so anyone can participate. We encourage attending in person for networking and to have interactive and high quality discussions.

Possible discussion topics, led by members and invited speakers, vendors, etc.

Machine translation (MT) was one topic yesterday as I enjoyed lunch with Ben Cornelius and Ray Flournoy of Adobe Systems. Of course, a key criterion for evaluating these tools is the quality of the translated output. Most people would say it is the most important criterion. Speed, cost, and integration with other tools are also significant.

However, evaluation of any tool should take the intended application into account. Some applications can use MT output directly. Many more require review and editing of the output before publication. Manual review and editing introduces labor costs and delay which can be significant.

Therefore we should look at the cost of operating the tool plus the cost of post-editing when evaluating MT tools. Clearly, the optimum is 100% quality, with no post-edits required. But this is not usually the case…

Not all editing tasks are the same. Some edits are easy to make and low cost to fix. Others are labor intensive. The entries that require editing may be obvious and therefore easy to find. Some may be subtle and require more intensive scrutiny to identify.

The post-editing needed by machine translation output will follow a pattern that varies with the MT engine (and its rules, or training, etc.). (Human authors also have a writing pattern requiring particular classes of edits.) Typographic and terminology substitution errors may be easy to address. Some grammar and style errors may be more costly. Consistency, flow and the relationship among sequences of sentences may be harder yet.

This suggests an interesting criterion for evaluating tools: the combined editing productivity and total operational cost of using the MT tool. An MT product that generates text needing edits that are both easy to find and easy to fix could have a very low total cost. Another tool producing higher-quality linguistic output might still be less productive if post-editing is difficult.

A good metric for MT tools would assign each class of error a weight proportional to the cost of fixing it. A document could have 100 typos and be much cheaper to ready for publication than a document with only a few consistency or contextual errors that require thought and consideration to address.
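A sketch of such a weighted metric (the error classes and weights below are hypothetical, chosen only to illustrate the idea):

```python
# Hypothetical post-editing cost weights (say, minutes to fix) per error class.
WEIGHTS = {"typo": 0.2, "terminology": 0.5, "grammar": 1.0, "consistency": 5.0}

def weighted_cost(error_counts, weights=WEIGHTS):
    """Total estimated post-editing cost for a document's error profile."""
    return sum(weights[cls] * count for cls, count in error_counts.items())

many_trivial = {"typo": 100}        # 100 easy fixes
few_subtle = {"consistency": 5}     # 5 hard fixes
print(weighted_cost(many_trivial))  # 20.0 -- cheaper overall
print(weighted_cost(few_subtle))    # 25.0
```

Note that the document with 100 errors scores cheaper than the one with 5, which is exactly the inversion an unweighted error count would miss.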

This metric would also help with process configuration. For example, if I have to produce both Mexican and Iberian Spanish translation, based on English source material, I have several options.

If “>-MT->” represents a machine translation step, and “>-PE->” represents a post-edit step:

Option A: Simple MT, then PE
  Step 1: en >-MT-> mx
          en >-MT-> es
  Step 2: mx >-PE-> mx2
          es >-PE-> es2

Option B: mx to es
  Step 1: en >-MT-> mx
          mx >-MT-> es
  Step 2: mx >-PE-> mx2
          es >-PE-> es2

Option C: mx post-edit to es
  Step 1: en >-MT-> mx
          mx >-PE-> mx2
  Step 2: mx2 >-PE-> es
          es >-PE-> es2

Option D: es to mx
  Step 1: en >-MT-> es
          es >-MT-> mx
  Step 2: es >-PE-> es2
          mx >-PE-> mx2

Option E: es post-edit to mx
  Step 1: en >-MT-> es
          es >-PE-> es2
  Step 2: es2 >-PE-> mx
          mx >-PE-> mx2

The most effective scenario is the one requiring the least editing. This may not correlate with unweighted measurements of each machine translator’s linguistic quality.

When I mentioned this, Ben recalled a demo by ProMT that the three of us attended recently. ProMT machine translation has a nice feature for managing placeholders used to represent program variables.

Here is an example sentence with two placeholders represented by an identifier in curly brackets.
“The file {0} contains {1} words.”

The filename and word count would be substituted at run-time for {0} and {1} respectively.
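Incidentally, Python’s str.format happens to use the same numbered-placeholder syntax, so the run-time substitution can be demonstrated directly (the filename and count here are made up):

```python
# Run-time substitution of the numbered placeholders.
template = "The file {0} contains {1} words."
message = template.format("report.txt", 120)
print(message)  # The file report.txt contains 120 words.
```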

Many machine translation tools segment the text between the placeholders rather than treating the placeholders as part of the syntax of the sentence. As a result, the placeholders are not properly positioned in the translated output. The problem is exacerbated by tools that convert markup tags to placeholders.

Even if you don’t read Russian, you can see that “Alex” should retain placeholders as “{5}Alex{6}”.

Post-editors must remove the original placeholders from where they end up in the text and insert them into the correct locations. This would be a significant cost consideration for either software or markup localization.

ProMT treats the placeholders as part of the sentence resulting in better placement in the output. This simplifies post editing and improves productivity.
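Dropped or duplicated placeholders can also be caught mechanically in a localization QA step. This hypothetical check (not part of any particular tool) simply compares the multiset of {n} tokens in source and target:

```python
import re

def placeholders(text):
    """Multiset of numbered placeholders such as {0}, {1} found in the text."""
    return sorted(re.findall(r"\{\d+\}", text))

def placeholders_preserved(source, target):
    """True if the target keeps exactly the source's placeholders."""
    return placeholders(source) == placeholders(target)

src = "The file {0} contains {1} words."
good = "Le fichier {0} contient {1} mots."   # placeholders intact
bad = "Le fichier contient des mots."        # placeholders dropped
print(placeholders_preserved(src, good))  # True
print(placeholders_preserved(src, bad))   # False
```

A check like this finds the errors cheaply; correct placement within the sentence, of course, still requires the MT engine or the post-editor to get the syntax right.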

Ideally machine translation would deliver 100% quality. When quality is less than 100%, evaluating the combination of machine translation and post-editing effort is a more useful measure than selecting tools or configuring workflow on quality metrics alone. Higher linguistic quality might be irrelevant if the result is more challenging for the human post-editor to correct.

To address the debate about professional vs. crowd sourcing translations, I would like to offer a recent experience I had in an unrelated field.

I brought some 50-year-old pictures to a photography store to be scanned. I could have done the scans myself, but the store was offering a very inexpensive promotion.

Before I would let the store have the work, I spent 20 minutes with the owner, vetting them and establishing my requirements. My number one priority was that the originals remain undamaged. Not only did I emphasize this multiple times, but I also reviewed the scanner, its feeder, and so on. In fact, we changed how the pictures were collected after scanning, to ensure they would not be bent as they exited.

My second requirement was timing. My deadline was a plane flight a few days later. I also went over resolution, color accuracy, pricing and some other fine points with the owner. He provided all sorts of assurances and testimonials. The scanner was state of the art and very expensive. He would handle the scanning personally to protect the pictures. I questioned how often they did projects like this and he described numerous similar projects and their extensive experience. They were professionals.

As you have no doubt guessed by now, the process broke down, in multiple ways.

The scanner glass and reader were dirty causing the images to have lines and marks on them. I noticed this and they redid all the scans the next day.

Some pictures were damaged. They weren’t bent or mangled, but ink from some of the darker pictures transferred to the scanner rollers which then transferred ink smudges onto pictures of a pure white wedding dress.

The scanned images are sent over the network to a server for making CDs and other processing. The server’s CD burner stopped working, so to resolve the problem they transferred the files to another machine to make the CD. This machine had different software on it, which for some reason reduced the resolution from 300 dpi to 90 dpi. It also cropped some pictures (for a reason that was never understood), lopping off heads and making other undesirable changes.

You can imagine how upset I was at the damaged originals. Later I reviewed the CD and discovered the low resolution and cropped images. I spoke with the owner. The scans were still on the server so they made a new CD.

I’ll leave out the remaining details. Suffice it to say that I returned to the store on each of the next three days until I finally had uncropped images at the right resolution (but showing the now ink-stained wedding dress). I missed my deadline and spent as many hours with the store and owner as I would have spent scanning the images myself.

The owner, throughout this, was surprised and upset by the problems, immensely apologetic, worked overtime to satisfy me and brought in other staffers to rush through equipment and other fixes and to be as timely as possible. He spoke, I believe sincerely, of honoring his commitment and trying to make me whole, and happy.

Now I hope some of you see the relationship with translation.
Many organizations and individual professional translators in the industry are protesting the use of crowd sourcing. The claim is that quality will suffer.

I know many of the folks in the industry, and I do not question the training, skill, attention to quality, and passion that go into providing good translations. As with the store owner, it is a matter of personal pride that the work be excellent and the customer satisfied.

However, if we look at the user community and their satisfaction, we see that intent does not guarantee achievement. Translation is not just the product of an individual’s wordsmithing. It is a process involving several people, tools, and equipment, and different kinds of both source material and expected outputs. There are many potential points of failure.

Every experienced translation client has stories of missed deadlines, broken promises, and translations that were rejected by end-users. Many of you will assert that nearly all of the problems with the scanning project could have been managed better and either prevented or anticipated with contingency plans. The same is true for translation projects that go awry. Nevertheless many do go awry.

There are going to be projects that require special skills and attention. These are best attended to by professionals with the relevant experience, and not just professional translators, but organizations that are attentive to the entire process.

However, today the industry frustrates its clients with mediocre project management, inadequate workflow and translation memory tools, poor IT practices and lack of interoperability. It delivers translations that are rejected by user communities with surprising regularity. As long as the failure rate is high enough to cause distrust, clients are going to consider the do-it-yourself solution (i.e. crowd sourcing).

It is a realistic solution as well. For the games market, the burgeoning social networking market, and other markets, end-users are able to self-select the most desirable terms and phrasing. Professional translators do not have a particular advantage here.
For some of the languages of Africa and elsewhere, there aren’t sufficient translators or established glossaries, so professional organizations cannot claim an advantage there either.

Clients will not believe that quality will universally suffer under crowd sourcing until the industry improves reliability of its professional services overall and clients are comfortable that they will get value for the dollar.

I know I won’t be using a professional scanning service until I have a need that I can’t fulfill on my own.

There is additional significance to the mid-day Twitter outage that occurred June 17, 2009 that should not be missed.

Many parts of the world are used to mid-day outages when an American-hosted site comes down for its periodic maintenance.

If 2 A.M. in the U.S. falls during your business day, then your business is frequently affected by these outages.

It may only be an hour, and it may be a predicted outage, but it can still cause pain.
The inherent message that your region is not important to the owning company is a motivation to consider other providers, where they exist.

Today Twitter was down during the American afternoon, intentionally, in order to support the Iranian tweet community protesting the Iranian election results. Let’s set aside the good intentions behind this move and consider the event by itself.

This is a very rare (if not a first) instance of a region being shown preference over the large American market.

On the one hand, equality of languages, cultures and regions is a key principle behind internationalization and is to be welcomed.

It is newsworthy if this attitude becomes a new trend where social networks must consider their worldwide users and give balanced support.

At the same time, if Twitter is also a business application, then its business users require consistency of support and reliable availability.

So as social and business network media become significant across the world, the conflict over which markets suffer through these necessary outages will become more important. This points at the issue of significance behind today’s outage.
To be relied upon, Twitter and other social media must adopt techniques that guarantee worldwide uptime. Maintenance methods that provide for continued service while improvements or repairs are made must be employed.

It cannot be ignored that during the hour Twitter was down, many American and other Twitter users were anxious not only about Iran, but also about the other topics they were discussing, perhaps necessary for business or other reasons.

How many more mid-day outages will Twitter users accept before they shift to other more reliable services?
Not many. There is no time of day when large communities of users are not hurt by outages. Social networks need to get behind 24×7 uptime and phase in maintenance.

Why the list isn’t in this blog

Originally, the list of IDNs was in this blog. It turns out that WordPress treats the domain names as URLs and percent-encodes their bytes as hex values. This is wrong for IDNs. So you cannot properly link to international domain names in WordPress blogs at the moment.
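The correct transformation for an IDN is Punycode (the IDNA ToASCII operation), not percent-encoding of the UTF-8 bytes. Python’s standard library shows the difference, using bücher.example as a stand-in domain:

```python
import urllib.parse

domain = "bücher.example"

# IDNA/Punycode: the ASCII form that DNS actually resolves.
idna_form = domain.encode("idna").decode("ascii")
print(idna_form)  # xn--bcher-kva.example

# Percent-encoding the UTF-8 bytes, as WordPress does: not a valid hostname.
percent_form = urllib.parse.quote(domain)
print(percent_form)  # b%C3%BCcher.example
```

The percent-encoded form may look link-like, but no resolver will treat it as the intended domain.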

As a result, I suggested on Twitter the hashtag #idnfail for people to report blogs, Twitter clients, and other applications that are broken with respect to international domain names.

The initial list of #idnfail apps (which I have not verified myself for each entry) is: