Weigh in on the Hefty Job of Internationalization

When I worked in the purely commercial software world, internationalization of a product was a critical, magical event that happened somewhere offsite in the hands of contracted translation companies. Having your software translated into as many different languages as possible is just as important in the open source arena. Arguably, open source projects NEED translations MORE to grow into a global presence. The PROCESS of translating software and documentation for an open source project is an entirely different experience from the commercial event depicted above, as I have learned over my last two years with Pentaho.

This topic is at the forefront of my writings today because I have been tasked with figuring out how to manage localizing the Pentaho documentation in a wiki that has no feature support for internationalization of content! Did I mention that Pentaho is moving all of their documentation to a wiki? Well, there it is. The cat's out of the bag. As of our 1.2 GA release, the documentation will be maintained, by community and development team, in a Confluence wiki.

Confluence is a great tool and has lots of integration points with our case tracking system, JIRA. That said, it seems that Atlassian (the company behind Confluence and JIRA) is a little behind in the internationalization game. Confluence only recently started to support language packs for translation of the wiki itself, and has no support for content translation.

So here we are, and I need to figure out a solution to satisfy three very important groups. The first group is the translators of our documentation. They are community members who contribute those translations to our projects. We need to set up the internationalization in the wiki so that it is easy for these folks to do the initial translation, and also have a mechanism for notifying them when the master language version of the doc has changed. The second group is the users of the documentation. It should be easy for me, if I am French, to find the French documentation, but also be able to peruse the English documentation. And the third group is the poor guys in house that have to maintain the organization of this documentation. When you consider we have over 10 projects, translated (so far) into 8 languages, that get a new revision of documentation for every version of the project released... well, that's a lot to manage.

So my initial attempt at a design for this conglomeration that we need to support was to try to stuff all of the languages in the same document in the wiki, using DIV-type tags and such for separation. I'd then use some custom code to hide the other translations based on the user's language setting in the browser. This would make my translators happy initially, because they can do the initial translation almost inline with the master language version. My users would be happy because our wiki respects their browser's language settings, and if a particular piece of content hasn't been translated to the user's language of choice, we would default to the master language version. Of course, this solution does not address the translator notification of master language changes, and well, it would be a bit of a pain to determine whether it was a master language version change or a translation change, with all that content in the same document. Also, this only addresses translation of the content. What about the document titles? Since the navigation of a wiki (by definition) is based on document name, we have a big problem to solve there. And the largest point of failure in my grand plan is the fact that the merging capabilities are not so hot in our wiki of choice, so the translators would have to line up and take turns translating. Ick, in a word.
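To make the "hide the other translations, default to the master language" idea concrete, here is a minimal sketch in Python. It is not Confluence code and not anything Pentaho has built: the function names, the dictionary of per-language blocks, and the choice of "en" as the master language are all assumptions for illustration. It only shows the selection logic, driven by the Accept-Language header that browsers send.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def pick_translation(translations, header, master="en"):
    """Pick the block to display; fall back to the master language.

    translations is a hypothetical mapping such as {"en": "...", "fr": "..."}
    holding the per-language blocks stored in one wiki page.
    """
    for tag in parse_accept_language(header):
        primary = tag.split("-")[0]  # "fr-fr" also matches a plain "fr" block
        for candidate in (tag, primary):
            if candidate in translations:
                return candidate, translations[candidate]
    return master, translations[master]


page = {"en": "Welcome", "fr": "Bienvenue"}
print(pick_translation(page, "fr-FR,fr;q=0.9,en;q=0.8"))
print(pick_translation(page, "de-DE,de;q=0.9"))
```

The first call selects the French block and the second falls back to English, since no German block exists. Note that this covers page content only; it does nothing for the page titles or the merge problem described above.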

So the next path we steer down is the path that takes us to completely separate repositories (called Spaces in Confluence) for the different languages. This gives us autonomy for the purposes of editing and not having the language content intermingled, but at a pretty large synchronization and maintenance cost. We now need to figure out, do we populate the French repository with all of the English documentation, to assist our translators in translating? Well, then we have up to eight copies of the doc, all changing in real time and sure to get out of sync. So, perhaps we should let the translators populate their language wiki from scratch, organically? This isn't very accommodating to our translators, and documents will surely be placed out of order with the master language wiki, making it confusing for the users.
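With separate spaces, the synchronization cost comes down to knowing which translated pages have fallen behind the master. A rough sketch of that bookkeeping follows, in Python. It assumes something neither Confluence nor this post provides out of the box: that each translated page records the master page version it was translated from. The function name and data layout are hypothetical.

```python
def find_stale_translations(master_versions, translated_from):
    """Return {language: [page titles that need a translator's attention]}.

    master_versions: {page title: current version number in the master space}
    translated_from: {language: {page title: master version the translation
                      was based on}}; a missing page means "never translated".
    Both structures are assumptions made for this sketch.
    """
    report = {}
    for language, pages in translated_from.items():
        stale = [
            title
            for title, current in sorted(master_versions.items())
            if pages.get(title, 0) < current
        ]
        if stale:
            report[language] = stale
    return report


master = {"Install Guide": 3, "FAQ": 1}
translated = {
    "fr": {"Install Guide": 2, "FAQ": 1},  # Install Guide changed since translation
    "de": {"Install Guide": 3},            # FAQ has never been translated
}
print(find_stale_translations(master, translated))
```

A report like this could feed the translator notifications mentioned earlier, but it does not solve page ordering or titles across spaces.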

Yikes. It's at this point, having discussed the plethora of less-than-stellar options with a few clever guys on our team, that I decided to step back and write to the community. In my mental gymnastics over this problem, I made many assumptions about what our users and translators really want.

For the translators in our community, have you worked at translating in a wiki before? What did you like about it? What did you hate? Is it easier for you to translate everything in your format of choice, or do you like the idea that once you translate it, it would be available without having to wait for the Pentaho team to publish it? Of the two scenarios I detailed above, which is the lesser of two evils for you?

And for all of the rest of our community that must USE and update our documentation - would it be more frustrating for you to work with translations inline in the wiki (in edit mode only) or to have to go hunt around someplace else to find those translations?

And, of course, for any other open source project that holds the silver bullet to this problem - feel free to share your solution here!! Heck, I'd even take well-intentioned guesses and good ideas.

Here is my experience from when I tried to move one of my documents to the wikis. I am not talking about internationalization just yet.

I had to move my documentation to JSPWiki and I lost so many hours, because it is definitely not a writer's application. It does not give enough options to make the documentation look decent, and you are really slow when using the wiki syntax.
Forced to accept that I would never be able to produce the documentation that way, I decided to move to the Pentaho wiki (Confluence), hoping that I would be able to work with it because it is a professional product.
But the same problem happens! Confluence is still better than JSPWiki in any case, but writing documentation is still a lot of pain because its design is not the best for editing. You still lose a lot of time. Confluence has a WYSIWYG editor, but it looks buggy, really.

Wikis seem to be good only as a replacement for the sheet of paper you keep on your desk and fill in when you think of something (i.e. notes).
But if you want to write a big documentation, it is definitely not the solution!

I have only used one other wiki in the past, MediaWiki, and I don't remember having such pain writing documentation. I guess MoinMoin is good too, but both would have to be tried in the same situation, I would say.

Even a CMS (Content Management System) is a better alternative if you want usable documentation.

Every wiki vendor adds its own functionality and is not at all compatible with the other wikis, and that is a big problem. Even if they are open source, you will lose a lot of time adapting your documentation from one wiki to another.
Wiki syntax is also too primitive, unstructured, and inconsistent between vendors.

Talking about internationalization of the content, not many wiki vendors seem to have this functionality. I don't know why. It is a question every program faces at the beginning, and they all adopted the same position: no multilanguage content!

So the next path we steer down is the path that takes us to completely separate repositories (called Spaces in Confluence) for the different languages. This gives us autonomy for the purposes of editing and not having the language content intermingled, but at a pretty large synchronization and maintenance cost. We now need to figure out, do we populate the French repository with all of the English documentation, to assist our translators in translating? Well, then we have up to eight copies of the doc, all changing in real time and sure to get out of sync. So, perhaps we should let the translators populate their language wiki from scratch, organically? This isn't very accommodating to our translators, and documents will surely be placed out of order with the master language wiki, making it confusing for the users.

Yikes. It's at this point, having discussed the plethora of less-than-stellar options with a few clever guys on our team, that I decided to step back and write to the community. In my mental gymnastics over this problem, I made many assumptions about what our users and translators really want.

Gretchen, I guess all the assumptions you made are the right ones. I don't know the open tools for that, but I guess you can forget about hacking a wiki to support multiple languages; it must have native support (merge, diff, ...) for that.
If the doc is big, it is hard to translate. A wiki helps a little because you can cut it into small parts, but if you can't see the original part at the same time, it is no fun.

For the translators in our community, have you worked at translating in a wiki before? What did you like about it? What did you hate? Is it easier for you to translate everything in your format of choice, or do you like the idea that once you translate it, it would be available without having to wait for the Pentaho team to publish it? Of the two scenarios I detailed above, which is the lesser of two evils for you?

I hated it; it is not usable for producing quality documents.
In a community process, it is better to have the translation directly available, plus it allows many people to work on it.

And for all of the rest of our community that must USE and update our documentation - would it be more frustrating for you to work with translations inline in the wiki (in edit mode only) or to have to go hunt around someplace else to find those translations?

I guess you must not forget the process you will need to transform the wiki into a paged document. Do not forget the user's view.

Sorry for the English and for any lack of sense. I guess I will edit this later once I have more clever ideas in mind. But at least it starts the conversation.