A blog about translation, language, art and technology

Monthly Archives: October 2010

Recently a post on the OmegaT email list included a link to a very useful-looking software package called TMBuilder. It’s written for Windows, but there were reports of it running under Wine as well. Like most good, free software, it was born from a frustration:

TMbuilder was created to save both my and everybody else’s time that could be wasted using Trados’ WinAlign or any other commercially available alignment programs which usually don’t offer an acceptable merging quality. One of my Localization projects pushed me to build up a new TM project from hundreds of thousands of words saved in MS Excel spreadsheets.

Increasingly I find myself telling students that the important aspect of Translation and Technology to recognise is that it’s not one TEnT or CAT that is going to make it easy for you. Translation has a workflow, as does technology – and technology provides us with tools to overcome the harder parts of that workflow. Building TMs from disparate sources just got a whole lot easier.
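To make the workflow point concrete, here is a minimal sketch of the core step any TM builder has to perform: turning aligned source/target pairs (say, two spreadsheet columns exported from Excel) into a TMX file that OmegaT can load. This is not how TMBuilder itself works. The function name `build_tmx`, the tool name `tm-sketch` and the default language codes are my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name out as the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="id"):
    """Return a TMX 1.4 document (as a string) built from (source, target) pairs.

    Hypothetical helper: the header values below are placeholders, not
    anything TMBuilder or OmegaT prescribes.
    """
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "tm-sketch",      # assumed tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # skip rows where one side of the alignment is empty
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

In practice the pairs would come from whatever reads your spreadsheets (a CSV export is the simplest route), and the resulting string would be saved with a .tmx extension into the project's TM folder.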

Another of TAUS’s papers, titled What options do translators really have?, offers an interesting meditation on the turmoil facing the industry at the moment. Unsurprisingly, using TAUS is an excellent solution to some of the problems they note, but there is a greater wisdom in the whole post. I think that it summarises the current state of play very well, and I highly recommend it if you want a good review of the bigger picture as it stands:

So you are a translator. You have a loving, intimate relationship with words. You thrive on the challenging, mind-wrestling quest for equivalence and yes – you know the difference between a participle and a gerund. You care – not just about style, register and cultural nuances, but most of all, about the quality of your work. You are a linguist, a writer, a cultural expert and a field expert on many subjects, a researcher, an IT expert, a graphic designer… a one-man orchestra. You work autonomously and are driven by mastery.

Yes, you are a translator. You work long hours on texts that are becoming increasingly boring – be it lengthy automotive manuals that no-one ever reads or help files that no-one ever wants. You are forced to recycle words although climate change is not high on your agenda. And yes – you see the rates dropping with the speed of light and wonder when will they ask you to work for free?

The obvious candidates get a look in – open collaboration, the web as a tool and the current economic climate, but some interesting examples of how to move forward are also presented:

There is also a great potential to move up the value chain. Just like vast numbers of accountants have taken on management consultancy roles, professional translators could offer value-added consultancy services, advising on cultural, technical or authoring issues or providing controlled language services. Your role could become much more important, varied and fun – the scope for development is there, without the need to sacrifice things you love the most.

The solutions offered mirror what I’ve been saying these last 18 months – they don’t pull punches, but it’s not all doom and gloom. I was impressed to see that they’d even quoted Ignacio Garcia, who has been a great help to me over the last year or so as we fumble toward a future for translators in which they keep control, and their livelihoods.

Disruptive change always causes resistance and leaves victims in its wake. But it also brings great opportunities for those who are willing and able to embrace it and utilise the resources it brings. Forget the dodgy opportunists who thrive out there – there will always be someone trying to take advantage. In a professional world translators are not being asked to work for free nor are they likely to be reduced to the low-paid call-centre conditions that Garcia predicts. I would go with Bateman, who argues that the new tools and processes open up new roles and opportunities for variety and growth.

In Darwin’s words, it is not the strongest one that survives, nor the most intelligent, but the one most responsive to change – and if you can’t change something, change the way you think about it. After all, it all starts with an attitude. ‘Open’ is the word for the future and if you can just imagine the potential that an open mind + open resources (linguistic and human) could have, then you have already seen the light. The changes will not happen overnight and you might be lucky to maintain a status quo for some time. However, deep structural changes in the translation industry are happening and it would be wise to keep your finger on the pulse and stay open.

I’ve read critiques (link? gah. lost it) of the open source machine translation system Moses that revolve around the need for a systems administrator savvy in *nix (Unix-based operating systems such as Linux and Mac OS X) – this is no out-of-the-box solution for the home user.

TAUS have recently posted a list of the highlights from a conference discussion about businesses that use it. As noted in the first account (from Adobe, no less), Moses doesn’t take that many resources – I’ve got it installed here on my four-year-old Ubuntu desktop – but it does give better and faster results the more you have at hand. It’s not just CPU cycles, but a host of other, potentially costly, resources – programmers, corpus tamers, and the time-intensive training and output analysis phases:

In the general discussion, most Moses users said they tend to test their raw output against Google or Bing Translate output, but the results naturally varied with the type of content. Nearly all are now using Moses in production, and in many cases they use TDA data to help train the engines. Moses makes most sense when used for very high volumes of throughput. ROI (the investment is mostly labor) can be obtained in about two weeks.

Robert Frost once said, “Poetry is what gets lost in translation”. Translating poetry is a very hard task even for humans, and is clearly beyond the capability of current machine translation systems. We therefore, out of academic curiosity, set about testing the limits of translating poetry and were pleasantly surprised with the results!

This is no half effort, either – Google has a smart team on this project. They quote both Vladimir Nabokov and Douglas Hofstadter before giving a brief insight into how their translations are done:

A Statistical Machine Translation system, like Google Translate, typically performs translations by searching through a multitude of possible translations, guided by a statistical model of accuracy. However, to translate poetry, we not only considered translation accuracy, but meter and rhyming schemes as well. In our paper we describe in more detail how we altered our translation model, but in general we chose to sacrifice a little of the translation’s accuracy to get the poetic form right.

As a pleasant side-effect, the system is also able to translate anything into poetry, allowing us to specify the genre (say, limericks or haikus), or letting the system pick the one it thinks fits best. At the moment, the system is too slow to be made publicly accessible, but we thought we’d share some excerpts…
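The trade-off described in that quote, giving up a little accuracy to get the form right, can be illustrated with a toy reranker. To be clear, this is not Google's model: the candidate scores are made up, the syllable counter is a crude vowel-group heuristic for English, and the names `count_syllables`, `pick_line` and `form_weight` are hypothetical.

```python
import re


def count_syllables(line):
    """Crude English syllable estimate: count runs of vowels (an assumption,
    not a real prosody model)."""
    return len(re.findall(r"[aeiouy]+", line.lower()))


def pick_line(candidates, target_syllables, form_weight=0.1):
    """Pick the candidate with the best combined score.

    candidates: list of (text, accuracy) pairs, where accuracy stands in for
    the translation model's score. Each syllable away from the target length
    costs form_weight, so a slightly less accurate line can win on form.
    """
    def combined(candidate):
        text, accuracy = candidate
        penalty = abs(count_syllables(text) - target_syllables)
        return accuracy - form_weight * penalty

    return max(candidates, key=combined)[0]


candidates = [
    ("the cat sat on the mat", 0.9),        # more "accurate", 6 syllables
    ("a cat was sitting on the mat", 0.8),  # less "accurate", 8 syllables
]
best = pick_line(candidates, target_syllables=8)
```

With a target of eight syllables the second, nominally less accurate line is chosen; setting `form_weight` to zero hands the win back to the first. A real system would score meter and rhyme inside the search itself, which is presumably part of why it is too slow to release.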

This week, Google posted a large document called The Creative Internet (warning, large file, takes time to download. Caveat: you can start reading before it fully downloads). You may well lose 24 hours following links to some amazing sites.

There were a few of interest to translators: on page 85, despite it being called a joke, the question “What happens when language is no longer a barrier?” is posed, with an iPhone/Android phone mock-up of real-time translation of the speaking voice. This is surely soon to be reality – I think the “joke” is that Google is probably very well prepared for this eventuality, if not driving the development behind closed doors.

The other page of note was the Spanish-language page 103, on how “Spanish-speaking bibliophiles are creating the first collaborative audiobook”:

el Quijote by Miguel de Cervantes is being cut up into 2149 10-line sections.

Readers request a section (randomly assigned) and have 6 hours to record & upload.

There are two other points of interest this week as well. There has been active discussion on the email list regarding OmegaT’s Wikipedia entry, which has resulted in a number of updates including an updated features section and a “formats directly supported” section. If you aren’t using OmegaT, Wikipedia now has a good description of what you can expect.

The second interesting discussion is about The Shortcomings of OmegaT, which follows closely on the heels of last month’s Strong and Weak points thread. The thing I like most about these threads is that users (me included) will often post a shortcoming or weak point only to have one of the more experienced users set us straight, or tell us how they overcome the same problems.

Does any other TEnT software provide such excellent, free and timely support?

I’m lucky enough to have a lot of artistic friends, and have the pleasure of living with two and shacking up with another for half the week. I thought I’d share their sites, as they are diverse and help keep my technical obsessions in check.

Richard always has me “re-routing the mainframe”. I know that this is how he says “I love you”. He has a voracious appetite for video content. I’ve got a couple of his paintings, but I think that his The Life of Castles is beautifully understated – simple and evocative like xkcd on a good day.

Amy is the perfect artist. It’s 2pm on Sunday. I know she was in bed last night at 10. She’s still in bed. Yet I still know that she will have a new surprise for me when we chat next.

Amber is the geekiest of my artistic coterie, and her art (in that it’s craft) is singularly different from the others, while her comics are excellent. She is also the brains behind Comic Artist’s Rehab.

Finally (for this post at least), I’d like to introduce you to the photos of Edwino “Dolly” Roseno, some of whose artwork appears in the header of this blog. I bought the full size images from him while hanging out at Mes 56 in Jogjakarta, early 2009. Here are a bunch of the images from the same series that I didn’t buy: 1*, 2, 3, 4, 5, 6, 7*, 8, 9 (* denotes I’m kicking myself that I didn’t buy these)

Until now, this blog has been focused on what got me motivated to start it – Technology and Translation. My plan all along was to include other things I find interesting or am doing at the time, and this is my first post in this regard. You can expect more – I will try to not repost too much that can be found elsewhere (like The Directory of Wonderful things or Reddit) and tightly focus on what’s of interest. I admit I’m very much inspired by acb’s blog and writing style at /dev/null and hope to be as interesting. I’ve also got my last.fm latest tracks on the right if you are interested in what I’m listening to.

There’s an interesting CMS for photographers called indexhibit that fits its target market so well that the lack of code updates for two years doesn’t stop new people from using it all the time.

I’ve set up two instances, one each for my friends Liam White and Meicy Sitorus. Both are part-time photographers – Liam’s a designer and Meicy runs projects for Rumah Buku.

The thing I like most about helping my artistic contemporaries is that they can do their own design – I am constantly frustrated by my lack of design-fu and my impatience with CSS. I know a lot of potential clients go elsewhere because of this, although I’m excited about a budding relationship with PipStafford that may help fix this problem.

For many years, I’ve been an online activist in the traditional sense – an anti-capitalist activist, online. I spent many years with the Indymedia mob (and still provide them with high-level tech support), EngageMedia and a number of other projects that have disappeared from the Internet. I worked for The Wilderness Society as a Systems Administrator for a few years as well.

Two weeks later I returned to Melbourne to find a letter from Craig Minogue. Like any other person, I went searching and was quite shocked, to say the least. As were my flatmates as I read the article out at the kitchen table.

Then I read his personal site. I had to – he has never been on the Internet. He sends me printouts of his website that he has marked up, and I edit the site accordingly. I then send him back the printouts of the text... Hence I’ve read most of the text portions.

I have grown fond of Craig over the last few months, although I’ve never spoken to him nor met him in person. Sometimes his mum will email me asking a question on his behalf, and I get weekly correspondence from him. He’s even been nice enough to give me some of his art as compensation for my hours, and I love it.

More personally important for me is how much I want to do this for him. I’ve done work for rich and poor organisations that always pull the “not much budget” line when they want something done. Finally I’ve found someone that really doesn’t have a budget.

And I think he’s worth it. I am looking forward to shaking his hand and breaking some bread with him in six years.